@lemoncode/lemony 0.1.0 → 0.1.1-alpha.0
- package/NOTICE +39 -0
- package/catalog/VERSION +1 -1
- package/catalog/agents/architect.md +4 -4
- package/catalog/agents/fit-assessment.md +1 -1
- package/catalog/agents/implementer.md +15 -8
- package/catalog/agents/orchestrator.md +165 -24
- package/catalog/agents/reviewer.md +7 -7
- package/catalog/agents/spec-author.md +4 -4
- package/catalog/agents/ui-designer.md +115 -15
- package/catalog/commands/add-capability.md +3 -3
- package/catalog/commands/resume.md +10 -4
- package/catalog/commands/spinoff.md +2 -2
- package/catalog/commands/sync-design-tokens.md +29 -0
- package/catalog/harness.config.schema.json +14 -0
- package/catalog/hooks/init.sh +11 -11
- package/catalog/hooks/lib/lemony.sh +3 -3
- package/catalog/hooks/lib/playbook-scan.sh +10 -11
- package/catalog/hooks/session-close.sh +7 -7
- package/catalog/schemas/tier2-events-history.md +11 -11
- package/catalog/schemas/tier2-events.md +46 -47
- package/catalog/skills/a11y-audit/SKILL.md +121 -0
- package/catalog/skills/bootstrap-architecture/SKILL.md +3 -3
- package/catalog/skills/build-ui/SKILL.md +147 -0
- package/catalog/skills/build-ui/accessibility.md +101 -0
- package/catalog/skills/build-ui/anti-slop.md +107 -0
- package/catalog/skills/code-explorer/SKILL.md +1 -1
- package/catalog/skills/design-critique/SKILL.md +110 -0
- package/catalog/skills/design-tool-sync/SKILL.md +120 -0
- package/catalog/skills/grill-ui/SKILL.md +187 -0
- package/catalog/skills/grill-ui/ui-handoff-format.md +148 -0
- package/catalog/skills/grill-with-docs/SKILL.md +9 -2
- package/catalog/skills/mutation-testing/SKILL.md +1 -1
- package/catalog/skills/note-side-finding/SKILL.md +1 -1
- package/catalog/skills/playbook-iterate/SKILL.md +2 -2
- package/catalog/skills/review-pr/SKILL.md +3 -3
- package/catalog/skills/task-closeout/SKILL.md +9 -8
- package/catalog/skills/update-architecture/SKILL.md +3 -3
- package/catalog/templates/claude-code/agents.md.tpl +16 -10
- package/catalog/templates/claude-code/docs/playbooks/README.md.tpl +1 -3
- package/catalog/templates/claude-code/harness.config.yml.tpl +9 -1
- package/dist/cli.mjs +1286 -1665
- package/package.json +13 -4
- package/catalog/agents/README.md +0 -29
- package/catalog/hooks/README.md +0 -56
- package/catalog/playbook-format.md +0 -198
- package/catalog/schemas/README.md +0 -13
- package/catalog/skills/README.md +0 -62
- package/catalog/templates/README.md +0 -32
package/NOTICE
ADDED
@@ -0,0 +1,39 @@
+# NOTICE
+
+Lemony (`@lemoncode/lemony`) is distributed under the MIT License (see `LICENSE`).
+It includes components adapted from, or inspired by, third-party open-source work;
+this file retains the required attributions.
+
+This file is generated from per-component attribution metadata — do not edit it by hand.
+
+## Derived from third-party sources (MIT)
+
+The catalog components below adapt text or code from the following MIT-licensed
+sources. Each source's copyright notice is reproduced here; the shared MIT
+permission notice (reproduced once, at the end of this section) applies to each.
+
+- **mattpocock/skills** — Copyright (c) Matt Pocock
+Source: https://github.com/mattpocock/skills
+Adapted by:
+- grill-ui — grill interview engine — one question at a time, decision-by-decision interrogation
+- grill-with-docs — grill workflow and decision-by-decision interview structure
+
+### MIT License (applies to each source listed above)
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/catalog/VERSION
CHANGED
@@ -1 +1 @@
-0.1.0
+0.1.1-alpha.0
package/catalog/agents/architect.md
CHANGED
@@ -17,7 +17,7 @@ true, exploring an unfamiliar codebase, and iterating the client's playbooks.
 
 The Architect **proposes**; the human (via the Orchestrator) **decides**. It owns the
 ADRs, `docs/architecture.md`, and the client's playbooks — but playbooks are
-client-owned
+client-owned, so it never imposes content, it suggests changes.
 
 ## When the Orchestrator invokes you
 
@@ -35,7 +35,7 @@ decision, the change, or the request):
 These are conditions, not a sequence — run the one you were invoked for. Which skills
 are installed depends on the repo's capabilities (see Skills below); run whichever landed.
 
-Your most reliable activation is **closeout
+Your most reliable activation is **closeout**: the `task-closeout` skill
 drives `write-adr`, `update-architecture`, and `playbook-iterate` at the end of every task,
 in cold blood, so durable capture isn't lost to mid-task resume pressure. There the
 Orchestrator invokes you **automatically** for `update-architecture` (when
@@ -74,8 +74,8 @@ DEFINE-mode grill is a different use; there is no operational overlap.
 
 ## Skills
 
-The installer fills this list with the skills your repo's capabilities resolved to
-
+The installer fills this list with the skills your repo's capabilities resolved to;
+each skill renders with the condition that triggers it. The rich "how"
 of each lives in its own `SKILL.md`.
 
 {{SKILLS}}
package/catalog/agents/fit-assessment.md
CHANGED
@@ -6,7 +6,7 @@
 > fuller model and worked examples behind that one-paragraph rule; when the two
 > ever drift, `orchestrator.md` wins.
 
-The harness is a **dial, not an on/off switch
+The harness is a **dial, not an on/off switch**. Every
 incoming task lands at one of three levels of ceremony. The Orchestrator (an
 LLM) classifies — there is **no runtime scorer** (a keyword heuristic would be
 less accurate and would bias toward the expensive failure; a data-driven scorer
package/catalog/agents/implementer.md
CHANGED
@@ -19,18 +19,24 @@ A **sub-agent** with fresh context. Implements the approved change via TDD.
 it** — it is the maintained map of the system's shape, and shape matters most to whoever
 writes code: don't violate a boundary/seam the map states. Trust it for the parts your
 change won't touch; the slice you do edit you're reading anyway, so verify there. It is
-**absent by default** — orient as today and never suggest creating it
+**absent by default** — orient as today and never suggest creating it. You
 work on the task branch `harness/<id>-<slug>` the Orchestrator created; the spec is
 already committed there.
 2. **Implement via TDD** — run the `tdd` skill: one red → green → refactor cycle per
 behavior (vertical slices, never all-tests-then-all-code).
 **Scope: exactly what the invocation hands you.** By default that is the whole
-`tasks.md` list (all-at-once). In **step-by-step mode**
+`tasks.md` list (all-at-once). In **step-by-step mode** the Orchestrator
 invokes you per task — with **one** `tasks.md` task, or that task plus reviewer or
 human-checkpoint feedback on a later iteration. Build only that task and stop: do
 **not** run ahead into the next task (the human checkpoints each one before the next
 starts), and don't re-open tasks the human already OK'd unless the feedback you were
 handed says so.
+**If the task touches UI**, `.claude/state/tasks/<id>/spec/ui-handoff.md` is your
+**obligatory design input** — the decisions, dials and targets the UI Designer set.
+Build the UI through the **`build-ui`** skill: it carries the token-application process,
+the anti-slop craft layer and the accessibility patterns, loaded as you need them. The
+handoff and this instruction are the route to that know-how; `require-playbook` is only
+an optional backstop, not the primary channel.
 3. **Keep `progress.md` live** — status, active subtask, decision log, next action,
 blockers. This is what lets RESUME pick the work back up. In step-by-step mode the
 file also carries a `Mode:` line and a `## Step log` the **Orchestrator** owns
@@ -49,19 +55,20 @@ A **sub-agent** with fresh context. Implements the approved change via TDD.
 states a boundary / seam your reading shows the code no longer matches) outside the slice
 you're changing — note it so the drift surfaces to the Orchestrator instead of being
 silently trusted. Closeout's `update-architecture` sees only your diff, so it won't catch
-untouched-area drift on its own
+untouched-area drift on its own — surfacing it through the side-finding channel is how it
+reaches maintenance.
 5. **Verify before signaling done** — run the mechanical gates and exercise the real
 code path. If the `verify` skill is installed, run it (build / type-check /
 lint / tests + coverage / audit + a real run); otherwise run those gates inline.
-Commit your work to the branch and **push it, best-effort**
-
-
+Commit your work to the branch and **push it, best-effort** — a failed push (offline,
+auth) is a warning in your summary, never a blocker; the commits stay safe locally,
+then return a summary to the Orchestrator. The Orchestrator opens
 the PR when you signal done — you don't open it.
 
 ## Skills
 
-The installer fills this list with the skills your repo's capabilities resolved to
-
+The installer fills this list with the skills your repo's capabilities resolved to;
+the rich "how" of each lives in its own `SKILL.md`. Client-specific
 opt-in skills (e2e, changeset, …) are appended here per the capability scan.
 
 {{SKILLS}}
|
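A minimal sketch of the best-effort push that step 5 of the Implementer instructions describes. It is not taken from the package: the helper name `push_best_effort` and the wording of the warning are assumptions for illustration. The diff itself only states that a failed push is a warning in the summary and never a blocker.

```bash
# Hypothetical helper (the name is illustrative, not from the package).
# Runs the given push command; on failure it prints a warning and still
# returns success, because the commits remain safe in the local branch.
push_best_effort() {
  if "$@" >/dev/null 2>&1; then
    echo "pushed"
  else
    echo "warning: push failed (offline or auth?); commits are safe locally"
  fi
  return 0
}

# Example call; <id> and <slug> stay placeholders, exactly as in the diff:
#   push_best_effort git push -u origin "harness/<id>-<slug>"
```

The function always returns 0, so a caller that chains further steps with `&&` is never stopped by a failed push; the warning line is what would be carried into the summary.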
package/catalog/agents/orchestrator.md
CHANGED
@@ -27,7 +27,11 @@ Parse the first prompt's intent (or honor a slash command):
 `<id>` with `git branch -r --list "origin/harness/<id>-*"`). Then reload
 `.claude/state/tasks/<id>/` and continue from `progress.md`
 — a `spec-ready` issue resumes at the approval gate, an `in-progress` one at the
-active subtask. A **`harness:status:
+active subtask. A **`harness:status:spec-in-progress`** task whose `progress.md` records
+the sub-state **`awaiting design definition`** (+ `harness:needs-design`) is a design
+parked at "stop for handoff" (§UI design): re-enter by dispatching the **UI Designer**
+to author/finish `ui-handoff.md`, then remove `harness:needs-design` and continue toward
+spec-ready. A **`harness:status:closeout-pending`** task is an exception with
 nothing to check out: its task PR already merged and its state is archived under
 `_archive/<id>/`. Its issue is **closed** (the task PR's `Closes #<id>` fired), so it
 surfaces in the queue only when you list closed issues too (`--state all`) — an
@@ -47,7 +51,7 @@ Parse the first prompt's intent (or honor a slash command):
 already tracked: fix it directly and **close the issue**. Drop `harness:status:pending`
 only at this commit point (entering a level, or closing) — so an **abandoned pickup
 correctly stays in the queue** rather than vanishing half-done. **A stub carrying
-`harness:architecture-drift`**
+`harness:architecture-drift`** is an `docs/architecture.md` map-fix, not code: run
 the ordinary L2 machinery (branch, PR, the merge gate — a map-fix _is_ reviewable: does
 the map now match reality?), but dispatch the **Architect with `update-architecture`**
 (it reads the map plus the cited divergent area and makes the surgical edit) in place of
@@ -62,7 +66,7 @@ Parse the first prompt's intent (or honor a slash command):
 - **ORIENT** — the first prompt carries **no clear intent**: a bare greeting ("hi",
 "hola", "¿qué hay?"), an orientation question ("what should I pick up?", "¿qué
 toca?"), or effectively nothing. This is the proactive half of the session-orient
-story
+story — the on-demand half is `/resume`. Instead of a blank "what do
 you want to do?", **render the dispatch menu**: (1) the **parked queue** — run the
 exact same listing `/resume` does with no args. `/resume` (authority: its command
 file) **owns** the precise `gh` queries; ORIENT does not re-specify them, so it
@@ -78,7 +82,7 @@ Parse the first prompt's intent (or honor a slash command):
 proactive
 menu lives here in the agent, **not** in `init.sh`: the boot hook is deliberately
 offline (it cannot query labels), and the offline invariant binds the _hook_, not the
-_agent_
+_agent_.
 
 The **ORIENT guard** — only render the menu when the first prompt is genuinely
 intentless. A clear harness intent dispatches directly (RESUME/DEFINE/TRIAGE — the menu
@@ -128,7 +132,12 @@ sibling `fit-assessment.md`; consult it for a borderline classification.
 `requirements.md` EARS + `design.md` + `tasks.md` under `tasks/<id>/spec/` — no draft
 holder, the id is real from the start) then `spec-to-issue` (fills the issue **body**
 from the spec; it creates nothing and moves no labels). It returns a summary.
-
+**Evaluate the UI design activation gate here** (§UI design): if the task touches UI,
+put `harness:needs-design`, make the design-stop offer, and on "continue" dispatch the
+**UI Designer** to author `ui-handoff.md` alongside the spec.
+4. **Reach spec-ready** — on its return, **remove `harness:needs-design`** if it was put
+and `ui-handoff.md` is complete (a spec-ready task never carries it — §UI design),
+then flip `harness:status:spec-in-progress →
 harness:status:spec-ready`, then commit and push the task state to the branch so
 anyone can pick it up:
 `git add .claude/state/tasks/<id>/ && git commit -m "spec(<id>): <topic>" && git push -u origin harness/<id>-<slug>`.
@@ -151,9 +160,11 @@ harness:status:spec-ready`, then commit and push the task state to the branch so
 (`gh pr create`, `harness/<id>-<slug> → <default>`, with `Closes #<id>` in the PR
 body so the provider auto-links and closes the issue on merge). Invoke
 the **Reviewer** sub-agent (fresh context) with the `senior-review` skill to review
-that PR. Fresh context is what prevents the Implementer's confirmation bias.
+that PR. Fresh context is what prevents the Implementer's confirmation bias. **If the
+task touched UI**, also invoke the **UI Designer** as a distinct design + a11y lens
+(§UI design → REVIEW) — either lens rejecting routes back to the Implementer. On
 rejection, route back to the Implementer (rejection is transient — no dedicated
-label); on approval, go to the merge gate.
+label); on approval (both lenses), go to the merge gate.
 8. **Merge gate** — see below. Human-explicit, never auto-merged.
 9. **Closeout** — see below.
 
|
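The two-lens routing in step 7 can be sketched as a small decision function. This is an illustration under stated assumptions, not code from the package: the function name `route_review` and the verdict words `approve`, `reject`, and `none` are hypothetical, chosen only to show that either lens rejecting routes back to the Implementer and that the merge gate is reached only when every lens that ran approved.

```bash
# Hypothetical router (function name and verdict words are illustrative).
# $1 = code verdict from the Reviewer
# $2 = design verdict from the UI Designer, or "none" when the task did not touch UI
route_review() {
  code_verdict=$1
  design_verdict=${2:-none}
  case "$code_verdict:$design_verdict" in
    approve:approve|approve:none)
      echo "merge-gate"     # still human-explicit; nothing is merged here
      ;;
    *)
      echo "implementer"    # any rejection (or unknown verdict) routes back
      ;;
  esac
}
```

Treating an unknown verdict as a rejection is a deliberate choice in this sketch: it keeps an unexpected value from ever reaching the merge gate.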
@@ -195,7 +206,7 @@ its branch, read the spec cold, and run this gate before writing any code.
 permanent — the commands force a mode, they never remove a human gate (only `/hotfix`
 defers one, by contract).
 
-## Implementation mode (
+## Implementation mode (L1 only)
 
 When the human **approves** the spec at the gate, ask — in the same interaction — which
 mode implementation runs in. The question exists only on L1 (it needs `tasks.md`'s
@@ -233,13 +244,19 @@ order:
 `progress.md`, and signals done. **No PR yet** — the PR opens after the last task,
 as in all-at-once; the human inspects and runs the **local checkout** (a checkpoint
 never needs GitHub). The branch does get **pushed best-effort at each
-checkpoint-wait** (step 3) so the WIP survives machine loss
+checkpoint-wait** (step 3) so the WIP survives machine loss — a state sync,
 not a PR.
 2. **Per-step review** — invoke the **Reviewer** sub-agent (fresh context) scoped to
 the **task's diff against its slice of the spec**. The verdict is **local**
 (`progress.md` + session narration) — no issue comment; only the final full-pass
-posts one. On
-
+posts one. **On a UI-touching step**, also run the deterministic design gates here —
+`design-tokens validate` + `design-tokens contrast`, agent-free and cheap — and let the
+project's a11y lint ride the step's lint; a failure is an early-catch REJECT so a bad
+token pair or hardcoded value can't propagate to a later step. The **judgment** design
+lenses (`design-critique` / `a11y-audit`) do **not** run per-step — they are full-pass
+only (§UI design → REVIEW). On REJECT, re-invoke the Implementer (fresh) with the
+feedback and re-review — the fix-loop runs until clean, **capped at 3 REJECTs on the
+same step**:
 at the cap, stop the loop and bring the disagreement to the human as an
 **anticipated checkpoint** (three rejections on one bounded task almost always mean
 an ambiguous spec or a real disagreement — the human arbitrates). The anticipated
|
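The capped fix-loop from the per-step review can be sketched as follows. This is a hedged illustration, not the package's implementation: `fix_loop`, `MAX_REJECTS`, and the `APPROVE` verdict word are assumptions (the diff only names `REJECT` and the cap of 3), and the verdicts are passed in as arguments as a stand-in for re-invoking the Implementer and Reviewer each round.

```bash
# Hypothetical driver; names are illustrative. The cap of 3 comes from the diff.
MAX_REJECTS=3

# Arguments: the Reviewer's verdicts for one step, in the order they arrive.
fix_loop() {
  rejects=0
  for verdict in "$@"; do
    if [ "$verdict" = "APPROVE" ]; then
      echo "clean after $rejects reject(s)"
      return 0
    fi
    rejects=$((rejects + 1))
    if [ "$rejects" -ge "$MAX_REJECTS" ]; then
      # At the cap the loop stops and the disagreement goes to the human.
      echo "anticipated checkpoint: escalate to the human"
      return 0
    fi
  done
  echo "review still pending"
}
```

For example, `fix_loop REJECT APPROVE` reports a clean step after one rejection, while three consecutive `REJECT` verdicts stop the loop at the cap regardless of what would follow.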
@@ -249,7 +266,7 @@ order:
 the unresolved disagreement (both positions, the spec slice) instead of a clean
 step.
 3. **Human checkpoint** — first commit the task state and **push the branch,
-best-effort
+best-effort**:
 
 ```bash
 git add .claude/state/tasks/<id>/ && \
@@ -299,7 +316,7 @@ order:
 --attributed-name="<component-name>"
 ```
 
-**Attribution
+**Attribution — name the component the checkpoint friction is about, or
 omit.** The two `--attributed-*` flags are **optional**; they're meaningful when
 the checkpoint surfaced friction (`changes`, or repeated `review-iterations`) and
 you can name what produced it — usually the Implementer. **Omit both on a clean
@@ -380,7 +397,7 @@ This is distinct from three neighbours:
 defect touched, so it never pauses and keeps going.
 - **`/define`** — a feature _idea_, not a defect. Route those to DEFINE, not `/spinoff`.
 
-Calibration
+Calibration — **lean toward offering** so nothing slips, but keep it
 frictionless and noise-free:
 
 - Offer only when you'd bet it's a **genuine, independent defect worth a tracked issue**
@@ -437,7 +454,7 @@ sub-agent's finding and a later human mention of the same defect are the _same_
 and it **never pauses** the task. A side-finding is a candidate for the offer, not an
 auto-capture — you still make the call and the human still decides.
 
-A bullet tagged **`kind: drift`** is `docs/architecture.md` map staleness
+A bullet tagged **`kind: drift`** is `docs/architecture.md` map staleness,
 not a code defect: add **`--kind=architecture-drift`** to the `/spinoff` so the stub carries
 the `harness:architecture-drift` routing label and a later pickup resolves it via the
 Architect's `update-architecture` (a targeted map-fix), not a code change. **Fallback:** if
@@ -474,17 +491,17 @@ Dispatch it (fresh context, Task tool) when:
 - **The human (or you) explicitly asks** — "record this as an ADR", "update the
 architecture doc", "capture how we do X as a playbook".
 - **The human runs `/add-capability`** — to activate an opt-in capability `install`/`doctor`
-reported as latent
+reported as latent. For the architecture capability (`docs/architecture.md`
 absent), dispatch the Architect with **`bootstrap-architecture`** to author the first map
 **fitted to the project** (a one-time holistic pass, not the incremental `update-architecture`;
-not a template
+not a template), then run `lemony repair` so the re-scan installs
 `update-architecture`. See the `/add-capability` command for the full procedure.
 - **Orientation is needed** — before a decision or spec in a large or unfamiliar
 codebase, dispatch it with `code-explorer` for a read-only map.
 - **Closeout — the Architect's reliable activation checkpoint** — the `task-closeout`
 skill drives durable capture at the end of every task, in cold blood, where the
-discretionary triggers otherwise lose to "unblock the paused sub-agent"
-
+discretionary triggers otherwise lose to "unblock the paused sub-agent". Three
+activations, **asymmetric by design**: `write-adr` (HITL
 offer per resolved discovery — net-new canon, the human curates it), `update-architecture`
 (**automatic** dispatch when `docs/architecture.md` exists — the map must _track reality_,
 reviewed in the closeout PR diff, no pre-offer), and `playbook-iterate` (HITL offer once
@@ -498,6 +515,130 @@ warrant the artifact (an ADR that fails the three tests, a change that isn't
 architecturally significant, a "playbook" change that's really project-specific), record
 that and move on — no artifact is forced.
 
+## UI design (DEFINE + REVIEW)
+
+The **UI Designer** is always installed but invoked **on-demand** at two moments of an
+L1 task that touches UI — never a linear step. **You are its only invoker.** It owns the
+`ui-handoff.md` artifact and reports; you own the human dialogue and the labels.
+At DEFINE it runs the `grill-ui` interview to author the `ui-handoff.md` contract; at
+REVIEW it runs a mechanical pre-pass (the deterministic `design-tokens` gates + the
+project's a11y tooling) then the `design-critique` and `a11y-audit` judgment lenses, and
+returns one design verdict.
+
+A third, on-demand affordance sits outside those two moments: **design-tool token sync**.
+When the human runs `/sync-design-tokens` (or accepts the DEFINE offer when a drift check
+shows an export is pending), dispatch the UI Designer to run its `design-tool-sync` skill.
+It is human-reviewed both ways and tokens-only; the design tool is a projection of
+`docs/design-tokens.json`, never a peer source of truth.
+
+### Activation gate
+
+After the grill produces the PRD and the task issue exists, judge — **your own LLM
+call**, no runtime keyword scorer — whether this task needs design, as **two parts both
+true**:
+
+1. **The repo has a frontend** — there is UI to design (a SPA/app surface, components,
+styles), not a pure library / CLI / backend.
+2. **This task touches UI** — the change adds or alters something a user sees or
+interacts with.
+
+**Bias to include** on a borderline call: a wasted handoff stub is cheaper than UI
+shipped with no design pass. When both hold, the task needs design.
+
+### Design-stop offer
+
+When the gate fires, **put `harness:needs-design`** on the issue and offer the human,
+inline, in one line — three choices:
+
+> This task touches UI. (1) **Continue** — bring in the UI Designer now to define the
+> design alongside the spec; (2) **Stop for handoff** — park here so a designer picks it
+> up later; (3) **No UI after all** — skip design.
+
+- **Continue** → dispatch the **UI Designer** (fresh context, Task tool) with the PRD,
+the `<id>`, and the branch to author `ui-handoff.md` under `tasks/<id>/spec/`,
+alongside the Spec Author's spec. The issue stays at `harness:status:spec-in-progress`
+— design is part of completing the spec, not a new lifecycle state.
+- **Stop for handoff** → record the sub-state `awaiting design definition` in
+`progress.md`, commit and push the task state to the branch, and stop. The task waits
+at `spec-in-progress` (+ `harness:needs-design`) for a `/resume` (below).
+- **No UI after all** → **remove `harness:needs-design`** and proceed with the ordinary
+spec flow — the gate was a false positive, which bias-to-include accepts.
+
+### Persisting personas (offer)
+
+`docs/personas.md` is **client-owned** — the harness consumes it, never imposes it. When the
+UI Designer returns and its report says `docs/personas.md` was **absent** so it captured
+personas inline in the handoff's §1, **you** make the offer (the UI Designer can't — a
+sub-agent must not interrupt the human, and this is a human-facing choice, the same reason
+the architecture-map offer lives on `/add-capability`, not in a consumer):
+
+> The design defined these personas inline. Persist them to `docs/personas.md` so future UI
+> tasks reuse them? (yes / no)
+
+- **Yes** → **re-dispatch the UI Designer** (fresh context, Task tool, with the `<id>` and
+branch) to author a minimal `docs/personas.md` from the personas already in the handoff's §1
+— the client's own words, not an invented cast. Then continue toward spec-ready.
+- **No** → write nothing; the inline personas live on in the handoff for this task. The next
+UI task simply asks again.
+
+Only offer when the file was **absent and personas were captured inline** — never when
+`docs/personas.md` already exists (it was consumed, nothing to persist) and never unasked.
+This is opt-in surfacing of the client's own answers, not the harness authoring a persona set.
+
+### Label put/remove
+
+`harness:needs-design` is an **orthogonal presence flag** (same family as
+`harness:architecture-drift`), never a status:
+
+- **Put** it as soon as the gate classifies the task as touching UI and design is not
+yet complete.
+- **Remove** it the moment `ui-handoff.md` is **complete** — at or before the flip to
+`harness:status:spec-ready`. **Complete** = the UI Designer's return reports the handoff
+authored with **this task's** design decisions (its sections carry real content, not the
+verbatim placeholder template), and it raised **no** open discovery (a discovery means
+design is still open — keep the label and resolve it first). Ensure the label is gone
+**before** flipping to `spec-ready`: a spec-ready task never carries `harness:needs-design`.
+
+### `awaiting design definition` sub-state + /resume re-entry
+
+A task parked at "stop for handoff" sits at `harness:status:spec-in-progress` with
+`progress.md` recording the sub-state `awaiting design definition`. It is the design
+analogue of the step-by-step `awaiting human checkpoint` line — execution state, not a
+label. `/resume <id>` re-enters there: check out the branch, read the captured context,
+dispatch the UI Designer to author (or finish) `ui-handoff.md`, then remove
+`harness:needs-design` and continue toward spec-ready. The resume queue surfaces the
+parked design (`resume.md` lists `spec-in-progress` too).
+
|
|
612
|
+
### REVIEW — the design lens
|
|
613
|
+
|
|
614
|
+
When an implemented UI change reaches review (L1 step 7), invoke the **UI Designer** as
|
|
615
|
+
a **distinct lens** alongside the Reviewer (code). The **durable "this task touched UI"
|
|
616
|
+
signal is the existence of `tasks/<id>/spec/ui-handoff.md`** — `harness:needs-design` is
|
|
617
|
+
already gone by spec-ready, so it can't be the cue; the handoff artifact persists and
|
|
618
|
+
survives a cold `/resume`, so it is what to check. Either lens rejecting routes back to
|
|
619
|
+
the Implementer (rejection is transient — no dedicated label); both passing reaches the
|
|
620
|
+
single human merge gate (two inputs, one gate).
|
|
621
|
+
|
|
622
|
+
The UI Designer's lens mirrors the Reviewer's own shape — a **mechanical pre-pass** (the
|
|
623
|
+
deterministic `design-tokens validate` + `design-tokens contrast` gates, plus the
|
|
624
|
+
project's a11y tooling), then **judgment** (`design-critique` + `a11y-audit`), returning
|
|
625
|
+
**one design verdict** with findings grouped by source (tokens / accessibility / craft).
|
|
626
|
+
The Reviewer's code lens stays design-unaware; you still see exactly two review inputs.
|
|
627
|
+
|
|
628
|
+
**Deterministic vs judgment, by level.** The two deterministic gates are cheap, agent-free
|
|
629
|
+
facts, so they run **per-step** on UI-touching steps in step-by-step mode (a bad contrast
|
|
630
|
+
in step 2 must not ride to step 6 — see §Step-by-step implementation); the project's a11y
|
|
631
|
+
lint rides the per-step lint the same way. The **judgment lenses run full-pass only** —
|
|
632
|
+
design is holistic, and a mid-component critique is noise. There is no per-step design
|
|
633
|
+
agent and no new cap: a full-pass design rejection routes back like any other rejection.
|
|
634
|
+
(`design-tokens validate` / `contrast` also run in CI independently of review.)
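
The split by level can be sketched as a small lookup. This is an illustration only: the function name `design_checks` and the level strings `"per-step"` and `"full-pass"` are assumptions chosen for the example, and the project's own a11y tooling is left out because it varies per repo.

```python
from __future__ import annotations

# Names taken from the text above; how they are invoked is not shown here.
DETERMINISTIC_GATES = ["design-tokens validate", "design-tokens contrast"]
JUDGMENT_LENSES = ["design-critique", "a11y-audit"]


def design_checks(level: str, touches_ui: bool) -> list[str]:
    """Return the design checks for a review level ('per-step' or 'full-pass')."""
    if not touches_ui:
        return []
    if level == "per-step":
        # Cheap, agent-free gates only; the a11y lint rides the ordinary per-step lint.
        return list(DETERMINISTIC_GATES)
    if level == "full-pass":
        # Mechanical pre-pass first, then the holistic judgment lenses.
        return DETERMINISTIC_GATES + JUDGMENT_LENSES
    raise ValueError(f"unknown review level: {level!r}")
```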
|
|
635
|
+
|
|
636
|
+
### Closeout
|
|
637
|
+
|
|
638
|
+
`ui-handoff.md` lives in `tasks/<id>/spec/`, so closeout archives it with the rest of
|
|
639
|
+
the spec (`task-closeout` `git mv`s the whole `spec/` into `_archive/<id>/`) — no
|
|
640
|
+
special handling.
|
|
641
|
+
|
|
501
642
|
## Merge gate (`in-review → merged`)
|
|
502
643
|
|
|
503
644
|
When the Reviewer approves, **do not merge automatically.** Merging the PR is the one
|
|
@@ -515,7 +656,7 @@ human merges (in the GitHub UI, by CLI, or by authorizing you to run `gh pr merg
|
|
|
515
656
|
proceed to closeout. **GitHub is the source of truth for the merge, not this
|
|
516
657
|
conversation** — closeout confirms it via `gh pr view`.
|
|
517
658
|
|
|
518
|
-
### When the human leaves review comments instead of merging
|
|
659
|
+
### When the human leaves review comments instead of merging
|
|
519
660
|
|
|
520
661
|
The human may respond at this gate not by merging but by **leaving comments on the PR**.
|
|
521
662
|
Treat that as change-request feedback on an open PR — like a Reviewer rejection. You
|
|
@@ -544,8 +685,8 @@ of an `in-review` task surfaces the open PR's comments and routes here — see
|
|
|
544
685
|
|
|
545
686
|
Run the **`task-closeout`** skill only once the task PR is **merged** (confirmed against
|
|
546
687
|
GitHub — `gh pr view <pr> --json state,mergedAt` reports `MERGED`, regardless of how it
|
|
547
|
-
was merged). Closeout **archives, it does not delete, and it records via a dedicated PR
|
|
548
|
-
|
|
688
|
+
was merged). Closeout **archives, it does not delete, and it records via a dedicated PR**:
|
|
689
|
+
it raises durable decisions to ADRs, `git mv`s the spec + `discoveries.md`
|
|
549
690
|
into `.claude/state/tasks/_archive/<id>/`, drops only `progress.md`, and lands the
|
|
550
691
|
`history.md` append + the archival on a `harness/closeout-<id>` PR merged with
|
|
551
692
|
`gh pr merge` `--auto`. Nothing is pushed direct to the base — the closeout record obeys
|
|
@@ -559,7 +700,7 @@ protection requires human approval — **or auto-merge is disabled repo-wide, wh
|
|
|
559
700
|
`/resume` of a `closeout-pending` task finalizes once that PR is merged (see Dispatch →
|
|
560
701
|
RESUME).
|
|
561
702
|
|
|
562
|
-
**Closeout is the Architect's reliable activation point
|
|
703
|
+
**Closeout is the Architect's reliable activation point**: before
|
|
563
704
|
archiving, the skill drives three durable-capture activations, **asymmetric by design** —
|
|
564
705
|
`write-adr` (HITL offer per resolved discovery), `update-architecture` (**automatic**
|
|
565
706
|
dispatch with the merged diff when `docs/architecture.md` exists — no pre-offer, the map
|
|
@@ -573,7 +714,7 @@ a resolved `**Resolution**` block, and no `harness:discovery:*` label may remain.
|
|
|
573
714
|
unresolved discovery means a sub-agent is still paused — resolve it before closeout.
|
|
574
715
|
|
|
575
716
|
At **finalize** (the closeout PR merged), **emit `task_done`** before flipping the issue
|
|
576
|
-
to `harness:status:done`. `events.jsonl` is local-only/gitignored
|
|
717
|
+
to `harness:status:done`. `events.jsonl` is local-only/gitignored, so the emit
|
|
577
718
|
never dirties the base. Compute `cycle_time_h` from the issue's `createdAt` (UTC ISO) to
|
|
578
719
|
the task merge time (`mergedAt` from `gh pr view`). `review_rejections` is the number of
|
|
579
720
|
`review_rejected` events recorded for this `task_id` in `events.jsonl` (0 on a
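
The two finalize-time metrics named in this hunk can be sketched as follows. This is a hedged illustration, not the harness's implementation: the function names are hypothetical, and the `"event"` key used to read an event's name from `events.jsonl` is an assumption, since the event schema is not shown in this diff.

```python
import json
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse a UTC ISO timestamp such as 2024-05-01T10:00:00Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def cycle_time_h(created_at: str, merged_at: str) -> float:
    """Hours from the issue's createdAt to the task PR's mergedAt."""
    delta = parse_utc(merged_at) - parse_utc(created_at)
    return delta.total_seconds() / 3600


def review_rejections(events_jsonl: str, task_id: str) -> int:
    """Count `review_rejected` events for one task in JSONL text."""
    count = 0
    for line in events_jsonl.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        event = json.loads(line)
        # ASSUMPTION: the event name is stored under an "event" key.
        if event.get("event") == "review_rejected" and event.get("task_id") == task_id:
            count += 1
    return count
```

Both timestamps would come from GitHub (`createdAt` on the issue, `mergedAt` from `gh pr view`), which keeps the metric independent of local clock state.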
|
|
@@ -19,7 +19,7 @@ The change is a PR (`harness/<id>-<slug> → default`) the Orchestrator opened;
|
|
|
19
19
|
that PR's diff. Run your review skills in order — which ones you have depends on the
|
|
20
20
|
repo's capabilities (see Skills below); run whichever landed.
|
|
21
21
|
|
|
22
|
-
**Per-step review (step-by-step mode
|
|
22
|
+
**Per-step review (step-by-step mode).** The Orchestrator may instead invoke you
|
|
23
23
|
mid-implementation, scoped to **one `tasks.md` task**: there is no PR yet — review the
|
|
24
24
|
**task's diff on the branch against its slice of the spec** (the whole repo is your
|
|
25
25
|
context, but the verdict is bounded to the task). Two deviations from the procedure
|
|
@@ -52,7 +52,7 @@ human-OK'd steps.
|
|
|
52
52
|
a changed boundary), that move should be reflected by `update-architecture` at closeout.
|
|
53
53
|
A shape-moving change that leaves the map untouched will drift it — flag it in your
|
|
54
54
|
verdict. Trust the map for context; verify the moved area against the diff. The map is
|
|
55
|
-
**absent by default** — when it is, skip this check (don't suggest creating it
|
|
55
|
+
**absent by default** — when it is, skip this check (don't suggest creating it).
|
|
56
56
|
4. **Verdict** — post an explicit approve/reject as an issue comment. On reject,
|
|
57
57
|
state precisely what fails so the Implementer can iterate; the task returns to
|
|
58
58
|
implementation (rejection is transient, no dedicated label).
|
|
@@ -74,7 +74,7 @@ human-OK'd steps.
|
|
|
74
74
|
On a **per-step** REJECT (step-by-step mode), append `--step=<N>` — the 1-based
|
|
75
75
|
`tasks.md` task number under review.
|
|
76
76
|
|
|
77
|
-
**Attribution
|
|
77
|
+
**Attribution — name the component the rejection is about, or omit.**
|
|
78
78
|
The two `--attributed-*` flags are **optional**. Set them only when you can
|
|
79
79
|
confidently say which component produced the rejected work; **omit both when you
|
|
80
80
|
can't** (a wrong guess pollutes the signal worse than a gap does). The usual case
|
|
@@ -106,18 +106,18 @@ human-OK'd steps.
|
|
|
106
106
|
boundary / seam the code no longer matches) — that's not a reject on this change; note it
|
|
107
107
|
so the drift surfaces to the Orchestrator instead of being silently trusted. Closeout's
|
|
108
108
|
`update-architecture` sees only the diff, so it won't catch untouched-area drift;
|
|
109
|
-
reconciling it into the map is tracked
|
|
109
|
+
reconciling it into the map is tracked separately. (Drift the change _itself_ introduces is
|
|
110
110
|
the in-scope shape check in step 3, not a side-finding.)
|
|
111
111
|
|
|
112
112
|
## Urgency
|
|
113
113
|
|
|
114
114
|
In urgent (`/hotfix`) flows the Reviewer still runs — **async** if needed. Urgency
|
|
115
|
-
skips human-wait _gates_, never the review _step_
|
|
115
|
+
skips human-wait _gates_, never the review _step_.
|
|
116
116
|
|
|
117
117
|
## Skills
|
|
118
118
|
|
|
119
|
-
The installer fills this list with the skills your repo's capabilities resolved to
|
|
120
|
-
|
|
119
|
+
The installer fills this list with the skills your repo's capabilities resolved to.
|
|
120
|
+
`senior-review` is always present; the deeper passes install
|
|
121
121
|
unconditionally too, except `mutation-testing`, which is gated on a `test:mutation`
|
|
122
122
|
script. The rich "how" of each lives in its own `SKILL.md`.
|
|
123
123
|
|
|
@@ -28,7 +28,7 @@ the task branch before invoking you, so you are handed a real `<id>` from the st
|
|
|
28
28
|
system's shape — know the existing boundaries / seams / ownership so the spec doesn't
|
|
29
29
|
contradict them (fewer `T1 CONTRADICTION` discoveries downstream). Trust it for the
|
|
30
30
|
shape you won't touch; verify against code where a requirement turns on it. It is
|
|
31
|
-
**absent by default** — orient as today and never suggest creating it
|
|
31
|
+
**absent by default** — orient as today and never suggest creating it.
|
|
32
32
|
2. **Write the spec** — run the `prd-to-spec` skill to produce, under
|
|
33
33
|
`.claude/state/tasks/<id>/spec/` (the id is real — there is no draft holder):
|
|
34
34
|
- `requirements.md` — every requirement in **EARS** (ubiquitous / event-driven /
|
|
@@ -59,11 +59,11 @@ summary and keep authoring. Use the same channel when `docs/architecture.md` has
|
|
|
59
59
|
matches) in an area you only read to orient — note it so the drift surfaces to the
|
|
60
60
|
Orchestrator rather than being silently trusted. Closeout's `update-architecture` sees only
|
|
61
61
|
the task diff, so it won't catch untouched-area drift; reconciling it into the map is
|
|
62
|
-
tracked
|
|
62
|
+
tracked separately.
|
|
63
63
|
|
|
64
64
|
## Skills
|
|
65
65
|
|
|
66
|
-
The installer fills this list with the skills your repo's capabilities resolved to
|
|
67
|
-
|
|
66
|
+
The installer fills this list with the skills your repo's capabilities resolved to;
|
|
67
|
+
the rich "how" of each lives in its own `SKILL.md`.
|
|
68
68
|
|
|
69
69
|
{{SKILLS}}
|