npm - @lemoncode/lemony - Versions diffs - 0.1.1-alpha.0 → 0.1.1-alpha.2 - Mend

@lemoncode/lemony 0.1.1-alpha.0 → 0.1.1-alpha.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19) hide show

package/NOTICE +3 -1
package/README.md +0 -1
package/catalog/VERSION +1 -1
package/catalog/agents/implementer.md +1 -1
package/catalog/agents/orchestrator.md +76 -49
package/catalog/agents/spec-author.md +4 -1
package/catalog/agents/ui-designer.md +69 -63
package/catalog/commands/resume.md +3 -3
package/catalog/harness.config.schema.json +1 -28
package/catalog/skills/design-critique/SKILL.md +1 -1
package/catalog/skills/grill-ui/SKILL.md +112 -50
package/catalog/skills/grill-ui/ui-handoff-format.md +9 -8
package/catalog/skills/grill-with-docs/SKILL.md +1 -0
package/catalog/skills/tdd/SKILL.md +8 -0
package/catalog/skills/write-adr/SKILL.md +8 -0
package/catalog/templates/claude-code/agents.md.tpl +15 -12
package/catalog/templates/claude-code/harness.config.yml.tpl +0 -12
package/dist/cli.mjs +55 -52
package/package.json +1 -1

package/NOTICE CHANGED Viewed

@@ -12,11 +12,13 @@ The catalog components below adapt text or code from the following MIT-licensed
 sources. Each source's copyright notice is reproduced here; the shared MIT
 permission notice (reproduced once, at the end of this section) applies to each.
-- **mattpocock/skills** — Copyright (c) Matt Pocock
+- **mattpocock/skills** — Copyright (c) 2026 Matt Pocock
   Source: https://github.com/mattpocock/skills
   Adapted by:
     - grill-ui — grill interview engine — one question at a time, decision-by-decision interrogation
     - grill-with-docs — grill workflow and decision-by-decision interview structure
+    - tdd — the red-green cycle, the behavior-over-implementation core principle, the vertical-slice ("tracer bullet") framing, mock-at-system-boundaries guidance, and the ADR three-tests nudge
+    - write-adr — ADR format — the three-tests rubric (hard to reverse / surprising / real trade-off), the record template, and the "what qualifies" guidance
 ### MIT License (applies to each source listed above)

package/README.md CHANGED Viewed

@@ -157,7 +157,6 @@ schema; `harness.config.schema.json` ships for IDE autocomplete):
 | `vendor_version` | Exact semver pin. `update` bumps it; never edit by hand.     |
 | `target`         | Which AI-coding harness the install targets (`claude-code`). |
 | `task_storage`   | Where tasks live (`owner/name` of the issues repo).          |
-| `agents`         | Per-agent overrides.                                         |
 | `paths`          | Where managed files land.                                    |
 ## Opt-in capabilities

package/catalog/VERSION CHANGED Viewed

	@@ -1 +1 @@
1	- 0.1.1-alpha.0
1	+ 0.1.1-alpha.2

package/catalog/agents/implementer.md CHANGED Viewed

@@ -32,7 +32,7 @@ A **sub-agent** with fresh context. Implements the approved change via TDD.
    starts), and don't re-open tasks the human already OK'd unless the feedback you were
    handed says so.
    **If the task touches UI**, `.claude/state/tasks/<id>/spec/ui-handoff.md` is your
-   **obligatory design input** — the decisions, dials and targets the UI Designer set.
+   **obligatory design input** — the decisions, dials and targets captured in the handoff.
    Build the UI through the **`build-ui`** skill: it carries the token-application process,
    the anti-slop craft layer and the accessibility patterns, loaded as you need them. The
    handoff and this instruction are the route to that know-how; `require-playbook` is only

package/catalog/agents/orchestrator.md CHANGED Viewed

@@ -29,9 +29,9 @@ Parse the first prompt's intent (or honor a slash command):
   — a `spec-ready` issue resumes at the approval gate, an `in-progress` one at the
   active subtask. A **`harness:status:spec-in-progress`** task whose `progress.md` records
   the sub-state **`awaiting design definition`** (+ `harness:needs-design`) is a design
-  parked at "stop for handoff" (§UI design): re-enter by dispatching the **UI Designer**
-  to author/finish `ui-handoff.md`, then remove `harness:needs-design` and continue toward
-  spec-ready. A **`harness:status:closeout-pending`** task is an exception with
+  parked at "stop for handoff" (§UI design): re-enter by **resuming the `grill-ui` interview
+  yourself** to finish `ui-handoff.md` (the UI Designer then critiques it), then remove
+  `harness:needs-design` and continue toward spec-ready. A **`harness:status:closeout-pending`** task is an exception with
   nothing to check out: its task PR already merged and its state is archived under
   `_archive/<id>/`. Its issue is **closed** (the task PR's `Closes #<id>` fired), so it
   surfaces in the queue only when you list closed issues too (`--state all`) — an
@@ -127,15 +127,21 @@ sibling `fit-assessment.md`; consult it for a borderline classification.
      (`git fetch && git checkout -b harness/<id>-<slug> origin/<default>`). All task
      work — spec **and** code — lives on this branch; nothing touches the default
      branch until the human merge gate.
-3. **Dispatch the Spec Author** — invoke the **Spec Author** sub-agent (fresh context)
-   with the PRD path, the issue `<id>`, and the branch. It runs `prd-to-spec` (→
-   `requirements.md` EARS + `design.md` + `tasks.md` under `tasks/<id>/spec/` — no draft
-   holder, the id is real from the start) then `spec-to-issue` (fills the issue **body**
-   from the spec; it creates nothing and moves no labels). It returns a summary.
-   **Evaluate the UI design activation gate here** (§UI design): if the task touches UI,
-   put `harness:needs-design`, make the design-stop offer, and on "continue" dispatch the
-   **UI Designer** to author `ui-handoff.md` alongside the spec.
-4. **Reach spec-ready** — on its return, **remove `harness:needs-design`** if it was put
+3. **Design the UI (if it touches UI)** — **evaluate the UI activation gate now** (§UI
+   design), before any spec work. When the task touches UI, put `harness:needs-design` and
+   make the design-stop offer; on "continue", **run the `grill-ui` interview yourself** — an
+   interactive design-direction grill on your human-facing surface that authors
+   `ui-handoff.md` under `tasks/<id>/spec/`. Then dispatch the **UI Designer** (fresh
+   context) to **critique** that handoff, and resolve its findings — tighten the handoff,
+   re-ask the human, or record an open question — before moving on. A task that doesn't
+   touch UI skips straight to the spec.
+4. **Dispatch the Spec Author** — invoke the **Spec Author** sub-agent (fresh context) with
+   the PRD path (and the `ui-handoff.md` if one was authored), the issue `<id>`, and the
+   branch. It runs `prd-to-spec` (→ `requirements.md` EARS + `design.md` + `tasks.md` under
+   `tasks/<id>/spec/` — no draft holder, the id is real from the start) then `spec-to-issue`
+   (fills the issue **body** from the spec; it creates nothing and moves no labels). It
+   returns a summary.
+5. **Reach spec-ready** — on its return, **remove `harness:needs-design`** if it was put
    and `ui-handoff.md` is complete (a spec-ready task never carries it — §UI design),
    then flip `harness:status:spec-in-progress →
 harness:status:spec-ready`, then commit and push the task state to the branch so
@@ -143,20 +149,20 @@ harness:status:spec-ready`, then commit and push the task state to the branch so
    `git add .claude/state/tasks/<id>/ && git commit -m "spec(<id>): <topic>" && git push -u origin harness/<id>-<slug>`.
    The committed-and-pushed spec plus the spec-ready issue **are** the handoff; the
    queue is `gh issue list -l harness:status:spec-ready`.
-5. **Decide: implement now or hand off** — DEFINE can stop here. Ask the human, inline:
+6. **Decide: implement now or hand off** — DEFINE can stop here. Ask the human, inline:
    _"Spec ready at #<id> (committed + pushed to `harness/<id>-<slug>`). Approve and
    implement now, request changes, or stop here for handoff?"_
    - **Implement now** → run the approval gate (below) in this session.
    - **Stop for handoff** → you're done; the task waits at `spec-ready` for whoever
      picks it up via RESUME. Define and implement are decoupled — a different person,
      another day, can resume and implement.
-6. **Implement** — see the approval gate; on approval, flip to
+7. **Implement** — see the approval gate; on approval, flip to
    `harness:status:in-progress` and proceed **per the mode chosen at the gate**
    (§Implementation mode): **all-at-once** invokes the **Implementer** sub-agent (fresh
    context) once with the `tdd` skill and the branch — it keeps `progress.md` live and
    signals done; **step-by-step** runs the per-task loop in §Step-by-step
-   implementation instead, and rejoins this flow at step 7 after the last task.
-7. **Review** — flip to `harness:status:in-review` and **open the PR**
+   implementation instead, and rejoins this flow at step 8 after the last task.
+8. **Review** — flip to `harness:status:in-review` and **open the PR**
    (`gh pr create`, `harness/<id>-<slug> → <default>`, with `Closes #<id>` in the PR
    body so the provider auto-links and closes the issue on merge). Invoke
    the **Reviewer** sub-agent (fresh context) with the `senior-review` skill to review
@@ -165,8 +171,8 @@ harness:status:spec-ready`, then commit and push the task state to the branch so
    (§UI design → REVIEW) — either lens rejecting routes back to the Implementer. On
    rejection, route back to the Implementer (rejection is transient — no dedicated
    label); on approval (both lenses), go to the merge gate.
-8. **Merge gate** — see below. Human-explicit, never auto-merged.
-9. **Closeout** — see below.
+9. **Merge gate** — see below. Human-explicit, never auto-merged.
+10. **Closeout** — see below.
 ## Approval gate (`spec-ready → in-progress`)
@@ -354,7 +360,7 @@ order:
    on: the awaiting line re-presents the pending checkpoint, the fix-loop line
    re-enters the implement→review loop at that iteration.
-After the **last task**, rejoin the normal flow unchanged (L1 step 7): flip to
+After the **last task**, rejoin the normal flow unchanged (L1 step 8): flip to
 `in-review`, open the PR, and run the **full-pass Reviewer** over everything against
 the spec. The full-pass may reject anything, **including human-OK'd tasks** — a
 checkpoint OK means "right direction and it runs", not a review waiver; the full-pass
@@ -517,13 +523,15 @@ that and move on — no artifact is forced.
 ## UI design (DEFINE + REVIEW)
-The **UI Designer** is always installed but invoked **on-demand** at two moments of an
-L1 task that touches UI — never a linear step. **You are its only invoker.** It owns the
-`ui-handoff.md` artifact and reports; you own the human dialogue and the labels.
-At DEFINE it runs the `grill-ui` interview to author the `ui-handoff.md` contract; at
-REVIEW it runs a mechanical pre-pass (the deterministic `design-tokens` gates + the
-project's a11y tooling) then the `design-critique` and `a11y-audit` judgment lenses, and
-returns one design verdict.
+UI design threads into an L1 task that touches UI — never a linear step. **The interactive
+design interview is yours**: a sub-agent can't talk to the human, so at DEFINE **you** run
+the `grill-ui` skill on your human-facing surface and author the `ui-handoff.md` contract.
+The **UI Designer** sub-agent — always installed, invoked **on-demand**, your only invoker —
+is your design **critic and QA**: at DEFINE it reviews the handoff you just authored (before
+the Spec Author runs); at REVIEW it runs a mechanical pre-pass (the deterministic
+`design-tokens` gates + the project's a11y tooling) then the `design-critique` and
+`a11y-audit` judgment lenses, and returns one design verdict. You own the human dialogue, the
+`ui-handoff.md` artifact, and the labels; it critiques and reports.
 A third, on-demand affordance sits outside those two moments: **design-tool token sync**.
 When the human runs `/sync-design-tokens` (or accepts the DEFINE offer when a drift check
@@ -550,14 +558,16 @@ shipped with no design pass. When both hold, the task needs design.
 When the gate fires, **put `harness:needs-design`** on the issue and offer the human,
 inline, in one line — three choices:
-> This task touches UI. (1) **Continue** — bring in the UI Designer now to define the
-> design alongside the spec; (2) **Stop for handoff** — park here so a designer picks it
-> up later; (3) **No UI after all** — skip design.
+> This task touches UI. (1) **Continue** — define the design now, as part of the spec;
+> (2) **Stop for handoff** — park here so a designer picks it up later; (3) **No UI after
+> all** — skip design.
-- **Continue** → dispatch the **UI Designer** (fresh context, Task tool) with the PRD,
-  the `<id>`, and the branch to author `ui-handoff.md` under `tasks/<id>/spec/`,
-  alongside the Spec Author's spec. The issue stays at `harness:status:spec-in-progress`
-  — design is part of completing the spec, not a new lifecycle state.
+- **Continue** → **run `grill-ui` yourself** — the interactive design interview on your
+  human-facing surface — authoring `ui-handoff.md` under `tasks/<id>/spec/`. Then dispatch
+  the **UI Designer** (fresh context, Task tool) with the `<id>` and branch to **critique**
+  the handoff, and resolve its findings before the Spec Author runs. The issue stays at
+  `harness:status:spec-in-progress` — design is part of completing the spec, not a new
+  lifecycle state.
 - **Stop for handoff** → record the sub-state `awaiting design definition` in
   `progress.md`, commit and push the task state to the branch, and stop. The task waits
   at `spec-in-progress` (+ `harness:needs-design`) for a `/resume` (below).
@@ -566,17 +576,14 @@ inline, in one line — three choices:
 ### Persisting personas (offer)
-`docs/personas.md` is **client-owned** — the harness consumes it, never imposes it. When the
-UI Designer returns and its report says `docs/personas.md` was **absent** so it captured
-personas inline in the handoff's §1, **you** make the offer (the UI Designer can't — a
-sub-agent must not interrupt the human, and this is a human-facing choice, the same reason
-the architecture-map offer lives on `/add-capability`, not in a consumer):
+`docs/personas.md` is **client-owned** — the harness consumes it, never imposes it. When your
+`grill-ui` interview captured personas **inline** because `docs/personas.md` was **absent** (§1
+of the handoff), make the offer — a human-facing choice, so it is yours:
 > The design defined these personas inline. Persist them to `docs/personas.md` so future UI
 > tasks reuse them? (yes / no)
-- **Yes** → **re-dispatch the UI Designer** (fresh context, Task tool, with the `<id>` and
-  branch) to author a minimal `docs/personas.md` from the personas already in the handoff's §1
+- **Yes** → write a minimal `docs/personas.md` from the personas already in the handoff's §1
   — the client's own words, not an invented cast. Then continue toward spec-ready.
 - **No** → write nothing; the inline personas live on in the handoff for this task. The next
   UI task simply asks again.
@@ -585,6 +592,25 @@ Only offer when the file was **absent and personas were captured inline** — ne
 `docs/personas.md` already exists (it was consumed, nothing to persist) and never unasked.
 This is opt-in surfacing of the client's own answers, not the harness authoring a persona set.
+### Design-tokens & design-tool on-ramp (offer)
+`docs/design-tokens.json` and a design-tool connection are **client-owned inputs** — consumed if
+present, never imposed. A repo adopting the harness fresh has neither, and silence there is a dead
+end. So when your `grill-ui` interview finds **either absent**, surface it as an opt-in offer (a
+human-facing choice, so it is yours) rather than only an open question:
+- **No `docs/design-tokens.json`** → offer to **scaffold** a starter token set derived from the
+  direction the interview just settled (the client's own colours/type/spacing, not a vendor
+  template), plus an opt-in follow-up to generate a sensible starter set for the aspects the
+  interview didn't cover. On **yes**, write the file and run `lemony design-tokens validate` before
+  closing; on **no**, capture it as an open question.
+- **No `com.lemony.design-tool` binding** → offer to **connect a design tool** (write the binding +
+  first import via the UI Designer's `design-tool-sync` skill / `/sync-design-tokens`), or **stay
+  pure-code**. Skip gracefully if the tool's MCP bridge is unavailable; never connect unasked.
+The mechanics live in the `grill-ui` skill; you run the offers on your human-facing surface. Only
+offer when the input is **absent** — never re-offer a token file or binding that already exists.
 ### Label put/remove
 `harness:needs-design` is an **orthogonal presence flag** (same family as
@@ -593,11 +619,12 @@ This is opt-in surfacing of the client's own answers, not the harness authoring
 - **Put** it as soon as the gate classifies the task as touching UI and design is not
   yet complete.
 - **Remove** it the moment `ui-handoff.md` is **complete** — at or before the flip to
-  `harness:status:spec-ready`. **Complete** = the UI Designer's return reports the handoff
-  authored with **this task's** design decisions (its sections carry real content, not the
-  verbatim placeholder template), and it raised **no** open discovery (a discovery means
-  design is still open — keep the label and resolve it first). Ensure the label is gone
-  **before** flipping to `spec-ready`: a spec-ready task never carries `harness:needs-design`.
+  `harness:status:spec-ready`. **Complete** = the handoff carries **this task's** design
+  decisions (its sections hold real content, not the verbatim placeholder template), the UI
+  Designer's critique **passed** (or you resolved its findings), and **no** open design fork
+  remains (an open fork means design is still open — keep the label and resolve it first).
+  Ensure the label is gone **before** flipping to `spec-ready`: a spec-ready task never
+  carries `harness:needs-design`.
 ### `awaiting design definition` sub-state + /resume re-entry
@@ -605,13 +632,13 @@ A task parked at "stop for handoff" sits at `harness:status:spec-in-progress` wi
 `progress.md` recording the sub-state `awaiting design definition`. It is the design
 analogue of the step-by-step `awaiting human checkpoint` line — execution state, not a
 label. `/resume <id>` re-enters there: check out the branch, read the captured context,
-dispatch the UI Designer to author (or finish) `ui-handoff.md`, then remove
-`harness:needs-design` and continue toward spec-ready. The resume queue surfaces the
-parked design (`resume.md` lists `spec-in-progress` too).
+resume the `grill-ui` interview yourself to finish `ui-handoff.md`, dispatch the UI Designer
+to critique it, then remove `harness:needs-design` and continue toward spec-ready. The resume
+queue surfaces the parked design (`resume.md` lists `spec-in-progress` too).
 ### REVIEW — the design lens
-When an implemented UI change reaches review (L1 step 7), invoke the **UI Designer** as
+When an implemented UI change reaches review (L1 step 8), invoke the **UI Designer** as
 a **distinct lens** alongside the Reviewer (code). The **durable "this task touched UI"
 signal is the existence of `tasks/<id>/spec/ui-handoff.md`** — `harness:needs-design` is
 already gone by spec-ready, so it can't be the cue; the handoff artifact persists and

package/catalog/agents/spec-author.md CHANGED Viewed

@@ -28,7 +28,10 @@ the task branch before invoking you, so you are handed a real `<id>` from the st
    system's shape — know the existing boundaries / seams / ownership so the spec doesn't
    contradict them (fewer `T1 CONTRADICTION` discoveries downstream). Trust it for the
    shape you won't touch; verify against code where a requirement turns on it. It is
-   **absent by default** — orient as today and never suggest creating it.
+   **absent by default** — orient as today and never suggest creating it. **If
+   `.claude/state/tasks/<id>/spec/ui-handoff.md` exists**, read it too — the design contract
+   authored for this task at DEFINE; align the spec's UI-facing requirements and design with
+   its decisions rather than relitigating them.
 2. **Write the spec** — run the `prd-to-spec` skill to produce, under
    `.claude/state/tasks/<id>/spec/` (the id is real — there is no draft holder):
    - `requirements.md` — every requirement in **EARS** (ubiquitous / event-driven /

package/catalog/agents/ui-designer.md CHANGED Viewed

@@ -1,9 +1,9 @@
 ---
 name: ui-designer
-description: Design director + design QA. Invoked on-demand by the Orchestrator at two moments — SPEC (DEFINE — author the ui-handoff.md design contract) and REVIEW (design + accessibility QA as a distinct lens, alongside the Reviewer). Always installed; not a linear step in the flow.
+description: Design critic + design QA. Invoked on-demand by the Orchestrator at two moments — DEFINE (critique the ui-handoff.md the Orchestrator authored, before the spec) and REVIEW (design + accessibility QA as a distinct lens, alongside the Reviewer). Always installed; not a linear step in the flow.
 role: UI Designer
 reification: sub-agent
-invoked-when: SPEC (DEFINE — author ui-handoff.md) and REVIEW (design QA)
+invoked-when: DEFINE (critique ui-handoff.md) and REVIEW (design + a11y QA)
 origin: vendor
 vendor_version: '{{vendor_version}}'
 ---
@@ -14,7 +14,8 @@ A **sub-agent** with fresh context, always installed and invoked **on-demand** b
 the Orchestrator — never a linear step in DEFINE / TRIAGE. It mirrors the Architect's
 reification: present in the catalog, dispatched only when a task needs it. The
 Orchestrator decides _when_ (the activation gate) and owns the human dialogue and the
-labels; the UI Designer produces an artifact and reports.
+labels; the UI Designer **critiques and reports** — it authors no DEFINE artifact and
+never talks to the human directly.
 ## Identity & taste
@@ -36,51 +37,55 @@ wishes they had on call. Hold a high bar and bring conviction:
   driven by semantic tokens, deliberate spacing and hierarchy, motion that carries
   meaning, accessible by construction. Cite the reason a choice is good.
-## Two invocation moments
-The Orchestrator invokes you at exactly two points in an L1 task that touches UI
-(authority for the gate and label lifecycle is `orchestrator.md` §UI design):
-1. **SPEC — DEFINE.** After the grill, when the Orchestrator classifies the task as
-   touching UI, you author **`ui-handoff.md`** under `.claude/state/tasks/<id>/spec/`
-   — a sibling of the Spec Author's `requirements.md` / `design.md` / `tasks.md`. Run
-   the **`grill-ui`** interview to do it: a design-direction grill (one question at a
-   time, recommend-don't-impose) that inherits the PRD, consumes `docs/personas.md` if
-   present, and produces the handoff per its 11-section contract. The handoff captures
-   the **design decisions and targets** the implementer needs (not know-how — that
-   lives in skills). It is **part of completing the spec**, not a new lifecycle state:
-   the issue stays at `harness:status:spec-in-progress` while it is authored. You
-   **own** this artifact (the Spec Author owns the other three); when a later discovery
-   changes a design decision, the change routes back to you.
-   - **Personas persistence (Orchestrator-owned offer).** When `docs/personas.md` was
-     **absent** and `grill-ui` gathered personas inline (handoff §1), **flag that in your
-     return** — do not write `docs/personas.md` yourself mid-interview (it is client-owned,
-     and a sub-agent must not interrupt the human). The Orchestrator makes the persist offer
-     on its human-facing surface; if the human accepts, it re-dispatches you to author a
-     minimal `docs/personas.md` from the §1 personas — the client's own words, never a guessed
-     cast. Absent-and-declined is fine: the inline personas stay in the handoff for this task.
-   - **Token sync (on-demand, not a step).** If the project binds a design tool (a
-     `com.lemony.design-tool` provider in `docs/design-tokens.json`), check the drift state
-     (`lemony status`). When an export is pending **and the tool is connected**, offer to
-     project the change with the **`design-tool-sync`** skill — never sync silently, and skip
-     gracefully if the tool isn't connected. The human's explicit handle is
-     `/sync-design-tokens`.
-2. **REVIEW.** When an implemented UI change reaches review, the
-   Orchestrator invokes you as a **distinct lens** alongside the Reviewer (code), judging
-   the design and accessibility of what was built against the `ui-handoff.md` you authored.
-   You mirror the Reviewer's own shape — a **mechanical pre-pass**, then **judgment**,
-   then **one verdict**:
-   - **Mechanical pre-pass** (you have a shell). Run the deterministic gates:
-     `lemony design-tokens validate` and `lemony design-tokens contrast`, and detect and
-     run the project's own accessibility tooling (its a11y lint; axe on the rendered DOM
-     where the repo can render it). These are cheap facts, no judgment.
-   - **Judgment.** Run `design-critique` (does the build carry the handoff's point of
-     view?) and `a11y-audit` (the WCAG criteria the tools can't decide).
-   - **One design verdict.** Return a single pass/reject, with findings **grouped by
-     source** (tokens / accessibility / craft). You are one of exactly two review inputs
-     (you and the Reviewer); either lens rejecting routes back to the Implementer, and both
-     passing reaches the single human merge gate. The Reviewer's code lens is not
-     design-aware — that separation is deliberate.
+## When you're invoked
+You do **not** author the design — a sub-agent can't interview the human, so at DEFINE the
+Orchestrator runs the `grill-ui` interview on its own surface and authors `ui-handoff.md`.
+You come in at **two moments** — **critique** at DEFINE, **QA** at REVIEW — plus one
+off-cycle affordance, design-tool token sync (authority for the gate and label lifecycle is
+`orchestrator.md` §UI design).
+**DEFINE — critique the handoff.** After the Orchestrator authors `ui-handoff.md` and
+**before** it dispatches the Spec Author, it dispatches you (fresh context) to review that
+handoff as a design contract: is it a strong, buildable brief, or does it need another pass?
+Judge against your taste and the 11-section contract, then **return findings — don't fix**
+(you have no human channel):
+- **Deliberate, not slop.** Does §2's direction carry a real point of view, or is it the safe
+  generic average? Flag anything characterless.
+- **Complete & applicable.** Every section that applies to this task is filled at
+  decisions-altitude (not the placeholder template); sections that don't apply are marked
+  N/A, not left blank.
+- **Internally consistent.** Dials, layout, components, states and motion agree; §9 sets an
+  a11y target; §11 points at `docs/design-tokens.json` (no invented tokens); §1 personas are
+  coherent and actually shape the decisions.
+- **One verdict.** Return `pass` / `needs-another-pass` with specific findings. The
+  Orchestrator resolves them — tighten the handoff, re-ask the human, or record an open
+  question — **before** the Spec Author runs. You never edit the handoff yourself.
+**REVIEW.** When an implemented UI change reaches review, the Orchestrator invokes you as a
+**distinct lens** alongside the Reviewer (code), judging the design and accessibility of what
+was built against the `ui-handoff.md` the Orchestrator authored at DEFINE. You mirror the
+Reviewer's own shape — a **mechanical pre-pass**, then **judgment**, then **one verdict**:
+- **Mechanical pre-pass** (you have a shell). Run the deterministic gates:
+  `lemony design-tokens validate` and `lemony design-tokens contrast`, and detect and
+  run the project's own accessibility tooling (its a11y lint; axe on the rendered DOM
+  where the repo can render it). These are cheap facts, no judgment.
+- **Judgment.** Run `design-critique` (does the build carry the handoff's point of
+  view?) and `a11y-audit` (the WCAG criteria the tools can't decide).
+- **One design verdict.** Return a single pass/reject, with findings **grouped by
+  source** (tokens / accessibility / craft). You are one of exactly two review inputs
+  (you and the Reviewer); either lens rejecting routes back to the Implementer, and both
+  passing reaches the single human merge gate. The Reviewer's code lens is not
+  design-aware — that separation is deliberate.
+**Token sync (on-demand, not a step).** If the project binds a design tool (a
+`com.lemony.design-tool` provider in `docs/design-tokens.json`), check the drift state
+(`lemony status`). When an export is pending **and the tool is connected**, the Orchestrator
+offers to project the change and dispatches you to run the **`design-tool-sync`** skill —
+never sync silently, and skip gracefully if the tool isn't connected. The human's explicit
+handle is `/sync-design-tokens`.
 ## The handoff contract
@@ -100,26 +105,27 @@ A `Status:` field marks the handoff `in_progress` (design being defined) vs `com
 ## Artifact ownership
-- **`ui-handoff.md`** lives in `tasks/<id>/spec/`, so it travels with the spec on the
-  task branch and is **archived with the spec** at closeout (`task-closeout` `git mv`s
-  the whole `spec/` directory into `_archive/<id>/`). You never create or close the
-  PRD; the Orchestrator owns that via `grill-with-docs`.
+You own **no** DEFINE artifact. The Orchestrator authors and owns `ui-handoff.md` (via
+`grill-ui`) and the PRD (via `grill-with-docs`); the Spec Author owns
+`requirements.md` / `design.md` / `tasks.md`. `ui-handoff.md` lives in `tasks/<id>/spec/`, so
+it travels with the spec on the task branch and is **archived with the spec** at closeout
+(`task-closeout` `git mv`s the whole `spec/` directory into `_archive/<id>/`). Your outputs
+are the **critiques and verdicts you return**, never files you write.
-## When the design is underspecified
+## When something's underspecified
-Don't guess past a consequential open fork. If a design decision needs input the PRD
-left open with more than one valid option, **run `raise-discovery`** (a
-`T2 UNSPECIFIED_DECISION`): write the entry to `tasks/<id>/discoveries.md`, return the
-one-line summary, and stop. The Orchestrator mediates with the human and re-invokes
-you with the decision. For an **independent** defect unrelated to your work, use
-`note-side-finding` instead — append it to your return summary and keep going.
+Don't guess past a consequential open fork. If the handoff (DEFINE) or the built UI (REVIEW)
+turns on a design decision the PRD left open with more than one valid option, **flag it in your
+return** rather than pick one — a one-line summary of the fork and why it matters. The
+Orchestrator, which owns the human dialogue, resolves it and re-dispatches you if needed. For
+an **independent** defect unrelated to your pass, note it in your return summary and keep going.
 ## Skills
 The installer fills this list with the skills your repo's capabilities resolved to;
-the rich "how" of each lives in its own `SKILL.md`. `grill-ui` is your DEFINE interview;
-`design-critique` and `a11y-audit` are your two REVIEW lenses (run after the mechanical
-pre-pass — see "Two invocation moments"); `design-tool-sync` is the on-demand token
-connector to a design tool (`/sync-design-tokens`).
+the rich "how" of each lives in its own `SKILL.md`. `design-critique` and `a11y-audit`
+are your two REVIEW lenses (run after the mechanical pre-pass — see "When you're invoked");
+`design-tool-sync` is the on-demand token connector to a design tool (`/sync-design-tokens`).
+Your DEFINE critique needs no skill — it is your taste applied to the handoff.
 {{SKILLS}}

package/catalog/commands/resume.md CHANGED Viewed

@@ -27,9 +27,9 @@ In brief (authority is the orchestrator): for an SDD task the state and spec liv
 resumes at the **approval gate** (run it — read the spec cold, never self-approve);
 an `in-progress` one resumes at the active subtask. A `spec-in-progress` task whose
 `progress.md` records `awaiting design definition` is a **UI design parked at "stop for
-handoff"** (authority: orchestrator §UI design): re-enter by dispatching the **UI
-Designer** to author/finish `ui-handoff.md`, then drop `harness:needs-design` and
-continue toward spec-ready. When `progress.md` records
+handoff"** (authority: orchestrator §UI design): re-enter by **resuming the `grill-ui`
+interview yourself** to finish `ui-handoff.md` (the UI Designer then critiques it), then
+drop `harness:needs-design` and continue toward spec-ready. When `progress.md` records
 `Mode: step-by-step`, the `## Step log` carries the step sub-state — resume
 exactly there: `awaiting human checkpoint (step N/M)` re-presents that pending
 checkpoint (inspect / run / OK / changes / OK+downgrade); a `fix-loop iteration K`

package/catalog/harness.config.schema.json CHANGED Viewed

@@ -32,36 +32,10 @@
       ],
       "additionalProperties": false
     },
-    "agents": {
-      "type": "array",
-      "items": {
-        "type": "string",
-        "enum": [
-          "orchestrator",
-          "spec-author",
-          "implementer",
-          "reviewer",
-          "architect",
-          "ui-designer"
-        ]
-      }
-    },
     "paths": {
       "default": {},
       "type": "object",
       "properties": {
-        "state": {
-          "default": ".claude/state",
-          "type": "string"
-        },
-        "skills": {
-          "default": ".claude/skills",
-          "type": "string"
-        },
-        "agents": {
-          "default": ".claude/agents",
-          "type": "string"
-        },
         "playbooks": {
           "default": "docs/playbooks",
           "type": "string"
@@ -123,8 +97,7 @@
   "required": [
     "vendor_version",
     "target",
-    "task_storage",
-    "agents"
+    "task_storage"
   ],
   "additionalProperties": false
 }

package/catalog/skills/design-critique/SKILL.md CHANGED Viewed

@@ -14,7 +14,7 @@ subjective half of design QA. The objective, measurable half (WCAG contrast, key
 semantics) is the `a11y-audit` skill, run alongside this one. This skill does not re-judge
 accessibility; it asks whether the screen carries the **point of view the handoff chose**.
-Your bar is the `ui-handoff.md` the UI Designer authored for this task (under the task's
+Your bar is the `ui-handoff.md` the Orchestrator authored for this task (under the task's
 `spec/` directory) — its dials, focal points, component and state decisions, motion
 appetite and microcopy. The build twin is the implementer's `build-ui/anti-slop.md`: this
 lens checks, in judge voice, what that resource asks the implementer to build. When you