@lemoncode/lemony 0.1.0 → 0.1.1-alpha.0
- package/NOTICE +39 -0
- package/catalog/VERSION +1 -1
- package/catalog/agents/architect.md +4 -4
- package/catalog/agents/fit-assessment.md +1 -1
- package/catalog/agents/implementer.md +15 -8
- package/catalog/agents/orchestrator.md +165 -24
- package/catalog/agents/reviewer.md +7 -7
- package/catalog/agents/spec-author.md +4 -4
- package/catalog/agents/ui-designer.md +115 -15
- package/catalog/commands/add-capability.md +3 -3
- package/catalog/commands/resume.md +10 -4
- package/catalog/commands/spinoff.md +2 -2
- package/catalog/commands/sync-design-tokens.md +29 -0
- package/catalog/harness.config.schema.json +14 -0
- package/catalog/hooks/init.sh +11 -11
- package/catalog/hooks/lib/lemony.sh +3 -3
- package/catalog/hooks/lib/playbook-scan.sh +10 -11
- package/catalog/hooks/session-close.sh +7 -7
- package/catalog/schemas/tier2-events-history.md +11 -11
- package/catalog/schemas/tier2-events.md +46 -47
- package/catalog/skills/a11y-audit/SKILL.md +121 -0
- package/catalog/skills/bootstrap-architecture/SKILL.md +3 -3
- package/catalog/skills/build-ui/SKILL.md +147 -0
- package/catalog/skills/build-ui/accessibility.md +101 -0
- package/catalog/skills/build-ui/anti-slop.md +107 -0
- package/catalog/skills/code-explorer/SKILL.md +1 -1
- package/catalog/skills/design-critique/SKILL.md +110 -0
- package/catalog/skills/design-tool-sync/SKILL.md +120 -0
- package/catalog/skills/grill-ui/SKILL.md +187 -0
- package/catalog/skills/grill-ui/ui-handoff-format.md +148 -0
- package/catalog/skills/grill-with-docs/SKILL.md +9 -2
- package/catalog/skills/mutation-testing/SKILL.md +1 -1
- package/catalog/skills/note-side-finding/SKILL.md +1 -1
- package/catalog/skills/playbook-iterate/SKILL.md +2 -2
- package/catalog/skills/review-pr/SKILL.md +3 -3
- package/catalog/skills/task-closeout/SKILL.md +9 -8
- package/catalog/skills/update-architecture/SKILL.md +3 -3
- package/catalog/templates/claude-code/agents.md.tpl +16 -10
- package/catalog/templates/claude-code/docs/playbooks/README.md.tpl +1 -3
- package/catalog/templates/claude-code/harness.config.yml.tpl +9 -1
- package/dist/cli.mjs +1286 -1665
- package/package.json +13 -4
- package/catalog/agents/README.md +0 -29
- package/catalog/hooks/README.md +0 -56
- package/catalog/playbook-format.md +0 -198
- package/catalog/schemas/README.md +0 -13
- package/catalog/skills/README.md +0 -62
- package/catalog/templates/README.md +0 -32
@@ -1,25 +1,125 @@
 ---
 name: ui-designer
-description:
+description: Design director + design QA. Invoked on-demand by the Orchestrator at two moments — SPEC (DEFINE — author the ui-handoff.md design contract) and REVIEW (design + accessibility QA as a distinct lens, alongside the Reviewer). Always installed; not a linear step in the flow.
 role: UI Designer
-reification:
-invoked-when:
-
+reification: sub-agent
+invoked-when: SPEC (DEFINE — author ui-handoff.md) and REVIEW (design QA)
+origin: vendor
+vendor_version: '{{vendor_version}}'
 ---
 
 # UI Designer
 
-
-
+A **sub-agent** with fresh context, always installed and invoked **on-demand** by
+the Orchestrator — never a linear step in DEFINE / TRIAGE. It mirrors the Architect's
+reification: present in the catalog, dispatched only when a task needs it. The
+Orchestrator decides _when_ (the activation gate) and owns the human dialogue and the
+labels; the UI Designer produces an artifact and reports.
 
-
-workflow), this role would help produce wireframes/mockups/prototypes and attach
-designs to issues.
+## Identity & taste
 
-
+You are a **design director with real taste** — the kind a solo dev without a designer
+wishes they had on call. Hold a high bar and bring conviction:
 
--
-
-
-
-
+- **Default to deliberate, distinctive design.** The failure mode to avoid is generic
+"AI slop" — the overused font, the predictable purple gradient, the cookie-cutter
+card grid, the layout with no point of view. When the user has no opinion, propose a
+direction, don't reach for the safe average.
+- **Have a point of view, hold it lightly.** Recommend boldly and explain _why_, then
+let the user decide. It is their product; your job is to make the tasteful choice the
+obvious one, not to impose it.
+- **Decisions, not know-how.** You set direction — tone, dials, layout, component and
+state decisions, motion appetite, accessibility targets, microcopy. _How_ to apply
+tokens to a stack, _how_ to avoid slop in code, _how_ to meet WCAG — that is the
+implementer's and reviewer's craft, delivered through their skills. Never restate it.
+- **Ground taste in craft, not vibes.** Characterful typography, intentional colour
+driven by semantic tokens, deliberate spacing and hierarchy, motion that carries
+meaning, accessible by construction. Cite the reason a choice is good.
+
+## Two invocation moments
+
+The Orchestrator invokes you at exactly two points in an L1 task that touches UI
+(authority for the gate and label lifecycle is `orchestrator.md` §UI design):
+
+1. **SPEC — DEFINE.** After the grill, when the Orchestrator classifies the task as
+touching UI, you author **`ui-handoff.md`** under `.claude/state/tasks/<id>/spec/`
+— a sibling of the Spec Author's `requirements.md` / `design.md` / `tasks.md`. Run
+the **`grill-ui`** interview to do it: a design-direction grill (one question at a
+time, recommend-don't-impose) that inherits the PRD, consumes `docs/personas.md` if
+present, and produces the handoff per its 11-section contract. The handoff captures
+the **design decisions and targets** the implementer needs (not know-how — that
+lives in skills). It is **part of completing the spec**, not a new lifecycle state:
+the issue stays at `harness:status:spec-in-progress` while it is authored. You
+**own** this artifact (the Spec Author owns the other three); when a later discovery
+changes a design decision, the change routes back to you.
+- **Personas persistence (Orchestrator-owned offer).** When `docs/personas.md` was
+**absent** and `grill-ui` gathered personas inline (handoff §1), **flag that in your
+return** — do not write `docs/personas.md` yourself mid-interview (it is client-owned,
+and a sub-agent must not interrupt the human). The Orchestrator makes the persist offer
+on its human-facing surface; if the human accepts, it re-dispatches you to author a
+minimal `docs/personas.md` from the §1 personas — the client's own words, never a guessed
+cast. Absent-and-declined is fine: the inline personas stay in the handoff for this task.
+- **Token sync (on-demand, not a step).** If the project binds a design tool (a
+`com.lemony.design-tool` provider in `docs/design-tokens.json`), check the drift state
+(`lemony status`). When an export is pending **and the tool is connected**, offer to
+project the change with the **`design-tool-sync`** skill — never sync silently, and skip
+gracefully if the tool isn't connected. The human's explicit handle is
+`/sync-design-tokens`.
+2. **REVIEW.** When an implemented UI change reaches review, the
+Orchestrator invokes you as a **distinct lens** alongside the Reviewer (code), judging
+the design and accessibility of what was built against the `ui-handoff.md` you authored.
+You mirror the Reviewer's own shape — a **mechanical pre-pass**, then **judgment**,
+then **one verdict**:
+- **Mechanical pre-pass** (you have a shell). Run the deterministic gates:
+`lemony design-tokens validate` and `lemony design-tokens contrast`, and detect and
+run the project's own accessibility tooling (its a11y lint; axe on the rendered DOM
+where the repo can render it). These are cheap facts, no judgment.
+- **Judgment.** Run `design-critique` (does the build carry the handoff's point of
+view?) and `a11y-audit` (the WCAG criteria the tools can't decide).
+- **One design verdict.** Return a single pass/reject, with findings **grouped by
+source** (tokens / accessibility / craft). You are one of exactly two review inputs
+(you and the Reviewer); either lens rejecting routes back to the Implementer, and both
+passing reaches the single human merge gate. The Reviewer's code lens is not
+design-aware — that separation is deliberate.
+
+## The handoff contract
+
+`ui-handoff.md` follows the canonical **11-section** contract carried by the `grill-ui`
+skill (`ui-handoff-format.md`): (1) personas; (2) aesthetic direction — the dials
+(variance / motion / density) and optional tone preset; (3) screen inventory + nav/flow;
+(4) per-screen layout; (5) components; (6) states; (7) responsive; (8) motion/interaction;
+(9) accessibility; (10) microcopy; (11) token reference. Two principles run through it:
+
+- **Graduated altitude.** Decisions-altitude by default (reference the system, name the
+variant, state what matters); the full component anatomy only for a novel or critical
+component.
+- **Text-first, tokens by reference.** Screens travel as text — an inventory, a nav/flow
+map and structural intent; tokens point at `docs/design-tokens.json`, never inlined.
+
+A `Status:` field marks the handoff `in_progress` (design being defined) vs `completed`.
+
+## Artifact ownership
+
+- **`ui-handoff.md`** lives in `tasks/<id>/spec/`, so it travels with the spec on the
+task branch and is **archived with the spec** at closeout (`task-closeout` `git mv`s
+the whole `spec/` directory into `_archive/<id>/`). You never create or close the
+PRD; the Orchestrator owns that via `grill-with-docs`.
+
+## When the design is underspecified
+
+Don't guess past a consequential open fork. If a design decision needs input the PRD
+left open with more than one valid option, **run `raise-discovery`** (a
+`T2 UNSPECIFIED_DECISION`): write the entry to `tasks/<id>/discoveries.md`, return the
+one-line summary, and stop. The Orchestrator mediates with the human and re-invokes
+you with the decision. For an **independent** defect unrelated to your work, use
+`note-side-finding` instead — append it to your return summary and keep going.
+
+## Skills
+
+The installer fills this list with the skills your repo's capabilities resolved to;
+the rich "how" of each lives in its own `SKILL.md`. `grill-ui` is your DEFINE interview;
+`design-critique` and `a11y-audit` are your two REVIEW lenses (run after the mechanical
+pre-pass — see "Two invocation moments"); `design-tool-sync` is the on-demand token
+connector to a design tool (`/sync-design-tokens`).
+
+{{SKILLS}}
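The REVIEW routing rule described in the added text (either lens rejecting routes back to the Implementer, both passing reaches the human merge gate) can be sketched as a small shell function. This is only an illustration: `route_review` and its output strings are hypothetical names, not part of the package.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the package): combine the two review verdicts.
# $1 = Reviewer (code) verdict, $2 = UI Designer (design) verdict; each "pass" or "reject".
route_review() {
  local reviewer_verdict="$1" designer_verdict="$2"
  if [ "$reviewer_verdict" = "pass" ] && [ "$designer_verdict" = "pass" ]; then
    # Both lenses passed: the change reaches the single human merge gate.
    echo "human-merge-gate"
  else
    # Any rejection, from either lens, routes back to the Implementer.
    echo "implementer"
  fi
}

route_review pass pass
route_review pass reject
```

Treating anything other than two explicit passes as a rejection is an assumption of this sketch; it keeps an unexpected verdict string from reaching the merge gate.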
@@ -6,8 +6,8 @@ allowed-tools: Read, Bash, Task
 # /add-capability
 
 Activate an **opt-in** capability that `install`/`doctor` reported as _available but not
-installed_. The harness reports these latent capabilities
-convention artifact on its own
+installed_. The harness reports these latent capabilities but never creates the
+convention artifact on its own. This command is the **user-initiated**
 "yes, add it" — it has the right agent author the artifact, then re-syncs so the gated
 skill installs.
 
@@ -38,7 +38,7 @@ latent ones.
 latent): dispatch the **Architect** (Task tool, fresh context) with the
 **`bootstrap-architecture`** skill. It reads the repo and authors the first
 `docs/architecture.md` — a holistic map **fitted to this project**, read from the
-actual code, never a vendor template
+actual code, never a vendor template. When it returns, show the human
 its summary and the new file — they own the result and may edit it. If the Architect
 reports the project has no meaningful architecture to map, it writes nothing: relay
 that to the user and **end the command** — skip step 4, force nothing.
@@ -11,7 +11,9 @@ Enter **RESUME** mode and pick up an existing task exactly as specified in
 is the single source for the procedure.
 
 `$ARGUMENTS` is the task `#id` or name to resume. If empty, list the open queue
-(`gh issue list -l harness:status:spec-ready`, `-l harness:status:in-progress`,
+(`gh issue list -l harness:status:spec-ready`, `-l harness:status:in-progress`,
+`-l harness:status:spec-in-progress` for specs parked mid-authoring — including a UI task
+parked at `awaiting design definition` (+ `harness:needs-design`), and
 `-l harness:status:pending` for stubs captured by `/spinoff`), plus parked closeouts
 (`-l harness:status:closeout-pending --state all` — their issue is already **closed** by
 the task PR's `Closes #<id>`, so an open-only listing would miss them), and ask which to
@@ -23,8 +25,12 @@ In brief (authority is the orchestrator): for an SDD task the state and spec liv
 `git branch -r --list "origin/harness/<id>-*"`), then reload
 `.claude/state/tasks/<id>/` and continue from `progress.md`. A `spec-ready` issue
 resumes at the **approval gate** (run it — read the spec cold, never self-approve);
-an `in-progress` one resumes at the active subtask.
-`
+an `in-progress` one resumes at the active subtask. A `spec-in-progress` task whose
+`progress.md` records `awaiting design definition` is a **UI design parked at "stop for
+handoff"** (authority: orchestrator §UI design): re-enter by dispatching the **UI
+Designer** to author/finish `ui-handoff.md`, then drop `harness:needs-design` and
+continue toward spec-ready. When `progress.md` records
+`Mode: step-by-step`, the `## Step log` carries the step sub-state — resume
 exactly there: `awaiting human checkpoint (step N/M)` re-presents that pending
 checkpoint (inspect / run / OK / changes / OK+downgrade); a `fix-loop iteration K`
 line re-enters the per-step implement→review loop at that iteration (authority:
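The resume rules in the hunk above amount to a mapping from an issue's status label, plus one note from `progress.md`, to a re-entry point. A minimal sketch of that mapping follows; `resume_point`, its output strings, and the `continue-spec` and `unknown` fallbacks are hypothetical names chosen for illustration and do not come from the package.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the package): pick the re-entry point for a task.
# $1 = status label suffix (the part after "harness:status:")
# $2 = optional note recorded in progress.md
resume_point() {
  local status_label="$1" progress_note="${2:-}"
  case "$status_label" in
    spec-ready)  echo "approval-gate" ;;    # read the spec cold, never self-approve
    in-progress) echo "active-subtask" ;;
    spec-in-progress)
      if [ "$progress_note" = "awaiting design definition" ]; then
        echo "dispatch-ui-designer"         # author or finish ui-handoff.md
      else
        echo "continue-spec"                # assumption: keep authoring the spec
      fi
      ;;
    *) echo "unknown" ;;                    # assumption: labels not covered here
  esac
}

resume_point spec-ready
resume_point spec-in-progress "awaiting design definition"
```

The step-by-step sub-states (`awaiting human checkpoint`, `fix-loop iteration K`) are left out on purpose; the source assigns those to the orchestrator's step log.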
@@ -33,7 +39,7 @@ the orchestrator §Step-by-step implementation). If the Mode line carries a
 resume at the active subtask as usual, not via the step loop (the Step log is then
 history, not sub-state).
 
-**Cross-machine pickup of an `in-progress` task
+**Cross-machine pickup of an `in-progress` task**: the branch on origin
 carries the last **successful** best-effort WIP push — the Implementer pushes on
 signaling done, and step-by-step pushes at each checkpoint-wait. If a **local** copy
 of the branch is ahead of origin, prefer it (it is the newer state — this is what
@@ -37,8 +37,8 @@ code pointer). If empty, ask one short question to capture the symptom, then pro
 
 Omit `--parent` when there is no active task; omit `--severity` unless you can infer
 it cheaply (it is best-effort and never blocks capture). Omit `--kind` for an ordinary
-defect — its one value, `architecture-drift`, is for `docs/architecture.md` map staleness
-
+defect — its one value, `architecture-drift`, is for `docs/architecture.md` map staleness:
+it tags the stub `harness:architecture-drift` so pickup routes it to the
 Architect's `update-architecture` instead of a code change. The stub creation is
 **fail-loud** — a non-zero exit means the issue did not open, so surface it; do not
 pretend the defect was captured. The telemetry emit is **best-effort** — if only that
@@ -0,0 +1,29 @@
+---
+description: Sync design tokens between docs/design-tokens.json and the design tool (import or export).
+allowed-tools: Read, Write, Edit, Bash, Task, Skill
+---
+
+# /sync-design-tokens
+
+Sync the project's design tokens with its design tool. This command only **starts the flow** —
+it does not redefine it. The authority is the **`design-tool-sync`** skill, run by the UI
+Designer; follow the steps there.
+
+`$ARGUMENTS` selects the direction:
+
+- `import` — pull the tool's variables into `docs/design-tokens.json` (the primary flow). The
+CLI maps them to the 3-tier model and prints a diff; you present it and the human curates
+which slice lands before it is written.
+- `export` — project the JSON to the tool as an additive upsert (creates and updates, never
+deletes tool-only variables). Preview the plan, push on confirmation, then record the drift
+baseline only after the push succeeds.
+- _empty_ — check the drift state (`lemony status`) and, if an export is pending and the tool
+is connected, offer `export`; otherwise report the current state.
+
+The design tool is a **projection** of the canonical JSON, never a peer source of truth.
+Detect the tool at runtime: read the `com.lemony.design-tool` binding at the root of
+`docs/design-tokens.json`; if no tool is declared, the project is pure-code and there is
+nothing to sync. If the declared tool's MCP server is not connected, skip gracefully with a
+note — the deterministic drift check still works without it.
+
+The human reviews every write (to the JSON and to the tool). Never sync silently.
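The three-way `$ARGUMENTS` dispatch described by the new command can be sketched as a `case` statement. This is a hedged illustration only: `sync_direction`, its output tokens, and the handling of an unrecognised argument are assumptions, and the real flow lives in the `design-tool-sync` skill.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the package): map the command argument to a flow name.
sync_direction() {
  case "${1:-}" in
    import) echo "import-flow" ;;   # pull tool variables into docs/design-tokens.json
    export) echo "export-flow" ;;   # additive upsert from the JSON to the tool
    "")     echo "drift-check" ;;   # empty: inspect drift, offer export if one is pending
    *)
      # Assumption: the source does not say what happens on any other value.
      echo "unknown direction: $1" >&2
      return 2
      ;;
  esac
}

sync_direction import
sync_direction ""
```

None of the branches writes anything, which matches the rule that the human reviews every write before it lands.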
@@ -104,6 +104,20 @@
 }
 },
 "additionalProperties": false
+},
+"design_tokens": {
+"default": {},
+"type": "object",
+"properties": {
+"scan_extensions": {
+"default": [],
+"type": "array",
+"items": {
+"type": "string"
+}
+}
+},
+"additionalProperties": false
 }
 },
 "required": [
package/catalog/hooks/init.sh
CHANGED
@@ -1,5 +1,5 @@
 #!/usr/bin/env bash
-# Lemony — Claude Code SessionStart hook
+# Lemony — Claude Code SessionStart hook.
 #
 # Read-only orient run on every session entry filtered by source: orient on
 # {startup, resume, clear}; skip {compact} silently to keep long sessions free
@@ -37,7 +37,7 @@ ERRORS=()
 WARNINGS=()
 
 # ── Block 1 ─ harness.config.yml present and parseable ─────────────────────
-# Pure-bash validation (no yq
+# Pure-bash validation (no yq): presence of the four keys the harness
 # needs to operate, checked top-level plus block-scoped under `task_storage`
 # in one awk pass, then extraction of the two values the orient output uses.
 # Deep schema validation (formats, enums, did-you-mean typos) lives in
@@ -96,7 +96,7 @@ fi
 
 # ── Warning 1 ─ vendor version drift (config vs installed CLI) ─────────────
 # Resolve the CLI for the version read: the project-local devDependency bin first
-# (it survives an fnm Node switch
+# (it survives an fnm Node switch), then a global install. No npx fallback —
 # the orient hook stays inside its <1s p99 budget, so a non-resolvable CLI just
 # skips the drift check rather than paying a network round trip every session.
 LF_BIN=""
@@ -108,7 +108,7 @@ fi
 if [ -n "$CONFIG_VERSION" ] && [ -n "$LF_BIN" ]; then
 INSTALLED_VERSION="$("$LF_BIN" --version 2>/dev/null | head -1)"
 if [ -n "$INSTALLED_VERSION" ] && [ "$CONFIG_VERSION" != "$INSTALLED_VERSION" ]; then
-# Direction-neutral by design
+# Direction-neutral by design: the hook stays offline + node-free, so it
 # can't order the two versions reliably (no portable semver compare in bash —
 # BSD `sort` lacks `-V`). It flags the drift with a remedy safe in BOTH
 # directions; `lemony doctor` resolves the precise direction (it has a
@@ -116,10 +116,10 @@ if [ -n "$CONFIG_VERSION" ] && [ -n "$LF_BIN" ]; then
 WARNINGS+=("vendor version drift: harness.config.yml pins $CONFIG_VERSION, installed CLI is $INSTALLED_VERSION. If your CLI is newer, run \`lemony update\`; if it is older, upgrade your CLI (\`npm install\`) and do NOT run \`update\` (it would downgrade the repo). \`lemony doctor\` tells you which.")
 fi
 elif [ -n "$CONFIG_VERSION" ]; then
-#
+# The repo pins a version but no CLI resolves (a teammate never ran
 # `npm install`, no global). This used to be a silent skip — so a stale CLI could
 # drift with no signal at all. Emit a visible line instead, so the check's absence
-# isn't mistaken for "no drift". Still offline
+# isn't mistaken for "no drift". Still offline: no registry lookup, just a
 # heads-up to install the project-local CLI.
 WARNINGS+=("could not verify CLI version: harness.config.yml pins $CONFIG_VERSION but no \`lemony\` CLI resolved. Run \`npm install\` to get the project-local one (then re-check with \`lemony doctor\`). Version-drift check skipped.")
 fi
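The direction-neutral check in the two hunks above compares the pinned and installed versions for inequality only, and treats a missing CLI as its own visible state. A simplified sketch of that classification follows. `classify_drift` and its output tokens are hypothetical, and collapsing "no CLI resolved" into an empty second argument is an assumption made to keep the example short.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the package): classify version drift without ordering.
# $1 = version pinned in harness.config.yml
# $2 = version reported by the CLI; empty here stands for "no CLI resolved"
classify_drift() {
  local config_version="$1" installed_version="${2:-}"
  if [ -z "$config_version" ]; then
    echo "skip"          # nothing pinned, nothing to compare
  elif [ -z "$installed_version" ]; then
    echo "unverified"    # visible heads-up instead of a silent skip
  elif [ "$config_version" != "$installed_version" ]; then
    echo "drift"         # string inequality only: no semver ordering is attempted
  else
    echo "ok"
  fi
}

classify_drift "0.1.0" "0.1.1-alpha.0"
classify_drift "0.1.0" ""
```

Because only inequality is tested, the remedy attached to `drift` has to be safe whichever side is newer, which is why the hook defers the precise direction to `lemony doctor`.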
@@ -163,7 +163,7 @@ fi
 # If the blocks didn't fire and we know the user, ensure the per-dev pointer
 # file exists with a fresh `session_start_ts` so `session-close.sh` can compute
 # `session_active_h` accurately. The refresh is a pure-awk in-place rewrite of
-# the frontmatter scalar (
+# the frontmatter scalar (no yq).
 if [ "${#ERRORS[@]}" -eq 0 ] && [ -n "$GIT_USER_EMAIL" ]; then
 NOW_ISO="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
 # `%%@*` strips from the FIRST `@` (parity with status.ts's `email.split('@')[0]`),
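The `%%@*` expansion mentioned in the last context line removes the longest suffix matching `@*`, so everything from the first `@` onward is dropped. A small sketch, where `slug_from_email` is a hypothetical wrapper and the addresses are made-up examples:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not from the package) around the %%@* parameter expansion.
slug_from_email() {
  local email="$1"
  # %% removes the LONGEST suffix matching the pattern "@*",
  # so the cut happens at the first "@" in the string.
  printf '%s\n' "${email%%@*}"
}

slug_from_email "dev@example.com"   # prints: dev
slug_from_email "odd@name@host"     # prints: odd

# For contrast, a single % removes the SHORTEST matching suffix (cuts at the last "@").
email="odd@name@host"
printf '%s\n' "${email%@*}"         # prints: odd@name
```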
@@ -223,11 +223,11 @@ EOF
 fi
 fi
 
-# ── Telemetry opt-out disclosure
+# ── Telemetry opt-out disclosure ───────────────────────────────────────────
 # Anonymous telemetry is on by default, so the first session — and any session
 # after the effective consent changes — must surface the opt-out. The sentinel +
 # fingerprint logic lives in TS (one resolver, no bash duplication of the env ›
-# local › config precedence + .strict() fail-safe
+# local › config precedence + .strict() fail-safe); init.sh just shells
 # out when a CLI resolves ($LF_BIN from the version-drift check) and skips
 # otherwise. A CLI-less teammate is not left uninformed: `install` already showed
 # the notice once. Healthy-path only, and never fails the boot — the command
@@ -236,8 +236,8 @@ if [ "${#ERRORS[@]}" -eq 0 ] && [ -n "$LF_BIN" ]; then
 "$LF_BIN" telemetry notice 2>/dev/null || true
 fi
 
-# ── Telemetry catch-up send
-# SessionEnd's send
+# ── Telemetry catch-up send ────────────────────────────────────────────────
+# SessionEnd's send is the timely path; this is the safety net for a
 # session that closed uncleanly (crash / SIGKILL / no SessionEnd fired) and so
 # never shipped its tail. Flush the unsent tail of events.jsonl now, on entry.
 # Detached in a subshell (`( … & )`) with output redirected so SessionStart never
@@ -1,10 +1,10 @@
 #!/usr/bin/env bash
-# Lemony — CLI launcher
+# Lemony — CLI launcher.
 #
 # The harness's hooks, agents and commands invoke the telemetry CLI (`lemony
 # emit …`). Calling the bare `lemony` binary is fragile: an `npx …` install
-# leaves nothing on PATH
-# global bin when the active version changes
+# leaves nothing on PATH, and a Node version manager (fnm/nvm) drops the
+# global bin when the active version changes, so `command -v lemony`
 # silently fails and telemetry is skipped.
 #
 # This launcher resolves the CLI deterministically, preferring the project-local
@@ -2,21 +2,21 @@
 # Discover playbooks from two layers and expose lookup helpers used by the
 # require-playbook and suggest-playbook hooks.
 #
-# Layers
+# Layers:
 # - Local: <repo_root>/docs/playbooks/<topic>.md (committed)
 # - Global: <home>/.claude/playbooks/<topic>.md (per-developer)
 # A local file shadows the global one for the same topic (filename stem).
 #
-# Frontmatter (
+# Frontmatter (the playbook format):
 # applies_to: list of bash globs → file paths that REQUIRE this playbook
 # keywords: list of regex strings → prompts that SUGGEST this playbook
 # Both lists are optional and independent; missing/empty lists are no-ops.
 #
 # Frontmatter is read with a SINGLE awk pass over every discovered playbook
-# (`_scan_frontmatter_tsv`), not one `yq` fork per playbook (
+# (`_scan_frontmatter_tsv`), not one `yq` fork per playbook (the
 # per-file fork dominated hook latency, blowing the <50ms p99 target past ~5
 # playbooks). awk is also the only YAML reader now: the keyword format is
-# constrained to the awk-parseable subset (
+# constrained to the awk-parseable subset (no `\b`/`\d` escape
 # sequences), so `yq`'s extra fidelity was unused while its ~10-30ms cold fork
 # was pure cost. jq remains the keyword regex engine (Oniguruma, case-insensitive),
 # batched into one invocation across all keywords. Without jq, suggestion lookups
@@ -31,7 +31,7 @@
 # regress on stock macOS — keep the regex path even when bash 4+ is present.
 
 # Resolve the two playbook lookup dirs into PLAYBOOKS_LOCAL_DIR / PLAYBOOKS_GLOBAL_DIR
-# from `harness.config.yml`'s `paths` block (
+# from `harness.config.yml`'s `paths` block (config-driven):
 # paths.playbooks → local layer, repo-relative (default docs/playbooks)
 # paths.playbooks_global → global layer, ~ expanded (default ~/.claude/playbooks)
 #
@@ -138,13 +138,13 @@ discover_playbooks() {
 # playbook path in a single awk pass: one line per item, `<file>\t<field>\t<value>`,
 # with surrounding single or double quotes stripped. This is the batched replacement for
 # the old per-file/per-field reader: at K playbooks the hooks went from K (or
-# 2K) subprocess forks to exactly one
+# 2K) subprocess forks to exactly one.
 #
 # awk reads the leading `---` frontmatter block of each file (resetting on
 # `FNR==1`, stopping at the second `---`). Supports the documented subset —
 # block lists (`field:\n - item`) and inline lists (`field: [a, b]`) of scalar
 # strings. It does NOT process YAML escape sequences inside double-quoted
-# strings; keywords are constrained to that subset
+# strings; keywords are constrained to that subset (no `\b`/`\d`
 # word boundaries). A file with no frontmatter contributes nothing.
 #
 # `FILENAME` is the path exactly as passed (ALL_PLAYBOOKS holds absolute paths),
@@ -204,8 +204,7 @@ _scan_frontmatter_tsv() {
 ' "$@" 2>/dev/null
 }
 
-# Convert a bash glob into an anchored POSIX-ERE regex. Supported subset
-# in sync with catalog/playbook-format.md):
+# Convert a bash glob into an anchored POSIX-ERE regex. Supported subset:
 # `**/` — zero or more path components (so `**/x` matches `x` at root
 # too; this is what users intuitively expect even though bash
 # globstar would require at least one component)
@@ -228,7 +227,7 @@ _scan_frontmatter_tsv() {
 # Result is returned in the global `GLOB_RE`, NOT printed: callers run this once
 # per applies_to glob (5×K times at K playbooks), and a `regex="$(glob_to_regex)"`
 # command substitution would fork a subshell each time — the dominant cost of
-# the require hook at scale
+# the require hook at scale. Assigning a global keeps it fork-free.
 glob_to_regex() {
 local glob="$1"
 local re=""
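The comments in the two hunks above describe a glob-to-regex conversion that returns its result in a global rather than on stdout. The sketch below illustrates that idea under stated assumptions: `glob_to_regex_sketch` is a hypothetical name, it is not the package's `glob_to_regex`, and it handles only `**/`, `*`, `?` and literal characters because the full supported subset is not visible in this diff.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not the package's implementation): glob -> anchored POSIX ERE.
# The result is stored in the global GLOB_RE instead of being printed, so callers
# avoid the subshell fork that regex="$(...)" would cost on every call.
glob_to_regex_sketch() {
  local rest="$1" re="" c
  while [ -n "$rest" ]; do
    case "$rest" in
      '**/'*)
        re="${re}(.*/)?"            # "**/" matches zero or more path components
        rest=${rest#\*\*/}
        continue
        ;;
    esac
    c="${rest%"${rest#?}"}"         # first character of what is left
    rest="${rest#?}"
    case "$c" in
      '*') re="${re}[^/]*" ;;       # "*" stays inside one path component
      '?') re="${re}[^/]" ;;
      '.'|'+'|'('|')'|'['|']'|'{'|'}'|'^'|'$'|'|'|'\')
        re="${re}\\${c}" ;;         # escape ERE metacharacters
      *) re="${re}${c}" ;;
    esac
  done
  GLOB_RE="^${re}\$"
}

glob_to_regex_sketch '**/x.md'
printf '%s\n' "$GLOB_RE"
# Of the three candidate paths, only the first two match the pattern.
printf '%s\n' 'x.md' 'a/b/x.md' 'ax.md' | grep -E "$GLOB_RE"
```

Treating `**/` as an optional group is what lets `**/x.md` match `x.md` at the repository root, the behaviour the source comment calls out as differing from bash globstar.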
@@ -359,7 +358,7 @@ playbook_scan_for_path() {
 # contains a regex matching the prompt. A single jq invocation tests the prompt
 # (passed as a JSON-safe --arg, newlines intact) against every keyword regex
 # with jq's Oniguruma engine (case-insensitive), avoiding macOS BSD grep regex
-# limits and the old one-fork-per-keyword cost
+# limits and the old one-fork-per-keyword cost. Distinct matching
 # files are emitted in first-seen (discovery) order.
 playbook_scan_for_prompt() {
 local repo_root="$1"
@@ -1,5 +1,5 @@
 #!/usr/bin/env bash
-# Lemony — Claude Code SessionEnd hook + `/pause` adaptor
+# Lemony — Claude Code SessionEnd hook + `/pause` adaptor.
 #
 # Fires:
 # • automatically on SessionEnd (no `--manual`) → auto-close path: writes a
@@ -61,7 +61,7 @@ NOW_ISO="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
 # ── Read current-<user>.md frontmatter ─────────────────────────────────────
 # An awk scrape of the leading frontmatter block — works for the shallow
 # scalars the harness writes, fails gracefully on anything richer. yq was
-# removed harness-wide
+# removed harness-wide: jq is the sole hard dependency; awk/grep/git
 # (preinstalled) cover the rest.
 CURRENT_PATH="$REPO_ROOT/.claude/state/current-$USER_SLUG.md"
 SESSION_START_TS=""
@@ -148,7 +148,7 @@ EMIT_ARGS=(
 if [ -n "$ACTIVE_TASK" ] && [ "$ACTIVE_TASK" != "null" ]; then
 EMIT_ARGS+=(--task-id="$ACTIVE_TASK")
 fi
-# Resolve the CLI via the launcher (local devDependency → global → fail-fast
+# Resolve the CLI via the launcher (local devDependency → global → fail-fast)
 # rather than a bare `command -v` on PATH, which an npx install or an fnm Node
 # switch leaves empty. Fail open — a missing launcher or a failed emit warns and
 # continues; the session must never be blocked by telemetry.
@@ -160,13 +160,13 @@ else
|
|
|
160
160
|
echo "session-close: CLI launcher missing ($LF_CLI) — skipping event emit. Re-run \`lemony install\`." >&2
|
|
161
161
|
fi
|
|
162
162
|
|
|
163
|
-
# ── Fire-and-forget telemetry send
|
|
163
|
+
# ── Fire-and-forget telemetry send ─────────────────────────────────────────
|
|
164
164
|
# Flush the unsent tail of events.jsonl to the ingest Worker. Detached in a
|
|
165
165
|
# subshell (`( … & )`) so SessionEnd never waits on the network; `nohup` lets it
|
|
166
166
|
# outlive a SIGHUP if the session tears down its process group before the send
|
|
167
|
-
# completes. The CLI has its own hard per-request timeout
|
|
167
|
+
# completes. The CLI has its own hard per-request timeout and never throws —
|
|
168
168
|
# output is discarded and a failure leaves the cursor untouched, so the next run
|
|
169
|
-
# (or init.sh catch-up
|
|
169
|
+
# (or init.sh catch-up) retries the same bytes. No-op when no endpoint is
|
|
170
170
|
# configured. Telemetry must never block or break session exit.
|
|
171
171
|
if [ -x "$LF_CLI" ]; then
|
|
172
172
|
( nohup "$LF_CLI" telemetry send >/dev/null 2>&1 & )
|
|
@@ -198,7 +198,7 @@ fi
|
|
|
198
198
|
# ── Update current-<user>.md frontmatter pointers ──────────────────────────
|
|
199
199
|
# Blank session_start_ts (the next SessionStart re-stamps it) and record the
|
|
200
200
|
# close ts. awk in-place rewrite of the frontmatter scalars — mirrors the read
|
|
201
|
-
# fallback above and the session_start_ts rewrite in init.sh (
|
|
201
|
+
# fallback above and the session_start_ts rewrite in init.sh (no yq).
|
|
202
202
|
if [ -f "$CURRENT_PATH" ]; then
|
|
203
203
|
# Rewrite each scalar if present; if a legacy/hand-edited pointer lacks the
|
|
204
204
|
# line, append it just before the closing `---` (parity with the `yq -i` set
|
|
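The comments in this hunk describe rewriting each frontmatter scalar if present and appending it just before the closing `---` when a legacy or hand-edited pointer lacks the line. A minimal sketch of that rewrite-or-append behaviour is given below. The function name `frontmatter_set` and the key `last_close_ts` used in the usage note are hypothetical, and the hook's real implementation may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration; key names and file layout are assumptions.

# Set `key: value` inside the leading frontmatter block. Rewrites the line when
# the key exists, otherwise appends it just before the closing `---`.
# An empty value blanks the scalar (the line becomes `key:`).
frontmatter_set() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)" || return 1
  awk -v key="$key" -v value="$value" '
    BEGIN { line = key ":" (value == "" ? "" : " " value) }
    NR == 1 && $0 == "---" { in_fm = 1; print; next }
    in_fm && $0 == "---" {
      if (!found) print line             # key was missing: append it
      in_fm = 0; print; next
    }
    in_fm && index($0, key ":") == 1 { print line; found = 1; next }
    { print }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Copy back through redirection so the original file mode is preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

With this sketch, `frontmatter_set "$CURRENT_PATH" session_start_ts ""` would blank the start timestamp, and `frontmatter_set "$CURRENT_PATH" last_close_ts "$NOW_ISO"` would record a close timestamp whether or not the line already existed. Note that `awk -v` interprets backslash escapes in the value, so this form suits plain scalars such as ISO timestamps.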
@@ -2,8 +2,8 @@
|
|
|
2
2
|
|
|
3
3
|
> **Forward-only changelog** of the event schema defined in
|
|
4
4
|
> [`tier2-events.md`](tier2-events.md). Readers add aliases here when they care
|
|
5
|
-
> about renamed fields; the schema document is always the current truth
|
|
6
|
-
>
|
|
5
|
+
> about renamed fields; the schema document is always the current truth. One
|
|
6
|
+
> entry per release that touches the schema. ~30 min/release.
|
|
7
7
|
|
|
8
8
|
Format per entry (one block per release):
|
|
9
9
|
|
|
@@ -33,16 +33,16 @@ Empty sections may be omitted.
|
|
|
33
33
|
### Added
|
|
34
34
|
|
|
35
35
|
- `review_rejected.attributed_kind` — optional `agent` | `skill` | `playbook`: the
|
|
36
|
-
kind of component the friction is attributed to
|
|
36
|
+
kind of component the friction is attributed to. Omitted when the emitter
|
|
37
37
|
can't attribute.
|
|
38
38
|
- `review_rejected.attributed_name` — optional free string (1-200): the component's
|
|
39
|
-
name
|
|
40
|
-
- `step_completed.attributed_kind` — optional, as above
|
|
41
|
-
- `step_completed.attributed_name` — optional, as above
|
|
39
|
+
name. Free-string by design this phase (measure-then-decide).
|
|
40
|
+
- `step_completed.attributed_kind` — optional, as above.
|
|
41
|
+
- `step_completed.attributed_name` — optional, as above.
|
|
42
42
|
|
|
43
43
|
### Changed
|
|
44
44
|
|
|
45
|
-
- **Field-tag model → 5 axes, assigned per `(event_type, field)
|
|
45
|
+
- **Field-tag model → 5 axes, assigned per `(event_type, field)`**. The
|
|
46
46
|
old `sensitive` / `internal` / `metric` tags become `local-only` / `identity` /
|
|
47
47
|
`free-text` / `internal-enum` / `metric`. **No field renamed, added, or removed** —
|
|
48
48
|
only the axis column changed (the tag is now a property of the occurrence, fixing
|
|
@@ -60,17 +60,17 @@ Empty sections may be omitted.
|
|
|
60
60
|
### Added
|
|
61
61
|
|
|
62
62
|
- `step_completed` — new event type emitted by the Orchestrator at each resolved
|
|
63
|
-
human checkpoint in step-by-step mode
|
|
63
|
+
human checkpoint in step-by-step mode. Fields: `task_id` (required),
|
|
64
64
|
`step` (required, 1-based `tasks.md` task number), `review_iterations`
|
|
65
65
|
(required, ≥ 1), `checkpoint_result` (required, `ok` | `changes` |
|
|
66
66
|
`ok_downgrade`). One event per checkpoint — a "changes" step emits again when
|
|
67
67
|
it re-checkpoints.
|
|
68
68
|
- `task_done.mode` — optional `all_at_once` | `step_by_step`: the mode chosen at
|
|
69
|
-
the L1 approval gate
|
|
69
|
+
the L1 approval gate. Absent on L2 and on earlier lines.
|
|
70
70
|
- `task_done.steps` — optional count of `step_completed` events for the task
|
|
71
71
|
(≥ 1). Only meaningful when `mode` is `step_by_step`.
|
|
72
72
|
- `review_rejected.step` — optional 1-based task number when the rejection came
|
|
73
|
-
from a per-step review
|
|
73
|
+
from a per-step review. Absent on full-pass and all-at-once rejections.
|
|
74
74
|
|
|
75
75
|
---
|
|
76
76
|
|
|
@@ -78,7 +78,7 @@ Empty sections may be omitted.
|
|
|
78
78
|
|
|
79
79
|
### Added
|
|
80
80
|
|
|
81
|
-
- `followup_captured` — new event type emitted by `/spinoff`
|
|
81
|
+
- `followup_captured` — new event type emitted by `/spinoff` when a
|
|
82
82
|
non-blocking defect found mid-task is parked as a stub. Fields: `task_id`
|
|
83
83
|
(required, the stub), `parent_task_id` (optional, the originating task),
|
|
84
84
|
`severity` (optional, best-effort). Distinct from `bug_post_merge` (post-merge
|