@lemoncode/lemony 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/PRIVACY.md +147 -0
- package/README.md +189 -0
- package/catalog/VERSION +1 -0
- package/catalog/agents/README.md +29 -0
- package/catalog/agents/architect.md +81 -0
- package/catalog/agents/fit-assessment.md +94 -0
- package/catalog/agents/implementer.md +67 -0
- package/catalog/agents/orchestrator.md +627 -0
- package/catalog/agents/reviewer.md +124 -0
- package/catalog/agents/spec-author.md +69 -0
- package/catalog/agents/ui-designer.md +25 -0
- package/catalog/commands/add-capability.md +69 -0
- package/catalog/commands/bypass.md +40 -0
- package/catalog/commands/define.md +24 -0
- package/catalog/commands/hotfix.md +47 -0
- package/catalog/commands/pause.md +52 -0
- package/catalog/commands/resume.md +56 -0
- package/catalog/commands/spinoff.md +59 -0
- package/catalog/commands/triage.md +24 -0
- package/catalog/harness.config.schema.json +116 -0
- package/catalog/hooks/README.md +56 -0
- package/catalog/hooks/init.sh +281 -0
- package/catalog/hooks/lib/lemony.sh +41 -0
- package/catalog/hooks/lib/playbook-scan.sh +394 -0
- package/catalog/hooks/lib/transcript-grep.sh +56 -0
- package/catalog/hooks/require-playbook.sh +97 -0
- package/catalog/hooks/session-close.sh +232 -0
- package/catalog/hooks/suggest-playbook.sh +72 -0
- package/catalog/playbook-format.md +198 -0
- package/catalog/schemas/README.md +13 -0
- package/catalog/schemas/tier2-events-history.md +104 -0
- package/catalog/schemas/tier2-events.md +286 -0
- package/catalog/skills/README.md +62 -0
- package/catalog/skills/bootstrap-architecture/SKILL.md +78 -0
- package/catalog/skills/code-explorer/SKILL.md +76 -0
- package/catalog/skills/grill-with-docs/ADR-FORMAT.md +49 -0
- package/catalog/skills/grill-with-docs/CONTEXT-FORMAT.md +77 -0
- package/catalog/skills/grill-with-docs/SKILL.md +270 -0
- package/catalog/skills/grill-with-docs/reference.md +236 -0
- package/catalog/skills/mutation-testing/SKILL.md +84 -0
- package/catalog/skills/note-side-finding/SKILL.md +89 -0
- package/catalog/skills/playbook-iterate/SKILL.md +78 -0
- package/catalog/skills/prd-to-spec/SKILL.md +181 -0
- package/catalog/skills/raise-discovery/SKILL.md +112 -0
- package/catalog/skills/resolve-discovery/SKILL.md +123 -0
- package/catalog/skills/review-pr/SKILL.md +106 -0
- package/catalog/skills/review-pr/reference.md +105 -0
- package/catalog/skills/security-review/SKILL.md +90 -0
- package/catalog/skills/senior-review/SKILL.md +99 -0
- package/catalog/skills/silent-failure-hunter/SKILL.md +76 -0
- package/catalog/skills/spec-compliance-check/SKILL.md +74 -0
- package/catalog/skills/spec-to-issue/SKILL.md +88 -0
- package/catalog/skills/task-closeout/SKILL.md +229 -0
- package/catalog/skills/tdd/SKILL.md +171 -0
- package/catalog/skills/test-gap-report/SKILL.md +71 -0
- package/catalog/skills/triage-issue/SKILL.md +102 -0
- package/catalog/skills/update-architecture/SKILL.md +69 -0
- package/catalog/skills/verify/SKILL.md +90 -0
- package/catalog/skills/write-adr/SKILL.md +77 -0
- package/catalog/templates/README.md +32 -0
- package/catalog/templates/claude-code/.claude/settings.json.tpl +34 -0
- package/catalog/templates/claude-code/agents.md.tpl +109 -0
- package/catalog/templates/claude-code/docs/playbooks/README.md.tpl +96 -0
- package/catalog/templates/claude-code/harness.config.yml.tpl +59 -0
- package/catalog/templates/claude-code/state/history.md.tpl +6 -0
- package/dist/cli.mjs +5691 -0
- package/package.json +80 -0
|
@@ -0,0 +1,90 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: verify
|
|
3
|
+
description: Mechanical verification of a change — the "does it work?" gate. Runs build, type-check, lint, tests + coverage, and dependency audit, then exercises the real code path. Use before signaling a change done (Implementer) and at the start of review (Reviewer, fresh context). Pairs with `senior-review`, which judges quality.
|
|
4
|
+
origin: vendor
|
|
5
|
+
vendor_version: '{{vendor_version}}'
|
|
6
|
+
phase: post-implementation
|
|
7
|
+
invoked-by: [implementer, reviewer]
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# Verify
|
|
11
|
+
|
|
12
|
+
The "**does it work?**" gate: a fixed, mechanical sequence with a pass/fail verdict.
|
|
13
|
+
No judgment — that is `senior-review`'s job. The **Implementer** runs `verify` before
|
|
14
|
+
signaling a change done; the **Reviewer** runs it again first thing, with fresh
|
|
15
|
+
context, because re-running the gates yourself is the anti-bias point — never trust
|
|
16
|
+
that they passed.
|
|
17
|
+
|
|
18
|
+
## Detect the ecosystem first
|
|
19
|
+
|
|
20
|
+
The gates are **language-agnostic**; only the commands differ. Read the project to
|
|
21
|
+
pick them — do not assume Node:
|
|
22
|
+
|
|
23
|
+
| Manifest present | Ecosystem | Gate commands (typical) |
|
|
24
|
+
| ------------------------------ | -------------------------------------- | ------------------------------------------- |
|
|
25
|
+
| `package.json` | **Node/TS** (the worked example below) | `node --run <script>` |
|
|
26
|
+
| `pyproject.toml` / `setup.cfg` | Python | `ruff`, `mypy`, `pytest` |
|
|
27
|
+
| `go.mod` | Go | `go build`, `go vet`, `go test` |
|
|
28
|
+
| `Cargo.toml` | Rust | `cargo build`, `cargo clippy`, `cargo test` |
|
|
29
|
+
|
|
30
|
+
Prefer the project's declared scripts/tasks over raw tool invocations (a
|
|
31
|
+
`package.json` `scripts` entry, a `Makefile` target, a `justfile` recipe) — they
|
|
32
|
+
encode the project's intended flags.
|
|
33
|
+
|
|
34
|
+
## The gate sequence — stop at the first failure
|
|
35
|
+
|
|
36
|
+
Run in order; a failure halts the sequence (a green later gate is meaningless if an
|
|
37
|
+
earlier one is red). Node/TS shown as the worked example:
|
|
38
|
+
|
|
39
|
+
1. **Build** — `node --run build` (skip if the project has no build step, e.g. a
|
|
40
|
+
pure app run with `tsx`).
|
|
41
|
+
2. **Type-check** — `node --run check-types` (`tsc --noEmit`).
|
|
42
|
+
3. **Lint** — `node --run lint` (oxlint). Warnings are not failures unless the
|
|
43
|
+
project treats them as errors.
|
|
44
|
+
4. **Tests + coverage** — `node --run test` (`vitest run`). Run with coverage when
|
|
45
|
+
the project exposes it (`node --run test:coverage`); coverage is reported, not a
|
|
46
|
+
hard gate (no thresholds — the project's testing playbook decides).
|
|
47
|
+
5. **Audit** — `npm audit --audit-level=moderate` (or the ecosystem's equivalent).
|
|
48
|
+
Report advisories; a moderate+ vulnerability introduced by this change is a
|
|
49
|
+
finding.
|
|
50
|
+
|
|
51
|
+
## Execution — exercise the real path
|
|
52
|
+
|
|
53
|
+
Type-checks and a green suite do not prove the code runs. Identify the project's
|
|
54
|
+
**archetype** (from its manifest/deps) and exercise it for real:
|
|
55
|
+
|
|
56
|
+
| Archetype | How to exercise |
|
|
57
|
+
| ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
58
|
+
| **Backend / API** | Start the service; hit the affected endpoint with `curl` / an HTTP client; inspect the response and the server logs. |
|
|
59
|
+
| **Frontend SPA** | Start the dev server; drive the browser to the changed view; verify rendered output and the console (no errors). Browser specifics → the client's e2e/run playbook. |
|
|
60
|
+
| **SSR** | Start the server; request the affected route; verify the rendered HTML and that hydration runs without console errors. |
|
|
61
|
+
| **CLI** | Run the command on a real fixture; check stdout/stderr and the exit code. |
|
|
62
|
+
| **Library** | Call the public API on a fixture, or import the built artifact; confirm exports resolve. |
|
|
63
|
+
| **Worker / job** | Trigger the job; observe the side effect (queue, DB row, emitted event). |
|
|
64
|
+
| **Monorepo** | Do the above for the **affected** package(s) only. |
|
|
65
|
+
|
|
66
|
+
The exact launch command is the project's, not the harness's — infer it from the
|
|
67
|
+
manifest scripts, and defer project-specific launch/browser steps to the client's
|
|
68
|
+
**run playbook** (`docs/playbooks/run.md` or as configured). Check both the happy path
|
|
69
|
+
and the error output (unexpected warnings/errors even when the happy path worked).
|
|
70
|
+
|
|
71
|
+
## Report
|
|
72
|
+
|
|
73
|
+
```
|
|
74
|
+
## Verify — <task name>
|
|
75
|
+
|
|
76
|
+
**Build**: ✅ / ❌ / N/A
|
|
77
|
+
**Type-check**: ✅ / ❌
|
|
78
|
+
**Lint**: ✅ / ⚠️ <warnings>
|
|
79
|
+
**Tests**: ✅ <N passing> / ❌ <failure>
|
|
80
|
+
**Coverage**: <reported, or N/A>
|
|
81
|
+
**Audit**: ✅ / ⚠️ <advisories>
|
|
82
|
+
**Execution**: ✅ <archetype + what was exercised> / ❌ <what broke>
|
|
83
|
+
|
|
84
|
+
**Verdict**: works / does not work — <one-line reason>
|
|
85
|
+
```
|
|
86
|
+
|
|
87
|
+
On a failure, read the **full** error, find the origin (not where it surfaced), fix
|
|
88
|
+
the root cause, and re-run from the failed gate. If a failure reveals the spec itself
|
|
89
|
+
is wrong or silent, that's a **discovery** — run `raise-discovery`, don't paper over
|
|
90
|
+
it.
|
|
@@ -0,0 +1,77 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: write-adr
|
|
3
|
+
description: Record an architecture decision as a numbered ADR in docs/adr/. Use when a discovery resolution (or an explicit request) settles a decision that is hard to reverse, surprising without context, and the result of a real trade-off — the three tests that earn an ADR. The Orchestrator invokes the Architect with this skill; it never writes an ADR for a decision that fails the tests.
|
|
4
|
+
origin: vendor
|
|
5
|
+
vendor_version: '{{vendor_version}}'
|
|
6
|
+
invoked-by: [architect]
|
|
7
|
+
trigger-condition: a resolved decision is worth recording (a discovery resolution, or an explicit request)
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# Write ADR
|
|
11
|
+
|
|
12
|
+
## Core Principle
|
|
13
|
+
|
|
14
|
+
An ADR records **that** a decision was made and **why** — not how to implement it. Its
|
|
15
|
+
audience is a future engineer who finds the code and asks "why on earth did they do it
|
|
16
|
+
this way?" Most ADRs are a single paragraph. The value is in the record, not in filling
|
|
17
|
+
out sections.
|
|
18
|
+
|
|
19
|
+
This is the **Architect's** skill, invoked on-demand by the Orchestrator — typically
|
|
20
|
+
when a discovery resolution settles something durable, or on an explicit request. It is
|
|
21
|
+
not part of the linear implementation flow.
|
|
22
|
+
|
|
23
|
+
## When an ADR is warranted (all three must hold)
|
|
24
|
+
|
|
25
|
+
1. **Hard to reverse** — changing your mind later carries a meaningful cost.
|
|
26
|
+
2. **Surprising without context** — a future reader would otherwise wonder why.
|
|
27
|
+
3. **The result of a real trade-off** — there were genuine alternatives and one was
|
|
28
|
+
chosen for specific reasons.
|
|
29
|
+
|
|
30
|
+
If a decision is easy to reverse, not surprising, or had no real alternative, **do not
|
|
31
|
+
write an ADR** — say so and stop. Examples that qualify: architectural shape, an
|
|
32
|
+
integration pattern between contexts, a technology choice that carries lock-in, a
|
|
33
|
+
boundary/scope decision, a deliberate deviation from the obvious path, a constraint not
|
|
34
|
+
visible in the code, a non-obvious rejected alternative.
|
|
35
|
+
|
|
36
|
+
## Process
|
|
37
|
+
|
|
38
|
+
1. **Confirm it qualifies.** Apply the three tests. If it fails, report that an ADR
|
|
39
|
+
isn't warranted and why — don't pad the log with reversible or obvious decisions.
|
|
40
|
+
2. **Number it.** Scan `docs/adr/` for the highest existing `NNNN-*.md` and increment.
|
|
41
|
+
Create `docs/adr/` lazily if this is the first ADR.
|
|
42
|
+
3. **Write it** to `docs/adr/NNNN-<slug>.md`. Start minimal:
|
|
43
|
+
|
|
44
|
+
```md
|
|
45
|
+
# NNNN — {Short title of the decision}
|
|
46
|
+
|
|
47
|
+
{1–3 sentences: the context, what was decided, and why.}
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
Add an optional section **only when it earns its place** (most ADRs need none):
|
|
51
|
+
- **Status** (`proposed | accepted | deprecated | superseded by ADR-NNNN`) — when the
|
|
52
|
+
decision may be revisited.
|
|
53
|
+
- **Considered options** — when the rejected alternatives are worth remembering.
|
|
54
|
+
- **Consequences** — when non-obvious downstream effects need calling out.
|
|
55
|
+
|
|
56
|
+
4. **Link, don't duplicate.** If the decision changes the architecture, the
|
|
57
|
+
`update-architecture` skill keeps `docs/architecture.md` current and references this
|
|
58
|
+
ADR — the ADR captures the _why_, the architecture doc the _shape_. Don't restate one
|
|
59
|
+
in the other.
|
|
60
|
+
|
|
61
|
+
## Report
|
|
62
|
+
|
|
63
|
+
Return to the Orchestrator: the ADR path and number, a one-line summary of the decision,
|
|
64
|
+
or — if it didn't qualify — that no ADR was written and which test it failed. This is the
|
|
65
|
+
same lightweight ADR convention `grill-with-docs` offers during a grill (`# NNNN — title`
|
|
66
|
+
|
|
67
|
+
- 1–3 sentences, optional sections only when they earn their place), so ADRs born either
|
|
68
|
+
way read the same.
|
|
69
|
+
|
|
70
|
+
## Uncontemplated Scenarios
|
|
71
|
+
|
|
72
|
+
When a case doesn't clearly fit:
|
|
73
|
+
|
|
74
|
+
1. Apply the closest matching approach with reasoning.
|
|
75
|
+
2. **Flag it**: "This isn't covered by the write-adr skill. I did [approach] because
|
|
76
|
+
[reason]. Want to refine the skill?"
|
|
77
|
+
3. Offer to add a rule for the case.
|
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# templates/ — installer render sources
|
|
2
|
+
|
|
3
|
+
> **Status: claude-code templates (P1–P4 s2).** `agents.md.tpl`,
|
|
4
|
+
> `harness.config.yml.tpl`, `state/history.md.tpl`, and `docs/playbooks/README.md.tpl`
|
|
5
|
+
> (the playbooks-home convention doc, P4 s2) exist and are rendered by the installer.
|
|
6
|
+
> `.claude/commands/`, `.claude/settings.json`, and the rest of `docs/` land in later
|
|
7
|
+
> phases.
|
|
8
|
+
|
|
9
|
+
Files the installer **renders into a client repo** during `install`. Templates use
|
|
10
|
+
placeholders (e.g. `{{vendor_version}}`, `{{target}}`, and the per-agent `{{SKILLS}}`
|
|
11
|
+
marker) resolved from the repo capability scan. They are organized by `target`
|
|
12
|
+
(decision #38), so additional targets plug in without a refactor.
|
|
13
|
+
|
|
14
|
+
## Planned layout
|
|
15
|
+
|
|
16
|
+
```
|
|
17
|
+
templates/
|
|
18
|
+
└── claude-code/ # only operative target in Sprint 0
|
|
19
|
+
├── agents.md.tpl # entry protocol / dispatcher
|
|
20
|
+
├── harness.config.yml.tpl # config with {{capabilities}} placeholders
|
|
21
|
+
├── .claude/
|
|
22
|
+
│ ├── agents/<role>.md.tpl # role instances (PRE/DURING/POST filled)
|
|
23
|
+
│ ├── commands/*.md.tpl # /resume, /define, /triage, /pause, /bypass, /hotfix
|
|
24
|
+
│ └── settings.json.tpl # hooks, permissions
|
|
25
|
+
├── state/
|
|
26
|
+
│ ├── current.md.tpl
|
|
27
|
+
│ └── history.md.tpl
|
|
28
|
+
└── docs/
|
|
29
|
+
├── playbooks/README.md.tpl # playbooks-home convention (P4 s2)
|
|
30
|
+
├── CONTEXT.md.tpl
|
|
31
|
+
└── adr/.tpl
|
|
32
|
+
```
|
|
@@ -0,0 +1,34 @@
|
|
|
1
|
+
{
|
|
2
|
+
"hooks": {
|
|
3
|
+
"SessionStart": [
|
|
4
|
+
{
|
|
5
|
+
"matcher": "startup|resume|clear",
|
|
6
|
+
"hooks": [
|
|
7
|
+
{ "type": "command", "command": ".claude/hooks/init.sh" }
|
|
8
|
+
]
|
|
9
|
+
}
|
|
10
|
+
],
|
|
11
|
+
"SessionEnd": [
|
|
12
|
+
{
|
|
13
|
+
"hooks": [
|
|
14
|
+
{ "type": "command", "command": ".claude/hooks/session-close.sh" }
|
|
15
|
+
]
|
|
16
|
+
}
|
|
17
|
+
],
|
|
18
|
+
"PreToolUse": [
|
|
19
|
+
{
|
|
20
|
+
"matcher": "Write|Edit",
|
|
21
|
+
"hooks": [
|
|
22
|
+
{ "type": "command", "command": ".claude/hooks/require-playbook.sh" }
|
|
23
|
+
]
|
|
24
|
+
}
|
|
25
|
+
],
|
|
26
|
+
"UserPromptSubmit": [
|
|
27
|
+
{
|
|
28
|
+
"hooks": [
|
|
29
|
+
{ "type": "command", "command": ".claude/hooks/suggest-playbook.sh" }
|
|
30
|
+
]
|
|
31
|
+
}
|
|
32
|
+
]
|
|
33
|
+
}
|
|
34
|
+
}
|
|
@@ -0,0 +1,109 @@
|
|
|
1
|
+
# Agents — entry protocol
|
|
2
|
+
|
|
3
|
+
> Lemony harness · vendor `{{vendor_version}}` · target `{{target}}`.
|
|
4
|
+
> Generated by `lemony install`; safe to edit.
|
|
5
|
+
|
|
6
|
+
This file is the **dispatcher** the Orchestrator follows on every session. The
|
|
7
|
+
Orchestrator is the single human-facing hat; it invokes sub-agents (Spec Author,
|
|
8
|
+
Implementer, Reviewer) with fresh context via the Task tool. Full role rules live
|
|
9
|
+
in `.claude/agents/<role>.md`; skills in `.claude/skills/`; task state in
|
|
10
|
+
`.claude/state/`.
|
|
11
|
+
|
|
12
|
+
## Mode dispatcher
|
|
13
|
+
|
|
14
|
+
Parse the intent of the first prompt (or honor an explicit slash command):
|
|
15
|
+
|
|
16
|
+
| Intent signal | Mode |
|
|
17
|
+
| --- | --- |
|
|
18
|
+
| "continue", "resume", "pick up" + a `#id` or name | **RESUME** |
|
|
19
|
+
| "define", "new task", "I have an idea", "I want to build" | **DEFINE** |
|
|
20
|
+
| "bug", "error in", "it's broken", "fails when" | **TRIAGE** |
|
|
21
|
+
| a bare greeting / orientation question / nothing | **ORIENT** |
|
|
22
|
+
|
|
23
|
+
If the intent is ambiguous **between harness modes**, ask **one** disambiguation
|
|
24
|
+
question, then proceed. A concrete **non-harness** request is just answered — don't
|
|
25
|
+
surface the menu.
|
|
26
|
+
|
|
27
|
+
- **DEFINE** — the L1 full-SDD round-trip (see below).
|
|
28
|
+
- **RESUME** — for an SDD task, `git fetch` and check out `harness/<id>-<slug>` first
|
|
29
|
+
(the task state and spec live only on the branch until merge), then reload
|
|
30
|
+
`.claude/state/tasks/<id>/` and continue from `progress.md`.
|
|
31
|
+
- **TRIAGE** — the L2 lightweight path (see below). Invoke the `triage-issue` skill.
|
|
32
|
+
- **ORIENT** — an intentless entry (greeting / "what should I pick up?" / nothing).
|
|
33
|
+
Render the **dispatch menu**: the parked queue (the same listing `/resume` runs with
|
|
34
|
+
no args — including the **`/spinoff` pending stubs**) plus the start options `/define`
|
|
35
|
+
and `/triage`, and ask which to do. The start options are **unconditional** — an empty
|
|
36
|
+
queue still renders the menu with just them. Best-effort and offline-safe: if `gh` is
|
|
37
|
+
absent or unauthenticated, skip the listing and still offer the start options — never
|
|
38
|
+
block.
|
|
39
|
+
Authority + the full guard: `.claude/agents/orchestrator.md` §Dispatch → ORIENT.
|
|
40
|
+
|
|
41
|
+
## Task-fit dial
|
|
42
|
+
|
|
43
|
+
The harness is a dial, not on/off (decisions #57–#61):
|
|
44
|
+
|
|
45
|
+
- **L1 — full SDD**: real features. Grill → spec → human gate → implement → review.
|
|
46
|
+
- **L2 — lightweight**: small bugs. Issue + state + implement (TDD) + light review.
|
|
47
|
+
- **L3 — bypass**: typos, mechanical renames, lockfile bumps. No issue, no review.
|
|
48
|
+
|
|
49
|
+
A task deserves the harness if it is **specifiable**, **verifiable**, or
|
|
50
|
+
**reviewable** (≥1). If none → L3. When in doubt, keep it in the harness.
|
|
51
|
+
|
|
52
|
+
## L1 round-trip (DEFINE)
|
|
53
|
+
|
|
54
|
+
1. **Grill** — `grill-with-docs`: interview the idea into a PRD at
|
|
55
|
+
`docs/prds/<topic>-<date>.md`.
|
|
56
|
+
2. **Open the task** — create the issue (skeleton body + `harness:managed` +
|
|
57
|
+
`harness:sdd` + `harness:status:spec-in-progress`) and the branch
|
|
58
|
+
`harness/<id>-<slug>`. `<id>` is the GitHub issue number in this build; spec and code
|
|
59
|
+
both live on that branch — nothing touches the default branch until the merge gate.
|
|
60
|
+
3. **Spec** — Spec Author sub-agent (given the `<id>` + branch): `prd-to-spec` (→
|
|
61
|
+
`requirements.md` EARS + `design.md` + `tasks.md` under `tasks/<id>/spec/`) then
|
|
62
|
+
`spec-to-issue` (fills the issue body — it creates nothing and moves no labels).
|
|
63
|
+
4. **Spec-ready + handoff** — flip to `harness:status:spec-ready`, commit and push the
|
|
64
|
+
task state to the branch. DEFINE can stop here: the spec-ready queue
|
|
65
|
+
(`gh issue list -l harness:status:spec-ready`) is the handoff. Ask: implement now or
|
|
66
|
+
hand off?
|
|
67
|
+
5. **Approval gate** — run by whoever implements (a RESUME of a spec-ready issue runs
|
|
68
|
+
it). Present the spec; the human approves (or asks for changes → Spec Author
|
|
69
|
+
revises). On approval, flip to `harness:status:in-progress`. The spec, not the code,
|
|
70
|
+
is the source of intent — never self-approve.
|
|
71
|
+
6. **Implement** — Implementer sub-agent, `tdd` skill, on the branch, keeping
|
|
72
|
+
`.claude/state/tasks/<id>/progress.md` live.
|
|
73
|
+
7. **Review** — flip to `harness:status:in-review`, open the PR (`gh pr create`,
|
|
74
|
+
branch → default), Reviewer sub-agent reviews it (`senior-review`, fresh context).
|
|
75
|
+
8. **Merge gate** — never auto-merge. Surface the approved PR; the human merges (or
|
|
76
|
+
authorizes you to). The task stays at `in-review` until merged.
|
|
77
|
+
9. **Closeout** — `task-closeout`: confirm the merge via `gh`, offer durable discoveries
|
|
78
|
+
to the Architect (`write-adr`, HITL), archive the spec + discoveries to `_archive/<id>/` (drop
|
|
79
|
+
`progress.md`), and land `history.md` + the archival via a dedicated
|
|
80
|
+
`harness/closeout-<id>` PR (`gh pr merge --auto`) — never a direct push to the base.
|
|
81
|
+
If protection needs approval the PR waits: park at `closeout-pending`, `/resume`
|
|
82
|
+
finalizes. Then close the issue and delete the branch.
|
|
83
|
+
|
|
84
|
+
## L2 round-trip (TRIAGE)
|
|
85
|
+
|
|
86
|
+
1. **Triage** — `triage-issue`: investigate, find root cause, draft a TDD-based fix
|
|
87
|
+
plan, and create the issue with `harness:managed` (no `harness:sdd` — its absence
|
|
88
|
+
marks the lightweight path).
|
|
89
|
+
2. **Branch + scaffold** — create the branch `harness/<id>-<slug>`, scaffold
|
|
90
|
+
`progress.md` on it (nothing touches the default branch until the merge gate).
|
|
91
|
+
3. **Implement** — Implementer sub-agent, `tdd` skill.
|
|
92
|
+
4. **Review** — flip to `harness:status:in-review`, open the PR, Reviewer sub-agent
|
|
93
|
+
(`senior-review`, fresh context).
|
|
94
|
+
5. **Merge gate** — the same human-explicit gate as L1: never auto-merge.
|
|
95
|
+
6. **Closeout** — `task-closeout`: confirm the merge, archive the spec + discoveries to
|
|
96
|
+
`_archive/<id>/`, and land `history.md` + the archival via a dedicated
|
|
97
|
+
`harness/closeout-<id>` PR (`gh pr merge --auto`); park at `closeout-pending` if it
|
|
98
|
+
needs approval. Then close the issue and delete the branch. Same as L1.
|
|
99
|
+
|
|
100
|
+
## Discovery interrupts (decision #30)
|
|
101
|
+
|
|
102
|
+
At any step, a sub-agent that hits a T1–T6 case (contradiction, unspecified decision,
|
|
103
|
+
scope drift, existing solution, infeasibility, playbook conflict) **pauses instead of
|
|
104
|
+
guessing**: it runs `raise-discovery`, writing `tasks/<id>/discoveries.md` and
|
|
105
|
+
returning a one-line summary. The Orchestrator then runs `resolve-discovery` — the
|
|
106
|
+
task moves to `harness:status:paused-for-clarification` (+ `harness:discovery:T*`), the
|
|
107
|
+
human arbitrates, the owning agent updates its artifact, and the paused sub-agent
|
|
108
|
+
resumes. No discovery may remain unresolved at closeout. Details in
|
|
109
|
+
`.claude/agents/*.md`.
|
|
@@ -0,0 +1,96 @@
|
|
|
1
|
+
# Playbooks
|
|
2
|
+
|
|
3
|
+
This folder holds your project's **playbooks** — project-agnostic documents describing
|
|
4
|
+
*how to build* a category of software here (a backend service, an SPA, a testing
|
|
5
|
+
strategy, a release flow). Agents read them as canonical patterns while implementing and
|
|
6
|
+
reviewing.
|
|
7
|
+
|
|
8
|
+
Playbooks are **yours**. The harness defines *where to look* and *the format*; it ships
|
|
9
|
+
**no** playbook content (you own your architecture). This file documents the lookup
|
|
10
|
+
convention — it is safe to keep, edit, or replace.
|
|
11
|
+
|
|
12
|
+
## The lookup convention
|
|
13
|
+
|
|
14
|
+
A playbook is found by its **topic filename**, `<topic>.md`, across two layers:
|
|
15
|
+
|
|
16
|
+
1. **Project-local** — `docs/playbooks/<topic>.md` (here). Committed with the repo.
|
|
17
|
+
2. **Global** — your personal, generic playbooks at `~/.claude/playbooks/` (default).
|
|
18
|
+
|
|
19
|
+
> Both locations are **configurable** in `harness.config.yml`: `paths.playbooks`
|
|
20
|
+
> (repo-relative) moves this local layer, `paths.playbooks_global` (absolute or
|
|
21
|
+
> `~`-based) moves the global one. The hooks read those keys on every fire, so a
|
|
22
|
+
> change takes effect immediately — no install or restart. Leave a key unset to
|
|
23
|
+
> keep the default.
|
|
24
|
+
|
|
25
|
+
Lookup is **local → global**, and a **local file overrides** the global one for the same
|
|
26
|
+
topic. There is **no index or manifest** — the filename *is* the registry (an index only
|
|
27
|
+
drifts out of sync). A missing playbook is **not an error**: skills that consult one
|
|
28
|
+
degrade gracefully when it's absent.
|
|
29
|
+
|
|
30
|
+
Common topics: `run.md` (how to launch the app), `testing.md`, `e2e.md`, `<stack>.md`.
|
|
31
|
+
Use whatever topic names your skills and team reach for.
|
|
32
|
+
|
|
33
|
+
## Enforcement (P5 slice 2)
|
|
34
|
+
|
|
35
|
+
Two Claude Code hooks read this folder on every turn:
|
|
36
|
+
|
|
37
|
+
- **`require-playbook.sh`** (`PreToolUse` on `Write`/`Edit`) — blocks the call when the
|
|
38
|
+
target file path matches a playbook's `applies_to` glob list and that playbook has not
|
|
39
|
+
been read in this conversation. The block message names the exact playbook(s) to read
|
|
40
|
+
and the path. After reading, the agent retries the Write/Edit.
|
|
41
|
+
- **`suggest-playbook.sh`** (`UserPromptSubmit`, non-blocking) — emits a
|
|
42
|
+
`<system-reminder>` when the prompt matches a playbook's `keywords` regex list and
|
|
43
|
+
that playbook has not been read yet.
|
|
44
|
+
|
|
45
|
+
Add a playbook → the hooks pick it up on next fire (no install/restart).
|
|
46
|
+
|
|
47
|
+
> **Where they live / how to verify.** These hooks are installed in **this project's**
|
|
48
|
+
> `.claude/hooks/` and wired in `.claude/settings.json` (`PreToolUse` / `UserPromptSubmit`)
|
|
49
|
+
> — **not** in your global `~/.claude/hooks/`. To confirm enforcement is active, run
|
|
50
|
+
> `lemony doctor` (its `hooks` check reports it) or inspect `.claude/settings.json`.
|
|
51
|
+
> Checking `~/.claude/hooks/` gives a **false negative**. Note that a playbook with only
|
|
52
|
+
> `keywords` (no `applies_to`) is suggest-only **by design** — it never blocks a Write/Edit.
|
|
53
|
+
|
|
54
|
+
## Format at a glance
|
|
55
|
+
|
|
56
|
+
```markdown
|
|
57
|
+
---
|
|
58
|
+
name: testing
|
|
59
|
+
applies_to:
|
|
60
|
+
- "**/*.spec.ts"
|
|
61
|
+
- "**/*.bench.ts"
|
|
62
|
+
keywords:
|
|
63
|
+
- "vitest"
|
|
64
|
+
- "test runner"
|
|
65
|
+
status: active
|
|
66
|
+
---
|
|
67
|
+
|
|
68
|
+
# Testing
|
|
69
|
+
|
|
70
|
+
(canonical patterns here…)
|
|
71
|
+
```
|
|
72
|
+
|
|
73
|
+
- `name` (required) — kebab-case, equals the filename stem.
|
|
74
|
+
- `applies_to` (optional) — bash globs. **Files matching any glob require this playbook.**
|
|
75
|
+
- `keywords` (optional) — case-insensitive regexes (Oniguruma). **Prompts matching any regex suggest this playbook.**
|
|
76
|
+
|
|
77
|
+
`applies_to` and `keywords` are independently optional: a playbook can be require-only,
|
|
78
|
+
suggest-only, both, or neither. The full spec lives in
|
|
79
|
+
[`catalog/playbook-format.md`](https://github.com/{{task_storage_repo}}/blob/main/catalog/playbook-format.md)
|
|
80
|
+
(packaged with the vendor — referenced here for convenience).
|
|
81
|
+
|
|
82
|
+
## What belongs here (and what doesn't)
|
|
83
|
+
|
|
84
|
+
A playbook is **project-agnostic**: it uses placeholder names (`@org/x`, `<your-app>`,
|
|
85
|
+
`getResourceById`) and describes the *pattern*, never this codebase's real symbols.
|
|
86
|
+
|
|
87
|
+
Project-**specific** rules — a business rule, a quirk of one app, a one-off integration —
|
|
88
|
+
do **not** go in a playbook. They belong in `CLAUDE.md`, a project-local ADR
|
|
89
|
+
(`docs/adr/`), or `CONTEXT.md`.
|
|
90
|
+
|
|
91
|
+
## Creating and iterating playbooks
|
|
92
|
+
|
|
93
|
+
- Author one by hand following the format above, or
|
|
94
|
+
- ask the **Architect** to extract one from the codebase (its `playbook-iterate` skill
|
|
95
|
+
can capture a pattern the code already follows, or revise a playbook when a spec
|
|
96
|
+
collides with it — a `T6 PLAYBOOK_CONFLICT`). The Architect proposes; you decide.
|
|
@@ -0,0 +1,59 @@
|
|
|
1
|
+
# yaml-language-server: $schema=./harness.config.schema.json
|
|
2
|
+
# Lemony — harness configuration.
|
|
3
|
+
# Generated by `lemony install`. Safe to edit; `install` refuses to overwrite
|
|
4
|
+
# this file — use `repair` to re-sync at the pinned version, or `update` to change
|
|
5
|
+
# the pinned version. The `$schema` header above
|
|
6
|
+
# gives an IDE (VS Code YAML extension) autocomplete + inline validation from the
|
|
7
|
+
# committed `harness.config.schema.json` the installer drops next to this file.
|
|
8
|
+
|
|
9
|
+
# Vendor catalog version this instance is pinned to (exact pin during the pilot).
|
|
10
|
+
vendor_version: {{vendor_version}}
|
|
11
|
+
|
|
12
|
+
# AI-coding harness this install targets. Only `claude-code` is operative today.
|
|
13
|
+
target: {{target}}
|
|
14
|
+
|
|
15
|
+
# Where managed tasks live. Auth is delegated to the `gh` CLI (no token here).
|
|
16
|
+
task_storage:
|
|
17
|
+
type: github
|
|
18
|
+
repo: {{task_storage_repo}}
|
|
19
|
+
|
|
20
|
+
# Active agents (one `.claude/agents/<role>.md` per entry).
|
|
21
|
+
agents:
|
|
22
|
+
- orchestrator
|
|
23
|
+
- spec-author
|
|
24
|
+
- implementer
|
|
25
|
+
- reviewer
|
|
26
|
+
- architect # always installed; invoked on-demand, outside the linear flow
|
|
27
|
+
# - ui-designer # catalog slot, inactive by default (decision #11)
|
|
28
|
+
|
|
29
|
+
# Paths (relative; defaults shown — declare only overrides).
|
|
30
|
+
paths:
|
|
31
|
+
state: .claude/state
|
|
32
|
+
skills: .claude/skills
|
|
33
|
+
agents: .claude/agents
|
|
34
|
+
# Project-local playbooks (how-to-build patterns). Looked up by convention
|
|
35
|
+
# `<topic>.md`, local then global; the local entry overrides the global one.
|
|
36
|
+
# The require/suggest hooks read these two keys directly on every fire, so a
|
|
37
|
+
# change takes effect immediately (no install/restart). Set `playbooks` to a
|
|
38
|
+
# repo-relative dir to relocate the local layer.
|
|
39
|
+
playbooks: docs/playbooks
|
|
40
|
+
# Global playbook layer (dev-owned, generic). Default `~/.claude/playbooks`.
|
|
41
|
+
# Uncomment and set (absolute or `~`-based) to relocate the global layer.
|
|
42
|
+
# playbooks_global: ~/.claude/playbooks
|
|
43
|
+
|
|
44
|
+
# Rollback snapshots. Each `update` bundles the pre-update vendor tree under
|
|
45
|
+
# `.claude/.harness/snapshots/<from-version>/` (gitignored); `rollback` restores it.
|
|
46
|
+
rollback:
|
|
47
|
+
# How many snapshot bundles to retain (rotation drops the oldest). A positive
|
|
48
|
+
# integer or `unlimited` (never rotate). Default 3.
|
|
49
|
+
keep_snapshots: 3
|
|
50
|
+
|
|
51
|
+
# Anonymous usage telemetry — ON by default. At session close the harness sends an
|
|
52
|
+
# anonymized, identifier-free projection of local events to improve the tool.
|
|
53
|
+
# Three ways to opt out (highest precedence first):
|
|
54
|
+
# 1. env `LEMONY_TELEMETRY_DISABLED=1` (this shell/machine),
|
|
55
|
+
# 2. `lemony telemetry disable` (this checkout only, not committed),
|
|
56
|
+
# 3. the committed switch below (the whole team, for this repo).
|
|
57
|
+
# Uncomment to opt the team out:
|
|
58
|
+
# telemetry:
|
|
59
|
+
# enabled: false
|