@lemoncode/lemony 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (68) hide show
  1. package/LICENSE +21 -0
  2. package/PRIVACY.md +147 -0
  3. package/README.md +189 -0
  4. package/catalog/VERSION +1 -0
  5. package/catalog/agents/README.md +29 -0
  6. package/catalog/agents/architect.md +81 -0
  7. package/catalog/agents/fit-assessment.md +94 -0
  8. package/catalog/agents/implementer.md +67 -0
  9. package/catalog/agents/orchestrator.md +627 -0
  10. package/catalog/agents/reviewer.md +124 -0
  11. package/catalog/agents/spec-author.md +69 -0
  12. package/catalog/agents/ui-designer.md +25 -0
  13. package/catalog/commands/add-capability.md +69 -0
  14. package/catalog/commands/bypass.md +40 -0
  15. package/catalog/commands/define.md +24 -0
  16. package/catalog/commands/hotfix.md +47 -0
  17. package/catalog/commands/pause.md +52 -0
  18. package/catalog/commands/resume.md +56 -0
  19. package/catalog/commands/spinoff.md +59 -0
  20. package/catalog/commands/triage.md +24 -0
  21. package/catalog/harness.config.schema.json +116 -0
  22. package/catalog/hooks/README.md +56 -0
  23. package/catalog/hooks/init.sh +281 -0
  24. package/catalog/hooks/lib/lemony.sh +41 -0
  25. package/catalog/hooks/lib/playbook-scan.sh +394 -0
  26. package/catalog/hooks/lib/transcript-grep.sh +56 -0
  27. package/catalog/hooks/require-playbook.sh +97 -0
  28. package/catalog/hooks/session-close.sh +232 -0
  29. package/catalog/hooks/suggest-playbook.sh +72 -0
  30. package/catalog/playbook-format.md +198 -0
  31. package/catalog/schemas/README.md +13 -0
  32. package/catalog/schemas/tier2-events-history.md +104 -0
  33. package/catalog/schemas/tier2-events.md +286 -0
  34. package/catalog/skills/README.md +62 -0
  35. package/catalog/skills/bootstrap-architecture/SKILL.md +78 -0
  36. package/catalog/skills/code-explorer/SKILL.md +76 -0
  37. package/catalog/skills/grill-with-docs/ADR-FORMAT.md +49 -0
  38. package/catalog/skills/grill-with-docs/CONTEXT-FORMAT.md +77 -0
  39. package/catalog/skills/grill-with-docs/SKILL.md +270 -0
  40. package/catalog/skills/grill-with-docs/reference.md +236 -0
  41. package/catalog/skills/mutation-testing/SKILL.md +84 -0
  42. package/catalog/skills/note-side-finding/SKILL.md +89 -0
  43. package/catalog/skills/playbook-iterate/SKILL.md +78 -0
  44. package/catalog/skills/prd-to-spec/SKILL.md +181 -0
  45. package/catalog/skills/raise-discovery/SKILL.md +112 -0
  46. package/catalog/skills/resolve-discovery/SKILL.md +123 -0
  47. package/catalog/skills/review-pr/SKILL.md +106 -0
  48. package/catalog/skills/review-pr/reference.md +105 -0
  49. package/catalog/skills/security-review/SKILL.md +90 -0
  50. package/catalog/skills/senior-review/SKILL.md +99 -0
  51. package/catalog/skills/silent-failure-hunter/SKILL.md +76 -0
  52. package/catalog/skills/spec-compliance-check/SKILL.md +74 -0
  53. package/catalog/skills/spec-to-issue/SKILL.md +88 -0
  54. package/catalog/skills/task-closeout/SKILL.md +229 -0
  55. package/catalog/skills/tdd/SKILL.md +171 -0
  56. package/catalog/skills/test-gap-report/SKILL.md +71 -0
  57. package/catalog/skills/triage-issue/SKILL.md +102 -0
  58. package/catalog/skills/update-architecture/SKILL.md +69 -0
  59. package/catalog/skills/verify/SKILL.md +90 -0
  60. package/catalog/skills/write-adr/SKILL.md +77 -0
  61. package/catalog/templates/README.md +32 -0
  62. package/catalog/templates/claude-code/.claude/settings.json.tpl +34 -0
  63. package/catalog/templates/claude-code/agents.md.tpl +109 -0
  64. package/catalog/templates/claude-code/docs/playbooks/README.md.tpl +96 -0
  65. package/catalog/templates/claude-code/harness.config.yml.tpl +59 -0
  66. package/catalog/templates/claude-code/state/history.md.tpl +6 -0
  67. package/dist/cli.mjs +5691 -0
  68. package/package.json +80 -0
@@ -0,0 +1,229 @@
1
+ ---
2
+ name: task-closeout
3
+ description: Close out a finished task — record it in history.md, archive the task spec + discoveries, activate the Architect for durable capture (ADRs, the architecture map, playbooks), and finalize via a dedicated closeout PR. Use when the Orchestrator finalizes an approved+merged task, mentions "closeout", "close the task", or "wrap up the issue".
4
+ origin: vendor
5
+ vendor_version: '{{vendor_version}}'
6
+ invoked-by: [orchestrator]
7
+ ---
8
+
9
+ # Task Closeout
10
+
11
+ ## Core Principle
12
+
13
+ Closeout is **archive-on-done through a dedicated PR** (ADR 0009). The durable memory of
14
+ a finished task — its spec and its resolved discoveries — is **archived live**, not
15
+ deleted, so a future agent reads the decisions instead of doing git archaeology. The
16
+ genuinely architectural decisions rise further, to ADRs. And the closeout record reaches
17
+ the base branch **only through a PR**, never a direct push — the same branch isolation
18
+ every other change obeys.
19
+
20
+ Three moves, in this order:
21
+
22
+ 1. **Activate the Architect** — closeout is its reliable end-of-task checkpoint (#138).
23
+ In cold blood, no resume pressure, it drives durable capture: `write-adr` (HITL offer
24
+ per resolved discovery), `update-architecture` (automatic, when the map exists), and
25
+ `playbook-iterate` (HITL offer, once per task). The canon outlives any task folder.
26
+ 2. **Archive, don't delete** — `git mv` the spec + `discoveries.md` into
27
+ `_archive/<id>/`; drop only `progress.md` (true scratch). The high-value memory stays
28
+ live and grep-able.
29
+ 3. **Land via a PR** — the `history.md` append, the archival move, and any new ADR ride a
30
+ dedicated `harness/closeout-<id>` PR merged with `--auto`. No direct push to the base.
31
+
32
+ Run this only when the **Reviewer has approved** and the task PR is merged. The merge is a
33
+ human decision (the merge gate) — closeout never merges the task; it **confirms** the
34
+ merge, then records it.
35
+
36
+ ## Preconditions
37
+
38
+ - The Reviewer posted an explicit approval (the task passed `in-review`).
39
+ - **The task PR is merged — confirmed against GitHub, not the conversation.**
40
+ `gh pr view <pr> --json state,mergedAt` must report `"state": "MERGED"`, regardless of
41
+ how it was merged (GitHub UI, CLI, or the Orchestrator running `gh pr merge` after the
42
+ human authorized it). If it is not `MERGED`, stop — the task stays at `in-review`
43
+ until the human merges.
44
+ - The PRD for this task is `Status: completed` — `grill-with-docs` normally sets this
45
+ when the grill closed; if it's still `in_progress`, flip it now. (This is the
46
+ latest point the PRD can be marked done.)
47
+ - **No discovery is left open.** If `discoveries.md` exists, every entry must carry a
48
+ resolved `**Resolution**` block (`Decision` / `Resumed at` filled) and no
49
+ `harness:discovery:*` label may remain on the issue. An unresolved discovery means a
50
+ sub-agent is still paused — run `resolve-discovery` before closing out.
51
+ - **The `harness:status:closeout-pending` label exists on the remote.** `install` /
52
+ `update` / `repair` create it during label sync; an install that predates this label
53
+ needs one `lemony repair` first, or the park step's label flip will fail.
54
+
55
+ ## Why post-merge (don't archive early)
56
+
57
+ Archival is **destructive** and must wait until the task PR is merged. Until that merge,
58
+ the task can re-enter iteration — a Reviewer rejection, or human review comments at the
59
+ gate (#111) — and that re-work needs the spec **live** in `tasks/<id>/spec/`. Archiving
60
+ pre-merge would pull the spec out from under an in-flight fix. So every step below runs
61
+ **after** the merge is confirmed.
62
+
63
+ ## Process
64
+
65
+ Closeout runs **on the default branch, after the merge** — the task PR brought the
66
+ branch (spec + code) onto it. Land there first so the closeout branch forks from the
67
+ merged base: `git checkout <default> && git pull`.
68
+
69
+ ### 1. Activate the Architect for durable capture
70
+
71
+ Closeout is the Architect's **reliable activation point** (#138): in cold blood, after
72
+ the merge, with no paused sub-agent pulling toward "just unblock me", it drives the three
73
+ durable-capture skills the on-demand triggers otherwise lose. Each one **no-ops cleanly
74
+ when its skill isn't installed**, so this step is safe everywhere.
75
+
76
+ The three differ in **who decides** — and the asymmetry is deliberate (ADR 0010):
77
+
78
+ - **`write-adr` — HITL offer, per resolved discovery.** Walk the **resolved** entries in
79
+ `discoveries.md` and **offer** the ones that settled a real decision (not a trivial
80
+ clarification) as candidate ADRs, then dispatch the **Architect** (`write-adr`, fresh
81
+ context) on the ones the human accepts. Offer liberally — closeout does **not** apply
82
+ the ADR criterion itself; the Architect owns the three tests (hard to reverse /
83
+ surprising without context / a real trade-off) and **declines** the ones that fail, so
84
+ a generous offer costs only a "no". If there are no resolved discoveries, or none
85
+ qualify, skip — never pad the canon with reversible or obvious decisions. An ADR is
86
+ **net-new canon**: the human curates what enters, so it stays an offer.
87
+
88
+ - **`update-architecture` — automatic dispatch, no pre-offer.** Only when
89
+ `docs/architecture.md` exists (the skill installs solely then). Dispatch the
90
+ **Architect** (`update-architecture`, fresh context) with the task's **merged diff**
91
+ (`gh pr diff <pr>`) plus the task's `spec/design.md` (still live here — archival is
92
+ step 4 below). The Architect reads the change, makes
93
+ the smallest true edit if the system's **shape** moved (referencing any ADR just
94
+ written — `see ADR-NNNN`), or reports **no-op** when nothing architectural changed.
95
+ There is **no human prompt here**: the map must _track reality_, not be curated, and the
96
+ edit is reviewed in the closeout PR diff like any other change. Run this **after**
97
+ `write-adr` so the map can cite the new ADR numbers.
98
+
99
+ - **`playbook-iterate` — HITL offer, once per task.** A single reflective offer: did this
100
+ task surface a **reusable pattern** worth extracting, or a playbook worth iterating,
101
+ that **no `T6 PLAYBOOK_CONFLICT` already routed** mid-task? (A conflict blocks a
102
+ sub-agent and is handled in the moment; this catches the _non-blocking_ learning that
103
+ would otherwise be lost.) On the human's "yes", dispatch the **Architect**
104
+ (`playbook-iterate`, fresh context). Like an ADR, a playbook change is curated canon, so
105
+ it stays an offer; decline costs only a "no".
106
+
107
+ Any artifact written — an ADR, the architecture edit, a playbook — lands in the working
108
+ tree and rides the closeout PR below.
109
+
110
+ ### 2. Branch for the record
111
+
112
+ ```bash
113
+ git checkout -b harness/closeout-<id>
114
+ ```
115
+
116
+ All of closeout's writes go on this branch — nothing touches the base directly.
117
+
118
+ ### 3. Append one line to `history.md`
119
+
120
+ `.claude/state/history.md` is the append-only, committed changelog — one entry per
121
+ finished task. Add the task with whatever metrics you actually have:
122
+
123
+ ```markdown
124
+ ## #<id> — <topic> (<start-date> → <end-date>)
125
+
126
+ Cycle: <h> | Iter: <review rounds> | Bugs 14d: <n/a until measured>
127
+ ```
128
+
129
+ Record only metrics you can back up. Rich telemetry (tokens, post-merge bugs) is
130
+ populated once the telemetry/event machinery is authored in a later phase; until
131
+ then, leave those as `n/a` rather than inventing numbers.
132
+
133
+ ### 4. Archive the spec + discoveries; drop the scratch
134
+
135
+ Move the durable memory into the archive and remove only the ephemeral scratch. The
136
+ existence guards keep this **idempotent** — a retried closeout (e.g. crash mid-run) that
137
+ already archived skips the move rather than erroring on an existing destination:
138
+
139
+ ```bash
140
+ mkdir -p .claude/state/tasks/_archive/<id>
141
+ # Move the durables only if not already archived (retry-safe):
142
+ if [ -d .claude/state/tasks/<id>/spec ]; then
143
+ git mv .claude/state/tasks/<id>/spec .claude/state/tasks/_archive/<id>/spec
144
+ fi
145
+ # discoveries.md only exists if the task raised one — move it conditionally:
146
+ if [ -f .claude/state/tasks/<id>/discoveries.md ]; then
147
+ git mv .claude/state/tasks/<id>/discoveries.md .claude/state/tasks/_archive/<id>/discoveries.md
148
+ fi
149
+ # Drop everything that remains under tasks/<id>/ — progress.md plus any straggler an
150
+ # Implementer left. -r tolerates whatever is there; git doesn't track the empty dir.
151
+ git rm -r .claude/state/tasks/<id>/
152
+ ```
153
+
154
+ `spec/` and `discoveries.md` survive **live** under `_archive/<id>/` (grep-able), plus in
155
+ git history and the GitHub issue. `progress.md` (the pause-point bitácora) and any other
156
+ working scratch are gone.
157
+
158
+ ### 5. Open the closeout PR and auto-merge
159
+
160
+ Commit the record, push the branch, open a PR, and let GitHub apply the repo's own rules:
161
+
162
+ ```bash
163
+ git commit -m "closeout(<id>): archive task state, record in history.md"
164
+ git push -u origin harness/closeout-<id>
165
+ gh pr create --base <default> --head harness/closeout-<id> \
166
+ --title "closeout(<id>): <topic>" --body "Closeout record for #<id>."
167
+ gh pr merge harness/closeout-<id> --auto --squash --delete-branch
168
+ ```
169
+
170
+ The closeout PR **must not** carry `Closes #<id>` — the task PR already auto-closed the
171
+ issue on merge; closeout only flips the label and finalizes.
172
+
173
+ `gh pr merge` `--auto` defers to branch protection:
174
+
175
+ - **Protection is PR + checks only** → the PR self-merges once checks pass. Continue to
176
+ step 6 (finalize) once the closeout PR reports merged.
177
+ - **Protection requires human approval** → the PR waits. **Park** (see below).
178
+ - **Auto-merge is disabled repo-wide** → `gh pr merge` `--auto` **errors** ("auto-merge is
179
+ not allowed for this repository") rather than queuing. This is a repo setting,
180
+ independent of branch protection. Treat the error as the wait case: **park**. (You may
181
+ instead merge it immediately with a plain `gh pr merge` `--squash --delete-branch` if the
182
+ human has authorized you to merge — same gesture as the task merge gate.)
183
+
184
+ **Park:** flip the issue to `harness:status:closeout-pending`, tell the human the closeout
185
+ PR is open and awaiting their merge, and stop. The task issue is **already closed** (the
186
+ task PR's `Closes #<id>` fired on its merge), so `/resume` finds the parked closeout only
187
+ by listing closed issues too (`--state all`) — a default open-only queue would miss it. A
188
+ later `/resume` picks up at step 6 once the PR is merged. (Authority for the RESUME entry:
189
+ the Orchestrator.)
190
+
191
+ ### 6. Finalize (once the closeout PR is merged)
192
+
193
+ Confirm the closeout PR merged (`gh pr view <pr> --json state,mergedAt` → `MERGED`; on a
194
+ `/resume`, pass the deterministic branch `harness/closeout-<id>` as `<pr>` — the
195
+ merge-confirm accepts a branch name in place of a PR number), land the merged base
196
+ (`git checkout <default> && git pull`), then:
197
+
198
+ ```bash
199
+ git push origin --delete harness/<id>-<slug> # delete the task branch (skip if already gone)
200
+ gh issue close <id> # belt-and-suspenders if the merge didn't auto-close it
201
+ ```
202
+
203
+ Then flip the issue to `harness:status:done` and **emit `task_done`** — both happen here,
204
+ in finalize, exactly once (the parked → `/resume` path also lands here, so it is the
205
+ single emit point for either path). You compute the envelope (cycle time, review
206
+ rejections, level) as the Orchestrator running this skill; the fields and the `emit`
207
+ command line are in `orchestrator.md` §Closeout. `events.jsonl` is local-only/gitignored
208
+ (ADR 0008), so the emit never dirties the base.
209
+
210
+ ### 7. Report
211
+
212
+ Return a one-line summary: the task id, the `history.md` entry, the archive path
213
+ (`_archive/<id>/`), any ADR raised, and confirmation that the issue is closed (or that
214
+ closeout is parked at `closeout-pending` awaiting the record PR's merge).
215
+
216
+ ## Scope note
217
+
218
+ In this build, closeout records to `history.md`, archives the spec + discoveries, raises
219
+ ADRs, and finalizes via the closeout PR. The session/metrics accounting beyond
220
+ `task_done` is wired in a later phase — closeout is the place it will attach.
221
+
222
+ ## Uncontemplated Scenarios
223
+
224
+ When a scenario doesn't clearly fit these rules:
225
+
226
+ 1. Apply the closest matching approach with reasoning.
227
+ 2. **Flag it**: "This scenario isn't covered by the task-closeout skill. I applied
228
+ [approach] because [reason]. Want to update the skill?"
229
+ 3. Offer to add a new rule for the case.
@@ -0,0 +1,171 @@
1
+ ---
2
+ name: tdd
3
+ description: Test-driven development with red-green-refactor loop. Use when user wants to build features or fix bugs using TDD, mentions "red-green-refactor", wants integration tests, or asks for test-first development.
4
+ origin: vendor
5
+ vendor_version: '{{vendor_version}}'
6
+ phase: during-implementation
7
+ invoked-by: [implementer]
8
+ ---
9
+
10
+ # Test-Driven Development
11
+
12
+ ## Core Principle
13
+
14
+ Tests verify behavior through public interfaces, not implementation details. Code
15
+ can change entirely; tests shouldn't break. A good test reads like a specification —
16
+ "user can checkout with valid cart" tells you exactly what capability exists.
17
+
18
+ ## Workflow
19
+
20
+ ### 0. Playbook preflight (mandatory — before writing any code or test)
21
+
22
+ 1. Identify which of the project's playbooks apply to this task and read them.
23
+ Playbooks are looked up by topic — `<topic>.md` under `docs/playbooks/` (then the
24
+ global layer); the local entry wins, and a missing playbook is not a blocker.
25
+ 2. Declare the test environment explicitly:
26
+
27
+ | This file type… | Use… |
28
+ | ------------------------------------------- | -------------------------------------------------- |
29
+ | Pure logic, service, hook (no DOM) | Node spec (`*.spec.ts`) — `environment: 'node'` |
30
+ | React component, DOM interaction, rendering | Browser spec (`*.browser.spec.tsx`) — real browser |
31
+
32
+ State aloud: "This is a [type] test → [node/browser] mode, because [reason]."
33
+
34
+ 3. Output preflight log: "Playbooks loaded: [list]. Test environment: [node/browser].
35
+ Proceeding."
36
+
37
+ ### 1. Planning
38
+
39
+ Before writing any code:
40
+
41
+ - Confirm with user what interface changes are needed
42
+ - Confirm with user which behaviors to test (prioritize — you can't test everything)
43
+ - List the behaviors to test (not implementation steps)
44
+ - Get user approval on the plan
45
+
46
+ Ask: "What should the public interface look like? Which behaviors are most important
47
+ to test?"
48
+
49
+ ### 2. Tracer Bullet
50
+
51
+ Write ONE test that confirms ONE thing about the system:
52
+
53
+ ```
54
+ RED: Write test for first behavior → test fails
55
+ GREEN: Write minimal code to pass → test passes
56
+ ```
57
+
58
+ This is your tracer bullet — proves the path works end-to-end.
59
+
60
+ ### 3. Incremental Loop
61
+
62
+ For each remaining behavior:
63
+
64
+ ```
65
+ RED: Write next test → fails
66
+ GREEN: Minimal code to pass → passes
67
+ ```
68
+
69
+ Rules:
70
+
71
+ - One test at a time
72
+ - Only enough code to pass current test
73
+ - Don't anticipate future tests
74
+ - Keep tests focused on observable behavior
75
+
76
+ ### 4. Refactor
77
+
78
+ After all tests pass, look for refactor candidates:
79
+
80
+ - Never refactor while RED — get to GREEN first
81
+ - Run tests after each refactor step
82
+
83
+ ## Capture decisions worth remembering (ADR nudge)
84
+
85
+ A green test proves _what_ the code does. It rarely captures _why_ this
86
+ implementation was chosen over the alternatives. After a cycle (or at the end of a
87
+ feature), ask: did this step embed a decision that will look arbitrary to a future
88
+ reader?
89
+
90
+ Three signals — if **all three** are true, write an ADR (`docs/adr/NNNN-slug.md`):
91
+
92
+ 1. **Hard to reverse** — changing your mind later costs real work (data migration,
93
+ public API break, cross-module refactor, downstream client coordination).
94
+ 2. **Surprising without context** — a future reader will see the code and wonder
95
+ "why on earth did they do it this way?"
96
+ 3. **Result of a real trade-off** — there were genuine alternatives and you picked
97
+ one for specific reasons (not "the obvious thing").
98
+
99
+ Common triggers inside a TDD cycle:
100
+
101
+ - A defensive cap or limit (depth, budget, retries, page size). The _value_ you
102
+ picked is the artifact most likely to drift later; the reason is rarely in the code.
103
+ - A "consistency at every level" rule — easy to write, easy for a future patch to
104
+ break without noticing.
105
+ - A non-obvious algorithm/data structure picked to satisfy a perf or budget test.
106
+ - A deliberate **no-test** decision — "we are intentionally not covering X because the
107
+ cost outweighs the value." Future contributors will add the missing test and break
108
+ the design.
109
+ - A boundary or scope decision: "this concern lives in module A, not module B."
110
+
111
+ If only one or two signals hit, a `// Why: …` comment at the call site is usually
112
+ enough. The bar for an ADR is "this decision will outlive my memory of writing it".
113
+
114
+ When in doubt, flag it: "I made a non-trivial trade-off about X. Want me to draft an
115
+ ADR?" — let the user decide. (The Architect's `write-adr` skill owns the ADR format;
116
+ numbering is sequential under `docs/adr/`.)
117
+
118
+ ## Anti-Pattern: Horizontal Slices
119
+
120
+ **DO NOT write all tests first, then all implementation.** This is the biggest TDD
121
+ mistake.
122
+
123
+ ```
124
+ WRONG (horizontal):
125
+ RED: test1, test2, test3, test4, test5
126
+ GREEN: impl1, impl2, impl3, impl4, impl5
127
+
128
+ RIGHT (vertical):
129
+ RED→GREEN: test1→impl1
130
+ RED→GREEN: test2→impl2
131
+ RED→GREEN: test3→impl3
132
+ ```
133
+
134
+ Tests written in bulk test _imagined_ behavior. You outrun your headlights,
135
+ committing to test structure before understanding the implementation. Each vertical
136
+ cycle informs the next — because you just wrote the code, you know exactly what
137
+ behavior matters.
138
+
139
+ ## When NOT to Apply TDD
140
+
141
+ Not everything needs red-green-refactor:
142
+
143
+ - **Validation schemas** — declarative, self-validating
144
+ - **Configuration** (routes wiring, test config) — no logic to test
145
+ - **Type definitions / interfaces** — no runtime behavior
146
+ - **Re-exports / barrel files** — just wiring
147
+ - **Constants and enums** — declarative
148
+ - **Middleware wiring** — the middleware logic is tested, the wiring isn't
149
+ - **Pass-through wrappers** — re-exporting a library without added logic
150
+ - **Database migrations** — DDL scripts, validated by execution
151
+ - **Static content** — i18n strings, templates, fixed content
152
+ - **Logger / env config setup** — pure configuration
153
+ - **Prototypes / spikes** — ask the user: "Do you want tests for this spike?"
154
+
155
+ **Always test**: mappers (even small ones), services with business logic,
156
+ repositories (integration), routes (API behavior), components (user interaction).
157
+
158
+ ## Mocking Strategy
159
+
160
+ **`vi.spyOn` as default** — mock at system boundaries only (external APIs, databases,
161
+ time/randomness). Don't mock your own modules' internals. DI only when it genuinely
162
+ adds value. For mocking recipes → see your project's testing playbook.
163
+
164
+ ## Uncontemplated Scenarios
165
+
166
+ When a scenario doesn't clearly fit these rules:
167
+
168
+ 1. Apply the closest matching rule with reasoning
169
+ 2. **Flag it**: "This scenario isn't covered by the tdd skill. I applied [rule]
170
+ because [reason]. Want to update the skill?"
171
+ 3. Offer to add a new rule for the case
@@ -0,0 +1,71 @@
1
+ ---
2
+ name: test-gap-report
3
+ description: Structural test-gap analysis of a change — which logic-bearing files lack a dedicated test, and which behaviors are untested. Walks the coverage matrix, not the runtime. Use during review to find missing tests before they become regressions; distinct from `verify` (which runs the suite) and `senior-review` (which judges the tests that exist).
4
+ origin: vendor
5
+ vendor_version: '{{vendor_version}}'
6
+ phase: post-implementation
7
+ invoked-by: [reviewer]
8
+ ---
9
+
10
+ # Test Gap Report
11
+
12
+ A **structural** gap analysis: which files in the change _should_ have a dedicated test
13
+ and don't, and which behaviors are untested. This is separate from `verify` (which
14
+ _runs_ the suite) and from `senior-review` (which judges whether the tests that exist
15
+ are meaningful). Coverage numbers are a weak proxy — this walks the **coverage matrix**
16
+ by file role instead.
17
+
18
+ ## Process
19
+
20
+ ### 1. Classify each changed file
21
+
22
+ For every file the change adds or modifies, decide whether it needs a dedicated test
23
+ using the coverage matrix (consult the project's testing playbook for project-specific
24
+ roles):
25
+
26
+ **Always needs a dedicated `*.spec.*`** (it carries logic — conditionals, loops,
27
+ transformations, calculations):
28
+
29
+ | Role | What to cover |
30
+ | ------------------------- | ------------------------------------------------------ |
31
+ | service / use-case | happy path, expected errors, edge cases |
32
+ | mapper / transformer | every field, optional/nullable, empty collections |
33
+ | helper / util / validator | boundary values, error conditions |
34
+ | hook | state transitions, side effects, cleanup |
35
+ | component | renders expected content, interactions, conditional UI |
36
+ | processor / reducer | each input/event type, unknown input |
37
+
38
+ **Covered by integration (no dedicated spec needed)** — wiring, not logic:
39
+ type-only models, thin route/api wrappers, simple stores, constants, config. A file
40
+ here with **branching logic** is promoted to "always test."
41
+
42
+ ### 2. Find the gaps
43
+
44
+ For each file in the "always" set:
45
+
46
+ 1. Does a co-located `*.spec.*` exist?
47
+ 2. If yes, does it cover the **behaviors** the change introduced (not just the ones
48
+ that were already there)? A new branch / error path / edge case added by this change
49
+ needs a new test even if the file already had a spec.
50
+ 3. **User-visible flows** (navigation / form / list / auth) added or modified → is
51
+ there an E2E test, or is one warranted per the testing playbook?
52
+
53
+ ### 3. Report
54
+
55
+ ```
56
+ ## Test Gap Report — <task name>
57
+
58
+ | File | Role | Spec? | Gap |
59
+ | --- | --- | --- | --- |
60
+ | `order.service.ts` | service | ✅ | error path for empty cart untested |
61
+ | `slug.util.ts` | util | ❌ | no spec — boundary cases untested |
62
+ | `cart.view.tsx` | component | ✅ | covered |
63
+
64
+ **Missing specs**: <n> · **Partial coverage**: <n>
65
+ **E2E**: needed / present / N/A
66
+ **Verdict**: adequately covered / <n> gaps to close
67
+ ```
68
+
69
+ Report gaps; do **not** silently write the missing tests yourself (that's the
70
+ Implementer's loop). A gap is a finding routed back. If closing a gap reveals the
71
+ behavior itself is unspecified, that's a **discovery** — run `raise-discovery`.
@@ -0,0 +1,102 @@
1
+ ---
2
+ name: triage-issue
3
+ description: Triage a bug or issue by exploring the codebase to find root cause, then create a GitHub issue with a TDD-based fix plan. Use when user reports a bug, wants to file an issue, mentions "triage", or wants to investigate and plan a fix for a problem.
4
+ origin: vendor
5
+ vendor_version: '{{vendor_version}}'
6
+ invoked-by: [orchestrator]
7
+ ---
8
+
9
+ # Triage Issue
10
+
11
+ ## Core Principle
12
+
13
+ Investigate first, ask later. Find the root cause through codebase exploration
14
+ before asking the user anything. Minimize questions, maximize diagnosis.
15
+
16
+ ## Process
17
+
18
+ ### 1. Capture the problem
19
+
20
+ Get a brief description from the user. If they haven't provided one, ask ONE
21
+ question: "What's the problem you're seeing?"
22
+
23
+ Do NOT ask follow-up questions. Start investigating immediately.
24
+
25
+ ### 2. Explore and diagnose
26
+
27
+ Deeply investigate the codebase to find:
28
+
29
+ - **Where** the bug manifests (entry points, UI, API responses)
30
+ - **What** code path is involved (trace the flow)
31
+ - **Why** it fails (the root cause, not just the symptom)
32
+ - **What** related code exists (similar patterns, tests, adjacent modules)
33
+
34
+ Look at:
35
+
36
+ - Related source files and their dependencies
37
+ - Existing tests (what's tested, what's missing)
38
+ - Recent changes to affected files (`git log`)
39
+ - Error handling in the code path
40
+ - Similar patterns elsewhere that work correctly
41
+
42
+ ### 3. Identify the fix approach
43
+
44
+ Determine:
45
+
46
+ - The minimal change needed to fix the root cause
47
+ - Which modules/interfaces are affected
48
+ - What behaviors need to be verified via tests
49
+ - Whether this is a regression, missing feature, or design flaw
50
+
51
+ ### 4. Design TDD fix plan
52
+
53
+ Create an ordered list of RED-GREEN cycles (vertical slices):
54
+
55
+ - **RED**: a specific test that captures the broken/missing behavior
56
+ - **GREEN**: the minimal code change to make that test pass
57
+
58
+ Rules:
59
+
60
+ - Tests verify behavior through public interfaces, not implementation details
61
+ - One test at a time (NOT all tests first, then all code)
62
+ - **Durability**: describe behaviors and contracts, not internal structure. A good
63
+ fix plan reads like a spec, a bad one reads like a diff
64
+
65
+ For TDD philosophy → see the **tdd** skill. For test recipes → see your project's
66
+ testing playbook.
67
+
68
+ ### 5. Confirm with the user
69
+
70
+ Present the issue draft: problem summary, root cause analysis, and TDD fix plan.
71
+ Ask: "Does this look right? Should I create the issue?"
72
+
73
+ ### 6. Create the GitHub issue
74
+
75
+ After confirmation, create with `gh issue create` and the `harness:managed` label.
76
+ Use a structure like:
77
+
78
+ ```
79
+ ## Problem
80
+ <one-paragraph summary of the observed behavior>
81
+
82
+ ## Root cause
83
+ <the actual cause, traced to file:line>
84
+
85
+ ## Fix plan (TDD)
86
+ - [ ] RED: <test capturing behavior 1> → GREEN: <minimal change>
87
+ - [ ] RED: <test capturing behavior 2> → GREEN: <minimal change>
88
+
89
+ ## Affected
90
+ <modules / interfaces touched>
91
+ ```
92
+
93
+ Print the issue URL and a one-line summary of the root cause.
94
+
95
+ ## Uncontemplated Scenarios
96
+
97
+ When a scenario doesn't clearly fit these rules:
98
+
99
+ 1. Apply the closest matching approach with reasoning
100
+ 2. **Flag it**: "This scenario isn't covered by the triage-issue skill. I applied
101
+ [approach] because [reason]. Want to update the skill?"
102
+ 3. Offer to add a new rule for the case
@@ -0,0 +1,69 @@
1
+ ---
2
+ name: update-architecture
3
+ description: Keep docs/architecture.md current after an architecturally significant change — a new module/context, a changed boundary or integration seam, a new external dependency, a data-flow change. Use when the change alters the system's shape and the project keeps an architecture doc. The Orchestrator invokes the Architect with this skill; it only lands at install when docs/architecture.md already exists.
4
+ origin: vendor
5
+ vendor_version: '{{vendor_version}}'
6
+ applies-when: [has-architecture-doc]
7
+ invoked-by: [architect]
8
+ trigger-condition: a change alters the architecture (or drift is surfaced) and docs/architecture.md exists
9
+ ---
10
+
11
+ # Update Architecture
12
+
13
+ ## Core Principle
14
+
15
+ `docs/architecture.md` is the **living high-level map** of the system — the shape a new
16
+ engineer reads first. It drifts the moment a change alters that shape and the doc isn't
17
+ updated. This skill keeps it true to reality after an architecturally significant change.
18
+
19
+ The trigger is usually a change in hand, but the job is the same when **drift is surfaced
20
+ without a change** — an orientation or review finds the map already diverged from the code
21
+ (a `code-explorer` Notes flag, a Reviewer side-finding; ADR 0011). Reconcile it the same
22
+ way, applying the same significance bar below: only shape-level divergence is worth an edit.
23
+
24
+ It is gated on `applies-when: has-architecture-doc`: it only installs when the project
25
+ already keeps `docs/architecture.md`. The harness **never creates** an architecture doc
26
+ where the client chose not to have one (decision #8 — the vendor gives the framework,
27
+ not the architecture). If a project has no such doc and one is wanted, that's a client
28
+ decision, not an automatic harness action.
29
+
30
+ ## What counts as architecturally significant
31
+
32
+ Update the doc when the change touches the **shape**, not the detail:
33
+
34
+ - a new module / bounded context, or one removed or merged;
35
+ - a changed boundary or ownership rule ("X now owns Y; others reference by id");
36
+ - a new or changed integration seam (sync HTTP → domain events, a new queue);
37
+ - a new external dependency that carries lock-in (a database, a broker, an auth provider);
38
+ - a data-flow change a reader of the diagram would otherwise get wrong.
39
+
40
+ A bug fix, a refactor that preserves the shape, or a new function inside an existing
41
+ module is **not** architecturally significant — leave the doc alone.
42
+
43
+ ## Process
44
+
45
+ 1. **Read the current doc** and locate the section the change — or the surfaced drift —
46
+ affects. Understand the existing structure before editing — match its level of
47
+ abstraction and its style.
48
+ 2. **Make the smallest true edit.** Reflect the new reality surgically: update the
49
+ affected section, a diagram, or a boundary statement. Don't rewrite what didn't change.
50
+ 3. **Link the why, don't restate it.** If the change was recorded as an ADR (via
51
+ `write-adr`), reference it (`see ADR-NNNN`) rather than re-explaining the rationale —
52
+ the architecture doc holds the _shape_, the ADR the _decision_.
53
+ 4. **Keep it high-level.** This is a map, not a code listing. Resist pulling in
54
+ implementation detail that belongs in code, `CLAUDE.md`, or a playbook.
55
+
56
+ ## Report
57
+
58
+ Return to the Orchestrator: the section(s) updated and a one-line summary of what the
59
+ map now reflects. If the change turned out **not** to be architecturally significant,
60
+ say so and make no edit.
61
+
62
+ ## Uncontemplated Scenarios
63
+
64
+ When a case doesn't clearly fit:
65
+
66
+ 1. Apply the closest matching approach with reasoning.
67
+ 2. **Flag it**: "This isn't covered by the update-architecture skill. I did [approach]
68
+ because [reason]. Want to refine the skill?"
69
+ 3. Offer to add a rule for the case.