@ai-agent-lead/skills 1.0.0
- package/README.md +37 -0
- package/bin/install.js +272 -0
- package/package.json +34 -0
- package/skills/LANGUAGE.md +72 -0
- package/skills/README.md +156 -0
- package/skills/SKILL-TEMPLATE.md +120 -0
- package/skills/TRIGGERS.md +64 -0
- package/skills/WORKFLOWS.md +369 -0
- package/skills/bench/SKILL.md +40 -0
- package/skills/bench/templates/benchmark-report.md +26 -0
- package/skills/bootstrap/BOOTSTRAP.md +13 -0
- package/skills/bootstrap/SKILL.md +47 -0
- package/skills/code-hygiene/SKILL.md +92 -0
- package/skills/debug/SKILL.md +122 -0
- package/skills/design/DEEP-MODULES.md +76 -0
- package/skills/design/FUNCTIONAL-CORE.md +121 -0
- package/skills/design/ILLEGAL-STATES.md +102 -0
- package/skills/design/OBSERVABILITY.md +49 -0
- package/skills/design/PERSONAS.md +41 -0
- package/skills/design/SKILL.md +139 -0
- package/skills/design/TESTABILITY.md +84 -0
- package/skills/feature-doc/SKILL.md +113 -0
- package/skills/feature-doc/templates/feature-template.md +52 -0
- package/skills/formats/ADR-FORMAT.md +51 -0
- package/skills/formats/CONTEXT-FORMAT.md +109 -0
- package/skills/formats/CONTEXT-MAP-FORMAT.md +6 -0
- package/skills/grill-plan/SKILL.md +112 -0
- package/skills/improve-codebase-architecture/DEEPENING.md +37 -0
- package/skills/improve-codebase-architecture/INTERFACE-DESIGN.md +41 -0
- package/skills/improve-codebase-architecture/SKILL.md +115 -0
- package/skills/investigate/SKILL.md +97 -0
- package/skills/investigate/templates/research-note.md +84 -0
- package/skills/pr-review/SKILL.md +197 -0
- package/skills/prod-ready/SKILL.md +88 -0
- package/skills/security-review/SKILL.md +145 -0
- package/skills/simplify/SKILL.md +105 -0
- package/skills/sync-check/SKILL.md +69 -0
- package/skills/system-design/SKILL.md +160 -0
- package/skills/tdd/SKILL.md +121 -0
- package/skills/tdd/TESTS.md +93 -0
- package/skills/tdd-rounds/COMMITS.md +122 -0
- package/skills/tdd-rounds/SKILL.md +96 -0
- package/skills/tdd-rounds/templates/builder-brief.md +73 -0
- package/skills/tdd-rounds/templates/builder-report.md +21 -0
- package/skills/verify-real-deps/MOTIVATION.md +18 -0
- package/skills/verify-real-deps/SKILL.md +118 -0
- package/skills/verify-real-deps/templates/known-issues.md +45 -0
- package/skills/zoom-out/SKILL.md +104 -0
|
@@ -0,0 +1,120 @@
|
|
|
1
|
+
# SKILL.md template
|
|
2
|
+
|
|
3
|
+
Canonical scaffold for every skill in this set. Copy the body below into a new `<skill-name>/SKILL.md`, fill it in, then prune sections that genuinely don't apply (rather than leaving placeholders).
|
|
4
|
+
|
|
5
|
+
The order is load-bearing. Claude scans top-to-bottom — `When to use` / `When to skip` should hit early so routing is decided before the reader gets to the body.
|
|
6
|
+
|
|
7
|
+
## Naming and placement
|
|
8
|
+
|
|
9
|
+
- File path: `skills/<skill-name>/SKILL.md` (lowercase, hyphenated directory).
|
|
10
|
+
- Supporting reference docs in the same directory go UPPERCASE (`MOTIVATION.md`, `DEEPENING.md`, etc.).
|
|
11
|
+
- Templates that callers fill in go in `<skill-name>/templates/` and stay lowercase (`feature-template.md`).
|
|
12
|
+
- Format docs that more than one skill consumes belong in [`skills/formats/`](formats/), not in any one skill's directory.
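The placement rules above are mechanical enough to lint. A minimal sketch, assuming paths are given relative to the repo root and only files inside a skill directory are checked; `check_path` and the three patterns are hypothetical helpers, not part of this package:

```python
import re

# Hypothetical patterns mirroring the bullets above.
SKILL_DIR = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")           # lowercase, hyphenated
REFERENCE_DOC = re.compile(r"^[A-Z0-9]+(?:-[A-Z0-9]+)*\.md$")   # SKILL.md, MOTIVATION.md
TEMPLATE_FILE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*\.md$")   # feature-template.md


def check_path(path: str) -> list[str]:
    """Return the naming problems found for one path (empty list if none)."""
    parts = path.split("/")
    if len(parts) < 3 or parts[0] != "skills":
        return [f"{path}: expected skills/<skill-name>/..."]
    problems = []
    skill, rest = parts[1], parts[2:]
    if not SKILL_DIR.match(skill):
        problems.append(f"{path}: directory '{skill}' is not lowercase-hyphenated")
    # Files directly inside the skill directory are SKILL.md or reference docs.
    if len(rest) == 1 and not REFERENCE_DOC.match(rest[0]):
        problems.append(f"{path}: '{rest[0]}' should be UPPERCASE")
    # Files under templates/ stay lowercase.
    if len(rest) == 2 and rest[0] == "templates" and not TEMPLATE_FILE.match(rest[1]):
        problems.append(f"{path}: template '{rest[1]}' should be lowercase")
    return problems
```

For example, `check_path("skills/tdd/SKILL.md")` returns an empty list, while `check_path("skills/tdd/templates/Report.md")` returns one problem.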
|
|
13
|
+
|
|
14
|
+
## Frontmatter rules
|
|
15
|
+
|
|
16
|
+
```yaml
|
|
17
|
+
---
|
|
18
|
+
name: <kebab-case>
|
|
19
|
+
description: <one paragraph: trigger phrases AND skip conditions AND adjacent skills>
|
|
20
|
+
---
|
|
21
|
+
```
|
|
22
|
+
|
|
23
|
+
The `description` is the routing signal. It should:
|
|
24
|
+
- Name what the skill does in one clause.
|
|
25
|
+
- List trigger phrases ("Use when…", "Triggered by phrases like…").
|
|
26
|
+
- List skip conditions ("Skip for…", "Use … instead when…").
|
|
27
|
+
- Name 1–3 adjacent skills (upstream / downstream / lateral) so Claude can de-conflict.
|
|
28
|
+
|
|
29
|
+
A `description` that names only the happy path will route falsely. Always name what to skip.
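A rough check for the rule above, as a sketch only. It assumes descriptions use the literal marker phrases quoted in the bullets ("Use when", "Triggered by", "Skip for", "instead when"); `description_gaps` is a hypothetical name, and a real description could satisfy the rule with different wording:

```python
# Marker phrases taken from the bullets above; matching is case-insensitive.
TRIGGER_MARKERS = ("use when", "triggered by")
SKIP_MARKERS = ("skip for", "instead when")


def description_gaps(description: str) -> list[str]:
    """List which routing signals a frontmatter description is missing."""
    text = description.lower()
    gaps = []
    if not any(marker in text for marker in TRIGGER_MARKERS):
        gaps.append("no trigger phrases")
    if not any(marker in text for marker in SKIP_MARKERS):
        gaps.append("no skip conditions")
    return gaps
```

A description that only says `Triggered by phrases like "benchmark".` yields `["no skip conditions"]`, which is the happy-path-only case the rule warns about.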
|
|
30
|
+
|
|
31
|
+
## Canonical body shape
|
|
32
|
+
|
|
33
|
+
```md
|
|
34
|
+
# <Title>
|
|
35
|
+
|
|
36
|
+
<Optional 1–2 sentences naming what the skill does and the *one* discipline at its core. Keep it tight — the description already covered the hook.>
|
|
37
|
+
|
|
38
|
+
## Why this skill exists (optional — include for skills that teach a discipline; skip for orchestration / utility skills)
|
|
39
|
+
|
|
40
|
+
<2–4 sentences. Name the failure mode this skill prevents. Concrete > abstract.>
|
|
41
|
+
|
|
42
|
+
## When to use
|
|
43
|
+
|
|
44
|
+
- <trigger condition>
|
|
45
|
+
- <trigger phrase>
|
|
46
|
+
- <upstream signal — "<adjacent-skill> handed off"; "the user has <artifact>">
|
|
47
|
+
|
|
48
|
+
## When to skip
|
|
49
|
+
|
|
50
|
+
- <case where a different skill is the right answer; name that skill>
|
|
51
|
+
- <case where no skill is needed at all — "just fix it">
|
|
52
|
+
|
|
53
|
+
## Phases (or **Process**, or **Workflow** — pick one and stick to it)
|
|
54
|
+
|
|
55
|
+
### 1. <Phase name — verb-leading>
|
|
56
|
+
|
|
57
|
+
<What happens in this phase. The output of the phase is named: a file, a finding, a decision.>
|
|
58
|
+
|
|
59
|
+
### 2. <Phase name>
|
|
60
|
+
|
|
61
|
+
<...>
|
|
62
|
+
|
|
63
|
+
## Anti-patterns
|
|
64
|
+
|
|
65
|
+
- **<Name of the failure mode>.** <One sentence on what it looks like in practice and why it's wrong.>
|
|
66
|
+
- ...
|
|
67
|
+
|
|
68
|
+
## Pairing with other skills
|
|
69
|
+
|
|
70
|
+
- **`<upstream-skill>`** — runs before. <One line on what hand-off looks like.>
|
|
71
|
+
- **`<downstream-skill>`** — runs after. <One line.>
|
|
72
|
+
- **`<lateral-skill>`** — applies alongside. <One line.>
|
|
73
|
+
|
|
74
|
+
## Done when
|
|
75
|
+
|
|
76
|
+
- <Verifiable condition — not "the skill was applied" but "the artifact exists / the named region is identified / the green test pins the bug">.
|
|
77
|
+
- ...
|
|
78
|
+
|
|
79
|
+
## Handoff (optional — include when the skill clearly feeds into another)
|
|
80
|
+
|
|
81
|
+
<One sentence per branch. "If <condition> → run `<next-skill>`.">
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
## Section requirements
|
|
85
|
+
|
|
86
|
+
| Section | Required | Notes |
|
|
87
|
+
| --- | --- | --- |
|
|
88
|
+
| Frontmatter (`name`, `description`) | yes | `description` must name skip conditions, not only trigger phrases. |
|
|
89
|
+
| Title (`# <Title>`) | yes | |
|
|
90
|
+
| Why this skill exists | optional | Include for *teaching* skills (discipline being taught). Skip for *orchestration* and *utility* skills. |
|
|
91
|
+
| When to use | yes | Body section, not just frontmatter. Frontmatter alone is too easy to skim past. |
|
|
92
|
+
| When to skip | yes | Body section. Mirror the description — explicit, not implied. |
|
|
93
|
+
| Phases / Process / Workflow | yes | Pick one heading. Don't mix. |
|
|
94
|
+
| Anti-patterns | recommended | Skip only if the skill is so simple there are none. |
|
|
95
|
+
| Pairing with other skills | yes | At minimum name 1 upstream and 1 downstream. |
|
|
96
|
+
| Done when | yes | Verifiable conditions, not vibes. |
|
|
97
|
+
| Handoff | optional | Use when the skill cleanly feeds into another; otherwise the Pairing section covers it. |
|
|
98
|
+
|
|
99
|
+
## Voice and length
|
|
100
|
+
|
|
101
|
+
- **Body length matches role**, not importance. Teaching skills run long (debug, security-review, tdd). Orchestration / utility / lens skills run short (tdd-rounds, simplify, code-hygiene). Don't pad an orchestration skill to match a teaching skill — it adds noise.
|
|
102
|
+
- **No hedging.** "Sometimes consider maybe doing X" is dead text. Pick a recommendation.
|
|
103
|
+
- **No corporate voice.** Direct sentences. The reader is a fast-reading senior engineer or an LLM, not an executive.
|
|
104
|
+
- **Cite paths**: `path:line` or `[link](relative/path.md)`. Don't say "see the auth module"; say `src/auth/session.go:42`.
|
|
105
|
+
|
|
106
|
+
## Vocabulary
|
|
107
|
+
|
|
108
|
+
- Architecture terms come from [`LANGUAGE.md`](LANGUAGE.md) — don't redefine. Link instead.
|
|
109
|
+
- Domain terms come from `docs/CONTEXT.md` (the consuming repo's glossary, not this skill set's).
|
|
110
|
+
- Format references for ADRs / CONTEXT.md live in [`formats/`](formats/) — link, don't inline.
|
|
111
|
+
|
|
112
|
+
## Adding the skill to the index
|
|
113
|
+
|
|
114
|
+
Once `SKILL.md` is written:
|
|
115
|
+
|
|
116
|
+
1. Add a row to README.md's "by trigger phase" index (under the right role group).
|
|
117
|
+
2. Add a row to README.md's "by role" table.
|
|
118
|
+
3. Add an entry to [`TRIGGERS.md`](TRIGGERS.md) — list the trigger phrases that should route to it.
|
|
119
|
+
4. If the skill participates in a canonical workflow, update [`WORKFLOWS.md`](WORKFLOWS.md).
|
|
120
|
+
5. If the skill produces a `docs/` artifact, add a row to the "Artifacts accumulate in `docs/`" table at the bottom of WORKFLOWS.md.
|
|
@@ -0,0 +1,64 @@
|
|
|
1
|
+
# Trigger index
|
|
2
|
+
|
|
3
|
+
Flat lookup from user phrases / situations → skill. The canonical source of trigger phrases is each skill's `description:` frontmatter; this file aggregates them so collisions and gaps are visible.
|
|
4
|
+
|
|
5
|
+
If a phrase appears in multiple rows, the Disambiguator column names how to choose.
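Collisions are easier to spot mechanically than by eye. A minimal sketch, assuming the rows are transcribed into a dict by hand (`ROWS` holds a small excerpt of this file's tables; `find_overlaps` is a hypothetical helper, not part of this package). It flags any phrase that appears verbatim inside a phrase owned by a different skill:

```python
# Excerpt of trigger phrases, copied from the tables in this file.
ROWS = {
    "design": ["how should I structure this", "deep modules", "testability"],
    "system-design": ["how should I structure this system", "topology"],
    "code-hygiene": ["simpler", "naming", "YAGNI"],
    "sync-check": ["is the naming right?", "audit terminology"],
}


def find_overlaps(rows):
    """Return (short phrase, its skill, longer phrase, its skill) tuples for
    every phrase contained in a phrase that routes to a different skill."""
    flat = [(p.lower(), skill) for skill, phrases in rows.items() for p in phrases]
    overlaps = []
    for short, short_skill in flat:
        for long, long_skill in flat:
            if short_skill != long_skill and short in long:
                overlaps.append((short, short_skill, long, long_skill))
    return sorted(overlaps)
```

On this excerpt it reports two overlaps: `design` vs `system-design` on "how should I structure this", and `code-hygiene` vs `sync-check` on "naming". Each such pair is a candidate for a Disambiguator entry.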
|
|
6
|
+
|
|
7
|
+
## Planning & investigation
|
|
8
|
+
|
|
9
|
+
| Phrase / situation | Routes to | Disambiguator |
|
|
10
|
+
| --- | --- | --- |
|
|
11
|
+
| "investigate", "research", "give me a proposal", "what are our options", "how would we approach", "let's explore", "should we…" | [`investigate`](investigate/SKILL.md) | Direction not yet chosen. If a direction *is* chosen and you want to stress-test it → `grill-plan`. |
|
|
12
|
+
| "new project", "initialize", "bootstrap", "start new repo" | [`bootstrap`](bootstrap/SKILL.md) | Only for project start. |
|
|
13
|
+
| "before any non-trivial feature", "let's spec this out", "write a feature doc" | [`feature-doc`](feature-doc/SKILL.md) | Skip for typo / dep bump / pure refactor. |
|
|
14
|
+
| "grill me on this", "stress-test this plan", "walk me through this", "is this consistent with our model" | [`grill-plan`](grill-plan/SKILL.md) | A plan exists; you want it pressure-tested. |
|
|
15
|
+
| "benchmark", "performance test", "measure latency", "profile" | [`bench`](bench/SKILL.md) | Performance verification. |
|
|
16
|
+
| "is the naming right?", "does this match our glossary?", "audit terminology", "sync check", "context audit" | [`sync-check`](sync-check/SKILL.md) | Before `pr-review` or after refactor. |
|
|
17
|
+
|
|
18
|
+
## Design & architecture
|
|
19
|
+
|
|
20
|
+
| Phrase / situation | Routes to | Disambiguator |
|
|
21
|
+
| --- | --- | --- |
|
|
22
|
+
| "system architecture", "module boundaries", "service boundaries", "how should I structure this system", "draw the architecture", "topology" | [`system-design`](system-design/SKILL.md) | Greenfield, multi-module. For single-module work → `design`. For reorganising an existing codebase → `improve-codebase-architecture`. |
|
|
23
|
+
| "how should I structure this", "deep modules", "testability", "API design", "design this module" | [`design`](design/SKILL.md) | Single module / public API, NEW code. For existing code → `improve-codebase-architecture`. |
|
|
24
|
+
| "improve architecture", "find refactoring opportunities", "consolidate", "deepen these modules", "make this more testable" | [`improve-codebase-architecture`](improve-codebase-architecture/SKILL.md) | EXISTING code, cross-module. For local single-module refactor → `design` or just refactor inline. |
|
|
25
|
+
| "simpler", "boring", "naming", "YAGNI", "premature abstraction", "over-engineered", "clean this up at line level" | [`code-hygiene`](code-hygiene/SKILL.md) | Lens, not phase — applies during simplify, pr-review, or whenever you re-read code. |
|
|
26
|
+
| "I'm lost", "give me higher-level context", "zoom out", "I don't know this area" | [`zoom-out`](zoom-out/SKILL.md) | User-invoked utility (`disable-model-invocation`). |
|
|
27
|
+
|
|
28
|
+
## Implementation
|
|
29
|
+
|
|
30
|
+
| Phrase / situation | Routes to | Disambiguator |
|
|
31
|
+
| --- | --- | --- |
|
|
32
|
+
| "it's broken", "this is failing", "intermittent", "flaky", "regression", "not sure why", "production issue", "doesn't work in <env>" | [`debug`](debug/SKILL.md) | Root cause not obvious. If the trace points directly at the fix → skip `debug`, run `tdd`. |
|
|
33
|
+
| "TDD", "test-first", "red-green-refactor", "implement this feature", "fix this bug" | [`tdd`](tdd/SKILL.md) | Default for code-writing. |
|
|
34
|
+
| "drive the sub-agent team", "multi-round TDD", "orchestrate rounds", "Builder agents", ≥10 ACs, multi-package | [`tdd-rounds`](tdd-rounds/SKILL.md) | Single-AC fix or single-package feature → just `tdd`. |
|
|
35
|
+
| "simplify pass", "tighten this", "clean up before commit", "end-of-round sweep" | [`simplify`](simplify/SKILL.md) | Runs after `tdd` reaches green; before PR. |
|
|
36
|
+
|
|
37
|
+
## Pre-merge gates & review
|
|
38
|
+
|
|
39
|
+
| Phrase / situation | Routes to | Disambiguator |
|
|
40
|
+
| --- | --- | --- |
|
|
41
|
+
| "shipping", "ready to merge", "before deploy", "production readiness", "prod-ready" | [`prod-ready`](prod-ready/SKILL.md) | Author's pre-merge checklist. Skip for pure docs / test-only / one-line bug fix without infra impact. |
|
|
42
|
+
| "security review", "threat model", "STRIDE", "auth flow", "permissions", "secrets", "PII", "public API", "external surface", "abuse", "hardening" | [`security-review`](security-review/SKILL.md) | Surface-changing work only. Non-surface-changing diffs use `prod-ready` Section 3 alone. |
|
|
43
|
+
| "review this PR", "look over the diff", "check this change", "give feedback on" | [`pr-review`](pr-review/SKILL.md) | Reviewing someone else's PR (or self-review). Pairs with `prod-ready` (author side) and `security-review` (escalation). |
|
|
44
|
+
| "smoke test", "real API", "live verify", "before tag", "end-to-end against actual <vendor>" | [`verify-real-deps`](verify-real-deps/SKILL.md) | After `prod-ready` clean, before tagging. Skip if no third-party API or staging is fully owned. |
|
|
45
|
+
|
|
46
|
+
## Routing collisions worth knowing
|
|
47
|
+
|
|
48
|
+
Some phrases overlap. Pick by the disambiguator:
|
|
49
|
+
|
|
50
|
+
- **"design"** alone — almost always [`design`](design/SKILL.md). If the user means "system design", that phrase routes to [`system-design`](system-design/SKILL.md).
|
|
51
|
+
- **"refactor"** — [`improve-codebase-architecture`](improve-codebase-architecture/SKILL.md) for cross-module; [`design`](design/SKILL.md) or just inline edits for one-module shape; [`simplify`](simplify/SKILL.md) for end-of-round line-level cleanup.
|
|
52
|
+
- **"clean up"** — [`code-hygiene`](code-hygiene/SKILL.md) as a lens; [`simplify`](simplify/SKILL.md) as the end-of-round action; [`improve-codebase-architecture`](improve-codebase-architecture/SKILL.md) if the cleanup is structural.
|
|
53
|
+
- **"test"** — [`tdd`](tdd/SKILL.md) for writing tests against new behavior; [`debug`](debug/SKILL.md) when the test you'd write isn't obvious yet (root cause unclear); [`simplify`](simplify/SKILL.md)'s lens 4 for assessing existing tests.
|
|
54
|
+
- **"ship"** — [`prod-ready`](prod-ready/SKILL.md) for the gate; [`verify-real-deps`](verify-real-deps/SKILL.md) for tagged release with third-party APIs.
|
|
55
|
+
|
|
56
|
+
## When NO skill fires
|
|
57
|
+
|
|
58
|
+
Some tasks don't need a skill:
|
|
59
|
+
|
|
60
|
+
- Typo fixes, one-line config tweaks, dependency bumps with no API change.
|
|
61
|
+
- Mechanical renames where the desired result is unambiguous.
|
|
62
|
+
- Reading code to answer a question (no artifact, no decision).
|
|
63
|
+
|
|
64
|
+
If the user invokes a skill on these, surface that the overhead exceeds the value and offer to just do the change.
|
|
@@ -0,0 +1,369 @@
|
|
|
1
|
+
# Workflows
|
|
2
|
+
|
|
3
|
+
How the 16 skills compose into the common paths users actually walk.
|
|
4
|
+
|
|
5
|
+
The skills aren't a flat menu — they form a workflow with branching. This file maps the canonical paths so you can see where to enter, what to expect, and what produces what.
|
|
6
|
+
|
|
7
|
+
For routing by trigger phrase, see [TRIGGERS.md](./TRIGGERS.md). For the cross-skill graph, see the relationship map in [README.md](./README.md#skill-relationship-map).
|
|
8
|
+
|
|
9
|
+
## Decision tree — where to start
|
|
10
|
+
|
|
11
|
+
```
|
|
12
|
+
Got a task? Pick by what you have in front of you:
|
|
13
|
+
|
|
14
|
+
┌─────────────────────────────────────┬──────────────────────────────────┐
|
|
15
|
+
│ Typo / dep bump / one-liner config │ just fix it — no skill needed │
|
|
16
|
+
├─────────────────────────────────────┼──────────────────────────────────┤
|
|
17
|
+
│ Bug fix — root cause obvious │ /tdd (Workflow 5a) │
|
|
18
|
+
├─────────────────────────────────────┼──────────────────────────────────┤
|
|
19
|
+
│ Bug — root cause unclear/intermittent│ /debug → /tdd (Workflow 5b) │
|
|
20
|
+
├─────────────────────────────────────┼──────────────────────────────────┤
|
|
21
|
+
│ New SYSTEM (multi-module) greenfield│ /system-design (Workflow 6) │
|
|
22
|
+
├─────────────────────────────────────┼──────────────────────────────────┤
|
|
23
|
+
│ New feature in existing system │ /feature-doc (Workflow 1 or 2) │
|
|
24
|
+
├─────────────────────────────────────┼──────────────────────────────────┤
|
|
25
|
+
│ New feature, direction unclear │ /investigate (Workflow 3) │
|
|
26
|
+
├─────────────────────────────────────┼──────────────────────────────────┤
|
|
27
|
+
│ Refactor existing code │ /improve-codebase-architecture │
|
|
28
|
+
│ │ (Workflow 4) │
|
|
29
|
+
├─────────────────────────────────────┼──────────────────────────────────┤
|
|
30
|
+
│ Reviewing someone else's PR │ /pr-review (utility) │
|
|
31
|
+
├─────────────────────────────────────┼──────────────────────────────────┤
|
|
32
|
+
│ Surface-changing work (auth, public │ /security-review (gate, runs │
|
|
33
|
+
│ API, sensitive data, new entry pt)│ alongside Workflow 1/2/6) │
|
|
34
|
+
├─────────────────────────────────────┼──────────────────────────────────┤
|
|
35
|
+
│ Lost in unfamiliar area, mid-task │ /zoom-out (utility, anytime) │
|
|
36
|
+
└─────────────────────────────────────┴──────────────────────────────────┘
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
---
|
|
40
|
+
|
|
41
|
+
## Workflow 1 — Standard greenfield feature
|
|
42
|
+
|
|
43
|
+
For a single-package feature with fewer than 10 acceptance criteria (ACs). Larger or multi-package work goes to Workflow 2.
|
|
44
|
+
|
|
45
|
+
```
|
|
46
|
+
[user has a clear idea]
|
|
47
|
+
│
|
|
48
|
+
▼
|
|
49
|
+
feature-doc ──── produces: docs/features/<name>.md
|
|
50
|
+
│ (Problem, User Story, ACs, Non-Goals)
|
|
51
|
+
│
|
|
52
|
+
▼
|
|
53
|
+
[doc reviewed; ACs stable]
|
|
54
|
+
│
|
|
55
|
+
├─── Surface-changing? (new entry point, identity flow,
|
|
56
|
+
│ authz, sensitive data, external dep)
|
|
57
|
+
│ └─► security-review ──── runs alongside design/tdd;
|
|
58
|
+
│ produces: feature-doc Security section
|
|
59
|
+
│ or docs/security/<feature>.md
|
|
60
|
+
│
|
|
61
|
+
▼
|
|
62
|
+
(optional) design ──── new module shape; optional sibling
|
|
63
|
+
│ docs/features/<name>.design.md if non-trivial
|
|
64
|
+
│
|
|
65
|
+
▼
|
|
66
|
+
tdd ──── red → green → refactor, per AC
|
|
67
|
+
│
|
|
68
|
+
▼
|
|
69
|
+
simplify ──── end-of-slice sweep (reuse / quality /
|
|
70
|
+
│ efficiency / test relevance)
|
|
71
|
+
│
|
|
72
|
+
▼
|
|
73
|
+
prod-ready ──── 7-section checklist
|
|
74
|
+
│
|
|
75
|
+
▼
|
|
76
|
+
(optional) pr-review ── self-check against the diff before opening
|
|
77
|
+
│
|
|
78
|
+
▼
|
|
79
|
+
[PR / merge]
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
---
|
|
83
|
+
|
|
84
|
+
## Workflow 2 — Large feature *(≥ 10 ACs or multi-package)*
|
|
85
|
+
|
|
86
|
+
```
|
|
87
|
+
feature-doc (lists many ACs)
|
|
88
|
+
│
|
|
89
|
+
▼
|
|
90
|
+
tdd-rounds ──── parent plans rounds, dispatches Builders
|
|
91
|
+
│
|
|
92
|
+
▼
|
|
93
|
+
┌──────── Round N ────────┐
|
|
94
|
+
│ Parent writes brief │ ◄─── self-contained, references docs/STATE.md
|
|
95
|
+
│ │ │
|
|
96
|
+
│ ▼ │
|
|
97
|
+
│ Builder is dispatched │
|
|
98
|
+
│ Builder runs IN ORDER: │
|
|
99
|
+
│ • design (if new pkg)
|
|
100
|
+
│ • tdd (mandatory every round)
|
|
101
|
+
│ • simplify (end of round)
|
|
102
|
+
│ • prod-ready (final round only)
|
|
103
|
+
│ │ │
|
|
104
|
+
│ ▼ │
|
|
105
|
+
│ Builder commits per AC slice (R<N>: prefix)
|
|
106
|
+
│ │ │
|
|
107
|
+
│ ▼ │
|
|
108
|
+
│ Builder returns report │
|
|
109
|
+
│ │ │
|
|
110
|
+
│ ▼ │
|
|
111
|
+
│ Parent reviews diff, │
|
|
112
|
+
│ runs tests independently,
|
|
113
|
+
│ appends to STATE.md │
|
|
114
|
+
└──────────┬───────────────┘
|
|
115
|
+
│
|
|
116
|
+
▼
|
|
117
|
+
[parent runs improve-codebase-architecture ONCE mid-project]
|
|
118
|
+
│
|
|
119
|
+
▼
|
|
120
|
+
[continue rounds until all ACs green]
|
|
121
|
+
│
|
|
122
|
+
▼
|
|
123
|
+
verify-real-deps ──── parent, against real upstream
|
|
124
|
+
│ produces: docs/known-issues.md
|
|
125
|
+
│
|
|
126
|
+
▼
|
|
127
|
+
[bug ledger entries → fix-rounds via tdd-rounds → loop]
|
|
128
|
+
│
|
|
129
|
+
▼
|
|
130
|
+
[tag vN.0]
|
|
131
|
+
```
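The routing into this workflow and the per-round Builder order can be written down as two small functions. This is a sketch under the thresholds stated in the heading and diagram above (10 or more ACs, or more than one package); the function names are hypothetical:

```python
def implementation_skill(ac_count: int, package_count: int) -> str:
    """Pick the implementation skill using the Workflow 2 threshold."""
    if ac_count >= 10 or package_count > 1:
        return "tdd-rounds"
    return "tdd"


def builder_steps(new_package: bool, final_round: bool) -> list[str]:
    """Skills a Builder runs, in order, within one round."""
    steps = []
    if new_package:
        steps.append("design")       # only when the round introduces a new package
    steps += ["tdd", "simplify"]     # tdd every round, simplify at end of round
    if final_round:
        steps.append("prod-ready")   # final round only
    return steps
```

For instance, `builder_steps(new_package=True, final_round=False)` gives `["design", "tdd", "simplify"]`.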
|
|
132
|
+
|
|
133
|
+
---
|
|
134
|
+
|
|
135
|
+
## Workflow 3 — Investigation before commitment
|
|
136
|
+
|
|
137
|
+
For non-trivial structural decisions: new dependency, framework choice, cross-cutting refactor.
|
|
138
|
+
|
|
139
|
+
```
|
|
140
|
+
[user has structural decision pending]
|
|
141
|
+
│
|
|
142
|
+
▼
|
|
143
|
+
investigate ──── 5 phases:
|
|
144
|
+
│ 1. Survey current state (cite path:line)
|
|
145
|
+
│ 2. Map 2-3 options
|
|
146
|
+
│ 3. Recommend with reasoning
|
|
147
|
+
│ 4. Checkpoint questions
|
|
148
|
+
│ 5. (Optional) independent review
|
|
149
|
+
│
|
|
150
|
+
│ produces: docs/research/<topic>.md
|
|
151
|
+
▼
|
|
152
|
+
[user picks an option]
|
|
153
|
+
│
|
|
154
|
+
├─── Hard to reverse / load-bearing? ─────► grill-plan ───► ADR
|
|
155
|
+
│ │
|
|
156
|
+
│ ▼
|
|
157
|
+
│ docs/adr/<n>-<topic>.md
|
|
158
|
+
│
|
|
159
|
+
▼
|
|
160
|
+
feature-doc (capture chosen direction as a contract)
|
|
161
|
+
│
|
|
162
|
+
▼
|
|
163
|
+
[→ Workflow 1 or 2]
|
|
164
|
+
```
|
|
165
|
+
|
|
166
|
+
---
|
|
167
|
+
|
|
168
|
+
## Workflow 4 — Refactoring existing code
|
|
169
|
+
|
|
170
|
+
```
|
|
171
|
+
[user wants to improve architecture]
|
|
172
|
+
│
|
|
173
|
+
▼
|
|
174
|
+
(optional) zoom-out ──── if unfamiliar with the area
|
|
175
|
+
│ map of relevant modules + callers
|
|
176
|
+
│
|
|
177
|
+
▼
|
|
178
|
+
improve-codebase-architecture
|
|
179
|
+
│
|
|
180
|
+
├─ explore (with Explore subagent)
|
|
181
|
+
├─ apply DELETION TEST to suspect modules
|
|
182
|
+
├─ present numbered candidate list
|
|
183
|
+
▼
|
|
184
|
+
[user picks one candidate]
|
|
185
|
+
│
|
|
186
|
+
▼
|
|
187
|
+
[grilling loop — design deepened module's interface]
|
|
188
|
+
│
|
|
189
|
+
├─ updates CONTEXT.md inline if new term
|
|
190
|
+
└─ offers ADR if user rejects with load-bearing reason
|
|
191
|
+
▼
|
|
192
|
+
tdd (or tdd-rounds if cross-package)
|
|
193
|
+
│ refactor rounds: ACs are "all existing tests still green"
|
|
194
|
+
▼
|
|
195
|
+
prod-ready
|
|
196
|
+
│
|
|
197
|
+
▼
|
|
198
|
+
[PR]
|
|
199
|
+
|
|
200
|
+
If user wants more candidates, run improve-codebase-architecture again.
|
|
201
|
+
Don't batch multiple candidates in one pass.
|
|
202
|
+
```
|
|
203
|
+
|
|
204
|
+
---
|
|
205
|
+
|
|
206
|
+
## Workflow 5a — Bug fix, root cause obvious
|
|
207
|
+
|
|
208
|
+
When the symptom and stack trace point at the bug directly.
|
|
209
|
+
|
|
210
|
+
```
|
|
211
|
+
[bug report — root cause clear from message/trace]
|
|
212
|
+
│
|
|
213
|
+
├─── Trivial (typo, config) ────► just fix it (no skills)
|
|
214
|
+
│
|
|
215
|
+
▼ (real bug, but cause is known)
|
|
216
|
+
tdd ──── 1. Failing test reproducing the bug
|
|
217
|
+
│ 2. Fix
|
|
218
|
+
│ 3. Refactor with the test as safety net
|
|
219
|
+
│
|
|
220
|
+
▼
|
|
221
|
+
[touches infra / auth / DB / API surface?]
|
|
222
|
+
│
|
|
223
|
+
├─── YES, surface-changing ──► security-review ──► prod-ready
|
|
224
|
+
│
|
|
225
|
+
├─── YES, infra/DB/ops ──────► prod-ready
|
|
226
|
+
│
|
|
227
|
+
└─── NO ─────────────────────► [skip prod-ready]
|
|
228
|
+
│
|
|
229
|
+
▼
|
|
230
|
+
[PR]
|
|
231
|
+
```
|
|
232
|
+
|
|
233
|
+
## Workflow 5b — Bug fix, root cause unclear
|
|
234
|
+
|
|
235
|
+
When the bug is intermittent, environment-specific, a regression, or you don't yet know what to assert.
|
|
236
|
+
|
|
237
|
+
```
|
|
238
|
+
[bug report — symptom present, root cause not obvious]
|
|
239
|
+
│
|
|
240
|
+
▼
|
|
241
|
+
(optional) zoom-out ──── if unfamiliar with the area
|
|
242
|
+
│
|
|
243
|
+
▼
|
|
244
|
+
debug ──── 1. Reproduce (minimum reliable repro)
|
|
245
|
+
│ 2. Isolate (bisect / log / trace)
|
|
246
|
+
│ 3. Hypothesis test (one variable at a time)
|
|
247
|
+
│ 4. Name root cause (distinct from symptom)
|
|
248
|
+
│
|
|
249
|
+
│ optional: docs/research/<bug-slug>.md
|
|
250
|
+
│
|
|
251
|
+
▼
|
|
252
|
+
[root cause named, blast radius known]
|
|
253
|
+
│
|
|
254
|
+
▼
|
|
255
|
+
tdd ──── failing test from the reproduction →
|
|
256
|
+
│ fix at the named region → refactor
|
|
257
|
+
│
|
|
258
|
+
▼
|
|
259
|
+
[touches surface / infra / auth?]
|
|
260
|
+
│
|
|
261
|
+
├─── YES, surface-changing ──► security-review ──► prod-ready
|
|
262
|
+
│
|
|
263
|
+
├─── YES, infra/DB/ops ──────► prod-ready
|
|
264
|
+
│
|
|
265
|
+
└─── NO ─────────────────────► [skip prod-ready]
|
|
266
|
+
│
|
|
267
|
+
▼
|
|
268
|
+
[PR]
|
|
269
|
+
```
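Workflows 5a and 5b end with the same three-way branch. A minimal sketch of that branch, assuming the two questions have already been answered as booleans (`post_fix_gates` is a hypothetical name):

```python
def post_fix_gates(surface_changing: bool, infra_change: bool) -> list[str]:
    """Gates to run after the fix is green, in order, before opening the PR."""
    if surface_changing:
        # Surface-changing fixes get both gates.
        return ["security-review", "prod-ready"]
    if infra_change:
        # Infra / DB / ops changes need only the pre-merge checklist.
        return ["prod-ready"]
    # Neither applies: go straight to the PR.
    return []
```

The surface-changing check comes first, so a fix that is both surface-changing and infra-touching still gets `security-review` before `prod-ready`.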
|
|
270
|
+
|
|
271
|
+
---
|
|
272
|
+
|
|
273
|
+
## Workflow 6 — Greenfield system *(new multi-module project from scratch)*
|
|
274
|
+
|
|
275
|
+
```
|
|
276
|
+
[user starts a new system]
|
|
277
|
+
│
|
|
278
|
+
▼
|
|
279
|
+
(optional) investigate ──── if direction itself is unclear
|
|
280
|
+
│ (architecture style, framework choice)
|
|
281
|
+
│
|
|
282
|
+
▼
|
|
283
|
+
system-design ──── name modules, set dependencies, identify seams
|
|
284
|
+
│ produces: docs/architecture.md (system map)
|
|
285
|
+
│
|
|
286
|
+
▼
|
|
287
|
+
[topology stable]
|
|
288
|
+
│
|
|
289
|
+
├── For each module with a public interface ──► design
|
|
290
|
+
│
|
|
291
|
+
└── For any hard-to-reverse topology decision ──► grill-plan ──► ADR
|
|
292
|
+
│
|
|
293
|
+
▼
|
|
294
|
+
feature-doc (first feature using the topology)
|
|
295
|
+
│
|
|
296
|
+
▼
|
|
297
|
+
[→ Workflow 1 or 2 per feature]
|
|
298
|
+
```
|
|
299
|
+
|
|
300
|
+
This workflow runs **once per system**, not per feature. After it, each feature follows its own Workflow 1 or 2 within the established topology.
|
|
301
|
+
|
|
302
|
+
---
|
|
303
|
+
|
|
304
|
+
## Utility — `/zoom-out`
|
|
305
|
+
|
|
306
|
+
```
|
|
307
|
+
[user feels lost in unfamiliar code, mid-task]
|
|
308
|
+
│
|
|
309
|
+
▼
|
|
310
|
+
/zoom-out ──── user-invoked only (disable-model-invocation: true)
|
|
311
|
+
│
|
|
312
|
+
▼
|
|
313
|
+
[Claude maps the relevant modules + callers
|
|
314
|
+
using docs/CONTEXT.md vocabulary]
|
|
315
|
+
│
|
|
316
|
+
▼
|
|
317
|
+
[user returns to original workflow with better context]
|
|
318
|
+
```
|
|
319
|
+
|
|
320
|
+
---
|
|
321
|
+
|
|
322
|
+
## Cross-workflow patterns
|
|
323
|
+
|
|
324
|
+
Patterns that recur across the workflows:
|
|
325
|
+
|
|
326
|
+
1. **`zoom-out` can interrupt any workflow.** Slash command, user-only — pulls Claude up an abstraction layer when the user is lost. Doesn't change which workflow they're in.
|
|
327
|
+
|
|
328
|
+
2. **`grill-plan` is reusable as a sub-step.** Workflow 3 calls it explicitly; Workflow 4's grilling loop borrows the same discipline. It's also valid as a standalone skill if the user has a plan they want to stress-test. Has a **bootstrap mode** for greenfield repos with no `CONTEXT.md` / ADRs yet — the session creates them lazily.
|
|
329
|
+
|
|
330
|
+
3. **`design` doesn't have a workflow of its own** — it's a sub-step inside Workflow 1, 2, and 4. Always paired with `tdd` (or implicitly with `tdd-rounds`). Optional sibling artifact `docs/features/<name>.design.md` when the module shape is non-trivial.
|
|
331
|
+
|
|
332
|
+
4. **`code-hygiene` is a lens, not a phase.** Apply it during the simplify sweep that follows TDD green, during `pr-review`, or whenever you re-read code and pause to understand it. Especially relevant in Workflows 1, 2, 4, and 5.
|
|
333
|
+
|
|
334
|
+
5. **`debug` runs *before* `tdd` for non-trivial bugs.** Workflow 5b makes this explicit. The reproduction from `debug` becomes the failing test for `tdd`. Skip for bugs whose root cause is obvious from the trace (Workflow 5a).
|
|
335
|
+
|
|
336
|
+
6. **`security-review` is a gate, not a workflow.** Fires when a change is **surface-changing** — new entry point, identity / session / token flow, authorization logic, sensitive-data path, new external dependency, secrets handling. Runs alongside `design` and `tdd` in Workflows 1, 2, 4, 5a, 5b, 6 whenever those criteria hit. Not a substitute for `prod-ready` Section 3 — both run when the surface changes.
|
|
337
|
+
|
|
338
|
+
7. **`pr-review` is a utility workflow.** Runs when reviewing someone else's PR. Also runs (lighter form) as a self-check before opening the PR. The `tdd-rounds` parent's per-round verification borrows from it.
|
|
339
|
+
|
|
340
|
+
8. **`prod-ready` is the universal pre-merge gate** for Workflows 1, 2, 4, and sometimes 5. Single exit ramp before opening a PR.
|
|
341
|
+
|
|
342
|
+
9. **`verify-real-deps` fires whenever a workflow ends in a tagged release that touches a third-party API.** Most commonly that's Workflow 2 (large feature → tag), but it also applies when Workflow 1, 4, or 6 culminates in a release whose code path talks to an upstream you don't control. It does **not** fire for pure-internal services with database-only state, or for continuous-deploy environments that don't tag.
|
|
343
|
+
|
|
344
|
+
10. **`system-design` runs once per system, not per feature.** It's the greenfield precursor to Workflows 1 and 2. Once the topology is set, individual features run their own Workflow 1 or 2 inside it.
|
|
345
|
+
|
|
346
|
+
11. **Vocabulary is shared.** All architecture-talking skills (`design`, `system-design`, `improve-codebase-architecture`, `pr-review`, `grill-plan`) read from [`skills/LANGUAGE.md`](./LANGUAGE.md). Format references (ADRs, CONTEXT.md) live in [`skills/formats/`](./formats/). Domain vocabulary (Customer, Order, etc.) lives in `docs/CONTEXT.md`. Keep them distinct.
|
|
347
|
+
|
|
348
|
+
12. **Bootstrap mode for greenfield repos** is documented in one place: [`grill-plan/BOOTSTRAP.md`](./grill-plan/BOOTSTRAP.md). `feature-doc` and `system-design` defer there rather than re-explaining the rules.
|
|
349
|
+
|
|
|
350
|
+
13. **`simplify` is the end-of-slice / end-of-round sweep.** Runs after every `tdd` slice goes green; in `tdd-rounds`, lands as its own commit per [`tdd-rounds/COMMITS.md` rule 4](./tdd-rounds/COMMITS.md). Applies the `code-hygiene` lens to the changed files, plus a test-relevance check. Distinct from `code-hygiene` (the lens) and from `improve-codebase-architecture` (the structural escalation when simplify finds bigger issues).
|
|
351
|
+
|
|
|
352
|
+
14. **Artifacts accumulate in `docs/`:**
|
|
353
|
+
|
|
354
|
+
| Location | Produced by | Type |
|
|
355
|
+
|---|---|---|
|
|
356
|
+
| `docs/features/<name>.md` | `feature-doc` | One per feature |
|
|
357
|
+
| `docs/features/<name>.design.md` | `design` (optional) | One per feature with non-trivial module shape |
|
|
358
|
+
| `docs/research/<topic>.md` | `investigate`, `debug` (optional) | One per investigation or non-trivial bug |
|
|
359
|
+
| `docs/adr/<n>-<topic>.md` | `grill-plan`, `improve-codebase-architecture` | One per architectural decision |
|
|
360
|
+
| `docs/CONTEXT.md` | `grill-plan`, `improve-codebase-architecture` (inline updates) | One per repo / context |
|
|
361
|
+
| `docs/architecture.md` | `system-design` | One per system (the system map) |
|
|
362
|
+
| `docs/STATE.md` | `tdd-rounds` parent (append-only) | One per multi-round project |
|
|
363
|
+
| `docs/security/<feature>.md` | `security-review` (high-stakes only) | One per surface-changing feature where a feature-doc section isn't enough |
|
|
364
|
+
| `docs/benchmarks/<feature>.md` | `bench` | One per performance-critical feature |
|
|
365
|
+
| `docs/known-issues.md` | `verify-real-deps` | One per repo (post-mortem record) |
|
|
366
|
+
All `docs/` files are created **lazily** — they don't have to pre-exist for a workflow to run. The skill creates them on first use.
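
As a rough illustration of lazy creation, the sketch below appends to a `docs/` file and creates the file and its parent directories only on first use. The helper name `append_lazily` and the example path are hypothetical; no skill in this package is known to ship such a function.

```python
from pathlib import Path


def append_lazily(path: Path, text: str) -> bool:
    """Append text to path, creating the file and any missing parents on first use.

    Returns True when this call created the file, False when it already existed.
    """
    created = not path.exists()
    # Parent directories such as docs/adr/ are created only when first needed.
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode creates the file if it is missing and never truncates it.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(text)
    return created
```

A skill that records a first ADR could call `append_lazily(Path("docs/adr/0001-example.md"), "# Example\n")` without any setup step; later calls append to the same file.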
|
|
@@ -0,0 +1,40 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: bench
|
|
3
|
+
description: Performance benchmarking discipline. Measures latency and throughput, and records the baseline environment. Triggered by phrases like "benchmark", "performance test", "measure latency".
|
|
4
|
+
complexity: medium
|
|
5
|
+
expected_duration: 30 minutes
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
# Benchmark
|
|
9
|
+
|
|
10
|
+
Turns "it feels faster" into "p99 reduced by 40 ms under 200 RPS." Benchmarking provides the empirical evidence needed to validate performance acceptance criteria (ACs).
|
|
11
|
+
|
|
12
|
+
## Why this skill exists
|
|
13
|
+
|
|
14
|
+
Performance claims are often "vibes-based" or measured on a developer's machine without a recorded baseline. This skill ensures performance data is reproducible and comparable.
|
|
15
|
+
|
|
16
|
+
## When to use
|
|
17
|
+
|
|
18
|
+
- Verifying performance-related Acceptance Criteria in a feature doc.
|
|
19
|
+
- Identifying regressions or improvements after a major refactor.
|
|
20
|
+
- Profiling hot paths to guide optimization.
|
|
21
|
+
|
|
22
|
+
## Process
|
|
23
|
+
|
|
24
|
+
### 1. Establish Baseline
|
|
25
|
+
|
|
26
|
+
Measure the performance of the code *before* the change. Record the environment and load profile (hardware, OS/runtime, network, concurrency, duration) so the run can be repeated.
|
|
27
|
+
|
|
28
|
+
### 2. Execute Benchmark
|
|
29
|
+
|
|
30
|
+
Run the same test against the changed code. Ensure identical environment conditions.
|
|
31
|
+
|
|
32
|
+
### 3. Record Findings
|
|
33
|
+
|
|
34
|
+
Create a report in `docs/benchmarks/<feature>.md` using the template at [`templates/benchmark-report.md`](./templates/benchmark-report.md).
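
The three steps above could be supported by a small script along these lines. It is a minimal sketch, not something the skill prescribes: it assumes latencies are collected in milliseconds, throughput is reported in requests per second, and percentiles use the nearest-rank method. All function names are illustrative.

```python
import math
import statistics


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms: list[float], duration_s: float) -> dict[str, float]:
    """Reduce one benchmark run to the four metrics in the report's Comparison table."""
    return {
        "Avg Latency": statistics.fmean(latencies_ms),
        "p95 Latency": percentile(latencies_ms, 95),
        "p99 Latency": percentile(latencies_ms, 99),
        "Throughput": len(latencies_ms) / duration_s,
    }


def comparison_rows(baseline: dict[str, float], current: dict[str, float]) -> list[str]:
    """Render markdown table rows with the relative change (assumes non-zero baseline values)."""
    rows = []
    for metric, before in baseline.items():
        after = current[metric]
        change = (after - before) / before * 100
        rows.append(f"| {metric} | {before:.1f} | {after:.1f} | {change:+.1f}% |")
    return rows
```

Run `summarize` once on the baseline samples and once on the samples from the changed code, then paste the output of `comparison_rows` into the report. A negative change is an improvement for latency and a regression for throughput.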
|
|
35
|
+
|
|
36
|
+
## Done when
|
|
37
|
+
|
|
38
|
+
- A benchmark report exists in `docs/benchmarks/`.
|
|
39
|
+
- Baseline and current measurements are clearly compared.
|
|
40
|
+
- The environment and load profile are documented.
|
|
@@ -0,0 +1,26 @@
|
|
|
1
|
+
# Benchmark Report
|
|
2
|
+
|
|
3
|
+
**Feature:** <feature>
|
|
4
|
+
**Date:** YYYY-MM-DD
|
|
5
|
+
|
|
6
|
+
## Environment
|
|
7
|
+
- **Hardware:** <e.g., Apple M1 Max, 64GB RAM>
|
|
8
|
+
- **OS/Runtime:** <e.g., macOS 14.4, Node v20.11.0>
|
|
9
|
+
- **Network:** <e.g., localhost / AWS VPC>
|
|
10
|
+
|
|
11
|
+
## Load Profile
|
|
12
|
+
- **Concurrency:** <number of concurrent users/threads>
|
|
13
|
+
- **Duration:** <test duration>
|
|
14
|
+
- **Request Type:** <e.g., POST /api/orders>
|
|
15
|
+
|
|
16
|
+
## Comparison
|
|
17
|
+
|
|
18
|
+
| Metric | Baseline (vN.X) | Current (vN.Y) | Change |
|
|
19
|
+
| --- | --- | --- | --- |
|
|
20
|
+
| Avg Latency | | | |
|
|
21
|
+
| p95 Latency | | | |
|
|
22
|
+
| p99 Latency | | | |
|
|
23
|
+
| Throughput | | | |
|
|
24
|
+
|
|
25
|
+
## Analysis
|
|
26
|
+
<Describe the result. Did it meet the AC? Any surprising bottlenecks?>
|
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
# Bootstrap mode — the canonical rules for greenfield repos
|
|
2
|
+
|
|
3
|
+
This document is the **single source of truth** for how greenfield repos onboard their vocabulary and decision record. `feature-doc`, `system-design`, and `improve-codebase-architecture` all defer here when their inputs would normally come from `docs/CONTEXT.md` or `docs/adr/` but those don't exist yet.
|
|
4
|
+
|
|
5
|
+
## The rules
|
|
6
|
+
|
|
7
|
+
1. **No file is required to pre-exist.** [`docs/CONTEXT.md`](../../docs/CONTEXT.md), [`docs/adr/`](../../docs/adr/), and [`docs/CONTEXT-MAP.md`](../../docs/CONTEXT-MAP.md) are all created **lazily** — only when the first term is resolved or the first ADR is needed.
|
|
8
|
+
2. **Resolve 3–7 core domain terms first** for the plan being discussed. Not exhaustive — the terms most load-bearing for the decisions in scope. Format per [`../formats/CONTEXT-FORMAT.md`](../formats/CONTEXT-FORMAT.md).
|
|
9
|
+
3. **Capture ADRs only when all three criteria hold** (hard-to-reverse / surprising / real trade-off). Format per [`../formats/ADR-FORMAT.md`](../formats/ADR-FORMAT.md).
|
|
10
|
+
4. **Grilling is shorter in bootstrap mode** — there's less existing model to stress-test against. Output is fewer challenges, more *terminology and decision capture*.
|
|
11
|
+
5. **If nothing remains to grill** after bootstrapping (plan is fully exploratory), stop and hand off to [`investigate`](../investigate/SKILL.md) — `grill-plan` is for stress-testing, not exploring.
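
Rule 3 can be read as a simple conjunction: an ADR is captured only when every criterion holds. A minimal sketch, assuming a hypothetical `Decision` record whose three flags mirror the criteria named in the rule:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """Hypothetical record of a decision raised while grilling a plan."""

    title: str
    hard_to_reverse: bool
    surprising: bool
    real_trade_off: bool


def warrants_adr(decision: Decision) -> bool:
    """Return True only when all three criteria from rule 3 hold."""
    return decision.hard_to_reverse and decision.surprising and decision.real_trade_off
```

A decision that is hard to reverse but unsurprising, for instance, would not get an ADR under this reading.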
|
|
12
|
+
|
|
13
|
+
Other skills should link here rather than re-explaining bootstrap.
|