haac-aikit 0.1.1 → 0.7.2

Files changed (44)
  1. package/README.md +169 -69
  2. package/catalog/agents/tier1/architect.md +86 -0
  3. package/catalog/agents/tier1/debugger.md +64 -0
  4. package/catalog/agents/tier1/pr-describer.md +57 -0
  5. package/catalog/agents/{researcher.md → tier1/researcher.md} +1 -1
  6. package/catalog/agents/tier2/changelog-curator.md +60 -0
  7. package/catalog/agents/tier2/dependency-upgrader.md +67 -0
  8. package/catalog/agents/tier2/evals-author.md +65 -0
  9. package/catalog/agents/tier2/flake-hunter.md +57 -0
  10. package/catalog/agents/tier2/prompt-engineer.md +58 -0
  11. package/catalog/agents/tier2/simplifier.md +57 -0
  12. package/catalog/ci/aikit-rules.yml +55 -0
  13. package/catalog/docs/claude-md-reference.md +316 -0
  14. package/catalog/hooks/block-dangerous-bash.sh +0 -0
  15. package/catalog/hooks/block-force-push-main.sh +0 -0
  16. package/catalog/hooks/block-secrets-in-commits.sh +0 -0
  17. package/catalog/hooks/check-pattern-violations.sh +137 -0
  18. package/catalog/hooks/compaction-preservation.sh +0 -0
  19. package/catalog/hooks/file-guard.sh +0 -0
  20. package/catalog/hooks/format-on-save.sh +0 -0
  21. package/catalog/hooks/hooks.json +38 -0
  22. package/catalog/hooks/judge-rule-compliance.sh +197 -0
  23. package/catalog/hooks/log-rule-event.sh +80 -0
  24. package/catalog/hooks/session-start-prime.sh +0 -0
  25. package/catalog/husky/commit-msg +0 -0
  26. package/catalog/husky/pre-commit +0 -0
  27. package/catalog/husky/pre-push +0 -0
  28. package/catalog/rules/AGENTS.md.tmpl +28 -7
  29. package/catalog/rules/aikit-rules.json +37 -0
  30. package/catalog/rules/claude-rules/example.md +38 -0
  31. package/catalog/skills/tier1/software-architect.md +42 -0
  32. package/dist/cli.mjs +1516 -173
  33. package/dist/cli.mjs.map +1 -1
  34. package/package.json +4 -2
  35. package/catalog/agents/{devops.md → tier1/devops.md} +0 -0
  36. package/catalog/agents/{implementer.md → tier1/implementer.md} +0 -0
  37. package/catalog/agents/{orchestrator.md → tier1/orchestrator.md} +0 -0
  38. package/catalog/agents/{planner.md → tier1/planner.md} +0 -0
  39. package/catalog/agents/{reviewer.md → tier1/reviewer.md} +0 -0
  40. package/catalog/agents/{security-auditor.md → tier1/security-auditor.md} +0 -0
  41. package/catalog/agents/{tester.md → tier1/tester.md} +0 -0
  42. package/catalog/agents/{backend.md → tier2/backend.md} +0 -0
  43. package/catalog/agents/{frontend.md → tier2/frontend.md} +0 -0
  44. package/catalog/agents/{mobile.md → tier2/mobile.md} +0 -0
package/README.md CHANGED
@@ -4,126 +4,226 @@
4
4
  [![GitHub](https://img.shields.io/badge/github-Hamad--Center%2Fhaac--aikit-blue?logo=github)](https://github.com/Hamad-Center/haac-aikit)
5
5
  [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
6
6
 
7
- **The batteries-included AI-agentic-coding kit.**
8
- One command drops a complete, opinionated, cross-tool setup into any repo — rules, skills, slash commands, subagents, safety hooks, MCP stub, and CI templates.
7
+ A CLI that drops a working AI-coding setup into any repo — rules, skills, safety hooks, subagents, MCP stub, CI templates — and then helps you figure out which of those rules are actually doing anything.
9
8
 
10
- Works with: Claude Code · Cursor · GitHub Copilot · Windsurf · Aider · Gemini CLI · OpenAI Codex CLI
11
-
12
- ---
9
+ Works with Claude Code, Cursor, GitHub Copilot, Windsurf, Aider, Gemini CLI, and OpenAI Codex CLI.
13
10
 
14
11
  ## Quickstart
15
12
 
16
13
  ```bash
17
- # Run in any repo directory
18
14
  npx haac-aikit
19
-
20
- # Or install globally
21
- npm i -g haac-aikit
22
- aikit
23
15
  ```
24
16
 
25
- The interactive wizard takes under 30 seconds and leaves behind a `.aikitrc.json` you can commit.
17
+ The wizard takes about 30 seconds and writes a `.aikitrc.json` you can commit. Re-run later with `aikit sync`.
26
18
 
27
- ### Headless (CI-friendly)
19
+ For CI or scripts:
28
20
 
29
21
  ```bash
30
22
  npx haac-aikit --yes --tools=claude,cursor,copilot --preset=standard
31
23
  ```
32
24
 
33
- ---
25
+ ## Why this exists
26
+
27
+ Every AI tool now wants its own rules file: CLAUDE.md, `.cursor/rules/`, `copilot-instructions.md`, AGENTS.md. They all do roughly the same thing — tell the model how your team writes code — but you end up maintaining four copies, none of which you can tell are working.
28
+
29
+ You write 30 rules and pray. The kit you cloned last quarter ships a CLAUDE.md with rules about Python even though you write Go. You never delete the dead ones because you can't tell they're dead.
30
+
31
+ haac-aikit gives you the curated baseline like other kits do (skills, hooks, agents, etc.), and on top of that it adds three things no other kit ships:
32
+
33
+ 1. **Observability** — telemetry hooks log which rules are loaded and violated, so `aikit doctor --rules` can tell you which to keep, strengthen, or delete.
34
+ 2. **Dialect translation** — Cursor's MDC, Claude's emphasis tokens, Aider's imperative phrasing all want different things. Same canonical AGENTS.md, reformatted per tool.
35
+ 3. **`aikit learn`** — mines your team's PR review comments for repeated corrections and proposes them as new rules.
36
+
37
+ ## What changes after you install it
38
+
39
+ **Right after `aikit init`:**
40
+
41
+ - One `AGENTS.md` becomes the source of truth for every AI tool you use. You stop maintaining four copies of the same rules.
42
+ - Force-pushing to `main`, committing secrets, reading `.env*` / `.ssh/` / `.aws/` files, `rm -rf` outside the project, and about a dozen other footguns are blocked at the hook level. They don't depend on the AI cooperating — the hook fires before the tool call.
43
+ - 18 process skills (TDD, brainstorming, debugging, etc.) sit in `.claude/skills/` and load on demand. Always-on cost is roughly 100 tokens per skill, so your context window stays clean.
44
+ - Per-PR safety: a `gitleaks` workflow ships in `.github/workflows/` so secrets caught at commit time don't slip through code review either.
45
+
46
+ **After a week or two of use:**
47
+
48
+ - `aikit doctor --rules` shows you which rules fire often, which fire and get violated, and which never come up. You delete the dead ones, strengthen the disputed ones, and stop guessing.
49
+ - The `.aikit/events.jsonl` log accumulates a real record of every rule load and pattern violation — local, gitignored, never uploaded. If you opt into the LLM judge it also includes per-turn cited / violated verdicts.
50
+
51
+ **After a month:**
52
+
53
+ - `aikit learn --limit=30` mines your merged PRs for repeated review comments and proposes new rules. Patterns like "we always validate at the boundary" or "use named exports here" that used to live only in code review get codified without anyone hand-typing them.
54
+ - The optional GitHub Actions workflow posts a sticky PR comment with a rule-adherence score, so regressions across releases are visible at PR-review time.
34
55
 
35
- ## What gets installed
56
+ **What you don't get locked into:**
57
+
58
+ - AGENTS.md is portable — Cursor, Copilot, Codex, Aider, and Gemini all read it. Switching tools doesn't mean rewriting your rules.
59
+ - The catalog (skills, hooks, agents) is just markdown and shell scripts under `.claude/`. Take it and walk away whenever — haac-aikit never reaches back into your repo and there's no SaaS to cancel.
60
+ - All telemetry is local. The opt-in LLM judge calls the Anthropic API only with your own key, only on `Stop` events, and you can pull the env var anytime.
61
+
62
+ ## What you get
63
+
64
+ ### Minimal scope
36
65
 
37
- ### Scope: minimal
38
66
  | File | Purpose |
39
67
  |---|---|
40
- | `AGENTS.md` | Single source of truth: project rules, conventions, gotchas |
41
- | `CLAUDE.md` | 8-line shim: `@AGENTS.md` + Claude-specific overrides region |
42
- | `.cursor/rules/000-base.mdc` | Always-on Cursor rule pointing at AGENTS.md |
43
- | `.github/copilot-instructions.md` | Copilot pointer |
44
- | `GEMINI.md`, `CONVENTIONS.md`, `.windsurf/rules/project.md` | Per-tool shims |
45
- | `.mcp.json` | MCP stub with filesystem + memory + fetch (3 safe defaults) |
46
- | `.claude/settings.json` | Hardened permissions deny list for secrets + destructive commands |
47
- | `.aikitrc.json` | Versioned config for reproducible re-runs |
48
-
49
- ### Scope: standard (default) — adds
50
- - **18 curated skills** (10 Tier-1 always-on + 8 Tier-2 default) — process skills, not stack-specific
51
- - **8 subagents** — orchestrator, planner, researcher, implementer, reviewer, tester, security-auditor, devops
52
- - **Safety hooks** — block dangerous bash, force-push to main, secret commits, sensitive file access
53
- - **Quality hooks** — format on save, session context primer, pre-compaction state preservation
54
- - **CI workflows** — secret scanning (gitleaks), standard CI, `@claude` PR responder
55
-
56
- ### Scope: everything — adds
57
- - Domain-specialist agents (frontend, backend, mobile) based on your project shape
58
- - Dev container, plugin wrapper, OTel exporter config, auto-sync CI workflow
59
-
60
- ---
68
+ | `AGENTS.md` | The canonical project rules; your edits outside the BEGIN/END markers are never touched |
69
+ | `CLAUDE.md` | Five-line shim that imports `@AGENTS.md` plus a Claude-only override block |
70
+ | `.cursor/rules/000-base.mdc` | Cursor MDC, dialect-translated from AGENTS.md (not a generic shim) |
71
+ | `.github/copilot-instructions.md`, `GEMINI.md`, `CONVENTIONS.md`, `.windsurf/rules/project.md` | Per-tool shims |
72
+ | `.mcp.json` | MCP stub with filesystem, memory, fetch — three servers, ~1k tokens of tool defs |
73
+ | `.claude/settings.json` | Hardened permissions: deny list for secrets and destructive commands |
74
+ | `.aikitrc.json` | Versioned config so re-runs are deterministic |
61
75
 
62
- ## Commands
76
+ ### Standard scope (default) adds
77
+
78
+ - 18 process skills, organised into Tier 1 (always-on) and Tier 2 (opt-in). Skill bodies only load when triggered, so the at-rest cost is roughly 100 tokens each.
79
+ - **Agents** in `.claude/agents/`: 10 always-on (planner, reviewer, debugger, pr-describer, …) plus opt-in specialty agents (simplifier, prompt-engineer, evals-author, …) selected via the wizard.
80
+ - **Conflict-aware sync**: if you've modified an installed template (e.g., `.claude/agents/reviewer.md`), `aikit sync` and `aikit update` now prompt before overwriting — instead of silently replacing it.
81
+ - Safety hooks that block dangerous bash, force-push to main, secret commits, and reads of sensitive files.
82
+ - Observability hooks (see below).
83
+ - A starter `.claude/aikit-rules.json` with regex patterns for common things like no `console.log`, no default exports, no `any`.
84
+ - `docs/claude-md-reference.md` — a 2026 reference doc on Anthropic's memory features for your team.
85
+ - `.claude/rules/example.md` — a starter path-scoped rule that only loads when matching files are read.
86
+ - CI workflows: gitleaks, standard CI, optional `@claude` PR responder, optional rule-adherence PR comment.
87
+
88
+ ### Everything scope adds
89
+
90
+ Dev container, OTel exporter, plugin wrapper, auto-sync CI, and shape-specific agents (frontend / backend / mobile, picked based on the project shape you select in the wizard).
91
+
92
+ ## Rule observability
93
+
94
+ After a few Claude Code sessions:
63
95
 
64
96
  ```
65
- aikit Interactive wizard
66
- aikit sync Re-generate from .aikitrc.json (idempotent)
67
- aikit update Pull latest templates, show diff, prompt
68
- aikit diff Show drift between current state and fresh generation
69
- aikit add <item> Add a single skill, command, agent, or hook
70
- aikit list Show installed items + catalog availability
71
- aikit doctor Sanity-check: schema, triggers, broken links
97
+ $ aikit doctor --rules
98
+
99
+ Hot rules (working as intended)
100
+ commit.conventional-commits 47 loads
101
+ security.no-sensitive-files 12 loads
102
+
103
+ Disputed rules (>30% violation rate)
104
+ code-style.no-console-log — 47 loads, 18 pattern violations
105
+ Frequently violated. Strengthen with IMPORTANT/YOU MUST or move to a hook.
106
+
107
+ Dead rules (never observed)
108
+ legacy.bounded-contexts
109
+ Never loaded, cited, or violated. Consider removing or rephrasing.
72
110
  ```
73
111
 
74
- Every prompt has a `--flag` equivalent for headless use.
112
+ This comes from three small hooks shipped at standard scope:
113
+
114
+ - **`log-rule-event.sh`** runs on `InstructionsLoaded`. It scans loaded files for `<!-- id: code-style.no-any -->` markers and writes one event per rule per session.
115
+ - **`check-pattern-violations.sh`** runs on `PostToolUse` for Edit/Write. It reads `.claude/aikit-rules.json` and flags any pattern matches in the file Claude just wrote.
116
+ - **`judge-rule-compliance.sh`** runs on `Stop`. It's opt-in. If you set both `AIKIT_JUDGE=1` and `ANTHROPIC_API_KEY`, it asks Claude Haiku to judge whether each loaded rule was cited or violated this turn (~$0.001/turn). If either env var is missing, it returns immediately and does nothing.
75
117
 
76
- ---
118
+ All three hooks append to `.aikit/events.jsonl`, which `sync` adds to `.gitignore`. Nothing leaves your machine unless you opt in to the judge.
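As a rough illustration of how the buckets in the `aikit doctor --rules` output could be derived from such a log, here is a minimal sketch. The event schema is an assumption: the field names `event` and `rule_id` and the values `load` and `violation` are hypothetical, since the diff does not show the real format of `.aikit/events.jsonl`. Only the 30% threshold comes from the sample output.

```python
import json
from collections import Counter

# Taken from the "Disputed rules (>30% violation rate)" heading in the sample output.
DISPUTED_THRESHOLD = 0.30


def bucket_rules(event_lines, known_rule_ids):
    """Sort rule ids into hot / disputed / dead buckets.

    event_lines: iterable of JSON strings, one event per line (hypothetical schema).
    known_rule_ids: every rule id declared in the rule files.
    """
    loads, violations = Counter(), Counter()
    for raw in event_lines:
        if not raw.strip():
            continue  # tolerate blank lines in the log
        event = json.loads(raw)
        if event["event"] == "load":
            loads[event["rule_id"]] += 1
        elif event["event"] == "violation":
            violations[event["rule_id"]] += 1

    buckets = {"hot": [], "disputed": [], "dead": []}
    for rule_id in sorted(known_rule_ids):
        if loads[rule_id] + violations[rule_id] == 0:
            buckets["dead"].append(rule_id)  # never observed at all
        elif violations[rule_id] > DISPUTED_THRESHOLD * max(loads[rule_id], 1):
            buckets["disputed"].append(rule_id)
        else:
            buckets["hot"].append(rule_id)
    return buckets
```

With 47 loads and 18 violations, as in the sample output, `code-style.no-console-log` has a violation rate of about 38% and lands in the disputed bucket.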
77
119
 
78
- ## Update safety BEGIN/END markers
120
+ `aikit report` formats the same data as Markdown (PR-comment ready) or JSON (`--format=json`, for CI). Without judge data, the adherence score is `null` with `basis: "no-evidence"` rather than a fake number derived from load counts.
79
121
 
80
- haac-aikit uses idempotent markers to manage only the content it owns:
122
+ ### Adding observability to your own rules
123
+
124
+ In any rule file, add a stable HTML-comment ID:
125
+
126
+ ```markdown
127
+ - <!-- id: code-style.no-any emphasis=high paths=src/**/*.ts --> Use `unknown` and type guards, not `any`.
128
+ ```
129
+
130
+ The `id` is required for telemetry. `emphasis` and `paths` are optional metadata read by the dialect translators. HTML comments cost zero context tokens — Claude strips them before injection — so this is free observability.
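The marker format above is simple enough to parse with one regular expression. The sketch below is illustrative only: the shipped hook is a shell script whose logic is not shown in this diff, and `parse_rule_markers` is a hypothetical helper name.

```python
import re

# Matches <!-- id: some.rule-id key=value key=value --> (metadata is optional).
MARKER_RE = re.compile(r"<!--\s*id:\s*(?P<id>[\w.\-]+)(?P<attrs>[^>]*?)\s*-->")


def parse_rule_markers(text):
    """Return one dict per marker: the rule id plus any key=value metadata."""
    rules = []
    for match in MARKER_RE.finditer(text):
        metadata = dict(
            pair.split("=", 1)
            for pair in match.group("attrs").split()
            if "=" in pair
        )
        rules.append({"id": match.group("id"), **metadata})
    return rules


line = (
    "- <!-- id: code-style.no-any emphasis=high paths=src/**/*.ts --> "
    "Use `unknown` and type guards, not `any`."
)
print(parse_rule_markers(line))
```

Markers without metadata, such as `<!-- id: learned.use-named-exports -->`, yield a dict containing only the `id` key.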
131
+
132
+ ## Dialect translation
133
+
134
+ Other multi-tool kits copy the same content into every per-tool file. haac-aikit reformats per dialect.
135
+
136
+ For Cursor that means: `.cursor/rules/000-base.mdc` gets proper MDC frontmatter, **bold** emphasis on rules tagged `emphasis=high`, and a paths hint surfaced inline. Rule IDs are preserved so the observability hooks see them load alongside AGENTS.md.
137
+
138
+ Claude, Aider, Copilot, and Gemini translators are the next thing on the roadmap.
139
+
140
+ ## Learn from your PR history
141
+
142
+ ```
143
+ $ aikit learn --limit=30
144
+ ```
145
+
146
+ Pulls the last 30 merged PRs via `gh`, scans review and issue comments for correction phrases ("we always", "don't here", "actually let's", "nit:"), tokenises them, clusters by Jaccard similarity, and prints proposals in a paste-ready block:
147
+
148
+ ```
149
+ <!-- BEGIN:learned -->
150
+ ## Learned conventions
151
+ - <!-- id: learned.use-named-exports --> we always use named exports here, not default
152
+ - <!-- id: learned.validate-input-boundary --> please validate inputs at the API boundary
153
+ <!-- END:learned -->
154
+ ```
155
+
156
+ Review the suggestions, paste the keepers into AGENTS.md. The similarity threshold is intentionally permissive — false positives are easy to reject, missed signal is harder to recover. There are no ML dependencies; it's regex, a stopword list, and a five-line stemmer.
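To make the clustering step concrete, here is a minimal sketch of tokenising comments and grouping them by Jaccard similarity. Everything specific in it is an assumption: the stopword list, the 0.3 threshold, the greedy first-match grouping, and the function names are hypothetical, and the stemmer the README mentions is left out for brevity.

```python
import re

# Hypothetical stopword list; the real one is not shown in this diff.
STOPWORDS = {"we", "the", "a", "an", "to", "at", "here", "please", "not", "actually"}


def tokenise(comment):
    """Lowercase the comment, split into words, and drop stopwords."""
    words = re.findall(r"[a-z']+", comment.lower())
    return frozenset(word for word in words if word not in STOPWORDS)


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(comments, threshold=0.3):
    """Greedy clustering: join the first group whose first member is similar enough."""
    groups = []  # each group is a list of (comment, tokens) pairs
    for comment in comments:
        tokens = tokenise(comment)
        for group in groups:
            if jaccard(tokens, group[0][1]) >= threshold:
                group.append((comment, tokens))
                break
        else:
            groups.append([(comment, tokens)])
    return [[comment for comment, _ in group] for group in groups]
```

A low threshold errs on the side of merging loosely related comments, which matches the stated preference for false positives over missed signal.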
157
+
158
+ ## Update safety
159
+
160
+ haac-aikit owns content between BEGIN/END markers. Everything outside is yours.
81
161
 
82
162
  ```markdown
83
163
  # My Project
84
164
 
85
- My hand-written project notes — never touched by haac-aikit.
165
+ My own notes — never touched by aikit.
86
166
 
87
167
  <!-- BEGIN:haac-aikit -->
88
- ...managed content...
168
+ managed content
89
169
  <!-- END:haac-aikit -->
90
170
 
91
171
  More of my notes — also never touched.
92
172
  ```
93
173
 
94
- `aikit sync` regenerates only the region between the markers. Everything outside is yours.
174
+ `aikit sync` is idempotent: running it twice produces the same files. `aikit diff` shows what would change. `aikit update` shows the diff and prompts before writing.
175
+
176
+ The marker engine handles four dialects automatically (`.md` → `<!-- ... -->`, `.yml` → `# `, `.json` → `// `, shell → `# `). If a downstream user removes a marker by accident, the hook refuses to silently re-append — it errors out so you can investigate.
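The marker behaviour described here (replace only the managed region, fail loudly if a marker is gone) can be sketched in a few lines. This is an illustration for the Markdown dialect only, not the package's implementation; `replace_managed_region` and `MarkerError` are hypothetical names.

```python
BEGIN = "<!-- BEGIN:haac-aikit -->"
END = "<!-- END:haac-aikit -->"


class MarkerError(ValueError):
    """Raised when the BEGIN/END pair is missing or out of order."""


def replace_managed_region(document, managed):
    """Swap the text between the markers; leave everything outside untouched."""
    begin = document.find(BEGIN)
    end = document.find(END)
    if begin == -1 or end == -1 or end < begin:
        # Refuse to guess where the region should go.
        raise MarkerError("BEGIN/END markers missing or out of order")
    head = document[: begin + len(BEGIN)]
    tail = document[end:]
    return f"{head}\n{managed.strip()}\n{tail}"


original = (
    "# My Project\n\nMy own notes.\n\n"
    f"{BEGIN}\nold managed content\n{END}\n\n"
    "More of my notes.\n"
)
once = replace_managed_region(original, "new managed content")
twice = replace_managed_region(once, "new managed content")
print(once == twice)
```

Because the output depends only on the text outside the markers and the new managed content, applying the same update twice yields the same file, which is the idempotence property the README claims for `aikit sync`.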
95
177
 
96
- ---
178
+ ## Commands
97
179
 
98
- ## Token efficiency
180
+ ```
181
+ aikit interactive wizard
182
+ aikit sync regenerate from .aikitrc.json (idempotent)
183
+ aikit update pull latest templates, show diff, prompt
184
+ aikit diff show drift between current state and a fresh generation
185
+ aikit add <item> add a single skill, command, agent, or hook
186
+ aikit list show installed items + catalog availability
187
+ aikit doctor schema, triggers, broken-link checks
188
+ aikit doctor --rules rule observability buckets
189
+ aikit report Markdown adherence summary
190
+ aikit report --format=json same data, structured for CI
191
+ aikit learn --limit=30 propose rules from your PR review history
192
+ ```
99
193
 
100
- haac-aikit is built on the evidence from four research passes:
194
+ Most prompts have a `--flag` equivalent for headless use.
101
195
 
102
- - **~100 tokens per skill** at rest (metadata only — body loads only when triggered)
103
- - **≤200 lines** AGENTS.md enforced in CI
104
- - **Zero LLM-generated dumps** — every shipped artifact is human-curated (ETH Zurich 2026 found LLM dumps add cost, don't improve success rate)
105
- - **3 MCP servers by default** — filesystem + memory + fetch only (5 servers = ~77K tokens of tool defs)
196
+ ## Design choices, in case they help you decide
106
197
 
107
- ---
198
+ - **Skills are ~100 tokens at rest.** Bodies load only when the skill is triggered. A kit with 30 always-on skill bodies eats your context window before you've started.
199
+ - **AGENTS.md is canonical, CLAUDE.md is a 5-line shim that imports it.** One source of truth across all tools.
200
+ - **Three MCP servers by default.** Five servers can be ~77K tokens of tool definitions. Most projects don't need a search engine *and* a database *and* a filesystem in every conversation.
201
+ - **Marker-protected templates.** This was the first thing I broke in my own setup before adding the marker engine. Your edits outside the markers survive every `sync`.
202
+ - **No LLM-generated content in the catalog.** Every shipped skill / hook / agent is human-curated. ETH Zurich's 2026 study on LLM-augmented context found dumps add cost without improving success rate.
108
203
 
109
- ## Why haac-aikit vs. the alternatives?
204
+ ## How haac-aikit compares
110
205
 
111
206
  | | haac-aikit | rulesync | ruler | claudekit |
112
207
  |---|---|---|---|---|
113
- | Content included | ✅ 18 skills + 11 agents + hooks | Config manager only | Config manager only | Claude-only |
114
- | Cross-tool | 7 tools | | | |
115
- | Open Skills standard | ✅ agentskills.io | | | |
116
- | Config file backed | `.aikitrc.json` | | | |
117
- | Idempotent markers | | | ❌ (`.bak` backups) | |
208
+ | Includes content (skills, agents, hooks) | yes | no (config manager only) | no (config manager only) | Claude-only |
209
+ | Cross-tool | 7 tools | yes | yes | no |
210
+ | Open Skills standard (agentskills.io) | yes | no | no | no |
211
+ | Config file backed | `.aikitrc.json` | no | no | no |
212
+ | Idempotent BEGIN/END markers | yes | no | `.bak` backups | no |
213
+ | Rule observability | yes | no | no | no |
214
+ | Dialect translation | yes | no | no | no |
215
+ | Learn from PR history | yes | no | no | no |
216
+
217
+ ## Status
118
218
 
119
- ---
219
+ This is 0.7.0. The strategy plan reserves 1.0 until at least three external teams have used the observability loop on real PRs — until then, expect breaking changes between minor versions. 0.7.0 ships the tiered agent system, 8 new agents, interactive conflict resolution, and the Cursor dialect translator; Claude, Aider, Copilot, and Gemini translators are next.
120
220
 
121
221
  ## Contributing
122
222
 
123
223
  Issues and PRs welcome at [github.com/Hamad-Center/haac-aikit](https://github.com/Hamad-Center/haac-aikit).
124
224
 
125
- ---
225
+ I'm looking for **three teams** to try the observability loop on a real codebase. Your feedback shapes 1.0. Comment on [issue #1](https://github.com/Hamad-Center/haac-aikit/issues/1) if you're up for it.
126
226
 
127
- ## License & attributions
227
+ ## License
128
228
 
129
- MIT. See [ATTRIBUTIONS.md](ATTRIBUTIONS.md) for adapted sources.
229
+ MIT. See [ATTRIBUTIONS.md](ATTRIBUTIONS.md) for the list of adapted sources.
@@ -0,0 +1,86 @@
1
+ ---
2
+ name: architect
3
+ description: Produces Architecture Decision Records (RFCs) for features with system-level design implications. Dispatched by the software-architect skill or orchestrator. Always runs before writing-plans on non-trivial features.
4
+ model: claude-opus-4-7
5
+ tools:
6
+ - Read
7
+ - Grep
8
+ - Glob
9
+ - WebSearch
10
+ - Write
11
+ ---
12
+
13
+ # Architect
14
+
15
+ You produce RFC documents before any plan or code is written. Your output is the authoritative design artifact that feeds the planner and implementer.
16
+
17
+ ## Inputs you need
18
+
19
+ - Feature description and goals (from dispatcher)
20
+ - Relevant existing files and patterns (from skill exploration)
21
+ - Known constraints: performance, security, backwards compatibility, deadlines
22
+
23
+ ## Protocol
24
+
25
+ 1. **Confirm existing patterns** — grep and read to verify what already exists; never design around patterns you haven't verified are in the codebase.
26
+
27
+ 2. **Identify decisions** — surface the 2-3 key design decisions this feature requires. Name each one explicitly.
28
+
29
+ 3. **Evaluate options** — for each decision, enumerate 2-3 concrete options and score them against the known constraints.
30
+
31
+ 4. **Recommend** — make an explicit recommendation for each decision with a one-sentence rationale.
32
+
33
+ 5. **Write the RFC** — save to `docs/decisions/YYYY-MM-DD-<topic>.md` (create the directory if needed). Use the format below exactly.
34
+
35
+ 6. **Emit handoff** — structured output for `writing-plans`.
36
+
37
+ ## RFC format
38
+
39
+ ```
40
+ # RFC: <topic>
41
+
42
+ ## Problem Statement
43
+ [What problem this feature solves and why it matters]
44
+
45
+ ## Constraints
46
+ [Performance, security, backwards compat, team, deadline — anything that eliminates options]
47
+
48
+ ## Options
49
+
50
+ ### Option A — <name>
51
+ [Description, pros, cons]
52
+
53
+ ### Option B — <name>
54
+ [Description, pros, cons]
55
+
56
+ ### Option C — <name> ⭐ recommended
57
+ [Description, pros, cons, why this wins given constraints]
58
+
59
+ ## Decision
60
+ [One paragraph: what was chosen and the single most important reason]
61
+
62
+ ## Consequences
63
+ [What this decision makes easier, what it forecloses, what debt it incurs]
64
+
65
+ ## Open Questions
66
+ [Anything unresolved that the planner or implementer must decide]
67
+ ```
68
+
69
+ ## Handoff format
70
+
71
+ ```
72
+ [architect] → [writing-plans]
73
+ RFC: docs/decisions/YYYY-MM-DD-<topic>.md
74
+ Decision: <one-line summary of what was chosen>
75
+ Key constraints for planner:
76
+ - <constraint 1>
77
+ - <constraint 2>
78
+ Status: DONE | NEEDS_CLARIFICATION
79
+ ```
80
+
81
+ ## Rules
82
+
83
+ - Do not write code.
84
+ - Do not write a plan — that is `writing-plans`' job.
85
+ - If constraints make all options unacceptable, emit `NEEDS_CLARIFICATION` and surface the specific blocker.
86
+ - Opus is justified here: wrong architecture decisions compound into every downstream file and are the most expensive mistakes to fix.
@@ -0,0 +1,64 @@
1
+ ---
2
+ name: debugger
3
+ description: Reproduces failing scenarios, isolates the minimal cause, and proposes a fix path. Read-only — never edits production code. Use this agent when something is broken; use `researcher` when you need to understand how working code is structured.
4
+ model: claude-sonnet-4-6
5
+ tools:
6
+ - Read
7
+ - Grep
8
+ - Glob
9
+ - Bash
10
+ ---
11
+
12
+ # Debugger
13
+
14
+ You diagnose. You do not fix. Your output is a precise root-cause analysis the implementer can act on.
15
+
16
+ ## When you are invoked
17
+
18
+ Something is broken — a test fails, a function returns the wrong value, a request 500s, a build errors out.
19
+
20
+ ## Protocol
21
+
22
+ 1. **Reproduce first.** Read the relevant code and run the smallest command that triggers the failure. If you cannot reproduce, stop and report what's missing.
23
+
24
+ 2. **Bisect the cause.** Narrow the failure to a single function, line, or input. Use prints, logging, or targeted reads — never edits.
25
+
26
+ 3. **Form a hypothesis.** State what you believe is wrong and why. Predict what would change if the hypothesis is correct.
27
+
28
+ 4. **Verify the hypothesis.** Run a check (read state, modify input, etc.) that would distinguish the hypothesis from alternatives.
29
+
30
+ 5. **Propose a fix.** Describe the smallest change that addresses the root cause — not the symptom. If multiple fixes exist, list them with trade-offs.
31
+
32
+ ## Output format
33
+
34
+ ```
35
+ Bug: [one-line summary]
36
+
37
+ Reproduction:
38
+ - Command: [exact command]
39
+ - Expected: [what should happen]
40
+ - Actual: [what happens]
41
+
42
+ Root cause: [file:line — description]
43
+
44
+ Why it fails: [the mechanism, in 1-3 sentences]
45
+
46
+ Recommended fix: [smallest change, with rationale]
47
+
48
+ Alternatives considered: [if any]
49
+ ```
50
+
51
+ ## Handoff format
52
+
53
+ ```
54
+ [debugger] → [implementer | orchestrator]
55
+ Summary: Diagnosed [bug], root cause at [file:line]
56
+ Artifacts: analysis (inline)
57
+ Next: Apply recommended fix
58
+ Status: DONE | NEEDS_CONTEXT
59
+ ```
60
+
61
+ ## Rules
62
+ - Do not edit files. If a fix requires more than reading, hand off to the implementer.
63
+ - Do not guess. A hypothesis without a verifying check is not a finding.
64
+ - Report `NEEDS_CONTEXT` if you cannot reproduce — do not invent a root cause.
@@ -0,0 +1,57 @@
1
+ ---
2
+ name: pr-describer
3
+ description: Reads `git diff` against the base branch and writes a conventional-commit-styled PR title (≤70 chars) and a Summary + Test Plan body. Use this when opening a PR; use `changelog-curator` for release notes across multiple commits.
4
+ model: claude-haiku-4-5
5
+ tools:
6
+ - Read
7
+ - Bash
8
+ ---
9
+
10
+ # PR Describer
11
+
12
+ You turn diffs into PR descriptions. You do not edit code.
13
+
14
+ ## When you are invoked
15
+
16
+ The user is about to open a PR and needs a title + body.
17
+
18
+ ## Protocol
19
+
20
+ 1. **Read the diff.** Run `git diff <base>...HEAD` (default base: `main`) and `git log <base>..HEAD --oneline`.
21
+
22
+ 2. **Identify the change type.** One of: `feat`, `fix`, `refactor`, `test`, `docs`, `chore`, `perf`. If multiple types are present, pick the dominant one.
23
+
24
+ 3. **Write the title** in conventional-commit form: `type(scope): description`. Maximum 70 characters. The scope is optional — include it if the diff touches a clearly-named area (e.g., `auth`, `wizard`, `catalog`).
25
+
26
+ 4. **Write the Summary** as 1-3 bullet points covering what changed and why. Read the diff, not the commit messages — commit messages can be misleading.
27
+
28
+ 5. **Write the Test Plan** as a markdown checklist of how to verify the change. Include both automated checks (e.g., `npm test`) and manual steps if applicable.
29
+
30
+ ## Output format
31
+
32
+ ```
33
+ Title: type(scope): description
34
+
35
+ ## Summary
36
+ - [bullet]
37
+ - [bullet]
38
+
39
+ ## Test plan
40
+ - [ ] [check]
41
+ - [ ] [check]
42
+ ```
43
+
44
+ ## Handoff format
45
+
46
+ ```
47
+ [pr-describer] → [user]
48
+ Summary: PR description ready
49
+ Artifacts: title + body (inline)
50
+ Next: Open PR with this content
51
+ Status: DONE
52
+ ```
53
+
54
+ ## Rules
55
+ - Title ≤ 70 characters, no exceptions.
56
+ - Use the imperative mood: "add X", not "added X".
57
+ - Do not invent scope — leave it out if unclear.
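The title rules above are mechanical enough to check in code. The following is a hedged sketch of such a check; `title_problems` is a hypothetical helper, not something the package is known to ship, and the regular expression is one plausible reading of the `type(scope): description` form.

```python
import re

# Change types listed in the agent's protocol.
ALLOWED_TYPES = ("feat", "fix", "refactor", "test", "docs", "chore", "perf")
MAX_TITLE_LENGTH = 70
TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()]+)\))?: (?P<description>.+)$"
)


def title_problems(title):
    """Return a list of human-readable problems; an empty list means the title passes."""
    problems = []
    if len(title) > MAX_TITLE_LENGTH:
        problems.append(
            f"title is {len(title)} characters; the limit is {MAX_TITLE_LENGTH}"
        )
    match = TITLE_RE.match(title)
    if match is None:
        problems.append("title is not in type(scope): description form")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type {match.group('type')!r}")
    return problems


print(title_problems("feat(wizard): add tier selection step"))
```

The imperative-mood rule is not covered; that one needs a human or a model to judge.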
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: researcher
3
3
  description: Read-only codebase and web exploration. Maps architecture, traces execution paths, answers questions about how things work. Never edits files. Use when you need to understand before acting.
4
- model: claude-sonnet-4-6
4
+ model: claude-haiku-4-5
5
5
  tools:
6
6
  - Read
7
7
  - Grep
@@ -0,0 +1,60 @@
1
+ ---
2
+ name: changelog-curator
3
+ description: Reads commits since the last tag, groups by conventional-commit type, and writes a `CHANGELOG.md` entry following Keep-a-Changelog format. Use at release time; use `pr-describer` for individual PR descriptions.
4
+ model: claude-haiku-4-5
5
+ tools:
6
+ - Read
7
+ - Edit
8
+ - Bash
9
+ ---
10
+
11
+ # Changelog Curator
12
+
13
+ You write changelog entries. You do not change source code or version numbers.
14
+
15
+ ## Protocol
16
+
17
+ 1. **Find the last tag.** Run `git describe --tags --abbrev=0` to identify the previous release.
18
+
19
+ 2. **List commits since the tag.** Run `git log <last-tag>..HEAD --oneline`.
20
+
21
+ 3. **Group by conventional-commit type:**
22
+ - `feat:` → **Added**
23
+ - `fix:` → **Fixed**
24
+ - `perf:` → **Changed**
25
+ - `refactor:` → **Changed** (only user-visible refactors; skip internal)
26
+ - `docs:` → **Changed** (only if user-facing docs; skip internal)
27
+ - `chore:` / `test:` → omit unless impact is user-visible
28
+
29
+ 4. **Write the entry.** Format follows Keep-a-Changelog 1.1.0:
30
+
31
+ ```markdown
32
+ ## [X.Y.Z] - YYYY-MM-DD
33
+
34
+ ### Added
35
+ - [feature description, no commit hash]
36
+
37
+ ### Changed
38
+ - [change description]
39
+
40
+ ### Fixed
41
+ - [bug fix description]
42
+ ```
43
+
44
+ 5. **Insert above the previous entry** in `CHANGELOG.md`. Do not edit older entries.
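The grouping in step 3 can be sketched as a lookup table plus a review queue. This is an illustration under assumptions: `group_commits` is a hypothetical helper, the input is assumed to be `git log --oneline` lines, and because "user-visible" is a judgment call, `refactor` and `docs` commits are set aside for review rather than classified automatically.

```python
import re

# Types whose section does not depend on a judgment call.
SECTION_BY_TYPE = {"feat": "Added", "fix": "Fixed", "perf": "Changed"}
# Types that belong under "Changed" only when user-visible.
REVIEW_TYPES = ("refactor", "docs")

# Optional abbreviated hash, then type, optional scope, optional "!", then description.
SUBJECT_RE = re.compile(
    r"^(?:[0-9a-f]{7,40} )?(?P<type>[a-z]+)(?:\([^()]+\))?!?: (?P<description>.+)$"
)


def group_commits(subjects):
    """Return (sections, needs_review) for a list of one-line commit subjects."""
    sections = {"Added": [], "Changed": [], "Fixed": []}
    needs_review = []
    for subject in subjects:
        match = SUBJECT_RE.match(subject)
        if match is None:
            continue  # not a conventional commit; skipped in this sketch
        commit_type = match.group("type")
        if commit_type in SECTION_BY_TYPE:
            sections[SECTION_BY_TYPE[commit_type]].append(match.group("description"))
        elif commit_type in REVIEW_TYPES:
            needs_review.append(subject)
        # chore/test and anything else are omitted by default
    return sections, needs_review
```

The descriptions still need rewording into human-readable bullets, and several commits for one change still need merging into a single bullet, as the constraints in this file require.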
45
+
46
+ ## Constraints
47
+
48
+ - Each bullet is human-readable. "Add tier-aware sync" beats "feat(sync): tier-aware sync handler".
49
+ - One bullet per user-visible change, even if multiple commits implemented it.
50
+ - Skip internal-only changes (build-system tweaks, test refactors).
51
+
52
+ ## Handoff format
53
+
54
+ ```
55
+ [changelog-curator] → [user]
56
+ Summary: Wrote CHANGELOG entry for X.Y.Z, M items
57
+ Artifacts: CHANGELOG.md
58
+ Next: Review wording, tag the release
59
+ Status: DONE
60
+ ```
@@ -0,0 +1,67 @@
1
+ ---
2
+ name: dependency-upgrader
3
+ description: Audits `package.json` for major-version bumps, runs codemods (e.g., `next`, `react`, vendor-shipped), verifies build/test, and writes migration notes. Use for routine dependency hygiene; use a domain specialist for framework-wide rewrites.
4
+ model: claude-sonnet-4-6
5
+ tools:
6
+ - Read
7
+ - Edit
8
+ - Write
9
+ - Grep
10
+ - Bash
11
+ ---
12
+
13
+ # Dependency Upgrader
14
+
15
+ You upgrade dependencies safely. The bar: build green, tests green, no behaviour change.
16
+
17
+ ## Protocol
18
+
19
+ 1. **Audit current state.** Run `npm outdated --json` and identify candidates with major-version bumps.
20
+
21
+ 2. **Read the changelogs.** For each candidate, fetch the changelog/release notes. If breaking changes are not documented, skip that upgrade and flag it back to the user.
22
+
23
+ 3. **Plan the order.** Prefer:
24
+ - Leaf packages first (no transitive deps among the upgrades)
25
+ - Lockstep packages together (e.g., `next` + `eslint-config-next`)
26
+ - Test/build tooling before runtime libs (regressions surface faster)
27
+
28
+ 4. **Apply one upgrade at a time.** For each:
29
+ - Bump the version in `package.json`
30
+ - Run the official codemod if one exists (`npx @next/codemod ...`)
31
+ - Run `npm install`
32
+ - Run `npm run build && npm test`
33
+ - Commit if green
34
+
35
+ 5. **Write migration notes.** For each major bump, append a note to `CHANGELOG.md` (or a `MIGRATION.md`) covering:
36
+ - The version range
37
+ - Behaviour changes the team should know about
38
+ - Codemods applied
39
+ - Anything still requiring manual follow-up
40
+
41
+ ## Constraints
42
+
43
+ - Never bypass `npm install` errors with `--force` or `--legacy-peer-deps` without explicit user approval.
44
+ - Never remove a dependency to silence a peer-dep warning.
45
+ - One upgrade per commit. Bundling makes bisects miserable.
46
+
47
+ ## Output format
48
+
49
+ ```
50
+ Upgrade report:
51
+
52
+ [package@old → package@new]
53
+ - Codemod: [applied | n/a]
54
+ - Build: [green | red]
55
+ - Tests: [N/N | failures]
56
+ - Behaviour notes: [list]
57
+ ```
58
+
59
+ ## Handoff format
60
+
61
+ ```
62
+ [dependency-upgrader] → [reviewer | orchestrator]
63
+ Summary: Upgraded [N] packages, all green
64
+ Artifacts: [package.json, package-lock.json, MIGRATION notes]
65
+ Next: Review behaviour notes, merge
66
+ Status: DONE | DONE_WITH_CONCERNS
67
+ ```