agent-eng 0.10.0 → 0.12.0
- package/README.md +2 -3
- package/package.json +1 -1
- package/src/init.js +2 -2
- package/src/templates/.claude/agents/executor.md +3 -1
- package/src/templates/.claude/agents/planner.md +3 -2
- package/src/templates/.claude/agents/reviewer.md +11 -2
- package/src/templates/.claude/scripts/update-status.sh +54 -0
- package/src/templates/.claude/settings.json +31 -0
- package/src/templates/CLAUDE.md +73 -46
- package/src/templates/STATUS.md +48 -0
- package/src/templates/architecture/decisions/0001-how-we-work.md +36 -13
- package/src/templates/architecture/overview.md +14 -2
- package/src/templates/orchestration.yaml +53 -42
- package/src/templates/.claude/agents/custodian.md +0 -77
- package/src/templates/.claude/agents/qa-tester.md +0 -82
package/README.md
CHANGED
@@ -21,9 +21,8 @@ This creates the following structure in your project:
 │ ├── system-architect.md
 │ ├── planner.md
 │ ├── executor.md
-│ ├── qa-tester.md
 │ ├── reviewer.md
-│ └──
+│ └── summarizer.md
 ├── architecture/
 │ ├── overview.md # High-level architecture overview
 │ └── decisions/
@@ -97,7 +96,7 @@ Defines the runtime system components, their tiers (client/service/engine/data),
 4. **Map the system** — Use the `system-architect` subagent to create your `architecture.yaml`
 5. **Plan the work** — Use the `planner` subagent to decompose your ADR into tickets
 6. **Execute** — Use the `executor` subagent to implement a ticket following your conventions
-7. **
+7. **Review** — Use the `reviewer` subagent to validate the work (tests are written during execution)
 
 ## License
 
package/package.json
CHANGED
package/src/init.js
CHANGED
@@ -9,11 +9,10 @@ const TEMPLATES = join(__dirname, "templates");
 const STRUCTURE = [
 ".github/workflows/notify-site.yml",
 ".claude/settings.json",
+".claude/scripts/update-status.sh",
 ".claude/agents/architect.md",
-".claude/agents/custodian.md",
 ".claude/agents/executor.md",
 ".claude/agents/planner.md",
-".claude/agents/qa-tester.md",
 ".claude/agents/reviewer.md",
 ".claude/agents/summarizer.md",
 ".claude/agents/system-architect.md",
@@ -26,6 +25,7 @@ const STRUCTURE = [
 "tickets/example/001-example-ticket.md",
 "orchestration.yaml",
 "architecture.yaml",
+"STATUS.md",
 ];
 
 function prompt(question) {
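The net effect of the two `STRUCTURE` hunks above can be summarized by comparing the old and new lists as sets. This is a minimal sketch, assuming only the entries visible in these hunks (the real array is longer than what the diff shows); `diffStructure` and the two array names are hypothetical and not part of the package.

```javascript
// Hypothetical helper: compare two STRUCTURE-style arrays of template paths.
function diffStructure(before, after) {
  const oldSet = new Set(before);
  const newSet = new Set(after);
  return {
    added: after.filter((path) => !oldSet.has(path)),
    removed: before.filter((path) => !newSet.has(path)),
  };
}

// Assumption: only entries visible in the hunks above are listed here.
const structureOld = [
  ".claude/settings.json",
  ".claude/agents/architect.md",
  ".claude/agents/custodian.md",
  ".claude/agents/executor.md",
  ".claude/agents/planner.md",
  ".claude/agents/qa-tester.md",
  ".claude/agents/reviewer.md",
  "architecture.yaml",
];
const structureNew = [
  ".claude/settings.json",
  ".claude/scripts/update-status.sh",
  ".claude/agents/architect.md",
  ".claude/agents/executor.md",
  ".claude/agents/planner.md",
  ".claude/agents/reviewer.md",
  "architecture.yaml",
  "STATUS.md",
];

const structureDiff = diffStructure(structureOld, structureNew);
console.log(structureDiff);
```

With these inputs the helper reports the status script and `STATUS.md` as added, and the `custodian` and `qa-tester` agent files as removed, which matches the file list at the top of this diff.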
package/src/templates/.claude/agents/executor.md
CHANGED
@@ -1,12 +1,14 @@
 ---
 name: executor
-description: Use
+description: Use for complex tickets that need guided execution with strict verification. For most tickets, prefer Claude Code plan mode (shift+tab) instead — it explores, plans, and implements in one session. Use this agent when you need enforced discipline (plan-before-code, mandatory verification) or when plan mode is unavailable.
 tools: Read, Write, Edit, Grep, Glob, Bash, WebFetch
 model: sonnet
 ---
 
 You are an executor agent. Your role is to implement tickets by writing code that follows the project's conventions and architecture decisions.
 
+> **Note:** For most tickets, Claude Code's built-in plan mode (`shift+tab`) is the recommended execution path — it provides faster context continuity and integrated exploration. This agent exists for cases where you need enforced discipline or are running execution as part of the full agent pipeline.
+
 ## Responsibilities
 
 1. **Read the ticket fully** — Understand the acceptance criteria, linked ADRs, and specs before writing any code
package/src/templates/.claude/agents/planner.md
CHANGED
@@ -5,7 +5,7 @@ tools: Read, Grep, Glob, Write, Edit
 model: sonnet
 ---
 
-You are a planner agent. Your role is to decompose specs and ADRs into actionable tickets.
+You are a planner agent. Your role is to decompose specs and ADRs into actionable tickets scoped for execution via Claude Code plan mode.
 
 ## Responsibilities
 
@@ -13,7 +13,7 @@ You are a planner agent. Your role is to decompose specs and ADRs into actionabl
 2. **Decompose work** — Break features into small, focused tickets
 3. **Define acceptance criteria** — Each ticket must have testable completion criteria
 4. **Sequence work** — Order tickets to minimize blocked dependencies
-5. **Scope
+5. **Scope for plan mode** — Each ticket should be completable in a single Claude Code plan mode session
 
 ## Constraints
 
@@ -21,6 +21,7 @@ You are a planner agent. Your role is to decompose specs and ADRs into actionabl
 - Each ticket should be **independently mergeable** where possible
 - Acceptance criteria must be **specific and testable**
 - Link tickets to relevant ADRs and specs
+- **Size tickets for plan mode** — a ticket should be achievable in one focused session with plan mode (typically S or M size). If a ticket would require multiple plan mode sessions, split it
 
 ## Process
 
package/src/templates/.claude/agents/reviewer.md
CHANGED
@@ -13,12 +13,13 @@ You are a reviewer agent. Your role is to review code changes against acceptance
 2. **Validate against ADRs** — Ensure changes follow established architectural decisions
 3. **Check conventions** — Verify code follows the relevant language conventions
 4. **Identify issues** — Flag problems but don't fix them directly
-5. **
+5. **Check test adequacy** — Verify that tests exist for each acceptance criterion and cover key edge cases
+6. **Provide actionable feedback** — Be specific about what needs to change
 
 ## Constraints
 
 - You **review and comment**, you do not write code
-- You flag issues
+- You flag issues to be fixed (via plan mode or the executor agent)
 - You reference specific lines and files
 - You cite the relevant ADR or convention when flagging violations
 
@@ -31,6 +32,8 @@ You are a reviewer agent. Your role is to review code changes against acceptance
 - Do the changes violate any ADRs?
 - Do the changes follow conventions?
 - Are there obvious bugs or edge cases?
+- Does each acceptance criterion have a corresponding test?
+- Are critical edge cases (empty input, error states) covered by tests?
 4. Produce a review with:
 - Checklist of acceptance criteria (pass/fail)
 - List of issues (if any) with specific locations
@@ -47,6 +50,12 @@ You are a reviewer agent. Your role is to review code changes against acceptance
 - [ ] Criterion 2 — **Not satisfied**: missing error handling for empty input
 - [x] Criterion 3 — Satisfied
 
+### Test Coverage
+
+- [x] Criterion 1 — Tested in `tests/feature.test.ts:12`
+- [ ] Criterion 2 — **No test found** for empty input handling
+- [x] Criterion 3 — Tested in `tests/feature.test.ts:28`
+
 ### Issues
 
 1. **Convention violation** (`src/file.ts:15`): Missing type annotation on `processData` return value. See `conventions/typescript.md`.
package/src/templates/.claude/scripts/update-status.sh
ADDED
@@ -0,0 +1,54 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+STATUS="STATUS.md"
+[ -f "$STATUS" ] || exit 0
+
+BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "unknown")
+TIMESTAMP=$(date -u +"%Y-%m-%d %H:%M UTC")
+
+# Update timestamp
+sed -i.bak "s|^> Last updated:.*|> Last updated: ${TIMESTAMP}|" "$STATUS"
+rm -f "${STATUS}.bak"
+
+# Build commits table (last 10 commits)
+TMPDIR="${TMPDIR:-/tmp}"
+COMMITS_TMP="${TMPDIR}/_ae_commits.tmp"
+FILES_TMP="${TMPDIR}/_ae_files.tmp"
+
+{
+echo "**Branch:** \`${BRANCH}\` "
+echo "**Last commit:** ${TIMESTAMP}"
+echo ""
+echo "| Hash | Date | Message |"
+echo "|------|------|---------|"
+git log --format="| \`%h\` | %ad | %s |" --date=short -10 2>/dev/null || echo "| - | - | No commits yet |"
+} > "$COMMITS_TMP"
+
+# Build recent file changes
+COMMIT_COUNT=$(git rev-list --count HEAD 2>/dev/null || echo "0")
+{
+echo "**Files changed (last 5 commits):**"
+echo ""
+echo '```'
+if [ "$COMMIT_COUNT" -ge 5 ] 2>/dev/null; then
+git diff --stat HEAD~5..HEAD 2>/dev/null | head -20
+elif [ "$COMMIT_COUNT" -gt 1 ] 2>/dev/null; then
+git diff --stat HEAD~$((COMMIT_COUNT - 1))..HEAD 2>/dev/null | head -20
+else
+echo "Not enough commits yet."
+fi
+echo '```'
+} > "$FILES_TMP"
+
+# Replace AUTO:START..AUTO:END and AUTO:FILES:START..AUTO:FILES:END
+awk '
+/<!-- AUTO:START -->/ { print; while((getline line < "'"$COMMITS_TMP"'") > 0) print line; skip=1; next }
+/<!-- AUTO:END -->/ { skip=0 }
+/<!-- AUTO:FILES:START -->/ { print; while((getline line < "'"$FILES_TMP"'") > 0) print line; skip=1; next }
+/<!-- AUTO:FILES:END -->/ { skip=0 }
+skip { next }
+{ print }
+' "$STATUS" > "${STATUS}.tmp" && mv "${STATUS}.tmp" "$STATUS"
+
+rm -f "$COMMITS_TMP" "$FILES_TMP"
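The `awk` step in the script above keeps each marker line and swaps whatever sits between a start and end marker for freshly generated content. The same idea can be sketched in JavaScript for readers who find the `awk` form hard to follow. This is an illustration only, not code from the package; `replaceBetweenMarkers` and the sample input are hypothetical.

```javascript
// Hypothetical helper: replace the lines between a start and an end marker.
// Both marker lines are kept; everything between them becomes `replacement`.
function replaceBetweenMarkers(text, startMarker, endMarker, replacement) {
  const out = [];
  let skipping = false;
  for (const line of text.split("\n")) {
    if (!skipping && line.includes(startMarker)) {
      out.push(line, ...replacement); // keep the start marker, then new content
      skipping = true;
      continue;
    }
    if (skipping && line.includes(endMarker)) {
      skipping = false; // stop skipping so the end marker itself is kept
    }
    if (!skipping) out.push(line);
  }
  return out.join("\n");
}

// Sample input modeled on the STATUS.md template's commit section.
const statusSample = [
  "## Branch & Commits",
  "",
  "<!-- AUTO:START -->",
  "_Commit or run `.claude/scripts/update-status.sh` to populate._",
  "<!-- AUTO:END -->",
  "",
  "## Recent File Changes",
].join("\n");

const updatedStatus = replaceBetweenMarkers(
  statusSample,
  "<!-- AUTO:START -->",
  "<!-- AUTO:END -->",
  ["**Branch:** `main`", "| Hash | Date | Message |"]
);
console.log(updatedStatus);
```

Because the markers survive each run, applying the replacement a second time with the same content leaves the text unchanged, which is what lets the hook run after every commit.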
package/src/templates/.claude/settings.json
CHANGED
@@ -1,4 +1,35 @@
 {
+"hooks": {
+"PostToolUse": [
+{
+"matcher": "Bash",
+"hooks": [
+{
+"type": "command",
+"if": "Bash(git commit *)",
+"command": "bash \"$CLAUDE_PROJECT_DIR\"/.claude/scripts/update-status.sh",
+"timeout": 30
+},
+{
+"type": "command",
+"if": "Bash(git push *)",
+"command": "bash \"$CLAUDE_PROJECT_DIR\"/.claude/scripts/update-status.sh",
+"timeout": 30
+}
+]
+},
+{
+"matcher": "Write|Edit",
+"hooks": [
+{
+"type": "command",
+"command": "lines=$(wc -l < CLAUDE.md 2>/dev/null) && [ \"${lines:-0}\" -gt 200 ] && echo \"CLAUDE.md is ${lines} lines (limit: 200). Move content to linked files.\" || true",
+"timeout": 10
+}
+]
+}
+]
+},
 "mcpServers": {
 "context7": {
 "command": "npx",
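The `Write|Edit` hook above counts the lines in `CLAUDE.md` and prints a warning only when the count is strictly greater than 200. A minimal JavaScript sketch of that check follows. It is an illustration, not part of the package: `claudeMdWarning` is a hypothetical name, and the function takes the file text as an argument rather than reading from disk.

```javascript
// Hypothetical helper mirroring the hook's check: count newline characters
// (as `wc -l` does) and return a warning only when the count exceeds the limit.
function claudeMdWarning(text, limit = 200) {
  const lines = (text.match(/\n/g) || []).length;
  return lines > limit
    ? `CLAUDE.md is ${lines} lines (limit: ${limit}). Move content to linked files.`
    : null;
}

const atLimit = "line\n".repeat(200);
const overLimit = "line\n".repeat(201);
console.log(claudeMdWarning(atLimit)); // null: exactly 200 lines is still allowed
console.log(claudeMdWarning(overLimit));
```

As in the shell version, a file at exactly the limit produces no output, so the hook stays silent until the file actually exceeds 200 lines.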
package/src/templates/CLAUDE.md
CHANGED
@@ -1,67 +1,72 @@
 # Project
 
-This project uses a
+This project uses a hybrid agentic workflow: specialized agents handle process (decisions, planning, review), and Claude Code's plan mode handles execution.
 
-## Workflow
+## Workflow — Hybrid Approach
 
-
+Agents own the **process** — architecture decisions, work decomposition, quality gates, and maintenance. Claude Code plan mode owns the **execution** — implementing individual tickets efficiently within a single session.
 
-|
-
-| **
-| **
-| **
-| **
-| **
-| **
-| **Custodian** | `custodian` | Keep CLAUDE.md lean (≤200 lines), current, and routed to external files |
-| **Summarizer** | `summarizer` | Generate executive summaries of completed sprints or features for stakeholders |
+| Phase | How | When |
+|-------|-----|------|
+| **Decide** | `/architect` agent | New feature, significant design choice, unclear requirements |
+| **Map** | `/system-architect` agent | New system or major structural change |
+| **Decompose** | `/planner` agent | ADR/spec ready, work needs to be broken into tickets |
+| **Execute** | Claude Code **plan mode** (`shift+tab`) | Implementing a specific ticket (includes writing tests) |
+| **Review** | `/reviewer` agent | Code and tests ready for validation |
+| **Report** | `/summarizer` agent | Sprint or feature complete, stakeholder update needed |
 
-
-
-When work can be parallelized, spin up sub-agents to handle independent tasks concurrently. Sub-agents research, test, or implement in isolation and report back to the main thread.
+### Why hybrid?
 
-
+- Agents enforce **separation of concerns** — the Architect can't write code, the Reviewer can't fix issues
+- Plan mode provides **speed and context continuity** — it explores, plans, and executes in one session
+- Artifacts (ADRs, tickets, reviews) **persist across sessions** — plan mode's output is code, agents' output is documentation
 
-
-- **Test in parallel** — e.g., one sub-agent runs unit tests while another checks integration tests
-- **Implement independent tickets** — tickets with no dependencies on each other can be executed simultaneously
-- **Verify in parallel** — e.g., one sub-agent checks browser behavior while another reviews console output
+### Choosing the right tool
 
-
+**Use an agent** when the task produces a persistent artifact (ADR, ticket, review, summary) or when role separation matters (the person deciding shouldn't be the person implementing).
 
-
+**Use plan mode** when you have a well-scoped ticket with clear acceptance criteria and want to go from plan to working code in one session.
 
-
-|------------|-------|----------|
-| **Low** | Haiku | File lookups, grep searches, reading docs, running tests, formatting, simple code generation |
-| **Medium** | Sonnet | Multi-file changes, moderate reasoning, code review, writing tests |
-| **High** | Opus | Architecture decisions, complex refactors, subtle bug investigation, cross-cutting changes |
+**Quick fixes and bug fixes** don't need the full pipeline — use plan mode directly, or just implement without ceremony. The workflow exists to help, not to slow down trivial changes.
 
-
+## Before Starting Any Feature
 
-
+1. Check if an ADR exists in `architecture/decisions/` — if not, run `/architect` first
+2. Check if tickets exist in `tickets/` — if not, run `/planner` first
+3. For each ticket: use plan mode (`shift+tab`) to implement it
+4. After implementation: run `/reviewer` to validate against acceptance criteria
+5. If the ticket touches an existing ADR's scope, verify the decision still holds
 
-
-- `description` — Short label for what the sub-agent does
-- `prompt` — Self-contained brief (the sub-agent has no context from the main thread)
-- `model` — Set to `"haiku"` for simple tasks, `"sonnet"` for moderate, omit for complex (inherits parent model)
-- `run_in_background` — Set to `true` when you don't need the result before continuing other work
+## Testing in Plan Mode
 
-
+Plan mode writes tests as part of implementing each ticket. For every acceptance criterion:
+1. Write at least one automated test that verifies it
+2. Cover edge cases (empty, null, boundary values) and error handling
+3. Run the tests and confirm they pass before marking the ticket done
 
-
-- **Parallelize independent work** — Launch multiple sub-agents in a single message when their tasks don't depend on each other.
-- **Don't delegate synthesis** — Sub-agents gather information; the main thread makes decisions. Never write "based on your findings, decide X."
-- **Verify sub-agent output** — Sub-agents report what they intended, not necessarily what they achieved. Check their actual changes before reporting to the user.
+Follow the project's existing test framework and patterns. Test observable behavior, not implementation details.
 
-## Before Starting Any Ticket
+## Before Starting Any Ticket (in plan mode)
 
 1. Read the ticket fully, including all linked documents
 2. Read any referenced ADRs in `architecture/decisions/`
-3. Check
-4.
-5.
+3. Check relevant conventions in `conventions/`
+4. Let plan mode explore and propose the implementation plan
+5. Verify the work end-to-end before marking done
+
+## Sub-Agent Deployment
+
+When work can be parallelized, spin up sub-agents for independent tasks concurrently.
+
+### Model selection
+
+| Complexity | Model | Use when |
+|------------|-------|----------|
+| **Low** | Haiku | File lookups, grep, reading docs, running tests, formatting |
+| **Medium** | Sonnet | Multi-file changes, code review, writing tests |
+| **High** | Opus | Architecture decisions, complex refactors, subtle bugs |
+
+**Default to Haiku** unless the task requires multi-step reasoning or cross-file understanding.
 
 ## Key Files and Directories
 
@@ -71,11 +76,33 @@ Use the Agent tool with these parameters:
 - `specs/` — Feature specifications
 - `tickets/` — Work items organized by feature folder, with `_backlog.md` as the sprint board
 - `conventions/` — Language and framework coding standards
-- `.claude/agents/` — Subagent definitions for each role
+- `.claude/agents/` — Subagent definitions for each role
+- `STATUS.md` — Live project dashboard (auto-updated git data + manually maintained context)
+
+## CLAUDE.md Maintenance
+
+Keep this file lean and current (target: under 200 lines). A hook warns when it exceeds the limit.
+- When you discover a new gotcha or pattern, add it here or to the appropriate linked file
+- Route large or specialized content to separate files (e.g., `conventions/`, `docs/`) and link from here
+- Remove stale entries that no longer reflect how the project works
+- Never duplicate information that already lives in a linked file
+
+## STATUS.md Maintenance
+
+STATUS.md is a live project dashboard. Git sections (branch, commits, file changes) auto-update via a hook on every commit/push. You maintain the semantic sections:
+
+**Update after significant milestones** (completing a ticket, finishing a phase, hitting a blocker):
+1. **Current Phase** — Mark the active workflow phase(s) from the table
+2. **Active Work** — One paragraph: what feature/ticket is in progress, next step
+3. **Open Tickets** — Snapshot from `tickets/_backlog.md`
+4. **Risks & Blockers** — Add blockers; remove resolved ones
+5. **Session Log** — One-line entry with today's date and what was accomplished
+
+Keep updates brief. STATUS.md is a dashboard, not a report — use `/summarizer` for detailed retrospectives.
 
 ## MCP Servers
 
-- **Context7** — Pulls up-to-date, version-specific documentation from live code libraries
+- **Context7** — Pulls up-to-date, version-specific documentation from live code libraries. Use `resolve` then `get-library-docs` before writing code that depends on a third-party library.
 
 ## Conventions
 
package/src/templates/STATUS.md
ADDED
@@ -0,0 +1,48 @@
+# Project Status
+
+> Last updated: _not yet updated_
+
+## Current Phase
+
+| Phase | Status |
+|-------|--------|
+| Decide | |
+| Map | |
+| Decompose | |
+| Execute | |
+| Test | |
+| Review | |
+| Maintain | |
+| Report | |
+
+## Active Work
+
+_What is currently being worked on, the relevant tickets, and the expected next step._
+
+## Branch & Commits
+
+<!-- AUTO:START -->
+_Commit or run `.claude/scripts/update-status.sh` to populate._
+<!-- AUTO:END -->
+
+## Recent File Changes
+
+<!-- AUTO:FILES:START -->
+_Commit or run `.claude/scripts/update-status.sh` to populate._
+<!-- AUTO:FILES:END -->
+
+## Open Tickets
+
+| Ticket | Feature | Status |
+|--------|---------|--------|
+| | | |
+
+## Risks & Blockers
+
+- None currently
+
+## Session Log
+
+| Date | Summary |
+|------|---------|
+| | |
package/src/templates/architecture/decisions/0001-how-we-work.md
CHANGED
@@ -2,6 +2,7 @@
 
 **Status:** Accepted
 **Date:** 2026-05-04
+**Updated:** 2026-05-08
 **Author:** swarpi
 
 ## Context
@@ -12,19 +13,36 @@ Working with AI coding assistants can be highly productive, but without structur
 - Lost context between sessions
 - No clear separation between planning and execution
 
-We need a workflow that maximizes the benefits of AI assistance while maintaining engineering rigor.
+We need a workflow that maximizes the benefits of AI assistance while maintaining engineering rigor. However, a fully agent-driven pipeline adds friction for execution — Claude Code's built-in plan mode is faster and maintains better context continuity when implementing well-scoped work.
 
 ## Decision
 
-We adopt a
+We adopt a **hybrid approach**: specialized agents own the process, Claude Code plan mode owns execution.
+
+### Agents (process)
 
 1. **Architect** — Focuses on design decisions. Produces ADRs. Asks clarifying questions before deciding. Has read access to the codebase but does not write code.
 
-2. **Planner** — Takes specs and ADRs as input. Decomposes work into tickets
+2. **Planner** — Takes specs and ADRs as input. Decomposes work into tickets scoped for plan mode sessions. Does not implement.
+
+3. **Reviewer** — Reviews diffs against acceptance criteria and linked ADRs. Checks for convention violations and test adequacy. Does not fix issues directly — flags them for fixing.
+
+4. **Summarizer** — Produces non-technical executive summaries of completed work.
+
+### Plan mode (execution)
+
+Each ticket from the Planner is executed using Claude Code's plan mode (`shift+tab`). Plan mode:
+- Explores the codebase and proposes a plan before implementing
+- Executes within a single session with full context continuity
+- Adapts its depth to the task — simple ticket, simple plan
+
+The **Executor agent** remains as a fallback for cases where plan mode is unavailable or strict verification discipline is needed.
 
-
+### When to skip the pipeline
 
-
+- Bug fixes, typos, and small changes → plan mode directly
+- Well-understood changes with no architectural implications → plan mode directly
+- New features or significant decisions → full pipeline starting with Architect
 
 All significant decisions are recorded as ADRs. All work items are tickets with acceptance criteria. Conventions are documented per-language.
 
@@ -35,14 +53,15 @@ All significant decisions are recorded as ADRs. All work items are tickets with
 - Clear separation of concerns reduces cognitive overload per session
 - ADRs create a searchable decision history
 - Tickets with acceptance criteria make "done" unambiguous
+- Plan mode provides fast, context-aware execution without agent switching overhead
 - Conventions prevent style drift across sessions
--
+- The workflow adapts to task size — no ceremony for small changes
 
 ### Negative
 
-- More upfront documentation work
--
--
+- More upfront documentation work for new features
+- Requires discipline to follow the process for larger changes
+- Plan mode doesn't produce persistent artifacts the way the Executor agent does
 
 ### Neutral
 
@@ -54,10 +73,14 @@ All significant decisions are recorded as ADRs. All work items are tickets with
 
 Just talk to the AI and let it write code directly. Rejected because it leads to inconsistent decisions and lost context.
 
-###
+### Fully agent-driven pipeline (previous approach)
 
-
+All eight agents in sequence, including Executor for implementation. Rejected because plan mode provides better context continuity and speed for execution, and the Executor agent was frequently skipped by plan mode anyway.
+
+### Plan mode only (no agents)
 
-
+Use plan mode for everything. Rejected because plan mode doesn't enforce role separation (the same session that decides also implements), produces no persistent artifacts (ADRs, tickets, reviews), and doesn't scale to multi-session features.
 
-
+### Heavy-process frameworks (e.g., full Agile ceremony)
+
+Too heavyweight for a solo developer. We take the useful parts (clear acceptance criteria, documented decisions) without the ceremony.
package/src/templates/architecture/overview.md
CHANGED
@@ -8,13 +8,25 @@ This document provides a high-level view of the system architecture.
 2. **Written artifacts** — Decisions, plans, and reviews are documented, not just discussed
 3. **Verify before execute** — Plans are reviewed before implementation begins
 4. **Single source of truth** — Each piece of information lives in one canonical place
+5. **Right tool for the job** — Agents own process; plan mode owns execution
 
-## The Flow
+## The Flow (Hybrid)
 
 ```
-Requirement → Architect → ADR/Spec → Planner → Tickets →
+Requirement → Architect → ADR/Spec → Planner → Tickets → Plan Mode → Code → Reviewer → Merged
+(agent) (agent) (built-in) (agent)
 ```
 
+**Process phases** (agents): Architect, System Architect, Planner, Reviewer, Summarizer
+**Execution phase** (plan mode): Claude Code's built-in plan mode implements each ticket in a focused session
+
+## When to Skip the Pipeline
+
+Not every change needs the full workflow:
+- **Bug fix or typo** → Plan mode directly, or just implement
+- **Small well-understood change** → Plan mode directly
+- **New feature or significant decision** → Start with Architect, then full pipeline
+
 ## Artifact Types
 
 | Artifact | Purpose | Location |
package/src/templates/orchestration.yaml
CHANGED
@@ -1,5 +1,7 @@
-name: Agentic Workflow
-description:
+name: Hybrid Agentic Workflow
+description: >
+Agents own the process (decisions, planning, review).
+Claude Code plan mode owns execution (implementing tickets).
 
 agents:
 - id: architect
@@ -29,8 +31,8 @@ agents:
 - id: planner
 kind: planning
 title: Planner
-tagline: Decomposes specs into
-description: Takes ADRs and specs as input. Decomposes
+tagline: Decomposes specs into plan-mode-sized tickets
+description: Takes ADRs and specs as input. Decomposes work into discrete tickets scoped for a single Claude Code plan mode session.
 outputs:
 - Tickets
 - Milestones
@@ -38,35 +40,39 @@ agents:
 color: indigo
 docLink: /.claude/agents/planner.md
 
-- id:
+- id: plan-mode
 kind: execution
-title:
-tagline:
-description:
+title: Plan Mode
+tagline: Claude Code's built-in plan-then-execute cycle
+description: >
+Not a custom agent — this is Claude Code's native plan mode (shift+tab).
+It explores the codebase, proposes a plan, gets approval, then implements.
+Each ticket from the Planner is executed as a separate plan mode session.
 outputs:
 - Code
 - PRs
-- Plan docs
 color: indigo
-
+builtin: true
 
-- id:
-kind:
-title:
-tagline:
-description:
+- id: executor
+kind: execution
+title: Executor (fallback)
+tagline: Guided execution with strict verification
+description: >
+Fallback for when plan mode is unavailable or when enforced verification
+discipline is needed. Most tickets should use plan mode instead.
 outputs:
--
--
--
-color:
-docLink: /.claude/agents/
+- Code
+- PRs
+- Plan docs
+color: gray
+docLink: /.claude/agents/executor.md
 
 - id: reviewer
 kind: validation
 title: Reviewer
 tagline: Validates against acceptance criteria
-description: Validates code and tests against the original acceptance criteria. Flags issues back
+description: Validates code and tests against the original acceptance criteria. Flags issues back for fixing and provides final approval.
 outputs:
 - Feedback
 - Approval
@@ -74,22 +80,11 @@ agents:
 color: amber
 docLink: /.claude/agents/reviewer.md
 
-- id: custodian
-kind: maintenance
-title: Custodian
-tagline: Keeps CLAUDE.md lean, current, and well-routed
-description: Maintains the project's CLAUDE.md file — adds new patterns and gotchas, removes stale entries, and routes large content to external files. Enforces a 150–200 line limit to prevent context bloat.
-outputs:
-- Updated CLAUDE.md
-- Extracted reference files
-color: blue
-docLink: /.claude/agents/custodian.md
-
 - id: summarizer
 kind: reporting
 title: Summarizer
 tagline: Communicates completed work to stakeholders
-description: Reads tickets, ADRs, specs, git history, and test results to produce concise, non-technical executive summaries
+description: Reads tickets, ADRs, specs, git history, and test results to produce concise, non-technical executive summaries.
 outputs:
 - Executive summaries
 - Sprint recaps
@@ -98,6 +93,7 @@ agents:
  docLink: /.claude/agents/summarizer.md

  connections:
+ # Process phase: agents produce artifacts
  - from: architect
  to: system-architect
  artifact: ADRs · Specs
@@ -106,28 +102,42 @@ connections:
  to: planner
  artifact: architecture.yaml

+ # Execution phase: plan mode implements tickets
+ - from: planner
+ to: plan-mode
+ artifact: Tickets
+ note: Each ticket becomes a separate plan mode session
+
+ # Fallback execution path
  - from: planner
  to: executor
  artifact: Tickets
+ type: fallback
+ note: Use when plan mode is unavailable

-
-
-
+ # Validation phase
+ - from: plan-mode
+ to: reviewer
+ artifact: Code · Tests

- - from:
+ - from: executor
  to: reviewer
  artifact: Code · Tests
+ type: fallback

+ # Feedback loops
  - from: reviewer
- to:
+ to: plan-mode
  artifact: Feedback
  type: feedback
+ note: Fix issues in a new plan mode session

  - from: reviewer
- to:
- artifact:
- type:
+ to: executor
+ artifact: Feedback
+ type: feedback

+ # Reporting trigger
  - from: reviewer
  to: summarizer
  artifact: Completed work context
@@ -137,6 +147,7 @@ parallelization:
  description: >
  The main session can deploy sub-agents for independent, parallelizable work.
  Sub-agents run in isolation, report back, and the main thread synthesizes results.
+ Plan mode sessions run sequentially (one ticket at a time) unless tickets are independent.
  model-routing:
  low-complexity: haiku
  medium-complexity: sonnet
@@ -145,7 +156,7 @@ parallelization:
  examples:
  - parallel research across multiple files
  - running independent test suites simultaneously
- -
+ - independent tickets can be executed in separate plan mode sessions
  - concurrent verification (browser + console + tests)

  layout: diamond
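The new-side connections in the `orchestration.yaml` hunks can be sanity-checked with a short script. What follows is a minimal Python sketch and not part of the package. The connection list is transcribed by hand from the added and context lines visible in this diff; connections or fields that fall outside the hunks are omitted. The names `CONNECTIONS`, `AGENT_IDS`, `dangling_endpoints`, and `edges_of_type` are hypothetical, and the id set assumes that `architect`, `system-architect`, `planner`, and `plan-mode` are declared elsewhere in the file.

```python
# Hypothetical sketch: connections transcribed from the new side of the diff.
# Fields that fall outside the visible hunks are not included.
CONNECTIONS = [
    {"from": "architect", "to": "system-architect", "artifact": "ADRs · Specs"},
    {"from": "planner", "to": "plan-mode", "artifact": "Tickets"},
    {"from": "planner", "to": "executor", "artifact": "Tickets", "type": "fallback"},
    {"from": "plan-mode", "to": "reviewer", "artifact": "Code · Tests"},
    {"from": "executor", "to": "reviewer", "artifact": "Code · Tests", "type": "fallback"},
    {"from": "reviewer", "to": "plan-mode", "artifact": "Feedback", "type": "feedback"},
    {"from": "reviewer", "to": "executor", "artifact": "Feedback", "type": "feedback"},
    {"from": "reviewer", "to": "summarizer", "artifact": "Completed work context"},
]

# Assumption: the ids declared under `agents:`. Only executor, reviewer and
# summarizer are visible in this part of the diff; the rest are inferred from
# the connection endpoints.
AGENT_IDS = {
    "architect", "system-architect", "planner", "plan-mode",
    "executor", "reviewer", "summarizer",
}


def dangling_endpoints(connections, agent_ids):
    """Return endpoint names used by connections but missing from agent_ids."""
    used = {c["from"] for c in connections} | {c["to"] for c in connections}
    return sorted(used - set(agent_ids))


def edges_of_type(connections, edge_type):
    """Return (from, to) pairs whose optional `type` field equals edge_type."""
    return [(c["from"], c["to"]) for c in connections if c.get("type") == edge_type]


if __name__ == "__main__":
    print("dangling:", dangling_endpoints(CONNECTIONS, AGENT_IDS))
    print("fallback:", edges_of_type(CONNECTIONS, "fallback"))
    print("feedback:", edges_of_type(CONNECTIONS, "feedback"))
```

A check of this kind would flag any connection that still pointed at the removed `custodian` or `qa-tester` agents, since neither id remains in the assumed set.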
@@ -1,77 +0,0 @@
- ---
- name: custodian
- description: Use periodically (after a batch of tickets, or when CLAUDE.md grows past 200 lines) to keep CLAUDE.md lean, current, and routed to external files. Modifies only CLAUDE.md and the files it links to. Also keeps orchestration.yaml in sync when agents are added or removed.
- tools: Read, Write, Edit, Grep, Glob
- model: haiku
- ---
-
- You are a custodian agent. Your role is to maintain the project's `CLAUDE.md` file and `orchestration.yaml` — keeping them accurate, lean, and in sync.
-
- ## Responsibilities
-
- 1. **Keep CLAUDE.md current** — Update it with new patterns, gotchas, and conventions discovered during development
- 2. **Keep CLAUDE.md lean** — The file must stay between 150–200 lines max to prevent context bloat
- 3. **Route to external files** — Large or specialized content belongs in separate files that CLAUDE.md links to, so the main context only loads them when needed
- 4. **Remove stale content** — Delete entries that no longer reflect how the project works
- 5. **Sync orchestration.yaml** — When agents are added, removed, or renamed in `.claude/agents/`, update the agents list and connections in `orchestration.yaml` to match
-
- ## Constraints
-
- - You only modify `CLAUDE.md`, `orchestration.yaml`, and the files `CLAUDE.md` routes to — you do not write application code
- - You never exceed 200 lines in `CLAUDE.md`
- - You preserve the existing structure and section ordering unless restructuring is necessary to stay within the line budget
- - You do not duplicate information that already lives in linked files
-
- ## Process
-
- 1. Read the current `CLAUDE.md` and count its lines
- 2. Review recent work context — what patterns, gotchas, or conventions were discovered?
- 3. Decide what to update:
- - **Add** new patterns or gotchas that would help future sessions
- - **Remove** stale or outdated entries
- - **Route out** any section that has grown too large — extract it to a dedicated file and replace it with a one-line link
- 4. After editing, verify the line count is within 150–200 lines
- 5. If over 200 lines, identify what to extract or trim
- 6. Compare `.claude/agents/*.md` files against `orchestration.yaml` — add missing agents, remove stale entries, and verify connections still make sense
-
- ## What belongs in CLAUDE.md
-
- - Project identity (one sentence: what this project is)
- - Workflow overview (role table, key file paths)
- - Active conventions and patterns worth knowing upfront
- - Links to deeper references (not the references themselves)
- - Current gotchas that would surprise a new session
-
- ## What belongs in linked files instead
-
- - Detailed coding conventions → `conventions/<language>.md`
- - Business context or domain knowledge → `docs/context/<topic>.md`
- - Style guides → `conventions/style.md` or similar
- - API contracts or integration details → `docs/<integration>.md`
- - Large workflow / role instructions → `.claude/agents/<role>.md`
-
- ## Routing format
-
- When linking out to a separate file, use this pattern in CLAUDE.md:
-
- ```markdown
- - **Topic name** — One-line summary. See `path/to/detail.md`
- ```
-
- The linked file should be self-contained so it makes sense when read independently.
-
- ## When to run
-
- This agent should be invoked:
- - After a batch of tickets is completed (to capture new patterns)
- - When CLAUDE.md is approaching or exceeding 200 lines
- - When a new convention or gotcha is discovered during development
- - Periodically as a hygiene pass
-
- ## Anti-patterns to Avoid
-
- - Letting CLAUDE.md grow past 200 lines
- - Inlining large blocks of content that belong in separate files
- - Removing links to files without checking if the linked file still exists
- - Adding entries that duplicate what's already in a linked file
- - Writing vague entries ("be careful with X") instead of specific ones ("X requires Y because Z")
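The line-budget check in the removed custodian's Process (steps 1, 4 and 5) amounts to counting lines and comparing against the 200-line ceiling. A minimal Python sketch of that check follows, assuming the file content is already loaded as a string; `line_count` and `budget_report` are hypothetical helper names and do not appear in the package.

```python
def line_count(text: str) -> int:
    """Count lines the way an editor would, ignoring a trailing newline."""
    return len(text.splitlines())


def budget_report(text: str, limit: int = 200) -> str:
    """Summarize whether the text stays within the line limit.

    The default of 200 mirrors the ceiling stated in the removed agent file.
    """
    n = line_count(text)
    if n > limit:
        return f"{n} lines: over the {limit}-line limit by {n - limit}; extract or trim"
    return f"{n} lines: within the {limit}-line limit"


if __name__ == "__main__":
    sample = "\n".join(f"line {i}" for i in range(1, 206))
    print(budget_report(sample))
```

With the custodian removed in this release, a check like this would have to run elsewhere if the limit still matters to a project.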
@@ -1,82 +0,0 @@
- ---
- name: qa-tester
- description: Use after a feature is implemented to write automated tests covering the acceptance criteria, edge cases, and regressions. Does not modify feature code.
- tools: Read, Write, Edit, Grep, Glob, Bash
- model: sonnet
- ---
-
- You are a QA tester agent. Your role is to write automated tests for completed features, ensuring they meet acceptance criteria and catch regressions.
-
- ## Responsibilities
-
- 1. **Read the ticket** — Understand what was implemented and its acceptance criteria
- 2. **Read the code** — Study the implementation to understand behavior, edge cases, and boundaries
- 3. **Write automated tests** — Produce tests that verify the feature works as specified
- 4. **Cover edge cases** — Test boundaries, error states, and invalid inputs
- 5. **Ensure regressions are caught** — Tests should break if the feature's behavior changes unexpectedly
-
- ## Constraints
-
- - You **write tests only**, you do not modify feature code
- - You follow the project's existing test conventions and frameworks
- - You do not introduce new test dependencies without explicit approval
- - You test observable behavior, not implementation details
- - You write the minimum number of tests needed for confidence, not the maximum
-
- ## Process
-
- 1. Read the ticket and its acceptance criteria
- 2. Read the implementation code (the Executor's output)
- 3. Identify the project's test framework, patterns, and file locations
- 4. For each acceptance criterion, write at least one test that verifies it
- 5. Add tests for:
- - Happy path (expected inputs → expected outputs)
- - Edge cases (empty, null, boundary values)
- - Error handling (invalid inputs, failure modes)
- - Integration points (if the feature touches other modules)
- 6. Run the tests and confirm they pass
- 7. Produce a test summary
-
- ## Output Format
-
- ```markdown
- ## Test Plan: Ticket Title
-
- ### Test File(s)
-
- - `tests/feature.test.ts` — Unit tests for core logic
- - `tests/feature.integration.test.ts` — Integration tests (if applicable)
-
- ### Coverage
-
- | Acceptance Criterion | Test(s) | Status |
- |---|---|---|
- | Criterion 1 | `should handle valid input` | Pass |
- | Criterion 2 | `should reject empty input`, `should reject null` | Pass |
- | Criterion 3 | `should return paginated results` | Pass |
-
- ### Edge Cases
-
- - Empty input → returns empty result (not an error)
- - Concurrent calls → no race conditions
- - Large input (10k items) → completes within timeout
-
- ### Summary
-
- X tests written, all passing. Covers N/N acceptance criteria.
- ```
-
- ## Test Quality Guidelines
-
- - **Descriptive names** — Test names should read as specifications: `should return 404 when user not found`
- - **Arrange-Act-Assert** — Each test has a clear setup, action, and verification
- - **One assertion per concept** — A test should verify one behavior, though multiple assertions are fine if they verify the same thing
- - **No test interdependence** — Tests must not depend on execution order or shared mutable state
- - **Fast by default** — Unit tests should be fast; mark slow integration tests explicitly
-
- ## What NOT to Test
-
- - Third-party library internals
- - Private implementation details that may change without affecting behavior
- - Exact error message strings (test error types instead)
- - Configurations that are already validated by the framework
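The test quality guidelines in the removed `qa-tester.md` (descriptive names, Arrange-Act-Assert, asserting on error types and not on message strings) might look like the following in practice. This is an illustrative Python sketch only: `find_user` and `UserNotFound` are hypothetical stand-ins for a feature under test, and plain `assert` statements are used so the block runs without a test framework.

```python
class UserNotFound(Exception):
    """Hypothetical error type raised when a user id is unknown."""


def find_user(users: dict, user_id: int) -> str:
    """Hypothetical feature under test: look up a user name by id."""
    if user_id not in users:
        raise UserNotFound(user_id)
    return users[user_id]


def test_should_return_name_when_user_exists():
    users = {1: "ada"}                # Arrange
    result = find_user(users, 1)      # Act
    assert result == "ada"            # Assert


def test_should_raise_user_not_found_when_id_is_missing():
    users = {}                        # Arrange (edge case: empty input)
    try:
        find_user(users, 42)          # Act
    except UserNotFound:
        raised = True                 # Assert on the error type, not its message
    else:
        raised = False
    assert raised


if __name__ == "__main__":
    # Each test builds its own data, so execution order does not matter.
    test_should_return_name_when_user_exists()
    test_should_raise_user_not_found_when_id_is_missing()
    print("2 tests passed")
```

Each test builds its own fixture, which keeps the two independent of execution order and shared state, as the guidelines require.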