agent-eng 0.10.0 → 0.12.0

package/README.md CHANGED
@@ -21,9 +21,8 @@ This creates the following structure in your project:
21
21
  │ ├── system-architect.md
22
22
  │ ├── planner.md
23
23
  │ ├── executor.md
24
- │ ├── qa-tester.md
25
24
  │ ├── reviewer.md
26
- │ └── custodian.md
25
+ │ └── summarizer.md
27
26
  ├── architecture/
28
27
  │ ├── overview.md # High-level architecture overview
29
28
  │ └── decisions/
@@ -97,7 +96,7 @@ Defines the runtime system components, their tiers (client/service/engine/data),
97
96
  4. **Map the system** — Use the `system-architect` subagent to create your `architecture.yaml`
98
97
  5. **Plan the work** — Use the `planner` subagent to decompose your ADR into tickets
99
98
  6. **Execute** — Use the `executor` subagent to implement a ticket following your conventions
100
- 7. **Test & Review** — Use the `qa-tester` and `reviewer` subagents to validate the work
99
+ 7. **Review** — Use the `reviewer` subagent to validate the work (tests are written during execution)
101
100
 
102
101
  ## License
103
102
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agent-eng",
3
- "version": "0.10.0",
3
+ "version": "0.12.0",
4
4
  "description": "Scaffold a structured agentic engineering workflow for AI-assisted development",
5
5
  "type": "module",
6
6
  "bin": {
package/src/init.js CHANGED
@@ -9,11 +9,10 @@ const TEMPLATES = join(__dirname, "templates");
9
9
  const STRUCTURE = [
10
10
  ".github/workflows/notify-site.yml",
11
11
  ".claude/settings.json",
12
+ ".claude/scripts/update-status.sh",
12
13
  ".claude/agents/architect.md",
13
- ".claude/agents/custodian.md",
14
14
  ".claude/agents/executor.md",
15
15
  ".claude/agents/planner.md",
16
- ".claude/agents/qa-tester.md",
17
16
  ".claude/agents/reviewer.md",
18
17
  ".claude/agents/summarizer.md",
19
18
  ".claude/agents/system-architect.md",
@@ -26,6 +25,7 @@ const STRUCTURE = [
26
25
  "tickets/example/001-example-ticket.md",
27
26
  "orchestration.yaml",
28
27
  "architecture.yaml",
28
+ "STATUS.md",
29
29
  ];
30
30
 
31
31
  function prompt(question) {
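The `STRUCTURE` change in this hunk amounts to a set difference between the two versions' scaffold lists. A minimal sketch, assuming only the entries visible in the hunks above (the full arrays are elided in the diff) and a hypothetical helper name `diffStructure` that is not part of agent-eng:

```javascript
// Hypothetical helper (not part of agent-eng): reports which scaffold paths
// were added or removed between two STRUCTURE lists.
function diffStructure(oldPaths, newPaths) {
  const oldSet = new Set(oldPaths);
  const newSet = new Set(newPaths);
  return {
    added: newPaths.filter((p) => !oldSet.has(p)),
    removed: oldPaths.filter((p) => !newSet.has(p)),
  };
}

// Only entries visible in the diff hunks; the real arrays contain more paths.
const structure010 = [
  ".claude/settings.json",
  ".claude/agents/architect.md",
  ".claude/agents/custodian.md",
  ".claude/agents/executor.md",
  ".claude/agents/planner.md",
  ".claude/agents/qa-tester.md",
  ".claude/agents/reviewer.md",
  ".claude/agents/summarizer.md",
  "orchestration.yaml",
  "architecture.yaml",
];
const structure012 = [
  ".claude/settings.json",
  ".claude/scripts/update-status.sh",
  ".claude/agents/architect.md",
  ".claude/agents/executor.md",
  ".claude/agents/planner.md",
  ".claude/agents/reviewer.md",
  ".claude/agents/summarizer.md",
  "orchestration.yaml",
  "architecture.yaml",
  "STATUS.md",
];

const structureChange = diffStructure(structure010, structure012);
console.log(structureChange);
```

Under these assumptions, the result lists the status script and `STATUS.md` as added, and the `custodian` and `qa-tester` agent files as removed.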
@@ -1,12 +1,14 @@
1
1
  ---
2
2
  name: executor
3
- description: Use when the user wants to implement a specific ticket. Reads the ticket, proposes a plan first, then implements following project conventions and ADRs. Verifies work end-to-end before marking done.
3
+ description: Use for complex tickets that need guided execution with strict verification. For most tickets, prefer Claude Code plan mode (shift+tab) instead: it explores, plans, and implements in one session. Use this agent when you need enforced discipline (plan-before-code, mandatory verification) or when plan mode is unavailable.
4
4
  tools: Read, Write, Edit, Grep, Glob, Bash, WebFetch
5
5
  model: sonnet
6
6
  ---
7
7
 
8
8
  You are an executor agent. Your role is to implement tickets by writing code that follows the project's conventions and architecture decisions.
9
9
 
10
+ > **Note:** For most tickets, Claude Code's built-in plan mode (`shift+tab`) is the recommended execution path — it provides faster context continuity and integrated exploration. This agent exists for cases where you need enforced discipline or are running execution as part of the full agent pipeline.
11
+
10
12
  ## Responsibilities
11
13
 
12
14
  1. **Read the ticket fully** — Understand the acceptance criteria, linked ADRs, and specs before writing any code
@@ -5,7 +5,7 @@ tools: Read, Grep, Glob, Write, Edit
5
5
  model: sonnet
6
6
  ---
7
7
 
8
- You are a planner agent. Your role is to decompose specs and ADRs into actionable tickets.
8
+ You are a planner agent. Your role is to decompose specs and ADRs into actionable tickets scoped for execution via Claude Code plan mode.
9
9
 
10
10
  ## Responsibilities
11
11
 
@@ -13,7 +13,7 @@ You are a planner agent. Your role is to decompose specs and ADRs into actionabl
13
13
  2. **Decompose work** — Break features into small, focused tickets
14
14
  3. **Define acceptance criteria** — Each ticket must have testable completion criteria
15
15
  4. **Sequence work** — Order tickets to minimize blocked dependencies
16
- 5. **Scope appropriately** — Each ticket should be completable in one focused session
16
+ 5. **Scope for plan mode** — Each ticket should be completable in a single Claude Code plan mode session
17
17
 
18
18
  ## Constraints
19
19
 
@@ -21,6 +21,7 @@ You are a planner agent. Your role is to decompose specs and ADRs into actionabl
21
21
  - Each ticket should be **independently mergeable** where possible
22
22
  - Acceptance criteria must be **specific and testable**
23
23
  - Link tickets to relevant ADRs and specs
24
+ - **Size tickets for plan mode** — a ticket should be achievable in one focused session with plan mode (typically S or M size). If a ticket would require multiple plan mode sessions, split it
24
25
 
25
26
  ## Process
26
27
 
@@ -13,12 +13,13 @@ You are a reviewer agent. Your role is to review code changes against acceptance
13
13
  2. **Validate against ADRs** — Ensure changes follow established architectural decisions
14
14
  3. **Check conventions** — Verify code follows the relevant language conventions
15
15
  4. **Identify issues** — Flag problems but don't fix them directly
16
- 5. **Provide actionable feedback** — Be specific about what needs to change
16
+ 5. **Check test adequacy** — Verify that tests exist for each acceptance criterion and cover key edge cases
17
+ 6. **Provide actionable feedback** — Be specific about what needs to change
17
18
 
18
19
  ## Constraints
19
20
 
20
21
  - You **review and comment**, you do not write code
21
- - You flag issues for the executor to fix
22
+ - You flag issues to be fixed (via plan mode or the executor agent)
22
23
  - You reference specific lines and files
23
24
  - You cite the relevant ADR or convention when flagging violations
24
25
 
@@ -31,6 +32,8 @@ You are a reviewer agent. Your role is to review code changes against acceptance
31
32
  - Do the changes violate any ADRs?
32
33
  - Do the changes follow conventions?
33
34
  - Are there obvious bugs or edge cases?
35
+ - Does each acceptance criterion have a corresponding test?
36
+ - Are critical edge cases (empty input, error states) covered by tests?
34
37
  4. Produce a review with:
35
38
  - Checklist of acceptance criteria (pass/fail)
36
39
  - List of issues (if any) with specific locations
@@ -47,6 +50,12 @@ You are a reviewer agent. Your role is to review code changes against acceptance
47
50
  - [ ] Criterion 2 — **Not satisfied**: missing error handling for empty input
48
51
  - [x] Criterion 3 — Satisfied
49
52
 
53
+ ### Test Coverage
54
+
55
+ - [x] Criterion 1 — Tested in `tests/feature.test.ts:12`
56
+ - [ ] Criterion 2 — **No test found** for empty input handling
57
+ - [x] Criterion 3 — Tested in `tests/feature.test.ts:28`
58
+
50
59
  ### Issues
51
60
 
52
61
  1. **Convention violation** (`src/file.ts:15`): Missing type annotation on `processData` return value. See `conventions/typescript.md`.
@@ -0,0 +1,54 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+
4
+ STATUS="STATUS.md"
5
+ [ -f "$STATUS" ] || exit 0
6
+
7
+ BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "unknown")
8
+ TIMESTAMP=$(date -u +"%Y-%m-%d %H:%M UTC")
9
+
10
+ # Update timestamp
11
+ sed -i.bak "s|^> Last updated:.*|> Last updated: ${TIMESTAMP}|" "$STATUS"
12
+ rm -f "${STATUS}.bak"
13
+
14
+ # Build commits table (last 10 commits)
15
+ TMPDIR="${TMPDIR:-/tmp}"
16
+ COMMITS_TMP="${TMPDIR}/_ae_commits.tmp"
17
+ FILES_TMP="${TMPDIR}/_ae_files.tmp"
18
+
19
+ {
20
+ echo "**Branch:** \`${BRANCH}\` "
21
+ echo "**Last commit:** ${TIMESTAMP}"
22
+ echo ""
23
+ echo "| Hash | Date | Message |"
24
+ echo "|------|------|---------|"
25
+ git log --format="| \`%h\` | %ad | %s |" --date=short -10 2>/dev/null || echo "| - | - | No commits yet |"
26
+ } > "$COMMITS_TMP"
27
+
28
+ # Build recent file changes
29
+ COMMIT_COUNT=$(git rev-list --count HEAD 2>/dev/null || echo "0")
30
+ {
31
+ echo "**Files changed (last 5 commits):**"
32
+ echo ""
33
+ echo '```'
34
+ if [ "$COMMIT_COUNT" -ge 5 ] 2>/dev/null; then
35
+ git diff --stat HEAD~5..HEAD 2>/dev/null | head -20
36
+ elif [ "$COMMIT_COUNT" -gt 1 ] 2>/dev/null; then
37
+ git diff --stat HEAD~$((COMMIT_COUNT - 1))..HEAD 2>/dev/null | head -20
38
+ else
39
+ echo "Not enough commits yet."
40
+ fi
41
+ echo '```'
42
+ } > "$FILES_TMP"
43
+
44
+ # Replace AUTO:START..AUTO:END and AUTO:FILES:START..AUTO:FILES:END
45
+ awk '
46
+ /<!-- AUTO:START -->/ { print; while((getline line < "'"$COMMITS_TMP"'") > 0) print line; skip=1; next }
47
+ /<!-- AUTO:END -->/ { skip=0 }
48
+ /<!-- AUTO:FILES:START -->/ { print; while((getline line < "'"$FILES_TMP"'") > 0) print line; skip=1; next }
49
+ /<!-- AUTO:FILES:END -->/ { skip=0 }
50
+ skip { next }
51
+ { print }
52
+ ' "$STATUS" > "${STATUS}.tmp" && mv "${STATUS}.tmp" "$STATUS"
53
+
54
+ rm -f "$COMMITS_TMP" "$FILES_TMP"
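The awk program at the end of the script replaces whatever sits between a start marker and an end marker while keeping both marker lines. A minimal sketch of the same logic in JavaScript, for readers less familiar with awk; the function name `replaceBetweenMarkers` and the sample branch name `main` are illustrative assumptions, not part of the package:

```javascript
// Sketch mirroring the awk logic: print the start marker, emit the new
// content, skip old lines until the end marker, then resume printing.
function replaceBetweenMarkers(text, startMarker, endMarker, replacement) {
  const out = [];
  let skipping = false;
  for (const line of text.split("\n")) {
    if (line.includes(startMarker)) {
      out.push(line, ...replacement); // keep the marker, then the new body
      skipping = true;
      continue;
    }
    if (line.includes(endMarker)) {
      skipping = false; // the end marker itself is printed below
    }
    if (!skipping) out.push(line);
  }
  return out.join("\n");
}

// Sample input modelled on the STATUS.md template (values are illustrative).
const statusBefore = [
  "## Branch & Commits",
  "",
  "<!-- AUTO:START -->",
  "_Commit or run `.claude/scripts/update-status.sh` to populate._",
  "<!-- AUTO:END -->",
].join("\n");

const statusAfter = replaceBetweenMarkers(
  statusBefore,
  "<!-- AUTO:START -->",
  "<!-- AUTO:END -->",
  ["**Branch:** `main`"]
);
console.log(statusAfter);
```

Because the markers are preserved, the replacement is repeatable: running it again swaps the generated body for a fresh one and leaves the manually maintained sections untouched.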
@@ -1,4 +1,35 @@
1
1
  {
2
+ "hooks": {
3
+ "PostToolUse": [
4
+ {
5
+ "matcher": "Bash",
6
+ "hooks": [
7
+ {
8
+ "type": "command",
9
+ "if": "Bash(git commit *)",
10
+ "command": "bash \"$CLAUDE_PROJECT_DIR\"/.claude/scripts/update-status.sh",
11
+ "timeout": 30
12
+ },
13
+ {
14
+ "type": "command",
15
+ "if": "Bash(git push *)",
16
+ "command": "bash \"$CLAUDE_PROJECT_DIR\"/.claude/scripts/update-status.sh",
17
+ "timeout": 30
18
+ }
19
+ ]
20
+ },
21
+ {
22
+ "matcher": "Write|Edit",
23
+ "hooks": [
24
+ {
25
+ "type": "command",
26
+ "command": "lines=$(wc -l < CLAUDE.md 2>/dev/null) && [ \"${lines:-0}\" -gt 200 ] && echo \"CLAUDE.md is ${lines} lines (limit: 200). Move content to linked files.\" || true",
27
+ "timeout": 10
28
+ }
29
+ ]
30
+ }
31
+ ]
32
+ },
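The `Write|Edit` hook above warns when `CLAUDE.md` grows past 200 lines and otherwise stays silent. A minimal sketch of that check in JavaScript, assuming a hypothetical function name `claudeMdWarning` and taking the file contents as a string rather than reading from disk:

```javascript
// Sketch of the hook's check. `wc -l` counts newline characters, so the
// line count here is the number of "\n" in the contents.
function claudeMdWarning(contents, limit = 200) {
  const lines = (contents.match(/\n/g) || []).length;
  if (lines > limit) {
    return `CLAUDE.md is ${lines} lines (limit: ${limit}). Move content to linked files.`;
  }
  return null; // within the limit: the hook prints nothing
}

console.log(claudeMdWarning("line\n".repeat(201)));
```

The comparison is strict, matching the `-gt 200` test in the shell command: a file of exactly 200 lines produces no warning.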
2
33
  "mcpServers": {
3
34
  "context7": {
4
35
  "command": "npx",
@@ -1,67 +1,72 @@
1
1
  # Project
2
2
 
3
- This project uses a structured agentic engineering workflow. Before starting any work, read this document and the relevant references below.
3
+ This project uses a hybrid agentic workflow: specialized agents handle process (decisions, planning, review), and Claude Code's plan mode handles execution.
4
4
 
5
- ## Workflow
5
+ ## Workflow — Hybrid Approach
6
6
 
7
- This project separates AI-assisted work into eight roles. Each role is a Claude Code subagent in `.claude/agents/`; invoke them via the Agent tool with `subagent_type: "<name>"` (e.g., `subagent_type: "architect"`). Claude will also auto-route work to the appropriate subagent based on each agent's `description`.
7
+ Agents own the **process**: architecture decisions, work decomposition, quality gates, and maintenance. Claude Code plan mode owns the **execution**: implementing individual tickets efficiently within a single session.
8
8
 
9
- | Role | Subagent | Responsibility |
10
- |------|----------|----------------|
11
- | **Architect** | `architect` | Analyze requirements, ask clarifying questions, produce ADRs |
12
- | **System Architect** | `system-architect` | Map and document system architecture as `architecture.yaml` |
13
- | **Planner** | `planner` | Decompose specs and ADRs into actionable tickets |
14
- | **Executor** | `executor` | Implement tickets, verify work before requesting feedback |
15
- | **QA Tester** | `qa-tester` | Write automated tests for completed features |
16
- | **Reviewer** | `reviewer` | Validate code and tests against acceptance criteria and ADRs |
17
- | **Custodian** | `custodian` | Keep CLAUDE.md lean (≤200 lines), current, and routed to external files |
18
- | **Summarizer** | `summarizer` | Generate executive summaries of completed sprints or features for stakeholders |
9
+ | Phase | How | When |
10
+ |-------|-----|------|
11
+ | **Decide** | `/architect` agent | New feature, significant design choice, unclear requirements |
12
+ | **Map** | `/system-architect` agent | New system or major structural change |
13
+ | **Decompose** | `/planner` agent | ADR/spec ready, work needs to be broken into tickets |
14
+ | **Execute** | Claude Code **plan mode** (`shift+tab`) | Implementing a specific ticket (includes writing tests) |
15
+ | **Review** | `/reviewer` agent | Code and tests ready for validation |
16
+ | **Report** | `/summarizer` agent | Sprint or feature complete, stakeholder update needed |
19
17
 
20
- ## Sub-Agent Deployment
21
-
22
- When work can be parallelized, spin up sub-agents to handle independent tasks concurrently. Sub-agents research, test, or implement in isolation and report back to the main thread.
18
+ ### Why hybrid?
23
19
 
24
- ### When to deploy sub-agents
20
+ - Agents enforce **separation of concerns** — the Architect can't write code, the Reviewer can't fix issues
21
+ - Plan mode provides **speed and context continuity** — it explores, plans, and executes in one session
22
+ - Artifacts (ADRs, tickets, reviews) **persist across sessions** — plan mode's output is code, agents' output is documentation
25
23
 
26
- - **Research in parallel** — e.g., one sub-agent reads existing ADRs while another explores the codebase for relevant patterns
27
- - **Test in parallel** — e.g., one sub-agent runs unit tests while another checks integration tests
28
- - **Implement independent tickets** — tickets with no dependencies on each other can be executed simultaneously
29
- - **Verify in parallel** — e.g., one sub-agent checks browser behavior while another reviews console output
24
+ ### Choosing the right tool
30
25
 
31
- ### Model selection
26
+ **Use an agent** when the task produces a persistent artifact (ADR, ticket, review, summary) or when role separation matters (the person deciding shouldn't be the person implementing).
32
27
 
33
- Not every sub-agent needs the most powerful model. Choose the model based on task complexity:
28
+ **Use plan mode** when you have a well-scoped ticket with clear acceptance criteria and want to go from plan to working code in one session.
34
29
 
35
- | Complexity | Model | Use when |
36
- |------------|-------|----------|
37
- | **Low** | Haiku | File lookups, grep searches, reading docs, running tests, formatting, simple code generation |
38
- | **Medium** | Sonnet | Multi-file changes, moderate reasoning, code review, writing tests |
39
- | **High** | Opus | Architecture decisions, complex refactors, subtle bug investigation, cross-cutting changes |
30
+ **Quick fixes and bug fixes** don't need the full pipeline — use plan mode directly, or just implement without ceremony. The workflow exists to help, not to slow down trivial changes.
40
31
 
41
- **Default to Haiku for sub-agents** unless the task requires multi-step reasoning or cross-file understanding. Most research and verification tasks are Haiku-appropriate.
32
+ ## Before Starting Any Feature
42
33
 
43
- ### How to deploy
34
+ 1. Check if an ADR exists in `architecture/decisions/` — if not, run `/architect` first
35
+ 2. Check if tickets exist in `tickets/` — if not, run `/planner` first
36
+ 3. For each ticket: use plan mode (`shift+tab`) to implement it
37
+ 4. After implementation: run `/reviewer` to validate against acceptance criteria
38
+ 5. If the ticket touches an existing ADR's scope, verify the decision still holds
44
39
 
45
- Use the Agent tool with these parameters:
46
- - `description` — Short label for what the sub-agent does
47
- - `prompt` — Self-contained brief (the sub-agent has no context from the main thread)
48
- - `model` — Set to `"haiku"` for simple tasks, `"sonnet"` for moderate, omit for complex (inherits parent model)
49
- - `run_in_background` — Set to `true` when you don't need the result before continuing other work
40
+ ## Testing in Plan Mode
50
41
 
51
- ### Rules
42
+ Plan mode writes tests as part of implementing each ticket. For every acceptance criterion:
43
+ 1. Write at least one automated test that verifies it
44
+ 2. Cover edge cases (empty, null, boundary values) and error handling
45
+ 3. Run the tests and confirm they pass before marking the ticket done
52
46
 
53
- - **Make prompts self-contained**: Sub-agents don't see the main conversation. Include file paths, context, and what specifically to do or find.
54
- - **Parallelize independent work** — Launch multiple sub-agents in a single message when their tasks don't depend on each other.
55
- - **Don't delegate synthesis** — Sub-agents gather information; the main thread makes decisions. Never write "based on your findings, decide X."
56
- - **Verify sub-agent output** — Sub-agents report what they intended, not necessarily what they achieved. Check their actual changes before reporting to the user.
47
+ Follow the project's existing test framework and patterns. Test observable behavior, not implementation details.
57
48
 
58
- ## Before Starting Any Ticket
49
+ ## Before Starting Any Ticket (in plan mode)
59
50
 
60
51
  1. Read the ticket fully, including all linked documents
61
52
  2. Read any referenced ADRs in `architecture/decisions/`
62
- 3. Check if there's a relevant spec in `specs/`
63
- 4. Propose a plan before writing code; get alignment first
64
- 5. If the ticket touches an existing ADR's scope, verify the decision still holds
53
+ 3. Check relevant conventions in `conventions/`
54
+ 4. Let plan mode explore and propose the implementation plan
55
+ 5. Verify the work end-to-end before marking done
56
+
57
+ ## Sub-Agent Deployment
58
+
59
+ When work can be parallelized, spin up sub-agents for independent tasks concurrently.
60
+
61
+ ### Model selection
62
+
63
+ | Complexity | Model | Use when |
64
+ |------------|-------|----------|
65
+ | **Low** | Haiku | File lookups, grep, reading docs, running tests, formatting |
66
+ | **Medium** | Sonnet | Multi-file changes, code review, writing tests |
67
+ | **High** | Opus | Architecture decisions, complex refactors, subtle bugs |
68
+
69
+ **Default to Haiku** unless the task requires multi-step reasoning or cross-file understanding.
65
70
 
66
71
  ## Key Files and Directories
67
72
 
@@ -71,11 +76,33 @@ Use the Agent tool with these parameters:
71
76
  - `specs/` — Feature specifications
72
77
  - `tickets/` — Work items organized by feature folder, with `_backlog.md` as the sprint board
73
78
  - `conventions/` — Language and framework coding standards
74
- - `.claude/agents/` — Subagent definitions for each role (Architect, Planner, Executor, etc.)
79
+ - `.claude/agents/` — Subagent definitions for each role
80
+ - `STATUS.md` — Live project dashboard (auto-updated git data + manually maintained context)
81
+
82
+ ## CLAUDE.md Maintenance
83
+
84
+ Keep this file lean and current (target: under 200 lines). A hook warns when it exceeds the limit.
85
+ - When you discover a new gotcha or pattern, add it here or to the appropriate linked file
86
+ - Route large or specialized content to separate files (e.g., `conventions/`, `docs/`) and link from here
87
+ - Remove stale entries that no longer reflect how the project works
88
+ - Never duplicate information that already lives in a linked file
89
+
90
+ ## STATUS.md Maintenance
91
+
92
+ STATUS.md is a live project dashboard. Git sections (branch, commits, file changes) auto-update via a hook on every commit/push. You maintain the semantic sections:
93
+
94
+ **Update after significant milestones** (completing a ticket, finishing a phase, hitting a blocker):
95
+ 1. **Current Phase** — Mark the active workflow phase(s) from the table
96
+ 2. **Active Work** — One paragraph: what feature/ticket is in progress, next step
97
+ 3. **Open Tickets** — Snapshot from `tickets/_backlog.md`
98
+ 4. **Risks & Blockers** — Add blockers; remove resolved ones
99
+ 5. **Session Log** — One-line entry with today's date and what was accomplished
100
+
101
+ Keep updates brief. STATUS.md is a dashboard, not a report — use `/summarizer` for detailed retrospectives.
75
102
 
76
103
  ## MCP Servers
77
104
 
78
- - **Context7** — Pulls up-to-date, version-specific documentation from live code libraries (React, Next.js, etc.) before writing code. Configured in `.claude/settings.json`. Use the `resolve` tool to look up a library, then `get-library-docs` to fetch the relevant docs. Always query Context7 before writing code that depends on a third-party library to avoid using outdated or deprecated APIs.
105
+ - **Context7** — Pulls up-to-date, version-specific documentation from live code libraries. Use `resolve` then `get-library-docs` before writing code that depends on a third-party library.
79
106
 
80
107
  ## Conventions
81
108
 
@@ -0,0 +1,48 @@
1
+ # Project Status
2
+
3
+ > Last updated: _not yet updated_
4
+
5
+ ## Current Phase
6
+
7
+ | Phase | Status |
8
+ |-------|--------|
9
+ | Decide | |
10
+ | Map | |
11
+ | Decompose | |
12
+ | Execute | |
13
+ | Test | |
14
+ | Review | |
15
+ | Maintain | |
16
+ | Report | |
17
+
18
+ ## Active Work
19
+
20
+ _What is currently being worked on, the relevant tickets, and the expected next step._
21
+
22
+ ## Branch & Commits
23
+
24
+ <!-- AUTO:START -->
25
+ _Commit or run `.claude/scripts/update-status.sh` to populate._
26
+ <!-- AUTO:END -->
27
+
28
+ ## Recent File Changes
29
+
30
+ <!-- AUTO:FILES:START -->
31
+ _Commit or run `.claude/scripts/update-status.sh` to populate._
32
+ <!-- AUTO:FILES:END -->
33
+
34
+ ## Open Tickets
35
+
36
+ | Ticket | Feature | Status |
37
+ |--------|---------|--------|
38
+ | | | |
39
+
40
+ ## Risks & Blockers
41
+
42
+ - None currently
43
+
44
+ ## Session Log
45
+
46
+ | Date | Summary |
47
+ |------|---------|
48
+ | | |
@@ -2,6 +2,7 @@
2
2
 
3
3
  **Status:** Accepted
4
4
  **Date:** 2026-05-04
5
+ **Updated:** 2026-05-08
5
6
  **Author:** swarpi
6
7
 
7
8
  ## Context
@@ -12,19 +13,36 @@ Working with AI coding assistants can be highly productive, but without structur
12
13
  - Lost context between sessions
13
14
  - No clear separation between planning and execution
14
15
 
15
- We need a workflow that maximizes the benefits of AI assistance while maintaining engineering rigor.
16
+ We need a workflow that maximizes the benefits of AI assistance while maintaining engineering rigor. However, a fully agent-driven pipeline adds friction for execution — Claude Code's built-in plan mode is faster and maintains better context continuity when implementing well-scoped work.
16
17
 
17
18
  ## Decision
18
19
 
19
- We adopt a role-based workflow with four distinct agent modes:
20
+ We adopt a **hybrid approach**: specialized agents own the process, Claude Code plan mode owns execution.
21
+
22
+ ### Agents (process)
20
23
 
21
24
  1. **Architect** — Focuses on design decisions. Produces ADRs. Asks clarifying questions before deciding. Has read access to the codebase but does not write code.
22
25
 
23
- 2. **Planner** — Takes specs and ADRs as input. Decomposes work into tickets with clear acceptance criteria. Does not implement.
26
+ 2. **Planner** — Takes specs and ADRs as input. Decomposes work into tickets scoped for plan mode sessions. Does not implement.
27
+
28
+ 3. **Reviewer** — Reviews diffs against acceptance criteria and linked ADRs. Checks for convention violations and test adequacy. Does not fix issues directly — flags them for fixing.
29
+
30
+ 4. **Summarizer** — Produces non-technical executive summaries of completed work.
31
+
32
+ ### Plan mode (execution)
33
+
34
+ Each ticket from the Planner is executed using Claude Code's plan mode (`shift+tab`). Plan mode:
35
+ - Explores the codebase and proposes a plan before implementing
36
+ - Executes within a single session with full context continuity
37
+ - Adapts its depth to the task — simple ticket, simple plan
38
+
39
+ The **Executor agent** remains as a fallback for cases where plan mode is unavailable or strict verification discipline is needed.
24
40
 
25
- 3. **Executor**: Implements tickets. Follows conventions. References the ticket and relevant ADRs. Proposes a plan before coding.
41
+ ### When to skip the pipeline
26
42
 
27
- 4. **Reviewer**: Reviews diffs against acceptance criteria and linked ADRs. Checks for convention violations. Does not fix issues directly — flags them for the executor.
43
+ - Bug fixes, typos, and small changes → plan mode directly
44
+ - Well-understood changes with no architectural implications → plan mode directly
45
+ - New features or significant decisions → full pipeline starting with Architect
28
46
 
29
47
  All significant decisions are recorded as ADRs. All work items are tickets with acceptance criteria. Conventions are documented per-language.
30
48
 
@@ -35,14 +53,15 @@ All significant decisions are recorded as ADRs. All work items are tickets with
35
53
  - Clear separation of concerns reduces cognitive overload per session
36
54
  - ADRs create a searchable decision history
37
55
  - Tickets with acceptance criteria make "done" unambiguous
56
+ - Plan mode provides fast, context-aware execution without agent switching overhead
38
57
  - Conventions prevent style drift across sessions
39
- - Role separation allows using different models for different tasks (e.g., larger model for architecture, faster model for execution)
58
+ - The workflow adapts to task size: no ceremony for small changes
40
59
 
41
60
  ### Negative
42
61
 
43
- - More upfront documentation work
44
- - Overhead may feel excessive for small changes
45
- - Requires discipline to follow the process
62
+ - More upfront documentation work for new features
63
+ - Requires discipline to follow the process for larger changes
64
+ - Plan mode doesn't produce persistent artifacts the way the Executor agent does
46
65
 
47
66
  ### Neutral
48
67
 
@@ -54,10 +73,14 @@ All significant decisions are recorded as ADRs. All work items are tickets with
54
73
 
55
74
  Just talk to the AI and let it write code directly. Rejected because it leads to inconsistent decisions and lost context.
56
75
 
57
- ### Heavy-process frameworks (e.g., full Agile ceremony)
76
+ ### Fully agent-driven pipeline (previous approach)
58
77
 
59
- Too heavyweight for a solo developer. We take the useful parts (clear acceptance criteria, documented decisions) without the ceremony.
78
+ All eight agents in sequence, including Executor for implementation. Rejected because plan mode provides better context continuity and speed for execution, and the Executor agent was frequently skipped by plan mode anyway.
79
+
80
+ ### Plan mode only (no agents)
60
81
 
61
- ### Separate human architect with AI executor
82
+ Use plan mode for everything. Rejected because plan mode doesn't enforce role separation (the same session that decides also implements), produces no persistent artifacts (ADRs, tickets, reviews), and doesn't scale to multi-session features.
62
83
 
63
- Keeps all design decisions with the human. Rejected because AI architects can surface alternatives humans wouldn't consider, and documenting the reasoning in ADRs preserves human oversight.
84
+ ### Heavy-process frameworks (e.g., full Agile ceremony)
85
+
86
+ Too heavyweight for a solo developer. We take the useful parts (clear acceptance criteria, documented decisions) without the ceremony.
@@ -8,13 +8,25 @@ This document provides a high-level view of the system architecture.
8
8
  2. **Written artifacts** — Decisions, plans, and reviews are documented, not just discussed
9
9
  3. **Verify before execute** — Plans are reviewed before implementation begins
10
10
  4. **Single source of truth** — Each piece of information lives in one canonical place
11
+ 5. **Right tool for the job** — Agents own process; plan mode owns execution
11
12
 
12
- ## The Flow
13
+ ## The Flow (Hybrid)
13
14
 
14
15
  ```
15
- Requirement → Architect → ADR/Spec → Planner → Tickets → Executor → Code → Reviewer → Merged
16
+ Requirement → Architect → ADR/Spec → Planner → Tickets → Plan Mode → Code → Reviewer → Merged
17
+               (agent)                (agent)             (built-in)         (agent)
16
18
  ```
17
19
 
20
+ **Process phases** (agents): Architect, System Architect, Planner, Reviewer, Summarizer
21
+ **Execution phase** (plan mode): Claude Code's built-in plan mode implements each ticket in a focused session
22
+
23
+ ## When to Skip the Pipeline
24
+
25
+ Not every change needs the full workflow:
26
+ - **Bug fix or typo** → Plan mode directly, or just implement
27
+ - **Small well-understood change** → Plan mode directly
28
+ - **New feature or significant decision** → Start with Architect, then full pipeline
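The routing rule in the list above can be read as a small lookup from change category to entry point. A minimal sketch, assuming hypothetical category names and a hypothetical function `entryPointFor`; neither appears in the package, and the fallback for unknown categories is an assumption of this sketch:

```javascript
// Hypothetical routing helper: maps a change category to the entry point
// described in the list above. Category names are illustrative.
function entryPointFor(changeKind) {
  switch (changeKind) {
    case "bug-fix":
    case "typo":
    case "small-change":
      return "plan-mode"; // skip the pipeline, implement directly
    case "new-feature":
    case "significant-decision":
      return "architect"; // start the full pipeline
    default:
      // Assumption: unknown categories take the full pipeline as the
      // conservative choice.
      return "architect";
  }
}

console.log(entryPointFor("typo"));
console.log(entryPointFor("new-feature"));
```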
29
+
18
30
  ## Artifact Types
19
31
 
20
32
  | Artifact | Purpose | Location |
@@ -1,5 +1,7 @@
1
- name: Agentic Workflow
2
- description: Eight-role pipeline for AI-assisted software engineering
1
+ name: Hybrid Agentic Workflow
2
+ description: >
3
+ Agents own the process (decisions, planning, review).
4
+ Claude Code plan mode owns execution (implementing tickets).
3
5
 
4
6
  agents:
5
7
  - id: architect
@@ -29,8 +31,8 @@ agents:
29
31
  - id: planner
30
32
  kind: planning
31
33
  title: Planner
32
- tagline: Decomposes specs into actionable work
33
- description: Takes ADRs and specs as input. Decomposes the work into discrete, actionable tickets with clear acceptance criteria.
34
+ tagline: Decomposes specs into plan-mode-sized tickets
35
+ description: Takes ADRs and specs as input. Decomposes work into discrete tickets scoped for a single Claude Code plan mode session.
34
36
  outputs:
35
37
  - Tickets
36
38
  - Milestones
@@ -38,35 +40,39 @@ agents:
38
40
  color: indigo
39
41
  docLink: /.claude/agents/planner.md
40
42
 
41
- - id: executor
43
+ - id: plan-mode
42
44
  kind: execution
43
- title: Executor
44
- tagline: Implements with intent and discipline
45
- description: Implements tickets following established conventions. Always proposes a plan before touching the codebase.
45
+ title: Plan Mode
46
+ tagline: Claude Code's built-in plan-then-execute cycle
47
+ description: >
48
+ Not a custom agent — this is Claude Code's native plan mode (shift+tab).
49
+ It explores the codebase, proposes a plan, gets approval, then implements.
50
+ Each ticket from the Planner is executed as a separate plan mode session.
46
51
  outputs:
47
52
  - Code
48
53
  - PRs
49
- - Plan docs
50
54
  color: indigo
51
- docLink: /.claude/agents/executor.md
55
+ builtin: true
52
56
 
53
- - id: qa-tester
54
- kind: validation
55
- title: QA Tester
56
- tagline: Writes automated tests for completed features
57
- description: Writes automated tests after the Executor finishes a feature. Covers acceptance criteria, edge cases, and regression scenarios.
57
+ - id: executor
58
+ kind: execution
59
+ title: Executor (fallback)
60
+ tagline: Guided execution with strict verification
61
+ description: >
62
+ Fallback for when plan mode is unavailable or when enforced verification
63
+ discipline is needed. Most tickets should use plan mode instead.
58
64
  outputs:
59
- - Tests
60
- - Coverage report
61
- - Test plan
62
- color: green
63
- docLink: /.claude/agents/qa-tester.md
65
+ - Code
66
+ - PRs
67
+ - Plan docs
68
+ color: gray
69
+ docLink: /.claude/agents/executor.md
64
70
 
65
71
  - id: reviewer
66
72
  kind: validation
67
73
  title: Reviewer
68
74
  tagline: Validates against acceptance criteria
69
- description: Validates code and tests against the original acceptance criteria. Flags issues back to the Executor and provides final approval.
75
+ description: Validates code and tests against the original acceptance criteria. Flags issues back for fixing and provides final approval.
70
76
  outputs:
71
77
  - Feedback
72
78
  - Approval
@@ -74,22 +80,11 @@ agents:
  color: amber
  docLink: /.claude/agents/reviewer.md

- - id: custodian
- kind: maintenance
- title: Custodian
- tagline: Keeps CLAUDE.md lean, current, and well-routed
- description: Maintains the project's CLAUDE.md file — adds new patterns and gotchas, removes stale entries, and routes large content to external files. Enforces a 150–200 line limit to prevent context bloat.
- outputs:
- - Updated CLAUDE.md
- - Extracted reference files
- color: blue
- docLink: /.claude/agents/custodian.md
-
  - id: summarizer
  kind: reporting
  title: Summarizer
  tagline: Communicates completed work to stakeholders
- description: Reads tickets, ADRs, specs, git history, and test results to produce concise, non-technical executive summaries of what was delivered and why it matters.
+ description: Reads tickets, ADRs, specs, git history, and test results to produce concise, non-technical executive summaries.
  outputs:
  - Executive summaries
  - Sprint recaps
@@ -98,6 +93,7 @@ agents:
  docLink: /.claude/agents/summarizer.md

  connections:
+ # Process phase: agents produce artifacts
  - from: architect
  to: system-architect
  artifact: ADRs · Specs
@@ -106,28 +102,42 @@ connections:
  to: planner
  artifact: architecture.yaml

+ # Execution phase: plan mode implements tickets
+ - from: planner
+ to: plan-mode
+ artifact: Tickets
+ note: Each ticket becomes a separate plan mode session
+
+ # Fallback execution path
  - from: planner
  to: executor
  artifact: Tickets
+ type: fallback
+ note: Use when plan mode is unavailable

- - from: executor
- to: qa-tester
- artifact: Code
+ # Validation phase
+ - from: plan-mode
+ to: reviewer
+ artifact: Code · Tests

- - from: qa-tester
+ - from: executor
  to: reviewer
  artifact: Code · Tests
+ type: fallback

+ # Feedback loops
  - from: reviewer
- to: executor
+ to: plan-mode
  artifact: Feedback
  type: feedback
+ note: Fix issues in a new plan mode session

  - from: reviewer
- to: custodian
- artifact: Completed work context
- type: trigger
+ to: executor
+ artifact: Feedback
+ type: feedback

+ # Reporting trigger
  - from: reviewer
  to: summarizer
  artifact: Completed work context
@@ -137,6 +147,7 @@ parallelization:
  description: >
  The main session can deploy sub-agents for independent, parallelizable work.
  Sub-agents run in isolation, report back, and the main thread synthesizes results.
+ Plan mode sessions run sequentially (one ticket at a time) unless tickets are independent.
  model-routing:
  low-complexity: haiku
  medium-complexity: sonnet
@@ -145,7 +156,7 @@ parallelization:
  examples:
  - parallel research across multiple files
  - running independent test suites simultaneously
- - implementing tickets with no dependencies on each other
+ - independent tickets can be executed in separate plan mode sessions
  - concurrent verification (browser + console + tests)

  layout: diamond
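
Because this change removes two agents and rewires several connections, it may help to confirm that every connection's `from` and `to` still names a declared agent `id`. The following is a minimal sketch, not part of agent-eng: `findDanglingConnections` is a hypothetical helper, the sample data is abbreviated from the hunks above, and the id `plan-mode` for the `builtin: true` agent is an assumption inferred from the new connections.

```javascript
// Hypothetical helper, not part of agent-eng: returns the connections whose
// endpoints do not match any declared agent id.
function findDanglingConnections(agents, connections) {
  const knownIds = new Set(agents.map((agent) => agent.id));
  return connections.filter(
    (connection) => !knownIds.has(connection.from) || !knownIds.has(connection.to)
  );
}

// Abbreviated sample data. The id "plan-mode" is an assumption inferred
// from the new connections in the diff.
const agents = [
  { id: "planner" },
  { id: "plan-mode" },
  { id: "executor" },
  { id: "reviewer" },
  { id: "summarizer" },
];

const connections = [
  { from: "planner", to: "plan-mode", artifact: "Tickets" },
  { from: "planner", to: "executor", artifact: "Tickets", type: "fallback" },
  { from: "plan-mode", to: "reviewer", artifact: "Code · Tests" },
  { from: "reviewer", to: "plan-mode", artifact: "Feedback", type: "feedback" },
  // A stale edge from 0.10.0, included only to show what a failure looks like.
  { from: "executor", to: "qa-tester", artifact: "Code" },
];

const dangling = findDanglingConnections(agents, connections);
console.log(dangling.map((c) => `${c.from} -> ${c.to}`));
```

In this sample only the stale `executor -> qa-tester` edge is reported, since `qa-tester` no longer appears in the agents list.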
@@ -1,77 +0,0 @@
- ---
- name: custodian
- description: Use periodically (after a batch of tickets, or when CLAUDE.md grows past 200 lines) to keep CLAUDE.md lean, current, and routed to external files. Modifies only CLAUDE.md and the files it links to. Also keeps orchestration.yaml in sync when agents are added or removed.
- tools: Read, Write, Edit, Grep, Glob
- model: haiku
- ---
-
- You are a custodian agent. Your role is to maintain the project's `CLAUDE.md` file and `orchestration.yaml` — keeping them accurate, lean, and in sync.
-
- ## Responsibilities
-
- 1. **Keep CLAUDE.md current** — Update it with new patterns, gotchas, and conventions discovered during development
- 2. **Keep CLAUDE.md lean** — The file must stay between 150–200 lines max to prevent context bloat
- 3. **Route to external files** — Large or specialized content belongs in separate files that CLAUDE.md links to, so the main context only loads them when needed
- 4. **Remove stale content** — Delete entries that no longer reflect how the project works
- 5. **Sync orchestration.yaml** — When agents are added, removed, or renamed in `.claude/agents/`, update the agents list and connections in `orchestration.yaml` to match
-
- ## Constraints
-
- - You only modify `CLAUDE.md`, `orchestration.yaml`, and the files `CLAUDE.md` routes to — you do not write application code
- - You never exceed 200 lines in `CLAUDE.md`
- - You preserve the existing structure and section ordering unless restructuring is necessary to stay within the line budget
- - You do not duplicate information that already lives in linked files
-
- ## Process
-
- 1. Read the current `CLAUDE.md` and count its lines
- 2. Review recent work context — what patterns, gotchas, or conventions were discovered?
- 3. Decide what to update:
- - **Add** new patterns or gotchas that would help future sessions
- - **Remove** stale or outdated entries
- - **Route out** any section that has grown too large — extract it to a dedicated file and replace it with a one-line link
- 4. After editing, verify the line count is within 150–200 lines
- 5. If over 200 lines, identify what to extract or trim
- 6. Compare `.claude/agents/*.md` files against `orchestration.yaml` — add missing agents, remove stale entries, and verify connections still make sense
-
- ## What belongs in CLAUDE.md
-
- - Project identity (one sentence: what this project is)
- - Workflow overview (role table, key file paths)
- - Active conventions and patterns worth knowing upfront
- - Links to deeper references (not the references themselves)
- - Current gotchas that would surprise a new session
-
- ## What belongs in linked files instead
-
- - Detailed coding conventions → `conventions/<language>.md`
- - Business context or domain knowledge → `docs/context/<topic>.md`
- - Style guides → `conventions/style.md` or similar
- - API contracts or integration details → `docs/<integration>.md`
- - Large workflow / role instructions → `.claude/agents/<role>.md`
-
- ## Routing format
-
- When linking out to a separate file, use this pattern in CLAUDE.md:
-
- ```markdown
- - **Topic name** — One-line summary. See `path/to/detail.md`
- ```
-
- The linked file should be self-contained so it makes sense when read independently.
-
- ## When to run
-
- This agent should be invoked:
- - After a batch of tickets is completed (to capture new patterns)
- - When CLAUDE.md is approaching or exceeding 200 lines
- - When a new convention or gotcha is discovered during development
- - Periodically as a hygiene pass
-
- ## Anti-patterns to Avoid
-
- - Letting CLAUDE.md grow past 200 lines
- - Inlining large blocks of content that belong in separate files
- - Removing links to files without checking if the linked file still exists
- - Adding entries that duplicate what's already in a linked file
- - Writing vague entries ("be careful with X") instead of specific ones ("X requires Y because Z")
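
With the custodian removed, the line budget it enforced (150 to 200 lines, never above 200) could still be checked mechanically. The following is a minimal sketch under stated assumptions: `checkLineBudget` is a hypothetical helper that is not part of agent-eng, and the choice not to count a trailing newline as an extra line is also an assumption.

```javascript
// Hypothetical helper, not part of agent-eng: classifies a CLAUDE.md body
// against the 150-200 line budget described in the removed custodian prompt.
function checkLineBudget(text, { min = 150, max = 200 } = {}) {
  // Assumption: a single trailing newline does not count as an extra line.
  const trimmed = text.endsWith("\n") ? text.slice(0, -1) : text;
  const lines = trimmed === "" ? 0 : trimmed.split("\n").length;
  if (lines > max) return { lines, status: "over" };
  if (lines < min) return { lines, status: "under" };
  return { lines, status: "ok" };
}

// Synthetic 210-line input standing in for a real CLAUDE.md.
const sample =
  Array.from({ length: 210 }, (_, i) => `line ${i + 1}`).join("\n") + "\n";
console.log(checkLineBudget(sample));
```

An `over` result corresponds to the case where the prompt asks for content to be extracted or trimmed. `under` is informational only, since the prompt treats 200 as the hard ceiling.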
@@ -1,82 +0,0 @@
- ---
- name: qa-tester
- description: Use after a feature is implemented to write automated tests covering the acceptance criteria, edge cases, and regressions. Does not modify feature code.
- tools: Read, Write, Edit, Grep, Glob, Bash
- model: sonnet
- ---
-
- You are a QA tester agent. Your role is to write automated tests for completed features, ensuring they meet acceptance criteria and catch regressions.
-
- ## Responsibilities
-
- 1. **Read the ticket** — Understand what was implemented and its acceptance criteria
- 2. **Read the code** — Study the implementation to understand behavior, edge cases, and boundaries
- 3. **Write automated tests** — Produce tests that verify the feature works as specified
- 4. **Cover edge cases** — Test boundaries, error states, and invalid inputs
- 5. **Ensure regressions are caught** — Tests should break if the feature's behavior changes unexpectedly
-
- ## Constraints
-
- - You **write tests only**, you do not modify feature code
- - You follow the project's existing test conventions and frameworks
- - You do not introduce new test dependencies without explicit approval
- - You test observable behavior, not implementation details
- - You write the minimum number of tests needed for confidence, not the maximum
-
- ## Process
-
- 1. Read the ticket and its acceptance criteria
- 2. Read the implementation code (the Executor's output)
- 3. Identify the project's test framework, patterns, and file locations
- 4. For each acceptance criterion, write at least one test that verifies it
- 5. Add tests for:
- - Happy path (expected inputs → expected outputs)
- - Edge cases (empty, null, boundary values)
- - Error handling (invalid inputs, failure modes)
- - Integration points (if the feature touches other modules)
- 6. Run the tests and confirm they pass
- 7. Produce a test summary
-
- ## Output Format
-
- ```markdown
- ## Test Plan: Ticket Title
-
- ### Test File(s)
-
- - `tests/feature.test.ts` — Unit tests for core logic
- - `tests/feature.integration.test.ts` — Integration tests (if applicable)
-
- ### Coverage
-
- | Acceptance Criterion | Test(s) | Status |
- |---|---|---|
- | Criterion 1 | `should handle valid input` | Pass |
- | Criterion 2 | `should reject empty input`, `should reject null` | Pass |
- | Criterion 3 | `should return paginated results` | Pass |
-
- ### Edge Cases
-
- - Empty input → returns empty result (not an error)
- - Concurrent calls → no race conditions
- - Large input (10k items) → completes within timeout
-
- ### Summary
-
- X tests written, all passing. Covers N/N acceptance criteria.
- ```
-
- ## Test Quality Guidelines
-
- - **Descriptive names** — Test names should read as specifications: `should return 404 when user not found`
- - **Arrange-Act-Assert** — Each test has a clear setup, action, and verification
- - **One assertion per concept** — A test should verify one behavior, though multiple assertions are fine if they verify the same thing
- - **No test interdependence** — Tests must not depend on execution order or shared mutable state
- - **Fast by default** — Unit tests should be fast; mark slow integration tests explicitly
-
- ## What NOT to Test
-
- - Third-party library internals
- - Private implementation details that may change without affecting behavior
- - Exact error message strings (test error types instead)
- - Configurations that are already validated by the framework