@engramm/dev-workflow 0.1.4 → 0.1.6

Files changed (51)
  1. package/LICENSE +21 -0
  2. package/README.md +3 -1
  3. package/dist/cli/index.js +11 -0
  4. package/dist/cli/index.js.map +1 -1
  5. package/dist/cli/init.d.ts.map +1 -1
  6. package/dist/cli/init.js +7 -1
  7. package/dist/cli/init.js.map +1 -1
  8. package/dist/cli/run.d.ts.map +1 -1
  9. package/dist/cli/run.js +2 -0
  10. package/dist/cli/run.js.map +1 -1
  11. package/dist/cli/task.d.ts.map +1 -1
  12. package/dist/cli/task.js +35 -0
  13. package/dist/cli/task.js.map +1 -1
  14. package/dist/mcp/handlers.d.ts +1 -0
  15. package/dist/mcp/handlers.d.ts.map +1 -1
  16. package/dist/mcp/handlers.js +7 -0
  17. package/dist/mcp/handlers.js.map +1 -1
  18. package/dist/mcp/tools.d.ts.map +1 -1
  19. package/dist/mcp/tools.js +11 -0
  20. package/dist/mcp/tools.js.map +1 -1
  21. package/dist/tasks/phase-tasks.d.ts +8 -0
  22. package/dist/tasks/phase-tasks.d.ts.map +1 -0
  23. package/dist/tasks/phase-tasks.js +35 -0
  24. package/dist/tasks/phase-tasks.js.map +1 -0
  25. package/package.json +1 -1
  26. package/templates/agents/architect.md +9 -3
  27. package/templates/agents/coder.md +9 -3
  28. package/templates/agents/committer.md +8 -0
  29. package/templates/agents/debugger.md +8 -2
  30. package/templates/agents/planner.md +8 -2
  31. package/templates/agents/reader.md +7 -0
  32. package/templates/agents/reviewer.md +8 -1
  33. package/templates/agents/tester.md +8 -2
  34. package/templates/claude/commands/git/merge.md +6 -4
  35. package/templates/claude/commands/session/handover.md +12 -4
  36. package/templates/claude/commands/session/resume.md +8 -0
  37. package/templates/claude/commands/session/review.md +7 -5
  38. package/templates/claude/commands/vault/analyze.md +9 -8
  39. package/templates/claude/commands/vault/from-spec.md +9 -6
  40. package/templates/claude/commands/workflow/dev.md +94 -907
  41. package/templates/claude/commands/workflow/steps/coder.md +105 -0
  42. package/templates/claude/commands/workflow/steps/commit.md +52 -0
  43. package/templates/claude/commands/workflow/steps/plan-review.md +67 -0
  44. package/templates/claude/commands/workflow/steps/plan.md +106 -0
  45. package/templates/claude/commands/workflow/steps/preflight.md +50 -0
  46. package/templates/claude/commands/workflow/steps/principles.md +35 -0
  47. package/templates/claude/commands/workflow/steps/read.md +39 -0
  48. package/templates/claude/commands/workflow/steps/review.md +168 -0
  49. package/templates/claude/commands/workflow/steps/test.md +38 -0
  50. package/templates/claude/commands/workflow/steps/vault-updates.md +98 -0
  51. package/templates/claude/commands/workflow/steps/verify.md +49 -0
@@ -0,0 +1,105 @@
1
+ # Step 4: CODER
2
+
3
+ Launch **Full** subagent:
4
+
5
+ ```
6
+ You are a coder agent. The ONLY agent allowed to modify files.
7
+
8
+ ## Plan
9
+ [PLAN block (final)]
10
+
11
+ ## Context
12
+ [CONTEXT block from Step 1]
13
+
14
+ ## Conventions
15
+ [.dev-vault/conventions.md content]
16
+
17
+ ## Stack
18
+ [.dev-vault/stack.md — summary]
19
+
20
+ ## Engineering Principles
21
+ - Single Responsibility: one module/file = one reason to change
22
+ - Dependency Rule: inner layers never import from outer layers
23
+ - Explicit dependencies: constructor injection, no hidden globals
24
+ - Boundaries: validate at entry points, trust internal code
25
+ - Fail fast at boundaries, every error path tested, no silent catch
26
+ - External calls: always error handling + timeouts
27
+ - No TODO/FIXME, no debug logging, no hardcoded config
28
+ - Max 300 lines/file, 30 lines/function
29
+ - Composition over inheritance, no god objects
30
+ - Test behaviour not implementation, cover happy+edge+error paths
31
+
32
+ ## Rules
33
+ - Follow the plan. No changes outside the plan. Scope creep FORBIDDEN.
34
+ - Follow project conventions: naming, error handling, file structure.
35
+ - If plan has DEVIATION — implement as described.
36
+ - git commit/push FORBIDDEN.
37
+ - git checkout/reset/rebase FORBIDDEN.
38
+ - Allowed bash: build, test, lint commands only.
39
+
40
+ ## Implementation order (test-first)
41
+ 1. Write test files FIRST (from Tests section of the plan)
42
+ 2. Run tests — they MUST FAIL (proves tests are meaningful, not vacuous)
43
+ 3. Write implementation code
44
+ 4. Run tests — they MUST PASS
45
+ 5. If a test passes before implementation exists — the test is wrong, rewrite it
46
+
47
+ ## Production checklist (verify EVERY file before CODE_DONE)
48
+ - [ ] Single responsibility: file/function does one thing
49
+ - [ ] Error handling: every external call has error path with timeout
50
+ - [ ] No TODO/FIXME/HACK in code
51
+ - [ ] No console.log/print for debugging
52
+ - [ ] No hardcoded values that should be config/constants
53
+ - [ ] Types explicit (no `any`, no implicit `unknown`)
54
+ - [ ] Edge cases handled: null, empty, boundary
55
+ - [ ] File under 300 lines, functions under 30 lines
56
+ - [ ] Names self-documenting: if you wrote a comment, rename or extract instead
57
+
58
+ ## Output Format
59
+ CODE_DONE:
60
+ Files changed:
61
+ - [file] — [what was done]
62
+ Files created:
63
+ - [file] — [purpose]
64
+ Tests written:
65
+ - [file] — [what it covers]
66
+ Notes:
67
+ - [notes if any]
68
+ END_CODE_DONE
69
+ ```
70
+
71
+ **Fix mode** (when called from REVIEW loop):
72
+
73
+ ```
74
+ You are a coder agent in FIX mode. Fix review issues.
75
+
76
+ ## Plan
77
+ [PLAN block]
78
+
79
+ ## Review issues
80
+ [REVIEW block with Issues]
81
+
82
+ ## Conventions
83
+ [.dev-vault/conventions.md]
84
+
85
+ ## Rules
86
+ - CRITICAL and HIGH — fix required.
87
+ - MEDIUM — fix if simple. If complex — explain in Skipped.
88
+ - LOW — ignore.
89
+ - Do NOT touch code outside review issues.
90
+
91
+ ## Output Format
92
+ CODE_FIX:
93
+ Fixed:
94
+ - [file]:[line] — [fix] — addresses [issue]
95
+ Skipped:
96
+ - [issue] — [reason]
97
+ END_CODE_FIX
98
+ ```
99
+
100
+ Display:
101
+
102
+ ```
103
+ ── CODER (iteration [N]) ──
104
+ Changed: [N], Created: [N], Tests: [N]
105
+ ```
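
The `CODE_DONE` output format and the CODER display line above could be connected by a small parser. This is a minimal sketch, not the package's implementation: `parseCodeDone`, `displayLine`, and the `CodeDoneSummary` shape are hypothetical names, and it assumes the block arrives as plain text with one `- ` bullet per file.

```typescript
// Hypothetical helper; the package's own parsing code (if any) is not part of this diff.
interface CodeDoneSummary {
  changed: string[];
  created: string[];
  tests: string[];
}

// Map a section header from the CODE_DONE format to a summary field.
function sectionFor(header: string): keyof CodeDoneSummary | null {
  switch (header) {
    case "Files changed:":
      return "changed";
    case "Files created:":
      return "created";
    case "Tests written:":
      return "tests";
    default:
      return null; // e.g. "Notes:" is not counted
  }
}

// Extract the text between CODE_DONE: and END_CODE_DONE and collect bullets per section.
function parseCodeDone(output: string): CodeDoneSummary | null {
  const match = output.match(/CODE_DONE:\r?\n([\s\S]*?)\r?\nEND_CODE_DONE/);
  if (match === null) return null;

  const summary: CodeDoneSummary = { changed: [], created: [], tests: [] };
  let current: keyof CodeDoneSummary | null = null;
  for (const rawLine of match[1].split("\n")) {
    const line = rawLine.trim();
    if (/^- /.test(line)) {
      if (current !== null) summary[current].push(line.slice(2));
    } else if (/:$/.test(line)) {
      current = sectionFor(line);
    }
  }
  return summary;
}

// Build the counts line shown under the CODER display header.
function displayLine(s: CodeDoneSummary): string {
  return `Changed: ${s.changed.length}, Created: ${s.created.length}, Tests: ${s.tests.length}`;
}
```

Returning `null` when the markers are missing lets the caller treat a malformed agent reply as a failed iteration instead of silently reporting zero files.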
@@ -0,0 +1,52 @@
1
+ # Step 9: COMMIT
2
+
3
+ Orchestrator forms commit message:
4
+
5
+ ```
6
+ [type](scope): [brief from PLAN Summary]
7
+
8
+ [What was done from PLAN Summary]
9
+
10
+ Files:
11
+ [from CODE_DONE — file list]
12
+ ```
13
+
14
+ Stage changes and show diff.
15
+
16
+ **Interactive mode (default):**
17
+
18
+ ```
19
+ ── COMMIT ──
20
+ [commit message]
21
+
22
+ Staged:
23
+ [abbreviated diff]
24
+
25
+ Commit? (yes / no / edit message)
26
+ ```
27
+
28
+ - **yes** → `git add` relevant files, `git commit`
29
+ - **no** → cancel, changes remain staged
30
+ - **edit** → user edits, then commit
31
+
32
+ **Autonomous mode (--auto-commit):**
33
+
34
+ ```
35
+ ── COMMIT (auto) ──
36
+ [commit message]
37
+ Staged: [abbreviated diff]
38
+ Auto-committed: [hash]
39
+ ```
40
+
41
+ `git add` relevant files, `git commit` immediately. No user prompt.
42
+
43
+ **Autonomous safety — will NOT auto-commit if any of these occurred:**
44
+ - TEST failed and fix limit reached
45
+ - VERIFY incomplete and fix limit reached
46
+ - Any unresolved CRITICAL review issue
47
+
48
+ In these cases the pipeline already stopped at the failing gate.
49
+
50
+ **Rollback on pipeline stop (all stop points):**
51
+ - **Interactive:** ask: keep changes / stash / discard (`git restore --staged --worktree .`; untracked new files must be removed separately)
52
+ - **Autonomous:** always stash (`git stash push -m "workflow:dev — stopped at [step]"`)
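
The autonomous safety conditions listed in the COMMIT step above can be read as a single gate. The sketch below is one possible encoding under stated assumptions: `PipelineState` and its field names are hypothetical, and the orchestrator is assumed to track these three facts itself.

```typescript
// Hypothetical pipeline state; the field names are illustrative, not from the package.
interface PipelineState {
  testFailedAtLimit: boolean; // TEST failed and fix limit reached
  verifyIncompleteAtLimit: boolean; // VERIFY incomplete and fix limit reached
  unresolvedCritical: number; // count of unresolved CRITICAL review issues
}

// Collect every reason that forbids an automatic commit.
function autoCommitBlockers(state: PipelineState): string[] {
  const blockers: string[] = [];
  if (state.testFailedAtLimit) blockers.push("TEST failed and fix limit reached");
  if (state.verifyIncompleteAtLimit) blockers.push("VERIFY incomplete and fix limit reached");
  if (state.unresolvedCritical > 0) blockers.push("unresolved CRITICAL review issue");
  return blockers;
}

// Auto-commit is allowed only when no blocker applies.
function canAutoCommit(state: PipelineState): boolean {
  return autoCommitBlockers(state).length === 0;
}
```

Returning the list of blockers, not only a boolean, makes it easy to put the stop reason into the stash message or the display block.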
@@ -0,0 +1,67 @@
1
+ # Step 3: PLAN_REVIEW
2
+
3
+ Launch **Explore** subagent:
4
+
5
+ ```
6
+ You are a plan reviewer. Check the plan for completeness, correctness, and risks.
7
+
8
+ ## Plan
9
+ [PLAN block from Step 2]
10
+
11
+ ## Context
12
+ [CONTEXT block from Step 1]
13
+
14
+ ## Conventions
15
+ [.dev-vault/conventions.md content]
16
+
17
+ ## Engineering Principles
18
+ - Single Responsibility, Dependency Rule (inward), explicit dependencies
19
+ - Fail fast at boundaries, every error path tested, no silent catch
20
+ - No TODO/FIXME, no debug logging, no hardcoded config
21
+ - Max 300 lines/file, 30 lines/function, composition over inheritance
22
+ - Test behaviour not implementation
23
+
24
+ ## Check criteria
25
+ 1. Completeness — all files accounted for? Missing dependencies?
26
+ 2. Conventions — matches project conventions?
27
+ 3. Order — correct sequence of changes?
28
+ 4. Tests — cover the changes?
29
+ 5. Deviations — justified?
30
+ 6. Risks — what could break? Edge cases?
31
+ 7. Architecture — correct layer? dependency direction inward? single responsibility?
32
+ 8. Production readiness — error handling for external calls? no TODOs? no hardcoded config?
33
+ 9. Simplicity — simpler approach that achieves the same? over-engineered?
34
+
35
+ ## Output Format
36
+ PLAN_REVIEW:
37
+ Verdict: [APPROVED / NEEDS_REVISION]
38
+ Issues:
39
+ - [issue + how to fix]
40
+ Missing:
41
+ - [what's missing]
42
+ Risks:
43
+ - [potential risk]
44
+ END_PLAN_REVIEW
45
+ ```
46
+
47
+ **Result:**
48
+
49
+ - APPROVED → save plan, then Step 4
50
+ - NEEDS_REVISION → pass remarks to PLAN agent, re-run Step 2 with remarks.
51
+
52
+ **Max revisions: 2.** After limit:
53
+ - **Interactive:** show warnings, ask user whether to proceed
54
+ - **Autonomous:** accept plan with warnings, proceed to Step 4
55
+
56
+ **Save approved PLAN to vault** (orchestrator writes directly after approval):
57
+
58
+ - **Phase mode:** save next to phase file as `<phase-file>.plan.md`
59
+ - **Normal mode:** save to `.dev-vault/plans/<date>-<slug>.md`
60
+
61
+ Display:
62
+
63
+ ```
64
+ ── PLAN_REVIEW ──
65
+ Verdict: APPROVED / NEEDS_REVISION
66
+ [If approved:] Plan saved → <path>
67
+ ```
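
The result handling and the revision limit in the PLAN_REVIEW step above amount to a small decision table. A minimal sketch follows, assuming the orchestrator counts how many revisions have already been requested; `nextAfterPlanReview` and the action names are hypothetical.

```typescript
type PlanVerdict = "APPROVED" | "NEEDS_REVISION";
type RunMode = "interactive" | "autonomous";
type PlanAction = "save_and_code" | "revise_plan" | "ask_user" | "accept_with_warnings";

// "Max revisions: 2" from the step description.
const MAX_PLAN_REVISIONS = 2;

// Decide what follows a PLAN_REVIEW verdict.
// revisionsSoFar is the number of revisions already requested (an assumption about bookkeeping).
function nextAfterPlanReview(
  verdict: PlanVerdict,
  revisionsSoFar: number,
  mode: RunMode,
): PlanAction {
  if (verdict === "APPROVED") return "save_and_code"; // save plan, then Step 4
  if (revisionsSoFar < MAX_PLAN_REVISIONS) return "revise_plan"; // re-run Step 2 with remarks
  return mode === "interactive" ? "ask_user" : "accept_with_warnings";
}
```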
@@ -0,0 +1,106 @@
1
+ # Step 2: PLAN
2
+
3
+ Launch **Explore** subagent:
4
+
5
+ ```
6
+ You are a planner agent. Create a detailed implementation plan.
7
+
8
+ ## Task
9
+ [task from user]
10
+
11
+ ## Context (from READ)
12
+ [CONTEXT block from Step 1]
13
+
14
+ ## Project Conventions
15
+ [.dev-vault/conventions.md content]
16
+
17
+ ## Architecture
18
+ [.dev-vault/knowledge.md — Architecture section]
19
+
20
+ ## Stack
21
+ [.dev-vault/stack.md content]
22
+
23
+ ## Gameplan
24
+ [.dev-vault/gameplan.md — current phase]
25
+
26
+ ## Engineering Principles
27
+ - Single Responsibility: one module/file = one reason to change
28
+ - Dependency Rule: inner layers never import from outer layers
29
+ - Explicit dependencies: constructor injection, no hidden globals
30
+ - Boundaries: validate at entry points, trust internal code
31
+ - Fail fast at boundaries, every error path tested, no silent catch
32
+ - External calls: always error handling + timeouts
33
+ - No TODO/FIXME, no debug logging, no hardcoded config
34
+ - Max 300 lines/file, 30 lines/function
35
+ - Composition over inheritance, no god objects
36
+ - Test behaviour not implementation, cover happy+edge+error paths
37
+
38
+ ## Rules
39
+ - STRICTLY follow project conventions (naming, structure, error handling)
40
+ - Each change tied to a specific file and location
41
+ - New files placed according to architecture
42
+ - Deviation from conventions — mark as DEVIATION with justification
43
+ - Include PSEUDO-CODE for each change — concrete enough for CODER to implement without guessing
44
+ - When adding dependencies: use context7 MCP (resolve-library-id → query-docs) to get current stable version. Specify exact version, not range
45
+
46
+ ## Output Format
47
+ PLAN:
48
+ Summary: [what we're doing — 1-2 sentences]
49
+ Scope: [small: 1-4 files / large: 5+ files]
50
+
51
+ Architecture:
52
+ Layer: [domain / infrastructure / presentation / API]
53
+ Boundaries: [where this change sits, what calls it, what it calls]
54
+ Dependencies: [new dependencies with direction →, justify each]
55
+ Error boundaries: [external calls, user input, invariants]
56
+
57
+ Changes:
58
+ 1. [file] — [what to change]
59
+ ```[language]
60
+ // after [anchor: function/line/class]
61
+ [pseudo-code or signature sketch]
62
+ ```
63
+
64
+ New files:
65
+ - [file] — [purpose]
66
+ ```[language]
67
+ [structure sketch: exports, key functions, types]
68
+ ```
69
+
70
+ Tests:
71
+ - [test file] — [what to test]
72
+ - happy path: [scenario]
73
+ - edge case: [scenario]
74
+ - error: [scenario]
75
+
76
+ Order:
77
+ 1. [file] — [why first]
78
+ 2. [file] — [depends on previous]
79
+
80
+ Deviations:
81
+ - [deviation + justification, or "None"]
82
+ END_PLAN
83
+ ```
84
+
85
+ **Phase mode addition:** if task is a phase file, add to prompt:
86
+ ```
87
+ You are planning a PHASE with multiple subtasks.
88
+ Break this into ordered implementation steps.
89
+ Each step must be completable in one CODER iteration.
90
+
91
+ Add to output:
92
+ Subtasks:
93
+ 1. [name]
94
+ Files: [list]
95
+ Tests: [list]
96
+ Depends on: [previous subtask number or "none"]
97
+ ```
98
+
99
+ Save PLAN block. Display:
100
+
101
+ ```
102
+ ── PLAN ──
103
+ [Summary]
104
+ Files: [N] change, [N] create, [N] tests
105
+ Scope: [small / large]
106
+ ```
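
In phase mode, each subtask in the PLAN output above names a previous subtask number or "none" in `Depends on`. One way an orchestrator might sanity-check that ordering is sketched below; the `Subtask` shape and `invalidDependencies` are hypothetical, and `null` stands in for "none".

```typescript
// Hypothetical in-memory form of one entry under "Subtasks:".
interface Subtask {
  id: number; // 1-based position in the plan
  name: string;
  dependsOn: number | null; // previous subtask number, or null for "none"
}

// Return the subtasks whose dependency is missing or does not point to an earlier subtask.
function invalidDependencies(subtasks: Subtask[]): Subtask[] {
  return subtasks.filter((task) => {
    if (task.dependsOn === null) return false;
    const dep = task.dependsOn;
    const exists = subtasks.some((other) => other.id === dep);
    return !exists || dep >= task.id;
  });
}
```

An empty result means the subtasks can run in listed order, one CODER iteration each.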
@@ -0,0 +1,50 @@
1
+ # Step 0: PREFLIGHT
2
+
3
+ Orchestrator runs directly (no subagent).
4
+
5
+ ## Phase mode: auto-create tasks
6
+
7
+ If argument is a phase file, call MCP tool `task_create_from_phase`:
8
+
9
+ ```
10
+ task_create_from_phase(phaseFile: "<path to phase file>")
11
+ ```
12
+
13
+ This parses `## Tasks` from the phase file and creates missing tasks automatically.
14
+ Returns: `{ created: [...], skipped: [...] }`.
15
+
16
+ Display the result before proceeding.
17
+
18
+ ## Baseline check
19
+
20
+ ```bash
21
+ git status -s # check for uncommitted changes
22
+ npm run build 2>&1 || true # baseline build (or cargo build, go build)
23
+ npm test 2>&1 || true # baseline tests
24
+ ```
25
+
26
+ Save results as BASELINE block:
27
+
28
+ ```
29
+ BASELINE:
30
+ Git: [clean / N uncommitted files]
31
+ Build: [pass / fail]
32
+ Tests: [N passed, M failed / no test command]
33
+ Lint: [pass / N warnings / no lint command]
34
+ END_BASELINE
35
+ ```
36
+
37
+ Display:
38
+
39
+ ```
40
+ ── PREFLIGHT ──
41
+ Git: clean / N uncommitted files
42
+ Build: pass / fail (baseline)
43
+ Tests: N passed / M already failing
44
+ ```
45
+
46
+ **If uncommitted changes:**
47
+ - **Interactive:** ask: stash / continue / abort
48
+ - **Autonomous:** continue (don't touch existing work)
49
+
50
+ **If tests already failing:** record the failing test names in BASELINE. The TEST step (Step 7) compares against this list, so only NEW failures are the coder's responsibility.
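
The baseline comparison described above is a set difference on test names. A minimal sketch, assuming failing tests can be identified by a stable name string (how names are extracted from the test runner's output is not specified in the diff); `newFailures` is a hypothetical helper.

```typescript
// Return the tests that fail now but were not already failing in BASELINE.
function newFailures(baselineFailing: string[], currentFailing: string[]): string[] {
  return currentFailing.filter((name) => baselineFailing.indexOf(name) === -1);
}
```

For example, with a baseline of `["auth: expired token"]` and a current run failing `["auth: expired token", "cli: parses flags"]`, only `"cli: parses flags"` would be sent to CODER.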
@@ -0,0 +1,35 @@
1
+ # Engineering Principles
2
+
3
+ Every agent in this pipeline receives these principles as a baseline quality bar.
4
+ Project-specific conventions (.dev-vault/conventions.md) override where they conflict.
5
+
6
+ ## Architecture
7
+ - Single Responsibility: one module/file = one reason to change
8
+ - Dependency Rule: inner layers never import from outer layers
9
+ - Explicit dependencies: constructor/parameter injection, no hidden globals or singletons
10
+ - Boundaries: validate and sanitize at system entry points, trust internal code
11
+
12
+ ## Error handling
13
+ - Fail fast at boundaries, recover gracefully inside
14
+ - Every error path must be tested
15
+ - No silent swallowing: catch → handle or propagate, never empty catch
16
+ - External calls (network, FS, DB) always have error handling and timeouts
17
+
18
+ ## Production readiness
19
+ - No TODO/FIXME/HACK in committed code
20
+ - No debug logging (console.log/print) — use structured logging
21
+ - No hardcoded values that should be config or constants
22
+ - Idempotent operations where possible
23
+
24
+ ## Code structure
25
+ - Max 300 lines per file, max 30 lines per function
26
+ - Extract when reused 2+ times OR > 5 lines of non-trivial logic
27
+ - Composition over inheritance
28
+ - No god objects, no utility dumps (helpers/, utils/, misc/)
29
+ - Types and names replace comments — if code needs a comment, rename or extract
30
+
31
+ ## Testing
32
+ - Test behaviour, not implementation details
33
+ - One logical assertion per test
34
+ - No shared mutable state between tests
35
+ - Cover: happy path, edge cases (empty, null, boundary), error paths
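
As an illustration of two of the principles above (explicit dependencies through constructor injection, and testing behaviour instead of implementation), here is a small sketch. `Clock` and `SessionPolicy` are invented for this example and do not appear in the package.

```typescript
// The dependency is an explicit interface, not a hidden global such as Date.now().
interface Clock {
  now(): number; // current time in milliseconds
}

class SessionPolicy {
  private readonly clock: Clock;
  private readonly ttlMs: number;

  // Both collaborators arrive through the constructor, so callers and tests can see them.
  constructor(clock: Clock, ttlMs: number) {
    this.clock = clock;
    this.ttlMs = ttlMs;
  }

  // A session is expired once its age reaches the configured time-to-live.
  isExpired(startedAtMs: number): boolean {
    return this.clock.now() - startedAtMs >= this.ttlMs;
  }
}
```

A test can pass a fixed clock such as `{ now: () => 1000 }` and assert only on the result of `isExpired`, covering the boundary case (age equal to the TTL) without touching private fields.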
@@ -0,0 +1,39 @@
1
+ # Step 1: READ
2
+
3
+ Launch **Explore** subagent with this prompt:
4
+
5
+ ```
6
+ You are a reader agent. Gather context for the task below.
7
+
8
+ ## Task
9
+ [task from user]
10
+
11
+ ## Project Context
12
+ [vault sections: stack.md, conventions.md, knowledge.md, gameplan.md]
13
+
14
+ ## Procedure
15
+ 1. Read CLAUDE.md for project instructions
16
+ 2. Find files relevant to the task (Glob/Grep)
17
+ 3. Read relevant files (max 10 files, 500 lines each)
18
+ 4. Find dependencies and tests for those files
19
+ 5. Find how similar things are done in the project
20
+
21
+ ## Output Format
22
+ CONTEXT:
23
+ Task: [reformulated task with project context]
24
+ Files to change: [file list with what to change]
25
+ Dependencies: [files depending on changes]
26
+ Tests: [existing tests for those files]
27
+ Patterns found: [how similar things are solved]
28
+ Relevant code: [key fragments]
29
+ END_CONTEXT
30
+ ```
31
+
32
+ Save CONTEXT block. Display:
33
+
34
+ ```
35
+ ── READ ──
36
+ Files to change: [N]
37
+ Dependencies: [N]
38
+ Tests: [N]
39
+ ```
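
The READ procedure above caps context at 10 files of 500 lines each. If an orchestrator wanted to enforce that cap itself instead of relying on the agent, a sketch might look like this; `SourceFile` and `capContext` are hypothetical names, and files are assumed to be already ordered by relevance.

```typescript
// Hypothetical in-memory form of a file gathered for the CONTEXT block.
interface SourceFile {
  path: string;
  content: string;
}

const MAX_FILES = 10; // "max 10 files"
const MAX_LINES = 500; // "500 lines each"

// Keep the first MAX_FILES files and the first MAX_LINES lines of each.
function capContext(files: SourceFile[]): SourceFile[] {
  return files.slice(0, MAX_FILES).map((file) => ({
    path: file.path,
    content: file.content.split("\n").slice(0, MAX_LINES).join("\n"),
  }));
}
```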
@@ -0,0 +1,168 @@
1
+ # Step 5: REVIEW (3 specialized reviewers in parallel)
2
+
3
+ Before launching reviewers, orchestrator runs `git diff` to capture actual changes.
4
+ Pass BOTH the CODE_DONE summary AND the real diff to each reviewer.
5
+
6
+
7
+ Launch **3 Explore subagents in parallel** (one Agent call with 3 tool uses):
8
+
9
+ ## REVIEW:security
10
+
11
+ ```
12
+ You are a SECURITY reviewer. NEVER modify code — only report issues.
13
+ Focus EXCLUSIVELY on security. Ignore style, naming, structure.
14
+
15
+ ## What coder did
16
+ [CODE_DONE or CODE_FIX block — summary]
17
+
18
+ ## Actual diff
19
+ [git diff output — the real changes]
20
+
21
+ ## Security guidelines
22
+ [.dev-vault/knowledge.md — Security section]
23
+
24
+ ## Check (security ONLY)
25
+ - Injection (SQL, command, path traversal)
26
+ - XSS (unescaped user input)
27
+ - Hardcoded secrets, API keys, credentials
28
+ - Missing authentication/authorization
29
+ - Insecure deserialization
30
+ - Missing input validation at system boundaries
31
+ - Timing attacks, race conditions
32
+
33
+ ## Severity
34
+ CRITICAL: vulnerability, data loss
35
+ HIGH: missing auth, missing validation on boundary
36
+ MEDIUM: defense-in-depth improvement
37
+ LOW: theoretical risk
38
+
39
+ ## Output Format
40
+ REVIEW_SECURITY:
41
+ Verdict: [PASS / FAIL]
42
+ Issues:
43
+ - [SEVERITY]: [file]:[line] — [issue + fix]
44
+ END_REVIEW_SECURITY
45
+ ```
46
+
47
+ ## REVIEW:quality
48
+
49
+ ```
50
+ You are a QUALITY reviewer. NEVER modify code — only report issues.
51
+ Focus EXCLUSIVELY on code quality and conventions. Ignore security.
52
+
53
+ ## Plan
54
+ [PLAN block]
55
+
56
+ ## What coder did
57
+ [CODE_DONE or CODE_FIX block — summary]
58
+
59
+ ## Actual diff
60
+ [git diff output — the real changes]
61
+
62
+ ## Conventions
63
+ [.dev-vault/conventions.md content]
64
+
65
+ ## Engineering Principles
66
+ - Single Responsibility, Dependency Rule (inward), explicit dependencies
67
+ - Fail fast at boundaries, every error path tested, no silent catch
68
+ - No TODO/FIXME, no debug logging, no hardcoded config
69
+ - Max 300 lines/file, 30 lines/function, composition over inheritance
70
+ - No god objects, no utility dumps (helpers/, utils/)
71
+ - Test behaviour not implementation
72
+
73
+ ## Check (quality ONLY)
74
+ - Plan adherence — everything implemented? Nothing extra?
75
+ - Conventions — naming, error handling, structure per project
76
+ - Architecture — single responsibility? correct layer? dependency direction inward?
77
+ - God objects — does any file/class know too much or do too many things?
78
+ - Abstractions — premature (interface with one impl)? missing (pattern repeated 3+ times)?
79
+ - Production readiness — TODOs? debug logging? hardcoded config? missing timeouts?
80
+ - Duplication — DRY violations
81
+ - Complexity — unnecessary abstractions, over-engineering
82
+ - Dead code — unused imports, unreachable branches
83
+ - Edge cases — null/undefined, empty arrays, boundary values
84
+
85
+ ## Severity
86
+ CRITICAL: logic bug, data loss
87
+ HIGH: convention violation, plan deviation
88
+ MEDIUM: quality improvement
89
+ LOW: style nit
90
+
91
+ ## Output Format
92
+ REVIEW_QUALITY:
93
+ Verdict: [PASS / FAIL]
94
+ Issues:
95
+ - [SEVERITY]: [file]:[line] — [issue + fix]
96
+ END_REVIEW_QUALITY
97
+ ```
98
+
99
+ ## REVIEW:coverage
100
+
101
+ ```
102
+ You are a TEST COVERAGE reviewer. NEVER modify code — only report issues.
103
+ Focus EXCLUSIVELY on test adequacy. Ignore security and style.
104
+
105
+ ## Plan
106
+ [PLAN block — Tests section]
107
+
108
+ ## What coder did
109
+ [CODE_DONE or CODE_FIX block — summary]
110
+
111
+ ## Actual diff
112
+ [git diff output — the real changes]
113
+
114
+ ## Check (coverage ONLY)
115
+ - All planned tests written?
116
+ - Happy path covered?
117
+ - Edge cases covered? (empty input, boundary values, null)
118
+ - Error paths covered? (network failure, invalid input, permissions)
119
+ - Assertions meaningful? (not just "no throw")
120
+ - Test isolation? (no shared state between tests)
121
+
122
+ ## Severity
123
+ CRITICAL: core logic untested
124
+ HIGH: missing edge case test for public API
125
+ MEDIUM: missing error path test
126
+ LOW: test could be more descriptive
127
+
128
+ ## Output Format
129
+ REVIEW_COVERAGE:
130
+ Verdict: [PASS / FAIL]
131
+ Issues:
132
+ - [SEVERITY]: [file]:[line] — [issue + fix]
133
+ END_REVIEW_COVERAGE
134
+ ```
135
+
136
+ ## Aggregate
137
+
138
+ Merge all 3 REVIEW blocks into one verdict:
139
+ - Any CRITICAL or HIGH from ANY reviewer → **CHANGES_REQUESTED**
140
+ - All PASS with only MEDIUM/LOW → **APPROVED**
141
+
142
+ **Extract vault-worthy findings:**
143
+ - Gotchas → append to `.dev-vault/knowledge.md` section "Gotchas"
144
+ - Architecture concerns → append to `.dev-vault/knowledge.md` section "Architecture"
145
+ - New conventions → append to `.dev-vault/conventions.md` section "Patterns"
146
+ Only findings useful for future sessions. Not bugs (fixed by coder), not style nits.
147
+
148
+ Display:
149
+
150
+ ```
151
+ ── REVIEW (iteration [N]) ──
152
+ Security: PASS / FAIL [Critical: N, High: N]
153
+ Quality: PASS / FAIL [Critical: N, High: N]
154
+ Coverage: PASS / FAIL [Critical: N, High: N]
155
+ Verdict: APPROVED / CHANGES_REQUESTED
156
+ ```
157
+
158
+ ## CODER↔REVIEW loop
159
+
160
+ **APPROVED** → Step 7 (TEST).
161
+
162
+ **CHANGES_REQUESTED** → read steps/coder.md, launch CODER in fix mode. Then re-review.
163
+
164
+ **Limit: 3 iterations.**
165
+
166
+ After limit:
167
+ - **Interactive:** ask: accept and commit / stop without commit
168
+ - **Autonomous:** stop without commit, stash changes.
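
The Aggregate rule in the REVIEW step above (any CRITICAL or HIGH from any reviewer means CHANGES_REQUESTED) can be sketched as a scan over the three reviewer blocks. This is an illustration under assumptions: it relies only on issue lines starting with `- [SEVERITY]:` as in the output format, it ignores the reviewers' own `Verdict:` lines, and `severitiesIn` and `aggregate` are hypothetical names.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";
type ReviewVerdict = "APPROVED" | "CHANGES_REQUESTED";

// Collect the severity of every issue bullet in one REVIEW_* block.
function severitiesIn(reviewBlock: string): Severity[] {
  const found: Severity[] = [];
  const pattern = /^\s*-\s*(CRITICAL|HIGH|MEDIUM|LOW):/gm;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(reviewBlock)) !== null) {
    found.push(m[1] as Severity);
  }
  return found;
}

// Merge all reviewer blocks into one verdict.
function aggregate(reviewBlocks: string[]): ReviewVerdict {
  for (const block of reviewBlocks) {
    for (const severity of severitiesIn(block)) {
      if (severity === "CRITICAL" || severity === "HIGH") return "CHANGES_REQUESTED";
    }
  }
  return "APPROVED";
}
```

The same per-block severity lists can feed the `[Critical: N, High: N]` counters in the display block.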
@@ -0,0 +1,38 @@
1
+ # Step 7: TEST (mandatory gate)
2
+
3
+ Orchestrator runs build and test commands directly (no subagent):
4
+
5
+ ```bash
6
+ npm run build # or cargo build, go build — must pass
7
+ npm run lint # if configured — must pass
8
+ npm test # must pass
9
+ ```
10
+
11
+ Detect test command from `.dev-vault/stack.md` or `package.json` / `Cargo.toml` / `Makefile`.
12
+
13
+ **Compare against BASELINE from Step 0:** if a test was already failing before the pipeline started, it is NOT a new failure. Only count failures that are NOT in BASELINE as the coder's responsibility.
14
+
15
+ **If any command fails:**
16
+
17
+ ```
18
+ ── TEST ──
19
+ FAIL: [command]
20
+ [error output — last 50 lines]
21
+ Sending to CODER for fix...
22
+ ```
23
+
24
+ Pass error output to CODER as a fix iteration (same as REVIEW CHANGES_REQUESTED).
25
+ After CODER fix → re-run TEST. **Max 3 TEST iterations.**
26
+
27
+ After limit:
28
+ - **Interactive:** show error, ask user whether to commit anyway or stop
29
+ - **Autonomous:** stop without commit. Failing tests = no commit.
30
+
31
+ **If all pass:**
32
+
33
+ ```
34
+ ── TEST ──
35
+ Build: passed
36
+ Lint: passed (or skipped)
37
+ Tests: passed (N tests)
38
+ ```
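
The failure handling in the TEST step above has two mechanical parts: trimming the error output to its last 50 lines, and deciding what happens once the iteration limit is reached. A minimal sketch, assuming `iteration` counts TEST runs so far starting at 1; `lastLines` and `afterTestFailure` are hypothetical names.

```typescript
type TestMode = "interactive" | "autonomous";
type TestAction = "send_to_coder" | "ask_user" | "stop_without_commit";

// "Max 3 TEST iterations" from the step description.
const MAX_TEST_ITERATIONS = 3;

// Keep only the last `count` lines of command output for the FAIL display.
function lastLines(output: string, count: number): string {
  const lines = output.replace(/\r?\n$/, "").split("\n");
  return lines.slice(Math.max(lines.length - count, 0)).join("\n");
}

// Decide what follows a failed TEST run.
function afterTestFailure(iteration: number, mode: TestMode): TestAction {
  if (iteration < MAX_TEST_ITERATIONS) return "send_to_coder";
  return mode === "interactive" ? "ask_user" : "stop_without_commit";
}
```

Calling `lastLines(errorOutput, 50)` would give the excerpt shown under `FAIL: [command]`.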