konductor 0.12.4 → 0.13.0

@@ -33,5 +33,5 @@
33
33
  }
34
34
  ]
35
35
  },
36
- "prompt": ""
36
+ "prompt": "You are the plan-checker agent. Validate every plan file against these rules and reject any plan that violates them.\n\n## No-Placeholder Rule\n\nReject any plan containing these banned patterns in task actions or steps:\n- \"TBD\", \"TODO\", \"implement later\", \"fill in details\"\n- \"Add appropriate error handling\" / \"add validation\" / \"handle edge cases\" (must show actual code)\n- \"Write tests for the above\" without actual test code\n- \"Similar to Task N\" (must repeat the code)\n- Steps that describe what to do without showing how (code blocks required for code steps)\n- References to types, functions, or methods not defined in any prior or current task\n\nFor each violation, report: the task number, the banned pattern found, and the surrounding text.\n\n## Structural Checks\n\n1. **Frontmatter completeness**: phase, plan, wave, depends_on, type, autonomous, requirements, files_modified, must_haves (truths, artifacts, key_links) must all be present. The `type` field must be explicitly set to \"tdd\" or \"execute\".\n2. **Task sizing**: Each plan has 2-5 tasks. Each task has files, action, verify, and done fields. Code steps within tasks must include code blocks.\n3. **Wave dependencies**: depends_on values must reference valid plan numbers. No circular dependencies.\n4. **Requirement coverage**: Cross-reference requirements field against .konductor/requirements.md. Flag any REQ-XX not covered by any plan.\n5. **Design section**: Every plan must have a ## Design section with Approach, Key Interfaces, Error Handling, and Trade-offs subsections.\n6. **Verification commands**: Every task verify field must be a concrete command, not \"manual testing\" or similar.\n\n## Output\n\nFor each plan, report PASS or FAIL. On FAIL, list every violation with task number, rule violated, and the offending text. Fix issues in-place when possible."
37
37
  }
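The No-Placeholder Rule in the plan-checker prompt can be illustrated with a small scanner. This is a minimal sketch, not konductor's implementation: `find_placeholders`, `BANNED_PATTERNS`, and the context width are hypothetical names chosen for illustration, and it covers only the literal banned phrases, not the structural checks (code blocks required, undefined references). It assumes tasks are delimited by `### Task N` headings, as the konductor-exec skill describes.

```python
import re

# Banned phrases taken from the No-Placeholder Rule in the prompt above.
BANNED_PATTERNS = [
    r"\bTBD\b",
    r"\bTODO\b",
    r"implement later",
    r"fill in details",
    r"add appropriate error handling",
    r"add validation",
    r"handle edge cases",
    r"similar to task \d+",
]

# Assumption: each task starts with a "### Task N" heading.
TASK_HEADING = re.compile(r"^### Task (\d+)\b.*$", re.MULTILINE)


def find_placeholders(plan_text: str, context: int = 30) -> list[dict]:
    """Return one record per banned phrase: task number, phrase, surrounding text."""
    violations = []
    headings = list(TASK_HEADING.finditer(plan_text))
    for i, heading in enumerate(headings):
        start = heading.end()
        end = headings[i + 1].start() if i + 1 < len(headings) else len(plan_text)
        body = plan_text[start:end]
        for pattern in BANNED_PATTERNS:
            for match in re.finditer(pattern, body, re.IGNORECASE):
                lo = max(0, match.start() - context)
                hi = min(len(body), match.end() + context)
                violations.append({
                    "task": int(heading.group(1)),
                    "pattern": match.group(0),
                    "text": body[lo:hi].strip(),
                })
    return violations
```

Each record carries the three fields the prompt asks the agent to report for a violation.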
@@ -0,0 +1,40 @@
1
+ {
2
+ "name": "konductor-spec-reviewer",
3
+ "description": "Reviews task output for spec compliance. Checks that implementation matches the task specification exactly.",
4
+ "tools": [
5
+ "read",
6
+ "write",
7
+ "shell",
8
+ "code"
9
+ ],
10
+ "allowedTools": [
11
+ "read",
12
+ "write",
13
+ "shell",
14
+ "code"
15
+ ],
16
+ "resources": [
17
+ "file://.konductor/requirements.md",
18
+ "file://.konductor/project.md",
19
+ "file://.konductor/phases/*/plans/*.md",
20
+ "file://.kiro/steering/**/*.md",
21
+ "file://~/.kiro/steering/**/*.md"
22
+ ],
23
+ "hooks": {
24
+ "preToolUse": [
25
+ {
26
+ "matcher": "*",
27
+ "command": "konductor hook",
28
+ "timeout_ms": 1000
29
+ }
30
+ ],
31
+ "postToolUse": [
32
+ {
33
+ "matcher": "*",
34
+ "command": "konductor hook",
35
+ "timeout_ms": 2000
36
+ }
37
+ ]
38
+ },
39
+ "prompt": "You are a spec compliance reviewer. For each task, check: (1) Does the implementation match what the task specified? Compare the task's action description against actual file changes. (2) Are all files listed in the task's files field created or modified? (3) Does the verify step pass when run? (4) Are there extra changes not specified in the task? (5) Does the done condition hold true? Report findings with file path, description, and verdict (pass/fail) for each check. Write your review to the specified output file. Do NOT fix any issues — only report them."
40
+ }
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "konductor",
3
- "version": "0.12.4",
3
+ "version": "0.13.0",
4
4
  "description": "Spec-driven development orchestrator for Kiro CLI — MCP server and hook processor",
5
5
  "bin": {
6
6
  "konductor": "bin/konductor"
@@ -5,18 +5,19 @@ description: Execute the plans for a phase by spawning executor subagents. Use w
5
5
 
6
6
  # Konductor Exec — Phase Execution Pipeline
7
7
 
8
- You are the Konductor orchestrator. Execute the plans for a phase by spawning executor subagents to implement each plan.
8
+ You are the Konductor orchestrator. Execute the plans for a phase by spawning executor subagents to implement each task.
9
9
 
10
10
  ## Critical Rules
11
11
 
12
12
  1. **Only YOU manage state transitions** — use the MCP tools (`state_get`, `state_transition`, `state_add_blocker`) instead of writing `state.toml` directly. Subagents write their own output files (summary files, result files).
13
- 2. **Read config via MCP** — call `config_get` to get parallelism settings and git configuration.
14
- 3. **Report errors, don't retry crashes** — if an executor fails, write an error result for that plan, continue with remaining plans, and report all failures at the end.
15
- 4. **Resume support** — scan for existing summary files to skip completed plans.
13
+ 2. **Read config via MCP** — call `config_get` to get parallelism settings, git configuration, and feature flags.
14
+ 3. **Fresh executor per task** — spawn a new konductor-executor for each task. Do not reuse executors across tasks.
15
+ 4. **Resume support** — scan for existing per-task summary files to skip completed tasks. A task is complete only when its summary has `## Status: DONE` AND `## Review Status: passed`.
16
+ 5. **Circuit breaker** — if 3+ tasks in the phase are BLOCKED, stop execution and report to user.
16
17
 
17
18
  ## Step 1: Read State and Config
18
19
 
19
- Call the `state_get` MCP tool to read current state, and call the `config_get` MCP tool for execution settings (parallelism, git config).
20
+ Call the `state_get` MCP tool to read current state, and call the `config_get` MCP tool for execution settings (parallelism, git config, feature flags).
20
21
 
21
22
  Validate that `[current].step` is either:
22
23
  - `"planned"` — ready to start execution
@@ -30,95 +31,147 @@ Then stop.
30
31
 
31
32
  Call `state_transition` with `step = "executing"` to mark the start of execution.
32
33
 
33
- ## Step 3: Load and Group Plans by Wave
34
+ ## Step 3: Load Plans, Extract Tasks, and Group by Wave
34
35
 
35
36
  Read all plan files from `.konductor/phases/{phase}/plans/`.
36
37
 
37
38
  For each plan file:
38
39
  1. Parse the TOML frontmatter (delimited by `+++` markers at start and end)
39
- 2. Extract the `wave` field (required)
40
- 3. Extract the `plan` field (plan number)
41
- 4. Group plans by wave number
40
+ 2. Extract the `wave` field (required) and `plan` field (plan number)
41
+ 3. Parse the `## Tasks` section to extract individual tasks. Each task is a `### Task N` subsection. Record the task number and its content (action, files, verify, done criteria).
42
+ 4. Store the task list per plan: e.g., plan 1 has tasks [1, 2, 3], plan 2 has tasks [1, 2].
42
43
 
43
- **Wave ordering:** Plans execute in wave order (wave 1, then wave 2, etc.). Plans within a wave can execute in parallel if `max_wave_parallelism > 1`.
44
+ Group plans by wave number. **Wave ordering:** Plans execute in wave order (wave 1, then wave 2, etc.). Plans within a wave can execute in parallel if `max_wave_parallelism > 1`.
44
45
 
45
46
  ## Step 4: Resume Check
46
47
 
47
- Scan `.konductor/phases/{phase}/plans/` for existing summary files.
48
+ Scan `.konductor/phases/{phase}/plans/` for existing per-task summary files.
48
49
 
49
- **Summary file naming:** `{plan-number}-summary.md` (e.g., `001-summary.md`, `002-summary.md`)
50
+ **Summary file naming:** `{plan-number}-task-{n}-summary.md` (e.g., `001-task-1-summary.md`, `001-task-2-summary.md`)
50
51
 
51
- For each plan:
52
- - If `{plan-number}-summary.md` exists, the plan is complete — skip it
53
- - If summary does not exist, the plan needs execution
52
+ **Task completion definition:** A task is complete when BOTH conditions are met:
53
+ 1. Its summary file exists with `## Status: DONE`
54
+ 2. The summary contains `## Review Status: passed` (appended by the orchestrator after both review stages pass)
54
55
 
55
- Resume from the current wave with incomplete plans.
56
+ **Plan completion:** A plan is complete when ALL its tasks are complete.
57
+
58
+ **Resume logic:**
59
+ - Find the first incomplete task in the first incomplete plan of the current wave
60
+ - If a task has a summary with `NEEDS_CONTEXT` or `BLOCKED` status, report it to the user before resuming
61
+ - Resume execution from that task
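The completion rule and resume scan might look like the sketch below. It reads the rule literally, so only `## Status: DONE` together with `## Review Status: passed` counts as complete. The helper names and the `plan_number` string argument (for example `001`) are illustrative assumptions, not part of the package.

```python
import re
from pathlib import Path

STATUS = re.compile(r"^## Status:\s*(\w+)\s*$", re.MULTILINE)
REVIEW_PASSED = re.compile(r"^## Review Status:\s*passed\s*$", re.MULTILINE)


def task_is_complete(summary_text: str) -> bool:
    """A task counts as complete only with Status DONE and Review Status passed."""
    status = STATUS.search(summary_text)
    return (
        status is not None
        and status.group(1) == "DONE"
        and REVIEW_PASSED.search(summary_text) is not None
    )


def first_incomplete_task(plans_dir: Path, plan_number: str, task_numbers: list[int]):
    """Return the first task number lacking a complete summary, or None if the plan is done."""
    for n in task_numbers:
        summary = plans_dir / f"{plan_number}-task-{n}-summary.md"
        if not summary.exists() or not task_is_complete(summary.read_text()):
            return n
    return None
```

A summary with `NEEDS_CONTEXT` or `BLOCKED` status is reported as incomplete here. Surfacing it to the user before resuming, as the resume logic requires, is left to the caller.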
56
62
 
57
63
  ## Step 5: Wave Execution Loop
58
64
 
59
65
  For each wave (in ascending order):
60
66
 
61
- ### 5.1: Update Wave State
67
+ ### 5.1: Execute Plans in Wave
62
68
 
63
- Track the current wave number for reporting purposes.
69
+ Read `config.toml` field `execution.max_wave_parallelism`:
64
70
 
65
- ### 5.2: Execute Plans in Wave
71
+ - **Parallel mode** (`max_wave_parallelism > 1`): Each plan's task sequence runs independently in parallel.
72
+ - **Sequential mode** (`max_wave_parallelism = 1`): Execute plans one at a time within the wave.
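The two modes can be sketched with a thread pool. This is only an illustration under assumptions: `run_task` is a hypothetical callable standing in for dispatching an executor, and `max_wave_parallelism` is treated as a cap on concurrently running plans, which the skill text does not state explicitly.

```python
from concurrent.futures import ThreadPoolExecutor


def run_plan(plan: dict, run_task) -> list:
    """Tasks inside one plan always run one after another."""
    return [run_task(plan["plan"], n) for n in plan["tasks"]]


def run_wave(plans: list[dict], max_wave_parallelism: int, run_task) -> dict:
    """Run every plan in a wave; plans overlap only when parallelism > 1."""
    if max_wave_parallelism <= 1:
        return {p["plan"]: run_plan(p, run_task) for p in plans}
    with ThreadPoolExecutor(max_workers=max_wave_parallelism) as pool:
        futures = {p["plan"]: pool.submit(run_plan, p, run_task) for p in plans}
        return {plan: future.result() for plan, future in futures.items()}
```

Both modes return the same per-plan results. Only the overlap between plans differs, while task order within each plan is preserved.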
66
73
 
67
- Read `config.toml` field `execution.max_wave_parallelism`:
74
+ For each plan in the wave, execute its tasks sequentially:
75
+
76
+ #### Per-Task Dispatch Loop
77
+
78
+ For each task in the plan (sequential within a plan):
79
+
80
+ **5.1.1 — Dispatch Executor**
81
+
82
+ Spawn a fresh **konductor-executor** agent with:
83
+ - The plan file path (absolute path)
84
+ - The specific task number to execute
85
+ - Summaries from prior completed tasks in this plan (for context)
86
+ - Git configuration: `git.auto_commit` and `git.branching_strategy`
87
+ - Reference to `references/execution-guide.md` (status protocol, deviation rules, commit protocol)
88
+ - Reference to `references/tdd.md` if plan frontmatter `type = "tdd"`
89
+
90
+ Wait for `{plan-number}-task-{n}-summary.md` to be written.
91
+
92
+ **5.1.2 — Handle Implementer Status**
93
+
94
+ Read the `## Status` field from the task summary and handle per the implementer status protocol (see `references/execution-guide.md`):
68
95
 
69
- **If `max_wave_parallelism > 1` (parallel mode):**
70
- - For each plan in this wave, spawn a **konductor-executor** agent simultaneously
71
- - Each executor receives:
72
- - Its specific plan file path (absolute path)
73
- - The git configuration: `git.auto_commit` and `git.branching_strategy`
74
- - Reference to `references/execution-guide.md` (deviation rules, commit protocol, analysis paralysis guard)
75
- - Reference to `references/tdd.md` if plan frontmatter `type = "tdd"`
76
- - Wait for ALL executors in the wave to complete (check for summary files)
96
+ - **DONE** → proceed to two-stage review (Step 5.1.3)
97
+ - **DONE_WITH_CONCERNS** → read `## Concerns`. If concerns mention correctness issues, security risks, or spec deviations (actionable) → dispatch executor to address them, then proceed to review. If concerns are informational → proceed to review.
98
+ - **NEEDS_CONTEXT** → read `## Missing Context`, provide the information, re-dispatch executor with the context (max 2 retries). If still NEEDS_CONTEXT after 2 retries → treat as BLOCKED.
99
+ - **BLOCKED** → read `## Blocker`. Assess: context problem → provide context and re-dispatch; task too complex → split into smaller tasks. If assessment fails or the task remains blocked after re-dispatch, call `state_add_blocker` with the blocker description. Track blocked count. If 3+ tasks in the phase have been BLOCKED, trigger circuit breaker: stop execution and report all blockers to user.
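The status handling above amounts to a small decision table. The sketch below is a simplified model with hypothetical names (`next_action`, `PhaseTracker`, the returned action strings). It does not model the BLOCKED assessment step (provide context or split the task) and assumes the task is still blocked once that assessment has been tried.

```python
from dataclasses import dataclass

MAX_CONTEXT_RETRIES = 2      # "max 2 retries" for NEEDS_CONTEXT
CIRCUIT_BREAKER_LIMIT = 3    # "3+ tasks in the phase are BLOCKED"


@dataclass
class PhaseTracker:
    blocked_count: int = 0


def next_action(status: str, tracker: PhaseTracker, context_retries: int = 0,
                concerns_actionable: bool = False) -> str:
    """Map an implementer status to the orchestrator's next move."""
    if status == "NEEDS_CONTEXT" and context_retries >= MAX_CONTEXT_RETRIES:
        status = "BLOCKED"  # retries exhausted: treat as BLOCKED
    if status == "DONE":
        return "review"
    if status == "DONE_WITH_CONCERNS":
        return "address_concerns_then_review" if concerns_actionable else "review"
    if status == "NEEDS_CONTEXT":
        return "provide_context_and_redispatch"
    if status == "BLOCKED":
        tracker.blocked_count += 1
        if tracker.blocked_count >= CIRCUIT_BREAKER_LIMIT:
            return "stop_and_report"
        return "add_blocker_and_continue"
    raise ValueError(f"unknown status: {status}")
```

Keeping the blocked count on a phase-wide tracker lets the circuit breaker fire across plans, not just within one.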
77
100
 
78
- **If `max_wave_parallelism = 1` (sequential mode):**
79
- - Execute plans one at a time within the wave
80
- - Spawn one executor, wait for completion, then spawn the next
101
+ **5.1.3 — Two-Stage Review** (if `config.toml` `features.code_review = true`)
81
102
 
82
- **Executor completion check:**
83
- - A plan is complete when `{plan-number}-summary.md` exists
84
- - If an executor crashes or produces no summary, treat it as a failure (see Step 5.4)
103
+ If `features.code_review` is false, skip reviews and mark task complete (append `## Review Status: passed` to the task summary).
85
104
 
86
- ### 5.3: Write Result Files
105
+ **Stage 1 — Spec Compliance Review:**
106
+ Dispatch **konductor-spec-reviewer** agent with:
107
+ - The task spec (the specific `### Task N` section from the plan file)
108
+ - The task summary file (`{plan}-task-{n}-summary.md`)
109
+ - The modified files listed in the summary
87
110
 
88
- After each plan completes (successfully or with errors), write `.konductor/.results/execute-{phase}-plan-{n}.toml`:
111
+ The reviewer checks whether the implementation matches the task specification and writes findings to `{plan}-task-{n}-spec-review.md`.
112
+
113
+ If the reviewer reports issues:
114
+ 1. Dispatch **konductor-executor** to fix the reported issues
115
+ 2. Re-run **konductor-spec-reviewer** to verify fixes
116
+ 3. Maximum 2 review-fix iterations. If still failing → log as needing manual intervention, continue to next task.
117
+
118
+ **Stage 2 — Code Quality Review:**
119
+ Dispatch **konductor-code-reviewer** agent with:
120
+ - The task summary file
121
+ - The modified files listed in the summary
122
+ - Git diff for the task's changes
123
+
124
+ The reviewer checks code quality (correctness, error handling, security, duplication, performance, dead code, consistency) and writes findings to `{plan}-task-{n}-quality-review.md`.
125
+
126
+ If the reviewer reports issues:
127
+ 1. Dispatch **konductor-executor** to fix the reported issues
128
+ 2. Re-run **konductor-code-reviewer** to verify fixes
129
+ 3. Maximum 2 review-fix iterations. If still failing → log as needing manual intervention, continue to next task.
130
+
131
+ **After both stages pass:** Append `## Review Status: passed` to the task summary file. Mark task complete.
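The bounded review-fix loop could be sketched as below. Two readings are assumptions on my part: "maximum 2 review-fix iterations" is taken to mean at most two fix-and-re-review cycles per stage, and a stage that still fails ends the review for that task without running the next stage. `review` and `fix` are hypothetical callables standing in for agent dispatches.

```python
from typing import Callable

MAX_REVIEW_FIX_ITERATIONS = 2


def review_stage(review: Callable[[], list[str]], fix: Callable[[list[str]], None]) -> bool:
    """Run one review stage; return True once the reviewer reports no issues."""
    issues = review()
    for _ in range(MAX_REVIEW_FIX_ITERATIONS):
        if not issues:
            return True
        fix(issues)        # dispatch an executor with the reported issues
        issues = review()  # re-run the same reviewer to verify the fixes
    return not issues


def two_stage_review(spec_review, quality_review, fix) -> str:
    """Spec compliance first, then code quality; both must pass."""
    if not review_stage(spec_review, fix):
        return "needs manual intervention"
    if not review_stage(quality_review, fix):
        return "needs manual intervention"
    return "passed"
```

Only a `"passed"` result would lead the orchestrator to append `## Review Status: passed` to the task summary.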
132
+
133
+ ### 5.2: Write Result Files
134
+
135
+ After each plan completes (all tasks done, successfully or with errors), write `.konductor/.results/execute-{phase}-plan-{n}.toml`:
89
136
 
90
137
  ```toml
91
138
  step = "execute"
92
139
  phase = "{phase}"
93
140
  plan = {plan_number}
94
141
  wave = {wave_number}
95
- status = "ok" # or "error" if executor failed
142
+ status = "ok" # or "error" if any task failed
143
+ tasks_total = {total_tasks}
144
+ tasks_completed = {completed_tasks}
96
145
  timestamp = {current ISO timestamp}
97
146
  ```
98
147
 
99
- ### 5.4: Error Handling
148
+ ### 5.3: Error Handling
100
149
 
101
150
  If an executor fails (crashes, times out, or reports errors):
102
151
  1. Write `.konductor/.results/execute-{phase}-plan-{n}.toml` with `status = "error"` and error details
103
- 2. **Continue** with remaining plans in the wave (do not stop)
104
- 3. Track failed plan numbers
105
- 4. At the end of the wave, report which plans failed
152
+ 2. **Continue** with remaining tasks/plans in the wave (do not stop unless circuit breaker triggers)
153
+ 3. Track failed task numbers
154
+ 4. At the end of the wave, report which tasks failed
106
155
 
107
- **Do NOT retry failed executors automatically.** Let the user decide how to proceed.
156
+ ### 5.4: Update Progress Counters
108
157
 
109
- ### 5.5: Update Progress Counters
158
+ After each wave completes, track progress (completed tasks/plans count and percentage) for reporting.
110
159
 
111
- After each wave completes, track progress (completed plans count and percentage) for reporting.
112
-
113
- ## Step 6: Code Review (if enabled)
160
+ ## Step 6: Code Review — Holistic Final Pass (if enabled)
114
161
 
115
162
  If `config.toml` `features.code_review = true`:
116
163
 
164
+ Per-task reviews (spec compliance + code quality) have already been performed during execution. This phase-level code review is a **holistic final pass** checking cross-task consistency:
165
+ - Shared interfaces match across plans (types, function signatures, API contracts)
166
+ - Naming conventions are consistent across all modified files
167
+ - Integration points work correctly (modules wire together properly)
168
+ - No cross-plan duplication (shared logic extracted)
169
+
117
170
  Spawn a **konductor-code-reviewer** agent. Provide it with:
118
- - `.konductor/.tracking/modified-files.log` (list of changed files)
119
- - All `*-summary.md` files from `.konductor/phases/{phase}/plans/`
171
+ - `.konductor/.tracking/modified-files.log` (list of all changed files)
172
+ - All `*-task-*-summary.md` files from `.konductor/phases/{phase}/plans/`
120
173
  - The phase name and plan files for context
121
- - Instructions: review all modified source files, run tests and linting, do NOT fix any issues — only report them with file, line, description, and severity (minor/significant). Write findings to `.konductor/phases/{phase}/code-review.md`.
174
+ - Instructions: focus on cross-task and cross-plan consistency, not individual task correctness (already reviewed). Write findings to `.konductor/phases/{phase}/code-review.md`.
122
175
 
123
176
  Wait for the reviewer to complete. Read `code-review.md`.
124
177
 
@@ -131,23 +184,24 @@ Wait for the reviewer to complete. Read `code-review.md`.
131
184
 
132
185
  ## Step 7: Set Executed State
133
186
 
134
- After code review completes (with no blocking issues), call `state_transition` with `step = "executed"` to advance the pipeline.
187
+ After all execution and reviews complete (with no blocking issues), call `state_transition` with `step = "executed"` to advance the pipeline.
135
188
 
136
189
  Tell the user:
137
- - Total plans executed
138
- - Plans succeeded vs. failed (if any)
139
- - Code review findings (issues fixed, warnings reported)
190
+ - Total plans and tasks executed
191
+ - Tasks succeeded vs. failed (if any)
192
+ - Per-task review results (spec + quality)
193
+ - Phase-level code review findings (if enabled)
140
194
  - Next step suggestion: "Say 'next' to verify the phase."
141
195
 
142
- If any plans failed, list them and suggest:
143
- > "Review the errors in `.results/execute-{phase}-plan-{n}.toml` files. You can re-run individual plans or fix issues manually."
196
+ If any tasks failed or need manual intervention, list them and suggest:
197
+ > "Review the task summaries and review files in `.konductor/phases/{phase}/plans/`. You can re-run execution to retry incomplete tasks."
144
198
 
145
199
  ## Error Handling
146
200
 
147
201
  **Executor crashes:**
148
202
  If an executor subagent crashes:
149
203
  1. Write error result file for that plan
150
- 2. Continue with remaining plans
204
+ 2. Continue with remaining tasks/plans
151
205
  3. Report the failure at the end of execution
152
206
 
153
207
  **State corruption:**
@@ -1,20 +1,22 @@
1
1
  # Execution Guide — For Konductor Executor Agents
2
2
 
3
- This guide is for executor subagents that implement individual plans. You are responsible for executing one plan from start to finish.
3
+ This guide is for executor subagents that implement individual tasks. You are responsible for executing one task and writing a per-task summary.
4
4
 
5
5
  ## Your Role
6
6
 
7
7
  You are a **konductor-executor** agent. You receive:
8
- - A plan file with tasks to complete
8
+ - A plan file (for context on the overall goal and prior tasks)
9
+ - A specific task number to execute
10
+ - Summaries from prior completed tasks in this plan (for context)
9
11
  - Git configuration (auto-commit, branching strategy)
10
12
  - Reference to this guide
11
13
 
12
14
  Your job:
13
- 1. Read and understand the plan
14
- 2. Execute each task in order
15
+ 1. Read and understand the assigned task
16
+ 2. Execute the task
15
17
  3. Write tests when required (TDD plans)
16
18
  4. Commit changes following the protocol
17
- 5. Write a summary when done
19
+ 5. Write a per-task summary with your status
18
20
 
19
21
  ## Deviation Rules
20
22
 
@@ -120,15 +122,11 @@ Make commits atomic and descriptive. Follow this protocol for every commit.
120
122
 
121
123
  ### Commit Frequency
122
124
 
123
- **One commit per task** (preferred):
124
- - After completing each task, commit the changes
125
+ **One commit per task** (required):
126
+ - After completing your assigned task, commit the changes
125
127
  - Keeps history granular and reviewable
126
128
  - Easier to roll back individual changes
127
129
 
128
- **Exceptions:**
129
- - If tasks are tightly coupled and splitting commits would break functionality, combine them
130
- - Always explain in the commit body why tasks were combined
131
-
132
130
  ### Staging Files
133
131
 
134
132
  **IMPORTANT:** Stage specific files, never use `git add -A` or `git add .`
@@ -201,84 +199,144 @@ Check `config.toml` field `git.auto_commit`:
201
199
  - Reading referenced interfaces from dependencies (doesn't count)
202
200
  - First-time codebase exploration at start of plan (first 3 reads don't count)
203
201
 
202
+ ## Implementer Status Protocol
203
+
204
+ After completing a task (or failing to), report exactly one of these four statuses in your summary file. The orchestrator uses your status to decide what happens next.
205
+
206
+ ### DONE
207
+
208
+ Task completed successfully. All files created/modified, tests pass, verify step satisfied.
209
+
210
+ **Orchestrator action:** Proceed to spec review, then code quality review.
211
+
212
+ ### DONE_WITH_CONCERNS
213
+
214
+ Task completed, but you have doubts or observations the orchestrator should know about.
215
+
216
+ **Orchestrator triage:**
217
+ - **Actionable concerns** (potential correctness issues, security risks, spec deviations) → orchestrator dispatches an executor to address them before proceeding to review.
218
+ - **Informational concerns** (considered alternative approach, style preferences, future improvement ideas) → orchestrator proceeds directly to review.
219
+
220
+ **Examples of actionable concerns:**
221
+ - "The spec says validate email format, but I used a simple regex that may miss edge cases"
222
+ - "This endpoint accepts user input without rate limiting"
223
+ - "The plan says return 404, but the existing codebase returns 204 for missing resources"
224
+
225
+ **Examples of informational concerns:**
226
+ - "Considered using a builder pattern but kept it simple per the plan"
227
+ - "This function could be split further in a future refactor"
228
+
229
+ ### NEEDS_CONTEXT
230
+
231
+ You cannot complete the task because information is missing. Be specific about what you need.
232
+
233
+ **Orchestrator action:** Provide the missing context and re-dispatch you. Maximum 2 retries — if still blocked after 2 attempts, escalate to user.
234
+
235
+ **Your summary must include a `## Missing Context` section listing exactly what you need.**
236
+
237
+ ### BLOCKED
238
+
239
+ You cannot complete the task due to a technical or architectural issue.
240
+
241
+ **Orchestrator assessment:**
242
+ - **Context problem** → provide context and re-dispatch
243
+ - **Task too complex** → split into smaller tasks
244
+ - **Plan wrong** → escalate to user
245
+
246
+ **Your summary must include a `## Blocker` section describing the issue.**
247
+
248
+ **Default rule:** If you encounter an issue you cannot classify into the other three statuses, use BLOCKED with a description.
249
+
204
250
  ## Summary Writing
205
251
 
206
- After completing all tasks (or encountering a blocker), write a summary file.
252
+ After completing each task (or encountering a blocker), write a per-task summary file.
207
253
 
208
- **File location:** `.konductor/phases/{phase}/plans/{plan-number}-summary.md`
254
+ **File location:** `.konductor/phases/{phase}/plans/{plan}-task-{n}-summary.md`
209
255
 
210
256
  **File name examples:**
211
- - `001-summary.md`
212
- - `002-summary.md`
213
- - `010-summary.md`
257
+ - `001-task-1-summary.md` — Plan 001, Task 1
258
+ - `001-task-2-summary.md` — Plan 001, Task 2
259
+ - `003-task-1-summary.md` — Plan 003, Task 1
214
260
 
215
261
  ### Summary Structure
216
262
 
217
263
  ```markdown
218
- # Plan {plan-number} Summary
264
+ # Plan {plan} — Task {n} Summary
219
265
 
220
- ## Status
221
- [Completed | Blocked | Partial]
266
+ ## Status: DONE
222
267
 
223
268
  ## Files Created
224
269
  - `src/models/user.rs` — User struct with password hashing
225
- - `src/db/migrations/001_users.sql` — Users table migration
226
270
 
227
271
  ## Files Modified
228
- - `src/routes/auth.rs` — Added registration endpoint
229
272
  - `Cargo.toml` — Added bcrypt dependency
230
273
 
231
274
  ## Tests Added
232
275
  - `user::test_password_hashing` — Verifies bcrypt integration
233
- - `auth::test_registration_endpoint` — Verifies POST /auth/register
234
276
 
235
277
  ### Test Results
236
278
  ```
237
279
  cargo test user
238
- Compiling auth-system v0.1.0
239
- Finished test [unoptimized + debuginfo] target(s) in 2.3s
240
- Running unittests (target/debug/deps/auth_system-abc123)
241
- running 2 tests
280
+ running 1 test
242
281
  test user::test_password_hashing ... ok
243
- test auth::test_registration_endpoint ... ok
244
-
245
- test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
282
+ test result: ok. 1 passed; 0 failed
246
283
  ```
247
284
 
248
285
  ## Deviations from Plan
249
- 1. **Rule 1:** Fixed type error in existing auth routes that prevented compilation
250
- 2. **Rule 2:** Added email validation to registration endpoint (plan didn't specify)
251
- 3. **Rule 3:** Added bcrypt to Cargo.toml (plan assumed it was already present)
286
+ 1. **Rule 3:** Added bcrypt to Cargo.toml (plan assumed it was already present)
252
287
 
253
288
  ## Decisions Made
254
- - Used bcrypt cost factor of 12 (industry standard for password hashing)
255
- - Made password_hash field private to prevent accidental exposure
256
- - Added index on users.email for faster lookups during login
257
-
258
- ## Blockers Encountered
259
- None. All tasks completed successfully.
289
+ - Used bcrypt cost factor of 12 (industry standard)
260
290
 
261
291
  ## Verification
262
- All must_haves from plan frontmatter verified:
263
- - [x] Users can register with email and password (POST /auth/register returns 201)
264
- - [x] Passwords are hashed with bcrypt (verified in test)
265
- - [x] User model imported by auth routes (compiler confirms)
292
+ - [x] User model exists with password hashing (compiler confirms)
293
+ ```
294
+
295
+ ### Conditional Sections by Status
296
+
297
+ Include these sections only when the status requires them:
298
+
299
+ **DONE_WITH_CONCERNS** — add `## Concerns`:
300
+ ```markdown
301
+ ## Status: DONE_WITH_CONCERNS
302
+
303
+ ## Concerns
304
+ - Email validation uses a simple regex that may miss edge cases (potential correctness issue)
305
+ - Considered using a builder pattern but kept it simple per the plan (informational)
306
+ ```
307
+
308
+ **NEEDS_CONTEXT** — add `## Missing Context`:
309
+ ```markdown
310
+ ## Status: NEEDS_CONTEXT
311
+
312
+ ## Missing Context
313
+ - What authentication strategy does the existing codebase use? (JWT vs sessions)
314
+ - Is there an existing User type in `src/models/` that should be extended?
315
+ ```
316
+
317
+ **BLOCKED** — add `## Blocker`:
318
+ ```markdown
319
+ ## Status: BLOCKED
320
+
321
+ ## Blocker
322
+ The plan requires adding a DynamoDB table, but the SAM template uses a format incompatible with the existing deployment pipeline. This is an architectural decision (Rule 4).
266
323
  ```
267
324
 
268
325
  ### Summary Requirements
269
326
 
270
327
  Your summary MUST include:
271
- - **Status:** One of [Completed, Blocked, Partial]
272
- - **Files created:** List with brief descriptions
273
- - **Files modified:** List with brief descriptions
274
- - **Tests added:** Test names and what they verify
275
- - **Test results:** Actual output from test runner (paste full output)
328
+ - **Status:** One of DONE, DONE_WITH_CONCERNS, NEEDS_CONTEXT, BLOCKED
329
+ - **Files created:** List with brief descriptions (if any)
330
+ - **Files modified:** List with brief descriptions (if any)
331
+ - **Tests added:** Test names and what they verify (if any)
332
+ - **Test results:** Actual output from test runner (if tests were run)
276
333
  - **Deviations:** Every deviation with rule number and explanation
277
334
  - **Decisions:** Technical choices you made
278
- - **Blockers:** Any issues that stopped you (or "None")
279
- - **Verification:** Checklist of must_haves from plan frontmatter
335
+ - **Verification:** Checklist of task verify/done criteria from the plan
336
+
337
+ **Conditional sections:** Include Concerns, Missing Context, or Blocker as required by your status.
280
338
 
281
- **If blocked:** Explain clearly what stopped you and what information or decision is needed to proceed.
339
+ **Note:** After both spec compliance and code quality reviews pass, the orchestrator appends `## Review Status: passed` to your summary file. You do not write this field yourself — it is managed by the orchestrator.
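The status and conditional-section requirements lend themselves to a quick consistency check. This is a hedged sketch: `validate_summary` and `REQUIRED_SECTION` are hypothetical names, and the check covers only the status line and its conditional section, not the other required fields.

```python
import re

# Conditional section each status must carry, per the guide above.
REQUIRED_SECTION = {
    "DONE": None,
    "DONE_WITH_CONCERNS": "## Concerns",
    "NEEDS_CONTEXT": "## Missing Context",
    "BLOCKED": "## Blocker",
}

STATUS = re.compile(r"^## Status:\s*(\w+)\s*$", re.MULTILINE)


def validate_summary(text: str) -> list[str]:
    """Return a list of problems; an empty list means the status fields look consistent."""
    match = STATUS.search(text)
    if match is None:
        return ["missing '## Status: <STATUS>' line"]
    status = match.group(1)
    if status not in REQUIRED_SECTION:
        return [f"unknown status {status!r}"]
    problems = []
    section = REQUIRED_SECTION[status]
    if section and not re.search(rf"^{re.escape(section)}\s*$", text, re.MULTILINE):
        problems.append(f"status {status} requires a '{section}' section")
    return problems
```

A summary written in the pre-0.13.0 layout, with `## Status` on one line and `Completed` on the next, is flagged as missing its status line.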
282
340
 
283
341
  ## Working with TDD Plans
284
342
 
@@ -90,36 +90,61 @@ The phase is ready for execution. Run the **Execution Pipeline**:
90
90
  3. Group plans by wave number (wave 1 first, then 2, etc.).
91
91
  4. Call `state_transition` with `step = "executing"`.
92
92
 
93
- 5. **For each wave** (in order):
94
- Update wave tracking as needed.
93
+ 5. **Per-Task Wave Execution Loop:**
95
94
 
96
- **If `max_wave_parallelism > 1` (parallel mode):**
97
- For each plan in this wave, use the **konductor-executor** agent to execute it. Launch all plans in the wave simultaneously. Each executor receives:
98
- - Its specific plan file path
99
- - Whether to auto-commit (`git.auto_commit`)
100
- - The branching strategy (`git.branching_strategy`)
101
- - Reference: see `references/execution-guide.md` in the konductor-exec skill
102
- Wait for ALL executors to complete (check for summary files).
95
+ For each wave (in ascending order):
103
96
 
104
- **If `max_wave_parallelism = 1` (sequential mode):**
105
- Execute plans one at a time, in order within the wave.
97
+ For each plan in the wave (parallel if `max_wave_parallelism > 1`, sequential otherwise):
98
+ Parse the plan's `## Tasks` section to extract individual tasks. For each task (sequential within a plan):
106
99
 
107
- After each wave completes, track progress.
100
+ **5a. Dispatch executor:**
101
+ Spawn a fresh **konductor-executor** agent with:
102
+ - The plan file path and the specific task number to execute
103
+ - Summaries from prior completed tasks in this plan (for context)
104
+ - Git config: `git.auto_commit` and `git.branching_strategy`
105
+ - Reference: see `references/execution-guide.md` in the konductor-exec skill (status protocol, deviation rules)
106
+ Wait for `{plan}-task-{n}-summary.md` in `.konductor/phases/{phase}/plans/`.
108
107
 
109
- 6. Write `.konductor/.results/execute-{phase}-plan-{n}.toml` for each completed plan.
110
- 7. **Code Review** (if `config.toml` `features.code_review = true`):
111
- Spawn **konductor-code-reviewer** agent with: `.konductor/.tracking/modified-files.log`, all `*-summary.md` files from plans directory, phase name. The reviewer writes `.konductor/phases/{phase}/code-review.md`.
112
- If issues found: spawn a **konductor-executor** agent with the issues to fix them, then re-run the reviewer. Maximum 3 review-fix iterations. If still unresolved, call `state_add_blocker` and report to user.
113
- 8. Call `state_transition` with `step = "executed"`.
114
- 9. Tell the user: "Phase {phase} executed. N plans completed. Say 'next' to verify."
108
+ **5b. Handle implementer status:**
109
+ Read the `## Status` field from the task summary:
110
+ - **DONE** → proceed to 5c (two-stage review).
111
+ - **DONE_WITH_CONCERNS** → read `## Concerns`. If concerns mention correctness issues, security risks, or spec deviations: dispatch a fresh **konductor-executor** to address them, then proceed to 5c. If concerns are informational (style preferences, alternative approaches considered): proceed to 5c.
112
+ - **NEEDS_CONTEXT** → read `## Missing Context`, provide the requested information, re-dispatch a fresh executor for the same task. Maximum 2 retries. If still NEEDS_CONTEXT after retries, treat as BLOCKED.
113
+ - **BLOCKED** → read `## Blocker`. Assess: context problem → provide context and re-dispatch; task too complex → split into smaller tasks. If assessment fails or the task remains blocked after re-dispatch, call `state_add_blocker` with the blocker description. If 3 or more tasks in this phase have been BLOCKED, trigger circuit breaker: stop execution entirely and report all blockers to the user. Otherwise continue with the next task.
114
+
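The status handling in 5b can be read as a small decision table. The following is a minimal sketch of that table, not konductor's implementation: the type and function names (`TaskStatus`, `NextAction`, `parse_status`, `next_action`, `circuit_breaker_tripped`) are hypothetical, and it assumes the status value sits on the heading line itself in the form `## Status: DONE`, as the resume case writes it.

```rust
/// Status values a task summary may report (taken from step 5b).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskStatus {
    Done,
    DoneWithConcerns,
    NeedsContext,
    Blocked,
}

/// Hypothetical names for what the orchestrator does next.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NextAction {
    Review,
    FixThenReview,
    RedispatchWithContext,
    EscalateAsBlocked,
}

/// Find the first `## Status: <VALUE>` line with a recognised value.
/// Assumption: the value is on the same line as the heading.
pub fn parse_status(summary: &str) -> Option<TaskStatus> {
    summary.lines().find_map(|line| {
        let value = line.trim().strip_prefix("## Status:")?.trim();
        match value {
            "DONE" => Some(TaskStatus::Done),
            "DONE_WITH_CONCERNS" => Some(TaskStatus::DoneWithConcerns),
            "NEEDS_CONTEXT" => Some(TaskStatus::NeedsContext),
            "BLOCKED" => Some(TaskStatus::Blocked),
            _ => None,
        }
    })
}

/// Map a status to the next step. `serious_concerns` stands for concerns about
/// correctness, security, or spec deviations; `context_retries` counts the
/// re-dispatches already made for this task (at most 2 are allowed).
pub fn next_action(status: TaskStatus, serious_concerns: bool, context_retries: u32) -> NextAction {
    match status {
        TaskStatus::Done => NextAction::Review,
        TaskStatus::DoneWithConcerns if serious_concerns => NextAction::FixThenReview,
        TaskStatus::DoneWithConcerns => NextAction::Review,
        TaskStatus::NeedsContext if context_retries < 2 => NextAction::RedispatchWithContext,
        TaskStatus::NeedsContext | TaskStatus::Blocked => NextAction::EscalateAsBlocked,
    }
}

/// The circuit breaker trips once 3 or more tasks in the phase are BLOCKED.
pub fn circuit_breaker_tripped(blocked_tasks_in_phase: usize) -> bool {
    blocked_tasks_in_phase >= 3
}
```

The sketch collapses the BLOCKED assessment (provide context, or split the task) into a single `EscalateAsBlocked` outcome; a real orchestrator would branch further at that point.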
115
+ **5c. Two-stage review** (if `config.toml` `features.code_review = true`; skip both stages if disabled, and append `## Review Status: passed` to the task summary so resume logic works correctly):
116
+
117
+ **Stage 1 — Spec Compliance:**
118
+ Spawn **konductor-spec-reviewer** with the task spec (from the plan file), the task summary, and modified files. The reviewer writes `{plan}-task-{n}-spec-review.md`.
119
+ If issues found: spawn a fresh **konductor-executor** with the issues to fix, then re-run the spec reviewer. Maximum 2 iterations.
120
+
121
+ **Stage 2 — Code Quality:**
122
+ Spawn **konductor-code-reviewer** with the task summary, modified files, and git diff for the task. The reviewer writes `{plan}-task-{n}-quality-review.md`.
123
+ If issues found: spawn a fresh **konductor-executor** with the issues to fix, then re-run the quality reviewer. Maximum 2 iterations.
124
+
125
+ After both stages pass: append `## Review Status: passed` to the task summary file. Mark task complete.
126
+
127
+ **5d. Write result file** after each plan completes (all tasks done):
128
+ Write `.konductor/.results/execute-{phase}-plan-{n}.toml` with status and timestamp.
129
+
130
+ 6. **Phase-Level Code Review** (optional holistic final pass, if `config.toml` `features.code_review = true`):
131
+ Per-task reviews have already been performed. This step checks cross-task consistency (shared interfaces, naming conventions, integration points).
132
+ Spawn **konductor-code-reviewer** with `.konductor/.tracking/modified-files.log`, all task summary files, and phase name. The reviewer writes `.konductor/phases/{phase}/code-review.md`.
133
+ If significant cross-task issues found: spawn a **konductor-executor** to fix, then re-review. Maximum 3 iterations. If still unresolved, call `state_add_blocker` and report to user.
134
+ 7. Call `state_transition` with `step = "executed"`.
135
+ 8. Tell the user: "Phase {phase} executed. N plans completed. Say 'next' to verify."
115
136
 
116
137
  ### Case: `step = "executing"`
117
138
 
118
- Execution was interrupted. Resume:
119
- 1. Check which `{plan}-summary.md` files exist in `.konductor/phases/{phase}/plans/`.
120
- 2. Plans with summaries are complete; skip them.
121
- 3. Resume from the first incomplete plan in the current wave.
122
- 4. Continue the Execution Pipeline from step 5 above.
139
+ Execution was interrupted. Resume at task-level granularity:
140
+ 1. Scan `.konductor/phases/{phase}/plans/` for `{plan}-task-{n}-summary.md` files.
141
+ 2. A task is complete only when BOTH conditions are met:
142
+ - Its summary file exists with `## Status: DONE`
143
+ - The summary contains `## Review Status: passed` (added by the orchestrator after both review stages pass)
144
+ 3. A plan is complete when ALL its tasks meet the above definition.
145
+ 4. Check for any tasks with `## Status: NEEDS_CONTEXT` or `## Status: BLOCKED` — report these to the user before resuming.
146
+ 5. Resume from the first incomplete task in the first incomplete plan of the current wave.
147
+ 6. Continue the Execution Pipeline from step 5 above.
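The task-level completion rule above can be sketched as a pure check over summary text. This is an illustrative sketch only: `is_task_complete` and `first_incomplete_task` are hypothetical helpers, and the slice-of-options input (one entry per task, `None` when the summary file is missing) is an assumed representation rather than anything konductor defines.

```rust
/// A task counts as complete only when its summary contains both marker lines.
pub fn is_task_complete(summary: &str) -> bool {
    let has_line = |needle: &str| summary.lines().any(|line| line.trim() == needle);
    has_line("## Status: DONE") && has_line("## Review Status: passed")
}

/// Return the 1-based number of the first incomplete task in a plan, or `None`
/// when every task is complete. `summaries[i]` holds the text of task `i + 1`'s
/// summary file, or `None` if that file does not exist yet.
pub fn first_incomplete_task(summaries: &[Option<&str>]) -> Option<usize> {
    summaries
        .iter()
        .position(|entry| !entry.map_or(false, |text| is_task_complete(text)))
        .map(|index| index + 1)
}
```

Matching whole lines rather than substrings keeps `## Status: DONE_WITH_CONCERNS` from being mistaken for `## Status: DONE`.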
123
148
 
124
149
  ### Case: `step = "executed"`
125
150
 
@@ -39,26 +39,80 @@ Plans execute in waves. Wave dependencies must form a DAG (directed acyclic grap
39
39
 
40
40
  ## Task Sizing
41
41
 
42
- Each plan contains 2-5 tasks. Each task should take 15-60 minutes of execution time.
42
+ Each plan contains 2-5 tasks. Each task is broken into **bite-sized steps that take 2-5 minutes each**. Every step that involves code must include the actual code block — no prose descriptions of what to write.
43
43
 
44
44
  **If a plan would need more than 5 tasks:** Split it into multiple plans in the same wave.
45
45
 
46
46
  **Task structure:**
47
47
  - **files:** Which files to modify/create
48
- - **action:** What to do (be specific)
48
+ - **action:** What to do, broken into numbered steps, each 2-5 minutes
49
49
  - **verify:** How to check success (command to run)
50
50
  - **done:** What "done" looks like (observable outcome)
51
51
 
52
- **Example task:**
52
+ **Each step within a task must:**
53
+ 1. Be completable in 2-5 minutes
54
+ 2. Include the exact code to write (for code steps)
55
+ 3. Include the command to run (for verification steps)
56
+ 4. Be independently verifiable
57
+
58
+ **Example task with bite-sized steps (TDD):**
53
59
  ```markdown
54
60
  ### Task 2: Add password hashing to User model
55
61
 
56
62
  - **files:** `src/models/user.rs`
57
- - **action:** Import bcrypt crate, add `hash_password` method to User impl, call it in `new` constructor
58
- - **verify:** `cargo test user::test_password_hashing`
59
- - **done:** Passwords are hashed with bcrypt before storage
63
+ - **action:**
64
+ 1. Write failing test:
65
+ ```rust
66
+ #[cfg(test)]
67
+ mod tests {
68
+ use super::*;
69
+
70
+ #[test]
71
+ fn test_password_is_hashed() {
72
+ let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
73
+ assert_ne!(user.password_hash(), "Secret123!");
74
+ assert!(user.verify_password("Secret123!"));
75
+ }
76
+ }
77
+ ```
78
+ 2. Run `cargo test user::tests::test_password_is_hashed` — confirm it fails (method not found)
79
+ 3. Implement:
80
+ ```rust
81
+ use bcrypt::{hash, verify, DEFAULT_COST};
82
+
83
+ impl User {
84
+ pub fn new(email: String, password: String) -> Result<Self, AuthError> {
85
+ let password_hash = hash(&password, DEFAULT_COST)
86
+ .map_err(|_| AuthError::HashError)?;
87
+ Ok(Self { id: Uuid::new_v4(), email, password_hash, created_at: Utc::now() })
88
+ }
89
+
90
+ pub fn password_hash(&self) -> &str { &self.password_hash }
91
+
92
+ pub fn verify_password(&self, password: &str) -> bool {
93
+ verify(password, &self.password_hash).unwrap_or(false)
94
+ }
95
+ }
96
+ ```
97
+ 4. Run `cargo test user::tests::test_password_is_hashed` — confirm it passes
98
+ - **verify:** `cargo test user::tests::test_password_is_hashed`
99
+ - **done:** Passwords are hashed with bcrypt before storage, verified by test
60
100
  ```
61
101
 
102
+ ## No Placeholders
103
+
104
+ Every step must contain the actual content an engineer needs to execute it. The **konductor-plan-checker** agent enforces this rule and will reject plans that violate it.
105
+
106
+ **Banned patterns:**
107
+ - `"TBD"`, `"TODO"`, `"implement later"`, `"fill in details"`
108
+ - `"Add appropriate error handling"` / `"add validation"` / `"handle edge cases"` (show the actual error handling, validation, or edge case code)
109
+ - `"Write tests for the above"` without actual test code (include the test code)
110
+ - `"Similar to Task N"` (repeat the code — the engineer may be reading tasks out of order)
111
+ - Steps that describe what to do without showing how (code blocks required for code steps)
112
+ - References to types, functions, or methods not defined in any prior or current task
113
+
114
+ **Rule:** If a step involves writing code, the step must include the code block. If a step involves running a command, the step must include the command. No exceptions.
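As a rough illustration of how the phrase-based part of this rule could be checked mechanically, here is a minimal sketch. It is not the plan-checker's actual logic (the checker is a prompted agent): `BANNED_PHRASES` and `find_banned_phrases` are hypothetical names, and the match is a naive case-insensitive substring search that cannot judge the structural rules (missing code blocks, undefined references).

```rust
/// Lower-cased phrases from the banned-patterns list above.
/// Assumption: "similar to task" stands in for "Similar to Task N".
const BANNED_PHRASES: [&str; 9] = [
    "tbd",
    "todo",
    "implement later",
    "fill in details",
    "add appropriate error handling",
    "add validation",
    "handle edge cases",
    "write tests for the above",
    "similar to task",
];

/// Return every banned phrase found in a step, in list order.
/// This flags conservatively: it does not check whether the step also
/// includes the actual code, so a human or agent still has to review hits.
pub fn find_banned_phrases(step: &str) -> Vec<&'static str> {
    let lowered = step.to_lowercase();
    BANNED_PHRASES
        .iter()
        .copied()
        .filter(|phrase| lowered.contains(*phrase))
        .collect()
}
```

Because it is a substring match, short entries such as "todo" can produce false positives inside longer words; word-boundary matching would be a reasonable refinement.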
115
+
62
116
  ## Plan File Format
63
117
 
64
118
  Each plan is a markdown file with TOML frontmatter and a structured body.
@@ -68,7 +122,7 @@ Each plan is a markdown file with TOML frontmatter and a structured body.
68
122
  - `plan`: Plan number within the phase (1, 2, 3...)
69
123
  - `wave`: Execution wave (1, 2, 3...)
70
124
  - `depends_on`: List of plan numbers this plan depends on (e.g., `[1, 2]`)
71
- - `type`: Either "execute" (standard implementation) or "tdd" (test-driven)
125
+ - `type`: Either "tdd" (test-driven, default) or "execute" (standard implementation). Use `type = "execute"` to opt out of TDD for infrastructure, configuration, or documentation tasks. The planner must always emit an explicit `type` field.
72
126
  - `autonomous`: Boolean, true if executor can proceed without human input
73
127
  - `requirements`: List of REQ-XX identifiers this plan addresses
74
128
  - `files_modified`: List of files this plan will touch (helps with merge conflict prediction)
@@ -93,7 +147,7 @@ phase = "01-auth-system"
93
147
  plan = 1
94
148
  wave = 1
95
149
  depends_on = []
96
- type = "execute"
150
+ type = "tdd"
97
151
  autonomous = true
98
152
  requirements = ["REQ-01", "REQ-02"]
99
153
  files_modified = ["src/models/user.rs", "src/db/migrations/001_users.sql"]
@@ -138,33 +192,120 @@ impl User {
138
192
 
139
193
  ## Tasks
140
194
 
141
- ### Task 1: Create User struct
195
+ ### Task 1: Create User struct with tests
142
196
 
143
197
  - **files:** `src/models/user.rs`
144
- - **action:** Define User struct with fields: id (UUID), email (String), password_hash (String), created_at (DateTime)
145
- - **verify:** `cargo check` passes
146
- - **done:** User struct compiles
198
+ - **action:**
199
+ 1. Write failing test:
200
+ ```rust
201
+ #[cfg(test)]
202
+ mod tests {
203
+ use super::*;
204
+
205
+ #[test]
206
+ fn test_new_user_has_correct_email() {
207
+ let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
208
+ assert_eq!(user.email, "test@example.com");
209
+ }
210
+
211
+ #[test]
212
+ fn test_new_user_has_uuid() {
213
+ let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
214
+ assert!(!user.id.is_nil());
215
+ }
216
+ }
217
+ ```
218
+ 2. Run `cargo test models::user::tests` — confirm it fails (User not defined)
219
+ 3. Implement:
220
+ ```rust
221
+ use chrono::{DateTime, Utc};
222
+ use uuid::Uuid;
223
+
224
+ pub struct User {
225
+ pub id: Uuid,
226
+ pub email: String,
227
+ password_hash: String,
228
+ pub created_at: DateTime<Utc>,
229
+ }
230
+
231
+ impl User {
232
+ pub fn new(email: String, password: String) -> Result<Self, AuthError> {
233
+ Ok(Self {
234
+ id: Uuid::new_v4(),
235
+ email,
236
+ password_hash: password, // placeholder — next task adds hashing
237
+ created_at: Utc::now(),
238
+ })
239
+ }
240
+ }
241
+ ```
242
+ 4. Run `cargo test models::user::tests` — confirm both tests pass
243
+ - **verify:** `cargo test models::user::tests`
244
+ - **done:** User struct compiles and passes basic tests
147
245
 
148
246
  ### Task 2: Add password hashing
149
247
 
150
248
  - **files:** `src/models/user.rs`
151
- - **action:** Import bcrypt, add `hash_password` method, call in constructor
152
- - **verify:** `cargo test user::test_password_hashing`
153
- - **done:** Passwords are hashed before storage
249
+ - **action:**
250
+ 1. Add failing test:
251
+ ```rust
252
+ #[test]
253
+ fn test_password_is_hashed_not_plaintext() {
254
+ let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
255
+ assert_ne!(user.password_hash, "Secret123!");
256
+ }
257
+
258
+ #[test]
259
+ fn test_verify_correct_password() {
260
+ let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
261
+ assert!(user.verify_password("Secret123!"));
262
+ }
263
+
264
+ #[test]
265
+ fn test_verify_wrong_password() {
266
+ let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
267
+ assert!(!user.verify_password("WrongPass1!"));
268
+ }
269
+ ```
270
+ 2. Run `cargo test models::user::tests` — confirm new tests fail
271
+ 3. Update `User::new` and add `verify_password`:
272
+ ```rust
273
+ use bcrypt::{hash, verify, DEFAULT_COST};
274
+
275
+ impl User {
276
+ pub fn new(email: String, password: String) -> Result<Self, AuthError> {
277
+ let password_hash = hash(&password, DEFAULT_COST)
278
+ .map_err(|_| AuthError::HashError)?;
279
+ Ok(Self { id: Uuid::new_v4(), email, password_hash, created_at: Utc::now() })
280
+ }
281
+
282
+ pub fn verify_password(&self, password: &str) -> bool {
283
+ verify(password, &self.password_hash).unwrap_or(false)
284
+ }
285
+ }
286
+ ```
287
+ 4. Run `cargo test models::user::tests` — confirm all 5 tests pass
288
+ - **verify:** `cargo test models::user::tests`
289
+ - **done:** Passwords are hashed with bcrypt, verified by 3 new tests
154
290
 
155
291
  ### Task 3: Create migration
156
292
 
157
293
  - **files:** `src/db/migrations/001_users.sql`
158
- - **action:** Write CREATE TABLE users with columns matching User struct
159
- - **verify:** `sqlx migrate run` succeeds
160
- - **done:** users table exists in database
161
-
162
- ### Task 4: Wire User model to auth routes
163
-
164
- - **files:** `src/routes/auth.rs`
165
- - **action:** Import User model, use it in registration handler
166
- - **verify:** Compilation succeeds, User is referenced
167
- - **done:** Registration route can create User instances
294
+ - **action:**
295
+ 1. Write the migration:
296
+ ```sql
297
+ CREATE TABLE users (
298
+ id UUID PRIMARY KEY,
299
+ email VARCHAR(255) NOT NULL UNIQUE,
300
+ password_hash VARCHAR(255) NOT NULL,
301
+ created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
302
+ );
303
+
304
+ CREATE INDEX idx_users_email ON users (email);
305
+ ```
306
+ 2. Run `sqlx migrate run` — confirm it succeeds
307
+ - **verify:** `sqlx migrate run`
308
+ - **done:** users table exists in database with email unique constraint
168
309
  ```
169
310
 
170
311
  ## Phase-Level Design Document
@@ -245,37 +386,71 @@ A requirement can span multiple plans. Example: "REQ-05: Users can manage their
245
386
  - Plan 3: View profile (requirements = ["REQ-05"])
246
387
  - Plan 4: Edit profile (requirements = ["REQ-05"])
247
388
 
248
- ## TDD Detection
389
+ ## TDD as Default
390
+
391
+ TDD is the default execution mode for all plans. The planner must always emit `type = "tdd"` unless the plan explicitly opts out.
249
392
 
250
- If a task can be expressed as "expect(fn(input)).toBe(output)", make it a TDD plan.
393
+ **Backward compatibility:** Existing plans without an explicit `type` field are treated as `"execute"`. The planner must always emit an explicit `type` field going forward, making the default moot for well-formed plans.
251
394
 
252
- **Indicators:**
253
- - Pure functions (no I/O)
254
- - Clear input/output contract
255
- - Algorithmic logic (sorting, parsing, validation)
256
- - Data transformations
395
+ **Opt-out with `type = "execute"`:** Use `type = "execute"` for tasks where TDD doesn't apply:
396
+ - Infrastructure plans (SAM templates, Terraform, CI/CD configs)
397
+ - Configuration files (TOML, YAML, JSON configs)
398
+ - Documentation-only plans (README, guides, specs)
399
+ - Refactoring plans where existing tests already cover the behavior
257
400
 
258
- **TDD plan differences:**
259
- - `type = "tdd"` in frontmatter
260
- - First task writes tests
261
- - Remaining tasks implement to pass tests
262
- - Verification is `cargo test` or equivalent
401
+ **TDD task structure (RED → GREEN → REFACTOR):**
402
+ 1. **RED:** Write a failing test with the exact test code
403
+ 2. **Verify RED:** Run the test command — confirm it fails
404
+ 3. **GREEN:** Write the minimal implementation to pass the test
405
+ 4. **Verify GREEN:** Run the test command — confirm it passes
406
+ 5. **REFACTOR** (optional): Clean up while keeping tests green
263
407
 
264
408
  **Example TDD task:**
265
409
  ```markdown
266
- ### Task 1: Write password validation tests
267
-
268
- - **files:** `src/validation/password_test.rs`
269
- - **action:** Write tests for: min 8 chars, has uppercase, has number, has special char
270
- - **verify:** Tests exist and fail
271
- - **done:** 4 test cases written
272
-
273
- ### Task 2: Implement password validation
274
-
275
- - **files:** `src/validation/password.rs`
276
- - **action:** Write validate_password function to satisfy tests
277
- - **verify:** `cargo test validation::password` passes
278
- - **done:** All password validation tests pass
410
+ ### Task 1: Write password validation tests and implement
411
+
412
+ - **files:** `src/validation/password.rs`, `src/validation/password_test.rs`
413
+ - **action:**
414
+ 1. Write failing tests:
415
+ ```rust
416
+ #[cfg(test)]
417
+ mod tests {
418
+ use super::validate_password;
419
+
420
+ #[test]
421
+ fn rejects_short_password() {
422
+ assert!(validate_password("Ab1!").is_err());
423
+ }
424
+
425
+ #[test]
426
+ fn rejects_no_uppercase() {
427
+ assert!(validate_password("abcdefg1!").is_err());
428
+ }
429
+
430
+ #[test]
431
+ fn rejects_no_number() {
432
+ assert!(validate_password("Abcdefgh!").is_err());
433
+ }
434
+
435
+ #[test]
436
+ fn accepts_valid_password() {
437
+ assert!(validate_password("Secret123!").is_ok());
438
+ }
439
+ }
440
+ ```
441
+ 2. Run `cargo test validation::password` — confirm all 4 tests fail
442
+ 3. Implement:
443
+ ```rust
444
+ pub fn validate_password(password: &str) -> Result<(), &'static str> {
445
+ if password.len() < 8 { return Err("too short"); }
446
+ if !password.chars().any(|c| c.is_uppercase()) { return Err("no uppercase"); }
447
+ if !password.chars().any(|c| c.is_numeric()) { return Err("no number"); }
448
+ Ok(())
449
+ }
450
+ ```
451
+ 4. Run `cargo test validation::password` — confirm all 4 tests pass
452
+ - **verify:** `cargo test validation::password`
453
+ - **done:** Password validation passes all 4 test cases
279
454
  ```
280
455
 
281
456
  ## Interface Context
@@ -330,9 +505,13 @@ Before finalizing plans, verify:
330
505
 
331
506
  - [ ] Phase-level `design.md` exists with overview, components, interactions, key decisions, and shared interfaces
332
507
  - [ ] Every plan has valid TOML frontmatter
508
+ - [ ] Every plan has an explicit `type` field (`"tdd"` or `"execute"`)
509
+ - [ ] Plans default to TDD unless explicitly opted out (infra, config, docs)
333
510
  - [ ] Every plan has a `## Design` section with Approach, Key Interfaces, Error Handling, and Trade-offs
334
511
  - [ ] Wave numbers form a valid DAG (no cycles)
335
512
  - [ ] Each plan has 2-5 tasks
513
+ - [ ] Each task has bite-sized steps (2-5 minutes each) with code blocks for code steps
514
+ - [ ] No placeholder patterns (TBD, TODO, implement later, etc.)
336
515
  - [ ] Each task has files, action, verify, and done
337
516
  - [ ] Every requirement from requirements.md is covered
338
517
  - [ ] `must_haves` section is complete and observable
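The checklist item on wave dependencies (valid `depends_on` references, no cycles) lends itself to an automated check. The sketch below uses Kahn's algorithm over a map from plan number to its `depends_on` list; the function name and the `HashMap` representation are assumptions made for illustration, not part of konductor.

```rust
use std::collections::{HashMap, VecDeque};

/// Returns true when every `depends_on` entry names a known plan and the
/// dependency graph contains no cycle.
pub fn dependencies_are_valid(plans: &HashMap<u32, Vec<u32>>) -> bool {
    // Every dependency must reference a plan that exists.
    if plans.values().flatten().any(|dep| !plans.contains_key(dep)) {
        return false;
    }

    // Kahn's algorithm: track how many unresolved dependencies each plan has.
    let mut remaining: HashMap<u32, usize> =
        plans.iter().map(|(id, deps)| (*id, deps.len())).collect();
    let mut ready: VecDeque<u32> = remaining
        .iter()
        .filter(|(_, count)| **count == 0)
        .map(|(id, _)| *id)
        .collect();

    let mut resolved = 0;
    while let Some(done) = ready.pop_front() {
        resolved += 1;
        // Release every plan that was waiting on the plan just resolved.
        for (id, deps) in plans {
            let hits = deps.iter().filter(|dep| **dep == done).count();
            if hits > 0 {
                let count = remaining.get_mut(id).unwrap();
                *count -= hits;
                if *count == 0 {
                    ready.push_back(*id);
                }
            }
        }
    }

    // Plans caught in a cycle never reach zero unresolved dependencies.
    resolved == plans.len()
}
```

The inner scan over all plans makes this quadratic in the number of plans, which is acceptable for the handful of plans in a phase.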