npm - konductor - Versions diffs - 0.12.4 → 0.13.0 - Mend

konductor 0.12.4 → 0.13.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/agents/konductor-plan-checker.json +1 -1
package/agents/konductor-spec-reviewer.json +40 -0
package/package.json +1 -1
package/skills/konductor-exec/SKILL.md +110 -56
package/skills/konductor-exec/references/execution-guide.md +108 -50
package/skills/konductor-next/SKILL.md +48 -23
package/skills/konductor-plan/references/planning-guide.md +229 -50

package/agents/konductor-plan-checker.json CHANGED

@@ -33,5 +33,5 @@
             }
         ]
     },
-    "prompt": ""
+    "prompt": "You are the plan-checker agent. Validate every plan file against these rules and reject any plan that violates them.\n\n## No-Placeholder Rule\n\nReject any plan containing these banned patterns in task actions or steps:\n- \"TBD\", \"TODO\", \"implement later\", \"fill in details\"\n- \"Add appropriate error handling\" / \"add validation\" / \"handle edge cases\" (must show actual code)\n- \"Write tests for the above\" without actual test code\n- \"Similar to Task N\" (must repeat the code)\n- Steps that describe what to do without showing how (code blocks required for code steps)\n- References to types, functions, or methods not defined in any prior or current task\n\nFor each violation, report: the task number, the banned pattern found, and the surrounding text.\n\n## Structural Checks\n\n1. **Frontmatter completeness**: phase, plan, wave, depends_on, type, autonomous, requirements, files_modified, must_haves (truths, artifacts, key_links) must all be present. The `type` field must be explicitly set to \"tdd\" or \"execute\".\n2. **Task sizing**: Each plan has 2-5 tasks. Each task has files, action, verify, and done fields. Code steps within tasks must include code blocks.\n3. **Wave dependencies**: depends_on values must reference valid plan numbers. No circular dependencies.\n4. **Requirement coverage**: Cross-reference requirements field against .konductor/requirements.md. Flag any REQ-XX not covered by any plan.\n5. **Design section**: Every plan must have a ## Design section with Approach, Key Interfaces, Error Handling, and Trade-offs subsections.\n6. **Verification commands**: Every task verify field must be a concrete command, not \"manual testing\" or similar.\n\n## Output\n\nFor each plan, report PASS or FAIL. On FAIL, list every violation with task number, rule violated, and the offending text. Fix issues in-place when possible."
 }
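
The No-Placeholder Rule in the new prompt is addressed to an LLM agent, but part of it can be approximated mechanically. The following is a minimal sketch, not the package's implementation: it assumes tasks appear under `### Task N` headings (the format described in the konductor-exec skill) and that a case-insensitive phrase match is an acceptable stand-in for the banned-pattern check. `BANNED_PATTERNS` and `find_placeholder_violations` are hypothetical names.

```python
import re

# Phrases taken from the plan-checker prompt; matching is case-insensitive.
BANNED_PATTERNS = [
    r"\bTBD\b",
    r"\bTODO\b",
    r"implement later",
    r"fill in details",
    r"add appropriate error handling",
    r"add validation",
    r"handle edge cases",
    r"similar to task \d+",
]

TASK_HEADING = re.compile(r"^### Task (\d+)", re.MULTILINE)


def find_placeholder_violations(plan_text: str) -> list[tuple[int, str, str]]:
    """Return (task_number, matched_text, surrounding_line) for each banned phrase."""
    violations = []
    headings = list(TASK_HEADING.finditer(plan_text))
    for i, heading in enumerate(headings):
        # A task body runs from its heading to the next task heading (or end of file).
        end = headings[i + 1].start() if i + 1 < len(headings) else len(plan_text)
        body = plan_text[heading.end():end]
        for line in body.splitlines():
            for pattern in BANNED_PATTERNS:
                match = re.search(pattern, line, re.IGNORECASE)
                if match:
                    violations.append((int(heading.group(1)), match.group(0), line.strip()))
    return violations
```

The tuple mirrors what the prompt asks the agent to report for each violation: the task number, the banned pattern found, and the surrounding text. Rules that need judgment, such as "steps that describe what to do without showing how", are out of reach for a phrase scan like this.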

package/agents/konductor-spec-reviewer.json ADDED

@@ -0,0 +1,40 @@
+{
+    "name": "konductor-spec-reviewer",
+    "description": "Reviews task output for spec compliance. Checks that implementation matches the task specification exactly.",
+    "tools": [
+        "read",
+        "write",
+        "shell",
+        "code"
+    ],
+    "allowedTools": [
+        "read",
+        "write",
+        "shell",
+        "code"
+    ],
+    "resources": [
+        "file://.konductor/requirements.md",
+        "file://.konductor/project.md",
+        "file://.konductor/phases/*/plans/*.md",
+        "file://.kiro/steering/**/*.md",
+        "file://~/.kiro/steering/**/*.md"
+    ],
+    "hooks": {
+        "preToolUse": [
+            {
+                "matcher": "*",
+                "command": "konductor hook",
+                "timeout_ms": 1000
+            }
+        ],
+        "postToolUse": [
+            {
+                "matcher": "*",
+                "command": "konductor hook",
+                "timeout_ms": 2000
+            }
+        ]
+    },
+    "prompt": "You are a spec compliance reviewer. For each task, check: (1) Does the implementation match what the task specified? Compare the task's action description against actual file changes. (2) Are all files listed in the task's files field created or modified? (3) Does the verify step pass when run? (4) Are there extra changes not specified in the task? (5) Does the done condition hold true? Report findings with file path, description, and verdict (pass/fail) for each check. Write your review to the specified output file. Do NOT fix any issues — only report them."
+}

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "konductor",
-  "version": "0.12.4",
+  "version": "0.13.0",
   "description": "Spec-driven development orchestrator for Kiro CLI — MCP server and hook processor",
   "bin": {
     "konductor": "bin/konductor"

package/skills/konductor-exec/SKILL.md CHANGED

@@ -5,18 +5,19 @@ description: Execute the plans for a phase by spawning executor subagents. Use w
 # Konductor Exec — Phase Execution Pipeline
-You are the Konductor orchestrator. Execute the plans for a phase by spawning executor subagents to implement each plan.
+You are the Konductor orchestrator. Execute the plans for a phase by spawning executor subagents to implement each task.
 ## Critical Rules
 1. **Only YOU manage state transitions** — use the MCP tools (`state_get`, `state_transition`, `state_add_blocker`) instead of writing `state.toml` directly. Subagents write their own output files (summary files, result files).
-2. **Read config via MCP** — call `config_get` to get parallelism settings and git configuration.
-3. **Report errors, don't retry crashes** — if an executor fails, write an error result for that plan, continue with remaining plans, and report all failures at the end.
-4. **Resume support** — scan for existing summary files to skip completed plans.
+2. **Read config via MCP** — call `config_get` to get parallelism settings, git configuration, and feature flags.
+3. **Fresh executor per task** — spawn a new konductor-executor for each task. Do not reuse executors across tasks.
+4. **Resume support** — scan for existing per-task summary files to skip completed tasks. A task is complete only when its summary has `## Status: DONE` AND `## Review Status: passed`.
+5. **Circuit breaker** — if 3+ tasks in the phase are BLOCKED, stop execution and report to user.
 ## Step 1: Read State and Config
-Call the `state_get` MCP tool to read current state, and call the `config_get` MCP tool for execution settings (parallelism, git config).
+Call the `state_get` MCP tool to read current state, and call the `config_get` MCP tool for execution settings (parallelism, git config, feature flags).
 Validate that `[current].step` is either:
 - `"planned"` — ready to start execution
@@ -30,95 +31,147 @@ Then stop.
 Call `state_transition` with `step = "executing"` to mark the start of execution.
-## Step 3: Load and Group Plans by Wave
+## Step 3: Load Plans, Extract Tasks, and Group by Wave
 Read all plan files from `.konductor/phases/{phase}/plans/`.
 For each plan file:
 1. Parse the TOML frontmatter (delimited by `+++` markers at start and end)
-2. Extract the `wave` field (required)
-3. Extract the `plan` field (plan number)
-4. Group plans by wave number
+2. Extract the `wave` field (required) and `plan` field (plan number)
+3. Parse the `## Tasks` section to extract individual tasks. Each task is a `### Task N` subsection. Record the task number and its content (action, files, verify, done criteria).
+4. Store the task list per plan: e.g., plan 1 has tasks [1, 2, 3], plan 2 has tasks [1, 2].
-**Wave ordering:** Plans execute in wave order (wave 1, then wave 2, etc.). Plans within a wave can execute in parallel if `max_wave_parallelism > 1`.
+Group plans by wave number. **Wave ordering:** Plans execute in wave order (wave 1, then wave 2, etc.). Plans within a wave can execute in parallel if `max_wave_parallelism > 1`.
 ## Step 4: Resume Check
-Scan `.konductor/phases/{phase}/plans/` for existing summary files.
+Scan `.konductor/phases/{phase}/plans/` for existing per-task summary files.
-**Summary file naming:** `{plan-number}-summary.md` (e.g., `001-summary.md`, `002-summary.md`)
+**Summary file naming:** `{plan-number}-task-{n}-summary.md` (e.g., `001-task-1-summary.md`, `001-task-2-summary.md`)
-For each plan:
-- If `{plan-number}-summary.md` exists, the plan is complete — skip it
-- If summary does not exist, the plan needs execution
+**Task completion definition:** A task is complete when BOTH conditions are met:
+1. Its summary file exists with `## Status: DONE`
+2. The summary contains `## Review Status: passed` (appended by the orchestrator after both review stages pass)
-Resume from the current wave with incomplete plans.
+**Plan completion:** A plan is complete when ALL its tasks are complete.
+**Resume logic:**
+- Find the first incomplete task in the first incomplete plan of the current wave
+- If a task has a summary with `NEEDS_CONTEXT` or `BLOCKED` status, report it to the user before resuming
+- Resume execution from that task
 ## Step 5: Wave Execution Loop
 For each wave (in ascending order):
-### 5.1: Update Wave State
+### 5.1: Execute Plans in Wave
-Track the current wave number for reporting purposes.
+Read `config.toml` field `execution.max_wave_parallelism`:
-### 5.2: Execute Plans in Wave
+- **Parallel mode** (`max_wave_parallelism > 1`): Each plan's task sequence runs independently in parallel.
+- **Sequential mode** (`max_wave_parallelism = 1`): Execute plans one at a time within the wave.
-Read `config.toml` field `execution.max_wave_parallelism`:
+For each plan in the wave, execute its tasks sequentially:
+#### Per-Task Dispatch Loop
+For each task in the plan (sequential within a plan):
+**5.1.1 — Dispatch Executor**
+Spawn a fresh **konductor-executor** agent with:
+- The plan file path (absolute path)
+- The specific task number to execute
+- Summaries from prior completed tasks in this plan (for context)
+- Git configuration: `git.auto_commit` and `git.branching_strategy`
+- Reference to `references/execution-guide.md` (status protocol, deviation rules, commit protocol)
+- Reference to `references/tdd.md` if plan frontmatter `type = "tdd"`
+Wait for `{plan-number}-task-{n}-summary.md` to be written.
+**5.1.2 — Handle Implementer Status**
+Read the `## Status` field from the task summary and handle per the implementer status protocol (see `references/execution-guide.md`):
-**If `max_wave_parallelism > 1` (parallel mode):**
-- For each plan in this wave, spawn a **konductor-executor** agent simultaneously
-- Each executor receives:
-  - Its specific plan file path (absolute path)
-  - The git configuration: `git.auto_commit` and `git.branching_strategy`
-  - Reference to `references/execution-guide.md` (deviation rules, commit protocol, analysis paralysis guard)
-  - Reference to `references/tdd.md` if plan frontmatter `type = "tdd"`
-- Wait for ALL executors in the wave to complete (check for summary files)
+- **DONE** → proceed to two-stage review (Step 5.1.3)
+- **DONE_WITH_CONCERNS** → read `## Concerns`. If concerns mention correctness issues, security risks, or spec deviations (actionable) → dispatch executor to address them, then proceed to review. If concerns are informational → proceed to review.
+- **NEEDS_CONTEXT** → read `## Missing Context`, provide the information, re-dispatch executor with the context (max 2 retries). If still NEEDS_CONTEXT after 2 retries → treat as BLOCKED.
+- **BLOCKED** → read `## Blocker`. Assess: context problem → provide context and re-dispatch; task too complex → split into smaller tasks. If assessment fails or the task remains blocked after re-dispatch, call `state_add_blocker` with the blocker description. Track blocked count. If 3+ tasks in the phase have been BLOCKED, trigger circuit breaker: stop execution and report all blockers to user.
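
The status handling above amounts to a small decision table plus a counter. The sketch below is a simplification under assumptions: it folds the BLOCKED assessment (provide context, split the task) into a single `"blocked"` outcome, and the action names, `PhaseTracker`, and `next_action` are hypothetical.

```python
from dataclasses import dataclass, field

MAX_CONTEXT_RETRIES = 2    # NEEDS_CONTEXT retries before the task is treated as BLOCKED
CIRCUIT_BREAKER_LIMIT = 3  # blocked tasks in a phase that stop execution


@dataclass
class PhaseTracker:
    """Tracks blocked tasks across a phase for the circuit breaker."""
    blocked: list[str] = field(default_factory=list)

    def record_blocked(self, task_id: str) -> bool:
        """Record a blocked task; return True when the circuit breaker should trip."""
        self.blocked.append(task_id)
        return len(self.blocked) >= CIRCUIT_BREAKER_LIMIT


def next_action(status: str, context_retries: int = 0, actionable_concerns: bool = False) -> str:
    """Map an implementer status to the orchestrator's next step."""
    if status == "DONE":
        return "review"
    if status == "DONE_WITH_CONCERNS":
        return "address_concerns" if actionable_concerns else "review"
    if status == "NEEDS_CONTEXT":
        return "redispatch_with_context" if context_retries < MAX_CONTEXT_RETRIES else "blocked"
    if status == "BLOCKED":
        return "blocked"
    raise ValueError(f"unknown status: {status!r}")
```

Deciding whether a concern is actionable is left to the caller here, since the skill text describes that triage as a judgment about correctness, security, or spec deviation rather than a mechanical test.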
-**If `max_wave_parallelism = 1` (sequential mode):**
-- Execute plans one at a time within the wave
-- Spawn one executor, wait for completion, then spawn the next
+**5.1.3 — Two-Stage Review** (if `config.toml` `features.code_review = true`)
-**Executor completion check:**
-- A plan is complete when `{plan-number}-summary.md` exists
-- If an executor crashes or produces no summary, treat it as a failure (see Step 5.4)
+If `features.code_review` is false, skip reviews and mark task complete (append `## Review Status: passed` to the task summary).
-### 5.3: Write Result Files
+**Stage 1 — Spec Compliance Review:**
+Dispatch **konductor-spec-reviewer** agent with:
+- The task spec (the specific `### Task N` section from the plan file)
+- The task summary file (`{plan}-task-{n}-summary.md`)
+- The modified files listed in the summary
-After each plan completes (successfully or with errors), write `.konductor/.results/execute-{phase}-plan-{n}.toml`:
+The reviewer checks whether the implementation matches the task specification and writes findings to `{plan}-task-{n}-spec-review.md`.
+If the reviewer reports issues:
+1. Dispatch **konductor-executor** to fix the reported issues
+2. Re-run **konductor-spec-reviewer** to verify fixes
+3. Maximum 2 review-fix iterations. If still failing → log as needing manual intervention, continue to next task.
+**Stage 2 — Code Quality Review:**
+Dispatch **konductor-code-reviewer** agent with:
+- The task summary file
+- The modified files listed in the summary
+- Git diff for the task's changes
+The reviewer checks code quality (correctness, error handling, security, duplication, performance, dead code, consistency) and writes findings to `{plan}-task-{n}-quality-review.md`.
+If the reviewer reports issues:
+1. Dispatch **konductor-executor** to fix the reported issues
+2. Re-run **konductor-code-reviewer** to verify fixes
+3. Maximum 2 review-fix iterations. If still failing → log as needing manual intervention, continue to next task.
+**After both stages pass:** Append `## Review Status: passed` to the task summary file. Mark task complete.
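
Both review stages share the same review, fix, re-review shape with a cap of two iterations. One way to read that rule is sketched below; it assumes an "iteration" means one fix followed by one re-review, so a stage runs at most three reviews in total. `review_with_fixes` and its callable parameters are hypothetical stand-ins for dispatching the reviewer and executor agents.

```python
from typing import Callable

MAX_REVIEW_FIX_ITERATIONS = 2


def review_with_fixes(review: Callable[[], list[str]], fix: Callable[[list[str]], None]) -> bool:
    """Run a reviewer; on issues, dispatch a fix and re-review, up to the iteration cap.

    Returns True if a review comes back clean, False if issues remain after the cap
    (the task is then logged as needing manual intervention).
    """
    issues = review()
    for _ in range(MAX_REVIEW_FIX_ITERATIONS):
        if not issues:
            return True
        fix(issues)
        issues = review()
    return not issues
```

Under this reading, `## Review Status: passed` would be appended only when the function returns True for the spec stage and then for the quality stage.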
+### 5.2: Write Result Files
+After each plan completes (all tasks done, successfully or with errors), write `.konductor/.results/execute-{phase}-plan-{n}.toml`:
 ```toml
 step = "execute"
 phase = "{phase}"
 plan = {plan_number}
 wave = {wave_number}
-status = "ok"  # or "error" if executor failed
+status = "ok"  # or "error" if any task failed
+tasks_total = {total_tasks}
+tasks_completed = {completed_tasks}
 timestamp = {current ISO timestamp}
 ```
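
For illustration, the result file above could be rendered as follows. This is a sketch under two labeled assumptions that the template does not settle: a plan with any unfinished task is reported as `"error"`, and the timestamp is written as a quoted ISO 8601 string in UTC. `render_result` is a hypothetical helper, and the phase name is assumed to contain no characters that need TOML escaping.

```python
from datetime import datetime, timezone


def render_result(phase: str, plan: int, wave: int, tasks_total: int, tasks_completed: int) -> str:
    """Render the body of execute-{phase}-plan-{n}.toml for one finished plan."""
    # Assumption: a plan with any unfinished task is reported as "error".
    status = "ok" if tasks_completed == tasks_total else "error"
    # Assumption: the timestamp is a quoted ISO 8601 string in UTC.
    timestamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return "\n".join([
        'step = "execute"',
        f'phase = "{phase}"',
        f"plan = {plan}",
        f"wave = {wave}",
        f'status = "{status}"',
        f"tasks_total = {tasks_total}",
        f"tasks_completed = {tasks_completed}",
        f'timestamp = "{timestamp}"',
    ]) + "\n"
```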
-### 5.4: Error Handling
+### 5.3: Error Handling
 If an executor fails (crashes, times out, or reports errors):
 1. Write `.konductor/.results/execute-{phase}-plan-{n}.toml` with `status = "error"` and error details
-2. **Continue** with remaining plans in the wave (do not stop)
-3. Track failed plan numbers
-4. At the end of the wave, report which plans failed
+2. **Continue** with remaining tasks/plans in the wave (do not stop unless circuit breaker triggers)
+3. Track failed task numbers
+4. At the end of the wave, report which tasks failed
-**Do NOT retry failed executors automatically.** Let the user decide how to proceed.
+### 5.4: Update Progress Counters
-### 5.5: Update Progress Counters
+After each wave completes, track progress (completed tasks/plans count and percentage) for reporting.
-After each wave completes, track progress (completed plans count and percentage) for reporting.
-## Step 6: Code Review (if enabled)
+## Step 6: Code Review — Holistic Final Pass (if enabled)
 If `config.toml` `features.code_review = true`:
+Per-task reviews (spec compliance + code quality) have already been performed during execution. This phase-level code review is a **holistic final pass** checking cross-task consistency:
+- Shared interfaces match across plans (types, function signatures, API contracts)
+- Naming conventions are consistent across all modified files
+- Integration points work correctly (modules wire together properly)
+- No cross-plan duplication (shared logic extracted)
 Spawn a **konductor-code-reviewer** agent. Provide it with:
-- `.konductor/.tracking/modified-files.log` (list of changed files)
-- All `*-summary.md` files from `.konductor/phases/{phase}/plans/`
+- `.konductor/.tracking/modified-files.log` (list of all changed files)
+- All `*-task-*-summary.md` files from `.konductor/phases/{phase}/plans/`
 - The phase name and plan files for context
-- Instructions: review all modified source files, run tests and linting, do NOT fix any issues — only report them with file, line, description, and severity (minor/significant). Write findings to `.konductor/phases/{phase}/code-review.md`.
+- Instructions: focus on cross-task and cross-plan consistency, not individual task correctness (already reviewed). Write findings to `.konductor/phases/{phase}/code-review.md`.
 Wait for the reviewer to complete. Read `code-review.md`.
@@ -131,23 +184,24 @@ Wait for the reviewer to complete. Read `code-review.md`.
 ## Step 7: Set Executed State
-After code review completes (with no blocking issues), call `state_transition` with `step = "executed"` to advance the pipeline.
+After all execution and reviews complete (with no blocking issues), call `state_transition` with `step = "executed"` to advance the pipeline.
 Tell the user:
-- Total plans executed
-- Plans succeeded vs. failed (if any)
-- Code review findings (issues fixed, warnings reported)
+- Total plans and tasks executed
+- Tasks succeeded vs. failed (if any)
+- Per-task review results (spec + quality)
+- Phase-level code review findings (if enabled)
 - Next step suggestion: "Say 'next' to verify the phase."
-If any plans failed, list them and suggest:
-> "Review the errors in `.results/execute-{phase}-plan-{n}.toml` files. You can re-run individual plans or fix issues manually."
+If any tasks failed or need manual intervention, list them and suggest:
+> "Review the task summaries and review files in `.konductor/phases/{phase}/plans/`. You can re-run execution to retry incomplete tasks."
 ## Error Handling
 **Executor crashes:**
 If an executor subagent crashes:
 1. Write error result file for that plan
-2. Continue with remaining plans
+2. Continue with remaining tasks/plans
 3. Report the failure at the end of execution
 **State corruption:**

package/skills/konductor-exec/references/execution-guide.md CHANGED

@@ -1,20 +1,22 @@
 # Execution Guide — For Konductor Executor Agents
-This guide is for executor subagents that implement individual plans. You are responsible for executing one plan from start to finish.
+This guide is for executor subagents that implement individual tasks. You are responsible for executing one task and writing a per-task summary.
 ## Your Role
 You are a **konductor-executor** agent. You receive:
-- A plan file with tasks to complete
+- A plan file (for context on the overall goal and prior tasks)
+- A specific task number to execute
+- Summaries from prior completed tasks in this plan (for context)
 - Git configuration (auto-commit, branching strategy)
 - Reference to this guide
 Your job:
-1. Read and understand the plan
-2. Execute each task in order
+1. Read and understand the assigned task
+2. Execute the task
 3. Write tests when required (TDD plans)
 4. Commit changes following the protocol
-5. Write a summary when done
+5. Write a per-task summary with your status
 ## Deviation Rules
@@ -120,15 +122,11 @@ Make commits atomic and descriptive. Follow this protocol for every commit.
 ### Commit Frequency
-**One commit per task** (preferred):
-- After completing each task, commit the changes
+**One commit per task** (required):
+- After completing your assigned task, commit the changes
 - Keeps history granular and reviewable
 - Easier to roll back individual changes
-**Exceptions:**
-- If tasks are tightly coupled and splitting commits would break functionality, combine them
-- Always explain in the commit body why tasks were combined
 ### Staging Files
 **IMPORTANT:** Stage specific files, never use `git add -A` or `git add .`
@@ -201,84 +199,144 @@ Check `config.toml` field `git.auto_commit`:
 - Reading referenced interfaces from dependencies (doesn't count)
 - First-time codebase exploration at start of plan (first 3 reads don't count)
+## Implementer Status Protocol
+After completing a task (or failing to), report exactly one of these four statuses in your summary file. The orchestrator uses your status to decide what happens next.
+### DONE
+Task completed successfully. All files created/modified, tests pass, verify step satisfied.
+**Orchestrator action:** Proceed to spec review, then code quality review.
+### DONE_WITH_CONCERNS
+Task completed, but you have doubts or observations the orchestrator should know about.
+**Orchestrator triage:**
+- **Actionable concerns** (potential correctness issues, security risks, spec deviations) → orchestrator dispatches an executor to address them before proceeding to review.
+- **Informational concerns** (considered alternative approach, style preferences, future improvement ideas) → orchestrator proceeds directly to review.
+**Examples of actionable concerns:**
+- "The spec says validate email format, but I used a simple regex that may miss edge cases"
+- "This endpoint accepts user input without rate limiting"
+- "The plan says return 404, but the existing codebase returns 204 for missing resources"
+**Examples of informational concerns:**
+- "Considered using a builder pattern but kept it simple per the plan"
+- "This function could be split further in a future refactor"
+### NEEDS_CONTEXT
+You cannot complete the task because information is missing. Be specific about what you need.
+**Orchestrator action:** Provide the missing context and re-dispatch you. Maximum 2 retries — if still blocked after 2 attempts, escalate to user.
+**Your summary must include a `## Missing Context` section listing exactly what you need.**
+### BLOCKED
+You cannot complete the task due to a technical or architectural issue.
+**Orchestrator assessment:**
+- **Context problem** → provide context and re-dispatch
+- **Task too complex** → split into smaller tasks
+- **Plan wrong** → escalate to user
+**Your summary must include a `## Blocker` section describing the issue.**
+**Default rule:** If you encounter an issue you cannot classify into the other three statuses, use BLOCKED with a description.
 ## Summary Writing
-After completing all tasks (or encountering a blocker), write a summary file.
+After completing each task (or encountering a blocker), write a per-task summary file.
-**File location:** `.konductor/phases/{phase}/plans/{plan-number}-summary.md`
+**File location:** `.konductor/phases/{phase}/plans/{plan}-task-{n}-summary.md`
 **File name examples:**
-- `001-summary.md`
-- `002-summary.md`
-- `010-summary.md`
+- `001-task-1-summary.md` — Plan 001, Task 1
+- `001-task-2-summary.md` — Plan 001, Task 2
+- `003-task-1-summary.md` — Plan 003, Task 1
 ### Summary Structure
 ```markdown
-# Plan {plan-number} Summary
+# Plan {plan} — Task {n} Summary
-## Status
-[Completed | Blocked | Partial]
+## Status: DONE
 ## Files Created
 - `src/models/user.rs` — User struct with password hashing
-- `src/db/migrations/001_users.sql` — Users table migration
 ## Files Modified
-- `src/routes/auth.rs` — Added registration endpoint
 - `Cargo.toml` — Added bcrypt dependency
 ## Tests Added
 - `user::test_password_hashing` — Verifies bcrypt integration
-- `auth::test_registration_endpoint` — Verifies POST /auth/register
 ### Test Results
 ```
 cargo test user
-   Compiling auth-system v0.1.0
-    Finished test [unoptimized + debuginfo] target(s) in 2.3s
-     Running unittests (target/debug/deps/auth_system-abc123)
-running 2 tests
+running 1 test
 test user::test_password_hashing ... ok
-test auth::test_registration_endpoint ... ok
-test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
+test result: ok. 1 passed; 0 failed
 ```
 ## Deviations from Plan
-1. **Rule 1:** Fixed type error in existing auth routes that prevented compilation
-2. **Rule 2:** Added email validation to registration endpoint (plan didn't specify)
-3. **Rule 3:** Added bcrypt to Cargo.toml (plan assumed it was already present)
+1. **Rule 3:** Added bcrypt to Cargo.toml (plan assumed it was already present)
 ## Decisions Made
-- Used bcrypt cost factor of 12 (industry standard for password hashing)
-- Made password_hash field private to prevent accidental exposure
-- Added index on users.email for faster lookups during login
-## Blockers Encountered
-None. All tasks completed successfully.
+- Used bcrypt cost factor of 12 (industry standard)
 ## Verification
-All must_haves from plan frontmatter verified:
-- [x] Users can register with email and password (POST /auth/register returns 201)
-- [x] Passwords are hashed with bcrypt (verified in test)
-- [x] User model imported by auth routes (compiler confirms)
+- [x] User model exists with password hashing (compiler confirms)
+```
+### Conditional Sections by Status
+Include these sections only when the status requires them:
+**DONE_WITH_CONCERNS** — add `## Concerns`:
+```markdown
+## Status: DONE_WITH_CONCERNS
+## Concerns
+- Email validation uses a simple regex that may miss edge cases (potential correctness issue)
+- Considered using a builder pattern but kept it simple per the plan (informational)
+```
+**NEEDS_CONTEXT** — add `## Missing Context`:
+```markdown
+## Status: NEEDS_CONTEXT
+## Missing Context
+- What authentication strategy does the existing codebase use? (JWT vs sessions)
+- Is there an existing User type in `src/models/` that should be extended?
+```
+**BLOCKED** — add `## Blocker`:
+```markdown
+## Status: BLOCKED
+## Blocker
+The plan requires adding a DynamoDB table, but the SAM template uses a format incompatible with the existing deployment pipeline. This is an architectural decision (Rule 4).
 ```
 ### Summary Requirements
 Your summary MUST include:
-- **Status:** One of [Completed, Blocked, Partial]
-- **Files created:** List with brief descriptions
-- **Files modified:** List with brief descriptions
-- **Tests added:** Test names and what they verify
-- **Test results:** Actual output from test runner (paste full output)
+- **Status:** One of DONE, DONE_WITH_CONCERNS, NEEDS_CONTEXT, BLOCKED
+- **Files created:** List with brief descriptions (if any)
+- **Files modified:** List with brief descriptions (if any)
+- **Tests added:** Test names and what they verify (if any)
+- **Test results:** Actual output from test runner (if tests were run)
 - **Deviations:** Every deviation with rule number and explanation
 - **Decisions:** Technical choices you made
-- **Blockers:** Any issues that stopped you (or "None")
-- **Verification:** Checklist of must_haves from plan frontmatter
+- **Verification:** Checklist of task verify/done criteria from the plan
+**Conditional sections:** Include Concerns, Missing Context, or Blocker as required by your status.
-**If blocked:** Explain clearly what stopped you and what information or decision is needed to proceed.
+**Note:** After both spec compliance and code quality reviews pass, the orchestrator appends `## Review Status: passed` to your summary file. You do not write this field yourself — it is managed by the orchestrator.
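
The status and conditional-section requirements above lend themselves to a quick structural check before the orchestrator acts on a summary. The sketch below is hypothetical and covers only the status line and the section that each status requires; `REQUIRED_SECTION` and `check_summary` are names introduced for illustration.

```python
import re

# Section each status must carry, per the conditional-sections rule; None means no extra section.
REQUIRED_SECTION = {
    "DONE": None,
    "DONE_WITH_CONCERNS": "## Concerns",
    "NEEDS_CONTEXT": "## Missing Context",
    "BLOCKED": "## Blocker",
}

STATUS_LINE = re.compile(r"^## Status: (\w+)\s*$", re.MULTILINE)


def check_summary(text: str) -> list[str]:
    """Return a list of problems with a per-task summary; empty means it looks well-formed."""
    match = STATUS_LINE.search(text)
    if match is None:
        return ["missing `## Status: <STATUS>` line"]
    status = match.group(1)
    if status not in REQUIRED_SECTION:
        return [f"unknown status {status!r}"]
    section = REQUIRED_SECTION[status]
    lines = [line.strip() for line in text.splitlines()]
    if section is not None and section not in lines:
        return [f"status {status} requires a `{section}` section"]
    return []
```

A summary written in the 0.12.4 format (`Completed`, `Blocked`, `Partial` on a separate line) would fail this check, which reflects the move to the four-status protocol in 0.13.0.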
 ## Working with TDD Plans

package/skills/konductor-next/SKILL.md CHANGED

@@ -90,36 +90,61 @@ The phase is ready for execution. Run the **Execution Pipeline**:
 3. Group plans by wave number (wave 1 first, then 2, etc.).
 4. Call `state_transition` with `step = "executing"`.
-5. **For each wave** (in order):
-   Update wave tracking as needed.
+5. **Per-Task Wave Execution Loop:**
-   **If `max_wave_parallelism > 1` (parallel mode):**
-   For each plan in this wave, use the **konductor-executor** agent to execute it. Launch all plans in the wave simultaneously. Each executor receives:
-   - Its specific plan file path
-   - Whether to auto-commit (`git.auto_commit`)
-   - The branching strategy (`git.branching_strategy`)
-   - Reference: see `references/execution-guide.md` in the konductor-exec skill
-   Wait for ALL executors to complete (check for summary files).
+   For each wave (in ascending order):
-   **If `max_wave_parallelism = 1` (sequential mode):**
-   Execute plans one at a time, in order within the wave.
+   For each plan in the wave (parallel if `max_wave_parallelism > 1`, sequential otherwise):
+   Parse the plan's `## Tasks` section to extract individual tasks. For each task (sequential within a plan):
-   After each wave completes, track progress.
+   **5a. Dispatch executor:**
+   Spawn a fresh **konductor-executor** agent with:
+   - The plan file path and the specific task number to execute
+   - Summaries from prior completed tasks in this plan (for context)
+   - Git config: `git.auto_commit` and `git.branching_strategy`
+   - Reference: see `references/execution-guide.md` in the konductor-exec skill (status protocol, deviation rules)
+   Wait for `{plan}-task-{n}-summary.md` in `.konductor/phases/{phase}/plans/`.
-6. Write `.konductor/.results/execute-{phase}-plan-{n}.toml` for each completed plan.
-7. **Code Review** (if `config.toml` `features.code_review = true`):
-   Spawn **konductor-code-reviewer** agent with: `.konductor/.tracking/modified-files.log`, all `*-summary.md` files from plans directory, phase name. The reviewer writes `.konductor/phases/{phase}/code-review.md`.
-   If issues found: spawn a **konductor-executor** agent with the issues to fix them, then re-run the reviewer. Maximum 3 review-fix iterations. If still unresolved, call `state_add_blocker` and report to user.
-8. Call `state_transition` with `step = "executed"`.
-9. Tell the user: "Phase {phase} executed. N plans completed. Say 'next' to verify."
+   **5b. Handle implementer status:**
+   Read the `## Status` field from the task summary:
+   - **DONE** → proceed to 5c (two-stage review).
+   - **DONE_WITH_CONCERNS** → read `## Concerns`. If concerns mention correctness issues, security risks, or spec deviations: dispatch a fresh **konductor-executor** to address them, then proceed to 5c. If concerns are informational (style preferences, alternative approaches considered): proceed to 5c.
+   - **NEEDS_CONTEXT** → read `## Missing Context`, provide the requested information, re-dispatch a fresh executor for the same task. Maximum 2 retries. If still NEEDS_CONTEXT after retries, treat as BLOCKED.
+   - **BLOCKED** → read `## Blocker`. Assess: context problem → provide context and re-dispatch; task too complex → split into smaller tasks. If assessment fails or the task remains blocked after re-dispatch, call `state_add_blocker` with the blocker description. If 3 or more tasks in this phase have been BLOCKED, trigger circuit breaker: stop execution entirely and report all blockers to the user. Otherwise continue with the next task.
+   **5c. Two-stage review** (if `config.toml` `features.code_review = true`; skip both stages if disabled, and append `## Review Status: passed` to the task summary so resume logic works correctly):
+   **Stage 1 — Spec Compliance:**
+   Spawn **konductor-spec-reviewer** with the task spec (from the plan file), the task summary, and modified files. The reviewer writes `{plan}-task-{n}-spec-review.md`.
+   If issues found: spawn a fresh **konductor-executor** with the issues to fix, then re-run the spec reviewer. Maximum 2 iterations.
+   **Stage 2 — Code Quality:**
+   Spawn **konductor-code-reviewer** with the task summary, modified files, and git diff for the task. The reviewer writes `{plan}-task-{n}-quality-review.md`.
+   If issues found: spawn a fresh **konductor-executor** with the issues to fix, then re-run the quality reviewer. Maximum 2 iterations.
+   After both stages pass: append `## Review Status: passed` to the task summary file. Mark task complete.
+   **5d. Write result file** after each plan completes (all tasks done):
+   Write `.konductor/.results/execute-{phase}-plan-{n}.toml` with status and timestamp.
+6. **Phase-Level Code Review** (optional holistic final pass, if `config.toml` `features.code_review = true`):
+   Per-task reviews have already been performed. This step checks cross-task consistency (shared interfaces, naming conventions, integration points).
+   Spawn **konductor-code-reviewer** with `.konductor/.tracking/modified-files.log`, all task summary files, and phase name. The reviewer writes `.konductor/phases/{phase}/code-review.md`.
+   If significant cross-task issues found: spawn a **konductor-executor** to fix, then re-review. Maximum 3 iterations. If still unresolved, call `state_add_blocker` and report to user.
+7. Call `state_transition` with `step = "executed"`.
+8. Tell the user: "Phase {phase} executed. N plans completed. Say 'next' to verify."
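
The status handling in step 5b can be read as a small decision table. The sketch below is one possible reading, not konductor's implementation. The `Action` names, the `serious_concerns` flag, and the `blocked_so_far` counter are hypothetical. The `DONE` arm is assumed to go straight to review, and the BLOCKED assessment (provide context or split the task) is collapsed into a single step.

```rust
/// Status an executor reports in its task summary (names from the skill text).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Done,
    DoneWithConcerns,
    NeedsContext,
    Blocked,
}

/// What the orchestrator does next. Hypothetical names, for illustration only.
#[derive(Debug, PartialEq)]
enum Action {
    Review,        // proceed to 5c
    FixThenReview, // dispatch a fresh executor for the concerns, then 5c
    Redispatch,    // provide the missing context and retry the same task
    RecordBlocker, // call state_add_blocker, continue with the next task
    CircuitBreak,  // stop execution and report all blockers
}

const MAX_CONTEXT_RETRIES: u32 = 2;
const BLOCKED_LIMIT: u32 = 3;

/// `context_retries` counts re-dispatches already made for this task;
/// `blocked_so_far` counts other tasks in the phase that ended up BLOCKED.
fn next_action(
    status: Status,
    serious_concerns: bool,
    context_retries: u32,
    blocked_so_far: u32,
) -> Action {
    match status {
        Status::Done => Action::Review,
        Status::DoneWithConcerns if serious_concerns => Action::FixThenReview,
        Status::DoneWithConcerns => Action::Review,
        Status::NeedsContext if context_retries < MAX_CONTEXT_RETRIES => Action::Redispatch,
        // Retries exhausted: NEEDS_CONTEXT is treated as BLOCKED.
        Status::NeedsContext | Status::Blocked => {
            if blocked_so_far + 1 >= BLOCKED_LIMIT {
                Action::CircuitBreak
            } else {
                Action::RecordBlocker
            }
        }
    }
}
```

For example, `next_action(Status::NeedsContext, false, 2, 0)` yields `Action::RecordBlocker`, because both retries are already used. Whether the current task counts toward the limit of 3 is an interpretation of the text above.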
 ### Case: `step = "executing"`
-Execution was interrupted. Resume:
-1. Check which `{plan}-summary.md` files exist in `.konductor/phases/{phase}/plans/`.
-2. Plans with summaries are complete — skip them.
-3. Resume from the first incomplete plan in the current wave.
-4. Continue the Execution Pipeline from step 5 above.
+Execution was interrupted. Resume at task-level granularity:
+1. Scan `.konductor/phases/{phase}/plans/` for `{plan}-task-{n}-summary.md` files.
+2. A task is complete only when BOTH conditions are met:
+   - Its summary file exists with `## Status: DONE`
+   - The summary contains `## Review Status: passed` (added by the orchestrator after both review stages pass)
+3. A plan is complete when ALL its tasks meet the above definition.
+4. Check for any tasks with `## Status: NEEDS_CONTEXT` or `## Status: BLOCKED` — report these to the user before resuming.
+5. Resume from the first incomplete task in the first incomplete plan of the current wave.
+6. Continue the Execution Pipeline from step 5 above.
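
The completion rule in step 2 amounts to a two-marker check on each task summary. A minimal sketch, assuming summaries are read into strings by the caller. The function names are hypothetical, and matching whole lines (rather than substrings) is a choice made here so that `## Status: DONE_WITH_CONCERNS` is not mistaken for `## Status: DONE`.

```rust
/// Returns true when a task summary satisfies both resume conditions.
fn is_task_complete(summary: &str) -> bool {
    // Compare whole trimmed lines so "DONE_WITH_CONCERNS" does not match "DONE".
    let has_line = |wanted: &str| summary.lines().any(|l| l.trim() == wanted);
    has_line("## Status: DONE") && has_line("## Review Status: passed")
}

/// Index of the first incomplete task, or None when every task is complete.
/// `summaries[i]` holds the content of task i+1's summary file,
/// or None if that file does not exist yet.
fn first_incomplete_task(summaries: &[Option<&str>]) -> Option<usize> {
    summaries
        .iter()
        .position(|s| !s.map_or(false, is_task_complete))
}
```

A plan is then complete exactly when `first_incomplete_task` returns `None` for its tasks.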
 ### Case: `step = "executed"`

package/skills/konductor-plan/references/planning-guide.md CHANGED

@@ -39,26 +39,80 @@ Plans execute in waves. Wave dependencies must form a DAG (directed acyclic grap
 ## Task Sizing
-Each plan contains 2-5 tasks. Each task should take 15-60 minutes of execution time.
+Each plan contains 2-5 tasks. Each task is broken into **bite-sized steps that take 2-5 minutes each**. Every step that involves code must include the actual code block — no prose descriptions of what to write.
 **If a plan would need more than 5 tasks:** Split it into multiple plans in the same wave.
 **Task structure:**
 - **files:** Which files to modify/create
-- **action:** What to do (be specific)
+- **action:** What to do — broken into numbered steps, each 2-5 minutes
 - **verify:** How to check success (command to run)
 - **done:** What "done" looks like (observable outcome)
-**Example task:**
+**Each step within a task must:**
+1. Be completable in 2-5 minutes
+2. Include the exact code to write (for code steps)
+3. Include the command to run (for verification steps)
+4. Be independently verifiable
+**Example task with bite-sized steps (TDD):**
 ```markdown
 ### Task 2: Add password hashing to User model
 - **files:** `src/models/user.rs`
-- **action:** Import bcrypt crate, add `hash_password` method to User impl, call it in `new` constructor
-- **verify:** `cargo test user::test_password_hashing`
-- **done:** Passwords are hashed with bcrypt before storage
+- **action:**
+  1. Write failing test:
+     ```rust
+     #[cfg(test)]
+     mod tests {
+         use super::*;
+         #[test]
+         fn test_password_is_hashed() {
+             let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
+             assert_ne!(user.password_hash(), "Secret123!");
+             assert!(user.verify_password("Secret123!"));
+         }
+     }
+     ```
+  2. Run `cargo test user::tests::test_password_is_hashed` — confirm it fails (method not found)
+  3. Implement:
+     ```rust
+     use bcrypt::{hash, verify, DEFAULT_COST};
+     impl User {
+         pub fn new(email: String, password: String) -> Result<Self, AuthError> {
+             let password_hash = hash(&password, DEFAULT_COST)
+                 .map_err(|_| AuthError::HashError)?;
+             Ok(Self { id: Uuid::new_v4(), email, password_hash, created_at: Utc::now() })
+         }
+         pub fn password_hash(&self) -> &str { &self.password_hash }
+         pub fn verify_password(&self, password: &str) -> bool {
+             verify(password, &self.password_hash).unwrap_or(false)
+         }
+     }
+     ```
+  4. Run `cargo test user::tests::test_password_is_hashed` — confirm it passes
+- **verify:** `cargo test user::tests::test_password_is_hashed`
+- **done:** Passwords are hashed with bcrypt before storage, verified by test
 ```
+## No Placeholders
+Every step must contain the actual content an engineer needs to execute it. The **konductor-plan-checker** agent enforces this rule and will reject plans that violate it.
+**Banned patterns:**
+- `"TBD"`, `"TODO"`, `"implement later"`, `"fill in details"`
+- `"Add appropriate error handling"` / `"add validation"` / `"handle edge cases"` (show the actual error handling, validation, or edge case code)
+- `"Write tests for the above"` without actual test code (include the test code)
+- `"Similar to Task N"` (repeat the code — the engineer may be reading tasks out of order)
+- Steps that describe what to do without showing how (code blocks required for code steps)
+- References to types, functions, or methods not defined in any prior or current task
+**Rule:** If a step involves writing code, the step must include the code block. If a step involves running a command, the step must include the command. No exceptions.
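
The phrase-based part of this rule lends itself to a mechanical scan. The following is a rough sketch only, not how the konductor-plan-checker agent works (that agent is prompt-driven). The `BANNED` list and `find_placeholders` are hypothetical, fenced code is skipped as a simplifying assumption, and substring matching will produce false positives. Checks such as "test step without test code" or "reference to an undefined type" need more than string matching and are not covered.

```rust
/// Lowercased phrases taken from the banned-pattern list above.
const BANNED: [&str; 8] = [
    "tbd",
    "todo",
    "implement later",
    "fill in details",
    "add appropriate error handling",
    "add validation",
    "handle edge cases",
    "similar to task",
];

/// Returns (1-based line number, banned phrase) for each hit in plan prose.
/// Lines inside fenced code blocks are skipped.
fn find_placeholders(plan: &str) -> Vec<(usize, &'static str)> {
    let mut in_fence = false;
    let mut hits = Vec::new();
    for (i, line) in plan.lines().enumerate() {
        // A line starting with three backticks opens or closes a fence.
        if line.trim_start().starts_with("```") {
            in_fence = !in_fence;
            continue;
        }
        if in_fence {
            continue;
        }
        let lower = line.to_lowercase();
        for phrase in BANNED {
            if lower.contains(phrase) {
                hits.push((i + 1, phrase));
            }
        }
    }
    hits
}
```

Each hit carries the line number so a report can quote the surrounding text, as the plan-checker prompt asks for.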
 ## Plan File Format
 Each plan is a markdown file with TOML frontmatter and a structured body.
@@ -68,7 +122,7 @@ Each plan is a markdown file with TOML frontmatter and a structured body.
 - `plan`: Plan number within the phase (1, 2, 3...)
 - `wave`: Execution wave (1, 2, 3...)
 - `depends_on`: List of plan numbers this plan depends on (e.g., `[1, 2]`)
-- `type`: Either "execute" (standard implementation) or "tdd" (test-driven)
+- `type`: Either "tdd" (test-driven, default) or "execute" (standard implementation). Use `type = "execute"` to opt out of TDD for infrastructure, configuration, or documentation tasks. The planner must always emit an explicit `type` field.
 - `autonomous`: Boolean, true if executor can proceed without human input
 - `requirements`: List of REQ-XX identifiers this plan addresses
 - `files_modified`: List of files this plan will touch (helps with merge conflict prediction)
@@ -93,7 +147,7 @@ phase = "01-auth-system"
 plan = 1
 wave = 1
 depends_on = []
-type = "execute"
+type = "tdd"
 autonomous = true
 requirements = ["REQ-01", "REQ-02"]
 files_modified = ["src/models/user.rs", "src/db/migrations/001_users.sql"]
@@ -138,33 +192,120 @@ impl User {
 ## Tasks
-### Task 1: Create User struct
+### Task 1: Create User struct with tests
 - **files:** `src/models/user.rs`
-- **action:** Define User struct with fields: id (UUID), email (String), password_hash (String), created_at (DateTime)
-- **verify:** `cargo check` passes
-- **done:** User struct compiles
+- **action:**
+  1. Write failing test:
+     ```rust
+     #[cfg(test)]
+     mod tests {
+         use super::*;
+         #[test]
+         fn test_new_user_has_correct_email() {
+             let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
+             assert_eq!(user.email, "test@example.com");
+         }
+         #[test]
+         fn test_new_user_has_uuid() {
+             let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
+             assert!(!user.id.is_nil());
+         }
+     }
+     ```
+  2. Run `cargo test models::user::tests` — confirm it fails (User not defined)
+  3. Implement:
+     ```rust
+     use chrono::{DateTime, Utc};
+     use uuid::Uuid;
+     pub struct User {
+         pub id: Uuid,
+         pub email: String,
+         password_hash: String,
+         pub created_at: DateTime<Utc>,
+     }
+     impl User {
+         pub fn new(email: String, password: String) -> Result<Self, AuthError> {
+             Ok(Self {
+                 id: Uuid::new_v4(),
+                 email,
+                 password_hash: password, // placeholder — next task adds hashing
+                 created_at: Utc::now(),
+             })
+         }
+     }
+     ```
+  4. Run `cargo test models::user::tests` — confirm both tests pass
+- **verify:** `cargo test models::user::tests`
+- **done:** User struct compiles and passes basic tests
 ### Task 2: Add password hashing
 - **files:** `src/models/user.rs`
-- **action:** Import bcrypt, add `hash_password` method, call in constructor
-- **verify:** `cargo test user::test_password_hashing`
-- **done:** Passwords are hashed before storage
+- **action:**
+  1. Add failing test:
+     ```rust
+     #[test]
+     fn test_password_is_hashed_not_plaintext() {
+         let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
+         assert_ne!(user.password_hash, "Secret123!");
+     }
+     #[test]
+     fn test_verify_correct_password() {
+         let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
+         assert!(user.verify_password("Secret123!"));
+     }
+     #[test]
+     fn test_verify_wrong_password() {
+         let user = User::new("test@example.com".into(), "Secret123!".into()).unwrap();
+         assert!(!user.verify_password("WrongPass1!"));
+     }
+     ```
+  2. Run `cargo test models::user::tests` — confirm new tests fail
+  3. Update `User::new` and add `verify_password`:
+     ```rust
+     use bcrypt::{hash, verify, DEFAULT_COST};
+     impl User {
+         pub fn new(email: String, password: String) -> Result<Self, AuthError> {
+             let password_hash = hash(&password, DEFAULT_COST)
+                 .map_err(|_| AuthError::HashError)?;
+             Ok(Self { id: Uuid::new_v4(), email, password_hash, created_at: Utc::now() })
+         }
+         pub fn verify_password(&self, password: &str) -> bool {
+             verify(password, &self.password_hash).unwrap_or(false)
+         }
+     }
+     ```
+  4. Run `cargo test models::user::tests` — confirm all 5 tests pass
+- **verify:** `cargo test models::user::tests`
+- **done:** Passwords are hashed with bcrypt, verified by 3 new tests
 ### Task 3: Create migration
 - **files:** `src/db/migrations/001_users.sql`
-- **action:** Write CREATE TABLE users with columns matching User struct
-- **verify:** `sqlx migrate run` succeeds
-- **done:** users table exists in database
-### Task 4: Wire User model to auth routes
-- **files:** `src/routes/auth.rs`
-- **action:** Import User model, use it in registration handler
-- **verify:** Compilation succeeds, User is referenced
-- **done:** Registration route can create User instances
+- **action:**
+  1. Write the migration:
+     ```sql
+     CREATE TABLE users (
+         id UUID PRIMARY KEY,
+         email VARCHAR(255) NOT NULL UNIQUE,
+         password_hash VARCHAR(255) NOT NULL,
+         created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
+     );
+     CREATE INDEX idx_users_email ON users (email);
+     ```
+  2. Run `sqlx migrate run` — confirm it succeeds
+- **verify:** `sqlx migrate run`
+- **done:** users table exists in database with email unique constraint
 ```
 ## Phase-Level Design Document
@@ -245,37 +386,71 @@ A requirement can span multiple plans. Example: "REQ-05: Users can manage their
 - Plan 3: View profile (requirements = ["REQ-05"])
 - Plan 4: Edit profile (requirements = ["REQ-05"])
-## TDD Detection
+## TDD as Default
+TDD is the default execution mode for all plans. The planner must always emit `type = "tdd"` unless the plan explicitly opts out.
-If a task can be expressed as "expect(fn(input)).toBe(output)", make it a TDD plan.
+**Backward compatibility:** Existing plans without an explicit `type` field are treated as `"execute"`. The planner must always emit an explicit `type` field going forward, making the default moot for well-formed plans.
-**Indicators:**
-- Pure functions (no I/O)
-- Clear input/output contract
-- Algorithmic logic (sorting, parsing, validation)
-- Data transformations
+**Opt-out with `type = "execute"`:** Use `type = "execute"` for tasks where TDD doesn't apply:
+- Infrastructure plans (SAM templates, Terraform, CI/CD configs)
+- Configuration files (TOML, YAML, JSON configs)
+- Documentation-only plans (README, guides, specs)
+- Refactoring plans where existing tests already cover the behavior
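
The interplay between the TDD default and the backward-compatibility rule can be sketched as a small resolver. This is an illustration under assumptions, not konductor code: `PlanType` and `resolve_plan_type` are hypothetical names, and rejecting unknown values follows the plan-checker requirement that `type` be "tdd" or "execute".

```rust
#[derive(Debug, PartialEq)]
enum PlanType {
    Tdd,
    Execute,
}

/// Resolves the `type` frontmatter value of a plan.
/// `declared` is None when the field is absent (legacy plans).
fn resolve_plan_type(declared: Option<&str>) -> Result<PlanType, String> {
    match declared {
        Some("tdd") => Ok(PlanType::Tdd),
        Some("execute") => Ok(PlanType::Execute),
        Some(other) => Err(format!("unknown plan type: {}", other)),
        // Backward compatibility: a missing field is treated as "execute".
        None => Ok(PlanType::Execute),
    }
}
```

Because the planner always emits an explicit `type`, the `None` arm should only be reached for plans written before this version.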
-**TDD plan differences:**
-- `type = "tdd"` in frontmatter
-- First task writes tests
-- Remaining tasks implement to pass tests
-- Verification is `cargo test` or equivalent
+**TDD task structure (RED → GREEN → REFACTOR):**
+1. **RED:** Write a failing test with the exact test code
+2. **Verify RED:** Run the test command — confirm it fails
+3. **GREEN:** Write the minimal implementation to pass the test
+4. **Verify GREEN:** Run the test command — confirm it passes
+5. **REFACTOR** (optional): Clean up while keeping tests green
 **Example TDD task:**
 ```markdown
-### Task 1: Write password validation tests
-- **files:** `src/validation/password_test.rs`
-- **action:** Write tests for: min 8 chars, has uppercase, has number, has special char
-- **verify:** Tests exist and fail
-- **done:** 4 test cases written
-### Task 2: Implement password validation
-- **files:** `src/validation/password.rs`
-- **action:** Write validate_password function to satisfy tests
-- **verify:** `cargo test validation::password` passes
-- **done:** All password validation tests pass
+### Task 1: Write password validation tests and implement
+- **files:** `src/validation/password.rs`
+- **action:**
+  1. Write failing tests:
+     ```rust
+     #[cfg(test)]
+     mod tests {
+         use super::validate_password;
+         #[test]
+         fn rejects_short_password() {
+             assert!(validate_password("Ab1!").is_err());
+         }
+         #[test]
+         fn rejects_no_uppercase() {
+             assert!(validate_password("abcdefg1!").is_err());
+         }
+         #[test]
+         fn rejects_no_number() {
+             assert!(validate_password("Abcdefgh!").is_err());
+         }
+         #[test]
+         fn accepts_valid_password() {
+             assert!(validate_password("Secret123!").is_ok());
+         }
+     }
+     ```
+  2. Run `cargo test validation::password` — confirm all 4 tests fail
+  3. Implement:
+     ```rust
+     pub fn validate_password(password: &str) -> Result<(), &'static str> {
+         if password.len() < 8 { return Err("too short"); }
+         if !password.chars().any(|c| c.is_uppercase()) { return Err("no uppercase"); }
+         if !password.chars().any(|c| c.is_ascii_digit()) { return Err("no number"); }
+         Ok(())
+     }
+     ```
+  4. Run `cargo test validation::password` — confirm all 4 tests pass
+- **verify:** `cargo test validation::password`
+- **done:** Password validation passes all 4 test cases
 ```
 ## Interface Context
@@ -330,9 +505,13 @@ Before finalizing plans, verify:
 - [ ] Phase-level `design.md` exists with overview, components, interactions, key decisions, and shared interfaces
 - [ ] Every plan has valid TOML frontmatter
+- [ ] Every plan has an explicit `type` field (`"tdd"` or `"execute"`)
+- [ ] Plans default to TDD unless explicitly opted out (infra, config, docs, or refactoring already covered by existing tests)
 - [ ] Every plan has a `## Design` section with Approach, Key Interfaces, Error Handling, and Trade-offs
 - [ ] Wave numbers form a valid DAG (no cycles)
 - [ ] Each plan has 2-5 tasks
+- [ ] Each task has bite-sized steps (2-5 minutes each) with code blocks for code steps
+- [ ] No placeholder patterns (TBD, TODO, implement later, etc.)
 - [ ] Each task has files, action, verify, and done
 - [ ] Every requirement from requirements.md is covered
 - [ ] `must_haves` section is complete and observable