npm - codeforge-dev - Versions diffs - 1.4.0 - Mend

codeforge-dev 1.4.0

Files changed (131)

package/.devcontainer/plugins/devs-marketplace/plugins/workflow-enhancer/functional-conjuring-map.md ADDED

@@ -0,0 +1,989 @@
+# Plan: Advanced Plan & Task Validation System
+## Problem Statement
+The workflow-enhancer plugin currently provides basic plan and task enhancement hooks but lacks intelligent validation. Plans can be approved without verifying that they address user requirements, mark their assumptions, or include adequate risk mitigation. Tasks can be created without verifying coverage against the approved plan.
+This creates risk of:
+- Requirements silently dropped or misunderstood
+- Assumptions made without explicit approval
+- High-risk changes without mitigation strategies
+- Tasks that don't cover the full plan scope
+## Scope Definition
+### In Scope
+- Marker enforcement system (`[ASSUMPTION]`, `[DEFERRED]`, `[APPROVED]`)
+- Bash-based syntactic validation (fast gate)
+- Opus-based semantic analysis (reasoning over plan + transcript + tasks)
+- Sonnet-based verification (file access for grounding claims)
+- Task coverage matrix generation
+- Risk escalation detection and mitigation verification
+### Out of Scope
+- Historical pattern matching [DEFERRED] [APPROVED]
+- Dependency graph visualization [DEFERRED] [APPROVED]
+- Context window monitoring [DEFERRED] [APPROVED]
+- Real-time collaboration hooks [DEFERRED] [APPROVED]
+- External CI/CD integration [DEFERRED] [APPROVED]
+### Assumptions
+- Claude CLI supports headless mode with model selection [ASSUMPTION]
+- Transcript JSONL format is stable and parseable [ASSUMPTION]
+- Token costs for Opus/Sonnet calls are acceptable (~$0.05-0.08 per plan) [ASSUMPTION]
+## Current State
+```
+PreToolUse (EnterPlanMode)
+└── enhance-planning.py
+    └── Injects planning-instructions.md content (basic template)
+PostToolUse (Write to /plans/)
+└── post-enhance-plan.py
+    └── Detects plan files
+    └── Runs enhance-plan.sh (placeholder, no real validation)
+    └── Returns additionalContext
+PostToolUse (TaskCreate|TaskUpdate)
+└── post-enhance-task.py
+    └── Finds task file
+    └── Finds session plan (correlation implemented)
+    └── Runs enhance-task.sh (placeholder)
+    └── Returns additionalContext
+```
+## Desired State
+```
+PreToolUse (EnterPlanMode)
+└── enhance-planning.py
+    └── Injects comprehensive planning-instructions.md:
+        ├── Required section template
+        ├── Marker syntax requirements
+        ├── Assumption/deferral rules
+        └── Risk documentation requirements
+PostToolUse (Write to /plans/)
+└── post-enhance-plan.py (orchestrator)
+    │
+    ├── LAYER 1: Bash Validators (fast gate)
+    │   ├── section_validator.sh    → Required sections present
+    │   ├── marker_validator.sh     → [ASSUMPTION], [DEFERRED], [APPROVED] syntax
+    │   ├── assumption_detector.sh  → Unmarked assumption language
+    │   └── risk_detector.sh        → High-risk keywords, mitigation check
+    │
+    ├── LAYER 2: Opus Analyzer (reasoning, 1-2 turns)
+    │   └── opus_analyzer.py
+    │       ├── Input: plan + transcript + tasks
+    │       ├── Requirement traceability analysis
+    │       ├── Assumption audit
+    │       ├── Gap/ambiguity detection
+    │       ├── Task coverage matrix
+    │       └── Risk assessment
+    │
+    └── LAYER 3: Sonnet Verifier (conditional, file access)
+        └── sonnet_verifier.py
+            ├── Triggered by: Opus concerns OR high-risk flags
+            ├── File path verification
+            ├── Pattern alignment check
+            ├── Conflict detection
+            └── Security spot-check
+PostToolUse (TaskCreate|TaskUpdate)
+└── post-enhance-task.py
+    └── [Existing functionality]
+    └── Coverage data passed to Opus in plan validation
+```
+## Technical Approach
+### Phase 1: Planning Instructions Enhancement
+**Files:**
+- `config/planning-instructions.md` - complete rewrite
+**Content to inject:**
+```markdown
+## Required Plan Sections
+Your plan MUST include these sections:
+1. **Problem Statement** - What problem and why it matters
+2. **Scope Definition** - In scope, out of scope, assumptions
+3. **Current vs Desired State** - Before and after
+4. **Technical Approach** - Phases with files and steps
+5. **Verification Checklist** - How to confirm completion
+6. **Risks and Mitigations** - What could go wrong
+## Required Markers
+### Assumptions
+Any assumption MUST be marked: `[ASSUMPTION]`
+Approved assumptions: `[ASSUMPTION] [APPROVED]`
+Example:
+- Database supports JSON columns [ASSUMPTION] [APPROVED]
+### Deferrals
+Anything deferred MUST be marked: `[DEFERRED]`
+Approved deferrals: `[DEFERRED] [APPROVED]`
+Example:
+- Admin panel [DEFERRED] [APPROVED] - Phase 2 work
+### Unapproved markers will be flagged
+The plan validator will reject plans with:
+- `[ASSUMPTION]` without `[APPROVED]`
+- `[DEFERRED]` without `[APPROVED]`
+- Assumption-like language without `[ASSUMPTION]` marker
+## Risk Documentation
+High-risk patterns MUST have explicit mitigation:
+- Data deletion/modification → Backup strategy
+- Production changes → Rollback plan
+- Auth/security changes → Security review notes
+- Schema migrations → Migration rollback steps
+```
+**Verification:**
+- Enter plan mode, verify instructions appear in context
+- Create plan without markers, verify they're requested
+---
+### Phase 2: Bash Validators
+**Files to create:**
+- `scripts/lib/section_validator.sh`
+- `scripts/lib/marker_validator.sh`
+- `scripts/lib/assumption_detector.sh`
+- `scripts/lib/risk_detector.sh`
+#### section_validator.sh
+```bash
+#!/bin/bash
+# Validates required sections exist in plan
+# Exit 0 = pass, Exit 1 = fail with errors on stderr
+PLAN_FILE="$1"
+errors=0
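+required_sections=(
+    "^## Problem Statement"
+    "^## Scope Definition"
+    # A combined "## Current vs Desired State" heading (as in the injected
+    # template) satisfies both of the next two patterns.
+    "^## Current.*State"
+    "^## .*Desired.*State"
+    "^## Verification"
+    "^## Risks"
+)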
+for pattern in "${required_sections[@]}"; do
+    if ! grep -qE "$pattern" "$PLAN_FILE"; then
+        echo "MISSING SECTION: $pattern" >&2
+        ((errors++))
+    fi
+done
+exit $((errors > 0 ? 1 : 0))
+```
+#### marker_validator.sh
+```bash
+#!/bin/bash
+# Validates marker syntax and approval status
+# Exit 0 = pass, Exit 1 = warnings, Exit 2 = errors
+PLAN_FILE="$1"
+errors=0
+warnings=0
+# Find deferrals without approval
+unapproved_deferrals=$(grep -n '\[DEFERRED\]' "$PLAN_FILE" | grep -v '\[APPROVED\]')
+if [[ -n "$unapproved_deferrals" ]]; then
+    echo "UNAPPROVED DEFERRAL:" >&2
+    echo "$unapproved_deferrals" >&2
+    ((errors++))
+fi
+# Find assumptions without approval
+unapproved_assumptions=$(grep -n '\[ASSUMPTION\]' "$PLAN_FILE" | grep -v '\[APPROVED\]')
+if [[ -n "$unapproved_assumptions" ]]; then
+    echo "UNAPPROVED ASSUMPTION:" >&2
+    echo "$unapproved_assumptions" >&2
+    ((errors++))
+fi
+# Check for TBD/placeholder items (counted as warnings so that exit code 1 is reachable)
+tbd_items=$(grep -n -iE '\bTBD\b|\bTODO\b|\bPLACEHOLDER\b' "$PLAN_FILE")
+if [[ -n "$tbd_items" ]]; then
+    echo "UNRESOLVED ITEMS:" >&2
+    echo "$tbd_items" >&2
+    ((warnings++))
+fi
+if [[ $errors -gt 0 ]]; then
+    exit 2
+elif [[ $warnings -gt 0 ]]; then
+    exit 1
+else
+    exit 0
+fi
+```
+#### assumption_detector.sh
+```bash
+#!/bin/bash
+# Detects assumption-like language without [ASSUMPTION] marker
+PLAN_FILE="$1"
+warnings=0
+assumption_patterns=(
+    "assum[ei]"
+    "expect that"
+    "should be able"
+    "probably"
+    "likely"
+    "I believe"
+    "presumably"
+    "should work"
+    "will likely"
+)
+for pattern in "${assumption_patterns[@]}"; do
+    matches=$(grep -in "$pattern" "$PLAN_FILE" | grep -v '\[ASSUMPTION\]')
+    if [[ -n "$matches" ]]; then
+        echo "UNMARKED ASSUMPTION LANGUAGE:" >&2
+        echo "$matches" >&2
+        ((warnings++))
+    fi
+done
+exit $((warnings > 0 ? 1 : 0))
+```
+#### risk_detector.sh
+```bash
+#!/bin/bash
+# Detects high-risk patterns and checks for mitigation
+PLAN_FILE="$1"
+warnings=0
+declare -A risk_patterns=(
+    ["DELETE|DROP|TRUNCATE|rm -rf|remove.*all"]="CRITICAL:destructive-operation"
+    ["production|prod environment|live server"]="HIGH:production-impact"
+    ["migration|schema change|ALTER TABLE|database change"]="HIGH:data-migration"
+    ["auth|password|token|secret|credential|API.key"]="HIGH:security-sensitive"
+    ["\.env|environment variable|config secret"]="HIGH:secrets-handling"
+)
+for pattern in "${!risk_patterns[@]}"; do
+    severity="${risk_patterns[$pattern]%:*}"
+    category="${risk_patterns[$pattern]#*:}"
+    if grep -qiE "$pattern" "$PLAN_FILE"; then
+        echo "RISK DETECTED [$severity]: $category" >&2
+        # Check for mitigation
+        if ! grep -qiE "mitigat|rollback|backup|revert|recover" "$PLAN_FILE"; then
+            echo "  WARNING: No mitigation strategy found for $category" >&2
+            ((warnings++))
+        fi
+    fi
+done
+exit $((warnings > 0 ? 1 : 0))
+```
+**Verification:**
+- Run each validator against sample plans
+- Verify correct exit codes and error messages
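The verification steps above could be automated with a small harness. This is a minimal sketch, assuming each validator is an executable that takes the plan path as its last argument and follows the exit-code convention stated in the script headers (0 = pass, non-zero = warnings or errors). `ValidatorOutcome`, `run_validators`, and `worst_exit_code` are hypothetical names used only for illustration, not part of the plugin.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class ValidatorOutcome:
    name: str
    exit_code: int
    stderr: str


def run_validators(commands: dict[str, list[str]], plan_path: str) -> list[ValidatorOutcome]:
    """Run each validator command with the plan path appended and collect results."""
    outcomes = []
    for name, cmd in commands.items():
        proc = subprocess.run(cmd + [plan_path], capture_output=True, text=True)
        outcomes.append(ValidatorOutcome(name, proc.returncode, proc.stderr.strip()))
    return outcomes


def worst_exit_code(outcomes: list[ValidatorOutcome]) -> int:
    """Highest exit code seen across validators; 0 when the list is empty."""
    return max((o.exit_code for o in outcomes), default=0)


if __name__ == "__main__":
    # Stand-in commands so the sketch runs without the real scripts.
    # In practice these might be e.g. ["bash", "scripts/lib/section_validator.sh"].
    stand_ins = {
        "passes": [sys.executable, "-c", "import sys; sys.exit(0)"],
        "warns": [sys.executable, "-c", "import sys; sys.stderr.write('UNMARKED'); sys.exit(1)"],
    }
    results = run_validators(stand_ins, "plan.md")
    for outcome in results:
        print(outcome.name, outcome.exit_code, outcome.stderr)
    print("worst:", worst_exit_code(results))
```

Collecting stderr per validator keeps each script independently runnable while still letting an orchestrator decide whether to stop after the fast gate.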
+---
+### Phase 3: Transcript Parser
+**File to create:**
+- `scripts/lib/transcript_parser.py`
+```python
+#!/usr/bin/env python3
+"""Extract user messages from session transcript for analysis."""
+import json
+import sys
+from pathlib import Path
+def extract_user_messages(transcript_path: str, max_chars: int = 4000) -> str:
+    """
+    Extract user messages from JSONL transcript.
+    Args:
+        transcript_path: Path to session JSONL file
+        max_chars: Maximum characters to return (oldest text is dropped first)
+    Returns:
+        Concatenated user messages, most recent last
+    """
+    messages = []
+    try:
+        with open(transcript_path) as f:
+            for line in f:
+                try:
+                    entry = json.loads(line)
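+                    if entry.get("type") == "user":
+                        msg = entry.get("message", "")
+                        # Assumption: "message" may be a plain string, or a dict
+                        # whose "content" is a string or a list of text blocks.
+                        if isinstance(msg, dict):
+                            msg = msg.get("content", "")
+                        if isinstance(msg, list):
+                            msg = "\n".join(
+                                block.get("text", "") for block in msg
+                                if isinstance(block, dict) and block.get("type") == "text"
+                            )
+                        if isinstance(msg, str) and msg:
+                            messages.append(msg)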
+                except json.JSONDecodeError:
+                    continue
+    except (OSError, IOError) as e:
+        print(f"Error reading transcript: {e}", file=sys.stderr)
+        return ""
+    combined = "\n---\n".join(messages)
+    # Truncate from beginning if too long (keep recent messages)
+    if len(combined) > max_chars:
+        combined = "...[truncated]...\n" + combined[-max_chars:]
+    return combined
+def extract_requirements(transcript_path: str) -> list[str]:
+    """
+    Extract likely requirements from user messages.
+    Heuristic: Lines that look like requirements (imperative, bullet points, etc.)
+    """
+    messages = extract_user_messages(transcript_path, max_chars=10000)
+    requirements = []
+    for line in messages.split("\n"):
+        line = line.strip()
+        # Heuristic patterns for requirements
+        if any([
+            line.startswith("- "),
+            line.startswith("* "),
+            line.split(".", 1)[0].isdigit(),  # any numbered item, e.g. "3." or "12."
+            "should" in line.lower(),
+            "must" in line.lower(),
+            "need" in line.lower(),
+            "want" in line.lower(),
+        ]):
+            requirements.append(line)
+    return requirements
+if __name__ == "__main__":
+    if len(sys.argv) < 2:
+        print("Usage: transcript_parser.py <transcript_path>", file=sys.stderr)
+        sys.exit(1)
+    print(extract_user_messages(sys.argv[1]))
+```
+**Verification:**
+- Run against actual session transcript
+- Verify user messages extracted correctly
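Before running against a real session, the parser can be exercised with a generated fixture. This is a minimal sketch, assuming transcript entries carry the `type` and `message` keys that the parser above reads. The real transcript shape is listed as an unverified assumption in this plan, and `write_sample_transcript` and `count_user_entries` are hypothetical helpers.

```python
import json
import tempfile
from pathlib import Path


def write_sample_transcript(path: Path) -> int:
    """Write a tiny JSONL fixture; return the number of user entries written."""
    entries = [
        {"type": "user", "message": "The export must support CSV"},
        {"type": "assistant", "message": "Understood."},
        {"type": "user", "message": "- It should also handle empty files"},
    ]
    with open(path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry) + "\n")
        f.write("{not valid json\n")  # malformed line the parser should skip
    return sum(1 for e in entries if e["type"] == "user")


def count_user_entries(path: Path) -> int:
    """Count user entries, skipping malformed lines as the parser does."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue
            if isinstance(entry, dict) and entry.get("type") == "user":
                count += 1
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fixture = Path(tmp) / "session.jsonl"
        expected = write_sample_transcript(fixture)
        print(expected, count_user_entries(fixture))
```

The deliberately malformed last line checks that one bad record does not abort extraction.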
+---
+### Phase 4: Opus Analyzer
+**File to create:**
+- `scripts/lib/opus_analyzer.py`
+```python
+#!/usr/bin/env python3
+"""
+Opus-based semantic analysis of implementation plans.
+Analyzes plan against transcript and tasks for:
+- Requirement coverage
+- Assumption audit
+- Gap detection
+- Task coverage matrix
+- Risk assessment
+"""
+import json
+import subprocess
+import sys
+from pathlib import Path
+from dataclasses import dataclass
+@dataclass
+class AnalysisResult:
+    requirements: list[dict]  # {req, status, location}
+    assumptions: list[dict]   # {text, marked, approved}
+    gaps: list[str]
+    ambiguities: list[str]
+    task_coverage: dict       # {plan_item: task_or_uncovered}
+    risks: list[dict]         # {risk, severity, mitigation}
+    raw_response: str
+OPUS_PROMPT = '''You are an implementation plan auditor. Analyze the provided data and report findings.
+## INPUT DATA
+### User's Original Messages (from session transcript):
+"""
+{user_messages}
+"""
+### The Implementation Plan:
+"""
+{plan_content}
+"""
+### Current Task List (if any):
+"""
+{task_json}
+"""
+## ANALYSIS REQUIRED
+Respond in this exact format:
+### REQUIREMENTS
+For each user requirement identified:
+REQ: [requirement text]
+STATUS: ADDRESSED | PARTIAL | MISSING
+LOCATION: [section in plan, or "not found"]
+### ASSUMPTIONS
+For each assumption (explicit or implicit):
+ASSUMPTION: [text]
+MARKED: YES | NO
+APPROVED: YES | NO
+### GAPS
+GAP: [what's missing or could fail]
+### AMBIGUITIES
+AMBIGUITY: [what's unclear or has multiple interpretations]
+### TASK_COVERAGE
+PLAN_ITEM: [phase/step from plan]
+COVERED_BY: [task subject] | UNCOVERED
+COVERAGE_SCORE: X/Y
+### RISKS
+RISK: [pattern or concern]
+SEVERITY: CRITICAL | HIGH | MEDIUM
+MITIGATION: PRESENT | MISSING
+## RULES
+- Be terse. No explanations unless critical.
+- Flag only issues, not successes (except for REQUIREMENTS status).
+- If task list is empty, skip TASK_COVERAGE section.
+'''
+def run_opus_analysis(
+    plan_content: str,
+    user_messages: str,
+    task_json: str = "[]",
+    timeout: int = 60
+) -> AnalysisResult:
+    """
+    Run Opus analysis on plan.
+    Args:
+        plan_content: The implementation plan markdown
+        user_messages: Extracted user messages from transcript
+        task_json: JSON string of current tasks
+        timeout: Max seconds to wait for response
+    Returns:
+        AnalysisResult with parsed findings
+    """
+    prompt = OPUS_PROMPT.format(
+        user_messages=user_messages,
+        plan_content=plan_content,
+        task_json=task_json
+    )
+    try:
+        # Call Claude CLI in headless mode
+        result = subprocess.run(
+            ["claude", "--print", "--model", "opus", "-p", prompt],
+            capture_output=True,
+            text=True,
+            timeout=timeout
+        )
+        if result.returncode != 0:
+            raise RuntimeError(f"Claude CLI error: {result.stderr}")
+        response = result.stdout.strip()
+        return parse_opus_response(response)
+    except subprocess.TimeoutExpired:
+        raise RuntimeError("Opus analysis timed out")
+    except FileNotFoundError:
+        raise RuntimeError("Claude CLI not found")
+def parse_opus_response(response: str) -> AnalysisResult:
+    """Parse structured Opus response into AnalysisResult."""
+    # Initialize result containers
+    requirements = []
+    assumptions = []
+    gaps = []
+    ambiguities = []
+    task_coverage = {}
+    risks = []
+    current_section = None
+    current_item = {}
+    for line in response.split("\n"):
+        line = line.strip()
+        # Section headers
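+        if line.startswith("### "):
+            current_section = line[4:].strip().upper()
+            # Drop any half-parsed record so it cannot leak into the next section
+            current_item = {}
+            continue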
+        # Parse based on current section
+        if current_section == "REQUIREMENTS":
+            if line.startswith("REQ:"):
+                if current_item:
+                    requirements.append(current_item)
+                current_item = {"req": line[4:].strip()}
+            elif line.startswith("STATUS:"):
+                current_item["status"] = line[7:].strip()
+            elif line.startswith("LOCATION:"):
+                current_item["location"] = line[9:].strip()
+                requirements.append(current_item)
+                current_item = {}
+        elif current_section == "ASSUMPTIONS":
+            if line.startswith("ASSUMPTION:"):
+                if current_item:
+                    assumptions.append(current_item)
+                current_item = {"text": line[11:].strip()}
+            elif line.startswith("MARKED:"):
+                current_item["marked"] = line[7:].strip() == "YES"
+            elif line.startswith("APPROVED:"):
+                current_item["approved"] = line[9:].strip() == "YES"
+                assumptions.append(current_item)
+                current_item = {}
+        elif current_section == "GAPS":
+            if line.startswith("GAP:"):
+                gaps.append(line[4:].strip())
+        elif current_section == "AMBIGUITIES":
+            if line.startswith("AMBIGUITY:"):
+                ambiguities.append(line[10:].strip())
+        elif current_section == "TASK_COVERAGE":
+            if line.startswith("PLAN_ITEM:"):
+                current_item = {"plan_item": line[10:].strip()}
+            elif line.startswith("COVERED_BY:"):
+                # Ignore a COVERED_BY line with no preceding PLAN_ITEM (avoids KeyError)
+                if "plan_item" in current_item:
+                    task_coverage[current_item["plan_item"]] = line[11:].strip()
+                current_item = {}
+        elif current_section == "RISKS":
+            if line.startswith("RISK:"):
+                if current_item:
+                    risks.append(current_item)
+                current_item = {"risk": line[5:].strip()}
+            elif line.startswith("SEVERITY:"):
+                current_item["severity"] = line[9:].strip()
+            elif line.startswith("MITIGATION:"):
+                current_item["mitigation"] = line[11:].strip()
+                risks.append(current_item)
+                current_item = {}
+    return AnalysisResult(
+        requirements=requirements,
+        assumptions=assumptions,
+        gaps=gaps,
+        ambiguities=ambiguities,
+        task_coverage=task_coverage,
+        risks=risks,
+        raw_response=response
+    )
+def format_findings(result: AnalysisResult) -> str:
+    """Format analysis results for additionalContext."""
+    lines = ["--- Opus Plan Analysis ---"]
+    # Requirements
+    missing = [r for r in result.requirements if r.get("status") == "MISSING"]
+    partial = [r for r in result.requirements if r.get("status") == "PARTIAL"]
+    if missing:
+        lines.append(f"\nMISSING REQUIREMENTS ({len(missing)}):")
+        for r in missing:
+            lines.append(f"  - {r['req']}")
+    if partial:
+        lines.append(f"\nPARTIAL REQUIREMENTS ({len(partial)}):")
+        for r in partial:
+            lines.append(f"  - {r['req']} (at: {r.get('location', '?')})")
+    # Assumptions
+    unmarked = [a for a in result.assumptions if not a.get("marked")]
+    unapproved = [a for a in result.assumptions if a.get("marked") and not a.get("approved")]
+    if unmarked:
+        lines.append(f"\nUNMARKED ASSUMPTIONS ({len(unmarked)}):")
+        for a in unmarked:
+            lines.append(f"  - {a['text']}")
+    if unapproved:
+        lines.append(f"\nUNAPPROVED ASSUMPTIONS ({len(unapproved)}):")
+        for a in unapproved:
+            lines.append(f"  - {a['text']}")
+    # Gaps
+    if result.gaps:
+        lines.append(f"\nGAPS ({len(result.gaps)}):")
+        for g in result.gaps:
+            lines.append(f"  - {g}")
+    # Ambiguities
+    if result.ambiguities:
+        lines.append(f"\nAMBIGUITIES ({len(result.ambiguities)}):")
+        for a in result.ambiguities:
+            lines.append(f"  - {a}")
+    # Task coverage
+    uncovered = [k for k, v in result.task_coverage.items() if v == "UNCOVERED"]
+    if uncovered:
+        lines.append(f"\nUNCOVERED PLAN ITEMS ({len(uncovered)}):")
+        for item in uncovered:
+            lines.append(f"  - {item}")
+    # Risks
+    critical = [r for r in result.risks if r.get("severity") == "CRITICAL"]
+    high_no_mitigation = [r for r in result.risks
+                          if r.get("severity") == "HIGH" and r.get("mitigation") == "MISSING"]
+    if critical:
+        lines.append(f"\nCRITICAL RISKS ({len(critical)}):")
+        for r in critical:
+            lines.append(f"  - {r['risk']} (mitigation: {r.get('mitigation', '?')})")
+    if high_no_mitigation:
+        lines.append(f"\nHIGH RISKS WITHOUT MITIGATION ({len(high_no_mitigation)}):")
+        for r in high_no_mitigation:
+            lines.append(f"  - {r['risk']}")
+    lines.append("\n--- End Opus Analysis ---")
+    return "\n".join(lines)
+if __name__ == "__main__":
+    # Test with sample data
+    print("Opus analyzer module loaded. Use run_opus_analysis() to analyze plans.")
+```
+**Verification:**
+- Test with sample plan, transcript, and tasks
+- Verify Claude CLI invocation works
+- Verify response parsing handles edge cases
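One edge case worth testing is a response that decorates field lines with list bullets or Markdown bold, which would defeat plain `startswith("REQ:")` checks. The sketch below shows one possible normalization step. It is not part of the plan, and `normalize_field_line` and `split_field` are hypothetical helpers.

```python
import re
from typing import Optional

# Leading list bullets ("- ", "* ", "1. ") that a model may prepend to a field line
_BULLET = re.compile(r"^\s*(?:[-*]|\d+\.)\s+")


def normalize_field_line(line: str) -> str:
    """Strip a leading bullet and Markdown bold markers from a line."""
    line = _BULLET.sub("", line.strip())
    return line.replace("**", "").strip()


def split_field(line: str) -> Optional[tuple[str, str]]:
    """Return (KEY, value) for an upper-case 'KEY: value' line, else None."""
    key, sep, value = normalize_field_line(line).partition(":")
    if not sep or not key.isupper() or not key.replace("_", "").isalpha():
        return None
    return key, value.strip()


if __name__ == "__main__":
    for sample in ["- **REQ:** Export CSV", "COVERED_BY: UNCOVERED", "Note: free text"]:
        print(repr(sample), "->", split_field(sample))
```

Only all-caps keys are accepted, so ordinary prose containing a colon is not mistaken for a field.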
+---
+### Phase 5: Sonnet Verifier
+**File to create:**
+- `scripts/lib/sonnet_verifier.py`
+```python
+#!/usr/bin/env python3
+"""
+Sonnet-based verification of implementation plans.
+Verifies claims in the plan against actual codebase:
+- File paths exist/don't exist as expected
+- Proposed changes align with existing patterns
+- No conflicts with existing code
+- Security concerns addressed
+"""
+import subprocess
+import sys
+from dataclasses import dataclass
+@dataclass
+class VerificationResult:
+    verified: list[str]
+    conflicts: list[str]
+    missing: list[str]
+    security: list[str]
+    raw_response: str
+SONNET_PROMPT = '''You are verifying an implementation plan against the actual codebase.
+## Context from Opus Analysis
+{opus_summary}
+## Plan Being Verified
+"""
+{plan_content}
+"""
+## Your Tasks
+1. **File Verification**: For each file path mentioned in the plan:
+   - Use Read tool to check if it exists
+   - Verify described current state matches reality
+2. **Pattern Alignment**: For proposed changes:
+   - Read similar existing code (max 3 files)
+   - Confirm approach matches codebase conventions
+3. **Conflict Detection**: Check if changes would conflict with:
+   - Existing functionality in mentioned files
+4. **Security Spot-Check**: For any auth/data/API changes:
+   - Note any security concerns
+## Rules
+- Maximum 5 file reads total
+- Focus on CRITICAL and HIGH items from Opus analysis first
+- Skip verification for low-risk items
+## Output Format (use exactly these tags)
+[VERIFIED] item - checks out
+[CONFLICT] item - issue description
+[MISSING] item - file or pattern not found
+[SECURITY] item - concern description
+'''
+def run_sonnet_verification(
+    plan_content: str,
+    opus_summary: str,
+    max_turns: int = 5,
+    timeout: int = 120
+) -> VerificationResult:
+    """
+    Run Sonnet verification on plan.
+    Args:
+        plan_content: The implementation plan markdown
+        opus_summary: Summary of Opus findings to focus verification
+        max_turns: Maximum conversation turns (file reads)
+        timeout: Max seconds to wait
+    Returns:
+        VerificationResult with parsed findings
+    """
+    prompt = SONNET_PROMPT.format(
+        opus_summary=opus_summary,
+        plan_content=plan_content
+    )
+    try:
+        # Call Claude CLI with Sonnet, allowing some turns for file reads
+        result = subprocess.run(
+            ["claude", "--print", "--model", "sonnet",
+             "--max-turns", str(max_turns), "-p", prompt],
+            capture_output=True,
+            text=True,
+            timeout=timeout
+        )
+        if result.returncode != 0:
+            raise RuntimeError(f"Claude CLI error: {result.stderr}")
+        response = result.stdout.strip()
+        return parse_sonnet_response(response)
+    except subprocess.TimeoutExpired:
+        raise RuntimeError("Sonnet verification timed out")
+    except FileNotFoundError:
+        raise RuntimeError("Claude CLI not found")
+def parse_sonnet_response(response: str) -> VerificationResult:
+    """Parse Sonnet response into VerificationResult."""
+    verified = []
+    conflicts = []
+    missing = []
+    security = []
+    for line in response.split("\n"):
+        line = line.strip()
+        if line.startswith("[VERIFIED]"):
+            verified.append(line[10:].strip())
+        elif line.startswith("[CONFLICT]"):
+            conflicts.append(line[10:].strip())
+        elif line.startswith("[MISSING]"):
+            missing.append(line[9:].strip())
+        elif line.startswith("[SECURITY]"):
+            security.append(line[10:].strip())
+    return VerificationResult(
+        verified=verified,
+        conflicts=conflicts,
+        missing=missing,
+        security=security,
+        raw_response=response
+    )
+def format_verification(result: VerificationResult) -> str:
+    """Format verification results for additionalContext."""
+    lines = ["--- Sonnet Verification ---"]
+    if result.conflicts:
+        lines.append(f"\nCONFLICTS ({len(result.conflicts)}):")
+        for c in result.conflicts:
+            lines.append(f"  - {c}")
+    if result.missing:
+        lines.append(f"\nMISSING ({len(result.missing)}):")
+        for m in result.missing:
+            lines.append(f"  - {m}")
+    if result.security:
+        lines.append(f"\nSECURITY CONCERNS ({len(result.security)}):")
+        for s in result.security:
+            lines.append(f"  - {s}")
+    if result.verified and not (result.conflicts or result.missing or result.security):
+        lines.append(f"\nAll {len(result.verified)} items verified successfully.")
+    lines.append("\n--- End Verification ---")
+    return "\n".join(lines)
+if __name__ == "__main__":
+    print("Sonnet verifier module loaded. Use run_sonnet_verification() to verify plans.")
+```
+**Verification:**
+- Test with plan containing file references
+- Verify file reads work within turn limit
+- Verify response parsing
+---
+### Phase 6: Orchestrator Update
+**File to modify:**
+- `scripts/post-enhance-plan.py`
+**Changes:**
+1. Import new modules
+2. Add orchestration logic for three-layer validation
+3. Aggregate results into additionalContext
+```python
+# New orchestration flow (pseudocode)
+def handle_plan_write(plan_path, transcript_path, cwd, session_id):
+    plan_content = read_file(plan_path)
+    results = []
+    # Layer 1: Bash validators
+    bash_results = run_bash_validators(plan_path)
+    if bash_results.has_errors:
+        return format_bash_errors(bash_results)
+    results.append(bash_results)
+    # Layer 2: Opus analysis
+    user_messages = extract_user_messages(transcript_path)
+    task_json = get_task_json(session_id)
+    opus_result = run_opus_analysis(plan_content, user_messages, task_json)
+    results.append(opus_result)
+    # Layer 3: Sonnet verification (conditional)
+    needs_verification = (
+        opus_result.has_critical_risks or
+        opus_result.has_missing_requirements or
+        bash_results.risk_keywords_found
+    )
+    if needs_verification:
+        sonnet_result = run_sonnet_verification(
+            plan_content,
+            summarize_opus_findings(opus_result)
+        )
+        results.append(sonnet_result)
+    return format_all_results(results)
+```
+**Verification:**
+- End-to-end test with plan creation
+- Verify all layers execute in correct order
+- Verify conditional Sonnet invocation
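The pseudocode above reads `has_critical_risks` and `has_missing_requirements` from the Opus result, but `AnalysisResult` as defined in Phase 4 has no such attributes. One way to derive them from the parsed lists is sketched below. The function names are illustrative, and the dict keys (`severity`, `status`) follow the Phase 4 parser.

```python
def has_critical_risks(risks: list[dict]) -> bool:
    """True if any parsed risk record is marked CRITICAL."""
    return any(r.get("severity") == "CRITICAL" for r in risks)


def has_missing_requirements(requirements: list[dict]) -> bool:
    """True if any parsed requirement record has status MISSING."""
    return any(r.get("status") == "MISSING" for r in requirements)


def needs_verification(
    requirements: list[dict],
    risks: list[dict],
    risk_keywords_found: bool,
) -> bool:
    """Decide whether the conditional Sonnet layer should run."""
    return (
        has_critical_risks(risks)
        or has_missing_requirements(requirements)
        or risk_keywords_found
    )


if __name__ == "__main__":
    reqs = [{"req": "CSV export", "status": "ADDRESSED"}]
    risks = [{"risk": "DROP TABLE", "severity": "CRITICAL", "mitigation": "MISSING"}]
    print(needs_verification(reqs, risks, risk_keywords_found=False))
```

Keeping the trigger as a pure function makes the conditional invocation testable without calling either model.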
+---
+## Files to Create/Modify
+| File | Action | Description |
+|------|--------|-------------|
+| `config/planning-instructions.md` | Modify | Add marker requirements, section template |
+| `scripts/lib/section_validator.sh` | Create | Validate required sections |
+| `scripts/lib/marker_validator.sh` | Create | Validate marker syntax |
+| `scripts/lib/assumption_detector.sh` | Create | Detect unmarked assumptions |
+| `scripts/lib/risk_detector.sh` | Create | Detect risks, check mitigation |
+| `scripts/lib/transcript_parser.py` | Create | Extract user messages from JSONL |
+| `scripts/lib/opus_analyzer.py` | Create | Opus semantic analysis |
+| `scripts/lib/sonnet_verifier.py` | Create | Sonnet file verification |
+| `scripts/post-enhance-plan.py` | Modify | Orchestrate three-layer validation |
+## Risks and Mitigations
+| Risk | Severity | Mitigation |
+|------|----------|------------|
+| Claude CLI syntax differs from assumed | HIGH | Verify CLI flags before implementation |
+| Token costs exceed budget | MEDIUM | Add cost caps, skip Sonnet for low-risk |
+| Opus/Sonnet timeout on complex plans | MEDIUM | Set reasonable timeouts, fallback to bash-only |
+| Response parsing fails on edge cases | MEDIUM | Add robust error handling, log raw responses |
+| False positives annoy users | MEDIUM | Tune prompts, add severity thresholds |
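The "fallback to bash-only" mitigation in the table above could be implemented as a thin wrapper around each model-backed step. This is a minimal sketch, assuming those steps raise `RuntimeError` on timeout or CLI failure, as the Phase 4 and Phase 5 code does. `run_with_fallback` is a hypothetical helper.

```python
import sys
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_fallback(step: Callable[[], T], label: str) -> Optional[T]:
    """Run a model-backed step; on RuntimeError, log to stderr and return None."""
    try:
        return step()
    except RuntimeError as exc:
        print(f"{label} skipped: {exc}", file=sys.stderr)
        return None


if __name__ == "__main__":
    def failing_step() -> str:
        # Stand-in for an analysis call that times out
        raise RuntimeError("Opus analysis timed out")

    print(run_with_fallback(failing_step, "opus"))       # falls back to None
    print(run_with_fallback(lambda: "findings", "opus"))  # passes the result through
```

An orchestrator could treat a `None` result as "layer unavailable" and return only the bash validator output.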
+## Verification Checklist
+- [ ] Planning instructions injection works
+- [ ] Each bash validator runs independently
+- [ ] Transcript parser extracts messages correctly
+- [ ] Opus analyzer produces structured output
+- [ ] Sonnet verifier respects turn limits
+- [ ] Orchestrator chains all layers correctly
+- [ ] additionalContext returned to Claude
+- [ ] End-to-end: Create plan, see validation results
+## Dependencies
+- Claude CLI with headless mode support
+- `jq` for JSON parsing in bash scripts
+- Python 3.10+ for type hints
+- Session transcript accessible via `transcript_path`
+## Open Questions
+1. **Claude CLI flags**: Need to verify exact syntax for `--model`, `--print`, `--max-turns`
+2. **Cost thresholds**: At what risk level should Sonnet verification trigger?
+3. **Blocking vs warning**: Should critical issues block plan approval or just warn?