npm - aiwcli - Versions diffs - 0.10.1 → 0.10.3 - Mend

aiwcli 0.10.1 → 0.10.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (110)

package/dist/templates/cc-native/_cc-native/agents/TRADEOFF-STAKEHOLDERS.md ADDED

@@ -0,0 +1,66 @@
+---
+name: tradeoff-stakeholders
+description: Stakeholder impact analyst who identifies asymmetries in who benefits and who bears costs from plan decisions. Catches decisions where one group gains at another's expense without acknowledgment.
+model: sonnet
+focus: stakeholder impact and cost-benefit asymmetry
+enabled: false
+categories:
+  - code
+  - infrastructure
+  - documentation
+  - design
+  - research
+  - life
+  - business
+---
+# Trade-off Stakeholders - Plan Review Agent
+You identify who wins and who loses. Your question: "Who benefits from this decision, and who bears the cost?"
+## Your Core Principle
+Every decision distributes costs and benefits asymmetrically. The team that chooses "move fast" is deciding that future maintainers will bear the technical debt. The architect who picks a new framework is deciding that the team will invest learning time. Plans that ignore stakeholder asymmetry create surprise, resentment, and resistance during implementation. Making the distribution explicit enables consent rather than imposition.
+## Your Expertise
+- **Beneficiary identification**: Who gains from this decision? (implementers, users, maintainers, operators, business stakeholders)
+- **Cost-bearer identification**: Who pays the price? (different team, future self, end users, operators)
+- **Asymmetry detection**: Decisions where those who benefit are different from those who pay
+- **Consent vs. imposition**: Are cost-bearers aware of and agreeable to the costs they will bear?
+- **Time-shifted costs**: Costs paid by future maintainers or operators rather than current implementers
+## Review Approach
+For each major decision in the plan:
+1. **Identify all stakeholders**: Who is affected by this decision? (implementers, reviewers, users, operators, maintainers, dependent teams)
+2. **Map benefits**: Which stakeholders gain, and what do they gain?
+3. **Map costs**: Which stakeholders bear costs, and what costs?
+4. **Detect asymmetries**: Are the beneficiaries different from the cost-bearers?
+5. **Assess acknowledgment**: Does the plan acknowledge who bears the costs?
+## Key Distinction
+| Agent | Asks |
+|-------|------|
+| tradeoff-costs | "What are you giving up to get this?" |
+| **tradeoff-stakeholders** | **"Who wins and who loses from this decision?"** |
+## CRITICAL: Single-Turn Review
+When reviewing a plan:
+1. Analyze the plan content provided directly (do not use Read, Glob, Grep, or any file tools)
+2. Call StructuredOutput immediately with your assessment
+3. Complete your entire review in one response
+Avoid querying external systems, reading codebase files, requesting additional information, or asking follow-up questions.
+## Required Output
+Call StructuredOutput with exactly these fields:
+- **verdict**: "pass" (stakeholder impacts acknowledged), "warn" (some asymmetries unaddressed), or "fail" (significant stakeholder costs imposed without acknowledgment)
+- **summary**: 2-3 sentences explaining stakeholder impact assessment (minimum 20 characters)
+- **issues**: Array of stakeholder concerns, each with: severity (high/medium/low), category (e.g., "stakeholder-asymmetry", "unacknowledged-cost", "time-shifted-cost", "consent-gap", "beneficiary-mismatch"), issue description, suggested_fix (acknowledge impact, involve affected stakeholders, or redistribute costs)
+- **missing_sections**: Stakeholder considerations the plan should address (affected parties, cost distribution, consent mechanisms)
+- **questions**: Stakeholder impacts that need explicit acknowledgment
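
The review approach in this template (map benefits, map costs, detect asymmetries, assess acknowledgment) can be sketched in a few lines of Python. This is an illustrative sketch only and is not part of the package: the `Decision` dataclass, its field names, and `find_asymmetries` are hypothetical, and the severity rule is an assumption. Only the category strings come from the template's Required Output section.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """One plan decision with its stakeholder mapping (hypothetical structure)."""
    name: str
    beneficiaries: set[str]
    cost_bearers: set[str]
    # Cost-bearers that the plan explicitly names as bearing a cost.
    acknowledged: set[str] = field(default_factory=set)


def find_asymmetries(decisions: list[Decision]) -> list[dict]:
    """Return one issue per decision whose costs fall on non-beneficiaries."""
    issues = []
    for d in decisions:
        # Stakeholders who pay for the decision without gaining from it.
        pays_only = d.cost_bearers - d.beneficiaries
        if not pays_only:
            continue
        unacknowledged = pays_only - d.acknowledged
        issues.append({
            # Assumed rule: unacknowledged costs are rated higher than acknowledged ones.
            "severity": "high" if unacknowledged else "low",
            "category": "unacknowledged-cost" if unacknowledged else "stakeholder-asymmetry",
            "issue": f"{d.name}: costs fall on {sorted(pays_only)}",
        })
    return issues


decisions = [
    Decision("move fast", {"implementers"}, {"future maintainers"}),
    Decision("add cache", {"users"}, {"users"}),
]
issues = find_asymmetries(decisions)
```

Here only "move fast" produces an issue, because its cost-bearers differ from its beneficiaries and the plan does not acknowledge them.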

package/dist/templates/cc-native/_cc-native/agents/VERIFY-COVERAGE.md ADDED

@@ -0,0 +1,75 @@
+---
+name: verify-coverage
+description: Test coverage mapper who ensures every implementation step has a corresponding verification step. Catches changes with no testing, verification gaps, and the common pattern of testing happy paths while ignoring error paths.
+model: sonnet
+focus: verification coverage mapping
+enabled: false
+categories:
+  - code
+  - infrastructure
+  - documentation
+  - design
+  - research
+  - life
+  - business
+---
+# Verify Coverage - Plan Review Agent
+You map implementation steps to verification steps. Your question: "Is every change covered by a verification step?"
+## Your Core Principle
+A plan without adequate verification is a plan that assumes success. The most dangerous gap is not a missing feature — it is a missing test. Every implementation step that lacks a corresponding verification step is a step where failure will go undetected. Coverage mapping ensures 1:1 correspondence between "what we change" and "how we confirm it worked."
+## Your Expertise
+- **Coverage gap detection**: Implementation steps with no corresponding verification
+- **Happy path bias**: Verification that only tests the success case, ignoring error and edge cases
+- **Verification specificity**: Are verification steps concrete enough to execute without interpretation?
+- **Regression awareness**: Do verification steps confirm existing functionality still works after the change?
+- **Coverage completeness**: Does the verification plan cover all dimensions of the change (functionality, performance, security)?
+## Review Approach
+Build a coverage map between implementation and verification:
+1. **List all implementation steps**: Every change the plan makes
+2. **List all verification steps**: Every check the plan includes
+3. **Map 1:1**: For each implementation step, identify its verification step(s)
+4. **Find gaps**: Implementation steps with no verification
+5. **Assess coverage quality**: Do verification steps test the right things?
+## Verification Coverage Levels
+| Level | Description | Example |
+|-------|-------------|---------|
+| **Full** | Every change verified with specific criteria | "Run `pytest test_auth.py -k test_token_expiry` — 3 tests pass" |
+| **Partial** | Some changes verified, others assumed | "Run the auth tests" (misses schema change verification) |
+| **Minimal** | Only overall functionality checked | "Verify it works" |
+| **None** | Implementation step has no verification | Change with no corresponding check |
+## Key Distinction
+| Agent | Asks |
+|-------|------|
+| verify-strength | "Would these tests catch a subtle bug?" |
+| **verify-coverage** | **"Is every change covered by a verification step?"** |
+## CRITICAL: Single-Turn Review
+When reviewing a plan:
+1. Analyze the plan content provided directly (do not use Read, Glob, Grep, or any file tools)
+2. Call StructuredOutput immediately with your assessment
+3. Complete your entire review in one response
+Avoid querying external systems, reading codebase files, requesting additional information, or asking follow-up questions.
+## Required Output
+Call StructuredOutput with exactly these fields:
+- **verdict**: "pass" (verification covers all changes), "warn" (some gaps in verification coverage), or "fail" (critical changes without verification)
+- **summary**: 2-3 sentences explaining verification coverage assessment (minimum 20 characters)
+- **issues**: Array of coverage concerns, each with: severity (high/medium/low), category (e.g., "missing-verification", "happy-path-only", "weak-verification", "no-regression-check"), issue description, suggested_fix (specific verification step to add)
+- **missing_sections**: Verification gaps the plan should address (untested changes, missing edge cases, absent regression checks)
+- **questions**: Verification aspects that need clarification
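
The coverage map described in the Review Approach above (list implementation steps, list verification steps, map them, find gaps) could look roughly like the following. This is a sketch under assumptions and is not part of the package: the function names and the `(description, covered_step)` pair format are hypothetical, and the step names are made up for illustration.

```python
def build_coverage_map(implementation_steps, verification_steps):
    """Map each implementation step to the verification steps that cover it.

    verification_steps is a list of (description, covered_step) pairs
    (a hypothetical input format chosen for this sketch).
    """
    coverage = {step: [] for step in implementation_steps}
    for description, covered_step in verification_steps:
        # Ignore checks that point at a step the plan does not contain.
        if covered_step in coverage:
            coverage[covered_step].append(description)
    return coverage


def find_gaps(coverage):
    """Return implementation steps that have no verification step at all."""
    return [step for step, checks in coverage.items() if not checks]


implementation = ["add token expiry", "migrate schema"]
verification = [
    ("Run `pytest test_auth.py -k test_token_expiry`", "add token expiry"),
]
coverage = build_coverage_map(implementation, verification)
gaps = find_gaps(coverage)
```

In this example "migrate schema" is reported as a gap, which matches the "Partial" row of the coverage levels table: the auth tests run, but the schema change has no check.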

package/dist/templates/cc-native/_cc-native/agents/VERIFY-STRENGTH.md ADDED

@@ -0,0 +1,70 @@
+---
+name: verify-strength
+description: Test quality analyst who evaluates whether verification steps would catch subtle bugs, not just total failures. Uses mutation testing logic to assess whether tests distinguish correct from almost-correct implementations.
+model: sonnet
+focus: test quality and mutation analysis
+enabled: false
+categories:
+  - code
+  - infrastructure
+---
+# Verify Strength - Plan Review Agent
+You evaluate the quality of verification steps. Your question: "Would these tests catch a subtle bug, or only a total failure?"
+## Your Core Principle
+Mutation testing (DeMillo et al. 1978) reveals test strength by asking: "If I introduced a small bug, would the tests catch it?" Weak tests pass on both correct and incorrect implementations. Strong tests fail when the implementation is wrong in any way. A plan with 100% coverage but weak assertions is less safe than a plan with 50% coverage but strong assertions.
+## Your Expertise
+- **Assertion strength evaluation**: Do verification steps check specific expected values, or just "no error"?
+- **Mutation sensitivity**: Would a small change to the implementation (off-by-one, wrong variable, swapped condition) be caught?
+- **Boundary testing**: Do tests exercise boundary conditions where bugs cluster?
+- **Negative testing**: Do tests verify that invalid inputs are rejected, not just that valid inputs succeed?
+- **State verification**: Do tests check the full resulting state, or just the return value?
+## Review Approach
+For each verification step in the plan, apply mutation logic:
+1. **Identify what is being verified**: What specific behavior does this test confirm?
+2. **Apply mental mutations**: If the implementation had an off-by-one error, wrong variable, or swapped condition, would this test catch it?
+3. **Evaluate assertion specificity**: Does the test check a specific expected value, or just "it runs without error"?
+4. **Check boundary coverage**: Are edge cases and boundary values tested?
+5. **Assess negative testing**: Are failure cases and invalid inputs covered?
+## Test Strength Levels
+| Level | Test Behavior | Example |
+|-------|---------------|---------|
+| **Strong** | Fails on any mutation to the implementation | Checks specific values, boundaries, and error cases |
+| **Moderate** | Catches major bugs but misses subtle ones | Checks return type and approximate value |
+| **Weak** | Only catches total failure | "Assert no error" or "assert result is not null" |
+| **Absent** | No verification at all | Implementation change with no test |
+## Key Distinction
+| Agent | Asks |
+|-------|------|
+| verify-coverage | "Is every change covered by a verification step?" |
+| **verify-strength** | **"Would these tests catch a subtle bug?"** |
+## CRITICAL: Single-Turn Review
+When reviewing a plan:
+1. Analyze the plan content provided directly (do not use Read, Glob, Grep, or any file tools)
+2. Call StructuredOutput immediately with your assessment
+3. Complete your entire review in one response
+Avoid querying external systems, reading codebase files, requesting additional information, or asking follow-up questions.
+## Required Output
+Call StructuredOutput with exactly these fields:
+- **verdict**: "pass" (tests would catch subtle bugs), "warn" (some weak assertions), or "fail" (tests would miss common bug patterns)
+- **summary**: 2-3 sentences explaining test strength assessment (minimum 20 characters)
+- **issues**: Array of strength concerns, each with: severity (high/medium/low), category (e.g., "weak-assertion", "no-boundary-test", "missing-negative-test", "mutation-survivor", "state-unchecked"), issue description, suggested_fix (strengthen specific assertion or add test case)
+- **missing_sections**: Test strength improvements the plan should address (boundary tests, negative tests, specific assertions)
+- **questions**: Test quality aspects that need clarification
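
The mutation logic this template relies on can be illustrated with a small, self-contained example. It is a sketch and is not taken from the package: `within_limit`, its hand-written mutant, and the two tests are hypothetical. The weak test corresponds to the "Weak" row of the table above (it only checks that a result exists), while the strong test exercises the boundary value.

```python
def within_limit(count: int, limit: int) -> bool:
    """Reference implementation: the limit itself is allowed."""
    return count <= limit


def within_limit_mutant(count: int, limit: int) -> bool:
    """Hand-written mutant with an off-by-one error (<= replaced by <)."""
    return count < limit


def weak_test(fn) -> bool:
    # Only checks that something is returned, not what it is.
    return fn(3, 5) is not None


def strong_test(fn) -> bool:
    # Checks specific values on both sides of the boundary and at the boundary.
    return fn(4, 5) is True and fn(5, 5) is True and fn(6, 5) is False


def mutant_survives(test, original, mutant) -> bool:
    """A mutant survives when the test passes on both implementations."""
    return test(original) and test(mutant)


weak_survives = mutant_survives(weak_test, within_limit, within_limit_mutant)
strong_survives = mutant_survives(strong_test, within_limit, within_limit_mutant)
```

The mutant survives the weak test and is caught by the strong one, which is the kind of finding the "mutation-survivor" issue category is meant to report.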

package/dist/templates/cc-native/_cc-native/hooks/__pycache__/cc-native-plan-review.cpython-313.pyc CHANGED

Binary file

package/dist/templates/cc-native/_cc-native/hooks/cc-native-plan-review.py CHANGED

@@ -69,6 +69,7 @@ try:
         write_combined_artifacts,
         build_inline_review_summary,
         extract_top_issues_text,
+        build_high_issues_document,
         load_config,
         get_display_settings,
     )
@@ -133,16 +134,44 @@ def skip_with_info(reason: str) -> int:
 # ---------------------------
 DEFAULT_AGENTS: List[Dict[str, Any]] = [
-    {"name": "architect-reviewer", "model": "sonnet", "focus": "architectural concerns and scalability", "enabled": True, "categories": ["code", "infrastructure", "design"]},
-    {"name": "penetration-tester", "model": "sonnet", "focus": "security vulnerabilities and attack vectors", "enabled": True, "categories": ["code", "infrastructure"]},
-    {"name": "performance-engineer", "model": "sonnet", "focus": "performance bottlenecks and optimization", "enabled": True, "categories": ["code", "infrastructure"]},
-    {"name": "accessibility-tester", "model": "sonnet", "focus": "accessibility compliance and UX concerns", "enabled": True, "categories": ["code", "design"]},
+    # Mandatory agents
+    {"name": "handoff-readiness", "model": "sonnet", "focus": "fresh context execution readiness", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "clarity-auditor", "model": "sonnet", "focus": "communication clarity and execution readiness", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "skeptic", "model": "sonnet", "focus": "problem-solution alignment and assumption validation", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "documentation-philosophy", "model": "sonnet", "focus": "knowledge capture and documentation placement", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    # Risk family
+    {"name": "risk-premortem", "model": "sonnet", "focus": "pre-mortem failure analysis", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "risk-fmea", "model": "sonnet", "focus": "systematic failure mode analysis", "enabled": True, "categories": ["code", "infrastructure", "design"]},
+    {"name": "risk-dependency", "model": "sonnet", "focus": "dependency chain and blast radius analysis", "enabled": True, "categories": ["code", "infrastructure"]},
+    {"name": "risk-reversibility", "model": "sonnet", "focus": "decision reversibility and optionality", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    # Completeness family
+    {"name": "completeness-gaps", "model": "sonnet", "focus": "structural gap analysis", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "completeness-feasibility", "model": "sonnet", "focus": "feasibility and resource analysis", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "completeness-ordering", "model": "sonnet", "focus": "step ordering and critical path analysis", "enabled": True, "categories": ["code", "infrastructure", "design"]},
+    # Architecture family
+    {"name": "arch-structure", "model": "sonnet", "focus": "coupling, cohesion, and boundary analysis", "enabled": True, "categories": ["code", "infrastructure", "design"]},
+    {"name": "arch-evolution", "model": "sonnet", "focus": "evolutionary architecture and change amplification", "enabled": True, "categories": ["code", "infrastructure", "design"]},
+    {"name": "arch-patterns", "model": "sonnet", "focus": "pattern selection and technology fit", "enabled": True, "categories": ["code", "infrastructure"]},
+    # Verification family
+    {"name": "verify-coverage", "model": "sonnet", "focus": "verification coverage mapping", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "verify-strength", "model": "sonnet", "focus": "test quality and mutation analysis", "enabled": True, "categories": ["code", "infrastructure"]},
+    # Trade-off family
+    {"name": "tradeoff-costs", "model": "sonnet", "focus": "opportunity cost and capability sacrifice", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "tradeoff-stakeholders", "model": "sonnet", "focus": "stakeholder impact and cost-benefit asymmetry", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    # Standalone agents
+    {"name": "scope-boundary", "model": "sonnet", "focus": "scope drift and boundary enforcement", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "hidden-complexity", "model": "sonnet", "focus": "understated complexity and hidden difficulty", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "simplicity-guardian", "model": "sonnet", "focus": "over-engineering and unnecessary complexity", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "devils-advocate", "model": "sonnet", "focus": "contrarian analysis and reductio ad absurdum", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "assumption-tracer", "model": "sonnet", "focus": "dependency chains and foundational assumptions", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "incremental-delivery", "model": "sonnet", "focus": "incremental delivery and vertical slicing", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
+    {"name": "constraint-validator", "model": "sonnet", "focus": "constraint identification and satisfaction", "enabled": True, "categories": ["code", "infrastructure", "documentation", "design", "research", "life", "business"]},
 ]
 DEFAULT_ORCHESTRATOR: Dict[str, Any] = {
     "enabled": True,
-    "model": "haiku",
-    "timeout": 30,
+    "model": "opus",
+    "timeout": 60,
 }
 DEFAULT_AGENT_MODEL: str = "sonnet"
@@ -154,6 +183,30 @@ DEFAULT_REVIEW_ITERATIONS: Dict[str, int] = {
 }
+def resolve_mandatory_agents(config_value, complexity: str) -> set:
+    """Resolve mandatory agent names based on config format and complexity.
+    Supports two formats:
+    - Legacy (list): ["a", "b"] — all treated as 'always'
+    - Structured (dict): {"always": [...], "medium+": [...], "high": [...]}
+    """
+    if isinstance(config_value, list):
+        return set(config_value)
+    if not isinstance(config_value, dict):
+        return {"handoff-readiness", "clarity-auditor", "skeptic"}
+    names = set(config_value.get("always", []))
+    if complexity in ("medium", "high"):
+        names.update(config_value.get("medium+", []))
+    if complexity == "high":
+        names.update(config_value.get("high", []))
+    return names
 # ---------------------------
 # Context-based State Management
 # ---------------------------
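
For reference, the two config formats accepted by `resolve_mandatory_agents` behave as sketched below. The function body is copied from the hunk above. The example config is an assumption: which agents sit in the "medium+" and "high" tiers is chosen here only for illustration and is not taken from the package's shipped configuration.

```python
def resolve_mandatory_agents(config_value, complexity: str) -> set:
    """Resolve mandatory agent names based on config format and complexity."""
    if isinstance(config_value, list):
        # Legacy format: every listed agent is always mandatory.
        return set(config_value)
    if not isinstance(config_value, dict):
        # Unrecognized config: fall back to the built-in defaults.
        return {"handoff-readiness", "clarity-auditor", "skeptic"}
    names = set(config_value.get("always", []))
    if complexity in ("medium", "high"):
        names.update(config_value.get("medium+", []))
    if complexity == "high":
        names.update(config_value.get("high", []))
    return names


# Hypothetical structured config; the tier assignments are illustrative only.
structured = {
    "always": ["handoff-readiness", "clarity-auditor"],
    "medium+": ["risk-premortem"],
    "high": ["verify-strength"],
}

for level in ("simple", "medium", "high"):
    print(level, sorted(resolve_mandatory_agents(structured, level)))
```

Calling it with `"simple"` yields only the "always" tier, which is how the hook derives the set it passes to the orchestrator before the actual complexity is known.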
@@ -358,6 +411,7 @@ def load_settings(proj_dir: Path) -> Dict[str, Any]:
             "orchestrator": DEFAULT_ORCHESTRATOR.copy(),
             "timeout": 180,
             "warnThreshold": 0.5,
+            "highIssueThreshold": 3,
             "legacyMode": False,
             "display": DEFAULT_DISPLAY.copy(),
             "agentSelection": DEFAULT_AGENT_SELECTION.copy(),
@@ -567,10 +621,14 @@ def main() -> int:
         timeout=orch_settings.get("timeout", 30),
     )
-    # Compute mandatory agent names early so orchestrator can exclude them
-    mandatory_names = set(agent_settings.get("mandatoryAgents", [
+    # Two-phase mandatory resolution:
+    # Phase 1 (pre-orchestrator): Only "always" mandatory agents excluded from orchestrator pool
+    # Phase 2 (post-orchestrator): Full mandatory set including conditional agents
+    mandatory_config = agent_settings.get("mandatoryAgents", [
         "handoff-readiness", "clarity-auditor", "skeptic"
-    ]))
+    ])
+    always_mandatory = resolve_mandatory_agents(mandatory_config, "simple")
+    mandatory_names = always_mandatory
     log_debug("cc-native-plan-review", f"Codex enabled: {codex_enabled}, Gemini enabled: {gemini_enabled}")
     log_debug("cc-native-plan-review", f"Agent library: {[a.name for a in agent_library]}")
@@ -585,7 +643,7 @@ def main() -> int:
     if gemini_enabled:
         phase1_tasks.append(("gemini", lambda: run_gemini_review(plan, REVIEW_SCHEMA, plan_settings)))
     if orchestrator_config.enabled and enabled_agents and not legacy_mode:
-        phase1_tasks.append(("orchestrator", lambda: run_orchestrator(plan, enabled_agents, orchestrator_config, agent_settings, mandatory_names=mandatory_names)))
+        phase1_tasks.append(("orchestrator", lambda: run_orchestrator(plan, enabled_agents, orchestrator_config, agent_settings, mandatory_names=always_mandatory)))
     log_info("cc-native-plan-review", f"=== PHASE 1: Running {len(phase1_tasks)} tasks in parallel ===")
@@ -605,12 +663,8 @@ def main() -> int:
     # Collect CLI results
     if "codex" in phase1_results and phase1_results["codex"]:
         cli_results["codex"] = phase1_results["codex"]
-        if phase1_results["codex"].verdict and phase1_results["codex"].verdict not in ("skip", "error"):
-            all_verdicts.append(phase1_results["codex"].verdict)
     if "gemini" in phase1_results and phase1_results["gemini"]:
         cli_results["gemini"] = phase1_results["gemini"]
-        if phase1_results["gemini"].verdict and phase1_results["gemini"].verdict not in ("skip", "error"):
-            all_verdicts.append(phase1_results["gemini"].verdict)
     # Get orchestrator result
     if "orchestrator" in phase1_results and phase1_results["orchestrator"]:
@@ -640,6 +694,11 @@ def main() -> int:
             if orch_result and not legacy_mode:
                 detected_complexity = orch_result.complexity
+                # Phase 2: Recompute mandatory set with actual complexity
+                mandatory_names = resolve_mandatory_agents(mandatory_config, detected_complexity)
+                mandatory_agents = [a for a in enabled_agents if a.name in mandatory_names]
+                non_mandatory = [a for a in enabled_agents if a.name not in mandatory_names]
                 # Get orchestrator's additional selections (excluding mandatory since they always run)
                 orch_selected_names = set(orch_result.selected_agents) - mandatory_names
                 orch_selected = [a for a in non_mandatory if a.name in orch_selected_names]
@@ -666,8 +725,9 @@ def main() -> int:
                 log_info("cc-native-plan-review", f"Final selection: {len(selected_agents)} agents ({len(mandatory_agents)} mandatory + {len(orch_selected)} additional)")
             else:
                 log_info("cc-native-plan-review", "Running in legacy mode (all enabled agents)")
-                selected_agents = enabled_agents
                 detected_complexity = "medium"  # Default for legacy mode
+                mandatory_names = resolve_mandatory_agents(mandatory_config, detected_complexity)
+                selected_agents = enabled_agents
         log_diagnostic("cc-native-plan-review", "decide",
                         f"Selected {len(selected_agents)} agents, complexity={detected_complexity}",
@@ -706,8 +766,6 @@ def main() -> int:
                     try:
                         result = future.result()
                         agent_results[agent.name] = result
-                        if result.verdict and result.verdict not in ("skip", "error"):
-                            all_verdicts.append(result.verdict)
                         log_info("cc-native-plan-review", f"{agent.name} completed with verdict: {result.verdict}")
                     except Exception as ex:
                         log_error("cc-native-plan-review", f"{agent.name} failed with exception: {ex}")
@@ -720,6 +778,25 @@ def main() -> int:
                             err=str(ex),
                         )
+    # ============================================
+    # Per-agent high-severity threshold: override verdict to "fail" if threshold met
+    # ============================================
+    high_issue_threshold = agent_settings.get("highIssueThreshold", 3)
+    all_verdicts = []  # Recompute with overrides applied
+    for r in list(cli_results.values()) + list(agent_results.values()):
+        if not r.verdict or r.verdict in ("skip", "error"):
+            continue
+        agent_high = sum(
+            1 for issue in (r.data.get("issues", []) if r.data else [])
+            if issue.get("severity") == "high"
+        )
+        if agent_high >= high_issue_threshold:
+            log_info("cc-native-plan-review",
+                     f"{r.name}: verdict overridden to 'fail' ({agent_high} high issues >= {high_issue_threshold})")
+            r.verdict = "fail"
+        all_verdicts.append(r.verdict)
     # ============================================
     # PHASE 4: Generate Combined Output
     # ============================================
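
The per-agent override added in this hunk can be exercised in isolation with a stand-in result type. This is a sketch under assumptions: `ReviewResult` is a hypothetical dataclass that mirrors only the `name`, `verdict`, and `data` attributes the hunk reads, and `apply_high_issue_threshold` is an illustrative wrapper rather than a function in the package.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ReviewResult:
    """Hypothetical stand-in for the hook's reviewer result objects."""
    name: str
    verdict: Optional[str]
    data: Optional[Dict[str, Any]] = None


def apply_high_issue_threshold(results: List[ReviewResult], threshold: int = 3) -> List[str]:
    """Override verdicts to 'fail' when a reviewer reports enough high-severity issues."""
    verdicts = []
    for r in results:
        # Skipped or errored reviewers do not contribute a verdict.
        if not r.verdict or r.verdict in ("skip", "error"):
            continue
        issues = r.data.get("issues", []) if r.data else []
        high = sum(1 for issue in issues if issue.get("severity") == "high")
        if high >= threshold:
            r.verdict = "fail"
        verdicts.append(r.verdict)
    return verdicts


results = [
    ReviewResult("skeptic", "warn", {"issues": [{"severity": "high"}] * 3}),
    ReviewResult("arch-structure", "pass", {"issues": [{"severity": "high"}]}),
    ReviewResult("codex", "error"),
]
verdicts = apply_high_issue_threshold(results)
```

With the default threshold of 3, the first reviewer's "warn" becomes "fail", the second keeps "pass", and the errored reviewer is left out of the verdict list.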
@@ -765,33 +842,27 @@ def main() -> int:
     context_parts = [inline_summary, f"\nFull review: `{review_file}`\n"]
-    # Review decision — only fail triggers a block
+    # Review decision — fail veto triggers a block (per-agent override already applied)
     warn_threshold = agent_settings.get("warnThreshold", 0.5)
-    should_deny, deny_reason, review_score = compute_review_decision(all_verdicts, warn_threshold)
-    # Count high-severity issues for logging
-    high_count = sum(
-        1 for r in list(combined_result.cli_reviewers.values()) + list(combined_result.agents.values())
-        if r.data
-        for issue in r.data.get("issues", [])
-        if issue.get("severity") == "high"
+    should_deny, deny_reason, review_score = compute_review_decision(
+        all_verdicts, warn_threshold,
     )
     # Structured log entries for review influence tracking
-    log_info("cc-native-plan-review", f"REVIEW_DECISION: verdict={combined_result.overall_verdict}, deny={should_deny}, score={review_score:.2f}, high_issues={high_count}")
+    log_info("cc-native-plan-review", f"REVIEW_DECISION: verdict={combined_result.overall_verdict}, deny={should_deny}, reason={deny_reason}, score={review_score:.2f}")
     log_diagnostic("cc-native-plan-review", "result",
-                    f"verdict={combined_result.overall_verdict}, deny={should_deny}, high={high_count}",
+                    f"verdict={combined_result.overall_verdict}, deny={should_deny}, reason={deny_reason}",
                     decision="deny" if should_deny else "allow",
-                    reasoning=f"score={review_score:.2f}, threshold={warn_threshold}",
+                    reasoning=f"reason={deny_reason}, score={review_score:.2f}, warn_threshold={warn_threshold}",
                     inputs={"overall_verdict": combined_result.overall_verdict,
-                            "high_issue_count": high_count, "review_score": round(review_score, 2),
+                            "review_score": round(review_score, 2),
                             "cli_count": len(cli_results), "agent_count": len(agent_results)})
     # Terminal progress indicator
     verdict_emoji = "✅" if not should_deny else "❌"
     eprint(f"[plan-review] {verdict_emoji} {combined_result.overall_verdict.upper()} (score={review_score:.2f})")
     if should_deny:
-        eprint(f"[plan-review] Blocking ExitPlanMode — {high_count} high-severity issue(s) found")
+        eprint(f"[plan-review] Blocking ExitPlanMode — {deny_reason}")
     # Handle iteration logic
     needs_more_iterations = False
@@ -809,11 +880,13 @@ def main() -> int:
         else:
             # Final iteration - increment current and save state
             iteration_state["current"] = iteration_state.get("current", 1) + 1
-            # Also increment max by 1 to allow another review cycle if the user rejects
-            # the plan and requests changes. Without this, once iterations are exhausted,
-            # the hook would skip review entirely even if the user sent the
-            # planner back to revise. This ensures rejected plans can always be re-reviewed.
-            iteration_state["max"] = iteration_state.get("max", 1) + 1
+            # Extend max ONLY when the plan passes review (for user rejection recovery).
+            # When the hook denies (should_deny=True), don't extend — the hook will
+            # keep blocking on each resubmission via should_deny regardless of max.
+            # This prevents max from inflating on repeated hook rejections while still
+            # allowing re-review after a user rejects a plan that passed review.
+            if not should_deny:
+                iteration_state["max"] = iteration_state.get("max", 1) + 1
             save_iteration_state(reviews_dir, iteration_state)
     # Emit output with correct Claude Code hook format
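
The changed final-iteration rule (extend `max` only when the plan passed review) reduces to the following sketch. `finish_final_iteration` is a hypothetical helper written for illustration. Unlike the hook, which mutates `iteration_state` in place and then saves it, this version returns an updated copy so it can be tested without side effects.

```python
def finish_final_iteration(state: dict, should_deny: bool) -> dict:
    """Return the iteration state after the final review iteration."""
    updated = dict(state)
    updated["current"] = updated.get("current", 1) + 1
    # Extend max only when the hook allowed the plan, so that a later user
    # rejection can still trigger a re-review without repeated hook denials
    # inflating max.
    if not should_deny:
        updated["max"] = updated.get("max", 1) + 1
    return updated


after_deny = finish_final_iteration({"current": 1, "max": 1}, should_deny=True)
after_pass = finish_final_iteration({"current": 1, "max": 1}, should_deny=False)
```

After a hook denial the state is `{"current": 2, "max": 1}`, while after a pass it is `{"current": 2, "max": 2}`.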
@@ -832,29 +905,41 @@ def main() -> int:
     )
     if needs_more_iterations:
-        mark_plan_reviewed(session_id, plan_hash, "cc-native-plan-review", iteration_state, decision="deny")
+        mark_plan_reviewed(session_id, plan_hash, "cc-native-plan-review", iteration_state, decision="hook_deny_iteration")
         current = iteration_state["current"] - 1  # Display the just-completed iteration
         max_iter = iteration_state["max"]
         remaining = max_iter - current
         top_issues_text = extract_top_issues_text(combined_result, max_count=3, severity="high")
+        # Two-fold deny signal: inline issues (fallback) + high-issues.md (primary)
+        high_issues_doc = build_high_issues_document(combined_result)
+        high_issues_path = review_folder / "high-issues.md"
+        high_issues_path.write_text(high_issues_doc, encoding="utf-8")
         emit_context_and_block(
             context_text,
             f"Plan review iteration {current}/{max_iter} FAILED ({deny_reason}, score={review_score:.2f}). "
             f"Critical issues: {top_issues_text}. "
+            f"IMPORTANT: Read `{high_issues_path}` for ALL high-severity issues — "
+            f"this file contains only the most critical findings, no noise. "
             f"{_REVIEWER_CAVEAT} "
-            f"Revise the plan, then call ExitPlanMode again. "
+            f"Revise the plan to address these issues, then call ExitPlanMode again. "
             f"({remaining} revision{'s' if remaining != 1 else ''} remaining) "
             f"{_RESUBMIT_INSTRUCTION}",
         )
     elif should_deny:
-        mark_plan_reviewed(session_id, plan_hash, "cc-native-plan-review", iteration_state, decision="deny")
+        mark_plan_reviewed(session_id, plan_hash, "cc-native-plan-review", iteration_state, decision="hook_deny_final")
         top_issues_text = extract_top_issues_text(combined_result, max_count=3, severity="high")
+        # Two-fold deny signal: inline issues (fallback) + high-issues.md (primary)
+        high_issues_doc = build_high_issues_document(combined_result)
+        high_issues_path = review_folder / "high-issues.md"
+        high_issues_path.write_text(high_issues_doc, encoding="utf-8")
         emit_context_and_block(
             context_text,
             f"Plan review FAILED ({deny_reason}, score={review_score:.2f}). "
             f"Critical issues: {top_issues_text}. "
+            f"IMPORTANT: Read `{high_issues_path}` for ALL high-severity issues — "
+            f"this file contains only the most critical findings, no noise. "
             f"{_REVIEWER_CAVEAT} "
-            f"Revise the plan, then call ExitPlanMode again. "
+            f"Revise the plan to address these issues, then call ExitPlanMode again. "
             f"{_RESUBMIT_INSTRUCTION}",
         )
     else:

package/dist/templates/cc-native/_cc-native/lib/__pycache__/utils.cpython-313.pyc CHANGED

Binary file