pan-wizard 2.8.1 → 2.9.0 (npm)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/README.md +4 -2
package/bin/install.js +23 -0
package/commands/pan/focus-design.md +235 -12
package/commands/pan/focus-doc-audit.md +530 -0
package/commands/pan/focus-drift-walking.md +525 -0
package/commands/pan/focus-plan.md +204 -12
package/commands/pan/profile.md +2 -1
package/package.json +1 -1
package/pan-wizard-core/bin/lib/commands.cjs +29 -7
package/pan-wizard-core/bin/lib/config.cjs +10 -0
package/pan-wizard-core/bin/lib/core.cjs +168 -21
package/pan-wizard-core/bin/lib/verify.cjs +283 -4
package/pan-wizard-core/bin/pan-tools.cjs +11 -2
package/pan-wizard-core/references/model-profiles.md +191 -62
package/pan-wizard-core/workflows/help.md +11 -1
package/pan-wizard-core/workflows/profile.md +8 -1
package/pan-wizard-core/workflows/settings.md +14 -0

package/commands/pan/focus-plan.md CHANGED

@@ -1,19 +1,21 @@
 ---
 name: focus-plan
 group: Focus
-description: Create capacity-budgeted work batch with 4 execution modes
+description: Create capacity-budgeted work batch with spec coverage verification and 4 execution modes
 allowed-tools:
   - Read
+  - Write
+  - Edit
   - Bash
   - Grep
   - Glob
 ---
-# /pan:focus-plan — Capacity-Budgeted Work Batch Planner
+# /pan:focus-plan — Capacity-Budgeted Work Batch Planner with Spec Coverage Verification
-Create a capacity-budgeted work batch from focus-scan results. $ARGUMENTS
+Create a capacity-budgeted work batch from focus-scan results **with mandatory verification that planned work covers all relevant spec and ADR requirements.** $ARGUMENTS
-**Goal:** Select a right-sized batch of work items that fits within the session's point budget, ordered for maximum impact with minimum risk.
+**Goal:** Select a right-sized batch of work items that (a) fits within the session's point budget, (b) is ordered for maximum impact with minimum risk, and (c) demonstrably covers the requirements from any associated specs, ADRs, and success criteria — flagging coverage gaps BEFORE execution begins.
 ---
@@ -42,10 +44,67 @@ If no recent scan exists, run `/pan:focus-scan` automatically before proceeding.
   - `full` — Full-spectrum: enhanced budget, all priorities equally weighted (60 pts)
 - `--priority P0-P6` — Only pick items from these priority tiers
 - `--lean` — Apply RS filtering: exclude items with RS < 1.5
+- `--no-spec-check` — Skip spec coverage verification (NOT recommended — use only for pure bugfix batches)
 ---
-## Capacity Budget System
+## Phase 1: Spec & ADR Discovery (MANDATORY)
+> *Before planning work, understand what has been designed and promised.*
+### 1.1 Scan for Specifications
+Search the project for feature specifications and design documents:
+- `docs/specs/*.md` or `docs/specs/**/*.md`
+- `.planning/specs/` or `.planning/designs/`
+- Any `*_featureai.md`, `*_spec.md`, `*_design.md` files
+- README sections describing planned features
+For each spec found, extract:
+| Spec File | Feature Name | Status | Requirements Count | Success Criteria Count |
+|-----------|-------------|--------|-------------------|----------------------|
+| [path] | [name] | Proposed/In Progress/Complete | [N] | [N] |
+### 1.2 Scan for ADRs
+Search for Architecture Decision Records:
+- `docs/decisions/ADR-*.md`
+- `.planning/decisions/`
+For each ADR, extract:
+| ADR | Decision | Status | Success Criteria | Implementation Tasks |
+|-----|----------|--------|-----------------|---------------------|
+| [ADR-NNNN] | [summary] | Proposed/Accepted/Implemented | [count or "none defined"] | [count or "none defined"] |
+### 1.3 Extract Requirement Inventory
+From every spec and ADR found, build a **master requirements list**:
+| Req ID | Source | Requirement | Type | Implemented? |
+|--------|--------|-------------|------|-------------|
+| SC-1 | ADR-0015 | JWT auth with 4-role RBAC | Feature | Yes/No/Partial |
+| SC-2 | spec/extraction.md | Image extraction for JPG/PNG | Feature | Yes/No/Partial |
+| T-3 | ADR-0018 §Task 6 | Unmatched description table | Task | Yes/No/Partial |
+| BRK-1 | ADR-0018 §Breaking | Hierarchy roll-up for backward compat | Migration | Yes/No/Partial |
+**Verification method for "Implemented?":**
+- Search the codebase for files, classes, functions, routes, or tests matching each requirement
+- Check if tests exist that verify the requirement
+- Mark as `Partial` if code exists but tests don't, or if the feature is stubbed
+### 1.4 Identify Unimplemented Requirements
+Filter the master list to requirements where `Implemented? = No` or `Partial`:
+| Req ID | Source | Requirement | Gap Type | Estimated Effort |
+|--------|--------|-------------|----------|-----------------|
+| SC-2 | ADR-0018 | Keyword count >= 500 | Not started | M |
+| T-6 | ADR-0018 | Unmatched description table | Not started | M |
+| BRK-1 | ADR-0018 | Hierarchy roll-up | Partial (code, no tests) | S |
+This becomes the **spec gap backlog** — items that specs/ADRs promised but the codebase doesn't deliver yet.
+---
+## Phase 2: Capacity Budget System
 | Size | Points | Per Session | Meaning |
 |------|--------|-------------|---------|
@@ -57,45 +116,178 @@ If no recent scan exists, run `/pan:focus-scan` automatically before proceeding.
 ---
-## Execution Modes
+## Phase 3: Execution Modes & Batch Selection
 ### `bugfix` — Stability-First
 - **Budget:** 40 pts
 - **Algorithm:** P0 mandatory -> P1 -> P2-P4 smallest-first
 - **Feature allocation:** None
+- **Spec coverage:** Verify P0/P1 items close spec gaps where applicable
 ### `balanced` — Mix of Fixes + Features (DEFAULT)
 - **Budget:** 50 pts
 - **Stability pass (60%):** 30 pts for P0-P2
 - **Feature pass (40%):** 20 pts for P3-P6
+- **Spec coverage:** Cross-reference feature items against spec gap backlog — prefer items that close gaps
 ### `features` — Feature-Focused Sprint
 - **Budget:** 50 pts
 - **Mandatory pass:** All P0 items
 - **Feature pass (80%):** 40 pts for P3-P5
 - **Stability pass (20%):** 10 pts for P1-P2 quick wins
+- **Spec coverage:** Feature items MUST map to spec requirements — reject unspecified feature work
 ### `full` — Full-Spectrum Marathon
 - **Budget:** 60 pts
 - **All priorities weighted equally, largest-impact-first**
+- **Spec coverage:** Full traceability — every item maps to a spec/ADR requirement or is flagged as unspecified
+### Batch Selection Algorithm
+1. Build candidate list from focus-scan results
+2. **For each candidate, attempt to map it to a spec/ADR requirement** (by keyword match, file overlap, or feature area)
+3. Score candidates: `impact_score = base_priority_score + spec_coverage_bonus`
+   - Items that close spec gaps get +2 priority bonus
+   - Items that close success criteria get +3 priority bonus
+   - Items with no spec mapping get +0 (no penalty, but no bonus)
+4. Apply mode-specific budget allocation
+5. Select items greedily by score until budget exhausted
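Review note: the Batch Selection Algorithm above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the package's implementation: the candidate field names (`basePriorityScore`, `points`, `closesSpecGap`, `closesSuccessCriterion`) are hypothetical, the document does not say whether the +2 and +3 bonuses stack (the sketch applies only the larger one), mode-specific budget allocation (step 4) is left out, and "greedily" is read as "skip an item that does not fit and keep going".

```javascript
// Hypothetical candidate shape; field names are illustrative, not from pan-wizard.
const candidates = [
  { id: 'P0-1', basePriorityScore: 10, points: 2, closesSpecGap: true,  closesSuccessCriterion: true  },
  { id: 'P2-3', basePriorityScore: 7,  points: 4, closesSpecGap: false, closesSuccessCriterion: false },
  { id: 'P3-1', basePriorityScore: 4,  points: 4, closesSpecGap: true,  closesSuccessCriterion: false },
  { id: 'P3-2', basePriorityScore: 4,  points: 8, closesSpecGap: false, closesSuccessCriterion: false },
];

// Step 3: impact_score = base_priority_score + spec_coverage_bonus.
// Assumption: bonuses do not stack; the larger applicable bonus wins.
function impactScore(c) {
  const bonus = c.closesSuccessCriterion ? 3 : c.closesSpecGap ? 2 : 0;
  return c.basePriorityScore + bonus;
}

// Step 5: walk candidates in descending score order and take each one that
// still fits. The budget is a hard limit, so an oversized item is skipped.
function selectBatch(items, budget) {
  const ranked = [...items].sort((a, b) => impactScore(b) - impactScore(a));
  const batch = [];
  let used = 0;
  for (const item of ranked) {
    if (used + item.points <= budget) {
      batch.push(item.id);
      used += item.points;
    }
  }
  return { batch, used };
}

const result = selectBatch(candidates, 10);
console.log(result); // { batch: [ 'P0-1', 'P2-3', 'P3-1' ], used: 10 }
```

With a 10-point budget the 8-point item `P3-2` is dropped because it would exceed the limit, which matches the "hard limit" rule in the NEVER DO list.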
+---
+## Phase 4: Spec Coverage Analysis (MANDATORY unless `--no-spec-check`)
+> *The most important output of focus-plan: does the batch actually deliver against what was designed?*
+### 4.1 Coverage Matrix
+For each spec/ADR requirement, show whether the batch covers it:
+| Req ID | Source | Requirement | Batch Item | Coverage |
+|--------|--------|-------------|-----------|----------|
+| SC-1 | ADR-0018 | Category count >= 65 | #3: Expand categories | COVERED |
+| SC-2 | ADR-0018 | Keyword count >= 500 | #4: Expand keywords | COVERED |
+| SC-3 | ADR-0018 | Unmatched queue API | — | **GAP** |
+| SC-4 | ADR-0018 | NCA affordability output | — | **GAP (deferred to v1)** |
+| SC-5 | ADR-0018 | No regression | #1: Run existing tests | COVERED |
+### 4.2 Coverage Score
+```
+Spec Coverage: X / Y requirements covered (Z%)
+├── Fully covered:    N items
+├── Partially covered: N items (code but no tests, or tests but incomplete)
+├── Gaps:             N items (not in batch)
+└── Deferred:         N items (explicitly deferred to future version)
+```
+### 4.3 Gap Analysis & Justification
+For every **GAP** in the coverage matrix, provide:
+| Gap | Requirement | Why Not In This Batch | When Will It Be Addressed |
+|-----|------------|----------------------|--------------------------|
+| SC-3 | Unmatched queue API | Exceeds budget (M=4pts, only 2pts remaining) | Next batch (features mode) |
+| SC-4 | NCA affordability | Depends on SC-1 + SC-2 (must complete first) | After category expansion |
+**CRITICAL:** If the coverage score is < 50% for a spec that has `Status: In Progress`, flag this prominently:
+```
+⚠️ WARNING: Batch covers only X% of [spec name] requirements.
+   Y requirements remain unaddressed. Consider:
+   - Increasing budget (--budget N)
+   - Switching to features mode (--mode features)
+   - Breaking spec into smaller milestones
+```
+### 4.4 Dependency Verification
+Check that batch items respect dependency ordering from specs:
+| Batch Item | Depends On | Dependency In Batch? | Order Correct? |
+|-----------|-----------|---------------------|----------------|
+| #4: Keywords | #3: Categories | Yes | Yes (#3 before #4) |
+| #6: Suggestions | #5: Unmatched API | No — #5 not in batch | **BLOCKED** |
+**If any item is BLOCKED:** Either add the dependency to the batch (if budget allows) or remove the blocked item and flag it.
+### 4.5 Success Criteria Verification Plan
+For each success criterion in the batch, specify HOW it will be verified after execution:
+| SC ID | Criterion | Verification Command | Expected Result |
+|-------|-----------|---------------------|-----------------|
+| SC-1 | Category count >= 65 | `SELECT COUNT(*) FROM stx_category` | >= 65 |
+| SC-2 | Keywords >= 500 | `SELECT COUNT(*) FROM stx_keyword` | >= 500 |
+| SC-5 | No regression | `dotnet test` | All pass, count >= N |
+This becomes the post-execution checklist for `/pan:focus-exec`.
 ---
-## Output
+## Phase 5: Output
 Produce a batch file at `.planning/focus/batch-<YYYY-MM-DD>.json` via `pan-tools focus plan`:
 ```markdown
 ## Focus Batch — <date>
 **Mode:** balanced | **Budget:** 50 pts | **Allocated:** N pts
+**Specs referenced:** N specs, M ADRs
+**Spec coverage:** X/Y requirements (Z%)
+### Batch Items
+| # | ID | Title | Priority | Size | Pts | Tier | Track | Spec Req |
+|---|----|-------|----------|------|-----|------|-------|----------|
+| 1 | P0-1 | Fix crash in state cmd | P0 | S | 2 | MICRO | Stability | ADR-0005 SC-3 |
+| 2 | P2-3 | Add tests for milestone | P2 | M | 4 | STANDARD | Stability | — |
+| 3 | P3-1 | Expand category taxonomy | P3 | M | 4 | STANDARD | Feature | ADR-0018 SC-1 |
-| # | ID | Title | Priority | Size | Pts | Tier | Track |
-|---|----|-------|----------|------|-----|------|-------|
-| 1 | P0-1 | Fix crash in state cmd | P0 | S | 2 | MICRO | Stability |
-| 2 | P2-3 | Add tests for milestone | P2 | M | 4 | STANDARD | Stability |
-| 3 | P3-1 | Add --json flag to phase | P3 | M | 4 | STANDARD | Feature |
+### Spec Coverage Summary
+| Source | Total Reqs | Covered | Gaps | Deferred |
+|--------|-----------|---------|------|----------|
+| ADR-0018 | 7 | 3 | 2 | 2 |
+| spec/extraction.md | 5 | 5 | 0 | 0 |
+| **Total** | **12** | **8 (67%)** | **2** | **2** |
+### Uncovered Requirements (Gaps)
+| Req | Source | Reason | Next Batch? |
+|-----|--------|--------|-------------|
+| Unmatched queue API | ADR-0018 SC-3 | Budget exceeded | Yes — features mode |
+| NCA affordability | ADR-0018 SC-4 | Blocked by SC-1, SC-2 | After this batch |
+### Dependency Order
+```
+#1 (P0 crash fix) → independent
+#3 (categories) → #4 (keywords) → #5 (match types)
+#2 (tests) → independent
+```
+### Post-Execution Verification Checklist
+- [ ] SC-1: Category count >= 65 → `SELECT COUNT(*) FROM stx_category`
+- [ ] SC-2: Keywords >= 500 → `SELECT COUNT(*) FROM stx_keyword`
+- [ ] SC-5: All existing tests pass → `dotnet test`
 Execution Order: MICRO first, then STANDARD, then FULL
 ```
 Ready for `/pan:focus-exec`.
+---
+## NEVER DO
+- Plan a batch without checking specs and ADRs for coverage gaps
+- Include a feature item that contradicts or conflicts with an accepted ADR
+- Ignore dependency ordering defined in specs (Task A before Task B)
+- Claim 100% spec coverage without actually verifying each requirement against the codebase
+- Include blocked items (items whose dependencies are not in the batch and not yet implemented)
+- Silently drop spec requirements — every gap must be justified and scheduled
+- Plan implementation tasks that aren't traceable to a spec, ADR, scan finding, or user request
+- Exceed the capacity budget (hard limit — not "approximately")
+## ALWAYS DO
+- Discover ALL specs and ADRs before selecting batch items
+- Cross-reference every batch item against spec requirements where applicable
+- Flag coverage gaps prominently with justification and scheduling
+- Verify dependency ordering matches spec-defined task dependencies
+- Include a post-execution verification checklist with concrete commands
+- Prefer items that close spec gaps over items with no spec mapping (when priority is equal)
+- State the coverage score as a percentage in the batch header
+- Report unimplemented success criteria that aren't addressed by this batch
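Review note: the coverage score (section 4.2) and the sub-50% warning (section 4.3) reduce to simple arithmetic. The sketch below uses the numbers from the sample "Spec Coverage Summary" table. The `status` field, the function names, and the choice to treat an empty requirement list as 100% covered are assumptions made for illustration.

```javascript
// Rows copied from the sample "Spec Coverage Summary" table; the `status`
// field is an assumption added for illustration.
const sources = [
  { source: 'ADR-0018', status: 'In Progress', total: 7, covered: 3 },
  { source: 'spec/extraction.md', status: 'In Progress', total: 5, covered: 5 },
];

// Whole-number percentage. Assumption: nothing required counts as 100%.
function coveragePercent(covered, total) {
  return total === 0 ? 100 : Math.round((covered / total) * 100);
}

// Sources that are still in progress and have under half their requirements
// covered. The comparison uses the raw ratio, not the rounded percentage.
function lowCoverageWarnings(rows) {
  return rows
    .filter(r => r.status === 'In Progress' && r.total > 0 && r.covered / r.total < 0.5)
    .map(r => `WARNING: Batch covers only ${coveragePercent(r.covered, r.total)}% of ${r.source} requirements.`);
}

const covered = sources.reduce((n, r) => n + r.covered, 0);
const total = sources.reduce((n, r) => n + r.total, 0);
const overall = coveragePercent(covered, total);
const warnings = lowCoverageWarnings(sources);

console.log(`Spec Coverage: ${covered} / ${total} requirements covered (${overall}%)`);
warnings.forEach(w => console.log(w));
// Spec Coverage: 8 / 12 requirements covered (67%)
// WARNING: Batch covers only 43% of ADR-0018 requirements.
```

The overall 67% matches the template, but ADR-0018 on its own sits at 3 of 7, so the sample batch would trigger the warning if that ADR were marked In Progress.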

package/commands/pan/profile.md CHANGED

@@ -32,5 +32,6 @@ The workflow handles all logic including:
 2. Config file ensuring
 3. Config reading and updating
 4. Model table generation from MODEL_PROFILES
-5. Confirmation display
+5. Cost estimation display (relative cost multiplier per profile)
+6. Confirmation display
 </process>

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pan-wizard",
-  "version": "2.8.1",
+  "version": "2.9.0",
   "description": "A lightweight workflow automation and context engineering system for Claude Code, OpenCode, Gemini CLI, Codex, and Copilot CLI.",
   "bin": {
     "pan-wizard": "bin/install.js"

package/pan-wizard-core/bin/lib/commands.cjs CHANGED

@@ -3,7 +3,7 @@
  */
 const fs = require('fs');
 const path = require('path');
-const { safeReadFile, loadConfig, isGitIgnored, isGitRepo, execGit, normalizePhaseName, comparePhaseNum, getArchivedPhaseDirs, generateSlugInternal, getMilestoneInfo, resolveModelInternal, MODEL_PROFILES, output, error, findPhaseInternal, scanPendingTodos, toPosix } = require('./core.cjs');
+const { safeReadFile, loadConfig, isGitIgnored, isGitRepo, execGit, normalizePhaseName, comparePhaseNum, getArchivedPhaseDirs, generateSlugInternal, getMilestoneInfo, resolveModelInternal, detectProvider, resolveTierToModel, estimateCostMultiplier, MODEL_PROFILES, output, error, findPhaseInternal, scanPendingTodos, toPosix } = require('./core.cjs');
 const { extractFrontmatter } = require('./frontmatter.cjs');
 const { PLANNING_DIR, PHASES_DIR, MILESTONES_DIR, QUICK_DIR, STATE_FILE, ROADMAP_FILE, PROJECT_FILE, PATTERNS_FILE, SESSION_HISTORY_FILE, LEARNINGS_FILE, CONTEXT_SUFFIX, UAT_SUFFIX, VERIFICATION_SUFFIX, isPlanFile, isSummaryFile, ARCHIVE_DIR_RE, PHASE_DIR_RE, CONTEXT_WINDOW, WARNING_THRESHOLD, CRITICAL_THRESHOLD, VALID_COMMIT_TYPES, DEFAULT_SENSITIVE_PATTERNS } = require('./constants.cjs');
 const { planningPath, phasesPath, filterPlanFiles, filterSummaryFiles } = require('./utils.cjs');
@@ -272,29 +272,50 @@ function cmdHistoryDigest(cwd, raw) {
  * @param {string} cwd - Working directory path
  * @param {string} agentType - Agent type identifier (e.g., "pan-executor", "pan-planner")
  * @param {boolean} raw - If true, output raw model name instead of JSON
+ * @param {string} [metadataJson] - Optional JSON string with task metadata for complexity routing
  * @returns {void}
  */
-function cmdResolveModel(cwd, agentType, raw) {
+function cmdResolveModel(cwd, agentType, raw, metadataJson) {
   if (!agentType) {
     error('agent-type required');
   }
+  let taskMetadata = null;
+  if (metadataJson) {
+    try { taskMetadata = JSON.parse(metadataJson); }
+    catch { /* ignore invalid metadata, use static routing */ }
+  }
   const config = loadConfig(cwd);
   const profile = config.model_profile || 'balanced';
+  const strategy = config.routing?.strategy || 'static';
   const agentModels = MODEL_PROFILES[agentType];
   if (!agentModels) {
-    const result = { model: 'sonnet', profile, unknown_agent: true };
-    output(result, raw, 'sonnet');
+    const model = resolveTierToModel('mid', detectProvider(cwd, config));
+    const result = { model, profile, strategy, unknown_agent: true };
+    output(result, raw, model);
     return;
   }
-  const resolved = agentModels[profile] || agentModels['balanced'] || 'sonnet';
-  const model = resolved === 'opus' ? 'inherit' : resolved;
-  const result = { model, profile };
+  const model = resolveModelInternal(cwd, agentType, taskMetadata);
+  const result = { model, profile, strategy };
   output(result, raw, model);
 }
+/**
+ * Estimate cost multipliers for all profiles.
+ * @param {string} cwd - Working directory path
+ * @param {boolean} raw - If true, output formatted text instead of JSON
+ * @returns {void}
+ */
+function cmdEstimateCost(cwd, raw) {
+  const estimates = ['quality', 'balanced', 'budget'].map(estimateCostMultiplier);
+  output({ estimates }, raw, estimates.map(e =>
+    `${e.profile}: ~${e.average}x baseline (${e.agentCount} agents)`
+  ).join('\n'));
+}
 /**
  * Stage and commit planning files to git, respecting commit_docs config and gitignore.
@@ -1416,6 +1437,7 @@ module.exports = {
   cmdVerifyPathExists,
   cmdHistoryDigest,
   cmdResolveModel,
+  cmdEstimateCost,
   cmdCommit,
   cmdSummaryExtract,
   cmdWebsearch,
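Review note: `cmdEstimateCost` maps the three profile names through `estimateCostMultiplier`, which this release adds to `core.cjs`. To see what the raw text output would look like, the standalone sketch below copies `COST_MULTIPLIERS`, the new `MODEL_PROFILES` table, and the function body from the `core.cjs` hunks. It does not import the package.

```javascript
// Tables copied from the core.cjs changes in 2.9.0.
const COST_MULTIPLIERS = { reasoning: 15, mid: 3, fast: 1 };
const MODEL_PROFILES = {
  'pan-planner':              { quality: 'reasoning', balanced: 'reasoning', budget: 'mid' },
  'pan-roadmapper':           { quality: 'reasoning', balanced: 'mid',  budget: 'mid' },
  'pan-executor':             { quality: 'reasoning', balanced: 'mid',  budget: 'mid' },
  'pan-phase-researcher':     { quality: 'reasoning', balanced: 'mid',  budget: 'fast' },
  'pan-project-researcher':   { quality: 'reasoning', balanced: 'mid',  budget: 'fast' },
  'pan-research-synthesizer': { quality: 'reasoning', balanced: 'mid',  budget: 'fast' },
  'pan-debugger':             { quality: 'reasoning', balanced: 'mid',  budget: 'mid' },
  'pan-document_code':        { quality: 'reasoning', balanced: 'fast', budget: 'fast' },
  'pan-verifier':             { quality: 'reasoning', balanced: 'mid',  budget: 'fast' },
  'pan-plan-checker':         { quality: 'reasoning', balanced: 'mid',  budget: 'fast' },
  'pan-integration-checker':  { quality: 'reasoning', balanced: 'mid',  budget: 'fast' },
  'pan-reviewer':             { quality: 'reasoning', balanced: 'fast', budget: 'fast' },
};

// Same logic as the function in the diff: sum the per-agent tier multipliers
// and report the average rounded to one decimal place.
function estimateCostMultiplier(profile) {
  let total = 0;
  const agents = Object.keys(MODEL_PROFILES);
  for (const agent of agents) {
    const tier = MODEL_PROFILES[agent][profile] || 'mid';
    total += COST_MULTIPLIERS[tier] || 3;
  }
  return { profile, total, average: +(total / agents.length).toFixed(1), agentCount: agents.length };
}

const estimates = ['quality', 'balanced', 'budget'].map(estimateCostMultiplier);
const text = estimates
  .map(e => `${e.profile}: ~${e.average}x baseline (${e.agentCount} agents)`)
  .join('\n');
console.log(text);
// quality: ~15x baseline (12 agents)
// balanced: ~3.7x baseline (12 agents)
// budget: ~1.7x baseline (12 agents)
```

The totals behind those averages are 180, 44, and 20 across the 12 agents. The figures are relative multipliers against the `fast` tier, not prices.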

package/pan-wizard-core/bin/lib/config.cjs CHANGED

@@ -70,6 +70,15 @@ function buildConfigDefaults(hasBraveSearch, userDefaults) {
       rollback_snapshots: true,
       error_pattern_learning: true,
     },
+    routing: {
+      strategy: 'static',
+      provider: 'auto',
+      cascade_quality_gate: true,
+      complexity_thresholds: {
+        downgrade_max: 2,
+        upgrade_min: 6,
+      },
+    },
   };
   return {
     ...hardcoded,
@@ -78,6 +87,7 @@ function buildConfigDefaults(hasBraveSearch, userDefaults) {
     budget: { ...hardcoded.budget, ...(userDefaults.budget || {}) },
     commit: { ...hardcoded.commit, ...(userDefaults.commit || {}) },
     execution: { ...hardcoded.execution, ...(userDefaults.execution || {}) },
+    routing: { ...hardcoded.routing, ...(userDefaults.routing || {}) },
   };
 }
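Review note: the new `routing` merge is a one-level object spread, so a user default that sets only one key of `complexity_thresholds` replaces the whole nested object. The sketch below demonstrates that JavaScript behavior with the defaults from this hunk. The `userRouting` value is hypothetical, and the nested merge shown as `deep` is one possible alternative, not something the package does.

```javascript
// Defaults copied from the routing block added to buildConfigDefaults.
const hardcodedRouting = {
  strategy: 'static',
  provider: 'auto',
  cascade_quality_gate: true,
  complexity_thresholds: { downgrade_max: 2, upgrade_min: 6 },
};

// Hypothetical user defaults that override a single threshold.
const userRouting = { strategy: 'complexity', complexity_thresholds: { upgrade_min: 8 } };

// Same one-level spread as the diff: the nested object is replaced, not merged.
const shallow = { ...hardcodedRouting, ...userRouting };

// One possible alternative (not in the package): merge the nested object too.
const deep = {
  ...shallow,
  complexity_thresholds: {
    ...hardcodedRouting.complexity_thresholds,
    ...(userRouting.complexity_thresholds || {}),
  },
};

console.log(shallow.complexity_thresholds); // { upgrade_min: 8 }
console.log(deep.complexity_thresholds);    // { downgrade_max: 2, upgrade_min: 8 }
```

If a thresholds object without `downgrade_max` reaches `resolveComplexityTier`, the check `score <= thresholds.downgrade_max` compares against `undefined` and is always false, so the downgrade branch would never fire. Whether that path is reachable depends on how the written config is later loaded, which this diff does not fully show.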

package/pan-wizard-core/bin/lib/core.cjs CHANGED

@@ -25,21 +25,41 @@ const {
   MILESTONE_VERSION_RE,
 } = require('./constants.cjs');
+// ─── Multi-Model Routing ─────────────────────────────────────────────────────
+/**
+ * Provider-specific model name mapping for each tier alias.
+ * Each provider maps reasoning/mid/fast to its native model identifiers.
+ * "inherit" means the host runtime uses its own top-tier model selection.
+ */
+const PROVIDER_MODELS = {
+  anthropic: { reasoning: 'inherit', mid: 'sonnet', fast: 'haiku' },
+  openai:    { reasoning: 'inherit', mid: 'mid',    fast: 'fast'  },
+  google:    { reasoning: 'inherit', mid: 'mid',    fast: 'fast'  },
+  default:   { reasoning: 'inherit', mid: 'sonnet', fast: 'haiku' },
+};
+/** Maps legacy Anthropic model names to provider-agnostic tier aliases. */
+const LEGACY_ALIASES = { opus: 'reasoning', sonnet: 'mid', haiku: 'fast' };
+/** Relative cost multipliers per tier (fast = 1× baseline). */
+const COST_MULTIPLIERS = { reasoning: 15, mid: 3, fast: 1 };
 // ─── Model Profile Table ─────────────────────────────────────────────────────
 const MODEL_PROFILES = {
-  'pan-planner':              { quality: 'opus', balanced: 'opus',   budget: 'sonnet' },
-  'pan-roadmapper':           { quality: 'opus', balanced: 'sonnet', budget: 'sonnet' },
-  'pan-executor':             { quality: 'opus', balanced: 'sonnet', budget: 'sonnet' },
-  'pan-phase-researcher':     { quality: 'opus', balanced: 'sonnet', budget: 'haiku' },
-  'pan-project-researcher':   { quality: 'opus', balanced: 'sonnet', budget: 'haiku' },
-  'pan-research-synthesizer': { quality: 'opus', balanced: 'sonnet', budget: 'haiku' },
-  'pan-debugger':             { quality: 'opus', balanced: 'sonnet', budget: 'sonnet' },
-  'pan-document_code':        { quality: 'opus', balanced: 'haiku', budget: 'haiku' },
-  'pan-verifier':             { quality: 'opus', balanced: 'sonnet', budget: 'haiku' },
-  'pan-plan-checker':         { quality: 'opus', balanced: 'sonnet', budget: 'haiku' },
-  'pan-integration-checker':  { quality: 'opus', balanced: 'sonnet', budget: 'haiku' },
-  'pan-reviewer':             { quality: 'opus', balanced: 'haiku',  budget: 'haiku' },
+  'pan-planner':              { quality: 'reasoning', balanced: 'reasoning', budget: 'mid' },
+  'pan-roadmapper':           { quality: 'reasoning', balanced: 'mid',      budget: 'mid' },
+  'pan-executor':             { quality: 'reasoning', balanced: 'mid',      budget: 'mid' },
+  'pan-phase-researcher':     { quality: 'reasoning', balanced: 'mid',      budget: 'fast' },
+  'pan-project-researcher':   { quality: 'reasoning', balanced: 'mid',      budget: 'fast' },
+  'pan-research-synthesizer': { quality: 'reasoning', balanced: 'mid',      budget: 'fast' },
+  'pan-debugger':             { quality: 'reasoning', balanced: 'mid',      budget: 'mid' },
+  'pan-document_code':        { quality: 'reasoning', balanced: 'fast',     budget: 'fast' },
+  'pan-verifier':             { quality: 'reasoning', balanced: 'mid',      budget: 'fast' },
+  'pan-plan-checker':         { quality: 'reasoning', balanced: 'mid',      budget: 'fast' },
+  'pan-integration-checker':  { quality: 'reasoning', balanced: 'mid',      budget: 'fast' },
+  'pan-reviewer':             { quality: 'reasoning', balanced: 'fast',     budget: 'fast' },
 };
 // ─── Output helpers ───────────────────────────────────────────────────────────
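Review note: the two lookup tables above drive `resolveTierToModel`, which appears later in this diff. The standalone sketch below copies the tables and the function body to show how legacy names, unknown tiers, and unknown providers resolve. The input values in the examples are chosen for illustration.

```javascript
// Tables and function body copied from the core.cjs changes in 2.9.0.
const PROVIDER_MODELS = {
  anthropic: { reasoning: 'inherit', mid: 'sonnet', fast: 'haiku' },
  openai:    { reasoning: 'inherit', mid: 'mid',    fast: 'fast'  },
  google:    { reasoning: 'inherit', mid: 'mid',    fast: 'fast'  },
  default:   { reasoning: 'inherit', mid: 'sonnet', fast: 'haiku' },
};
const LEGACY_ALIASES = { opus: 'reasoning', sonnet: 'mid', haiku: 'fast' };

function resolveTierToModel(tier, provider) {
  const normalizedTier = LEGACY_ALIASES[tier] || tier;
  const providerMap = PROVIDER_MODELS[provider] || PROVIDER_MODELS['default'];
  return providerMap[normalizedTier] || providerMap['mid'];
}

// Legacy Anthropic names are translated to tiers first.
console.log(resolveTierToModel('opus', 'anthropic'));   // inherit
console.log(resolveTierToModel('sonnet', 'openai'));    // mid
console.log(resolveTierToModel('haiku', 'google'));     // fast
// Tier aliases map straight to the provider's entry.
console.log(resolveTierToModel('fast', 'anthropic'));   // haiku
// Unknown tiers fall back to the provider's mid entry.
console.log(resolveTierToModel('custom-x', 'anthropic')); // sonnet
// Unknown providers fall back to the default map.
console.log(resolveTierToModel('mid', 'elsewhere'));    // sonnet
```

Two things stand out. For `openai` and `google` the `mid` and `fast` entries resolve to the literal strings `mid` and `fast`, so the caller still has to turn those into concrete model identifiers. And because `resolveModelInternal` now routes `model_overrides` values through this function, an override that is neither a tier alias nor a legacy name resolves to the provider's `mid` entry, whereas the 2.8.1 code returned any non-`opus` override string unchanged.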
@@ -179,6 +199,8 @@ function loadConfig(cwd) {
       commit: parsed.commit || { safety_checks: true, conventional_types: true, sensitive_patterns: ['\\.env$', '\\.pem$', '\\.key$', 'credentials', 'secret', 'password', 'token'] },
       execution: parsed.execution || { default_mode: 'wave_order', rollback_snapshots: true, error_pattern_learning: true },
       focus: parsed.focus || { auto_commit: true },
+      model_overrides: parsed.model_overrides || {},
+      routing: parsed.routing || { strategy: 'static', provider: 'auto' },
     };
   } catch { // Config missing or malformed — use defaults
     return {
@@ -187,6 +209,8 @@ function loadConfig(cwd) {
       commit: { safety_checks: true, conventional_types: true, sensitive_patterns: ['\\.env$', '\\.pem$', '\\.key$', 'credentials', 'secret', 'password', 'token'] },
       execution: { default_mode: 'wave_order', rollback_snapshots: true, error_pattern_learning: true },
       focus: { auto_commit: true },
+      model_overrides: {},
+      routing: { strategy: 'static', provider: 'auto' },
     };
   }
 }
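Review note: the `routing.provider: 'auto'` default added to `loadConfig` feeds `detectProvider`, which appears later in this diff and checks explicit config, then the `PAN_PROVIDER` environment variable, then runtime directories. The sketch below copies that logic and exercises each step against a temporary directory. The only deviation is the third `env` parameter, added here so the example does not depend on the real process environment. The provider table is reduced to its keys because only key presence matters.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Keys only; detectProvider checks presence, not the mapped model names.
const PROVIDER_MODELS = { anthropic: {}, openai: {}, google: {}, default: {} };

// Logic copied from the diff. `env` is an assumption added for testability;
// the package reads process.env.PAN_PROVIDER directly.
function detectProvider(cwd, config, env = process.env) {
  if (config.routing?.provider && config.routing.provider !== 'auto') {
    const p = config.routing.provider;
    return PROVIDER_MODELS[p] ? p : 'default';
  }
  const envProvider = env.PAN_PROVIDER;
  if (envProvider) {
    return PROVIDER_MODELS[envProvider] ? envProvider : 'default';
  }
  const checks = [
    ['.claude', 'anthropic'], ['.codex', 'openai'],
    ['.gemini', 'google'], ['.opencode', 'openai'], ['.github', 'default'],
  ];
  for (const [dir, provider] of checks) {
    try { if (fs.statSync(path.join(cwd, dir)).isDirectory()) return provider; }
    catch { /* directory missing, try the next one */ }
  }
  return 'default';
}

// Temporary project containing two runtime directories.
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'pan-demo-'));
fs.mkdirSync(path.join(tmp, '.gemini'));
fs.mkdirSync(path.join(tmp, '.claude'));

const auto = { routing: { provider: 'auto' } };
const fromDirs = detectProvider(tmp, auto, {});
const fromEnv = detectProvider(tmp, auto, { PAN_PROVIDER: 'openai' });
const fromConfig = detectProvider(tmp, { routing: { provider: 'google' } }, { PAN_PROVIDER: 'openai' });
const unknownConfig = detectProvider(tmp, { routing: { provider: 'mistral' } }, { PAN_PROVIDER: 'openai' });
fs.rmSync(tmp, { recursive: true, force: true });

console.log({ fromDirs, fromEnv, fromConfig, unknownConfig });
// { fromDirs: 'anthropic', fromEnv: 'openai', fromConfig: 'google', unknownConfig: 'default' }
```

When several runtime directories coexist, the first match in the `checks` array wins, so `.claude` takes precedence over `.gemini`. An unrecognized explicit provider returns `default` immediately and does not fall through to the environment variable or directory detection.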
@@ -485,27 +509,142 @@ function getRoadmapPhaseInternal(cwd, phaseNum) {
 }
 /**
- * Resolve the model for a given agent type based on profile and overrides.
- * Returns "inherit" for opus-tier to let Claude Code use its configured opus version.
+ * Extract a model tier override from a roadmap phase section.
+ * Looks for `<!-- model_tier: <tier> -->` in the phase section text.
+ * @param {string} cwd - Project root directory
+ * @param {string|number} phaseNum - Phase number to look up
+ * @returns {string|null} Tier alias if found, null otherwise
+ */
+function getPhaseModelTier(cwd, phaseNum) {
+  const phaseData = getRoadmapPhaseInternal(cwd, phaseNum);
+  if (!phaseData?.section) return null;
+  const match = phaseData.section.match(/<!--\s*model_tier:\s*(\S+)\s*-->/i);
+  return match ? match[1] : null;
+}
+/**
+ * Resolve the model for a given agent type based on profile, provider, and routing strategy.
+ * Returns "inherit" for reasoning-tier to let the host runtime use its top-tier model.
  * @param {string} cwd - Project root directory
  * @param {string} agentType - Agent name (e.g., "pan-planner", "pan-executor")
- * @returns {string} Model identifier: "inherit" (opus), "sonnet", or "haiku"
+ * @param {Object} [taskMetadata] - Optional metadata for complexity routing
+ * @returns {string} Model identifier: "inherit", "sonnet", "haiku", "mid", "fast", etc.
  */
-function resolveModelInternal(cwd, agentType) {
+function resolveModelInternal(cwd, agentType, taskMetadata) {
   const config = loadConfig(cwd);
+  const provider = detectProvider(cwd, config);
-  // Check per-agent override first
+  // Check per-agent override first (highest priority)
   const override = config.model_overrides?.[agentType];
   if (override) {
-    return override === 'opus' ? 'inherit' : override;
+    return resolveTierToModel(override, provider);
+  }
+  // Check per-phase override from roadmap (second priority)
+  if (taskMetadata?.phaseNum) {
+    const phaseTier = getPhaseModelTier(cwd, taskMetadata.phaseNum);
+    if (phaseTier) {
+      return resolveTierToModel(phaseTier, provider);
+    }
   }
   // Fall back to profile lookup
   const profile = config.model_profile || 'balanced';
   const agentModels = MODEL_PROFILES[agentType];
-  if (!agentModels) return 'sonnet';
-  const resolved = agentModels[profile] || agentModels['balanced'] || 'sonnet';
-  return resolved === 'opus' ? 'inherit' : resolved;
+  if (!agentModels) return resolveTierToModel('mid', provider);
+  let tier = agentModels[profile] || agentModels['balanced'] || 'mid';
+  // Apply routing strategy
+  const strategy = config.routing?.strategy || 'static';
+  if (strategy === 'complexity' && taskMetadata) {
+    const thresholds = config.routing?.complexity_thresholds;
+    tier = resolveComplexityTier(tier, { ...taskMetadata, thresholds });
+  }
+  return resolveTierToModel(tier, provider);
+}
+/**
+ * Detect the LLM provider from config, environment, or runtime directory presence.
+ * @param {string} cwd - Project root directory
+ * @param {Object} config - Loaded config object
+ * @returns {string} Provider name: "anthropic", "openai", "google", or "default"
+ */
+function detectProvider(cwd, config) {
+  // 1. Explicit config
+  if (config.routing?.provider && config.routing.provider !== 'auto') {
+    const p = config.routing.provider;
+    return PROVIDER_MODELS[p] ? p : 'default';
+  }
+  // 2. Environment variable
+  const envProvider = process.env.PAN_PROVIDER;
+  if (envProvider) {
+    return PROVIDER_MODELS[envProvider] ? envProvider : 'default';
+  }
+  // 3. Runtime directory detection
+  const checks = [
+    ['.claude', 'anthropic'], ['.codex', 'openai'],
+    ['.gemini', 'google'], ['.opencode', 'openai'], ['.github', 'default'],
+  ];
+  for (const [dir, provider] of checks) {
+    try { if (fs.statSync(path.join(cwd, dir)).isDirectory()) return provider; }
+    catch { /* continue */ }
+  }
+  return 'default';
+}
+/**
+ * Resolve a tier alias (or legacy model name) to a provider-specific model name.
+ * @param {string} tier - Tier alias ("reasoning", "mid", "fast") or legacy name ("opus", "sonnet", "haiku")
+ * @param {string} provider - Provider key from detectProvider()
+ * @returns {string} Provider-specific model name
+ */
+function resolveTierToModel(tier, provider) {
+  const normalizedTier = LEGACY_ALIASES[tier] || tier;
+  const providerMap = PROVIDER_MODELS[provider] || PROVIDER_MODELS['default'];
+  return providerMap[normalizedTier] || providerMap['mid'];
+}
+/**
+ * Adjust model tier based on task complexity metadata.
+ * @param {string} baseTier - Starting tier ("reasoning", "mid", "fast")
+ * @param {Object} [taskMetadata] - Complexity indicators
+ * @returns {string} Adjusted tier
+ */
+function resolveComplexityTier(baseTier, taskMetadata) {
+  if (!taskMetadata) return baseTier;
+  const { fileCount = 0, waveCount = 0, requirementCount = 0, isArchitectural = false } = taskMetadata;
+  const score =
+    (fileCount > 15 ? 2 : fileCount > 5 ? 1 : 0) +
+    (waveCount > 3 ? 2 : waveCount > 1 ? 1 : 0) +
+    (requirementCount > 5 ? 2 : requirementCount > 2 ? 1 : 0) +
+    (isArchitectural ? 3 : 0);
+  const thresholds = taskMetadata.thresholds || { downgrade_max: 2, upgrade_min: 6 };
+  const tiers = ['fast', 'mid', 'reasoning'];
+  const idx = tiers.indexOf(baseTier);
+  if (idx === -1) return baseTier;
+  if (score <= thresholds.downgrade_max && idx > 0) return tiers[idx - 1];
+  if (score >= thresholds.upgrade_min && idx < 2) return tiers[idx + 1];
+  return baseTier;
+}
+/**
+ * Estimate relative cost multiplier for a given profile.
+ * @param {string} profile - "quality", "balanced", or "budget"
+ * @returns {Object} Cost estimation with total, average, agentCount
+ */
+function estimateCostMultiplier(profile) {
+  let total = 0;
+  const agents = Object.keys(MODEL_PROFILES);
+  for (const agent of agents) {
+    const tier = MODEL_PROFILES[agent][profile] || 'mid';
+    total += COST_MULTIPLIERS[tier] || 3;
+  }
+  return { profile, total, average: +(total / agents.length).toFixed(1), agentCount: agents.length };
 }
 // ─── Misc utilities ───────────────────────────────────────────────────────────
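Review note: `resolveComplexityTier` in the hunk above scores a task from four indicators and then moves the base tier at most one step. The sketch below copies the function body and runs it on a few hypothetical metadata objects to show where the thresholds bite.

```javascript
// Function body copied from the core.cjs changes in 2.9.0.
function resolveComplexityTier(baseTier, taskMetadata) {
  if (!taskMetadata) return baseTier;
  const { fileCount = 0, waveCount = 0, requirementCount = 0, isArchitectural = false } = taskMetadata;
  const score =
    (fileCount > 15 ? 2 : fileCount > 5 ? 1 : 0) +
    (waveCount > 3 ? 2 : waveCount > 1 ? 1 : 0) +
    (requirementCount > 5 ? 2 : requirementCount > 2 ? 1 : 0) +
    (isArchitectural ? 3 : 0);
  const thresholds = taskMetadata.thresholds || { downgrade_max: 2, upgrade_min: 6 };
  const tiers = ['fast', 'mid', 'reasoning'];
  const idx = tiers.indexOf(baseTier);
  if (idx === -1) return baseTier;
  if (score <= thresholds.downgrade_max && idx > 0) return tiers[idx - 1];
  if (score >= thresholds.upgrade_min && idx < 2) return tiers[idx + 1];
  return baseTier;
}

// Hypothetical task metadata for illustration.
const big = { fileCount: 20, waveCount: 4, requirementCount: 3, isArchitectural: true }; // 2+2+1+3 = 8
const medium = { fileCount: 10, waveCount: 2, requirementCount: 3 };                     // 1+1+1   = 3
const small = { fileCount: 3, waveCount: 1, requirementCount: 2 };                       // 0
const phaseOnly = { phaseNum: 4 };                                                       // 0

console.log(resolveComplexityTier('mid', big));             // reasoning (score >= 6)
console.log(resolveComplexityTier('mid', medium));          // mid       (3 is between thresholds)
console.log(resolveComplexityTier('mid', small));           // fast      (score <= 2)
console.log(resolveComplexityTier('reasoning', phaseOnly)); // mid       (no indicators scores 0)
console.log(resolveComplexityTier('fast', small));          // fast      (already the lowest tier)
console.log(resolveComplexityTier('opus', big));            // opus      (unknown tier passes through)
```

Because every indicator defaults to zero, metadata that carries no complexity fields at all scores 0 and is treated as a simple task. Under the `complexity` strategy, a caller that passes only `{ phaseNum }` therefore gets a one-step downgrade from the profile's tier.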
@@ -625,6 +764,9 @@ function scanSourceTodos(cwd) {
 module.exports = {
   MODEL_PROFILES,
+  PROVIDER_MODELS,
+  LEGACY_ALIASES,
+  COST_MULTIPLIERS,
   output,
   error,
   verbose,
@@ -641,6 +783,11 @@ module.exports = {
   getArchivedPhaseDirs,
   getRoadmapPhaseInternal,
   resolveModelInternal,
+  detectProvider,
+  resolveTierToModel,
+  resolveComplexityTier,
+  estimateCostMultiplier,
+  getPhaseModelTier,
   pathExistsInternal,
   generateSlugInternal,
   getMilestoneInfo,
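Review note: the newly exported `getPhaseModelTier` reads a per-phase override from an HTML comment in the roadmap section. The sketch below isolates the regular expression from the diff. `extractModelTier` is a hypothetical helper that takes the section text directly instead of loading it through `getRoadmapPhaseInternal`, and the roadmap excerpt is invented for illustration.

```javascript
// Same pattern as getPhaseModelTier in the diff.
const MODEL_TIER_RE = /<!--\s*model_tier:\s*(\S+)\s*-->/i;

// Hypothetical helper: the real function first loads the section via
// getRoadmapPhaseInternal(cwd, phaseNum); here the text is passed in directly.
function extractModelTier(sectionText) {
  if (!sectionText) return null;
  const match = sectionText.match(MODEL_TIER_RE);
  return match ? match[1] : null;
}

// Invented roadmap excerpt for illustration.
const section = [
  '### Phase 3: Authentication',
  '<!-- model_tier: reasoning -->',
  '**Goal:** Add login flow',
].join('\n');

console.log(extractModelTier(section));                     // reasoning
console.log(extractModelTier('<!--model_tier:fast-->'));    // fast
console.log(extractModelTier('<!-- MODEL_TIER: Fast -->')); // Fast (case preserved)
console.log(extractModelTier('No marker here'));            // null
```

The `i` flag makes the `model_tier` key case-insensitive, but the captured value keeps its original case. Since `resolveTierToModel` looks tiers up by exact key, a marker written as `Fast` would not match `fast` and would resolve to the provider's `mid` entry instead.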