npm - agentic-sdlc-wizard - Versions diffs - 1.37.1 → 1.38.0 - Mend

agentic-sdlc-wizard 1.37.1 → 1.38.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11) hide show

package/.claude-plugin/marketplace.json +1 -1
package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +12 -0
package/CLAUDE_CODE_SDLC_WIZARD.md +73 -2
package/cli/bin/sdlc-wizard.js +17 -4
package/cli/lib/repo-complexity.js +167 -0
package/hooks/sdlc-prompt-check.sh +12 -0
package/package.json +1 -1
package/skills/sdlc/SKILL.md +2 -0
package/skills/setup/SKILL.md +33 -11
package/skills/update/SKILL.md +6 -3

package/.claude-plugin/marketplace.json CHANGED Viewed

@@ -13,7 +13,7 @@
       "name": "sdlc-wizard",
       "source": ".",
       "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
-      "version": "1.37.1",
+      "version": "1.38.0",
       "author": {
         "name": "Stefan Ayala"
       },

package/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "sdlc-wizard",
-  "version": "1.37.1",
+  "version": "1.38.0",
   "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
   "author": {
     "name": "Stefan Ayala",

package/CHANGELOG.md CHANGED Viewed

@@ -4,6 +4,18 @@ All notable changes to the SDLC Wizard.
 > **Note:** This changelog is for humans to read. Don't manually apply these changes - just run the wizard ("Check for SDLC wizard updates") and it handles everything automatically.
+## [1.38.0] - 2026-04-24
+### Added
+- **Prompt-hook-fires-once instrumentation** — ROADMAP #224. `hooks/sdlc-prompt-check.sh` now records one tab-separated record (`<ts>\t<pid>\tsdlc-prompt-check`) per post-dedupe invocation when the opt-in env var `SDLC_HOOK_FIRE_LOG` is set. Maintainer can count lines per user prompt to verify CC 2.1.118's double-fire fix in real sessions; >1 line per prompt indicates regression. Unwritable paths fail silently. Procedure documented in `CLAUDE_CODE_SDLC_WIZARD.md` → "Verifying Prompt-Hook-Fires-Once". 6 regression tests in `tests/test-prompt-hook-fires-once.sh` cover the instrumentation contract (counter increments, opt-in semantics, log shape, output stability, error tolerance).
+- **Mixed-mode tier (Sonnet 4.6 coder + Opus 4.7 reviewer)** — ROADMAP #233. New `cli/lib/repo-complexity.js` heuristic classifies repos as `simple` or `complex` from filesystem signals (LOC, test count, hook count, workflow count, plus stakes flag for `.env` / `secrets/` / `credentials/`). Setup skill Step 9.5 expanded from binary y/N into a 3-way prompt:
+  - **`[N]`** No pin (default, recommended for most repos) — preserves Claude Code auto-mode
+  - **`[m]`** Mixed-mode pin `model: "sonnet[1m]"` — suggested for `simple` tier; coder runs on Sonnet, cross-model reviewer always stays at flagship (Opus 4.7 / gpt-5.5 xhigh)
+  - **`[f]`** Flagship pin `model: "opus[1m]"` + `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30` — suggested for `complex` / stakes-flagged tier; current pre-#233 default
+  Stakes flag (`.env` / `secrets/` / `credentials/`) forces `complex` regardless of size and detects at any depth (e.g. `config/.env`, `app/secrets/`); the coder is doing security-relevant work and the saving isn't worth the risk. Heuristic outputs are advisory — the user always picks the final tier. New `tests/test-repo-complexity.sh` (11 tests) with six fixture repos (`tests/fixtures/complexity/{simple,complex,stakes,nested-stakes,boundary-simple,boundary-complex}-repo`) covers tier classification, nested stakes detection, threshold-boundary cases (29 tests = simple, 30 = complex), JSON shape, missing-dir error path, and the `npx agentic-sdlc-wizard complexity` CLI subcommand. Cross-model review section in `skills/sdlc/SKILL.md` explicitly notes the reviewer **always** runs at flagship regardless of coder pin — weakening the review leg defeats the savings. Update skill Step 7.5 recognizes `sonnet[1m]` as a valid mixed-mode pin (no migration prompt). Wizard doc gets a new "Mixed-Mode Tier" subsection documenting the split, when to use each tier, the prove-it gate (pair-test on 3+ simple repos before recommending mixed-mode as default), and tradeoffs. **Reconciles with #198:** mixed-mode is opt-in per-project via Step 9.5; no-pin remains the default.
 ## [1.37.1] - 2026-04-24
 ### Fixed

package/CLAUDE_CODE_SDLC_WIZARD.md CHANGED Viewed

@@ -940,6 +940,77 @@ Claude Code supports both 200K and 1M context windows. **`opus[1m]` is an opt-in
 **Autocompact pairing (important):** If you opt into `opus[1m]`, also set `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30` — otherwise CC's default autocompact fires at ~76K and destroys the headroom you're paying for. Step 9.5 writes both together when you opt in.
+### Mixed-Mode Tier (Sonnet coder + Opus reviewer, roadmap #233)
+For trivial / blank / config-only / CRUD-style repos, full Opus 4.7 on every turn is overkill on the coder leg. The **mixed-mode tier** pins `model: "sonnet[1m]"` for in-session work while keeping the cross-model review layer (Codex / external reviewer) at the flagship — so the reviewer still catches what Sonnet missed.
+**The split:**
+| Layer | Mixed-mode tier | Flagship tier |
+|-------|----------------|---------------|
+| Coder (in-session CC) | `model: "sonnet[1m]"` | `model: "opus[1m]"` |
+| Cross-model reviewer (Codex etc.) | gpt-5.5 xhigh (or Opus 4.7 max via Bash) | gpt-5.5 xhigh (or Opus 4.7 max via Bash) |
+| Effort floor (CC session) | xhigh; max preferred | xhigh; max preferred |
+The reviewer always stays at flagship — the whole point of mixed-mode is that adversarial review catches Sonnet's blind spots, so weakening the review leg defeats the savings.
+**When mixed-mode is the right call:**
+- Repo is small (LOC < 10K), few tests (< 30), few hooks (< 5), few workflows (< 5), no `.env` / secrets handling
+- You're on API billing (not Max subscription) and 2× cost on simple repos actually matters
+- Tasks are predominantly mechanical — typo fixes, config tweaks, small CRUD endpoints
+- You're running the SDLC Wizard's setup flow against a sibling repo where the coder doesn't need flagship reasoning
+**When to stay flagship:**
+- Stakes-flagged repo: anywhere `.env` / `secrets/` / `credentials/` exists. Force flagship even if LOC is tiny — leaks are catastrophic
+- Architecture work, debugging non-obvious bugs, security review, anything where the *coder's* judgment matters as much as the reviewer's
+- Long shepherd sessions (plan → TDD → review → CI loop) — they cross 100K tokens regularly and Opus 4.7 fits the window better in a single thread
+**Auto-detection:** the setup wizard runs `cli/lib/repo-complexity.js` against the target repo and suggests the tier. Stakes flag (`.env` / `secrets/`) forces complex regardless of size. The user always picks the final answer — the heuristic is a hint, not a gate.
+**How to opt in (manual):**
+```json
+{
+  "model": "sonnet[1m]"
+}
+```
+Don't add `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` — Sonnet's 1M window has different compaction characteristics than Opus's; let upstream defaults ride until we benchmark.
+**Prove-It Gate (#233 acceptance criterion):** mixed-mode ships only if pair-tested on 3+ simple repos shows Sonnet-coder + Opus-reviewer produces ≥ same SDLC scores as full-Opus baseline. The first version of the heuristic ships v1.38.0; pair-test results land in CHANGELOG before recommending mixed-mode as the default for any tier.
+**Tradeoffs (be honest):**
+- Sonnet 4.6 will drop some fine-grained self-review moves (it's fast, less deliberate). The Opus reviewer catches them — but you'll see more "fix in round 2" cycles compared to Opus-coder runs.
+- Mixed-mode disables auto-mode (same as flagship pin). The Sonnet pin is per-session — to switch back, remove the `model` line.
+### Verifying Prompt-Hook-Fires-Once (roadmap #224)
+CC 2.1.118 shipped a fix for `prompt` hooks double-firing when an agent-hook verifier subagent itself made tool calls. The bug would manifest as duplicate `SDLC BASELINE` injections per `UserPromptSubmit` — context bloat plus possible confusion. The dual-channel (project + plugin) double-print is already handled by `dedupe_plugin_or_project` in v1.37.1; this section is the runtime check for the *CC-internal* double-fire case.
+`hooks/sdlc-prompt-check.sh` ships an opt-in instrumentation: when the env var `SDLC_HOOK_FIRE_LOG` is set, every post-dedupe invocation appends one tab-separated record (`<unix-ts>\t<pid>\tsdlc-prompt-check`) to that log. Counting lines per prompt tells you whether CC fired the hook once or twice.
+**Maintainer procedure (real session):**
+```bash
+# 1. Pick a fresh log path
+export SDLC_HOOK_FIRE_LOG="$(mktemp /tmp/sdlc-fire-log.XXXXXX)"
+# 2. Restart Claude Code so the env propagates into spawned hooks
+#    (or set it in your shell rc / .envrc and start a fresh session)
+# 3. Run a normal SDLC session — including any task that triggers a verifier
+#    subagent (e.g., /code-review, /sdlc with multi-step planning)
+# 4. After N user prompts, count log lines:
+wc -l "$SDLC_HOOK_FIRE_LOG"
+#    Expect: N lines. >N indicates the CC double-fire bug regressed.
+# 5. Optional: tail the log live in another terminal to watch each fire:
+tail -f "$SDLC_HOOK_FIRE_LOG"
+```
+The instrumentation is opt-in — when the env var is unset, no log is written and no overhead is added. Unwritable log paths fail silently so a bad `SDLC_HOOK_FIRE_LOG` value never crashes the hook.
+**Regression test:** `tests/test-prompt-hook-fires-once.sh` covers the instrumentation contract (counter increments per invocation, opt-in semantics, log line shape, output stability, unwritable-path tolerance). It does *not* spawn Claude Code — that's a maintainer-runtime check by design. The test asserts the recording mechanism works so the maintainer's real-session count is trustworthy.
 ---
 ## Example Workflow (End-to-End)
@@ -2715,7 +2786,7 @@ If deployment fails or post-deploy verification catches issues:
 **SDLC.md:**
 ```markdown
-<!-- SDLC Wizard Version: 1.37.1 -->
+<!-- SDLC Wizard Version: 1.38.0 -->
 <!-- Setup Date: [DATE] -->
 <!-- Completed Steps: step-0.1, step-0.2, step-0.4, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
 <!-- Git Workflow: [PRs or Solo] -->
@@ -3777,7 +3848,7 @@ Walk through updates? (y/n)
 Store wizard state in `SDLC.md` as metadata comments (invisible to readers, parseable by Claude):
 ```markdown
-<!-- SDLC Wizard Version: 1.37.1 -->
+<!-- SDLC Wizard Version: 1.38.0 -->
 <!-- Setup Date: 2026-01-24 -->
 <!-- Completed Steps: step-0.1, step-0.2, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
 <!-- Git Workflow: PRs -->

package/cli/bin/sdlc-wizard.js CHANGED Viewed

@@ -3,6 +3,7 @@
 const { version } = require('../../package.json');
 const { init, check } = require('../init');
+const { detectComplexity } = require('../lib/repo-complexity');
 const args = process.argv.slice(2);
@@ -12,7 +13,8 @@ const flags = {
   json: args.includes('--json'),
 };
-const command = args.find((a) => !a.startsWith('--'));
+const positional = args.filter((a) => !a.startsWith('--'));
+const command = positional[0];
 if (args.includes('--version') || args.includes('-v')) {
   console.log(version);
@@ -24,13 +26,14 @@ if (args.includes('--help') || args.includes('-h') || !command) {
   agentic-sdlc-wizard v${version}
   Usage:
-    sdlc-wizard init [options]    Install SDLC wizard into current directory
-    sdlc-wizard check [options]   Check installation health and updates
+    sdlc-wizard init [options]               Install SDLC wizard into current directory
+    sdlc-wizard check [options]              Check installation health and updates
+    sdlc-wizard complexity [path]            Print mixed-mode tier heuristic (roadmap #233)
   Options:
     --force       Overwrite existing files (init only)
     --dry-run     Preview changes without writing (init only)
-    --json        Output as JSON (check only)
+    --json        Output as JSON (check / complexity)
     --version     Show version
     --help        Show this help
   `.trim());
@@ -55,6 +58,16 @@ if (command === 'init') {
     console.error(`Error: ${err.message}`);
     process.exit(1);
   }
+} else if (command === 'complexity') {
+  try {
+    const target = positional[1] || process.cwd();
+    const result = detectComplexity(target);
+    process.stdout.write(JSON.stringify(result, null, 2) + '\n');
+    process.exit(0);
+  } catch (err) {
+    console.error(`Error: ${err.message}`);
+    process.exit(2);
+  }
 } else {
   console.error(`Unknown command: ${command}`);
   console.error('Run "sdlc-wizard --help" for usage.');

package/cli/lib/repo-complexity.js ADDED Viewed

@@ -0,0 +1,167 @@
+// Roadmap #233: repo complexity heuristic for mixed-mode tier selection.
+//
+// Output: { tier: 'simple' | 'complex', score: <number>, signals: [...] }
+//   - 'simple' → setup wizard suggests mixed-mode (Sonnet 4.6 coder + Opus 4.7 reviewer)
+//   - 'complex' → setup wizard suggests full flagship (Opus 4.7 everywhere)
+// Cross-model review (Codex / external) always stays at the flagship tier
+// regardless of coder selection — see CLAUDE_CODE_SDLC_WIZARD.md.
+//
+// Classification (matches CLAUDE_CODE_SDLC_WIZARD.md → "Mixed-Mode Tier"):
+//   simple = LOC < 10K AND tests < 30 AND hooks < 5 AND workflows < 5 AND no stakes
+//   complex = ANY high signal OR stakes flag (.env / secrets/ / credentials/ at any depth)
+// `score` is an additive ladder kept for transparency (low=0, mid=1, high=2 per signal).
+//
+// Heuristic is intentionally cheap: a single sync filesystem walk, no parsing.
+// It is a setup-time hint, not a runtime gate; users can override the result.
+const fs = require('fs');
+const path = require('path');
+const STAKES_FILES = new Set(['.env', '.env.local', '.env.production', '.env.development', '.envrc']);
+const STAKES_DIRS = new Set(['secrets', 'credentials', '.secrets', '.credentials']);
+const SKIP_DIRS = new Set([
+  'node_modules', '.git', 'dist', 'build', 'out', 'target',
+  '.next', '.nuxt', 'coverage', '.cache', 'vendor', '__pycache__',
+]);
+const SOURCE_EXTS = new Set([
+  '.js', '.jsx', '.ts', '.tsx', '.mjs', '.cjs',
+  '.py', '.go', '.rs', '.rb', '.java', '.kt', '.swift',
+  '.c', '.h', '.cpp', '.hpp', '.cc',
+  '.sh', '.bash', '.zsh',
+]);
+const TEST_PATTERNS = [/\.test\.[jt]sx?$/, /\.spec\.[jt]sx?$/, /_test\.go$/, /test_.*\.py$/, /.*_test\.py$/];
+function isTestFile(name, parentDir) {
+  if (TEST_PATTERNS.some((p) => p.test(name))) return true;
+  return parentDir === 'tests' || parentDir === 'test' || parentDir === '__tests__' || parentDir === 'spec';
+}
+function walk(rootDir, repoRoot, onFile, onDir, visited) {
+  let entries;
+  try {
+    entries = fs.readdirSync(rootDir, { withFileTypes: true });
+  } catch (_) {
+    return;
+  }
+  for (const entry of entries) {
+    const full = path.join(rootDir, entry.name);
+    if (entry.isSymbolicLink()) continue; // don't follow symlinks (cycle / out-of-tree risk)
+    if (entry.isDirectory()) {
+      if (SKIP_DIRS.has(entry.name)) continue;
+      // Resolve real path for cycle detection.
+      let realPath;
+      try {
+        realPath = fs.realpathSync(full);
+      } catch (_) {
+        continue;
+      }
+      if (visited.has(realPath)) continue;
+      visited.add(realPath);
+      onDir && onDir(full, entry.name);
+      walk(full, repoRoot, onFile, onDir, visited);
+    } else if (entry.isFile()) {
+      onFile(full, entry.name, path.basename(rootDir));
+    }
+  }
+}
+function countLines(filePath) {
+  try {
+    const content = fs.readFileSync(filePath, 'utf8');
+    if (!content) return 0;
+    return content.split('\n').length;
+  } catch (_) {
+    return 0;
+  }
+}
+function detectComplexity(repoPath) {
+  if (!fs.existsSync(repoPath)) {
+    throw new Error(`Repo path does not exist: ${repoPath}`);
+  }
+  const stat = fs.statSync(repoPath);
+  if (!stat.isDirectory()) {
+    throw new Error(`Repo path is not a directory: ${repoPath}`);
+  }
+  const repoRoot = path.resolve(repoPath);
+  let loc = 0;
+  let testFiles = 0;
+  let hookFiles = 0;
+  let workflowFiles = 0;
+  const stakesHits = [];
+  const visited = new Set([fs.realpathSync(repoRoot)]);
+  walk(
+    repoRoot,
+    repoRoot,
+    (filePath, name, parent) => {
+      const ext = path.extname(name).toLowerCase();
+      const rel = path.relative(repoRoot, filePath);
+      if (STAKES_FILES.has(name)) {
+        stakesHits.push(`stakes:file:${rel}`);
+      }
+      if (SOURCE_EXTS.has(ext) || ext === '.yml' || ext === '.yaml') {
+        if (isTestFile(name, parent)) testFiles++;
+        else if (SOURCE_EXTS.has(ext)) loc += countLines(filePath);
+      }
+      if (parent === 'hooks' && filePath.includes(`.claude${path.sep}hooks`) && (ext === '.sh' || ext === '.bash')) {
+        hookFiles++;
+      }
+      if (parent === 'workflows' && filePath.includes(`.github${path.sep}workflows`) && (ext === '.yml' || ext === '.yaml')) {
+        workflowFiles++;
+      }
+    },
+    (dirPath, name) => {
+      if (STAKES_DIRS.has(name)) {
+        const rel = path.relative(repoRoot, dirPath);
+        stakesHits.push(`stakes:dir:${rel || name}/`);
+      }
+    },
+    visited
+  );
+  // Bands match docs: LOC<10K / tests<30 / hooks<5 / workflows<5 = "simple band"
+  const signals = [];
+  let highHits = 0;
+  let score = 0;
+  function band(value, midThreshold, highThreshold, label) {
+    if (value >= highThreshold) {
+      score += 2;
+      highHits++;
+      signals.push(`${label}:${value} (high → +2)`);
+    } else if (value >= midThreshold) {
+      score += 1;
+      signals.push(`${label}:${value} (mid → +1)`);
+    } else {
+      signals.push(`${label}:${value} (low → +0)`);
+    }
+  }
+  band(loc, 1000, 10000, 'loc');
+  band(testFiles, 5, 30, 'tests');
+  band(hookFiles, 3, 5, 'hooks');
+  band(workflowFiles, 2, 5, 'workflows');
+  let tier = highHits > 0 ? 'complex' : 'simple';
+  if (stakesHits.length > 0) {
+    tier = 'complex';
+    signals.push(...stakesHits, 'override:stakes-forces-complex');
+  }
+  return { tier, score, signals };
+}
+module.exports = { detectComplexity };
+if (require.main === module) {
+  const target = process.argv[2] || '.';
+  try {
+    const result = detectComplexity(target);
+    process.stdout.write(JSON.stringify(result, null, 2) + '\n');
+  } catch (err) {
+    process.stderr.write(`error: ${err.message}\n`);
+    process.exit(2);
+  }
+}

package/hooks/sdlc-prompt-check.sh CHANGED Viewed

@@ -12,6 +12,18 @@ source "$HOOK_DIR/_find-sdlc-root.sh"
 # plugin paths even when the script is sourced or invoked via aliases.
 dedupe_plugin_or_project "${BASH_SOURCE[0]}" || exit 0
+# Roadmap #224: opt-in fires-once instrumentation. CC 2.1.118 shipped a fix for
+# prompt hooks double-firing when a verifier subagent itself made tool calls.
+# When SDLC_HOOK_FIRE_LOG is set, append one tab-separated record per real
+# invocation (post-dedupe). Maintainer can compare line count against prompt
+# count to verify the CC fix in real sessions. See CLAUDE_CODE_SDLC_WIZARD.md →
+# "Verifying Prompt-Hook-Fires-Once" for the procedure.
+if [ -n "${SDLC_HOOK_FIRE_LOG:-}" ]; then
+    {
+        printf '%s\t%s\tsdlc-prompt-check\n' "$(date +%s)" "$$" >> "$SDLC_HOOK_FIRE_LOG"
+    } 2>/dev/null || true
+fi
 # CWD walk-up finds nearest SDLC project (#173: silent exit for non-SDLC dirs)
 if find_sdlc_root; then
     PROJECT_DIR="$SDLC_ROOT"

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "agentic-sdlc-wizard",
-  "version": "1.37.1",
+  "version": "1.38.0",
   "description": "SDLC enforcement for Claude Code — hooks, skills, and wizard setup in one command",
   "bin": {
     "sdlc-wizard": "cli/bin/sdlc-wizard.js"

package/skills/sdlc/SKILL.md CHANGED Viewed

@@ -230,6 +230,8 @@ PLANNING -> DOCS -> TDD RED -> TDD GREEN -> Tests Pass -> Self-Review
 **The core insight:** The review PROTOCOL is universal across domains. Only the review INSTRUCTIONS change. Code review is the default template below. For non-code domains (research, persuasion, medical content), adapt the `review_instructions` and `verification_checklist` fields while keeping the same handoff/dialogue/convergence loop.
+**Reviewer always at the flagship tier (roadmap #233):** if the project pins `model: "sonnet[1m]"` (mixed-mode) or any non-flagship coder, the cross-model reviewer **still runs at the flagship**: `codex exec -c 'model_reasoning_effort="xhigh"'` (gpt-5.5) or an Opus 4.7 max equivalent. The whole point of mixed-mode is that adversarial review catches Sonnet's blind spots — weakening the reviewer leg defeats the savings. Don't downscale the review just because the coder is downscaled.
 ### Step 0: Write Preflight Self-Review Doc
 Before submitting to an external reviewer, document what YOU already checked. This is proven to reduce reviewer findings to 0-1 per round (evidence: anticheat repo preflight discipline).

package/skills/setup/SKILL.md CHANGED Viewed

@@ -198,24 +198,45 @@ Write the shape as:
 Present suggestions and let the user confirm.
-### Step 9.5: Context Window Configuration (Opt-In)
+### Step 9.5: Context Window + Mixed-Mode Configuration (Opt-In)
-The CLI ships `cli/templates/settings.json` with **no** `model` or `env` pin by default. This preserves Claude Code's built-in model auto-selection (Sonnet for cheap tasks, Opus for hard ones) and the upstream autocompact threshold. Power users who want guaranteed 1M context can opt in during setup.
+The CLI ships `cli/templates/settings.json` with **no** `model` or `env` pin by default. This preserves Claude Code's built-in model auto-selection (Sonnet for cheap tasks, Opus for hard ones) and the upstream autocompact threshold. Power users can opt into a pin during setup; mixed-mode users (Sonnet coder + Opus reviewer) can pin Sonnet here too.
-**Why this is opt-in (issue #198):** A top-level `"model"` in `settings.json` tells Claude Code "the user has explicitly chosen a model" and disables auto-mode for the session. That is a real tradeoff — the pin is only worth it when you actually need the 1M headroom and want to lock to Opus 4.7.
+**Why this is opt-in (issue #198):** A top-level `"model"` in `settings.json` tells Claude Code "the user has explicitly chosen a model" and disables auto-mode for the session. That is a real tradeoff — pinning is only worth it when you actually need the 1M headroom or you've decided mixed-mode tier-splitting is better than per-turn auto-selection.
+**Run the complexity heuristic first (roadmap #233):**
+```bash
+npx agentic-sdlc-wizard complexity .
+```
+The output is JSON: `{ tier: "simple" | "complex", score, signals }`. Use the result to suggest a default in the prompt below — do NOT override the user's choice. The heuristic flags any `.env` / `secrets/` / `credentials/` at any depth as a stakes signal that forces `complex` regardless of size.
 **Ask the user exactly once in Step 9.5:**
-> Pin the session to `opus[1m]` (Opus 4.7 with 1M context) and set `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30`?
+> Detected repo complexity: **{tier}** ({score}, signals: {loc, tests, hooks, workflows, stakes-flag if any}).
+>
+> How do you want to configure the model for this repo?
 >
-> - **No (default):** Leaves auto-mode enabled. Claude Code picks the model per turn, compaction follows upstream defaults. Recommended for most users.
-> - **Yes:** Long SDLC sessions (plan → TDD → review → CI shepherd on one feature) regularly cross 100K tokens; the 1M window gives headroom and 30% autocompact fires at ~300K. Requires Claude Code v2.1.111+ and comfort with losing model auto-selection.
+> - **[N] No pin (default, recommended for most repos):** Leaves auto-mode enabled. Claude Code picks the model per turn. Compaction follows upstream defaults. Simplest, lowest friction.
+> - **[m] Mixed-mode** *(suggested for **simple** tier — roadmap #233):* Pins `model: "sonnet[1m]"` for the coder (Sonnet 4.6 with 1M context). The cross-model review layer (Codex / external reviewer) **always stays at the flagship** (Opus 4.7 max or gpt-5.5 xhigh) regardless. Saves cost/quota on simple repos; reviewer catches what Sonnet misses. Requires comfort with losing per-turn auto-selection.
+> - **[f] Flagship full** *(suggested for **complex** / stakes-flagged tier):* Pins `model: "opus[1m]"` (Opus 4.7 with 1M context) and sets `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30`. Long SDLC sessions cross 100K tokens regularly; the 1M window gives headroom and 30% autocompact fires at ~300K. Requires Claude Code v2.1.111+.
 >
-> `[y/N]`
+> `[N/m/f]`
+**If the user answers `N` (default):** Make no edits to `.claude/settings.json`. Auto-mode stays on. Done.
+**If the user answers `m` (mixed-mode):** Edit `.claude/settings.json` and add:
+```json
+{
+  "model": "sonnet[1m]"
+}
+```
-**If the user answers No (default):** Make no edits to `.claude/settings.json`. Auto-mode stays on. Done.
+Do NOT add `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` for Sonnet — Sonnet's 1M window has different compaction characteristics than Opus; let the upstream default ride. Tell the user explicitly: "Cross-model reviews still run at the flagship — `codex exec -c 'model_reasoning_effort=\"xhigh\"'` (gpt-5.5) or any future Opus-tier reviewer. Mixed-mode is coder-only."
-**If the user answers Yes:** Edit `.claude/settings.json` and add both fields at the top level:
+**If the user answers `f` (flagship):** Edit `.claude/settings.json` and add both fields at the top level:
 ```json
 {
@@ -226,9 +247,10 @@ The CLI ships `cli/templates/settings.json` with **no** `model` or `env` pin by
 }
 ```
-Mention the escape hatch either way:
+Mention the escape hatch in all three cases:
 - To opt out later: remove the `model` line (and optionally the `env` block) from `.claude/settings.json`, or run `/model` and pick "Default (recommended)".
-- For CI pipelines with short tasks, consider `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60` — compact early to stay fast.
+- To switch tiers later: edit `.claude/settings.json` and replace the `model` value, or re-run `/setup-wizard` Step 9.5.
+- For CI pipelines with short tasks (flagship only), consider `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60` — compact early to stay fast.
 This is project-scoped and shared with the team via git.

package/skills/update/SKILL.md CHANGED Viewed

@@ -46,9 +46,10 @@ Parse all CHANGELOG entries between the user's installed version and the latest.
 ```
 Installed: 1.24.0
-Latest:    1.37.1
+Latest:    1.38.0
 What changed:
+- [1.38.0] Mixed-mode tier (Sonnet 4.6 coder + Opus 4.7 reviewer) for simple repos — ROADMAP #233. New `cli/lib/repo-complexity.js` heuristic + `npx agentic-sdlc-wizard complexity .` CLI command. Setup Step 9.5 expanded from binary y/N to 3-way (no-pin / mixed / flagship). Cross-model review always stays at flagship regardless of coder pin. Reconciles with #198: mixed-mode is opt-in per-project; no-pin remains the default.
 - [1.37.1] Token-bloat fix: dedupe 2× SDLC BASELINE print when both project + plugin register the same hook (~300 tokens doubled per prompt). 5 hooks gain `dedupe_plugin_or_project()` helper. Codex 2-round 100/100.
 - [1.37.0] `monthly-research.yml` workflow deleted (ROADMAP #231 Phase 1) — 0 merged artifacts in 30d while burning $11-23/month; research happens inline now. `model-effort-check.sh` loud WARNING below xhigh (#217) — max preferred, xhigh floor; duplicate effort nudge in `instructions-loaded-check.sh` removed; single source of truth. Both changes Codex-certified.
 - [1.36.1] Repo renamed `agentic-ai-sdlc-wizard` → `claude-sdlc-wizard` (matches sibling pattern; npm package unchanged); `npm pkg fix` metadata cleanup; slug migration across docs/tests/configs
@@ -133,9 +134,11 @@ If the user is upgrading from a pre-#198 version, check their `.claude/settings.
 2. **If only one of the two fields matches** (e.g. `model: "opus[1m]"` but custom autocompact, or vice versa) — treat as intentional customization. Do not prompt.
-3. **If `model` is some other value** (e.g. `"sonnet"`, `"opus"`) — treat as user's explicit choice. Do not touch.
+3. **If `model` is `"sonnet[1m]"` (mixed-mode tier, roadmap #233, v1.38.0+)** — treat as user's explicit mixed-mode choice. Do not prompt; this is the supported mixed-mode pin. Mention in the upgrade summary: "Detected mixed-mode tier (Sonnet coder + flagship reviewer). Cross-model review still uses Opus / gpt-5.5 — see CLAUDE_CODE_SDLC_WIZARD.md → 'Mixed-Mode Tier'."
-4. **If neither field is set** — user is already on the new default. No action.
+4. **If `model` is some other value** (e.g. `"sonnet"`, `"opus"`) — treat as user's explicit choice. Do not touch.
+5. **If neither field is set** — user is already on the new default. No action.
 When removing: edit the file in place, drop the `model` key (and the `env.CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` key if nothing else is in `env`, otherwise leave `env` alone). Never touch other keys the user added.