agentic-sdlc-wizard 1.70.0 → 1.72.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -13,7 +13,7 @@
13
13
  "name": "sdlc-wizard",
14
14
  "source": ".",
15
15
  "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
16
- "version": "1.70.0",
16
+ "version": "1.72.0",
17
17
  "author": {
18
18
  "name": "Stefan Ayala"
19
19
  },
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "sdlc-wizard",
3
- "version": "1.70.0",
3
+ "version": "1.72.0",
4
4
  "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
5
5
  "author": {
6
6
  "name": "Stefan Ayala",
package/CHANGELOG.md CHANGED
@@ -4,6 +4,93 @@ All notable changes to the SDLC Wizard.
4
4
 
5
5
  > **Note:** This changelog is for humans to read. Don't manually apply these changes - just run the wizard ("Check for SDLC wizard updates") and it handles everything automatically.
6
6
 
7
+ ## [1.72.0] - 2026-05-05
8
+
9
+ ### Closes #323: `init --force` no longer silently overwrites CUSTOMIZED files
10
+
11
+ User report 2026-05-05 — `npx agentic-sdlc-wizard check` flags 6 files as CUSTOMIZED then recommends `init --force`, which silently clobbers all 6. Reporter: "the wizard correctly **detected** 6 CUSTOMIZED files and then **recommended a command that would overwrite all 6**." Fully closed in two parts:
12
+
13
+ #### Part 1 — customization-aware recommendation in `check` (PR #325)
14
+
15
+ When `check` finds CUSTOMIZED files alongside an available update, the suggested command now warns and points at `init --dry-run` first instead of silently recommending `init --force`. Pure function `buildUpdateRecommendation(updateInfo, customizedCount)` exported from `cli/init.js`. Backward compat: zero-customized recommendation byte-for-byte unchanged.
16
+
17
+ #### Part 2 — new `init --preserve-customized` flag (PR #328)
18
+
19
+ Composes existing `--force` semantics — when both flags are set:
20
+
21
+ - **CUSTOMIZED** files (sha256 mismatch with template) → action `PRESERVE`, skipped during write, reported in summary footer
22
+ - **MATCH** files → `OVERWRITE` (refresh; effectively no-op since hashes already match)
23
+ - **MISSING** files → `CREATE` (new files added in latest version still get installed)
24
+
25
+ Without `--force`, the flag is a no-op (all existing files SKIP regardless). Updated `check` recommendation now suggests `init --force --preserve-customized` as the "upgrade safely" path when CUSTOMIZED files exist.
26
+
27
+ #### Sample output
28
+
29
+ ```
30
+ PRESERVE .claude/hooks/sdlc-prompt-check.sh
31
+ PRESERVE .claude/skills/sdlc/SKILL.md
32
+ OVERWRITE .claude/hooks/tdd-pretool-check.sh
33
+ CREATE .claude/skills/feedback/SKILL.md
34
+
35
+ PRESERVED 2 customized file(s) — review with `init --dry-run` to see what differs from the latest template.
36
+ ```
37
+
38
+ #### Test coverage
39
+
40
+ 10 new tests in `tests/test-cli.sh` across 3 PRs (78 → 88 green): customization-aware recommendation (2 from #325), null/undefined edge cases (2 from #326 P2 follow-up), preserve-customized core behavior (3 from #328 round 1), no-force-no-op + settings.json MERGE precedence + WIZARD_DOC parity (3 from #328 round-2).
41
+
42
+ #### Deferred
43
+
44
+ Option 3 from #323 (real `update` subcommand with backup directory + per-file diff prompt) — options 1 + 2 close the immediate footgun without a new subcommand and backup-storage convention.
45
+
46
+ #### Files
47
+
48
+ - `cli/init.js` — `buildUpdateRecommendation()` exported; `--preserve-customized` threaded through `init()` → `planOperations()`; `isCustomized()` hash helper; `PRESERVE` action skipped in `executeOperations()`; `printOps()` adds yellow `PRESERVE` color + summary footer
49
+ - `cli/bin/sdlc-wizard.js` — `--preserve-customized` flag + help text
50
+ - `tests/test-cli.sh` — 10 new tests across 3 PRs (78 → 88 green)
51
+ - `package.json`, `.claude-plugin/plugin.json` + `marketplace.json`, `SDLC.md`, `skills/update/SKILL.md`, `CLAUDE_CODE_SDLC_WIZARD.md`, `CHANGELOG.md` (1.71.0 → 1.72.0)
52
+
53
+ ## [1.71.0] - 2026-05-05
54
+
55
+ ### Token-bloat fix: SDLC skill Cross-Model Review section trimmed
56
+
57
+ `skills/sdlc/SKILL.md` Cross-Model Review section condensed from ~70 lines to ~20 lines. Saves ~427 tokens per SDLC skill auto-invoke (4995 → 4568 tokens). The skill auto-loads on virtually every productive `implement/fix/refactor` task, so this is real per-session cost.
58
+
59
+ ### What stayed in SKILL.md
60
+
61
+ - Decision-making: when to run / skip / prerequisites / flagship-tier reviewer rule (#233)
62
+ - 4-step protocol summary (preflight → handoff → reviewer → dialogue loop)
63
+ - Required handoff JSON keys + `pr_number` self-heal opt-in note (#209)
64
+ - Convergence rule (2 rounds sweet spot, 3 max)
65
+ - Release-review verification-checklist additions
66
+ - Sandbox flag for Codex from CC
67
+
68
+ ### What moved to canonical wizard doc only
69
+
70
+ Full JSON example, full codex command example, anti-patterns, multi-reviewer (Claude+Codex+human) workflow, non-code-domain variants. All these live in `CLAUDE_CODE_SDLC_WIZARD.md` → "Cross-Model Review Loop" (194 lines, full canonical protocol). The trimmed SKILL.md ends with an explicit pointer to that section.
71
+
72
+ ### Audit method
73
+
74
+ ROADMAP #236 phase 3. `scripts/audit-session-load.sh` ranked SKILL.md files at the top of the size table:
75
+ - `skills/sdlc/SKILL.md`: 4995 tokens (sat right at 5K threshold)
76
+ - `skills/update/SKILL.md`: 4931 tokens
77
+ - `skills/setup/SKILL.md`: 4490 tokens
78
+
79
+ SDLC skill auto-invokes most often (every implement/fix/refactor task), so it earned the cut. Verified 8 test suites that grep for SKILL.md content (mocking table, TDD prove, Memory Audit Protocol heading, opus[1m], autocompact compound, Deployment Tasks, plus `tests/test-self-update.sh` which asserts cross-model-review-specific content: `### Release Review Focus` heading, `Version parity` focus area, `"mission"`/`"success"`/`"failure"` JSON-quoted schema keys, "verification checklist" pattern, "preflight" mention). Codex round 1 caught 3 missed assertions in `test-self-update.sh`; round 2 fixes restored those constraints in tighter prose without re-bloating.
80
+
81
+ ### Files
82
+
83
+ - `skills/sdlc/SKILL.md` — Cross-Model Review section trimmed
84
+ - `CHANGELOG.md`, `SDLC.md`, `skills/update/SKILL.md` (Latest:), `package.json`, `.claude-plugin/plugin.json` + `marketplace.json`, `CLAUDE_CODE_SDLC_WIZARD.md` (1.70.0 → 1.71.0)
85
+
86
+ ### Combined savings ROADMAP #236 phases 1-3
87
+
88
+ - v1.69.0: ~12K tokens/session (BASELINE block fires once)
89
+ - v1.70.0: ~0.5-1.5K tokens/session (TDD CHECK fires once)
90
+ - v1.71.0: ~573 tokens/session (SDLC skill leaner on auto-invoke)
91
+
92
+ Total on a 50-prompt + 20-Edit + 1 SDLC-skill-invoke session: **~14K tokens/session**.
93
+
7
94
  ## [1.70.0] - 2026-05-05
8
95
 
9
96
  ### Token-bloat fix: TDD CHECK nudge fires once per CC session
@@ -2976,7 +2976,7 @@ If deployment fails or post-deploy verification catches issues:
2976
2976
 
2977
2977
  **SDLC.md:**
2978
2978
  ```markdown
2979
- <!-- SDLC Wizard Version: 1.70.0 -->
2979
+ <!-- SDLC Wizard Version: 1.72.0 -->
2980
2980
  <!-- Setup Date: [DATE] -->
2981
2981
  <!-- Completed Steps: step-0.1, step-0.2, step-0.4, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
2982
2982
  <!-- Git Workflow: [PRs or Solo] -->
@@ -3923,6 +3923,21 @@ CLI-distributed file parity (skills, hooks, settings).
3923
3923
 
3924
3924
  **This complements automated tests, not replaces them.** Tests catch exact version mismatches (e.g., `test_package_version_matches_changelog`). Cross-model review catches semantic issues tests cannot — a section silently dropped, examples using outdated but syntactically valid versions, docs describing features that no longer exist.
3925
3925
 
3926
+ #### Anti-patterns
3927
+
3928
+ - **"Find at least N problems"** — incentivizes false positives. The reviewer will manufacture findings to hit the count.
3929
+ - **"Review this"** — too vague. Always pair with `verification_checklist` items that name file:line evidence to verify.
3930
+ - **1-10 score with no criteria** — every reviewer scores differently. Either define what 1, 5, 10 mean for *this* review, or drop the score and just produce CERTIFIED / NOT CERTIFIED with findings.
3931
+ - **Author reasoning visible to reviewer** — anchoring bias. The reviewer should see code + handoff, not the author's self-assessment of why it's correct.
3932
+
3933
+ #### Multiple reviewers (Claude review + Codex + human)
3934
+
3935
+ Run them in parallel; collect feedback via `gh api repos/OWNER/REPO/pulls/PR/comments` (single source of truth). Respond per-reviewer (different blind spots — don't merge feedback). On conflicts, pick the stronger argument with reasoning, not the louder voice. Cap iterations at 3 per reviewer to avoid infinite loops.
3936
+
3937
+ #### Non-code domains (research, persuasion, medical content)
3938
+
3939
+ Same handoff format. Adapt `review_instructions` (e.g. "verify each cited claim links to a primary source") and `verification_checklist` (specific claim → specific source). Add `"audience"` and `"stakes"` keys to the JSON so the reviewer knows what reading level / risk profile to apply.
3940
+
3926
3941
  ---
3927
3942
 
3928
3943
  ## User Understanding and Periodic Feedback
@@ -4055,7 +4070,7 @@ Walk through updates? (y/n)
4055
4070
  Store wizard state in `SDLC.md` as metadata comments (invisible to readers, parseable by Claude):
4056
4071
 
4057
4072
  ```markdown
4058
- <!-- SDLC Wizard Version: 1.70.0 -->
4073
+ <!-- SDLC Wizard Version: 1.72.0 -->
4059
4074
  <!-- Setup Date: 2026-01-24 -->
4060
4075
  <!-- Completed Steps: step-0.1, step-0.2, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
4061
4076
  <!-- Git Workflow: PRs -->
@@ -11,6 +11,7 @@ const flags = {
11
11
  force: args.includes('--force'),
12
12
  dryRun: args.includes('--dry-run'),
13
13
  json: args.includes('--json'),
14
+ preserveCustomized: args.includes('--preserve-customized'),
14
15
  };
15
16
 
16
17
  const positional = args.filter((a) => !a.startsWith('--'));
@@ -31,11 +32,12 @@ if (args.includes('--help') || args.includes('-h') || !command) {
31
32
  sdlc-wizard complexity [path] Print mixed-mode tier heuristic (roadmap #233)
32
33
 
33
34
  Options:
34
- --force Overwrite existing files (init only)
35
- --dry-run Preview changes without writing (init only)
36
- --json Output as JSON (check / complexity)
37
- --version Show version
38
- --help Show this help
35
+ --force Overwrite existing files (init only)
36
+ --preserve-customized With --force, skip files that have local edits (init only)
37
+ --dry-run Preview changes without writing (init only)
38
+ --json Output as JSON (check / complexity)
39
+ --version Show version
40
+ --help Show this help
39
41
  `.trim());
40
42
  process.exit(0);
41
43
  }
package/cli/init.js CHANGED
@@ -130,9 +130,23 @@ function mergeSettings(existingPath, templatePath, force) {
130
130
  }
131
131
  }
132
132
 
133
- function planOperations(targetDir, { force }) {
133
+ function planOperations(targetDir, { force, preserveCustomized = false }) {
134
134
  const ops = [];
135
135
 
136
+ // #323 option 2: preserve mode requires hashing each existing file to
137
+ // distinguish CUSTOMIZED from MATCH. Only meaningful with --force; without
138
+ // --force, existing files are SKIP regardless of hash.
139
+ const isCustomized = (srcPath, destPath) => {
140
+ if (!preserveCustomized || !force) return false;
141
+ try {
142
+ const srcHash = crypto.createHash('sha256').update(fs.readFileSync(srcPath)).digest('hex');
143
+ const destHash = crypto.createHash('sha256').update(fs.readFileSync(destPath)).digest('hex');
144
+ return srcHash !== destHash;
145
+ } catch (_) {
146
+ return false;
147
+ }
148
+ };
149
+
136
150
  for (const file of FILES) {
137
151
  const destPath = path.join(targetDir, file.dest);
138
152
  const srcPath = path.join(file.base || TEMPLATES_DIR, file.src);
@@ -154,11 +168,16 @@ function planOperations(targetDir, { force }) {
154
168
  // Invalid JSON — fall through to normal SKIP/OVERWRITE
155
169
  }
156
170
 
171
+ let action;
172
+ if (!exists) action = 'CREATE';
173
+ else if (isCustomized(srcPath, destPath)) action = 'PRESERVE';
174
+ else action = force ? 'OVERWRITE' : 'SKIP';
175
+
157
176
  ops.push({
158
177
  src: srcPath,
159
178
  dest: destPath,
160
179
  relativeDest: file.dest,
161
- action: exists ? (force ? 'OVERWRITE' : 'SKIP') : 'CREATE',
180
+ action,
162
181
  executable: file.executable || false,
163
182
  });
164
183
  }
@@ -166,11 +185,15 @@ function planOperations(targetDir, { force }) {
166
185
  // Wizard doc
167
186
  const wizardDest = path.join(targetDir, 'CLAUDE_CODE_SDLC_WIZARD.md');
168
187
  const wizardExists = fs.existsSync(wizardDest);
188
+ let wizardAction;
189
+ if (!wizardExists) wizardAction = 'CREATE';
190
+ else if (isCustomized(WIZARD_DOC, wizardDest)) wizardAction = 'PRESERVE';
191
+ else wizardAction = force ? 'OVERWRITE' : 'SKIP';
169
192
  ops.push({
170
193
  src: WIZARD_DOC,
171
194
  dest: wizardDest,
172
195
  relativeDest: 'CLAUDE_CODE_SDLC_WIZARD.md',
173
- action: wizardExists ? (force ? 'OVERWRITE' : 'SKIP') : 'CREATE',
196
+ action: wizardAction,
174
197
  executable: false,
175
198
  });
176
199
 
@@ -183,7 +206,7 @@ function ensureDir(filePath) {
183
206
 
184
207
  function executeOperations(ops) {
185
208
  for (const op of ops) {
186
- if (op.action === 'SKIP') continue;
209
+ if (op.action === 'SKIP' || op.action === 'PRESERVE') continue;
187
210
  ensureDir(op.dest);
188
211
  if (op.action === 'MERGE') {
189
212
  fs.writeFileSync(op.dest, op.mergedContent);
@@ -230,12 +253,18 @@ function updateGitignore(targetDir, { dryRun }) {
230
253
  }
231
254
 
232
255
  function printOps(ops) {
256
+ let preserveCount = 0;
233
257
  for (const op of ops) {
234
258
  const color = op.action === 'CREATE' ? GREEN
235
259
  : op.action === 'SKIP' ? YELLOW
236
260
  : op.action === 'MERGE' ? MAGENTA
261
+ : op.action === 'PRESERVE' ? YELLOW
237
262
  : CYAN;
238
263
  console.log(` ${color}${op.action}${RESET} ${op.relativeDest}`);
264
+ if (op.action === 'PRESERVE') preserveCount++;
265
+ }
266
+ if (preserveCount > 0) {
267
+ console.log(`\n ${YELLOW}PRESERVED ${preserveCount} customized file(s)${RESET} — review with \`init --dry-run\` to see what differs from the latest template.`);
239
268
  }
240
269
  }
241
270
 
@@ -254,7 +283,7 @@ function invalidateVersionCache({ dryRun }) {
254
283
  return true;
255
284
  }
256
285
 
257
- function init(targetDir, { force = false, dryRun = false } = {}) {
286
+ function init(targetDir, { force = false, dryRun = false, preserveCustomized = false } = {}) {
258
287
  if (!dryRun && !force) {
259
288
  const pluginPaths = detectPluginInstall();
260
289
  if (pluginPaths.length > 0) {
@@ -273,7 +302,7 @@ function init(targetDir, { force = false, dryRun = false } = {}) {
273
302
  }
274
303
  }
275
304
 
276
- const ops = planOperations(targetDir, { force });
305
+ const ops = planOperations(targetDir, { force, preserveCustomized });
277
306
 
278
307
  if (dryRun) {
279
308
  console.log('Dry run — no files will be written:\n');
@@ -396,14 +425,37 @@ function check(targetDir, { json = false } = {}) {
396
425
  }
397
426
  }
398
427
  if (updateInfo) {
399
- console.log(`\n ${YELLOW}UPDATE${RESET} v${updateInfo.current} -> v${updateInfo.latest}`);
400
- console.log(' Run: npx agentic-sdlc-wizard init --force');
428
+ const customizedCount = results.filter((r) => r.status === 'CUSTOMIZED').length;
429
+ for (const line of buildUpdateRecommendation(updateInfo, customizedCount)) {
430
+ console.log(line);
431
+ }
401
432
  }
402
433
  }
403
434
 
404
435
  return { results, updateInfo, hasDrift, marketplace };
405
436
  }
406
437
 
438
+ // #323: when `check` finds CUSTOMIZED files alongside an available update,
439
+ // the historical recommendation `init --force` would silently clobber them.
440
+ // This builds an update-recommendation block that's customization-aware:
441
+ // no-CUSTOMIZED → recommend --force as before. Has-CUSTOMIZED → warn and
442
+ // recommend `--dry-run` first so the user can preview the diff.
443
+ function buildUpdateRecommendation(updateInfo, customizedCount) {
444
+ if (!updateInfo) return [];
445
+ const lines = [
446
+ `\n ${YELLOW}UPDATE${RESET} v${updateInfo.current} -> v${updateInfo.latest}`,
447
+ ];
448
+ if (customizedCount > 0) {
449
+ lines.push(` ${YELLOW}WARNING${RESET}: ${customizedCount} file(s) CUSTOMIZED — \`init --force\` will overwrite them`);
450
+ lines.push(` Preview first: npx agentic-sdlc-wizard init --dry-run`);
451
+ lines.push(` Or upgrade safely: npx agentic-sdlc-wizard init --force --preserve-customized`);
452
+ lines.push(` (\`--preserve-customized\` skips CUSTOMIZED files and updates only MATCH ones.)`);
453
+ } else {
454
+ lines.push(' Run: npx agentic-sdlc-wizard init --force');
455
+ }
456
+ return lines;
457
+ }
458
+
407
459
  function checkFile(srcPath, destPath, relativeDest, shouldBeExecutable) {
408
460
  if (!fs.existsSync(destPath)) {
409
461
  return { file: relativeDest, status: 'MISSING' };
@@ -519,4 +571,4 @@ function checkMarketplacePaths() {
519
571
  return results;
520
572
  }
521
573
 
522
- module.exports = { init, check, planOperations, detectPluginInstall, GITIGNORE_ENTRIES };
574
+ module.exports = { init, check, planOperations, detectPluginInstall, buildUpdateRecommendation, GITIGNORE_ENTRIES };
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agentic-sdlc-wizard",
3
- "version": "1.70.0",
3
+ "version": "1.72.0",
4
4
  "description": "SDLC enforcement for Claude Code — hooks, skills, and wizard setup in one command",
5
5
  "bin": {
6
6
  "sdlc-wizard": "cli/bin/sdlc-wizard.js"
@@ -120,76 +120,24 @@ The loop goes back to PLANNING, not TDD RED. Run `/code-review`; issues at confi
120
120
 
121
121
  ## Cross-Model Review (If Configured)
122
122
 
123
- **When to run:** high-stakes changes (auth, payments, data), releases/publishes, complex refactors.
124
- **When to skip:** trivial changes, time-sensitive hotfixes, risk < review cost.
125
- **Prerequisites:** Codex CLI (`npm i -g @openai/codex`) + OpenAI API key.
126
-
127
- The PROTOCOL is universal across domains; only `review_instructions` and `verification_checklist` change. **Reviewer always at flagship tier (#233):** if the project pins `model: "sonnet[1m]"` (mixed-mode), the reviewer still runs `gpt-5.5` or Opus 4.7 max — adversarial diversity is the point.
128
-
129
- ### Step 0: Preflight Self-Review
130
-
131
- At `.reviews/preflight-{review_id}.md`, document what you already checked: `/code-review` passed, all tests passing, specific concerns checked, what you verified manually, known limitations. Reduces reviewer findings to 0-1 per round.
132
-
133
- ### Step 1: Mission-First Handoff
134
-
135
- Write `.reviews/handoff.json`:
136
- ```jsonc
137
- {
138
- "review_id": "feature-xyz-001",
139
- "status": "PENDING_REVIEW",
140
- "round": 1,
141
- "mission": "What changed and why — 2-3 sentences",
142
- "success": "What 'correctly reviewed' looks like",
143
- "failure": "What gets missed if reviewer is superficial",
144
- "files_changed": ["src/auth.ts", "tests/auth.test.ts"],
145
- "fixes_applied": [],
146
- "previous_score": null,
147
- "verification_checklist": [
148
- "(a) Verify input validation at auth.ts:45 handles empty strings",
149
- "(b) Verify test covers null-token edge case"
150
- ],
151
- "review_instructions": "Focus on security and edge cases. Be strict — assume bugs may be present.",
152
- "preflight_path": ".reviews/preflight-feature-xyz-001.md",
153
- "pr_number": 205
154
- }
155
- ```
156
-
157
- `mission/success/failure` give context (without them: generic "looks good"). `verification_checklist` is specific (file:line), not "review for correctness." `pr_number` (optional) is the **PreCompact self-heal opt-in (ROADMAP #209)**: when set, `precompact-seam-check.sh` checks `gh pr view N --json state` on `/compact` and, if MERGED, treats handoff as implicit CERTIFIED. Without it, a forgotten PENDING handoff blocks every manual compact until you flip status or hit `SDLC_HANDOFF_STALE_DAYS` (default 14).
158
-
159
- ### Step 2: Run the Reviewer
160
-
161
- ```bash
162
- codex exec -c 'model_reasoning_effort="xhigh"' -s danger-full-access \
163
- -o .reviews/latest-review.md \
164
- "Independent code reviewer. Read .reviews/handoff.json for context. \
165
- Verify each checklist item with evidence (file:line, grep, test output). \
166
- Each finding: ID, severity (P0/P1/P2), evidence, certify condition. \
167
- End with: score (1-10), CERTIFIED or NOT CERTIFIED."
168
- ```
169
-
170
- Always `xhigh` — lower settings miss subtle errors. **Progress (#259):** xhigh runs take 1-5 min; for a heartbeat use `scripts/codex-review-with-progress.sh` (`SDLC_CODEX_HEARTBEAT_INTERVAL` tunes). **Sandbox:** Codex's Rust binary needs `SCDynamicStore`; CC's sandbox blocks this. From CC, use `dangerouslyDisableSandbox: true` — Codex has its own sandbox via `-s danger-full-access`. Known issue: [codex#15640](https://github.com/openai/codex/issues/15640).
171
-
172
- CERTIFIED → CI. NOT CERTIFIED → dialogue loop.
173
-
174
- ### Step 3: Dialogue Loop
175
-
176
- Per-finding response in `.reviews/response.json`: `{"finding": "1", "action": "FIXED|DISPUTED|ACCEPTED", "summary": "..."}`. Update `handoff.json`: increment `round`, status `PENDING_RECHECK`, add `fixes_applied` (numbered, file:line refs).
177
-
178
- Recheck prompt: "TARGETED RECHECK. For each finding: FIXED → verify certify condition. DISPUTED → ACCEPT if sound, REJECT with reasoning. ACCEPTED → verify applied. Do NOT raise new findings unless P0. End with score, CERTIFIED or NOT CERTIFIED."
123
+ **When to run:** high-stakes changes (auth, payments, data), releases/publishes, complex refactors. **When to skip:** trivial changes, time-sensitive hotfixes, risk < review cost. **Prerequisites:** Codex CLI (`npm i -g @openai/codex`) + OpenAI API key. **Reviewer at flagship tier (#233):** even when project pins `sonnet[1m]`, reviewer runs `gpt-5.5` / Opus 4.7 max — adversarial diversity is the point.
179
124
 
180
- **Convergence:** 2 rounds is the sweet spot, 3 max (research: 14 repos + 7 papers). After 3 still NOT CERTIFIED → escalate to user.
125
+ PROTOCOL is universal across domains; only `review_instructions` and `verification_checklist` change.
181
126
 
182
- **Anti-patterns:** "find at least N problems," "review this," 1-10 without criteria, letting reviewer see author's reasoning (anchoring).
127
+ 1. **Preflight** (`.reviews/preflight-{review_id}.md`) what you already checked: `/code-review` passed, tests passing, manual verifications, known limits. Reduces reviewer findings to 0-1/round.
128
+ 2. **Mission-first handoff** (`.reviews/handoff.json`) — required JSON keys: `"review_id"`, `"status": "PENDING_REVIEW"`, `"round": 1`, `"mission"` / `"success"` / `"failure"` (2-3 sentences each — without them you get "looks good"), `"files_changed"`, `"verification_checklist"` — the **verification checklist** is specific items with file:line refs (NOT a generic "review this"), `"review_instructions"`, `"preflight_path"`. Optional `"pr_number":` opts into the PreCompact self-heal (#209): if PR is MERGED, `/compact` treats handoff as implicit CERTIFIED.
129
+ 3. **Run reviewer:** `codex exec -c 'model_reasoning_effort="xhigh"' -s danger-full-access -o .reviews/latest-review.md "<prompt>"`. Always `xhigh`. CC sandbox blocks Codex's Rust binary (`SCDynamicStore`) — use `dangerouslyDisableSandbox: true` on Bash; Codex has its own sandbox. xhigh runs take 1-5 min; for a heartbeat use `scripts/codex-review-with-progress.sh`.
130
+ 4. **Dialogue loop:** per-finding response (`{"finding": "1", "action": "FIXED|DISPUTED|ACCEPTED", "summary": "..."}` in `.reviews/response.json`). Bump round, set status `PENDING_RECHECK`, add `fixes_applied` (numbered, file:line). Recheck prompt: "TARGETED RECHECK. FIXED → verify certify condition. DISPUTED → ACCEPT if sound, REJECT with reasoning. ACCEPTED → verify applied. No new findings unless P0."
183
131
 
184
- **Multiple reviewers** (Claude review + Codex + human): `gh api repos/OWNER/REPO/pulls/PR/comments` for all feedback, respond to each reviewer independently (different blind spots), pick stronger argument on conflicts, max 3 iterations per reviewer.
132
+ **Convergence:** 2 rounds sweet spot, 3 max (research: 14 repos + 7 papers). After 3 still NOT CERTIFIED escalate.
185
133
 
186
- **Non-code domains** (research, persuasion, medical): same handoff format, adapt `review_instructions` + `verification_checklist`, add `audience` + `stakes`.
134
+ **Multi-reviewer / non-code domains:** when running multiple reviewers in parallel (e.g. Claude review + Codex + human), respond per-reviewer (different blind spots, no shared anchoring). For non-code domains (research, persuasion, medical), keep the same handoff format and add `"audience"` + `"stakes"` keys.
187
135
 
188
136
  ### Release Review Focus
189
137
 
190
138
  Before any release/publish, add to `verification_checklist`: **CHANGELOG consistency** (sections present, no lost entries), **Version parity** (package.json + SDLC.md + CHANGELOG + wizard metadata), **Stale examples** (hardcoded version strings), **Docs accuracy** (README + ARCHITECTURE reflect current features), **CLI-distributed file parity** (live skills/hooks match CLI templates).
191
139
 
192
- (Full protocol with rationale and convergence diagrams: `CLAUDE_CODE_SDLC_WIZARD.md` → Cross-Model Review.)
140
+ **Full protocol** (rationale, full JSON example, anti-patterns like "find at least N", convergence diagrams): `CLAUDE_CODE_SDLC_WIZARD.md` → "Cross-Model Review Loop".
193
141
 
194
142
  ## Documentation Sync (REQUIRED — During Planning)
195
143
 
@@ -93,11 +93,13 @@ Parse CHANGELOG entries between the user's installed version and latest. Present
93
93
 
94
94
  ```
95
95
  Installed: 1.42.0
96
- Latest: 1.70.0
96
+ Latest: 1.72.0
97
97
 
98
98
  What changed:
99
- - [1.70.0] token-bloat fix phase 2 `hooks/tdd-pretool-check.sh` TDD CHECK JSON nudge (the per-`Write/Edit` "Are you writing IMPLEMENTATION before a FAILING TEST?" reminder) now fires once per CC `session_id` instead of every src/ edit. Saves ~0.5-1.5K tokens/session (10-30 src Edits × ~50 tok). Same atomic-noclobber claim pattern as v1.69.0 BASELINE gate. Non-src/ edits don't consume the sentinel slot.
100
- - [1.69.0] token-bloat fix phase 1 — `hooks/sdlc-prompt-check.sh` BASELINE block (the ~250-token "TodoWrite FIRST / STATE CONFIDENCE / AUTO-INVOKE" reminder) now fires once per CC `session_id`. Saves ~12K tokens/session. SETUP-not-complete + EFFORT-bump warnings still fire every prompt (dynamic state).
99
+ - [1.72.0] #323 closed customization-aware `check` recommendation + new `--preserve-customized` flag. `init --force --preserve-customized` skips CUSTOMIZED files (action `PRESERVE`), still OVERWRITEs MATCH and CREATEs MISSING. Default `init --force` unchanged. 10 tests.
100
+ - [1.71.0] token-bloat phase 3 — `skills/sdlc/SKILL.md` Cross-Model Review trimmed (~70 ~20 lines, -427 tokens). Full protocol + examples moved to canonical `CLAUDE_CODE_SDLC_WIZARD.md` "Cross-Model Review Loop" section.
101
+ - [1.70.0] token-bloat fix phase 2 — `hooks/tdd-pretool-check.sh` TDD CHECK JSON nudge fires once per CC `session_id` instead of every src/ edit. Saves ~0.5-1.5K tokens/session.
102
+ - [1.69.0] token-bloat fix phase 1 — `hooks/sdlc-prompt-check.sh` BASELINE block fires once per CC `session_id`. Saves ~12K tokens/session.
101
103
  - [1.68.0–1.65.0] roadmap hygiene — five paperwork closes: #97 Anthropic Policy NO-GO + AAR-paper validating parallel; #99 AutoGPT NO-GO; #95 Nous NO-GO; #243 token-history liveness verified; #210 Node-24 false-green; #235 Thoughtworks AI Evals NO-GO. **6/6 external-product audits NO-GO** (continues #76, #77). Research write-ups in `.reviews/research-*.md`.
102
104
  - [1.64.0] XDLC ecosystem cross-references — README, wizard doc, and ROADMAP now cross-reference all three sibling packages (`agentic-sdlc-wizard`, `codex-sdlc-wizard`, `claude-gdlc-wizard`). New "Ecosystem (Sibling Projects)" section in README. 3 new doc-consistency tests prevent drift.
103
105
  - [1.63.0] cache-cost observability closeout (#204 absorbed by #220) — `tests/test-token-spike.sh` gains explicit cache-miss regression test + negative-control test. SDLC skill + wizard doc gain "Cache-Cost Surprises" sections covering 10-20× silent cost blowups (mid-session CLAUDE.md edits, idle pruning, upstream cache bugs) and detection via `hooks/token-spike-check.sh`'s `costly_tokens` metric.