agentic-sdlc-wizard 1.29.0 → 1.31.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -13,7 +13,7 @@
13
13
  "name": "sdlc-wizard",
14
14
  "source": ".",
15
15
  "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
16
- "version": "1.29.0",
16
+ "version": "1.31.0",
17
17
  "author": {
18
18
  "name": "Stefan Ayala"
19
19
  },
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "sdlc-wizard",
3
- "version": "1.29.0",
3
+ "version": "1.31.0",
4
4
  "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
5
5
  "author": {
6
6
  "name": "Stefan Ayala",
package/CHANGELOG.md CHANGED
@@ -4,6 +4,49 @@ All notable changes to the SDLC Wizard.
4
4
 
5
5
  > **Note:** This changelog is for humans to read. Don't manually apply these changes - just run the wizard ("Check for SDLC wizard updates") and it handles everything automatically.
6
6
 
7
+ ## [1.31.0] - 2026-04-14
8
+
9
+ ### Added
10
+ - Ephemeral marketplace path detection in CLI `check` command (#174)
11
+ - Scans `~/.claude/settings.json` `extraKnownMarketplaces` for directory sources on ephemeral paths (`/tmp/`, `/private/tmp/`, `/var/folders/`)
12
+ - `EPHEMERAL` status (path exists but in ephemeral root) warns but doesn't fail check
13
+ - `DANGLING` status (path doesn't exist) errors with non-zero exit code
14
+ - Suggests moving to `~/.claude/plugins-local/<name>` for stable installs
15
+ - JSON output (`--json`) includes new `marketplace` field
16
+ - 10 new tests (51 total CLI tests)
17
+
18
+ ### Fixed
19
+ - Hook false-positive "SETUP NOT COMPLETE" in non-SDLC directories (#173, PR #175)
20
+ - Three-way detection: both files (normal), one file (warn partial setup), neither (silent exit)
21
+ - Added `find_partial_sdlc_root` helper for partial-setup detection
22
+ - 2 new hook tests (60 total hook tests)
23
+
24
+ ## [1.30.0] - 2026-04-12
25
+
26
+ ### Added
27
+ - CC degradation detection (#96, PR #166)
28
+ - Score persistence: CI now git-commits `score-history.jsonl` to PR branch after E2E runs, feeding CUSUM drift detection with real data
29
+ - Fork guard (`head.repo.full_name == github.repository`) prevents silent push failures on fork PRs
30
+ - Injection-safe: `head.ref` passed via `env:` block, not inline `${{ }}`
31
+ - Wizard effort section hardened: explains adaptive thinking root cause (Boris Cherny GH #42796), scopes "medium default" to Pro/Max plans, cites code.claude.com docs
32
+ - `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING` documented as opt-in hardening (not default)
33
+ - Anti-laziness CLAUDE.md guidance section targeting specific mechanisms (adaptive thinking, effort levels, thinking budget)
34
+ - 14 behavioral tests (`test-degradation-detection.sh`)
35
+ - Model A/B comparison workflow (#94, PRs #164, #165)
36
+ - `workflow_dispatch` benchmark: Opus vs Sonnet on E2E scenarios with 95% CI
37
+ - Matrix strategy over scenarios, parameterized model/trials/max_turns
38
+ - Wizard installation verification before simulation (P0 fix)
39
+ - jq-based artifact construction (safe against empty outputs)
40
+ - 37 quality tests (`test-model-comparison.sh`)
41
+ - Firmware-embedded E2E fixture (#78, PR #163)
42
+ - Python SD card overlay manager, 3 device configs (Raspberry Pi, STM32, ESP32)
43
+ - SIL + config validation tests within fixture
44
+ - Domain-adaptive testing proof: firmware indicators, Python overlay, multi-device differentiation
45
+ - 12 quality tests (`test-firmware-fixture.sh`)
46
+
47
+ ### Fixed
48
+ - P0 shell injection in model comparison workflow: `${{ inputs.model }}` directly in `run:` blocks. Fixed by passing all inputs through `env:` block (caught by Codex review)
49
+
7
50
  ## [1.29.0] - 2026-04-07
8
51
 
9
52
  ### Added
@@ -224,7 +224,11 @@ Claude Code's **effort level** controls how much thinking the model does before
224
224
  | `high` | **Default for all SDLC work.** Features, bug fixes, refactoring, tests, reviews | `effort: high` in skill frontmatter (already set) |
225
225
  | `max` | LOW confidence, FAILED 2x, architecture decisions, complex debugging, cross-model reviews | `/effort max` (session only — resets next session) |
226
226
 
227
- **Why `high` is the default:** The `/sdlc` skill sets `effort: high` in its frontmatter, so every SDLC invocation automatically uses high effort. This gives thorough reasoning without the unbounded token cost of `max`.
227
+ **Why `high` is the default:** Claude Code uses **adaptive thinking** to dynamically allocate reasoning budget per turn. On Pro and Max plans, the default effort level is **medium (85)**, which causes the model to under-allocate reasoning on complex multi-step tasks — leading to shallow analysis, missed edge cases, and "lazy" outputs. This was [confirmed by Anthropic engineer Boris Cherny](https://github.com/anthropics/claude-code/issues/42796) and is documented at [code.claude.com](https://code.claude.com/docs/en/model-config). API, Team, and Enterprise plans default to high effort and are not affected.
228
+
229
+ The `/sdlc` skill sets `effort: high` in its frontmatter, overriding the medium default on every SDLC invocation. This gives thorough reasoning without the unbounded token cost of `max`.
230
+
231
+ **Nuclear option — disable adaptive thinking entirely:** Set `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1` in your environment or settings.json `env` block. This forces a fixed reasoning budget per turn instead of letting the model dynamically allocate. Use this if you observe persistent quality issues even with `effort: high`. See [Claude Code model config docs](https://code.claude.com/docs/en/model-config) for details.
228
232
 
229
233
  **When to escalate to `max`:**
230
234
  - You hit LOW confidence on your approach — deeper thinking may find clarity
@@ -242,6 +246,21 @@ Claude Code's **effort level** controls how much thinking the model does before
242
246
 
243
247
  > See also: the **Effort** column in the [Confidence Check table](#confidence-check-required) below for per-confidence-level guidance on when to escalate to `max`.
244
248
 
249
+ ### Anti-Laziness Guidance for CLAUDE.md
250
+
251
+ If you notice Claude Code producing shallow outputs despite `effort: high`, add these instructions to your project's `CLAUDE.md`. These target the **specific mechanisms** behind quality degradation — adaptive thinking and effort levels — rather than vague directives:
252
+
253
+ ```markdown
254
+ ## Quality Anchoring
255
+ - This project uses effort: high via SDLC skill frontmatter. Do not reduce reasoning depth.
256
+ - Adaptive thinking may under-allocate your thinking budget on complex tasks. When working on
257
+ multi-file changes, architecture decisions, or debugging: reason through the full problem
258
+ before acting, even if the system prompt suggests taking the "simplest approach first."
259
+ - If you catch yourself skipping steps, re-read the task requirements and verify completeness.
260
+ ```
261
+
262
+ **Why this works:** Claude Code's hidden system prompt includes "Go straight to the point. Try the simplest approach first." This is good for simple queries but causes the model to under-invest in reasoning on complex SDLC tasks. The instructions above don't fight the system prompt — they provide task-specific context that justifies deeper reasoning. Note that CLAUDE.md instructions can be partially overridden by the system prompt, so `effort: high` in skill frontmatter remains the primary defense.
263
+
245
264
  ---
246
265
 
247
266
  ## Claude Code Feature Updates
@@ -2609,7 +2628,7 @@ If deployment fails or post-deploy verification catches issues:
2609
2628
 
2610
2629
  **SDLC.md:**
2611
2630
  ```markdown
2612
- <!-- SDLC Wizard Version: 1.29.0 -->
2631
+ <!-- SDLC Wizard Version: 1.31.0 -->
2613
2632
  <!-- Setup Date: [DATE] -->
2614
2633
  <!-- Completed Steps: step-0.1, step-0.2, step-0.4, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
2615
2634
  <!-- Git Workflow: [PRs or Solo] -->
@@ -3668,7 +3687,7 @@ Walk through updates? (y/n)
3668
3687
  Store wizard state in `SDLC.md` as metadata comments (invisible to readers, parseable by Claude):
3669
3688
 
3670
3689
  ```markdown
3671
- <!-- SDLC Wizard Version: 1.29.0 -->
3690
+ <!-- SDLC Wizard Version: 1.31.0 -->
3672
3691
  <!-- Setup Date: 2026-01-24 -->
3673
3692
  <!-- Completed Steps: step-0.1, step-0.2, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
3674
3693
  <!-- Git Workflow: PRs -->
package/README.md CHANGED
@@ -87,7 +87,7 @@ Layer 4: STATISTICAL VALIDATION
87
87
  SDP normalizes for model quality. CUSUM catches drift.
88
88
 
89
89
  Layer 3: SCORING ENGINE
90
- 7 criteria, 10/11 points. Claude evaluates Claude.
90
+ Multi-criteria scoring, 10/11 points. Claude evaluates Claude.
91
91
  Before/after wizard A/B comparison in CI.
92
92
 
93
93
  Layer 2: ENFORCEMENT
@@ -229,7 +229,7 @@ This isn't the only Claude Code SDLC tool. Here's an honest comparison:
229
229
  | Document | What It Covers |
230
230
  |----------|---------------|
231
231
  | [ARCHITECTURE.md](ARCHITECTURE.md) | System design, 5-layer diagram, data flows, file structure |
232
- | [CI_CD.md](CI_CD.md) | All 6 workflows, E2E scoring, tier system, SDP, integrity checks |
232
+ | [CI_CD.md](CI_CD.md) | All workflows, E2E scoring, tier system, SDP, integrity checks |
233
233
  | [SDLC.md](SDLC.md) | Version tracking, enforcement rules, SDLC configuration |
234
234
  | [TESTING.md](TESTING.md) | Testing philosophy, test diamond, TDD approach |
235
235
  | [CHANGELOG.md](CHANGELOG.md) | Version history, what changed and when |
package/cli/init.js CHANGED
@@ -2,6 +2,7 @@
2
2
 
3
3
  const crypto = require('crypto');
4
4
  const fs = require('fs');
5
+ const os = require('os');
5
6
  const path = require('path');
6
7
 
7
8
  const RESET = '\x1b[0m';
@@ -285,23 +286,37 @@ function check(targetDir, { json = false } = {}) {
285
286
  // Offline or npm unavailable — skip update check
286
287
  }
287
288
 
288
- const hasDrift = results.some((r) => r.status === 'MISSING' || r.status === 'DRIFT');
289
+ const marketplace = checkMarketplacePaths();
290
+ const hasDrift = results.some((r) => r.status === 'MISSING' || r.status === 'DRIFT')
291
+ || marketplace.some((m) => m.status === 'DANGLING');
289
292
 
290
293
  if (json) {
291
- console.log(JSON.stringify({ files: results, update: updateInfo }, null, 2));
294
+ console.log(JSON.stringify({ files: results, update: updateInfo, marketplace }, null, 2));
292
295
  } else {
293
296
  for (const r of results) {
294
297
  const color = r.status === 'MATCH' ? GREEN : r.status === 'MISSING' ? RED : YELLOW;
295
298
  console.log(` ${color}${r.status}${RESET} ${r.file}`);
296
299
  if (r.details) console.log(` ${r.details}`);
297
300
  }
301
+ for (const m of marketplace) {
302
+ const color = m.status === 'DANGLING' ? RED : YELLOW;
303
+ const heading = m.status === 'EPHEMERAL'
304
+ ? `Marketplace '${m.name}' source path is ephemeral:`
305
+ : `Marketplace '${m.name}' source path does not exist:`;
306
+ console.log(`\n ${color}${m.status}${RESET} ${heading}`);
307
+ console.log(` ${m.path}`);
308
+ console.log(` ${m.details}`);
309
+ if (m.suggestion) {
310
+ console.log(` Recommended: move to ${m.suggestion}`);
311
+ }
312
+ }
298
313
  if (updateInfo) {
299
314
  console.log(`\n ${YELLOW}UPDATE${RESET} v${updateInfo.current} -> v${updateInfo.latest}`);
300
315
  console.log(' Run: npx agentic-sdlc-wizard init --force');
301
316
  }
302
317
  }
303
318
 
304
- return { results, updateInfo, hasDrift };
319
+ return { results, updateInfo, hasDrift, marketplace };
305
320
  }
306
321
 
307
322
  function checkFile(srcPath, destPath, relativeDest, shouldBeExecutable) {
@@ -341,4 +356,56 @@ function checkGitignore(gitignorePath) {
341
356
  return { file: '.gitignore', status: 'MATCH' };
342
357
  }
343
358
 
359
+ const EPHEMERAL_ROOTS = /^(\/tmp\/|\/private\/tmp\/|\/var\/folders\/|\/private\/var\/folders\/)/;
360
+
361
+ function checkMarketplacePaths() {
362
+ const results = [];
363
+ const globalSettings = path.join(os.homedir(), '.claude', 'settings.json');
364
+
365
+ if (!fs.existsSync(globalSettings)) return results;
366
+
367
+ let data;
368
+ try {
369
+ data = JSON.parse(fs.readFileSync(globalSettings, 'utf8'));
370
+ } catch (_) {
371
+ return results;
372
+ }
373
+
374
+ const marketplaces = data.extraKnownMarketplaces;
375
+ if (!marketplaces || typeof marketplaces !== 'object') return results;
376
+
377
+ for (const [name, entry] of Object.entries(marketplaces)) {
378
+ const source = entry && entry.source;
379
+ if (!source || source.source !== 'directory' || !source.path || typeof source.path !== 'string') continue;
380
+
381
+ const sourcePath = source.path;
382
+ const isEphemeral = EPHEMERAL_ROOTS.test(sourcePath);
383
+ const exists = fs.existsSync(sourcePath);
384
+ const basename = path.basename(sourcePath);
385
+ const suggestion = `~/.claude/plugins-local/${basename}`;
386
+
387
+ if (!exists) {
388
+ results.push({
389
+ name,
390
+ path: sourcePath,
391
+ status: 'DANGLING',
392
+ details: isEphemeral
393
+ ? 'Ephemeral path has been reaped — plugin is broken'
394
+ : 'Path does not exist — plugin may be silently broken',
395
+ suggestion: isEphemeral ? suggestion : undefined,
396
+ });
397
+ } else if (isEphemeral) {
398
+ results.push({
399
+ name,
400
+ path: sourcePath,
401
+ status: 'EPHEMERAL',
402
+ details: 'macOS reaps this path periodically — plugin may break silently',
403
+ suggestion,
404
+ });
405
+ }
406
+ }
407
+
408
+ return results;
409
+ }
410
+
344
411
  module.exports = { init, check, planOperations, GITIGNORE_ENTRIES };
@@ -0,0 +1,36 @@
1
+ #!/bin/bash
2
+ # Shared helper: walk up from CWD to find nearest SDLC.md + TESTING.md pair
3
+ # Sourced by sdlc-prompt-check.sh and instructions-loaded-check.sh
4
+ # Fixes #171: false-positive "SETUP NOT COMPLETE" in monorepos / nested projects
5
+
6
+ # find_sdlc_root — walks up from pwd, stops at $HOME (exclusive)
7
+ # Sets SDLC_ROOT to the found directory, or empty string if not found
8
+ find_sdlc_root() {
9
+ local check_dir
10
+ check_dir="$(pwd)"
11
+ SDLC_ROOT=""
12
+ while [ "$check_dir" != "/" ] && [ "$check_dir" != "$HOME" ] && [ -n "$check_dir" ]; do
13
+ if [ -f "$check_dir/SDLC.md" ] && [ -f "$check_dir/TESTING.md" ]; then
14
+ SDLC_ROOT="$check_dir"
15
+ return 0
16
+ fi
17
+ check_dir="$(dirname "$check_dir")"
18
+ done
19
+ return 1
20
+ }
21
+
22
+ # find_partial_sdlc_root — walks up looking for EITHER SDLC.md OR TESTING.md
23
+ # Used to detect partial setup (one file exists but not both) vs not-an-SDLC-project
24
+ find_partial_sdlc_root() {
25
+ local check_dir
26
+ check_dir="$(pwd)"
27
+ SDLC_ROOT=""
28
+ while [ "$check_dir" != "/" ] && [ "$check_dir" != "$HOME" ] && [ -n "$check_dir" ]; do
29
+ if [ -f "$check_dir/SDLC.md" ] || [ -f "$check_dir/TESTING.md" ]; then
30
+ SDLC_ROOT="$check_dir"
31
+ return 0
32
+ fi
33
+ check_dir="$(dirname "$check_dir")"
34
+ done
35
+ return 1
36
+ }
@@ -4,7 +4,21 @@
4
4
  # Available since Claude Code v2.1.69
5
5
  # Note: no set -e — this hook must always exit 0 to not block session start
6
6
 
7
- PROJECT_DIR="${CLAUDE_PROJECT_DIR:-.}"
7
+ # Walk up from CWD to find nearest SDLC.md + TESTING.md (#171: monorepo support)
8
+ HOOK_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
9
+ source "$HOOK_DIR/_find-sdlc-root.sh"
10
+
11
+ # CWD walk-up finds nearest SDLC project (#173: silent exit for non-SDLC dirs)
12
+ if find_sdlc_root; then
13
+ PROJECT_DIR="$SDLC_ROOT"
14
+ elif find_partial_sdlc_root; then
15
+ # Partial setup — one file exists but not both. Warn about missing files
16
+ PROJECT_DIR="$SDLC_ROOT"
17
+ else
18
+ # Not an SDLC project at all — exit silently
19
+ exit 0
20
+ fi
21
+
8
22
  MISSING=""
9
23
 
10
24
  if [ ! -f "$PROJECT_DIR/SDLC.md" ]; then
@@ -2,8 +2,21 @@
2
2
  # Light SDLC hook - baseline reminder every prompt (~100 tokens)
3
3
  # Full guidance in skill: .claude/skills/sdlc/
4
4
 
5
- # Check if setup has been completed
6
- PROJECT_DIR="${CLAUDE_PROJECT_DIR:-.}"
5
+ # Walk up from CWD to find nearest SDLC.md + TESTING.md (#171: monorepo support)
6
+ HOOK_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
7
+ source "$HOOK_DIR/_find-sdlc-root.sh"
8
+
9
+ # CWD walk-up finds nearest SDLC project (#173: silent exit for non-SDLC dirs)
10
+ if find_sdlc_root; then
11
+ PROJECT_DIR="$SDLC_ROOT"
12
+ elif find_partial_sdlc_root; then
13
+ # Partial setup — one file exists but not both. Warn about missing files
14
+ PROJECT_DIR="$SDLC_ROOT"
15
+ else
16
+ # Not an SDLC project at all — exit silently
17
+ exit 0
18
+ fi
19
+
7
20
  if [ ! -s "$PROJECT_DIR/SDLC.md" ] || [ ! -s "$PROJECT_DIR/TESTING.md" ]; then
8
21
  cat << 'SETUP'
9
22
  SETUP NOT COMPLETE: SDLC.md and/or TESTING.md are missing.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agentic-sdlc-wizard",
3
- "version": "1.29.0",
3
+ "version": "1.31.0",
4
4
  "description": "SDLC enforcement for Claude Code — hooks, skills, and wizard setup in one command",
5
5
  "bin": {
6
6
  "sdlc-wizard": "./cli/bin/sdlc-wizard.js"
@@ -46,9 +46,11 @@ Parse all CHANGELOG entries between the user's installed version and the latest.
46
46
 
47
47
  ```
48
48
  Installed: 1.24.0
49
- Latest: 1.29.0
49
+ Latest: 1.31.0
50
50
 
51
51
  What changed:
52
+ - [1.31.0] Hook false-positive fix for non-SDLC dirs, ephemeral marketplace path warning, ...
53
+ - [1.30.0] Firmware fixture, model A/B comparison workflow, CC degradation detection, ...
52
54
  - [1.29.0] Node 24 compliance, autocompact in settings.json, effectiveness scoreboard, ...
53
55
  - [1.28.0] Autocompact benchmarking methodology, canary fact mechanism, benchmark harness, ...
54
56
  - [1.27.0] Domain-adaptive testing diamond, 3 domain fixtures, 25 quality tests, ...