agentic-sdlc-wizard 1.75.1 → 1.77.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -13,7 +13,7 @@
13
13
  "name": "sdlc-wizard",
14
14
  "source": ".",
15
15
  "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
16
- "version": "1.75.1",
16
+ "version": "1.77.0",
17
17
  "author": {
18
18
  "name": "Stefan Ayala"
19
19
  },
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "sdlc-wizard",
3
- "version": "1.75.1",
3
+ "version": "1.77.0",
4
4
  "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
5
5
  "author": {
6
6
  "name": "Stefan Ayala",
package/CHANGELOG.md CHANGED
@@ -4,6 +4,44 @@ All notable changes to the SDLC Wizard.
4
4
 
5
5
  > **Note:** This changelog is for humans to read. Don't manually apply these changes - just run the wizard ("Check for SDLC wizard updates") and it handles everything automatically.
6
6
 
7
+ ## [1.77.0] - 2026-05-24
8
+
9
+ ### Added
10
+
11
+ - **`release-dry-run.yml` CI workflow** (A1, v1.75.1 post-mortem). Runs `npm publish --dry-run --tag dry-run --json` on every PR touching `release.yml`, `package.json`, or the shipped package surface (`cli/`, `hooks/`, `skills/`, `.claude-plugin/`, `CLAUDE_CODE_SDLC_WIZARD.md`, `CHANGELOG.md`). Catches MODULE_NOT_FOUND-class regressions, dropped shipped paths, Node/npm version drift — visible at PR-time instead of producing a half-published tag. Uses `permissions: contents: read` only (no `id-token: write` — npm CLI runs OIDC setup before the dry-run branch, so requesting that permission would attempt token mint on every PR). Rewrites `package.json` to `0.0.0-dry-run-<SHA>` in a temp checkout to avoid the "cannot publish over previously published versions" error. Path filter is `package.json.files`-aware and tested for drift. 25 quality tests in `tests/test-release-dry-run-workflow.sh`.
12
+ - **`cc-version-drift.yml` cadence workflow** (closes #350). The fix for the gap that let native `/goal` (CC v2.1.139) slip past for 32 versions / 5 weeks. Pure GH-API detector — no LLM, no API spend. Mon 09:30 UTC cron (staggered from `weekly-update.yml` 09:00 and `weekly-api-update.yml` 10:00) + `workflow_dispatch`. Parses new `<!-- Claude Code Baseline: vX.Y.Z -->` anchor in `SDLC.md` (single-purpose; NOT `<!-- SDLC Wizard Version -->` which is the wizard's own pkg version), compares to `npm view @anthropic-ai/claude-code version`, opens a tracking issue when patch gap > 5 (major/minor jumps alert regardless of threshold). Idempotent via label `cc-version-drift` + machine-readable marker in issue body. Edits existing open issue instead of comment-spamming. Re-opens a closed issue only if delta WIDENED (respects "won't fix for now"). 22 workflow tests + 18 unit tests on the extracted `scripts/cc-drift-check.sh` (bash 3 compatible, strict SemVer validation, refuses prereleases and regressions).
13
+ - **`/goal` SDLC-discipline gates in `/sdlc` skill** (PR-D). Two new load-bearing rules now that native `/goal` is universal across CC (v2.1.139+) and Codex CLI: **(1) Confidence gate — NEVER invoke `/goal` below HIGH 95%** (mirrors existing Confidence Check; below 95% the Haiku evaluator rubber-stamps "did the agent flail" as progress). **(2) DLC binding — the condition MUST name the active DLC** (`/sdlc`, `/gdlc`, `/ldlc`, etc.) so the evaluator anchors on "doing it right," not just "doing it." Example: `/goal "tests pass + clean tree following /sdlc, stop after 20 turns"`. `test_sdlc_skill_has_goal_wrapper` extended to grep both new keywords (9 quality elements total).
14
+ - **`<!-- Claude Code Baseline: v2.1.150 -->` anchor in `SDLC.md`** — single-purpose machine-parseable source of truth for `cc-version-drift.yml`. Maintainer updates this anchor + the human-readable `Claude Code Recommended` row when bumping CC support. Test asserts both stay in sync.
15
+
16
+ ### Notes
17
+
18
+ - **Trusted Publishing flow held.** Third release shipped via OIDC since the v1.75.0 migration; no token to rotate, expire, or 2FA-gate.
19
+ - **Cross-model design reviews** at `.reviews/176-followup-prio-codex.md` (Codex gpt-5.5 xhigh). Caught: (a) naive `npm publish --dry-run` against already-published version fails today; (b) `id-token: write` on the dry-run workflow would attempt OIDC mint on every PR; (c) `weekly-update.yml` is the wrong host for #350 (cron is disabled + tests assert no `issues: write`); (d) "minor versions" is wrong terminology — `2.1.150 → 2.1.180` is a SemVer patch delta.
20
+ - **Self-test landed in PR-B:** the new `release-dry-run.yml` workflow ran ON the PR that introduced it (paths filter included `release-dry-run.yml` itself) and SUCCEEDED — first proof that the dry-run mechanism works end-to-end against the temp-version-rewrite + `--tag dry-run` flow.
21
+
22
+ ## [1.76.0] - 2026-05-24
23
+
24
+ ### Added
25
+
26
+ - **Native `/goal` wrapper in `/sdlc` skill** (closes #347). CC v2.1.139 shipped a native `/goal <condition>` command — set a completion condition, Haiku evaluator re-checks the transcript per turn until met. The /sdlc skill now carries a tight wrapper section with 5 load-bearing elements: pre-flight checklist (workspace trusted, hooks not disabled, CC ≥ v2.1.143 for the subagent-race fix), condition-as-SDLC-contract guidance (measurable end state + check + constraints + hard turn/time bound since there's no native cap), compose-with-hooks note (UserPromptSubmit/SessionStart/PreCompact fire normally per turn), `--resume` resets counters caveat, and the off-transcript anti-pattern callout (the evaluator cannot call tools, so `/goal "production is healthy"` is wrong — it can only judge what's in the conversation). New quality test `test_sdlc_skill_has_goal_wrapper` greps for each element by keyword. Per Prove-It Gate absorption principle: extends the existing /sdlc skill, no new skill/hook/template scaffolding.
27
+ - **CC v2.1.119 → v2.1.150 feature adoption** in `CLAUDE_CODE_SDLC_WIZARD.md` "Complementary native skills" table: `/code-review [effort] [--comment]` (v2.1.147+, renamed from `/simplify`, posts findings as inline GH PR comments), `/usage` per-category breakdown of limits usage (v2.1.149+ — skills, subagents, plugins, MCP-server costs), `/context all` per-skill per-model token estimates (v2.1.139+). Each row carries usage guidance with the same caveat discipline applied to `/insights` (#235a).
28
+ - **`Claude Code Recommended` baseline guidance** in `SDLC.md` — new row at `v2.1.150+` alongside the unchanged `Claude Code Minimum` floor at `v2.1.111+`. Splits "what we require" from "what unlocks the latest features" without breaking back-compat for consumers still on the minimum.
29
+
30
+ ### Changed
31
+
32
+ - **ROADMAP cleanup with demand-signal-first entry gate** (PR #349). New top-of-ROADMAP gate: new entries require one of a maintainer pain event with repro, a second external user signal, a dated platform deadline, or a low-cost cleanup. Everything else goes to a Research Parking Lot with 30/60-day expiry — if the trigger hasn't fired by expiry, the item is deleted rather than carried forward. Excised 4 items now tracked in sibling repos (#9 OpenCode → `BaseInfinity/opencode-sdlc-wizard`, #82 Domain DLCs → Stefan's separate track, #91 Multi-Agent Adapter umbrella → per-adapter sibling repos, Back Burner Agent-agnostic SDLC). Killed 4 stale items with no demand signal (#85 Phase 2 — 1.7 months stale, killed by #231 zero-cron philosophy; #233 automation subitems — 1 month stale, Max-user footgun; Back Burner Chaos/Resilience Testing — 1.8 months, no concrete failure; Back Burner Subagent Model Compliance Audit — 3+ months, prototype already deleted in #236). Source: cross-model prioritization at `.reviews/roadmap-prio-codex.md`.
33
+ - **#347 corrected from "no native primitive" to "native `/goal` exists, build the wrapper"**. The original 2026-05-23 research (claude-code-guide subagent) was wrong — CC v2.1.139 had shipped `/goal` weeks earlier. Implementation scope shrunk from a 5-step plan + GOAL.md/HANDOFF.md templates down to a ~30-line wrapper in the existing `/sdlc` skill (now shipped — see Added). Meta-lesson captured: require explicit citation of an authoritative source (docs index, raw changelog grep) before accepting a "feature does not exist" claim — negative claims are easier to fake than positive ones.
34
+
35
+ ### Added (ROADMAP only — implementation deferred)
36
+
37
+ - **#302 — User-level setup-wizard + repo-local lifecycle split** (PR #346). Cross-model design review (Codex gpt-5.5 xhigh) scored Claude's first-pass 5/10 NOT CERTIFIED and replaced it with a concrete channel contract: plugin = user-level/global, npm/npx = repo-local, no npm `postinstall` writing to `~/.claude/`. Implementation deferred per demand-signal-first gate.
38
+ - **#350 — CC feature-discovery cadence fix** (PR #350). Captures the process gap that let `/goal` slip past for ~5 weeks: #231 Phase 3d gutted the in-CI LLM-ranker to take `weekly-update.yml` to $0, and the maintainer-run replacement on Max never ran. Proposed fix: a thin GH-API-only check that opens a "CC version drift" issue when our `<!-- SDLC Wizard Version -->` baseline is >5 minor versions behind latest npm. No LLM, no API spend.
39
+
40
+ ### Notes
41
+
42
+ - **Trusted Publishing flow is stable.** Second release shipped via OIDC since the v1.75.0 migration; no token to rotate, expire, mis-scope, or 2FA-gate.
43
+ - **Inventory of all 32 missed CC versions** (v2.1.119 → v2.1.150) lives at `.reviews/cc-feature-inventory-2026-05-24.md` (gitignored) with HIGH/MEDIUM/LOW relevance triage and the adoption sequence for follow-up PRs. After verifying against CC's hook docs, H2 (`$CLAUDE_EFFORT` env var) and H8 (Stop hook `background_tasks`) were demoted to non-applicable: `$CLAUDE_EFFORT` is only available in tool-use-context hooks (not our SessionStart-typed `model-effort-check.sh`), and `background_tasks` is only in Stop/SubagentStop input (not our PreCompact-typed `precompact-seam-check.sh`). The existing hook code is correct as-written.
44
+
7
45
  ## [1.75.1] - 2026-05-20
8
46
 
9
47
  ### Fixed
@@ -2981,7 +2981,7 @@ If deployment fails or post-deploy verification catches issues:
2981
2981
 
2982
2982
  **SDLC.md:**
2983
2983
  ```markdown
2984
- <!-- SDLC Wizard Version: 1.75.1 -->
2984
+ <!-- SDLC Wizard Version: 1.77.0 -->
2985
2985
  <!-- Setup Date: [DATE] -->
2986
2986
  <!-- Completed Steps: step-0.1, step-0.2, step-0.4, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
2987
2987
  <!-- Git Workflow: [PRs or Solo] -->
@@ -4433,6 +4433,10 @@ The gap this closes: the advisor tool (API beta, `advisor-tool-2026-03-01`) ship
4433
4433
  | `/less-permission-prompts` | Scans transcripts for common read-only Bash/MCP calls and proposes a prioritized allowlist | After a few sessions — reduces permission friction without auto mode |
4434
4434
  | `/permissions` | Pre-allow specific commands and check them into `.claude/settings.json` | Anytime you want an auditable team allowlist |
4435
4435
  | `/insights` | Local analyzer of your CC session history. Generates HTML report at `~/.claude/usage-data/report.html` + per-session facet JSON at `~/.claude/usage-data/facets/<session>.json`. Surfaces `underlying_goal`, `outcome`, `friction_counts`, `user_satisfaction_counts`, `brief_summary`, recurring friction patterns, suggested CLAUDE.md additions | Monthly — **qualitative-only**; see caveat below |
4436
+ | `/goal <condition>` (v2.1.139+) | Set a completion condition; Claude keeps working across turns until a separate evaluator pass (Haiku default) says it's met. Survives `--resume` (counters reset), not `/clear`. No disk writes — session-state only. Bound it yourself: `/goal "tests pass + git status clean, or stop after 20 turns"`. The evaluator judges the transcript only — it cannot run tools, so don't use `/goal` for "doneness" that lives off-transcript. Requires v2.1.143+ for the subagent-race fix. Composes cleanly with wizard hooks (`UserPromptSubmit`/`SessionStart`/`PreCompact` fire per turn) | Long-running goal-bound work — refactors, migrations, anything where "are we there yet?" has a checkable answer in the transcript |
4437
+ | `/code-review [effort] [--comment]` (v2.1.147+, renamed from `/simplify`) | Reports correctness bugs at chosen effort level; `--comment` posts findings as inline GitHub PR comments. Our /sdlc skill already invokes `/code-review`; the `--comment` flag streamlines CI shepherd workflows | Self-review during SDLC; PR review when shepherding |
4438
+ | `/usage` (v2.1.149+) | Per-category breakdown of limits usage — skills, subagents, plugins, per-MCP-server cost. Complement to `/context all` which shows per-skill per-model token estimates | When investigating session bloat / quota burn — pairs with `scripts/audit-session-load.sh` (#236) |
4439
+ | `/context all` (v2.1.139+) | Rounded token estimates per-skill per-model, names the providing plugin for plugin-sourced skills | Same as `/usage` — diagnose what's eating your context |
4436
4440
 
4437
4441
  These are shipped by Claude Code itself. The wizard doesn't reimplement them — it points you at them so you benefit from the native version's ongoing maintenance.
4438
4442
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agentic-sdlc-wizard",
3
- "version": "1.75.1",
3
+ "version": "1.77.0",
4
4
  "description": "SDLC enforcement for Claude Code — hooks, skills, and wizard setup in one command",
5
5
  "bin": {
6
6
  "sdlc-wizard": "cli/bin/sdlc-wizard.js"
@@ -108,6 +108,10 @@ State your confidence before presenting an approach:
108
108
 
109
109
  Use plan mode for: multi-file changes, new features, LOW confidence, bugs needing investigation. **Skip plan approval step** (auto-approval) when confidence HIGH (95%+) AND single-file/trivial AND no new patterns AND no architectural decisions — still announce approach, don't wait. When in doubt, wait.
110
110
 
111
+ ## Long-Running Goals (`/goal`)
112
+
113
+ Native `/goal <condition>` (**v2.1.143+**). Haiku evaluator re-checks transcript per turn. **Confidence gate — NEVER invoke below HIGH 95%**; below that the evaluator rubber-stamps flailing as progress. **DLC binding — condition MUST name the DLC** (`/sdlc`, `/gdlc`, `/ldlc`, etc.) so the evaluator anchors on "doing it right." **Pre-flight:** trusted workspace; `disableAllHooks`/`allowManagedHooksOnly` both off. **Condition = contract:** end state + check + constraints + hard turn/time bound (no native cap); e.g. `/goal "tests pass + clean tree following /sdlc, stop after 20 turns"`. **Anti-pattern:** evaluator can't call tools — transcript-only. `--resume` resets counters.
114
+
111
115
  ## Recommended Model
112
116
 
113
117
  **Opt-in: `opus[1m]` (Opus 4.7 with 1M context).** `/model opus[1m]` at the start of non-trivial sessions — understand the tradeoff (issue #198). A top-level `model` pin in `.claude/settings.json` disables CC's per-turn auto-selection; pin only when you need 1M headroom. Requires CC v2.1.111+.
@@ -137,7 +141,7 @@ PROTOCOL is universal across domains; only `review_instructions` and `verificati
137
141
 
138
142
  **Convergence:** 2 rounds sweet spot, 3 max (research: 14 repos + 7 papers). After 3 still NOT CERTIFIED → escalate.
139
143
 
140
- **Multi-reviewer / non-code domains:** when running multiple reviewers in parallel (e.g. Claude review + Codex + human), respond per-reviewer (different blind spots, no shared anchoring). For non-code domains (research, persuasion, medical), keep the same handoff format and add `"audience"` + `"stakes"` keys.
144
+ **Multi-reviewer:** parallel reviewers respond per-reviewer (different blind spots, no shared anchoring). **Non-code domains:** same handoff + add `"audience"`/`"stakes"` keys.
141
145
 
142
146
  ### Release Review Focus
143
147
 
@@ -93,13 +93,15 @@ Parse CHANGELOG entries between the user's installed version and latest. Present
93
93
 
94
94
  ```
95
95
  Installed: 1.42.0
96
- Latest: 1.75.1
96
+ Latest: 1.77.0
97
97
 
98
98
  What changed:
99
+ - [1.77.0] release-dry-run.yml + cc-version-drift.yml (#350) + /goal SDLC gates (95% + DLC binding).
100
+ - [1.76.0] /goal /sdlc wrapper (#347) + CC v2.1.150 feature adoption + ROADMAP demand-signal gate (4 excise, 4 kill).
99
101
  - [1.75.1] release-workflow fix — Node 22 → 24 (ships npm 11.x), dropped flaky `npm install -g` self-upgrade (hit MODULE_NOT_FOUND on v1.75.0 publish). Explicit npm-version guard.
100
102
  - [1.75.0] npm Trusted Publishing — `release.yml` swapped from `NPM_TOKEN` to OIDC. No more token rotation. Requires one-time publisher config on the npm package page.
101
- - [1.74.0] Salvage from v1.43 PR: #338 sdlc-skill source-precedence preamble; #235(a)(b) `/insights` guidance; codex stdin-hang `< /dev/null` fix; test-hooks env-isolation.
102
- - [1.73.0] precompact stale REBASE_HEAD fix + bloat sweep — `hooks/precompact-seam-check.sh` no longer HOLDs `/compact` on stale REBASE_HEAD without `rebase-{merge,apply}/` dirs. 15 tracked artifacts deleted (-460 LOC).
103
+ - [1.74.0] v1.43 salvage: #338 precedence preamble; #235ab `/insights`; codex `< /dev/null` stdin-hang fix; test env-isolation.
104
+ - [1.73.0] precompact stale REBASE_HEAD fix + 15 stale artifact deletes (-460 LOC).
103
105
  - [1.72.0] #323 closed — customization-aware `check` recommendation + new `--preserve-customized` flag. `init --force --preserve-customized` skips CUSTOMIZED files (action `PRESERVE`), still OVERWRITEs MATCH and CREATEs MISSING. Default `init --force` unchanged. 10 tests.
104
106
  - [1.71.0–1.69.0] token-bloat sweep #236 — BASELINE + TDD CHECK fire once per `session_id` (-12K, -0.5-1.5K); sdlc-skill Cross-Model Review trimmed.
105
107
  - [1.68.0–1.65.0] roadmap hygiene — five paperwork closes: #97 Anthropic Policy NO-GO + AAR-paper validating parallel; #99 AutoGPT NO-GO; #95 Nous NO-GO; #243 token-history liveness verified; #210 Node-24 false-green; #235 Thoughtworks AI Evals NO-GO. **6/6 external-product audits NO-GO** (continues #76, #77). Research write-ups in `.reviews/research-*.md`.