npm - agentic-sdlc-wizard - Versions diffs - 1.60.0 → 1.61.0 - Mend

agentic-sdlc-wizard 1.60.0 → 1.61.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6) hide show

package/.claude-plugin/marketplace.json +1 -1
package/.claude-plugin/plugin.json +1 -1
package/CHANGELOG.md +27 -0
package/CLAUDE_CODE_SDLC_WIZARD.md +2 -2
package/package.json +1 -1
package/skills/update/SKILL.md +2 -1

package/.claude-plugin/marketplace.json CHANGED Viewed

@@ -13,7 +13,7 @@
       "name": "sdlc-wizard",
       "source": ".",
       "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
-      "version": "1.60.0",
+      "version": "1.61.0",
       "author": {
         "name": "Stefan Ayala"
       },

package/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "sdlc-wizard",
-  "version": "1.60.0",
+  "version": "1.61.0",
   "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
   "author": {
     "name": "Stefan Ayala",

package/CHANGELOG.md CHANGED Viewed

@@ -4,6 +4,33 @@ All notable changes to the SDLC Wizard.
 > **Note:** This changelog is for humans to read. Don't manually apply these changes - just run the wizard ("Check for SDLC wizard updates") and it handles everything automatically.
+## [1.61.0] - 2026-04-30
+### Added
+- **Calibration scenario suite — closes ROADMAP #96 Phase 3 PR 2.** New `tests/e2e/scenarios/calibration-careful-read.md` is the first scenario in a new `calibration-*` family designed specifically to reward self-review and punish rushed implementations. Builds on PR 1 (the lift-proof harness, v1.60.0).
+  The scenario asks the agent to add a `parsePrice(input)` utility that handles five distinct formats (standard, cents-only, comma thousand-separator, surrounding whitespace, invalid). A self-reviewing agent reads all five requirements before coding; a rushed agent skims the first example and ships `parseFloat(s.replace('$', ''))` — which silently corrupts `'$1,000.00'` to `1` (a thousand-fold pricing bug). The score delta between these two agent profiles on this scenario is a load-bearing **calibration signal** for `lift-proof.sh`.
+  Out of scope: actual end-to-end calibration verification (does a low-effort agent actually score lower on this scenario than an xhigh agent?). That's deferred to ROADMAP #212(i) Prove-It Gate paired runs. PR 2 ships the scenario; #212(i) runs the comparison.
+### Tests
+- New `tests/test-calibration-scenarios.sh` (6 tests): asserts a `calibration-*.md` exists, has the standard scenario format (`# Scenario:`, `## Task`, `## Fixture:`, calibration-signal docs), and lists ≥2 numbered requirements (the careful-read multi-path signal).
+- Wired into `ci.yml` and `CONTRIBUTING.md` test list.
+### Files
+- `tests/e2e/scenarios/calibration-careful-read.md` (new) — the parsePrice scenario
+- `tests/test-calibration-scenarios.sh` (new) — format validator
+- `.github/workflows/ci.yml` — new test step
+- `CONTRIBUTING.md` — test list parity
+- `CHANGELOG.md`, `ROADMAP.md`, `SDLC.md`, `CLAUDE_CODE_SDLC_WIZARD.md`, `skills/update/SKILL.md`, `package.json`, `.claude-plugin/plugin.json` + `marketplace.json` (1.60.0 → 1.61.0)
+### What's still TBD
+The `calibration-*` family is intentionally extensible. Future scenarios could test other SDLC virtues — scope guard (don't fix things outside the task), TDD discipline (write tests before implementation), regression hygiene (don't break existing tests). Each new calibration scenario is a separate PR; PR 2 establishes the format.
 ## [1.60.0] - 2026-04-30
 ### Added

package/CLAUDE_CODE_SDLC_WIZARD.md CHANGED Viewed

@@ -2974,7 +2974,7 @@ If deployment fails or post-deploy verification catches issues:
 **SDLC.md:**
 ```markdown
-<!-- SDLC Wizard Version: 1.60.0 -->
+<!-- SDLC Wizard Version: 1.61.0 -->
 <!-- Setup Date: [DATE] -->
 <!-- Completed Steps: step-0.1, step-0.2, step-0.4, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
 <!-- Git Workflow: [PRs or Solo] -->
@@ -4039,7 +4039,7 @@ Walk through updates? (y/n)
 Store wizard state in `SDLC.md` as metadata comments (invisible to readers, parseable by Claude):
 ```markdown
-<!-- SDLC Wizard Version: 1.60.0 -->
+<!-- SDLC Wizard Version: 1.61.0 -->
 <!-- Setup Date: 2026-01-24 -->
 <!-- Completed Steps: step-0.1, step-0.2, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
 <!-- Git Workflow: PRs -->

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "agentic-sdlc-wizard",
-  "version": "1.60.0",
+  "version": "1.61.0",
   "description": "SDLC enforcement for Claude Code — hooks, skills, and wizard setup in one command",
   "bin": {
     "sdlc-wizard": "cli/bin/sdlc-wizard.js"

package/skills/update/SKILL.md CHANGED Viewed

@@ -93,9 +93,10 @@ Parse CHANGELOG entries between the user's installed version and latest. Present
 ```
 Installed: 1.42.0
-Latest:    1.60.0
+Latest:    1.61.0
 What changed:
+- [1.61.0] calibration scenarios for #96 Phase 3 PR 2 — `tests/e2e/scenarios/calibration-careful-read.md` (parsePrice with 5 edge-case formats) tests whether self-review catches missed requirements. Score delta between SDLC and naive agents on this scenario is a calibration signal for `lift-proof.sh`
 - [1.60.0] wizard-installation lift-proof harness (#96 Phase 3 PR 1) — `tests/e2e/lift-proof.sh` runs same scenario on bare vs wizard-installed fixture, emits score delta. Closes the "does the wizard work?" question. Honestly zero-API (sim + eval on Max)
 - [1.59.0] evaluator on Max via `claude --print` (#228) — `EVAL_USE_CLI=1` swaps `evaluate.sh`'s per-criterion judge transport from `curl` → API to `claude --print --output-format json`. local-shepherd.sh sets it by default, so the local path is honestly zero-API
 - [1.58.0] ground-truth gate for E2E benchmark (#96 Phase 2) — `tests/e2e/ground-truth.sh` runs `npm test` post-sim; final score capped at 5 if tests fail. Catches "agent followed protocol but produced broken code"