agentic-sdlc-wizard 1.42.2 → 1.43.0

@@ -13,7 +13,7 @@
  "name": "sdlc-wizard",
  "source": ".",
  "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
- "version": "1.42.2",
+ "version": "1.43.0",
  "author": {
    "name": "Stefan Ayala"
  },
@@ -1,6 +1,6 @@
  {
    "name": "sdlc-wizard",
-   "version": "1.42.2",
+   "version": "1.43.0",
    "description": "SDLC enforcement for AI agents — TDD, planning, self-review, CI shepherd",
    "author": {
      "name": "Stefan Ayala",
package/CHANGELOG.md CHANGED
@@ -4,6 +4,20 @@ All notable changes to the SDLC Wizard.

  > **Note:** This changelog is for humans to read. Don't manually apply these changes - just run the wizard ("Check for SDLC wizard updates") and it handles everything automatically.

+ ## [1.43.0] - 2026-04-27
+
+ ### Added
+
+ - **Token-spike anomaly detection** (ROADMAP #220 closure). New SessionStart hook `hooks/token-spike-check.sh` walks the CC transcript dir (`~/.claude/projects/<sanitized-cwd>/*.jsonl`), sums per-session `usage.{input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens}` from every assistant message with a usage block, and idempotently appends one record per `session_id` to `.metrics/token-history.jsonl`. The hook then warns when the most recent completed session's `costly_tokens` (= `input + cache_creation + output`, excluding the cheap ~$1.50/M `cache_read` tier) exceeds the rolling baseline by more than 2σ. Anthropic's 2026-04-23 post-mortem documented a CC caching bug that "continuously dropped thinking blocks from subsequent requests" — invisible until the invoice arrived; this hook surfaces the same shape of regression the moment it occurs. The `--metric median` mode (default) uses MAD (median absolute deviation) instead of stdev for the spread term, so a single outlier session in the baseline doesn't mask the next genuine spike. Hook is gated on `.metrics/` existing in the project root (opt-in for consumers, on for the wizard repo which already maintains `.metrics/catches.jsonl`). 14 quality tests in `tests/test-token-spike.sh` cover burn calculation against summed transcript fields, idempotent ingest, positive/negative spike detection, the min-baseline floor (no false positives on <5-record windows), the median-vs-mean contrast (both `--metric` modes invoked, asserting median warns and mean does not on an outlier-inflated fixture), flat-baseline minimum-spread floor (1000→1100 suppressed, 1000→50000 still fires), privacy/type-coercion (a malicious transcript with `"USER_SECRET_INPUT"` strings in usage fields cannot leak content into history), concurrent-ingest atomic-lock serialization (parallel ingests produce 1 record per session), and hook gating + warning surface.
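
As a reading aid for the changelog entry above, here is a minimal sketch of the median-mode check it describes: `costly_tokens` as input + cache_creation + output, a rolling median, MAD as the spread term, and a warning when the latest session exceeds median + 2 × spread. This is not the shipped `tests/e2e/token-analytics.sh`. The function names, the 1.4826 factor that scales MAD to a stdev-equivalent, and the 10%-of-median minimum spread are assumptions chosen for illustration; only the 5-record minimum baseline and the 2σ threshold come from the entry itself.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; not the code shipped in the package.
export LC_ALL=C   # keep "." as the decimal separator for sort and awk

# costly_tokens = input + cache_creation + output (cache_read is excluded).
costly_tokens() {
  local input=$1 cache_creation=$2 output=$3
  echo $(( input + cache_creation + output ))
}

# Median of newline-separated numbers on stdin, printed with two decimals.
median() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      m = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
      printf "%.2f\n", m
    }'
}

# Median absolute deviation of newline-separated numbers on stdin.
mad() {
  local values med
  values=$(cat)
  [ -n "$values" ] || return 1
  med=$(printf '%s\n' "$values" | median) || return 1
  printf '%s\n' "$values" |
    awk -v m="$med" '{ d = $1 - m; if (d < 0) d = -d; printf "%.2f\n", d }' |
    median
}

# Usage: is_spike LATEST SIGMA BASELINE...
# Returns 0 (spike) when LATEST > median + SIGMA * spread, else 1.
is_spike() {
  local latest=$1 sigma=$2
  shift 2
  [ "$#" -ge 5 ] || return 1   # min-baseline floor: under 5 records never warns
  local med spread
  med=$(printf '%s\n' "$@" | median) || return 1
  spread=$(printf '%s\n' "$@" | mad) || return 1
  awk -v x="$latest" -v m="$med" -v s="$spread" -v k="$sigma" 'BEGIN {
    s *= 1.4826              # ASSUMPTION: scale MAD to a stdev-equivalent
    min_spread = 0.10 * m    # ASSUMPTION: hypothetical minimum-spread floor
    if (s < min_spread) s = min_spread
    exit !(x > m + k * s)
  }'
}
```

Under these assumed constants, a flat baseline of five 1000-token sessions gives a threshold of 1200, so `is_spike 1100 2 1000 1000 1000 1000 1000` returns non-zero (no warning) while `is_spike 50000 2 1000 1000 1000 1000 1000` returns zero, mirroring the 1000→1100 and 1000→50000 cases the entry says the tests cover.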
+
+ ### Files
+
+ - New `hooks/token-spike-check.sh` (SessionStart, opt-in)
+ - New `tests/e2e/token-analytics.sh` (writer + checker engine; supports `--ingest`, `--check`, `--report`, `--metric median|mean`, `--window`, `--threshold-sigma`)
+ - New `tests/test-token-spike.sh` (14 quality tests)
+ - Hook registered in `hooks/hooks.json` and `.claude/settings.json` SessionStart event
+ - `SDLC.md` hooks table + file tree updated
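
The changelog entry mentions idempotent ingest (one record per `session_id`) serialized by an atomic lock, which the skill summary calls a mkdir lock. The following is a rough sketch of that general technique, not the engine's actual code. The function name `append_once`, the `.lock` suffix, the retry limit, and the compact `"session_id":"..."` record layout that the `grep` relies on are all hypothetical.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; names and record layout are hypothetical.

# Append RECORD to HISTORY unless a record for SESSION_ID is already present.
# A directory created with mkdir serves as the lock: mkdir either creates the
# directory or fails, so only one process at a time gets past it.
append_once() {
  local history=$1 session_id=$2 record=$3
  local lock="$history.lock" tries=0

  until mkdir "$lock" 2>/dev/null; do
    tries=$((tries + 1))
    [ "$tries" -lt 100 ] || return 1   # arbitrary limit: give up after ~10 s
    sleep 0.1                          # fractional sleep: GNU/BSD, not strict POSIX
  done

  # The check and the append both happen while the lock is held.
  if ! grep -qF "\"session_id\":\"$session_id\"" "$history" 2>/dev/null; then
    printf '%s\n' "$record" >> "$history"
  fi

  rmdir "$lock"
}
```

Because the existence check and the append sit inside the same critical section, parallel callers with the same `session_id` end up writing a single record. A process killed while holding the lock would leave a stale lock directory behind; this sketch does not handle that case.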
+

  ## [1.42.2] - 2026-04-26

  ### Documented
@@ -2918,7 +2918,7 @@ If deployment fails or post-deploy verification catches issues:

  **SDLC.md:**
  ```markdown
- <!-- SDLC Wizard Version: 1.42.2 -->
+ <!-- SDLC Wizard Version: 1.43.0 -->
  <!-- Setup Date: [DATE] -->
  <!-- Completed Steps: step-0.1, step-0.2, step-0.4, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
  <!-- Git Workflow: [PRs or Solo] -->
@@ -3983,7 +3983,7 @@ Walk through updates? (y/n)
  Store wizard state in `SDLC.md` as metadata comments (invisible to readers, parseable by Claude):

  ```markdown
- <!-- SDLC Wizard Version: 1.42.2 -->
+ <!-- SDLC Wizard Version: 1.43.0 -->
  <!-- Setup Date: 2026-01-24 -->
  <!-- Completed Steps: step-0.1, step-0.2, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
  <!-- Git Workflow: PRs -->
package/hooks/hooks.json CHANGED
@@ -39,6 +39,10 @@
      {
        "type": "command",
        "command": "${CLAUDE_PLUGIN_ROOT}/hooks/model-effort-check.sh"
+     },
+     {
+       "type": "command",
+       "command": "${CLAUDE_PLUGIN_ROOT}/hooks/token-spike-check.sh"
      }
    ]
  }
@@ -0,0 +1,60 @@
+ #!/bin/bash
+ # SessionStart hook — token-spike anomaly detection (ROADMAP #220).
+ #
+ # Reads CC transcript history, computes per-session token burn, and warns
+ # if the last completed session's burn deviates >2σ above the rolling median.
+ # Catches silent CC-side regressions (caching bugs, prompt-inflation defaults)
+ # that only otherwise surface on the invoice. Reference: Anthropic 2026-04-23
+ # post-mortem on the dropped-thinking-blocks caching bug.
+ #
+ # Gated on `.metrics/` directory existing in the project root — opt-in for
+ # consumers, on-by-default for the wizard repo (which already maintains
+ # `.metrics/catches.jsonl` for the effectiveness scoreboard).
+ #
+ # Non-blocking: always exits 0.
+
+ # Token-bloat fix: when both project + plugin register this hook, plugin yields.
+ HOOK_DIR="${BASH_SOURCE[0]%/*}"
+ [ "$HOOK_DIR" = "${BASH_SOURCE[0]}" ] && HOOK_DIR="."
+ # shellcheck disable=SC1091
+ source "$HOOK_DIR/_find-sdlc-root.sh"
+ dedupe_plugin_or_project "${BASH_SOURCE[0]}" || { [ ! -t 0 ] && cat > /dev/null; exit 0; }
+
+ # Drain stdin (SessionStart sends JSON; we don't need any of it)
+ [ ! -t 0 ] && cat > /dev/null
+
+ ROOT="${CLAUDE_PROJECT_DIR:-$PWD}"
+
+ # Gate 1: opt-in via .metrics/ directory
+ [ -d "$ROOT/.metrics" ] || exit 0
+
+ # Gate 2: analytics script must exist. Resolve hook-relative first so the
+ # wizard repo's hook always finds its own analytics regardless of how
+ # CLAUDE_PROJECT_DIR is set (e.g., test fixtures pointing at a tmp dir).
+ # Fall back to project-relative for consumer forks that ship the script.
+ ANALYTICS=""
+ for candidate in \
+     "$HOOK_DIR/../tests/e2e/token-analytics.sh" \
+     "$ROOT/tests/e2e/token-analytics.sh"; do
+     if [ -x "$candidate" ]; then
+         ANALYTICS="$candidate"
+         break
+     fi
+ done
+ [ -n "$ANALYTICS" ] || exit 0
+
+ # Gate 3: jq is required by the analytics script
+ command -v jq > /dev/null 2>&1 || exit 0
+
+ ARGS=(--history "$ROOT/.metrics/token-history.jsonl" --ingest --check)
+
+ # Test override: SDLC_TOKEN_SPIKE_TRANSCRIPT_DIR points the ingest at a
+ # fixture directory instead of the real ~/.claude/projects/... path.
+ if [ -n "$SDLC_TOKEN_SPIKE_TRANSCRIPT_DIR" ]; then
+     ARGS+=(--transcript-dir "$SDLC_TOKEN_SPIKE_TRANSCRIPT_DIR" --no-skip-recent)
+ fi
+
+ OUTPUT=$("$ANALYTICS" "${ARGS[@]}" 2>&1) || true
+ [ -n "$OUTPUT" ] && echo "$OUTPUT"
+
+ exit 0
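
Gate 2 in the hook above picks the first executable candidate, trying the hook-relative path before the project-relative one. The same pattern can be sketched as a small standalone function. `first_executable` is a hypothetical helper name, not something the package defines, and the extra `-f` test (which rejects directories) is an addition the hook itself does not make.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; first_executable is a hypothetical helper.

# Print the first argument that is an executable regular file.
# Returns 1 when no candidate qualifies, so callers can exit quietly.
first_executable() {
  local candidate
  for candidate in "$@"; do
    if [ -f "$candidate" ] && [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

A caller could then write `ANALYTICS=$(first_executable "$HOOK_DIR/../tests/e2e/token-analytics.sh" "$ROOT/tests/e2e/token-analytics.sh") || exit 0`, which keeps the hook's non-blocking behavior when neither path exists.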
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "agentic-sdlc-wizard",
-   "version": "1.42.2",
+   "version": "1.43.0",
    "description": "SDLC enforcement for Claude Code — hooks, skills, and wizard setup in one command",
    "bin": {
      "sdlc-wizard": "cli/bin/sdlc-wizard.js"
@@ -131,9 +131,10 @@ Parse all CHANGELOG entries between the user's installed version and the latest.

  ```
  Installed: 1.24.0
- Latest: 1.42.2
+ Latest: 1.43.0

  What changed:
+ - [1.43.0] Token-spike anomaly detection — ROADMAP #220 closure. New `hooks/token-spike-check.sh` (SessionStart, opt-in via `.metrics/`) ingests CC transcript usage (`input_tokens` / `output_tokens` / `cache_creation_input_tokens` / `cache_read_input_tokens`) into `.metrics/token-history.jsonl`, then warns when the last session's `costly_tokens` (input + cache_creation + output, excluding the cheap cache_read tier) exceeds median + 2σ over a rolling baseline. Catches silent CC-side caching regressions (per Anthropic's 2026-04-23 post-mortem) before they surface on the invoice. Uses MAD-based spread for the median metric so a single baseline outlier doesn't mask the next spike. 14 quality tests in `tests/test-token-spike.sh` (incl. malicious-transcript privacy probe, flat-baseline floor, median-vs-mean contrast, concurrent-ingest mkdir lock).
  - [1.42.2] PreCompact self-heal documented — ROADMAP #209 closure. Added `pr_number` opt-in to all 3 handoff template schemas (skill Step 1; wizard Round 1 + cross-model section). Self-heal logic shipped earlier with #229 but was undocumented, leaving the dead-code path. New `test_handoff_template_documents_pr_number` enforces template/doc parity. Together with #229 (mtime auto-expire) closes the "stuck PENDING handoff blocks /compact forever" footgun from both directions.
  - [1.42.1] CI hygiene fix — skip Claude PR review on wizard self-PRs. 7 self-PRs (v1.39.0–v1.42.0) had shipped with red `review` job (API canary firing on dead credit balance). Treated as "expected" but red normalizes red. Workflow `if:` now skips review on `BaseInfinity/claude-sdlc-wizard` repo only; consumer projects unaffected. 7 quality tests, mutation-verified (== inversion fails).
  - [1.42.0] AGENTS.md interop detection — ROADMAP #205 phase (a). Setup wizard auto-scan now lists AGENTS.md (cross-tool agent-instructions standard, CC issue #6235); new Step 4.5 surfaces a 3-way decision (dual-maintain / merge / skip) when AGENTS.md is detected. Phase (b) write-fresh and phase (d) drift-test deferred. 7 quality tests.