agentic-sdlc-wizard 1.33.0 → 1.35.0
- package/.claude-plugin/marketplace.json +1 -1
- package/.claude-plugin/plugin.json +1 -1
- package/CHANGELOG.md +52 -0
- package/CLAUDE_CODE_SDLC_WIZARD.md +106 -38
- package/README.md +19 -1
- package/cli/init.js +1 -0
- package/cli/templates/settings.json +11 -4
- package/hooks/hooks.json +11 -0
- package/hooks/instructions-loaded-check.sh +104 -4
- package/hooks/precompact-seam-check.sh +75 -0
- package/hooks/sdlc-prompt-check.sh +63 -0
- package/package.json +1 -1
- package/skills/sdlc/SKILL.md +44 -7
- package/skills/setup/SKILL.md +44 -11
- package/skills/update/SKILL.md +50 -1
package/CHANGELOG.md
CHANGED
|
@@ -4,6 +4,58 @@ All notable changes to the SDLC Wizard.
|
|
|
4
4
|
|
|
5
5
|
> **Note:** This changelog is for humans to read. Don't manually apply these changes - just run the wizard ("Check for SDLC wizard updates") and it handles everything automatically.
|
|
6
6
|
|
|
7
|
+
## [1.35.0] - 2026-04-19
|
|
8
|
+
|
|
9
|
+
### Added
|
|
10
|
+
|
|
11
|
+
- **PreCompact seam gate** (#205 / ROADMAP #208). New `hooks/precompact-seam-check.sh` blocks manual `/compact` when `.reviews/handoff.json` status is `PENDING_REVIEW`/`PENDING_RECHECK` or a git rebase/merge/cherry-pick is in flight. Matcher is `manual` — auto-compact is deliberately NOT gated (blocking it could push past 100% context). Requires Claude Code v2.1.105+. 10 quality tests.
|
|
12
|
+
- **Self-healing PreCompact** (#206 / ROADMAP #209). The hook now treats a stale `PENDING_*` handoff as implicitly `CERTIFIED` when the optional `pr_number` field is present and `gh pr view <N>` reports `MERGED`. Fixes the "forgot to flip status to CERTIFIED after merge" consumer bug. Graceful fallback: if `pr_number` is absent, `gh` is missing, the machine is offline, or any other error occurs, the existing block behavior applies. 4 new quality tests with a mocked `gh` binary.
|
|
13
|
+
- **Dynamic effort auto-bump hook** (#202 / ROADMAP #195). `sdlc-prompt-check.sh` scans `UserPromptSubmit` payload for LOW-confidence / FAILED-repeatedly / CONFUSED phrases and logs a timestamped signal. At ≥2 recent signals in a 30-min window, emits a loud `!! EFFORT BUMP REQUIRED !!` block with the exact `/effort xhigh` command. 8 quality tests.
|
|
14
|
+
- **Loud staleness nudge** (#201 / ROADMAP #196). `instructions-loaded-check.sh` now caches npm-latest for 24h and prints a loud multi-line warning when the installed wizard is ≥3 minor versions behind. 1–2 minor keeps the existing mild one-liner.
|
|
15
|
+
- **Session-start CC auto-update nudge** (#192 / ROADMAP #85). Instructions-loaded hook queries for open auto-update PRs and nudges the user to review before compacting.
|
|
16
|
+
- **Hook token-cost caps** (#203). 4 new size-cap tests across every hook. Negative control injects echo bloat to prove the caps trip.
|
|
17
|
+
- **Permissions allowlist** (#204). Top read-only tools pre-approved in `.claude/settings.json` to cut permission prompts during automation.
|
|
18
|
+
- **Codex-audit-on-CI-logs shepherd step** (docs). SDLC skill now requires running `codex exec xhigh` against both Tier 1 and Tier 2 CI logs separately — catches silent failures, degraded metrics, and warnings-promoted-to-errors that the green checkmark hides. Dogfooded on PR #206: caught 4 P1s in pre-existing CI infra (tracked as ROADMAP #210, #211, #215 + regression of #93).
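The staleness thresholds in the "Loud staleness nudge" entry above can be illustrated with a small sketch. This is not the code in `instructions-loaded-check.sh` (which is a shell hook); the function names, the handling of a major-version gap, and the choice to ignore patch-only differences are assumptions made for illustration.

```python
from datetime import datetime, timedelta

CACHE_TTL = timedelta(hours=24)   # changelog: npm-latest is cached for 24h
LOUD_AT_MINORS_BEHIND = 3         # changelog: loud warning at >=3 minor versions behind


def cache_is_fresh(fetched_at: datetime, now: datetime) -> bool:
    """True while a cached npm-latest lookup is younger than the TTL."""
    return timedelta(0) <= now - fetched_at < CACHE_TTL


def staleness_level(installed: str, latest: str) -> str:
    """Return 'none', 'mild', or 'loud' for an installed vs. latest version."""
    inst_major, inst_minor = (int(p) for p in installed.split(".")[:2])
    late_major, late_minor = (int(p) for p in latest.split(".")[:2])
    if (late_major, late_minor) <= (inst_major, inst_minor):
        return "none"  # assumption: patch-only gaps are not nudged
    if late_major > inst_major:
        return "loud"  # assumption: the changelog does not cover major gaps
    behind = late_minor - inst_minor
    return "loud" if behind >= LOUD_AT_MINORS_BEHIND else "mild"
```

Under these assumptions, `staleness_level("1.33.0", "1.35.0")` falls in the mild one-liner range, while `staleness_level("1.32.0", "1.35.0")` crosses into the loud warning.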
|
|
19
|
+
|
|
20
|
+
### Fixed
|
|
21
|
+
|
|
22
|
+
- **CI persist-to-PR-branch race** (#196 / ROADMAP #193). Tier 2 no longer aborts the whole run on a single low-score trial; records the trial instead.
|
|
23
|
+
- **Setup wizard `allowedTools` → `permissions.allow`** (#200 / ROADMAP #197). CC v2.1 renamed the field; setup template updated.
|
|
24
|
+
- **Setup wizard `opus[1m]` opt-in** (#199 / ROADMAP #198). Default no longer force-pins; respects explicit user choice.
|
|
25
|
+
- **SessionStart hook model-field absence** (#180 / commit 3b23860). The model isn't exposed in the SessionStart payload, so the hook now checks the effort setting only.
|
|
26
|
+
|
|
27
|
+
### Docs
|
|
28
|
+
|
|
29
|
+
- Memory lessons promoted (#194): tier2 exit-code pattern + pipeline liveness.
|
|
30
|
+
- Community paths (#191 / ROADMAP #98): issue + PR templates + Discussions enabled.
|
|
31
|
+
- ROADMAP backlog filed this cycle: #210 (Node 24 false-green), #211 (Tier 1 "11/10" score), #212 (local-Max E2E), #213 (ship degradation env vars by default — blocked on #214), #214 (adaptive-thinking A/B Prove-It), #215 (Tier 2 persist dead code), #216 (repo rename).
|
|
32
|
+
|
|
33
|
+
## [1.34.0] - 2026-04-17
|
|
34
|
+
|
|
35
|
+
### Added
|
|
36
|
+
- Memory Audit Protocol for promoting private-memory lessons to shared docs (#189)
|
|
37
|
+
- New `/sdlc` subsection under "After Session (Capture Learnings)" defines a three-bucket classifier (`promote` / `keep` / `manual-review`) with a rule-based privacy denylist (`type: user`/`reference` → keep, `project`/`feedback` → manual-review)
|
|
38
|
+
- YAML frontmatter parser in `tests/test-memory-audit-protocol.sh` normalizes inline comments, quoted values, and whitespace so variants like `type: "user" # external` still route to keep
|
|
39
|
+
- `SDLC.md` now has a `## Lessons Learned` section seeded with 7 verified technical gotchas (GH CLI stdout, `workflows` YAML scope, GITHUB_TOKEN workflow triggers, GHA `${{ }}` backtick substitution, macOS bash 3.x, stderr/stdout separation for JSON parsing, `continue-on-error` + `||` masking); each entry cites its originating PR or incident date and was re-verified with a runnable repro before promotion
|
|
40
|
+
- 10-fixture corpus at `tests/fixtures/memory-audit-corpus/` (6 promote / 2 keep / 2 manual-review) with `test_expected` frontmatter seeds the future LLM-gated quality runner
|
|
41
|
+
- 12-test protocol suite covers structure, rule-based denylist, YAML-variant hardening, corpus consistency (promote fixtures route to manual-review under rule-based), and corpus shape
|
|
42
|
+
- Codex xhigh 3-round code review: 4/10 → 8/10 → 10/10 CERTIFIED. Caught two false lessons in private memory (`${3:-{}}` brace-default claim and `--argjson result` jq-conflict claim) that were retracted with dated strikethroughs — the protocol's first real use prevented its own false claims from shipping
|
|
43
|
+
- CLI distributes skill updates + new SDLC.md section; CI wire-up in `.github/workflows/ci.yml` (validate job)
|
|
44
|
+
- API feature detection shepherd for Claude API release notes (#100, PRs #184, #186, #187)
|
|
45
|
+
- LLM-free weekly detector at `.github/workflows/weekly-api-update.yml` polls `platform.claude.com/docs/en/release-notes/api.md`
|
|
46
|
+
- `scripts/parse-api-changelog.py` parses ATX date headers with ordinal-date normalizer and bullet-summary capture (non-date sub-headers like `#### SDKs` no longer terminate bullet extraction); 200-char truncation with ellipsis; tab scrub
|
|
47
|
+
- `scripts/persist-api-state.sh` writes last-seen date with branch-protection-safe non-blocking push; opens/updates a single `api-review-needed` tracking issue with enriched bullet summaries (not just dates)
|
|
48
|
+
- `instructions-loaded-check.sh` nudges at session start when open issues exist; gated on local workflow presence so consumer forks see only their own detector's issues
|
|
49
|
+
- 33 tests including 8 fixture-based parser tests (bullet capture, subheader boundary, tab scrub, truncation, ordinal dates) and 2 integration tests
|
|
50
|
+
- Codex xhigh 5 rounds across 2 PRs: 9/10 CERTIFIED. Found-in-prod P0 hotfix in #187 — `gh api` writes JSON error bodies to stdout (not stderr), so the label-create `already_exists` check was broken after the first successful dispatch; pattern now captures both streams
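The parser behaviors listed for `scripts/parse-api-changelog.py` (ordinal-date normalization, tab scrub, 200-character truncation with an ellipsis) might look roughly like the sketch below. The real script is not shown in this diff, so the header format (`### April 17th, 2026`), the function names, and the decision to count the ellipsis inside the 200-character limit are all assumptions.

```python
import re
from datetime import date, datetime

ORDINAL_SUFFIX = re.compile(r"(\d{1,2})(st|nd|rd|th)\b")


def parse_header_date(header: str):
    """Parse an ATX header such as '### April 17th, 2026' into a date.

    Returns None for non-date sub-headers (for example '#### SDKs'),
    so callers can keep collecting bullets instead of stopping.
    """
    text = header.lstrip("#").strip()
    text = ORDINAL_SUFFIX.sub(r"\1", text)  # '17th' -> '17'
    try:
        return datetime.strptime(text, "%B %d, %Y").date()
    except ValueError:
        return None


def summarize_bullet(text: str, limit: int = 200) -> str:
    """Replace tabs, collapse whitespace, and truncate with an ellipsis."""
    cleaned = " ".join(text.replace("\t", " ").split())
    if len(cleaned) <= limit:
        return cleaned
    # Assumption: the ellipsis counts toward the limit.
    return cleaned[: limit - 3].rstrip() + "..."
```

Returning `None` for sub-headers rather than raising is one way to get the "non-date sub-headers no longer terminate bullet extraction" behavior described above.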
|
|
51
|
+
|
|
52
|
+
### Fixed
|
|
53
|
+
- `gh api` error handling in `weekly-api-update.yml` now captures stdout+stderr together for `already_exists` detection on label creation (#187). Added as portable lesson in `SDLC.md` Lessons Learned
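The lesson behind this fix is that an error body can arrive on stdout, so checking stderr alone misses it. A minimal sketch of the "capture both streams" pattern follows; it uses Python's `subprocess` rather than the workflow's actual shell, and the helper names are hypothetical.

```python
import subprocess
import sys


def run_capturing_both(cmd):
    """Run cmd and return (exit_code, combined stdout+stderr text)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream as stdout
        text=True,
        check=False,               # a nonzero exit is expected and inspected below
    )
    return proc.returncode, proc.stdout


def label_already_exists(exit_code: int, output: str) -> bool:
    """True when a failed call failed only because the label already exists."""
    return exit_code != 0 and "already_exists" in output
```

To try it without `gh`, pass a stand-in command that prints an `already_exists` body to stdout and exits nonzero, for example `[sys.executable, "-c", "import sys; print('already_exists'); sys.exit(1)"]`.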
|
|
54
|
+
|
|
55
|
+
### Docs
|
|
56
|
+
- `/less-permission-prompts` Claude Code native skill surfaced in wizard and setup documentation (#183)
|
|
57
|
+
- README community section restyled with visual Discord badge for Automation Station
|
|
58
|
+
|
|
7
59
|
## [1.33.0] - 2026-04-17
|
|
8
60
|
|
|
9
61
|
### Added
|
package/CLAUDE_CODE_SDLC_WIZARD.md
CHANGED
|
@@ -403,7 +403,7 @@ The `if` field on individual hook handlers filters by tool name AND arguments us
|
|
|
403
403
|
| `matcher` | Group (all hooks in array) | Tool name only | Regex (`Write\|Edit`) |
|
|
404
404
|
| `if` | Individual handler | Tool name + arguments | Permission rule (`Edit(src/**)`) |
|
|
405
405
|
|
|
406
|
-
**Pattern examples:** `Edit(*.ts)`, `Write(src/**)`, `Bash(git *)`. Same syntax as `
|
|
406
|
+
**Pattern examples:** `Edit(*.ts)`, `Write(src/**)`, `Bash(git *)`. Same syntax as `permissions.allow` in settings.json.
|
|
407
407
|
|
|
408
408
|
**Only works on tool-use events:** `PreToolUse`, `PostToolUse`, `PostToolUseFailure`. Adding `if` to non-tool events prevents the hook from running.
|
|
409
409
|
|
|
@@ -846,6 +846,22 @@ Two tools for managing context — use the right one:
|
|
|
846
846
|
|
|
847
847
|
**Best practice:** Put persistent instructions in CLAUDE.md (survives both `/compact` and `/clear`), not in conversation.
|
|
848
848
|
|
|
849
|
+
### Compact at Seams, Not Thresholds (PreCompact hook)
|
|
850
|
+
|
|
851
|
+
**The threshold is the trigger, not the decision.** 25-30% remaining (~70-75% used) is the commonly cited "sweet spot", but it ignores *what you're doing* at that moment. Compacting mid-Codex-round loses the round-1 evidence and certify conditions that round 2 needs to re-verify. Compacting mid-rebase strands the operation without the context that set it up.
|
|
852
|
+
|
|
853
|
+
A **seam** is a point where losing conversational context is safe:
|
|
854
|
+
- Commit boundary (change persisted to git)
|
|
855
|
+
- Codex `CERTIFIED` (review cycle closed)
|
|
856
|
+
- PR merged (work shipped)
|
|
857
|
+
- ROADMAP item marked DONE
|
|
858
|
+
|
|
859
|
+
The wizard's `PreCompact` hook (`hooks/precompact-seam-check.sh`) enforces this for **manual** `/compact` only — it reads `.reviews/handoff.json` and blocks with `HOLD` + exit 2 when status is `PENDING_REVIEW` / `PENDING_RECHECK`, and also blocks when a git rebase, merge, or cherry-pick is in progress. Auto-compact is **not** gated — blocking it could push past 100% context and lose everything. Requires Claude Code **v2.1.105+** (PreCompact event introduced 2026-04-13).
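The decision the hook makes can be sketched as a pure function. This is an illustration in Python, not the contents of `hooks/precompact-seam-check.sh`: the git marker paths are the ones git itself creates during these operations, but whether the hook checks exactly these, the fail-open handling of an unreadable handoff file, and all function names are assumptions. The `pr_state` argument models the self-healing behavior from the 1.35.0 changelog entry (#206).

```python
import json
from pathlib import Path

BLOCKING_STATUSES = {"PENDING_REVIEW", "PENDING_RECHECK"}
# Paths git creates under .git while a rebase, merge, or cherry-pick is in flight.
GIT_MARKERS = ("rebase-merge", "rebase-apply", "MERGE_HEAD", "CHERRY_PICK_HEAD")


def load_handoff(path):
    """Read the handoff JSON; return {} if missing or unreadable (assumed fail-open)."""
    try:
        data = json.loads(Path(path).read_text())
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def git_busy(git_dir) -> bool:
    """True if any in-progress marker exists under the given .git directory."""
    return any((Path(git_dir) / marker).exists() for marker in GIT_MARKERS)


def seam_decision(trigger, handoff, git_is_busy, pr_state=None):
    """Return (exit_code, message); exit code 2 blocks the compact."""
    if trigger != "manual":
        return 0, ""  # auto-compact is never gated
    pending = handoff.get("status") in BLOCKING_STATUSES
    # Self-healing: a PENDING_* handoff whose PR is already merged counts as certified.
    healed = handoff.get("pr_number") is not None and pr_state == "MERGED"
    if pending and not healed:
        return 2, "HOLD: review handoff is " + handoff["status"]
    if git_is_busy:
        return 2, "HOLD: git rebase, merge, or cherry-pick in progress"
    return 0, ""
```

Keeping the decision separate from file and git access makes each rule easy to test in isolation, which is roughly what the changelog's mocked-`gh` tests do for the real hook.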
|
|
860
|
+
|
|
861
|
+
**What's NOT checked:** in-progress TodoWrite tasks. Claude Code does not persist TodoWrite state to a file readable from a hook, so "finish the current todo first" is on you, not the hook. Watch the TodoWrite panel before you `/compact`.
|
|
862
|
+
|
|
863
|
+
Override: resolve the blocker (certify the review, finish the rebase), or temporarily disable the hook in `.claude/settings.json`. Don't suppress the warning reflexively — the warning is the point.
|
|
864
|
+
|
|
849
865
|
### Autocompact Tuning
|
|
850
866
|
|
|
851
867
|
Override the default auto-compact threshold with environment variables. These are community-discovered settings referenced in upstream issues ([#34332](https://github.com/anthropics/claude-code/issues/34332), [#42375](https://github.com/anthropics/claude-code/issues/42375)) — not yet officially documented by Anthropic. For a rigorous benchmarking methodology to validate these thresholds, see [AUTOCOMPACT_BENCHMARK.md](AUTOCOMPACT_BENCHMARK.md).
|
|
@@ -855,7 +871,9 @@ Override the default auto-compact threshold with environment variables. These ar
|
|
|
855
871
|
| `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` | Trigger compaction at this % of context capacity (1-100) | ~95% |
|
|
856
872
|
| `CLAUDE_CODE_AUTO_COMPACT_WINDOW` | Override context capacity in tokens (useful for 1M models) | Model default |
|
|
857
873
|
|
|
858
|
-
**
|
|
874
|
+
**Opt-in (issue #198):** The SDLC Wizard CLI ships `.claude/settings.json` with **no** `model` or `env` pin so Claude Code's auto-mode stays enabled. The setup skill's Step 9.5 asks whether to opt into `"model": "opus[1m]"` + `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30` (tuned for the 1M window — compacts at ~300K). Default answer is **No**. Pinning the model at the top level tells Claude Code you've explicitly chosen a model and turns off per-turn model auto-selection — a real tradeoff, so we ask. Power users who want guaranteed Opus 4.7 + 1M context answer yes.
|
|
875
|
+
|
|
876
|
+
To opt in by hand, edit `.claude/settings.json`:
|
|
859
877
|
|
|
860
878
|
```json
|
|
861
879
|
{
|
|
@@ -876,7 +894,7 @@ export CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30
|
|
|
876
894
|
|
|
877
895
|
| Use Case | AUTOCOMPACT % | Why |
|
|
878
896
|
|----------|--------------|-----|
|
|
879
|
-
| **SDLC
|
|
897
|
+
| **Opt-in SDLC setup (`opus[1m]`)** | **30%** | **Fires at ~300K on 1M — right balance for plan + TDD + review sessions. Paired with the opt-in `opus[1m]` pin (see issue #198)** |
|
|
880
898
|
| General development (200K `opus`) | 75% | Leaves room for implementation after planning |
|
|
881
899
|
| Complex refactors (200K `opus`) | 80% | Slightly more context before compaction |
|
|
882
900
|
| CI pipelines | 60% | Short tasks, compact early to stay fast |
|
|
@@ -892,31 +910,33 @@ The thresholds above are community consensus — not empirically validated. For
|
|
|
892
910
|
|
|
893
911
|
### 1M vs 200K Context Window
|
|
894
912
|
|
|
895
|
-
Claude Code supports both 200K and 1M context windows.
|
|
913
|
+
Claude Code supports both 200K and 1M context windows. **`opus[1m]` is an opt-in power-user pin** — ask yourself whether you actually need the headroom before setting it, because pinning the model at the top level disables Claude Code's auto-mode (see issue #198).
|
|
896
914
|
|
|
897
|
-
| | 200K Context (
|
|
915
|
+
| | 200K Context (default / auto-mode) | 1M Context (`opus[1m]`, opt-in) |
|
|
898
916
|
|---|---|---|
|
|
899
|
-
| **Best for** |
|
|
917
|
+
| **Best for** | Most work — auto-mode picks Sonnet/Opus per turn | Multi-feature / long plan+TDD+review cycles where a single session really crosses 100K+ |
|
|
900
918
|
| **Typical usage** | 50-80K tokens per task | 50-80K typical, up to 200K+ for complex workflows |
|
|
901
919
|
| **Cost** | Standard pricing | Anthropic currently lists the 1M window at standard pricing across the full context for supported Opus/Sonnet models — **verify current rates at [docs.anthropic.com/pricing](https://docs.anthropic.com/)** before assuming no premium |
|
|
902
|
-
| **Auto-
|
|
903
|
-
| **
|
|
920
|
+
| **Auto-mode** | **Enabled** — Claude Code chooses model per turn | **Disabled** — top-level `model` tells CC you've chosen explicitly |
|
|
921
|
+
| **Auto-compact** | Default ~95% works well | Fires at ~76K by default ([issue #34332](https://github.com/anthropics/claude-code/issues/34332)) — pair with `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30` |
|
|
922
|
+
| **Suggested override (if you pin)** | `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=75` | `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30` or `CLAUDE_CODE_AUTO_COMPACT_WINDOW=400000` |
|
|
923
|
+
|
|
924
|
+
**Why `opus[1m]` is opt-in (issue #198):**
|
|
925
|
+
- **Pinning disables auto-mode.** Max-plan users pay for Claude Code's per-turn model selection (Sonnet for cheap tasks, Opus for hard ones, plus weekly-limit smoothing). A top-level `model` gives that up.
|
|
926
|
+
- **The 1M headroom has to earn it.** If your typical session stays under 150K, you're giving up auto-mode for headroom you're not using.
|
|
927
|
+
- **Power users who want guaranteed Opus 4.7 + 1M** — go ahead, it's a real win for long shepherding sessions. Just make it a conscious choice, not a silent default.
|
|
904
928
|
|
|
905
|
-
**
|
|
906
|
-
- **Long SDLC sessions accumulate context fast** — plan → TDD → review → CI shepherd on a single feature regularly crosses 100K tokens
|
|
907
|
-
- **Safety margin against autocompact loss** — cheaper to have headroom than to re-read files after a forced compact
|
|
908
|
-
- **At time of writing, Anthropic lists 1M context at standard pricing for supported Opus/Sonnet models.** Verify current rates for your plan before relying on this — see [docs.anthropic.com/pricing](https://docs.anthropic.com/)
|
|
929
|
+
**Opt in when:** you routinely cross 100K tokens in a single session (plan → TDD → review → CI shepherd on one feature), you want Opus 4.7 specifically (not Sonnet), and you're OK losing auto-mode.
|
|
909
930
|
|
|
910
|
-
**
|
|
931
|
+
**Stay on auto-mode (default) when:** you're unsure, your work is mixed short/long, or you want Claude Code to do the model math for you.
|
|
911
932
|
|
|
912
|
-
**
|
|
913
|
-
|
|
914
|
-
|
|
915
|
-
- Your team has cost controls that flag >200K prompts
|
|
933
|
+
**How to opt in:** run `/model opus[1m]` in your session (transient), or set `"model": "opus[1m]"` in `.claude/settings.json` (persistent). Requires Claude Code v2.1.111+ for Opus 4.7. The setup wizard's Step 9.5 also asks once, with default No.
|
|
934
|
+
|
|
935
|
+
**How to opt out:** remove the `model` line from `.claude/settings.json`, or run `/model` and pick "Default (recommended)".
|
|
916
936
|
|
|
917
937
|
**Cost awareness:** Larger windows let you consume more tokens in one session, and total cost always scales with tokens consumed regardless of tier. Use `/cost` to monitor — a 900K-token session is meaningfully more expensive than an 80K one even at standard rates.
|
|
918
938
|
|
|
919
|
-
**Autocompact pairing (important):** If you
|
|
939
|
+
**Autocompact pairing (important):** If you opt into `opus[1m]`, also set `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30` — otherwise CC's default autocompact fires at ~76K and destroys the headroom you're paying for. Step 9.5 writes both together when you opt in.
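The numbers in this section follow from one multiplication: the override is a percentage of context capacity. A minimal sketch, with a hypothetical function name:

```python
def compact_trigger_tokens(window_tokens: int, pct_override: int) -> int:
    """Token count at which auto-compact fires for a given window and override."""
    if not 1 <= pct_override <= 100:
        raise ValueError("pct_override must be between 1 and 100")
    return window_tokens * pct_override // 100
```

With a 1,000,000-token window and an override of 30, this gives 300,000 tokens, matching the "~300K on 1M" figure; with a 200,000-token window and an override of 75, it gives 150,000.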
|
|
920
940
|
|
|
921
941
|
---
|
|
922
942
|
|
|
@@ -1300,7 +1320,7 @@ Feature branches still recommended for solo devs (keeps main clean, easy rollbac
|
|
|
1300
1320
|
**CI monitoring detail:**
|
|
1301
1321
|
> "Should Claude monitor CI checks after pushing and auto-diagnose failures? (y/n)"
|
|
1302
1322
|
|
|
1303
|
-
- **Yes** → Enable CI feedback loop in SDLC skill, add `gh` CLI to
|
|
1323
|
+
- **Yes** → Enable CI feedback loop in SDLC skill, add `gh` CLI to `permissions.allow`
|
|
1304
1324
|
- **No** → Skip CI monitoring steps (Claude still runs local tests, just doesn't watch CI)
|
|
1305
1325
|
|
|
1306
1326
|
**CI review feedback question (only if CI monitoring is enabled):**
|
|
@@ -1365,7 +1385,7 @@ Claude scans for:
|
|
|
1365
1385
|
│ ├── .github/workflows/deploy*.yml → GitHub Actions deploy
|
|
1366
1386
|
│ └── package.json scripts (deploy:*) → npm deploy scripts
|
|
1367
1387
|
│
|
|
1368
|
-
├── Tool permissions (for
|
|
1388
|
+
├── Tool permissions (for permissions.allow):
|
|
1369
1389
|
│ ├── package.json → Bash(npm *), Bash(node *), Bash(npx *)
|
|
1370
1390
|
│ ├── pnpm-lock.yaml → Bash(pnpm *)
|
|
1371
1391
|
│ ├── yarn.lock → Bash(yarn *)
|
|
@@ -1759,18 +1779,20 @@ Create `.claude/settings.json`:
|
|
|
1759
1779
|
```json
|
|
1760
1780
|
{
|
|
1761
1781
|
"verbosity": "medium",
|
|
1762
|
-
"
|
|
1763
|
-
"
|
|
1764
|
-
|
|
1765
|
-
|
|
1766
|
-
|
|
1767
|
-
|
|
1768
|
-
|
|
1769
|
-
|
|
1770
|
-
|
|
1771
|
-
|
|
1772
|
-
|
|
1773
|
-
|
|
1782
|
+
"permissions": {
|
|
1783
|
+
"allow": [
|
|
1784
|
+
"Read",
|
|
1785
|
+
"Edit",
|
|
1786
|
+
"Write",
|
|
1787
|
+
"Glob",
|
|
1788
|
+
"Grep",
|
|
1789
|
+
"Task",
|
|
1790
|
+
"Bash(npm *)",
|
|
1791
|
+
"Bash(node *)",
|
|
1792
|
+
"Bash(npx *)",
|
|
1793
|
+
"Bash(gh *)"
|
|
1794
|
+
]
|
|
1795
|
+
},
|
|
1774
1796
|
"hooks": {
|
|
1775
1797
|
"UserPromptSubmit": [
|
|
1776
1798
|
{
|
|
@@ -1800,7 +1822,7 @@ Create `.claude/settings.json`:
|
|
|
1800
1822
|
|
|
1801
1823
|
### Allowed Tools (Adaptive)
|
|
1802
1824
|
|
|
1803
|
-
The `
|
|
1825
|
+
The `permissions.allow` array is auto-generated based on your stack detected in Step 0.4. (Historical note: pre-#197 guidance used a top-level `allowedTools` array — that form silently disables Claude Code auto-mode, so the wizard writes `permissions.allow` now.)
|
|
1804
1826
|
|
|
1805
1827
|
| If Detected | Tools Added |
|
|
1806
1828
|
|-------------|-------------|
|
|
@@ -1841,6 +1863,9 @@ The `allowedTools` array is auto-generated based on your stack detected in Step
|
|
|
1841
1863
|
|------|---------------|---------|
|
|
1842
1864
|
| `UserPromptSubmit` | Every message you send | Baseline SDLC reminder, skill auto-invoke |
|
|
1843
1865
|
| `PreToolUse` | Before Claude edits files | TDD check: "Did you write the test first?" Uses `if` field to only fire on source files |
|
|
1866
|
+
| `InstructionsLoaded` | On first SDLC.md/CLAUDE.md load | Staleness nudges (wizard version, review-protocol reminders, CC release alerts) |
|
|
1867
|
+
| `SessionStart` | On `claude` startup | Detect stale effort setting / model upgrades |
|
|
1868
|
+
| `PreCompact` (manual only) | When user runs `/compact` | **Seam gate** — blocks manual compact when `.reviews/handoff.json` is `PENDING_REVIEW`/`PENDING_RECHECK` or a git rebase/merge/cherry-pick is in progress. Auto-compact is NOT gated (blocking it risks pushing past 100% context and losing everything). Requires Claude Code v2.1.105+ |
|
|
1844
1869
|
|
|
1845
1870
|
### How Skill Auto-Invoke Works
|
|
1846
1871
|
|
|
@@ -2057,9 +2082,11 @@ Before presenting approach, STATE your confidence:
|
|
|
2057
2082
|
|-------|---------|--------|--------|
|
|
2058
2083
|
| HIGH (90%+) | Know exactly what to do | Present approach, proceed after approval | `high` (default) |
|
|
2059
2084
|
| MEDIUM (60-89%) | Solid approach, some uncertainty | Present approach, highlight uncertainties | `high` (default) |
|
|
2060
|
-
| LOW (<60%) | Not sure | ASK USER before proceeding |
|
|
2061
|
-
| FAILED 2x | Something's wrong | STOP. ASK USER immediately |
|
|
2062
|
-
| CONFUSED | Can't diagnose why something is failing | STOP. Describe what you tried, ask for help |
|
|
2085
|
+
| LOW (<60%) | Not sure | ASK USER before proceeding | **Run `/effort xhigh` now** — don't wait |
|
|
2086
|
+
| FAILED 2x | Something's wrong | STOP. ASK USER immediately | **Run `/effort max` now** — you're burning cycles at lower effort |
|
|
2087
|
+
| CONFUSED | Can't diagnose why something is failing | STOP. Describe what you tried, ask for help | **Run `/effort max` now** — stop spinning |
|
|
2088
|
+
|
|
2089
|
+
**Dynamic bumping is NOT optional.** "Consider max effort" is the same as "ignore this" in practice. If your confidence drops or tests fail twice, bump effort BEFORE the next attempt — spinning at low effort is an SDLC failure mode.
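The 1.35.0 changelog describes the enforcement side of this rule: `sdlc-prompt-check.sh` logs a timestamped signal when a prompt matches a low-confidence, repeated-failure, or confusion phrase, and demands a bump at 2 or more signals within 30 minutes. A rough sketch of that windowed count follows. The phrase list is invented for illustration (the hook's real patterns are not shown in this diff), as are the function names.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)  # changelog: 30-min window
THRESHOLD = 2                   # changelog: bump at >=2 recent signals
# Hypothetical phrases; the real hook's patterns are not shown here.
SIGNAL_PHRASES = ("low confidence", "failed again", "i'm confused")


def is_signal(prompt: str) -> bool:
    """True if the prompt contains any of the (assumed) trouble phrases."""
    lowered = prompt.lower()
    return any(phrase in lowered for phrase in SIGNAL_PHRASES)


def should_bump(signal_times, now: datetime) -> bool:
    """True when enough signals were logged inside the recent window."""
    recent = [t for t in signal_times if timedelta(0) <= now - t <= WINDOW]
    return len(recent) >= THRESHOLD
```

Counting only signals inside the window means one bad moment an hour ago does not trigger a bump on its own.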
|
|
2063
2090
|
|
|
2064
2091
|
## Self-Review Loop (CRITICAL)
|
|
2065
2092
|
|
|
@@ -2626,6 +2653,28 @@ These are your full reference docs. Start with stubs and expand over time:
|
|
|
2626
2653
|
|
|
2627
2654
|
**Claude follows this automatically.** After a deploy task, Claude runs through the Post-Deploy Verification table for the target environment. If any check fails, Claude suggests rollback and a new fix cycle.
|
|
2628
2655
|
|
|
2656
|
+
## Pipeline Liveness Audits — CI green ≠ data flowing
|
|
2657
|
+
|
|
2658
|
+
A green CI badge only means "no step crashed." It does **not** mean "this pipeline is still producing the output it's supposed to produce." Long-running pipelines — scheduled benchmarks, nightly analytics jobs, weekly report generators, any workflow that appends to a log or dataset — can silently stop producing output while every run still reports success. Regression tests alone do not catch this: the fault lives between "green status" and "observable artifact."
|
|
2659
|
+
|
|
2660
|
+
**Symptom to watch for:** a file, table, or dashboard that's supposed to be updated by a scheduled workflow stops advancing even though the workflow keeps running green.
|
|
2661
|
+
|
|
2662
|
+
**Concrete example from this repo (2026-04-18):** `tests/e2e/score-history.jsonl` hadn't been appended to since 2026-03-30, yet weekly runs kept completing. Two stacked causes:
|
|
2663
|
+
|
|
2664
|
+
1. On the 2026-04-13 run, a `CRITICAL MISS` caused `evaluate.sh` to exit 1 *with* a valid score payload; the tier2 wrapper aborted on the non-zero exit and dropped the trial (fix: PR #193 — disambiguate infra error from legitimate low-score exit via JSON payload, not exit code).
|
|
2665
|
+
2. On other runs, a separate PR-branch push race (`refs/pull/N/merge` checkout vs. `refs/heads/<branch>` push) silently dropped the new trial because `continue-on-error: true` was set on the push step.
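The fix for the first cause, deciding by payload rather than by exit code, can be sketched as follows. The `score` field name and the three outcome labels are hypothetical; the actual payload shape emitted by `evaluate.sh` is not shown in this diff.

```python
import json


def extract_score(stdout: str):
    """Return the numeric score from a JSON payload, or None if there isn't one."""
    try:
        payload = json.loads(stdout)
    except ValueError:
        return None
    if not isinstance(payload, dict):
        return None
    score = payload.get("score")  # hypothetical field name
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return None
    return score


def classify_trial(exit_code: int, stdout: str) -> str:
    """Decide what the wrapper should do with one trial's result."""
    if extract_score(stdout) is not None:
        return "record"  # a valid payload wins, even when the exit code is nonzero
    return "infra_error" if exit_code != 0 else "missing_payload"
```

Under this rule a low-scoring trial that exits 1 with a valid payload is still recorded, and only a run with no usable payload is treated as an infrastructure failure.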
|
|
2666
|
+
|
|
2667
|
+
The second failure was silent (the push step was protected by `continue-on-error`); the first was a red CI run, but nobody was watching weekly runs closely enough to notice that the artifact had stopped advancing. Either way, checking the artifact's liveness would have caught the stall weeks earlier than a CI-badge-only review.
|
|
2668
|
+
|
|
2669
|
+
**Audit pattern — run when you touch a pipeline, and when asked to check pipeline health:**
|
|
2670
|
+
|
|
2671
|
+
1. **Identify the observable output** — the artifact the pipeline is supposed to produce (file, PR, issue, log row, dashboard row).
|
|
2672
|
+
2. **Check its liveness** — what's the last timestamp? If the pipeline runs weekly but the artifact is 3+ cycles stale, that's a stall, not a lull.
|
|
2673
|
+
3. **Walk backward from the artifact to CI** — find the step that writes it, read that specific step's logs (not just the top-level status), confirm the write actually happened.
|
|
2674
|
+
4. **When `continue-on-error: true` is present upstream of the write step**, treat that step as suspect by default — its failures are masked.
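Step 2 of the audit (check liveness against the pipeline's cadence) is simple enough to script. The sketch below assumes a JSONL artifact whose rows carry an ISO-8601 `timestamp` field; that field name and the helper names are hypothetical, and the 3-cycle threshold comes from the step above.

```python
import json
from datetime import datetime, timedelta


def last_timestamp(jsonl_lines, field="timestamp"):
    """Return the timestamp of the last non-empty JSONL row, or None if empty."""
    rows = [line for line in jsonl_lines if line.strip()]
    if not rows:
        return None
    return datetime.fromisoformat(json.loads(rows[-1])[field])


def missed_cycles(last_written: datetime, now: datetime, cadence: timedelta) -> int:
    """Whole cadence periods that have elapsed since the last write."""
    return max(0, int((now - last_written) / cadence))


def is_stalled(last_written, now, cadence, max_missed=3) -> bool:
    """True when the artifact is at least max_missed cycles stale."""
    return missed_cycles(last_written, now, cadence) >= max_missed
```

A check like this only answers "is the artifact advancing"; steps 3 and 4 (reading the writing step's logs and distrusting `continue-on-error`) still need a person or a session to walk through.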
|
|
2675
|
+
|
|
2676
|
+
**When Claude should run this.** Claude runs the liveness audit when it merges or edits a scheduled workflow, when it ships a change that writes to a long-running artifact, and whenever the user asks to check a pipeline's health. It is not a background task — if no one is looking, the audit does not happen.
|
|
2677
|
+
|
|
2629
2678
|
## Rollback
|
|
2630
2679
|
|
|
2631
2680
|
If deployment fails or post-deploy verification catches issues:
|
|
@@ -2662,7 +2711,7 @@ If deployment fails or post-deploy verification catches issues:
|
|
|
2662
2711
|
|
|
2663
2712
|
**SDLC.md:**
|
|
2664
2713
|
```markdown
|
|
2665
|
-
<!-- SDLC Wizard Version: 1.
|
|
2714
|
+
<!-- SDLC Wizard Version: 1.35.0 -->
|
|
2666
2715
|
<!-- Setup Date: [DATE] -->
|
|
2667
2716
|
<!-- Completed Steps: step-0.1, step-0.2, step-0.4, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
|
|
2668
2717
|
<!-- Git Workflow: [PRs or Solo] -->
|
|
@@ -3135,6 +3184,8 @@ Want me to file these? (yes/no/not now)
|
|
|
3135
3184
|
|
|
3136
3185
|
**`/revise-claude-md` scope:** Only updates CLAUDE.md. It does NOT touch feature docs, TESTING.md, hooks, or skills. Use it for general project context that applies across the codebase.
|
|
3137
3186
|
|
|
3187
|
+
**Memory Audit Protocol:** Per-user memory at `~/.claude/projects/<proj>/memory/` accumulates private learnings. Some are portable technical lessons that belong in shared docs. The `/sdlc` skill's **Memory Audit Protocol** section (under "After Session (Capture Learnings)") defines a three-bucket classifier (`promote` / `keep` / `manual-review`) with a type-based denylist that keeps `user`/`reference` entries private and routes `project`/`feedback` entries to human review. Run it at end-of-release or after debugging-heavy sessions. A human approves every promotion chunk by chunk before it is applied.
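The type-based denylist can be sketched as a small rule-based router. This is an illustration only: the repo's implementation lives in the skill and a shell test suite, and the function names, the lowercase normalization, and the choice to send unknown or missing types to `manual-review` are assumptions. Per the 1.34.0 changelog, the rule-based pass never returns `promote` on its own.

```python
import re

KEEP_TYPES = {"user", "reference"}  # denylisted types stay private


def frontmatter_type(text: str):
    """Extract a normalized `type` value from YAML-style frontmatter, or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        match = re.match(r"\s*type\s*:\s*(.*)$", line)
        if match:
            value = match.group(1).split("#", 1)[0]  # drop an inline comment
            value = value.strip().strip("'\"").strip().lower()
            return value or None
    return None


def rule_based_bucket(text: str) -> str:
    """Route a memory entry to 'keep' or 'manual-review'."""
    if frontmatter_type(text) in KEEP_TYPES:
        return "keep"
    # Assumption: project, feedback, unknown, and missing types all go to a human.
    return "manual-review"
```

Normalizing before matching is what lets a variant such as `type: "user" # external` still route to `keep`.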
|
|
3188
|
+
|
|
3138
3189
|
**When to do mini-retro:** After features, tricky bugs, or discovering gotchas. Skip for one-line fixes or questions.
|
|
3139
3190
|
|
|
3140
3191
|
**The SDLC evolves:** Weekly research, monthly deep-dives, and CI friction signals feed improvements. Human approves, the system gets better.
|
|
@@ -3721,7 +3772,7 @@ Walk through updates? (y/n)
|
|
|
3721
3772
|
Store wizard state in `SDLC.md` as metadata comments (invisible to readers, parseable by Claude):
|
|
3722
3773
|
|
|
3723
3774
|
```markdown
|
|
3724
|
-
<!-- SDLC Wizard Version: 1.
|
|
3775
|
+
<!-- SDLC Wizard Version: 1.35.0 -->
|
|
3725
3776
|
<!-- Setup Date: 2026-01-24 -->
|
|
3726
3777
|
<!-- Completed Steps: step-0.1, step-0.2, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
|
|
3727
3778
|
<!-- Git Workflow: PRs -->
|
|
@@ -4060,6 +4111,23 @@ When Anthropic provides official plugins that overlap with this SDLC:
|
|
|
4060
4111
|
|
|
4061
4112
|
**Re-run `claude-code-setup` periodically** (quarterly, or when your project expands in scope) to catch new automations — MCP servers, hooks, subagents — that weren't relevant at initial setup but are now.
|
|
4062
4113
|
|
|
4114
|
+
**API feature shepherd (self-maintenance, roadmap #100):**
|
|
4115
|
+
|
|
4116
|
+
The wizard watches the **Anthropic API changelog** — not just Claude Code CLI releases — for new betas, tools, and agent features. The detector runs in `.github/workflows/weekly-api-update.yml`, is intentionally LLM-free, and only opens a tracking issue labeled `api-review-needed` when new entries appear at `platform.claude.com/docs/en/release-notes/api`.
|
|
4117
|
+
|
|
4118
|
+
When that issue is open, the session-start hook nudges you. The session (not the workflow) does the deep research + adoption via the full SDLC loop. This mirrors the "local shepherd" pattern used for CI fixes: cheap Action-layer detection + session-time analysis beats expensive Action-layer LLM calls.
|
|
4119
|
+
|
|
4120
|
+
The gap this closes: the advisor tool (API beta, `advisor-tool-2026-03-01`) shipped and was missed for several days before manual discovery. The detector would have flagged it on the next weekly tick.
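The LLM-free detection step amounts to comparing parsed changelog dates against a stored last-seen date. A minimal sketch, assuming entries are already parsed into `(date, summary)` pairs; the function names are hypothetical and the real workflow's state format is not shown in this diff.

```python
from datetime import date


def new_entries(entries, last_seen):
    """Return (date, summary) pairs newer than last_seen, oldest first.

    last_seen may be None on the first run, in which case everything is new.
    """
    fresh = [(d, s) for d, s in entries if last_seen is None or d > last_seen]
    return sorted(fresh, key=lambda entry: entry[0])


def needs_tracking_issue(entries, last_seen) -> bool:
    """True when at least one entry is newer than the stored state."""
    return bool(new_entries(entries, last_seen))
```

Because the comparison is purely on dates, it runs cheaply in a scheduled Action, leaving the judgment about what to adopt to the session.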
|
|
4121
|
+
|
|
4122
|
+
**Complementary native skills worth knowing:**
|
|
4123
|
+
|
|
4124
|
+
| Native Skill | What It Does | When to Run |
|
|
4125
|
+
|--------------|--------------|-------------|
|
|
4126
|
+
| `/less-permission-prompts` | Scans transcripts for common read-only Bash/MCP calls and proposes a prioritized allowlist | After a few sessions — reduces permission friction without auto mode |
|
|
4127
|
+
| `/permissions` | Pre-allow specific commands and check them into `.claude/settings.json` | Anytime you want an auditable team allowlist |
|
|
4128
|
+
|
|
4129
|
+
These are shipped by Claude Code itself. The wizard doesn't reimplement them — it points you at them so you benefit from the native version's ongoing maintenance.
|
|
4130
|
+
|
|
4063
4131
|
### When Claude Code Improves
|
|
4064
4132
|
|
|
4065
4133
|
Claude Code is actively improving. When they add built-in features:
|
package/README.md
CHANGED
|
@@ -237,8 +237,26 @@ This isn't the only Claude Code SDLC tool. Here's an honest comparison:
|
|
|
237
237
|
|
|
238
238
|
## Community
|
|
239
239
|
|
|
240
|
-
|
|
240
|
+
<div align="center">
|
|
241
|
+
|
|
242
|
+
[](https://discord.com/invite/fGPEF7GHrF)
|
|
243
|
+
|
|
244
|
+
**[Automation Station](https://discord.com/invite/fGPEF7GHrF)** — a community Discord packed with software engineers bringing 40+ years of combined experience across every area of the stack.
|
|
245
|
+
|
|
246
|
+
_Frontend · Backend · Infra · Embedded · Data · QA · DevOps_
|
|
247
|
+
|
|
248
|
+
Share patterns, ask questions, compare notes on AI agents, automation, and SDLC tooling.
|
|
249
|
+
|
|
250
|
+
</div>
|
|
241
251
|
|
|
242
252
|
## Contributing
|
|
243
253
|
|
|
244
254
|
PRs welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for evaluation methodology and testing.
|
|
255
|
+
|
|
256
|
+
## Feedback
|
|
257
|
+
|
|
258
|
+
Three ways to report bugs, request features, or ask questions:
|
|
259
|
+
|
|
260
|
+
- **In-session:** run `/feedback` inside any Claude Code session using this wizard — auto-fills context and redacts secrets before filing
|
|
261
|
+
- **Issue templates:** [bug report](https://github.com/BaseInfinity/agentic-ai-sdlc-wizard/issues/new?template=bug_report.md), [feature request](https://github.com/BaseInfinity/agentic-ai-sdlc-wizard/issues/new?template=feature_request.md), [question](https://github.com/BaseInfinity/agentic-ai-sdlc-wizard/issues/new?template=question.md)
|
|
262
|
+
- **Discussions:** open-ended conversations at [github.com/BaseInfinity/agentic-ai-sdlc-wizard/discussions](https://github.com/BaseInfinity/agentic-ai-sdlc-wizard/discussions)
|
package/cli/init.js
CHANGED
|
@@ -24,6 +24,7 @@ const FILES = [
|
|
|
24
24
|
{ src: 'hooks/tdd-pretool-check.sh', dest: '.claude/hooks/tdd-pretool-check.sh', executable: true, base: REPO_ROOT },
|
|
25
25
|
{ src: 'hooks/instructions-loaded-check.sh', dest: '.claude/hooks/instructions-loaded-check.sh', executable: true, base: REPO_ROOT },
|
|
26
26
|
{ src: 'hooks/model-effort-check.sh', dest: '.claude/hooks/model-effort-check.sh', executable: true, base: REPO_ROOT },
|
|
27
|
+
{ src: 'hooks/precompact-seam-check.sh', dest: '.claude/hooks/precompact-seam-check.sh', executable: true, base: REPO_ROOT },
|
|
27
28
|
{ src: 'skills/sdlc/SKILL.md', dest: '.claude/skills/sdlc/SKILL.md', base: REPO_ROOT },
|
|
28
29
|
{ src: 'skills/setup/SKILL.md', dest: '.claude/skills/setup/SKILL.md', base: REPO_ROOT },
|
|
29
30
|
{ src: 'skills/update/SKILL.md', dest: '.claude/skills/update/SKILL.md', base: REPO_ROOT },
|
package/cli/templates/settings.json
CHANGED
|
@@ -1,8 +1,4 @@
|
|
|
1
1
|
{
|
|
2
|
-
"model": "opus[1m]",
|
|
3
|
-
"env": {
|
|
4
|
-
"CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "30"
|
|
5
|
-
},
|
|
6
2
|
"hooks": {
|
|
7
3
|
"UserPromptSubmit": [
|
|
8
4
|
{
|
|
@@ -45,6 +41,17 @@
|
|
|
45
41
|
}
|
|
46
42
|
]
|
|
47
43
|
}
|
|
44
|
+
],
|
|
45
|
+
"PreCompact": [
|
|
46
|
+
{
|
|
47
|
+
"matcher": "manual",
|
|
48
|
+
"hooks": [
|
|
49
|
+
{
|
|
50
|
+
"type": "command",
|
|
51
|
+
"command": "\"$CLAUDE_PROJECT_DIR\"/.claude/hooks/precompact-seam-check.sh"
|
|
52
|
+
}
|
|
53
|
+
]
|
|
54
|
+
}
|
|
48
55
|
]
|
|
49
56
|
}
|
|
50
57
|
}
|
package/hooks/hooks.json
CHANGED
package/hooks/instructions-loaded-check.sh
CHANGED
|
@@ -34,14 +34,71 @@ if [ -n "$MISSING" ]; then
|
|
|
34
34
|
echo "Invoke Skill tool, skill=\"setup-wizard\" to generate them."
|
|
35
35
|
fi
|
|
36
36
|
|
|
37
|
-
# Version update check (non-blocking, best-effort)
|
|
37
|
+
# Version update check (non-blocking, best-effort).
|
|
38
|
+
# Fetches npm latest at most once per 24h (ROADMAP #196). Prints a stronger
|
|
39
|
+
# multi-line nudge when the gap is ≥3 minor versions — the one-liner gets
|
|
40
|
+
# skipped after weeks of ignoring it (user feedback 2026-04-18).
|
|
38
41
|
SDLC_MD="$PROJECT_DIR/SDLC.md"
|
|
42
|
+
# Strict x.y.z semver — rejects whitespace, "junk", "1.alpha.0", etc.
|
|
43
|
+
SEMVER_RE='^[0-9]+\.[0-9]+\.[0-9]+$'
|
|
39
44
|
if [ -f "$SDLC_MD" ]; then
|
|
40
45
|
INSTALLED_VERSION=$(grep -o 'SDLC Wizard Version: [0-9.]*' "$SDLC_MD" | head -1 | sed 's/SDLC Wizard Version: //')
|
|
41
|
-
if [ -n "$INSTALLED_VERSION" ] &&
|
|
42
|
-
|
|
46
|
+
if [ -n "$INSTALLED_VERSION" ] && [[ "$INSTALLED_VERSION" =~ $SEMVER_RE ]]; then
|
|
47
|
+
VERSION_CACHE_DIR="${SDLC_WIZARD_CACHE_DIR:-$HOME/.cache/sdlc-wizard}"
|
|
48
|
+
VERSION_CACHE_FILE="$VERSION_CACHE_DIR/latest-version"
|
|
49
|
+
LATEST_VERSION=""
|
|
50
|
+
|
|
51
|
+
# Use cache if present, <24h old, and contents are valid semver
|
|
52
|
+
if [ -f "$VERSION_CACHE_FILE" ]; then
|
|
53
|
+
if stat -f %m "$VERSION_CACHE_FILE" > /dev/null 2>&1; then
|
|
54
|
+
CACHE_MTIME=$(stat -f %m "$VERSION_CACHE_FILE")
|
|
55
|
+
else
|
|
56
|
+
CACHE_MTIME=$(stat -c %Y "$VERSION_CACHE_FILE" 2>/dev/null || echo 0)
|
|
57
|
+
fi
|
|
58
|
+
CACHE_AGE=$(( $(date +%s) - CACHE_MTIME ))
|
|
59
|
+
if [ "$CACHE_AGE" -lt 86400 ]; then
|
|
60
|
+
CACHE_CONTENT=$(cat "$VERSION_CACHE_FILE" 2>/dev/null) || CACHE_CONTENT=""
|
|
61
|
+
if [[ "$CACHE_CONTENT" =~ $SEMVER_RE ]]; then
|
|
62
|
+
LATEST_VERSION="$CACHE_CONTENT"
|
|
63
|
+
fi
|
|
64
|
+
fi
|
|
65
|
+
fi
|
|
66
|
+
|
|
67
|
+
# Fetch from npm if cache miss / stale / malformed
|
|
68
|
+
if [ -z "$LATEST_VERSION" ] && command -v npm > /dev/null 2>&1; then
|
|
69
|
+
NPM_RESULT=$(npm view agentic-sdlc-wizard version 2>/dev/null) || NPM_RESULT=""
|
|
70
|
+
if [[ "$NPM_RESULT" =~ $SEMVER_RE ]]; then
|
|
71
|
+
LATEST_VERSION="$NPM_RESULT"
|
|
72
|
+
mkdir -p "$VERSION_CACHE_DIR" 2>/dev/null || true
|
|
73
|
+
printf '%s' "$LATEST_VERSION" > "$VERSION_CACHE_FILE" 2>/dev/null || true
|
|
74
|
+
fi
|
|
75
|
+
fi
|
|
76
|
+
|
|
43
77
|
if [ -n "$LATEST_VERSION" ] && [ "$LATEST_VERSION" != "$INSTALLED_VERSION" ]; then
|
|
44
|
-
|
|
78
|
+
# Minor-version delta: 1.25.0 vs 1.34.0 → 9
|
|
79
|
+
INSTALLED_MINOR=$(echo "$INSTALLED_VERSION" | awk -F. '{print $2+0}')
|
|
80
|
+
LATEST_MINOR=$(echo "$LATEST_VERSION" | awk -F. '{print $2+0}')
|
|
81
|
+
INSTALLED_MAJOR=$(echo "$INSTALLED_VERSION" | awk -F. '{print $1+0}')
|
|
82
|
+
LATEST_MAJOR=$(echo "$LATEST_VERSION" | awk -F. '{print $1+0}')
|
|
83
|
+
MINOR_DELTA=0
|
|
84
|
+
if [ "$INSTALLED_MAJOR" = "$LATEST_MAJOR" ]; then
|
|
85
|
+
MINOR_DELTA=$(( LATEST_MINOR - INSTALLED_MINOR ))
|
|
86
|
+
else
|
|
87
|
+
# Major bump: treat as a very large delta
|
|
88
|
+
MINOR_DELTA=99
|
|
89
|
+
fi
|
|
90
|
+
|
|
91
|
+
if [ "$MINOR_DELTA" -ge 3 ]; then
|
|
92
|
+
echo ""
|
|
93
|
+
echo "!! WARNING: SDLC Wizard is ${MINOR_DELTA} minor versions behind !!"
|
|
94
|
+
echo " Installed: ${INSTALLED_VERSION}"
|
|
95
|
+
echo " Latest: ${LATEST_VERSION}"
|
|
96
|
+
echo " You're missing bug fixes and features shipped across ${MINOR_DELTA} releases."
|
|
97
|
+
echo " Strongly recommend running /update-wizard before starting new work."
|
|
98
|
+
echo ""
|
|
99
|
+
else
|
|
100
|
+
echo "SDLC Wizard update available: ${INSTALLED_VERSION} → ${LATEST_VERSION} (run /update-wizard)"
|
|
101
|
+
fi
|
|
45
102
|
fi
|
|
46
103
|
fi
|
|
47
104
|
fi
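The minor-version gap computed in the hunk above can be tried in isolation. The following is a minimal sketch, not the hook's own code: `is_semver` and `minor_delta` are hypothetical helper names, and the strict `x.y.z` check is re-expressed with `case` and `grep -E` so it also runs under a plain POSIX shell rather than relying on bash's `[[ =~ ]]`.

```shell
# Hypothetical helpers (not part of the hook): strict x.y.z validation and
# the same minor-version arithmetic the hook performs.
is_semver() {
    case "$1" in
        ''|*[!0-9.]*) return 1 ;;   # empty, or contains a non-digit/non-dot
    esac
    printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

minor_delta() (
    # $1 = installed version, $2 = latest version
    is_semver "$1" && is_semver "$2" || exit 1
    i_major=$(echo "$1" | awk -F. '{print $1+0}')
    l_major=$(echo "$2" | awk -F. '{print $1+0}')
    i_minor=$(echo "$1" | awk -F. '{print $2+0}')
    l_minor=$(echo "$2" | awk -F. '{print $2+0}')
    if [ "$i_major" -eq "$l_major" ]; then
        echo $(( $l_minor - $i_minor ))
    else
        # Major bump: the hook treats this as a very large delta.
        echo 99
    fi
)
```

Under these assumptions, `minor_delta 1.25.0 1.34.0` prints `9`, which crosses the hook's threshold of 3 and would select the multi-line warning; `minor_delta 1.34.0 1.35.0` prints `1` and would select the one-line notice.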
|
|
@@ -98,6 +155,49 @@ if [ -d "$PROJECT_DIR/.claude/skills/update" ]; then
|
|
|
98
155
|
done
|
|
99
156
|
fi
|
|
100
157
|
|
|
158
|
+
# API feature review nudge (#100) — surface open 'api-review-needed' issues
|
|
159
|
+
# opened by .github/workflows/weekly-api-update.yml so the session picks up
|
|
160
|
+
# new API features without waiting for manual discovery.
|
|
161
|
+
#
|
|
162
|
+
# Gated on LOCAL presence of the detector workflow: the CLI distributes this
|
|
163
|
+
# hook to consumer projects, and we don't want to pester those users with
|
|
164
|
+
# upstream-wizard issues. The nudge only fires when the current repo owns
|
|
165
|
+
# the detector (= the wizard repo or a fork of it).
|
|
166
|
+
if [ -f "$PROJECT_DIR/.github/workflows/weekly-api-update.yml" ] && \
|
|
167
|
+
command -v gh > /dev/null 2>&1; then
|
|
168
|
+
# Query the current repo (not hardcoded upstream) — in a fork, users see
|
|
169
|
+
# their own detector's issues, not ours.
|
|
170
|
+
API_REVIEW_COUNT=$(gh issue list \
|
|
171
|
+
--state open \
|
|
172
|
+
--label "api-review-needed" \
|
|
173
|
+
--limit 1 \
|
|
174
|
+
--json number \
|
|
175
|
+
--jq 'length' 2>/dev/null) || API_REVIEW_COUNT=""
|
|
176
|
+
if [[ "$API_REVIEW_COUNT" =~ ^[0-9]+$ ]] && [ "$API_REVIEW_COUNT" -gt 0 ]; then
|
|
177
|
+
echo "Anthropic API features pending review: ${API_REVIEW_COUNT} open issue(s) with label 'api-review-needed' (see .github/workflows/weekly-api-update.yml)"
|
|
178
|
+
fi
|
|
179
|
+
fi
|
|
180
|
+
|
|
181
|
+
# Claude Code release review nudge (#85) — surface open 'auto-update' PRs
|
|
182
|
+
# opened by .github/workflows/weekly-update.yml so new CC releases get triaged
|
|
183
|
+
# before they bit-rot (relevance-HIGH PRs can sit for days otherwise).
|
|
184
|
+
#
|
|
185
|
+
# Gated on LOCAL presence of weekly-update.yml: the CLI distributes this hook
|
|
186
|
+
# to consumer projects, which don't own the detector — don't pester them with
|
|
187
|
+
# upstream-wizard PRs.
|
|
188
|
+
if [ -f "$PROJECT_DIR/.github/workflows/weekly-update.yml" ] && \
|
|
189
|
+
command -v gh > /dev/null 2>&1; then
|
|
190
|
+
CC_UPDATE_COUNT=$(gh pr list \
|
|
191
|
+
--state open \
|
|
192
|
+
--label "auto-update" \
|
|
193
|
+
--limit 1 \
|
|
194
|
+
--json number \
|
|
195
|
+
--jq 'length' 2>/dev/null) || CC_UPDATE_COUNT=""
|
|
196
|
+
if [[ "$CC_UPDATE_COUNT" =~ ^[0-9]+$ ]] && [ "$CC_UPDATE_COUNT" -gt 0 ]; then
|
|
197
|
+
echo "Claude Code update pending review: ${CC_UPDATE_COUNT} open auto-update PR(s) (see .github/workflows/weekly-update.yml)"
|
|
198
|
+
fi
|
|
199
|
+
fi
|
|
200
|
+
|
|
101
201
|
# Claude Code version check (non-blocking, best-effort)
|
|
102
202
|
if command -v claude > /dev/null 2>&1 && command -v npm > /dev/null 2>&1; then
|
|
103
203
|
CC_LOCAL=$(claude --version 2>/dev/null | grep -o '[0-9][0-9.]*' | head -1) || true
|
package/hooks/precompact-seam-check.sh
ADDED
|
@@ -0,0 +1,75 @@
|
|
|
1
|
+
#!/bin/bash
|
|
2
|
+
# PreCompact hook - block manual /compact at non-seam boundaries
|
|
3
|
+
# Fires on manual /compact only (auto-compact is threshold-driven; blocking
|
|
4
|
+
# it could push past 100% context and lose everything). Matcher: "manual"
|
|
5
|
+
# in .claude/settings.json.
|
|
6
|
+
#
|
|
7
|
+
# Requires Claude Code v2.1.105+ (PreCompact event introduced April 13, 2026).
|
|
8
|
+
#
|
|
9
|
+
# Seam policy: compacting mid-cycle loses evidence the next round needs.
|
|
10
|
+
# Block when:
|
|
11
|
+
# (1) .reviews/handoff.json status is PENDING_REVIEW or PENDING_RECHECK
|
|
12
|
+
# → mid-Codex-round, compact after CERTIFIED
|
|
13
|
+
# (2) git rebase/merge/cherry-pick in progress
|
|
14
|
+
# → finish in-progress git operation first
|
|
15
|
+
# Allow otherwise.
|
|
16
|
+
|
|
17
|
+
[ ! -t 0 ] && INPUT=$(cat) || INPUT=""
|
|
18
|
+
|
|
19
|
+
# Determine project root: prefer $CLAUDE_PROJECT_DIR, fall back to cwd
|
|
20
|
+
ROOT="${CLAUDE_PROJECT_DIR:-$PWD}"
|
|
21
|
+
|
|
22
|
+
HOLD_REASONS=""
|
|
23
|
+
|
|
24
|
+
# Check 1: Codex review mid-cycle
|
|
25
|
+
# Self-heal: if handoff has a pr_number and gh reports that PR as MERGED,
|
|
26
|
+
# the handoff is a stale artifact from a closed review — treat as implicit
|
|
27
|
+
# CERTIFIED so a forgotten status field doesn't permanently block /compact.
|
|
28
|
+
HANDOFF="$ROOT/.reviews/handoff.json"
|
|
29
|
+
if [ -f "$HANDOFF" ]; then
|
|
30
|
+
STATUS=$(grep -o '"status"[[:space:]]*:[[:space:]]*"[^"]*"' "$HANDOFF" 2>/dev/null | head -1 | sed 's/.*"\([^"]*\)"$/\1/')
|
|
31
|
+
case "$STATUS" in
|
|
32
|
+
PENDING_REVIEW|PENDING_RECHECK)
|
|
33
|
+
PR_NUMBER=$(grep -o '"pr_number"[[:space:]]*:[[:space:]]*[0-9][0-9]*' "$HANDOFF" 2>/dev/null | head -1 | grep -o '[0-9][0-9]*$')
|
|
34
|
+
HEALED=0
|
|
35
|
+
if [ -n "$PR_NUMBER" ] && command -v gh >/dev/null 2>&1; then
|
|
36
|
+
PR_STATE=$(gh pr view "$PR_NUMBER" --json state --jq .state 2>/dev/null)
|
|
37
|
+
[ "$PR_STATE" = "MERGED" ] && HEALED=1
|
|
38
|
+
fi
|
|
39
|
+
if [ "$HEALED" -ne 1 ]; then
|
|
40
|
+
HOLD_REASONS="${HOLD_REASONS} - Codex review is ${STATUS}. Round-1 evidence lives in this context — compacting now loses what round-2 needs to re-verify.
|
|
41
|
+
Resolve: wait for CERTIFIED (or escalate) before /compact."$'\n'
|
|
42
|
+
fi
|
|
43
|
+
;;
|
|
44
|
+
esac
|
|
45
|
+
fi
|
|
46
|
+
|
|
47
|
+
# Check 2: in-progress git operation
|
|
48
|
+
GITDIR="$ROOT/.git"
|
|
49
|
+
if [ -d "$GITDIR" ]; then
|
|
50
|
+
if [ -e "$GITDIR/REBASE_HEAD" ] || [ -d "$GITDIR/rebase-merge" ] || [ -d "$GITDIR/rebase-apply" ]; then
|
|
51
|
+
HOLD_REASONS="${HOLD_REASONS} - Git rebase in progress. Compacting mid-rebase loses the operation's context.
|
|
52
|
+
Resolve: finish or abort the rebase before /compact."$'\n'
|
|
53
|
+
fi
|
|
54
|
+
if [ -e "$GITDIR/MERGE_HEAD" ]; then
|
|
55
|
+
HOLD_REASONS="${HOLD_REASONS} - Git merge in progress. Compacting mid-merge loses the operation's context.
|
|
56
|
+
Resolve: finish or abort the merge before /compact."$'\n'
|
|
57
|
+
fi
|
|
58
|
+
if [ -e "$GITDIR/CHERRY_PICK_HEAD" ]; then
|
|
59
|
+
HOLD_REASONS="${HOLD_REASONS} - Git cherry-pick in progress. Compacting mid-cherry-pick loses the operation's context.
|
|
60
|
+
Resolve: finish or abort the cherry-pick before /compact."$'\n'
|
|
61
|
+
fi
|
|
62
|
+
fi
|
|
63
|
+
|
|
64
|
+
if [ -n "$HOLD_REASONS" ]; then
|
|
65
|
+
{
|
|
66
|
+
echo "HOLD: manual /compact at a non-seam. Compacting mid-cycle loses evidence the next round needs."
|
|
67
|
+
echo ""
|
|
68
|
+
echo "$HOLD_REASONS"
|
|
69
|
+
echo "Natural seams: commit boundary, Codex CERTIFIED, PR merge, ROADMAP item DONE."
|
|
70
|
+
echo "Override: resolve the blocker above, or temporarily disable this hook in .claude/settings.json."
|
|
71
|
+
} >&2
|
|
72
|
+
exit 2
|
|
73
|
+
fi
|
|
74
|
+
|
|
75
|
+
exit 0
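To experiment with the seam policy outside Claude Code, the hook's two checks can be restated as a function that takes a project root. This is a sketch under stated assumptions: `seam_blockers` is a hypothetical name, its one-word output labels are invented for illustration, and the `gh`-based self-heal step is deliberately left out so the function has no network dependency.

```shell
# Hypothetical restatement of the hook's checks (self-heal via gh omitted).
# Prints one label per blocker found under the given project root.
seam_blockers() (
    root="$1"
    handoff="$root/.reviews/handoff.json"
    if [ -f "$handoff" ]; then
        # Same grep/sed extraction as the hook: first "status" string value.
        status=$(grep -o '"status"[[:space:]]*:[[:space:]]*"[^"]*"' "$handoff" 2>/dev/null \
            | head -1 | sed 's/.*"\([^"]*\)"$/\1/')
        case "$status" in
            PENDING_REVIEW|PENDING_RECHECK) echo "review:$status" ;;
        esac
    fi
    gitdir="$root/.git"
    if [ -e "$gitdir/REBASE_HEAD" ] || [ -d "$gitdir/rebase-merge" ] || [ -d "$gitdir/rebase-apply" ]; then
        echo "rebase"
    fi
    if [ -e "$gitdir/MERGE_HEAD" ]; then echo "merge"; fi
    if [ -e "$gitdir/CHERRY_PICK_HEAD" ]; then echo "cherry-pick"; fi
    exit 0
)
```

An empty result corresponds to the hook's `exit 0` path; any output corresponds to the `exit 2` path that blocks a manual `/compact`.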
|
package/hooks/sdlc-prompt-check.sh
CHANGED
|
@@ -17,6 +17,69 @@ else
|
|
|
17
17
|
exit 0
|
|
18
18
|
fi
|
|
19
19
|
|
|
20
|
+
# Effort auto-bump (ROADMAP #195). Watches this UserPromptSubmit payload for
|
|
21
|
+
# LOW-confidence / FAILED-repeatedly / CONFUSED phrases, logs a timestamped
|
|
22
|
+
# signal, and emits a loud '/effort xhigh' nudge when ≥2 signals land inside
|
|
23
|
+
# a 30-minute window. Enforces the SDLC confidence table mid-session so
|
|
24
|
+
# Claude stops burning budget at 'high' after confidence drops.
|
|
25
|
+
EFFORT_CACHE_DIR="${SDLC_WIZARD_CACHE_DIR:-$HOME/.cache/sdlc-wizard}"
|
|
26
|
+
EFFORT_SIGNALS="$EFFORT_CACHE_DIR/effort-signals.log"
|
|
27
|
+
PROMPT_TEXT=""
|
|
28
|
+
if [ ! -t 0 ] && command -v jq > /dev/null 2>&1; then
|
|
29
|
+
STDIN_JSON=$(cat)
|
|
30
|
+
if [ -n "$STDIN_JSON" ]; then
|
|
31
|
+
PROMPT_TEXT=$(printf '%s' "$STDIN_JSON" | jq -r '.prompt // empty' 2>/dev/null) || PROMPT_TEXT=""
|
|
32
|
+
fi
|
|
33
|
+
fi
|
|
34
|
+
if [ -n "$PROMPT_TEXT" ]; then
|
|
35
|
+
LOWER=$(printf '%s' "$PROMPT_TEXT" | tr '[:upper:]' '[:lower:]')
|
|
36
|
+
SIGNAL_REASON=""
|
|
37
|
+
# Every trigger requires first-person ownership or a structured-label
|
|
38
|
+
# form, so educational/quoted mentions ("How do I name a low confidence
|
|
39
|
+
# badge?", "What does 'failed again' mean?") don't fire.
|
|
40
|
+
case "$LOWER" in
|
|
41
|
+
*"i'm stuck"*|*"i am stuck"*|*"im stuck"*|\
|
|
42
|
+
*"i'm confused"*|*"i am confused"*|*"im confused"*|\
|
|
43
|
+
*"i tried twice"*|*"i've tried twice"*|*"ive tried twice"*|\
|
|
44
|
+
*"i can't figure"*|*"i cant figure"*|\
|
|
45
|
+
*"i'm not sure why"*|*"i am not sure why"*|*"im not sure why"*|\
|
|
46
|
+
*"my confidence is low"*|*"my confidence: low"*|*"confidence: low"*|\
|
|
47
|
+
*"it's still failing"*|*"its still failing"*|\
|
|
48
|
+
*"it keeps failing"*|*"this keeps failing"*|\
|
|
49
|
+
*"it failed again"*|*"this failed again"*|\
|
|
50
|
+
*"failed twice"*|*"failed 2x"*)
|
|
51
|
+
SIGNAL_REASON="low"
|
|
52
|
+
;;
|
|
53
|
+
esac
|
|
54
|
+
if [ -n "$SIGNAL_REASON" ]; then
|
|
55
|
+
# Group the write so redirection errors (e.g., unwritable HOME,
|
|
56
|
+
# cache-dir-is-a-file) land on /dev/null instead of leaking to stderr.
|
|
57
|
+
{
|
|
58
|
+
if mkdir -p "$EFFORT_CACHE_DIR" && [ -d "$EFFORT_CACHE_DIR" ]; then
|
|
59
|
+
# Prune entries older than 1h on every write to cap log size.
|
|
60
|
+
if [ -f "$EFFORT_SIGNALS" ]; then
|
|
61
|
+
PRUNE_THRESH=$(( $(date +%s) - 3600 ))
|
|
62
|
+
awk -v t="$PRUNE_THRESH" '$1+0 >= t' "$EFFORT_SIGNALS" > "${EFFORT_SIGNALS}.tmp" \
|
|
63
|
+
&& mv "${EFFORT_SIGNALS}.tmp" "$EFFORT_SIGNALS"
|
|
64
|
+
fi
|
|
65
|
+
printf '%s\t%s\n' "$(date +%s)" "$SIGNAL_REASON" >> "$EFFORT_SIGNALS"
|
|
66
|
+
fi
|
|
67
|
+
} 2>/dev/null || true
|
|
68
|
+
fi
|
|
69
|
+
fi
|
|
70
|
+
if [ -f "$EFFORT_SIGNALS" ]; then
|
|
71
|
+
NOW=$(date +%s)
|
|
72
|
+
THRESH=$(( NOW - 1800 ))
|
|
73
|
+
RECENT=$(awk -v t="$THRESH" '$1+0 >= t' "$EFFORT_SIGNALS" 2>/dev/null | wc -l | tr -d ' ')
|
|
74
|
+
if [ "${RECENT:-0}" -ge 2 ]; then
|
|
75
|
+
echo ""
|
|
76
|
+
echo "!! EFFORT BUMP REQUIRED: ${RECENT} low-confidence signals in last 30 min !!"
|
|
77
|
+
echo " Run /effort xhigh NOW — spinning at 'high' after confidence drops wastes budget."
|
|
78
|
+
echo " (Auto-enforcement of the SDLC confidence table. ROADMAP #195.)"
|
|
79
|
+
echo ""
|
|
80
|
+
fi
|
|
81
|
+
fi
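The sliding-window count behind the effort nudge can be checked against a hand-written log. A minimal sketch, assuming the log format the hook appends (epoch seconds, a tab, then a reason); `recent_signals` is a hypothetical helper name, and the clock is passed in as an argument so the result is reproducible.

```shell
# Hypothetical helper: count log entries newer than (now - window).
recent_signals() {
    # $1 = log file, $2 = now in epoch seconds, $3 = window in seconds
    awk -v t="$(( $2 - $3 ))" '$1+0 >= t' "$1" 2>/dev/null | wc -l | tr -d ' '
}
```

With the hook's values, a call such as `recent_signals "$EFFORT_SIGNALS" "$(date +%s)" 1800` returning 2 or more is the condition that prints the nudge. The hook prunes at one hour but counts over 30 minutes, so an entry can remain in the log without contributing to the count.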
|
|
82
|
+
|
|
20
83
|
if [ ! -s "$PROJECT_DIR/SDLC.md" ] || [ ! -s "$PROJECT_DIR/TESTING.md" ]; then
|
|
21
84
|
cat << 'SETUP'
|
|
22
85
|
SETUP NOT COMPLETE: SDLC.md and/or TESTING.md are missing.
|
package/package.json
CHANGED
package/skills/sdlc/SKILL.md
CHANGED
|
@@ -184,9 +184,11 @@ Before presenting approach, STATE your confidence:
|
|
|
184
184
|
|-------|---------|--------|--------|
|
|
185
185
|
| HIGH (90%+) | Know exactly what to do | Present approach, proceed after approval | `high` (default) |
|
|
186
186
|
| MEDIUM (60-89%) | Solid approach, some uncertainty | Present approach, highlight uncertainties | `high` (default) |
|
|
187
|
-
| LOW (<60%) | Not sure | Do more research or try cross-model research (Codex) to get to 95%. If still LOW after research, ASK USER |
|
|
188
|
-
| FAILED 2x | Something's wrong | Try cross-model research (Codex) for a fresh perspective. If still stuck, STOP and ASK USER |
|
|
189
|
-
| CONFUSED | Can't diagnose why something is failing | Try cross-model research (Codex). If still confused, STOP. Describe what you tried, ask for help |
|
|
187
|
+
| LOW (<60%) | Not sure | Do more research or try cross-model research (Codex) to get to 95%. If still LOW after research, ASK USER | **Run `/effort xhigh` now** — don't wait |
|
|
188
|
+
| FAILED 2x | Something's wrong | Try cross-model research (Codex) for a fresh perspective. If still stuck, STOP and ASK USER | **Run `/effort max` now** — you're already burning cycles at lower effort |
|
|
189
|
+
| CONFUSED | Can't diagnose why something is failing | Try cross-model research (Codex). If still confused, STOP. Describe what you tried, ask for help | **Run `/effort max` now** — stop spinning |
|
|
190
|
+
|
|
191
|
+
**Dynamic bumping is NOT optional.** "Consider max effort" is the same as "ignore this" in practice. If your confidence drops or tests fail twice, bump effort BEFORE the next attempt — not after a third failure. Spinning at low effort is an SDLC failure mode, not a style choice.
|
|
190
192
|
|
|
191
193
|
## Self-Review Loop (CRITICAL)
|
|
192
194
|
|
|
@@ -536,10 +538,11 @@ Local tests pass -> Commit -> Push -> Watch CI
|
|
|
536
538
|
1. Push changes to remote
|
|
537
539
|
2. Watch CI: `gh pr checks --watch`
|
|
538
540
|
3. Read CI logs — **pass or fail**: `gh run view <RUN_ID> --log` (not just `--log-failed`). Passing CI can still hide warnings, skipped steps, or degraded scores. Don't just check the green checkmark
|
|
539
|
-
4.
|
|
540
|
-
5. If CI
|
|
541
|
-
6.
|
|
542
|
-
7.
|
|
541
|
+
4. **Cross-model review the CI logs themselves** — pipe `gh run view <RUN_ID> --log` to a tmp file and run `codex exec -c 'model_reasoning_effort="xhigh"' -s danger-full-access` with a prompt like *"Audit this CI log for silent failures, skipped tests, degraded metrics, or warnings-that-should-be-errors. Green checkmark is necessary but not sufficient."* A second model catches things the first missed (e.g., a job that passed but degraded an E2E score by 30%, or a test that was silently excluded). Cheap — one extra `codex exec` per PR. **Run separately on Tier 1 quick-check AND Tier 2 5x evaluation logs** — they exercise different code paths, so a clean Tier 1 audit doesn't imply a clean Tier 2. Evidence from PR #206: Tier 1 audit found 3 P1s (Node 24 false-green, "11/10" score leak, E2E incomplete); Tier 2 audit TBD — value is measured by running both and comparing.
|
|
542
|
+
5. If CI fails → diagnose from logs, fix, push again (max 2 attempts)
|
|
543
|
+
6. If CI passes → read ALL review comments: `gh api repos/OWNER/REPO/pulls/PR/comments`
|
|
544
|
+
7. Fix valid suggestions, push, iterate until clean
|
|
545
|
+
8. Only then: explicit merge with `gh pr merge --squash`
|
|
543
546
|
|
|
544
547
|
**Why this is non-negotiable:** PR #145 auto-merged a release before review feedback was read. CI reviewer found a P1 dead-code bug that shipped to main. The fix required a follow-up commit. Auto-merge cost more time than the shepherd loop would have taken.
|
|
545
548
|
|
|
@@ -717,6 +720,40 @@ If this session revealed insights, update the right place:
|
|
|
717
720
|
- **General project context** → `CLAUDE.md` (or `/revise-claude-md`)
|
|
718
721
|
- **Plan files** → If this session's work came from a plan file, delete it or mark it complete. Stale plans mislead future sessions into thinking work is still pending
|
|
719
722
|
|
|
723
|
+
### Memory Audit Protocol
|
|
724
|
+
|
|
725
|
+
Per-user memory at `~/.claude/projects/<proj>/memory/` accumulates private learnings. Some belong there (user preferences, external references). Others are portable technical lessons (tool quirks, platform gotchas, bash/GHA/macOS footguns) that would save the next contributor hours. Run this audit to promote the portable ones.
|
|
726
|
+
|
|
727
|
+
**When to run:**
|
|
728
|
+
- End-of-release (before cutting a tag)
|
|
729
|
+
- After a debugging-heavy session with multiple memory additions
|
|
730
|
+
- On explicit "audit my memory" request
|
|
731
|
+
|
|
732
|
+
**Classify each memory file in `~/.claude/projects/<proj>/memory/`:**
|
|
733
|
+
|
|
734
|
+
1. **Rule-based denylist (deterministic, no LLM):**
|
|
735
|
+
- `type: user` → `keep` (user identity, preferences — never promote)
|
|
736
|
+
- `type: reference` → `keep` (external pointers to Discord/URL/etc — private by default)
|
|
737
|
+
- `type: project` → `manual-review` (often mixed state + portable lesson — human decides)
|
|
738
|
+
- `type: feedback` → `manual-review` (often mixed personal preference + portable rule — human decides)
|
|
739
|
+
- Parser must normalize YAML variants (`type: "user"`, `type: user # comment`, surrounding whitespace) — see `tests/test-memory-audit-protocol.sh::apply_denylist_rule` for the reference implementation
|
|
740
|
+
2. **Remaining entries** (no type, or type outside the 4 above) fall through to human-gated review. An LLM-assisted classification runner is Prove-It-Gated: build it only after running this protocol 4+ times with manual classification. Until then, human review at promotion time IS the quality gate
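The denylist rule above can be sketched as a small shell function. This is an illustration only, not the reference implementation the protocol cites: `classify_memory_type` is a hypothetical name, it handles a single `type:` line rather than a whole frontmatter block, and the `unclassified` label for the fall-through case is invented here.

```shell
# Hypothetical sketch of the rule-based denylist for one frontmatter line.
classify_memory_type() (
    # Normalize: drop the key, a trailing "# comment", whitespace, and quotes.
    t=$(printf '%s\n' "$1" \
        | sed -e 's/^[[:space:]]*type:[[:space:]]*//' \
              -e 's/[[:space:]]*#.*$//' \
              -e 's/[[:space:]]*$//' \
              -e 's/^["'"'"']//' \
              -e 's/["'"'"']$//')
    case "$t" in
        user|reference)   echo keep ;;
        project|feedback) echo manual-review ;;
        *)                echo unclassified ;;  # falls through to human review
    esac
)
```

Under these assumptions, `type: "user"`, `type: user # comment`, and `  type: user  ` all map to `keep`, while a line with an unknown type or no `type:` key maps to `unclassified`.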
|
|
741
|
+
|
|
742
|
+
**Destinations for `promote` entries (no new files — use existing wizard destinations):**
|
|
743
|
+
|
|
744
|
+
| Content | Target |
|
|
745
|
+
|---------|--------|
|
|
746
|
+
| Language/tool/platform gotchas (bash, gh CLI, GHA, macOS) | `SDLC.md` → `## Lessons Learned` section |
|
|
747
|
+
| Testing gotchas (flaky patterns, mock-vs-integration lessons) | `TESTING.md` |
|
|
748
|
+
| Tool-specific quirks tied to a skill | That skill's `SKILL.md` |
|
|
749
|
+
| Process rules that should govern the project | `CLAUDE.md` |
|
|
750
|
+
|
|
751
|
+
**Tracking:** When you promote an entry, add `promoted_to: <path>` to that memory file's YAML frontmatter. Subsequent audits skip already-promoted entries.
|
|
752
|
+
|
|
753
|
+
**Human gate is MANDATORY.** Protocol produces diffs; user approves chunk-by-chunk before apply. Never auto-apply — private memory touching public docs needs human judgement.
|
|
754
|
+
|
|
755
|
+
**Prove It Gate:** If you find yourself running this protocol 4+ times and manually doing the same classification work, that's evidence to build a `/memory-audit` slash command AND wire the LLM-gated quality tests (8/10 classification, 6/6 destination). Until then, protocol + human review is enough — and no stub tests that skip (they mislead reviewers into thinking a gate exists when it doesn't).
|
|
756
|
+
|
|
720
757
|
## Post-Mortem: When Process Fails, Feed It Back
|
|
721
758
|
|
|
722
759
|
**Every process failure becomes an enforcement rule.** When you skip a step and it causes a problem, don't just fix the symptom — add a gate so it can't happen again.
|
package/skills/setup/SKILL.md
CHANGED
|
@@ -174,31 +174,61 @@ Skip this step if no branding assets or UI/content patterns detected.
|
|
|
174
174
|
|
|
175
175
|
### Step 9: Configure Tool Permissions
|
|
176
176
|
|
|
177
|
-
Based on detected stack, suggest `
|
|
177
|
+
Based on detected stack, suggest entries for `permissions.allow` in `.claude/settings.json`:
|
|
178
178
|
- Package manager commands (npm, pnpm, yarn, cargo, go, pip, etc.)
|
|
179
179
|
- Build/test commands
|
|
180
180
|
- CI tools (gh)
|
|
181
181
|
|
|
182
|
+
Write the shape as:
|
|
183
|
+
|
|
184
|
+
```json
|
|
185
|
+
{
|
|
186
|
+
"permissions": {
|
|
187
|
+
"allow": [
|
|
188
|
+
"Bash(npm:*)",
|
|
189
|
+
"Bash(npx:*)",
|
|
190
|
+
"Bash(git:*)",
|
|
191
|
+
"Bash(gh:*)"
|
|
192
|
+
]
|
|
193
|
+
}
|
|
194
|
+
}
|
|
195
|
+
```
|
|
196
|
+
|
|
197
|
+
**Do NOT write the deprecated top-level `allowedTools` array** (issue #197). Claude Code treats the presence of `allowedTools` in project settings as "user has explicitly scoped tool permissions" and silently disables its auto-mode classifier — same failure family as the model pin in #198. `permissions.allow` is the supported successor and does not trip the auto-mode gate.
|
|
198
|
+
|
|
182
199
|
Present suggestions and let the user confirm.
|
|
183
200
|
|
|
184
|
-
### Step 9.5: Context Window Configuration
|
|
201
|
+
### Step 9.5: Context Window Configuration (Opt-In)
|
|
202
|
+
|
|
203
|
+
The CLI ships `cli/templates/settings.json` with **no** `model` or `env` pin by default. This preserves Claude Code's built-in model auto-selection (Sonnet for cheap tasks, Opus for hard ones) and the upstream autocompact threshold. Power users who want guaranteed 1M context can opt in during setup.
|
|
204
|
+
|
|
205
|
+
**Why this is opt-in (issue #198):** A top-level `"model"` in `settings.json` tells Claude Code "the user has explicitly chosen a model" and disables auto-mode for the session. That is a real tradeoff — the pin is only worth it when you actually need the 1M headroom and want to lock to Opus 4.7.
|
|
206
|
+
|
|
207
|
+
**Ask the user exactly once in Step 9.5:**
|
|
208
|
+
|
|
209
|
+
> Pin the session to `opus[1m]` (Opus 4.7 with 1M context) and set `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30`?
|
|
210
|
+
>
|
|
211
|
+
> - **No (default):** Leaves auto-mode enabled. Claude Code picks the model per turn, compaction follows upstream defaults. Recommended for most users.
|
|
212
|
+
> - **Yes:** Long SDLC sessions (plan → TDD → review → CI shepherd on one feature) regularly cross 100K tokens; the 1M window gives headroom and 30% autocompact fires at ~300K. Requires Claude Code v2.1.111+ and comfort with losing model auto-selection.
|
|
213
|
+
>
|
|
214
|
+
> `[y/N]`
|
|
185
215
|
|
|
186
|
-
|
|
216
|
+
**If the user answers No (default):** Make no edits to `.claude/settings.json`. Auto-mode stays on. Done.
|
|
187
217
|
|
|
188
|
-
|
|
189
|
-
- **200K fallback (`opus`):** Edit `.claude/settings.json` — change `"model"` to `"opus"` and raise `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` to `"75"` (otherwise `30%` of 200K compacts too early at 60K).
|
|
218
|
+
**If the user answers Yes:** Edit `.claude/settings.json` and add both fields at the top level:
|
|
190
219
|
|
|
191
|
-
To fall back to 200K, edit `.claude/settings.json`:
|
|
192
220
|
```json
|
|
193
221
|
{
|
|
194
|
-
"model": "opus",
|
|
222
|
+
"model": "opus[1m]",
|
|
195
223
|
"env": {
|
|
196
|
-
"CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "
|
|
224
|
+
"CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "30"
|
|
197
225
|
}
|
|
198
226
|
}
|
|
199
227
|
```
|
|
200
228
|
|
|
201
|
-
|
|
229
|
+
Mention the escape hatch either way:
|
|
230
|
+
- To opt out later: remove the `model` line (and optionally the `env` block) from `.claude/settings.json`, or run `/model` and pick "Default (recommended)".
|
|
231
|
+
- For CI pipelines with short tasks, consider `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60` — compact early to stay fast.
|
|
202
232
|
|
|
203
233
|
This is project-scoped and shared with the team via git.
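The token figures quoted in this step follow from simple percentage arithmetic. A minimal sketch, assuming the override is a plain percentage of the context window (how Claude Code applies it internally is not stated here); `autocompact_threshold` is a hypothetical helper name.

```shell
# Hypothetical helper: approximate token count at which autocompact fires.
autocompact_threshold() {
    # $1 = context window in tokens, $2 = CLAUDE_AUTOCOMPACT_PCT_OVERRIDE value
    echo $(( $1 * $2 / 100 ))
}
```

For a 1M window at 30 this gives 300000, matching the "~300K" figure above; the same 30 on a 200K window gives 60000, which is why the earlier 200K fallback text raised the value to 75 (150000).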
|
|
204
234
|
|
|
@@ -224,10 +254,13 @@ Tell the user:
|
|
|
224
254
|
> **Exit Claude Code and restart it** for the new configuration to take effect.
|
|
225
255
|
> On restart, the SDLC hook will fire and you'll see the checklist in every response.
|
|
226
256
|
>
|
|
227
|
-
> **Optional next
|
|
257
|
+
> **Optional next steps:**
|
|
228
258
|
> - Run `/claude-automation-recommender` for stack-specific tooling suggestions (MCP servers, formatting hooks, type-checking hooks, plugins)
|
|
259
|
+
> - After a few sessions, run `/less-permission-prompts` — a native Claude Code skill
|
|
260
|
+
> that scans your transcripts for common read-only Bash/MCP calls and proposes a
|
|
261
|
+
> prioritized allowlist. Reduces permission friction without enabling auto mode.
|
|
229
262
|
>
|
|
230
|
-
>
|
|
263
|
+
> Both are complementary to the SDLC wizard — they add tooling and quality-of-life, not process enforcement.
|
|
231
264
|
|
|
232
265
|
## Rules
|
|
233
266
|
|
package/skills/update/SKILL.md
CHANGED
|
@@ -46,9 +46,10 @@ Parse all CHANGELOG entries between the user's installed version and the latest.
|
|
|
46
46
|
|
|
47
47
|
```
|
|
48
48
|
Installed: 1.24.0
|
|
49
|
-
Latest: 1.
|
|
49
|
+
Latest: 1.34.0
|
|
50
50
|
|
|
51
51
|
What changed:
|
|
52
|
+
- [1.34.0] API feature detection shepherd for Claude releases, Memory Audit Protocol with 7 verified lessons (+2 caught-and-retracted), /less-permission-prompts surfaced, ...
|
|
52
53
|
- [1.33.0] opus[1m] as SDLC default, dual-channel install drift guardrails, model/effort session-start nudge, ...
|
|
53
54
|
- [1.32.0] Opus 4.7 + xhigh support, model/effort upgrade detection, benchmark ceiling audit, ...
|
|
54
55
|
- [1.31.0] Hook false-positive fix for non-SDLC dirs, ephemeral marketplace path warning, ...
|
|
@@ -109,6 +110,54 @@ NEVER overwrite settings.json. Instead:
|
|
|
109
110
|
|
|
110
111
|
The CLI's `init --force` already has smart merge logic for settings.json. If the manual merge gets complicated, suggest: `npx agentic-sdlc-wizard init --force` (it preserves custom hooks).
|
|
111
112
|
|
|
113
|
+
### Step 7.5: Model Pin Migration (Issue #198)
|
|
114
|
+
|
|
115
|
+
Wizard versions 1.31.0–1.33.x unconditionally wrote `"model": "opus[1m]"` and `"env": { "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "30" }` to `.claude/settings.json`. Issue #198 flipped that to opt-in because a top-level `model` disables Claude Code's auto-mode for the session.
|
|
116
|
+
|
|
117
|
+
If the user is upgrading from a pre-#198 version, check their `.claude/settings.json`:
|
|
118
|
+
|
|
119
|
+
1. **If `model` is `"opus[1m]"` and `env.CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` is `"30"`** — this is likely the old wizard-installed pair, not an intentional user choice. Ask:
|
|
120
|
+
|
|
121
|
+
> Your `.claude/settings.json` pins `model: "opus[1m]"` with `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=30`.
|
|
122
|
+
> This pair was the SDLC wizard default in 1.31.0–1.33.x, but it disables Claude Code's auto-mode (see issue #198).
|
|
123
|
+
>
|
|
124
|
+
> - **Remove the pin** (recommended for most users) — keeps auto-mode enabled, lets Claude Code pick the model per turn.
|
|
125
|
+
> - **Keep the pin** — you want guaranteed Opus 4.7 + 1M context, and you're OK giving up model auto-selection.
|
|
126
|
+
>
|
|
127
|
+
> Remove, keep, or decide later? `[r/k/l]`
|
|
128
|
+
|
|
129
|
+
2. **If only one of the two fields matches** (e.g. `model: "opus[1m]"` but custom autocompact, or vice versa) — treat as intentional customization. Do not prompt.
|
|
130
|
+
|
|
131
|
+
3. **If `model` is some other value** (e.g. `"sonnet"`, `"opus"`) — treat as user's explicit choice. Do not touch.
|
|
132
|
+
|
|
133
|
+
4. **If neither field is set** — user is already on the new default. No action.
|
|
134
|
+
|
|
135
|
+
When removing: edit the file in place, drop the `model` key (and the `env.CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` key if nothing else is in `env`, otherwise leave `env` alone). Never touch other keys the user added.
|
|
136
|
+
|
|
137
|
+
### Step 7.6: `allowedTools` → `permissions.allow` Migration (Issue #197)
|
|
138
|
+
|
|
139
|
+
Wizard versions before #197 guided users to write a top-level `allowedTools` array in `.claude/settings.json`. Claude Code silently disables its auto-mode classifier when that key is present, even with `defaultMode: "auto"` set globally.
|
|
140
|
+
|
|
141
|
+
If the user's `.claude/settings.json` has a top-level `allowedTools` array, offer to migrate:
|
|
142
|
+
|
|
143
|
+
1. **If only `allowedTools` is present** (no `permissions.allow`) — ask:
|
|
144
|
+
|
|
145
|
+
> Your `.claude/settings.json` has a top-level `allowedTools` array. This silently disables Claude Code auto-mode (see issue #197). The supported successor is `permissions.allow`, which accepts the same patterns but doesn't trip the auto-mode gate.
|
|
146
|
+
>
|
|
147
|
+
> - **Migrate** (recommended): move all entries into `permissions.allow`, remove the old `allowedTools`.
|
|
148
|
+
> - **Keep** — you have a specific reason to use the legacy key.
|
|
149
|
+
> - **Later** — don't touch it now.
|
|
150
|
+
>
|
|
151
|
+
> `[m/k/l]`
|
|
152
|
+
|
|
153
|
+
2. **If both `allowedTools` and `permissions.allow` are present**: flag it, because the two lists may have diverged. Show both arrays to the user. On migrate, append every entry from `allowedTools` to the end of `permissions.allow` (preserving order within each list), then drop the legacy `allowedTools` key. **Do NOT dedup.** If the same string appears in both lists, it stays in both positions: Claude Code treats duplicate entries as a no-op, whereas dedup would silently remove entries the user may have added deliberately. If the user explicitly asks to dedup, do that as a separate follow-up edit.
|
|
154
|
+
|
|
155
|
+
3. **If only `permissions.allow` is present** — user is already on the new shape. No action.
|
|
156
|
+
|
|
157
|
+
4. **If neither is present** — no action.
|
|
158
|
+
|
|
159
|
+
When migrating: preserve every entry byte-for-byte; only the container key changes. Do not reorder, dedup, or expand wildcards. Other top-level keys (hooks, env, model, custom user fields) are never touched.
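
The migration rules above can be sketched as follows, again assuming a parsed settings object. `migrateAllowedTools` is a hypothetical name, and the example entries in the usage note are illustrative only. The sketch covers the "migrate" choice; prompting the user and writing the file back are left out.

```javascript
// Hypothetical helper: move a top-level allowedTools array into
// permissions.allow without reordering, deduplicating, or expanding entries.
function migrateAllowedTools(settings) {
  // Cases 3 and 4: no legacy key present, nothing to do.
  if (!Array.isArray(settings.allowedTools)) return settings;

  // Split off the legacy key; every other top-level key stays as it was.
  const { allowedTools, ...rest } = settings;

  // Case 2: keep existing permissions.allow entries first, then append
  // the legacy entries in their original order. Duplicates are kept.
  const existing = Array.isArray(rest.permissions?.allow)
    ? rest.permissions.allow
    : [];

  return {
    ...rest,
    permissions: {
      ...rest.permissions, // preserves sibling keys such as deny
      allow: [...existing, ...allowedTools],
    },
  };
}
```

With `allowedTools: ["Read", "Edit"]` and an existing `permissions.allow: ["Read"]`, the result holds `["Read", "Read", "Edit"]`: the duplicate `"Read"` is kept on purpose, matching the "Do NOT dedup" rule.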
|
|
160
|
+
|
|
112
161
|
### Step 8: Apply Selected Changes
|
|
113
162
|
|
|
114
163
|
For each file the user approved:
|