agentic-sdlc-wizard 1.20.0 → 1.22.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +36 -0
- package/CLAUDE_CODE_SDLC_WIZARD.md +213 -160
- package/README.md +5 -5
- package/cli/init.js +5 -4
- package/cli/templates/skills/feedback/SKILL.md +92 -0
- package/cli/templates/skills/sdlc/SKILL.md +98 -2
- package/cli/templates/skills/setup/SKILL.md +85 -54
- package/cli/templates/skills/update/SKILL.md +4 -4
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,42 @@ All notable changes to the SDLC Wizard.

  > **Note:** This changelog is for humans to read. Don't manually apply these changes - just run the wizard ("Check for SDLC wizard updates") and it handles everything automatically.

+ ## [1.22.0] - 2026-04-01
+
+ ### Added
+ - Plan Auto-Approval Gate — skip plan approval when confidence >= 95% AND task is single-file/trivial. Still announces approach, just doesn't wait. "When in doubt, wait for approval" (#53)
+ - Debugging Workflow section — systematic Reproduce → Isolate → Root Cause → Fix → Regression Test methodology. `git bisect` for regressions, environment-specific debugging guidance (#55)
+ - `/feedback` skill — privacy-first community contribution loop. Bug reports, feature requests, pattern sharing, SDLC improvements. Never scans without explicit consent. Creates GH issues on wizard repo (#37)
+ - BRANDING.md detection in setup wizard — scans for brand/, logos/, style-guide.md, brand-voice.md. Conditional generation only when branding assets found (#44)
+ - N-Reviewer CI Pipeline guidance — address each reviewer independently, resolve conflicts, max 3 iterations per reviewer (#32)
+ - Custom Subagents documentation — `.claude/agents/` pattern for sdlc-reviewer, ci-debug, test-writer agents. Skills vs agents comparison (#45)
+ - CLI distributes `/feedback` skill (9 template files, was 8)
+ - Improved CLI install restart messaging — `--continue` promoted as primary option for preserving conversation history
+ - 20 new tests across all 6 roadmap items
+
+ ### Changed
+ - SDLC skill: added Auto-Approval, Debugging Workflow, Multiple Reviewers, Custom Subagents sections
+ - Setup skill: added branding asset detection (Step 1) and BRANDING.md generation (Step 8.5)
+ - Wizard doc: added Plan Auto-Approval, Debugging Workflow, N-Reviewer Pipeline, Custom Subagents, BRANDING.md template
+
+ ## [1.21.0] - 2026-03-31
+
+ ### Added
+ - Confidence-driven setup wizard — kills the fixed 18 questions. Scans repo, builds confidence per data point, only asks what it can't infer. Dynamic question count (0-2 for well-configured projects, 10+ for bare repos). 95% aggregate confidence threshold (#52)
+ - CI Shepherd opt-in question in setup wizard (#48 partial)
+ - Cross-model release review recommendation — releases/publishes as explicit trigger, Release Review Checklist with v1.20.0 evidence (#49)
+ - Prove It Gate enforcement in SDLC skill — prevents unvalidated additions with quality test requirements (#50)
+ - 6 confidence-driven setup tests, 10 prove-it-gate tests, 6 release review tests
+
+ ### Removed
+ - ci-analyzer skill — violated Prove It philosophy (existence-only tests, no quality validation, overlap with `/claude-automation-recommender`) (#50)
+ - ci-self-heal.yml deprecated — local shepherd is the primary CI fix mechanism
+
+ ### Changed
+ - Wizard doc: Q-numbered questions → data point descriptions with detection hints
+ - Setup skill: 12 steps (was 11) with new "Build Confidence Map" step
+ - CLI distributes 8 template files (was 9, removed ci-analyzer)
+
  ## [1.20.0] - 2026-03-31

  ### Added
package/CLAUDE_CODE_SDLC_WIZARD.md
CHANGED

@@ -37,7 +37,7 @@ As Claude Code improves, the wizard absorbs those improvements and removes its o
  **But here's the key:** This isn't a one-size-fits-all answer. It's a starting point that helps you find YOUR answer. Every project is different. The self-evaluating loop (plan → build → test → review → improve) needs to be tuned to your codebase, your team, your standards. The wizard gives you the framework — you shape it into something bespoke.

  **The living system:**
- -
+ - The local shepherd captures friction signals during active sessions
  - You approve changes to the process
  - Both sides learn over time
  - The system improves the system (recursive improvement)
@@ -356,6 +356,14 @@ This applies to everything: native Claude Code commands vs custom skills, framew

  **For the wizard's CI/CD:** When the weekly-update workflow detects a new Claude Code feature that overlaps with a wizard feature, the CI should automatically run E2E with both versions and recommend KEEP CUSTOM / SWITCH TO NATIVE / TIE.

+ **This applies to YOUR OWN additions too — not just native vs custom:**
+ - Adding a new skill? Prove it fills a gap nothing else covers. Write quality tests.
+ - Adding a new hook? Prove it improves scores or catches real issues.
+ - Adding a new workflow? Prove the automation ROI exceeds maintenance cost.
+ - Existence tests ("file exists", "has frontmatter") are NOT proof. They prove the file was created, not that it works.
+
+ **Evidence:** ci-analyzer skill was added in v1.20.0 with 4 existence-only tests, zero quality validation, and overlap with the third-party `/claude-automation-recommender`. Deleted in next release. This gap led to the Prove It Gate enforcement in the SDLC skill.
+
  ---

  ## What You're Setting Up
@@ -410,6 +418,8 @@ After planning, you get a free `/compact` - Claude's plan is preserved in the su
  4. You run `/compact` → frees context, plan preserved in summary
  5. Claude implements with clean context

+ **Plan Auto-Approval:** For HIGH confidence (95%+) tasks that are single-file or trivial (config tweak, small bug fix, string change) with no new patterns — skip plan approval and go straight to TDD. Claude still announces the approach but doesn't wait for approval. When in doubt, wait.
+
  ### 2. Confidence Levels Prevent Disasters

  Claude MUST state confidence before implementing:
@@ -954,7 +964,7 @@ After SDLC setup is complete, run `/claude-automation-recommender` for stack-spe
  | Category | Wizard Ships | Recommender Suggests |
  |----------|-------------|---------------------|
  | SDLC process (TDD, planning, review) | Enforced via hooks + skills | Not covered |
- | CI workflows (
+ | CI workflows (PR review) | Templates + docs | Not covered |
  | MCP servers (context7, Playwright, DB) | Not covered | Per-stack suggestions |
  | Auto-formatting hooks (Prettier, ESLint) | Not covered | Per-stack suggestions |
  | Type-checking hooks (tsc, mypy) | Not covered | Per-stack suggestions |
@@ -1026,39 +1036,44 @@ Feature branches still recommended for solo devs (keeps main clean, easy rollbac

  **Back-and-forth:** User questions live in PR comments. Bot's response is always the latest sticky comment. Clean and organized.

- **CI
- > "
+ **CI shepherd opt-in (only if CI detected during auto-scan):**
+ > "Enable CI shepherd role? Claude will actively watch CI, auto-fix failures, and iterate on review feedback. (y/n)"

- - **Yes** → Enable CI
- - **No** → Skip CI
+ - **Yes** → Enable full shepherd loop: CI fix loop + review feedback loop. Ask detail questions below
+ - **No** → Skip CI shepherd entirely (Claude still runs local tests, just doesn't interact with CI after pushing)

- **What
- 1. After pushing, Claude
- 2.
- 3. Claude diagnoses the failure and proposes a fix
- 4. Max 2 fix attempts, then asks user
- 5. Job isn't done until CI is green
+ **What the CI shepherd does:**
+ 1. **CI fix loop:** After pushing, Claude watches CI via `gh pr checks`, reads failure logs, diagnoses and fixes, pushes again (max 2 attempts)
+ 2. **Review feedback loop:** After CI passes, Claude reads automated review comments, implements valid suggestions, pushes and re-reviews (max 3 iterations)

- **Recommendation:** Yes if you have CI configured.
- "local tests pass" and "PR is actually ready to merge."
+ **Recommendation:** Yes if you have CI configured. The shepherd closes the loop between "local tests pass" and "PR is actually ready to merge."

  **Requirements:**
  - `gh` CLI installed and authenticated
  - CI/CD configured (GitHub Actions, etc.)
  - If no CI yet: skip, add later when you set up CI

+ **Stored in SDLC.md metadata as:**
+ ```
+ <!-- CI Shepherd: enabled -->
+ ```
+
+ **Detail questions (only if CI shepherd is enabled):**
+
+ **CI monitoring detail:**
+ > "Should Claude monitor CI checks after pushing and auto-diagnose failures? (y/n)"
+
+ - **Yes** → Enable CI feedback loop in SDLC skill, add `gh` CLI to allowedTools
+ - **No** → Skip CI monitoring steps (Claude still runs local tests, just doesn't watch CI)
+
  **CI review feedback question (only if CI monitoring is enabled):**
  > "What level of automated review response do you want?"

- | Level | Name | What
- |
- | **L1** | `ci-only` | CI failures only (broken tests, lint) |
- | **L2** | `criticals` (default) | + Critical review findings (must-fix) |
- | **L3** | `all-findings` | + Every suggestion the reviewer flags |
-
- > **Cost note:** Higher levels mean more autofix iterations (each ~$0.50).
- > L3 typically adds 1-2 extra iterations per PR but produces cleaner code.
- > You can change this anytime by editing `AUTOFIX_LEVEL` in your ci-autofix workflow.
+ | Level | Name | What the shepherd handles |
+ |-------|------|--------------------------|
+ | **L1** | `ci-only` | CI failures only (broken tests, lint) |
+ | **L2** | `criticals` (default) | + Critical review findings (must-fix) |
+ | **L3** | `all-findings` | + Every suggestion the reviewer flags |

  **What this does:**
  1. After CI passes, Claude reads the automated code review comments
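The bounded CI fix loop described in the hunk above can be sketched as a small retry function. This is an editor's illustration, not wizard code: `ci_passes` and `apply_fix` are invented stand-ins for Claude's diagnose-and-fix step, while the comments name the real `gh` CLI commands the shepherd would use.

```shell
# Sketch of the shepherd's CI fix loop (max 2 fix attempts, then escalate).
# `ci_passes` and `apply_fix` are placeholder stubs; the real status checks
# use the `gh` CLI, e.g.:
#   gh pr checks --watch         # wait for checks on the current PR
#   gh run view --log-failed     # read the logs of the failed run
shepherd_ci_loop() {
  max_attempts=2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if ci_passes; then
      echo "green"
      return 0
    fi
    apply_fix "$attempt"         # diagnose from failure logs, fix, push again
    attempt=$((attempt + 1))
  done
  # Out of attempts: one last look, then hand back to the user
  if ci_passes; then echo "green"; else echo "escalate-to-user"; fi
}

# Stubs so the sketch runs standalone: CI stays red, fixes are no-ops
ci_passes() { false; }
apply_fix() { :; }
shepherd_ci_loop    # prints: escalate-to-user
```

The same shape, with a 3-iteration bound, covers the review feedback loop.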
@@ -1233,9 +1248,11 @@ Recommendation: Your current tests rely heavily on mocks.

  ---

- ## Step 1:
+ ## Step 1: Build Confidence Map and Fill Gaps
+
+ Claude assigns a state to each configuration data point based on scan results. **RESOLVED (detected)** items are presented for bulk confirmation. **RESOLVED (inferred)** items are presented with inferred values for the user to verify. **UNRESOLVED** items become questions. **The number of questions is dynamic — it depends on how much the scan resolves.** Stop asking when ALL data points are resolved (detected, inferred+confirmed, or answered by user).

- Claude presents what it found
+ Claude presents what it found, organized by resolution state:

  ### Project Structure (Auto-Detected)

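The three resolution states in the hunk above reduce to one rule per data point. A minimal sketch, as an editor's illustration (function and argument names are invented, not wizard internals):

```shell
# Illustrative per-data-point resolution rule; all names are hypothetical.
resolve_state() {
  detected=$1   # "yes" if the scan found a definitive value
  inferred=$2   # "yes" if the scan produced a best guess
  if [ "$detected" = "yes" ]; then
    echo "RESOLVED-detected"    # bulk-confirm, no question asked
  elif [ "$inferred" = "yes" ]; then
    echo "RESOLVED-inferred"    # present the value, user verifies
  else
    echo "UNRESOLVED"           # becomes a setup question
  fi
}

resolve_state yes no    # prints: RESOLVED-detected
resolve_state no yes    # prints: RESOLVED-inferred
resolve_state no no     # prints: UNRESOLVED
```

The dynamic question count falls out directly: only `UNRESOLVED` data points generate questions.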
@@ -1244,13 +1261,13 @@ Claude presents what it found. You confirm or override:
  Override? (leave blank to accept): _______________
  ```

- **
+ **Test directory** (detect from tests/, __tests__/, spec/, test file patterns)
  ```
  Examples: tests/, __tests__/, src/**/*.test.ts, spec/
  Your answer: _______________
  ```

- **
+ **Test framework** (detect from jest.config, vitest.config, pytest.ini, etc.)
  ```
  Options: Jest, Vitest, Playwright, Cypress, pytest, Go testing, other
  Your answer: _______________
@@ -1258,31 +1275,31 @@ Your answer: _______________

  ### Commands

- **
+ **Lint command** (detect from package.json scripts, Makefile, config files)
  ```
  Examples: npm run lint, pnpm lint, eslint ., biome check
  Your answer: _______________
  ```

- **
+ **Type-check command** (detect from tsconfig.json, mypy.ini, etc.)
  ```
  Examples: npm run typecheck, tsc --noEmit, mypy, none
  Your answer: _______________
  ```

- **
+ **Run all tests command** (detect from package.json "test" script, Makefile)
  ```
  Examples: npm run test, pnpm test, pytest, go test ./...
  Your answer: _______________
  ```

- **
+ **Run single test file command** (infer from framework: jest → jest path, pytest → pytest path)
  ```
  Examples: npm run test -- path/to/test.ts, pytest path/to/test.py
  Your answer: _______________
  ```

- **
+ **Production build command** (detect from package.json "build" script, Makefile)
  ```
  Examples: npm run build, pnpm build, go build, cargo build
  Your answer: _______________
@@ -1290,7 +1307,7 @@ Your answer: _______________

  ### Deployment

- **
+ **Deployment setup** (auto-detected from Dockerfile, vercel.json, fly.toml, deploy scripts)
  ```
  Detected: [e.g., Vercel, GitHub Actions, Docker, none]

@@ -1313,19 +1330,19 @@ Your answer: _______________

  ### Infrastructure

- **
+ **Database(s)** (detect from prisma/, .env DB vars, docker-compose services)
  ```
  Examples: PostgreSQL, MySQL, SQLite, MongoDB, none
  Your answer: _______________
  ```

- **
+ **Caching layer** (detect from .env REDIS vars, docker-compose redis service)
  ```
  Examples: Redis, Memcached, none
  Your answer: _______________
  ```

- **
+ **Test duration** (estimate from test file count, CI run times if available)
  ```
  Examples: <1 minute, 1-5 minutes, 5+ minutes
  Your answer: _______________
@@ -1333,7 +1350,7 @@ Your answer: _______________

  ### Output Preferences

- **
+ **Response detail level** (cannot detect — always ask if no preference found)
  ```
  Options:
  - Small - Minimal output, just essentials (experienced users)
@@ -1351,7 +1368,7 @@ Stored in `.claude/settings.json` as `"verbosity": "small|medium|large"`.

  ### Testing Philosophy

- **
+ **Testing approach** (infer from existing test patterns — test-first files, coverage config)
  ```
  Options:
  - Strict TDD (test first always)
@@ -1362,7 +1379,7 @@ Options:
  Your answer: _______________
  ```

- **
+ **Test types** (detect from existing test file patterns: *.test.*, *.spec.*, e2e/, integration/)
  ```
  (Check all that apply)
  [ ] Unit tests (pure logic, isolated)
@@ -1372,7 +1389,7 @@ Your answer: _______________
  [ ] Other: _______________
  ```

- **
+ **Mocking philosophy** (detect from jest.mock, unittest.mock usage patterns)
  ```
  Options:
  - Minimal mocking (real DB, mock external APIs only)
@@ -1387,7 +1404,7 @@ Your answer: _______________
  **If test framework detected (Jest, pytest, Go, etc.):**

  ```
-
+ Code Coverage (Optional)

  Detected: [test framework] with coverage configuration

@@ -1408,7 +1425,7 @@ Your answer: _______________
  **If no test framework detected (docs/AI-heavy project):**

  ```
-
+ Code Coverage (Optional)

  No test framework detected (documentation/AI-heavy project).

@@ -1428,19 +1445,19 @@ Your answer: _______________

  ---

- ###
+ ### How Configuration Data Points Map to Files

-
+ Each resolved data point (whether detected or confirmed by the user) maps to generated files:

- |
- |
- |
- |
- |
- |
- |
- |
- |
+ | Data Point | Used In |
+ |-----------|---------|
+ | Source directory | `tdd-pretool-check.sh` - pattern match |
+ | Test directory | `TESTING.md` - documentation |
+ | Test framework | `TESTING.md` - documentation |
+ | Commands (lint, typecheck, test, build) | `CLAUDE.md` - Commands section |
+ | Infrastructure (DB, cache) | `CLAUDE.md` - Architecture section, `TESTING.md` - mock decisions |
+ | Test duration | `SDLC skill` - wait time note |
+ | Test types (E2E) | `TESTING.md` - testing diamond top |

  ---

@@ -1689,6 +1706,7 @@ TodoWrite([
  { content: "Find and read relevant documentation", status: "in_progress", activeForm: "Reading docs" },
  { content: "Assess doc health - flag issues (ask before cleaning)", status: "pending", activeForm: "Checking doc health" },
  { content: "DRY scan: What patterns exist to reuse?", status: "pending", activeForm: "Scanning for reusable patterns" },
+ { content: "Prove It Gate: adding new component? Research alternatives, prove quality with tests", status: "pending", activeForm: "Checking prove-it gate" },
  { content: "Blast radius: What depends on code I'm changing?", status: "pending", activeForm: "Checking dependencies" },
  { content: "Restate task in own words - verify understanding", status: "pending", activeForm: "Verifying understanding" },
  { content: "Scrutinize test design - right things tested? Follow TESTING.md?", status: "pending", activeForm: "Reviewing test approach" },
@@ -1730,6 +1748,22 @@ TodoWrite([
  - Does test approach follow TESTING.md philosophies?
  - If introducing new test patterns, same scrutiny as code patterns

+ ## Prove It Gate (REQUIRED for New Additions)
+
+ **Adding a new skill, hook, workflow, or component? PROVE IT FIRST:**
+
+ 1. **Research:** Does something equivalent already exist (native CC, third-party plugin, existing skill)?
+ 2. **If YES:** Why is yours better? Show evidence (A/B test, quality comparison, gap analysis)
+ 3. **If NO:** What gap does this fill? Is the gap real or theoretical?
+ 4. **Quality tests:** New additions MUST have tests that prove OUTPUT QUALITY, not just existence
+ 5. **Less is more:** Every addition is maintenance burden. Default answer is NO unless proven YES
+
+ **Existence tests are NOT quality tests:**
+ - BAD: "ci-analyzer skill file exists" — proves nothing about quality
+ - GOOD: "ci-analyzer recommends lint-first when test-before-lint detected" — proves behavior
+
+ **If you can't write a quality test for it, you can't prove it works, so don't add it.**
+
  ## Plan Mode Integration

  **Use plan mode for:** Multi-file changes, new features, LOW confidence, bugs needing investigation.
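The BAD/GOOD contrast in the hunk above can be made concrete as two runnable checks against a throwaway file. Everything here (the file, its contents, the "lint-first" recommendation) is invented for illustration; the point is only the shape of each test.

```shell
# Existence test vs quality test, against a hypothetical skill output file.
out=$(mktemp)
echo "recommendation: run lint before tests (lint-first)" > "$out"

# BAD: existence test. Passes no matter what the file actually says.
[ -f "$out" ] && echo "existence: pass"

# GOOD: quality test. Asserts the behavior the skill claims to provide.
grep -q "lint-first" "$out" && echo "quality: pass"
```

Replace the file contents with garbage and only the quality test fails, which is exactly why existence tests prove nothing.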
@@ -1741,6 +1775,19 @@ TodoWrite([

  **Before TDD, MUST ask:** "Docs updated. Run `/compact` before implementation?"

+ ### Auto-Approval: Skip Plan Approval Step
+
+ If ALL of these are true, skip plan approval and go straight to TDD:
+ - Confidence is **HIGH (95%+)** — you know exactly what to do
+ - Task is **single-file or trivial** (config tweak, small bug fix, string change)
+ - No new patterns introduced
+ - No architectural decisions
+
+ When auto-approving, still announce your approach — just don't wait for approval:
+ > "Confidence HIGH (95%). Single-file change. Proceeding directly to TDD."
+
+ **When in doubt, wait for approval.** Auto-approval is for clear-cut cases only.
+
  ## Confidence Check (REQUIRED)

  Before presenting approach, STATE your confidence:
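The auto-approval criteria in the hunk above form a pure decision rule, so they can be sketched directly. This is an editor's illustration, assuming invented inputs (confidence percent, files touched, a new-patterns flag); the wizard states the rule in prose, not code.

```shell
# Hypothetical sketch of the Plan Auto-Approval gate; all names are
# illustrative, not wizard internals.
auto_approve() {
  confidence=$1      # integer percent, e.g. 95
  files_touched=$2   # number of files the plan touches
  new_patterns=$3    # "yes" or "no"
  if [ "$confidence" -ge 95 ] && [ "$files_touched" -le 1 ] && [ "$new_patterns" = "no" ]; then
    echo "auto-approve"   # announce the approach, proceed straight to TDD
  else
    echo "wait"           # when in doubt, wait for approval
  fi
}

auto_approve 95 1 no    # single file, high confidence: prints auto-approve
auto_approve 97 3 no    # multi-file: prints wait
auto_approve 80 1 no    # confidence below 95: prints wait
```

Note the conjunction: every criterion must hold, so the default path is still "wait".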
@@ -1779,7 +1826,7 @@ PLANNING → DOCS → TDD RED → TDD GREEN → Tests Pass → Self-Review

  ## Cross-Model Review (If Configured)

- **When to run:** High-stakes changes (auth, payments, data handling), complex refactors, research-heavy work.
+ **When to run:** High-stakes changes (auth, payments, data handling), releases/publishes (version bumps, CHANGELOG, npm publish), complex refactors, research-heavy work.
  **When to skip:** Trivial changes (typo fixes, config tweaks), time-sensitive hotfixes, risk < review cost.

  **Prerequisites:** Codex CLI installed (`npm i -g @openai/codex`), OpenAI API key set.
@@ -1884,6 +1931,39 @@ Self-review passes → handoff.json (round 1, PENDING_REVIEW)

  **Full protocol:** See the "Cross-Model Review Loop (Optional)" section below for key flags and reasoning effort guidance.

+ ### Release Review Focus
+
+ Before any release/publish, add these to `review_instructions`:
+ - **CHANGELOG consistency** — all sections present, no lost entries during consolidation
+ - **Version parity** — package.json, SDLC.md, CHANGELOG, wizard metadata all match
+ - **Stale examples** — hardcoded version strings in docs match current release
+ - **Docs accuracy** — README, ARCHITECTURE.md reflect current feature set
+ - **CLI-distributed file parity** — live skills, hooks, settings match CLI templates
+
+ Evidence: v1.20.0 cross-model review caught CHANGELOG section loss and stale wizard version examples that passed all tests and self-review.
+
+ ### Multiple Reviewers (N-Reviewer Pipeline)
+
+ When multiple reviewers comment on a PR (Claude, Codex, human reviewers), address each reviewer independently:
+
+ 1. **Read all reviews** — collect feedback from every active reviewer
+ 2. **Respond per-reviewer** — each reviewer has different blind spots. Address each one's findings separately
+ 3. **Resolve conflicts** — if reviewers disagree, pick the stronger argument, note why
+ 4. **Iterate until all approve** — don't merge until every active reviewer is satisfied
+ 5. **Max 3 iterations per reviewer** — escalate to user if a reviewer keeps finding new things
+
+ The value of multiple reviewers: different models/humans catch different issues. No single reviewer is sufficient for high-stakes changes.
+
+ ### Custom Subagents (`.claude/agents/`)
+
+ Claude Code supports custom subagents in `.claude/agents/`. These run as independent subprocesses focused on a single task:
+
+ - **`sdlc-reviewer`** — SDLC compliance review (planning, TDD, self-review checks)
+ - **`ci-debug`** — CI failure diagnosis (reads logs, identifies root cause)
+ - **`test-writer`** — Quality test writing following TESTING.md philosophies
+
+ **Skills vs agents:** Skills guide Claude's behavior for a task type. Agents are independent subprocesses that run autonomously and return results. Use agents when you want parallel work or a fresh context window.
+
  ## Test Review (Harder Than Implementation)

  During self-review, critique tests HARDER than app code:
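The version-parity item in the release checklist above is easy to spot-check mechanically. A self-contained sketch using throwaway files (a real run would point at the repo's actual package.json and CHANGELOG.md instead):

```shell
# Sketch of a version-parity check between package.json and CHANGELOG.md.
# Throwaway files keep the example self-contained.
dir=$(mktemp -d)
printf '{\n  "version": "1.22.0"\n}\n' > "$dir/package.json"
printf '## [1.22.0] - 2026-04-01\n' > "$dir/CHANGELOG.md"

# Pull the version out of package.json with sed (no node dependency)
version=$(sed -n 's/.*"version": *"\([^"]*\)".*/\1/p' "$dir/package.json")

if grep -q "## \[$version\]" "$dir/CHANGELOG.md"; then
  echo "parity: $version present in CHANGELOG"
else
  echo "parity: MISSING $version in CHANGELOG"
fi
```

The same grep, pointed at docs and templates, catches the "stale hardcoded version string" failure mode the v1.20.0 review found.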
@@ -1961,9 +2041,27 @@ Sometimes the flakiness is genuinely in CI infrastructure (runner environment, G
  - **Keep quality gates strict** — the actual pass/fail decision must NOT have `continue-on-error`
  - **Separate "fail the build" from "nice to have"** — a missing PR comment is not a regression

+ ## Debugging Workflow (Systematic Investigation)
+
+ When something breaks and the cause isn't obvious, follow this systematic debugging workflow:
+
+ ```
+ Reproduce → Isolate → Root Cause → Fix → Regression Test
+ ```
+
+ 1. **Reproduce** — Can you make it fail consistently? If intermittent, stress-test (run N times). If you can't reproduce it, you can't fix it
+ 2. **Isolate** — Narrow the scope. Which file? Which function? Which input? Use binary search: comment out half the code, does it still fail?
+ 3. **Root cause** — Don't fix symptoms. Ask "why?" until you hit the actual cause. "It crashes on line 42" is a symptom. "Null pointer because the API returns undefined when rate-limited" is a root cause
+ 4. **Fix** — Fix the root cause, not the symptom
+ 5. **Regression test** — Write a test that fails without your fix and passes with it (TDD GREEN)
+
+ **For regressions** (it worked before, now it doesn't): Use `git bisect` to find the exact breaking commit. `git bisect start`, `git bisect bad` (current), `git bisect good <known-good-commit>`. Narrows to the breaking commit in O(log n) steps.
+
+ **Environment-specific bugs** (works locally, fails in CI/staging/prod): Check environment differences (env vars, OS version, dependency versions, file permissions). Reproduce the environment locally if possible. Add logging at the failure point — don't guess, observe.
+
  ## CI Feedback Loop — Local Shepherd (After Commit)

- **This is the "local shepherd" —
+ **This is the "local shepherd" — your CI fix mechanism.** It runs in your active session with full context.

  **The SDLC doesn't end at local tests.** CI must pass too.

@@ -2041,25 +2139,6 @@ CI passes -> Read review suggestions
  - **Ask first**: Present suggestions to user, let them decide which to implement
  - **Skip review feedback**: Ignore CI review suggestions, only fix CI failures

- ## Shepherd vs. Bot: Two-Tier CI Fix Model
-
- | Aspect | Local Shepherd | CI Auto-Fix Bot |
- |--------|---------------|-----------------|
- | **When** | Active session (you're working) | Unattended (pushed and walked away) |
- | **Context** | Full: codebase, conversation, intent | Minimal: `--bare`, 200-line truncated logs |
- | **Cost** | Session tokens (marginal cost ~$0) | Separate API calls ($0.50-$2.00 per fix) |
- | **Noise** | 0 extra commits | 1+ `[autofix N/M]` commits per attempt |
- | **Quality** | High: full diagnosis, targeted fix | Lower: stateless, may repeat same approach |
- | **Speed** | Immediate: fix locally, push once | Delayed: workflow_run trigger + runner queue |
- | **Deconfliction** | N/A (is the primary) | SHA check: skips if branch advanced since failure |
-
- **The shepherd is the default.** It runs as part of the SDLC checklist above whenever you push from an active session. The bot is optional and only adds value for:
- - Dependabot/Renovate PRs (no human session)
- - PRs where you push and walk away
- - Overnight CI runs
-
- If you set up the bot, the SHA-based suppression ensures they never conflict.
-
  ## DRY Principle

  **Before coding:** "What patterns exist I can reuse?"
@@ -2158,7 +2237,7 @@ Create `CLAUDE.md` in your project root. This is your project-specific configura

  ## Commands

- <!-- CUSTOMIZE: Replace with your actual commands
+ <!-- CUSTOMIZE: Replace with your actual detected/confirmed commands -->

  - Build: `[your build command]`
  - Run dev: `[your dev command]`
@@ -2245,7 +2324,7 @@ These are your full reference docs. Start with stubs and expand over time:

  ## Environments

- <!-- Claude auto-populates this from
+ <!-- Claude auto-populates this from deployment detection -->

  | Environment | URL | Deploy Command | Trigger |
  |-------------|-----|----------------|---------|
@@ -2292,7 +2371,7 @@ If deployment fails or post-deploy verification catches issues:

  | Environment | Rollback Command | Notes |
  |-------------|------------------|-------|
- | Preview | [auto-expires or redeploy] |
+ | Preview | [auto-expires or redeploy] | Ephemeral — redeploy to fix |
  | Staging | `[your rollback command]` | [notes] |
  | Production | `[your rollback command]` | [critical - document clearly] |

@@ -2322,7 +2401,7 @@ If deployment fails or post-deploy verification catches issues:

  **SDLC.md:**
  ```markdown
- <!-- SDLC Wizard Version: 1.
+ <!-- SDLC Wizard Version: 1.22.0 -->
  <!-- Setup Date: [DATE] -->
  <!-- Completed Steps: step-0.1, step-0.2, step-0.4, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
  <!-- Git Workflow: [PRs or Solo] -->
@@ -2431,6 +2510,36 @@ Reference: `components/ui/` or Storybook
 
 **If you have external design system:** Point to Storybook/Figma URL instead of duplicating.
 
+### BRANDING.md (If Branding Assets Detected)
+
+**Only generated if branding-related files are found:** BRANDING.md, brand/, logos/, style-guide.md, brand-voice.md, tone-of-voice.*, or UI/content-heavy project patterns.
+
+```markdown
+# Brand Guidelines
+
+## Brand Voice & Tone
+- [Detected from brand-voice.md or style guide, or ask user]
+- Formal/casual/technical/friendly
+- Target audience description
+
+## Naming Conventions
+- Product name: [official name, capitalization]
+- Feature names: [naming pattern]
+- Technical terminology: [glossary of project-specific terms]
+
+## Visual Identity
+- Logo usage: [reference to logo files or guidelines]
+- Color palette: [reference to DESIGN_SYSTEM.md if exists]
+- Typography: [font choices and usage]
+
+## Content Style
+- [Any content writing guidelines]
+- [Error message tone]
+- [User-facing copy standards]
+```
+
+**Why BRANDING.md?** Claude writing user-facing copy, error messages, or documentation needs to know the brand voice. Without this, output tone is inconsistent. Skip for backend-only or internal-tool projects.
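The detection heuristic described above can be sketched as a small shell check. This is an illustrative sketch, not the wizard's actual implementation: the `detect_branding` function name and the demo directory are made up, while the path list comes straight from the bullet above.

```shell
# Sketch of the setup wizard's branding-asset detection heuristic.
# detect_branding DIR prints what was found (function name is hypothetical).
detect_branding() {
  dir=$1
  found=""
  # Fixed paths listed in the wizard doc.
  for p in BRANDING.md brand logos style-guide.md brand-voice.md; do
    [ -e "$dir/$p" ] && found="$found $p"
  done
  # Glob for tone-of-voice.* files (unmatched globs stay literal, so -e fails).
  for f in "$dir"/tone-of-voice.*; do
    [ -e "$f" ] && found="$found ${f##*/}"
  done
  if [ -n "$found" ]; then
    echo "branding assets detected:$found"
  else
    echo "no branding assets"
  fi
}

# Demo: a repo with a brand/ directory triggers BRANDING.md generation.
d=$(mktemp -d)
mkdir "$d/brand"
detect_branding "$d"
```

A repo with none of these paths would print `no branding assets`, and the wizard would skip the file entirely.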
+
 ---
 
 ## Step 10: Verify Setup (Claude Does This Automatically)
@@ -2889,87 +2998,6 @@ Claude: [fetches via gh api, discusses with you interactively]
 
 This is optional - skip if you prefer fresh reviews only.
 
-### CI Auto-Fix Loop (Optional — Bot Fallback)
-
-> **Two-tier model:** The SDLC skill's CI loops (above) are the "local shepherd" — they handle CI fixes during active sessions. This bot is the second tier: an unattended fallback for when no one is watching. The bot includes SHA-based suppression — if you push a fix locally before the bot runs, it skips automatically.
-
-Automatically fix CI failures and PR review findings. Claude reads the error context, fixes the code, commits, and re-triggers CI. Loops until CI passes AND review has no findings at your chosen level, or max retries hit.
-
-**The Loop:**
-```
-Push to PR
-  |
-  v
-CI runs ──► FAIL ──► ci-autofix: Claude reads logs, fixes, commits [autofix 1/3] ──► re-trigger
-  |
-  └── PASS ──► PR Review ──► has findings at your level? ──► ci-autofix: fixes all ──► re-trigger
-                 |
-                 └── APPROVE, no findings ──► DONE
-```
-
-**Safety measures:**
-- Never runs on main branch
-- Max retries (default 3, configurable via `MAX_AUTOFIX_RETRIES`)
-- `AUTOFIX_LEVEL` controls what findings to act on (`ci-only`, `criticals`, `all-findings`)
-- Restricted Claude tools (no git, no npm)
-- Self-modification ban (can't edit its own workflow file)
-- `[autofix N/M]` commit tags for audit trail
-- Sticky PR comments show status
-
-**Setup:**
-1. Create `.github/workflows/ci-autofix.yml`:
-
-```yaml
-name: CI Auto-Fix
-
-on:
-  workflow_run:
-    workflows: ["CI", "PR Code Review"]
-    types: [completed]
-
-permissions:
-  contents: write
-  pull-requests: write
-
-env:
-  MAX_AUTOFIX_RETRIES: 3
-  AUTOFIX_LEVEL: criticals  # ci-only | criticals | all-findings
-
-jobs:
-  autofix:
-    runs-on: ubuntu-latest
-    if: |
-      github.event.workflow_run.head_branch != 'main' &&
-      github.event.workflow_run.event == 'pull_request' &&
-      (
-        (github.event.workflow_run.name == 'CI' && github.event.workflow_run.conclusion == 'failure') ||
-        (github.event.workflow_run.name == 'PR Code Review' && github.event.workflow_run.conclusion == 'success')
-      )
-    steps:
-      # Count previous [autofix] commits to enforce max retries
-      # Download CI failure logs or fetch review comment
-      # Check findings at your AUTOFIX_LEVEL (criticals + suggestions)
-      # Run Claude to fix ALL findings with restricted tools
-      # Commit [autofix N/M], push, re-trigger CI
-      # Post sticky PR comment with status
-```
-
-2. Add `workflow_dispatch:` trigger to your CI workflow (so autofix can re-trigger it)
-3. Optionally configure a GitHub App for token generation (avoids `workflow_run` default-branch constraint)
-
-**Token approaches:**
-
-| Approach | When | Pros |
-|----------|------|------|
-| GITHUB_TOKEN + `gh workflow run` | Default | No extra setup |
-| GitHub App token | `CI_AUTOFIX_APP_ID` secret exists | Push triggers `synchronize` naturally |
-
-**Note:** `workflow_run` only fires for workflows on the default branch. The ci-autofix workflow is dormant until first merged to main.
-
-> **Template vs. this repo:** The template above uses `ci-autofix.yml` with `criticals` as a safe default for new projects. The wizard's own repo has evolved this into `ci-self-heal.yml` with `all-findings` — a more aggressive configuration we dogfood internally. Both naming conventions work; the behavior is identical.
-
----
-
 ### Cross-Model Review Loop (Optional)
 
 Use an independent AI model from a different company as a code reviewer. The author can't grade their own homework — a model with different training data and different biases catches blind spots the authoring model misses.
@@ -3108,6 +3136,7 @@ Claude writes code → self-review passes → handoff.json (round 1)
 
 **When to use this:**
 - High-stakes changes (auth, payments, data handling)
+- **Releases and publishes** (version bumps, CHANGELOG, npm publish) — see Release Review Checklist below
 - Research-heavy work where accuracy matters more than speed
 - Complex refactors touching many files
 - Any time you want higher confidence before merging
@@ -3117,6 +3146,30 @@ Claude writes code → self-review passes → handoff.json (round 1)
 - Time-sensitive hotfixes
 - Changes where the review cost exceeds the risk
 
+#### Release Review Checklist
+
+Before any release or npm publish, add these focus areas to the cross-model `review_instructions`:
+
+**Why:** Self-review and automated tests regularly miss release-specific inconsistencies. Evidence: v1.20.0 cross-model review caught 2 real issues (CHANGELOG section lost during consolidation, stale hardcoded version examples) that passed all tests and self-review.
+
+| Check | What to Look For | Example Failure |
+|-------|------------------|-----------------|
+| CHANGELOG consistency | All sections present, no lost entries during consolidation | v1.19.0 section dropped when merging into v1.20.0 |
+| Version parity | package.json, SDLC.md, CHANGELOG, wizard metadata all match | SDLC.md says 1.19.0 but package.json says 1.20.0 |
+| Stale examples | Hardcoded version strings in docs/wizard match current release | Wizard examples showing v1.15.0 when publishing v1.20.0 |
+| Docs accuracy | README, ARCHITECTURE.md reflect current feature set | "8 workflows" when there are actually 7 |
+| CLI-distributed file parity | Live skills, hooks, settings match CLI templates | SKILL.md edited but cli/templates/ not updated |
+
+**Example `review_instructions` for releases:**
+```
+Review for release consistency: CHANGELOG completeness (no lost sections),
+version parity across package.json/SDLC.md/CHANGELOG/wizard metadata,
+stale hardcoded versions in examples, docs accuracy vs actual features,
+CLI-distributed file parity (skills, hooks, settings).
+```
+
+**This complements automated tests, not replaces them.** Tests catch exact version mismatches (e.g., `test_package_version_matches_changelog`). Cross-model review catches semantic issues tests cannot — a section silently dropped, examples using outdated but syntactically valid versions, docs describing features that no longer exist.
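The mechanical half of the version-parity check can be pre-screened with a few lines of shell before handing the semantic half to the cross-model reviewer. A minimal sketch, assuming the file names and comment markers used throughout this checklist (the `check_parity` function name is made up for the demo):

```shell
# check_parity DIR: compare the version recorded in three release artifacts.
check_parity() {
  dir=$1
  # "version": "X.Y.Z" from package.json
  pkg=$(sed -n 's/.*"version": *"\([^"]*\)".*/\1/p' "$dir/package.json")
  # <!-- SDLC Wizard Version: X.Y.Z --> from SDLC.md
  sdlc=$(sed -n 's/.*SDLC Wizard Version: \([0-9.]*\).*/\1/p' "$dir/SDLC.md")
  # First "## [X.Y.Z]" heading in CHANGELOG.md
  log=$(sed -n 's/^## \[\([0-9.]*\)\].*/\1/p' "$dir/CHANGELOG.md" | head -1)
  if [ "$pkg" = "$sdlc" ] && [ "$pkg" = "$log" ]; then
    echo "parity OK: $pkg"
  else
    echo "mismatch: package.json=$pkg SDLC.md=$sdlc CHANGELOG.md=$log"
    return 1
  fi
}

# Demo on a throwaway directory with matching versions:
d=$(mktemp -d)
printf '{ "version": "1.22.0" }\n' > "$d/package.json"
printf '<!-- SDLC Wizard Version: 1.22.0 -->\n' > "$d/SDLC.md"
printf '## [1.22.0] - 2026-04-01\n' > "$d/CHANGELOG.md"
check_parity "$d"
```

A check like this belongs in CI as a cheap gate; the cross-model reviewer then only needs to hunt for the semantic drift a grep cannot see.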
+
 ---
 
 ## User Understanding and Periodic Feedback
@@ -3249,7 +3302,7 @@ Walk through updates? (y/n)
 Store wizard state in `SDLC.md` as metadata comments (invisible to readers, parseable by Claude):
 
 ```markdown
-<!-- SDLC Wizard Version: 1.
+<!-- SDLC Wizard Version: 1.22.0 -->
 <!-- Setup Date: 2026-01-24 -->
 <!-- Completed Steps: step-0.1, step-0.2, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
 <!-- Git Workflow: PRs -->
package/README.md
CHANGED
@@ -83,7 +83,7 @@ Layer 1: PHILOSOPHY
 | **SDP normalization** | Separates "the model had a bad day" from "our SDLC broke" by cross-referencing external benchmarks |
 | **CUSUM drift detection** | Catches gradual quality decay over time — borrowed from manufacturing quality control |
 | **Pre-tool TDD hooks** | Before source edits, a hook reminds Claude to write tests first. CI scoring checks whether it actually followed TDD |
-| **Self-evolving loop** | Weekly/monthly external research + CI
+| **Self-evolving loop** | Weekly/monthly external research + local CI shepherd loop — you approve, the system gets better |
 
 ## How It Works
 
@@ -186,14 +186,14 @@ This isn't the only Claude Code SDLC tool. Here's an honest comparison:
 |--------|------------|----------------------|-------------|
 | **Focus** | SDLC enforcement + measurement | Agent performance optimization | Plugin marketplace |
 | **Hooks** | 3 (SDLC, TDD, instructions) | 12+ (dev blocker, prettier, etc.) | Webhook watcher |
-| **Skills** |
+| **Skills** | 4 (/sdlc, /setup, /update, /feedback) | 80+ domain-specific | 13 slash commands |
 | **Evaluation** | 95% CI, CUSUM, SDP, Tier 1/2 | Configuration testing | skilltest framework |
-| **
+| **CI Shepherd** | Local CI fix loop | No | No |
 | **Auto-updates** | Weekly CC + community scan | No | No |
 | **Install** | `npx agentic-sdlc-wizard init` | npm install | npm install |
 | **Philosophy** | Lightweight, prove-it-or-delete | Scale and optimization | Documentation-first |
 
-**Our unique strengths:** Statistical rigor (CUSUM + 95% CI), SDP scoring (model quality vs SDLC compliance),
+**Our unique strengths:** Statistical rigor (CUSUM + 95% CI), SDP scoring (model quality vs SDLC compliance), CI shepherd loop, Prove-It A/B pipeline, comprehensive automated test suite, dogfooding enforcement.
 
 **Where others are stronger:** everything-claude-code has broader language/framework coverage. claude-sdlc has webhook-driven automation. Both have npm distribution.
 
@@ -204,7 +204,7 @@ This isn't the only Claude Code SDLC tool. Here's an honest comparison:
 | Document | What It Covers |
 |----------|---------------|
 | [ARCHITECTURE.md](ARCHITECTURE.md) | System design, 5-layer diagram, data flows, file structure |
-| [CI_CD.md](CI_CD.md) | All
+| [CI_CD.md](CI_CD.md) | All 4 workflows, E2E scoring, tier system, SDP, integrity checks |
 | [SDLC.md](SDLC.md) | Version tracking, enforcement rules, SDLC configuration |
 | [TESTING.md](TESTING.md) | Testing philosophy, test diamond, TDD approach |
 | [CHANGELOG.md](CHANGELOG.md) | Version history, what changed and when |
package/cli/init.js
CHANGED
@@ -22,6 +22,7 @@ const FILES = [
   { src: 'skills/sdlc/SKILL.md', dest: '.claude/skills/sdlc/SKILL.md' },
   { src: 'skills/setup/SKILL.md', dest: '.claude/skills/setup/SKILL.md' },
   { src: 'skills/update/SKILL.md', dest: '.claude/skills/update/SKILL.md' },
+  { src: 'skills/feedback/SKILL.md', dest: '.claude/skills/feedback/SKILL.md' },
 ];
 
 const WIZARD_HOOK_MARKERS = FILES
@@ -224,12 +225,12 @@ function init(targetDir, { force = false, dryRun = false } = {}) {
   console.log(`
 ${GREEN}SDLC Wizard installed successfully!${RESET}
 
-${YELLOW}
-
-
+${YELLOW}Restart Claude Code${RESET} to load new hooks and skills:
+  ${CYAN}/exit${RESET} then ${CYAN}claude --continue${RESET} (keeps conversation history)
+  ${CYAN}/exit${RESET} then ${CYAN}claude${RESET} (fresh start)
 
 Next steps:
-1.
+1. Restart Claude Code (see above)
 2. Tell Claude anything — setup auto-invokes when SDLC files are missing
 3. Claude reads the wizard doc and creates CLAUDE.md, SDLC.md, TESTING.md, ARCHITECTURE.md
 
package/cli/templates/skills/feedback/SKILL.md
ADDED
@@ -0,0 +1,92 @@
+---
+name: feedback
+description: Submit feedback, bug reports, feature requests, or share SDLC patterns you've discovered. Privacy-first — always asks before scanning.
+argument-hint: [optional: bug | feature | pattern | improvement]
+effort: medium
+---
+# Feedback — Community Contribution Loop
+
+## Task
+$ARGUMENTS
+
+## Purpose
+
+Help users contribute back to the SDLC wizard: bug reports, feature requests, pattern sharing, and SDLC improvements. Privacy-first — never scan without explicit permission.
+
+## Privacy & Permission (MANDATORY)
+
+**NEVER scan the user's repo without explicit consent.** Always ask first:
+
+> "I can scan your SDLC setup to identify what you've customized vs wizard defaults. This helps me create a more specific report. May I scan? (Only file names and SDLC config are read — no source code, secrets, or business logic.)"
+
+**What IS scanned (with permission):**
+- SDLC.md, TESTING.md, CLAUDE.md structure (not content details)
+- Hook file names and which hooks are active
+- Skill names and which skills exist
+- .claude/settings.json hook configuration (not allowedTools or secrets)
+
+**What is NEVER scanned:**
+- Source code files
+- .env files, secrets, credentials
+- Business logic or proprietary code
+- Git history or commit messages
+
+## Feedback Types
+
+### Bug Report
+1. Ask user to describe the issue
+2. With permission, check which wizard version is installed (`SDLC.md` metadata)
+3. Check if hooks are properly configured
+4. Create a GitHub issue with reproduction steps
+
+### Feature Request
+1. Ask user what they want
+2. With permission, check if a similar capability already exists in their setup
+3. Create a GitHub issue with the request and context
+
+### Pattern Sharing
+1. Ask user what pattern they've discovered (custom hook, modified philosophy, test approach)
+2. With permission, diff their SDLC setup against wizard defaults to identify customizations
+3. Ask: "Which of these customizations worked well for you?"
+4. Create a GitHub issue describing the pattern and evidence it works
+
+### SDLC Improvement
+1. Ask what could be better about the SDLC workflow
+2. With permission, check which SDLC steps they use most/least
+3. Create a GitHub issue with the improvement suggestion
+
+## Creating the Issue
+
+Use `gh issue create` on the wizard repo:
+
+```bash
+gh issue create \
+  --repo BaseInfinity/agentic-ai-sdlc-wizard \
+  --title "[feedback-type]: Brief description" \
+  --body "$(cat <<'EOF'
+## Feedback Type
+bug / feature / pattern / improvement
+
+## Description
+[User's description]
+
+## Context
+- Wizard version: [from SDLC.md metadata]
+- Setup type: [detected stack if permission granted]
+
+## Evidence (if pattern sharing)
+[What the user customized and why it worked]
+
+---
+Submitted via `/feedback` skill
+EOF
+)"
+```
+
+## Rules
+
+- **Privacy first** — always ask before scanning anything
+- **Opt-in only** — if user declines scan, still create the issue with whatever they tell you manually
+- **No source code** — never include source code snippets in issues
+- **Be specific** — vague issues waste maintainer time. Ask clarifying questions
+- **Check for duplicates** — `gh issue list --repo BaseInfinity/agentic-ai-sdlc-wizard --search "keywords"` before creating
package/cli/templates/skills/sdlc/SKILL.md
CHANGED
@@ -19,6 +19,7 @@ TodoWrite([
   { content: "Find and read relevant documentation", status: "in_progress", activeForm: "Reading docs" },
   { content: "Assess doc health - flag issues (ask before cleaning)", status: "pending", activeForm: "Checking doc health" },
   { content: "DRY scan: What patterns exist to reuse? New pattern = get approval", status: "pending", activeForm: "Scanning for reusable patterns" },
+  { content: "Prove It Gate: adding new component? Research alternatives, prove quality with tests", status: "pending", activeForm: "Checking prove-it gate" },
   { content: "Blast radius: What depends on code I'm changing?", status: "pending", activeForm: "Checking dependencies" },
   { content: "Design system check (if UI change)", status: "pending", activeForm: "Checking design system" },
   { content: "Restate task in own words - verify understanding", status: "pending", activeForm: "Verifying understanding" },
@@ -84,6 +85,22 @@ Critical miss on `tdd_red` or `self_review` = process failure regardless of tota
 - Does test approach follow TESTING.md philosophies?
 - If introducing new test patterns, same scrutiny as code patterns
 
+## Prove It Gate (REQUIRED for New Additions)
+
+**Adding a new skill, hook, workflow, or component? PROVE IT FIRST:**
+
+1. **Research:** Does something equivalent already exist (native CC, third-party plugin, existing skill)?
+2. **If YES:** Why is yours better? Show evidence (A/B test, quality comparison, gap analysis)
+3. **If NO:** What gap does this fill? Is the gap real or theoretical?
+4. **Quality tests:** New additions MUST have tests that prove OUTPUT QUALITY, not just existence
+5. **Less is more:** Every addition is maintenance burden. Default answer is NO unless proven YES
+
+**Existence tests are NOT quality tests:**
+- BAD: "ci-analyzer skill file exists" — proves nothing about quality
+- GOOD: "ci-analyzer recommends lint-first when test-before-lint detected" — proves behavior
+
+**If you can't write a quality test for it, you can't prove it works, so don't add it.**
+
 ## Plan Mode Integration
 
 **Use plan mode for:** Multi-file changes, new features, LOW confidence, bugs needing investigation.
@@ -93,6 +110,19 @@ Critical miss on `tdd_red` or `self_review` = process failure regardless of tota
 2. **Transition** (after approval): Update feature docs
 3. **Implementation**: TDD RED -> GREEN -> PASS
 
+### Auto-Approval: Skip Plan Approval Step
+
+If ALL of these are true, skip plan approval and go straight to TDD:
+- Confidence is **HIGH (95%+)** — you know exactly what to do
+- Task is **single-file or trivial** (config tweak, small bug fix, string change)
+- No new patterns introduced
+- No architectural decisions
+
+When auto-approving, still announce your approach — just don't wait for approval:
+> "Confidence HIGH (95%). Single-file change. Proceeding directly to TDD."
+
+**When in doubt, wait for approval.** Auto-approval is for clear-cut cases only.
+
 ## Confidence Check (REQUIRED)
 
 Before presenting approach, STATE your confidence:
@@ -131,7 +161,7 @@ PLANNING -> DOCS -> TDD RED -> TDD GREEN -> Tests Pass -> Self-Review
 
 ## Cross-Model Review (If Configured)
 
-**When to run:** High-stakes changes (auth, payments, data handling), complex refactors, research-heavy work.
+**When to run:** High-stakes changes (auth, payments, data handling), releases/publishes (version bumps, CHANGELOG, npm publish), complex refactors, research-heavy work.
 **When to skip:** Trivial changes (typo fixes, config tweaks), time-sensitive hotfixes, risk < review cost.
 
 **Prerequisites:** Codex CLI installed (`npm i -g @openai/codex`), OpenAI API key set.
@@ -236,6 +266,43 @@ Self-review passes → handoff.json (round 1, PENDING_REVIEW)
 
 **Full protocol:** See the wizard's "Cross-Model Review Loop (Optional)" section for key flags and reasoning effort guidance.
 
+### Release Review Focus
+
+Before any release/publish, add these to `review_instructions`:
+- **CHANGELOG consistency** — all sections present, no lost entries during consolidation
+- **Version parity** — package.json, SDLC.md, CHANGELOG, wizard metadata all match
+- **Stale examples** — hardcoded version strings in docs match current release
+- **Docs accuracy** — README, ARCHITECTURE.md reflect current feature set
+- **CLI-distributed file parity** — live skills, hooks, settings match CLI templates
+
+Evidence: v1.20.0 cross-model review caught CHANGELOG section loss and stale wizard version examples that passed all tests and self-review. Tests catch version mismatches; cross-model review catches semantic issues tests cannot.
+
+### Multiple Reviewers (N-Reviewer Pipeline)
+
+When multiple reviewers comment on a PR (Claude PR review, Codex, human reviewers), address each reviewer independently:
+
+1. **Read all reviews** — `gh api repos/OWNER/REPO/pulls/PR/comments` to get every reviewer's feedback
+2. **Respond per-reviewer** — Each reviewer has different blind spots and priorities. Address each one's findings separately
+3. **Resolve conflicts** — If reviewers disagree, use your judgment: pick the stronger argument, note why you chose it
+4. **Iterate until all approve** — Don't merge until every active reviewer is satisfied or their concerns are explicitly addressed
+5. **Max 3 iterations per reviewer** — If a reviewer keeps finding new things after 3 rounds, escalate to the user
+
+**The value of multiple reviewers:** Different models/humans catch different issues. Claude excels at SDLC/process compliance. Codex catches logic bugs. Humans catch "does this make sense for the product?" None alone is sufficient for high-stakes changes.
+
+### Custom Subagents (`.claude/agents/`)
+
+Claude Code supports custom subagents in `.claude/agents/`. These are specialized agents you can invoke for focused tasks:
+
+- **`sdlc-reviewer`** — An agent focused purely on SDLC compliance review (planning, TDD, self-review checks)
+- **`ci-debug`** — An agent specialized in diagnosing CI failures (reads logs, identifies root cause, suggests fix)
+- **`test-writer`** — An agent focused on writing quality tests following TESTING.md philosophies
+
+**When to use agents vs skills:**
+- **Skills** (`.claude/skills/`) — Prompts that guide Claude's behavior for a task type. Claude reads and follows them
+- **Agents** (`.claude/agents/`) — Independent subprocesses that run autonomously on a focused task and return results
+
+Agents are useful when you want parallel work (e.g., run `sdlc-reviewer` while you continue implementing) or when a task benefits from a fresh context window focused on one thing.
+
 ## Test Review (Harder Than Implementation)
 
 During self-review, critique tests HARDER than app code:
@@ -335,9 +402,38 @@ If tests fail:
 
 Debug it. Find root cause. Fix it properly. Tests ARE code.
 
+## Debugging Workflow (Systematic Investigation)
+
+When something breaks and the cause isn't obvious, follow this systematic debugging workflow:
+
+```
+Reproduce → Isolate → Root Cause → Fix → Regression Test
+```
+
+1. **Reproduce** — Can you make it fail consistently? If intermittent, stress-test (run N times). If you can't reproduce it, you can't fix it
+2. **Isolate** — Narrow the scope. Which file? Which function? Which input? Use binary search: comment out half the code, does it still fail?
+3. **Root cause** — Don't fix symptoms. Ask "why?" until you hit the actual cause. "It crashes on line 42" is a symptom. "Null pointer because the API returns undefined when rate-limited" is a root cause
+4. **Fix** — Fix the root cause, not the symptom. Write the fix
+5. **Regression test** — Write a test that fails without your fix and passes with it (TDD GREEN)
+
+**For regressions** (it worked before, now it doesn't):
+- Use `git bisect` to find the exact commit that broke it
+- `git bisect start`, `git bisect bad` (current), `git bisect good <known-good-commit>`
+- Bisect narrows to the breaking commit in O(log n) steps
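The bisect flow above can be exercised end-to-end on a disposable repo. This is a hedged demo, not part of the wizard: the commit messages, the `bug` marker in `app.txt`, and the eight-commit history are all invented for illustration, and `git bisect run` automates the good/bad probing via the probe script's exit code.

```shell
# Build a throwaway repo with 8 commits; commit 6 introduces the "regression".
d=$(mktemp -d) && cd "$d"
git init -q
git config user.email demo@example.com
git config user.name demo
for i in 1 2 3 4 5 6 7 8; do
  echo "version $i" > app.txt
  if [ "$i" -ge 6 ]; then echo "bug" >> app.txt; fi
  git add app.txt
  git commit -qm "commit $i"
done

# bad = HEAD, good = the first commit (HEAD~7); the probe exits nonzero
# on bad commits, so bisect does the binary search automatically.
git bisect start HEAD HEAD~7
git bisect run sh -c '! grep -q bug app.txt' >/dev/null 2>&1
first_bad=$(git bisect log | tail -1)   # last log line names the first bad commit
echo "$first_bad"
git bisect reset >/dev/null 2>&1
```

With 8 commits the search needs about 3 probes instead of 8, which is the O(log n) claim above in miniature.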
+
+**Environment-specific bugs** (works locally, fails in CI/staging/prod):
+- Check environment differences: env vars, OS version, dependency versions, file permissions
+- Reproduce the environment locally if possible (Docker, env vars)
+- Add logging at the failure point — don't guess, observe
+
+**When to stop and ask:**
+- After 2 failed fix attempts → STOP and ASK USER
+- If the bug is in code you don't understand → read first, then fix
+- If reproducing requires access you don't have → ASK USER
+
 ## CI Feedback Loop — Local Shepherd (After Commit)
 
-**This is the "local shepherd" — the
+**This is the "local shepherd" — the CI fix mechanism.** It runs in your active session with full context.
 
 **The SDLC doesn't end at local tests.** CI must pass too.
 
package/cli/templates/skills/setup/SKILL.md
CHANGED
@@ -1,17 +1,19 @@
 ---
 name: setup-wizard
-description: Setup wizard — scans codebase,
+description: Setup wizard — scans codebase, builds confidence per data point, only asks what it can't figure out, generates SDLC files. Use for first-time setup or re-running setup.
 argument-hint: [optional: regenerate | verify-only]
 effort: high
 ---
-# Setup Wizard -
+# Setup Wizard - Confidence-Driven Project Configuration
 
 ## Task
 $ARGUMENTS
 
 ## Purpose
 
-You are
+You are a confidence-driven setup wizard. Your job is to scan the project, infer as much as possible, and only ask the user about what you can't figure out. The number of questions is DYNAMIC — it depends on how much you can detect. Stop asking when all configuration data points are resolved (detected, confirmed, or answered).
+
+**DO NOT ask a fixed list of questions. DO NOT ask what you already know.**
 
 ## MANDATORY FIRST ACTION: Read the Wizard Doc
 
@@ -35,57 +37,72 @@ Scan the project root for:
|
|
|
35
37
|
- Feature docs: *_PLAN.md, *_DOCS.md, *_SPEC.md, docs/
|
|
36
38
|
- Deployment: Dockerfile, vercel.json, fly.toml, netlify.toml, Procfile, k8s/
|
|
37
39
|
- Design system: tailwind.config.*, .storybook/, theme files, CSS custom properties
|
|
40
|
+
- Branding assets: BRANDING.md, brand/, logos/, style-guide.md, brand-voice.md, tone-of-voice.*
|
|
38
41
|
- Existing docs: README.md, CLAUDE.md, ARCHITECTURE.md
|
|
42
|
+
- Scripts in package.json (lint, test, build, typecheck, etc.)
|
|
43
|
+
- Database config files (prisma/, drizzle.config.*, knexfile.*, .env with DB_*)
|
|
44
|
+
- Cache config (redis.conf, .env with REDIS_*)
|
|
+
+### Step 2: Build Confidence Map
+
+For each configuration data point, assign a confidence level based on scan results:
+
+**Configuration Data Points:**
 
-
+| Category | Data Point | How to Detect |
+|----------|-----------|---------------|
+| Structure | Source directory | Look for src/, app/, lib/, etc. |
+| Structure | Test directory | Look for tests/, __tests__/, spec/ |
+| Structure | Test framework | Config files (jest.config, vitest.config, pytest.ini) |
+| Commands | Lint command | package.json scripts, Makefile, config files |
+| Commands | Type-check command | tsconfig.json → tsc, mypy.ini → mypy |
+| Commands | Run all tests | package.json "test" script, Makefile |
+| Commands | Run single test file | Infer from framework (jest → jest path, pytest → pytest path) |
+| Commands | Production build | package.json "build" script, Makefile |
+| Commands | Deployment setup | Dockerfile, vercel.json, fly.toml, deploy scripts |
+| Infra | Database(s) | prisma/, .env DB vars, docker-compose services |
+| Infra | Caching layer | .env REDIS vars, docker-compose redis service |
+| Infra | Test duration | Count test files, check CI run times if available |
+| Preferences | Response detail level | Cannot detect — ALWAYS ASK |
+| Preferences | Testing approach | Cannot detect intent from existing code — ALWAYS ASK |
+| Preferences | Mocking philosophy | Cannot detect intent from existing code — ALWAYS ASK |
+| Testing | Test types | What test files exist (*.test.*, *.spec.*, e2e/, integration/) |
+| Coverage | Coverage config | nyc, c8, coverage.py config, CI coverage steps |
+| CI | CI shepherd opt-in | Only if CI detected — ALWAYS ASK |
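The Commands rows in the table above can be resolved mechanically from `package.json` scripts. A minimal sketch under assumptions — the `detectCommands` helper, the script-name list, and the `npm run` mapping are illustrative, not the wizard's code:

```javascript
// Minimal sketch: resolve "Commands" data points from package.json
// scripts. A script that exists is RESOLVED (detected); a missing one
// stays UNRESOLVED and becomes a question for the user.
function detectCommands(pkg) {
  const scripts = (pkg && pkg.scripts) || {};
  const wanted = ["lint", "typecheck", "test", "build"];
  const resolved = {};
  const unresolved = [];
  for (const name of wanted) {
    if (scripts[name]) resolved[name] = `npm run ${name}`;
    else unresolved.push(name);
  }
  return { resolved, unresolved };
}
```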
 
-
+**Each data point has one of three states:**
+- **RESOLVED (detected):** Found concrete evidence — config file, script, directory exists. No question needed, just confirm.
+- **RESOLVED (inferred):** Found indirect evidence — naming patterns, related config. Present inference, let user confirm or correct.
+- **UNRESOLVED:** No evidence found — must ask user directly.
 
-
+**Preference data points** (response detail, testing approach, mocking philosophy, CI shepherd) are ALWAYS UNRESOLVED regardless of what code patterns exist. Current code patterns show what IS, not what the user WANTS going forward.
 
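The three states, and the rule that preference data points are always asked, reduce to a small classifier. A minimal sketch — `classify`, the evidence labels, and the preference key names are illustrative assumptions:

```javascript
// Minimal sketch of the three resolution states. Preference data
// points are forced to UNRESOLVED no matter what evidence exists.
const PREFERENCES = new Set([
  "responseDetail", "testingApproach", "mockingPhilosophy", "ciShepherd",
]);

function classify(dataPoint, evidence) {
  if (PREFERENCES.has(dataPoint)) return "UNRESOLVED"; // always ask
  if (evidence === "direct") return "RESOLVED_DETECTED"; // config file, script, dir exists
  if (evidence === "indirect") return "RESOLVED_INFERRED"; // naming patterns, related config
  return "UNRESOLVED"; // no evidence found
}
```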
-
-1. Source directory (detected or ask)
-2. Test directory (detected or ask)
-3. Test framework (detected or ask)
+### Step 3: Present Findings and Fill Gaps
 
-
-4. Lint command
-5. Type-check command
-6. Run all tests command
-7. Run single test file command
-8. Production build command
-9. Deployment setup (detected environments, confirm or customize)
+Present ALL detected values organized by state to the user.
 
-**
-10. Database(s) used
-11. Caching layer (Redis, etc.)
-12. Test duration (<1 min, 1-5 min, 5+ min)
+**For RESOLVED (detected) items:** Show what was found, let user bulk-confirm with a single "Looks good" or override specific items.
 
-**
-13. Response detail level (small/medium/large)
+**For RESOLVED (inferred) items:** Show what was inferred with reasoning, ask user to confirm or correct.
 
-**
-14. Testing approach (strict TDD, test-after, mixed, minimal, none yet)
-15. Test types wanted (unit, integration, E2E, API)
-16. Mocking philosophy (minimal, heavy, no mocking)
+**For UNRESOLVED items:** Ask the user directly — these are your questions.
 
-**
-17. Code coverage preferences (enforce threshold, report only, AI suggestions, skip)
+**The ready rule:** You are ready to generate files when ALL data points are resolved (detected, inferred+confirmed, or answered by user). The number of questions you ask depends entirely on how many data points remain unresolved after scanning. A well-configured project might need 3-4 questions (just preferences). A bare repo might need 10+. There is no fixed count.
 
-DO NOT proceed to file generation until
+DO NOT proceed to file generation until all data points are resolved.
 
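The new ready rule is a single predicate over the confidence map: generation starts only when nothing is unresolved, and the question count falls out of the map rather than being fixed. A minimal sketch (`readyToGenerate` is an illustrative name):

```javascript
// Minimal sketch: ready to generate only when no data point is still
// UNRESOLVED. The questions to ask are exactly the unresolved entries —
// their count is dynamic, never fixed.
function readyToGenerate(confidenceMap) {
  const questions = Object.entries(confidenceMap)
    .filter(([, state]) => state === "UNRESOLVED")
    .map(([name]) => name);
  return { ready: questions.length === 0, questions };
}
```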
-### Step
+### Step 4: Generate CLAUDE.md
 
-Using
+Using detected + confirmed values, generate `CLAUDE.md` with:
 - Project overview (from scan results)
-- Commands table (
+- Commands table (detected/confirmed commands)
 - Code style section (from detected linters/formatters)
 - Architecture summary (from scan)
-- Special notes (
+- Special notes (infra, deployment)
 
 Reference: See "Step 3" in `CLAUDE_CODE_SDLC_WIZARD.md` for the full template.
 
-### Step
+### Step 5: Generate SDLC.md
 
 Generate `SDLC.md` with the full SDLC checklist customized to the project:
 - Plan mode guidance
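The CLAUDE.md assembly described above is straightforward string templating over the confirmed values. A minimal sketch under assumptions — `renderClaudeMd`, its field names, and the section set are illustrative; the real template lives in `CLAUDE_CODE_SDLC_WIZARD.md`:

```javascript
// Minimal sketch: render a CLAUDE.md skeleton from confirmed values.
// Field names (projectName, commands, architectureSummary) are
// illustrative, not the wizard's actual schema.
function renderClaudeMd(cfg) {
  const commandRows = Object.entries(cfg.commands)
    .map(([name, cmd]) => `| ${name} | \`${cmd}\` |`)
    .join("\n");
  return [
    `# ${cfg.projectName}`,
    "",
    "## Commands",
    "| Task | Command |",
    "|------|---------|",
    commandRows,
    "",
    "## Architecture",
    cfg.architectureSummary,
  ].join("\n");
}
```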
@@ -98,35 +115,35 @@ Include metadata comments:
 ```
 <!-- SDLC Wizard Version: [version from CLAUDE_CODE_SDLC_WIZARD.md] -->
 <!-- Setup Date: [today's date] -->
-<!-- Completed Steps: 0.
+<!-- Completed Steps: step-0.1, step-0.2, step-1, step-2, step-3, step-4, step-5, step-6, step-7, step-8, step-9 -->
 ```
 
 Reference: See "Step 4" in `CLAUDE_CODE_SDLC_WIZARD.md` for the full template.
 
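These metadata comments are what later update/regenerate runs read back out of `SDLC.md`. A minimal parsing sketch (`parseMetadata` is an illustrative name, not part of the wizard):

```javascript
// Minimal sketch: read the wizard metadata comments back out of an
// existing SDLC.md so update/regenerate runs know what was installed.
function parseMetadata(sdlcMd) {
  const grab = (label) => {
    const m = sdlcMd.match(new RegExp(`<!-- ${label}: (.*?) -->`));
    return m ? m[1] : null;
  };
  return {
    version: grab("SDLC Wizard Version"),
    setupDate: grab("Setup Date"),
    completedSteps: (grab("Completed Steps") || "").split(", ").filter(Boolean),
  };
}
```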
-### Step
+### Step 6: Generate TESTING.md
 
-Generate `TESTING.md` based on
+Generate `TESTING.md` based on detected/confirmed testing data:
 - Testing Diamond visualization
 - Test types and their purposes
-- Mocking rules (from
-- Test file organization (from
-- Coverage config (from
+- Mocking rules (from detected patterns or user input)
+- Test file organization (from detected structure)
+- Coverage config (from detected config or user input)
 - Framework-specific patterns
 
 Reference: See "Step 5" in `CLAUDE_CODE_SDLC_WIZARD.md` for the full template.
 
-### Step
+### Step 7: Generate ARCHITECTURE.md
 
 Generate `ARCHITECTURE.md` with:
 - System overview diagram (from scan)
 - Component descriptions
-- Environments table (from
+- Environments table (from detected deployment config)
 - Deployment checklist
 - Key technical decisions
 
 Reference: See "Step 6" in `CLAUDE_CODE_SDLC_WIZARD.md` for the full template.
 
-### Step
+### Step 8: Generate DESIGN_SYSTEM.md (If UI Detected)
 
 Only if design system artifacts were found in Step 1:
 - Extract colors, fonts, spacing from config
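Detecting which test types already exist (feeding the TESTING.md content above) is a glob-classification problem, matching the "Test types" row of the confidence table. A minimal sketch — `classifyTestFiles` and its exact path conventions are illustrative:

```javascript
// Minimal sketch: bucket existing test files into the test types that
// TESTING.md documents (unit, integration, e2e), using the same file
// patterns the confidence table lists (*.test.*, *.spec.*, e2e/, integration/).
function classifyTestFiles(paths) {
  const types = new Set();
  for (const p of paths) {
    if (p.startsWith("e2e/")) types.add("e2e");
    else if (p.startsWith("integration/")) types.add("integration");
    else if (/\.(test|spec)\./.test(p)) types.add("unit");
  }
  return [...types].sort();
}
```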
@@ -135,7 +152,17 @@ Only if design system artifacts were found in Step 1:
 
 Skip this step if no UI/design system detected.
 
-### Step 8:
+### Step 8.5: Generate BRANDING.md (If Branding Detected)
+
+Only if branding-related assets were found in Step 1 (brand/, logos/, style-guide.md, brand-voice.md, existing BRANDING.md, or UI/content-heavy project detected):
+- Brand voice and tone guidelines
+- Naming conventions (product names, feature names, terminology)
+- Visual identity summary (logo usage, color palette references)
+- Content style guide (if the project has user-facing copy)
+
+Skip this step if no branding assets or UI/content patterns detected.
+
+### Step 9: Configure Tool Permissions
 
 Based on detected stack, suggest `allowedTools` entries for `.claude/settings.json`:
 - Package manager commands (npm, pnpm, yarn, cargo, go, pip, etc.)
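Mapping the detected stack to `allowedTools` suggestions is a simple lookup. A minimal sketch under assumptions — the pattern strings follow Claude Code's `Bash(command:*)` permission convention, but these specific entries and the `suggestAllowedTools` helper are illustrative, not the wizard's actual suggestion set:

```javascript
// Minimal sketch: suggest allowedTools entries for .claude/settings.json
// from the detected package manager. The pattern strings are illustrative.
const TOOL_SUGGESTIONS = {
  npm: ["Bash(npm run lint:*)", "Bash(npm run test:*)", "Bash(npm run build:*)"],
  pnpm: ["Bash(pnpm lint:*)", "Bash(pnpm test:*)", "Bash(pnpm build:*)"],
  cargo: ["Bash(cargo check:*)", "Bash(cargo test:*)", "Bash(cargo build:*)"],
};

function suggestAllowedTools(packageManager) {
  return TOOL_SUGGESTIONS[packageManager] || [];
}
```

As the step says, these are suggestions only — the user confirms before anything is written to settings.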
@@ -144,11 +171,11 @@ Based on detected stack, suggest `allowedTools` entries for `.claude/settings.js
 
 Present suggestions and let the user confirm.
 
-### Step
+### Step 10: Customize Hooks
 
-Update `tdd-pretool-check.sh` with the actual source directory
+Update `tdd-pretool-check.sh` with the actual source directory (replace generic `/src/` pattern).
 
-### Step
+### Step 11: Verify Setup
 
 Run verification checks:
 1. All generated files exist and are non-empty
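The Step 10 hook customization is a one-line substitution in the script text. A minimal sketch — `customizeHook` is an illustrative name, and the sample script fragment in the test is hypothetical, not the real contents of `tdd-pretool-check.sh`:

```javascript
// Minimal sketch: swap the generic /src/ path pattern in the hook
// script for the project's actual source directory.
function customizeHook(scriptText, sourceDir) {
  // split/join replaces every occurrence without regex-escaping concerns
  return scriptText.split("/src/").join(`/${sourceDir}/`);
}
```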
@@ -159,20 +186,24 @@ Run verification checks:
 
 Report any issues found.
 
-### Step
+### Step 12: Instruct Restart and Next Steps
 
 Tell the user:
 > Setup complete. Hooks and settings load at session start.
 > **Exit Claude Code and restart it** for the new configuration to take effect.
 > On restart, the SDLC hook will fire and you'll see the checklist in every response.
 >
-> **Optional next step:**
+> **Optional next step:**
+> - Run `/claude-automation-recommender` for stack-specific tooling suggestions (MCP servers, formatting hooks, type-checking hooks, plugins)
+>
+> The recommender is complementary to the SDLC wizard — it adds tooling recommendations, not process enforcement.
 
 ## Rules
 
-- NEVER
-- NEVER
-- ALWAYS show detected values and let the user confirm or override.
+- NEVER ask what you already know from scanning. If you found it, confirm it — don't ask it.
+- NEVER use a fixed question count. The number of questions is dynamic based on scan results.
+- ALWAYS show detected values organized by resolution state and let the user confirm or override.
 - ALWAYS generate metadata comments in SDLC.md (version, date, steps).
+- If most data points are resolved after scanning, present findings for bulk confirmation — don't force individual questions.
 - If the user passes `regenerate` as an argument, skip Q&A and regenerate files from existing SDLC.md metadata.
-- If the user passes `verify-only` as an argument, skip to Step
+- If the user passes `verify-only` as an argument, skip to Step 11 (verify) only.
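The Step 11 verification checks can be sketched as a pure pass over the generated file contents. A minimal sketch — `verifySetup` and passing contents as a map (rather than reading the filesystem) are illustrative simplifications, and the expected-file list is an assumption:

```javascript
// Minimal sketch: check that each expected generated file exists and
// is non-empty, given a map of filename -> contents (null = missing).
function verifySetup(files) {
  const expected = ["CLAUDE.md", "SDLC.md", "TESTING.md", "ARCHITECTURE.md"];
  const issues = [];
  for (const name of expected) {
    const body = files[name];
    if (body == null) issues.push(`${name}: missing`);
    else if (body.trim() === "") issues.push(`${name}: empty`);
  }
  return issues; // empty array = verification passed
}
```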
@@ -45,13 +45,13 @@ Extract the latest version from the first `## [X.X.X]` line.
 Parse all CHANGELOG entries between the user's installed version and the latest. Present a clear summary:
 
 ```
-Installed: 1.
-Latest: 1.
+Installed: 1.20.0
+Latest: 1.22.0
 
 What changed:
+- [1.22.0] Plan auto-approval, debugging workflow, /feedback skill, BRANDING.md detection, ...
+- [1.21.0] Confidence-driven setup, prove-it gate, cross-model release review, ...
 - [1.20.0] Version-pinned CC update gate, Tier 1 flakiness fix, flaky test guidance, ...
-- [1.19.0] CI shepherd model, token efficiency, feature doc enforcement, ...
-- [1.18.0] Added /update-wizard skill, ...
 ```
 
 **If versions match:** Say "You're up to date! (version X.X.X)" and stop.