cc-dev-template 0.1.53 → 0.1.55

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "cc-dev-template",
3
- "version": "0.1.53",
3
+ "version": "0.1.55",
4
4
  "description": "Structured AI-assisted development framework for Claude Code",
5
5
  "bin": {
6
6
  "cc-dev-template": "./bin/install.js"
@@ -0,0 +1,11 @@
1
+ ---
2
+ name: project-setup
3
+ description: Set up standardized dev environment with Makefile, scripts, and hooks.
4
+ disable-model-invocation: true
5
+ ---
6
+
7
+ # Project Setup
8
+
9
+ Configure this project with a standardized development environment.
10
+
11
+ Read `references/step-1-analyze.md`.
@@ -0,0 +1,54 @@
1
+ # Step 1: Analyze the Project
2
+
3
+ Understand the project before making changes.
4
+
5
+ ## Detect Project Type
6
+
7
+ Check for language indicators:
8
+
9
+ | Language | Indicators |
10
+ |----------|------------|
11
+ | TypeScript/JavaScript | `package.json`, `tsconfig.json`, `.ts`/`.tsx` files |
12
+ | Go | `go.mod`, `go.sum`, `.go` files |
13
+ | C# | `*.csproj`, `*.sln`, `.cs` files |
14
+
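If helpful, these indicators can be checked quickly from the shell. A rough sketch only; the search depth and patterns are assumptions to adjust for the repository layout:

```bash
# Rough project-type detection following the indicator table above
[ -f package.json ] && echo "TypeScript/JavaScript indicators found"
[ -f go.mod ] && echo "Go indicators found"
# Depth limit of 3 is arbitrary; C# projects often nest their .csproj files
find . -maxdepth 3 -name '*.csproj' -o -name '*.sln' | grep -q . && echo "C# indicators found"
```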
15
+ If multiple languages exist, this may be a monorepo. Note each component.
16
+
17
+ ## Check for Submodules
18
+
19
+ Run `git submodule status`. If submodules exist, ask the user:
20
+ - Which submodule(s) to set up
21
+ - Whether to orchestrate multiple submodules together under one `make dev`
22
+
23
+ ## Check Existing Setup
24
+
25
+ Look for:
26
+ - `Makefile` in root
27
+ - `scripts/` directory with `build.sh`, `dev.sh`, `test.sh`
28
+ - `.claude/settings.json` with Stop hook
29
+ - `.git/hooks/pre-commit`
30
+
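A quick inventory sketch for these paths (it assumes the layout this guide creates and is harmless when files are absent):

```bash
# Report which pieces of the standard setup already exist
for f in Makefile scripts/build.sh scripts/dev.sh scripts/test.sh \
         .claude/settings.json .git/hooks/pre-commit; do
  if [ -e "$f" ]; then echo "present: $f"; else echo "missing: $f"; fi
done
```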
31
+ If partial setup exists, show the user what's already configured vs what's missing. Ask whether to:
32
+ - Complete the missing pieces
33
+ - Replace existing setup entirely
34
+ - Verify existing setup works correctly
35
+
36
+ ## Gather Project-Specific Details
37
+
38
+ For the detected language(s), identify:
39
+ - Build command (e.g., `npm run build`, `go build ./...`, `dotnet build`)
40
+ - Dev server command and typical port
41
+ - Test command (e.g., `npm test`, `go test ./...`, `dotnet test`)
42
+ - Lint/format commands (e.g., `npm run lint`, `go fmt ./...`)
43
+
44
+ If anything is unclear, ask the user.
45
+
46
+ ## When Ready
47
+
48
+ Once you have:
49
+ - Confirmed project type
50
+ - Resolved any submodule questions
51
+ - Determined what needs to be created vs what already exists
52
+ - Gathered language-specific commands
53
+
54
+ Read `references/step-2-makefile.md`.
@@ -0,0 +1,72 @@
1
+ # Step 2: Create Makefile and Scripts
2
+
3
+ Create the Makefile and supporting scripts. The Makefile stays minimal—it just calls scripts.
4
+
5
+ ## Create the Makefile
6
+
7
+ Create `Makefile` in the project root:
8
+
9
+ ```makefile
10
+ .PHONY: build dev stop test
11
+
12
+ build:
13
+ @./scripts/build.sh
14
+
15
+ dev:
16
+ @./scripts/dev.sh start
17
+
18
+ stop:
19
+ @./scripts/dev.sh stop
20
+
21
+ test:
22
+ @./scripts/test.sh
23
+ ```
24
+
25
+ ## Create scripts/build.sh
26
+
27
+ Purpose: Compile, typecheck, lint. Return minimal output. No binaries.
28
+
29
+ Requirements:
30
+ - Run all quality checks for the detected language (typecheck, lint, format check)
31
+ - Exit 0 on success with a single "Build passed" message
32
+ - Exit non-zero on failure with only the relevant error lines
33
+ - Strip ANSI colors and limit output to essential errors
34
+
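A minimal sketch of what this might look like for a TypeScript project. The npm script names and `npx tsc` invocation are assumptions; swap in the checks for the detected language:

```bash
#!/usr/bin/env bash
# scripts/build.sh (sketch): typecheck + lint with minimal output
set -o pipefail

run() {
  # Run one check; on failure, strip ANSI colors (GNU sed) and show only error lines
  local label="$1"; shift
  local out
  if ! out=$("$@" 2>&1); then
    echo "$label failed:"
    echo "$out" | sed 's/\x1b\[[0-9;]*m//g' | grep -iE 'error' | head -n 20
    exit 1
  fi
}

run "Typecheck" npx tsc --noEmit
run "Lint" npm run --silent lint
echo "Build passed"
```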
35
+ ## Create scripts/dev.sh
36
+
37
+ Purpose: Start/stop dev server as background process. Track state via PID file.
38
+
39
+ Requirements:
40
+ - Accept `start` or `stop` argument
41
+ - On `start`:
42
+ - Check if already running (PID file exists and process alive)
43
+ - If running, print "Already running on port XXXX" and exit 0
44
+ - If not running, start in background
45
+ - Pipe all output to `agent.log` in project root
46
+ - Truncate `agent.log` on each start (fresh logs)
47
+ - Save PID to `.dev.pid`
48
+ - Print "Started on port XXXX. Logs: agent.log"
49
+ - On `stop`:
50
+ - Kill process from PID file
51
+ - Clean up PID file
52
+ - Print "Stopped"
53
+
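A minimal start/stop sketch. The dev command (`npm run dev`) and port are placeholders to replace with the project's real values:

```bash
#!/usr/bin/env bash
# scripts/dev.sh (sketch): background dev server tracked via a PID file
PID_FILE=".dev.pid"
LOG_FILE="agent.log"
PORT=3000   # placeholder; use the project's actual port

case "$1" in
  start)
    if [ -f "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null; then
      echo "Already running on port $PORT"
      exit 0
    fi
    : > "$LOG_FILE"                       # truncate for fresh logs
    nohup npm run dev > "$LOG_FILE" 2>&1 &
    echo $! > "$PID_FILE"
    echo "Started on port $PORT. Logs: $LOG_FILE"
    ;;
  stop)
    if [ -f "$PID_FILE" ]; then
      kill "$(cat "$PID_FILE")" 2>/dev/null || true
      rm -f "$PID_FILE"
    fi
    echo "Stopped"
    ;;
  *)
    echo "Usage: $0 {start|stop}" >&2
    exit 1
    ;;
esac
```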
54
+ ## Create scripts/test.sh
55
+
56
+ Purpose: Run ALL tests. Return pass/fail with minimal output.
57
+
58
+ Requirements:
59
+ - Run the full test suite (unit, integration, all test types)
60
+ - On success: print "All tests passed"
61
+ - On failure: print only failing test names and brief error messages
62
+ - Filter verbose test output to extract just failures
63
+
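A sketch assuming `npm test`; the failure filter is an assumption to tailor to the project's test runner output:

```bash
#!/usr/bin/env bash
# scripts/test.sh (sketch): run everything, print only failures
set -o pipefail

if out=$(npm test 2>&1); then
  echo "All tests passed"
else
  # Strip ANSI colors (GNU sed), keep failing test names and brief error lines
  echo "$out" \
    | sed 's/\x1b\[[0-9;]*m//g' \
    | grep -E 'FAIL|Error|failed' \
    | head -n 30
  exit 1
fi
```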
64
+ ## Make Scripts Executable
65
+
66
+ ```bash
67
+ chmod +x scripts/build.sh scripts/dev.sh scripts/test.sh
68
+ ```
69
+
70
+ ## When Ready
71
+
72
+ Once all scripts are created and executable, read `references/step-3-hooks.md`.
@@ -0,0 +1,80 @@
1
+ # Step 3: Configure Hooks
2
+
3
+ Set up the stop hook and pre-commit hook.
4
+
5
+ ## Stop Hook
6
+
7
+ The stop hook runs `make build` when the agent finishes. If build fails, the agent continues working to fix it (up to 3 attempts).
8
+
9
+ ### Create .claude/settings.json
10
+
11
+ ```json
12
+ {
13
+ "hooks": {
14
+ "Stop": [
15
+ {
16
+ "hooks": [
17
+ {
18
+ "type": "command",
19
+ "command": "node .claude/hooks/quality-gate.cjs"
20
+ }
21
+ ]
22
+ }
23
+ ]
24
+ }
25
+ }
26
+ ```
27
+
28
+ If `.claude/settings.json` already exists, merge the hooks configuration.
29
+
30
+ ### Create the Quality Gate Hook
31
+
32
+ Copy the template from this skill's `templates/quality-gate.cjs` to `.claude/hooks/quality-gate.cjs`.
33
+
34
+ Adapt it for this project:
35
+ - Update the `checks` array to match project language
36
+ - For TypeScript: tsc, eslint, tests
37
+ - For Go: go build, go vet, go test
38
+ - For C#: dotnet build, dotnet test
39
+
40
+ The hook:
41
+ 1. Reads JSON from stdin (required by hook protocol)
42
+ 2. Runs `make build`
43
+ 3. If build passes: returns `{ "decision": "approve" }`
44
+ 4. If build fails: returns `{ "decision": "block", "reason": "..." }` with concise error info
45
+ 5. Checks `stop_hook_active` field—if true and still failing, approve to prevent infinite loops
46
+
47
+ ## Pre-Commit Hook
48
+
49
+ Create `.git/hooks/pre-commit`:
50
+
51
+ ```bash
52
+ #!/bin/bash
53
+ set -e
54
+
55
+ echo "Running pre-commit checks..."
56
+
57
+ # Run build checks
58
+ if ! make build; then
59
+ echo "Build failed. Commit aborted."
60
+ exit 1
61
+ fi
62
+
63
+ # Run tests
64
+ if ! make test; then
65
+ echo "Tests failed. Commit aborted."
66
+ exit 1
67
+ fi
68
+
69
+ echo "Pre-commit checks passed."
70
+ ```
71
+
72
+ Make it executable:
73
+
74
+ ```bash
75
+ chmod +x .git/hooks/pre-commit
76
+ ```
77
+
78
+ ## When Ready
79
+
80
+ Once hooks are configured, read `references/step-4-documentation.md`.
@@ -0,0 +1,34 @@
1
+ # Step 4: Update CLAUDE.md
2
+
3
+ Document the make commands so the agent knows how to use them.
4
+
5
+ ## If CLAUDE.md Exists
6
+
7
+ Add a "Dev Commands" section. Place it near the top, after any project overview.
8
+
9
+ ## If CLAUDE.md Does Not Exist
10
+
11
+ Create it with the dev commands section plus a minimal project description.
12
+
13
+ ## Content to Add
14
+
15
+ ```markdown
16
+ ## Dev Commands
17
+
18
+ | Command | Purpose |
19
+ |---------|---------|
20
+ | `make dev` | Start dev server (background). Logs to `agent.log`. |
21
+ | `make stop` | Stop the dev server. |
22
+ | `make build` | Run all build checks (typecheck, lint). No output on success. |
23
+ | `make test` | Run all tests. Shows only failures. |
24
+
25
+ - `agent.log` is cleared on each `make dev` — check it for errors after changes
26
+ - Stop hook runs `make build` automatically when you finish working
27
+ - Pre-commit hook runs `make build` and `make test` before each commit
28
+ ```
29
+
30
+ Adjust the table if this project has additional commands or specific notes.
31
+
32
+ ## When Ready
33
+
34
+ Once CLAUDE.md is updated, read `references/step-5-verify.md`.
@@ -0,0 +1,70 @@
1
+ # Step 5: Verify Setup
2
+
3
+ Run each command to confirm everything works.
4
+
5
+ ## Verification Steps
6
+
7
+ ### 1. Build
8
+
9
+ ```bash
10
+ make build
11
+ ```
12
+
13
+ Expected: exits 0, minimal output (or single success message).
14
+
15
+ If it fails, fix the underlying issue—missing dependencies, syntax errors, etc.
16
+
17
+ ### 2. Test
18
+
19
+ ```bash
20
+ make test
21
+ ```
22
+
23
+ Expected: exits 0 with "All tests passed" or similar.
24
+
25
+ If tests fail, that's fine. Report them to the user; the setup itself is still working.
26
+
27
+ ### 3. Dev Server
28
+
29
+ ```bash
30
+ make dev
31
+ ```
32
+
33
+ Expected: prints port and confirms it's running in background.
34
+
35
+ Then verify it's actually running:
36
+
37
+ ```bash
38
+ make dev
39
+ ```
40
+
41
+ Expected: prints "Already running on port XXXX".
42
+
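Optionally, confirm the PID file points at a live process (this relies on the `.dev.pid` convention from step 2):

```bash
# Exit status 0 means the recorded dev-server process is alive
kill -0 "$(cat .dev.pid)" && echo "dev server process is alive"
```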
43
+ ### 4. Stop
44
+
45
+ ```bash
46
+ make stop
47
+ ```
48
+
49
+ Expected: prints "Stopped" or similar.
50
+
51
+ Then verify it actually stopped:
52
+
53
+ ```bash
54
+ make dev
55
+ ```
56
+
57
+ Expected: starts fresh (not "already running").
58
+
59
+ ### 5. agent.log
60
+
61
+ Check that `agent.log` exists and contains server output.
62
+
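A quick shell check (assumes the `agent.log` location from step 2):

```bash
# Non-empty log plus a peek at the latest output
test -s agent.log && tail -n 5 agent.log
```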
63
+ ## Report Results
64
+
65
+ Tell the user:
66
+ - Which commands succeeded
67
+ - Any issues encountered
68
+ - Whether the project is ready to use
69
+
70
+ If everything passed, the setup is complete.
@@ -0,0 +1,132 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * Quality gate stop hook.
4
+ * Runs make build when the agent stops. Blocks if build fails (up to 3 retries).
5
+ *
6
+ * Adapt the `checks` array for your project's language/tooling.
7
+ */
8
+ const { execSync } = require('child_process');
9
+ const fs = require('fs');
10
+ const path = require('path');
11
+
12
+ const PROJECT_ROOT = process.cwd();
13
+ const MAX_RETRIES = 3;
14
+ const RETRY_FILE = path.join(PROJECT_ROOT, '.claude', 'hooks', '.quality-gate-retries');
15
+
16
+ function approve() {
17
+ cleanupRetryFile();
18
+ console.log(JSON.stringify({ decision: 'approve' }));
19
+ process.exit(0);
20
+ }
21
+
22
+ function block(reason) {
23
+ console.log(JSON.stringify({ decision: 'block', reason }));
24
+ process.exit(0);
25
+ }
26
+
27
+ function cleanupRetryFile() {
28
+ try {
29
+ if (fs.existsSync(RETRY_FILE)) {
30
+ fs.unlinkSync(RETRY_FILE);
31
+ }
32
+ } catch (e) {
33
+ // Ignore cleanup errors
34
+ }
35
+ }
36
+
37
+ function getRetryCount() {
38
+ try {
39
+ if (fs.existsSync(RETRY_FILE)) {
40
+ return parseInt(fs.readFileSync(RETRY_FILE, 'utf8').trim(), 10) || 0;
41
+ }
42
+ } catch (e) {
43
+ // Ignore read errors
44
+ }
45
+ return 0;
46
+ }
47
+
48
+ function incrementRetryCount() {
49
+ const count = getRetryCount() + 1;
50
+ const dir = path.dirname(RETRY_FILE);
51
+ if (!fs.existsSync(dir)) {
52
+ fs.mkdirSync(dir, { recursive: true });
53
+ }
54
+ fs.writeFileSync(RETRY_FILE, String(count));
55
+ return count;
56
+ }
57
+
58
+ // Strip ANSI color codes
59
+ function stripAnsi(str) {
60
+ return str.replace(/\x1B\[[0-9;]*[a-zA-Z]/g, '');
61
+ }
62
+
63
+ // Extract key error lines from output
64
+ function extractErrors(output, maxLines = 10) {
65
+ const clean = stripAnsi(output);
66
+ const lines = clean.split('\n');
67
+
68
+ const errorLines = lines.filter(line =>
69
+ line.toLowerCase().includes('error') ||
70
+ line.includes('FAIL') ||
71
+ line.includes('failed') ||
72
+ line.match(/:\d+:\d+/) // file:line:col pattern
73
+ );
74
+
75
+ return errorLines.slice(0, maxLines);
76
+ }
77
+
78
+ async function main() {
79
+ try {
80
+ // Read hook input from stdin
81
+ let input = '';
82
+ for await (const chunk of process.stdin) {
83
+ input += chunk;
84
+ }
85
+
86
+ JSON.parse(input); // Validate JSON (content not needed)
87
+
88
+ // Check retry count
89
+ const retries = getRetryCount();
90
+ if (retries >= MAX_RETRIES) {
91
+ // Max retries reached, approve to prevent infinite loop
92
+ approve();
93
+ return;
94
+ }
95
+
96
+ // Run make build
97
+ try {
98
+ execSync('make build', {
99
+ cwd: PROJECT_ROOT,
100
+ encoding: 'utf8',
101
+ stdio: 'pipe',
102
+ timeout: 120000
103
+ });
104
+
105
+ // Build passed
106
+ approve();
107
+ } catch (error) {
108
+ const output = (error.stdout || '') + '\n' + (error.stderr || '');
109
+ const errors = extractErrors(output);
110
+
111
+ // Increment retry count
112
+ const currentRetry = incrementRetryCount();
113
+
114
+ let message = `Build failed (attempt ${currentRetry}/${MAX_RETRIES})`;
115
+ if (errors.length > 0) {
116
+ message += ':\n' + errors.map(e => ` ${e}`).join('\n');
117
+ }
118
+
119
+ if (currentRetry >= MAX_RETRIES) {
120
+ message += '\n\nMax retries reached. Fix manually and run `make build`.';
121
+ // Still block this time, but next time will approve
122
+ }
123
+
124
+ block(message);
125
+ }
126
+ } catch (e) {
127
+ // Parse error or other issue - approve to fail open
128
+ approve();
129
+ }
130
+ }
131
+
132
+ main();
@@ -6,6 +6,23 @@ argument-hint: <spec-name>
6
6
 
7
7
  # Spec Interview
8
8
 
9
+ ## Context Hygiene
10
+
11
+ **IMPORTANT:** During planning, protect the context window. Never write code. Never search, grep, or read files directly.
12
+
13
+ Use Explorer subagents for ALL codebase research:
14
+ - Explorer uses a faster, cheaper model
15
+ - Explorer works better with focused tasks
16
+ - Explorer returns only relevant findings, keeping your context clean
17
+
18
+ **Layered approach:**
19
+ 1. First: One Explorer for broad understanding of a system
20
+ 2. Then: Multiple Explorers in parallel for deep dives on specifics
21
+
22
+ Spin up as many Explorers as needed. There is no downside to parallel subagents.
23
+
24
+ **Why this matters:** Search results and file contents that aren't directly relevant cause context rot, degrading planning quality. Subagents curate information before it enters your context.
25
+
9
26
  ## What To Do Now
10
27
 
11
28
  If an argument was provided, use it as the feature name. Otherwise, ask what feature to spec out.
@@ -4,7 +4,7 @@ Establish understanding of the feature before diving into details.
4
4
 
5
5
  ## Opening Questions
6
6
 
7
- Ask one or two questions at a time. Follow up on anything unclear.
7
+ Use AskUserQuestion to gather information. Ask one or two questions at a time. Follow up on anything unclear.
8
8
 
9
9
  Start with:
10
10
  - What problem does this feature solve?
@@ -16,6 +16,6 @@ Then explore:
16
16
 
17
17
  ## When to Move On
18
18
 
19
- Move to `references/step-2-deep-dive.md` when:
19
+ Move to `references/step-2-ui-ux.md` when:
20
20
  - The core problem and user goal are clear
21
21
  - Success criteria are understood at a high level
@@ -0,0 +1,73 @@
1
+ # Step 2: UI/UX Design
2
+
3
+ If the feature has no user interface, skip to `references/step-3-deep-dive.md`.
4
+
5
+ ## Determine Design Direction
6
+
7
+ Before any wireframes, establish the visual approach. Use AskUserQuestion to confirm:
8
+
9
+ **Product context:**
10
+ - What does this product need to feel like?
11
+ - Who uses it? (Power users want density, occasional users want guidance)
12
+ - What's the emotional job? (Trust, efficiency, delight, focus)
13
+
14
+ **Design direction options:**
15
+ - Precision & Density — tight spacing, monochrome, information-forward (Linear, Raycast)
16
+ - Warmth & Approachability — generous spacing, soft shadows, friendly (Notion, Coda)
17
+ - Sophistication & Trust — cool tones, layered depth, financial gravitas (Stripe, Mercury)
18
+ - Boldness & Clarity — high contrast, dramatic negative space (Vercel)
19
+ - Utility & Function — muted palette, functional density (GitHub)
20
+
21
+ **Color foundation:**
22
+ - Warm (creams, warm grays) — approachable, human
23
+ - Cool (slate, blue-gray) — professional, serious
24
+ - Pure neutrals (true grays) — minimal, technical
25
+
26
+ **Layout approach:**
27
+ - Dense grids for scanning/comparing
28
+ - Generous spacing for focused tasks
29
+ - Sidebar navigation for multi-section apps
30
+ - Split panels for list-detail patterns
31
+
32
+ Use AskUserQuestion to present 2-3 options and get the user's preference.
33
+
34
+ ## Create ASCII Wireframes
35
+
36
+ Sketch the interface in ASCII. Keep it rough—this is for alignment, not pixel precision.
37
+
38
+ ```
39
+ Example:
40
+ ┌─────────────────────────────────────────┐
41
+ │ Page Title [Action ▾] │
42
+ ├──────────┬──────────────────────────────┤
43
+ │ Nav Item │ Content Area │
44
+ │ Nav Item │ ┌─────────────────────────┐ │
45
+ │ Nav Item │ │ Component │ │
46
+ │ │ └─────────────────────────┘ │
47
+ └──────────┴──────────────────────────────┘
48
+ ```
49
+
50
+ Create wireframes for:
51
+ - Primary screen(s) the user will interact with
52
+ - Key states (empty, loading, error, populated)
53
+ - Any modals or secondary views
54
+
55
+ Present each wireframe to the user. Use AskUserQuestion to confirm or iterate.
56
+
57
+ ## Map User Flows
58
+
59
+ For each primary action, document the interaction sequence:
60
+
61
+ 1. Where does the user start?
62
+ 2. What do they click/type?
63
+ 3. What feedback do they see?
64
+ 4. Where do they end up?
65
+
66
+ Format as simple numbered steps under each flow name.
67
+
68
+ ## When to Move On
69
+
70
+ Proceed to `references/step-3-deep-dive.md` when:
71
+ - Design direction is agreed upon
72
+ - Wireframes exist for primary screens
73
+ - User has confirmed the layout approach
@@ -1,7 +1,9 @@
1
- # Step 2: Deep Dive
1
+ # Step 3: Deep Dive
2
2
 
3
3
  Cover all specification areas through conversation. Update `docs/specs/<name>/spec.md` incrementally as information emerges.
4
4
 
5
+ Use AskUserQuestion whenever requirements are ambiguous or multiple approaches exist. Present options with tradeoffs and get explicit decisions.
6
+
5
7
  ## Areas to Cover
6
8
 
7
9
  ### Intent & Goals
@@ -13,7 +15,17 @@ Cover all specification areas through conversation. Update `docs/specs/<name>/sp
13
15
  - External services, APIs, or libraries
14
16
  - Data flows in and out
15
17
 
16
- Spawn exploration subagents to investigate the codebase when integration questions arise. They return only relevant findings.
18
+ **IMPORTANT:** Use Explorer subagents for all codebase investigation. Never search or read files directly.
19
+
20
+ Layered approach:
21
+ 1. First Explorer: "How does [system] work at a high level?"
22
+ 2. Parallel Explorers: Deep dive into specific components identified in step 1
23
+
24
+ Example: To understand auth integration:
25
+ - Explorer 1: "How does authentication work in this codebase?"
26
+ - Then parallel: "How are auth tokens validated?", "Where is the user session stored?", "What middleware handles protected routes?"
27
+
28
+ No assumptions. If you don't know how something works, send an Explorer to find out.
17
29
 
18
30
  ### Data Model
19
31
  - Entities and relationships
@@ -74,4 +86,4 @@ Write to `docs/specs/<name>/spec.md` with this structure:
74
86
 
75
87
  ## When to Move On
76
88
 
77
- Move to `references/step-3-research-needs.md` when all areas have been covered and the spec document is substantially complete.
89
+ Move to `references/step-4-research-needs.md` when all areas have been covered and the spec document is substantially complete.
@@ -1,4 +1,4 @@
1
- # Step 3: Identify Research Needs
1
+ # Step 4: Identify Research Needs
2
2
 
3
3
  Before finalizing, determine if implementation requires unfamiliar paradigms.
4
4
 
@@ -12,10 +12,16 @@ This is not about whether Claude knows how to do something in general. It's abou
12
12
 
13
13
  Review the spec's integration points, data model, and behavior sections.
14
14
 
15
- For each significant implementation element:
16
- 1. Search the codebase for existing examples of this pattern
17
- 2. If found this paradigm is established, no research needed
18
- 3. If not found this is a new paradigm requiring research
15
+ **IMPORTANT:** Use Explorer subagents to check for existing patterns. Never search directly.
16
+
17
+ For each significant implementation element, spawn an Explorer:
18
+ - "Does this codebase have an existing example of [pattern]? If yes, where and how does it work?"
19
+
20
+ Spin up multiple Explorers in parallel for different patterns.
21
+
22
+ Based on Explorer findings:
23
+ - If pattern exists → paradigm is established, no research needed
24
+ - If not found → this is a new paradigm requiring research
19
25
 
20
26
  Examples of "new paradigm" triggers:
21
27
  - Using a library not yet in the project
@@ -27,18 +33,18 @@ Examples of "new paradigm" triggers:
27
33
 
28
34
  For each new paradigm identified:
29
35
  1. State what needs research and why (no existing example found)
30
- 2. Ask the user if they want to proceed with research, or if they have existing knowledge to share
36
+ 2. Use AskUserQuestion to ask if they want to proceed with research, or if they have existing knowledge to share
31
37
  3. If proceeding, invoke the `research` skill for that topic
32
38
 
33
39
  Wait for research to complete before continuing. The research output goes to `docs/research/` and informs implementation.
34
40
 
35
41
  ## If No Research Needed
36
42
 
37
- State that all paradigms have existing examples in the codebase. Proceed to `references/step-4-finalize.md`.
43
+ State that all paradigms have existing examples in the codebase. Proceed to `references/step-5-verification.md`.
38
44
 
39
45
  ## When to Move On
40
46
 
41
- Proceed to `references/step-4-finalize.md` when:
47
+ Proceed to `references/step-5-verification.md` when:
42
48
  - All new paradigms have been researched, OR
43
49
  - User confirmed no research is needed, OR
44
50
  - All patterns have existing codebase examples
@@ -0,0 +1,74 @@
1
+ # Step 5: Verification Planning
2
+
3
+ Every acceptance criterion needs a specific, executable verification method. The goal: autonomous implementation with zero ambiguity about whether something works.
4
+
5
+ ## Verification Methods
6
+
7
+ ### UI Verification: agent-browser
8
+
9
+ For any criterion involving visual output or user interaction, use Vercel's agent-browser CLI:
10
+
11
+ ```
12
+ agent-browser open <url> # Navigate to page
13
+ agent-browser snapshot # Get accessibility tree with refs
14
+ agent-browser click @ref # Click element by ref
15
+ agent-browser fill @ref "value" # Fill input by ref
16
+ agent-browser get text @ref # Read text content
17
+ agent-browser screenshot file.png # Capture visual state
18
+ agent-browser close # Close browser
19
+ ```
20
+
21
+ Example verification for "Dashboard shows signup count":
22
+ 1. `agent-browser open /admin`
23
+ 2. `agent-browser snapshot`
24
+ 3. `agent-browser get text @signup-count`
25
+ 4. Assert returned value is a number
26
+
27
+ ### Automated Tests
28
+
29
+ For logic, data, and API behavior, specify the exact test:
30
+ - Unit tests for pure functions
31
+ - Integration tests for API endpoints
32
+ - End-to-end tests for critical flows
33
+
34
+ Include the test file path: `pnpm test src/convex/featureFlags.test.ts`
35
+
36
+ ### Database/State Verification
37
+
38
+ For data persistence criteria:
39
+ 1. Perform the action
40
+ 2. Query the database directly
41
+ 3. Assert expected state
42
+
43
+ ### Manual Verification (Fallback)
44
+
45
+ If no automated method exists, document exactly what to check. Flag these as candidates for future automation.
46
+
47
+ ## Update Each Acceptance Criterion
48
+
49
+ Review every acceptance criterion in the spec. Add a verification method using this format:
50
+
51
+ ```markdown
52
+ ## Acceptance Criteria
53
+
54
+ - [ ] Dashboard loads in under 2s
55
+ **Verify:** `agent-browser open /admin`, measure time to snapshot ready
56
+
57
+ - [ ] Flag toggles persist across refresh
58
+ **Verify:** `pnpm test src/convex/featureFlags.test.ts` (toggle persistence test)
59
+
60
+ - [ ] Signup chart shows accurate counts
61
+ **Verify:** `agent-browser get text @chart-total`, compare to `npx convex run users:count`
62
+ ```
63
+
64
+ ## Confirm With User
65
+
66
+ Use AskUserQuestion to review verification methods with the user:
67
+ - "For [criterion], I'll verify by [method]. Does that prove it works?"
68
+ - Flag any criteria where verification seems insufficient
69
+
70
+ The standard: if the agent executes the verification and it passes, the feature is done. No human checking required.
71
+
72
+ ## When to Move On
73
+
74
+ Proceed to `references/step-6-finalize.md` when every acceptance criterion has a verification method and the user agrees each method proves the criterion works.
@@ -1,19 +1,52 @@
1
- # Step 4: Finalize
1
+ # Step 6: Finalize
2
2
 
3
- Review the spec for completeness and hand off.
3
+ Review the spec for completeness and soundness, then hand off.
4
4
 
5
- ## Review for Gaps
5
+ ## Run Both Reviews
6
6
 
7
- Invoke the `spec-review` skill, specifying which spec to review. It analyzes the spec and returns feedback.
7
+ Invoke both skills in parallel, specifying the spec path:
8
+ - `spec-review` — checks completeness, format, and implementation readiness
9
+ - `spec-sanity-check` — checks logic, assumptions, and unconsidered scenarios
8
10
 
9
- If gaps are found, ask follow-up questions to address them. Repeat until review passes.
11
+ Both return findings to you. They do not modify the spec directly.
12
+
13
+ ## Curate the Findings
14
+
15
+ Synthesize findings from both reviews. Some findings may be:
16
+ - Critical issues that must be addressed
17
+ - Valid suggestions worth considering
18
+ - Pedantic or irrelevant items to skip
19
+
20
+ For each finding, form a recommendation: address it or skip it, and why.
21
+
22
+ ## Walk Through With User
23
+
24
+ Use AskUserQuestion to present findings in batches (2-3 at a time). For each finding:
25
+ - State what the review found
26
+ - Give your recommendation (always include a recommended option)
27
+ - Let user decide: fix, skip, or something else
28
+
29
+ Track two lists:
30
+ - **Addressed**: findings the user chose to fix
31
+ - **Intentionally skipped**: findings the user chose to ignore
32
+
33
+ After walking through all findings, make the approved changes to the spec.
34
+
35
+ ## Offer Another Pass
36
+
37
+ Use AskUserQuestion: "Do you want to run the reviews again?"
38
+
39
+ If yes, invoke both reviews again with additional context:
40
+ - "We already ran a review. These changes were made: [list]. These findings were intentionally skipped: [list]. Look for anything new we haven't considered."
41
+
42
+ Repeat the curate → walk through → offer another pass cycle until the user is satisfied.
10
43
 
11
44
  ## Complete the Interview
12
45
 
13
- Once review passes:
46
+ Once the user confirms no further review passes are needed:
14
47
 
15
48
  1. Show the user the final spec
16
- 2. Confirm they are satisfied
49
+ 2. Use AskUserQuestion to confirm they are satisfied
17
50
  3. Ask if they want to proceed to task breakdown
18
51
 
19
52
  If yes, invoke `spec-to-tasks` and specify which spec to break down.
@@ -11,8 +11,10 @@ context: fork
11
11
 
12
12
  1. **Find the spec** - Use the path from the prompt if provided. Otherwise, find the most recently modified file in `docs/specs/`. If no specs exist, inform the user and stop.
13
13
  2. **Read the spec file**
14
- 3. **Evaluate against the checklist below**
15
- 4. **Return structured feedback using the output format**
14
+ 3. **Find all CLAUDE.md files** - Search for every CLAUDE.md in the project (root and subdirectories)
15
+ 4. **Read all CLAUDE.md files** - These contain project constraints and conventions
16
+ 5. **Evaluate against the checklist below** - Including CLAUDE.md alignment
17
+ 6. **Return structured feedback using the output format**
16
18
 
17
19
  ## Completeness Checklist
18
20
 
@@ -25,22 +27,33 @@ A spec is implementation-ready when ALL of these are satisfied:
25
27
  - [ ] **Integration points mapped** - What existing code this touches is documented
26
28
  - [ ] **Core behavior specified** - Main flows are step-by-step clear
27
29
  - [ ] **Acceptance criteria exist** - Testable requirements are listed
30
+ - [ ] **Verification methods defined** - Every acceptance criterion has a specific way to verify it (test command, agent-browser steps, or explicit check)
31
+ - [ ] **No ambiguities** - Nothing requires interpretation; all requirements are explicit
32
+ - [ ] **No unknowns** - All information needed for implementation is present; nothing left to discover
33
+ - [ ] **CLAUDE.md alignment** - Spec does not conflict with constraints in any CLAUDE.md file
28
34
 
29
35
  ### Should Have (Gaps that cause implementation friction)
30
36
 
31
37
  - [ ] **Edge cases covered** - Error conditions and boundaries are addressed
32
38
  - [ ] **External dependencies documented** - APIs, libraries, services are listed
33
39
  - [ ] **Blockers section exists** - Missing credentials, pending decisions are called out
40
+ - [ ] **UI/UX wireframes exist** - If feature has a user interface, ASCII wireframes are present
41
+ - [ ] **Design direction documented** - If feature has UI, visual approach is explicit (not assumed)
34
42
 
35
43
  ### Implementation Readiness
36
44
 
37
- The test: could someone implement this feature completely hands-off, with zero questions?
45
+ The test: could an agent implement this feature with ZERO assumptions? If the agent would need to guess, interpret, or discover anything, the spec is not ready.
38
46
 
39
47
  Flag these problems:
40
48
  - Vague language ("should handle errors appropriately" — HOW?)
41
49
  - Missing details ("integrates with auth" — WHERE? HOW?)
42
50
  - Unstated assumptions ("uses the standard pattern" — WHICH pattern?)
43
51
  - Blocking dependencies ("needs API access" — DO WE HAVE IT?)
52
+ - Unverifiable criteria ("dashboard works correctly" — HOW DO WE CHECK?)
53
+ - Missing verification ("loads fast" — WHAT COMMAND PROVES IT?)
54
+ - Implicit knowledge ("depends on how X works" — SPECIFY IT)
55
+ - Unverified claims ("the API returns..." — HAS THIS BEEN CONFIRMED?)
56
+ - CLAUDE.md conflicts (spec proposes X but CLAUDE.md requires Y — WHICH IS IT?)
44
57
 
45
58
  ## Output Format
46
59
 
@@ -54,6 +67,9 @@ Return the review as:
54
67
  ### Missing (Blocking)
55
68
  - [Item]: [What's missing and why it blocks implementation]
56
69
 
70
+ ### CLAUDE.md Conflicts
71
+ - [Constraint from CLAUDE.md]: [How the spec conflicts with it]
72
+
57
73
  ### Gaps (Non-blocking but should address)
58
74
  - [Item]: [What's unclear or incomplete]
59
75
 
@@ -0,0 +1,79 @@
1
+ ---
2
+ name: spec-sanity-check
3
+ description: This skill should be used alongside spec-review to catch logic gaps and incorrect assumptions. Invoked when the user says "sanity check this spec", "does this plan make sense", or "what am I missing". Also auto-invoked by spec-interview during finalization.
4
+ argument-hint: <spec-path>
5
+ context: fork
6
+ ---
7
+
8
+ # Spec Sanity Check
9
+
10
+ Provide a "fresh eyes" review of the spec. This is different from spec-review — you're not checking format or completeness. You're checking whether the plan will actually work.
11
+
12
+ ## Find the Spec
13
+
14
+ Use the path from the prompt if provided. Otherwise, find the most recently modified file in `docs/specs/`. If no specs exist, inform the user and stop.
15
+
16
+ ## Read and Understand
17
+
18
+ Read the entire spec. Understand what is being built and how.
19
+
20
+ ## Ask These Questions
21
+
22
+ For each section of the spec, challenge it:
23
+
24
+ ### Logic Gaps
25
+ - Does the described flow actually work end-to-end?
26
+ - Are there steps that assume a previous step succeeded without checking?
27
+ - Are there circular dependencies?
28
+ - Does the order of operations make sense?
29
+
30
+ ### Incorrect Assumptions
31
+ - Are there assumptions about how existing systems work that might be wrong?
32
+ - Are there assumptions about external APIs, libraries, or services?
33
+ - Are there assumptions about data formats or availability?
34
+ - Use Explorer subagents to verify assumptions against the actual codebase
35
+
36
+ ### Unconsidered Scenarios
37
+ - What happens in edge cases not explicitly covered?
38
+ - What happens under load or at scale?
39
+ - What happens if external dependencies fail?
40
+ - What happens if data is malformed or missing?
41
+
42
+ ### Implementation Pitfalls
43
+ - Are there common bugs this approach would likely introduce?
44
+ - Are there security implications not addressed?
45
+ - Are there performance implications not addressed?
46
+ - Are there race conditions or timing issues?
47
+
48
+ ### The "What If" Test
49
+ - What if [key assumption] is wrong?
50
+ - What if [external dependency] changes?
51
+ - What if [data volume] is 10x what we expect?
52
+
53
+ ## Output Format
54
+
55
+ Return findings as:
56
+
57
+ ```
58
+ ## Sanity Check: [Feature Name]
59
+
60
+ ### Status: [SOUND | CONCERNS]
61
+
62
+ ### Logic Issues
63
+ - [Issue]: [Why this is a problem]
64
+
65
+ ### Questionable Assumptions
66
+ - [Assumption]: [Why this might be wrong] [Suggestion to verify]
67
+
68
+ ### Unconsidered Scenarios
69
+ - [Scenario]: [What could go wrong]
70
+
71
+ ### Potential Pitfalls
72
+ - [Pitfall]: [How to avoid]
73
+
74
+ ### Recommendation
75
+ [Either "Plan is sound" or specific concerns to address]
76
+ ```
77
+
78
+ **SOUND**: No significant concerns found.
79
+ **CONCERNS**: Issues that should be addressed before implementation.