@abdullah-alnahas/claude-sdd 0.5.0 → 0.7.0

@@ -1,5 +1,5 @@
 {
   "name": "claude-sdd",
-  "version": "0.5.0",
+  "version": "0.7.0",
   "description": "Spec-Driven Development discipline system — behavioral guardrails, spec-first development, architecture awareness, TDD enforcement, iterative execution loops"
 }
package/README.md CHANGED
@@ -41,6 +41,9 @@ Red → Green → Refactor enforcement. Test traceability from behavior spec to
 ### Iterative Execution
 Disciplined delivery loops: implement with TDD → verify against spec → fix gaps → repeat. TDD is the inner discipline (how you write code), iterative execution is the outer cycle (how you deliver features).
 
+### Performance Optimization
+Profile-first discipline for performance work. Defends against convenience bias (shallow, input-specific hacks), bottleneck mis-targeting, and correctness regressions during optimization.
+
 ## Commands
 
 | Command | Purpose |
@@ -61,6 +64,7 @@ Disciplined delivery loops: implement with TDD → verify against spec → fix g
 | **simplifier** | Complexity reducer — proposes simpler alternatives |
 | **spec-compliance** | Spec adherence checker — verifies traceability (spec → test → code) |
 | **security-reviewer** | Security analysis — OWASP Top 10, input validation, auth review |
+| **performance-reviewer** | Performance optimization reviewer — validates patches for bottleneck targeting, convenience bias, measured improvement |
 
 ## Configuration
 
@@ -107,9 +111,9 @@ whitelist:
 ## Self-Test
 
 ```bash
-bash sdd/scripts/verify-hooks.sh
-bash sdd/scripts/verify-skills.sh
-bash sdd/scripts/verify-commands.sh
+bash scripts/verify-hooks.sh
+bash scripts/verify-skills.sh
+bash scripts/verify-commands.sh
 ```
 
 ## Development Phases
package/agents/critic.md CHANGED
@@ -70,8 +70,4 @@ You are an adversarial reviewer. Your job is to find what's wrong, not confirm w
 
 ## Performance Patch Review
 
-When reviewing performance optimization patches, additionally check:
-- **Bottleneck targeting**: Does the patch address the actual bottleneck, or a convenient but less impactful location?
-- **Convenience bias**: Is this a structural improvement (algorithm, data structure) or a shallow, input-specific hack that's fragile and hard to maintain?
-- **Measured improvement**: Is the speedup quantified with before/after evidence, or just assumed?
-- **Correctness preservation**: Do all existing tests still pass after the optimization?
+When a patch includes performance changes, check for correctness regressions and logical errors as usual. For dedicated performance analysis (bottleneck targeting, convenience bias, measured speedup), defer to the **performance-reviewer** agent.
@@ -1,7 +1,6 @@
 ---
 name: sdd-adopt
 description: Adopt an existing project into the SDD discipline system
-argument-hint: ""
 allowed-tools:
 - Read
 - Write
@@ -54,28 +54,41 @@ When invoked, execute the following phases in order. Announce each phase transit
 
 **Input**: Behavior spec + roadmap from previous phases.
 **Actions**:
-1. Work through roadmap items in priority order
+1. Work through roadmap items in priority order, in **batches of 3**
 2. For each item, use TDD:
    - Write failing test(s) that cover the relevant acceptance criteria
    - Write minimal code to pass
    - Refactor
-3. After each roadmap item, run available verification (test suite, linters, type checks)
+3. After each batch of 3 items:
+   - Run available verification (test suite, linters, type checks)
+   - Report progress with verification evidence (actual test output)
+   - Pause for user feedback before continuing
 4. If tests fail, fix using TDD (understand failure → write targeted fix → verify)
 5. Continue until all roadmap items complete
 
 Use the iterative execution outer loop: implement → verify → fix gaps → repeat (max 10 iterations per roadmap item).
 
-**Transition**: "Implement phase complete — N of M roadmap items done. Entering Verify phase."
+**Transition** (only when all items are complete): "Implement phase complete — all M roadmap items done. Entering Verify phase."
 
 ### Phase 4: Verify
 
 **Input**: Implementation from Phase 3.
 **Actions**:
+Use the two-stage review process (see `/sdd-review`):
+
+**Stage 1 — Spec Compliance:**
 1. Run full test suite
 2. Invoke **spec-compliance agent** — compare implementation against behavior-spec.md
-3. Invoke **critic agent** find logical errors, assumption issues
-4. Invoke **security-reviewer agent** check for vulnerabilities
-5. Collect all findings
+3. DO NOT trust the implementation report. Read actual code and test output independently.
+4. For each acceptance criterion: PASS / FAIL / PARTIAL with evidence
+5. Stage 1 must pass before proceeding to Stage 2
+
+**Stage 2 — Code Quality:**
+6. Invoke **critic agent** — find logical errors, assumption issues
+7. Invoke **simplifier agent** — find unnecessary complexity
+8. Invoke **security-reviewer agent** — check for vulnerabilities
+9. If performance optimization was part of the spec, invoke **performance-reviewer agent**
+10. Collect all findings
 
 **Transition**: "Verify phase complete — N findings (X critical, Y high, Z medium). Entering Review phase."
 
@@ -73,9 +73,27 @@ No critical issues from critic agent.
 Completion is genuine — verified against spec.
 ```
 
+## Batch Execution
+
+When working on multiple criteria or tasks, group them into batches of 3:
+
+1. Implement batch (3 criteria/tasks) using TDD
+2. Verify the batch — run tests, check spec compliance
+3. Report progress with verification evidence (test output, not claims)
+4. Pause for user feedback before continuing to the next batch
+
+This prevents long unverified runs and gives the user control over direction.
+
+## Verification
+
+After each batch, use the two-stage review process:
+- **Stage 1**: Spec compliance — verify each criterion with evidence (see `/sdd-review`)
+- **Stage 2**: Code quality — only after Stage 1 passes
+
 ## Principles
 
 - TDD is the inner discipline: every piece of new code starts with a failing test
 - The outer loop verifies against the spec, not just test results
 - Honest reporting: never claim done when criteria are unsatisfied
 - Bounded: max iterations prevent infinite loops
+- Batch execution: groups of 3 with checkpoint reports
@@ -45,7 +45,7 @@ Phase state is stored in `.sdd-phase` in the project root. This file contains a
 - `design` → architecture-aware skill
 - `implement` → TDD discipline + iterative execution skills
 - `verify` → iterative execution (verification step)
-- `review` → all agents (critic, simplifier, spec-compliance, security-reviewer)
+- `review` → all agents (critic, simplifier, spec-compliance, security-reviewer, performance-reviewer)
 
 ## Output Format
 
@@ -54,6 +54,14 @@ SDD Phase: implement
 ─────────────────────
 Focus: TDD cycles within iterative execution — write tests first, then minimal code to pass
 
-Available skills: tdd-discipline, iterative-execution, guardrails
-Available agents: critic, simplifier, spec-compliance, security-reviewer
+Available skills: tdd-discipline, iterative-execution, guardrails, performance-optimization
+Recommended agents: critic, spec-compliance
+All agents: critic, simplifier, spec-compliance, security-reviewer, performance-reviewer
 ```
+
+Phase-specific agent recommendations:
+- **specify**: spec-compliance (verify spec completeness)
+- **design**: critic (architectural review), simplifier
+- **implement**: spec-compliance (traceability), critic (logic review)
+- **verify**: critic, security-reviewer, performance-reviewer, spec-compliance
+- **review**: all agents
@@ -1,6 +1,6 @@
 ---
 name: sdd-review
-description: On-demand self-review using critic and simplifier agents with iterative fix cycles
+description: Two-stage review — spec compliance first, then code quality. Only proceeds to Stage 2 after Stage 1 passes.
 argument-hint: "[--max-iterations <n>]"
 allowed-tools:
 - Read
@@ -14,40 +14,70 @@ allowed-tools:
 
 # /sdd-review
 
-Trigger an on-demand review of recent work using the critic and simplifier agents. Runs iteratively reviews, fixes, re-reviews until no critical issues remain.
+Trigger a two-stage review of recent work. Stage 1 verifies spec compliance. Stage 2 reviews code quality. Stage 2 only runs after Stage 1 passes.
 
 ## Usage
 
 - `/sdd-review` — Review recent changes
 - `/sdd-review --max-iterations <n>` — Set max review-fix cycles (default: 3)
 
-## Behavior
+## Stage 1: Spec Compliance
+
+**Goal**: Verify the implementation satisfies the behavior spec.
 
 1. Identify what was recently changed (git diff or session context)
-2. Run the **critic agent** — find logical errors, spec drift, assumption issues
-3. Run the **simplifier agent** find unnecessary complexity
-4. If spec documents exist, run the **spec-compliance agent**
-5. Present findings with severity levels
-6. Offer to auto-fix issues found
-7. If fixes are made (using TDD — write test for the fix first if applicable), re-review
-8. Repeat until no critical issues remain or max iterations reached
+2. Find the relevant behavior spec and acceptance criteria
+3. Run the **spec-compliance agent** with the Stage 1 prompt from `iterative-execution/references/review-prompts.md`
+4. **DO NOT trust the implementation report.** Read the actual code and test output independently.
+5. For each acceptance criterion: PASS / FAIL / PARTIAL with evidence
+6. If any criterion fails:
+   - Present findings
+   - Offer to fix (using TDD — write a test for the fix first)
+   - After fixing, re-run Stage 1
+   - Repeat until all criteria pass or max iterations reached
+
+**Stage 1 must pass before proceeding to Stage 2.**
+
+## Stage 2: Code Quality
+
+**Goal**: Find unnecessary complexity, dead code, scope creep.
+
+1. Run the **critic agent** — find logical errors, assumption issues
+2. Run the **simplifier agent** — find unnecessary complexity
+3. If the changes involve performance optimization, run the **performance-reviewer agent**
+4. Present findings with severity levels:
+   - [Critical] — must fix
+   - [Simplification] — should fix
+   - [Observation] — consider fixing
+5. Offer to auto-fix critical and simplification issues
+6. If fixes are made (using TDD), re-run Stage 2
+7. Repeat until no critical issues remain or max iterations reached
 
 ## Output Format
 
 ```
-SDD Review — Iteration 1/3
-──────────────────────────
+SDD Review — Stage 1: Spec Compliance
+──────────────────────────────────────
+
+Spec: specs/behavior-spec.md (5 criteria)
+
+✓ Criterion 1: User can log in — PASS (test_login passes)
+✓ Criterion 2: Invalid credentials show error — PASS (test_invalid_login passes)
+✗ Criterion 3: Session persists — FAIL (no test found for this criterion)
+
+Stage 1: 2/3 FAIL — must fix before proceeding to Stage 2.
+```
+
+```
+SDD Review — Stage 2: Code Quality
+───────────────────────────────────
 
 Critic Findings:
 [Critical] ...
-[Warning] ...
 
 Simplifier Findings:
 [Simplification] ...
 
-Spec Compliance:
-[X of Y criteria satisfied]
-
 Actions:
 - Fix critical issues? (y/n)
 - Apply simplifications? (y/n)
@@ -55,7 +85,12 @@ Actions:
 
 ## Principles
 
+- Stage 1 is the gate. No code quality review on non-compliant code.
 - Reviews are honest — findings are reported as-is, not softened
+- DO NOT trust the implementer's report. Verify independently.
 - Fixes follow TDD: if the fix changes behavior, write a test first
 - Max iterations prevent infinite loops
-- The review itself is part of the iterative execution cycle
+
+## References
+
+See: `iterative-execution/references/review-prompts.md` — Subagent prompt templates
@@ -1,7 +1,6 @@
 ---
 name: sdd-yolo
 description: Temporarily disable all SDD guardrails for this session
-argument-hint: ""
 allowed-tools:
 - Write
 - Bash
package/hooks/hooks.json CHANGED
@@ -6,8 +6,9 @@
 "hooks": [
   {
     "type": "command",
-    "command": "bash $CLAUDE_PLUGIN_ROOT/hooks/scripts/session-init.sh",
-    "timeout": 10
+    "command": "bash \"$CLAUDE_PLUGIN_ROOT/hooks/scripts/session-init.sh\"",
+    "timeout": 10,
+    "additionalContext": true
   }
 ]
 }
@@ -29,7 +30,7 @@
 "hooks": [
   {
     "type": "command",
-    "command": "bash $CLAUDE_PLUGIN_ROOT/hooks/scripts/post-edit-review.sh",
+    "command": "bash \"$CLAUDE_PLUGIN_ROOT/hooks/scripts/post-edit-review.sh\"",
     "timeout": 15
   }
 ]
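The quoting change in this hunk matters whenever the plugin is installed under a path containing spaces: an unquoted `$CLAUDE_PLUGIN_ROOT` gets word-split before `bash` sees it. A minimal standalone sketch (the path and script are invented for illustration):

```shell
#!/bin/bash
# Demonstrate why the hook command quotes $CLAUDE_PLUGIN_ROOT:
# with spaces in the path, only the quoted form resolves correctly.
PLUGIN_ROOT="/tmp/claude plugins/sdd"        # hypothetical path with a space
mkdir -p "$PLUGIN_ROOT/hooks/scripts"
printf '%s\n' '#!/bin/bash' 'echo ok' > "$PLUGIN_ROOT/hooks/scripts/session-init.sh"

# Quoted expansion: the whole path is passed as one argument.
RESULT=$(bash "$PLUGIN_ROOT/hooks/scripts/session-init.sh")
echo "$RESULT"   # prints: ok

rm -rf "/tmp/claude plugins"
```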
@@ -10,24 +10,42 @@ if [ "${GUARDRAILS_DISABLED:-false}" = "true" ]; then
 fi
 
 PROJECT_DIR="${CLAUDE_PROJECT_DIR:-.}"
+# Resolve to absolute path (POSIX-compatible fallback when realpath unavailable)
+if command -v realpath &>/dev/null; then
+  PROJECT_DIR=$(realpath "$PROJECT_DIR" 2>/dev/null || echo "$PROJECT_DIR")
+else
+  PROJECT_DIR=$(cd "$PROJECT_DIR" 2>/dev/null && pwd || echo "$PROJECT_DIR")
+fi
 
 # Read tool input from stdin (JSON with file_path)
 INPUT=$(cat)
 
 # Use jq if available, fall back to sed
 if command -v jq &>/dev/null; then
-  FILE_PATH=$(echo "$INPUT" | jq -r '.file_path // .filePath // empty' 2>/dev/null || echo "")
+  FILE_PATH=$(echo "$INPUT" | jq -r '.file_path // .filePath // empty' 2>/dev/null || true)
+  if [ -z "$FILE_PATH" ]; then
+    echo "SDD: post-edit-review skipped — could not parse file_path from hook input" >&2
+    exit 0
+  fi
 else
-  FILE_PATH=$(echo "$INPUT" | sed -n 's/.*"file_path"\s*:\s*"\([^"]*\)".*/\1/p' | head -1)
+  FILE_PATH=$(echo "$INPUT" | sed -n 's/.*"file_path"[ \t]*:[ \t]*"\([^"]*\)".*/\1/p' | head -1)
   if [ -z "$FILE_PATH" ]; then
-    FILE_PATH=$(echo "$INPUT" | sed -n 's/.*"filePath"\s*:\s*"\([^"]*\)".*/\1/p' | head -1)
+    FILE_PATH=$(echo "$INPUT" | sed -n 's/.*"filePath"[ \t]*:[ \t]*"\([^"]*\)".*/\1/p' | head -1)
   fi
 fi
 
 if [ -z "$FILE_PATH" ]; then
+  echo "SDD: post-edit-review skipped — no file_path in hook input" >&2
   exit 0
 fi
 
+# Resolve file path to absolute for consistent comparison
+if command -v realpath &>/dev/null; then
+  FILE_PATH=$(realpath "$FILE_PATH" 2>/dev/null || echo "$FILE_PATH")
+elif [ -f "$FILE_PATH" ]; then
+  FILE_PATH=$(cd "$(dirname "$FILE_PATH")" 2>/dev/null && echo "$(pwd)/$(basename "$FILE_PATH")" || echo "$FILE_PATH")
+fi
+
 # Check if inside project directory
 case "$FILE_PATH" in
   "$PROJECT_DIR/"*|"$PROJECT_DIR") ;; # Inside project, OK
@@ -39,7 +57,7 @@ esac
 
 # Check git status for unrelated modifications
 if command -v git &>/dev/null && [ -d "$PROJECT_DIR/.git" ]; then
-  MODIFIED_COUNT=$(cd "$PROJECT_DIR" && git diff --name-only 2>/dev/null | wc -l)
+  MODIFIED_COUNT=$(cd "$PROJECT_DIR" && git diff --name-only 2>/dev/null | wc -l | tr -d ' ')
  if [ "$MODIFIED_COUNT" -gt 10 ]; then
    echo "SDD SCOPE WARNING: $MODIFIED_COUNT files modified — possible scope creep. Review changes with 'git diff --stat'" >&2
    exit 2
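The jq-with-sed-fallback parsing in this hunk can be exercised standalone. A sketch under the assumption that hook input arrives as JSON on stdin with either a `file_path` or `filePath` key (the sample payload is invented); note `[ \t]` replaces `\s`, which is a GNU sed extension:

```shell
#!/bin/bash
# Parse file_path from a JSON hook payload, preferring jq, falling back to sed.
INPUT='{"tool":"Edit","filePath":"/tmp/project/src/main.c"}'   # hypothetical payload

if command -v jq >/dev/null 2>&1; then
  # jq's // operator tries file_path first, then filePath, then yields nothing.
  FILE_PATH=$(printf '%s' "$INPUT" | jq -r '.file_path // .filePath // empty')
else
  # POSIX-portable sed: explicit [ \t] class instead of the GNU-only \s.
  FILE_PATH=$(printf '%s' "$INPUT" | sed -n 's/.*"filePath"[ \t]*:[ \t]*"\([^"]*\)".*/\1/p' | head -1)
fi

echo "$FILE_PATH"   # prints: /tmp/project/src/main.c
```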
@@ -1,9 +1,14 @@
 #!/bin/bash
 # SDD Session Initialization Hook
-# Loads .sdd.yaml config, sets environment variables, checks yolo flag
+# Loads .sdd.yaml config, sets environment variables, checks yolo flag, injects using-sdd skill
 
 set -euo pipefail
 
+# Validate plugin environment
+if [ -z "${CLAUDE_PLUGIN_ROOT:-}" ]; then
+  echo "SDD WARNING: CLAUDE_PLUGIN_ROOT is not set — hooks may not locate plugin resources" >&2
+fi
+
 PROJECT_DIR="${CLAUDE_PROJECT_DIR:-.}"
 ENV_FILE="${CLAUDE_ENV_FILE:-}"
 CONFIG_FILE="$PROJECT_DIR/.sdd.yaml"
@@ -13,7 +18,9 @@ YOLO_FLAG="$PROJECT_DIR/.sdd-yolo"
 if [ -f "$YOLO_FLAG" ]; then
   echo "SDD: Previous YOLO mode detected — clearing flag, guardrails disabled for this session" >&2
   # Remove yolo flag (auto-clears on session start)
-  rm -f "$YOLO_FLAG"
+  if ! rm -f "$YOLO_FLAG" 2>/dev/null; then
+    echo "SDD WARNING: Could not remove yolo flag at $YOLO_FLAG — guardrails may remain disabled next session" >&2
+  fi
   if [ -n "$ENV_FILE" ]; then
     echo "GUARDRAILS_DISABLED=true" >> "$ENV_FILE"
     echo "SDD_YOLO_CLEARED=true" >> "$ENV_FILE"
@@ -40,5 +47,14 @@ else
   fi
 fi
 
+# Inject using-sdd skill as additionalContext
+USING_SDD_PATH="${CLAUDE_PLUGIN_ROOT:-}/skills/using-sdd/SKILL.md"
+if [ -f "$USING_SDD_PATH" ]; then
+  # Strip frontmatter and output as additionalContext
+  sed '1{/^---$/!q;};1,/^---$/d' "$USING_SDD_PATH"
+else
+  echo "SDD WARNING: using-sdd skill not found at $USING_SDD_PATH" >&2
+fi
+
 echo "SDD: Session initialized — guardrails active" >&2
 exit 0
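The frontmatter-stripping sed expression added here is dense: `1{/^---$/!q;}` quits immediately if line 1 is not a `---` fence, and `1,/^---$/d` otherwise deletes everything through the closing fence. A standalone sketch on an invented SKILL.md-style file:

```shell
#!/bin/bash
# Strip YAML frontmatter from a markdown file, keeping only the body.
TMP=$(mktemp)
printf '%s\n' '---' 'name: using-sdd' '---' 'Body line 1' 'Body line 2' > "$TMP"

# Deletes lines 1..3 (the frontmatter block), emitting only the body.
BODY=$(sed '1{/^---$/!q;};1,/^---$/d' "$TMP")
echo "$BODY"
rm -f "$TMP"
```

Note the guard only covers files that do start with `---`; for a file without frontmatter the `q` path exits after printing the first line, so the expression assumes well-formed skill files.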
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@abdullah-alnahas/claude-sdd",
-  "version": "0.5.0",
+  "version": "0.7.0",
   "description": "Spec-Driven Development discipline system for Claude Code — behavioral guardrails, spec-first development, architecture awareness, TDD enforcement, iterative execution loops",
   "keywords": [
     "claude-code-plugin",
@@ -9,11 +9,13 @@ FAIL=0
 check() {
   local desc="$1"
   shift
-  if "$@" >/dev/null 2>&1; then
+  local output
+  if output=$("$@" 2>&1); then
     echo "  ✓ $desc"
     PASS=$((PASS + 1))
   else
     echo "  ✗ $desc"
+    [ -n "$output" ] && echo "    $output"
     FAIL=$((FAIL + 1))
   fi
 }
@@ -29,7 +31,7 @@ for cmd in "${COMMANDS[@]}"; do
   CMD_FILE="$PLUGIN_DIR/commands/$cmd.md"
 
   check "File exists" test -f "$CMD_FILE"
-  check "Has frontmatter" grep -q "^---" "$CMD_FILE"
+  check "Has frontmatter" bash -c "sed -n '1p' \"$CMD_FILE\" | grep -q '^---'"
   check "Has name field" grep -q "^name:" "$CMD_FILE"
   check "Has description field" grep -q "^description:" "$CMD_FILE"
   check "Is non-empty (>100 chars)" test "$(wc -c < "$CMD_FILE")" -gt 100
@@ -45,7 +47,7 @@ check "All command names are unique" test "$(echo "$NAMES" | wc -l)" -eq "$(echo
 # Check agents referenced by commands exist
 echo ""
 echo "Agent references:"
-AGENTS=("critic" "simplifier" "spec-compliance" "security-reviewer")
+AGENTS=("critic" "simplifier" "spec-compliance" "security-reviewer" "performance-reviewer")
 for agent in "${AGENTS[@]}"; do
   check "Agent: $agent.md exists" test -f "$PLUGIN_DIR/agents/$agent.md"
 done
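The `check()` change is the heart of this hunk: failure output is now captured and surfaced instead of discarded to `/dev/null`. A minimal self-contained sketch of the pattern (the two demo checks are arbitrary):

```shell
#!/bin/bash
# Run a command as a named check; on failure, echo the command's captured output.
PASS=0
FAIL=0
check() {
  local desc="$1"
  shift
  local output
  if output=$("$@" 2>&1); then
    echo "  ✓ $desc"
    PASS=$((PASS + 1))
  else
    echo "  ✗ $desc"
    [ -n "$output" ] && echo "    $output"
    FAIL=$((FAIL + 1))
  fi
}

check "true succeeds" true
check "missing file" test -f /nonexistent-file-for-demo
echo "pass=$PASS fail=$FAIL"   # prints: pass=1 fail=1
```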
@@ -9,11 +9,13 @@ FAIL=0
 check() {
   local desc="$1"
   shift
-  if "$@" >/dev/null 2>&1; then
+  local output
+  if output=$("$@" 2>&1); then
     echo "  ✓ $desc"
     PASS=$((PASS + 1))
   else
     echo "  ✗ $desc"
+    [ -n "$output" ] && echo "    $output"
     FAIL=$((FAIL + 1))
   fi
 }
@@ -21,36 +23,50 @@ check() {
 echo "SDD Hook Verification"
 echo "─────────────────────"
 
+# Detect Python interpreter
+PYTHON=""
+for candidate in python3 python; do
+  if command -v "$candidate" &>/dev/null; then
+    PYTHON="$candidate"
+    break
+  fi
+done
+
 # Check hooks.json exists and is valid JSON
 echo ""
 echo "hooks.json:"
 check "File exists" test -f "$PLUGIN_DIR/hooks/hooks.json"
-check "Valid JSON" python3 -c "import json; json.load(open('$PLUGIN_DIR/hooks/hooks.json'))"
-check "Has hooks wrapper" python3 -c "
-import json
-d = json.load(open('$PLUGIN_DIR/hooks/hooks.json'))
-assert 'hooks' in d, 'Missing hooks key'
-"
-check "Has SessionStart hook" python3 -c "
-import json
-d = json.load(open('$PLUGIN_DIR/hooks/hooks.json'))
-assert 'SessionStart' in d['hooks']
-"
-check "Has UserPromptSubmit hook" python3 -c "
-import json
-d = json.load(open('$PLUGIN_DIR/hooks/hooks.json'))
-assert 'UserPromptSubmit' in d['hooks']
-"
-check "Has PostToolUse hook" python3 -c "
-import json
-d = json.load(open('$PLUGIN_DIR/hooks/hooks.json'))
-assert 'PostToolUse' in d['hooks']
-"
-check "Has Stop hook" python3 -c "
-import json
-d = json.load(open('$PLUGIN_DIR/hooks/hooks.json'))
-assert 'Stop' in d['hooks']
-"
+if [ -n "$PYTHON" ]; then
+  check "Valid JSON" "$PYTHON" -c "import json, sys; json.load(open(sys.argv[1]))" "$PLUGIN_DIR/hooks/hooks.json"
+  check "Has hooks wrapper" "$PYTHON" -c "
+import json, sys
+d = json.load(open(sys.argv[1]))
+assert 'hooks' in d, 'Missing hooks key'
+" "$PLUGIN_DIR/hooks/hooks.json"
+  check "Has SessionStart hook" "$PYTHON" -c "
+import json, sys
+d = json.load(open(sys.argv[1]))
+assert 'SessionStart' in d['hooks']
+" "$PLUGIN_DIR/hooks/hooks.json"
+  check "Has UserPromptSubmit hook" "$PYTHON" -c "
+import json, sys
+d = json.load(open(sys.argv[1]))
+assert 'UserPromptSubmit' in d['hooks']
+" "$PLUGIN_DIR/hooks/hooks.json"
+  check "Has PostToolUse hook" "$PYTHON" -c "
+import json, sys
+d = json.load(open(sys.argv[1]))
+assert 'PostToolUse' in d['hooks']
+" "$PLUGIN_DIR/hooks/hooks.json"
+  check "Has Stop hook" "$PYTHON" -c "
+import json, sys
+d = json.load(open(sys.argv[1]))
+assert 'Stop' in d['hooks']
+" "$PLUGIN_DIR/hooks/hooks.json"
+else
+  echo "  ⚠ Python not found — skipping JSON validation (install python3 or activate a venv)"
+  FAIL=$((FAIL + 1))
+fi
 
 # Check scripts exist and are executable
 echo ""
@@ -60,10 +76,16 @@ check "post-edit-review.sh exists" test -f "$PLUGIN_DIR/hooks/scripts/post-edit-
 check "session-init.sh is executable or bash-runnable" bash -n "$PLUGIN_DIR/hooks/scripts/session-init.sh"
 check "post-edit-review.sh is executable or bash-runnable" bash -n "$PLUGIN_DIR/hooks/scripts/post-edit-review.sh"
 
-# Test session-init.sh runs without error
+# Test session-init.sh runs without error (in isolated temp dir to avoid side effects)
 echo ""
 echo "Script execution:"
-check "session-init.sh runs without error" bash "$PLUGIN_DIR/hooks/scripts/session-init.sh"
+check "session-init.sh runs without error" bash -c "
+  TMPDIR=\$(mktemp -d)
+  CLAUDE_PROJECT_DIR=\"\$TMPDIR\" CLAUDE_ENV_FILE=\"\" bash \"$PLUGIN_DIR/hooks/scripts/session-init.sh\" 2>/dev/null
+  rc=\$?
+  rm -rf \"\$TMPDIR\"
+  exit \$rc
+"
 
 echo ""
 echo "─────────────────────"
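The interpreter-detection pattern in this hunk can be sketched standalone. Passing the file path via `sys.argv` (rather than interpolating `$PLUGIN_DIR` into the Python source) sidesteps quoting problems when the path contains quotes or spaces. The sample JSON below is invented:

```shell
#!/bin/bash
# Prefer python3, fall back to python; validate a JSON file via argv.
PYTHON=""
for candidate in python3 python; do
  if command -v "$candidate" >/dev/null 2>&1; then
    PYTHON="$candidate"
    break
  fi
done

TMP=$(mktemp)
printf '%s' '{"hooks": {"SessionStart": []}}' > "$TMP"

RESULT="skipped"   # reported when no interpreter is available
if [ -n "$PYTHON" ]; then
  "$PYTHON" -c "import json, sys; d = json.load(open(sys.argv[1])); assert 'hooks' in d" "$TMP" \
    && RESULT="valid"
fi
echo "$RESULT"
rm -f "$TMP"
```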
@@ -9,11 +9,13 @@ FAIL=0
 check() {
   local desc="$1"
   shift
-  if "$@" >/dev/null 2>&1; then
+  local output
+  if output=$("$@" 2>&1); then
     echo "  ✓ $desc"
     PASS=$((PASS + 1))
   else
     echo "  ✗ $desc"
+    [ -n "$output" ] && echo "    $output"
     FAIL=$((FAIL + 1))
   fi
 }
@@ -21,7 +23,7 @@ check() {
 echo "SDD Skill Verification"
 echo "──────────────────────"
 
-SKILLS=("guardrails" "spec-first" "architecture-aware" "tdd-discipline" "iterative-execution")
+SKILLS=("guardrails" "spec-first" "architecture-aware" "tdd-discipline" "iterative-execution" "performance-optimization" "using-sdd")
 
 for skill in "${SKILLS[@]}"; do
   echo ""
@@ -30,7 +32,7 @@ for skill in "${SKILLS[@]}"; do
 
   check "Directory exists" test -d "$SKILL_DIR"
   check "SKILL.md exists" test -f "$SKILL_DIR/SKILL.md"
-  check "SKILL.md has frontmatter" grep -q "^---" "$SKILL_DIR/SKILL.md"
+  check "SKILL.md has frontmatter" bash -c "sed -n '1p' \"$SKILL_DIR/SKILL.md\" | grep -q '^---'"
   check "SKILL.md has name field" grep -q "^name:" "$SKILL_DIR/SKILL.md"
   check "SKILL.md has description field" grep -q "^description:" "$SKILL_DIR/SKILL.md"
 
@@ -1,10 +1,9 @@
 ---
 name: Architecture Awareness
 description: >
-  This skill provides architecture consciousness during development, including integration patterns,
-  anti-patterns, and ADR guidance. It should be used when the user asks how to structure or organize code,
-  discusses architecture or design patterns, plans integrations between components, or asks
-  "how should I structure this?", "what pattern should I use?", or "should I split this into services?"
+  Use when structuring or organizing code, discussing architecture or design patterns, planning
+  integrations between components, or when the user asks "how should I structure this?", "what
+  pattern should I use?", "should I split this into services?", or "document this decision."
 ---
 
 # Architecture Awareness
@@ -22,13 +21,13 @@ Every pattern has trade-offs. State the specific benefit for THIS codebase, not
 ### Record Significant Decisions
 If a decision is hard to reverse or affects multiple components, it deserves an ADR (Architecture Decision Record).
 
-## When to Engage
+## When an Architecture Question Arises
 
-- User asks "how should I structure this?"
-- Adding a new component to an existing system
-- Introducing a new technology or pattern
-- Changing how components communicate
-- Anything that touches 3+ modules/services
+1. **Survey existing patterns** — read the codebase to understand current conventions, patterns, and structure
+2. **Evaluate fit** — does the proposed approach align with or diverge from existing patterns? Divergence needs justification.
+3. **State trade-offs explicitly** — every option has costs and benefits. Name them concretely for this codebase.
+4. **Decide whether an ADR is warranted** — write one if the decision is hard to reverse or affects multiple components
+5. **Document if yes** — use the ADR template from `references/adr-guide.md`
 
 ## What to Check
 
@@ -37,6 +36,12 @@ If a decision is hard to reverse or affects multiple components, it deserves an
 3. **Coupling**: Are we creating tight coupling between components?
 4. **Consistency**: Does this follow or violate established conventions?
 
+## Related Skills
+
+- **spec-first** — architecture decisions emerge during Stage 4 (Architecture)
+- **iterative-execution** — architectural context guides integration during implementation
+- **guardrails** — enforces architectural consistency as part of scope discipline
+
 ## References
 
 See: `references/integration-patterns.md`
@@ -1,16 +1,19 @@
1
1
  ---
2
2
  name: SDD Guardrails
3
3
  description: >
4
- This skill enforces core behavioral guardrails defending against 12 common LLM failure modes during
5
- software development. It should be used when the user asks to implement, build, write, fix, refactor,
6
- add, change, or modify code essentially any coding task. It enforces honesty over agreement, scope
7
- discipline, simplicity, and verification before claiming completion.
4
+ Use when implementing, building, fixing, refactoring, adding, changing, or modifying code.
5
+ Use when reviewing code or claiming work is complete. Use when you notice yourself agreeing
6
+ without critical evaluation or adding code beyond what was requested.
8
7
  ---
9
8
 
10
9
  # SDD Behavioral Guardrails
11
10
 
12
11
  Operate under the SDD (Spec-Driven Development) discipline system. These guardrails defend against known LLM failure patterns in software development.
13
12
 
13
+ ## Spirit vs. Letter
14
+
15
+ Follow the **spirit** of these guardrails, not just their checklists. The goal is disciplined development that produces correct, simple, spec-compliant code. If following a checklist item mechanically would produce worse results than thoughtful application of the principle behind it, follow the principle. But this is **never** an excuse to skip steps — it's a reason to apply them thoughtfully.
16
+
14
17
  ## Core Principles
15
18
 
16
19
  ### 1. Honesty Over Agreement
@@ -26,7 +29,50 @@ The right solution is the simplest one that works. Before writing any code, ask:
26
29
  Every assumption you make is a potential bug. Enumerate your assumptions explicitly. If you're uncertain about intent, ask. If you're uncertain about behavior, test. Never silently guess.
27
30
 
28
31
  ### 5. Verify Before Claiming
29
- Never say "done" until you've verified. Run the tests. Check the output. Read your own code critically. A completion claim without verification is a lie.
32
+ Never say "done" until you've verified. This is a formal gate:
33
+
34
+ 1. **IDENTIFY** the command or check needed to verify
35
+ 2. **RUN** the command (test suite, linter, type checker)
36
+ 3. **READ** the output — actually read it, don't skim
+ 4. **VERIFY** the claim against the output — does the evidence support "done"?
+ 5. **THEN** claim completion, citing the evidence
+
+ A completion claim without verification is a lie.
+
+ **Common verification failures:**
+
+ | Failure | What Actually Happened |
+ |---------|----------------------|
+ | "Tests pass" without running them | You guessed. Run them. |
+ | Ran tests but didn't read output | A failure was buried in the output. Read it. |
+ | Tests pass but don't cover the change | You tested the wrong thing. Check coverage. |
+ | "Looks correct" from reading code | Reading is not testing. Execute it. |
+ | Verified one case, claimed all cases | Edge cases exist. Test them. |
+
+ ## Rationalization Red Flags
+
+ These thoughts mean STOP — you're about to violate a guardrail:
+
+ | Thought | Reality |
+ |---------|---------|
+ | "This small fix doesn't need the full checkpoint" | Small fixes are where scope creep starts. |
+ | "The user seems to want me to just do it" | Discipline is not optional based on tone. |
+ | "I'll verify at the end" | Verify continuously. End-of-task verification catches less. |
+ | "This is obviously correct" | Obvious code has bugs too. Test it. |
+ | "Adding this extra thing will help" | That's scope creep. Mention it, don't do it. |
+ | "I'm sure this test passes" | Run it. Being sure is not evidence. |
+ | "The user won't notice this improvement" | Unasked changes are defects regardless. |
+ | "This is a standard pattern, no need to verify" | Standard patterns fail in specific contexts. Verify. |
+
+ ## Escalation Rule
+
+ After 3 failed attempts to fix the same issue, **STOP**. Do not attempt a 4th fix. Instead:
+
+ 1. State what you've tried and why each attempt failed
+ 2. Question whether the approach or architecture is wrong
+ 3. Suggest an alternative approach or ask the user for direction
+
+ Repeated failures on the same issue usually indicate a wrong approach, not insufficient effort.

  ## Pre-Implementation Checkpoint

@@ -51,14 +97,7 @@ Before writing ANY implementation code, you MUST:

  ## Performance Changes

- When the task is performance optimization:
-
- 1. **Profile first** — identify the actual bottleneck with evidence (timing, profiler output). Never guess.
- 2. **Verify correctness after every change** — run the full test suite. Any test regression invalidates the optimization.
- 3. **Measure improvement quantitatively** — compare before/after timings. No "it should be faster" — prove it.
- 4. **Prefer structural improvements** — algorithmic and data-structure changes over micro-optimizations or input-specific hacks.
- 5. **Never sacrifice correctness for speed** — a faster but broken program is not an optimization, it's a defect.
- 6. **Watch for convenience bias** — small, surface-level tweaks that are easy to produce but fragile and hard to maintain. Push for deeper fixes.
+ For performance optimization tasks, follow the **performance-optimization** skill for the full workflow (profile-first discipline, convenience bias detection, measured improvement). The core rule: never sacrifice correctness for speed.

  ## Completion Review

@@ -67,7 +106,7 @@ Before claiming work is done:
  1. Re-read the original request
  2. Verify every requirement is met
  3. Check for dead code you introduced
- 4. Check function/file length limits (50/500 lines)
+ 4. Check function/file length guidelines (aim for ~50/~500 lines — adapt to project conventions)
  5. Verify no unrelated files were modified
  6. Run available tests

@@ -75,5 +114,15 @@ Before claiming work is done:

  Consult the failure patterns reference for detailed detection and response guidance for all 12 failure modes.

+ ## Related Skills
+
+ - **spec-first** — for the pre-implementation spec check (step 6 above)
+ - **tdd-discipline** — for the TDD inner discipline during implementation
+ - **iterative-execution** — for the implement-verify-fix outer cycle
+ - **performance-optimization** — for performance-specific guardrails
+ - **architecture-aware** — for architectural consistency checks
+
+ ## References
+
  See: `references/failure-patterns.md`
  See: `references/pushback-guide.md`
@@ -61,3 +61,8 @@
  **Detection**: You've been coding for a while and haven't re-read the original requirement.
  **Response**: Periodically re-read the request. Check that your solution actually solves the stated problem, not a related but different one.
  **Example**: User asks to "sort by date" and you implement alphabetical sort because you started coding before fully reading.
+
+ ### 13. Fix Thrashing
+ **Detection**: You've attempted 3+ fixes for the same issue and it's still broken. Each fix introduces a new problem or reverts to a previous failure.
+ **Response**: Stop fixing. Step back and question the approach. State what you've tried, why each failed, and propose an alternative architecture or ask the user for direction.
+ **Example**: A test keeps failing despite three different fixes to the handler. The real problem is the test assumes synchronous behavior but the handler is async. The fix isn't in the handler — it's in the test setup or the architectural approach.
@@ -1,10 +1,9 @@
  ---
  name: Iterative Execution
  description: >
- This skill provides disciplined implement-verify-fix cycles for delivering features against specifications.
- It should be used when the user asks to implement a feature from a spec, when implementation needs iterating
- to match requirements, or when the user says "make this work," "implement this spec," "keep going until all
- tests pass," or "it's not matching the spec yet."
+ Use when implementing a feature from a spec, when implementation needs iterating to match
+ requirements, or when delivering any non-trivial change. Use when the user says "make this work,"
+ "implement this spec," "keep going until all tests pass," or "it's not matching the spec yet."
  ---

  # Iterative Execution
@@ -36,6 +35,21 @@ They are complementary, not competing. TDD governs how you write code. Iterative
  7. Report honest completion status
  ```

+ ## Rationalization Red Flags
+
+ These thoughts mean STOP — you're about to skip verification:
+
+ | Thought | Reality |
+ |---------|---------|
+ | "I'll verify everything at the end" | Verify after each change. End-of-task catches less. |
+ | "The code looks right, no need to run tests" | Looking right is not evidence. Run the tests. |
+ | "I fixed the issue, moving on" | Did you verify the fix? Run the test again. |
+ | "Only one small thing changed" | Small changes cause big failures. Verify. |
+ | "I already know this passes" | You knew the previous version passed. This is a new version. |
+ | "Verification would take too long" | Shipping a bug takes longer. Verify. |
+ | "The spec is satisfied, I can see it" | Seeing is not testing. Run the criteria checks. |
+ | "I'll skip this iteration's verification" | Skipping once becomes skipping always. Never skip. |
+
  ## Completion Criteria

  Good completion criteria are:
@@ -54,11 +68,7 @@ Use whatever is available, in order of preference:

  ## Performance Optimization Tasks

- When the task is performance optimization, the verification step MUST include:
- 1. **Timing comparison** — measure before vs after on the actual workload. Quantify the speedup.
- 2. **Test suite pass** — correctness preserved. Any new test failure invalidates the optimization.
- 3. **Profile comparison** — confirm the bottleneck was actually addressed, not just masked or shifted elsewhere.
- 4. **Convenience bias check** — is this a structural improvement or a shallow, input-specific hack? If the latter, iterate.
+ For performance optimization tasks, the verification step must additionally include timing comparison, profile comparison, and convenience bias checks. Follow the **performance-optimization** skill for the full workflow.
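The timing-comparison step can be sketched as follows. This is an editorial illustration, not part of the package — the two functions are hypothetical stand-ins for a pre- and post-optimization implementation:

```python
import timeit

# Illustrative stand-ins for the code before and after an optimization.
def sum_squares_before(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

def sum_squares_after(n):
    return sum(i * i for i in range(n))

# Correctness first: both versions must agree, or the timing is meaningless.
assert sum_squares_before(10_000) == sum_squares_after(10_000)

# Quantified before/after comparison on the same workload.
before = timeit.timeit(lambda: sum_squares_before(10_000), number=200)
after = timeit.timeit(lambda: sum_squares_after(10_000), number=200)
print(f"before={before:.3f}s  after={after:.3f}s  ratio={before / after:.2f}x")
```

Whether the second version is actually faster on a given machine is beside the point — the discipline is that both numbers come from the same workload and correctness is asserted before any timing is reported.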

  ## Honesty Rules

@@ -67,7 +77,16 @@ When the task is performance optimization, the verification step MUST include:
  - **Never weaken criteria to match output.** The spec defines done, not the implementation.
  - **Be honest about partial completion.** "3 of 5 criteria met, blocked on X" is better than a false "done."

+ ## Related Skills
+
+ - **tdd-discipline** — the inner discipline used within each implementation step
+ - **spec-first** — produces the specs that define completion criteria
+ - **guardrails** — the overarching discipline layer
+ - **architecture-aware** — architectural context for integration decisions during implementation
+ - **performance-optimization** — specialized verification for performance tasks
+
  ## References

  See: `references/loop-patterns.md`
  See: `references/completion-criteria.md`
+ See: `references/review-prompts.md`
@@ -0,0 +1,69 @@
+ # Review Prompt Templates
+
+ Subagent prompt templates for the two-stage review process.
+
+ ## Stage 1: Spec Compliance Reviewer
+
+ Use this prompt for the spec-compliance review subagent:
+
+ ```
+ You are reviewing an implementation against its behavior specification.
+
+ DO NOT trust the implementer's report of what was done. Read the actual code and actual test output.
+
+ Your job:
+ 1. Read the behavior spec (acceptance criteria)
+ 2. Read the implementation code
+ 3. Run or read test results
+ 4. For EACH acceptance criterion, independently verify:
+    - Is there a test that covers this criterion?
+    - Does the test actually test what the criterion specifies?
+    - Does the test pass? (Read the output, don't trust claims)
+    - Does the implementation match the criterion's intent, not just its letter?
+
+ Report format:
+ - For each criterion: PASS / FAIL / PARTIAL with evidence
+ - Overall: X of Y criteria satisfied
+ - Blocking issues (must fix before proceeding)
+ - Non-blocking observations
+
+ DO NOT soften findings. DO NOT say "mostly works" when a criterion fails.
+ A criterion either passes with evidence or it doesn't.
+ ```
+
+ ## Stage 2: Code Quality Reviewer
+
+ Use this prompt for the code quality review subagent (only run after Stage 1 passes):
+
+ ```
+ You are reviewing code quality after spec compliance has been verified.
+
+ Review the implementation for:
+ 1. Unnecessary complexity (could this be simpler?)
+ 2. Dead code introduced by the changes
+ 3. Scope creep (changes beyond what the spec required)
+ 4. Missing error handling at system boundaries
+ 5. Naming clarity
+ 6. Function/file length (aim ~50/~500 lines)
+
+ For each finding, classify:
+ - [Critical] — must fix (bugs, security issues)
+ - [Simplification] — should fix (unnecessary complexity)
+ - [Observation] — consider fixing (style, minor improvements)
+
+ DO NOT invent requirements. Only flag issues that make the code worse.
+ DO NOT suggest adding features, abstractions, or patterns not needed by the spec.
+ ```
+
+ ## Implementer Self-Review Checklist
+
+ Before requesting external review, the implementer should verify:
+
+ 1. [ ] Re-read the original request/spec
+ 2. [ ] Every acceptance criterion has a corresponding test
+ 3. [ ] All tests pass (actually ran them, read the output)
+ 4. [ ] No unrelated files were modified
+ 5. [ ] No dead code was introduced
+ 6. [ ] No abstractions for single-use patterns
+ 7. [ ] Function lengths are reasonable
+ 8. [ ] Changes are the minimum needed to satisfy the spec
@@ -1,10 +1,9 @@
  ---
  name: Performance Optimization
  description: >
- This skill enforces disciplined performance optimization practices defending against convenience bias,
- localization failure, and correctness regressions. It should be used when the user asks to optimize,
- speed up, improve performance, reduce runtime, or make code faster — any task where the goal is better
- performance without breaking correctness.
+ Use when optimizing, speeding up, profiling, reducing memory usage, or improving performance.
+ Use when the user says "profile this," "find the bottleneck," "speed up," or "optimize."
+ Use when any change targets performance without breaking correctness.
  ---

  # Performance Optimization Discipline
@@ -49,10 +48,18 @@ Performance optimization is investigative work. You must understand the problem
  ## Convenience Bias Checklist

  Before submitting a performance patch, verify it is NOT:
- - [ ] An input-specific hack that only helps one case
- - [ ] A micro-optimization with unmeasurable impact
- - [ ] A change that trades correctness risk for speed
- - [ ] A surface-level tweak when a deeper structural fix exists
+ - An input-specific hack that only helps one case
+ - A micro-optimization with unmeasurable impact
+ - A change that trades correctness risk for speed
+ - A surface-level tweak when a deeper structural fix exists
+
+ ## Related Skills
+
+ - **guardrails** — enforces correctness-first and verify-before-claiming during optimization
+ - **iterative-execution** — the outer verify-fix cycle for measuring and iterating on improvements
+ - **tdd-discipline** — ensures test suite is maintained through optimization changes
+ - **spec-first** — performance requirements originate in specs (stack.md, behavior-spec.md)
+ - **architecture-aware** — structural optimizations require architectural context

  ## References

@@ -1,10 +1,9 @@
  ---
  name: Spec-First Development
  description: >
- This skill guides interactive specification development, turning rough ideas into formal documents before
- any code is written. It should be used when the user is starting a new project or feature, wants to create
- specs or plans, is adopting an existing project, or says things like "I want to build something," "let's
- plan this out," "write a spec for this," or "let's design this first."
+ Use when starting a new project or feature, creating specs or plans, adopting an existing project,
+ or when the user says "I want to build something," "let's plan this out," "write a spec," or
+ "let's design this first." Use before any non-trivial implementation that lacks a spec.
  ---

  # Spec-First Development
@@ -60,6 +59,13 @@ For existing projects, use the adoption flow instead of starting from scratch. S
  - **Documents are living**: Specs evolve. That's fine. But they must exist before code.
  - **Lean templates**: The templates are starting points, not forms to fill out

+ ## Related Skills
+
+ - **architecture-aware** — for deeper architectural guidance during Stage 4
+ - **tdd-discipline** — for test planning from behavior specs (use `references/templates/test-plan.md`)
+ - **iterative-execution** — delivers features against the specs produced here
+ - **guardrails** — enforces spec-first as a pre-implementation check
+
  ## References

  See: `references/interactive-spec-process.md` — Detailed questioning flow
@@ -1,15 +1,19 @@
  ---
  name: TDD Discipline
  description: >
- This skill enforces test-driven development discipline with the Red/Green/Refactor cycle and traceability
- from behavior spec to test to code. It should be used when the user asks to write tests, add test coverage,
- discuss testing strategy, or says "how should I test this?", "add tests for this," or "write tests first."
+ Use when writing tests, adding test coverage, fixing bugs, debugging, or when any new code needs
+ to be written. Use when the user says "write tests," "add tests," "fix this bug," "debug this,"
+ or "how should I test this?"
  ---

  # TDD Discipline

  Tests are not an afterthought — they are the first expression of intent. Write the test that describes the behavior, watch it fail, then write the minimum code to make it pass.

+ ## Spirit vs. Letter
+
+ The spirit of TDD is: **know what correct behavior looks like before writing the code.** The Red/Green/Refactor cycle is the mechanism, but the principle is that you define "done" before you start. If a situation genuinely doesn't benefit from a test-first approach (see "When TDD Is Overhead" below), skip the mechanism — but never skip the principle of defining expected behavior first.
+
  ## Red → Green → Refactor

  1. **Red**: Write a failing test that describes the desired behavior
@@ -18,9 +22,24 @@ Tests are not an afterthought — they are the first expression of intent. Write

  This cycle applies at every level: unit, integration, e2e.

+ ## Rationalization Red Flags
+
+ These thoughts mean STOP — you're about to skip TDD:
+
+ | Thought | Reality |
+ |---------|---------|
+ | "I'll write tests after the code works" | That's test-after, not TDD. Write the test first. |
+ | "This is too simple to need a test" | Simple code with no test becomes complex code with no test. |
+ | "I know this works, I'll just verify manually" | Manual verification doesn't persist. Tests do. |
+ | "The test is obvious, I'll skip to code" | If it's obvious, it takes 30 seconds to write. Do it. |
+ | "I need to see the code structure first" | Write the test to discover the structure. That's the point. |
+ | "This is just a refactor, tests already pass" | Run the tests. Confirm they pass. Then refactor. |
+ | "Writing a test for this would be too complex" | If you can't test it, you can't verify it. Simplify the design. |
+ | "I'll add tests in the next iteration" | Next iteration never comes. Write them now. |
+
  ## Relationship to Iterative Execution

- TDD is the **inner discipline** — how you write each piece of code. Iterative execution is the **outer cycle** — how you deliver a complete feature against a spec. Every "implement" step in the iterative execution cycle uses TDD internally. They are complementary: TDD ensures code correctness at the unit level; iterative execution ensures spec satisfaction at the feature level.
+ TDD is the **inner discipline** — how you write each piece of code. Iterative execution is the **outer cycle** — how you deliver a complete feature against a spec. They are complementary: TDD ensures correctness at the unit level; iterative execution ensures spec satisfaction at the feature level. See the **iterative-execution** skill for the full outer cycle.

  ## When TDD Adds Value

@@ -49,6 +68,13 @@ Code: FormHandler.submit()

  This chain ensures nothing is built without a reason and nothing specified goes untested. If a test has no spec criterion, either add the criterion to the spec or question whether the test is needed. If a spec criterion has no test, that is a finding — even if the code works.

+ ## Related Skills
+
+ - **iterative-execution** — the outer delivery cycle that uses TDD internally
+ - **spec-first** — produces behavior specs that drive test design (see `spec-first/references/templates/test-plan.md`)
+ - **guardrails** — enforces TDD during implementation
+ - **performance-optimization** — uses TDD to preserve correctness during optimization
+
  ## References

  See: `references/test-strategies.md`
@@ -0,0 +1,72 @@
+ ---
+ name: Using SDD
+ description: >
+   Use at the start of every session and before every response to determine which SDD skills apply.
+   This is the meta-skill — it teaches skill discovery and invocation discipline.
+ ---
+
+ # Using SDD Skills
+
+ You have access to SDD (Spec-Driven Development) skills that enforce development discipline. **Check for applicable skills before every response.**
+
+ ## The Rule
+
+ **Invoke relevant skills BEFORE any response or action.** Even a 1% chance a skill might apply means you should check. If it turns out to be wrong for the situation, you don't need to use it.
+
+ ## Available Skills
+
+ | Skill | When to Use |
+ |-------|-------------|
+ | **guardrails** | ANY coding task — implement, build, fix, refactor, add, change, modify |
+ | **spec-first** | New project/feature, creating specs/plans, adopting a project |
+ | **tdd-discipline** | Writing tests, adding coverage, fixing bugs, debugging |
+ | **iterative-execution** | Implementing a feature from spec, iterating to match requirements |
+ | **architecture-aware** | Structuring code, design patterns, component integration, ADRs |
+ | **performance-optimization** | Optimizing, profiling, speeding up, reducing resource usage |
+
+ ## Skill Priority Order
+
+ When multiple skills apply, use this order:
+
+ 1. **Guardrails first** — always active for any coding task. This is the discipline layer.
+ 2. **Process skills second** (spec-first, tdd-discipline, iterative-execution) — these determine HOW to approach the task.
+ 3. **Domain skills third** (architecture-aware, performance-optimization) — these provide specialized guidance.
+
+ ## Rationalization Red Flags
+
+ These thoughts mean STOP — you're rationalizing skipping a skill:
+
+ | Thought | Reality |
+ |---------|---------|
+ | "This is just a simple question" | Questions about code are tasks. Check guardrails. |
+ | "I need more context first" | Skill check comes BEFORE exploration. |
+ | "Let me explore the codebase first" | Skills tell you HOW to explore. Check first. |
+ | "I can just do this quickly" | Quick work is where discipline matters most. |
+ | "This doesn't need a formal skill" | If a skill exists for this task type, use it. |
+ | "I remember the skill" | Skills evolve. Read the current version. |
+ | "This doesn't count as implementation" | If you're changing code, guardrails apply. |
+ | "The skill is overkill" | Simple things become complex. Use it. |
+ | "I'll just do this one thing first" | Check BEFORE doing anything. |
+ | "This feels productive" | Undisciplined action wastes time. Skills prevent this. |
+ | "The user said to skip guardrails" | Only `/sdd-yolo` disables guardrails. Verbal requests don't count. |
+ | "I already know what to do" | Knowing the task ≠ following the discipline. |
+
+ ## Skill Classification
+
+ **Rigid skills** (follow exactly, don't adapt away discipline):
+ - guardrails
+ - tdd-discipline
+
+ **Flexible skills** (adapt principles to context):
+ - spec-first
+ - architecture-aware
+ - iterative-execution
+ - performance-optimization
+
+ ## Spirit vs. Letter
+
+ Follow the **spirit** of each skill, not just its checklist. The goal is disciplined development that produces correct, simple, spec-compliant code. If following a checklist item mechanically would produce worse results than thoughtful application of the principle behind it, follow the principle. But this is never an excuse to skip steps — it's a reason to apply them thoughtfully.
+
+ ## References
+
+ See: `references/skill-creation-process.md`
@@ -0,0 +1,54 @@
+ # Skill Creation Process
+
+ Creating new SDD skills follows a RED/GREEN/REFACTOR approach — the same TDD discipline applied to the skills themselves.
+
+ ## RED: Identify the Failure
+
+ Before writing a skill, you need evidence of a failure pattern:
+
+ 1. **Observe the failure** — identify a specific, repeatable behavior problem (e.g., the agent skips verification, over-engineers, ignores specs)
+ 2. **Document the failure** — write down exactly what went wrong, with concrete examples
+ 3. **Pressure test** — verify this isn't a one-off. Does it happen across different tasks, projects, or prompts?
+
+ If you can't reproduce the failure consistently, you don't need a skill yet. You need more data.
+
+ ## GREEN: Write the Minimal Skill
+
+ Write the smallest skill that addresses the failure:
+
+ 1. **Frontmatter** — name + CSO-format description ("Use when..." with trigger conditions only)
+ 2. **One core principle** — the single behavioral change needed
+ 3. **Detection** — how the agent recognizes it's about to fail
+ 4. **Response** — what the agent should do instead
+ 5. **Rationalization table** — 4-8 entries mapping excuses to counters
+
+ The skill should be under 500 words at this stage. If it's longer, you're solving too many problems at once.
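A minimal frontmatter sketch in the "Use when..." shape described above — the skill name and wording here are illustrative, not a skill that ships with the package:

```yaml
---
name: Verification Before Claiming
description: >
  Use when about to claim a task is done, fixed, or passing.
  Use when reporting test results or completion status.
---
```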
+
+ ## REFACTOR: Plug Loopholes
+
+ Deploy the minimal skill and observe:
+
+ 1. **Does the agent follow it?** If not, the trigger conditions in the description may be wrong — fix them.
+ 2. **Does the agent rationalize around it?** Add entries to the rationalization table for each observed excuse.
+ 3. **Does it create new problems?** If the skill causes over-correction (e.g., too rigid in cases where flexibility is needed), add a "When This Skill Is Overhead" section.
+ 4. **Is it too broad?** Split into focused skills. One skill should address one failure pattern cluster.
+
+ ## Checklist
+
+ Before shipping a new skill:
+
+ - [ ] Failure pattern documented with 3+ examples
+ - [ ] Description uses "Use when..." CSO format
+ - [ ] Rationalization table has 4+ entries
+ - [ ] Skill body under 3000 words
+ - [ ] References directory exists (even if empty initially)
+ - [ ] Added to `using-sdd` skill table
+ - [ ] Added to `scripts/verify-skills.sh` SKILLS array
+ - [ ] Rigid vs. flexible classification documented in `using-sdd`
+
+ ## Anti-Patterns
+
+ - **Speculative skills**: Writing a skill for a problem you haven't observed yet
+ - **Kitchen-sink skills**: Cramming multiple unrelated concerns into one skill
+ - **Checklist-only skills**: Lists of rules without detection/response guidance
+ - **Aspirational skills**: Describing ideal behavior without addressing the specific failure that motivated the skill