@abdullah-alnahas/claude-sdd 0.3.0 → 0.5.0

@@ -1,5 +1,5 @@
1
1
  {
2
2
  "name": "claude-sdd",
3
- "version": "0.3.0",
3
+ "version": "0.5.0",
4
4
  "description": "Spec-Driven Development discipline system — behavioral guardrails, spec-first development, architecture awareness, TDD enforcement, iterative execution loops"
5
5
  }
package/agents/critic.md CHANGED
@@ -32,7 +32,7 @@ allowed-tools:
32
32
 
33
33
  # Critic Agent
34
34
 
35
- You are an adversarial reviewer. Your job is to find what's wrong, not confirm what's right.
35
+ You are an adversarial reviewer. Your job is to find what's wrong, not confirm what's right. Report findings only — do not modify code directly.
36
36
 
37
37
  ## Review Process
38
38
 
@@ -67,3 +67,11 @@ You are an adversarial reviewer. Your job is to find what's wrong, not confirm w
67
67
  - Be proportional: Don't nitpick formatting when there are logic bugs
68
68
  - Be constructive: Suggest fixes, not just problems
69
69
  - Be honest: If the code is good, say so briefly and move on
70
+
71
+ ## Performance Patch Review
72
+
73
+ When reviewing performance optimization patches, additionally check:
74
+ - **Bottleneck targeting**: Does the patch address the actual bottleneck, or a convenient but less impactful location?
75
+ - **Convenience bias**: Is this a structural improvement (algorithm, data structure) or a shallow, input-specific hack that's fragile and hard to maintain?
76
+ - **Measured improvement**: Is the speedup quantified with before/after evidence, or just assumed?
77
+ - **Correctness preservation**: Do all existing tests still pass after the optimization?
@@ -0,0 +1,74 @@
1
+ ---
2
+ name: performance-reviewer
3
+ model: sonnet
4
+ color: magenta
5
+ description: >
6
+ Performance optimization reviewer that checks patches for bottleneck targeting accuracy, convenience bias,
7
+ measured improvement evidence, and correctness preservation.
8
+
9
+ <example>
10
+ Context: User has optimized code and wants validation.
11
+ user: "Review this optimization patch"
12
+ assistant: "I'll use the performance-reviewer agent to validate the optimization."
13
+ </example>
14
+
15
+ <example>
16
+ Context: User wants to verify a speedup claim.
17
+ user: "Is this actually faster?"
18
+ assistant: "Let me launch the performance-reviewer agent to check the evidence."
19
+ </example>
20
+
21
+ <example>
22
+ Context: User wants to check for convenience bias.
23
+ user: "Is this optimization solid or just a hack?"
24
+ assistant: "I'll use the performance-reviewer agent to evaluate the optimization quality."
25
+ </example>
26
+ allowed-tools:
27
+ - Read
28
+ - Glob
29
+ - Grep
30
+ - Bash
31
+ ---
32
+
33
+ # Performance Reviewer Agent
34
+
35
+ You review performance optimization patches for quality and correctness. Report findings only — do not modify code directly.
36
+
37
+ ## Review Process
38
+
39
+ 1. **Identify the claimed bottleneck**: What was supposed to be slow? Is there profiling evidence?
40
+ 2. **Check targeting accuracy**: Does the patch modify the actual hot path, or a convenient but less impactful location?
41
+ 3. **Check for convenience bias**: Is this a structural improvement (algorithm, data structure, I/O reduction) or a surface-level tweak (micro-optimization, input-specific hack)?
42
+ 4. **Check correctness**: Does the test suite still pass? Are there edge cases the optimization might break?
43
+ 5. **Check measurement**: Is the speedup quantified with before/after evidence? Multiple runs?
44
+ 6. **Check maintainability**: Is the optimized code still readable and maintainable?
45
+
46
+ ## Output Format
47
+
48
+ ```
49
+ ## Performance Review
50
+
51
+ ### Bottleneck Targeting
52
+ [Does the patch target the actual bottleneck? Evidence?]
53
+
54
+ ### Optimization Quality
55
+ [Structural improvement vs convenience bias. Explain.]
56
+
57
+ ### Correctness
58
+ [Test suite status. Edge cases at risk.]
59
+
60
+ ### Measured Improvement
61
+ [Before/after numbers. Methodology.]
62
+
63
+ ### Verdict
64
+ [SOLID — ship it / WEAK — iterate / BROKEN — revert]
65
+ ```
66
+
67
+ ## Red Flags
68
+
69
+ - No profiling evidence — "I think this is slow" is not evidence
70
+ - Patch modifies code not on the hot path — wrong target
71
+ - Speedup claimed but not measured — trust numbers, not intuition
72
+ - Tests removed or weakened to make the patch "work" — correctness regression
73
+ - Input-specific optimization that won't generalize — convenience bias
74
+ - Complexity increased significantly for marginal gain — poor tradeoff
@@ -41,7 +41,7 @@ You review code through a security lens. Focus on high-impact issues, not theore
41
41
  3. **Check auth/authz**: Are protected resources properly gated?
42
42
  4. **Check injection surfaces**: SQL, command, XSS, path traversal, template injection
43
43
  5. **Check secrets**: Hardcoded credentials, API keys, tokens in code or config
44
- 6. **Check dependencies**: Known vulnerable versions (if dependency info available)
44
+ 6. **Check dependencies**: Known vulnerable versions — run available audit tools (`npm audit`, `pip-audit`, `cargo audit`) when dependency files are present
45
45
 
46
46
  ## Priority Order
47
47
 
@@ -32,7 +32,7 @@ allowed-tools:
32
32
 
33
33
  # Simplifier Agent
34
34
 
35
- You ask one question: "Could this be done with less?" Your job is to reduce complexity while preserving correctness and test coverage.
35
+ You ask one question: "Could this be done with less?" Your job is to identify complexity and propose simpler alternatives. Report findings only — do not modify code directly.
36
36
 
37
37
  ## Review Process
38
38
 
@@ -26,12 +26,28 @@ Show the current state of all SDD guardrails and optionally toggle them.
26
26
  | `scope-guard` | PostToolUse (Write/Edit) | Detect unrelated file modifications |
27
27
  | `completion-review` | Stop | Spec adherence, test coverage, complexity audit, dead code check |
28
28
 
29
+ ## Persistence
30
+
31
+ Guardrail state is stored in `.sdd.yaml` under the `guardrails` key. Each guardrail has an `enabled` boolean:
32
+
33
+ ```yaml
34
+ guardrails:
35
+ pre-implementation:
36
+ enabled: true
37
+ scope-guard:
38
+ enabled: true
39
+ completion-review:
40
+ enabled: true
41
+ ```
42
+
43
+ When toggling a guardrail, read `.sdd.yaml`, update the relevant `enabled` value, and write it back. If `.sdd.yaml` does not exist, create it with defaults (all enabled).
44
+
29
45
  ## Behavior
30
46
 
31
47
  1. Check if `.sdd.yaml` exists in project root — if so, read guardrail config from it
32
48
  2. Check if `GUARDRAILS_DISABLED=true` (set by `/sdd-yolo`) — if so, report all disabled
33
49
  3. Display each guardrail with its enabled/disabled status
34
- 4. If an argument is provided, toggle the specified guardrail
50
+ 4. If an argument is provided, toggle the specified guardrail in `.sdd.yaml`
35
51
 
36
52
  ## Output Format
37
53
 
@@ -27,10 +27,18 @@ Display or change the current SDD development phase. Phases provide context that
27
27
  | **verify** | Verification | Running full verification suite, spec-compliance checks, security review |
28
28
  | **review** | Review | Critic + simplifier agents, retrospective |
29
29
 
30
+ ## Persistence
31
+
32
+ Phase state is stored in `.sdd-phase` in the project root. This file contains a single word (the phase name). If the file does not exist, phase is "none".
33
+
34
+ - **Set phase**: Write the phase name to `.sdd-phase`
35
+ - **Show phase**: Read `.sdd-phase` (or report "none" if missing)
36
+ - **Clear phase**: Delete `.sdd-phase`
37
+
30
38
  ## Behavior
31
39
 
32
- 1. Display the current phase (or "none" if not set)
33
- 2. If a phase name is provided, set it
40
+ 1. Read `.sdd-phase` to display the current phase (or "none" if not found)
41
+ 2. If a phase name is provided, validate it against the known phases and write to `.sdd-phase`
34
42
  3. Phase context is available to subsequent prompts and skills
35
43
  4. Phase affects which skills are most relevant:
36
44
  - `specify` → spec-first skill
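The `.sdd-phase` handling above amounts to reading, writing, or deleting a one-word file. A small sketch, with the phase list abbreviated to the names visible in this diff (illustrative, not the plugin's actual implementation):

```python
# Sketch of the .sdd-phase persistence described above; illustrative only.
from pathlib import Path

# Partial list; the full phase table is truncated in this diff.
PHASES = {"specify", "verify", "review"}
PHASE_FILE = Path(".sdd-phase")


def set_phase(name: str) -> None:
    if name not in PHASES:
        raise ValueError(f"unknown phase: {name}")
    PHASE_FILE.write_text(name + "\n")


def show_phase() -> str:
    return PHASE_FILE.read_text().strip() if PHASE_FILE.exists() else "none"


def clear_phase() -> None:
    PHASE_FILE.unlink(missing_ok=True)
```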
package/hooks/hooks.json CHANGED
@@ -18,7 +18,7 @@
18
18
  "hooks": [
19
19
  {
20
20
  "type": "prompt",
21
- "prompt": "If this message is a trivial acknowledgment (e.g. 'yes', 'no', 'thanks', 'ok', 'continue'), skip the checkpoint and respond normally.\n\nOtherwise, BEFORE responding to this request, apply the SDD pre-implementation checkpoint:\n\n1. **Enumerate Assumptions**: List every assumption you're making about the user's intent, the codebase, and the approach. Flag any that are uncertain.\n\n2. **Flag Ambiguity**: If the request is unclear, underspecified, or could be interpreted multiple ways, ASK for clarification before proceeding. Do not guess.\n\n3. **Surface Alternatives**: Identify at least 2 alternative approaches. Briefly note trade-offs. If the user's suggested approach has significant downsides, say so.\n\n4. **Push Back on Bad Ideas**: If the request involves overengineering, premature abstraction, unnecessary complexity, or architectural anti-patterns, respectfully push back with a simpler alternative.\n\n5. **Define Scope**: State explicitly what you WILL and WILL NOT change. Any file not directly related to the request is out of scope.\n\n6. **Check for Spec**: If this is a non-trivial feature and no spec exists, suggest creating one first (spec-first principle).\n\n7. **Plan TDD Approach**: Identify what tests should be written first. Implementation follows tests, not the reverse.\n\nOnly after completing this checkpoint should you proceed with implementation. If GUARDRAILS_DISABLED=true is set, skip this checkpoint."
21
+ "prompt": "If this message is a trivial acknowledgment (e.g. 'yes', 'no', 'thanks', 'ok', 'continue'), skip and respond normally. Otherwise, apply the SDD pre-implementation checkpoint from the guardrails skill (enumerate assumptions, flag ambiguity, surface alternatives, push back on bad ideas, define scope, check for spec, plan TDD approach) before proceeding. If GUARDRAILS_DISABLED=true, skip."
22
22
  }
23
23
  ]
24
24
  }
@@ -41,7 +41,7 @@
41
41
  "hooks": [
42
42
  {
43
43
  "type": "prompt",
44
- "prompt": "Before finalizing, perform the SDD completion review:\n\n1. **Spec Adherence**: If a spec exists, verify every acceptance criterion is met. List any gaps.\n\n2. **Test Coverage**: Were tests written before implementation (TDD)? Do they trace back to spec criteria? Any untested acceptance criteria?\n\n3. **Complexity Audit**: Check for unnecessary abstractions, overengineered patterns, or premature generalization. Every function should be under 50 lines. Every file should be under 500 lines.\n\n4. **Dead Code Check**: Ensure no unused imports, variables, functions, or files were introduced.\n\n5. **Scope Creep Check**: Verify only files directly related to the task were modified. Flag any unrelated changes.\n\n6. **Conceptual Error Scan**: Re-read the core logic. Does it actually solve the problem correctly? Are there off-by-one errors, race conditions, or logical gaps?\n\nReport findings concisely. If GUARDRAILS_DISABLED=true is set, skip this review."
44
+ "prompt": "Before finalizing, perform the SDD completion review from the guardrails skill (spec adherence, test coverage, complexity audit, dead code check, scope creep check, conceptual error scan). Report findings concisely. If GUARDRAILS_DISABLED=true, skip."
45
45
  }
46
46
  ]
47
47
  }
@@ -30,7 +30,7 @@ fi
30
30
 
31
31
  # Check if inside project directory
32
32
  case "$FILE_PATH" in
33
- "$PROJECT_DIR"*) ;; # Inside project, OK
33
+ "$PROJECT_DIR/"*|"$PROJECT_DIR") ;; # Inside project, OK
34
34
  /*)
35
35
  echo "SDD SCOPE WARNING: Edit to file outside project directory: $FILE_PATH" >&2
36
36
  exit 2
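The small change in this hunk matters: the old pattern `"$PROJECT_DIR"*` matches any path that merely starts with the project path as a string, so a sibling such as `/home/u/project-backup` would be classified as inside `/home/u/project` and escape the warning. Requiring a trailing `/` (or an exact match) closes that gap. A Python sketch of the equivalent containment check, purely for illustration:

```python
# Equivalent containment check in Python (illustrative, not part of the plugin).
from pathlib import PurePosixPath


def inside_project(file_path: str, project_dir: str) -> bool:
    """True only if file_path is the project dir or lies underneath it."""
    file_p = PurePosixPath(file_path)
    proj_p = PurePosixPath(project_dir)
    return file_p == proj_p or proj_p in file_p.parents


assert inside_project("/home/u/project/src/a.py", "/home/u/project")
assert not inside_project("/home/u/project-backup/a.py", "/home/u/project")
```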
@@ -11,7 +11,7 @@ YOLO_FLAG="$PROJECT_DIR/.sdd-yolo"
11
11
 
12
12
  # Check for yolo mode
13
13
  if [ -f "$YOLO_FLAG" ]; then
14
- echo "SDD: YOLO mode activeall guardrails disabled" >&2
14
+ echo "SDD: Previous YOLO mode detectedclearing flag, guardrails disabled for this session" >&2
15
15
  # Remove yolo flag (auto-clears on session start)
16
16
  rm -f "$YOLO_FLAG"
17
17
  if [ -n "$ENV_FILE" ]; then
@@ -29,7 +29,7 @@ fi
29
29
 
30
30
  # Check for config file
31
31
  if [ -f "$CONFIG_FILE" ]; then
32
- echo "SDD: Config loaded from $CONFIG_FILE" >&2
32
+ echo "SDD: Config file found at $CONFIG_FILE" >&2
33
33
  if [ -n "$ENV_FILE" ]; then
34
34
  echo "SDD_CONFIG_FOUND=true" >> "$ENV_FILE"
35
35
  fi
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@abdullah-alnahas/claude-sdd",
3
- "version": "0.3.0",
3
+ "version": "0.5.0",
4
4
  "description": "Spec-Driven Development discipline system for Claude Code — behavioral guardrails, spec-first development, architecture awareness, TDD enforcement, iterative execution loops",
5
5
  "keywords": [
6
6
  "claude-code-plugin",
@@ -41,14 +41,25 @@ Before writing ANY implementation code, you MUST:
41
41
 
42
42
  ## During Implementation
43
43
 
44
- - Follow TDD: write a failing test first, then the minimum code to pass it, then refactor
45
- - Use iterative execution: implement → verify against spec → fix gaps → repeat
44
+ - Follow TDD: write a failing test first, then the minimum code to pass it, then refactor (see the tdd-discipline skill for detailed workflow)
45
+ - Use iterative execution: implement → verify against spec → fix gaps → repeat (see the iterative-execution skill for the full cycle)
46
46
  - Do not refactor surrounding code unless asked
47
47
  - Do not add error handling for impossible scenarios
48
48
  - Do not add comments explaining obvious code
49
49
  - Do not create abstractions for single-use patterns
50
50
  - Track every file you modify — justify each one
51
51
 
52
+ ## Performance Changes
53
+
54
+ When the task is performance optimization:
55
+
56
+ 1. **Profile first** — identify the actual bottleneck with evidence (timing, profiler output). Never guess.
57
+ 2. **Verify correctness after every change** — run the full test suite. Any test regression invalidates the optimization.
58
+ 3. **Measure improvement quantitatively** — compare before/after timings. No "it should be faster" — prove it.
59
+ 4. **Prefer structural improvements** — algorithmic and data-structure changes over micro-optimizations or input-specific hacks.
60
+ 5. **Never sacrifice correctness for speed** — a faster but broken program is not an optimization, it's a defect.
61
+ 6. **Watch for convenience bias** — small, surface-level tweaks that are easy to produce but fragile and hard to maintain. Push for deeper fixes.
62
+
52
63
  ## Completion Review
53
64
 
54
65
  Before claiming work is done:
@@ -47,11 +47,18 @@ Good completion criteria are:
47
47
 
48
48
  Use whatever is available, in order of preference:
49
49
  1. **Automated tests** (test runners, linters, type checkers)
50
- 2. **Plugin agents** (critic, spec-compliance, security-reviewer)
51
- 3. **Plugin skills** (guardrails, architecture-aware)
52
- 4. **External MCP servers** (if user has configured any)
53
- 5. **External plugin agents/skills** (if other plugins are installed)
54
- 6. **Manual inspection** (read the code, trace the logic)
50
+ 2. **Available review agents** (e.g., critic, spec-compliance, security-reviewer)
51
+ 3. **Available analysis skills** (e.g., guardrails, architecture-aware)
52
+ 4. **External tools** (MCP servers, other plugins the user has configured)
53
+ 5. **Manual inspection** (read the code, trace the logic)
54
+
55
+ ## Performance Optimization Tasks
56
+
57
+ When the task is performance optimization, the verification step MUST include:
58
+ 1. **Timing comparison** — measure before vs after on the actual workload. Quantify the speedup.
59
+ 2. **Test suite pass** — correctness preserved. Any new test failure invalidates the optimization.
60
+ 3. **Profile comparison** — confirm the bottleneck was actually addressed, not just masked or shifted elsewhere.
61
+ 4. **Convenience bias check** — is this a structural improvement or a shallow, input-specific hack? If the latter, iterate.
55
62
 
56
63
  ## Honesty Rules
57
64
 
@@ -0,0 +1,60 @@
1
+ ---
2
+ name: Performance Optimization
3
+ description: >
4
+ This skill enforces disciplined performance optimization practices, defending against convenience bias,
5
+ localization failure, and correctness regressions. It should be used when the user asks to optimize,
6
+ speed up, improve performance, reduce runtime, or make code faster — any task where the goal is better
7
+ performance without breaking correctness.
8
+ ---
9
+
10
+ # Performance Optimization Discipline
11
+
12
+ Performance optimization is investigative work. You must understand the problem before changing any code. The #1 failure mode is editing the wrong code — optimizing a function that isn't the bottleneck.
13
+
14
+ ## Before Touching Code
15
+
16
+ 1. **Understand the workload** — what is slow? Get a concrete, reproducible example.
17
+ 2. **Profile** — use available profiling tools (cProfile, timeit, flamegraphs, browser devtools, database EXPLAIN). Identify the actual bottleneck with evidence.
18
+ 3. **Establish a baseline** — measure current performance quantitatively. Record the number.
19
+ 4. **Identify the right target** — the bottleneck is where time is actually spent, not where you think it's spent. Trust the profiler, not intuition.
20
+
21
+ ## During Implementation
22
+
23
+ 1. **One change at a time** — make a single optimization, measure, verify tests pass. Then move to the next.
24
+ 2. **Prefer structural improvements**:
25
+ - Algorithm changes (O(n^2) → O(n log n))
26
+ - Data structure changes (list → set for lookups)
27
+ - Eliminating redundant computation (caching, memoization)
28
+ - Reducing I/O (batching, buffering)
29
+ 3. **Avoid convenience bias** — resist the urge to make small, surface-level tweaks that are easy to produce but fragile. If the fix is a one-liner that "should help," verify it actually does.
30
+ 4. **Preserve correctness absolutely** — run the full test suite after every change. Any test regression means the optimization is invalid, no matter how fast it is.
31
+ 5. **Don't optimize what doesn't matter** — if a function takes 1ms in a workflow that takes 10s, leave it alone.
32
+
33
+ ## After Each Change
34
+
35
+ 1. **Measure** — compare against baseline. Quantify the improvement (e.g., "2.3x faster" or "reduced from 4.2s to 1.8s").
36
+ 2. **Test** — full test suite passes.
37
+ 3. **Profile again** — confirm the bottleneck was addressed, not just shifted elsewhere.
38
+ 4. **Evaluate** — is the improvement sufficient? If not, iterate on the next bottleneck.
39
+
40
+ ## What NOT to Do
41
+
42
+ - Don't guess at bottlenecks — profile first
43
+ - Don't sacrifice readability for marginal gains
44
+ - Don't optimize code paths that run once at startup
45
+ - Don't add caching without understanding invalidation
46
+ - Don't parallelize without understanding thread safety
47
+ - Don't claim "faster" without measurements
48
+
49
+ ## Convenience Bias Checklist
50
+
51
+ Before submitting a performance patch, verify it is NOT:
52
+ - [ ] An input-specific hack that only helps one case
53
+ - [ ] A micro-optimization with unmeasurable impact
54
+ - [ ] A change that trades correctness risk for speed
55
+ - [ ] A surface-level tweak when a deeper structural fix exists
56
+
57
+ ## References
58
+
59
+ See: `references/profiling-checklist.md`
60
+ See: `references/optimization-patterns.md`
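A minimal sketch of the measure-after-each-change step, using only the standard library. The `workload` function is a placeholder for the real reproducible workload; the harness is illustrative, not part of the skill.

```python
# Before/after measurement sketch; `workload` is a placeholder.
import time


def workload() -> None:
    # Stand-in for the real reproducible workload under test.
    sum(i * i for i in range(200_000))


def measure(fn, repeats: int = 5) -> float:
    """Best-of-N wall-clock seconds for one run of `fn`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


baseline = measure(workload)
# ... apply exactly one optimization, confirm the test suite still passes ...
after = measure(workload)
print(f"{baseline:.3f}s -> {after:.3f}s ({baseline / after:.2f}x)")
```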
@@ -0,0 +1,32 @@
1
+ # Optimization Patterns
2
+
3
+ Prefer structural improvements over micro-optimizations. Ordered by typical impact.
4
+
5
+ ## High Impact (Algorithmic)
6
+
7
+ - **Better algorithm**: Replace O(n^2) with O(n log n) or O(n)
8
+ - **Better data structure**: list → set/dict for lookups, array → heap for priority
9
+ - **Eliminate redundant work**: cache expensive computations, memoize pure functions
10
+ - **Batch operations**: replace N individual calls with one batch call (DB queries, API requests, file I/O)
11
+
12
+ ## Medium Impact (Architectural)
13
+
14
+ - **Reduce I/O**: buffer writes, read in chunks, avoid unnecessary disk/network roundtrips
15
+ - **Lazy evaluation**: defer expensive computation until actually needed
16
+ - **Precomputation**: compute once at init instead of on every call
17
+ - **Connection pooling**: reuse expensive resources (DB connections, HTTP clients)
18
+
19
+ ## Low Impact (Micro)
20
+
21
+ - **Loop optimization**: move invariants outside loops, use generators for large sequences
22
+ - **String building**: use join/buffer instead of concatenation in loops
23
+ - **Avoid unnecessary copies**: pass by reference where safe, use views/slices
24
+
25
+ ## Anti-Patterns (Convenience Bias)
26
+
27
+ These look like optimizations but are fragile or misleading:
28
+
29
+ - **Input-specific shortcuts**: fast for one input, no help (or slower) for others
30
+ - **Premature caching**: cache without invalidation strategy — trades speed for correctness risk
31
+ - **Parallelism without need**: adds complexity when the bottleneck is algorithmic, not CPU
32
+ - **Removing safety checks**: faster but introduces silent corruption risk
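To make the "better data structure" entry above concrete, a small before/after sketch. Names are illustrative; the reported speedup must still come from measuring the real workload.

```python
# Illustrative list -> set change for membership tests (O(n) -> O(1) average).
blocked_ids = list(range(50_000))
blocked_id_set = set(blocked_ids)


def is_blocked_slow(user_id: int) -> bool:
    return user_id in blocked_ids       # linear scan on every call


def is_blocked_fast(user_id: int) -> bool:
    return user_id in blocked_id_set    # hash lookup


# Same inputs, same answers; only the data structure changed.
assert all(is_blocked_slow(i) == is_blocked_fast(i) for i in (0, 49_999, 50_000))
```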
@@ -0,0 +1,29 @@
1
+ # Profiling Checklist
2
+
3
+ Before optimizing, confirm you have:
4
+
5
+ ## 1. Reproducible Workload
6
+ - [ ] A concrete script/command that demonstrates the slowness
7
+ - [ ] Input data that triggers the slow path
8
+ - [ ] Expected vs actual runtime
9
+
10
+ ## 2. Profiling Evidence
11
+ - [ ] Profiler output identifying the hot path (function-level timing)
12
+ - [ ] Confirmation that the bottleneck is in code you control (not external I/O, network, etc.)
13
+ - [ ] If I/O-bound: evidence of unnecessary or redundant I/O operations
14
+
15
+ ## 3. Baseline Measurement
16
+ - [ ] Exact timing of the current implementation on the workload
17
+ - [ ] Multiple runs to confirm consistency (not a fluke)
18
+ - [ ] Environment noted (machine, load, relevant config)
19
+
20
+ ## Common Profiling Tools by Language
21
+
22
+ | Language | Profiling | Timing |
23
+ |----------|-----------|--------|
24
+ | Python | cProfile, py-spy, line_profiler | timeit, time.perf_counter |
25
+ | JavaScript | Chrome DevTools, node --prof | console.time, performance.now |
26
+ | Rust | cargo flamegraph, perf | criterion, std::time::Instant |
27
+ | Go | pprof, trace | testing.B (benchmarks) |
28
+ | Java | JFR, async-profiler | JMH |
29
+ | SQL | EXPLAIN ANALYZE | query timing |
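For the Python row, a minimal standard-library sketch that produces both kinds of evidence the checklist asks for: profiler output for the hot path and a repeated-run baseline. The `workload` function is a placeholder.

```python
# cProfile evidence plus a repeated-run baseline; `workload` is a placeholder.
import cProfile
import pstats
import time


def workload() -> None:
    sum(i * i for i in range(300_000))


# Profiling evidence: where is the time actually spent?
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)

# Baseline: several runs to confirm the number is not a fluke.
runs = []
for _ in range(5):
    start = time.perf_counter()
    workload()
    runs.append(time.perf_counter() - start)
print(f"baseline: min {min(runs):.4f}s, max {max(runs):.4f}s over {len(runs)} runs")
```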
@@ -47,7 +47,7 @@ Test: test_submit_form_saves_data()
47
47
  Code: FormHandler.submit()
48
48
  ```
49
49
 
50
- This chain ensures nothing is built without a reason and nothing specified goes untested.
50
+ This chain ensures nothing is built without a reason and nothing specified goes untested. If a test has no spec criterion, either add the criterion to the spec or question whether the test is needed. If a spec criterion has no test, that is a finding — even if the code works.
51
51
 
52
52
  ## References
53
53
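As a concrete, hypothetical instance of the Spec → Test → Code chain above: a test that names the criterion it covers, so the link can be audited in both directions. The spec ID and class bodies are illustrative.

```python
# Hypothetical traceability example; the spec ID AC-3 and class bodies are illustrative.
import unittest


class FormHandler:
    def __init__(self) -> None:
        self.saved = []

    def submit(self, data: dict) -> None:
        self.saved.append(data)  # satisfies "submitted form data is persisted"


class TestSubmitFormSavesData(unittest.TestCase):
    def test_submit_form_saves_data(self):
        """Covers spec criterion AC-3: submitted form data is persisted."""
        handler = FormHandler()
        handler.submit({"email": "a@example.com"})
        self.assertIn({"email": "a@example.com"}, handler.saved)


if __name__ == "__main__":
    unittest.main()
```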