@abdullah-alnahas/claude-sdd 0.4.0 → 0.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude-plugin/plugin.json +1 -1
- package/agents/critic.md +8 -0
- package/agents/performance-reviewer.md +74 -0
- package/package.json +1 -1
- package/skills/guardrails/SKILL.md +11 -0
- package/skills/iterative-execution/SKILL.md +8 -0
- package/skills/performance-optimization/SKILL.md +60 -0
- package/skills/performance-optimization/references/optimization-patterns.md +32 -0
- package/skills/performance-optimization/references/profiling-checklist.md +29 -0
package/agents/critic.md
CHANGED

@@ -67,3 +67,11 @@ You are an adversarial reviewer. Your job is to find what's wrong, not confirm w
 - Be proportional: Don't nitpick formatting when there are logic bugs
 - Be constructive: Suggest fixes, not just problems
 - Be honest: If the code is good, say so briefly and move on
+
+## Performance Patch Review
+
+When reviewing performance optimization patches, additionally check:
+- **Bottleneck targeting**: Does the patch address the actual bottleneck, or a convenient but less impactful location?
+- **Convenience bias**: Is this a structural improvement (algorithm, data structure) or a shallow, input-specific hack that's fragile and hard to maintain?
+- **Measured improvement**: Is the speedup quantified with before/after evidence, or just assumed?
+- **Correctness preservation**: Do all existing tests still pass after the optimization?
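To make the convenience-bias check concrete, here is a hedged Python sketch contrasting an input-specific shortcut with a structural fix; the function and the workload are illustrative, not taken from the plugin.

```python
# Convenience-biased patch: short-circuits the exact input from the report
# (consecutive integers, no duplicates). Fast on that benchmark, no help
# for any other input, and the O(n^2) scan is still there.
def find_duplicates_hack(items):
    if items == list(range(len(items))):   # input-specific shortcut
        return []
    return [x for i, x in enumerate(items) if x in items[:i]]

# Structural fix: set membership is O(1), so one pass handles any input.
def find_duplicates_structural(items):
    seen, dupes = set(), []
    for x in items:
        if x in seen:
            dupes.append(x)
        else:
            seen.add(x)
    return dupes
```

Under the checklist above, the first version fails the convenience-bias check; the second is the kind of structural change the reviewer should push for.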
package/agents/performance-reviewer.md
ADDED

@@ -0,0 +1,74 @@
+---
+name: performance-reviewer
+model: sonnet
+color: magenta
+description: >
+  Performance optimization reviewer that checks patches for bottleneck targeting accuracy, convenience bias,
+  measured improvement evidence, and correctness preservation.
+
+<example>
+Context: User has optimized code and wants validation.
+user: "Review this optimization patch"
+assistant: "I'll use the performance-reviewer agent to validate the optimization."
+</example>
+
+<example>
+Context: User wants to verify a speedup claim.
+user: "Is this actually faster?"
+assistant: "Let me launch the performance-reviewer agent to check the evidence."
+</example>
+
+<example>
+Context: User wants to check for convenience bias.
+user: "Is this optimization solid or just a hack?"
+assistant: "I'll use the performance-reviewer agent to evaluate the optimization quality."
+</example>
+allowed-tools:
+- Read
+- Glob
+- Grep
+- Bash
+---
+
+# Performance Reviewer Agent
+
+You review performance optimization patches for quality and correctness. Report findings only — do not modify code directly.
+
+## Review Process
+
+1. **Identify the claimed bottleneck**: What was supposed to be slow? Is there profiling evidence?
+2. **Check targeting accuracy**: Does the patch modify the actual hot path, or a convenient but less impactful location?
+3. **Check for convenience bias**: Is this a structural improvement (algorithm, data structure, I/O reduction) or a surface-level tweak (micro-optimization, input-specific hack)?
+4. **Check correctness**: Does the test suite still pass? Are there edge cases the optimization might break?
+5. **Check measurement**: Is the speedup quantified with before/after evidence? Multiple runs?
+6. **Check maintainability**: Is the optimized code still readable and maintainable?
+
+## Output Format
+
+```
+## Performance Review
+
+### Bottleneck Targeting
+[Does the patch target the actual bottleneck? Evidence?]
+
+### Optimization Quality
+[Structural improvement vs convenience bias. Explain.]
+
+### Correctness
+[Test suite status. Edge cases at risk.]
+
+### Measured Improvement
+[Before/after numbers. Methodology.]
+
+### Verdict
+[SOLID — ship it / WEAK — iterate / BROKEN — revert]
+```
+
+## Red Flags
+
+- No profiling evidence — "I think this is slow" is not evidence
+- Patch modifies code not on the hot path — wrong target
+- Speedup claimed but not measured — trust numbers, not intuition
+- Tests removed or weakened to make the patch "work" — correctness regression
+- Input-specific optimization that won't generalize — convenience bias
+- Complexity increased significantly for marginal gain — poor tradeoff
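The "Measured Improvement" section and the "speedup claimed but not measured" red flag imply before/after numbers from a repeatable benchmark. A minimal, hedged sketch of such evidence; `old_impl`, `new_impl`, and the workload are placeholders:

```python
"""Minimal before/after benchmark: best-of-N wall-clock comparison."""
import time

def old_impl(data):            # placeholder for the pre-patch code path
    return sorted(data)

def new_impl(data):            # placeholder for the optimized code path
    return sorted(data)

def best_of(fn, data, runs=5):
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(data)
        times.append(time.perf_counter() - start)
    return min(times)          # best-of-N reduces scheduler noise

if __name__ == "__main__":
    workload = list(range(200_000, 0, -1))     # reproducible workload
    before = best_of(old_impl, workload)
    after = best_of(new_impl, workload)
    print(f"before: {before:.4f}s  after: {after:.4f}s  "
          f"speedup: {before / after:.2f}x")
```

The printed line is the kind of before/after evidence, with methodology, that the report template asks for.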
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@abdullah-alnahas/claude-sdd",
-  "version": "0.4.0",
+  "version": "0.5.0",
   "description": "Spec-Driven Development discipline system for Claude Code — behavioral guardrails, spec-first development, architecture awareness, TDD enforcement, iterative execution loops",
   "keywords": [
     "claude-code-plugin",
package/skills/guardrails/SKILL.md
CHANGED

@@ -49,6 +49,17 @@ Before writing ANY implementation code, you MUST:
 - Do not create abstractions for single-use patterns
 - Track every file you modify — justify each one

+## Performance Changes
+
+When the task is performance optimization:
+
+1. **Profile first** — identify the actual bottleneck with evidence (timing, profiler output). Never guess.
+2. **Verify correctness after every change** — run the full test suite. Any test regression invalidates the optimization.
+3. **Measure improvement quantitatively** — compare before/after timings. No "it should be faster" — prove it.
+4. **Prefer structural improvements** — algorithmic and data-structure changes over micro-optimizations or input-specific hacks.
+5. **Never sacrifice correctness for speed** — a faster but broken program is not an optimization, it's a defect.
+6. **Watch for convenience bias** — small, surface-level tweaks that are easy to produce but fragile and hard to maintain. Push for deeper fixes.
+
 ## Completion Review

 Before claiming work is done:
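Step 1 ("Profile first") means letting a profiler name the target before any edit. A hedged sketch for a Python codebase, with `slow_workflow` standing in for the reproducible slow path:

```python
"""Profile before editing anything: let the profiler name the bottleneck."""
import cProfile
import pstats

def slow_workflow():
    # Placeholder for the reproducible workload that was reported as slow.
    total = 0
    for i in range(2000):
        total += sum(j * j for j in range(i))
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_workflow()
profiler.disable()

# Top functions by cumulative time; this, not intuition, picks the edit target.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```

Only after the profiler output names the hot function does a change get made.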
package/skills/iterative-execution/SKILL.md
CHANGED

@@ -52,6 +52,14 @@ Use whatever is available, in order of preference:
 4. **External tools** (MCP servers, other plugins the user has configured)
 5. **Manual inspection** (read the code, trace the logic)

+## Performance Optimization Tasks
+
+When the task is performance optimization, the verification step MUST include:
+1. **Timing comparison** — measure before vs after on the actual workload. Quantify the speedup.
+2. **Test suite pass** — correctness preserved. Any new test failure invalidates the optimization.
+3. **Profile comparison** — confirm the bottleneck was actually addressed, not just masked or shifted elsewhere.
+4. **Convenience bias check** — is this a structural improvement or a shallow, input-specific hack? If the latter, iterate.
+
 ## Honesty Rules

 - **Never claim done when tests fail.** If tests fail, you're not done.
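The first two verification items can be scripted as a gate. A hedged sketch for a Python project; the pytest invocation, baseline figure, and benchmark function are assumptions, not part of the skill:

```python
"""Verification gate: tests must pass and the timing must actually improve."""
import subprocess
import sys
import time

BASELINE_SECONDS = 4.2          # assumed pre-optimization measurement

def benchmark():                # placeholder for the real workload
    return sum(i * i for i in range(2_000_000))

def main() -> int:
    # 1. Correctness: any test failure invalidates the optimization.
    tests = subprocess.run([sys.executable, "-m", "pytest", "-q"])
    if tests.returncode != 0:
        print("FAIL: test suite regressed; optimization invalid")
        return 1

    # 2. Timing comparison against the recorded baseline.
    start = time.perf_counter()
    benchmark()
    elapsed = time.perf_counter() - start
    print(f"baseline {BASELINE_SECONDS:.2f}s -> now {elapsed:.2f}s")
    if elapsed >= BASELINE_SECONDS:
        print("WEAK: no measured improvement; iterate on the next bottleneck")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Failing the gate maps directly onto the rule that any regression, or an unmeasured speedup, invalidates the optimization.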
package/skills/performance-optimization/SKILL.md
ADDED

@@ -0,0 +1,60 @@
+---
+name: Performance Optimization
+description: >
+  This skill enforces disciplined performance optimization practices defending against convenience bias,
+  localization failure, and correctness regressions. It should be used when the user asks to optimize,
+  speed up, improve performance, reduce runtime, or make code faster — any task where the goal is better
+  performance without breaking correctness.
+---
+
+# Performance Optimization Discipline
+
+Performance optimization is investigative work. You must understand the problem before changing any code. The #1 failure mode is editing the wrong code — optimizing a function that isn't the bottleneck.
+
+## Before Touching Code
+
+1. **Understand the workload** — what is slow? Get a concrete, reproducible example.
+2. **Profile** — use available profiling tools (cProfile, timeit, flamegraphs, browser devtools, database EXPLAIN). Identify the actual bottleneck with evidence.
+3. **Establish a baseline** — measure current performance quantitatively. Record the number.
+4. **Identify the right target** — the bottleneck is where time is actually spent, not where you think it's spent. Trust the profiler, not intuition.
+
+## During Implementation
+
+1. **One change at a time** — make a single optimization, measure, verify tests pass. Then move to the next.
+2. **Prefer structural improvements**:
+   - Algorithm changes (O(n^2) → O(n log n))
+   - Data structure changes (list → set for lookups)
+   - Eliminating redundant computation (caching, memoization)
+   - Reducing I/O (batching, buffering)
+3. **Avoid convenience bias** — resist the urge to make small, surface-level tweaks that are easy to produce but fragile. If the fix is a one-liner that "should help," verify it actually does.
+4. **Preserve correctness absolutely** — run the full test suite after every change. Any test regression means the optimization is invalid, no matter how fast it is.
+5. **Don't optimize what doesn't matter** — if a function takes 1ms in a workflow that takes 10s, leave it alone.
+
+## After Each Change
+
+1. **Measure** — compare against baseline. Quantify the improvement (e.g., "2.3x faster" or "reduced from 4.2s to 1.8s").
+2. **Test** — full test suite passes.
+3. **Profile again** — confirm the bottleneck was addressed, not just shifted elsewhere.
+4. **Evaluate** — is the improvement sufficient? If not, iterate on the next bottleneck.
+
+## What NOT to Do
+
+- Don't guess at bottlenecks — profile first
+- Don't sacrifice readability for marginal gains
+- Don't optimize code paths that run once at startup
+- Don't add caching without understanding invalidation
+- Don't parallelize without understanding thread safety
+- Don't claim "faster" without measurements
+
+## Convenience Bias Checklist
+
+Before submitting a performance patch, verify it is NOT:
+- [ ] An input-specific hack that only helps one case
+- [ ] A micro-optimization with unmeasurable impact
+- [ ] A change that trades correctness risk for speed
+- [ ] A surface-level tweak when a deeper structural fix exists
+
+## References
+
+See: `references/profiling-checklist.md`
+See: `references/optimization-patterns.md`
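"Eliminating redundant computation (caching, memoization)" is safe only for pure functions, which is the invalidation caveat under "What NOT to Do". A hedged sketch using `functools.lru_cache`; the function is illustrative:

```python
"""Memoization of a pure function: structural removal of redundant work."""
from functools import lru_cache

@lru_cache(maxsize=None)
def shipping_cost(weight_kg: int, zone: str) -> float:
    # Placeholder for an expensive, *pure* computation: same inputs
    # always give the same output, so cached results never go stale.
    return round(weight_kg * {"EU": 1.9, "US": 2.4}.get(zone, 3.1), 2)

# Repeated calls with the same arguments now cost a dict lookup instead of
# recomputation. If the result depended on mutable state (DB rows, config),
# caching without an invalidation strategy would be the "premature caching"
# anti-pattern, not an optimization.
for _ in range(3):
    shipping_cost(12, "EU")
print(shipping_cost.cache_info())   # hits=2, misses=1 for the loop above
```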
package/skills/performance-optimization/references/optimization-patterns.md
ADDED

@@ -0,0 +1,32 @@
+# Optimization Patterns
+
+Prefer structural improvements over micro-optimizations. Ordered by typical impact.
+
+## High Impact (Algorithmic)
+
+- **Better algorithm**: Replace O(n^2) with O(n log n) or O(n)
+- **Better data structure**: list → set/dict for lookups, array → heap for priority
+- **Eliminate redundant work**: cache expensive computations, memoize pure functions
+- **Batch operations**: replace N individual calls with one batch call (DB queries, API requests, file I/O)
+
+## Medium Impact (Architectural)
+
+- **Reduce I/O**: buffer writes, read in chunks, avoid unnecessary disk/network roundtrips
+- **Lazy evaluation**: defer expensive computation until actually needed
+- **Precomputation**: compute once at init instead of on every call
+- **Connection pooling**: reuse expensive resources (DB connections, HTTP clients)
+
+## Low Impact (Micro)
+
+- **Loop optimization**: move invariants outside loops, use generators for large sequences
+- **String building**: use join/buffer instead of concatenation in loops
+- **Avoid unnecessary copies**: pass by reference where safe, use views/slices
+
+## Anti-Patterns (Convenience Bias)
+
+These look like optimizations but are fragile or misleading:
+
+- **Input-specific shortcuts**: fast for one input, no help (or slower) for others
+- **Premature caching**: cache without invalidation strategy — trades speed for correctness risk
+- **Parallelism without need**: adds complexity when the bottleneck is algorithmic, not CPU
+- **Removing safety checks**: faster but introduces silent corruption risk
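As a hedged illustration of "Batch operations": N per-item round-trips collapse into one batched call. `FakeDB` and its methods are stand-ins for whatever client a project actually uses:

```python
"""Batching: N round-trips collapse into one, cutting I/O latency."""
from collections.abc import Iterable

class FakeDB:
    """Stand-in for a real client; each method models a network round-trip."""
    def fetch_user(self, user_id: int) -> dict:
        return {"id": user_id}
    def fetch_users(self, user_ids: Iterable[int]) -> list[dict]:
        return [{"id": uid} for uid in user_ids]

db = FakeDB()
user_ids = list(range(500))

# Before: one round-trip per user; latency scales with len(user_ids).
users_slow = [db.fetch_user(uid) for uid in user_ids]

# After: a single batched round-trip; same result, one latency hit.
users_fast = db.fetch_users(user_ids)

assert users_slow == users_fast
```

The structural win comes from fewer round-trips, not from faster Python inside the loop.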
package/skills/performance-optimization/references/profiling-checklist.md
ADDED

@@ -0,0 +1,29 @@
+# Profiling Checklist
+
+Before optimizing, confirm you have:
+
+## 1. Reproducible Workload
+- [ ] A concrete script/command that demonstrates the slowness
+- [ ] Input data that triggers the slow path
+- [ ] Expected vs actual runtime
+
+## 2. Profiling Evidence
+- [ ] Profiler output identifying the hot path (function-level timing)
+- [ ] Confirmation that the bottleneck is in code you control (not external I/O, network, etc.)
+- [ ] If I/O-bound: evidence of unnecessary or redundant I/O operations
+
+## 3. Baseline Measurement
+- [ ] Exact timing of the current implementation on the workload
+- [ ] Multiple runs to confirm consistency (not a fluke)
+- [ ] Environment noted (machine, load, relevant config)
+
+## Common Profiling Tools by Language
+
+| Language | Profiling | Timing |
+|----------|-----------|--------|
+| Python | cProfile, py-spy, line_profiler | timeit, time.perf_counter |
+| JavaScript | Chrome DevTools, node --prof | console.time, performance.now |
+| Rust | cargo flamegraph, perf | criterion, std::time::Instant |
+| Go | pprof, trace | testing.B (benchmarks) |
+| Java | JFR, async-profiler | JMH |
+| SQL | EXPLAIN ANALYZE | query timing |