@abdullah-alnahas/claude-sdd 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude-plugin/plugin.json +5 -0
- package/README.md +127 -0
- package/agents/critic.md +49 -0
- package/agents/security-reviewer.md +61 -0
- package/agents/simplifier.md +54 -0
- package/agents/spec-compliance.md +55 -0
- package/commands/sdd-adopt.md +74 -0
- package/commands/sdd-autopilot.md +131 -0
- package/commands/sdd-execute.md +72 -0
- package/commands/sdd-guardrails.md +41 -0
- package/commands/sdd-phase.md +46 -0
- package/commands/sdd-review.md +52 -0
- package/commands/sdd-yolo.md +40 -0
- package/hooks/hooks.json +50 -0
- package/hooks/scripts/post-edit-review.sh +45 -0
- package/hooks/scripts/session-init.sh +44 -0
- package/package.json +20 -0
- package/scripts/verify-commands.sh +59 -0
- package/scripts/verify-hooks.sh +74 -0
- package/scripts/verify-skills.sh +72 -0
- package/skills/architecture-aware/SKILL.md +43 -0
- package/skills/architecture-aware/references/adr-guide.md +53 -0
- package/skills/architecture-aware/references/anti-patterns.md +41 -0
- package/skills/architecture-aware/references/integration-patterns.md +42 -0
- package/skills/guardrails/SKILL.md +68 -0
- package/skills/guardrails/references/failure-patterns.md +63 -0
- package/skills/guardrails/references/pushback-guide.md +53 -0
- package/skills/iterative-execution/SKILL.md +65 -0
- package/skills/iterative-execution/references/completion-criteria.md +62 -0
- package/skills/iterative-execution/references/loop-patterns.md +47 -0
- package/skills/spec-first/SKILL.md +68 -0
- package/skills/spec-first/references/foundation-docs-guide.md +55 -0
- package/skills/spec-first/references/interactive-spec-process.md +85 -0
- package/skills/spec-first/references/project-adoption.md +58 -0
- package/skills/spec-first/references/templates/app-description.md +22 -0
- package/skills/spec-first/references/templates/architecture.md +26 -0
- package/skills/spec-first/references/templates/behavior-spec.md +28 -0
- package/skills/spec-first/references/templates/retrospective.md +19 -0
- package/skills/spec-first/references/templates/roadmap.md +23 -0
- package/skills/spec-first/references/templates/stack.md +27 -0
- package/skills/spec-first/references/templates/test-plan.md +25 -0
- package/skills/tdd-discipline/SKILL.md +55 -0
- package/skills/tdd-discipline/references/test-strategies.md +36 -0
- package/skills/tdd-discipline/references/traceability.md +53 -0
package/README.md
ADDED
@@ -0,0 +1,127 @@
# SDD — Spec-Driven Development Plugin

A Claude Code plugin that enforces disciplined software development: behavioral guardrails, spec-first development, architecture awareness, TDD enforcement, and iterative execution loops.

## Installation

```bash
# Local development
claude --plugin-dir /path/to/sdd

# Or symlink into Claude plugins directory
ln -s /path/to/sdd ~/.claude/plugins/sdd
```

## What It Does

### Behavioral Guardrails

Defends against 12 known LLM failure modes (sycophantic agreement, premature abstraction, scope creep, etc.) through automatic hooks:

- **Pre-implementation checkpoint**: Enumerates assumptions, flags ambiguity, surfaces alternatives, plans TDD approach
- **Scope guard**: Detects unrelated file modifications during edits
- **Completion review**: Verifies spec adherence, test coverage, complexity, dead code

### Spec-First Development

Guides you from rough idea → formal specification through interactive questioning:

1. Intent Discovery → `app-description.md`
2. Behavioral Bounding → `behavior-spec.md`
3. Technical Context → `stack.md`
4. Architecture → `architecture.md`
5. Prioritization → `roadmap.md`

### Architecture Awareness

Integration patterns, anti-patterns, and ADR (Architecture Decision Record) guidance.

### TDD Discipline

Red → Green → Refactor enforcement. Test traceability from behavior spec to test to code.

### Iterative Execution

Disciplined delivery loops: implement with TDD → verify against spec → fix gaps → repeat. TDD is the inner discipline (how you write code), iterative execution is the outer cycle (how you deliver features).

## Commands

| Command | Purpose |
|---------|---------|
| `/sdd-guardrails` | Show/toggle guardrail status |
| `/sdd-yolo` | Disable all guardrails (auto-clears next session) |
| `/sdd-phase` | Show/set development phase |
| `/sdd-review` | On-demand review with critic + simplifier agents |
| `/sdd-adopt` | Adopt an existing project into SDD |
| `/sdd-execute` | Start iterative execution loop against a spec |
| `/sdd-autopilot` | Full autonomous lifecycle: specify → design → implement → verify → review |

## Agents

| Agent | Role |
|-------|------|
| **critic** | Adversarial reviewer — finds logical errors, spec drift, assumption issues |
| **simplifier** | Complexity reducer — proposes simpler alternatives |
| **spec-compliance** | Spec adherence checker — verifies traceability (spec → test → code) |
| **security-reviewer** | Security analysis — OWASP Top 10, input validation, auth review |

## Configuration

Create `.sdd.yaml` in your project root:

```yaml
verbosity: standard # minimal | standard | verbose
enabled: true

guardrails:
  pre-implementation:
    enabled: true
    require-assumptions: true
    require-alternatives: 2
    require-clarification-check: true
  scope-guard:
    enabled: true
    warn-unrelated-files: true
    warn-dead-code: true
  completion-review:
    enabled: true
    max-function-lines: 50
    max-file-lines: 500
  pushback:
    enabled: true
    flag-overengineering: true
    flag-sycophancy: true

discipline:
  require-spec-before-code: false
  require-tests-before-merge: true
  require-adr-for-architecture: true

logging:
  enabled: false
  path: .guardrails-log.jsonl

whitelist:
  - "*.md"
  - "*.yaml"
  - "*.yml"
```

## Self-Test

```bash
bash sdd/scripts/verify-hooks.sh
bash sdd/scripts/verify-skills.sh
bash sdd/scripts/verify-commands.sh
```

## Development Phases

The recommended flow:

```
specify → design → implement → verify → review
```

Each phase activates relevant skills and agents. Set the phase with `/sdd-phase <name>`.

## Troubleshooting

**Hooks not firing**: Ensure the plugin is loaded (`claude --plugin-dir ./sdd`). Check that `hooks.json` is valid JSON.

**Skills not triggering**: Skills activate based on keyword matching in your prompts. Use natural language that matches skill descriptions.

**YOLO mode stuck**: Delete `.sdd-yolo` from your project root, or run `/sdd-guardrails enable`.
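The troubleshooting advice above ("check that `hooks.json` is valid JSON") can be done from the shell. A minimal sketch, assuming `python3` is on PATH; the demo file stands in for the real `sdd/hooks/hooks.json`:

```bash
# check_hooks: report whether a hooks manifest parses as JSON.
# Returns 0 for valid JSON, non-zero otherwise (assumes python3 is installed).
check_hooks() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

# Demo against a throwaway file; the real target would be sdd/hooks/hooks.json.
tmp=$(mktemp)
echo '{"hooks": {}}' > "$tmp"
if check_hooks "$tmp"; then
  echo "valid JSON"
else
  echo "malformed JSON"
fi
rm -f "$tmp"
```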
package/agents/critic.md
ADDED
@@ -0,0 +1,49 @@
---
description: >
  Adversarial code reviewer that finds logical errors, invalid assumptions, spec drift, and requirement gaps.
  Use when you need an honest, direct assessment of code quality and correctness.
capabilities:
  - Logical error detection
  - Assumption validation
  - Spec drift detection
  - Requirement coverage checking
  - Complexity assessment
---

# Critic Agent

You are an adversarial reviewer. Your job is to find what's wrong, not confirm what's right.

## Review Process

1. **Read the spec** (if one exists) — understand what was supposed to be built
2. **Read the code** — understand what was actually built
3. **Compare** — identify every gap, drift, or deviation
4. **Check logic** — trace the core algorithm for correctness
5. **Check assumptions** — what is the code assuming that might not be true?
6. **Report** — structured findings with severity

## Output Format

```
## Critical Issues
[Must fix before shipping]

## Warnings
[Should fix, but not blocking]

## Notes
[Minor observations, style suggestions]

## Spec Coverage
[X of Y acceptance criteria verified in code]
[List any missing criteria]
```

## Review Standards

- Be specific: "Line 42 assumes `user` is never null, but `findUser()` can return null" — not "error handling could be better"
- Be evidence-based: Point to specific code, specific spec criteria
- Be proportional: Don't nitpick formatting when there are logic bugs
- Be constructive: Suggest fixes, not just problems
- Be honest: If the code is good, say so briefly and move on
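The "be specific" standard above can be illustrated with a small hedged shell analogue (the `find_user` helper and the file contents are hypothetical, not part of this plugin): an unchecked empty result is the shell's version of the unguarded null.

```bash
# Hypothetical lookup that can come back empty, the shell analogue of a null return.
users_file=$(mktemp)
printf 'alice:Alice\n' > "$users_file"

find_user() {
  grep "^$1:" "$users_file" | cut -d: -f2
}

# Vague finding:    "error handling could be better"
# Specific finding: "find_user can print nothing, but the caller never checks"
name=$(find_user bob)
if [ -z "$name" ]; then
  echo "no such user: bob"
fi
rm -f "$users_file"
```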
package/agents/security-reviewer.md
ADDED
@@ -0,0 +1,61 @@
---
description: >
  Security analysis agent that reviews code for OWASP Top 10 vulnerabilities, input validation gaps,
  auth/authz issues, and injection risks. Use when reviewing code for security concerns.
capabilities:
  - OWASP Top 10 vulnerability detection
  - Input validation review
  - Authentication and authorization review
  - Injection detection (SQL, command, XSS)
  - Dependency risk awareness
---

# Security Reviewer Agent

You review code through a security lens. Focus on high-impact issues, not theoretical risks.

## Review Process

1. **Identify trust boundaries**: Where does external input enter the system?
2. **Check input validation**: Is all external input validated/sanitized at the boundary?
3. **Check auth/authz**: Are protected resources properly gated?
4. **Check injection surfaces**: SQL, command, XSS, path traversal, template injection
5. **Check secrets**: Hardcoded credentials, API keys, tokens in code or config
6. **Check dependencies**: Known vulnerable versions (if dependency info available)

## Priority Order

Focus on what matters most:

1. **Injection** (SQL, command, XSS) — can lead to full compromise
2. **Auth bypass** — unauthorized access to data/actions
3. **Secrets exposure** — credentials in code, logs, or error messages
4. **Input validation** — missing validation at trust boundaries
5. **Insecure defaults** — debug mode, permissive CORS, weak crypto

## Output Format

```
## Security Review

### Critical
[Issues that could lead to compromise — must fix]

### High
[Issues with significant risk — should fix before production]

### Medium
[Issues worth addressing — fix in next iteration]

### Trust Boundaries Reviewed
[List of entry points checked]

### Not Reviewed
[Areas outside scope of this review]
```

## Principles

- Only flag real risks, not theoretical ones with no exploit path
- Every finding must include: what's wrong, why it matters, how to fix it
- Don't flag internal-only code for input validation (trust boundaries matter)
- Prioritize by impact, not by count
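Command injection, one of the injection surfaces listed above, has a classic shell shape worth recognizing. A hedged illustration (the `search` helpers are hypothetical, not plugin code): input that reaches `eval` is re-parsed by the shell, while input passed as a quoted argument is not.

```bash
notes=$(mktemp)
printf 'alpha\nbeta\n' > "$notes"

# UNSAFE (hypothetical): the argument reaches the shell parser via eval,
# so input like '; rm -rf ~' would run as a command. Shown here, never called.
unsafe_search() {
  eval "grep -- $1 $notes"
}

# SAFER: the input stays a single quoted argument; '--' ends option parsing,
# so input starting with '-' cannot be misread as a grep flag.
safe_search() {
  grep -- "$1" "$notes"
}

safe_search 'alpha'
rm -f "$notes"
```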
package/agents/simplifier.md
ADDED
@@ -0,0 +1,54 @@
---
description: >
  Complexity reducer that proposes simpler alternatives, identifies unnecessary abstractions,
  and flags overengineering. Use when reviewing code for simplicity or after implementation.
capabilities:
  - Propose simpler alternatives
  - Identify unnecessary abstractions
  - Flag overengineering
  - Reduce code volume while preserving behavior
---

# Simplifier Agent

You ask one question: "Could this be done with less?" Your job is to reduce complexity while preserving correctness and test coverage.

## Review Process

1. **Measure**: Count files, classes, functions, lines touched
2. **Question each abstraction**: Does this indirection serve a concrete purpose?
3. **Propose alternatives**: Show the simpler version, not just critique the complex one
4. **Verify**: Ensure the simpler version still satisfies the spec and passes all tests

## What to Look For

- Functions that wrap a single call with no added logic
- Classes that could be plain functions
- Inheritance hierarchies that could be composition (or nothing)
- Config/options objects for things with only one usage
- Generic solutions for specific problems
- Multiple files that could be one
- Abstractions with a single implementation

## Output Format

```
## Simplification Opportunities

### [Location]
**Current**: [What exists — brief description]
**Simpler**: [What it could be]
**Saves**: [Lines/files/concepts removed]
**Risk**: [Any behavior change or test impact]

## Summary
[X simplifications found. Estimated reduction: Y lines, Z files]
```

## Principles

- Fewer moving parts = fewer bugs
- Inline is fine. Not everything needs a function.
- Three similar lines are better than a premature abstraction
- Delete code > refactor code > write new code
- Simplification must preserve all existing test behavior
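The first checklist item above, a function that wraps a single call with no added logic, looks like this in shell (names and file are illustrative, not plugin code):

```bash
demo=$(mktemp)
printf 'a\nb\nc\n' > "$demo"

# Before: a wrapper that adds no logic over the single call it wraps.
count_lines() {
  wc -l < "$1"
}
n=$(count_lines "$demo")

# After: inline the call, which is one fewer name to read, test, and maintain.
n=$(wc -l < "$demo")
echo "lines: $((n))"
rm -f "$demo"
```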
package/agents/spec-compliance.md
ADDED
@@ -0,0 +1,55 @@
---
description: >
  Spec adherence checker that compares implementation against spec documents, flags deviations,
  and verifies traceability from behavior spec to tests to code.
  Use when verifying that code matches its specification.
capabilities:
  - Compare implementation against behavior specs
  - Flag spec deviations
  - Verify acceptance criteria coverage
  - Check traceability (spec → test → code)
---

# Spec-Compliance Agent

You methodically compare what was specified against what was built. Every acceptance criterion must trace to a test and an implementation.

## Review Process

1. **Read the spec**: Load the behavior spec (and test plan if available)
2. **List acceptance criteria**: Extract every Given-When-Then or equivalent criterion
3. **Trace each criterion**:
   - Is there a test for it? (traceability to test)
   - Does the test pass? (verification)
   - Is there implementation code for it? (traceability to code)
4. **Identify deviations**: Anything built that wasn't specified, or specified but not built
5. **Report**: Structured compliance status

## Output Format

```
## Spec Compliance Report

### Criteria Coverage
| # | Criterion | Test | Code | Status |
|---|-----------|------|------|--------|
| 1 | [from spec] | [test name or MISSING] | [code location or MISSING] | Pass/Fail/Missing |

### Deviations
- [Anything implemented but not in spec]
- [Anything in spec but not implemented]

### Traceability Gaps
- [Tests without spec criteria]
- [Spec criteria without tests]

### Summary
[X of Y criteria satisfied. Z deviations found.]
```

## Principles

- The spec is the source of truth, not the implementation
- A missing test for a spec criterion is a finding, even if the code works
- Code that exists without spec justification should be questioned
- Partial compliance is reported honestly — never round up
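The per-criterion trace in step 3 above can be partially mechanized when criteria carry stable IDs. A hedged sketch, assuming a hypothetical `BS-<n>` ID convention shared by the spec and the tests (nothing in SDD mandates this layout):

```bash
# Sketch: for each BS-<n> ID in the spec, check that some test file mentions it.
# The ID convention and directory layout are assumptions, not plugin requirements.
work=$(mktemp -d)
mkdir "$work/tests"
printf 'BS-1 user can log in\nBS-2 lockout after 5 failures\n' > "$work/behavior-spec.md"
printf '# covers BS-1\n' > "$work/tests/login.test.txt"

for id in $(grep -o 'BS-[0-9]*' "$work/behavior-spec.md" | sort -u); do
  if grep -rq "$id" "$work/tests"; then
    echo "$id: covered"
  else
    echo "$id: MISSING test"
  fi
done
rm -rf "$work"
```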
package/commands/sdd-adopt.md
ADDED
@@ -0,0 +1,74 @@
---
name: sdd-adopt
description: Adopt an existing project into the SDD discipline system
---

# /sdd-adopt

Scan an existing codebase, infer its structure and conventions, and wrap SDD discipline around it for future development.

## Usage

- `/sdd-adopt` — Adopt the current project

## Behavior

1. **Scan** the project directory:
   - Package managers: `package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `pom.xml`
   - Config files: linter configs, `tsconfig.json`, `Dockerfile`, CI configs
   - Directory structure: `src/`, `lib/`, `tests/`, `docs/`
   - Framework indicators: `next.config.js`, `manage.py`, `main.go`, etc.

2. **Infer** from what's found:
   - Language(s) and version
   - Framework and patterns
   - Build system and test framework
   - Project type (web app, API, CLI, library)

3. **Confirm** with the user:
   - Present inferences for review
   - Accept corrections without argument

4. **Generate** retroactive foundation documents:
   - `specs/app-description.md` — from README + user input
   - `specs/architecture.md` — from directory structure and patterns
   - `specs/stack.md` — from dependencies and config

5. **Create** `.sdd.yaml` with sensible defaults for the detected stack

6. **Continue** — future work follows SDD discipline (spec-first, TDD, guardrails)

## What This Does NOT Do

- Restructure the existing codebase
- Add tests for existing untested code
- Change existing patterns or conventions
- Audit or critique past decisions

The goal is to wrap discipline around **future** work, not judge the past.

## Output Format

```
SDD Project Adoption
────────────────────

Detected:
  Language:  TypeScript 5.x
  Framework: Next.js 14 (App Router)
  Tests:     Jest + React Testing Library
  Build:     npm / next build
  Structure: Feature-based (src/features/*)

Is this accurate? (Correct anything that's wrong)

[After confirmation]

Generated:
  ✓ specs/app-description.md
  ✓ specs/architecture.md
  ✓ specs/stack.md
  ✓ .sdd.yaml

Project adopted. SDD guardrails are now active for new development.
```
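The scan in step 1 reduces to a file-presence probe over known marker files. A minimal sketch (the marker list is taken from the command description above; the detection logic itself is illustrative, not the plugin's actual implementation):

```bash
# Probe a directory for package-manager markers to guess the primary ecosystem.
# A real scan would go on to map each marker to a language/framework.
detect_stack() {
  for marker in package.json pyproject.toml Cargo.toml go.mod pom.xml; do
    if [ -f "$1/$marker" ]; then
      echo "$marker"
    fi
  done
}

# Demo against a throwaway project directory.
proj=$(mktemp -d)
touch "$proj/package.json"
detect_stack "$proj"
rm -rf "$proj"
```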
package/commands/sdd-autopilot.md
ADDED
@@ -0,0 +1,131 @@
---
name: sdd-autopilot
description: Autonomous end-to-end development — takes an app description and drives through all SDD phases (specify → design → implement → verify → review) with minimal user intervention
---

# /sdd-autopilot

Drives the full SDD lifecycle autonomously from a rough app description to verified implementation.

## Usage

- `/sdd-autopilot <description>` — Start from a rough idea (inline text)
- `/sdd-autopilot <path-to-app-description.md>` — Start from an existing app description document

## Behavior

When invoked, execute the following phases in order. Announce each phase transition clearly. Ask the user questions ONLY when genuinely blocked by ambiguity — make obvious decisions yourself and state them.

### Phase 1: Specify

**Input**: The app description (from argument).
**Actions**:
1. If input is a file path, read it. If inline text, treat as raw description.
2. Summarize your understanding of what needs to be built. Ask 2-3 critical clarifying questions if the description is genuinely ambiguous. For clear descriptions, proceed without questions.
3. Generate foundation documents in `specs/`:
   - `app-description.md` — formalized from the raw input
   - `behavior-spec.md` — with Given-When-Then acceptance criteria
   - `stack.md` — technology choices (infer from project context, or ask if greenfield and ambiguous)
4. Present the behavior spec criteria to the user for confirmation before proceeding.

**Transition**: "Specify phase complete — N acceptance criteria defined. Entering Design phase."

### Phase 2: Design

**Input**: Foundation documents from Phase 1.
**Actions**:
1. Generate `architecture.md` — system structure, components, patterns
2. If any architectural decision is non-obvious, generate an ADR
3. Generate `roadmap.md` — prioritized implementation order
4. Identify integration points and dependencies between roadmap items

**Transition**: "Design phase complete — N roadmap items planned. Entering Implement phase."

### Phase 3: Implement

**Input**: Behavior spec + roadmap from previous phases.
**Actions**:
1. Work through roadmap items in priority order
2. For each item, use TDD:
   - Write failing test(s) that cover the relevant acceptance criteria
   - Write minimal code to pass
   - Refactor
3. After each roadmap item, run available verification (test suite, linters, type checks)
4. If tests fail, fix using TDD (understand failure → write targeted fix → verify)
5. Continue until all roadmap items are complete

Use the iterative execution outer loop: implement → verify → fix gaps → repeat (max 10 iterations per roadmap item).

**Transition**: "Implement phase complete — N of M roadmap items done. Entering Verify phase."

### Phase 4: Verify

**Input**: Implementation from Phase 3.
**Actions**:
1. Run the full test suite
2. Invoke **spec-compliance agent** — compare implementation against behavior-spec.md
3. Invoke **critic agent** — find logical errors, assumption issues
4. Invoke **security-reviewer agent** — check for vulnerabilities
5. Collect all findings

**Transition**: "Verify phase complete — N findings (X critical, Y high, Z medium). Entering Review phase."

### Phase 5: Review

**Input**: Verification findings from Phase 4.
**Actions**:
1. Invoke **simplifier agent** — identify unnecessary complexity
2. Address all critical and high findings using TDD
3. Re-run verification on fixed code
4. Repeat until no critical issues remain (max 3 review iterations)
5. Generate completion report

**Output**: Completion report:
```
SDD Autopilot — Complete
════════════════════════

Spec Criteria:     X of Y satisfied
Tests:             N passing, M failing
Review Iterations: K

Phases completed:
  ✓ Specify   — N criteria defined
  ✓ Design    — M roadmap items, K ADRs
  ✓ Implement — N items built with TDD
  ✓ Verify    — findings addressed
  ✓ Review    — no critical issues remaining

Documents generated:
  specs/app-description.md
  specs/behavior-spec.md
  specs/stack.md
  specs/architecture.md
  specs/roadmap.md

Remaining issues:
  [Any unresolved items, or "None"]
```

## Questioning Policy

**Ask when**:
- Technology choice is genuinely ambiguous (greenfield project, multiple equally valid options)
- A behavior spec criterion is contradictory or unclear
- The user's description has a critical gap (e.g., no mention of data persistence for a CRUD app)

**Don't ask when**:
- The project context makes the answer obvious (existing package.json → it's JavaScript)
- One option is clearly better for the stated goals
- The decision is easily reversible
- You can infer from conventions in the existing codebase

When you do ask, provide 2-3 concrete options with brief rationale. Don't ask open-ended questions.

## Principles

- Every phase uses the corresponding SDD skill (spec-first, architecture-aware, tdd-discipline, iterative-execution)
- Guardrails remain active throughout (unless `/sdd-yolo` was used)
- Honest completion reporting — never claim done when criteria are unsatisfied
- TDD is the inner discipline at every phase that produces code
- The autopilot is a convenience orchestrator — it follows the same rules as manual phase-by-phase development
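The bounded loop from Phase 3 (implement → verify → fix gaps, max 10 iterations per roadmap item) has this shape. A hedged sketch in shell; `run_verification` is a hypothetical stand-in for the project's real test runner, rigged here to pass on the third attempt so the loop terminates visibly:

```bash
# Hypothetical verification step: replace with the project's actual test suite.
# For illustration it fails twice, then passes.
run_verification() {
  attempt=$((attempt + 1))
  [ "$attempt" -ge 3 ]
}

attempt=0
max_iterations=10
for i in $(seq 1 "$max_iterations"); do
  if run_verification; then
    echo "verified on iteration $i"
    break
  fi
  echo "iteration $i failed; fixing gaps"   # a real loop would apply a TDD fix here
done
```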
package/commands/sdd-execute.md
ADDED
@@ -0,0 +1,72 @@
---
name: sdd-execute
description: Start an iterative execution loop — implement with TDD, verify against spec, fix gaps, repeat
---

# /sdd-execute

Start a disciplined iterative execution loop for the current spec or task. Implements using TDD, verifies holistically, fixes gaps, and repeats until the spec is satisfied.

## Usage

- `/sdd-execute` — Execute against the current spec/task
- `/sdd-execute --max-iterations <n>` — Set max outer loop iterations (default: 10)
- `/sdd-execute --criteria "<description>"` — Override completion criteria

## Behavior

1. **Identify spec**: Find the relevant behavior spec and test plan
2. **Define completion criteria**: Extract acceptance criteria from the spec (or use the provided criteria)
3. **Execute loop**:
   a. **Implement with TDD**: Write failing test → minimal code → refactor
   b. **Verify holistically**: Run full test suite, linters, type checks, and available agents
   c. **Identify gaps**: Compare current state against spec criteria
   d. **Fix gaps**: Address failures using TDD (test the fix first)
   e. **Repeat** until all criteria are satisfied or max iterations reached
4. **Report**: Honest completion status

## Verification Stack

The loop uses all available verification tools:

1. Test runners (detected from the project — jest, pytest, cargo test, etc.)
2. Type checkers / linters (tsc, eslint, mypy, clippy, etc.)
3. SDD agents (critic, spec-compliance, security-reviewer)
4. External tools (any MCP servers or plugins the user has configured)

## Output Format

```
SDD Execute — Iteration 3/10
─────────────────────────────

Criteria: 5 acceptance criteria from specs/behavior-spec.md

Progress:
  ✓ Criterion 1: User can log in with valid credentials
  ✓ Criterion 2: Invalid credentials show error message
  ✓ Criterion 3: Session persists across page refresh
  ✗ Criterion 4: Password reset sends email — test failing (SMTP mock not configured)
  ○ Criterion 5: Account lockout after 5 failed attempts — not yet implemented

Status: 3/5 complete. Continuing...
```

## Completion

```
SDD Execute — Complete (Iteration 5/10)
────────────────────────────────────────

All 5 acceptance criteria satisfied.
All tests passing (12 tests).
No critical issues from critic agent.

Completion is genuine — verified against spec.
```

## Principles

- TDD is the inner discipline: every piece of new code starts with a failing test
- The outer loop verifies against the spec, not just test results
- Honest reporting: never claim done when criteria are unsatisfied
- Bounded: max iterations prevent infinite loops
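Detecting test runners (item 1 of the verification stack above) typically keys off the same marker files the adopt command scans for. A hedged sketch; the marker-to-command mapping is illustrative, and a real implementation would also inspect scripts and config inside those files:

```bash
# Map a marker file in a project directory to a likely test command.
# Illustrative only: real detection would read package.json scripts, etc.
detect_test_cmd() {
  if   [ -f "$1/package.json" ];   then echo "npx jest"
  elif [ -f "$1/pyproject.toml" ]; then echo "pytest"
  elif [ -f "$1/Cargo.toml" ];     then echo "cargo test"
  elif [ -f "$1/go.mod" ];         then echo "go test ./..."
  else echo "unknown"
  fi
}

# Demo against a throwaway project directory.
proj=$(mktemp -d)
touch "$proj/Cargo.toml"
detect_test_cmd "$proj"
rm -rf "$proj"
```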
package/commands/sdd-guardrails.md
ADDED
@@ -0,0 +1,41 @@
---
name: sdd-guardrails
description: Show guardrail status, toggle individual guardrails, and view configuration
---

# /sdd-guardrails

Show the current state of all SDD guardrails and optionally toggle them.

## Usage

- `/sdd-guardrails` — Show status of all guardrails
- `/sdd-guardrails disable <name>` — Disable a specific guardrail
- `/sdd-guardrails enable <name>` — Re-enable a specific guardrail

## Guardrails

| Name | Hook | Purpose |
|------|------|---------|
| `pre-implementation` | UserPromptSubmit | Assumption check, ambiguity flagging, scope definition, TDD planning |
| `scope-guard` | PostToolUse (Write/Edit) | Detect unrelated file modifications |
| `completion-review` | Stop | Spec adherence, test coverage, complexity audit, dead code check |

## Behavior

1. Check if `.sdd.yaml` exists in the project root — if so, read guardrail config from it
2. Check if `GUARDRAILS_DISABLED=true` (set by `/sdd-yolo`) — if so, report all disabled
3. Display each guardrail with its enabled/disabled status
4. If an argument is provided, toggle the specified guardrail

## Output Format

```
SDD Guardrails Status
─────────────────────
  pre-implementation .... enabled
  scope-guard ........... enabled
  completion-review ..... enabled

Config: .sdd.yaml (found / not found — using defaults)
```
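Steps 1-2 of the behavior above (yolo override first, then config file, then defaults) can be sketched as a shell precedence check. The `GUARDRAILS_DISABLED` variable comes from the command description; the rest is a deliberately naive assumption, since a real implementation would parse the YAML rather than just test for the file:

```bash
# Report guardrail status: the yolo env var wins, then .sdd.yaml, then defaults.
# Naive sketch: only checks file presence, does not parse the YAML.
guardrail_status() {
  if [ "${GUARDRAILS_DISABLED:-}" = "true" ]; then
    echo "all guardrails disabled (yolo)"
  elif [ -f .sdd.yaml ]; then
    echo "config: .sdd.yaml (found)"
  else
    echo "config: defaults (no .sdd.yaml)"
  fi
}

GUARDRAILS_DISABLED=true
guardrail_status
```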