claude-code-pilot 3.1.0 → 3.2.0

Files changed (110)
  1. package/README.md +11 -11
  2. package/bin/install.js +20 -2
  3. package/manifest.json +5 -1
  4. package/package.json +18 -6
  5. package/src/agents/a11y-architect.md +141 -0
  6. package/src/agents/code-architect.md +71 -0
  7. package/src/agents/code-explorer.md +69 -0
  8. package/src/agents/code-simplifier.md +47 -0
  9. package/src/agents/comment-analyzer.md +45 -0
  10. package/src/agents/csharp-reviewer.md +101 -0
  11. package/src/agents/dart-build-resolver.md +201 -0
  12. package/src/agents/pr-test-analyzer.md +45 -0
  13. package/src/agents/silent-failure-hunter.md +50 -0
  14. package/src/agents/type-design-analyzer.md +41 -0
  15. package/src/available-rules/README.md +3 -1
  16. package/src/available-rules/dart/coding-style.md +159 -0
  17. package/src/available-rules/dart/hooks.md +66 -0
  18. package/src/available-rules/dart/patterns.md +261 -0
  19. package/src/available-rules/dart/security.md +135 -0
  20. package/src/available-rules/dart/testing.md +215 -0
  21. package/src/available-rules/web/coding-style.md +105 -0
  22. package/src/available-rules/web/design-quality.md +72 -0
  23. package/src/available-rules/web/hooks.md +129 -0
  24. package/src/available-rules/web/patterns.md +88 -0
  25. package/src/available-rules/web/performance.md +73 -0
  26. package/src/available-rules/web/security.md +66 -0
  27. package/src/available-rules/web/testing.md +64 -0
  28. package/src/commands/ccp/ai-integration-phase.md +36 -0
  29. package/src/commands/ccp/audit-fix.md +33 -0
  30. package/src/commands/ccp/code-review-fix.md +52 -0
  31. package/src/commands/ccp/eval-review.md +32 -0
  32. package/src/commands/ccp/extract_learnings.md +22 -0
  33. package/src/commands/ccp/import.md +37 -0
  34. package/src/commands/ccp/ingest-docs.md +42 -0
  35. package/src/commands/ccp/intel.md +179 -0
  36. package/src/commands/ccp/plan-review-convergence.md +58 -0
  37. package/src/commands/ccp/scan.md +26 -0
  38. package/src/commands/ccp/sketch-wrap-up.md +31 -0
  39. package/src/commands/ccp/sketch.md +54 -0
  40. package/src/commands/ccp/spec-phase.md +62 -0
  41. package/src/commands/ccp/spike-wrap-up.md +31 -0
  42. package/src/commands/ccp/spike.md +51 -0
  43. package/src/commands/ccp/ultraplan-phase.md +33 -0
  44. package/src/hooks/ccp-read-injection-scanner.js +152 -0
  45. package/src/hooks/kit-check-update.js +59 -7
  46. package/src/hooks/run-with-flags-shell.sh +1 -0
  47. package/src/hooks/run-with-flags.js +48 -1
  48. package/src/hooks/session-end.js +88 -1
  49. package/src/lib/hook-flags.js +14 -0
  50. package/src/pilot/references/agent-contracts.md +79 -0
  51. package/src/pilot/references/ai-evals.md +156 -0
  52. package/src/pilot/references/ai-frameworks.md +186 -0
  53. package/src/pilot/references/doc-conflict-engine.md +91 -0
  54. package/src/pilot/references/gate-prompts.md +100 -0
  55. package/src/pilot/references/gates.md +70 -0
  56. package/src/pilot/references/mandatory-initial-read.md +2 -0
  57. package/src/pilot/references/project-skills-discovery.md +19 -0
  58. package/src/pilot/references/revision-loop.md +97 -0
  59. package/src/pilot/references/sketch-interactivity.md +41 -0
  60. package/src/pilot/references/sketch-theme-system.md +94 -0
  61. package/src/pilot/references/sketch-tooling.md +45 -0
  62. package/src/pilot/references/sketch-variant-patterns.md +81 -0
  63. package/src/pilot/references/thinking-models-debug.md +44 -0
  64. package/src/pilot/references/thinking-models-execution.md +50 -0
  65. package/src/pilot/references/thinking-models-planning.md +62 -0
  66. package/src/pilot/references/thinking-models-research.md +50 -0
  67. package/src/pilot/references/thinking-models-verification.md +55 -0
  68. package/src/pilot/templates/AI-SPEC.md +246 -0
  69. package/src/pilot/templates/spec.md +307 -0
  70. package/src/pilot/workflows/ai-integration-phase.md +284 -0
  71. package/src/pilot/workflows/audit-fix.md +175 -0
  72. package/src/pilot/workflows/code-review-fix.md +497 -0
  73. package/src/pilot/workflows/eval-review.md +155 -0
  74. package/src/pilot/workflows/extract_learnings.md +242 -0
  75. package/src/pilot/workflows/import.md +246 -0
  76. package/src/pilot/workflows/ingest-docs.md +328 -0
  77. package/src/pilot/workflows/plan-review-convergence.md +329 -0
  78. package/src/pilot/workflows/scan.md +102 -0
  79. package/src/pilot/workflows/sketch-wrap-up.md +285 -0
  80. package/src/pilot/workflows/sketch.md +360 -0
  81. package/src/pilot/workflows/spec-phase.md +262 -0
  82. package/src/pilot/workflows/spike-wrap-up.md +306 -0
  83. package/src/pilot/workflows/spike.md +452 -0
  84. package/src/pilot/workflows/ultraplan-phase.md +189 -0
  85. package/src/skills/accessibility/SKILL.md +146 -0
  86. package/src/skills/agent-eval/SKILL.md +145 -0
  87. package/src/skills/agent-introspection-debugging/SKILL.md +153 -0
  88. package/src/skills/android-clean-architecture/SKILL.md +339 -0
  89. package/src/skills/api-connector-builder/SKILL.md +120 -0
  90. package/src/skills/code-tour/SKILL.md +236 -0
  91. package/src/skills/compose-multiplatform-patterns/SKILL.md +299 -0
  92. package/src/skills/csharp-testing/SKILL.md +321 -0
  93. package/src/skills/dart-flutter-patterns/SKILL.md +563 -0
  94. package/src/skills/dashboard-builder/SKILL.md +108 -0
  95. package/src/skills/dotnet-patterns/SKILL.md +321 -0
  96. package/src/skills/frontend-design/SKILL.md +145 -0
  97. package/src/skills/frontend-slides/SKILL.md +184 -0
  98. package/src/skills/frontend-slides/STYLE_PRESETS.md +330 -0
  99. package/src/skills/gateguard/SKILL.md +121 -0
  100. package/src/skills/github-ops/SKILL.md +144 -0
  101. package/src/skills/hookify-rules/SKILL.md +128 -0
  102. package/src/skills/knowledge-ops/SKILL.md +154 -0
  103. package/src/skills/liquid-glass-design/SKILL.md +279 -0
  104. package/src/skills/nestjs-patterns/SKILL.md +230 -0
  105. package/src/skills/security-bounty-hunter/SKILL.md +99 -0
  106. package/src/skills/swift-actor-persistence/SKILL.md +143 -0
  107. package/src/skills/swift-protocol-di-testing/SKILL.md +190 -0
  108. package/src/skills/swiftui-patterns/SKILL.md +259 -0
  109. package/src/skills/terminal-ops/SKILL.md +109 -0
  110. package/src/skills/ui-demo/SKILL.md +465 -0
package/src/pilot/templates/AI-SPEC.md
@@ -0,0 +1,246 @@
+ # AI-SPEC — Phase {N}: {phase_name}
+
+ > AI design contract generated by `/ccp:ai-integration-phase`. Consumed by `gsd-planner` and `gsd-eval-auditor`.
+ > Locks framework selection, implementation guidance, and evaluation strategy before planning begins.
+
+ ---
+
+ ## 1. System Classification
+
+ **System Type:** <!-- RAG | Multi-Agent | Conversational | Extraction | Autonomous Agent | Content Generation | Code Automation | Hybrid -->
+
+ **Description:**
+ <!-- One-paragraph description of what this AI system does, who uses it, and what "good" looks like -->
+
+ **Critical Failure Modes:**
+ <!-- The 3-5 behaviors that absolutely cannot go wrong in this system -->
+ 1.
+ 2.
+ 3.
+
+ ---
+
+ ## 1b. Domain Context
+
+ > Researched by `gsd-domain-researcher`. Grounds the evaluation strategy in domain expert knowledge.
+
+ **Industry Vertical:** <!-- healthcare | legal | finance | customer service | education | developer tooling | e-commerce | etc. -->
+
+ **User Population:** <!-- who uses this system and in what context -->
+
+ **Stakes Level:** <!-- Low | Medium | High | Critical -->
+
+ **Output Consequence:** <!-- what happens downstream when the AI output is acted on -->
+
+ ### What Domain Experts Evaluate Against
+
+ <!-- Domain-specific rubric ingredients — in practitioner language, not AI jargon -->
+ <!-- Format: Dimension / Good (expert accepts) / Bad (expert flags) / Stakes / Source -->
+
+ ### Known Failure Modes in This Domain
+
+ <!-- Domain-specific failure modes from research — not generic hallucination, but how it manifests here -->
+
+ ### Regulatory / Compliance Context
+
+ <!-- Relevant regulations or constraints — or "None identified" if genuinely none apply -->
+
+ ### Domain Expert Roles for Evaluation
+
+ | Role | Responsibility |
+ |------|---------------|
+ | <!-- e.g., Senior practitioner --> | <!-- Dataset labeling / rubric calibration / production sampling --> |
+
+ ---
+
+ ## 2. Framework Decision
+
+ **Selected Framework:** <!-- e.g., LlamaIndex v0.10.x -->
+
+ **Version:** <!-- Pin the version -->
+
+ **Rationale:**
+ <!-- Why this framework fits this system type, team context, and production requirements -->
+
+ **Alternatives Considered:**
+
+ | Framework | Ruled Out Because |
+ |-----------|------------------|
+ | | |
+
+ **Vendor Lock-In Accepted:** <!-- Yes / No / Partial — document the trade-off consciously -->
+
+ ---
+
+ ## 3. Framework Quick Reference
+
+ > Fetched from official docs by `gsd-ai-researcher`. Distilled for this specific use case.
+
+ ### Installation
+ ```bash
+ # Install command(s)
+ ```
+
+ ### Core Imports
+ ```python
+ # Key imports for this use case
+ ```
+
+ ### Entry Point Pattern
+ ```python
+ # Minimal working example for this system type
+ ```
+
+ ### Key Abstractions
+ <!-- Framework-specific concepts the developer must understand before coding -->
+ | Concept | What It Is | When You Use It |
+ |---------|-----------|-----------------|
+ | | | |
+
+ ### Common Pitfalls
+ <!-- Gotchas specific to this framework and system type — from docs, issues, and community reports -->
+ 1.
+ 2.
+ 3.
+
+ ### Recommended Project Structure
+ ```
+ project/
+ ├── # Framework-specific folder layout
+ ```
+
+ ---
+
+ ## 4. Implementation Guidance
+
+ **Model Configuration:**
+ <!-- Which model(s), temperature, max tokens, and other key parameters -->
+
+ **Core Pattern:**
+ <!-- The primary implementation pattern for this system type in this framework -->
+
+ **Tool Use:**
+ <!-- Tools/integrations needed and how to configure them -->
+
+ **State Management:**
+ <!-- How state is persisted, retrieved, and updated -->
+
+ **Context Window Strategy:**
+ <!-- How to manage context limits for this system type -->
+
+ ---
+
+ ## 4b. AI Systems Best Practices
+
+ > Written by `gsd-ai-researcher`. Cross-cutting patterns every developer building AI systems needs — independent of framework choice.
+
+ ### Structured Outputs with Pydantic
+
+ <!-- Framework-specific Pydantic integration pattern for this use case -->
+ <!-- Include: output model definition, how the framework uses it, retry logic on validation failure -->
+
+ ```python
+ # Pydantic output model for this system type
+ ```
+
+ ### Async-First Design
+
+ <!-- How async is handled in this framework, the one common mistake, and when to stream vs. await -->
+
+ ### Prompt Engineering Discipline
+
+ <!-- System vs. user prompt separation, few-shot guidance, token budget strategy -->
+
+ ### Context Window Management
+
+ <!-- Strategy specific to this system type: RAG chunking / conversation summarisation / agent compaction -->
+
+ ### Cost and Latency Budget
+
+ <!-- Per-call cost estimate, caching strategy, sub-task model routing -->
+
+ ---
+
+ ## 5. Evaluation Strategy
+
+ ### Dimensions
+
+ | Dimension | Rubric (Pass/Fail or 1-5) | Measurement Approach | Priority |
+ |-----------|--------------------------|---------------------|----------|
+ | | | Code / LLM Judge / Human | Critical / High / Medium |
+
+ ### Eval Tooling
+
+ **Primary Tool:** <!-- e.g., RAGAS + Langfuse -->
+
+ **Setup:**
+ ```bash
+ # Install and configure
+ ```
+
+ **CI/CD Integration:**
+ ```bash
+ # Command to run evals in CI/CD pipeline
+ ```
+
+ ### Reference Dataset
+
+ **Size:** <!-- e.g., 20 examples to start -->
+
+ **Composition:**
+ <!-- What scenario types the dataset covers: critical paths, edge cases, failure modes -->
+
+ **Labeling:**
+ <!-- Who labels examples and how (domain expert, LLM judge with calibration, etc.) -->
+
+ ---
+
+ ## 6. Guardrails
+
+ ### Online (Real-Time)
+
+ | Guardrail | Trigger | Intervention |
+ |-----------|---------|--------------|
+ | | | Block / Escalate / Flag |
+
+ ### Offline (Flywheel)
+
+ | Metric | Sampling Strategy | Action on Degradation |
+ |--------|------------------|----------------------|
+ | | | |
+
+ ---
+
+ ## 7. Production Monitoring
+
+ **Tracing Tool:** <!-- e.g., Langfuse self-hosted -->
+
+ **Key Metrics to Track:**
+ <!-- 3-5 metrics that will be monitored in production -->
+
+ **Alert Thresholds:**
+ <!-- When to page/alert -->
+
+ **Smart Sampling Strategy:**
+ <!-- How to select interactions for human review — signal-based filters -->
+
+ ---
+
+ ## Checklist
+
+ - [ ] System type classified
+ - [ ] Critical failure modes identified (≥ 3)
+ - [ ] Domain context researched (Section 1b: vertical, stakes, expert criteria, failure modes)
+ - [ ] Regulatory/compliance context identified or explicitly noted as none
+ - [ ] Domain expert roles defined for evaluation involvement
+ - [ ] Framework selected with rationale documented
+ - [ ] Alternatives considered and ruled out
+ - [ ] Framework quick reference written (install, imports, pattern, pitfalls)
+ - [ ] AI systems best practices written (Section 4b: Pydantic, async, prompt discipline, context)
+ - [ ] Evaluation dimensions grounded in domain rubric ingredients
+ - [ ] Each eval dimension has a concrete rubric (Good/Bad in domain language)
+ - [ ] Eval tooling selected — Arize Phoenix default confirmed or override noted
+ - [ ] Reference dataset spec written (size ≥ 10, composition + labeling defined)
+ - [ ] CI/CD eval integration specified
+ - [ ] Online guardrails defined
+ - [ ] Production monitoring configured (tracing tool + sampling strategy)
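The checklist that closes the AI-SPEC template lends itself to a mechanical check. The following is a minimal sketch, assuming a filled-in AI-SPEC.md keeps the same `- [ ]` / `- [x]` task-list syntax as the template. The function name `unchecked_items` and the sample text are hypothetical and are not part of the package.

```python
import re

# Matches a Markdown task-list item and captures its state and its label.
TASK_ITEM = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def unchecked_items(markdown_text: str) -> list[str]:
    """Return the labels of task-list items that are still unchecked."""
    pending = []
    for line in markdown_text.splitlines():
        match = TASK_ITEM.match(line)
        if match and match.group(1) == " ":
            pending.append(match.group(2).strip())
    return pending


# Hypothetical excerpt of a partially completed checklist.
sample = """\
## Checklist

- [x] System type classified
- [ ] Critical failure modes identified (≥ 3)
- [x] Framework selected with rationale documented
- [ ] Online guardrails defined
"""

for label in unchecked_items(sample):
    print(f"pending: {label}")
```

A check like this could run before planning starts, so that an AI-SPEC with open items is caught early; whether the package itself enforces the checklist is not visible in this diff.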
package/src/pilot/templates/spec.md
@@ -0,0 +1,307 @@
+ # Phase Spec Template
+
+ Template for `.planning/phases/XX-name/{phase_num}-SPEC.md` — locks requirements before discuss-phase.
+
+ **Purpose:** Capture WHAT a phase delivers and WHY, with enough precision that requirements are falsifiable. discuss-phase reads this file and focuses on HOW to implement (skipping "what/why" questions already answered here).
+
+ **Key principle:** Every requirement must be falsifiable — you can write a test or check that proves it was met or not. Vague requirements like "improve performance" are not allowed.
+
+ **Downstream consumers:**
+ - `discuss-phase` — reads SPEC.md at startup; treats Requirements, Boundaries, and Acceptance Criteria as locked; skips "what/why" questions
+ - `gsd-planner` — reads locked requirements to constrain plan scope
+ - `gsd-verifier` — uses acceptance criteria as explicit pass/fail checks
+
+ ---
+
+ ## File Template
+
+ ```markdown
+ # Phase [X]: [Name] — Specification
+
+ **Created:** [date]
+ **Ambiguity score:** [score] (gate: ≤ 0.20)
+ **Requirements:** [N] locked
+
+ ## Goal
+
+ [One precise sentence — specific and measurable. NOT "improve X" — instead "X changes from A to B".]
+
+ ## Background
+
+ [Current state from codebase — what exists today, what's broken or missing, what triggers this work. Grounded in code reality, not abstract description.]
+
+ ## Requirements
+
+ 1. **[Short label]**: [Specific, testable statement.]
+ - Current: [what exists or does NOT exist today]
+ - Target: [what it should become after this phase]
+ - Acceptance: [concrete pass/fail check — how a verifier confirms this was met]
+
+ 2. **[Short label]**: [Specific, testable statement.]
+ - Current: [what exists or does NOT exist today]
+ - Target: [what it should become after this phase]
+ - Acceptance: [concrete pass/fail check]
+
+ [Continue for all requirements. Each must have Current/Target/Acceptance.]
+
+ ## Boundaries
+
+ **In scope:**
+ - [Explicit list of what this phase produces]
+ - [Each item is a concrete deliverable or behavior]
+
+ **Out of scope:**
+ - [Explicit list of what this phase does NOT do] — [brief reason why it's excluded]
+ - [Adjacent problems excluded from this phase] — [brief reason]
+
+ ## Constraints
+
+ [Performance, compatibility, data volume, dependency, or platform constraints.
+ If none: "No additional constraints beyond standard project conventions."]
+
+ ## Acceptance Criteria
+
+ - [ ] [Pass/fail criterion — unambiguous, verifiable]
+ - [ ] [Pass/fail criterion]
+ - [ ] [Pass/fail criterion]
+
+ [Every acceptance criterion must be a checkbox that resolves to PASS or FAIL.
+ No "should feel good", "looks reasonable", or "generally works" — those are not checkboxes.]
+
+ ## Ambiguity Report
+
+ | Dimension | Score | Min | Status | Notes |
+ |--------------------|-------|------|--------|------------------------------------|
+ | Goal Clarity | | 0.75 | | |
+ | Boundary Clarity | | 0.70 | | |
+ | Constraint Clarity | | 0.65 | | |
+ | Acceptance Criteria| | 0.70 | | |
+ | **Ambiguity** | | ≤0.20| | |
+
+ Status: ✓ = met minimum, ⚠ = below minimum (planner treats as assumption)
+
+ ## Interview Log
+
+ [Key decisions made during the Socratic interview. Format: round → question → answer → decision locked.]
+
+ | Round | Perspective | Question summary | Decision locked |
+ |-------|----------------|-------------------------|------------------------------------|
+ | 1 | Researcher | [what was asked] | [what was decided] |
+ | 2 | Simplifier | [what was asked] | [what was decided] |
+ | 3 | Boundary Keeper| [what was asked] | [what was decided] |
+
+ [If --auto mode: note "auto-selected" decisions with the reasoning Claude used.]
+
+ ---
+
+ *Phase: [XX-name]*
+ *Spec created: [date]*
+ *Next step: /ccp:discuss-phase [X] — implementation decisions (how to build what's specified above)*
+ ```
+
+ <good_examples>
+
+ **Example 1: Feature addition (Post Feed)**
+
+ ```markdown
+ # Phase 3: Post Feed — Specification
+
+ **Created:** 2025-01-20
+ **Ambiguity score:** 0.12
+ **Requirements:** 4 locked
+
+ ## Goal
+
+ Users can scroll through posts from accounts they follow, with new posts available after pull-to-refresh.
+
+ ## Background
+
+ The database has a `posts` table and `follows` table. No feed query or feed UI exists today. The home screen shows a placeholder "Your feed will appear here." This phase builds the feed query, API endpoint, and the feed list component.
+
+ ## Requirements
+
+ 1. **Feed query**: Returns posts from followed accounts ordered by creation time, descending.
+ - Current: No feed query exists — `posts` table is queried directly only from profile pages
+ - Target: `GET /api/feed` returns paginated posts from followed accounts, newest first, max 20 per page
+ - Acceptance: Query returns correct posts for a user who follows 3 accounts with known post counts; cursor-based pagination advances correctly
+
+ 2. **Feed display**: Posts display in a scrollable card list.
+ - Current: Home screen shows static placeholder text
+ - Target: Home screen renders feed cards with author, timestamp, post content, and reaction count
+ - Acceptance: Feed renders without error for 0 posts (empty state shown), 1 post, and 20+ posts
+
+ 3. **Pull-to-refresh**: User can refresh the feed manually.
+ - Current: No refresh mechanism exists
+ - Target: Pull-down gesture triggers refetch; new posts appear at top of list
+ - Acceptance: After a new post is created in test, pull-to-refresh shows the new post without full app restart
+
+ 4. **New posts indicator**: When new posts arrive, a banner appears instead of auto-scrolling.
+ - Current: No such mechanism
+ - Target: "3 new posts" banner appears when refetch returns posts newer than the newest visible post; tapping banner scrolls to top and shows new posts
+ - Acceptance: Banner appears for ≥1 new post, does not appear when no new posts, tap navigates to top
+
+ ## Boundaries
+
+ **In scope:**
+ - Feed query (backend) — posts from followed accounts, paginated
+ - Feed list UI (frontend) — post cards with author, timestamp, content, reaction counts
+ - Pull-to-refresh gesture
+ - New posts indicator banner
+ - Empty state when user follows no one or no posts exist
+
+ **Out of scope:**
+ - Creating posts — that is Phase 4
+ - Reacting to posts — that is Phase 5
+ - Following/unfollowing accounts — that is Phase 2 (already done)
+ - Push notifications for new posts — separate backlog item
+
+ ## Constraints
+
+ - Feed query must use cursor-based pagination (not offset) — the database has 500K+ posts and offset pagination is unacceptably slow beyond page 3
+ - The feed card component must reuse the existing `<AvatarImage>` component from Phase 2
+
+ ## Acceptance Criteria
+
+ - [ ] `GET /api/feed` returns posts only from followed accounts (not all posts)
+ - [ ] `GET /api/feed` supports `cursor` parameter for pagination
+ - [ ] Feed renders correctly at 0, 1, and 20+ posts
+ - [ ] Pull-to-refresh triggers refetch
+ - [ ] New posts indicator appears when posts newer than current view exist
+ - [ ] Empty state renders when user follows no one
+
+ ## Ambiguity Report
+
+ | Dimension | Score | Min | Status | Notes |
+ |--------------------|-------|------|--------|----------------------------------|
+ | Goal Clarity | 0.92 | 0.75 | ✓ | |
+ | Boundary Clarity | 0.95 | 0.70 | ✓ | Explicit out-of-scope list |
+ | Constraint Clarity | 0.80 | 0.65 | ✓ | Cursor pagination required |
+ | Acceptance Criteria| 0.85 | 0.70 | ✓ | 6 pass/fail criteria |
+ | **Ambiguity** | 0.12 | ≤0.20| ✓ | |
+
+ ## Interview Log
+
+ | Round | Perspective | Question summary | Decision locked |
+ |-------|-----------------|------------------------------|-----------------------------------------|
+ | 1 | Researcher | What exists in posts today? | posts + follows tables exist, no feed |
+ | 2 | Simplifier | Minimum viable feed? | Cards + pull-refresh, no auto-scroll |
+ | 3 | Boundary Keeper | What's NOT this phase? | Creating posts, reactions out of scope |
+ | 3 | Boundary Keeper | What does done look like? | Scrollable feed with 4 card fields |
+
+ ---
+
+ *Phase: 03-post-feed*
+ *Spec created: 2025-01-20*
+ *Next step: /ccp:discuss-phase 3 — implementation decisions (card layout, loading skeleton, etc.)*
+ ```
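The cursor-pagination constraint in Example 1 can be illustrated with a small in-memory sketch. Everything here is assumed for illustration: the `Post` record, the `feed_page` function, and the choice of `(created_at, id)` as the cursor are hypothetical and do not come from the package. A real implementation would push the same comparison into the database query instead of sorting in memory.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 20  # "max 20 per page" in the example spec


@dataclass(frozen=True)
class Post:
    id: int
    author_id: int
    created_at: int  # Unix timestamp, kept as an int for simplicity


# Hypothetical cursor: (created_at, id) of the last post already returned.
Cursor = tuple[int, int]


def feed_page(posts, followed, cursor: Optional[Cursor] = None, limit: int = PAGE_SIZE):
    """Return (page, next_cursor) for posts by followed authors, newest first."""
    candidates = sorted(
        (p for p in posts if p.author_id in followed),
        key=lambda p: (p.created_at, p.id),
        reverse=True,
    )
    if cursor is not None:
        # Keep only posts strictly older than the cursor position.
        candidates = [p for p in candidates if (p.created_at, p.id) < cursor]
    page = candidates[:limit]
    has_more = len(candidates) > limit
    next_cursor = (page[-1].created_at, page[-1].id) if has_more else None
    return page, next_cursor


# 50 synthetic posts spread over 4 authors; the user follows authors 1, 2 and 3.
posts = [Post(id=i, author_id=i % 4, created_at=1_000 + i) for i in range(50)]
followed = {1, 2, 3}

first, cursor = feed_page(posts, followed)
second, cursor2 = feed_page(posts, followed, cursor)
print(len(first), len(second), cursor2)
```

Because the cursor names a position in the ordering and not a row count, posts created between two requests do not shift the contents of later pages, which is one reason the example spec rules out offset pagination.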
+
+ **Example 2: CLI tool (Database backup)**
+
+ ```markdown
+ # Phase 2: Backup Command — Specification
+
+ **Created:** 2025-01-20
+ **Ambiguity score:** 0.15
+ **Requirements:** 3 locked
+
+ ## Goal
+
+ A `gsd backup` CLI command creates a reproducible database snapshot that can be restored by `gsd restore` (a separate phase).
+
+ ## Background
+
+ No backup tooling exists. The project uses PostgreSQL. Developers currently use `pg_dump` manually — there is no standardized process, no output naming convention, and no CI integration. Three incidents in the last quarter involved restoring from wrong or corrupt dumps.
+
+ ## Requirements
+
+ 1. **Backup creation**: CLI command executes a full database backup.
+ - Current: No `backup` subcommand exists in the CLI
+ - Target: `gsd backup` connects to the database (via `DATABASE_URL` env or `--db` flag), runs pg_dump, writes output to `./backups/YYYY-MM-DD_HH-MM-SS.dump`
+ - Acceptance: Running `gsd backup` on a test database creates a `.dump` file; running `pg_restore` on that file recreates the database without error
+
+ 2. **Network retry**: Transient network failures are retried automatically.
+ - Current: pg_dump fails immediately on network error
+ - Target: Backup retries up to 3 times with 5-second delay; 4th failure exits with code 1 and a message to stderr
+ - Acceptance: Simulating 2 sequential network failures causes 2 retries then success; simulating 4 failures causes exit code 1 and stderr message
+
+ 3. **Partial cleanup**: Failed backups do not leave corrupt files.
+ - Current: Manual pg_dump leaves partial files on failure
+ - Target: If backup fails after starting, the partial `.dump` file is deleted before exit
+ - Acceptance: After a simulated failure mid-dump, no `.dump` file exists in `./backups/`
+
+ ## Boundaries
+
+ **In scope:**
+ - `gsd backup` subcommand (full dump only)
+ - Output to `./backups/` directory (created if missing)
+ - Network retry (up to 3 retries)
+ - Partial file cleanup on failure
+
+ **Out of scope:**
+ - `gsd restore` — that is Phase 3
+ - Incremental backups — separate backlog item (full dump only for now)
+ - S3 or remote storage — separate backlog item
+ - Encryption — separate backlog item
+ - Scheduled/cron backups — separate backlog item
+
+ ## Constraints
+
+ - Must use `pg_dump` (not a custom query) — ensures compatibility with standard `pg_restore`
+ - `--no-retry` flag must be available for CI use (fail fast, no retries)
+
+ ## Acceptance Criteria
+
+ - [ ] `gsd backup` creates a `.dump` file in `./backups/YYYY-MM-DD_HH-MM-SS.dump` format
+ - [ ] `gsd backup` uses `DATABASE_URL` env var or `--db` flag for connection
+ - [ ] 3 retries on network failure, then exit code 1 with stderr message
+ - [ ] `--no-retry` flag skips retries and fails immediately on first error
+ - [ ] No partial `.dump` file left after a failed backup
+
+ ## Ambiguity Report
+
+ | Dimension | Score | Min | Status | Notes |
+ |--------------------|-------|------|--------|--------------------------------|
+ | Goal Clarity | 0.90 | 0.75 | ✓ | |
+ | Boundary Clarity | 0.95 | 0.70 | ✓ | Explicit out-of-scope list |
+ | Constraint Clarity | 0.75 | 0.65 | ✓ | pg_dump required |
+ | Acceptance Criteria| 0.80 | 0.70 | ✓ | 5 pass/fail criteria |
+ | **Ambiguity** | 0.15 | ≤0.20| ✓ | |
+
+ ## Interview Log
+
+ | Round | Perspective | Question summary | Decision locked |
+ |-------|-----------------|------------------------------|-----------------------------------------|
+ | 1 | Researcher | What backup tooling exists? | None — pg_dump manual only |
+ | 2 | Simplifier | Minimum viable backup? | Full dump only, local only |
+ | 3 | Boundary Keeper | What's NOT this phase? | Restore, S3, encryption excluded |
+ | 4 | Failure Analyst | What goes wrong on failure? | Partial files, CI fail-fast needed |
+
+ ---
+
+ *Phase: 02-backup-command*
+ *Spec created: 2025-01-20*
+ *Next step: /ccp:discuss-phase 2 — implementation decisions (progress reporting, flag design, etc.)*
+ ```
+
+ </good_examples>
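The retry and partial-cleanup requirements in Example 2 are specific enough to sketch. The code below is an illustration under stated assumptions, not the package's implementation: `run_backup`, `backup_path`, `flaky_dump`, and `TransientNetworkError` are hypothetical names, and the actual `pg_dump` invocation is replaced by an injected callable so the failure cases can be simulated without a database.

```python
import os
import sys
import tempfile
import time
from datetime import datetime


class TransientNetworkError(Exception):
    """Stand-in for whatever error the real dump step raises on a network drop."""


def backup_path(directory: str, now: datetime) -> str:
    # Naming convention from the example spec: ./backups/YYYY-MM-DD_HH-MM-SS.dump
    return os.path.join(directory, now.strftime("%Y-%m-%d_%H-%M-%S") + ".dump")


def run_backup(dump, directory, retries=3, delay=5.0, sleep=time.sleep):
    """Call dump(path); retry transient failures and never leave a partial file.

    Returns a process exit code: 0 on success, 1 after the final failure.
    """
    os.makedirs(directory, exist_ok=True)
    path = backup_path(directory, datetime.now())
    for attempt in range(retries + 1):  # one initial attempt plus `retries`
        try:
            dump(path)
            return 0
        except TransientNetworkError as exc:
            if os.path.exists(path):
                os.remove(path)  # partial cleanup before retrying or exiting
            if attempt == retries:
                print(f"backup failed after {attempt + 1} attempts: {exc}", file=sys.stderr)
                return 1
            sleep(delay)


def flaky_dump(failures):
    """Build a fake dump step that fails `failures` times, then succeeds."""
    remaining = [failures]

    def dump(path):
        with open(path, "w") as handle:
            handle.write("partial")
            if remaining[0] > 0:
                remaining[0] -= 1
                raise TransientNetworkError("connection reset")
            handle.write(" complete")

    return dump


with tempfile.TemporaryDirectory() as tmp:
    no_wait = lambda _seconds: None  # skip the 5-second delay in the simulation
    ok_code = run_backup(flaky_dump(2), os.path.join(tmp, "ok"), sleep=no_wait)
    bad_code = run_backup(flaky_dump(4), os.path.join(tmp, "bad"), sleep=no_wait)
    ok_files = os.listdir(os.path.join(tmp, "ok"))
    bad_files = os.listdir(os.path.join(tmp, "bad"))
```

The two simulated runs mirror the acceptance check in requirement 2: two failures are absorbed by retries, four failures end in exit code 1. Under the same assumptions, the `--no-retry` behaviour from the Constraints section would correspond to calling `run_backup(..., retries=0)`.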
+
+ <guidelines>
+ **Every requirement needs all three fields:**
+ - Current: grounds the requirement in reality — what exists today?
+ - Target: the concrete change — not "improve X" but "X becomes Y"
+ - Acceptance: the falsifiable check — how does a verifier confirm this?
+
+ **Ambiguity Report must reflect the actual interview.** If a dimension is below minimum, mark it ⚠ — the planner knows to treat it as an assumption rather than a locked requirement.
+
+ **Interview Log is evidence of rigor.** Don't skip it. It shows that requirements came from discovery, not assumption.
+
+ **Boundaries protect the phase from scope creep.** The out-of-scope list with reasoning is as important as the in-scope list. Future phases that touch adjacent areas can point to this SPEC.md to understand what was intentionally excluded.
+
+ **SPEC.md is a one-way door for requirements.** discuss-phase will treat these as locked. If requirements change after SPEC.md is written, the user should update SPEC.md first, then re-run discuss-phase.
+
+ **SPEC.md does NOT replace CONTEXT.md.** They serve different purposes:
+ - SPEC.md: what the phase delivers (requirements, boundaries, acceptance criteria)
+ - CONTEXT.md: how the phase will be implemented (decisions, patterns, tradeoffs)
+
+ discuss-phase generates CONTEXT.md after reading SPEC.md.
+ </guidelines>
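The spec template gives per-dimension minimums and an overall gate of ≤ 0.20, but it does not state how the overall ambiguity figure is derived. In both worked examples the reported value equals one minus the unweighted mean of the four dimension scores (0.12 and 0.15), so the sketch below assumes that formula. Treat it as an inference from the examples, not as the package's documented scoring rule. The names `MINIMUMS`, `ambiguity`, and `report` are hypothetical.

```python
# Per-dimension minimums, copied from the Ambiguity Report table in the template.
MINIMUMS = {
    "Goal Clarity": 0.75,
    "Boundary Clarity": 0.70,
    "Constraint Clarity": 0.65,
    "Acceptance Criteria": 0.70,
}
AMBIGUITY_GATE = 0.20  # "gate: ≤ 0.20" in the file template


def ambiguity(scores: dict[str, float]) -> float:
    """Assumed formula: 1 minus the unweighted mean of the dimension scores."""
    return round(1 - sum(scores.values()) / len(scores), 2)


def report(scores: dict[str, float]) -> dict[str, str]:
    """Map each row of the Ambiguity Report to its status symbol."""
    status = {
        name: "✓" if scores[name] >= minimum else "⚠"
        for name, minimum in MINIMUMS.items()
    }
    # Using ⚠ for a failed overall gate is an assumption; the template only
    # defines ⚠ for a dimension that falls below its minimum.
    status["Ambiguity"] = "✓" if ambiguity(scores) <= AMBIGUITY_GATE else "⚠"
    return status


# Scores taken from the two worked examples in the template.
post_feed = {
    "Goal Clarity": 0.92,
    "Boundary Clarity": 0.95,
    "Constraint Clarity": 0.80,
    "Acceptance Criteria": 0.85,
}
backup_command = {
    "Goal Clarity": 0.90,
    "Boundary Clarity": 0.95,
    "Constraint Clarity": 0.75,
    "Acceptance Criteria": 0.80,
}

print(ambiguity(post_feed), ambiguity(backup_command))
print(report(post_feed))
```

Keeping the per-dimension check separate from the overall gate matches the table layout: a spec can pass the gate overall while one dimension still falls below its minimum and is flagged for the planner.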