@bhargavvc/sdd-cc 1.30.0 → 1.35.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (242)
  1. package/README.ja-JP.md +144 -110
  2. package/README.ko-KR.md +143 -107
  3. package/README.md +183 -112
  4. package/README.pt-BR.md +90 -52
  5. package/README.zh-CN.md +141 -101
  6. package/agents/sdd-advisor-researcher.md +23 -0
  7. package/agents/sdd-ai-researcher.md +133 -0
  8. package/agents/sdd-code-fixer.md +516 -0
  9. package/agents/sdd-code-reviewer.md +355 -0
  10. package/agents/sdd-codebase-mapper.md +3 -3
  11. package/agents/sdd-debugger.md +17 -5
  12. package/agents/sdd-doc-verifier.md +201 -0
  13. package/agents/sdd-doc-writer.md +602 -0
  14. package/agents/sdd-domain-researcher.md +153 -0
  15. package/agents/sdd-eval-auditor.md +164 -0
  16. package/agents/sdd-eval-planner.md +154 -0
  17. package/agents/sdd-executor.md +87 -4
  18. package/agents/sdd-framework-selector.md +160 -0
  19. package/agents/sdd-intel-updater.md +314 -0
  20. package/agents/sdd-nyquist-auditor.md +1 -1
  21. package/agents/sdd-phase-researcher.md +71 -4
  22. package/agents/sdd-plan-checker.md +100 -6
  23. package/agents/sdd-planner.md +145 -206
  24. package/agents/sdd-project-researcher.md +25 -2
  25. package/agents/sdd-research-synthesizer.md +3 -3
  26. package/agents/sdd-roadmapper.md +6 -6
  27. package/agents/sdd-security-auditor.md +128 -0
  28. package/agents/sdd-ui-auditor.md +43 -3
  29. package/agents/sdd-ui-checker.md +5 -5
  30. package/agents/sdd-ui-researcher.md +27 -4
  31. package/agents/sdd-user-profiler.md +2 -2
  32. package/agents/sdd-verifier.md +142 -22
  33. package/bin/install.js +2151 -551
  34. package/commands/sdd/add-backlog.md +5 -5
  35. package/commands/sdd/add-tests.md +2 -2
  36. package/commands/sdd/ai-integration-phase.md +36 -0
  37. package/commands/sdd/analyze-dependencies.md +34 -0
  38. package/commands/sdd/audit-fix.md +33 -0
  39. package/commands/sdd/autonomous.md +7 -2
  40. package/commands/sdd/cleanup.md +5 -0
  41. package/commands/sdd/code-review-fix.md +52 -0
  42. package/commands/sdd/code-review.md +55 -0
  43. package/commands/sdd/complete-milestone.md +6 -6
  44. package/commands/sdd/debug.md +22 -9
  45. package/commands/sdd/discuss-phase.md +7 -2
  46. package/commands/sdd/do.md +1 -1
  47. package/commands/sdd/docs-update.md +48 -0
  48. package/commands/sdd/eval-review.md +32 -0
  49. package/commands/sdd/execute-phase.md +4 -0
  50. package/commands/sdd/explore.md +27 -0
  51. package/commands/sdd/fast.md +2 -2
  52. package/commands/sdd/from-sdd2.md +45 -0
  53. package/commands/sdd/help.md +2 -0
  54. package/commands/sdd/import.md +36 -0
  55. package/commands/sdd/intel.md +179 -0
  56. package/commands/sdd/join-discord.md +2 -1
  57. package/commands/sdd/manager.md +1 -0
  58. package/commands/sdd/map-codebase.md +3 -3
  59. package/commands/sdd/new-milestone.md +1 -1
  60. package/commands/sdd/new-project.md +5 -1
  61. package/commands/sdd/new-workspace.md +1 -1
  62. package/commands/sdd/next.md +2 -0
  63. package/commands/sdd/plan-milestone-gaps.md +2 -2
  64. package/commands/sdd/plan-phase.md +6 -1
  65. package/commands/sdd/plant-seed.md +1 -1
  66. package/commands/sdd/profile-user.md +1 -1
  67. package/commands/sdd/quick.md +5 -3
  68. package/commands/sdd/reapply-patches.md +230 -42
  69. package/commands/sdd/research-phase.md +3 -3
  70. package/commands/sdd/review-backlog.md +1 -0
  71. package/commands/sdd/review.md +6 -3
  72. package/commands/sdd/scan.md +26 -0
  73. package/commands/sdd/secure-phase.md +35 -0
  74. package/commands/sdd/ship.md +1 -1
  75. package/commands/sdd/thread.md +5 -5
  76. package/commands/sdd/undo.md +34 -0
  77. package/commands/sdd/verify-work.md +1 -1
  78. package/commands/sdd/workstreams.md +17 -11
  79. package/hooks/dist/sdd-check-update.js +33 -8
  80. package/hooks/dist/sdd-context-monitor.js +17 -8
  81. package/hooks/dist/sdd-phase-boundary.sh +27 -0
  82. package/hooks/dist/sdd-prompt-guard.js +1 -0
  83. package/hooks/dist/sdd-read-guard.js +82 -0
  84. package/hooks/dist/sdd-session-state.sh +33 -0
  85. package/hooks/dist/sdd-statusline.js +137 -15
  86. package/hooks/dist/sdd-validate-commit.sh +47 -0
  87. package/hooks/dist/sdd-workflow-guard.js +4 -4
  88. package/hooks/sdd-check-update.js +139 -0
  89. package/hooks/sdd-context-monitor.js +165 -0
  90. package/hooks/sdd-phase-boundary.sh +27 -0
  91. package/hooks/sdd-prompt-guard.js +97 -0
  92. package/hooks/sdd-read-guard.js +82 -0
  93. package/hooks/sdd-session-state.sh +33 -0
  94. package/hooks/sdd-statusline.js +241 -0
  95. package/hooks/sdd-validate-commit.sh +47 -0
  96. package/hooks/sdd-workflow-guard.js +94 -0
  97. package/package.json +3 -3
  98. package/scripts/build-hooks.js +18 -7
  99. package/scripts/prompt-injection-scan.sh +1 -0
  100. package/scripts/rebrand-gsd-to-sdd.sh +221 -220
  101. package/scripts/run-tests.cjs +5 -1
  102. package/scripts/sync-upstream.sh +1 -1
  103. package/sdd/bin/lib/commands.cjs +79 -17
  104. package/sdd/bin/lib/config.cjs +90 -48
  105. package/sdd/bin/lib/core.cjs +452 -87
  106. package/sdd/bin/lib/docs.cjs +267 -0
  107. package/sdd/bin/lib/frontmatter.cjs +381 -336
  108. package/sdd/bin/lib/init.cjs +110 -16
  109. package/sdd/bin/lib/intel.cjs +660 -0
  110. package/sdd/bin/lib/learnings.cjs +378 -0
  111. package/sdd/bin/lib/milestone.cjs +42 -11
  112. package/sdd/bin/lib/model-profiles.cjs +17 -15
  113. package/sdd/bin/lib/phase.cjs +367 -288
  114. package/sdd/bin/lib/profile-output.cjs +106 -10
  115. package/sdd/bin/lib/roadmap.cjs +146 -115
  116. package/sdd/bin/lib/schema-detect.cjs +238 -0
  117. package/sdd/bin/lib/sdd2-import.cjs +511 -0
  118. package/sdd/bin/lib/security.cjs +124 -3
  119. package/sdd/bin/lib/state.cjs +648 -264
  120. package/sdd/bin/lib/template.cjs +8 -4
  121. package/sdd/bin/lib/verify.cjs +209 -28
  122. package/sdd/bin/lib/workstream.cjs +7 -3
  123. package/sdd/bin/sdd-tools.cjs +184 -12
  124. package/sdd/contexts/dev.md +21 -0
  125. package/sdd/contexts/research.md +22 -0
  126. package/sdd/contexts/review.md +22 -0
  127. package/sdd/references/agent-contracts.md +79 -0
  128. package/sdd/references/ai-evals.md +156 -0
  129. package/sdd/references/ai-frameworks.md +186 -0
  130. package/sdd/references/artifact-types.md +113 -0
  131. package/sdd/references/common-bug-patterns.md +114 -0
  132. package/sdd/references/context-budget.md +49 -0
  133. package/sdd/references/continuation-format.md +25 -25
  134. package/sdd/references/domain-probes.md +125 -0
  135. package/sdd/references/few-shot-examples/plan-checker.md +73 -0
  136. package/sdd/references/few-shot-examples/verifier.md +109 -0
  137. package/sdd/references/gate-prompts.md +100 -0
  138. package/sdd/references/gates.md +70 -0
  139. package/sdd/references/git-integration.md +1 -1
  140. package/sdd/references/ios-scaffold.md +123 -0
  141. package/sdd/references/model-profile-resolution.md +2 -0
  142. package/sdd/references/model-profiles.md +24 -18
  143. package/sdd/references/planner-gap-closure.md +62 -0
  144. package/sdd/references/planner-reviews.md +39 -0
  145. package/sdd/references/planner-revision.md +87 -0
  146. package/sdd/references/planning-config.md +252 -0
  147. package/sdd/references/revision-loop.md +97 -0
  148. package/sdd/references/thinking-models-debug.md +44 -0
  149. package/sdd/references/thinking-models-execution.md +50 -0
  150. package/sdd/references/thinking-models-planning.md +62 -0
  151. package/sdd/references/thinking-models-research.md +50 -0
  152. package/sdd/references/thinking-models-verification.md +55 -0
  153. package/sdd/references/thinking-partner.md +96 -0
  154. package/sdd/references/ui-brand.md +4 -4
  155. package/sdd/references/universal-anti-patterns.md +63 -0
  156. package/sdd/references/verification-overrides.md +227 -0
  157. package/sdd/references/workstream-flag.md +56 -3
  158. package/sdd/templates/AI-SPEC.md +246 -0
  159. package/sdd/templates/DEBUG.md +1 -1
  160. package/sdd/templates/SECURITY.md +61 -0
  161. package/sdd/templates/UAT.md +4 -4
  162. package/sdd/templates/VALIDATION.md +4 -4
  163. package/sdd/templates/claude-md.md +32 -9
  164. package/sdd/templates/config.json +4 -0
  165. package/sdd/templates/debug-subagent-prompt.md +1 -1
  166. package/sdd/templates/dev-preferences.md +1 -1
  167. package/sdd/templates/discovery.md +2 -2
  168. package/sdd/templates/phase-prompt.md +1 -1
  169. package/sdd/templates/planner-subagent-prompt.md +3 -3
  170. package/sdd/templates/project.md +1 -1
  171. package/sdd/templates/research.md +1 -1
  172. package/sdd/templates/state.md +2 -2
  173. package/sdd/workflows/add-phase.md +8 -8
  174. package/sdd/workflows/add-tests.md +12 -9
  175. package/sdd/workflows/add-todo.md +5 -3
  176. package/sdd/workflows/ai-integration-phase.md +284 -0
  177. package/sdd/workflows/analyze-dependencies.md +96 -0
  178. package/sdd/workflows/audit-fix.md +157 -0
  179. package/sdd/workflows/audit-milestone.md +11 -11
  180. package/sdd/workflows/audit-uat.md +2 -2
  181. package/sdd/workflows/autonomous.md +195 -27
  182. package/sdd/workflows/check-todos.md +12 -10
  183. package/sdd/workflows/cleanup.md +2 -0
  184. package/sdd/workflows/code-review-fix.md +497 -0
  185. package/sdd/workflows/code-review.md +515 -0
  186. package/sdd/workflows/complete-milestone.md +56 -22
  187. package/sdd/workflows/diagnose-issues.md +10 -3
  188. package/sdd/workflows/discovery-phase.md +5 -3
  189. package/sdd/workflows/discuss-phase-assumptions.md +24 -6
  190. package/sdd/workflows/discuss-phase-power.md +291 -0
  191. package/sdd/workflows/discuss-phase.md +173 -21
  192. package/sdd/workflows/do.md +23 -21
  193. package/sdd/workflows/docs-update.md +1155 -0
  194. package/sdd/workflows/eval-review.md +155 -0
  195. package/sdd/workflows/execute-phase.md +594 -38
  196. package/sdd/workflows/execute-plan.md +67 -96
  197. package/sdd/workflows/explore.md +139 -0
  198. package/sdd/workflows/fast.md +5 -5
  199. package/sdd/workflows/forensics.md +2 -2
  200. package/sdd/workflows/health.md +4 -4
  201. package/sdd/workflows/help.md +122 -119
  202. package/sdd/workflows/import.md +276 -0
  203. package/sdd/workflows/inbox.md +387 -0
  204. package/sdd/workflows/insert-phase.md +7 -7
  205. package/sdd/workflows/list-phase-assumptions.md +4 -4
  206. package/sdd/workflows/list-workspaces.md +2 -2
  207. package/sdd/workflows/manager.md +35 -32
  208. package/sdd/workflows/map-codebase.md +7 -5
  209. package/sdd/workflows/milestone-summary.md +2 -2
  210. package/sdd/workflows/new-milestone.md +17 -9
  211. package/sdd/workflows/new-project.md +50 -25
  212. package/sdd/workflows/new-workspace.md +7 -5
  213. package/sdd/workflows/next.md +67 -11
  214. package/sdd/workflows/note.md +9 -7
  215. package/sdd/workflows/pause-work.md +75 -12
  216. package/sdd/workflows/plan-milestone-gaps.md +8 -8
  217. package/sdd/workflows/plan-phase.md +294 -42
  218. package/sdd/workflows/plant-seed.md +6 -3
  219. package/sdd/workflows/pr-branch.md +42 -14
  220. package/sdd/workflows/profile-user.md +9 -7
  221. package/sdd/workflows/progress.md +45 -45
  222. package/sdd/workflows/quick.md +195 -47
  223. package/sdd/workflows/remove-phase.md +6 -6
  224. package/sdd/workflows/remove-workspace.md +3 -1
  225. package/sdd/workflows/research-phase.md +2 -2
  226. package/sdd/workflows/resume-project.md +12 -12
  227. package/sdd/workflows/review.md +109 -9
  228. package/sdd/workflows/scan.md +102 -0
  229. package/sdd/workflows/secure-phase.md +166 -0
  230. package/sdd/workflows/session-report.md +2 -2
  231. package/sdd/workflows/settings.md +38 -12
  232. package/sdd/workflows/ship.md +21 -9
  233. package/sdd/workflows/stats.md +1 -1
  234. package/sdd/workflows/transition.md +23 -23
  235. package/sdd/workflows/ui-phase.md +15 -7
  236. package/sdd/workflows/ui-review.md +29 -4
  237. package/sdd/workflows/undo.md +314 -0
  238. package/sdd/workflows/update.md +171 -20
  239. package/sdd/workflows/validate-phase.md +6 -4
  240. package/sdd/workflows/verify-phase.md +210 -6
  241. package/sdd/workflows/verify-work.md +83 -9
  242. package/sdd/commands/sdd/workstreams.md +0 -63
@@ -0,0 +1,227 @@
1
+ # Verification Overrides
2
+
3
+ Mechanism for intentionally accepting must-have failures when the deviation is known and acceptable. Prevents verification loops on items that will never pass as originally specified.
4
+
5
+ <override_format>
6
+
7
+ ## Override Format
8
+
9
+ Overrides are declared in the VERIFICATION.md frontmatter under an `overrides:` key:
10
+
11
+ ```yaml
12
+ ---
13
+ phase: 03-authentication
14
+ verified: 2026-04-05T12:00:00Z
15
+ status: passed
16
+ score: 5/5
17
+ overrides_applied: 2
18
+ overrides:
19
+   - must_have: "OAuth2 PKCE flow implemented"
20
+     reason: "Using session-based auth instead — PKCE unnecessary for server-rendered app"
21
+     accepted_by: "dave"
22
+     accepted_at: "2026-04-04T15:30:00Z"
23
+   - must_have: "Rate limiting on login endpoint"
24
+     reason: "Deferred to Phase 5 (infrastructure) — tracked in ROADMAP.md"
25
+     accepted_by: "dave"
26
+     accepted_at: "2026-04-04T15:30:00Z"
27
+ ---
28
+ ```
29
+
30
+ ### Required Fields
31
+
32
+ | Field | Type | Description |
33
+ |-------|------|-------------|
34
+ | `must_have` | string | The must-have truth, artifact description, or key link being overridden. Does not need to be an exact match — fuzzy matching applies. |
35
+ | `reason` | string | Why this deviation is acceptable. Must be specific — not just "not needed". |
36
+ | `accepted_by` | string | Who accepted the override (username or role). Required. |
37
+ | `accepted_at` | string | ISO timestamp of when the override was accepted. Required. |
38
+
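The required-field rules in the table above could be checked roughly as follows. This is a minimal sketch, assuming the frontmatter has already been parsed into plain objects. `validateOverride` and `REQUIRED_FIELDS` are hypothetical names for illustration, not functions exposed by sdd-tools, and the sketch does not try to judge whether a `reason` is specific enough.

```javascript
// Hypothetical helper (not part of sdd-tools): validates one parsed entry
// from the `overrides:` array and returns a list of human-readable errors.
const REQUIRED_FIELDS = ['must_have', 'reason', 'accepted_by', 'accepted_at'];

function validateOverride(entry) {
  const errors = [];
  for (const field of REQUIRED_FIELDS) {
    const value = entry == null ? undefined : entry[field];
    if (typeof value !== 'string' || value.trim() === '') {
      errors.push(`${field} is required and must be a non-empty string`);
    }
  }
  // accepted_at is documented as an ISO timestamp; Date.parse yields NaN
  // when the string cannot be parsed as a date.
  const acceptedAt = entry == null ? undefined : entry.accepted_at;
  if (typeof acceptedAt === 'string' && acceptedAt.trim() !== '' &&
      Number.isNaN(Date.parse(acceptedAt))) {
    errors.push('accepted_at must be a parseable ISO timestamp');
  }
  return errors;
}
```

An empty array means the entry carries all four fields; anything else can be reported back to whoever is adding the override.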
39
+ </override_format>
40
+
41
+ ## When to Use
42
+
43
+ Overrides apply when a phase intentionally deviated from the original plan during execution — for example, a requirement was descoped, an alternative approach was chosen, or a dependency changed.
44
+
45
+ Without overrides, the verifier reports these as FAIL even though the deviation was intentional. Overrides let the developer mark specific items as `PASSED (override)` with a documented reason.
46
+
47
+ Overrides are appropriate when:
48
+ - A requirement changed after planning but ROADMAP.md hasn't been updated yet
49
+ - An alternative implementation satisfies the intent but not the literal wording
50
+ - A must-have is deferred to a later phase with explicit tracking
51
+ - External constraints make the original must-have impossible or unnecessary
52
+
53
+ ## When NOT to Use
54
+
55
+ Overrides are NOT appropriate when:
56
+ - The implementation is simply incomplete — fix it instead
57
+ - The must-have is unclear — clarify it instead
58
+ - The developer wants to skip verification — that undermines the process
59
+ - Multiple must-haves are failing for the same phase — if more than 2-3 items need overrides, revisit the plan instead of overriding in bulk
60
+
61
+ <matching_rules>
62
+
63
+ ## Matching Rules
64
+
65
+ Override matching uses **fuzzy matching**, not exact string comparison. This accommodates minor wording differences between how must-haves are phrased in ROADMAP.md, PLAN.md frontmatter, and the override entry.
66
+
67
+ ### Matching Algorithm
68
+
69
+ 1. **Normalize both strings:** lowercase both, strip punctuation, and collapse whitespace, so the comparison is case-insensitive
70
+ 2. **Token overlap:** split into words, compute intersection
71
+ 3. **Match threshold:** 80% token overlap in EITHER direction (override tokens found in must-have, OR must-have tokens found in override)
72
+ 4. **Key noun priority:** nouns and technical terms (file paths, component names, API endpoints) are weighted higher than common words
73
+
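Steps 1 to 3 of the algorithm can be sketched as below. This is an illustration under stated assumptions, not the verifier's actual code: the function names are hypothetical, punctuation is replaced by spaces (so `Chat.tsx` becomes two tokens), and the key-noun weighting from step 4 is left out entirely. Because of that omission, pairs that share only a few key terms will score lower here than the weighted scheme describes.

```javascript
// Step 1: lowercase, replace punctuation with spaces, collapse whitespace.
function normalize(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

// Step 2: split into a set of unique words.
function tokenize(text) {
  const normalized = normalize(text);
  return new Set(normalized === '' ? [] : normalized.split(' '));
}

// Fraction of the tokens in `from` that also appear in `into`.
function overlapRatio(from, into) {
  if (from.size === 0) return 0;
  let shared = 0;
  for (const token of from) {
    if (into.has(token)) shared += 1;
  }
  return shared / from.size;
}

// Step 3: a match when EITHER direction reaches the threshold.
// Step 4 (key-noun weighting) is intentionally not implemented here.
function fuzzyMatch(mustHave, overrideText, threshold = 0.8) {
  const a = tokenize(mustHave);
  const b = tokenize(overrideText);
  const score = Math.max(overlapRatio(a, b), overlapRatio(b, a));
  return { score, matched: score >= threshold };
}
```

Taking the maximum of the two directions is what lets a short override entry match a longer must-have, and vice versa.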
74
+ ### Examples
75
+
76
+ | Must-Have | Override `must_have` | Match? | Reason |
77
+ |-----------|---------------------|--------|--------|
78
+ | "User can authenticate via OAuth2 PKCE" | "OAuth2 PKCE flow implemented" | Yes | Key terms `OAuth2` and `PKCE` overlap, 80% threshold met |
79
+ | "Rate limiting on /api/auth/login" | "Rate limiting on login endpoint" | Yes | `rate limiting` + `login` overlap |
80
+ | "Chat component renders messages" | "OAuth2 PKCE flow implemented" | No | No meaningful token overlap |
81
+ | "src/components/Chat.tsx provides message list" | "Chat.tsx message list rendering" | Yes | `Chat.tsx` + `message` + `list` overlap |
82
+
83
+ ### Ambiguity Resolution
84
+
85
+ If an override matches multiple must-haves, apply it to the **most specific match** (highest token overlap percentage). If still ambiguous, apply to the first match and log a warning.
86
+
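The tie-breaking rule above might look like this in code. A minimal sketch, assuming each candidate has already been scored by the matcher and that the array is in document order; `resolveAmbiguity` and the `{ mustHave, score }` shape are hypothetical.

```javascript
// Picks the must-have an override should apply to.
// candidates: [{ mustHave: string, score: number }] in document order,
// already filtered to entries that met the match threshold.
function resolveAmbiguity(candidates, warn = console.warn) {
  if (candidates.length === 0) return null;
  let best = candidates[0];
  let tied = false;
  for (const candidate of candidates.slice(1)) {
    if (candidate.score > best.score) {
      best = candidate;
      tied = false;
    } else if (candidate.score === best.score) {
      tied = true; // keep the earlier candidate (first match wins)
    }
  }
  if (tied) {
    warn(`Override matches several must-haves equally; applying to "${best.mustHave}"`);
  }
  return best;
}
```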
87
+ </matching_rules>
88
+
89
+ <verifier_behavior>
90
+
91
+ ## Verifier Behavior with Overrides
92
+
93
+ ### Check Order
94
+
95
+ The override check happens **before marking a must-have as FAIL**. The flow is:
96
+
97
+ 1. Evaluate must-have against codebase (Steps 3-5 of verification process)
98
+ 2. If evaluation result is FAIL or UNCERTAIN:
99
+    a. Check `overrides:` array in VERIFICATION.md frontmatter for a fuzzy match
100
+    b. If override found: mark as `PASSED (override)` instead of FAIL
101
+    c. If no override found: mark as FAIL as normal
102
+ 3. If evaluation result is PASS: mark as VERIFIED (overrides are irrelevant)
103
+
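The check order can be expressed as a small decision function. This is a sketch under assumptions: `resolveStatus` is a hypothetical name, the matcher is injected as `isMatch` rather than implemented here, and the failing status is spelled `FAILED` to follow the verification tables.

```javascript
// evaluation: 'PASS' | 'FAIL' | 'UNCERTAIN' (result of checking the codebase)
// overrides:  parsed `overrides:` array from VERIFICATION.md frontmatter
// isMatch:    (mustHave, overrideText) => boolean, the fuzzy matcher
function resolveStatus(evaluation, mustHave, overrides, isMatch) {
  // Step 3: a passing evaluation never consults overrides.
  if (evaluation === 'PASS') {
    return { status: 'VERIFIED', override: null };
  }
  // Step 2: FAIL or UNCERTAIN, so look for an override before failing.
  const override = (overrides || []).find((entry) => isMatch(mustHave, entry.must_have)) || null;
  if (override) {
    return { status: 'PASSED (override)', override };
  }
  return { status: 'FAILED', override: null };
}
```

Returning the matched override alongside the status makes it easy to fill the evidence column with the reason and the person who accepted it.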
104
+ ### Output Format
105
+
106
+ Overridden items appear with distinct status in all verification tables:
107
+
108
+ ```markdown
109
+ | # | Truth | Status | Evidence |
110
+ |---|-------|--------|----------|
111
+ | 1 | User can authenticate | VERIFIED | OAuth session flow working |
112
+ | 2 | OAuth2 PKCE flow | PASSED (override) | Override: Using session-based auth — accepted by dave on 2026-04-04 |
113
+ | 3 | Chat renders messages | FAILED | Component returns placeholder |
114
+ ```
115
+
116
+ The `PASSED (override)` status must be visually distinct from both `VERIFIED` and `FAILED`. In the evidence column, include the override reason and who accepted it.
117
+
118
+ ### Impact on Overall Status
119
+
120
+ - `PASSED (override)` items count toward the passing score, not the failing score
121
+ - A phase with all items either VERIFIED or PASSED (override) can have status `passed`
122
+ - Overrides do NOT suppress `human_needed` items — those still require human testing
123
+
124
+ ### Frontmatter Score
125
+
126
+ The score and override count in frontmatter reflect applied overrides:
127
+
128
+ ```yaml
129
+ score: 5/5 # includes 2 overrides
130
+ overrides_applied: 2
131
+ ```
132
+
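The score and override count could be derived from the per-item statuses like this. A minimal sketch with hypothetical names (`summarize`, `allPassing`); it only counts statuses and does not decide the final phase status, since `human_needed` items still have to be handled separately.

```javascript
// items: [{ status: 'VERIFIED' | 'PASSED (override)' | 'FAILED' }]
function summarize(items) {
  const overridesApplied = items.filter((item) => item.status === 'PASSED (override)').length;
  const verified = items.filter((item) => item.status === 'VERIFIED').length;
  const passing = verified + overridesApplied; // overrides count toward the passing score
  return {
    score: `${passing}/${items.length}`,
    overrides_applied: overridesApplied,
    allPassing: passing === items.length,
  };
}
```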
133
+ </verifier_behavior>
134
+
135
+ <creating_overrides>
136
+
137
+ ## Creating Overrides
138
+
139
+ ### Interactive Override Suggestion
140
+
141
+ When the verifier marks a must-have as FAIL and the failure looks intentional (e.g., alternative implementation exists, or the code explicitly handles the case differently), the verifier should suggest creating an override:
142
+
143
+ ````markdown
144
+ ### F-002: OAuth2 PKCE flow
145
+
146
+ **Status:** FAILED
147
+ **Evidence:** No PKCE implementation found. Session-based auth used instead.
148
+
149
+ **This looks intentional.** The codebase uses session-based authentication which achieves the same goal differently. To accept this deviation, add an override to VERIFICATION.md frontmatter:
150
+
151
+ ```yaml
152
+ overrides:
153
+   - must_have: "OAuth2 PKCE flow implemented"
154
+     reason: "Using session-based auth instead — PKCE unnecessary for server-rendered app"
155
+     accepted_by: "{your name}"
156
+     accepted_at: "{current ISO timestamp}"
157
+ ```
158
+
159
+ Then re-run verification to apply.
160
+ ````
161
+
162
+ ### Override via sdd-tools
163
+
164
+ Overrides can also be managed through the verification workflow:
165
+
166
+ 1. Run `/sdd-verify-work` — verification finds gaps
167
+ 2. Review gaps — determine which are intentional deviations
168
+ 3. Add override entries to VERIFICATION.md frontmatter
169
+ 4. Re-run `/sdd-verify-work` — overrides are applied, remaining gaps shown
170
+
171
+ </creating_overrides>
172
+
173
+ <override_lifecycle>
174
+
175
+ ## Override Lifecycle
176
+
177
+ ### During Re-verification
178
+
179
+ When a phase is re-verified (e.g., after gap closure):
180
+ - Existing overrides carry forward automatically
181
+ - If the underlying code now satisfies the must-have, the override becomes unnecessary — mark as VERIFIED instead
182
+ - Overrides are never removed automatically; they persist as documentation
183
+
184
+ ### At Milestone Completion
185
+
186
+ During `/sdd-audit-milestone`, overrides are surfaced in the audit report:
187
+
188
+ ```
189
+ ### Verification Overrides ({count} across {phase_count} phases)
190
+
191
+ | Phase | Must-Have | Reason | Accepted By |
192
+ |-------|----------|--------|-------------|
193
+ | 03 | OAuth2 PKCE | Session-based auth used instead | dave |
194
+ ```
195
+
196
+ This gives the team visibility into all accepted deviations before closing the milestone.
197
+
198
+ ### Cleanup
199
+
200
+ Stale overrides (where the must-have was later implemented or removed from ROADMAP.md) can be cleaned up during milestone completion. They are informational — leaving them causes no harm.
201
+
202
+ </override_lifecycle>
203
+
204
+ ## Example VERIFICATION.md
205
+
206
+ ```markdown
207
+ ---
208
+ phase: 03-api-layer
209
+ verified: 2026-04-05T12:00:00Z
210
+ status: passed
211
+ score: 3/3
212
+ overrides_applied: 1
213
+ overrides:
214
+   - must_have: "paginated API responses"
215
+     reason: "Descoped — dataset under 100 items, pagination adds complexity without value"
216
+     accepted_by: "dave"
217
+     accepted_at: "2026-04-04T15:30:00Z"
218
+ ---
219
+
220
+ ## Phase 3: API Layer — Verification
221
+
222
+ | # | Truth | Status | Evidence |
223
+ |---|-------|--------|----------|
224
+ | 1 | REST endpoints return JSON | VERIFIED | curl tests confirm |
225
+ | 2 | Paginated API responses | PASSED (override) | Descoped — see override: dataset under 100 items |
226
+ | 3 | Authentication middleware | VERIFIED | JWT validation working |
227
+ ```
@@ -9,8 +9,55 @@ parallel milestone work by multiple Claude Code instances on the same codebase.
9
9
 
10
10
  1. `--ws <name>` flag (explicit, highest priority)
11
11
  2. `SDD_WORKSTREAM` environment variable (per-instance)
12
- 3. `.planning/active-workstream` file (shared, last-writer-wins)
13
- 4. `null` flat mode (no workstreams)
12
+ 3. Session-scoped active workstream pointer in temp storage (per runtime session / terminal)
13
+ 4. `.planning/active-workstream` file (legacy shared fallback when no session key exists)
14
+ 5. `null` — flat mode (no workstreams)
15
+
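The five-level priority in the updated list can be read as a simple first-hit lookup. The sketch below is illustrative only: `resolveWorkstream` and its parameter names are hypothetical, the pointer values are assumed to have been read by the caller, and the legacy file is consulted only when no session key exists, as the list states.

```javascript
// flag:           value of --ws, if given
// env:            environment variables (reads SDD_WORKSTREAM)
// hasSessionKey:  whether a stable session identity was found
// sessionPointer: workstream name stored for this session, if any
// legacyPointer:  contents of .planning/active-workstream, if any
function resolveWorkstream({ flag, env = {}, hasSessionKey = false, sessionPointer, legacyPointer } = {}) {
  const candidates = [
    ['--ws flag', flag],
    ['SDD_WORKSTREAM', env.SDD_WORKSTREAM],
    ['session pointer', hasSessionKey ? sessionPointer : undefined],
    ['legacy file', hasSessionKey ? undefined : legacyPointer],
  ];
  for (const [source, value] of candidates) {
    if (typeof value === 'string' && value.trim() !== '') {
      return { name: value.trim(), source };
    }
  }
  // Nothing set anywhere: flat mode, no workstreams.
  return { name: null, source: 'flat mode' };
}
```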
16
+ ## Why session-scoped pointers exist
17
+
18
+ The shared `.planning/active-workstream` file is fundamentally unsafe when multiple
19
+ Claude/Codex instances are active on the same repo at the same time. One session can
20
+ silently repoint another session's `STATE.md`, `ROADMAP.md`, and phase paths.
21
+
22
+ SDD now prefers a session-scoped pointer keyed by runtime/session identity
23
+ (`SDD_SESSION_KEY`, `CODEX_THREAD_ID`, `CLAUDE_CODE_SSE_PORT`, terminal session IDs,
24
+ or the controlling TTY). This keeps concurrent sessions isolated while preserving
25
+ legacy compatibility for runtimes that do not expose a stable session key.
26
+
27
+ ## Session Identity Resolution
28
+
29
+ When SDD resolves the session-scoped pointer in step 3 above, it uses this order:
30
+
31
+ 1. Explicit runtime/session env vars such as `SDD_SESSION_KEY`, `CODEX_THREAD_ID`,
32
+ `CLAUDE_SESSION_ID`, `CLAUDE_CODE_SSE_PORT`, `OPENCODE_SESSION_ID`,
33
+ `GEMINI_SESSION_ID`, `CURSOR_SESSION_ID`, `WINDSURF_SESSION_ID`,
34
+ `TERM_SESSION_ID`, `WT_SESSION`, `TMUX_PANE`, and `ZELLIJ_SESSION_NAME`
35
+ 2. `TTY` or `SSH_TTY` if the shell/runtime already exposes the terminal path
36
+ 3. A single best-effort `tty` probe, but only when stdin is interactive
37
+
38
+ If none of those produce a stable identity, SDD does not keep probing. It falls
39
+ back directly to the legacy shared `.planning/active-workstream` file.
40
+
41
+ This matters in headless or stripped environments: when stdin is already
42
+ non-interactive, SDD intentionally skips shelling out to `tty` because that path
43
+ cannot discover a stable session identity and only adds avoidable failures on the
44
+ routing hot path.
45
+
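The identity resolution order, including the rule that the `tty` probe is skipped when stdin is not interactive, might be sketched like this. The function names and the return shape are hypothetical, and treating the position of each variable in the array as its priority is an assumption; the text only lists the variables.

```javascript
const { execFileSync } = require('node:child_process');

// Env vars named in step 1, in the order they are listed.
const SESSION_ENV_VARS = [
  'SDD_SESSION_KEY', 'CODEX_THREAD_ID', 'CLAUDE_SESSION_ID', 'CLAUDE_CODE_SSE_PORT',
  'OPENCODE_SESSION_ID', 'GEMINI_SESSION_ID', 'CURSOR_SESSION_ID', 'WINDSURF_SESSION_ID',
  'TERM_SESSION_ID', 'WT_SESSION', 'TMUX_PANE', 'ZELLIJ_SESSION_NAME',
];

// Single best-effort probe: any failure simply means "no identity".
function probeTty() {
  try {
    const out = execFileSync('tty', [], {
      stdio: ['inherit', 'pipe', 'ignore'],
      encoding: 'utf8',
    }).trim();
    return out !== '' && out !== 'not a tty' ? out : null;
  } catch {
    return null;
  }
}

function resolveSessionIdentity(
  env = process.env,
  { interactive = Boolean(process.stdin.isTTY), probe = probeTty } = {}
) {
  // 1. Explicit runtime/session env vars.
  for (const name of SESSION_ENV_VARS) {
    if (env[name]) return { source: name, value: env[name] };
  }
  // 2. Terminal path already exposed by the shell/runtime.
  for (const name of ['TTY', 'SSH_TTY']) {
    if (env[name]) return { source: name, value: env[name] };
  }
  // 3. Probe `tty` once, and only when stdin is interactive.
  if (interactive) {
    const tty = probe();
    if (tty) return { source: 'tty', value: tty };
  }
  // No stable identity: the caller falls back to the legacy shared file.
  return null;
}
```

Injecting `probe` and `interactive` keeps the function testable in headless environments, where no child process is spawned at all.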
46
+ ## Pointer Lifecycle
47
+
48
+ Session-scoped pointers are intentionally lightweight and best-effort:
49
+
50
+ - Clearing a workstream for one session removes only that session's pointer file
51
+ - If that was the last pointer for the repo, SDD also removes the now-empty
52
+ per-project temp directory
53
+ - If sibling session pointers still exist, the temp directory is left in place
54
+ - When a pointer refers to a workstream directory that no longer exists, SDD
55
+ treats it as stale state: it removes that pointer file and resolves to `null`
56
+ until the session explicitly sets a new active workstream again
57
+
58
+ SDD does not currently run a background garbage collector for historical temp
59
+ directories. Cleanup is opportunistic: it happens only when a pointer is cleared or self-healed,
60
+ and broader temp hygiene is left to OS temp cleanup or future maintenance work.
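The lifecycle rules above can be illustrated with plain file operations. This is a sketch under assumptions, not SDD's implementation: it assumes one pointer file per session inside a per-project temp directory, that the session key is safe to use as a file name, and that the pointer file contains just the workstream name. All function and parameter names are hypothetical.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Removes one session's pointer, then the per-project temp directory
// if no sibling pointers remain. Failures are ignored (best-effort).
function clearSessionPointer(pointerDir, sessionKey) {
  fs.rmSync(path.join(pointerDir, sessionKey), { force: true });
  try {
    if (fs.readdirSync(pointerDir).length === 0) {
      fs.rmdirSync(pointerDir);
    }
  } catch {
    // Directory already gone or not removable: nothing more to do.
  }
}

// Returns the active workstream for this session, or null.
// A pointer to a missing workstream directory is stale: remove it.
function readSessionPointer(pointerDir, sessionKey, workstreamsDir) {
  let name;
  try {
    name = fs.readFileSync(path.join(pointerDir, sessionKey), 'utf8').trim();
  } catch {
    return null; // no pointer for this session
  }
  if (name !== '' && fs.existsSync(path.join(workstreamsDir, name))) {
    return name;
  }
  clearSessionPointer(pointerDir, sessionKey); // self-heal stale state
  return null;
}
```

Nothing here scans for old directories left by other projects, which mirrors the note that no background garbage collector runs.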
14
61
 
15
62
  ## Routing Propagation
16
63
 
@@ -29,7 +76,7 @@ This ensures workstream scope chains automatically through the workflow:
29
76
  ├── config.json # Shared
30
77
  ├── milestones/ # Shared
31
78
  ├── codebase/ # Shared
32
- ├── active-workstream # Points to current ws
79
+ ├── active-workstream # Legacy shared fallback only
33
80
  └── workstreams/
34
81
  ├── feature-a/ # Workstream A
35
82
  │ ├── STATE.md
@@ -50,6 +97,12 @@ This ensures workstream scope chains automatically through the workflow:
50
97
  node sdd-tools.cjs state json --ws feature-a
51
98
  node sdd-tools.cjs find-phase 3 --ws feature-b
52
99
 
100
+ # Session-local switching without --ws on every command
101
+ SDD_SESSION_KEY=my-terminal-a node sdd-tools.cjs workstream set feature-a
102
+ SDD_SESSION_KEY=my-terminal-a node sdd-tools.cjs state json
103
+ SDD_SESSION_KEY=my-terminal-b node sdd-tools.cjs workstream set feature-b
104
+ SDD_SESSION_KEY=my-terminal-b node sdd-tools.cjs state json
105
+
53
106
  # Workstream CRUD
54
107
  node sdd-tools.cjs workstream create <name>
55
108
  node sdd-tools.cjs workstream list
@@ -0,0 +1,246 @@
1
+ # AI-SPEC — Phase {N}: {phase_name}
2
+
3
+ > AI design contract generated by `/sdd-ai-integration-phase`. Consumed by `sdd-planner` and `sdd-eval-auditor`.
4
+ > Locks framework selection, implementation guidance, and evaluation strategy before planning begins.
5
+
6
+ ---
7
+
8
+ ## 1. System Classification
9
+
10
+ **System Type:** <!-- RAG | Multi-Agent | Conversational | Extraction | Autonomous Agent | Content Generation | Code Automation | Hybrid -->
11
+
12
+ **Description:**
13
+ <!-- One-paragraph description of what this AI system does, who uses it, and what "good" looks like -->
14
+
15
+ **Critical Failure Modes:**
16
+ <!-- The 3-5 behaviors that absolutely cannot go wrong in this system -->
17
+ 1.
18
+ 2.
19
+ 3.
20
+
21
+ ---
22
+
23
+ ## 1b. Domain Context
24
+
25
+ > Researched by `sdd-domain-researcher`. Grounds the evaluation strategy in domain expert knowledge.
26
+
27
+ **Industry Vertical:** <!-- healthcare | legal | finance | customer service | education | developer tooling | e-commerce | etc. -->
28
+
29
+ **User Population:** <!-- who uses this system and in what context -->
30
+
31
+ **Stakes Level:** <!-- Low | Medium | High | Critical -->
32
+
33
+ **Output Consequence:** <!-- what happens downstream when the AI output is acted on -->
34
+
35
+ ### What Domain Experts Evaluate Against
36
+
37
+ <!-- Domain-specific rubric ingredients — in practitioner language, not AI jargon -->
38
+ <!-- Format: Dimension / Good (expert accepts) / Bad (expert flags) / Stakes / Source -->
39
+
40
+ ### Known Failure Modes in This Domain
41
+
42
+ <!-- Domain-specific failure modes from research — not generic hallucination, but how it manifests here -->
43
+
44
+ ### Regulatory / Compliance Context
45
+
46
+ <!-- Relevant regulations or constraints — or "None identified" if genuinely none apply -->
47
+
48
+ ### Domain Expert Roles for Evaluation
49
+
50
+ | Role | Responsibility |
51
+ |------|---------------|
52
+ | <!-- e.g., Senior practitioner --> | <!-- Dataset labeling / rubric calibration / production sampling --> |
53
+
54
+ ---
55
+
56
+ ## 2. Framework Decision
57
+
58
+ **Selected Framework:** <!-- e.g., LlamaIndex v0.10.x -->
59
+
60
+ **Version:** <!-- Pin the version -->
61
+
62
+ **Rationale:**
63
+ <!-- Why this framework fits this system type, team context, and production requirements -->
64
+
65
+ **Alternatives Considered:**
66
+
67
+ | Framework | Ruled Out Because |
68
+ |-----------|------------------|
69
+ | | |
70
+
71
+ **Vendor Lock-In Accepted:** <!-- Yes / No / Partial — document the trade-off consciously -->
72
+
73
+ ---
74
+
75
+ ## 3. Framework Quick Reference
76
+
77
+ > Fetched from official docs by `sdd-ai-researcher`. Distilled for this specific use case.
78
+
79
+ ### Installation
80
+ ```bash
81
+ # Install command(s)
82
+ ```
83
+
84
+ ### Core Imports
85
+ ```python
86
+ # Key imports for this use case
87
+ ```
88
+
89
+ ### Entry Point Pattern
90
+ ```python
91
+ # Minimal working example for this system type
92
+ ```
93
+
94
+ ### Key Abstractions
95
+ <!-- Framework-specific concepts the developer must understand before coding -->
96
+ | Concept | What It Is | When You Use It |
97
+ |---------|-----------|-----------------|
98
+ | | | |
99
+
100
+ ### Common Pitfalls
101
+ <!-- Gotchas specific to this framework and system type — from docs, issues, and community reports -->
102
+ 1.
103
+ 2.
104
+ 3.
105
+
106
+ ### Recommended Project Structure
107
+ ```
108
+ project/
109
+ ├── # Framework-specific folder layout
110
+ ```
111
+
112
+ ---
113
+
114
+ ## 4. Implementation Guidance
115
+
116
+ **Model Configuration:**
117
+ <!-- Which model(s), temperature, max tokens, and other key parameters -->
118
+
119
+ **Core Pattern:**
120
+ <!-- The primary implementation pattern for this system type in this framework -->
121
+
122
+ **Tool Use:**
123
+ <!-- Tools/integrations needed and how to configure them -->
124
+
125
+ **State Management:**
126
+ <!-- How state is persisted, retrieved, and updated -->
127
+
128
+ **Context Window Strategy:**
129
+ <!-- How to manage context limits for this system type -->
130
+
131
+ ---
132
+
133
+ ## 4b. AI Systems Best Practices
134
+
135
+ > Written by `sdd-ai-researcher`. Cross-cutting patterns every developer building AI systems needs — independent of framework choice.
136
+
137
+ ### Structured Outputs with Pydantic
138
+
139
+ <!-- Framework-specific Pydantic integration pattern for this use case -->
140
+ <!-- Include: output model definition, how the framework uses it, retry logic on validation failure -->
141
+
142
+ ```python
143
+ # Pydantic output model for this system type
144
+ ```
145
+
146
+ ### Async-First Design
147
+
148
+ <!-- How async is handled in this framework, the one common mistake, and when to stream vs. await -->
149
+
150
+ ### Prompt Engineering Discipline
151
+
152
+ <!-- System vs. user prompt separation, few-shot guidance, token budget strategy -->
153
+
154
+ ### Context Window Management
155
+
156
+ <!-- Strategy specific to this system type: RAG chunking / conversation summarisation / agent compaction -->
157
+
158
+ ### Cost and Latency Budget
159
+
160
+ <!-- Per-call cost estimate, caching strategy, sub-task model routing -->
161
+
162
+ ---
163
+
164
+ ## 5. Evaluation Strategy
165
+
166
+ ### Dimensions
167
+
168
+ | Dimension | Rubric (Pass/Fail or 1-5) | Measurement Approach | Priority |
169
+ |-----------|--------------------------|---------------------|----------|
170
+ | | | Code / LLM Judge / Human | Critical / High / Medium |
171
+
172
+ ### Eval Tooling
173
+
174
+ **Primary Tool:** <!-- e.g., RAGAS + Langfuse -->
175
+
176
+ **Setup:**
177
+ ```bash
178
+ # Install and configure
179
+ ```
180
+
181
+ **CI/CD Integration:**
182
+ ```bash
183
+ # Command to run evals in CI/CD pipeline
184
+ ```
185
+
186
+ ### Reference Dataset
187
+
188
+ **Size:** <!-- e.g., 20 examples to start -->
189
+
190
+ **Composition:**
191
+ <!-- What scenario types the dataset covers: critical paths, edge cases, failure modes -->
192
+
193
+ **Labeling:**
194
+ <!-- Who labels examples and how (domain expert, LLM judge with calibration, etc.) -->
195
+
196
+ ---
197
+
198
+ ## 6. Guardrails
199
+
200
+ ### Online (Real-Time)
201
+
202
+ | Guardrail | Trigger | Intervention |
203
+ |-----------|---------|--------------|
204
+ | | | Block / Escalate / Flag |
205
+
206
+ ### Offline (Flywheel)
207
+
208
+ | Metric | Sampling Strategy | Action on Degradation |
209
+ |--------|------------------|----------------------|
210
+ | | | |
211
+
212
+ ---
213
+
214
+ ## 7. Production Monitoring
215
+
216
+ **Tracing Tool:** <!-- e.g., Langfuse self-hosted -->
217
+
218
+ **Key Metrics to Track:**
219
+ <!-- 3-5 metrics that will be monitored in production -->
220
+
221
+ **Alert Thresholds:**
222
+ <!-- When to page/alert -->
223
+
224
+ **Smart Sampling Strategy:**
225
+ <!-- How to select interactions for human review — signal-based filters -->
226
+
227
+ ---
228
+
229
+ ## Checklist
230
+
231
+ - [ ] System type classified
232
+ - [ ] Critical failure modes identified (≥ 3)
233
+ - [ ] Domain context researched (Section 1b: vertical, stakes, expert criteria, failure modes)
234
+ - [ ] Regulatory/compliance context identified or explicitly noted as none
235
+ - [ ] Domain expert roles defined for evaluation involvement
236
+ - [ ] Framework selected with rationale documented
237
+ - [ ] Alternatives considered and ruled out
238
+ - [ ] Framework quick reference written (install, imports, pattern, pitfalls)
239
+ - [ ] AI systems best practices written (Section 4b: Pydantic, async, prompt discipline, context)
240
+ - [ ] Evaluation dimensions grounded in domain rubric ingredients
241
+ - [ ] Each eval dimension has a concrete rubric (Good/Bad in domain language)
242
+ - [ ] Eval tooling selected — Arize Phoenix default confirmed or override noted
243
+ - [ ] Reference dataset spec written (size ≥ 10, composition + labeling defined)
244
+ - [ ] CI/CD eval integration specified
245
+ - [ ] Online guardrails defined
246
+ - [ ] Production monitoring configured (tracing tool + sampling strategy)
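The checklist that closes this template uses standard Markdown task-list syntax, so its completion state can be read mechanically. The sketch below is one possible way to list the items still unchecked in a filled-in spec; the function names and the sample text are illustrative and are not part of the package.

```python
import re
from typing import List, Tuple

# A task-list line: "- [ ] text" or "- [x] text" (also "*" bullets).
CHECKBOX = re.compile(r"^\s*[-*] \[( |x|X)\] (.+)$")


def checklist_items(markdown: str) -> List[Tuple[bool, str]]:
    """Return (is_checked, text) for every task-list line in the document."""
    items = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match:
            items.append((match.group(1).lower() == "x", match.group(2).strip()))
    return items


def unchecked(markdown: str) -> List[str]:
    """Return the text of every item that is still open."""
    return [text for done, text in checklist_items(markdown) if not done]


if __name__ == "__main__":
    sample = """## Checklist

- [x] System type classified
- [ ] Framework selected with rationale documented
- [ ] Online guardrails defined
"""
    for text in unchecked(sample):
        print("open:", text)
```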
@@ -99,7 +99,7 @@ files_changed: []
99
99
 
100
100
  <lifecycle>
101
101
 
102
- **Creation:** Immediately when /sdd:debug is called
102
+ **Creation:** Immediately when /sdd-debug is called
103
103
  - Create file with trigger from user input
104
104
  - Set status to "gathering"
105
105
  - Current Focus: next_action = "gather symptoms"
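This hunk renames `/sdd:debug` to `/sdd-debug`, and later hunks apply the same colon-to-hyphen change to `/sdd:plan-phase` and `/sdd:verify-work`. For project notes that still mention the old spelling, a rewrite along the following lines may help. It is a sketch that assumes command names consist of lowercase words joined by hyphens, as in the examples visible in this diff; `migrate_commands` is a hypothetical helper, not something the package ships.

```python
import re

# Matches "/sdd:<name>", where <name> is lowercase words joined by hyphens
# (an assumption based on the command names visible in this diff).
OLD_COMMAND = re.compile(r"/sdd:([a-z][a-z0-9]*(?:-[a-z0-9]+)*)")


def migrate_commands(text: str) -> str:
    """Rewrite colon-style command references to the hyphen style."""
    return OLD_COMMAND.sub(r"/sdd-\1", text)


if __name__ == "__main__":
    print(migrate_commands("Ready for /sdd:plan-phase --gaps with root causes"))
```

Text that already uses the hyphen form contains no `/sdd:` prefix, so running the function a second time leaves it unchanged.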
@@ -0,0 +1,61 @@
1
+ ---
2
+ phase: {N}
3
+ slug: {phase-slug}
4
+ status: draft
5
+ threats_open: 0
6
+ asvs_level: 1
7
+ created: {date}
8
+ ---
9
+
10
+ # Phase {N} — Security
11
+
12
+ > Per-phase security contract: threat register, accepted risks, and audit trail.
13
+
14
+ ---
15
+
16
+ ## Trust Boundaries
17
+
18
+ | Boundary | Description | Data Crossing |
19
+ |----------|-------------|---------------|
20
+ | {boundary} | {description} | {data type / sensitivity} |
21
+
22
+ ---
23
+
24
+ ## Threat Register
25
+
26
+ | Threat ID | Category | Component | Disposition | Mitigation | Status |
27
+ |-----------|----------|-----------|-------------|------------|--------|
28
+ | T-{N}-01 | {STRIDE category} | {component} | {mitigate / accept / transfer} | {control or reference} | open |
29
+
30
+ *Status: open · closed*
31
+ *Disposition: mitigate (implementation required) · accept (documented risk) · transfer (third-party)*
32
+
33
+ ---
34
+
35
+ ## Accepted Risks Log
36
+
37
+ | Risk ID | Threat Ref | Rationale | Accepted By | Date |
38
+ |---------|------------|-----------|-------------|------|
39
+
40
+ *Accepted risks do not resurface in future audit runs.*
41
+
42
+ *If none: "No accepted risks."*
43
+
44
+ ---
45
+
46
+ ## Security Audit Trail
47
+
48
+ | Audit Date | Threats Total | Closed | Open | Run By |
49
+ |------------|---------------|--------|------|--------|
50
+ | {YYYY-MM-DD} | {N} | {N} | {N} | {name / agent} |
51
+
52
+ ---
53
+
54
+ ## Sign-Off
55
+
56
+ - [ ] All threats have a disposition (mitigate / accept / transfer)
57
+ - [ ] Accepted risks documented in Accepted Risks Log
58
+ - [ ] `threats_open: 0` confirmed
59
+ - [ ] `status: verified` set in frontmatter
60
+
61
+ **Approval:** {pending / verified YYYY-MM-DD}
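The sign-off list in this template asks for `threats_open: 0` to be confirmed against the Threat Register. One way that cross-check could be automated is sketched below. It assumes the six-column register layout and the `T-{N}-NN` ID pattern shown in the template, and it does not handle cells that contain a literal `|`. The helper names and the sample rows are illustrative only.

```python
import re
from typing import List, Optional


def count_open_threats(markdown: str) -> int:
    """Count Threat Register rows whose Status column is 'open'."""
    open_rows = 0
    for line in markdown.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Register rows have six columns and start with an ID such as T-3-01.
        if len(cells) == 6 and re.fullmatch(r"T-\d+-\d+", cells[0]):
            if cells[5].lower() == "open":
                open_rows += 1
    return open_rows


def frontmatter_value(markdown: str, key: str) -> Optional[str]:
    """Read a scalar value from the leading '---' frontmatter block."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break
        name, sep, value = line.partition(":")
        if sep and name.strip() == key:
            return value.strip()
    return None


def sign_off_problems(markdown: str) -> List[str]:
    """Return human-readable inconsistencies between frontmatter and register."""
    problems = []
    declared = frontmatter_value(markdown, "threats_open")
    actual = count_open_threats(markdown)
    if declared is None or not declared.isdigit():
        problems.append("threats_open is missing or not a non-negative integer")
    elif int(declared) != actual:
        problems.append(f"threats_open says {declared}, register has {actual} open")
    if frontmatter_value(markdown, "status") == "verified" and actual > 0:
        problems.append("status is verified while threats remain open")
    return problems


if __name__ == "__main__":
    # Illustrative document; the threat rows are made up for the example.
    sample = """---
phase: 3
status: draft
threats_open: 0
---

| Threat ID | Category | Component | Disposition | Mitigation | Status |
|-----------|----------|-----------|-------------|------------|--------|
| T-3-01 | Spoofing | login API | mitigate | rate limiting | closed |
| T-3-02 | Tampering | upload handler | mitigate | checksum validation | open |
"""
    for problem in sign_off_problems(sample):
        print(problem)
```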
@@ -106,7 +106,7 @@ blocked: [N]
106
106
  **Gaps:**
107
107
  - APPEND only when issue found (YAML format)
108
108
  - After diagnosis: fill `root_cause`, `artifacts`, `missing`, `debug_session`
109
- - This section feeds directly into /sdd:plan-phase --gaps
109
+ - This section feeds directly into /sdd-plan-phase --gaps
110
110
 
111
111
  </section_rules>
112
112
 
@@ -120,7 +120,7 @@ blocked: [N]
120
120
  4. UAT.md Gaps section updated with diagnosis:
121
121
  - Each gap gets `root_cause`, `artifacts`, `missing`, `debug_session` filled
122
122
  5. status → "diagnosed"
123
- 6. Ready for /sdd:plan-phase --gaps with root causes
123
+ 6. Ready for /sdd-plan-phase --gaps with root causes
124
124
 
125
125
  **After diagnosis:**
126
126
  ```yaml
@@ -144,7 +144,7 @@ blocked: [N]
144
144
 
145
145
  <lifecycle>
146
146
 
147
- **Creation:** When /sdd:verify-work starts new session
147
+ **Creation:** When /sdd-verify-work starts new session
148
148
  - Extract tests from SUMMARY.md files
149
149
  - Set status to "testing"
150
150
  - Current Test points to test 1
@@ -171,7 +171,7 @@ blocked: [N]
171
171
  - Present summary with outstanding items highlighted
172
172
 
173
173
  **Resuming partial session:**
174
- - `/sdd:verify-work {phase}` picks up from first pending/blocked test
174
+ - `/sdd-verify-work {phase}` picks up from first pending/blocked test
175
175
  - When all items resolved, status advances to "complete"
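The resume rule above (continue from the first pending or blocked test, and advance to "complete" once nothing is left) can be expressed in a few lines. This is a sketch over a plain list of result strings; only `pending`, `blocked`, `testing`, and `complete` come from the template, while the function names and any other result labels are hypothetical.

```python
from typing import List, Optional

UNRESOLVED = ("pending", "blocked")


def next_test_index(results: List[str]) -> Optional[int]:
    """Return the zero-based index of the first pending or blocked test, if any."""
    for index, result in enumerate(results):
        if result in UNRESOLVED:
            return index
    return None


def session_status(results: List[str]) -> str:
    """Report 'complete' once no test is pending or blocked, else 'testing'."""
    return "complete" if next_test_index(results) is None else "testing"


if __name__ == "__main__":
    # "pass" is a placeholder label for a resolved test.
    results = ["pass", "blocked", "pending"]
    print(next_test_index(results), session_status(results))
```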
176
176
 
177
177
  **Resume after /clear:**
@@ -29,16 +29,16 @@ created: {date}
29
29
 
30
30
  - **After every task commit:** Run `{quick run command}`
31
31
  - **After every plan wave:** Run `{full suite command}`
32
- - **Before `/sdd:verify-work`:** Full suite must be green
32
+ - **Before `/sdd-verify-work`:** Full suite must be green
33
33
  - **Max feedback latency:** {N} seconds
34
34
 
35
35
  ---
36
36
 
37
37
  ## Per-Task Verification Map
38
38
 
39
- | Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
40
- |---------|------|------|-------------|-----------|-------------------|-------------|--------|
41
- | {N}-01-01 | 01 | 1 | REQ-{XX} | unit | `{command}` | ✅ / ❌ W0 | ⬜ pending |
39
+ | Task ID | Plan | Wave | Requirement | Threat Ref | Secure Behavior | Test Type | Automated Command | File Exists | Status |
40
+ |---------|------|------|-------------|------------|-----------------|-----------|-------------------|-------------|--------|
41
+ | {N}-01-01 | 01 | 1 | REQ-{XX} | T-{N}-01 / — | {expected secure behavior or "N/A"} | unit | `{command}` | ✅ / ❌ W0 | ⬜ pending |
42
42
 
43
43
  *Status: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky*
44
44