npm - @bhargavvc/sdd-cc - Versions diffs - 1.30.0 → 1.35.0 - Mend

@bhargavvc/sdd-cc 1.30.0 → 1.35.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (242)

package/README.ja-JP.md +144 -110
package/README.ko-KR.md +143 -107
package/README.md +183 -112
package/README.pt-BR.md +90 -52
package/README.zh-CN.md +141 -101
package/agents/sdd-advisor-researcher.md +23 -0
package/agents/sdd-ai-researcher.md +133 -0
package/agents/sdd-code-fixer.md +516 -0
package/agents/sdd-code-reviewer.md +355 -0
package/agents/sdd-codebase-mapper.md +3 -3
package/agents/sdd-debugger.md +17 -5
package/agents/sdd-doc-verifier.md +201 -0
package/agents/sdd-doc-writer.md +602 -0
package/agents/sdd-domain-researcher.md +153 -0
package/agents/sdd-eval-auditor.md +164 -0
package/agents/sdd-eval-planner.md +154 -0
package/agents/sdd-executor.md +87 -4
package/agents/sdd-framework-selector.md +160 -0
package/agents/sdd-intel-updater.md +314 -0
package/agents/sdd-nyquist-auditor.md +1 -1
package/agents/sdd-phase-researcher.md +71 -4
package/agents/sdd-plan-checker.md +100 -6
package/agents/sdd-planner.md +145 -206
package/agents/sdd-project-researcher.md +25 -2
package/agents/sdd-research-synthesizer.md +3 -3
package/agents/sdd-roadmapper.md +6 -6
package/agents/sdd-security-auditor.md +128 -0
package/agents/sdd-ui-auditor.md +43 -3
package/agents/sdd-ui-checker.md +5 -5
package/agents/sdd-ui-researcher.md +27 -4
package/agents/sdd-user-profiler.md +2 -2
package/agents/sdd-verifier.md +142 -22
package/bin/install.js +2151 -551
package/commands/sdd/add-backlog.md +5 -5
package/commands/sdd/add-tests.md +2 -2
package/commands/sdd/ai-integration-phase.md +36 -0
package/commands/sdd/analyze-dependencies.md +34 -0
package/commands/sdd/audit-fix.md +33 -0
package/commands/sdd/autonomous.md +7 -2
package/commands/sdd/cleanup.md +5 -0
package/commands/sdd/code-review-fix.md +52 -0
package/commands/sdd/code-review.md +55 -0
package/commands/sdd/complete-milestone.md +6 -6
package/commands/sdd/debug.md +22 -9
package/commands/sdd/discuss-phase.md +7 -2
package/commands/sdd/do.md +1 -1
package/commands/sdd/docs-update.md +48 -0
package/commands/sdd/eval-review.md +32 -0
package/commands/sdd/execute-phase.md +4 -0
package/commands/sdd/explore.md +27 -0
package/commands/sdd/fast.md +2 -2
package/commands/sdd/from-sdd2.md +45 -0
package/commands/sdd/help.md +2 -0
package/commands/sdd/import.md +36 -0
package/commands/sdd/intel.md +179 -0
package/commands/sdd/join-discord.md +2 -1
package/commands/sdd/manager.md +1 -0
package/commands/sdd/map-codebase.md +3 -3
package/commands/sdd/new-milestone.md +1 -1
package/commands/sdd/new-project.md +5 -1
package/commands/sdd/new-workspace.md +1 -1
package/commands/sdd/next.md +2 -0
package/commands/sdd/plan-milestone-gaps.md +2 -2
package/commands/sdd/plan-phase.md +6 -1
package/commands/sdd/plant-seed.md +1 -1
package/commands/sdd/profile-user.md +1 -1
package/commands/sdd/quick.md +5 -3
package/commands/sdd/reapply-patches.md +230 -42
package/commands/sdd/research-phase.md +3 -3
package/commands/sdd/review-backlog.md +1 -0
package/commands/sdd/review.md +6 -3
package/commands/sdd/scan.md +26 -0
package/commands/sdd/secure-phase.md +35 -0
package/commands/sdd/ship.md +1 -1
package/commands/sdd/thread.md +5 -5
package/commands/sdd/undo.md +34 -0
package/commands/sdd/verify-work.md +1 -1
package/commands/sdd/workstreams.md +17 -11
package/hooks/dist/sdd-check-update.js +33 -8
package/hooks/dist/sdd-context-monitor.js +17 -8
package/hooks/dist/sdd-phase-boundary.sh +27 -0
package/hooks/dist/sdd-prompt-guard.js +1 -0
package/hooks/dist/sdd-read-guard.js +82 -0
package/hooks/dist/sdd-session-state.sh +33 -0
package/hooks/dist/sdd-statusline.js +137 -15
package/hooks/dist/sdd-validate-commit.sh +47 -0
package/hooks/dist/sdd-workflow-guard.js +4 -4
package/hooks/sdd-check-update.js +139 -0
package/hooks/sdd-context-monitor.js +165 -0
package/hooks/sdd-phase-boundary.sh +27 -0
package/hooks/sdd-prompt-guard.js +97 -0
package/hooks/sdd-read-guard.js +82 -0
package/hooks/sdd-session-state.sh +33 -0
package/hooks/sdd-statusline.js +241 -0
package/hooks/sdd-validate-commit.sh +47 -0
package/hooks/sdd-workflow-guard.js +94 -0
package/package.json +3 -3
package/scripts/build-hooks.js +18 -7
package/scripts/prompt-injection-scan.sh +1 -0
package/scripts/rebrand-gsd-to-sdd.sh +221 -220
package/scripts/run-tests.cjs +5 -1
package/scripts/sync-upstream.sh +1 -1
package/sdd/bin/lib/commands.cjs +79 -17
package/sdd/bin/lib/config.cjs +90 -48
package/sdd/bin/lib/core.cjs +452 -87
package/sdd/bin/lib/docs.cjs +267 -0
package/sdd/bin/lib/frontmatter.cjs +381 -336
package/sdd/bin/lib/init.cjs +110 -16
package/sdd/bin/lib/intel.cjs +660 -0
package/sdd/bin/lib/learnings.cjs +378 -0
package/sdd/bin/lib/milestone.cjs +42 -11
package/sdd/bin/lib/model-profiles.cjs +17 -15
package/sdd/bin/lib/phase.cjs +367 -288
package/sdd/bin/lib/profile-output.cjs +106 -10
package/sdd/bin/lib/roadmap.cjs +146 -115
package/sdd/bin/lib/schema-detect.cjs +238 -0
package/sdd/bin/lib/sdd2-import.cjs +511 -0
package/sdd/bin/lib/security.cjs +124 -3
package/sdd/bin/lib/state.cjs +648 -264
package/sdd/bin/lib/template.cjs +8 -4
package/sdd/bin/lib/verify.cjs +209 -28
package/sdd/bin/lib/workstream.cjs +7 -3
package/sdd/bin/sdd-tools.cjs +184 -12
package/sdd/contexts/dev.md +21 -0
package/sdd/contexts/research.md +22 -0
package/sdd/contexts/review.md +22 -0
package/sdd/references/agent-contracts.md +79 -0
package/sdd/references/ai-evals.md +156 -0
package/sdd/references/ai-frameworks.md +186 -0
package/sdd/references/artifact-types.md +113 -0
package/sdd/references/common-bug-patterns.md +114 -0
package/sdd/references/context-budget.md +49 -0
package/sdd/references/continuation-format.md +25 -25
package/sdd/references/domain-probes.md +125 -0
package/sdd/references/few-shot-examples/plan-checker.md +73 -0
package/sdd/references/few-shot-examples/verifier.md +109 -0
package/sdd/references/gate-prompts.md +100 -0
package/sdd/references/gates.md +70 -0
package/sdd/references/git-integration.md +1 -1
package/sdd/references/ios-scaffold.md +123 -0
package/sdd/references/model-profile-resolution.md +2 -0
package/sdd/references/model-profiles.md +24 -18
package/sdd/references/planner-gap-closure.md +62 -0
package/sdd/references/planner-reviews.md +39 -0
package/sdd/references/planner-revision.md +87 -0
package/sdd/references/planning-config.md +252 -0
package/sdd/references/revision-loop.md +97 -0
package/sdd/references/thinking-models-debug.md +44 -0
package/sdd/references/thinking-models-execution.md +50 -0
package/sdd/references/thinking-models-planning.md +62 -0
package/sdd/references/thinking-models-research.md +50 -0
package/sdd/references/thinking-models-verification.md +55 -0
package/sdd/references/thinking-partner.md +96 -0
package/sdd/references/ui-brand.md +4 -4
package/sdd/references/universal-anti-patterns.md +63 -0
package/sdd/references/verification-overrides.md +227 -0
package/sdd/references/workstream-flag.md +56 -3
package/sdd/templates/AI-SPEC.md +246 -0
package/sdd/templates/DEBUG.md +1 -1
package/sdd/templates/SECURITY.md +61 -0
package/sdd/templates/UAT.md +4 -4
package/sdd/templates/VALIDATION.md +4 -4
package/sdd/templates/claude-md.md +32 -9
package/sdd/templates/config.json +4 -0
package/sdd/templates/debug-subagent-prompt.md +1 -1
package/sdd/templates/dev-preferences.md +1 -1
package/sdd/templates/discovery.md +2 -2
package/sdd/templates/phase-prompt.md +1 -1
package/sdd/templates/planner-subagent-prompt.md +3 -3
package/sdd/templates/project.md +1 -1
package/sdd/templates/research.md +1 -1
package/sdd/templates/state.md +2 -2
package/sdd/workflows/add-phase.md +8 -8
package/sdd/workflows/add-tests.md +12 -9
package/sdd/workflows/add-todo.md +5 -3
package/sdd/workflows/ai-integration-phase.md +284 -0
package/sdd/workflows/analyze-dependencies.md +96 -0
package/sdd/workflows/audit-fix.md +157 -0
package/sdd/workflows/audit-milestone.md +11 -11
package/sdd/workflows/audit-uat.md +2 -2
package/sdd/workflows/autonomous.md +195 -27
package/sdd/workflows/check-todos.md +12 -10
package/sdd/workflows/cleanup.md +2 -0
package/sdd/workflows/code-review-fix.md +497 -0
package/sdd/workflows/code-review.md +515 -0
package/sdd/workflows/complete-milestone.md +56 -22
package/sdd/workflows/diagnose-issues.md +10 -3
package/sdd/workflows/discovery-phase.md +5 -3
package/sdd/workflows/discuss-phase-assumptions.md +24 -6
package/sdd/workflows/discuss-phase-power.md +291 -0
package/sdd/workflows/discuss-phase.md +173 -21
package/sdd/workflows/do.md +23 -21
package/sdd/workflows/docs-update.md +1155 -0
package/sdd/workflows/eval-review.md +155 -0
package/sdd/workflows/execute-phase.md +594 -38
package/sdd/workflows/execute-plan.md +67 -96
package/sdd/workflows/explore.md +139 -0
package/sdd/workflows/fast.md +5 -5
package/sdd/workflows/forensics.md +2 -2
package/sdd/workflows/health.md +4 -4
package/sdd/workflows/help.md +122 -119
package/sdd/workflows/import.md +276 -0
package/sdd/workflows/inbox.md +387 -0
package/sdd/workflows/insert-phase.md +7 -7
package/sdd/workflows/list-phase-assumptions.md +4 -4
package/sdd/workflows/list-workspaces.md +2 -2
package/sdd/workflows/manager.md +35 -32
package/sdd/workflows/map-codebase.md +7 -5
package/sdd/workflows/milestone-summary.md +2 -2
package/sdd/workflows/new-milestone.md +17 -9
package/sdd/workflows/new-project.md +50 -25
package/sdd/workflows/new-workspace.md +7 -5
package/sdd/workflows/next.md +67 -11
package/sdd/workflows/note.md +9 -7
package/sdd/workflows/pause-work.md +75 -12
package/sdd/workflows/plan-milestone-gaps.md +8 -8
package/sdd/workflows/plan-phase.md +294 -42
package/sdd/workflows/plant-seed.md +6 -3
package/sdd/workflows/pr-branch.md +42 -14
package/sdd/workflows/profile-user.md +9 -7
package/sdd/workflows/progress.md +45 -45
package/sdd/workflows/quick.md +195 -47
package/sdd/workflows/remove-phase.md +6 -6
package/sdd/workflows/remove-workspace.md +3 -1
package/sdd/workflows/research-phase.md +2 -2
package/sdd/workflows/resume-project.md +12 -12
package/sdd/workflows/review.md +109 -9
package/sdd/workflows/scan.md +102 -0
package/sdd/workflows/secure-phase.md +166 -0
package/sdd/workflows/session-report.md +2 -2
package/sdd/workflows/settings.md +38 -12
package/sdd/workflows/ship.md +21 -9
package/sdd/workflows/stats.md +1 -1
package/sdd/workflows/transition.md +23 -23
package/sdd/workflows/ui-phase.md +15 -7
package/sdd/workflows/ui-review.md +29 -4
package/sdd/workflows/undo.md +314 -0
package/sdd/workflows/update.md +171 -20
package/sdd/workflows/validate-phase.md +6 -4
package/sdd/workflows/verify-phase.md +210 -6
package/sdd/workflows/verify-work.md +83 -9
package/sdd/commands/sdd/workstreams.md +0 -63

package/sdd/references/verification-overrides.md ADDED

@@ -0,0 +1,227 @@
+# Verification Overrides
+Mechanism for intentionally accepting must-have failures when the deviation is known and acceptable. Prevents verification loops on items that will never pass as originally specified.
+<override_format>
+## Override Format
+Overrides are declared in the VERIFICATION.md frontmatter under an `overrides:` key:
+```yaml
+---
+phase: 03-authentication
+verified: 2026-04-05T12:00:00Z
+status: passed
+score: 5/5
+overrides_applied: 2
+overrides:
+  - must_have: "OAuth2 PKCE flow implemented"
+    reason: "Using session-based auth instead — PKCE unnecessary for server-rendered app"
+    accepted_by: "dave"
+    accepted_at: "2026-04-04T15:30:00Z"
+  - must_have: "Rate limiting on login endpoint"
+    reason: "Deferred to Phase 5 (infrastructure) — tracked in ROADMAP.md"
+    accepted_by: "dave"
+    accepted_at: "2026-04-04T15:30:00Z"
+---
+```
+### Required Fields
+| Field | Type | Description |
+|-------|------|-------------|
+| `must_have` | string | The must-have truth, artifact description, or key link being overridden. Does not need to be an exact match — fuzzy matching applies. |
+| `reason` | string | Why this deviation is acceptable. Must be specific — not just "not needed". |
+| `accepted_by` | string | Who accepted the override (username or role). Required. |
+| `accepted_at` | string | ISO timestamp of when the override was accepted. Required. |
+</override_format>
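Reviewer sketch, not part of the package: the four required fields in the table above could be checked mechanically before any matching runs. The name `validateOverride` and the problem strings are hypothetical, and `Date.parse` is only a loose stand-in for strict ISO validation. The rule that a reason "must be specific" is left to human judgment here.

```javascript
// Hypothetical validator for one entry of the `overrides:` array.
const REQUIRED_FIELDS = ['must_have', 'reason', 'accepted_by', 'accepted_at'];

function validateOverride(entry) {
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    const value = entry[field];
    // Every required field must be a non-empty string.
    if (typeof value !== 'string' || value.trim() === '') {
      problems.push(`missing or empty field: ${field}`);
    }
  }
  // Loose timestamp check: only rejects values Date.parse cannot read at all.
  const ts = entry.accepted_at;
  if (typeof ts === 'string' && ts.trim() !== '' && Number.isNaN(Date.parse(ts))) {
    problems.push('accepted_at is not a parseable timestamp');
  }
  return problems;
}
```

An empty result means the entry has the documented shape; it says nothing about whether the override is justified.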
+## When to Use
+Overrides apply when a phase intentionally deviated from the original plan during execution — for example, a requirement was descoped, an alternative approach was chosen, or a dependency changed.
+Without overrides, the verifier reports these as FAIL even though the deviation was intentional. Overrides let the developer mark specific items as `PASSED (override)` with a documented reason.
+Overrides are appropriate when:
+- A requirement changed after planning but ROADMAP.md hasn't been updated yet
+- An alternative implementation satisfies the intent but not the literal wording
+- A must-have is deferred to a later phase with explicit tracking
+- External constraints make the original must-have impossible or unnecessary
+## When NOT to Use
+Overrides are NOT appropriate when:
+- The implementation is simply incomplete — fix it instead
+- The must-have is unclear — clarify it instead
+- The developer wants to skip verification — that undermines the process
+- Multiple must-haves are failing for the same phase — if more than 2-3 items need overrides, revisit the plan instead of overriding in bulk
+<matching_rules>
+## Matching Rules
+Override matching uses **fuzzy matching**, not exact string comparison. This accommodates minor wording differences between how must-haves are phrased in ROADMAP.md, PLAN.md frontmatter, and the override entry.
+### Matching Algorithm
+1. **Normalize both strings:** case-insensitive comparison — lowercase both strings, strip punctuation, collapse whitespace
+2. **Token overlap:** split into words, compute intersection
+3. **Match threshold:** 80% token overlap in EITHER direction (override tokens found in must-have, OR must-have tokens found in override)
+4. **Key noun priority:** nouns and technical terms (file paths, component names, API endpoints) are weighted higher than common words
+### Examples
+| Must-Have | Override `must_have` | Match? | Reason |
+|-----------|---------------------|--------|--------|
+| "User can authenticate via OAuth2 PKCE" | "OAuth2 PKCE flow implemented" | Yes | Key terms `OAuth2` and `PKCE` overlap, 80% threshold met |
+| "Rate limiting on /api/auth/login" | "Rate limiting on login endpoint" | Yes | `rate limiting` + `login` overlap |
+| "Chat component renders messages" | "OAuth2 PKCE flow implemented" | No | No meaningful token overlap |
+| "src/components/Chat.tsx provides message list" | "Chat.tsx message list rendering" | Yes | `Chat.tsx` + `message` + `list` overlap |
+### Ambiguity Resolution
+If an override matches multiple must-haves, apply it to the **most specific match** (highest token overlap percentage). If still ambiguous, apply to the first match and log a warning.
+</matching_rules>
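Reviewer sketch, not part of the package: steps 1 to 3 of the matching algorithm and the ambiguity rule can be approximated as below. All function names are hypothetical. Replacing punctuation with a space (rather than deleting it) is an assumption made so that `Chat.tsx` splits into `chat` and `tsx`. Step 4, the key-term weighting, is deliberately omitted.

```javascript
// Step 1: lowercase, turn punctuation into spaces, collapse whitespace.
function normalize(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

// Step 2: split into a set of word tokens.
function tokens(text) {
  const n = normalize(text);
  return new Set(n ? n.split(' ') : []);
}

// Fraction of tokens in `from` that also appear in `to`.
function coverage(from, to) {
  if (from.size === 0) return 0;
  let hits = 0;
  for (const t of from) if (to.has(t)) hits++;
  return hits / from.size;
}

// Step 3: overlap in EITHER direction, so take the larger of the two.
function overlapScore(mustHave, overrideText) {
  const a = tokens(mustHave);
  const b = tokens(overrideText);
  return Math.max(coverage(a, b), coverage(b, a));
}

// Ambiguity rule: highest overlap wins; on a tie the first match is kept.
function pickMustHave(overrideText, mustHaves, threshold = 0.8) {
  let best = null;
  for (const mustHave of mustHaves) {
    const score = overlapScore(mustHave, overrideText);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { mustHave, score };
    }
  }
  return best;
}
```

Run against the examples table, this unweighted version scores rows two and four at exactly 0.8, but the first row ("User can authenticate via OAuth2 PKCE" against "OAuth2 PKCE flow implemented") at only 0.5. The table's "Yes" for that row therefore seems to depend on the key-term weighting in step 4, which the document describes only qualitatively.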
+<verifier_behavior>
+## Verifier Behavior with Overrides
+### Check Order
+The override check happens **before marking a must-have as FAIL**. The flow is:
+1. Evaluate must-have against codebase (Steps 3-5 of verification process)
+2. If evaluation result is FAIL or UNCERTAIN:
+   a. Check `overrides:` array in VERIFICATION.md frontmatter for a fuzzy match
+   b. If override found: mark as `PASSED (override)` instead of FAIL
+   c. If no override found: mark as FAIL as normal
+3. If evaluation result is PASS: mark as VERIFIED (overrides are irrelevant)
+### Output Format
+Overridden items appear with distinct status in all verification tables:
+```markdown
+| # | Truth | Status | Evidence |
+|---|-------|--------|----------|
+| 1 | User can authenticate | VERIFIED | OAuth session flow working |
+| 2 | OAuth2 PKCE flow | PASSED (override) | Override: Using session-based auth — accepted by dave on 2026-04-04 |
+| 3 | Chat renders messages | FAILED | Component returns placeholder |
+```
+The `PASSED (override)` status must be visually distinct from both `VERIFIED` and `FAILED`. In the evidence column, include the override reason and who accepted it.
+### Impact on Overall Status
+- `PASSED (override)` items count toward the passing score, not the failing score
+- A phase with all items either VERIFIED or PASSED (override) can have status `passed`
+- Overrides do NOT suppress `human_needed` items — those still require human testing
+### Frontmatter Score
+The score and override count in frontmatter reflect applied overrides:
+```yaml
+score: 5/5  # includes 2 overrides
+overrides_applied: 2
+```
+</verifier_behavior>
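Reviewer sketch, not part of the package: the check order and scoring rules above reduce to a small decision function. The names `resolveStatus` and `summarize` are hypothetical, and the caller is assumed to have already looked up a matching override (or `null`) for the must-have.

```javascript
// `evaluation` is 'PASS', 'FAIL' or 'UNCERTAIN'; `matchedOverride` is an override entry or null.
function resolveStatus(evaluation, matchedOverride) {
  // A passing evaluation ignores overrides entirely.
  if (evaluation === 'PASS') return 'VERIFIED';
  // FAIL or UNCERTAIN: an override is consulted before the item is failed.
  if (matchedOverride) return 'PASSED (override)';
  return 'FAILED';
}

// Builds the frontmatter score fields from the per-item statuses.
function summarize(statuses) {
  const overridden = statuses.filter((s) => s === 'PASSED (override)').length;
  const verified = statuses.filter((s) => s === 'VERIFIED').length;
  return {
    score: `${verified + overridden}/${statuses.length}`,
    overrides_applied: overridden,
  };
}
```

With three `VERIFIED` items and two `PASSED (override)` items, `summarize` yields `score: "5/5"` and `overrides_applied: 2`, which agrees with the frontmatter example above. Items flagged `human_needed` are outside this sketch, since the document states that overrides do not suppress them.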
+<creating_overrides>
+## Creating Overrides
+### Interactive Override Suggestion
+When the verifier marks a must-have as FAIL and the failure looks intentional (e.g., alternative implementation exists, or the code explicitly handles the case differently), the verifier should suggest creating an override:
+```markdown
+### F-002: OAuth2 PKCE flow
+**Status:** FAILED
+**Evidence:** No PKCE implementation found. Session-based auth used instead.
+**This looks intentional.** The codebase uses session-based authentication which achieves the same goal differently. To accept this deviation, add an override to VERIFICATION.md frontmatter:
+```yaml
+overrides:
+  - must_have: "OAuth2 PKCE flow implemented"
+    reason: "Using session-based auth instead — PKCE unnecessary for server-rendered app"
+    accepted_by: "{your name}"
+    accepted_at: "{current ISO timestamp}"
+```
+Then re-run verification to apply.
+```
+### Override via sdd-tools
+Overrides can also be managed through the verification workflow:
+1. Run `/sdd-verify-work` — verification finds gaps
+2. Review gaps — determine which are intentional deviations
+3. Add override entries to VERIFICATION.md frontmatter
+4. Re-run `/sdd-verify-work` — overrides are applied, remaining gaps shown
+</creating_overrides>
+<override_lifecycle>
+## Override Lifecycle
+### During Re-verification
+When a phase is re-verified (e.g., after gap closure):
+- Existing overrides carry forward automatically
+- If the underlying code now satisfies the must-have, the override becomes unnecessary — mark as VERIFIED instead
+- Overrides are never removed automatically; they persist as documentation
+### At Milestone Completion
+During `/sdd-audit-milestone`, overrides are surfaced in the audit report:
+```
+### Verification Overrides ({count} across {phase_count} phases)
+| Phase | Must-Have | Reason | Accepted By |
+|-------|----------|--------|-------------|
+| 03 | OAuth2 PKCE | Session-based auth used instead | dave |
+```
+This gives the team visibility into all accepted deviations before closing the milestone.
+### Cleanup
+Stale overrides (where the must-have was later implemented or removed from ROADMAP.md) can be cleaned up during milestone completion. They are informational — leaving them causes no harm.
+</override_lifecycle>
+## Example VERIFICATION.md
+```markdown
+---
+phase: 03-api-layer
+verified: 2026-04-05T12:00:00Z
+status: passed
+score: 3/3
+overrides_applied: 1
+overrides:
+  - must_have: "paginated API responses"
+    reason: "Descoped — dataset under 100 items, pagination adds complexity without value"
+    accepted_by: "dave"
+    accepted_at: "2026-04-04T15:30:00Z"
+---
+## Phase 3: API Layer — Verification
+| # | Truth | Status | Evidence |
+|---|-------|--------|----------|
+| 1 | REST endpoints return JSON | VERIFIED | curl tests confirm |
+| 2 | Paginated API responses | PASSED (override) | Descoped — see override: dataset under 100 items |
+| 3 | Authentication middleware | VERIFIED | JWT validation working |
+```

package/sdd/references/workstream-flag.md CHANGED

@@ -9,8 +9,55 @@ parallel milestone work by multiple Claude Code instances on the same codebase.
 1. `--ws <name>` flag (explicit, highest priority)
 2. `SDD_WORKSTREAM` environment variable (per-instance)
-3. `.planning/active-workstream` file (shared, last-writer-wins)
-4. `null` — flat mode (no workstreams)
+3. Session-scoped active workstream pointer in temp storage (per runtime session / terminal)
+4. `.planning/active-workstream` file (legacy shared fallback when no session key exists)
+5. `null` — flat mode (no workstreams)
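Reviewer sketch, not part of the package: the new five-step priority order can be read as the function below. The parameter names are hypothetical. Consulting the legacy file only when no session key exists is this reviewer's reading of item 4; the actual `.cjs` implementation may differ.

```javascript
// Inputs are passed in explicitly so the order of precedence is easy to see.
function resolveWorkstream({ flag, env = {}, sessionKey, sessionPointer, legacyPointer }) {
  if (flag) return flag;                              // 1. --ws <name>
  if (env.SDD_WORKSTREAM) return env.SDD_WORKSTREAM;  // 2. per-instance env var
  if (sessionKey) return sessionPointer || null;      // 3. session-scoped pointer
  return legacyPointer || null;                       // 4. legacy shared file, else 5. flat mode
}
```

Under this reading, a session that has a stable key but no pointer of its own resolves to flat mode rather than inheriting whatever another session last wrote to `.planning/active-workstream`.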
+## Why session-scoped pointers exist
+The shared `.planning/active-workstream` file is fundamentally unsafe when multiple
+Claude/Codex instances are active on the same repo at the same time. One session can
+silently repoint another session's `STATE.md`, `ROADMAP.md`, and phase paths.
+SDD now prefers a session-scoped pointer keyed by runtime/session identity
+(`SDD_SESSION_KEY`, `CODEX_THREAD_ID`, `CLAUDE_CODE_SSE_PORT`, terminal session IDs,
+or the controlling TTY). This keeps concurrent sessions isolated while preserving
+legacy compatibility for runtimes that do not expose a stable session key.
+## Session Identity Resolution
+When SDD resolves the session-scoped pointer in step 3 above, it uses this order:
+1. Explicit runtime/session env vars such as `SDD_SESSION_KEY`, `CODEX_THREAD_ID`,
+   `CLAUDE_SESSION_ID`, `CLAUDE_CODE_SSE_PORT`, `OPENCODE_SESSION_ID`,
+   `GEMINI_SESSION_ID`, `CURSOR_SESSION_ID`, `WINDSURF_SESSION_ID`,
+   `TERM_SESSION_ID`, `WT_SESSION`, `TMUX_PANE`, and `ZELLIJ_SESSION_NAME`
+2. `TTY` or `SSH_TTY` if the shell/runtime already exposes the terminal path
+3. A single best-effort `tty` probe, but only when stdin is interactive
+If none of those produce a stable identity, SDD does not keep probing. It falls
+back directly to the legacy shared `.planning/active-workstream` file.
+This matters in headless or stripped environments: when stdin is already
+non-interactive, SDD intentionally skips shelling out to `tty` because that path
+cannot discover a stable session identity and only adds avoidable failures on the
+routing hot path.
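Reviewer sketch, not part of the package: the three-step session identity resolution can be modelled as below. The function name, the returned key format, and the injected `probeTty` callback are hypothetical. Treating the listed order of the environment variables as their precedence is also an assumption, since the document introduces the list with "such as".

```javascript
const SESSION_ENV_VARS = [
  'SDD_SESSION_KEY', 'CODEX_THREAD_ID', 'CLAUDE_SESSION_ID', 'CLAUDE_CODE_SSE_PORT',
  'OPENCODE_SESSION_ID', 'GEMINI_SESSION_ID', 'CURSOR_SESSION_ID', 'WINDSURF_SESSION_ID',
  'TERM_SESSION_ID', 'WT_SESSION', 'TMUX_PANE', 'ZELLIJ_SESSION_NAME',
];

// Returns a session key string, or null when the caller should fall back
// to the legacy shared .planning/active-workstream file.
function resolveSessionKey(env, { stdinIsTTY = false, probeTty = () => null } = {}) {
  // 1. Explicit runtime/session environment variables.
  for (const name of SESSION_ENV_VARS) {
    if (env[name]) return `${name}:${env[name]}`;
  }
  // 2. Terminal path already exposed by the shell/runtime.
  const ttyPath = env.TTY || env.SSH_TTY;
  if (ttyPath) return `tty:${ttyPath}`;
  // 3. A single best-effort probe, skipped entirely when stdin is non-interactive.
  if (stdinIsTTY) {
    const probed = probeTty();
    if (probed) return `tty:${probed}`;
  }
  return null;
}
```

In a real Node process the caller would presumably pass `process.env` and `process.stdin.isTTY`; the probe is injected here so that the headless case, where no subprocess should be spawned, stays easy to test.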
+## Pointer Lifecycle
+Session-scoped pointers are intentionally lightweight and best-effort:
+- Clearing a workstream for one session removes only that session's pointer file
+- If that was the last pointer for the repo, SDD also removes the now-empty
+  per-project temp directory
+- If sibling session pointers still exist, the temp directory is left in place
+- When a pointer refers to a workstream directory that no longer exists, SDD
+  treats it as stale state: it removes that pointer file and resolves to `null`
+  until the session explicitly sets a new active workstream again
+SDD does not currently run a background garbage collector for historical temp
+directories. Cleanup is opportunistic at the pointer being cleared or self-healed,
+and broader temp hygiene is left to OS temp cleanup or future maintenance work.
 ## Routing Propagation
@@ -29,7 +76,7 @@ This ensures workstream scope chains automatically through the workflow:
 ├── config.json         # Shared
 ├── milestones/         # Shared
 ├── codebase/           # Shared
-├── active-workstream   # Points to current ws
+├── active-workstream   # Legacy shared fallback only
 └── workstreams/
     ├── feature-a/      # Workstream A
     │   ├── STATE.md
@@ -50,6 +97,12 @@ This ensures workstream scope chains automatically through the workflow:
 node sdd-tools.cjs state json --ws feature-a
 node sdd-tools.cjs find-phase 3 --ws feature-b
+# Session-local switching without --ws on every command
+SDD_SESSION_KEY=my-terminal-a node sdd-tools.cjs workstream set feature-a
+SDD_SESSION_KEY=my-terminal-a node sdd-tools.cjs state json
+SDD_SESSION_KEY=my-terminal-b node sdd-tools.cjs workstream set feature-b
+SDD_SESSION_KEY=my-terminal-b node sdd-tools.cjs state json
 # Workstream CRUD
 node sdd-tools.cjs workstream create <name>
 node sdd-tools.cjs workstream list

package/sdd/templates/AI-SPEC.md ADDED

@@ -0,0 +1,246 @@
+# AI-SPEC — Phase {N}: {phase_name}
+> AI design contract generated by `/sdd-ai-integration-phase`. Consumed by `sdd-planner` and `sdd-eval-auditor`.
+> Locks framework selection, implementation guidance, and evaluation strategy before planning begins.
+---
+## 1. System Classification
+**System Type:** <!-- RAG | Multi-Agent | Conversational | Extraction | Autonomous Agent | Content Generation | Code Automation | Hybrid -->
+**Description:**
+<!-- One-paragraph description of what this AI system does, who uses it, and what "good" looks like -->
+**Critical Failure Modes:**
+<!-- The 3-5 behaviors that absolutely cannot go wrong in this system -->
+1.
+2.
+3.
+---
+## 1b. Domain Context
+> Researched by `sdd-domain-researcher`. Grounds the evaluation strategy in domain expert knowledge.
+**Industry Vertical:** <!-- healthcare | legal | finance | customer service | education | developer tooling | e-commerce | etc. -->
+**User Population:** <!-- who uses this system and in what context -->
+**Stakes Level:** <!-- Low | Medium | High | Critical -->
+**Output Consequence:** <!-- what happens downstream when the AI output is acted on -->
+### What Domain Experts Evaluate Against
+<!-- Domain-specific rubric ingredients — in practitioner language, not AI jargon -->
+<!-- Format: Dimension / Good (expert accepts) / Bad (expert flags) / Stakes / Source -->
+### Known Failure Modes in This Domain
+<!-- Domain-specific failure modes from research — not generic hallucination, but how it manifests here -->
+### Regulatory / Compliance Context
+<!-- Relevant regulations or constraints — or "None identified" if genuinely none apply -->
+### Domain Expert Roles for Evaluation
+| Role | Responsibility |
+|------|---------------|
+| <!-- e.g., Senior practitioner --> | <!-- Dataset labeling / rubric calibration / production sampling --> |
+---
+## 2. Framework Decision
+**Selected Framework:** <!-- e.g., LlamaIndex v0.10.x -->
+**Version:** <!-- Pin the version -->
+**Rationale:**
+<!-- Why this framework fits this system type, team context, and production requirements -->
+**Alternatives Considered:**
+| Framework | Ruled Out Because |
+|-----------|------------------|
+| | |
+**Vendor Lock-In Accepted:** <!-- Yes / No / Partial — document the trade-off consciously -->
+---
+## 3. Framework Quick Reference
+> Fetched from official docs by `sdd-ai-researcher`. Distilled for this specific use case.
+### Installation
+```bash
+# Install command(s)
+```
+### Core Imports
+```python
+# Key imports for this use case
+```
+### Entry Point Pattern
+```python
+# Minimal working example for this system type
+```
+### Key Abstractions
+<!-- Framework-specific concepts the developer must understand before coding -->
+| Concept | What It Is | When You Use It |
+|---------|-----------|-----------------|
+| | | |
+### Common Pitfalls
+<!-- Gotchas specific to this framework and system type — from docs, issues, and community reports -->
+1.
+2.
+3.
+### Recommended Project Structure
+```
+project/
+├── # Framework-specific folder layout
+```
+---
+## 4. Implementation Guidance
+**Model Configuration:**
+<!-- Which model(s), temperature, max tokens, and other key parameters -->
+**Core Pattern:**
+<!-- The primary implementation pattern for this system type in this framework -->
+**Tool Use:**
+<!-- Tools/integrations needed and how to configure them -->
+**State Management:**
+<!-- How state is persisted, retrieved, and updated -->
+**Context Window Strategy:**
+<!-- How to manage context limits for this system type -->
+---
+## 4b. AI Systems Best Practices
+> Written by `sdd-ai-researcher`. Cross-cutting patterns every developer building AI systems needs — independent of framework choice.
+### Structured Outputs with Pydantic
+<!-- Framework-specific Pydantic integration pattern for this use case -->
+<!-- Include: output model definition, how the framework uses it, retry logic on validation failure -->
+```python
+# Pydantic output model for this system type
+```
+### Async-First Design
+<!-- How async is handled in this framework, the one common mistake, and when to stream vs. await -->
+### Prompt Engineering Discipline
+<!-- System vs. user prompt separation, few-shot guidance, token budget strategy -->
+### Context Window Management
+<!-- Strategy specific to this system type: RAG chunking / conversation summarisation / agent compaction -->
+### Cost and Latency Budget
+<!-- Per-call cost estimate, caching strategy, sub-task model routing -->
+---
+## 5. Evaluation Strategy
+### Dimensions
+| Dimension | Rubric (Pass/Fail or 1-5) | Measurement Approach | Priority |
+|-----------|--------------------------|---------------------|----------|
+| | | Code / LLM Judge / Human | Critical / High / Medium |
+### Eval Tooling
+**Primary Tool:** <!-- e.g., RAGAS + Langfuse -->
+**Setup:**
+```bash
+# Install and configure
+```
+**CI/CD Integration:**
+```bash
+# Command to run evals in CI/CD pipeline
+```
+### Reference Dataset
+**Size:** <!-- e.g., 20 examples to start -->
+**Composition:**
+<!-- What scenario types the dataset covers: critical paths, edge cases, failure modes -->
+**Labeling:**
+<!-- Who labels examples and how (domain expert, LLM judge with calibration, etc.) -->
+---
+## 6. Guardrails
+### Online (Real-Time)
+| Guardrail | Trigger | Intervention |
+|-----------|---------|--------------|
+| | | Block / Escalate / Flag |
+### Offline (Flywheel)
+| Metric | Sampling Strategy | Action on Degradation |
+|--------|------------------|----------------------|
+| | | |
+---
+## 7. Production Monitoring
+**Tracing Tool:** <!-- e.g., Langfuse self-hosted -->
+**Key Metrics to Track:**
+<!-- 3-5 metrics that will be monitored in production -->
+**Alert Thresholds:**
+<!-- When to page/alert -->
+**Smart Sampling Strategy:**
+<!-- How to select interactions for human review — signal-based filters -->
+---
+## Checklist
+- [ ] System type classified
+- [ ] Critical failure modes identified (≥ 3)
+- [ ] Domain context researched (Section 1b: vertical, stakes, expert criteria, failure modes)
+- [ ] Regulatory/compliance context identified or explicitly noted as none
+- [ ] Domain expert roles defined for evaluation involvement
+- [ ] Framework selected with rationale documented
+- [ ] Alternatives considered and ruled out
+- [ ] Framework quick reference written (install, imports, pattern, pitfalls)
+- [ ] AI systems best practices written (Section 4b: Pydantic, async, prompt discipline, context)
+- [ ] Evaluation dimensions grounded in domain rubric ingredients
+- [ ] Each eval dimension has a concrete rubric (Good/Bad in domain language)
+- [ ] Eval tooling selected — Arize Phoenix default confirmed or override noted
+- [ ] Reference dataset spec written (size ≥ 10, composition + labeling defined)
+- [ ] CI/CD eval integration specified
+- [ ] Online guardrails defined
+- [ ] Production monitoring configured (tracing tool + sampling strategy)

package/sdd/templates/DEBUG.md CHANGED

@@ -99,7 +99,7 @@ files_changed: []
 <lifecycle>
-**Creation:** Immediately when /sdd:debug is called
+**Creation:** Immediately when /sdd-debug is called
 - Create file with trigger from user input
 - Set status to "gathering"
 - Current Focus: next_action = "gather symptoms"

package/sdd/templates/SECURITY.md ADDED

@@ -0,0 +1,61 @@
+---
+phase: {N}
+slug: {phase-slug}
+status: draft
+threats_open: 0
+asvs_level: 1
+created: {date}
+---
+# Phase {N} — Security
+> Per-phase security contract: threat register, accepted risks, and audit trail.
+---
+## Trust Boundaries
+| Boundary | Description | Data Crossing |
+|----------|-------------|---------------|
+| {boundary} | {description} | {data type / sensitivity} |
+---
+## Threat Register
+| Threat ID | Category | Component | Disposition | Mitigation | Status |
+|-----------|----------|-----------|-------------|------------|--------|
+| T-{N}-01 | {STRIDE category} | {component} | {mitigate / accept / transfer} | {control or reference} | open |
+*Status: open · closed*
+*Disposition: mitigate (implementation required) · accept (documented risk) · transfer (third-party)*
+---
+## Accepted Risks Log
+| Risk ID | Threat Ref | Rationale | Accepted By | Date |
+|---------|------------|-----------|-------------|------|
+*Accepted risks do not resurface in future audit runs.*
+*If none: "No accepted risks."*
+---
+## Security Audit Trail
+| Audit Date | Threats Total | Closed | Open | Run By |
+|------------|---------------|--------|------|--------|
+| {YYYY-MM-DD} | {N} | {N} | {N} | {name / agent} |
+---
+## Sign-Off
+- [ ] All threats have a disposition (mitigate / accept / transfer)
+- [ ] Accepted risks documented in Accepted Risks Log
+- [ ] `threats_open: 0` confirmed
+- [ ] `status: verified` set in frontmatter
+**Approval:** {pending / verified YYYY-MM-DD}

package/sdd/templates/UAT.md CHANGED

@@ -106,7 +106,7 @@ blocked: [N]
 **Gaps:**
 - APPEND only when issue found (YAML format)
 - After diagnosis: fill `root_cause`, `artifacts`, `missing`, `debug_session`
-- This section feeds directly into /sdd:plan-phase --gaps
+- This section feeds directly into /sdd-plan-phase --gaps
 </section_rules>
@@ -120,7 +120,7 @@ blocked: [N]
 4. UAT.md Gaps section updated with diagnosis:
    - Each gap gets `root_cause`, `artifacts`, `missing`, `debug_session` filled
 5. status → "diagnosed"
-6. Ready for /sdd:plan-phase --gaps with root causes
+6. Ready for /sdd-plan-phase --gaps with root causes
 **After diagnosis:**
 ```yaml
@@ -144,7 +144,7 @@ blocked: [N]
 <lifecycle>
-**Creation:** When /sdd:verify-work starts new session
+**Creation:** When /sdd-verify-work starts new session
 - Extract tests from SUMMARY.md files
 - Set status to "testing"
 - Current Test points to test 1
@@ -171,7 +171,7 @@ blocked: [N]
 - Present summary with outstanding items highlighted
 **Resuming partial session:**
-- `/sdd:verify-work {phase}` picks up from first pending/blocked test
+- `/sdd-verify-work {phase}` picks up from first pending/blocked test
 - When all items resolved, status advances to "complete"
 **Resume after /clear:**

package/sdd/templates/VALIDATION.md CHANGED

@@ -29,16 +29,16 @@ created: {date}
 - **After every task commit:** Run `{quick run command}`
 - **After every plan wave:** Run `{full suite command}`
-- **Before `/sdd:verify-work`:** Full suite must be green
+- **Before `/sdd-verify-work`:** Full suite must be green
 - **Max feedback latency:** {N} seconds
 ---
 ## Per-Task Verification Map
-| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
-|---------|------|------|-------------|-----------|-------------------|-------------|--------|
-| {N}-01-01 | 01 | 1 | REQ-{XX} | unit | `{command}` | ✅ / ❌ W0 | ⬜ pending |
+| Task ID | Plan | Wave | Requirement | Threat Ref | Secure Behavior | Test Type | Automated Command | File Exists | Status |
+|---------|------|------|-------------|------------|-----------------|-----------|-------------------|-------------|--------|
+| {N}-01-01 | 01 | 1 | REQ-{XX} | T-{N}-01 / — | {expected secure behavior or "N/A"} | unit | `{command}` | ✅ / ❌ W0 | ⬜ pending |
 *Status: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky*