npm - @bhargavvc/sdd-cc - Versions diffs - 1.30.0 → 1.35.0 - Mend

@bhargavvc/sdd-cc 1.30.0 → 1.35.0

This diff shows the changes between two publicly released versions of the package, as they appear in the public registry. It is provided for informational purposes only.

Files changed (242)

package/README.ja-JP.md +144 -110
package/README.ko-KR.md +143 -107
package/README.md +183 -112
package/README.pt-BR.md +90 -52
package/README.zh-CN.md +141 -101
package/agents/sdd-advisor-researcher.md +23 -0
package/agents/sdd-ai-researcher.md +133 -0
package/agents/sdd-code-fixer.md +516 -0
package/agents/sdd-code-reviewer.md +355 -0
package/agents/sdd-codebase-mapper.md +3 -3
package/agents/sdd-debugger.md +17 -5
package/agents/sdd-doc-verifier.md +201 -0
package/agents/sdd-doc-writer.md +602 -0
package/agents/sdd-domain-researcher.md +153 -0
package/agents/sdd-eval-auditor.md +164 -0
package/agents/sdd-eval-planner.md +154 -0
package/agents/sdd-executor.md +87 -4
package/agents/sdd-framework-selector.md +160 -0
package/agents/sdd-intel-updater.md +314 -0
package/agents/sdd-nyquist-auditor.md +1 -1
package/agents/sdd-phase-researcher.md +71 -4
package/agents/sdd-plan-checker.md +100 -6
package/agents/sdd-planner.md +145 -206
package/agents/sdd-project-researcher.md +25 -2
package/agents/sdd-research-synthesizer.md +3 -3
package/agents/sdd-roadmapper.md +6 -6
package/agents/sdd-security-auditor.md +128 -0
package/agents/sdd-ui-auditor.md +43 -3
package/agents/sdd-ui-checker.md +5 -5
package/agents/sdd-ui-researcher.md +27 -4
package/agents/sdd-user-profiler.md +2 -2
package/agents/sdd-verifier.md +142 -22
package/bin/install.js +2151 -551
package/commands/sdd/add-backlog.md +5 -5
package/commands/sdd/add-tests.md +2 -2
package/commands/sdd/ai-integration-phase.md +36 -0
package/commands/sdd/analyze-dependencies.md +34 -0
package/commands/sdd/audit-fix.md +33 -0
package/commands/sdd/autonomous.md +7 -2
package/commands/sdd/cleanup.md +5 -0
package/commands/sdd/code-review-fix.md +52 -0
package/commands/sdd/code-review.md +55 -0
package/commands/sdd/complete-milestone.md +6 -6
package/commands/sdd/debug.md +22 -9
package/commands/sdd/discuss-phase.md +7 -2
package/commands/sdd/do.md +1 -1
package/commands/sdd/docs-update.md +48 -0
package/commands/sdd/eval-review.md +32 -0
package/commands/sdd/execute-phase.md +4 -0
package/commands/sdd/explore.md +27 -0
package/commands/sdd/fast.md +2 -2
package/commands/sdd/from-sdd2.md +45 -0
package/commands/sdd/help.md +2 -0
package/commands/sdd/import.md +36 -0
package/commands/sdd/intel.md +179 -0
package/commands/sdd/join-discord.md +2 -1
package/commands/sdd/manager.md +1 -0
package/commands/sdd/map-codebase.md +3 -3
package/commands/sdd/new-milestone.md +1 -1
package/commands/sdd/new-project.md +5 -1
package/commands/sdd/new-workspace.md +1 -1
package/commands/sdd/next.md +2 -0
package/commands/sdd/plan-milestone-gaps.md +2 -2
package/commands/sdd/plan-phase.md +6 -1
package/commands/sdd/plant-seed.md +1 -1
package/commands/sdd/profile-user.md +1 -1
package/commands/sdd/quick.md +5 -3
package/commands/sdd/reapply-patches.md +230 -42
package/commands/sdd/research-phase.md +3 -3
package/commands/sdd/review-backlog.md +1 -0
package/commands/sdd/review.md +6 -3
package/commands/sdd/scan.md +26 -0
package/commands/sdd/secure-phase.md +35 -0
package/commands/sdd/ship.md +1 -1
package/commands/sdd/thread.md +5 -5
package/commands/sdd/undo.md +34 -0
package/commands/sdd/verify-work.md +1 -1
package/commands/sdd/workstreams.md +17 -11
package/hooks/dist/sdd-check-update.js +33 -8
package/hooks/dist/sdd-context-monitor.js +17 -8
package/hooks/dist/sdd-phase-boundary.sh +27 -0
package/hooks/dist/sdd-prompt-guard.js +1 -0
package/hooks/dist/sdd-read-guard.js +82 -0
package/hooks/dist/sdd-session-state.sh +33 -0
package/hooks/dist/sdd-statusline.js +137 -15
package/hooks/dist/sdd-validate-commit.sh +47 -0
package/hooks/dist/sdd-workflow-guard.js +4 -4
package/hooks/sdd-check-update.js +139 -0
package/hooks/sdd-context-monitor.js +165 -0
package/hooks/sdd-phase-boundary.sh +27 -0
package/hooks/sdd-prompt-guard.js +97 -0
package/hooks/sdd-read-guard.js +82 -0
package/hooks/sdd-session-state.sh +33 -0
package/hooks/sdd-statusline.js +241 -0
package/hooks/sdd-validate-commit.sh +47 -0
package/hooks/sdd-workflow-guard.js +94 -0
package/package.json +3 -3
package/scripts/build-hooks.js +18 -7
package/scripts/prompt-injection-scan.sh +1 -0
package/scripts/rebrand-gsd-to-sdd.sh +221 -220
package/scripts/run-tests.cjs +5 -1
package/scripts/sync-upstream.sh +1 -1
package/sdd/bin/lib/commands.cjs +79 -17
package/sdd/bin/lib/config.cjs +90 -48
package/sdd/bin/lib/core.cjs +452 -87
package/sdd/bin/lib/docs.cjs +267 -0
package/sdd/bin/lib/frontmatter.cjs +381 -336
package/sdd/bin/lib/init.cjs +110 -16
package/sdd/bin/lib/intel.cjs +660 -0
package/sdd/bin/lib/learnings.cjs +378 -0
package/sdd/bin/lib/milestone.cjs +42 -11
package/sdd/bin/lib/model-profiles.cjs +17 -15
package/sdd/bin/lib/phase.cjs +367 -288
package/sdd/bin/lib/profile-output.cjs +106 -10
package/sdd/bin/lib/roadmap.cjs +146 -115
package/sdd/bin/lib/schema-detect.cjs +238 -0
package/sdd/bin/lib/sdd2-import.cjs +511 -0
package/sdd/bin/lib/security.cjs +124 -3
package/sdd/bin/lib/state.cjs +648 -264
package/sdd/bin/lib/template.cjs +8 -4
package/sdd/bin/lib/verify.cjs +209 -28
package/sdd/bin/lib/workstream.cjs +7 -3
package/sdd/bin/sdd-tools.cjs +184 -12
package/sdd/contexts/dev.md +21 -0
package/sdd/contexts/research.md +22 -0
package/sdd/contexts/review.md +22 -0
package/sdd/references/agent-contracts.md +79 -0
package/sdd/references/ai-evals.md +156 -0
package/sdd/references/ai-frameworks.md +186 -0
package/sdd/references/artifact-types.md +113 -0
package/sdd/references/common-bug-patterns.md +114 -0
package/sdd/references/context-budget.md +49 -0
package/sdd/references/continuation-format.md +25 -25
package/sdd/references/domain-probes.md +125 -0
package/sdd/references/few-shot-examples/plan-checker.md +73 -0
package/sdd/references/few-shot-examples/verifier.md +109 -0
package/sdd/references/gate-prompts.md +100 -0
package/sdd/references/gates.md +70 -0
package/sdd/references/git-integration.md +1 -1
package/sdd/references/ios-scaffold.md +123 -0
package/sdd/references/model-profile-resolution.md +2 -0
package/sdd/references/model-profiles.md +24 -18
package/sdd/references/planner-gap-closure.md +62 -0
package/sdd/references/planner-reviews.md +39 -0
package/sdd/references/planner-revision.md +87 -0
package/sdd/references/planning-config.md +252 -0
package/sdd/references/revision-loop.md +97 -0
package/sdd/references/thinking-models-debug.md +44 -0
package/sdd/references/thinking-models-execution.md +50 -0
package/sdd/references/thinking-models-planning.md +62 -0
package/sdd/references/thinking-models-research.md +50 -0
package/sdd/references/thinking-models-verification.md +55 -0
package/sdd/references/thinking-partner.md +96 -0
package/sdd/references/ui-brand.md +4 -4
package/sdd/references/universal-anti-patterns.md +63 -0
package/sdd/references/verification-overrides.md +227 -0
package/sdd/references/workstream-flag.md +56 -3
package/sdd/templates/AI-SPEC.md +246 -0
package/sdd/templates/DEBUG.md +1 -1
package/sdd/templates/SECURITY.md +61 -0
package/sdd/templates/UAT.md +4 -4
package/sdd/templates/VALIDATION.md +4 -4
package/sdd/templates/claude-md.md +32 -9
package/sdd/templates/config.json +4 -0
package/sdd/templates/debug-subagent-prompt.md +1 -1
package/sdd/templates/dev-preferences.md +1 -1
package/sdd/templates/discovery.md +2 -2
package/sdd/templates/phase-prompt.md +1 -1
package/sdd/templates/planner-subagent-prompt.md +3 -3
package/sdd/templates/project.md +1 -1
package/sdd/templates/research.md +1 -1
package/sdd/templates/state.md +2 -2
package/sdd/workflows/add-phase.md +8 -8
package/sdd/workflows/add-tests.md +12 -9
package/sdd/workflows/add-todo.md +5 -3
package/sdd/workflows/ai-integration-phase.md +284 -0
package/sdd/workflows/analyze-dependencies.md +96 -0
package/sdd/workflows/audit-fix.md +157 -0
package/sdd/workflows/audit-milestone.md +11 -11
package/sdd/workflows/audit-uat.md +2 -2
package/sdd/workflows/autonomous.md +195 -27
package/sdd/workflows/check-todos.md +12 -10
package/sdd/workflows/cleanup.md +2 -0
package/sdd/workflows/code-review-fix.md +497 -0
package/sdd/workflows/code-review.md +515 -0
package/sdd/workflows/complete-milestone.md +56 -22
package/sdd/workflows/diagnose-issues.md +10 -3
package/sdd/workflows/discovery-phase.md +5 -3
package/sdd/workflows/discuss-phase-assumptions.md +24 -6
package/sdd/workflows/discuss-phase-power.md +291 -0
package/sdd/workflows/discuss-phase.md +173 -21
package/sdd/workflows/do.md +23 -21
package/sdd/workflows/docs-update.md +1155 -0
package/sdd/workflows/eval-review.md +155 -0
package/sdd/workflows/execute-phase.md +594 -38
package/sdd/workflows/execute-plan.md +67 -96
package/sdd/workflows/explore.md +139 -0
package/sdd/workflows/fast.md +5 -5
package/sdd/workflows/forensics.md +2 -2
package/sdd/workflows/health.md +4 -4
package/sdd/workflows/help.md +122 -119
package/sdd/workflows/import.md +276 -0
package/sdd/workflows/inbox.md +387 -0
package/sdd/workflows/insert-phase.md +7 -7
package/sdd/workflows/list-phase-assumptions.md +4 -4
package/sdd/workflows/list-workspaces.md +2 -2
package/sdd/workflows/manager.md +35 -32
package/sdd/workflows/map-codebase.md +7 -5
package/sdd/workflows/milestone-summary.md +2 -2
package/sdd/workflows/new-milestone.md +17 -9
package/sdd/workflows/new-project.md +50 -25
package/sdd/workflows/new-workspace.md +7 -5
package/sdd/workflows/next.md +67 -11
package/sdd/workflows/note.md +9 -7
package/sdd/workflows/pause-work.md +75 -12
package/sdd/workflows/plan-milestone-gaps.md +8 -8
package/sdd/workflows/plan-phase.md +294 -42
package/sdd/workflows/plant-seed.md +6 -3
package/sdd/workflows/pr-branch.md +42 -14
package/sdd/workflows/profile-user.md +9 -7
package/sdd/workflows/progress.md +45 -45
package/sdd/workflows/quick.md +195 -47
package/sdd/workflows/remove-phase.md +6 -6
package/sdd/workflows/remove-workspace.md +3 -1
package/sdd/workflows/research-phase.md +2 -2
package/sdd/workflows/resume-project.md +12 -12
package/sdd/workflows/review.md +109 -9
package/sdd/workflows/scan.md +102 -0
package/sdd/workflows/secure-phase.md +166 -0
package/sdd/workflows/session-report.md +2 -2
package/sdd/workflows/settings.md +38 -12
package/sdd/workflows/ship.md +21 -9
package/sdd/workflows/stats.md +1 -1
package/sdd/workflows/transition.md +23 -23
package/sdd/workflows/ui-phase.md +15 -7
package/sdd/workflows/ui-review.md +29 -4
package/sdd/workflows/undo.md +314 -0
package/sdd/workflows/update.md +171 -20
package/sdd/workflows/validate-phase.md +6 -4
package/sdd/workflows/verify-phase.md +210 -6
package/sdd/workflows/verify-work.md +83 -9
package/sdd/commands/sdd/workstreams.md +0 -63

package/sdd/bin/sdd-tools.cjs CHANGED

@@ -70,6 +70,16 @@
  *   audit-uat                           Scan all phases for unresolved UAT/verification items
  *   uat render-checkpoint --file <path> Render the current UAT checkpoint block
  *
+ * Intel:
+ *   intel query <term>             Query intel files for a term
+ *   intel status                   Show intel file freshness
+ *   intel update                   Trigger intel refresh (returns agent spawn hint)
+ *   intel diff                     Show changed intel entries since last snapshot
+ *   intel snapshot                 Save current intel state as diff baseline
+ *   intel patch-meta <file>        Update _meta.updated_at in an intel file
+ *   intel validate                 Validate intel file structure
+ *   intel extract-exports <file>   Extract exported symbols from a source file
+ *
  * Scaffolding:
  *   scaffold context --phase <N>       Create CONTEXT.md template
  *   scaffold uat --phase <N>           Create UAT.md template
@@ -93,6 +103,7 @@
  *   verify commits <h1> [h2] ...      Batch verify commit hashes
  *   verify artifacts <plan-file>       Check must_haves.artifacts
  *   verify key-links <plan-file>       Check must_haves.key_links
+ *   verify schema-drift <phase> [--skip]  Detect schema file changes without push
  *
  * Template Fill:
  *   template fill summary --phase N    Create pre-filled SUMMARY.md
@@ -133,6 +144,20 @@
  *   init milestone-op                  All context for milestone operations
  *   init map-codebase                  All context for map-codebase workflow
  *   init progress                      All context for progress workflow
+ *
+ * Documentation:
+ *   docs-init                            Project context for docs-update workflow
+ *
+ * Learnings:
+ *   learnings list                       List all global learnings (JSON)
+ *   learnings query --tag <tag>          Query learnings by tag
+ *   learnings copy                       Copy from current project's LEARNINGS.md
+ *   learnings prune --older-than <dur>   Remove entries older than duration (e.g. 90d)
+ *   learnings delete <id>                Delete a learning by ID
+ *
+ * SDD-2 Migration:
+ *   from-sdd2 [--path <dir>] [--force] [--dry-run]
+ *             Import a SDD-2 (.sdd/) project back to SDD v1 (.planning/) format
  */
 const fs = require('fs');
@@ -152,6 +177,8 @@ const frontmatter = require('./lib/frontmatter.cjs');
 const profilePipeline = require('./lib/profile-pipeline.cjs');
 const profileOutput = require('./lib/profile-output.cjs');
 const workstream = require('./lib/workstream.cjs');
+const docs = require('./lib/docs.cjs');
+const learnings = require('./lib/learnings.cjs');
 // ─── Arg parsing helpers ──────────────────────────────────────────────────────
@@ -230,7 +257,7 @@ async function main() {
   }
   // Optional workstream override for parallel milestone work.
-  // Priority: --ws flag > SDD_WORKSTREAM env var > active-workstream file > null (flat mode)
+  // Priority: --ws flag > SDD_WORKSTREAM env var > session-scoped pointer > shared legacy pointer > null
   const wsEqArg = args.find(arg => arg.startsWith('--ws='));
   const wsIdx = args.indexOf('--ws');
   let ws = null;
@@ -271,10 +298,31 @@ async function main() {
     args.splice(pickIdx, 2);
   }
+  // --default <value>: for config-get, return this value instead of erroring
+  // when the key is absent. Allows workflows to express optional config reads
+  // without defensive `2>/dev/null || true` boilerplate (#1893).
+  const defaultIdx = args.indexOf('--default');
+  let defaultValue = undefined;
+  if (defaultIdx !== -1) {
+    defaultValue = args[defaultIdx + 1];
+    if (defaultValue === undefined) defaultValue = '';
+    args.splice(defaultIdx, 2);
+  }
   const command = args[0];
   if (!command) {
-    error('Usage: sdd-tools <command> [args] [--raw] [--pick <field>] [--cwd <path>] [--ws <name>]\nCommands: state, resolve-model, find-phase, commit, verify-summary, verify, frontmatter, template, generate-slug, current-timestamp, list-todos, verify-path-exists, config-ensure-section, config-new-project, init, workstream');
+    error('Usage: sdd-tools <command> [args] [--raw] [--pick <field>] [--cwd <path>] [--ws <name>]\nCommands: state, resolve-model, find-phase, commit, verify-summary, verify, frontmatter, template, generate-slug, current-timestamp, list-todos, verify-path-exists, config-ensure-section, config-new-project, init, workstream, docs-init');
+  }
+  // Reject flags that are never valid for any sdd-tools command. AI agents
+  // sometimes hallucinate --help or --version on tool invocations; silently
+  // ignoring them can cause destructive operations to proceed unchecked.
+  const NEVER_VALID_FLAGS = new Set(['-h', '--help', '-?', '--h', '--version', '-v', '--usage']);
+  for (const arg of args) {
+    if (NEVER_VALID_FLAGS.has(arg)) {
+      error(`Unknown flag: ${arg}\nsdd-tools does not accept help or version flags. Run "sdd-tools" with no arguments for usage.`);
+    }
   }
   // Multi-repo guard: resolve project root for commands that read/write .planning/.
@@ -313,7 +361,7 @@ async function main() {
       }
     };
     try {
-      await runCommand(command, args, cwd, raw);
+      await runCommand(command, args, cwd, raw, defaultValue);
       cleanup();
     } catch (e) {
       fs.writeSync = origWriteSync;
@@ -322,7 +370,27 @@ async function main() {
     return;
   }
-  await runCommand(command, args, cwd, raw);
+  // Intercept stdout to transparently resolve @file: references (#1891).
+  // core.cjs output() writes @file:<path> when JSON > 50KB. The --pick path
+  // already resolves this, but the normal path wrote @file: to stdout, forcing
+  // every workflow to have a bash-specific `if [[ "$INIT" == @file:* ]]` check
+  // that breaks on PowerShell and other non-bash shells.
+  const origWriteSync2 = fs.writeSync;
+  const outChunks = [];
+  fs.writeSync = function (fd, data, ...rest) {
+    if (fd === 1) { outChunks.push(String(data)); return; }
+    return origWriteSync2.call(fs, fd, data, ...rest);
+  };
+  try {
+    await runCommand(command, args, cwd, raw, defaultValue);
+  } finally {
+    fs.writeSync = origWriteSync2;
+  }
+  let captured = outChunks.join('');
+  if (captured.startsWith('@file:')) {
+    captured = fs.readFileSync(captured.slice(6), 'utf-8');
+  }
+  origWriteSync2.call(fs, 1, captured);
 }
 /**
@@ -348,7 +416,7 @@ function extractField(obj, fieldPath) {
   return current;
 }
-async function runCommand(command, args, cwd, raw) {
+async function runCommand(command, args, cwd, raw, defaultValue) {
   switch (command) {
     case 'state': {
       const subcommand = args[1];
@@ -394,6 +462,14 @@ async function runCommand(command, args, cwd, raw) {
         state.cmdSignalWaiting(cwd, type, question, options, p, raw);
       } else if (subcommand === 'signal-resume') {
         state.cmdSignalResume(cwd, raw);
+      } else if (subcommand === 'planned-phase') {
+        const { phase: p, name, plans } = parseNamedArgs(args, ['phase', 'name', 'plans']);
+        state.cmdStatePlannedPhase(cwd, p, plans !== null ? parseInt(plans, 10) : null, raw);
+      } else if (subcommand === 'validate') {
+        state.cmdStateValidate(cwd, raw);
+      } else if (subcommand === 'sync') {
+        const { verify } = parseNamedArgs(args, [], ['verify']);
+        state.cmdStateSync(cwd, { verify }, raw);
       } else {
         state.cmdStateLoad(cwd, raw);
       }
@@ -425,6 +501,11 @@ async function runCommand(command, args, cwd, raw) {
       break;
     }
+    case 'check-commit': {
+      commands.cmdCheckCommit(cwd, raw);
+      break;
+    }
     case 'commit-to-subrepo': {
       const message = args[1];
       const filesIndex = args.indexOf('--files');
@@ -498,8 +579,11 @@ async function runCommand(command, args, cwd, raw) {
         verify.cmdVerifyArtifacts(cwd, args[2], raw);
       } else if (subcommand === 'key-links') {
         verify.cmdVerifyKeyLinks(cwd, args[2], raw);
+      } else if (subcommand === 'schema-drift') {
+        const skipFlag = args.includes('--skip');
+        verify.cmdVerifySchemaDrift(cwd, args[2], skipFlag, raw);
       } else {
-        error('Unknown verify subcommand. Available: plan-structure, phase-completeness, references, commits, artifacts, key-links');
+        error('Unknown verify subcommand. Available: plan-structure, phase-completeness, references, commits, artifacts, key-links, schema-drift');
       }
       break;
     }
@@ -540,7 +624,7 @@ async function runCommand(command, args, cwd, raw) {
     }
     case 'config-get': {
-      config.cmdConfigGet(cwd, args[1], raw);
+      config.cmdConfigGet(cwd, args[1], raw, defaultValue);
       break;
     }
@@ -570,8 +654,10 @@ async function runCommand(command, args, cwd, raw) {
           includeArchived: args.includes('--include-archived'),
         };
         phase.cmdPhasesList(cwd, options, raw);
+      } else if (subcommand === 'clear') {
+        milestone.cmdPhasesClear(cwd, raw, args.slice(2));
       } else {
-        error('Unknown phases subcommand. Available: list');
+        error('Unknown phases subcommand. Available: list, clear');
       }
       break;
     }
@@ -712,12 +798,16 @@ async function runCommand(command, args, cwd, raw) {
     case 'init': {
       const workflow = args[1];
       switch (workflow) {
-        case 'execute-phase':
-          init.cmdInitExecutePhase(cwd, args[2], raw);
+        case 'execute-phase': {
+          const { validate: epValidate } = parseNamedArgs(args, [], ['validate']);
+          init.cmdInitExecutePhase(cwd, args[2], raw, { validate: epValidate });
           break;
-        case 'plan-phase':
-          init.cmdInitPlanPhase(cwd, args[2], raw);
+        }
+        case 'plan-phase': {
+          const { validate: ppValidate } = parseNamedArgs(args, [], ['validate']);
+          init.cmdInitPlanPhase(cwd, args[2], raw, { validate: ppValidate });
           break;
+        }
         case 'new-project':
           init.cmdInitNewProject(cwd, raw);
           break;
@@ -910,6 +1000,88 @@ async function runCommand(command, args, cwd, raw) {
       break;
     }
+    // ─── Intel ────────────────────────────────────────────────────────────
+    case 'intel': {
+      const intel = require('./lib/intel.cjs');
+      const subcommand = args[1];
+      if (subcommand === 'query') {
+        const term = args[2];
+        if (!term) error('Usage: sdd-tools intel query <term>');
+        const planningDir = path.join(cwd, '.planning');
+        core.output(intel.intelQuery(term, planningDir), raw);
+      } else if (subcommand === 'status') {
+        const planningDir = path.join(cwd, '.planning');
+        core.output(intel.intelStatus(planningDir), raw);
+      } else if (subcommand === 'diff') {
+        const planningDir = path.join(cwd, '.planning');
+        core.output(intel.intelDiff(planningDir), raw);
+      } else if (subcommand === 'snapshot') {
+        const planningDir = path.join(cwd, '.planning');
+        core.output(intel.intelSnapshot(planningDir), raw);
+      } else if (subcommand === 'patch-meta') {
+        const filePath = args[2];
+        if (!filePath) error('Usage: sdd-tools intel patch-meta <file-path>');
+        core.output(intel.intelPatchMeta(path.resolve(cwd, filePath)), raw);
+      } else if (subcommand === 'validate') {
+        const planningDir = path.join(cwd, '.planning');
+        core.output(intel.intelValidate(planningDir), raw);
+      } else if (subcommand === 'extract-exports') {
+        const filePath = args[2];
+        if (!filePath) error('Usage: sdd-tools intel extract-exports <file-path>');
+        core.output(intel.intelExtractExports(path.resolve(cwd, filePath)), raw);
+      } else if (subcommand === 'update') {
+        const planningDir = path.join(cwd, '.planning');
+        core.output(intel.intelUpdate(planningDir), raw);
+      } else {
+        error('Unknown intel subcommand. Available: query, status, update, diff, snapshot, patch-meta, validate, extract-exports');
+      }
+      break;
+    }
+    // ─── Documentation ────────────────────────────────────────────────────
+    case 'docs-init': {
+      docs.cmdDocsInit(cwd, raw);
+      break;
+    }
+    // ─── Learnings ─────────────────────────────────────────────────────────
+    case 'learnings': {
+      const subcommand = args[1];
+      if (subcommand === 'list') {
+        learnings.cmdLearningsList(raw);
+      } else if (subcommand === 'query') {
+        const tagIdx = args.indexOf('--tag');
+        const tag = tagIdx !== -1 ? args[tagIdx + 1] : null;
+        if (!tag) error('Usage: sdd-tools learnings query --tag <tag>');
+        learnings.cmdLearningsQuery(tag, raw);
+      } else if (subcommand === 'copy') {
+        learnings.cmdLearningsCopy(cwd, raw);
+      } else if (subcommand === 'prune') {
+        const olderIdx = args.indexOf('--older-than');
+        const olderThan = olderIdx !== -1 ? args[olderIdx + 1] : null;
+        if (!olderThan) error('Usage: sdd-tools learnings prune --older-than <duration>');
+        learnings.cmdLearningsPrune(olderThan, raw);
+      } else if (subcommand === 'delete') {
+        const id = args[2];
+        if (!id) error('Usage: sdd-tools learnings delete <id>');
+        learnings.cmdLearningsDelete(id, raw);
+      } else {
+        error('Unknown learnings subcommand. Available: list, query, copy, prune, delete');
+      }
+      break;
+    }
+    // ─── SDD-2 Reverse Migration ───────────────────────────────────────────
+    case 'from-sdd2': {
+      const sdd2Import = require('./lib/sdd2-import.cjs');
+      sdd2Import.cmdFromSdd2(args.slice(1), cwd, raw);
+      break;
+    }
     default:
       error(`Unknown command: ${command}`);
   }
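
Two of the changes in this file, the `--default` flag and the transparent `@file:` resolution, can be read in isolation. The sketch below restates them as standalone functions so the behavior is easier to follow. The names `extractDefault` and `captureStdout` are hypothetical and do not appear in the package; the real code performs both steps inline in `main()` and writes the captured text back to stdout rather than returning it.

```javascript
const fs = require('fs');

// Hypothetical helper mirroring the inline --default handling in main().
// Removes "--default <value>" from args (mutating it) and returns the value,
// '' when the flag is the last argument, or undefined when the flag is absent.
function extractDefault(args) {
  const idx = args.indexOf('--default');
  if (idx === -1) return undefined;
  let value = args[idx + 1];
  if (value === undefined) value = '';
  args.splice(idx, 2);
  return value;
}

// Hypothetical helper mirroring the stdout interception in main().
// Buffers everything written to fd 1 while fn runs, restores fs.writeSync,
// and resolves a leading "@file:<path>" reference to that file's contents.
function captureStdout(fn) {
  const origWriteSync = fs.writeSync;
  const chunks = [];
  fs.writeSync = function (fd, data, ...rest) {
    if (fd === 1) { chunks.push(String(data)); return; }
    return origWriteSync.call(fs, fd, data, ...rest);
  };
  try {
    fn();
  } finally {
    // Always restore the original, even if fn throws.
    fs.writeSync = origWriteSync;
  }
  let captured = chunks.join('');
  if (captured.startsWith('@file:')) {
    captured = fs.readFileSync(captured.slice('@file:'.length), 'utf-8');
  }
  return captured;
}
```

As in the diff, the sketch resolves the reference only when the captured output begins with `@file:`, and it passes the remainder of the output to `readFileSync` unchanged.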

package/sdd/contexts/dev.md ADDED

@@ -0,0 +1,21 @@
+# Dev Context Profile
+Agent output guidance for dev mode. Loaded when `context: dev` is set in config.json.
+## Output Style
+- Concise, action-oriented responses
+- Lead with the code change or command, follow with brief rationale
+- Skip preamble — assume the developer has full context
+- Use inline code references (`file:line`) over prose descriptions
+## Focus Areas
+- Working code that compiles and passes tests
+- Minimal diff — change only what is necessary
+- Flag side effects or breaking changes immediately
+- Surface the next actionable step at the end of every response
+## Verbosity
+Low. One-liner explanations unless the change is non-obvious. Omit background theory, alternative approaches, and caveats that do not affect the current task.

package/sdd/contexts/research.md ADDED

@@ -0,0 +1,22 @@
+# Research Context Profile
+Agent output guidance for research mode. Loaded when `context: research` is set in config.json.
+## Output Style
+- Verbose, exploratory responses that surface trade-offs and alternatives
+- Present multiple approaches with pros and cons before recommending one
+- Include links, references, and citations where available
+- Use structured headings and bullet lists for scan-ability
+## Focus Areas
+- Breadth of options — enumerate before narrowing
+- Prior art and ecosystem conventions
+- Risks, edge cases, and failure modes
+- Dependencies and compatibility implications
+- Long-term maintainability of each approach
+## Verbosity
+High. Explain reasoning, show evidence, and document assumptions. Include background context even if the developer likely knows it — research artifacts are read by future contributors who may not.

package/sdd/contexts/review.md ADDED

@@ -0,0 +1,22 @@
+# Review Context Profile
+Agent output guidance for review mode. Loaded when `context: review` is set in config.json.
+## Output Style
+- Critical, detail-focused responses that prioritize correctness
+- Organize findings by severity: blocking, important, nit
+- Reference specific lines and files for every finding
+- State what is correct as well as what needs change — confirm the good parts
+## Focus Areas
+- Correctness — logic errors, off-by-ones, missing edge cases
+- Security — input validation, injection vectors, secret exposure
+- Performance — unnecessary allocations, O(n^2) patterns, missing caching
+- Style and consistency — naming, formatting, import order
+- Test coverage — untested branches, missing assertions, flaky patterns
+## Verbosity
+Medium. Be thorough on findings but terse in explanation. Each issue should be one to three sentences: what is wrong, why it matters, and how to fix it.
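
Each of the three context profiles states that it is loaded when the matching `context` value is set in config.json. The diff does not show the loader itself, so the following is only a sketch of how such a lookup might work. The function `contextProfilePath`, the `KNOWN_CONTEXTS` set, and the `contextsDir` parameter are assumptions made for illustration.

```javascript
const path = require('path');

// Assumption: the only valid context names are the three profiles added here.
const KNOWN_CONTEXTS = new Set(['dev', 'research', 'review']);

// Hypothetical lookup: maps a parsed config object to a profile file path.
// Returns null when no context is set or the name is not a known profile.
function contextProfilePath(config, contextsDir) {
  const name = config && config.context;
  if (!KNOWN_CONTEXTS.has(name)) return null;
  return path.join(contextsDir, `${name}.md`);
}
```

Checking the name against a fixed set keeps an arbitrary config value from being turned into a file path.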

package/sdd/references/agent-contracts.md ADDED

@@ -0,0 +1,79 @@
+# Agent Contracts
+Completion markers and handoff schemas for all SDD agents. Workflows use these markers to detect agent completion and route accordingly.
+This doc describes what IS, not what should be. Casing inconsistencies are documented as they appear in agent source files.
+---
+## Agent Registry
+| Agent | Role | Completion Markers |
+|-------|------|--------------------|
+| sdd-planner | Plan creation | `## PLANNING COMPLETE` |
+| sdd-executor | Plan execution | `## PLAN COMPLETE`, `## CHECKPOINT REACHED` |
+| sdd-phase-researcher | Phase-scoped research | `## RESEARCH COMPLETE`, `## RESEARCH BLOCKED` |
+| sdd-project-researcher | Project-wide research | `## RESEARCH COMPLETE`, `## RESEARCH BLOCKED` |
+| sdd-plan-checker | Plan validation | `## VERIFICATION PASSED`, `## ISSUES FOUND` |
+| sdd-research-synthesizer | Multi-research synthesis | `## SYNTHESIS COMPLETE`, `## SYNTHESIS BLOCKED` |
+| sdd-debugger | Debug investigation | `## DEBUG COMPLETE`, `## ROOT CAUSE FOUND`, `## CHECKPOINT REACHED` |
+| sdd-roadmapper | Roadmap creation/revision | `## ROADMAP CREATED`, `## ROADMAP REVISED`, `## ROADMAP BLOCKED` |
+| sdd-ui-auditor | UI review | `## UI REVIEW COMPLETE` |
+| sdd-ui-checker | UI validation | `## ISSUES FOUND` |
+| sdd-ui-researcher | UI spec creation | `## UI-SPEC COMPLETE`, `## UI-SPEC BLOCKED` |
+| sdd-verifier | Post-execution verification | `## Verification Complete` (title case) |
+| sdd-integration-checker | Cross-phase integration check | `## Integration Check Complete` (title case) |
+| sdd-nyquist-auditor | Sampling audit | `## PARTIAL`, `## ESCALATE` (non-standard) |
+| sdd-security-auditor | Security audit | `## OPEN_THREATS`, `## ESCALATE` (non-standard) |
+| sdd-codebase-mapper | Codebase analysis | No marker (writes docs directly) |
+| sdd-assumptions-analyzer | Assumption extraction | No marker (returns `## Assumptions` sections) |
+| sdd-doc-verifier | Doc validation | No marker (writes JSON to `.planning/tmp/`) |
+| sdd-doc-writer | Doc generation | No marker (writes docs directly) |
+| sdd-advisor-researcher | Advisory research | No marker (utility agent) |
+| sdd-user-profiler | User profiling | No marker (returns JSON in analysis tags) |
+| sdd-intel-updater | Codebase intelligence analysis | `## INTEL UPDATE COMPLETE`, `## INTEL UPDATE FAILED` |
+## Marker Rules
+1. **ALL-CAPS markers** (e.g., `## PLANNING COMPLETE`) are the standard convention
+2. **Title-case markers** (e.g., `## Verification Complete`) exist in sdd-verifier and sdd-integration-checker -- these are intentional as-is, not bugs
+3. **Non-standard markers** (e.g., `## PARTIAL`, `## ESCALATE`) in audit agents indicate partial results requiring orchestrator judgment
+4. **Agents without markers** either write artifacts directly to disk or return structured data (JSON/sections) that the caller parses
+5. Markers must appear as H2 headings (`## `) at the start of a line in the agent's final output
+## Key Handoff Contracts
+### Planner -> Executor (via PLAN.md)
+| Field | Required | Description |
+|-------|----------|-------------|
+| Frontmatter | Yes | phase, plan, type, wave, depends_on, files_modified, autonomous, requirements |
+| `<objective>` | Yes | What the plan achieves |
+| `<tasks>` | Yes | Ordered task list with type, files, action, verify, acceptance_criteria |
+| `<verification>` | Yes | Overall verification steps |
+| `<success_criteria>` | Yes | Measurable completion criteria |
+### Executor -> Verifier (via SUMMARY.md)
+| Field | Required | Description |
+|-------|----------|-------------|
+| Frontmatter | Yes | phase, plan, subsystem, tags, key-files, metrics |
+| Commits table | Yes | Per-task commit hashes and descriptions |
+| Deviations section | Yes | Auto-fixed issues or "None" |
+| Self-Check | Yes | PASSED or FAILED with details |
+## Workflow Regex Patterns
+Workflows match these markers to detect agent completion:
+**plan-phase.md matches:**
+- `## RESEARCH COMPLETE` / `## RESEARCH BLOCKED` (researcher output)
+- `## PLANNING COMPLETE` (planner output)
+- `## CHECKPOINT REACHED` (planner/executor pause)
+- `## VERIFICATION PASSED` / `## ISSUES FOUND` (plan-checker output)
+**execute-phase.md matches:**
+- `## PHASE COMPLETE` (all plans in phase done)
+- `## Self-Check: FAILED` (summary self-check)
+> **NOTE:** `## PLAN COMPLETE` is the sdd-executor's completion marker but execute-phase.md does not regex-match it. Instead, it detects executor completion via spot-checks (SUMMARY.md existence, git commit state). This is intentional behavior, not a mismatch.
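
The marker rules above (H2 heading, start of a line, case as documented) can be expressed as a small matcher. This is a sketch under stated assumptions, not the regexes the workflows actually use: the `MARKERS` table copies only a few rows from the Agent Registry, and `findMarker` and `escapeRegExp` are hypothetical names.

```javascript
// Subset of the Agent Registry, copied from the table above for illustration.
const MARKERS = {
  'sdd-planner': ['## PLANNING COMPLETE'],
  'sdd-executor': ['## PLAN COMPLETE', '## CHECKPOINT REACHED'],
  'sdd-plan-checker': ['## VERIFICATION PASSED', '## ISSUES FOUND'],
  'sdd-verifier': ['## Verification Complete'], // title case, as documented
};

// Escape regex metacharacters so a marker is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical matcher: returns the first marker for the agent that appears
// as an H2 heading at the start of a line in output, or null if none does.
// Matching is case-sensitive, so ALL-CAPS and title-case markers stay distinct.
function findMarker(agent, output) {
  for (const marker of MARKERS[agent] || []) {
    const re = new RegExp('^' + escapeRegExp(marker) + '(?=\\s|$)', 'm');
    if (re.test(output)) return marker;
  }
  return null;
}
```

The `m` flag makes `^` match at every line start, which is what rule 5 requires. A marker that appears mid-line or under a different heading level does not match.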

package/sdd/references/ai-evals.md ADDED

@@ -0,0 +1,156 @@
+# AI Evaluation Reference
+> Reference used by `sdd-eval-planner` and `sdd-eval-auditor`.
+> Based on "AI Evals for Everyone" course (Reganti & Badam) + industry practice.
+---
+## Core Concepts
+### Why Evals Exist
+AI systems are non-deterministic. Input X does not reliably produce output Y across runs, users, or edge cases. Evals are the continuous process of assessing whether your system's behavior meets expectations under real-world conditions — unit tests and integration tests alone are insufficient.
+### Model vs. Product Evaluation
+- **Model evals** (MMLU, HumanEval, GSM8K) — measure general capability in standardized conditions. Use as initial filter only.
+- **Product evals** — measure behavior inside your specific system, with your data, your users, your domain rules. This is where 80% of eval effort belongs.
+### The Three Components of Every Eval
+- **Input** — everything affecting the system: query, history, retrieved docs, system prompt, config
+- **Expected** — what good behavior looks like, defined through rubrics
+- **Actual** — what the system produced, including intermediate steps, tool calls, and reasoning traces
+### Three Measurement Approaches
+1. **Code-based metrics** — deterministic checks: JSON validation, required disclaimers, performance thresholds, classification flags. Fast, cheap, reliable. Use first.
+2. **LLM judges** — one model evaluates another against a rubric. Powerful for subjective qualities (tone, reasoning, escalation). Requires calibration against human judgment before trusting.
+3. **Human evaluation** — gold standard for nuanced judgment. Doesn't scale. Use for calibration, edge cases, periodic sampling, and high-stakes decisions.
+Most effective systems combine all three.
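A minimal sketch of the first approach, code-based metrics, might look like the following. The response shape (a JSON object with an `answer` field), the function name `check_output`, and the disclaimer text are all assumptions made for illustration.

```python
import json


def check_output(raw: str, required_fields: list[str], disclaimer: str) -> dict[str, bool]:
    """Run deterministic checks on one model response."""
    results = {"valid_json": False, "has_required_fields": False, "has_disclaimer": False}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return results  # nothing else can be checked if parsing fails
    results["valid_json"] = True
    if isinstance(data, dict):
        results["has_required_fields"] = all(f in data for f in required_fields)
        # Assumption: the user-visible text lives under the "answer" key.
        answer = str(data.get("answer", ""))
        results["has_disclaimer"] = disclaimer.lower() in answer.lower()
    return results
```

Checks like these are cheap enough to run on every response, which is why they are a sensible first layer before an LLM judge or human review is involved.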
+---
+## Evaluation Dimensions
+### Pre-Deployment (Development Phase)
+| Dimension | What It Measures | When It Matters |
+|-----------|-----------------|-----------------|
+| **Factual accuracy** | Correctness of claims against ground truth | RAG, knowledge bases, any factual assertions |
+| **Context faithfulness** | Response grounded in provided context vs. fabricated | RAG pipelines, document Q&A, retrieval-augmented systems |
+| **Hallucination detection** | Plausible but unsupported claims | All generative systems, high-stakes domains |
+| **Escalation accuracy** | Correct identification of when human intervention is needed | Customer service, healthcare, financial advisory |
+| **Policy compliance** | Adherence to business rules, legal requirements, disclaimers | Regulated industries, enterprise deployments |
+| **Tone/style appropriateness** | Match with brand voice, audience expectations, emotional context | Customer-facing systems, content generation |
+| **Output structure validity** | Schema compliance, required fields, format correctness | Structured extraction, API integrations, data pipelines |
+| **Task completion** | Whether the system accomplished the stated goal | Agentic workflows, multi-step tasks |
+| **Tool use correctness** | Correct selection and invocation of tools | Agent systems with tool calls |
+| **Safety** | Absence of harmful, biased, or inappropriate outputs | All user-facing systems |
+### Production Monitoring
+| Dimension | Monitoring Approach |
+|-----------|---------------------|
+| **Safety violations** | Online guardrail — real-time, immediate intervention |
+| **Compliance failures** | Online guardrail — block or escalate before user sees output |
+| **Quality degradation trends** | Offline flywheel — batch analysis of sampled interactions |
+| **Emerging failure modes** | Signal-metric divergence — when user behavior signals diverge from metric scores, investigate manually |
+| **Cost/latency drift** | Code-based metrics — automated threshold alerts |
+---
+## The Guardrail vs. Flywheel Decision
+Ask: "If this behavior goes wrong, would it be catastrophic for my business?"
+- **Yes → Guardrail** — run online, real-time, with immediate intervention (block, escalate, hand off). Be selective: guardrails add latency.
+- **No → Flywheel** — run offline as batch analysis feeding system refinements over time.
+---
+## Rubric Design
+Generic metrics are meaningless without context. "Helpfulness" in real estate means summarizing listings clearly. In healthcare it means knowing when *not* to answer.
+A rubric must define:
+1. The dimension being measured
+2. What scores of 1, 3, and 5 mean on a 5-point scale (or the pass/fail criteria)
+3. Domain-specific examples of acceptable vs. unacceptable behavior
+Without rubrics, LLM judges produce noise rather than signal.
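One way to keep rubrics honest is to store them as data and check that the three required parts are present. The `Rubric` class and its field names below are hypothetical, and the sketch covers only the 5-point variant, not pass/fail rubrics.

```python
from dataclasses import dataclass, field


@dataclass
class Rubric:
    dimension: str                      # the dimension being measured
    score_anchors: dict[int, str]       # what each anchor score means
    acceptable_examples: list[str] = field(default_factory=list)
    unacceptable_examples: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return a list of gaps; an empty list means the rubric is complete."""
        issues = []
        for score in (1, 3, 5):
            if score not in self.score_anchors:
                issues.append(f"missing anchor for score {score}")
        if not self.acceptable_examples:
            issues.append("no acceptable example")
        if not self.unacceptable_examples:
            issues.append("no unacceptable example")
        return issues
```

A rubric that returns an empty list from `problems()` is structurally complete; whether its anchors are good still needs a domain expert.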
+---
+## Reference Dataset Guidelines
+- Start with **10-20 high-quality examples** — not 200 mediocre ones
+- Cover: critical success scenarios, common user workflows, known edge cases, historical failure modes
+- Have domain experts label the examples (not just engineers)
+- Expand based on what you learn in production — don't build for hypothetical coverage
+---
+## Eval Tooling Guide
+| Tool | Type | Best For | Key Strength |
+|------|------|----------|-------------|
+| **RAGAS** | Python library | RAG evaluation | Purpose-built metrics: faithfulness, answer relevance, context precision/recall |
+| **Langfuse** | Platform (open-source, self-hostable) | All system types | Strong tracing, prompt management, good for teams wanting infrastructure control |
+| **LangSmith** | Platform (commercial) | LangChain/LangGraph ecosystems | Tightest integration with LangChain; best if already in that ecosystem |
+| **Arize Phoenix** | Platform (open-source + hosted) | RAG + multi-agent tracing | Strong RAG eval + trace visualization; open-source with hosted option |
+| **Braintrust** | Platform (commercial) | Model-agnostic evaluation | Dataset and experiment management; good for comparing across frameworks |
+| **Promptfoo** | CLI tool (open-source) | Prompt testing, CI/CD | CLI-first, excellent for CI/CD prompt regression testing |
+### Tool Selection by System Type
+| System Type | Recommended Tooling |
+|-------------|---------------------|
+| RAG / Knowledge Q&A | RAGAS + Arize Phoenix or Braintrust |
+| Multi-agent systems | Langfuse + Arize Phoenix |
+| Conversational / single-model | Promptfoo + Braintrust |
+| Structured extraction | Promptfoo + code-based validators |
+| LangChain/LangGraph projects | LangSmith (native integration) |
+| Production monitoring (all types) | Langfuse, Arize Phoenix, or LangSmith |
+---
+## Evals in the Development Lifecycle
+### Plan Phase (Evaluation-Aware Design)
+Before writing code, define:
+1. What type of AI system is being built → determines framework and dominant eval concerns
+2. Critical failure modes (3-5 behaviors that cannot go wrong)
+3. Rubrics — explicit definitions of acceptable/unacceptable behavior per dimension
+4. Evaluation strategy — which dimensions use code metrics, LLM judges, or human review
+5. Reference dataset requirements — size, composition, labeling approach
+6. Eval tooling selection
+Output: EVALS-SPEC section of AI-SPEC.md
+### Execute Phase (Instrument While Building)
+- Add tracing from day one (Langfuse, Arize Phoenix, or LangSmith)
+- Build reference dataset concurrently with implementation
+- Implement code-based checks first; add LLM judges only for subjective dimensions
+- Run evals in CI/CD via Promptfoo or Braintrust
+### Verify Phase (Pre-Deployment Validation)
+- Run full reference dataset against all metrics
+- Conduct human review of edge cases and LLM judge disagreements
+- Calibrate LLM judges against human scores (target ≥ 0.7 correlation before trusting)
+- Define and configure production guardrails
+- Establish monitoring baseline
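The calibration step can be sketched as a correlation check between human and judge scores on the same examples. The source does not say which correlation coefficient is meant; Pearson is assumed here, and `pearson` and `judge_is_calibrated` are hypothetical helper names.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


def judge_is_calibrated(human: list[float], judge: list[float],
                        threshold: float = 0.7) -> bool:
    """True when judge scores track human scores at or above the threshold."""
    return pearson(human, judge) >= threshold
```

With only 10-20 reference examples the estimate is noisy, so a value near the threshold is a reason to label more examples rather than a clear pass.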
+### Monitor Phase (Production Evaluation Loop)
+- Smart sampling — weight toward interactions with concerning signals (retries, unusual length, explicit escalations)
+- Online guardrails on every interaction
+- Offline flywheel on sampled batch
+- Watch for signal-metric divergence — the early warning system for evaluation gaps
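Smart sampling from the list above could be approximated with weighted random selection. The interaction fields (`retries`, `escalated`, `length`), the weight values, and the length cutoff are illustrative assumptions, not values taken from the source.

```python
import random


def signal_weight(interaction: dict) -> float:
    """Give interactions with concerning signals a higher sampling weight."""
    weight = 1.0  # baseline so ordinary interactions can still be sampled
    if interaction.get("retries", 0) > 0:
        weight += 2.0
    if interaction.get("escalated", False):
        weight += 3.0
    if interaction.get("length", 0) > 2000:  # assumed "unusual length" cutoff
        weight += 1.0
    return weight


def sample_for_review(interactions: list[dict], k: int, seed=None) -> list[dict]:
    """Draw k interactions for offline review, biased toward risky ones.

    random.choices samples with replacement, so duplicates are possible.
    """
    rng = random.Random(seed)
    weights = [signal_weight(i) for i in interactions]
    return rng.choices(interactions, weights=weights, k=k)
```

Keeping a nonzero baseline weight matters: sampling only flagged interactions would hide failure modes that produce no obvious signal.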
+---
+## Common Pitfalls
+1. **Assuming benchmarks predict product success** — they don't; model evals are a filter, not a verdict
+2. **Engineering evals in isolation** — domain experts must co-define rubrics; engineers alone miss critical nuances
+3. **Building comprehensive coverage on day one** — start small (10-20 examples), expand from real failure modes
+4. **Trusting uncalibrated LLM judges** — validate against human judgment before relying on them
+5. **Measuring everything** — only track metrics that drive decisions; "collect it all" produces noise
+6. **Treating evaluation as one-time setup** — user behavior evolves, requirements change, failure modes emerge; evaluation is continuous