npm - cclaw-cli - Versions diffs - 3.0.0 → 5.0.0 - Mend

cclaw-cli 3.0.0 → 5.0.0

Files changed (41)

package/dist/artifact-linter/brainstorm.js +51 -2
package/dist/artifact-linter/design.js +14 -3
package/dist/artifact-linter/review-army.d.ts +25 -0
package/dist/artifact-linter/review-army.js +155 -0
package/dist/artifact-linter/review.js +13 -0
package/dist/artifact-linter/scope.js +27 -48
package/dist/artifact-linter/shared.d.ts +98 -11
package/dist/artifact-linter/shared.js +280 -113
package/dist/artifact-linter.d.ts +12 -2
package/dist/artifact-linter.js +29 -13
package/dist/content/core-agents.js +6 -1
package/dist/content/examples.js +8 -0
package/dist/content/hooks.js +2 -1
package/dist/content/idea.js +14 -2
package/dist/content/review-prompts.js +3 -3
package/dist/content/skills-elicitation.js +61 -20
package/dist/content/skills.js +19 -6
package/dist/content/stage-schema.js +46 -18
package/dist/content/stages/_lint-metadata/index.js +1 -2
package/dist/content/stages/brainstorm.js +6 -3
package/dist/content/stages/design.js +13 -12
package/dist/content/stages/plan.js +1 -1
package/dist/content/stages/review.js +21 -21
package/dist/content/stages/schema-types.d.ts +9 -0
package/dist/content/stages/scope.js +22 -20
package/dist/content/stages/spec.js +3 -3
package/dist/content/stages/tdd.js +1 -0
package/dist/content/templates.d.ts +8 -1
package/dist/content/templates.js +115 -43
package/dist/flow-state.d.ts +12 -0
package/dist/gate-evidence.d.ts +37 -1
package/dist/gate-evidence.js +37 -3
package/dist/harness-adapters.js +8 -0
package/dist/install.js +22 -11
package/dist/internal/advance-stage/advance.d.ts +1 -0
package/dist/internal/advance-stage/advance.js +5 -2
package/dist/internal/advance-stage/parsers.d.ts +8 -0
package/dist/internal/advance-stage/parsers.js +27 -1
package/dist/internal/advance-stage/start-flow.js +13 -0
package/dist/run-persistence.js +14 -2
package/package.json +1 -1

package/dist/content/skills-elicitation.js CHANGED

@@ -18,18 +18,38 @@ export function adaptiveElicitationSkillMarkdown() {
     const budgetTable = renderQuestionBudgetHintTable();
     return `---
 name: adaptive-elicitation
-description: "Harness-native one-question-at-a-time dialogue for brainstorm/scope/design with stop signals, smart-skip, and append-only Q&A logging."
+description: "Harness-native one-question-at-a-time dialogue for brainstorm/scope/design with stop signals, smart-skip, and append-only Q&A logging. Walking forcing questions in order is mandatory; the linter blocks stage-complete when Q&A Log is below floor."
 ---
 # Adaptive Elicitation
 Pinned anchor: "Don't tell it what to do, give it success criteria and watch it go."
-## HARD-GATE
+## Anti-pattern (BAD examples — never do these)
+These behaviors are the exact reason this skill exists. The linter will block your stage-complete if you do them.
+- **Bad**: User asks for a "simple web app" -> agent asks 1 question about stack -> 1 question about auth -> drafts the brainstorm artifact and asks for approval.
+- **Good**: User asks for a "simple web app" -> agent asks Q1 (what pain) -> Q2 (direct path) -> Q3 (do-nothing cost) -> Q4 (first operator/user) -> Q5 (no-go boundaries) -> self-eval: clear -> drafts the brainstorm artifact.
+- **Bad**: Agent immediately dispatches a subagent (\`product-discovery\`, \`critic\`, \`planner\`) at the start of brainstorm/scope/design to "gather context" before any user dialogue.
+- **Good**: Agent walks the Q&A loop with the user first; subagent dispatch happens only after the user approves the elicitation outcome.
+- **Bad**: Agent batches 3-5 grill questions into one large message and asks the user to answer them all at once.
+- **Good**: Agent asks one grill question, waits, logs the answer, asks the next.
+- **Bad**: Agent skips forcing questions because it "already has a good idea" of the answer.
+- **Good**: Agent asks the forcing question; if the user's reply confirms the assumption, log it as \`asked (confirmed assumption)\` and move on. Do not silently skip.
+## HARD-GATE (machine-enforced)
 - User does not run cclaw manually. Do not tell the user to run CLI commands for answers.
 - Ask exactly one question per turn and wait for the answer before asking the next one.
 - Use harness-native question tools first; prose fallback is allowed only when the tool is unavailable.
 - Keep a running Q&A trace in the active artifact under \`## Q&A Log\` in \`${RUNTIME_ROOT}/artifacts/\` as append-only rows.
+- **Convergence floor**: do NOT advance the stage (do NOT call \`stage-complete.mjs\`) until Q&A converges. Convergence is reached when ANY of: (a) all forcing-question topics are addressed in \`## Q&A Log\`, (b) the last 2 substantive rows produce no decision-changing impact (\`skip\`/\`continue\`/\`no-change\`/\`done\`), or (c) an explicit user stop-signal row is recorded. The linter rule \`qa_log_unconverged\` enforces this; \`stage-complete\` will fail otherwise. Wave 23 (v5.0.0) replaced the fixed-count floor with this convergence detector.
+- **NEVER run shell hash commands** (\`shasum\`, \`sha256sum\`, \`md5sum\`, \`Get-FileHash\`, \`certutil\`, etc.) to compute artifact hashes. If a linter ever asks you for a hash, that is a linter bug — report failure and stop, do not auto-fix in bash.
+- **NEVER paste cclaw command lines into chat** (e.g. \`node .cclaw/hooks/stage-complete.mjs ... --evidence-json '{...}'\`). Run them via the tool layer; report only the resulting summary. The user does not run cclaw manually and seeing the command line is noise.
 ## Harness Question Surface
@@ -43,67 +63,84 @@ If unavailable, ask one concise prose question and explicitly wait for chat answ
 ## Core Protocol
-1. Ask one decision-changing question.
+1. Ask one decision-changing question via the harness-native question tool.
 2. Wait for the answer.
 3. Append one row to \`## Q&A Log\`: \`Turn | Question | User answer (1-line) | Decision impact\`.
 4. Self-evaluate:
    - What did I learn?
    - Is context enough to draft now? (yes/no + reason)
-   - If no, what is the next most decision-changing question?
-5. Repeat until context is clear OR user asks to proceed.
+   - Have I covered all stage forcing questions in order? (yes/no + which remain)
+   - If forcing questions remain or context is incomplete, what is the next decision-changing question?
+5. Repeat until **all forcing questions are answered/skipped/waived AND self-evaluation says context is sufficient**, OR user records an explicit stop-signal row.
 ## Question Shape Rules
 - Prefer single-select multiple choice when one direction/priority/next step must be chosen.
 - Use multi-select only for compatible sets (goals, constraints, non-goals).
-- Smart-skip questions already answered earlier (directly or implicitly) and log "skipped (already covered)" when relevant.
+- Smart-skip: if a question is already answered earlier (directly or implicitly), log \`skipped (already covered: turn N)\` instead of skipping silently. The smart-skip row counts as a substantive Q&A Log entry for floor purposes.
 ## Stop Signals (Natural Language)
 Treat these as stop-and-draft signals:
-- RU: "достаточно", "хватит", "давай драфт"
-- EN: "enough", "skip", "just draft it", "stop asking", "move on"
+- RU: "достаточно", "хватит", "давай драфт", "хватит вопросов"
+- EN: "enough", "skip", "just draft it", "stop asking", "move on", "no more questions"
 - UA: "досить", "вистачить", "давай драфт", "рухаємось далі"
 When detected:
+- Append a Q&A Log row exactly like: \`Turn N | (stop-signal) | <user quote> | stop-and-draft\` — this row satisfies the linter floor escape hatch.
 - Do not ask another question in this stage loop.
 - Move to drafting with available context.
-- For internal agent calls only, pass \`--skip-questions\` on the next advance helper call.
+- For the next internal agent-only call to advance-stage, pass \`--skip-questions\`. **The user never sees or types this flag.**
 ## Conditional Grilling (Only On Risk Triggers)
-Ask an extra 3-5 sharp questions only when one of these triggers appears:
+When one of these triggers appears, continue the elicitation loop with sharper questions **one at a time** (do NOT batch them):
 - Irreversibility (data deletion, schema migration, breaking API/contract)
 - Security/auth boundary changes
 - Domain-model ambiguity with multiple plausible invariants
+Each grill question follows the same Core Protocol: ask one, wait, log, self-eval, ask next.
 Do not ask extra questions "for theater" on simple low-risk work.
-## Question Budget Hint (Soft Guidance)
+## Question Budget Hint (advisory only — Wave 23 dropped the count floor)
-Use as orientation, never as a hard stop. Source of truth is \`questionBudgetHint(track, stage)\`:
+Source of truth: \`questionBudgetHint(track, stage)\`. The numbers below are
+**soft hints** for harness UI and elicitation pacing; gate blocking is done
+by the \`qa_log_unconverged\` rule (Ralph-Loop convergence detector), NOT by
+a fixed count.
 ${budgetTable}
 Track mapping note: \`quick\` ~= lightweight, \`medium\` ~= standard, \`standard\` ~= deep.
-Stop based on clarity/user signal, not raw count.
-## Stage Forcing Questions
+How to use the columns:
+- \`Min\` — soft minimum to surface forcing questions; not a blocking gate.
+- \`Recommended\` — target for normal flows.
+- \`Hard cap warning\` — point at which to stop or compress remaining forcing questions into one final batched ask. Not skip.
+## Stage Forcing Questions (walk in order, one per turn)
-Always keep at least one unresolved forcing question in play until answered or explicitly waived:
+**Walk the forcing questions list one-by-one in order, asking each as a separate turn.** Do NOT batch. Do NOT pick favorites — go in order. For each question record one of:
+- \`asked\` — question was asked and answered.
+- \`asked (confirmed assumption)\` — question was asked, user confirmed your prior reading.
+- \`skipped (already covered: turn N)\` — answered implicitly by an earlier reply; cite the turn.
+- \`waived (user override)\` — user explicitly waived this question.
-- Brainstorm:
+Stage forcing question lists:
+- **Brainstorm**:
   - What pain are we solving?
   - What is the most direct path?
   - What happens if we do nothing?
   - Who is the operator/user impacted first?
   - What are non-negotiable no-go boundaries?
-- Scope:
+- **Scope**:
   - What is definitely in and definitely out?
   - Which decisions are already locked upstream?
   - What is the rollback path if this fails?
   - What are the top failure modes we must design for?
-- Design:
+- **Design**:
   - What is the data flow end-to-end?
   - Where are the seams/interfaces and ownership boundaries?
   - Which invariants must always hold?
@@ -118,6 +155,10 @@ For irreversible moves (deletion, schema migration, breaking API):
 ## Completion Rule
-"Continue until clear OR user wants to proceed."
-Never force a fixed N-question script.`;
+Continue asking forcing questions in order until one of:
+- (a) all forcing questions for the stage are answered/skipped/waived AND self-evaluation says context is sufficient, OR
+- (b) user records an explicit stop-signal row in \`## Q&A Log\`, OR
+- (c) the \`hard cap warning\` count is reached and you compressed the remaining forcing questions into one final batched ask (not skip).
+Do NOT exit the loop after the first 1-2 questions just because you can draft something. The point of the loop is to surface the user's actual constraints, not to confirm your initial reading.`;
 }
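
The convergence floor described in the skill text above (all forcing topics addressed, OR the last two rows carry no decision-changing impact, OR an explicit stop-signal row) can be sketched roughly as follows. This is a minimal illustration, not the package's `qa_log_unconverged` implementation, which this diff does not show. The names `parseQaRow` and `convergenceReason` are hypothetical, topic coverage is assumed to be a case-insensitive substring match on the question cell, and every row is treated as substantive.

```javascript
// Hypothetical sketch of the convergence check; not the cclaw-cli linter code.
// Impact values that the skill text lists as "no decision-changing impact".
const NO_CHANGE_IMPACTS = new Set(["skip", "continue", "no-change", "done"]);

// Parse one "Turn | Question | User answer (1-line) | Decision impact" row.
// Missing cells default to empty strings so later checks never see undefined.
function parseQaRow(line) {
  const [turn = "", question = "", answer = "", impact = ""] = line
    .split("|")
    .map((cell) => cell.trim());
  return { turn, question, answer, impact };
}

// Returns the reason the log counts as converged, or null if it has not.
function convergenceReason(rows, forcingTopics) {
  // (c) an explicit stop-signal row is recorded
  if (rows.some((row) => row.question === "(stop-signal)")) return "stop-signal";
  // (a) every forcing topic shows up in some question (assumed matching rule)
  const allCovered = forcingTopics.every((topic) =>
    rows.some((row) => row.question.toLowerCase().includes(topic.toLowerCase()))
  );
  if (allCovered) return "topics-covered";
  // (b) the last 2 rows changed no decision
  const lastTwo = rows.slice(-2);
  const quiet =
    lastTwo.length === 2 &&
    lastTwo.every((row) => NO_CHANGE_IMPACTS.has(row.impact.toLowerCase()));
  return quiet ? "no-new-decisions" : null;
}

const sampleLog = [
  "1 | What pain are we solving? | Onboarding takes days | narrows problem",
  "2 | (stop-signal) | just draft it | stop-and-draft"
].map(parseQaRow);
console.log(convergenceReason(sampleLog, ["pain", "do nothing"])); // prints "stop-signal"
```

Returning the reason rather than a boolean keeps the sketch easy to report on; a real linter would presumably also need to decide which rows count as substantive.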

package/dist/content/skills.js CHANGED

@@ -175,18 +175,23 @@ function autoSubagentDispatchBlock(stage, track) {
         const userGate = rule.requiresUserGate ? "required" : "not required";
         const dispatchClass = rule.dispatchClass ?? "stage-specialist";
         const returnSchema = rule.returnSchema ?? "agent-default";
-        return `| ${rule.agent} | ${rule.mode} | ${dispatchClass} | ${returnSchema} | ${userGate} | ${rule.when} | ${rule.purpose} |`;
+        const runPhase = rule.runPhase ?? "any";
+        return `| ${rule.agent} | ${rule.mode} | ${runPhase} | ${dispatchClass} | ${returnSchema} | ${userGate} | ${rule.when} | ${rule.purpose} |`;
     })
         .join("\n");
     const mandatory = schema.mandatoryDelegations;
     const mandatoryList = mandatory.length > 0 ? mandatory.map((a) => `\`${a}\``).join(", ") : "none";
     const delegationLogRel = `${RUNTIME_ROOT}/state/delegation-log.json`;
     const delegationEventsRel = `${RUNTIME_ROOT}/state/delegation-events.jsonl`;
+    const hasPostElicitation = rules.some((rule) => rule.runPhase === "post-elicitation");
+    const runPhaseLegend = hasPostElicitation
+        ? `\nRun Phase legend: \`post-elicitation\` = run only AFTER the adaptive elicitation Q&A loop converges (forcing questions answered/skipped/waived OR user stop-signal recorded). \`pre-elicitation\` = run before any user dialogue (rare). \`any\` = no ordering constraint.`
+        : "";
     return `## Automatic Subagent Dispatch
-| Agent | Mode | Class | Return Schema | User Gate | Trigger | Purpose |
-|---|---|---|---|---|---|---|
+| Agent | Mode | Run Phase | Class | Return Schema | User Gate | Trigger | Purpose |
+|---|---|---|---|---|---|---|---|
 ${rows}
-Mandatory: ${mandatoryList}. Record lifecycle rows in \`${delegationLogRel}\` and append-only \`${delegationEventsRel}\` before completion.
+Mandatory: ${mandatoryList}. Record lifecycle rows in \`${delegationLogRel}\` and append-only \`${delegationEventsRel}\` before completion.${runPhaseLegend}
 ### Harness Dispatch Contract — use true harness dispatch: Claude Task, Cursor generic dispatch, OpenCode \`.opencode/agents/<agent>.md\` via Task/@agent, Codex \`.codex/agents/<agent>.toml\`. Do not collapse OpenCode or Codex to role-switch by default. Worker ACK Contract: ACK must include \`spanId\`, \`dispatchId\`, \`dispatchSurface\`, \`agentDefinitionPath\`, and \`ackTs\`; never claim \`fulfillmentMode: "isolated"\` without matching lifecycle proof. Helper: \`.cclaw/hooks/delegation-record.mjs --status=<status> --span-id=<spanId> --dispatch-id=<dispatchId> --dispatch-surface=<surface> --agent-definition-path=<path> --json\`. Exact recipe: scheduled -> launched -> acknowledged -> completed with the same span; completed isolated/generic rows require a prior ACK event for that span or \`--ack-ts=<iso>\`.
 ${perHarnessLifecycleRecipeBlock()}`;
@@ -392,6 +397,14 @@ function delegationAndCompletionBlock(schema, track) {
 ${normalizedDispatch}
 ${completionBlock}
+### Stage Closure (harness-only UX)
+- **NEVER paste the \`stage-complete.mjs\` command line into chat.** The user does not run cclaw manually; seeing \`node .cclaw/hooks/stage-complete.mjs ... --evidence-json '{...}' --waive-delegation=...\` is noise. Run the helper via the tool layer; report only the resulting summary.
+- **NEVER paste the \`--evidence-json\` payload into chat.** It is structured data for the helper, not for the user. The same evidence already lives in the artifact section.
+- On failure, report a compact human-readable summary based on the helper's JSON \`findings\` array — list failing section names only (one line each), include the full helper JSON in a single fenced \`json\` block. Do not echo the invoking command.
+- **NEVER run shell hash commands** (\`shasum\`, \`sha256sum\`, \`md5sum\`, \`Get-FileHash\`, \`certutil\`, etc.) for hash compute. If the linter ever asks for a hash, that is a linter bug — report failure and stop, do not auto-fix in bash.
+- The helper defaults to quiet success (\`CCLAW_STAGE_COMPLETE_QUIET=1\`); rely on the resulting JSON, not stdout chatter.
 `;
 }
 function quickStartBlock(stage, track) {
@@ -638,10 +651,10 @@ CLI commands, using existing \`cclaw run resume\` and \`internal verify-current-
 1. **Wave Start**: author wave plan as \`.cclaw/wave-plans/<wave-n>.md\` referencing previous wave's ship artifact.
 2. **Carry-forward Audit**: at brainstorm of the next wave, re-read previous wave ship artifact and explicitly record in the existing \`## Wave Carry-forward\` section:
-   - Carrying forward: <scope LD# hash references still valid>
+   - Carrying forward: <scope D-XX decision references still valid>
    - Drift detected: <decisions no longer valid + reason>
    - Re-scope needed: <yes/no>
-   - Never create a second \`## Locked Decisions\` heading in brainstorm; reference prior LD# hashes inline.
+   - Never create a second \`## Locked Decisions\` heading in brainstorm; reference prior D-XX IDs inline.
 3. **Resume Path**: if a wave was interrupted mid-stage, \`cclaw run resume\` restores state. Run \`internal verify-current-state\` before continuing.
 4. **Wave End**: at ship, architect cross-stage verification runs from dispatch matrix. If \`DRIFT_DETECTED\`, fix before ship.
 5. **Next Wave Trigger**: launch new \`/cc <topic>\` for next wave and reference previous wave ship artifact in upstream handoff.
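
To see what the new `Run Phase` column does to the rendered dispatch table, the row-building logic from the first hunk above can be reproduced in a standalone form. This is a sketch under assumptions: `renderDispatchTable` and `sampleRules` are hypothetical names, and the rule values are made up for illustration. Only the field names and the `?? "any"` default come from the diff.

```javascript
// Illustrative rules; field names follow the diff, values are invented.
const sampleRules = [
  {
    agent: "critic",
    mode: "mandatory",
    runPhase: "post-elicitation",
    requiresUserGate: false,
    when: "Always",
    purpose: "Challenge the premise"
  },
  {
    // No runPhase here, so the column falls back to "any".
    agent: "researcher",
    mode: "proactive",
    requiresUserGate: false,
    when: "On new context",
    purpose: "Summarize findings"
  }
];

// Hypothetical standalone renderer mirroring the 8-column table in the diff.
function renderDispatchTable(rules) {
  const header =
    "| Agent | Mode | Run Phase | Class | Return Schema | User Gate | Trigger | Purpose |";
  const divider = "|---|---|---|---|---|---|---|---|";
  const rows = rules.map((rule) => {
    const userGate = rule.requiresUserGate ? "required" : "not required";
    const dispatchClass = rule.dispatchClass ?? "stage-specialist";
    const returnSchema = rule.returnSchema ?? "agent-default";
    const runPhase = rule.runPhase ?? "any";
    return `| ${rule.agent} | ${rule.mode} | ${runPhase} | ${dispatchClass} | ${returnSchema} | ${userGate} | ${rule.when} | ${rule.purpose} |`;
  });
  return [header, divider, ...rows].join("\n");
}

console.log(renderDispatchTable(sampleRules));
```

Because the header, divider, and rows all gained one cell in the same change, any consumer that parses this table by column position would need to account for the shift.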

package/dist/content/stage-schema.js CHANGED

@@ -439,20 +439,32 @@ const STAGE_SCHEMA_MAP = {
     review: REVIEW,
     ship: SHIP
 };
+/**
+ * Stage-level subagent dispatch matrix.
+ *
+ * NOTE on `fixer`: the `fixer` agent is intentionally NOT listed in any stage
+ * row. It is dispatched on-demand by the SDD `subagent-dev` skill (and by
+ * reviewer flows) when a review surfaces a concrete failing criterion that
+ * needs a fresh worker. Adding `fixer` to the static matrix would create
+ * proactive-waiver theatre because it can only run after a specific review
+ * finding exists. See `core-agents.ts` `fixer` definition for the contract.
+ */
 const STAGE_AUTO_SUBAGENT_DISPATCH = {
     brainstorm: [
         {
             agent: "product-discovery",
             mode: "mandatory",
             requiredAtTier: "standard",
-            when: "Always for standard/deep brainstorm to validate value, persona/JTBD, success metric, and why-now framing.",
+            runPhase: "post-elicitation",
+            when: "Always for standard/deep brainstorm to validate value, persona/JTBD, success metric, and why-now framing. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Run product-discovery mode to pressure-test problem/value fit and produce product evidence for the Problem Decision Record.",
             requiresUserGate: false
         },
         {
             agent: "divergent-thinker",
             mode: "proactive",
-            when: "When brainstorm has >1 candidate direction or user signals openness to alternatives.",
+            runPhase: "post-elicitation",
+            when: "When brainstorm has >1 candidate direction or user signals openness to alternatives. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Expand option-space with alternative framings and approaches before planner/critic convergence.",
             requiresUserGate: false
         },
@@ -460,7 +472,8 @@ const STAGE_AUTO_SUBAGENT_DISPATCH = {
             agent: "critic",
             mode: "mandatory",
             requiredAtTier: "standard",
-            when: "Always for standard/deep brainstorm to challenge the premise, do-nothing path, and higher-upside alternatives.",
+            runPhase: "post-elicitation",
+            when: "Always for standard/deep brainstorm to challenge the premise, do-nothing path, and higher-upside alternatives. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Attack assumptions and surface non-goals before direction approval, with pre-commitment predictions validated against evidence.",
             requiresUserGate: false,
             skill: "critic-multi-perspective"
@@ -468,7 +481,8 @@ const STAGE_AUTO_SUBAGENT_DISPATCH = {
         {
             agent: "researcher",
             mode: "proactive",
-            when: "When repository, market, docs, or prior-art context changes the approach set.",
+            runPhase: "post-elicitation",
+            when: "When repository, market, docs, or prior-art context changes the approach set. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Provide search-before-read summaries and context-readiness evidence before large reads or decisions.",
             requiresUserGate: false
         }
@@ -478,14 +492,16 @@ const STAGE_AUTO_SUBAGENT_DISPATCH = {
             agent: "planner",
             mode: "mandatory",
             requiredAtTier: "standard",
-            when: "Always during scope shaping.",
+            runPhase: "post-elicitation",
+            when: "Always during scope shaping. Runs only after the adaptive elicitation Q&A loop converges and the user has approved the scope contract draft.",
             purpose: "Challenge premise, map alternatives, and produce explicit in/out contract.",
             requiresUserGate: false
         },
         {
             agent: "divergent-thinker",
             mode: "proactive",
-            when: "When scope mode is SCOPE EXPANSION or SELECTIVE EXPANSION, or scope contract has fewer than 3 alternatives considered.",
+            runPhase: "post-elicitation",
+            when: "When scope mode is SCOPE EXPANSION or SELECTIVE EXPANSION, or scope contract has fewer than 3 alternatives considered. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Generate additional framings and approach variants before scope convergence hardens.",
             requiresUserGate: false
         },
@@ -493,7 +509,8 @@ const STAGE_AUTO_SUBAGENT_DISPATCH = {
             agent: "critic",
             mode: "mandatory",
             requiredAtTier: "standard",
-            when: "Always during scope shaping for standard/deep work.",
+            runPhase: "post-elicitation",
+            when: "Always during scope shaping for standard/deep work. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Test whether the selected scope mode is too timid, too broad, or hiding a smaller useful slice, using pre-commitment predictions and validation.",
             requiresUserGate: false,
             skill: "critic-multi-perspective"
@@ -501,14 +518,16 @@ const STAGE_AUTO_SUBAGENT_DISPATCH = {
         {
             agent: "researcher",
             mode: "proactive",
-            when: "When churn, prior attempts, reference patterns, or external constraints may change scope boundaries.",
+            runPhase: "post-elicitation",
+            when: "When churn, prior attempts, reference patterns, or external constraints may change scope boundaries. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Summarize search/context findings before the scope contract locks accepted/rejected/deferred ideas.",
             requiresUserGate: false
         },
         {
             agent: "product-discovery",
             mode: "proactive",
-            when: "When scope choices change user value, success metrics, or product positioning (Mode: discovery).",
+            runPhase: "post-elicitation",
+            when: "When scope choices change user value, success metrics, or product positioning (Mode: discovery). Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Keep accepted/deferred reference ideas tied to user value and measurable success under product-discovery mode.",
             requiresUserGate: false
         },
@@ -516,14 +535,16 @@ const STAGE_AUTO_SUBAGENT_DISPATCH = {
             agent: "product-discovery",
             mode: "proactive",
             requiredAtTier: "standard",
-            when: "When scope mode resolves to SCOPE EXPANSION or SELECTIVE EXPANSION (Mode: strategist).",
+            runPhase: "post-elicitation",
+            when: "When scope mode resolves to SCOPE EXPANSION or SELECTIVE EXPANSION (Mode: strategist). Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Drive 10x vision and concrete expansion proposals before locking the scope contract via product-discovery strategist mode.",
             requiresUserGate: false
         },
         {
             agent: "scope-guardian-reviewer",
             mode: "proactive",
-            when: "When scope mode is SCOPE EXPANSION or SELECTIVE EXPANSION, or scope contract has many accepted ideas.",
+            runPhase: "post-elicitation",
+            when: "When scope mode is SCOPE EXPANSION or SELECTIVE EXPANSION, or scope contract has many accepted ideas. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Challenge complexity growth and enforce minimum-change scope discipline before scope lock.",
             requiresUserGate: false,
             skill: "document-scope-guard"
@@ -534,7 +555,8 @@ const STAGE_AUTO_SUBAGENT_DISPATCH = {
             agent: "architect",
             mode: "mandatory",
             requiredAtTier: "standard",
-            when: "Always during design lock.",
+            runPhase: "post-elicitation",
+            when: "Always during design lock. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Stress architecture boundaries, dependency graph, critical path, and spec handoff.",
             requiresUserGate: false
         },
@@ -542,14 +564,16 @@ const STAGE_AUTO_SUBAGENT_DISPATCH = {
             agent: "test-author",
             mode: "mandatory",
             requiredAtTier: "standard",
-            when: "Always during design lock.",
+            runPhase: "post-elicitation",
+            when: "Always during design lock. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Check test diagram mapping, RED expressibility, assertion quality, and verification routes before implementation.",
             requiresUserGate: false
         },
         {
             agent: "critic",
             mode: "proactive",
-            when: "When architecture alternatives, coupling, cost, or rollback risk remain debatable, or when security/auth/authz trust boundaries are involved.",
+            runPhase: "post-elicitation",
+            when: "When architecture alternatives, coupling, cost, or rollback risk remain debatable, or when security/auth/authz trust boundaries are involved. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Produce a shadow alternative, switch trigger, and cheaper-path challenge for the engineering lock with pre-commitment predictions and validation.",
             requiresUserGate: false,
             skill: "critic-multi-perspective"
@@ -557,21 +581,24 @@ const STAGE_AUTO_SUBAGENT_DISPATCH = {
         {
             agent: "researcher",
             mode: "proactive",
-            when: "When framework/library docs, repo graph context, or reference contracts may change the design.",
+            runPhase: "post-elicitation",
+            when: "When framework/library docs, repo graph context, or reference contracts may change the design. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Run search-before-read context synthesis before architecture locks.",
             requiresUserGate: false
         },
         {
             agent: "security-reviewer",
             mode: "proactive",
-            when: "When trust boundaries, auth, secrets, sensitive data, or external inputs are involved.",
+            runPhase: "post-elicitation",
+            when: "When trust boundaries, auth, secrets, sensitive data, or external inputs are involved. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Catch design-level security risks before implementation.",
             requiresUserGate: false
         },
         {
             agent: "coherence-reviewer",
             mode: "proactive",
-            when: "When design touches multiple subsystems or includes multiple alternatives sections.",
+            runPhase: "post-elicitation",
+            when: "When design touches multiple subsystems or includes multiple alternatives sections. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Detect internal contradictions, terminology drift, and broken cross-section references in design docs.",
             requiresUserGate: false,
             skill: "document-coherence-pass"
@@ -579,7 +606,8 @@ const STAGE_AUTO_SUBAGENT_DISPATCH = {
         {
             agent: "feasibility-reviewer",
             mode: "proactive",
-            when: "When design assumes runtime conditions, scaling behavior, or external service availability.",
+            runPhase: "post-elicitation",
+            when: "When design assumes runtime conditions, scaling behavior, or external service availability. Runs only after the adaptive elicitation Q&A loop converges.",
             purpose: "Validate that design assumptions remain feasible in real runtime and rollout constraints.",
             requiresUserGate: false,
             skill: "document-feasibility-pass"
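
One way a consumer might act on the `runPhase` field added throughout this matrix is to filter rules by the current elicitation phase. The diff only shows the field being rendered into prompt text, so whether anything enforces it in code is not visible here. The helper below is purely hypothetical: `dispatchableAgents` and the `"eliciting"` phase value are assumptions made for this sketch.

```javascript
// Hypothetical helper; the diff does not show any code that filters on runPhase.
// currentPhase is assumed to be "pre-elicitation", "eliciting", or "post-elicitation".
function dispatchableAgents(rules, currentPhase) {
  return rules
    .filter((rule) => {
      // Rules without a runPhase default to "any", as in the rendered table.
      const runPhase = rule.runPhase ?? "any";
      return runPhase === "any" || runPhase === currentPhase;
    })
    .map((rule) => rule.agent);
}

// Trimmed-down rules; only the fields this sketch needs.
const brainstormRules = [
  { agent: "product-discovery", mode: "mandatory", runPhase: "post-elicitation" },
  { agent: "critic", mode: "mandatory", runPhase: "post-elicitation" },
  { agent: "hypothetical-unordered-agent", mode: "proactive" }
];

console.log(dispatchableAgents(brainstormRules, "eliciting"));
console.log(dispatchableAgents(brainstormRules, "post-elicitation"));
```

Under these assumptions, no `post-elicitation` agent is eligible while the Q&A loop is still running, which matches the "ADAPTIVE ELICITATION COMES FIRST" rule stated in the stage checklist.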

package/dist/content/stages/_lint-metadata/index.js CHANGED

@@ -18,8 +18,7 @@ const STAGE_POLICY_NEEDLES = {
         "In Scope",
         "Out of Scope",
         "Discretion Areas",
-        "NOT in scope",
-        "Premise Challenge",
+        "Premise Drift",
         "Locked Decisions",
         "Victory Detector",
         "Critic Pass"

package/dist/content/stages/brainstorm.js CHANGED

@@ -36,8 +36,8 @@ export const BRAINSTORM = {
     },
     executionModel: {
         checklist: [
-            "**Explore project context** — inspect existing files/docs/recent activity before asking what to build; capture matching files/patterns/seeds in `Context > Discovered context` so downstream stages don't redo discovery.",
-            "**Adaptive elicitation loop (shared skill)** — load `.cclaw/skills/adaptive-elicitation/SKILL.md` and run one decision-changing question at a time via harness-native question tools. After each answer, append one row to `## Q&A Log` (`Turn | Question | User answer (1-line) | Decision impact`). Continue until context is clear or user signals to proceed.",
+            "**ADAPTIVE ELICITATION COMES FIRST (no exceptions, no subagent dispatch before).** Load `.cclaw/skills/adaptive-elicitation/SKILL.md`. Walk the brainstorm forcing questions one-at-a-time via the harness-native question tool, append one row to `## Q&A Log` (`Turn | Question | User answer (1-line) | Decision impact`) after each user answer. Continue until forcing-questions converge (all answered/skipped/waived) OR Ralph-Loop convergence detector says no new decision-changing rows in last 2 iterations OR user records an explicit stop-signal row. Only then proceed to delegations, drafts, or analysis. The linter `qa_log_unconverged` rule will block `stage-complete` if convergence is not reached.",
+            "**Explore project context** — after the elicitation loop converges, inspect existing files/docs/recent activity to refine the Discovered context section; capture matching files/patterns/seeds in `Context > Discovered context` so downstream stages don't redo discovery.",
             "**Brainstorm forcing questions (must be covered or explicitly waived)** — what pain are we solving, what is the direct path, what happens if we do nothing, who is the first operator/user affected, and what no-go boundaries are non-negotiable.",
             "**Classify stage depth** — choose `lite` for clear low-risk tasks, `standard` for normal engineering/product changes, or `deep` for ambiguity, architecture, external dependency, security/data risk, or explicit think-bigger requests.",
             "**Write the Problem Decision Record** — pick a free-form `Frame type` label that names how this work is framed (examples: product, technical-maintenance, research-spike, ops-incident, infrastructure), then fill the universal Framing fields: affected user/role/operator, current state/failure mode/opportunity, desired observable outcome, evidence/signal, why now, do-nothing consequence, and non-goals.",
@@ -48,11 +48,12 @@ export const BRAINSTORM = {
             "**Use compact discovery for low-risk asks** — for concrete bounded requests, do one context pass, compare one baseline and one challenger, and move to draft once context is sufficient; do not drag the user through a full workshop.",
             "**Early-exit concrete asks** — for unambiguous implementation-only requests, write a compact Problem Decision Record plus short-circuit handoff (context, approved intent, constraints, assumptions, next-stage risks) and request explicit approval when the draft is ready.",
             "**Ask only decision-changing questions** — one at a time; if answers would not change approach and are non-critical preference/default assumptions, state the assumption and continue; STOP on scope, architecture, security, data loss, public API, migration, auth/pricing, or user approval uncertainty.",
+            "**Idea-evidence carry-forward (when applicable).** If `flow-state.interactionHints.brainstorm.fromIdeaArtifact` is set, read that idea artifact and reuse its `Title`, `Why-now`, `Expected impact`, `Risk`, `Counter-argument` for the chosen `I-#` (`fromIdeaCandidateId`) as the seed of `## Selected Direction` and as one row of `## Approaches` (role: `baseline`, evidence: idea-artifact path). Generate ONLY the missing higher-upside `challenger` row(s); do NOT re-generate the candidate that came from `/cc-ideate`. Record the carry-forward in `## Idea Evidence Carry-forward` with at minimum `- Source: <path>`, `- Candidate: <I-#>`, `- Reused fields: Title, Why-now, Expected impact, Risk, Counter-argument`, `- Newly generated: challenger(s) only`.",
             "**Compare 2-3 distinct approaches with stable Role/Upside columns** — Role values are `baseline` | `challenger` | `wild-card`; Upside is `low` | `modest` | `high` | `higher`; include real trade-offs, reuse notes, and reference-pattern source/disposition when a known pattern influenced the option; include exactly one challenger with explicit `high` or `higher` upside.",
             "**Collect reaction before recommending** — ask which option feels closest and what concern remains, then recommend based on that reaction.",
             "**Write the `Not Doing` list** — name 3-5 things this brainstorm explicitly is not committing to (vs. deferred). This protects scope from silent enlargement and the next stage from rework.",
             "**Run early Ralph loop discipline** — after each producer iteration, append a `Critic Pass` JSONL row to `.cclaw/state/early-loop-log.jsonl`, refresh `.cclaw/state/early-loop.json`, and iterate until open concerns clear or convergence guard escalates.",
-            "**Embedded Grill (post-pick)** — after `Selected Direction` is named, run 3-5 sharp checks on hidden constraints, reversibility/rollback, scope boundaries, existing-pattern conformance, and domain-language fit; record each question with recommended answer and disposition (accept/refine/reject).",
+            "**Embedded Grill (post-pick, one-at-a-time)** — after `Selected Direction` is named, if grilling triggers fire (irreversibility, security/auth boundary, domain-model ambiguity per `adaptive-elicitation:Conditional Grilling`), continue the elicitation loop with sharper questions **one at a time**, appended to `## Q&A Log` and reflected as rows in `## Embedded Grill`. Do NOT batch the 3-5 grill checks — each one follows the Core Protocol (ask, wait, log, self-eval, ask next).",
             "**Self-review before user approval** — re-read the artifact and patch contradictions, weak trade-offs, placeholders, ambiguity, and weak handoff language. Record the result in `Self-Review Notes` using the calibrated review format: `- Status: Approved` (or `Issues Found`), `- Patches applied:` with inline note or sub-bullets, `- Remaining concerns:` with inline note or sub-bullets. Use `Patches applied: None` and `Remaining concerns: None` when there is nothing to record.",
             "**Request explicit approval to close the stage** — state exactly what direction is being approved after the adaptive elicitation loop converges; do not advance without approval and artifact review.",
             "**Handoff packet** — only after approval, produce a scope handoff packet with selected direction, why rejected options were rejected, explicit non-goals, unresolved questions, risk hints, and explicit drift from the initial ask so scope starts from locked upstream decisions instead of rediscovering intent."
@@ -91,6 +92,7 @@ export const BRAINSTORM = {
             "Clarity Gate records ambiguity score, decision boundaries, reaffirmed non-goals, and residual-risk handoff.",
             "Clarifying questions are one-at-a-time and captured only when they change a decision or stop condition.",
             "2-3 approaches with trade-offs are recorded, including one higher-upside challenger option and reference-pattern source/disposition when applicable.",
+            "When `flow-state.interactionHints.brainstorm.fromIdeaArtifact` is set, the `## Idea Evidence Carry-forward` section cites the idea artifact + `I-#` and only the challenger rows are newly generated (idea candidate is reused as `baseline`, never re-derived).",
             "User reaction to approaches is captured before final recommendation.",
             "Final recommendation explicitly reflects user reaction.",
             "Early-loop status is reflected via `Victory Detector` / `Critic Pass` sections and `.cclaw/state/early-loop.json` when concerns remain.",
@@ -146,6 +148,7 @@ export const BRAINSTORM = {
             { section: "Approach Tier", required: true, validationRule: "Must classify depth as lite/standard/deep and explain the risk/uncertainty signal." },
             { section: "Short-Circuit Decision", required: false, validationRule: "Must include Status/Why/Scope handoff lines when short-circuit is discussed; compact stubs are valid for concrete asks." },
             { section: "Reference Pattern Candidates", required: false, validationRule: "Recommended when examples influence direction: list pattern/source, reusable invariant, accept/reject/defer disposition, and reason before approaches are finalized." },
+            { section: "Idea Evidence Carry-forward", required: false, validationRule: "Wave 23 (v5.0.0): when `flow-state.interactionHints.brainstorm.fromIdeaArtifact` is set, this section MUST cite the idea artifact path and the chosen `I-#`, list reused fields (Title, Why-now, Expected impact, Risk, Counter-argument), and explicitly state that only challenger row(s) were newly generated. Honors `/cc-ideate` handoff so divergent + critique + rank work is reused, not redone." },
             { section: "Approaches", required: true, validationRule: "Must compare 2-3 distinct options with real trade-offs. Use the canonical `Role` column with `baseline` | `challenger` | `wild-card` and the `Upside` column with `low` | `modest` | `high` | `higher`; include exactly one challenger row with `high` or `higher` upside, and cite reference-pattern source/disposition when applicable." },
             { section: "Approach Reaction", required: true, validationRule: "Must appear before Selected Direction and summarize user reaction before recommendation, including `Closest option`, `Concerns`, and what changed after reaction." },
             { section: "Selected Direction", required: true, validationRule: "Must include the selected approach, explicit approval marker, rationale traceable to Approach Reaction, and a scope handoff packet with selected direction, decisions, drift, confidence, unresolved questions, risk hints, and non-goals." },
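
The new brainstorm checklist names three exit conditions for the elicitation loop: all forcing questions answered/skipped/waived, no new decision-changing rows in the last 2 iterations, or an explicit stop-signal row. The diff does not show how the `qa_log_unconverged` rule implements this, so the following is only a minimal sketch of that logic. The function name and every field name (`status`, `iteration`, `decisionChanging`, `stopSignal`) are assumptions made for illustration, not cclaw-cli APIs.

```javascript
// Sketch only: the data shapes below are hypothetical and are not taken
// from the cclaw-cli linter; they just mirror the three exit conditions
// described in the checklist text.
const RESOLVED = new Set(["answered", "skipped", "waived"]);

function isQaLogConverged({ forcingQuestions, rows, currentIteration, window = 2 }) {
  // Condition 1: every forcing question is answered, skipped, or waived.
  const allResolved =
    forcingQuestions.length > 0 &&
    forcingQuestions.every((q) => RESOLVED.has(q.status));

  // Condition 2: none of the last `window` iterations added a
  // decision-changing row to the Q&A log.
  const recentDecisionChange = rows.some(
    (r) => r.iteration > currentIteration - window && r.decisionChanging
  );
  const quiet = currentIteration >= window && !recentDecisionChange;

  // Condition 3: the user recorded an explicit stop-signal row.
  const stopped = rows.some((r) => r.stopSignal === true);

  return allResolved || quiet || stopped;
}
```

Under these assumptions, a stage would stay blocked while a forcing question is still pending and the most recent iteration produced a decision-changing row.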

package/dist/content/stages/design.js CHANGED

@@ -34,19 +34,22 @@ export const DESIGN = {
             "Skipping outside-voice review loop and treating first draft as final",
             "Batching multiple design issues into one question",
             "Agreeing with user's architecture choice without evaluating alternatives",
-            "No NOT-in-scope output section",
+            "Re-authoring scope's out-of-scope list instead of citing it via Upstream Handoff",
+            "Re-authoring scope's repo audit instead of diffing the blast radius since scope baseline",
             "Design decisions made without reading the actual code first"
         ]
     },
     executionModel: {
         checklist: [
-            "**Adaptive elicitation loop (shared skill)** — load `.cclaw/skills/adaptive-elicitation/SKILL.md` and run one decision-changing question per turn via harness-native tools. After each answer, append one row to `## Q&A Log` (`Turn | Question | User answer (1-line) | Decision impact`). Continue until architecture context is clear or user signals to proceed.",
+            "**ADAPTIVE ELICITATION COMES FIRST (no exceptions, no subagent dispatch before).** Load `.cclaw/skills/adaptive-elicitation/SKILL.md`. Walk the design forcing questions one-at-a-time via the harness-native question tool, append one row to `## Q&A Log` (`Turn | Question | User answer (1-line) | Decision impact`) after each user answer. Continue until forcing-questions converge (all answered/skipped/waived) OR Ralph-Loop convergence detector says no new decision-changing rows in last 2 iterations OR user records an explicit stop-signal row. Only then proceed to research, investigator pass, architecture lock, or any delegations. The linter `qa_log_unconverged` rule will block `stage-complete` if convergence is not reached.",
             "**Design forcing questions (must be covered or explicitly waived)** — what is the end-to-end data flow, where are seams/ownership boundaries, which invariants must hold, and what will explicitly NOT be refactored now.",
+            "**Out-of-scope carry-forward (do NOT re-author)** — scope OWNS the out-of-scope list. Cite scope's `## In Scope / Out of Scope > Out of Scope` via `## Upstream Handoff > Decisions carried forward`; do NOT add a separate `## NOT in scope` section in the design artifact. Add a row to `## Spec Handoff` only if a design-stage decision NEWLY excludes something not already in scope's out-of-scope.",
             "Compact design lock — design does not decide what to build; it decides how the approved scope works. For simple slices, produce a tight lock: upstream handoff, existing fit, architecture boundary, one labeled diagram, data/state flow, critical path, failure/rescue, trust boundaries, test/perf expectations, rollout/rollback, rejected alternative, and spec handoff.",
             "Trivial-Change Escape Hatch — for <=3 files, no new interfaces, and no cross-module data flow, produce a mini-design (rationale, changed files, one risk) and proceed to spec.",
+            "**Architecture choice (design OWNS the tier decision)** — pick the architecture tier (minimum-viable / product-grade / ideal) using scope's `## Scope Contract > Design handoff` as the input. Record the tier and rationale in `## Architecture Decision Record (ADR)` and `## Engineering Lock`. Scope only locked the SCOPE MODE; it did NOT enumerate Implementation Alternatives.",
             "Tiered Research — for simple/medium work, do compact inline codebase/research synthesis in `Research Fleet Synthesis`; write `.cclaw/artifacts/02a-research.md` and run the full fleet only for deep/high-risk work or when external framework/architecture uncertainty exists.",
             "Design Doc Check — read upstream artifacts and current design docs; latest superseding doc wins.",
-            "Investigator pass — before design decisions, read blast-radius code and record touched files, responsibilities, reuse candidates, and existing patterns.",
+            "**Blast-radius diff (do NOT re-audit the whole repo)** — scope OWNS the full repo audit (`## Pre-Scope System Audit`). Design only diffs the blast radius SINCE scope baseline: `git diff <scope-artifact-head-sha>..HEAD -- <touched-paths>`. Record touched files, current responsibilities, reuse candidates, and existing patterns in `## Codebase Investigation` and `## Blast-radius Diff`. Do NOT re-author scope's git log/diff/stash audit.",
             "Scope Challenge + Search Before Building — find existing solutions, minimum change set, reference-grade contracts to mirror, and complexity smells before custom architecture.",
             "Architecture Review — lock boundaries, chosen path, shadow alternative, switch trigger, failure/rescue/degraded behavior, and verification evidence for every high-risk choice; include tier-required diagrams.",
             "Review core risk areas — existing system fit, data/state flow, critical path, security/trust boundaries, tests, performance budget, observability/debuggability, rollout/rollback, rejected alternatives, and spec handoff.",
@@ -68,7 +71,7 @@ export const DESIGN = {
             "Classify ambiguity before acting. Only non-critical preference/default assumptions may continue; STOP on uncertainty about scope, architecture, security, data loss, public API, migration, auth/pricing, or required user approval. Design hypotheses must name validation path, rollback trigger, and owner before they can be carried forward.",
             "Before final approval, run the critic pass, reconcile material findings, and bound retries with the review-loop policy.",
             "For baseline approval, present the full design plus exact spec handoff and **STOP** until explicit approval.",
-            "**STOP BEFORE ADVANCE.** Mandatory delegation `planner` must be completed or explicitly waived, then close via `node .cclaw/hooks/stage-complete.mjs design`."
+            "**STOP BEFORE ADVANCE.** Mandatory delegation `planner` runs **AFTER user approval of the design lock**, not before Q&A. Sequence is: Q&A loop -> draft design lock -> user approval -> `planner` delegation -> `stage-complete`. Legal fulfillment modes for `planner`: (a) **harness-native Task tool** — run the delegation, then record via `node .cclaw/hooks/delegation-record.mjs --stage=design --agent=planner --mode=mandatory --status=completed --span-id=<uuid> --dispatch-surface=cursor-task --agent-definition-path=<agent-md-path> --evidence-ref=<artifact#section>`; (b) **role-switch** — write planner output into the design artifact, then record with `--dispatch-surface=role-switch`; (c) **cclaw subagent helper** with `--dispatch-surface=isolated`. Run `node .cclaw/hooks/stage-complete.mjs design` from the tool layer (do not paste the command into chat); report only the resulting summary."
         ],
         process: [
             "Read upstream artifacts and current design docs.",
@@ -77,7 +80,7 @@ export const DESIGN = {
             "Walk review sections interactively and lock boundaries, data flow, state transitions, edge cases, and failure modes.",
             "Cover security, observability, deployment, tests, and performance for Standard+ changes.",
             "Run stale-diagram audit (enabled by default unless explicitly disabled).",
-            "Produce required outputs: NOT-in-scope, What-already-exists, tier diagrams, failure table, completion dashboard.",
+            "Produce required outputs: blast-radius diff (scope owns full repo audit), tier diagrams, failure table, completion dashboard. Out-of-scope is carried from scope via Upstream Handoff — do NOT re-author it.",
             "Plant high-upside deferred ideas when useful and reconcile critic/outside-voice findings.",
             "Write design lock artifact for downstream spec/plan with design decisions, rejected alternatives, verification evidence, and exact spec handoff."
         ],
@@ -107,8 +110,8 @@ export const DESIGN = {
             "Test-Diagram Mapping links critical flows to both validating tests and diagram anchors.",
             "Test strategy includes unit/integration/e2e expectations.",
             "When a high-upside idea is deferred, a seed file is created under `.cclaw/seeds/` and referenced in the artifact.",
-            "NOT-in-scope section produced.",
-            "What-already-exists section produced.",
+            "Out-of-scope is carried forward from scope's `## In Scope / Out of Scope > Out of Scope` via `## Upstream Handoff > Decisions carried forward`; design does NOT author its own NOT-in-scope section.",
+            "Blast-radius Diff section produced (git diff since scope artifact baseline) — scope owns the full repo audit; design only diffs touched paths.",
             "Completion dashboard lists review section status, critical/open gap counts, decision count, and unresolved items (or 'None')."
         ],
         inputs: ["scope agreement artifact", "system constraints", "non-functional requirements"],
@@ -174,7 +177,7 @@ export const DESIGN = {
             { section: "Performance Budget", required: false, validationRule: "For each critical path: metric name, target threshold, and measurement method." },
             { section: "Observability & Debuggability", required: true, validationRule: "Must define logs/metrics/traces plus alerting/debug path for critical failure modes." },
             { section: "Deployment & Rollout", required: true, validationRule: "Must define migration/flag strategy, rollout/rollback plan, switch trigger, and post-deploy verification steps." },
-            { section: "What Already Exists", required: false, validationRule: "For each sub-problem: existing code/library found (Layer 1-3/EUREKA label), reuse decision, and adaptation needed." },
+            { section: "Blast-radius Diff", required: false, validationRule: "Diff since scope artifact baseline (`git diff <scope-sha>..HEAD -- <touched-paths>`): for each touched file, summarize change since scope, current responsibility, reuse candidate, and existing pattern. Scope OWNS the full repo audit; design only diffs the blast radius." },
             { section: "Reference-Grade Contracts", required: false, validationRule: "For every mirrored pattern: source, reusable invariant, local adaptation, rejection boundary, and verification signal. Omit with `None - no external or in-repo pattern mirrored` for compact local changes." },
             { section: "Rejected Alternatives", required: false, validationRule: "List alternatives considered, why rejected, and what signal would revive them." },
             { section: "Design Decisions", required: false, validationRule: "Stable design decisions with requirement/locked-decision refs and downstream spec impact." },
@@ -184,10 +187,9 @@ export const DESIGN = {
             { section: "Design Outside Voice Loop", required: false, validationRule: `Record iteration table with quality score per iteration, stop reason, and unresolved concerns. Enforce ${reviewLoopPolicySummary("design")}` },
             { section: "Victory Detector", required: false, validationRule: "Recommended early-loop checkpoint: cite `.cclaw/state/early-loop.json`, current iteration/maxIterations, open concern count, convergence status, and iterate/ready/escalate decision." },
             { section: "Critic Pass", required: false, validationRule: "Recommended producer/critic log contract: each iteration appends one JSONL row to `.cclaw/state/early-loop-log.jsonl` with runId, stage, iteration, and open concerns." },
-            { section: "NOT in scope", required: false, validationRule: "Work considered and explicitly deferred with one-line rationale." },
             { section: "Completion Dashboard", required: true, validationRule: "Lists every review section with status (clear / issues-found-resolved / issues-open), critical/open gap counts, decision count, and unresolved items (or 'None')." }
         ],
-        trivialOverrideSections: ["Architecture Boundaries", "NOT in scope", "Completion Dashboard"]
+        trivialOverrideSections: ["Architecture Boundaries", "Completion Dashboard"]
     },
     reviewLens: {
         outputs: [
@@ -195,8 +197,7 @@ export const DESIGN = {
             "architecture lock",
             "risk and failure map",
             "test and performance baseline",
-            "NOT-in-scope section",
-            "What-already-exists section",
+            "blast-radius diff since scope baseline",
             "design decisions and spec handoff",
             "design completion dashboard"
         ],
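
The design checklist now asks for a blast-radius diff of the form `git diff <scope-artifact-head-sha>..HEAD -- <touched-paths>`. As a rough illustration, the argument list for that command could be assembled as below. The helper name `buildBlastRadiusDiffArgs` and its validation rules are assumptions made for this sketch; the diff does not show any such helper in the package.

```javascript
// Sketch only: a hypothetical helper that builds the argv for the
// `git diff <scope-sha>..HEAD -- <touched-paths>` command quoted in the
// checklist. It does not run git; it only returns the argument array.
function buildBlastRadiusDiffArgs(scopeSha, touchedPaths) {
  // Accept abbreviated or full hexadecimal commit hashes (7 to 40 characters).
  if (!/^[0-9a-f]{7,40}$/i.test(scopeSha)) {
    throw new Error(`not a commit hash: ${scopeSha}`);
  }
  if (!Array.isArray(touchedPaths) || touchedPaths.length === 0) {
    throw new Error("at least one touched path is required");
  }
  // `--` separates the revision range from the pathspecs so a path that
  // looks like a revision is never misread.
  return ["diff", `${scopeSha}..HEAD`, "--", ...touchedPaths];
}
```

The returned array is meant to be passed to something like Node's `child_process.execFileSync("git", args)`, which avoids shell quoting of the paths.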

package/dist/content/stages/plan.js CHANGED

@@ -46,7 +46,7 @@ export const PLAN = {
             "Slice into vertical tasks — each task targets 2-5 minutes, produces one testable outcome, and touches one coherent area.",
             "Task Contract — every task has one coherent outcome, AC mapping, exact verification command/manual step, and expected evidence snippet or pass condition. Avoid vague `run tests` wording.",
             "Annotate slice-review metadata — task rows may carry `touchCount` (rough number of files expected to change), `touchPaths` (glob hints, e.g. `migrations/**`, `src/auth/**`), and optional `highRisk: true` to force a review pass. These fields feed the TDD stage's Per-Slice Review point.",
-            "Map scope Locked Decisions — every LD#hash anchor from scope is referenced by at least one plan task (or explicitly marked deferred with reason).",
+            "Map scope Locked Decisions — every D-XX ID from scope is referenced by at least one plan task (or explicitly marked deferred with reason).",
             "Run anti-placeholder + anti-scope-reduction scans — block `TODO/TBD/...` and phrasing like `v1`, `for now`, `later` for locked boundaries.",
             "Define validation points — mark where progress must be checked before continuing, with concrete command and expected evidence.",
             "Define execution posture — record whether execution should be sequential, dependency-batched, parallel-safe, or blocked; include risk triggers and RED/GREEN/REFACTOR checkpoint/commit expectations when the repo workflow supports them. This fulfills the `plan_execution_posture_recorded` gate.",