cclaw-cli 0.51.23 → 0.51.24

@@ -9,11 +9,11 @@ export const DESIGN = {
  complexityTier: "standard",
  skillFolder: "engineering-design-lock",
  skillName: "engineering-design-lock",
- skillDescription: "Engineering lock-in stage. Build a concrete technical spine before spec and planning, with section-by-section interactive review.",
+ skillDescription: "Engineering lock stage. Convert the approved scope contract into a buildable architecture with adversarial alternatives, failure/rescue paths, and spec handoff.",
  philosophy: {
  hardGate: "Do NOT write implementation code. This stage produces design decisions and architecture documents only. No code changes, no scaffolding, no test files.",
  ironLaw: "NO DESIGN DECISION WITHOUT A LABELED DIAGRAM, A REJECTED ALTERNATIVE, AND A NAMED FAILURE MODE.",
- purpose: "Lock architecture, data flow, failure modes, and test/performance expectations through rigorous interactive review.",
+ purpose: "Lock how the scoped slice works: architecture boundary, existing fit, data/state flow, critical path, trust boundaries, failure/rescue behavior, verification, rollout, and spec handoff.",
  whenToUse: [
  "After scope agreement approval",
  "Before writing final spec and execution plan",
@@ -40,14 +40,14 @@ export const DESIGN = {
  },
  executionModel: {
  checklist: [
- "Compact design lock — for simple greenfield/product slices, produce a tight but complete design spine: codebase investigation, architecture boundary, one labeled diagram, data flow, failure/rescue table, test/perf expectations, and handoff. Do not run a sprawling workshop when a strong engineering lock fits on one page.",
+ "Compact design lock — design does not decide what to build; it decides how the approved scope works. For simple slices, produce a tight lock: upstream handoff, existing fit, architecture boundary, one labeled diagram, data/state flow, critical path, failure/rescue, trust boundaries, test/perf expectations, rollout/rollback, rejected alternative, and spec handoff.",
  "Trivial-Change Escape Hatch — for <=3 files, no new interfaces, and no cross-module data flow, produce a mini-design (rationale, changed files, one risk) and proceed to spec.",
  "Tiered Research — for simple/medium work, do compact inline codebase/research synthesis in `Research Fleet Synthesis`; write `.cclaw/artifacts/02a-research.md` and run the full fleet only for deep/high-risk work or when external framework/architecture uncertainty exists.",
  "Design Doc Check — read upstream artifacts and current design docs; latest superseding doc wins.",
  "Investigator pass — before design decisions, read blast-radius code and record touched files, responsibilities, reuse candidates, and existing patterns.",
  "Scope Challenge + Search Before Building — find existing solutions, minimum change set, and complexity smells before custom architecture.",
- "Architecture Review — lock boundaries, one realistic failure scenario per new codepath, and high-risk choices with chosen path, one shadow alternative, switch trigger, and verification evidence; include tier-required diagrams.",
- "Review core risk areas — security/threat model, code quality, tests, performance, observability/debuggability, deployment/rollout, and parallelization when modules are independent.",
+ "Architecture Review — lock boundaries, chosen path, shadow alternative, switch trigger, failure/rescue/degraded behavior, and verification evidence for every high-risk choice; include tier-required diagrams.",
+ "Review core risk areas — existing system fit, data/state flow, critical path, security/trust boundaries, tests, performance budget, observability/debuggability, rollout/rollback, rejected alternatives, and spec handoff.",
  `Critic pass — run/reconcile adversarial second opinion on architecture, coupling, failure modes, and cheaper alternatives. ${reviewLoopPolicySummary("design")} ${reviewLoopSecondOpinionSummary("design")}`,
  "Run optional stale-diagram audit only when configured.",
  "Capture leftovers — seed high-upside deferred ideas, list unresolved decisions with defaults, document distribution for new artifact types, and cross-reference deferred items to scope or unresolved decisions."
@@ -73,7 +73,7 @@ export const DESIGN = {
  "Run configured stale-diagram audit when enabled.",
  "Produce required outputs: NOT-in-scope, What-already-exists, tier diagrams, failure table, completion dashboard.",
  "Plant high-upside deferred ideas when useful and reconcile critic/outside-voice findings.",
- "Write design lock artifact for downstream spec/plan."
+ "Write design lock artifact for downstream spec/plan with design decisions, rejected alternatives, verification evidence, and exact spec handoff."
  ],
  requiredGates: [
  { id: "design_research_complete", description: "Research is complete: compact inline synthesis by default, or a separate research artifact for deep/high-risk work, and findings are mapped to design decisions." },
@@ -93,6 +93,7 @@ export const DESIGN = {
  "Outside-voice findings and dispositions are recorded (accept/reject/defer).",
  `Spec review loop summary includes iteration count and quality score trajectory per ${reviewLoopPolicySummary("design")}`,
  reviewLoopSecondOpinionSummary("design"),
+ "Adversarial lock table includes chosen path, shadow alternative, switch trigger, failure/rescue/degraded behavior, and verification evidence.",
  "Test strategy includes unit/integration/e2e expectations.",
  "When a high-upside idea is deferred, a seed file is created under `.cclaw/seeds/` and referenced in the artifact.",
  "NOT-in-scope section produced.",
@@ -144,30 +145,28 @@ export const DESIGN = {
  artifactValidation: [
  { section: "Upstream Handoff", required: false, validationRule: "Summarizes scope/research decisions, constraints, open questions, and explicit drift before design choices." },
  { section: "Research Fleet Synthesis", required: true, validationRule: "Must summarize the tiered lenses actually run and map findings to concrete design decisions. Default may be compact inline synthesis; full separate research pack is Deep/high-risk only." },
- { section: "Codebase Investigation", required: false, validationRule: "Investigator pass: list blast-radius files with current responsibilities, discovered patterns, and reuse candidates." },
+ { section: "Codebase Investigation", required: false, validationRule: "Investigator pass: list blast-radius files with current responsibilities, discovered patterns, reuse candidates, and existing system fit." },
+ { section: "Engineering Lock", required: true, validationRule: "Canonical lock: chosen path, shadow alternative, switch trigger, failure/rescue/degraded behavior, verification evidence, critical path, rollout/rollback, and confidence." },
  { section: "Search Before Building", required: false, validationRule: "For each technical choice: Layer 1 (exact match), Layer 2 (partial match), Layer 3 (inspiration), EUREKA labels with reuse-first default." },
  { section: "Architecture Boundaries", required: true, validationRule: "Must list component boundaries with ownership." },
  { section: "Architecture Diagram", required: true, validationRule: "Must include `<!-- diagram: architecture -->` marker. Diagram must label concrete nodes, label arrows, mark direction, distinguish sync/async edges, and include at least one failure/degraded edge." },
- { section: "Data-Flow Shadow Paths", required: false, validationRule: "Standard/Deep add-on: include `<!-- diagram: data-flow-shadow-paths -->` marker plus a table for high-risk choices: chosen path, shadow alternative, switch trigger, fallback/degrade behavior, and verification evidence." },
+ { section: "Data-Flow Shadow Paths", required: false, validationRule: "Standard/Deep add-on: include `<!-- diagram: data-flow-shadow-paths -->` marker plus a table for high-risk choices: chosen path, shadow alternative, switch trigger, failure/rescue/degraded behavior, and verification evidence." },
  { section: "Error Flow Diagram", required: false, validationRule: "Standard/Deep add-on: include `<!-- diagram: error-flow -->` marker and failure-detection -> rescue -> user-visible outcome flow." },
- { section: "State Machine Diagram", required: false, validationRule: "Deep add-on: include `<!-- diagram: state-machine -->` marker and state transitions for critical flow lifecycle." },
- { section: "Rollback Flowchart", required: false, validationRule: "Deep add-on: include `<!-- diagram: rollback-flowchart -->` marker with trigger -> rollback actions -> verification." },
- { section: "Deployment Sequence Diagram", required: false, validationRule: "Deep add-on: include `<!-- diagram: deployment-sequence -->` marker with rollout order and guard checks." },
- { section: "Data Flow", required: false, validationRule: "Must include happy path, nil input, empty input, upstream error paths, plus Interaction Edge Case matrix rows for: double-click, nav-away-mid-request, 10K-result dataset, background-job abandonment, zombie connection. Each row must declare handled yes/no and deferred item when not handled." },
+ { section: "Data Flow", required: false, validationRule: "Must include data/state flow, happy path, nil input, empty input, upstream error paths, plus Interaction Edge Case matrix rows for double-click, nav-away-mid-request, 10K-result dataset, background-job abandonment, zombie connection. Each row declares handled yes/no and deferred item when not handled." },
  { section: "Stale Diagram Audit", required: false, validationRule: "When `.cclaw/config.yaml::optInAudits.staleDiagramAudit` is true: blast-radius files from Codebase Investigation must not be newer than the current design diagram-marker baseline unless explicitly refreshed." },
  { section: "Failure Mode Table", required: true, validationRule: "Use Method/Exception/Rescue/UserSees columns and treat silent user impact without rescue as critical." },
  { section: "Security & Threat Model", required: true, validationRule: "Must list trust boundaries, abuse/failure scenarios, mitigations, and residual risks." },
  { section: "Test Strategy", required: false, validationRule: "Must define unit/integration/e2e expectations with coverage targets." },
  { section: "Performance Budget", required: false, validationRule: "For each critical path: metric name, target threshold, and measurement method." },
  { section: "Observability & Debuggability", required: true, validationRule: "Must define logs/metrics/traces plus alerting/debug path for critical failure modes." },
- { section: "Deployment & Rollout", required: true, validationRule: "Must define migration/flag strategy, rollback plan, and post-deploy verification steps." },
+ { section: "Deployment & Rollout", required: true, validationRule: "Must define migration/flag strategy, rollout/rollback plan, switch trigger, and post-deploy verification steps." },
  { section: "What Already Exists", required: false, validationRule: "For each sub-problem: existing code/library found (Layer 1-3/EUREKA label), reuse decision, and adaptation needed." },
+ { section: "Rejected Alternatives", required: false, validationRule: "List alternatives considered, why rejected, and what signal would revive them." },
+ { section: "Design Decisions", required: false, validationRule: "Stable design decisions with requirement/locked-decision refs and downstream spec impact." },
+ { section: "Spec Handoff", required: true, validationRule: "Exact requirements, design decisions, risks, test/perf expectations, and unresolved questions that spec must carry forward." },
  { section: "Outside Voice Findings", required: false, validationRule: "Critic pass: list adversarial findings and disposition (accept/reject/defer) with rationale per material finding." },
  { section: "Design Outside Voice Loop", required: false, validationRule: `Record iteration table with quality score per iteration, stop reason, and unresolved concerns. Enforce ${reviewLoopPolicySummary("design")}` },
  { section: "NOT in scope", required: false, validationRule: "Work considered and explicitly deferred with one-line rationale." },
- { section: "Parallelization Strategy", required: false, validationRule: "Standard/Deep add-on when multi-module: dependency table, parallel lanes, conflict flags." },
- { section: "Interface Contracts", required: false, validationRule: "Standard/Deep add-on when module boundaries or APIs change: producers, consumers, and payload/interface expectations." },
- { section: "Unresolved Decisions", required: false, validationRule: "Standard/Deep add-on if any: what info is missing, who provides it, default if unanswered." },
  { section: "Completion Dashboard", required: true, validationRule: "Lists every review section with status (clear / issues-found-resolved / issues-open), critical/open gap counts, decision count, and unresolved items (or 'None')." }
  ],
  trivialOverrideSections: ["Architecture Boundaries", "NOT in scope", "Completion Dashboard"]
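Each `artifactValidation` entry above pairs a section name with a `required` flag. A minimal sketch of how such a list could be checked against a markdown artifact, assuming sections appear as `## ` headings. `listHeadings` and `missingRequiredSections` are hypothetical helpers for illustration, not the package's actual linter:

```typescript
// Hypothetical sketch: mirrors the ArtifactValidation shape used in the diff,
// but the checking logic is an assumption, not cclaw-cli's implementation.
interface ArtifactValidation {
  section: string;
  required: boolean;
  validationRule: string;
}

// Collect the text of every level-2 ("## ") heading in a markdown artifact.
function listHeadings(markdown: string): Set<string> {
  const headings = new Set<string>();
  for (const line of markdown.split("\n")) {
    const match = /^##\s+(.+?)\s*$/.exec(line);
    if (match) headings.add(match[1]);
  }
  return headings;
}

// Return the names of required sections that the artifact does not contain.
function missingRequiredSections(rules: ArtifactValidation[], markdown: string): string[] {
  const present = listHeadings(markdown);
  return rules
    .filter((rule) => rule.required && !present.has(rule.section))
    .map((rule) => rule.section);
}

// Three entries taken from the DESIGN list; validationRule text is elided here.
const rules: ArtifactValidation[] = [
  { section: "Engineering Lock", required: true, validationRule: "(elided)" },
  { section: "Spec Handoff", required: true, validationRule: "(elided)" },
  { section: "Rejected Alternatives", required: false, validationRule: "(elided)" },
];

const artifact = [
  "# Design",
  "## Engineering Lock",
  "chosen path: (elided)",
  "## Rejected Alternatives",
].join("\n");

const missing = missingRequiredSections(rules, artifact);
console.log(missing); // [ 'Spec Handoff' ]
```

This only checks presence. The free-text `validationRule` strings would still need a human or model reviewer to enforce.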
@@ -180,6 +179,7 @@ export const DESIGN = {
  "test and performance baseline",
  "NOT-in-scope section",
  "What-already-exists section",
+ "design decisions and spec handoff",
  "design completion dashboard"
  ],
  reviewLoop: {
@@ -38,11 +38,12 @@ export const REVIEW = {
  "Load upstream evidence — read TDD artifact (RED + GREEN + REFACTOR), spec, and the active track's upstream source items.",
  "Run traceability matrix when the active track enforces it; otherwise confirm spec acceptance/reproduction slices are covered directly.",
  "Layer 1: Spec Compliance — check every acceptance criterion against implementation. Verdict: pass/fail per criterion.",
- "Layer 2: Integrated findings — one structured pass tagged by category: correctness, security, performance, architecture, external-safety.",
- "Security sweep — mandatory dedicated security-reviewer pass across diff + touched modules. A zero-finding pass must include `NO_CHANGE_ATTESTATION` with rationale.",
+ "Review Evidence Scope — record base/head, files inspected, changed-file coverage, diagnostics run, dependency/version audit when relevant, and any files intentionally not inspected with explicit reason.",
+ "Layer 2: Integrated findings — one structured pass tagged by category: correctness, security, performance, architecture, external-safety. Every finding uses file:line; if impossible, include an explicit no-line reason.",
+ "Security sweep — mandatory dedicated security-reviewer pass across diff + touched modules. A zero-finding pass must include `NO_CHANGE_ATTESTATION` or `NO_SECURITY_IMPACT` with rationale and inspected surfaces.",
  "Incoming Feedback Intake — when human reviewer comments, bot findings, or CI annotations exist, keep a per-comment disposition queue and mirror outcomes into `07-review.md` + `07-review-army.json` before final verdict.",
  "Structured Review reconciliation — normalize findings into `07-review-army.json`, dedup by fingerprint, and mark multi-specialist confirmations when multiple lenses agree.",
- "Meta-Review — Were tests actually run? Do test names match what they test? Are there real assertions?",
+ "Meta-Review — Were tests/diagnostics actually run? Do test names match what they test? Are there real assertions? Is the dependency/version surface unchanged or audited?",
  "Classify findings — Critical (blocks ship), Important (should fix), Suggestion (optional improvement).",
  "Produce verdict — APPROVED, APPROVED_WITH_CONCERNS, or BLOCKED.",
  "If verdict is BLOCKED, emit remediation route token `ROUTE_BACK_TO_TDD`, include `cclaw internal rewind tdd \"review_blocked_by_critical\"` with the blocking finding IDs, and satisfy the special transition guard `review_verdict_blocked` instead of `review_criticals_resolved`."
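The "Structured Review reconciliation" step in the checklist above asks for dedup by fingerprint and a multi-specialist marker when several lenses agree. A minimal sketch of that reconciliation, assuming each raw finding carries a `fingerprint` and a single `reportedBy` value. The `RawFinding` and `MergedFinding` shapes and the `multiSpecialist` flag are hypothetical, not the actual `07-review-army.json` schema:

```typescript
// Hypothetical shapes for illustration only.
interface RawFinding {
  id: string;
  fingerprint: string;
  reportedBy: string;
}

interface MergedFinding {
  id: string;
  fingerprint: string;
  reportedBy: string[];
  multiSpecialist: boolean;
}

// Keep the first finding seen for each fingerprint and merge reporters into it.
function dedupByFingerprint(findings: RawFinding[]): MergedFinding[] {
  const merged = new Map<string, MergedFinding>();
  for (const finding of findings) {
    const existing = merged.get(finding.fingerprint);
    if (existing === undefined) {
      merged.set(finding.fingerprint, {
        id: finding.id,
        fingerprint: finding.fingerprint,
        reportedBy: [finding.reportedBy],
        multiSpecialist: false,
      });
    } else if (!existing.reportedBy.includes(finding.reportedBy)) {
      // A second, distinct specialist confirmed the same issue.
      existing.reportedBy.push(finding.reportedBy);
      existing.multiSpecialist = true;
    }
  }
  // Map iteration follows insertion order, so first-seen order is preserved.
  return Array.from(merged.values());
}

const reconciled = dedupByFingerprint([
  { id: "F1", fingerprint: "a1", reportedBy: "reviewer" },
  { id: "F2", fingerprint: "a1", reportedBy: "security-reviewer" },
  { id: "F3", fingerprint: "b2", reportedBy: "reviewer" },
]);
console.log(reconciled.map((f) => `${f.id}:${f.reportedBy.join("+")}`));
```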
@@ -79,7 +80,11 @@ export const REVIEW = {
  "Artifact written to `.cclaw/artifacts/07-review-army.json`.",
  "Traceability matrix run recorded (no orphaned source items or tests for enforced tracks).",
  "Layer 1 verdict captured with per-criterion pass/fail.",
+ "Review Evidence Scope lists files inspected, changed-file coverage, diagnostics run, and omitted files with explicit reason.",
  "Layer 2 sections completed across correctness, security, performance, architecture, and external-safety findings.",
+ "Every finding cites `file:line`, or an explicit no-line reason is recorded.",
+ "No-finding attestation is explicit when no issues are found.",
+ "Dependency/version audit is recorded when manifests, lockfiles, generated clients, CI, runtime config, or external APIs are relevant.",
  "Severity log includes critical/important/suggestion buckets.",
  "Explicit final verdict: APPROVED, APPROVED_WITH_CONCERNS, or BLOCKED.",
  "Fresh verification command discovery recorded, and the command cited in `review_trace_matrix_clean` evidence before ship handoff.",
@@ -114,8 +119,12 @@ export const REVIEW = {
  },
  artifactValidation: [
  { section: "Upstream Handoff", required: false, validationRule: "Summarizes spec/plan/tdd decisions, constraints, open questions, and explicit drift before review verdicts." },
+ { section: "Review Evidence Scope", required: true, validationRule: "Base/head, files inspected, changed-file coverage, diagnostics run, omitted files with reason, and reviewer/security-reviewer delegation evidence." },
+ { section: "Changed-File Coverage", required: true, validationRule: "Each changed file is covered, intentionally omitted with no-impact reason, or linked to a broader inspected module." },
  { section: "Layer 1 Verdict", required: true, validationRule: "Per-criterion pass/fail with references." },
- { section: "Layer 2 Findings", required: false, validationRule: "Each finding has severity, description, and resolution status across correctness, security, performance, architecture, and external-safety. Security coverage must include either explicit security findings or `NO_CHANGE_ATTESTATION: <reason>` when no security-relevant changes were found." },
+ { section: "Layer 2 Findings", required: false, validationRule: "Each finding has severity, category, file:line or explicit no-line reason, description, and resolution status across correctness/security/performance/architecture/external-safety. If there are no findings, include a no-finding attestation." },
+ { section: "Security Sweep Attestation", required: false, validationRule: "Dedicated security-reviewer result: findings or `NO_CHANGE_ATTESTATION` / `NO_SECURITY_IMPACT` with inspected surfaces and rationale." },
+ { section: "Dependency & Version Audit", required: false, validationRule: "Required when manifests, lockfiles, generated clients, CI, runtime config, or external APIs changed; otherwise record no-impact rationale." },
  { section: "Review Findings Contract", required: true, validationRule: "Structured findings in 07-review-army.json include id/severity/confidence/fingerprint/reportedBy/status and source tags from {spec, correctness, security, performance, architecture, external-safety} with dedup reconciliation summary." },
  { section: "Review Readiness Snapshot", required: false, validationRule: "Optional compact summary: completed checks, delegation-log status, staleness signal, open critical blockers, and ship recommendation." },
  { section: "Completeness Snapshot", required: false, validationRule: "Optional compact coverage summary for AC coverage, source item coverage, test-slice coverage, and adversarial-review status when triggered." },
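The new "Layer 2 Findings" rule above requires each finding to carry `file:line` or an explicit no-line reason. A minimal sketch of that check, assuming a finding stores the citation in a `location` string and the fallback in `noLineReason`. Both field names and the `ReviewFinding` shape are hypothetical:

```typescript
type Category = "correctness" | "security" | "performance" | "architecture" | "external-safety";

// Hypothetical finding shape; the real artifact schema may differ.
interface ReviewFinding {
  id: string;
  category: Category;
  location?: string; // expected form: "path/to/file.ts:42"
  noLineReason?: string; // used when no single line can be cited
}

// Assumption: a citation is a colon-free path, a colon, and a line number.
// Paths that contain colons (for example Windows drive letters) would need a looser pattern.
const FILE_LINE = /^[^\s:]+:\d+$/;

function citesLocation(finding: ReviewFinding): boolean {
  if (finding.location !== undefined && FILE_LINE.test(finding.location)) return true;
  return finding.noLineReason !== undefined && finding.noLineReason.trim().length > 0;
}

// Return the ids of findings that have neither a file:line nor a no-line reason.
function uncitedFindings(findings: ReviewFinding[]): string[] {
  return findings.filter((f) => !citesLocation(f)).map((f) => f.id);
}

const sampleFindings: ReviewFinding[] = [
  { id: "F1", category: "security", location: "src/auth.ts:42" },
  { id: "F2", category: "architecture", noLineReason: "cross-module coupling, no single line" },
  { id: "F3", category: "performance", location: "src/cache.ts" },
];

console.log(uncitedFindings(sampleFindings)); // [ 'F3' ]
```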
@@ -21,7 +21,7 @@ export interface ArtifactValidation {
  validationRule: string;
  }
  export interface StageAutoSubagentDispatch {
- agent: "planner" | "reviewer" | "security-reviewer" | "test-author" | "doc-updater";
+ agent: "planner" | "product-manager" | "critic" | "reviewer" | "security-reviewer" | "test-author" | "doc-updater";
  /**
  * - `mandatory` — must be dispatched (or explicitly waived) before stage transition.
  * - `proactive` — should be dispatched automatically when context matches `when`.
@@ -9,7 +9,7 @@ export const SCOPE = {
  complexityTier: "standard",
  skillFolder: "scope-shaping",
  skillName: "scope-shaping",
- skillDescription: "Strategic scope stage. Challenge premise and lock explicit in-scope/out-of-scope boundaries using CEO-level thinking.",
+ skillDescription: "Strategic contract stage. Select HOLD/SELECTIVE/EXPAND/REDUCE mode, lock the slice and boundaries, and hand stable discretion zones to design.",
  philosophy: {
  hardGate: "Do NOT begin architecture, design, or code. This stage produces scope decisions only. Do not silently add or remove scope — every change is an explicit user opt-in.",
  ironLaw: "EVERY SCOPE CHANGE IS AN EXPLICIT USER OPT-IN — NEVER A SILENT ENLARGEMENT OR TRIM.",
@@ -45,19 +45,19 @@ export const SCOPE = {
  },
  executionModel: {
  checklist: [
- "**Scope contract first** — read brainstorm, name the job-to-be-done, draft the explicit in-scope/out-of-scope/deferred contract, select one mode, and write the rationale. This is the default path; use dream/10-star/temporal/deep strategy sections only when risk, novelty, or user ambition justifies them.",
+ "**Scope contract first** — read brainstorm handoff, name upstream decisions used, explicit drift, confidence, unresolved questions, and next-stage risk hints; draft the in-scope/out-of-scope/deferred/discretion contract before any design choice.",
  "**Premise and leverage check** — answer in the artifact: *Right problem? Direct path? What if nothing? Where can we leverage existing code? What is the reversibility cost?* Take a position; do not hedge.",
  "**Conditional 10-star boundary** — for deep/high-risk/product-strategy work, show what would make the product meaningfully better, then explicitly choose what ships now, what is deferred, and what is excluded without vague `later/for now` placeholders. Skip this for straightforward repair work and record `not needed: compact scope`.",
- "**Pick one of four gstack modes with the user** — SCOPE EXPANSION, SELECTIVE EXPANSION, HOLD SCOPE, or SCOPE REDUCTION. Recommend one, state why and what signal would change it, then STOP for the user's mode/scope approval before writing the final artifact.",
- "**Run mode-specific analysis only to needed depth** — ordinary path is a selected-mode row plus rationale tied to the scope contract. For deep/high-risk work, expand the analysis to match the chosen mode: SCOPE EXPANSION enumerates 10x opportunities + delight features; SELECTIVE EXPANSION lists baseline + cherry-picked additions; HOLD SCOPE proves rigor on the current slice; SCOPE REDUCTION names the smallest useful wedge.",
+ "**Pick one operational mode with the user** — HOLD SCOPE preserves focus; SELECTIVE EXPANSION cherry-picks high-leverage reference ideas; SCOPE EXPANSION explores ambitious alternatives; SCOPE REDUCTION cuts to the essential wedge. Recommend one, state why and what signal would change it, then STOP for approval.",
+ "**Run mode-specific analysis only to needed depth** — lite keeps the selected-mode row compact; standard adds requirements/locked decisions/discretion; deep may add Landscape Check, Taste Calibration, Reference Pull, Ambitious Alternatives, and Ruthless Minimum Slice evidence when mode/risk warrants it.",
  "**Compare implementation alternatives** — include minimum viable, product-grade, and ideal architecture options with effort (S/M/L/XL), risk (Low/Med/High), pros, cons, and reuses. Recommend one and tie it to mode.",
  "**Run outside voice before final approval** — for simple/low-risk scope, record one concise adversarial self-check row; for complex/high-risk/configured scope, iterate until threshold. Record the loop summary in `## Scope Outside Voice Loop`, but do not treat it as user approval.",
  "**Ask only one decision-changing question** — if the user rejects the contract but is unsure, offer 3-4 concrete scope moves instead of open-ended interrogation.",
- "**Write the scope contract after approval** — include in-scope/out-of-scope, discretion areas, deferred items, locked decisions, error/rescue notes, completion dashboard, scope summary (with canonical mode token + next-stage handoff), and explicit approval evidence."
+ "**Write the scope contract after approval** — include selected mode, in scope, out of scope, requirements, locked decisions, discretion areas, deferred ideas, accepted/rejected reference ideas, success definition, design handoff, completion dashboard, and explicit approval evidence."
  ],
  interactionProtocol: [
  decisionProtocolInstruction("scope mode selection", "present expand/selective/hold/reduce as labeled options with trade-offs and mark one as (recommended)", "recommend the option that best covers the prime-directive failure modes, four data-flow paths, observability, and deferred handling for the in-scope set with the smallest blast radius. Base your recommendation on default heuristics: greenfield -> expand, enhancement -> selective, bugfix/hotfix/refactor -> hold, broad blast radius -> reduce"),
- "Do not walk the full checklist by default. Lead with a proposed scope contract and the one decision that matters most; label the mode as recommended, not selected, until the user answers.",
+ "Do not walk the full checklist by default. Lead with a proposed scope contract, selected depth (`lite`/`standard`/`deep`), and the one decision that matters most; label the mode as recommended, not selected, until the user answers.",
  "For simple web-app flows, default to HOLD SCOPE or SELECTIVE EXPANSION, show the exact in/out/deferred contract as a proposal, and STOP for one explicit approval before writing the final scope artifact or completing the stage.",
  "Challenge premise first, take a firm position, and name one concrete condition that would change it.",
  "Push back on weak framing: vague scope needs a specific user/problem, platform vision needs a narrow wedge, social proof needs behavioral evidence.",
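The scope mode selection protocol above names default heuristics: greenfield -> expand, enhancement -> selective, bugfix/hotfix/refactor -> hold, broad blast radius -> reduce. A minimal sketch of that mapping, assuming a broad blast radius takes precedence over the work-kind default (the source does not state the precedence). `WorkKind` and `recommendMode` are hypothetical names:

```typescript
type ScopeMode = "SCOPE EXPANSION" | "SELECTIVE EXPANSION" | "HOLD SCOPE" | "SCOPE REDUCTION";
type WorkKind = "greenfield" | "enhancement" | "bugfix" | "hotfix" | "refactor";

// Work-kind defaults taken from the heuristics in the interaction protocol.
const DEFAULT_MODE: Record<WorkKind, ScopeMode> = {
  greenfield: "SCOPE EXPANSION",
  enhancement: "SELECTIVE EXPANSION",
  bugfix: "HOLD SCOPE",
  hotfix: "HOLD SCOPE",
  refactor: "HOLD SCOPE",
};

// Assumption: a broad blast radius overrides the work-kind default.
function recommendMode(kind: WorkKind, broadBlastRadius: boolean): ScopeMode {
  return broadBlastRadius ? "SCOPE REDUCTION" : DEFAULT_MODE[kind];
}

console.log(recommendMode("enhancement", false)); // SELECTIVE EXPANSION
console.log(recommendMode("greenfield", true)); // SCOPE REDUCTION
```

The result is only a recommendation. Per the protocol, the mode stays "recommended, not selected" until the user approves it.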
@@ -86,7 +86,8 @@ export const SCOPE = {
  "When `.cclaw/config.yaml::optInAudits.scopePreAudit` is true, Pre-Scope System Audit findings are captured (git log/diff/stash/debt markers).",
  "In-scope and out-of-scope lists are explicit.",
  "Discretion areas are explicit (or marked as `None`).",
- "Selected mode and rationale are documented.",
+ "Selected mode and rationale are documented using HOLD SCOPE, SELECTIVE EXPANSION, SCOPE EXPANSION, or SCOPE REDUCTION.",
+ "Scope Contract captures requirements, locked decisions, discretion areas, accepted/rejected reference ideas, success definition, and design handoff.",
  "Locked Decisions section lists stable LD#hash anchors for non-negotiable boundaries.",
  "Premise challenge findings documented.",
  "Outside Voice findings and dispositions are recorded (accept/reject/defer with rationale) before final approval.",
@@ -140,8 +141,12 @@ export const SCOPE = {
  { section: "Pre-Scope System Audit", required: false, validationRule: "When `.cclaw/config.yaml::optInAudits.scopePreAudit` is true: must capture git log -30, git diff --stat, git stash list, and debt-marker scan (TODO/FIXME/XXX/HACK) before premise challenge." },
  { section: "Prime Directives", required: false, validationRule: "For each scoped capability: named failure modes, explicit error surface, four data-flow paths, interaction edge cases, observability expectations, and deferred-item handling." },
  { section: "Premise Challenge", required: false, validationRule: "Must list at least 3 question/answer rows in a markdown table or bullet list (gstack default trio: right problem? direct path? what if we do nothing? — extend with leverage and reversibility for richer scope). The linter enforces structure, not English wording — answers may be in any language." },
- { section: "Landscape Check", required: false, validationRule: "When mode is EXPAND/SELECTIVE, include at least one external reference insight and its impact on scope." },
- { section: "Taste Calibration", required: false, validationRule: "Must reference 2-3 strong in-repo modules/files that define the quality bar or explicitly justify omission." },
+ { section: "Scope Contract", required: true, validationRule: "Canonical contract: selected mode, in scope, out of scope, requirements, locked decisions, discretion areas, deferred ideas, accepted/rejected reference ideas, success definition, and design handoff." },
+ { section: "Landscape Check", required: false, validationRule: "Optional evidence heading for EXPAND/SELECTIVE/deep modes: include reference insight and impact on scope, or omit for compact HOLD SCOPE." },
+ { section: "Taste Calibration", required: false, validationRule: "Optional evidence heading: reference 2-3 strong in-repo modules/files that define the quality bar or justify omission." },
+ { section: "Reference Pull", required: false, validationRule: "Optional evidence heading: cite ideas pulled from `/Users/zuevrs/Downloads/references` or state no reference pull was needed for compact HOLD SCOPE." },
+ { section: "Ambitious Alternatives", required: false, validationRule: "Optional evidence heading for SCOPE EXPANSION/SELECTIVE: list larger alternatives considered and their disposition." },
+ { section: "Ruthless Minimum Slice", required: false, validationRule: "Optional evidence heading for SCOPE REDUCTION or high-risk scope: define the smallest useful wedge and what it proves." },
  { section: "Requirements", required: false, validationRule: "Table of stable requirement IDs (R1, R2, R3…) one per row with observable outcome, priority, and source. IDs are assigned once and never renumbered across scope/design/spec/plan/review; dropped requirements stay with Priority `DROPPED`." },
  { section: "Locked Decisions (LD#hash)", required: false, validationRule: "List of stable locked decisions with unique `LD#<sha8>` anchors. Each anchor is derived from the normalized Decision cell and is referenced downstream for cross-stage traceability." },
  { section: "Implementation Alternatives", required: false, validationRule: "2-3 options with Name, Summary, Effort, Risk, Pros, Cons, and Reuses. Must include minimal viable and ideal architecture options." },
@@ -154,7 +159,7 @@ export const SCOPE = {
  { section: "Outside Voice Findings", required: false, validationRule: "Must list external/adversarial findings and disposition (accept/reject/defer) with rationale." },
  { section: "Scope Outside Voice Loop", required: false, validationRule: `Must record iterations, quality score per iteration, stop reason, and unresolved concerns. Enforce ${reviewLoopPolicySummary("scope")}` },
  { section: "Completion Dashboard", required: true, validationRule: "Lists per-review-section status, count of critical/open gaps, resolved decisions, and unresolved decisions (or 'None')." },
- { section: "Scope Summary", required: true, validationRule: "Compact recap of the locked scope. Must name the selected mode using one of the canonical tokens (`SCOPE EXPANSION`, `SELECTIVE EXPANSION`, `HOLD SCOPE`, `SCOPE REDUCTION`) and record the track-aware next-stage handoff (`design` for standard, `spec` for medium); the linter checks structure, not English wording." },
+ { section: "Scope Summary", required: true, validationRule: "Compact recap of the locked scope. Must name the selected mode using one canonical token, confidence, explicit drift from brainstorm, unresolved questions, and the track-aware next-stage handoff (`design` for standard, `spec` for medium); the linter checks structure, not English wording." },
  { section: "Dream State Mapping", required: false, validationRule: "Deep/optional only: CURRENT STATE, THIS PLAN, 12-MONTH IDEAL, and alignment verdict. Omit for compact scope." },
  { section: "Temporal Interrogation", required: false, validationRule: "Deep/optional only: timeline simulation table with decision pressures and lock-now vs defer verdicts. Omit for compact scope." }
  ]
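The revised `Scope Summary` rule above asks for one canonical mode token plus confidence, drift, and unresolved-question fields, and states that the linter checks structure rather than wording. A minimal sketch of such a structural check follows. It assumes a `- Label: value` bullet layout and reuses the four mode tokens from the removed wording of the rule; `lintScopeSummary`, `CANONICAL_MODES`, and `REQUIRED_LABELS` are hypothetical names, not the package's actual linter, and the handoff check is left out because its label is not visible in this hunk.

```javascript
// Hypothetical structural check for a "Scope Summary" section body.
// Assumption: fields are written as "- Label: value" bullets.
const CANONICAL_MODES = [
  "SCOPE EXPANSION",
  "SELECTIVE EXPANSION",
  "HOLD SCOPE",
  "SCOPE REDUCTION",
];
const REQUIRED_LABELS = [
  "Selected mode",
  "Confidence",
  "Drift from brainstorm",
  "Unresolved questions",
];

function lintScopeSummary(sectionBody) {
  const problems = [];
  const fields = new Map();
  // Collect "- Label: value" bullets; a repeated label overwrites the earlier value.
  for (const line of sectionBody.split("\n")) {
    const match = /^\s*-\s*([^:]+):\s*(.*)$/.exec(line);
    if (match) fields.set(match[1].trim(), match[2].trim());
  }
  for (const label of REQUIRED_LABELS) {
    if (!fields.get(label)) problems.push(`missing or empty field: ${label}`);
  }
  // Structure only: exactly one canonical token must appear; wording is not judged.
  const mode = fields.get("Selected mode") ?? "";
  const named = CANONICAL_MODES.filter((token) => mode.includes(token));
  if (named.length !== 1) {
    problems.push("Selected mode must name exactly one canonical token");
  }
  return problems;
}
```

Under these assumptions, an unfilled template line that still lists several tokens is reported, as is a summary that omits the drift or unresolved-question bullets.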
@@ -7,6 +7,8 @@ import { conversationLanguagePolicyMarkdown } from "./language-policy.js";
  */
  const SUBAGENT_AGENT_NAMES = [
  "planner",
+ "product-manager",
+ "critic",
  "reviewer",
  "security-reviewer",
  "test-author",
@@ -130,9 +132,9 @@ Concrete per-stage rules so the controller does not have to guess which tier fit
 
  | Stage | Deep slot | Balanced slot(s) | Fast fan-out | Trigger to escalate |
  |---|---|---|---|---|
- | brainstorm | planner (only if ambiguity spans >1 module) | | run in-thread research playbooks | promote to \`balanced\` reviewer once direction locks |
- | scope | planner (always) | | run \`research/git-history.md\` in-thread when churn is high | promote to \`balanced\` planner if scope touches external contracts |
- | design | planner (always) | security-reviewer (if trust boundary touched) | run \`research/framework-docs-lookup.md\` + \`research/best-practices-lookup.md\` in-thread | escalate one specialist to \`deep\` only if a failure mode is Critical-severity |
+ | brainstorm | planner (only if ambiguity spans >1 module) | product-manager / critic when product value or premise is uncertain | run in-thread research playbooks | promote to \`balanced\` critic if the do-nothing path may beat the idea |
+ | scope | planner (always) | product-manager / critic when mode changes user value or boundaries are soft | run \`research/git-history.md\` in-thread when churn is high | promote to \`balanced\` critic if scope mode is disputed |
+ | design | planner (always) | critic, security-reviewer, test-author when alternatives/trust/testability apply | run \`research/framework-docs-lookup.md\` + \`research/best-practices-lookup.md\` in-thread | escalate one specialist to \`deep\` only if a failure mode is Critical-severity |
  | spec | — | reviewer (if spec > 200 lines or multiple ACs) | — | escalate to \`deep\` only for spec ↔ design contradictions |
  | plan | planner (solo, always) | — | — | never fan out at plan stage; one owner for dependency graph |
  | tdd | — | ${formatAgentList(stageSummary("tdd").primaryAgents)} (per slice, carrying RED/GREEN/REFACTOR evidence) · reviewer (slice-local only when sliceReview triggers) | doc-updater (API surface changes) | escalate to \`deep\` only when a RED test cannot be expressed (design leak) |
@@ -601,6 +603,56 @@ Output format (mandatory):
  - Close with RISK_SUMMARY and SHIP_BLOCKERS (explicit list, possibly empty).
  \`\`\`
 
+ `;
+ }
+ function productManagerEnhancedBody() {
+ return `
+
+ ## Task Tool Delegation
+
+ Use this payload when product discovery needs an isolated lens:
+
+ \`\`\`
+ You are a product-manager subagent.
+
+ DISCOVERY GOAL: {problem/value decision to clarify}
+ CONTEXT: {existing artifact excerpts, user segment, constraints}
+ DEPTH: {lite|standard|deep}
+
+ Required output:
+ - PERSONA_JTBD: persona, job, pain/trigger
+ - VALUE_HYPOTHESIS: expected value and success metric
+ - EVIDENCE_SIGNAL: strongest evidence, weakest assumption
+ - WHY_NOW_AND_DO_NOTHING: why now plus consequence of no action
+ - NON_GOALS: explicit exclusions
+ - SCOPE_HANDOFF: one recommendation for hold/selective/expand/reduce
+ \`\`\`
+
+ `;
+ }
+ function criticEnhancedBody() {
+ return `
+
+ ## Task Tool Delegation
+
+ Use this payload when a premise, scope mode, or engineering path needs adversarial pressure:
+
+ \`\`\`
+ You are a critic subagent.
+
+ DECISION_UNDER_REVIEW: {direction/scope/design choice}
+ CONTEXT: {artifact excerpts, constraints, known risks}
+ DEPTH: {lite|standard|deep}
+
+ Required output:
+ - PREMISE_ATTACK: what could make this decision wrong
+ - CHEAPER_ALTERNATIVE: smaller or more reversible option
+ - SHADOW_ALTERNATIVE: viable competing path
+ - SWITCH_TRIGGER: signal that should change the decision
+ - FAILURE_RESCUE: likely failure and rescue/degraded behavior
+ - VERIFICATION_EVIDENCE: evidence needed before locking
+ \`\`\`
+
  `;
  }
  function reviewerEnhancedBody() {
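The two delegation payloads added in this hunk share a shape: a role line, labeled inputs with a `DEPTH` of `lite`, `standard`, or `deep`, and a fixed list of required output labels. As a rough sketch of how a controller might fill the critic payload and check a reply for those labels, assuming replies repeat each label verbatim at the start of a line: `buildCriticPayload` and `missingCriticFields` are hypothetical helpers, not functions from the package, and the per-label descriptions are omitted for brevity.

```javascript
// Hypothetical helpers; only the labels and depth values come from the payload text.
const DEPTHS = ["lite", "standard", "deep"];
const CRITIC_OUTPUT_LABELS = [
  "PREMISE_ATTACK",
  "CHEAPER_ALTERNATIVE",
  "SHADOW_ALTERNATIVE",
  "SWITCH_TRIGGER",
  "FAILURE_RESCUE",
  "VERIFICATION_EVIDENCE",
];

// Fill the placeholder slots of the critic payload with concrete values.
function buildCriticPayload({ decision, context, depth }) {
  if (!DEPTHS.includes(depth)) {
    throw new RangeError(`depth must be one of ${DEPTHS.join(", ")}`);
  }
  return [
    "You are a critic subagent.",
    "",
    `DECISION_UNDER_REVIEW: ${decision}`,
    `CONTEXT: ${context}`,
    `DEPTH: ${depth}`,
    "",
    "Required output:",
    ...CRITIC_OUTPUT_LABELS.map((label) => `- ${label}`),
  ].join("\n");
}

// Assumption: a well-formed reply starts a line with "- LABEL:" or "LABEL:".
// Returns the required labels that the reply does not contain.
function missingCriticFields(reply) {
  return CRITIC_OUTPUT_LABELS.filter(
    (label) => !new RegExp(`^\\s*(?:-\\s*)?${label}:`, "m").test(reply),
  );
}
```

Rejecting an unknown depth before delegation and listing missing labels afterwards keeps both checks structural, in the same spirit as the validation rules elsewhere in this diff.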
@@ -689,6 +741,10 @@ export function enhancedAgentBody(agentName) {
  switch (agentName) {
  case "planner":
  return plannerEnhancedBody();
+ case "product-manager":
+ return productManagerEnhancedBody();
+ case "critic":
+ return criticEnhancedBody();
  case "reviewer":
  return reviewerEnhancedBody();
  case "security-reviewer":
@@ -29,10 +29,28 @@ export const ARTIFACT_TEMPLATES = {
  ### Discovered context
  - (paths, prior artifacts, seeds, prompt fragments — referenced by downstream stages, or \`- None.\`)
 
- ## Problem
- - **What we're solving:**
- - **Success criteria:**
- - **Constraints:**
+ ## Problem Decision Record
+ - **Depth:** lite | standard | deep
+ - **Frame type:** product | technical-maintenance
+
+ ### Product framing (use when applicable)
+ - **Persona / user:**
+ - **Job to be done:**
+ - **Pain / trigger:**
+ - **Value hypothesis:**
+ - **Evidence / signal:**
+ - **Success metric:**
+ - **Why now:**
+ - **Do-nothing consequence:**
+ - **Non-goals:**
+
+ ### Technical-maintenance framing (use when product framing is not applicable)
+ - **Affected operator/developer:**
+ - **Current failure mode:**
+ - **Expected operational improvement:**
+ - **Verification signal:**
+ - **Do-nothing cost:**
+ - **Non-goals:**
 
  ## Premise Check
  - **Right problem?** (yes/no + one-line justification — take a position)
@@ -43,11 +61,10 @@ export const ARTIFACT_TEMPLATES = {
  - *How might we …?* — one line naming the user, the desired outcome, and the binding constraint.
 
  ## Sharpening Questions
+ > Ask one decision-changing question at a time. For concrete early exits, record \`None - early exit\` with rationale.
  | # | Question | Answer / Assumption | Decision impact |
  |---|---|---|---|
  | 1 | | | |
- | 2 | | | |
- | 3 | | | |
 
  ## Clarifying Questions
  | # | Question | Answer | Decision impact |
@@ -55,7 +72,7 @@ export const ARTIFACT_TEMPLATES = {
  | 1 | | | |
 
  ## Approach Tier
- - Tier: Lightweight | Standard | Deep
+ - Tier: lite | standard | deep
  - Why this tier:
 
  ## Short-Circuit Decision
@@ -80,7 +97,7 @@ export const ARTIFACT_TEMPLATES = {
  - **Approach:**
  - **Rationale:** Trace this to the prior Approach Reaction.
  - **Approval:** pending
- - **Next-stage handoff:** On standard track, hand this to \`scope\`; on medium track, hand this directly to \`spec\` with explicit requirements/constraints.
+ - **Next-stage handoff:** On standard track, hand this to \`scope\`; on medium track, hand this directly to \`spec\`. Include upstream decisions used, drift, confidence, unresolved questions, risk hints, and non-goals.
 
  ## Not Doing
  - (3-5 things this brainstorm is *not* committing to — distinct from \`Deferred\`. These will not appear in scope unless the user explicitly opts in.)
@@ -165,8 +182,21 @@ ${SEED_SHELF_SECTION}
  | HOUR 4-5 (integration) | | | |
  | HOUR 6+ (polish/tests) | | | |
 
+ ## Scope Contract
+ - **Selected mode:** HOLD SCOPE | SELECTIVE EXPANSION | SCOPE EXPANSION | SCOPE REDUCTION
+ - **In scope:**
+ - **Out of scope:**
+ - **Requirements:**
+ - **Locked decisions:**
+ - **Discretion areas:**
+ - **Deferred ideas:**
+ - **Accepted reference ideas:**
+ - **Rejected reference ideas:**
+ - **Success definition:**
+ - **Design handoff:**
+
  ## Scope Mode
- - [ ] SCOPE EXPANSION — dream bigger; user explicitly opts into the larger product slice.
+ - [ ] SCOPE EXPANSION — explore ambitious alternatives; user explicitly opts into the larger product slice.
  - [ ] SELECTIVE EXPANSION — hold baseline scope and cherry-pick one high-leverage addition.
  - [ ] HOLD SCOPE — preserve the approved brainstorm direction with maximum rigor.
  - [ ] SCOPE REDUCTION — strip to the smallest useful wedge when risk/blast radius is too high.
@@ -174,9 +204,24 @@ ${SEED_SHELF_SECTION}
  ## Mode-Specific Analysis
  | Selected mode | Rationale | Depth |
  |---|---|---|
- | | | default / deep |
+ | | | lite / standard / deep |
+
+ > Default path: one selected-mode row plus rationale. Deep/high-risk scope may expand with optional evidence headings below.
+
+ ## Landscape Check
+ - Optional for EXPAND/SELECTIVE/deep; omit for compact HOLD SCOPE.
+
+ ## Taste Calibration
+ - Optional quality-bar references from in-repo modules/files.
 
- > Default path: one selected-mode row plus rationale. Deep/high-risk scope may expand below with mode-specific analysis.
+ ## Reference Pull
+ - Optional evidence from \`/Users/zuevrs/Downloads/references\`; list accepted/rejected ideas or \`Not needed - compact scope\`.
+
+ ## Ambitious Alternatives
+ - Optional for SCOPE EXPANSION/SELECTIVE; list larger alternatives and disposition.
+
+ ## Ruthless Minimum Slice
+ - Optional for SCOPE REDUCTION/high-risk scope; define the smallest useful wedge.
 
  ## Requirements (stable IDs)
  | ID | Requirement (observable outcome) | Priority | Source (origin doc / prompt line) |
@@ -241,6 +286,9 @@ ${SEED_SHELF_SECTION}
 
  ## Scope Summary
  - Selected mode: (one of \`SCOPE EXPANSION\` | \`SELECTIVE EXPANSION\` | \`HOLD SCOPE\` | \`SCOPE REDUCTION\`)
+ - Confidence: high | medium | low
+ - Drift from brainstorm: None / <specific drift>
+ - Unresolved questions: None / <questions>
  - Strongest challenges resolved:
  - Recommended path:
  - Accepted scope:
@@ -291,7 +339,7 @@ ${SEED_SHELF_SECTION}
 
  ## Compact-First Scaffold
  - Default to the compact design spine unless risk requires Standard/Deep add-ons.
- - Compact required spine: Codebase Investigation, Architecture Boundaries, Architecture Diagram, Data Flow, Failure Mode Table, Test Strategy, and Completion Dashboard.
+ - Compact required spine: Upstream Handoff, Codebase Investigation, Engineering Lock, Architecture Boundaries, Architecture Diagram, Data Flow, Failure Mode Table, Test Strategy, Spec Handoff, and Completion Dashboard.
  - Mark optional Standard/Deep sections as \`Omitted - compact design\` when they do not apply; do not expand the scaffold just to fill empty tables.
 
  ## Upstream Handoff
@@ -302,9 +350,14 @@ ${SEED_SHELF_SECTION}
  - Drift from upstream (or \`None\`):
 
  ## Codebase Investigation
- | File | Current responsibility | Patterns discovered |
- |---|---|---|
- | | | |
+ | File | Current responsibility | Patterns discovered | Existing fit / reuse candidate |
+ |---|---|---|---|
+ | | | | |
+
+ ## Engineering Lock
+ | Decision area | Chosen path | Shadow alternative | Switch trigger | Failure/rescue/degraded behavior | Verification evidence | Confidence |
+ |---|---|---|---|---|---|---|
+ | | | | | | | |
 
  ## Search Before Building
  | Layer | Label | What to reuse first |
@@ -336,9 +389,9 @@ ${MARKDOWN_CODE_FENCE}
  ## Data-Flow Shadow Paths
  - Standard/Deep add-on; omit when compact design does not need a shadow path.
  <!-- diagram: data-flow-shadow-paths -->
- | Path | Trigger | Fallback/Degrade behavior |
- |---|---|---|
- | | | |
+ | Chosen path | Shadow alternative | Switch trigger | Failure/rescue/degraded behavior | Verification evidence |
+ |---|---|---|---|---|
+ | | | | | |
 
  ## Error Flow Diagram
  - Standard/Deep add-on; omit when the Failure Mode Table is sufficient.
@@ -387,6 +440,8 @@ ${MARKDOWN_CODE_FENCE}
  | | | | |
 
  ## Data Flow
+ - Data/state flow:
+ - Critical path:
  - Happy path:
  - Nil/empty input path:
  - Upstream error path:
@@ -431,6 +486,23 @@ ${MARKDOWN_CODE_FENCE}
  |---|---|---|
  | | | |
 
+ ## Rejected Alternatives
+ | Alternative | Why rejected | Revival signal |
+ |---|---|---|
+ | | | |
+
+ ## Design Decisions
+ | Decision Ref | Requirement / LD refs | Decision | Spec impact |
+ |---|---|---|---|
+ | DD-1 | | | |
+
+ ## Spec Handoff
+ - Requirements to carry forward:
+ - Design decisions to encode:
+ - Risks and rescue paths:
+ - Test/performance expectations:
+ - Unresolved questions (or \`None\`):
+
  ## Outside Voice Findings
  | ID | Dimension | Finding | Disposition | Rationale |
  |---|---|---|---|---|
@@ -735,16 +807,40 @@ Execution rule: complete and verify each batch before starting the next batch.
  - Open questions:
  - Drift from upstream (or \`None\`):
 
+ ## Review Evidence Scope
+ - Base/head:
+ - Files inspected:
+ - Changed-file coverage summary:
+ - Diagnostics run:
+ - Omitted files with explicit reason:
+ - Reviewer delegation evidence:
+ - Security-reviewer delegation evidence:
+
+ ## Changed-File Coverage
+ | File | Coverage status | Evidence / no-impact reason |
+ |---|---|---|
+ | | inspected / broader-module / omitted-no-impact | |
+
  ## Layer 1 Verdict
  | Criterion | Verdict | Evidence |
  |---|---|---|
  | AC-1 | PASS/FAIL | |
 
  ## Layer 2 Findings
- | ID | Severity | Category | Description | Status |
- |---|---|---|---|---|
- | R-1 | Critical/Important/Suggestion | correctness/security/performance/architecture/external-safety | | open/resolved |
- - NO_CHANGE_ATTESTATION: <required when Category=security has no entries; explain why no security-relevant changes were detected>
+ | ID | Severity | Category | File:line / no-line reason | Description | Status |
+ |---|---|---|---|---|---|
+ | R-1 | Critical/Important/Suggestion | correctness/security/performance/architecture/external-safety | path:line | | open/resolved |
+ - NO_FINDINGS_ATTESTATION: <required when no findings are reported; cite inspected coverage>
+
+ ## Security Sweep Attestation
+ - Result: findings | NO_CHANGE_ATTESTATION | NO_SECURITY_IMPACT
+ - Inspected surfaces:
+ - Rationale:
+
+ ## Dependency & Version Audit
+ - Relevant: yes/no
+ - Manifests/lockfiles/generated clients/CI/runtime config/external APIs inspected:
+ - Result / no-impact rationale:
 
  ## Incoming Feedback Queue
  | ID | Source | Severity | File:line | Request | Status | Evidence |