@mmerterden/multi-agent-pipeline 10.7.3 → 10.7.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (46)
  1. package/CHANGELOG.md +19 -2
  2. package/docs/adr/0001-three-model-triage.md +2 -2
  3. package/docs/adr/0007-multi-tool-adapter-framework.md +1 -1
  4. package/docs/adr/README.md +2 -2
  5. package/docs/architecture.md +14 -14
  6. package/docs/features.md +22 -21
  7. package/docs/performance.md +3 -3
  8. package/index.js +3 -7
  9. package/install/templates/copilot-instructions.md +2 -2
  10. package/package.json +2 -5
  11. package/pipeline/agents/dev-critic.md +1 -1
  12. package/pipeline/claude-md-template.md +1 -1
  13. package/pipeline/commands/multi-agent/dev-autopilot.md +1 -1
  14. package/pipeline/commands/multi-agent/finish.md +2 -2
  15. package/pipeline/commands/multi-agent/help.md +12 -12
  16. package/pipeline/commands/multi-agent/local.md +1 -1
  17. package/pipeline/commands/multi-agent/refs/features/dev-critic.md +1 -1
  18. package/pipeline/commands/multi-agent/refs/features/model-fallback.md +7 -3
  19. package/pipeline/commands/multi-agent/refs/knowledge.md +1 -1
  20. package/pipeline/commands/multi-agent/refs/phases/log-format.md +1 -1
  21. package/pipeline/commands/multi-agent/refs/phases/modes.md +1 -1
  22. package/pipeline/commands/multi-agent/refs/phases/phase-1-analysis.md +2 -2
  23. package/pipeline/commands/multi-agent/refs/phases/phase-2-planning.md +2 -2
  24. package/pipeline/commands/multi-agent/refs/phases/phase-3-dev.md +1 -1
  25. package/pipeline/commands/multi-agent/refs/phases/phase-4-review.md +18 -18
  26. package/pipeline/commands/multi-agent/refs/progress-contract.md +1 -1
  27. package/pipeline/commands/multi-agent/refs/tracker-contract.md +1 -2
  28. package/pipeline/commands/multi-agent/review.md +8 -8
  29. package/pipeline/commands/multi-agent/sync.md +3 -3
  30. package/pipeline/commands/multi-agent.md +7 -7
  31. package/pipeline/schemas/agent-state.schema.json +1 -1
  32. package/pipeline/schemas/prefs.schema.json +3 -3
  33. package/pipeline/schemas/reviewer-output.schema.json +1 -1
  34. package/pipeline/schemas/triage-output.schema.json +2 -2
  35. package/pipeline/scripts/README.md +1 -2
  36. package/pipeline/scripts/cost-budget-check.mjs +1 -1
  37. package/pipeline/scripts/cost-table.json +7 -0
  38. package/pipeline/scripts/fixtures/install-layout.tsv +5 -5
  39. package/pipeline/scripts/uninstall.mjs +53 -57
  40. package/pipeline/skills/shared/core/multi-agent/SKILL.md +11 -11
  41. package/pipeline/skills/shared/core/multi-agent-dev-autopilot/SKILL.md +1 -1
  42. package/pipeline/skills/shared/core/multi-agent-finish/SKILL.md +1 -1
  43. package/pipeline/skills/shared/core/multi-agent-help/SKILL.md +8 -8
  44. package/pipeline/skills/shared/core/multi-agent-review/SKILL.md +5 -5
  45. package/pipeline/skills/shared/core/multi-agent-sync/SKILL.md +7 -5
  46. package/pipeline/scripts/smoke-readme-counts.sh +0 -120
@@ -1,6 +1,6 @@
1
1
  ### Phase 4: Review (deterministic gates + parallel + triage)
2
2
 
3
- > **TLDR** - Three-stage review. Stage 1: deterministic gates (build + lint + test + secret scan) that MUST pass. Stage 2: AI models in parallel - reviewer set is **CLI-aware**: Claude Code dispatches 2 reviewers (Opus + Sonnet); Copilot CLI dispatches 3 reviewers (GPT-5.4 + Opus + Sonnet). Stage 3: Opus triage - evaluates raw findings, filters false-positives/out-of-scope, keeps only actionable items. Only triage-accepted blocking items loop back to Phase 3.
3
+ > **TLDR** - Three-stage review. Stage 1: deterministic gates (build + lint + test + secret scan) that MUST pass. Stage 2: AI models in parallel - reviewer set is **CLI-aware**: Claude Code dispatches 2 reviewers (Fable + Sonnet); Copilot CLI dispatches 3 reviewers (GPT-5.4 + Opus + Sonnet — Fable 5 is not offered on Copilot CLI). Stage 3: Fable triage (Opus on Copilot CLI) - evaluates raw findings, filters false-positives/out-of-scope, keeps only actionable items. Only triage-accepted blocking items loop back to Phase 3.
4
4
 
5
5
  <!-- progress-contract: applied -->
6
6
  Progress emission per `refs/progress-contract.md` - lines for each gate, each reviewer dispatch + finish, triage start, triage verdict, fix dispatch.
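The CLI-aware reviewer set in the TLDR above can be sketched as a lookup keyed by host CLI. This is a hedged illustration only: `REVIEWERS`, `reviewerSet`, and the host keys `claude-code` / `copilot` are hypothetical names, and the short model labels are taken from the TLDR text rather than from the package's code.

```javascript
// Hypothetical sketch: names and shape are assumptions, not the package's implementation.
// Labels follow the TLDR: Fable + Sonnet on Claude Code, Opus + GPT-5.4 + Sonnet on Copilot CLI.
const REVIEWERS = {
  "claude-code": ["fable", "sonnet"],
  copilot: ["opus", "gpt-5.4", "sonnet"],
};

// Returns a copy of the reviewer list for the given host CLI.
function reviewerSet(host) {
  const set = REVIEWERS[host];
  if (!set) throw new Error(`unsupported host CLI: ${host}`);
  return [...set];
}

console.log(reviewerSet("claude-code")); // [ 'fable', 'sonnet' ]
```

Returning a copy keeps a caller from mutating the shared table when it filters out a reviewer for one run.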
@@ -181,17 +181,17 @@ Launch Agent instances **in parallel** using the shared `code-reviewer` subagent
181
181
 
182
182
  | Reviewer | subagent_type | Model | Focus | Skills Referenced | Where it runs |
183
183
  | ---------- | ----------------- | ------------------- | --------------------------------- | --------------------------------------------- | -------------------- |
184
- | Reviewer 1 | `code-reviewer` | `claude-opus-4.6` | Deep security + architecture | `api-security-best-practices`, `architecture` | Both CLIs |
184
+ | Reviewer 1 | `code-reviewer` | `claude-fable-5` (Claude Code) / `claude-opus-4-8` (Copilot CLI) | Deep security + architecture | `api-security-best-practices`, `architecture` | Both CLIs |
185
185
  | Reviewer 2 | `code-reviewer` | `gpt-5.4` | Edge cases, different perspective | cross-model diversity | **Copilot CLI only** |
186
- | Reviewer 3 | `code-reviewer` | `claude-sonnet-4.6` | Quality + correctness + naming | `clean-code`, stack-specific skill | Both CLIs |
186
+ | Reviewer 3 | `code-reviewer` | `claude-sonnet-4-6` | Quality + correctness + naming | `clean-code`, stack-specific skill | Both CLIs |
187
187
 
188
188
  Each reviewer inherits the `code-reviewer` agent's focus areas (Security, Architecture, Quality, Performance) and output contract. The orchestrator overrides only the model and the stack-specific skill per-reviewer - no prompt duplication.
189
189
 
190
- **Model override wiring:** `code-reviewer.md` declares `preferredModel: fable`, so Reviewer 1 uses the persona default (Fable 5). Reviewer 2 (Copilot-only, `gpt-5.4`) and Reviewer 3 (`claude-sonnet-4.6`) set `PHASE_MODEL_OVERRIDE=<model>` before dispatch - the orchestrator exports `CLAUDE_CODE_SUBAGENT_MODEL` on Claude Code, or passes `--model` on Copilot CLI. Full precedence rule: `skills/shared/core/multi-agent/SKILL.md#agent-dispatch--per-persona-model-routing-v610`. Fable dispatches are subject to the fallback contract (`refs/features/model-fallback.md`): dispatch-error retry walks `fable -> opus -> sonnet` and budget-ceiling downgrade.
190
+ **Model override wiring:** `code-reviewer.md` declares `preferredModel: fable`, so Reviewer 1 uses the persona default (Fable 5). Reviewer 2 (Copilot-only, `gpt-5.4`) and Reviewer 3 (`claude-sonnet-4-6`) set `PHASE_MODEL_OVERRIDE=<model>` before dispatch - the orchestrator exports `CLAUDE_CODE_SUBAGENT_MODEL` on Claude Code, or passes `--model` on Copilot CLI. Full precedence rule: `skills/shared/core/multi-agent/SKILL.md#agent-dispatch--per-persona-model-routing-v610`. Fable dispatches are subject to the fallback contract (`refs/features/model-fallback.md`): dispatch-error retry walks `fable -> opus -> sonnet` and budget-ceiling downgrade.
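As a rough illustration of the dispatch-error walk `fable -> opus -> sonnet` named in the paragraph above, the next model to try can be looked up from an ordered chain. `FALLBACK_CHAIN` and `nextFallback` are hypothetical names; the real contract lives in `refs/features/model-fallback.md`, which this diff does not show in full, and the budget-ceiling downgrade is not modeled here.

```javascript
// Hypothetical sketch of the dispatch-error fallback order; not the package's code.
const FALLBACK_CHAIN = ["fable", "opus", "sonnet"];

// Returns the model to retry with after `failedModel` errors,
// or null when the chain is exhausted or the model is not in it.
function nextFallback(failedModel, chain = FALLBACK_CHAIN) {
  const i = chain.indexOf(failedModel);
  if (i === -1 || i === chain.length - 1) return null;
  return chain[i + 1];
}

console.log(nextFallback("fable")); // opus
console.log(nextFallback("sonnet")); // null
```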
191
191
 
192
192
  **Stack-specific skills loaded per reviewer** (from Phase 1 `detectedStack`). On Claude Code, Reviewer 2 (GPT-5.4) is not dispatched - its skill column is ignored. On Copilot CLI all three columns are used.
193
193
 
194
- | Stack | Reviewer 1 (Opus) | Reviewer 2 (GPT-5.4 - Copilot CLI only) | Reviewer 3 (Sonnet) |
194
+ | Stack | Reviewer 1 (Fable / Opus on Copilot) | Reviewer 2 (GPT-5.4 - Copilot CLI only) | Reviewer 3 (Sonnet) |
195
195
  |-------|-------------------|-----------------------------------------|---------------------|
196
196
  | iOS/Swift | `ios-security`, `swiftui-performance`, `hig-patterns` | `swift-concurrency`, `ios-accessibility` | `swiftui-pro`, `swift-testing` |
197
197
  | Android/Kotlin | `android-security`, `android-performance` | `compose-testing`, `android-architecture` | `compose-components`, `kotlin-coroutines-expert` |
@@ -204,11 +204,11 @@ Skills are injected into reviewer prompt context - the reviewer uses them as r
204
204
 
205
205
  **iOS/Swift - interaction & convention skills (conditional).** When the diff touches SwiftUI UI files (`*View.swift`, `*Screen.swift`, `*Configuration.swift`, `*+Modifiers.swift`), additionally inject the relevant `figma-common` convention skills as reference for the iOS reviewers: `figma-navigation`, `figma-overlays`, `figma-bottom-sheets` (interaction: emit-intent vs self-route/self-present; native-SwiftUI-first vs the project's `ui.*` custom system), and the enriched `figma-to-swiftui` accessibility rules (minimalism). These back the Step 1.5 iOS convention checks. Generic across SwiftUI projects - not tied to any one app. Omit when the diff has no SwiftUI UI changes (keeps the reviewer prompt lean).
206
206
 
207
- **Dispatch timeout (required, mirrors triage 3.3).** Reviewers run in parallel and triage waits on all of them, so one stalled reviewer hangs the phase. Bound each reviewer dispatch by `REVIEWER_TIMEOUT_SECONDS` (default 180). If a reviewer has not returned by the budget: log `review.reviewer_timeout reviewer=<name>`, treat that reviewer as absent, and proceed to triage with the reviewers that did return. The merged-findings count and `consensus.reviewerCount` reflect only the reviewers that returned. If **zero** reviewers return, retry the Opus reviewer once; on a second total failure HALT with `ERR: no reviewer returned within ${REVIEWER_TIMEOUT_SECONDS}s; resume with /multi-agent:resume #N.`. The Step 2.5 rebuttal round uses the same per-dispatch timeout. Never block indefinitely on a slow or dead reviewer dispatch.
207
+ **Dispatch timeout (required, mirrors triage 3.3).** Reviewers run in parallel and triage waits on all of them, so one stalled reviewer hangs the phase. Bound each reviewer dispatch by `REVIEWER_TIMEOUT_SECONDS` (default 180). If a reviewer has not returned by the budget: log `review.reviewer_timeout reviewer=<name>`, treat that reviewer as absent, and proceed to triage with the reviewers that did return. The merged-findings count and `consensus.reviewerCount` reflect only the reviewers that returned. If **zero** reviewers return, retry Reviewer 1 once; on a second total failure HALT with `ERR: no reviewer returned within ${REVIEWER_TIMEOUT_SECONDS}s; resume with /multi-agent:resume #N.`. The Step 2.5 rebuttal round uses the same per-dispatch timeout. Never block indefinitely on a slow or dead reviewer dispatch.
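One way to bound each reviewer dispatch as described above is to race it against a timer and treat late reviewers as absent. The sketch below is an assumption-laden illustration: `withTimeout` and `collectReviews` are hypothetical helpers, `reviewers` is assumed to map a reviewer name to a function returning a Promise, and the "retry Reviewer 1 once, then HALT" rule for zero returns is left out.

```javascript
// Hypothetical sketch: helper names and the `reviewers` shape are assumptions.
// Resolves with the reviewer output, or with a timedOut marker once `ms` elapses.
function withTimeout(promise, ms, name) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve({ name, timedOut: true }), ms);
  });
  return Promise.race([
    promise.then((output) => ({ name, timedOut: false, output })),
    timeout,
  ]).finally(() => clearTimeout(timer));
}

// Dispatches all reviewers in parallel and keeps only those that returned in time.
// A reviewer that rejects still propagates its error; only timeouts are handled here.
async function collectReviews(reviewers, timeoutSeconds = 180) {
  const settled = await Promise.all(
    Object.entries(reviewers).map(([name, run]) =>
      withTimeout(run(), timeoutSeconds * 1000, name)
    )
  );
  for (const r of settled) {
    if (r.timedOut) console.log(`review.reviewer_timeout reviewer=${r.name}`);
  }
  return settled.filter((r) => !r.timedOut);
}
```

The length of the returned array is what a `consensus.reviewerCount` based on returned reviewers would report under this sketch.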
208
208
 
209
209
  #### Output contract - reviewer step
210
210
 
211
- Step 2 produces N reviewer-output objects (one per dispatched reviewer), each conforming to `pipeline/schemas/reviewer-output.schema.json`. They are persisted to `state.reviewIterations[<iteration>].reviewers[]` and consumed by Step 3 (Opus triage) - never by Phase 6 directly. The triage step (below) is the producer of the only review artifact Phase 6 reads, conforming to `pipeline/schemas/triage-output.schema.json`.
211
+ Step 2 produces N reviewer-output objects (one per dispatched reviewer), each conforming to `pipeline/schemas/reviewer-output.schema.json`. They are persisted to `state.reviewIterations[<iteration>].reviewers[]` and consumed by Step 3 (Fable triage) - never by Phase 6 directly. The triage step (below) is the producer of the only review artifact Phase 6 reads, conforming to `pipeline/schemas/triage-output.schema.json`.
212
212
 
213
213
  **Subagent return format** - each reviewer returns JSON conforming to `pipeline/schemas/reviewer-output.schema.json`:
214
214
 
@@ -248,9 +248,9 @@ Exit 0 = valid. Exit 2 = contradiction (approved=true with blocking findings) -
248
248
 
249
249
  **Off by default reason:** mixed-verdict cases are ~8% of runs in practice; the extra ~$0.20-$0.50 per run isn't worth automating for users who'd rather let triage resolve it cleanly. Users with high-stakes tasks (security-critical, release branches) can flip the flag.
250
250
 
251
- #### Step 3 - Opus Triage (filter before acting)
251
+ #### Step 3 - Fable Triage (filter before acting)
252
252
 
253
- **CRITICAL**: Reviewer findings are **raw signals**, not commands. Never auto-loop on every "blocking" tag - reviewers hallucinate, misread scope, or repeat each other. Run Opus triage to evaluate merged findings against task scope.
253
+ **CRITICAL**: Reviewer findings are **raw signals**, not commands. Never auto-loop on every "blocking" tag - reviewers hallucinate, misread scope, or repeat each other. Run Fable triage (Opus on Copilot CLI) to evaluate merged findings against task scope.
254
254
 
255
255
  ##### 3.1 Short-circuit: no findings
256
256
 
@@ -258,7 +258,7 @@ If merged findings `length === 0`, **skip triage**: write empty result `{"accept
258
258
 
259
259
  ##### 3.2 Launch triage agent
260
260
 
261
- Launch **1 Agent** (subagent_type: `general-purpose`, model: `opus`) with:
261
+ Launch **1 Agent** (subagent_type: `general-purpose`, model: `fable` on Claude Code / `opus` on Copilot CLI) with:
262
262
 
263
263
  - Raw findings from Reviewer 1 + Reviewer 2 (merged JSON)
264
264
  - Task scope (Phase 1 analysis summary + Phase 2 plan)
@@ -307,11 +307,11 @@ Step 3 produces a single triage-output object conforming to `pipeline/schemas/tr
307
307
 
308
308
  Return ONLY valid JSON conforming to pipeline/schemas/triage-output.schema.json:
309
309
  {
310
- "accepted": [{ "severity": "blocking|important|suggestion", "file": "...", "line": N, "issue": "...", "fix": "...", "reviewer": "opus|sonnet" }],
310
+ "accepted": [{ "severity": "blocking|important|suggestion", "file": "...", "line": N, "issue": "...", "fix": "...", "reviewer": "fable|opus|sonnet|gpt" }],
311
311
  "deferred": [{ "finding": {...}, "reason": "..." }],
312
312
  "rejected": [{ "finding": {...}, "reason": "..." }],
313
313
  "approved": true|false, // true if no accepted blocking items remain
314
- "consensus": { "reviewerCount": N, "verdict": "unanimous-pass|unanimous-block|split|unverified", "disagreements": [{ "file": "...", "line": N, "issue": "...", "note": "Opus blocking, Sonnet approved" }] } // optional, see 3.6
314
+ "consensus": { "reviewerCount": N, "verdict": "unanimous-pass|unanimous-block|split|unverified", "disagreements": [{ "file": "...", "line": N, "issue": "...", "note": "Fable blocking, Sonnet approved" }] } // optional, see 3.6
315
315
  }
316
316
  ```
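The rule that `approved` is true only when no accepted blocking items remain can be checked mechanically. The following is a minimal sketch of such a contradiction check, not the actual `validate-triage.mjs`; `findContradiction` is a hypothetical name and only the `accepted`, `severity`, and `approved` fields shown above are assumed.

```javascript
// Hypothetical sketch of the approved-vs-blocking consistency rule; not validate-triage.mjs.
// Returns a message when approved=true coexists with accepted blocking findings, else null.
function findContradiction(triage) {
  const blocking = (triage.accepted ?? []).filter((f) => f.severity === "blocking");
  if (triage.approved === true && blocking.length > 0) {
    return `approved=true with ${blocking.length} accepted blocking finding(s)`;
  }
  return null;
}

const sample = {
  accepted: [{ severity: "blocking", file: "a.swift", line: 3, issue: "x", fix: "y", reviewer: "fable" }],
  deferred: [],
  rejected: [],
  approved: true,
};
console.log(findContradiction(sample)); // approved=true with 1 accepted blocking finding(s)
```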
317
317
 
@@ -352,12 +352,12 @@ Failure fallback (timeout >120s, or agent crash before any JSON is produced): re
352
352
  Emit metrics per review pass for Phase 7 cost rollup:
353
353
 
354
354
  ```bash
355
- LOG_METRIC_FORWARD_TO_TRACKER=1 pipeline/scripts/log-metric.sh "$TASK_ID" 4 review.reviewer_call model=opus duration_ms=$OPUS_DURATION tokens_in=$OPUS_IN tokens_out=$OPUS_OUT
355
+ LOG_METRIC_FORWARD_TO_TRACKER=1 pipeline/scripts/log-metric.sh "$TASK_ID" 4 review.reviewer_call model=fable duration_ms=$R1_DURATION tokens_in=$R1_IN tokens_out=$R1_OUT # model=opus on Copilot CLI
356
356
  # GPT-5.4 metric emitted only on Copilot CLI (skip on Claude Code):
357
357
  [ "${CLI_HOST:-claude}" = "copilot" ] && \
358
358
  LOG_METRIC_FORWARD_TO_TRACKER=1 pipeline/scripts/log-metric.sh "$TASK_ID" 4 review.reviewer_call model=gpt-5.4 duration_ms=$GPT_DURATION tokens_in=$GPT_IN tokens_out=$GPT_OUT
359
359
  LOG_METRIC_FORWARD_TO_TRACKER=1 pipeline/scripts/log-metric.sh "$TASK_ID" 4 review.reviewer_call model=sonnet duration_ms=$SONNET_DURATION tokens_in=$SONNET_IN tokens_out=$SONNET_OUT
360
- LOG_METRIC_FORWARD_TO_TRACKER=1 pipeline/scripts/log-metric.sh "$TASK_ID" 4 review.triage_call model=opus duration_ms=$TRIAGE_DURATION tokens_in=$TRIAGE_IN tokens_out=$TRIAGE_OUT
360
+ LOG_METRIC_FORWARD_TO_TRACKER=1 pipeline/scripts/log-metric.sh "$TASK_ID" 4 review.triage_call model=fable duration_ms=$TRIAGE_DURATION tokens_in=$TRIAGE_IN tokens_out=$TRIAGE_OUT
361
361
  pipeline/scripts/log-metric.sh "$TASK_ID" 4 review.completed raw_count=$RAW accepted=$ACC deferred=$DEF rejected=$REJ approved=$APPROVED duration_ms=$DURATION
362
362
  ```
363
363
 
@@ -365,18 +365,18 @@ pipeline/scripts/log-metric.sh "$TASK_ID" 4 review.completed raw_count=$RAW acce
365
365
 
366
366
  ##### 3.5 Optional cross-check (single-point-of-failure mitigation)
367
367
 
368
- Opt-in via `prefs.global.triageCrossCheck.enabled` (default `false`). Sampled runs dispatch a **Sonnet** triage agent as second opinion, validated via `validate-triage.mjs` (same fallback rules). Disagreements logged as `triage.cross_check_diff`; `blockOnDisagreement` pauses for user (autopilot: proceed with Opus verdict). Doubles triage cost on sampled runs.
368
+ Opt-in via `prefs.global.triageCrossCheck.enabled` (default `false`). Sampled runs dispatch a **Sonnet** triage agent as second opinion, validated via `validate-triage.mjs` (same fallback rules). Disagreements logged as `triage.cross_check_diff`; `blockOnDisagreement` pauses for user (autopilot: proceed with the Fable verdict). Doubles triage cost on sampled runs.
369
369
 
370
370
  ##### 3.6 Consensus surfacing (anti-correlation)
371
371
 
372
- **Rationale:** Reviewer 1 (Opus) and Reviewer 3 (Sonnet) share a base model family, so unanimous agreement on a *judgment call* is not independent confirmation - same-family models drift the same way on ambiguous prompts. Treating "both approved" as proof produces false-consensus passes. Triage therefore records a `consensus` block (schema v3.1.0) and surfaces disagreement and unverified agreement to the user rather than burying it.
372
+ **Rationale:** Reviewer 1 (Fable) and Reviewer 3 (Sonnet) are both Anthropic Claude models, so unanimous agreement on a *judgment call* is not independent confirmation - same-family models drift the same way on ambiguous prompts. Treating "both approved" as proof produces false-consensus passes. Triage therefore records a `consensus` block (schema v3.1.0) and surfaces disagreement and unverified agreement to the user rather than burying it.
373
373
 
374
374
  After the triage verdict is computed, populate `triage.consensus`:
375
375
 
376
376
  1. `reviewerCount` = number of reviewers dispatched this iteration (`2` on Claude Code, `3` on Copilot CLI).
377
377
  2. Classify the iteration `verdict`:
378
378
  - `unanimous-block` -> all reviewers returned at least one overlapping `blocking` finding.
379
- - `split` -> reviewers disagreed on existence or severity of one or more findings (the Step 2.5 disagreement definition). List each split in `disagreements[]` with a `note` naming who held which position (e.g. "Opus blocking, Sonnet approved").
379
+ - `split` -> reviewers disagreed on existence or severity of one or more findings (the Step 2.5 disagreement definition). List each split in `disagreements[]` with a `note` naming who held which position (e.g. "Fable blocking, Sonnet approved").
380
380
  - `unanimous-pass` -> all reviewers approved AND the diff is low-risk (no security/auth/concurrency surface per Phase 1 `touchedAreas`). Clear-cut; trust it.
381
381
  - `unverified` -> all reviewers approved BUT the diff touches a judgment-heavy surface (security, auth, concurrency, money, data migration). Agreement here may be correlated; do NOT treat it as a confirmed pass. Surface it.
382
382
  3. `disagreements[]` is populated for `split` and is also used to carry `unverified` notes (e.g. "both approved a keychain change - agreement unverified, confirm manually").
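The verdict classification above can be approximated in a few lines. This sketch makes several assumptions that the diff does not confirm: each reviewer output is assumed to expose `approved` and `findings[]`, "overlapping" is taken to mean the same `file:line`, `touchedAreas` is assumed to be an array of strings such as `"auth"`, and any non-unanimous result is treated as `split` because the Step 2.5 disagreement definition is not shown here.

```javascript
// Hypothetical sketch; field names, area labels, and the overlap rule are assumptions.
const JUDGMENT_HEAVY = new Set(["security", "auth", "concurrency", "money", "data-migration"]);

// Expects at least one reviewer output.
function classifyVerdict(reviewers, touchedAreas) {
  const allApproved = reviewers.every((r) => r.approved);
  if (allApproved) {
    // Agreement on a judgment-heavy surface may be correlated, so it stays unverified.
    const risky = touchedAreas.some((area) => JUDGMENT_HEAVY.has(area));
    return risky ? "unverified" : "unanimous-pass";
  }
  // Collect each reviewer's blocking findings keyed by file:line.
  const blockingKeys = reviewers.map(
    (r) =>
      new Set(
        r.findings
          .filter((f) => f.severity === "blocking")
          .map((f) => `${f.file}:${f.line}`)
      )
  );
  // unanimous-block needs at least one blocking location shared by every reviewer.
  const shared = [...blockingKeys[0]].some((key) => blockingKeys.every((s) => s.has(key)));
  return shared ? "unanimous-block" : "split";
}
```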
@@ -430,7 +430,7 @@ for proj in $(jq -r '.projects[] | "\(.name)\t\(.worktreePath)\t\(.baseBranch)"'
430
430
  done
431
431
  ```
432
432
 
433
- Same 3 reviewers (Opus / GPT-5.4 / Sonnet) receive `COMBINED_DIFF` with a multi-repo prefix in the system prompt:
433
+ Same reviewer set (Fable-or-Opus / GPT-5.4 / Sonnet) receives `COMBINED_DIFF` with a multi-repo prefix in the system prompt:
434
434
 
435
435
  ```
436
436
  This is a multi-repo task spanning {N} repos: {repo names}.
@@ -128,7 +128,7 @@ Every phase that dispatches a billable LLM agent MUST forward the call's token t
128
128
 
129
129
  ```bash
130
130
  LOG_METRIC_FORWARD_TO_TRACKER=1 pipeline/scripts/log-metric.sh "$TASK_ID" <phase-id> <event> \
131
- model=<opus|sonnet|haiku|gpt-5.4> \
131
+ model=<fable|opus|sonnet|haiku|gpt-5.4> \
132
132
  tokens_in=$IN tokens_out=$OUT duration_ms=$DUR
133
133
  ```
134
134
 
@@ -24,8 +24,7 @@ The agent detects which CLI it's running in and uses the appropriate visual mech
24
24
  ```
25
25
  1. system prompt mentions "Claude Code" → claude-code
26
26
  2. system prompt mentions "Copilot" / "GitHub Copilot" → copilot
27
- 3. system prompt mentions "Cursor" → cursor
28
- 5. None of the above → generic (bash stdout)
27
+ 3. None of the above → generic (bash stdout)
29
28
  ```
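The post-change detection order (Claude Code, then Copilot, then generic) could be expressed as a first-match check on the system prompt. This is only a sketch of the rules listed above: `detectHost` is a hypothetical function, and a plain substring match is an assumption about how "mentions" is evaluated.

```javascript
// Hypothetical sketch of the three detection rules; substring matching is an assumption.
function detectHost(systemPrompt) {
  if (systemPrompt.includes("Claude Code")) return "claude-code";
  // "Copilot" also matches "GitHub Copilot".
  if (systemPrompt.includes("Copilot")) return "copilot";
  return "generic"; // bash stdout
}

console.log(detectHost("You are GitHub Copilot CLI.")); // copilot
```

Because the checks run in order, a prompt that mentions both products resolves to `claude-code` in this sketch.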
30
29
 
31
30
  Visual mechanism per CLI:
@@ -1,5 +1,5 @@
1
1
  ---
2
- description: "Run parallel review on a branch's diff or a Pull Request: 2 models on Claude Code (Opus + Sonnet), 3 models on Copilot CLI (GPT + Opus + Sonnet). On PR input, posts per-finding inline comments and sets approve/needs-work review state."
2
+ description: "Run parallel review on a branch's diff or a Pull Request: 2 models on Claude Code (Fable + Sonnet), 3 models on Copilot CLI (GPT + Opus + Sonnet). On PR input, posts per-finding inline comments and sets approve/needs-work review state."
3
3
  argument-hint: "[#N | repo#N | PR-URL | branch] - optional: PR by number/URL, repo+number, or local branch. Supports GitHub and Bitbucket Server URLs. If omitted, the current branch is used."
4
4
  ---
5
5
 
@@ -112,13 +112,13 @@ Save the diff to `/tmp/multi-agent-review-${TASK_ID}-diff.patch` so reviewers ca
112
112
  ### 3. Launch parallel reviewers - host-CLI dependent
113
113
 
114
114
  **Claude Code (2 in parallel):**
115
- - Agent 1: `claude-opus-4.6` → security + architecture
116
- - Agent 2: `claude-sonnet-4.6` → general quality
115
+ - Agent 1: `claude-fable-5` → security + architecture
116
+ - Agent 2: `claude-sonnet-4-6` → general quality
117
117
 
118
118
  **Copilot CLI (3 in parallel):**
119
- - Agent 1: `claude-opus-4.6` → security + architecture
119
+ - Agent 1: `claude-opus-4-8` → security + architecture (Fable 5 is not offered on Copilot CLI)
120
120
  - Agent 2: `gpt-5.4` → edge cases, alternate perspective
121
- - Agent 3: `claude-sonnet-4.6` → general quality
121
+ - Agent 3: `claude-sonnet-4-6` → general quality
122
122
 
123
123
  Each reviewer receives the diff plus the standard reviewer system prompt (see `refs/phases/phase-4-review.md` for the prompt contract). Output: structured `findings[]` per reviewer.
124
124
 
@@ -137,7 +137,7 @@ Each finding gets the `ruleID` from the catalog plus the platform policy ref:
137
137
 
138
138
  Catalog-only - does NOT invoke binaries. For a full scan, use `/multi-agent:test "store-ready"`.
139
139
 
140
- ### 5. Triage (Opus)
140
+ ### 5. Triage (Fable)
141
141
 
142
142
  Classify findings into:
143
143
  - 🔴 **Blocking** → must fix
@@ -152,10 +152,10 @@ Triage also marks each finding as `accepted` (real issue), `deferred` (real but
152
152
  🔍 Review Complete · PR #1250 · 3 files +120 -45
153
153
  | Model | Verdict | Blocking | Important | Suggestion |
154
154
  |----------|-----------|----------|-----------|------------|
155
- | Opus | approved | 0 | 1 | 3 |
155
+ | Fable | approved | 0 | 1 | 3 |
156
156
  | Sonnet | rejected | 1 | 2 | 5 |
157
157
 
158
- Consensus: ⚠ DISAGREEMENT - see Opus triage
158
+ Consensus: ⚠ DISAGREEMENT - see Fable triage
159
159
  ```
160
160
 
161
161
  This summary ALWAYS prints, regardless of input mode. The chat is the live conversation; on the PR side, the durable artifacts are inline comments + the review state (Step 7).
@@ -58,7 +58,7 @@ Run every step automatically:
58
58
  Step 0: FIGMA_SYNC SKIP (deprecated - feedback_figma_source_deprecated)
59
59
  Step 1: PLATFORM Detect macOS / Linux / Windows (Git Bash / WSL); export PLATFORM env
60
60
  Step 1.5: DETECT Compare timestamps, find stale targets
61
- Step 2: COPILOT Claude Code -> Copilot CLI (instructions + 34 sub-command skills)
61
+ Step 2: COPILOT Claude Code -> Copilot CLI (instructions + 35 sub-command skills)
62
62
  Step 3: REPO Claude Code -> pipeline repo (genericized, personal data scrub, bash -n on all sh)
63
63
  Step 3c: PLUGINS pipeline shared/external -> multi-agent-plugins marketplace (rebuild knowledge/,
64
64
  bump changed plugins' patch version, commit + push the plugins repo)
@@ -277,11 +277,11 @@ This runs on the Claude <-> Copilot axis — the two CLIs the pipeline supports
277
277
  |-------------|-------------|
278
278
  | `~/.claude/commands/multi-agent/{cmd}.md` | `~/.copilot/skills/multi-agent-{cmd}/SKILL.md` |
279
279
 
280
- **34 commands are synced** (canonical inventory - must match `cross-cli-contract.md` section 1; drift = contract violation):
280
+ **35 commands are synced** (canonical inventory - must match `cross-cli-contract.md` section 1; drift = contract violation):
281
281
 
282
282
  ```
283
283
  analysis, analysis-resolve, autopilot, build-optimize, channels, delete, dev,
284
- dev-autopilot, dev-local, dev-local-autopilot, diff-explain, garbage-collect,
284
+ dev-autopilot, dev-local, dev-local-autopilot, diff-explain, finish, garbage-collect,
285
285
  help, issue, jira, kill, language, local, local-autopilot, log, manual-test,
286
286
  prune-logs, purge, refactor, resume, review, scan, search, setup, stack, status,
287
287
  sync, test, update
@@ -1,5 +1,5 @@
1
1
  ---
2
- description: "Task orchestrator - full pipeline via Jira ID + branch or GitHub Issue URL: analysis, plan, TDD development, parallel review + Opus triage (CLI-aware: 2-model on Claude Code, 3-model on Copilot CLI), commit, log"
2
+ description: "Task orchestrator - full pipeline via Jira ID + branch or GitHub Issue URL: analysis, plan, TDD development, parallel review + Fable triage (CLI-aware: 2-model on Claude Code, 3-model on Copilot CLI), commit, log"
3
3
  allowed-tools: Agent, Bash, Read, Write, Edit, Glob, Grep, TaskCreate, TaskUpdate, TaskList, TaskGet, AskUserQuestion, WebFetch, WebSearch, NotebookEdit, Skill
4
4
  ---
5
5
 
@@ -140,14 +140,14 @@ This command uses lazy loading for token efficiency. Read the relevant sub-file
140
140
  - Multiple stacks -> load all relevant guides
141
141
 
142
142
  **Agent definitions** (used in Phase 1 and Phase 4):
143
- - `$HOME/.claude/agents/code-reviewer.md` - Phase 4 reviewer persona (`preferredModel: opus`; Phase 4 overrides Reviewer 3 to `sonnet`)
143
+ - `$HOME/.claude/agents/code-reviewer.md` - Phase 4 reviewer persona (`preferredModel: fable`; Phase 4 overrides Reviewer 3 to `sonnet`)
144
144
  - `$HOME/.claude/agents/explorer.md` - Phase 1 codebase scan persona (`preferredModel: sonnet` - scan work, cost-efficient)
145
- - `$HOME/.claude/agents/ios-architect.md` - iOS architecture review (`preferredModel: opus`)
146
- - `$HOME/.claude/agents/android-architect.md` - Android architecture review (`preferredModel: opus`)
147
- - `$HOME/.claude/agents/backend-architect.md` - Backend/API architecture review (`preferredModel: opus`)
145
+ - `$HOME/.claude/agents/ios-architect.md` - iOS architecture review (`preferredModel: fable`)
146
+ - `$HOME/.claude/agents/android-architect.md` - Android architecture review (`preferredModel: fable`)
147
+ - `$HOME/.claude/agents/backend-architect.md` - Backend/API architecture review (`preferredModel: fable`)
148
148
  - `$HOME/.claude/agents/security-auditor.md` - Security audit (`preferredModel: opus`)
149
149
 
150
- **Per-persona model routing:** Before each Agent dispatch, the orchestrator reads `preferredModel` from the persona file and exports `CLAUDE_CODE_SUBAGENT_MODEL` (Claude Code) / passes `--model` (Copilot CLI). Precedence: per-dispatch `PHASE_MODEL_OVERRIDE` > persona `preferredModel` > `opus`. Full contract: `skills/shared/core/multi-agent/SKILL.md#agent-dispatch--per-persona-model-routing-v610`.
150
+ **Per-persona model routing:** Before each Agent dispatch, the orchestrator reads `preferredModel` from the persona file and exports `CLAUDE_CODE_SUBAGENT_MODEL` (Claude Code) / passes `--model` (Copilot CLI). Precedence: per-dispatch `PHASE_MODEL_OVERRIDE` > persona `preferredModel` > `fable` (falls back per `refs/features/model-fallback.md`). Full contract: `skills/shared/core/multi-agent/SKILL.md#agent-dispatch--per-persona-model-routing-v610`.
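The precedence rule above (per-dispatch `PHASE_MODEL_OVERRIDE` > persona `preferredModel` > `fable`) reduces to a first-non-empty lookup. The sketch below is illustrative: `resolveModel` and its parameter names are hypothetical, and treating an empty string as unset is an assumption.

```javascript
// Hypothetical sketch of the precedence rule; not the orchestrator's actual code.
// Empty strings are treated as unset, which is an assumption.
function resolveModel({ phaseModelOverride, preferredModel } = {}) {
  return phaseModelOverride || preferredModel || "fable";
}

console.log(resolveModel({ phaseModelOverride: "sonnet", preferredModel: "fable" })); // sonnet
console.log(resolveModel({ preferredModel: "opus" })); // opus
console.log(resolveModel()); // fable
```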
151
151
 
152
152
  ---
153
153
 
@@ -247,7 +247,7 @@ When called with `review`:
247
247
  1. Detect current branch and project from cwd (or ask)
248
248
  2. Get diff: `git diff HEAD` (unstaged + staged)
249
249
  3. If no diff, get diff against base branch: `git diff origin/{baseBranch}...HEAD`
250
- 4. Launch Phase 4 review (parallel + Opus triage - 2-model on Claude Code, 3-model on Copilot CLI) on the diff
250
+ 4. Launch Phase 4 review (parallel + Fable triage - 2-model on Claude Code, 3-model on Copilot CLI) on the diff
251
251
  5. No worktree, no state file - lightweight one-shot review
252
252
  6. Print findings to terminal
253
253
 
@@ -183,7 +183,7 @@
183
183
  "planEditRequests": {
184
184
  "type": "array",
185
185
  "items": { "type": "string" },
186
- "description": "v5.3.0 Phase 2 - free-text edit instructions the user typed between plan renders. Preserved verbatim for audit; Opus parses them conversationally to revise the plan."
186
+ "description": "v5.3.0 Phase 2 - free-text edit instructions the user typed between plan renders. Preserved verbatim for audit; the planning model (Fable top tier) parses them conversationally to revise the plan."
187
187
  }
188
188
  }
189
189
  }
@@ -831,9 +831,9 @@
831
831
  },
832
832
  "pricingModel": {
833
833
  "type": "string",
834
- "enum": ["opus", "sonnet", "haiku"],
835
- "default": "opus",
836
- "description": "Which cost-table.json rate to price accumulated tokens at. Defaults to opus for a deliberately conservative (upper-bound) estimate, so the ceiling trips early rather than late."
834
+ "enum": ["fable", "opus", "sonnet", "haiku"],
835
+ "default": "fable",
836
+ "description": "Which cost-table.json rate to price accumulated tokens at. Defaults to fable (the top tier since v10.6.0) for a deliberately conservative (upper-bound) estimate, so the ceiling trips early rather than late."
837
837
  }
838
838
  }
839
839
  },
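To see why pricing at `fable` gives the more conservative estimate, here is a small worked sketch using the per-million-token rates that appear in the `cost-table.json` hunk of this diff (fable 10.0 in / 50.0 out, opus 5.0 in / 25.0 out) and the `maxUsd: 5.0`, `warnPct: 80` defaults from `cost-budget-check.mjs`. The function names and the exact comparison operators are assumptions, not the script's real logic.

```javascript
// Hypothetical sketch; rates are copied from the diff, everything else is assumed.
const PRICES = {
  fable: { inPerMtok: 10.0, outPerMtok: 50.0 },
  opus: { inPerMtok: 5.0, outPerMtok: 25.0 },
};

// Prices accumulated tokens at the chosen model's rate; null for unknown models.
function estimateUsd(tokensIn, tokensOut, pricingModel = "fable") {
  const p = PRICES[pricingModel];
  if (!p) return null;
  return (tokensIn * p.inPerMtok + tokensOut * p.outPerMtok) / 1_000_000;
}

// Compares an estimate against the ceiling and the warning percentage.
function budgetStatus(usd, { maxUsd = 5.0, warnPct = 80 } = {}) {
  if (usd >= maxUsd) return "exceeded";
  if (usd >= (maxUsd * warnPct) / 100) return "warn";
  return "ok";
}

console.log(estimateUsd(200_000, 40_000, "fable")); // 4
console.log(estimateUsd(200_000, 40_000, "opus")); // 2
console.log(budgetStatus(estimateUsd(200_000, 40_000, "fable"))); // warn
```

With the same token counts, the fable rate reaches the 80% warning line while the opus rate stays well under it, which is the "trips early rather than late" behavior the schema description mentions.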
@@ -19,7 +19,7 @@
19
19
  },
20
20
  "reviewer": {
21
21
  "type": "string",
22
- "description": "Model label for this output (e.g. 'opus', 'sonnet', 'gpt'). Present once the parallel reviewer outputs are merged into the Phase 4 array so triage/consensus can attribute each finding to its source. Optional on a single reviewer's raw pre-merge output."
22
+ "description": "Model label for this output (e.g. 'fable', 'opus', 'sonnet', 'gpt'). Present once the parallel reviewer outputs are merged into the Phase 4 array so triage/consensus can attribute each finding to its source. Optional on a single reviewer's raw pre-merge output."
23
23
  }
24
24
  },
25
25
  "$defs": {
@@ -74,8 +74,8 @@
74
74
  },
75
75
  "reviewer": {
76
76
  "type": "string",
77
- "enum": ["opus", "sonnet"],
78
- "description": "Which reviewer produced the raw finding. Haiku was removed in v2.1.0."
77
+ "enum": ["fable", "opus", "sonnet", "gpt"],
78
+ "description": "Which reviewer produced the raw finding. Claude Code Reviewer 1 is fable (opus when fallback engages); Copilot CLI adds gpt. Haiku was removed in v2.1.0."
79
79
  },
80
80
  "consensus": {
81
81
  "type": "object",
@@ -64,12 +64,11 @@ Installed into `~/.claude/scripts/` and invoked by settings.json hook configurat
64
64
  - `pre-push-check.sh` - runs before `git push` (smoke-cross-cli-behavior + smoke-personal-data)
65
65
  - `output-quality-check.sh` - runs after PR body / Jira comment generation (newline / HTML entity guard)
66
66
 
67
- ## Runtime helpers (13 files)
67
+ ## Runtime helpers
68
68
  Shell scripts invoked during pipeline execution.
69
69
 
70
70
  - `phase-banner.sh` - renders phase headers
71
71
  - `phase-tracker.sh` - live tracker state + tokens accumulation + render
72
- - `stack-swap.sh` - stack detection + skill set swap
73
72
  - `keychain-save.sh` - store PAT in macOS Keychain
74
73
  - `audit-log.sh` + `audit-log-rotate.sh` - opt-in audit trail
75
74
  - `log-metric.sh` - opt-in metric capture
@@ -66,7 +66,7 @@ if (flags.help || flags.h) {
66
66
  }
67
67
 
68
68
  // --- resolve config: prefs first, CLI overrides -----------------------------
69
- const cfg = { enabled: false, maxUsd: 5.0, warnPct: 80, onExceed: "warn", pricingModel: "opus" };
69
+ const cfg = { enabled: false, maxUsd: 5.0, warnPct: 80, onExceed: "warn", pricingModel: "fable" };
70
70
 
71
71
  if (flags.prefs) {
72
72
  if (!existsSync(flags.prefs)) die(`prefs file not found: ${flags.prefs}`);
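Review note: the "prefs first, CLI overrides" order in the comment above can be sketched as a layered merge. The default object is taken from this hunk. Everything else is an assumption for illustration: the `resolveConfig` and `budgetStatus` helpers, the shape of the prefs and CLI inputs, and the reading of `warnPct` as a percentage of `maxUsd`.

```javascript
// Defaults as they appear in this hunk after the change.
const DEFAULTS = { enabled: false, maxUsd: 5.0, warnPct: 80, onExceed: "warn", pricingModel: "fable" };

// Hypothetical helper: later sources win. Undefined values are skipped so an
// absent CLI flag never erases a value that came from the prefs file.
function resolveConfig(defaults, prefs = {}, cli = {}) {
  const cfg = { ...defaults };
  for (const source of [prefs, cli]) {
    for (const [key, value] of Object.entries(source)) {
      if (value !== undefined && key in defaults) cfg[key] = value;
    }
  }
  return cfg;
}

// Assumed semantics: warn once spend reaches warnPct percent of maxUsd.
function budgetStatus(cfg, spentUsd) {
  if (!cfg.enabled) return "disabled";
  if (spentUsd >= cfg.maxUsd) return "exceeded";
  if (spentUsd * 100 >= cfg.maxUsd * cfg.warnPct) return "warn";
  return "ok";
}

const resolved = resolveConfig(
  DEFAULTS,
  { enabled: true, maxUsd: 10 }, // from a prefs file
  { pricingModel: "opus", maxUsd: undefined }, // from CLI flags
);
console.log(resolved.maxUsd, resolved.pricingModel); // 10 opus
console.log(budgetStatus(resolved, 8)); // warn
```

With no prefs and no flags the result is the default object, so after this change an unconfigured run prices at `fable`.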
@@ -2,6 +2,13 @@
2
2
  "_readme": "Per-model unit prices in USD per million tokens. Source: Anthropic public pricing (verified 2026-04-21). Update when Anthropic publishes new tiers. Unknown models render USD as ' - ' and emit a footnote - never block PR-body generation. cacheReadPerMtok is the discounted rate for prompt-cache hits (~10% of inPerMtok); the renderer prices a phase's tokens_cached at this rate when the tracker records it, so resume/cache reuse is visible in the ledger.",
3
3
  "schemaVersion": "1.1.0",
4
4
  "prices": {
5
+ "fable": {
6
+ "inPerMtok": 10.0,
7
+ "outPerMtok": 50.0,
8
+ "cacheReadPerMtok": 1.0,
9
+ "modelId": "claude-fable-5",
10
+ "note": "Top tier (restored v10.6.0) - architects, Reviewer 1, triage. Verified against Anthropic pricing 2026-07-02."
11
+ },
5
12
  "opus": {
6
13
  "inPerMtok": 5.0,
7
14
  "outPerMtok": 25.0,
@@ -1,16 +1,16 @@
1
1
  .claude/CLAUDE.md 1
2
2
  .claude/agents 8
3
- .claude/commands 87
3
+ .claude/commands 88
4
4
  .claude/lib 23
5
5
  .claude/multi-agent-preferences.json 1
6
6
  .claude/rules 12
7
7
  .claude/schemas 23
8
- .claude/scripts 174
8
+ .claude/scripts 167
9
9
  .claude/settings.json 1
10
- .claude/skills 555
10
+ .claude/skills 560
11
11
  .copilot/agents 8
12
12
  .copilot/copilot-instructions.md 1
13
13
  .copilot/lib 23
14
14
  .copilot/schemas 23
15
- .copilot/scripts 174
16
- .copilot/skills 590
15
+ .copilot/scripts 167
16
+ .copilot/skills 596
@@ -2,15 +2,17 @@
2
2
 
3
3
  /**
4
4
  * @file Token-preserving uninstaller - removes the multi-agent-pipeline
5
- * footprint from Claude Code, Copilot CLI, Cursor, Antigravity, and VS Code
6
- * Copilot Chat without touching personal access tokens stored in the OS credential store
7
- * (macOS Keychain / Windows Credential Manager / Linux libsecret).
5
+ * footprint from Claude Code and Copilot CLI without touching personal access
6
+ * tokens stored in the OS credential store (macOS Keychain / Windows
7
+ * Credential Manager / Linux libsecret). The legacy adapter flags
8
+ * (--cursor / --copilot-chat / --antigravity / --codex) clean up files left
9
+ * behind by pre-v10.7.0 adapter installs; the adapters themselves are gone.
8
10
  *
9
11
  * Invocation:
10
12
  * node uninstall.mjs # interactive, removes from all installed targets
11
13
  * node uninstall.mjs --yes # skip prompts, remove from all
12
14
  * node uninstall.mjs --claude # only Claude Code
13
- * node uninstall.mjs --cursor --target=/path/to/repo
15
+ * node uninstall.mjs --cursor --target=/path/to/repo # legacy adapter-file cleanup
14
16
  * node uninstall.mjs --dry-run # report what would be removed, change nothing
15
17
  *
16
18
  * Targeted by:
@@ -29,12 +31,9 @@
29
31
  */
30
32
 
31
33
  import { existsSync, readdirSync, readFileSync, rmSync, writeFileSync } from "fs";
32
- import { join, dirname } from "path";
33
- import { fileURLToPath } from "url";
34
+ import { join } from "path";
34
35
  import { createInterface } from "readline";
35
36
 
36
- const __dirname = dirname(fileURLToPath(import.meta.url));
37
- const PIPELINE_ROOT = join(__dirname, "..");
38
37
  const HOME = process.env.HOME || process.env.USERPROFILE;
39
38
 
40
39
  const flags = process.argv.slice(2).filter((a) => a !== "uninstall");
@@ -106,6 +105,23 @@ function rmMatchingDirs(parent, predicate) {
106
105
  return count;
107
106
  }
108
107
 
108
+ /**
109
+ * Remove every plain file under `parent` whose name matches a predicate.
110
+ * Used by the legacy adapter cleanup (pre-v10.7.0 generated files).
111
+ * @param {string} parent
112
+ * @param {(name: string) => boolean} predicate
113
+ */
114
+ function rmMatchingFiles(parent, predicate) {
115
+ if (!existsSync(parent)) return 0;
116
+ let count = 0;
117
+ for (const entry of readdirSync(parent, { withFileTypes: true })) {
118
+ if (!entry.isFile()) continue;
119
+ if (!predicate(entry.name)) continue;
120
+ if (rmIfExists(join(parent, entry.name))) count++;
121
+ }
122
+ return count;
123
+ }
124
+
109
125
  /**
110
126
  * Strip a marker-wrapped `<!-- multi-agent-pipeline:begin/end -->` block from
111
127
  * a user-owned file. Preserves everything outside the markers. Deletes the
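Review note: the marker-stripping behaviour described in the truncated comment above can be sketched on plain text. This is not the package's `stripManagedBlock` (which takes a file path and whose body is not shown in this diff). The exact marker strings and the `stripManagedText` helper are assumptions, and the sketch only covers the "preserve everything outside the markers" part.

```javascript
// Assumed marker spelling, derived from the `multi-agent-pipeline:begin/end` shorthand in the comment.
const BEGIN = "<!-- multi-agent-pipeline:begin -->";
const END = "<!-- multi-agent-pipeline:end -->";

// Hypothetical helper: remove the first managed block, keep all user-owned text.
function stripManagedText(text) {
  const start = text.indexOf(BEGIN);
  if (start === -1) return text; // nothing managed in this file
  const end = text.indexOf(END, start + BEGIN.length);
  if (end === -1) return text; // unbalanced markers: leave the file untouched
  const before = text.slice(0, start);
  // Drop the single line break that followed the end marker, if any.
  const after = text.slice(end + END.length).replace(/^\r?\n/, "");
  return before + after;
}

const sample = `# My notes\n${BEGIN}\nmanaged rules\n${END}\n## Still mine\n`;
console.log(JSON.stringify(stripManagedText(sample)));
```

Returning the input unchanged when the end marker is missing is a deliberately cautious choice for a user-owned file: a half-written block is left for manual inspection instead of deleting text up to the end of the file.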
@@ -287,68 +303,48 @@ async function main() {
287
303
  stripManagedBlock(join(COP, "copilot-instructions.md"));
288
304
  }
289
305
 
306
+ // Legacy adapter-file cleanup (adapters removed in v10.7.0). These blocks
307
+ // delete files a pre-v10.7.0 install generated; they never touch user files
308
+ // outside the multi-agent-* namespace / managed markers.
290
309
  if (forCursor) {
291
310
  console.log("");
292
- console.log(` [Cursor] Removing from ${adapterTarget}...`);
293
- if (!dryRun) {
294
- try {
295
- const adapter = (await import(join(PIPELINE_ROOT, "adapters", "cursor.mjs"))).default;
296
- const result = adapter.uninstall({ target: adapterTarget });
297
- console.log(` removed: ${result.removed} file(s)`);
298
- } catch (e) {
299
- console.log(` skipped (adapter unavailable): ${e.message}`);
300
- }
301
- } else {
302
- report("would invoke", "cursor adapter uninstall");
303
- }
311
+ console.log(` [Cursor - legacy cleanup] Removing from ${adapterTarget}...`);
312
+ const isOurs = (name) => name.startsWith("multi-agent-") || name.startsWith("multi-agent.");
313
+ const n = rmMatchingFiles(join(adapterTarget, ".cursor", "rules"), isOurs);
314
+ if (n > 0) console.log(` removed ${n} rule file(s)`);
315
+ stripManagedBlock(join(adapterTarget, ".cursorrules"));
304
316
  }
305
317
 
306
318
  if (forCopilotChat) {
307
319
  console.log("");
308
- console.log(` [GitHub Copilot Chat] Removing from ${adapterTarget}...`);
309
- if (!dryRun) {
310
- try {
311
- const adapter = (await import(join(PIPELINE_ROOT, "adapters", "copilot-chat.mjs"))).default;
312
- const result = adapter.uninstall({ target: adapterTarget });
313
- console.log(` main: ${result.mainStatus} · per-skill removed: ${result.perSkillRemoved}`);
314
- } catch (e) {
315
- console.log(` skipped (adapter unavailable): ${e.message}`);
316
- }
317
- } else {
318
- report("would invoke", "copilot-chat adapter uninstall");
319
- }
320
+ console.log(` [GitHub Copilot Chat - legacy cleanup] Removing from ${adapterTarget}...`);
321
+ stripManagedBlock(join(adapterTarget, ".github", "copilot-instructions.md"));
322
+ const n = rmMatchingFiles(join(adapterTarget, ".github", "instructions"), (name) =>
323
+ name.startsWith("multi-agent-"),
324
+ );
325
+ if (n > 0) console.log(` removed ${n} instruction file(s)`);
320
326
  }
321
327
 
322
328
  if (forAntigravity) {
323
329
  console.log("");
324
- console.log(` [Antigravity] Removing from ${adapterTarget}...`);
325
- if (!dryRun) {
326
- try {
327
- const adapter = (await import(join(PIPELINE_ROOT, "adapters", "antigravity.mjs"))).default;
328
- const result = adapter.uninstall({ target: adapterTarget });
329
- console.log(` .agent + AGENTS.md artifacts removed: ${result.removed}`);
330
- } catch (e) {
331
- console.log(` skipped (adapter unavailable): ${e.message}`);
332
- }
333
- } else {
334
- report("would invoke", "antigravity adapter uninstall");
335
- }
330
+ console.log(` [Antigravity - legacy cleanup] Removing from ${adapterTarget}...`);
331
+ const isOurs = (name) => name.startsWith("multi-agent-") || name.startsWith("multi-agent.");
332
+ let n = rmMatchingFiles(join(adapterTarget, ".agent", "rules"), isOurs);
333
+ n += rmMatchingFiles(join(adapterTarget, ".agent", "workflows"), isOurs);
334
+ if (n > 0) console.log(` removed ${n} .agent file(s)`);
335
+ stripManagedBlock(join(adapterTarget, "AGENTS.md"));
336
+ if (existsSync(join(adapterTarget, ".agent", "mcp_config.json")))
337
+ console.log(" note: .agent/mcp_config.json left untouched (may hold user servers) - remove pipeline entries manually if present");
336
338
  }
337
339
 
338
- if (forCodex) {
340
+ if (forCodex && HOME) {
339
341
  console.log("");
340
- console.log(" [OpenAI Codex CLI] Removing from ~/.codex...");
341
- if (!dryRun) {
342
- try {
343
- const adapter = (await import(join(PIPELINE_ROOT, "adapters", "codex.mjs"))).default;
344
- const result = adapter.uninstall();
345
- console.log(` ~/.codex artifacts removed: ${result.removed}`);
346
- } catch (e) {
347
- console.log(` skipped (adapter unavailable): ${e.message}`);
348
- }
349
- } else {
350
- report("would invoke", "codex adapter uninstall");
351
- }
342
+ console.log(" [OpenAI Codex CLI - legacy cleanup] Removing from ~/.codex...");
343
+ const CODEX = join(HOME, ".codex");
344
+ rmIfExists(join(CODEX, "prompts", "multi-agent.md"));
345
+ stripManagedBlock(join(CODEX, "AGENTS.md"));
346
+ if (existsSync(join(CODEX, "config.toml")))
347
+ console.log(" note: ~/.codex/config.toml left untouched (may hold user MCP servers) - remove pipeline entries manually if present");
352
348
  }
353
349
 
354
350
  console.log("");
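Review note: the legacy cleanup blocks above rely on a name-prefix predicate to stay inside the `multi-agent-*` namespace. The sketch below applies that same predicate to an in-memory list of names, without touching the filesystem. The file names are hypothetical examples, not files the adapters are known to have generated.

```javascript
// Predicate copied from the Cursor and Antigravity cleanup blocks in this diff.
const isOurs = (name) => name.startsWith("multi-agent-") || name.startsWith("multi-agent.");

// Hypothetical directory listing mixing pipeline files and user files.
const names = ["multi-agent-swift.mdc", "multi-agent.mdc", "multi-agents-notes.md", "team-style.mdc"];

const toRemove = names.filter(isOurs);
const kept = names.filter((name) => !isOurs(name));

console.log(toRemove); // the two names in the pipeline namespace
console.log(kept); // everything else, including the near-miss "multi-agents-notes.md"
```

Requiring the trailing `-` or `.` after `multi-agent` is what keeps a similarly named user file such as `multi-agents-notes.md` out of the removal set.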
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: multi-agent
3
3
  language: en
4
- description: "Task orchestrator: runs the full pipeline from a Jira ID or GitHub Issue URL - analysis → plan → TDD development → parallel review (Opus + Sonnet on Claude Code, GPT + Opus + Sonnet on Copilot CLI) → commit → log. Every step is written to agent-log.md."
4
+ description: "Task orchestrator: runs the full pipeline from a Jira ID or GitHub Issue URL - analysis → plan → TDD development → parallel review (Fable + Sonnet on Claude Code, GPT + Opus + Sonnet on Copilot CLI) → commit → log. Every step is written to agent-log.md."
5
5
  user-invocable: true
6
6
  argument-hint: '"PROJ-12345" "feature/PROJ-12345-flight-filter" | "https://github.com/.../issues/316" | status | log #1 | resume #1 | kill #1 | clear-logs | purge | review'
7
7
  ---
@@ -397,7 +397,7 @@ Full contract: `refs/tracker-contract.md` section "TaskCreate ordering (strict)"
397
397
 
398
398
  **IMPORTANT**: Update `agent-state.json` at EVERY phase transition. This file is the resume source of truth.
399
399
 
400
- ### Phase 1: Analysis (claude-opus-4.6)
400
+ ### Phase 1: Analysis (claude-fable-5)
401
401
  1. Launch **explore agents** (parallel) to scan codebase:
402
402
  - Related files to the task
403
403
  - Existing patterns and conventions
@@ -406,17 +406,17 @@ Full contract: `refs/tracker-contract.md` section "TaskCreate ordering (strict)"
406
406
  3. Summarize findings
407
407
  4. Log: `📊 Phase 1: Analysis - {N} files identified, {summary}`
408
408
 
409
- ### Phase 2: Planning (claude-opus-4.6)
409
+ ### Phase 2: Planning (claude-fable-5)
410
410
  1. Create task breakdown → todos with dependencies
411
411
  2. Launch **ios-architect** agent for architecture review (if structural changes)
412
412
  3. Determine development approach per todo (new file, modify, refactor)
413
413
  4. Log: `🧠 Phase 2: Plan - {N} todos created`
414
414
  5. **Plan Approval Gate** (normal mode only - skipped for `--dev`, `autopilot`, `--dev autopilot`). Full flow in `refs/phases/phase-2-planning.md` Step 5:
415
415
  - **5a - Clarification** (conditional, max 2 rounds): if the Jira/issue description is ambiguous (vague acceptance, UI task without Figma, API task without endpoint contract, `ambiguityScore >= 2`, parent-story scope drift), ask structured questions before rendering the plan. User answers → plan regenerated. If it is still unclear after the 2nd round, render the plan with a "best-effort" banner.
416
- - **5b - Approval loop**: render plan → user: `onayla`/`iptal`/free-text. A free-text edit request → Opus revises the plan → show it again. No iteration cap; user controls exit via `onayla` or `iptal`.
416
+ - **5b - Approval loop**: render plan → user: `onayla`/`iptal`/free-text. A free-text edit request → the planning model (Fable) revises the plan → show it again. No iteration cap; user controls exit via `onayla` or `iptal`.
417
417
  - Persist `clarificationRounds`, `clarificationQuestions`, `clarificationAnswers`, `planIterations`, `planApprovedAt`, `planEditRequests` to `state.phases["2"]`.
418
418
 
419
- ### Phase 3: Dev (claude-sonnet-4.6)
419
+ ### Phase 3: Dev (claude-sonnet-4-6)
420
420
  For each todo (respecting dependency order):
421
421
  1. Update todo status: `in_progress`
422
422
  2. **TDD cycle**:
@@ -431,9 +431,9 @@ For each todo (respecting dependency order):
431
431
  ### Phase 4: Review (parallel + triage)
432
432
  0. **Diff Risk Scoring (advisory, v8.3+)** - before reviewer dispatch run `node pipeline/scripts/diff-risk-score.mjs --base "$BASE_BRANCH" --top 5` and inject the top-N risk-ranked files as a `${PRIORITY_FILES}` block into each reviewer's prompt. Heuristic, deterministic, sub-second, never gates the pipeline. Disabled when `prefs.global.diffRiskAdvisory = false`. Signals: security paths (×3), schema migrations (×4), public API surfaces (×2), no-test-change (×2.5), complexity delta (×1.5), UI-critical paths (×1.5), loc changed (×1).
433
433
  1. Launch **code-reviewer** agents in parallel. Reviewer set depends on which CLI is hosting the pipeline:
434
- - **Claude Code** (2 reviewers): `claude-opus-4.6` (deep security + architecture) + `claude-sonnet-4.6` (quality + correctness)
435
- - **Copilot CLI** (3 reviewers): `gpt-5.4` (edge cases, different perspective) + `claude-opus-4.6` + `claude-sonnet-4.6`
436
- - Triage (both CLIs): single `claude-opus-4.6` pass over merged findings
434
+ - **Claude Code** (2 reviewers): `claude-fable-5` (deep security + architecture) + `claude-sonnet-4-6` (quality + correctness)
435
+ - **Copilot CLI** (3 reviewers): `gpt-5.4` (edge cases, different perspective) + `claude-opus-4-8` + `claude-sonnet-4-6` (Fable 5 is not offered on Copilot CLI)
436
+ - Triage: single top-tier pass over merged findings (`claude-fable-5` on Claude Code, `claude-opus-4-8` on Copilot CLI)
437
437
  2. Collect findings, classify:
438
438
  - 🔴 **Blocking** → must fix → back to Phase 3 (max 3 iterations)
439
439
  - 🟡 **Important** → fix and re-review
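Review note: the per-CLI reviewer and triage assignment in the Phase 4 bullets above can be summarised as a lookup table. The model IDs are the ones listed in the new lines of this hunk. The host keys (`"claude-code"`, `"copilot-cli"`) and the `reviewPlanFor` helper are hypothetical names for this sketch, since the pipeline's real dispatch code is not part of this diff.

```javascript
// Model IDs as listed in the Phase 4 bullets of this diff.
const REVIEW_PLAN = {
  "claude-code": {
    reviewers: ["claude-fable-5", "claude-sonnet-4-6"],
    triage: "claude-fable-5",
  },
  "copilot-cli": {
    // Fable 5 is not offered on Copilot CLI, so Opus fills the top-tier slot.
    reviewers: ["gpt-5.4", "claude-opus-4-8", "claude-sonnet-4-6"],
    triage: "claude-opus-4-8",
  },
};

// Hypothetical helper: look up the plan for the hosting CLI, failing loudly on typos.
function reviewPlanFor(host) {
  const plan = REVIEW_PLAN[host];
  if (!plan) throw new Error(`unknown host CLI: ${host}`);
  return plan;
}

console.log(reviewPlanFor("claude-code").reviewers.length); // 2
console.log(reviewPlanFor("copilot-cli").triage); // claude-opus-4-8
```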
@@ -872,11 +872,11 @@ First show how it works:
872
872
  ```
873
873
  🤖 How it works (8 phases):
874
874
  0. Init - Project detection, worktree creation, state file
875
- 1. Analysis - Codebase scan (parallel explore agents, Opus)
875
+ 1. Analysis - Codebase scan (parallel explore agents, Fable)
876
876
  2. Planning - Task breakdown, architecture review, user approval
877
877
  3. Dev - TDD loop: write test → write code → build (Sonnet)
878
- 4. Review - Parallel review + Opus triage. Reviewer set by CLI:
879
- • Claude Code → Opus + Sonnet (2 parallel)
878
+ 4. Review - Parallel review + Fable triage. Reviewer set by CLI:
879
+ • Claude Code → Fable + Sonnet (2 parallel)
880
880
  • Copilot CLI → GPT-5.4 + Opus + Sonnet (3 parallel)
881
881
  5. Test - Optional: switch to the branch, manual test in Xcode
882
882
  6. Commit - Commit + push + PR + issue body update (PR links + progress flags)
@@ -35,7 +35,7 @@ Phase 7: Report → Short terminal summary
35
35
 
36
36
  1. **Parse the input** - Normal multi-agent formats (Issue URL, Jira ID, free-text)
37
37
  2. **Phase 0: Init** - `"mode": "dev", "autopilot": true` in `agent-state.json`
38
- 3. **Phase 3: Dev** - Write code + build directly with `claude-opus-4.6`
38
+ 3. **Phase 3: Dev** - Write code + build directly with `claude-opus-4-8`
39
39
  4. **Phase 6: Commit** - Create automatic commit + push + PR
40
40
  5. **Phase 7: Report** - Terminal summary
41
41