@mmerterden/multi-agent-pipeline 10.7.3 → 10.7.4

Files changed (46)
  1. package/CHANGELOG.md +19 -2
  2. package/docs/adr/0001-three-model-triage.md +2 -2
  3. package/docs/adr/0007-multi-tool-adapter-framework.md +1 -1
  4. package/docs/adr/README.md +2 -2
  5. package/docs/architecture.md +14 -14
  6. package/docs/features.md +22 -21
  7. package/docs/performance.md +3 -3
  8. package/index.js +3 -7
  9. package/install/templates/copilot-instructions.md +2 -2
  10. package/package.json +2 -5
  11. package/pipeline/agents/dev-critic.md +1 -1
  12. package/pipeline/claude-md-template.md +1 -1
  13. package/pipeline/commands/multi-agent/dev-autopilot.md +1 -1
  14. package/pipeline/commands/multi-agent/finish.md +2 -2
  15. package/pipeline/commands/multi-agent/help.md +12 -12
  16. package/pipeline/commands/multi-agent/local.md +1 -1
  17. package/pipeline/commands/multi-agent/refs/features/dev-critic.md +1 -1
  18. package/pipeline/commands/multi-agent/refs/features/model-fallback.md +7 -3
  19. package/pipeline/commands/multi-agent/refs/knowledge.md +1 -1
  20. package/pipeline/commands/multi-agent/refs/phases/log-format.md +1 -1
  21. package/pipeline/commands/multi-agent/refs/phases/modes.md +1 -1
  22. package/pipeline/commands/multi-agent/refs/phases/phase-1-analysis.md +2 -2
  23. package/pipeline/commands/multi-agent/refs/phases/phase-2-planning.md +2 -2
  24. package/pipeline/commands/multi-agent/refs/phases/phase-3-dev.md +1 -1
  25. package/pipeline/commands/multi-agent/refs/phases/phase-4-review.md +18 -18
  26. package/pipeline/commands/multi-agent/refs/progress-contract.md +1 -1
  27. package/pipeline/commands/multi-agent/refs/tracker-contract.md +1 -2
  28. package/pipeline/commands/multi-agent/review.md +8 -8
  29. package/pipeline/commands/multi-agent/sync.md +3 -3
  30. package/pipeline/commands/multi-agent.md +7 -7
  31. package/pipeline/schemas/agent-state.schema.json +1 -1
  32. package/pipeline/schemas/prefs.schema.json +3 -3
  33. package/pipeline/schemas/reviewer-output.schema.json +1 -1
  34. package/pipeline/schemas/triage-output.schema.json +2 -2
  35. package/pipeline/scripts/README.md +1 -2
  36. package/pipeline/scripts/cost-budget-check.mjs +1 -1
  37. package/pipeline/scripts/cost-table.json +7 -0
  38. package/pipeline/scripts/fixtures/install-layout.tsv +5 -5
  39. package/pipeline/scripts/uninstall.mjs +53 -57
  40. package/pipeline/skills/shared/core/multi-agent/SKILL.md +11 -11
  41. package/pipeline/skills/shared/core/multi-agent-dev-autopilot/SKILL.md +1 -1
  42. package/pipeline/skills/shared/core/multi-agent-finish/SKILL.md +1 -1
  43. package/pipeline/skills/shared/core/multi-agent-help/SKILL.md +8 -8
  44. package/pipeline/skills/shared/core/multi-agent-review/SKILL.md +5 -5
  45. package/pipeline/skills/shared/core/multi-agent-sync/SKILL.md +7 -5
  46. package/pipeline/scripts/smoke-readme-counts.sh +0 -120
@@ -13,7 +13,7 @@ You already wrote (and maybe hand-tested) the change on the current branch - o
 
  ```
  Phase 0: Init → project/branch detect, resolve base + diff (work already done), Jira id, state (NO worktree)
- Phase 4: Review → deterministic gates + parallel review + Opus triage
+ Phase 4: Review → deterministic gates + parallel review + Fable triage
  Phase 5: Build+Test → stack-aware build + run existing tests; SUCCESS required (automated gate, not the interactive user-test)
  Phase 6: Commit → commit remaining changes + push + open PR if none exists
  Phase 7: Report → technical analysis + Jira comment with test scenarios (channels: Jira / PR / Confluence / Wiki)
@@ -59,11 +59,11 @@ How It Works (Phase 0 - Interactive Flow):
  Pipeline (after Phase 0):
 
  Phase 0: Init -> The 8 steps above
- Phase 1: Analysis -> Stack detection + codebase scan (Opus)
+ Phase 1: Analysis -> Stack detection + codebase scan (Fable)
  Phase 2: Planning -> Task breakdown + architecture review + Plan Approval Gate
  Phase 3: Dev -> TDD: test -> code -> build (Sonnet) + build queue
- Phase 4: Review -> Deterministic gates + parallel AI review + Opus triage
- (Claude Code: Opus + Sonnet · Copilot CLI: GPT-5.4 + Opus + Sonnet)
+ Phase 4: Review -> Deterministic gates + parallel AI review + Fable triage
+ (Claude Code: Fable + Sonnet · Copilot CLI: GPT-5.4 + Opus + Sonnet)
  Phase 5: Test -> Optional: switch to branch, test in Xcode
  Phase 6: Commit -> Commit -> push -> PR + issue body update (never auto-closes)
  Phase 7: Report -> Channels dispatcher (PR · Jira · Confluence · Wiki, multi-select)
@@ -75,7 +75,7 @@ Pipeline (after Phase 0):
 
  Modes:
 
- (normal) Full 8 phases, parallel review + Opus triage
+ (normal) Full 8 phases, parallel review + Fable triage
  --dev Fast: Init -> Dev(Opus) -> Commit -> Report (no plan gate)
  --local No worktree - works directly on local branch
  autopilot Skip all confirmations (EXCEPT Phase 7 channels menu)
@@ -194,11 +194,11 @@ Nasıl Çalışır (Phase 0 - İnteraktif Akış):
  Pipeline (Phase 0'dan sonra):
 
  Phase 0: Init -> Yukarıdaki 8 adım
- Phase 1: Analysis -> Stack tespiti + codebase taraması (Opus)
+ Phase 1: Analysis -> Stack tespiti + codebase taraması (Fable)
  Phase 2: Planning -> Task kırılımı + mimari inceleme + Plan Onay Kapısı
  Phase 3: Dev -> TDD: test -> kod -> build (Sonnet) + build queue
- Phase 4: Review -> Deterministik kapılar + paralel AI review + Opus triage
- (Claude Code: Opus + Sonnet · Copilot CLI: GPT-5.4 + Opus + Sonnet)
+ Phase 4: Review -> Deterministik kapılar + paralel AI review + Fable triage
+ (Claude Code: Fable + Sonnet · Copilot CLI: GPT-5.4 + Opus + Sonnet)
  Phase 5: Test -> Opsiyonel: branch'e geç, Xcode'da test
  Phase 6: Commit -> Commit -> push -> PR + issue body güncelleme (hiç auto-close yok)
  Phase 7: Report -> Channels dispatcher (PR · Jira · Confluence · Wiki, multi-select)
@@ -210,7 +210,7 @@ Pipeline (Phase 0'dan sonra):
 
  Modlar:
 
- (normal) Tam 8 faz, paralel review + Opus triage
+ (normal) Tam 8 faz, paralel review + Fable triage
  --dev Hızlı: Init -> Dev(Opus) -> Commit -> Report (plan gate yok)
  --local Worktree yok - doğrudan local branch'te çalışır
  autopilot Tüm onayları atla (İSTİSNA: Phase 7 channels menüsü)
@@ -1,7 +1,7 @@
  ---
  name: multi-agent-review
  language: en
- description: "Run parallel review on the current branch's diff: 2 models on Claude Code (Opus + Sonnet), 3 models on Copilot CLI (GPT + Opus + Sonnet). Review-only slice of the pipeline."
+ description: "Run parallel review on the current branch's diff: 2 models on Claude Code (Fable + Sonnet), 3 models on Copilot CLI (GPT + Opus + Sonnet). Review-only slice of the pipeline."
  user-invocable: true
  argument-hint: "[branch] - optional: branch to review. If omitted, the current branch is used."
  ---
@@ -27,13 +27,13 @@ Skip Phase 0-3 and review the current diff only.
  3. **Start parallel reviewers** - reviewer set depends on the host CLI:
 
  **Claude Code (2 in parallel):**
- - Agent 1: `claude-opus-4.6` → security + architecture
- - Agent 2: `claude-sonnet-4.6` → general quality
+ - Agent 1: `claude-fable-5` → security + architecture
+ - Agent 2: `claude-sonnet-4-6` → general quality
 
  **Copilot CLI (3 in parallel):**
- - Agent 1: `claude-opus-4.6` → security + architecture
+ - Agent 1: `claude-opus-4-8` → security + architecture (Fable 5 is not offered on Copilot CLI)
  - Agent 2: `gpt-5.4` → edge cases, different perspective
- - Agent 3: `claude-sonnet-4.6` → general quality
+ - Agent 3: `claude-sonnet-4-6` → general quality
 
  4. **Store-compliance cross-reference** - if iOS/Android release-relevant files changed, the matching catalog is loaded:
 
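As a reading aid for the hunk above, the host-dependent reviewer set could be written as a small lookup. This is only a minimal sketch, not the package's implementation: the function name `reviewers_for` and the host labels `claude-code` and `copilot-cli` are hypothetical, and only the model identifiers and focus areas are taken from the 10.7.4 side of the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one "model<TAB>focus" line per parallel reviewer.
# The host labels are assumptions; model ids are copied from the diff above.
reviewers_for() {
  case "$1" in
    claude-code)
      printf '%s\t%s\n' 'claude-fable-5' 'security + architecture'
      printf '%s\t%s\n' 'claude-sonnet-4-6' 'general quality'
      ;;
    copilot-cli)
      # Per the diff, Fable 5 is not offered on Copilot CLI, so Opus takes that slot.
      printf '%s\t%s\n' 'claude-opus-4-8' 'security + architecture'
      printf '%s\t%s\n' 'gpt-5.4' 'edge cases, different perspective'
      printf '%s\t%s\n' 'claude-sonnet-4-6' 'general quality'
      ;;
    *)
      echo "unknown host: $1" >&2
      return 1
      ;;
  esac
}

reviewers_for claude-code
reviewers_for copilot-cli
```

Keeping the table in one function makes a model rename like the one in this release a single-place edit, which is presumably why the same substitution shows up in so many files of this diff.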
@@ -33,7 +33,7 @@ Run all steps automatically:
  Step 0: FIGMA_SYNC (opt-in) pipeline/scripts/sync-figma-source.sh
  -- incremental pull from upstream if figmaSource.path is set
  Step 1: DETECT Compare timestamps, find stale targets
- Step 2: COPILOT Claude Code -> Copilot CLI (instructions + 26 sub-command skills)
+ Step 2: COPILOT Claude Code -> Copilot CLI (instructions + 35 sub-command skills)
  Step 3: REPO Claude Code -> pipeline repo (genericized, personal data scrub)
  Step 4: WEBSITE Version + phase/model counts -> {website-host} (i18n + projects.ts)
  Step 5: REMOTE Pipeline references -> remote-control README
@@ -158,12 +158,14 @@ When invoked with the `release` argument:
  |-------------|-------------|
  | `~/.claude/commands/multi-agent/{cmd}.md` | `~/.copilot/skills/multi-agent-{cmd}/SKILL.md` |
 
- **26 commands are synced** (v7.0.0+ canonical inventory - must match `cross-cli-contract.md` section1, drift = contract violation):
+ **35 commands are synced** (canonical inventory - must match `cross-cli-contract.md` section 1; drift = contract violation):
 
  ```
- autopilot, channels, dev, dev-autopilot, dev-local, dev-local-autopilot,
- help, issue, jira, kill, local, local-autopilot, log, manual-test, purge,
- refactor, resume, review, scan, search, setup, stack, status, sync, test, update
+ analysis, analysis-resolve, autopilot, build-optimize, channels, delete, dev,
+ dev-autopilot, dev-local, dev-local-autopilot, diff-explain, finish, garbage-collect,
+ help, issue, jira, kill, language, local, local-autopilot, log, manual-test,
+ prune-logs, purge, refactor, resume, review, scan, search, setup, stack, status,
+ sync, test, update
  ```
 
  **NOT synced**: `refs/*` - Lazy-load references, Claude Code specific
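The hunk above changes a hand-maintained count (26 to 35) alongside the list it describes, so the two can drift apart. The following is a minimal sketch of how such a declared count might be checked against the list. The variable names are hypothetical and the check is not part of the package; the inventory text is copied from the 10.7.4 side of the diff.

```shell
#!/usr/bin/env bash
# Inventory copied verbatim from the added lines of the hunk above.
inventory='analysis, analysis-resolve, autopilot, build-optimize, channels, delete, dev,
dev-autopilot, dev-local, dev-local-autopilot, diff-explain, finish, garbage-collect,
help, issue, jira, kill, language, local, local-autopilot, log, manual-test,
prune-logs, purge, refactor, resume, review, scan, search, setup, stack, status,
sync, test, update'
declared=35   # the number stated in the "commands are synced" sentence

# One name per line: commas become newlines, spaces and empty lines are dropped.
names=$(printf '%s\n' "$inventory" | tr ',' '\n' | tr -d ' ' | grep -v '^$')
actual=$(printf '%s\n' "$names" | wc -l | tr -d ' ')
dupes=$(printf '%s\n' "$names" | sort | uniq -d)

if [ "$actual" = "$declared" ] && [ -z "$dupes" ]; then
  printf 'inventory OK: %s names, declared %s\n' "$actual" "$declared"
else
  printf 'inventory drift: %s names, declared %s, duplicates: %s\n' \
    "$actual" "$declared" "${dupes:-none}"
fi
```

For the list as published in 10.7.4 the count and the declared number agree at 35.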
@@ -1,120 +0,0 @@
- #!/usr/bin/env bash
- # smoke-readme-counts.sh - fail when README "Current at-a-glance" counts drift
- # from the actual filesystem. The table is hand-edited; without this gate it
- # silently goes stale every time a smoke / skill / schema is added or removed.
- #
- # Each row is checked independently. Compute the live value, parse the README
- # value, fail if they differ. Output the exact table line so the fix is a
- # one-line edit.
- #
- # Exit 0 = every count matches reality, 1 = at least one row drifted.
-
- set -uo pipefail
-
- ROOT="$(cd "$(dirname "$0")/../.." && pwd)"
- README="$ROOT/README.md"
-
- pass=0
- fail=0
- failures=()
- record_pass() { pass=$((pass + 1)); printf ' \033[0;32mPASS\033[0m %s\n' "$1"; }
- record_fail() { fail=$((fail + 1)); failures+=("$1"); printf ' \033[0;31mFAIL\033[0m %s\n' "$1"; }
-
- printf '→ smoke-readme-counts: README "at-a-glance" table matches filesystem\n'
-
- [ -f "$README" ] || { record_fail "README.md missing"; exit 1; }
-
- # Helper: extract the integer at the end of a markdown table row whose first
- # cell starts with the given prefix. Examples of expected rows:
- # "| Smoke suites | 73 |"
- # "| Total `SKILL.md` files across all groups | 195 |"
- parse_count() {
-   local prefix_pattern="$1"
-   # Use grep -E for the line match (more predictable than awk's match()),
-   # then strip everything that isn't the trailing integer cell.
-   local row
-   row=$(grep -E "$prefix_pattern" "$README" | head -1)
-   [ -z "$row" ] && { echo "PARSE_ERROR"; return; }
-   # The integer is the last `| <digits> |` cell on the row.
-   echo "$row" | grep -oE '\| *[0-9]+ *\|' | tail -1 | grep -oE '[0-9]+'
- }
-
- check_count() {
-   local label="$1" actual="$2" prefix_pattern="$3"
-   local declared
-   declared=$(parse_count "$prefix_pattern")
-   if [ -z "$declared" ] || [ "$declared" = "PARSE_ERROR" ]; then
-     record_fail "$label: README row missing or unparseable (pattern: $prefix_pattern)"
-     return
-   fi
-   if [ "$declared" = "$actual" ]; then
-     record_pass "$label: README $declared matches actual $actual"
-   else
-     record_fail "$label: README claims $declared but actual is $actual"
-   fi
- }
-
- # --- Compute live counts -----------------------------------------------------
-
- # Slash commands: every .md in commands/multi-agent/ that is NOT an internal
- # `_`-prefixed picker fragment.
- SLASH=$(ls "$ROOT"/pipeline/commands/multi-agent/*.md 2>/dev/null \
-   | xargs -n1 basename | grep -v '^_' | wc -l | tr -d ' ')
-
- # Copilot skills: skill dirs whose SKILL.md frontmatter is `name: multi-agent-*`
- # AND `user-invocable: true`. Mirrors the Claude Code slash-command surface.
- COPILOT=$(grep -lE '^name: multi-agent-' "$ROOT"/pipeline/skills/shared/core/*/SKILL.md 2>/dev/null \
-   | xargs grep -l '^user-invocable: true' 2>/dev/null | wc -l | tr -d ' ')
-
- # Figma skills: SKILL.md count under figma-{common,ios,android}.
- FIGMA=$(find "$ROOT"/pipeline/skills/figma-common "$ROOT"/pipeline/skills/figma-ios "$ROOT"/pipeline/skills/figma-android \
-   -name SKILL.md 2>/dev/null | wc -l | tr -d ' ')
-
- # External skill catalog: subdirectories of shared/external/.
- EXTERNAL=$(ls -d "$ROOT"/pipeline/skills/shared/external/*/ 2>/dev/null | wc -l | tr -d ' ')
-
- # Total SKILL.md files: every SKILL.md anywhere under pipeline/skills/.
- TOTAL_SKILLS=$(find "$ROOT"/pipeline/skills -name SKILL.md 2>/dev/null | wc -l | tr -d ' ')
-
- # Smoke suites: every smoke-*.sh in pipeline/scripts/.
- SMOKES=$(ls "$ROOT"/pipeline/scripts/smoke-*.sh 2>/dev/null | wc -l | tr -d ' ')
-
- # Eval-triage fixtures: subdirs in pipeline/eval/triage/ whose names start with
- # a digit (the numbered fixture pattern, e.g. `01-empty-findings`).
- EVAL_TRIAGE=$(ls -d "$ROOT"/pipeline/eval/triage/[0-9]* 2>/dev/null | wc -l | tr -d ' ')
-
- # JSON schemas (excluding token-budget.json which is a config, not a schema).
- SCHEMAS=$(ls "$ROOT"/pipeline/schemas/*.schema.json 2>/dev/null | wc -l | tr -d ' ')
-
- # Agent personas: every .md under pipeline/agents/.
- AGENTS=$(find "$ROOT"/pipeline/agents -name '*.md' 2>/dev/null | wc -l | tr -d ' ')
-
- # Store-compliance skills (apple-archive-compliance + google-play-compliance).
- COMPLIANCE=$(ls -d "$ROOT"/pipeline/skills/shared/core/apple-archive-compliance "$ROOT"/pipeline/skills/shared/core/google-play-compliance 2>/dev/null | wc -l | tr -d ' ')
-
- # Golden-task fixtures.
- GOLDEN=$(ls -d "$ROOT"/pipeline/eval/golden-tasks/[0-9]* 2>/dev/null | wc -l | tr -d ' ')
-
- # --- Verify each row ---------------------------------------------------------
-
- check_count "slash commands" "$SLASH" '^\| Slash commands \(colon-form'
- check_count "Copilot skills" "$COPILOT" '^\| Copilot skills \(dash-form'
- check_count "store-compliance" "$COMPLIANCE" '^\| Store-compliance skills'
- check_count "figma skills" "$FIGMA" '^\| Figma skills'
- check_count "external catalog" "$EXTERNAL" '^\| External skill catalog'
- check_count "total SKILL.md" "$TOTAL_SKILLS" '^\| Total `SKILL\.md`'
- check_count "smoke suites" "$SMOKES" '^\| Smoke suites'
- check_count "golden-task fixtures" "$GOLDEN" '^\| Golden-task fixtures'
- check_count "eval-triage fixtures" "$EVAL_TRIAGE" '^\| Eval-triage fixtures'
- check_count "JSON schemas" "$SCHEMAS" '^\| JSON schemas'
- check_count "agent personas" "$AGENTS" '^\| Agent personas'
-
- # --- Summary -----------------------------------------------------------------
- total=$((pass + fail))
- printf '\n→ smoke-readme-counts: %d/%d passed\n' "$pass" "$total"
- if [ "$fail" -ne 0 ]; then
-   printf '\nFailures (one-line README fix per row):\n'
-   for f in "${failures[@]}"; do printf ' - %s\n' "$f"; done
-   exit 1
- fi
- exit 0
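For readers who want to see what the removed gate's core extraction step did, here is a standalone sketch of its `parse_count` helper. It is an adaptation rather than the original: this version reads the table from stdin instead of `$README`, and the `table` variable is a hypothetical fixture built from the two example rows quoted in the script's own comments.

```shell
#!/usr/bin/env bash
# Adapted from the removed script: find the first row matching the prefix
# pattern, then print the integer from its last `| <digits> |` cell.
# Unlike the original, the table text is read from stdin.
parse_count() {
  local prefix_pattern="$1" row
  row=$(grep -E "$prefix_pattern" | head -1)
  [ -z "$row" ] && { echo "PARSE_ERROR"; return; }
  printf '%s\n' "$row" | grep -oE '\| *[0-9]+ *\|' | tail -1 | grep -oE '[0-9]+'
}

# Hypothetical fixture: the two example rows from the script's comments.
table='| Smoke suites | 73 |
| Total `SKILL.md` files across all groups | 195 |'

printf '%s\n' "$table" | parse_count '^\| Smoke suites'        # prints 73
printf '%s\n' "$table" | parse_count '^\| Total `SKILL\.md`'   # prints 195
printf '%s\n' "$table" | parse_count '^\| Missing row'         # prints PARSE_ERROR
```

With the script deleted in 10.7.4, nothing in this diff indicates that a replacement check was added, so the README counts appear to be maintained by hand from this version on.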