@mmerterden/multi-agent-pipeline 10.7.3 → 10.7.4
- package/CHANGELOG.md +19 -2
- package/docs/adr/0001-three-model-triage.md +2 -2
- package/docs/adr/0007-multi-tool-adapter-framework.md +1 -1
- package/docs/adr/README.md +2 -2
- package/docs/architecture.md +14 -14
- package/docs/features.md +22 -21
- package/docs/performance.md +3 -3
- package/index.js +3 -7
- package/install/templates/copilot-instructions.md +2 -2
- package/package.json +2 -5
- package/pipeline/agents/dev-critic.md +1 -1
- package/pipeline/claude-md-template.md +1 -1
- package/pipeline/commands/multi-agent/dev-autopilot.md +1 -1
- package/pipeline/commands/multi-agent/finish.md +2 -2
- package/pipeline/commands/multi-agent/help.md +12 -12
- package/pipeline/commands/multi-agent/local.md +1 -1
- package/pipeline/commands/multi-agent/refs/features/dev-critic.md +1 -1
- package/pipeline/commands/multi-agent/refs/features/model-fallback.md +7 -3
- package/pipeline/commands/multi-agent/refs/knowledge.md +1 -1
- package/pipeline/commands/multi-agent/refs/phases/log-format.md +1 -1
- package/pipeline/commands/multi-agent/refs/phases/modes.md +1 -1
- package/pipeline/commands/multi-agent/refs/phases/phase-1-analysis.md +2 -2
- package/pipeline/commands/multi-agent/refs/phases/phase-2-planning.md +2 -2
- package/pipeline/commands/multi-agent/refs/phases/phase-3-dev.md +1 -1
- package/pipeline/commands/multi-agent/refs/phases/phase-4-review.md +18 -18
- package/pipeline/commands/multi-agent/refs/progress-contract.md +1 -1
- package/pipeline/commands/multi-agent/refs/tracker-contract.md +1 -2
- package/pipeline/commands/multi-agent/review.md +8 -8
- package/pipeline/commands/multi-agent/sync.md +3 -3
- package/pipeline/commands/multi-agent.md +7 -7
- package/pipeline/schemas/agent-state.schema.json +1 -1
- package/pipeline/schemas/prefs.schema.json +3 -3
- package/pipeline/schemas/reviewer-output.schema.json +1 -1
- package/pipeline/schemas/triage-output.schema.json +2 -2
- package/pipeline/scripts/README.md +1 -2
- package/pipeline/scripts/cost-budget-check.mjs +1 -1
- package/pipeline/scripts/cost-table.json +7 -0
- package/pipeline/scripts/fixtures/install-layout.tsv +5 -5
- package/pipeline/scripts/uninstall.mjs +53 -57
- package/pipeline/skills/shared/core/multi-agent/SKILL.md +11 -11
- package/pipeline/skills/shared/core/multi-agent-dev-autopilot/SKILL.md +1 -1
- package/pipeline/skills/shared/core/multi-agent-finish/SKILL.md +1 -1
- package/pipeline/skills/shared/core/multi-agent-help/SKILL.md +8 -8
- package/pipeline/skills/shared/core/multi-agent-review/SKILL.md +5 -5
- package/pipeline/skills/shared/core/multi-agent-sync/SKILL.md +7 -5
- package/pipeline/scripts/smoke-readme-counts.sh +0 -120
package/pipeline/skills/shared/core/multi-agent-finish/SKILL.md
@@ -13,7 +13,7 @@ You already wrote (and maybe hand-tested) the change on the current branch - o
 
 ```
 Phase 0: Init → project/branch detect, resolve base + diff (work already done), Jira id, state (NO worktree)
-Phase 4: Review → deterministic gates + parallel review +
+Phase 4: Review → deterministic gates + parallel review + Fable triage
 Phase 5: Build+Test → stack-aware build + run existing tests; SUCCESS required (automated gate, not the interactive user-test)
 Phase 6: Commit → commit remaining changes + push + open PR if none exists
 Phase 7: Report → technical analysis + Jira comment with test scenarios (channels: Jira / PR / Confluence / Wiki)
package/pipeline/skills/shared/core/multi-agent-help/SKILL.md
@@ -59,11 +59,11 @@ How It Works (Phase 0 - Interactive Flow):
 Pipeline (after Phase 0):
 
 Phase 0: Init -> The 8 steps above
-Phase 1: Analysis -> Stack detection + codebase scan (
+Phase 1: Analysis -> Stack detection + codebase scan (Fable)
 Phase 2: Planning -> Task breakdown + architecture review + Plan Approval Gate
 Phase 3: Dev -> TDD: test -> code -> build (Sonnet) + build queue
-Phase 4: Review -> Deterministic gates + parallel AI review +
-(Claude Code:
+Phase 4: Review -> Deterministic gates + parallel AI review + Fable triage
+(Claude Code: Fable + Sonnet · Copilot CLI: GPT-5.4 + Opus + Sonnet)
 Phase 5: Test -> Optional: switch to branch, test in Xcode
 Phase 6: Commit -> Commit -> push -> PR + issue body update (never auto-closes)
 Phase 7: Report -> Channels dispatcher (PR · Jira · Confluence · Wiki, multi-select)
@@ -75,7 +75,7 @@ Pipeline (after Phase 0):
 
 Modes:
 
-(normal) Full 8 phases, parallel review +
+(normal) Full 8 phases, parallel review + Fable triage
 --dev Fast: Init -> Dev(Opus) -> Commit -> Report (no plan gate)
 --local No worktree - works directly on local branch
 autopilot Skip all confirmations (EXCEPT Phase 7 channels menu)
@@ -194,11 +194,11 @@ Nasıl Çalışır (Phase 0 - İnteraktif Akış):
 Pipeline (Phase 0'dan sonra):
 
 Phase 0: Init -> Yukarıdaki 8 adım
-Phase 1: Analysis -> Stack tespiti + codebase taraması (
+Phase 1: Analysis -> Stack tespiti + codebase taraması (Fable)
 Phase 2: Planning -> Task kırılımı + mimari inceleme + Plan Onay Kapısı
 Phase 3: Dev -> TDD: test -> kod -> build (Sonnet) + build queue
-Phase 4: Review -> Deterministik kapılar + paralel AI review +
-(Claude Code:
+Phase 4: Review -> Deterministik kapılar + paralel AI review + Fable triage
+(Claude Code: Fable + Sonnet · Copilot CLI: GPT-5.4 + Opus + Sonnet)
 Phase 5: Test -> Opsiyonel: branch'e geç, Xcode'da test
 Phase 6: Commit -> Commit -> push -> PR + issue body güncelleme (hiç auto-close yok)
 Phase 7: Report -> Channels dispatcher (PR · Jira · Confluence · Wiki, multi-select)
@@ -210,7 +210,7 @@ Pipeline (Phase 0'dan sonra):
 
 Modlar:
 
-(normal) Tam 8 faz, paralel review +
+(normal) Tam 8 faz, paralel review + Fable triage
 --dev Hızlı: Init -> Dev(Opus) -> Commit -> Report (plan gate yok)
 --local Worktree yok - doğrudan local branch'te çalışır
 autopilot Tüm onayları atla (İSTİSNA: Phase 7 channels menüsü)
package/pipeline/skills/shared/core/multi-agent-review/SKILL.md
@@ -1,7 +1,7 @@
 ---
 name: multi-agent-review
 language: en
-description: "Run parallel review on the current branch's diff: 2 models on Claude Code (
+description: "Run parallel review on the current branch's diff: 2 models on Claude Code (Fable + Sonnet), 3 models on Copilot CLI (GPT + Opus + Sonnet). Review-only slice of the pipeline."
 user-invocable: true
 argument-hint: "[branch] - optional: branch to review. If omitted, the current branch is used."
 ---
@@ -27,13 +27,13 @@ Skip Phase 0-3 and review the current diff only.
 3. **Start parallel reviewers** - reviewer set depends on the host CLI:
 
 **Claude Code (2 in parallel):**
-- Agent 1: `claude-
-- Agent 2: `claude-sonnet-4
+- Agent 1: `claude-fable-5` → security + architecture
+- Agent 2: `claude-sonnet-4-6` → general quality
 
 **Copilot CLI (3 in parallel):**
-- Agent 1: `claude-opus-4
+- Agent 1: `claude-opus-4-8` → security + architecture (Fable 5 is not offered on Copilot CLI)
 - Agent 2: `gpt-5.4` → edge cases, different perspective
-- Agent 3: `claude-sonnet-4
+- Agent 3: `claude-sonnet-4-6` → general quality
 
 4. **Store-compliance cross-reference** - if iOS/Android release-relevant files changed, the matching catalog is loaded:
 
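The host-dependent reviewer sets in the hunk above could be expressed as a small lookup. This is a minimal sketch, not the pipeline's actual implementation: the function name `reviewers_for_host` and the host keys `claude-code` / `copilot-cli` are hypothetical, and only the model identifiers and role descriptions are taken from the added lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one "model<TAB>role" line per parallel reviewer.
# Model IDs and roles are copied from the diff; names and keys are assumed.
reviewers_for_host() {
  case "$1" in
    claude-code)
      # printf reuses its format string for each pair of arguments.
      printf '%s\t%s\n' \
        'claude-fable-5'    'security + architecture' \
        'claude-sonnet-4-6' 'general quality'
      ;;
    copilot-cli)
      # Fable 5 is not offered on Copilot CLI, so Opus covers that role.
      printf '%s\t%s\n' \
        'claude-opus-4-8'   'security + architecture' \
        'gpt-5.4'           'edge cases, different perspective' \
        'claude-sonnet-4-6' 'general quality'
      ;;
    *)
      echo "unknown host CLI: $1" >&2
      return 1
      ;;
  esac
}
```

Calling `reviewers_for_host copilot-cli | cut -f1` would list just the three model identifiers, which a caller could then fan out to in parallel.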
package/pipeline/skills/shared/core/multi-agent-sync/SKILL.md
@@ -33,7 +33,7 @@ Run all steps automatically:
 Step 0: FIGMA_SYNC (opt-in) pipeline/scripts/sync-figma-source.sh
 -- incremental pull from upstream if figmaSource.path is set
 Step 1: DETECT Compare timestamps, find stale targets
-Step 2: COPILOT Claude Code -> Copilot CLI (instructions +
+Step 2: COPILOT Claude Code -> Copilot CLI (instructions + 35 sub-command skills)
 Step 3: REPO Claude Code -> pipeline repo (genericized, personal data scrub)
 Step 4: WEBSITE Version + phase/model counts -> {website-host} (i18n + projects.ts)
 Step 5: REMOTE Pipeline references -> remote-control README
@@ -158,12 +158,14 @@ When invoked with the `release` argument:
 |-------------|-------------|
 | `~/.claude/commands/multi-agent/{cmd}.md` | `~/.copilot/skills/multi-agent-{cmd}/SKILL.md` |
 
-**
+**35 commands are synced** (canonical inventory - must match `cross-cli-contract.md` section 1; drift = contract violation):
 
 ```
-
-
-
+analysis, analysis-resolve, autopilot, build-optimize, channels, delete, dev,
+dev-autopilot, dev-local, dev-local-autopilot, diff-explain, finish, garbage-collect,
+help, issue, jira, kill, language, local, local-autopilot, log, manual-test,
+prune-logs, purge, refactor, resume, review, scan, search, setup, stack, status,
+sync, test, update
 ```
 
 **NOT synced**: `refs/*` - Lazy-load references, Claude Code specific
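Because the hunk above states a fixed count (35) next to a hand-maintained list, a quick consistency check can be sketched as follows. This is an illustrative sketch only: the helper names `list_commands` and `copilot_skill_path` are hypothetical, the inventory is copied from the added lines, and the path pattern comes from the table row in the same hunk. It does not consult `cross-cli-contract.md`, whose contents are not shown here.

```shell
#!/usr/bin/env bash
# Inventory copied verbatim from the added lines of the hunk.
inventory='analysis, analysis-resolve, autopilot, build-optimize, channels, delete, dev,
dev-autopilot, dev-local, dev-local-autopilot, diff-explain, finish, garbage-collect,
help, issue, jira, kill, language, local, local-autopilot, log, manual-test,
prune-logs, purge, refactor, resume, review, scan, search, setup, stack, status,
sync, test, update'

# Print one command name per line: split on commas, strip blanks, drop empties.
list_commands() {
  printf '%s\n' "$inventory" | tr ',' '\n' | tr -d ' \t' | grep -v '^$'
}

# Map a command name to its Copilot CLI skill path (pattern from the table row).
copilot_skill_path() {
  printf '%s/.copilot/skills/multi-agent-%s/SKILL.md\n' "$HOME" "$1"
}

# Fail when the list and the declared count disagree.
count=$(list_commands | wc -l | tr -d ' ')
if [ "$count" -ne 35 ]; then
  echo "inventory drift: expected 35 commands, found $count" >&2
  exit 1
fi
echo "inventory OK: $count commands"
```

Piping `list_commands` through `copilot_skill_path` (for example in a `while read` loop) would yield the target path for each synced command.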
package/pipeline/scripts/smoke-readme-counts.sh
@@ -1,120 +0,0 @@
-#!/usr/bin/env bash
-# smoke-readme-counts.sh - fail when README "Current at-a-glance" counts drift
-# from the actual filesystem. The table is hand-edited; without this gate it
-# silently goes stale every time a smoke / skill / schema is added or removed.
-#
-# Each row is checked independently. Compute the live value, parse the README
-# value, fail if they differ. Output the exact table line so the fix is a
-# one-line edit.
-#
-# Exit 0 = every count matches reality, 1 = at least one row drifted.
-
-set -uo pipefail
-
-ROOT="$(cd "$(dirname "$0")/../.." && pwd)"
-README="$ROOT/README.md"
-
-pass=0
-fail=0
-failures=()
-record_pass() { pass=$((pass + 1)); printf ' \033[0;32mPASS\033[0m %s\n' "$1"; }
-record_fail() { fail=$((fail + 1)); failures+=("$1"); printf ' \033[0;31mFAIL\033[0m %s\n' "$1"; }
-
-printf '→ smoke-readme-counts: README "at-a-glance" table matches filesystem\n'
-
-[ -f "$README" ] || { record_fail "README.md missing"; exit 1; }
-
-# Helper: extract the integer at the end of a markdown table row whose first
-# cell starts with the given prefix. Examples of expected rows:
-# "| Smoke suites | 73 |"
-# "| Total `SKILL.md` files across all groups | 195 |"
-parse_count() {
-  local prefix_pattern="$1"
-  # Use grep -E for the line match (more predictable than awk's match()),
-  # then strip everything that isn't the trailing integer cell.
-  local row
-  row=$(grep -E "$prefix_pattern" "$README" | head -1)
-  [ -z "$row" ] && { echo "PARSE_ERROR"; return; }
-  # The integer is the last `| <digits> |` cell on the row.
-  echo "$row" | grep -oE '\| *[0-9]+ *\|' | tail -1 | grep -oE '[0-9]+'
-}
-
-check_count() {
-  local label="$1" actual="$2" prefix_pattern="$3"
-  local declared
-  declared=$(parse_count "$prefix_pattern")
-  if [ -z "$declared" ] || [ "$declared" = "PARSE_ERROR" ]; then
-    record_fail "$label: README row missing or unparseable (pattern: $prefix_pattern)"
-    return
-  fi
-  if [ "$declared" = "$actual" ]; then
-    record_pass "$label: README $declared matches actual $actual"
-  else
-    record_fail "$label: README claims $declared but actual is $actual"
-  fi
-}
-
-# --- Compute live counts -----------------------------------------------------
-
-# Slash commands: every .md in commands/multi-agent/ that is NOT an internal
-# `_`-prefixed picker fragment.
-SLASH=$(ls "$ROOT"/pipeline/commands/multi-agent/*.md 2>/dev/null \
-  | xargs -n1 basename | grep -v '^_' | wc -l | tr -d ' ')
-
-# Copilot skills: skill dirs whose SKILL.md frontmatter is `name: multi-agent-*`
-# AND `user-invocable: true`. Mirrors the Claude Code slash-command surface.
-COPILOT=$(grep -lE '^name: multi-agent-' "$ROOT"/pipeline/skills/shared/core/*/SKILL.md 2>/dev/null \
-  | xargs grep -l '^user-invocable: true' 2>/dev/null | wc -l | tr -d ' ')
-
-# Figma skills: SKILL.md count under figma-{common,ios,android}.
-FIGMA=$(find "$ROOT"/pipeline/skills/figma-common "$ROOT"/pipeline/skills/figma-ios "$ROOT"/pipeline/skills/figma-android \
-  -name SKILL.md 2>/dev/null | wc -l | tr -d ' ')
-
-# External skill catalog: subdirectories of shared/external/.
-EXTERNAL=$(ls -d "$ROOT"/pipeline/skills/shared/external/*/ 2>/dev/null | wc -l | tr -d ' ')
-
-# Total SKILL.md files: every SKILL.md anywhere under pipeline/skills/.
-TOTAL_SKILLS=$(find "$ROOT"/pipeline/skills -name SKILL.md 2>/dev/null | wc -l | tr -d ' ')
-
-# Smoke suites: every smoke-*.sh in pipeline/scripts/.
-SMOKES=$(ls "$ROOT"/pipeline/scripts/smoke-*.sh 2>/dev/null | wc -l | tr -d ' ')
-
-# Eval-triage fixtures: subdirs in pipeline/eval/triage/ whose names start with
-# a digit (the numbered fixture pattern, e.g. `01-empty-findings`).
-EVAL_TRIAGE=$(ls -d "$ROOT"/pipeline/eval/triage/[0-9]* 2>/dev/null | wc -l | tr -d ' ')
-
-# JSON schemas (excluding token-budget.json which is a config, not a schema).
-SCHEMAS=$(ls "$ROOT"/pipeline/schemas/*.schema.json 2>/dev/null | wc -l | tr -d ' ')
-
-# Agent personas: every .md under pipeline/agents/.
-AGENTS=$(find "$ROOT"/pipeline/agents -name '*.md' 2>/dev/null | wc -l | tr -d ' ')
-
-# Store-compliance skills (apple-archive-compliance + google-play-compliance).
-COMPLIANCE=$(ls -d "$ROOT"/pipeline/skills/shared/core/apple-archive-compliance "$ROOT"/pipeline/skills/shared/core/google-play-compliance 2>/dev/null | wc -l | tr -d ' ')
-
-# Golden-task fixtures.
-GOLDEN=$(ls -d "$ROOT"/pipeline/eval/golden-tasks/[0-9]* 2>/dev/null | wc -l | tr -d ' ')
-
-# --- Verify each row ---------------------------------------------------------
-
-check_count "slash commands" "$SLASH" '^\| Slash commands \(colon-form'
-check_count "Copilot skills" "$COPILOT" '^\| Copilot skills \(dash-form'
-check_count "store-compliance" "$COMPLIANCE" '^\| Store-compliance skills'
-check_count "figma skills" "$FIGMA" '^\| Figma skills'
-check_count "external catalog" "$EXTERNAL" '^\| External skill catalog'
-check_count "total SKILL.md" "$TOTAL_SKILLS" '^\| Total `SKILL\.md`'
-check_count "smoke suites" "$SMOKES" '^\| Smoke suites'
-check_count "golden-task fixtures" "$GOLDEN" '^\| Golden-task fixtures'
-check_count "eval-triage fixtures" "$EVAL_TRIAGE" '^\| Eval-triage fixtures'
-check_count "JSON schemas" "$SCHEMAS" '^\| JSON schemas'
-check_count "agent personas" "$AGENTS" '^\| Agent personas'
-
-# --- Summary -----------------------------------------------------------------
-total=$((pass + fail))
-printf '\n→ smoke-readme-counts: %d/%d passed\n' "$pass" "$total"
-if [ "$fail" -ne 0 ]; then
-  printf '\nFailures (one-line README fix per row):\n'
-  for f in "${failures[@]}"; do printf ' - %s\n' "$f"; done
-  exit 1
-fi
-exit 0
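To make the row-parsing step of the removed script concrete, here is a small standalone sketch of `parse_count` run against a throwaway table. One deliberate deviation, made only so the demo is self-contained: this variant takes the file as a second argument, whereas the removed script read the global `$README`. The two sample rows are the ones quoted in the script's own comments; the header row and the temp-file setup are assumptions.

```shell
#!/usr/bin/env bash
# Demo variant of parse_count: same grep pipeline as the removed script,
# but the file is passed explicitly (assumption for a standalone example).
parse_count() {
  local prefix_pattern="$1" file="$2" row
  row=$(grep -E "$prefix_pattern" "$file" | head -1)
  [ -z "$row" ] && { echo "PARSE_ERROR"; return; }
  # The integer is the last `| <digits> |` cell on the row.
  echo "$row" | grep -oE '\| *[0-9]+ *\|' | tail -1 | grep -oE '[0-9]+'
}

# Throwaway README fragment; the header row is hypothetical.
demo_readme=$(mktemp)
cat > "$demo_readme" <<'EOF'
| Metric | Count |
|---|---|
| Smoke suites | 73 |
| Total `SKILL.md` files across all groups | 195 |
EOF

parse_count '^\| Smoke suites' "$demo_readme"        # prints 73
parse_count '^\| Total `SKILL\.md`' "$demo_readme"   # prints 195
parse_count '^\| Agent personas' "$demo_readme"      # prints PARSE_ERROR (no such row)

rm -f "$demo_readme"
```

Anchoring each pattern with `^\| ` keeps a label that merely appears mid-row from matching, which is why the removed script's `check_count` calls all use that prefix.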