opencode-swarm 6.2.0 → 6.5.0
- package/README.md +306 -543
- package/dist/agents/test-engineer.adversarial.test.d.ts +5 -0
- package/dist/agents/test-engineer.security.test.d.ts +1 -0
- package/dist/config/schema.d.ts +51 -0
- package/dist/index.js +4216 -59
- package/dist/tools/checkpoint.d.ts +2 -0
- package/dist/tools/complexity-hotspots.d.ts +2 -0
- package/dist/tools/evidence-check.d.ts +2 -0
- package/dist/tools/imports.d.ts +5 -0
- package/dist/tools/index.d.ts +11 -0
- package/dist/tools/lint.d.ts +34 -0
- package/dist/tools/pkg-audit.d.ts +2 -0
- package/dist/tools/schema-drift.d.ts +2 -0
- package/dist/tools/secretscan.d.ts +31 -0
- package/dist/tools/symbols.d.ts +2 -0
- package/dist/tools/test-runner.d.ts +48 -0
- package/dist/tools/test-runner.security-adversarial.test.d.ts +5 -0
- package/dist/tools/todo-extract.d.ts +2 -0
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
<p align="center">
|
|
2
|
-
|
|
2
|
+
<img src="https://img.shields.io/badge/version-6.3.0-blue" alt="Version">
|
|
3
3
|
<img src="https://img.shields.io/badge/license-MIT-green" alt="License">
|
|
4
4
|
<img src="https://img.shields.io/badge/opencode-plugin-purple" alt="OpenCode Plugin">
|
|
5
5
|
<img src="https://img.shields.io/badge/agents-9-orange" alt="Agents">
|
|
@@ -9,185 +9,191 @@
|
|
|
9
9
|
<h1 align="center">🐝 OpenCode Swarm</h1>
|
|
10
10
|
|
|
11
11
|
<p align="center">
|
|
12
|
-
<strong>
|
|
13
|
-
|
|
12
|
+
<strong>A structured multi-agent coding framework for OpenCode.</strong><br>
|
|
13
|
+
Nine specialized agents. Persistent memory. A QA gate on every task. Code that ships.
|
|
14
14
|
</p>
|
|
15
15
|
|
|
16
16
|
<p align="center">
|
|
17
|
-
<a href="#
|
|
17
|
+
<a href="#the-problem">The Problem</a> •
|
|
18
18
|
<a href="#how-it-works">How It Works</a> •
|
|
19
|
-
<a href="#installation">Installation</a> •
|
|
20
19
|
<a href="#agents">Agents</a> •
|
|
21
|
-
<a href="#
|
|
20
|
+
<a href="#persistent-memory">Memory</a> •
|
|
21
|
+
<a href="#guardrails">Guardrails</a> •
|
|
22
|
+
<a href="#comparison">Comparison</a> •
|
|
23
|
+
<a href="#installation">Installation</a> •
|
|
24
|
+
<a href="#roadmap">Roadmap</a>
|
|
22
25
|
</p>
|
|
23
26
|
|
|
24
27
|
---
|
|
25
28
|
|
|
26
|
-
## The Problem
|
|
27
|
-
|
|
28
|
-
```
|
|
29
|
-
You: "Build me an authentication system"
|
|
30
|
-
|
|
31
|
-
Other Frameworks:
|
|
32
|
-
├── Agent 1 starts auth module...
|
|
33
|
-
├── Agent 2 starts user model... (conflicts with Agent 1)
|
|
34
|
-
├── Agent 3 starts database... (wrong schema)
|
|
35
|
-
├── Agent 4 starts tests... (for code that doesn't exist yet)
|
|
36
|
-
└── Result: Chaos. Conflicts. Context lost. Start over.
|
|
37
|
-
|
|
38
|
-
OpenCode Swarm:
|
|
39
|
-
├── Architect analyzes request
|
|
40
|
-
├── Explorer scans codebase (+ gap analysis)
|
|
41
|
-
├── @sme consulted on security domain
|
|
42
|
-
├── Architect creates phased plan with acceptance criteria
|
|
43
|
-
├── @critic reviews plan → APPROVED
|
|
44
|
-
├── Phase 1: User model → Review → Tests (run + PASS) → ✓
|
|
45
|
-
├── Phase 2: Auth logic → Review → Tests (run + PASS) → ✓
|
|
46
|
-
├── Phase 3: Session management → Review → Tests (run + PASS) → ✓
|
|
47
|
-
└── Result: Working code. Documented decisions. Resumable progress.
|
|
48
|
-
```
|
|
49
|
-
|
|
50
|
-
---
|
|
51
|
-
|
|
52
|
-
## Why Swarm?
|
|
53
|
-
|
|
54
|
-
<table>
|
|
55
|
-
<tr>
|
|
56
|
-
<td width="50%">
|
|
29
|
+
## The Problem
|
|
57
30
|
|
|
58
|
-
|
|
31
|
+
Every multi-agent AI coding tool on the market has the same failure mode: they are vibes-driven. You describe a feature. Agents spawn. They race each other to write conflicting code, lose context after 20 messages, hit token limits mid-task, and produce something that sort-of-works until it doesn't. There's no plan. There's no memory. There's no gatekeeper. There's no test that was actually run.
|
|
59
32
|
|
|
60
|
-
-
|
|
61
|
-
- Single model = correlated failures
|
|
62
|
-
- No planning, just vibes
|
|
63
|
-
- Context lost between sessions
|
|
64
|
-
- QA as afterthought (if at all)
|
|
65
|
-
- Entire codebase in one prompt
|
|
66
|
-
- No way to resume projects
|
|
33
|
+
**oh-my-opencode** is a prompt collection. **get-shit-done** is a workflow macro. Neither is a framework with memory, QA enforcement, or the ability to resume a project a week later exactly where you left off.
|
|
67
34
|
|
|
68
|
-
|
|
69
|
-
<td width="50%">
|
|
35
|
+
OpenCode Swarm is built differently.
|
|
70
36
|
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
- **One task at a time** - focused, quality code
|
|
79
|
-
- **Resumable projects** - pick up exactly where you left off
|
|
37
|
+
```
|
|
38
|
+
Every other framework:
|
|
39
|
+
├── Agent 1 starts the auth module...
|
|
40
|
+
├── Agent 2 starts the user model... (conflicts with Agent 1)
|
|
41
|
+
├── Agent 3 writes tests... (for code that doesn't exist yet)
|
|
42
|
+
├── Context window fills up and the whole thing drifts
|
|
43
|
+
└── Result: chaos. Rework. Start over.
|
|
80
44
|
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
45
|
+
OpenCode Swarm:
|
|
46
|
+
├── Architect reads .swarm/plan.md → project already in progress, resumes Phase 2
|
|
47
|
+
├── @explorer scans the codebase for current state
|
|
48
|
+
├── @sme DOMAIN: security → consults on auth patterns, guidance cached
|
|
49
|
+
├── Architect writes .swarm/plan.md: 3 phases, 9 tasks, acceptance criteria per task
|
|
50
|
+
├── @critic reviews the plan → APPROVED
|
|
51
|
+
├── @coder implements Task 2.2 (one task, full context, nothing else)
|
|
52
|
+
├── diff tool → imports tool → lint fix → lint check → secretscan → @reviewer → @test_engineer
|
|
53
|
+
├── All gates pass → plan.md updated → Task 2.2: [x]
|
|
54
|
+
└── Result: working code, documented decisions, resumable project, evidence trail
|
|
55
|
+
```
|
|
84
56
|
|
|
85
57
|
---
|
|
86
58
|
|
|
87
59
|
## How It Works
|
|
88
60
|
|
|
61
|
+
### The Execution Pipeline
|
|
62
|
+
|
|
89
63
|
```
|
|
90
|
-
|
|
91
|
-
│
|
|
92
|
-
|
|
64
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
65
|
+
│ Phase 0: Resume Check │
|
|
66
|
+
│ .swarm/plan.md exists? Resume mid-task. New project? Continue. │
|
|
67
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
93
68
|
│
|
|
94
69
|
▼
|
|
95
|
-
|
|
96
|
-
│
|
|
97
|
-
│
|
|
98
|
-
|
|
70
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
71
|
+
│ Phase 1: Clarify │
|
|
72
|
+
│ Ask only what the Architect cannot infer. Then stop. │
|
|
73
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
99
74
|
│
|
|
100
75
|
▼
|
|
101
|
-
|
|
102
|
-
│
|
|
103
|
-
│
|
|
104
|
-
|
|
76
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
77
|
+
│ Phase 2: Discover │
|
|
78
|
+
│ @explorer scans codebase → structure, languages, frameworks, key files │
|
|
79
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
105
80
|
│
|
|
106
81
|
▼
|
|
107
|
-
|
|
108
|
-
│
|
|
109
|
-
│
|
|
110
|
-
|
|
82
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
83
|
+
│ Phase 3: SME Consult (serial, cached) │
|
|
84
|
+
│ @sme DOMAIN: security, @sme DOMAIN: api, ... │
|
|
85
|
+
│ Guidance written to .swarm/context.md — never re-asked in future phases │
|
|
86
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
111
87
|
│
|
|
112
88
|
▼
|
|
113
|
-
|
|
114
|
-
│
|
|
115
|
-
│
|
|
116
|
-
│
|
|
117
|
-
│
|
|
118
|
-
|
|
89
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
90
|
+
│ Phase 4: Plan │
|
|
91
|
+
│ Architect writes .swarm/plan.md │
|
|
92
|
+
│ Structured phases, tasks with SMALL/MEDIUM/LARGE sizing, acceptance │
|
|
93
|
+
│ criteria per task, explicit dependency graph │
|
|
94
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
119
95
|
│
|
|
120
96
|
▼
|
|
121
|
-
|
|
122
|
-
│
|
|
123
|
-
│
|
|
124
|
-
│
|
|
125
|
-
|
|
126
|
-
│ Phase 2: Core Auth [4 tasks] │
|
|
127
|
-
│ Phase 3: Session Management [3 tasks] │
|
|
128
|
-
└─────────────────────────────────────────────────────────────────────────┘
|
|
97
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
98
|
+
│ Phase 4.5: Critic Gate │
|
|
99
|
+
│ @critic reviews plan → APPROVED / NEEDS_REVISION / REJECTED │
|
|
100
|
+
│ Max 2 revision cycles. Escalates to user if unresolved. │
|
|
101
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
129
102
|
│
|
|
130
103
|
▼
|
|
131
|
-
|
|
132
|
-
│
|
|
133
|
-
│
|
|
134
|
-
│
|
|
135
|
-
|
|
104
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
105
|
+
│ Phase 5: Execute (per task) │
|
|
106
|
+
│ │
|
|
107
|
+
│ [UI task?] → @designer scaffold first │
|
|
108
|
+
│ │
|
|
109
|
+
│ @coder (one task, full context) │
|
|
110
|
+
│ ↓ │
|
|
111
|
+
│ diff tool → imports tool → lint fix → lint check → secretscan │
|
|
112
|
+
│ (contract change detection) (AST-based) (auto-fix) (entropy scan) │
|
|
113
|
+
│ ↓ │
|
|
114
|
+
│ @reviewer (correctness pass) │
|
|
115
|
+
│ ↓ APPROVED │
|
|
116
|
+
│ @reviewer (security-only pass, if file matches security globs) │
|
|
117
|
+
│ ↓ APPROVED │
|
|
118
|
+
│ @test_engineer (verification tests + coverage gate ≥70%) │
|
|
119
|
+
│ ↓ PASS │
|
|
120
|
+
│ @test_engineer (adversarial tests — boundary violations, injections) │
|
|
121
|
+
│ ↓ PASS │
|
|
122
|
+
│ plan.md → [x] Task complete │
|
|
123
|
+
│ │
|
|
124
|
+
│ Any gate fails → back to @coder with structured rejection reason │
|
|
125
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
136
126
|
│
|
|
137
127
|
▼
|
|
138
|
-
|
|
139
|
-
│
|
|
140
|
-
│
|
|
141
|
-
│
|
|
142
|
-
│
|
|
143
|
-
|
|
144
|
-
│ └─────────┘ └───────┘ └────────────┘ └──────────────┘ │
|
|
145
|
-
│ │ │ │ │ │
|
|
146
|
-
│ │ Contract │ If REJECTED: If FAIL: fix │
|
|
147
|
-
│ │ changes? │ retry from coder + retest │
|
|
148
|
-
│ │ │ │ │ │
|
|
149
|
-
│ │ ▼ │ ▼ │
|
|
150
|
-
│ │ ┌─────────┐ │ ┌──────────────┐ ┌──────────────┐ │
|
|
151
|
-
│ │ │@explorer│ │ │ @reviewer │ → │ @test │ │
|
|
152
|
-
│ │ │ impact │ │ │ security-only│ │ adversarial │ │
|
|
153
|
-
│ │ │analysis │ │ │ (if match) │ │ (attacks) │ │
|
|
154
|
-
│ │ └─────────┘ │ └──────────────┘ └──────────────┘ │
|
|
155
|
-
│ │ │ │
|
|
156
|
-
│ └───────────────┘ │
|
|
157
|
-
│ │
|
|
158
|
-
│ Update plan.md: [x] Task complete (only after ALL gates pass) │
|
|
159
|
-
│ Next task... │
|
|
160
|
-
└─────────────────────────────────────────────────────────────────────────┘
|
|
161
|
-
│
|
|
162
|
-
▼
|
|
163
|
-
┌─────────────────────────────────────────────────────────────────────────┐
|
|
164
|
-
│ PHASE 6: Phase Complete │
|
|
165
|
-
│ Re-scan with @explorer │
|
|
166
|
-
│ Update context.md with learnings │
|
|
167
|
-
│ Archive to .swarm/history/ │
|
|
168
|
-
│ "Phase 1 complete. Ready for Phase 2?" │
|
|
169
|
-
└─────────────────────────────────────────────────────────────────────────┘
|
|
128
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
129
|
+
│ Phase 6: Phase Complete │
|
|
130
|
+
│ @explorer rescans. @docs updates documentation. Retrospective written. │
|
|
131
|
+
│ Learnings injected as [SWARM RETROSPECTIVE] into next phase. │
|
|
132
|
+
│ "Phase 1 complete (4 tasks, 0 rejections). Ready for Phase 2?" │
|
|
133
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
170
134
|
```
|
|
171
135
|
|
|
136
|
+
### Why Serial Execution Matters
|
|
137
|
+
|
|
138
|
+
Multi-agent parallelism sounds fast. In practice, it is a race to produce conflicting, unreviewed code that requires a human to untangle. OpenCode Swarm runs one task at a time through a deterministic pipeline. Every task is reviewed. Every test is run. Every failure is documented and fed back to the coder with structured context. The tradeoff in raw speed is paid back in not redoing work.
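
The serial gate sequence described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's actual implementation: the `Gate` type, the `runGates` function, and the `pass`/`fail` stub gates are hypothetical names introduced here. Only the ordering (one gate at a time) and the "first failure goes back to the coder with a structured reason" behavior are taken from the pipeline description.

```typescript
// Hypothetical sketch of a serial QA gate pipeline. All names are illustrative.
type GateResult =
  | { ok: true }
  | { ok: false; reason: string };

interface Gate {
  name: string;
  run: (taskId: string) => GateResult;
}

interface TaskOutcome {
  taskId: string;
  status: "complete" | "rejected";
  failedGate?: string; // set only when a gate rejected the task
  reason?: string;     // structured rejection reason handed back to the coder
  gatesRun: string[];  // gates that actually executed, in order
}

// Run gates strictly in order and stop at the first failure,
// so later gates never see work that an earlier gate rejected.
function runGates(taskId: string, gates: Gate[]): TaskOutcome {
  const gatesRun: string[] = [];
  for (const gate of gates) {
    gatesRun.push(gate.name);
    const result = gate.run(taskId);
    if (!result.ok) {
      return {
        taskId,
        status: "rejected",
        failedGate: gate.name,
        reason: result.reason,
        gatesRun,
      };
    }
  }
  return { taskId, status: "complete", gatesRun };
}

// Stub gates standing in for the real tools and agents.
const pass = (name: string): Gate => ({ name, run: () => ({ ok: true }) });
const fail = (name: string, reason: string): Gate => ({
  name,
  run: () => ({ ok: false, reason }),
});

const outcome = runGates("2.2", [
  pass("diff"),
  pass("lint"),
  fail("reviewer", "missing expiration claim"),
  pass("test_engineer"), // never reached: the reviewer gate rejected first
]);
```

In this sketch a rejection carries the failing gate's name and reason, which is the information a coder would need to retry the same task without re-running the whole phase.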
|
|
139
|
+
|
|
172
140
|
---
|
|
173
141
|
|
|
174
|
-
##
|
|
142
|
+
## Agents
|
|
143
|
+
|
|
144
|
+
### 🎯 Orchestrator
|
|
145
|
+
|
|
146
|
+
**`architect`** — The central coordinator. Owns the plan, delegates all work, enforces every QA gate, maintains project memory, and resumes projects across sessions. Every other agent works for the Architect.
|
|
147
|
+
|
|
148
|
+
### 🔍 Discovery
|
|
149
|
+
|
|
150
|
+
**`explorer`** — Fast codebase scanner. Identifies structure, languages, frameworks, key files, and import patterns. Runs before planning and after every phase completes.
|
|
151
|
+
|
|
152
|
+
### 🧠 Domain Expert
|
|
153
|
+
|
|
154
|
+
**`sme`** — Open-domain expert. The Architect specifies any domain per call: `security`, `python`, `rust`, `kubernetes`, `ios`, `ml`, `blockchain` — any domain the underlying model has knowledge of. No hardcoded list. Guidance is cached in `.swarm/context.md` so the same question is never asked twice.
|
|
155
|
+
|
|
156
|
+
### 🎨 Design
|
|
157
|
+
|
|
158
|
+
**`designer`** — UI/UX specification agent. Opt-in via config. Generates component scaffolds and design tokens before the coder touches UI tasks, eliminating the most common source of front-end rework.
|
|
175
159
|
|
|
176
|
-
|
|
160
|
+
### 💻 Implementation
|
|
161
|
+
|
|
162
|
+
**`coder`** — Implements exactly one task with full context. No multitasking. No context bleed from prior tasks. The coder receives: the task spec, acceptance criteria, SME guidance, and relevant context from `.swarm/context.md`. Nothing else.
|
|
163
|
+
|
|
164
|
+
**`test_engineer`** — Generates tests, runs them, and returns structured `PASS/FAIL` verdicts with coverage percentages. Runs twice per task: once for verification, once for adversarial attack scenarios.
|
|
165
|
+
|
|
166
|
+
### ✅ Quality Assurance
|
|
167
|
+
|
|
168
|
+
**`reviewer`** — Dual-pass review. First pass: correctness, logic, maintainability. Second pass: security-only, scoped to OWASP Top 10 categories, triggered automatically when the modified files match security-sensitive path patterns. Both passes produce structured verdicts with specific rejection reasons.
|
|
169
|
+
|
|
170
|
+
**`critic`** — Plan review gate. Reviews the Architect's plan *before implementation begins*. Checks for completeness, feasibility, scope creep, missing dependencies, and AI-slop hallucinations. Plans do not proceed without Critic approval.
|
|
171
|
+
|
|
172
|
+
### 📝 Documentation
|
|
173
|
+
|
|
174
|
+
**`docs`** — Documentation synthesizer. Runs in Phase 6 with a diff of changed files. Updates READMEs, API documentation, and guides to reflect what was actually built, not what was planned.
|
|
175
|
+
|
|
176
|
+
---
|
|
177
|
+
|
|
178
|
+
## Persistent Memory
|
|
179
|
+
|
|
180
|
+
Other frameworks lose everything when the session ends. Swarm stores project state on disk.
|
|
177
181
|
|
|
178
182
|
```
|
|
179
183
|
.swarm/
|
|
180
|
-
├── plan.md #
|
|
181
|
-
├──
|
|
182
|
-
├──
|
|
183
|
-
|
|
184
|
-
│
|
|
184
|
+
├── plan.md # Living roadmap: phases, tasks, status, rejections, blockers
|
|
185
|
+
├── plan.json # Machine-readable plan for tooling
|
|
186
|
+
├── context.md # Institutional knowledge: decisions, SME guidance, patterns
|
|
187
|
+
├── evidence/ # Per-task execution evidence bundles
|
|
188
|
+
│ ├── 1.1/ # review verdict, test results, diff summary for task 1.1
|
|
189
|
+
│ └── 2.3/
|
|
185
190
|
└── history/
|
|
186
|
-
├── phase-1.md # What was
|
|
191
|
+
├── phase-1.md # What was built, what was learned, retrospective metrics
|
|
187
192
|
└── phase-2.md
|
|
188
193
|
```
|
|
189
194
|
|
|
190
|
-
### plan.md
|
|
195
|
+
### plan.md — Living Roadmap
|
|
196
|
+
|
|
191
197
|
```markdown
|
|
192
198
|
# Project: Auth System
|
|
193
199
|
Current Phase: 2
|
|
@@ -200,281 +206,133 @@ Current Phase: 2
|
|
|
200
206
|
## Phase 2: Core Auth [IN PROGRESS]
|
|
201
207
|
- [x] Task 2.1: Login endpoint [MEDIUM]
|
|
202
208
|
- [ ] Task 2.2: JWT generation [MEDIUM] (depends: 2.1) ← CURRENT
|
|
203
|
-
- Acceptance: Returns valid JWT with user claims
|
|
204
|
-
- Attempt 1: REJECTED
|
|
209
|
+
- Acceptance: Returns valid JWT with user claims, 15-minute expiry
|
|
210
|
+
- Attempt 1: REJECTED — missing expiration claim
|
|
205
211
|
- [ ] Task 2.3: Token validation middleware [MEDIUM]
|
|
206
|
-
- [BLOCKED] Task 2.4: Refresh
|
|
207
|
-
- Reason:
|
|
212
|
+
- [BLOCKED] Task 2.4: Refresh token rotation
|
|
213
|
+
- Reason: Awaiting decision on rotation strategy
|
|
208
214
|
```
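
To make the task-line format above concrete, here is a small sketch of how a script might read status out of `plan.md`. It assumes task lines look exactly like the example (checkbox or `[BLOCKED]` marker, `Task N.M:` prefix, optional size tag, optional `(depends: ...)`, optional `← CURRENT`). The `parseTaskLine` helper and the `PlanTask` shape are hypothetical and are not part of the package. The README lists `plan.json` as the machine-readable plan for tooling, so real tooling would more likely read that file instead.

```typescript
// Hypothetical helper: parses one task line in the plan.md format shown above.
type TaskStatus = "done" | "pending" | "blocked";

interface PlanTask {
  id: string;
  title: string;
  status: TaskStatus;
  size?: "SMALL" | "MEDIUM" | "LARGE";
  dependsOn: string[];
  current: boolean;
}

// Groups: 1 = marker, 2 = task id, 3 = title, 4 = size, 5 = dependency list, 6 = current flag.
const TASK_LINE =
  /^- \[(x| |BLOCKED)\] Task (\d+\.\d+): (.+?)(?: \[(SMALL|MEDIUM|LARGE)\])?(?: \(depends: ([\d., ]+)\))?( ← CURRENT)?$/;

function parseTaskLine(line: string): PlanTask | null {
  const m = TASK_LINE.exec(line.trim());
  if (!m) return null; // not a task line (heading, acceptance note, etc.)
  const marker = m[1];
  const status: TaskStatus =
    marker === "x" ? "done" : marker === "BLOCKED" ? "blocked" : "pending";
  const deps = m[5];
  return {
    id: m[2] ?? "",
    title: m[3] ?? "",
    status,
    size: m[4] as PlanTask["size"],
    dependsOn: deps ? deps.split(",").map((s) => s.trim()) : [],
    current: m[6] !== undefined,
  };
}

const planLines = [
  "- [x] Task 2.1: Login endpoint [MEDIUM]",
  "- [ ] Task 2.2: JWT generation [MEDIUM] (depends: 2.1) ← CURRENT",
  "- [ ] Task 2.3: Token validation middleware [MEDIUM]",
  "- [BLOCKED] Task 2.4: Refresh token rotation",
];

const tasks = planLines
  .map(parseTaskLine)
  .filter((t): t is PlanTask => t !== null);

// The task a resumed session would pick up first.
const currentTask = tasks.find((t) => t.current);
```

Lines that do not match the pattern return `null` and are filtered out, so nested notes such as acceptance criteria are simply skipped in this sketch.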
|
|
209
215
|
|
|
210
|
-
### context.md
|
|
216
|
+
### context.md — Institutional Knowledge
|
|
217
|
+
|
|
211
218
|
```markdown
|
|
212
219
|
# Project Context: Auth System
|
|
213
220
|
|
|
214
221
|
## Technical Decisions
|
|
215
|
-
-
|
|
216
|
-
- JWT
|
|
217
|
-
-
|
|
222
|
+
- bcrypt cost factor: 12
|
|
223
|
+
- JWT TTL: 15 minutes; refresh TTL: 7 days
|
|
224
|
+
- Refresh token store: Redis with key prefix auth:refresh:
|
|
218
225
|
|
|
219
226
|
## SME Guidance Cache
|
|
220
|
-
###
|
|
221
|
-
- Never log tokens or passwords
|
|
222
|
-
- Use constant-time comparison for
|
|
223
|
-
-
|
|
227
|
+
### security (Phase 1)
|
|
228
|
+
- Never log tokens or passwords in any context
|
|
229
|
+
- Use constant-time comparison for all token equality checks
|
|
230
|
+
- Rate-limit login endpoint: 5 attempts / 15 minutes per IP
|
|
224
231
|
|
|
225
|
-
###
|
|
226
|
-
- Return 401 for invalid credentials (not 404)
|
|
227
|
-
- Include token expiry in response body
|
|
232
|
+
### api (Phase 1)
|
|
233
|
+
- Return HTTP 401 for invalid credentials (not 404)
|
|
234
|
+
- Include token expiry timestamp in response body
|
|
228
235
|
|
|
229
236
|
## Patterns Established
|
|
230
|
-
- Error handling:
|
|
231
|
-
- Validation: Zod schemas in /validators
|
|
237
|
+
- Error handling: custom ApiError class with HTTP status and error code
|
|
238
|
+
- Validation: Zod schemas in /validators/, applied at request boundary
|
|
232
239
|
```
|
|
233
240
|
|
|
234
|
-
|
|
241
|
+
Start a new session tomorrow. The Architect reads these files and picks up exactly where you left off — no re-explaining, no rediscovery, no drift.
|
|
235
242
|
|
|
236
|
-
### Evidence
|
|
243
|
+
### Evidence Bundles
|
|
237
244
|
|
|
238
|
-
Each task
|
|
245
|
+
Each completed task writes structured evidence to `.swarm/evidence/`:
|
|
239
246
|
|
|
240
|
-
| Type |
|
|
241
|
-
|
|
242
|
-
| `review` |
|
|
243
|
-
| `test` |
|
|
244
|
-
| `diff` |
|
|
245
|
-
| `approval` | Stakeholder sign-off
|
|
246
|
-
| `
|
|
247
|
-
| `retrospective` | Phase metrics & lessons | `phase_number`, `total_tool_calls`, `coder_revisions`, `reviewer_rejections`, `test_failures`, `security_findings`, `task_count`, `task_complexity`, `top_rejection_reasons[]`, `lessons_learned[]` |
|
|
247
|
+
| Type | What It Captures |
|
|
248
|
+
|------|-----------------|
|
|
249
|
+
| `review` | Verdict (APPROVED/REJECTED), risk level, specific issues |
|
|
250
|
+
| `test` | Pass/fail counts, coverage percentage, failure messages |
|
|
251
|
+
| `diff` | Files changed, additions/deletions, contract change flags |
|
|
252
|
+
| `approval` | Stakeholder sign-off with notes |
|
|
253
|
+
| `retrospective` | Phase metrics: total tool calls, coder revisions, reviewer rejections, test failures, security findings, lessons learned |
|
|
248
254
|
|
|
249
|
-
|
|
255
|
+
Retrospectives from completed phases are injected as `[SWARM RETROSPECTIVE]` hints at the start of subsequent phases. The framework learns from its own history within a project.
|
|
250
256
|
|
|
251
257
|
---
|
|
252
258
|
|
|
253
|
-
## Heterogeneous Models
|
|
254
|
-
|
|
255
|
-
Most frameworks use one model for everything. Same blindspots everywhere.
|
|
259
|
+
## Heterogeneous Models
|
|
256
260
|
|
|
257
|
-
Swarm lets you
|
|
261
|
+
Single-model frameworks have correlated failure modes. The same model that writes the bug reviews it and misses it. Swarm lets you route each agent to the model it is best suited for:
|
|
258
262
|
|
|
259
263
|
```json
|
|
260
264
|
{
|
|
261
265
|
"agents": {
|
|
262
|
-
"architect": { "model": "anthropic/claude-
|
|
263
|
-
"
|
|
264
|
-
"
|
|
265
|
-
"sme": { "model": "
|
|
266
|
-
"
|
|
267
|
-
"
|
|
268
|
-
"test_engineer": { "model": "
|
|
266
|
+
"architect": { "model": "anthropic/claude-opus-4-6" },
|
|
267
|
+
"coder": { "model": "minimax-coding-plan/MiniMax-M2.5" },
|
|
268
|
+
"explorer": { "model": "minimax-coding-plan/MiniMax-M2.1" },
|
|
269
|
+
"sme": { "model": "kimi-for-coding/k2p5" },
|
|
270
|
+
"critic": { "model": "zai-coding-plan/glm-5" },
|
|
271
|
+
"reviewer": { "model": "zai-coding-plan/glm-5" },
|
|
272
|
+
"test_engineer": { "model": "minimax-coding-plan/MiniMax-M2.5" },
|
|
273
|
+
"docs": { "model": "zai-coding-plan/glm-4.7-flash" },
|
|
274
|
+
"designer": { "model": "kimi-for-coding/k2p5" }
|
|
269
275
|
}
|
|
270
276
|
}
|
|
271
277
|
```
|
|
272
278
|
|
|
273
|
-
|
|
274
|
-
|------|---------------|----------------------|
|
|
275
|
-
| Architect | Deep reasoning | Needs to plan complex work |
|
|
276
|
-
| Explorer | Fast scanning | Speed over depth |
|
|
277
|
-
| Coder | Implementation | Best coding model you have |
|
|
278
|
-
| SME | Domain knowledge | Fast recall, not deep reasoning |
|
|
279
|
-
| Reviewer | Finding flaws | **Different vendor catches different bugs** |
|
|
280
|
-
| Critic | Plan review | Catches scope issues before any code is written |
|
|
281
|
-
| Test Engineer | Test + run | Writes tests, runs them, reports PASS/FAIL |
|
|
282
|
-
|
|
283
|
-
**If Claude writes code and GPT reviews it, GPT catches Claude's blindspots.** This is why real teams have code review.
|
|
279
|
+
Reviewer uses a different model than Coder by design. Different training, different priors, different blind spots. This is the cheapest bug-catcher you will ever deploy.
|
|
284
280
|
|
|
285
281
|
---
|
|
286
282
|
|
|
287
|
-
##
|
|
283
|
+
## Guardrails
|
|
284
|
+
|
|
285
|
+
Every subagent runs inside a circuit breaker that kills runaway behavior before it burns credits on a stuck loop.
|
|
286
|
+
|
|
287
|
+
| Layer | Trigger | Action |
|
|
288
|
+
|-------|---------|--------|
|
|
289
|
+
| ⚠️ Soft Warning | 50% of any limit reached | Warning injected into agent stream |
|
|
290
|
+
| 🛑 Hard Block | 100% of any limit reached | All further tool calls blocked |
|
|
288
291
|
|
|
289
|
-
|
|
290
|
-
|
|
291
|
-
|
|
292
|
-
|
|
292
|
+
| Signal | Default | Description |
|
|
293
|
+
|--------|---------|-------------|
|
|
294
|
+
| Tool calls | 200 | Per-invocation, not per-session |
|
|
295
|
+
| Duration | 30 min | Wall-clock time per delegation |
|
|
296
|
+
| Repetition | 10 | Same tool + args consecutively |
|
|
297
|
+
| Consecutive errors | 5 | Sequential null/undefined outputs |
|
|
293
298
|
|
|
294
|
-
|
|
299
|
+
Limits are enforced **per-invocation**. Each delegation to a subagent starts a fresh budget. A coder fixing a second task is not penalized for the first task's tool calls. The Architect is exempt from all limits by default.
|
|
295
300
|
|
|
296
|
-
|
|
301
|
+
Per-agent profiles allow fine-grained overrides:
|
|
302
|
+
|
|
303
|
+
```jsonc
|
|
297
304
|
{
|
|
298
|
-
"
|
|
299
|
-
"
|
|
300
|
-
|
|
301
|
-
"
|
|
302
|
-
|
|
303
|
-
"coder": { "model": "anthropic/claude-sonnet-4-5" },
|
|
304
|
-
"sme": { "model": "google/gemini-2.0-flash" },
|
|
305
|
-
"reviewer": { "model": "openai/gpt-4o" }
|
|
306
|
-
}
|
|
307
|
-
},
|
|
308
|
-
"local": {
|
|
309
|
-
"name": "Local",
|
|
310
|
-
"agents": {
|
|
311
|
-
"architect": { "model": "ollama/qwen2.5:32b" },
|
|
312
|
-
"coder": { "model": "ollama/qwen2.5:32b" },
|
|
313
|
-
"sme": { "model": "ollama/qwen2.5:14b" },
|
|
314
|
-
"reviewer": { "model": "ollama/qwen2.5:14b" }
|
|
315
|
-
}
|
|
305
|
+
"guardrails": {
|
|
306
|
+
"max_tool_calls": 200,
|
|
307
|
+
"profiles": {
|
|
308
|
+
"coder": { "max_tool_calls": 500, "max_duration_minutes": 60 },
|
|
309
|
+
"explorer": { "max_tool_calls": 50 }
|
|
316
310
|
}
|
|
317
311
|
}
|
|
318
312
|
}
|
|
319
313
|
```
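
The interaction between base limits, per-agent profiles, and the two circuit-breaker layers can be sketched as follows. This is an illustration under assumptions, not the plugin's code: `resolveLimits`, `checkBudget`, and the `Usage` shape are hypothetical names, and only two of the four signals are modeled. The defaults (200 tool calls, 30 minutes) and the 50% / 100% thresholds are taken from the tables above.

```typescript
// Hypothetical sketch. Field names mirror the config keys shown above;
// the functions and the Usage type are illustrative only.
interface Limits {
  max_tool_calls: number;
  max_duration_minutes: number;
}

interface GuardrailsConfig extends Partial<Limits> {
  profiles?: Record<string, Partial<Limits>>;
}

const DEFAULTS: Limits = { max_tool_calls: 200, max_duration_minutes: 30 };
const WARNING_THRESHOLD = 0.5; // soft warning at 50% of any limit

// Precedence: built-in defaults < base config < per-agent profile.
function resolveLimits(config: GuardrailsConfig, agent: string): Limits {
  const { profiles, ...base } = config;
  return { ...DEFAULTS, ...base, ...(profiles?.[agent] ?? {}) };
}

type Verdict = "ok" | "warn" | "block";

// Usage counters for a single delegation; a new delegation starts from zero.
interface Usage {
  toolCalls: number;
  minutes: number;
}

// The most-consumed limit decides the verdict.
function checkBudget(usage: Usage, limits: Limits): Verdict {
  const ratio = Math.max(
    usage.toolCalls / limits.max_tool_calls,
    usage.minutes / limits.max_duration_minutes,
  );
  if (ratio >= 1) return "block";
  if (ratio >= WARNING_THRESHOLD) return "warn";
  return "ok";
}

const config: GuardrailsConfig = {
  max_tool_calls: 200,
  profiles: {
    coder: { max_tool_calls: 500, max_duration_minutes: 60 },
    explorer: { max_tool_calls: 50 },
  },
};

const coderLimits = resolveLimits(config, "coder");
const explorerLimits = resolveLimits(config, "explorer");
```

With this config the explorer inherits the 30-minute default but is capped at 50 tool calls, so in the sketch it would receive a warning at 25 calls and be blocked at 50, while the coder gets the larger budget from its profile.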
|
|
320
314
|
|
|
321
|
-
### What Gets Created
|
|
322
|
-
|
|
323
|
-
| Swarm | Agents |
|
|
324
|
-
|-------|--------|
|
|
325
|
-
| `cloud` (default) | `architect`, `explorer`, `coder`, `sme`, `reviewer`, `critic`, `test_engineer` |
|
|
326
|
-
| `local` | `local_architect`, `local_explorer`, `local_coder`, `local_sme`, `local_reviewer`, `local_critic`, `local_test_engineer` |
|
|
327
|
-
|
|
328
|
-
The first swarm (or one named "default") creates unprefixed agents. Additional swarms prefix all agent names.
|
|
329
|
-
|
|
330
|
-
### Usage
|
|
331
|
-
|
|
332
|
-
In OpenCode, you'll see multiple architects to choose from:
|
|
333
|
-
- `architect` - Cloud swarm (default)
|
|
334
|
-
- `local_architect` - Local swarm
|
|
335
|
-
|
|
336
|
-
Each architect automatically delegates to its own swarm's agents.
|
|
337
|
-
|
|
338
|
-
---
|
|
339
|
-
|
|
340
|
-
## Installation
|
|
341
|
-
|
|
342
|
-
```bash
|
|
343
|
-
# Install via CLI (recommended)
|
|
344
|
-
bunx opencode-swarm install
|
|
345
|
-
```
|
|
346
|
-
|
|
347
|
-
### Uninstall
|
|
348
|
-
|
|
349
|
-
```bash
|
|
350
|
-
# Remove from opencode.json
|
|
351
|
-
bunx opencode-swarm uninstall
|
|
352
|
-
|
|
353
|
-
# Remove from opencode.json + clean up config files
|
|
354
|
-
bunx opencode-swarm uninstall --clean
|
|
355
|
-
```
|
|
356
|
-
|
|
357
|
-
---
|
|
358
|
-
|
|
359
|
-
## What's New
|
|
360
|
-
|
|
361
|
-
### v6.2.0 — System Intelligence
|
|
362
|
-
- **Retrospective evidence** — New evidence type that captures phase metrics (tool calls, revisions, rejections, test failures, security findings) and lessons learned. Architect writes it after each phase; system enhancer injects the most recent one as a `[SWARM RETROSPECTIVE]` hint for the next phase, enabling continuous improvement across phases.
|
|
363
|
-
- **Soft compaction advisory** — System enhancer injects a `[SWARM HINT]` when the architect's tool-call count crosses configurable thresholds (default 50/75/100/125/150). A `lastCompactionHint` guard prevents re-injection at the same threshold. Configurable via `compaction_advisory` block.
|
|
364
|
-
- **Coverage reporting** — Test engineer now reports line/branch/function coverage percentages and flags files below 70%. Architect uses this in Phase 5 step 5d to request additional test passes when coverage is insufficient.
|
|
365
|
-
- **111 new tests** — 1391 total tests across 62+ files (up from 1280 in v6.1.2).
|
|
366
|
-
|
|
367
|
-
### v6.1.2 — Guardrails Remediation
|
|
368
|
-
- **Fail-safe config validation** — Config validation failures now disable guardrails as a safety precaution (previously Zod defaults could silently re-enable them).
|
|
369
|
-
- **Architect exemption fix** — Architect/orchestrator sessions can no longer inherit 30-minute base limits during delegation race conditions.
|
|
370
|
-
- **Explicit disable always wins** — `guardrails.enabled: false` in config is now always honored, even when the config was loaded from file.
|
|
371
|
-
- **Internal map synchronization** — `startAgentSession()` now keeps `activeAgent` and `agentSessions` maps in sync for consistent state tracking.
|
|
372
|
-
|
|
373
|
-
### v6.1.1 — Security Fix & Tech Debt
|
|
374
|
-
- **Security hardening (`_loadedFromFile`)** — Fixed a critical vulnerability where an internal loader flag could be injected via JSON config to bypass guardrails. The flag is now purely internal and no longer part of the public schema.
|
|
375
|
-
- **TOCTOU protection** — Added atomic-style content checks in the config loader to prevent race conditions during file reads.
|
|
376
|
-
- **`retrieve_summary` tool** — Properly registered the retrieval tool, allowing agents to fetch full content from auto-summarized tool outputs.
|
|
377
|
-
- **92 new tests** — 1280 total tests across 57+ files (up from 1188 in v6.0.0).
|
|
378
|
-
|
|
379
|
-
### v6.1.0 — Docs & Design Agents
|
|
380
|
-
- **`docs` agent** — Dedicated documentation synthesizer that automatically updates READMEs, API docs, and guides during Phase 6.
|
|
381
|
-
- **`designer` agent** — UI/UX specification agent that generates component scaffolds before coding begins on UI-heavy tasks.
|
|
382
|
-
- **Heterogeneous model defaults** — Updated default models for new agents to use optimized Gemini models for speed and cost.
|
|
383
|
-
|
|
384
|
-
### v6.0.0 — Core QA & Security Gates
|
|
385
|
-
- **Dual-pass security reviewer** — After the general reviewer APPROVES, the architect automatically triggers a second security-only review pass when the changed file matches security-sensitive paths (`auth`, `crypto`, `session`, `token`, `middleware`, `api`, `security`) or the coder's output contains security keywords. Configurable via `review_passes` config.
|
|
386
|
-
- **Adversarial testing** — After verification tests PASS, the test engineer is re-delegated with adversarial-only framing: attack vectors, boundary violations, and injection attempts. Pure prompt engineering, no new infrastructure.
|
|
387
|
-
- **Integration impact analysis** — After the coder completes, the `diff` tool detects contract changes (exported functions, interfaces, types). If found, the explorer runs impact analysis across dependents before review begins.
|
|
388
|
-
- **`diff` tool** — New agent-accessible tool providing structured git diff with numstat parsing, contract change detection, configurable base ref (`HEAD`/staged/unstaged), path filtering, and 500-line truncation.
|
|
389
|
-
- **87 new tests** — 1188 total tests across 53+ files (up from 1101 in v5.2.0).
|
|
390
|
-
|
|
391
|
-
### v5.2.0 — Per-Invocation Guardrails
|
|
392
|
-
- **Per-invocation budget isolation** — Guardrail limits (tool calls, duration, errors) now reset with each agent delegation. Second invocation of the same agent gets a fresh budget, preventing false circuit breaker trips in long-running projects.
|
|
393
|
-
- **Architect protocol enforcement** — New mandatory QA gate rules: every coder task must go through reviewer approval + test_engineer verification before the next coder task. Protocol violations detected at runtime with warning injection.
|
|
394
|
-
- **Invocation window observability** — Circuit breaker logs now include `invocationId` and `windowKey` for precise debugging of which specific agent invocation hit limits.
|
|
395
|
-
- **67 new tests** — 1101 total tests across 48 files (up from 1034 in v5.1.x).
|
|
396
|
-
|
|
397
|
-
### v5.0.0 — Verifiable Execution
|
|
398
|
-
- **Canonical plan schema** — Machine-readable `plan.json` with Zod-validated `PlanSchema`/`TaskSchema`/`PhaseSchema`. Automatic migration from legacy `plan.md` format. Structured status tracking (`pending`, `in_progress`, `completed`, `blocked`).
|
|
399
|
-
- **Evidence bundles** — Per-task execution evidence persisted to `.swarm/evidence/`. Five evidence types: `review`, `test`, `diff`, `approval`, `note`. Sanitized task IDs, atomic writes, configurable size limits. `/swarm evidence` to view, `/swarm archive` to manage retention.
|
|
400
|
-
- **Per-agent guardrail profiles** — Override guardrail limits for individual agents via `guardrails.profiles`. `resolveGuardrailsConfig()` merges base + profile with per-agent specificity.
|
|
401
|
-
- **Context injection budget** — `max_injection_tokens` config controls how much context is injected into system prompts. Priority-ordered: phase → task → decisions → agent context. Lower-priority items dropped when budget exhausted.
|
|
402
|
-
- **Enhanced `/swarm agents`** — Agent count summary, `⚡ custom limits` indicator for profiled agents, guardrail profiles section.
|
|
403
|
-
- **Packaging smoke tests** — CI-safe `dist/` validation (8 tests).
|
|
404
|
-
- **151 new tests** — 1027 total tests across 44 files (up from 876 in v4.6.0).
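
The context injection budget in this release can be pictured as a greedy, priority-ordered fill. The following is a minimal sketch, not the plugin's implementation: `ContextItem`, `estimateTokens`, and `selectContext` are hypothetical names, and the 4-characters-per-token estimate is an assumption made only for illustration.

```typescript
// Hypothetical sketch: none of these names come from opencode-swarm itself.
type ContextKind = "phase" | "task" | "decisions" | "agent_context";

interface ContextItem {
  kind: ContextKind;
  text: string;
}

// Priority order as listed in the changelog entry: phase, task, decisions, agent context.
const PRIORITY: ContextKind[] = ["phase", "task", "decisions", "agent_context"];

// Assumption: roughly 4 characters per token. A real tokenizer would differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function selectContext(items: ContextItem[], maxInjectionTokens: number): ContextItem[] {
  const ordered = items
    .slice()
    .sort((a, b) => PRIORITY.indexOf(a.kind) - PRIORITY.indexOf(b.kind));
  const kept: ContextItem[] = [];
  let used = 0;
  for (const item of ordered) {
    const cost = estimateTokens(item.text);
    // Budget exhausted: drop this item and every lower-priority item after it.
    if (used + cost > maxInjectionTokens) break;
    kept.push(item);
    used += cost;
  }
  return kept;
}
```

Stopping at the first item that does not fit is one possible reading of "lower-priority items dropped"; a variant could instead skip oversized items and keep trying smaller ones.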
|
|
405
|
-
|
|
406
|
-
### v4.6.0 — Agent Guardrails
|
|
407
|
-
- **Circuit breaker** — Two-layer protection against runaway agents. Soft warning at 50% of limits, hard block at 100%. Prevents infinite loops and runaway API costs.
|
|
408
|
-
- **Detection signals** — Tool call count, wall-clock time, consecutive repetition, and consecutive error tracking per agent session.
|
|
409
|
-
- **Configurable limits** — All thresholds tunable via `guardrails` config: `max_tool_calls`, `max_duration_minutes`, `max_repetitions`, `max_consecutive_errors`, `warning_threshold`.
|
|
410
|
-
- **46 new tests** — 668 total tests across 30 files.
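
The two-layer check described above can be sketched as a comparison of each signal against its own limit. The config key names are the ones listed in this entry; the `Usage` shape, the `Verdict` type, and the `evaluate` function are hypothetical and only illustrate the idea.

```typescript
// Hypothetical sketch; the Usage/Verdict shapes are illustrative, not the plugin's internals.
interface Limits {
  max_tool_calls: number;
  max_duration_minutes: number;
  max_repetitions: number;
  max_consecutive_errors: number;
  warning_threshold: number; // fraction of a limit that triggers the soft warning
}

interface Usage {
  toolCalls: number;
  elapsedMinutes: number;
  repetitions: number;
  consecutiveErrors: number;
}

type Verdict = "ok" | "warn" | "block";

function evaluate(usage: Usage, limits: Limits): Verdict {
  // Express each signal as a fraction of its own limit.
  const ratios = [
    usage.toolCalls / limits.max_tool_calls,
    usage.elapsedMinutes / limits.max_duration_minutes,
    usage.repetitions / limits.max_repetitions,
    usage.consecutiveErrors / limits.max_consecutive_errors,
  ];
  const worst = Math.max.apply(null, ratios);
  if (worst >= 1) return "block"; // hard block once any limit is fully used
  if (worst >= limits.warning_threshold) return "warn"; // soft warning
  return "ok";
}
```

Taking the worst ratio means a single exhausted signal is enough to trip the breaker, which matches the "any limit" wording.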
|
|
411
|
-
|
|
412
|
-
### v4.5.0 — Tech Debt + New Commands
|
|
413
|
-
- **Lint cleanup** — Replaced string concatenation with template literals, documented `as any` casts with biome-ignore comments.
|
|
414
|
-
- **Code deduplication** — Extracted `stripSwarmPrefix()` utility to eliminate 3 duplicate prefix-stripping blocks.
|
|
415
|
-
- **`/swarm diagnose`** — Health check for `.swarm/` files, plan structure, and plugin configuration.
|
|
416
|
-
- **`/swarm export`** — Export plan.md and context.md as portable JSON.
|
|
417
|
-
- **`/swarm reset --confirm`** — Clear swarm state files with safety confirmation.
|
|
418
|
-
|
|
419
|
-
### v4.4.0 — DX & Quality
|
|
420
|
-
- **CLI `uninstall` command** — Remove plugin with optional `--clean` flag.
|
|
421
|
-
- **Custom error classes** — `SwarmError` hierarchy with actionable `guidance` messages.
|
|
422
|
-
- **`/swarm history`** — View completed phases from plan.md.
|
|
423
|
-
- **`/swarm config`** — View current resolved plugin configuration.
|
|
424
|
-
|
|
425
|
-
### v4.3.2 — Security Hardening
|
|
426
|
-
- **Path validation** — `validateSwarmPath()` prevents directory traversal in `.swarm/` file operations.
|
|
427
|
-
- **Fetch hardening** — 10s timeout, 5MB limit, retry logic for gitingest tool.
|
|
428
|
-
- **Config limits** — Deep merge depth limit (10), config file size limit (100KB).
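
A directory-traversal guard of the kind mentioned above usually resolves the requested path and checks that it still sits under the allowed root. The sketch below is an assumption about how such a check could look; `resolveInsideSwarm` is a hypothetical helper, and the real `validateSwarmPath()` may behave differently.

```typescript
import * as path from "node:path";

// Hypothetical sketch of a traversal guard; not the plugin's validateSwarmPath().
function resolveInsideSwarm(projectRoot: string, relative: string): string {
  const swarmDir = path.resolve(projectRoot, ".swarm");
  const target = path.resolve(swarmDir, relative);
  const rel = path.relative(swarmDir, target);
  // Reject anything that resolves to the parent of .swarm/ or beyond,
  // including absolute inputs and (on Windows) paths on another drive.
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error("Path escapes .swarm/: " + relative);
  }
  return target;
}
```

Comparing resolved paths rather than scanning the input for `..` avoids false negatives from inputs such as `evidence/../../secrets.txt`.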
|
|
429
|
-
|
|
430
|
-
### v4.3.0 — Hooks & Agent Awareness
|
|
431
|
-
- **Hooks pipeline** — `safeHook()` crash-safe wrapper, `composeHandlers()` for multi-handler composition.
|
|
432
|
-
- **Context pruning** — Token budget tracking with 70%/90% threshold warnings.
|
|
433
|
-
- **Slash commands** — `/swarm status`, `/swarm plan`, `/swarm agents`.
|
|
434
|
-
- **Agent awareness** — Activity tracking, delegation tracking, cross-agent context injection.
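
The hooks pipeline above pairs a crash-safe wrapper with handler composition. The sketch below reuses the names `safeHook` and `composeHandlers` from this entry, but the signatures are assumptions, and the handlers are synchronous only to keep the example short; the plugin's real hooks may well be asynchronous.

```typescript
// Hypothetical sketch; signatures are assumptions, not the plugin's actual exports.
type Handler<I, O> = (input: I, output: O) => void;

// Wrap a handler so that a thrown error is logged instead of crashing the host.
function safeHook<I, O>(fn: Handler<I, O>): Handler<I, O> {
  return (input, output) => {
    try {
      fn(input, output);
    } catch (err) {
      console.warn("hook failed:", err);
    }
  };
}

// Run several handlers in order against the same input/output pair.
// Each one is wrapped, so a failure in one does not stop the rest.
function composeHandlers<I, O>(...handlers: Handler<I, O>[]): Handler<I, O> {
  const wrapped = handlers.map((h) => safeHook(h));
  return (input, output) => {
    for (const h of wrapped) h(input, output);
  };
}
```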
|
|
435
|
-
|
|
436
|
-
All features are opt-in via configuration. See [Installation Guide](docs/installation.md) for config options.
|
|
437
|
-
|
|
438
315
|
---
|
|
439
316
|
|
|
440
|
-
##
|
|
441
|
-
|
|
442
|
-
### 🎯 Orchestrator
|
|
443
|
-
| Agent | Role |
|
|
444
|
-
|-------|------|
|
|
445
|
-
| `architect` | Central coordinator. Plans phases, delegates tasks, manages QA, maintains project memory. |
|
|
446
|
-
|
|
447
|
-
### 🔍 Discovery
|
|
448
|
-
| Agent | Role |
|
|
449
|
-
|-------|------|
|
|
450
|
-
| `explorer` | Fast codebase scanner. Identifies structure, languages, frameworks, key files. |
|
|
451
|
-
|
|
452
|
-
### 🎨 Design
|
|
453
|
-
| Agent | Role |
|
|
454
|
-
|-------|------|
|
|
455
|
-
| `designer` | UI/UX specification agent. Generates component scaffolds and design tokens before coding begins on UI-heavy tasks. |
|
|
456
|
-
|
|
457
|
-
### 🧠 Domain Expert
|
|
458
|
-
| Agent | Role |
|
|
459
|
-
|-------|------|
|
|
460
|
-
| `sme` | Open-domain expert. The architect specifies any domain (security, python, ios, rust, kubernetes, etc.) per call. No hardcoded list — works with any domain the LLM has knowledge of. |
|
|
461
|
-
|
|
462
|
-
### 💻 Implementation
|
|
463
|
-
| Agent | Role |
|
|
464
|
-
|-------|------|
|
|
465
|
-
| `coder` | Implements ONE task at a time with full context |
|
|
466
|
-
| `test_engineer` | Generates tests, runs them, and reports structured PASS/FAIL verdicts |
|
|
467
|
-
|
|
468
|
-
### ✅ Quality Assurance
|
|
469
|
-
| Agent | Role |
|
|
470
|
-
|-------|------|
|
|
471
|
-
| `reviewer` | Dual-pass review: correctness review first, then automatic security-only pass for security-sensitive files. The architect specifies CHECK dimensions per call. OWASP Top 10 categories built in. |
|
|
472
|
-
| `critic` | Plan review gate. Reviews the architect's plan BEFORE implementation — checks completeness, feasibility, scope, dependencies, and flags AI-slop. |
|
|
317
|
+
## Comparison
|
|
473
318
|
|
|
474
|
-
|
|
475
|
-
|
|
476
|
-
|
|
477
|
-
|
|
|
319
|
+
| Feature | OpenCode Swarm | oh-my-opencode | get-shit-done | AutoGen | CrewAI |
|
|
320
|
+
|---------|:-:|:-:|:-:|:-:|:-:|
|
|
321
|
+
| Multi-agent orchestration | ✅ 9 specialized agents | ❌ Prompt config only | ❌ Single-agent macros | ✅ | ✅ |
|
|
322
|
+
| Execution model | Serial (deterministic) | N/A | N/A | Parallel (chaotic) | Parallel |
|
|
323
|
+
| Phased planning with acceptance criteria | ✅ | ❌ | ❌ | ❌ | ❌ |
|
|
324
|
+
| Critic gate before implementation | ✅ | ❌ | ❌ | ❌ | ❌ |
|
|
325
|
+
| Per-task dual-pass review (correctness + security) | ✅ | ❌ | ❌ | Optional | Optional |
|
|
326
|
+
| Adversarial test pass per task | ✅ | ❌ | ❌ | ❌ | ❌ |
|
|
327
|
+
| Pre-reviewer pipeline (lint, secretscan, imports) | ✅ v6.3 | ❌ | ❌ | ❌ | ❌ |
|
|
328
|
+
| Persistent session memory | ✅ `.swarm/` files | ❌ | ❌ | Session only | Session only |
|
|
329
|
+
| Resume projects across sessions | ✅ Native | ❌ | ❌ | ❌ | ❌ |
|
|
330
|
+
| Evidence trail per task | ✅ Structured bundles | ❌ | ❌ | ❌ | ❌ |
|
|
331
|
+
| Heterogeneous model routing | ✅ Per-agent | ❌ | ❌ | Limited | Limited |
|
|
332
|
+
| Circuit breaker / guardrails | ✅ Per-invocation | ❌ | ❌ | ❌ | ❌ |
|
|
333
|
+
| Open-domain SME consultation | ✅ Any domain | ❌ | ❌ | ❌ | ❌ |
|
|
334
|
+
| Retrospective learning across phases | ✅ | ❌ | ❌ | ❌ | ❌ |
|
|
335
|
+
| Slash commands + diagnostics | ✅ 12 commands | ❌ | Limited | ❌ | ❌ |
|
|
478
336
|
|
|
479
337
|
---
|
|
480
338
|
|
|
@@ -482,236 +340,141 @@ All features are opt-in via configuration. See [Installation Guide](docs/install
|
|
|
482
340
|
|
|
483
341
|
| Command | Description |
|
|
484
342
|
|---------|-------------|
|
|
485
|
-
| `/swarm status` | Current phase, task progress,
|
|
486
|
-
| `/swarm plan [N]` |
|
|
487
|
-
| `/swarm agents` |
|
|
488
|
-
| `/swarm history` |
|
|
489
|
-
| `/swarm config` |
|
|
490
|
-
| `/swarm diagnose` | Health check for
|
|
343
|
+
| `/swarm status` | Current phase, task progress, agent count |
|
|
344
|
+
| `/swarm plan [N]` | Full plan or filtered by phase |
|
|
345
|
+
| `/swarm agents` | All registered agents with models and permissions |
|
|
346
|
+
| `/swarm history` | Completed phases with status |
|
|
347
|
+
| `/swarm config` | Current resolved configuration |
|
|
348
|
+
| `/swarm diagnose` | Health check for `.swarm/` files and config |
|
|
491
349
|
| `/swarm export` | Export plan and context as portable JSON |
|
|
492
|
-
| `/swarm
|
|
493
|
-
| `/swarm
|
|
494
|
-
| `/swarm
|
|
495
|
-
| `/swarm
|
|
496
|
-
| `/swarm
|
|
350
|
+
| `/swarm evidence [task]` | Evidence bundles for a task or all tasks |
|
|
351
|
+
| `/swarm archive [--dry-run]` | Archive old evidence with retention policy |
|
|
352
|
+
| `/swarm benchmark` | Performance benchmarks |
|
|
353
|
+
| `/swarm retrieve [id]` | Retrieve auto-summarized tool outputs |
|
|
354
|
+
| `/swarm reset --confirm` | Clear swarm state files |
|
|
497
355
|
|
|
498
356
|
---
|
|
499
357
|
|
|
500
358
|
## Configuration
|
|
501
359
|
|
|
502
|
-
Create `~/.config/opencode/opencode-swarm.json`:
|
|
503
|
-
|
|
504
360
|
```json
|
|
505
361
|
{
|
|
506
362
|
"agents": {
|
|
507
|
-
"architect": { "model": "anthropic/claude-
|
|
508
|
-
"
|
|
509
|
-
"
|
|
510
|
-
"sme": { "model": "
|
|
511
|
-
"
|
|
512
|
-
"
|
|
513
|
-
"test_engineer": { "model": "
|
|
514
|
-
"docs": { "model": "
|
|
515
|
-
"designer": { "model": "
|
|
516
|
-
}
|
|
517
|
-
}
|
|
518
|
-
```
|
|
519
|
-
|
|
520
|
-
### Disable Agents
|
|
521
|
-
```json
|
|
522
|
-
{
|
|
523
|
-
"sme": { "disabled": true },
|
|
524
|
-
"test_engineer": { "disabled": true }
|
|
525
|
-
}
|
|
526
|
-
```
|
|
527
|
-
|
|
528
|
-
---
|
|
529
|
-
|
|
530
|
-
## Guardrails
|
|
531
|
-
|
|
532
|
-
OpenCode Swarm includes a built-in circuit breaker that prevents subagents from running away — burning API credits in infinite loops, repeating the same tool call, or spinning for hours.
|
|
533
|
-
|
|
534
|
-
### How It Works
|
|
535
|
-
|
|
536
|
-
| Layer | Trigger | Action |
|
|
537
|
-
|-------|---------|--------|
|
|
538
|
-
| ⚠️ **Soft Warning** | 50% of any limit reached | Injects warning message into agent's chat stream |
|
|
539
|
-
| 🛑 **Hard Block** | 100% of any limit reached | Blocks ALL further tool calls + injects stop message |
|
|
540
|
-
|
|
541
|
-
### Detection Signals
|
|
542
|
-
|
|
543
|
-
| Signal | Default Limit | Description |
|
|
544
|
-
|--------|---------------|-------------|
|
|
545
|
-
| Tool calls | 200 | Total tool invocations per agent session |
|
|
546
|
-
| Duration | 30 min | Wall-clock time since delegation started |
|
|
547
|
-
| Repetition | 10 | Same tool + args called consecutively |
|
|
548
|
-
| Consecutive errors | 5 | Sequential null/undefined tool outputs |
|
|
549
|
-
|
|
550
|
-
### Configuration
|
|
551
|
-
|
|
552
|
-
Guardrails are **enabled by default**. Customize in your swarm config:
|
|
553
|
-
|
|
554
|
-
```jsonc
|
|
555
|
-
{
|
|
556
|
-
"guardrails": {
|
|
557
|
-
"enabled": true, // default: true
|
|
558
|
-
"max_tool_calls": 200, // range: 10–1000
|
|
559
|
-
"max_duration_minutes": 30, // range: 1–120
|
|
560
|
-
"max_repetitions": 10, // range: 3–50
|
|
561
|
-
"max_consecutive_errors": 5, // range: 2–20
|
|
562
|
-
"warning_threshold": 0.5 // range: 0.1–0.9 (fraction of limit for soft warning)
|
|
563
|
-
}
|
|
564
|
-
}
|
|
565
|
-
```
|
|
566
|
-
|
|
567
|
-
### Per-Agent Profiles
|
|
568
|
-
|
|
569
|
-
Override limits for specific agents that need more (or less) room:
|
|
570
|
-
|
|
571
|
-
```jsonc
|
|
572
|
-
{
|
|
363
|
+
"architect": { "model": "anthropic/claude-opus-4-6" },
|
|
364
|
+
"coder": { "model": "minimax-coding-plan/MiniMax-M2.5" },
|
|
365
|
+
"explorer": { "model": "minimax-coding-plan/MiniMax-M2.1" },
|
|
366
|
+
"sme": { "model": "kimi-for-coding/k2p5" },
|
|
367
|
+
"critic": { "model": "zai-coding-plan/glm-5" },
|
|
368
|
+
"reviewer": { "model": "zai-coding-plan/glm-5" },
|
|
369
|
+
"test_engineer": { "model": "minimax-coding-plan/MiniMax-M2.5" },
|
|
370
|
+
"docs": { "model": "zai-coding-plan/glm-4.7-flash" },
|
|
371
|
+
"designer": { "model": "kimi-for-coding/k2p5" }
|
|
372
|
+
},
|
|
573
373
|
"guardrails": {
|
|
574
374
|
"max_tool_calls": 200,
|
|
375
|
+
"max_duration_minutes": 30,
|
|
575
376
|
"profiles": {
|
|
576
|
-
"coder": { "max_tool_calls": 500
|
|
577
|
-
"explorer": { "max_tool_calls": 50 }
|
|
377
|
+
"coder": { "max_tool_calls": 500 }
|
|
578
378
|
}
|
|
379
|
+
},
|
|
380
|
+
"review_passes": {
|
|
381
|
+
"always_security_review": false,
|
|
382
|
+
"security_globs": ["**/*auth*", "**/*crypto*", "**/*session*", "**/*token*"]
|
|
579
383
|
}
|
|
580
384
|
}
|
|
581
385
|
```
|
|
582
386
|
|
|
583
|
-
|
|
584
|
-
|
|
585
|
-
### Review Passes
|
|
387
|
+
Save this to `~/.config/opencode/opencode-swarm.json` (global) or to `.opencode/swarm.json` in your project root. Project config is deep-merged over global config, so partial overrides do not clobber unspecified fields.
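
To make the merge behavior concrete, here is a minimal sketch of deep-merge semantics. It is not the plugin's merge function: `deepMerge` and `isPlainObject` are hypothetical helpers, and the depth limit of 10 simply mirrors the limit mentioned in the v4.3.2 changelog entry.

```typescript
// Hypothetical sketch of deep-merge semantics; the plugin's own merge may differ.
type Json = { [key: string]: unknown };

function isPlainObject(v: unknown): v is Json {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function deepMerge(base: Json, override: Json, depth = 0): Json {
  if (depth > 10) throw new Error("config nesting too deep");
  const out: Json = { ...base };
  for (const key of Object.keys(override)) {
    const existing = out[key];
    const incoming = override[key];
    out[key] =
      isPlainObject(existing) && isPlainObject(incoming)
        ? deepMerge(existing, incoming, depth + 1) // merge nested objects key by key
        : incoming; // scalars and arrays replace the base value wholesale
  }
  return out;
}

// Global config sets two guardrail limits; the project overrides only one.
const globalConfig = { guardrails: { max_tool_calls: 200, max_duration_minutes: 30 } };
const projectConfig = { guardrails: { max_tool_calls: 500 } };
const merged = deepMerge(globalConfig, projectConfig);
```

In this sketch `merged.guardrails` keeps `max_duration_minutes` from the global file while taking `max_tool_calls` from the project file. Whether arrays such as `security_globs` are replaced or concatenated by the real plugin is not stated here; the sketch assumes replacement.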
|
|
586
388
|
|
|
587
|
-
|
|
389
|
+
### Disabling Agents
|
|
588
390
|
|
|
589
|
-
```
|
|
391
|
+
```json
|
|
590
392
|
{
|
|
591
|
-
"
|
|
592
|
-
|
|
593
|
-
|
|
594
|
-
"**/*auth*", "**/*crypto*",
|
|
595
|
-
"**/*session*", "**/*token*",
|
|
596
|
-
"**/*middleware*", "**/*api*",
|
|
597
|
-
"**/*security*"
|
|
598
|
-
]
|
|
599
|
-
}
|
|
393
|
+
"sme": { "disabled": true },
|
|
394
|
+
"designer": { "disabled": true },
|
|
395
|
+
"test_engineer": { "disabled": true }
|
|
600
396
|
}
|
|
601
397
|
```
|
|
602
398
|
|
|
603
|
-
|
|
399
|
+
---
|
|
604
400
|
|
|
605
|
-
|
|
401
|
+
## Installation
|
|
606
402
|
|
|
607
|
-
|
|
403
|
+
```bash
|
|
404
|
+
# Install globally
|
|
405
|
+
npm install -g opencode-swarm
|
|
608
406
|
|
|
609
|
-
|
|
610
|
-
|
|
611
|
-
"integration_analysis": {
|
|
612
|
-
"enabled": true // default: true
|
|
613
|
-
}
|
|
614
|
-
}
|
|
615
|
-
```
|
|
407
|
+
# Or use npx
|
|
408
|
+
npx opencode-swarm install
|
|
616
409
|
|
|
617
|
-
|
|
410
|
+
# Verify
|
|
411
|
+
opencode # then: /swarm diagnose
|
|
412
|
+
```
|
|
618
413
|
|
|
619
|
-
|
|
414
|
+
The installer configures `opencode.json` to include the plugin automatically. To configure it manually:
|
|
620
415
|
|
|
621
|
-
```
|
|
416
|
+
```json
|
|
622
417
|
{
|
|
623
|
-
"
|
|
624
|
-
"enabled": true, // default: true
|
|
625
|
-
"thresholds": [50, 75, 100, 125, 150], // tool-call counts that trigger hints
|
|
626
|
-
"message": "Large context may benefit from compaction" // custom message
|
|
627
|
-
}
|
|
418
|
+
"plugins": ["opencode-swarm"]
|
|
628
419
|
}
|
|
629
420
|
```
|
|
630
421
|
|
|
631
|
-
|
|
632
|
-
|
|
633
|
-
> **Architect is exempt/unlimited by default:** The architect agent has no guardrail limits by default. To override, add a `profiles.architect` entry in your guardrails config.
|
|
634
|
-
|
|
635
|
-
### Per-Invocation Budgets
|
|
636
|
-
|
|
637
|
-
Guardrail limits are enforced **per-invocation**, not per-session. Each time the architect delegates to an agent, that agent gets a fresh budget of tool calls, duration, and error tolerance.
|
|
638
|
-
|
|
639
|
-
**Example**: If `max_tool_calls: 200`, then:
|
|
640
|
-
- Architect → Coder (task 1) → 200 calls available
|
|
641
|
-
- Coder finishes → Architect → Coder (task 2) → 200 calls available again
|
|
642
|
-
|
|
643
|
-
This prevents long-running projects from accumulating session-wide counters that incorrectly trip the circuit breaker on later tasks.
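
One way to picture per-invocation budgets is a counter keyed by invocation window rather than by session. This is a sketch under stated assumptions: the `InvocationBudgets` class is hypothetical, and building the window key from agent name plus invocation id is a guess at what `windowKey` might contain.

```typescript
// Hypothetical sketch; the class and the key format are assumptions.
class InvocationBudgets {
  private counts: Record<string, number> = {};
  private maxToolCalls: number;

  constructor(maxToolCalls: number) {
    this.maxToolCalls = maxToolCalls;
  }

  // Returns false once the given invocation window has exceeded its budget.
  recordToolCall(agent: string, invocationId: number): boolean {
    const windowKey = agent + ":" + invocationId;
    const used = (this.counts[windowKey] || 0) + 1;
    this.counts[windowKey] = used;
    return used <= this.maxToolCalls;
  }
}
```

Because each delegation gets a new `invocationId`, a later task starts from zero even if an earlier task by the same agent used its whole budget.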
|
|
422
|
+
---
|
|
644
423
|
|
|
645
|
-
|
|
424
|
+
## Testing
|
|
646
425
|
|
|
647
|
-
|
|
426
|
+
2031 tests across 78 files, spanning unit, integration, adversarial, and smoke suites. They cover config schemas, all agent prompts, all hooks, all tools, all commands, the guardrail circuit breaker, race conditions, invocation window isolation, multi-invocation state, security category classification, and evidence validation.
|
|
648
427
|
|
|
649
|
-
```
|
|
650
|
-
|
|
651
|
-
"guardrails": {
|
|
652
|
-
"enabled": false
|
|
653
|
-
}
|
|
654
|
-
}
|
|
428
|
+
```bash
|
|
429
|
+
bun test
|
|
655
430
|
```
|
|
656
431
|
|
|
657
|
-
|
|
432
|
+
No additional test dependencies are required: the suite uses Bun's built-in test runner.
|
|
658
433
|
|
|
659
|
-
|
|
434
|
+
---
|
|
660
435
|
|
|
661
|
-
|
|
662
|
-
|---------|---------------|---------|--------|-----------|
|
|
663
|
-
| Execution | Serial (predictable) | Parallel (chaotic) | Parallel | Configurable |
|
|
664
|
-
| Planning | Phased with acceptance criteria | Ad-hoc | Role-based | Graph-based |
|
|
665
|
-
| Memory | Persistent `.swarm/` files | Session only | Session only | Checkpoints |
|
|
666
|
-
| QA | Dual-pass per-task (review + security + adversarial) | Optional | Optional | Manual |
|
|
667
|
-
| Model mixing | Per-agent configuration | Limited | Limited | Manual |
|
|
668
|
-
| Resume projects | ✅ Native | ❌ | ❌ | Partial |
|
|
669
|
-
| SME domains | Open-domain (any) | Generic | Generic | Generic |
|
|
670
|
-
| Task granularity | One at a time | Batched | Batched | Varies |
|
|
436
|
+
## Roadmap
|
|
671
437
|
|
|
672
|
-
|
|
438
|
+
### v6.3 — Pre-Reviewer Pipeline
|
|
673
439
|
|
|
674
|
-
|
|
440
|
+
Three new tools complete the pre-reviewer gauntlet. Code reaching the Reviewer is already clean.
|
|
675
441
|
|
|
676
|
-
|
|
677
|
-
|
|
678
|
-
|
|
679
|
-
4. **Cache SME knowledge** - Don't re-ask answered questions
|
|
680
|
-
5. **Persistent memory** - `.swarm/` files survive sessions
|
|
681
|
-
6. **Serial execution** - Predictable, debuggable, no race conditions
|
|
682
|
-
7. **Heterogeneous models** - Different perspectives catch different bugs
|
|
683
|
-
8. **User checkpoints** - Confirm before proceeding to next phase
|
|
684
|
-
9. **Failure tracking** - Document rejections, escalate after 5 attempts
|
|
685
|
-
10. **Resumable by design** - Any Architect can pick up any project
|
|
442
|
+
- **`imports`** — AST-based import graph. For each file changed by the coder, returns every consumer file, which exports each consumer uses, and the line numbers. Replaces fragile grep-based integration analysis with deterministic graph traversal.
|
|
443
|
+
- **`lint`** — Auto-detects project linter (Biome, ESLint, Ruff, Clippy, PSScriptAnalyzer). Runs in fix mode first, then check mode. Structured diagnostic output per file.
|
|
444
|
+
- **`secretscan`** — Entropy-based credential scanner. Detects API keys, tokens, connection strings, and private key headers in the diff before they reach the reviewer. Zero external dependencies.
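
The entropy idea behind a scanner like `secretscan` can be illustrated with Shannon entropy over the characters of a candidate token. This is only a sketch of the general technique: `shannonEntropy` and `looksLikeSecret` are hypothetical helpers, and the length and entropy thresholds are assumptions, not the tool's actual values.

```typescript
// Hypothetical sketch of entropy-based detection; thresholds are assumptions.
function shannonEntropy(s: string): number {
  const chars = s.split("");
  if (chars.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (const ch of chars) counts[ch] = (counts[ch] || 0) + 1;
  let bits = 0;
  for (const ch of Object.keys(counts)) {
    const p = counts[ch] / chars.length;
    bits -= p * Math.log2(p); // contribution of this character, in bits
  }
  return bits;
}

// Flag long tokens whose characters look close to random.
function looksLikeSecret(token: string, minLength = 20, minBits = 4.0): boolean {
  return token.length >= minLength && shannonEntropy(token) >= minBits;
}
```

Ordinary prose and identifiers reuse a small set of characters and score low, while generated keys tend to score high. Entropy alone produces false positives (hashes, minified code), so it is typically combined with pattern checks such as key prefixes or private key headers.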
|
|
686
445
|
|
|
687
|
-
|
|
446
|
+
The Phase 5 execute loop becomes: `coder → diff → imports → lint fix → lint check → secretscan → reviewer → security reviewer → test_engineer → adversarial test_engineer`.
|
|
688
447
|
|
|
689
|
-
|
|
448
|
+
### v6.4 — Execution and Planning Tools
|
|
690
449
|
|
|
691
|
-
|
|
692
|
-
|
|
693
|
-
|
|
450
|
+
- **`test_runner`** — Unified test execution across Bun, Vitest, Jest, Mocha, pytest, cargo test, and Pester. Auto-detects framework, returns normalized JSON with pass/fail/skip counts and coverage. Three scope modes: `all`, `convention` (naming-based), `graph` (import-graph-based). Eliminates the test_engineer's most common failure mode.
|
|
451
|
+
- **`symbols`** — Export inventory for a module: functions, classes, interfaces, types, enums. Gives the Architect instant visibility into a file's public API surface without reading the full source.
|
|
452
|
+
- **`checkpoint`** — Git-backed save points. Before any multi-file refactor (≥3 files), Architect auto-creates a checkpoint commit. On critical integration failure, restores via soft reset instead of iterating into a hole.
|
|
694
453
|
|
|
695
|
-
|
|
696
|
-
bun test tests/unit/config/schema.test.ts
|
|
697
|
-
```
|
|
454
|
+
### v6.5 — Intelligence and Audit Tools
|
|
698
455
|
|
|
699
|
-
|
|
456
|
+
Five tools that improve planning quality and post-phase validation:
|
|
700
457
|
|
|
701
|
-
|
|
458
|
+
- **`pkg_audit`** — Wraps `npm audit`, `pip-audit`, `cargo audit`. Structured CVE output with severity, patched versions, and advisory URLs. Fed to the security reviewer for concrete vulnerability context.
|
|
459
|
+
- **`complexity_hotspots`** — Git churn × cyclomatic complexity risk map. Run in Phase 0/2 to identify modules that need stricter QA gates before implementation begins.
|
|
460
|
+
- **`schema_drift`** — Compares OpenAPI spec against actual route implementations. Surfaces undocumented routes and phantom spec paths. Run in Phase 6 when API routes were modified.
|
|
461
|
+
- **`todo_extract`** — Structured extraction of `TODO`, `FIXME`, and `HACK` annotations across the codebase. High-priority items fed directly into plan task candidates.
|
|
462
|
+
- **`evidence_check`** — Audits completed tasks against required evidence types. Run in Phase 6 to verify every task has review and test evidence before the phase is marked complete.
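
The "churn × complexity" risk map behind `complexity_hotspots` can be sketched as a simple product and sort. The `FileMetrics` shape and the `rankHotspots` function are hypothetical, and the sketch assumes churn (for example, commit count) and cyclomatic complexity have already been measured elsewhere.

```typescript
// Hypothetical sketch; how the tool gathers churn and complexity is not shown here.
interface FileMetrics {
  path: string;
  churn: number; // assumption: number of commits touching the file
  complexity: number; // assumption: cyclomatic complexity of the file
}

interface Hotspot extends FileMetrics {
  risk: number;
}

function rankHotspots(files: FileMetrics[], top = 5): Hotspot[] {
  return files
    .map((f) => ({ ...f, risk: f.churn * f.complexity }))
    .sort((a, b) => b.risk - a.risk) // highest risk first
    .slice(0, top);
}
```

Multiplying the two signals favors files that are both complex and frequently changed; a file that is complex but stable, or busy but trivial, ranks lower.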
|
|
702
463
|
|
|
703
|
-
|
|
704
|
-
1. Verify `opencode-swarm` is listed in your `opencode.json` plugins array
|
|
705
|
-
2. Run `bunx opencode-swarm install` to auto-configure
|
|
706
|
-
3. Run `/swarm diagnose` to check health status
|
|
464
|
+
---
|
|
707
465
|
|
|
708
|
-
|
|
709
|
-
- Ensure you're using `/swarm <command>`, not `/swarm/<command>`
|
|
710
|
-
- Run `/swarm` with no arguments to see available commands
|
|
466
|
+
## Design Principles
|
|
711
467
|
|
|
712
|
-
|
|
713
|
-
|
|
714
|
-
|
|
468
|
+
1. **Plan before code** — Documented phases with acceptance criteria. The Critic approves the plan before a single line is written.
|
|
469
|
+
2. **One task at a time** — The Coder gets one task and full context. Nothing else.
|
|
470
|
+
3. **Review everything immediately** — Every task goes through correctness review, security review, verification tests, and adversarial tests. No task ships without passing all four.
|
|
471
|
+
4. **Cache SME knowledge** — Guidance is written to `context.md`. The same domain question is never asked twice in a project.
|
|
472
|
+
5. **Persistent memory** — `.swarm/` files are the ground truth. Any session, any model, any day.
|
|
473
|
+
6. **Serial execution** — Predictable, debuggable, no race conditions, no conflicting writes.
|
|
474
|
+
7. **Heterogeneous models** — Different models, different blind spots. The coder's bug is the reviewer's catch.
|
|
475
|
+
8. **User checkpoints** — Phase transitions require user confirmation. No unsupervised multi-phase runs.
|
|
476
|
+
9. **Document failures** — Rejections and retries are recorded in plan.md. After 5 failed attempts, the task escalates to the user.
|
|
477
|
+
10. **Resumable by design** — A cold-start Architect can read `.swarm/` and continue any project as if it had been there from the beginning.
|
|
715
478
|
|
|
716
479
|
---
|
|
717
480
|
|
|
@@ -730,5 +493,5 @@ MIT
|
|
|
730
493
|
---
|
|
731
494
|
|
|
732
495
|
<p align="center">
|
|
733
|
-
<strong>Stop hoping your agents figure it out. Start shipping code that works.</strong>
|
|
496
|
+
<strong>Stop hoping your agents figure it out. Start shipping code that actually works.</strong>
|
|
734
497
|
</p>
|