opencode-swarm 6.1.2 → 6.3.0
- package/README.md +310 -510
- package/dist/config/evidence-schema.d.ts +94 -0
- package/dist/config/schema.d.ts +53 -0
- package/dist/index.js +1443 -55
- package/dist/state.d.ts +2 -0
- package/dist/tools/imports.d.ts +5 -0
- package/dist/tools/index.d.ts +3 -0
- package/dist/tools/lint.d.ts +34 -0
- package/dist/tools/secretscan.d.ts +31 -0
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -1,193 +1,199 @@
|
|
|
1
1
|
<p align="center">
|
|
2
|
-
|
|
2
|
+
<img src="https://img.shields.io/badge/version-6.3.0-blue" alt="Version">
|
|
3
3
|
<img src="https://img.shields.io/badge/license-MIT-green" alt="License">
|
|
4
4
|
<img src="https://img.shields.io/badge/opencode-plugin-purple" alt="OpenCode Plugin">
|
|
5
5
|
<img src="https://img.shields.io/badge/agents-9-orange" alt="Agents">
|
|
6
|
-
<img src="https://img.shields.io/badge/tests-
|
|
6
|
+
<img src="https://img.shields.io/badge/tests-1391-brightgreen" alt="Tests">
|
|
7
7
|
</p>
|
|
8
8
|
|
|
9
9
|
<h1 align="center">🐝 OpenCode Swarm</h1>
|
|
10
10
|
|
|
11
11
|
<p align="center">
|
|
12
|
-
<strong>
|
|
13
|
-
|
|
12
|
+
<strong>A structured multi-agent coding framework for OpenCode.</strong><br>
|
|
13
|
+
Nine specialized agents. Persistent memory. A QA gate on every task. Code that ships.
|
|
14
14
|
</p>
|
|
15
15
|
|
|
16
16
|
<p align="center">
|
|
17
|
-
<a href="#
|
|
17
|
+
<a href="#the-problem">The Problem</a> •
|
|
18
18
|
<a href="#how-it-works">How It Works</a> •
|
|
19
|
-
<a href="#installation">Installation</a> •
|
|
20
19
|
<a href="#agents">Agents</a> •
|
|
21
|
-
<a href="#
|
|
20
|
+
<a href="#persistent-memory">Memory</a> •
|
|
21
|
+
<a href="#guardrails">Guardrails</a> •
|
|
22
|
+
<a href="#comparison">Comparison</a> •
|
|
23
|
+
<a href="#installation">Installation</a> •
|
|
24
|
+
<a href="#roadmap">Roadmap</a>
|
|
22
25
|
</p>
|
|
23
26
|
|
|
24
27
|
---
|
|
25
28
|
|
|
26
|
-
## The Problem
|
|
27
|
-
|
|
28
|
-
```
|
|
29
|
-
You: "Build me an authentication system"
|
|
30
|
-
|
|
31
|
-
Other Frameworks:
|
|
32
|
-
├── Agent 1 starts auth module...
|
|
33
|
-
├── Agent 2 starts user model... (conflicts with Agent 1)
|
|
34
|
-
├── Agent 3 starts database... (wrong schema)
|
|
35
|
-
├── Agent 4 starts tests... (for code that doesn't exist yet)
|
|
36
|
-
└── Result: Chaos. Conflicts. Context lost. Start over.
|
|
37
|
-
|
|
38
|
-
OpenCode Swarm:
|
|
39
|
-
├── Architect analyzes request
|
|
40
|
-
├── Explorer scans codebase (+ gap analysis)
|
|
41
|
-
├── @sme consulted on security domain
|
|
42
|
-
├── Architect creates phased plan with acceptance criteria
|
|
43
|
-
├── @critic reviews plan → APPROVED
|
|
44
|
-
├── Phase 1: User model → Review → Tests (run + PASS) → ✓
|
|
45
|
-
├── Phase 2: Auth logic → Review → Tests (run + PASS) → ✓
|
|
46
|
-
├── Phase 3: Session management → Review → Tests (run + PASS) → ✓
|
|
47
|
-
└── Result: Working code. Documented decisions. Resumable progress.
|
|
48
|
-
```
|
|
49
|
-
|
|
50
|
-
---
|
|
51
|
-
|
|
52
|
-
## Why Swarm?
|
|
29
|
+
## The Problem
|
|
53
30
|
|
|
54
|
-
|
|
55
|
-
<tr>
|
|
56
|
-
<td width="50%">
|
|
31
|
+
Every multi-agent AI coding tool on the market has the same failure mode: they are vibes-driven. You describe a feature. Agents spawn. They race each other to write conflicting code, lose context after 20 messages, hit token limits mid-task, and produce something that sort-of-works until it doesn't. There's no plan. There's no memory. There's no gatekeeper. There's no test that was actually run.
|
|
57
32
|
|
|
58
|
-
|
|
33
|
+
**oh-my-opencode** is a prompt collection. **get-shit-done** is a workflow macro. Neither is a framework with memory, QA enforcement, or the ability to resume a project a week later exactly where you left off.
|
|
59
34
|
|
|
60
|
-
|
|
61
|
-
- Single model = correlated failures
|
|
62
|
-
- No planning, just vibes
|
|
63
|
-
- Context lost between sessions
|
|
64
|
-
- QA as afterthought (if at all)
|
|
65
|
-
- Entire codebase in one prompt
|
|
66
|
-
- No way to resume projects
|
|
35
|
+
OpenCode Swarm is built differently.
|
|
67
36
|
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
- **Phased planning** - documented tasks with acceptance criteria
|
|
76
|
-
- **Persistent memory** - `.swarm/` files survive sessions
|
|
77
|
-
- **Review per task** - correctness + security review before anything ships
|
|
78
|
-
- **One task at a time** - focused, quality code
|
|
79
|
-
- **Resumable projects** - pick up exactly where you left off
|
|
37
|
+
```
|
|
38
|
+
Every other framework:
|
|
39
|
+
├── Agent 1 starts the auth module...
|
|
40
|
+
├── Agent 2 starts the user model... (conflicts with Agent 1)
|
|
41
|
+
├── Agent 3 writes tests... (for code that doesn't exist yet)
|
|
42
|
+
├── Context window fills up and the whole thing drifts
|
|
43
|
+
└── Result: chaos. Rework. Start over.
|
|
80
44
|
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
45
|
+
OpenCode Swarm:
|
|
46
|
+
├── Architect reads .swarm/plan.md → project already in progress, resumes Phase 2
|
|
47
|
+
├── @explorer scans the codebase for current state
|
|
48
|
+
├── @sme DOMAIN: security → consults on auth patterns, guidance cached
|
|
49
|
+
├── Architect writes .swarm/plan.md: 3 phases, 9 tasks, acceptance criteria per task
|
|
50
|
+
├── @critic reviews the plan → APPROVED
|
|
51
|
+
├── @coder implements Task 2.2 (one task, full context, nothing else)
|
|
52
|
+
├── diff tool → imports tool → lint fix → secretscan → @reviewer → @test_engineer
|
|
53
|
+
├── All gates pass → plan.md updated → Task 2.2: [x]
|
|
54
|
+
└── Result: working code, documented decisions, resumable project, evidence trail
|
|
55
|
+
```
|
|
84
56
|
|
|
85
57
|
---
|
|
86
58
|
|
|
87
59
|
## How It Works
|
|
88
60
|
|
|
61
|
+
### The Execution Pipeline
|
|
62
|
+
|
|
89
63
|
```
|
|
90
|
-
|
|
91
|
-
│
|
|
92
|
-
|
|
64
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
65
|
+
│ Phase 0: Resume Check │
|
|
66
|
+
│ .swarm/plan.md exists? Resume mid-task. New project? Continue. │
|
|
67
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
93
68
|
│
|
|
94
69
|
▼
|
|
95
|
-
|
|
96
|
-
│
|
|
97
|
-
│
|
|
98
|
-
|
|
70
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
71
|
+
│ Phase 1: Clarify │
|
|
72
|
+
│ Ask only what the Architect cannot infer. Then stop. │
|
|
73
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
99
74
|
│
|
|
100
75
|
▼
|
|
101
|
-
|
|
102
|
-
│
|
|
103
|
-
│
|
|
104
|
-
|
|
76
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
77
|
+
│ Phase 2: Discover │
|
|
78
|
+
│ @explorer scans codebase → structure, languages, frameworks, key files │
|
|
79
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
105
80
|
│
|
|
106
81
|
▼
|
|
107
|
-
|
|
108
|
-
│
|
|
109
|
-
│
|
|
110
|
-
|
|
82
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
83
|
+
│ Phase 3: SME Consult (serial, cached) │
|
|
84
|
+
│ @sme DOMAIN: security, @sme DOMAIN: api, ... │
|
|
85
|
+
│ Guidance written to .swarm/context.md — never re-asked in future phases │
|
|
86
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
111
87
|
│
|
|
112
88
|
▼
|
|
113
|
-
|
|
114
|
-
│
|
|
115
|
-
│
|
|
116
|
-
│
|
|
117
|
-
│
|
|
118
|
-
|
|
89
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
90
|
+
│ Phase 4: Plan │
|
|
91
|
+
│ Architect writes .swarm/plan.md │
|
|
92
|
+
│ Structured phases, tasks with SMALL/MEDIUM/LARGE sizing, acceptance │
|
|
93
|
+
│ criteria per task, explicit dependency graph │
|
|
94
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
119
95
|
│
|
|
120
96
|
▼
|
|
121
|
-
|
|
122
|
-
│
|
|
123
|
-
│
|
|
124
|
-
│
|
|
125
|
-
|
|
126
|
-
│ Phase 2: Core Auth [4 tasks] │
|
|
127
|
-
│ Phase 3: Session Management [3 tasks] │
|
|
128
|
-
└─────────────────────────────────────────────────────────────────────────┘
|
|
97
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
98
|
+
│ Phase 4.5: Critic Gate │
|
|
99
|
+
│ @critic reviews plan → APPROVED / NEEDS_REVISION / REJECTED │
|
|
100
|
+
│ Max 2 revision cycles. Escalates to user if unresolved. │
|
|
101
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
129
102
|
│
|
|
130
103
|
▼
|
|
131
|
-
|
|
132
|
-
│
|
|
133
|
-
│
|
|
134
|
-
│
|
|
135
|
-
|
|
104
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
105
|
+
│ Phase 5: Execute (per task) │
|
|
106
|
+
│ │
|
|
107
|
+
│ [UI task?] → @designer scaffold first │
|
|
108
|
+
│ │
|
|
109
|
+
│ @coder (one task, full context) │
|
|
110
|
+
│ ↓ │
|
|
111
|
+
│ diff tool → imports tool → lint fix → lint check → secretscan │
|
|
112
|
+
│ (contract change detection) (AST-based) (auto-fix) (entropy scan) │
|
|
113
|
+
│ ↓ │
|
|
114
|
+
│ @reviewer (correctness pass) │
|
|
115
|
+
│ ↓ APPROVED │
|
|
116
|
+
│ @reviewer (security-only pass, if file matches security globs) │
|
|
117
|
+
│ ↓ APPROVED │
|
|
118
|
+
│ @test_engineer (verification tests + coverage gate ≥70%) │
|
|
119
|
+
│ ↓ PASS │
|
|
120
|
+
│ @test_engineer (adversarial tests — boundary violations, injections) │
|
|
121
|
+
│ ↓ PASS │
|
|
122
|
+
│ plan.md → [x] Task complete │
|
|
123
|
+
│ │
|
|
124
|
+
│ Any gate fails → back to @coder with structured rejection reason │
|
|
125
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
136
126
|
│
|
|
137
127
|
▼
|
|
138
|
-
|
|
139
|
-
│
|
|
140
|
-
│
|
|
141
|
-
│
|
|
142
|
-
│
|
|
143
|
-
|
|
144
|
-
│ └─────────┘ └───────┘ └────────────┘ └──────────────┘ │
|
|
145
|
-
│ │ │ │ │ │
|
|
146
|
-
│ │ Contract │ If REJECTED: If FAIL: fix │
|
|
147
|
-
│ │ changes? │ retry from coder + retest │
|
|
148
|
-
│ │ │ │ │ │
|
|
149
|
-
│ │ ▼ │ ▼ │
|
|
150
|
-
│ │ ┌─────────┐ │ ┌──────────────┐ ┌──────────────┐ │
|
|
151
|
-
│ │ │@explorer│ │ │ @reviewer │ → │ @test │ │
|
|
152
|
-
│ │ │ impact │ │ │ security-only│ │ adversarial │ │
|
|
153
|
-
│ │ │analysis │ │ │ (if match) │ │ (attacks) │ │
|
|
154
|
-
│ │ └─────────┘ │ └──────────────┘ └──────────────┘ │
|
|
155
|
-
│ │ │ │
|
|
156
|
-
│ └───────────────┘ │
|
|
157
|
-
│ │
|
|
158
|
-
│ Update plan.md: [x] Task complete (only after ALL gates pass) │
|
|
159
|
-
│ Next task... │
|
|
160
|
-
└─────────────────────────────────────────────────────────────────────────┘
|
|
161
|
-
│
|
|
162
|
-
▼
|
|
163
|
-
┌─────────────────────────────────────────────────────────────────────────┐
|
|
164
|
-
│ PHASE 6: Phase Complete │
|
|
165
|
-
│ Re-scan with @explorer │
|
|
166
|
-
│ Update context.md with learnings │
|
|
167
|
-
│ Archive to .swarm/history/ │
|
|
168
|
-
│ "Phase 1 complete. Ready for Phase 2?" │
|
|
169
|
-
└─────────────────────────────────────────────────────────────────────────┘
|
|
128
|
+
┌──────────────────────────────────────────────────────────────────────────┐
|
|
129
|
+
│ Phase 6: Phase Complete │
|
|
130
|
+
│ @explorer rescans. @docs updates documentation. Retrospective written. │
|
|
131
|
+
│ Learnings injected as [SWARM RETROSPECTIVE] into next phase. │
|
|
132
|
+
│ "Phase 1 complete (4 tasks, 0 rejections). Ready for Phase 2?" │
|
|
133
|
+
└──────────────────────────────────────────────────────────────────────────┘
|
|
170
134
|
```
|
|
171
135
|
|
|
136
|
+
### Why Serial Execution Matters
|
|
137
|
+
|
|
138
|
+
Multi-agent parallelism sounds fast. In practice, it is a race to produce conflicting, unreviewed code that requires a human to untangle. OpenCode Swarm runs one task at a time through a deterministic pipeline. Every task is reviewed. Every test is run. Every failure is documented and fed back to the coder with structured context. The tradeoff in raw speed is paid back in not redoing work.
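The serial, fail-fast behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: `Gate`, `GateResult`, `TaskOutcome`, and `runGates` are hypothetical names, and the gates are stubbed with fixed results.

```typescript
// Hypothetical types: none of these names come from the opencode-swarm source.
type GateResult =
  | { ok: true }
  | { ok: false; reason: string };

interface Gate {
  name: string;
  run: (taskId: string) => GateResult;
}

interface TaskOutcome {
  taskId: string;
  status: "complete" | "rejected";
  failedGate?: string;
  reason?: string;
  gatesRun: string[];
}

// Run gates strictly in order and stop at the first failure, so later
// gates never see work that an earlier gate already rejected.
function runGates(taskId: string, gates: Gate[]): TaskOutcome {
  const gatesRun: string[] = [];
  for (const gate of gates) {
    gatesRun.push(gate.name);
    const result = gate.run(taskId);
    if (!result.ok) {
      return {
        taskId,
        status: "rejected",
        failedGate: gate.name,
        reason: result.reason,
        gatesRun,
      };
    }
  }
  return { taskId, status: "complete", gatesRun };
}

// Stubbed example: the reviewer rejects, so the test gate is never reached.
const exampleGates: Gate[] = [
  { name: "lint", run: () => ({ ok: true }) },
  { name: "reviewer", run: () => ({ ok: false, reason: "missing expiration claim" }) },
  { name: "test_engineer", run: () => ({ ok: true }) },
];

const outcome = runGates("2.2", exampleGates);
console.log(outcome.status, outcome.failedGate, outcome.gatesRun.join(" -> "));
```

The rejected outcome carries the failing gate's name and reason, which is the kind of structured context that would be handed back to the coder.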
|
|
139
|
+
|
|
172
140
|
---
|
|
173
141
|
|
|
174
|
-
##
|
|
142
|
+
## Agents
|
|
143
|
+
|
|
144
|
+
### 🎯 Orchestrator
|
|
145
|
+
|
|
146
|
+
**`architect`** — The central coordinator. Owns the plan, delegates all work, enforces every QA gate, maintains project memory, and resumes projects across sessions. Every other agent works for the Architect.
|
|
147
|
+
|
|
148
|
+
### 🔍 Discovery
|
|
149
|
+
|
|
150
|
+
**`explorer`** — Fast codebase scanner. Identifies structure, languages, frameworks, key files, and import patterns. Runs before planning and after every phase completes.
|
|
151
|
+
|
|
152
|
+
### 🧠 Domain Expert
|
|
153
|
+
|
|
154
|
+
**`sme`** — Open-domain expert. The Architect specifies any domain per call: `security`, `python`, `rust`, `kubernetes`, `ios`, `ml`, `blockchain` — any domain the underlying model has knowledge of. No hardcoded list. Guidance is cached in `.swarm/context.md` so the same question is never asked twice.
|
|
155
|
+
|
|
156
|
+
### 🎨 Design
|
|
157
|
+
|
|
158
|
+
**`designer`** — UI/UX specification agent. Opt-in via config. Generates component scaffolds and design tokens before the coder touches UI tasks, eliminating the most common source of front-end rework.
|
|
159
|
+
|
|
160
|
+
### 💻 Implementation
|
|
161
|
+
|
|
162
|
+
**`coder`** — Implements exactly one task with full context. No multitasking. No context bleed from prior tasks. The coder receives: the task spec, acceptance criteria, SME guidance, and relevant context from `.swarm/context.md`. Nothing else.
|
|
163
|
+
|
|
164
|
+
**`test_engineer`** — Generates tests, runs them, and returns structured `PASS/FAIL` verdicts with coverage percentages. Runs twice per task: once for verification, once for adversarial attack scenarios.
|
|
165
|
+
|
|
166
|
+
### ✅ Quality Assurance
|
|
167
|
+
|
|
168
|
+
**`reviewer`** — Dual-pass review. First pass: correctness, logic, maintainability. Second pass: security-only, scoped to OWASP Top 10 categories, triggered automatically when the modified files match security-sensitive path patterns. Both passes produce structured verdicts with specific rejection reasons.
|
|
169
|
+
|
|
170
|
+
**`critic`** — Plan review gate. Reviews the Architect's plan *before implementation begins*. Checks for completeness, feasibility, scope creep, missing dependencies, and AI-slop hallucinations. Plans do not proceed without Critic approval.
|
|
171
|
+
|
|
172
|
+
### 📝 Documentation
|
|
175
173
|
|
|
176
|
-
|
|
174
|
+
**`docs`** — Documentation synthesizer. Runs in Phase 6 with a diff of changed files. Updates READMEs, API documentation, and guides to reflect what was actually built, not what was planned.
|
|
175
|
+
|
|
176
|
+
---
|
|
177
|
+
|
|
178
|
+
## Persistent Memory
|
|
179
|
+
|
|
180
|
+
Other frameworks lose everything when the session ends. Swarm stores project state on disk.
|
|
177
181
|
|
|
178
182
|
```
|
|
179
183
|
.swarm/
|
|
180
|
-
├── plan.md #
|
|
181
|
-
├──
|
|
182
|
-
├──
|
|
183
|
-
|
|
184
|
-
│
|
|
184
|
+
├── plan.md # Living roadmap: phases, tasks, status, rejections, blockers
|
|
185
|
+
├── plan.json # Machine-readable plan for tooling
|
|
186
|
+
├── context.md # Institutional knowledge: decisions, SME guidance, patterns
|
|
187
|
+
├── evidence/ # Per-task execution evidence bundles
|
|
188
|
+
│ ├── 1.1/ # review verdict, test results, diff summary for task 1.1
|
|
189
|
+
│ └── 2.3/
|
|
185
190
|
└── history/
|
|
186
|
-
├── phase-1.md # What was
|
|
191
|
+
├── phase-1.md # What was built, what was learned, retrospective metrics
|
|
187
192
|
└── phase-2.md
|
|
188
193
|
```
|
|
189
194
|
|
|
190
|
-
### plan.md
|
|
195
|
+
### plan.md — Living Roadmap
|
|
196
|
+
|
|
191
197
|
```markdown
|
|
192
198
|
# Project: Auth System
|
|
193
199
|
Current Phase: 2
|
|
@@ -200,260 +206,133 @@ Current Phase: 2
|
|
|
200
206
|
## Phase 2: Core Auth [IN PROGRESS]
|
|
201
207
|
- [x] Task 2.1: Login endpoint [MEDIUM]
|
|
202
208
|
- [ ] Task 2.2: JWT generation [MEDIUM] (depends: 2.1) ← CURRENT
|
|
203
|
-
- Acceptance: Returns valid JWT with user claims
|
|
204
|
-
- Attempt 1: REJECTED
|
|
209
|
+
- Acceptance: Returns valid JWT with user claims, 15-minute expiry
|
|
210
|
+
- Attempt 1: REJECTED — missing expiration claim
|
|
205
211
|
- [ ] Task 2.3: Token validation middleware [MEDIUM]
|
|
206
|
-
- [BLOCKED] Task 2.4: Refresh
|
|
207
|
-
- Reason:
|
|
212
|
+
- [BLOCKED] Task 2.4: Refresh token rotation
|
|
213
|
+
- Reason: Awaiting decision on rotation strategy
|
|
208
214
|
```
|
|
209
215
|
|
|
210
|
-
### context.md
|
|
216
|
+
### context.md — Institutional Knowledge
|
|
217
|
+
|
|
211
218
|
```markdown
|
|
212
219
|
# Project Context: Auth System
|
|
213
220
|
|
|
214
221
|
## Technical Decisions
|
|
215
|
-
-
|
|
216
|
-
- JWT
|
|
217
|
-
-
|
|
222
|
+
- bcrypt cost factor: 12
|
|
223
|
+
- JWT TTL: 15 minutes; refresh TTL: 7 days
|
|
224
|
+
- Refresh token store: Redis with key prefix auth:refresh:
|
|
218
225
|
|
|
219
226
|
## SME Guidance Cache
|
|
220
|
-
###
|
|
221
|
-
- Never log tokens or passwords
|
|
222
|
-
- Use constant-time comparison for
|
|
223
|
-
-
|
|
227
|
+
### security (Phase 1)
|
|
228
|
+
- Never log tokens or passwords in any context
|
|
229
|
+
- Use constant-time comparison for all token equality checks
|
|
230
|
+
- Rate-limit login endpoint: 5 attempts / 15 minutes per IP
|
|
224
231
|
|
|
225
|
-
###
|
|
226
|
-
- Return 401 for invalid credentials (not 404)
|
|
227
|
-
- Include token expiry in response body
|
|
232
|
+
### api (Phase 1)
|
|
233
|
+
- Return HTTP 401 for invalid credentials (not 404)
|
|
234
|
+
- Include token expiry timestamp in response body
|
|
228
235
|
|
|
229
236
|
## Patterns Established
|
|
230
|
-
- Error handling:
|
|
231
|
-
- Validation: Zod schemas in /validators
|
|
237
|
+
- Error handling: custom ApiError class with HTTP status and error code
|
|
238
|
+
- Validation: Zod schemas in /validators/, applied at request boundary
|
|
232
239
|
```
|
|
233
240
|
|
|
234
|
-
|
|
241
|
+
Start a new session tomorrow. The Architect reads these files and picks up exactly where you left off — no re-explaining, no rediscovery, no drift.
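As a rough illustration of how a resume check could locate the next task, the sketch below parses checkbox lines in the style of the `plan.md` excerpt. `parsePlanTasks` and `nextPendingTask` are hypothetical helpers, the line format is assumed from that excerpt only, and dependency annotations such as `(depends: 2.1)` are ignored for brevity.

```typescript
// Hypothetical helpers; the line format is assumed from the plan.md excerpt.
interface PlanTask {
  id: string;
  title: string;
  status: "done" | "pending" | "blocked";
}

// Matches lines such as "- [x] Task 2.1: Login endpoint [MEDIUM]".
const TASK_LINE =
  /^- \[(x| |BLOCKED)\] Task (\d+\.\d+): (.+?)(?: \[(?:SMALL|MEDIUM|LARGE)\].*)?$/;

function parsePlanTasks(markdown: string): PlanTask[] {
  const tasks: PlanTask[] = [];
  for (const line of markdown.split("\n")) {
    const match = TASK_LINE.exec(line.trim());
    if (!match) continue; // headings and sub-bullets are skipped
    const marker = match[1] ?? "";
    const status: PlanTask["status"] =
      marker === "x" ? "done" : marker === "BLOCKED" ? "blocked" : "pending";
    tasks.push({ id: match[2] ?? "", title: match[3] ?? "", status });
  }
  return tasks;
}

// The first task that is neither done nor blocked is where work resumes.
function nextPendingTask(tasks: PlanTask[]): PlanTask | undefined {
  return tasks.find((task) => task.status === "pending");
}

const examplePlan = [
  "## Phase 2: Core Auth [IN PROGRESS]",
  "- [x] Task 2.1: Login endpoint [MEDIUM]",
  "- [ ] Task 2.2: JWT generation [MEDIUM] (depends: 2.1)",
  "  - Attempt 1: REJECTED",
  "- [ ] Task 2.3: Token validation middleware [MEDIUM]",
  "- [BLOCKED] Task 2.4: Refresh token rotation",
].join("\n");

const resumeAt = nextPendingTask(parsePlanTasks(examplePlan));
console.log(resumeAt ? `Resume at Task ${resumeAt.id}: ${resumeAt.title}` : "Nothing pending");
```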
|
|
235
242
|
|
|
236
|
-
|
|
237
|
-
|
|
238
|
-
## Heterogeneous Models = Better Code
|
|
239
|
-
|
|
240
|
-
Most frameworks use one model for everything. Same blindspots everywhere.
|
|
243
|
+
### Evidence Bundles
|
|
241
244
|
|
|
242
|
-
|
|
243
|
-
|
|
244
|
-
```json
|
|
245
|
-
{
|
|
246
|
-
"agents": {
|
|
247
|
-
"architect": { "model": "anthropic/claude-sonnet-4-5" },
|
|
248
|
-
"explorer": { "model": "google/gemini-2.0-flash" },
|
|
249
|
-
"coder": { "model": "anthropic/claude-sonnet-4-5" },
|
|
250
|
-
"sme": { "model": "google/gemini-2.0-flash" },
|
|
251
|
-
"reviewer": { "model": "openai/gpt-4o" },
|
|
252
|
-
"critic": { "model": "google/gemini-2.0-flash" },
|
|
253
|
-
"test_engineer": { "model": "google/gemini-2.0-flash" }
|
|
254
|
-
}
|
|
255
|
-
}
|
|
256
|
-
```
|
|
245
|
+
Each completed task writes structured evidence to `.swarm/evidence/`:
|
|
257
246
|
|
|
258
|
-
|
|
|
259
|
-
|
|
260
|
-
|
|
|
261
|
-
|
|
|
262
|
-
|
|
|
263
|
-
|
|
|
264
|
-
|
|
|
265
|
-
| Critic | Plan review | Catches scope issues before any code is written |
|
|
266
|
-
| Test Engineer | Test + run | Writes tests, runs them, reports PASS/FAIL |
|
|
247
|
+
| Type | What It Captures |
|
|
248
|
+
|------|-----------------|
|
|
249
|
+
| `review` | Verdict (APPROVED/REJECTED), risk level, specific issues |
|
|
250
|
+
| `test` | Pass/fail counts, coverage percentage, failure messages |
|
|
251
|
+
| `diff` | Files changed, additions/deletions, contract change flags |
|
|
252
|
+
| `approval` | Stakeholder sign-off with notes |
|
|
253
|
+
| `retrospective` | Phase metrics: total tool calls, coder revisions, reviewer rejections, test failures, security findings, lessons learned |
|
|
267
254
|
|
|
268
|
-
|
|
255
|
+
Retrospectives from completed phases are injected as `[SWARM RETROSPECTIVE]` hints at the start of subsequent phases. The framework learns from its own history within a project.
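One way to picture these evidence records is as a discriminated union keyed on `type`. The sketch below covers three of the five types; every field name, and the exact wording of the hint string, is an assumption made for illustration and is not taken from the package's `evidence-schema.d.ts`.

```typescript
// Field names are assumptions based on the table above, not the real schema.
interface ReviewEvidence {
  type: "review";
  verdict: "APPROVED" | "REJECTED";
  risk: string;
  issues: string[];
}

interface TestEvidence {
  type: "test";
  passed: number;
  failed: number;
  coveragePercent: number;
}

interface RetrospectiveEvidence {
  type: "retrospective";
  phase: number;
  reviewerRejections: number;
  lessons: string[];
}

type Evidence = ReviewEvidence | TestEvidence | RetrospectiveEvidence;

// The switch narrows on `type`, so each branch sees only its own fields.
function describeEvidence(evidence: Evidence): string {
  switch (evidence.type) {
    case "review":
      return `review: ${evidence.verdict} (${evidence.issues.length} issue(s))`;
    case "test":
      return `test: ${evidence.passed} passed, ${evidence.failed} failed, ${evidence.coveragePercent}% coverage`;
    case "retrospective":
      // Illustrative hint format only; the real injected text may differ.
      return `[SWARM RETROSPECTIVE] phase ${evidence.phase}: ${evidence.reviewerRejections} rejection(s); lessons: ${evidence.lessons.join("; ")}`;
  }
}

console.log(
  describeEvidence({
    type: "retrospective",
    phase: 1,
    reviewerRejections: 0,
    lessons: ["validate at the request boundary"],
  }),
);
```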
|
|
269
256
|
|
|
270
257
|
---
|
|
271
258
|
|
|
272
|
-
##
|
|
273
|
-
|
|
274
|
-
Run different model configurations simultaneously. Perfect for:
|
|
275
|
-
- **Cloud vs Local**: Premium cloud models for critical work, local models for quick tasks
|
|
276
|
-
- **Fast vs Quality**: Quick iterations with fast models, careful work with expensive ones
|
|
277
|
-
- **Cost Tiers**: Cheap models for exploration, premium for implementation
|
|
259
|
+
## Heterogeneous Models
|
|
278
260
|
|
|
279
|
-
|
|
261
|
+
Single-model frameworks have correlated failure modes. The same model that writes the bug reviews it and misses it. Swarm lets you route each agent to the model it is best suited for:
|
|
280
262
|
|
|
281
263
|
```json
|
|
282
264
|
{
|
|
283
|
-
"
|
|
284
|
-
"
|
|
285
|
-
|
|
286
|
-
|
|
287
|
-
|
|
288
|
-
|
|
289
|
-
|
|
290
|
-
|
|
291
|
-
|
|
292
|
-
}
|
|
293
|
-
"local": {
|
|
294
|
-
"name": "Local",
|
|
295
|
-
"agents": {
|
|
296
|
-
"architect": { "model": "ollama/qwen2.5:32b" },
|
|
297
|
-
"coder": { "model": "ollama/qwen2.5:32b" },
|
|
298
|
-
"sme": { "model": "ollama/qwen2.5:14b" },
|
|
299
|
-
"reviewer": { "model": "ollama/qwen2.5:14b" }
|
|
300
|
-
}
|
|
301
|
-
}
|
|
265
|
+
"agents": {
|
|
266
|
+
"architect": { "model": "anthropic/claude-opus-4-6" },
|
|
267
|
+
"coder": { "model": "minimax-coding-plan/MiniMax-M2.5" },
|
|
268
|
+
"explorer": { "model": "minimax-coding-plan/MiniMax-M2.1" },
|
|
269
|
+
"sme": { "model": "kimi-for-coding/k2p5" },
|
|
270
|
+
"critic": { "model": "zai-coding-plan/glm-5" },
|
|
271
|
+
"reviewer": { "model": "zai-coding-plan/glm-5" },
|
|
272
|
+
"test_engineer": { "model": "minimax-coding-plan/MiniMax-M2.5" },
|
|
273
|
+
"docs": { "model": "zai-coding-plan/glm-4.7-flash" },
|
|
274
|
+
"designer": { "model": "kimi-for-coding/k2p5" }
|
|
302
275
|
}
|
|
303
276
|
}
|
|
304
277
|
```
|
|
305
278
|
|
|
306
|
-
|
|
307
|
-
|
|
308
|
-
| Swarm | Agents |
|
|
309
|
-
|-------|--------|
|
|
310
|
-
| `cloud` (default) | `architect`, `explorer`, `coder`, `sme`, `reviewer`, `critic`, `test_engineer` |
|
|
311
|
-
| `local` | `local_architect`, `local_explorer`, `local_coder`, `local_sme`, `local_reviewer`, `local_critic`, `local_test_engineer` |
|
|
312
|
-
|
|
313
|
-
The first swarm (or one named "default") creates unprefixed agents. Additional swarms prefix all agent names.
|
|
314
|
-
|
|
315
|
-
### Usage
|
|
316
|
-
|
|
317
|
-
In OpenCode, you'll see multiple architects to choose from:
|
|
318
|
-
- `architect` - Cloud swarm (default)
|
|
319
|
-
- `local_architect` - Local swarm
|
|
320
|
-
|
|
321
|
-
Each architect automatically delegates to its own swarm's agents.
|
|
279
|
+
Reviewer uses a different model than Coder by design. Different training, different priors, different blind spots. This is the cheapest bug-catcher you will ever deploy.
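If you want to check that a configuration actually keeps two agents on different models, a small helper along these lines would do. `sharesModel` is a hypothetical function written for this example; the model identifiers are copied from the configuration above.

```typescript
// Hypothetical check; the `agents` shape mirrors the JSON config above.
type AgentModels = Record<string, { model: string }>;

// True only when both agents are configured and point at the same model.
function sharesModel(agents: AgentModels, a: string, b: string): boolean {
  const first = agents[a];
  const second = agents[b];
  return first !== undefined && second !== undefined && first.model === second.model;
}

const exampleAgents: AgentModels = {
  coder: { model: "minimax-coding-plan/MiniMax-M2.5" },
  reviewer: { model: "zai-coding-plan/glm-5" },
  test_engineer: { model: "minimax-coding-plan/MiniMax-M2.5" },
};

if (sharesModel(exampleAgents, "coder", "reviewer")) {
  console.warn("coder and reviewer share a model: expect correlated blind spots");
} else {
  console.log("coder and reviewer use different models");
}
```

Note that in the example configuration the coder and the test engineer do share a model, so the same helper returns `true` for that pair.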
|
|
322
280
|
|
|
323
281
|
---
|
|
324
282
|
|
|
325
|
-
##
|
|
283
|
+
## Guardrails
|
|
326
284
|
|
|
327
|
-
|
|
328
|
-
# Install via CLI (recommended)
|
|
329
|
-
bunx opencode-swarm install
|
|
330
|
-
```
|
|
285
|
+
Every subagent runs inside a circuit breaker that kills runaway behavior before it burns credits on a stuck loop.
|
|
331
286
|
|
|
332
|
-
|
|
287
|
+
| Layer | Trigger | Action |
|
|
288
|
+
|-------|---------|--------|
|
|
289
|
+
| ⚠️ Soft Warning | 50% of any limit reached | Warning injected into agent stream |
|
|
290
|
+
| 🛑 Hard Block | 100% of any limit reached | All further tool calls blocked |
|
|
333
291
|
|
|
334
|
-
|
|
335
|
-
|
|
336
|
-
|
|
292
|
+
| Signal | Default | Description |
|
|
293
|
+
|--------|---------|-------------|
|
|
294
|
+
| Tool calls | 200 | Per-invocation, not per-session |
|
|
295
|
+
| Duration | 30 min | Wall-clock time per delegation |
|
|
296
|
+
| Repetition | 10 | Same tool + args consecutively |
|
|
297
|
+
| Consecutive errors | 5 | Sequential null/undefined outputs |
|
|
337
298
|
|
|
338
|
-
|
|
339
|
-
bunx opencode-swarm uninstall --clean
|
|
340
|
-
```
|
|
299
|
+
Limits are enforced **per-invocation**. Each delegation to a subagent starts a fresh budget. A coder fixing a second task is not penalized for the first task's tool calls. The Architect is exempt from all limits by default.
|
|
341
300
|
|
|
342
|
-
|
|
301
|
+
Per-agent profiles allow fine-grained overrides:
|
|
343
302
|
|
|
344
|
-
|
|
345
|
-
|
|
346
|
-
|
|
347
|
-
|
|
348
|
-
|
|
349
|
-
|
|
350
|
-
|
|
351
|
-
|
|
352
|
-
|
|
353
|
-
|
|
354
|
-
|
|
355
|
-
- **`retrieve_summary` tool** — Properly registered the retrieval tool, allowing agents to fetch full content from auto-summarized tool outputs.
|
|
356
|
-
- **92 new tests** — 1280 total tests across 57+ files (up from 1188 in v6.0.0).
|
|
357
|
-
|
|
358
|
-
### v6.1.0 — Docs & Design Agents
|
|
359
|
-
- **`docs` agent** — Dedicated documentation synthesizer that automatically updates READMEs, API docs, and guides during Phase 6.
|
|
360
|
-
- **`designer` agent** — UI/UX specification agent that generates component scaffolds before coding begins on UI-heavy tasks.
|
|
361
|
-
- **Heterogeneous model defaults** — Updated default models for new agents to use optimized Gemini models for speed and cost.
|
|
362
|
-
|
|
363
|
-
### v6.0.0 — Core QA & Security Gates
|
|
364
|
-
- **Dual-pass security reviewer** — After the general reviewer APPROVES, the architect automatically triggers a second security-only review pass when the changed file matches security-sensitive paths (`auth`, `crypto`, `session`, `token`, `middleware`, `api`, `security`) or the coder's output contains security keywords. Configurable via `review_passes` config.
|
|
365
|
-
- **Adversarial testing** — After verification tests PASS, the test engineer is re-delegated with adversarial-only framing: attack vectors, boundary violations, and injection attempts. Pure prompt engineering, no new infrastructure.
|
|
366
|
-
- **Integration impact analysis** — After the coder completes, the `diff` tool detects contract changes (exported functions, interfaces, types). If found, the explorer runs impact analysis across dependents before review begins.
|
|
367
|
-
- **`diff` tool** — New agent-accessible tool providing structured git diff with numstat parsing, contract change detection, configurable base ref (`HEAD`/staged/unstaged), path filtering, and 500-line truncation.
|
|
368
|
-
- **87 new tests** — 1188 total tests across 53+ files (up from 1101 in v5.2.0).
|
|
369
|
-
|
|
370
|
-
### v5.2.0 — Per-Invocation Guardrails
|
|
371
|
-
- **Per-invocation budget isolation** — Guardrail limits (tool calls, duration, errors) now reset with each agent delegation. Second invocation of the same agent gets a fresh budget, preventing false circuit breaker trips in long-running projects.
|
|
372
|
-
- **Architect protocol enforcement** — New mandatory QA gate rules: every coder task must go through reviewer approval + test_engineer verification before the next coder task. Protocol violations detected at runtime with warning injection.
|
|
373
|
-
- **Invocation window observability** — Circuit breaker logs now include `invocationId` and `windowKey` for precise debugging of which specific agent invocation hit limits.
|
|
374
|
-
- **67 new tests** — 1101 total tests across 48 files (up from 1034 in v5.1.x).
|
|
375
|
-
|
|
376
|
-
### v5.0.0 — Verifiable Execution
|
|
377
|
-
- **Canonical plan schema** — Machine-readable `plan.json` with Zod-validated `PlanSchema`/`TaskSchema`/`PhaseSchema`. Automatic migration from legacy `plan.md` format. Structured status tracking (`pending`, `in_progress`, `completed`, `blocked`).
|
|
378
|
-
- **Evidence bundles** — Per-task execution evidence persisted to `.swarm/evidence/`. Five evidence types: `review`, `test`, `diff`, `approval`, `note`. Sanitized task IDs, atomic writes, configurable size limits. `/swarm evidence` to view, `/swarm archive` to manage retention.
|
|
379
|
-
- **Per-agent guardrail profiles** — Override guardrail limits for individual agents via `guardrails.profiles`. `resolveGuardrailsConfig()` merges base + profile with per-agent specificity.
|
|
380
|
-
- **Context injection budget** — `max_injection_tokens` config controls how much context is injected into system prompts. Priority-ordered: phase → task → decisions → agent context. Lower-priority items dropped when budget exhausted.
|
|
381
|
-
- **Enhanced `/swarm agents`** — Agent count summary, `⚡ custom limits` indicator for profiled agents, guardrail profiles section.
|
|
382
|
-
- **Packaging smoke tests** — CI-safe `dist/` validation (8 tests).
|
|
383
|
-
- **151 new tests** — 1027 total tests across 44 files (up from 876 in v4.6.0).
|
|
384
|
-
|
|
385
|
-
### v4.6.0 — Agent Guardrails
|
|
386
|
-
- **Circuit breaker** — Two-layer protection against runaway agents. Soft warning at 50% of limits, hard block at 100%. Prevents infinite loops and runaway API costs.
|
|
387
|
-
- **Detection signals** — Tool call count, wall-clock time, consecutive repetition, and consecutive error tracking per agent session.
|
|
388
|
-
- **Configurable limits** — All thresholds tunable via `guardrails` config: `max_tool_calls`, `max_duration_minutes`, `max_repetitions`, `max_consecutive_errors`, `warning_threshold`.
|
|
389
|
-
- **46 new tests** — 668 total tests across 30 files.
|
|
390
|
-
|
|
391
|
-
### v4.5.0 — Tech Debt + New Commands
|
|
392
|
-
- **Lint cleanup** — Replaced string concatenation with template literals, documented `as any` casts with biome-ignore comments.
|
|
393
|
-
- **Code deduplication** — Extracted `stripSwarmPrefix()` utility to eliminate 3 duplicate prefix-stripping blocks.
|
|
394
|
-
- **`/swarm diagnose`** — Health check for `.swarm/` files, plan structure, and plugin configuration.
|
|
395
|
-
- **`/swarm export`** — Export plan.md and context.md as portable JSON.
|
|
396
|
-
- **`/swarm reset --confirm`** — Clear swarm state files with safety confirmation.
|
|
397
|
-
|
|
398
|
-
### v4.4.0 — DX & Quality
|
|
399
|
-
- **CLI `uninstall` command** — Remove plugin with optional `--clean` flag.
|
|
400
|
-
- **Custom error classes** — `SwarmError` hierarchy with actionable `guidance` messages.
|
|
401
|
-
- **`/swarm history`** — View completed phases from plan.md.
|
|
402
|
-
- **`/swarm config`** — View current resolved plugin configuration.
|
|
403
|
-
|
|
404
|
-
### v4.3.2 — Security Hardening
|
|
405
|
-
- **Path validation** — `validateSwarmPath()` prevents directory traversal in `.swarm/` file operations.
|
|
406
|
-
- **Fetch hardening** — 10s timeout, 5MB limit, retry logic for gitingest tool.
|
|
407
|
-
- **Config limits** — Deep merge depth limit (10), config file size limit (100KB).
|
|
408
|
-
|
|
409
|
-
### v4.3.0 — Hooks & Agent Awareness
|
|
410
|
-
- **Hooks pipeline** — `safeHook()` crash-safe wrapper, `composeHandlers()` for multi-handler composition.
|
|
411
|
-
- **Context pruning** — Token budget tracking with 70%/90% threshold warnings.
|
|
412
|
-
- **Slash commands** — `/swarm status`, `/swarm plan`, `/swarm agents`.
|
|
413
|
-
- **Agent awareness** — Activity tracking, delegation tracking, cross-agent context injection.
|
|
414
|
-
|
|
415
|
-
All features are opt-in via configuration. See [Installation Guide](docs/installation.md) for config options.
|
|
303
|
+
```jsonc
|
|
304
|
+
{
|
|
305
|
+
"guardrails": {
|
|
306
|
+
"max_tool_calls": 200,
|
|
307
|
+
"profiles": {
|
|
308
|
+
"coder": { "max_tool_calls": 500, "max_duration_minutes": 60 },
|
|
309
|
+
"explorer": { "max_tool_calls": 50 }
|
|
310
|
+
}
|
|
311
|
+
}
|
|
312
|
+
}
|
|
313
|
+
```
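
As a rough illustration of how a per-agent profile might layer over the base limits, here is a minimal TypeScript sketch. The `resolveGuardrails` function and the `GuardrailLimits` and `GuardrailsConfig` types are hypothetical names invented for this example, not the plugin's actual API. Only the config keys (`max_tool_calls`, `max_duration_minutes`, `profiles`) come from the snippet above, and the assumption is that a profile overrides only the fields it sets.

```typescript
// Hypothetical types for illustration; not taken from the plugin source.
interface GuardrailLimits {
  max_tool_calls?: number;
  max_duration_minutes?: number;
}

interface GuardrailsConfig extends GuardrailLimits {
  profiles?: Record<string, GuardrailLimits>;
}

// Assumed behavior: start from the base limits, then apply only the
// fields that the agent's profile explicitly sets.
function resolveGuardrails(config: GuardrailsConfig, agent: string): GuardrailLimits {
  const { profiles, ...base } = config;
  return { ...base, ...(profiles?.[agent] ?? {}) };
}

const guardrails: GuardrailsConfig = {
  max_tool_calls: 200,
  profiles: {
    coder: { max_tool_calls: 500, max_duration_minutes: 60 },
    explorer: { max_tool_calls: 50 },
  },
};

// The coder gets its own ceiling; an agent without a profile falls back to the base.
const coderLimits = resolveGuardrails(guardrails, "coder");
const reviewerLimits = resolveGuardrails(guardrails, "reviewer");
```

Under this reading, an agent with no profile entry simply inherits the top-level limits.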
|
|
416
314
|
|
|
417
315
|
---
|
|
418
316
|
|
|
419
|
-
##
|
|
420
|
-
|
|
421
|
-
### 🎯 Orchestrator
|
|
422
|
-
| Agent | Role |
|
|
423
|
-
|-------|------|
|
|
424
|
-
| `architect` | Central coordinator. Plans phases, delegates tasks, manages QA, maintains project memory. |
|
|
425
|
-
|
|
426
|
-
### 🔍 Discovery
|
|
427
|
-
| Agent | Role |
|
|
428
|
-
|-------|------|
|
|
429
|
-
| `explorer` | Fast codebase scanner. Identifies structure, languages, frameworks, key files. |
|
|
430
|
-
|
|
431
|
-
### 🎨 Design
|
|
432
|
-
| Agent | Role |
|
|
433
|
-
|-------|------|
|
|
434
|
-
| `designer` | UI/UX specification agent. Generates component scaffolds and design tokens before coding begins on UI-heavy tasks. |
|
|
435
|
-
|
|
436
|
-
### 🧠 Domain Expert
|
|
437
|
-
| Agent | Role |
|
|
438
|
-
|-------|------|
|
|
439
|
-
| `sme` | Open-domain expert. The architect specifies any domain (security, python, ios, rust, kubernetes, etc.) per call. No hardcoded list — works with any domain the LLM has knowledge of. |
|
|
440
|
-
|
|
441
|
-
### 💻 Implementation
|
|
442
|
-
| Agent | Role |
|
|
443
|
-
|-------|------|
|
|
444
|
-
| `coder` | Implements ONE task at a time with full context |
|
|
445
|
-
| `test_engineer` | Generates tests, runs them, and reports structured PASS/FAIL verdicts |
|
|
446
|
-
|
|
447
|
-
### ✅ Quality Assurance
|
|
448
|
-
| Agent | Role |
|
|
449
|
-
|-------|------|
|
|
450
|
-
| `reviewer` | Dual-pass review: correctness review first, then automatic security-only pass for security-sensitive files. The architect specifies CHECK dimensions per call. OWASP Top 10 categories built in. |
|
|
451
|
-
| `critic` | Plan review gate. Reviews the architect's plan BEFORE implementation — checks completeness, feasibility, scope, dependencies, and flags AI-slop. |
|
|
317
|
+
## Comparison
|
|
452
318
|
|
|
453
|
-
|
|
454
|
-
|
|
455
|
-
|
|
456
|
-
|
|
|
319
|
+
| Feature | OpenCode Swarm | oh-my-opencode | get-shit-done | AutoGen | CrewAI |
|
|
320
|
+
|---------|:-:|:-:|:-:|:-:|:-:|
|
|
321
|
+
| Multi-agent orchestration | ✅ 9 specialized agents | ❌ Prompt config only | ❌ Single-agent macros | ✅ | ✅ |
|
|
322
|
+
| Execution model | Serial (deterministic) | N/A | N/A | Parallel (chaotic) | Parallel |
|
|
323
|
+
| Phased planning with acceptance criteria | ✅ | ❌ | ❌ | ❌ | ❌ |
|
|
324
|
+
| Critic gate before implementation | ✅ | ❌ | ❌ | ❌ | ❌ |
|
|
325
|
+
| Per-task dual-pass review (correctness + security) | ✅ | ❌ | ❌ | Optional | Optional |
|
|
326
|
+
| Adversarial test pass per task | ✅ | ❌ | ❌ | ❌ | ❌ |
|
|
327
|
+
| Pre-reviewer pipeline (lint, secretscan, imports) | ✅ v6.3 | ❌ | ❌ | ❌ | ❌ |
|
|
328
|
+
| Persistent session memory | ✅ `.swarm/` files | ❌ | ❌ | Session only | Session only |
|
|
329
|
+
| Resume projects across sessions | ✅ Native | ❌ | ❌ | ❌ | ❌ |
|
|
330
|
+
| Evidence trail per task | ✅ Structured bundles | ❌ | ❌ | ❌ | ❌ |
|
|
331
|
+
| Heterogeneous model routing | ✅ Per-agent | ❌ | ❌ | Limited | Limited |
|
|
332
|
+
| Circuit breaker / guardrails | ✅ Per-invocation | ❌ | ❌ | ❌ | ❌ |
|
|
333
|
+
| Open-domain SME consultation | ✅ Any domain | ❌ | ❌ | ❌ | ❌ |
|
|
334
|
+
| Retrospective learning across phases | ✅ | ❌ | ❌ | ❌ | ❌ |
|
|
335
|
+
| Slash commands + diagnostics | ✅ 12 commands | ❌ | Limited | ❌ | ❌ |
|
|
457
336
|
|
|
458
337
|
---
|
|
459
338
|
|
|
@@ -461,220 +340,141 @@ All features are opt-in via configuration. See [Installation Guide](docs/install
|
|
|
461
340
|
|
|
462
341
|
| Command | Description |
|
|
463
342
|
|---------|-------------|
|
|
464
|
-
| `/swarm status` | Current phase, task progress,
|
|
465
|
-
| `/swarm plan [N]` |
|
|
466
|
-
| `/swarm agents` |
|
|
467
|
-
| `/swarm history` |
|
|
468
|
-
| `/swarm config` |
|
|
469
|
-
| `/swarm diagnose` | Health check for
|
|
343
|
+
| `/swarm status` | Current phase, task progress, agent count |
|
|
344
|
+
| `/swarm plan [N]` | Full plan or filtered by phase |
|
|
345
|
+
| `/swarm agents` | All registered agents with models and permissions |
|
|
346
|
+
| `/swarm history` | Completed phases with status |
|
|
347
|
+
| `/swarm config` | Current resolved configuration |
|
|
348
|
+
| `/swarm diagnose` | Health check for `.swarm/` files and config |
|
|
470
349
|
| `/swarm export` | Export plan and context as portable JSON |
|
|
471
|
-
| `/swarm
|
|
472
|
-
| `/swarm
|
|
473
|
-
| `/swarm
|
|
474
|
-
| `/swarm
|
|
475
|
-
| `/swarm
|
|
350
|
+
| `/swarm evidence [task]` | Evidence bundles for a task or all tasks |
|
|
351
|
+
| `/swarm archive [--dry-run]` | Archive old evidence with retention policy |
|
|
352
|
+
| `/swarm benchmark` | Performance benchmarks |
|
|
353
|
+
| `/swarm retrieve [id]` | Retrieve auto-summarized tool outputs |
|
|
354
|
+
| `/swarm reset --confirm` | Clear swarm state files |
|
|
476
355
|
|
|
477
356
|
---
|
|
478
357
|
|
|
479
358
|
## Configuration
|
|
480
359
|
|
|
481
|
-
Create `~/.config/opencode/opencode-swarm.json`:
|
|
482
|
-
|
|
483
360
|
```json
|
|
484
361
|
{
|
|
485
362
|
"agents": {
|
|
486
|
-
"architect": { "model": "anthropic/claude-
|
|
487
|
-
"
|
|
488
|
-
"
|
|
489
|
-
"sme": { "model": "
|
|
490
|
-
"
|
|
491
|
-
"
|
|
492
|
-
"test_engineer": { "model": "
|
|
493
|
-
"docs": { "model": "
|
|
494
|
-
"designer": { "model": "
|
|
363
|
+
"architect": { "model": "anthropic/claude-opus-4-6" },
|
|
364
|
+
"coder": { "model": "minimax-coding-plan/MiniMax-M2.5" },
|
|
365
|
+
"explorer": { "model": "minimax-coding-plan/MiniMax-M2.1" },
|
|
366
|
+
"sme": { "model": "kimi-for-coding/k2p5" },
|
|
367
|
+
"critic": { "model": "zai-coding-plan/glm-5" },
|
|
368
|
+
"reviewer": { "model": "zai-coding-plan/glm-5" },
|
|
369
|
+
"test_engineer": { "model": "minimax-coding-plan/MiniMax-M2.5" },
|
|
370
|
+
"docs": { "model": "zai-coding-plan/glm-4.7-flash" },
|
|
371
|
+
"designer": { "model": "kimi-for-coding/k2p5" }
|
|
372
|
+
},
|
|
373
|
+
"guardrails": {
|
|
374
|
+
"max_tool_calls": 200,
|
|
375
|
+
"max_duration_minutes": 30,
|
|
376
|
+
"profiles": {
|
|
377
|
+
"coder": { "max_tool_calls": 500 }
|
|
378
|
+
}
|
|
379
|
+
},
|
|
380
|
+
"review_passes": {
|
|
381
|
+
"always_security_review": false,
|
|
382
|
+
"security_globs": ["**/*auth*", "**/*crypto*", "**/*session*", "**/*token*"]
|
|
495
383
|
}
|
|
496
384
|
}
|
|
497
385
|
```
|
|
498
386
|
|
|
499
|
-
|
|
387
|
+
Save to `~/.config/opencode/opencode-swarm.json` or `.opencode/swarm.json` in your project root. Project config merges over global config via deep merge — partial overrides do not clobber unspecified fields.
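
To make the merge behavior concrete, the following is a minimal sketch of a recursive deep merge in TypeScript. The `deepMerge` and `isPlainObject` helpers are hypothetical and written for this example; the plugin's real merge logic may differ. In particular, replacing arrays wholesale rather than concatenating them is an assumption here.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isPlainObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge `override` into a copy of `base`.
// Nested objects are merged key by key; scalars and arrays are replaced (assumed).
function deepMerge(base: JsonObject, override: JsonObject): JsonObject {
  const result: JsonObject = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = result[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value)
        : value;
  }
  return result;
}

// Global config sets two limits; the project config overrides only one.
const globalConfig: JsonObject = {
  guardrails: { max_tool_calls: 200, max_duration_minutes: 30 },
};
const projectConfig: JsonObject = {
  guardrails: { max_tool_calls: 500 },
};

const merged = deepMerge(globalConfig, projectConfig);
```

With these inputs, the merged `guardrails` object keeps `max_duration_minutes` from the global file while taking `max_tool_calls` from the project file.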
|
|
388
|
+
|
|
389
|
+
### Disabling Agents
|
|
390
|
+
|
|
500
391
|
```json
|
|
501
392
|
{
|
|
502
|
-
"sme":
|
|
393
|
+
"sme": { "disabled": true },
|
|
394
|
+
"designer": { "disabled": true },
|
|
503
395
|
"test_engineer": { "disabled": true }
|
|
504
396
|
}
|
|
505
397
|
```
|
|
506
398
|
|
|
507
399
|
---
|
|
508
400
|
|
|
509
|
-
##
|
|
510
|
-
|
|
511
|
-
OpenCode Swarm includes a built-in circuit breaker that prevents subagents from running away — burning API credits in infinite loops, repeating the same tool call, or spinning for hours.
|
|
512
|
-
|
|
513
|
-
### How It Works
|
|
514
|
-
|
|
515
|
-
| Layer | Trigger | Action |
|
|
516
|
-
|-------|---------|--------|
|
|
517
|
-
| ⚠️ **Soft Warning** | 50% of any limit reached | Injects warning message into agent's chat stream |
|
|
518
|
-
| 🛑 **Hard Block** | 100% of any limit reached | Blocks ALL further tool calls + injects stop message |
|
|
519
|
-
|
|
520
|
-
### Detection Signals
|
|
521
|
-
|
|
522
|
-
| Signal | Default Limit | Description |
|
|
523
|
-
|--------|---------------|-------------|
|
|
524
|
-
| Tool calls | 200 | Total tool invocations per agent session |
|
|
525
|
-
| Duration | 30 min | Wall-clock time since delegation started |
|
|
526
|
-
| Repetition | 10 | Same tool + args called consecutively |
|
|
527
|
-
| Consecutive errors | 5 | Sequential null/undefined tool outputs |
|
|
401
|
+
## Installation
|
|
528
402
|
|
|
529
|
-
|
|
403
|
+
```bash
|
|
404
|
+
# Install globally
|
|
405
|
+
npm install -g opencode-swarm
|
|
530
406
|
|
|
531
|
-
|
|
407
|
+
# Or use npx
|
|
408
|
+
npx opencode-swarm install
|
|
532
409
|
|
|
533
|
-
|
|
534
|
-
|
|
535
|
-
"guardrails": {
|
|
536
|
-
"enabled": true, // default: true
|
|
537
|
-
"max_tool_calls": 200, // range: 10–1000
|
|
538
|
-
"max_duration_minutes": 30, // range: 1–120
|
|
539
|
-
"max_repetitions": 10, // range: 3–50
|
|
540
|
-
"max_consecutive_errors": 5, // range: 2–20
|
|
541
|
-
"warning_threshold": 0.5 // range: 0.1–0.9 (fraction of limit for soft warning)
|
|
542
|
-
}
|
|
543
|
-
}
|
|
410
|
+
# Verify
|
|
411
|
+
opencode # then: /swarm diagnose
|
|
544
412
|
```
|
|
545
413
|
|
|
546
|
-
|
|
547
|
-
|
|
548
|
-
Override limits for specific agents that need more (or less) room:
|
|
414
|
+
The installer auto-configures `opencode.json` to include the plugin. Manual configuration:
|
|
549
415
|
|
|
550
|
-
```
|
|
416
|
+
```json
|
|
551
417
|
{
|
|
552
|
-
"
|
|
553
|
-
"max_tool_calls": 200,
|
|
554
|
-
"profiles": {
|
|
555
|
-
"coder": { "max_tool_calls": 500, "max_duration_minutes": 60 },
|
|
556
|
-
"explorer": { "max_tool_calls": 50 }
|
|
557
|
-
}
|
|
558
|
-
}
|
|
418
|
+
"plugins": ["opencode-swarm"]
|
|
559
419
|
}
|
|
560
420
|
```
|
|
561
421
|
|
|
562
|
-
|
|
422
|
+
---
|
|
563
423
|
|
|
564
|
-
|
|
424
|
+
## Testing
|
|
565
425
|
|
|
566
|
-
|
|
426
|
+
2031 tests across 78 files, spanning unit, integration, adversarial, and smoke suites. They cover config schemas, all agent prompts, all hooks, all tools, all commands, the guardrail circuit breaker, race conditions, invocation window isolation, multi-invocation state, security category classification, and evidence validation.
|
|
567
427
|
|
|
568
|
-
```
|
|
569
|
-
|
|
570
|
-
"review_passes": {
|
|
571
|
-
"always_security_review": false, // default: false (only on security-sensitive files)
|
|
572
|
-
"security_globs": [ // default patterns:
|
|
573
|
-
"**/*auth*", "**/*crypto*",
|
|
574
|
-
"**/*session*", "**/*token*",
|
|
575
|
-
"**/*middleware*", "**/*api*",
|
|
576
|
-
"**/*security*"
|
|
577
|
-
]
|
|
578
|
-
}
|
|
579
|
-
}
|
|
428
|
+
```bash
|
|
429
|
+
bun test
|
|
580
430
|
```
|
|
581
431
|
|
|
582
|
-
|
|
432
|
+
Zero additional test dependencies. Uses Bun's built-in test runner.
|
|
583
433
|
|
|
584
|
-
|
|
434
|
+
---
|
|
585
435
|
|
|
586
|
-
|
|
436
|
+
## Roadmap
|
|
587
437
|
|
|
588
|
-
|
|
589
|
-
{
|
|
590
|
-
"integration_analysis": {
|
|
591
|
-
"enabled": true // default: true
|
|
592
|
-
}
|
|
593
|
-
}
|
|
594
|
-
```
|
|
438
|
+
### v6.3 — Pre-Reviewer Pipeline
|
|
595
439
|
|
|
596
|
-
|
|
440
|
+
Three new tools complete the pre-reviewer gauntlet. Code reaching the Reviewer is already clean.
|
|
597
441
|
|
|
598
|
-
|
|
442
|
+
- **`imports`** — AST-based import graph. For each file changed by the coder, returns every consumer file, which exports each consumer uses, and the line numbers. Replaces fragile grep-based integration analysis with deterministic graph traversal.
|
|
443
|
+
- **`lint`** — Auto-detects project linter (Biome, ESLint, Ruff, Clippy, PSScriptAnalyzer). Runs in fix mode first, then check mode. Structured diagnostic output per file.
|
|
444
|
+
- **`secretscan`** — Entropy-based credential scanner. Detects API keys, tokens, connection strings, and private key headers in the diff before they reach the reviewer. Zero external dependencies.
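
For readers unfamiliar with entropy-based scanning, the idea behind `secretscan` can be sketched with Shannon entropy: random-looking strings such as API keys tend to have more bits per character than ordinary identifiers. The sketch below is illustrative only. The `shannonEntropy` and `looksLikeSecret` names, the minimum length of 20, and the threshold of 4.0 bits are assumptions for this example, not the tool's actual implementation or tuning.

```typescript
// Shannon entropy of a string, in bits per character.
function shannonEntropy(text: string): number {
  if (text.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (const ch of text.split("")) {
    counts[ch] = (counts[ch] ?? 0) + 1;
  }
  let entropy = 0;
  for (const ch of Object.keys(counts)) {
    const p = (counts[ch] ?? 0) / text.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Hypothetical heuristic: flag long tokens whose entropy exceeds a threshold.
// Both defaults are illustrative guesses, not values from the plugin.
function looksLikeSecret(token: string, minLength = 20, threshold = 4.0): boolean {
  return token.length >= minLength && shannonEntropy(token) >= threshold;
}

const shortWord = looksLikeSecret("password");
const randomLooking = looksLikeSecret("abcdefghijklmnopqrstuvwxyz012345");
```

A real scanner would likely combine a score like this with pattern checks (for example, private key headers), since entropy alone produces false positives on hashes and minified code.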
|
|
599
445
|
|
|
600
|
-
|
|
446
|
+
The Phase 5 execute loop becomes: `coder → diff → imports → lint fix → lint check → secretscan → reviewer → security reviewer → test_engineer → adversarial test_engineer`.
|
|
601
447
|
|
|
602
|
-
|
|
603
|
-
- Architect → Coder (task 1) → 200 calls available
|
|
604
|
-
- Coder finishes → Architect → Coder (task 2) → 200 calls available again
|
|
448
|
+
### v6.4 — Execution and Planning Tools
|
|
605
449
|
|
|
606
|
-
|
|
450
|
+
- **`test_runner`** — Unified test execution across Bun, Vitest, Jest, Mocha, pytest, cargo test, and Pester. Auto-detects framework, returns normalized JSON with pass/fail/skip counts and coverage. Three scope modes: `all`, `convention` (naming-based), `graph` (import-graph-based). Eliminates the test_engineer's most common failure mode.
|
|
451
|
+
- **`symbols`** — Export inventory for a module: functions, classes, interfaces, types, enums. Gives the Architect instant visibility into a file's public API surface without reading the full source.
|
|
452
|
+
- **`checkpoint`** — Git-backed save points. Before any multi-file refactor (≥3 files), Architect auto-creates a checkpoint commit. On critical integration failure, restores via soft reset instead of iterating into a hole.
|
|
607
453
|
|
|
608
|
-
|
|
454
|
+
### v6.5 — Intelligence and Audit Tools
|
|
609
455
|
|
|
610
|
-
|
|
456
|
+
Five tools that improve planning quality and post-phase validation:
|
|
611
457
|
|
|
612
|
-
|
|
613
|
-
|
|
614
|
-
|
|
615
|
-
|
|
616
|
-
|
|
617
|
-
}
|
|
618
|
-
```
|
|
619
|
-
|
|
620
|
-
---
|
|
621
|
-
|
|
622
|
-
## Comparison
|
|
623
|
-
|
|
624
|
-
| Feature | OpenCode Swarm | AutoGen | CrewAI | LangGraph |
|
|
625
|
-
|---------|---------------|---------|--------|-----------|
|
|
626
|
-
| Execution | Serial (predictable) | Parallel (chaotic) | Parallel | Configurable |
|
|
627
|
-
| Planning | Phased with acceptance criteria | Ad-hoc | Role-based | Graph-based |
|
|
628
|
-
| Memory | Persistent `.swarm/` files | Session only | Session only | Checkpoints |
|
|
629
|
-
| QA | Dual-pass per-task (review + security + adversarial) | Optional | Optional | Manual |
|
|
630
|
-
| Model mixing | Per-agent configuration | Limited | Limited | Manual |
|
|
631
|
-
| Resume projects | ✅ Native | ❌ | ❌ | Partial |
|
|
632
|
-
| SME domains | Open-domain (any) | Generic | Generic | Generic |
|
|
633
|
-
| Task granularity | One at a time | Batched | Batched | Varies |
|
|
458
|
+
- **`pkg_audit`** — Wraps `npm audit`, `pip-audit`, `cargo audit`. Structured CVE output with severity, patched versions, and advisory URLs. Fed to the security reviewer for concrete vulnerability context.
|
|
459
|
+
- **`complexity_hotspots`** — Git churn × cyclomatic complexity risk map. Run in Phase 0/2 to identify modules that need stricter QA gates before implementation begins.
|
|
460
|
+
- **`schema_drift`** — Compares OpenAPI spec against actual route implementations. Surfaces undocumented routes and phantom spec paths. Run in Phase 6 when API routes were modified.
|
|
461
|
+
- **`todo_extract`** — Structured extraction of `TODO`, `FIXME`, and `HACK` annotations across the codebase. High-priority items fed directly into plan task candidates.
|
|
462
|
+
- **`evidence_check`** — Audits completed tasks against required evidence types. Run in Phase 6 to verify every task has review and test evidence before the phase is marked complete.
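
The churn × complexity idea behind `complexity_hotspots` in the list above can be sketched as a simple ranking. This is a hedged illustration, not the planned implementation: the `FileMetrics` shape, the `rankHotspots` function, and the sample numbers are all made up for the example, and a real tool would derive churn from git history and complexity from a parser.

```typescript
// Hypothetical per-file inputs: commit count (churn) and cyclomatic complexity.
interface FileMetrics {
  path: string;
  churn: number;
  complexity: number;
}

interface Hotspot extends FileMetrics {
  risk: number;
}

// Risk is modeled as the plain product churn * complexity (an assumption),
// with the riskiest files listed first.
function rankHotspots(files: FileMetrics[]): Hotspot[] {
  return files
    .map((file) => ({ ...file, risk: file.churn * file.complexity }))
    .sort((a, b) => b.risk - a.risk);
}

// Made-up sample data for illustration only.
const hotspots = rankHotspots([
  { path: "src/utils.ts", churn: 40, complexity: 3 },
  { path: "src/state.ts", churn: 12, complexity: 25 },
  { path: "src/index.ts", churn: 5, complexity: 8 },
]);
```

The product favors files that are both frequently changed and hard to reason about, which is why a rarely touched but complex file can still outrank a busy but simple one.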
|
|
634
463
|
|
|
635
464
|
---
|
|
636
465
|
|
|
637
466
|
## Design Principles
|
|
638
467
|
|
|
639
|
-
1. **Plan before code**
|
|
640
|
-
2. **One task at a time**
|
|
641
|
-
3. **Review everything immediately**
|
|
642
|
-
4. **Cache SME knowledge**
|
|
643
|
-
5. **Persistent memory**
|
|
644
|
-
6. **Serial execution**
|
|
645
|
-
7. **Heterogeneous models**
|
|
646
|
-
8. **User checkpoints**
|
|
647
|
-
9. **
|
|
648
|
-
10. **Resumable by design** -
|
|
649
|
-
|
|
650
|
-
---
|
|
651
|
-
|
|
652
|
-
## Testing
|
|
653
|
-
|
|
654
|
-
```bash
|
|
655
|
-
# Run all tests
|
|
656
|
-
bun test
|
|
657
|
-
|
|
658
|
-
# Run specific test file
|
|
659
|
-
bun test tests/unit/config/schema.test.ts
|
|
660
|
-
```
|
|
661
|
-
|
|
662
|
-
1280 tests across 57+ files covering config, tools, agents, hooks, commands, state, guardrails, evidence, plan schemas, circuit breaker race conditions, invocation windows, multi-invocation isolation, security categories, review/integration schemas, and diff tool. Uses Bun's built-in test runner — zero additional test dependencies.
|
|
663
|
-
|
|
664
|
-
## Troubleshooting
|
|
665
|
-
|
|
666
|
-
### Plugin not loading
|
|
667
|
-
1. Verify `opencode-swarm` is listed in your `opencode.json` plugins array
|
|
668
|
-
2. Run `bunx opencode-swarm install` to auto-configure
|
|
669
|
-
3. Run `/swarm diagnose` to check health status
|
|
670
|
-
|
|
671
|
-
### Commands not working
|
|
672
|
-
- Ensure you're using `/swarm <command>`, not `/swarm/<command>`
|
|
673
|
-
- Run `/swarm` with no arguments to see available commands
|
|
674
|
-
|
|
675
|
-
### Resuming a project
|
|
676
|
-
- Swarm automatically detects `.swarm/plan.md` and resumes where you left off
|
|
677
|
-
- If you get unexpected behavior, run `/swarm export` to backup, then `/swarm reset --confirm` to start fresh
|
|
468
|
+
1. **Plan before code** — Documented phases with acceptance criteria. The Critic approves the plan before a single line is written.
|
|
469
|
+
2. **One task at a time** — The Coder gets one task and full context. Nothing else.
|
|
470
|
+
3. **Review everything immediately** — Every task goes through correctness review, security review, verification tests, and adversarial tests. No task ships without passing all four.
|
|
471
|
+
4. **Cache SME knowledge** — Guidance is written to `context.md`. The same domain question is never asked twice in a project.
|
|
472
|
+
5. **Persistent memory** — `.swarm/` files are the ground truth. Any session, any model, any day.
|
|
473
|
+
6. **Serial execution** — Predictable, debuggable, no race conditions, no conflicting writes.
|
|
474
|
+
7. **Heterogeneous models** — Different models, different blind spots. The coder's bug is the reviewer's catch.
|
|
475
|
+
8. **User checkpoints** — Phase transitions require user confirmation. No unsupervised multi-phase runs.
|
|
476
|
+
9. **Document failures** — Rejections and retries are recorded in plan.md. After 5 failed attempts, the task escalates to the user.
|
|
477
|
+
10. **Resumable by design** — A cold-start Architect can read `.swarm/` and continue any project as if it had been there from the beginning.
|
|
678
478
|
|
|
679
479
|
---
|
|
680
480
|
|
|
@@ -693,5 +493,5 @@ MIT
|
|
|
693
493
|
---
|
|
694
494
|
|
|
695
495
|
<p align="center">
|
|
696
|
-
<strong>Stop hoping your agents figure it out. Start shipping code that works.</strong>
|
|
496
|
+
<strong>Stop hoping your agents figure it out. Start shipping code that actually works.</strong>
|
|
697
497
|
</p>
|