opencode-swarm 6.1.2 → 6.3.0

package/README.md CHANGED
@@ -1,193 +1,199 @@
1
1
  <p align="center">
2
- <img src="https://img.shields.io/badge/version-6.1.2-blue" alt="Version">
2
+ <img src="https://img.shields.io/badge/version-6.3.0-blue" alt="Version">
3
3
  <img src="https://img.shields.io/badge/license-MIT-green" alt="License">
4
4
  <img src="https://img.shields.io/badge/opencode-plugin-purple" alt="OpenCode Plugin">
5
5
  <img src="https://img.shields.io/badge/agents-9-orange" alt="Agents">
6
- <img src="https://img.shields.io/badge/tests-1280-brightgreen" alt="Tests">
6
+ <img src="https://img.shields.io/badge/tests-1391-brightgreen" alt="Tests">
7
7
  </p>
8
8
 
9
9
  <h1 align="center">🐝 OpenCode Swarm</h1>
10
10
 
11
11
  <p align="center">
12
- <strong>The only multi-agent framework that actually works.</strong><br>
13
- Structured phases. Persistent memory. One task at a time. QA on everything.
12
+ <strong>A structured multi-agent coding framework for OpenCode.</strong><br>
13
+ Nine specialized agents. Persistent memory. A QA gate on every task. Code that ships.
14
14
  </p>
15
15
 
16
16
  <p align="center">
17
- <a href="#why-swarm">Why Swarm?</a> •
17
+ <a href="#the-problem">The Problem</a> •
18
18
  <a href="#how-it-works">How It Works</a> •
19
- <a href="#installation">Installation</a> •
20
19
  <a href="#agents">Agents</a> •
21
- <a href="#configuration">Configuration</a>
20
+ <a href="#persistent-memory">Memory</a> •
21
+ <a href="#guardrails">Guardrails</a> •
22
+ <a href="#comparison">Comparison</a> •
23
+ <a href="#installation">Installation</a> •
24
+ <a href="#roadmap">Roadmap</a>
22
25
  </p>
23
26
 
24
27
  ---
25
28
 
26
- ## The Problem with Every Other Multi-Agent System
27
-
28
- ```
29
- You: "Build me an authentication system"
30
-
31
- Other Frameworks:
32
- ├── Agent 1 starts auth module...
33
- ├── Agent 2 starts user model... (conflicts with Agent 1)
34
- ├── Agent 3 starts database... (wrong schema)
35
- ├── Agent 4 starts tests... (for code that doesn't exist yet)
36
- └── Result: Chaos. Conflicts. Context lost. Start over.
37
-
38
- OpenCode Swarm:
39
- ├── Architect analyzes request
40
- ├── Explorer scans codebase (+ gap analysis)
41
- ├── @sme consulted on security domain
42
- ├── Architect creates phased plan with acceptance criteria
43
- ├── @critic reviews plan → APPROVED
44
- ├── Phase 1: User model → Review → Tests (run + PASS) → ✓
45
- ├── Phase 2: Auth logic → Review → Tests (run + PASS) → ✓
46
- ├── Phase 3: Session management → Review → Tests (run + PASS) → ✓
47
- └── Result: Working code. Documented decisions. Resumable progress.
48
- ```
49
-
50
- ---
51
-
52
- ## Why Swarm?
29
+ ## The Problem
53
30
 
54
- <table>
55
- <tr>
56
- <td width="50%">
31
+ Every multi-agent AI coding tool on the market has the same failure mode: they are vibes-driven. You describe a feature. Agents spawn. They race each other to write conflicting code, lose context after 20 messages, hit token limits mid-task, and produce something that sort-of-works until it doesn't. There's no plan. There's no memory. There's no gatekeeper. There's no test that was actually run.
57
32
 
58
- ### Other Frameworks
33
+ **oh-my-opencode** is a prompt collection. **get-shit-done** is a workflow macro. Neither is a framework with memory, QA enforcement, or the ability to resume a project a week later exactly where you left off.
59
34
 
60
- - Parallel chaos, hope it converges
61
- - Single model = correlated failures
62
- - No planning, just vibes
63
- - Context lost between sessions
64
- - QA as afterthought (if at all)
65
- - Entire codebase in one prompt
66
- - No way to resume projects
35
+ OpenCode Swarm is built differently.
67
36
 
68
- </td>
69
- <td width="50%">
70
-
71
- ### OpenCode Swarm
72
-
73
- - **Serial execution** - predictable, traceable
74
- - **Heterogeneous models** - different perspectives catch errors
75
- - **Phased planning** - documented tasks with acceptance criteria
76
- - **Persistent memory** - `.swarm/` files survive sessions
77
- - **Review per task** - correctness + security review before anything ships
78
- - **One task at a time** - focused, quality code
79
- - **Resumable projects** - pick up exactly where you left off
37
+ ```
38
+ Every other framework:
39
+ ├── Agent 1 starts the auth module...
40
+ ├── Agent 2 starts the user model... (conflicts with Agent 1)
41
+ ├── Agent 3 writes tests... (for code that doesn't exist yet)
42
+ ├── Context window fills up and the whole thing drifts
43
+ └── Result: chaos. Rework. Start over.
80
44
 
81
- </td>
82
- </tr>
83
- </table>
45
+ OpenCode Swarm:
46
+ ├── Architect reads .swarm/plan.md → project already in progress, resumes Phase 2
47
+ ├── @explorer scans the codebase for current state
48
+ ├── @sme DOMAIN: security → consults on auth patterns, guidance cached
49
+ ├── Architect writes .swarm/plan.md: 3 phases, 9 tasks, acceptance criteria per task
50
+ ├── @critic reviews the plan → APPROVED
51
+ ├── @coder implements Task 2.2 (one task, full context, nothing else)
52
+ ├── diff tool → imports tool → lint fix → secretscan → @reviewer → @test_engineer
53
+ ├── All gates pass → plan.md updated → Task 2.2: [x]
54
+ └── Result: working code, documented decisions, resumable project, evidence trail
55
+ ```
84
56
 
85
57
  ---
86
58
 
87
59
  ## How It Works
88
60
 
61
+ ### The Execution Pipeline
62
+
89
63
  ```
90
- ┌─────────────────────────────────────────────────────────────────────────┐
91
- USER: "Add user authentication with JWT"
92
- └─────────────────────────────────────────────────────────────────────────┘
64
+ ┌──────────────────────────────────────────────────────────────────────────┐
65
+ │ Phase 0: Resume Check │
66
+ │ .swarm/plan.md exists? Resume mid-task. New project? Continue. │
67
+ └──────────────────────────────────────────────────────────────────────────┘
93
68
 
94
69
 
95
- ┌─────────────────────────────────────────────────────────────────────────┐
96
- PHASE 0: Check for .swarm/plan.md
97
- Exists? Resume. New? Continue.
98
- └─────────────────────────────────────────────────────────────────────────┘
70
+ ┌──────────────────────────────────────────────────────────────────────────┐
71
+ │ Phase 1: Clarify │
72
+ │ Ask only what the Architect cannot infer. Then stop. │
73
+ └──────────────────────────────────────────────────────────────────────────┘
99
74
 
100
75
 
101
- ┌─────────────────────────────────────────────────────────────────────────┐
102
- PHASE 1: Clarify (if needed)
103
- "Do you need refresh tokens? What's the session duration?"
104
- └─────────────────────────────────────────────────────────────────────────┘
76
+ ┌──────────────────────────────────────────────────────────────────────────┐
77
+ │ Phase 2: Discover │
78
+ │ @explorer scans codebase structure, languages, frameworks, key files │
79
+ └──────────────────────────────────────────────────────────────────────────┘
105
80
 
106
81
 
107
- ┌─────────────────────────────────────────────────────────────────────────┐
108
- PHASE 2: Discover
109
- @explorer scans codebase structure, languages, patterns
110
- └─────────────────────────────────────────────────────────────────────────┘
82
+ ┌──────────────────────────────────────────────────────────────────────────┐
83
+ │ Phase 3: SME Consult (serial, cached) │
84
+ │ @sme DOMAIN: security, @sme DOMAIN: api, ... │
85
+ │ Guidance written to .swarm/context.md — never re-asked in future phases │
86
+ └──────────────────────────────────────────────────────────────────────────┘
111
87
 
112
88
 
113
- ┌─────────────────────────────────────────────────────────────────────────┐
114
- PHASE 3: Consult SMEs (serial, cached)
115
- @sme DOMAIN: security → auth best practices
116
- @sme DOMAIN: api JWT patterns, refresh flow
117
- Guidance saved to .swarm/context.md
118
- └─────────────────────────────────────────────────────────────────────────┘
89
+ ┌──────────────────────────────────────────────────────────────────────────┐
90
+ │ Phase 4: Plan │
91
+ │ Architect writes .swarm/plan.md │
92
+ │ Structured phases, tasks with SMALL/MEDIUM/LARGE sizing, acceptance │
93
+ │ criteria per task, explicit dependency graph │
94
+ └──────────────────────────────────────────────────────────────────────────┘
119
95
 
120
96
 
121
- ┌─────────────────────────────────────────────────────────────────────────┐
122
- PHASE 4: Plan
123
- Creates .swarm/plan.md with phases, tasks, acceptance criteria
124
-
125
- │ Phase 1: Foundation [3 tasks] │
126
- │ Phase 2: Core Auth [4 tasks] │
127
- │ Phase 3: Session Management [3 tasks] │
128
- └─────────────────────────────────────────────────────────────────────────┘
97
+ ┌──────────────────────────────────────────────────────────────────────────┐
98
+ │ Phase 4.5: Critic Gate │
99
+ │ @critic reviews plan → APPROVED / NEEDS_REVISION / REJECTED │
100
+ │ Max 2 revision cycles. Escalates to user if unresolved. │
101
+ └──────────────────────────────────────────────────────────────────────────┘
129
102
 
130
103
 
131
- ┌─────────────────────────────────────────────────────────────────────────┐
132
- PHASE 4.5: Critic Gate
133
- @critic reviews plan → APPROVED / NEEDS_REVISION / REJECTED
134
- Max 2 revision cycles before escalating to user
135
- └─────────────────────────────────────────────────────────────────────────┘
104
+ ┌──────────────────────────────────────────────────────────────────────────┐
105
+ │ Phase 5: Execute (per task) │
106
+
107
+ │ [UI task?] @designer scaffold first │
108
+ │ │
109
+ │ @coder (one task, full context) │
110
+ │ ↓ │
111
+ │ diff tool → imports tool → lint fix → lint check → secretscan │
112
+ │ (contract change detection) (AST-based) (auto-fix) (entropy scan) │
113
+ │ ↓ │
114
+ │ @reviewer (correctness pass) │
115
+ │ ↓ APPROVED │
116
+ │ @reviewer (security-only pass, if file matches security globs) │
117
+ │ ↓ APPROVED │
118
+ │ @test_engineer (verification tests + coverage gate ≥70%) │
119
+ │ ↓ PASS │
120
+ │ @test_engineer (adversarial tests — boundary violations, injections) │
121
+ │ ↓ PASS │
122
+ │ plan.md → [x] Task complete │
123
+ │ │
124
+ │ Any gate fails → back to @coder with structured rejection reason │
125
+ └──────────────────────────────────────────────────────────────────────────┘
136
126
 
137
127
 
138
- ┌─────────────────────────────────────────────────────────────────────────┐
139
- PHASE 5: Execute (per task)
140
-
141
- ┌─────────┐ ┌───────┐ ┌────────────┐ ┌──────────────┐
142
- │ @coder → │ diff │ → │ @reviewer │ @test │ │
143
- │ │ 1 task │ │ tool │ │ check all │ │ write + run │ │
144
- │ └─────────┘ └───────┘ └────────────┘ └──────────────┘ │
145
- │ │ │ │ │ │
146
- │ │ Contract │ If REJECTED: If FAIL: fix │
147
- │ │ changes? │ retry from coder + retest │
148
- │ │ │ │ │ │
149
- │ │ ▼ │ ▼ │
150
- │ │ ┌─────────┐ │ ┌──────────────┐ ┌──────────────┐ │
151
- │ │ │@explorer│ │ │ @reviewer │ → │ @test │ │
152
- │ │ │ impact │ │ │ security-only│ │ adversarial │ │
153
- │ │ │analysis │ │ │ (if match) │ │ (attacks) │ │
154
- │ │ └─────────┘ │ └──────────────┘ └──────────────┘ │
155
- │ │ │ │
156
- │ └───────────────┘ │
157
- │ │
158
- │ Update plan.md: [x] Task complete (only after ALL gates pass) │
159
- │ Next task... │
160
- └─────────────────────────────────────────────────────────────────────────┘
161
-
162
-
163
- ┌─────────────────────────────────────────────────────────────────────────┐
164
- │ PHASE 6: Phase Complete │
165
- │ Re-scan with @explorer │
166
- │ Update context.md with learnings │
167
- │ Archive to .swarm/history/ │
168
- │ "Phase 1 complete. Ready for Phase 2?" │
169
- └─────────────────────────────────────────────────────────────────────────┘
128
+ ┌──────────────────────────────────────────────────────────────────────────┐
129
+ │ Phase 6: Phase Complete │
130
+ │ @explorer rescans. @docs updates documentation. Retrospective written. │
131
+ │ Learnings injected as [SWARM RETROSPECTIVE] into next phase. │
132
+ │ "Phase 1 complete (4 tasks, 0 rejections). Ready for Phase 2?" │
133
+ └──────────────────────────────────────────────────────────────────────────┘
170
134
  ```
171
135
 
136
+ ### Why Serial Execution Matters
137
+
138
+ Multi-agent parallelism sounds fast. In practice, it is a race to produce conflicting, unreviewed code that requires a human to untangle. OpenCode Swarm runs one task at a time through a deterministic pipeline. Every task is reviewed. Every test is run. Every failure is documented and fed back to the coder with structured context. The tradeoff in raw speed is paid back in not redoing work.
139
+
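
The serial pipeline described above can be pictured as an ordered list of gates in which the first failure stops the run and sends a structured reason back to the coder. The sketch below is illustrative only: `Gate`, `GateResult`, `runTask`, and the `maxAttempts` cap are hypothetical names and parameters, not the plugin's actual API, and the README does not state how many coder retries are allowed.

```typescript
// Hypothetical types for illustration; not taken from the opencode-swarm source.
type GateResult = { ok: true } | { ok: false; reason: string };
type Gate = { name: string; check: (attempt: number) => GateResult };

interface TaskOutcome {
  completed: boolean;
  attempts: number;
  rejections: string[]; // structured reasons fed back to the coder
}

// Run gates strictly in order. The first failing gate ends the attempt,
// so later gates never see code that an earlier gate rejected.
function runTask(gates: Gate[], maxAttempts: number): TaskOutcome {
  const rejections: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    let failed = false;
    for (const gate of gates) {
      const result = gate.check(attempt);
      if (!result.ok) {
        rejections.push(`${gate.name}: ${result.reason}`);
        failed = true;
        break;
      }
    }
    if (!failed) {
      return { completed: true, attempts: attempt, rejections };
    }
  }
  return { completed: false, attempts: maxAttempts, rejections };
}

// Example: the reviewer rejects the first attempt, then every gate passes.
const gates: Gate[] = [
  { name: "lint", check: () => ({ ok: true }) },
  {
    name: "reviewer",
    check: (n) =>
      n === 1 ? { ok: false, reason: "missing expiration claim" } : { ok: true },
  },
  { name: "test_engineer", check: () => ({ ok: true }) },
];
const outcome: TaskOutcome = runTask(gates, 3);
```

In this example the test gate is never reached on the first attempt, which is the property the section argues for: nothing downstream runs against code that has not passed the earlier gates.
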
172
140
  ---
173
141
 
174
- ## Persistent Project Memory
142
+ ## Agents
143
+
144
+ ### 🎯 Orchestrator
145
+
146
+ **`architect`** — The central coordinator. Owns the plan, delegates all work, enforces every QA gate, maintains project memory, and resumes projects across sessions. Every other agent works for the Architect.
147
+
148
+ ### 🔍 Discovery
149
+
150
+ **`explorer`** — Fast codebase scanner. Identifies structure, languages, frameworks, key files, and import patterns. Runs before planning and after every phase completes.
151
+
152
+ ### 🧠 Domain Expert
153
+
154
+ **`sme`** — Open-domain expert. The Architect specifies any domain per call: `security`, `python`, `rust`, `kubernetes`, `ios`, `ml`, `blockchain` — any domain the underlying model has knowledge of. No hardcoded list. Guidance is cached in `.swarm/context.md` so the same question is never asked twice.
155
+
156
+ ### 🎨 Design
157
+
158
+ **`designer`** — UI/UX specification agent. Opt-in via config. Generates component scaffolds and design tokens before the coder touches UI tasks, eliminating the most common source of front-end rework.
159
+
160
+ ### 💻 Implementation
161
+
162
+ **`coder`** — Implements exactly one task with full context. No multitasking. No context bleed from prior tasks. The coder receives: the task spec, acceptance criteria, SME guidance, and relevant context from `.swarm/context.md`. Nothing else.
163
+
164
+ **`test_engineer`** — Generates tests, runs them, and returns structured `PASS/FAIL` verdicts with coverage percentages. Runs twice per task: once for verification, once for adversarial attack scenarios.
165
+
166
+ ### ✅ Quality Assurance
167
+
168
+ **`reviewer`** — Dual-pass review. First pass: correctness, logic, maintainability. Second pass: security-only, scoped to OWASP Top 10 categories, triggered automatically when the modified files match security-sensitive path patterns. Both passes produce structured verdicts with specific rejection reasons.
169
+
170
+ **`critic`** — Plan review gate. Reviews the Architect's plan *before implementation begins*. Checks for completeness, feasibility, scope creep, missing dependencies, and AI-slop hallucinations. Plans do not proceed without Critic approval.
171
+
172
+ ### 📝 Documentation
175
173
 
176
- Other frameworks lose everything when the session ends. Swarm doesn't.
174
+ **`docs`** Documentation synthesizer. Runs in Phase 6 with a diff of changed files. Updates READMEs, API documentation, and guides to reflect what was actually built, not what was planned.
175
+
176
+ ---
177
+
178
+ ## Persistent Memory
179
+
180
+ Other frameworks lose everything when the session ends. Swarm stores project state on disk.
177
181
 
178
182
  ```
179
183
  .swarm/
180
- ├── plan.md # Your project roadmap (+ plan.json)
181
- ├── context.md # Everything a new Architect needs
182
- ├── evidence/ # Per-task execution evidence
183
- ├── 1.1/ # Evidence for task 1.1
184
- └── 2.3/ # Evidence for task 2.3
184
+ ├── plan.md # Living roadmap: phases, tasks, status, rejections, blockers
185
+ ├── plan.json # Machine-readable plan for tooling
186
+ ├── context.md # Institutional knowledge: decisions, SME guidance, patterns
187
+ ├── evidence/ # Per-task execution evidence bundles
188
+ │ ├── 1.1/ # review verdict, test results, diff summary for task 1.1
189
+ │ └── 2.3/
185
190
  └── history/
186
- ├── phase-1.md # What was done, what was learned
191
+ ├── phase-1.md # What was built, what was learned, retrospective metrics
187
192
  └── phase-2.md
188
193
  ```
189
194
 
190
- ### plan.md - Living Roadmap
195
+ ### plan.md Living Roadmap
196
+
191
197
  ```markdown
192
198
  # Project: Auth System
193
199
  Current Phase: 2
@@ -200,260 +206,133 @@ Current Phase: 2
200
206
  ## Phase 2: Core Auth [IN PROGRESS]
201
207
  - [x] Task 2.1: Login endpoint [MEDIUM]
202
208
  - [ ] Task 2.2: JWT generation [MEDIUM] (depends: 2.1) ← CURRENT
203
- - Acceptance: Returns valid JWT with user claims
204
- - Attempt 1: REJECTED - Missing expiration
209
+ - Acceptance: Returns valid JWT with user claims, 15-minute expiry
210
+ - Attempt 1: REJECTED missing expiration claim
205
211
  - [ ] Task 2.3: Token validation middleware [MEDIUM]
206
- - [BLOCKED] Task 2.4: Refresh tokens
207
- - Reason: Waiting for decision on rotation strategy
212
+ - [BLOCKED] Task 2.4: Refresh token rotation
213
+ - Reason: Awaiting decision on rotation strategy
208
214
  ```
209
215
 
210
- ### context.md - Institutional Knowledge
216
+ ### context.md Institutional Knowledge
217
+
211
218
  ```markdown
212
219
  # Project Context: Auth System
213
220
 
214
221
  ## Technical Decisions
215
- - Using bcrypt (cost 12) for password hashing
216
- - JWT expires in 15 minutes, refresh in 7 days
217
- - Storing refresh tokens in Redis
222
+ - bcrypt cost factor: 12
223
+ - JWT TTL: 15 minutes; refresh TTL: 7 days
224
+ - Refresh token store: Redis with key prefix auth:refresh:
218
225
 
219
226
  ## SME Guidance Cache
220
- ### Security (Phase 1)
221
- - Never log tokens or passwords
222
- - Use constant-time comparison for tokens
223
- - Implement rate limiting on login
227
+ ### security (Phase 1)
228
+ - Never log tokens or passwords in any context
229
+ - Use constant-time comparison for all token equality checks
230
+ - Rate-limit login endpoint: 5 attempts / 15 minutes per IP
224
231
 
225
- ### API (Phase 1)
226
- - Return 401 for invalid credentials (not 404)
227
- - Include token expiry in response body
232
+ ### api (Phase 1)
233
+ - Return HTTP 401 for invalid credentials (not 404)
234
+ - Include token expiry timestamp in response body
228
235
 
229
236
  ## Patterns Established
230
- - Error handling: Custom ApiError class with status codes
231
- - Validation: Zod schemas in /validators/
237
+ - Error handling: custom ApiError class with HTTP status and error code
238
+ - Validation: Zod schemas in /validators/, applied at request boundary
232
239
  ```
233
240
 
234
- **Start a new session tomorrow?** The Architect reads these files and picks up exactly where you left off.
241
+ Start a new session tomorrow. The Architect reads these files and picks up exactly where you left off — no re-explaining, no rediscovery, no drift.
235
242
 
236
- ---
237
-
238
- ## Heterogeneous Models = Better Code
239
-
240
- Most frameworks use one model for everything. Same blindspots everywhere.
243
+ ### Evidence Bundles
241
244
 
242
- Swarm lets you mix models strategically:
243
-
244
- ```json
245
- {
246
- "agents": {
247
- "architect": { "model": "anthropic/claude-sonnet-4-5" },
248
- "explorer": { "model": "google/gemini-2.0-flash" },
249
- "coder": { "model": "anthropic/claude-sonnet-4-5" },
250
- "sme": { "model": "google/gemini-2.0-flash" },
251
- "reviewer": { "model": "openai/gpt-4o" },
252
- "critic": { "model": "google/gemini-2.0-flash" },
253
- "test_engineer": { "model": "google/gemini-2.0-flash" }
254
- }
255
- }
256
- ```
245
+ Each completed task writes structured evidence to `.swarm/evidence/`:
257
246
 
258
- | Role | Optimized For | Why Different Models? |
259
- |------|---------------|----------------------|
260
- | Architect | Deep reasoning | Needs to plan complex work |
261
- | Explorer | Fast scanning | Speed over depth |
262
- | Coder | Implementation | Best coding model you have |
263
- | SME | Domain knowledge | Fast recall, not deep reasoning |
264
- | Reviewer | Finding flaws | **Different vendor catches different bugs** |
265
- | Critic | Plan review | Catches scope issues before any code is written |
266
- | Test Engineer | Test + run | Writes tests, runs them, reports PASS/FAIL |
247
+ | Type | What It Captures |
248
+ |------|-----------------|
249
+ | `review` | Verdict (APPROVED/REJECTED), risk level, specific issues |
250
+ | `test` | Pass/fail counts, coverage percentage, failure messages |
251
+ | `diff` | Files changed, additions/deletions, contract change flags |
252
+ | `approval` | Stakeholder sign-off with notes |
253
+ | `retrospective` | Phase metrics: total tool calls, coder revisions, reviewer rejections, test failures, security findings, lessons learned |
267
254
 
268
- **If Claude writes code and GPT reviews it, GPT catches Claude's blindspots.** This is why real teams have code review.
255
+ Retrospectives from completed phases are injected as `[SWARM RETROSPECTIVE]` hints at the start of subsequent phases. The framework learns from its own history within a project.
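
As a rough illustration of how a retrospective record might be turned into a `[SWARM RETROSPECTIVE]` hint, here is a minimal sketch. The metric fields follow the `retrospective` row of the table above, but the interface name, field names, and the exact hint layout are assumptions made for this example; the plugin's real schema and wording may differ.

```typescript
// Hypothetical shape; field names are assumptions based on the evidence table.
interface Retrospective {
  phase: number;
  totalToolCalls: number;
  coderRevisions: number;
  reviewerRejections: number;
  testFailures: number;
  securityFindings: number;
  lessons: string[];
}

// Build a plain-text hint block. The layout below is illustrative only.
function toHint(r: Retrospective): string {
  const metrics =
    `Phase ${r.phase}: ${r.totalToolCalls} tool calls, ` +
    `${r.coderRevisions} coder revisions, ` +
    `${r.reviewerRejections} reviewer rejections, ` +
    `${r.testFailures} test failures, ` +
    `${r.securityFindings} security findings`;
  const lessons = r.lessons.map((lesson) => `- ${lesson}`);
  return ["[SWARM RETROSPECTIVE]", metrics, ...lessons].join("\n");
}

const hint: string = toHint({
  phase: 1,
  totalToolCalls: 84,
  coderRevisions: 2,
  reviewerRejections: 1,
  testFailures: 0,
  securityFindings: 0,
  lessons: ["Set the JWT expiration claim explicitly"],
});
```
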
269
256
 
270
257
  ---
271
258
 
272
- ## Multiple Swarms
273
-
274
- Run different model configurations simultaneously. Perfect for:
275
- - **Cloud vs Local**: Premium cloud models for critical work, local models for quick tasks
276
- - **Fast vs Quality**: Quick iterations with fast models, careful work with expensive ones
277
- - **Cost Tiers**: Cheap models for exploration, premium for implementation
259
+ ## Heterogeneous Models
278
260
 
279
- ### Configuration
261
+ Single-model frameworks have correlated failure modes. The same model that writes the bug reviews it and misses it. Swarm lets you route each agent to the model it is best suited for:
280
262
 
281
263
  ```json
282
264
  {
283
- "swarms": {
284
- "cloud": {
285
- "name": "Cloud",
286
- "agents": {
287
- "architect": { "model": "anthropic/claude-sonnet-4-5" },
288
- "coder": { "model": "anthropic/claude-sonnet-4-5" },
289
- "sme": { "model": "google/gemini-2.0-flash" },
290
- "reviewer": { "model": "openai/gpt-4o" }
291
- }
292
- },
293
- "local": {
294
- "name": "Local",
295
- "agents": {
296
- "architect": { "model": "ollama/qwen2.5:32b" },
297
- "coder": { "model": "ollama/qwen2.5:32b" },
298
- "sme": { "model": "ollama/qwen2.5:14b" },
299
- "reviewer": { "model": "ollama/qwen2.5:14b" }
300
- }
301
- }
265
+ "agents": {
266
+ "architect": { "model": "anthropic/claude-opus-4-6" },
267
+ "coder": { "model": "minimax-coding-plan/MiniMax-M2.5" },
268
+ "explorer": { "model": "minimax-coding-plan/MiniMax-M2.1" },
269
+ "sme": { "model": "kimi-for-coding/k2p5" },
270
+ "critic": { "model": "zai-coding-plan/glm-5" },
271
+ "reviewer": { "model": "zai-coding-plan/glm-5" },
272
+ "test_engineer": { "model": "minimax-coding-plan/MiniMax-M2.5" },
273
+ "docs": { "model": "zai-coding-plan/glm-4.7-flash" },
274
+ "designer": { "model": "kimi-for-coding/k2p5" }
302
275
  }
303
276
  }
304
277
  ```
305
278
 
306
- ### What Gets Created
307
-
308
- | Swarm | Agents |
309
- |-------|--------|
310
- | `cloud` (default) | `architect`, `explorer`, `coder`, `sme`, `reviewer`, `critic`, `test_engineer` |
311
- | `local` | `local_architect`, `local_explorer`, `local_coder`, `local_sme`, `local_reviewer`, `local_critic`, `local_test_engineer` |
312
-
313
- The first swarm (or one named "default") creates unprefixed agents. Additional swarms prefix all agent names.
314
-
315
- ### Usage
316
-
317
- In OpenCode, you'll see multiple architects to choose from:
318
- - `architect` - Cloud swarm (default)
319
- - `local_architect` - Local swarm
320
-
321
- Each architect automatically delegates to its own swarm's agents.
279
+ Reviewer uses a different model than Coder by design. Different training, different priors, different blind spots. This is the cheapest bug-catcher you will ever deploy.
322
280
 
323
281
  ---
324
282
 
325
- ## Installation
283
+ ## Guardrails
326
284
 
327
- ```bash
328
- # Install via CLI (recommended)
329
- bunx opencode-swarm install
330
- ```
285
+ Every subagent runs inside a circuit breaker that kills runaway behavior before it burns credits on a stuck loop.
331
286
 
332
- ### Uninstall
287
+ | Layer | Trigger | Action |
288
+ |-------|---------|--------|
289
+ | ⚠️ Soft Warning | 50% of any limit reached | Warning injected into agent stream |
290
+ | 🛑 Hard Block | 100% of any limit reached | All further tool calls blocked |
333
291
 
334
- ```bash
335
- # Remove from opencode.json
336
- bunx opencode-swarm uninstall
292
+ | Signal | Default | Description |
293
+ |--------|---------|-------------|
294
+ | Tool calls | 200 | Per-invocation, not per-session |
295
+ | Duration | 30 min | Wall-clock time per delegation |
296
+ | Repetition | 10 | Same tool + args consecutively |
297
+ | Consecutive errors | 5 | Sequential null/undefined outputs |
337
298
 
338
- # Remove from opencode.json + clean up config files
339
- bunx opencode-swarm uninstall --clean
340
- ```
299
+ Limits are enforced **per-invocation**. Each delegation to a subagent starts a fresh budget. A coder fixing a second task is not penalized for the first task's tool calls. The Architect is exempt from all limits by default.
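
The two-layer behaviour in the tables above can be sketched as a single function that compares usage against limits and returns the strongest applicable verdict. This is a simplified model, not the plugin's implementation: the type names, `evaluate`, and `freshUsage` are hypothetical, and only the default numbers and the 50% / 100% thresholds come from the tables.

```typescript
// Hypothetical names; defaults mirror the Signal table above.
interface Limits {
  maxToolCalls: number;
  maxDurationMinutes: number;
  maxRepetitions: number;
  maxConsecutiveErrors: number;
}

interface Usage {
  toolCalls: number;
  elapsedMinutes: number;
  repetitions: number;
  consecutiveErrors: number;
}

type Verdict = "ok" | "soft_warning" | "hard_block";

const DEFAULT_LIMITS: Limits = {
  maxToolCalls: 200,
  maxDurationMinutes: 30,
  maxRepetitions: 10,
  maxConsecutiveErrors: 5,
};

// "Any limit" semantics: the signal closest to its limit decides the verdict.
function evaluate(usage: Usage, limits: Limits, warningThreshold = 0.5): Verdict {
  const worst = Math.max(
    usage.toolCalls / limits.maxToolCalls,
    usage.elapsedMinutes / limits.maxDurationMinutes,
    usage.repetitions / limits.maxRepetitions,
    usage.consecutiveErrors / limits.maxConsecutiveErrors,
  );
  if (worst >= 1) return "hard_block";
  if (worst >= warningThreshold) return "soft_warning";
  return "ok";
}

// Per-invocation budgets: every delegation starts from zeroed counters.
function freshUsage(): Usage {
  return { toolCalls: 0, elapsedMinutes: 0, repetitions: 0, consecutiveErrors: 0 };
}

const atHalf: Verdict = evaluate({ ...freshUsage(), toolCalls: 100 }, DEFAULT_LIMITS);
const atLimit: Verdict = evaluate({ ...freshUsage(), consecutiveErrors: 5 }, DEFAULT_LIMITS);
```

Calling `freshUsage()` at the start of each delegation is what keeps a second task from inheriting the first task's counters.
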
341
300
 
342
- ---
301
+ Per-agent profiles allow fine-grained overrides:
343
302
 
344
- ## What's New
345
-
346
- ### v6.1.2 — Guardrails Remediation
347
- - **Fail-safe config validation** — Config validation failures now disable guardrails as a safety precaution (previously Zod defaults could silently re-enable them).
348
- - **Architect exemption fix** — Architect/orchestrator sessions can no longer inherit 30-minute base limits during delegation race conditions.
349
- - **Explicit disable always wins** — `guardrails.enabled: false` in config is now always honored, even when the config was loaded from file.
350
- - **Internal map synchronization** — `startAgentSession()` now keeps `activeAgent` and `agentSessions` maps in sync for consistent state tracking.
351
-
352
- ### v6.1.1 — Security Fix & Tech Debt
353
- - **Security hardening (`_loadedFromFile`)** — Fixed a critical vulnerability where an internal loader flag could be injected via JSON config to bypass guardrails. The flag is now purely internal and no longer part of the public schema.
354
- - **TOCTOU protection** — Added atomic-style content checks in the config loader to prevent race conditions during file reads.
355
- - **`retrieve_summary` tool** — Properly registered the retrieval tool, allowing agents to fetch full content from auto-summarized tool outputs.
356
- - **92 new tests** — 1280 total tests across 57+ files (up from 1188 in v6.0.0).
357
-
358
- ### v6.1.0 — Docs & Design Agents
359
- - **`docs` agent** — Dedicated documentation synthesizer that automatically updates READMEs, API docs, and guides during Phase 6.
360
- - **`designer` agent** — UI/UX specification agent that generates component scaffolds before coding begins on UI-heavy tasks.
361
- - **Heterogeneous model defaults** — Updated default models for new agents to use optimized Gemini models for speed and cost.
362
-
363
- ### v6.0.0 — Core QA & Security Gates
364
- - **Dual-pass security reviewer** — After the general reviewer APPROVES, the architect automatically triggers a second security-only review pass when the changed file matches security-sensitive paths (`auth`, `crypto`, `session`, `token`, `middleware`, `api`, `security`) or the coder's output contains security keywords. Configurable via `review_passes` config.
365
- - **Adversarial testing** — After verification tests PASS, the test engineer is re-delegated with adversarial-only framing: attack vectors, boundary violations, and injection attempts. Pure prompt engineering, no new infrastructure.
366
- - **Integration impact analysis** — After the coder completes, the `diff` tool detects contract changes (exported functions, interfaces, types). If found, the explorer runs impact analysis across dependents before review begins.
367
- - **`diff` tool** — New agent-accessible tool providing structured git diff with numstat parsing, contract change detection, configurable base ref (`HEAD`/staged/unstaged), path filtering, and 500-line truncation.
368
- - **87 new tests** — 1188 total tests across 53+ files (up from 1101 in v5.2.0).
369
-
370
- ### v5.2.0 — Per-Invocation Guardrails
371
- - **Per-invocation budget isolation** — Guardrail limits (tool calls, duration, errors) now reset with each agent delegation. Second invocation of the same agent gets a fresh budget, preventing false circuit breaker trips in long-running projects.
372
- - **Architect protocol enforcement** — New mandatory QA gate rules: every coder task must go through reviewer approval + test_engineer verification before the next coder task. Protocol violations detected at runtime with warning injection.
373
- - **Invocation window observability** — Circuit breaker logs now include `invocationId` and `windowKey` for precise debugging of which specific agent invocation hit limits.
374
- - **67 new tests** — 1101 total tests across 48 files (up from 1034 in v5.1.x).
375
-
376
- ### v5.0.0 — Verifiable Execution
377
- - **Canonical plan schema** — Machine-readable `plan.json` with Zod-validated `PlanSchema`/`TaskSchema`/`PhaseSchema`. Automatic migration from legacy `plan.md` format. Structured status tracking (`pending`, `in_progress`, `completed`, `blocked`).
378
- - **Evidence bundles** — Per-task execution evidence persisted to `.swarm/evidence/`. Five evidence types: `review`, `test`, `diff`, `approval`, `note`. Sanitized task IDs, atomic writes, configurable size limits. `/swarm evidence` to view, `/swarm archive` to manage retention.
379
- - **Per-agent guardrail profiles** — Override guardrail limits for individual agents via `guardrails.profiles`. `resolveGuardrailsConfig()` merges base + profile with per-agent specificity.
380
- - **Context injection budget** — `max_injection_tokens` config controls how much context is injected into system prompts. Priority-ordered: phase → task → decisions → agent context. Lower-priority items dropped when budget exhausted.
381
- - **Enhanced `/swarm agents`** — Agent count summary, `⚡ custom limits` indicator for profiled agents, guardrail profiles section.
382
- - **Packaging smoke tests** — CI-safe `dist/` validation (8 tests).
383
- - **151 new tests** — 1027 total tests across 44 files (up from 876 in v4.6.0).
384
-
385
- ### v4.6.0 — Agent Guardrails
386
- - **Circuit breaker** — Two-layer protection against runaway agents. Soft warning at 50% of limits, hard block at 100%. Prevents infinite loops and runaway API costs.
387
- - **Detection signals** — Tool call count, wall-clock time, consecutive repetition, and consecutive error tracking per agent session.
388
- - **Configurable limits** — All thresholds tunable via `guardrails` config: `max_tool_calls`, `max_duration_minutes`, `max_repetitions`, `max_consecutive_errors`, `warning_threshold`.
389
- - **46 new tests** — 668 total tests across 30 files.
390
-
391
- ### v4.5.0 — Tech Debt + New Commands
392
- - **Lint cleanup** — Replaced string concatenation with template literals, documented `as any` casts with biome-ignore comments.
393
- - **Code deduplication** — Extracted `stripSwarmPrefix()` utility to eliminate 3 duplicate prefix-stripping blocks.
394
- - **`/swarm diagnose`** — Health check for `.swarm/` files, plan structure, and plugin configuration.
395
- - **`/swarm export`** — Export plan.md and context.md as portable JSON.
396
- - **`/swarm reset --confirm`** — Clear swarm state files with safety confirmation.
397
-
398
- ### v4.4.0 — DX & Quality
399
- - **CLI `uninstall` command** — Remove plugin with optional `--clean` flag.
400
- - **Custom error classes** — `SwarmError` hierarchy with actionable `guidance` messages.
401
- - **`/swarm history`** — View completed phases from plan.md.
402
- - **`/swarm config`** — View current resolved plugin configuration.
403
-
404
- ### v4.3.2 — Security Hardening
405
- - **Path validation** — `validateSwarmPath()` prevents directory traversal in `.swarm/` file operations.
406
- - **Fetch hardening** — 10s timeout, 5MB limit, retry logic for gitingest tool.
407
- - **Config limits** — Deep merge depth limit (10), config file size limit (100KB).
408
-
409
- ### v4.3.0 — Hooks & Agent Awareness
410
- - **Hooks pipeline** — `safeHook()` crash-safe wrapper, `composeHandlers()` for multi-handler composition.
411
- - **Context pruning** — Token budget tracking with 70%/90% threshold warnings.
412
- - **Slash commands** — `/swarm status`, `/swarm plan`, `/swarm agents`.
413
- - **Agent awareness** — Activity tracking, delegation tracking, cross-agent context injection.
414
-
415
- All features are opt-in via configuration. See [Installation Guide](docs/installation.md) for config options.
303
+ ```jsonc
304
+ {
305
+ "guardrails": {
306
+ "max_tool_calls": 200,
307
+ "profiles": {
308
+ "coder": { "max_tool_calls": 500, "max_duration_minutes": 60 },
309
+ "explorer": { "max_tool_calls": 50 }
310
+ }
311
+ }
312
+ }
313
+ ```
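As a rough illustration of how a profile block like this could resolve to effective per-agent limits, here is a minimal TypeScript sketch. It assumes a shallow override of base limits by the agent's profile. The names `GuardrailLimits`, `GuardrailsConfig`, and `resolveLimits` are hypothetical and are not the plugin's actual `resolveGuardrailsConfig()` implementation; the 30-minute base duration is the default listed in the guardrails table.

```typescript
// Hypothetical types for illustration only; not the plugin's real schema.
interface GuardrailLimits {
  max_tool_calls: number;
  max_duration_minutes: number;
}

interface GuardrailsConfig extends GuardrailLimits {
  profiles?: Record<string, Partial<GuardrailLimits>>;
}

// Start from the base limits, then let the agent's profile (if any)
// override only the fields it specifies.
function resolveLimits(config: GuardrailsConfig, agent: string): GuardrailLimits {
  const { profiles, ...base } = config;
  return { ...base, ...(profiles?.[agent] ?? {}) };
}

const guardrails: GuardrailsConfig = {
  max_tool_calls: 200,
  max_duration_minutes: 30,
  profiles: {
    coder: { max_tool_calls: 500, max_duration_minutes: 60 },
    explorer: { max_tool_calls: 50 },
  },
};

const coderLimits = resolveLimits(guardrails, "coder");
const explorerLimits = resolveLimits(guardrails, "explorer");
const reviewerLimits = resolveLimits(guardrails, "reviewer");
console.log({ coderLimits, explorerLimits, reviewerLimits });
```

Under these assumptions, an agent without a profile (here `reviewer`) simply inherits the base limits, and `explorer` keeps the base duration while tightening its tool-call budget.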
416
314
 
417
315
  ---
418
316
 
419
- ## Agents
420
-
421
- ### 🎯 Orchestrator
422
- | Agent | Role |
423
- |-------|------|
424
- | `architect` | Central coordinator. Plans phases, delegates tasks, manages QA, maintains project memory. |
425
-
426
- ### 🔍 Discovery
427
- | Agent | Role |
428
- |-------|------|
429
- | `explorer` | Fast codebase scanner. Identifies structure, languages, frameworks, key files. |
430
-
431
- ### 🎨 Design
432
- | Agent | Role |
433
- |-------|------|
434
- | `designer` | UI/UX specification agent. Generates component scaffolds and design tokens before coding begins on UI-heavy tasks. |
435
-
436
- ### 🧠 Domain Expert
437
- | Agent | Role |
438
- |-------|------|
439
- | `sme` | Open-domain expert. The architect specifies any domain (security, python, ios, rust, kubernetes, etc.) per call. No hardcoded list — works with any domain the LLM has knowledge of. |
440
-
441
- ### 💻 Implementation
442
- | Agent | Role |
443
- |-------|------|
444
- | `coder` | Implements ONE task at a time with full context |
445
- | `test_engineer` | Generates tests, runs them, and reports structured PASS/FAIL verdicts |
446
-
447
- ### ✅ Quality Assurance
448
- | Agent | Role |
449
- |-------|------|
450
- | `reviewer` | Dual-pass review: correctness review first, then automatic security-only pass for security-sensitive files. The architect specifies CHECK dimensions per call. OWASP Top 10 categories built in. |
451
- | `critic` | Plan review gate. Reviews the architect's plan BEFORE implementation — checks completeness, feasibility, scope, dependencies, and flags AI-slop. |
317
+ ## Comparison
452
318
 
453
- ### 📝 Documentation
454
- | Agent | Role |
455
- |-------|------|
456
- | `docs` | Documentation synthesizer. Automatically updates READMEs, API docs, and guides based on implementation changes during Phase 6. |
319
+ | Feature | OpenCode Swarm | oh-my-opencode | get-shit-done | AutoGen | CrewAI |
320
+ |---------|:-:|:-:|:-:|:-:|:-:|
321
+ | Multi-agent orchestration | ✅ 9 specialized agents | ❌ Prompt config only | ❌ Single-agent macros | ✅ | ✅ |
322
+ | Execution model | Serial (deterministic) | N/A | N/A | Parallel (chaotic) | Parallel |
323
+ | Phased planning with acceptance criteria | ✅ | ❌ | ❌ | ❌ | ❌ |
324
+ | Critic gate before implementation | ✅ | ❌ | ❌ | ❌ | ❌ |
325
+ | Per-task dual-pass review (correctness + security) | ✅ | ❌ | ❌ | Optional | Optional |
326
+ | Adversarial test pass per task | ✅ | ❌ | ❌ | ❌ | ❌ |
327
+ | Pre-reviewer pipeline (lint, secretscan, imports) | ✅ v6.3 | ❌ | ❌ | ❌ | ❌ |
328
+ | Persistent session memory | ✅ `.swarm/` files | ❌ | ❌ | Session only | Session only |
329
+ | Resume projects across sessions | ✅ Native | ❌ | ❌ | ❌ | ❌ |
330
+ | Evidence trail per task | ✅ Structured bundles | ❌ | ❌ | ❌ | ❌ |
331
+ | Heterogeneous model routing | ✅ Per-agent | ❌ | ❌ | Limited | Limited |
332
+ | Circuit breaker / guardrails | ✅ Per-invocation | ❌ | ❌ | ❌ | ❌ |
333
+ | Open-domain SME consultation | ✅ Any domain | ❌ | ❌ | ❌ | ❌ |
334
+ | Retrospective learning across phases | ✅ | ❌ | ❌ | ❌ | ❌ |
335
+ | Slash commands + diagnostics | ✅ 12 commands | ❌ | Limited | ❌ | ❌ |
457
336
 
458
337
  ---
459
338
 
@@ -461,220 +340,141 @@ All features are opt-in via configuration. See [Installation Guide](docs/install
461
340
 
462
341
  | Command | Description |
463
342
  |---------|-------------|
464
- | `/swarm status` | Current phase, task progress, and agent count |
465
- | `/swarm plan [N]` | View full plan or filter by phase number |
466
- | `/swarm agents` | List all registered agents with models and permissions |
467
- | `/swarm history` | View completed phases with status icons |
468
- | `/swarm config` | View current resolved plugin configuration |
469
- | `/swarm diagnose` | Health check for .swarm/ files and config |
343
+ | `/swarm status` | Current phase, task progress, agent count |
344
+ | `/swarm plan [N]` | Full plan or filtered by phase |
345
+ | `/swarm agents` | All registered agents with models and permissions |
346
+ | `/swarm history` | Completed phases with status |
347
+ | `/swarm config` | Current resolved configuration |
348
+ | `/swarm diagnose` | Health check for `.swarm/` files and config |
470
349
  | `/swarm export` | Export plan and context as portable JSON |
471
- | `/swarm reset --confirm` | Clear swarm state files (with safety gate) |
472
- | `/swarm evidence [task]` | View evidence bundles for a task or all tasks |
473
- | `/swarm archive [--dry-run]` | Archive old evidence bundles with retention policy |
474
- | `/swarm benchmark` | Run performance benchmarks and display metrics |
475
- | `/swarm retrieve [id]` | Retrieve auto-summarized tool outputs by ID |
350
+ | `/swarm evidence [task]` | Evidence bundles for a task or all tasks |
351
+ | `/swarm archive [--dry-run]` | Archive old evidence with retention policy |
352
+ | `/swarm benchmark` | Performance benchmarks |
353
+ | `/swarm retrieve [id]` | Retrieve auto-summarized tool outputs |
354
+ | `/swarm reset --confirm` | Clear swarm state files |
476
355
 
477
356
  ---
478
357
 
479
358
  ## Configuration
480
359
 
481
- Create `~/.config/opencode/opencode-swarm.json`:
482
-
483
360
  ```json
484
361
  {
485
362
  "agents": {
486
- "architect": { "model": "anthropic/claude-sonnet-4-5" },
487
- "explorer": { "model": "google/gemini-2.0-flash" },
488
- "coder": { "model": "anthropic/claude-sonnet-4-5" },
489
- "sme": { "model": "google/gemini-2.0-flash" },
490
- "reviewer": { "model": "openai/gpt-4o" },
491
- "critic": { "model": "google/gemini-2.0-flash" },
492
- "test_engineer": { "model": "google/gemini-2.0-flash" },
493
- "docs": { "model": "google/gemini-2.0-flash" },
494
- "designer": { "model": "google/gemini-2.0-flash" }
363
+ "architect": { "model": "anthropic/claude-opus-4-6" },
364
+ "coder": { "model": "minimax-coding-plan/MiniMax-M2.5" },
365
+ "explorer": { "model": "minimax-coding-plan/MiniMax-M2.1" },
366
+ "sme": { "model": "kimi-for-coding/k2p5" },
367
+ "critic": { "model": "zai-coding-plan/glm-5" },
368
+ "reviewer": { "model": "zai-coding-plan/glm-5" },
369
+ "test_engineer": { "model": "minimax-coding-plan/MiniMax-M2.5" },
370
+ "docs": { "model": "zai-coding-plan/glm-4.7-flash" },
371
+ "designer": { "model": "kimi-for-coding/k2p5" }
372
+ },
373
+ "guardrails": {
374
+ "max_tool_calls": 200,
375
+ "max_duration_minutes": 30,
376
+ "profiles": {
377
+ "coder": { "max_tool_calls": 500 }
378
+ }
379
+ },
380
+ "review_passes": {
381
+ "always_security_review": false,
382
+ "security_globs": ["**/*auth*", "**/*crypto*", "**/*session*", "**/*token*"]
495
383
  }
496
384
  }
497
385
  ```
498
386
 
499
- ### Disable Agents
387
+ Save to `~/.config/opencode/opencode-swarm.json` or `.opencode/swarm.json` in your project root. Project config merges over global config via deep merge — partial overrides do not clobber unspecified fields.
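The deep-merge behaviour described above can be sketched as follows. This is an illustrative TypeScript example, not the plugin's code: `deepMerge`, `isPlainObject`, and the sample configs are hypothetical, and the depth cap of 10 mirrors the limit mentioned in the v4.3.2 changelog entry. It also assumes arrays are replaced wholesale rather than concatenated.

```typescript
// Hypothetical sketch; names and structure are assumptions for illustration.
type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string]: JsonValue };
type JsonObject = { [key: string]: JsonValue };

function isPlainObject(value: JsonValue | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Merge `override` into `base`. Nested objects merge key by key;
// scalars and arrays in `override` replace the base value.
function deepMerge(base: JsonObject, override: JsonObject, depth = 0): JsonObject {
  if (depth > 10) throw new Error("config nesting exceeds merge depth limit");
  const result: JsonObject = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = result[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value, depth + 1)
        : value;
  }
  return result;
}

// Global config sets base limits; project config adds only a coder profile.
const globalConfig: JsonObject = {
  guardrails: { max_tool_calls: 200, max_duration_minutes: 30 },
};
const projectConfig: JsonObject = {
  guardrails: { profiles: { coder: { max_tool_calls: 500 } } },
};

const merged = deepMerge(globalConfig, projectConfig);
console.log(JSON.stringify(merged, null, 2));
```

In this sketch the project file only mentions `guardrails.profiles.coder`, yet the merged result still carries the global `max_tool_calls` and `max_duration_minutes`, which is the "partial overrides do not clobber unspecified fields" property.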
388
+
389
+ ### Disabling Agents
390
+
500
391
  ```json
501
392
  {
502
- "sme": { "disabled": true },
393
+ "sme": { "disabled": true },
394
+ "designer": { "disabled": true },
503
395
  "test_engineer": { "disabled": true }
504
396
  }
505
397
  ```
506
398
 
507
399
  ---
508
400
 
509
- ## Guardrails
510
-
511
- OpenCode Swarm includes a built-in circuit breaker that prevents subagents from running away — burning API credits in infinite loops, repeating the same tool call, or spinning for hours.
512
-
513
- ### How It Works
514
-
515
- | Layer | Trigger | Action |
516
- |-------|---------|--------|
517
- | ⚠️ **Soft Warning** | 50% of any limit reached | Injects warning message into agent's chat stream |
518
- | 🛑 **Hard Block** | 100% of any limit reached | Blocks ALL further tool calls + injects stop message |
519
-
520
- ### Detection Signals
521
-
522
- | Signal | Default Limit | Description |
523
- |--------|---------------|-------------|
524
- | Tool calls | 200 | Total tool invocations per agent session |
525
- | Duration | 30 min | Wall-clock time since delegation started |
526
- | Repetition | 10 | Same tool + args called consecutively |
527
- | Consecutive errors | 5 | Sequential null/undefined tool outputs |
401
+ ## Installation
528
402
 
529
- ### Configuration
403
+ ```bash
404
+ # Install globally
405
+ npm install -g opencode-swarm
530
406
 
531
- Guardrails are **enabled by default**. Customize in your swarm config:
407
+ # Or use npx
408
+ npx opencode-swarm install
532
409
 
533
- ```jsonc
534
- {
535
- "guardrails": {
536
- "enabled": true, // default: true
537
- "max_tool_calls": 200, // range: 10–1000
538
- "max_duration_minutes": 30, // range: 1–120
539
- "max_repetitions": 10, // range: 3–50
540
- "max_consecutive_errors": 5, // range: 2–20
541
- "warning_threshold": 0.5 // range: 0.1–0.9 (fraction of limit for soft warning)
542
- }
543
- }
410
+ # Verify
411
+ opencode # then: /swarm diagnose
544
412
  ```
545
413
 
546
- ### Per-Agent Profiles
547
-
548
- Override limits for specific agents that need more (or less) room:
414
+ The installer auto-configures `opencode.json` to include the plugin. Manual configuration:
549
415
 
550
- ```jsonc
416
+ ```json
551
417
  {
552
- "guardrails": {
553
- "max_tool_calls": 200,
554
- "profiles": {
555
- "coder": { "max_tool_calls": 500, "max_duration_minutes": 60 },
556
- "explorer": { "max_tool_calls": 50 }
557
- }
558
- }
418
+ "plugins": ["opencode-swarm"]
559
419
  }
560
420
  ```
561
421
 
562
- Profiles merge with base config — only specified fields are overridden.
422
+ ---
563
423
 
564
- ### Review Passes
424
+ ## Testing
565
425
 
566
- Control the dual-pass security review behavior:
426
+ 2031 tests across 78 files. Unit, integration, adversarial, and smoke. Covers config schemas, all agent prompts, all hooks, all tools, all commands, guardrail circuit breaker, race conditions, invocation window isolation, multi-invocation state, security category classification, and evidence validation.
567
427
 
568
- ```jsonc
569
- {
570
- "review_passes": {
571
- "always_security_review": false, // default: false (only on security-sensitive files)
572
- "security_globs": [ // default patterns:
573
- "**/*auth*", "**/*crypto*",
574
- "**/*session*", "**/*token*",
575
- "**/*middleware*", "**/*api*",
576
- "**/*security*"
577
- ]
578
- }
579
- }
428
+ ```bash
429
+ bun test
580
430
  ```
581
431
 
582
- Set `always_security_review: true` to run the security pass on every task, regardless of file path.
432
+ Zero additional test dependencies. Uses Bun's built-in test runner.
583
433
 
584
- ### Integration Analysis
434
+ ---
585
435
 
586
- Control whether contract change detection triggers impact analysis:
436
+ ## Roadmap
587
437
 
588
- ```jsonc
589
- {
590
- "integration_analysis": {
591
- "enabled": true // default: true
592
- }
593
- }
594
- ```
438
+ ### v6.3 — Pre-Reviewer Pipeline
595
439
 
596
- > **Architect is exempt/unlimited by default:** The architect agent has no guardrail limits by default. To override, add a `profiles.architect` entry in your guardrails config.
440
+ Three new tools complete the pre-reviewer gauntlet. Code reaching the Reviewer is already clean.
597
441
 
598
- ### Per-Invocation Budgets
442
+ - **`imports`** — AST-based import graph. For each file changed by the coder, returns every consumer file, which exports each consumer uses, and the line numbers. Replaces fragile grep-based integration analysis with deterministic graph traversal.
443
+ - **`lint`** — Auto-detects project linter (Biome, ESLint, Ruff, Clippy, PSScriptAnalyzer). Runs in fix mode first, then check mode. Structured diagnostic output per file.
444
+ - **`secretscan`** — Entropy-based credential scanner. Detects API keys, tokens, connection strings, and private key headers in the diff before they reach the reviewer. Zero external dependencies.
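To make "entropy-based" concrete, here is a minimal TypeScript sketch of the general technique: compute the Shannon entropy of long token-like strings and flag those that look random. The function names, the 20-character minimum, and the 4.0 bits-per-character threshold are assumptions chosen for illustration; the actual `secretscan` tool may use different rules and additional pattern checks.

```typescript
// Shannon entropy of a string, in bits per character.
function shannonEntropy(text: string): number {
  const chars = [...text];
  if (chars.length === 0) return 0;
  const counts = new Map<string, number>();
  for (const ch of chars) counts.set(ch, (counts.get(ch) ?? 0) + 1);
  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Hypothetical thresholds, picked for this example only.
const MIN_LENGTH = 20;
const MIN_ENTROPY = 4.0;

// Return token-like substrings of a line that are long and high-entropy.
function findSuspectTokens(line: string): string[] {
  const tokenPattern = new RegExp(`[A-Za-z0-9+/_=-]{${MIN_LENGTH},}`, "g");
  const tokens = line.match(tokenPattern) ?? [];
  return tokens.filter((token) => shannonEntropy(token) >= MIN_ENTROPY);
}

const suspicious = findSuspectTokens('const apiKey = "aB3dE5gH7jK9mN1pQ2rS";');
const benign = findSuspectTokens('const padding = "aaaaaaaaaaaaaaaaaaaaaaaa";');
console.log({ suspicious, benign });
```

A scanner built this way would typically run only over added lines in the diff, and would pair the entropy check with known-format patterns (for example private key headers) to reduce false positives.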
599
445
 
600
- Guardrail limits are enforced **per-invocation**, not per-session. Each time the architect delegates to an agent, that agent gets a fresh budget of tool calls, duration, and error tolerance.
446
+ Phase 5 execute loop becomes: `coder → diff → imports → lint fix → lint check → secretscan → reviewer → security reviewer → test_engineer → adversarial test_engineer`.
601
447
 
602
- **Example**: If `max_tool_calls: 200`, then:
603
- - Architect → Coder (task 1) → 200 calls available
604
- - Coder finishes → Architect → Coder (task 2) → 200 calls available again
448
+ ### v6.4 — Execution and Planning Tools
605
449
 
606
- This prevents long-running projects from accumulating session-wide counters that incorrectly trip the circuit breaker on later tasks.
450
+ - **`test_runner`** — Unified test execution across Bun, Vitest, Jest, Mocha, pytest, cargo test, and Pester. Auto-detects framework, returns normalized JSON with pass/fail/skip counts and coverage. Three scope modes: `all`, `convention` (naming-based), `graph` (import-graph-based). Eliminates the test_engineer's most common failure mode.
451
+ - **`symbols`** — Export inventory for a module: functions, classes, interfaces, types, enums. Gives the Architect instant visibility into a file's public API surface without reading the full source.
452
+ - **`checkpoint`** — Git-backed save points. Before any multi-file refactor (≥3 files), Architect auto-creates a checkpoint commit. On critical integration failure, restores via soft reset instead of iterating into a hole.
607
453
 
608
- > **Architect is unlimited**: The architect never creates invocation windows and has no guardrail limits by default.
454
+ ### v6.5 — Intelligence and Audit Tools
609
455
 
610
- ### Disable Guardrails
456
+ Five tools that improve planning quality and post-phase validation:
611
457
 
612
- ```json
613
- {
614
- "guardrails": {
615
- "enabled": false
616
- }
617
- }
618
- ```
619
-
620
- ---
621
-
622
- ## Comparison
623
-
624
- | Feature | OpenCode Swarm | AutoGen | CrewAI | LangGraph |
625
- |---------|---------------|---------|--------|-----------|
626
- | Execution | Serial (predictable) | Parallel (chaotic) | Parallel | Configurable |
627
- | Planning | Phased with acceptance criteria | Ad-hoc | Role-based | Graph-based |
628
- | Memory | Persistent `.swarm/` files | Session only | Session only | Checkpoints |
629
- | QA | Dual-pass per-task (review + security + adversarial) | Optional | Optional | Manual |
630
- | Model mixing | Per-agent configuration | Limited | Limited | Manual |
631
- | Resume projects | ✅ Native | ❌ | ❌ | Partial |
632
- | SME domains | Open-domain (any) | Generic | Generic | Generic |
633
- | Task granularity | One at a time | Batched | Batched | Varies |
458
+ - **`pkg_audit`** — Wraps `npm audit`, `pip-audit`, `cargo audit`. Structured CVE output with severity, patched versions, and advisory URLs. Fed to the security reviewer for concrete vulnerability context.
459
+ - **`complexity_hotspots`** — Git churn × cyclomatic complexity risk map. Run in Phase 0/2 to identify modules that need stricter QA gates before implementation begins.
460
+ - **`schema_drift`** — Compares OpenAPI spec against actual route implementations. Surfaces undocumented routes and phantom spec paths. Run in Phase 6 when API routes were modified.
461
+ - **`todo_extract`** — Structured extraction of `TODO`, `FIXME`, and `HACK` annotations across the codebase. High-priority items fed directly into plan task candidates.
462
+ - **`evidence_check`** — Audits completed tasks against required evidence types. Run in Phase 6 to verify every task has review and test evidence before the phase is marked complete.
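As an example of the kind of scoring `complexity_hotspots` describes, the sketch below ranks files by churn multiplied by cyclomatic complexity. It is a hedged illustration only: the `FileMetrics` shape, the `rankHotspots` function, and the sample numbers are hypothetical, and a real tool would derive churn from git history and complexity from an AST pass.

```typescript
// Hypothetical per-file metrics; values here are made up for the example.
interface FileMetrics {
  path: string;
  churn: number;       // e.g. number of commits touching the file
  complexity: number;  // e.g. summed cyclomatic complexity
}

interface Hotspot extends FileMetrics {
  risk: number;
}

// Score each file as churn x complexity and return the riskiest first.
function rankHotspots(files: FileMetrics[], top = 3): Hotspot[] {
  return files
    .map((file) => ({ ...file, risk: file.churn * file.complexity }))
    .sort((a, b) => b.risk - a.risk)
    .slice(0, top);
}

const hotspots = rankHotspots(
  [
    { path: "src/config/loader.ts", churn: 10, complexity: 5 },
    { path: "src/hooks/guardrails.ts", churn: 2, complexity: 40 },
    { path: "src/utils/strings.ts", churn: 1, complexity: 1 },
  ],
  2,
);
console.log(hotspots.map((h) => `${h.path}: ${h.risk}`));
```

The product form means a file needs both frequent change and non-trivial logic to rank highly, which matches the intent of flagging modules for stricter QA gates.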
634
463
 
635
464
  ---
636
465
 
637
466
  ## Design Principles
638
467
 
639
- 1. **Plan before code** - Documented phases with acceptance criteria
640
- 2. **One task at a time** - Focused work, quality output
641
- 3. **Review everything immediately** - Dual-pass review (correctness + security) with adversarial testing per task
642
- 4. **Cache SME knowledge** - Don't re-ask answered questions
643
- 5. **Persistent memory** - `.swarm/` files survive sessions
644
- 6. **Serial execution** - Predictable, debuggable, no race conditions
645
- 7. **Heterogeneous models** - Different perspectives catch different bugs
646
- 8. **User checkpoints** - Confirm before proceeding to next phase
647
- 9. **Failure tracking** - Document rejections, escalate after 5 attempts
648
- 10. **Resumable by design** - Any Architect can pick up any project
649
-
650
- ---
651
-
652
- ## Testing
653
-
654
- ```bash
655
- # Run all tests
656
- bun test
657
-
658
- # Run specific test file
659
- bun test tests/unit/config/schema.test.ts
660
- ```
661
-
662
- 1280 tests across 57+ files covering config, tools, agents, hooks, commands, state, guardrails, evidence, plan schemas, circuit breaker race conditions, invocation windows, multi-invocation isolation, security categories, review/integration schemas, and diff tool. Uses Bun's built-in test runner — zero additional test dependencies.
663
-
664
- ## Troubleshooting
665
-
666
- ### Plugin not loading
667
- 1. Verify `opencode-swarm` is listed in your `opencode.json` plugins array
668
- 2. Run `bunx opencode-swarm install` to auto-configure
669
- 3. Run `/swarm diagnose` to check health status
670
-
671
- ### Commands not working
672
- - Ensure you're using `/swarm <command>`, not `/swarm/<command>`
673
- - Run `/swarm` with no arguments to see available commands
674
-
675
- ### Resuming a project
676
- - Swarm automatically detects `.swarm/plan.md` and resumes where you left off
677
- - If you get unexpected behavior, run `/swarm export` to backup, then `/swarm reset --confirm` to start fresh
468
+ 1. **Plan before code** — Documented phases with acceptance criteria. The Critic approves the plan before a single line is written.
469
+ 2. **One task at a time** — The Coder gets one task and full context. Nothing else.
470
+ 3. **Review everything immediately** — Every task goes through correctness review, security review, verification tests, and adversarial tests. No task ships without passing all four.
471
+ 4. **Cache SME knowledge** — Guidance is written to `context.md`. The same domain question is never asked twice in a project.
472
+ 5. **Persistent memory** — `.swarm/` files are the ground truth. Any session, any model, any day.
473
+ 6. **Serial execution** — Predictable, debuggable, no race conditions, no conflicting writes.
474
+ 7. **Heterogeneous models** — Different models, different blind spots. The coder's bug is the reviewer's catch.
475
+ 8. **User checkpoints** — Phase transitions require user confirmation. No unsupervised multi-phase runs.
476
+ 9. **Document failures** — Rejections and retries are recorded in `plan.md`. After 5 failed attempts, the task escalates to the user.
477
+ 10. **Resumable by design** — A cold-start Architect can read `.swarm/` and continue any project as if it had been there from the beginning.
678
478
 
679
479
  ---
680
480
 
@@ -693,5 +493,5 @@ MIT
693
493
  ---
694
494
 
695
495
  <p align="center">
696
- <strong>Stop hoping your agents figure it out. Start shipping code that works.</strong>
496
+ <strong>Stop hoping your agents figure it out. Start shipping code that actually works.</strong>
697
497
  </p>