opencode-swarm 6.2.0 → 6.3.0

package/README.md CHANGED
@@ -1,5 +1,5 @@
1
1
  <p align="center">
2
- <img src="https://img.shields.io/badge/version-6.2.0-blue" alt="Version">
2
+ <img src="https://img.shields.io/badge/version-6.3.0-blue" alt="Version">
3
3
  <img src="https://img.shields.io/badge/license-MIT-green" alt="License">
4
4
  <img src="https://img.shields.io/badge/opencode-plugin-purple" alt="OpenCode Plugin">
5
5
  <img src="https://img.shields.io/badge/agents-9-orange" alt="Agents">
@@ -9,185 +9,191 @@
9
9
  <h1 align="center">🐝 OpenCode Swarm</h1>
10
10
 
11
11
  <p align="center">
12
- <strong>The only multi-agent framework that actually works.</strong><br>
13
- Structured phases. Persistent memory. One task at a time. QA on everything.
12
+ <strong>A structured multi-agent coding framework for OpenCode.</strong><br>
13
+ Nine specialized agents. Persistent memory. A QA gate on every task. Code that ships.
14
14
  </p>
15
15
 
16
16
  <p align="center">
17
- <a href="#why-swarm">Why Swarm?</a> •
17
+ <a href="#the-problem">The Problem</a> •
18
18
  <a href="#how-it-works">How It Works</a> •
19
- <a href="#installation">Installation</a> •
20
19
  <a href="#agents">Agents</a> •
21
- <a href="#configuration">Configuration</a>
20
+ <a href="#persistent-memory">Memory</a> •
21
+ <a href="#guardrails">Guardrails</a> •
22
+ <a href="#comparison">Comparison</a> •
23
+ <a href="#installation">Installation</a> •
24
+ <a href="#roadmap">Roadmap</a>
22
25
  </p>
23
26
 
24
27
  ---
25
28
 
26
- ## The Problem with Every Other Multi-Agent System
27
-
28
- ```
29
- You: "Build me an authentication system"
30
-
31
- Other Frameworks:
32
- ├── Agent 1 starts auth module...
33
- ├── Agent 2 starts user model... (conflicts with Agent 1)
34
- ├── Agent 3 starts database... (wrong schema)
35
- ├── Agent 4 starts tests... (for code that doesn't exist yet)
36
- └── Result: Chaos. Conflicts. Context lost. Start over.
37
-
38
- OpenCode Swarm:
39
- ├── Architect analyzes request
40
- ├── Explorer scans codebase (+ gap analysis)
41
- ├── @sme consulted on security domain
42
- ├── Architect creates phased plan with acceptance criteria
43
- ├── @critic reviews plan → APPROVED
44
- ├── Phase 1: User model → Review → Tests (run + PASS) → ✓
45
- ├── Phase 2: Auth logic → Review → Tests (run + PASS) → ✓
46
- ├── Phase 3: Session management → Review → Tests (run + PASS) → ✓
47
- └── Result: Working code. Documented decisions. Resumable progress.
48
- ```
49
-
50
- ---
51
-
52
- ## Why Swarm?
53
-
54
- <table>
55
- <tr>
56
- <td width="50%">
29
+ ## The Problem
57
30
 
58
- ### Other Frameworks
31
+ Every multi-agent AI coding tool on the market has the same failure mode: they are vibes-driven. You describe a feature. Agents spawn. They race each other to write conflicting code, lose context after 20 messages, hit token limits mid-task, and produce something that sort-of-works until it doesn't. There's no plan. There's no memory. There's no gatekeeper. There's no test that was actually run.
59
32
 
60
- - Parallel chaos, hope it converges
61
- - Single model = correlated failures
62
- - No planning, just vibes
63
- - Context lost between sessions
64
- - QA as afterthought (if at all)
65
- - Entire codebase in one prompt
66
- - No way to resume projects
33
+ **oh-my-opencode** is a prompt collection. **get-shit-done** is a workflow macro. Neither is a framework with memory, QA enforcement, or the ability to resume a project a week later exactly where you left off.
67
34
 
68
- </td>
69
- <td width="50%">
35
+ OpenCode Swarm is built differently.
70
36
 
71
- ### ✅ OpenCode Swarm
72
-
73
- - **Serial execution** - predictable, traceable
74
- - **Heterogeneous models** - different perspectives catch errors
75
- - **Phased planning** - documented tasks with acceptance criteria
76
- - **Persistent memory** - `.swarm/` files survive sessions
77
- - **Review per task** - correctness + security review before anything ships
78
- - **One task at a time** - focused, quality code
79
- - **Resumable projects** - pick up exactly where you left off
37
+ ```
38
+ Every other framework:
39
+ ├── Agent 1 starts the auth module...
40
+ ├── Agent 2 starts the user model... (conflicts with Agent 1)
41
+ ├── Agent 3 writes tests... (for code that doesn't exist yet)
42
+ ├── Context window fills up and the whole thing drifts
43
+ └── Result: chaos. Rework. Start over.
80
44
 
81
- </td>
82
- </tr>
83
- </table>
45
+ OpenCode Swarm:
46
+ ├── Architect reads .swarm/plan.md → project already in progress, resumes Phase 2
47
+ ├── @explorer scans the codebase for current state
48
+ ├── @sme DOMAIN: security → consults on auth patterns, guidance cached
49
+ ├── Architect writes .swarm/plan.md: 3 phases, 9 tasks, acceptance criteria per task
50
+ ├── @critic reviews the plan → APPROVED
51
+ ├── @coder implements Task 2.2 (one task, full context, nothing else)
52
+ ├── diff tool → imports tool → lint fix → secretscan → @reviewer → @test_engineer
53
+ ├── All gates pass → plan.md updated → Task 2.2: [x]
54
+ └── Result: working code, documented decisions, resumable project, evidence trail
55
+ ```
84
56
 
85
57
  ---
86
58
 
87
59
  ## How It Works
88
60
 
61
+ ### The Execution Pipeline
62
+
89
63
  ```
90
- ┌─────────────────────────────────────────────────────────────────────────┐
91
- │ USER: "Add user authentication with JWT" │
92
- └─────────────────────────────────────────────────────────────────────────┘
64
+ ┌──────────────────────────────────────────────────────────────────────────┐
65
+ │ Phase 0: Resume Check │
66
+ │ .swarm/plan.md exists? Resume mid-task. New project? Continue. │
67
+ └──────────────────────────────────────────────────────────────────────────┘
93
68
 
94
69
 
95
- ┌─────────────────────────────────────────────────────────────────────────┐
96
- │ PHASE 0: Check for .swarm/plan.md │
97
- │ Exists? Resume. New? Continue. │
98
- └─────────────────────────────────────────────────────────────────────────┘
70
+ ┌──────────────────────────────────────────────────────────────────────────┐
71
+ │ Phase 1: Clarify │
72
+ │ Ask only what the Architect cannot infer. Then stop. │
73
+ └──────────────────────────────────────────────────────────────────────────┘
99
74
 
100
75
 
101
- ┌─────────────────────────────────────────────────────────────────────────┐
102
- │ PHASE 1: Clarify (if needed) │
103
- │ "Do you need refresh tokens? What's the session duration?" │
104
- └─────────────────────────────────────────────────────────────────────────┘
76
+ ┌──────────────────────────────────────────────────────────────────────────┐
77
+ │ Phase 2: Discover │
78
+ │ @explorer scans codebase structure, languages, frameworks, key files │
79
+ └──────────────────────────────────────────────────────────────────────────┘
105
80
 
106
81
 
107
- ┌─────────────────────────────────────────────────────────────────────────┐
108
- │ PHASE 2: Discover │
109
- │ @explorer scans codebase structure, languages, patterns │
110
- └─────────────────────────────────────────────────────────────────────────┘
82
+ ┌──────────────────────────────────────────────────────────────────────────┐
83
+ │ Phase 3: SME Consult (serial, cached) │
84
+ │ @sme DOMAIN: security, @sme DOMAIN: api, ... │
85
+ │ Guidance written to .swarm/context.md — never re-asked in future phases │
86
+ └──────────────────────────────────────────────────────────────────────────┘
111
87
 
112
88
 
113
- ┌─────────────────────────────────────────────────────────────────────────┐
114
- │ PHASE 3: Consult SMEs (serial, cached) │
115
- │ @sme DOMAIN: security → auth best practices │
116
- │ @sme DOMAIN: api → JWT patterns, refresh flow │
117
- │ Guidance saved to .swarm/context.md │
118
- └─────────────────────────────────────────────────────────────────────────┘
89
+ ┌──────────────────────────────────────────────────────────────────────────┐
90
+ │ Phase 4: Plan │
91
+ │ Architect writes .swarm/plan.md │
92
+ │ Structured phases, tasks with SMALL/MEDIUM/LARGE sizing, acceptance │
93
+ │ criteria per task, explicit dependency graph │
94
+ └──────────────────────────────────────────────────────────────────────────┘
119
95
 
120
96
 
121
- ┌─────────────────────────────────────────────────────────────────────────┐
122
- │ PHASE 4: Plan │
123
- │ Creates .swarm/plan.md with phases, tasks, acceptance criteria │
124
- │ │
125
- │ Phase 1: Foundation [3 tasks] │
126
- │ Phase 2: Core Auth [4 tasks] │
127
- │ Phase 3: Session Management [3 tasks] │
128
- └─────────────────────────────────────────────────────────────────────────┘
97
+ ┌──────────────────────────────────────────────────────────────────────────┐
98
+ │ Phase 4.5: Critic Gate │
99
+ │ @critic reviews plan → APPROVED / NEEDS_REVISION / REJECTED │
100
+ │ Max 2 revision cycles. Escalates to user if unresolved. │
101
+ └──────────────────────────────────────────────────────────────────────────┘
129
102
 
130
103
 
131
- ┌─────────────────────────────────────────────────────────────────────────┐
132
- │ PHASE 4.5: Critic Gate │
133
- │ @critic reviews plan → APPROVED / NEEDS_REVISION / REJECTED │
134
- │ Max 2 revision cycles before escalating to user │
135
- └─────────────────────────────────────────────────────────────────────────┘
104
+ ┌──────────────────────────────────────────────────────────────────────────┐
105
+ │ Phase 5: Execute (per task) │
106
+ │ │
107
+ │ [UI task?] @designer scaffold first │
108
+ │ │
109
+ │ @coder (one task, full context) │
110
+ │ ↓ │
111
+ │ diff tool → imports tool → lint fix → lint check → secretscan │
112
+ │ (contract change detection) (AST-based) (auto-fix) (entropy scan) │
113
+ │ ↓ │
114
+ │ @reviewer (correctness pass) │
115
+ │ ↓ APPROVED │
116
+ │ @reviewer (security-only pass, if file matches security globs) │
117
+ │ ↓ APPROVED │
118
+ │ @test_engineer (verification tests + coverage gate ≥70%) │
119
+ │ ↓ PASS │
120
+ │ @test_engineer (adversarial tests — boundary violations, injections) │
121
+ │ ↓ PASS │
122
+ │ plan.md → [x] Task complete │
123
+ │ │
124
+ │ Any gate fails → back to @coder with structured rejection reason │
125
+ └──────────────────────────────────────────────────────────────────────────┘
136
126
 
137
127
 
138
- ┌─────────────────────────────────────────────────────────────────────────┐
139
- │ PHASE 5: Execute (per task) │
140
- │ │
141
- ┌─────────┐ ┌───────┐ ┌────────────┐ ┌──────────────┐
142
- │ @coder → │ diff │ → │ @reviewer │ @test │ │
143
- │ │ 1 task │ │ tool │ │ check all │ │ write + run │ │
144
- │ └─────────┘ └───────┘ └────────────┘ └──────────────┘ │
145
- │ │ │ │ │ │
146
- │ │ Contract │ If REJECTED: If FAIL: fix │
147
- │ │ changes? │ retry from coder + retest │
148
- │ │ │ │ │ │
149
- │ │ ▼ │ ▼ │
150
- │ │ ┌─────────┐ │ ┌──────────────┐ ┌──────────────┐ │
151
- │ │ │@explorer│ │ │ @reviewer │ → │ @test │ │
152
- │ │ │ impact │ │ │ security-only│ │ adversarial │ │
153
- │ │ │analysis │ │ │ (if match) │ │ (attacks) │ │
154
- │ │ └─────────┘ │ └──────────────┘ └──────────────┘ │
155
- │ │ │ │
156
- │ └───────────────┘ │
157
- │ │
158
- │ Update plan.md: [x] Task complete (only after ALL gates pass) │
159
- │ Next task... │
160
- └─────────────────────────────────────────────────────────────────────────┘
161
-
162
-
163
- ┌─────────────────────────────────────────────────────────────────────────┐
164
- │ PHASE 6: Phase Complete │
165
- │ Re-scan with @explorer │
166
- │ Update context.md with learnings │
167
- │ Archive to .swarm/history/ │
168
- │ "Phase 1 complete. Ready for Phase 2?" │
169
- └─────────────────────────────────────────────────────────────────────────┘
128
+ ┌──────────────────────────────────────────────────────────────────────────┐
129
+ │ Phase 6: Phase Complete │
130
+ │ @explorer rescans. @docs updates documentation. Retrospective written. │
131
+ │ Learnings injected as [SWARM RETROSPECTIVE] into next phase. │
132
+ │ "Phase 1 complete (4 tasks, 0 rejections). Ready for Phase 2?" │
133
+ └──────────────────────────────────────────────────────────────────────────┘
170
134
  ```
171
135
 
136
+ ### Why Serial Execution Matters
137
+
138
+ Multi-agent parallelism sounds fast. In practice, it is a race to produce conflicting, unreviewed code that requires a human to untangle. OpenCode Swarm runs one task at a time through a deterministic pipeline. Every task is reviewed. Every test is run. Every failure is documented and fed back to the coder with structured context. The tradeoff in raw speed is paid back in not redoing work.
139
+
172
140
  ---
173
141
 
174
- ## Persistent Project Memory
142
+ ## Agents
143
+
144
+ ### 🎯 Orchestrator
145
+
146
+ **`architect`** — The central coordinator. Owns the plan, delegates all work, enforces every QA gate, maintains project memory, and resumes projects across sessions. Every other agent works for the Architect.
147
+
148
+ ### 🔍 Discovery
149
+
150
+ **`explorer`** — Fast codebase scanner. Identifies structure, languages, frameworks, key files, and import patterns. Runs before planning and after every phase completes.
151
+
152
+ ### 🧠 Domain Expert
153
+
154
+ **`sme`** — Open-domain expert. The Architect specifies any domain per call: `security`, `python`, `rust`, `kubernetes`, `ios`, `ml`, `blockchain` — any domain the underlying model has knowledge of. No hardcoded list. Guidance is cached in `.swarm/context.md` so the same question is never asked twice.
155
+
156
+ ### 🎨 Design
157
+
158
+ **`designer`** — UI/UX specification agent. Opt-in via config. Generates component scaffolds and design tokens before the coder touches UI tasks, eliminating the most common source of front-end rework.
175
159
 
176
- Other frameworks lose everything when the session ends. Swarm doesn't.
160
+ ### 💻 Implementation
161
+
162
+ **`coder`** — Implements exactly one task with full context. No multitasking. No context bleed from prior tasks. The coder receives: the task spec, acceptance criteria, SME guidance, and relevant context from `.swarm/context.md`. Nothing else.
163
+
164
+ **`test_engineer`** — Generates tests, runs them, and returns structured `PASS/FAIL` verdicts with coverage percentages. Runs twice per task: once for verification, once for adversarial attack scenarios.
165
+
166
+ ### ✅ Quality Assurance
167
+
168
+ **`reviewer`** — Dual-pass review. First pass: correctness, logic, maintainability. Second pass: security-only, scoped to OWASP Top 10 categories, triggered automatically when the modified files match security-sensitive path patterns. Both passes produce structured verdicts with specific rejection reasons.
169
+
170
+ **`critic`** — Plan review gate. Reviews the Architect's plan *before implementation begins*. Checks for completeness, feasibility, scope creep, missing dependencies, and AI-slop hallucinations. Plans do not proceed without Critic approval.
171
+
172
+ ### 📝 Documentation
173
+
174
+ **`docs`** — Documentation synthesizer. Runs in Phase 6 with a diff of changed files. Updates READMEs, API documentation, and guides to reflect what was actually built, not what was planned.
175
+
176
+ ---
177
+
178
+ ## Persistent Memory
179
+
180
+ Other frameworks lose everything when the session ends. Swarm stores project state on disk.
177
181
 
178
182
  ```
179
183
  .swarm/
180
- ├── plan.md # Your project roadmap (+ plan.json)
181
- ├── context.md # Everything a new Architect needs
182
- ├── evidence/ # Per-task execution evidence
183
- │ ├── 1.1/ # Evidence for task 1.1
184
- │ └── 2.3/ # Evidence for task 2.3
184
+ ├── plan.md # Living roadmap: phases, tasks, status, rejections, blockers
185
+ ├── plan.json # Machine-readable plan for tooling
186
+ ├── context.md # Institutional knowledge: decisions, SME guidance, patterns
187
+ ├── evidence/ # Per-task execution evidence bundles
188
+ │ ├── 1.1/ # review verdict, test results, diff summary for task 1.1
189
+ │ └── 2.3/
185
190
  └── history/
186
- ├── phase-1.md # What was done, what was learned
191
+ ├── phase-1.md # What was built, what was learned, retrospective metrics
187
192
  └── phase-2.md
188
193
  ```
189
194
 
190
- ### plan.md - Living Roadmap
195
+ ### plan.md: Living Roadmap
196
+
191
197
  ```markdown
192
198
  # Project: Auth System
193
199
  Current Phase: 2
@@ -200,281 +206,133 @@ Current Phase: 2
200
206
  ## Phase 2: Core Auth [IN PROGRESS]
201
207
  - [x] Task 2.1: Login endpoint [MEDIUM]
202
208
  - [ ] Task 2.2: JWT generation [MEDIUM] (depends: 2.1) ← CURRENT
203
- - Acceptance: Returns valid JWT with user claims
204
- - Attempt 1: REJECTED - Missing expiration
209
+ - Acceptance: Returns valid JWT with user claims, 15-minute expiry
210
+ - Attempt 1: REJECTED - missing expiration claim
205
211
  - [ ] Task 2.3: Token validation middleware [MEDIUM]
206
- - [BLOCKED] Task 2.4: Refresh tokens
207
- - Reason: Waiting for decision on rotation strategy
212
+ - [BLOCKED] Task 2.4: Refresh token rotation
213
+ - Reason: Awaiting decision on rotation strategy
208
214
  ```
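The task lines in the `plan.md` excerpt above follow a regular shape, so they can be read mechanically. The sketch below is illustrative only: `parseTaskLine`, `PlanTask`, and the regular expression are hypothetical names written against the sample lines shown here, not the plugin's actual parser (the README points to `plan.json` as the machine-readable form). Mapping the `← CURRENT` marker to an `in_progress` status is an assumption.

```typescript
// Hypothetical sketch: parses one task line in the style of the plan.md sample.
// None of these names come from opencode-swarm itself.
type TaskStatus = "pending" | "in_progress" | "completed" | "blocked";
type TaskSize = "SMALL" | "MEDIUM" | "LARGE";

interface PlanTask {
  id: string;
  title: string;
  status: TaskStatus;
  size: TaskSize | null;
  dependsOn: string[];
}

// Checkbox, task id, title, then optional size, dependency list, and CURRENT marker.
const TASK_LINE =
  /^- \[(x| |BLOCKED)\] Task (\d+\.\d+): (.+?)(?: \[(SMALL|MEDIUM|LARGE)\])?(?: \(depends: ([\d., ]+)\))?( ← CURRENT)?$/;

function parseTaskLine(line: string): PlanTask | null {
  const m = TASK_LINE.exec(line.trim());
  if (!m) return null; // not a task line (for example an "Acceptance:" sub-bullet)

  const box = m[1] ?? " ";
  const deps = m[5];
  const isCurrent = m[6] !== undefined;

  // Assumption: the CURRENT marker on an unchecked task means "in progress".
  const status: TaskStatus =
    box === "x" ? "completed"
    : box === "BLOCKED" ? "blocked"
    : isCurrent ? "in_progress"
    : "pending";

  return {
    id: m[2] ?? "",
    title: m[3] ?? "",
    status,
    size: (m[4] as TaskSize | undefined) ?? null,
    dependsOn: deps ? deps.split(",").map((d) => d.trim()) : [],
  };
}
```

Lines that are not task entries, such as the indented `Acceptance:` and `Reason:` sub-bullets, return `null`, so a caller can attach them to the preceding task however it prefers.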
209
215
 
210
- ### context.md - Institutional Knowledge
216
+ ### context.md: Institutional Knowledge
217
+
211
218
  ```markdown
212
219
  # Project Context: Auth System
213
220
 
214
221
  ## Technical Decisions
215
- - Using bcrypt (cost 12) for password hashing
216
- - JWT expires in 15 minutes, refresh in 7 days
217
- - Storing refresh tokens in Redis
222
+ - bcrypt cost factor: 12
223
+ - JWT TTL: 15 minutes; refresh TTL: 7 days
224
+ - Refresh token store: Redis with key prefix auth:refresh:
218
225
 
219
226
  ## SME Guidance Cache
220
- ### Security (Phase 1)
221
- - Never log tokens or passwords
222
- - Use constant-time comparison for tokens
223
- - Implement rate limiting on login
227
+ ### security (Phase 1)
228
+ - Never log tokens or passwords in any context
229
+ - Use constant-time comparison for all token equality checks
230
+ - Rate-limit login endpoint: 5 attempts / 15 minutes per IP
224
231
 
225
- ### API (Phase 1)
226
- - Return 401 for invalid credentials (not 404)
227
- - Include token expiry in response body
232
+ ### api (Phase 1)
233
+ - Return HTTP 401 for invalid credentials (not 404)
234
+ - Include token expiry timestamp in response body
228
235
 
229
236
  ## Patterns Established
230
- - Error handling: Custom ApiError class with status codes
231
- - Validation: Zod schemas in /validators/
237
+ - Error handling: custom ApiError class with HTTP status and error code
238
+ - Validation: Zod schemas in /validators/, applied at request boundary
232
239
  ```
233
240
 
234
- **Start a new session tomorrow?** The Architect reads these files and picks up exactly where you left off.
241
+ Start a new session tomorrow. The Architect reads these files and picks up exactly where you left off — no re-explaining, no rediscovery, no drift.
235
242
 
236
- ### Evidence Types
243
+ ### Evidence Bundles
237
244
 
238
- Each task in `.swarm/evidence/` contains structured evidence bundles with typed entries:
245
+ Each completed task writes structured evidence to `.swarm/evidence/`:
239
246
 
240
- | Type | Purpose | Key Fields |
241
- |------|---------|------------|
242
- | `review` | Code review verdict | `verdict`, `risk`, `issues[]` |
243
- | `test` | Test run results | `verdict`, `tests_passed`, `tests_failed`, `coverage` |
244
- | `diff` | Git diff summary | `files_changed[]`, `additions`, `deletions` |
245
- | `approval` | Stakeholder sign-off | `approver`, `notes` |
246
- | `note` | General observations | `content` |
247
- | `retrospective` | Phase metrics & lessons | `phase_number`, `total_tool_calls`, `coder_revisions`, `reviewer_rejections`, `test_failures`, `security_findings`, `task_count`, `task_complexity`, `top_rejection_reasons[]`, `lessons_learned[]` |
247
+ | Type | What It Captures |
248
+ |------|-----------------|
249
+ | `review` | Verdict (APPROVED/REJECTED), risk level, specific issues |
250
+ | `test` | Pass/fail counts, coverage percentage, failure messages |
251
+ | `diff` | Files changed, additions/deletions, contract change flags |
252
+ | `approval` | Stakeholder sign-off with notes |
253
+ | `retrospective` | Phase metrics: total tool calls, coder revisions, reviewer rejections, test failures, security findings, lessons learned |
248
254
 
249
- **Retrospective Evidence** (v6.2.0+): After each phase completes, the architect writes a retrospective evidence bundle capturing what went well and what didn't. The system enhancer injects the most recent retrospective as a `[SWARM RETROSPECTIVE]` hint at the start of the next phase, enabling continuous improvement across phases.
255
+ Retrospectives from completed phases are injected as `[SWARM RETROSPECTIVE]` hints at the start of subsequent phases. The framework learns from its own history within a project.
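One way to picture the evidence entries is as a tagged union keyed on `type`. The sketch below is an assumption-laden illustration, not the plugin's schema: the field names are taken from the 6.2.0 "Key Fields" column removed in this diff, while the value types, the `gatesPassed` helper, and the rule of looking only at the latest review and test entries are hypothetical. The 70% default mirrors the coverage gate shown in the Phase 5 pipeline.

```typescript
// Hypothetical shapes: field names follow the 6.2.0 README table,
// value types are guesses made for illustration.
type ReviewEvidence = {
  type: "review";
  verdict: "APPROVED" | "REJECTED";
  risk: string;
  issues: string[];
};

type TestEvidence = {
  type: "test";
  verdict: "PASS" | "FAIL";
  tests_passed: number;
  tests_failed: number;
  coverage: number; // percentage, 0 to 100
};

type DiffEvidence = {
  type: "diff";
  files_changed: string[];
  additions: number;
  deletions: number;
};

type ApprovalEvidence = { type: "approval"; approver: string; notes: string };

type Evidence = ReviewEvidence | TestEvidence | DiffEvidence | ApprovalEvidence;

// Assumed rule: a task counts as done when its most recent review is APPROVED
// and its most recent test run is PASS with coverage at or above the gate.
function gatesPassed(bundle: Evidence[], minCoverage = 70): boolean {
  let lastReview: ReviewEvidence | undefined;
  let lastTest: TestEvidence | undefined;
  for (const entry of bundle) {
    if (entry.type === "review") lastReview = entry;
    else if (entry.type === "test") lastTest = entry;
  }
  if (!lastReview || !lastTest) return false;
  return (
    lastReview.verdict === "APPROVED" &&
    lastTest.verdict === "PASS" &&
    lastTest.coverage >= minCoverage
  );
}
```

Under this reading, an earlier rejected review stays in the bundle as part of the trail, and only the latest verdict decides whether the task can be marked `[x]`.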
250
256
 
251
257
  ---
252
258
 
253
- ## Heterogeneous Models = Better Code
254
-
255
- Most frameworks use one model for everything. Same blindspots everywhere.
259
+ ## Heterogeneous Models
256
260
 
257
- Swarm lets you mix models strategically:
261
+ Single-model frameworks have correlated failure modes. The same model that writes the bug reviews it and misses it. Swarm lets you route each agent to the model it is best suited for:
258
262
 
259
263
  ```json
260
264
  {
261
265
  "agents": {
262
- "architect": { "model": "anthropic/claude-sonnet-4-5" },
263
- "explorer": { "model": "google/gemini-2.0-flash" },
264
- "coder": { "model": "anthropic/claude-sonnet-4-5" },
265
- "sme": { "model": "google/gemini-2.0-flash" },
266
- "reviewer": { "model": "openai/gpt-4o" },
267
- "critic": { "model": "google/gemini-2.0-flash" },
268
- "test_engineer": { "model": "google/gemini-2.0-flash" }
266
+ "architect": { "model": "anthropic/claude-opus-4-6" },
267
+ "coder": { "model": "minimax-coding-plan/MiniMax-M2.5" },
268
+ "explorer": { "model": "minimax-coding-plan/MiniMax-M2.1" },
269
+ "sme": { "model": "kimi-for-coding/k2p5" },
270
+ "critic": { "model": "zai-coding-plan/glm-5" },
271
+ "reviewer": { "model": "zai-coding-plan/glm-5" },
272
+ "test_engineer": { "model": "minimax-coding-plan/MiniMax-M2.5" },
273
+ "docs": { "model": "zai-coding-plan/glm-4.7-flash" },
274
+ "designer": { "model": "kimi-for-coding/k2p5" }
269
275
  }
270
276
  }
271
277
  ```
272
278
 
273
- | Role | Optimized For | Why Different Models? |
274
- |------|---------------|----------------------|
275
- | Architect | Deep reasoning | Needs to plan complex work |
276
- | Explorer | Fast scanning | Speed over depth |
277
- | Coder | Implementation | Best coding model you have |
278
- | SME | Domain knowledge | Fast recall, not deep reasoning |
279
- | Reviewer | Finding flaws | **Different vendor catches different bugs** |
280
- | Critic | Plan review | Catches scope issues before any code is written |
281
- | Test Engineer | Test + run | Writes tests, runs them, reports PASS/FAIL |
282
-
283
- **If Claude writes code and GPT reviews it, GPT catches Claude's blindspots.** This is why real teams have code review.
279
+ Reviewer uses a different model than Coder by design. Different training, different priors, different blind spots. This is the cheapest bug-catcher you will ever deploy.
284
280
 
285
281
  ---
286
282
 
287
- ## Multiple Swarms
283
+ ## Guardrails
284
+
285
+ Every subagent runs inside a circuit breaker that kills runaway behavior before it burns credits on a stuck loop.
286
+
287
+ | Layer | Trigger | Action |
288
+ |-------|---------|--------|
289
+ | ⚠️ Soft Warning | 50% of any limit reached | Warning injected into agent stream |
290
+ | 🛑 Hard Block | 100% of any limit reached | All further tool calls blocked |
288
291
 
289
- Run different model configurations simultaneously. Perfect for:
290
- - **Cloud vs Local**: Premium cloud models for critical work, local models for quick tasks
291
- - **Fast vs Quality**: Quick iterations with fast models, careful work with expensive ones
292
- - **Cost Tiers**: Cheap models for exploration, premium for implementation
292
+ | Signal | Default | Description |
293
+ |--------|---------|-------------|
294
+ | Tool calls | 200 | Per-invocation, not per-session |
295
+ | Duration | 30 min | Wall-clock time per delegation |
296
+ | Repetition | 10 | Same tool + args consecutively |
297
+ | Consecutive errors | 5 | Sequential null/undefined outputs |
293
298
 
294
- ### Configuration
299
+ Limits are enforced **per-invocation**. Each delegation to a subagent starts a fresh budget. A coder fixing a second task is not penalized for the first task's tool calls. The Architect is exempt from all limits by default.
295
300
 
296
- ```json
301
+ Per-agent profiles allow fine-grained overrides:
302
+
303
+ ```jsonc
297
304
  {
298
- "swarms": {
299
- "cloud": {
300
- "name": "Cloud",
301
- "agents": {
302
- "architect": { "model": "anthropic/claude-sonnet-4-5" },
303
- "coder": { "model": "anthropic/claude-sonnet-4-5" },
304
- "sme": { "model": "google/gemini-2.0-flash" },
305
- "reviewer": { "model": "openai/gpt-4o" }
306
- }
307
- },
308
- "local": {
309
- "name": "Local",
310
- "agents": {
311
- "architect": { "model": "ollama/qwen2.5:32b" },
312
- "coder": { "model": "ollama/qwen2.5:32b" },
313
- "sme": { "model": "ollama/qwen2.5:14b" },
314
- "reviewer": { "model": "ollama/qwen2.5:14b" }
315
- }
305
+ "guardrails": {
306
+ "max_tool_calls": 200,
307
+ "profiles": {
308
+ "coder": { "max_tool_calls": 500, "max_duration_minutes": 60 },
309
+ "explorer": { "max_tool_calls": 50 }
316
310
  }
317
311
  }
318
312
  }
319
313
  ```
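The two guardrail layers and the profile overrides can be sketched as a merge followed by a threshold check. This is a minimal illustration under stated assumptions: `resolveLimits` and `classify` are hypothetical helpers, not the plugin's `resolveGuardrailsConfig()`; the config keys and default values are the ones listed in the README, and the 0.5 warning threshold is read off the "50% of any limit" row.

```typescript
// Hypothetical sketch of limit resolution; not the plugin's implementation.
interface Limits {
  max_tool_calls: number;
  max_duration_minutes: number;
  max_repetitions: number;
  max_consecutive_errors: number;
}

interface GuardrailsConfig extends Partial<Limits> {
  warning_threshold?: number; // fraction of a limit that triggers the soft warning
  profiles?: Record<string, Partial<Limits>>;
}

// Defaults as listed in the README's signal table.
const DEFAULTS: Limits = {
  max_tool_calls: 200,
  max_duration_minutes: 30,
  max_repetitions: 10,
  max_consecutive_errors: 5,
};

// Precedence (assumed): per-agent profile, then base config, then defaults.
function resolveLimits(config: GuardrailsConfig, agent: string): Limits {
  const profile: Partial<Limits> = config.profiles?.[agent] ?? {};
  const pick = (key: keyof Limits): number =>
    profile[key] ?? config[key] ?? DEFAULTS[key];
  return {
    max_tool_calls: pick("max_tool_calls"),
    max_duration_minutes: pick("max_duration_minutes"),
    max_repetitions: pick("max_repetitions"),
    max_consecutive_errors: pick("max_consecutive_errors"),
  };
}

type Verdict = "ok" | "warn" | "block";

// Soft warning at the threshold fraction, hard block at 100% of the limit.
function classify(used: number, limit: number, warningThreshold = 0.5): Verdict {
  if (used >= limit) return "block";
  if (used >= limit * warningThreshold) return "warn";
  return "ok";
}

// The override example from the README, expressed as a typed object.
const readmeConfig: GuardrailsConfig = {
  max_tool_calls: 200,
  profiles: {
    coder: { max_tool_calls: 500, max_duration_minutes: 60 },
    explorer: { max_tool_calls: 50 },
  },
};

const explorerLimits = resolveLimits(readmeConfig, "explorer");
console.log(classify(25, explorerLimits.max_tool_calls)); // "warn": 25 is 50% of 50
```

Because limits are per-invocation, a caller following this sketch would reset the `used` counters each time a subagent is delegated to, rather than carrying them across tasks.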
320
314
 
321
- ### What Gets Created
322
-
323
- | Swarm | Agents |
324
- |-------|--------|
325
- | `cloud` (default) | `architect`, `explorer`, `coder`, `sme`, `reviewer`, `critic`, `test_engineer` |
326
- | `local` | `local_architect`, `local_explorer`, `local_coder`, `local_sme`, `local_reviewer`, `local_critic`, `local_test_engineer` |
327
-
328
- The first swarm (or one named "default") creates unprefixed agents. Additional swarms prefix all agent names.
329
-
330
- ### Usage
331
-
332
- In OpenCode, you'll see multiple architects to choose from:
333
- - `architect` - Cloud swarm (default)
334
- - `local_architect` - Local swarm
335
-
336
- Each architect automatically delegates to its own swarm's agents.
337
-
338
- ---
339
-
340
- ## Installation
341
-
342
- ```bash
343
- # Install via CLI (recommended)
344
- bunx opencode-swarm install
345
- ```
346
-
347
- ### Uninstall
348
-
349
- ```bash
350
- # Remove from opencode.json
351
- bunx opencode-swarm uninstall
352
-
353
- # Remove from opencode.json + clean up config files
354
- bunx opencode-swarm uninstall --clean
355
- ```
356
-
357
- ---
358
-
359
- ## What's New
360
-
361
- ### v6.2.0 — System Intelligence
362
- - **Retrospective evidence** — New evidence type that captures phase metrics (tool calls, revisions, rejections, test failures, security findings) and lessons learned. Architect writes it after each phase; system enhancer injects the most recent one as a `[SWARM RETROSPECTIVE]` hint for the next phase, enabling continuous improvement across phases.
363
- - **Soft compaction advisory** — System enhancer injects a `[SWARM HINT]` when the architect's tool-call count crosses configurable thresholds (default 50/75/100/125/150). A `lastCompactionHint` guard prevents re-injection at the same threshold. Configurable via `compaction_advisory` block.
364
- - **Coverage reporting** — Test engineer now reports line/branch/function coverage percentages and flags files below 70%. Architect uses this in Phase 5 step 5d to request additional test passes when coverage is insufficient.
365
- - **111 new tests** — 1391 total tests across 62+ files (up from 1280 in v6.1.2).
366
-
367
- ### v6.1.2 — Guardrails Remediation
368
- - **Fail-safe config validation** — Config validation failures now disable guardrails as a safety precaution (previously Zod defaults could silently re-enable them).
369
- - **Architect exemption fix** — Architect/orchestrator sessions can no longer inherit 30-minute base limits during delegation race conditions.
370
- - **Explicit disable always wins** — `guardrails.enabled: false` in config is now always honored, even when the config was loaded from file.
371
- - **Internal map synchronization** — `startAgentSession()` now keeps `activeAgent` and `agentSessions` maps in sync for consistent state tracking.
372
-
373
- ### v6.1.1 — Security Fix & Tech Debt
374
- - **Security hardening (`_loadedFromFile`)** — Fixed a critical vulnerability where an internal loader flag could be injected via JSON config to bypass guardrails. The flag is now purely internal and no longer part of the public schema.
375
- - **TOCTOU protection** — Added atomic-style content checks in the config loader to prevent race conditions during file reads.
376
- - **`retrieve_summary` tool** — Properly registered the retrieval tool, allowing agents to fetch full content from auto-summarized tool outputs.
377
- - **92 new tests** — 1280 total tests across 57+ files (up from 1188 in v6.0.0).
378
-
379
- ### v6.1.0 — Docs & Design Agents
380
- - **`docs` agent** — Dedicated documentation synthesizer that automatically updates READMEs, API docs, and guides during Phase 6.
381
- - **`designer` agent** — UI/UX specification agent that generates component scaffolds before coding begins on UI-heavy tasks.
382
- - **Heterogeneous model defaults** — Updated default models for new agents to use optimized Gemini models for speed and cost.
383
-
384
- ### v6.0.0 — Core QA & Security Gates
385
- - **Dual-pass security reviewer** — After the general reviewer APPROVES, the architect automatically triggers a second security-only review pass when the changed file matches security-sensitive paths (`auth`, `crypto`, `session`, `token`, `middleware`, `api`, `security`) or the coder's output contains security keywords. Configurable via `review_passes` config.
386
- - **Adversarial testing** — After verification tests PASS, the test engineer is re-delegated with adversarial-only framing: attack vectors, boundary violations, and injection attempts. Pure prompt engineering, no new infrastructure.
387
- - **Integration impact analysis** — After the coder completes, the `diff` tool detects contract changes (exported functions, interfaces, types). If found, the explorer runs impact analysis across dependents before review begins.
388
- - **`diff` tool** — New agent-accessible tool providing structured git diff with numstat parsing, contract change detection, configurable base ref (`HEAD`/staged/unstaged), path filtering, and 500-line truncation.
389
- - **87 new tests** — 1188 total tests across 53+ files (up from 1101 in v5.2.0).
390
-
391
- ### v5.2.0 — Per-Invocation Guardrails
392
- - **Per-invocation budget isolation** — Guardrail limits (tool calls, duration, errors) now reset with each agent delegation. Second invocation of the same agent gets a fresh budget, preventing false circuit breaker trips in long-running projects.
393
- - **Architect protocol enforcement** — New mandatory QA gate rules: every coder task must go through reviewer approval + test_engineer verification before the next coder task. Protocol violations detected at runtime with warning injection.
394
- - **Invocation window observability** — Circuit breaker logs now include `invocationId` and `windowKey` for precise debugging of which specific agent invocation hit limits.
395
- - **67 new tests** — 1101 total tests across 48 files (up from 1034 in v5.1.x).
396
-
397
- ### v5.0.0 — Verifiable Execution
398
- - **Canonical plan schema** — Machine-readable `plan.json` with Zod-validated `PlanSchema`/`TaskSchema`/`PhaseSchema`. Automatic migration from legacy `plan.md` format. Structured status tracking (`pending`, `in_progress`, `completed`, `blocked`).
399
- - **Evidence bundles** — Per-task execution evidence persisted to `.swarm/evidence/`. Five evidence types: `review`, `test`, `diff`, `approval`, `note`. Sanitized task IDs, atomic writes, configurable size limits. `/swarm evidence` to view, `/swarm archive` to manage retention.
400
- - **Per-agent guardrail profiles** — Override guardrail limits for individual agents via `guardrails.profiles`. `resolveGuardrailsConfig()` merges base + profile with per-agent specificity.
401
- - **Context injection budget** — `max_injection_tokens` config controls how much context is injected into system prompts. Priority-ordered: phase → task → decisions → agent context. Lower-priority items dropped when budget exhausted.
402
- - **Enhanced `/swarm agents`** — Agent count summary, `⚡ custom limits` indicator for profiled agents, guardrail profiles section.
403
- - **Packaging smoke tests** — CI-safe `dist/` validation (8 tests).
404
- - **151 new tests** — 1027 total tests across 44 files (up from 876 in v4.6.0).
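The priority-ordered context budget described in the v5.0.0 notes can be sketched roughly as below. This is a minimal illustration, not the plugin's code: `selectContext`, `estimateTokens`, the `ContextItem` shape, the 4-characters-per-token estimate, and the choice to stop at the first item that does not fit are all assumptions.

```typescript
// Hypothetical sketch; names and the ~4 chars/token estimate are assumptions.
interface ContextItem {
  kind: "phase" | "task" | "decisions" | "agent_context";
  text: string;
}

// Priority order taken from the README: phase, task, decisions, agent context.
const PRIORITY: ContextItem["kind"][] = ["phase", "task", "decisions", "agent_context"];

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep items in priority order until the budget runs out, then drop the rest.
function selectContext(items: ContextItem[], maxInjectionTokens: number): ContextItem[] {
  const ordered = [...items].sort(
    (a, b) => PRIORITY.indexOf(a.kind) - PRIORITY.indexOf(b.kind),
  );
  const kept: ContextItem[] = [];
  let used = 0;
  for (const item of ordered) {
    const cost = estimateTokens(item.text);
    if (used + cost > maxInjectionTokens) break; // assumption: stop at first overflow
    kept.push(item);
    used += cost;
  }
  return kept;
}
```

A real implementation would likely use the model's tokenizer rather than a character-count estimate.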
405
-
406
- ### v4.6.0 — Agent Guardrails
407
- - **Circuit breaker** — Two-layer protection against runaway agents. Soft warning at 50% of limits, hard block at 100%. Prevents infinite loops and runaway API costs.
408
- - **Detection signals** — Tool call count, wall-clock time, consecutive repetition, and consecutive error tracking per agent session.
409
- - **Configurable limits** — All thresholds tunable via `guardrails` config: `max_tool_calls`, `max_duration_minutes`, `max_repetitions`, `max_consecutive_errors`, `warning_threshold`.
410
- - **46 new tests** — 668 total tests across 30 files.
411
-
412
- ### v4.5.0 — Tech Debt + New Commands
413
- - **Lint cleanup** — Replaced string concatenation with template literals, documented `as any` casts with biome-ignore comments.
414
- - **Code deduplication** — Extracted `stripSwarmPrefix()` utility to eliminate 3 duplicate prefix-stripping blocks.
415
- - **`/swarm diagnose`** — Health check for `.swarm/` files, plan structure, and plugin configuration.
416
- - **`/swarm export`** — Export plan.md and context.md as portable JSON.
417
- - **`/swarm reset --confirm`** — Clear swarm state files with safety confirmation.
418
-
419
- ### v4.4.0 — DX & Quality
420
- - **CLI `uninstall` command** — Remove plugin with optional `--clean` flag.
421
- - **Custom error classes** — `SwarmError` hierarchy with actionable `guidance` messages.
422
- - **`/swarm history`** — View completed phases from plan.md.
423
- - **`/swarm config`** — View current resolved plugin configuration.
424
-
425
- ### v4.3.2 — Security Hardening
426
- - **Path validation** — `validateSwarmPath()` prevents directory traversal in `.swarm/` file operations.
427
- - **Fetch hardening** — 10s timeout, 5MB limit, retry logic for gitingest tool.
428
- - **Config limits** — Deep merge depth limit (10), config file size limit (100KB).
429
-
430
- ### v4.3.0 — Hooks & Agent Awareness
431
- - **Hooks pipeline** — `safeHook()` crash-safe wrapper, `composeHandlers()` for multi-handler composition.
432
- - **Context pruning** — Token budget tracking with 70%/90% threshold warnings.
433
- - **Slash commands** — `/swarm status`, `/swarm plan`, `/swarm agents`.
434
- - **Agent awareness** — Activity tracking, delegation tracking, cross-agent context injection.
435
-
436
- All features are opt-in via configuration. See [Installation Guide](docs/installation.md) for config options.
437
-
438
315
  ---
439
316
 
440
- ## Agents
441
-
442
- ### 🎯 Orchestrator
443
- | Agent | Role |
444
- |-------|------|
445
- | `architect` | Central coordinator. Plans phases, delegates tasks, manages QA, maintains project memory. |
446
-
447
- ### 🔍 Discovery
448
- | Agent | Role |
449
- |-------|------|
450
- | `explorer` | Fast codebase scanner. Identifies structure, languages, frameworks, key files. |
451
-
452
- ### 🎨 Design
453
- | Agent | Role |
454
- |-------|------|
455
- | `designer` | UI/UX specification agent. Generates component scaffolds and design tokens before coding begins on UI-heavy tasks. |
456
-
457
- ### 🧠 Domain Expert
458
- | Agent | Role |
459
- |-------|------|
460
- | `sme` | Open-domain expert. The architect specifies any domain (security, python, ios, rust, kubernetes, etc.) per call. No hardcoded list — works with any domain the LLM has knowledge of. |
461
-
462
- ### 💻 Implementation
463
- | Agent | Role |
464
- |-------|------|
465
- | `coder` | Implements ONE task at a time with full context |
466
- | `test_engineer` | Generates tests, runs them, and reports structured PASS/FAIL verdicts |
467
-
468
- ### ✅ Quality Assurance
469
- | Agent | Role |
470
- |-------|------|
471
- | `reviewer` | Dual-pass review: correctness review first, then automatic security-only pass for security-sensitive files. The architect specifies CHECK dimensions per call. OWASP Top 10 categories built in. |
472
- | `critic` | Plan review gate. Reviews the architect's plan BEFORE implementation — checks completeness, feasibility, scope, dependencies, and flags AI-slop. |
317
+ ## Comparison
473
318
 
474
- ### 📝 Documentation
475
- | Agent | Role |
476
- |-------|------|
477
- | `docs` | Documentation synthesizer. Automatically updates READMEs, API docs, and guides based on implementation changes during Phase 6. |
319
+ | Feature | OpenCode Swarm | oh-my-opencode | get-shit-done | AutoGen | CrewAI |
320
+ |---------|:-:|:-:|:-:|:-:|:-:|
321
+ | Multi-agent orchestration | ✅ 9 specialized agents | ❌ Prompt config only | ❌ Single-agent macros | ✅ | ✅ |
322
+ | Execution model | Serial (deterministic) | N/A | N/A | Parallel (chaotic) | Parallel |
323
+ | Phased planning with acceptance criteria | ✅ | ❌ | ❌ | ❌ | ❌ |
324
+ | Critic gate before implementation | ✅ | ❌ | ❌ | ❌ | ❌ |
325
+ | Per-task dual-pass review (correctness + security) | ✅ | ❌ | ❌ | Optional | Optional |
326
+ | Adversarial test pass per task | ✅ | ❌ | ❌ | ❌ | ❌ |
327
+ | Pre-reviewer pipeline (lint, secretscan, imports) | ✅ v6.3 | ❌ | ❌ | ❌ | ❌ |
328
+ | Persistent session memory | ✅ `.swarm/` files | ❌ | ❌ | Session only | Session only |
329
+ | Resume projects across sessions | ✅ Native | ❌ | ❌ | ❌ | ❌ |
330
+ | Evidence trail per task | ✅ Structured bundles | ❌ | ❌ | ❌ | ❌ |
331
+ | Heterogeneous model routing | ✅ Per-agent | ❌ | ❌ | Limited | Limited |
332
+ | Circuit breaker / guardrails | ✅ Per-invocation | ❌ | ❌ | ❌ | ❌ |
333
+ | Open-domain SME consultation | ✅ Any domain | ❌ | ❌ | ❌ | ❌ |
334
+ | Retrospective learning across phases | ✅ | ❌ | ❌ | ❌ | ❌ |
335
+ | Slash commands + diagnostics | ✅ 12 commands | ❌ | Limited | ❌ | ❌ |
478
336
 
479
337
  ---
480
338
 
@@ -482,236 +340,141 @@ All features are opt-in via configuration. See [Installation Guide](docs/install
482
340
 
483
341
  | Command | Description |
484
342
  |---------|-------------|
485
- | `/swarm status` | Current phase, task progress, and agent count |
486
- | `/swarm plan [N]` | View full plan or filter by phase number |
487
- | `/swarm agents` | List all registered agents with models and permissions |
488
- | `/swarm history` | View completed phases with status icons |
489
- | `/swarm config` | View current resolved plugin configuration |
490
- | `/swarm diagnose` | Health check for .swarm/ files and config |
343
+ | `/swarm status` | Current phase, task progress, agent count |
344
+ | `/swarm plan [N]` | Full plan or filtered by phase |
345
+ | `/swarm agents` | All registered agents with models and permissions |
346
+ | `/swarm history` | Completed phases with status |
347
+ | `/swarm config` | Current resolved configuration |
348
+ | `/swarm diagnose` | Health check for `.swarm/` files and config |
491
349
  | `/swarm export` | Export plan and context as portable JSON |
492
- | `/swarm reset --confirm` | Clear swarm state files (with safety gate) |
493
- | `/swarm evidence [task]` | View evidence bundles for a task or all tasks |
494
- | `/swarm archive [--dry-run]` | Archive old evidence bundles with retention policy |
495
- | `/swarm benchmark` | Run performance benchmarks and display metrics |
496
- | `/swarm retrieve [id]` | Retrieve auto-summarized tool outputs by ID |
350
+ | `/swarm evidence [task]` | Evidence bundles for a task or all tasks |
351
+ | `/swarm archive [--dry-run]` | Archive old evidence with retention policy |
352
+ | `/swarm benchmark` | Performance benchmarks |
353
+ | `/swarm retrieve [id]` | Retrieve auto-summarized tool outputs |
354
+ | `/swarm reset --confirm` | Clear swarm state files |
497
355
 
498
356
  ---
499
357
 
500
358
  ## Configuration
501
359
 
502
- Create `~/.config/opencode/opencode-swarm.json`:
503
-
504
360
  ```json
505
361
  {
506
362
  "agents": {
507
- "architect": { "model": "anthropic/claude-sonnet-4-5" },
508
- "explorer": { "model": "google/gemini-2.0-flash" },
509
- "coder": { "model": "anthropic/claude-sonnet-4-5" },
510
- "sme": { "model": "google/gemini-2.0-flash" },
511
- "reviewer": { "model": "openai/gpt-4o" },
512
- "critic": { "model": "google/gemini-2.0-flash" },
513
- "test_engineer": { "model": "google/gemini-2.0-flash" },
514
- "docs": { "model": "google/gemini-2.0-flash" },
515
- "designer": { "model": "google/gemini-2.0-flash" }
516
- }
517
- }
518
- ```
519
-
520
- ### Disable Agents
521
- ```json
522
- {
523
- "sme": { "disabled": true },
524
- "test_engineer": { "disabled": true }
525
- }
526
- ```
527
-
528
- ---
529
-
530
- ## Guardrails
531
-
532
- OpenCode Swarm includes a built-in circuit breaker that prevents subagents from running away — burning API credits in infinite loops, repeating the same tool call, or spinning for hours.
533
-
534
- ### How It Works
535
-
536
- | Layer | Trigger | Action |
537
- |-------|---------|--------|
538
- | ⚠️ **Soft Warning** | 50% of any limit reached | Injects warning message into agent's chat stream |
539
- | 🛑 **Hard Block** | 100% of any limit reached | Blocks ALL further tool calls + injects stop message |
540
-
541
- ### Detection Signals
542
-
543
- | Signal | Default Limit | Description |
544
- |--------|---------------|-------------|
545
- | Tool calls | 200 | Total tool invocations per agent session |
546
- | Duration | 30 min | Wall-clock time since delegation started |
547
- | Repetition | 10 | Same tool + args called consecutively |
548
- | Consecutive errors | 5 | Sequential null/undefined tool outputs |
549
-
550
- ### Configuration
551
-
552
- Guardrails are **enabled by default**. Customize in your swarm config:
553
-
554
- ```jsonc
555
- {
556
- "guardrails": {
557
- "enabled": true, // default: true
558
- "max_tool_calls": 200, // range: 10–1000
559
- "max_duration_minutes": 30, // range: 1–120
560
- "max_repetitions": 10, // range: 3–50
561
- "max_consecutive_errors": 5, // range: 2–20
562
- "warning_threshold": 0.5 // range: 0.1–0.9 (fraction of limit for soft warning)
563
- }
564
- }
565
- ```
566
-
567
- ### Per-Agent Profiles
568
-
569
- Override limits for specific agents that need more (or less) room:
570
-
571
- ```jsonc
572
- {
363
+ "architect": { "model": "anthropic/claude-opus-4-6" },
364
+ "coder": { "model": "minimax-coding-plan/MiniMax-M2.5" },
365
+ "explorer": { "model": "minimax-coding-plan/MiniMax-M2.1" },
366
+ "sme": { "model": "kimi-for-coding/k2p5" },
367
+ "critic": { "model": "zai-coding-plan/glm-5" },
368
+ "reviewer": { "model": "zai-coding-plan/glm-5" },
369
+ "test_engineer": { "model": "minimax-coding-plan/MiniMax-M2.5" },
370
+ "docs": { "model": "zai-coding-plan/glm-4.7-flash" },
371
+ "designer": { "model": "kimi-for-coding/k2p5" }
372
+ },
573
373
  "guardrails": {
574
374
  "max_tool_calls": 200,
375
+ "max_duration_minutes": 30,
575
376
  "profiles": {
576
- "coder": { "max_tool_calls": 500, "max_duration_minutes": 60 },
577
- "explorer": { "max_tool_calls": 50 }
377
+ "coder": { "max_tool_calls": 500 }
578
378
  }
379
+ },
380
+ "review_passes": {
381
+ "always_security_review": false,
382
+ "security_globs": ["**/*auth*", "**/*crypto*", "**/*session*", "**/*token*"]
579
383
  }
580
384
  }
581
385
  ```
582
386
 
583
- Profiles merge with base config — only specified fields are overridden.
584
-
585
- ### Review Passes
387
+ Save to `~/.config/opencode/opencode-swarm.json` or `.opencode/swarm.json` in your project root. Project config merges over global config via deep merge, so partial overrides do not clobber unspecified fields.
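To illustrate what merging project config over global config can look like, here is a minimal TypeScript sketch. `deepMerge`, `Json`, and `JsonObject` are hypothetical names for this example, not the plugin's API. The depth cap of 10 echoes the limit listed in the v4.3.2 notes, but how the plugin enforces it, and the choice to replace arrays wholesale, are assumptions.

```typescript
// Hypothetical sketch: not the plugin's real implementation.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isPlainObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge `override` into `base`. Arrays and scalars are replaced wholesale.
function deepMerge(base: JsonObject, override: JsonObject, depth = 0): JsonObject {
  if (depth > 10) throw new Error("config nesting exceeds merge depth limit");
  const result: JsonObject = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = result[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value, depth + 1)
        : value;
  }
  return result;
}

// Global config sets base guardrails; project config only adds a coder profile.
const globalConfig: JsonObject = {
  guardrails: { max_tool_calls: 200, max_duration_minutes: 30 },
};
const projectConfig: JsonObject = {
  guardrails: { profiles: { coder: { max_tool_calls: 500 } } },
};
const merged = deepMerge(globalConfig, projectConfig);
```

In this sketch `merged.guardrails` keeps both base limits and gains the `profiles.coder` override, which is the behavior the paragraph above describes.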
586
388
 
587
- Control the dual-pass security review behavior:
389
+ ### Disabling Agents
588
390
 
589
- ```jsonc
391
+ ```json
590
392
  {
591
- "review_passes": {
592
- "always_security_review": false, // default: false (only on security-sensitive files)
593
- "security_globs": [ // default patterns:
594
- "**/*auth*", "**/*crypto*",
595
- "**/*session*", "**/*token*",
596
- "**/*middleware*", "**/*api*",
597
- "**/*security*"
598
- ]
599
- }
393
+ "sme": { "disabled": true },
394
+ "designer": { "disabled": true },
395
+ "test_engineer": { "disabled": true }
600
396
  }
601
397
  ```
602
398
 
603
- Set `always_security_review: true` to run the security pass on every task, regardless of file path.
399
+ ---
604
400
 
605
- ### Integration Analysis
401
+ ## Installation
606
402
 
607
- Control whether contract change detection triggers impact analysis:
403
+ ```bash
404
+ # Install globally
405
+ npm install -g opencode-swarm
608
406
 
609
- ```jsonc
610
- {
611
- "integration_analysis": {
612
- "enabled": true // default: true
613
- }
614
- }
615
- ```
407
+ # Or use npx
408
+ npx opencode-swarm install
616
409
 
617
- ### Compaction Advisory
410
+ # Verify
411
+ opencode # then: /swarm diagnose
412
+ ```
618
413
 
619
- Control when the system hints about context compaction thresholds:
414
+ The installer auto-configures `opencode.json` to include the plugin. Manual configuration:
620
415
 
621
- ```jsonc
416
+ ```json
622
417
  {
623
- "compaction_advisory": {
624
- "enabled": true, // default: true
625
- "thresholds": [50, 75, 100, 125, 150], // tool-call counts that trigger hints
626
- "message": "Large context may benefit from compaction" // custom message
627
- }
418
+ "plugins": ["opencode-swarm"]
628
419
  }
629
420
  ```
630
421
 
631
- When the architect's tool-call count crosses a threshold, the system enhancer injects a `[SWARM HINT]` suggesting context management. Each threshold fires only once per session (tracked via `lastCompactionHint`).
632
-
633
- > **Architect is exempt/unlimited by default:** The architect agent has no guardrail limits by default. To override, add a `profiles.architect` entry in your guardrails config.
634
-
635
- ### Per-Invocation Budgets
636
-
637
- Guardrail limits are enforced **per-invocation**, not per-session. Each time the architect delegates to an agent, that agent gets a fresh budget of tool calls, duration, and error tolerance.
638
-
639
- **Example**: If `max_tool_calls: 200`, then:
640
- - Architect → Coder (task 1) → 200 calls available
641
- - Coder finishes → Architect → Coder (task 2) → 200 calls available again
642
-
643
- This prevents long-running projects from accumulating session-wide counters that incorrectly trip the circuit breaker on later tasks.
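The per-invocation budget and the two-layer soft warning / hard block can be sketched roughly as follows. `InvocationWindow`, `checkToolCall`, and the `Verdict` values are hypothetical names for this example, only the tool-call signal is modeled, and warning only once per window is an assumption.

```typescript
// Hypothetical sketch of a per-invocation budget; only the tool-call limit is modeled.
type Verdict = "ok" | "warn" | "block";

interface Limits {
  maxToolCalls: number; // corresponds to max_tool_calls
  warningThreshold: number; // corresponds to warning_threshold (fraction of the limit)
}

class InvocationWindow {
  private readonly limits: Limits;
  private toolCalls = 0;
  private warned = false;

  constructor(limits: Limits) {
    this.limits = limits;
  }

  // Call before each tool invocation.
  checkToolCall(): Verdict {
    if (this.toolCalls >= this.limits.maxToolCalls) return "block"; // hard block at 100%
    this.toolCalls += 1;
    const softLimit = this.limits.maxToolCalls * this.limits.warningThreshold;
    if (!this.warned && this.toolCalls >= softLimit) {
      this.warned = true; // assumption: warn once per window
      return "warn";
    }
    return "ok";
  }
}

// Each delegation creates a new window, so counters never carry over between tasks.
function delegate(limits: Limits): InvocationWindow {
  return new InvocationWindow(limits);
}
```

Because `delegate` returns a fresh window each time, a second coder invocation starts from zero calls regardless of how the first one ended.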
422
+ ---
644
423
 
645
- > **Architect is unlimited**: The architect never creates invocation windows and has no guardrail limits by default.
424
+ ## Testing
646
425
 
647
- ### Disable Guardrails
426
+ 2031 tests across 78 files. Unit, integration, adversarial, and smoke. Covers config schemas, all agent prompts, all hooks, all tools, all commands, guardrail circuit breaker, race conditions, invocation window isolation, multi-invocation state, security category classification, and evidence validation.
648
427
 
649
- ```json
650
- {
651
- "guardrails": {
652
- "enabled": false
653
- }
654
- }
428
+ ```bash
429
+ bun test
655
430
  ```
656
431
 
657
- ---
432
+ Zero additional test dependencies. Uses Bun's built-in test runner.
658
433
 
659
- ## Comparison
434
+ ---
660
435
 
661
- | Feature | OpenCode Swarm | AutoGen | CrewAI | LangGraph |
662
- |---------|---------------|---------|--------|-----------|
663
- | Execution | Serial (predictable) | Parallel (chaotic) | Parallel | Configurable |
664
- | Planning | Phased with acceptance criteria | Ad-hoc | Role-based | Graph-based |
665
- | Memory | Persistent `.swarm/` files | Session only | Session only | Checkpoints |
666
- | QA | Dual-pass per-task (review + security + adversarial) | Optional | Optional | Manual |
667
- | Model mixing | Per-agent configuration | Limited | Limited | Manual |
668
- | Resume projects | ✅ Native | ❌ | ❌ | Partial |
669
- | SME domains | Open-domain (any) | Generic | Generic | Generic |
670
- | Task granularity | One at a time | Batched | Batched | Varies |
436
+ ## Roadmap
671
437
 
672
- ---
438
+ ### v6.3 — Pre-Reviewer Pipeline
673
439
 
674
- ## Design Principles
440
+ Three new tools complete the pre-reviewer gauntlet. Code reaching the Reviewer is already clean.
675
441
 
676
- 1. **Plan before code** - Documented phases with acceptance criteria
677
- 2. **One task at a time** - Focused work, quality output
678
- 3. **Review everything immediately** - Dual-pass review (correctness + security) with adversarial testing per task
679
- 4. **Cache SME knowledge** - Don't re-ask answered questions
680
- 5. **Persistent memory** - `.swarm/` files survive sessions
681
- 6. **Serial execution** - Predictable, debuggable, no race conditions
682
- 7. **Heterogeneous models** - Different perspectives catch different bugs
683
- 8. **User checkpoints** - Confirm before proceeding to next phase
684
- 9. **Failure tracking** - Document rejections, escalate after 5 attempts
685
- 10. **Resumable by design** - Any Architect can pick up any project
442
+ - **`imports`** — AST-based import graph. For each file changed by the coder, returns every consumer file, which exports each consumer uses, and the line numbers. Replaces fragile grep-based integration analysis with deterministic graph traversal.
443
+ - **`lint`** — Auto-detects project linter (Biome, ESLint, Ruff, Clippy, PSScriptAnalyzer). Runs in fix mode first, then check mode. Structured diagnostic output per file.
444
+ - **`secretscan`** — Entropy-based credential scanner. Detects API keys, tokens, connection strings, and private key headers in the diff before they reach the reviewer. Zero external dependencies.
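Entropy-based scanning generally means flagging strings whose characters look statistically random. The sketch below computes Shannon entropy per token and flags long, high-entropy tokens on lines a diff adds. The 4.0 bits-per-character threshold, the 20-character minimum, and the names `shannonEntropy` and `findSuspectTokens` are illustrative assumptions, not the `secretscan` tool's actual parameters.

```typescript
// Illustrative only: thresholds and names are assumptions, not secretscan's real logic.
function shannonEntropy(text: string): number {
  const chars = Array.from(text);
  if (chars.length === 0) return 0;
  const counts = new Map<string, number>();
  for (const ch of chars) counts.set(ch, (counts.get(ch) ?? 0) + 1);
  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy; // bits per character
}

function findSuspectTokens(diffText: string, minLength = 20, threshold = 4.0): string[] {
  const suspects: string[] = [];
  for (const line of diffText.split("\n")) {
    // Only inspect lines the diff adds; skip the "+++ b/file" header.
    if (!line.startsWith("+") || line.startsWith("+++")) continue;
    for (const token of line.slice(1).split(/[\s"'=:,;]+/)) {
      if (token.length >= minLength && shannonEntropy(token) >= threshold) {
        suspects.push(token);
      }
    }
  }
  return suspects;
}
```

Entropy alone produces false positives (hashes, minified code), so a practical scanner would probably pair it with pattern checks such as private key headers.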
686
445
 
687
- ---
446
+ Phase 5 execute loop becomes: `coder → diff → imports → lint fix → lint check → secretscan → reviewer → security reviewer → test_engineer → adversarial test_engineer`.
688
447
 
689
- ## Testing
448
+ ### v6.4 — Execution and Planning Tools
690
449
 
691
- ```bash
692
- # Run all tests
693
- bun test
450
+ - **`test_runner`** — Unified test execution across Bun, Vitest, Jest, Mocha, pytest, cargo test, and Pester. Auto-detects framework, returns normalized JSON with pass/fail/skip counts and coverage. Three scope modes: `all`, `convention` (naming-based), `graph` (import-graph-based). Eliminates the test_engineer's most common failure mode.
451
+ - **`symbols`** — Export inventory for a module: functions, classes, interfaces, types, enums. Gives the Architect instant visibility into a file's public API surface without reading the full source.
452
+ - **`checkpoint`** — Git-backed save points. Before any multi-file refactor (≥3 files), Architect auto-creates a checkpoint commit. On critical integration failure, restores via soft reset instead of iterating into a hole.
694
453
 
695
- # Run specific test file
696
- bun test tests/unit/config/schema.test.ts
697
- ```
454
+ ### v6.5 — Intelligence and Audit Tools
698
455
 
699
- 1391 tests across 62+ files covering config, tools, agents, hooks, commands, state, guardrails, evidence, plan schemas, circuit breaker race conditions, invocation windows, multi-invocation isolation, security categories, review/integration schemas, and diff tool. Uses Bun's built-in test runner — zero additional test dependencies.
456
+ Five tools that improve planning quality and post-phase validation:
700
457
 
701
- ## Troubleshooting
458
+ - **`pkg_audit`** — Wraps `npm audit`, `pip-audit`, `cargo audit`. Structured CVE output with severity, patched versions, and advisory URLs. Fed to the security reviewer for concrete vulnerability context.
459
+ - **`complexity_hotspots`** — Git churn × cyclomatic complexity risk map. Run in Phase 0/2 to identify modules that need stricter QA gates before implementation begins.
460
+ - **`schema_drift`** — Compares OpenAPI spec against actual route implementations. Surfaces undocumented routes and phantom spec paths. Run in Phase 6 when API routes were modified.
461
+ - **`todo_extract`** — Structured extraction of `TODO`, `FIXME`, and `HACK` annotations across the codebase. High-priority items fed directly into plan task candidates.
462
+ - **`evidence_check`** — Audits completed tasks against required evidence types. Run in Phase 6 to verify every task has review and test evidence before the phase is marked complete.
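Since `evidence_check` is a roadmap item, the following is only a guess at the kind of audit it describes: given the evidence types recorded per task, report which required types are missing. The `TaskEvidence` shape and `findEvidenceGaps` are hypothetical. The five evidence type names come from the v5.0.0 evidence bundle notes.

```typescript
// Hypothetical data shapes; the planned evidence_check tool may differ.
type EvidenceType = "review" | "test" | "diff" | "approval" | "note";

interface TaskEvidence {
  taskId: string;
  types: EvidenceType[]; // evidence recorded for this task
}

// Return, for each task with gaps, the list of missing required evidence types.
function findEvidenceGaps(
  tasks: TaskEvidence[],
  required: EvidenceType[] = ["review", "test"],
): Record<string, EvidenceType[]> {
  const gaps: Record<string, EvidenceType[]> = {};
  for (const task of tasks) {
    const present = new Set(task.types);
    const missing = required.filter((type) => !present.has(type));
    if (missing.length > 0) gaps[task.taskId] = missing;
  }
  return gaps;
}

const gaps = findEvidenceGaps([
  { taskId: "1.1", types: ["review", "test", "diff"] },
  { taskId: "1.2", types: ["review"] },
]);
```

An empty result would mean every task has the required evidence, so the phase could be marked complete.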
702
463
 
703
- ### Plugin not loading
704
- 1. Verify `opencode-swarm` is listed in your `opencode.json` plugins array
705
- 2. Run `bunx opencode-swarm install` to auto-configure
706
- 3. Run `/swarm diagnose` to check health status
464
+ ---
707
465
 
708
- ### Commands not working
709
- - Ensure you're using `/swarm <command>`, not `/swarm/<command>`
710
- - Run `/swarm` with no arguments to see available commands
466
+ ## Design Principles
711
467
 
712
- ### Resuming a project
713
- - Swarm automatically detects `.swarm/plan.md` and resumes where you left off
714
- - If you get unexpected behavior, run `/swarm export` to backup, then `/swarm reset --confirm` to start fresh
468
+ 1. **Plan before code** — Documented phases with acceptance criteria. The Critic approves the plan before a single line is written.
469
+ 2. **One task at a time** — The Coder gets one task and full context. Nothing else.
470
+ 3. **Review everything immediately** — Every task goes through correctness review, security review, verification tests, and adversarial tests. No task ships without passing all four.
471
+ 4. **Cache SME knowledge** — Guidance is written to `context.md`. The same domain question is never asked twice in a project.
472
+ 5. **Persistent memory** — `.swarm/` files are the ground truth. Any session, any model, any day.
473
+ 6. **Serial execution** — Predictable, debuggable, no race conditions, no conflicting writes.
474
+ 7. **Heterogeneous models** — Different models, different blind spots. The coder's bug is the reviewer's catch.
475
+ 8. **User checkpoints** — Phase transitions require user confirmation. No unsupervised multi-phase runs.
476
+ 9. **Document failures** — Rejections and retries are recorded in plan.md. After 5 failed attempts, the task escalates to the user.
477
+ 10. **Resumable by design** — A cold-start Architect can read `.swarm/` and continue any project as if it had been there from the beginning.
715
478
 
716
479
  ---
717
480
 
@@ -730,5 +493,5 @@ MIT
730
493
  ---
731
494
 
732
495
  <p align="center">
733
- <strong>Stop hoping your agents figure it out. Start shipping code that works.</strong>
496
+ <strong>Stop hoping your agents figure it out. Start shipping code that actually works.</strong>
734
497
  </p>