npm - opencode-swarm - Versions diffs - 2.3.3 → 3.0.0 - Mend

opencode-swarm 2.3.3 → 3.0.0


Files changed (3)

package/README.md CHANGED

@@ -1,112 +1,223 @@
-# OpenCode Swarm
+<p align="center">
+  <img src="https://img.shields.io/badge/version-3.0.0-blue" alt="Version">
+  <img src="https://img.shields.io/badge/license-MIT-green" alt="License">
+  <img src="https://img.shields.io/badge/opencode-plugin-purple" alt="OpenCode Plugin">
+  <img src="https://img.shields.io/badge/agents-20+-orange" alt="Agents">
+</p>
+<h1 align="center">🐝 OpenCode Swarm</h1>
+<p align="center">
+  <strong>The only multi-agent framework that actually works.</strong><br>
+  Structured phases. Persistent memory. One task at a time. QA on everything.
+</p>
+<p align="center">
+  <a href="#why-swarm">Why Swarm?</a> •
+  <a href="#how-it-works">How It Works</a> •
+  <a href="#installation">Installation</a> •
+  <a href="#agents">Agents</a> •
+  <a href="#configuration">Configuration</a>
+</p>
-![License](https://img.shields.io/badge/license-MIT-blue)
-![OpenCode Plugin](https://img.shields.io/badge/opencode-plugin-green)
-![Architecture](https://img.shields.io/badge/architecture-architect--centric-purple)
-![Version](https://img.shields.io/badge/version-2.2.1-orange)
+---
+## The Problem with Every Other Multi-Agent System
+```
+You: "Build me an authentication system"
+Other Frameworks:
+├── Agent 1 starts auth module...
+├── Agent 2 starts user model... (conflicts with Agent 1)
+├── Agent 3 starts database... (wrong schema)
+├── Agent 4 starts tests... (for code that doesn't exist yet)
+└── Result: Chaos. Conflicts. Context lost. Start over.
+OpenCode Swarm:
+├── Architect analyzes request
+├── Explorer scans codebase
+├── Security SME provides auth guidance
+├── Architect creates phased plan with acceptance criteria
+├── Phase 1: User model → QA → Tests → ✓
+├── Phase 2: Auth logic → QA → Tests → ✓
+├── Phase 3: Session management → QA → Tests → ✓
+└── Result: Working code. Documented decisions. Resumable progress.
+```
+---
+## Why Swarm?
+<table>
+<tr>
+<td width="50%">
+### ❌ Other Frameworks
+- Parallel chaos, hope it converges
+- Single model = correlated failures
+- No planning, just vibes
+- Context lost between sessions
+- QA as afterthought (if at all)
+- Entire codebase in one prompt
+- No way to resume projects
+</td>
+<td width="50%">
+### ✅ OpenCode Swarm
-**Architect-driven, multi-agent development for OpenCode.**
+- **Serial execution** - predictable, traceable
+- **Heterogeneous models** - different perspectives catch errors
+- **Phased planning** - documented tasks with acceptance criteria
+- **Persistent memory** - `.swarm/` files survive sessions
+- **QA per task** - security + audit before anything ships
+- **One task at a time** - focused, quality code
+- **Resumable projects** - pick up exactly where you left off
-Design-first orchestration with codebase discovery, domain-aware SMEs, heterogeneous model perspectives, and layered QA review.
+</td>
+</tr>
+</table>
+---
+## How It Works
 ```
-┌──────────────────────────────────────────────────────────────────────────┐
-│  "Review this PowerShell application for security issues"                │
-└──────────────────────────────────────────────────────────────────────────┘
+┌─────────────────────────────────────────────────────────────────────────┐
+│  USER: "Add user authentication with JWT"                               │
+└─────────────────────────────────────────────────────────────────────────┘
+                                    │
+                                    ▼
+┌─────────────────────────────────────────────────────────────────────────┐
+│  PHASE 0: Check for .swarm/plan.md                                      │
+│           Exists? Resume. New? Continue.                                │
+└─────────────────────────────────────────────────────────────────────────┘
+                                    │
+                                    ▼
+┌─────────────────────────────────────────────────────────────────────────┐
+│  PHASE 1: Clarify (if needed)                                           │
+│           "Do you need refresh tokens? What's the session duration?"    │
+└─────────────────────────────────────────────────────────────────────────┘
+                                    │
+                                    ▼
+┌─────────────────────────────────────────────────────────────────────────┐
+│  PHASE 2: Discover                                                      │
+│           @explorer scans codebase → structure, languages, patterns     │
+└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
-┌──────────────────────────────────────────────────────────────────────────┐
-│  ARCHITECT: Delegating to @explorer for codebase analysis...             │
-└──────────────────────────────────────────────────────────────────────────┘
+┌─────────────────────────────────────────────────────────────────────────┐
+│  PHASE 3: Consult SMEs (serial, cached)                                 │
+│           @sme_security → auth best practices                           │
+│           @sme_api → JWT patterns, refresh flow                         │
+│           Guidance saved to .swarm/context.md                           │
+└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
-┌──────────────────────────────────────────────────────────────────────────┐
-│  EXPLORER: PowerShell module, 12 files, domains: powershell, security    │
-│  → Flagged: auth.ps1, invoke-command.ps1 for SME review                  │
-└──────────────────────────────────────────────────────────────────────────┘
+┌─────────────────────────────────────────────────────────────────────────┐
+│  PHASE 4: Plan                                                          │
+│           Creates .swarm/plan.md with phases, tasks, acceptance criteria│
+│                                                                         │
+│           Phase 1: Foundation [3 tasks]                                 │
+│           Phase 2: Core Auth [4 tasks]                                  │
+│           Phase 3: Session Management [3 tasks]                         │
+└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
-┌──────────────────────────────────────────────────────────────────────────┐
-│  SME_POWERSHELL: Remoting patterns detected, needs constrained endpoints │
-│  SME_SECURITY: Credential handling issues in auth.ps1:42-58              │
-└──────────────────────────────────────────────────────────────────────────┘
+┌─────────────────────────────────────────────────────────────────────────┐
+│  PHASE 5: Execute (per task)                                            │
+│                                                                         │
+│   ┌─────────┐    ┌──────────┐    ┌─────────┐    ┌──────────┐           │
+│   │ @coder  │ →  │@security │ →  │@auditor │ →  │  @test   │           │
+│   │ 1 task  │    │ review   │    │ verify  │    │ generate │           │
+│   └─────────┘    └──────────┘    └─────────┘    └──────────┘           │
+│        │                                              │                 │
+│        └──── If rejected: retry with feedback ────────┘                 │
+│                                                                         │
+│   Update plan.md: [x] Task complete                                     │
+│   Next task...                                                          │
+└─────────────────────────────────────────────────────────────────────────┘
                                     │
                                     ▼
-┌──────────────────────────────────────────────────────────────────────────┐
-│  ARCHITECT: Collated review with 3 HIGH findings, 2 recommendations      │
-└──────────────────────────────────────────────────────────────────────────┘
+┌─────────────────────────────────────────────────────────────────────────┐
+│  PHASE 6: Phase Complete                                                │
+│           Re-scan with @explorer                                        │
+│           Update context.md with learnings                              │
+│           Archive to .swarm/history/                                    │
+│           "Phase 1 complete. Ready for Phase 2?"                        │
+└─────────────────────────────────────────────────────────────────────────┘
 ```
 ---
-## Why OpenCode Swarm?
+## Persistent Project Memory
-Most agent frameworks parallelize everything and hope coherence emerges.
-**OpenCode Swarm enforces discipline:**
+Other frameworks lose everything when the session ends. Swarm doesn't.
-| Problem | Our Solution |
-|---------|--------------|
-| Agents read the same files repeatedly | Explorer scans once, shares context |
-| All 11 SMEs consulted for every task | Only relevant domains (1-3) based on Explorer findings |
-| Single model = correlated failures | Different models per role = diverse perspectives |
-| No visibility into agent decisions | Serial execution with clear delegation traces |
-| Code ships without review | Mandatory Security → Audit → Test pipeline |
+```
+.swarm/
+├── plan.md          # Your project roadmap
+├── context.md       # Everything a new Architect needs
+└── history/
+    ├── phase-1.md   # What was done, what was learned
+    └── phase-2.md
+```
----
+### plan.md - Living Roadmap
+```markdown
+# Project: Auth System
+Current Phase: 2
+## Phase 1: Foundation [COMPLETE]
+- [x] Task 1.1: Create user model [SMALL]
+- [x] Task 1.2: Add password hashing [SMALL]
+- [x] Task 1.3: Database migrations [MEDIUM]
+## Phase 2: Core Auth [IN PROGRESS]
+- [x] Task 2.1: Login endpoint [MEDIUM]
+- [ ] Task 2.2: JWT generation [MEDIUM] (depends: 2.1) ← CURRENT
+  - Acceptance: Returns valid JWT with user claims
+  - Attempt 1: REJECTED - Missing expiration
+- [ ] Task 2.3: Token validation middleware [MEDIUM]
+- [BLOCKED] Task 2.4: Refresh tokens
+  - Reason: Waiting for decision on rotation strategy
+```
-## Architecture
+### context.md - Institutional Knowledge
+```markdown
+# Project Context: Auth System
-```
-User Request
-     │
-     ▼
-┌─────────────┐
-│  ARCHITECT  │ ◄── Orchestrates everything, owns all decisions
-└─────────────┘
-     │
-     ▼
-┌─────────────┐
-│  EXPLORER   │ ◄── Fast codebase discovery (read-only)
-└─────────────┘     Returns: structure, languages, domains, flagged files
-     │
-     ▼
-┌─────────────┐
-│    SMEs     │ ◄── Domain experts consulted serially (read-only)
-└─────────────┘     Only domains identified by Explorer
-     │
-     ▼
-┌─────────────┐
-│   CODER     │ ◄── Implements unified specification
-└─────────────┘
-     │
-     ▼
-┌─────────────┐     ┌─────────────┐
-│  SECURITY   │ ──► │   AUDITOR   │ ◄── QA review (read-only)
-└─────────────┘     └─────────────┘
-     │
-     ▼
-┌─────────────┐
-│    TEST     │ ◄── Generates tests for approved code
-└─────────────┘
-```
+## Technical Decisions
+- Using bcrypt (cost 12) for password hashing
+- JWT expires in 15 minutes, refresh in 7 days
+- Storing refresh tokens in Redis
-### Agent Permissions
+## SME Guidance Cache
+### Security (Phase 1)
+- Never log tokens or passwords
+- Use constant-time comparison for tokens
+- Implement rate limiting on login
-| Agent | Read | Write | Role |
-|-------|:----:|:-----:|------|
-| Architect | ✅ | ✅ | Orchestrator - can fall back if delegation fails |
-| Explorer | ✅ | ❌ | Discovery - scans, summarizes, identifies domains |
-| SMEs (×15) | ✅ | ❌ | Advisory - domain expertise, never implements |
-| Coder | ✅ | ✅ | Implementation - writes production code |
-| Security Reviewer | ✅ | ❌ | Audit - vulnerability assessment |
-| Auditor | ✅ | ❌ | Audit - correctness verification |
-| Test Engineer | ✅ | ✅ | Testing - generates test cases |
+### API (Phase 1)
+- Return 401 for invalid credentials (not 404)
+- Include token expiry in response body
+## Patterns Established
+- Error handling: Custom ApiError class with status codes
+- Validation: Zod schemas in /validators/
+```
+**Start a new session tomorrow?** The Architect reads these files and picks up exactly where you left off.
 ---
-## Heterogeneous Model Perspectives
+## Heterogeneous Models = Better Code
+Most frameworks use one model for everything. Same blindspots everywhere.
-OpenCode Swarm allows **different models per role**, reducing correlated failures:
+Swarm lets you mix models strategically:
 ```json
 {
@@ -115,38 +226,84 @@ OpenCode Swarm allows **different models per role**, reducing correlated failure
     "explorer": { "model": "google/gemini-2.0-flash" },
     "coder": { "model": "anthropic/claude-sonnet-4-5" },
     "_sme": { "model": "google/gemini-2.0-flash" },
-    "_qa": { "model": "openai/gpt-4o" },
-    "test_engineer": { "model": "google/gemini-2.0-flash" }
+    "security_reviewer": { "model": "openai/gpt-4o" },
+    "auditor": { "model": "google/gemini-2.0-flash" }
   }
 }
 ```
-**Why this matters:**
-- Reasoning-heavy model for Architect decisions
-- Fast/cheap model for Explorer and SME consultation
-- Different model family for QA catches errors the others miss
-- Mix local (Ollama) and cloud models based on cost/capability
+| Role | Optimized For | Why Different Models? |
+|------|---------------|----------------------|
+| Architect | Deep reasoning | Needs to plan complex work |
+| Explorer | Fast scanning | Speed over depth |
+| Coder | Implementation | Best coding model you have |
+| SMEs | Domain knowledge | Fast recall, not deep reasoning |
+| Security Reviewer | Finding flaws | **Different vendor catches different bugs** |
+| Auditor | Verification | Independent perspective |
+**If Claude writes code and GPT reviews it, GPT catches Claude's blindspots.** This is why real teams have code review.
 ---
 ## Installation
-Add to your `opencode.json`:
-```json
+```bash
+# Add to opencode.json
 {
   "plugin": ["opencode-swarm"]
 }
-```
-Or install via CLI:
-```bash
+# Or install via CLI
 bunx opencode-swarm install
 ```
 ---
+## Agents
+### 🎯 Orchestrator
+| Agent | Role |
+|-------|------|
+| `architect` | Central coordinator. Plans phases, delegates tasks, manages QA, maintains project memory. |
+### 🔍 Discovery
+| Agent | Role |
+|-------|------|
+| `explorer` | Fast codebase scanner. Identifies structure, languages, frameworks, key files. |
+### 🧠 Domain Experts (15 SMEs)
+| Agent | Domain |
+|-------|--------|
+| `sme_web` | Flutter, React, Vue, Angular, JS/TS, HTML/CSS |
+| `sme_api` | REST, GraphQL, OAuth, JWT, webhooks |
+| `sme_database` | SQL Server, PostgreSQL, MySQL, MongoDB, Redis |
+| `sme_devops` | Docker, Kubernetes, CI/CD, Terraform |
+| `sme_security` | STIG, hardening, CVE, encryption, PKI |
+| `sme_python` | Python ecosystem, libraries, patterns |
+| `sme_powershell` | PowerShell scripting, modules, remoting |
+| `sme_windows` | Windows internals, registry, services, WMI |
+| `sme_linux` | Linux, systemd, package management |
+| `sme_network` | TCP/IP, firewalls, DNS, TLS |
+| `sme_azure` | Azure services, Entra ID, ARM/Bicep |
+| `sme_vmware` | vSphere, ESXi, PowerCLI |
+| `sme_oracle` | Oracle Database, SQL/PLSQL |
+| `sme_active_directory` | AD, LDAP, Group Policy, Kerberos |
+| `sme_ui_ux` | UI/UX design, accessibility |
+### 💻 Implementation
+| Agent | Role |
+|-------|------|
+| `coder` | Implements ONE task at a time with full context |
+| `test_engineer` | Generates tests for each completed task |
+### ✅ Quality Assurance
+| Agent | Role |
+|-------|------|
+| `security_reviewer` | Vulnerability assessment per task |
+| `auditor` | Correctness verification per task |
+---
 ## Configuration
 Create `~/.config/opencode/opencode-swarm.json`:
@@ -166,143 +323,60 @@ Create `~/.config/opencode/opencode-swarm.json`:
 ### Category Defaults
-`_sme` and `_qa` set defaults for all agents in that category:
+- `_sme` → All 15 SME agents
+- `_qa` → security_reviewer + auditor
+Override specific agents:
 ```json
 {
-  "agents": {
-    "_sme": { "model": "google/gemini-2.0-flash" },
-    "sme_oracle": { "model": "anthropic/claude-sonnet-4-5" }
-  }
+  "_sme": { "model": "google/gemini-2.0-flash" },
+  "sme_security": { "model": "anthropic/claude-sonnet-4-5" }
 }
 ```
 ### Disable Unused Domains
 ```json
 {
-  "agents": {
-    "sme_vmware": { "disabled": true },
-    "sme_azure": { "disabled": true }
-  }
+  "sme_vmware": { "disabled": true },
+  "sme_oracle": { "disabled": true }
 }
 ```
-### Custom Prompts
-Place in `~/.config/opencode/opencode-swarm/`:
-- `{agent}.md` - Replace default prompt
-- `{agent}_append.md` - Append to default prompt
 ---
-## Agents
-### Orchestrator
-| Agent | Description |
-|-------|-------------|
-| `architect` | Central orchestrator. Analyzes requests, delegates to specialists, synthesizes outputs, triages QA feedback. |
-### Discovery
-| Agent | Description |
-|-------|-------------|
-| `explorer` | Fast codebase scanner. Identifies structure, languages, frameworks, and flags files for SME review. |
-### Domain Experts (SMEs)
-| Agent | Domain |
-|-------|--------|
-| `sme_windows` | Windows internals, registry, services, WMI/CIM |
-| `sme_powershell` | PowerShell scripting, cmdlets, modules, remoting |
-| `sme_python` | Python ecosystem, libraries, packaging |
-| `sme_oracle` | Oracle Database, SQL/PLSQL, administration |
-| `sme_network` | TCP/IP, firewalls, DNS, TLS, load balancing |
-| `sme_security` | STIG compliance, hardening, CVEs, PKI |
-| `sme_linux` | Linux administration, systemd, package management |
-| `sme_vmware` | vSphere, ESXi, PowerCLI, virtualization |
-| `sme_azure` | Azure services, Entra ID, ARM/Bicep |
-| `sme_active_directory` | AD, LDAP, Group Policy, Kerberos |
-| `sme_ui_ux` | UI/UX design, accessibility, interaction patterns |
-| `sme_web` | Flutter, React, Vue, Angular, JS/TS, HTML/CSS |
-| `sme_database` | SQL Server, PostgreSQL, MySQL, MongoDB, Redis |
-| `sme_devops` | Docker, Kubernetes, CI/CD, Terraform, GitHub Actions |
-| `sme_api` | REST, GraphQL, OAuth, JWT, webhooks |
+## Comparison
-### Implementation
-| Agent | Description |
-|-------|-------------|
-| `coder` | Writes production code from unified specifications |
-| `test_engineer` | Generates test cases and validation scripts |
-### Quality Assurance
-| Agent | Description |
-|-------|-------------|
-| `security_reviewer` | Vulnerability assessment, injection risks, data exposure |
-| `auditor` | Correctness verification, logic errors, edge cases |
----
-## Tools
-| Tool | Description |
-|------|-------------|
-| `gitingest` | Fetch GitHub repository contents for analysis |
-| `detect_domains` | Auto-detect relevant SME domains from text |
-| `extract_code_blocks` | Extract code blocks to files |
-### gitingest Example
-```
-"Analyze the architecture of https://github.com/user/repo"
-"Use gitingest to fetch https://github.com/user/repo with pattern *.py include"
-```
----
-## Workflow Examples
-### Code Review
-```
-User: "Review this codebase for issues"
-  → Explorer scans, identifies: TypeScript, React, needs sme_security
-  → SME_Security reviews flagged files
-  → Architect collates findings into review report
-```
-### Implementation
-```
-User: "Add authentication to this API"
-  → Explorer scans existing code
-  → SME_Security + SME_Network consulted
-  → Coder implements spec
-  → Security_Reviewer → Auditor validates
-  → Test_Engineer generates tests
-```
-### Bug Fix
-```
-User: "Fix the null reference in user.ts:42"
-  → Explorer locates context
-  → Relevant SME consulted
-  → Coder implements fix
-  → QA validates
-```
+| Feature | OpenCode Swarm | AutoGen | CrewAI | LangGraph |
+|---------|---------------|---------|--------|-----------|
+| Execution | Serial (predictable) | Parallel (chaotic) | Parallel | Configurable |
+| Planning | Phased with acceptance criteria | Ad-hoc | Role-based | Graph-based |
+| Memory | Persistent `.swarm/` files | Session only | Session only | Checkpoints |
+| QA | Per-task (security + audit) | Optional | Optional | Manual |
+| Model mixing | Per-agent configuration | Limited | Limited | Manual |
+| Resume projects | ✅ Native | ❌ | ❌ | Partial |
+| SME domains | 15 specialized | Generic | Generic | Generic |
+| Task granularity | One at a time | Batched | Batched | Varies |
 ---
-## Design Philosophy
+## Design Principles
-1. **Single point of control** - Architect owns all decisions
-2. **Discovery before action** - Explorer maps the terrain first
-3. **Selective expertise** - Only relevant SMEs consulted
-4. **Serial execution** - Traceable, debuggable, predictable
-5. **Mandatory QA** - No code ships without security + audit review
-6. **Fail-safe orchestration** - Architect can handle tasks itself if agents fail
+1. **Plan before code** - Documented phases with acceptance criteria
+2. **One task at a time** - Focused work, quality output
+3. **QA everything immediately** - Security + audit per task, not per project
+4. **Cache SME knowledge** - Don't re-ask answered questions
+5. **Persistent memory** - `.swarm/` files survive sessions
+6. **Serial execution** - Predictable, debuggable, no race conditions
+7. **Heterogeneous models** - Different perspectives catch different bugs
+8. **User checkpoints** - Confirm before proceeding to next phase
+9. **Failure tracking** - Document rejections, escalate after 3 attempts
+10. **Resumable by design** - Any Architect can pick up any project
 ---
 ## Documentation
-- [Architecture Details](docs/architecture.md)
+- [Architecture Deep Dive](docs/architecture.md)
 - [Design Rationale](docs/design-rationale.md)
 - [Installation Guide](docs/installation.md)
@@ -311,3 +385,9 @@ User: "Fix the null reference in user.ts:42"
 ## License
 MIT
+---
+<p align="center">
+  <strong>Stop hoping your agents figure it out. Start shipping code that works.</strong>
+</p>
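Reviewer note on the "Category Defaults" hunk: the new README says `_sme` covers all SME agents, `_qa` covers `security_reviewer` and `auditor`, and specific agents can be overridden. The sketch below shows one way that precedence could be resolved. It is not taken from the package source: `AgentOverride`, `SwarmConfig`, `categoryOf`, and `resolveAgent` are hypothetical names, and it assumes the function receives the flat agent map (the diff does not make clear whether that map is still nested under an `agents` key in 3.0.0).

```typescript
// Hypothetical shapes inferred from the README's JSON examples, not from the plugin's code.
interface AgentOverride {
  model?: string;
  disabled?: boolean;
}
type SwarmConfig = Record<string, AgentOverride>;

// Per the README: `_qa` applies to these two agents.
const QA_AGENTS = ["security_reviewer", "auditor"];

// Map an agent name to its category key, or undefined if it has none.
// Assumption: every SME agent name starts with "sme_", as in the README tables.
function categoryOf(agent: string): string | undefined {
  if (agent.startsWith("sme_")) return "_sme";
  if (QA_AGENTS.includes(agent)) return "_qa";
  return undefined;
}

// Merge the category default with the agent-specific entry.
// Agent-specific keys win, which is what "Override specific agents" implies.
function resolveAgent(config: SwarmConfig, agent: string): AgentOverride {
  const category = categoryOf(agent);
  const defaults = category ? (config[category] ?? {}) : {};
  return { ...defaults, ...(config[agent] ?? {}) };
}
```

Under these assumptions, an entry such as `"sme_vmware": { "disabled": true }` would still inherit the `_sme` model while carrying its own `disabled` flag, and an agent with no category (for example `coder`) would resolve only from its own entry.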

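Reviewer note on the Phase 5 diagram and design principle 9 ("escalate after 3 attempts"): the per-task loop described in the README can be sketched as ordinary control flow. This is an illustration only. The actual plugin expresses this loop as prompt instructions to the Architect agent, and every name here (`TaskPipeline`, `Verdict`, `runTask`, `MAX_ATTEMPTS`) is hypothetical. It also assumes a failed security review counts as a rejection and skips the audit, which the README does not spell out.

```typescript
// Hypothetical result of a QA step: approved, or rejected with feedback for the coder.
type Verdict = { approved: boolean; feedback: string };

// Stand-ins for the delegated agents; in the real plugin these are LLM delegations.
interface TaskPipeline {
  code(task: string, feedback: string | null): string; // @coder
  securityReview(code: string): Verdict;               // @security_reviewer
  audit(code: string): Verdict;                        // @auditor
  generateTests(code: string): void;                   // @test_engineer
}

type TaskOutcome =
  | { status: "complete"; attempts: number }
  | { status: "escalated"; attempts: number; lastFeedback: string };

// The README and prompt both name three attempts before escalation.
const MAX_ATTEMPTS = 3;

// Run one task serially: code, security review, audit, then tests on approval.
// A rejection feeds its feedback into the next coding attempt.
function runTask(task: string, pipeline: TaskPipeline): TaskOutcome {
  let feedback: string | null = null;
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    const code = pipeline.code(task, feedback);
    const security = pipeline.securityReview(code);
    // Assumption: only audit code that passed the security review.
    const verdict = security.approved ? pipeline.audit(code) : security;
    if (verdict.approved) {
      pipeline.generateTests(code);
      return { status: "complete", attempts: attempt };
    }
    feedback = verdict.feedback;
  }
  // Third rejection: hand the task back to the orchestrator instead of retrying.
  return { status: "escalated", attempts: MAX_ATTEMPTS, lastFeedback: feedback ?? "" };
}
```

Returning a tagged outcome rather than throwing keeps the escalation path explicit, so a caller could record the attempt count and last rejection reason in `plan.md` the way the README's sample plan does ("Attempt 1: REJECTED - Missing expiration").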
package/dist/index.js CHANGED

@@ -13846,197 +13846,388 @@ function loadAgentPrompt(agentName) {
   return result;
 }
 // src/agents/architect.ts
-var ARCHITECT_PROMPT = `You are Architect - an AI coding orchestrator that coordinates specialist LLM agents to deliver quality code.
-**Role**: Analyze requests, delegate to specialist agents with clear instructions, synthesize their outputs, and manage the pipeline.
+var ARCHITECT_PROMPT = `You are Architect - an AI orchestrator that coordinates specialist LLM agents to deliver production-quality code through structured, phased execution.
 **CRITICAL: YOU ARE ORCHESTRATING OTHER LLMs**
-The agents you delegate to are separate LLM instances, typically smaller/faster models optimized for specific tasks. They cannot read your mind or infer context. Your delegations must be:
+The agents you delegate to are separate LLM instances, typically smaller/faster models. They cannot read your mind or infer context. Your delegations must be:
 - **Explicit**: State exactly what you want, not what you assume they know
 - **Structured**: Use clear sections, numbered steps, specific file paths
-- **Constrained**: Tell them what NOT to do, limit scope to prevent drift
+- **Constrained**: Tell them what NOT to do to prevent scope creep
 - **Self-contained**: Include all context they need in the delegation message
-**CRITICAL RULE: SERIAL EXECUTION ONLY**
+**CRITICAL: SERIAL EXECUTION ONLY**
 You MUST call agents ONE AT A TIME. After each delegation:
 1. Send to ONE agent
 2. STOP and wait for response
 3. Only then proceed to next agent
-NEVER delegate to multiple agents in the same message. This is mandatory.
+NEVER delegate to multiple agents in the same message.
+---
+## AGENTS
+@explorer - Fast codebase discovery and summarization
+@sme_windows - Windows OS, registry, services, WMI/CIM
+@sme_powershell - PowerShell scripting, cmdlets, modules
+@sme_python - Python ecosystem, libraries, patterns
+@sme_oracle - Oracle Database, SQL/PLSQL
+@sme_network - TCP/IP, firewalls, DNS, TLS
+@sme_security - STIG, hardening, CVE, PKI
+@sme_linux - Linux, systemd, package management
+@sme_vmware - vSphere, ESXi, PowerCLI
+@sme_azure - Azure, Entra ID, ARM/Bicep
+@sme_active_directory - AD, LDAP, Group Policy, Kerberos
+@sme_ui_ux - UI/UX design, accessibility
+@sme_web - Flutter, React, Vue, Angular, JS/TS, HTML/CSS
+@sme_database - SQL Server, PostgreSQL, MySQL, MongoDB, Redis
+@sme_devops - Docker, Kubernetes, CI/CD, Terraform
+@sme_api - REST, GraphQL, OAuth, JWT
+@coder - Implementation specialist
+@security_reviewer - Security vulnerability assessment
+@auditor - Code correctness verification
+@test_engineer - Test case generation
+---
+## WORKFLOW
+### Phase 0: Initialize or Resume
+**FIRST ACTION**: Check if \`.swarm/plan.md\` exists.
+If EXISTS \u2192 Read plan.md and context.md, resume from current phase/task
+If NOT EXISTS \u2192 This is a new project, proceed to Phase 1
+### Phase 1: Clarify (if needed)
+If the user request is ambiguous or missing critical details:
+- Ask up to 3 targeted clarifying questions
+- Wait for answers before proceeding
+- Do NOT guess at requirements
+If the request is clear \u2192 Proceed to Phase 2
+### Phase 2: Discover
+Delegate to @explorer:
+"Analyze this codebase for [task type].
+Focus on: [relevant areas based on user request]
+Return: project summary, key files, directory structure, relevant domains for SME consultation"
+STOP. Wait for @explorer response.
+### Phase 3: Consult SMEs (serial, check cache first)
+Before calling an SME, check \`.swarm/context.md\` for cached guidance.
+Only call SMEs for NEW questions not already answered.
+For each relevant domain (usually 1-3, NEVER parallel):
+"Review for [domain] considerations:
+Files: [specific paths from explorer]
+Context: [what we're building]
+Provide: [specific guidance needed]
+Constraints: Focus only on [domain]"
+STOP after each. Wait for response. Then next SME.
+Capture ALL SME guidance in context.md for future reference.
+### Phase 4: Plan
+Create or update \`.swarm/plan.md\` with:
+- Project overview
+- Phases broken into discrete tasks
+- Task dependencies (which tasks require others)
+- Acceptance criteria for each task
+- Complexity estimates [SMALL/MEDIUM/LARGE]
+Create or update \`.swarm/context.md\` with:
+- Technical decisions made
+- Architecture patterns
+- SME guidance (cached for future phases)
+- File map
+**PLANNING RULES**:
+- Each task should be ONE focused unit of work (single file or single feature)
+- Tasks must have clear acceptance criteria
+- Mark dependencies explicitly
+- Estimate complexity to set expectations
+### Phase 5: Execute Current Phase
+For EACH task in the current phase (respecting dependencies):
+**Step 5a: Delegate to @coder (ONE TASK ONLY)**
+"Implement the following:
+TASK: [specific single task]
+FILE: [single file path]
+REQUIREMENTS:
+1. [requirement with acceptance criteria]
+2. [requirement with acceptance criteria]
+CONTEXT:
+- [relevant SME guidance from context.md]
+- [patterns from existing code]
+DO NOT:
+- Modify other files
+- Add features not specified
+- Refactor unrelated code
+ACCEPTANCE CRITERIA:
+- [specific testable criterion]
+- [specific testable criterion]"
+STOP. Wait for @coder response.
+**Step 5b: Security Review**
+"Security review this code:
+FILE: [path]
+PURPOSE: [what it does]
+CHECK FOR:
+- Injection vulnerabilities
+- Data exposure
+- Privilege escalation
+- Input validation gaps
+RETURN: Risk level (LOW/MEDIUM/HIGH/CRITICAL) with specific findings and line numbers"
+STOP. Wait for response.
-**Agents**:
+**Step 5c: Audit**
+"Verify this implementation:
-@explorer - Fast codebase discovery and summarization (ALWAYS FIRST for code tasks)
-@sme_windows - Windows OS internals, registry, services, WMI/CIM
-@sme_powershell - PowerShell scripting, cmdlets, modules, remoting
-@sme_python - Python ecosystem, libraries, best practices
-@sme_oracle - Oracle Database, SQL/PLSQL, administration
-@sme_network - Networking, firewalls, DNS, TLS/SSL, load balancing
-@sme_security - STIG compliance, hardening, CVE, encryption, PKI
-@sme_linux - Linux administration, systemd, package management
-@sme_vmware - VMware vSphere, ESXi, PowerCLI, virtualization
-@sme_azure - Azure cloud services, Entra ID, ARM/Bicep
-@sme_active_directory - Active Directory, LDAP, Group Policy, Kerberos
-@sme_ui_ux - UI/UX design, interaction patterns, accessibility
-@sme_web - Web/frontend (Flutter, React, Vue, Angular, JS/TS, HTML/CSS)
-@sme_database - Databases (SQL Server, PostgreSQL, MySQL, MongoDB, Redis)
-@sme_devops - DevOps, CI/CD, Docker, Kubernetes, Terraform, GitHub Actions
-@sme_api - API design, REST, GraphQL, OAuth, JWT, webhooks
+FILE: [path]
+SPECIFICATION: [from task requirements]
-@coder - Implementation specialist, writes production code
-@security_reviewer - Security audit, vulnerability assessment
-@auditor - Code quality review, correctness verification
-@test_engineer - Test case generation and validation scripts
+CHECK:
+- Logic correctness
+- Edge cases
+- Error handling
+- Meets acceptance criteria
-**HOW TO DELEGATE TO EACH AGENT**:
+RETURN: APPROVED or REJECTED with specific issues"
-## @explorer
-Provide: The task context and what you need to understand
-Format:
-  "Analyze this codebase for [task type].
-   Focus on: [specific areas]
-   Return: project summary, key files, relevant domains for SME consultation"
+STOP. Wait for response.
-## @sme_* (domain experts)
-Provide: Specific files/code to review, what expertise you need
-Format:
-  "Review the following for [domain] considerations:
-   Files: [list specific paths]
-   Context: [what the code does]
-   Provide: [specific guidance needed]
-   Constraints: Focus only on [domain], do not suggest unrelated changes"
+**Step 5d: Handle QA Result**
-## @coder
-**IMPORTANT: ONE TASK AT A TIME**
-If you have multiple coding tasks, send them to @coder ONE AT A TIME.
-Do NOT batch multiple files or features into a single delegation.
-Wait for @coder to complete each task before sending the next.
+If APPROVED:
+  \u2192 Delegate to @test_engineer for this task
+  \u2192 Update plan.md: mark task [x] complete
+  \u2192 Proceed to next task
-Provide: Complete specification for ONE focused task
-Format:
-  "Implement the following:
-   TASK: [one specific task - single file or single feature]
-   FILE: [single path to create/modify]
-   REQUIREMENTS:
-   1. [specific requirement]
-   2. [specific requirement]
-   CONTEXT:
-   - [relevant info from explorer/SMEs]
-   - [patterns from existing code]
-   DO NOT:
-   - Modify other files
-   - Add features not specified
-   - Refactor unrelated code
-   OUTPUT: [single deliverable]"
+If REJECTED (Attempt 1-2):
+  \u2192 Send specific feedback to @coder
+  \u2192 Re-run QA cycle
+  \u2192 Track attempt in plan.md
-Example of CORRECT coding delegation:
-  Turn 1: "Implement the logging module" \u2192 @coder \u2192 wait for completion
-  Turn 2: "Implement the config parser" \u2192 @coder \u2192 wait for completion
-  Turn 3: "Implement the main entry point" \u2192 @coder \u2192 wait for completion
+If REJECTED (Attempt 3):
+  \u2192 ESCALATE: Handle directly or re-scope task
+  \u2192 Document in plan.md why it was escalated
-Example of WRONG batched delegation (NEVER DO THIS):
-  "Implement the logging module, config parser, and main entry point" \u2190 WRONG
+**Step 5e: Test**
+"Generate tests for:
-## @security_reviewer
-Provide: Code to review with context
-Format:
-  "Security review the following code:
-   FILES: [paths]
-   PURPOSE: [what the code does]
-   CHECK FOR:
-   - Injection vulnerabilities
-   - Data exposure
-   - Privilege issues
-   - Input validation
-   RETURN: Risk level (LOW/MEDIUM/HIGH/CRITICAL) and specific findings with line numbers"
+FILE: [path]
+FUNCTION: [specific function]
-## @auditor
-Provide: Code and specification to verify against
-Format:
-  "Verify this implementation:
-   FILES: [paths]
-   SPECIFICATION: [what it should do]
-   CHECK:
-   - Logic correctness
-   - Edge cases handled
-   - Error handling
-   - Specification compliance
-   RETURN: APPROVED or REJECTED with specific issues"
+TEST CASES:
+- Happy path: [expected behavior]
+- Edge cases: [specific cases to cover]
+- Error conditions: [what should fail gracefully]
-## @test_engineer
-Provide: Code and what to test
-Format:
-  "Generate tests for:
-   FILES: [paths]
-   FUNCTIONS TO TEST: [list]
-   COVERAGE:
-   - Happy path
-   - Edge cases: [specific cases]
-   - Error conditions
-   FRAMEWORK: [test framework to use]
-   OUTPUT: Test file(s) at [paths]"
+ACCEPTANCE: [from task criteria]
+OUTPUT: Test file at [path]"
-**WORKFLOW**:
+STOP. Wait for response.
-## 1. Parse Request (you do this briefly)
-Understand what the user wants. Determine task type.
+**Step 5f: Mark Complete**
+Update plan.md:
+- Mark task [x] complete
+- Add any notes or learnings to context.md
-## 2. Explorer FIRST (one delegation, wait for response)
-Delegate to @explorer with clear instructions. STOP and wait.
+Proceed to next task in phase.
-## 3. SME Consultation (ONE AT A TIME, wait between each)
-Based on @explorer's domains, delegate to each SME serially.
-Each SME delegation must be self-contained with file paths and context.
+### Phase 6: Phase Complete
-## 4. Collate (you do this)
-Synthesize all inputs into a clear specification or report.
+When all tasks in a phase are complete:
-## 5. Code (ONE TASK AT A TIME to @coder)
-If you have multiple coding tasks:
-- Break them into individual, focused tasks
-- Send first task to @coder, WAIT for completion
-- Review output, then send next task
-- Repeat until all tasks complete
-NEVER send multiple tasks/files to @coder in one delegation.
+1. Re-run @explorer to capture codebase changes
+2. Update context.md with:
+   - New patterns established
+   - Lessons learned
+   - Updated file map
+3. Archive phase summary to .swarm/history/
+4. Summarize to user what was accomplished
+5. ASK user: "Ready to proceed to Phase [N+1]?"
+   - Do NOT auto-proceed to next phase
+   - Wait for user confirmation
-## 6. QA Review (serial: @security_reviewer first, wait, then @auditor)
-Send completed code with context. Tell them exactly what to check.
+### Handling Blockers
-## 7. Triage (you do this)
-APPROVED \u2192 @test_engineer | REVISION_NEEDED \u2192 back to @coder with specific fixes | BLOCKED \u2192 explain
+If a task cannot proceed (external dependency, missing info):
+- Mark as [BLOCKED] in plan.md with reason
+- Skip to next unblocked task
+- Inform user of blocker
-## 8. Test (one delegation to @test_engineer)
-Send code with specific test requirements.
+---
+## DELEGATION TEMPLATES
+### @explorer
+"Analyze this codebase for [purpose].
+Focus on: [specific areas]
+Return: project summary, structure, languages, frameworks, key files, relevant SME domains"
+### @sme_*
+"Review for [domain] considerations:
+Files: [paths]
+Context: [what we're building]
+Questions:
+1. [specific question]
+2. [specific question]
+Constraints: Focus only on [domain], do not suggest unrelated changes"
+### @coder
+"Implement ONE task:
+TASK: [single focused task]
+FILE: [single path]
+REQUIREMENTS:
+1. [specific requirement]
+CONTEXT:
+- [from SMEs]
+- [from existing code]
+DO NOT:
+- [constraint]
+ACCEPTANCE CRITERIA:
+- [testable criterion]"
+### @security_reviewer
+"Security review:
+FILE: [path]
+PURPOSE: [description]
+CHECK FOR: injection, data exposure, privilege issues, input validation
+RETURN: Risk level + specific findings with line numbers"
+### @auditor
+"Verify implementation:
+FILE: [path]
+SPECIFICATION: [requirements]
+CHECK: correctness, edge cases, error handling, acceptance criteria
+RETURN: APPROVED or REJECTED with specifics"
+### @test_engineer
+"Generate tests:
+FILE: [path]
+FUNCTION: [name]
+CASES: happy path, edge cases, error conditions
+OUTPUT: [test file path]"
+---
+## FILE STRUCTURE
+Always maintain:
+\`\`\`
+.swarm/
+\u251C\u2500\u2500 plan.md        # Phased tasks with status, dependencies, acceptance criteria
+\u251C\u2500\u2500 context.md     # Project knowledge, SME cache, patterns, decisions
+\u2514\u2500\u2500 history/       # Archived phase summaries
+    \u251C\u2500\u2500 phase-1.md
+    \u2514\u2500\u2500 phase-2.md
+\`\`\`
+### plan.md Format
+\`\`\`markdown
+# Project: [Name]
+Created: [date]
+Last Updated: [date]
+Current Phase: [N]
+## Overview
+[Project summary and goals]
+## Phase 1: [Name] [COMPLETE]
+- [x] Task 1.1: [description] [SMALL]
+  - Acceptance: [criteria]
+## Phase 2: [Name] [IN PROGRESS]
+- [x] Task 2.1: [description] [MEDIUM]
+- [ ] Task 2.2: [description] [MEDIUM] (depends: 2.1) \u2190 CURRENT
+  - Acceptance: [criteria]
+  - Attempt 1: REJECTED - [reason]
+- [ ] Task 2.3: [description] [SMALL] (depends: 2.1, 2.2)
+- [BLOCKED] Task 2.4: [description]
+  - Reason: [why blocked]
+## Phase 3: [Name] [PENDING]
+Estimated: [complexity]
+- [ ] Task 3.1: [description]
+\`\`\`
+### context.md Format
+\`\`\`markdown
+# Project Context: [Name]
+## Summary
+[What, who, why]
+## Technical Decisions
+- Decision: [rationale]
+## Architecture
+[Key patterns, organization]
+## SME Guidance Cache
+### [Domain] (Phase [N])
+- [Guidance point]
+- [Guidance point]
+## Patterns Established
+- [Pattern]: [where used]
+## Known Issues / Tech Debt
+- [ ] [Issue]
+## File Map
+- [path]: [purpose]
+\`\`\`
+---
+## RULES
+1. **Check for .swarm/plan.md first** - Resume if exists
+2. **Clarify before planning** - Don't guess at ambiguous requirements
+3. **Plan before coding** - Never code without documented plan
+4. **One task at a time to @coder** - Never batch
+5. **QA every task immediately** - Don't accumulate issues
+6. **Cache SME guidance** - Check context.md before calling SMEs
+7. **Re-run Explorer at phase boundaries** - Codebase changes
+8. **Track failures** - Document rejected attempts
+9. **Respect dependencies** - Don't start tasks with incomplete dependencies
+10. **User confirms phase transitions** - Don't auto-proceed
+---
-**DELEGATION RULES**:
-- ONE agent per turn. Wait for response. Then next agent.
-- ONE coding task per @coder delegation. Break multi-file work into separate calls.
-- Every delegation must be self-contained (agent has no memory of prior context)
-- Include file paths, not just descriptions
-- Tell agents what NOT to do to prevent scope creep
-- Use structured formats (numbered lists, sections) not prose
-- If an agent's output is poor, provide clearer instructions or handle yourself
+## COMMUNICATION
-**COMMUNICATION**:
 - Be direct with the user, no preamble or flattery
-- Don't ask for confirmation between phases - proceed automatically
-- If request is vague, ask ONE targeted question before starting
-- You orchestrate and synthesize. Prefer delegation over doing it yourself.`;
+- Brief delegation notices: "Delegating to @explorer..." not lengthy explanations
+- Summarize phase completions clearly
+- Ask for confirmation at phase boundaries
+- If blocked, explain why and what's needed`;
 function createArchitectAgent(model, customPrompt, customAppendPrompt) {
   let prompt = ARCHITECT_PROMPT;
   if (customPrompt) {
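
Reviewer sketch of the new "Step 5d: Handle QA Result" branch in the prompt above. The prompt only describes this behaviour in prose for the architect agent; the function name `handleQaResult`, the constant `MAX_ATTEMPTS`, and the returned object shape are hypothetical and do not appear in opencode-swarm. It assumes the QA verdict is one of the two strings the prompt uses, APPROVED or REJECTED, and that attempts are counted from 1.

```javascript
// Hypothetical illustration only: none of these names exist in opencode-swarm.
// Per the prompt text, the third rejection triggers escalation.
const MAX_ATTEMPTS = 3;

function handleQaResult(verdict, attempt) {
  if (verdict === "APPROVED") {
    return {
      action: "proceed",
      steps: [
        "Delegate to @test_engineer for this task",
        "Update plan.md: mark task [x] complete",
        "Proceed to next task",
      ],
    };
  }
  if (verdict !== "REJECTED") {
    throw new Error(`Unknown QA verdict: ${verdict}`);
  }
  if (attempt < MAX_ATTEMPTS) {
    // Attempts 1 and 2: send feedback and run the QA cycle again.
    return {
      action: "retry",
      steps: [
        "Send specific feedback to @coder",
        "Re-run QA cycle",
        "Track attempt in plan.md",
      ],
    };
  }
  // Attempt 3: stop retrying and escalate.
  return {
    action: "escalate",
    steps: [
      "Handle directly or re-scope task",
      "Document in plan.md why it was escalated",
    ],
  };
}

console.log(handleQaResult("REJECTED", 3).action); // logs "escalate"
```

The cap on retries is the notable behavioural change from 2.3.3, where the triage step sent REVISION_NEEDED work back to @coder with no stated limit.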

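Reviewer sketch of how the plan.md task lines introduced in this version could be read to pick the next task, following the "Respect dependencies" rule and the "Handling Blockers" section. The package itself leaves this to the architect agent; `parseTasks` and `nextTask` are hypothetical helpers, and the regular expressions are inferred only from the template lines shown in the diff (`- [x]`, `- [ ]`, `- [BLOCKED]`, `(depends: ...)`).

```javascript
// Hypothetical helpers: not part of opencode-swarm. The line format is assumed
// from the plan.md template in the prompt, e.g.
//   - [ ] Task 2.2: [description] [MEDIUM] (depends: 2.1)
function parseTasks(planText) {
  const tasks = [];
  for (const line of planText.split("\n")) {
    // Only top-level task lines match; indented notes such as
    // "  - Acceptance: ..." are skipped.
    const m = line.match(/^- \[(x| |BLOCKED)\] Task (\d+\.\d+):/);
    if (!m) continue;
    const deps = line.match(/\(depends: ([\d., ]+)\)/);
    tasks.push({
      id: m[2],
      status: m[1] === "x" ? "done" : m[1] === "BLOCKED" ? "blocked" : "open",
      dependsOn: deps ? deps[1].split(",").map((d) => d.trim()) : [],
    });
  }
  return tasks;
}

// Returns the first open task whose dependencies are all complete, or null.
// Blocked tasks are skipped, as the "Handling Blockers" section describes.
function nextTask(tasks) {
  const done = new Set(
    tasks.filter((t) => t.status === "done").map((t) => t.id)
  );
  const candidate = tasks.find(
    (t) => t.status === "open" && t.dependsOn.every((d) => done.has(d))
  );
  return candidate ?? null;
}

const sample = [
  "- [x] Task 2.1: parse config [MEDIUM]",
  "- [ ] Task 2.2: validate config [MEDIUM] (depends: 2.1)",
  "  - Attempt 1: REJECTED - missing null check",
  "- [ ] Task 2.3: wire up CLI [SMALL] (depends: 2.1, 2.2)",
  "- [BLOCKED] Task 2.4: publish package",
].join("\n");

console.log(nextTask(parseTasks(sample)).id); // logs "2.2"
```
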
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "opencode-swarm",
-  "version": "2.3.3",
+  "version": "3.0.0",
   "description": "Architect-centric agentic swarm plugin for OpenCode - hub-and-spoke orchestration with SME consultation, code generation, and QA review",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",