npm - swarm-engine - Versions diffs - 1.43.0 → 1.54.0 - Mend

swarm-engine 1.43.0 → 1.54.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (165)

package/README.md +145 -42
package/agents/implementer.md +1 -14
package/agents/orchestrator.md +127 -246
package/agents/reviewer.md +3 -18
package/dist/cli/commands/agents.js +13 -1
package/dist/cli/commands/agents.js.map +1 -1
package/dist/cli/commands/dashboard.d.ts +3 -0
package/dist/cli/commands/dashboard.d.ts.map +1 -0
package/dist/cli/commands/dashboard.js +43 -0
package/dist/cli/commands/dashboard.js.map +1 -0
package/dist/cli/commands/orchestrate.d.ts.map +1 -1
package/dist/cli/commands/orchestrate.js +17 -0
package/dist/cli/commands/orchestrate.js.map +1 -1
package/dist/cli/commands/run.d.ts.map +1 -1
package/dist/cli/commands/run.js +20 -1
package/dist/cli/commands/run.js.map +1 -1
package/dist/cli/commands/status.d.ts +1 -0
package/dist/cli/commands/status.d.ts.map +1 -1
package/dist/cli/commands/status.js +5 -2
package/dist/cli/commands/status.js.map +1 -1
package/dist/cli/commands/usage.d.ts +3 -0
package/dist/cli/commands/usage.d.ts.map +1 -0
package/dist/cli/commands/usage.js +85 -0
package/dist/cli/commands/usage.js.map +1 -0
package/dist/cli/index.js +2 -0
package/dist/cli/index.js.map +1 -1
package/dist/core/event-bus.d.ts.map +1 -1
package/dist/core/event-bus.js +4 -1
package/dist/core/event-bus.js.map +1 -1
package/dist/core/types.d.ts +44 -1
package/dist/core/types.d.ts.map +1 -1
package/dist/core/types.js +4 -0
package/dist/core/types.js.map +1 -1
package/dist/hooks/cli.js +4 -1
package/dist/hooks/cli.js.map +1 -1
package/dist/hooks/index.d.ts +9 -0
package/dist/hooks/index.d.ts.map +1 -1
package/dist/hooks/index.js +54 -0
package/dist/hooks/index.js.map +1 -1
package/dist/hooks/transcript-parser.d.ts +68 -0
package/dist/hooks/transcript-parser.d.ts.map +1 -0
package/dist/hooks/transcript-parser.js +163 -0
package/dist/hooks/transcript-parser.js.map +1 -0
package/dist/hooks/usage-logger.d.ts +30 -0
package/dist/hooks/usage-logger.d.ts.map +1 -0
package/dist/hooks/usage-logger.js +67 -0
package/dist/hooks/usage-logger.js.map +1 -0
package/dist/index.d.ts +16 -1
package/dist/index.d.ts.map +1 -1
package/dist/index.js +10 -1
package/dist/index.js.map +1 -1
package/dist/memory/index.js +2 -2
package/dist/memory/index.js.map +1 -1
package/dist/runtime/agent-runner.d.ts +9 -0
package/dist/runtime/agent-runner.d.ts.map +1 -1
package/dist/runtime/agent-runner.js +167 -11
package/dist/runtime/agent-runner.js.map +1 -1
package/dist/runtime/backends/claude.d.ts +30 -0
package/dist/runtime/backends/claude.d.ts.map +1 -1
package/dist/runtime/backends/claude.js +112 -2
package/dist/runtime/backends/claude.js.map +1 -1
package/dist/runtime/backends/codex.d.ts.map +1 -1
package/dist/runtime/backends/codex.js +4 -0
package/dist/runtime/backends/codex.js.map +1 -1
package/dist/runtime/backends/gemini.d.ts.map +1 -1
package/dist/runtime/backends/gemini.js +4 -0
package/dist/runtime/backends/gemini.js.map +1 -1
package/dist/runtime/benefits.d.ts +81 -1
package/dist/runtime/benefits.d.ts.map +1 -1
package/dist/runtime/benefits.js +199 -12
package/dist/runtime/benefits.js.map +1 -1
package/dist/runtime/cache-optimizer.d.ts +7 -3
package/dist/runtime/cache-optimizer.d.ts.map +1 -1
package/dist/runtime/cache-optimizer.js +11 -7
package/dist/runtime/cache-optimizer.js.map +1 -1
package/dist/runtime/compaction.d.ts +6 -1
package/dist/runtime/compaction.d.ts.map +1 -1
package/dist/runtime/compaction.js +39 -2
package/dist/runtime/compaction.js.map +1 -1
package/dist/runtime/cost-model.d.ts.map +1 -1
package/dist/runtime/cost-model.js +20 -17
package/dist/runtime/cost-model.js.map +1 -1
package/dist/runtime/engine.d.ts +2 -0
package/dist/runtime/engine.d.ts.map +1 -1
package/dist/runtime/engine.js +162 -3
package/dist/runtime/engine.js.map +1 -1
package/dist/runtime/graph-discovery.js +2 -2
package/dist/runtime/graph-discovery.js.map +1 -1
package/dist/runtime/graph-trajectory.js +3 -3
package/dist/runtime/graph-trajectory.js.map +1 -1
package/dist/runtime/handoff.d.ts +8 -0
package/dist/runtime/handoff.d.ts.map +1 -0
package/dist/runtime/handoff.js +109 -0
package/dist/runtime/handoff.js.map +1 -0
package/dist/runtime/heuristics.d.ts +2 -1
package/dist/runtime/heuristics.d.ts.map +1 -1
package/dist/runtime/heuristics.js +15 -2
package/dist/runtime/heuristics.js.map +1 -1
package/dist/runtime/lsp.d.ts.map +1 -1
package/dist/runtime/lsp.js +4 -0
package/dist/runtime/lsp.js.map +1 -1
package/dist/runtime/mcp.d.ts +1 -0
package/dist/runtime/mcp.d.ts.map +1 -1
package/dist/runtime/mcp.js +38 -0
package/dist/runtime/mcp.js.map +1 -1
package/dist/runtime/model-router.d.ts +2 -1
package/dist/runtime/model-router.d.ts.map +1 -1
package/dist/runtime/model-router.js +8 -1
package/dist/runtime/model-router.js.map +1 -1
package/dist/runtime/output-summarizer.d.ts +45 -0
package/dist/runtime/output-summarizer.d.ts.map +1 -0
package/dist/runtime/output-summarizer.js +171 -0
package/dist/runtime/output-summarizer.js.map +1 -0
package/dist/runtime/plugins.d.ts +5 -1
package/dist/runtime/plugins.d.ts.map +1 -1
package/dist/runtime/plugins.js +14 -2
package/dist/runtime/plugins.js.map +1 -1
package/dist/runtime/prompt-tier.d.ts +33 -0
package/dist/runtime/prompt-tier.d.ts.map +1 -0
package/dist/runtime/prompt-tier.js +105 -0
package/dist/runtime/prompt-tier.js.map +1 -0
package/dist/runtime/savings-manifest.d.ts +66 -0
package/dist/runtime/savings-manifest.d.ts.map +1 -0
package/dist/runtime/savings-manifest.js +103 -0
package/dist/runtime/savings-manifest.js.map +1 -0
package/dist/runtime/sharing.js +2 -1
package/dist/runtime/sharing.js.map +1 -1
package/dist/runtime/stats.d.ts +2 -0
package/dist/runtime/stats.d.ts.map +1 -1
package/dist/runtime/stats.js +23 -3
package/dist/runtime/stats.js.map +1 -1
package/dist/runtime/violation-tracker.d.ts +31 -0
package/dist/runtime/violation-tracker.d.ts.map +1 -0
package/dist/runtime/violation-tracker.js +125 -0
package/dist/runtime/violation-tracker.js.map +1 -0
package/dist/utils/project-config.d.ts +20 -0
package/dist/utils/project-config.d.ts.map +1 -1
package/dist/utils/project-config.js +46 -1
package/dist/utils/project-config.js.map +1 -1
package/dist/utils/redact.d.ts.map +1 -1
package/dist/utils/redact.js +5 -1
package/dist/utils/redact.js.map +1 -1
package/dist/web/bridge.d.ts +47 -0
package/dist/web/bridge.d.ts.map +1 -0
package/dist/web/bridge.js +267 -0
package/dist/web/bridge.js.map +1 -0
package/dist/web/graph-api.d.ts +19 -0
package/dist/web/graph-api.d.ts.map +1 -0
package/dist/web/graph-api.js +157 -0
package/dist/web/graph-api.js.map +1 -0
package/dist/web/index.d.ts +21 -0
package/dist/web/index.d.ts.map +1 -0
package/dist/web/index.js +38 -0
package/dist/web/index.js.map +1 -0
package/dist/web/public/index.html +1304 -0
package/dist/web/public/public/index.html +1307 -0
package/dist/web/server.d.ts +24 -0
package/dist/web/server.d.ts.map +1 -0
package/dist/web/server.js +113 -0
package/dist/web/server.js.map +1 -0
package/dist/web/tray.d.ts +23 -0
package/dist/web/tray.d.ts.map +1 -0
package/dist/web/tray.js +205 -0
package/dist/web/tray.js.map +1 -0
package/package.json +1 -1

package/README.md CHANGED

@@ -3,13 +3,11 @@
 [![npm](https://img.shields.io/npm/v/swarm-engine)](https://www.npmjs.com/package/swarm-engine)
 [![Node](https://img.shields.io/node/v/swarm-engine)](https://nodejs.org)
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
-[![Tests](https://img.shields.io/badge/tests-1278%20passing-brightgreen)]()
+[![Tests](https://img.shields.io/badge/tests-1848%20passing-brightgreen)]()
-**Your agents. Orchestrated.**
+Multi-agent orchestration for AI coding tools. Coordinates Claude, Codex, and Gemini through research, implementation, and review phases. 20-35% token reduction on parallel workflows (measured per-run), cost transparency after every orchestration, and a knowledge graph that sharpens context routing with each run. 1,848 tests. 14-finding security audit. MIT licensed.
-The first self-aware AI orchestration engine. A 16-module ML pipeline -- built in pure TypeScript with zero external ML dependencies -- that predicts its own failures, explains them causally, and evolves its own rules. Underneath: 26 agents, 7 composable patterns, and a persistent knowledge graph that compounds intelligence across every run.
-Works with Claude Code, OpenAI Codex, Google Gemini CLI, and Vercel AI SDK. Mix models across agents in the same orchestration.
+26 agents, 7 composable patterns, pure TypeScript, zero external ML dependencies. Works with Claude Code, OpenAI Codex, Google Gemini CLI, and Vercel AI SDK. Mix models across agents in the same orchestration.
 ## What It Looks Like
@@ -37,25 +35,41 @@ Works with Claude Code, OpenAI Codex, Google Gemini CLI, and Vercel AI SDK. Mix
   12.8K tokens │ $0.24 │ 1m 36s  ~2m remaining
 ```
+When it finishes, you see exactly what the engine did for you:
 ```
-  +---------------------------------------------------------+
-  |                                                         |
-  |  Orchestration complete                                 |
-  |                                                         |
-  |  Pattern:  hybrid (3 phases, 6 agents)                  |
-  |  Duration: 3m 42s                                       |
-  |  Tokens:   47.2K                                        |
-  |  Cost:     $0.3814                                      |
-  |  Tools:    142 calls                                    |
-  |                                                         |
-  |  Changes:                                               |
-  |    src/middleware/rate-limit.ts        48 +++           |
-  |    src/routes/users.ts                3 +-              |
-  |    tests/middleware/rate-limit.test.ts 62 +++           |
-  |                                                         |
-  +---------------------------------------------------------+
+┌─ Engine Benefits ──────────────────────────────────────────────────────┐
+│  Cost: $2.8400  (avg: $3.6200, saved $0.7800)  |  28% tokens saved   │
+├────────────────────────────────────────────────────────────────────────┤
+│  TOKEN EFFICIENCY                                                      │
+│  ├─ Smart routing         -18,400 tok     filtered phase outputs       │
+│  ├─ Verbatim compaction   -12,800 tok     replaced with refs           │
+│  ├─ Context decay          -6,200 tok     older phases summarized      │
+│  ├─ Prompt diet (balanced) -4,100 tok     trimmed agent prompts        │
+│  ├─ Tool search deferred  -14,000 tok     ENABLE_TOOL_SEARCH=true      │
+│  └─ TOTAL SAVED           -55,500 tok     28% reduction                │
+│                                                                        │
+│  CACHE                                                                 │
+│  ├─ Cache read tokens     142,800         reused from prior turns      │
+│  ├─ Cache creation         28,400         new cache entries            │
+│  └─ Cache hit rate         83.4%          no cliff events detected     │
+│                                                                        │
+│  KNOWLEDGE GRAPH                                                       │
+│  ├─ Context routing       8               graph-optimized per agent    │
+│  ├─ Confidence gates      2 eval          all passed                   │
+│  └─ Pattern history       91% success     14 runs                      │
+│                                                                        │
+│  ML / GNN                                                              │
+│  └─ Predictive dropout    1 agents        -5,000 tok saved             │
+│                                                                        │
+│  ADAPTIVE                                                              │
+│  ├─ Model downgrades      2               cheaper where safe           │
+│  └─ Living spec           3 updates       refined during run           │
+└────────────────────────────────────────────────────────────────────────┘
 ```
+The more you use it, the richer this gets -- the knowledge graph, GNN predictions, cache baselines, and historical cost averages build over time.
 ## Install
 **npm (recommended):**
@@ -202,24 +216,35 @@ import {
   TaskDiscovery,          // Mines failure patterns for tasks
   OrchestrationEmbedder,  // Topology embeddings for transfer learning
 } from 'swarm-engine';
+// Token Compression & Benefits
+import {
+  VerbatimCompactor,       // Replace file reads, diffs, stack traces with refs
+  ContextDecayManager,     // Hierarchical time-decay summarization
+  ACONOptimizer,           // Failure-driven compression guidelines
+  PromptCompressor,        // Strip markdown boilerplate from prompts
+  getOutputSchema,         // Structured JSON schemas per agent type
+  BenefitsCollector,       // Aggregate optimization metrics
+  formatBenefitsTable,     // Render styled benefits summary
+  createOutputSummarizerHook,  // PostToolUse hook for Bash output reduction
+} from 'swarm-engine';
 ```
 See [src/index.ts](src/index.ts) for the full export surface.
 ## Why Swarm Engine
-Tools like Claude Code already let you spawn parallel agents with teams. That's powerful infrastructure. Swarm Engine builds on top of it with the parts you'd otherwise have to figure out yourself: which agents to run, in what order, with what prompts, on which models, and how to learn from the results.
+Tools like Claude Code already let you spawn parallel agents with teams. Swarm Engine adds the orchestration layer: which agents to run, in what order, with what context, on which models, and how to learn from the results.
-It gives you composable patterns, cost-aware planning, specialized agent definitions, and a memory system that improves with every run. Think of it as the orchestration layer that turns ad-hoc multi-agent work into repeatable workflows.
-- **Self-aware ML pipeline** - the engine predicts its own failures (GNN), explains them causally (do-calculus), and evolves its own replanning rules. All in pure TypeScript, zero external ML dependencies.
-- **16 graph modules across 3 tiers** - persistent knowledge graph, causal inference, GNN failure propagation, trajectory prediction, orchestration embeddings, and self-evolving rules.
-- **7 composable patterns** - hybrid, TDD, red-team, spike, discover, review-cycle, research. Compose them: `--pattern "tdd | red-team"`. Plus 12 slash commands including postmortem, diff-review, and fix-pr.
-- **Intelligent planner** - cost-based optimization, adaptive execution, learns from every run
-- **Mix any backend** - Claude for implementation, Codex for review, Gemini for research. Different model per agent.
-- **Reusable templates** - save successful workflows, run them again with different parameters
-- **26 specialized agents** - 16 core roles plus 10 focused reviewers (security, performance, data integrity, API contracts, testing, accessibility, dependencies, error handling, concurrency, documentation)
-- **1,278 tests across 76 files** - 4 rounds of security review, clean on all adversarial attack vectors
+- **20-35% token reduction** — measured per-run on typical parallel orchestrations. Cross-phase context filtering, verbatim compaction, context decay, prompt diet (3 modes), and tool schema deferral (~14K saved per session). No configuration needed. Cache-aware: tracks hit rates and detects cliff events.
+- **Knowledge graph improves with every run** — first run uses heuristics, tenth run uses data. Context routing scores prior outputs by file overlap and recency. Pattern learning tracks success rates per orchestration type. Failure prediction estimates risk from historical topology before agents run. Cost baselines compare each run to your historical average.
+- **Cost transparency after every run** — every orchestration ends with a benefits summary: actual cost, historical comparison, tokens saved (with compounding math), cache hit rates, and what each optimization contributed. `swarm plan` previews estimated cost before execution.
+- **Failure prediction** — 3-layer GNN propagation model predicts which agents are likely to fail based on historical topology. Causal inference (do-calculus) explains why. Pure TypeScript, no external ML deps.
+- **7 composable patterns** — hybrid, TDD, red-team, spike, discover, review-cycle, research. Compose them: `--pattern "tdd | red-team"`. Plus 12 slash commands including postmortem, diff-review, and fix-pr.
+- **Mix backends per agent** — Claude for implementation, Codex for review, Gemini for research. Assign models at the agent level within one orchestration.
+- **26 specialized agents** — 16 core roles plus 10 focused reviewers (security, performance, data integrity, API contracts, testing, accessibility, dependencies, error handling, concurrency, documentation).
+- **14-finding security audit** — 3 recon agents + 3 adversarial breakers, all findings hardened. Plugin trust model, MCP command allowlists, prompt injection defense, path traversal guards, secrets redaction (16 pattern categories), file permissions hardened to 0o600/0o700.
+- **1,848 tests across 104 files** — reusable templates let you save and replay successful orchestrations (`swarm template run bug-fix`).
 ## Templates
@@ -245,16 +270,17 @@ See exactly what will happen before it runs:
 swarm plan "add auth middleware" --pattern hybrid
   Phase 1: research [parallel]
-    researcher-code    sonnet   ~5K tokens
-    researcher-context sonnet   ~3K tokens
+    researcher-code    sonnet   ~40K tokens
+    researcher-context sonnet   ~25K tokens
   Phase 2: implement [sequential]
-    implementer        opus     ~15K tokens
+    implementer        opus     ~100K tokens
   Phase 3: review [parallel]
-    reviewer-correct   opus     ~8K tokens
-    reviewer-security  opus     ~8K tokens
+    reviewer-correct   opus     ~60K tokens
+    reviewer-security  opus     ~60K tokens
-  Est. cost: $0.51 | Est. duration: 95s
-  Optimizations: model downgrade for research phase (sonnet vs opus)
+  Est. cost: $3.20 | Est. duration: 95s
+  Optimizations: model downgrade for research phase (sonnet vs opus),
+                 cross-phase filtering (reviewers skip research output)
 ```
 ## Commands
@@ -335,17 +361,94 @@ swarm convert --to opencode   # OpenCode agents
 swarm convert --to windsurf   # Windsurf skills
 ```
+## Token Efficiency
+Saves 20-35% of tokens on typical parallel orchestrations (measured and shown after every run). Multi-agent context grows linearly without optimization; Swarm Engine applies these compression techniques automatically -- no configuration needed.
+| Technique | What it does | Typical savings |
+|-----------|-------------|-----------------|
+| **Cross-phase filtering** | Implement phases only get research/plan outputs. Review phases only get implement/test outputs. File-scope filtering removes irrelevant files. | 10-30% per phase |
+| **Verbatim compaction** | Replaces file reads, git diffs, test output, and stack traces with compact references. Runs on shared-context files, not just agent output. | 50-70% on tool-heavy outputs |
+| **Context decay** | Recent phases: full fidelity. Older phases: heuristic summary. Oldest: one-line reference. | 5-20x on deep pipelines |
+| **Prompt compression** | Strips YAML frontmatter, markdown headers, duplicate prefixes, and excessive indentation from agent prompts. | 10-25% per prompt |
+| **Output schemas** | Structured JSON contracts per agent type force compact output instead of verbose prose. | 3-5x on agent outputs |
+| **Tool search deferral** | `ENABLE_TOOL_SEARCH=true` auto-set on Claude backend. Defers tool schema loading until needed. | ~14K tokens per agent session |
+| **Output summarizer** | PostToolUse hook injects compact digest after verbose Bash output (>50 lines). Classifies output type, extracts errors and test results. | Reduces model reasoning load on verbose commands |
+**ACON (Agent Context Optimization)** goes further: it records trajectory pairs (full context vs compressed), analyzes failures where compression caused worse outcomes, and iteratively refines compression guidelines. Ships with 8 built-in guidelines covering error messages, file paths, decisions, API contracts, and boilerplate. Gradient-free -- works with any model.
+**Cache tracking.** The runtime captures `cacheReadInputTokens` and `cacheCreationInputTokens` end-to-end from the SDK through SQLite. Per-agent-type rolling baselines detect cache cliff events (>50% hit rate drop) and cold-wake gaps (>5min idle). These feed into the benefits table and cost model.
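The cache-tracking paragraph above can be sketched roughly as follows. This is only an illustration, not the package's code: `CacheSample`, `hitRate`, `CacheBaseline`, and the 10-sample window are hypothetical. The hit-rate formula (read tokens divided by read plus creation tokens) is an assumption, although it is consistent with the benefits table, where 142,800 read and 28,400 creation tokens appear next to an 83.4% hit rate. The ">50% hit rate drop" is read here as a relative drop against the rolling baseline, which is also an assumption.

```typescript
// Hypothetical sketch: names and the window size are not taken from swarm-engine.
interface CacheSample {
  cacheReadInputTokens: number;
  cacheCreationInputTokens: number;
  timestampMs: number;
}

// Assumed definition: share of cacheable input tokens served from cache.
function hitRate(sample: CacheSample): number {
  const total = sample.cacheReadInputTokens + sample.cacheCreationInputTokens;
  return total === 0 ? 0 : sample.cacheReadInputTokens / total;
}

const CLIFF_RELATIVE_DROP = 0.5;       // ">50% hit rate drop" (read as relative)
const COLD_WAKE_GAP_MS = 5 * 60 * 1000; // ">5min idle"

// One instance per agent type would give the per-agent-type baselines.
class CacheBaseline {
  private rates: number[] = [];
  private lastTimestampMs: number | undefined = undefined;
  private readonly window: number;

  constructor(window: number = 10) {
    this.window = window;
  }

  observe(sample: CacheSample): { rate: number; cliff: boolean; coldWake: boolean } {
    const rate = hitRate(sample);

    // Compare against the mean of earlier samples only, so a bad sample
    // cannot drag down the baseline it is being judged against.
    let cliff = false;
    if (this.rates.length > 0) {
      const baseline = this.rates.reduce((sum, r) => sum + r, 0) / this.rates.length;
      cliff = baseline > 0 && rate < baseline * (1 - CLIFF_RELATIVE_DROP);
    }

    const coldWake =
      this.lastTimestampMs !== undefined &&
      sample.timestampMs - this.lastTimestampMs > COLD_WAKE_GAP_MS;

    this.rates.push(rate);
    if (this.rates.length > this.window) this.rates.shift();
    this.lastTimestampMs = sample.timestampMs;

    return { rate, cliff, coldWake };
  }
}
```

Feeding the benefits-table figures into `hitRate` gives about 0.834, which matches the 83.4% shown in the README example.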
+The benefits summary at the end of each orchestration shows exactly what was saved.
+### Prompt Diet
+Control how aggressively the runtime trims agent system prompts (Before-You-Act / Self-Check / Debt / Meta sections) per turn.
+```yaml
+# OrchestrationConfig
+promptDiet:
+  default: balanced            # aggressive | balanced | conservative
+  overrides:
+    orchestrator: full         # keep full prompt for this agent type
+    my-custom-agent: lite
+```
+CLI: `swarm run --prompt-diet aggressive` or `swarm orchestrate --prompt-diet aggressive`.
+Env: `SWARM_PROMPT_DIET=aggressive swarm ...`.
+Modes:
+- `aggressive`: trims **all** agents (including orchestrator). Biggest savings, highest risk of behavior drift.
+- `balanced` (default): trims reviewers, implementers, researchers, testers, debuggers, planners, integrators, and other workers. Keeps `orchestrator`, `grounding`, `devils-advocate`, `judge`, and `refactorer` at full.
+- `conservative`: no automatic trimming. Only explicit `promptTier` on individual agents applies (legacy behavior).
+**Per-agent precedence** (strongest first, applied after `dietConfig` is resolved):
+1. Explicit `config.promptTier` on the AgentConfig — always wins
+2. `dietConfig.overrides[key]` where the agent type contains `key` (case-insensitive substring; longest matching key wins, so `security-implementer` beats `implementer`)
+3. `mode === 'conservative'` → no automatic tiering
+4. `maxTurns ≤ 3` → `minimal` (in `balanced` and `aggressive` modes)
+5. `mode === 'aggressive'` → `lite` for everyone (including the protected roles)
+6. `mode === 'balanced'` (default) → protected roles (`orchestrator`, `grounding`, `devils-advocate`, `judge`, `refactorer`) stay full; lite-eligible roles → `lite`; unknown roles → full
+7. Default → full
+**Config-source precedence** (which `dietConfig` the runtime sees): CLI flag (`--prompt-diet`) populates `orchestrationConfig.promptDiet` directly, so `orchestrationConfig.promptDiet ?? resolveDietFromEnv()` — i.e. CLI/orchestration config wins, then `SWARM_PROMPT_DIET` env, then `balanced` default.
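The two precedence lists above can be read as a pair of small functions. The following is a sketch of that reading, not the contents of `dist/runtime/prompt-tier.js`. The type names, `resolvePromptTier`, `resolveDiet`, and the substring lists for protected and lite-eligible roles are all assumptions made for illustration; only the ordering of the rules, the mode and tier names, and the `SWARM_PROMPT_DIET` variable come from the README text.

```typescript
// Hypothetical names throughout; only the rule order follows the README.
type PromptTier = "full" | "lite" | "minimal";
type DietMode = "aggressive" | "balanced" | "conservative";

interface DietConfig {
  default: DietMode;
  overrides?: Record<string, PromptTier>;
}

interface AgentLike {
  type: string;
  promptTier?: PromptTier;
  maxTurns?: number;
}

// Role lists taken from the README's mode descriptions; substring matching is assumed.
const PROTECTED_ROLES = ["orchestrator", "grounding", "devils-advocate", "judge", "refactorer"];
const LITE_ELIGIBLE = ["reviewer", "implementer", "researcher", "tester", "debugger", "planner", "integrator"];

function resolvePromptTier(agent: AgentLike, diet: DietConfig): PromptTier {
  // 1. An explicit tier on the agent always wins.
  if (agent.promptTier !== undefined) return agent.promptTier;

  const type = agent.type.toLowerCase();

  // 2. Overrides: case-insensitive substring match, longest key wins.
  const overrides = diet.overrides ?? {};
  const match = Object.keys(overrides)
    .filter((key) => key.length > 0 && type.includes(key.toLowerCase()))
    .sort((a, b) => b.length - a.length)[0];
  if (match !== undefined) {
    const tier = overrides[match];
    if (tier !== undefined) return tier;
  }

  // 3. Conservative mode does no automatic tiering.
  if (diet.default === "conservative") return "full";

  // 4. Short-lived agents get the minimal prompt.
  if (agent.maxTurns !== undefined && agent.maxTurns <= 3) return "minimal";

  // 5. Aggressive mode trims everyone, protected roles included.
  if (diet.default === "aggressive") return "lite";

  // 6. Balanced mode: protected roles stay full, known workers go lite.
  if (PROTECTED_ROLES.some((role) => type.includes(role))) return "full";
  if (LITE_ELIGIBLE.some((role) => type.includes(role))) return "lite";

  // 7. Unknown roles keep the full prompt.
  return "full";
}

// Config-source precedence: orchestration config (where the CLI flag lands),
// then the SWARM_PROMPT_DIET environment variable, then the balanced default.
function resolveDiet(
  fromConfig: DietConfig | undefined,
  env: Record<string, string | undefined>,
): DietConfig {
  if (fromConfig !== undefined) return fromConfig;
  const raw = env["SWARM_PROMPT_DIET"];
  if (raw === "aggressive" || raw === "balanced" || raw === "conservative") {
    return { default: raw };
  }
  return { default: "balanced" };
}
```

Under this reading, an agent of type `security-implementer` with overrides for both `implementer` and `security-implementer` resolves to the tier of the longer key, and an `orchestrator` stays at `full` in `balanced` mode but drops to `lite` in `aggressive` mode.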
+## Security
+v1.50 addressed 14 findings from an internal red-team audit. Key hardening:
+- **Plugin trust model** — repo-local plugins no longer auto-load. Explicit opt-in required via config. npm-installed plugins load normally.
+- **MCP/LSP command allowlist** — only allowlisted binaries can be launched. Blocks arbitrary command execution through tool server configs.
+- **Prompt injection defenses** — cross-phase context uses boundary markers, agent definitions are positioned before user context, and inline redaction strips injection attempts.
+- **Path traversal guard** — `swarm agents install` validates paths to prevent writing outside the agents directory.
+- **Secrets redaction** — Anthropic API keys, Slack tokens, npm tokens, and Vercel tokens are pattern-matched and stripped from logs, JSON reports, and vault files.
+- **Backend opt-in** — Codex and Gemini backends require explicit `--unsafe-backend` flag since they execute in less-sandboxed environments.
+- **Config validation** — YAML config files are validated on load; dangerous fields (`shell`, `exec`, `command`) are stripped.
+- **Error sanitization** — error messages are cleaned before being injected into retry prompts to prevent reflection-based injection.
 ## Memory and Knowledge Graph Intelligence
 SQLite-backed knowledge base with full-text search, plus a 3-tier Execution Knowledge Graph that records every orchestration as a persistent, queryable topology. Syncs to Obsidian vault for cross-machine access.
-**Tier 1 -- Core Graph.** ExecutionGraph records orchestration topology. GraphLearner extracts cross-run patterns. GraphContextRouter replaces dump-everything context with relevance-scored assembly. GraphAnalyzer detects god nodes, bottlenecks, and topology risks.
+**Tier 1 -- Core Graph.** ExecutionGraph records orchestration topology. GraphLearner extracts cross-run patterns. GraphContextRouter replaces dump-everything context with relevance-scored assembly. GraphAnalyzer detects god nodes, bottlenecks, and topology risks. Review findings feed back into the graph to refine future context routing.
-**Tier 2 -- Advanced ML.** CausalGraphEngine applies do-calculus to estimate treatment effects and suggest interventions. FailurePropagationPredictor uses a 3-layer GNN to predict which nodes are at risk before execution starts. MetaPatternSelector recommends orchestration patterns via TF-IDF + logistic regression. PredictiveDropout uses active learning to skip redundant agents.
+**Tier 2 -- Advanced ML.** CausalGraphEngine applies do-calculus to estimate treatment effects and suggest interventions. FailurePropagationPredictor uses a 3-layer GNN to predict which nodes are at risk before execution starts. MetaPatternSelector recommends orchestration patterns via TF-IDF + logistic regression. PredictiveDropout uses active learning to skip redundant agents (saving ~5K tokens per dropped agent).
 **Tier 3 -- Self-Aware Engine.** PatternSynthesizer generates novel orchestration patterns from topology diffs. TrajectoryPredictor forecasts orchestration success mid-run. RuleEvolver proposes and backtests its own replanning rules from historical failures. TaskDiscovery mines the graph for actionable tasks. OrchestrationEmbedder produces topology-based embeddings for similarity search and transfer learning. MetaAdversarialTester red-teams the engine's own ML subsystems.
-All ML is implemented in pure TypeScript with zero external ML dependencies.
+All ML is implemented in pure TypeScript with zero external ML dependencies. Every optimization is tracked by the BenefitsCollector and surfaced in the post-orchestration summary -- the engine shows its work.
+### Knowledge Graph vs Claude Memory
+Claude Code memory gives every session the same static context. Swarm's execution graph learns from every run:
+- **Context routing** — agents get prior-phase outputs scored by relevance (file overlap, recency), not a dump of everything
+- **Pattern learning** — tracks success rates per orchestration pattern and recommends what works for your codebase
+- **Failure prediction** — estimates which agents are likely to fail based on historical topology, before they run
+- **Cost baselines** — each run is compared to historical average so you see if this orchestration was cheap or expensive relative to your norm
+First run uses heuristics. Tenth run uses data.
 ```bash
 swarm memory search "authentication"

package/agents/implementer.md CHANGED

@@ -52,20 +52,7 @@ Before executing your process, reason through these questions internally (do not
 ## Output Format
-```
-## Changes Made
-- `path/to/file.py` — [what was changed and why]
-- `path/to/new_file.py` — [new file, what it does]
-## Decisions
-- [Any judgment calls you made and why]
-## Verification
-- [Tests run and results]
-- [Any warnings or concerns]
-```
-### Example Output
+Output exactly these sections in this order. Each bullet should name the file and explain why, not just what.
 ```
 ## Changes Made