swarm-engine 1.43.0 → 1.51.0

Files changed (125)
  1. package/README.md +145 -42
  2. package/agents/implementer.md +1 -14
  3. package/agents/orchestrator.md +127 -246
  4. package/agents/reviewer.md +3 -18
  5. package/dist/cli/commands/agents.js +13 -1
  6. package/dist/cli/commands/agents.js.map +1 -1
  7. package/dist/cli/commands/dashboard.d.ts +3 -0
  8. package/dist/cli/commands/dashboard.d.ts.map +1 -0
  9. package/dist/cli/commands/dashboard.js +43 -0
  10. package/dist/cli/commands/dashboard.js.map +1 -0
  11. package/dist/cli/commands/orchestrate.d.ts.map +1 -1
  12. package/dist/cli/commands/orchestrate.js +17 -0
  13. package/dist/cli/commands/orchestrate.js.map +1 -1
  14. package/dist/cli/commands/run.d.ts.map +1 -1
  15. package/dist/cli/commands/run.js +20 -1
  16. package/dist/cli/commands/run.js.map +1 -1
  17. package/dist/cli/commands/status.d.ts +1 -0
  18. package/dist/cli/commands/status.d.ts.map +1 -1
  19. package/dist/cli/commands/status.js +5 -2
  20. package/dist/cli/commands/status.js.map +1 -1
  21. package/dist/core/event-bus.d.ts.map +1 -1
  22. package/dist/core/event-bus.js +4 -1
  23. package/dist/core/event-bus.js.map +1 -1
  24. package/dist/core/types.d.ts +24 -1
  25. package/dist/core/types.d.ts.map +1 -1
  26. package/dist/core/types.js +4 -0
  27. package/dist/core/types.js.map +1 -1
  28. package/dist/index.d.ts +3 -0
  29. package/dist/index.d.ts.map +1 -1
  30. package/dist/index.js +2 -0
  31. package/dist/index.js.map +1 -1
  32. package/dist/memory/index.js +2 -2
  33. package/dist/memory/index.js.map +1 -1
  34. package/dist/runtime/agent-runner.d.ts +8 -0
  35. package/dist/runtime/agent-runner.d.ts.map +1 -1
  36. package/dist/runtime/agent-runner.js +93 -4
  37. package/dist/runtime/agent-runner.js.map +1 -1
  38. package/dist/runtime/backends/claude.d.ts +1 -0
  39. package/dist/runtime/backends/claude.d.ts.map +1 -1
  40. package/dist/runtime/backends/claude.js +50 -2
  41. package/dist/runtime/backends/claude.js.map +1 -1
  42. package/dist/runtime/backends/codex.d.ts.map +1 -1
  43. package/dist/runtime/backends/codex.js +4 -0
  44. package/dist/runtime/backends/codex.js.map +1 -1
  45. package/dist/runtime/backends/gemini.d.ts.map +1 -1
  46. package/dist/runtime/backends/gemini.js +4 -0
  47. package/dist/runtime/backends/gemini.js.map +1 -1
  48. package/dist/runtime/benefits.d.ts +81 -1
  49. package/dist/runtime/benefits.d.ts.map +1 -1
  50. package/dist/runtime/benefits.js +199 -12
  51. package/dist/runtime/benefits.js.map +1 -1
  52. package/dist/runtime/cache-optimizer.d.ts +7 -3
  53. package/dist/runtime/cache-optimizer.d.ts.map +1 -1
  54. package/dist/runtime/cache-optimizer.js +11 -7
  55. package/dist/runtime/cache-optimizer.js.map +1 -1
  56. package/dist/runtime/compaction.d.ts +6 -1
  57. package/dist/runtime/compaction.d.ts.map +1 -1
  58. package/dist/runtime/compaction.js +39 -2
  59. package/dist/runtime/compaction.js.map +1 -1
  60. package/dist/runtime/cost-model.d.ts.map +1 -1
  61. package/dist/runtime/cost-model.js +20 -17
  62. package/dist/runtime/cost-model.js.map +1 -1
  63. package/dist/runtime/engine.d.ts +1 -0
  64. package/dist/runtime/engine.d.ts.map +1 -1
  65. package/dist/runtime/engine.js +62 -2
  66. package/dist/runtime/engine.js.map +1 -1
  67. package/dist/runtime/graph-discovery.js +2 -2
  68. package/dist/runtime/graph-discovery.js.map +1 -1
  69. package/dist/runtime/graph-trajectory.js +3 -3
  70. package/dist/runtime/graph-trajectory.js.map +1 -1
  71. package/dist/runtime/lsp.d.ts.map +1 -1
  72. package/dist/runtime/lsp.js +4 -0
  73. package/dist/runtime/lsp.js.map +1 -1
  74. package/dist/runtime/mcp.d.ts +1 -0
  75. package/dist/runtime/mcp.d.ts.map +1 -1
  76. package/dist/runtime/mcp.js +38 -0
  77. package/dist/runtime/mcp.js.map +1 -1
  78. package/dist/runtime/output-summarizer.d.ts +45 -0
  79. package/dist/runtime/output-summarizer.d.ts.map +1 -0
  80. package/dist/runtime/output-summarizer.js +171 -0
  81. package/dist/runtime/output-summarizer.js.map +1 -0
  82. package/dist/runtime/plugins.d.ts +5 -1
  83. package/dist/runtime/plugins.d.ts.map +1 -1
  84. package/dist/runtime/plugins.js +14 -2
  85. package/dist/runtime/plugins.js.map +1 -1
  86. package/dist/runtime/prompt-tier.d.ts +33 -0
  87. package/dist/runtime/prompt-tier.d.ts.map +1 -0
  88. package/dist/runtime/prompt-tier.js +105 -0
  89. package/dist/runtime/prompt-tier.js.map +1 -0
  90. package/dist/runtime/sharing.js +2 -1
  91. package/dist/runtime/sharing.js.map +1 -1
  92. package/dist/runtime/stats.d.ts +2 -0
  93. package/dist/runtime/stats.d.ts.map +1 -1
  94. package/dist/runtime/stats.js +17 -3
  95. package/dist/runtime/stats.js.map +1 -1
  96. package/dist/utils/project-config.d.ts +20 -0
  97. package/dist/utils/project-config.d.ts.map +1 -1
  98. package/dist/utils/project-config.js +46 -1
  99. package/dist/utils/project-config.js.map +1 -1
  100. package/dist/utils/redact.d.ts.map +1 -1
  101. package/dist/utils/redact.js +5 -1
  102. package/dist/utils/redact.js.map +1 -1
  103. package/dist/web/bridge.d.ts +47 -0
  104. package/dist/web/bridge.d.ts.map +1 -0
  105. package/dist/web/bridge.js +267 -0
  106. package/dist/web/bridge.js.map +1 -0
  107. package/dist/web/graph-api.d.ts +19 -0
  108. package/dist/web/graph-api.d.ts.map +1 -0
  109. package/dist/web/graph-api.js +157 -0
  110. package/dist/web/graph-api.js.map +1 -0
  111. package/dist/web/index.d.ts +21 -0
  112. package/dist/web/index.d.ts.map +1 -0
  113. package/dist/web/index.js +38 -0
  114. package/dist/web/index.js.map +1 -0
  115. package/dist/web/public/index.html +1304 -0
  116. package/dist/web/public/public/index.html +1307 -0
  117. package/dist/web/server.d.ts +24 -0
  118. package/dist/web/server.d.ts.map +1 -0
  119. package/dist/web/server.js +113 -0
  120. package/dist/web/server.js.map +1 -0
  121. package/dist/web/tray.d.ts +23 -0
  122. package/dist/web/tray.d.ts.map +1 -0
  123. package/dist/web/tray.js +205 -0
  124. package/dist/web/tray.js.map +1 -0
  125. package/package.json +1 -1
package/README.md CHANGED
@@ -3,13 +3,11 @@
3
3
  [![npm](https://img.shields.io/npm/v/swarm-engine)](https://www.npmjs.com/package/swarm-engine)
4
4
  [![Node](https://img.shields.io/node/v/swarm-engine)](https://nodejs.org)
5
5
  [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
6
- [![Tests](https://img.shields.io/badge/tests-1278%20passing-brightgreen)]()
6
+ [![Tests](https://img.shields.io/badge/tests-1848%20passing-brightgreen)]()
7
7
 
8
- **Your agents. Orchestrated.**
8
+ Multi-agent orchestration for AI coding tools. Coordinates Claude, Codex, and Gemini through research, implementation, and review phases. 20-35% token reduction on parallel workflows (measured per-run), cost transparency after every orchestration, and a knowledge graph that sharpens context routing with each run. 1,848 tests. 14-finding security audit. MIT licensed.
9
9
 
10
- The first self-aware AI orchestration engine. A 16-module ML pipeline -- built in pure TypeScript with zero external ML dependencies -- that predicts its own failures, explains them causally, and evolves its own rules. Underneath: 26 agents, 7 composable patterns, and a persistent knowledge graph that compounds intelligence across every run.
11
-
12
- Works with Claude Code, OpenAI Codex, Google Gemini CLI, and Vercel AI SDK. Mix models across agents in the same orchestration.
10
+ 26 agents, 7 composable patterns, pure TypeScript, zero external ML dependencies. Works with Claude Code, OpenAI Codex, Google Gemini CLI, and Vercel AI SDK. Mix models across agents in the same orchestration.
13
11
 
14
12
  ## What It Looks Like
15
13
 
@@ -37,25 +35,41 @@ Works with Claude Code, OpenAI Codex, Google Gemini CLI, and Vercel AI SDK. Mix
37
35
  12.8K tokens │ $0.24 │ 1m 36s ~2m remaining
38
36
  ```
39
37
 
38
+ When it finishes, you see exactly what the engine did for you:
39
+
40
40
  ```
41
- +---------------------------------------------------------+
42
- | |
43
- | Orchestration complete |
44
- | |
45
- | Pattern: hybrid (3 phases, 6 agents) |
46
- | Duration: 3m 42s |
47
- | Tokens: 47.2K |
48
- | Cost: $0.3814 |
49
- | Tools: 142 calls |
50
- | |
51
- | Changes: |
52
- | src/middleware/rate-limit.ts 48 +++ |
53
- | src/routes/users.ts 3 +- |
54
- | tests/middleware/rate-limit.test.ts 62 +++ |
55
- | |
56
- +---------------------------------------------------------+
41
+ ┌─ Engine Benefits ──────────────────────────────────────────────────────┐
42
+ │ Cost: $2.8400 (avg: $3.6200, saved $0.7800) | 28% tokens saved │
43
+ ├────────────────────────────────────────────────────────────────────────┤
44
+ │ TOKEN EFFICIENCY │
45
+ │ ├─ Smart routing -18,400 tok filtered phase outputs │
46
+ │ ├─ Verbatim compaction -12,800 tok replaced with refs │
47
+ │ ├─ Context decay -6,200 tok older phases summarized │
48
+ │ ├─ Prompt diet (balanced) -4,100 tok trimmed agent prompts │
49
+ │ ├─ Tool search deferred -14,000 tok ENABLE_TOOL_SEARCH=true │
50
+ │ └─ TOTAL SAVED -55,500 tok 28% reduction │
51
+ │ │
52
+ │ CACHE │
53
+ │ ├─ Cache read tokens 142,800 reused from prior turns │
54
+ │ ├─ Cache creation 28,400 new cache entries │
55
+ │ └─ Cache hit rate 83.4% no cliff events detected │
56
+ │ │
57
+ │ KNOWLEDGE GRAPH │
58
+ │ ├─ Context routing 8 graph-optimized per agent │
59
+ │ ├─ Confidence gates 2 eval all passed │
60
+ │ └─ Pattern history 91% success 14 runs │
61
+ │ │
62
+ │ ML / GNN │
63
+ │ └─ Predictive dropout 1 agents -5,000 tok saved │
64
+ │ │
65
+ │ ADAPTIVE │
66
+ │ ├─ Model downgrades 2 cheaper where safe │
67
+ │ └─ Living spec 3 updates refined during run │
68
+ └────────────────────────────────────────────────────────────────────────┘
57
69
  ```
58
70
 
71
+ The more you use it, the richer this gets -- the knowledge graph, GNN predictions, cache baselines, and historical cost averages build over time.
72
+
59
73
  ## Install
60
74
 
61
75
  **npm (recommended):**
@@ -202,24 +216,35 @@ import {
202
216
  TaskDiscovery, // Mines failure patterns for tasks
203
217
  OrchestrationEmbedder, // Topology embeddings for transfer learning
204
218
  } from 'swarm-engine';
219
+
220
+ // Token Compression & Benefits
221
+ import {
222
+ VerbatimCompactor, // Replace file reads, diffs, stack traces with refs
223
+ ContextDecayManager, // Hierarchical time-decay summarization
224
+ ACONOptimizer, // Failure-driven compression guidelines
225
+ PromptCompressor, // Strip markdown boilerplate from prompts
226
+ getOutputSchema, // Structured JSON schemas per agent type
227
+ BenefitsCollector, // Aggregate optimization metrics
228
+ formatBenefitsTable, // Render styled benefits summary
229
+ createOutputSummarizerHook, // PostToolUse hook for Bash output reduction
230
+ } from 'swarm-engine';
205
231
  ```
206
232
 
207
233
  See [src/index.ts](src/index.ts) for the full export surface.
208
234
 
209
235
  ## Why Swarm Engine
210
236
 
211
- Tools like Claude Code already let you spawn parallel agents with teams. That's powerful infrastructure. Swarm Engine builds on top of it with the parts you'd otherwise have to figure out yourself: which agents to run, in what order, with what prompts, on which models, and how to learn from the results.
237
+ Tools like Claude Code already let you spawn parallel agents with teams. Swarm Engine adds the orchestration layer: which agents to run, in what order, with what context, on which models, and how to learn from the results.
212
238
 
213
- It gives you composable patterns, cost-aware planning, specialized agent definitions, and a memory system that improves with every run. Think of it as the orchestration layer that turns ad-hoc multi-agent work into repeatable workflows.
214
-
215
- - **Self-aware ML pipeline** - the engine predicts its own failures (GNN), explains them causally (do-calculus), and evolves its own replanning rules. All in pure TypeScript, zero external ML dependencies.
216
- - **16 graph modules across 3 tiers** - persistent knowledge graph, causal inference, GNN failure propagation, trajectory prediction, orchestration embeddings, and self-evolving rules.
217
- - **7 composable patterns** - hybrid, TDD, red-team, spike, discover, review-cycle, research. Compose them: `--pattern "tdd | red-team"`. Plus 12 slash commands including postmortem, diff-review, and fix-pr.
218
- - **Intelligent planner** - cost-based optimization, adaptive execution, learns from every run
219
- - **Mix any backend** - Claude for implementation, Codex for review, Gemini for research. Different model per agent.
220
- - **Reusable templates** - save successful workflows, run them again with different parameters
221
- - **26 specialized agents** - 16 core roles plus 10 focused reviewers (security, performance, data integrity, API contracts, testing, accessibility, dependencies, error handling, concurrency, documentation)
222
- - **1,278 tests across 76 files** - 4 rounds of security review, clean on all adversarial attack vectors
239
+ - **20-35% token reduction** measured per-run on typical parallel orchestrations. Cross-phase context filtering, verbatim compaction, context decay, prompt diet (3 modes), and tool schema deferral (~14K saved per session). No configuration needed. Cache-aware: tracks hit rates and detects cliff events.
240
+ - **Knowledge graph improves with every run** — first run uses heuristics, tenth run uses data. Context routing scores prior outputs by file overlap and recency. Pattern learning tracks success rates per orchestration type. Failure prediction estimates risk from historical topology before agents run. Cost baselines compare each run to your historical average.
241
+ - **Cost transparency after every run** - every orchestration ends with a benefits summary: actual cost, historical comparison, tokens saved (with compounding math), cache hit rates, and what each optimization contributed. `swarm plan` previews estimated cost before execution.
242
+ - **Failure prediction** - a 3-layer GNN propagation model predicts which agents are likely to fail based on historical topology. Causal inference (do-calculus) explains why. Pure TypeScript, no external ML deps.
243
+ - **7 composable patterns** - hybrid, TDD, red-team, spike, discover, review-cycle, research. Compose them: `--pattern "tdd | red-team"`. Plus 12 slash commands including postmortem, diff-review, and fix-pr.
244
+ - **Mix backends per agent** - Claude for implementation, Codex for review, Gemini for research. Assign models at the agent level within one orchestration.
245
+ - **26 specialized agents** - 16 core roles plus 10 focused reviewers (security, performance, data integrity, API contracts, testing, accessibility, dependencies, error handling, concurrency, documentation).
246
+ - **14-finding security audit** - 3 recon agents + 3 adversarial breakers, all findings hardened. Plugin trust model, MCP command allowlists, prompt injection defense, path traversal guards, secrets redaction (16 pattern categories), file permissions hardened to 0o600/0o700.
247
+ - **1,848 tests across 104 files** - plus reusable templates that let you save and replay successful orchestrations (`swarm template run bug-fix`).
223
248
 
224
249
  ## Templates
225
250
 
@@ -245,16 +270,17 @@ See exactly what will happen before it runs:
245
270
  swarm plan "add auth middleware" --pattern hybrid
246
271
 
247
272
  Phase 1: research [parallel]
248
- researcher-code sonnet ~5K tokens
249
- researcher-context sonnet ~3K tokens
273
+ researcher-code sonnet ~40K tokens
274
+ researcher-context sonnet ~25K tokens
250
275
  Phase 2: implement [sequential]
251
- implementer opus ~15K tokens
276
+ implementer opus ~100K tokens
252
277
  Phase 3: review [parallel]
253
- reviewer-correct opus ~8K tokens
254
- reviewer-security opus ~8K tokens
278
+ reviewer-correct opus ~60K tokens
279
+ reviewer-security opus ~60K tokens
255
280
 
256
- Est. cost: $0.51 | Est. duration: 95s
257
- Optimizations: model downgrade for research phase (sonnet vs opus)
281
+ Est. cost: $3.20 | Est. duration: 95s
282
+ Optimizations: model downgrade for research phase (sonnet vs opus),
283
+ cross-phase filtering (reviewers skip research output)
258
284
  ```
259
285
 
260
286
  ## Commands
@@ -335,17 +361,94 @@ swarm convert --to opencode # OpenCode agents
335
361
  swarm convert --to windsurf # Windsurf skills
336
362
  ```
337
363
 
364
+ ## Token Efficiency
365
+
366
+ Saves 20-35% of tokens on typical parallel orchestrations (measured and shown after every run). Multi-agent context grows linearly without optimization; Swarm Engine applies these compression techniques automatically -- no configuration needed.
367
+
368
+ | Technique | What it does | Typical savings |
369
+ |-----------|-------------|-----------------|
370
+ | **Cross-phase filtering** | Implement phases only get research/plan outputs. Review phases only get implement/test outputs. File-scope filtering removes irrelevant files. | 10-30% per phase |
371
+ | **Verbatim compaction** | Replaces file reads, git diffs, test output, and stack traces with compact references. Runs on shared-context files, not just agent output. | 50-70% on tool-heavy outputs |
372
+ | **Context decay** | Recent phases: full fidelity. Older phases: heuristic summary. Oldest: one-line reference. | 5-20x on deep pipelines |
373
+ | **Prompt compression** | Strips YAML frontmatter, markdown headers, duplicate prefixes, and excessive indentation from agent prompts. | 10-25% per prompt |
374
+ | **Output schemas** | Structured JSON contracts per agent type force compact output instead of verbose prose. | 3-5x on agent outputs |
375
+ | **Tool search deferral** | `ENABLE_TOOL_SEARCH=true` auto-set on Claude backend. Defers tool schema loading until needed. | ~14K tokens per agent session |
376
+ | **Output summarizer** | PostToolUse hook injects compact digest after verbose Bash output (>50 lines). Classifies output type, extracts errors and test results. | Reduces model reasoning load on verbose commands |
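
The context-decay row above describes three fidelity tiers. A minimal sketch of that idea follows. The window sizes, the first-line "summary" heuristic, and every name here (`PhaseOutput`, `decayContext`, `summarize`) are assumptions for illustration, not the package's `ContextDecayManager` API.

```typescript
// Hypothetical shape for a phase's recorded output (not from swarm-engine).
interface PhaseOutput {
  phase: string;
  text: string;
}

// Stand-in heuristic summary: first non-empty line, truncated.
function summarize(text: string, maxChars = 120): string {
  const firstLine = text.split('\n').find((line) => line.trim().length > 0) ?? '';
  return firstLine.trim().slice(0, maxChars);
}

// `outputs` is ordered oldest to newest. The newest `fullWindow` phases keep
// full text, the next `summaryWindow` phases are summarized, and anything
// older collapses to a one-line reference.
function decayContext(
  outputs: PhaseOutput[],
  fullWindow = 1,
  summaryWindow = 2,
): string[] {
  return outputs.map((output, index) => {
    const age = outputs.length - 1 - index; // 0 = most recent phase
    if (age < fullWindow) return output.text;
    if (age < fullWindow + summaryWindow) {
      return `[${output.phase} summary] ${summarize(output.text)}`;
    }
    return `[ref:${output.phase}]`;
  });
}
```

With the assumed defaults, a four-phase pipeline keeps the last phase verbatim, summarizes the two before it, and reduces the first to a reference.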
377
+
378
+ **ACON (Agent Context Optimization)** goes further: it records trajectory pairs (full context vs compressed), analyzes failures where compression caused worse outcomes, and iteratively refines compression guidelines. Ships with 8 built-in guidelines covering error messages, file paths, decisions, API contracts, and boilerplate. Gradient-free -- works with any model.
379
+
380
+ **Cache tracking.** The runtime captures `cacheReadInputTokens` and `cacheCreationInputTokens` end-to-end from the SDK through SQLite. Per-agent-type rolling baselines detect cache cliff events (>50% hit rate drop) and cold-wake gaps (>5min idle). These feed into the benefits table and cost model.
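
As a rough illustration of the cache checks described above, the sketch below computes a hit rate and flags cliff and cold-wake events. It assumes the hit rate is read tokens divided by read plus creation tokens (which reproduces the 83.4% in the sample benefits table) and that the ">50% drop" is relative to the rolling baseline. The type and function names are hypothetical; only the two token field names come from the README.

```typescript
// Hypothetical per-turn sample; the two token fields mirror the README's names.
interface CacheSample {
  cacheReadInputTokens: number;
  cacheCreationInputTokens: number;
  timestampMs: number;
}

// Assumed definition: share of cache-related input tokens served from cache.
function cacheHitRate(sample: CacheSample): number {
  const total = sample.cacheReadInputTokens + sample.cacheCreationInputTokens;
  return total === 0 ? 0 : sample.cacheReadInputTokens / total;
}

const CLIFF_DROP = 0.5; // >50% relative drop vs. baseline (assumed interpretation)
const COLD_WAKE_MS = 5 * 60 * 1000; // >5 min idle between turns

function detectCacheEvents(
  baselineRate: number,
  previous: CacheSample,
  current: CacheSample,
): { rate: number; cliff: boolean; coldWake: boolean } {
  const rate = cacheHitRate(current);
  const cliff =
    baselineRate > 0 && (baselineRate - rate) / baselineRate > CLIFF_DROP;
  const coldWake = current.timestampMs - previous.timestampMs > COLD_WAKE_MS;
  return { rate, cliff, coldWake };
}
```

A real implementation would also have to maintain the per-agent-type rolling baseline; here it is simply passed in.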
381
+
382
+ The benefits summary at the end of each orchestration shows exactly what was saved.
383
+
384
+ ### Prompt Diet
385
+
386
+ Control how aggressively the runtime trims agent system prompts (Before-You-Act / Self-Check / Debt / Meta sections) per turn.
387
+
388
+ ```yaml
389
+ # OrchestrationConfig
390
+ promptDiet:
391
+   default: balanced # aggressive | balanced | conservative
392
+   overrides:
393
+     orchestrator: full # keep full prompt for this agent type
394
+     my-custom-agent: lite
395
+ ```
396
+
397
+ CLI: `swarm run --prompt-diet aggressive` or `swarm orchestrate --prompt-diet aggressive`.
398
+ Env: `SWARM_PROMPT_DIET=aggressive swarm ...`.
399
+
400
+ Modes:
401
+ - `aggressive`: trims **all** agents (including orchestrator). Biggest savings, highest risk of behavior drift.
402
+ - `balanced` (default): trims reviewers, implementers, researchers, testers, debuggers, planners, integrators, and other workers. Keeps `orchestrator`, `grounding`, `devils-advocate`, `judge`, and `refactorer` at full.
403
+ - `conservative`: no automatic trimming. Only explicit `promptTier` on individual agents applies (legacy behavior).
404
+
405
+ **Per-agent precedence** (strongest first, applied after `dietConfig` is resolved):
406
+
407
+ 1. Explicit `config.promptTier` on the AgentConfig — always wins
408
+ 2. `dietConfig.overrides[key]` where the agent type contains `key` (case-insensitive substring; longest matching key wins, so `security-implementer` beats `implementer`)
409
+ 3. `mode === 'conservative'` → no automatic tiering
410
+ 4. `maxTurns ≤ 3` → `minimal` (in `balanced` and `aggressive` modes)
411
+ 5. `mode === 'aggressive'` → `lite` for everyone (including the protected roles)
412
+ 6. `mode === 'balanced'` (default) → protected roles (`orchestrator`, `grounding`, `devils-advocate`, `judge`, `refactorer`) stay full; lite-eligible roles → `lite`; unknown roles → full
413
+ 7. Default → full
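
The seven-step precedence above can be sketched as a single resolver. This is an illustration only: the types, the `resolvePromptTier` name, the exact lite-eligible role list, and the choice to map "no automatic tiering" to `full` are assumptions, not the package's implementation.

```typescript
type PromptTier = 'full' | 'lite' | 'minimal';
type DietMode = 'aggressive' | 'balanced' | 'conservative';

// Hypothetical shapes modeled on the YAML example in the README.
interface DietConfig {
  default: DietMode;
  overrides?: Record<string, PromptTier>;
}

interface AgentLike {
  type: string;
  promptTier?: PromptTier;
  maxTurns?: number;
}

const PROTECTED_ROLES = ['orchestrator', 'grounding', 'devils-advocate', 'judge', 'refactorer'];
// Assumed lite-eligible roles, taken from the `balanced` mode description.
const LITE_ROLES = ['reviewer', 'implementer', 'researcher', 'tester', 'debugger', 'planner', 'integrator'];

function resolvePromptTier(agent: AgentLike, diet: DietConfig): PromptTier {
  // 1. Explicit promptTier on the agent always wins.
  if (agent.promptTier) return agent.promptTier;

  // 2. Case-insensitive substring overrides; the longest matching key wins.
  const type = agent.type.toLowerCase();
  const match = Object.entries(diet.overrides ?? {})
    .filter(([key]) => type.includes(key.toLowerCase()))
    .sort(([a], [b]) => b.length - a.length)[0];
  if (match) return match[1];

  // 3. Conservative mode: no automatic tiering (treated here as full).
  if (diet.default === 'conservative') return 'full';

  // 4. Short-lived agents get the minimal prompt.
  if (agent.maxTurns !== undefined && agent.maxTurns <= 3) return 'minimal';

  // 5. Aggressive mode trims everyone, including protected roles.
  if (diet.default === 'aggressive') return 'lite';

  // 6. Balanced mode: protected roles stay full, lite-eligible roles are trimmed.
  if (PROTECTED_ROLES.some((role) => type.includes(role))) return 'full';
  if (LITE_ROLES.some((role) => type.includes(role))) return 'lite';

  // 7. Unknown roles fall back to full.
  return 'full';
}
```

For example, with overrides for both `implementer` and `security-implementer`, an agent of type `security-implementer` takes the longer key's tier.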
414
+
415
+ **Config-source precedence** (which `dietConfig` the runtime sees): the CLI flag (`--prompt-diet`) populates `orchestrationConfig.promptDiet` directly, and the runtime resolves the mode as `orchestrationConfig.promptDiet ?? resolveDietFromEnv()`. In other words, CLI/orchestration config wins, then the `SWARM_PROMPT_DIET` env var, then the `balanced` default.
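
A small sketch of that fallback chain, assuming the env value is validated against the three mode names. The README's `resolveDietFromEnv()` takes no arguments; the version here accepts an env map purely so the example is testable, and `resolveDietMode` is a hypothetical name.

```typescript
type DietMode = 'aggressive' | 'balanced' | 'conservative';

const DIET_MODES: readonly DietMode[] = ['aggressive', 'balanced', 'conservative'];

// Reads SWARM_PROMPT_DIET; unknown or missing values fall back to `balanced`.
function resolveDietFromEnv(env: Record<string, string | undefined>): DietMode {
  const raw = env['SWARM_PROMPT_DIET']?.toLowerCase();
  return DIET_MODES.find((mode) => mode === raw) ?? 'balanced';
}

// CLI/orchestration config wins; env is consulted only when it is unset.
function resolveDietMode(
  configMode: DietMode | undefined,
  env: Record<string, string | undefined>,
): DietMode {
  return configMode ?? resolveDietFromEnv(env);
}
```

Because the CLI flag writes into the orchestration config, it never needs its own branch in the resolver.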
416
+
417
+ ## Security
418
+
419
+ v1.50 addressed 14 findings from an internal red-team audit. Key hardening:
420
+
421
+ - **Plugin trust model** — repo-local plugins no longer auto-load. Explicit opt-in required via config. npm-installed plugins load normally.
422
+ - **MCP/LSP command allowlist** — only allowlisted binaries can be launched. Blocks arbitrary command execution through tool server configs.
423
+ - **Prompt injection defenses** — cross-phase context uses boundary markers, agent definitions are positioned before user context, and inline redaction strips injection attempts.
424
+ - **Path traversal guard** — `swarm agents install` validates paths to prevent writing outside the agents directory.
425
+ - **Secrets redaction** — Anthropic API keys, Slack tokens, npm tokens, and Vercel tokens are pattern-matched and stripped from logs, JSON reports, and vault files.
426
+ - **Backend opt-in** — Codex and Gemini backends require explicit `--unsafe-backend` flag since they execute in less-sandboxed environments.
427
+ - **Config validation** — YAML config files are validated on load; dangerous fields (`shell`, `exec`, `command`) are stripped.
428
+ - **Error sanitization** — error messages are cleaned before being injected into retry prompts to prevent reflection-based injection.
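
To make the path traversal guard concrete, here is one common way to write such a check with Node's `path` module. It is a sketch under the assumption that installs must land strictly inside a base directory; `resolveInside` is a hypothetical helper, not the function `swarm agents install` uses.

```typescript
import * as path from 'node:path';

// Resolve `requested` against `baseDir` and reject anything that escapes it.
function resolveInside(baseDir: string, requested: string): string {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, requested);
  const relative = path.relative(base, target);

  const escapes =
    relative === '' || // the base directory itself is not a valid file target
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows

  if (escapes) {
    throw new Error(`Refusing to write outside ${base}: ${requested}`);
  }
  return target;
}
```

Comparing the relative path, rather than checking a string prefix on the absolute path, avoids accepting sibling directories such as `agents-evil` when the base is `agents`.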
429
+
338
430
  ## Memory and Knowledge Graph Intelligence
339
431
 
340
432
  SQLite-backed knowledge base with full-text search, plus a 3-tier Execution Knowledge Graph that records every orchestration as a persistent, queryable topology. Syncs to Obsidian vault for cross-machine access.
341
433
 
342
- **Tier 1 -- Core Graph.** ExecutionGraph records orchestration topology. GraphLearner extracts cross-run patterns. GraphContextRouter replaces dump-everything context with relevance-scored assembly. GraphAnalyzer detects god nodes, bottlenecks, and topology risks.
434
+ **Tier 1 -- Core Graph.** ExecutionGraph records orchestration topology. GraphLearner extracts cross-run patterns. GraphContextRouter replaces dump-everything context with relevance-scored assembly. GraphAnalyzer detects god nodes, bottlenecks, and topology risks. Review findings feed back into the graph to refine future context routing.
343
435
 
344
- **Tier 2 -- Advanced ML.** CausalGraphEngine applies do-calculus to estimate treatment effects and suggest interventions. FailurePropagationPredictor uses a 3-layer GNN to predict which nodes are at risk before execution starts. MetaPatternSelector recommends orchestration patterns via TF-IDF + logistic regression. PredictiveDropout uses active learning to skip redundant agents.
436
+ **Tier 2 -- Advanced ML.** CausalGraphEngine applies do-calculus to estimate treatment effects and suggest interventions. FailurePropagationPredictor uses a 3-layer GNN to predict which nodes are at risk before execution starts. MetaPatternSelector recommends orchestration patterns via TF-IDF + logistic regression. PredictiveDropout uses active learning to skip redundant agents (saving ~5K tokens per dropped agent).
345
437
 
346
438
  **Tier 3 -- Self-Aware Engine.** PatternSynthesizer generates novel orchestration patterns from topology diffs. TrajectoryPredictor forecasts orchestration success mid-run. RuleEvolver proposes and backtests its own replanning rules from historical failures. TaskDiscovery mines the graph for actionable tasks. OrchestrationEmbedder produces topology-based embeddings for similarity search and transfer learning. MetaAdversarialTester red-teams the engine's own ML subsystems.
347
439
 
348
- All ML is implemented in pure TypeScript with zero external ML dependencies.
440
+ All ML is implemented in pure TypeScript with zero external ML dependencies. Every optimization is tracked by the BenefitsCollector and surfaced in the post-orchestration summary -- the engine shows its work.
441
+
442
+ ### Knowledge Graph vs Claude Memory
443
+
444
+ Claude Code memory gives every session the same static context. Swarm's execution graph learns from every run:
445
+
446
+ - **Context routing** — agents get prior-phase outputs scored by relevance (file overlap, recency), not a dump of everything
447
+ - **Pattern learning** — tracks success rates per orchestration pattern and recommends what works for your codebase
448
+ - **Failure prediction** — estimates which agents are likely to fail based on historical topology, before they run
449
+ - **Cost baselines** — each run is compared to historical average so you see if this orchestration was cheap or expensive relative to your norm
450
+
451
+ First run uses heuristics. Tenth run uses data.
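
The context-routing bullet above mentions scoring prior outputs by file overlap and recency. One plausible form of such a score is sketched below, assuming Jaccard overlap multiplied by an exponential recency decay. The formula, the half-life, and all names (`PriorOutput`, `routeContext`) are illustrative assumptions rather than what `GraphContextRouter` does.

```typescript
// Hypothetical record of a prior phase output.
interface PriorOutput {
  id: string;
  files: string[]; // files the output touched or discussed
  ageInPhases: number; // 0 = produced by the immediately preceding phase
}

// Jaccard similarity between two file lists.
function fileOverlap(a: string[], b: string[]): number {
  const uniqueA = Array.from(new Set(a));
  const setB = new Set(b);
  const shared = uniqueA.filter((file) => setB.has(file)).length;
  const union = uniqueA.length + setB.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Overlap weighted by recency; the score halves every `halfLife` phases.
function relevance(output: PriorOutput, agentFiles: string[], halfLife = 2): number {
  return fileOverlap(output.files, agentFiles) * Math.pow(0.5, output.ageInPhases / halfLife);
}

// Keep only the top-K outputs with a non-zero score, highest first.
function routeContext(outputs: PriorOutput[], agentFiles: string[], topK = 3): PriorOutput[] {
  return outputs
    .map((output) => ({ output, score: relevance(output, agentFiles) }))
    .filter((entry) => entry.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.output);
}
```

Outputs that share no files with the agent's scope score zero and are dropped, which is the difference from dumping every prior output into the prompt.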
349
452
 
350
453
  ```bash
351
454
  swarm memory search "authentication"
package/agents/implementer.md CHANGED
@@ -52,20 +52,7 @@ Before executing your process, reason through these questions internally (do not
52
52
 
53
53
  ## Output Format
54
54
 
55
- ```
56
- ## Changes Made
57
- - `path/to/file.py` — [what was changed and why]
58
- - `path/to/new_file.py` — [new file, what it does]
59
-
60
- ## Decisions
61
- - [Any judgment calls you made and why]
62
-
63
- ## Verification
64
- - [Tests run and results]
65
- - [Any warnings or concerns]
66
- ```
67
-
68
- ### Example Output
55
+ Output exactly these sections in this order. Each bullet should name the file and explain why, not just what.
69
56
 
70
57
  ```
71
58
  ## Changes Made