npm - opencode-swarm-plugin - Versions diffs - 0.44.0 → 0.44.1 - Mend

opencode-swarm-plugin 0.44.0 → 0.44.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (205)

package/bin/swarm.serve.test.ts +6 -4
package/bin/swarm.ts +16 -10
package/dist/compaction-prompt-scoring.js +139 -0
package/dist/eval-capture.js +12811 -0
package/dist/hive.d.ts.map +1 -1
package/dist/index.js +7644 -62599
package/dist/plugin.js +23766 -78721
package/dist/swarm-orchestrate.d.ts.map +1 -1
package/dist/swarm-prompts.d.ts.map +1 -1
package/dist/swarm-review.d.ts.map +1 -1
package/package.json +17 -5
package/.changeset/swarm-insights-data-layer.md +0 -63
package/.hive/analysis/eval-failure-analysis-2025-12-25.md +0 -331
package/.hive/analysis/session-data-quality-audit.md +0 -320
package/.hive/eval-results.json +0 -483
package/.hive/issues.jsonl +0 -138
package/.hive/memories.jsonl +0 -729
package/.opencode/eval-history.jsonl +0 -327
package/.turbo/turbo-build.log +0 -9
package/CHANGELOG.md +0 -2286
package/SCORER-ANALYSIS.md +0 -598
package/docs/analysis/subagent-coordination-patterns.md +0 -902
package/docs/analysis-socratic-planner-pattern.md +0 -504
package/docs/planning/ADR-001-monorepo-structure.md +0 -171
package/docs/planning/ADR-002-package-extraction.md +0 -393
package/docs/planning/ADR-003-performance-improvements.md +0 -451
package/docs/planning/ADR-004-message-queue-features.md +0 -187
package/docs/planning/ADR-005-devtools-observability.md +0 -202
package/docs/planning/ADR-007-swarm-enhancements-worktree-review.md +0 -168
package/docs/planning/ADR-008-worker-handoff-protocol.md +0 -293
package/docs/planning/ADR-009-oh-my-opencode-patterns.md +0 -353
package/docs/planning/ADR-010-cass-inhousing.md +0 -1215
package/docs/planning/ROADMAP.md +0 -368
package/docs/semantic-memory-cli-syntax.md +0 -123
package/docs/swarm-mail-architecture.md +0 -1147
package/docs/testing/context-recovery-test.md +0 -470
package/evals/ARCHITECTURE.md +0 -1189
package/evals/README.md +0 -768
package/evals/compaction-prompt.eval.ts +0 -149
package/evals/compaction-resumption.eval.ts +0 -289
package/evals/coordinator-behavior.eval.ts +0 -307
package/evals/coordinator-session.eval.ts +0 -154
package/evals/evalite.config.ts.bak +0 -15
package/evals/example.eval.ts +0 -31
package/evals/fixtures/cass-baseline.ts +0 -217
package/evals/fixtures/compaction-cases.ts +0 -350
package/evals/fixtures/compaction-prompt-cases.ts +0 -311
package/evals/fixtures/coordinator-sessions.ts +0 -328
package/evals/fixtures/decomposition-cases.ts +0 -105
package/evals/lib/compaction-loader.test.ts +0 -248
package/evals/lib/compaction-loader.ts +0 -320
package/evals/lib/data-loader.evalite-test.ts +0 -289
package/evals/lib/data-loader.test.ts +0 -345
package/evals/lib/data-loader.ts +0 -281
package/evals/lib/llm.ts +0 -115
package/evals/scorers/compaction-prompt-scorers.ts +0 -145
package/evals/scorers/compaction-scorers.ts +0 -305
package/evals/scorers/coordinator-discipline.evalite-test.ts +0 -539
package/evals/scorers/coordinator-discipline.ts +0 -325
package/evals/scorers/index.test.ts +0 -146
package/evals/scorers/index.ts +0 -328
package/evals/scorers/outcome-scorers.evalite-test.ts +0 -27
package/evals/scorers/outcome-scorers.ts +0 -349
package/evals/swarm-decomposition.eval.ts +0 -121
package/examples/commands/swarm.md +0 -745
package/examples/plugin-wrapper-template.ts +0 -2515
package/examples/skills/hive-workflow/SKILL.md +0 -212
package/examples/skills/skill-creator/SKILL.md +0 -223
package/examples/skills/swarm-coordination/SKILL.md +0 -292
package/global-skills/cli-builder/SKILL.md +0 -344
package/global-skills/cli-builder/references/advanced-patterns.md +0 -244
package/global-skills/learning-systems/SKILL.md +0 -644
package/global-skills/skill-creator/LICENSE.txt +0 -202
package/global-skills/skill-creator/SKILL.md +0 -352
package/global-skills/skill-creator/references/output-patterns.md +0 -82
package/global-skills/skill-creator/references/workflows.md +0 -28
package/global-skills/swarm-coordination/SKILL.md +0 -995
package/global-skills/swarm-coordination/references/coordinator-patterns.md +0 -235
package/global-skills/swarm-coordination/references/strategies.md +0 -138
package/global-skills/system-design/SKILL.md +0 -213
package/global-skills/testing-patterns/SKILL.md +0 -430
package/global-skills/testing-patterns/references/dependency-breaking-catalog.md +0 -586
package/opencode-swarm-plugin-0.30.7.tgz +0 -0
package/opencode-swarm-plugin-0.31.0.tgz +0 -0
package/scripts/cleanup-test-memories.ts +0 -346
package/scripts/init-skill.ts +0 -222
package/scripts/migrate-unknown-sessions.ts +0 -349
package/scripts/validate-skill.ts +0 -204
package/src/agent-mail.ts +0 -1724
package/src/anti-patterns.test.ts +0 -1167
package/src/anti-patterns.ts +0 -448
package/src/compaction-capture.integration.test.ts +0 -257
package/src/compaction-hook.test.ts +0 -838
package/src/compaction-hook.ts +0 -1204
package/src/compaction-observability.integration.test.ts +0 -139
package/src/compaction-observability.test.ts +0 -187
package/src/compaction-observability.ts +0 -324
package/src/compaction-prompt-scorers.test.ts +0 -475
package/src/compaction-prompt-scoring.ts +0 -300
package/src/contributor-tools.test.ts +0 -133
package/src/contributor-tools.ts +0 -201
package/src/dashboard.test.ts +0 -611
package/src/dashboard.ts +0 -462
package/src/error-enrichment.test.ts +0 -403
package/src/error-enrichment.ts +0 -219
package/src/eval-capture.test.ts +0 -1015
package/src/eval-capture.ts +0 -929
package/src/eval-gates.test.ts +0 -306
package/src/eval-gates.ts +0 -218
package/src/eval-history.test.ts +0 -508
package/src/eval-history.ts +0 -214
package/src/eval-learning.test.ts +0 -378
package/src/eval-learning.ts +0 -360
package/src/eval-runner.test.ts +0 -223
package/src/eval-runner.ts +0 -402
package/src/export-tools.test.ts +0 -476
package/src/export-tools.ts +0 -257
package/src/hive.integration.test.ts +0 -2241
package/src/hive.ts +0 -1628
package/src/index.ts +0 -940
package/src/learning.integration.test.ts +0 -1815
package/src/learning.ts +0 -1079
package/src/logger.test.ts +0 -189
package/src/logger.ts +0 -135
package/src/mandate-promotion.test.ts +0 -473
package/src/mandate-promotion.ts +0 -239
package/src/mandate-storage.integration.test.ts +0 -601
package/src/mandate-storage.test.ts +0 -578
package/src/mandate-storage.ts +0 -794
package/src/mandates.ts +0 -540
package/src/memory-tools.test.ts +0 -195
package/src/memory-tools.ts +0 -344
package/src/memory.integration.test.ts +0 -334
package/src/memory.test.ts +0 -158
package/src/memory.ts +0 -527
package/src/model-selection.test.ts +0 -188
package/src/model-selection.ts +0 -68
package/src/observability-tools.test.ts +0 -359
package/src/observability-tools.ts +0 -871
package/src/output-guardrails.test.ts +0 -438
package/src/output-guardrails.ts +0 -381
package/src/pattern-maturity.test.ts +0 -1160
package/src/pattern-maturity.ts +0 -525
package/src/planning-guardrails.test.ts +0 -491
package/src/planning-guardrails.ts +0 -438
package/src/plugin.ts +0 -23
package/src/post-compaction-tracker.test.ts +0 -251
package/src/post-compaction-tracker.ts +0 -237
package/src/query-tools.test.ts +0 -636
package/src/query-tools.ts +0 -324
package/src/rate-limiter.integration.test.ts +0 -466
package/src/rate-limiter.ts +0 -774
package/src/replay-tools.test.ts +0 -496
package/src/replay-tools.ts +0 -240
package/src/repo-crawl.integration.test.ts +0 -441
package/src/repo-crawl.ts +0 -610
package/src/schemas/cell-events.test.ts +0 -347
package/src/schemas/cell-events.ts +0 -807
package/src/schemas/cell.ts +0 -257
package/src/schemas/evaluation.ts +0 -166
package/src/schemas/index.test.ts +0 -199
package/src/schemas/index.ts +0 -286
package/src/schemas/mandate.ts +0 -232
package/src/schemas/swarm-context.ts +0 -115
package/src/schemas/task.ts +0 -161
package/src/schemas/worker-handoff.test.ts +0 -302
package/src/schemas/worker-handoff.ts +0 -131
package/src/sessions/agent-discovery.test.ts +0 -137
package/src/sessions/agent-discovery.ts +0 -112
package/src/sessions/index.ts +0 -15
package/src/skills.integration.test.ts +0 -1192
package/src/skills.test.ts +0 -643
package/src/skills.ts +0 -1549
package/src/storage.integration.test.ts +0 -341
package/src/storage.ts +0 -884
package/src/structured.integration.test.ts +0 -817
package/src/structured.test.ts +0 -1046
package/src/structured.ts +0 -762
package/src/swarm-decompose.test.ts +0 -188
package/src/swarm-decompose.ts +0 -1302
package/src/swarm-deferred.integration.test.ts +0 -157
package/src/swarm-deferred.test.ts +0 -38
package/src/swarm-insights.test.ts +0 -214
package/src/swarm-insights.ts +0 -459
package/src/swarm-mail.integration.test.ts +0 -970
package/src/swarm-mail.ts +0 -739
package/src/swarm-orchestrate.integration.test.ts +0 -282
package/src/swarm-orchestrate.test.ts +0 -548
package/src/swarm-orchestrate.ts +0 -3084
package/src/swarm-prompts.test.ts +0 -1270
package/src/swarm-prompts.ts +0 -2077
package/src/swarm-research.integration.test.ts +0 -701
package/src/swarm-research.test.ts +0 -698
package/src/swarm-research.ts +0 -472
package/src/swarm-review.integration.test.ts +0 -285
package/src/swarm-review.test.ts +0 -879
package/src/swarm-review.ts +0 -709
package/src/swarm-strategies.ts +0 -407
package/src/swarm-worktree.test.ts +0 -501
package/src/swarm-worktree.ts +0 -575
package/src/swarm.integration.test.ts +0 -2377
package/src/swarm.ts +0 -38
package/src/tool-adapter.integration.test.ts +0 -1221
package/src/tool-availability.ts +0 -461
package/tsconfig.json +0 -28

package/.hive/analysis/session-data-quality-audit.md DELETED

@@ -1,320 +0,0 @@
-# Session Data Quality Audit Report
-**Date:** 2025-12-25
-**Cell:** opencode-swarm-plugin--ys7z8-mjlk7jspacf
-**Agent:** WildDawn
-## Executive Summary
-Investigation of why only 3 of 102 sessions (2.9%) pass the coordinator-session eval filter reveals:
-1. **Filter is working as designed** - correctly isolating high-quality complete coordinator sessions
-2. **Data quality is actually GOOD** - the 3 passing sessions are gold-standard examples
-3. **97% filtered out is EXPECTED** - most sessions are worker completions, not coordinator sessions
-4. **Filter may be too strict** for broad coordinator behavior analysis (needs tuning)
-## Data Breakdown
-### Total Sessions: 102
-| Category | Count | % | Description |
-|----------|-------|---|-------------|
-| **Single-event sessions** | 70 | 68.6% | Worker completions (subtask_success), isolated reviews |
-| **Multi-event incomplete** | 29 | 28.4% | Coordinator sessions that didn't complete full cycle |
-| **Passing sessions** | 3 | 2.9% | Complete coordinator cycles with spawn + review |
-### Single-Event Sessions (70 sessions - 68.6%)
-**Event Type Breakdown:**
-- `OUTCOME/subtask_success`: 56 (80.0%) - **Worker completions, not coordinator sessions**
-- `DECISION/review_completed`: 12 (17.1%) - Isolated review events
-- `DECISION/worker_spawned`: 2 (2.9%) - Isolated spawn events
-**Analysis:** These are **NOT coordinator sessions**. They're worker agents reporting completion or isolated coordinator actions captured in separate session files.
-### Multi-Event Failures (29 sessions - 28.4%)
-**Failure Breakdown:**
-- **No worker_spawned event**: 20 sessions
-  - Review-only sessions (3-22 events, all `review_completed`)
-  - Appears to be test data or session capture split across files
-- **Has worker_spawned but no review_completed**: 5 sessions
-  - Incomplete coordinator sessions (4-24 events)
-  - Coordinator spawned workers but reviews weren't captured (yet)
-- **Too few events (<3)**: 4 sessions
-  - Aborted early
-**Key Finding:** None of these 29 sessions have `decomposition_complete` events. This suggests:
-1. Session capture may not be recording decomposition events
-2. OR coordinator sessions span multiple session files
-3. OR these are partial captures from long-running coordinators
-### Passing Sessions (3 sessions - 2.9%)
-#### ses_4b86f0867ffeXKv95ktf31igfD
-- **Events:** 33
-- **Worker spawns:** 20
-- **Reviews completed:** 13
-- **Violations:** 0
-- **Duration:** 437 minutes (7.3 hours)
-- **Quality:** GOLD STANDARD
-#### ses_4ac0f508dffeEcwSQ6OSMWrmWF
-- **Events:** 21
-- **Worker spawns:** 17
-- **Reviews completed:** 4
-- **Duration:** 540 minutes (9.0 hours)
-- **Quality:** GOLD STANDARD
-#### ses_4ae8c2f66ffecyfyre7ZQ7y5LW
-- **Events:** 31
-- **Worker spawns:** 24
-- **Reviews completed:** 7
-- **Violations:** 0
-- **Duration:** 368 minutes (6.1 hours)
-- **Quality:** GOLD STANDARD
-**Analysis:** These are FULL multi-hour coordinator sessions with extensive worker coordination. They represent the ideal coordinator behavior the eval is designed to measure.
-## Current Filter Criteria
-```typescript
-{
-  minEvents: 3,              // Default
-  requireWorkerSpawn: true,  // Default
-  requireReview: true,       // Default
-}
-```
-### Filter Performance
-| Check | Impact |
-|-------|--------|
-| `minEvents >= 3` | Filters out 74 sessions (72.5%) |
-| `requireWorkerSpawn: true` | Filters out 20 additional sessions (19.6%) |
-| `requireReview: true` | Filters out 5 additional sessions (4.9%) |
-**Cascade effect:** The checks run in sequence, so each one removes sessions that survived the previous check, leaving a 2.9% passing rate.
-## Root Cause Analysis
-### Is the Filter Too Strict?
-**YES and NO:**
-✅ **Working as designed:**
-- Correctly excludes worker-only sessions (80% of single-event data)
-- Correctly excludes incomplete coordinator sessions
-- Isolates high-quality complete coordinator cycles
-❌ **Too strict for real-world analysis:**
-- 2.9% passing rate means most coordinator behavior is invisible to the eval
-- Filter assumes coordinators ALWAYS complete full spawn+review cycles
-- Doesn't account for:
-  - Long-running multi-session coordinators
-  - Coordinators that have spawned workers whose reviews aren't captured yet
-  - Early-stage coordinator sessions (before first spawn)
-### Is the Data Quality Low?
-**NO.** The data quality is actually GOOD:
-- The 3 passing sessions are excellent gold-standard examples
-- They contain rich coordinator behavior (17-24 worker spawns, 4-13 reviews)
-- Zero violations in all 3 sessions
-- Multi-hour timelines showing sustained coordination
-The "low passing rate" is a **filter strictness issue**, not a data quality issue.
-### Why Only 3/102 Pass?
-**Theory 1: Session Capture Splits Long Coordinators**
-- The 3 passing sessions are 6-9 hour marathons
-- Most coordinator work may be happening in shorter bursts
-- Session files might be split by epic_id or time windows
-**Evidence:**
-- Some sessions have 20+ `review_completed` events with no `worker_spawned`
-- This suggests reviews from previous spawns in a different session file
-**Theory 2: Review Capture Is Incomplete**
-- 5 sessions have `worker_spawned` but no `review_completed`
-- Reviews may be captured in separate session files
-- OR review capture isn't working consistently
-**Theory 3: Most Coordinator Sessions Are Short**
-- Only 32/102 sessions (31.4%) have ANY `review_completed` event
-- Only 10/102 sessions (9.8%) have ANY `worker_spawned` event
-- This suggests most captured activity is worker completions, not coordinator cycles
-## Recommendations
-### 1. Make Filter Parameters Optional (IMMEDIATE)
-**Current default:**
-```typescript
-{
-  minEvents: 3,
-  requireWorkerSpawn: true,
-  requireReview: true,
-}
-```
-**Recommended default:**
-```typescript
-{
-  minEvents: 3,              // Keep - filters out noise
-  requireWorkerSpawn: false, // CHANGE - allow early-stage sessions
-  requireReview: false,      // CHANGE - allow incomplete sessions
-}
-```
-**Impact:** This would raise the number of passing sessions from 3 to ~28 (passing rate from 2.9% to 27.5%).
-**Rationale:**
-- Captures more coordinator behavior (spawns without reviews)
-- Allows evaluation of early-stage coordination patterns
-- Still filters out single-event worker completions
-- Users can opt-in to stricter filters if needed
-### 2. Add Session Type Detection (ENHANCEMENT)
-Add a filter to exclude worker-only sessions automatically:
-```typescript
-function isCoordinatorSession(session: CoordinatorSession): boolean {
-  return session.events.some(e =>
-    e.event_type === "DECISION" &&
-    (e.decision_type === "decomposition_complete" ||
-     e.decision_type === "worker_spawned" ||
-     e.decision_type === "strategy_selected")
-  );
-}
-```
-**Impact:** Filters out 70+ worker-only sessions before applying other criteria.
-### 3. Investigate Session Capture Splitting (BUG FIX?)
-**Symptoms:**
-- Sessions with 22 `review_completed` events but no `worker_spawned`
-- Sessions with 24 `worker_spawned` events but no reviews
-- No `decomposition_complete` events in ANY session (including the 3 passing)
-**Hypothesis:** Long-running coordinator sessions may be split across multiple session files.
-**Action:** Investigate `eval-capture.ts` to understand:
-- How `session_id` is generated
-- Whether sessions are split by epic_id
-- Whether there's a session timeout that creates new files
-### 4. Add Filter Reporting to Data Loader (OBSERVABILITY)
-The data loader logs the filtered-out count, but doesn't break down WHY sessions were filtered out.
-**Enhancement:**
-```typescript
-console.log(`Filtered out ${filteredOutCount} sessions:`);
-console.log(`  - Too few events (<${minEvents}): ${stats.tooFewEvents}`);
-console.log(`  - No worker_spawned: ${stats.noWorkerSpawn}`);
-console.log(`  - No review_completed: ${stats.noReview}`);
-console.log(`  - Worker-only sessions: ${stats.workerOnly}`);
-```
-This helps users understand filter impact.
-### 5. Consider Separate Evals for Different Session Types
-Instead of one eval with strict filters, consider:
-**Eval 1: Full Coordinator Cycles** (current behavior)
-- Filters: `minEvents=3, requireWorkerSpawn=true, requireReview=true`
-- Focus: End-to-end coordinator discipline
-- Expected passing rate: ~3% (gold standard only)
-**Eval 2: Coordinator Spawning Behavior**
-- Filters: `minEvents=3, requireWorkerSpawn=true, requireReview=false`
-- Focus: How coordinators delegate work
-- Expected passing rate: ~10%
-**Eval 3: Coordinator Review Behavior**
-- Filters: `minEvents=3, requireWorkerSpawn=false, requireReview=true`
-- Focus: How coordinators review worker output
-- Expected passing rate: ~31%
-**Eval 4: All Coordinator Activity**
-- Filters: `minEvents=3, requireWorkerSpawn=false, requireReview=false, isCoordinatorSession=true`
-- Focus: Broad coordinator behavior patterns
-- Expected passing rate: ~27%
-## Conclusion
-The coordinator-session eval filter is **working as designed**. It successfully isolates high-quality complete coordinator sessions for evaluation.
-However, **a filter that passes only 2.9% of sessions is too strict** for comprehensive coordinator behavior analysis. The filter should:
-1. **Default to more lenient settings** (requireWorkerSpawn=false, requireReview=false)
-2. **Allow users to opt-in** to stricter filters for gold-standard analysis
-3. **Automatically exclude worker-only sessions** via session type detection
-4. **Provide visibility** into why sessions are filtered out
-The data quality itself is GOOD. The 3 passing sessions are excellent examples of sustained multi-hour coordinator behavior with extensive worker coordination and zero violations.
----
-## Appendix: Raw Data
-### Event Count Distribution
-```
-  1 event:  70 sessions (68.6%) - Mostly worker completions
-  2 events:  4 sessions (3.9%)
-  3 events:  6 sessions (5.9%)
-  4 events:  3 sessions (2.9%)
-  5 events:  3 sessions (2.9%)
-  6 events:  2 sessions (2.0%)
-  7 events:  1 session  (1.0%)
-  9 events:  1 session  (1.0%)
- 21 events:  1 session  (1.0%) ✓ PASSING
- 22 events:  5 sessions (4.9%)
- 24 events:  1 session  (1.0%)
- 27 events:  1 session  (1.0%)
- 30 events:  2 sessions (2.0%)
- 31 events:  1 session  (1.0%) ✓ PASSING
- 33 events:  1 session  (1.0%) ✓ PASSING
-```
-### Sample Worker-Only Sessions
-```
-ses_6EraEW6LTRswygMPQa2voC.jsonl (1 event):
-  OUTCOME/subtask_success
-ses_xyJ85H9SaA5FSnJvDL7ktJ.jsonl (1 event):
-  OUTCOME/subtask_success
-ses_BiqTpFyafkbpt3tvZbh29R.jsonl (1 event):
-  DECISION/review_completed
-```
-### Sample Incomplete Coordinator Sessions
-```
-ses_4aa1d6e57ffeGfXIoIMNhTQ9JI.jsonl (7 events):
-  DECISION/worker_spawned (x7)
-  → Missing reviews
-ses_3t9CP2ZG54wF3D982kZgps.jsonl (3 events):
-  DECISION/review_completed (x3)
-  → Missing spawns
-test-review-1766636012605.jsonl (22 events):
-  DECISION/review_completed (x22)
-  → Missing spawns (likely test data)
-```
----
-**Generated by:** WildDawn (swarm worker agent)
-**Date:** 2025-12-25
-**Files analyzed:** 102 session files from `~/.config/swarm-tools/sessions/`
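
Note on the deleted report: its Filter Performance table counts each session under the first check it fails, and recommendation 4 asks for that breakdown to be logged. The following is a minimal sketch of such a filter, not the package's actual data loader. The `DemoSession`, `DemoEvent`, `FilterOptions`, `firstRejectReason`, and `filterSessions` names are hypothetical; only the option names (`minEvents`, `requireWorkerSpawn`, `requireReview`) and the event names come from the report.

```typescript
// Hypothetical shapes: field names follow the report's snippets, but the real
// types in opencode-swarm-plugin may differ.
interface DemoEvent {
  event_type: "DECISION" | "OUTCOME";
  decision_type?: string;
  outcome_type?: string;
}

interface DemoSession {
  session_id: string;
  events: DemoEvent[];
}

interface FilterOptions {
  minEvents: number;
  requireWorkerSpawn: boolean;
  requireReview: boolean;
}

type RejectReason = "tooFewEvents" | "noWorkerSpawn" | "noReview";

interface FilterResult {
  passing: DemoSession[];
  stats: Record<RejectReason, number>;
}

// True if the session contains at least one DECISION event of the given type.
const hasDecision = (s: DemoSession, type: string): boolean =>
  s.events.some((e) => e.event_type === "DECISION" && e.decision_type === type);

// Checks run in order; a session is attributed to the first check it fails,
// so the per-reason counts never double-count a session.
function firstRejectReason(s: DemoSession, o: FilterOptions): RejectReason | null {
  if (s.events.length < o.minEvents) return "tooFewEvents";
  if (o.requireWorkerSpawn && !hasDecision(s, "worker_spawned")) return "noWorkerSpawn";
  if (o.requireReview && !hasDecision(s, "review_completed")) return "noReview";
  return null;
}

function filterSessions(sessions: DemoSession[], options: FilterOptions): FilterResult {
  const stats: Record<RejectReason, number> = { tooFewEvents: 0, noWorkerSpawn: 0, noReview: 0 };
  const passing: DemoSession[] = [];
  for (const s of sessions) {
    const reason = firstRejectReason(s, options);
    if (reason === null) passing.push(s);
    else stats[reason] += 1;
  }
  return { passing, stats };
}

// Synthetic sessions for illustration only (not taken from the audited data).
const decision = (t: string): DemoEvent => ({ event_type: "DECISION", decision_type: t });
const demo: DemoSession[] = [
  { session_id: "worker-only", events: [{ event_type: "OUTCOME", outcome_type: "subtask_success" }] },
  { session_id: "review-only", events: [decision("review_completed"), decision("review_completed"), decision("review_completed")] },
  { session_id: "spawn-only", events: [decision("worker_spawned"), decision("worker_spawned"), decision("worker_spawned")] },
  { session_id: "full-cycle", events: [decision("worker_spawned"), decision("worker_spawned"), decision("review_completed")] },
];

const strict = filterSessions(demo, { minEvents: 3, requireWorkerSpawn: true, requireReview: true });
const lenient = filterSessions(demo, { minEvents: 3, requireWorkerSpawn: false, requireReview: false });

console.log("strict:", strict.passing.map((s) => s.session_id), strict.stats);
console.log("lenient:", lenient.passing.map((s) => s.session_id), lenient.stats);
```

With the strict options only `full-cycle` passes and each of the other three sessions is counted under a different reason. With the lenient options recommended in the report, only the single-event `worker-only` session is rejected. Attributing each session to its first failing check keeps the per-reason counts summing to the total filtered-out count.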