npm - codeswarm - Versions diffs - 0.1.0 - Mend

codeswarm 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

package/.codeswarm/skills/prd_template.md +98 -0
package/AGENT_TIPS.md +206 -0
package/BROWSER_TESTING.md +177 -0
package/COORDINATOR.md +151 -0
package/LICENSE +21 -0
package/README.md +253 -0
package/TASK_PROTOCOL.md +111 -0
package/WORKFLOWS.md +174 -0
package/bin/codeswarm.js +15 -0
package/config.yaml +55 -0
package/coordinator.sh +1762 -0
package/dashboard/package-lock.json +1036 -0
package/dashboard/package.json +14 -0
package/dashboard/public/index.html +758 -0
package/dashboard/server.js +444 -0
package/docs/prd-example.md +90 -0
package/docs/prd-template.md +45 -0
package/orchestrate.sh +467 -0
package/package.json +62 -0
package/playwright.config.ts +19 -0
package/setup.sh +142 -0

package/README.md ADDED

@@ -0,0 +1,253 @@
+# 🤖 Codeswarm — Multi-Agent AI Orchestration Framework
+> Coordinate **Claude Code**, **Gemini CLI**, **Codex CLI**, **Amp**, and **OpenCode** as a unified development team with a planner-driven feedback loop.
+[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
+[![npm version](https://img.shields.io/npm/v/codeswarm.svg)](https://www.npmjs.com/package/codeswarm)
+[![Node.js](https://img.shields.io/badge/Node.js-18+-green.svg)](https://nodejs.org)
+## Why Codeswarm?
+Instead of running one AI agent at a time, Codeswarm assigns **different roles** to multiple AI agents:
+- 🧠 **Planner** — Reads your task or PRD, creates a structured execution plan
+- ⚡ **Executor** — Implements the changes according to the plan
+- 🔍 **Reviewer** — Reviews the diff, runs quality checks, provides feedback
+- 🎨 **Frontend Dev** — Handles UI-specific tasks (optional)
+The planner dynamically issues directives (`EXECUTE`, `REVIEW`, `APPROVE`, `SKIP`, `DONE`), creating an autonomous feedback loop that runs until the work is approved.
+## Architecture
+```
+                      ┌─────────────┐
+               ┌─────│   PLANNER   │─────┐
+               │     └─────────────┘     │
+               │          │ directives   │
+               │          ▼              │
+               │     ┌─────────────┐     │
+       replan  │     │  EXECUTOR   │     │ re-consult on
+       request │     └─────────────┘     │ major issues
+               │          │ code changes │
+               │          ▼              │
+               │     ┌─────────────┐     │
+               │     │  REVIEWER   │─────┘
+               │     └──────┬──────┘
+               │      ┌─────┴─────┐
+               ▼    APPROVED   NEEDS_CHANGES
+              DONE     ✓      → loop back to EXECUTOR
+```
+## Quick Start
+```bash
+# Install globally
+npm install -g codeswarm
+# Run with a PRD file
+codeswarm --project ~/my-app --prd docs/feature.md
+# Run with a task description
+codeswarm --project ~/my-app --task "Add user authentication with JWT"
+# Run with JSON PRD (ralph-style)
+codeswarm --project ~/my-app --prd prd.json
+# Use specific agents
+codeswarm --project ~/my-app \
+  --task "Fix pagination bug" \
+  --planner codex \
+  --executor claude \
+  --reviewer gemini,amp
+# Run a complex workflow with a custom plan and dashboard
+codeswarm --project ~/my-project \
+  --plan "docs/plan.md" \
+  --planner codex \
+  --executor claude \
+  --reviewer gemini,amp \
+  --dashboard
+```
+## Features
+### 🔀 Agent-Agnostic Orchestration
+Mix and match any combination of supported AI coding agents per role:
+| Agent | CLI | Use As |
+|-------|-----|--------|
+| **Claude Code** | `claude` | Planner, Executor, Reviewer |
+| **Gemini CLI** | `gemini` | Planner, Executor, Reviewer |
+| **Codex CLI** | `codex` | Planner, Executor, Reviewer |
+| **Amp** | `amp` | Executor, Reviewer |
+| **OpenCode** | `opencode` | Executor, Reviewer |
+### 📋 PRD-First Workflow
+Feed a Product Requirements Document (PRD) and Codeswarm breaks it into ordered, dependency-aware user stories:
+- **Markdown PRD** — `### US-001: Title` format with acceptance criteria
+- **JSON PRD** — Ralph-compatible `prd.json` with `userStories` array
+- **Auto-generate** — Provide `--task` and the planner generates a PRD first
+### 📊 Real-Time Dashboard
+Built-in monitoring dashboard with WebSocket live updates:
+- Agent flow visualization (who's running what)
+- **Log search** with `Ctrl+F` — search within agent output logs
+- **Log download** — export raw agent logs
+- **Phase detection** — see if agent is Reading, Implementing, Testing, Building
+- **PRD progress** — per-story acceptance criteria pass/fail tracking
+- **Directive timeline** — visual history of EXECUTE → REVIEW → APPROVE flow
+- Subtask progress with live status updates
+```bash
+# Start with dashboard
+codeswarm --project ~/my-app --prd docs/feature.md --dashboard
+```
+### 🛡️ Safety Features
+- **Watchdog timer** — kills stuck agents that produce no output
+- **Retry logic** — handles transient agent failures (API timeouts, connection resets)
+- **Session audit trail** — every prompt/log/directive saved to `.codeswarm/sessions/`
+- **Dry-run mode** — preview all prompts without executing agents
+## Project Structure
+```
+codeswarm/
+├── bin/
+│   └── codeswarm.js          # CLI entry point (npm global binary)
+├── coordinator.sh             # Core orchestration engine (v7.0)
+├── orchestrate.sh             # Legacy sequential pipeline
+├── setup.sh                   # Dependency installer
+├── config.yaml                # Default agent roles, models, timeouts
+├── dashboard/
+│   ├── server.js              # Express + WebSocket dashboard server
+│   ├── package.json           # Dashboard dependencies
+│   └── public/
+│       └── index.html         # Dashboard SPA (dark theme, live UI)
+├── .codeswarm/
+│   └── skills/
+│       └── prd_template.md    # PRD generation skill for planner agents
+├── docs/
+│   ├── prd-template.md        # PRD format template
+│   └── prd-example.md         # Example PRD
+├── COORDINATOR.md             # Coordinator architecture docs
+├── AGENT_TIPS.md              # Per-agent configuration tips
+├── TASK_PROTOCOL.md           # How agents communicate via shared files
+├── BROWSER_TESTING.md         # Frontend testing with Playwright MCP
+├── WORKFLOWS.md               # Workflow definitions
+├── playwright.config.ts       # Playwright test configuration
+└── package.json               # npm package manifest
+```
+## CLI Reference
+| Flag | Description | Default |
+|------|-------------|---------|
+| `--project` | Target project directory | **required** |
+| `--task` | Task description (auto-generates PRD) | — |
+| `--prd` | PRD file path (`.md` or `.json`) | — |
+| `--plan` | Existing plan file (skip planning) | — |
+| `--planner` | Agent for planning | `codex` |
+| `--executor` | Agent for execution | `claude` |
+| `--reviewer` | Agent(s) for review (comma-separated) | `gemini` |
+| `--fe-dev` | Frontend executor agent | — |
+| `--fe-reviewer` | Frontend reviewer agent(s) | — |
+| `--max-rounds` | Max planner rounds | `10` |
+| `--max-iterations` | Max execute→review cycles per subtask | `5` |
+| `--dashboard` | Start real-time dashboard | `false` |
+| `--tmux` | Use tmux for agent terminals | `false` |
+| `--dry-run` | Print prompts without executing | `false` |
+| `--verbose` | Show full agent output | `false` |
+| `--context` | Comma-separated context files | — |
+## Configuration
+Edit `config.yaml` to set defaults:
+```yaml
+roles:
+  planner: claude
+  executor: gemini
+  reviewer: codex
+models:
+  claude: opus
+  gemini: ""        # uses default
+  codex: ""         # uses default
+timeouts:
+  planner: 300
+  executor: 600
+  reviewer: 300
+hooks:
+  after_plan: ""          # e.g. "./hooks/validate-plan.sh"
+  after_execute: ""       # e.g. "npm run build && npm test"
+  after_review: ""        # e.g. "./hooks/notify-slack.sh"
+```
+## Session Artifacts
+After a run, find everything under your project:
+```
+<project>/.codeswarm/
+├── task.md                          # Current task plan
+├── sessions/session_<timestamp>/
+│   ├── coordinator.log              # Full orchestration log
+│   ├── metadata.json                # Agent roles metadata
+│   ├── prompt_001_codex.md          # Exact prompt sent to planner
+│   ├── log_001_codex.md             # Planner output
+│   ├── prompt_002_claude.md         # Executor prompt
+│   ├── log_002_claude.md            # Executor output
+│   ├── prompt_003_gemini.md         # Reviewer prompt
+│   ├── log_003_gemini.md            # Reviewer output
+│   └── directives/
+│       ├── directive_001.md         # EXECUTE #1
+│       ├── directive_002.md         # REVIEW #1
+│       └── directive_003.md         # APPROVE #1
+└── docs/tasks/                      # Archived completed tasks
+```
+## Documentation
+| Document | Description |
+|----------|-------------|
+| [COORDINATOR.md](./COORDINATOR.md) | Architecture deep-dive and flow diagram |
+| [TASK_PROTOCOL.md](./TASK_PROTOCOL.md) | How agents communicate via shared files |
+| [AGENT_TIPS.md](./AGENT_TIPS.md) | Per-agent configuration and tips |
+| [BROWSER_TESTING.md](./BROWSER_TESTING.md) | Frontend testing with Playwright MCP |
+| [WORKFLOWS.md](./WORKFLOWS.md) | Workflow definitions |
+| [config.yaml](./config.yaml) | Default role assignments and settings |
+## Requirements
+- **Node.js** ≥ 18
+- **Bash** ≥ 4.0
+- At least one AI coding CLI installed:
+  - [Claude Code](https://github.com/anthropics/claude-code): `npm i -g @anthropic-ai/claude-code`
+  - [Gemini CLI](https://github.com/google-gemini/gemini-cli): `npm i -g @google/gemini-cli`
+  - [Codex CLI](https://github.com/openai/codex): `npm i -g @openai/codex`
+  - [Amp](https://ampcode.com): Install from website
+  - [OpenCode](https://opencode.ai): Install from website
+- **jq** (optional, for JSON PRD support): `brew install jq`
+## Contributing
+```bash
+# Clone the repo
+git clone https://github.com/mskutlu/codeswarm.git
+cd codeswarm
+# Install dependencies
+./setup.sh
+# Run tests
+./coordinator.sh --project /tmp/test-project --prd docs/prd-example.md --dry-run
+```
+## License
+MIT © [mskutlu](https://github.com/mskutlu)
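
Reviewer note on the README's directive loop: the sketch below is one way to picture the planner-driven cycle, assuming the planner returns exactly one directive per round and that the run ends on `DONE` or when `--max-rounds` (default `10` per the CLI table) is reached. `runLoop` and the scripted stub planner are hypothetical names for illustration; nothing here is taken from `coordinator.sh`.

```javascript
// Hypothetical sketch of the planner-driven loop; not taken from coordinator.sh.
const DIRECTIVES = new Set(['EXECUTE', 'REVIEW', 'APPROVE', 'SKIP', 'DONE']);

// nextDirective(round, timeline) stands in for a real planner agent call.
function runLoop(nextDirective, { maxRounds = 10 } = {}) {
  const timeline = [];
  for (let round = 1; round <= maxRounds; round++) {
    const directive = nextDirective(round, timeline);
    if (!DIRECTIVES.has(directive)) {
      throw new Error(`Unknown directive: ${directive}`);
    }
    timeline.push(directive);
    if (directive === 'DONE') {
      return { finished: true, rounds: round, timeline };
    }
  }
  // Round budget exhausted without a DONE directive.
  return { finished: false, rounds: maxRounds, timeline };
}

// Stub planner: a scripted sequence replaces the real agent.
const script = ['EXECUTE', 'REVIEW', 'APPROVE', 'DONE'];
const result = runLoop((round) => script[round - 1]);
console.log(result.timeline.join(' -> '));
```

Capping the loop at a fixed number of rounds keeps a planner that never emits `DONE` from running indefinitely, which is presumably the purpose of the `--max-rounds` flag.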

package/TASK_PROTOCOL.md ADDED

@@ -0,0 +1,111 @@
+# Task Protocol — Inter-Agent Communication
+This document defines how agents communicate through the shared `.codeswarm/` directory.
+## Directory Structure
+Every project that uses the orchestrator gets a `.codeswarm/` directory:
+```
+<project-root>/
+└── .codeswarm/
+    ├── plan.md                    # Planner output
+    ├── execution.log              # Executor output
+    ├── review.md                  # Reviewer output
+    ├── test-report.md             # Browser test results
+    ├── screenshots/               # Browser test screenshots
+    │   ├── step-01-login.png
+    │   ├── step-02-navigate.png
+    │   └── ...
+    └── report-<timestamp>.md      # Final orchestration report
+```
+## Message Format
+All inter-agent files use **Markdown** with structured sections. This ensures both humans and agents can read them.
+### Plan (plan.md)
+```markdown
+# Implementation Plan
+## Task Summary
+<What needs to be done>
+## File Changes
+### [MODIFY] src/main/java/com/app/Service.java
+- Add new method `processLeave()`
+- Inject `NotificationService`
+### [NEW] src/main/java/com/app/dto/LeaveRequestDTO.java
+- Fields: employeeId, startDate, endDate, type
+## Testing Strategy
+- Unit test for `processLeave()`
+- Integration test for API endpoint
+## Risk Assessment
+- Breaking change to existing API contract
+```
+### Execution Log (execution.log)
+Raw output from the executor agent. Contains:
+- Files read and modified
+- Commands executed (build, test)
+- Any errors or warnings
+- Deviations from the plan
+### Review (review.md)
+```markdown
+# Code Review Report
+## Summary
+**Verdict:** APPROVED | NEEDS_CHANGES | REJECTED
+## Findings
+### Critical
+- None
+### Warnings
+- Missing null check in processLeave() line 45
+### Suggestions
+- Consider using Optional<> for return type
+## Plan Adherence
+All planned changes implemented correctly.
+```
+## Agent Prompting Rules
+1. **Planner** receives: task description + project context
+2. **Executor** receives: task description + plan.md content
+3. **Reviewer** receives: task description + plan.md + git diff
+Each agent should:
+- Stay in its role (don't plan during execution, don't execute during review)
+- Reference file paths relative to project root
+- Use structured markdown for outputs
+- Flag blockers immediately
+## IDE Integration (Windsurf / Antigravity)
+IDEs can participate by:
+1. **Reading** `.codeswarm/plan.md` to understand what the CLI agents planned
+2. **Editing** files interactively when the executor needs human guidance
+3. **Running** browser tests using IDE's built-in browser tools
+4. **Reviewing** by opening `review.md` in the IDE's markdown preview
+### Windsurf Integration
+```
+# In Windsurf, open the project and use Cascade to implement the plan:
+"Read .codeswarm/plan.md and implement the changes described in it"
+```
+### Antigravity Integration
+```
+# In Antigravity, use the built-in browser for visual testing:
+"Read .codeswarm/plan.md, implement changes, then use browser to verify at http://localhost:4200"
+```
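
Reviewer note on the `review.md` format: because the verdict sits on a fixed `**Verdict:**` line, a consumer could extract it with a single regular expression. The sketch below assumes a real review carries exactly one of the three values on that line. `parseVerdict` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: pull the verdict out of a review.md body.
// Matches only when the line holds exactly one of the three allowed values.
const VERDICT_RE = /^\*\*Verdict:\*\*\s*(APPROVED|NEEDS_CHANGES|REJECTED)\s*$/m;

function parseVerdict(reviewMarkdown) {
  const match = VERDICT_RE.exec(reviewMarkdown);
  return match ? match[1] : null;
}

// Sample review following the structure shown in TASK_PROTOCOL.md.
const sample = [
  '# Code Review Report',
  '## Summary',
  '**Verdict:** NEEDS_CHANGES',
  '## Findings',
  '### Warnings',
  '- Missing null check in processLeave() line 45',
].join('\n');

console.log(parseVerdict(sample));
```

Returning `null` for a missing or ambiguous verdict (for example, the unfilled template line listing all three options) lets the caller decide whether to retry the reviewer or stop.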

package/WORKFLOWS.md ADDED

@@ -0,0 +1,174 @@
+# Advanced Workflows & Examples
+## Workflow 1: Full-Stack Feature (Backend + Frontend)
+```bash
+# Step 1: Plan with Claude Opus across both repos
+./orchestrate.sh \
+  --project ~/IdeaProjects/blue-flow \
+  --task "Add leave approval notification: create REST endpoint in blue-flow, \
+          add Kafka event, consume in derin-ui-manager to show real-time toast" \
+  --planner claude --model opus
+# Step 2: Execute backend with Gemini
+./orchestrate.sh \
+  --project ~/IdeaProjects/blue-flow \
+  --task "Implement the backend changes from plan" \
+  --skip-plan --executor gemini --skip-review
+# Step 3: Execute frontend with Gemini
+./orchestrate.sh \
+  --project ~/WebstormProjects/derin-ui-manager \
+  --task "Implement the frontend notification toast from plan in blue-flow/.agentic/plan.md" \
+  --skip-plan --executor gemini --skip-review
+# Step 4: Review everything with Codex
+./orchestrate.sh \
+  --project ~/IdeaProjects/blue-flow \
+  --task "Review all changes for leave approval notification" \
+  --skip-plan --reviewer codex
+# Step 5: Browser test the frontend
+./orchestrate.sh \
+  --project ~/WebstormProjects/derin-ui-manager \
+  --task "Test leave approval notification appears after triggering approval" \
+  --skip-plan --skip-review --browser-test --test-url "http://localhost:4200"
+```
+---
+## Workflow 2: Bug Fix Pipeline
+```bash
+# All-in-one: plan, fix, review
+./orchestrate.sh \
+  --project ~/IdeaProjects/derin-purchase \
+  --task "Fix: Purchase order total calculation returns 0 when \
+          discount percentage is applied. Error in PurchaseOrderService.calculateTotal()"
+```
+---
+## Workflow 3: Code Review Only
+```bash
+# Use Codex to review current uncommitted changes
+cd ~/IdeaProjects/blue-planning
+codex review
+# Or via orchestrator with only review phase
+./orchestrate.sh \
+  --project ~/IdeaProjects/blue-planning \
+  --task "Review uncommitted changes for quality and security" \
+  --skip-plan \
+  --reviewer codex
+```
+---
+## Workflow 4: Multi-Agent with Claude Agent Teams
+Use Claude's native `--agents` flag for intra-Claude orchestration:
+```bash
+claude --agents '{
+  "planner": {
+    "description": "Plans architecture changes",
+    "prompt": "You are a senior architect. Analyze tasks and produce detailed plans.",
+    "model": "opus",
+    "allowedTools": ["Read", "Bash(find:*,grep:*,cat:*)"]
+  },
+  "coder": {
+    "description": "Implements code changes",
+    "prompt": "You implement changes following the plan precisely.",
+    "model": "sonnet",
+    "allowedTools": ["Read", "Edit", "Bash"]
+  },
+  "tester": {
+    "description": "Tests changes",
+    "prompt": "You write and run tests. Use browser tools for UI testing.",
+    "model": "sonnet",
+    "allowedTools": ["Read", "Bash", "Browser"]
+  }
+}' --add-dir ~/IdeaProjects/blue-flow
+```
+---
+## Workflow 5: Parallel Multi-Project
+Run agents in parallel across related services:
+```bash
+#!/bin/bash
+# parallel-deploy.sh — Fix the same issue across multiple services
+TASK="Update blue-citrus-tools dependency to 0.4.84 and fix any compilation errors"
+for PROJECT in blue-planning blue-material-recipe blue-inventory-production derin-execution derin-purchase; do
+  echo "Processing $PROJECT..."
+  ./orchestrate.sh \
+    --project "$HOME/IdeaProjects/$PROJECT" \
+    --task "$TASK" &
+done
+wait
+echo "All runs finished; check each project's report for failures."
+```
+---
+## Workflow 6: Frontend Visual Regression
+```bash
+# Generate baseline screenshots
+./orchestrate.sh \
+  --project ~/WebstormProjects/derin-ui-manager \
+  --task "Take screenshots of every page: login, dashboard, inventory, \
+          planning, materials, quality control, settings" \
+  --skip-plan --skip-review --browser-test --test-url "http://localhost:4200"
+# After making changes, generate new screenshots and compare
+./orchestrate.sh \
+  --project ~/WebstormProjects/derin-ui-manager \
+  --task "Take the same screenshots as before and compare with \
+          baseline in .agentic/screenshots/. Report any visual differences." \
+  --skip-plan --skip-review --browser-test --test-url "http://localhost:4200"
+```
+---
+## Workflow 7: Using Windsurf as Executor
+When you prefer interactive development:
+1. Run the planner:
+   ```bash
+   ./orchestrate.sh --project ~/IdeaProjects/blue-flow \
+     --task "Add batch confirmation cancellation endpoint" \
+     --planner claude --model opus \
+     --skip-review
+   ```
+2. Open the project in Windsurf and tell Cascade:
+   > "Read `.agentic/plan.md` and implement all the changes described in it"
+3. After implementing, run the reviewer:
+   ```bash
+   ./orchestrate.sh --project ~/IdeaProjects/blue-flow \
+     --task "Review batch confirmation cancellation implementation" \
+     --skip-plan --reviewer codex
+   ```
+---
+## Workflow 8: Database Migration
+```bash
+./orchestrate.sh \
+  --project ~/IdeaProjects/blue-migration \
+  --task "Create Liquibase migration for new leave_requests table: \
+    columns: id (bigserial), employee_id (bigint FK), start_date, end_date, \
+    type (varchar), status (varchar), created_at, updated_at. \
+    Add indexes on employee_id and status. Follow existing changelog patterns."
+```
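
Reviewer note on Workflow 5: a plain `wait` does not report which background runs failed. The sketch below shows one way to prepare the per-project invocations and summarize their exit codes from Node. It uses only the `--project` and `--task` flags that appear in WORKFLOWS.md. `buildRuns`, `summarize`, the base directory, and the assumption that `orchestrate.sh` exits non-zero on failure are all hypothetical.

```javascript
const path = require('path');

// Build one invocation per project. Arguments stay in an array so no shell
// quoting is needed when they are later passed to child_process.spawn.
function buildRuns(projects, task, baseDir) {
  return projects.map((name) => ({
    name,
    command: './orchestrate.sh',
    args: ['--project', path.posix.join(baseDir, name), '--task', task],
  }));
}

// Summarize exit codes once every run has completed.
// Assumption: a non-zero exit code means the run failed.
function summarize(exitCodes) {
  const failed = Object.keys(exitCodes).filter((name) => exitCodes[name] !== 0);
  return { ok: failed.length === 0, failed };
}

const runs = buildRuns(
  ['blue-planning', 'derin-purchase'],
  'Update blue-citrus-tools dependency to 0.4.84 and fix any compilation errors',
  '/home/dev/IdeaProjects' // hypothetical base directory
);
console.log(runs.map((run) => run.args[1]));
console.log(summarize({ 'blue-planning': 0, 'derin-purchase': 1 }));
```

Each entry in `runs` can be handed to `child_process.spawn(run.command, run.args)`; collecting the `exit` codes into an object keyed by project name gives `summarize` its input.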

package/bin/codeswarm.js ADDED

@@ -0,0 +1,15 @@
+#!/usr/bin/env node
+const { execFileSync } = require('child_process');
+const path = require('path');
+const coordinator = path.join(__dirname, '..', 'coordinator.sh');
+const args = process.argv.slice(2);
+try {
+    execFileSync('bash', [coordinator, ...args], {
+        stdio: 'inherit',
+        env: { ...process.env, CODESWARM_ROOT: path.join(__dirname, '..') }
+    });
+} catch (e) {
+    process.exit(e.status || 1);
+}
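
Reviewer note on the `catch` branch above: `e.status || 1` forwards the coordinator's exit code when there is one and falls back to `1` otherwise. The sketch below isolates that expression in a hypothetical `exitCodeFor` helper and feeds it simplified stand-ins for the errors `execFileSync` can throw. The mock objects are assumptions and carry only the fields relevant here.

```javascript
// Mirrors the `e.status || 1` expression from bin/codeswarm.js.
function exitCodeFor(error) {
  return error.status || 1;
}

// Simplified stand-ins for thrown errors (real errors carry more fields).
const exitedWithCode = { status: 2 };                    // coordinator.sh exited with code 2
const killedBySignal = { status: null, signal: 'SIGTERM' }; // no numeric exit code
const bashMissing = { code: 'ENOENT' };                  // bash could not be spawned

console.log(exitCodeFor(exitedWithCode)); // 2
console.log(exitCodeFor(killedBySignal)); // 1
console.log(exitCodeFor(bashMissing));    // 1
```

One consequence of the fallback is that a signal-terminated coordinator and a missing `bash` both surface as exit code `1`, so callers cannot tell those cases apart from the exit code alone.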

package/config.yaml ADDED

@@ -0,0 +1,55 @@
+# Default agent role assignments
+# Override with CLI flags: --planner, --executor, --reviewer
+roles:
+  planner: claude        # claude | gemini | codex | amp | opencode
+  executor: gemini       # claude | gemini | codex | amp | opencode
+  reviewer: codex        # claude | gemini | codex | amp | opencode
+# Model preferences per agent
+models:
+  claude: opus           # opus | sonnet | haiku  (alias or full name e.g. claude-opus-4-6-20260212)
+  gemini: ""             # leave empty for default, or specify e.g. gemini-2.5-pro
+  codex: ""              # leave empty for default, or specify e.g. o3
+  amp: ""                # leave empty for default
+  opencode: ""           # leave empty for default
+# Permission modes
+permissions:
+  planner: plan          # plan = read-only, no edits
+  executor: auto_edit    # auto_edit = auto-approve file edits
+  reviewer: plan         # plan = read-only analysis
+# Shared workspace settings
+workspace:
+  agentic_dir: .agentic              # created inside each project
+  plan_file: plan.md
+  execution_log: execution.log
+  review_file: review.md
+  screenshots_dir: screenshots
+  test_report: test-report.md
+# Browser testing (Playwright MCP)
+browser_testing:
+  enabled: false
+  browser: chromium                  # chromium | firefox | webkit
+  headless: true                     # set false to watch the browser
+  base_url: "http://localhost:4200"
+  credentials:
+    username: "admin"
+    password: "admin"
+  screenshot_on_step: true
+  timeout_ms: 30000
+# Timeouts (seconds)
+timeouts:
+  planner: 300
+  executor: 600
+  reviewer: 300
+  browser_test: 120
+# Hooks (optional scripts to run between phases)
+hooks:
+  after_plan: ""       # e.g. "./hooks/validate-plan.sh"
+  after_execute: ""    # e.g. "npm run build && npm test"
+  after_review: ""     # e.g. "./hooks/notify-slack.sh"
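
Reviewer note on role overrides: the header comment in `config.yaml` says the `roles` defaults can be overridden with `--planner`, `--executor`, and `--reviewer`. The sketch below illustrates that precedence, assuming flags are passed as `--flag value` pairs and that `--reviewer` accepts a comma-separated list as the README states. `resolveRoles` is a hypothetical helper; the actual parsing in `coordinator.sh` is not shown in this diff.

```javascript
// Defaults copied from the roles section of config.yaml.
const DEFAULT_ROLES = { planner: 'claude', executor: 'gemini', reviewer: 'codex' };

// Hypothetical resolver: CLI flags win over config defaults.
function resolveRoles(argv, defaults = DEFAULT_ROLES) {
  const roles = { ...defaults };
  for (let i = 0; i < argv.length - 1; i++) {
    const key = argv[i].startsWith('--') ? argv[i].slice(2) : null;
    if (key && Object.prototype.hasOwnProperty.call(roles, key)) {
      roles[key] = argv[i + 1];
      i++; // skip the value that was just consumed
    }
  }
  // The README documents --reviewer as a comma-separated list of agents.
  return { ...roles, reviewer: roles.reviewer.split(',') };
}

const resolved = resolveRoles([
  '--project', '~/my-app',
  '--planner', 'codex',
  '--reviewer', 'gemini,amp',
]);
console.log(resolved);
```

Flags that are not role names, such as `--project`, are ignored by this resolver, so it can run over the full argument list without a separate filtering step.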