codeswarm 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

package/.codeswarm/skills/prd_template.md +98 -0
package/AGENT_TIPS.md +206 -0
package/BROWSER_TESTING.md +177 -0
package/COORDINATOR.md +151 -0
package/LICENSE +21 -0
package/README.md +253 -0
package/TASK_PROTOCOL.md +111 -0
package/WORKFLOWS.md +174 -0
package/bin/codeswarm.js +15 -0
package/config.yaml +55 -0
package/coordinator.sh +1762 -0
package/dashboard/package-lock.json +1036 -0
package/dashboard/package.json +14 -0
package/dashboard/public/index.html +758 -0
package/dashboard/server.js +444 -0
package/docs/prd-example.md +90 -0
package/docs/prd-template.md +45 -0
package/orchestrate.sh +467 -0
package/package.json +62 -0
package/playwright.config.ts +19 -0
package/setup.sh +142 -0

package/.codeswarm/skills/prd_template.md ADDED

@@ -0,0 +1,98 @@
+# PRD Generator Skill
+Generate a structured Product Requirements Document (PRD) for autonomous agent execution.
+## Goal
+Create a PRD that breaks work into small, dependency-ordered user stories with verifiable acceptance criteria.
+## PRD Format
+Write the PRD using this exact structure:
+```markdown
+# PRD: <Project Title>
+## Overview
+<2-3 sentence description of the feature and what problem it solves>
+## Tech Stack
+<language, framework, database — from project analysis>
+## User Stories
+### US-001: <Title> [priority: 1]
+**Description:** As a <user>, I want <feature> so that <benefit>.
+**Files:** `path/to/file1`, `path/to/file2`
+**Acceptance Criteria:**
+- [ ] <specific, testable criterion>
+- [ ] <specific, testable criterion>
+- [ ] Build passes (e.g. `mvn compile`, `npm run build`)
+**Dependencies:** none
+**Notes:** <patterns to follow, gotchas>
+### US-002: <Title> [priority: 2]
+...
+```
+## Story Sizing Rules
+Each user story MUST be completable by one agent in ~10 minutes (one context window).
+### Right-sized:
+- Add a database column and migration
+- Add a UI component to an existing page
+- Update a service method with new logic
+- Add a filter/dropdown to a list
+- Create a single REST endpoint
+### Too big — split these:
+- "Build the entire dashboard" → Split: schema, queries, UI, filters
+- "Add authentication" → Split: schema, middleware, login UI, session
+- "Refactor the API" → Split: one story per endpoint
+**Rule of thumb:** If you cannot describe the change in 2-3 sentences, it's too big.
+## Story Ordering
+Stories execute in priority order. Earlier stories must NOT depend on later ones.
+**Correct Order:**
+1. Schema/database changes (migrations, tables)
+2. Domain entities/models
+3. Service layer / business logic
+4. REST controllers / API endpoints
+5. UI components that use the backend
+6. Integration / summary views
+## Acceptance Criteria Rules
+Each criterion must be verifiable — something an agent can CHECK.
+### Good (verifiable):
+- "Add `status` column to tasks table with default 'PENDING'"
+- "GET /api/tasks returns 200 with JSON array"
+- "Filter dropdown has options: All, Active, Completed"
+- "Build passes (`mvn compile` / `npm run build`)"
+### Bad (vague):
+- "Works correctly"
+- "Good UX"
+- "Handles edge cases"
+### Always include as the last criterion:
+- `Build passes` (e.g. `mvn compile`, `npm run build`, `tsc --noEmit`)
+### For UI stories, also include:
+- `Verify in browser` (visual confirmation via screenshot or dev tools)
+## Before Saving
+Verify:
+- [ ] Existing code has been read to understand patterns and conventions
+- [ ] Each story is completable in one iteration (~10 min)
+- [ ] Stories ordered by dependency (schema → entities → services → controllers → UI)
+- [ ] Every story has "Build passes" in acceptance criteria
+- [ ] Acceptance criteria are specific and verifiable
+- [ ] No story depends on a later story
+- [ ] File paths are specific (name files, classes, methods)
+- [ ] Referenced existing implementations as patterns to follow
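
Note on package/.codeswarm/skills/prd_template.md: the rules "Earlier stories must NOT depend on later ones" and "Every story has Build passes in acceptance criteria" lend themselves to a mechanical check. The following is a minimal sketch, assuming the PRD has already been parsed into objects. The `Story` shape and the `validateStories` name are hypothetical and do not appear in the package.

```typescript
// Hypothetical in-memory shape for a parsed user story. The PRD template
// only defines the markdown layout, not this type.
interface Story {
  id: string;             // e.g. "US-001"
  priority: number;       // lower number executes earlier
  dependencies: string[]; // ids of stories that must run first
  criteria: string[];     // acceptance criteria, in document order
}

// Returns human-readable problems; an empty array means both checks passed.
function validateStories(stories: Story[]): string[] {
  const problems: string[] = [];
  const priorityById: { [id: string]: number } = {};
  for (const s of stories) {
    priorityById[s.id] = s.priority;
  }

  for (const s of stories) {
    // Check 1: every dependency exists and runs strictly earlier.
    for (const dep of s.dependencies) {
      const depPriority = priorityById[dep];
      if (depPriority === undefined) {
        problems.push(s.id + " depends on unknown story " + dep);
      } else if (depPriority >= s.priority) {
        problems.push(s.id + " depends on " + dep + ", which does not run earlier");
      }
    }
    // Check 2: the last criterion mentions "Build passes".
    const last = s.criteria.length > 0 ? s.criteria[s.criteria.length - 1] : "";
    if ((last || "").toLowerCase().indexOf("build passes") === -1) {
      problems.push(s.id + ' does not end with a "Build passes" criterion');
    }
  }
  return problems;
}
```

A caller would treat a non-empty result as a reason to reorder or rewrite stories before saving the PRD.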

package/AGENT_TIPS.md ADDED

@@ -0,0 +1,206 @@
+# Agent Tips & Configuration
+## Claude Code
+### Key Flags
+```bash
+claude -p "prompt"              # Headless mode (print & exit)
+claude --model opus             # Use Claude Opus
+claude --model sonnet           # Use Claude Sonnet (faster/cheaper)
+claude --permission-mode plan   # Read-only mode (no edits)
+claude --permission-mode acceptEdits  # Auto-approve edits (no prompts)
+claude --chrome                 # Enable Chrome browser control
+claude --agents '{"name": {...}}'   # Define custom subagents
+claude --add-dir /other/project     # Add additional directories
+claude --mcp-config mcp.json       # Load MCP servers
+```
+### Claude Agent Teams (Experimental)
+Define a team of specialized subagents:
+```bash
+claude --agents '{
+  "architect": {
+    "description": "Designs system architecture",
+    "prompt": "You analyze requirements and design scalable architectures",
+    "model": "opus"
+  },
+  "implementer": {
+    "description": "Implements code changes",
+    "prompt": "You write clean, tested code following team conventions",
+    "model": "sonnet"
+  },
+  "qa": {
+    "description": "Tests and validates changes",
+    "prompt": "You test code for bugs, edge cases, and regressions",
+    "model": "sonnet"
+  }
+}'
+```
+Or via `.claude/agents/` markdown files:
+```markdown
+---
+name: architect
+description: Designs system architecture
+model: opus
+allowedTools:
+  - Read
+  - Bash(find:*, grep:*, cat:*, tree:*)
+---
+You are an expert software architect. Analyze requirements and produce
+detailed implementation plans with file paths, interfaces, and data models.
+```
+### Best Practices
+- Use `opus` for planning (better reasoning, slower)
+- Use `sonnet` for execution (fast, good at code generation)
+- Use `--permission-mode plan` for reviewers (prevents accidental edits)
+- Use `--add-dir` to give access to multiple related projects
+---
+## Gemini CLI
+### Key Flags
+```bash
+gemini -p "prompt"              # Headless mode (prompt & exit)
+gemini --model gemini-2.5-pro   # Specify model
+gemini --approval-mode yolo     # Auto-approve everything
+gemini --approval-mode auto_edit # Auto-approve edits only
+gemini --approval-mode plan     # Read-only mode
+gemini -y                       # YOLO mode (same as approval-mode yolo)
+gemini --include-directories /other/project  # Add extra dirs
+```
+### Configuration Files
+- `~/.gemini/settings.json` — Global settings and MCP servers
+- `.gemini/settings.json` — Project-level settings
+- `.gemini/AGENTS.md` — Custom agent instructions
+### Project-Level Agent Instructions
+Create `.gemini/AGENTS.md` in your project:
+```markdown
+# Project Agent Instructions
+## Architecture
+- This is a Spring Boot microservice using Hexagonal Architecture
+- Entities go in domain/, services in application/, REST controllers in adapter/web/
+## Code Conventions
+- Use Lombok @Data for DTOs
+- Use constructor injection
+- All endpoints return ResultDTO
+- Follow existing naming patterns
+```
+### Best Practices
+- Gemini is excellent at file editing — ideal as executor
+- Use `--approval-mode auto_edit` for automated execution
+- YOLO mode (`-y`) is useful for fully automated pipelines
+- Use `--include-directories` for multi-project tasks
+---
+## Codex CLI
+### Key Flags
+```bash
+codex exec "prompt"             # Non-interactive execution
+codex review                    # Code review mode
+codex exec --sandbox read-only "prompt"  # Read-only sandbox
+codex -m o3 "prompt"            # Use specific model
+```
+### Configuration
+`~/.codex/config.toml`:
+```toml
+model = "o4-mini"
+[sandbox_permissions]
+disk-full-read-access = true
+```
+### Code Review
+```bash
+# Review uncommitted changes
+codex review
+# Review with specific focus
+codex exec --sandbox read-only "Review the git diff for security issues, \
+  performance problems, and adherence to SOLID principles. \
+  Output a structured markdown report."
+```
+### Best Practices
+- Codex excels at code review with `--sandbox read-only`
+- Use `codex review` for the built-in review workflow
+- Use `o3` or `o4-mini` models for reasoning-heavy reviews
+- The sandbox prevents accidental modifications during review
+---
+## Windsurf
+### Integration with Orchestrator
+Windsurf (Cascade) can read the `.codeswarm/` directory:
+1. Open your project in Windsurf
+2. Ask Cascade: *"Read `.codeswarm/plan.md` and implement the changes"*
+3. Or: *"Read `.codeswarm/review.md` and fix the issues mentioned"*
+### When to Use Windsurf
+- Complex refactoring that benefits from IDE context
+- Debugging with breakpoints and step-through
+- Visual feedback during development
+- When you want an interactive conversation about the plan
+---
+## Antigravity
+### Integration with Orchestrator
+Antigravity has built-in browser capabilities:
+1. Open your project in Antigravity
+2. Ask: *"Read `.codeswarm/plan.md` and implement changes"*
+3. For browser testing: *"Open http://localhost:4200 in the browser, log in, and verify the dashboard"*
+4. Design tasks: Use Pencil MCP for design-to-code workflows
+### When to Use Antigravity
+- Tasks involving UI design or visual prototyping
+- Browser-based verification with built-in browser tools
+- When you need Pencil MCP for design work
+---
+## Choosing the Right Agent
+| Task Type | Recommended | Why |
+|-----------|-------------|-----|
+| Architecture planning | Claude (Opus) | Best reasoning and analysis |
+| Code implementation | Gemini CLI | Fast, great at file editing |
+| Code review | Codex CLI | Built-in review mode, sandbox |
+| Complex debugging | Windsurf | IDE debugging tools |
+| Frontend/Design | Antigravity | Browser + Pencil MCP |
+| Quick prototyping | Gemini (YOLO) | Fastest iteration |
+| Security audit | Codex (read-only) | Sandboxed analysis |
+| Multi-repo tasks | Claude | Best at cross-project reasoning |
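
Note on package/AGENT_TIPS.md: the "Choosing the Right Agent" table can be restated as a typed lookup, which may be convenient if agent selection is ever scripted. This is only a sketch. The key names in `TaskType` and the `recommendAgent` function are hypothetical, while the agent names and reasons are copied from the table.

```typescript
// Task categories from the table; the string keys are hypothetical
// identifiers invented for this sketch.
type TaskType =
  | "architecture-planning"
  | "code-implementation"
  | "code-review"
  | "complex-debugging"
  | "frontend-design"
  | "quick-prototyping"
  | "security-audit"
  | "multi-repo";

interface Recommendation {
  agent: string; // value of the "Recommended" column
  why: string;   // value of the "Why" column
}

// One entry per table row. Record<TaskType, ...> makes the compiler
// reject a missing or misspelled category.
const RECOMMENDATIONS: Record<TaskType, Recommendation> = {
  "architecture-planning": { agent: "Claude (Opus)", why: "Best reasoning and analysis" },
  "code-implementation": { agent: "Gemini CLI", why: "Fast, great at file editing" },
  "code-review": { agent: "Codex CLI", why: "Built-in review mode, sandbox" },
  "complex-debugging": { agent: "Windsurf", why: "IDE debugging tools" },
  "frontend-design": { agent: "Antigravity", why: "Browser + Pencil MCP" },
  "quick-prototyping": { agent: "Gemini (YOLO)", why: "Fastest iteration" },
  "security-audit": { agent: "Codex (read-only)", why: "Sandboxed analysis" },
  "multi-repo": { agent: "Claude", why: "Best at cross-project reasoning" },
};

// Looks up the table row for a task category.
function recommendAgent(task: TaskType): Recommendation {
  return RECOMMENDATIONS[task];
}
```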

package/BROWSER_TESTING.md ADDED

@@ -0,0 +1,177 @@
+# 🌐 Browser Testing Guide
+## Overview
+Frontend testing is handled via **Playwright MCP** — a Model Context Protocol server that gives AI agents control over a real browser (Chrome/Chromium). Agents can navigate, click, type, take screenshots, and report results.
+## Setup
+### 1. Install Playwright
+```bash
+# From the agentic project root
+cd /Users/msk/IdeaProjects/agentic
+npm init -y
+npm install playwright @playwright/test
+npx playwright install chromium
+```
+### 2. Install Playwright MCP Server
+```bash
+npm install -g @anthropic/mcp-server-playwright
+# or use npx
+npx @anthropic/mcp-server-playwright
+```
+### 3. Configure MCP for Claude Code
+Create or update `~/.claude/mcp.json`:
+```json
+{
+  "mcpServers": {
+    "playwright": {
+      "command": "npx",
+      "args": ["@anthropic/mcp-server-playwright"],
+      "env": {
+        "PLAYWRIGHT_HEADLESS": "true"
+      }
+    }
+  }
+}
+```
+### 4. Configure MCP for Gemini CLI
+Create or update `~/.gemini/settings.json`:
+```json
+{
+  "mcpServers": {
+    "playwright": {
+      "command": "npx",
+      "args": ["@anthropic/mcp-server-playwright"]
+    }
+  }
+}
+```
+### 5. Configure MCP for Codex CLI
+Create or update `~/.codex/config.toml` or provide via `--mcp-config`:
+```toml
+[mcp_servers.playwright]
+command = "npx"
+args = ["@anthropic/mcp-server-playwright"]
+```
+## Usage
+### Via Orchestrator
+```bash
+./orchestrate.sh \
+  --project ~/IdeaProjects/derin-ui-manager \
+  --task "Verify login page and dashboard navigation" \
+  --browser-test \
+  --test-url "http://localhost:4200"
+```
+### Direct Agent Testing
+#### Claude Code (with Chrome integration)
+```bash
+# Claude Code has a built-in --chrome flag for browser control
+claude -p "Navigate to http://localhost:4200, log in with admin/admin, \
+  go to the Dashboard page, take a screenshot, then go to Settings, \
+  take another screenshot. Report any visual issues." \
+  --chrome
+```
+#### Claude Code (with Playwright MCP)
+```bash
+claude -p "Use the Playwright MCP tools to: \
+  1. Open http://localhost:4200 \
+  2. Fill in username 'admin' and password 'admin' \
+  3. Click the Login button \
+  4. Wait for the dashboard to load \
+  5. Take a screenshot named 'dashboard.png' \
+  6. Click on each sidebar menu item \
+  7. Take a screenshot of each page \
+  8. Report your findings"
+```
+#### Gemini CLI (with Playwright MCP)
+```bash
+gemini -p "Use Playwright to test http://localhost:4200: \
+  log in with admin/admin, navigate to all main pages, \
+  take screenshots, verify no console errors, report issues."
+```
+## Writing Custom Test Scripts
+For repeatable tests, create Playwright test files:
+```bash
+mkdir -p /Users/msk/IdeaProjects/agentic/tests
+```
+### Example: Login & Navigate Test
+Create `tests/login-navigate.spec.ts`:
+```typescript
+import { test, expect } from '@playwright/test';
+test.describe('Login and Navigation', () => {
+  test('should log in and navigate the dashboard', async ({ page }) => {
+    // Navigate to login
+    await page.goto('http://localhost:4200');
+    await page.screenshot({ path: '.agentic/screenshots/01-login-page.png' });
+    // Fill credentials
+    await page.fill('input[name="username"]', 'admin');
+    await page.fill('input[name="password"]', 'admin');
+    await page.click('button[type="submit"]');
+    // Wait for dashboard
+    await page.waitForURL('**/dashboard**');
+    await page.screenshot({ path: '.agentic/screenshots/02-dashboard.png' });
+    // Navigate sidebar items
+    const menuItems = await page.locator('nav a').all();
+    for (let i = 0; i < menuItems.length; i++) {
+      await menuItems[i].click();
+      await page.waitForLoadState('networkidle');
+      await page.screenshot({
+        path: `.agentic/screenshots/03-page-${i}.png`
+      });
+    }
+  });
+});
+```
+### Running Tests
+```bash
+npx playwright test tests/login-navigate.spec.ts --reporter=html
+```
+## Screenshot Reports
+Screenshots are saved to `<project>/.agentic/screenshots/`. The orchestrator generates a markdown report referencing these screenshots.
+## Troubleshooting
+| Issue | Solution |
+|-------|----------|
+| Browser won't launch | Run `npx playwright install chromium` |
+| MCP server not found | `npm install -g @anthropic/mcp-server-playwright` |
+| Login fails | Check credentials in `config.yaml` |
+| Timeout errors | Increase `timeout_ms` in `config.yaml` |
+| Port in use | Stop the process occupying the port, or point `--test-url` at the port the dev server actually uses |
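
Note on package/BROWSER_TESTING.md: setup steps 3 and 4 write nearly identical `mcpServers` JSON for Claude Code and Gemini CLI, differing only in the optional `env` block. A small builder could generate both from one place. This is a sketch under assumptions: the function names are hypothetical, and the server package name is passed in by the caller, not asserted here.

```typescript
// Shape of one entry under "mcpServers", mirroring the JSON in the guide.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: { [name: string]: string };
}

interface McpConfig {
  mcpServers: { [serverName: string]: McpServerEntry };
}

// Builds the "playwright" server entry. `serverPackage` is whichever npm
// package provides the MCP server; it is supplied by the caller.
function buildPlaywrightMcpConfig(serverPackage: string, headless?: boolean): McpConfig {
  const entry: McpServerEntry = { command: "npx", args: [serverPackage] };
  if (headless !== undefined) {
    // The guide's Claude Code example sets PLAYWRIGHT_HEADLESS as a string.
    entry.env = { PLAYWRIGHT_HEADLESS: String(headless) };
  }
  return { mcpServers: { playwright: entry } };
}

// Serialises with two-space indentation, matching the guide's JSON samples.
function toJson(config: McpConfig): string {
  return JSON.stringify(config, null, 2);
}
```

Calling `buildPlaywrightMcpConfig(pkg, true)` yields the shape of the Claude Code sample, and omitting the second argument yields the shape of the Gemini CLI sample.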

package/COORDINATOR.md ADDED

@@ -0,0 +1,151 @@
+# Autonomous Multi-Agent Coordinator
+## Overview
+Unlike the basic `orchestrate.sh` (sequential pipeline), `coordinator.sh` runs agents in an **autonomous feedback loop** where they communicate, delegate tasks, and iterate until the work is approved or the iteration limit is reached.
+## Flow
+```
+                    ┌─────────────┐
+             ┌─────│   PLANNER   │─────┐
+             │     └─────────────┘     │
+             │          │ plan.md      │
+             │          ▼              │
+             │     ┌─────────────┐     │
+     replan  │     │  EXECUTOR   │     │ re-consult on
+     request │     └─────────────┘     │ major issues
+             │          │ code changes │
+             │          ▼              │
+             │     ┌─────────────┐     │
+             │     │  REVIEWER   │─────┘
+             │     └──────┬──────┘
+             │            │
+             │     ┌──────┴──────┐
+             │     │             │
+             ▼   APPROVED    NEEDS_CHANGES
+            DONE     ✓      │
+                            │ feedback
+                            ▼
+                     ┌─────────────┐
+                     │  EXECUTOR   │ ← fixes issues
+                     └──────┬──────┘
+                            │
+                            ▼
+                     ┌─────────────┐
+                     │  REVIEWER   │ ← re-checks
+                     └──────┬──────┘
+                            │
+                     APPROVED or loop again
+                     (max N iterations)
+```
+## Key Features
+### 1. Message Bus
+All agents communicate through a file-based message bus at `<project>/.agentic/bus/`. Every message is a timestamped markdown file that any agent can read.
+### 2. Feedback Loop
+When the reviewer says `NEEDS_CHANGES`, the executor automatically gets the feedback and tries again. No human intervention needed.
+### 3. Planner Re-consultation
+If the reviewer flags `BLOCKER`, `ARCHITECTURE`, or `FUNDAMENTAL` issues, the coordinator automatically re-consults the planner to update the plan before the executor tries again.
+### 4. Session History
+All messages, prompts, and outputs are saved to `<project>/.codeswarm/sessions/<session_id>/`, creating a full audit trail.
+### 5. Configurable Iterations
+Set `--max-iterations` to control how many execute→review cycles are allowed (default: 5).
+## Usage
+### Basic
+```bash
+./coordinator.sh \
+  --project ~/IdeaProjects/blue-flow \
+  --task "Add leave approval notification endpoint"
+```
+### Custom Agents & Models
+```bash
+./coordinator.sh \
+  --project ~/IdeaProjects/blue-flow \
+  --task "Implement İşe Başlayan Personel Girişi process" \
+  --planner codex --planner-model gpt-5.3 \
+  --executor claude --executor-model opus \
+  --reviewer gemini \
+  --context "İşe Başlayan Personel Girişi.xml" \
+  --max-iterations 5
+```
+### With Context Files
+```bash
+./coordinator.sh \
+  --project ~/IdeaProjects/blue-flow \
+  --task "Implement the leave process from XML" \
+  --context "IdariPersonelIzinSureci.xml,src/main/java/com/app/blue/domain/Leave.java"
+```
+### Verbose Mode
+```bash
+./coordinator.sh \
+  --project ~/IdeaProjects/blue-flow \
+  --task "Fix bug in process engine" \
+  --verbose
+```
+### Dry Run (see commands without executing)
+```bash
+./coordinator.sh \
+  --project ~/IdeaProjects/blue-flow \
+  --task "Test task" \
+  --dry-run
+```
+## Session Artifacts
+After a run, you'll find:
+```
+<project>/.codeswarm/
+├── plan.md                         # Current plan
+├── execution.log                   # Latest executor output
+├── review.md                       # Latest review
+├── report-<timestamp>.md           # Final summary report
+└── sessions/session_<timestamp>/
+    ├── coordinator.log             # Full coordinator log
+    ├── msg_001_planner_to_executor.md
+    ├── msg_002_executor_to_reviewer.md
+    ├── msg_003_reviewer_to_executor.md  # Feedback
+    ├── msg_004_executor_to_reviewer.md  # Re-implementation
+    ├── msg_005_reviewer_to_executor.md  # APPROVED!
+    ├── prompt_codex_iter1.md       # Exact prompts sent
+    ├── output_codex_iter1.md       # Exact outputs received
+    ├── output_claude_iter1.md
+    ├── output_gemini_iter1.md
+    └── ...
+```
+## CLI Reference
+| Flag | Description | Default |
+|------|-------------|---------|
+| `--project` | Target project directory | required |
+| `--task` | Task description | required |
+| `--planner` | Agent for planning | codex |
+| `--executor` | Agent for execution | claude |
+| `--reviewer` | Agent for review | gemini |
+| `--planner-model` | Model for planner | (default) |
+| `--executor-model` | Model for executor | (default) |
+| `--reviewer-model` | Model for reviewer | (default) |
+| `--context` | Comma-separated context files | none |
+| `--max-iterations` | Max execute→review cycles | 5 |
+| `--verbose` | Show full agent output | false |
+| `--dry-run` | Print commands only | false |
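
Note on package/COORDINATOR.md: the feedback loop described under "Flow" and "Key Features" (execute, review, retry on `NEEDS_CHANGES`, re-consult the planner on `BLOCKER`, `ARCHITECTURE`, or `FUNDAMENTAL`, stop after the iteration limit) can be sketched as below. This is not the logic of `coordinator.sh`, which shells out to agent CLIs. The `Agents` interface and the `runFeedbackLoop` name are hypothetical stand-ins that keep the control flow testable in isolation.

```typescript
// Hypothetical agent callbacks standing in for the planner, executor,
// and reviewer CLIs.
interface Agents {
  plan(task: string, feedback?: string): string;
  execute(plan: string, feedback?: string): string;
  review(plan: string, changes: string): string; // free-text verdict
}

interface LoopResult {
  approved: boolean;
  iterations: number; // execute→review cycles actually run
  replans: number;    // times the planner was re-consulted
}

// Keywords that, per the documentation, trigger planner re-consultation.
const REPLAN_FLAGS = ["BLOCKER", "ARCHITECTURE", "FUNDAMENTAL"];

function runFeedbackLoop(agents: Agents, task: string, maxIterations: number = 5): LoopResult {
  let plan = agents.plan(task);
  let feedback: string | undefined;
  let replans = 0;

  for (let i = 1; i <= maxIterations; i++) {
    const changes = agents.execute(plan, feedback);
    const verdict = agents.review(plan, changes);

    // Approved only if the verdict says APPROVED and not NEEDS_CHANGES.
    const needsChanges = verdict.indexOf("NEEDS_CHANGES") !== -1;
    if (!needsChanges && verdict.indexOf("APPROVED") !== -1) {
      return { approved: true, iterations: i, replans: replans };
    }

    // Otherwise the verdict becomes feedback for the next attempt.
    feedback = verdict;
    if (REPLAN_FLAGS.some((flag) => verdict.indexOf(flag) !== -1)) {
      plan = agents.plan(task, verdict);
      replans++;
    }
  }
  return { approved: false, iterations: maxIterations, replans: replans };
}
```

Passing the agents in as callbacks is a design choice for the sketch only. It lets the loop be exercised with stubbed verdicts without launching any CLI.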

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 mskutlu
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.