npm - agentic-orchestrator - Versions diffs - 0.1.2 → 0.1.4 - Mend

agentic-orchestrator 0.1.2 → 0.1.4


Files changed (300)

package/spec-files/completed/agentic_orchestrator_feature_gaps_closure_spec.md ADDED

@@ -0,0 +1,1764 @@
+# Feature Spec: Closing Functionality Gaps Between Agentic-Orchestrator and ComposioHQ Agent Orchestrator
+**Version:** 2.0
+**Date:** 2026-03-02
+**Status:** Draft (Revised — deep-dive analysis against ComposioHQ source)
+**Milestone:** M29 - Feature Gap Closure
+---
+## 0. Implementation Standards & References
+### 0.1 Testing Standards
+All new code MUST follow the testing standards defined in:
+- **`prompts/vitest-testing-standards.instructions.md`**
+Key requirements:
+- Use Vitest (`describe/it/expect`, `vi` mocks/spies)
+- Match existing repo conventions (test files in `apps/control-plane/test/*.spec.ts`)
+- Use **Given / When / Then** naming: `GIVEN_<context>_WHEN_<action>_THEN_<expected>`
+- Maintain coverage thresholds: lines/branches ≥90%
+- No flaky tests: use `vi.useFakeTimers()` for time-dependent tests
+- Mock external I/O (HTTP calls, filesystem outside temp dirs)
+### 0.2 Reference Implementation Repository
+The ComposioHQ Agent Orchestrator implementation serves as the reference for feature implementations:
+- **Repository:** https://github.com/ComposioHQ/agent-orchestrator
+- **Tech stack:** TypeScript ESM, pnpm workspaces, Commander.js CLI, Next.js 15 + React 19 dashboard, Zod validation, Vitest + Playwright tests, tmux-based agent runtime
+- **Key directories to study:**
+  - `packages/web/src/` — Dashboard (Next.js App Router, Kanban view, WebSocket terminal via ttyd/xterm.js)
+  - `packages/core/src/` — Core services (`session-manager.ts` 38KB, `lifecycle-manager.ts` 20KB, `config.ts` 13KB, `metadata.ts`, `plugin-registry.ts`, `orchestrator-prompt.ts`, `paths.ts`)
+  - `packages/plugins/notifier-*/` — 4 notification channels (slack, webhook, desktop, composio)
+  - `packages/plugins/agent-*/` — 4 agent plugins (claude-code 29KB, codex 28KB+16KB app-server, aider 7KB, opencode 5KB)
+  - `packages/plugins/tracker-*/` — 2 tracker plugins (github 8KB, linear 22KB with dual-transport)
+  - `packages/plugins/workspace-*/` — 2 workspace plugins (worktree, clone)
+  - `packages/plugins/runtime-*/` — 2 runtime plugins (tmux, process)
+  - `packages/plugins/terminal-*/` — 2 terminal plugins (iterm2, web)
+  - `packages/plugins/scm-github/` — PR lifecycle, CI checks, review decisions, merge control
+  - `packages/cli/src/commands/` — CLI commands (init, start, stop, spawn, batch-spawn, status, send, session, dashboard, review-check, open)
+- **Architecture note:** Composio uses a meta-agent pattern in which the orchestrator itself is an AI agent (Claude Code) that receives a system prompt teaching it the `ao` CLI. Worker agents are spawned in tmux sessions or processes. This is fundamentally different from the code-driven supervisor used by Agentic-Orchestrator (AOP).
+### 0.3 Package Dependencies (Required Additions)
+New dependencies to add to `apps/control-plane/package.json`:
+```json
+{
+  "dependencies": {
+    "chalk": "^5.3.0",
+    "yaml": "^2.3.4"
+  },
+  "devDependencies": {
+    "@testing-library/react": "^14.0.0",
+    "msw": "^2.0.0"
+  }
+}
+```
+New package for dashboard (`packages/web-dashboard/package.json`):
+```json
+{
+  "name": "@aop/web-dashboard",
+  "version": "0.1.0",
+  "type": "module",
+  "dependencies": {
+    "next": "^14.0.0",
+    "react": "^18.2.0",
+    "react-dom": "^18.2.0",
+    "@monaco-editor/react": "^4.6.0",
+    "tailwindcss": "^3.4.0"
+  }
+}
+```
+---
+## 1. Executive Summary
+This specification defines a roadmap to close identified functionality gaps between Agentic-Orchestrator (AOP) and ComposioHQ's Agent Orchestrator, focusing on high-value features that align with AOP's deterministic, MCP-first architecture while preserving our core differentiators (collision detection, lock management, multi-phase workflows, quality gates).
+**Guiding Principle:** Adopt features that enhance developer experience and operational visibility WITHOUT compromising deterministic guarantees, state consistency, or explicit merge control.
+---
+## 2. Gap Prioritization Framework
+### 2.1 Priority Tiers
+**P0 (Critical - Implement First):**
+- Features that eliminate major UX friction
+- Features that enable production deployment
+- Features required for multi-project workflows
+**P1 (High Value - Implement Soon):**
+- Features that significantly improve observability
+- Features that reduce manual configuration burden
+- Features that enable common use cases
+**P2 (Nice to Have - Future Consideration):**
+- Features that improve edge cases
+- Features with viable workarounds
+- Features with unclear ROI
+**P3 (Low Priority - Deferred):**
+- Features that conflict with core architecture
+- Features with marginal benefit
+- Features that require major redesigns
+---
+## 3. Gap Analysis by Priority
+### 3.1 P0 GAPS (Critical - Must Implement)
+#### G1: Web Dashboard with Real-Time Updates
+**ComposioHQ Feature:** Next.js dashboard with Server-Sent Events for live session monitoring.
+**Gap Impact:** No visual monitoring; CLI-only interaction creates friction for teams.
+**AOP Design Alignment:** HIGH - Monitoring does not conflict with deterministic model.
+**Reference Implementation (ComposioHQ):**
+- SSE endpoint: `packages/web/src/app/api/events/route.ts`
+- Dashboard components: `packages/web/src/components/Dashboard.tsx`, `SessionCard.tsx`, `SessionDetail.tsx`
+- Hooks: `packages/web/src/hooks/` (real-time state management)
+**Specification:**
+**Directory Structure:**
+```
+packages/web-dashboard/
+├── package.json
+├── tsconfig.json
+├── next.config.js
+├── tailwind.config.js
+├── src/
+│   ├── app/
+│   │   ├── layout.tsx
+│   │   ├── page.tsx                    # Main dashboard page
+│   │   ├── globals.css
+│   │   └── api/
+│   │       ├── status/route.ts         # GET /api/status
+│   │       ├── events/route.ts         # GET /api/events (SSE)
+│   │       ├── features/
+│   │       │   └── [id]/
+│   │       │       ├── route.ts           # GET /api/features/:id
+│   │       │       ├── diff/route.ts      # GET /api/features/:id/diff
+│   │       │       ├── evidence/
+│   │       │       │   └── [artifact]/route.ts
+│   │       │       ├── review/route.ts    # POST /api/features/:id/review (approve/deny/request-changes)
+│   │       │       └── checkout/route.ts  # POST /api/features/:id/checkout (switch main repo to worktree branch)
+│   │       └── actions/route.ts        # POST /api/actions
+│   ├── components/
+│   │   ├── FeatureCard.tsx             # Feature status card
+│   │   ├── FeatureDetail.tsx           # Expanded feature view
+│   │   ├── DiffViewer.tsx              # Monaco-based diff viewer
+│   │   ├── EvidenceViewer.tsx          # Logs/coverage display
+│   │   ├── ReviewPanel.tsx             # Review/approve/deny controls + merge trigger
+│   │   ├── CheckoutButton.tsx          # One-click checkout to feature worktree branch
+│   │   ├── StatusBadge.tsx             # Status indicator
+│   │   └── LockIndicator.tsx           # Lock visualization
+│   ├── hooks/
+│   │   ├── useFeatures.ts              # Feature state hook
+│   │   └── useSSE.ts                   # SSE connection hook
+│   └── lib/
+│       ├── aop-client.ts               # Read .aop/ files
+│       └── types.ts                    # TypeScript types
+└── vitest.config.ts
+```
+**SSE Implementation (Reference: ComposioHQ `packages/web/src/app/api/events/route.ts`):**
+```typescript
+// packages/web-dashboard/src/app/api/events/route.ts
+// Assumption: readFeaturesIndex is exported by src/lib/aop-client.ts (which reads .aop/ files).
+import { readFeaturesIndex } from "../../../lib/aop-client";
+export const dynamic = "force-dynamic";
+export async function GET(): Promise<Response> {
+  const encoder = new TextEncoder();
+  let heartbeat: ReturnType<typeof setInterval> | undefined;
+  let updates: ReturnType<typeof setInterval> | undefined;
+  // Stop both timers; safe to call more than once.
+  const stopTimers = () => {
+    clearInterval(heartbeat);
+    clearInterval(updates);
+  };
+  const stream = new ReadableStream({
+    start(controller) {
+      // Read the features index and push it as one SSE "data:" frame.
+      const sendSnapshot = async () => {
+        try {
+          const features = await readFeaturesIndex();
+          const event = { type: "snapshot", features };
+          controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`));
+        } catch {
+          // Either the read failed or the client disconnected (enqueue throws on a closed stream).
+          // Skip this tick; the heartbeat below stops the timers once the stream is closed.
+        }
+      };
+      // Send initial snapshot
+      void sendSnapshot();
+      // Heartbeat every 15s (an SSE comment line keeps proxies from closing the connection)
+      heartbeat = setInterval(() => {
+        try {
+          controller.enqueue(encoder.encode(`: heartbeat\n\n`));
+        } catch {
+          stopTimers();
+        }
+      }, 15000);
+      // Poll for changes every 2s
+      updates = setInterval(() => {
+        void sendSnapshot();
+      }, 2000);
+    },
+    cancel() {
+      stopTimers();
+    },
+  });
+  return new Response(stream, {
+    headers: {
+      "Content-Type": "text/event-stream",
+      "Cache-Control": "no-cache",
+      Connection: "keep-alive",
+      "X-Accel-Buffering": "no",
+    },
+  });
+}
+```
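The polling loop above re-sends the full snapshot every 2 seconds even when nothing has changed. A minimal sketch, assuming snapshots are JSON-serializable, of a helper that produces an SSE frame only when the payload differs from the previous one. `createSnapshotEmitter` is a hypothetical name and is not part of the spec or of the ComposioHQ reference.

```typescript
// Hypothetical helper: returns an SSE "data:" frame only when the snapshot changed.
type SnapshotEvent = { type: "snapshot"; features: unknown };

function createSnapshotEmitter(): (features: unknown) => string | null {
  let lastPayload: string | null = null;
  return (features: unknown): string | null => {
    const event: SnapshotEvent = { type: "snapshot", features };
    const payload = JSON.stringify(event);
    if (payload === lastPayload) return null; // unchanged: nothing to send
    lastPayload = payload;
    return `data: ${payload}\n\n`; // an SSE frame ends with a blank line
  };
}

// Usage: the second call sees an identical snapshot and yields no frame.
const emit = createSnapshotEmitter();
const firstFrame = emit({ active: ["feature_a"] });
const secondFrame = emit({ active: ["feature_a"] });
const thirdFrame = emit({ active: [] });
```

The comparison is on the serialized string, so it depends on stable key order in the index file; a content hash or file mtime check would be alternatives.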
+**useSSE Hook Implementation:**
+```typescript
+// packages/web-dashboard/src/hooks/useSSE.ts
+import { useState, useEffect, useRef } from 'react';
+interface SSEOptions {
+  url: string;
+  onMessage: (data: unknown) => void;
+  reconnectInterval?: number;
+}
+export function useSSE({ url, onMessage, reconnectInterval = 5000 }: SSEOptions) {
+  const [connected, setConnected] = useState(false);
+  const [error, setError] = useState<Error | null>(null);
+  // Keep the latest callback in a ref so an inline onMessage does not force a reconnect on every render.
+  const onMessageRef = useRef(onMessage);
+  useEffect(() => {
+    onMessageRef.current = onMessage;
+  }, [onMessage]);
+  useEffect(() => {
+    let eventSource: EventSource | null = null;
+    let reconnectTimer: ReturnType<typeof setTimeout> | null = null;
+    const connect = () => {
+      eventSource = new EventSource(url);
+      eventSource.onopen = () => {
+        setConnected(true);
+        setError(null);
+      };
+      eventSource.onmessage = (event) => {
+        try {
+          onMessageRef.current(JSON.parse(event.data));
+        } catch (e) {
+          console.error('SSE parse error:', e);
+        }
+      };
+      eventSource.onerror = () => {
+        setConnected(false);
+        setError(new Error(`SSE connection to ${url} lost`));
+        // Close explicitly and reconnect on our own schedule.
+        eventSource?.close();
+        reconnectTimer = setTimeout(connect, reconnectInterval);
+      };
+    };
+    connect();
+    return () => {
+      eventSource?.close();
+      if (reconnectTimer) clearTimeout(reconnectTimer);
+    };
+  }, [url, reconnectInterval]);
+  return { connected, error };
+}
+**CLI Integration:**
+```typescript
+// apps/control-plane/src/cli/dashboard-command-handler.ts
+import { spawn } from 'node:child_process';
+import { resolve } from 'node:path';
+export class DashboardCommandHandler {
+  async execute(options: { port?: number; foreground?: boolean }): Promise<void> {
+    const port = options.port ?? 3000;
+    const dashboardPath = resolve(__dirname, '../../../../packages/web-dashboard');
+    const env = {
+      ...process.env,
+      PORT: String(port),
+      AOP_ROOT: process.cwd(),
+    };
+    if (options.foreground) {
+      // Run in foreground
+      const child = spawn('npm', ['run', 'dev'], {
+        cwd: dashboardPath,
+        env,
+        stdio: 'inherit',
+      });
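+      // Wait until the dev server exits or fails to start.
+      await new Promise<void>((done, reject) => {
+        child.on('error', reject);
+        child.on('exit', () => done());
+      });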
+    } else {
+      // Run in background (detached)
+      const child = spawn('npm', ['run', 'start'], {
+        cwd: dashboardPath,
+        env,
+        detached: true,
+        stdio: 'ignore',
+      });
+      child.unref();
+      console.log(`Dashboard started on http://localhost:${port}`);
+    }
+  }
+}
+```
+**Testing Requirements (per `prompts/vitest-testing-standards.instructions.md`):**
+```typescript
+// packages/web-dashboard/src/__tests__/api-status.test.ts
+import { describe, it, expect, vi, beforeEach } from 'vitest';
+import { readFile } from 'node:fs/promises';
+import { GET } from '../app/api/status/route';
+// vi.mock calls are hoisted to the top of the file, so mock the module once here
+// and set the per-test behaviour on the shared vi.fn() inside each test.
+vi.mock('node:fs/promises', () => ({
+  readFile: vi.fn(),
+}));
+describe('GET /api/status', () => {
+  beforeEach(() => {
+    vi.mocked(readFile).mockReset();
+  });
+  it('GIVEN_valid_aop_directory_WHEN_status_requested_THEN_returns_features_list', async () => {
+    // Mock file reading
+    vi.mocked(readFile).mockResolvedValue(JSON.stringify({
+      active: ['feature_a'],
+      blocked: [],
+      merged: ['feature_b'],
+    }));
+    const response = await GET();
+    const data = await response.json();
+    expect(response.status).toBe(200);
+    expect(data.active).toContain('feature_a');
+  });
+  it('GIVEN_missing_index_file_WHEN_status_requested_THEN_returns_empty_state', async () => {
+    vi.mocked(readFile).mockRejectedValue(new Error('ENOENT'));
+    const response = await GET();
+    const data = await response.json();
+    expect(response.status).toBe(200);
+    expect(data.active).toEqual([]);
+  });
+});
+```
+**Dashboard Routes:**
+- `GET /` — Dashboard UI
+- `GET /api/status` — Global status snapshot (same as `aop status` JSON)
+- `GET /api/events` — SSE stream
+- `GET /api/features/:id` — Feature detail
+- `GET /api/features/:id/diff` — Diff bundle
+- `GET /api/features/:id/evidence/:artifact` — Evidence artifact download
+- `POST /api/actions` — Trigger feature actions (e.g., delete, with confirmation); path matches `api/actions/route.ts` in the directory structure
+- `POST /api/features/:id/review` — Review decision: approve, deny, or request changes
+- `POST /api/features/:id/checkout` — Checkout feature worktree branch to main repo
+**Review & Merge Control (Dashboard-Driven):**
+The dashboard serves as the primary human review interface. When a feature reaches `ready_to_merge` status, the reviewer can:
+1. **Review changes** — View the full diff (Monaco diff viewer), plan, evidence (gate results, coverage), and feature log from the feature detail page.
+2. **Approve & Merge** — Click "Approve & Merge" to instruct the orchestrator to execute `feature.ready_to_merge`. This:
+   - Calls `POST /api/features/:id/review` with `{ decision: "approve", approval_token: "<token>" }`
+   - Backend invokes the `feature.ready_to_merge` MCP tool with the approval token
+   - Merge executes through the existing deterministic merge path (lock acquisition, final gate run, merge commit, cleanup)
+   - Dashboard shows merge progress via SSE updates
+3. **Deny / Request Changes** — Click "Deny" or "Request Changes" to block the merge:
+   - `{ decision: "deny", reason: "..." }` — Moves feature back to `blocked` status with the reviewer's reason logged to feature decisions
+   - `{ decision: "request_changes", message: "..." }` — Sends the reviewer's feedback as a corrective prompt to the builder agent (via the reaction system) and keeps the feature in its current phase for another build/QA cycle
+4. **API Implementation:**
+   ```typescript
+   // POST /api/features/:id/review
+   interface ReviewRequest {
+     decision: 'approve' | 'deny' | 'request_changes';
+     approval_token?: string;  // required for approve
+     reason?: string;          // required for deny
+     message?: string;         // required for request_changes
+   }
+   ```
+   - `approve` → calls `ToolClient.call('feature.ready_to_merge', { feature_id, approval_token })`
+   - `deny` → calls `ToolClient.call('feature.state_patch', { feature_id, patch: { status: 'blocked' } })` + `ToolClient.call('feature.log_append', { feature_id, entry: reason })`
+   - `request_changes` → calls `ToolClient.call('feature.log_append', { feature_id, entry: message })` + triggers `changes_requested` reaction to inject feedback into agent
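The `ReviewRequest` interface above marks every payload field optional and documents the per-decision requirements only in comments. As one possible way to enforce them, here is a sketch of a validator that narrows the body to a discriminated union. `parseReviewRequest` and the error strings are hypothetical; the spec does not prescribe a validation approach (the reference stack uses Zod, which would also work).

```typescript
// Each decision carries exactly the field it requires.
type ReviewDecision =
  | { decision: "approve"; approval_token: string }
  | { decision: "deny"; reason: string }
  | { decision: "request_changes"; message: string };

type ParseResult =
  | { ok: true; value: ReviewDecision }
  | { ok: false; error: string };

function nonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// Hypothetical validator for the POST /api/features/:id/review body.
function parseReviewRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "body must be a JSON object" };
  }
  const fields = body as Record<string, unknown>;
  const token = fields["approval_token"];
  const reason = fields["reason"];
  const message = fields["message"];
  switch (fields["decision"]) {
    case "approve":
      return nonEmptyString(token)
        ? { ok: true, value: { decision: "approve", approval_token: token } }
        : { ok: false, error: "approval_token is required for approve" };
    case "deny":
      return nonEmptyString(reason)
        ? { ok: true, value: { decision: "deny", reason } }
        : { ok: false, error: "reason is required for deny" };
    case "request_changes":
      return nonEmptyString(message)
        ? { ok: true, value: { decision: "request_changes", message } }
        : { ok: false, error: "message is required for request_changes" };
    default:
      return { ok: false, error: "decision must be approve, deny, or request_changes" };
  }
}

// Usage: an approve without a token is rejected before any MCP tool is called.
const missingToken = parseReviewRequest({ decision: "approve" });
const validDeny = parseReviewRequest({ decision: "deny", reason: "coverage dropped" });
```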
+**Worktree Checkout (Local Testing):**
+The dashboard provides a one-click checkout so the reviewer can spin up a local dev server and run manual tests against the feature's changes in their main repo working directory.
+1. **Checkout Button** — Displayed on the feature detail page for features with an active worktree. Shows the branch name and a warning that it will switch the user's main repo checkout.
+2. **Checkout Flow:**
+   - User clicks "Checkout to Local" on a feature
+   - Dashboard shows confirmation dialog: "This will run `git checkout <feature_branch>` in your main repo at `<repo_root>`. Any uncommitted changes will be stashed first. Continue?"
+   - On confirm, calls `POST /api/features/:id/checkout`
+3. **API Implementation:**
+   ```typescript
+   // POST /api/features/:id/checkout
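+   interface CheckoutRequest {
+     action?: 'checkout' | 'restore'; // default: 'checkout'; 'restore' switches back to the previous branch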
+     stash_changes?: boolean;  // default: true — auto-stash uncommitted work
+     restore_after?: boolean;  // default: false — if true, remember original branch for later restore
+   }
+   interface CheckoutResponse {
+     ok: boolean;
+     data?: {
+       branch: string;          // the branch checked out
+       previous_branch: string; // the branch before checkout (for restore)
+       stashed: boolean;        // whether changes were stashed
+       stash_ref?: string;      // stash reference if stashed
+     };
+     error?: { code: string; message: string };
+   }
+   ```
+4. **Backend Implementation:**
+   - Reads feature state to get the worktree branch name
+   - Runs in the main repo root (not the worktree):
+     1. `git stash push -m "aop-dashboard-checkout: before switching to <feature_branch>"` (if `stash_changes` and dirty)
+     2. `git checkout <feature_branch>`
+   - Records previous branch in a temporary file (`.aop/runtime/checkout-restore.json`) for later restoration
+   - Returns branch info + stash reference
+5. **Restore Button** — After checkout, the dashboard shows a "Restore Original Branch" button that:
+   - Calls `POST /api/features/:id/checkout` with `{ action: "restore" }`
+   - Backend: `git checkout <previous_branch>` + `git stash pop` (if stashed)
+6. **Safety Guards:**
+   - Checkout blocked if feature has no worktree or branch
+   - Checkout blocked if main repo has merge conflicts
+   - Warning displayed if there are uncommitted changes (with stash option)
+   - Only one checkout active at a time (tracked in `.aop/runtime/checkout-restore.json`)
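The checkout flow and safety guards above can be expressed as a pure planning step that either refuses or returns the git commands to run. This is a sketch under assumptions: `planCheckout`, the `RepoStatus` shape, and the error codes are hypothetical names, and the caller is assumed to have gathered the repo status already (for example from `git status`).

```typescript
interface RepoStatus {
  currentBranch: string;
  dirty: boolean;                       // uncommitted changes present
  hasMergeConflicts: boolean;
  activeCheckoutFeature: string | null; // from .aop/runtime/checkout-restore.json, if any
}

type CheckoutPlan =
  | { ok: true; previousBranch: string; stashed: boolean; commands: string[][] }
  | { ok: false; code: string; message: string };

// Hypothetical planner: applies the safety guards, then lists git argv arrays in order.
function planCheckout(
  featureBranch: string | null,
  repo: RepoStatus,
  stashChanges: boolean = true,
): CheckoutPlan {
  if (!featureBranch) {
    return { ok: false, code: "no_branch", message: "feature has no worktree branch" };
  }
  if (repo.hasMergeConflicts) {
    return { ok: false, code: "merge_conflicts", message: "main repo has merge conflicts" };
  }
  if (repo.activeCheckoutFeature !== null) {
    return { ok: false, code: "checkout_active", message: "another checkout is already active" };
  }
  const commands: string[][] = [];
  const stashed = stashChanges && repo.dirty;
  if (stashed) {
    commands.push([
      "git", "stash", "push", "-m",
      `aop-dashboard-checkout: before switching to ${featureBranch}`,
    ]);
  }
  commands.push(["git", "checkout", featureBranch]);
  return { ok: true, previousBranch: repo.currentBranch, stashed, commands };
}

// Usage: a dirty repo on main yields a stash followed by a checkout.
const dirtyPlan = planCheckout("feature/login", {
  currentBranch: "main",
  dirty: true,
  hasMergeConflicts: false,
  activeCheckoutFeature: null,
});
```

Keeping the plan separate from execution makes the guards testable without touching a real repository; the commands are returned as argv arrays so they can be passed to a process spawner without shell quoting.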
+**Launch Integration:**
+- New CLI command: `aop dashboard [--port 3000] [--foreground]`
+- Starts dashboard server in background or foreground
+- Dashboard reads config from `agentic/orchestrator/policy.yaml` for port, auth settings
+**Acceptance Criteria:**
+- [ ] Dashboard displays all active/blocked/merged features with live updates
+- [ ] Clicking feature shows state, plan, diff, evidence
+- [ ] SSE updates within 2s of state file changes
+- [ ] Dashboard survives run restart (polling fallback if SSE drops)
+- [ ] Reviewer can approve & merge a `ready_to_merge` feature from the dashboard
+- [ ] Reviewer can deny a feature with a reason (feature moves to `blocked`, reason logged)
+- [ ] Reviewer can request changes with a message (feedback sent to agent via reaction system)
+- [ ] Reviewer can one-click checkout a feature branch to their main repo for local testing
+- [ ] Checkout auto-stashes uncommitted changes and provides a restore button
+- [ ] Checkout blocked when no worktree/branch exists or repo has merge conflicts
+- [ ] Tests pass with ≥90% coverage per `prompts/vitest-testing-standards.instructions.md`
+**Estimated Effort:** 2 weeks (Medium - requires new package, SSE infra, UI components)
+---
+#### G2: Notification System (Multi-Channel)
+**ComposioHQ Feature:** Desktop notifications, Slack, webhooks with priority routing.
+**Gap Impact:** No proactive alerts; users must poll `aop status`.
+**AOP Design Alignment:** HIGH - Notification is output-only; no state mutations.
+**Specification:**
+**Architecture:**
+1. **Notifier Service** (`apps/control-plane/src/application/services/notifier-service.ts`)
+   - Abstract `NotifierChannel` interface (desktop, slack, webhook, email)
+   - Config in `agentic/orchestrator/policy.yaml`:
+   ```yaml
+   notifications:
+     enabled: true
+     channels:
+       desktop:
+         enabled: true
+       slack:
+         enabled: true
+         webhook: ${SLACK_WEBHOOK_URL}
+         channel: "#aop-alerts"
+       webhook:
+         enabled: false
+         url: ${CUSTOM_WEBHOOK_URL}
+         method: POST
+         headers:
+           Authorization: "Bearer ${WEBHOOK_TOKEN}"
+     routing:
+       critical: [desktop, slack]  # Gate failures, collisions
+       warning: [slack]             # Stale leases, retry exhaustion
+       info: [slack]                # Feature merged, gates passed
+   ```
+2. **Notification Events:**
+   - `gate_failed` - Gate execution failed (priority: critical)
+   - `collision_detected` - Plan rejected due to collision (priority: critical)
+   - `feature_blocked` - Feature moved to blocked queue (priority: warning)
+   - `ready_to_merge` - Feature ready for review (priority: info)
+   - `feature_merged` - Feature merged successfully (priority: info)
+   - `stale_lease` - Run lease expiring soon (priority: warning)
+3. **Channel Implementations:**
+   - **Desktop:** `node-notifier` or `notify-send` (Linux), `terminal-notifier` (macOS)
+   - **Slack:** HTTP POST to webhook URL with formatted message + attachments
+   - **Webhook:** Generic HTTP POST with JSON payload
+   - **Email:** Nodemailer with SMTP config (optional, P2)
+4. **Notification Flow:**
+   - Supervisor runtime calls `NotifierService.notify(event, context)`
+   - Service routes to enabled channels based on priority
+   - Failures logged but do not block orchestration
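The routing and failure-isolation behaviour described in the notification flow could look roughly like the following. This is a sketch, not the spec's `NotifierService`: the `NotifierChannel` shape, `resolveTargets`, and `notify` are assumed names, and the channels here are in-memory stand-ins rather than real Slack or desktop integrations.

```typescript
type Priority = "critical" | "warning" | "info";

interface NotificationEvent {
  name: string;
  priority: Priority;
  featureId: string;
  summary: string;
}

interface NotifierChannel {
  send(event: NotificationEvent): Promise<void>;
}

type Routing = Record<Priority, string[]>;

// Pick the channel ids for a priority, keeping only channels that are registered (enabled).
function resolveTargets(
  priority: Priority,
  routing: Routing,
  channels: Map<string, NotifierChannel>,
): string[] {
  return routing[priority].filter((id) => channels.has(id));
}

// Deliver to every target; failures are collected and returned, never thrown,
// so a broken channel cannot block orchestration.
async function notify(
  event: NotificationEvent,
  routing: Routing,
  channels: Map<string, NotifierChannel>,
): Promise<string[]> {
  const failed: string[] = [];
  await Promise.all(
    resolveTargets(event.priority, routing, channels).map(async (id) => {
      try {
        await channels.get(id)?.send(event);
      } catch {
        failed.push(id);
      }
    }),
  );
  return failed;
}

// Usage with two in-memory channels; the slack stand-in always fails.
const delivered: string[] = [];
const demoChannels = new Map<string, NotifierChannel>([
  ["desktop", { send: async (e) => { delivered.push(`desktop:${e.name}`); } }],
  ["slack", { send: async () => { throw new Error("webhook unreachable"); } }],
]);
const demoRouting: Routing = { critical: ["desktop", "slack"], warning: ["slack"], info: ["slack"] };
const failedChannels = notify(
  { name: "gate_failed", priority: "critical", featureId: "feature_a", summary: "unit tests failed" },
  demoRouting,
  demoChannels,
);
```

A real service would log each failure; returning the failed ids keeps that decision with the caller.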
+**Acceptance Criteria:**
+- [ ] Desktop notification on gate failure with feature ID + error summary
+- [ ] Slack notification includes clickable link to dashboard (if running)
+- [ ] Webhook payload includes full event context (feature_id, status, evidence summary)
+- [ ] Notification config validated against schema on startup
+- [ ] Notification failures do not crash orchestrator
+**Estimated Effort:** 1 week (Medium - requires channel abstractions, config schema, supervisor integration)
+---
+#### G3: Init Wizard (`aop init`)
+**ComposioHQ Feature:** Interactive setup wizard with auto-detection (git repo, GitHub remote, branch, API keys).
+**Gap Impact:** Manual YAML editing is error-prone; high barrier to entry.
+**AOP Design Alignment:** HIGH - Setup tool does not affect runtime behavior.
+**Specification:**
+**Wizard Flow:**
+1. **Detect repository context:**
+   - Check for `.git/` in current directory
+   - Parse git remote URL → derive repo owner/name
+   - Detect default branch (`git symbolic-ref refs/remotes/origin/HEAD`)
+2. **Prompt for config values:**
+   - Worktree base branch (default: `main`)
+   - Max active features (default: `5`)
+   - Max parallel gate runs (default: `3`)
+   - Dashboard port (default: `3000`)
+   - Notification channels (multi-select: desktop, slack, webhook)
+   - Slack webhook URL (if slack selected)
+   - Test framework (vitest/jest/pytest/maven/gradle - for gates.yaml template)
+3. **Generate config files:**
+   - `agentic/orchestrator/policy.yaml` (with detected/entered values)
+   - `agentic/orchestrator/gates.yaml` (template for detected test framework)
+   - `agentic/orchestrator/agents.yaml` (defaults: planner/builder/qa prompts)
+   - Copy prompt templates to `agentic/orchestrator/prompts/`
+   - Copy schema files to `agentic/orchestrator/schemas/`
+4. **Validate generated config:**
+   - Run schema validation against all generated YAML files
+   - Report validation errors with file/line/error details
+5. **Post-init instructions:**
+   - Print next steps:
+     ```
+     ✅ Configuration created successfully!
+     Next steps:
+     1. Review config files in agentic/orchestrator/
+     2. Add feature specs to .aop/features/<feature_id>/spec.md
+     3. Run: aop run -fi .aop/features/<feature_id>/spec.md
+     4. Monitor: aop status (or aop dashboard)
+     ```
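Step 1 of the wizard flow derives the repo owner and name from the remote URL and the default branch from `git symbolic-ref refs/remotes/origin/HEAD`. A minimal sketch of the two parsing steps, assuming the command output has already been captured as a string. `parseGitRemote` and `parseDefaultBranch` are hypothetical helper names, and the pattern only covers two-segment `owner/repo` paths (nested GitLab-style groups return `null`).

```typescript
interface RepoRef {
  owner: string;
  name: string;
}

// Accepts scp-style (git@host:owner/repo.git) and URL-style (https://host/owner/repo, ssh://...) remotes.
function parseGitRemote(url: string): RepoRef | null {
  const pattern =
    /^(?:[a-z+]+:\/\/[^\/]+\/|[^@\s]+@[^:\s]+:)([^\/\s]+)\/([^\/\s]+?)(?:\.git)?\/?$/i;
  const match = pattern.exec(url.trim());
  if (!match) return null;
  const owner = match[1];
  const name = match[2];
  if (!owner || !name) return null;
  return { owner, name };
}

// `git symbolic-ref refs/remotes/origin/HEAD` prints e.g. "refs/remotes/origin/main".
function parseDefaultBranch(symbolicRef: string): string | null {
  const prefix = "refs/remotes/origin/";
  const ref = symbolicRef.trim();
  return ref.startsWith(prefix) && ref.length > prefix.length
    ? ref.slice(prefix.length)
    : null;
}

// Usage
const sshRemote = parseGitRemote("git@github.com:myorg/backend.git");
const httpsRemote = parseGitRemote("https://github.com/myorg/frontend");
const defaultBranch = parseDefaultBranch("refs/remotes/origin/main\n");
```

When either helper returns `null`, the wizard can fall back to prompting for manual values, which matches the non-git acceptance criterion.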
+**Acceptance Criteria:**
+- [ ] Wizard detects git repo and parses remote URL
+- [ ] Generated config files pass schema validation
+- [ ] Wizard handles non-git directories gracefully (prompts for manual values)
+- [ ] Template selection generates appropriate gates.yaml (e.g., `npm test` vs `pytest` vs `mvn test`)
+- [ ] Wizard is idempotent (detects existing config, offers to update)
+**Estimated Effort:** 1 week (Medium - requires interactive prompts, git parsing, template generation)
+---
+#### G4: Multi-Project Configuration Support
+**ComposioHQ Feature:** Single config file managing multiple repositories with per-project overrides.
+**Gap Impact:** Cannot manage multiple repos from one orchestrator instance.
+**AOP Design Alignment:** MEDIUM - Requires run lease isolation per repo.
+**Specification:**
+**Config Schema Extension:**
+```yaml
+# agentic/orchestrator/multi-project.yaml (new file)
+version: "1.0"
+defaults:
+  max_active_features: 5
+  max_parallel_gate_runs: 3
+  dashboard_port: 3000
+  notifications:
+    enabled: true
+    channels: [desktop, slack]
+projects:
+  - name: "backend"
+    path: ~/repos/backend
+    repo: "myorg/backend"
+    branch: main
+    policy: agentic/orchestrator/policy.yaml  # per-project override
+    gates: agentic/orchestrator/gates-backend.yaml
+    dashboard_port: 3001  # override default
+  - name: "frontend"
+    path: ~/repos/frontend
+    repo: "myorg/frontend"
+    branch: main
+    policy: agentic/orchestrator/policy.yaml
+    gates: agentic/orchestrator/gates-frontend.yaml
+    dashboard_port: 3002
+```
+**Implementation:**
+1. **Multi-Project Loader** (`src/application/multi-project-loader.ts`)
+   - Parses `multi-project.yaml` (optional; single-project mode remains default)
+   - Validates each project config against schema
+   - Resolves relative paths to absolute paths
+2. **CLI Changes:**
+   - New flag: `aop run --project backend` (selects project from multi-project.yaml)
+   - If `--project` not specified and multi-project.yaml exists → interactive selection
+   - `aop status --project backend` (project-specific status)
+   - `aop status --all` (global status across all projects)
+3. **Run Lease Isolation:**
+   - Run lease file per project: `.aop/runtime/<project_name>/run-lease.json`
+   - Dashboard instances per project (separate ports)
+   - Each project has independent `.aop/features/` directory
+4. **Dashboard Multi-Project View:**
+   - New route: `GET /api/projects` (list all configured projects)
+   - Project switcher in dashboard UI
+   - Global view showing status across all projects
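To illustrate how the loader might combine `defaults` with per-project overrides and derive the per-project lease path, here is a small sketch. It assumes the YAML has already been parsed into plain objects, that `~` expands to a caller-supplied home directory, and that the lease file lives under each project's root; the spec gives the path `.aop/runtime/<project_name>/run-lease.json` but does not say what it is relative to. `resolveProjects` and `expandHome` are hypothetical names.

```typescript
interface ProjectDefaults {
  max_active_features: number;
  max_parallel_gate_runs: number;
  dashboard_port: number;
}

interface ProjectEntry extends Partial<ProjectDefaults> {
  name: string;
  path: string;
}

interface ResolvedProject extends ProjectDefaults {
  name: string;
  root: string;       // absolute project path
  lease_file: string; // per-project run lease (location is an assumption)
}

function expandHome(p: string, home: string): string {
  return p === "~" || p.startsWith("~/") ? home + p.slice(1) : p;
}

// Per-project values win over defaults; duplicate names are rejected because
// lease isolation relies on the name being unique.
function resolveProjects(
  defaults: ProjectDefaults,
  projects: ProjectEntry[],
  home: string,
): ResolvedProject[] {
  const seen = new Set<string>();
  return projects.map((project) => {
    if (seen.has(project.name)) {
      throw new Error(`duplicate project name: ${project.name}`);
    }
    seen.add(project.name);
    const root = expandHome(project.path, home);
    return {
      name: project.name,
      root,
      max_active_features: project.max_active_features ?? defaults.max_active_features,
      max_parallel_gate_runs: project.max_parallel_gate_runs ?? defaults.max_parallel_gate_runs,
      dashboard_port: project.dashboard_port ?? defaults.dashboard_port,
      lease_file: `${root}/.aop/runtime/${project.name}/run-lease.json`,
    };
  });
}

// Usage, mirroring the example config above
const resolvedProjects = resolveProjects(
  { max_active_features: 5, max_parallel_gate_runs: 3, dashboard_port: 3000 },
  [
    { name: "backend", path: "~/repos/backend", dashboard_port: 3001 },
    { name: "frontend", path: "~/repos/frontend", dashboard_port: 3002 },
  ],
  "/home/dev",
);
```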
+**Acceptance Criteria:**
+- [ ] Multi-project config validated against schema
+- [ ] Can run orchestrator for specific project via `--project` flag
+- [ ] Run leases isolated per project (parallel orchestration safe)
+- [ ] Dashboard can switch between projects
+- [ ] `aop status --all` aggregates status across projects
+**Estimated Effort:** 1.5 weeks (High - requires config schema changes, CLI routing, lease isolation)
+---
+### 3.2 P1 GAPS (High Value - Implement Soon)
+#### G5: CI Failure Auto-Remediation (Reactions)
+**ComposioHQ Feature:** Automatic routing of CI failures to agents with retry logic.
+**Gap Impact:** Gate failures require manual retry; no autonomous recovery.
+**AOP Design Alignment:** MEDIUM - Requires autonomous retry policy; must preserve deterministic gates.
+**Specification:**
+**Reaction Policy Config:**
+```yaml
+# agentic/orchestrator/policy.yaml
+reactions:
+  gate_failed:
+    enabled: true
+    max_retries: 2
+    action: retry_with_agent_repair  # or 'notify_only'
+    escalate_after: 2  # escalate to human after N failures
+    retry_delay: 30s
+  collision_detected:
+    enabled: true
+    action: notify_only  # no auto-resolution
+  ready_to_merge:
+    enabled: true
+    action: notify_only  # never auto-merge
+```
+**Retry Flow:**
+1. **Gate failure detected** (QA wave or build wave)
+2. **Check reaction policy:** If `reactions.gate_failed.enabled` and retry count < max_retries
+3. **Agent repair loop:**
+   - Load gate failure evidence + logs
+   - Inject repair prompt to builder/QA agent:
+     ```
+     Gate execution failed. Review the error logs below and apply fixes.
+     Gate: {gate_name}
+     Exit Code: {exit_code}
+     Logs:
+     {logs}
+     Evidence:
+     {evidence_summary}
+     ```
+4. **Agent generates repair patches** → apply via `repo.apply_patch`
+5. **Re-run gate** → capture new evidence
+6. **Success:** Advance to next phase
+7. **Failure:** Increment retry count, repeat or escalate
+**Escalation:**
+- After `escalate_after` failures → notification to all critical channels
+- Feature remains in current phase (does not auto-advance)
+- User can manually intervene via dashboard or `aop send <feature_id> <instruction>`
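The interaction between `max_retries` and `escalate_after` is not fully pinned down by the flow above, so the following sketch adopts one reading and should be treated as an assumption: a retry is allowed while fewer than `max_retries` retries have been used, and escalation is flagged once the total number of failures reaches `escalate_after`, independently of whether a retry still happens. `decideOnGateFailure` is a hypothetical name.

```typescript
interface GateFailedPolicy {
  enabled: boolean;
  max_retries: number;
  action: "retry_with_agent_repair" | "notify_only";
  escalate_after: number;
}

interface GateFailureDecision {
  retry: boolean;
  attempt: number | null; // 1-based retry attempt, null when not retrying
  escalate: boolean;      // notify all critical channels
}

// failureCount includes the failure that just happened (first failure = 1).
function decideOnGateFailure(
  policy: GateFailedPolicy,
  failureCount: number,
): GateFailureDecision {
  const retriesUsed = failureCount - 1;
  const retry =
    policy.enabled &&
    policy.action === "retry_with_agent_repair" &&
    retriesUsed < policy.max_retries;
  const escalate = policy.enabled && failureCount >= policy.escalate_after;
  return { retry, attempt: retry ? retriesUsed + 1 : null, escalate };
}

// Usage with the example policy (max_retries: 2, escalate_after: 2)
const examplePolicy: GateFailedPolicy = {
  enabled: true,
  max_retries: 2,
  action: "retry_with_agent_repair",
  escalate_after: 2,
};
const afterFirstFailure = decideOnGateFailure(examplePolicy, 1);
const afterSecondFailure = decideOnGateFailure(examplePolicy, 2);
const afterThirdFailure = decideOnGateFailure(examplePolicy, 3);
```

Under this reading the example policy retries twice, escalates from the second failure onward, and stops retrying after the third failure while the feature stays in its current phase.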
+**Acceptance Criteria:**
+- [ ] Gate failure triggers retry if policy enabled
+- [ ] Retry count tracked in feature state (`gate_retry_count`)
+- [ ] Agent receives failure context in prompt
+- [ ] Escalation notification includes full failure history
+- [ ] Manual override: `aop retry <feature_id> --force` (ignores retry limit)
+**Estimated Effort:** 2 weeks (High - requires retry state tracking, agent prompt injection, escalation logic)
+---
+#### G6: Session Management Commands (`aop send`, `aop attach`)
+**ComposioHQ Feature:** Interactive agent communication (`ao send`, `ao open`).
+**Gap Impact:** Cannot send ad-hoc instructions to agents; must restart workflow.
+**AOP Design Alignment:** MEDIUM - Requires provider-specific session access.
+**Specification:**
+**New CLI Commands:**
+1. **`aop send <feature_id> <message>`**
+   - Sends message to orchestrator session for given feature
+   - Orchestrator routes message to appropriate worker (planner/builder/qa based on phase)
+   - Example: `aop send my_feature "Add logging to error handlers"`
+2. **`aop attach <feature_id>`**
+   - Attaches to orchestrator session terminal (if provider supports it)
+   - For Claude Code/Codex: launches interactive chat session
+   - For tmux (future): attaches to tmux session
+   - Exit with `Ctrl-D` or type `/exit`
+**Implementation:**
+1. **Provider Interface Extension:**
+   ```typescript
+   interface WorkerProvider {
+     sendMessage(sessionId: string, message: string): Promise<void>;
+     attachToSession(sessionId: string): Promise<void>; // interactive mode
+   }
+   ```
+2. **Message Routing:**
+   - CLI calls `ToolClient.call('feature.send_message', { feature_id, message })`
+   - Tool validates feature exists and has active orchestrator session
+   - Tool calls `WorkerProvider.sendMessage(orchestratorSessionId, message)`
+   - Provider forwards to agent (implementation-specific)
+3. **Session Attach:**
+   - CLI calls provider-specific attach method
+   - For Claude Code: launches `claude --resume <session_id>`
+   - For Codex: launches `codex resume <session_id>`
+   - Terminal streams stdin/stdout until exit
+**Acceptance Criteria:**
+- [ ] `aop send` delivers message to active orchestrator session
+- [ ] `aop attach` launches interactive session for supported providers
+- [ ] Error handling: feature not found, session not active, provider unsupported
+- [ ] Attach session streams real-time responses
+**Estimated Effort:** 1.5 weeks (High - requires provider API integration, terminal streaming)
+---
+#### G6a: Agent Activity Detection & Health Monitoring
+**ComposioHQ Feature:** JSONL-based activity detection reads agent session files for structured events; fallback terminal output parsing determines real-time agent state (active, ready, idle, waiting_input, blocked, exited). Lifecycle manager polls activity state and triggers reactions (e.g., `agent-stuck` after configurable idle threshold).
+**Gap Impact:** AOP has no runtime visibility into whether agents are actively working, idle, or stuck. Supervisor relies on tool call responses but cannot detect hung agents.
+**AOP Design Alignment:** HIGH — Read-only monitoring; no state mutations.
+**Specification:**
+1. **Activity Monitor Service** (`apps/control-plane/src/application/services/activity-monitor-service.ts`)
+   - Interface: `getActivityState(featureId: string): Promise<ActivityState>`
+   - States: `active | idle | waiting_input | blocked | exited | unknown`
+   - Detection strategies (provider-dependent):
+     - **Claude Code:** Read JSONL session files for last event timestamp + type
+     - **Codex:** Query JSON-RPC app-server for thread state
+     - **Generic:** Check process alive + last tool call timestamp from operation ledger
+   - Configurable idle threshold: `policy.supervisor.agent_idle_threshold_ms` (default: `300000` / 5min)
+2. **Integration Points:**
+   - Supervisor `WorkerDecisionLoop` checks activity state before sending next prompt
+   - Stale-activity triggers `agent_stuck` notification event (see G2)
+   - Dashboard displays per-feature activity indicator (see G1)
+   - `aop status` output includes activity column
+3. **Stuck Agent Reaction** (extends G5 reaction policy):
+   ```yaml
+   reactions:
+     agent_stuck:
+       enabled: true
+       action: notify_and_restart  # or: notify_only
+       idle_threshold: 300s
+       escalate_after: 2
+   ```
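As a rough sketch of how the generic detection strategy could map a sample onto the states listed above: the `ActivitySample` shape and the event type names are assumptions made for illustration, and the default threshold mirrors the `300000` ms default from the spec.

```typescript
type ActivityState = "active" | "idle" | "waiting_input" | "blocked" | "exited" | "unknown";

// Hypothetical sample shape; a real detector would fill it from JSONL files, RPC, or the ledger.
interface ActivitySample {
  processAlive: boolean;
  lastEventAtMs: number | null; // null when no event has been observed yet
  lastEventType?: "tool_call" | "prompt_for_input" | "permission_request"; // assumed event names
}

function classifyActivity(
  sample: ActivitySample,
  nowMs: number,
  idleThresholdMs: number = 300_000,
): ActivityState {
  if (!sample.processAlive) return "exited";
  if (sample.lastEventAtMs === null) return "unknown";
  if (sample.lastEventType === "prompt_for_input") return "waiting_input";
  if (sample.lastEventType === "permission_request") return "blocked";
  // Otherwise the age of the last event decides between active and idle.
  return nowMs - sample.lastEventAtMs >= idleThresholdMs ? "idle" : "active";
}
```

Because the function is pure and takes `nowMs` as a parameter, it can be unit-tested without fake timers and called from a polling loop without blocking the supervisor.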
+**Acceptance Criteria:**
+- [ ] Activity state detected for at least Claude Code and generic providers
+- [ ] `aop status` displays activity state per feature
+- [ ] Agent stuck beyond threshold triggers notification
+- [ ] Activity monitoring does not block supervisor execution
+**Estimated Effort:** 1 week (Medium — requires provider-specific detection strategies)
+---
+#### G6b: Automated Session Cleanup
+**ComposioHQ Feature:** `ao session cleanup` auto-evaluates sessions and kills those whose PRs are merged/closed, issues are completed, or runtimes are dead. Supports `--dry-run`.
+**Gap Impact:** AOP's `aop delete` is manual-only. No automated lifecycle cleanup for terminal features.
+**AOP Design Alignment:** HIGH — Cleanup is a natural extension of existing feature deletion service.
+**Specification:**
+1. **Cleanup Command** (`aop cleanup [--dry-run] [--yes]`)
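+   - Scans all features in the index (including terminal ones) plus worktrees on disk, since the criteria below cover terminal features and orphan worktrees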
+   - Evaluates cleanup criteria per feature:
+     - Feature status is terminal (`merged` or `failed`) for > configurable grace period (default: 1h)
+     - Worktree exists but feature is no longer in index
+     - Run lease expired and no active supervisor session
+   - `--dry-run`: Print what would be cleaned up without acting
+   - Delegates to existing `FeatureDeletionService` for actual cleanup
+2. **Auto-Cleanup Hook** (optional):
+   - After `feature.ready_to_merge` completes merge, schedule cleanup after grace period
+   - Configurable in policy: `cleanup.auto_after_merge: true`, `cleanup.grace_period: 3600s`
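A minimal sketch of the eligibility check behind `--dry-run`, covering the first two criteria (terminal status past the grace period, and worktrees with no index entry). The record shape and function name are hypothetical, and the run-lease criterion is omitted to keep the example short.

```typescript
type FeatureStatus = "active" | "blocked" | "merged" | "failed";

interface FeatureRecord {
  id: string;
  status: FeatureStatus;
  statusChangedAtMs: number; // hypothetical field: when the status last changed
}

interface CleanupPlan {
  terminalFeatures: string[];
  orphanWorktrees: string[];
}

function planCleanup(
  features: FeatureRecord[],
  worktreeFeatureIds: string[],
  nowMs: number,
  gracePeriodMs: number = 3_600_000, // 1h default from the spec
): CleanupPlan {
  const indexed = new Set(features.map((f) => f.id));
  const terminalFeatures = features
    .filter((f) => f.status === "merged" || f.status === "failed")
    .filter((f) => nowMs - f.statusChangedAtMs > gracePeriodMs)
    .map((f) => f.id);
  // A worktree whose feature id is missing from the index is an orphan.
  const orphanWorktrees = worktreeFeatureIds.filter((id) => !indexed.has(id));
  return { terminalFeatures, orphanWorktrees };
}
```

In this sketch `--dry-run` would print the returned plan, while `--yes` would pass each entry to the existing `FeatureDeletionService`.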
+**Acceptance Criteria:**
+- [ ] `aop cleanup --dry-run` lists features eligible for cleanup
+- [ ] `aop cleanup --yes` removes terminal features + orphan worktrees
+- [ ] Auto-cleanup triggers after merge when enabled
+- [ ] Grace period prevents premature cleanup
+**Estimated Effort:** 3 days (Low — leverages existing FeatureDeletionService)
+---
+#### G6c: Batch Feature Operations
+**ComposioHQ Feature:** `ao batch-spawn` accepts multiple issue IDs, deduplicates against existing sessions, detects dead sessions for respawning, and reports created/skipped/failed counts.
+**Gap Impact:** AOP can only run one feature at a time via CLI. Multi-feature workflows require multiple manual `aop run` invocations.
+**AOP Design Alignment:** HIGH — Natural extension of existing `-fl` folder scanning.
+**Specification:**
+1. **Batch Run** (`aop run -fl <spec_folder> --batch`)
+   - Scans folder for all spec files (existing behavior)
+   - Deduplicates against active features in index
+   - Skips features already in `active` or `blocked` status
+   - Spawns remaining features sequentially with 500ms delay
+   - Reports: `created: N, skipped: N (already active), failed: N`
+2. **Batch Status Extension:**
+   - `aop status --summary` — One-line-per-feature compact view
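The batch-run bookkeeping can be sketched as follows. This is an illustration under simplifying assumptions: `spawn` is a synchronous stand-in for the real feature spawn, the 500ms inter-spawn delay is left out, and all names are hypothetical.

```typescript
interface BatchReport {
  created: string[];
  skipped: string[];
  failed: string[];
}

function runBatch(
  specIds: string[],
  activeOrBlocked: ReadonlySet<string>,
  spawn: (featureId: string) => void, // stand-in: throws when a spawn fails
): BatchReport {
  const report: BatchReport = { created: [], skipped: [], failed: [] };
  const seen = new Set<string>();
  for (const id of specIds) {
    if (seen.has(id)) continue; // duplicate within the same folder scan
    seen.add(id);
    if (activeOrBlocked.has(id)) {
      report.skipped.push(id);
      continue;
    }
    try {
      spawn(id);
      report.created.push(id);
    } catch {
      // A failed spawn is recorded and the loop moves on to the next feature.
      report.failed.push(id);
    }
  }
  return report;
}

function formatReport(r: BatchReport): string {
  return `created: ${r.created.length}, skipped: ${r.skipped.length} (already active), failed: ${r.failed.length}`;
}
```

Catching per feature is what satisfies the criterion that failed spawns do not block the remaining features.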
+**Acceptance Criteria:**
+- [ ] Batch run deduplicates against existing features
+- [ ] Failed spawns don't block remaining features
+- [ ] Summary output reports counts
+**Estimated Effort:** 3 days (Low — extends existing folder scanning)
+---
+#### G6d: Workspace postCreate Hooks & Symlinks
+**ComposioHQ Feature:** Per-project `postCreate` commands (e.g., `pnpm install`) run after worktree creation. `symlinks` config shares files (`.env`, `.claude`) across worktrees via symlinks.
+**Gap Impact:** After AOP creates a worktree, dependencies aren't installed and environment files are missing. Users must manually set up each worktree.
+**AOP Design Alignment:** HIGH — Configuration-driven, no architectural conflict.
+**Specification:**
+1. **Policy Extension:**
+   ```yaml
+   # agentic/orchestrator/policy.yaml
+   worktree:
+     base_branch: main
+     post_create:
+       - "npm ci"
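+       - "cp .env.example .env"  # caution: if .env is also listed under symlinks, cp writes through the link into the main repo's .env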
+     symlinks:
+       - .env
+       - .claude
+   ```
+2. **Implementation:**
+   - After `repo.ensure_worktree` creates worktree, run `post_create` commands in worktree directory
+   - Before commands, create symlinks from worktree to main repo for listed files
+   - Command failures logged as warnings but don't block feature initialization
+   - Schema extension in `policy.schema.json`
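One way to make the ordering rule (symlinks first, then `post_create` commands) and the warn-and-continue behavior testable is to separate planning from execution. The types and function names below are hypothetical, paths are joined naively with `/`, and treating symlink failures as warnings is an assumption; the spec only states this for commands.

```typescript
interface WorktreePolicy {
  post_create: string[];
  symlinks: string[];
}

type SetupStep =
  | { kind: "symlink"; target: string; linkPath: string }
  | { kind: "command"; cwd: string; command: string };

function planWorktreeSetup(
  policy: WorktreePolicy,
  mainRepoDir: string,
  worktreeDir: string,
): SetupStep[] {
  const links = policy.symlinks.map(
    (file): SetupStep => ({
      kind: "symlink",
      target: `${mainRepoDir}/${file}`,
      linkPath: `${worktreeDir}/${file}`,
    }),
  );
  const commands = policy.post_create.map(
    (command): SetupStep => ({ kind: "command", cwd: worktreeDir, command }),
  );
  return [...links, ...commands]; // symlinks always come before commands
}

// Runs every step; failures become warnings instead of aborting initialization.
function runSetup(steps: SetupStep[], execute: (step: SetupStep) => void): string[] {
  const warnings: string[] = [];
  for (const step of steps) {
    try {
      execute(step);
    } catch (err) {
      warnings.push(`${step.kind} failed: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  return warnings;
}
```

In a real implementation `execute` would presumably wrap `fs.symlink` and a child-process call; injecting it keeps the sketch free of I/O.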
+**Acceptance Criteria:**
+- [ ] `post_create` commands execute in new worktree directory
+- [ ] Symlinks created before post_create commands run
+- [ ] Command failures logged but don't block initialization
+**Estimated Effort:** 2 days (Low)
+---
+#### G6e: PR Lifecycle Integration
+**ComposioHQ Feature:** SCM plugin tracks PR state as first-class session data: CI check status, review decisions (approved/changes_requested), pending review threads, merge readiness, and conflict status. Dashboard shows PR table with sortable "merge score" weighting these signals.
+**Gap Impact:** AOP has no PR awareness. Features go through gates locally but there's no bridge to the PR review cycle. After `ready_to_merge`, there's no visibility into whether a PR was created, reviewed, or has CI issues upstream.
+**AOP Design Alignment:** MEDIUM — Requires optional GitHub API integration. Read-only monitoring doesn't conflict with deterministic model, but auto-actions (merge, comment) should remain opt-in.
+**Specification:**
+1. **PR Monitor Service** (`apps/control-plane/src/application/services/pr-monitor-service.ts`)
+   - Interface:
+     ```typescript
+     interface PrMonitorService {
+       detectPr(featureId: string, branch: string): Promise<PrInfo | null>;
+       getCiStatus(prNumber: number): Promise<CiStatus>;
+       getReviewDecision(prNumber: number): Promise<ReviewDecision>;
+       getMergeability(prNumber: number): Promise<MergeabilityInfo>;
+     }
+     ```
+   - Implementation via `gh` CLI (no Octokit dependency needed)
+   - PR info stored in feature state: `pr_number`, `pr_url`, `ci_status`, `review_decision`
+2. **Feature State Extension:**
+   ```yaml
+   pr:
+     number: 42
+     url: "https://github.com/org/repo/pull/42"
+     ci_status: passing  # passing | failing | pending | none
+     review_decision: approved  # approved | changes_requested | pending | none
+     merge_ready: true
+   ```
+3. **Dashboard Integration:**
+   - PR column in feature table
+   - Merge score calculation: `ci_weight * ci_pass + review_weight * approved + conflict_weight * no_conflicts`
+   - Sortable PR table view
+4. **Reaction Integration (extends G5):**
+   ```yaml
+   reactions:
+     ci_failed_upstream:
+       enabled: true
+       action: notify_only  # or: retry_with_agent_repair
+     changes_requested:
+       enabled: true
+       action: send_review_context_to_agent
+       escalate_after: 2
+   ```
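The merge score formula from the dashboard section can be written out directly. The weight values below are placeholders chosen for illustration (the spec does not fix them), and `PrSignals` is a hypothetical reduction of the PR state to three booleans.

```typescript
interface PrSignals {
  ciPassing: boolean;
  approved: boolean;
  noConflicts: boolean;
}

interface MergeWeights {
  ci: number;
  review: number;
  conflict: number;
}

// Placeholder weights; integers avoid floating-point surprises when comparing scores.
const DEFAULT_WEIGHTS: MergeWeights = { ci: 40, review: 40, conflict: 20 };

// ci_weight * ci_pass + review_weight * approved + conflict_weight * no_conflicts
function mergeScore(s: PrSignals, w: MergeWeights = DEFAULT_WEIGHTS): number {
  return (
    w.ci * Number(s.ciPassing) +
    w.review * Number(s.approved) +
    w.conflict * Number(s.noConflicts)
  );
}

// Highest score first, for the sortable PR table.
function sortByMergeScore<T extends { signals: PrSignals }>(prs: T[]): T[] {
  return [...prs].sort((a, b) => mergeScore(b.signals) - mergeScore(a.signals));
}
```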
+**Acceptance Criteria:**
+- [ ] PR detected automatically after feature creates a branch with open PR
+- [ ] CI status and review decisions reflected in feature state
+- [ ] Dashboard shows PR info with merge score
+- [ ] `changes_requested` reaction sends review context to agent (opt-in)
+**Estimated Effort:** 1.5 weeks (Medium — requires `gh` CLI integration, state schema extension, dashboard components)
+---
+#### G7: Review Comment Auto-Handling
+**ComposioHQ Feature:** Agents automatically address PR review comments. Lifecycle manager detects `changes_requested` review state and sends review context to agent with `send-to-agent` action. Agent reads comments, applies fixes, pushes new commits.
+**Gap Impact:** No PR lifecycle integration; manual comment handling.
+**AOP Design Alignment:** MEDIUM (revised from LOW) — Composio's implementation is pragmatic: it forwards review comments to agents as additional context, letting the agent decide what to fix. This doesn't bypass review — it accelerates the review-fix-rereview cycle. Compatible with AOP's model if review feedback is treated as additional spec input.
+**Decision:** **PROMOTE to P1** — Subsume into G6e (PR Lifecycle Integration). When `changes_requested` is detected, send review comment context to builder agent as a corrective prompt. The agent applies fixes in the worktree, re-runs gates, and the feature re-enters the review cycle. The human reviewer retains approval authority.
+**Alternative:** If PR integration (G6e) is deferred, this remains P3.
+---
+#### G8: Alternative Workspace Modes (Clone, Copy)
+**ComposioHQ Feature:** Clone and copy workspace modes in addition to worktrees.
+**Gap Impact:** Worktrees-only; no flexibility for shared .git concerns.
+**AOP Design Alignment:** LOW - Worktrees are optimal for parallel work; clone/copy add complexity.
+**Decision:** **DEFER to P3** - Worktrees are sufficient; clone/copy modes add marginal value at high implementation cost.
+---
+### 3.3 P2 GAPS (Nice to Have - Future Consideration)
+#### G9: Multi-Tracker Support (GitHub, Linear, Jira)
+**ComposioHQ Feature:** Pluggable tracker integration.
+**Gap Impact:** Spec-only issue references; no direct tracker sync.
+**AOP Design Alignment:** MEDIUM - Could enrich context but not essential.
+**Specification:**
+**Tracker Abstraction:**
+```typescript
+interface IssueTracker {
+  getIssue(issueId: string): Promise<Issue>;
+  updateIssueStatus(issueId: string, status: string): Promise<void>;
+  addComment(issueId: string, comment: string): Promise<void>;
+}
+```
+**Config Extension:**
+```yaml
+# agentic/orchestrator/policy.yaml
+issue_tracker:
+  type: github  # or linear, jira
+  config:
+    token: ${GITHUB_TOKEN}
+    repo: myorg/myrepo
+```
+**Integration Points:**
+1. **Spec enrichment:** Fetch issue details, inject into planner context
+2. **Status sync:** Update issue status when feature advances (planning → building → merged)
+3. **Comment posting:** Post gate results, evidence links as issue comments
+**Estimated Effort:** 2 weeks (Medium - requires API clients, auth, config schema)
+**Decision:** **P2** - Nice to have but not critical. Spec files provide sufficient context.
+---
+#### G10: Multiple Runtime Environments (Docker, K8s, SSH)
+**ComposioHQ Feature:** Pluggable runtime (tmux, Docker, K8s, process, SSH, E2B).
+**Gap Impact:** MCP-only execution; no remote execution options.
+**AOP Design Alignment:** LOW - MCP transport abstraction already supports remote MCP servers.
+**Decision:** **DEFER to P3** - Current MCP transport supports Docker via remote MCP server. K8s/SSH add complexity without clear benefit.
+---
+#### G11: Hash-Based Multi-Instance Isolation
+**ComposioHQ Feature:** Config-path-derived hashing for safe multi-checkout operation.
+**Gap Impact:** Single-instance assumption; cannot run multiple orchestrator checkouts safely.
+**AOP Design Alignment:** MEDIUM - Run lease already provides single-instance guarantee.
+**Specification:**
+**Instance Isolation Strategy:**
+1. Derive instance ID from a SHA-256 hash of the config path
+2. Namespace run lease file: `.aop/runtime/<instance_id>/run-lease.json`
+3. Each instance has independent dashboard port, worktree paths
+4. CLI detects instance ID from config path, routes commands accordingly
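Steps 1 and 2 could look roughly like this with Node's built-in `crypto` module. Truncating the digest to 12 hex characters and requiring an already-absolute config path are assumptions of this sketch, not requirements stated in the spec.

```typescript
import { createHash } from "node:crypto";

// Derives a short, stable instance ID from the absolute config path.
// The 12-character truncation is an assumption made for readable directory names.
function instanceId(absoluteConfigPath: string): string {
  return createHash("sha256").update(absoluteConfigPath).digest("hex").slice(0, 12);
}

// Namespaces the run lease under the instance ID, following the layout in step 2.
function runLeasePath(repoRoot: string, absoluteConfigPath: string): string {
  return `${repoRoot}/.aop/runtime/${instanceId(absoluteConfigPath)}/run-lease.json`;
}
```

Two checkouts with different config paths then get different lease files, so neither blocks the other.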
+**Estimated Effort:** 1 week (Low - requires path hashing, namespace prefixing)
+**Decision:** **P2** - Useful for testing/staging but not critical. Workaround: run leases already prevent collisions.
+---
+#### G12: Terminal Integration Plugins
+**ComposioHQ Feature:** iTerm2 integration, web terminal support.
+**Gap Impact:** No terminal integration; CLI-only.
+**AOP Design Alignment:** LOW - Dashboard terminal viewer covers most use cases.
+**Decision:** **DEFER to P3** - Dashboard terminal viewer (G1) provides sufficient terminal access.
+---
+### 3.4 P3 GAPS (Low Priority - Deferred)
+#### G13-G18: Deferred Gaps
+**Deferred due to low ROI or architectural misalignment:**
+- **G13: Plugin System** - Conflicts with deterministic kernel design; provider abstraction sufficient
+- **G14: Alternative Workspace Modes** (deferred from G8) - Worktrees optimal; clone/copy add complexity
+- **G15: K8s/SSH Runtimes** (deferred from G10) - MCP transport abstraction sufficient
+- **G16: Review Comment Auto-Handling** (see G7) - Promoted to P1 and subsumed into G6e; stays P3 only if G6e is deferred
+- **G17: Auto-Merge on Green CI** - Conflicts with explicit merge control principle
+- **G18: Terminal Plugins** (deferred from G12) - Dashboard terminal viewer sufficient
+---
+### 3.5 Novel Features (Not in Either Package; AOP Differentiators)
+These features emerged from analyzing both codebases and represent opportunities for AOP to leapfrog both its current state and Composio's capabilities.
+#### N1: Incremental Gate Execution (P1)
+**Problem:** Both AOP and Composio run full test suites on every gate pass. For large codebases, this wastes minutes re-running unaffected tests.
+**Specification:**
+- After `repo.apply_patch`, compute affected file set from diff
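+- Use the test runner's change-aware selection (vitest `--changed`, jest `--changedSince`; for pytest, a plugin such as `pytest-testmon`, since `--lf` only re-runs the previous run's failures) to select affected tests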
+- Gate config extension:
+  ```yaml
+  gates:
+    profiles:
+      default:
+        modes:
+          fast:
+            commands:
+              - "npx vitest run --changed {base_branch}"  # incremental
+          full:
+            commands:
+              - "npx vitest run"  # full suite
+  ```
+- `fast` mode uses incremental; `full` and `merge` modes run complete suite
+- Evidence captures which tests were skipped and why
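+**Impact:** Expected 2-5x faster gate cycles for large projects (an estimate, not yet measured; the M30 acceptance target is a ≥50% reduction in fast-mode execution time). A potential differentiator.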
+**Estimated Effort:** 1 week
+---
+#### N2: Parallel Gate Execution (P2)
+**Problem:** Gates run sequentially. Independent gates (lint, type-check, unit tests) could run concurrently.
+**Specification:**
+- Gate config supports `parallel: true` flag per command group
+- Commands within a parallel group execute concurrently via `Promise.allSettled`
+- Evidence captured per-command; overall gate fails if any parallel command fails
+- Sequential dependencies via `depends_on` field
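A minimal sketch of one parallel group using `Promise.allSettled`, assuming a hypothetical `Runner` that resolves with command output and rejects on failure. The `depends_on` ordering is out of scope here.

```typescript
interface GateCommandResult {
  command: string;
  passed: boolean;
  detail: string; // command output on success, error text on failure
}

// Hypothetical runner: resolves with output, rejects when the command fails.
type Runner = (command: string) => Promise<string>;

async function runParallelGroup(
  commands: string[],
  run: Runner,
): Promise<{ passed: boolean; results: GateCommandResult[] }> {
  // allSettled waits for every command, so one failure does not hide other results.
  const settled = await Promise.allSettled(commands.map((c) => run(c)));
  const results = settled.map(
    (s, i): GateCommandResult =>
      s.status === "fulfilled"
        ? { command: commands[i], passed: true, detail: s.value }
        : { command: commands[i], passed: false, detail: String(s.reason) },
  );
  // The gate as a whole fails if any command in the group failed.
  return { passed: results.every((r) => r.passed), results };
}
```

Using `allSettled` rather than `Promise.all` matters for evidence capture: every command gets its own result even when a sibling fails early.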
+**Estimated Effort:** 3 days
+---
+#### N3: Cost Tracking & Budget Enforcement (P2)
+**Problem:** Neither system tracks or limits LLM API costs per feature. Runaway agent loops can burn tokens.
+**Specification:**
+- Track token usage per feature via operation ledger metadata
+- Budget config:
+  ```yaml
+  budget:
+    per_feature_limit: 50.00  # USD
+    per_phase_limit: 20.00
+    alert_threshold: 0.8  # notify at 80% of budget
+  ```
+- Supervisor checks budget before each worker decision loop iteration
+- Over-budget triggers notification + feature pause (not kill)
+- Dashboard shows cost-per-feature column
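The per-iteration budget check could be as small as the function below. It is a sketch covering only `per_feature_limit` and `alert_threshold`; the decision names are hypothetical, and converting token counts to USD is assumed to happen upstream.

```typescript
interface BudgetConfig {
  per_feature_limit: number; // USD
  alert_threshold: number; // fraction of the limit, e.g. 0.8
}

type BudgetDecision = "continue" | "alert" | "pause";

// Called by the supervisor before each worker decision loop iteration.
function checkBudget(spentUsd: number, cfg: BudgetConfig): BudgetDecision {
  if (spentUsd >= cfg.per_feature_limit) return "pause"; // over budget: pause, do not kill
  if (spentUsd >= cfg.per_feature_limit * cfg.alert_threshold) return "alert";
  return "continue";
}
```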
+**Estimated Effort:** 1 week
+---
+#### N4: Dependency-Aware Feature Scheduling (P2)
+**Problem:** Neither system supports declaring that Feature B depends on Feature A. AOP's collision detection is file-path based, not semantic.
+**Specification:**
+- Spec metadata supports `depends_on: [feature_a]`
+- Scheduler defers dependent features to blocked queue until dependencies reach `merged` status
+- Automatic promotion when dependency chain resolves
+- Circular dependency detection at plan submission time
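Circular dependency detection and automatic promotion can both be expressed over a plain adjacency map. The sketch below uses a depth-first search with a visiting/done marker; `DependencyGraph`, `findCycle`, and `promotable` are hypothetical names.

```typescript
// feature id -> ids it depends on (from `depends_on` in spec metadata)
type DependencyGraph = Record<string, string[]>;

// Returns one cycle as a path (first node repeated at the end), or null if the graph is acyclic.
function findCycle(graph: DependencyGraph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];
  const visit = (node: string): string[] | null => {
    if (state.get(node) === "done") return null;
    if (state.get(node) === "visiting") {
      // Reached a node that is still on the current DFS path: that path segment is a cycle.
      return [...stack.slice(stack.indexOf(node)), node];
    }
    state.set(node, "visiting");
    stack.push(node);
    for (const dep of graph[node] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(node, "done");
    return null;
  };
  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

// Features that are not merged yet and whose dependencies have all reached `merged`.
function promotable(graph: DependencyGraph, merged: ReadonlySet<string>): string[] {
  return Object.keys(graph).filter(
    (f) => !merged.has(f) && (graph[f] ?? []).every((d) => merged.has(d)),
  );
}
```

Returning the cycle path rather than a boolean lets plan submission report which features form the loop.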
+**Estimated Effort:** 1 week
+---
+#### N5: Agent Performance Analytics (P3)
+**Problem:** Neither system tracks which provider/model combinations succeed more often at which task types.
+**Specification:**
+- Record per-feature outcome metrics: gate pass rate, retry count, time-to-merge, cost
+- Aggregate by provider + model over time
+- Optional: feed analytics into provider selection heuristics
+**Estimated Effort:** 1.5 weeks
+---
+#### N6: Typed Adapter Registry (Extension Point Taxonomy) (P0)
+**Problem:** AOP currently defines extension interfaces ad-hoc per feature (G2 introduces `NotifierChannel`, G6 extends `WorkerProvider`, G9 defines `IssueTracker`, G6a adds `ActivityMonitor`, G6e adds `PrMonitor`). Each is a standalone interface with its own discovery, configuration, and error handling. This leads to duplicated patterns, inconsistent adapter lifecycle, and a codebase that gets harder to extend with every new concern axis. Meanwhile the existing `ProviderSelection` in `providers.ts` uses a hardcoded union type (`'codex' | 'claude' | ... | 'copilot'`) — adding a new provider means editing the union, the resolution logic, and every switch that touches it.
+**Relationship to Plugin Systems:** This is *not* a plugin system. Plugins imply runtime discovery, dynamic loading, and third-party code running inside the process boundary — all of which undermine AOP's deterministic guarantees. An adapter registry is a **compile-time contract** with **config-driven selection**: the kernel knows every adapter that exists at build time, validates adapter configuration against schemas, and routes through the same deterministic pipeline (RBAC, validation, audit) as everything else. Adapters don't own state — the kernel does. Adapters don't make decisions — the kernel does. Adapters just answer "how do I talk to Slack" or "how do I parse Claude Code's session files."
+**Specification:**
+1. **Core Abstraction** (`apps/control-plane/src/application/adapters/adapter-registry.ts`):
+   ```typescript
+   /** A typed slot that adapters can fill. */
+   interface AdapterSlot<TContract> {
+     readonly name: string;                    // e.g. 'notification-channel', 'agent-provider'
+     readonly contract: TContract;             // the interface adapters must implement
+   }
+   /** Metadata every adapter must declare. */
+   interface AdapterManifest {
+     readonly slot: string;                    // which slot this fills
+     readonly name: string;                    // unique adapter name within slot (e.g. 'slack', 'claude')
+     readonly configSchema?: JsonSchema;       // AJV schema for adapter-specific config
+   }
+   /** The registry: slot → (name → adapter instance). */
+   interface AdapterRegistry {
+     register<T>(slot: AdapterSlot<T>, manifest: AdapterManifest, factory: (config: unknown) => T): void;
+     resolve<T>(slot: AdapterSlot<T>, name: string, config: unknown): T;
+     list(slot: string): ReadonlyArray<AdapterManifest>;
+     has(slot: string, name: string): boolean;
+   }
+   ```
+2. **Adapter Slots** (formalized concern axes):
+   | Slot | Contract Interface | Built-in Adapters | Used By |
+   |------|-------------------|-------------------|---------|
+   | `agent-provider` | `WorkerProvider` | codex, claude, gemini, kiro-cli, copilot, custom | Supervisor runtime, G6 send/attach |
+   | `notification-channel` | `NotifierChannel` | desktop, slack, webhook | G2 NotifierService |
+   | `scm-provider` | `ScmProvider` | github (via `gh` CLI) | G6e PR lifecycle |
+   | `issue-tracker` | `IssueTracker` | github, linear, jira | G9 tracker support |
+   | `activity-detector` | `ActivityDetector` | claude-jsonl, codex-rpc, process-heuristic | G6a activity monitoring |
+3. **Registration & Resolution:**
+   - All built-in adapters are registered at kernel boot time in a deterministic order
+   - Boot-time validation checks the adapter config from `policy.yaml` / `agents.yaml` against the adapter's declared `configSchema`
+   - Resolution is config-driven: `policy.yaml` specifies which adapter name to use per slot
+   - Resolution fails fast with structured error (`adapter_not_found`, `adapter_config_invalid`) if the adapter doesn't exist or config doesn't validate
+   - No dynamic imports, no runtime discovery, no third-party code — every adapter is a known import at build time
+4. **Config Integration:**
+   ```yaml
+   # agentic/orchestrator/policy.yaml
+   adapters:
+     notification-channel: slack       # selects the 'slack' adapter for this slot
+     scm-provider: github              # selects 'github' for SCM
+     issue-tracker: github             # selects 'github' for issue tracking
+     activity-detector: claude-jsonl   # selects Claude Code JSONL parser
+   # agentic/orchestrator/agents.yaml (existing, unchanged)
+   runtime:
+     default_provider: claude          # selects 'claude' for agent-provider slot
+   ```
+5. **Schema Validation:**
+   - New schema: `agentic/orchestrator/schemas/adapters.schema.json`
+   - Validates: slot names are known, adapter names exist within slots, adapter-specific config matches adapter's declared `configSchema`
+   - Validated at kernel boot alongside existing policy/gates/agents schemas
+6. **Migration Path (Non-Breaking):**
+   - Phase 1 (M29): Introduce `AdapterRegistry` and migrate `agent-provider` slot (replace hardcoded union in `providers.ts` with registry-based resolution). Existing `agents.yaml` config continues to work: `default_provider: claude` resolves through the registry. The `notification-channel` slot is also registered in M29 alongside G2, matching the Phase 1 roadmap in Section 4.
+   - Phase 2 (M30): Register `scm-provider` and `activity-detector` slots as G6e/G6a are implemented. These features use the registry from the start rather than inventing their own discovery patterns.
+   - Phase 3 (M31): Register `issue-tracker` slot when G9 is implemented.
+   - Each phase is backward compatible — existing config keeps working, the registry just formalizes what was already implicit.
+7. **What This Is NOT:**
+   - NOT a plugin system: no dynamic loading, no npm package discovery, no third-party extension API
+   - NOT a service locator: adapters are resolved at boot time, not lazily on first use
+   - NOT an abstraction for abstraction's sake: every slot maps to a concrete feature (G2, G6, G6a, G6e, G9) that is already planned
+   - The kernel retains full authority over state, validation, RBAC, and audit. Adapters are leaf-node implementations behind the kernel's deterministic pipeline.
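To make the register/resolve contract concrete, here is a dependency-free in-memory sketch. It deliberately simplifies the interfaces above: a predicate (`isValidConfig`) stands in for the AJV `configSchema`, the slot uses a phantom optional field instead of a `contract` value, and the class and error names are hypothetical.

```typescript
// Phantom-typed slot: the type parameter ties a slot name to its contract at compile time.
interface Slot<TContract> {
  readonly name: string;
  readonly __contract?: TContract; // never set at runtime
}

interface Manifest {
  readonly slot: string;
  readonly name: string;
  // Stand-in for the spec's `configSchema`; a predicate keeps the sketch free of AJV.
  readonly isValidConfig?: (config: unknown) => boolean;
}

type AdapterErrorCode = "adapter_not_found" | "adapter_config_invalid";

class AdapterError extends Error {
  readonly code: AdapterErrorCode;
  constructor(code: AdapterErrorCode, message: string) {
    super(message);
    this.code = code;
  }
}

class InMemoryAdapterRegistry {
  private readonly entries = new Map<
    string,
    { manifest: Manifest; factory: (config: unknown) => unknown }
  >();

  register<T>(slot: Slot<T>, manifest: Manifest, factory: (config: unknown) => T): void {
    this.entries.set(`${slot.name}/${manifest.name}`, { manifest, factory });
  }

  // Fails fast with a structured error code, as the resolution rules require.
  resolve<T>(slot: Slot<T>, name: string, config: unknown): T {
    const entry = this.entries.get(`${slot.name}/${name}`);
    if (!entry) throw new AdapterError("adapter_not_found", `${slot.name}/${name}`);
    if (entry.manifest.isValidConfig && !entry.manifest.isValidConfig(config)) {
      throw new AdapterError("adapter_config_invalid", `${slot.name}/${name}`);
    }
    return entry.factory(config) as T;
  }

  list(slot: string): ReadonlyArray<Manifest> {
    return Array.from(this.entries.values())
      .map((e) => e.manifest)
      .filter((m) => m.slot === slot);
  }

  has(slot: string, name: string): boolean {
    return this.entries.has(`${slot}/${name}`);
  }
}
```

All registrations would be static calls made at kernel boot, so nothing in this sketch loads code dynamically.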
+**Acceptance Criteria:**
+- [ ] `AdapterRegistry` supports register/resolve/list/has operations with type safety
+- [ ] `agent-provider` slot migrated from hardcoded union to registry (no config changes required)
+- [ ] Adapter config validated against adapter-declared `configSchema` at boot
+- [ ] Resolution fails fast with structured error for unknown adapter or invalid config
+- [ ] No dynamic imports or runtime code loading — all adapters are static imports
+- [ ] Adding a new adapter to an existing slot requires: one file (implementation), one registration call, one config entry
+**Estimated Effort:** 1 week (registry core + agent-provider migration). Subsequent slot registrations are ~1 day each, folded into the features that use them (G2, G6a, G6e, G9).
+---
+## 4. Implementation Roadmap
+### Phase 1: Critical UX Improvements (M29)
+**Duration:** 5-6 weeks (assumes some deliverables run in parallel; the itemized estimates below sum to roughly 8 weeks)
+**Deliverables:**
+1. **G1: Web Dashboard** (2 weeks)
+   - Next.js dashboard with SSE updates
+   - Feature cards, diff viewer, evidence viewer
+   - Kanban view (planning/building/qa/ready_to_merge/merged columns)
+   - Launch via `aop dashboard`
+2. **G2: Notification System** (1 week)
+   - Register `notification-channel` adapter slot (N6): desktop, slack, webhook adapters
+   - 4-tier priority routing (urgent/action/warning/info)
+   - Event triggers (gate_failed, collision_detected, ready_to_merge, agent_stuck)
+3. **G3: Init Wizard** (1 week)
+   - Interactive `aop init` command with `--auto` zero-prompt mode
+   - Auto-detection (git repo, remote, branch, test framework)
+   - Template generation (policy, gates, agents)
+   - postCreate hooks and symlink configuration
+4. **G4: Multi-Project Config** (1.5 weeks)
+   - Multi-project YAML schema
+   - `--project` flag in CLI
+   - Run lease isolation per project
+5. **G6b: Automated Session Cleanup** (3 days)
+   - `aop cleanup [--dry-run] [--yes]`
+   - Auto-cleanup after merge (configurable grace period)
+6. **G6c: Batch Feature Operations** (3 days)
+   - `aop run -fl <folder> --batch` with deduplication
+   - Summary output (created/skipped/failed counts)
+7. **G6d: Workspace postCreate Hooks & Symlinks** (2 days)
+   - `post_create` commands in policy.yaml
+   - Worktree symlinks for shared config files
+8. **N6: Typed Adapter Registry — Core + Agent-Provider Migration** (1 week)
+   - `AdapterRegistry` with register/resolve/list/has and config schema validation
+   - Migrate `agent-provider` slot from hardcoded union in `providers.ts` to registry-based resolution
+   - `adapters.schema.json` validated at kernel boot
+   - No config changes required — existing `agents.yaml` `default_provider` resolves through registry
+**Milestone Acceptance:**
+- [ ] Dashboard displays live feature status with SSE updates + Kanban view
+- [ ] Dashboard review panel: approve/deny/request-changes with merge control
+- [ ] Dashboard checkout: one-click switch to feature branch for local testing with stash/restore
+- [ ] Notifications sent to Slack on gate failures with 4-tier priority routing
+- [ ] Adapter registry operational with `agent-provider` slot migrated; adding a new provider requires one file + one registration + one config entry
+- [ ] `aop init` generates valid config from git context (with `--auto` mode)
+- [ ] Multi-project config supports 2+ repos with isolated run leases
+- [ ] `aop cleanup` removes terminal features automatically
+- [ ] Batch run deduplicates and reports counts
+- [ ] Worktree post-create hooks execute and symlinks created
+---
+### Phase 2: Autonomous Operations & Observability (M30)
+**Duration:** 4-5 weeks (assumes some deliverables run in parallel; the itemized estimates below sum to 7 weeks)
+**Deliverables:**
+1. **G5: CI Failure Auto-Remediation** (2 weeks)
+   - Reaction policy config
+   - Retry loop with agent repair
+   - Escalation notifications (time-based + retry-count-based)
+   - `agent_stuck` reaction with idle threshold
+2. **G6: Session Management Commands** (1.5 weeks)
+   - `aop send <feature_id> <message>` with idle-wait
+   - `aop attach <feature_id>` (interactive mode)
+   - Provider interface extensions (via `agent-provider` adapter slot from N6)
+3. **G6a: Agent Activity Detection** (1 week)
+   - Provider-specific activity state detection
+   - Activity column in `aop status`
+   - Stuck-agent notification triggers
+   - Register `activity-detector` adapter slot (N6): claude-jsonl, codex-rpc, process-heuristic adapters
+4. **G6e: PR Lifecycle Integration** (1.5 weeks)
+   - PR detection via `gh` CLI
+   - CI status + review decision tracking in feature state
+   - Merge score in dashboard
+   - `changes_requested` reaction (subsumes G7)
+   - Register `scm-provider` adapter slot (N6): github adapter
+5. **N1: Incremental Gate Execution** (1 week)
+   - `--changed` flag support in gate commands
+   - Fast mode uses incremental, full/merge modes run complete suite
+**Milestone Acceptance:**
+- [ ] Gate failures trigger automatic retry with agent repair + time-based escalation
+- [ ] Agent activity state visible in `aop status` and dashboard
+- [ ] `aop send` delivers messages to active agents with idle-wait
+- [ ] PR state (CI, reviews, merge readiness) tracked in feature state
+- [ ] Incremental gates reduce fast-mode execution time by ≥50%
+---
+### Phase 3: Ecosystem Integration (M31)
+**Duration:** 3-4 weeks (assumes some deliverables run in parallel; the itemized estimates below sum to 5 weeks)
+**Deliverables:**
+1. **G9: Multi-Tracker Support** (2 weeks)
+   - Register `issue-tracker` adapter slot (N6): github, linear, jira adapters
+   - Issue context enrichment
+   - Status sync (optional)
+2. **G11: Hash-Based Multi-Instance Isolation** (1 week)
+   - Instance ID from config path hash
+   - Namespaced run leases
+   - Multi-instance dashboard aggregation
+3. **N3: Cost Tracking & Budget Enforcement** (1 week)
+   - Per-feature token/cost tracking via operation ledger
+   - Budget limits with notification at threshold
+   - Cost column in dashboard
+4. **N4: Dependency-Aware Feature Scheduling** (1 week)
+   - `depends_on` in spec metadata
+   - Automatic promotion when dependencies merge
+   - Circular dependency detection
+**Milestone Acceptance:**
+- [ ] Planner receives issue context from GitHub/Linear
+- [ ] Feature status updates sync to issue tracker
+- [ ] Multiple orchestrator instances run safely with isolated leases
+- [ ] Per-feature cost tracked and budget alerts fire at threshold
+- [ ] Dependent features auto-promote when dependencies merge
+---
+## 5. Testing Strategy
+### 5.1 Unit Tests
+- **Dashboard:** SSE event emission, API route handlers, file polling, Kanban column assignment, review decision dispatch (approve/deny/request_changes → tool client calls), checkout flow (stash detection, branch switch, restore state tracking)
+- **Notifications:** Channel routing (4-tier), message formatting, failure handling, throttle/batch
+- **Init Wizard:** Git detection, template generation, schema validation, `--auto` mode
+- **Multi-Project:** Config parsing, project selection, lease isolation
+- **Reactions:** Retry logic, escalation triggers (time + count), agent repair loops, stuck detection
+- **Activity Monitor:** Provider-specific state detection, idle threshold, unknown fallback
+- **Cleanup:** Terminal feature detection, grace period, dry-run mode
+- **Batch Operations:** Deduplication, sequential spawn, summary reporting
+- **PR Monitor:** `gh` CLI parsing, state mapping, merge score calculation
+- **Incremental Gates:** Changed-file detection, command interpolation, evidence tracking
+- **Cost Tracking:** Token accumulation, budget threshold detection, pause logic
+### 5.2 Integration Tests
+- **Dashboard E2E:** Feature status updates → SSE events → UI refresh
+- **Dashboard Review E2E:** Feature reaches ready_to_merge → reviewer approves via dashboard → merge executes → feature moves to merged
+- **Dashboard Checkout E2E:** Reviewer clicks checkout → stash created → branch switched → restore returns to original branch + stash pop
+- **Notification E2E:** Gate failure → Slack webhook call → message received
+- **Multi-Project E2E:** Run two projects concurrently → verify lease isolation
+- **Reaction E2E:** Gate failure → retry → agent repair → gate re-run → success
+- **Cleanup E2E:** Feature merged → grace period → auto-cleanup → index updated
+- **PR Lifecycle E2E:** Branch push → PR detected → CI status tracked → review feedback → agent fix
+### 5.3 Manual Acceptance Tests
+- Dashboard visual inspection (UI polish, responsiveness)
+- `aop init` wizard flow (user-friendly prompts, error messages)
+- `aop send` / `aop attach` interactive sessions (terminal streaming)
+- Multi-tracker issue sync (GitHub API responses, Linear updates)
+---
+## 6. Configuration Schema Changes
+### 6.1 Policy Extensions
+```yaml
+# agentic/orchestrator/policy.yaml
+# NEW: Notifications
+notifications:
+  enabled: true
+  channels:
+    desktop:
+      enabled: true
+    slack:
+      enabled: true
+      webhook: ${SLACK_WEBHOOK_URL}
+      channel: "#aop-alerts"
+    webhook:
+      enabled: false
+      url: ${CUSTOM_WEBHOOK_URL}
+  routing:
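+    urgent: [desktop, slack]
+    action: [desktop, slack]  # assumed mapping; tier names follow the 4-tier routing (urgent/action/warning/info)
+    warning: [slack]
+    info: [slack]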
+# NEW: Reactions
+reactions:
+  gate_failed:
+    enabled: true
+    max_retries: 2
+    action: retry_with_agent_repair
+    escalate_after: 2
+    retry_delay: 30s
+  collision_detected:
+    enabled: true
+    action: notify_only
+  ready_to_merge:
+    enabled: true
+    action: notify_only
+# NEW: Dashboard
+dashboard:
+  enabled: true
+  port: 3000
+  auth:
+    enabled: false  # future: API key auth
+# NEW: Issue Tracker (optional)
+issue_tracker:
+  enabled: false
+  type: github  # or linear, jira
+  config:
+    token: ${GITHUB_TOKEN}
+    repo: myorg/myrepo
+```
+### 6.2 Multi-Project Schema
+```yaml
+# agentic/orchestrator/multi-project.yaml (NEW FILE)
+version: "1.0"
+defaults:
+  max_active_features: 5
+  max_parallel_gate_runs: 3
+  dashboard_port: 3000
+  notifications:
+    enabled: true
+projects:
+  - name: "project_a"
+    path: ~/repos/project_a
+    repo: "org/project_a"
+    branch: main
+    policy: agentic/orchestrator/policy.yaml
+    gates: agentic/orchestrator/gates-project-a.yaml
+    dashboard_port: 3001  # override
+```
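+As a worked example, assuming per-project values take precedence over `defaults` (an assumption suggested by the `# override` comment above, not a merge rule this spec confirms), the effective settings for `project_a` would resolve roughly as follows. This illustrates the expected result only; it is not a file the orchestrator reads or writes:
+```yaml
+# Effective configuration for project_a (illustrative; assumes per-project keys override defaults)
+name: "project_a"
+max_active_features: 5       # inherited from defaults
+max_parallel_gate_runs: 3    # inherited from defaults
+dashboard_port: 3001         # project-level override of defaults.dashboard_port (3000)
+notifications:
+  enabled: true              # inherited from defaults
+```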
+---
+## 7. Acceptance Criteria (Phase 1 - M29)
+### Dashboard (G1)
+- [ ] Dashboard displays features in real-time via SSE
+- [ ] Feature detail page shows state, plan, diff, evidence
+- [ ] Diff viewer renders syntax-highlighted diffs
+- [ ] Evidence artifacts downloadable from dashboard
+- [ ] Dashboard survives orchestrator restart (reconnects SSE)
+- [ ] Review panel: approve & merge, deny with reason, request changes with feedback to agent
+- [ ] Checkout button: switch main repo to feature branch with auto-stash and restore
+- [ ] Checkout safety: blocked when no worktree/branch or repo has conflicts
+### Notifications (G2)
+- [ ] Desktop notification on gate failure (macOS/Linux)
+- [ ] Slack webhook receives formatted messages with links
+- [ ] Notification config validated on startup
+- [ ] Notification failures logged but do not crash orchestrator
+### Init Wizard (G3)
+- [ ] Wizard detects git repo and parses remote URL
+- [ ] Generated config files pass schema validation
+- [ ] Template selection generates correct gates.yaml for test framework
+- [ ] Wizard handles non-git directories gracefully
+### Multi-Project (G4)
+- [ ] Multi-project config validates against schema
+- [ ] `--project` flag selects correct project
+- [ ] Run leases isolated per project (parallel safe)
+- [ ] `aop status --all` aggregates across projects
+---
+## 8. Non-Goals
+**Features explicitly excluded from this spec:**
+1. **Auto-merge on green CI** — Conflicts with explicit merge control. AOP requires human approval before merge.
+2. **Plugin system** — Conflicts with deterministic kernel design. Provider abstraction + MCP tool registry provide sufficient extensibility.
+3. **K8s/SSH runtimes** — MCP transport abstraction already supports remote MCP servers for distributed execution.
+4. **Alternative workspace modes (clone/copy)** — Worktrees are optimal for parallel work with shared .git. Clone mode adds complexity with marginal benefit (workaround: symlinks via G6d).
+5. **Terminal plugins (iTerm2, web terminal)** — Dashboard with log viewer (G1) sufficient for monitoring. Interactive `aop attach` (G6) covers direct agent access.
+6. **Meta-agent orchestration pattern** — Composio's approach (AI agent as orchestrator using CLI) is innovative but trades determinism for flexibility. AOP's code-driven supervisor provides stronger guarantees. Noted as architectural divergence, not a gap.
+**Revised from v1.1:**
+- **Review comment auto-handling (G7)** — PROMOTED from Non-Goal to P1. Pragmatic implementation (forward review comments to agent) is compatible with AOP's review model. Subsumed into G6e (PR Lifecycle Integration).
+---
+## 9. Migration Path
+### 9.1 Backward Compatibility
+**Existing Features Unaffected:**
+- All core MCP tools remain unchanged
+- Existing CLI commands (`run`, `status`, `resume`, `delete`) backward compatible
+- Existing config files work without changes (new fields optional)
+- Feature state/plan/index schemas unchanged
+**New Features Opt-In:**
+- Dashboard: Launch via `aop dashboard` (opt-in)
+- Notifications: Disabled by default; enable in policy.yaml
+- Reactions: Disabled by default; enable in policy.yaml
+- Multi-project: Single-project mode remains default
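+A minimal sketch of the opt-in posture described above, reusing keys from Section 6.1. It assumes that omitting these blocks is equivalent to `enabled: false`; the spec states the defaults but not how absent keys are treated:
+```yaml
+# agentic/orchestrator/policy.yaml (existing projects: no changes required)
+notifications:
+  enabled: false   # default; set to true to opt in
+reactions:
+  gate_failed:
+    enabled: false # default; set to true to opt in
+```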
+### 9.2 Deprecation Policy
+**No deprecations in M29-M31.**
+Future consideration (M32+):
+- Deprecate `--transport mcp` in favor of remote MCP server URLs
+- Deprecate in-process transport in favor of local MCP server
+---
+## 10. Success Metrics
+### 10.1 Quantitative Metrics
+**M29 (Phase 1):**
+- Dashboard page load < 2s
+- SSE event latency < 2s (file change → UI update)
+- `aop init` completion time < 60s
+- Notification delivery latency < 5s
+**M30 (Phase 2):**
+- Retry success rate > 60% (gate failures auto-resolved)
+- Escalation rate < 20% (most failures resolved before human intervention)
+- `aop send` message delivery < 1s
+**M31 (Phase 3):**
+- Multi-project config validation time < 5s
+- Issue tracker sync latency < 10s
+- Multi-instance run lease acquisition < 1s
+### 10.2 Qualitative Metrics
+**User Experience:**
+- Dashboard is intuitive for first-time users (validated through user testing)
+- Init wizard reduces setup time from 30min → 5min
+- Notifications reduce "poll for status" behavior
+**Operational Excellence:**
+- Auto-remediation reduces manual intervention by 50%
+- Multi-project support enables single-dashboard management
+- Session commands reduce workflow restarts by 70%
+---
+## 11. Risk Assessment
+### 11.1 Technical Risks
+**Risk 1: SSE Scalability**
+- **Impact:** Dashboard becomes unresponsive with >10 features
+- **Mitigation:** Implement event batching, debounce updates, connection pooling
+**Risk 2: Provider API Changes**
+- **Impact:** `aop send` / `aop attach` break when Claude/Codex updates
+- **Mitigation:** Version provider interface, graceful fallback, release notes monitoring
+**Risk 3: Notification Delivery Failures**
+- **Impact:** Critical alerts lost (gate failures, collisions)
+- **Mitigation:** Log all notification attempts, retry failed deliveries, fall back to desktop notifications
+**Risk 4: Multi-Project Config Complexity**
+- **Impact:** Users misconfigure run leases → orchestrator conflicts
+- **Mitigation:** Schema validation, init wizard guidance, clear error messages
+### 11.2 Operational Risks
+**Risk 1: Dashboard Security**
+- **Impact:** Unauthorized access to feature diffs, evidence
+- **Mitigation:** Add API key auth (P2), limit to localhost by default
+**Risk 2: Retry Loop Abuse**
+- **Impact:** Infinite retry loops consume resources
+- **Mitigation:** Hard cap on retries (max 5), exponential backoff, manual override required to continue past the cap
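+One way the retry-loop mitigations could surface in `policy.yaml`, extending the `reactions.gate_failed` block from Section 6.1. The `backoff`, `backoff_factor`, and `max_retry_delay` keys are hypothetical names introduced for illustration; only `max_retries`, `retry_delay`, `escalate_after`, and `action` appear in this spec. The sketch also assumes the hard cap bounds whatever `max_retries` is configured:
+```yaml
+reactions:
+  gate_failed:
+    enabled: true
+    max_retries: 2          # configured value; assumed to be bounded by the hard cap of 5
+    retry_delay: 30s        # base delay before the first retry
+    backoff: exponential    # HYPOTHETICAL key
+    backoff_factor: 2       # HYPOTHETICAL key: delays of 30s, then 60s
+    max_retry_delay: 10m    # HYPOTHETICAL key: upper bound on any single delay
+    escalate_after: 2
+    action: retry_with_agent_repair
+```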
+**Risk 3: Slack Webhook Rate Limits**
+- **Impact:** Notifications dropped during high activity
+- **Mitigation:** Implement rate limiting, batch notifications, queue overflow alerts
+---
+## 12. Open Questions
+1. **Dashboard Auth:** Should the dashboard require authentication for production deployments?
+   - **Recommendation:** Add optional API key auth in M30; disabled by default in M29
+2. **Notification Throttling:** How to prevent notification spam during gate retry storms?
+   - **Recommendation:** Batch notifications (max 1 per feature per 5min), summary digest option
+3. **Multi-Project Dashboard:** Should the dashboard show all projects simultaneously or require switching?
+   - **Recommendation:** Project switcher in M29; global view in M30
+4. **Session Attach TTY:** Should `aop attach` use raw TTY mode or wrap in TUI?
+   - **Recommendation:** Raw TTY for Claude/Codex compatibility; TUI wrapper optional in M31
+5. **Issue Tracker Sync Frequency:** Real-time or batched?
+   - **Recommendation:** Batched (every 5min) to avoid rate limits; manual sync command for immediate update
+6. **Meta-Agent vs Code-Driven Orchestration:** Should AOP offer an optional meta-agent mode where the orchestrator is itself an AI agent (Composio's pattern)?
+   - **Recommendation:** No for M29-M31. AOP's code-driven supervisor provides deterministic guarantees that a meta-agent cannot. However, consider a hybrid mode in M32+ where a meta-agent can suggest but not execute priority decisions.
+7. **Activity Detection Reliability:** JSONL session file formats differ across Claude Code versions. How to handle format changes?
+   - **Recommendation:** Version-detect the JSONL format; fall back to process-alive + last-tool-call heuristic for unknown formats.
+8. **Cost Model Accuracy:** Token costs vary by provider and model. How to price accurately?
+   - **Recommendation:** Use provider-reported token counts when available (for example, the `usage` object in Claude API responses). Fall back to tiktoken estimation, which is only approximate for non-OpenAI models. Allow user-configured $/token overrides.
+9. **Incremental Gate Safety:** Incremental test selection may miss integration-level regressions. How to balance speed vs safety?
+   - **Recommendation:** `fast` mode uses incremental (acceptable risk for inner dev loop). `full` and `merge` modes always run complete suite. Document the tradeoff.
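+If the recommendations for questions 2, 5, and 8 are adopted, they might be exposed as configuration along these lines. Every key marked HYPOTHETICAL is an illustrative name, not part of the schema in Section 6; the values mirror the recommendations above, and the prices are placeholders rather than real rates:
+```yaml
+notifications:
+  throttle:                        # HYPOTHETICAL block (question 2)
+    max_per_feature: 1             # at most one notification per feature...
+    window: 5m                     # ...within each 5-minute window
+    digest: false                  # optional summary digest
+issue_tracker:
+  sync_interval: 5m                # HYPOTHETICAL key (question 5): batched sync
+cost:                              # HYPOTHETICAL block (question 8)
+  token_source: provider_reported  # preferred when the provider reports counts
+  fallback: tiktoken_estimate      # used when no provider counts are available
+  price_overrides:
+    example-model:                 # placeholder model name
+      usd_per_1k_input_tokens: 0.0   # placeholder; set your own rate
+      usd_per_1k_output_tokens: 0.0  # placeholder; set your own rate
+```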
+---
+## 13. Conclusion
+This specification (v2.0) significantly expands the gap analysis based on a deep dive into Composio's actual source code (not just README claims). The original v1.1 identified 12 gaps; this revision adds 5 newly discovered gaps (G6a-G6e) and 5 novel differentiator features (N1-N5), revises priority decisions (G7 promoted from P3 to P1), and provides implementation-level detail for all additions.
+**Phase 1 (M29)** delivers essential UX improvements (dashboard, notifications, init wizard, multi-project support) plus quick wins (cleanup automation, batch operations, workspace hooks) that eliminate major friction points.
+**Phase 2 (M30)** adds autonomous operations (auto-remediation, activity detection, session commands, PR lifecycle integration, incremental gates) that reduce manual intervention and provide runtime observability.
+**Phase 3 (M31)** integrates with external ecosystems (issue trackers, multi-instance isolation, cost tracking, dependency scheduling) for enterprise workflows.
+**Key Principles:**
+1. Every new feature is opt-in, backward compatible, and does not compromise deterministic guarantees or explicit merge control.
+2. AOP's code-driven supervisor is an intentional architectural choice, not a gap. Meta-agent orchestration (Composio's pattern) trades determinism for flexibility — AOP chooses determinism.
+3. Novel features (N1-N5) represent competitive differentiation opportunities that neither package currently offers.
+**Total estimated effort:** M29 (5-6 weeks) + M30 (4-5 weeks) + M31 (3-4 weeks) = ~12-15 weeks.
+---
+**End of Specification**