@neotx/agents 0.1.0-alpha.14 → 0.1.0-alpha.19

package/GUIDE.md ADDED
@@ -0,0 +1,595 @@
1
+ # Neo — AI Integration Guide
2
+
3
+ You are reading the neo integration guide. This document explains how an AI agent can use neo to orchestrate autonomous developer agents across git repositories.
4
+
5
+ neo is a framework that wraps the Claude Agent SDK with clone isolation, 3-level recovery, DAG workflows, concurrency control, budget guards, and approval gates. Agents work in isolated git clones — your main branch is never touched.
6
+
7
+ ---
8
+
9
+ ## Two ways to use neo
10
+
11
+ ### Mode A: Supervisor (recommended)
12
+
13
+ **This is the recommended way to use neo.** The supervisor is a long-lived autonomous daemon that acts as your CTO. You send it messages in natural language and it handles everything — agent selection, dispatch ordering, review cycles, retries, and memory.
14
+
15
+ The supervisor is NOT a chatbot. It's an event-driven heartbeat loop that:
16
+ - Picks up your messages at the next heartbeat
17
+ - Dispatches the right agents in the right order
18
+ - Monitors progress and reacts to completions/failures
19
+ - Persists memory across sessions — it learns your codebase over time
20
+ - Handles the full lifecycle: refine → architect → develop → review → fix → done
21
+
22
+ ```bash
23
+ # Start the supervisor (background daemon)
24
+ neo supervise --detach
25
+
26
+ # Send a task — the supervisor handles the rest
27
+ neo supervise --message "Implement user authentication with JWT. Create login/register endpoints, middleware, and tests."
28
+
29
+ # The supervisor autonomously:
30
+ # 1. Analyzes your request
31
+ # 2. Dispatches architect if design is needed
32
+ # 3. Dispatches developer for each task
33
+ # 4. Dispatches reviewer to review PRs
34
+ # 5. Dispatches fixer if issues are found
35
+ # 6. Reports back via activity log
36
+
37
+ # Check supervisor status
38
+ neo supervisor status
39
+
40
+ # View what the supervisor is doing
41
+ neo supervisor activity --limit 10
42
+
43
+ # Send follow-up instructions
44
+ neo supervise --message "Prioritize the auth middleware — we need it before the API routes"
45
+
46
+ # Check costs
47
+ neo cost --short
48
+ ```
49
+
50
+ **Why supervisor mode?** You don't need to know which agent to use, how to chain them, or when to retry. The supervisor makes those decisions based on its experience and memory. It also handles edge cases (review cycles, CI failures, anti-loop guards) that are tedious to manage manually.
51
+
52
+ ### Mode B: Direct dispatch (advanced)
53
+
54
+ Use direct dispatch when you want full control over the workflow: you decide what to build, which agent to use, and when to follow up. It is useful for one-off tasks or when you have a specific agent pipeline in mind.
55
+
56
+ ```bash
57
+ # Dispatch a developer agent
58
+ neo run developer --prompt "Add input validation to POST /api/users" \
59
+ --repo /path/to/project --branch feat/input-validation \
60
+ --meta '{"label":"input-validation","stage":"develop"}'
61
+
62
+ # Check progress
63
+ neo runs --short --status running
64
+
65
+ # Read the result when done
66
+ neo runs <runId>
67
+
68
+ # Check costs
69
+ neo cost --short
70
+ ```
71
+
72
+ You handle the develop → review → fix cycle yourself. See "Typical Workflows" at the end for examples.
73
+
74
+ ---
75
+
76
+ ## Installation & Setup
77
+
78
+ ```bash
79
+ # Prerequisites: Node.js >= 22, git >= 2.20, Claude Code CLI installed
80
+
81
+ # Install neo globally
82
+ npm install -g @neotx/cli
83
+
84
+ # Verify installation
85
+ neo doctor
86
+
87
+ # Initialize in your project
88
+ cd /path/to/your/project
89
+ neo init
90
+
91
+ # (Optional) Add MCP integrations
92
+ neo mcp add github # requires GITHUB_TOKEN env var
93
+ neo mcp add linear # requires LINEAR_API_KEY env var
94
+ neo mcp add notion # requires NOTION_TOKEN env var
95
+ ```
96
+
97
+ ---
98
+
99
+ ## Available Agents
100
+
101
+ | Agent | Model | Mode | Use when |
102
+ |-------|-------|------|----------|
103
+ | `developer` | opus | writable | Implementing code changes, bug fixes, new features |
104
+ | `architect` | opus | readonly | Designing systems, planning features, decomposing work |
105
+ | `reviewer` | sonnet | readonly | Code review — blocks on ≥1 CRITICAL or ≥3 WARNINGs |
106
+ | `fixer` | opus | writable | Fixing issues found by reviewer — targets root causes |
107
+ | `refiner` | opus | readonly | Evaluating ticket quality, splitting vague tickets |
108
+
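The reviewer's blocking rule from the table above can be written as a small predicate. This is only a sketch: the `ReviewIssue` shape, the `INFO` severity, and the `shouldBlock` name are hypothetical and not part of neo's API. Only the two thresholds (at least 1 CRITICAL, or at least 3 WARNINGs) come from the table.

```typescript
// Hypothetical issue shape; neo's actual reviewer output format may differ.
type Severity = "CRITICAL" | "WARNING" | "INFO";

interface ReviewIssue {
  severity: Severity;
  message: string;
}

// Blocks when there is at least one CRITICAL or at least three WARNINGs,
// mirroring the rule stated for the `reviewer` agent.
function shouldBlock(issues: ReviewIssue[]): boolean {
  const criticals = issues.filter((i) => i.severity === "CRITICAL").length;
  const warnings = issues.filter((i) => i.severity === "WARNING").length;
  return criticals >= 1 || warnings >= 3;
}

const twoWarnings: ReviewIssue[] = [
  { severity: "WARNING", message: "Missing input validation" },
  { severity: "WARNING", message: "Duplicated helper" },
];
console.log(shouldBlock(twoWarnings)); // false: below both thresholds
```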
109
+ **Custom agents:** Drop a YAML file in `.neo/agents/` to extend built-in agents:
110
+
111
+ ```yaml
112
+ # .neo/agents/my-developer.yml
113
+ name: my-developer
114
+ extends: developer
115
+ promptAppend: |
116
+ Always use our internal logger instead of console.log.
117
+ Follow the patterns in src/shared/conventions.ts.
118
+ ```
119
+
120
+ List all agents: `neo agents`
121
+
122
+ ---
123
+
124
+ ## Complete Command Reference
125
+
126
+ ### neo run — Dispatch an agent
127
+
128
+ ```bash
129
+ neo run <agent> --prompt "..." --repo <path> --branch <name> [flags]
130
+ ```
131
+
132
+ | Flag | Type | Default | Description |
133
+ |------|------|---------|-------------|
134
+ | `--prompt` | string | required | Task description for the agent |
135
+ | `--repo` | string | `.` | Target repository path |
136
+ | `--branch` | string | required | Branch name for the isolated clone |
137
+ | `--priority` | string | `medium` | `critical`, `high`, `medium`, `low` |
138
+ | `--meta` | JSON string | — | Metadata: `{"label":"...","ticketId":"...","stage":"..."}` |
139
+ | `--detach`, `-d` | boolean | `true` | Run in background, return immediately |
140
+ | `--sync`, `-s` | boolean | `false` | Run in foreground (blocking) |
141
+ | `--git-strategy` | string | `branch` | `branch` (push only) or `pr` (create PR) |
142
+ | `--output` | string | — | `json` for machine-readable output |
143
+
144
+ **Detached output:** returns `runId` and PID immediately. Use `neo logs --follow --run <runId>` to follow.
145
+
146
+ **Example with full metadata:**
147
+ ```bash
148
+ neo run developer \
149
+ --prompt "Add rate limiting to POST /api/upload: max 10 req/min/IP, return 429 with Retry-After" \
150
+ --repo /path/to/api \
151
+ --branch feat/rate-limiting \
152
+ --priority high \
153
+ --meta '{"label":"T1-rate-limit","ticketId":"PROJ-42","stage":"develop"}' \
154
+ --git-strategy pr
155
+ ```
156
+
157
+ ### neo runs — Monitor runs
158
+
159
+ ```bash
160
+ neo runs # List all runs for current repo
161
+ neo runs <runId> # Full details + agent output (prefix match on ID)
162
+ neo runs --short # Compact output (minimal tokens)
163
+ neo runs --short --status running # Check active runs
164
+ neo runs --last 5 # Last N runs
165
+ neo runs --status failed # Filter by status: completed, failed, running
166
+ neo runs --repo my-project # Filter by repo
167
+ neo runs --output json # Machine-readable
168
+ ```
169
+
170
+ **Important:** After an agent completes, ALWAYS read `neo runs <runId>` — it contains the agent's structured output (PR URLs, issues found, plans, milestones).
171
+
172
+ ### neo supervise — Manage the supervisor daemon
173
+
174
+ ```bash
175
+ neo supervise # Start daemon + open live TUI
176
+ neo supervise --detach # Start daemon in background (no TUI)
177
+ neo supervise --attach # Open TUI for running daemon
178
+ neo supervise --status # Show supervisor status (PID, port, costs, heartbeats)
179
+ neo supervise --kill # Stop the running supervisor
180
+ neo supervise --message "..." # Send a message to the supervisor inbox
181
+ neo supervise --name my-supervisor # Use a named supervisor instance (default: "supervisor")
182
+ ```
183
+
184
+ **Status output includes:** PID, port, session ID, started timestamp, heartbeat count, last heartbeat, cost today, cost total, status (running/idle/stopped).
185
+
186
+ ### neo supervisor — Query supervisor state
187
+
188
+ ```bash
189
+ neo supervisor status # Current status + recent activity (top 5)
190
+ neo supervisor status --json # Machine-readable status
191
+ neo supervisor activity # Activity log (last 50 entries)
192
+ neo supervisor activity --type dispatch # Filter: decision, action, error, event, message, plan, dispatch
193
+ neo supervisor activity --since "2024-01-15T00:00:00Z" --until "2024-01-16T00:00:00Z"
194
+ neo supervisor activity --limit 20
195
+ neo supervisor activity --json # Machine-readable
196
+ ```
197
+
198
+ ### neo logs — Event journal
199
+
200
+ ```bash
201
+ neo logs # Last 20 events
202
+ neo logs --last 50 # Last N events
203
+ neo logs --type session:complete # Filter: session:start, session:complete, session:fail, cost:update, budget:alert
204
+ neo logs --run <runId> # Events for a specific run
205
+ neo logs --follow --run <runId> # Live tail of a running agent's log
206
+ neo logs --short # Ultra-compact (one-line per event)
207
+ neo logs --output json # Machine-readable
208
+ ```
209
+
210
+ ### neo log — Report to supervisor
211
+
212
+ Agents use this to report progress. Reports appear in the supervisor's TUI and activity log.
213
+
214
+ ```bash
215
+ neo log progress "3/5 endpoints done"
216
+ neo log action "Pushed to branch feat/auth"
217
+ neo log decision "Chose JWT over sessions — simpler for MVP"
218
+ neo log blocker "Tests failing, missing dependency" # Also wakes the supervisor via inbox
219
+ neo log milestone "All tests passing, PR opened"
220
+ neo log discovery "Repo uses Prisma + PostgreSQL"
221
+ ```
222
+
223
+ Flags: `--memory` (force to memory store), `--knowledge` (force to knowledge), `--procedure` (write as procedure memory).
224
+
225
+ ### neo cost — Budget tracking
226
+
227
+ ```bash
228
+ neo cost # Today's total + all-time + breakdown by agent and repo
229
+ neo cost --short # One-liner: today=$X.XX sessions=N agent1=$X.XX
230
+ neo cost --repo my-project # Filter by repo
231
+ neo cost --output json # Machine-readable
232
+ ```
233
+
234
+ ### neo memory — Persistent memory store
235
+
236
+ The supervisor maintains semantic memory using SQLite + FTS5 + optional vector embeddings.
237
+
238
+ ```bash
239
+ # Write memory
240
+ neo memory write --type fact --scope /path/to/repo "main branch uses protected merges"
241
+ neo memory write --type procedure --scope /path/to/repo "After architect run: parse milestones, create tasks"
242
+ neo memory write --type focus --expires 2h "Working on auth module — 3 tasks remaining"
243
+ neo memory write --type task --scope /path/to/repo --severity high --category "neo runs abc123" "Implement login endpoint"
244
+ neo memory write --type feedback --scope /path/to/repo "User wants PR descriptions in French"
245
+
246
+ # Update
247
+ neo memory update <id> "Updated content"
248
+ neo memory update <id> --outcome done # pending, in_progress, done, blocked, abandoned
249
+
250
+ # Search & list
251
+ neo memory search "authentication" # Semantic search (uses embeddings if available)
252
+ neo memory list # All memories
253
+ neo memory list --type fact # Filter by type: fact, procedure, episode, focus, feedback, task
254
+
255
+ # Delete
256
+ neo memory forget <id>
257
+
258
+ # Statistics
259
+ neo memory stats # Count by type and scope
260
+ ```
261
+
262
+ **Memory types:**
263
+
264
+ | Type | Use when | TTL |
265
+ |------|----------|-----|
266
+ | `fact` | Stable truth affecting dispatch decisions | Permanent (decays) |
267
+ | `procedure` | Same failure 3+ times, reusable how-to | Permanent |
268
+ | `focus` | Current working context (scratchpad) | `--expires` required |
269
+ | `task` | Planned work items | Until done/abandoned |
270
+ | `feedback` | Recurring review complaints | Permanent |
271
+ | `episode` | Event log entries | Permanent (decays) |
272
+
273
+ Additional flags: `--scope` (default: global), `--source` (developer/reviewer/supervisor/user), `--severity` (critical/high/medium/low), `--category` (context reference), `--tags` (comma-separated).
274
+
275
+ ### neo decision — Decision gates
276
+
277
+ Decision gates allow the supervisor to pause and wait for user input on important choices. Scout agents create decisions when they find issues that require human judgment. Users answer decisions via CLI, and the supervisor routes based on the answer.
278
+
279
+ ```bash
280
+ # Create a decision with options
281
+ neo decision create "Should we refactor the auth module or patch the existing code?" \
282
+ --options "refactor:Full refactor:Clean solution but takes 2 days,patch:Quick patch:Fast but adds tech debt" \
283
+ --default-answer patch \
284
+ --expires-in 24h \
285
+ --context "Found 3 security issues in auth module"
286
+
287
+ # List all pending decisions
288
+ neo decision pending
289
+
290
+ # List all decisions (including answered)
291
+ neo decision list
292
+
293
+ # Get details of a specific decision
294
+ neo decision get dec_abc123
295
+
296
+ # Answer a decision
297
+ neo decision answer dec_abc123 refactor
298
+
299
+ # JSON output for programmatic use
300
+ neo decision list --json
301
+ neo decision get dec_abc123 --json
302
+ ```
303
+
304
+ | Flag | Type | Default | Description |
305
+ |------|------|---------|-------------|
306
+ | `--options`, `-o` | string | — | Options in format `key:label` or `key:label:description` (comma-separated) |
307
+ | `--default-answer`, `-d` | string | — | Default answer key used if decision expires |
308
+ | `--expires-in`, `-e` | duration | `24h` | Expiration duration (e.g., `30m`, `24h`, `7d`) |
309
+ | `--type`, `-t` | string | `generic` | Decision type for categorization |
310
+ | `--context`, `-c` | string | — | Additional context for the decision |
311
+ | `--name` | string | `supervisor` | Supervisor name |
312
+ | `--json` | boolean | `false` | Output as JSON |
313
+
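The `--options` format described in the table (`key:label` or `key:label:description`, comma-separated) can be illustrated with a small parser. This is a sketch under assumptions, not neo's implementation: `DecisionOption` and `parseOptions` are hypothetical names, and it assumes labels and descriptions contain no commas.

```typescript
// Hypothetical parsed form of one `--options` entry.
interface DecisionOption {
  key: string;
  label: string;
  description?: string;
}

// Parses "key:label" or "key:label:description" entries separated by commas.
// Assumption: no commas inside labels or descriptions; any colons after the
// second one are kept as part of the description.
function parseOptions(raw: string): DecisionOption[] {
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
    .map((entry) => {
      const [key, label, ...rest] = entry.split(":");
      if (!key || !label) {
        throw new Error(`Invalid option "${entry}": expected key:label[:description]`);
      }
      const option: DecisionOption = { key: key.trim(), label: label.trim() };
      if (rest.length > 0) {
        option.description = rest.join(":").trim();
      }
      return option;
    });
}

const parsed = parseOptions(
  "refactor:Full refactor:Clean solution but takes 2 days,patch:Quick patch:Fast but adds tech debt",
);
console.log(parsed.map((o) => o.key)); // [ 'refactor', 'patch' ]
```

The keys (`refactor`, `patch`) are the values later passed to `neo decision answer <id> <key>`.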
314
+ **Actions:**
315
+
316
+ | Action | Description |
317
+ |--------|-------------|
318
+ | `create` | Create a new decision gate (VALUE = question text) |
319
+ | `list` | List all decisions |
320
+ | `pending` | List only unanswered decisions |
321
+ | `get` | Get details of a decision (VALUE = decision ID) |
322
+ | `answer` | Answer a decision (VALUE = decision ID, followed by answer key) |
323
+
324
+ **Typical workflow:**
325
+
326
+ 1. **Scout finds issue** → Agent discovers something requiring human judgment
327
+ 2. **Creates decision** → `neo decision create "..." --options "..."` with clear options
328
+ 3. **User is notified** → Decision appears in `neo decision pending`
329
+ 4. **User answers** → `neo decision answer <id> <key>`
330
+ 5. **Supervisor routes** → Picks up the answer and dispatches appropriate follow-up
331
+
332
+ **Example: Security issue triage**
333
+
334
+ ```bash
335
+ # Scout agent creates a decision after finding vulnerabilities
336
+ neo decision create "Found SQL injection in UserService. How should we proceed?" \
337
+ --options "block:Block release:Stop deployment until fixed,hotfix:Hotfix now:Emergency patch within 2h,schedule:Schedule fix:Add to next sprint" \
338
+ --default-answer block \
339
+ --expires-in 4h \
340
+ --type security \
341
+ --context "CVE-2024-1234 affects getUserById(). Risk: HIGH"
342
+
343
+ # User checks pending decisions
344
+ neo decision pending
345
+ # Output:
346
+ # ID TYPE QUESTION EXPIRES
347
+ # dec_x7k9m2 security Found SQL injection in UserService... 3h 45m
348
+
349
+ # User answers
350
+ neo decision answer dec_x7k9m2 hotfix
351
+
352
+ # Supervisor receives the answer and dispatches fixer agent with hotfix priority
353
+ ```
354
+
355
+ ### neo webhooks — Event notifications
356
+
357
+ Neo can push events to external URLs when run or supervisor events occur (for example, when an agent completes or a budget alert fires).
358
+
359
+ ```bash
360
+ neo webhooks # List all registered webhooks
361
+ neo webhooks add https://example.com/neo-events # Register a new endpoint
362
+ neo webhooks remove https://example.com/neo-events # Deregister
363
+ neo webhooks test # Test all endpoints (shows response codes + latency)
364
+ neo webhooks --output json # Machine-readable
365
+ ```
366
+
367
+ **Events emitted:** `supervisor_started`, `heartbeat`, `run_dispatched`, `run_completed`, `supervisor_stopped`, `session:start`, `session:complete`, `session:fail`, `cost:update`, `budget:alert`.
368
+
369
+ **Webhook payloads** are JSON. Optional HMAC signature verification via `X-Neo-Signature` header (configure `supervisor.secret` in config).
370
+
371
+ **Receiving webhooks in your app:**
372
+ ```
373
+ POST /webhook
374
+ Content-Type: application/json
375
+ X-Neo-Signature: sha256=<hmac>
376
+
377
+ {
378
+ "event": "run_completed",
379
+ "source": "neo-supervisor",
380
+ "payload": {
381
+ "runId": "abc-123",
382
+ "status": "completed",
383
+ "costUsd": 1.24,
384
+ "durationMs": 45000
385
+ }
386
+ }
387
+ ```
388
+
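A receiver can check the `X-Neo-Signature` header before trusting a payload. The sketch below rests on assumptions the guide does not confirm: that the value after `sha256=` is a hex-encoded HMAC-SHA256 of the raw request body, keyed with `supervisor.secret`. The function name `verifyNeoSignature` is hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumptions (not confirmed by the guide): hex-encoded HMAC-SHA256 computed
// over the raw request body, using the configured `supervisor.secret` as key.
function verifyNeoSignature(
  rawBody: string,
  header: string | undefined,
  secret: string,
): boolean {
  if (!header || !header.startsWith("sha256=")) return false;
  const received = Buffer.from(header.slice("sha256=".length), "hex");
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Self-check: sign a sample body locally, then verify it.
const secret = "example-secret";
const body = JSON.stringify({ event: "run_completed", source: "neo-supervisor" });
const header = "sha256=" + createHmac("sha256", secret).update(body, "utf8").digest("hex");
console.log(verifyNeoSignature(body, header, secret)); // true
console.log(verifyNeoSignature(body + " ", header, secret)); // false
```

Verify against the raw body bytes rather than a re-serialized object, since re-serialization can change whitespace or key order and invalidate the signature.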
389
+ ### neo mcp — MCP server integrations
390
+
391
+ MCP (Model Context Protocol) servers give agents access to external tools (Linear, GitHub, Notion, etc.).
392
+
393
+ ```bash
394
+ neo mcp list # List configured MCP servers
395
+
396
+ # Add a preset (auto-configured)
397
+ neo mcp add linear # Requires LINEAR_API_KEY env var
398
+ neo mcp add github # Requires GITHUB_TOKEN env var
399
+ neo mcp add notion # Requires NOTION_TOKEN env var
400
+ neo mcp add jira # Requires JIRA_API_TOKEN + JIRA_URL env vars
401
+ neo mcp add slack # Requires SLACK_BOT_TOKEN env var
402
+
403
+ # Add a custom MCP server
404
+ neo mcp add my-server --type stdio --command npx --serverArgs "@org/my-mcp-server"
405
+ neo mcp add my-http-server --type http --url http://localhost:8080
406
+
407
+ # Remove
408
+ neo mcp remove linear
409
+ ```
410
+
411
+ Once configured, MCP tools are available to the supervisor and agents during their sessions.
412
+
413
+ ### neo repos — Repository management
414
+
415
+ ```bash
416
+ neo repos # List registered repositories
417
+ neo repos add /path/to/repo --name my-project --branch main
418
+ neo repos remove my-project # By name or path
419
+ ```
420
+
421
+ ### neo agents — List agents
422
+
423
+ ```bash
424
+ neo agents # Table: name, model, sandbox, source (builtin/custom)
425
+ neo agents --output json # Machine-readable
426
+ ```
427
+
428
+ ### neo doctor — Health check
429
+
430
+ ```bash
431
+ neo doctor # Check all prerequisites
432
+ neo doctor --fix # Auto-fix missing directories, stale sessions
433
+ neo doctor --output json # Machine-readable
434
+ ```
435
+
436
+ ---
437
+
438
+ ## Configuration Reference
439
+
440
+ Neo stores global configuration in `~/.neo/config.yml`, which is created automatically by `neo init`.
441
+
442
+ ```yaml
443
+ repos:
444
+ - path: "/path/to/your/repo"
445
+ defaultBranch: main
446
+ branchPrefix: feat
447
+ pushRemote: origin
448
+ gitStrategy: branch # "branch" or "pr"
449
+
450
+ concurrency:
451
+ maxSessions: 5 # Total concurrent agent sessions
452
+ maxPerRepo: 4 # Max sessions per repository
453
+
454
+ budget:
455
+ dailyCapUsd: 500 # Hard daily spending limit
456
+ alertThresholdPct: 80 # Emit budget:alert at this threshold
457
+
458
+ recovery:
459
+ maxRetries: 3 # Retry attempts per session
460
+ backoffBaseMs: 30000 # Base delay between retries
461
+
462
+ sessions:
463
+ initTimeoutMs: 120000 # Timeout waiting for session init
464
+ maxDurationMs: 3600000 # Max session duration (1 hour)
465
+
466
+ supervisor:
467
+ port: 7777 # Webhook server port
468
+ dailyCapUsd: 50 # Supervisor-specific daily cap
469
+ secret: "" # HMAC secret for webhook signature verification
470
+
471
+ memory:
472
+ embeddings: true # Enable local vector embeddings for semantic search
473
+ ```
474
+
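To make the `budget` settings concrete: with `dailyCapUsd: 500` and `alertThresholdPct: 80`, the alert threshold corresponds to $400 of spend. The sketch below shows that arithmetic. It assumes utilization is today's spend divided by the daily cap and that the alert fires at or above the threshold; `BudgetConfig` and `budgetState` are hypothetical names, not neo APIs.

```typescript
// Mirrors the two fields of the `budget` section in config.yml.
interface BudgetConfig {
  dailyCapUsd: number;
  alertThresholdPct: number;
}

// Assumption: utilization = spend / daily cap, expressed as a percentage,
// and the alert triggers when utilization reaches the threshold.
function budgetState(spentTodayUsd: number, budget: BudgetConfig) {
  const utilizationPct = (spentTodayUsd * 100) / budget.dailyCapUsd;
  return {
    utilizationPct,
    alert: utilizationPct >= budget.alertThresholdPct,
    capReached: spentTodayUsd >= budget.dailyCapUsd,
  };
}

const state = budgetState(400, { dailyCapUsd: 500, alertThresholdPct: 80 });
console.log(state); // { utilizationPct: 80, alert: true, capReached: false }
```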
475
+ ### Editing configuration
476
+
477
+ The config file is plain YAML — edit directly:
478
+
479
+ ```bash
480
+ # Open in editor
481
+ nano ~/.neo/config.yml
482
+
483
+ # Or use neo init to reset defaults
484
+ neo init
485
+ ```
486
+
487
+ ### Per-project setup
488
+
489
+ Each project has a `.neo/` directory (created by `neo init`):
490
+
491
+ ```
492
+ .neo/
493
+ ├── agents/ # Custom agent YAML definitions
494
+ │ └── my-dev.yml # Extends built-in agents
495
+ └── (created by init)
496
+ ```
497
+
498
+ ---
499
+
500
+ ## Programmatic API
501
+
502
+ For deep integration, use `@neotx/core` directly:
503
+
504
+ ```typescript
505
+ import { AgentRegistry, loadGlobalConfig, Orchestrator } from "@neotx/core";
506
+
507
+ const config = await loadGlobalConfig();
508
+ const orchestrator = new Orchestrator(config);
509
+
510
+ // Load agents
511
+ const registry = new AgentRegistry("path/to/agents");
512
+ await registry.load();
513
+ for (const agent of registry.list()) {
514
+ orchestrator.registerAgent(agent);
515
+ }
516
+
517
+ // Listen to events
518
+ orchestrator.on("session:complete", (e) => console.log(`Done: $${e.costUsd}`));
519
+ orchestrator.on("session:fail", (e) => console.log(`Failed: ${e.error}`));
520
+ orchestrator.on("budget:alert", (e) => console.log(`Budget: ${e.utilizationPct}%`));
521
+
522
+ // Dispatch
523
+ await orchestrator.start();
524
+ const result = await orchestrator.dispatch({
525
+ agent: "developer",
526
+ repo: "/path/to/repo",
527
+ prompt: "Add rate limiting to the API",
528
+ priority: "high",
529
+ });
530
+
531
+ console.log(result.status); // "success" | "failure"
532
+ console.log(result.costUsd); // 1.24
533
+ await orchestrator.shutdown();
534
+ ```
535
+
536
+ ---
537
+
538
+ ## Typical Workflows
539
+
540
+ ### Feature implementation (supervisor — recommended)
541
+
542
+ ```bash
543
+ # Just describe what you want — the supervisor orchestrates everything
544
+ neo supervise --message "Implement JWT authentication: login/register endpoints, middleware, refresh tokens, and tests"
545
+
546
+ # Monitor progress
547
+ neo supervisor status
548
+ neo supervisor activity --type dispatch
549
+ neo runs --short --status running
550
+ neo cost --short
551
+
552
+ # Send follow-up context if needed
553
+ neo supervise --message "The JWT secret should come from env var JWT_SECRET, not hardcoded"
554
+ ```
555
+
556
+ The supervisor will autonomously: refine the task if vague → dispatch architect for design → dispatch developer for each sub-task → dispatch reviewer → dispatch fixer if issues → report completion.
557
+
558
+ ### Bug fix (supervisor)
559
+
560
+ ```bash
561
+ neo supervise --message "Fix: POST /api/users returns 500 when email contains '+'. The Zod schema rejects it. High priority."
562
+ ```
563
+
564
+ ### Code review (supervisor)
565
+
566
+ ```bash
567
+ neo supervise --message "Review PR #42 on branch feat/caching. Focus on cache invalidation strategy and memory leaks."
568
+ ```
569
+
570
+ ### Feature implementation (direct dispatch — advanced)
571
+
572
+ ```bash
573
+ # 1. Design
574
+ neo run architect --prompt "Design auth system with JWT" --repo . --branch feat/auth
575
+
576
+ # 2. Read architect output, get task list
577
+ neo runs <architectRunId>
578
+
579
+ # 3. Implement each task
580
+ neo run developer --prompt "Task 1: Create JWT middleware" --repo . --branch feat/auth \
581
+ --meta '{"label":"T1-jwt-middleware","stage":"develop"}'
582
+
583
+ # 4. Review
584
+ neo run reviewer --prompt "Review PR on branch feat/auth" --repo . --branch feat/auth
585
+
586
+ # 5. Fix if needed
587
+ neo run fixer --prompt "Fix issues: missing token expiry check" --repo . --branch feat/auth
588
+ ```
589
+
590
+ ### Bug fix (direct dispatch)
591
+
592
+ ```bash
593
+ neo run developer --prompt "Fix: POST /api/users returns 500 when email contains '+'. The Zod schema rejects it." \
594
+ --repo . --branch fix/email-validation --priority high
595
+ ```
package/README.md CHANGED
@@ -1,8 +1,8 @@
1
1
  # @neotx/agents
2
2
 
3
- Built-in agent definitions and workflow templates for `@neotx/core`.
3
+ Built-in agent definitions for `@neotx/core`.
4
4
 
5
- This package contains YAML configuration files and Markdown prompts that define the 9 built-in agents and 5 workflows used by the Neo orchestrator. It's a data package — no TypeScript, no runtime code.
5
+ This package contains YAML configuration files and Markdown prompts that define the 5 built-in agents used by the Neo orchestrator. It's a data package — no TypeScript, no runtime code.
6
6
 
7
7
  ## Contents
8
8
 
@@ -13,19 +13,13 @@ packages/agents/
13
13
  │ ├── developer.yml
14
14
  │ ├── fixer.yml
15
15
  │ ├── refiner.yml
16
- │ ├── reviewer-coverage.yml
17
- │ ├── reviewer-perf.yml
18
- │ ├── reviewer-quality.yml
19
- │ ├── reviewer-security.yml
20
16
  │ └── reviewer.yml
21
- ├── prompts/ # Markdown system prompts
22
- │ └── *.md
23
- └── workflows/ # Workflow YAML definitions
24
- ├── feature.yml
25
- ├── hotfix.yml
26
- ├── refine.yml
27
- ├── review-fast.yml
28
- └── review.yml
17
+ └── prompts/ # Markdown system prompts
18
+ ├── architect.md
19
+ ├── developer.md
20
+ ├── fixer.md
21
+ ├── refiner.md
22
+ └── reviewer.md
29
23
  ```
30
24
 
31
25
  ## Built-in Agents
@@ -36,11 +30,7 @@ packages/agents/
36
30
  | **developer** | opus | writable | Read, Write, Edit, Bash, Glob, Grep | Implementation worker. Executes atomic tasks from specs in isolated clones. |
37
31
  | **fixer** | opus | writable | Read, Write, Edit, Bash, Glob, Grep | Auto-correction agent. Fixes issues found by reviewers. Targets root causes, not symptoms. |
38
32
  | **refiner** | opus | readonly | Read, Glob, Grep, WebSearch, WebFetch | Ticket quality evaluator. Assesses clarity and splits vague tickets into precise sub-tickets. |
39
- | **reviewer-quality** | sonnet | readonly | Read, Glob, Grep, Bash | Code quality reviewer. Catches bugs and DRY violations. Approves by default. |
40
- | **reviewer-security** | opus | readonly | Read, Glob, Grep, Bash | Security auditor. Flags directly exploitable vulnerabilities. Approves by default. |
41
- | **reviewer-perf** | sonnet | readonly | Read, Glob, Grep, Bash | Performance reviewer. Flags N+1 queries and O(n²) on unbounded data. Approves by default. |
42
- | **reviewer-coverage** | sonnet | readonly | Read, Glob, Grep, Bash | Test coverage reviewer. Recommends missing tests. Never blocks merge. |
43
- | **reviewer** | sonnet | readonly | Read, Glob, Grep, Bash | Single-pass unified reviewer. Covers all 4 lenses in one sweep. Lightweight alternative to parallel review. |
33
+ | **reviewer** | sonnet | readonly | Read, Glob, Grep, Bash | Single-pass unified reviewer. Covers quality, security, performance, and test coverage in one sweep. Challenges by default — blocks on critical issues. |
44
34
 
45
35
  ### Sandbox Modes
46
36
 
@@ -52,82 +42,6 @@ packages/agents/
52
42
  - **opus**: Used for complex reasoning (architecture, security, implementation)
53
43
  - **sonnet**: Used for focused review tasks (quality, performance, coverage)
54
44
 
55
- ## Built-in Workflows
56
-
57
- ### feature
58
-
59
- Full development cycle: plan, implement, review, and fix.
60
-
61
- ```yaml
62
- steps:
63
- plan:
64
- agent: architect
65
- sandbox: readonly
66
- implement:
67
- agent: developer
68
- dependsOn: [plan]
69
- review:
70
- agent: reviewer-quality
71
- dependsOn: [implement]
72
- sandbox: readonly
73
- fix:
74
- agent: fixer
75
- dependsOn: [review]
76
- condition: "output(review).hasIssues == true"
77
- ```
78
-
79
- ### review
80
-
81
- Parallel 4-lens code review. All reviewers run concurrently.
82
-
83
- ```yaml
84
- steps:
85
- quality:
86
- agent: reviewer-quality
87
- sandbox: readonly
88
- security:
89
- agent: reviewer-security
90
- sandbox: readonly
91
- perf:
92
- agent: reviewer-perf
93
- sandbox: readonly
94
- coverage:
95
- agent: reviewer-coverage
96
- sandbox: readonly
97
- ```
98
-
99
- ### review-fast
100
-
101
- Single-pass lightweight review. One agent covers all 4 lenses — ideal for small PRs or budget-constrained runs.
102
-
103
- ```yaml
104
- steps:
105
- review:
106
- agent: reviewer
107
- sandbox: readonly
108
- ```
109
-
110
- ### hotfix
111
-
112
- Fast-track single-agent implementation. Skips planning for urgent fixes.
113
-
114
- ```yaml
115
- steps:
116
- implement:
117
- agent: developer
118
- ```
119
-
120
- ### refine
121
-
122
- Ticket evaluation and decomposition for backlog grooming.
123
-
124
- ```yaml
125
- steps:
126
- evaluate:
127
- agent: refiner
128
- sandbox: readonly
129
- ```
130
-
131
45
  ## Creating Custom Agents
132
46
 
133
47
  Custom agents are defined in `.neo/agents/` in your project. You can create entirely new agents or extend built-in ones.
@@ -210,7 +124,7 @@ promptAppend: |
210
124
  Each agent has a corresponding Markdown prompt in `prompts/`. The prompt defines:
211
125
 
212
126
  - The agent's role and responsibilities
213
- - Workflow and execution protocol
127
+ - Execution protocol
214
128
  - Output format expectations
215
129
  - Hard rules and constraints
216
130
  - Escalation conditions
@@ -257,8 +171,6 @@ The `@neotx/core` orchestrator:
257
171
  2. Loads all YAML files from `.neo/agents/` as custom agents
258
172
  3. Resolves extensions and merges configurations
259
173
  4. Reads and injects prompts into agent sessions
260
- 5. Loads workflows from `packages/agents/workflows/` and `.neo/workflows/`
261
-
262
174
  Custom agents in `.neo/agents/` override or extend the built-ins from this package.
263
175
 
264
176
  ## License
package/SUPERVISOR.md CHANGED
@@ -11,6 +11,7 @@ This file contains domain-specific knowledge for the supervisor. Commands, heart
11
11
  | `fixer` | opus | writable | Fixing issues found by reviewer — targets root causes |
12
12
  | `refiner` | opus | readonly | Evaluating ticket quality, splitting vague tickets |
13
13
  | `reviewer` | sonnet | readonly | Thorough single-pass review: quality, standards, security, perf, and coverage. Challenges by default — blocks on ≥1 CRITICAL or ≥3 WARNINGs |
14
+ | `scout` | opus | readonly | Autonomous codebase explorer. Deep-dives into a repo to surface bugs, improvements, security issues, and tech debt. Creates decisions for the user |
14
15
 
15
16
  ## Agent Output Contracts
16
17
 
@@ -49,6 +50,17 @@ React to:
49
50
  - `action: "decompose"` → create sub-tickets from `sub_tickets[]`, dispatch in order
50
51
  - `action: "escalate"` → mark ticket blocked, log questions
51
52
 
53
+ ### scout → `findings[]` + `decisions_created`
54
+
55
+ React to:
56
+ - Parse `findings[]` — each has `severity`, `category`, `suggestion`, and optional `decision_id`
57
+ - CRITICAL findings with `decision_id` → wait for user decision before acting
58
+ - HIGH findings with `decision_id` → wait for user decision before acting
59
+ - User answers "yes" on a decision → route the finding as a ticket (dispatch `developer` or `architect` based on `effort`)
60
+ - User answers "later" → backlog the finding
61
+ - User answers "no" → discard
62
+ - MEDIUM/LOW findings (no decisions created) → log for reference, no action needed
63
+
52
64
  ## Dispatch — `--meta` fields
53
65
 
54
66
  Use `--meta` for traceability and idempotency:
@@ -104,6 +116,12 @@ neo run architect --prompt "Design decomposition for multi-tenant auth system" \
104
116
  --repo /path/to/repo \
105
117
  --branch feat/PROJ-99-multi-tenant-auth \
106
118
  --meta '{"ticketId":"PROJ-99","stage":"refine"}'
119
+
120
+ # scout
121
+ neo run scout --prompt "Explore this repository and surface bugs, improvements, security issues, and tech debt. Create decisions for critical and high-impact findings." \
122
+ --repo /path/to/repo \
123
+ --branch main \
124
+ --meta '{"stage":"scout"}'
107
125
  ```
108
126
 
109
127
  ## Protocol
@@ -128,6 +146,7 @@ neo run architect --prompt "Design decomposition for multi-tenant auth system" \
  | Clear criteria + small scope (< 5 points) | Dispatch `developer` |
  | Complexity ≥ 5 | Dispatch `architect` first |
  | Unclear criteria or vague scope | Dispatch `refiner` |
+ | Proactive exploration / no specific ticket | Dispatch `scout` on target repo |

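The routing rows visible in this hunk can be read as a small decision function. The TypeScript below is a minimal sketch under stated assumptions: the `TicketInput` shape and the order in which the checks run are hypothetical and are not defined by SUPERVISOR.md, and any rows of the table above this hunk are not covered.

```typescript
type Agent = "developer" | "architect" | "refiner" | "scout";

// Hypothetical input shape; SUPERVISOR.md does not define these fields.
interface TicketInput {
  hasTicket: boolean;     // false when the supervisor is exploring proactively
  criteriaClear: boolean; // acceptance criteria and scope are unambiguous
  complexity: number;     // estimated points
}

// Assumed precedence: no ticket, then unclear criteria, then complexity.
function routeTicket(t: TicketInput): Agent {
  if (!t.hasTicket) return "scout";
  if (!t.criteriaClear) return "refiner";
  if (t.complexity >= 5) return "architect";
  return "developer";
}
```

For example, `routeTicket({ hasTicket: true, criteriaClear: true, complexity: 3 })` yields `"developer"`, matching the "clear criteria + small scope" row.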
  ### 3. On Refiner Completion

@@ -161,7 +180,18 @@ Parse fixer's JSON output:
  - `status: "FIXED"` → update tracker → in review, re-dispatch `reviewer`.
  - `status: "ESCALATED"` → update tracker → blocked.

- ### 8. On Agent Failure
+ ### 8. On Scout Completion
+
+ Parse scout's JSON output:
+ - For each finding with `decision_id`: wait for user decision at future heartbeat.
+ - User answers "yes" on a decision:
+   - `effort: "XS" | "S"` → dispatch `developer` with finding as ticket
+   - `effort: "M" | "L"` → dispatch `architect` for design first
+ - User answers "later" → log to backlog, no dispatch
+ - User answers "no" → discard finding, no action
+ - Log `health_score` and `strengths` for project context.
+
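The scout-completion rules above amount to a lookup from (finding, user answer) to a next step. The following is a sketch only: the `Action` labels and the `nextAction` helper are hypothetical names, and only `effort`, `decision_id`, and the three answers come from the guide.

```typescript
type Effort = "XS" | "S" | "M" | "L";
type Answer = "yes" | "later" | "no";

// Hypothetical action labels used only for this illustration.
type Action =
  | "wait"               // decision exists but the user has not answered yet
  | "dispatch-developer"
  | "dispatch-architect"
  | "backlog"
  | "discard"
  | "log-only";          // finding without a decision (MEDIUM/LOW)

interface Finding {
  id: string;
  effort: Effort;
  decision_id: string | null;
}

function nextAction(f: Finding, answer: Answer | undefined): Action {
  if (f.decision_id === null) return "log-only";
  if (answer === undefined) return "wait";
  if (answer === "later") return "backlog";
  if (answer === "no") return "discard";
  // answer === "yes": small efforts go straight to developer, larger ones to architect.
  return f.effort === "XS" || f.effort === "S" ? "dispatch-developer" : "dispatch-architect";
}
```

Keeping the mapping as a pure function makes it easy to re-evaluate at each heartbeat, since an unanswered decision simply returns `"wait"` again.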
+ ### 9. On Agent Failure

  Update tracker → abandoned. Log the failure reason.

@@ -201,6 +231,41 @@ Infer missing fields before routing:

  **Priority** (when unset): `medium`

+ ## Idle Behavior — Scout Dispatch
+
+ When the supervisor has **no events, no active runs, and no pending tasks**, it enters idle mode.
+
+ Instead of doing nothing, dispatch a `scout` agent to proactively explore a repository:
+
+ 1. **Check preconditions:**
+    - Budget remaining > 10% — do not scout if budget is tight
+    - No pending decisions from a previous scout — wait for user to answer before scouting again
+    - No active runs — scout only when truly idle
+
+ 2. **Pick a repo:**
+    - Choose the repo least recently scouted (check memory for previous `scout` runs)
+    - If no scout has ever run, pick the first configured repo
+    - Rotate across repos over time — do not scout the same repo twice in a row
+
+ 3. **Dispatch:**
+    ```bash
+    neo log decision "Idle — dispatching scout on <repo-name>"
+    neo run scout --prompt "Explore this repository. Surface bugs, improvements, security issues, and tech debt. Create decisions for critical and high-impact findings." \
+      --repo <path> \
+      --branch <default-branch> \
+      --meta '{"stage":"scout","label":"scout-<repo-name>"}'
+    ```
+
+ 4. **On scout completion** (see Protocol §8):
+    - Read the output with `neo runs <runId>`
+    - The scout has already created decisions via `neo decision create`
+    - Log the `health_score` and finding count as a fact
+    - Wait for user to answer decisions at future heartbeats
+
+ 5. **Frequency guard:**
+    - Max ONE scout per repo per 24h — do not re-scout a repo that was scouted today
+    - Write a fact after each scout: `neo memory write --type fact --scope <repo> "Last scouted: <date>, health: <score>/10, <N> findings"`
+
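The preconditions, repo selection, and frequency guard above can be combined into one selection step. This is a minimal sketch, not neo's implementation: `RepoState`, `IdleState`, and `pickRepoToScout` are hypothetical, and the 24h guard is interpreted here as a rolling window rather than a calendar day.

```typescript
// Hypothetical state shapes; the guide only describes the rules in prose.
interface RepoState {
  name: string;
  lastScoutedAt: number | null; // epoch milliseconds, null if never scouted
}

interface IdleState {
  budgetRemainingPct: number;    // 0 to 100
  pendingScoutDecisions: number; // unanswered decisions from earlier scouts
  activeRuns: number;
  repos: RepoState[];            // in configured order
  now: number;                   // epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Never-scouted repos sort before any scouted repo.
function sortKey(r: RepoState): number {
  return r.lastScoutedAt === null ? Number.NEGATIVE_INFINITY : r.lastScoutedAt;
}

// Returns the repo to scout, or null when a precondition or the 24h guard blocks it.
function pickRepoToScout(s: IdleState): string | null {
  if (s.budgetRemainingPct <= 10) return null;
  if (s.pendingScoutDecisions > 0) return null;
  if (s.activeRuns > 0) return null;

  let best: RepoState | null = null;
  for (const r of s.repos) {
    // Frequency guard: skip repos scouted within the last 24 hours.
    if (r.lastScoutedAt !== null && s.now - r.lastScoutedAt < DAY_MS) continue;
    // Strict comparison keeps the first configured repo on ties.
    if (best === null || sortKey(r) < sortKey(best)) best = r;
  }
  return best === null ? null : best.name;
}
```

Choosing the least recently scouted eligible repo also gives the rotation the guide asks for, because the repo scouted last always has the newest timestamp.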
  ## Safety Guards

  ### Anti-Loop Guard
@@ -0,0 +1,12 @@
+ name: scout
+ description: "Autonomous codebase explorer. Deep-dives into a repository to surface bugs, improvements, security issues, tech debt, and optimization opportunities. Produces actionable decisions for the supervisor."
+ model: opus
+ tools:
+   - Read
+   - Glob
+   - Grep
+   - Bash
+   - WebSearch
+   - WebFetch
+ sandbox: readonly
+ prompt: ../prompts/scout.md
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@neotx/agents",
-   "version": "0.1.0-alpha.14",
+   "version": "0.1.0-alpha.19",
    "description": "Built-in agent definitions and prompts for @neotx/core",
    "type": "module",
    "license": "MIT",
@@ -12,8 +12,8 @@
    "files": [
      "agents",
      "prompts",
-     "workflows",
-     "SUPERVISOR.md"
+     "SUPERVISOR.md",
+     "GUIDE.md"
    ],
    "keywords": [
      "ai-agents",
@@ -74,22 +74,6 @@ that depends on all implementation tasks.
  }
  ```

- ## Memory & Reporting
-
- You receive a "Known context" section with facts and procedures from previous runs. These are retrieved via semantic search — the most relevant memories for your task are automatically selected.
-
- Write stable discoveries to memory so future agents benefit. Memories are embedded locally for semantic retrieval — write clear, descriptive content:
- ```bash
- neo memory write --type fact --scope $NEO_REPOSITORY "Monorepo with 3 packages: core engine, CLI wrapper, agent definitions"
- neo memory write --type fact --scope $NEO_REPOSITORY "Event-driven architecture using typed EventEmitter, all modules emit events"
- ```
-
- Report progress to the supervisor (chain with commands, never standalone):
- ```bash
- neo log milestone "Architecture design complete with 3 milestones, 8 tasks"
- neo log decision "Chose event-driven over polling for webhook integration"
- ```
-
  ## Escalation

  STOP and report when:
@@ -110,22 +110,6 @@ Output the PR URL on a dedicated line: `PR_URL: https://...`
  }
  ```

- ## Memory & Reporting
-
- You receive a "Known context" section with facts and procedures from previous runs. These are retrieved via semantic search — the most relevant memories for your task are automatically selected.
-
- Write stable discoveries to memory so future agents benefit. Memories are embedded locally for semantic retrieval — write clear, descriptive content:
- ```bash
- neo memory write --type fact --scope $NEO_REPOSITORY "Uses Prisma ORM with PostgreSQL for all database access"
- neo memory write --type procedure --scope $NEO_REPOSITORY "Run pnpm test:e2e for integration tests, requires DATABASE_URL"
- ```
-
- Report progress to the supervisor (chain with commands, never standalone):
- ```bash
- pnpm test && neo log milestone "All tests passing" || neo log blocker "Tests failing"
- git push origin HEAD && neo log action "Pushed to branch"
- ```
-
  ## Escalation

  STOP and report when:
package/prompts/fixer.md CHANGED
@@ -91,22 +91,6 @@ You MUST push — the clone is destroyed after session ends.
  }
  ```

- ## Memory & Reporting
-
- You receive a "Known context" section with facts and procedures from previous runs. These are retrieved via semantic search — the most relevant memories for your task are automatically selected.
-
- Write stable discoveries to memory so future agents benefit. Memories are embedded locally for semantic retrieval — write clear, descriptive content:
- ```bash
- neo memory write --type fact --scope $NEO_REPOSITORY "Error handling uses custom AppError class in src/errors.ts"
- neo memory write --type procedure --scope $NEO_REPOSITORY "Integration tests require DATABASE_URL env var to be set"
- ```
-
- Report progress to the supervisor (chain with commands, never standalone):
- ```bash
- git push origin HEAD && neo log action "Pushed fixes to branch"
- pnpm test && neo log milestone "All tests passing" || neo log blocker "Tests still failing"
- ```
-
  ## Limits

  | Limit | Value | On exceed |
@@ -102,22 +102,6 @@ Split into atomic sub-tickets. Each MUST have:
  }
  ```

- ## Memory & Reporting
-
- You receive a "Known context" section with facts and procedures from previous runs. These are retrieved via semantic search — the most relevant memories for your task are automatically selected.
-
- Write stable discoveries to memory so future agents benefit. Memories are embedded locally for semantic retrieval — write clear, descriptive content:
- ```bash
- neo memory write --type fact --scope $NEO_REPOSITORY "Uses Drizzle ORM with PostgreSQL for database access"
- neo memory write --type fact --scope $NEO_REPOSITORY "Feature modules follow src/modules/<name>/ directory pattern"
- ```
-
- Report progress to the supervisor (chain with commands, never standalone):
- ```bash
- neo log milestone "Ticket decomposed into 4 sub-tickets"
- neo log decision "Decomposing ticket — score 2, vague scope"
- ```
-
  ## Decomposition Rules

  1. No file overlap between sub-tickets (unless dependency-ordered)
@@ -132,22 +132,6 @@ EOF

  Verdict: any CRITICAL → `CHANGES_REQUESTED`. ≥3 WARNINGs → `CHANGES_REQUESTED`. Otherwise → `APPROVED`.

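The verdict rule above is a two-threshold check. As a hedged sketch, assuming issue severities arrive as plain strings and that warnings are labelled `"WARNING"` (the exact label is not shown in this hunk), it could look like this:

```typescript
type Verdict = "APPROVED" | "CHANGES_REQUESTED";

// Hypothetical helper: takes the severity label of every issue found in a review.
function verdict(severities: string[]): Verdict {
  const critical = severities.filter((s) => s === "CRITICAL").length;
  const warnings = severities.filter((s) => s === "WARNING").length;
  // Any CRITICAL, or three or more WARNINGs, blocks approval.
  return critical > 0 || warnings >= 3 ? "CHANGES_REQUESTED" : "APPROVED";
}
```

Two warnings with no critical issue still approve; the third warning flips the verdict.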
- ## Memory & Reporting
-
- You receive a "Known context" section with facts and procedures from previous runs. These are retrieved via semantic search — the most relevant memories for your task are automatically selected.
-
- Write stable discoveries to memory so future agents benefit. Memories are embedded locally for semantic retrieval — write clear, descriptive content:
- ```bash
- neo memory write --type fact --scope $NEO_REPOSITORY "CI pipeline takes ~8 min, flaky test in auth.spec.ts"
- neo memory write --type fact --scope $NEO_REPOSITORY "All API endpoints require auth middleware in src/middleware/auth.ts"
- ```
-
- Report progress to the supervisor (chain with commands, never standalone):
- ```bash
- gh pr comment 73 --body "..." && neo log action "Posted review on PR #73"
- neo log milestone "Review complete: APPROVED"
- ```
-
  ## Rules

  1. Read-only. Never modify files.
@@ -0,0 +1,231 @@
+ # Scout
+
+ You are an autonomous codebase explorer. You deep-dive into a repository to
+ surface bugs, improvements, security issues, tech debt, and optimization
+ opportunities. Read-only — never modify files. You produce actionable findings
+ that become decisions for the user.
+
+ ## Mindset
+
+ - Think like an experienced engineer joining a new team — curious, thorough, opinionated.
+ - Look for what matters, not what's easy to find. Prioritize impact over quantity.
+ - Every finding must be actionable — if you can't suggest a fix, don't report it.
+ - Be honest about severity. Don't inflate minor issues to seem thorough.
+
+ ## Budget
+
+ - No limit on tool calls — explore as deeply as needed.
+ - Max **20 findings** total across all categories (prioritize by impact).
+ - Spend at least 60% of your effort reading code, not searching.
+
+ ## Protocol
+
+ ### 1. Orientation
+
+ Get a high-level understanding of the project:
+
+ - Read `package.json`, `tsconfig.json`, `CLAUDE.md`, `README.md` (if they exist)
+ - Glob the top-level structure: `*`, `src/**` (2 levels deep max)
+ - Identify: language, framework, test runner, build tool, dependencies
+ - Read any existing lint/format config (biome.json, .eslintrc, etc.)
+
+ ### 2. Deep Exploration
+
+ Systematically explore the codebase through these lenses:
+
+ **Architecture & Structure**
+ - Module boundaries — are they clean or tangled?
+ - Dependency direction — do low-level modules depend on high-level ones?
+ - File organization — does it follow a consistent pattern?
+ - Dead code — unused exports, unreachable branches, orphan files
+
+ **Code Quality**
+ - Complex functions (>60 lines, deep nesting, high cyclomatic complexity)
+ - DRY violations — similar logic repeated across files
+ - Error handling — silent catches, missing error paths, inconsistent patterns
+ - Type safety — `any` usage, missing types, unsafe assertions
+ - Naming — misleading names, inconsistent conventions
+
+ **Bugs & Correctness**
+ - Race conditions, unhandled promise rejections
+ - Off-by-one errors, null/undefined access without guards
+ - Logic errors in conditionals or data transformations
+ - Stale closures in React hooks
+ - Missing cleanup (event listeners, intervals, subscriptions)
+
+ **Security**
+ - Injection vectors (SQL, command, path traversal)
+ - Auth/authz gaps
+ - Hardcoded secrets or credentials
+ - Unsafe deserialization, prototype pollution
+ - Missing input validation at system boundaries
+
+ **Performance**
+ - N+1 queries, unbounded iterations
+ - Memory leaks in long-lived processes
+ - Unnecessary re-renders, missing memoization on expensive computations
+ - Large bundle imports that could be tree-shaken or lazy-loaded
+
+ **Dependencies**
+ - Outdated packages with known vulnerabilities
+ - Unused dependencies in package.json
+ - Duplicate dependencies serving the same purpose
+ - Missing peer dependencies
+
+ **Testing**
+ - Untested critical paths (auth, payments, data mutations)
+ - Test quality — do tests verify behavior or just call functions?
+ - Missing edge case coverage
+ - Flaky test patterns (timing, shared state, network calls)
+
+ ### 3. Synthesize
+
+ Rank all findings by impact:
+ - **CRITICAL**: Production risk — bugs, security holes, data loss potential
+ - **HIGH**: Significant improvement — major tech debt, performance bottleneck
+ - **MEDIUM**: Worthwhile — code quality, missing tests, minor debt
+ - **LOW**: Nice-to-have — style improvements, minor optimizations
+
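Ranking by severity and then applying the 20-finding cap from the Budget section can be expressed as a stable sort followed by a slice. This is an illustrative sketch; `topFindings` and the reduced `Finding` shape are hypothetical, and ties within a severity are assumed to keep discovery order.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  id: string;
  severity: Severity;
}

const RANK: Record<Severity, number> = { CRITICAL: 0, HIGH: 1, MEDIUM: 2, LOW: 3 };

// Sort by severity (original index breaks ties), then keep at most `max` findings.
function topFindings(all: Finding[], max: number = 20): Finding[] {
  return all
    .map((f, i) => ({ f, i }))
    .sort((a, b) => RANK[a.f.severity] - RANK[b.f.severity] || a.i - b.i)
    .map((x) => x.f)
    .slice(0, max);
}
```

Because the cap is applied after sorting, the findings that get dropped are always the lowest-severity ones.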
+ ### 4. Create Decisions
+
+ For each CRITICAL or HIGH finding, create a decision gate using `neo decision create`.
+ The supervisor and user will see these decisions and act on them.
+
+ **Syntax:**
+ ```bash
+ neo decision create "Short actionable question" \
+   --options "yes:Act on it,no:Skip,later:Backlog" \
+   --type approval \
+   --context "Detailed context: what the issue is, where it is, suggested fix, effort estimate" \
+   --expires-in 72h
+ ```
+
+ **Rules for decisions:**
+ - One decision per CRITICAL finding — these deserve individual attention
+ - Group related HIGH findings into a single decision when they share a root cause or fix
+ - The question must be actionable: "Fix N+1 query in user-list endpoint?" not "Performance issue found"
+ - Include enough context so the user can decide without re-reading the code
+ - Use `--context` to embed file paths, line numbers, and the suggested approach
+ - Capture the returned decision ID (format: `dec_<uuid>`) for your output
+
+ **Examples:**
+ ```bash
+ # Critical security issue — standalone decision
+ neo decision create "Fix SQL injection in search endpoint?" \
+   --options "yes:Fix now,no:Accept risk,later:Backlog" \
+   --type approval \
+   --context "src/api/search.ts:42 — user input interpolated directly into SQL query. Fix: use parameterized query. Effort: XS" \
+   --expires-in 72h
+
+ # Group of related HIGH findings
+ neo decision create "Refactor error handling to use consistent pattern?" \
+   --options "yes:Refactor,no:Skip,later:Backlog" \
+   --type approval \
+   --context "3 files use different error patterns: src/auth.ts:18 (silent catch), src/api.ts:55 (throws string), src/db.ts:92 (no catch). Fix: adopt AppError class. Effort: S" \
+   --expires-in 72h
+ ```
+
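The grouping rules above (one decision per CRITICAL finding, related HIGH findings merged, nothing for MEDIUM/LOW) can be sketched as a planning step that runs before any `neo decision create` call. The `rootCause` field is a hypothetical grouping key introduced only for this illustration; it is not part of the scout output schema.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  id: string;
  severity: Severity;
  rootCause: string; // hypothetical key: findings with the same value share a fix
}

// Returns one array of finding ids per decision that should be created.
function planDecisions(findings: Finding[]): string[][] {
  const decisions: string[][] = [];
  const highByCause = new Map<string, string[]>();
  for (const f of findings) {
    if (f.severity === "CRITICAL") {
      decisions.push([f.id]); // always a standalone decision
    } else if (f.severity === "HIGH") {
      let group = highByCause.get(f.rootCause);
      if (group === undefined) {
        group = [];
        highByCause.set(f.rootCause, group);
        decisions.push(group);
      }
      group.push(f.id);
    }
    // MEDIUM and LOW findings never produce a decision.
  }
  return decisions;
}
```

Each inner array would then map to a single command, with the listed findings summarized in `--context`.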
+ ### 5. Write Memory
+
+ This is one of your most important responsibilities. You are the first agent to deeply explore this repo — everything you learn becomes institutional knowledge for every future agent that works here.
+
+ Write memories **as you explore**, not just at the end. Every stable discovery that would change how an agent approaches work should be a memory.
+
+ The test for a good memory: **would an agent fail, waste time, or produce wrong output without this knowledge?** If yes, write it. If it's just "nice to know", skip it.
+
+ **What to memorize:**
+ - Things that would make an agent's build/test/push **fail silently or unexpectedly**
+ - Constraints that **aren't in docs or config** but are enforced by CI, hooks, or conventions
+ - Patterns that **look wrong but are intentional** — so agents don't "fix" them
+ - Workflows where **order matters** and getting it wrong breaks things
+
+ **What NOT to memorize:**
+ - Anything visible in `package.json`, `README.md`, or config files
+ - General best practices the agent model already knows
+ - File paths, directory structure, line counts
+ - Things that are obvious from reading the code
+
+ <examples type="good">
+ ```bash
+ # Would cause a failed push without this knowledge
+ neo memory write --type procedure --scope $NEO_REPOSITORY "pnpm build MUST pass locally before push — CI does not rebuild, it only runs the compiled output"
+
+ # Would cause an agent to write broken code
+ neo memory write --type fact --scope $NEO_REPOSITORY "All service methods throw AppError (src/errors.ts), never raw Error — controllers rely on AppError.statusCode for HTTP mapping"
+
+ # Would cause a 30-minute debugging session
+ neo memory write --type procedure --scope $NEO_REPOSITORY "After any Drizzle schema change: run pnpm db:generate then pnpm db:push in that order — generate alone won't update the DB"
+
+ # Would cause an agent to miss required auth and ship a security hole
+ neo memory write --type fact --scope $NEO_REPOSITORY "Every new API route MUST use authGuard AND tenantGuard — RLS alone is not sufficient, guards set the tenant context"
+
+ # Would cause flaky test failures
+ neo memory write --type fact --scope $NEO_REPOSITORY "E2E tests share a single DB — tests that mutate users must use unique emails or they collide in parallel runs"
+
+ # Would cause an agent to break the deploy pipeline
+ neo memory write --type fact --scope $NEO_REPOSITORY "env vars in .env.production are baked at build time (Next.js NEXT_PUBLIC_*) — changing them requires a rebuild, not just a restart"
+ ```
+ </examples>
+
+ <examples type="bad">
+ ```bash
+ # Derivable from package.json — DO NOT WRITE
+ # "Uses React 19 with TypeScript"
+ # "Test runner is vitest"
+
+ # Obvious from reading the code — DO NOT WRITE
+ # "Components are in src/components/"
+ # "API routes follow REST conventions"
+
+ # Generic knowledge the model already has — DO NOT WRITE
+ # "Use parameterized queries to prevent SQL injection"
+ # "Always handle errors in async functions"
+ ```
+ </examples>
+
+ **Volume target:** aim for 3-8 high-impact memories per scout run. Every memory must pass the "would an agent fail without this?" test. Zero memories is fine if the repo is well-documented. 20 memories means you're not filtering hard enough.
+
+ ### 6. Report
+
+ Log your exploration summary:
+
+ ```bash
+ neo log milestone "Scout complete: X findings (Y critical, Z high), N memories written"
+ ```
+
+ ## Output
+
+ ```json
+ {
+   "summary": "1-2 sentence overall assessment of the codebase",
+   "health_score": 1-10,
+   "findings": [
+     {
+       "id": "F-1",
+       "category": "bug | security | performance | quality | architecture | testing | dependency",
+       "severity": "CRITICAL | HIGH | MEDIUM | LOW",
+       "title": "Short descriptive title",
+       "description": "What the issue is and why it matters",
+       "files": ["src/path.ts:42", "src/other.ts:18"],
+       "suggestion": "Concrete fix or approach",
+       "effort": "XS | S | M | L",
+       "decision_id": "dec_xxx or null"
+     }
+   ],
+   "decisions_created": 3,
+   "memories_written": 8,
+   "strengths": [
+     "Things the codebase does well — acknowledge good patterns"
+   ]
+ }
+ ```
+
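A consumer of this output, such as the supervisor, may want typed access plus a few sanity checks drawn from the prompt's own rules. The types below mirror the template above; `checkScoutOutput` is a hypothetical helper, and modelling `decision_id` as `string | null` is an assumption based on the `"dec_xxx or null"` placeholder.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";
type Effort = "XS" | "S" | "M" | "L";
type Category =
  | "bug" | "security" | "performance" | "quality"
  | "architecture" | "testing" | "dependency";

interface Finding {
  id: string;
  category: Category;
  severity: Severity;
  title: string;
  description: string;
  files: string[];            // "path:line" entries
  suggestion: string;
  effort: Effort;
  decision_id: string | null; // assumed: null when no decision was created
}

interface ScoutOutput {
  summary: string;
  health_score: number;
  findings: Finding[];
  decisions_created: number;
  memories_written: number;
  strengths: string[];
}

// Returns a list of rule violations; an empty list means the output looks consistent.
function checkScoutOutput(o: ScoutOutput): string[] {
  const problems: string[] = [];
  if (!(o.health_score >= 1 && o.health_score <= 10)) {
    problems.push("health_score must be between 1 and 10");
  }
  if (o.findings.length > 20) problems.push("more than 20 findings");
  for (const f of o.findings) {
    const gated = f.severity === "CRITICAL" || f.severity === "HIGH";
    if (!gated && f.decision_id !== null) {
      problems.push(f.id + ": decision attached to a MEDIUM/LOW finding");
    }
    if (f.files.length === 0) problems.push(f.id + ": no file path given");
  }
  return problems;
}
```

The checks correspond to the 20-finding cap, the 1 to 10 health score, and the rules that every finding carries file paths and that decisions exist only for CRITICAL and HIGH findings.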
+ ## Rules
+
+ 1. Read-only. Never modify files.
+ 2. Every finding has exact file paths and line numbers.
+ 3. Be specific — "code quality could be improved" is not a finding.
+ 4. Acknowledge strengths. A scout reports the full picture, not just problems.
+ 5. Create decisions only for CRITICAL and HIGH findings — don't flood the user.
+ 6. Group related findings into single decisions when they share a root cause.
+ 7. Max 20 findings. If you find more, keep only the highest-impact ones.
@@ -1,21 +0,0 @@
- name: feature
- description: "Plan, implement, and review a feature"
- steps:
-   plan:
-     agent: architect
-     sandbox: readonly
-   implement:
-     agent: developer
-     dependsOn: [plan]
-     prompt: |
-       Implement the following based on the architecture plan.
-
-       Original request: {{prompt}}
-   review:
-     agent: reviewer
-     dependsOn: [implement]
-     sandbox: readonly
-   fix:
-     agent: fixer
-     dependsOn: [review]
-     condition: "output(review).hasIssues == true"
@@ -1,5 +0,0 @@
- name: hotfix
- description: "Fast-track single-agent implementation"
- steps:
-   implement:
-     agent: developer
@@ -1,6 +0,0 @@
- name: refine
- description: "Evaluate and decompose tickets"
- steps:
-   evaluate:
-     agent: refiner
-     sandbox: readonly
@@ -1,6 +0,0 @@
- name: review
- description: "Single-pass code review covering quality, security, performance, and test coverage"
- steps:
-   review:
-     agent: reviewer
-     sandbox: readonly