@neotx/agents 0.1.0-alpha.22 → 0.1.0-alpha.25
- package/GUIDE.md +5 -7
- package/README.md +4 -10
- package/SUPERVISOR.md +304 -103
- package/agents/architect.yml +15 -2
- package/agents/developer.yml +19 -1
- package/agents/reviewer.yml +1 -1
- package/package.json +1 -1
- package/prompts/architect.md +185 -67
- package/prompts/developer.md +297 -40
- package/prompts/focused-supervisor.md +42 -0
- package/prompts/reviewer.md +35 -4
- package/prompts/subagents/code-quality-reviewer.md +49 -0
- package/prompts/subagents/plan-reviewer.md +34 -0
- package/prompts/subagents/spec-reviewer.md +43 -0
- package/agents/fixer.yml +0 -12
- package/agents/refiner.yml +0 -11
- package/prompts/fixer.md +0 -135
- package/prompts/refiner.md +0 -119
package/GUIDE.md
CHANGED
|
@@ -17,7 +17,7 @@ The supervisor is NOT a chatbot. It's an event-driven heartbeat loop that:
|
|
|
17
17
|
- Dispatches the right agents in the right order
|
|
18
18
|
- Monitors progress and reacts to completions/failures
|
|
19
19
|
- Persists memory across sessions — it learns your codebase over time
|
|
20
|
-
- Handles the full lifecycle:
|
|
20
|
+
- Handles the full lifecycle: architect (triage + design) → develop (self-review) → review (if needed) → done
|
|
21
21
|
|
|
22
22
|
```bash
|
|
23
23
|
# Start the supervisor (background daemon)
|
|
@@ -31,7 +31,7 @@ neo supervise --message "Implement user authentication with JWT. Create login/re
|
|
|
31
31
|
# 2. Dispatches architect if design is needed
|
|
32
32
|
# 3. Dispatches developer for each task
|
|
33
33
|
# 4. Dispatches reviewer to review PRs
|
|
34
|
-
# 5.
|
|
34
|
+
# 5. Re-dispatches developer if issues are found
|
|
35
35
|
# 6. Reports back via activity log
|
|
36
36
|
|
|
37
37
|
# Check supervisor status
|
|
@@ -103,8 +103,6 @@ neo mcp add notion # requires NOTION_TOKEN env var
|
|
|
103
103
|
| `developer` | opus | writable | Implementing code changes, bug fixes, new features |
|
|
104
104
|
| `architect` | opus | readonly | Designing systems, planning features, decomposing work |
|
|
105
105
|
| `reviewer` | sonnet | readonly | Code review — blocks on ≥1 CRITICAL or ≥3 WARNINGs |
|
|
106
|
-
| `fixer` | opus | writable | Fixing issues found by reviewer — targets root causes |
|
|
107
|
-
| `refiner` | opus | readonly | Evaluating ticket quality, splitting vague tickets |
|
|
108
106
|
|
|
109
107
|
**Custom agents:** Drop a YAML file in `.neo/agents/` to extend built-in agents:
|
|
110
108
|
|
|
@@ -349,7 +347,7 @@ neo decision pending
|
|
|
349
347
|
# User answers
|
|
350
348
|
neo decision answer dec_x7k9m2 hotfix
|
|
351
349
|
|
|
352
|
-
# Supervisor receives the answer and dispatches
|
|
350
|
+
# Supervisor receives the answer and re-dispatches developer with hotfix priority
|
|
353
351
|
```
|
|
354
352
|
|
|
355
353
|
### neo webhooks — Event notifications
|
|
@@ -553,7 +551,7 @@ neo cost --short
|
|
|
553
551
|
neo supervise --message "The JWT secret should come from env var JWT_SECRET, not hardcoded"
|
|
554
552
|
```
|
|
555
553
|
|
|
556
|
-
The supervisor will autonomously:
|
|
554
|
+
The supervisor will autonomously: dispatch architect (handles triage + design) → dispatch developer for each sub-task (with internal review) → dispatch standalone reviewer if needed → done.
|
|
557
555
|
|
|
558
556
|
### Bug fix (supervisor)
|
|
559
557
|
|
|
@@ -584,7 +582,7 @@ neo run developer --prompt "Task 1: Create JWT middleware" --repo . --branch fea
|
|
|
584
582
|
neo run reviewer --prompt "Review PR on branch feat/auth" --repo . --branch feat/auth
|
|
585
583
|
|
|
586
584
|
# 5. Fix if needed
|
|
587
|
-
neo run
|
|
585
|
+
neo run developer --prompt "Fix issues found in review: missing token expiry check" --repo . --branch feat/auth
|
|
588
586
|
```
|
|
589
587
|
|
|
590
588
|
### Bug fix (direct dispatch)
|
package/README.md
CHANGED
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
Built-in agent definitions for `@neotx/core`.
|
|
4
4
|
|
|
5
|
-
This package contains YAML configuration files and Markdown prompts that define the
|
|
5
|
+
This package contains YAML configuration files and Markdown prompts that define the 4 built-in agents used by the Neo orchestrator. It's a data package — no TypeScript, no runtime code.
|
|
6
6
|
|
|
7
7
|
## Contents
|
|
8
8
|
|
|
@@ -11,14 +11,10 @@ packages/agents/
|
|
|
11
11
|
├── agents/ # Agent YAML definitions
|
|
12
12
|
│ ├── architect.yml
|
|
13
13
|
│ ├── developer.yml
|
|
14
|
-
│ ├── fixer.yml
|
|
15
|
-
│ ├── refiner.yml
|
|
16
14
|
│ └── reviewer.yml
|
|
17
15
|
└── prompts/ # Markdown system prompts
|
|
18
16
|
├── architect.md
|
|
19
17
|
├── developer.md
|
|
20
|
-
├── fixer.md
|
|
21
|
-
├── refiner.md
|
|
22
18
|
└── reviewer.md
|
|
23
19
|
```
|
|
24
20
|
|
|
@@ -26,11 +22,9 @@ packages/agents/
|
|
|
26
22
|
|
|
27
23
|
| Agent | Model | Sandbox | Tools | Role |
|
|
28
24
|
|-------|-------|---------|-------|------|
|
|
29
|
-
| **architect** | opus | readonly | Read, Glob, Grep, WebSearch, WebFetch | Strategic planner.
|
|
30
|
-
| **developer** | opus | writable | Read, Write, Edit, Bash, Glob, Grep | Implementation worker. Executes atomic tasks from specs in isolated clones. |
|
|
31
|
-
| **
|
|
32
|
-
| **refiner** | opus | readonly | Read, Glob, Grep, WebSearch, WebFetch | Ticket quality evaluator. Assesses clarity and splits vague tickets into precise sub-tickets. |
|
|
33
|
-
| **reviewer** | sonnet | readonly | Read, Glob, Grep, Bash | Single-pass unified reviewer. Covers quality, security, performance, and test coverage in one sweep. Challenges by default — blocks on critical issues. |
|
|
25
|
+
| **architect** | opus | readonly | Read, Glob, Grep, WebSearch, WebFetch, Agent | Strategic planner. Triages requests, designs architecture, decomposes work into atomic tasks, and spawns subagents when needed. Never writes code. |
|
|
26
|
+
| **developer** | opus | writable | Read, Write, Edit, Bash, Glob, Grep, Agent | Implementation worker. Executes atomic tasks from specs in isolated clones. Performs self-review and spawns subagents for complex steps. |
|
|
27
|
+
| **reviewer** | sonnet | readonly | Read, Glob, Grep, Bash | Two-pass unified reviewer. Covers quality, security, performance, and test coverage. Challenges by default — blocks on critical issues. |
|
|
34
28
|
|
|
35
29
|
### Sandbox Modes
|
|
36
30
|
|
package/SUPERVISOR.md
CHANGED
|
@@ -6,60 +6,111 @@ This file contains domain-specific knowledge for the supervisor. Commands, heart
|
|
|
6
6
|
|
|
7
7
|
| Agent | Model | Mode | Use when |
|
|
8
8
|
|-------|-------|------|----------|
|
|
9
|
-
| `architect` | opus |
|
|
10
|
-
| `developer` | opus | writable |
|
|
11
|
-
| `
|
|
12
|
-
| `
|
|
13
|
-
| `reviewer` | sonnet | readonly | Thorough single-pass review: quality, standards, security, perf, and coverage. Challenges by default — blocks on ≥1 CRITICAL or ≥3 WARNINGs |
|
|
14
|
-
| `scout` | opus | readonly | Autonomous codebase explorer. Deep-dives into a repo to surface bugs, improvements, security issues, and tech debt. Creates decisions for the user |
|
|
9
|
+
| `architect` | opus | writable | Triage + design + write implementation plan to `.neo/specs/`. Spawns plan-reviewer subagent. Writes code in plans, NEVER modifies source files. |
|
|
10
|
+
| `developer` | opus | writable | Executes implementation plans step by step (plan mode) OR direct tasks (direct mode). Spawns spec-reviewer and code-quality-reviewer subagents. |
|
|
11
|
+
| `reviewer` | sonnet | readonly | Thorough two-pass review: spec compliance first (gate), then code quality. Challenges by default — blocks on ≥1 CRITICAL or ≥5 WARNINGs. |
|
|
12
|
+
| `scout` | opus | readonly | Autonomous codebase explorer. Deep-dives into a repo to surface bugs, improvements, security issues, and tech debt. Creates decisions for CRITICAL/HIGH findings. Writes institutional memory. |
|
|
15
13
|
|
|
16
14
|
## Agent Output Contracts
|
|
17
15
|
|
|
18
|
-
|
|
16
|
+
Read agent output to decide next actions.
|
|
19
17
|
|
|
20
|
-
### architect → `
|
|
18
|
+
### architect → approval decision (gate) → `plan_path` + `summary`
|
|
21
19
|
|
|
22
|
-
|
|
20
|
+
The architect workflow is two-phase:
|
|
23
21
|
|
|
24
|
-
|
|
22
|
+
**Phase A — Design Gate:** Before writing any plan, the architect creates a blocking approval decision:
|
|
23
|
+
```bash
|
|
24
|
+
neo decision create "Design approval for {ticket-id}" --type approval --context "..." --wait --timeout 30m
|
|
25
|
+
```
|
|
26
|
+
The architect is **paused waiting for your response**. Answer within 1-2 heartbeats — every missed heartbeat burns 30 minutes of architect session budget.
|
|
25
27
|
|
|
26
28
|
React to:
|
|
27
|
-
- `
|
|
28
|
-
- `
|
|
29
|
-
- `
|
|
29
|
+
- `approved` → architect writes plan, then report arrives
|
|
30
|
+
- `approved_with_changes` → architect revises design, re-submits (max 2 cycles)
|
|
31
|
+
- `rejected` → architect restarts design
|
|
30
32
|
|
|
31
|
-
|
|
33
|
+
**Phase B — Plan Ready:** After design is approved, architect reports:
|
|
34
|
+
- `plan_path` + `summary` → dispatch `developer` with `--prompt "Execute the implementation plan at {plan_path}. Create a PR when all tasks pass."` on the same branch.
|
|
32
35
|
|
|
33
|
-
The
|
|
34
|
-
Expect `CHANGES_REQUESTED` more often than `APPROVED` — this is intentional.
|
|
36
|
+
The developer handles the full plan autonomously — no task-by-task dispatch from supervisor.
|
|
35
37
|
|
|
36
|
-
|
|
37
|
-
- `verdict: "APPROVED"` → mark ticket done
|
|
38
|
-
- `verdict: "CHANGES_REQUESTED"` → check anti-loop guard, dispatch `fixer` with issues (include severity — fixer should prioritize CRITICALs first)
|
|
38
|
+
### developer → `status` + `branch_completion`
|
|
39
39
|
|
|
40
|
-
|
|
40
|
+
React to status:
|
|
41
|
+
- `status: "DONE"` + `PR_URL` → extract PR number, set ticket to CI pending, check CI at next heartbeat
|
|
42
|
+
- `status: "DONE"` without PR → mark ticket done
|
|
43
|
+
- `status: "DONE_WITH_CONCERNS"` → read concerns, evaluate impact. If architectural → create a decision or dispatch architect. If minor → mark done with note.
|
|
44
|
+
- `status: "BLOCKED"` → route via decision system. If you can answer directly (scope, priority, strategic question), do it. Otherwise wait for human.
|
|
45
|
+
- `status: "NEEDS_CONTEXT"` → provide the requested context and re-dispatch developer on same branch.
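The status reactions above can be sketched as a small lookup. This is only an illustration: the function names and returned action labels are hypothetical, and the PR-number regex assumes a GitHub-style URL ending in `/pull/<n>`, which the text does not state explicitly.

```python
import re
from typing import Optional


def extract_pr_number(pr_url: str) -> Optional[int]:
    """Return the PR number from a URL containing /pull/<n> (assumed shape)."""
    match = re.search(r"/pull/(\d+)", pr_url)
    return int(match.group(1)) if match else None


def react_to_developer(status: str,
                       pr_url: Optional[str] = None,
                       concerns_architectural: bool = False) -> str:
    """Map a developer report to a supervisor action label (labels are made up)."""
    if status == "DONE":
        # With a PR the ticket waits for CI; without one it is finished.
        return "ci_pending" if pr_url else "done"
    if status == "DONE_WITH_CONCERNS":
        if concerns_architectural:
            return "create_decision_or_dispatch_architect"
        return "done_with_note"
    if status == "BLOCKED":
        return "route_via_decision"
    if status == "NEEDS_CONTEXT":
        return "redispatch_developer_same_branch"
    raise ValueError(f"unknown developer status: {status!r}")
```

Raising on an unknown status, rather than defaulting to "done", keeps a malformed report from silently closing a ticket.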
|
|
41
46
|
|
|
42
|
-
|
|
43
|
-
- `
|
|
44
|
-
- `
|
|
47
|
+
When `branch_completion` is present, supervisor decides:
|
|
48
|
+
- `recommendation: "pr"` + tests passing → create/push PR (most common)
|
|
49
|
+
- `recommendation: "keep"` → note in focus, revisit later
|
|
50
|
+
- `recommendation: "discard"` → requires supervisor confirmation before executing
|
|
51
|
+
- `recommendation: "push"` → push without PR (rare, for config/doc changes)
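A minimal sketch of the `branch_completion` choices listed above. The action labels and the `supervisor_confirmed` flag are hypothetical. The text does not say what happens when `"pr"` is recommended but tests are failing, so the `"hold"` result for that case is an assumption.

```python
def react_to_branch_completion(recommendation: str,
                               tests_passing: bool,
                               supervisor_confirmed: bool = False) -> str:
    """Turn a branch_completion recommendation into an action label."""
    if recommendation == "pr":
        # Assumption: failing tests mean we hold instead of opening a PR.
        return "create_pr" if tests_passing else "hold"
    if recommendation == "keep":
        return "note_in_focus"
    if recommendation == "discard":
        # Discarding is destructive, so it needs explicit confirmation first.
        return "discard" if supervisor_confirmed else "await_confirmation"
    if recommendation == "push":
        return "push_without_pr"
    raise ValueError(f"unknown recommendation: {recommendation!r}")
```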
|
|
52
|
+
|
|
53
|
+
### Crash Resumption Protocol
|
|
54
|
+
|
|
55
|
+
When a developer run **fails** on a plan-mode task (run `status: "failed"`, error contains `error_max_turns` or any runtime crash):
|
|
56
|
+
|
|
57
|
+
**Step 1 — Reconstruct completed tasks:**
|
|
58
|
+
```bash
|
|
59
|
+
# Read the neo logs for the failed run to find checkpoints
|
|
60
|
+
neo logs <failedRunId> 2>&1 | grep "milestone"
|
|
61
|
+
# Cross-check with git log on the branch
|
|
62
|
+
git -C <repoPath> log --oneline <branch> 2>&1 | head -20
|
|
63
|
+
```
|
|
45
64
|
|
|
46
|
-
|
|
65
|
+
**Step 2 — Build resumption context:**
|
|
66
|
+
From the logs, identify:
|
|
67
|
+
- Which tasks have a "done — commit <sha>" milestone → completed
|
|
68
|
+
- Which task has no milestone or an error → failed task
|
|
69
|
+
|
|
70
|
+
**Step 3 — Relaunch with RESUMING FROM CRASH header:**
|
|
71
|
+
```bash
|
|
72
|
+
neo run developer \
|
|
73
|
+
--prompt "RESUMING FROM CRASH
|
|
74
|
+
Previous run: <failedRunId>
|
|
75
|
+
Completed tasks: T1 (commit abc1234), T2 (commit def5678)
|
|
76
|
+
Failed at: T3 — error: <last error from logs>
|
|
77
|
+
Resume: start from T3, skip completed tasks above.
|
|
78
|
+
|
|
79
|
+
Original task:
|
|
80
|
+
Execute the implementation plan at .neo/specs/<plan>.md. Create a PR when all tasks pass." \
|
|
81
|
+
--repo <repoPath> \
|
|
82
|
+
--branch <sameBranch> \
|
|
83
|
+
--meta '{"ticketId":"<id>","stage":"develop","resumedFrom":"<failedRunId>"}'
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
**Rules:**
|
|
87
|
+
- Always use the **same branch** — completed commits are already there
|
|
88
|
+
- Max 3 resumption attempts per plan — on 3rd failure, create a decision for human
|
|
89
|
+
- Never resume if the branch has diverged (commits missing from git log) — create a decision instead
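The three steps and the rules above could be automated roughly as follows. This is a sketch under stated assumptions: the exact shape of a milestone log line is not given in the text, so the regex below (task id, then `done`, a dash, `commit <sha>`) is a guess, and all function names are hypothetical.

```python
import re

# Assumed milestone line shape: "... milestone T1 done <dash> commit abc1234".
# \u2014 is the dash used in the "done ... commit <sha>" wording above.
MILESTONE_RE = re.compile(r"\b(T\d+)\b.*?done \u2014 commit ([0-9a-f]{7,40})")
MAX_RESUME_ATTEMPTS = 3  # "Max 3 resumption attempts per plan"


def completed_tasks(log_lines):
    """Collect (task, sha) pairs from milestone lines that record a commit."""
    done = []
    for line in log_lines:
        if "milestone" not in line:
            continue
        match = MILESTONE_RE.search(line)
        if match:
            done.append((match.group(1), match.group(2)))
    return done


def can_resume(previous_attempts, done, branch_shas):
    """Refuse when the attempt cap is reached or a recorded commit is missing."""
    if previous_attempts >= MAX_RESUME_ATTEMPTS:
        return False
    # Compare by prefix because `git log --oneline` prints abbreviated hashes.
    return all(
        any(b.startswith(sha) or sha.startswith(b) for b in branch_shas)
        for _, sha in done
    )


def build_resume_prompt(failed_run_id, done, failed_task, last_error, original_task):
    """Assemble the RESUMING FROM CRASH header followed by the original task."""
    completed = ", ".join(f"{task} (commit {sha})" for task, sha in done)
    return (
        "RESUMING FROM CRASH\n"
        f"Previous run: {failed_run_id}\n"
        f"Completed tasks: {completed}\n"
        f"Failed at: {failed_task} \u2014 error: {last_error}\n"
        f"Resume: start from {failed_task}, skip completed tasks above.\n"
        "\n"
        "Original task:\n"
        f"{original_task}"
    )
```

When `can_resume` returns `False`, the rules above call for creating a decision instead of relaunching.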
|
|
90
|
+
|
|
91
|
+
### reviewer → `verdict` + `issues[]`
|
|
92
|
+
|
|
93
|
+
The reviewer runs two passes: spec compliance first (fail = stop), then code quality. It challenges by default.
|
|
94
|
+
|
|
95
|
+
Blocks on:
|
|
96
|
+
- Any CRITICAL issue
|
|
97
|
+
- ≥5 WARNINGs
|
|
98
|
+
- `spec_compliance: "FAIL"` (always blocks, regardless of code quality)
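The blocking rule above reduces to a short predicate. The sketch assumes each entry in `issues[]` is a dict with a `severity` key and that a passing spec check is reported as something other than `"FAIL"`; neither detail is spelled out in the text.

```python
CRITICAL_LIMIT = 1  # any CRITICAL issue blocks
WARNING_LIMIT = 5   # five or more WARNINGs block


def review_blocks(issues, spec_compliance="PASS"):
    """Return True when the review outcome should block the ticket."""
    if spec_compliance == "FAIL":
        # The spec gate always blocks, regardless of code quality.
        return True
    criticals = sum(1 for issue in issues if issue.get("severity") == "CRITICAL")
    warnings = sum(1 for issue in issues if issue.get("severity") == "WARNING")
    return criticals >= CRITICAL_LIMIT or warnings >= WARNING_LIMIT
```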
|
|
47
99
|
|
|
48
100
|
React to:
|
|
49
|
-
- `
|
|
50
|
-
- `
|
|
51
|
-
- `action: "escalate"` → mark ticket blocked, log questions
|
|
101
|
+
- `verdict: "APPROVED"` → mark ticket done
|
|
102
|
+
- `verdict: "CHANGES_REQUESTED"` → check anti-loop guard, re-dispatch `developer` with review feedback as context on same branch (include severity — developer should prioritize CRITICALs first, then spec deviations)
|
|
52
103
|
|
|
53
104
|
### scout → `findings[]` + `decisions_created`
|
|
54
105
|
|
|
55
106
|
React to:
|
|
56
|
-
- Parse `findings[]` — each has `severity`, `category`, `suggestion`, and optional `decision_id`
|
|
57
|
-
- CRITICAL findings with `decision_id` → wait for user decision before acting
|
|
58
|
-
-
|
|
59
|
-
- User answers "
|
|
60
|
-
- User answers "
|
|
61
|
-
- User answers "no" → discard
|
|
107
|
+
- Parse `findings[]` — each has `severity`, `category`, `suggestion`, `effort`, and optional `decision_id`
|
|
108
|
+
- CRITICAL/HIGH findings with `decision_id` → wait for user decision before acting
|
|
109
|
+
- User answers `"yes"` on a decision → **run pre-dispatch dedup check** (§2) then route as ticket
|
|
110
|
+
- User answers `"later"` → backlog the finding
|
|
111
|
+
- User answers `"no"` → discard
|
|
62
112
|
- MEDIUM/LOW findings (no decisions created) → log for reference, no action needed
|
|
113
|
+
- Log `health_score` and `strengths` for project context
|
|
63
114
|
|
|
64
115
|
## Dispatch — `--meta` fields
|
|
65
116
|
|
|
@@ -68,36 +119,43 @@ Use `--meta` for traceability and idempotency:
|
|
|
68
119
|
| Field | Required | Description |
|
|
69
120
|
|-------|----------|-------------|
|
|
70
121
|
| `ticketId` | always | Source ticket identifier for traceability |
|
|
71
|
-
| `stage` | always | Pipeline stage: `
|
|
122
|
+
| `stage` | always | Pipeline stage: `plan`, `develop`, `review` |
|
|
72
123
|
| `prNumber` | if exists | GitHub PR number |
|
|
73
|
-
| `cycle` | fix stage | Fixer→review cycle count (anti-loop tracking) |
|
|
74
124
|
| `parentTicketId` | sub-tickets | Parent ticket ID for decomposed work |
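As an illustration, a `--meta` payload for the fields in the table above could be assembled and validated before dispatch. The helper name `build_meta` is hypothetical, and the sketch covers only the three pipeline stages listed in the table.

```python
import json

PIPELINE_STAGES = {"plan", "develop", "review"}  # stages named in the table


def build_meta(ticket_id, stage, pr_number=None, parent_ticket_id=None):
    """Return a compact JSON string suitable for passing to --meta."""
    if not ticket_id:
        raise ValueError("ticketId is always required")
    if stage not in PIPELINE_STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    meta = {"ticketId": ticket_id, "stage": stage}
    # Optional fields are only emitted when they apply.
    if pr_number is not None:
        meta["prNumber"] = pr_number
    if parent_ticket_id is not None:
        meta["parentTicketId"] = parent_ticket_id
    return json.dumps(meta, separators=(",", ":"))
```

For example, `build_meta("PROJ-42", "review", pr_number=73)` yields `{"ticketId":"PROJ-42","stage":"review","prNumber":73}`.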
|
|
75
125
|
|
|
76
126
|
### Branch & PR lifecycle
|
|
77
127
|
|
|
78
128
|
- `--branch` is **required for all agents**. Every session runs in an isolated clone on that branch.
|
|
79
|
-
- **
|
|
80
|
-
- **
|
|
129
|
+
- **plan**: pass `--branch feat/PROJ-42-description` — architect commits the spec to this branch.
|
|
130
|
+
- **develop**: pass the same `--branch` as architect used.
|
|
131
|
+
- **review**: pass the same `--branch` and `prNumber` in `--meta`.
|
|
81
132
|
- On developer completion: extract `branch` and `prNumber` from `neo runs <runId>`, carry forward.
|
|
82
133
|
|
|
83
134
|
### Prompt writing
|
|
84
135
|
|
|
85
136
|
The `--prompt` is the agent's only context. It must be self-contained:
|
|
86
137
|
|
|
87
|
-
- **develop**: task description + acceptance criteria + instruction to create branch and PR
|
|
88
|
-
- **review**: PR number + branch name + what to review
|
|
89
|
-
- **fix**: PR number + branch name + specific issues to fix + instruction to push to existing branch
|
|
90
|
-
- **refine**: ticket title + description + any existing criteria
|
|
91
138
|
- **architect**: feature description + constraints + scope
|
|
139
|
+
- **developer**: task description + acceptance criteria + instruction to create branch and PR (or plan path for plan mode)
|
|
140
|
+
- **reviewer**: PR number + branch name + what to review
|
|
92
141
|
|
|
93
142
|
### Examples
|
|
94
143
|
|
|
95
144
|
```bash
|
|
96
|
-
#
|
|
97
|
-
neo run
|
|
98
|
-
--repo /path/to/repo \
|
|
99
|
-
--
|
|
100
|
-
|
|
145
|
+
# architect (design + plan)
|
|
146
|
+
neo run architect --prompt "Design and plan: multi-tenant auth system" \
|
|
147
|
+
--repo /path/to/repo --branch feat/PROJ-99-auth \
|
|
148
|
+
--meta '{"ticketId":"PROJ-99","stage":"plan"}'
|
|
149
|
+
|
|
150
|
+
# developer with plan (after architect completes)
|
|
151
|
+
neo run developer --prompt "Execute the implementation plan at .neo/specs/PROJ-99-plan.md. Create a PR when all tasks pass." \
|
|
152
|
+
--repo /path/to/repo --branch feat/PROJ-99-auth \
|
|
153
|
+
--meta '{"ticketId":"PROJ-99","stage":"develop"}'
|
|
154
|
+
|
|
155
|
+
# developer direct (small task, no architect needed)
|
|
156
|
+
neo run developer --prompt "Fix: POST /api/users returns 500 when email contains '+'. Open a PR." \
|
|
157
|
+
--repo /path/to/repo --branch fix/PROJ-43-email \
|
|
158
|
+
--meta '{"ticketId":"PROJ-43","stage":"develop"}'
|
|
101
159
|
|
|
102
160
|
# review
|
|
103
161
|
neo run reviewer --prompt "Review PR #73 on branch feat/PROJ-42-add-auth." \
|
|
@@ -105,18 +163,6 @@ neo run reviewer --prompt "Review PR #73 on branch feat/PROJ-42-add-auth." \
|
|
|
105
163
|
--branch feat/PROJ-42-add-auth \
|
|
106
164
|
--meta '{"ticketId":"PROJ-42","stage":"review","prNumber":73}'
|
|
107
165
|
|
|
108
|
-
# fix
|
|
109
|
-
neo run fixer --prompt "Fix issues from review on PR #73: missing input validation on login endpoint. Push fixes to the existing branch." \
|
|
110
|
-
--repo /path/to/repo \
|
|
111
|
-
--branch feat/PROJ-42-add-auth \
|
|
112
|
-
--meta '{"ticketId":"PROJ-42","stage":"fix","prNumber":73,"cycle":1}'
|
|
113
|
-
|
|
114
|
-
# architect
|
|
115
|
-
neo run architect --prompt "Design decomposition for multi-tenant auth system" \
|
|
116
|
-
--repo /path/to/repo \
|
|
117
|
-
--branch feat/PROJ-99-multi-tenant-auth \
|
|
118
|
-
--meta '{"ticketId":"PROJ-99","stage":"refine"}'
|
|
119
|
-
|
|
120
166
|
# scout
|
|
121
167
|
neo run scout --prompt "Explore this repository and surface bugs, improvements, security issues, and tech debt. Create decisions for critical and high-impact findings." \
|
|
122
168
|
--repo /path/to/repo \
|
|
@@ -124,6 +170,32 @@ neo run scout --prompt "Explore this repository and surface bugs, improvements,
|
|
|
124
170
|
--meta '{"stage":"scout"}'
|
|
125
171
|
```
|
|
126
172
|
|
|
173
|
+
### Task/Run Linkage — Mandatory Protocol
|
|
174
|
+
|
|
175
|
+
Every dispatch follows this sequence:
|
|
176
|
+
|
|
177
|
+
```bash
|
|
178
|
+
# 1. Create or identify the task
|
|
179
|
+
neo task create --scope /path/to/repo --priority high --initiative auth-v2 "T1: Implement JWT middleware"
|
|
180
|
+
# → returns: mem_abc123
|
|
181
|
+
|
|
182
|
+
# 2. Dispatch the run
|
|
183
|
+
neo run developer --prompt "..." --repo /path --branch feat/auth --meta '{"ticketId":"T1","stage":"develop"}'
|
|
184
|
+
# → returns: run-uuid-here
|
|
185
|
+
|
|
186
|
+
# 3. Link run to task immediately
|
|
187
|
+
neo task update mem_abc123 --status in_progress
|
|
188
|
+
# (use --context "neo runs run-uuid-here" when neo task supports it)
|
|
189
|
+
```
|
|
190
|
+
|
|
191
|
+
**On run completion:**
|
|
192
|
+
```bash
|
|
193
|
+
neo task update mem_abc123 --status done # if run succeeded
|
|
194
|
+
neo task update mem_abc123 --status blocked # if run failed
|
|
195
|
+
```
|
|
196
|
+
|
|
197
|
+
**Never dispatch without a task. Never leave a failed run's task as `in_progress`.**
|
|
198
|
+
|
|
127
199
|
## Protocol
|
|
128
200
|
|
|
129
201
|
### 1. Ticket Pickup
|
|
@@ -134,9 +206,10 @@ neo run scout --prompt "Explore this repository and surface bugs, improvements,
|
|
|
134
206
|
a. Read full ticket details.
|
|
135
207
|
b. Self-evaluate missing fields (see below).
|
|
136
208
|
c. Resolve target repository.
|
|
137
|
-
d.
|
|
138
|
-
e.
|
|
139
|
-
f.
|
|
209
|
+
d. **Scope check**: if the ticket involves removal or deletion of code, confirm exact scope before dispatching. If ambiguous, create a decision: "Should I delete ONLY X, or also Y?"
|
|
210
|
+
e. Route the ticket.
|
|
211
|
+
f. Update tracker → in progress.
|
|
212
|
+
g. **Yield.** Completion arrives at a future heartbeat.
|
|
140
213
|
|
|
141
214
|
### 2. Routing
|
|
142
215
|
|
|
@@ -156,57 +229,65 @@ Skip silently and log: `neo log discovery "Skipping <finding> — covered by PR
|
|
|
156
229
|
|
|
157
230
|
| Condition | Action |
|
|
158
231
|
|-----------|--------|
|
|
159
|
-
| Bug + critical priority | Dispatch `developer`
|
|
160
|
-
| Clear criteria + small scope (<
|
|
161
|
-
| Complexity ≥
|
|
162
|
-
| Unclear criteria or vague scope | Dispatch `
|
|
232
|
+
| Bug + critical priority | Dispatch `developer` direct (hotfix) |
|
|
233
|
+
| Clear criteria + small scope (< 3 points) | Dispatch `developer` direct |
|
|
234
|
+
| Complexity ≥ 3 | Dispatch `architect` first → design gate → plan → dispatch `developer` with plan path |
|
|
235
|
+
| Unclear criteria or vague scope | Dispatch `architect` (handles triage via decision poll) |
|
|
236
|
+
| Deletion / large removal | Create scope decision first, then dispatch `developer` direct |
|
|
163
237
|
| Proactive exploration / no specific ticket | Dispatch `scout` on target repo |
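The routing table above can be read as a decision function. The table does not define precedence between rows, so the order of checks below is an assumption, as are the ticket field names (`is_deletion`, `criteria_clear`, and so on) and the returned step labels.

```python
from typing import List, Optional


def route(ticket: Optional[dict]) -> List[str]:
    """Return the ordered steps for a ticket (None means no specific ticket)."""
    if ticket is None:
        return ["scout"]
    if ticket.get("is_deletion"):
        # Scope must be confirmed before any removal work starts.
        return ["scope_decision", "developer"]
    if ticket.get("type") == "bug" and ticket.get("priority") == "critical":
        return ["developer"]  # hotfix, direct mode
    if not ticket.get("criteria_clear", True):
        return ["architect"]  # triage via decision poll
    if ticket.get("complexity", 3) >= 3:
        # Architect plans first; developer then executes the plan.
        return ["architect", "developer"]
    return ["developer"]
```

The default complexity of 3 follows the self-evaluation defaults in this file, so a ticket with no estimate goes to the architect first.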
|
|
164
238
|
|
|
165
|
-
### 3. On
|
|
239
|
+
### 3. On Architect Design Gate
|
|
240
|
+
|
|
241
|
+
When architect creates an approval decision (`--type approval --wait`):
|
|
242
|
+
|
|
243
|
+
1. Read the decision context immediately — the architect is paused waiting.
|
|
244
|
+
2. Evaluate: does the proposed design match the ticket's intent? Are the components and scope reasonable?
|
|
245
|
+
3. **Can you approve directly?** (most common — if the approach is reasonable and in scope)
|
|
246
|
+
→ `neo decision answer <id> approved`
|
|
247
|
+
4. **Need changes?** → `neo decision answer <id> "approved_with_changes: <specific changes needed>"`
|
|
248
|
+
5. **Fundamentally wrong approach?** → `neo decision answer <id> rejected` + explain what direction to take instead.
|
|
166
249
|
|
|
167
|
-
|
|
168
|
-
- `action: "pass_through"` → dispatch `developer` with `enriched_context`
|
|
169
|
-
- `action: "decompose"` → create sub-tickets from `sub_tickets[]`, dispatch `developer` for each
|
|
170
|
-
- `action: "escalate"` → update tracker → blocked
|
|
250
|
+
Do NOT dispatch follow-up agents until the architect reports `plan_path`.
|
|
171
251
|
|
|
172
|
-
### 4. On Developer
|
|
252
|
+
### 4. On Developer Completion — with PR
|
|
173
253
|
|
|
174
254
|
1. Parse output for `PR_URL`, extract PR number.
|
|
175
|
-
2.
|
|
176
|
-
|
|
255
|
+
2. Handle by status:
|
|
256
|
+
- `status: "DONE"` → update tracker → CI pending.
|
|
257
|
+
- `status: "DONE_WITH_CONCERNS"` → read concerns, evaluate impact. If architectural → create a decision or dispatch architect. If minor → update tracker → CI pending, note concerns.
|
|
258
|
+
- `status: "BLOCKED"` → route via decision system.
|
|
259
|
+
- `status: "NEEDS_CONTEXT"` → provide context, re-dispatch developer on same branch.
|
|
260
|
+
3. For CI pending tickets: check CI: `gh pr checks <prNumber> --repo <repository>`.
|
|
177
261
|
4. CI passed → update tracker → in review, dispatch `reviewer`.
|
|
178
|
-
5. CI failed →
|
|
262
|
+
5. CI failed → re-dispatch `developer` with CI error context on same branch.
|
|
179
263
|
6. CI pending → note in focus, check at next heartbeat.
|
|
180
264
|
|
|
181
|
-
### 5. On Developer
|
|
265
|
+
### 5. On Developer Completion — no PR
|
|
182
266
|
|
|
183
|
-
|
|
267
|
+
- `status: "DONE"` → update tracker → done.
|
|
268
|
+
- `status: "DONE_WITH_CONCERNS"` → evaluate concerns, mark done with note if minor.
|
|
269
|
+
- `status: "BLOCKED"` → route via decision system.
|
|
270
|
+
- `status: "NEEDS_CONTEXT"` → provide context, re-dispatch developer.
|
|
184
271
|
|
|
185
272
|
### 6. On Review Completion
|
|
186
273
|
|
|
187
274
|
Parse reviewer's JSON output:
|
|
188
275
|
- `verdict: "APPROVED"` → update tracker → done.
|
|
189
|
-
- `verdict: "CHANGES_REQUESTED"` → check anti-loop guard → dispatch `
|
|
190
|
-
|
|
191
|
-
### 7. On Fixer Completion
|
|
192
|
-
|
|
193
|
-
Parse fixer's JSON output:
|
|
194
|
-
- `status: "FIXED"` → update tracker → in review, re-dispatch `reviewer`.
|
|
195
|
-
- `status: "ESCALATED"` → update tracker → blocked.
|
|
276
|
+
- `verdict: "CHANGES_REQUESTED"` → check anti-loop guard → re-dispatch `developer` with review feedback as context on same branch (include spec deviations + CRITICAL issues + WARNING count), or escalate.
|
|
196
277
|
|
|
197
|
-
###
|
|
278
|
+
### 7. On Scout Completion
|
|
198
279
|
|
|
199
280
|
Parse scout's JSON output:
|
|
200
281
|
- For each finding with `decision_id`: wait for user decision at future heartbeat.
|
|
201
|
-
- User answers "yes" on a decision:
|
|
202
|
-
- **Run pre-dispatch dedup check** (§2) before dispatching
|
|
282
|
+
- User answers `"yes"` on a decision:
|
|
283
|
+
- **Run pre-dispatch dedup check** (§2) before dispatching.
|
|
203
284
|
- `effort: "XS" | "S"` → dispatch `developer` with finding as ticket
|
|
204
285
|
- `effort: "M" | "L"` → dispatch `architect` for design first
|
|
205
|
-
- User answers "later" → log to backlog, no dispatch
|
|
206
|
-
- User answers "no" → discard finding, no action
|
|
286
|
+
- User answers `"later"` → log to backlog, no dispatch
|
|
287
|
+
- User answers `"no"` → discard finding, no action
|
|
207
288
|
- Log `health_score` and `strengths` for project context.
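A rough sketch of the per-finding handling above. The action labels and the `is_duplicate` flag (standing in for the result of the §2 dedup check) are hypothetical; the effort buckets and answers come from the list.

```python
def route_finding(finding, answer=None, is_duplicate=False):
    """Decide what to do with one scout finding given the user's answer."""
    if not finding.get("decision_id"):
        return "log_only"  # no decision was created for this finding
    if answer is None:
        return "wait_for_decision"
    if answer == "later":
        return "backlog"
    if answer == "no":
        return "discard"
    if answer != "yes":
        raise ValueError(f"unexpected answer: {answer!r}")
    if is_duplicate:
        return "skip_duplicate"  # dedup check runs before any dispatch
    effort = finding.get("effort")
    if effort in ("XS", "S"):
        return "dispatch_developer"
    if effort in ("M", "L"):
        return "dispatch_architect"
    raise ValueError(f"unexpected effort: {effort!r}")
```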
|
|
208
289
|
|
|
209
|
-
###
|
|
290
|
+
### 8. On Agent Failure
|
|
210
291
|
|
|
211
292
|
Update tracker → abandoned. Log the failure reason.
|
|
212
293
|
|
|
@@ -217,12 +298,14 @@ ready → in progress → ci pending → in review → done
|
|
|
217
298
|
│ │ │
|
|
218
299
|
│ │ failure │ changes requested
|
|
219
300
|
│ ▼ ▼
|
|
220
|
-
│
|
|
221
|
-
│
|
|
222
|
-
│ └──→ in review (re-review)
|
|
301
|
+
│ developer ◄────────┘
|
|
302
|
+
│ re-dispatch
|
|
223
303
|
│
|
|
224
304
|
└──→ blocked (escalation/budget/anti-loop)
|
|
225
305
|
└──→ abandoned (terminal failure)
|
|
306
|
+
|
|
307
|
+
architect path:
|
|
308
|
+
ready → design gate (--wait decision) → plan written → in progress → ...
|
|
226
309
|
```
|
|
227
310
|
|
|
228
311
|
## Self-Evaluation (Missing Ticket Fields)
|
|
@@ -233,47 +316,165 @@ Infer missing fields before routing:
|
|
|
233
316
|
- "crash", "error", "broken", "fix", "regression" → `bug`
|
|
234
317
|
- "add", "create", "implement", "build", "new" → `feature`
|
|
235
318
|
- "refactor", "clean", "improve", "optimize" → `chore`
|
|
319
|
+
- "remove", "delete", "cleanup" → `chore` (requires scope confirmation — see §1d)
|
|
236
320
|
- Unclear → `feature`
|
|
237
321
|
|
|
238
322
|
**Complexity (Fibonacci):**
|
|
239
323
|
- 1: typo, config, single-line — 2: single file, <50 lines — 3: 2-3 files (default)
|
|
240
|
-
-
|
|
324
|
+
- 3+: triggers architect first — 5: 3-5 files — 8: 5-8 files — 13: large feature — 21+: major
|
|
241
325
|
|
|
242
326
|
**Criteria** (when unset):
|
|
243
327
|
- Bugs: "The bug described in the title is fixed and does not regress"
|
|
244
328
|
- Features: derive from title
|
|
245
329
|
- Chores: "Code is cleaned up without breaking existing behavior"
|
|
330
|
+
- Deletions: "ONLY the named subsystem is removed. Nothing adjacent is deleted."
|
|
246
331
|
|
|
247
332
|
**Priority** (when unset): `medium`
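The inference rules above might be sketched like this. Several details are assumptions rather than documented behavior: keywords are matched as whole lowercase words, type rules are tried in the order listed, and a feature's criteria simply reuse the title as a placeholder for "derive from title".

```python
import re

TYPE_KEYWORDS = [
    ("bug", {"crash", "error", "broken", "fix", "regression"}),
    ("feature", {"add", "create", "implement", "build", "new"}),
    ("chore", {"refactor", "clean", "improve", "optimize"}),
]
DELETION_KEYWORDS = {"remove", "delete", "cleanup"}


def infer_fields(title, type_=None, complexity=None, criteria=None, priority=None):
    """Fill in missing ticket fields using the defaults described above."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    is_deletion = bool(words & DELETION_KEYWORDS)
    if type_ is None:
        # First matching rule wins; deletions fall back to chore, else feature.
        type_ = next((t for t, kws in TYPE_KEYWORDS if words & kws), None)
        if type_ is None:
            type_ = "chore" if is_deletion else "feature"
    if criteria is None:
        if is_deletion:
            criteria = "ONLY the named subsystem is removed. Nothing adjacent is deleted."
        elif type_ == "bug":
            criteria = "The bug described in the title is fixed and does not regress"
        elif type_ == "chore":
            criteria = "Code is cleaned up without breaking existing behavior"
        else:
            criteria = title  # placeholder for "derive from title"
    return {
        "type": type_,
        "complexity": 3 if complexity is None else complexity,
        "criteria": criteria,
        "priority": priority or "medium",
        "needs_scope_confirmation": is_deletion,
    }
```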
|
|
248
333
|
|
|
334
|
+
## Execution Strategy
|
|
335
|
+
|
|
336
|
+
When an architect completes:
|
|
337
|
+
|
|
338
|
+
1. Read `plan_path` from architect output.
|
|
339
|
+
2. Dispatch `developer` with the plan path on the same branch. The developer handles task ordering autonomously.
|
|
340
|
+
3. Post-completion: check CI, dispatch `reviewer` after CI passes.
|
|
341
|
+
4. Anti-loop guard: max 6 re-dispatch cycles per ticket.
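The anti-loop guard in step 4 amounts to a per-ticket counter. The class below is a hypothetical in-memory sketch; how the supervisor actually stores cycle counts across heartbeats is not described here.

```python
MAX_REDISPATCH_CYCLES = 6  # "max 6 re-dispatch cycles per ticket"


class AntiLoopGuard:
    """Count developer re-dispatches per ticket and refuse past the limit."""

    def __init__(self, limit=MAX_REDISPATCH_CYCLES):
        self.limit = limit
        self.cycles = {}

    def allow(self, ticket_id):
        """Record one more cycle if allowed; return False once the cap is hit."""
        used = self.cycles.get(ticket_id, 0)
        if used >= self.limit:
            return False
        self.cycles[ticket_id] = used + 1
        return True
```

When `allow` returns `False`, the ticket would move to `blocked` rather than being re-dispatched again, matching the anti-loop branch of the state diagram in this file.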
|
|
342
|
+
|
|
343
|
+
## Decision Routing
|
|
344
|
+
|
|
345
|
+
When a pending decision arrives from an agent:
|
|
346
|
+
|
|
347
|
+
1. **Can you answer directly?** (strategic question, scope, priority, design approval)
|
|
348
|
+
→ `neo decision answer <id> <answer>`
|
|
349
|
+
|
|
350
|
+
2. **Needs codebase investigation?** (technical question about existing code)
|
|
351
|
+
→ Dispatch `scout` to investigate (already readonly)
|
|
352
|
+
→ Read run output → `neo decision answer <id>` with findings
|
|
353
|
+
|
|
354
|
+
3. **Needs human input?** (`autoDecide: false`, or genuinely uncertain)
|
|
355
|
+
→ Log and wait for human response
|
|
356
|
+
|
|
357
|
+
IMPORTANT: An agent may be BLOCKED waiting on this decision (especially architect with `--wait`).
|
|
358
|
+
Answer within 1-2 heartbeats. Stale decisions waste agent session budget.
|
|
359
|
+
|
|
360
|
+
## Memory Usage
|
|
361
|
+
|
|
362
|
+
Use `neo memory write` to capture stable facts that would change how future agents approach work on this repo.
|
|
363
|
+
|
|
364
|
+
Write memories when:
|
|
365
|
+
- A scout or developer reveals a non-obvious constraint not in docs (e.g., "build must pass locally before push — CI runs compiled output only")
|
|
366
|
+
- A developer or reviewer hits the same issue twice (→ write a `procedure` memory with the fix)
|
|
367
|
+
- A user provides feedback that affects future dispatches (→ write a `feedback` memory with scope)
|
|
368
|
+
- A design decision locks in a pattern that future architects should know (→ write a `fact` memory)
|
|
369
|
+
|
|
370
|
+
```bash
|
|
371
|
+
neo memory write --type fact --scope <repo-path> "<stable truth>"
|
|
372
|
+
neo memory write --type procedure --scope <repo-path> "<step-by-step how-to>"
|
|
373
|
+
neo memory write --type feedback --scope <repo-path> "<recurring complaint or preference>"
|
|
374
|
+
neo memory write --type focus --expires 2h "<current working context>"
|
|
375
|
+
```
|
|
376
|
+
|
|
377
|
+
Do NOT memorize: file paths, general best practices, obvious conventions, anything in README or package.json.
|
|
378
|
+
|
|
249
379
|
## Idle Behavior
|
|
250
380
|
|
|
251
381
|
When the supervisor has **no events, no active runs, and no pending tasks**, it enters idle mode.
|
|
252
382
|
|
|
253
|
-
**Do not dispatch new agents proactively
|
|
383
|
+
**Do not dispatch new agents proactively** unless there are active **directives** (see below). Instead, use idle time to audit past work and catch dropped tasks:
|
|
254
384
|
|
|
255
385
|
1. **Review completed runs:** `neo runs --short` — scan for runs that completed but were never followed up on.
|
|
256
386
|
2. **Check for missed dispatches:**
|
|
257
387
|
- A `developer` run completed with a `PR_URL` but no `reviewer` was dispatched → dispatch `reviewer`.
|
|
258
|
-
- A `
|
|
259
|
-
-
|
|
260
|
-
-
|
|
261
|
-
- A `refiner` returned `decompose` but no `sub-tickets` were created (and/or no `developer` was dispatched) → create sub-tickets from `sub_tickets[]`, then dispatch one `developer` per sub-ticket.
|
|
262
|
-
- A `architect` returned `milestones[].tasks[]` but sub-tickets were never created → create them and dispatch.
|
|
388
|
+
- A `reviewer` returned `CHANGES_REQUESTED` but no `developer` was re-dispatched → re-dispatch `developer` with review feedback (check anti-loop guard first).
|
|
389
|
+
- An `architect` returned a `plan_path` but no `developer` was dispatched with it → dispatch `developer` with the plan path.
|
|
390
|
+
- Pending decisions not yet answered → check `neo decision list` and route appropriately.
|
|
263
391
|
3. **Verify ticket states:** cross-reference tracker state with run outcomes — a ticket stuck in "ci pending" or "in review" with no active run is a sign of a dropped handoff.
|
|
264
|
-
4. **If everything checks out:** do nothing. Wait for the next heartbeat or user input.
|
|
392
|
+
4. **If everything checks out and there are no active directives:** do nothing. Wait for the next heartbeat or user input.
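
The missed-dispatch checks in step 2 could be sketched roughly as follows. This is only an illustration, not the supervisor's actual logic: the run-record shape (`agent`, `ticket`, `output`, `followed_up`) and the `verdict` key are hypothetical, while `PR_URL`, `CHANGES_REQUESTED`, and `plan_path` come from the text above.

```python
def missed_dispatches(runs):
    """Return (agent_to_dispatch, ticket) pairs for completed runs with no follow-up.

    Each run is a dict with hypothetical keys: agent, ticket, output, followed_up.
    """
    todo = []
    for run in runs:
        if run.get("followed_up"):
            continue  # a follow-up agent was already dispatched
        output = run.get("output", {})
        agent = run["agent"]
        if agent == "developer" and output.get("PR_URL"):
            # A PR exists but nobody reviewed it.
            todo.append(("reviewer", run["ticket"]))
        elif agent == "reviewer" and output.get("verdict") == "CHANGES_REQUESTED":
            # Caller should still check the anti-loop guard before re-dispatching.
            todo.append(("developer", run["ticket"]))
        elif agent == "architect" and output.get("plan_path"):
            # A plan exists but no developer picked it up.
            todo.append(("developer", run["ticket"]))
    return todo
```

An empty result corresponds to step 4: nothing was dropped, so the supervisor waits.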
|
|
393
|
+
|
|
394
|
+
## Directives
|
|
395
|
+
|
|
396
|
+
Directives are persistent standing instructions for idle time. They tell the supervisor what to do when otherwise idle.
|
|
397
|
+
|
|
398
|
+
### Managing Directives
|
|
399
|
+
|
|
400
|
+
```bash
|
|
401
|
+
# Create a directive (default: indefinite)
|
|
402
|
+
neo directive create "launch scout and implement findings" --trigger idle
|
|
403
|
+
|
|
404
|
+
# Create a time-bounded directive
|
|
405
|
+
neo directive create "run tests on all repos" --trigger idle --duration "2h"
|
|
406
|
+
neo directive create "check for PRs needing review" --trigger idle --duration "until midnight"
|
|
407
|
+
|
|
408
|
+
# List all directives
|
|
409
|
+
neo directive list
|
|
410
|
+
|
|
411
|
+
# Toggle a directive off/on
|
|
412
|
+
neo directive toggle <id>
|
|
413
|
+
|
|
414
|
+
# Delete a directive
|
|
415
|
+
neo directive delete <id>
|
|
416
|
+
```
|
|
417
|
+
|
|
418
|
+
### Trigger Types
|
|
419
|
+
|
|
420
|
+
| Trigger | When it fires |
|
|
421
|
+
|---------|---------------|
|
|
422
|
+
| `idle` | No events, no active runs, no pending tasks |
|
|
423
|
+
| `startup` | Supervisor starts |
|
|
424
|
+
| `shutdown` | Supervisor stops |
|
|
425
|
+
|
|
426
|
+
### Idle Directive Execution
|
|
427
|
+
|
|
428
|
+
When idle and there are active directives:
|
|
429
|
+
|
|
430
|
+
1. Read the list of active directives (sorted by priority descending).
|
|
431
|
+
2. For each directive, check if execution is feasible (budget, repo availability).
|
|
432
|
+
3. Execute the highest-priority feasible directive.
|
|
433
|
+
4. Log the action: `neo log action "executed directive: <action>"`.
|
|
434
|
+
5. Update the directive's `lastTriggeredAt` timestamp.
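
The five steps above can be sketched as a small selection routine. This is a minimal sketch under stated assumptions: the `Directive` record, the `enabled` flag, and the `is_feasible`, `execute`, and `log` callbacks are hypothetical stand-ins; only the priority ordering, the log message format, and the `lastTriggeredAt` update come from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Directive:
    # Hypothetical record; the text only names priority and lastTriggeredAt.
    id: str
    action: str
    priority: int
    enabled: bool = True
    last_triggered_at: Optional[datetime] = None


def run_idle_step(directives, is_feasible, execute, log, now=None):
    """Execute the highest-priority feasible directive, or return None."""
    # Step 1: active directives, sorted by priority descending.
    active = sorted(
        (d for d in directives if d.enabled),
        key=lambda d: d.priority,
        reverse=True,
    )
    for directive in active:
        # Step 2: feasibility check (budget, repo availability) is delegated.
        if not is_feasible(directive):
            continue
        # Step 3: execute only the first feasible directive.
        execute(directive)
        # Step 4: log the action in the format quoted above.
        log(f"executed directive: {directive.action}")
        # Step 5: record when it last fired.
        directive.last_triggered_at = now or datetime.now(timezone.utc)
        return directive
    return None
```

Executing a single directive per idle pass keeps each heartbeat bounded; whether the real supervisor behaves this way is not stated in the source.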
|
|
435
|
+
|
|
436
|
+
### Duration Formats
|
|
437
|
+
|
|
438
|
+
- **Shorthand:** `2h`, `30m`, `7d`
|
|
439
|
+
- **Natural:** `for 2 hours`, `for 30 minutes`, `for 7 days`
|
|
440
|
+
- **Until time:** `until midnight`, `until 18:00`
|
|
441
|
+
- **Indefinite:** `indefinitely` or omit `--duration`
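
To make the accepted formats concrete, here is a rough sketch of a parser that turns a `--duration` string into an expiry time. It is not the CLI's actual parser: the function name `parse_duration` is hypothetical, and rolling an `until HH:MM` time that has already passed over to the next day is an assumption.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

_SHORT_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: Optional[str], now: datetime) -> Optional[datetime]:
    """Return the expiry time, or None for an indefinite directive."""
    if text is None:
        return None  # --duration omitted
    s = text.strip().lower()
    if s in ("", "indefinitely"):
        return None

    # Shorthand: 2h, 30m, 7d
    m = re.fullmatch(r"(\d+)([mhd])", s)
    if m:
        return now + timedelta(**{_SHORT_UNITS[m.group(2)]: int(m.group(1))})

    # Natural: for 2 hours, for 30 minutes, for 7 days
    m = re.fullmatch(r"for (\d+) (minutes?|hours?|days?)", s)
    if m:
        unit = m.group(2).rstrip("s") + "s"  # normalize to the plural keyword
        return now + timedelta(**{unit: int(m.group(1))})

    # Until time: until midnight, until 18:00
    if s == "until midnight":
        tomorrow = now + timedelta(days=1)
        return tomorrow.replace(hour=0, minute=0, second=0, microsecond=0)
    m = re.fullmatch(r"until (\d{1,2}):(\d{2})", s)
    if m:
        target = now.replace(
            hour=int(m.group(1)), minute=int(m.group(2)), second=0, microsecond=0
        )
        if target <= now:
            target += timedelta(days=1)  # assumption: already passed today
        return target

    raise ValueError(f"unrecognized duration: {text!r}")
```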
|
|
442
|
+
|
|
443
|
+
### Example Directives
|
|
444
|
+
|
|
445
|
+
```bash
|
|
446
|
+
# Proactive exploration
|
|
447
|
+
neo directive create "launch scout and implement high-severity findings" \
|
|
448
|
+
--trigger idle --priority 5 --description "Continuous improvement"
|
|
449
|
+
|
|
450
|
+
# Background maintenance
|
|
451
|
+
neo directive create "run tests on all repos and fix failures" \
|
|
452
|
+
--trigger idle --priority 3 --duration "for 8 hours"
|
|
453
|
+
|
|
454
|
+
# Time-boxed campaign
|
|
455
|
+
neo directive create "update dependencies in all repos" \
|
|
456
|
+
--trigger idle --priority 10 --duration "until midnight"
|
|
457
|
+
```
|
|
458
|
+
|
|
459
|
+
### Cleanup
|
|
460
|
+
|
|
461
|
+
Directives that expired more than 24 hours ago are automatically removed during compaction heartbeats.
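
As an illustration of the 24-hour retention rule, a compaction pass might filter directives like this. The `expiresAt` field name and the dict-based record are hypothetical; only the 24-hour threshold comes from the text.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(hours=24)  # threshold stated in the Cleanup rule


def compact_directives(directives, now):
    """Keep indefinite directives and those expired at most 24 hours ago."""
    kept = []
    for directive in directives:
        expires_at = directive.get("expiresAt")  # hypothetical field; None = indefinite
        if expires_at is None or now - expires_at <= RETENTION:
            kept.append(directive)
    return kept
```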
|
|
265
462
|
|
|
266
463
|
## Safety Guards
|
|
267
464
|
|
|
268
465
|
### Anti-Loop Guard
|
|
269
|
-
- Max **6**
|
|
466
|
+
- Max **6** developer re-dispatch cycles per ticket.
|
|
270
467
|
- At limit: escalate. Do NOT dispatch again.
|
|
271
468
|
|
|
272
469
|
### Escalation Policy
|
|
273
|
-
- If
|
|
470
|
+
- If developer reports `status: "BLOCKED"` or fails **3× on the same error type**: escalate immediately.
|
|
274
471
|
- Do NOT attempt a 4th variant.
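
The anti-loop and escalation rules can be combined into one decision check before any developer re-dispatch. This is a sketch only: the function and its inputs are hypothetical, and the constants mirror the limits quoted above (6 re-dispatch cycles, 3 failures on the same error type).

```python
from collections import Counter

MAX_REDISPATCH_CYCLES = 6    # Anti-Loop Guard limit
MAX_SAME_ERROR_FAILURES = 3  # Escalation Policy limit


def next_action(redispatch_count, status, error_types):
    """Return "escalate" or "dispatch" for a ticket.

    redispatch_count: developer re-dispatch cycles already used on this ticket
    status: latest developer status string, e.g. "BLOCKED"
    error_types: one error-type label per failed attempt so far
    """
    if status == "BLOCKED":
        return "escalate"
    counts = Counter(error_types)
    if counts and max(counts.values()) >= MAX_SAME_ERROR_FAILURES:
        return "escalate"  # do not attempt a 4th variant
    if redispatch_count >= MAX_REDISPATCH_CYCLES:
        return "escalate"  # at the limit: do not dispatch again
    return "dispatch"
```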
|
|
275
472
|
|
|
473
|
+
### Scope Guard
|
|
474
|
+
- For deletion or large removal tasks: always confirm exact scope before dispatch.
|
|
475
|
+
- "Remove X" means ONLY X — never adjacent systems unless explicitly stated.
|
|
476
|
+
- When in doubt: create a decision with boundary options before dispatching.
|
|
477
|
+
|
|
276
478
|
### Budget Enforcement
|
|
277
479
|
- Check `neo cost --short` before every dispatch.
|
|
278
480
|
- Never dispatch if budget would be exceeded.
|
|
279
|
-
|