@neotx/agents 0.1.0-alpha.2 → 0.1.0-alpha.21

package/README.md ADDED
@@ -0,0 +1,178 @@
+ # @neotx/agents
+
+ Built-in agent definitions for `@neotx/core`.
+
+ This package contains YAML configuration files and Markdown prompts that define the 5 built-in agents used by the Neo orchestrator. It's a data package — no TypeScript, no runtime code.
+
+ ## Contents
+
+ ```
+ packages/agents/
+ ├── agents/           # Agent YAML definitions
+ │   ├── architect.yml
+ │   ├── developer.yml
+ │   ├── fixer.yml
+ │   ├── refiner.yml
+ │   └── reviewer.yml
+ └── prompts/          # Markdown system prompts
+     ├── architect.md
+     ├── developer.md
+     ├── fixer.md
+     ├── refiner.md
+     └── reviewer.md
+ ```
+
+ ## Built-in Agents
+
+ | Agent | Model | Sandbox | Tools | Role |
+ |-------|-------|---------|-------|------|
+ | **architect** | opus | readonly | Read, Glob, Grep, WebSearch, WebFetch | Strategic planner. Analyzes features, designs architecture, decomposes work into atomic tasks. Never writes code. |
+ | **developer** | opus | writable | Read, Write, Edit, Bash, Glob, Grep | Implementation worker. Executes atomic tasks from specs in isolated clones. |
+ | **fixer** | opus | writable | Read, Write, Edit, Bash, Glob, Grep | Auto-correction agent. Fixes issues found by reviewers. Targets root causes, not symptoms. |
+ | **refiner** | opus | readonly | Read, Glob, Grep, WebSearch, WebFetch | Ticket quality evaluator. Assesses clarity and splits vague tickets into precise sub-tickets. |
+ | **reviewer** | sonnet | readonly | Read, Glob, Grep, Bash | Single-pass unified reviewer. Covers quality, security, performance, and test coverage in one sweep. Challenges by default — blocks on critical issues. |
+
+ ### Sandbox Modes
+
+ - **readonly**: Agent can read files but cannot write. Safe for analysis tasks.
+ - **writable**: Agent can read and write files. Used for implementation and fixes.
+
+ ### Model Selection
+
+ - **opus**: Used for complex reasoning (architecture, security, implementation)
+ - **sonnet**: Used for focused review tasks (quality, performance, coverage)
+
+ ## Creating Custom Agents
+
+ Custom agents are defined in `.neo/agents/` in your project. You can create entirely new agents or extend built-in ones.
+
+ ### Agent YAML Schema
+
+ ```yaml
+ name: my-agent # Required: unique identifier
+ description: "What this agent does" # Required for custom agents
+ model: opus | sonnet | haiku # Required for custom agents
+ tools: # Required for custom agents
+ - Read
+ - Write
+ - Edit
+ - Bash
+ - Glob
+ - Grep
+ - WebSearch
+ - WebFetch
+ sandbox: writable | readonly # Required for custom agents
+ prompt: ../prompts/my-agent.md # Path to system prompt (relative to YAML)
+ ```
+
+ ### Extending Built-in Agents
+
+ Use `extends` to inherit from a built-in agent and override specific fields:
+
+ ```yaml
+ name: my-developer
+ extends: developer
+ model: sonnet # Override: use sonnet instead of opus
+ promptAppend: |
+   ## Additional Instructions
+   Always write tests before implementation.
+ ```
+
+ When extending:
+ - Unspecified fields inherit from the base agent
+ - `prompt` replaces the base prompt entirely
+ - `promptAppend` appends to the inherited prompt
+
+ ### The `$inherited` Token
+
+ When extending an agent, you can add tools while keeping the inherited ones:
+
+ ```yaml
+ name: research-developer
+ extends: developer
+ tools:
+ - $inherited # Keep all tools from developer
+ - WebSearch # Add web search capability
+ - WebFetch # Add web fetch capability
+ ```
+
+ Without `$inherited`, the tools list replaces the base entirely:
+
+ ```yaml
+ name: minimal-developer
+ extends: developer
+ tools:
+ - Read # Only these tools, not the inherited ones
+ - Edit
+ ```
+
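A minimal sketch, not part of the published package, of the merge rules the README describes for `extends` and `$inherited`: unspecified fields come from the base agent, and a `tools` list either replaces the base list or expands `$inherited` in place. The function name `resolve_agent`, the plain-dict representation, and the de-duplication of repeated tools are assumptions for illustration; how `@neotx/core` actually performs the merge is not shown in this diff.

```python
INHERITED = "$inherited"


def resolve_agent(base: dict, custom: dict) -> dict:
    """Merge a custom agent definition over the agent it extends (illustrative only)."""
    merged = dict(base)  # unspecified fields inherit from the base agent
    for key, value in custom.items():
        if key == "extends":
            continue  # bookkeeping field, not part of the resolved agent
        if key == "tools":
            expanded = []
            for tool in value:
                # "$inherited" expands to the base tools; any other entry is kept as-is
                expanded.extend(base.get("tools", []) if tool == INHERITED else [tool])
            value = list(dict.fromkeys(expanded))  # drop duplicates, keep order (assumption)
        merged[key] = value
    return merged


developer = {
    "name": "developer",
    "model": "opus",
    "sandbox": "writable",
    "tools": ["Read", "Write", "Edit", "Bash", "Glob", "Grep"],
}
research = resolve_agent(
    developer,
    {"name": "research-developer", "extends": "developer",
     "tools": ["$inherited", "WebSearch", "WebFetch"]},
)
print(research["tools"])
```

Without the `$inherited` entry, the same function returns only the tools listed in the custom definition, which matches the `minimal-developer` example.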
+ ### Implicit Extension
+
+ If your custom agent has the same name as a built-in, it implicitly extends it:
+
+ ```yaml
+ # .neo/agents/developer.yml
+ # No "extends:" needed — same name implies extends: developer
+ name: developer
+ model: sonnet # Override model
+ promptAppend: |
+   Use the project's existing patterns.
+ ```
+
+ ## Prompts
+
+ Each agent has a corresponding Markdown prompt in `prompts/`. The prompt defines:
+
+ - The agent's role and responsibilities
+ - Execution protocol
+ - Output format expectations
+ - Hard rules and constraints
+ - Escalation conditions
+
+ ### Prompt Structure
+
+ Prompts follow a consistent structure:
+
+ ```markdown
+ # Agent Name
+
+ One-sentence role definition.
+
+ ## Protocol
+ Step-by-step execution protocol.
+
+ ## Output
+ Expected JSON structure for agent output.
+
+ ## Escalation
+ When to stop and report to the dispatcher.
+
+ ## Rules
+ Non-negotiable constraints.
+ ```
+
+ Runtime metadata (hooks, skills, memory, isolation) are injected by `@neotx/core` — not written in the prompt.
+
+ ### Referencing Prompts
+
+ In agent YAML, reference prompts with a relative path:
+
+ ```yaml
+ prompt: ../prompts/architect.md
+ ```
+
+ The path is resolved relative to the YAML file's directory.
+
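To make the resolution rule concrete, here is a small sketch of resolving a `prompt` value against the directory of the YAML file that declares it. The helper name `resolve_prompt_path` is hypothetical, and the normalization is purely lexical (it does not touch the filesystem or follow symlinks), so it only illustrates the rule and does not reflect how `@neotx/core` implements it.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_prompt_path(yaml_file: str, prompt: str) -> str:
    """Resolve `prompt` relative to the directory containing `yaml_file`."""
    yaml_dir = PurePosixPath(yaml_file).parent
    # Join first, then collapse the ".." segment lexically
    return posixpath.normpath(str(yaml_dir / prompt))


print(resolve_prompt_path("packages/agents/agents/architect.yml", "../prompts/architect.md"))
```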
+ ## How @neotx/core Uses This Package
+
+ The `@neotx/core` orchestrator:
+
+ 1. Loads all YAML files from `packages/agents/agents/` as built-in agents
+ 2. Loads all YAML files from `.neo/agents/` as custom agents
+ 3. Resolves extensions and merges configurations
+ 4. Reads and injects prompts into agent sessions
+ Custom agents in `.neo/agents/` override or extend the built-ins from this package.
+
+ ## License
+
+ MIT
package/SUPERVISOR.md ADDED
@@ -0,0 +1,282 @@
+ # Supervisor — Domain Knowledge
+
+ This file contains domain-specific knowledge for the supervisor. Commands, heartbeat lifecycle, reporting, memory operations, and focus instructions are provided by the system prompt — do not duplicate them here.
+
+ ## Available Agents
+
+ | Agent | Model | Mode | Use when |
+ |-------|-------|------|----------|
+ | `architect` | opus | readonly | Designing systems, planning features, decomposing work |
+ | `developer` | opus | writable | Implementing code changes, bug fixes, new features |
+ | `fixer` | opus | writable | Fixing issues found by reviewer — targets root causes |
+ | `refiner` | opus | readonly | Evaluating ticket quality, splitting vague tickets |
+ | `reviewer` | sonnet | readonly | Thorough single-pass review: quality, standards, security, perf, and coverage. Challenges by default — blocks on ≥1 CRITICAL or ≥3 WARNINGs |
+ | `scout` | opus | readonly | Autonomous codebase explorer. Deep-dives into a repo to surface bugs, improvements, security issues, and tech debt. Creates decisions for the user |
+
+ ## Agent Output Contracts
+
+ Each agent outputs structured JSON. Parse these to decide next actions.
+
+ ### architect → `design` + `milestones[].tasks[]`
+
+ React to: create sub-tickets from `milestones[].tasks[]`, dispatch `developer` for each (respecting `depends_on` order).
+
+ ### developer → `status` + `PR_URL`
+
+ React to:
+ - `status: "completed"` + `PR_URL` → extract PR number, set ticket to CI pending, check CI at next heartbeat
+ - `status: "completed"` without PR → mark ticket done
+ - `status: "failed"` or `"escalated"` → mark ticket abandoned, log reason
+
+ ### reviewer → `verdict` + `issues[]`
+
+ The reviewer challenges by default. It blocks on any CRITICAL issue or ≥3 WARNINGs.
+ Expect `CHANGES_REQUESTED` more often than `APPROVED` — this is intentional.
+
+ React to:
+ - `verdict: "APPROVED"` → mark ticket done
+ - `verdict: "CHANGES_REQUESTED"` → check anti-loop guard, dispatch `fixer` with issues (include severity — fixer should prioritize CRITICALs first)
+
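The blocking threshold stated above (any CRITICAL issue, or three or more WARNINGs) can be sketched as a small check. This is illustrative only: the `severity` key and the literal values `"CRITICAL"` and `"WARNING"` are assumed from the prose, since the exact shape of each `issues[]` entry is not shown in this diff, and the verdict itself is produced by the reviewer agent rather than by code like this.

```python
def expected_verdict(issues: list) -> str:
    """Return the verdict implied by the documented blocking threshold."""
    critical = sum(1 for issue in issues if issue.get("severity") == "CRITICAL")
    warnings = sum(1 for issue in issues if issue.get("severity") == "WARNING")
    # Block on any CRITICAL issue or on three or more WARNINGs
    if critical >= 1 or warnings >= 3:
        return "CHANGES_REQUESTED"
    return "APPROVED"


def order_for_fixer(issues: list) -> list:
    """Put CRITICAL issues first, keeping the original order otherwise."""
    return sorted(issues, key=lambda issue: issue.get("severity") != "CRITICAL")


print(expected_verdict([{"severity": "WARNING"}, {"severity": "WARNING"}]))
```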
+ ### fixer → `status` + `issues_fixed[]`
+
+ React to:
+ - `status: "FIXED"` → set ticket to review, re-dispatch `reviewer`
+ - `status: "PARTIAL"` or `"ESCALATED"` → evaluate remaining issues, escalate if needed
+
+ ### refiner → `action` + `score`
+
+ React to:
+ - `action: "pass_through"` → dispatch `developer` with enriched context
+ - `action: "decompose"` → create sub-tickets from `sub_tickets[]`, dispatch in order
+ - `action: "escalate"` → mark ticket blocked, log questions
+
+ ### scout → `findings[]` + `decisions_created`
+
+ React to:
+ - Parse `findings[]` — each has `severity`, `category`, `suggestion`, and optional `decision_id`
+ - CRITICAL findings with `decision_id` → wait for user decision before acting
+ - HIGH findings with `decision_id` → wait for user decision before acting
+ - User answers "yes" on a decision → route the finding as a ticket (dispatch `developer` or `architect` based on `effort`)
+ - User answers "later" → backlog the finding
+ - User answers "no" → discard
+ - MEDIUM/LOW findings (no decisions created) → log for reference, no action needed
+
+ ## Dispatch — `--meta` fields
+
+ Use `--meta` for traceability and idempotency:
+
+ | Field | Required | Description |
+ |-------|----------|-------------|
+ | `ticketId` | always | Source ticket identifier for traceability |
+ | `stage` | always | Pipeline stage: `refine`, `develop`, `review`, `fix` |
+ | `prNumber` | if exists | GitHub PR number |
+ | `cycle` | fix stage | Fixer→review cycle count (anti-loop tracking) |
+ | `parentTicketId` | sub-tickets | Parent ticket ID for decomposed work |
+
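As an illustration of the table above, the following sketch assembles the JSON string passed to `--meta`. The helper `build_meta` and its validation are hypothetical and not part of the `neo` CLI. It covers only the four stages listed in the table; the scout dispatches elsewhere in this file use `"stage":"scout"` without a `ticketId`, which this sketch deliberately does not model.

```python
import json
from typing import Optional

STAGES = {"refine", "develop", "review", "fix"}


def build_meta(ticket_id: str, stage: str, pr_number: Optional[int] = None,
               cycle: Optional[int] = None, parent_ticket_id: Optional[str] = None) -> str:
    """Build the --meta JSON payload from the documented fields."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    if stage == "fix" and cycle is None:
        raise ValueError("cycle is required for the fix stage")
    meta = {"ticketId": ticket_id, "stage": stage}
    if pr_number is not None:
        meta["prNumber"] = pr_number
    if cycle is not None:
        meta["cycle"] = cycle
    if parent_ticket_id is not None:
        meta["parentTicketId"] = parent_ticket_id
    # Compact separators keep the payload short on the command line
    return json.dumps(meta, separators=(",", ":"))


print(build_meta("PROJ-42", "fix", pr_number=73, cycle=1))
```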
+ ### Branch & PR lifecycle
+
+ - `--branch` is **required for all agents**. Every session runs in an isolated clone on that branch.
+ - **develop**: pass `--branch feat/PROJ-42-description` to name the working branch.
+ - **review/fix**: pass the same `--branch` and `prNumber` in `--meta`.
+ - On developer completion: extract `branch` and `prNumber` from `neo runs <runId>`, carry forward.
+
+ ### Prompt writing
+
+ The `--prompt` is the agent's only context. It must be self-contained:
+
+ - **develop**: task description + acceptance criteria + instruction to create branch and PR
+ - **review**: PR number + branch name + what to review
+ - **fix**: PR number + branch name + specific issues to fix + instruction to push to existing branch
+ - **refine**: ticket title + description + any existing criteria
+ - **architect**: feature description + constraints + scope
+
+ ### Examples
+
+ ```bash
+ # develop
+ neo run developer --prompt "Implement user auth flow. Criteria: login with email/password, JWT tokens, refresh flow. Open a PR when done." \
+   --repo /path/to/repo \
+   --branch feat/PROJ-42-add-auth \
+   --meta '{"ticketId":"PROJ-42","stage":"develop"}'
+
+ # review
+ neo run reviewer --prompt "Review PR #73 on branch feat/PROJ-42-add-auth." \
+   --repo /path/to/repo \
+   --branch feat/PROJ-42-add-auth \
+   --meta '{"ticketId":"PROJ-42","stage":"review","prNumber":73}'
+
+ # fix
+ neo run fixer --prompt "Fix issues from review on PR #73: missing input validation on login endpoint. Push fixes to the existing branch." \
+   --repo /path/to/repo \
+   --branch feat/PROJ-42-add-auth \
+   --meta '{"ticketId":"PROJ-42","stage":"fix","prNumber":73,"cycle":1}'
+
+ # architect
+ neo run architect --prompt "Design decomposition for multi-tenant auth system" \
+   --repo /path/to/repo \
+   --branch feat/PROJ-99-multi-tenant-auth \
+   --meta '{"ticketId":"PROJ-99","stage":"refine"}'
+
+ # scout
+ neo run scout --prompt "Explore this repository and surface bugs, improvements, security issues, and tech debt. Create decisions for critical and high-impact findings." \
+   --repo /path/to/repo \
+   --branch main \
+   --meta '{"stage":"scout"}'
+ ```
+
+ ## Protocol
+
+ ### 1. Ticket Pickup
+
+ 1. Query your tracker for ready tickets, sorted by priority.
+ 2. Check capacity: `neo runs --short` and `neo cost --short`.
+ 3. For each ticket (up to capacity):
+    a. Read full ticket details.
+    b. Self-evaluate missing fields (see below).
+    c. Resolve target repository.
+    d. Route the ticket.
+    e. Update tracker → in progress.
+    f. **Yield.** Completion arrives at a future heartbeat.
+
+ ### 2. Routing
+
+ | Condition | Action |
+ |-----------|--------|
+ | Bug + critical priority | Dispatch `developer` directly (hotfix) |
+ | Clear criteria + small scope (< 5 points) | Dispatch `developer` |
+ | Complexity ≥ 5 | Dispatch `architect` first |
+ | Unclear criteria or vague scope | Dispatch `refiner` |
+ | Proactive exploration / no specific ticket | Dispatch `scout` on target repo |
+
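One way to read the routing table is as an ordered set of checks. The table does not say which row wins when several apply (for example, a large ticket with unclear criteria), so the precedence below is an assumption: hotfix first, then unclear criteria, then complexity. The function name `route` and its parameters are hypothetical, and the scout row is left out because it applies when there is no ticket at all.

```python
def route(ticket_type: str, priority: str, complexity: int, criteria_clear: bool) -> str:
    """Pick the first agent to dispatch for a ticket (assumed precedence)."""
    if ticket_type == "bug" and priority == "critical":
        return "developer"  # hotfix path, dispatched directly
    if not criteria_clear:
        return "refiner"    # unclear criteria or vague scope
    if complexity >= 5:
        return "architect"  # design and decomposition first
    return "developer"      # clear criteria and small scope (< 5 points)


print(route("feature", "medium", 8, True))
```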
+ ### 3. On Refiner Completion
+
+ Parse the refiner's JSON output:
+ - `action: "pass_through"` → dispatch `developer` with `enriched_context`
+ - `action: "decompose"` → create sub-tickets from `sub_tickets[]`, dispatch `developer` for each
+ - `action: "escalate"` → update tracker → blocked
+
+ ### 4. On Developer/Fixer Completion — with PR
+
+ 1. Parse output for `PR_URL`, extract PR number.
+ 2. Update tracker → CI pending.
+ 3. Check CI: `gh pr checks <prNumber> --repo <repository>`.
+ 4. CI passed → update tracker → in review, dispatch `reviewer`.
+ 5. CI failed → update tracker → fixing, dispatch `fixer` with CI error context.
+ 6. CI pending → note in focus, check at next heartbeat.
+
+ ### 5. On Developer/Fixer Completion — no PR
+
+ Update tracker → done.
+
+ ### 6. On Review Completion
+
+ Parse reviewer's JSON output:
+ - `verdict: "APPROVED"` → update tracker → done.
+ - `verdict: "CHANGES_REQUESTED"` → check anti-loop guard → dispatch `fixer` with `issues[]`, or escalate.
+
+ ### 7. On Fixer Completion
+
+ Parse fixer's JSON output:
+ - `status: "FIXED"` → update tracker → in review, re-dispatch `reviewer`.
+ - `status: "ESCALATED"` → update tracker → blocked.
+
+ ### 8. On Scout Completion
+
+ Parse scout's JSON output:
+ - For each finding with `decision_id`: wait for user decision at future heartbeat.
+ - User answers "yes" on a decision:
+   - `effort: "XS" | "S"` → dispatch `developer` with finding as ticket
+   - `effort: "M" | "L"` → dispatch `architect` for design first
+ - User answers "later" → log to backlog, no dispatch
+ - User answers "no" → discard finding, no action
+ - Log `health_score` and `strengths` for project context.
+
+ ### 9. On Agent Failure
+
+ Update tracker → abandoned. Log the failure reason.
+
+ ## Pipeline State Machine
+
+ ```
+ ready → in progress → ci pending → in review → done
+   │                       │            │
+   │                       │ failure    │ changes requested
+   │                       ▼            ▼
+   │                    fixing ◄────────┘
+   │                       │
+   │                       └──→ in review (re-review)
+   │
+   └──→ blocked (escalation/budget/anti-loop)
+   └──→ abandoned (terminal failure)
+ ```
+
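The diagram can be encoded as a transition table, which makes it easy to reject an invalid tracker update. This is a sketch under stated assumptions: the `in progress` to `done` edge comes from Protocol §5 (completion without a PR) rather than from the diagram, `blocked` and `abandoned` are treated as reachable from every non-final state, and `blocked` is given no outgoing edges because the source does not describe how a blocked ticket resumes.

```python
TRANSITIONS = {
    "ready": {"in progress"},
    "in progress": {"ci pending", "done"},
    "ci pending": {"in review", "fixing"},  # CI passed / CI failure
    "in review": {"done", "fixing"},        # approved / changes requested
    "fixing": {"in review"},                # re-review after a fix
}
# Assumption: escalation or terminal failure can happen from any active state
EXITS = {"blocked", "abandoned"}


def can_transition(current: str, target: str) -> bool:
    """Return True when the tracker may move a ticket from `current` to `target`."""
    if current not in TRANSITIONS:
        return False  # done, blocked and abandoned have no outgoing edges here
    return target in TRANSITIONS[current] or target in EXITS


print(can_transition("ci pending", "fixing"))
```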
+ ## Self-Evaluation (Missing Ticket Fields)
+
+ Infer missing fields before routing:
+
+ **Type:**
+ - "crash", "error", "broken", "fix", "regression" → `bug`
+ - "add", "create", "implement", "build", "new" → `feature`
+ - "refactor", "clean", "improve", "optimize" → `chore`
+ - Unclear → `feature`
+
+ **Complexity (Fibonacci):**
+ - 1: typo, config, single-line — 2: single file, <50 lines — 3: 2-3 files (default)
+ - 5+: triggers architect first — 8: 5-8 files — 13: large feature — 21+: major
+
+ **Criteria** (when unset):
+ - Bugs: "The bug described in the title is fixed and does not regress"
+ - Features: derive from title
+ - Chores: "Code is cleaned up without breaking existing behavior"
+
+ **Priority** (when unset): `medium`
+
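The inference rules above can be approximated with a keyword lookup. In practice the supervisor is a language model applying these rules with judgment, so the following is only a rough sketch. Whole-word matching, the bug-then-feature-then-chore precedence, the field names on the ticket dict, and reusing the title as the criteria for features are all assumptions made for illustration.

```python
import re

TYPE_KEYWORDS = [
    ("bug", {"crash", "error", "broken", "fix", "regression"}),
    ("feature", {"add", "create", "implement", "build", "new"}),
    ("chore", {"refactor", "clean", "improve", "optimize"}),
]
DEFAULT_CRITERIA = {
    "bug": "The bug described in the title is fixed and does not regress",
    "chore": "Code is cleaned up without breaking existing behavior",
}


def infer_type(title: str) -> str:
    """Classify a ticket title; anything unclear falls back to `feature`."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    for ticket_type, keywords in TYPE_KEYWORDS:
        if words & keywords:
            return ticket_type
    return "feature"


def fill_missing_fields(ticket: dict) -> dict:
    """Return a copy of the ticket with the documented defaults applied."""
    filled = dict(ticket)
    title = filled.get("title", "")
    filled.setdefault("type", infer_type(title))
    filled.setdefault("complexity", 3)        # documented default
    filled.setdefault("priority", "medium")   # documented default
    # Features derive criteria from the title; using it verbatim is a placeholder
    filled.setdefault("criteria", DEFAULT_CRITERIA.get(filled["type"], title))
    return filled


print(fill_missing_fields({"title": "Refactor auth module"}))
```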
+ ## Idle Behavior — Scout Dispatch
+
+ When the supervisor has **no events, no active runs, and no pending tasks**, it enters idle mode.
+
+ Instead of doing nothing, dispatch a `scout` agent to proactively explore a repository:
+
+ 1. **Check preconditions:**
+    - Budget remaining > 10% — do not scout if budget is tight
+    - No pending decisions from a previous scout — wait for user to answer before scouting again
+    - No active runs — scout only when truly idle
+
+ 2. **Pick a repo:**
+    - Choose the repo least recently scouted (check memory for previous `scout` runs)
+    - If no scout has ever run, pick the first configured repo
+    - Rotate across repos over time — do not scout the same repo twice in a row
+
+ 3. **Dispatch:**
+    ```bash
+    neo log decision "Idle — dispatching scout on <repo-name>"
+    neo run scout --prompt "Explore this repository. Surface bugs, improvements, security issues, and tech debt. Create decisions for critical and high-impact findings." \
+      --repo <path> \
+      --branch <default-branch> \
+      --meta '{"stage":"scout","label":"scout-<repo-name>"}'
+    ```
+
+ 4. **On scout completion** (see Protocol §8):
+    - Read the output with `neo runs <runId>`
+    - The scout has already created decisions via `neo decision create`
+    - Log the `health_score` and finding count as a fact
+    - Wait for user to answer decisions at future heartbeats
+
+ 5. **Frequency guard:**
+    - Max ONE scout per repo per 24h — do not re-scout a repo that was scouted today
+    - Write a fact after each scout: `neo memory write --type fact --scope <repo> "Last scouted: <date>, health: <score>/10, <N> findings"`
+
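The preconditions, repo selection, and 24-hour frequency guard can be combined into one decision function. This is a hedged sketch: `pick_scout_repo` and its parameters are hypothetical, the budget is assumed to be a fraction between 0 and 1, and preferring any never-scouted repo (in configured order) is a generalization of the "first configured repo" rule. How the supervisor actually reads these values from memory is not covered by this diff.

```python
from datetime import datetime, timedelta
from typing import Optional


def pick_scout_repo(repos: list, last_scouted: dict, now: datetime,
                    budget_remaining: float, pending_decisions: int,
                    active_runs: int) -> Optional[str]:
    """Return the repo to scout next, or None when scouting should be skipped."""
    # Preconditions: enough budget, no unanswered decisions, truly idle
    if budget_remaining <= 0.10 or pending_decisions > 0 or active_runs > 0:
        return None
    if not repos:
        return None
    # Repos that were never scouted come first, in configured order
    for repo in repos:
        if repo not in last_scouted:
            return repo
    # Otherwise take the least recently scouted repo
    candidate = min(repos, key=lambda repo: last_scouted[repo])
    # Frequency guard: at most one scout per repo per 24 hours
    if now - last_scouted[candidate] < timedelta(hours=24):
        return None
    return candidate


now = datetime(2025, 1, 10, 12, 0)
print(pick_scout_repo(["api", "web"], {"api": now - timedelta(days=2)}, now, 0.5, 0, 0))
```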
+ ## Safety Guards
+
+ ### Anti-Loop Guard
+ - Max **6** fixer→review cycles per ticket.
+ - At limit: escalate. Do NOT dispatch again.
+
+ ### Escalation Policy
+ - If fixer reports `status: "ESCALATED"` or fails **3× on the same error type**: escalate immediately.
+ - Do NOT attempt a 4th variant.
+
+ ### Budget Enforcement
+ - Check `neo cost --short` before every dispatch.
+ - Never dispatch if budget would be exceeded.
+
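The anti-loop guard and escalation policy reduce to a single check before each fixer dispatch. The sketch below assumes `cycles_completed` counts finished fixer→review cycles for the ticket (the `cycle` value carried in `--meta`) and that the caller tracks how many times the fixer failed on the same error type; the function and parameter names are hypothetical. Budget enforcement is left out because it depends on the output of `neo cost --short`, whose format is not shown here.

```python
MAX_FIX_CYCLES = 6        # anti-loop guard
MAX_SAME_ERROR_FAILS = 3  # escalation policy


def should_escalate(cycles_completed: int, fixer_status: str,
                    same_error_failures: int) -> bool:
    """Return True when the ticket must be escalated instead of re-dispatched."""
    if cycles_completed >= MAX_FIX_CYCLES:
        return True  # at the limit: do not dispatch again
    if fixer_status == "ESCALATED":
        return True  # the fixer itself gave up
    # Three failures on the same error type: no fourth variant
    return same_error_failures >= MAX_SAME_ERROR_FAILS


print(should_escalate(cycles_completed=2, fixer_status="FIXED", same_error_failures=0))
```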
@@ -1,5 +1,5 @@
  name: developer
- description: "Implementation worker. Executes atomic tasks from specs in isolated worktrees. Follows strict scope discipline."
+ description: "Implementation worker. Executes atomic tasks from specs in isolated clones. Follows strict scope discipline."
  model: opus
  tools:
  - Read
@@ -0,0 +1,10 @@
+ name: reviewer
+ description: "Thorough single-pass code reviewer. Covers quality, standards, security, performance, and test coverage. Challenges code by default — approves only when standards are met."
+ model: sonnet
+ tools:
+ - Read
+ - Glob
+ - Grep
+ - Bash
+ sandbox: readonly
+ prompt: ../prompts/reviewer.md
@@ -0,0 +1,12 @@
+ name: scout
+ description: "Autonomous codebase explorer. Deep-dives into a repository to surface bugs, improvements, security issues, tech debt, and optimization opportunities. Produces actionable decisions for the supervisor."
+ model: opus
+ tools:
+ - Read
+ - Glob
+ - Grep
+ - Bash
+ - WebSearch
+ - WebFetch
+ sandbox: readonly
+ prompt: ../prompts/scout.md
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@neotx/agents",
- "version": "0.1.0-alpha.2",
+ "version": "0.1.0-alpha.21",
  "description": "Built-in agent definitions and prompts for @neotx/core",
  "type": "module",
  "license": "MIT",
@@ -12,7 +12,8 @@
  "files": [
  "agents",
  "prompts",
- "workflows"
+ "SUPERVISOR.md",
+ "GUIDE.md"
  ],
  "keywords": [
  "ai-agents",