@neotx/agents 0.1.0-alpha.2 → 0.1.0-alpha.4

package/README.md ADDED
@@ -0,0 +1,266 @@
1
+ # @neotx/agents
2
+
3
+ Built-in agent definitions and workflow templates for `@neotx/core`.
4
+
5
+ This package contains YAML configuration files and Markdown prompts that define the 9 built-in agents and 5 workflows used by the Neo orchestrator. It's a data package — no TypeScript, no runtime code.
6
+
7
+ ## Contents
8
+
9
+ ```
10
+ packages/agents/
11
+ ├── agents/             # Agent YAML definitions
12
+ │   ├── architect.yml
13
+ │   ├── developer.yml
14
+ │   ├── fixer.yml
15
+ │   ├── refiner.yml
16
+ │   ├── reviewer-coverage.yml
17
+ │   ├── reviewer-perf.yml
18
+ │   ├── reviewer-quality.yml
19
+ │   ├── reviewer-security.yml
20
+ │   └── reviewer.yml
21
+ ├── prompts/            # Markdown system prompts
22
+ │   └── *.md
23
+ └── workflows/          # Workflow YAML definitions
24
+     ├── feature.yml
25
+     ├── hotfix.yml
26
+     ├── refine.yml
27
+     ├── review-fast.yml
28
+     └── review.yml
29
+ ```
30
+
31
+ ## Built-in Agents
32
+
33
+ | Agent | Model | Sandbox | Tools | Role |
34
+ |-------|-------|---------|-------|------|
35
+ | **architect** | opus | readonly | Read, Glob, Grep, WebSearch, WebFetch | Strategic planner. Analyzes features, designs architecture, decomposes work into atomic tasks. Never writes code. |
36
+ | **developer** | opus | writable | Read, Write, Edit, Bash, Glob, Grep | Implementation worker. Executes atomic tasks from specs in isolated clones. |
37
+ | **fixer** | opus | writable | Read, Write, Edit, Bash, Glob, Grep | Auto-correction agent. Fixes issues found by reviewers. Targets root causes, not symptoms. |
38
+ | **refiner** | opus | readonly | Read, Glob, Grep, WebSearch, WebFetch | Ticket quality evaluator. Assesses clarity and splits vague tickets into precise sub-tickets. |
39
+ | **reviewer-quality** | sonnet | readonly | Read, Glob, Grep, Bash | Code quality reviewer. Catches bugs and DRY violations. Approves by default. |
40
+ | **reviewer-security** | opus | readonly | Read, Glob, Grep, Bash | Security auditor. Flags directly exploitable vulnerabilities. Approves by default. |
41
+ | **reviewer-perf** | sonnet | readonly | Read, Glob, Grep, Bash | Performance reviewer. Flags N+1 queries and O(n²) on unbounded data. Approves by default. |
42
+ | **reviewer-coverage** | sonnet | readonly | Read, Glob, Grep, Bash | Test coverage reviewer. Recommends missing tests. Never blocks merge. |
43
+ | **reviewer** | sonnet | readonly | Read, Glob, Grep, Bash | Single-pass unified reviewer. Covers all 4 lenses in one sweep. Lightweight alternative to parallel review. |
44
+
45
+ ### Sandbox Modes
46
+
47
+ - **readonly**: Agent can read files but cannot write. Safe for analysis tasks.
48
+ - **writable**: Agent can read and write files. Used for implementation and fixes.
49
+
50
+ ### Model Selection
51
+
52
+ - **opus**: Used for complex reasoning (architecture, security, implementation)
53
+ - **sonnet**: Used for focused review tasks (quality, performance, coverage)
54
+
55
+ ## Built-in Workflows
56
+
57
+ ### feature
58
+
59
+ Full development cycle: plan, implement, review, and fix.
60
+
61
+ ```yaml
62
+ steps:
63
+   plan:
64
+     agent: architect
65
+     sandbox: readonly
66
+   implement:
67
+     agent: developer
68
+     dependsOn: [plan]
69
+   review:
70
+     agent: reviewer-quality
71
+     dependsOn: [implement]
72
+     sandbox: readonly
73
+   fix:
74
+     agent: fixer
75
+     dependsOn: [review]
76
+     condition: "output(review).hasIssues == true"
77
+ ```
78
+
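One way to read the `dependsOn` edges above is as a dependency graph that fixes the execution order. The sketch below is only an illustration in Python: this package ships no runtime code, and the real scheduling lives in `@neotx/core`. `FEATURE_STEPS`, `execution_order`, and `should_run_fix` are hypothetical names, and the shape of the review output is assumed from the `condition` expression.

```python
from graphlib import TopologicalSorter

# Step graph copied from the feature workflow: step -> steps it depends on.
FEATURE_STEPS = {
    "plan": [],
    "implement": ["plan"],
    "review": ["implement"],
    "fix": ["review"],
}


def execution_order(steps):
    """Return the steps in an order that respects every dependsOn edge."""
    return list(TopologicalSorter(steps).static_order())


def should_run_fix(review_output):
    """Mirror the condition `output(review).hasIssues == true` (assumed shape)."""
    return review_output.get("hasIssues") is True
```

Because the feature workflow is a single chain, only one order is valid; a workflow with independent steps could run them in parallel.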
79
+ ### review
80
+
81
+ Parallel 4-lens code review. All reviewers run concurrently.
82
+
83
+ ```yaml
84
+ steps:
85
+   quality:
86
+     agent: reviewer-quality
87
+     sandbox: readonly
88
+   security:
89
+     agent: reviewer-security
90
+     sandbox: readonly
91
+   perf:
92
+     agent: reviewer-perf
93
+     sandbox: readonly
94
+   coverage:
95
+     agent: reviewer-coverage
96
+     sandbox: readonly
97
+ ```
98
+
99
+ ### review-fast
100
+
101
+ Single-pass lightweight review. One agent covers all 4 lenses — ideal for small PRs or budget-constrained runs.
102
+
103
+ ```yaml
104
+ steps:
105
+   review:
106
+     agent: reviewer
107
+     sandbox: readonly
108
+ ```
109
+
110
+ ### hotfix
111
+
112
+ Fast-track single-agent implementation. Skips planning for urgent fixes.
113
+
114
+ ```yaml
115
+ steps:
116
+   implement:
117
+     agent: developer
118
+ ```
119
+
120
+ ### refine
121
+
122
+ Ticket evaluation and decomposition for backlog grooming.
123
+
124
+ ```yaml
125
+ steps:
126
+   evaluate:
127
+     agent: refiner
128
+     sandbox: readonly
129
+ ```
130
+
131
+ ## Creating Custom Agents
132
+
133
+ Custom agents are defined in `.neo/agents/` in your project. You can create entirely new agents or extend built-in ones.
134
+
135
+ ### Agent YAML Schema
136
+
137
+ ```yaml
138
+ name: my-agent # Required: unique identifier
139
+ description: "What this agent does" # Required for custom agents
140
+ model: opus | sonnet | haiku # Required for custom agents
141
+ tools: # Required for custom agents
142
+ - Read
143
+ - Write
144
+ - Edit
145
+ - Bash
146
+ - Glob
147
+ - Grep
148
+ - WebSearch
149
+ - WebFetch
150
+ sandbox: writable | readonly # Required for custom agents
151
+ prompt: ../prompts/my-agent.md # Path to system prompt (relative to YAML)
152
+ ```
153
+
154
+ ### Extending Built-in Agents
155
+
156
+ Use `extends` to inherit from a built-in agent and override specific fields:
157
+
158
+ ```yaml
159
+ name: my-developer
160
+ extends: developer
161
+ model: sonnet # Override: use sonnet instead of opus
162
+ promptAppend: |
163
+   ## Additional Instructions
164
+   Always write tests before implementation.
165
+ ```
166
+
167
+ When extending:
168
+ - Unspecified fields inherit from the base agent
169
+ - `prompt` replaces the base prompt entirely
170
+ - `promptAppend` appends to the inherited prompt
171
+
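The three inheritance rules above can be illustrated with a small merge function. This is a sketch under assumptions, not the `@neotx/core` implementation: `merge_agent` is a hypothetical name, prompts are modelled as plain text rather than file paths, and the behaviour when both `prompt` and `promptAppend` are set (append to the new prompt) is a guess the README does not confirm.

```python
def merge_agent(base, override):
    """Merge an extending agent definition over its base agent."""
    merged = dict(base)  # unspecified fields inherit from the base agent
    for key, value in override.items():
        if key in ("extends", "promptAppend"):
            continue  # directives, not fields of the resolved agent
        merged[key] = value  # includes `prompt`, which replaces the base prompt
    if "promptAppend" in override:
        # promptAppend is added after whichever prompt is in effect
        merged["prompt"] = merged["prompt"] + "\n\n" + override["promptAppend"]
    return merged
```

For example, merging `{"extends": "developer", "model": "sonnet"}` over the built-in `developer` would keep its tools and sandbox and change only the model.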
172
+ ### The `$inherited` Token
173
+
174
+ When extending an agent, you can add tools while keeping the inherited ones:
175
+
176
+ ```yaml
177
+ name: research-developer
178
+ extends: developer
179
+ tools:
180
+ - $inherited # Keep all tools from developer
181
+ - WebSearch # Add web search capability
182
+ - WebFetch # Add web fetch capability
183
+ ```
184
+
185
+ Without `$inherited`, the tools list replaces the base entirely:
186
+
187
+ ```yaml
188
+ name: minimal-developer
189
+ extends: developer
190
+ tools:
191
+ - Read # Only these tools, not the inherited ones
192
+ - Edit
193
+ ```
194
+
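The two cases above (with and without `$inherited`) can be sketched as a list expansion. The function name `resolve_tools` and the de-duplication of repeated tools are assumptions made for illustration; the README only states that the token keeps the base tools and that a list without it replaces them.

```python
INHERITED = "$inherited"


def resolve_tools(base_tools, declared):
    """Expand `$inherited` in a tools list; without it the list replaces the base."""
    if declared is None:
        return list(base_tools)  # no `tools:` key, so inherit everything
    resolved = []
    for tool in declared:
        candidates = base_tools if tool == INHERITED else [tool]
        for candidate in candidates:
            if candidate not in resolved:  # assumption: duplicates are dropped
                resolved.append(candidate)
    return resolved
```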
195
+ ### Implicit Extension
196
+
197
+ If your custom agent has the same name as a built-in, it implicitly extends it:
198
+
199
+ ```yaml
200
+ # .neo/agents/developer.yml
201
+ # No "extends:" needed — same name implies extends: developer
202
+ name: developer
203
+ model: sonnet # Override model
204
+ promptAppend: |
205
+   Use the project's existing patterns.
206
+ ```
207
+
208
+ ## Prompts
209
+
210
+ Each agent has a corresponding Markdown prompt in `prompts/`. The prompt defines:
211
+
212
+ - The agent's role and responsibilities
213
+ - Workflow and execution protocol
214
+ - Output format expectations
215
+ - Hard rules and constraints
216
+ - Escalation conditions
217
+
218
+ ### Prompt Structure
219
+
220
+ Prompts follow a consistent structure:
221
+
222
+ ```markdown
223
+ # Agent Name
224
+
225
+ One-sentence role definition.
226
+
227
+ ## Protocol
228
+ Step-by-step execution protocol.
229
+
230
+ ## Output
231
+ Expected JSON structure for agent output.
232
+
233
+ ## Escalation
234
+ When to stop and report to the dispatcher.
235
+
236
+ ## Rules
237
+ Non-negotiable constraints.
238
+ ```
239
+
240
+ Runtime metadata (hooks, skills, memory, isolation) is injected by `@neotx/core`; it is not written in the prompt.
241
+
242
+ ### Referencing Prompts
243
+
244
+ In agent YAML, reference prompts with a relative path:
245
+
246
+ ```yaml
247
+ prompt: ../prompts/architect.md
248
+ ```
249
+
250
+ The path is resolved relative to the YAML file's directory.
251
+
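As a rough illustration of that resolution rule, the sketch below joins the `prompt` value onto the directory of the YAML file and normalises the result. `resolve_prompt_path` is a hypothetical helper, and POSIX-style paths are assumed; how `@neotx/core` actually resolves the path is not shown in this package.

```python
import posixpath


def resolve_prompt_path(yaml_file, prompt):
    """Resolve `prompt` relative to the directory that contains `yaml_file`."""
    yaml_dir = posixpath.dirname(yaml_file)
    return posixpath.normpath(posixpath.join(yaml_dir, prompt))
```

With the layout from the Contents section, `../prompts/architect.md` referenced from `packages/agents/agents/architect.yml` would land in `packages/agents/prompts/`.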
252
+ ## How @neotx/core Uses This Package
253
+
254
+ The `@neotx/core` orchestrator:
255
+
256
+ 1. Loads all YAML files from `packages/agents/agents/` as built-in agents
257
+ 2. Loads all YAML files from `.neo/agents/` as custom agents
258
+ 3. Resolves extensions and merges configurations
259
+ 4. Reads and injects prompts into agent sessions
260
+ 5. Loads workflows from `packages/agents/workflows/` and `.neo/workflows/`
261
+
262
+ Custom agents in `.neo/agents/` override or extend the built-ins from this package.
263
+
264
+ ## License
265
+
266
+ MIT
package/SUPERVISOR.md ADDED
@@ -0,0 +1,258 @@
1
+ # Supervisor — Domain Knowledge
2
+
3
+ This file contains domain-specific knowledge for the supervisor. Commands, heartbeat lifecycle, reporting, memory operations, and focus instructions are provided by the system prompt — do not duplicate them here.
4
+
5
+ ## Mindset
6
+
7
+ - **Action-driven.** Dispatch actions, update state, yield. Never poll or wait.
8
+ - **Event-reactive.** Run completions arrive as events at your next heartbeat. React then.
9
+ - **Single source of truth.** All ticket state lives in your tracker. Query before acting, update immediately.
10
+
11
+ ## Available Agents
12
+
13
+ | Agent | Model | Mode | Use when |
14
+ |-------|-------|------|----------|
15
+ | `architect` | opus | readonly | Designing systems, planning features, decomposing work |
16
+ | `developer` | opus | writable | Implementing code changes, bug fixes, new features |
17
+ | `fixer` | opus | writable | Fixing issues found by reviewer — targets root causes |
18
+ | `refiner` | opus | readonly | Evaluating ticket quality, splitting vague tickets |
19
+ | `reviewer` | sonnet | readonly | Thorough single-pass review: quality, standards, security, perf, and coverage. Challenges by default — blocks on ≥1 CRITICAL or ≥3 WARNINGs |
20
+
21
+ ## Agent Output Contracts
22
+
23
+ Each agent outputs structured JSON. Parse these to decide next actions.
24
+
25
+ ### architect → `design` + `milestones[].tasks[]`
26
+
27
+ React to: create sub-tickets from `milestones[].tasks[]`, dispatch `developer` for each (respecting `depends_on` order).
28
+
29
+ ### developer → `status` + `PR_URL`
30
+
31
+ React to:
32
+ - `status: "completed"` + `PR_URL` → extract PR number, set ticket to CI pending, check CI at next heartbeat
33
+ - `status: "completed"` without PR → mark ticket done
34
+ - `status: "failed"` or `"escalated"` → mark ticket abandoned, log reason
35
+
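The developer contract above could be handled along these lines. This is a sketch only: `react_to_developer` is a hypothetical name, the `reason` key is an assumed field, and the PR number is pulled from a GitHub-style `/pull/<number>` URL, which the document implies but does not spell out.

```python
import re


def react_to_developer(output):
    """Map a developer agent's JSON output to the next tracker state."""
    status = output.get("status")
    if status == "completed":
        pr_url = output.get("PR_URL")
        if not pr_url:
            return {"ticket_state": "done"}
        # Assumption: the URL ends in /pull/<number>, as GitHub PR URLs do.
        match = re.search(r"/pull/(\d+)", pr_url)
        pr_number = int(match.group(1)) if match else None
        return {"ticket_state": "ci pending", "prNumber": pr_number}
    if status in ("failed", "escalated"):
        # `reason` is a hypothetical field name for the failure reason to log.
        return {"ticket_state": "abandoned", "reason": output.get("reason")}
    raise ValueError(f"unexpected developer status: {status!r}")
```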
36
+ ### reviewer → `verdict` + `issues[]`
37
+
38
+ The reviewer challenges by default. It blocks on any CRITICAL issue or ≥3 WARNINGs.
39
+ Expect `CHANGES_REQUESTED` more often than `APPROVED` — this is intentional.
40
+
41
+ React to:
42
+ - `verdict: "APPROVED"` → mark ticket done
43
+ - `verdict: "CHANGES_REQUESTED"` → check anti-loop guard, dispatch `fixer` with issues (include severity — fixer should prioritize CRITICALs first)
44
+
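The blocking threshold and the "CRITICALs first" hint can be written down as follows. The sketch assumes each entry in `issues[]` carries a `severity` key with the values `CRITICAL` or `WARNING`; that field name and both function names are assumptions, since the reviewer's output schema is not included in this diff.

```python
def expected_verdict(issues):
    """Apply the stated rule: block on any CRITICAL or on three or more WARNINGs."""
    criticals = sum(1 for issue in issues if issue.get("severity") == "CRITICAL")
    warnings = sum(1 for issue in issues if issue.get("severity") == "WARNING")
    if criticals >= 1 or warnings >= 3:
        return "CHANGES_REQUESTED"
    return "APPROVED"


def order_for_fixer(issues):
    """Put CRITICAL issues first and keep the original order otherwise."""
    return sorted(issues, key=lambda issue: issue.get("severity") != "CRITICAL")
```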
45
+ ### fixer → `status` + `issues_fixed[]`
46
+
47
+ React to:
48
+ - `status: "FIXED"` → set ticket to review, re-dispatch `reviewer`
49
+ - `status: "PARTIAL"` or `"ESCALATED"` → evaluate remaining issues, escalate if needed
50
+
51
+ ### refiner → `action` + `score`
52
+
53
+ React to:
54
+ - `action: "pass_through"` → dispatch `developer` with enriched context
55
+ - `action: "decompose"` → create sub-tickets from `sub_tickets[]`, dispatch in order
56
+ - `action: "escalate"` → mark ticket blocked, log questions
57
+
58
+ ## Dispatch — `--meta` fields
59
+
60
+ Use `--meta` for traceability and idempotency:
61
+
62
+ | Field | Required | Description |
63
+ |-------|----------|-------------|
64
+ | `ticketId` | always | Source ticket identifier for traceability |
65
+ | `stage` | always | Pipeline stage: `refine`, `develop`, `review`, `fix` |
66
+ | `prNumber` | if exists | GitHub PR number |
67
+ | `cycle` | fix stage | Fixer→review cycle count (anti-loop tracking) |
68
+ | `parentTicketId` | sub-tickets | Parent ticket ID for decomposed work |
69
+
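A small helper can keep the `--meta` payload consistent with the table above. The sketch below is illustrative: `build_meta` and its parameter names are hypothetical, while the JSON keys and the four stage names are taken from the table.

```python
import json

STAGES = {"refine", "develop", "review", "fix"}


def build_meta(ticket_id, stage, pr_number=None, cycle=None, parent_ticket_id=None):
    """Build the JSON string passed to --meta, enforcing the required fields."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    if stage == "fix" and cycle is None:
        raise ValueError("cycle is required for the fix stage")
    meta = {"ticketId": ticket_id, "stage": stage}
    if pr_number is not None:
        meta["prNumber"] = pr_number
    if cycle is not None:
        meta["cycle"] = cycle
    if parent_ticket_id is not None:
        meta["parentTicketId"] = parent_ticket_id
    return json.dumps(meta, separators=(",", ":"))
```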
70
+ ### Branch & PR lifecycle
71
+
72
+ - `--branch` is **required for all agents**. Every session runs in an isolated clone on that branch.
73
+ - **develop**: pass `--branch feat/PROJ-42-description` to name the working branch.
74
+ - **review/fix**: pass the same `--branch` and `prNumber` in `--meta`.
75
+ - On developer completion: extract `branch` and `prNumber` from `neo runs <runId>` and carry both forward to the review and fix stages.
76
+
77
+ ### Prompt writing
78
+
79
+ The `--prompt` is the agent's only context. It must be self-contained:
80
+
81
+ - **develop**: task description + acceptance criteria + instruction to create branch and PR
82
+ - **review**: PR number + branch name + what to review
83
+ - **fix**: PR number + branch name + specific issues to fix + instruction to push to existing branch
84
+ - **refine**: ticket title + description + any existing criteria
85
+ - **architect**: feature description + constraints + scope
86
+
87
+ ### Examples
88
+
89
+ ```bash
90
+ # develop
91
+ neo run developer --prompt "Implement user auth flow. Criteria: login with email/password, JWT tokens, refresh flow. Open a PR when done." \
92
+ --repo /path/to/repo \
93
+ --branch feat/PROJ-42-add-auth \
94
+ --meta '{"ticketId":"PROJ-42","stage":"develop"}'
95
+
96
+ # review
97
+ neo run reviewer --prompt "Review PR #73 on branch feat/PROJ-42-add-auth." \
98
+ --repo /path/to/repo \
99
+ --branch feat/PROJ-42-add-auth \
100
+ --meta '{"ticketId":"PROJ-42","stage":"review","prNumber":73}'
101
+
102
+ # fix
103
+ neo run fixer --prompt "Fix issues from review on PR #73: missing input validation on login endpoint. Push fixes to the existing branch." \
104
+ --repo /path/to/repo \
105
+ --branch feat/PROJ-42-add-auth \
106
+ --meta '{"ticketId":"PROJ-42","stage":"fix","prNumber":73,"cycle":1}'
107
+
108
+ # architect
109
+ neo run architect --prompt "Design decomposition for multi-tenant auth system" \
110
+ --repo /path/to/repo \
111
+ --branch feat/PROJ-99-multi-tenant-auth \
112
+ --meta '{"ticketId":"PROJ-99","stage":"refine"}'
113
+ ```
114
+
115
+ ## Protocol
116
+
117
+ ### 1. Ticket Pickup
118
+
119
+ 1. Query your tracker for ready tickets, sorted by priority.
120
+ 2. Check capacity: `neo runs --short` and `neo cost --short`.
121
+ 3. For each ticket (up to capacity):
122
+    a. Read full ticket details.
123
+    b. Self-evaluate missing fields (see below).
124
+    c. Resolve target repository.
125
+    d. Route the ticket.
126
+    e. Update tracker → in progress.
127
+    f. **Yield.** Completion arrives at a future heartbeat.
128
+
129
+ ### 2. Routing
130
+
131
+ | Condition | Action |
132
+ |-----------|--------|
133
+ | Bug + critical priority | Dispatch `developer` directly (hotfix) |
134
+ | Clear criteria + small scope (< 5 points) | Dispatch `developer` |
135
+ | Complexity ≥ 5 | Dispatch `architect` first |
136
+ | Unclear criteria or vague scope | Dispatch `refiner` |
137
+
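The routing table above does not say which row wins when several match, so the sketch below picks one plausible precedence: hotfix first, then unclear tickets to the refiner (in line with the "Refiner first" rule), then the complexity threshold. The `route` function and the ticket keys `type`, `priority`, `clear_criteria`, and `complexity` are hypothetical.

```python
def route(ticket):
    """Return the agent to dispatch for a ticket (assumed precedence, see above)."""
    if ticket["type"] == "bug" and ticket["priority"] == "critical":
        return "developer"  # hotfix path, skips planning
    if not ticket["clear_criteria"]:
        return "refiner"
    if ticket["complexity"] >= 5:
        return "architect"
    return "developer"
```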
138
+ ### 3. On Refiner Completion
139
+
140
+ Parse the refiner's JSON output:
141
+ - `action: "pass_through"` → dispatch `developer` with `enriched_context`
142
+ - `action: "decompose"` → create sub-tickets from `sub_tickets[]`, dispatch `developer` for each
143
+ - `action: "escalate"` → update tracker → blocked
144
+
145
+ ### 4. On Developer/Fixer Completion — with PR
146
+
147
+ 1. Parse output for `PR_URL`, extract PR number.
148
+ 2. Update tracker → CI pending.
149
+ 3. Check CI: `gh pr checks <prNumber> --repo <repository>`.
150
+ 4. CI passed → update tracker → in review, dispatch `reviewer`.
151
+ 5. CI failed → update tracker → fixing, dispatch `fixer` with CI error context.
152
+ 6. CI pending → note in focus, check at next heartbeat.
153
+
154
+ ### 5. On Developer/Fixer Completion — no PR
155
+
156
+ Update tracker → done.
157
+
158
+ ### 6. On Review Completion
159
+
160
+ Parse reviewer's JSON output:
161
+ - `verdict: "APPROVED"` → update tracker → done.
162
+ - `verdict: "CHANGES_REQUESTED"` → check anti-loop guard → dispatch `fixer` with `issues[]`, or escalate.
163
+
164
+ ### 7. On Fixer Completion
165
+
166
+ Parse fixer's JSON output:
167
+ - `status: "FIXED"` → update tracker → in review, re-dispatch `reviewer`.
168
+ - `status: "ESCALATED"` → update tracker → blocked.
169
+
170
+ ### 8. On Agent Failure
171
+
172
+ Update tracker → abandoned. Log the failure reason.
173
+
174
+ ## Pipeline State Machine
175
+
176
+ ```
177
+ ready → in progress → ci pending → in review → done
178
+              │           │            │
179
+              │           │ failure    │ changes requested
180
+              │           ▼            ▼
181
+              │        fixing ◄────────┘
182
+              │           │
183
+              │           └──→ in review (re-review)
184
+
185
+              └──→ blocked (escalation/budget/anti-loop)
186
+              └──→ abandoned (terminal failure)
187
+ ```
188
+
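The same state machine can be expressed as a transition table, which makes illegal tracker updates easy to reject. This is a sketch: the forward transitions are taken from the protocol sections of this file, while treating `blocked` and `abandoned` as reachable from every non-final state, and as having no outgoing transitions, is an assumption. `TRANSITIONS`, `ESCAPE_STATES`, and `can_transition` are hypothetical names.

```python
# Transitions named in the protocol sections of this file.
TRANSITIONS = {
    "ready": {"in progress"},
    "in progress": {"ci pending", "done"},
    "ci pending": {"in review", "fixing"},
    "in review": {"done", "fixing"},
    "fixing": {"in review", "ci pending"},
}

# Assumption: any state listed above may also move to one of these.
ESCAPE_STATES = {"blocked", "abandoned"}


def can_transition(current, target):
    """Return True if the tracker may move a ticket from current to target."""
    if current not in TRANSITIONS:
        return False  # done, blocked, abandoned, or an unknown state
    return target in TRANSITIONS[current] or target in ESCAPE_STATES
```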
189
+ ## Self-Evaluation (Missing Ticket Fields)
190
+
191
+ Infer missing fields before routing:
192
+
193
+ **Type:**
194
+ - "crash", "error", "broken", "fix", "regression" → `bug`
195
+ - "add", "create", "implement", "build", "new" → `feature`
196
+ - "refactor", "clean", "improve", "optimize" → `chore`
197
+ - Unclear → `feature`
198
+
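The keyword rules for the ticket type can be sketched as a lookup over the words of the title. The keyword lists come from the bullets above; matching on whole words, checking `bug` before `feature` before `chore`, and the name `infer_type` are assumptions made for this illustration.

```python
import re

# Keyword lists from the Type rules, checked in this order (assumed precedence).
TYPE_KEYWORDS = [
    ("bug", {"crash", "error", "broken", "fix", "regression"}),
    ("feature", {"add", "create", "implement", "build", "new"}),
    ("chore", {"refactor", "clean", "improve", "optimize"}),
]


def infer_type(title):
    """Infer a ticket type from its title; unclear titles default to feature."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    for ticket_type, keywords in TYPE_KEYWORDS:
        if words & keywords:
            return ticket_type
    return "feature"
```

Whole-word matching avoids classifying a title such as "Prefix routes with locale" as a bug just because it contains "fix".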
199
+ **Complexity (Fibonacci):**
200
+ - 1: typo, config, single-line — 2: single file, <50 lines — 3: 2-3 files (default)
201
+ - 5+: triggers architect first — 8: 5-8 files — 13: large feature — 21+: major
202
+
203
+ **Criteria** (when unset):
204
+ - Bugs: "The bug described in the title is fixed and does not regress"
205
+ - Features: derive from title
206
+ - Chores: "Code is cleaned up without breaking existing behavior"
207
+
208
+ **Priority** (when unset): `medium`
209
+
210
+ ## Safety Guards
211
+
212
+ ### Anti-Loop Guard
213
+ - Max **6** fixer→review cycles per ticket.
214
+ - At limit: escalate. Do NOT dispatch again.
215
+
216
+ ### Escalation Policy
217
+ - If fixer reports `status: "ESCALATED"` or fails **3× on the same error type**: escalate immediately.
218
+ - Do NOT attempt a 4th variant.
219
+
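The anti-loop and escalation limits above can be combined into a single check that runs before each fixer dispatch. The sketch assumes the supervisor tracks the number of completed fixer→review cycles and the number of failures on the same error type; `fixer_decision` and its parameters are hypothetical names.

```python
MAX_FIX_CYCLES = 6            # anti-loop limit per ticket
MAX_SAME_ERROR_FAILURES = 3   # escalation threshold for one error type


def fixer_decision(completed_cycles, same_error_failures=0, fixer_status=None):
    """Return "dispatch" or "escalate" for the next fixer run."""
    if fixer_status == "ESCALATED":
        return "escalate"
    if same_error_failures >= MAX_SAME_ERROR_FAILURES:
        return "escalate"  # do not attempt a 4th variant
    if completed_cycles >= MAX_FIX_CYCLES:
        return "escalate"  # at the limit: do not dispatch again
    return "dispatch"
```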
220
+ ### Budget Enforcement
221
+ - Check `neo cost --short` before every dispatch.
222
+ - Never dispatch if budget would be exceeded.
223
+
224
+ ## Rules
225
+
226
+ 1. **Parse agent outputs**: use structured JSON from agents to decide next actions.
227
+ 2. **Never modify code** — that is the agents' job.
228
+ 3. **Update tracker immediately**: on every state transition, no batching.
229
+ 4. **Refiner first**: when in doubt about ticket clarity.
230
+ 5. **Self-evaluate**: infer missing fields before routing.
231
+ 6. **Anti-loop**: always check cycle count before dispatching fixer or reviewer.
232
+ 7. **Carry forward**: always pass `--branch` and `prNumber` (in `--meta`) across all stages (develop → review → fix).
233
+ 8. **Track cost**: accumulate per ticket in focus.
234
+ 9. **Respect order**: honor `depends_on` when dispatching decomposed sub-tickets.
235
+
236
+ ## Memory Store
237
+
238
+ Memory is managed via `neo memory`. Facts, procedures, and episodes persist in SQLite with **local semantic search** (all-MiniLM-L6-v2 embeddings via sqlite-vec). When agents are dispatched, the most relevant memories are automatically retrieved and injected into their prompts — no manual selection needed.
239
+
240
+ ### Commands
241
+ ```bash
242
+ neo memory write --type fact --scope /path "Stable fact about repo"
243
+ neo memory write --type focus --expires 2h "Current working context"
244
+ neo memory write --type procedure --scope /path "How to do X"
245
+ neo memory forget <id>
246
+ neo memory search "keyword" # semantic search across all memories
247
+ neo memory list --type fact
248
+ ```
249
+
250
+ ### Writing good memories
251
+ Write clear, descriptive content — memories are matched semantically, not by keywords. Good: "Uses Prisma ORM with PostgreSQL for all database access". Bad: "Prisma + PG".
252
+
253
+ ### Guidelines
254
+ - **Facts**: stable truths about repos (stack, conventions, patterns)
255
+ - **Focus**: ephemeral working context (expires automatically)
256
+ - **Episodes**: auto-created on run completion — do not write manually
257
+ - Use `neo log` for real-time TUI output, `neo memory write` for persistent knowledge
258
+ - Use `notes/` for detailed multi-page plans and checklists
@@ -1,5 +1,5 @@
1
1
  name: developer
2
- description: "Implementation worker. Executes atomic tasks from specs in isolated worktrees. Follows strict scope discipline."
2
+ description: "Implementation worker. Executes atomic tasks from specs in isolated clones. Follows strict scope discipline."
3
3
  model: opus
4
4
  tools:
5
5
  - Read
@@ -0,0 +1,10 @@
1
+ name: reviewer
2
+ description: "Thorough single-pass code reviewer. Covers quality, standards, security, performance, and test coverage. Challenges code by default — approves only when standards are met."
3
+ model: sonnet
4
+ tools:
5
+ - Read
6
+ - Glob
7
+ - Grep
8
+ - Bash
9
+ sandbox: readonly
10
+ prompt: ../prompts/reviewer.md
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@neotx/agents",
3
- "version": "0.1.0-alpha.2",
3
+ "version": "0.1.0-alpha.4",
4
4
  "description": "Built-in agent definitions and prompts for @neotx/core",
5
5
  "type": "module",
6
6
  "license": "MIT",
@@ -12,7 +12,8 @@
12
12
  "files": [
13
13
  "agents",
14
14
  "prompts",
15
- "workflows"
15
+ "workflows",
16
+ "SUPERVISOR.md"
16
17
  ],
17
18
  "keywords": [
18
19
  "ai-agents",