@clanker-code/pi-subagents 0.10.5 → 0.10.6
- package/AGENTS.md +11 -1
- package/CHANGELOG.md +5 -0
- package/RELEASE.md +18 -12
- package/package.json +1 -1
- package/.plans/PLAN-next-changes.md +0 -183
- package/.plans/README.md +0 -14
- package/reviews/proposal-structured-output-schema.md +0 -135
- package/reviews/recursive-subagent-widget-preview-rev2.png +0 -0
- package/reviews/recursive-subagent-widget-preview.html +0 -137
- package/reviews/recursive-subagent-widget-preview.png +0 -0
- package/reviews/subagent-features-comparison.md +0 -350
package/AGENTS.md
CHANGED
|
@@ -17,7 +17,17 @@ Every release must:
|
|
|
17
17
|
3. **Update `README.md` to document any new or changed user-facing features.**
|
|
18
18
|
4. Run and pass: `npm run lint`, `npm run typecheck`, `npm test`, `npm run build`.
|
|
19
19
|
5. Commit and push.
|
|
20
|
-
6.
|
|
20
|
+
6. Push a `vX.Y.Z` tag. The `release.yml` workflow then publishes to npm and creates a GitHub Release automatically.
|
|
21
|
+
|
|
22
|
+
One-time npm setup for CI publishing:
|
|
23
|
+
|
|
24
|
+
```bash
|
|
25
|
+
npm trust github @clanker-code/pi-subagents --repo=clankercode/pi-subagents --file=release.yml
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
If `npm trust` fails, open `https://www.npmjs.com/package/@clanker-code/pi-subagents/access` and add a GitHub Actions trusted publisher for the `release.yml` workflow.
|
|
29
|
+
|
|
30
|
+
See the general guide at `~/.llm-general/npm-autopublish-via-ci.md` for other repos.
|
|
21
31
|
|
|
22
32
|
## Keeping Upstream In Sync
|
|
23
33
|
|
package/CHANGELOG.md
CHANGED
|
@@ -7,6 +7,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|
|
7
7
|
|
|
8
8
|
## [Unreleased]
|
|
9
9
|
|
|
10
|
+
## [0.10.6] - 2026-06-22
|
|
11
|
+
|
|
12
|
+
### Changed
|
|
13
|
+
- **Release workflow test** — patch bump to verify the tag-driven CI publish and GitHub Release creation.
|
|
14
|
+
|
|
10
15
|
## [0.10.5] - 2026-06-21
|
|
11
16
|
|
|
12
17
|
### Changed
|
package/RELEASE.md
CHANGED
|
@@ -21,19 +21,25 @@
|
|
|
21
21
|
git push
|
|
22
22
|
```
|
|
23
23
|
|
|
24
|
-
4. **
|
|
25
|
-
|
|
26
|
-
```bash
|
|
27
|
-
git tag vX.Y.Z
|
|
28
|
-
git push origin vX.Y.Z
|
|
29
|
-
```
|
|
30
|
-
- Open the [GitHub releases page](https://github.com/clankercode/pi-subagents/releases) and create a new release for the tag.
|
|
31
|
-
- Copy the relevant `[x.y.z]` section from `CHANGELOG.md` into the release notes.
|
|
32
|
-
- Highlight any breaking changes, fork-specific features, or upgrade notes.
|
|
33
|
-
|
|
34
|
-
5. **Publish to npm**
|
|
24
|
+
4. **Push the version tag**
|
|
25
|
+
The `release.yml` workflow publishes to npm and creates the GitHub Release automatically:
|
|
35
26
|
```bash
|
|
36
|
-
|
|
27
|
+
git tag vX.Y.Z
|
|
28
|
+
git push origin vX.Y.Z
|
|
37
29
|
```
|
|
38
30
|
|
|
31
|
+
5. **Verify**
|
|
32
|
+
- Check the [Actions run](https://github.com/clankercode/pi-subagents/actions) succeeded.
|
|
33
|
+
- Confirm the package version appears on npm: `npm view @clanker-code/pi-subagents`.
|
|
34
|
+
- Confirm the GitHub Release has the changelog notes.
|
|
35
|
+
|
|
36
|
+
One-time npm trusted-publisher setup:
|
|
37
|
+
```bash
|
|
38
|
+
npm trust github @clanker-code/pi-subagents --repo=clankercode/pi-subagents --file=release.yml
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
If `npm trust` fails, open `https://www.npmjs.com/package/@clanker-code/pi-subagents/access` and add a GitHub Actions trusted publisher for the `release.yml` workflow.
|
|
42
|
+
|
|
43
|
+
See `~/.llm-general/npm-autopublish-via-ci.md` for general instructions.
|
|
44
|
+
|
|
39
45
|
> Note: `prepublishOnly` already runs lint, typecheck, tests, and build before publishing.
|
package/package.json
CHANGED
|
@@ -1,183 +0,0 @@
|
|
|
1
|
-
# Plan: Next pi-subagents improvements
|
|
2
|
-
|
|
3
|
-
This plan covers four coordinated changes:
|
|
4
|
-
1. **Abort-detach + interruptible/timeout wait** for `get_subagent_result wait:true` (Escape no longer wedges the turn; default 4.5 min timeout).
|
|
5
|
-
2. Configurable `waitTimeoutSeconds` setting (default 270s) exposed in `/agents → Settings`.
|
|
6
|
-
3. `get_subagent_result` peek/tail capability with line numbers, regex filter, and `after` line offset.
|
|
7
|
-
4. Description polish for `inherit_context` and `isolation: "worktree"`.
|
|
8
|
-
|
|
9
|
-
All changes are scoped to keep diffs minimal and upstream-merge-friendly.
|
|
10
|
-
|
|
11
|
-
### Key finding (abort-detach)
|
|
12
|
-
Background spawns already do NOT pass the parent abort signal to `manager.spawn()` — only the now-dead foreground path did. So background subagents are already detached from Escape. The Escape-wedge symptom comes from `get_subagent_result wait:true`, which awaits `record.promise` and ignores the parent `signal`. Fix: race the wait against (a) the parent abort `signal` and (b) the configurable timeout, returning current status on either WITHOUT aborting the subagent.
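A minimal sketch of the race described above, assuming the subagent's completion is exposed as a promise. `waitDetached` and `WaitOutcome` are hypothetical names for illustration, not identifiers from the package. The wait settles on completion, timeout, or parent abort, and never cancels the underlying work.

```ts
// Hypothetical helper: the name and result shape are illustrative only.
type WaitOutcome<T> =
  | { kind: "done"; value: T }
  | { kind: "timeout" }
  | { kind: "aborted" };

function waitDetached<T>(
  work: Promise<T>,
  timeoutMs: number,
  signal?: AbortSignal,
): Promise<WaitOutcome<T>> {
  return new Promise<WaitOutcome<T>>((resolve, reject) => {
    // Release the timer and the abort listener once any branch wins.
    const cleanup = () => {
      clearTimeout(timer);
      signal?.removeEventListener("abort", onAbort);
    };
    const onAbort = () => {
      cleanup();
      resolve({ kind: "aborted" });
    };
    const timer = setTimeout(() => {
      cleanup();
      resolve({ kind: "timeout" });
    }, timeoutMs);

    // Observe the work without owning it: nothing here cancels `work`.
    work.then(
      (value) => {
        cleanup();
        resolve({ kind: "done", value });
      },
      (error) => {
        cleanup();
        reject(error);
      },
    );

    if (signal?.aborted) {
      onAbort();
    } else {
      signal?.addEventListener("abort", onAbort, { once: true });
    }
  });
}
```

On `timeout` or `aborted`, a caller would report the agent's current status and leave the subagent running.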
|
|
13
|
-
|
|
14
|
-
---
|
|
15
|
-
|
|
16
|
-
## 1. Settings foundation: configurable `waitTimeoutSeconds`
|
|
17
|
-
|
|
18
|
-
### Motivation
|
|
19
|
-
Users want waits to return before the typical 5-minute LLM cache expiry, so 4.5 minutes is a safe default. We also want this exposed in `/agents → Settings` so it is discoverable and adjustable per project.
|
|
20
|
-
|
|
21
|
-
### Files touched
|
|
22
|
-
- `src/types.ts`: no change. `waitTimeoutSeconds` is runtime configuration, not per-agent state, so it stays out of `AgentRecord` and lives in settings only.
|
|
23
|
-
- `src/settings.ts`:
|
|
24
|
-
- Add `waitTimeoutSeconds?: number` to `SubagentsSettings`.
|
|
25
|
-
- Add `setWaitTimeoutSeconds: (seconds: number) => void` to `SettingsAppliers`.
|
|
26
|
-
- Add sanitize rule: integer, min 30, max 3600 (30s–1h). Default 270 (4.5m) when absent.
|
|
27
|
-
- Add `applySettings` wiring.
|
|
28
|
-
- `src/index.ts`:
|
|
29
|
-
- Add `let waitTimeoutSeconds = 270` and getter/setter.
|
|
30
|
-
- Add to `snapshotSettings()`.
|
|
31
|
-
- Add `/agents → Settings` item with id `waitTimeoutSeconds`.
|
|
32
|
-
- Wire `applyValue` for `waitTimeoutSeconds` (numeric prompt, 30–3600).
|
|
33
|
-
- Add to `applyAndEmitLoaded` appliers object.
|
|
34
|
-
- Pass timeout value into `get_subagent_result` execute closure.
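The sanitize rule in the list above could look roughly like this. The function and constant names are hypothetical. Only the bounds (30 to 3600) and the default (270) come from the plan. Clamping out-of-range values and truncating fractions are assumptions, since the plan does not say whether such input is clamped or rejected.

```ts
// Hypothetical names; only the numeric bounds and the default come from the plan.
const WAIT_TIMEOUT_MIN_S = 30;
const WAIT_TIMEOUT_MAX_S = 3600;
const WAIT_TIMEOUT_DEFAULT_S = 270; // 4.5 minutes

function sanitizeWaitTimeoutSeconds(raw: unknown): number {
  // Absent or non-numeric values fall back to the default.
  if (typeof raw !== "number" || !Number.isFinite(raw)) {
    return WAIT_TIMEOUT_DEFAULT_S;
  }
  // Assumption: drop any fractional part, then clamp into the allowed range.
  const whole = Math.trunc(raw);
  return Math.min(WAIT_TIMEOUT_MAX_S, Math.max(WAIT_TIMEOUT_MIN_S, whole));
}
```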
|
|
35
|
-
|
|
36
|
-
### Return value / UI impact
|
|
37
|
-
- `get_subagent_result` `wait` description: mention current configured timeout.
|
|
38
|
-
- On timeout, return message:
|
|
39
|
-
```
|
|
40
|
-
Agent is still running after 4 minutes 30 seconds.
|
|
41
|
-
This wait timed out to avoid blocking the parent session longer than the configured limit.
|
|
42
|
-
Call get_subagent_result with wait: true again to keep waiting, or omit wait to check status.
|
|
43
|
-
```
|
|
44
|
-
|
|
45
|
-
### Tests
|
|
46
|
-
- `test/settings.test.ts` (create it if it does not exist): assert the sanitize boundaries.
|
|
47
|
-
- New/updated test for `get_subagent_result` timeout behavior using a mocked promise and fake timers.
|
|
48
|
-
|
|
49
|
-
---
|
|
50
|
-
|
|
51
|
-
## 2. `get_subagent_result` peek parameter
|
|
52
|
-
|
|
53
|
-
### Motivation
|
|
54
|
-
Agents should be able to cheaply check recent output / logs without fetching the full result or conversation.
|
|
55
|
-
|
|
56
|
-
### Proposed schema
|
|
57
|
-
```ts
|
|
58
|
-
peek: Type.Optional(
|
|
59
|
-
Type.Object({
|
|
60
|
-
lines: Type.Optional(Type.Number({ minimum: 1, description: "Number of trailing lines to return. Default: 20." })),
|
|
61
|
-
regex: Type.Optional(Type.String({ description: "Optional regex filter. Only lines matching this regex are included." })),
|
|
62
|
-
after: Type.Optional(Type.Number({ minimum: 0, description: "Return all lines after this 0-based line index. Overrides lines when set." })),
|
|
63
|
-
}, {
|
|
64
|
-
description: "Return a peek (tail/filter/update) of the agent result or streaming output file. Ignored when verbose is true.",
|
|
65
|
-
}),
|
|
66
|
-
),
|
|
67
|
-
```
|
|
68
|
-
|
|
69
|
-
### Behavior
|
|
70
|
-
- `peek` is ignored when `verbose: true`.
|
|
71
|
-
- If `after` is set, return all lines with index > `after`. Include line numbers.
|
|
72
|
-
- Else, return the last `lines` lines (default 20). Include line numbers.
|
|
73
|
-
- If `regex` is provided, filter matching lines **first**, then apply tail/after semantics. Include line numbers of the original source.
|
|
74
|
-
- Source precedence:
|
|
75
|
-
1. If agent is running and `record.outputFile` exists, read from the streaming output file. Parse JSONL and extract assistant/toolResult text (most useful live content).
|
|
76
|
-
2. Else if `record.result` exists, split it into lines.
|
|
77
|
-
3. Else return: "No output yet."
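The filter-then-tail ordering above can be sketched as follows, operating on already-extracted text. `peekLines` and `PeekOptions` are hypothetical names. Printing 0-based line numbers is an assumption chosen to match the 0-based `after` index, and reading the output file or parsing JSONL is left out.

```ts
// Hypothetical helper: works on plain text, not on the agent record itself.
interface PeekOptions {
  lines?: number;
  regex?: string;
  after?: number;
}

function peekLines(text: string, peek: PeekOptions = {}): string[] {
  // Number every line first so indices always refer to the full output.
  let entries = text.split("\n").map((line, index) => ({ index, line }));

  // The regex filter runs first; an invalid pattern throws a SyntaxError
  // that a caller would turn into a clear error message.
  if (peek.regex !== undefined) {
    const pattern = new RegExp(peek.regex);
    entries = entries.filter((entry) => pattern.test(entry.line));
  }

  if (peek.after !== undefined) {
    // `after` overrides `lines`: keep everything past the given index.
    const after = peek.after;
    entries = entries.filter((entry) => entry.index > after);
  } else {
    // Otherwise keep the trailing `lines` entries (default 20, at least 1).
    entries = entries.slice(-Math.max(1, peek.lines ?? 20));
  }

  return entries.map((entry) => `[${entry.index}] ${entry.line}`);
}
```

Because numbering happens before filtering, a caller can pass the last index it saw as `after` on the next call and receive only new lines.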
|
|
78
|
-
|
|
79
|
-
### Return format
|
|
80
|
-
```
|
|
81
|
-
Showing last 20 lines of agent output (line numbers from full output):
|
|
82
|
-
|
|
83
|
-
[42] some line
|
|
84
|
-
[43] another line
|
|
85
|
-
...
|
|
86
|
-
|
|
87
|
-
---
|
|
88
|
-
Use verbose: true for the full conversation, or omit peek for the complete result.
|
|
89
|
-
```
|
|
90
|
-
|
|
91
|
-
If `regex` is used, add: `(filtered by regex: /.../)`.
|
|
92
|
-
If `after` is used, add: `(lines after index N)`.
|
|
93
|
-
|
|
94
|
-
### Implementation notes
|
|
95
|
-
- Add helper `peekAgentOutput(record, peek)` in a new file or in `src/index.ts` near `get_subagent_result`.
|
|
96
|
-
- For output-file JSONL parsing, reuse/extract from `output-file.ts` or read the file line-by-line.
|
|
97
|
-
- Handle regex parse errors gracefully — return a clear error message.
|
|
98
|
-
- Avoid reading the whole file into memory if possible; for now `readFileSync` is acceptable because output files are bounded by session length and the tail case is common.
|
|
99
|
-
|
|
100
|
-
### Tests
|
|
101
|
-
- New test file `test/get-subagent-result-peek.test.ts` or extend existing `status-note-wiring.test.ts`:
|
|
102
|
-
- Peek tail of result.
|
|
103
|
-
- Peek with regex.
|
|
104
|
-
- Peek `after` offset.
|
|
105
|
-
- Peek ignored when verbose true.
|
|
106
|
-
- Peek on running agent with output file.
|
|
107
|
-
- Regex parse error handling.
|
|
108
|
-
|
|
109
|
-
---
|
|
110
|
-
|
|
111
|
-
## 3. Description polish
|
|
112
|
-
|
|
113
|
-
### `inherit_context`
|
|
114
|
-
Current:
|
|
115
|
-
> "If true, fork parent conversation into the agent. Default: false (fresh context)."
|
|
116
|
-
|
|
117
|
-
New:
|
|
118
|
-
> "If true, fork the parent conversation into the agent so it sees the chat history. Recommended for questions or requests that require current context. Default: false (fresh context)."
|
|
119
|
-
|
|
120
|
-
Also update the eject template line in `src/index.ts`.
|
|
121
|
-
|
|
122
|
-
### `isolation: "worktree"`
|
|
123
|
-
Current:
|
|
124
|
-
> "Set to 'worktree' to run the agent in a temporary git worktree (isolated copy of the repo). Changes are saved to a branch on completion."
|
|
125
|
-
|
|
126
|
-
New:
|
|
127
|
-
> "Set to 'worktree' to run the agent in a temporary git worktree that is automatically created from the current repo state at HEAD and removed on completion. Changes are saved to a branch. Requires the working directory to be a git repo with at least one commit."
|
|
128
|
-
|
|
129
|
-
Also update README.md and `examples/agent-tool-description.md` to match.
|
|
130
|
-
|
|
131
|
-
---
|
|
132
|
-
|
|
133
|
-
## 4. Coordination / cross-cutting concerns
|
|
134
|
-
|
|
135
|
-
- `get_subagent_result` description must mention the configurable timeout, e.g.:
|
|
136
|
-
> "If true, wait for the agent to complete before returning. Blocks up to the configured wait timeout (default 4.5 minutes). If the agent is still running when the timeout is reached, returns current status with instructions to call again. Default: false."
|
|
137
|
-
|
|
138
|
-
- The settings UI should present `waitTimeoutSeconds` as "Wait timeout (30–3600s, default 270 = 4m30s)".
|
|
139
|
-
|
|
140
|
-
- Tests for the timeout should use Vitest fake timers so they run fast and deterministically.
|
|
141
|
-
|
|
142
|
-
- Peek should not duplicate verbose. Keep the contract: `verbose` = full conversation, `peek` = lightweight tail/filter.
|
|
143
|
-
|
|
144
|
-
---
|
|
145
|
-
|
|
146
|
-
## 5. Things to consider removing / avoiding
|
|
147
|
-
|
|
148
|
-
Before jumping in, review whether any existing code becomes dead weight:
|
|
149
|
-
|
|
150
|
-
- The foreground execution path in `src/index.ts` is dead code after the background-by-default change. It is still in place but unreachable. Consider removing it in a separate cleanup pass, since it complicates future changes.
|
|
151
|
-
- `run_in_background` param is deprecated but still in the schema. Keep it for prompt compatibility; do not remove.
|
|
152
|
-
- `AgentConfig.runInBackground` frontmatter default is still consulted by `resolveAgentInvocationConfig`. Since `index.ts` now forces `runInBackground = true`, the frontmatter field is effectively ignored. Consider whether to deprecate it in docs or leave it as a no-op.
|
|
153
|
-
|
|
154
|
-
---
|
|
155
|
-
|
|
156
|
-
## 6. Acceptance criteria
|
|
157
|
-
|
|
158
|
-
- [ ] `get_subagent_result wait:true` times out after the configured number of seconds and returns a clear message.
|
|
159
|
-
- [ ] Timeout duration is configurable via `/agents → Settings` and persists to `.pi/subagents.json`.
|
|
160
|
-
- [ ] `get_subagent_result` supports `peek` with `lines`, `regex`, and `after`.
|
|
161
|
-
- [ ] `peek` returns line numbers and is ignored when `verbose: true`.
|
|
162
|
-
- [ ] `inherit_context` and `isolation` descriptions are updated in tool schema, eject template, README, and example template.
|
|
163
|
-
- [ ] All existing tests pass; new tests cover timeout + peek.
|
|
164
|
-
- [ ] `npm run typecheck` and `npm run lint` clean.
|
|
165
|
-
|
|
166
|
-
---
|
|
167
|
-
|
|
168
|
-
## 7. Implementation order
|
|
169
|
-
|
|
170
|
-
1. Add `waitTimeoutSeconds` setting (settings.ts + index.ts wiring).
|
|
171
|
-
2. Implement wait timeout in `get_subagent_result` execute.
|
|
172
|
-
3. Add `peek` parameter + helper.
|
|
173
|
-
4. Update descriptions for `inherit_context` and `isolation`.
|
|
174
|
-
5. Update README + example template.
|
|
175
|
-
6. Write tests.
|
|
176
|
-
7. Run full test/typecheck/lint.
|
|
177
|
-
|
|
178
|
-
--- SUMMARY ---
|
|
179
|
-
|
|
180
|
-
- Add a configurable `waitTimeoutSeconds` setting (default 270s) wired into `/agents → Settings` and `get_subagent_result wait:true`.
|
|
181
|
-
- Add a `peek` object param to `get_subagent_result` for tail/filter/offset access to result/output-file with line numbers, ignored when `verbose` is true.
|
|
182
|
-
- Polish `inherit_context` and `isolation: "worktree"` descriptions across schema, template, README, and example.
|
|
183
|
-
- Tests, typecheck, lint.
|
package/.plans/README.md
DELETED
|
@@ -1,14 +0,0 @@
|
|
|
1
|
-
# `.plans/` — local planning scratch
|
|
2
|
-
|
|
3
|
-
This directory holds ad-hoc plans, notes, and design sketches for this fork. It is gitignored so it does not pollute upstream diffs.
|
|
4
|
-
|
|
5
|
-
## Conventions
|
|
6
|
-
|
|
7
|
-
- One plan per file: `PLAN-<short-name>.md`.
|
|
8
|
-
- Plans should include: motivation, files touched, behavior changes, UI/message changes, tests, acceptance criteria, implementation order, and a summary.
|
|
9
|
-
- When a plan graduates to implementation, update the plan file with progress or archive it.
|
|
10
|
-
- Delete obsolete plans to avoid stale guidance.
|
|
11
|
-
|
|
12
|
-
## Active plans
|
|
13
|
-
|
|
14
|
-
- `PLAN-next-changes.md` — configurable `get_subagent_result` wait timeout, peek/tail/regex/after parameter, and description polish for `inherit_context` / `isolation: "worktree"`.
|
|
@@ -1,135 +0,0 @@
|
|
|
1
|
-
# Proposal: Structured JSON Output for Subagents
|
|
2
|
-
|
|
3
|
-
**Date:** 2026-06-19
|
|
4
|
-
**Status:** Proposal / draft
|
|
5
|
-
**Related:** `reviews/subagent-features-comparison.md`
|
|
6
|
-
|
|
7
|
-
## Problem
|
|
8
|
-
|
|
9
|
-
Today, subagents return free-form text. The parent model must parse the text to extract lists, file paths, decisions, or any other structured data before it can feed results into the next tool or agent. This is slow, unreliable, and becomes a bottleneck when chaining multiple agents together.
|
|
10
|
-
|
|
11
|
-
## Goal
|
|
12
|
-
|
|
13
|
-
Add an optional `output_schema` parameter to the `Agent` tool. When supplied, the agent must return JSON matching the schema. The extension validates the output and surfaces either the parsed object or a clear validation error.
|
|
14
|
-
|
|
15
|
-
The feature must be strictly opt-in: the default is free-form text, and agents should only use structured output when the caller explicitly needs it.
|
|
16
|
-
|
|
17
|
-
## Design
|
|
18
|
-
|
|
19
|
-
### Tool parameter
|
|
20
|
-
|
|
21
|
-
Add to the `Agent` tool schema:
|
|
22
|
-
|
|
23
|
-
```ts
|
|
24
|
-
output_schema: Type.Optional(
|
|
25
|
-
Type.Union([Type.String(), Type.Object({})], {
|
|
26
|
-
description:
|
|
27
|
-
'Optional JSON Schema the agent\'s final answer must match. "": free-form text (recommended default; only use this when the consumer needs structured data).',
|
|
28
|
-
})
|
|
29
|
-
),
|
|
30
|
-
```
|
|
31
|
-
|
|
32
|
-
- Accepts either a JSON Schema object or a JSON-stringified schema.
|
|
33
|
-
- Default is `undefined` / `""` → free-form text, no validation.
|
|
34
|
-
- The schema type is JSON Schema, the same standard already used for tool-call schemas in pi.
|
|
35
|
-
|
|
36
|
-
### Frontmatter support
|
|
37
|
-
|
|
38
|
-
Optionally allow agent `.md` files to pin a default schema:
|
|
39
|
-
|
|
40
|
-
```yaml
|
|
41
|
-
---
|
|
42
|
-
output_schema:
|
|
43
|
-
type: object
|
|
44
|
-
properties:
|
|
45
|
-
files:
|
|
46
|
-
type: array
|
|
47
|
-
items:
|
|
48
|
-
type: object
|
|
49
|
-
properties:
|
|
50
|
-
path: { type: string }
|
|
51
|
-
reason: { type: string }
|
|
52
|
-
required: [path, reason]
|
|
53
|
-
required: [files]
|
|
54
|
-
---
|
|
55
|
-
```
|
|
56
|
-
|
|
57
|
-
The caller-supplied `output_schema` overrides the frontmatter value. This lets custom agent types advertise a structured contract without every caller repeating it.
|
|
58
|
-
|
|
59
|
-
### Prompt injection
|
|
60
|
-
|
|
61
|
-
When `output_schema` is non-empty, append a concise instruction to the agent system prompt:
|
|
62
|
-
|
|
63
|
-
> Your final answer must be a single JSON object matching the provided JSON Schema. Do not wrap the JSON in markdown fences, do not add commentary outside the JSON, and do not emit any text after the JSON object.
|
|
64
|
-
|
|
65
|
-
### Validation
|
|
66
|
-
|
|
67
|
-
On agent completion:
|
|
68
|
-
|
|
69
|
-
1. Extract the last JSON object from the final assistant message.
|
|
70
|
-
2. If the output is wrapped in markdown fences, strip them.
|
|
71
|
-
3. Validate the parsed object against the JSON Schema using a lightweight validator (e.g. `ajv` or TypeBox's `Value.Check`).
|
|
72
|
-
4. If validation fails:
|
|
73
|
-
- Mark the agent status as `failed`.
|
|
74
|
-
- Return an error message containing the schema validation errors and a snippet of the raw output.
|
|
75
|
-
5. If validation passes:
|
|
76
|
-
- Include the parsed object in the result under a `structured` field.
|
|
77
|
-
- Keep the normal textual result preview so the notification UI stays readable.
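A rough sketch of steps 1 and 2 above (fence stripping and extracting the last JSON object). `extractLastJsonObject` is a hypothetical name, and the result type only loosely mirrors the `ok` / `rawPreview` fields used elsewhere in this proposal. Schema validation (step 3) is deliberately left out and would be done by a validator such as `ajv`.

```ts
// Hypothetical helper: parses only; it does not validate against a schema.
type ExtractResult =
  | { ok: true; data: Record<string, unknown> }
  | { ok: false; error: string; rawPreview: string };

function extractLastJsonObject(message: string): ExtractResult {
  // Drop markdown fence lines (three backticks plus an optional language tag).
  const text = message.replace(/^[ \t]*`{3}[\w-]*[ \t]*$/gm, "").trim();

  // Anchor on the last closing brace, then widen the start leftwards until
  // the slice parses as a JSON object.
  const end = text.lastIndexOf("}");
  let start = text.lastIndexOf("{", end);
  while (start >= 0 && end > start) {
    try {
      const parsed: unknown = JSON.parse(text.slice(start, end + 1));
      if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
        return { ok: true, data: parsed as Record<string, unknown> };
      }
    } catch {
      // Not a complete object from this start position; keep widening.
    }
    if (start === 0) break;
    start = text.lastIndexOf("{", start - 1);
  }

  return {
    ok: false,
    error: "No JSON object found in output",
    rawPreview: text.slice(0, 200),
  };
}
```

The backwards scan is quadratic in the worst case, which seems acceptable for a single final assistant message.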
|
|
78
|
-
|
|
79
|
-
### Result shape
|
|
80
|
-
|
|
81
|
-
For a structured agent, the result should expose:
|
|
82
|
-
|
|
83
|
-
```json
|
|
84
|
-
{
|
|
85
|
-
"ok": true,
|
|
86
|
-
"data": { "files": [...] },
|
|
87
|
-
"preview": "Found 5 auth-related files..."
|
|
88
|
-
}
|
|
89
|
-
```
|
|
90
|
-
|
|
91
|
-
For failures:
|
|
92
|
-
|
|
93
|
-
```json
|
|
94
|
-
{
|
|
95
|
-
"ok": false,
|
|
96
|
-
"error": "Output did not match the provided JSON Schema",
|
|
97
|
-
"validationErrors": [...],
|
|
98
|
-
"rawPreview": "..."
|
|
99
|
-
}
|
|
100
|
-
```
|
|
101
|
-
|
|
102
|
-
### Notification and join modes
|
|
103
|
-
|
|
104
|
-
Structured output does not change the notification path. Background agents still send steering-style notifications. The XML payload can include a `<structured-output>` block with the serialized JSON so the parent model can reason about it directly.
|
|
105
|
-
|
|
106
|
-
### Interaction with existing features
|
|
107
|
-
|
|
108
|
-
- **Worktree isolation:** unchanged; the structured result is still returned through the normal completion path.
|
|
109
|
-
- **Scheduling:** scheduled agents may use `output_schema` so recurring jobs produce machine-readable results.
|
|
110
|
-
- **Resume:** resuming a structured agent does not re-validate old output; only the final completion is validated.
|
|
111
|
-
- **Cross-extension RPC:** `output_schema` is serializable JSON and can be passed through the RPC spawn envelope.
|
|
112
|
-
|
|
113
|
-
## Acceptance criteria
|
|
114
|
-
|
|
115
|
-
- [ ] `Agent` tool accepts optional `output_schema` as JSON Schema object or string.
|
|
116
|
-
- [ ] Default behavior remains free-form text; no validation when omitted.
|
|
117
|
-
- [ ] Frontmatter supports optional `output_schema`.
|
|
118
|
-
- [ ] System prompt instructs the agent to emit only matching JSON.
|
|
119
|
-
- [ ] Output is parsed and validated strictly on completion.
|
|
120
|
-
- [ ] Validation failures produce a clear error with the raw output snippet.
|
|
121
|
-
- [ ] Validation successes expose parsed data in result notifications and RPC responses.
|
|
122
|
-
- [ ] Tests cover valid schema, invalid schema, fenced JSON, and missing schema cases.
|
|
123
|
-
- [ ] `npm run typecheck` and `npm run lint` pass.
|
|
124
|
-
|
|
125
|
-
## Open questions
|
|
126
|
-
|
|
127
|
-
1. Should we add a small built-in helper agent type that demonstrates structured output (e.g. `json-explorer`)?
|
|
128
|
-
2. Should `output_schema` be surfaced in `/agents` agent-type descriptions so the orchestrator knows which agents return structured data?
|
|
129
|
-
3. Should failed validation trigger automatic retry with a steering message, or fail fast?
|
|
130
|
-
|
|
131
|
-
## Rationale
|
|
132
|
-
|
|
133
|
-
JSON Schema was chosen because it is the same standard already used for pi tool definitions. Agents and users do not need to learn a new format, and existing tooling (TypeBox, `ajv`) integrates cleanly.
|
|
134
|
-
|
|
135
|
-
Strict validation keeps the contract trustworthy. If a consumer asks for structured output, receiving invalid data is worse than receiving no data, because the consumer is likely to feed it into code that assumes correctness.
|
|
Binary file
|
|
@@ -1,137 +0,0 @@
|
|
|
1
|
-
<!doctype html>
|
|
2
|
-
<html lang="en">
|
|
3
|
-
<head>
|
|
4
|
-
<meta charset="utf-8" />
|
|
5
|
-
<meta name="viewport" content="width=device-width, initial-scale=1" />
|
|
6
|
-
<title>Recursive Subagents Widget Preview</title>
|
|
7
|
-
<style>
|
|
8
|
-
:root {
|
|
9
|
-
color-scheme: dark;
|
|
10
|
-
--bg: #0b1020;
|
|
11
|
-
--panel: #121a2d;
|
|
12
|
-
--terminal: #070b13;
|
|
13
|
-
--line: #29364f;
|
|
14
|
-
--text: #dbe7ff;
|
|
15
|
-
--muted: #7f8daa;
|
|
16
|
-
--dim: #566176;
|
|
17
|
-
--accent: #82aaff;
|
|
18
|
-
--green: #8bdc9f;
|
|
19
|
-
--yellow: #ffd479;
|
|
20
|
-
--red: #ff7f8f;
|
|
21
|
-
--cyan: #7ee7f5;
|
|
22
|
-
--purple: #c099ff;
|
|
23
|
-
}
|
|
24
|
-
* { box-sizing: border-box; }
|
|
25
|
-
body {
|
|
26
|
-
margin: 0;
|
|
27
|
-
background: radial-gradient(circle at 20% 0%, #1b2b50 0, transparent 35%), var(--bg);
|
|
28
|
-
color: var(--text);
|
|
29
|
-
font: 14px/1.45 ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif;
|
|
30
|
-
}
|
|
31
|
-
main { max-width: 1220px; margin: 0 auto; padding: 32px; }
|
|
32
|
-
h1 { font-size: 28px; margin: 0 0 8px; }
|
|
33
|
-
h2 { font-size: 18px; margin: 28px 0 12px; color: #f0f5ff; }
|
|
34
|
-
h3 { font-size: 14px; margin: 0 0 10px; color: var(--accent); text-transform: uppercase; letter-spacing: .08em; }
|
|
35
|
-
p { color: #b9c6df; margin: 0 0 14px; }
|
|
36
|
-
.grid { display: grid; grid-template-columns: repeat(3, minmax(0, 1fr)); gap: 18px; align-items: start; }
|
|
37
|
-
.card { background: color-mix(in srgb, var(--panel) 92%, white 8%); border: 1px solid var(--line); border-radius: 16px; padding: 18px; box-shadow: 0 12px 40px rgba(0,0,0,.25); }
|
|
38
|
-
.recommended { border-color: color-mix(in srgb, var(--accent) 70%, white 10%); box-shadow: 0 0 0 1px rgba(130,170,255,.15), 0 18px 55px rgba(22,41,82,.5); }
|
|
39
|
-
.terminal { background: var(--terminal); border: 1px solid #1e2a3e; border-radius: 12px; padding: 12px 14px; font: 13px/1.35 ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace; white-space: pre; overflow: hidden; min-height: 260px; }
|
|
40
|
-
.term-title { display: flex; gap: 6px; align-items: center; margin-bottom: 10px; color: var(--muted); font: 12px ui-monospace, monospace; }
|
|
41
|
-
.dot { width: 9px; height: 9px; border-radius: 999px; display: inline-block; }
|
|
42
|
-
.green { color: var(--green); } .yellow { color: var(--yellow); } .red { color: var(--red); } .cyan { color: var(--cyan); } .purple { color: var(--purple); }
|
|
43
|
-
.muted { color: var(--muted); } .dim { color: var(--dim); } .accent { color: var(--accent); }
|
|
44
|
-
.legend { display: flex; flex-wrap: wrap; gap: 10px; margin-top: 10px; color: var(--muted); font-size: 12px; }
|
|
45
|
-
.plan { display: grid; grid-template-columns: 1.2fr .8fr; gap: 18px; }
|
|
46
|
-
ol, ul { margin: 0; padding-left: 20px; color: #c4cee2; }
|
|
47
|
-
li { margin: 6px 0; }
|
|
48
|
-
.pill { display: inline-flex; align-items: center; gap: 6px; padding: 3px 8px; border-radius: 999px; background: #1b2740; border: 1px solid #344260; color: #cfd9ef; font-size: 12px; }
|
|
49
|
-
.wide { grid-column: 1 / -1; }
|
|
50
|
-
@media (max-width: 980px) { .grid, .plan { grid-template-columns: 1fr; } }
|
|
51
|
-
</style>
|
|
52
|
-
</head>
|
|
53
|
-
<body>
|
|
54
|
-
<main>
|
|
55
|
-
<h1>Recursive Subagents Widget — Plan + Visual Preview</h1>
|
|
56
|
-
<p>Verified issue: the current widget is flat and bound to one manager instance. It does not assemble a durable parent → child → grandchild tree from recursive agent metadata.</p>
|
|
57
|
-
|
|
58
|
-
<section class="plan">
|
|
59
|
-
<div class="card">
|
|
60
|
-
<h2>Implementation plan, once design is approved</h2>
|
|
61
|
-
<ol>
|
|
62
|
-
<li>Create a small tree model layer: records keyed by id, linked by <code>parentAgentId</code>, sorted by start time/status.</li>
|
|
63
|
-
<li>Teach lifecycle events / shared manager state to surface descendants to the root widget, not only direct children.</li>
|
|
64
|
-
<li>Add configurable widget modes: <code>compact</code> (Option A), <code>rich</code> (Option B), and <code>auto</code>.</li>
|
|
65
|
-
<li>Use the focused path view (Option C) when rendering from inside a subagent, or as an optional display mode later.</li>
|
|
66
|
-
<li>Add deterministic tests for parent → child → grandchild rendering, overflow, width truncation, completed/error linger, and mode switching.</li>
|
|
67
|
-
<li>Keep status bar compact: aggregate running/queued counts across the full tree.</li>
|
|
68
|
-
</ol>
|
|
69
|
-
</div>
|
|
70
|
-
<div class="card">
|
|
71
|
-
<h2>Acceptance criteria</h2>
|
|
72
|
-
<ul>
|
|
73
|
-
<li>Grandchildren and deeper descendants are visible.</li>
|
|
74
|
-
<li>Users can choose compact vs rich rendering in settings.</li>
|
|
75
|
-
<li>Subagents can see their location in the larger recursive tree.</li>
|
|
76
|
-
<li>Width/height bounded; no TUI overflow.</li>
|
|
77
|
-
<li>Running activity remains live and readable.</li>
|
|
78
|
-
<li>Completed/error states linger briefly in context.</li>
|
|
79
|
-
</ul>
|
|
80
|
-
</div>
|
|
81
|
-
</section>
|
|
82
|
-
|
|
83
|
-
<h2>Design options</h2>
|
|
84
|
-
<div class="grid">
|
|
85
|
-
<article class="card">
|
|
86
|
-
<h3>Mode: Compact tree</h3>
|
|
87
|
-
<div class="terminal"><div class="term-title"><span class="dot" style="background:#ff5f57"></span><span class="dot" style="background:#ffbd2e"></span><span class="dot" style="background:#28c840"></span><span>Agents widget</span></div><span class="accent">● Agents</span>
|
|
88
|
-
├─ <span class="green">⠋</span> <b>Plan</b> split task <span class="dim">· ↻3 · 2 tools · 1m 12s</span>
|
|
89
|
-
│ ├─ <span class="green">⠹</span> <b>Explore</b> inspect UI <span class="dim">· reading…</span>
|
|
90
|
-
│ └─ <span class="yellow">✓</span> <b>auditor</b> review paths <span class="dim">· 42s</span>
|
|
91
|
-
└─ <span class="green">⠼</span> <b>general-purpose</b> write tests <span class="dim">· editing…</span></div>
|
|
92
|
-
<p><b>Use as:</b> configurable mode <code>compact</code>. <b>Pros:</b> dense, low-risk, close to current widget. <b>Cons:</b> less rich for deep trees.</p>
|
|
93
|
-
</article>
|
|
94
|
-
|
|
95
|
-
<article class="card recommended">
|
|
96
|
-
<h3>Mode: Rich tree (default)</h3>
|
|
97
|
-
<div class="terminal"><div class="term-title"><span class="dot" style="background:#ff5f57"></span><span class="dot" style="background:#ffbd2e"></span><span class="dot" style="background:#28c840"></span><span>Agents widget</span></div><span class="accent">● Agents</span> <span class="muted">3 running · 1 queued · depth 3/4</span>
|
|
98
|
-
├─ <span class="green">⠋</span> <b>Plan</b> <span class="purple">opus</span> <span class="muted">split task</span>
|
|
99
|
-
│ <span class="dim">⎿ ↻3≤20 · 4 tools · 22.8k token (38%) · 1m 12s</span>
|
|
100
|
-
│ ├─ <span class="green">⠹</span> <b>Explore</b> <span class="muted">inspect widget data flow</span>
|
|
101
|
-
│ │ <span class="dim">⎿ reading agent-widget.ts · 14.2s</span>
|
|
102
|
-
│ │ └─ <span class="green">⠼</span> <b>general-purpose</b> <span class="muted">trace manager ownership</span>
|
|
103
|
-
│ │ <span class="dim">⎿ searching tests · 4.8s</span>
|
|
104
|
-
│ └─ <span class="yellow">✓</span> <b>auditor</b> <span class="muted">review plan</span> <span class="dim">· 31s</span>
|
|
105
|
-
├─ <span class="cyan">◦</span> <b>Explore</b> <span class="muted">queued after concurrency cap</span>
|
|
106
|
-
└─ <span class="red">✗</span> <b>Plan</b> <span class="muted">old attempt</span> <span class="red">error: model unavailable</span></div>
|
|
107
|
-
<p><b>Use as:</b> configurable mode <code>rich</code>, recommended default. <b>Pros:</b> best visibility; activity/stats are readable; relationships are obvious. <b>Cons:</b> needs careful overflow rules.</p>
|
|
108
|
-
</article>
|
|
109
|
-
|
|
110
|
-
<article class="card">
|
|
111
|
-
<h3>Focused subagent location view</h3>
|
|
112
|
-
<div class="terminal"><div class="term-title"><span class="dot" style="background:#ff5f57"></span><span class="dot" style="background:#ffbd2e"></span><span class="dot" style="background:#28c840"></span><span>Agents widget</span></div><span class="accent">● Agents</span> <span class="muted">7 total · showing active branch</span>
|
|
113
|
-
├─ <span class="green">⠋</span> <b>Plan</b> split task <span class="dim">· 1m 12s</span>
|
|
114
|
-
│ └─ <span class="green">⠹</span> <b>Explore</b> inspect UI <span class="dim">· reading…</span>
|
|
115
|
-
│ └─ <span class="green">⠼</span> <b>general-purpose</b> trace manager <span class="dim">· searching…</span>
|
|
116
|
-
├─ <span class="dim">+2 completed siblings hidden</span>
|
|
117
|
-
└─ <span class="dim">+2 queued / stale descendants hidden</span></div>
|
|
118
|
-
<p><b>Use as:</b> view shown inside a subagent session to answer “where am I in the tree?” It can also become a selectable mode later. <b>Pros:</b> very compact under load.</p>
|
|
119
|
-
</article>
|
|
120
|
-
</div>
|
|
121
|
-
|
|
122
|
-
<section class="card wide" style="margin-top:18px;">
|
|
123
|
-
<h2>Settled direction so far</h2>
|
|
124
|
-
<p><span class="pill">rich default</span> <span class="pill">compact setting</span> <span class="pill">focused subagent view</span></p>
|
|
125
|
-
<p>Build one recursive tree model, then render it through multiple views. The root TUI widget can use <code>rich</code>, <code>compact</code>, or <code>auto</code>. A subagent-local widget can render the focused path view so the agent/operator sees the current subagent’s location in the whole delegation tree.</p>
|
|
126
|
-
<ul>
|
|
127
|
-
<li><b>Header:</b> aggregate full-tree counts and max observed depth.</li>
|
|
128
|
-
<li><b>Rich node:</b> connector + status icon/spinner + agent type + model/tag chips + description, plus detail line.</li>
|
|
129
|
-
<li><b>Compact node:</b> same tree structure, one line per agent, fewer stats.</li>
|
|
130
|
-
<li><b>Focused node:</b> ancestor path + current active branch + hidden sibling summaries.</li>
|
|
131
|
-
<li><b>Overflow:</b> collapse by subtree: <code>└─ +4 descendants hidden (2 running, 1 queued, 1 finished)</code>.</li>
|
|
132
|
-
<li><b>Fallback:</b> orphan records render under <code>Other agents</code> with a dim warning marker.</li>
|
|
133
|
-
</ul>
|
|
134
|
-
</section>
|
|
135
|
-
</main>
|
|
136
|
-
</body>
|
|
137
|
-
</html>
|
|
Binary file
|
|
@@ -1,350 +0,0 @@
|
|
|
1
|
-
# Subagent Feature Landscape — A Comparison for `pi-subagents`
|
|
2
|
-
|
|
3
|
-
**Date:** 2026-06-18
|
|
4
|
-
**Scope:** Survey the subagent / multi-agent features shipping in leading coding agents and CLIs, extract the most useful and innovative ideas, and compare them to the current `pi-subagents` implementation.
|
|
5
|
-
|
|
6
|
-
This document is meant to be a single reference for deciding which features are worth adopting, improving, or deliberately avoiding in `pi-subagents`.
|
|
7
|
-
|
|
8
|
-
---
|
|
9
|
-
|
|
10
|
-
## 1. Feature taxonomy
|
|
11
|
-
|
|
12
|
-
Across the tools surveyed, the interesting subagent capabilities cluster into a handful of categories:
|
|
13
|
-
|
|
14
|
-
| Category | What it covers |
|----------|----------------|
| **Spawn model** | Tool-based (`Agent`), natural-language auto-delegation, `@agent` forcing, API/SDK spawn |
| **Context isolation** | Fresh context vs. inherited conversation; context-window utilization signals |
| **Tool & extension scoping** | Per-agent tool allowlists/denylists, sandbox mode (read-only, etc.), extension isolation |
| **Concurrency & parallelism** | Background queue, max threads, parallel worktrees, join/group strategies |
| **Orchestration** | Parent-driven fan-out, coordinator/agent-teams, shared task lists, self-coordination |
| **Communication** | One-way result reporting, mid-run steering, peer-to-peer messaging |
| **Scheduling** | Cron / interval / one-shot scheduled agents, CI-triggered agents |
| **Lifecycle intervention** | Graceful max-turns wrap-up, hard abort, resume, hooks/quality gates |
| **Sandbox / isolation** | Worktree isolation, container sandboxes, browser-agent isolation |
| **Configuration** | Markdown + YAML, TOML, programmatic `agents` map, skills preloading |
| **UI / observability** | Persistent widget, conversation viewer, styled notifications, token meters |
| **Governance** | Model allowlists, approval gates, audit events, cross-extension RPC |
|
|
28
|
-
|
|
29
|
-
---
|
|
30
|
-
|
|
31
|
-
## 2. Tool-by-tool survey
|
|
32
|
-
|
|
33
|
-
### 2.1 Claude Code — subagents
|
|
34
|
-
|
|
35
|
-
Claude Code implements subagents as a first-class `Agent` tool.
|
|
36
|
-
|
|
37
|
-
**Key capabilities**
|
|
38
|
-
|
|
39
|
-
- **Three definition styles:** programmatic `AgentDefinition` in the SDK, filesystem agents in `.claude/agents/*.md`, and a built-in `general-purpose` agent that is always available.
|
|
40
|
-
- **Markdown + YAML agents:** each file has frontmatter (`name`, `description`, optional `tools`) plus a system prompt body. Project-scoped agents override user-scoped agents.
|
|
41
|
-
- **Tool scoping:** agents receive a curated `tools` array; omitting it inherits the parent thread’s tools (including MCP). `disallowedTools` can block the `Agent` tool itself.
|
|
42
|
-
- **Nested subagents:** as of Claude Code v2.1.172, subagents can spawn their own subagents. Foreground subagents can spawn at any depth; background subagents cannot spawn beyond depth 5.
|
|
43
|
-
- **Fresh context by default:** the subagent’s context window starts empty; the only parent-to-child channel is the `Agent` tool’s prompt string.
|
|
44
|
-
- **Lifecycle hooks:** shell hooks fire on events such as `SubagentStop` / `Stop`, letting users inject quality gates or external review steps.
|
|
45
|
-
|
|
46
|
-
**What stands out**
|
|
47
|
-
|
|
48
|
-
- The filesystem-based agent definition is simple and version-controllable.
|
|
49
|
-
- The model can auto-delegate based on the agent’s `description`.
|
|
50
|
-
- Nested subagents give genuine recursive delegation.
|
|
51
|
-
|
|
52
|
-
**Sources:** [Claude Code subagent docs](https://code.claude.com/docs/en/agent-sdk/subagents), [PubNub best-practices write-up](https://www.pubnub.com/blog/best-practices-for-claude-code-sub-agents/)
|
|
53
|
-
|
|
54
|
-
### 2.2 Claude Code — Agent Teams (experimental)
|
|
55
|
-
|
|
56
|
-
Agent Teams are a different, heavier abstraction from ordinary subagents: they coordinate multiple independent Claude Code sessions rather than nested tool calls.
|
|
57
|
-
|
|
58
|
-
**Key capabilities**
|
|
59
|
-
|
|
60
|
-
- **Peer-to-peer messaging:** teammates message each other directly, not just through the parent.
|
|
61
|
-
- **Shared task list:** the team coordinates through a shared task list with self-coordination.
|
|
62
|
-
- **Lead / teammate model:** the main session is the lead; teammates are spawned on request or when the model decides it needs help.
|
|
63
|
-
- **Quality-gate hooks:** `TeammateIdle`, `TaskCreated`, and `TaskCompleted` hooks can block idle/creation/completion with exit code 2 and send feedback.
|
|
64
|
-
- **Graceful shutdown:** the lead can ask a teammate to shut down; the teammate may approve or reject.
|
|
65
|
-
- **Shared directories:** team workspace directories are cleaned up automatically on session end.
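
The quality-gate behaviour described above (a hook blocks an event by exiting with code 2 and sends feedback) can be sketched as a small decision function. This is a minimal sketch, not Claude Code's implementation: the `HookResult` and `GateDecision` types, the `decideGate` name, the use of stderr as the feedback channel, and the treatment of every other exit code as "allow" are all assumptions made for illustration.

```typescript
// Hypothetical types for illustration; not part of Claude Code or pi-subagents.
type HookEvent = "TeammateIdle" | "TaskCreated" | "TaskCompleted";

interface HookResult {
  exitCode: number;
  stderr: string; // assumed feedback channel
}

type GateDecision =
  | { action: "allow" }
  | { action: "block"; feedback: string };

// Exit code 2 blocks the event and forwards feedback.
// Treating every other exit code as "allow" is an assumption of this sketch.
function decideGate(event: HookEvent, result: HookResult): GateDecision {
  if (result.exitCode === 2) {
    const feedback = result.stderr.trim() || `${event} blocked by hook`;
    return { action: "block", feedback };
  }
  return { action: "allow" };
}
```

A caller would run the shell hook, capture its exit status and stderr, and pass both to `decideGate` before letting the teammate go idle or the task change state.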
|
|
66
|
-
|
|
67
|
-
**What stands out**
|
|
68
|
-
|
|
69
|
-
- This is the most sophisticated *team* abstraction among the surveyed tools: agents are true peers with a task board and direct messaging.
|
|
70
|
-
- Quality-gate hooks are a genuine governance primitive.
|
|
71
|
-
|
|
72
|
-
**Sources:** [Claude Code agent teams docs](https://code.claude.com/docs/en/agent-teams)
|
|
73
|
-
|
|
74
|
-
### 2.3 OpenAI Codex
|
|
75
|
-
|
|
76
|
-
Codex treats subagent workflows as explicit, user-requested parallel fan-outs.
|
|
77
|
-
|
|
78
|
-
**Key capabilities**
|
|
79
|
-
|
|
80
|
-
- **Explicit orchestration:** Codex spawns subagents only when the user asks for them; it then routes follow-up instructions, waits for results, and closes the agent threads.
|
|
81
|
-
- **Built-in + custom agents:** ships `default`, `worker`, and `explorer` plus user-defined TOML agents under `.codex/agents/`.
|
|
82
|
-
- **Global `[agents]` settings:** `max_threads` (default 6), `max_depth` (default 1), `job_max_runtime_seconds`.
|
|
83
|
-
- **Per-agent config:** `model`, `model_reasoning_effort`, `sandbox_mode` (e.g. `read-only`), `developer_instructions`.
|
|
84
|
-
- **Sandbox modes:** read-only exploration agents, with broader sandbox-agent support in the Agents SDK.
|
|
85
|
-
- **Agents SDK integration:** the CLI can be exposed as an MCP server and orchestrated from the OpenAI Agents SDK.
|
|
86
|
-
- **`spawn_agents_on_csv`:** batch-style fan-out over a CSV of inputs, with per-worker timeouts.
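
To make the two global governors concrete, here is a minimal sketch of an admission check that a runtime could apply before spawning. The defaults (6 threads, depth 1) come from the list above; everything else is an assumption: the `AgentLimits` and `SpawnCheck` types, the `checkSpawn` function, and the convention that the root session sits at depth 0 are hypothetical and do not describe Codex's internals.

```typescript
// Hypothetical shapes for illustration; field names mirror the settings above.
interface AgentLimits {
  maxThreads: number; // mirrors `max_threads`
  maxDepth: number;   // mirrors `max_depth`
}

type SpawnCheck = { ok: true } | { ok: false; reason: string };

// Defaults as reported in the survey.
const CODEX_DEFAULTS: AgentLimits = { maxThreads: 6, maxDepth: 1 };

// Assumption: the root session is depth 0, so a child of the root is depth 1.
function checkSpawn(limits: AgentLimits, activeThreads: number, parentDepth: number): SpawnCheck {
  const childDepth = parentDepth + 1;
  if (childDepth > limits.maxDepth) {
    return { ok: false, reason: `depth ${childDepth} exceeds max_depth ${limits.maxDepth}` };
  }
  if (activeThreads >= limits.maxThreads) {
    return { ok: false, reason: `max_threads ${limits.maxThreads} reached` };
  }
  return { ok: true };
}
```

Under these assumptions the default depth of 1 lets the root spawn workers but stops those workers from spawning their own.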
|
|
87
|
-
|
|
88
|
-
**What stands out**
|
|
89
|
-
|
|
90
|
-
- The TOML config is more structured than Markdown frontmatter but less readable as a document.
|
|
91
|
-
- `max_threads`/`max_depth` are simple, effective global governors.
|
|
92
|
-
- Explicit orchestration keeps the user in control but requires intentional prompting.
|
|
93
|
-
|
|
94
|
-
**Sources:** [Codex subagents docs](https://developers.openai.com/codex/subagents), [Codex subagent concepts](https://developers.openai.com/codex/concepts/subagents), [Codex Agents SDK guide](https://developers.openai.com/codex/guides/agents-sdk), [Simon Willison’s notes](https://simonwillison.net/2026/Mar/16/codex-subagents/)
|
|
95
|
-
|
|
96
|
-
### 2.4 Google Gemini CLI
|
|
97
|
-
|
|
98
|
-
Gemini CLI describes subagents as specialists the main agent can “hire.”
|
|
99
|
-
|
|
100
|
-
**Key capabilities**
|
|
101
|
-
|
|
102
|
-
- **Subagents-as-tools:** each subagent is exposed to the main agent as a callable tool with the same name.
|
|
103
|
-
- **Built-in agents:** `codebase_investigator`, `cli_help`, `generalist_agent`, and an experimental `browser_agent`.
|
|
104
|
-
- **Custom agents:** Markdown + YAML frontmatter under `.gemini/agents/*.md` or `~/.gemini/agents/*.md`.
|
|
105
|
-
- **`@agent_name` forcing:** prefix a prompt with `@agent_name` to force delegation to that agent.
|
|
106
|
-
- **Parallel execution:** multiple subagents (or instances of the same subagent) can run concurrently.
|
|
107
|
-
- **Isolated context loops:** each subagent operates in its own context loop.
|
|
108
|
-
- **No recursion:** subagents cannot call other subagents, preventing runaway agent loops.
|
|
109
|
-
- **Agent Skills:** skills are self-contained directories following the [Agent Skills](https://agentskills.io) open standard, discovered from built-ins, extensions, user, and workspace tiers.
|
|
110
|
-
- **Browser-agent sandboxing:** adjusts behavior under macOS seatbelt and Docker/Podman; can connect to a host Chrome over remote debugging.
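
The `@agent_name` forcing described above amounts to recognising a prefix and routing the rest of the prompt. The sketch below shows one way to do that; the `parseForcedAgent` function, the `ForcedDelegation` type, and the allowed characters in an agent name are assumptions for illustration, not Gemini CLI's actual parser.

```typescript
// Hypothetical result type for illustration.
interface ForcedDelegation {
  agent: string;
  prompt: string;
}

// Returns the forced agent and remaining prompt, or null when the input does
// not start with a known `@agent_name` prefix. The name pattern is an assumption.
function parseForcedAgent(input: string, knownAgents: ReadonlySet<string>): ForcedDelegation | null {
  const match = /^@([A-Za-z0-9_-]+)\s+([\s\S]+)$/.exec(input.trim());
  if (match === null) return null;
  const [, agent, prompt] = match;
  if (agent === undefined || prompt === undefined) return null;
  if (!knownAgents.has(agent)) return null;
  return { agent, prompt: prompt.trim() };
}
```

Returning `null` for unknown names leaves the prompt to ordinary auto-delegation rather than failing, which is a design choice of this sketch.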
|
|
111
|
-
|
|
112
|
-
**What stands out**
|
|
113
|
-
|
|
114
|
-
- The `@agent` syntax is a fast, ergonomic forcing mechanism.
|
|
115
|
-
- Treating subagents as tools makes the mental model very simple.
|
|
116
|
-
- The built-in browser agent is a genuinely useful default specialist.
|
|
117
|
-
|
|
118
|
-
**Note:** As of June 2026, Gemini CLI is being superseded by Antigravity CLI for unpaid users.
|
|
119
|
-
|
|
120
|
-
**Sources:** [Gemini CLI subagents docs](https://geminicli.com/docs/core/subagents/), [Gemini CLI skills docs](https://geminicli.com/docs/cli/skills/), [Google Developers Blog announcement](https://developers.googleblog.com/subagents-have-arrived-in-gemini-cli/)
|
|
121
|
-
|
|
122
|
-
### 2.5 Cursor
|
|
123
|
-
|
|
124
|
-
Cursor 2.4 introduced subagents as parallel, specialized workers inside the editor and CLI.
|
|
125
|
-
|
|
126
|
-
**Key capabilities**
|
|
127
|
-
|
|
128
|
-
- **Independent parallel agents:** each subagent has its own context window and can run alongside the main agent.
|
|
129
|
-
- **Default subagents:** included defaults for codebase research, terminal commands, and parallel work streams.
|
|
130
|
-
- **Custom subagents:** users can define custom prompts, tool access, and models.
|
|
131
|
-
- **Skills:** reusable skill definitions that can be attached to agents.
|
|
132
|
-
- **IDE integration:** subagents run inside the same editor session as the main agent.
|
|
133
|
-
|
|
134
|
-
**What stands out**
|
|
135
|
-
|
|
136
|
-
- Cursor’s main differentiator is the tight IDE integration: subagents can work while the user keeps editing.
|
|
137
|
-
- The combination of subagents + skills is aimed at repeatable team workflows.
|
|
138
|
-
|
|
139
|
-
**Sources:** [Cursor 2.4 changelog](https://cursor.com/changelog/2-4), [Cursor subagents guide](https://www.aimakers.co/blog/cursor-2-4-subagents/)
|
|
140
|
-
|
|
141
|
-
### 2.6 Cline / Kilo Code
|
|
142
|
-
|
|
143
|
-
Cline is an open-source VS Code extension and CLI with a plan/act workflow. Kilo Code is a hard fork with an explicit focus on multi-agent teams.
|
|
144
|
-
|
|
145
|
-
**Key capabilities**
|
|
146
|
-
|
|
147
|
-
- **Plan / Act modes:** separate planning from execution.
|
|
148
|
-
- **Coordinator agents:** a coordinator delegates to specialists with their own tools and context.
|
|
149
|
-
- **Schedules:** agents can be run on cron for recurring automations.
|
|
150
|
-
- **Parallel isolated worktrees:** Kilo advertises “parallel isolated worktrees” for safe concurrent edits.
|
|
151
|
-
- **Skills / `.clinerules`:** repo-local rules and skills teach the agent standards and conventions.
|
|
152
|
-
- **MCP + plugins:** extend tools via MCP and the Cline SDK.
|
|
153
|
-
- **Agent Manager / Portal:** Kilo provides a single portal to supervise local and cloud agents across IDE/CLI.
|
|
154
|
-
|
|
155
|
-
**What stands out**
|
|
156
|
-
|
|
157
|
-
- Cline/Kilo is the only open-source entry besides `pi-subagents` that combines scheduling, parallel worktrees, and coordinator/specialist teams.
|
|
158
|
-
- The plan/act split is a useful UX primitive for risky or multi-step tasks.
|
|
159
|
-
|
|
160
|
-
**Sources:** [Cline homepage](https://cline.bot/), [Kilo Code homepage](https://kilo.ai/)
|
|
161
|
-
|
|
162
|
-
### 2.7 Roo Code
|
|
163
|
-
|
|
164
|
-
Roo Code is an open-source VS Code extension built around *modes*.
|
|
165
|
-
|
|
166
|
-
**Key capabilities**
|
|
167
|
-
|
|
168
|
-
- **Modes as agents:** each mode is a persona/role with its own instructions and assigned model (`Code`, `Architect`, `Ask`, `Debug`, `Orchestrator`, …).
|
|
169
|
-
- **Orchestrator mode:** a strategic workflow orchestrator that breaks complex tasks into discrete subtasks and delegates them to specialized modes.
|
|
170
|
-
- **Multi-provider support:** OpenAI, Anthropic, local models via LiteLLM.
|
|
171
|
-
- **Context indexing:** Roo Code uses indexing and prompt management to maintain context across modes.
|
|
172
|
-
|
|
173
|
-
**What stands out**
|
|
174
|
-
|
|
175
|
-
- The Orchestrator mode is an elegant lightweight team abstraction: modes are cheap to define and the orchestrator mode handles routing.
|
|
176
|
-
- At the time of writing, parallel execution of specialists is still an open enhancement request; delegation is primarily sequential.
|
|
177
|
-
|
|
178
|
-
**Sources:** [Roo Code GitHub](https://github.com/RooCodeInc/Roo-Code), [Xebia multi-agent workflow write-up](https://xebia.com/blog/multi-agent-workflow-with-roo-code/)
|
|
179
|
-
|
|
180
|
-
### 2.8 Aider
|
|
181
|
-
|
|
182
|
-
Aider does not have subagents in the Claude Code sense, but its **architect mode** is an interesting adjacent pattern.
|
|
183
|
-
|
|
184
|
-
**Key capabilities**
|
|
185
|
-
|
|
186
|
-
- **Architect / editor model split:** the architect model (often a reasoning model like o1/Opus) proposes a solution, then a separate editor model turns that proposal into concrete file edits.
|
|
187
|
-
- **Chat modes:** `code`, `ask`, `architect`, `help`; modes can be sticky or per-message.
|
|
188
|
-
- **Git-first workflow:** every change is committed automatically.
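
The architect/editor split is a two-stage pipeline: one model proposes, another turns the proposal into edits. A minimal sketch follows, assuming each model can be represented as an async function from prompt to text. The `Model` type, the `architectThenEdit` function, and the prompt wording are hypothetical and are not Aider's API.

```typescript
// Hypothetical abstraction: a model is any async function from prompt to text.
type Model = (prompt: string) => Promise<string>;

interface SplitResult {
  plan: string;  // the architect's proposal
  edits: string; // the editor's concrete edits derived from that proposal
}

// Stage 1: the architect (typically a stronger reasoning model) proposes a solution.
// Stage 2: the editor (typically a cheaper model) converts the proposal into edits.
async function architectThenEdit(task: string, architect: Model, editor: Model): Promise<SplitResult> {
  const plan = await architect(`Propose a solution for: ${task}`);
  const edits = await editor(`Turn this proposal into concrete file edits:\n${plan}`);
  return { plan, edits };
}
```

Because the editor only sees the plan, the two stages can use different models and different price points without sharing a conversation.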
|
|
189
|
-
|
|
190
|
-
**What stands out**
|
|
191
|
-
|
|
192
|
-
- This is *specialization by model* rather than *specialization by agent*. It proves that splitting reasoning from execution can improve quality without full agent orchestration.
|
|
193
|
-
- It is cheaper and simpler than a multi-agent runtime, but less flexible.
|
|
194
|
-
|
|
195
|
-
**Sources:** [Aider chat modes docs](https://aider.chat/docs/usage/modes.html), [Aider vs Claude Code 2026](https://www.developersdigest.tech/blog/aider-vs-claude-code-2026-update)
|
|
196
|
-
|
|
197
|
-
### 2.9 Continue.dev
|
|
198
|
-
|
|
199
|
-
Continue.dev is an open-source coding agent (CLI, VS Code, JetBrains) with a strong focus on repeatable, team-level agent workflows.
|
|
200
|
-
|
|
201
|
-
**Key capabilities**
|
|
202
|
-
|
|
203
|
-
- **Agents for PR/CI workflows:** automated code review, PR checks, and development workflows on every pull request.
|
|
204
|
-
- **Hub:** centralized configuration, shared custom agents, and secure API-key management.
|
|
205
|
-
- **Audit and monitoring:** agent actions and decisions are observable for teams.
|
|
206
|
-
|
|
207
|
-
**What stands out**
|
|
208
|
-
|
|
209
|
-
- Continue is less about runtime subagent spawning and more about packaging agents as reusable team services.
|
|
210
|
-
- The hub/audit model is worth studying for any extension that wants to scale across an organization.
|
|
211
|
-
|
|
212
|
-
**Sources:** [Continue.dev agents](https://www.continue.dev/agents), [Continue agents blog](https://blog.continue.dev/what-are-continue-agents-any-workflow-your-teams-way/)
|
|
213
|
-
|
|
214
|
-
### 2.10 OpenHands
|
|
215
|
-
|
|
216
|
-
OpenHands is an open-source, model-agnostic cloud platform for coding agents.
|
|
217
|
-
|
|
218
|
-
**Key capabilities**
|
|
219
|
-
|
|
220
|
-
- **Cloud / headless agents:** run agents in isolated sandboxes on a VM or in the cloud.
|
|
221
|
-
- **Multi-agent collaboration:** supports multiple agents working together on large refactors.
|
|
222
|
-
- **Triggers and schedules:** GitHub, Slack, PagerDuty, cron-style scheduling even when the user’s machine is off.
|
|
223
|
-
- **SDK + REST API:** embed agents into products or internal platforms.
|
|
224
|
-
- **Governance:** access controls, audit trails, cost guardrails.
|
|
225
|
-
|
|
226
|
-
**What stands out**
|
|
227
|
-
|
|
228
|
-
- OpenHands is the most enterprise-oriented: it is about fleet management, not just a single CLI session.
|
|
229
|
-
- The combination of multi-agent collaboration + scheduled triggers + governance is the direction many teams will eventually want.
|
|
230
|
-
|
|
231
|
-
**Sources:** [OpenHands homepage](https://openhands.dev/), [OpenHands SDK docs](https://docs.openhands.dev/sdk)
|
|
232
|
-
|
|
233
|
-
---
|
|
234
|
-
|
|
235
|
-
## 3. Feature matrix
|
|
236
|
-
|
|
237
|
-
| Capability | Claude Code subagents | Claude Code Agent Teams | Codex | Gemini CLI | Cursor | Cline / Kilo | Roo Code | `pi-subagents` |
|---|---|---|---|---|---|---|---|---|
| **Spawn style** | `Agent` tool | Lead / teammate sessions | Explicit user prompt | Tool-like `@agent` | Tool/IDE action | Coordinator delegate | Orchestrator mode | `Agent` tool |
| **Custom agent config** | `.claude/agents/*.md` | Same + teammate profiles | `.codex/agents/*.toml` | `.gemini/agents/*.md` | Custom prompts + skills | `.clinerules` / skills | Mode files | `.pi/agents/*.md` |
| **Built-in agents** | `general-purpose`, `Explore`, `Plan` | N/A (team roles) | `default`, `worker`, `explorer` | `codebase_investigator`, `cli_help`, `generalist_agent`, `browser_agent` | Research, terminal, parallel | Coordinator + specialists | Code, Architect, Ask, Debug, Orchestrator | `general-purpose`, `Explore`, `Plan` |
| **Tool scoping** | `tools`, `disallowedTools` | Per-teammate tools | `sandbox_mode`, per-agent tools | `tools`, wildcards | Custom tool access | Yes | Yes | `tools`, `disallowed_tools`, `ext:` selectors, `isolated` |
| **Concurrency control** | Background/foreground, nesting depth | Multiple teammates | `max_threads`, `max_depth` | Parallel instances | Up to 10 concurrent ops | Parallel isolated worktrees | Sequential primarily | `maxConcurrent` queue, recursive depth limit |
| **Context model** | Fresh by default, optional inherit | Fully independent sessions | Fresh threads | Isolated context loops | Own context window | Own context | Indexed prompt mgmt | Fresh by default, optional `inherit_context` |
| **Mid-run steering** | No (stop/abort only) | Yes (messages, shutdown) | No | No | No | ? | ? | `steer_subagent` |
| **Scheduling** | No | No | `spawn_agents_on_csv` / SDK | No | No | Cron schedules | No | Cron, interval, one-shot |
| **Worktree / sandbox isolation** | No built-in worktree | Shared team dirs | Sandbox agents (container) | Browser sandboxing | No | Parallel worktrees | No | `isolation: "worktree"` |
| **Join / group results** | Summarized inline | Shared task list | Wait + consolidate | Summarized inline | Main agent collates | ? | ? | `smart` / `async` / `group` join modes |
| **Resume sessions** | No | Session resumption | ? | No | ? | ? | ? | Yes (`resume`) |
| **Lifecycle hooks** | `SubagentStop`, `Stop`, etc. | `TeammateIdle`, `TaskCreated`, `TaskCompleted` | ? | No | No | ? | ? | Event bus (`subagents:*`) |
| **Cross-extension RPC** | No | No | MCP server exposure | No | No | MCP/SDK | No | `pi.events` RPC (`spawn`, `stop`, `ping`) |
| **Persistent memory** | No | No | No | No | ? | ? | ? | Project / local / user `MEMORY.md` |
| **UI / observability** | Inline Claude Code UI | Inline Claude Code UI | CLI output | CLI output | IDE widget | IDE + portal | VS Code UI | Persistent widget, conversation viewer, styled notifications |
| **Model governance** | Model override per agent | Per-teammate model | `model`, `model_reasoning_effort` | Model selection | Per-subagent model | Multi-provider | Multi-provider | Fuzzy selection, `enabledModels` scope check |
|
|
255
|
-
|
|
256
|
-
*Cells marked `?` mean the capability was not documented or is not a first-class feature.*
|
|
257
|
-
|
|
258
|
-
---
|
|
259
|
-
|
|
260
|
-
## 4. How `pi-subagents` compares
|
|
261
|
-
|
|
262
|
-
### 4.1 Where `pi-subagents` is already strong
|
|
263
|
-
|
|
264
|
-
- **Filesystem-first custom agents:** the Markdown + YAML frontmatter model matches Claude Code and Gemini CLI, which is the most user-friendly of the config formats.
|
|
265
|
-
- **Rich tool scoping:** `tools`, `disallowed_tools`, `extensions`, `exclude_extensions`, and `ext:<extension>/<tool>` selectors are more granular than most competitors.
|
|
266
|
-
- **Background orchestration:** concurrency queue, background-by-default, and configurable join modes (`smart`/`async`/`group`) are not common; Claude Code and Codex do not expose join strategies.
|
|
267
|
-
- **Mid-run steering:** `steer_subagent` is a genuine differentiator; among the surveyed tools, only Claude Code Agent Teams offers a comparable intervention primitive.
|
|
268
|
-
- **Scheduling:** built-in cron / interval / one-shot scheduling is rare; only Cline/Kilo and OpenHands advertise it.
|
|
269
|
-
- **Worktree isolation:** `isolation: "worktree"` with automatic branch preservation is a powerful primitive for safe parallel writes; Codex/Cursor/Gemini do not offer it out of the box.
|
|
270
|
-
- **Cross-extension RPC + event bus:** letting other pi extensions spawn, stop, and observe subagents via `pi.events` is unique in this survey.
|
|
271
|
-
- **Persistent memory + skill preloading:** project/local/user memory scopes and automatic skill injection are ahead of the simpler config-only agents in Codex/Gemini.
|
|
272
|
-
- **UI polish:** the persistent widget, conversation viewer, token meter, and styled notifications are richer than the CLI-only competitors.
|
|
273
|
-
- **Graceful max-turns:** the wrap-up-then-abort behavior is a nice user-experience touch.
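
As an illustration of the tool-scoping bullet above, the sketch below resolves an allowlist and a denylist that may contain `ext:<extension>/<tool>` selectors. Only the field names `tools` and `disallowed_tools` and the selector shape come from this document. The matching rules, the `ToolRef` type, and the `matchesSelector` and `effectiveTools` functions are assumptions and may not match how `pi-subagents` actually resolves scopes.

```typescript
// Hypothetical model of a tool: built-in tools have no extension.
interface ToolRef {
  extension: string | null;
  name: string;
}

interface ScopeConfig {
  tools?: string[];            // allowlist; omitted means "inherit everything" (assumption)
  disallowed_tools?: string[]; // denylist, applied after the allowlist (assumption)
}

// A bare selector names a built-in tool; `ext:<extension>/<tool>` names an extension tool.
function matchesSelector(selector: string, tool: ToolRef): boolean {
  if (!selector.startsWith("ext:")) {
    return tool.extension === null && tool.name === selector;
  }
  const rest = selector.slice("ext:".length);
  const slash = rest.indexOf("/");
  if (slash < 0) return false;
  return tool.extension === rest.slice(0, slash) && tool.name === rest.slice(slash + 1);
}

function effectiveTools(available: readonly ToolRef[], scope: ScopeConfig): ToolRef[] {
  const allow = scope.tools;
  const deny = scope.disallowed_tools ?? [];
  return available.filter(
    (tool) =>
      (allow === undefined || allow.some((s) => matchesSelector(s, tool))) &&
      !deny.some((s) => matchesSelector(s, tool)),
  );
}
```

Applying the denylist last means an explicit block always wins over an allow, which is the conservative choice for a permission system.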
|
|
274
|
-
|
|
275
|
-
### 4.2 Notable gaps relative to competitors
|
|
276
|
-
|
|
277
|
-
- **Agent teams / peer-to-peer collaboration:** Claude Code Agent Teams’ shared task list, direct teammate messaging, and quality-gate hooks are not replicated. `pi-subagents` is strictly parent-child.
|
|
278
|
-
- **Explicit orchestration primitives:** Codex’s `spawn_agents_on_csv`, batch timeouts, and Agents SDK integration make large fan-outs more ergonomic. `pi-subagents` requires the parent LLM to drive every spawn.
|
|
279
|
-
- **Built-in browser agent:** Gemini CLI ships a `browser_agent`; `pi-subagents` can only get browsing through an MCP extension or tool.
|
|
280
|
-
- **Sandbox modes:** Codex’s `sandbox_mode: read-only` / container sandbox agents provide stronger safety guarantees than tool allowlists alone.
|
|
281
|
-
- **Coordinator / orchestrator mode:** Roo Code’s Orchestrator mode and Cline/Kilo’s coordinator agents are lightweight ways to auto-route tasks. `pi-subagents` relies on the parent model to pick `subagent_type`.
|
|
282
|
-
- **Aider-style architect/editor split:** this is a cheap, high-leverage specialization pattern that does not require a full subagent runtime.
|
|
283
|
-
- **Audit / hub model:** Continue.dev and OpenHands emphasize team-level configuration, audit trails, and cost guardrails, which `pi-subagents` does not address.
|
|
284
|
-
|
|
285
|
-
### 4.3 Where the design philosophies diverge
|
|
286
|
-
|
|
287
|
-
- **Claude Code / `pi-subagents`** emphasize *task routing and permission shaping* — the parent decides who gets which tools.
|
|
288
|
-
- **Codex** emphasizes *explicit parallel spawning and result collection* — the user must ask for a team.
|
|
289
|
-
- **Gemini CLI** treats subagents *as tools* with simple `@agent` forcing.
|
|
290
|
-
- **Cline/Kilo** and **OpenHands** push toward *persistent, schedulable, cloud-scale agent teams*.
|
|
291
|
-
|
|
292
|
-
`pi-subagents` currently sits closest to Claude Code, but with scheduling, worktree isolation, and cross-extension RPC as its own differentiators.
|
|
293
|
-
|
|
294
|
-
---
|
|
295
|
-
|
|
296
|
-
## 5. Interesting features worth borrowing
|
|
297
|
-
|
|
298
|
-
Based on the survey, the highest-value additions to consider for `pi-subagents` are:
|
|
299
|
-
|
|
300
|
-
1. **Coordinator / orchestrator agent type**
|
|
301
|
-
A built-in mode whose job is to break a large task into subtasks and fan them out to other agents. This would reduce the burden on the parent model and make multi-agent workflows feel more automatic.
|
|
302
|
-
|
|
303
|
-
2. **Batch fan-out primitive**
|
|
304
|
-
Something like Codex’s `spawn_agents_on_csv` or a “map over files/packages” helper would make it easy to run the same agent across many inputs in parallel.
|
|
305
|
-
|
|
306
|
-
3. **Shared task list / agent-team mode**
|
|
307
|
-
Optional peer-to-peer messaging and a shared task board would move `pi-subagents` from parent-child delegation to true team collaboration.
|
|
308
|
-
|
|
309
|
-
4. **Quality-gate hooks**
|
|
310
|
-
Shell hooks at `task-created`, `task-completed`, `agent-idle`, etc., would let teams enforce policies without writing a full extension.
|
|
311
|
-
|
|
312
|
-
5. **Sandbox mode abstraction**
|
|
313
|
-
Beyond worktree isolation, a `sandbox_mode: read-only` (and eventually container) option would make read-only agents safer and more discoverable.
|
|
314
|
-
|
|
315
|
-
6. **Built-in browser agent**
|
|
316
|
-
A default `browser` agent type, even if implemented via a headless browser tool, would match Gemini CLI and Cursor.
|
|
317
|
-
|
|
318
|
-
7. **Architect/editor model split**
|
|
319
|
-
A config option to run a planning model first and a cheaper editing model second could improve complex edits without the cost of full agent orchestration.
|
|
320
|
-
|
|
321
|
-
8. **Audit / cost events**
|
|
322
|
-
Richer event payloads (cost per agent, per project, per user) would make `pi-subagents` more attractive for team usage.
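
The batch fan-out idea in item 2 can be sketched as a generic "map over inputs" helper with a concurrency cap and a per-worker timeout. This is a sketch under stated assumptions, not an existing `pi-subagents` API: `fanOut`, `Outcome`, and the option names `maxConcurrent` and `timeoutMs` are hypothetical, and the `worker` callback stands in for whatever would actually spawn an agent.

```typescript
// Hypothetical result type: one outcome per input, in input order.
type Outcome<T> = { status: "ok"; value: T } | { status: "error"; error: string };

async function fanOut<I, O>(
  inputs: readonly I[],
  worker: (input: I) => Promise<O>,
  opts: { maxConcurrent: number; timeoutMs: number },
): Promise<Outcome<O>[]> {
  const results = new Array<Outcome<O>>(inputs.length);
  let next = 0;

  // Races the worker against a timer. A timed-out worker is reported as an
  // error but is not cancelled; real cancellation would need an abort signal.
  async function runOne(input: I): Promise<Outcome<O>> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error("timeout")), opts.timeoutMs);
    });
    try {
      return { status: "ok", value: await Promise.race([worker(input), timeout]) };
    } catch (err) {
      return { status: "error", error: err instanceof Error ? err.message : String(err) };
    } finally {
      clearTimeout(timer);
    }
  }

  // Each lane pulls the next unclaimed index until the inputs are exhausted.
  async function lane(): Promise<void> {
    while (next < inputs.length) {
      const index = next++;
      results[index] = await runOne(inputs[index] as I);
    }
  }

  const lanes = Math.max(1, Math.min(opts.maxConcurrent, inputs.length));
  await Promise.all(Array.from({ length: lanes }, () => lane()));
  return results;
}
```

Collecting per-input outcomes instead of rejecting on the first failure lets the parent summarise partial results, which matters when a fan-out covers many files or packages.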
|
|
323
|
-
|
|
324
|
-
---
|
|
325
|
-
|
|
326
|
-
## 6. References
|
|
327
|
-
|
|
328
|
-
- [Claude Code subagent SDK docs](https://code.claude.com/docs/en/agent-sdk/subagents)
|
|
329
|
-
- [PubNub — Best practices for Claude Code subagents](https://www.pubnub.com/blog/best-practices-for-claude-code-sub-agents/)
|
|
330
|
-
- [Claude Code agent teams docs](https://code.claude.com/docs/en/agent-teams)
|
|
331
|
-
- [OpenAI Codex — Subagents](https://developers.openai.com/codex/subagents)
|
|
332
|
-
- [OpenAI Codex — Subagent concepts](https://developers.openai.com/codex/concepts/subagents)
|
|
333
|
-
- [OpenAI Codex — Use with the Agents SDK](https://developers.openai.com/codex/guides/agents-sdk)
|
|
334
|
-
- [Simon Willison — Use subagents and custom agents in Codex](https://simonwillison.net/2026/Mar/16/codex-subagents/)
|
|
335
|
-
- [Gemini CLI subagents docs](https://geminicli.com/docs/core/subagents/)
|
|
336
|
-
- [Gemini CLI skills docs](https://geminicli.com/docs/cli/skills/)
|
|
337
|
-
- [Google Developers Blog — Subagents have arrived in Gemini CLI](https://developers.googleblog.com/subagents-have-arrived-in-gemini-cli/)
|
|
338
|
-
- [Cursor 2.4 changelog](https://cursor.com/changelog/2-4)
|
|
339
|
-
- [Cursor 2.4 subagents guide](https://www.aimakers.co/blog/cursor-2-4-subagents/)
|
|
340
|
-
- [Cline homepage](https://cline.bot/)
|
|
341
|
-
- [Kilo Code homepage](https://kilo.ai/)
|
|
342
|
-
- [Roo Code GitHub](https://github.com/RooCodeInc/Roo-Code)
|
|
343
|
-
- [Xebia — Multi-agent workflow with Roo Code](https://xebia.com/blog/multi-agent-workflow-with-roo-code/)
|
|
344
|
-
- [Aider chat modes docs](https://aider.chat/docs/usage/modes.html)
|
|
345
|
-
- [Aider vs Claude Code 2026](https://www.developersdigest.tech/blog/aider-vs-claude-code-2026-update)
|
|
346
|
-
- [Continue.dev agents](https://www.continue.dev/agents)
|
|
347
|
-
- [Continue.dev — What are Continue agents?](https://blog.continue.dev/what-are-continue-agents-any-workflow-your-teams-way/)
|
|
348
|
-
- [OpenHands homepage](https://openhands.dev/)
|
|
349
|
-
- [OpenHands SDK docs](https://docs.openhands.dev/sdk)
|
|
350
|
-
- [Medium — Multi-agent comparison: Codex, Claude Code, Gemini CLI](https://medium.com/@aristojeff/what-are-multi-agent-systems-and-subagents-a-comparison-of-codex-claude-code-and-gemini-cli-304376584f51)
|