@rohaquinlop/pi-subagents 0.2.0 → 0.4.0
- package/README.md +79 -1
- package/agents/researcher.md +1 -0
- package/agents/scout.md +1 -0
- package/agents/worker.md +1 -0
- package/index.ts +669 -8
- package/lib/helpers.ts +32 -1
- package/lib/pipeline-helpers.ts +53 -0
- package/lib/types.ts +53 -0
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# @rohaquinlop/pi-subagents
|
|
2
2
|
|
|
3
|
-
A [pi](https://github.com/earendil-works/pi) extension that registers
|
|
3
|
+
A [pi](https://github.com/earendil-works/pi) extension that registers three agent orchestration tools — `subagent`, `pipeline`, and `loop` — with three built-in agents:
|
|
4
4
|
|
|
5
5
|
## Installation
|
|
6
6
|
|
|
@@ -22,6 +22,8 @@ pi install @rohaquinlop/pi-subagents
|
|
|
22
22
|
|
|
23
23
|
## Usage
|
|
24
24
|
|
|
25
|
+
### Subagent — Single Agent Dispatch
|
|
26
|
+
|
|
25
27
|
One tool call = one subagent:
|
|
26
28
|
```json
|
|
27
29
|
{ "agent": "scout", "task": "Find all auth-related files in src/" }
|
|
@@ -31,6 +33,49 @@ To fan out, emit multiple `subagent` tool calls in the same assistant turn — p
|
|
|
31
33
|
|
|
32
34
|
Each subagent runs as an isolated `pi` process with no inherited context — all context must be in the task description.
|
|
33
35
|
|
|
36
|
+
### Pipeline — Sequential Agent Chains
|
|
37
|
+
|
|
38
|
+
Chain 2–5 agents in sequence where each agent's output feeds as context into the next. Use `{previous}` in a step's task to inject the prior step's output.
|
|
39
|
+
|
|
40
|
+
```json
|
|
41
|
+
{ "tool": "pipeline", "args": { "steps": [
|
|
42
|
+
{ "agent": "scout", "task": "Find all auth-related code in src/" },
|
|
43
|
+
{ "agent": "worker", "task": "Based on these findings:\n{previous}\n\nImplement password reset flow." }
|
|
44
|
+
]}}
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
Each step runs a separate subagent process. The pipeline stops on the first error. Per-step and total usage (tokens, cost, duration) are shown in the TUI.
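The `{previous}` hand-off can be pictured with a small sketch. The helper name `injectPrevious` is hypothetical; the package's own `substitutePlaceholders` is not shown in this part of the diff and may behave differently. The empty-output guard mirrors the `runPipeline` hunk further down, where substitution only happens once a prior step has produced output.

```typescript
// Hypothetical helper for illustration only; not the package's substitutePlaceholders.
function injectPrevious(task: string, previousOutput: string): string {
  // No prior output (e.g. the first step): leave the task untouched.
  if (!previousOutput) return task;
  // split/join replaces every occurrence and treats the output literally,
  // so sequences such as "$&" in the output are not read as replacement patterns.
  return task.split("{previous}").join(previousOutput);
}

const secondStepTask = injectPrevious(
  "Based on these findings:\n{previous}\n\nImplement password reset flow.",
  "src/auth/login.ts handles sessions",
);
console.log(secondStepTask);
```

A step whose task omits `{previous}` is passed through unchanged, so the placeholder has to appear in the task text for the prior output to reach the agent.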
|
|
48
|
+
|
|
49
|
+
### Loop — Iterative Refinement
|
|
50
|
+
|
|
51
|
+
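Run the same agent 2–5 times, passing prior iteration outputs as context; if the accumulated context exceeds an internal size budget (`MAX_LOOP_CONTEXT`), the oldest outputs are dropped first. Optionally use a `judge` agent to stop early when quality is sufficient.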
|
|
52
|
+
|
|
53
|
+
**Basic (fixed iterations):**
|
|
54
|
+
|
|
55
|
+
```json
|
|
56
|
+
{ "tool": "loop", "args": {
|
|
57
|
+
"agent": "worker",
|
|
58
|
+
"task": "Write a comprehensive README for this project.",
|
|
59
|
+
"max_iterations": 3
|
|
60
|
+
}}
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
**With judge for dynamic stopping:**
|
|
64
|
+
|
|
65
|
+
```json
|
|
66
|
+
{ "tool": "loop", "args": {
|
|
67
|
+
"agent": "worker",
|
|
68
|
+
"task": "Write a comprehensive README for this project.",
|
|
69
|
+
"max_iterations": 5,
|
|
70
|
+
"judge": {
|
|
71
|
+
"agent": "reviewer",
|
|
72
|
+
"criteria": "Is this README complete, well-structured, and ready for publication? Answer YES or NO."
|
|
73
|
+
}
|
|
74
|
+
}}
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
The judge evaluates each iteration's output. When satisfied (YES), the loop stops early — no wasted iterations. Judge feedback is passed back to the runner agent for refinement.
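How a YES/NO answer might be turned into a stop decision is sketched below. This is an assumption-laden illustration: the package's `parseJudgeVerdict` lives in `lib/pipeline-helpers.ts` and is not shown here, so `looksSatisfied` is a hypothetical stand-in that only inspects the first word of the judge's reply.

```typescript
// Hypothetical verdict check; the real parseJudgeVerdict may use different rules.
function looksSatisfied(judgeResponse: string): boolean {
  // Take the first whitespace-delimited token of the reply.
  const firstWord = judgeResponse.trim().split(/\s+/)[0] ?? "";
  // Accept "YES", "yes.", "Yes," and so on, but not words such as "yesterday".
  return /^yes\b/i.test(firstWord);
}

console.log(looksSatisfied("YES, ready for publication."));
console.log(looksSatisfied("NO: the installation section is missing."));
```

Asking the judge to lead with a bare YES or NO, as the `criteria` example does, keeps this kind of parsing simple.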
|
|
78
|
+
|
|
34
79
|
## Config
|
|
35
80
|
|
|
36
81
|
Optional `config.json` next to `index.ts`:
|
|
@@ -81,6 +126,14 @@ Frontmatter fields:
|
|
|
81
126
|
|
|
82
127
|
The markdown body becomes the agent's system prompt.
|
|
83
128
|
|
|
129
|
+
Agents can optionally declare a `connector` field in their frontmatter — a prompt template that wraps their output before it's passed as `{previous}` to the next agent in a pipeline:
|
|
130
|
+
|
|
131
|
+
```yaml
|
|
132
|
+
connector: "## Key findings from codebase exploration:\n\n{output}"
|
|
133
|
+
```
|
|
134
|
+
|
|
135
|
+
Connectors use single-line format with `\n` for line breaks. They can be overridden per-step via the optional `connector` field on pipeline steps.
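A connector can be thought of as a template with an `{output}` slot. The sketch below is a minimal illustration under assumptions: `applyConnector` is a hypothetical name (the package's `formatConnectorContext` is not shown in this part of the diff), and it assumes the template may still contain a literal backslash-n from the single-line frontmatter format.

```typescript
// Hypothetical helper for illustration only; not the package's formatConnectorContext.
function applyConnector(output: string, connector?: string): string {
  // No connector configured: pass the output through unchanged.
  if (!connector) return output;
  // Assumption: a literal backslash-n in the template stands for a line break.
  const template = connector.replace(/\\n/g, "\n");
  return template.split("{output}").join(output);
}

// A step-level connector takes precedence over the agent-level one.
const agentConnector: string | undefined = "## Key findings from codebase exploration:\\n\\n{output}";
const stepConnector: string | undefined = undefined;
const wrapped = applyConnector("Found 3 auth files.", stepConnector ?? agentConnector);
console.log(wrapped);
```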
|
|
136
|
+
|
|
84
137
|
### 2. Register agents via `globalThis.__pi_subagents`
|
|
85
138
|
|
|
86
139
|
Pi loads extensions via jiti, which creates separate module instances. Direct imports from the subagents extension will reference a different `agents` array than the one the `subagent` tool uses. Use the `globalThis` bridge instead:
|
|
@@ -159,6 +212,27 @@ Built-in tools (`read`, `write`, `edit`, `bash`, `grep`, `find`, `ls`) work auto
|
|
|
159
212
|
|
|
160
213
|
The `subagent` tool itself is listed in `CUSTOM_TOOL_EXTENSIONS` pointing back to this extension's own `index.ts` — that's how an agent like `worker` can recursively spawn other agents. Recursion is bounded only by each agent's `subagent_agents` allowlist (e.g. worker can spawn scout/researcher, neither of which declares the `subagent` tool, so the chain stops at depth 2).
|
|
161
214
|
|
|
215
|
+
### 4. Model-specific extensions (no manual mapping)
|
|
216
|
+
|
|
217
|
+
Extensions that apply to specific models (not tools) can declare `appliesToModels`
|
|
218
|
+
in their `package.json` and auto-load without any `CUSTOM_TOOL_EXTENSIONS` mapping:
|
|
219
|
+
|
|
220
|
+
```json
|
|
221
|
+
{
|
|
222
|
+
"pi": {
|
|
223
|
+
"extensions": ["./extensions/index.ts"],
|
|
224
|
+
"appliesToModels": ["deepseek-*"]
|
|
225
|
+
}
|
|
226
|
+
}
|
|
227
|
+
```
|
|
228
|
+
|
|
229
|
+
When a subagent's configured model matches one of the patterns, the extension is
|
|
230
|
+
loaded via `--extension` in the child process. Glob patterns (`*` wildcard) match
|
|
231
|
+
the **model ID** (the portion after the first `/`, e.g. `deepseek-v4-flash` in
|
|
232
|
+
`nan/deepseek-v4-flash`). Plain strings without wildcards match the **provider
|
|
233
|
+
name** (the portion before the `/`, e.g. `nan` in `nan/deepseek-v4-flash`).
|
|
234
|
+
Matching is case-insensitive.
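These rules can be tried out with a standalone copy of `matchModelPattern` as it appears in the `package/index.ts` hunk of this diff. The sample model strings other than `nan/deepseek-v4-flash` are made up for illustration.

```typescript
// Standalone copy of the matching logic from package/index.ts in this diff.
function matchModelPattern(model: string, pattern: string): boolean {
  // The model string is split at the first "/".
  const slashIdx = model.indexOf("/");
  const hasProvider = slashIdx !== -1;
  const modelId = hasProvider ? model.slice(slashIdx + 1) : model;
  const provider = hasProvider ? model.slice(0, slashIdx) : "";

  // Pattern without wildcards: compare against the provider name.
  if (!pattern.includes("*") && hasProvider) {
    return provider.toLowerCase() === pattern.toLowerCase();
  }

  // Glob pattern: escape regex-special characters, then turn the escaped * into .*
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&").replace(/\\\*/g, ".*");
  return new RegExp("^" + escaped + "$", "i").test(modelId);
}

console.log(matchModelPattern("nan/deepseek-v4-flash", "deepseek-*")); // glob against the model ID
console.log(matchModelPattern("nan/deepseek-v4-flash", "NAN")); // plain string against the provider
console.log(matchModelPattern("nan/deepseek-v4-flash", "deepseek-v4-flash")); // plain string is not compared to the model ID here
```

One consequence worth noting: when the model string carries a provider prefix, a plain pattern equal to the model ID does not match, so a trailing `*` is needed to target a model family.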
|
|
235
|
+
|
|
162
236
|
## Structure
|
|
163
237
|
|
|
164
238
|
```
|
|
@@ -169,3 +243,7 @@ The `subagent` tool itself is listed in `CUSTOM_TOOL_EXTENSIONS` pointing back t
|
|
|
169
243
|
└── tools/ # Extensions loaded into subagent processes
|
|
170
244
|
└── safe-bash.ts # bash with dangerous command blocking
|
|
171
245
|
```
|
|
246
|
+
|
|
247
|
+
## Acknowledgements
|
|
248
|
+
|
|
249
|
+
The pipeline and loop tools are conceptually inspired by [RecursiveMAS](https://arxiv.org/abs/2604.25917) — a research framework for scaling agent collaboration through iterative refinement and system-level orchestration.
|
package/agents/researcher.md
CHANGED
|
@@ -4,6 +4,7 @@ description: Web researcher — searches the web and synthesizes findings
|
|
|
4
4
|
tools: web_search, web_fetch
|
|
5
5
|
model: deepseek-v4-flash
|
|
6
6
|
thinking: medium
|
|
7
|
+
connector: "## Research findings:\n\n{output}"
|
|
7
8
|
---
|
|
8
9
|
|
|
9
10
|
You are a research specialist. Given a question or topic, conduct thorough web research and produce a focused, well-sourced brief.
|
package/agents/scout.md
CHANGED
|
@@ -4,6 +4,7 @@ description: Fast codebase recon — explores files, finds patterns, maps archit
|
|
|
4
4
|
tools: read, grep, find, ls
|
|
5
5
|
model: deepseek-v4-flash
|
|
6
6
|
thinking: medium
|
|
7
|
+
connector: "## Key findings from codebase exploration:\n\n{output}"
|
|
7
8
|
---
|
|
8
9
|
|
|
9
10
|
You are a scout agent. Quickly investigate a codebase and return structured findings.
|
package/agents/worker.md
CHANGED
|
@@ -5,6 +5,7 @@ tools: read, write, edit, safe_bash, web_search, web_fetch, subagent
|
|
|
5
5
|
subagent_agents: scout, researcher
|
|
6
6
|
model: deepseek-v4-flash
|
|
7
7
|
thinking: medium
|
|
8
|
+
connector: "## Implementation results:\n\n{output}"
|
|
8
9
|
---
|
|
9
10
|
|
|
10
11
|
You are a worker agent. You operate in an isolated context — you have no knowledge of any prior conversation.
|
package/index.ts
CHANGED
|
@@ -14,8 +14,9 @@ import { Container, Markdown, Spacer, Text, visibleWidth } from "@earendil-works
|
|
|
14
14
|
import { Type } from "@sinclair/typebox";
|
|
15
15
|
import "./tools/safe-bash";
|
|
16
16
|
|
|
17
|
-
import type { AgentConfig } from "./lib/types";
|
|
18
|
-
import { discoverAgents, mergeAgents } from "./lib/helpers";
|
|
17
|
+
import type { AgentConfig, AgentUsage, PipelineStepResult, PipelineResult, LoopIterationResult, LoopResult } from "./lib/types";
|
|
18
|
+
import { discoverAgents, mergeAgents, substitutePlaceholders, formatConnectorContext } from "./lib/helpers";
|
|
19
|
+
import { zeroUsage, accumulateUsage, validateAgents, MAX_LOOP_CONTEXT, parseJudgeVerdict } from "./lib/pipeline-helpers";
|
|
19
20
|
|
|
20
21
|
interface ToolEvent {
|
|
21
22
|
tool: string;
|
|
@@ -64,11 +65,13 @@ interface AgentResult {
|
|
|
64
65
|
progress: AgentProgress;
|
|
65
66
|
model?: string;
|
|
66
67
|
contextWindow?: number;
|
|
67
|
-
usage:
|
|
68
|
+
usage: AgentUsage;
|
|
68
69
|
}
|
|
69
70
|
|
|
70
71
|
interface Details {
|
|
71
|
-
results
|
|
72
|
+
results?: AgentResult[];
|
|
73
|
+
pipelineResult?: PipelineResult & { currentStep?: number };
|
|
74
|
+
loopResult?: LoopResult & { currentIteration?: number };
|
|
72
75
|
}
|
|
73
76
|
|
|
74
77
|
// ── Config ─────────────────────────────────────────────────────────────
|
|
@@ -156,9 +159,47 @@ function buildCustomToolExtensions(): Record<string, string> {
|
|
|
156
159
|
|
|
157
160
|
const CUSTOM_TOOL_EXTENSIONS: Record<string, string> = buildCustomToolExtensions();
|
|
158
161
|
|
|
162
|
+
interface ModelExtension {
|
|
163
|
+
patterns: string[];
|
|
164
|
+
path: string;
|
|
165
|
+
}
|
|
166
|
+
|
|
167
|
+
function buildModelExtensions(): ModelExtension[] {
|
|
168
|
+
const result: ModelExtension[] = [];
|
|
169
|
+
try {
|
|
170
|
+
const entries = fs.readdirSync(EXT_BASE, { withFileTypes: true });
|
|
171
|
+
for (const entry of entries) {
|
|
172
|
+
if (!entry.isDirectory()) continue;
|
|
173
|
+
const pkgPath = path.join(EXT_BASE, entry.name, "package.json");
|
|
174
|
+
if (!fs.existsSync(pkgPath)) continue;
|
|
175
|
+
try {
|
|
176
|
+
const pkg = JSON.parse(fs.readFileSync(pkgPath, "utf-8"));
|
|
177
|
+
const patterns: unknown = pkg?.pi?.appliesToModels;
|
|
178
|
+
const extEntries: unknown = pkg?.pi?.extensions;
|
|
179
|
+
if (!Array.isArray(patterns) || patterns.length === 0) continue;
|
|
180
|
+
if (!patterns.every((p: unknown): p is string => typeof p === "string")) continue;
|
|
181
|
+
if (!Array.isArray(extEntries) || extEntries.length === 0) continue;
|
|
182
|
+
if (!extEntries.every((e: unknown): e is string => typeof e === "string")) continue;
|
|
183
|
+
const extPath = path.join(EXT_BASE, entry.name, extEntries[0]);
|
|
184
|
+
if (fs.existsSync(extPath)) {
|
|
185
|
+
result.push({ patterns, path: extPath });
|
|
186
|
+
}
|
|
187
|
+
} catch {
|
|
188
|
+
// skip corrupted package.json
|
|
189
|
+
}
|
|
190
|
+
}
|
|
191
|
+
} catch {
|
|
192
|
+
// skip if EXT_BASE doesn't exist
|
|
193
|
+
}
|
|
194
|
+
return result;
|
|
195
|
+
}
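The validation that `buildModelExtensions` applies to each `package.json` can be isolated from the filesystem walk. The sketch below is an illustration, not part of the package: `readModelExtensionDecl` and `ModelExtensionDecl` are hypothetical names, and only the shape checks visible above are reproduced (non-empty string arrays for `pi.appliesToModels` and `pi.extensions`, first extension entry used).

```typescript
// Hypothetical, filesystem-free version of the checks in buildModelExtensions.
interface ModelExtensionDecl {
  patterns: string[];
  entry: string;
}

function isNonEmptyStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.length > 0 && value.every((v) => typeof v === "string");
}

function readModelExtensionDecl(pkg: unknown): ModelExtensionDecl | undefined {
  const pi = (pkg as { pi?: { appliesToModels?: unknown; extensions?: unknown } } | null)?.pi;
  if (!pi) return undefined;
  const patterns: unknown = pi.appliesToModels;
  const entries: unknown = pi.extensions;
  if (!isNonEmptyStringArray(patterns) || !isNonEmptyStringArray(entries)) return undefined;
  // Only the first extension entry is used, as in the original loop.
  const [entry] = entries;
  if (entry === undefined) return undefined;
  return { patterns, entry };
}

const decl = readModelExtensionDecl(
  JSON.parse('{"pi":{"extensions":["./extensions/index.ts"],"appliesToModels":["deepseek-*"]}}'),
);
console.log(decl);
```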
|
|
196
|
+
|
|
197
|
+
const MODEL_EXTENSIONS: ModelExtension[] = buildModelExtensions();
|
|
198
|
+
|
|
159
199
|
// ── Agent Discovery & Registration ────────────────────────────────────
|
|
160
200
|
|
|
161
201
|
let agents: AgentConfig[] = [];
|
|
202
|
+
let semaphore: Semaphore;
|
|
162
203
|
|
|
163
204
|
// Read once at module load. If we're a child subagent process whose parent
|
|
164
205
|
// pinned an allowlist, we silently ignore any agent (built-in OR registered
|
|
@@ -300,6 +341,29 @@ function truncLine(text: string, maxWidth: number): string {
|
|
|
300
341
|
|
|
301
342
|
// ── Subagent Execution ────────────────────────────────────────────────
|
|
302
343
|
|
|
344
|
+
/**
|
|
345
|
+
* Match a model string against a glob-style pattern (supports only `*` as wildcard).
|
|
346
|
+
* Pattern "deepseek-*" matches "nan/deepseek-v4-flash" and "deepseek-v4-pro".
|
|
347
|
+
* Plain strings without wildcards match the provider name (portion before the first `/`); if the model has no provider prefix, they must equal the model ID.
|
|
348
|
+
*/
|
|
349
|
+
function matchModelPattern(model: string, pattern: string): boolean {
|
|
350
|
+
const slashIdx = model.indexOf("/");
|
|
351
|
+
const hasProvider = slashIdx !== -1;
|
|
352
|
+
const modelId = hasProvider ? model.slice(slashIdx + 1) : model;
|
|
353
|
+
const provider = hasProvider ? model.slice(0, slashIdx) : "";
|
|
354
|
+
|
|
355
|
+
// Pattern without wildcards: match against provider name
|
|
356
|
+
if (!pattern.includes("*") && hasProvider) {
|
|
357
|
+
return provider.toLowerCase() === pattern.toLowerCase();
|
|
358
|
+
}
|
|
359
|
+
|
|
360
|
+
// Glob pattern: match against model ID
|
|
361
|
+
// Escape regex-special characters, then convert escaped * back to .*
|
|
362
|
+
const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&").replace(/\\\*/g, ".*");
|
|
363
|
+
const regex = new RegExp("^" + escaped + "$", "i");
|
|
364
|
+
return regex.test(modelId);
|
|
365
|
+
}
|
|
366
|
+
|
|
303
367
|
async function buildPiArgs(
|
|
304
368
|
agent: AgentConfig,
|
|
305
369
|
task: string,
|
|
@@ -348,6 +412,16 @@ async function buildPiArgs(
|
|
|
348
412
|
args.push("--no-tools");
|
|
349
413
|
}
|
|
350
414
|
|
|
415
|
+
// Auto-load model-specific extensions (e.g. deepseek-cache)
|
|
416
|
+
for (const me of MODEL_EXTENSIONS) {
|
|
417
|
+
for (const pattern of me.patterns) {
|
|
418
|
+
if (matchModelPattern(agent.model, pattern)) {
|
|
419
|
+
extensionPaths.add(me.path);
|
|
420
|
+
break;
|
|
421
|
+
}
|
|
422
|
+
}
|
|
423
|
+
}
|
|
424
|
+
|
|
351
425
|
for (const extPath of extensionPaths) {
|
|
352
426
|
args.push("--extension", extPath);
|
|
353
427
|
}
|
|
@@ -829,11 +903,344 @@ function renderAgentProgress(
|
|
|
829
903
|
return c;
|
|
830
904
|
}
|
|
831
905
|
|
|
906
|
+
// ── Pipeline Execution ────────────────────────────────────────────────
|
|
907
|
+
|
|
908
|
+
async function runPipeline(
|
|
909
|
+
steps: Array<{ agent: string; task: string; connector?: string }>,
|
|
910
|
+
cwd: string,
|
|
911
|
+
signal: AbortSignal | undefined,
|
|
912
|
+
onUpdate?: (stepIndex: number, progress: AgentProgress, usage: AgentUsage) => void,
|
|
913
|
+
): Promise<PipelineResult> {
|
|
914
|
+
const results: PipelineStepResult[] = [];
|
|
915
|
+
let previousOutput = "";
|
|
916
|
+
let totalUsage = zeroUsage();
|
|
917
|
+
const startTime = Date.now();
|
|
918
|
+
|
|
919
|
+
for (let i = 0; i < steps.length; i++) {
|
|
920
|
+
if (signal?.aborted) break;
|
|
921
|
+
|
|
922
|
+
const step = steps[i];
|
|
923
|
+
const agent = agents.find((a) => a.name === step.agent);
|
|
924
|
+
if (!agent) {
|
|
925
|
+
const errMsg = `Unknown agent: ${step.agent}`;
|
|
926
|
+
results.push({
|
|
927
|
+
agent: step.agent, task: step.task, output: `Error: ${errMsg}`,
|
|
928
|
+
exitCode: 1, usage: zeroUsage(), durationMs: 0,
|
|
929
|
+
});
|
|
930
|
+
return {
|
|
931
|
+
steps: results, finalOutput: previousOutput || "(no output)",
|
|
932
|
+
stoppedAt: i, error: errMsg,
|
|
933
|
+
totalUsage, totalDurationMs: Date.now() - startTime,
|
|
934
|
+
};
|
|
935
|
+
}
|
|
936
|
+
|
|
937
|
+
// Build task with {previous} substitution
|
|
938
|
+
let taskWithContext = step.task;
|
|
939
|
+
if (previousOutput && taskWithContext.includes("{previous}")) {
|
|
940
|
+
// Apply connector formatting if available (step-level overrides agent-level)
|
|
941
|
+
const connector = step.connector ?? agent.connector;
|
|
942
|
+
const formattedOutput = formatConnectorContext(previousOutput, connector);
|
|
943
|
+
taskWithContext = substitutePlaceholders(step.task, formattedOutput);
|
|
944
|
+
}
|
|
945
|
+
|
|
946
|
+
const stepStart = Date.now();
|
|
947
|
+
const result = await semaphore.run(() =>
|
|
948
|
+
runSubagent(agent, taskWithContext, cwd, signal, (progress, usage) => {
|
|
949
|
+
onUpdate?.(i, progress, usage);
|
|
950
|
+
}),
|
|
951
|
+
);
|
|
952
|
+
|
|
953
|
+
const stepResult: PipelineStepResult = {
|
|
954
|
+
agent: step.agent, task: step.task, output: result.output,
|
|
955
|
+
exitCode: result.exitCode, usage: result.usage,
|
|
956
|
+
durationMs: Date.now() - stepStart,
|
|
957
|
+
};
|
|
958
|
+
results.push(stepResult);
|
|
959
|
+
totalUsage = accumulateUsage(totalUsage, result.usage);
|
|
960
|
+
previousOutput = result.output;
|
|
961
|
+
|
|
962
|
+
// Stop on error
|
|
963
|
+
if (result.exitCode !== 0 || result.progress.error) {
|
|
964
|
+
return {
|
|
965
|
+
steps: results, finalOutput: previousOutput,
|
|
966
|
+
stoppedAt: i, error: result.progress.error || `Agent ${step.agent} exited with code ${result.exitCode}`,
|
|
967
|
+
totalUsage, totalDurationMs: Date.now() - startTime,
|
|
968
|
+
};
|
|
969
|
+
}
|
|
970
|
+
}
|
|
971
|
+
|
|
972
|
+
return {
|
|
973
|
+
steps: results, finalOutput: previousOutput || "(no output)",
|
|
974
|
+
totalUsage, totalDurationMs: Date.now() - startTime,
|
|
975
|
+
};
|
|
976
|
+
}
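The control flow of `runPipeline` (run steps in order, add up usage, stop at the first failing step) can be shown in a simplified, synchronous form. Everything named here (`ChainUsage`, `StepOutcome`, `ChainResult`, `runChain`) is hypothetical, and the usage fields are assumed from the `input`, `output`, and `cost` properties read by the rendering code; the real function is asynchronous and spawns subagent processes.

```typescript
// Simplified, synchronous sketch; names are illustrative, not exports of this package.
interface ChainUsage { input: number; output: number; cost: number }
interface StepOutcome { output: string; exitCode: number; usage: ChainUsage }
interface ChainResult { outputs: string[]; total: ChainUsage; stoppedAt?: number }

function runChain(
  tasks: string[],
  runStep: (task: string, previous: string) => StepOutcome,
): ChainResult {
  const outputs: string[] = [];
  let total: ChainUsage = { input: 0, output: 0, cost: 0 };
  let previous = "";
  let index = 0;
  for (const task of tasks) {
    const outcome = runStep(task, previous);
    outputs.push(outcome.output);
    // Usage is accumulated even for the step that fails.
    total = {
      input: total.input + outcome.usage.input,
      output: total.output + outcome.usage.output,
      cost: total.cost + outcome.usage.cost,
    };
    previous = outcome.output;
    // Stop on the first non-zero exit code; later steps never run.
    if (outcome.exitCode !== 0) {
      return { outputs, total, stoppedAt: index };
    }
    index++;
  }
  return { outputs, total };
}

// Usage: the second of three steps fails, so the third is skipped.
const demo = runChain(["explore", "implement", "review"], (task) => ({
  output: `${task} done`,
  exitCode: task === "implement" ? 1 : 0,
  usage: { input: 100, output: 20, cost: 0.25 },
}));
console.log(demo.stoppedAt, demo.outputs.length, demo.total.cost);
```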
|
|
977
|
+
|
|
978
|
+
// ── Loop Execution ─────────────────────────────────────────────────────
|
|
979
|
+
|
|
980
|
+
async function runLoop(
|
|
981
|
+
agentName: string,
|
|
982
|
+
task: string,
|
|
983
|
+
maxIterations: number,
|
|
984
|
+
judge: { agent: string; criteria: string } | undefined,
|
|
985
|
+
cwd: string,
|
|
986
|
+
signal: AbortSignal | undefined,
|
|
987
|
+
onUpdate?: (iteration: number, progress: AgentProgress, usage: AgentUsage) => void,
|
|
988
|
+
): Promise<LoopResult> {
|
|
989
|
+
const agent = agents.find((a) => a.name === agentName);
|
|
990
|
+
if (!agent) throw new Error(`Unknown agent: ${agentName}`);
|
|
991
|
+
|
|
992
|
+
const iterations: LoopIterationResult[] = [];
|
|
993
|
+
let priorOutputs: string[] = [];
|
|
994
|
+
let stoppedBecause: LoopResult["stoppedBecause"] = "max_iterations";
|
|
995
|
+
let totalUsage = zeroUsage();
|
|
996
|
+
const startTime = Date.now();
|
|
997
|
+
|
|
998
|
+
for (let i = 0; i < maxIterations; i++) {
|
|
999
|
+
if (signal?.aborted) break;
|
|
1000
|
+
|
|
1001
|
+
// Build task with accumulated context
|
|
1002
|
+
let fullTask = task;
|
|
1003
|
+
if (priorOutputs.length > 0) {
|
|
1004
|
+
// Enforce MAX_LOOP_CONTEXT budget: drop oldest iterations first
|
|
1005
|
+
let totalContext = 0;
|
|
1006
|
+
let keptOutputs: string[] = [];
|
|
1007
|
+
for (let j = priorOutputs.length - 1; j >= 0; j--) {
|
|
1008
|
+
const block = `--- Iteration ${j + 1} output ---\n${priorOutputs[j]}`;
|
|
1009
|
+
if (totalContext + block.length <= MAX_LOOP_CONTEXT) {
|
|
1010
|
+
keptOutputs.unshift(block);
|
|
1011
|
+
totalContext += block.length;
|
|
1012
|
+
} else {
|
|
1013
|
+
break;
|
|
1014
|
+
}
|
|
1015
|
+
}
|
|
1016
|
+
const contextBlock = keptOutputs.join("\n\n");
|
|
1017
|
+
fullTask = `${task}\n\n## Prior iterations:\n${contextBlock}`;
|
|
1018
|
+
}
|
|
1019
|
+
|
|
1020
|
+
const iterStart = Date.now();
|
|
1021
|
+
const result = await semaphore.run(() =>
|
|
1022
|
+
runSubagent(agent, fullTask, cwd, signal, (progress, usage) => {
|
|
1023
|
+
onUpdate?.(i, progress, usage);
|
|
1024
|
+
}),
|
|
1025
|
+
);
|
|
1026
|
+
|
|
1027
|
+
const iterResult: LoopIterationResult = {
|
|
1028
|
+
iteration: i + 1, output: result.output,
|
|
1029
|
+
exitCode: result.exitCode, usage: result.usage,
|
|
1030
|
+
durationMs: Date.now() - iterStart,
|
|
1031
|
+
};
|
|
1032
|
+
totalUsage = accumulateUsage(totalUsage, result.usage);
|
|
1033
|
+
|
|
1034
|
+
// Judge evaluation (if configured)
|
|
1035
|
+
if (judge && result.exitCode === 0 && !result.progress.error) {
|
|
1036
|
+
const judgeAgent = agents.find((a) => a.name === judge.agent);
|
|
1037
|
+
if (judgeAgent) {
|
|
1038
|
+
const judgePrompt = `Evaluate this output against the criteria below. Respond with YES if satisfied, or NO with specific feedback.\n\nCriteria: ${judge.criteria}\n\nOutput to evaluate:\n${result.output}`;
|
|
1039
|
+
const judgeResult = await semaphore.run(() =>
|
|
1040
|
+
runSubagent(judgeAgent, judgePrompt, cwd, signal),
|
|
1041
|
+
);
|
|
1042
|
+
totalUsage = accumulateUsage(totalUsage, judgeResult.usage);
|
|
1043
|
+
|
|
1044
|
+
// Parse judge verdict
|
|
1045
|
+
const satisfied = parseJudgeVerdict(judgeResult.output);
|
|
1046
|
+
|
|
1047
|
+
iterResult.judgeVerdict = { satisfied, response: judgeResult.output };
|
|
1048
|
+
|
|
1049
|
+
if (satisfied) {
|
|
1050
|
+
iterations.push(iterResult);
|
|
1051
|
+
stoppedBecause = "judge_satisfied";
|
|
1052
|
+
return {
|
|
1053
|
+
iterations, finalOutput: result.output,
|
|
1054
|
+
stoppedBecause, totalUsage, totalDurationMs: Date.now() - startTime,
|
|
1055
|
+
};
|
|
1056
|
+
}
|
|
1057
|
+
}
|
|
1058
|
+
}
|
|
1059
|
+
|
|
1060
|
+
iterations.push(iterResult);
|
|
1061
|
+
priorOutputs.push(result.output);
|
|
1062
|
+
|
|
1063
|
+
if (result.exitCode !== 0 || result.progress.error) {
|
|
1064
|
+
stoppedBecause = "error";
|
|
1065
|
+
return {
|
|
1066
|
+
iterations, finalOutput: result.output || "(error)",
|
|
1067
|
+
stoppedBecause, totalUsage, totalDurationMs: Date.now() - startTime,
|
|
1068
|
+
};
|
|
1069
|
+
}
|
|
1070
|
+
}
|
|
1071
|
+
|
|
1072
|
+
return {
|
|
1073
|
+
iterations, finalOutput: priorOutputs[priorOutputs.length - 1] || "(no output)",
|
|
1074
|
+
stoppedBecause: "max_iterations",
|
|
1075
|
+
totalUsage, totalDurationMs: Date.now() - startTime,
|
|
1076
|
+
};
|
|
1077
|
+
}
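The context budgeting inside `runLoop` can be lifted into a pure function for clarity. The sketch below follows the loop above (newest iterations first, stop at the first block that would exceed the budget); `selectPriorContext` is a hypothetical name, and the budget is passed as a parameter because the value of `MAX_LOOP_CONTEXT` is not shown in this part of the diff. Sizes are measured with `String.length`, as in the original.

```typescript
// Hypothetical extraction of the budgeting loop in runLoop.
function selectPriorContext(priorOutputs: string[], maxChars: number): string[] {
  const kept: string[] = [];
  let total = 0;
  // Walk from the newest iteration backwards so the oldest are dropped first.
  for (let j = priorOutputs.length - 1; j >= 0; j--) {
    const block = `--- Iteration ${j + 1} output ---\n${priorOutputs[j]}`;
    if (total + block.length > maxChars) break;
    kept.unshift(block); // keep chronological order in the final context
    total += block.length;
  }
  return kept;
}

const context = selectPriorContext(["first draft", "second draft", "third draft"], 80);
console.log(context.join("\n\n"));
```

Because the walk stops at the first block that does not fit, an oversized recent iteration also excludes every older one, even if an older block would fit on its own.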
|
|
1078
|
+
|
|
1079
|
+
// ── Pipeline / Loop Rendering ─────────────────────────────────────────
|
|
1080
|
+
|
|
1081
|
+
function renderPipelineResult(
|
|
1082
|
+
result: PipelineResult & { currentStep?: number },
|
|
1083
|
+
theme: Theme,
|
|
1084
|
+
expanded: boolean,
|
|
1085
|
+
w: number,
|
|
1086
|
+
): Container {
|
|
1087
|
+
const c = new Container();
|
|
1088
|
+
|
|
1089
|
+
// Header
|
|
1090
|
+
c.addChild(new Text(
|
|
1091
|
+
`${theme.fg("toolTitle", theme.bold("pipeline"))} — ${result.steps.length} steps · ${formatDuration(result.totalDurationMs)}`,
|
|
1092
|
+
0, 0,
|
|
1093
|
+
));
|
|
1094
|
+
c.addChild(new Spacer(1));
|
|
1095
|
+
|
|
1096
|
+
// Steps
|
|
1097
|
+
for (let i = 0; i < result.steps.length; i++) {
|
|
1098
|
+
const step = result.steps[i];
|
|
1099
|
+
const icon = step.exitCode === 0
|
|
1100
|
+
? theme.fg("success", "✓")
|
|
1101
|
+
: theme.fg("error", "✗");
|
|
1102
|
+
|
|
1103
|
+
if (!expanded) {
|
|
1104
|
+
const arrow = i < result.steps.length - 1 && result.steps[i].exitCode === 0 && result.stoppedAt === undefined
|
|
1105
|
+
? theme.fg("dim", " → ")
|
|
1106
|
+
: "";
|
|
1107
|
+
c.addChild(new Text(
|
|
1108
|
+
` ${icon} ${theme.fg("accent", step.agent)}${arrow}`,
|
|
1109
|
+
0, 0,
|
|
1110
|
+
));
|
|
1111
|
+
} else {
|
|
1112
|
+
c.addChild(new Text(
|
|
1113
|
+
` ${icon} ${theme.fg("accent", step.agent)} — ${formatDuration(step.durationMs)}`,
|
|
1114
|
+
0, 0,
|
|
1115
|
+
));
|
|
1116
|
+
c.addChild(new Text(
|
|
1117
|
+
` ${theme.fg("dim", "Task:")} ${truncLine(step.task, w - 20)}`,
|
|
1118
|
+
0, 0,
|
|
1119
|
+
));
|
|
1120
|
+
if (step.output) {
|
|
1121
|
+
c.addChild(new Spacer(1));
|
|
1122
|
+
const mdTheme = getMarkdownTheme();
|
|
1123
|
+
c.addChild(new Markdown(step.output, 2, 0, mdTheme));
|
|
1124
|
+
}
|
|
1125
|
+
if (i < result.steps.length - 1 && result.stoppedAt === undefined) {
|
|
1126
|
+
c.addChild(new Text(theme.fg("dim", " ↓"), 0, 0));
|
|
1127
|
+
}
|
|
1128
|
+
}
|
|
1129
|
+
}
|
|
1130
|
+
|
|
1131
|
+
// Show running indicator if pipeline is still executing
|
|
1132
|
+
if (result.currentStep !== undefined && result.currentStep >= result.steps.length) {
|
|
1133
|
+
if (!expanded) {
|
|
1134
|
+
const hasCompletedSteps = result.steps.length > 0;
|
|
1135
|
+
const lastCompletedOk = hasCompletedSteps && result.steps[result.steps.length - 1].exitCode === 0;
|
|
1136
|
+
const arrow = hasCompletedSteps && lastCompletedOk ? theme.fg("dim", " → ") : "";
|
|
1137
|
+
c.addChild(new Text(
|
|
1138
|
+
` ${arrow}${theme.fg("warning", "⟳")} ${theme.fg("dim", "running...")}`,
|
|
1139
|
+
0, 0,
|
|
1140
|
+
));
|
|
1141
|
+
}
|
|
1142
|
+
}
|
|
1143
|
+
|
|
1144
|
+
// Error message if pipeline failed
|
|
1145
|
+
if (result.error) {
|
|
1146
|
+
c.addChild(new Spacer(1));
|
|
1147
|
+
c.addChild(new Text(theme.fg("error", `Stopped at step ${(result.stoppedAt ?? 0) + 1}: ${result.error}`), 0, 0));
|
|
1148
|
+
}
|
|
1149
|
+
|
|
1150
|
+
// Usage summary
|
|
1151
|
+
c.addChild(new Spacer(1));
|
|
1152
|
+
const usageParts: string[] = [];
|
|
1153
|
+
if (result.totalUsage.input) usageParts.push(theme.fg("dim", `↑${formatTokens(result.totalUsage.input)}`));
|
|
1154
|
+
if (result.totalUsage.output) usageParts.push(theme.fg("dim", `↓${formatTokens(result.totalUsage.output)}`));
|
|
1155
|
+
if (result.totalUsage.cost) usageParts.push(theme.fg("dim", `$${result.totalUsage.cost.toFixed(3)}`));
|
|
1156
|
+
if (usageParts.length) c.addChild(new Text(usageParts.join(" "), 0, 0));
|
|
1157
|
+
|
|
1158
|
+
return c;
|
|
1159
|
+
}
|
|
1160
|
+
|
|
1161
|
+
function renderLoopResult(
|
|
1162
|
+
result: LoopResult & { currentIteration?: number },
|
|
1163
|
+
theme: Theme,
|
|
1164
|
+
expanded: boolean,
|
|
1165
|
+
w: number,
|
|
1166
|
+
): Container {
|
|
1167
|
+
const c = new Container();
|
|
1168
|
+
|
|
1169
|
+
const stoppedLabel = result.stoppedBecause === "judge_satisfied"
|
|
1170
|
+
? theme.fg("success", "judge satisfied")
|
|
1171
|
+
: result.stoppedBecause === "error"
|
|
1172
|
+
? theme.fg("error", "stopped (error)")
|
|
1173
|
+
: theme.fg("dim", `max ${result.iterations.length} iterations`);
|
|
1174
|
+
|
|
1175
|
+
// Header
|
|
1176
|
+
c.addChild(new Text(
|
|
1177
|
+
`${theme.fg("toolTitle", theme.bold("loop"))} — ${result.iterations.length} iterations · ${stoppedLabel} · ${formatDuration(result.totalDurationMs)}`,
|
|
1178
|
+
0, 0,
|
|
1179
|
+
));
|
|
1180
|
+
c.addChild(new Spacer(1));
|
|
1181
|
+
|
|
1182
|
+
// Iterations
|
|
1183
|
+
result.iterations.forEach((iter, idx) => {
|
|
1184
|
+
const icon = iter.exitCode === 0
|
|
1185
|
+
? theme.fg("success", "✓")
|
|
1186
|
+
: theme.fg("error", "✗");
|
|
1187
|
+
|
|
1188
|
+
const verdictStr = iter.judgeVerdict
|
|
1189
|
+
? (iter.judgeVerdict.satisfied
|
|
1190
|
+
? theme.fg("success", " (YES)")
|
|
1191
|
+
: theme.fg("warning", " (NO)"))
|
|
1192
|
+
: "";
|
|
1193
|
+
|
|
1194
|
+
if (!expanded) {
|
|
1195
|
+
const isLast = idx === result.iterations.length - 1;
|
|
1196
|
+
const arrow = isLast ? "" : theme.fg("dim", " → ");
|
|
1197
|
+
c.addChild(new Text(
|
|
1198
|
+
` ${icon} ${theme.fg("accent", `Iteration ${iter.iteration}`)}${verdictStr}${arrow}`,
|
|
1199
|
+
0, 0,
|
|
1200
|
+
));
|
|
1201
|
+
} else {
|
|
1202
|
+
c.addChild(new Text(
|
|
1203
|
+
` ${icon} ${theme.fg("accent", `Iteration ${iter.iteration}`)}${verdictStr} — ${formatDuration(iter.durationMs)}`,
|
|
1204
|
+
0, 0,
|
|
1205
|
+
));
|
|
1206
|
+
if (iter.output) {
|
|
1207
|
+
const mdTheme = getMarkdownTheme();
|
|
1208
|
+
c.addChild(new Markdown(iter.output, 2, 0, mdTheme));
|
|
1209
|
+
}
|
|
1210
|
+
if (iter.judgeVerdict && !iter.judgeVerdict.satisfied) {
|
|
1211
|
+
c.addChild(new Text(theme.fg("dim", " ↓ refine"), 0, 0));
|
|
1212
|
+
}
|
|
1213
|
+
}
|
|
1214
|
+
});
|
|
1215
|
+
|
|
1216
|
+
// Show running indicator if loop is still executing
|
|
1217
|
+
if (result.currentIteration !== undefined && result.currentIteration >= result.iterations.length) {
|
|
1218
|
+
if (!expanded) {
|
|
1219
|
+
const hasCompleted = result.iterations.length > 0;
|
|
1220
|
+
const arrow = hasCompleted ? theme.fg("dim", " → ") : "";
|
|
1221
|
+
c.addChild(new Text(
|
|
1222
|
+
` ${arrow}${theme.fg("warning", "⟳")} ${theme.fg("dim", "refining...")}`,
|
|
1223
|
+
0, 0,
|
|
1224
|
+
));
|
|
1225
|
+
}
|
|
1226
|
+
}
|
|
1227
|
+
|
|
1228
|
+
// Usage summary
|
|
1229
|
+
c.addChild(new Spacer(1));
|
|
1230
|
+
const usageParts: string[] = [];
|
|
1231
|
+
if (result.totalUsage.input) usageParts.push(theme.fg("dim", `↑${formatTokens(result.totalUsage.input)}`));
|
|
1232
|
+
if (result.totalUsage.output) usageParts.push(theme.fg("dim", `↓${formatTokens(result.totalUsage.output)}`));
|
|
1233
|
+
if (result.totalUsage.cost) usageParts.push(theme.fg("dim", `$${result.totalUsage.cost.toFixed(3)}`));
|
|
1234
|
+
if (usageParts.length) c.addChild(new Text(usageParts.join(" "), 0, 0));
|
|
1235
|
+
|
|
1236
|
+
return c;
|
|
1237
|
+
}
|
|
1238
|
+
|
|
832
1239
|
// ── Extension ─────────────────────────────────────────────────────────
|
|
833
1240
|
|
|
834
1241
|
export default function (pi: ExtensionAPI) {
|
|
835
1242
|
const config = loadConfig();
|
|
836
|
-
|
|
1243
|
+
semaphore = new Semaphore(config.maxConcurrency ?? DEFAULT_MAX_CONCURRENCY);
|
|
837
1244
|
agents = loadAgents();
|
|
838
1245
|
|
|
839
1246
|
// If spawned as a child by a parent subagent process, PI_SUBAGENT_ALLOWED
|
|
@@ -952,7 +1359,7 @@ export default function (pi: ExtensionAPI) {
|
|
|
952
1359
|
// ── Render: result ──
|
|
953
1360
|
renderResult(result, options, theme, context) {
|
|
954
1361
|
const details = result.details as Details | undefined;
|
|
955
|
-
if (!details
|
|
1362
|
+
if (!details) {
|
|
956
1363
|
const t = result.content[0];
|
|
957
1364
|
const text = t?.type === "text" ? t.text : "(no output)";
|
|
958
1365
|
return new Text(text.slice(0, 200), 0, 0);
|
|
@@ -960,8 +1367,262 @@ export default function (pi: ExtensionAPI) {
|
|
|
960
1367
|
|
|
961
1368
|
const w = getTermWidth() - 4;
|
|
962
1369
|
const expanded = options.expanded;
|
|
963
|
-
|
|
964
|
-
|
|
1370
|
+
|
|
1371
|
+
// Pipeline result
|
|
1372
|
+
if (details.pipelineResult) {
|
|
1373
|
+
return renderPipelineResult(details.pipelineResult, theme, expanded, w);
|
|
1374
|
+
}
|
|
1375
|
+
|
|
1376
|
+
// Loop result
|
|
1377
|
+
if (details.loopResult) {
|
|
1378
|
+
return renderLoopResult(details.loopResult, theme, expanded, w);
|
|
1379
|
+
}
|
|
1380
|
+
|
|
1381
|
+
// Single agent result (existing behavior)
|
|
1382
|
+
if (details.results?.length) {
|
|
1383
|
+
const c = new Container();
|
|
1384
|
+
c.addChild(renderAgentProgress(details.results[0], theme, expanded, w));
|
|
1385
|
+
return c;
|
|
1386
|
+
}
|
|
1387
|
+
|
|
1388
|
+
// Fallback
|
|
1389
|
+
const t = result.content[0];
|
|
1390
|
+
const text = t?.type === "text" ? t.text : "(no output)";
|
|
1391
|
+
return new Text(text.slice(0, 200), 0, 0);
|
|
1392
|
+
},
|
|
1393
|
+
});
+
+  // ── Pipeline Tool ────────────────────────────────────────────────────
+
+  pi.registerTool({
+    name: "pipeline",
+    label: "Pipeline",
+    description:
+      "Run 2–5 agents in sequence. Each agent's output feeds as {previous} context into the next agent's task. Use for multi-stage workflows like scout → planner → worker.",
+    promptSnippet: "Run sequential multi-agent pipelines",
+    promptGuidelines: [
+      "Use pipeline when a task naturally decomposes into sequential agent roles (e.g. explore → plan → implement → review).",
+      "Each step receives the previous step's output automatically via {previous} placeholder substitution.",
+      "Pipelines stop on first error. The finalOutput is the last successful step's output.",
+    ],
+    parameters: Type.Object({
+      steps: Type.Array(
+        Type.Object({
+          agent: Type.String({ description: "Agent name for this step" }),
+          task: Type.String({ description: "Task description. Use {previous} to reference the prior step's output." }),
+          connector: Type.Optional(Type.String({ description: "Override agent's default connector template for this step. Format: \"## Header\\n\\n{output}\"" })),
+        }),
+        { minItems: 2, maxItems: 5, description: "Sequential steps (2–5). Each step's agent output feeds into the next step's task via {previous}." },
+      ),
+      cwd: Type.Optional(Type.String({ description: "Working directory for all agent processes" })),
+    }),
+
+    async execute(toolCallId, params, signal, onUpdate, ctx) {
+      const cwd = params.cwd ?? ctx.cwd;
+
+      if (!params.steps || params.steps.length < 2) {
+        throw new Error("pipeline requires at least 2 steps");
+      }
+
+      // Validate all agents exist
+      const agentNames = params.steps.map((s: { agent: string }) => s.agent);
+      const missing = validateAgents(agentNames, agents);
+      if (missing) {
+        const available = agents.map((a) => a.name).join(", ") || "none";
+        throw new Error(`Unknown agent in pipeline: ${missing}. Available agents: ${available}`);
+      }
+
+      const liveResult: Details = {
+        pipelineResult: {
+          steps: [],
+          currentStep: 0,
+          finalOutput: "",
+          totalUsage: zeroUsage(),
+          totalDurationMs: 0,
+        },
+      };
+
+      const result = await runPipeline(
+        params.steps,
+        cwd,
+        signal,
+        (stepIndex, progress, usage) => {
+          const pResult = liveResult.pipelineResult!;
+          pResult.currentStep = stepIndex;
+          // Update live result with latest step progress
+          if (progress.status === "running") {
+            // Ensure step slot exists for live rendering
+            if (stepIndex === pResult.steps.length) {
+              pResult.steps.push({
+                agent: params.steps[stepIndex].agent,
+                task: params.steps[stepIndex].task,
+                output: "",
+                exitCode: -1, // sentinel: not yet done
+                usage,
+                durationMs: progress.durationMs,
+              });
+            }
+          }
+          if (progress.status === "completed" || progress.status === "failed") {
+            const stepResult: PipelineStepResult = {
+              agent: params.steps[stepIndex].agent,
+              task: params.steps[stepIndex].task,
+              output: progress.lastMessage || "",
+              exitCode: progress.status === "failed" ? 1 : 0,
+              usage,
+              durationMs: progress.durationMs,
+            };
+            // Replace placeholder or push
+            while (pResult.steps.length <= stepIndex) {
+              pResult.steps.push({...stepResult, output: "", exitCode: -1, usage: zeroUsage()});
+            }
+            pResult.steps[stepIndex] = stepResult;
+          }
+          onUpdate?.({
+            content: [{ type: "text", text: `Pipeline: step ${stepIndex + 1}/${params.steps.length}` }],
+            details: liveResult,
+          });
+        },
+      );
+
+      const isError = result.stoppedAt !== undefined;
+      return {
+        content: [{ type: "text", text: result.finalOutput || "(no output)" }],
+        details: { pipelineResult: result },
+        ...(isError ? { isError: true } : {}),
+      };
+    },
+
+    renderCall(args, theme, context) {
+      if (!context.expanded) {
+        if (!args.steps) {
+          return new Text(theme.fg("toolTitle", theme.bold("pipeline")), 0, 0);
+        }
+        const stepNames = args.steps.map((s: { agent?: string }) => s?.agent || "?").join(" → ");
+        return new Text(
+          `${theme.fg("toolTitle", theme.bold("pipeline"))} ${theme.fg("accent", stepNames)}`,
+          0, 0,
+        );
+      }
+
+      const c = context.lastComponent instanceof Container
+        ? (context.lastComponent.clear(), context.lastComponent)
+        : new Container();
+      const stepCount = args.steps?.length || 0;
+      c.addChild(new Text(`${theme.fg("toolTitle", theme.bold("pipeline"))} — ${stepCount} steps`, 0, 0));
+      if (args.steps) {
+        c.addChild(new Spacer(1));
+        for (let i = 0; i < args.steps.length; i++) {
+          const step = args.steps[i];
+          const agentLabel = step.agent ? theme.fg("accent", step.agent) : "?";
+          const taskPreview = step.task ? truncLine(step.task, 60) : "";
+          c.addChild(new Text(` ${theme.fg("dim", `${i + 1}.`)} ${agentLabel} ${theme.fg("dim", taskPreview)}`, 0, 0));
+        }
+      }
+      return c;
+    },
+  });
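The progress callback in the pipeline tool keeps one slot per step: a `running` update reserves a placeholder with `exitCode: -1`, and a `completed` or `failed` update overwrites it. The sketch below isolates that bookkeeping under simplifying assumptions; `StepSlot` and `applyProgress` are hypothetical names for this illustration, and usage and duration tracking are left out.

```typescript
type Status = "running" | "completed" | "failed";

// Reduced, illustrative version of PipelineStepResult (no usage or duration).
interface StepSlot {
  agent: string;
  output: string;
  exitCode: number; // -1 is the "not yet done" sentinel used in the diff
}

// Hypothetical helper mirroring the callback's slot handling.
function applyProgress(
  steps: StepSlot[],
  stepIndex: number,
  agent: string,
  status: Status,
  lastMessage?: string,
): void {
  if (status === "running") {
    // Reserve a slot only when this is the next unseen step.
    if (stepIndex === steps.length) {
      steps.push({ agent, output: "", exitCode: -1 });
    }
    return;
  }
  const done: StepSlot = {
    agent,
    output: lastMessage || "",
    exitCode: status === "failed" ? 1 : 0,
  };
  // Pad with placeholders if a terminal update arrives before any "running" one.
  while (steps.length <= stepIndex) {
    steps.push({ ...done, output: "", exitCode: -1 });
  }
  steps[stepIndex] = done;
}

const slots: StepSlot[] = [];
applyProgress(slots, 0, "scout", "running");
const sentinel = slots[0].exitCode; // still running
applyProgress(slots, 0, "scout", "completed", "found 3 files");
applyProgress(slots, 1, "worker", "failed"); // no prior "running" update
console.log(JSON.stringify(slots));
```

The padding loop is what lets a terminal update land safely even when the matching `running` update was never delivered.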
+
+  // ── Loop Tool ─────────────────────────────────────────────────────────
+
+  pi.registerTool({
+    name: "loop",
+    label: "Loop",
+    description:
+      "Run the same agent 2–5 times, passing prior iteration outputs as context. Optionally use a judge agent to evaluate quality and stop early.",
+    promptSnippet: "Run iterative refinement loops with optional judge",
+    promptGuidelines: [
+      "Use loop for tasks that benefit from iterative refinement (e.g. drafting → reviewing → polishing).",
+      "Configure a judge agent to stop early when quality is sufficient, avoiding wasted iterations.",
+      "Each iteration receives all prior outputs as context, enabling progressive improvement.",
+    ],
+    parameters: Type.Object({
+      agent: Type.String({ description: "Agent name to run in the loop" }),
+      task: Type.String({ description: "Task description for each iteration" }),
+      max_iterations: Type.Optional(Type.Number({ minimum: 2, maximum: 5, default: 3, description: "Maximum number of iterations (2–5, default 3)" })),
+      judge: Type.Optional(Type.Object({
+        agent: Type.String({ description: "Judge agent name" }),
+        criteria: Type.String({ description: "Quality criteria. Judge responds YES if satisfied, NO otherwise." }),
+      }, { description: "Optional judge agent to evaluate each iteration and stop early when quality is sufficient" })),
+      cwd: Type.Optional(Type.String({ description: "Working directory for agent processes" })),
+    }),
+
+    async execute(toolCallId, params, signal, onUpdate, ctx) {
+      const cwd = params.cwd ?? ctx.cwd;
+      const maxIterations = params.max_iterations ?? 3;
+
+      // Validate agent exists
+      const agentNames = [params.agent];
+      if (params.judge) agentNames.push(params.judge.agent);
+      const missing = validateAgents(agentNames, agents);
+      if (missing) {
+        const available = agents.map((a) => a.name).join(", ") || "none";
+        throw new Error(`Unknown agent in loop: ${missing}. Available agents: ${available}`);
+      }
+
+      const liveResult: Details = {
+        loopResult: {
+          iterations: [],
+          currentIteration: 0,
+          finalOutput: "",
+          stoppedBecause: "max_iterations",
+          totalUsage: zeroUsage(),
+          totalDurationMs: 0,
+        },
+      };
+
+      const result = await runLoop(
+        params.agent,
+        params.task,
+        maxIterations,
+        params.judge,
+        cwd,
+        signal,
+        (iteration, progress, usage) => {
+          const lResult = liveResult.loopResult!;
+          lResult.currentIteration = iteration;
+          onUpdate?.({
+            content: [{ type: "text", text: `Loop: iteration ${iteration + 1}/${maxIterations}` }],
+            details: liveResult,
+          });
+        },
+      );
+
+      const isError = result.stoppedBecause === "error";
+      return {
+        content: [{ type: "text", text: result.finalOutput || "(no output)" }],
+        details: { loopResult: result },
+        ...(isError ? { isError: true } : {}),
+      };
+    },
+
+    renderCall(args, theme, context) {
+      if (!context.expanded) {
+        if (!args.agent) {
+          return new Text(theme.fg("toolTitle", theme.bold("loop")), 0, 0);
+        }
+        const maxIter = args.max_iterations || 3;
+        const judgeStr = args.judge ? ` (judge: ${theme.fg("accent", (args.judge as { agent?: string }).agent || "?")})` : "";
+        return new Text(
+          `${theme.fg("toolTitle", theme.bold("loop"))} ${theme.fg("accent", args.agent)} × ${maxIter}${judgeStr}`,
+          0, 0,
+        );
+      }
+
+      const c = context.lastComponent instanceof Container
+        ? (context.lastComponent.clear(), context.lastComponent)
+        : new Container();
+      const maxIter = args.max_iterations || 3;
+      c.addChild(new Text(`${theme.fg("toolTitle", theme.bold("loop"))} ${theme.fg("accent", args.agent || "?")} × ${maxIter}`, 0, 0));
+      if (args.task) {
+        c.addChild(new Spacer(1));
+        c.addChild(new Text(theme.fg("text", args.task), 0, 0));
+      }
+      if (args.judge) {
+        const j = args.judge as { agent?: string; criteria?: string };
+        c.addChild(new Spacer(1));
+        c.addChild(new Text(`${theme.fg("dim", "Judge:")} ${theme.fg("accent", j.agent || "?")} — ${theme.fg("dim", j.criteria || "")}`, 0, 0));
+      }
       return c;
     },
   });
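For orientation, here is a sketch of arguments that would fit the `loop` parameter schema above. `LoopArgs` is a local, illustrative mirror of that schema and is not exported by the package; the task and criteria strings are made up, and using `researcher` as a judge is only an assumption. The last lines show a small difference visible in the diff: `execute` resolves the default with `??`, while `renderCall` uses `||`.

```typescript
// Illustrative mirror of the "loop" tool parameters (assumption: not a package export).
interface LoopArgs {
  agent: string;
  task: string;
  max_iterations?: number; // schema: minimum 2, maximum 5, default 3
  judge?: { agent: string; criteria: string };
  cwd?: string;
}

// Hypothetical call using two of the agents shipped in package/agents/.
const loopArgs: LoopArgs = {
  agent: "worker",
  task: "Draft release notes for the pipeline and loop tools",
  judge: { agent: "researcher", criteria: "Notes mention every new tool parameter" },
};

// Default resolution as written in execute() and renderCall() respectively.
const forExecute = (n?: number): number => n ?? 3;
const forRender = (n?: number): number => n || 3;

console.log(forExecute(loopArgs.max_iterations), forRender(loopArgs.max_iterations));
console.log(forExecute(0), forRender(0)); // the two only diverge for falsy numbers such as 0
```

The two forms agree for every value the schema allows (2 to 5, or omitted); they would only differ if an out-of-range `0` reached the tool, which depends on whether the host validates parameters before calling `execute`.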
package/lib/helpers.ts
CHANGED
@@ -50,8 +50,11 @@ export function parseAgentMd(content: string, filePath: string): AgentConfig | n
   const subagentAgents = fields.subagent_agents
     ? normalizeTools(fields.subagent_agents)
     : undefined;
+  const connector = fields.connector
+    ? fields.connector.replace(/^"|"$/g, "")
+    : undefined;
 
-  return { name, description, tools, model, thinking, systemPrompt, filePath, subagentAgents };
+  return { name, description, tools, model, thinking, systemPrompt, filePath, subagentAgents, connector };
 }
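The new `connector` field is cleaned with `replace(/^"|"$/g, "")`, which removes at most one leading and one trailing double quote. A minimal sketch of that behavior follows; `stripOuterQuotes` is a hypothetical wrapper name, and the input strings are invented. Whether the `\n` sequences in a frontmatter value are unescaped later is not visible in this hunk, so the example keeps them as literal backslash sequences.

```typescript
// Hypothetical wrapper around the regex used in parseAgentMd.
const stripOuterQuotes = (raw: string): string => raw.replace(/^"|"$/g, "");

// A quoted frontmatter value loses its surrounding quotes.
const quoted = stripOuterQuotes('"## Findings\\n\\n{output}"');

// An unquoted value is returned unchanged.
const bare = stripOuterQuotes("## Findings {output}");

// The two alternatives match independently, so an unbalanced quote is also removed.
const unbalanced = stripOuterQuotes('say "hi"');

console.log(quoted, "|", bare, "|", unbalanced);
```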
 
 /**
@@ -90,6 +93,34 @@ export function mergeAgents(builtIn: AgentConfig[], user: AgentConfig[]): AgentC
   return Array.from(byName.values());
 }
 
+/**
+ * Replace {previous} placeholder in a task string with the prior step's output.
+ * Truncation happens here — this is the single truncation point.
+ */
+export function substitutePlaceholders(
+  task: string,
+  previousOutput: string,
+  maxContextChars: number = 16000,
+): string {
+  const truncated = previousOutput.length > maxContextChars
+    ? previousOutput.slice(0, maxContextChars) + "\n\n[Context truncated for pipeline]"
+    : previousOutput;
+  return task.replace(/\{previous\}/g, truncated);
+}
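A short usage sketch for `substitutePlaceholders`, with the function copied from the hunk above so the block runs on its own. The inputs are invented. The last two calls illustrate a general property of `String.prototype.replace`: when the replacement is a string, sequences such as `$&` are interpreted as patterns, so agent output containing them may not be inserted verbatim. `substituteLiteral` is a hypothetical variant, not part of the package, shown only for comparison.

```typescript
// Copied from the diff so the example is self-contained.
function substitutePlaceholders(
  task: string,
  previousOutput: string,
  maxContextChars: number = 16000,
): string {
  const truncated = previousOutput.length > maxContextChars
    ? previousOutput.slice(0, maxContextChars) + "\n\n[Context truncated for pipeline]"
    : previousOutput;
  return task.replace(/\{previous\}/g, truncated);
}

// Hypothetical variant: a replacer function keeps "$" sequences literal.
function substituteLiteral(task: string, previousOutput: string): string {
  return task.replace(/\{previous\}/g, () => previousOutput);
}

// Plain substitution.
const short = substitutePlaceholders("Plan changes for:\n{previous}", "src/auth.ts");

// Output longer than the limit is cut and marked.
const clipped = substitutePlaceholders("{previous}", "x".repeat(20), 8);

// "$&" in a replacement string means "the matched text", here "{previous}".
const dollar = substitutePlaceholders("{previous}", "cost: $&");
const literal = substituteLiteral("{previous}", "cost: $&");

console.log(JSON.stringify({ short, clipped, dollar, literal }));
```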
+
+/**
+ * Format an agent's output using its connector template.
+ * Pure formatting function — does NOT truncate. Truncation is handled
+ * by substitutePlaceholders() before this is called.
+ */
+export function formatConnectorContext(
+  output: string,
+  connectorTemplate?: string,
+): string {
+  if (!connectorTemplate) return output;
+  return connectorTemplate.replace(/\{output\}/g, output);
+}
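A brief sketch of how `formatConnectorContext` behaves with and without a template. The function body is copied from the hunk above; the sample output text is invented, and the template is written here with real newline characters, which is an assumption about the form in which it reaches this function.

```typescript
// Copied from the diff so the example is self-contained.
function formatConnectorContext(
  output: string,
  connectorTemplate?: string,
): string {
  if (!connectorTemplate) return output;
  return connectorTemplate.replace(/\{output\}/g, output);
}

// With a template, the output is wrapped under the template's header.
const wrapped = formatConnectorContext(
  "Auth logic lives in src/auth.ts",
  "## Findings\n\n{output}",
);

// Without a template (or with an empty one), the output passes through unchanged.
const passthrough = formatConnectorContext("raw text");
const emptyTemplate = formatConnectorContext("raw text", "");

console.log(wrapped);
```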
+
 /**
  * Parses PI_SUBAGENT_ALLOWED env var into a Set of agent names.
  * Returns null if the env var is not set or empty (meaning no restriction).
package/lib/pipeline-helpers.ts
@@ -0,0 +1,53 @@
+import type { AgentConfig, AgentUsage } from "./types";
+
+/**
+ * Create a zeroed-out AgentUsage object.
+ */
+export function zeroUsage(): AgentUsage {
+  return { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, cost: 0, turns: 0 };
+}
+
+/**
+ * Accumulate usage from one step/iteration into the running total.
+ */
+export function accumulateUsage(total: AgentUsage, step: AgentUsage): AgentUsage {
+  return {
+    input: total.input + step.input,
+    output: total.output + step.output,
+    cacheRead: total.cacheRead + step.cacheRead,
+    cacheWrite: total.cacheWrite + step.cacheWrite,
+    cost: total.cost + step.cost,
+    turns: total.turns + step.turns,
+  };
+}
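Because `accumulateUsage` returns a new object rather than mutating its arguments, it can serve directly as a reducer seeded with `zeroUsage()`. The sketch below copies both helpers and the `AgentUsage` shape from the diff; the per-step numbers are invented.

```typescript
// Shape and helpers copied from the diff so the example is self-contained.
interface AgentUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  cost: number;
  turns: number;
}

function zeroUsage(): AgentUsage {
  return { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, cost: 0, turns: 0 };
}

function accumulateUsage(total: AgentUsage, step: AgentUsage): AgentUsage {
  return {
    input: total.input + step.input,
    output: total.output + step.output,
    cacheRead: total.cacheRead + step.cacheRead,
    cacheWrite: total.cacheWrite + step.cacheWrite,
    cost: total.cost + step.cost,
    turns: total.turns + step.turns,
  };
}

// Invented usage figures for two pipeline steps.
const stepUsages: AgentUsage[] = [
  { input: 1200, output: 300, cacheRead: 0, cacheWrite: 0, cost: 0.5, turns: 2 },
  { input: 800, output: 200, cacheRead: 100, cacheWrite: 0, cost: 0.25, turns: 1 },
];

// Fold the steps into a running total.
const totalUsage = stepUsages.reduce(accumulateUsage, zeroUsage());
console.log(JSON.stringify(totalUsage));
```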
+
+/**
+ * Validate that all referenced agent names exist in the loaded agents array.
+ * Returns the first missing agent name, or null if all are valid.
+ */
+export function validateAgents(
+  agentNames: string[],
+  agents: AgentConfig[],
+): string | null {
+  for (const name of agentNames) {
+    if (!agents.some((a) => a.name === name)) return name;
+  }
+  return null;
+}
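A small sketch of what callers can expect from `validateAgents`: it reports only the first unknown name, in the order the names were passed. To stay self-contained the example uses a reduced `NamedAgent` type (an assumption for illustration; the real `AgentConfig` carries more fields), and `planner` and `reviewer` are invented names that are deliberately not loaded.

```typescript
// Reduced stand-in for AgentConfig; only `name` matters to the check.
type NamedAgent = { name: string };

// Logic copied from the diff, typed against the reduced shape.
function validateAgents(agentNames: string[], agents: NamedAgent[]): string | null {
  for (const name of agentNames) {
    if (!agents.some((a) => a.name === name)) return name;
  }
  return null;
}

const loaded: NamedAgent[] = [{ name: "scout" }, { name: "worker" }, { name: "researcher" }];

const allKnown = validateAgents(["scout", "worker"], loaded);
const firstMissing = validateAgents(["scout", "planner", "reviewer"], loaded);

console.log(allKnown, firstMissing);
```

Since only one name is returned, an error message built from it (as in the `pipeline` and `loop` tools) lists a single unknown agent even when several are missing.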
+
+/**
+ * Maximum total characters for accumulated loop context (prior iteration outputs).
+ * When exceeded, oldest iterations are dropped first, keeping only the last 2–3.
+ */
+export const MAX_LOOP_CONTEXT = 48000;
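The code that enforces `MAX_LOOP_CONTEXT` is not part of this hunk, so the following is only one possible reading of the comment above: drop the oldest iteration outputs until the combined length fits. `trimLoopContext` is a hypothetical helper, and keeping at least the most recent output even when it alone exceeds the limit is an assumption of this sketch.

```typescript
const MAX_LOOP_CONTEXT = 48000;

// Hypothetical helper: drop oldest outputs first until the total fits the budget.
function trimLoopContext(outputs: string[], maxChars: number = MAX_LOOP_CONTEXT): string[] {
  const kept = [...outputs]; // do not mutate the caller's array
  let total = kept.reduce((sum, o) => sum + o.length, 0);
  // Assumption: always keep the most recent output, even if it is oversized.
  while (kept.length > 1 && total > maxChars) {
    total -= kept.shift()!.length;
  }
  return kept;
}

// Tiny budget so the effect is visible.
const trimmed = trimLoopContext(["aaaa", "bbbb", "cccc"], 8);
const oversized = trimLoopContext(["x".repeat(10)], 5);
const untouched = trimLoopContext(["a", "b"]);

console.log(JSON.stringify(trimmed));
```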
+
+/**
+ * Parse a judge agent's response to determine if it signals satisfaction.
+ * Extracts the first non-empty line, strips markdown formatting, and checks
+ * for word-boundary YES match. Returns false on any parse failure.
+ */
+export function parseJudgeVerdict(response: string): boolean {
+  const firstLine = response.split('\n').find(l => l.trim()) || '';
+  const cleaned = firstLine.replace(/[*_`#]/g, '').trim().toUpperCase();
+  return /\bYES\b/.test(cleaned);
+}
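To make the matching rules of `parseJudgeVerdict` concrete, the sketch below runs the function (copied from the hunk above) against a few invented judge responses. Only the first non-empty line is inspected, and any standalone `YES` on that line counts, so a judge prompt probably benefits from asking for the verdict as the very first word.

```typescript
// Copied from the diff so the example is self-contained.
function parseJudgeVerdict(response: string): boolean {
  const firstLine = response.split('\n').find(l => l.trim()) || '';
  const cleaned = firstLine.replace(/[*_`#]/g, '').trim().toUpperCase();
  return /\bYES\b/.test(cleaned);
}

// Markdown emphasis is stripped before matching.
const bold = parseJudgeVerdict("**YES** meets all criteria");

// Leading blank lines are skipped; matching is case-insensitive via toUpperCase().
const heading = parseJudgeVerdict("\n\n## yes");

// Only the first non-empty line counts, so a later YES is ignored.
const laterYes = parseJudgeVerdict("NO\nYES on second thought");

// Word boundaries prevent a match inside "Yesterday".
const embedded = parseJudgeVerdict("Yesterday's draft was better");

// Caveat: a first line containing both NO and YES is read as satisfied.
const mixed = parseJudgeVerdict("No, but YES to part two");

// Empty responses are treated as not satisfied.
const empty = parseJudgeVerdict("");

console.log(bold, heading, laterYes, embedded, mixed, empty);
```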
package/lib/types.ts
CHANGED
@@ -7,4 +7,57 @@ export interface AgentConfig {
   systemPrompt: string;
   filePath: string;
   subagentAgents?: string[];
+  connector?: string; // Single-line prompt template, e.g. "## Findings\n\n{output}"
+}
+
+export interface AgentUsage {
+  input: number;
+  output: number;
+  cacheRead: number;
+  cacheWrite: number;
+  cost: number;
+  turns: number;
+}
+
+export interface PipelineStep {
+  agent: string;
+  task: string; // May contain {previous} placeholder
+  connector?: string; // Override agent's default connector for this step
+}
+
+export interface PipelineStepResult {
+  agent: string;
+  task: string;
+  output: string;
+  exitCode: number;
+  usage: AgentUsage;
+  durationMs: number;
+}
+
+export interface PipelineResult {
+  steps: PipelineStepResult[];
+  currentStep?: number; // Present during live execution updates
+  finalOutput: string;
+  stoppedAt?: number; // 0-indexed step where pipeline stopped (on error)
+  error?: string; // Error message if pipeline failed
+  totalUsage: AgentUsage;
+  totalDurationMs: number;
+}
+
+export interface LoopIterationResult {
+  iteration: number;
+  output: string;
+  exitCode: number;
+  usage: AgentUsage;
+  durationMs: number;
+  judgeVerdict?: { satisfied: boolean; response: string }; // Present when judge is configured
+}
+
+export interface LoopResult {
+  iterations: LoopIterationResult[];
+  currentIteration?: number; // Present during live execution updates
+  finalOutput: string;
+  stoppedBecause: "max_iterations" | "judge_satisfied" | "error";
+  totalUsage: AgentUsage;
+  totalDurationMs: number;
 }
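Since `stoppedAt` is a 0-indexed step number, a failure in the first step yields `stoppedAt: 0`, which is falsy. That is presumably why the `pipeline` tool tests `result.stoppedAt !== undefined` rather than truthiness. The sketch below shows the difference on a reduced, illustrative subset of `PipelineResult`; `PipelineOutcome` and `pipelineFailed` are hypothetical names, and the error text is invented.

```typescript
// Reduced subset of PipelineResult for illustration (steps and usage omitted).
interface PipelineOutcome {
  finalOutput: string;
  stoppedAt?: number; // 0-indexed step where the pipeline stopped
  error?: string;
}

// Hypothetical helper mirroring the check used in the pipeline tool's execute().
function pipelineFailed(result: PipelineOutcome): boolean {
  return result.stoppedAt !== undefined;
}

const failedAtFirstStep: PipelineOutcome = {
  finalOutput: "",
  stoppedAt: 0,
  error: "step 1 failed", // invented message
};
const succeeded: PipelineOutcome = { finalOutput: "All steps done" };

const explicitCheck = pipelineFailed(failedAtFirstStep);
const truthinessCheck = Boolean(failedAtFirstStep.stoppedAt); // would miss a step-0 failure

console.log(explicitCheck, truthinessCheck, pipelineFailed(succeeded));
```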
|
package/package.json
CHANGED