claude-overnight 0.3.2 → 0.5.1
- package/README.md +54 -10
- package/dist/index.js +283 -123
- package/dist/planner.d.ts +4 -1
- package/dist/planner.js +138 -10
- package/dist/swarm.js +16 -7
- package/package.json +1 -1
package/README.md
CHANGED
@@ -2,7 +2,7 @@
 
 Fire off Claude agents, come back to shipped work.
 
-Describe what to build. Set a budget — 10 agents, 100, 1000. A planner agent analyzes your codebase, breaks the objective into
+Describe what to build. Set a budget — 10 agents, 100, 1000. A planner agent analyzes your codebase, breaks the objective into independent tasks, and launches them all. Each agent runs in its own git worktree with full tooling (Read, Edit, Bash, Grep — everything). Rate limits? It waits. Windows reset? It resumes. It doesn't stop until every task is done.
 
 ## Install
 
@@ -20,7 +20,32 @@ Requires Node.js >= 20 and Claude authentication (OAuth via `claude` CLI, or `AN
 claude-overnight
 ```
 
-
+A guided flow walks you through each step:
+
+```
+🌙 claude-overnight
+────────────────────────────────────
+
+① What should the agents do?
+> refactor auth, add tests, update docs
+
+② Budget [10]: 50
+
+③ Worker model:
+● Sonnet — Sonnet 4.6 · Best for everyday tasks
+○ Opus — Opus 4.6 · Most capable
+○ Haiku — Haiku 4.5 · Fastest
+
+④ Usage:
+● Unlimited · full capacity, wait through rate limits
+○ 90% · leave 10% for other work
+
+╭────────────────────────────────────╮
+│ sonnet · budget 50 · 5× · flex │
+╰────────────────────────────────────╯
+```
+
+For large budgets, the planner identifies research themes — review them, then press Run. Everything after that is fully autonomous: thinking agents explore, the orchestrator synthesizes tasks, execution waves run, and steering adapts between waves. No further interaction needed — go to sleep.
 
 ### Task file
 
@@ -38,6 +63,25 @@ claude-overnight "fix auth bug in src/auth.ts" "add tests for user model"
 
 The planner always runs on the best available model (Opus) regardless of which model you pick for workers. This ensures high-quality task decomposition even when workers use a cheaper model.
 
+### Thinking wave
+
+For large budgets (`budget > concurrency * 3`), the planner doesn't try to generate hundreds of tasks from scratch. Instead, it launches a **thinking wave** — a team of architect agents that explore your codebase in parallel before any code is written.
+
+```
+⠋ identifying themes... → splits objective into N angles (< 30s)
+✓ 10 themes → review themes, press Run, walk away
+◆ Thinking: 10 agents exploring → each explores from its angle, writes a design doc
+◆ Orchestrating plan... → reads all design docs, synthesizes execution tasks
+◆ Wave 1 · 50 tasks → fully autonomous from here
+◆ Steering... → adapts between waves, retries on rate limits
+```
+
+The review prompt appears right after theme identification — the last thing requiring your presence. After you press Run, the thinking wave, orchestration, execution, and steering all run autonomously. Rate-limited? The planner waits and retries. Go to sleep.
+
+The number of thinking agents scales with budget: 5 for budget=50, 10 for budget=2000+. Each agent explores the codebase from a different angle and writes a structured design document. The orchestrator then reads all design docs and produces grounded execution tasks referencing real files and patterns.
+
+For small budgets (≤ `concurrency * 3`), the planner skips the thinking wave and generates tasks directly — fast and efficient for focused work.
+
 ### Model-aware task design
 
 The planner calibrates task ambition based on your worker model:
@@ -56,20 +100,20 @@ The budget also shapes task granularity:
 
 **Medium budget (16-50)**: Autonomous missions. "Design and implement the complete favorites system: DB schema, API routes, client hooks, error handling."
 
-**Large budget (50+)**:
+**Large budget (50+)**: Thinking wave + orchestration. Architects explore, then execution tasks are synthesized from their findings. Each task is a substantial work session grounded in real codebase analysis.
 
-A budget of 200 is not 200 micro-edits. It's
+A budget of 200 is not 200 micro-edits. It's ~5 architects + ~195 senior-engineer work sessions, planned in waves. A budget of 2000 gets 10 architects.
 
 ## Usage limits
 
-Control how much of your plan capacity the run consumes
+Control how much of your plan capacity the run consumes:
 
 ```
-Usage
-
-90%
-75%
-50%
+④ Usage:
+● Unlimited · full capacity, wait through rate limits
+○ 90% · leave 10% for other work
+○ 75% · conservative, plenty of headroom
+○ 50% · use half, keep the rest
 ```
 
 When utilization hits your cap, the swarm stops dispatching new tasks and lets active agents finish gracefully. This way you can run a big overnight job and still have capacity left for manual Claude usage.
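The README's sizing rule for the thinking wave can be checked with a small sketch. It follows the `useThinking` / `thinkingCount` expressions that appear in the `package/dist/index.js` part of this diff, but the function name `planThinkingWave` is hypothetical, flex mode is assumed to be on, and the default `concurrency` of 5 is taken from the `--concurrency` help text.

```javascript
// Hypothetical helper, for illustration only. It mirrors the expressions in
// dist/index.js under the assumption that flex mode is enabled.
function planThinkingWave(budget, concurrency = 5) {
  // The thinking wave only runs when the budget exceeds three rounds of concurrency.
  const useThinking = budget > concurrency * 3;
  if (!useThinking) return { useThinking, thinkingCount: 0 };
  // At least `concurrency` architects, roughly 0.5% of the budget, capped at 10.
  const thinkingCount = Math.min(Math.max(concurrency, Math.ceil(budget * 0.005)), 10);
  return { useThinking, thinkingCount };
}

console.log(planThinkingWave(10));   // { useThinking: false, thinkingCount: 0 }
console.log(planThinkingWave(50));   // { useThinking: true, thinkingCount: 5 }
console.log(planThinkingWave(2000)); // { useThinking: true, thinkingCount: 10 }
```

With the default concurrency this reproduces the README figures: 5 architects at budget 50 and 10 at budget 2000. A different `--concurrency` value shifts both the threshold and the floor.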
package/dist/index.js
CHANGED
@@ -1,5 +1,5 @@
 #!/usr/bin/env node
-import { readFileSync, existsSync } from "fs";
+import { readFileSync, existsSync, mkdirSync, readdirSync, rmSync } from "fs";
 import { resolve, dirname, join } from "path";
 import { fileURLToPath } from "url";
 import { execSync } from "child_process";
@@ -7,7 +7,7 @@ import { createInterface } from "readline";
 import chalk from "chalk";
 import { query } from "@anthropic-ai/claude-agent-sdk";
 import { Swarm } from "./swarm.js";
-import { planTasks, refinePlan, detectModelTier, steerWave } from "./planner.js";
+import { planTasks, refinePlan, detectModelTier, steerWave, identifyThemes, buildThinkingTasks, orchestrate } from "./planner.js";
 import { startRenderLoop, renderSummary } from "./ui.js";
 // ── CLI flag parsing ──
 function parseCliFlags(argv) {
@@ -86,10 +86,11 @@ async function select(label, items, defaultIdx = 0) {
 if (!first)
 stdout.write(`\x1B[${items.length}A`);
 for (let i = 0; i < items.length; i++) {
-const
-const
-const
-
+const sel = i === idx;
+const radio = sel ? chalk.cyan(" ● ") : chalk.dim(" ○ ");
+const name = sel ? chalk.white(items[i].name) : chalk.dim(items[i].name);
+const hint = items[i].hint ? chalk.dim(` · ${items[i].hint}`) : "";
+stdout.write(`\x1B[2K${radio}${name}${hint}\n`);
 }
 };
 stdout.write(`\n ${chalk.bold(label)}\n`);
@@ -134,7 +135,8 @@ async function select(label, items, defaultIdx = 0) {
 async function selectKey(label, options) {
 const { stdin, stdout } = process;
 const keys = options.map((o) => o.key.toLowerCase());
-
+const optStr = options.map((o) => `${chalk.cyan.bold(o.key.toUpperCase())}${chalk.dim(o.desc)}`).join(chalk.dim(" │ "));
+stdout.write(`\n ${label}\n ${optStr}\n `);
 return new Promise((resolve) => {
 stdin.setRawMode(true);
 stdin.resume();
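The new `optStr` line is easier to read once the colors are stripped away: each option's `key` is uppercased and glued to its `desc`, so `{ key: "r", desc: "un" }` renders as `Run`. The sketch below shows only that string assembly; `plainOptionString` is a hypothetical name, and the chalk styling from the diff is deliberately left out.

```javascript
// Hypothetical, color-free version of the option string built in selectKey().
// Each option contributes its uppercased hotkey followed by the rest of the word.
function plainOptionString(options) {
  return options.map((o) => `${o.key.toUpperCase()}${o.desc}`).join(" │ ");
}

const themeReviewOptions = [
  { key: "r", desc: "un" },
  { key: "e", desc: "dit" },
  { key: "q", desc: "uit" },
];

console.log(plainOptionString(themeReviewOptions)); // Run │ Edit │ Quit
```

Splitting the word this way lets the real code color the hotkey letter differently from the remainder without a separate label field.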
@@ -259,10 +261,37 @@ function validateGitRepo(cwd) {
 }
 // ── Show plan ──
 function showPlan(tasks) {
+const w = Math.max((process.stdout.columns ?? 80) - 6, 40);
+const ruleLen = Math.min(w, 70);
+console.log(chalk.dim(` ─── ${tasks.length} tasks ${"─".repeat(Math.max(0, ruleLen - String(tasks.length).length - 10))}`));
 for (const t of tasks) {
-
+const num = chalk.dim(String(Number(t.id) + 1).padStart(4) + ".");
+console.log(`${num} ${t.prompt.slice(0, w)}`);
 }
-console.log("");
+console.log(chalk.dim(` ${"─".repeat(ruleLen)}\n`));
+}
+function readDesignDocs(dir) {
+try {
+const files = readdirSync(dir).filter(f => f.endsWith(".md")).sort();
+return files.map(f => {
+const content = readFileSync(join(dir, f), "utf-8");
+return `### ${f}\n${content}`;
+}).join("\n\n");
+}
+catch {
+return "";
+}
+}
+const BRAILLE = ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"];
+function makeProgressLog() {
+let frame = 0;
+return (text) => {
+const spin = chalk.cyan(BRAILLE[frame++ % BRAILLE.length]);
+const maxW = (process.stdout.columns ?? 80) - 6;
+const clean = text.replace(/\n/g, " ");
+const line = clean.length > maxW ? clean.slice(0, maxW - 1) + "\u2026" : clean;
+process.stdout.write(`\x1B[2K\r ${spin} ${chalk.dim(line)}`);
+};
 }
 // ── Main ──
 async function main() {
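To see what the orchestrator receives from `readDesignDocs`, the concatenation step can be sketched without touching the filesystem. The function below takes an in-memory list instead of a directory; `combineDesignDocs` and the sample file names are hypothetical, and only the filter, sort, heading, and join steps are taken from the diff.

```javascript
// Hypothetical in-memory variant of readDesignDocs(): same filter/sort/join,
// but it works on { name, content } objects rather than reading a directory.
function combineDesignDocs(docs) {
  return docs
    .filter((d) => d.name.endsWith(".md"))                        // ignore non-markdown files
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0)) // stable, name-ordered
    .map((d) => `### ${d.name}\n${d.content}`)                     // one heading per design doc
    .join("\n\n");
}

const sampleDocs = [
  { name: "02-testing.md", content: "Cover the user model." },
  { name: "notes.txt", content: "scratch" },
  { name: "01-auth.md", content: "Refactor token refresh." },
];

console.log(combineDesignDocs(sampleDocs));
```

An empty result is meaningful in the real flow: the diff treats an empty string as "no design docs" and falls back to direct planning.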
@@ -275,25 +304,26 @@ async function main() {
 }
 if (argv.includes("-h") || argv.includes("--help")) {
 console.log(`
-${chalk.bold("claude-overnight")} — fire off Claude agents, come back to shipped work
+${chalk.bold("🌙 claude-overnight")} ${chalk.dim("— fire off Claude agents, come back to shipped work")}
+${chalk.dim("─".repeat(60))}
 
-${chalk.
-claude-overnight ${chalk.dim("interactive
-claude-overnight tasks.json ${chalk.dim("
-claude-overnight "fix auth" "add tests" ${chalk.dim("
+${chalk.cyan("Usage")}
+claude-overnight ${chalk.dim("interactive mode")}
+claude-overnight tasks.json ${chalk.dim("task file mode")}
+claude-overnight "fix auth" "add tests" ${chalk.dim("inline tasks")}
 
-${chalk.
+${chalk.cyan("Flags")}
 -h, --help Show this help
 -v, --version Print version
 --dry-run Show planned tasks without running them
---budget=N Target number of agent runs ${chalk.dim("(
+--budget=N Target number of agent runs ${chalk.dim("(default: 10)")}
 --concurrency=N Max parallel agents ${chalk.dim("(default: 5)")}
 --model=NAME Worker model override ${chalk.dim("(planner always uses best available)")}
 --usage-cap=N Stop at N% utilization ${chalk.dim("(e.g. 90 to save 10% for other work)")}
 --timeout=SECONDS Agent inactivity timeout ${chalk.dim("(default: 300s, kills only silent agents)")}
 --no-flex Disable adaptive multi-wave planning ${chalk.dim("(run all tasks in one shot)")}
 
-${chalk.dim("
+${chalk.cyan("Defaults")} ${chalk.dim("(non-interactive)")}
 model: first available concurrency: 5 worktrees: auto perms: auto
 `);
 process.exit(0);
@@ -344,7 +374,8 @@ async function main() {
 }
 }
 // ── Determine mode ──
-console.log(chalk.bold("
+console.log(`\n ${chalk.bold("🌙 claude-overnight")}`);
+console.log(chalk.dim(` ${"─".repeat(36)}`));
 const noTTY = !process.stdin.isTTY;
 const nonInteractive = noTTY || fileCfg !== undefined || tasks.length > 0;
 const cwd = fileCfg?.cwd ?? process.cwd();
@@ -363,55 +394,80 @@ async function main() {
 let objective = fileCfg?.objective;
 let usageCap;
 if (!nonInteractive) {
-
-// 1. Objective first — it's the whole point
+// ① Objective
 while (true) {
-objective = await ask(chalk.bold("
+objective = await ask(`\n ${chalk.cyan("①")} ${chalk.bold("What should the agents do?")}\n ${chalk.cyan(">")} `);
 if (!objective) {
 console.error(chalk.red("\n No objective provided."));
 process.exit(1);
 }
 if (objective.split(/\s+/).length >= 5)
 break;
-console.log(chalk.yellow(' Be specific, e.g. "refactor the auth module, add tests, and update docs"
+console.log(chalk.yellow(' Be specific, e.g. "refactor the auth module, add tests, and update docs"'));
 }
-//
-const
+// Start fetching models while user enters budget
+const modelsPromise = fetchModels();
+// ② Budget
+const budgetAns = await ask(`\n ${chalk.cyan("②")} ${chalk.dim("Budget")} ${chalk.dim("[")}${chalk.white("10")}${chalk.dim("]:")} `);
 budget = parseInt(budgetAns) || 10;
 if (budget < 1) {
 console.error(chalk.red(` Budget must be a positive number`));
 process.exit(1);
 }
-//
-
-const
-
-
+// ③ Worker model — show spinner if models aren't ready yet
+let modelFrame = 0;
+const modelSpinner = setInterval(() => {
+const spin = chalk.cyan(BRAILLE[modelFrame++ % BRAILLE.length]);
+process.stdout.write(`\x1B[2K\r ${spin} ${chalk.dim("loading models...")}`);
+}, 120);
+let models;
+try {
+models = await modelsPromise;
+}
+finally {
+clearInterval(modelSpinner);
+process.stdout.write(`\x1B[2K\r`);
+}
 plannerModel = models[0]?.value || "claude-sonnet-4-6";
 if (models.length > 0) {
-workerModel = await select("Worker model
+workerModel = await select(`${chalk.cyan("③")} Worker model:`, models.map((m) => ({
 name: m.displayName,
 value: m.value,
 hint: m.description,
 })));
 }
 else {
-const ans = await ask(chalk.dim("
+const ans = await ask(` ${chalk.cyan("③")} ${chalk.dim("Worker model [claude-sonnet-4-6]:")} `);
 workerModel = ans || "claude-sonnet-4-6";
 }
-
-
-
-}
-// 4. Usage cap — how much of your plan to use
-usageCap = await select("Usage limit:", [
-{ name: "Unlimited", value: undefined, hint: "use full capacity, wait through rate limits" },
+// ④ Usage
+usageCap = await select(`${chalk.cyan("④")} Usage:`, [
+{ name: "Unlimited", value: undefined, hint: "full capacity, wait through rate limits" },
 { name: "90%", value: 0.9, hint: "leave 10% for other work" },
 { name: "75%", value: 0.75, hint: "conservative, plenty of headroom" },
 { name: "50%", value: 0.5, hint: "use half, keep the rest" },
 ]);
-// Concurrency defaults based on budget
 concurrency = Math.min(5, budget);
+// Config summary box
+const parts = [];
+if (workerModel !== plannerModel) {
+const tier = detectModelTier(workerModel);
+parts.push(`${tier} → ${detectModelTier(plannerModel)}`);
+}
+else {
+parts.push(detectModelTier(workerModel));
+}
+parts.push(`budget ${budget}`);
+parts.push(`${concurrency}×`);
+if (budget > 2)
+parts.push("flex");
+if (usageCap != null)
+parts.push(`cap ${Math.round(usageCap * 100)}%`);
+const inner = parts.join(chalk.dim(" · "));
+const innerLen = parts.join(" · ").length;
+console.log(chalk.dim(`\n ╭${"─".repeat(innerLen + 4)}╮`));
+console.log(chalk.dim(" │") + ` ${inner} ` + chalk.dim("│"));
+console.log(chalk.dim(` ╰${"─".repeat(innerLen + 4)}╯`));
 }
 else {
 // Non-interactive: resolve config from file/flags/defaults
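The config summary box added in this hunk measures the uncolored text (`innerLen`) and then draws rules that are four characters wider. A color-free sketch of that layout is below. `summaryBox` is a hypothetical name, and the two spaces of padding on each side of the text are an assumption: the rendered diff collapses runs of spaces, so the exact padding in the shipped file is not visible here, though `innerLen + 4` suggests two per side.

```javascript
// Hypothetical, color-free rendering of the config summary box.
// Assumption: two spaces of padding on each side, which matches innerLen + 4.
function summaryBox(parts) {
  const inner = parts.join(" · ");
  const rule = "─".repeat(inner.length + 4);
  return [`╭${rule}╮`, `│  ${inner}  │`, `╰${rule}╯`].join("\n");
}

// Parts in the same order the diff pushes them: model tier, budget, concurrency, flex, cap.
const boxParts = ["sonnet", "budget 50", "5×", "flex"];
console.log(summaryBox(boxParts));
```

Measuring the plain string rather than the chalk-wrapped one matters because ANSI color codes add invisible characters that would otherwise make the borders too long.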
@@ -451,6 +507,10 @@ async function main() {
 }
 // ── Flex mode: adaptive multi-wave planning ──
 const flex = !argv.includes("--no-flex") && (fileCfg?.flexiblePlan ?? objective != null) && objective != null && (budget ?? 10) > 2;
+const agentTimeoutMs = cliFlags.timeout ? parseFloat(cliFlags.timeout) * 1000 : undefined;
+let thinkingUsed = 0;
+let thinkingCost = 0, thinkingIn = 0, thinkingOut = 0, thinkingTools = 0;
+let designContext;
 // ── Plan phase (interactive: review loop, non-interactive: auto-plan or skip) ──
 const needsPlan = tasks.length === 0;
 if (needsPlan) {
@@ -458,20 +518,178 @@ async function main() {
 console.error(chalk.red(" No tasks provided and stdin is not a TTY. Provide tasks via args or a .json file."));
 process.exit(1);
 }
-// In flex mode, plan ~50% of budget for wave 1, leaving room for steering
-const waveBudget = flex ? Math.max(concurrency, Math.ceil((budget ?? 10) * 0.5)) : budget;
-const flexNote = flex
-? `This is wave 1 of an adaptive multi-wave run (total budget: ${budget}). Plan the highest-impact foundational work first. Future waves will iterate, polish, and expand based on what's learned.`
-: undefined;
 process.stdout.write("\x1B[?25l");
 const planRestore = () => process.stdout.write("\x1B[?25h");
-
+const useThinking = flex && (budget ?? 10) > concurrency * 3;
+const thinkingCount = useThinking ? Math.min(Math.max(concurrency, Math.ceil((budget ?? 10) * 0.005)), 10) : 0;
+const designDir = join(cwd, ".claude-overnight", "designs");
 try {
-
-
-
-
-
+if (useThinking) {
+// Phase 1: Quick theme identification → review → then autonomous
+let themeFrame = 0;
+const themeSpinner = setInterval(() => {
+const spin = chalk.cyan(BRAILLE[themeFrame++ % BRAILLE.length]);
+process.stdout.write(`\x1B[2K\r ${spin} ${chalk.dim("identifying themes...")}`);
+}, 120);
+let themes;
+try {
+themes = await identifyThemes(objective, thinkingCount, plannerModel, permissionMode);
+}
+finally {
+clearInterval(themeSpinner);
+}
+process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${themes.length} themes`)}\n\n`);
+// Show themes for review — this is the LAST user interaction
+planRestore();
+let reviewing = true;
+while (reviewing) {
+for (let i = 0; i < themes.length; i++) {
+console.log(chalk.dim(` ${String(i + 1).padStart(3)}.`) + ` ${themes[i]}`);
+}
+console.log(chalk.dim(`\n ${thinkingCount} thinking agents → orchestrate → ${(budget ?? 10) - thinkingCount} execution sessions\n`));
+const action = await selectKey(`${chalk.white(`${themes.length} themes`)} ${chalk.dim(`· ${thinkingCount} thinking · ${concurrency} concurrent`)}`, [
+{ key: "r", desc: "un" },
+{ key: "e", desc: "dit" },
+{ key: "q", desc: "uit" },
+]);
+switch (action) {
+case "r":
+reviewing = false;
+break;
+case "e": {
+const feedback = await ask(`\n ${chalk.bold("What should change?")}\n ${chalk.cyan(">")} `);
+if (!feedback)
+break;
+process.stdout.write("\x1B[?25l");
+try {
+themes = await identifyThemes(`${objective}\n\nUser feedback: ${feedback}`, thinkingCount, plannerModel, permissionMode);
+process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${themes.length} themes`)}\n\n`);
+}
+catch (err) {
+console.error(chalk.red(`\n Re-planning failed: ${err.message}\n`));
+}
+planRestore();
+break;
+}
+case "q":
+console.log(chalk.dim("\n Aborted.\n"));
+process.exit(0);
+}
+}
+// ── From here, fully autonomous — no more user interaction ──
+process.stdout.write("\x1B[?25l");
+// Phase 2: Thinking wave
+mkdirSync(designDir, { recursive: true });
+const thinkingTasks = buildThinkingTasks(objective, themes, designDir, plannerModel);
+console.log(chalk.cyan(`\n ◆ Thinking: ${thinkingTasks.length} agents exploring...\n`));
+const thinkingSwarm = new Swarm({
+tasks: thinkingTasks, concurrency, cwd,
+model: plannerModel,
+permissionMode,
+useWorktrees: false,
+mergeStrategy: "yolo",
+agentTimeoutMs,
+usageCap,
+});
+const stopThinkRender = startRenderLoop(thinkingSwarm);
+try {
+await thinkingSwarm.run();
+}
+finally {
+stopThinkRender();
+}
+console.log(renderSummary(thinkingSwarm));
+thinkingUsed = thinkingSwarm.completed + thinkingSwarm.failed;
+thinkingCost = thinkingSwarm.totalCostUsd;
+thinkingIn = thinkingSwarm.totalInputTokens;
+thinkingOut = thinkingSwarm.totalOutputTokens;
+thinkingTools = thinkingSwarm.agents.reduce((sum, a) => sum + a.toolCalls, 0);
+// Phase 3: Orchestrate from design docs
+designContext = readDesignDocs(designDir);
+if (designContext) {
+const orchBudget = Math.min(50, Math.max(concurrency, Math.ceil(((budget ?? 10) - thinkingUsed) * 0.5)));
+const flexNote = `This is wave 1 of an adaptive multi-wave run (total budget: ${(budget ?? 10) - thinkingUsed}). Plan the highest-impact foundational work first. Future waves will iterate based on what's learned.`;
+console.log(chalk.cyan(`\n ◆ Orchestrating plan...\n`));
+tasks = await orchestrate(objective, designContext, cwd, plannerModel, workerModel, permissionMode, orchBudget, concurrency, makeProgressLog(), flexNote);
+process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
+}
+else {
+console.log(chalk.yellow(`\n No design docs — falling back to direct planning\n`));
+const waveBudget = Math.min(50, Math.max(concurrency, Math.ceil(((budget ?? 10) - thinkingUsed) * 0.5)));
+tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, makeProgressLog());
+process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
+}
+}
+else {
+// Small budget: direct planning → review → run
+const waveBudget = flex ? Math.min(50, Math.max(concurrency, Math.ceil((budget ?? 10) * 0.5))) : budget;
+const flexNote = flex
+? `This is wave 1 of an adaptive multi-wave run (total budget: ${budget}). Plan the highest-impact foundational work first. Future waves will iterate, polish, and expand based on what's learned.`
+: undefined;
+console.log(chalk.cyan(`\n ◆ Planning${flex ? " wave 1" : ""}...\n`));
+tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, makeProgressLog(), flexNote);
+const flexHint = flex ? chalk.dim(` · wave 1`) : "";
+process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}${flexHint}\n\n`);
+// Review loop for small-budget path
+planRestore();
+let reviewing = true;
+while (reviewing) {
+showPlan(tasks);
+const action = await selectKey(`${chalk.white(`${tasks.length} tasks`)} ${chalk.dim(`· ${concurrency} concurrent`)}`, [
+{ key: "r", desc: "un" },
+{ key: "e", desc: "dit" },
+{ key: "c", desc: "hat" },
+{ key: "q", desc: "uit" },
+]);
+switch (action) {
+case "r":
+reviewing = false;
+break;
+case "e": {
+const feedback = await ask(`\n ${chalk.bold("What should change?")}\n ${chalk.cyan(">")} `);
+if (!feedback)
+break;
+console.log(chalk.cyan("\n ◆ Re-planning...\n"));
+process.stdout.write("\x1B[?25l");
+try {
+tasks = await refinePlan(objective, tasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, makeProgressLog());
+process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
+}
+catch (err) {
+console.error(chalk.red(`\n Re-planning failed: ${err.message}\n`));
+}
+planRestore();
+break;
+}
+case "c": {
+const question = await ask(`\n ${chalk.bold("Ask about the plan:")}\n ${chalk.cyan(">")} `);
+if (!question)
+break;
+process.stdout.write("\x1B[?25l");
+try {
+let answer = "";
+for await (const msg of query({
+prompt: `You planned these tasks for the objective "${objective}":\n${tasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n")}\n\nUser question: ${question}`,
+options: { cwd, model: plannerModel, permissionMode, persistSession: false },
+})) {
+if (msg.type === "result" && msg.subtype === "success")
+answer = msg.result || "";
+}
+planRestore();
+if (answer)
+console.log(chalk.dim(`\n ${answer.slice(0, 500)}\n`));
+}
+catch {
+planRestore();
+}
+break;
+}
+case "q":
+console.log(chalk.dim("\n Aborted.\n"));
+process.exit(0);
+}
+}
+}
 }
 catch (err) {
 planRestore();
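The wave-1 sizing in this hunk is the same expression in three places (`orchBudget` and both `waveBudget` assignments on the flex path): half of what is left after the thinking wave, never below `concurrency`, and now never above 50. A minimal sketch, using the hypothetical name `firstWaveBudget` and assuming flex mode is on:

```javascript
// Hypothetical helper that restates the wave-1 budget expression from the diff.
// Assumes flex mode; without flex the diff plans the whole budget in one shot.
function firstWaveBudget(budget, concurrency, thinkingUsed = 0) {
  const half = Math.ceil((budget - thinkingUsed) * 0.5); // keep roughly half for later waves
  return Math.min(50, Math.max(concurrency, half));      // floor at concurrency, cap at 50
}

console.log(firstWaveBudget(200, 5, 5)); // 50 (half would be 98, capped)
console.log(firstWaveBudget(20, 5));     // 10
console.log(firstWaveBudget(6, 5));      // 5  (half would be 3, raised to concurrency)
```

The cap of 50 is the change from 0.3.2, where the removed `waveBudget` line had no upper bound, so large budgets now spread across more steering rounds.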
@@ -481,89 +699,27 @@ async function main() {
 console.error(chalk.red(`\n Planning failed: ${err.message}\n`));
 process.exit(1);
 }
-// ── Review loop ──
-planRestore();
-let reviewing = true;
-while (reviewing) {
-showPlan(tasks);
-const action = await selectKey(`${tasks.length} tasks, concurrency ${concurrency}.`, [
-{ key: "r", desc: "un" },
-{ key: "e", desc: "dit" },
-{ key: "c", desc: "hat" },
-{ key: "q", desc: "uit" },
-]);
-switch (action) {
-case "r":
-reviewing = false;
-break;
-case "e": {
-const feedback = await ask(chalk.bold("\n What should change?\n > "));
-if (!feedback)
-break;
-console.log(chalk.magenta("\n Re-planning...\n"));
-process.stdout.write("\x1B[?25l");
-try {
-tasks = await refinePlan(objective, tasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, (text) => {
-process.stdout.write(`\x1B[2K\r ${chalk.dim(text)}`);
-});
-process.stdout.write(`\x1B[2K\r ${chalk.green(`${tasks.length} tasks`)}\n\n`);
-}
-catch (err) {
-console.error(chalk.red(`\n Re-planning failed: ${err.message}\n`));
-}
-planRestore();
-break;
-}
-case "c": {
-const question = await ask(chalk.bold("\n Ask about the plan:\n > "));
-if (!question)
-break;
-process.stdout.write("\x1B[?25l");
-try {
-let answer = "";
-for await (const msg of query({
-prompt: `You planned these tasks for the objective "${objective}":\n${tasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n")}\n\nUser question: ${question}`,
-options: { cwd, model: plannerModel, permissionMode, persistSession: false },
-})) {
-if (msg.type === "result" && msg.subtype === "success")
-answer = msg.result || "";
-}
-planRestore();
-if (answer)
-console.log(chalk.dim(`\n ${answer.slice(0, 500)}\n`));
-}
-catch {
-planRestore();
-}
-break;
-}
-case "q":
-console.log(chalk.dim("\n Aborted.\n"));
-process.exit(0);
-}
-}
 }
 if (tasks.length === 0) {
 console.error("No tasks provided.");
 process.exit(1);
 }
 if (dryRun) {
-console.log(chalk.bold(" Tasks:"));
 showPlan(tasks);
+console.log(chalk.dim(" --dry-run: exiting without running\n"));
 process.exit(0);
 }
 // ── Run (wave loop) ──
 process.stdout.write("\x1B[?25l");
 const restore = () => process.stdout.write("\x1B[?25h\n");
-const agentTimeoutMs = cliFlags.timeout ? parseFloat(cliFlags.timeout) * 1000 : undefined;
 const runStartedAt = Date.now();
 // Wave-loop state
 let currentSwarm;
-let remaining = budget ?? tasks.length;
+let remaining = (budget ?? tasks.length) - thinkingUsed;
 let currentTasks = tasks;
 let waveNum = 0;
 const waveHistory = [];
-let accCost =
+let accCost = thinkingCost, accIn = thinkingIn, accOut = thinkingOut, accCompleted = 0, accFailed = 0, accTools = thinkingTools;
 let lastCapped = false, lastAborted = false;
 // For flex + branch strategy: create one target branch, waves merge via yolo into it
 let runBranch;
@@ -601,7 +757,7 @@ async function main() {
 if (currentTasks.length > remaining)
 currentTasks = currentTasks.slice(0, remaining);
 if (flex) {
-console.log(chalk.
+console.log(chalk.cyan(`\n ◆ Wave ${waveNum + 1}`) + chalk.dim(` · ${currentTasks.length} tasks · ${remaining} remaining\n`));
 }
 const swarm = new Swarm({
 tasks: currentTasks, concurrency, cwd, model: workerModel, permissionMode, allowedTools,
@@ -647,12 +803,10 @@ async function main() {
 if (!flex || remaining <= 0 || swarm.aborted || swarm.cappedOut)
 break;
 // ── Steer next wave ──
-console.log(chalk.
+console.log(chalk.cyan("\n ◆ Steering...\n"));
 process.stdout.write("\x1B[?25l");
 try {
-const steer = await steerWave(objective, waveHistory, remaining, cwd, plannerModel, workerModel, permissionMode, concurrency, (
-process.stdout.write(`\x1B[2K\r ${chalk.dim(text)}`);
-});
+const steer = await steerWave(objective, waveHistory, remaining, cwd, plannerModel, workerModel, permissionMode, concurrency, makeProgressLog(), designContext);
 process.stdout.write(`\x1B[2K\r`);
 process.stdout.write("\x1B[?25h");
 if (steer.done) {
@@ -669,6 +823,11 @@ async function main() {
|
|
|
669
823
|
break;
|
|
670
824
|
}
|
|
671
825
|
}
|
|
826
|
+
// Clean up design docs
|
|
827
|
+
try {
|
|
828
|
+
rmSync(join(cwd, ".claude-overnight", "designs"), { recursive: true, force: true });
|
|
829
|
+
}
|
|
830
|
+
catch { }
|
|
672
831
|
// Switch back if we created a run branch
|
|
673
832
|
if (runBranch && originalRef) {
|
|
674
833
|
try {
|
|
@@ -682,9 +841,10 @@ async function main() {
|
|
|
682
841
|
const summaryText = accFailed > 0
|
|
683
842
|
? chalk.yellow(`${accCompleted} done, ${accFailed} failed`) + cappedNote
|
|
684
843
|
: chalk.green(`${accCompleted} done`) + cappedNote;
|
|
685
|
-
const costText = accCost > 0 ? `
|
|
686
|
-
const wavePart = waves > 1 ? `${waves} waves
|
|
687
|
-
console.log(`\n ${
|
|
844
|
+
const costText = accCost > 0 ? chalk.dim(` · $${accCost.toFixed(3)}`) : "";
|
|
845
|
+
const wavePart = waves > 1 ? chalk.dim(`${waves} waves · `) : "";
|
|
846
|
+
console.log(chalk.dim(`\n ${"─".repeat(36)}`));
|
|
847
|
+
console.log(` ${chalk.green("✓")} ${chalk.bold("Complete")} ${wavePart}${summaryText}${costText}`);
|
|
688
848
|
if (accFailed > 0 && waves === 1) {
|
|
689
849
|
const failedAgents = currentSwarm?.agents.filter((a) => a.status === "error") ?? [];
|
|
690
850
|
if (failedAgents.length > 0) {
|
package/dist/planner.d.ts
CHANGED
|
@@ -16,5 +16,8 @@ export interface SteerResult {
|
|
|
16
16
|
export type ModelTier = "opus" | "sonnet" | "haiku" | "unknown";
|
|
17
17
|
export declare function detectModelTier(model: string): ModelTier;
|
|
18
18
|
export declare function planTasks(objective: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number | undefined, concurrency: number, onLog: (text: string) => void, flexNote?: string): Promise<Task[]>;
|
|
19
|
+
export declare function identifyThemes(objective: string, count: number, model: string, permissionMode: PermMode): Promise<string[]>;
|
|
20
|
+
export declare function buildThinkingTasks(objective: string, themes: string[], designDir: string, plannerModel: string): Task[];
|
|
21
|
+
export declare function orchestrate(objective: string, designDocs: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number, concurrency: number, onLog: (text: string) => void, flexNote?: string): Promise<Task[]>;
|
|
19
22
|
export declare function refinePlan(objective: string, previousTasks: Task[], feedback: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number | undefined, concurrency: number, onLog: (text: string) => void): Promise<Task[]>;
|
|
20
|
-
export declare function steerWave(objective: string, history: WaveSummary[], remainingBudget: number, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, concurrency: number, onLog: (text: string) => void): Promise<SteerResult>;
|
|
23
|
+
export declare function steerWave(objective: string, history: WaveSummary[], remainingBudget: number, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, concurrency: number, onLog: (text: string) => void, designContext?: string): Promise<SteerResult>;
|
package/dist/planner.js
CHANGED
|
@@ -2,7 +2,7 @@ import { query } from "@anthropic-ai/claude-agent-sdk";
|
|
|
2
2
|
const INACTIVITY_MS = 5 * 60 * 1000;
|
|
3
3
|
export function detectModelTier(model) {
|
|
4
4
|
const m = model.toLowerCase();
|
|
5
|
-
if (m.includes("opus"))
|
|
5
|
+
if (m === "default" || m.includes("opus"))
|
|
6
6
|
return "opus";
|
|
7
7
|
if (m.includes("sonnet"))
|
|
8
8
|
return "sonnet";
|
|
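Note on the hunk above: the model string `"default"` is now treated as the Opus tier. The sketch below reconstructs the whole function for illustration. Only the first two branches are visible in the diff; the `haiku` and `unknown` branches are an assumption inferred from the `ModelTier` type in `planner.d.ts`, and the function is named `detectModelTierSketch` to make clear it is not the package's code.

```javascript
// Illustrative reconstruction; the last two branches are assumed, not shown in the diff.
function detectModelTierSketch(model) {
  const m = model.toLowerCase();
  if (m === "default" || m.includes("opus")) return "opus"; // new in 0.5.1: "default" maps to opus
  if (m.includes("sonnet")) return "sonnet";
  if (m.includes("haiku")) return "haiku"; // assumed branch
  return "unknown";                        // assumed fallback
}

console.log(detectModelTierSketch("default"));
console.log(detectModelTierSketch("Claude-Sonnet"));
```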
@@ -146,7 +146,32 @@ Respond with ONLY a JSON object (no markdown fences):
|
|
|
146
146
|
]
|
|
147
147
|
}`;
|
|
148
148
|
}
|
|
149
|
+
const RATE_LIMIT_PATTERNS = ["rate", "limit", "overloaded", "429", "hit your limit", "too many"];
|
|
150
|
+
function isRateLimitError(err) {
|
|
151
|
+
const msg = err instanceof Error ? err.message : String(err);
|
|
152
|
+
return RATE_LIMIT_PATTERNS.some((p) => msg.toLowerCase().includes(p));
|
|
153
|
+
}
|
|
149
154
|
async function runPlannerQuery(prompt, opts, onLog) {
|
|
155
|
+
const MAX_RETRIES = 3;
|
|
156
|
+
const BACKOFF = [30_000, 60_000, 120_000];
|
|
157
|
+
for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
|
|
158
|
+
try {
|
|
159
|
+
return await runPlannerQueryOnce(prompt, opts, onLog);
|
|
160
|
+
}
|
|
161
|
+
catch (err) {
|
|
162
|
+
if (attempt < MAX_RETRIES && isRateLimitError(err)) {
|
|
163
|
+
const waitMs = BACKOFF[attempt];
|
|
164
|
+
const waitSec = Math.round(waitMs / 1000);
|
|
165
|
+
onLog(`Rate limited — waiting ${waitSec}s before retry ${attempt + 1}/${MAX_RETRIES}`);
|
|
166
|
+
await new Promise((r) => setTimeout(r, waitMs));
|
|
167
|
+
continue;
|
|
168
|
+
}
|
|
169
|
+
throw err;
|
|
170
|
+
}
|
|
171
|
+
}
|
|
172
|
+
throw new Error("Planner query failed after retries");
|
|
173
|
+
}
|
|
174
|
+
async function runPlannerQueryOnce(prompt, opts, onLog) {
|
|
150
175
|
let resultText = "";
|
|
151
176
|
const startedAt = Date.now();
|
|
152
177
|
const pq = query({
|
|
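Note on the hunk above: `runPlannerQuery` is now a retry wrapper around `runPlannerQueryOnce`, with waits of 30 s, 60 s and 120 s on rate-limit errors and an immediate rethrow for anything else. The sketch below isolates that pattern. `isRateLimitError` is copied from the diff; `withRateLimitRetry` and its injectable `sleep` parameter are hypothetical names introduced here so the wrapper can be exercised without real delays.

```javascript
const RATE_LIMIT_PATTERNS = ["rate", "limit", "overloaded", "429", "hit your limit", "too many"];

// Copied from the diff: substring match on the lowercased error message.
function isRateLimitError(err) {
  const msg = err instanceof Error ? err.message : String(err);
  return RATE_LIMIT_PATTERNS.some((p) => msg.toLowerCase().includes(p));
}

// Hypothetical generic form of the retry loop; `fn` and `sleep` are injected for illustration.
async function withRateLimitRetry(
  fn,
  backoffMs = [30_000, 60_000, 120_000],
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
) {
  for (let attempt = 0; attempt <= backoffMs.length; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Retry only rate-limit errors, and only while a backoff slot is left.
      if (attempt < backoffMs.length && isRateLimitError(err)) {
        await sleep(backoffMs[attempt]);
        continue;
      }
      throw err;
    }
  }
  throw new Error("failed after retries");
}
```

One thing to be aware of: because matching is by substring, the pattern `"rate"` also matches unrelated messages such as "failed to generate", so some non-rate-limit failures may be retried as well.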
@@ -162,7 +187,7 @@ async function runPlannerQuery(prompt, opts, onLog) {
|
|
|
162
187
|
includePartialMessages: true,
|
|
163
188
|
},
|
|
164
189
|
});
|
|
165
|
-
// Progress ticker —
|
|
190
|
+
// Progress ticker — fast updates with compact format
|
|
166
191
|
let lastLogText = "";
|
|
167
192
|
let toolCount = 0;
|
|
168
193
|
const ticker = setInterval(() => {
|
|
@@ -170,9 +195,10 @@ async function runPlannerQuery(prompt, opts, onLog) {
|
|
|
170
195
|
const m = Math.floor(elapsed / 60);
|
|
171
196
|
const s = elapsed % 60;
|
|
172
197
|
const timeStr = m > 0 ? `${m}m ${s}s` : `${s}s`;
|
|
173
|
-
const
|
|
174
|
-
|
|
175
|
-
|
|
198
|
+
const toolStr = toolCount > 0 ? ` · ${toolCount} tools` : "";
|
|
199
|
+
const extra = lastLogText ? ` · ${lastLogText}` : "";
|
|
200
|
+
onLog(`${timeStr}${toolStr}${extra}`);
|
|
201
|
+
}, 500);
|
|
176
202
|
let lastActivity = Date.now();
|
|
177
203
|
let timer;
|
|
178
204
|
const watchdog = new Promise((_, reject) => {
|
|
@@ -201,8 +227,8 @@ async function runPlannerQuery(prompt, opts, onLog) {
|
|
|
201
227
|
if (ev?.type === "content_block_delta") {
|
|
202
228
|
const delta = ev.delta;
|
|
203
229
|
if (delta?.type === "text_delta" && delta.text) {
|
|
204
|
-
const snippet = delta.text.trim();
|
|
205
|
-
if (snippet.length >
|
|
230
|
+
const snippet = delta.text.trim().replace(/[{}"\\,[\]]+/g, " ").replace(/\s+/g, " ").trim();
|
|
231
|
+
if (snippet.length > 5) {
|
|
206
232
|
lastLogText = snippet.slice(0, 60);
|
|
207
233
|
}
|
|
208
234
|
}
|
|
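Note on the hunk above: streamed text deltas are now stripped of JSON punctuation before being shown in the progress ticker, and only snippets longer than 5 characters replace the previous one. The sketch below applies the same two regular expressions from the diff; the function names `cleanSnippet` and `nextLogText` are hypothetical.

```javascript
// Same replacements as in the diff: drop JSON punctuation, then collapse whitespace.
function cleanSnippet(text) {
  return text.trim().replace(/[{}"\\,[\]]+/g, " ").replace(/\s+/g, " ").trim();
}

// Hypothetical helper: keep the previous ticker text unless the new snippet is long enough.
function nextLogText(previous, deltaText) {
  const snippet = cleanSnippet(deltaText);
  return snippet.length > 5 ? snippet.slice(0, 60) : previous;
}

console.log(cleanSnippet('{"tasks": [{"prompt": "Add tests"}]}')); // tasks : prompt : Add tests
```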
@@ -212,7 +238,7 @@ async function runPlannerQuery(prompt, opts, onLog) {
|
|
|
212
238
|
if (msg.subtype === "success")
|
|
213
239
|
resultText = msg.result || "";
|
|
214
240
|
else
|
|
215
|
-
throw new Error(`Planner failed: ${msg.subtype}`);
|
|
241
|
+
throw new Error(`Planner failed: ${msg.result || msg.subtype}`);
|
|
216
242
|
}
|
|
217
243
|
}
|
|
218
244
|
};
|
|
@@ -310,6 +336,108 @@ export async function planTasks(objective, cwd, plannerModel, workerModel, permi
|
|
|
310
336
|
onLog(`${tasks.length} tasks`);
|
|
311
337
|
return tasks;
|
|
312
338
|
}
|
|
339
|
+
// ── Thinking wave ──
|
|
340
|
+
export async function identifyThemes(objective, count, model, permissionMode) {
|
|
341
|
+
let resultText = "";
|
|
342
|
+
for await (const msg of query({
|
|
343
|
+
prompt: `Split this objective into exactly ${count} independent research angles for architects exploring a codebase. Each angle should cover a distinct aspect.
|
|
344
|
+
|
|
345
|
+
Objective: ${objective}
|
|
346
|
+
|
|
347
|
+
Return ONLY a JSON object: {"themes": ["angle description", ...]}`,
|
|
348
|
+
options: {
|
|
349
|
+
model,
|
|
350
|
+
permissionMode,
|
|
351
|
+
...(permissionMode === "bypassPermissions" && { allowDangerouslySkipPermissions: true }),
|
|
352
|
+
persistSession: false,
|
|
353
|
+
},
|
|
354
|
+
})) {
|
|
355
|
+
if (msg.type === "result" && msg.subtype === "success")
|
|
356
|
+
resultText = msg.result || "";
|
|
357
|
+
}
|
|
358
|
+
const parsed = attemptJsonParse(resultText);
|
|
359
|
+
if (parsed?.themes && Array.isArray(parsed.themes))
|
|
360
|
+
return parsed.themes.slice(0, count);
|
|
361
|
+
const fallback = ["architecture, patterns, and conventions", "data models, state, and persistence", "user-facing flows, components, and UX", "APIs, integrations, and services", "testing, quality, and error handling", "security, performance, and infrastructure", "build, deployment, and configuration", "documentation and developer experience"];
|
|
362
|
+
return Array.from({ length: count }, (_, i) => fallback[i % fallback.length]);
|
|
363
|
+
}
|
|
364
|
+
export function buildThinkingTasks(objective, themes, designDir, plannerModel) {
|
|
365
|
+
return themes.map((theme, i) => ({
|
|
366
|
+
id: `think-${i}`,
|
|
367
|
+
prompt: `You are a senior architect exploring a codebase to design a solution.
|
|
368
|
+
|
|
369
|
+
OVERALL OBJECTIVE: ${objective}
|
|
370
|
+
|
|
371
|
+
YOUR FOCUS: ${theme}
|
|
372
|
+
|
|
373
|
+
Explore the codebase thoroughly using Read, Glob, and Grep. Then write a design document to ${designDir}/focus-${i}.md with these sections:
|
|
374
|
+
|
|
375
|
+
## Findings
|
|
376
|
+
Key files, patterns, and architecture you discovered. Cite specific file paths and function names.
|
|
377
|
+
|
|
378
|
+
## Proposed Work Items
|
|
379
|
+
For each item:
|
|
380
|
+
- **What**: What to build or change
|
|
381
|
+
- **Where**: Specific file paths
|
|
382
|
+
- **Why**: Why this matters
|
|
383
|
+
- **Risk**: Conflicts or complications
|
|
384
|
+
|
|
385
|
+
## Key Files
|
|
386
|
+
Relevant files with one-line descriptions.
|
|
387
|
+
|
|
388
|
+
Be thorough — your findings drive the execution plan.`,
|
|
389
|
+
model: plannerModel,
|
|
390
|
+
}));
|
|
391
|
+
}
|
|
392
|
+
export async function orchestrate(objective, designDocs, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, onLog, flexNote) {
|
|
393
|
+
const capability = modelCapabilityBlock(workerModel);
|
|
394
|
+
const flexLine = flexNote ? `\n\n${flexNote}` : "";
|
|
395
|
+
const prompt = `You are a tech lead planning a sprint based on your team's codebase research.
|
|
396
|
+
|
|
397
|
+
Objective: ${objective}
|
|
398
|
+
|
|
399
|
+
Your architects explored the codebase and found:
|
|
400
|
+
|
|
401
|
+
${designDocs}
|
|
402
|
+
|
|
403
|
+
AGENT CAPABILITY: ${capability}
|
|
404
|
+
|
|
405
|
+
Create exactly ~${budget} concrete execution tasks based on these findings.
|
|
406
|
+
|
|
407
|
+
Requirements:
|
|
408
|
+
- Each task is actionable by a single agent session
|
|
409
|
+
- Each task MUST be independent — no dependencies between tasks
|
|
410
|
+
- ${concurrency} agents run in parallel — tasks must touch DIFFERENT files
|
|
411
|
+
- Trust the research — don't tell agents to re-explore what's documented
|
|
412
|
+
- Reference specific files and patterns from the findings
|
|
413
|
+
- Priority order: foundational first, polish last${flexLine}
|
|
414
|
+
|
|
415
|
+
Respond with ONLY a JSON object (no markdown fences):
|
|
416
|
+
{"tasks": [{"prompt": "..."}]}`;
|
|
417
|
+
onLog("Synthesizing...");
|
|
418
|
+
const resultText = await runPlannerQuery(prompt, { cwd, model: plannerModel, permissionMode }, onLog);
|
|
419
|
+
const parsed = await extractTaskJson(resultText, async () => {
|
|
420
|
+
onLog("Retrying...");
|
|
421
|
+
let retryText = "";
|
|
422
|
+
for await (const msg of query({
|
|
423
|
+
prompt: `Output ONLY a JSON object:\n{"tasks":[{"prompt":"..."}]}`,
|
|
424
|
+
options: { cwd, model: plannerModel, permissionMode, ...(permissionMode === "bypassPermissions" && { allowDangerouslySkipPermissions: true }), persistSession: false },
|
|
425
|
+
})) {
|
|
426
|
+
if (msg.type === "result" && msg.subtype === "success")
|
|
427
|
+
retryText = msg.result || "";
|
|
428
|
+
}
|
|
429
|
+
return retryText;
|
|
430
|
+
});
|
|
431
|
+
let tasks = (parsed.tasks || []).map((t, i) => ({
|
|
432
|
+
id: String(i),
|
|
433
|
+
prompt: typeof t === "string" ? t : t.prompt,
|
|
434
|
+
}));
|
|
435
|
+
tasks = postProcess(tasks, budget, onLog);
|
|
436
|
+
if (tasks.length === 0)
|
|
437
|
+
throw new Error("Orchestration generated 0 tasks");
|
|
438
|
+
onLog(`${tasks.length} tasks`);
|
|
439
|
+
return tasks;
|
|
440
|
+
}
|
|
313
441
|
export async function refinePlan(objective, previousTasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, onLog) {
|
|
314
442
|
onLog("Refining plan...");
|
|
315
443
|
const prev = previousTasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n");
|
|
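Note on the hunk above: `identifyThemes` trusts the model's JSON when it contains a `themes` array and otherwise cycles through a fixed fallback list. The sketch below isolates that selection step. `pickThemes` is a hypothetical name, and the fallback list is passed in as a parameter rather than hard-coded.

```javascript
// Hypothetical helper mirroring the tail of identifyThemes in the diff.
function pickThemes(parsed, count, fallback) {
  if (parsed?.themes && Array.isArray(parsed.themes)) {
    return parsed.themes.slice(0, count); // may return fewer than `count`
  }
  // Round-robin over the fallback list when parsing failed.
  return Array.from({ length: count }, (_, i) => fallback[i % fallback.length]);
}

console.log(pickThemes(null, 5, ["a", "b", "c"]));
console.log(pickThemes({ themes: ["x"] }, 3, ["a", "b", "c"]));
```

As the second call illustrates, a model reply with fewer themes than requested is accepted as-is, so the number of thinking tasks built by `buildThinkingTasks` can end up smaller than `count`.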
@@ -421,7 +549,7 @@ async function extractTaskJson(raw, retry) {
|
|
|
421
549
|
throw new Error("Planner did not return valid task JSON after retry");
|
|
422
550
|
}
|
|
423
551
|
// ── Wave steering ──
|
|
424
|
-
export async function steerWave(objective, history, remainingBudget, cwd, plannerModel, workerModel, permissionMode, concurrency, onLog) {
|
|
552
|
+
export async function steerWave(objective, history, remainingBudget, cwd, plannerModel, workerModel, permissionMode, concurrency, onLog, designContext) {
|
|
425
553
|
const capability = modelCapabilityBlock(workerModel);
|
|
426
554
|
const historyText = history.map(w => {
|
|
427
555
|
const lines = w.tasks.map(t => {
|
|
@@ -437,7 +565,7 @@ Objective: ${objective}
|
|
|
437
565
|
|
|
438
566
|
Work completed so far:
|
|
439
567
|
${historyText}
|
|
440
|
-
|
|
568
|
+
${designContext ? `\nOriginal architectural research:\n${designContext}\n` : ""}
|
|
441
569
|
Remaining budget: ${remainingBudget} agent sessions. ${concurrency} agents run in parallel — tasks must touch DIFFERENT files.
|
|
442
570
|
${capability}
|
|
443
571
|
|
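Note on the `steerWave` hunks above: the new optional `designContext` parameter is interpolated into the steering prompt only when it is set, so callers that omit it get the same prompt body as before apart from whitespace. A minimal sketch of that conditional section, using the template text from the diff (the function name `designSection` is hypothetical):

```javascript
// Hypothetical helper; the template string is the one added in the diff.
function designSection(designContext) {
  return designContext ? `\nOriginal architectural research:\n${designContext}\n` : "";
}

console.log(JSON.stringify(designSection(undefined)));
console.log(JSON.stringify(designSection("## Findings\n...")));
```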
package/dist/swarm.js
CHANGED
|
@@ -240,9 +240,17 @@ export class Swarm {
|
|
|
240
240
|
this.activeQueries.delete(agentQuery);
|
|
241
241
|
}
|
|
242
242
|
if (agent.status === "running") {
|
|
243
|
-
agent.status = "done";
|
|
244
243
|
agent.finishedAt = Date.now();
|
|
245
|
-
|
|
244
|
+
const duration = agent.finishedAt - (agent.startedAt || agent.finishedAt);
|
|
245
|
+
if (agent.toolCalls === 0 && (agent.costUsd ?? 0) < 0.001 && duration < 15_000) {
|
|
246
|
+
agent.status = "error";
|
|
247
|
+
agent.error = "Agent did no work (likely rate-limited before starting)";
|
|
248
|
+
this.failed++;
|
|
249
|
+
}
|
|
250
|
+
else {
|
|
251
|
+
agent.status = "done";
|
|
252
|
+
this.completed++;
|
|
253
|
+
}
|
|
246
254
|
this.log(id, this.agentSummary(agent));
|
|
247
255
|
}
|
|
248
256
|
break; // Success — exit retry loop
|
|
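Note on the hunk above: an agent that finishes with zero tool calls, a cost under $0.001 and a runtime under 15 seconds is now counted as failed rather than done. The sketch below isolates that decision as a pure function. `classifyFinishedAgent` is a hypothetical name; the field names follow the diff.

```javascript
// Hypothetical pure version of the check added to Swarm in the diff.
function classifyFinishedAgent(agent) {
  // A missing startedAt yields a duration of 0, as in the original expression.
  const duration = agent.finishedAt - (agent.startedAt || agent.finishedAt);
  const didNoWork =
    agent.toolCalls === 0 && (agent.costUsd ?? 0) < 0.001 && duration < 15_000;
  return didNoWork ? "error" : "done";
}

console.log(classifyFinishedAgent({ toolCalls: 0, costUsd: 0, startedAt: 1_000, finishedAt: 3_000 }));
console.log(classifyFinishedAgent({ toolCalls: 4, costUsd: 0.12, startedAt: 1_000, finishedAt: 90_000 }));
```

All three conditions must hold, so a quick agent that made even one tool call, or a slow agent that made none, is still recorded as done.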
@@ -424,12 +432,13 @@ export class Swarm {
|
|
|
424
432
|
finally {
|
|
425
433
|
if (stashed) {
|
|
426
434
|
try {
|
|
427
|
-
exec("git stash
|
|
428
|
-
|
|
429
|
-
|
|
430
|
-
|
|
431
|
-
|
|
435
|
+
const stashList = exec("git stash list", this.config.cwd).trim();
|
|
436
|
+
if (stashList) {
|
|
437
|
+
exec("git stash pop", this.config.cwd);
|
|
438
|
+
this.log(-1, "Restored stashed changes");
|
|
439
|
+
}
|
|
432
440
|
}
|
|
441
|
+
catch { /* stash already gone or empty */ }
|
|
433
442
|
}
|
|
434
443
|
}
|
|
435
444
|
}
|
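Note on the hunk above: the stash is now popped only when `git stash list` prints something, and any failure is swallowed. The sketch below shows the same guard with the command runner injected, so it can be tried without a repository. `restoreStash` and the fake runner are hypothetical; the real code calls the package's own `exec(command, cwd)` helper, whose implementation is not shown in the diff.

```javascript
// Hypothetical standalone version of the guard; `exec` and `log` are injected.
function restoreStash(exec, cwd, log) {
  try {
    const stashList = exec("git stash list", cwd).trim();
    if (stashList) {
      exec("git stash pop", cwd);
      log("Restored stashed changes");
      return true;
    }
  } catch {
    /* stash already gone or empty */
  }
  return false;
}

// Fake runner for illustration: pretends one stash entry exists.
const calls = [];
const fakeExec = (cmd) => {
  calls.push(cmd);
  return cmd === "git stash list" ? "stash@{0}: WIP on main\n" : "";
};
console.log(restoreStash(fakeExec, ".", () => {}), calls);
```

The guard only checks that some stash entry exists, not that the top entry is the one created by this run; the surrounding `stashed` flag is what ties the pop to the earlier stash.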