claude-overnight 0.3.2 → 0.5.1

package/README.md CHANGED
@@ -2,7 +2,7 @@
2
2
 
3
3
  Fire off Claude agents, come back to shipped work.
4
4
 
5
- Describe what to build. Set a budget — 10 agents, 100, 1000. A planner agent analyzes your codebase, breaks the objective into that many independent tasks, and launches them all. Each agent runs in its own git worktree with full tooling (Read, Edit, Bash, Grep — everything). Rate limits? It waits. Windows reset? It resumes. It doesn't stop until every task is done.
5
+ Describe what to build. Set a budget — 10 agents, 100, 1000. A planner agent analyzes your codebase, breaks the objective into independent tasks, and launches them all. Each agent runs in its own git worktree with full tooling (Read, Edit, Bash, Grep — everything). Rate limits? It waits. Windows reset? It resumes. It doesn't stop until every task is done.
6
6
 
7
7
  ## Install
8
8
 
@@ -20,7 +20,32 @@ Requires Node.js >= 20 and Claude authentication (OAuth via `claude` CLI, or `AN
20
20
  claude-overnight
21
21
  ```
22
22
 
23
- Describe your objective, set a budget, pick a worker model, set a usage limit. The planner generates tasks — review, edit, or chat about them, then run.
23
+ A guided flow walks you through each step:
24
+
25
+ ```
26
+ 🌙 claude-overnight
27
+ ────────────────────────────────────
28
+
29
+ ① What should the agents do?
30
+ > refactor auth, add tests, update docs
31
+
32
+ ② Budget [10]: 50
33
+
34
+ ③ Worker model:
35
+ ● Sonnet — Sonnet 4.6 · Best for everyday tasks
36
+ ○ Opus — Opus 4.6 · Most capable
37
+ ○ Haiku — Haiku 4.5 · Fastest
38
+
39
+ ④ Usage:
40
+ ● Unlimited · full capacity, wait through rate limits
41
+ ○ 90% · leave 10% for other work
42
+
43
+ ╭────────────────────────────────────╮
44
+ │ sonnet · budget 50 · 5× · flex │
45
+ ╰────────────────────────────────────╯
46
+ ```
47
+
48
+ For large budgets, the planner identifies research themes — review them, then press Run. Everything after that is fully autonomous: thinking agents explore, the orchestrator synthesizes tasks, execution waves run, and steering adapts between waves. No further interaction needed — go to sleep.
24
49
 
25
50
  ### Task file
26
51
 
@@ -38,6 +63,25 @@ claude-overnight "fix auth bug in src/auth.ts" "add tests for user model"
38
63
 
39
64
  The planner always runs on the best available model (Opus) regardless of which model you pick for workers. This ensures high-quality task decomposition even when workers use a cheaper model.
40
65
 
66
+ ### Thinking wave
67
+
68
+ For large budgets (`budget > concurrency * 3`), the planner doesn't try to generate hundreds of tasks from scratch. Instead, it launches a **thinking wave** — a team of architect agents that explore your codebase in parallel before any code is written.
69
+
70
+ ```
71
+ ⠋ identifying themes... → splits objective into N angles (< 30s)
72
+ ✓ 10 themes → review themes, press Run, walk away
73
+ ◆ Thinking: 10 agents exploring → each explores from its angle, writes a design doc
74
+ ◆ Orchestrating plan... → reads all design docs, synthesizes execution tasks
75
+ ◆ Wave 1 · 50 tasks → fully autonomous from here
76
+ ◆ Steering... → adapts between waves, retries on rate limits
77
+ ```
78
+
79
+ The review prompt appears right after theme identification — the last thing requiring your presence. After you press Run, the thinking wave, orchestration, execution, and steering all run autonomously. Rate-limited? The planner waits and retries. Go to sleep.
80
+
81
+ The number of thinking agents scales with budget: 5 for budget=50, 10 for budget=2000+. Each agent explores the codebase from a different angle and writes a structured design document. The orchestrator then reads all design docs and produces grounded execution tasks referencing real files and patterns.
82
+
83
+ For small budgets (≤ `concurrency * 3`), the planner skips the thinking wave and generates tasks directly — fast and efficient for focused work.
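The threshold and agent count described above can be sketched as follows. This is a minimal sketch that mirrors the `useThinking` and `thinkingCount` expressions in the `dist/index.js` changes of this diff. The helper name `thinkingPlan` and its return shape are hypothetical, and it assumes flex mode is already enabled (the real flow also checks that).

```javascript
// Hypothetical helper: mirrors the thinking-wave gate and agent-count
// expressions from dist/index.js. Assumes flex mode is already enabled.
function thinkingPlan(budget, concurrency) {
  // The thinking wave only runs when the budget exceeds 3x concurrency.
  const useThinking = budget > concurrency * 3;
  // At least `concurrency` agents, roughly 0.5% of the budget, capped at 10.
  const thinkingCount = useThinking
    ? Math.min(Math.max(concurrency, Math.ceil(budget * 0.005)), 10)
    : 0;
  return { useThinking, thinkingCount };
}

console.log(thinkingPlan(10, 5));   // small budget: direct planning, 0 thinking agents
console.log(thinkingPlan(50, 5));   // 5 thinking agents
console.log(thinkingPlan(2000, 5)); // 10 thinking agents
```

With the default concurrency of 5, the floor of `concurrency` is what yields 5 agents at budget=50, and the cap of 10 is reached at budget=2000.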
84
+
41
85
  ### Model-aware task design
42
86
 
43
87
  The planner calibrates task ambition based on your worker model:
@@ -56,20 +100,20 @@ The budget also shapes task granularity:
56
100
 
57
101
  **Medium budget (16-50)**: Autonomous missions. "Design and implement the complete favorites system: DB schema, API routes, client hooks, error handling."
58
102
 
59
- **Large budget (50+)**: Full workstream decomposition. Architecture, features, testing, security, UX polish, performance: everything a team would cover. Each task is a substantial work session.
103
+ **Large budget (50+)**: Thinking wave + orchestration. Architects explore, then execution tasks are synthesized from their findings. Each task is a substantial work session grounded in real codebase analysis.
60
104
 
61
- A budget of 200 is not 200 micro-edits. It's 200 senior-engineer work sessions running in parallel.
105
+ A budget of 200 is not 200 micro-edits. It's ~5 architects + ~195 senior-engineer work sessions, planned in waves. A budget of 2000 gets 10 architects.
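As a rough worked example of that split, the sketch below applies the wave-1 sizing expression (`orchBudget`) from the `dist/index.js` changes in this diff. The helper name `firstWaveSize` is hypothetical, and it assumes each thinking agent consumes one unit of budget.

```javascript
// Hypothetical helper: size of the first execution wave after the thinking wave.
// Mirrors the orchBudget expression in dist/index.js.
function firstWaveSize(budget, thinkingUsed, concurrency) {
  const remaining = budget - thinkingUsed;
  // Plan about half of what is left: at least `concurrency`, at most 50 tasks.
  return Math.min(50, Math.max(concurrency, Math.ceil(remaining * 0.5)));
}

// Budget 200 with 5 architects leaves 195 sessions; wave 1 is capped at 50.
console.log(firstWaveSize(200, 5, 5)); // 50
// Budget 20 with 5 architects leaves 15 sessions; wave 1 plans 8 of them.
console.log(firstWaveSize(20, 5, 5)); // 8
```

The rest of the budget is left for later waves, which the steering step plans from the results of earlier ones.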
62
106
 
63
107
  ## Usage limits
64
108
 
65
- Control how much of your plan capacity the run consumes. In interactive mode, you'll be asked:
109
+ Control how much of your plan capacity the run consumes:
66
110
 
67
111
  ```
68
- Usage limit:
69
- Unlimited use full capacity, wait through rate limits
70
- 90% leave 10% for other work
71
- 75% conservative, plenty of headroom
72
- 50% use half, keep the rest
112
+ Usage:
113
+ Unlimited · full capacity, wait through rate limits
114
+ 90% · leave 10% for other work
115
+ 75% · conservative, plenty of headroom
116
+ 50% · use half, keep the rest
73
117
  ```
74
118
 
75
119
  When utilization hits your cap, the swarm stops dispatching new tasks and lets active agents finish gracefully. This way you can run a big overnight job and still have capacity left for manual Claude usage.
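The dispatch rule can be pictured with a small sketch. This is an assumption-laden illustration: the real check lives in `swarm.js`, which this diff does not show, so the function name `canDispatch` and the exact comparison are hypothetical. The cap values match the interactive options (a fraction such as `0.9`, or `undefined` for Unlimited).

```javascript
// Hypothetical gate: the real check lives in swarm.js, which this diff does not show.
// `usageCap` is a fraction (0.9, 0.75, 0.5) or undefined for "Unlimited".
function canDispatch(utilization, usageCap) {
  if (usageCap == null) return true; // Unlimited: never stop on utilization
  return utilization < usageCap;     // At or above the cap: stop dispatching new tasks
}

console.log(canDispatch(0.95, undefined)); // true
console.log(canDispatch(0.8, 0.9));        // true
console.log(canDispatch(0.9, 0.9));        // false
```

Agents that are already running are not affected by this gate; only new dispatches are held back.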
package/dist/index.js CHANGED
@@ -1,5 +1,5 @@
1
1
  #!/usr/bin/env node
2
- import { readFileSync, existsSync } from "fs";
2
+ import { readFileSync, existsSync, mkdirSync, readdirSync, rmSync } from "fs";
3
3
  import { resolve, dirname, join } from "path";
4
4
  import { fileURLToPath } from "url";
5
5
  import { execSync } from "child_process";
@@ -7,7 +7,7 @@ import { createInterface } from "readline";
7
7
  import chalk from "chalk";
8
8
  import { query } from "@anthropic-ai/claude-agent-sdk";
9
9
  import { Swarm } from "./swarm.js";
10
- import { planTasks, refinePlan, detectModelTier, steerWave } from "./planner.js";
10
+ import { planTasks, refinePlan, detectModelTier, steerWave, identifyThemes, buildThinkingTasks, orchestrate } from "./planner.js";
11
11
  import { startRenderLoop, renderSummary } from "./ui.js";
12
12
  // ── CLI flag parsing ──
13
13
  function parseCliFlags(argv) {
@@ -86,10 +86,11 @@ async function select(label, items, defaultIdx = 0) {
86
86
  if (!first)
87
87
  stdout.write(`\x1B[${items.length}A`);
88
88
  for (let i = 0; i < items.length; i++) {
89
- const arrow = i === idx ? chalk.green(" → ") : " ";
90
- const name = i === idx ? chalk.green(items[i].name) : chalk.dim(items[i].name);
91
- const hint = items[i].hint ? chalk.dim(` — ${items[i].hint}`) : "";
92
- stdout.write(`\x1B[2K${arrow}${name}${hint}\n`);
89
+ const sel = i === idx;
90
+ const radio = sel ? chalk.cyan(" ● ") : chalk.dim(" ○ ");
91
+ const name = sel ? chalk.white(items[i].name) : chalk.dim(items[i].name);
92
+ const hint = items[i].hint ? chalk.dim(` · ${items[i].hint}`) : "";
93
+ stdout.write(`\x1B[2K${radio}${name}${hint}\n`);
93
94
  }
94
95
  };
95
96
  stdout.write(`\n ${chalk.bold(label)}\n`);
@@ -134,7 +135,8 @@ async function select(label, items, defaultIdx = 0) {
134
135
  async function selectKey(label, options) {
135
136
  const { stdin, stdout } = process;
136
137
  const keys = options.map((o) => o.key.toLowerCase());
137
- stdout.write(`\n ${label} ${options.map((o) => `[${chalk.bold(o.key.toUpperCase())}]${chalk.dim(o.desc)}`).join(" ")}\n `);
138
+ const optStr = options.map((o) => `${chalk.cyan.bold(o.key.toUpperCase())}${chalk.dim(o.desc)}`).join(chalk.dim(" "));
139
+ stdout.write(`\n ${label}\n ${optStr}\n `);
138
140
  return new Promise((resolve) => {
139
141
  stdin.setRawMode(true);
140
142
  stdin.resume();
@@ -259,10 +261,37 @@ function validateGitRepo(cwd) {
259
261
  }
260
262
  // ── Show plan ──
261
263
  function showPlan(tasks) {
264
+ const w = Math.max((process.stdout.columns ?? 80) - 6, 40);
265
+ const ruleLen = Math.min(w, 70);
266
+ console.log(chalk.dim(` ─── ${tasks.length} tasks ${"─".repeat(Math.max(0, ruleLen - String(tasks.length).length - 10))}`));
262
267
  for (const t of tasks) {
263
- console.log(chalk.dim(` ${Number(t.id) + 1}. ${t.prompt.slice(0, 90)}`));
268
+ const num = chalk.dim(String(Number(t.id) + 1).padStart(4) + ".");
269
+ console.log(`${num} ${t.prompt.slice(0, w)}`);
264
270
  }
265
- console.log("");
271
+ console.log(chalk.dim(` ${"─".repeat(ruleLen)}\n`));
272
+ }
273
+ function readDesignDocs(dir) {
274
+ try {
275
+ const files = readdirSync(dir).filter(f => f.endsWith(".md")).sort();
276
+ return files.map(f => {
277
+ const content = readFileSync(join(dir, f), "utf-8");
278
+ return `### ${f}\n${content}`;
279
+ }).join("\n\n");
280
+ }
281
+ catch {
282
+ return "";
283
+ }
284
+ }
285
+ const BRAILLE = ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"];
286
+ function makeProgressLog() {
287
+ let frame = 0;
288
+ return (text) => {
289
+ const spin = chalk.cyan(BRAILLE[frame++ % BRAILLE.length]);
290
+ const maxW = (process.stdout.columns ?? 80) - 6;
291
+ const clean = text.replace(/\n/g, " ");
292
+ const line = clean.length > maxW ? clean.slice(0, maxW - 1) + "\u2026" : clean;
293
+ process.stdout.write(`\x1B[2K\r ${spin} ${chalk.dim(line)}`);
294
+ };
266
295
  }
267
296
  // ── Main ──
268
297
  async function main() {
@@ -275,25 +304,26 @@ async function main() {
275
304
  }
276
305
  if (argv.includes("-h") || argv.includes("--help")) {
277
306
  console.log(`
278
- ${chalk.bold("claude-overnight")} — fire off Claude agents, come back to shipped work
307
+ ${chalk.bold("🌙 claude-overnight")} ${chalk.dim("— fire off Claude agents, come back to shipped work")}
308
+ ${chalk.dim("─".repeat(60))}
279
309
 
280
- ${chalk.dim("Usage:")}
281
- claude-overnight ${chalk.dim("interactive — describe what to do, review plan, run")}
282
- claude-overnight tasks.json ${chalk.dim("run tasks defined in a JSON file")}
283
- claude-overnight "fix auth" "add tests" ${chalk.dim("run inline tasks in parallel")}
310
+ ${chalk.cyan("Usage")}
311
+ claude-overnight ${chalk.dim("interactive mode")}
312
+ claude-overnight tasks.json ${chalk.dim("task file mode")}
313
+ claude-overnight "fix auth" "add tests" ${chalk.dim("inline tasks")}
284
314
 
285
- ${chalk.dim("Flags:")}
315
+ ${chalk.cyan("Flags")}
286
316
  -h, --help Show this help
287
317
  -v, --version Print version
288
318
  --dry-run Show planned tasks without running them
289
- --budget=N Target number of agent runs ${chalk.dim("(planner aims for this many tasks)")}
319
+ --budget=N Target number of agent runs ${chalk.dim("(default: 10)")}
290
320
  --concurrency=N Max parallel agents ${chalk.dim("(default: 5)")}
291
321
  --model=NAME Worker model override ${chalk.dim("(planner always uses best available)")}
292
322
  --usage-cap=N Stop at N% utilization ${chalk.dim("(e.g. 90 to save 10% for other work)")}
293
323
  --timeout=SECONDS Agent inactivity timeout ${chalk.dim("(default: 300s, kills only silent agents)")}
294
324
  --no-flex Disable adaptive multi-wave planning ${chalk.dim("(run all tasks in one shot)")}
295
325
 
296
- ${chalk.dim("Non-interactive defaults (task file / inline / piped):")}
326
+ ${chalk.cyan("Defaults")} ${chalk.dim("(non-interactive)")}
297
327
  model: first available concurrency: 5 worktrees: auto perms: auto
298
328
  `);
299
329
  process.exit(0);
@@ -344,7 +374,8 @@ async function main() {
344
374
  }
345
375
  }
346
376
  // ── Determine mode ──
347
- console.log(chalk.bold("\n \uD83C\uDF19 claude-overnight\n"));
377
+ console.log(`\n ${chalk.bold("🌙 claude-overnight")}`);
378
+ console.log(chalk.dim(` ${"─".repeat(36)}`));
348
379
  const noTTY = !process.stdin.isTTY;
349
380
  const nonInteractive = noTTY || fileCfg !== undefined || tasks.length > 0;
350
381
  const cwd = fileCfg?.cwd ?? process.cwd();
@@ -363,55 +394,80 @@ async function main() {
363
394
  let objective = fileCfg?.objective;
364
395
  let usageCap;
365
396
  if (!nonInteractive) {
366
- console.log(chalk.dim(" Fire off Claude agents, come back to shipped work.\n"));
367
- // 1. Objective first — it's the whole point
397
+ // Objective
368
398
  while (true) {
369
- objective = await ask(chalk.bold(" What should the agents do?\n > "));
399
+ objective = await ask(`\n ${chalk.cyan("①")} ${chalk.bold("What should the agents do?")}\n ${chalk.cyan(">")} `);
370
400
  if (!objective) {
371
401
  console.error(chalk.red("\n No objective provided."));
372
402
  process.exit(1);
373
403
  }
374
404
  if (objective.split(/\s+/).length >= 5)
375
405
  break;
376
- console.log(chalk.yellow(' Be specific, e.g. "refactor the auth module, add tests, and update docs"\n'));
406
+ console.log(chalk.yellow(' Be specific, e.g. "refactor the auth module, add tests, and update docs"'));
377
407
  }
378
- // 2. Budget: how many agent runs to spend
379
- const budgetAns = await ask(chalk.dim("\n Agent budget [10]: "));
408
+ // Start fetching models while user enters budget
409
+ const modelsPromise = fetchModels();
410
+ // ② Budget
411
+ const budgetAns = await ask(`\n ${chalk.cyan("②")} ${chalk.dim("Budget")} ${chalk.dim("[")}${chalk.white("10")}${chalk.dim("]:")} `);
380
412
  budget = parseInt(budgetAns) || 10;
381
413
  if (budget < 1) {
382
414
  console.error(chalk.red(` Budget must be a positive number`));
383
415
  process.exit(1);
384
416
  }
385
- // 3. Worker model — planner always uses best available
386
- process.stdout.write(chalk.dim(" Fetching models..."));
387
- const models = await fetchModels();
388
- process.stdout.write(`\x1B[2K\r`);
389
- // Pick best model for planner (first = most capable)
417
+ // Worker model — show spinner if models aren't ready yet
418
+ let modelFrame = 0;
419
+ const modelSpinner = setInterval(() => {
420
+ const spin = chalk.cyan(BRAILLE[modelFrame++ % BRAILLE.length]);
421
+ process.stdout.write(`\x1B[2K\r ${spin} ${chalk.dim("loading models...")}`);
422
+ }, 120);
423
+ let models;
424
+ try {
425
+ models = await modelsPromise;
426
+ }
427
+ finally {
428
+ clearInterval(modelSpinner);
429
+ process.stdout.write(`\x1B[2K\r`);
430
+ }
390
431
  plannerModel = models[0]?.value || "claude-sonnet-4-6";
391
432
  if (models.length > 0) {
392
- workerModel = await select("Worker model (planner always uses best available):", models.map((m) => ({
433
+ workerModel = await select(`${chalk.cyan("③")} Worker model:`, models.map((m) => ({
393
434
  name: m.displayName,
394
435
  value: m.value,
395
436
  hint: m.description,
396
437
  })));
397
438
  }
398
439
  else {
399
- const ans = await ask(chalk.dim(" Worker model [claude-sonnet-4-6]: "));
440
+ const ans = await ask(` ${chalk.cyan("③")} ${chalk.dim("Worker model [claude-sonnet-4-6]:")} `);
400
441
  workerModel = ans || "claude-sonnet-4-6";
401
442
  }
402
- if (workerModel !== plannerModel) {
403
- const tier = detectModelTier(workerModel);
404
- console.log(chalk.dim(`\n Planner: ${plannerModel} · Workers: ${workerModel} (${tier})`));
405
- }
406
- // 4. Usage cap — how much of your plan to use
407
- usageCap = await select("Usage limit:", [
408
- { name: "Unlimited", value: undefined, hint: "use full capacity, wait through rate limits" },
443
+ // Usage
444
+ usageCap = await select(`${chalk.cyan("④")} Usage:`, [
445
+ { name: "Unlimited", value: undefined, hint: "full capacity, wait through rate limits" },
409
446
  { name: "90%", value: 0.9, hint: "leave 10% for other work" },
410
447
  { name: "75%", value: 0.75, hint: "conservative, plenty of headroom" },
411
448
  { name: "50%", value: 0.5, hint: "use half, keep the rest" },
412
449
  ]);
413
- // Concurrency defaults based on budget
414
450
  concurrency = Math.min(5, budget);
451
+ // Config summary box
452
+ const parts = [];
453
+ if (workerModel !== plannerModel) {
454
+ const tier = detectModelTier(workerModel);
455
+ parts.push(`${tier} → ${detectModelTier(plannerModel)}`);
456
+ }
457
+ else {
458
+ parts.push(detectModelTier(workerModel));
459
+ }
460
+ parts.push(`budget ${budget}`);
461
+ parts.push(`${concurrency}×`);
462
+ if (budget > 2)
463
+ parts.push("flex");
464
+ if (usageCap != null)
465
+ parts.push(`cap ${Math.round(usageCap * 100)}%`);
466
+ const inner = parts.join(chalk.dim(" · "));
467
+ const innerLen = parts.join(" · ").length;
468
+ console.log(chalk.dim(`\n ╭${"─".repeat(innerLen + 4)}╮`));
469
+ console.log(chalk.dim(" │") + ` ${inner} ` + chalk.dim("│"));
470
+ console.log(chalk.dim(` ╰${"─".repeat(innerLen + 4)}╯`));
415
471
  }
416
472
  else {
417
473
  // Non-interactive: resolve config from file/flags/defaults
@@ -451,6 +507,10 @@ async function main() {
451
507
  }
452
508
  // ── Flex mode: adaptive multi-wave planning ──
453
509
  const flex = !argv.includes("--no-flex") && (fileCfg?.flexiblePlan ?? objective != null) && objective != null && (budget ?? 10) > 2;
510
+ const agentTimeoutMs = cliFlags.timeout ? parseFloat(cliFlags.timeout) * 1000 : undefined;
511
+ let thinkingUsed = 0;
512
+ let thinkingCost = 0, thinkingIn = 0, thinkingOut = 0, thinkingTools = 0;
513
+ let designContext;
454
514
  // ── Plan phase (interactive: review loop, non-interactive: auto-plan or skip) ──
455
515
  const needsPlan = tasks.length === 0;
456
516
  if (needsPlan) {
@@ -458,20 +518,178 @@ async function main() {
458
518
  console.error(chalk.red(" No tasks provided and stdin is not a TTY. Provide tasks via args or a .json file."));
459
519
  process.exit(1);
460
520
  }
461
- // In flex mode, plan ~50% of budget for wave 1, leaving room for steering
462
- const waveBudget = flex ? Math.max(concurrency, Math.ceil((budget ?? 10) * 0.5)) : budget;
463
- const flexNote = flex
464
- ? `This is wave 1 of an adaptive multi-wave run (total budget: ${budget}). Plan the highest-impact foundational work first. Future waves will iterate, polish, and expand based on what's learned.`
465
- : undefined;
466
521
  process.stdout.write("\x1B[?25l");
467
522
  const planRestore = () => process.stdout.write("\x1B[?25h");
468
- console.log(chalk.magenta(`\n Planning${flex ? " wave 1" : ""}...\n`));
523
+ const useThinking = flex && (budget ?? 10) > concurrency * 3;
524
+ const thinkingCount = useThinking ? Math.min(Math.max(concurrency, Math.ceil((budget ?? 10) * 0.005)), 10) : 0;
525
+ const designDir = join(cwd, ".claude-overnight", "designs");
469
526
  try {
470
- tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, (text) => {
471
- process.stdout.write(`\x1B[2K\r ${chalk.dim(text)}`);
472
- }, flexNote);
473
- const flexHint = flex ? chalk.dim(` (wave 1, ${(budget ?? 10) - tasks.length} remaining)`) : "";
474
- process.stdout.write(`\x1B[2K\r ${chalk.green(`${tasks.length} tasks`)}${flexHint}\n\n`);
527
+ if (useThinking) {
528
+ // Phase 1: Quick theme identification → review → then autonomous
529
+ let themeFrame = 0;
530
+ const themeSpinner = setInterval(() => {
531
+ const spin = chalk.cyan(BRAILLE[themeFrame++ % BRAILLE.length]);
532
+ process.stdout.write(`\x1B[2K\r ${spin} ${chalk.dim("identifying themes...")}`);
533
+ }, 120);
534
+ let themes;
535
+ try {
536
+ themes = await identifyThemes(objective, thinkingCount, plannerModel, permissionMode);
537
+ }
538
+ finally {
539
+ clearInterval(themeSpinner);
540
+ }
541
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${themes.length} themes`)}\n\n`);
542
+ // Show themes for review — this is the LAST user interaction
543
+ planRestore();
544
+ let reviewing = true;
545
+ while (reviewing) {
546
+ for (let i = 0; i < themes.length; i++) {
547
+ console.log(chalk.dim(` ${String(i + 1).padStart(3)}.`) + ` ${themes[i]}`);
548
+ }
549
+ console.log(chalk.dim(`\n ${thinkingCount} thinking agents → orchestrate → ${(budget ?? 10) - thinkingCount} execution sessions\n`));
550
+ const action = await selectKey(`${chalk.white(`${themes.length} themes`)} ${chalk.dim(`· ${thinkingCount} thinking · ${concurrency} concurrent`)}`, [
551
+ { key: "r", desc: "un" },
552
+ { key: "e", desc: "dit" },
553
+ { key: "q", desc: "uit" },
554
+ ]);
555
+ switch (action) {
556
+ case "r":
557
+ reviewing = false;
558
+ break;
559
+ case "e": {
560
+ const feedback = await ask(`\n ${chalk.bold("What should change?")}\n ${chalk.cyan(">")} `);
561
+ if (!feedback)
562
+ break;
563
+ process.stdout.write("\x1B[?25l");
564
+ try {
565
+ themes = await identifyThemes(`${objective}\n\nUser feedback: ${feedback}`, thinkingCount, plannerModel, permissionMode);
566
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${themes.length} themes`)}\n\n`);
567
+ }
568
+ catch (err) {
569
+ console.error(chalk.red(`\n Re-planning failed: ${err.message}\n`));
570
+ }
571
+ planRestore();
572
+ break;
573
+ }
574
+ case "q":
575
+ console.log(chalk.dim("\n Aborted.\n"));
576
+ process.exit(0);
577
+ }
578
+ }
579
+ // ── From here, fully autonomous — no more user interaction ──
580
+ process.stdout.write("\x1B[?25l");
581
+ // Phase 2: Thinking wave
582
+ mkdirSync(designDir, { recursive: true });
583
+ const thinkingTasks = buildThinkingTasks(objective, themes, designDir, plannerModel);
584
+ console.log(chalk.cyan(`\n ◆ Thinking: ${thinkingTasks.length} agents exploring...\n`));
585
+ const thinkingSwarm = new Swarm({
586
+ tasks: thinkingTasks, concurrency, cwd,
587
+ model: plannerModel,
588
+ permissionMode,
589
+ useWorktrees: false,
590
+ mergeStrategy: "yolo",
591
+ agentTimeoutMs,
592
+ usageCap,
593
+ });
594
+ const stopThinkRender = startRenderLoop(thinkingSwarm);
595
+ try {
596
+ await thinkingSwarm.run();
597
+ }
598
+ finally {
599
+ stopThinkRender();
600
+ }
601
+ console.log(renderSummary(thinkingSwarm));
602
+ thinkingUsed = thinkingSwarm.completed + thinkingSwarm.failed;
603
+ thinkingCost = thinkingSwarm.totalCostUsd;
604
+ thinkingIn = thinkingSwarm.totalInputTokens;
605
+ thinkingOut = thinkingSwarm.totalOutputTokens;
606
+ thinkingTools = thinkingSwarm.agents.reduce((sum, a) => sum + a.toolCalls, 0);
607
+ // Phase 3: Orchestrate from design docs
608
+ designContext = readDesignDocs(designDir);
609
+ if (designContext) {
610
+ const orchBudget = Math.min(50, Math.max(concurrency, Math.ceil(((budget ?? 10) - thinkingUsed) * 0.5)));
611
+ const flexNote = `This is wave 1 of an adaptive multi-wave run (total budget: ${(budget ?? 10) - thinkingUsed}). Plan the highest-impact foundational work first. Future waves will iterate based on what's learned.`;
612
+ console.log(chalk.cyan(`\n ◆ Orchestrating plan...\n`));
613
+ tasks = await orchestrate(objective, designContext, cwd, plannerModel, workerModel, permissionMode, orchBudget, concurrency, makeProgressLog(), flexNote);
614
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
615
+ }
616
+ else {
617
+ console.log(chalk.yellow(`\n No design docs — falling back to direct planning\n`));
618
+ const waveBudget = Math.min(50, Math.max(concurrency, Math.ceil(((budget ?? 10) - thinkingUsed) * 0.5)));
619
+ tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, makeProgressLog());
620
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
621
+ }
622
+ }
623
+ else {
624
+ // Small budget: direct planning → review → run
625
+ const waveBudget = flex ? Math.min(50, Math.max(concurrency, Math.ceil((budget ?? 10) * 0.5))) : budget;
626
+ const flexNote = flex
627
+ ? `This is wave 1 of an adaptive multi-wave run (total budget: ${budget}). Plan the highest-impact foundational work first. Future waves will iterate, polish, and expand based on what's learned.`
628
+ : undefined;
629
+ console.log(chalk.cyan(`\n ◆ Planning${flex ? " wave 1" : ""}...\n`));
630
+ tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, makeProgressLog(), flexNote);
631
+ const flexHint = flex ? chalk.dim(` · wave 1`) : "";
632
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}${flexHint}\n\n`);
633
+ // Review loop for small-budget path
634
+ planRestore();
635
+ let reviewing = true;
636
+ while (reviewing) {
637
+ showPlan(tasks);
638
+ const action = await selectKey(`${chalk.white(`${tasks.length} tasks`)} ${chalk.dim(`· ${concurrency} concurrent`)}`, [
639
+ { key: "r", desc: "un" },
640
+ { key: "e", desc: "dit" },
641
+ { key: "c", desc: "hat" },
642
+ { key: "q", desc: "uit" },
643
+ ]);
644
+ switch (action) {
645
+ case "r":
646
+ reviewing = false;
647
+ break;
648
+ case "e": {
649
+ const feedback = await ask(`\n ${chalk.bold("What should change?")}\n ${chalk.cyan(">")} `);
650
+ if (!feedback)
651
+ break;
652
+ console.log(chalk.cyan("\n ◆ Re-planning...\n"));
653
+ process.stdout.write("\x1B[?25l");
654
+ try {
655
+ tasks = await refinePlan(objective, tasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, makeProgressLog());
656
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
657
+ }
658
+ catch (err) {
659
+ console.error(chalk.red(`\n Re-planning failed: ${err.message}\n`));
660
+ }
661
+ planRestore();
662
+ break;
663
+ }
664
+ case "c": {
665
+ const question = await ask(`\n ${chalk.bold("Ask about the plan:")}\n ${chalk.cyan(">")} `);
666
+ if (!question)
667
+ break;
668
+ process.stdout.write("\x1B[?25l");
669
+ try {
670
+ let answer = "";
671
+ for await (const msg of query({
672
+ prompt: `You planned these tasks for the objective "${objective}":\n${tasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n")}\n\nUser question: ${question}`,
673
+ options: { cwd, model: plannerModel, permissionMode, persistSession: false },
674
+ })) {
675
+ if (msg.type === "result" && msg.subtype === "success")
676
+ answer = msg.result || "";
677
+ }
678
+ planRestore();
679
+ if (answer)
680
+ console.log(chalk.dim(`\n ${answer.slice(0, 500)}\n`));
681
+ }
682
+ catch {
683
+ planRestore();
684
+ }
685
+ break;
686
+ }
687
+ case "q":
688
+ console.log(chalk.dim("\n Aborted.\n"));
689
+ process.exit(0);
690
+ }
691
+ }
692
+ }
475
693
  }
476
694
  catch (err) {
477
695
  planRestore();
@@ -481,89 +699,27 @@ async function main() {
481
699
  console.error(chalk.red(`\n Planning failed: ${err.message}\n`));
482
700
  process.exit(1);
483
701
  }
484
- // ── Review loop ──
485
- planRestore();
486
- let reviewing = true;
487
- while (reviewing) {
488
- showPlan(tasks);
489
- const action = await selectKey(`${tasks.length} tasks, concurrency ${concurrency}.`, [
490
- { key: "r", desc: "un" },
491
- { key: "e", desc: "dit" },
492
- { key: "c", desc: "hat" },
493
- { key: "q", desc: "uit" },
494
- ]);
495
- switch (action) {
496
- case "r":
497
- reviewing = false;
498
- break;
499
- case "e": {
500
- const feedback = await ask(chalk.bold("\n What should change?\n > "));
501
- if (!feedback)
502
- break;
503
- console.log(chalk.magenta("\n Re-planning...\n"));
504
- process.stdout.write("\x1B[?25l");
505
- try {
506
- tasks = await refinePlan(objective, tasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, (text) => {
507
- process.stdout.write(`\x1B[2K\r ${chalk.dim(text)}`);
508
- });
509
- process.stdout.write(`\x1B[2K\r ${chalk.green(`${tasks.length} tasks`)}\n\n`);
510
- }
511
- catch (err) {
512
- console.error(chalk.red(`\n Re-planning failed: ${err.message}\n`));
513
- }
514
- planRestore();
515
- break;
516
- }
517
- case "c": {
518
- const question = await ask(chalk.bold("\n Ask about the plan:\n > "));
519
- if (!question)
520
- break;
521
- process.stdout.write("\x1B[?25l");
522
- try {
523
- let answer = "";
524
- for await (const msg of query({
525
- prompt: `You planned these tasks for the objective "${objective}":\n${tasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n")}\n\nUser question: ${question}`,
526
- options: { cwd, model: plannerModel, permissionMode, persistSession: false },
527
- })) {
528
- if (msg.type === "result" && msg.subtype === "success")
529
- answer = msg.result || "";
530
- }
531
- planRestore();
532
- if (answer)
533
- console.log(chalk.dim(`\n ${answer.slice(0, 500)}\n`));
534
- }
535
- catch {
536
- planRestore();
537
- }
538
- break;
539
- }
540
- case "q":
541
- console.log(chalk.dim("\n Aborted.\n"));
542
- process.exit(0);
543
- }
544
- }
545
702
  }
546
703
  if (tasks.length === 0) {
547
704
  console.error("No tasks provided.");
548
705
  process.exit(1);
549
706
  }
550
707
  if (dryRun) {
551
- console.log(chalk.bold(" Tasks:"));
552
708
  showPlan(tasks);
709
+ console.log(chalk.dim(" --dry-run: exiting without running\n"));
553
710
  process.exit(0);
554
711
  }
555
712
  // ── Run (wave loop) ──
556
713
  process.stdout.write("\x1B[?25l");
557
714
  const restore = () => process.stdout.write("\x1B[?25h\n");
558
- const agentTimeoutMs = cliFlags.timeout ? parseFloat(cliFlags.timeout) * 1000 : undefined;
559
715
  const runStartedAt = Date.now();
560
716
  // Wave-loop state
561
717
  let currentSwarm;
562
- let remaining = budget ?? tasks.length;
718
+ let remaining = (budget ?? tasks.length) - thinkingUsed;
563
719
  let currentTasks = tasks;
564
720
  let waveNum = 0;
565
721
  const waveHistory = [];
566
- let accCost = 0, accIn = 0, accOut = 0, accCompleted = 0, accFailed = 0, accTools = 0;
722
+ let accCost = thinkingCost, accIn = thinkingIn, accOut = thinkingOut, accCompleted = 0, accFailed = 0, accTools = thinkingTools;
567
723
  let lastCapped = false, lastAborted = false;
568
724
  // For flex + branch strategy: create one target branch, waves merge via yolo into it
569
725
  let runBranch;
@@ -601,7 +757,7 @@ async function main() {
601
757
  if (currentTasks.length > remaining)
602
758
  currentTasks = currentTasks.slice(0, remaining);
603
759
  if (flex) {
604
- console.log(chalk.magenta(`\n \u2500\u2500 Wave ${waveNum + 1} (${currentTasks.length} tasks, ${remaining} remaining) \u2500\u2500\n`));
760
+ console.log(chalk.cyan(`\n Wave ${waveNum + 1}`) + chalk.dim(` · ${currentTasks.length} tasks · ${remaining} remaining\n`));
605
761
  }
606
762
  const swarm = new Swarm({
607
763
  tasks: currentTasks, concurrency, cwd, model: workerModel, permissionMode, allowedTools,
@@ -647,12 +803,10 @@ async function main() {
647
803
  if (!flex || remaining <= 0 || swarm.aborted || swarm.cappedOut)
648
804
  break;
649
805
  // ── Steer next wave ──
650
- console.log(chalk.magenta("\n Steering...\n"));
806
+ console.log(chalk.cyan("\n Steering...\n"));
651
807
  process.stdout.write("\x1B[?25l");
652
808
  try {
653
- const steer = await steerWave(objective, waveHistory, remaining, cwd, plannerModel, workerModel, permissionMode, concurrency, (text) => {
654
- process.stdout.write(`\x1B[2K\r ${chalk.dim(text)}`);
655
- });
809
+ const steer = await steerWave(objective, waveHistory, remaining, cwd, plannerModel, workerModel, permissionMode, concurrency, makeProgressLog(), designContext);
656
810
  process.stdout.write(`\x1B[2K\r`);
657
811
  process.stdout.write("\x1B[?25h");
658
812
  if (steer.done) {
@@ -669,6 +823,11 @@ async function main() {
669
823
  break;
670
824
  }
671
825
  }
826
+ // Clean up design docs
827
+ try {
828
+ rmSync(join(cwd, ".claude-overnight", "designs"), { recursive: true, force: true });
829
+ }
830
+ catch { }
672
831
  // Switch back if we created a run branch
673
832
  if (runBranch && originalRef) {
674
833
  try {
@@ -682,9 +841,10 @@ async function main() {
682
841
  const summaryText = accFailed > 0
683
842
  ? chalk.yellow(`${accCompleted} done, ${accFailed} failed`) + cappedNote
684
843
  : chalk.green(`${accCompleted} done`) + cappedNote;
685
- const costText = accCost > 0 ? ` ($${accCost.toFixed(3)})` : "";
686
- const wavePart = waves > 1 ? `${waves} waves, ` : "";
687
- console.log(`\n ${chalk.bold("Complete:")} ${wavePart}${summaryText}${chalk.dim(costText)}`);
844
+ const costText = accCost > 0 ? chalk.dim(` · $${accCost.toFixed(3)}`) : "";
845
+ const wavePart = waves > 1 ? chalk.dim(`${waves} waves · `) : "";
846
+ console.log(chalk.dim(`\n ${"─".repeat(36)}`));
847
+ console.log(` ${chalk.green("✓")} ${chalk.bold("Complete")} ${wavePart}${summaryText}${costText}`);
688
848
  if (accFailed > 0 && waves === 1) {
689
849
  const failedAgents = currentSwarm?.agents.filter((a) => a.status === "error") ?? [];
690
850
  if (failedAgents.length > 0) {
package/dist/planner.d.ts CHANGED
@@ -16,5 +16,8 @@ export interface SteerResult {
16
16
  export type ModelTier = "opus" | "sonnet" | "haiku" | "unknown";
17
17
  export declare function detectModelTier(model: string): ModelTier;
18
18
  export declare function planTasks(objective: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number | undefined, concurrency: number, onLog: (text: string) => void, flexNote?: string): Promise<Task[]>;
19
+ export declare function identifyThemes(objective: string, count: number, model: string, permissionMode: PermMode): Promise<string[]>;
20
+ export declare function buildThinkingTasks(objective: string, themes: string[], designDir: string, plannerModel: string): Task[];
21
+ export declare function orchestrate(objective: string, designDocs: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number, concurrency: number, onLog: (text: string) => void, flexNote?: string): Promise<Task[]>;
19
22
  export declare function refinePlan(objective: string, previousTasks: Task[], feedback: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number | undefined, concurrency: number, onLog: (text: string) => void): Promise<Task[]>;
20
- export declare function steerWave(objective: string, history: WaveSummary[], remainingBudget: number, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, concurrency: number, onLog: (text: string) => void): Promise<SteerResult>;
23
+ export declare function steerWave(objective: string, history: WaveSummary[], remainingBudget: number, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, concurrency: number, onLog: (text: string) => void, designContext?: string): Promise<SteerResult>;
package/dist/planner.js CHANGED
@@ -2,7 +2,7 @@ import { query } from "@anthropic-ai/claude-agent-sdk";
2
2
  const INACTIVITY_MS = 5 * 60 * 1000;
3
3
  export function detectModelTier(model) {
4
4
  const m = model.toLowerCase();
5
- if (m.includes("opus"))
5
+ if (m === "default" || m.includes("opus"))
6
6
  return "opus";
7
7
  if (m.includes("sonnet"))
8
8
  return "sonnet";
@@ -146,7 +146,32 @@ Respond with ONLY a JSON object (no markdown fences):
146
146
  ]
147
147
  }`;
148
148
  }
149
+ const RATE_LIMIT_PATTERNS = ["rate", "limit", "overloaded", "429", "hit your limit", "too many"];
150
+ function isRateLimitError(err) {
151
+ const msg = err instanceof Error ? err.message : String(err);
152
+ return RATE_LIMIT_PATTERNS.some((p) => msg.toLowerCase().includes(p));
153
+ }
149
154
  async function runPlannerQuery(prompt, opts, onLog) {
155
+ const MAX_RETRIES = 3;
156
+ const BACKOFF = [30_000, 60_000, 120_000];
157
+ for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
158
+ try {
159
+ return await runPlannerQueryOnce(prompt, opts, onLog);
160
+ }
161
+ catch (err) {
162
+ if (attempt < MAX_RETRIES && isRateLimitError(err)) {
163
+ const waitMs = BACKOFF[attempt];
164
+ const waitSec = Math.round(waitMs / 1000);
165
+ onLog(`Rate limited — waiting ${waitSec}s before retry ${attempt + 1}/${MAX_RETRIES}`);
166
+ await new Promise((r) => setTimeout(r, waitMs));
167
+ continue;
168
+ }
169
+ throw err;
170
+ }
171
+ }
172
+ throw new Error("Planner query failed after retries");
173
+ }
174
+ async function runPlannerQueryOnce(prompt, opts, onLog) {
150
175
  let resultText = "";
151
176
  const startedAt = Date.now();
152
177
  const pq = query({
@@ -162,7 +187,7 @@ async function runPlannerQuery(prompt, opts, onLog) {
162
187
  includePartialMessages: true,
163
188
  },
164
189
  });
165
- // Progress ticker — show elapsed time so it doesn't look frozen
190
+ // Progress ticker — fast updates with compact format
166
191
  let lastLogText = "";
167
192
  let toolCount = 0;
168
193
  const ticker = setInterval(() => {
@@ -170,9 +195,10 @@ async function runPlannerQuery(prompt, opts, onLog) {
170
195
  const m = Math.floor(elapsed / 60);
171
196
  const s = elapsed % 60;
172
197
  const timeStr = m > 0 ? `${m}m ${s}s` : `${s}s`;
173
- const extra = lastLogText ? ` ${lastLogText}` : "";
174
- onLog(`${timeStr} elapsed, ${toolCount} tool calls${extra}`);
175
- }, 3000);
198
+ const toolStr = toolCount > 0 ? ` · ${toolCount} tools` : "";
199
+ const extra = lastLogText ? ` · ${lastLogText}` : "";
200
+ onLog(`${timeStr}${toolStr}${extra}`);
201
+ }, 500);
176
202
  let lastActivity = Date.now();
177
203
  let timer;
178
204
  const watchdog = new Promise((_, reject) => {
@@ -201,8 +227,8 @@ async function runPlannerQuery(prompt, opts, onLog) {
201
227
  if (ev?.type === "content_block_delta") {
202
228
  const delta = ev.delta;
203
229
  if (delta?.type === "text_delta" && delta.text) {
204
- const snippet = delta.text.trim();
205
- if (snippet.length > 3) {
230
+ const snippet = delta.text.trim().replace(/[{}"\\,[\]]+/g, " ").replace(/\s+/g, " ").trim();
231
+ if (snippet.length > 5) {
206
232
  lastLogText = snippet.slice(0, 60);
207
233
  }
208
234
  }
@@ -212,7 +238,7 @@ async function runPlannerQuery(prompt, opts, onLog) {
212
238
  if (msg.subtype === "success")
213
239
  resultText = msg.result || "";
214
240
  else
215
- throw new Error(`Planner failed: ${msg.subtype}`);
241
+ throw new Error(`Planner failed: ${msg.result || msg.subtype}`);
216
242
  }
217
243
  }
218
244
  };
@@ -310,6 +336,108 @@ export async function planTasks(objective, cwd, plannerModel, workerModel, permi
310
336
  onLog(`${tasks.length} tasks`);
311
337
  return tasks;
312
338
  }
339
+ // ── Thinking wave ──
340
+ export async function identifyThemes(objective, count, model, permissionMode) {
341
+ let resultText = "";
342
+ for await (const msg of query({
343
+ prompt: `Split this objective into exactly ${count} independent research angles for architects exploring a codebase. Each angle should cover a distinct aspect.
344
+
345
+ Objective: ${objective}
346
+
347
+ Return ONLY a JSON object: {"themes": ["angle description", ...]}`,
348
+ options: {
349
+ model,
350
+ permissionMode,
351
+ ...(permissionMode === "bypassPermissions" && { allowDangerouslySkipPermissions: true }),
352
+ persistSession: false,
353
+ },
354
+ })) {
355
+ if (msg.type === "result" && msg.subtype === "success")
356
+ resultText = msg.result || "";
357
+ }
358
+ const parsed = attemptJsonParse(resultText);
359
+ if (parsed?.themes && Array.isArray(parsed.themes))
360
+ return parsed.themes.slice(0, count);
361
+ const fallback = ["architecture, patterns, and conventions", "data models, state, and persistence", "user-facing flows, components, and UX", "APIs, integrations, and services", "testing, quality, and error handling", "security, performance, and infrastructure", "build, deployment, and configuration", "documentation and developer experience"];
362
+ return Array.from({ length: count }, (_, i) => fallback[i % fallback.length]);
363
+ }
364
+ export function buildThinkingTasks(objective, themes, designDir, plannerModel) {
365
+ return themes.map((theme, i) => ({
366
+ id: `think-${i}`,
367
+ prompt: `You are a senior architect exploring a codebase to design a solution.
368
+
369
+ OVERALL OBJECTIVE: ${objective}
370
+
371
+ YOUR FOCUS: ${theme}
372
+
373
+ Explore the codebase thoroughly using Read, Glob, and Grep. Then write a design document to ${designDir}/focus-${i}.md with these sections:
374
+
375
+ ## Findings
376
+ Key files, patterns, and architecture you discovered. Cite specific file paths and function names.
377
+
378
+ ## Proposed Work Items
379
+ For each item:
380
+ - **What**: What to build or change
381
+ - **Where**: Specific file paths
382
+ - **Why**: Why this matters
383
+ - **Risk**: Conflicts or complications
384
+
385
+ ## Key Files
386
+ Relevant files with one-line descriptions.
387
+
388
+ Be thorough — your findings drive the execution plan.`,
389
+ model: plannerModel,
390
+ }));
391
+ }
392
+ export async function orchestrate(objective, designDocs, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, onLog, flexNote) {
393
+ const capability = modelCapabilityBlock(workerModel);
394
+ const flexLine = flexNote ? `\n\n${flexNote}` : "";
395
+ const prompt = `You are a tech lead planning a sprint based on your team's codebase research.
396
+
397
+ Objective: ${objective}
398
+
399
+ Your architects explored the codebase and found:
400
+
401
+ ${designDocs}
402
+
403
+ AGENT CAPABILITY: ${capability}
404
+
405
+ Create exactly ~${budget} concrete execution tasks based on these findings.
406
+
407
+ Requirements:
408
+ - Each task is actionable by a single agent session
409
+ - Each task MUST be independent — no dependencies between tasks
410
+ - ${concurrency} agents run in parallel — tasks must touch DIFFERENT files
411
+ - Trust the research — don't tell agents to re-explore what's documented
412
+ - Reference specific files and patterns from the findings
413
+ - Priority order: foundational first, polish last${flexLine}
414
+
415
+ Respond with ONLY a JSON object (no markdown fences):
416
+ {"tasks": [{"prompt": "..."}]}`;
417
+ onLog("Synthesizing...");
418
+ const resultText = await runPlannerQuery(prompt, { cwd, model: plannerModel, permissionMode }, onLog);
419
+ const parsed = await extractTaskJson(resultText, async () => {
420
+ onLog("Retrying...");
421
+ let retryText = "";
422
+ for await (const msg of query({
423
+ prompt: `Output ONLY a JSON object:\n{"tasks":[{"prompt":"..."}]}`,
424
+ options: { cwd, model: plannerModel, permissionMode, ...(permissionMode === "bypassPermissions" && { allowDangerouslySkipPermissions: true }), persistSession: false },
425
+ })) {
426
+ if (msg.type === "result" && msg.subtype === "success")
427
+ retryText = msg.result || "";
428
+ }
429
+ return retryText;
430
+ });
431
+ let tasks = (parsed.tasks || []).map((t, i) => ({
432
+ id: String(i),
433
+ prompt: typeof t === "string" ? t : t.prompt,
434
+ }));
435
+ tasks = postProcess(tasks, budget, onLog);
436
+ if (tasks.length === 0)
437
+ throw new Error("Orchestration generated 0 tasks");
438
+ onLog(`${tasks.length} tasks`);
439
+ return tasks;
440
+ }
313
441
  export async function refinePlan(objective, previousTasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, onLog) {
314
442
  onLog("Refining plan...");
315
443
  const prev = previousTasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n");
@@ -421,7 +549,7 @@ async function extractTaskJson(raw, retry) {
421
549
  throw new Error("Planner did not return valid task JSON after retry");
422
550
  }
423
551
  // ── Wave steering ──
424
- export async function steerWave(objective, history, remainingBudget, cwd, plannerModel, workerModel, permissionMode, concurrency, onLog) {
552
+ export async function steerWave(objective, history, remainingBudget, cwd, plannerModel, workerModel, permissionMode, concurrency, onLog, designContext) {
425
553
  const capability = modelCapabilityBlock(workerModel);
426
554
  const historyText = history.map(w => {
427
555
  const lines = w.tasks.map(t => {
@@ -437,7 +565,7 @@ Objective: ${objective}
437
565
 
438
566
  Work completed so far:
439
567
  ${historyText}
440
-
568
+ ${designContext ? `\nOriginal architectural research:\n${designContext}\n` : ""}
441
569
  Remaining budget: ${remainingBudget} agent sessions. ${concurrency} agents run in parallel — tasks must touch DIFFERENT files.
442
570
  ${capability}
443
571
 
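Note on the planner.js retry change above: a minimal standalone sketch of the same pattern (substring-based rate-limit detection plus a fixed backoff schedule). The pattern list and backoff values are copied from the diff. The name `retryOnRateLimit`, the options object, and the injectable `sleep` are hypothetical and are not part of the package.

```javascript
// Sketch only: retryOnRateLimit and its options are illustrative, not the package's API.
const RATE_LIMIT_PATTERNS = ["rate", "limit", "overloaded", "429", "hit your limit", "too many"];

// True when the error message contains any of the patterns (case-insensitive).
function isRateLimitError(err) {
  const msg = (err instanceof Error ? err.message : String(err)).toLowerCase();
  return RATE_LIMIT_PATTERNS.some((p) => msg.includes(p));
}

// Runs fn, retrying once per backoff entry when the failure looks like a rate limit.
// `sleep` is injectable so the sketch can be exercised without real delays.
async function retryOnRateLimit(fn, options = {}) {
  const backoff = options.backoff ?? [30_000, 60_000, 120_000];
  const sleep = options.sleep ?? ((ms) => new Promise((r) => setTimeout(r, ms)));
  const onLog = options.onLog ?? (() => {});
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up when retries are exhausted or the error is unrelated to rate limits.
      if (attempt >= backoff.length || !isRateLimitError(err)) throw err;
      const waitSec = Math.round(backoff[attempt] / 1000);
      onLog(`Rate limited, waiting ${waitSec}s before retry ${attempt + 1}/${backoff.length}`);
      await sleep(backoff[attempt]);
    }
  }
}
```

Because the match is a plain substring test, short patterns such as "rate" also match unrelated messages (for example one containing "generate"), so a stricter matcher may be worth considering if false positives matter.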
package/dist/swarm.js CHANGED
@@ -240,9 +240,17 @@ export class Swarm {
240
240
  this.activeQueries.delete(agentQuery);
241
241
  }
242
242
  if (agent.status === "running") {
243
- agent.status = "done";
244
243
  agent.finishedAt = Date.now();
245
- this.completed++;
244
+ const duration = agent.finishedAt - (agent.startedAt || agent.finishedAt);
245
+ if (agent.toolCalls === 0 && (agent.costUsd ?? 0) < 0.001 && duration < 15_000) {
246
+ agent.status = "error";
247
+ agent.error = "Agent did no work (likely rate-limited before starting)";
248
+ this.failed++;
249
+ }
250
+ else {
251
+ agent.status = "done";
252
+ this.completed++;
253
+ }
246
254
  this.log(id, this.agentSummary(agent));
247
255
  }
248
256
  break; // Success — exit retry loop
@@ -424,12 +432,13 @@ export class Swarm {
424
432
  finally {
425
433
  if (stashed) {
426
434
  try {
427
- exec("git stash pop", this.config.cwd);
428
- this.log(-1, "Restored stashed changes");
429
- }
430
- catch (e) {
431
- this.log(-1, `Stash pop failed: ${String(e.message || e).slice(0, 80)}`);
435
+ const stashList = exec("git stash list", this.config.cwd).trim();
436
+ if (stashList) {
437
+ exec("git stash pop", this.config.cwd);
438
+ this.log(-1, "Restored stashed changes");
439
+ }
432
440
  }
441
+ catch { /* stash already gone or empty */ }
433
442
  }
434
443
  }
435
444
  }
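Note on the swarm.js completion change above: the new check can be read as a pure classification step. The sketch below isolates it under that assumption. The thresholds and error text are taken from the diff, while `classifyFinishedAgent` and its return shape are hypothetical names used only for illustration.

```javascript
// Sketch only: classifyFinishedAgent is an illustrative helper, not part of the package.
function classifyFinishedAgent(agent, finishedAt = Date.now()) {
  // A missing startedAt yields a duration of 0, as in the diff.
  const duration = finishedAt - (agent.startedAt || finishedAt);
  // No tool calls, negligible cost, and a run under 15 seconds is treated as "did no work".
  const didNoWork =
    agent.toolCalls === 0 && (agent.costUsd ?? 0) < 0.001 && duration < 15_000;
  return didNoWork
    ? { status: "error", error: "Agent did no work (likely rate-limited before starting)" }
    : { status: "done" };
}

// Example: an agent that exited after 4 seconds without touching any tool.
console.log(classifyFinishedAgent({ toolCalls: 0, costUsd: 0, startedAt: 1_000 }, 5_000).status);
```

All three conditions must hold, so a quick agent that made even one tool call, or a slow one that made none, still counts as done.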
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "claude-overnight",
3
- "version": "0.3.2",
3
+ "version": "0.5.1",
4
4
  "description": "Fire off Claude agents, come back days later to shipped work. Maximizes every token in your plan.",
5
5
  "type": "module",
6
6
  "bin": {