claude-overnight 0.3.2 → 0.5.0

package/README.md CHANGED
@@ -2,7 +2,7 @@
2
2
 
3
3
  Fire off Claude agents, come back to shipped work.
4
4
 
5
- Describe what to build. Set a budget — 10 agents, 100, 1000. A planner agent analyzes your codebase, breaks the objective into that many independent tasks, and launches them all. Each agent runs in its own git worktree with full tooling (Read, Edit, Bash, Grep — everything). Rate limits? It waits. Windows reset? It resumes. It doesn't stop until every task is done.
5
+ Describe what to build. Set a budget — 10 agents, 100, 1000. A planner agent analyzes your codebase, breaks the objective into independent tasks, and launches them all. Each agent runs in its own git worktree with full tooling (Read, Edit, Bash, Grep — everything). Rate limits? It waits. Windows reset? It resumes. It doesn't stop until every task is done.
6
6
 
7
7
  ## Install
8
8
 
@@ -20,7 +20,32 @@ Requires Node.js >= 20 and Claude authentication (OAuth via `claude` CLI, or `AN
20
20
  claude-overnight
21
21
  ```
22
22
 
23
- Describe your objective, set a budget, pick a worker model, set a usage limit. The planner generates tasks — review, edit, or chat about them, then run.
23
+ A guided flow walks you through each step:
24
+
25
+ ```
26
+ 🌙 claude-overnight
27
+ ────────────────────────────────────
28
+
29
+ ① What should the agents do?
30
+ > refactor auth, add tests, update docs
31
+
32
+ ② Budget [10]: 50
33
+
34
+ ③ Worker model:
35
+ ● Sonnet — Sonnet 4.6 · Best for everyday tasks
36
+ ○ Opus — Opus 4.6 · Most capable
37
+ ○ Haiku — Haiku 4.5 · Fastest
38
+
39
+ ④ Usage:
40
+ ● Unlimited · full capacity, wait through rate limits
41
+ ○ 90% · leave 10% for other work
42
+
43
+ ╭────────────────────────────────────╮
44
+ │ sonnet · budget 50 · 5× · flex │
45
+ ╰────────────────────────────────────╯
46
+ ```
47
+
48
+ The planner generates tasks — review, edit, or chat about them, then run.
24
49
 
25
50
  ### Task file
26
51
 
@@ -38,6 +63,22 @@ claude-overnight "fix auth bug in src/auth.ts" "add tests for user model"
38
63
 
39
64
  The planner always runs on the best available model (Opus) regardless of which model you pick for workers. This ensures high-quality task decomposition even when workers use a cheaper model.
40
65
 
66
+ ### Thinking wave
67
+
68
+ For large budgets (`budget > concurrency * 3`), the planner doesn't try to generate hundreds of tasks from scratch. Instead, it launches a **thinking wave** — a team of architect agents that explore your codebase in parallel before any code is written.
69
+
70
+ ```
71
+ ⠋ identifying themes... → splits objective into N angles (< 30s)
72
+ ◆ Thinking: 5 agents exploring → each explores from its angle, writes a design doc
73
+ ◆ Orchestrating plan... → reads all design docs, synthesizes execution tasks
74
+ ```
75
+
76
+ Each thinking agent gets a different research focus (architecture, data, UI, APIs, testing, etc.), explores using Read/Glob/Grep, and writes a structured design document with findings, proposed work items, and key files. The orchestrator then reads all design docs and produces grounded, well-informed execution tasks that reference specific files and patterns the researchers found.
77
+
78
+ This means a budget of 200 doesn't generate 200 tasks from a single LLM call guessing at your codebase. It sends 5 architects to study the code first, then plans 50 tasks based on their findings, executes them, steers, and repeats.
79
+
80
+ For small budgets (≤ `concurrency * 3`), the planner skips the thinking wave and generates tasks directly — fast and efficient for focused work.
81
+
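As a rough sketch of the sizing rule described above, the function below mirrors the expressions that appear in the `dist/index.js` changes in this diff (`budget > concurrency * 3` and the `Math.min(50, ...)` cap). The helper name `planFirstWave` and its return shape are hypothetical. It assumes flex mode is on, one thinking agent per concurrency slot, and that each thinking agent uses one unit of budget.

```javascript
// Hypothetical helper: condenses the wave-1 sizing rule into one pure function.
// Assumes flex mode and that every thinking agent consumes one budget unit.
function planFirstWave(budget, concurrency) {
  // Large budgets get a thinking wave first; small ones plan directly.
  const useThinking = budget > concurrency * 3;
  const thinkingAgents = useThinking ? concurrency : 0;
  const afterThinking = budget - thinkingAgents;
  // Wave 1 targets about half of what is left: at least `concurrency`, at most 50.
  const waveBudget = Math.min(50, Math.max(concurrency, Math.ceil(afterThinking * 0.5)));
  return {
    useThinking,
    thinkingAgents,
    waveBudget,
    leftForLaterWaves: afterThinking - waveBudget,
  };
}

console.log(planFirstWave(200, 5));
// { useThinking: true, thinkingAgents: 5, waveBudget: 50, leftForLaterWaves: 145 }
console.log(planFirstWave(15, 5));
// { useThinking: false, thinkingAgents: 0, waveBudget: 8, leftForLaterWaves: 7 }
```

`waveBudget` is only the target handed to the planner. The number of tasks it actually returns may differ, so the remaining budget in a real run is computed from the task count.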
41
82
  ### Model-aware task design
42
83
 
43
84
  The planner calibrates task ambition based on your worker model:
@@ -56,20 +97,20 @@ The budget also shapes task granularity:
56
97
 
57
98
  **Medium budget (16-50)**: Autonomous missions. "Design and implement the complete favorites system: DB schema, API routes, client hooks, error handling."
58
99
 
59
- **Large budget (50+)**: Full workstream decomposition. Architecture, features, testing, security, UX polish, performance: everything a team would cover. Each task is a substantial work session.
100
+ **Large budget (50+)**: Thinking wave + orchestration. Architects explore, then execution tasks are synthesized from their findings. Each task is a substantial work session grounded in real codebase analysis.
60
101
 
61
- A budget of 200 is not 200 micro-edits. It's 200 senior-engineer work sessions running in parallel.
102
+ A budget of 200 is not 200 micro-edits. It's 5 architects + ~195 senior-engineer work sessions, planned in waves.
62
103
 
63
104
  ## Usage limits
64
105
 
65
- Control how much of your plan capacity the run consumes. In interactive mode, you'll be asked:
106
+ Control how much of your plan capacity the run consumes:
66
107
 
67
108
  ```
68
- Usage limit:
69
- Unlimited use full capacity, wait through rate limits
70
- 90% leave 10% for other work
71
- 75% conservative, plenty of headroom
72
- 50% use half, keep the rest
109
+ Usage:
110
+ Unlimited · full capacity, wait through rate limits
111
+ 90% · leave 10% for other work
112
+ 75% · conservative, plenty of headroom
113
+ 50% · use half, keep the rest
73
114
  ```
74
115
 
75
116
  When utilization hits your cap, the swarm stops dispatching new tasks and lets active agents finish gracefully. This way you can run a big overnight job and still have capacity left for manual Claude usage.
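A minimal sketch of that behaviour, not the package's actual scheduler: the `dispatchLoop` function, the `getUtilization` callback, and the task shape are all hypothetical, and tasks are assumed to resolve rather than reject. It only shows the idea of refusing to start new work once utilization reaches the cap while letting in-flight tasks settle.

```javascript
// Hypothetical dispatcher: stops handing out new tasks once utilization
// reaches the cap, but still waits for tasks that have already started.
async function dispatchLoop(tasks, { concurrency, usageCap, getUtilization }) {
  const queue = [...tasks];
  const active = new Set();
  const finished = [];
  let cappedOut = false;

  while (queue.length > 0 || active.size > 0) {
    // Fill free slots unless the cap has been reached.
    while (!cappedOut && queue.length > 0 && active.size < concurrency) {
      if (usageCap != null && getUtilization() >= usageCap) {
        cappedOut = true; // no new dispatches from here on
        break;
      }
      const task = queue.shift();
      const run = task.run().then((result) => {
        finished.push(result);
        active.delete(run);
      });
      active.add(run);
    }
    if (active.size === 0) break; // capped with nothing in flight
    await Promise.race(active); // wait for any active task to settle
  }
  return { finished, skipped: queue.length, cappedOut };
}

// Usage: each fake task "uses" 30% of capacity; the cap is 50%.
let used = 0;
const demoTasks = [1, 2, 3, 4].map((id) => ({
  run: async () => {
    used += 0.3;
    return id;
  },
}));
dispatchLoop(demoTasks, { concurrency: 1, usageCap: 0.5, getUtilization: () => used })
  .then((summary) => console.log(summary));
```

An `usageCap` of `undefined` corresponds to the "Unlimited" choice: the cap check is skipped and every task is dispatched.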
package/dist/index.js CHANGED
@@ -1,5 +1,5 @@
1
1
  #!/usr/bin/env node
2
- import { readFileSync, existsSync } from "fs";
2
+ import { readFileSync, existsSync, mkdirSync, readdirSync, rmSync } from "fs";
3
3
  import { resolve, dirname, join } from "path";
4
4
  import { fileURLToPath } from "url";
5
5
  import { execSync } from "child_process";
@@ -7,7 +7,7 @@ import { createInterface } from "readline";
7
7
  import chalk from "chalk";
8
8
  import { query } from "@anthropic-ai/claude-agent-sdk";
9
9
  import { Swarm } from "./swarm.js";
10
- import { planTasks, refinePlan, detectModelTier, steerWave } from "./planner.js";
10
+ import { planTasks, refinePlan, detectModelTier, steerWave, identifyThemes, buildThinkingTasks, orchestrate } from "./planner.js";
11
11
  import { startRenderLoop, renderSummary } from "./ui.js";
12
12
  // ── CLI flag parsing ──
13
13
  function parseCliFlags(argv) {
@@ -86,10 +86,11 @@ async function select(label, items, defaultIdx = 0) {
86
86
  if (!first)
87
87
  stdout.write(`\x1B[${items.length}A`);
88
88
  for (let i = 0; i < items.length; i++) {
89
- const arrow = i === idx ? chalk.green(" → ") : " ";
90
- const name = i === idx ? chalk.green(items[i].name) : chalk.dim(items[i].name);
91
- const hint = items[i].hint ? chalk.dim(` — ${items[i].hint}`) : "";
92
- stdout.write(`\x1B[2K${arrow}${name}${hint}\n`);
89
+ const sel = i === idx;
90
+ const radio = sel ? chalk.cyan(" ● ") : chalk.dim(" ○ ");
91
+ const name = sel ? chalk.white(items[i].name) : chalk.dim(items[i].name);
92
+ const hint = items[i].hint ? chalk.dim(` · ${items[i].hint}`) : "";
93
+ stdout.write(`\x1B[2K${radio}${name}${hint}\n`);
93
94
  }
94
95
  };
95
96
  stdout.write(`\n ${chalk.bold(label)}\n`);
@@ -134,7 +135,8 @@ async function select(label, items, defaultIdx = 0) {
134
135
  async function selectKey(label, options) {
135
136
  const { stdin, stdout } = process;
136
137
  const keys = options.map((o) => o.key.toLowerCase());
137
- stdout.write(`\n ${label} ${options.map((o) => `[${chalk.bold(o.key.toUpperCase())}]${chalk.dim(o.desc)}`).join(" ")}\n `);
138
+ const optStr = options.map((o) => `${chalk.cyan.bold(o.key.toUpperCase())}${chalk.dim(o.desc)}`).join(chalk.dim(" "));
139
+ stdout.write(`\n ${label}\n ${optStr}\n `);
138
140
  return new Promise((resolve) => {
139
141
  stdin.setRawMode(true);
140
142
  stdin.resume();
@@ -259,10 +261,37 @@ function validateGitRepo(cwd) {
259
261
  }
260
262
  // ── Show plan ──
261
263
  function showPlan(tasks) {
264
+ const w = Math.max((process.stdout.columns ?? 80) - 6, 40);
265
+ const ruleLen = Math.min(w, 70);
266
+ console.log(chalk.dim(` ─── ${tasks.length} tasks ${"─".repeat(Math.max(0, ruleLen - String(tasks.length).length - 10))}`));
262
267
  for (const t of tasks) {
263
- console.log(chalk.dim(` ${Number(t.id) + 1}. ${t.prompt.slice(0, 90)}`));
268
+ const num = chalk.dim(String(Number(t.id) + 1).padStart(4) + ".");
269
+ console.log(`${num} ${t.prompt.slice(0, w)}`);
264
270
  }
265
- console.log("");
271
+ console.log(chalk.dim(` ${"─".repeat(ruleLen)}\n`));
272
+ }
273
+ function readDesignDocs(dir) {
274
+ try {
275
+ const files = readdirSync(dir).filter(f => f.endsWith(".md")).sort();
276
+ return files.map(f => {
277
+ const content = readFileSync(join(dir, f), "utf-8");
278
+ return `### ${f}\n${content}`;
279
+ }).join("\n\n");
280
+ }
281
+ catch {
282
+ return "";
283
+ }
284
+ }
285
+ const BRAILLE = ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"];
286
+ function makeProgressLog() {
287
+ let frame = 0;
288
+ return (text) => {
289
+ const spin = chalk.cyan(BRAILLE[frame++ % BRAILLE.length]);
290
+ const maxW = (process.stdout.columns ?? 80) - 6;
291
+ const clean = text.replace(/\n/g, " ");
292
+ const line = clean.length > maxW ? clean.slice(0, maxW - 1) + "\u2026" : clean;
293
+ process.stdout.write(`\x1B[2K\r ${spin} ${chalk.dim(line)}`);
294
+ };
266
295
  }
267
296
  // ── Main ──
268
297
  async function main() {
@@ -275,25 +304,26 @@ async function main() {
275
304
  }
276
305
  if (argv.includes("-h") || argv.includes("--help")) {
277
306
  console.log(`
278
- ${chalk.bold("claude-overnight")} — fire off Claude agents, come back to shipped work
307
+ ${chalk.bold("🌙 claude-overnight")} ${chalk.dim("— fire off Claude agents, come back to shipped work")}
308
+ ${chalk.dim("─".repeat(60))}
279
309
 
280
- ${chalk.dim("Usage:")}
281
- claude-overnight ${chalk.dim("interactive — describe what to do, review plan, run")}
282
- claude-overnight tasks.json ${chalk.dim("run tasks defined in a JSON file")}
283
- claude-overnight "fix auth" "add tests" ${chalk.dim("run inline tasks in parallel")}
310
+ ${chalk.cyan("Usage")}
311
+ claude-overnight ${chalk.dim("interactive mode")}
312
+ claude-overnight tasks.json ${chalk.dim("task file mode")}
313
+ claude-overnight "fix auth" "add tests" ${chalk.dim("inline tasks")}
284
314
 
285
- ${chalk.dim("Flags:")}
315
+ ${chalk.cyan("Flags")}
286
316
  -h, --help Show this help
287
317
  -v, --version Print version
288
318
  --dry-run Show planned tasks without running them
289
- --budget=N Target number of agent runs ${chalk.dim("(planner aims for this many tasks)")}
319
+ --budget=N Target number of agent runs ${chalk.dim("(default: 10)")}
290
320
  --concurrency=N Max parallel agents ${chalk.dim("(default: 5)")}
291
321
  --model=NAME Worker model override ${chalk.dim("(planner always uses best available)")}
292
322
  --usage-cap=N Stop at N% utilization ${chalk.dim("(e.g. 90 to save 10% for other work)")}
293
323
  --timeout=SECONDS Agent inactivity timeout ${chalk.dim("(default: 300s, kills only silent agents)")}
294
324
  --no-flex Disable adaptive multi-wave planning ${chalk.dim("(run all tasks in one shot)")}
295
325
 
296
- ${chalk.dim("Non-interactive defaults (task file / inline / piped):")}
326
+ ${chalk.cyan("Defaults")} ${chalk.dim("(non-interactive)")}
297
327
  model: first available concurrency: 5 worktrees: auto perms: auto
298
328
  `);
299
329
  process.exit(0);
@@ -344,7 +374,8 @@ async function main() {
344
374
  }
345
375
  }
346
376
  // ── Determine mode ──
347
- console.log(chalk.bold("\n \uD83C\uDF19 claude-overnight\n"));
377
+ console.log(`\n ${chalk.bold("🌙 claude-overnight")}`);
378
+ console.log(chalk.dim(` ${"─".repeat(36)}`));
348
379
  const noTTY = !process.stdin.isTTY;
349
380
  const nonInteractive = noTTY || fileCfg !== undefined || tasks.length > 0;
350
381
  const cwd = fileCfg?.cwd ?? process.cwd();
@@ -363,55 +394,80 @@ async function main() {
363
394
  let objective = fileCfg?.objective;
364
395
  let usageCap;
365
396
  if (!nonInteractive) {
366
- console.log(chalk.dim(" Fire off Claude agents, come back to shipped work.\n"));
367
- // 1. Objective first — it's the whole point
397
+ // Objective
368
398
  while (true) {
369
- objective = await ask(chalk.bold(" What should the agents do?\n > "));
399
+ objective = await ask(`\n ${chalk.cyan("①")} ${chalk.bold("What should the agents do?")}\n ${chalk.cyan(">")} `);
370
400
  if (!objective) {
371
401
  console.error(chalk.red("\n No objective provided."));
372
402
  process.exit(1);
373
403
  }
374
404
  if (objective.split(/\s+/).length >= 5)
375
405
  break;
376
- console.log(chalk.yellow(' Be specific, e.g. "refactor the auth module, add tests, and update docs"\n'));
406
+ console.log(chalk.yellow(' Be specific, e.g. "refactor the auth module, add tests, and update docs"'));
377
407
  }
378
- // 2. Budget how many agent runs to spend
379
- const budgetAns = await ask(chalk.dim("\n Agent budget [10]: "));
408
+ // Start fetching models while user enters budget
409
+ const modelsPromise = fetchModels();
410
+ // ② Budget
411
+ const budgetAns = await ask(`\n ${chalk.cyan("②")} ${chalk.dim("Budget")} ${chalk.dim("[")}${chalk.white("10")}${chalk.dim("]:")} `);
380
412
  budget = parseInt(budgetAns) || 10;
381
413
  if (budget < 1) {
382
414
  console.error(chalk.red(` Budget must be a positive number`));
383
415
  process.exit(1);
384
416
  }
385
- // 3. Worker model — planner always uses best available
386
- process.stdout.write(chalk.dim(" Fetching models..."));
387
- const models = await fetchModels();
388
- process.stdout.write(`\x1B[2K\r`);
389
- // Pick best model for planner (first = most capable)
417
+ // Worker model — show spinner if models aren't ready yet
418
+ let modelFrame = 0;
419
+ const modelSpinner = setInterval(() => {
420
+ const spin = chalk.cyan(BRAILLE[modelFrame++ % BRAILLE.length]);
421
+ process.stdout.write(`\x1B[2K\r ${spin} ${chalk.dim("loading models...")}`);
422
+ }, 120);
423
+ let models;
424
+ try {
425
+ models = await modelsPromise;
426
+ }
427
+ finally {
428
+ clearInterval(modelSpinner);
429
+ process.stdout.write(`\x1B[2K\r`);
430
+ }
390
431
  plannerModel = models[0]?.value || "claude-sonnet-4-6";
391
432
  if (models.length > 0) {
392
- workerModel = await select("Worker model (planner always uses best available):", models.map((m) => ({
433
+ workerModel = await select(`${chalk.cyan("③")} Worker model:`, models.map((m) => ({
393
434
  name: m.displayName,
394
435
  value: m.value,
395
436
  hint: m.description,
396
437
  })));
397
438
  }
398
439
  else {
399
- const ans = await ask(chalk.dim(" Worker model [claude-sonnet-4-6]: "));
440
+ const ans = await ask(` ${chalk.cyan("③")} ${chalk.dim("Worker model [claude-sonnet-4-6]:")} `);
400
441
  workerModel = ans || "claude-sonnet-4-6";
401
442
  }
402
- if (workerModel !== plannerModel) {
403
- const tier = detectModelTier(workerModel);
404
- console.log(chalk.dim(`\n Planner: ${plannerModel} · Workers: ${workerModel} (${tier})`));
405
- }
406
- // 4. Usage cap — how much of your plan to use
407
- usageCap = await select("Usage limit:", [
408
- { name: "Unlimited", value: undefined, hint: "use full capacity, wait through rate limits" },
443
+ // Usage
444
+ usageCap = await select(`${chalk.cyan("④")} Usage:`, [
445
+ { name: "Unlimited", value: undefined, hint: "full capacity, wait through rate limits" },
409
446
  { name: "90%", value: 0.9, hint: "leave 10% for other work" },
410
447
  { name: "75%", value: 0.75, hint: "conservative, plenty of headroom" },
411
448
  { name: "50%", value: 0.5, hint: "use half, keep the rest" },
412
449
  ]);
413
- // Concurrency defaults based on budget
414
450
  concurrency = Math.min(5, budget);
451
+ // Config summary box
452
+ const parts = [];
453
+ if (workerModel !== plannerModel) {
454
+ const tier = detectModelTier(workerModel);
455
+ parts.push(`${tier} → ${detectModelTier(plannerModel)}`);
456
+ }
457
+ else {
458
+ parts.push(detectModelTier(workerModel));
459
+ }
460
+ parts.push(`budget ${budget}`);
461
+ parts.push(`${concurrency}×`);
462
+ if (budget > 2)
463
+ parts.push("flex");
464
+ if (usageCap != null)
465
+ parts.push(`cap ${Math.round(usageCap * 100)}%`);
466
+ const inner = parts.join(chalk.dim(" · "));
467
+ const innerLen = parts.join(" · ").length;
468
+ console.log(chalk.dim(`\n ╭${"─".repeat(innerLen + 4)}╮`));
469
+ console.log(chalk.dim(" │") + ` ${inner} ` + chalk.dim("│"));
470
+ console.log(chalk.dim(` ╰${"─".repeat(innerLen + 4)}╯`));
415
471
  }
416
472
  else {
417
473
  // Non-interactive: resolve config from file/flags/defaults
@@ -451,6 +507,10 @@ async function main() {
451
507
  }
452
508
  // ── Flex mode: adaptive multi-wave planning ──
453
509
  const flex = !argv.includes("--no-flex") && (fileCfg?.flexiblePlan ?? objective != null) && objective != null && (budget ?? 10) > 2;
510
+ const agentTimeoutMs = cliFlags.timeout ? parseFloat(cliFlags.timeout) * 1000 : undefined;
511
+ let thinkingUsed = 0;
512
+ let thinkingCost = 0, thinkingIn = 0, thinkingOut = 0, thinkingTools = 0;
513
+ let designContext;
454
514
  // ── Plan phase (interactive: review loop, non-interactive: auto-plan or skip) ──
455
515
  const needsPlan = tasks.length === 0;
456
516
  if (needsPlan) {
@@ -458,20 +518,81 @@ async function main() {
458
518
  console.error(chalk.red(" No tasks provided and stdin is not a TTY. Provide tasks via args or a .json file."));
459
519
  process.exit(1);
460
520
  }
461
- // In flex mode, plan ~50% of budget for wave 1, leaving room for steering
462
- const waveBudget = flex ? Math.max(concurrency, Math.ceil((budget ?? 10) * 0.5)) : budget;
463
- const flexNote = flex
464
- ? `This is wave 1 of an adaptive multi-wave run (total budget: ${budget}). Plan the highest-impact foundational work first. Future waves will iterate, polish, and expand based on what's learned.`
465
- : undefined;
466
521
  process.stdout.write("\x1B[?25l");
467
522
  const planRestore = () => process.stdout.write("\x1B[?25h");
468
- console.log(chalk.magenta(`\n Planning${flex ? " wave 1" : ""}...\n`));
523
+ const useThinking = flex && (budget ?? 10) > concurrency * 3;
524
+ const designDir = join(cwd, ".claude-overnight", "designs");
469
525
  try {
470
- tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, (text) => {
471
- process.stdout.write(`\x1B[2K\r ${chalk.dim(text)}`);
472
- }, flexNote);
473
- const flexHint = flex ? chalk.dim(` (wave 1, ${(budget ?? 10) - tasks.length} remaining)`) : "";
474
- process.stdout.write(`\x1B[2K\r ${chalk.green(`${tasks.length} tasks`)}${flexHint}\n\n`);
526
+ if (useThinking) {
527
+ // Phase 1: Quick theme identification
528
+ let themeFrame = 0;
529
+ const themeSpinner = setInterval(() => {
530
+ const spin = chalk.cyan(BRAILLE[themeFrame++ % BRAILLE.length]);
531
+ process.stdout.write(`\x1B[2K\r ${spin} ${chalk.dim("identifying themes...")}`);
532
+ }, 120);
533
+ let themes;
534
+ try {
535
+ themes = await identifyThemes(objective, concurrency, plannerModel, permissionMode);
536
+ }
537
+ finally {
538
+ clearInterval(themeSpinner);
539
+ }
540
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${themes.length} themes`)}\n`);
541
+ // Phase 2: Thinking wave — agents explore codebase
542
+ mkdirSync(designDir, { recursive: true });
543
+ const thinkingTasks = buildThinkingTasks(objective, themes, designDir, plannerModel);
544
+ console.log(chalk.cyan(`\n ◆ Thinking: ${thinkingTasks.length} agents exploring...\n`));
545
+ const thinkingSwarm = new Swarm({
546
+ tasks: thinkingTasks, concurrency, cwd,
547
+ model: plannerModel,
548
+ permissionMode,
549
+ useWorktrees: false,
550
+ mergeStrategy: "yolo",
551
+ agentTimeoutMs,
552
+ usageCap,
553
+ });
554
+ const stopThinkRender = startRenderLoop(thinkingSwarm);
555
+ try {
556
+ await thinkingSwarm.run();
557
+ }
558
+ finally {
559
+ stopThinkRender();
560
+ }
561
+ console.log(renderSummary(thinkingSwarm));
562
+ thinkingUsed = thinkingSwarm.completed + thinkingSwarm.failed;
563
+ thinkingCost = thinkingSwarm.totalCostUsd;
564
+ thinkingIn = thinkingSwarm.totalInputTokens;
565
+ thinkingOut = thinkingSwarm.totalOutputTokens;
566
+ thinkingTools = thinkingSwarm.agents.reduce((sum, a) => sum + a.toolCalls, 0);
567
+ // Phase 3: Orchestrate from design docs
568
+ designContext = readDesignDocs(designDir);
569
+ if (designContext) {
570
+ const orchBudget = Math.min(50, Math.max(concurrency, Math.ceil(((budget ?? 10) - thinkingUsed) * 0.5)));
571
+ const flexNote = `This is wave 1 of an adaptive multi-wave run (total budget: ${(budget ?? 10) - thinkingUsed}). Plan the highest-impact foundational work first. Future waves will iterate based on what's learned.`;
572
+ console.log(chalk.cyan(`\n ◆ Orchestrating plan...\n`));
573
+ tasks = await orchestrate(objective, designContext, cwd, plannerModel, workerModel, permissionMode, orchBudget, concurrency, makeProgressLog(), flexNote);
574
+ const remaining = (budget ?? 10) - thinkingUsed - tasks.length;
575
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}${chalk.dim(` · ${remaining} remaining`)}\n\n`);
576
+ }
577
+ else {
578
+ // Fallback: no design docs produced, use direct planner
579
+ console.log(chalk.yellow(`\n No design docs produced — falling back to direct planning\n`));
580
+ const waveBudget = Math.min(50, Math.max(concurrency, Math.ceil(((budget ?? 10) - thinkingUsed) * 0.5)));
581
+ tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, makeProgressLog());
582
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
583
+ }
584
+ }
585
+ else {
586
+ // Small budget: direct planning (no thinking wave)
587
+ const waveBudget = flex ? Math.min(50, Math.max(concurrency, Math.ceil((budget ?? 10) * 0.5))) : budget;
588
+ const flexNote = flex
589
+ ? `This is wave 1 of an adaptive multi-wave run (total budget: ${budget}). Plan the highest-impact foundational work first. Future waves will iterate, polish, and expand based on what's learned.`
590
+ : undefined;
591
+ console.log(chalk.cyan(`\n ◆ Planning${flex ? " wave 1" : ""}...\n`));
592
+ tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, makeProgressLog(), flexNote);
593
+ const flexHint = flex ? chalk.dim(` (wave 1, ${(budget ?? 10) - tasks.length} remaining)`) : "";
594
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}${flexHint}\n\n`);
595
+ }
475
596
  }
476
597
  catch (err) {
477
598
  planRestore();
@@ -486,7 +607,7 @@ async function main() {
486
607
  let reviewing = true;
487
608
  while (reviewing) {
488
609
  showPlan(tasks);
489
- const action = await selectKey(`${tasks.length} tasks, concurrency ${concurrency}.`, [
610
+ const action = await selectKey(`${chalk.white(`${tasks.length} tasks`)} ${chalk.dim(`· ${concurrency} concurrent`)}`, [
490
611
  { key: "r", desc: "un" },
491
612
  { key: "e", desc: "dit" },
492
613
  { key: "c", desc: "hat" },
@@ -497,16 +618,14 @@ async function main() {
497
618
  reviewing = false;
498
619
  break;
499
620
  case "e": {
500
- const feedback = await ask(chalk.bold("\n What should change?\n > "));
621
+ const feedback = await ask(`\n ${chalk.bold("What should change?")}\n ${chalk.cyan(">")} `);
501
622
  if (!feedback)
502
623
  break;
503
- console.log(chalk.magenta("\n Re-planning...\n"));
624
+ console.log(chalk.cyan("\n Re-planning...\n"));
504
625
  process.stdout.write("\x1B[?25l");
505
626
  try {
506
- tasks = await refinePlan(objective, tasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, (text) => {
507
- process.stdout.write(`\x1B[2K\r ${chalk.dim(text)}`);
508
- });
509
- process.stdout.write(`\x1B[2K\r ${chalk.green(`${tasks.length} tasks`)}\n\n`);
627
+ tasks = await refinePlan(objective, tasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, makeProgressLog());
628
+ process.stdout.write(`\x1B[2K\r ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
510
629
  }
511
630
  catch (err) {
512
631
  console.error(chalk.red(`\n Re-planning failed: ${err.message}\n`));
@@ -515,7 +634,7 @@ async function main() {
515
634
  break;
516
635
  }
517
636
  case "c": {
518
- const question = await ask(chalk.bold("\n Ask about the plan:\n > "));
637
+ const question = await ask(`\n ${chalk.bold("Ask about the plan:")}\n ${chalk.cyan(">")} `);
519
638
  if (!question)
520
639
  break;
521
640
  process.stdout.write("\x1B[?25l");
@@ -548,22 +667,21 @@ async function main() {
548
667
  process.exit(1);
549
668
  }
550
669
  if (dryRun) {
551
- console.log(chalk.bold(" Tasks:"));
552
670
  showPlan(tasks);
671
+ console.log(chalk.dim(" --dry-run: exiting without running\n"));
553
672
  process.exit(0);
554
673
  }
555
674
  // ── Run (wave loop) ──
556
675
  process.stdout.write("\x1B[?25l");
557
676
  const restore = () => process.stdout.write("\x1B[?25h\n");
558
- const agentTimeoutMs = cliFlags.timeout ? parseFloat(cliFlags.timeout) * 1000 : undefined;
559
677
  const runStartedAt = Date.now();
560
678
  // Wave-loop state
561
679
  let currentSwarm;
562
- let remaining = budget ?? tasks.length;
680
+ let remaining = (budget ?? tasks.length) - thinkingUsed;
563
681
  let currentTasks = tasks;
564
682
  let waveNum = 0;
565
683
  const waveHistory = [];
566
- let accCost = 0, accIn = 0, accOut = 0, accCompleted = 0, accFailed = 0, accTools = 0;
684
+ let accCost = thinkingCost, accIn = thinkingIn, accOut = thinkingOut, accCompleted = 0, accFailed = 0, accTools = thinkingTools;
567
685
  let lastCapped = false, lastAborted = false;
568
686
  // For flex + branch strategy: create one target branch, waves merge via yolo into it
569
687
  let runBranch;
@@ -601,7 +719,7 @@ async function main() {
601
719
  if (currentTasks.length > remaining)
602
720
  currentTasks = currentTasks.slice(0, remaining);
603
721
  if (flex) {
604
- console.log(chalk.magenta(`\n \u2500\u2500 Wave ${waveNum + 1} (${currentTasks.length} tasks, ${remaining} remaining) \u2500\u2500\n`));
722
+ console.log(chalk.cyan(`\n Wave ${waveNum + 1}`) + chalk.dim(` · ${currentTasks.length} tasks · ${remaining} remaining\n`));
605
723
  }
606
724
  const swarm = new Swarm({
607
725
  tasks: currentTasks, concurrency, cwd, model: workerModel, permissionMode, allowedTools,
@@ -647,12 +765,10 @@ async function main() {
647
765
  if (!flex || remaining <= 0 || swarm.aborted || swarm.cappedOut)
648
766
  break;
649
767
  // ── Steer next wave ──
650
- console.log(chalk.magenta("\n Steering...\n"));
768
+ console.log(chalk.cyan("\n Steering...\n"));
651
769
  process.stdout.write("\x1B[?25l");
652
770
  try {
653
- const steer = await steerWave(objective, waveHistory, remaining, cwd, plannerModel, workerModel, permissionMode, concurrency, (text) => {
654
- process.stdout.write(`\x1B[2K\r ${chalk.dim(text)}`);
655
- });
771
+ const steer = await steerWave(objective, waveHistory, remaining, cwd, plannerModel, workerModel, permissionMode, concurrency, makeProgressLog(), designContext);
656
772
  process.stdout.write(`\x1B[2K\r`);
657
773
  process.stdout.write("\x1B[?25h");
658
774
  if (steer.done) {
@@ -669,6 +785,11 @@ async function main() {
669
785
  break;
670
786
  }
671
787
  }
788
+ // Clean up design docs
789
+ try {
790
+ rmSync(join(cwd, ".claude-overnight", "designs"), { recursive: true, force: true });
791
+ }
792
+ catch { }
672
793
  // Switch back if we created a run branch
673
794
  if (runBranch && originalRef) {
674
795
  try {
@@ -682,9 +803,10 @@ async function main() {
682
803
  const summaryText = accFailed > 0
683
804
  ? chalk.yellow(`${accCompleted} done, ${accFailed} failed`) + cappedNote
684
805
  : chalk.green(`${accCompleted} done`) + cappedNote;
685
- const costText = accCost > 0 ? ` ($${accCost.toFixed(3)})` : "";
686
- const wavePart = waves > 1 ? `${waves} waves, ` : "";
687
- console.log(`\n ${chalk.bold("Complete:")} ${wavePart}${summaryText}${chalk.dim(costText)}`);
806
+ const costText = accCost > 0 ? chalk.dim(` · $${accCost.toFixed(3)}`) : "";
807
+ const wavePart = waves > 1 ? chalk.dim(`${waves} waves · `) : "";
808
+ console.log(chalk.dim(`\n ${"─".repeat(36)}`));
809
+ console.log(` ${chalk.green("✓")} ${chalk.bold("Complete")} ${wavePart}${summaryText}${costText}`);
688
810
  if (accFailed > 0 && waves === 1) {
689
811
  const failedAgents = currentSwarm?.agents.filter((a) => a.status === "error") ?? [];
690
812
  if (failedAgents.length > 0) {
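The interactive flow in the hunks above ends by printing a one-line config summary inside a box. The sketch below is a pure version of that step for illustration: `summaryBox` is a hypothetical name, the chalk colouring is left out, and two spaces of padding on each side are assumed (which matches the `innerLen + 4` rule width in the diff).

```javascript
// Hypothetical pure version of the summary box: returns the three lines
// instead of printing them, and omits the chalk colouring.
function summaryBox(parts) {
  const inner = parts.join(" · ");
  // Two spaces of padding per side (assumed), hence the + 4.
  const rule = "─".repeat(inner.length + 4);
  return [`╭${rule}╮`, `│  ${inner}  │`, `╰${rule}╯`];
}

console.log(summaryBox(["sonnet → opus", "budget 50", "5×", "flex", "cap 90%"]).join("\n"));
```

`String.prototype.length` counts UTF-16 code units, which equals the column count for the characters used here but would not for emoji or other wide glyphs.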
package/dist/planner.d.ts CHANGED
@@ -16,5 +16,8 @@ export interface SteerResult {
16
16
  export type ModelTier = "opus" | "sonnet" | "haiku" | "unknown";
17
17
  export declare function detectModelTier(model: string): ModelTier;
18
18
  export declare function planTasks(objective: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number | undefined, concurrency: number, onLog: (text: string) => void, flexNote?: string): Promise<Task[]>;
19
+ export declare function identifyThemes(objective: string, count: number, model: string, permissionMode: PermMode): Promise<string[]>;
20
+ export declare function buildThinkingTasks(objective: string, themes: string[], designDir: string, plannerModel: string): Task[];
21
+ export declare function orchestrate(objective: string, designDocs: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number, concurrency: number, onLog: (text: string) => void, flexNote?: string): Promise<Task[]>;
19
22
  export declare function refinePlan(objective: string, previousTasks: Task[], feedback: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number | undefined, concurrency: number, onLog: (text: string) => void): Promise<Task[]>;
20
- export declare function steerWave(objective: string, history: WaveSummary[], remainingBudget: number, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, concurrency: number, onLog: (text: string) => void): Promise<SteerResult>;
23
+ export declare function steerWave(objective: string, history: WaveSummary[], remainingBudget: number, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, concurrency: number, onLog: (text: string) => void, designContext?: string): Promise<SteerResult>;
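`identifyThemes` is declared to resolve to `string[]`. Judging from the `planner.js` hunk in this diff, a model reply that cannot be parsed is replaced by a fixed list of generic research angles, cycled to the requested count. The sketch below isolates that fallback. `pickThemes` is a hypothetical name, `JSON.parse` stands in for the package's own `attemptJsonParse`, and the fallback list is shortened.

```javascript
// Hypothetical stand-in for the fallback path of identifyThemes.
const FALLBACK = [
  "architecture, patterns, and conventions",
  "data models, state, and persistence",
  "testing, quality, and error handling",
];

function pickThemes(replyText, count) {
  let parsed;
  try {
    parsed = JSON.parse(replyText); // assumption: plain JSON.parse instead of attemptJsonParse
  } catch {
    parsed = undefined;
  }
  // Use the model's themes when present, trimmed to the requested count.
  if (Array.isArray(parsed?.themes)) return parsed.themes.slice(0, count);
  // Otherwise cycle through the generic angles until `count` is reached.
  return Array.from({ length: count }, (_, i) => FALLBACK[i % FALLBACK.length]);
}

console.log(pickThemes('{"themes": ["auth flow", "test coverage", "docs"]}', 2));
console.log(pickThemes("sorry, I cannot help with that", 5));
```

Note that `slice(0, count)` never pads: if the model returns fewer themes than requested, fewer thinking agents would be launched.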
package/dist/planner.js CHANGED
@@ -162,7 +162,7 @@ async function runPlannerQuery(prompt, opts, onLog) {
162
162
  includePartialMessages: true,
163
163
  },
164
164
  });
165
- // Progress ticker — show elapsed time so it doesn't look frozen
165
+ // Progress ticker — fast updates with compact format
166
166
  let lastLogText = "";
167
167
  let toolCount = 0;
168
168
  const ticker = setInterval(() => {
@@ -170,9 +170,10 @@ async function runPlannerQuery(prompt, opts, onLog) {
170
170
  const m = Math.floor(elapsed / 60);
171
171
  const s = elapsed % 60;
172
172
  const timeStr = m > 0 ? `${m}m ${s}s` : `${s}s`;
173
- const extra = lastLogText ? ` ${lastLogText}` : "";
174
- onLog(`${timeStr} elapsed, ${toolCount} tool calls${extra}`);
175
- }, 3000);
173
+ const toolStr = toolCount > 0 ? ` · ${toolCount} tools` : "";
174
+ const extra = lastLogText ? ` · ${lastLogText}` : "";
175
+ onLog(`${timeStr}${toolStr}${extra}`);
176
+ }, 500);
176
177
  let lastActivity = Date.now();
177
178
  let timer;
178
179
  const watchdog = new Promise((_, reject) => {
@@ -201,8 +202,8 @@ async function runPlannerQuery(prompt, opts, onLog) {
  if (ev?.type === "content_block_delta") {
  const delta = ev.delta;
  if (delta?.type === "text_delta" && delta.text) {
- const snippet = delta.text.trim();
- if (snippet.length > 3) {
+ const snippet = delta.text.trim().replace(/[{}"\\,[\]]+/g, " ").replace(/\s+/g, " ").trim();
+ if (snippet.length > 5) {
  lastLogText = snippet.slice(0, 60);
  }
  }
@@ -310,6 +311,108 @@ export async function planTasks(objective, cwd, plannerModel, workerModel, permi
  onLog(`${tasks.length} tasks`);
  return tasks;
  }
+ // ── Thinking wave ──
+ export async function identifyThemes(objective, count, model, permissionMode) {
+ let resultText = "";
+ for await (const msg of query({
+ prompt: `Split this objective into exactly ${count} independent research angles for architects exploring a codebase. Each angle should cover a distinct aspect.
+
+ Objective: ${objective}
+
+ Return ONLY a JSON object: {"themes": ["angle description", ...]}`,
+ options: {
+ model,
+ permissionMode,
+ ...(permissionMode === "bypassPermissions" && { allowDangerouslySkipPermissions: true }),
+ persistSession: false,
+ },
+ })) {
+ if (msg.type === "result" && msg.subtype === "success")
+ resultText = msg.result || "";
+ }
+ const parsed = attemptJsonParse(resultText);
+ if (parsed?.themes && Array.isArray(parsed.themes))
+ return parsed.themes.slice(0, count);
+ const fallback = ["architecture, patterns, and conventions", "data models, state, and persistence", "user-facing flows, components, and UX", "APIs, integrations, and services", "testing, quality, and error handling", "security, performance, and infrastructure", "build, deployment, and configuration", "documentation and developer experience"];
+ return Array.from({ length: count }, (_, i) => fallback[i % fallback.length]);
+ }
+ export function buildThinkingTasks(objective, themes, designDir, plannerModel) {
+ return themes.map((theme, i) => ({
+ id: `think-${i}`,
+ prompt: `You are a senior architect exploring a codebase to design a solution.
+
+ OVERALL OBJECTIVE: ${objective}
+
+ YOUR FOCUS: ${theme}
+
+ Explore the codebase thoroughly using Read, Glob, and Grep. Then write a design document to ${designDir}/focus-${i}.md with these sections:
+
+ ## Findings
+ Key files, patterns, and architecture you discovered. Cite specific file paths and function names.
+
+ ## Proposed Work Items
+ For each item:
+ - **What**: What to build or change
+ - **Where**: Specific file paths
+ - **Why**: Why this matters
+ - **Risk**: Conflicts or complications
+
+ ## Key Files
+ Relevant files with one-line descriptions.
+
+ Be thorough — your findings drive the execution plan.`,
+ model: plannerModel,
+ }));
+ }
+ export async function orchestrate(objective, designDocs, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, onLog, flexNote) {
+ const capability = modelCapabilityBlock(workerModel);
+ const flexLine = flexNote ? `\n\n${flexNote}` : "";
+ const prompt = `You are a tech lead planning a sprint based on your team's codebase research.
+
+ Objective: ${objective}
+
+ Your architects explored the codebase and found:
+
+ ${designDocs}
+
+ AGENT CAPABILITY: ${capability}
+
+ Create exactly ~${budget} concrete execution tasks based on these findings.
+
+ Requirements:
+ - Each task is actionable by a single agent session
+ - Each task MUST be independent — no dependencies between tasks
+ - ${concurrency} agents run in parallel — tasks must touch DIFFERENT files
+ - Trust the research — don't tell agents to re-explore what's documented
+ - Reference specific files and patterns from the findings
+ - Priority order: foundational first, polish last${flexLine}
+
+ Respond with ONLY a JSON object (no markdown fences):
+ {"tasks": [{"prompt": "..."}]}`;
+ onLog("Synthesizing...");
+ const resultText = await runPlannerQuery(prompt, { cwd, model: plannerModel, permissionMode }, onLog);
+ const parsed = await extractTaskJson(resultText, async () => {
+ onLog("Retrying...");
+ let retryText = "";
+ for await (const msg of query({
+ prompt: `Output ONLY a JSON object:\n{"tasks":[{"prompt":"..."}]}`,
+ options: { cwd, model: plannerModel, permissionMode, ...(permissionMode === "bypassPermissions" && { allowDangerouslySkipPermissions: true }), persistSession: false },
+ })) {
+ if (msg.type === "result" && msg.subtype === "success")
+ retryText = msg.result || "";
+ }
+ return retryText;
+ });
+ let tasks = (parsed.tasks || []).map((t, i) => ({
+ id: String(i),
+ prompt: typeof t === "string" ? t : t.prompt,
+ }));
+ tasks = postProcess(tasks, budget, onLog);
+ if (tasks.length === 0)
+ throw new Error("Orchestration generated 0 tasks");
+ onLog(`${tasks.length} tasks`);
+ return tasks;
+ }
  export async function refinePlan(objective, previousTasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, onLog) {
  onLog("Refining plan...");
  const prev = previousTasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n");
@@ -421,7 +524,7 @@ async function extractTaskJson(raw, retry) {
  throw new Error("Planner did not return valid task JSON after retry");
  }
  // ── Wave steering ──
- export async function steerWave(objective, history, remainingBudget, cwd, plannerModel, workerModel, permissionMode, concurrency, onLog) {
+ export async function steerWave(objective, history, remainingBudget, cwd, plannerModel, workerModel, permissionMode, concurrency, onLog, designContext) {
  const capability = modelCapabilityBlock(workerModel);
  const historyText = history.map(w => {
  const lines = w.tasks.map(t => {
@@ -437,7 +540,7 @@ Objective: ${objective}
 
  Work completed so far:
  ${historyText}
-
+ ${designContext ? `\nOriginal architectural research:\n${designContext}\n` : ""}
  Remaining budget: ${remainingBudget} agent sessions. ${concurrency} agents run in parallel — tasks must touch DIFFERENT files.
  ${capability}
 
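The ticker and streaming-snippet changes to `runPlannerQuery` in the planner.js diff above can be tried in isolation. The sketch below is not code from the package: `formatTick` and `cleanSnippet` are hypothetical helper names that wrap the same expressions the diff adds, which makes the shape of the new compact log line easier to see.

```javascript
// Hypothetical helpers (not exported by claude-overnight) that wrap the
// expressions added to runPlannerQuery in the 0.5.0 diff.

// Build the compact ticker line from elapsed seconds, tool count, and the last snippet.
function formatTick(elapsed, toolCount, lastLogText) {
  const m = Math.floor(elapsed / 60);
  const s = elapsed % 60;
  const timeStr = m > 0 ? `${m}m ${s}s` : `${s}s`;
  // The tool count and snippet are omitted entirely when there is nothing to show.
  const toolStr = toolCount > 0 ? ` · ${toolCount} tools` : "";
  const extra = lastLogText ? ` · ${lastLogText}` : "";
  return `${timeStr}${toolStr}${extra}`;
}

// Strip JSON punctuation from a streamed text delta and collapse whitespace.
// Returns at most 60 characters, or null when the result is too short to display.
function cleanSnippet(text) {
  const snippet = text.trim().replace(/[{}"\\,[\]]+/g, " ").replace(/\s+/g, " ").trim();
  return snippet.length > 5 ? snippet.slice(0, 60) : null;
}

const snippet = cleanSnippet('{"tasks": [{"prompt": "Add tests"}]}');
console.log(formatTick(75, 3, snippet));
// prints "1m 15s · 3 tools · tasks : prompt : Add tests"
```

The character class does not include colons, so key/value separators survive the cleanup. The 500 ms interval from the diff is not modelled here; the helpers cover only the string formatting.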
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "claude-overnight",
- "version": "0.3.2",
+ "version": "0.5.0",
  "description": "Fire off Claude agents, come back days later to shipped work. Maximizes every token in your plan.",
  "type": "module",
  "bin": {