npm - claude-overnight - Versions diffs - 0.3.2 → 0.5.1 - Mend

claude-overnight 0.3.2 → 0.5.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6) hide show

package/README.md CHANGED Viewed

@@ -2,7 +2,7 @@
 Fire off Claude agents, come back to shipped work.
-Describe what to build. Set a budget — 10 agents, 100, 1000. A planner agent analyzes your codebase, breaks the objective into that many independent tasks, and launches them all. Each agent runs in its own git worktree with full tooling (Read, Edit, Bash, Grep — everything). Rate limits? It waits. Windows reset? It resumes. It doesn't stop until every task is done.
+Describe what to build. Set a budget — 10 agents, 100, 1000. A planner agent analyzes your codebase, breaks the objective into independent tasks, and launches them all. Each agent runs in its own git worktree with full tooling (Read, Edit, Bash, Grep — everything). Rate limits? It waits. Windows reset? It resumes. It doesn't stop until every task is done.
 ## Install
@@ -20,7 +20,32 @@ Requires Node.js >= 20 and Claude authentication (OAuth via `claude` CLI, or `AN
 claude-overnight
 ```
-Describe your objective, set a budget, pick a worker model, set a usage limit. The planner generates tasks — review, edit, or chat about them, then run.
+A guided flow walks you through each step:
+```
+🌙  claude-overnight
+────────────────────────────────────
+① What should the agents do?
+  > refactor auth, add tests, update docs
+② Budget [10]: 50
+③ Worker model:
+  ● Sonnet — Sonnet 4.6 · Best for everyday tasks
+  ○ Opus — Opus 4.6 · Most capable
+  ○ Haiku — Haiku 4.5 · Fastest
+④ Usage:
+  ● Unlimited · full capacity, wait through rate limits
+  ○ 90% · leave 10% for other work
+╭────────────────────────────────────╮
+│  sonnet · budget 50 · 5× · flex   │
+╰────────────────────────────────────╯
+```
+For large budgets, the planner identifies research themes — review them, then press Run. Everything after that is fully autonomous: thinking agents explore, the orchestrator synthesizes tasks, execution waves run, and steering adapts between waves. No further interaction needed — go to sleep.
 ### Task file
@@ -38,6 +63,25 @@ claude-overnight "fix auth bug in src/auth.ts" "add tests for user model"
 The planner always runs on the best available model (Opus) regardless of which model you pick for workers. This ensures high-quality task decomposition even when workers use a cheaper model.
+### Thinking wave
+For large budgets (`budget > concurrency * 3`), the planner doesn't try to generate hundreds of tasks from scratch. Instead, it launches a **thinking wave** — a team of architect agents that explore your codebase in parallel before any code is written.
+```
+⠋ identifying themes...          → splits objective into N angles (< 30s)
+✓ 10 themes                      → review themes, press Run, walk away
+◆ Thinking: 10 agents exploring  → each explores from its angle, writes a design doc
+◆ Orchestrating plan...          → reads all design docs, synthesizes execution tasks
+◆ Wave 1 · 50 tasks              → fully autonomous from here
+◆ Steering...                    → adapts between waves, retries on rate limits
+```
+The review prompt appears right after theme identification — the last thing requiring your presence. After you press Run, the thinking wave, orchestration, execution, and steering all run autonomously. Rate-limited? The planner waits and retries. Go to sleep.
+The number of thinking agents scales with budget: 5 for budget=50, 10 for budget=2000+. Each agent explores the codebase from a different angle and writes a structured design document. The orchestrator then reads all design docs and produces grounded execution tasks referencing real files and patterns.
+For small budgets (≤ `concurrency * 3`), the planner skips the thinking wave and generates tasks directly — fast and efficient for focused work.
 ### Model-aware task design
 The planner calibrates task ambition based on your worker model:
@@ -56,20 +100,20 @@ The budget also shapes task granularity:
 **Medium budget (16-50)**: Autonomous missions. "Design and implement the complete favorites system: DB schema, API routes, client hooks, error handling."
-**Large budget (50+)**: Full workstream decomposition. Architecture, features, testing, security, UX polish, performance — everything a team would cover. Each task is a substantial work session.
+**Large budget (50+)**: Thinking wave + orchestration. Architects explore, then execution tasks are synthesized from their findings. Each task is a substantial work session grounded in real codebase analysis.
-A budget of 200 is not 200 micro-edits. It's 200 senior-engineer work sessions running in parallel.
+A budget of 200 is not 200 micro-edits. It's ~5 architects + ~195 senior-engineer work sessions, planned in waves. A budget of 2000 gets 10 architects.
 ## Usage limits
-Control how much of your plan capacity the run consumes. In interactive mode, you'll be asked:
+Control how much of your plan capacity the run consumes:
 ```
-Usage limit:
-→ Unlimited — use full capacity, wait through rate limits
-  90%       — leave 10% for other work
-  75%       — conservative, plenty of headroom
-  50%       — use half, keep the rest
+④ Usage:
+  ● Unlimited · full capacity, wait through rate limits
+  ○ 90% · leave 10% for other work
+  ○ 75% · conservative, plenty of headroom
+  ○ 50% · use half, keep the rest
 ```
 When utilization hits your cap, the swarm stops dispatching new tasks and lets active agents finish gracefully. This way you can run a big overnight job and still have capacity left for manual Claude usage.

package/dist/index.js CHANGED Viewed

@@ -1,5 +1,5 @@
 #!/usr/bin/env node
-import { readFileSync, existsSync } from "fs";
+import { readFileSync, existsSync, mkdirSync, readdirSync, rmSync } from "fs";
 import { resolve, dirname, join } from "path";
 import { fileURLToPath } from "url";
 import { execSync } from "child_process";
@@ -7,7 +7,7 @@ import { createInterface } from "readline";
 import chalk from "chalk";
 import { query } from "@anthropic-ai/claude-agent-sdk";
 import { Swarm } from "./swarm.js";
-import { planTasks, refinePlan, detectModelTier, steerWave } from "./planner.js";
+import { planTasks, refinePlan, detectModelTier, steerWave, identifyThemes, buildThinkingTasks, orchestrate } from "./planner.js";
 import { startRenderLoop, renderSummary } from "./ui.js";
 // ── CLI flag parsing ──
 function parseCliFlags(argv) {
@@ -86,10 +86,11 @@ async function select(label, items, defaultIdx = 0) {
         if (!first)
             stdout.write(`\x1B[${items.length}A`);
         for (let i = 0; i < items.length; i++) {
-            const arrow = i === idx ? chalk.green("  → ") : "    ";
-            const name = i === idx ? chalk.green(items[i].name) : chalk.dim(items[i].name);
-            const hint = items[i].hint ? chalk.dim(` — ${items[i].hint}`) : "";
-            stdout.write(`\x1B[2K${arrow}${name}${hint}\n`);
+            const sel = i === idx;
+            const radio = sel ? chalk.cyan("  ● ") : chalk.dim("  ○ ");
+            const name = sel ? chalk.white(items[i].name) : chalk.dim(items[i].name);
+            const hint = items[i].hint ? chalk.dim(` · ${items[i].hint}`) : "";
+            stdout.write(`\x1B[2K${radio}${name}${hint}\n`);
         }
     };
     stdout.write(`\n  ${chalk.bold(label)}\n`);
@@ -134,7 +135,8 @@ async function select(label, items, defaultIdx = 0) {
 async function selectKey(label, options) {
     const { stdin, stdout } = process;
     const keys = options.map((o) => o.key.toLowerCase());
-    stdout.write(`\n  ${label}  ${options.map((o) => `[${chalk.bold(o.key.toUpperCase())}]${chalk.dim(o.desc)}`).join("  ")}\n  `);
+    const optStr = options.map((o) => `${chalk.cyan.bold(o.key.toUpperCase())}${chalk.dim(o.desc)}`).join(chalk.dim("  │  "));
+    stdout.write(`\n  ${label}\n  ${optStr}\n  `);
     return new Promise((resolve) => {
         stdin.setRawMode(true);
         stdin.resume();
@@ -259,10 +261,37 @@ function validateGitRepo(cwd) {
 }
 // ── Show plan ──
 function showPlan(tasks) {
+    const w = Math.max((process.stdout.columns ?? 80) - 6, 40);
+    const ruleLen = Math.min(w, 70);
+    console.log(chalk.dim(`  ─── ${tasks.length} tasks ${"─".repeat(Math.max(0, ruleLen - String(tasks.length).length - 10))}`));
     for (const t of tasks) {
-        console.log(chalk.dim(`    ${Number(t.id) + 1}. ${t.prompt.slice(0, 90)}`));
+        const num = chalk.dim(String(Number(t.id) + 1).padStart(4) + ".");
+        console.log(`${num} ${t.prompt.slice(0, w)}`);
     }
-    console.log("");
+    console.log(chalk.dim(`  ${"─".repeat(ruleLen)}\n`));
+}
+function readDesignDocs(dir) {
+    try {
+        const files = readdirSync(dir).filter(f => f.endsWith(".md")).sort();
+        return files.map(f => {
+            const content = readFileSync(join(dir, f), "utf-8");
+            return `### ${f}\n${content}`;
+        }).join("\n\n");
+    }
+    catch {
+        return "";
+    }
+}
+const BRAILLE = ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"];
+function makeProgressLog() {
+    let frame = 0;
+    return (text) => {
+        const spin = chalk.cyan(BRAILLE[frame++ % BRAILLE.length]);
+        const maxW = (process.stdout.columns ?? 80) - 6;
+        const clean = text.replace(/\n/g, " ");
+        const line = clean.length > maxW ? clean.slice(0, maxW - 1) + "\u2026" : clean;
+        process.stdout.write(`\x1B[2K\r  ${spin} ${chalk.dim(line)}`);
+    };
 }
 // ── Main ──
 async function main() {
@@ -275,25 +304,26 @@ async function main() {
     }
     if (argv.includes("-h") || argv.includes("--help")) {
         console.log(`
-  ${chalk.bold("claude-overnight")} — fire off Claude agents, come back to shipped work
+  ${chalk.bold("🌙  claude-overnight")} ${chalk.dim("— fire off Claude agents, come back to shipped work")}
+  ${chalk.dim("─".repeat(60))}
-  ${chalk.dim("Usage:")}
-    claude-overnight                          ${chalk.dim("interactive — describe what to do, review plan, run")}
-    claude-overnight tasks.json               ${chalk.dim("run tasks defined in a JSON file")}
-    claude-overnight "fix auth" "add tests"   ${chalk.dim("run inline tasks in parallel")}
+  ${chalk.cyan("Usage")}
+    claude-overnight                          ${chalk.dim("interactive mode")}
+    claude-overnight tasks.json               ${chalk.dim("task file mode")}
+    claude-overnight "fix auth" "add tests"   ${chalk.dim("inline tasks")}
-  ${chalk.dim("Flags:")}
+  ${chalk.cyan("Flags")}
     -h, --help             Show this help
     -v, --version          Print version
     --dry-run              Show planned tasks without running them
-    --budget=N             Target number of agent runs ${chalk.dim("(planner aims for this many tasks)")}
+    --budget=N             Target number of agent runs ${chalk.dim("(default: 10)")}
     --concurrency=N        Max parallel agents ${chalk.dim("(default: 5)")}
     --model=NAME           Worker model override ${chalk.dim("(planner always uses best available)")}
     --usage-cap=N          Stop at N% utilization ${chalk.dim("(e.g. 90 to save 10% for other work)")}
     --timeout=SECONDS      Agent inactivity timeout ${chalk.dim("(default: 300s, kills only silent agents)")}
     --no-flex              Disable adaptive multi-wave planning ${chalk.dim("(run all tasks in one shot)")}
-  ${chalk.dim("Non-interactive defaults (task file / inline / piped):")}
+  ${chalk.cyan("Defaults")} ${chalk.dim("(non-interactive)")}
     model: first available    concurrency: 5    worktrees: auto    perms: auto
     `);
         process.exit(0);
@@ -344,7 +374,8 @@ async function main() {
         }
     }
     // ── Determine mode ──
-    console.log(chalk.bold("\n  \uD83C\uDF19 claude-overnight\n"));
+    console.log(`\n  ${chalk.bold("🌙  claude-overnight")}`);
+    console.log(chalk.dim(`  ${"─".repeat(36)}`));
     const noTTY = !process.stdin.isTTY;
     const nonInteractive = noTTY || fileCfg !== undefined || tasks.length > 0;
     const cwd = fileCfg?.cwd ?? process.cwd();
@@ -363,55 +394,80 @@ async function main() {
     let objective = fileCfg?.objective;
     let usageCap;
     if (!nonInteractive) {
-        console.log(chalk.dim("  Fire off Claude agents, come back to shipped work.\n"));
-        // 1. Objective first — it's the whole point
+        // ① Objective
         while (true) {
-            objective = await ask(chalk.bold("  What should the agents do?\n  > "));
+            objective = await ask(`\n  ${chalk.cyan("①")} ${chalk.bold("What should the agents do?")}\n  ${chalk.cyan(">")} `);
             if (!objective) {
                 console.error(chalk.red("\n  No objective provided."));
                 process.exit(1);
             }
             if (objective.split(/\s+/).length >= 5)
                 break;
-            console.log(chalk.yellow('  Be specific, e.g. "refactor the auth module, add tests, and update docs"\n'));
+            console.log(chalk.yellow('  Be specific, e.g. "refactor the auth module, add tests, and update docs"'));
         }
-        // 2. Budget — how many agent runs to spend
-        const budgetAns = await ask(chalk.dim("\n  Agent budget [10]: "));
+        // Start fetching models while user enters budget
+        const modelsPromise = fetchModels();
+        // ② Budget
+        const budgetAns = await ask(`\n  ${chalk.cyan("②")} ${chalk.dim("Budget")} ${chalk.dim("[")}${chalk.white("10")}${chalk.dim("]:")} `);
         budget = parseInt(budgetAns) || 10;
         if (budget < 1) {
             console.error(chalk.red(`  Budget must be a positive number`));
             process.exit(1);
         }
-        // 3. Worker model — planner always uses best available
-        process.stdout.write(chalk.dim("  Fetching models..."));
-        const models = await fetchModels();
-        process.stdout.write(`\x1B[2K\r`);
-        // Pick best model for planner (first = most capable)
+        // ③ Worker model — show spinner if models aren't ready yet
+        let modelFrame = 0;
+        const modelSpinner = setInterval(() => {
+            const spin = chalk.cyan(BRAILLE[modelFrame++ % BRAILLE.length]);
+            process.stdout.write(`\x1B[2K\r  ${spin} ${chalk.dim("loading models...")}`);
+        }, 120);
+        let models;
+        try {
+            models = await modelsPromise;
+        }
+        finally {
+            clearInterval(modelSpinner);
+            process.stdout.write(`\x1B[2K\r`);
+        }
         plannerModel = models[0]?.value || "claude-sonnet-4-6";
         if (models.length > 0) {
-            workerModel = await select("Worker model (planner always uses best available):", models.map((m) => ({
+            workerModel = await select(`${chalk.cyan("③")} Worker model:`, models.map((m) => ({
                 name: m.displayName,
                 value: m.value,
                 hint: m.description,
             })));
         }
         else {
-            const ans = await ask(chalk.dim("  Worker model [claude-sonnet-4-6]: "));
+            const ans = await ask(`  ${chalk.cyan("③")} ${chalk.dim("Worker model [claude-sonnet-4-6]:")} `);
             workerModel = ans || "claude-sonnet-4-6";
         }
-        if (workerModel !== plannerModel) {
-            const tier = detectModelTier(workerModel);
-            console.log(chalk.dim(`\n  Planner: ${plannerModel} · Workers: ${workerModel} (${tier})`));
-        }
-        // 4. Usage cap — how much of your plan to use
-        usageCap = await select("Usage limit:", [
-            { name: "Unlimited", value: undefined, hint: "use full capacity, wait through rate limits" },
+        // ④ Usage
+        usageCap = await select(`${chalk.cyan("④")} Usage:`, [
+            { name: "Unlimited", value: undefined, hint: "full capacity, wait through rate limits" },
             { name: "90%", value: 0.9, hint: "leave 10% for other work" },
             { name: "75%", value: 0.75, hint: "conservative, plenty of headroom" },
             { name: "50%", value: 0.5, hint: "use half, keep the rest" },
         ]);
-        // Concurrency defaults based on budget
         concurrency = Math.min(5, budget);
+        // Config summary box
+        const parts = [];
+        if (workerModel !== plannerModel) {
+            const tier = detectModelTier(workerModel);
+            parts.push(`${tier} → ${detectModelTier(plannerModel)}`);
+        }
+        else {
+            parts.push(detectModelTier(workerModel));
+        }
+        parts.push(`budget ${budget}`);
+        parts.push(`${concurrency}×`);
+        if (budget > 2)
+            parts.push("flex");
+        if (usageCap != null)
+            parts.push(`cap ${Math.round(usageCap * 100)}%`);
+        const inner = parts.join(chalk.dim(" · "));
+        const innerLen = parts.join(" · ").length;
+        console.log(chalk.dim(`\n  ╭${"─".repeat(innerLen + 4)}╮`));
+        console.log(chalk.dim("  │") + `  ${inner}  ` + chalk.dim("│"));
+        console.log(chalk.dim(`  ╰${"─".repeat(innerLen + 4)}╯`));
     }
     else {
         // Non-interactive: resolve config from file/flags/defaults
@@ -451,6 +507,10 @@ async function main() {
     }
     // ── Flex mode: adaptive multi-wave planning ──
     const flex = !argv.includes("--no-flex") && (fileCfg?.flexiblePlan ?? objective != null) && objective != null && (budget ?? 10) > 2;
+    const agentTimeoutMs = cliFlags.timeout ? parseFloat(cliFlags.timeout) * 1000 : undefined;
+    let thinkingUsed = 0;
+    let thinkingCost = 0, thinkingIn = 0, thinkingOut = 0, thinkingTools = 0;
+    let designContext;
     // ── Plan phase (interactive: review loop, non-interactive: auto-plan or skip) ──
     const needsPlan = tasks.length === 0;
     if (needsPlan) {
@@ -458,20 +518,178 @@ async function main() {
             console.error(chalk.red("  No tasks provided and stdin is not a TTY. Provide tasks via args or a .json file."));
             process.exit(1);
         }
-        // In flex mode, plan ~50% of budget for wave 1, leaving room for steering
-        const waveBudget = flex ? Math.max(concurrency, Math.ceil((budget ?? 10) * 0.5)) : budget;
-        const flexNote = flex
-            ? `This is wave 1 of an adaptive multi-wave run (total budget: ${budget}). Plan the highest-impact foundational work first. Future waves will iterate, polish, and expand based on what's learned.`
-            : undefined;
         process.stdout.write("\x1B[?25l");
         const planRestore = () => process.stdout.write("\x1B[?25h");
-        console.log(chalk.magenta(`\n  Planning${flex ? " wave 1" : ""}...\n`));
+        const useThinking = flex && (budget ?? 10) > concurrency * 3;
+        const thinkingCount = useThinking ? Math.min(Math.max(concurrency, Math.ceil((budget ?? 10) * 0.005)), 10) : 0;
+        const designDir = join(cwd, ".claude-overnight", "designs");
         try {
-            tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, (text) => {
-                process.stdout.write(`\x1B[2K\r  ${chalk.dim(text)}`);
-            }, flexNote);
-            const flexHint = flex ? chalk.dim(` (wave 1, ${(budget ?? 10) - tasks.length} remaining)`) : "";
-            process.stdout.write(`\x1B[2K\r  ${chalk.green(`${tasks.length} tasks`)}${flexHint}\n\n`);
+            if (useThinking) {
+                // Phase 1: Quick theme identification → review → then autonomous
+                let themeFrame = 0;
+                const themeSpinner = setInterval(() => {
+                    const spin = chalk.cyan(BRAILLE[themeFrame++ % BRAILLE.length]);
+                    process.stdout.write(`\x1B[2K\r  ${spin} ${chalk.dim("identifying themes...")}`);
+                }, 120);
+                let themes;
+                try {
+                    themes = await identifyThemes(objective, thinkingCount, plannerModel, permissionMode);
+                }
+                finally {
+                    clearInterval(themeSpinner);
+                }
+                process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${themes.length} themes`)}\n\n`);
+                // Show themes for review — this is the LAST user interaction
+                planRestore();
+                let reviewing = true;
+                while (reviewing) {
+                    for (let i = 0; i < themes.length; i++) {
+                        console.log(chalk.dim(`  ${String(i + 1).padStart(3)}.`) + ` ${themes[i]}`);
+                    }
+                    console.log(chalk.dim(`\n  ${thinkingCount} thinking agents → orchestrate → ${(budget ?? 10) - thinkingCount} execution sessions\n`));
+                    const action = await selectKey(`${chalk.white(`${themes.length} themes`)} ${chalk.dim(`· ${thinkingCount} thinking · ${concurrency} concurrent`)}`, [
+                        { key: "r", desc: "un" },
+                        { key: "e", desc: "dit" },
+                        { key: "q", desc: "uit" },
+                    ]);
+                    switch (action) {
+                        case "r":
+                            reviewing = false;
+                            break;
+                        case "e": {
+                            const feedback = await ask(`\n  ${chalk.bold("What should change?")}\n  ${chalk.cyan(">")} `);
+                            if (!feedback)
+                                break;
+                            process.stdout.write("\x1B[?25l");
+                            try {
+                                themes = await identifyThemes(`${objective}\n\nUser feedback: ${feedback}`, thinkingCount, plannerModel, permissionMode);
+                                process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${themes.length} themes`)}\n\n`);
+                            }
+                            catch (err) {
+                                console.error(chalk.red(`\n  Re-planning failed: ${err.message}\n`));
+                            }
+                            planRestore();
+                            break;
+                        }
+                        case "q":
+                            console.log(chalk.dim("\n  Aborted.\n"));
+                            process.exit(0);
+                    }
+                }
+                // ── From here, fully autonomous — no more user interaction ──
+                process.stdout.write("\x1B[?25l");
+                // Phase 2: Thinking wave
+                mkdirSync(designDir, { recursive: true });
+                const thinkingTasks = buildThinkingTasks(objective, themes, designDir, plannerModel);
+                console.log(chalk.cyan(`\n  ◆ Thinking: ${thinkingTasks.length} agents exploring...\n`));
+                const thinkingSwarm = new Swarm({
+                    tasks: thinkingTasks, concurrency, cwd,
+                    model: plannerModel,
+                    permissionMode,
+                    useWorktrees: false,
+                    mergeStrategy: "yolo",
+                    agentTimeoutMs,
+                    usageCap,
+                });
+                const stopThinkRender = startRenderLoop(thinkingSwarm);
+                try {
+                    await thinkingSwarm.run();
+                }
+                finally {
+                    stopThinkRender();
+                }
+                console.log(renderSummary(thinkingSwarm));
+                thinkingUsed = thinkingSwarm.completed + thinkingSwarm.failed;
+                thinkingCost = thinkingSwarm.totalCostUsd;
+                thinkingIn = thinkingSwarm.totalInputTokens;
+                thinkingOut = thinkingSwarm.totalOutputTokens;
+                thinkingTools = thinkingSwarm.agents.reduce((sum, a) => sum + a.toolCalls, 0);
+                // Phase 3: Orchestrate from design docs
+                designContext = readDesignDocs(designDir);
+                if (designContext) {
+                    const orchBudget = Math.min(50, Math.max(concurrency, Math.ceil(((budget ?? 10) - thinkingUsed) * 0.5)));
+                    const flexNote = `This is wave 1 of an adaptive multi-wave run (total budget: ${(budget ?? 10) - thinkingUsed}). Plan the highest-impact foundational work first. Future waves will iterate based on what's learned.`;
+                    console.log(chalk.cyan(`\n  ◆ Orchestrating plan...\n`));
+                    tasks = await orchestrate(objective, designContext, cwd, plannerModel, workerModel, permissionMode, orchBudget, concurrency, makeProgressLog(), flexNote);
+                    process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
+                }
+                else {
+                    console.log(chalk.yellow(`\n  No design docs — falling back to direct planning\n`));
+                    const waveBudget = Math.min(50, Math.max(concurrency, Math.ceil(((budget ?? 10) - thinkingUsed) * 0.5)));
+                    tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, makeProgressLog());
+                    process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
+                }
+            }
+            else {
+                // Small budget: direct planning → review → run
+                const waveBudget = flex ? Math.min(50, Math.max(concurrency, Math.ceil((budget ?? 10) * 0.5))) : budget;
+                const flexNote = flex
+                    ? `This is wave 1 of an adaptive multi-wave run (total budget: ${budget}). Plan the highest-impact foundational work first. Future waves will iterate, polish, and expand based on what's learned.`
+                    : undefined;
+                console.log(chalk.cyan(`\n  ◆ Planning${flex ? " wave 1" : ""}...\n`));
+                tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, makeProgressLog(), flexNote);
+                const flexHint = flex ? chalk.dim(` · wave 1`) : "";
+                process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${tasks.length} tasks`)}${flexHint}\n\n`);
+                // Review loop for small-budget path
+                planRestore();
+                let reviewing = true;
+                while (reviewing) {
+                    showPlan(tasks);
+                    const action = await selectKey(`${chalk.white(`${tasks.length} tasks`)} ${chalk.dim(`· ${concurrency} concurrent`)}`, [
+                        { key: "r", desc: "un" },
+                        { key: "e", desc: "dit" },
+                        { key: "c", desc: "hat" },
+                        { key: "q", desc: "uit" },
+                    ]);
+                    switch (action) {
+                        case "r":
+                            reviewing = false;
+                            break;
+                        case "e": {
+                            const feedback = await ask(`\n  ${chalk.bold("What should change?")}\n  ${chalk.cyan(">")} `);
+                            if (!feedback)
+                                break;
+                            console.log(chalk.cyan("\n  ◆ Re-planning...\n"));
+                            process.stdout.write("\x1B[?25l");
+                            try {
+                                tasks = await refinePlan(objective, tasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, makeProgressLog());
+                                process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
+                            }
+                            catch (err) {
+                                console.error(chalk.red(`\n  Re-planning failed: ${err.message}\n`));
+                            }
+                            planRestore();
+                            break;
+                        }
+                        case "c": {
+                            const question = await ask(`\n  ${chalk.bold("Ask about the plan:")}\n  ${chalk.cyan(">")} `);
+                            if (!question)
+                                break;
+                            process.stdout.write("\x1B[?25l");
+                            try {
+                                let answer = "";
+                                for await (const msg of query({
+                                    prompt: `You planned these tasks for the objective "${objective}":\n${tasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n")}\n\nUser question: ${question}`,
+                                    options: { cwd, model: plannerModel, permissionMode, persistSession: false },
+                                })) {
+                                    if (msg.type === "result" && msg.subtype === "success")
+                                        answer = msg.result || "";
+                                }
+                                planRestore();
+                                if (answer)
+                                    console.log(chalk.dim(`\n  ${answer.slice(0, 500)}\n`));
+                            }
+                            catch {
+                                planRestore();
+                            }
+                            break;
+                        }
+                        case "q":
+                            console.log(chalk.dim("\n  Aborted.\n"));
+                            process.exit(0);
+                    }
+                }
+            }
         }
         catch (err) {
             planRestore();
@@ -481,89 +699,27 @@ async function main() {
                 console.error(chalk.red(`\n  Planning failed: ${err.message}\n`));
             process.exit(1);
         }
-        // ── Review loop ──
-        planRestore();
-        let reviewing = true;
-        while (reviewing) {
-            showPlan(tasks);
-            const action = await selectKey(`${tasks.length} tasks, concurrency ${concurrency}.`, [
-                { key: "r", desc: "un" },
-                { key: "e", desc: "dit" },
-                { key: "c", desc: "hat" },
-                { key: "q", desc: "uit" },
-            ]);
-            switch (action) {
-                case "r":
-                    reviewing = false;
-                    break;
-                case "e": {
-                    const feedback = await ask(chalk.bold("\n  What should change?\n  > "));
-                    if (!feedback)
-                        break;
-                    console.log(chalk.magenta("\n  Re-planning...\n"));
-                    process.stdout.write("\x1B[?25l");
-                    try {
-                        tasks = await refinePlan(objective, tasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, (text) => {
-                            process.stdout.write(`\x1B[2K\r  ${chalk.dim(text)}`);
-                        });
-                        process.stdout.write(`\x1B[2K\r  ${chalk.green(`${tasks.length} tasks`)}\n\n`);
-                    }
-                    catch (err) {
-                        console.error(chalk.red(`\n  Re-planning failed: ${err.message}\n`));
-                    }
-                    planRestore();
-                    break;
-                }
-                case "c": {
-                    const question = await ask(chalk.bold("\n  Ask about the plan:\n  > "));
-                    if (!question)
-                        break;
-                    process.stdout.write("\x1B[?25l");
-                    try {
-                        let answer = "";
-                        for await (const msg of query({
-                            prompt: `You planned these tasks for the objective "${objective}":\n${tasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n")}\n\nUser question: ${question}`,
-                            options: { cwd, model: plannerModel, permissionMode, persistSession: false },
-                        })) {
-                            if (msg.type === "result" && msg.subtype === "success")
-                                answer = msg.result || "";
-                        }
-                        planRestore();
-                        if (answer)
-                            console.log(chalk.dim(`\n  ${answer.slice(0, 500)}\n`));
-                    }
-                    catch {
-                        planRestore();
-                    }
-                    break;
-                }
-                case "q":
-                    console.log(chalk.dim("\n  Aborted.\n"));
-                    process.exit(0);
-            }
-        }
     }
     if (tasks.length === 0) {
         console.error("No tasks provided.");
         process.exit(1);
     }
     if (dryRun) {
-        console.log(chalk.bold("  Tasks:"));
         showPlan(tasks);
+        console.log(chalk.dim("  --dry-run: exiting without running\n"));
         process.exit(0);
     }
     // ── Run (wave loop) ──
     process.stdout.write("\x1B[?25l");
     const restore = () => process.stdout.write("\x1B[?25h\n");
-    const agentTimeoutMs = cliFlags.timeout ? parseFloat(cliFlags.timeout) * 1000 : undefined;
     const runStartedAt = Date.now();
     // Wave-loop state
     let currentSwarm;
-    let remaining = budget ?? tasks.length;
+    let remaining = (budget ?? tasks.length) - thinkingUsed;
     let currentTasks = tasks;
     let waveNum = 0;
     const waveHistory = [];
-    let accCost = 0, accIn = 0, accOut = 0, accCompleted = 0, accFailed = 0, accTools = 0;
+    let accCost = thinkingCost, accIn = thinkingIn, accOut = thinkingOut, accCompleted = 0, accFailed = 0, accTools = thinkingTools;
     let lastCapped = false, lastAborted = false;
     // For flex + branch strategy: create one target branch, waves merge via yolo into it
     let runBranch;
@@ -601,7 +757,7 @@ async function main() {
         if (currentTasks.length > remaining)
             currentTasks = currentTasks.slice(0, remaining);
         if (flex) {
-            console.log(chalk.magenta(`\n  \u2500\u2500 Wave ${waveNum + 1} (${currentTasks.length} tasks, ${remaining} remaining) \u2500\u2500\n`));
+            console.log(chalk.cyan(`\n  ◆ Wave ${waveNum + 1}`) + chalk.dim(` · ${currentTasks.length} tasks · ${remaining} remaining\n`));
         }
         const swarm = new Swarm({
             tasks: currentTasks, concurrency, cwd, model: workerModel, permissionMode, allowedTools,
@@ -647,12 +803,10 @@ async function main() {
         if (!flex || remaining <= 0 || swarm.aborted || swarm.cappedOut)
             break;
         // ── Steer next wave ──
-        console.log(chalk.magenta("\n  Steering...\n"));
+        console.log(chalk.cyan("\n  ◆ Steering...\n"));
         process.stdout.write("\x1B[?25l");
         try {
-            const steer = await steerWave(objective, waveHistory, remaining, cwd, plannerModel, workerModel, permissionMode, concurrency, (text) => {
-                process.stdout.write(`\x1B[2K\r  ${chalk.dim(text)}`);
-            });
+            const steer = await steerWave(objective, waveHistory, remaining, cwd, plannerModel, workerModel, permissionMode, concurrency, makeProgressLog(), designContext);
             process.stdout.write(`\x1B[2K\r`);
             process.stdout.write("\x1B[?25h");
             if (steer.done) {
@@ -669,6 +823,11 @@ async function main() {
             break;
         }
     }
+    // Clean up design docs
+    try {
+        rmSync(join(cwd, ".claude-overnight", "designs"), { recursive: true, force: true });
+    }
+    catch { }
     // Switch back if we created a run branch
     if (runBranch && originalRef) {
         try {
@@ -682,9 +841,10 @@ async function main() {
     const summaryText = accFailed > 0
         ? chalk.yellow(`${accCompleted} done, ${accFailed} failed`) + cappedNote
         : chalk.green(`${accCompleted} done`) + cappedNote;
-    const costText = accCost > 0 ? ` ($${accCost.toFixed(3)})` : "";
-    const wavePart = waves > 1 ? `${waves} waves, ` : "";
-    console.log(`\n  ${chalk.bold("Complete:")} ${wavePart}${summaryText}${chalk.dim(costText)}`);
+    const costText = accCost > 0 ? chalk.dim(` · $${accCost.toFixed(3)}`) : "";
+    const wavePart = waves > 1 ? chalk.dim(`${waves} waves · `) : "";
+    console.log(chalk.dim(`\n  ${"─".repeat(36)}`));
+    console.log(`  ${chalk.green("✓")} ${chalk.bold("Complete")}  ${wavePart}${summaryText}${costText}`);
     if (accFailed > 0 && waves === 1) {
         const failedAgents = currentSwarm?.agents.filter((a) => a.status === "error") ?? [];
         if (failedAgents.length > 0) {

package/dist/planner.d.ts CHANGED Viewed

@@ -16,5 +16,8 @@ export interface SteerResult {
 export type ModelTier = "opus" | "sonnet" | "haiku" | "unknown";
 export declare function detectModelTier(model: string): ModelTier;
 export declare function planTasks(objective: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number | undefined, concurrency: number, onLog: (text: string) => void, flexNote?: string): Promise<Task[]>;
+export declare function identifyThemes(objective: string, count: number, model: string, permissionMode: PermMode): Promise<string[]>;
+export declare function buildThinkingTasks(objective: string, themes: string[], designDir: string, plannerModel: string): Task[];
+export declare function orchestrate(objective: string, designDocs: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number, concurrency: number, onLog: (text: string) => void, flexNote?: string): Promise<Task[]>;
 export declare function refinePlan(objective: string, previousTasks: Task[], feedback: string, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, budget: number | undefined, concurrency: number, onLog: (text: string) => void): Promise<Task[]>;
-export declare function steerWave(objective: string, history: WaveSummary[], remainingBudget: number, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, concurrency: number, onLog: (text: string) => void): Promise<SteerResult>;
+export declare function steerWave(objective: string, history: WaveSummary[], remainingBudget: number, cwd: string, plannerModel: string, workerModel: string, permissionMode: PermMode, concurrency: number, onLog: (text: string) => void, designContext?: string): Promise<SteerResult>;

package/dist/planner.js CHANGED Viewed

@@ -2,7 +2,7 @@ import { query } from "@anthropic-ai/claude-agent-sdk";
 const INACTIVITY_MS = 5 * 60 * 1000;
 export function detectModelTier(model) {
     const m = model.toLowerCase();
-    if (m.includes("opus"))
+    if (m === "default" || m.includes("opus"))
         return "opus";
     if (m.includes("sonnet"))
         return "sonnet";
@@ -146,7 +146,32 @@ Respond with ONLY a JSON object (no markdown fences):
   ]
 }`;
 }
+const RATE_LIMIT_PATTERNS = ["rate", "limit", "overloaded", "429", "hit your limit", "too many"];
+function isRateLimitError(err) {
+    const msg = err instanceof Error ? err.message : String(err);
+    return RATE_LIMIT_PATTERNS.some((p) => msg.toLowerCase().includes(p));
+}
 async function runPlannerQuery(prompt, opts, onLog) {
+    const MAX_RETRIES = 3;
+    const BACKOFF = [30_000, 60_000, 120_000];
+    for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
+        try {
+            return await runPlannerQueryOnce(prompt, opts, onLog);
+        }
+        catch (err) {
+            if (attempt < MAX_RETRIES && isRateLimitError(err)) {
+                const waitMs = BACKOFF[attempt];
+                const waitSec = Math.round(waitMs / 1000);
+                onLog(`Rate limited — waiting ${waitSec}s before retry ${attempt + 1}/${MAX_RETRIES}`);
+                await new Promise((r) => setTimeout(r, waitMs));
+                continue;
+            }
+            throw err;
+        }
+    }
+    throw new Error("Planner query failed after retries");
+}
+async function runPlannerQueryOnce(prompt, opts, onLog) {
     let resultText = "";
     const startedAt = Date.now();
     const pq = query({
@@ -162,7 +187,7 @@ async function runPlannerQuery(prompt, opts, onLog) {
             includePartialMessages: true,
         },
     });
-    // Progress ticker — show elapsed time so it doesn't look frozen
+    // Progress ticker — fast updates with compact format
     let lastLogText = "";
     let toolCount = 0;
     const ticker = setInterval(() => {
@@ -170,9 +195,10 @@ async function runPlannerQuery(prompt, opts, onLog) {
         const m = Math.floor(elapsed / 60);
         const s = elapsed % 60;
         const timeStr = m > 0 ? `${m}m ${s}s` : `${s}s`;
-        const extra = lastLogText ? ` — ${lastLogText}` : "";
-        onLog(`${timeStr} elapsed, ${toolCount} tool calls${extra}`);
-    }, 3000);
+        const toolStr = toolCount > 0 ? ` · ${toolCount} tools` : "";
+        const extra = lastLogText ? ` · ${lastLogText}` : "";
+        onLog(`${timeStr}${toolStr}${extra}`);
+    }, 500);
     let lastActivity = Date.now();
     let timer;
     const watchdog = new Promise((_, reject) => {
@@ -201,8 +227,8 @@ async function runPlannerQuery(prompt, opts, onLog) {
                 if (ev?.type === "content_block_delta") {
                     const delta = ev.delta;
                     if (delta?.type === "text_delta" && delta.text) {
-                        const snippet = delta.text.trim();
-                        if (snippet.length > 3) {
+                        const snippet = delta.text.trim().replace(/[{}"\\,[\]]+/g, " ").replace(/\s+/g, " ").trim();
+                        if (snippet.length > 5) {
                             lastLogText = snippet.slice(0, 60);
                         }
                     }
@@ -212,7 +238,7 @@ async function runPlannerQuery(prompt, opts, onLog) {
                 if (msg.subtype === "success")
                     resultText = msg.result || "";
                 else
-                    throw new Error(`Planner failed: ${msg.subtype}`);
+                    throw new Error(`Planner failed: ${msg.result || msg.subtype}`);
             }
         }
     };
@@ -310,6 +336,108 @@ export async function planTasks(objective, cwd, plannerModel, workerModel, permi
     onLog(`${tasks.length} tasks`);
     return tasks;
 }
+// ── Thinking wave ──
+export async function identifyThemes(objective, count, model, permissionMode) {
+    let resultText = "";
+    for await (const msg of query({
+        prompt: `Split this objective into exactly ${count} independent research angles for architects exploring a codebase. Each angle should cover a distinct aspect.
+Objective: ${objective}
+Return ONLY a JSON object: {"themes": ["angle description", ...]}`,
+        options: {
+            model,
+            permissionMode,
+            ...(permissionMode === "bypassPermissions" && { allowDangerouslySkipPermissions: true }),
+            persistSession: false,
+        },
+    })) {
+        if (msg.type === "result" && msg.subtype === "success")
+            resultText = msg.result || "";
+    }
+    const parsed = attemptJsonParse(resultText);
+    if (parsed?.themes && Array.isArray(parsed.themes))
+        return parsed.themes.slice(0, count);
+    const fallback = ["architecture, patterns, and conventions", "data models, state, and persistence", "user-facing flows, components, and UX", "APIs, integrations, and services", "testing, quality, and error handling", "security, performance, and infrastructure", "build, deployment, and configuration", "documentation and developer experience"];
+    return Array.from({ length: count }, (_, i) => fallback[i % fallback.length]);
+}
+export function buildThinkingTasks(objective, themes, designDir, plannerModel) {
+    return themes.map((theme, i) => ({
+        id: `think-${i}`,
+        prompt: `You are a senior architect exploring a codebase to design a solution.
+OVERALL OBJECTIVE: ${objective}
+YOUR FOCUS: ${theme}
+Explore the codebase thoroughly using Read, Glob, and Grep. Then write a design document to ${designDir}/focus-${i}.md with these sections:
+## Findings
+Key files, patterns, and architecture you discovered. Cite specific file paths and function names.
+## Proposed Work Items
+For each item:
+- **What**: What to build or change
+- **Where**: Specific file paths
+- **Why**: Why this matters
+- **Risk**: Conflicts or complications
+## Key Files
+Relevant files with one-line descriptions.
+Be thorough — your findings drive the execution plan.`,
+        model: plannerModel,
+    }));
+}
+export async function orchestrate(objective, designDocs, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, onLog, flexNote) {
+    const capability = modelCapabilityBlock(workerModel);
+    const flexLine = flexNote ? `\n\n${flexNote}` : "";
+    const prompt = `You are a tech lead planning a sprint based on your team's codebase research.
+Objective: ${objective}
+Your architects explored the codebase and found:
+${designDocs}
+AGENT CAPABILITY: ${capability}
+Create exactly ~${budget} concrete execution tasks based on these findings.
+Requirements:
+- Each task is actionable by a single agent session
+- Each task MUST be independent — no dependencies between tasks
+- ${concurrency} agents run in parallel — tasks must touch DIFFERENT files
+- Trust the research — don't tell agents to re-explore what's documented
+- Reference specific files and patterns from the findings
+- Priority order: foundational first, polish last${flexLine}
+Respond with ONLY a JSON object (no markdown fences):
+{"tasks": [{"prompt": "..."}]}`;
+    onLog("Synthesizing...");
+    const resultText = await runPlannerQuery(prompt, { cwd, model: plannerModel, permissionMode }, onLog);
+    const parsed = await extractTaskJson(resultText, async () => {
+        onLog("Retrying...");
+        let retryText = "";
+        for await (const msg of query({
+            prompt: `Output ONLY a JSON object:\n{"tasks":[{"prompt":"..."}]}`,
+            options: { cwd, model: plannerModel, permissionMode, ...(permissionMode === "bypassPermissions" && { allowDangerouslySkipPermissions: true }), persistSession: false },
+        })) {
+            if (msg.type === "result" && msg.subtype === "success")
+                retryText = msg.result || "";
+        }
+        return retryText;
+    });
+    let tasks = (parsed.tasks || []).map((t, i) => ({
+        id: String(i),
+        prompt: typeof t === "string" ? t : t.prompt,
+    }));
+    tasks = postProcess(tasks, budget, onLog);
+    if (tasks.length === 0)
+        throw new Error("Orchestration generated 0 tasks");
+    onLog(`${tasks.length} tasks`);
+    return tasks;
+}
 export async function refinePlan(objective, previousTasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, onLog) {
     onLog("Refining plan...");
     const prev = previousTasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n");
@@ -421,7 +549,7 @@ async function extractTaskJson(raw, retry) {
     throw new Error("Planner did not return valid task JSON after retry");
 }
 // ── Wave steering ──
-export async function steerWave(objective, history, remainingBudget, cwd, plannerModel, workerModel, permissionMode, concurrency, onLog) {
+export async function steerWave(objective, history, remainingBudget, cwd, plannerModel, workerModel, permissionMode, concurrency, onLog, designContext) {
     const capability = modelCapabilityBlock(workerModel);
     const historyText = history.map(w => {
         const lines = w.tasks.map(t => {
@@ -437,7 +565,7 @@ Objective: ${objective}
 Work completed so far:
 ${historyText}
+${designContext ? `\nOriginal architectural research:\n${designContext}\n` : ""}
 Remaining budget: ${remainingBudget} agent sessions. ${concurrency} agents run in parallel — tasks must touch DIFFERENT files.
 ${capability}

package/dist/swarm.js CHANGED Viewed

@@ -240,9 +240,17 @@ export class Swarm {
                     this.activeQueries.delete(agentQuery);
                 }
                 if (agent.status === "running") {
-                    agent.status = "done";
                     agent.finishedAt = Date.now();
-                    this.completed++;
+                    const duration = agent.finishedAt - (agent.startedAt || agent.finishedAt);
+                    if (agent.toolCalls === 0 && (agent.costUsd ?? 0) < 0.001 && duration < 15_000) {
+                        agent.status = "error";
+                        agent.error = "Agent did no work (likely rate-limited before starting)";
+                        this.failed++;
+                    }
+                    else {
+                        agent.status = "done";
+                        this.completed++;
+                    }
                     this.log(id, this.agentSummary(agent));
                 }
                 break; // Success — exit retry loop
@@ -424,12 +432,13 @@ export class Swarm {
         finally {
             if (stashed) {
                 try {
-                    exec("git stash pop", this.config.cwd);
-                    this.log(-1, "Restored stashed changes");
-                }
-                catch (e) {
-                    this.log(-1, `Stash pop failed: ${String(e.message || e).slice(0, 80)}`);
+                    const stashList = exec("git stash list", this.config.cwd).trim();
+                    if (stashList) {
+                        exec("git stash pop", this.config.cwd);
+                        this.log(-1, "Restored stashed changes");
+                    }
                 }
+                catch { /* stash already gone or empty */ }
             }
         }
     }

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "claude-overnight",
-  "version": "0.3.2",
+  "version": "0.5.1",
   "description": "Fire off Claude agents, come back days later to shipped work. Maximizes every token in your plan.",
   "type": "module",
   "bin": {