claude-overnight 0.5.0 → 0.5.1

Files changed (5)

package/README.md CHANGED

@@ -45,7 +45,7 @@ A guided flow walks you through each step:
 ╰────────────────────────────────────╯
 ```
-The planner generates tasks — review, edit, or chat about them, then run.
+For large budgets, the planner identifies research themes — review them, then press Run. Everything after that is fully autonomous: thinking agents explore, the orchestrator synthesizes tasks, execution waves run, and steering adapts between waves. No further interaction needed — go to sleep.
 ### Task file
@@ -68,14 +68,17 @@ The planner always runs on the best available model (Opus) regardless of which m
 For large budgets (`budget > concurrency * 3`), the planner doesn't try to generate hundreds of tasks from scratch. Instead, it launches a **thinking wave** — a team of architect agents that explore your codebase in parallel before any code is written.
 ```
-⠋ identifying themes...        → splits objective into N angles (< 30s)
-◆ Thinking: 5 agents exploring  → each explores from its angle, writes a design doc
-◆ Orchestrating plan...         → reads all design docs, synthesizes execution tasks
+⠋ identifying themes...          → splits objective into N angles (< 30s)
+✓ 10 themes                      → review themes, press Run, walk away
+◆ Thinking: 10 agents exploring  → each explores from its angle, writes a design doc
+◆ Orchestrating plan...          → reads all design docs, synthesizes execution tasks
+◆ Wave 1 · 50 tasks              → fully autonomous from here
+◆ Steering...                    → adapts between waves, retries on rate limits
 ```
-Each thinking agent gets a different research focus (architecture, data, UI, APIs, testing, etc.), explores using Read/Glob/Grep, and writes a structured design document with findings, proposed work items, and key files. The orchestrator then reads all design docs and produces grounded, well-informed execution tasks that reference specific files and patterns the researchers found.
+The review prompt appears right after theme identification — the last thing requiring your presence. After you press Run, the thinking wave, orchestration, execution, and steering all run autonomously. Rate-limited? The planner waits and retries. Go to sleep.
-This means a budget of 200 doesn't generate 200 tasks from a single LLM call guessing at your codebase. It sends 5 architects to study the code first, then plans 50 tasks based on their findings, executes them, steers, and repeats.
+The number of thinking agents scales with budget: 5 for budget=50, 10 for budget=2000+. Each agent explores the codebase from a different angle and writes a structured design document. The orchestrator then reads all design docs and produces grounded execution tasks referencing real files and patterns.
 For small budgets (≤ `concurrency * 3`), the planner skips the thinking wave and generates tasks directly — fast and efficient for focused work.
@@ -99,7 +102,7 @@ The budget also shapes task granularity:
 **Large budget (50+)**: Thinking wave + orchestration. Architects explore, then execution tasks are synthesized from their findings. Each task is a substantial work session grounded in real codebase analysis.
-A budget of 200 is not 200 micro-edits. It's 5 architects + ~195 senior-engineer work sessions, planned in waves.
+A budget of 200 is not 200 micro-edits. It's ~5 architects + ~195 senior-engineer work sessions, planned in waves. A budget of 2000 gets 10 architects.
 ## Usage limits
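
The README's scaling claim (5 thinking agents at budget=50, 10 at budget=2000+) can be checked against the `thinkingCount` expression this release adds to `package/dist/index.js`. The sketch below is a minimal, non-authoritative restatement of that expression. The helper name `thinkingAgentCount` is hypothetical, flex mode is assumed to be on, and the README figures are reproduced only under an assumed concurrency of 5, because concurrency sets the lower bound.

```javascript
// Hypothetical helper wrapping the thinkingCount expression from the diff.
// Assumption: flex mode is enabled (the real code also checks `flex`).
function thinkingAgentCount(budget, concurrency) {
  const effectiveBudget = budget ?? 10; // same default the diff uses
  const useThinking = effectiveBudget > concurrency * 3;
  if (!useThinking) return 0; // small budgets skip the thinking wave
  const scaled = Math.ceil(effectiveBudget * 0.005); // 0.5% of the budget
  // Lower bound: concurrency. Upper bound: 10.
  return Math.min(Math.max(concurrency, scaled), 10);
}

for (const budget of [10, 50, 200, 2000]) {
  console.log(`budget=${budget} -> ${thinkingAgentCount(budget, 5)} thinking agents`);
}
```

With concurrency 5 this yields 0, 5, 5, and 10 agents. Because the lower bound is the concurrency value, a run with concurrency 8 and budget 50 would get 8 agents under this expression, so the README numbers read best as examples rather than fixed thresholds.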

package/dist/index.js CHANGED

@@ -521,10 +521,11 @@ async function main() {
         process.stdout.write("\x1B[?25l");
         const planRestore = () => process.stdout.write("\x1B[?25h");
         const useThinking = flex && (budget ?? 10) > concurrency * 3;
+        const thinkingCount = useThinking ? Math.min(Math.max(concurrency, Math.ceil((budget ?? 10) * 0.005)), 10) : 0;
         const designDir = join(cwd, ".claude-overnight", "designs");
         try {
             if (useThinking) {
-                // Phase 1: Quick theme identification
+                // Phase 1: Quick theme identification → review → then autonomous
                 let themeFrame = 0;
                 const themeSpinner = setInterval(() => {
                     const spin = chalk.cyan(BRAILLE[themeFrame++ % BRAILLE.length]);
@@ -532,13 +533,52 @@ async function main() {
                 }, 120);
                 let themes;
                 try {
-                    themes = await identifyThemes(objective, concurrency, plannerModel, permissionMode);
+                    themes = await identifyThemes(objective, thinkingCount, plannerModel, permissionMode);
                 }
                 finally {
                     clearInterval(themeSpinner);
                 }
-                process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${themes.length} themes`)}\n`);
-                // Phase 2: Thinking wave — agents explore codebase
+                process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${themes.length} themes`)}\n\n`);
+                // Show themes for review — this is the LAST user interaction
+                planRestore();
+                let reviewing = true;
+                while (reviewing) {
+                    for (let i = 0; i < themes.length; i++) {
+                        console.log(chalk.dim(`  ${String(i + 1).padStart(3)}.`) + ` ${themes[i]}`);
+                    }
+                    console.log(chalk.dim(`\n  ${thinkingCount} thinking agents → orchestrate → ${(budget ?? 10) - thinkingCount} execution sessions\n`));
+                    const action = await selectKey(`${chalk.white(`${themes.length} themes`)} ${chalk.dim(`· ${thinkingCount} thinking · ${concurrency} concurrent`)}`, [
+                        { key: "r", desc: "un" },
+                        { key: "e", desc: "dit" },
+                        { key: "q", desc: "uit" },
+                    ]);
+                    switch (action) {
+                        case "r":
+                            reviewing = false;
+                            break;
+                        case "e": {
+                            const feedback = await ask(`\n  ${chalk.bold("What should change?")}\n  ${chalk.cyan(">")} `);
+                            if (!feedback)
+                                break;
+                            process.stdout.write("\x1B[?25l");
+                            try {
+                                themes = await identifyThemes(`${objective}\n\nUser feedback: ${feedback}`, thinkingCount, plannerModel, permissionMode);
+                                process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${themes.length} themes`)}\n\n`);
+                            }
+                            catch (err) {
+                                console.error(chalk.red(`\n  Re-planning failed: ${err.message}\n`));
+                            }
+                            planRestore();
+                            break;
+                        }
+                        case "q":
+                            console.log(chalk.dim("\n  Aborted.\n"));
+                            process.exit(0);
+                    }
+                }
+                // ── From here, fully autonomous — no more user interaction ──
+                process.stdout.write("\x1B[?25l");
+                // Phase 2: Thinking wave
                 mkdirSync(designDir, { recursive: true });
                 const thinkingTasks = buildThinkingTasks(objective, themes, designDir, plannerModel);
                 console.log(chalk.cyan(`\n  ◆ Thinking: ${thinkingTasks.length} agents exploring...\n`));
@@ -571,27 +611,84 @@ async function main() {
                     const flexNote = `This is wave 1 of an adaptive multi-wave run (total budget: ${(budget ?? 10) - thinkingUsed}). Plan the highest-impact foundational work first. Future waves will iterate based on what's learned.`;
                     console.log(chalk.cyan(`\n  ◆ Orchestrating plan...\n`));
                     tasks = await orchestrate(objective, designContext, cwd, plannerModel, workerModel, permissionMode, orchBudget, concurrency, makeProgressLog(), flexNote);
-                    const remaining = (budget ?? 10) - thinkingUsed - tasks.length;
-                    process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${tasks.length} tasks`)}${chalk.dim(` · ${remaining} remaining`)}\n\n`);
+                    process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
                 }
                 else {
-                    // Fallback: no design docs produced, use direct planner
-                    console.log(chalk.yellow(`\n  No design docs produced — falling back to direct planning\n`));
+                    console.log(chalk.yellow(`\n  No design docs — falling back to direct planning\n`));
                     const waveBudget = Math.min(50, Math.max(concurrency, Math.ceil(((budget ?? 10) - thinkingUsed) * 0.5)));
                     tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, makeProgressLog());
                     process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
                 }
             }
             else {
-                // Small budget: direct planning (no thinking wave)
+                // Small budget: direct planning → review → run
                 const waveBudget = flex ? Math.min(50, Math.max(concurrency, Math.ceil((budget ?? 10) * 0.5))) : budget;
                 const flexNote = flex
                     ? `This is wave 1 of an adaptive multi-wave run (total budget: ${budget}). Plan the highest-impact foundational work first. Future waves will iterate, polish, and expand based on what's learned.`
                     : undefined;
                 console.log(chalk.cyan(`\n  ◆ Planning${flex ? " wave 1" : ""}...\n`));
                 tasks = await planTasks(objective, cwd, plannerModel, workerModel, permissionMode, waveBudget, concurrency, makeProgressLog(), flexNote);
-                const flexHint = flex ? chalk.dim(` (wave 1, ${(budget ?? 10) - tasks.length} remaining)`) : "";
+                const flexHint = flex ? chalk.dim(` · wave 1`) : "";
                 process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${tasks.length} tasks`)}${flexHint}\n\n`);
+                // Review loop for small-budget path
+                planRestore();
+                let reviewing = true;
+                while (reviewing) {
+                    showPlan(tasks);
+                    const action = await selectKey(`${chalk.white(`${tasks.length} tasks`)} ${chalk.dim(`· ${concurrency} concurrent`)}`, [
+                        { key: "r", desc: "un" },
+                        { key: "e", desc: "dit" },
+                        { key: "c", desc: "hat" },
+                        { key: "q", desc: "uit" },
+                    ]);
+                    switch (action) {
+                        case "r":
+                            reviewing = false;
+                            break;
+                        case "e": {
+                            const feedback = await ask(`\n  ${chalk.bold("What should change?")}\n  ${chalk.cyan(">")} `);
+                            if (!feedback)
+                                break;
+                            console.log(chalk.cyan("\n  ◆ Re-planning...\n"));
+                            process.stdout.write("\x1B[?25l");
+                            try {
+                                tasks = await refinePlan(objective, tasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, makeProgressLog());
+                                process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
+                            }
+                            catch (err) {
+                                console.error(chalk.red(`\n  Re-planning failed: ${err.message}\n`));
+                            }
+                            planRestore();
+                            break;
+                        }
+                        case "c": {
+                            const question = await ask(`\n  ${chalk.bold("Ask about the plan:")}\n  ${chalk.cyan(">")} `);
+                            if (!question)
+                                break;
+                            process.stdout.write("\x1B[?25l");
+                            try {
+                                let answer = "";
+                                for await (const msg of query({
+                                    prompt: `You planned these tasks for the objective "${objective}":\n${tasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n")}\n\nUser question: ${question}`,
+                                    options: { cwd, model: plannerModel, permissionMode, persistSession: false },
+                                })) {
+                                    if (msg.type === "result" && msg.subtype === "success")
+                                        answer = msg.result || "";
+                                }
+                                planRestore();
+                                if (answer)
+                                    console.log(chalk.dim(`\n  ${answer.slice(0, 500)}\n`));
+                            }
+                            catch {
+                                planRestore();
+                            }
+                            break;
+                        }
+                        case "q":
+                            console.log(chalk.dim("\n  Aborted.\n"));
+                            process.exit(0);
+                    }
+                }
             }
         }
         catch (err) {
@@ -602,65 +699,6 @@ async function main() {
                 console.error(chalk.red(`\n  Planning failed: ${err.message}\n`));
             process.exit(1);
         }
-        // ── Review loop ──
-        planRestore();
-        let reviewing = true;
-        while (reviewing) {
-            showPlan(tasks);
-            const action = await selectKey(`${chalk.white(`${tasks.length} tasks`)} ${chalk.dim(`· ${concurrency} concurrent`)}`, [
-                { key: "r", desc: "un" },
-                { key: "e", desc: "dit" },
-                { key: "c", desc: "hat" },
-                { key: "q", desc: "uit" },
-            ]);
-            switch (action) {
-                case "r":
-                    reviewing = false;
-                    break;
-                case "e": {
-                    const feedback = await ask(`\n  ${chalk.bold("What should change?")}\n  ${chalk.cyan(">")} `);
-                    if (!feedback)
-                        break;
-                    console.log(chalk.cyan("\n  ◆ Re-planning...\n"));
-                    process.stdout.write("\x1B[?25l");
-                    try {
-                        tasks = await refinePlan(objective, tasks, feedback, cwd, plannerModel, workerModel, permissionMode, budget, concurrency, makeProgressLog());
-                        process.stdout.write(`\x1B[2K\r  ${chalk.green(`\u2713 ${tasks.length} tasks`)}\n\n`);
-                    }
-                    catch (err) {
-                        console.error(chalk.red(`\n  Re-planning failed: ${err.message}\n`));
-                    }
-                    planRestore();
-                    break;
-                }
-                case "c": {
-                    const question = await ask(`\n  ${chalk.bold("Ask about the plan:")}\n  ${chalk.cyan(">")} `);
-                    if (!question)
-                        break;
-                    process.stdout.write("\x1B[?25l");
-                    try {
-                        let answer = "";
-                        for await (const msg of query({
-                            prompt: `You planned these tasks for the objective "${objective}":\n${tasks.map((t, i) => `${i + 1}. ${t.prompt}`).join("\n")}\n\nUser question: ${question}`,
-                            options: { cwd, model: plannerModel, permissionMode, persistSession: false },
-                        })) {
-                            if (msg.type === "result" && msg.subtype === "success")
-                                answer = msg.result || "";
-                        }
-                        planRestore();
-                        if (answer)
-                            console.log(chalk.dim(`\n  ${answer.slice(0, 500)}\n`));
-                    }
-                    catch {
-                        planRestore();
-                    }
-                    break;
-                }
-                case "q":
-                    console.log(chalk.dim("\n  Aborted.\n"));
-                    process.exit(0);
-            }
-        }
     }
     if (tasks.length === 0) {
         console.error("No tasks provided.");
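
The `waveBudget` expression that appears in the fallback and small-budget paths above caps the first wave at 50 tasks and never goes below the concurrency level. The following is a rough sketch of that arithmetic only. The helper name `firstWaveBudget` is hypothetical, and the orchestrated path uses `orchBudget`, whose computation is not shown in this diff.

```javascript
// Hypothetical helper; the expression mirrors the waveBudget lines in the diff.
// thinkingUsed is the budget already spent on the thinking wave
// (0 on the small-budget path).
function firstWaveBudget(budget, concurrency, thinkingUsed = 0) {
  const remaining = (budget ?? 10) - thinkingUsed;
  const half = Math.ceil(remaining * 0.5); // aim for half of what is left
  return Math.min(50, Math.max(concurrency, half)); // floor: concurrency, cap: 50
}

console.log(firstWaveBudget(200, 5, 5)); // 50 (half of 195 is 98, capped at 50)
console.log(firstWaveBudget(12, 5));     // 6
console.log(firstWaveBudget(6, 5));      // 5 (floor at concurrency)
```

The 50-task cap is consistent with the `Wave 1 · 50 tasks` line in the updated README transcript.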

package/dist/planner.js CHANGED

@@ -2,7 +2,7 @@ import { query } from "@anthropic-ai/claude-agent-sdk";
 const INACTIVITY_MS = 5 * 60 * 1000;
 export function detectModelTier(model) {
     const m = model.toLowerCase();
-    if (m.includes("opus"))
+    if (m === "default" || m.includes("opus"))
         return "opus";
     if (m.includes("sonnet"))
         return "sonnet";
@@ -146,7 +146,32 @@ Respond with ONLY a JSON object (no markdown fences):
   ]
 }`;
 }
+const RATE_LIMIT_PATTERNS = ["rate", "limit", "overloaded", "429", "hit your limit", "too many"];
+function isRateLimitError(err) {
+    const msg = err instanceof Error ? err.message : String(err);
+    return RATE_LIMIT_PATTERNS.some((p) => msg.toLowerCase().includes(p));
+}
 async function runPlannerQuery(prompt, opts, onLog) {
+    const MAX_RETRIES = 3;
+    const BACKOFF = [30_000, 60_000, 120_000];
+    for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
+        try {
+            return await runPlannerQueryOnce(prompt, opts, onLog);
+        }
+        catch (err) {
+            if (attempt < MAX_RETRIES && isRateLimitError(err)) {
+                const waitMs = BACKOFF[attempt];
+                const waitSec = Math.round(waitMs / 1000);
+                onLog(`Rate limited — waiting ${waitSec}s before retry ${attempt + 1}/${MAX_RETRIES}`);
+                await new Promise((r) => setTimeout(r, waitMs));
+                continue;
+            }
+            throw err;
+        }
+    }
+    throw new Error("Planner query failed after retries");
+}
+async function runPlannerQueryOnce(prompt, opts, onLog) {
     let resultText = "";
     const startedAt = Date.now();
     const pq = query({
@@ -213,7 +238,7 @@ async function runPlannerQuery(prompt, opts, onLog) {
                 if (msg.subtype === "success")
                     resultText = msg.result || "";
                 else
-                    throw new Error(`Planner failed: ${msg.subtype}`);
+                    throw new Error(`Planner failed: ${msg.result || msg.subtype}`);
             }
         }
     };
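
The retry logic added to `runPlannerQuery` allows up to three retries, waiting 30s, 60s, and 120s, and rethrows any error that does not look like a rate limit. The sketch below restates that pattern in a testable form. `withRateLimitRetry` and its `sleep`, `backoffMs`, and `onLog` options are hypothetical names introduced for illustration. The pattern list and `isRateLimitError` are copied from the diff.

```javascript
// Patterns copied from the diff; matching is a lowercase substring test.
const RATE_LIMIT_PATTERNS = ["rate", "limit", "overloaded", "429", "hit your limit", "too many"];

function isRateLimitError(err) {
  const msg = err instanceof Error ? err.message : String(err);
  return RATE_LIMIT_PATTERNS.some((p) => msg.toLowerCase().includes(p));
}

// Hypothetical generic wrapper around the same retry schedule.
// `sleep` is injectable so the demo does not actually wait.
async function withRateLimitRetry(fn, options = {}) {
  const {
    backoffMs = [30_000, 60_000, 120_000],
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    onLog = () => {},
  } = options;
  const maxRetries = backoffMs.length;
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up once retries are exhausted or the error is not a rate limit.
      if (attempt >= maxRetries || !isRateLimitError(err)) throw err;
      const waitSec = Math.round(backoffMs[attempt] / 1000);
      onLog(`Rate limited, waiting ${waitSec}s before retry ${attempt + 1}/${maxRetries}`);
      await sleep(backoffMs[attempt]);
    }
  }
}

// Demo: fail twice with a 429-style error, then succeed. Sleep is stubbed.
async function demo() {
  const waits = [];
  let calls = 0;
  const result = await withRateLimitRetry(
    async () => {
      calls++;
      if (calls < 3) throw new Error("429 Too Many Requests");
      return "plan";
    },
    { sleep: async (ms) => { waits.push(ms); } }
  );
  console.log(result, calls, waits);
  return { result, calls, waits };
}

demo();

// Substring matching is broad: "generate" contains "rate".
console.log(isRateLimitError(new Error("Failed to generate plan"))); // true
```

One consequence of substring matching on short patterns such as `"rate"` and `"limit"` is that unrelated messages can be classified as rate limits and retried. Whether that matters in practice depends on which error messages the SDK actually produces, which this diff does not show.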

package/dist/swarm.js CHANGED

@@ -240,9 +240,17 @@ export class Swarm {
                     this.activeQueries.delete(agentQuery);
                 }
                 if (agent.status === "running") {
-                    agent.status = "done";
                     agent.finishedAt = Date.now();
-                    this.completed++;
+                    const duration = agent.finishedAt - (agent.startedAt || agent.finishedAt);
+                    if (agent.toolCalls === 0 && (agent.costUsd ?? 0) < 0.001 && duration < 15_000) {
+                        agent.status = "error";
+                        agent.error = "Agent did no work (likely rate-limited before starting)";
+                        this.failed++;
+                    }
+                    else {
+                        agent.status = "done";
+                        this.completed++;
+                    }
                     this.log(id, this.agentSummary(agent));
                 }
                 break; // Success — exit retry loop
@@ -424,12 +432,13 @@ export class Swarm {
         finally {
             if (stashed) {
                 try {
-                    exec("git stash pop", this.config.cwd);
-                    this.log(-1, "Restored stashed changes");
-                }
-                catch (e) {
-                    this.log(-1, `Stash pop failed: ${String(e.message || e).slice(0, 80)}`);
+                    const stashList = exec("git stash list", this.config.cwd).trim();
+                    if (stashList) {
+                        exec("git stash pop", this.config.cwd);
+                        this.log(-1, "Restored stashed changes");
+                    }
                 }
+                catch { /* stash already gone or empty */ }
             }
         }
     }

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "claude-overnight",
-  "version": "0.5.0",
+  "version": "0.5.1",
   "description": "Fire off Claude agents, come back days later to shipped work. Maximizes every token in your plan.",
   "type": "module",
   "bin": {