claude-overnight 0.5.1 → 1.0.1
- package/README.md +107 -105
- package/dist/index.js +537 -95
- package/dist/planner.d.ts +17 -4
- package/dist/planner.js +156 -68
- package/dist/types.d.ts +34 -0
- package/package.json +14 -8
package/README.md
CHANGED
@@ -1,8 +1,10 @@
 # claude-overnight
 
-
+Run 10, 100, or 1000 Claude agents overnight. Come back to shipped work.
 
-Describe what to build. Set a budget
+Describe what to build. Set a budget. The tool plans, explores your codebase, breaks the objective into tasks, launches parallel agents in isolated git worktrees, iterates toward quality, and handles rate limits automatically. You press Run once, then go to sleep.
+
+Built on the [Claude Agent SDK](https://www.npmjs.com/package/@anthropic-ai/claude-agent-sdk). Works with Claude Opus, Sonnet, and Haiku.
 
 ## Install
 
@@ -10,18 +12,14 @@ Describe what to build. Set a budget — 10 agents, 100, 1000. A planner agent a
 npm install -g claude-overnight
 ```
 
-Requires Node.js >= 20 and Claude authentication (
-
-## Usage
+Requires Node.js >= 20 and Claude authentication (`claude auth login`, or set `ANTHROPIC_API_KEY`).
 
-
+## Quick start
 
 ```bash
 claude-overnight
 ```
 
-A guided flow walks you through each step:
-
 ```
 🌙 claude-overnight
 ────────────────────────────────────
@@ -29,98 +27,102 @@ A guided flow walks you through each step:
 ① What should the agents do?
 > refactor auth, add tests, update docs
 
-② Budget [10]:
+② Budget [10]: 200
 
 ③ Worker model:
 ● Sonnet — Sonnet 4.6 · Best for everyday tasks
 ○ Opus — Opus 4.6 · Most capable
-○ Haiku — Haiku 4.5 · Fastest
 
 ④ Usage:
-●
-
-
-
-
-
+● 90% · leave 10% for other work
+
+╭──────────────────────────────────────────╮
+│ sonnet · budget 200 · 5× · flex · 90% │
+╰──────────────────────────────────────────╯
+
+✓ 5 themes → review, press Run, walk away
+
+◆ Thinking: 5 agents exploring... ← architects analyze your codebase
+◆ Orchestrating plan... ← synthesizes 50 concrete tasks
+◆ Wave 1 · 50 tasks ← fully autonomous from here
+◆ Assessing... how close to amazing?
+◆ Wave 2 · 30 tasks ← improvements from assessment
+◆ Reflection: 2 agents reviewing ← deep quality audit
+◆ Wave 3 · 20 tasks ← fixes from review findings
+◆ Assessing... ✓ Vision met
 ```
 
-
+You interact once (objective, budget, model, review themes), then everything runs autonomously — thinking, planning, executing, reflecting, steering. Rate-limited? It waits and retries. Crash? Resume where you left off.
 
-
+## How it works
 
-
-claude-overnight tasks.json
-```
+### 1. Thinking wave
 
-
+For budgets > 15, the tool launches **architect agents** that explore your codebase before any code is written. Each one gets a different research angle (architecture, data models, APIs, testing, etc.) and writes a structured design document. The number scales with budget: 5 for budget=50, 10 for budget=2000.
 
-
-claude-overnight "fix auth bug in src/auth.ts" "add tests for user model"
-```
-
-## How the planner works
-
-The planner always runs on the best available model (Opus) regardless of which model you pick for workers. This ensures high-quality task decomposition even when workers use a cheaper model.
+### 2. Orchestration
 
-
+An orchestrator agent reads all design documents and synthesizes concrete execution tasks — grounded in real files and patterns the architects found. No guesswork.
 
-
+### 3. Iterative execution
 
-
-⠋ identifying themes... → splits objective into N angles (< 30s)
-✓ 10 themes → review themes, press Run, walk away
-◆ Thinking: 10 agents exploring → each explores from its angle, writes a design doc
-◆ Orchestrating plan... → reads all design docs, synthesizes execution tasks
-◆ Wave 1 · 50 tasks → fully autonomous from here
-◆ Steering... → adapts between waves, retries on rate limits
-```
+Tasks run in parallel (each agent in its own git worktree). After each wave, steering assesses: "how good is this?" — not "what's missing?" It can:
 
-
+- **Execute** more tasks to build features, fix bugs, polish UX
+- **Reflect** by spinning up 1-2 review agents for deep quality/architecture audits
+- **Declare done** when the vision is met at high quality
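The three steering outcomes in the added list can be modeled as a discriminated union. This is a minimal sketch for readers of the diff: every type, field, and function name here is hypothetical and is not taken from the package's `planner.d.ts`.

```typescript
// Hypothetical model of the three steering outcomes the README lists.
// None of these names come from the package itself.
type SteeringDecision =
  | { kind: "execute"; tasks: string[] } // run another wave of tasks
  | { kind: "reflect"; reviewers: 1 | 2 } // spin up 1-2 review agents
  | { kind: "done"; reason: string }; // vision met at high quality

// Summarize a decision for a log line; the switch covers every `kind`.
function describeDecision(d: SteeringDecision): string {
  switch (d.kind) {
    case "execute":
      return `execute ${d.tasks.length} task(s)`;
    case "reflect":
      return `reflect with ${d.reviewers} reviewer(s)`;
    case "done":
      return `done: ${d.reason}`;
  }
}
```

Keeping the outcome as a tagged union means a wave loop can branch on `kind` and the compiler flags any outcome it forgot to handle.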
 
-
+### 4. Goal refinement
 
-
+The tool starts with your broad objective but evolves its definition of "amazing" as it learns your codebase. Steering refines the goal after each wave. Late waves are informed by early discoveries.
 
-###
+### 5. Three-layer context
 
-
+Long runs stay sharp because steering maintains three layers of memory:
 
-**
+- **Status** — a living project snapshot, updated every wave. Compressed, never truncated.
+- **Milestones** — strategic snapshots archived every ~5 waves. Long-term memory.
+- **Goal** — the evolving north star. What "amazing" means for this codebase.
 
-
+## Run history and resume
 
-
+Every run gets its own folder in `.claude-overnight/runs/`. Nothing is ever overwritten.
 
-
+```
+.claude-overnight/
+  runs/
+    2026-04-04T18-52-49/ ← run A (done, $200, 200 tasks)
+      run.json, status.md, goal.md, milestones/, sessions/
+    2026-04-05T10-30-00/ ← run B (crashed)
+      run.json, sessions/
+```
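The run folders in the tree above look like ISO-8601 timestamps with the colons replaced by hyphens. A small sketch of how such a name could be produced and how the newest run could be found follows. The function names are hypothetical, and the use of UTC with second precision is an assumption: the README does not say which clock or time zone the tool uses.

```typescript
// Assumption: folder names are ISO-8601 timestamps (UTC, second precision)
// with ":" replaced by "-" so they are safe directory names.
function runFolderName(now: Date): string {
  // "2026-04-04T18:52:49.000Z" -> "2026-04-04T18-52-49"
  return now.toISOString().slice(0, 19).replace(/:/g, "-");
}

// Names in this format sort chronologically as plain strings,
// so the latest run is simply the largest name.
function latestRun(folderNames: string[]): string | undefined {
  return [...folderNames].sort().pop();
}
```

Because the format is fixed-width and ordered from year down to second, no date parsing is needed to order runs.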
 
-
+If a run crashes, gets rate-limited, or you Ctrl+C:
 
-
+```
+⚠ Interrupted run
+╭──────────────────────────────────────────────────╮
+│ refactor auth, add tests, update docs │
+│ 50/200 sessions · 3 waves · $69.16 │
+│ 34 merged · 16 unmerged · 0 failed branches │
+╰──────────────────────────────────────────────────╯
+
+Resume │ Fresh │ Quit
+```
 
-
+On resume: unmerged branches auto-merge, the wave loop continues, all context is preserved.
 
-**
+**Knowledge carries forward** — new runs inherit knowledge from completed previous runs. Thinking agents and steering see what past runs built. Run 2 knows run 1 already built the auth system.
 
-
+Add `.claude-overnight` to your `.gitignore`.
 
-##
+## Other usage modes
 
-
+### Task file
 
+```bash
+claude-overnight tasks.json
 ```
-④ Usage:
-● Unlimited · full capacity, wait through rate limits
-○ 90% · leave 10% for other work
-○ 75% · conservative, plenty of headroom
-○ 50% · use half, keep the rest
-```
-
-When utilization hits your cap, the swarm stops dispatching new tasks and lets active agents finish gracefully. This way you can run a big overnight job and still have capacity left for manual Claude usage.
-
-Use `--usage-cap=90` on the command line, or `"usageCap": 90` in task files.
-
-## Task file format
 
 ```json
 {
@@ -135,71 +137,67 @@ Use `--usage-cap=90` on the command line, or `"usageCap": 90` in task files.
 }
 ```
 
-
-
-For multi-wave runs from a task file, add `objective` and `flexiblePlan`:
+For multi-wave runs, add `objective` and `flexiblePlan`:
 
 ```json
 {
-"objective": "Modernize the auth system
+"objective": "Modernize the auth system",
 "flexiblePlan": true,
 "tasks": ["Refactor auth middleware", "Add JWT validation"],
 "usageCap": 90
 }
 ```
 
-
+### Inline
 
-
-
-
-| `objective` | `string` | — | High-level goal for multi-wave steering (required when `flexiblePlan` is true) |
-| `flexiblePlan` | `boolean` | `false` | Enable adaptive multi-wave planning from task files |
-| `model` | `string` | prompted | Worker model (per-task overridable) |
-| `concurrency` | `number` | `5` | Max parallel agents |
-| `worktrees` | `boolean` | auto (git repo) | Isolate each agent in a git worktree |
-| `permissionMode` | `"auto" \| "bypassPermissions" \| "default"` | `"auto"` | How agents handle dangerous operations |
-| `cwd` | `string` | `process.cwd()` | Working directory |
-| `allowedTools` | `string[]` | all | Restrict agent tools |
-| `mergeStrategy` | `"yolo" \| "branch"` | `"yolo"` | Merge into HEAD or a new branch |
-| `usageCap` | `number (0-100)` | unlimited | Stop at N% utilization (e.g. 90) |
+```bash
+claude-overnight "fix auth bug in src/auth.ts" "add tests for user model"
+```
 
 ## CLI flags
 
 | Flag | Default | Description |
 |---|---|---|
-| `--budget=N` | `10` | Total agent sessions
-| `--concurrency=N` | `5` |
-| `--model=NAME` | prompted | Worker model (planner
+| `--budget=N` | `10` | Total agent sessions |
+| `--concurrency=N` | `5` | Parallel agents |
+| `--model=NAME` | prompted | Worker model (planner uses best available) |
 | `--usage-cap=N` | unlimited | Stop at N% utilization |
-| `--timeout=SECONDS` | `300` | Inactivity timeout
-| `--no-flex` | — | Disable
+| `--timeout=SECONDS` | `300` | Inactivity timeout per agent |
+| `--no-flex` | — | Disable multi-wave steering |
 | `--dry-run` | — | Show planned tasks without running |
-| `-h, --help` | — | Help |
-| `-v, --version` | — | Version |
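To make the defaults in the 1.0.1 flag table concrete, here is a minimal sketch of a parser that applies them. It is not the package's own argument handling: `CliOptions`, `parseArgs`, and the field names are hypothetical, and treating every non-flag argument as an inline task (or task file path) is an assumption.

```typescript
// Hypothetical option shape; defaults mirror the 1.0.1 flag table.
interface CliOptions {
  budget: number; // --budget=N, default 10
  concurrency: number; // --concurrency=N, default 5
  model?: string; // --model=NAME, undefined means "prompt the user"
  usageCap?: number; // --usage-cap=N, undefined means unlimited
  timeoutSeconds: number; // --timeout=SECONDS, default 300
  flex: boolean; // turned off by --no-flex
  dryRun: boolean; // turned on by --dry-run
  positional: string[]; // inline tasks or a task file path
}

function parseArgs(argv: string[]): CliOptions {
  const opts: CliOptions = {
    budget: 10,
    concurrency: 5,
    timeoutSeconds: 300,
    flex: true,
    dryRun: false,
    positional: [],
  };
  for (const arg of argv) {
    if (arg === "--no-flex") { opts.flex = false; continue; }
    if (arg === "--dry-run") { opts.dryRun = true; continue; }
    const m = /^--([a-z-]+)=(.*)$/.exec(arg);
    if (!m) { opts.positional.push(arg); continue; }
    const name = m[1];
    const value = m[2] ?? "";
    switch (name) {
      case "budget": opts.budget = Number(value); break;
      case "concurrency": opts.concurrency = Number(value); break;
      case "model": opts.model = value; break;
      case "usage-cap": opts.usageCap = Number(value); break;
      case "timeout": opts.timeoutSeconds = Number(value); break;
      default: throw new Error(`unknown flag: --${name}`);
    }
  }
  return opts;
}
```

For example, `parseArgs(["--budget=200", "--usage-cap=90", "fix auth bug"])` keeps the default concurrency of 5 and records one positional task.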
 
-
+## Task file fields
 
-
+| Field | Type | Default | Description |
+|---|---|---|---|
+| `tasks` | `(string \| {prompt, cwd?, model?})[]` | required | Tasks to run |
+| `objective` | `string` | — | High-level goal for steering |
+| `flexiblePlan` | `boolean` | `false` | Enable multi-wave planning |
+| `model` | `string` | prompted | Worker model |
+| `concurrency` | `number` | `5` | Parallel agents |
+| `worktrees` | `boolean` | auto | Git worktree isolation |
+| `permissionMode` | `"auto" \| "bypassPermissions" \| "default"` | `"auto"` | Permission handling |
+| `mergeStrategy` | `"yolo" \| "branch"` | `"yolo"` | Merge into HEAD or new branch |
+| `usageCap` | `number (0-100)` | unlimited | Stop at N% utilization |
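The added field table maps naturally onto a TypeScript interface. The sketch below is derived only from the table; it is not the content of the package's `types.d.ts`, and the names `TaskEntry`, `TaskFile`, and `checkTaskFile` are hypothetical. The rule that `flexiblePlan` needs an `objective` comes from the removed 0.5.1 table row; whether 1.0.1 still enforces it is not shown in this diff.

```typescript
// Shape inferred from the 1.0.1 "Task file fields" table (an assumption,
// not the package's published types).
type TaskEntry = string | { prompt: string; cwd?: string; model?: string };

interface TaskFile {
  tasks: TaskEntry[];
  objective?: string;
  flexiblePlan?: boolean;
  model?: string;
  concurrency?: number;
  worktrees?: boolean;
  permissionMode?: "auto" | "bypassPermissions" | "default";
  mergeStrategy?: "yolo" | "branch";
  usageCap?: number; // percentage, 0-100
}

// Returns a list of problems; an empty list means these checks passed.
function checkTaskFile(file: TaskFile): string[] {
  const problems: string[] = [];
  if (file.usageCap !== undefined && (file.usageCap < 0 || file.usageCap > 100)) {
    problems.push("usageCap must be between 0 and 100");
  }
  if (file.flexiblePlan && !file.objective) {
    problems.push("flexiblePlan needs an objective");
  }
  return problems;
}

// The multi-wave example from the README, typed against the sketch.
const exampleTaskFile: TaskFile = {
  objective: "Modernize the auth system",
  flexiblePlan: true,
  tasks: ["Refactor auth middleware", "Add JWT validation"],
  usageCap: 90,
};
```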
 
-
+## Rate limits
 
-
-- **Hard block**: API returns a reset timestamp — swarm pauses and resumes exactly when the window opens.
-- **Soft throttle**: at >75% utilization, dispatch slows to avoid hitting the limit.
-- **Retry with backoff**: transient errors (429, overloaded, connection reset) retry with exponential backoff.
-- **Usage cap**: set a ceiling and the swarm stops dispatching when it's reached — active agents finish, no new ones start.
+Built for unattended runs lasting hours or days.
 
-
+- **Hard block**: pauses until the rate limit window resets, then resumes
+- **Soft throttle**: slows dispatch at >75% utilization
+- **Retry with backoff**: transient errors (429, overloaded) retry automatically
+- **Usage cap**: set a ceiling, active agents finish, no new ones start
+- **Planner retries**: steering and orchestration also retry on rate limits (30s/60s/120s backoff)
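Two of the behaviors in this list are easy to sketch: the fixed 30s/60s/120s retry schedule and the throttle/cap decision. The code below is an illustration under stated assumptions, not the package's implementation. `retryWithBackoff`, `dispatchMode`, and their parameters are hypothetical names; the 75% threshold and the delay schedule are the only values taken from the README.

```typescript
// Delays quoted in the README for planner retries.
const PLANNER_BACKOFF_SECONDS = [30, 60, 120];

// Call `fn`, and on a retryable error wait for the next delay and try again.
// `sleep` is injectable so the sketch can be exercised without real waiting.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  delaysSeconds: number[] = PLANNER_BACKOFF_SECONDS,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const delay = delaysSeconds[attempt];
      // Give up when the schedule is exhausted or the error is not transient.
      if (delay === undefined || !isRetryable(err)) throw err;
      await sleep(delay * 1000);
    }
  }
}

type DispatchMode = "normal" | "slow" | "stop";

// `utilization` and `usageCap` are percentages; no cap means unlimited.
function dispatchMode(utilization: number, usageCap?: number): DispatchMode {
  if (usageCap !== undefined && utilization >= usageCap) return "stop";
  if (utilization > 75) return "slow"; // soft throttle from the README
  return "normal";
}
```

With this schedule a call is attempted at most four times: once immediately and once after each of the three delays.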
 
 ## Worktrees and merging
 
-Each agent gets an isolated git worktree
+Each agent gets an isolated git worktree (`swarm/task-N` branch). Changes auto-commit. After all agents complete, branches merge back.
 
-- `"yolo"` (default): merges
-- `"branch"`: creates a `swarm/run-{timestamp}` branch
+- `"yolo"` (default): merges into your current branch
+- `"branch"`: creates a new `swarm/run-{timestamp}` branch
 
-
+Conflicts retry with `-X theirs`. Unresolved branches are preserved for manual merge.
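The merge flow described here (plain merge, retry with `-X theirs`, otherwise keep the branch) can be sketched with standard git commands. This is an assumption-laden illustration: the helper names, the injected `GitRunner`, the worktree directory argument, and the `git merge --abort` cleanup step are all hypothetical and not confirmed by the README. Only the `swarm/task-N` branch name and the `-X theirs` retry come from the source.

```typescript
// A runner executes `git <args>` and reports success. It is injected so the
// sketch does not depend on any particular process API.
type GitRunner = (args: string[]) => boolean;

// `git worktree add -b <branch> <dir>` creates the branch and its worktree.
function worktreeAddArgs(taskIndex: number, dir: string): string[] {
  return ["worktree", "add", "-b", `swarm/task-${taskIndex}`, dir];
}

function mergeArgs(branch: string, preferTheirs: boolean): string[] {
  return preferTheirs ? ["merge", "-X", "theirs", branch] : ["merge", branch];
}

// Try a plain merge, then a merge that resolves conflicts in favor of the
// incoming branch; if both fail, leave the branch for a manual merge.
function mergeBranch(
  branch: string,
  git: GitRunner,
): "merged" | "merged-theirs" | "preserved" {
  if (git(mergeArgs(branch, false))) return "merged";
  git(["merge", "--abort"]); // assumed cleanup of the conflicted state
  if (git(mergeArgs(branch, true))) return "merged-theirs";
  git(["merge", "--abort"]);
  return "preserved";
}
```

Note that `-X theirs` favors the agent's branch on conflicting hunks, so earlier merged work can be overwritten in those hunks.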
 
 ## Exit codes
 
@@ -208,3 +206,7 @@ Merge conflicts retry with `-X theirs`. If that fails, the branch is preserved f
 | `0` | All tasks succeeded |
 | `1` | Some tasks failed |
 | `2` | All failed or none completed |
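The three exit codes in this table reduce to a small function of the success and failure counts. The sketch below uses a hypothetical `exitCodeFor` helper; how the tool actually counts tasks is not shown in the diff.

```typescript
// Map task outcomes to the exit codes in the table:
// 0 = all succeeded, 1 = some failed, 2 = all failed or none completed.
function exitCodeFor(succeeded: number, failed: number): 0 | 1 | 2 {
  if (succeeded === 0) return 2;
  return failed === 0 ? 0 : 1;
}
```

A wrapper script can then treat `1` as a partial success worth inspecting and `2` as a run to retry.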
+
+## License
+
+MIT