claude-overnight 1.24.8 → 1.25.0

package/README.md CHANGED
@@ -1,14 +1,10 @@
1
1
  # claude-overnight
2
2
 
3
- **A background lane for your Claude Max plan.** Runs a capped swarm of Claude Agent SDK sessions in isolated git worktrees -- stops at a usage cap you set, so your interactive Claude Code always has headroom. Rate-limited? It waits. Crash? It resumes with full context.
3
+ Parallel Claude agents in isolated git worktrees. Set a usage cap so your interactive Claude Code keeps its headroom. Rate-limited? It waits. Crash? It resumes with full context.
4
4
 
5
- Your Max plan rate limits eat interactive coding time. One deep refactor and the 5-hour window is gone before lunch. `claude-overnight` runs background agent sessions up to the percentage cap you pick (90% is typical), leaving the rest free for your own Claude Code session. Hand it an objective and a session budget, walk away, review the diff when the run ends.
5
+ Hand it an objective and a session budget, walk away, review the diff when the run ends. Every agent runs in its own worktree on its own branch, so a misbehaving agent can't trash your working tree. Unmerged branches are preserved for manual review, never discarded.
6
6
 
7
- Cursor API Proxy supported -- route through Cursor's model gateway for Composer-powered execution on `auto`, `composer`, or `composer-2` models. See **Run via Cursor API Proxy** below.
8
-
9
- Isolated by default. Every agent runs in its own git worktree on its own branch, so a misbehaving agent can't trash your working tree. You choose what agents can do before the run starts -- no surprise escalation mid-flight. Unmerged branches are preserved for manual review, never discarded. Built on the [Claude Agent SDK](https://www.npmjs.com/package/@anthropic-ai/claude-agent-sdk) -- not a Claude Code replacement, but a background lane that runs alongside it.
10
-
11
- Different shape from hosted agent harnesses like [Claude Managed Agents](https://platform.claude.com/docs/en/managed-agents/overview): instead of one agent in one cloud container billed separately, you get many parallel sessions on your own machine, in your real repo, against your own Max plan (or API key). Works with Claude Opus, Sonnet, and Haiku -- or pair an Anthropic planner with a cheaper executor on Qwen, OpenRouter, or any Anthropic-compatible endpoint.
7
+ Built on the [Claude Agent SDK](https://www.npmjs.com/package/@anthropic-ai/claude-agent-sdk). Pair any planner (Opus, Sonnet) with any executor: Anthropic, Cursor, Qwen, OpenRouter, or any Anthropic-compatible endpoint.
12
8
 
13
9
  ## Run on Qwen 3.6 Plus
14
10
 
@@ -39,6 +35,27 @@ claude-overnight
39
35
 
40
36
  Use Cursor's model gateway as an executor -- `auto` (delegates to best available), `composer`, or `composer-2` models. Runs locally through a proxy that speaks the Anthropic Messages API, so it's a drop-in replacement for any other provider.
41
37
 
38
+ ### macOS: Cursor agent shell patch
39
+
40
+ On macOS, Cursor's `agent` / `cursor-agent` CLI often misbehaves because it uses a bundled Node.js. Add this to `~/.zshrc` so the `agent` command runs the real script with your **system** Node (then `source ~/.zshrc` or open a new terminal):
41
+
42
+ ```bash
43
+ # Force Cursor Agent to use System Node.js
44
+ run_cursor_agent() {
45
+ # Find the real directory of the cursor-agent script (resolves symlinks)
46
+ local agent_path="$(command -v cursor-agent)"
47
+ local script_dir="$(dirname "$(realpath "$agent_path")")"
48
+
49
+ # Run the core JS file directly with your system node
50
+ node "$script_dir/index.js" "$@"
51
+ }
52
+
53
+ # Overwrite any existing 'agent' alias to use our custom function
54
+ alias agent="run_cursor_agent"
55
+ ```
56
+
57
+ `claude-overnight` prints a one-time notice when you use the Cursor proxy and this snippet is not detected in `~/.zshrc` or `~/.zprofile`. The bundled proxy also sets `CURSOR_AGENT_NODE` / `CURSOR_AGENT_SCRIPT` when it can find `node` and `cursor-agent`, but your interactive shell still benefits from the alias.
58
+
42
59
  1. **Install the Cursor CLI and proxy:**
43
60
 
44
61
  ```bash
@@ -68,6 +85,24 @@ claude-overnight
68
85
 
69
86
  **Tip:** run `claude-overnight` with the `--model=cursor-auto` flag in non-interactive mode to skip the picker. If the proxy isn't running at startup, a warning is shown but Anthropic providers remain available.
70
87
 
88
+ ### macOS: “Keychain Not Found” / `cursor-user`
89
+
90
+ The Cursor **`agent`** binary stores an interactive login as **`cursor-user`** in your **login** keychain. For automation, use a **[User API key](https://cursor.com/docs/cli/headless)** (`export CURSOR_API_KEY=...` from [Integrations](https://cursor.com/dashboard/integrations)) — the bundled proxy then does not need Keychain. `claude-overnight` forces `CURSOR_SKIP_KEYCHAIN=1` and `CI=true`; if System Settings still shows **“A keychain cannot be found to store …”**, the login keychain is often missing or damaged: open **Keychain Access → First Aid** on **login**, or use **Reset To Defaults** in the dialog. Some users fix a stuck keychain with:
91
+
92
+ ```bash
93
+ security unlock-keychain ~/Library/Keychains/login.keychain-db
94
+ ```
95
+
96
+ **Automation:** Saving a key via **Cursor…** in `claude-overnight` is enough — it is written to `providers.json` and injected into both the Claude SDK env and the bundled proxy (including `CURSOR_API_KEY` for the native `agent`). You do not need to `export` variables unless you want to override for one shell.
97
+
98
+ **Advanced:** If something else must share port `8765` and you manage the proxy yourself, set `CURSOR_OVERNIGHT_NO_PROXY_RESTART=1` to skip the automatic “replace listener” step when a Cursor API token is present.
99
+
100
+ **How headless Cursor + macOS Keychain actually works (discovery):** We documented the full investigation: why ACP + skip-authenticate + `CURSOR_API_KEY` were not enough, how **chat-only workspace** (default in cursor-composer) fakes `HOME` and still triggered **Keychain timeouts** despite a User API key, and how **`composer-2-fast`** can fail the ACP smoke test for reasons unrelated to Keychain. See **[docs/CURSOR_PROXY_MACOS_DISCOVERY.md](docs/CURSOR_PROXY_MACOS_DISCOVERY.md)**.
101
+
102
+ **Quick reference — bundled proxy env:** `CURSOR_BRIDGE_ACP_SKIP_AUTHENTICATE=1`, `CURSOR_BRIDGE_USE_ACP=1`, `CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE=false`, plus `CURSOR_API_KEY` / `CURSOR_AUTH_TOKEN` / `CURSOR_BRIDGE_API_KEY` and `CURSOR_SKIP_KEYCHAIN=1` / `CI=true`. Details and tables are in the doc above.
103
+
104
+ **Regression / stress test:** `npm run matrix:cursor-proxy` (optional `--quick`, `--include-danger`). Use `MATRIX_MODELS=composer-2,composer-2-fast` to compare models; override `MATRIX_PORT_BASE`, `MATRIX_MODEL`, `MATRIX_MSG_TIMEOUT_MS` as needed.
105
+
71
106
  ## Install
72
107
 
73
108
  ```bash
@@ -126,24 +161,9 @@ claude-overnight
126
161
 
127
162
  You interact once (objective, budget, model, review themes), then the rest runs unattended -- thinking, planning, executing, reflecting, steering. Rate-limited? It waits and retries. Crash? Resume where you left off. Capped at usage limit? Pick up next time with full context preserved.
128
163
 
129
- ## How it differs
130
-
131
- - vs **Claude Code**: many agents, no driver, capped so your Claude Code session keeps its headroom
132
- - vs **[Managed Agents](https://platform.claude.com/docs/en/managed-agents/overview)**: on your machine, against your Max plan, in your real git history -- not a cloud container billed separately
133
- - vs **Cursor / Copilot / Cline**: asynchronous, off the keyboard
134
-
135
164
  ## Use cases
136
165
 
137
- - **Overnight refactors** -- "Modernize the auth system" at budget 200.
138
- - **Batch feature implementation** -- dozens of features from a task file, parallelized.
139
- - **Codebase-wide cleanups** -- deduplicate, simplify, rename, normalize.
140
- - **Test generation at scale** -- integration tests for every route or module.
141
- - **Documentation sprints** -- API docs, READMEs, inline comments, changelogs.
142
- - **Framework migrations** -- version upgrades, type annotations, config format swaps.
143
- - **Quality audits** -- reflection waves surface architectural issues and code smells.
144
- - **Long research runs** -- architect sessions explore a large codebase before any code lands.
145
-
146
- Typical shape: one objective + a $20–$200 spend cap + walk away.
166
+ Overnight refactors, batch feature implementation, codebase-wide cleanups, test generation, documentation sprints, framework migrations, quality audits, long research runs. One objective + a budget + walk away.
147
167
 
148
168
  ## How it works
149
169
 
@@ -1 +1 @@
1
- export declare const VERSION = "1.24.8";
1
+ export declare const VERSION = "1.25.0";
package/dist/_version.js CHANGED
@@ -1,2 +1,2 @@
1
1
  // Auto-generated by build — do not edit manually.
2
- export const VERSION = "1.24.8";
2
+ export const VERSION = "1.25.0";
package/dist/bin.js CHANGED
@@ -4,6 +4,11 @@
4
4
  // rest of the module graph takes several seconds on a cold cache -- without
5
5
  // this, the terminal sits black that whole time. index.ts stops the splash
6
6
  // via `globalThis.__coStopSplash` as soon as its header is about to print.
7
+ // Cursor agent: never inherit a shell that disabled keychain skip (`CI=0`,
8
+ // empty `CURSOR_SKIP_KEYCHAIN`) — the Cursor CLI may prompt for "cursor-user"
9
+ // and block preflight. Force like cursor-composer-in-claude/dist/cli.js (not ??=).
10
+ process.env.CURSOR_SKIP_KEYCHAIN = "1";
11
+ process.env.CI = "true";
7
12
  const argv = process.argv.slice(2);
8
13
  const quiet = argv.includes("-h") || argv.includes("--help") || argv.includes("-v") || argv.includes("--version");
9
14
  if (!quiet && process.stdout.isTTY) {
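The comment added in this hunk turns on the difference between `??=` and plain assignment. The sketch below is only an illustration: a plain object stands in for `process.env`, and the helper names `inheritedEnv`, `applyDefaults`, and `applyForced` are hypothetical, not part of the package.

```javascript
// Stand-in for an environment inherited from a parent shell that disabled the skip flag.
function inheritedEnv() {
  return { CURSOR_SKIP_KEYCHAIN: "0", CI: "0" };
}

// 1.24.8 style: `??=` only assigns when the current value is null or undefined,
// so an inherited "0" (or an empty string) is left in place.
function applyDefaults(env) {
  env.CURSOR_SKIP_KEYCHAIN ??= "1";
  return env;
}

// 1.25.0 style: overwrite unconditionally.
function applyForced(env) {
  env.CURSOR_SKIP_KEYCHAIN = "1";
  env.CI = "true";
  return env;
}

const defaulted = applyDefaults(inheritedEnv());
const forced = applyForced(inheritedEnv());

console.log(defaulted.CURSOR_SKIP_KEYCHAIN); // prints 0
console.log(forced.CURSOR_SKIP_KEYCHAIN, forced.CI); // prints 1 true
```

With `??=` the inherited `"0"` survives, which appears to be the case the forced assignment is meant to close.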
@@ -14,7 +14,6 @@
14
14
  import { modelDisplayName, formatContextWindow } from "./models.js";
15
15
  export const CURSOR_PRIORITY_MODELS = [
16
16
  { id: "composer-2", label: "composer-2", hint: "Cursor Composer 2 — latest, strongest Cursor model" },
17
- { id: "composer-2-fast", label: "composer-2-fast", hint: "Cursor Composer 2 Fast — faster, cheaper variant" },
18
17
  { id: "auto", label: "auto", hint: "auto-delegates to the best available model" },
19
18
  ];
20
19
  export const CURSOR_KNOWN_MODELS = [
package/dist/index.js CHANGED
@@ -9,7 +9,7 @@ import { Swarm } from "./swarm.js";
9
9
  import { planTasks, refinePlan, identifyThemes, buildThinkingTasks, orchestrate, salvageFromFile } from "./planner.js";
10
10
  import { modelDisplayName, formatContextWindow, DEFAULT_MODEL } from "./models.js";
11
11
  import { setPlannerEnvResolver } from "./planner-query.js";
12
- import { pickModel, loadProviders, preflightProvider, buildEnvResolver, healthCheckCursorProxy, PROXY_DEFAULT_URL, isCursorProxyProvider, ensureCursorProxyRunning, bundledComposerProxyShellCommand, } from "./providers.js";
12
+ import { pickModel, loadProviders, preflightProvider, buildEnvResolver, healthCheckCursorProxy, PROXY_DEFAULT_URL, isCursorProxyProvider, ensureCursorProxyRunning, bundledComposerProxyShellCommand, warnMacCursorAgentShellPatchIfNeeded, hasCursorAgentToken, } from "./providers.js";
13
13
  import { RunDisplay } from "./ui.js";
14
14
  import { renderSummary } from "./render.js";
15
15
  import { executeRun } from "./run.js";
@@ -158,8 +158,9 @@ async function promptResumeOverrides(state, cliFlags, argv, noTTY, runDir) {
158
158
  console.log();
159
159
  }
160
160
  async function main() {
161
- // Prevent macOS keychain popups from the Cursor CLI agent subprocess.
162
- process.env.CURSOR_SKIP_KEYCHAIN ??= "1";
161
+ // Same as bin.ts: do not use ??=; the parent shell can set CI=0 / CURSOR_SKIP_KEYCHAIN=0.
162
+ process.env.CURSOR_SKIP_KEYCHAIN = "1";
163
+ process.env.CI = "true";
163
164
  const argv = process.argv.slice(2);
164
165
  if (argv.includes("-v") || argv.includes("--version")) {
165
166
  const __dirname = dirname(fileURLToPath(import.meta.url));
@@ -220,6 +221,7 @@ async function main() {
220
221
  // ── Pre-check: warn if saved Cursor providers exist but proxy is down ──
221
222
  const savedCursorProviders = loadProviders().filter(isCursorProxyProvider);
222
223
  if (savedCursorProviders.length > 0 && !dryRun) {
224
+ warnMacCursorAgentShellPatchIfNeeded();
223
225
  const proxyUp = await healthCheckCursorProxy();
224
226
  if (!proxyUp) {
225
227
  console.warn(chalk.yellow(`\n ⚠ ${savedCursorProviders.length} Cursor provider(s) saved but proxy is not running at ${PROXY_DEFAULT_URL}`));
@@ -513,15 +515,10 @@ async function main() {
513
515
  mergeStrategy = resumeState.mergeStrategy;
514
516
  }
515
517
  else if (!nonInteractive) {
516
- while (true) {
517
- objective = await ask(`\n ${chalk.cyan("①")} ${chalk.bold("What should the agents do?")}\n ${chalk.cyan(">")} `);
518
- if (!objective) {
519
- console.error(chalk.red("\n No objective provided."));
520
- process.exit(1);
521
- }
522
- if (objective.split(/\s+/).length >= 5)
523
- break;
524
- console.log(chalk.yellow(' Be specific, e.g. "refactor the auth module, add tests, and update docs"'));
518
+ objective = (await ask(`\n ${chalk.cyan("①")} ${chalk.bold("What should the agents do?")}\n ${chalk.cyan(">")} `)).trim();
519
+ if (!objective) {
520
+ console.error(chalk.red("\n No objective provided."));
521
+ process.exit(1);
525
522
  }
526
523
  const modelsPromise = fetchModels();
527
524
  const budgetAns = await ask(`\n ${chalk.cyan("②")} ${chalk.dim("Budget")} ${chalk.dim("[")}${chalk.white("10")}${chalk.dim("]:")} `);
@@ -763,18 +760,43 @@ async function main() {
763
760
  cursorProxies.push(p);
764
761
  }
765
762
  }
766
- // Auto-start cursor proxy before pinging
763
+ // Auto-start cursor proxy before pinging (restarts when a token exists so stale listeners get CURSOR_API_KEY).
767
764
  if (cursorProxies.length > 0) {
768
765
  await ensureCursorProxyRunning();
766
+ if (!hasCursorAgentToken()) {
767
+ console.error(chalk.red(` ✗ Cursor models require a User API key — add it via ${chalk.bold("Cursor…")} setup, or set ` +
768
+ `${chalk.bold("CURSOR_API_KEY")} / ${chalk.bold("CURSOR_BRIDGE_API_KEY")}, or ${chalk.bold("cursorApiKey")} in providers.json.`));
769
+ console.error(chalk.dim(` Without it the Cursor CLI falls back to macOS Keychain (\`cursor-user\`).`));
770
+ process.exit(1);
771
+ }
769
772
  }
770
773
  process.stdout.write(` ${chalk.dim(`◆ Pinging ${pending.map(([r, p]) => `${r} (${p.displayName})`).join(", ")}…`)}\n`);
771
- const results = await Promise.all(pending.map(async ([role, p]) => ({
772
- role,
773
- provider: p,
774
- result: await preflightProvider(p, cwd, 20_000, {
775
- onProgress: (msg) => process.stdout.write(chalk.dim(` ${msg}\n`)),
776
- }),
777
- })));
774
+ // Cursor proxy: each saved model is a distinct provider id (`cursor-composer-2`, etc.), so
775
+ // planner + executor + fast can schedule multiple preflights. The bundled proxy typically
776
+ // handles one agent query at a time — parallel preflights starve each other and hit the
777
+ // 20s timeout. Run non-proxy checks in parallel, then cursor proxy checks one at a time
778
+ // (preserve original `pending` order for messages).
779
+ const progress = (msg) => process.stdout.write(chalk.dim(` ${msg}\n`));
780
+ /** Cursor agent cold start + model variance can exceed 20s; API providers stay tight. */
781
+ const preflightMs = (p) => isCursorProxyProvider(p) ? 60_000 : 20_000;
782
+ const nonCursorIdx = [];
783
+ const cursorIdx = [];
784
+ for (let i = 0; i < pending.length; i++) {
785
+ if (isCursorProxyProvider(pending[i][1]))
786
+ cursorIdx.push(i);
787
+ else
788
+ nonCursorIdx.push(i);
789
+ }
790
+ const slot = Array.from({ length: pending.length });
791
+ await Promise.all(nonCursorIdx.map(async (i) => {
792
+ const [role, p] = pending[i];
793
+ slot[i] = { role, provider: p, result: await preflightProvider(p, cwd, preflightMs(p), { onProgress: progress }) };
794
+ }));
795
+ for (const i of cursorIdx) {
796
+ const [role, p] = pending[i];
797
+ slot[i] = { role, provider: p, result: await preflightProvider(p, cwd, preflightMs(p), { onProgress: progress }) };
798
+ }
799
+ const results = slot;
778
800
  for (const { role, provider, result } of results) {
779
801
  if (!result.ok) {
780
802
  console.error(chalk.red(` ✗ ${role} preflight failed: ${chalk.dim(result.error)}`));
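The scheduling change in this hunk (direct providers checked in parallel, proxy-backed providers one at a time, results kept in the original order) can be sketched in isolation. Everything here is a stand-in: `fakePreflight` simulates a check with a short timer, and the `isProxy` flag plays the role of `isCursorProxyProvider`. It is a minimal illustration, not the package's code.

```javascript
// Hypothetical sample data: one direct provider, two behind a single-query proxy.
const pending = [
  ["planner", { name: "direct-a", isProxy: false }],
  ["executor", { name: "proxy-a", isProxy: true }],
  ["fast", { name: "proxy-b", isProxy: true }],
];

let proxyInFlight = 0;
let maxProxyInFlight = 0;

// Simulated preflight: tracks how many proxy checks overlap.
async function fakePreflight(provider) {
  if (provider.isProxy) {
    proxyInFlight++;
    maxProxyInFlight = Math.max(maxProxyInFlight, proxyInFlight);
  }
  await new Promise((resolve) => setTimeout(resolve, 5));
  if (provider.isProxy) proxyInFlight--;
  return { ok: true };
}

async function runPreflights(entries) {
  const slot = new Array(entries.length);
  const proxyIdx = [];
  const directIdx = [];
  entries.forEach(([, p], i) => (p.isProxy ? proxyIdx : directIdx).push(i));

  // Direct providers: all at once.
  await Promise.all(directIdx.map(async (i) => {
    const [role, provider] = entries[i];
    slot[i] = { role, provider, result: await fakePreflight(provider) };
  }));

  // Proxy-backed providers: strictly one after another.
  for (const i of proxyIdx) {
    const [role, provider] = entries[i];
    slot[i] = { role, provider, result: await fakePreflight(provider) };
  }
  return slot; // index i still corresponds to entries[i]
}

const resultsPromise = runPreflights(pending);
resultsPromise.then((results) => {
  console.log(results.map((r) => r.role).join(", "));
  console.log("max concurrent proxy checks:", maxProxyInFlight);
});
```

Writing into a pre-sized `slot` array by index is what keeps the failure messages in `pending` order even though the two groups finish at different times.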
package/dist/models.d.ts CHANGED
@@ -1,5 +1,6 @@
1
1
  export interface ModelCapability {
2
2
  contextWindow: number;
3
+ safeContext: number;
3
4
  contextConstraint: "tight" | "moderate" | "relaxed";
4
5
  /** Human-readable label for UI display. Falls back to the model key if absent. */
5
6
  displayName?: string;
@@ -16,7 +17,8 @@ export declare function getModelCapability(model: string): ModelCapability;
16
17
  export declare function modelDisplayName(model: string): string;
17
18
  /**
18
19
  * Context constraint instruction injected into planner prompts.
19
- * Tells the planner how to scope tasks based on the worker model's context.
20
+ * Uses safeContext (not declared contextWindow) so planners scope tasks
21
+ * to what the model can actually handle reliably.
20
22
  */
21
23
  export declare function contextConstraintNote(model: string): string;
22
24
  /** Format context window for display (e.g. "256K"). */
package/dist/models.js CHANGED
@@ -4,33 +4,58 @@
4
4
  // arrive (which happens basically daily). Each entry describes what the model
5
5
  // can handle in terms of context and task scoping.
6
6
  //
7
- // contextConstraint:
8
- // "tight" -- small context window. Model is lazy and error-prone on big
9
- // tasks. Needs surgical, hyper-specific instructions.
10
- // "moderate" -- decent context. Can handle focused missions but may lose
11
- // thread on sprawling codebases.
12
- // "relaxed" — large context. Can read most of the codebase at once,
13
- // reliably own multi-file features with autonomy.
7
+ // contextWindow — declared/advertised context (shown in UI)
8
+ // safeContext -- conservative usable context, ≤40% of declared, adjusted for
9
+ // model quality. This is what planners use to scope tasks.
10
+ // Based on: RULER benchmarks, "lost in the middle" research,
11
+ // Chroma context-rot studies, and real-world experience.
12
+ //
13
+ // contextConstraint combines usable context AND model laziness/diligence:
14
+ // "tight" — lazy or small context. Needs surgical, hyper-specific tasks.
15
+ // "moderate" — decent. Focused missions with clear targets.
16
+ // "relaxed" — large usable context + low laziness. Full autonomy.
17
+ //
18
+ // Laziness source: IFEval scores, Ian Paterson 38-task routing benchmark,
19
+ // Chroma hallucination study. "relaxed" = 95%+ on all three axes.
14
20
  export const MODEL_CAPABILITIES = {
15
- // ── Anthropic Claude 4.5 / 4.6 ──
16
- "claude-sonnet-4-6": { contextWindow: 256_000, contextConstraint: "relaxed", displayName: "Sonnet 4.6" },
17
- "claude-sonnet-4-5": { contextWindow: 256_000, contextConstraint: "relaxed", displayName: "Sonnet 4.5" },
18
- "claude-opus-4-6": { contextWindow: 200_000, contextConstraint: "relaxed", displayName: "Opus 4.6" },
19
- "claude-opus-4-5": { contextWindow: 200_000, contextConstraint: "relaxed", displayName: "Opus 4.5" },
20
- "claude-opus-4-20250514": { contextWindow: 200_000, contextConstraint: "relaxed", displayName: "Opus 4" },
21
- "claude-haiku-4-5": { contextWindow: 200_000, contextConstraint: "moderate", displayName: "Haiku 4.5" },
22
- "claude-haiku-4-5-20251001": { contextWindow: 200_000, contextConstraint: "moderate", displayName: "Haiku 4.5" },
23
- // ── Cursor models ──
24
- "auto": { contextWindow: 256_000, contextConstraint: "relaxed", displayName: "Cursor Auto" },
25
- "composer-2": { contextWindow: 200_000, contextConstraint: "relaxed", displayName: "Composer 2" },
26
- "composer-2-fast": { contextWindow: 128_000, contextConstraint: "moderate", displayName: "Composer 2 Fast" },
27
- "composer": { contextWindow: 128_000, contextConstraint: "moderate", displayName: "Composer" },
28
- // ── Qwen (via DashScope / custom provider) ──
29
- "qwen3.6-plus": { contextWindow: 131_072, contextConstraint: "moderate", displayName: "Qwen 3.6 Plus" },
30
- "qwen3-coder": { contextWindow: 262_144, contextConstraint: "relaxed", displayName: "Qwen 3 Coder" },
31
- "qwen-max": { contextWindow: 32_768, contextConstraint: "tight", displayName: "Qwen Max" },
32
- // ── Fallback for unknown models ──
33
- "unknown": { contextWindow: 128_000, contextConstraint: "moderate" },
21
+ // ── Anthropic Claude (Apr 2026) ──
22
+ // Opus: only model that earns "relaxed". 100% on 38-task routing, 95%+ IFEval.
23
+ "claude-opus-4-6": { contextWindow: 1_000_000, safeContext: 400_000, contextConstraint: "relaxed", displayName: "Opus 4.6" },
24
+ // Sonnet: good but loses thread more than Opus on autonomous multi-file work.
25
+ "claude-sonnet-4-6": { contextWindow: 1_000_000, safeContext: 300_000, contextConstraint: "moderate", displayName: "Sonnet 4.6" },
26
+ // Haiku: cheapest Claude. Skips steps more often. No 1M upgrade.
27
+ "claude-haiku-4-5": { contextWindow: 200_000, safeContext: 60_000, contextConstraint: "moderate", displayName: "Haiku 4.5" },
28
+ "claude-haiku-4-5-20251001": { contextWindow: 200_000, safeContext: 60_000, contextConstraint: "moderate", displayName: "Haiku 4.5" },
29
+ // ── OpenAI (Apr 2026 — GPT-4.1/o3/o4-mini retired Feb 2026) ──
30
+ // GPT-5.4: current flagship. 1M context, 128K output. Good but literal.
31
+ "gpt-5.4": { contextWindow: 1_050_000, safeContext: 300_000, contextConstraint: "moderate", displayName: "GPT-5.4" },
32
+ "gpt-5.4-mini": { contextWindow: 1_050_000, safeContext: 200_000, contextConstraint: "moderate", displayName: "GPT-5.4 Mini" },
33
+ // Codex 5.3: best agentic coder from OpenAI. 400K context, 128K output.
34
+ "gpt-5.3-codex": { contextWindow: 400_000, safeContext: 160_000, contextConstraint: "moderate", displayName: "Codex 5.3" },
35
+ // ── Google Gemini 3 (Apr 2026; Gemini 2.5 deprecated June 2026) ──
36
+ // Large context but terrible at agentic coding: 13.5% SWE-bench (vs Sonnet 31.2%).
37
+ // Good for reading lots of code, bad at following through. Needs surgical tasks.
38
+ "gemini-3.1-pro": { contextWindow: 1_000_000, safeContext: 350_000, contextConstraint: "tight", displayName: "Gemini 3.1 Pro" },
39
+ "gemini-3-pro": { contextWindow: 1_000_000, safeContext: 350_000, contextConstraint: "tight", displayName: "Gemini 3 Pro" },
40
+ // Flash: 8.2% SWE-bench. Essentially unusable for autonomous agent work.
41
+ "gemini-3-flash": { contextWindow: 1_000_000, safeContext: 250_000, contextConstraint: "tight", displayName: "Gemini 3 Flash" },
42
+ // ── DeepSeek V3.2 (Apr 2026 — V3/R1 superseded, V4 not yet out) ──
43
+ "deepseek-chat": { contextWindow: 128_000, safeContext: 40_000, contextConstraint: "tight", displayName: "DeepSeek V3.2" },
44
+ "deepseek-reasoner": { contextWindow: 128_000, safeContext: 45_000, contextConstraint: "moderate", displayName: "DeepSeek V3.2 Reasoner" },
45
+ // ── Meta Llama 4 (Apr 2025 — still latest open-weight) ──
46
+ // Scout: claims 10M via iRoPE, providers cap at ~327K. No independent validation.
47
+ "llama-4-scout": { contextWindow: 327_680, safeContext: 80_000, contextConstraint: "moderate", displayName: "Llama 4 Scout" },
48
+ "llama-4-maverick": { contextWindow: 1_000_000, safeContext: 100_000, contextConstraint: "moderate", displayName: "Llama 4 Maverick" },
49
+ // ── Cursor models (opaque routing) ──
50
+ "auto": { contextWindow: 256_000, safeContext: 60_000, contextConstraint: "moderate", displayName: "Cursor Auto" },
51
+ "composer-2": { contextWindow: 200_000, safeContext: 40_000, contextConstraint: "tight", displayName: "Composer 2" },
52
+ "composer": { contextWindow: 128_000, safeContext: 30_000, contextConstraint: "tight", displayName: "Composer" },
53
+ // ── Qwen (Apr 2026 — qwen3.6-plus is newest flagship) ──
54
+ "qwen3.6-plus": { contextWindow: 1_000_000, safeContext: 200_000, contextConstraint: "moderate", displayName: "Qwen 3.6 Plus" },
55
+ "qwen3-coder-plus": { contextWindow: 1_000_000, safeContext: 200_000, contextConstraint: "moderate", displayName: "Qwen 3 Coder Plus" },
56
+ "qwen3-max": { contextWindow: 262_144, safeContext: 60_000, contextConstraint: "moderate", displayName: "Qwen 3 Max" },
57
+ // ── Fallback — unknown models get maximum caution ──
58
+ "unknown": { contextWindow: 128_000, safeContext: 40_000, contextConstraint: "tight" },
34
59
  };
35
60
  // ── Default / fallback models ──
36
61
  export const DEFAULT_MODEL = "claude-sonnet-4-6";
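The header comment states that `safeContext` is at most 40% of the declared `contextWindow`. One way to keep a table like this honest is a small consistency check. The sketch below copies a few entries from the diff; `findOverBudget` is a hypothetical helper, not something the package ships.

```javascript
// A few entries copied from the new table; the rest are omitted for brevity.
const SAMPLE_CAPABILITIES = {
  "claude-opus-4-6": { contextWindow: 1_000_000, safeContext: 400_000 },
  "claude-haiku-4-5": { contextWindow: 200_000, safeContext: 60_000 },
  "gpt-5.3-codex": { contextWindow: 400_000, safeContext: 160_000 },
  "composer-2": { contextWindow: 200_000, safeContext: 40_000 },
  "unknown": { contextWindow: 128_000, safeContext: 40_000 },
};

// Returns the model keys whose safeContext exceeds maxPercent of contextWindow.
// Integer arithmetic avoids floating-point surprises at the exact 40% boundary.
function findOverBudget(table, maxPercent = 40) {
  return Object.entries(table)
    .filter(([, cap]) => cap.safeContext * 100 > cap.contextWindow * maxPercent)
    .map(([model]) => model);
}

const offenders = findOverBudget(SAMPLE_CAPABILITIES);
console.log(offenders.length === 0 ? "all sampled entries within 40%" : offenders);
```

Opus 4.6 and Codex 5.3 sit exactly on the 40% line, so the comparison is strict (`>`), not `>=`.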
@@ -69,18 +94,19 @@ export function modelDisplayName(model) {
69
94
  }
70
95
  /**
71
96
  * Context constraint instruction injected into planner prompts.
72
- * Tells the planner how to scope tasks based on the worker model's context.
97
+ * Uses safeContext (not declared contextWindow) so planners scope tasks
98
+ * to what the model can actually handle reliably.
73
99
  */
74
100
  export function contextConstraintNote(model) {
75
101
  const cap = getModelCapability(model);
76
- const ctx = Math.round(cap.contextWindow / 1000);
102
+ const safe = Math.round(cap.safeContext / 1000);
77
103
  switch (cap.contextConstraint) {
78
104
  case "tight":
79
- return `Worker agents have a TIGHT context window (~${ctx}K tokens). They are prone losing thread on large tasks. Be hyper-specific: name exact files, functions, and changes. One narrow deliverable per task. No ambiguity.`;
105
+ return `Worker agents have a TIGHT usable context (~${safe}K tokens). They lose thread and skip steps on large tasks. Be hyper-specific: name exact files, functions, and changes. One narrow deliverable per task. No ambiguity.`;
80
106
  case "moderate":
81
- return `Worker agents have a moderate context window (~${ctx}K tokens). They can handle focused missions but may struggle with sprawling codebases. Be specific about files and expected outcomes. Scope tasks to clear, concrete deliverables.`;
107
+ return `Worker agents have a moderate usable context (~${safe}K tokens). They can handle focused missions but may struggle with sprawling tasks. Be specific about target files and expected outcomes. Scope tasks to clear, concrete deliverables — not open-ended explorations.`;
82
108
  case "relaxed":
83
- return `Worker agents have a large context window (~${ctx}K tokens). They can read most of the codebase at once and reliably own multi-file features. Give them missions with full autonomy — "Design and implement X" not "edit line 42 of Y.ts".`;
109
+ return `Worker agents have ~${safe}K usable tokens and high instruction-following. They can own multi-file features with autonomy. Give them missions — "Design and implement X" not "edit line 42 of Y.ts".`;
84
110
  }
85
111
  }
86
112
  /** Format context window for display (e.g. "256K"). */
@@ -62,6 +62,17 @@ export declare function preflightProvider(p: ProviderConfig, cwd: string, timeou
62
62
  export declare const PROXY_DEFAULT_URL = "http://127.0.0.1:8765";
63
63
  /** Check if a provider routes through cursor-composer-in-claude. */
64
64
  export declare function isCursorProxyProvider(p: ProviderConfig): boolean;
65
+ /** True if ~/.zshrc / ~/.zprofile contain the `run_cursor_agent` workaround (see README). */
66
+ export declare function hasCursorMacAgentZshPatch(): boolean;
67
+ /**
68
+ * On macOS, if the Cursor `agent` / `cursor-agent` CLI is installed but the zsh
69
+ * workaround is missing, print once. See README: macOS Cursor agent shell patch.
70
+ */
71
+ export declare function warnMacCursorAgentShellPatchIfNeeded(): void;
72
+ /** True when a User API key (or bridge key) is available for Cursor agent + proxy. */
73
+ export declare function hasCursorAgentToken(): boolean;
74
+ /** Resolved token for tests/diagnostics (never log the return value). */
75
+ export declare function getCursorAgentToken(): string | null;
65
76
  /**
66
77
  * Health check: GET /health on the proxy. Returns true if proxy is reachable.
67
78
  * Passes the stored API key so the /health endpoint doesn't return 401.
package/dist/providers.js CHANGED
@@ -108,9 +108,19 @@ export function envFor(p) {
108
108
  base[k] = v;
109
109
  if (p.cursorProxy) {
110
110
  base.ANTHROPIC_BASE_URL = p.baseURL;
111
- const key = process.env.CURSOR_BRIDGE_API_KEY || p.cursorApiKey;
112
- base.ANTHROPIC_AUTH_TOKEN = key || "unused";
111
+ // HTTP Authorization to the proxy: bridge env > per-provider > any resolved agent token (env or providers.json).
112
+ const agentTok = resolveCursorAgentToken();
113
+ const bridgeBearer = process.env.CURSOR_BRIDGE_API_KEY?.trim() ||
114
+ p.cursorApiKey?.trim() ||
115
+ agentTok?.trim() ||
116
+ "";
117
+ base.ANTHROPIC_AUTH_TOKEN = bridgeBearer || "unused";
113
118
  delete base.ANTHROPIC_API_KEY;
119
+ // Native Cursor agent — same token so SDK and proxy never fall through to Keychain (`cursor-user`).
120
+ if (agentTok) {
121
+ base.CURSOR_API_KEY = agentTok;
122
+ base.CURSOR_AUTH_TOKEN = agentTok;
123
+ }
114
124
  // SDK replaces env for subprocesses — force these so nothing inherits a bad CI / skip flag.
115
125
  base.CI = "true";
116
126
  base.CURSOR_SKIP_KEYCHAIN = "1";
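The bearer precedence described in the new comment (bridge env, then per-provider key, then any resolved agent token, then the `"unused"` placeholder) can be read as a single fall-through chain. This is a hedged, pure-function restatement: `pickProxyBearer` and its parameters are hypothetical names, with `env` standing in for `process.env`.

```javascript
// env: stand-in for process.env; provider: a saved provider entry;
// agentToken: whatever the agent-token lookup returned (string or null).
function pickProxyBearer(env, provider, agentToken) {
  return (
    env.CURSOR_BRIDGE_API_KEY?.trim() ||
    provider.cursorApiKey?.trim() ||
    agentToken?.trim() ||
    "unused" // placeholder the diff uses when nothing resolves
  );
}

console.log(pickProxyBearer({ CURSOR_BRIDGE_API_KEY: "bridge" }, { cursorApiKey: "saved" }, "agent")); // prints bridge
console.log(pickProxyBearer({ CURSOR_BRIDGE_API_KEY: "   " }, { cursorApiKey: "saved" }, "agent")); // prints saved
console.log(pickProxyBearer({}, {}, "agent")); // prints agent
console.log(pickProxyBearer({}, {}, null)); // prints unused
```

Because each candidate is trimmed before the `||`, a whitespace-only value falls through to the next source instead of being sent as the bearer.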
@@ -323,6 +333,48 @@ export const PROXY_DEFAULT_URL = "http://127.0.0.1:8765";
323
333
  export function isCursorProxyProvider(p) {
324
334
  return p.cursorProxy === true || p.baseURL === PROXY_DEFAULT_URL;
325
335
  }
336
+ /** True if ~/.zshrc / ~/.zprofile contain the `run_cursor_agent` workaround (see README). */
337
+ export function hasCursorMacAgentZshPatch() {
338
+ let combined = "";
339
+ for (const f of [".zshrc", ".zprofile"]) {
340
+ try {
341
+ combined += readFileSync(join(homedir(), f), "utf8");
342
+ }
343
+ catch {
344
+ /* missing */
345
+ }
346
+ }
347
+ return /run_cursor_agent\s*\(/.test(combined) || /alias\s+agent=\s*['"]?run_cursor_agent['"]?/.test(combined);
348
+ }
349
+ let warnedMacCursorAgentPatch = false;
350
+ /**
351
+ * On macOS, if the Cursor `agent` / `cursor-agent` CLI is installed but the zsh
352
+ * workaround is missing, print once. See README: macOS Cursor agent shell patch.
353
+ */
354
+ export function warnMacCursorAgentShellPatchIfNeeded() {
355
+ if (warnedMacCursorAgentPatch || process.platform !== "darwin")
356
+ return;
357
+ let agentPath = "";
358
+ try {
359
+ agentPath = execSync("command -v cursor-agent 2>/dev/null || command -v agent 2>/dev/null", {
360
+ encoding: "utf8",
361
+ shell: "bash",
362
+ timeout: 3_000,
363
+ stdio: ["pipe", "pipe", "pipe"],
364
+ }).trim();
365
+ }
366
+ catch {
367
+ return;
368
+ }
369
+ if (!agentPath)
370
+ return;
371
+ if (hasCursorMacAgentZshPatch())
372
+ return;
373
+ warnedMacCursorAgentPatch = true;
374
+ console.warn(chalk.yellow("\n ⚠ macOS: Cursor's `agent` CLI is unreliable with its bundled Node.js."));
375
+ console.warn(chalk.dim(" Append the snippet from README (\"macOS: Cursor agent shell patch\") to ~/.zshrc, then run: source ~/.zshrc"));
376
+ console.warn("");
377
+ }
326
378
  /** Resolve the cursor-composer-in-claude API key from env or providers.json. */
327
379
  function resolveCursorProxyKey() {
328
380
  if (process.env.CURSOR_BRIDGE_API_KEY?.trim())
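The patch detection in `hasCursorMacAgentZshPatch` is a pair of regular-expression tests over the concatenated shell config. The sketch below reuses the two patterns from the diff against in-memory strings so their behavior is easy to see; `looksPatched` is a hypothetical wrapper, and no files are read.

```javascript
// Patterns copied from the diff.
const FUNCTION_RE = /run_cursor_agent\s*\(/;
const ALIAS_RE = /alias\s+agent=\s*['"]?run_cursor_agent['"]?/;

// True when the text contains either the function definition or the alias.
function looksPatched(shellConfigText) {
  return FUNCTION_RE.test(shellConfigText) || ALIAS_RE.test(shellConfigText);
}

console.log(looksPatched('alias agent="run_cursor_agent"')); // prints true
console.log(looksPatched("run_cursor_agent() {\n  node index.js\n}")); // prints true
console.log(looksPatched("export PATH=$HOME/bin:$PATH")); // prints false
console.log(looksPatched("# run_cursor_agent() { ... }")); // prints true
```

The last case is worth noting: the patterns do not exclude commented-out lines, so a disabled snippet still counts as present and suppresses the notice.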
@@ -332,6 +384,26 @@ function resolveCursorProxyKey() {
332
384
  return saved.cursorApiKey.trim();
333
385
  return null;
334
386
  }
387
+ /**
388
+ * Token for the native Cursor `agent` binary — same order as cursor-composer `loadBridgeConfig`
389
+ * (CURSOR_API_KEY → CURSOR_AUTH_TOKEN → bridge / stored). Without a real token the CLI tries
390
+ * login/keychain and macOS may show “Keychain Not Found” for `cursor-user`.
391
+ */
392
+ function resolveCursorAgentToken() {
393
+ if (process.env.CURSOR_API_KEY?.trim())
394
+ return process.env.CURSOR_API_KEY.trim();
395
+ if (process.env.CURSOR_AUTH_TOKEN?.trim())
396
+ return process.env.CURSOR_AUTH_TOKEN.trim();
397
+ return resolveCursorProxyKey();
398
+ }
399
+ /** True when a User API key (or bridge key) is available for Cursor agent + proxy. */
400
+ export function hasCursorAgentToken() {
401
+ return resolveCursorAgentToken() != null;
402
+ }
403
+ /** Resolved token for tests/diagnostics (never log the return value). */
404
+ export function getCursorAgentToken() {
405
+ return resolveCursorAgentToken();
406
+ }
335
407
  /** Build fetch options with the cursor proxy auth header if a key is available. */
336
408
  function cursorProxyFetchOpts() {
337
409
  const key = resolveCursorProxyKey();
@@ -544,9 +616,16 @@ async function isPortInUse(port, host = "127.0.0.1") {
544
616
  * Returns true when the proxy is reachable at PROXY_DEFAULT_URL.
545
617
  */
546
618
  export async function ensureCursorProxyRunning(baseUrl = PROXY_DEFAULT_URL, forceRestart = false) {
619
+ warnMacCursorAgentShellPatchIfNeeded();
547
620
  const url = new URL(baseUrl);
548
621
  const port = parseInt(url.port, 10) || 80;
549
- if (forceRestart && resolveCursorComposerCli()) {
622
+ // Stale listener on :8765 may have been started without CURSOR_API_KEY for the agent child.
623
+ // When we have a token, replace the listener by default so the bundled proxy always inherits it.
624
+ // Opt out: CURSOR_OVERNIGHT_NO_PROXY_RESTART=1 (e.g. shared port / external proxy).
625
+ const token = resolveCursorAgentToken();
626
+ const skipTokenRestart = process.env.CURSOR_OVERNIGHT_NO_PROXY_RESTART === "1";
627
+ const effectiveForce = forceRestart || (!!token && !skipTokenRestart);
628
+ if (effectiveForce && resolveCursorComposerCli()) {
550
629
  console.log(chalk.dim(` Replacing listener on port ${port} with bundled cursor-composer-in-claude…`));
551
630
  killProcessOnPort(port, url.hostname);
552
631
  await new Promise(r => setTimeout(r, 500));
@@ -600,10 +679,21 @@ async function startProxyProcess(baseUrl, url, port) {
600
679
  }
601
680
  }
602
681
  catch { }
603
- // Resolve the API key source for logging
604
- const apiKeyEnv = process.env.CURSOR_BRIDGE_API_KEY;
605
682
  const apiKeyStored = loadProviders().find(p => p.cursorProxy)?.cursorApiKey;
606
- const keySource = apiKeyEnv ? "env CURSOR_BRIDGE_API_KEY" : (apiKeyStored ? "providers.json (stored)" : "none — using 'unused'");
683
+ const agentToken = resolveCursorAgentToken();
684
+ if (!agentToken) {
685
+ console.log(chalk.red(` ✗ Cursor proxy needs a User API key so the agent does not use macOS Keychain (\`cursor-user\`).\n` +
686
+ ` Set ${chalk.bold("CURSOR_API_KEY")} (${chalk.dim("Cursor dashboard → Integrations / API Keys")}) ` +
687
+ `or complete the ${chalk.bold("Cursor…")} setup in claude-overnight (saved to providers.json).\n` +
688
+ ` See: ${chalk.dim("https://cursor.com/docs/cli/headless")}`));
689
+ return false;
690
+ }
691
+ const bridgeKey = process.env.CURSOR_BRIDGE_API_KEY?.trim() ||
692
+ apiKeyStored?.trim() ||
693
+ agentToken;
694
+ const keySource = process.env.CURSOR_BRIDGE_API_KEY?.trim()
695
+ ? "env CURSOR_BRIDGE_API_KEY"
696
+ : (apiKeyStored?.trim() ? "providers.json (stored)" : "mirrored from CURSOR_API_KEY / token");
607
697
  const proxyVersion = getEmbeddedComposerProxyVersion() ?? "unknown";
608
698
  const composerCli = resolveCursorComposerCli();
609
699
  if (!composerCli) {
@@ -617,21 +707,25 @@ async function startProxyProcess(baseUrl, url, port) {
617
707
  catch {
618
708
  cliResolved = composerCli;
619
709
  }
620
- const bridgeKey = apiKeyEnv || apiKeyStored || "unused";
621
710
  const proxyEnv = {
622
711
  ...Object.fromEntries(Object.entries(process.env).filter(([, v]) => v !== undefined)),
623
712
  CI: "true",
624
713
  CURSOR_BRIDGE_API_KEY: bridgeKey,
625
714
  CURSOR_SKIP_KEYCHAIN: "1",
715
+ // Always set — cursor-composer only forwards these to the agent; spread alone is not enough
716
+ // if the shell omitted CURSOR_API_KEY (GUI launches, etc.).
717
+ CURSOR_API_KEY: agentToken,
718
+ CURSOR_AUTH_TOKEN: agentToken,
719
+ // cursor-composer loadBridgeConfig: forces acpSkipAuthenticate so ACP never sends
720
+ // `authenticate` / `cursor_login` (that path touches macOS Keychain for `cursor-user`).
721
+ CURSOR_BRIDGE_ACP_SKIP_AUTHENTICATE: "1",
722
+ // Default bridge is useAcp=false → agent uses runStreaming; skip-authenticate only applies
723
+ // to runAcpStream. Force ACP so real traffic matches the headless/keychain-avoidance path.
724
+ CURSOR_BRIDGE_USE_ACP: "1",
725
+ // cursor-composer chat-only mode fakes HOME to a temp dir; on macOS the agent still waits on
726
+ // Keychain (~30s) for `cursor-user` despite CURSOR_API_KEY. Use the real workspace profile.
727
+ CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE: "false",
626
728
  };
627
- // cursor-composer-in-claude passes CURSOR_API_KEY / CURSOR_AUTH_TOKEN to the agent only from
628
- // these vars — not from CURSOR_BRIDGE_API_KEY. Without them the Cursor CLI falls back to
629
- // login/keychain (macOS dialogs, "cursor-user", hangs under preflight).
630
- const explicitAgentKey = process.env.CURSOR_API_KEY?.trim() || process.env.CURSOR_AUTH_TOKEN?.trim();
631
- if (!explicitAgentKey && bridgeKey !== "unused") {
632
- proxyEnv.CURSOR_API_KEY = bridgeKey;
633
- proxyEnv.CURSOR_AUTH_TOKEN = bridgeKey;
634
- }
635
729
  if (sysNode && agentJs) {
636
730
  proxyEnv.CURSOR_AGENT_NODE = sysNode;
637
731
  proxyEnv.CURSOR_AGENT_SCRIPT = agentJs;
@@ -644,12 +738,14 @@ async function startProxyProcess(baseUrl, url, port) {
644
738
  cliPath: cliResolved,
645
739
  nodeExec: process.execPath,
646
740
  apiKey: keySource,
647
- agentCursorKey: explicitAgentKey ? "env CURSOR_API_KEY or CURSOR_AUTH_TOKEN" : (bridgeKey === "unused" ? "none" : "mirrored from bridge key"),
741
+ agentCursorKey: "set (CURSOR_API_KEY / bridge / stored)",
648
742
  agentPaths: sysNode && agentJs ? { node: sysNode, script: agentJs } : undefined,
649
743
  childEnv: {
650
744
  CI: proxyEnv.CI,
651
745
  CURSOR_SKIP_KEYCHAIN: proxyEnv.CURSOR_SKIP_KEYCHAIN,
652
- CURSOR_API_KEY: proxyEnv.CURSOR_API_KEY ? "(set)" : "(unset)",
746
+ CURSOR_BRIDGE_USE_ACP: proxyEnv.CURSOR_BRIDGE_USE_ACP,
747
+ CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE: proxyEnv.CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE,
748
+ CURSOR_API_KEY: "(set)",
653
749
  },
654
750
  },
655
751
  })));
@@ -716,10 +812,7 @@ function setupSteps() {
716
812
  },
717
813
  {
718
814
  label: "Cursor API key",
719
- check: () => {
720
- const key = process.env.CURSOR_BRIDGE_API_KEY;
721
- return !!key && key.trim().length > 0;
722
- },
815
+ check: () => !!resolveCursorAgentToken(),
723
816
  autoCmd: "",
724
817
  manualCmd: "",
725
818
  successMsg: "Cursor API key configured",
@@ -0,0 +1,116 @@
1
+ # Cursor bundled proxy on macOS: Keychain, ACP, and what actually fixed it
2
+
3
+ This document records **why** the Cursor API proxy (`cursor-composer-in-claude`) triggered macOS Keychain dialogs and long hangs during automated runs, **what did not fix it**, and **which environment variables and model choices** make headless runs reliable. It is written for maintainers and for anyone debugging similar “it still asks for Keychain” reports.
4
+
5
+ ---
6
+
7
+ ## Context
8
+
9
+ - **claude-overnight** can bundle **cursor-composer-in-claude**, which exposes an Anthropic-compatible HTTP server and forwards requests to the Cursor **`agent`** CLI (often via **ACP**, the Agent Client Protocol over stdio).
10
+ - Headless use is supposed to rely on a **[User API key](https://cursor.com/docs/cli/headless)** (`CURSOR_API_KEY` / dashboard), not on interactive login stored as **`cursor-user`** in the login keychain.
11
+ - Despite setting `CURSOR_SKIP_KEYCHAIN=1`, `CI=true`, and API keys, macOS could still show Keychain UI or block for ~30s with errors like **`Keychain operation timed out after 30000ms`** in the proxy log (`~/.cursor-api-proxy/sessions.log` or stderr).
12
+
13
+ ---
14
+
15
+ ## Symptoms we saw
16
+
17
+ 1. **GUI:** System Keychain prompts, or “Keychain Not Found”-style dialogs for `cursor-user`.
18
+ 2. **Proxy logs:** `Agent error: Cursor CLI failed (exit 1): Error: Keychain operation timed out after 30000ms`.
19
+ 3. **Stress tests:** Every matrix row returning **HTTP 500** looked like one bug; in reality **two different failure modes** were mixed (see below).
20
+
21
+ ---
22
+
23
+ ## What we tried that was necessary but not sufficient
24
+
25
+ These are still **correct** to set; they address real issues, but they did **not** alone stop Keychain contention on macOS.
26
+
27
+ | Measure | Role |
28
+ |--------|------|
29
+ | **`CURSOR_SKIP_KEYCHAIN=1`** + **`CI=true`** | Cursor’s own convention to discourage interactive keychain probes in CI-style runs. |
30
+ | **`CURSOR_API_KEY` / `CURSOR_AUTH_TOKEN`** (User API key) | Headless auth for the native agent; must be injected into the **proxy process** env, not only the parent shell (GUI launches often omit them). |
31
+ | **`CURSOR_BRIDGE_API_KEY`** | HTTP bearer for the proxy’s `/health` and `/v1/*` routes; often mirrored from the same token. |
32
+ | **`CURSOR_BRIDGE_ACP_SKIP_AUTHENTICATE=1`** | In `cursor-composer-in-claude`, `loadBridgeConfig` sets `acpSkipAuthenticate` when this is on **or** when an API key is present. Skips the ACP **`authenticate` / `cursor_login`** step that can touch Keychain. |
33
+ | **`CURSOR_BRIDGE_USE_ACP=1`** | Default bridge config has **`useAcp: false`**. Without ACP, traffic used **`runStreaming`** instead of **`runAcpStream`**; skip-authenticate only applies on the **ACP** path. Forcing ACP keeps behavior aligned with the intended headless/ACP pipeline. |
34
+
35
+ Without **`CURSOR_BRIDGE_USE_ACP=1`**, skip-authenticate did not apply to the code path that handled streaming requests.
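The token precedence behind the table above can be sketched as a pure function. This is a minimal illustration, assuming the order visible in `resolveCursorAgentToken` (`CURSOR_API_KEY`, then `CURSOR_AUTH_TOKEN`, then the bridge key, then the stored key); `resolveAgentToken` and its `storedKey` parameter are hypothetical names, not the shipped implementation.

```javascript
// Sketch only: resolveAgentToken is a hypothetical, pure stand-in for the
// resolution order described above. It is not the function in src/providers.ts.
// `env` is any object shaped like process.env; `storedKey` stands in for the
// value that would be read from providers.json.
function resolveAgentToken(env, storedKey) {
  const candidates = [
    env.CURSOR_API_KEY,        // User API key, checked first
    env.CURSOR_AUTH_TOKEN,     // alternative token variable
    env.CURSOR_BRIDGE_API_KEY, // bridge key from the environment
    storedKey,                 // stored key, used last
  ];
  for (const candidate of candidates) {
    // Blank or whitespace-only values count as unset.
    const token = typeof candidate === "string" ? candidate.trim() : "";
    if (token) return token;
  }
  return null; // no token: the caller should refuse to start the proxy
}
```

Passing the environment in as an argument keeps the sketch testable without touching `process.env`.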
36
+
37
+ ---
38
+
39
+ ## Discovery 1: Chat-only workspace and a fake `HOME` (main Keychain fix)
40
+
41
+ **cursor-composer** defaults **`CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE`** to **`true`** (“chat-only workspace: yes (isolated temp dir)” in the startup banner).
42
+
43
+ For each request it:
44
+
45
+ - Creates a **temporary directory** and points **`CURSOR_CONFIG_DIR`** at a minimal tree under it.
46
+ - In **`getChatOnlyEnvOverrides`** (when no account-pool `authConfigDir` is configured), sets **`HOME`** (and related profile vars) to that **temp** directory so rules from the real `~/.cursor` are not loaded.
47
+
48
+ **Observation:** With a valid User API key in env, **`composer-2`** could still hit **`Keychain operation timed out after 30000ms`** when chat-only was **on**. With **`CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE=false`**, the same model and key **succeeded** (real workspace / real profile resolution, no temp `HOME`).
49
+
50
+ **Interpretation:** The Cursor CLI in ACP mode was still probing macOS Keychain for `cursor-user` when the process believed it was in an isolated “empty” profile (temp `HOME`), even though API key auth was set. That matches a **profile / keychain resolution** path, not a missing `CURSOR_API_KEY` in the parent shell.
51
+
52
+ **Fix shipped in claude-overnight:** spawn the bundled proxy with **`CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE=false`**.
53
+
54
+ **Trade-off:** You lose the strictest isolation (the agent no longer runs with a disposable fake `HOME` for every request). You gain reliable headless behavior on macOS with API keys. For many automation setups this is the right default.
55
+
56
+ **How to see it in tests:** The matrix script includes a row **`12-chat-workspace-isolated`** (`CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE=true`). With **`composer-2`**, that row tends to **fail** while **`01-overnight-parity`** passes, reproducing the regression.
57
+
58
+ ---
59
+
60
+ ## Discovery 2: `composer-2-fast` was never a real model
61
+
62
+ The ACP model catalog only offers `composer-2` with `modelId: composer-2[fast=true]`. There is no separate `composer-2-fast` model — `composer-2` already IS the fast variant. Passing `composer-2-fast` to `session/set_config_option` fails with "Invalid model value" because it's not in the catalog. Use **`composer-2`** as the model name.
63
+
64
+ ---
65
+
66
+ ## What claude-overnight sets when it auto-starts the proxy
67
+
68
+ When `startProxyProcess` runs, it builds a **`proxyEnv`** that always includes (among others):
69
+
70
+ | Variable | Purpose |
71
+ |----------|--------|
72
+ | `CI` | `"true"` (forced so a parent shell cannot leave `CI` empty and re-enable interactive probes). |
73
+ | `CURSOR_SKIP_KEYCHAIN` | `"1"` (forced). |
74
+ | `CURSOR_API_KEY` / `CURSOR_AUTH_TOKEN` | Resolved User API key / bridge key (same token mirrored for the native agent). |
75
+ | `CURSOR_BRIDGE_API_KEY` | HTTP auth for the proxy. |
76
+ | `CURSOR_BRIDGE_ACP_SKIP_AUTHENTICATE` | `"1"` (skip `cursor_login` on ACP). |
77
+ | `CURSOR_BRIDGE_USE_ACP` | `"1"` (use ACP path so skip-authenticate applies). |
78
+ | **`CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE`** | **`"false"`** (avoid temp `HOME` Keychain behavior on macOS). |
79
+ | `CURSOR_AGENT_NODE` / `CURSOR_AGENT_SCRIPT` | When detected: system Node + `agent` `index.js` (avoids known issues with the bundled Node on some macOS installs). |
80
+
81
+ See `startProxyProcess` in `src/providers.ts` for the exact spawn and logging.
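The table above can be read as one object merge in which the forced values come after the inherited environment, so they always win. The following is a minimal sketch under that assumption; `buildProxyEnv` is a hypothetical helper name, and the optional `CURSOR_AGENT_NODE` / `CURSOR_AGENT_SCRIPT` entries are left out.

```javascript
// Sketch only: buildProxyEnv is a hypothetical helper mirroring the table above.
// `parentEnv` is shaped like process.env; `agentToken` and `bridgeKey` are
// assumed to be already resolved, non-empty strings.
function buildProxyEnv(parentEnv, agentToken, bridgeKey) {
  if (!agentToken) {
    throw new Error("A Cursor agent token is required to start the proxy");
  }
  // Drop undefined entries so the child environment holds only strings.
  const inherited = Object.fromEntries(
    Object.entries(parentEnv).filter(([, value]) => value !== undefined)
  );
  return {
    ...inherited,
    // Forced values follow the spread, so a parent shell cannot override them.
    CI: "true",
    CURSOR_SKIP_KEYCHAIN: "1",
    CURSOR_BRIDGE_API_KEY: bridgeKey,
    CURSOR_API_KEY: agentToken,
    CURSOR_AUTH_TOKEN: agentToken,
    CURSOR_BRIDGE_ACP_SKIP_AUTHENTICATE: "1",
    CURSOR_BRIDGE_USE_ACP: "1",
    CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE: "false",
  };
}
```

Placing the forced keys after the spread is what makes them override an empty `CI` or a stale `CURSOR_API_KEY` inherited from the parent process.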
82
+
83
+ ---
84
+
85
+ ## How to verify
86
+
87
+ 1. **Matrix (recommended):**
88
+ `MATRIX_MODELS=composer-2 npm run matrix:cursor-proxy`
89
+ - Expect the **`composer-2`** parity row (`01-overnight-parity`) to return **HTTP 200**.
90
+
91
+ 2. **Logs:** On failure, check proxy stderr or `~/.cursor-api-proxy/sessions.log` to distinguish **`Keychain operation timed out`** from empty stderr or a generic exit 1.
92
+
93
+ 3. **Preflight:** claude-overnight runs provider preflights with timeouts; Cursor proxy preflights are serialized to avoid starving the single agent listener.
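When triaging the log check in step 2, a small classifier can keep the failure modes apart instead of lumping every HTTP 500 together. This is a sketch only: `classifyProxyFailure` is a hypothetical helper, and treating `Invalid model value` as the second failure mode is an assumption based on the two message strings quoted in this document.

```javascript
// Sketch only: classifyProxyFailure is a hypothetical helper, not part of
// claude-overnight. It matches only message fragments quoted in this document.
function classifyProxyFailure(logText) {
  if (/Keychain operation timed out/.test(logText)) {
    return "keychain-timeout"; // the Discovery 1 path (temp HOME / Keychain wait)
  }
  if (/Invalid model value/.test(logText)) {
    return "invalid-model"; // assumed Discovery 2 path (model not in the catalog)
  }
  return "other"; // empty stderr, generic exit 1, or anything unrecognized
}
```

A matrix run could tally these labels per row to show whether a failing row is a Keychain problem or a model-name problem.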
94
+
95
+ ---
96
+
97
+ ## When the OS keychain itself is broken
98
+
99
+ If **`login.keychain`** is missing or damaged, macOS can still show dialogs unrelated to Cursor. Keychain Access → First Aid, or `security unlock-keychain ~/Library/Keychains/login.keychain-db`, may help. That is **orthogonal** to the chat-only / `HOME` discovery above.
100
+
101
+ ---
102
+
103
+ ## References in this repo
104
+
105
+ - Implementation: `src/providers.ts` (`startProxyProcess`, `envFor`, `ensureCursorProxyRunning`).
106
+ - Stress harness: `scripts/cursor-proxy-keychain-matrix.mjs`, `npm run matrix:cursor-proxy`.
107
+ - Upstream behavior: `node_modules/cursor-composer-in-claude/dist/lib/config.js` (`loadBridgeConfig`), `workspace.js` (`getChatOnlyEnvOverrides`), `acp-client.js` (`buildAcpSpawnEnv`, ACP handshake).
108
+
109
+ ---
110
+
111
+ ## Summary
112
+
113
+ 1. **`CURSOR_BRIDGE_USE_ACP=1` + `CURSOR_BRIDGE_ACP_SKIP_AUTHENTICATE=1`** are required so the bridge uses the ACP path, where headless auth is designed to apply.
114
+ 2. **`CURSOR_BRIDGE_CHAT_ONLY_WORKSPACE=false`** is the macOS-specific fix that stops temp-`HOME` isolation from driving Keychain waits despite API keys.
115
+ 3. **Keychain shim** (`NODE_OPTIONS=--require keychain-shim.cjs`) intercepts `/usr/bin/security` calls at the Node.js level, eliminating macOS Keychain dialogs regardless of other env vars.
116
+ 4. Use **`composer-2`** as the model name — `composer-2-fast` was never a real model in the ACP catalog.
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "claude-overnight",
3
- "version": "1.24.8",
4
- "description": "Background lane for your Claude Max plan. Parallel Claude Agent SDK sessions in git worktrees with a usage cap that reserves headroom for your interactive Claude Code. Crash-safe resume. Provider-agnostic model catalog with capability-based planning.",
3
+ "version": "1.25.0",
4
+ "description": "Parallel Claude agents in git worktrees with a usage cap that reserves headroom for your interactive Claude Code. Crash-safe resume. Provider-agnostic model catalog (Anthropic, Cursor, OpenAI, Gemini, DeepSeek, Llama, Qwen) with capability-based task scoping.",
5
5
  "type": "module",
6
6
  "bin": {
7
7
  "claude-overnight": "dist/bin.js"
@@ -11,12 +11,13 @@
11
11
  "dev": "tsc --watch",
12
12
  "start": "node dist/bin.js",
13
13
  "test": "node --test dist/__tests__/*.test.js",
14
+ "matrix:cursor-proxy": "node scripts/cursor-proxy-keychain-matrix.mjs",
14
15
  "prepublishOnly": "node scripts/sync-plugin-version.js"
15
16
  },
16
17
  "dependencies": {
17
18
  "@anthropic-ai/claude-agent-sdk": "^0.2.92",
18
19
  "chalk": "^5.4.1",
19
- "cursor-composer-in-claude": "0.7.9",
20
+ "cursor-composer-in-claude": "0.8.0",
20
21
  "jsonwebtoken": "^9.0.2"
21
22
  },
22
23
  "devDependencies": {
@@ -72,6 +73,7 @@
72
73
  "files": [
73
74
  "dist",
74
75
  "!dist/__tests__",
76
+ "docs",
75
77
  "plugins",
76
78
  "QUICKSHEET_PLAYWRIGHT.md",
77
79
  "README.md",
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "claude-overnight",
3
- "version": "1.24.8",
3
+ "version": "1.25.0",
4
4
  "description": "Claude Code skill for understanding, installing, and inspecting claude-overnight runs -- parallel Claude agents in git worktrees with thinking waves, multi-wave steering, and crash-safe resume. Supports Cursor API Proxy, Qwen, OpenRouter.",
5
5
  "author": {
6
6
  "name": "Francesco Fornace"