npm - baro-ai - Versions diffs - 0.38.2 → 0.39.0 - Mend

baro-ai 0.38.2 → 0.39.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4) hide show

package/README.md CHANGED Viewed

@@ -10,14 +10,22 @@ Give it a goal, it breaks it into stories, builds a dependency DAG, and runs the
 > 📖 **Deep dive:** [Getting the Maximum Out of My Claude Code Subscription](https://jigjoy.ai/blog/getting-the-maximum-out-of-claude-code) — the story of why baro exists, how it pairs with Mozaik, and what it looks like in practice.
-## What's new (0.22–0.23)
+> 📊 **Recent post:** [How baro generated 808 NestJS Jest tests autonomously in 71 minutes](https://jigjoy.ai/blog/baro-808-nestjs-jest-tests) — one prompt, 33-story DAG, two sessions because of the Anthropic usage cap, 83.5% branch coverage, zero filed bug issues.
-- **Opus as the default executor** — richer reasoning per story, still with routed Sonnet/Haiku available via `--model` or `.barorc`.
-- **Smaller-stories planner** — the planner now biases toward narrower, more independent stories that parallelize better on the DAG.
-- **Branch dedup** — reruns on the same goal reuse the existing `baro/<name>` branch instead of piling up duplicates.
-- **TUI: terminal-clear on tab switch** — cleaner transitions between story logs, DAG view, and stats.
-- **Audit log survives project resets** — JSONL event logs now live in `~/.baro/runs/` by default, so a wiped `node_modules` or a fresh clone doesn't lose history.
-- **Always-on audit + abnormal-exit banner** — every run is recorded, and the TUI surfaces an explicit banner when the orchestrator exits unexpectedly.
+## What's new in 0.39
+- **`--share-architect-cache` (experimental)** — routes the Architect's DecisionDocument through Claude Code's `--append-system-prompt` instead of prepending it to each story's user prompt. Anthropic's prompt cache hashes system + tools as the cache key, so with the flag on stories 2..N read the DD from cache at the 10× discount instead of each re-paying its own `cache_creation`. Default OFF; measure with audit-log token totals before flipping.
+## What's new in 0.30–0.38 (the bigger shifts)
+- **Dual-mode LLM**: `--llm openai` is fully end-to-end now (Architect / Planner / StoryAgent / Critic / Surgeon all route through Mozaik's native OpenAI runner). Defaults: gpt-5.5 for code-shaped phases, gpt-5.4-mini for Critic.
+- **Architect phase**: an upstream agent that fixes cross-cutting design decisions (column names, file paths, API shapes, library choices) **before** any Story Agent starts, so 30 parallel agents don't each invent their own. See [baro vs Claude Code post](https://jigjoy.ai/blog/baro-vs-claude-code) for the architecture.
+- **`--quick`** — skip Architect, force 1-story plan, silence Critic + Surgeon. For trivial goals (`baro --quick "fix the typo on line 42"`) where the design-doc + multi-agent ceremony is overhead.
+- **Per-phase model overrides**: `--architect-model`, `--planner-model`, `--story-model`, `--critic-model`, `--surgeon-model` — pin any individual role to a specific model without forcing the rest with the global `--model`.
+- **Branch per run**: every fresh run creates a new suffixed `baro/<slug>-<id>` branch, never falls back to checkout. Side-by-side runs from sibling clones can't collide on `git push`.
+- **`baro --doctor`** — pre-flight self-check that verifies the `claude` CLI is on PATH and authenticated, the audit dir is writable, and `gh` exists. First thing to run when a baro run fails before any agents start.
+- **Repo moved** to [jigjoy-ai/baro](https://github.com/jigjoy-ai/baro). npm package name unchanged (`baro-ai`).
+- **gpt-5.5 defaults** + UTF-8 panic fix + Ghostty/kitty-keyboard Enter handling + Execute-screen 0/N resume fix (0.38.x).
 ## Install
@@ -25,73 +33,62 @@ Give it a goal, it breaks it into stories, builds a dependency DAG, and runs the
 npm install -g baro-ai
 ```
-Requires [Claude CLI](https://docs.anthropic.com/en/docs/claude-cli) installed and authenticated.
+Requires [Claude CLI](https://docs.anthropic.com/en/docs/claude-cli) installed and authenticated for `--llm claude` (the default). For `--llm openai`, set `OPENAI_API_KEY` in your shell env or enter it on the welcome screen.
 ## Usage
 ```bash
-# Interactive - opens welcome screen
+# Interactive — opens the welcome screen
 baro
-# Direct - skip to planning
+# Direct — skip to planning
 baro "Add authentication with JWT and role-based access control"
-# Use OpenAI for planning
-baro --planner openai "Add WebSocket support"
+# Route every phase through GPT-5.5 (Mozaik-native OpenAI runner)
+OPENAI_API_KEY=sk-... baro --llm openai "Refactor the database layer"
 # Limit parallelism to 3 concurrent stories
 baro --parallel 3 "Refactor database layer"
-# Set story timeout to 5 minutes
+# Set per-story timeout (default: 600s)
 baro --timeout 300 "Add unit tests"
-# Force a specific model for all phases
+# Force one model everywhere — wins over per-phase routing
 baro --model opus "Complex architecture redesign"
-# Disable model routing (use opus everywhere)
-baro --no-model-routing "Build entire app"
+# Pin a single phase to a specific model
+baro --story-model sonnet "Add WebSocket support"
-# Dry run - generate plan without executing
+# Dry-run — generate plan, save to prd.json, don't execute
 baro --dry-run "Add REST API"
-# Resume interrupted execution (or execute a dry-run plan)
+# Resume an interrupted run (also runs dry-run plans)
 baro --resume
+# Quick fast-path — skip Architect + Critic + Surgeon, single-story plan
+baro --quick "fix the typo on line 42"
+# EXPERIMENTAL — push Architect's DecisionDocument through Claude Code's
+# cached system-prompt prefix so stories 2..N hit cache for the DD
+baro --share-architect-cache "Add comprehensive test coverage"
 # Specify working directory
 baro --cwd ~/projects/myapp "Add REST API"
+# Self-diagnostic and exit
+baro --doctor
 ```
 ## How it works
-1. **Plan** — Claude (Opus) explores your codebase and generates a dependency graph of user stories
-2. **Review** — You review the plan, refine with feedback, accept or quit
-3. **Execute** — Stories run in parallel on a feature branch, each with its own Claude agent (Opus by default in 0.23+; Sonnet/Haiku available via `--model` or `.barorc`)
-4. **Review Agent** — After each level, a review agent (Haiku) checks work against acceptance criteria and creates fix stories if needed
-5. **Finalize** — Runs build verification and creates a GitHub PR with full summary
-## Features
-- **Parallel execution** — independent stories run simultaneously, respecting dependency order
-- **DAG engine** — topological sort with level grouping, cycle detection
-- **Model routing** — Opus for planning and execution (0.23+ default), Haiku for review (configurable)
-- **Live TUI** — dashboard with story status, live agent logs, DAG view, stats
-- **Review agent** — automated code review between levels with build detection and auto-fix
-- **Plan refinement** — press `r` on review screen to give feedback and regenerate the plan
-- **Build detection** — auto-detects project type (Cargo, npm, Go, Python, Make) and runs builds during review
-- **Git coordination** — mutex-protected commits, auto-push with retry, pull --rebase, conflict detection
-- **Branch per run** — creates `baro/<name>` branch, keeps main clean, reuses existing branches on rerun (0.23+)
-- **Dry run** — `--dry-run` generates plan and saves to `prd.json` without executing, then `--resume` to run it
-- **Resume** — detects `prd.json` and resumes incomplete executions
-- **PR creation** — creates GitHub PR with stories table, stats, time saved, and review summary
-- **Configurable parallelism** — `--parallel N` to limit concurrent story execution
-- **Story timeout** — `--timeout SECONDS` kills stuck agents (default: 10 minutes, hard timeout disabled in 0.22+)
-- **Time saved** — shows parallel speedup vs sequential execution
-- **System notifications** — terminal bell + OS notification (macOS/Linux/Windows) when done
-- **Retry logic** — failed stories retry automatically (configurable per story)
-- **Interactive settings** — configure model, parallelism, timeout, context, and planner on the welcome screen with Tab/arrow keys
-- **Project config** — `.barorc` file in project root sets defaults (no CLI flags needed)
-- **Session lock** — prevents multiple baro instances from running in the same directory
-- **Audit log** — every bus event written to `~/.baro/runs/<run-id>.jsonl`
+1. **Architect** (Opus by default, or gpt-5.5 with `--llm openai`) — reads the codebase and writes a `DecisionDocument`: the file paths, names, schemas, API shapes, and library choices every story will use.
+2. **Planner** — decomposes the goal into a dependency DAG of stories, with the DecisionDocument pinned.
+3. **Review** — you accept the plan, or press `r` to refine it with feedback.
+4. **Execute** — stories run in parallel on a fresh `baro/<slug>-<id>` branch. Each story is one Claude Code subprocess (or one Mozaik-native OpenAI session).
+5. **Critic** (per-turn) — Haiku evaluator scores each agent turn against its acceptance criteria. On a fail verdict, an inline corrective lands as the agent's next turn so it self-corrects before the next tool call.
+6. **Sentry + Librarian** (run-wide) — Sentry flags overlapping Edit/Write tool calls; Librarian indexes one agent's Read/Grep so the next agent doesn't redo the exploration.
+7. **Surgeon** (on terminal failure) — asks Opus for a richer replan (split / prereq / rewire) instead of dropping a failed story outright.
+8. **Finalize** — runs the project's build verification and opens a GitHub PR with stories table, stats, time saved, and a summary of every story's commits.
 ## Config file
@@ -124,122 +121,90 @@ All fields are optional. CLI flags override `.barorc`, and interactive changes o
 baro [goal] [options]
 Arguments:
-  goal                         Project goal (opens welcome screen if omitted)
-Options:
-  --planner <name>             Planner: claude or openai (default: claude)
-  --model <name>               Override model for all phases: opus, sonnet, haiku
-  --no-model-routing           Use opus for everything (disables routing)
-  --parallel <N>               Max concurrent stories, 0 = unlimited (default: 0)
-  --timeout <seconds>          Story timeout in seconds (default: 600)
-  --dry-run                    Generate plan only, save to prd.json, do not execute
-  --resume                     Resume from existing prd.json (also runs dry-run plans)
-  --skip-context               Skip CLAUDE.md auto-generation
-  --cwd <path>                 Working directory (default: current)
-  --no-critic                  Disable live Critic (default: on). The Critic
-                               reviews each agent turn against acceptance
-                               criteria via `claude --model haiku` and injects
-                               corrective feedback when the turn doesn't pass.
-  --critic-model <name>        Model for the Critic (default: haiku)
-  --no-librarian               Disable cross-agent runtime memory (default: on)
-  --no-sentry                  Disable file-touch conflict detector (default: on)
-  --no-surgeon                 Disable Surgeon (default: on). The Surgeon
-                               observes terminal story failures and proposes
-                               replans (split / prereq / rewire) so failed
-                               work gets done in a different shape rather
-                               than dropped.
-  --no-surgeon-llm             Use deterministic Surgeon (skip-only) instead
-                               of the LLM-driven replanner. The LLM Surgeon
-                               is on by default; it costs an Opus call per
-                               terminal failure but produces richer replans.
-  --surgeon-model <name>       Model for the Surgeon LLM (default: opus)
-  -h, --help                   Print help
+  goal                          Project goal (opens welcome screen if omitted)
+Provider:
+  --llm <name>                  Provider for every phase: claude (default) or openai.
+                                openai routes through Mozaik's native OpenAI runner
+                                (gpt-5.x); requires OPENAI_API_KEY.
+Model selection:
+  --model <name>                Override ALL phases: opus, sonnet, haiku
+  --no-model-routing            Equivalent to --model opus
+  --architect-model <name>      Pin only the Architect phase
+  --planner-model <name>        Pin only the Planner phase
+  --story-model <name>          Pin every Story Agent in the run
+  --critic-model <name>         Model for the Critic (default: haiku)
+  --surgeon-model <name>        Model for the Surgeon LLM (default: opus)
+Run shape:
+  --parallel <N>                Max concurrent stories per level (0 = unlimited)
+  --timeout <seconds>           Per-story timeout (default: 600)
+  --intra-level-delay <secs>    Stagger story spawns within a level so Librarian
+                                can broadcast first agent's findings (default: 10)
+  --quick                       Trivial-goal fast path — skip Architect, force
+                                1-story plan, disable Critic + Surgeon
+  --dry-run                     Generate plan only, save to prd.json
+  --resume                      Resume from existing prd.json
+  --share-architect-cache       EXPERIMENTAL — route DecisionDocument through
+                                Claude Code's --append-system-prompt so stories
+                                2..N read it from the prompt cache
+  --skip-context                Skip CLAUDE.md auto-generation
+  --cwd <path>                  Working directory (default: current)
+Observers:
+  --no-critic                   Disable live Critic (default: ON)
+  --no-librarian                Disable cross-agent runtime memory (default: ON)
+  --no-sentry                   Disable file-touch conflict detector (default: ON)
+  --no-surgeon                  Disable Surgeon (default: ON)
+  --no-surgeon-llm              Use deterministic Surgeon (skip-only) instead
+                                of the LLM-driven replanner
+Diagnostics:
+  --doctor                      Run self-diagnostic and exit
+  -h, --help                    Print help
 ```
-### Phase 2/3/4 observers (Mozaik bus)
-baro 0.19+ runs every story through a TypeScript Mozaik orchestrator.
-Stories on the same DAG level run truly in parallel and observers can
-react to one another's bus events:
-- **Librarian** (default ON) — when one agent reads a file or runs grep,
-  later agents in the run see the digest in their prompt and skip the
-  redundant exploration. Measurable token savings on multi-story runs.
-- **Sentry** (default ON) — flags overlapping Edit/Write tool calls
-  across concurrent stories.
-- **Critic** (default ON) — Haiku evaluator reviews each agent turn
-  against acceptance criteria; on a fail verdict, an inline corrective
-  message lands as the agent's next turn so it self-corrects before
-  commit. Disable with `--no-critic`.
-- **Surgeon** (default ON, with LLM) — when a story fails its retry
-  budget, the Surgeon asks Opus for a richer replan and emits a
-  ReplanItem the Conductor applies at the next level boundary. The LLM
-  is biased toward keeping the work done — it prefers splitting a too-
-  large story into smaller pieces, inserting a prerequisite, or
-  rewiring dependencies, over dropping outright. A run is reported as
-  successful only when every original story passes; if the Surgeon
-  drops a story without replacement, the run terminates with a clear
-  "did not complete the goal" verdict instead of a green tick. Disable
-  the LLM with `--no-surgeon-llm` to fall back to deterministic
-  skip-only behavior, or `--no-surgeon` to remove adaptive replans
-  entirely.
-## Requirements
+### The participants (Mozaik bus)
-- [Claude CLI](https://docs.anthropic.com/en/docs/claude-cli) installed and authenticated
-- macOS (arm64/x64), Linux (x64/arm64), or Windows (x64)
-- **Node.js 20+** (orchestrator runtime)
-- `gh` CLI (optional, for automatic PR creation)
-> **Windows note:** Windows 10+ is required. For best TUI experience, use [Windows Terminal](https://aka.ms/terminal) or another modern terminal emulator.
-## Architecture
-Rust binary distributed via npm. TUI built with ratatui, async execution
-with tokio. Each `baro` invocation spawns the bundled TypeScript
-[Mozaik](https://github.com/jigjoy-ai/mozaik) orchestrator as a
-subprocess; the orchestrator owns story execution and emits typed
-events into a shared `AgenticEnvironment` bus. Each story is one
-`claude` CLI subprocess (auth inherits from your Claude CLI session —
-no API key needed).
-The orchestrator is itself a Mozaik agentic environment: there is no
-imperative `run()` method, no top-level `Promise.all` loop. The
-**Conductor** is a state machine that reacts to typed bus events
-(`RunStartRequest` → `LevelComputeRequest` → `StorySpawnRequest` →
-`StoryResult` → `LevelCompleted` → …). Spawning a story, evaluating a
-turn, and replanning the DAG are all reactions, not steps in a loop.
-Ten participants share that bus:
+Every story runs through a TypeScript Mozaik orchestrator. The orchestrator is itself a Mozaik agentic environment: there is no imperative `run()` method, no top-level `Promise.all` loop. The `Conductor` is a state machine that reacts to typed bus events. Spawning a story, evaluating a turn, replanning the DAG — all reactions on the bus, not steps in a loop.
 | Participant     | Role                                                              |
 | --------------- | ----------------------------------------------------------------- |
+| `Architect`     | One Opus (or gpt-5.5) turn before planning — emits a `DecisionDocumentItem` that pins every cross-cutting design decision |
+| `Planner`       | Decomposes the goal into a story DAG, with the DecisionDocument pinned |
 | `Conductor`     | Orchestration state machine — drives the run by reacting          |
 | `StoryFactory`  | Spawns Story Agents on each `StorySpawnRequest`                   |
-| `StoryAgent`    | Runs one story via Claude CLI, with retries and timeout           |
+| `StoryAgent`    | Runs one story via Claude CLI subprocess (or Mozaik OpenAI session) |
 | `Librarian`     | Cross-agent memory — indexes outputs of exploration tools         |
 | `Sentry`        | Flags overlapping file writes across concurrent stories           |
 | `Critic`        | Per-turn acceptance-criteria evaluator (default ON, `--no-critic` to disable) |
 | `Surgeon`       | Emits DAG replans when a story fails terminally (default ON, `--no-surgeon` to disable) |
+| `Finalizer`     | Runs build verification, opens the GitHub PR with stories table + stats |
 | `Operator`      | Bridges external user commands (TUI, web UI) into bus events      |
 | `Auditor`       | JSONL log of every event on the bus (written to `~/.baro/runs/`)  |
 | `Cartographer`  | Translates bus events into UI frames for the Rust TUI             |
-The bus is open. New participants — CI deployers, Slack notifiers,
-external ticket triggers — are subscribers and emitters with no changes
-to the orchestrator.
+The bus is open. New participants — CI deployers, Slack notifiers, external ticket triggers — are subscribers and emitters with no changes to the orchestrator. (Adding the Architect was a 200-line change because of this shape; see the [Mozaik post](https://jigjoy.ai/blog/baro-vs-claude-code).)
+## Requirements
+- [Claude CLI](https://docs.anthropic.com/en/docs/claude-cli) installed and authenticated (for `--llm claude`, the default), OR `OPENAI_API_KEY` set (for `--llm openai`)
+- macOS (arm64/x64), Linux (x64/arm64), or Windows (x64)
+- **Node.js 20+** (orchestrator runtime)
+- `gh` CLI (optional, for automatic PR creation)
+> **Windows note:** Windows 10+ is required. For best TUI experience, use [Windows Terminal](https://aka.ms/terminal) or another modern terminal emulator.
+## Architecture
+Rust binary distributed via npm. TUI built with ratatui, async execution with tokio. Each `baro` invocation spawns the bundled TypeScript [Mozaik](https://github.com/jigjoy-ai/mozaik) orchestrator as a subprocess; the orchestrator owns story execution and emits typed events into a shared `AgenticEnvironment` bus.
 ## Status & feedback
-baro is a work in progress. I'm actively adding things, testing ideas,
-and occasionally breaking them — if a run explodes, an [issue on
-GitHub](https://github.com/jigjoy-ai/baro/issues) with the run's audit
-log from `~/.baro/runs/` is the fastest way to get it fixed.
+baro is a work in progress. I'm actively adding things, testing ideas, and occasionally breaking them — if a run explodes, an [issue on GitHub](https://github.com/jigjoy-ai/baro/issues) with the run's audit log from `~/.baro/runs/` is the fastest way to get it fixed.
-If you like the idea and want to help shape where it goes, PRs are
-welcome, and you can DM me on Twitter
-[@lotus_sbc](https://twitter.com/lotus_sbc) with ideas, use cases, or
-bug reports.
+If you like the idea and want to help shape where it goes, PRs are welcome, you can DM me on Twitter [@lotus_sbc](https://twitter.com/lotus_sbc), or jump into the [JigJoy Discord](https://discord.gg/dvxY9J2kWX) where the people building baro and Mozaik hang out.
 ## License

package/dist/cli.mjs CHANGED Viewed

@@ -8324,13 +8324,14 @@ var LevelCompletedItem = class extends BusEvent {
   }
 };
 var StorySpawnRequestItem = class extends BusEvent {
-  constructor(storyId, prompt, model, retries, timeoutSecs) {
+  constructor(storyId, prompt, model, retries, timeoutSecs, appendSystemPrompt) {
     super();
     this.storyId = storyId;
     this.prompt = prompt;
     this.model = model;
     this.retries = retries;
     this.timeoutSecs = timeoutSecs;
+    this.appendSystemPrompt = appendSystemPrompt;
   }
   type = "story_spawn_request";
   toJSON() {
@@ -8340,7 +8341,8 @@ var StorySpawnRequestItem = class extends BusEvent {
       promptLen: this.prompt.length,
       model: this.model,
       retries: this.retries,
-      timeoutSecs: this.timeoutSecs
+      timeoutSecs: this.timeoutSecs,
+      appendSystemPromptLen: this.appendSystemPrompt?.length ?? 0
     };
   }
 };
@@ -8864,6 +8866,9 @@ var ClaudeCliParticipant = class _ClaudeCliParticipant extends BaroParticipant {
     if (this.options.resumeSessionId) {
       args.push("--resume", this.options.resumeSessionId);
     }
+    if (this.options.appendSystemPrompt && this.options.appendSystemPrompt.length > 0) {
+      args.push("--append-system-prompt", this.options.appendSystemPrompt);
+    }
     if (this.options.extraArgs && this.options.extraArgs.length > 0) {
       args.push(...this.options.extraArgs);
     }
@@ -9122,7 +9127,8 @@ var StoryAgent = class extends BaroParticipant {
     this.transition("running", `attempt ${attempt}`);
     const claude = new ClaudeCliParticipant(this.spec.id, {
       cwd: this.spec.cwd,
-      model: this.spec.model
+      model: this.spec.model,
+      appendSystemPrompt: this.spec.appendSystemPrompt
     });
     this.currentClaude = claude;
     claude.join(this.envRef);
@@ -9482,7 +9488,9 @@ var Conductor = class extends BaroParticipant {
   }
   async requestStorySpawn(story) {
     const model = this.opts.overrideModel ?? story.model ?? this.opts.defaultModel;
-    let prompt = this.resolvePrompt(story);
+    const resolved = this.resolvePrompt(story);
+    let prompt = resolved.userPrompt;
+    const appendSystemPrompt = resolved.appendSystemPrompt;
     if (this.opts.onBeforeStoryLaunch) {
       try {
         const extra = await this.opts.onBeforeStoryLaunch(story.id, story);
@@ -9507,7 +9515,8 @@ ${prompt}`;
         prompt,
         model,
         story.retries,
-        this.opts.timeoutSecs
+        this.opts.timeoutSecs,
+        appendSystemPrompt
       )
     );
   }
@@ -9617,24 +9626,26 @@ ${prompt}`;
       prompt = buildDefaultStoryPrompt(story);
     }
     const doc = this.prd?.decisionDocument;
-    if (doc && doc.trim().length > 0) {
-      const header = [
-        "## Design spec (authoritative \u2014 already decided)",
-        "",
-        "The Architect made these decisions before any story started.",
-        "Treat them as fixed: use these exact file paths, names,",
-        "schemas, API shapes, and dependency choices. Do NOT",
-        "improvise alternatives \u2014 your siblings are working from",
-        "the same spec and divergence breaks the build.",
-        "",
-        doc.trim(),
-        "",
-        "---",
-        ""
-      ].join("\n");
-      prompt = header + prompt;
-    }
-    return prompt;
+    if (!doc || doc.trim().length === 0) {
+      return { userPrompt: prompt };
+    }
+    const trimmedDoc = doc.trim();
+    const headerLines = [
+      "## Design spec (authoritative \u2014 already decided)",
+      "",
+      "The Architect made these decisions before any story started.",
+      "Treat them as fixed: use these exact file paths, names,",
+      "schemas, API shapes, and dependency choices. Do NOT",
+      "improvise alternatives \u2014 your siblings are working from",
+      "the same spec and divergence breaks the build.",
+      ""
+    ];
+    if (this.opts.shareArchitectCache) {
+      const appendSystemPrompt = [...headerLines, trimmedDoc].join("\n");
+      return { userPrompt: prompt, appendSystemPrompt };
+    }
+    const header = [...headerLines, trimmedDoc, "", "---", ""].join("\n");
+    return { userPrompt: header + prompt };
   }
   emit(event) {
     this.envRef?.deliverBusEvent(this, event);
@@ -11684,7 +11695,8 @@ var StoryFactory = class extends BaroParticipant {
       cwd: this.opts.cwd,
       model: claudeModel,
       retries: req.retries,
-      timeoutSecs: req.timeoutSecs
+      timeoutSecs: req.timeoutSecs,
+      appendSystemPrompt: req.appendSystemPrompt
     });
     agent.join(this.envRef);
     this.active.set(req.storyId, agent);
@@ -12114,6 +12126,7 @@ async function orchestrate(config) {
     overrideModel: config.overrideModel ?? void 0,
     defaultModel: config.defaultModel ?? "opus",
     intraLevelDelaySecs: config.intraLevelDelaySecs,
+    shareArchitectCache: config.shareArchitectCache ?? false,
     onRunStart: useGit ? async (prd) => {
       baseSha = await getHeadSha(config.cwd);
       if (prd.branchName) {
@@ -12401,6 +12414,7 @@ function parseArgs(argv) {
     withSurgeon: false,
     surgeonUseLlm: false,
     llm: "claude",
+    shareArchitectCache: false,
     help: false
   };
   for (let i = 0; i < argv.length; i++) {
@@ -12474,6 +12488,9 @@ function parseArgs(argv) {
         args.llm = v;
         break;
       }
+      case "--share-architect-cache":
+        args.shareArchitectCache = true;
+        break;
       default:
         process.stderr.write(`[cli] unknown flag: ${a}
 `);
@@ -12553,7 +12570,8 @@ async function main() {
     surgeonModel: args.surgeonModel,
     intraLevelDelaySecs: args.intraLevelDelaySecs,
     llm: args.llm,
-    storyModel: args.storyModel
+    storyModel: args.storyModel,
+    shareArchitectCache: args.shareArchitectCache
   };
   if (args.llm === "openai" && !process.env.OPENAI_API_KEY) {
     process.stderr.write(