baro-ai 0.44.0 → 0.45.0

package/README.md CHANGED
@@ -69,7 +69,31 @@ flowchart LR
69
69
  F --> PR([Pull Request])
70
70
  ```
71
71
 
72
- Every story is one **Claude Code subprocess** (or one Mozaik-native OpenAI session) auth inherits from your existing setup, no API key plumbing.
72
+ Every story is one CLI subprocess — Claude Code, OpenAI Codex CLI, or a Mozaik-native OpenAI Responses session, depending on `--llm`. Auth inherits from whichever CLI you already have signed in, no API key plumbing.
73
+
74
+ ## Three LLM backends, one DAG
75
+
76
+ ```bash
77
+ baro --llm claude "Your goal" # default — Claude Code on Anthropic Max subscription
78
+ baro --llm codex "Your goal" # OpenAI Codex CLI on ChatGPT Pro/Plus subscription
79
+ baro --llm openai "Your goal" # Mozaik-native OpenAI Responses (per-call API billing)
80
+ baro --llm hybrid "Your goal" # Claude on Architect/Planner/Surgeon, Codex on Story/Critic
81
+ ```
82
+
83
+ Same orchestration. Same DAG. Same prompts. The only thing that moves is which provider every agent talks to. `--llm hybrid` is the new default-recommendation for serious runs — Claude where the upstream plan matters, Codex for the parallel story+critic work that dominates the budget.
84
+
85
+ Each phase has its own override flag if you want to mix it yourself:
86
+
87
+ ```bash
88
+ baro --architect-llm claude \
89
+ --planner-llm claude \
90
+ --story-llm codex \
91
+ --critic-llm codex \
92
+ --surgeon-llm claude \
93
+ "Your goal"
94
+ ```
95
+
96
+ Full breakdown at [docs.baro.rs/llm-providers](https://docs.baro.rs/llm-providers) — provider economics, per-phase routing, the side-by-side benchmark across three real tasks: [**I tested Claude Code vs OpenAI Codex in my parallel agent setup. Then I built a hybrid.**](https://jigjoy.ai/blog/claude-code-vs-codex-baro)
73
97
 
74
98
  ## Recent real run
75
99
 
@@ -82,7 +106,7 @@ Every story is one **Claude Code subprocess** (or one Mozaik-native OpenAI sessi
82
106
  | **Architect** | One Opus call before planning — emits a `DecisionDocument` that pins every cross-cutting design decision (file paths, schemas, API shapes, library choices) so 30 parallel agents don't each invent their own |
83
107
  | **Planner** | Decomposes the goal into a story DAG, with the DecisionDocument already pinned |
84
108
  | **Conductor** | State machine that drives the run by reacting to bus events |
85
- | **StoryAgent** | One Claude Code subprocess per story; multi-turn loop until story completes |
109
+ | **StoryAgent** | One CLI subprocess per story (Claude Code / Codex / OpenAI Responses, picked by `--llm` or `--story-llm`); multi-turn loop until story completes |
86
110
  | **Critic** | Per-turn evaluator (Haiku). On fail verdict, injects corrective feedback as the agent's next turn |
87
111
  | **Sentry** | Flags overlapping Edit/Write tool calls across concurrent stories |
88
112
  | **Librarian** | Indexes one agent's Read/Grep findings so siblings don't redo the exploration |
@@ -96,16 +120,22 @@ Bus is open. CI deployers, Slack notifiers, ticket triggers — all new particip
96
120
  ```bash
97
121
  npm install -g baro-ai
98
122
 
99
- # Full run (default — Architect + Planner + parallel Story Agents)
123
+ # Full run (default — Claude on every phase via Claude Code CLI)
100
124
  baro "Migrate the hardcoded category data to a backend dictionary"
101
125
 
102
126
  # Trivial goal — skip Architect + Critic + Surgeon, single story
103
127
  baro --quick "fix the typo on line 42 of README.md"
104
128
 
105
- # Route every phase through GPT-5.5 instead of Claude
129
+ # Codex everywhere (ChatGPT Pro/Plus subscription, ~3-11× cheaper per run than Claude)
130
+ baro --llm codex "Refactor the database layer"
131
+
132
+ # Per-phase routing — Claude upstream (tight plans), Codex downstream (cheap writes)
133
+ baro --llm hybrid "Add WebSocket support across api and frontend"
134
+
135
+ # Route every phase through GPT-5.5 (Mozaik-native OpenAI API)
106
136
  OPENAI_API_KEY=sk-... baro --llm openai "Refactor the database layer"
107
137
 
108
- # Limit parallelism (Anthropic plan tiers cap concurrency)
138
+ # Limit parallelism (plan-tier concurrency caps)
109
139
  baro --parallel 3 "Add unit tests for the auth module"
110
140
 
111
141
  # Dry-run first, execute later
@@ -134,7 +164,11 @@ For a deeper side-by-side on a real refactor, see [baro vs Claude Code `/goal`](
134
164
 
135
165
  ## Requirements
136
166
 
137
- - [Claude CLI](https://docs.anthropic.com/en/docs/claude-cli) authenticated (for `--llm claude`, the default) **or** `OPENAI_API_KEY` set (for `--llm openai`)
167
+ - At least one of:
168
+ - [Claude CLI](https://docs.anthropic.com/en/docs/claude-cli) authenticated (for `--llm claude`, the default)
169
+ - [OpenAI Codex CLI](https://github.com/openai/codex) authenticated (for `--llm codex`)
170
+ - `OPENAI_API_KEY` set (for `--llm openai`)
171
+ - Both Claude CLI **and** Codex CLI authenticated (for `--llm hybrid`)
138
172
  - Node.js 20+
139
173
  - macOS (arm64/x64), Linux (x64/arm64), Windows (x64)
140
174
  - `gh` CLI (optional, for automatic PR creation)
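
Reviewer note on the README changes: the new text describes `--llm hybrid` as Claude on Architect/Planner/Surgeon and Codex on Story/Critic, with a per-phase flag for each role, and the Requirements list says which CLI must be authenticated for each mode. The resolver itself is not part of this diff, so the following is only a minimal sketch of how such routing could be computed. `resolvePhaseProviders` and `requiredProviders` are hypothetical names, and the precedence rule (per-phase flag wins over `--llm`) is an assumption.

```javascript
// Hypothetical sketch; baro's real resolver is not shown in this diff.
const PHASES = ["architect", "planner", "story", "critic", "surgeon"];

// Routing the README gives for `--llm hybrid`.
const HYBRID_ROUTING = {
  architect: "claude",
  planner: "claude",
  story: "codex",
  critic: "codex",
  surgeon: "claude",
};

// `llm` mirrors --llm; `overrides` mirrors the per-phase flags,
// e.g. { story: "codex" } for --story-llm codex.
function resolvePhaseProviders({ llm = "claude", overrides = {} } = {}) {
  const resolved = {};
  for (const phase of PHASES) {
    const base = llm === "hybrid" ? HYBRID_ROUTING[phase] : llm;
    // Assumption: a per-phase flag wins over the global --llm value.
    resolved[phase] = overrides[phase] ?? base;
  }
  return resolved;
}

// Distinct providers a run would need signed in (or keyed, for "openai").
function requiredProviders(resolved) {
  return [...new Set(Object.values(resolved))].sort();
}

const hybrid = resolvePhaseProviders({ llm: "hybrid" });
console.log(hybrid);
console.log(requiredProviders(hybrid));
```

Under these assumptions a hybrid run needs both `claude` and `codex`, which lines up with the "Both Claude CLI **and** Codex CLI authenticated" requirement added in this release.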
package/dist/cli.mjs CHANGED
@@ -9886,6 +9886,10 @@ function buildDag(stories, options = {}) {
9886
9886
  return levels;
9887
9887
  }
9888
9888
 
9889
+ // ../baro-orchestrator/src/participants/auditor.ts
9890
+ import { appendFileSync, mkdirSync } from "fs";
9891
+ import { dirname } from "path";
9892
+
9889
9893
  // ../baro-orchestrator/src/semantic-events.ts
9890
9894
  function defineSemanticEvent(type) {
9891
9895
  return {
@@ -9924,8 +9928,6 @@ var ConductorState = defineSemanticEvent("conductor_state");
9924
9928
  var StoryResult = defineSemanticEvent("story_result");
9925
9929
 
9926
9930
  // ../baro-orchestrator/src/participants/auditor.ts
9927
- import { appendFileSync, mkdirSync } from "fs";
9928
- import { dirname } from "path";
9929
9931
  var Auditor = class extends BaseObserver {
9930
9932
  path;
9931
9933
  skipStreamChunks;
@@ -11611,6 +11613,206 @@ function matchCommit(story, commits) {
11611
11613
  return bestScore >= 2 ? bestSha : null;
11612
11614
  }
11613
11615
 
11616
+ // ../baro-orchestrator/src/tui-protocol.ts
11617
+ function emit(event) {
11618
+ const line = JSON.stringify(event) + "\n";
11619
+ process.stdout.write(line);
11620
+ }
11621
+
11622
+ // ../baro-orchestrator/src/participants/forwarders/agent-stream.ts
11623
+ var AgentStreamForwarder = class extends BaseObserver {
11624
+ async onExternalModelMessage(source, item) {
11625
+ const agentId = source.agentId;
11626
+ if (typeof agentId !== "string") return;
11627
+ const json = item.toJSON();
11628
+ const text = json.content?.[0]?.text ?? "";
11629
+ if (!text.trim()) return;
11630
+ emitMultiline(agentId, text);
11631
+ }
11632
+ async onExternalFunctionCall(source, item) {
11633
+ const agentId = source.agentId;
11634
+ if (typeof agentId !== "string") return;
11635
+ emitMultiline(agentId, `[tool_call] ${item.name} ${item.args}`);
11636
+ }
11637
+ async onExternalFunctionCallOutput(source, item) {
11638
+ const agentId = source.agentId;
11639
+ if (typeof agentId !== "string") return;
11640
+ const json = item.toJSON();
11641
+ const text = json.output?.[0]?.text ?? "";
11642
+ emitMultiline(agentId, `[tool_result ${json.call_id}] ${text}`);
11643
+ }
11644
+ };
11645
+ function emitMultiline(agentId, text) {
11646
+ if (!text) return;
11647
+ const lines = text.split("\n");
11648
+ for (const line of lines) {
11649
+ if (line.length === 0 && lines.length === 1) continue;
11650
+ emit({ type: "story_log", id: agentId, line });
11651
+ }
11652
+ }
11653
+
11654
+ // ../baro-orchestrator/src/participants/forwarders/coordination.ts
11655
+ var CoordinationForwarder = class extends BaseObserver {
11656
+ async onExternalEvent(_source, event) {
11657
+ if (Coordination.is(event)) {
11658
+ this.handleCoordination(event.data);
11659
+ return;
11660
+ }
11661
+ if (Critique.is(event)) {
11662
+ this.handleCritique(event.data);
11663
+ return;
11664
+ }
11665
+ }
11666
+ handleCoordination(item) {
11667
+ emit({
11668
+ type: "story_log",
11669
+ id: item.recipientId,
11670
+ line: `[sentry/${item.kind}] ${item.reason}`
11671
+ });
11672
+ }
11673
+ handleCritique(item) {
11674
+ emit({
11675
+ type: "story_log",
11676
+ id: item.agentId,
11677
+ line: `[critic/${item.verdict}] ${item.reasoning}`
11678
+ });
11679
+ }
11680
+ };
11681
+
11682
+ // ../baro-orchestrator/src/participants/forwarders/finalization.ts
11683
+ var FinalizationForwarder = class extends BaseObserver {
11684
+ async onExternalEvent(_source, event) {
11685
+ if (FinalizeStarted.is(event)) {
11686
+ const _item = event.data;
11687
+ emit({ type: "finalize_start" });
11688
+ return;
11689
+ }
11690
+ if (PrCreated.is(event)) {
11691
+ const item = event.data;
11692
+ emit({ type: "finalize_complete", pr_url: item.url });
11693
+ return;
11694
+ }
11695
+ }
11696
+ };
11697
+
11698
+ // ../baro-orchestrator/src/participants/forwarders/progress.ts
11699
+ var ProgressForwarder = class extends BaseObserver {
11700
+ async onExternalEvent(_source, event) {
11701
+ if (!ConductorState.is(event)) return;
11702
+ const item = event.data;
11703
+ if (item.phase === "running_level" && item.currentLevel != null && item.totalLevels != null) {
11704
+ emit({
11705
+ type: "progress",
11706
+ completed: item.currentLevel - 1,
11707
+ total: item.totalLevels,
11708
+ percentage: Math.round(
11709
+ (item.currentLevel - 1) / Math.max(1, item.totalLevels) * 100
11710
+ )
11711
+ });
11712
+ }
11713
+ }
11714
+ };
11715
+
11716
+ // ../baro-orchestrator/src/participants/forwarders/story-lifecycle.ts
11717
+ var StoryLifecycleForwarder = class extends BaseObserver {
11718
+ startedStories = /* @__PURE__ */ new Set();
11719
+ retryCounts = /* @__PURE__ */ new Map();
11720
+ async onExternalEvent(_source, event) {
11721
+ if (AgentState.is(event)) {
11722
+ this.handleAgentState(event.data);
11723
+ return;
11724
+ }
11725
+ if (StoryResult.is(event)) {
11726
+ this.handleStoryResult(event.data);
11727
+ return;
11728
+ }
11729
+ }
11730
+ handleAgentState(item) {
11731
+ if (item.phase === "running" && !this.startedStories.has(item.agentId)) {
11732
+ this.startedStories.add(item.agentId);
11733
+ emit({ type: "story_start", id: item.agentId, title: item.agentId });
11734
+ }
11735
+ if (item.phase === "waiting" && item.detail?.includes("retrying")) {
11736
+ const count = (this.retryCounts.get(item.agentId) ?? 0) + 1;
11737
+ this.retryCounts.set(item.agentId, count);
11738
+ emit({ type: "story_retry", id: item.agentId, attempt: count });
11739
+ }
11740
+ }
11741
+ handleStoryResult(item) {
11742
+ if (item.success) {
11743
+ emit({
11744
+ type: "story_complete",
11745
+ id: item.storyId,
11746
+ duration_secs: item.durationSecs,
11747
+ files_created: 0,
11748
+ files_modified: 0
11749
+ });
11750
+ } else {
11751
+ emit({
11752
+ type: "story_error",
11753
+ id: item.storyId,
11754
+ error: item.error ?? "unknown error",
11755
+ attempt: item.attempts,
11756
+ max_retries: item.attempts
11757
+ });
11758
+ }
11759
+ }
11760
+ };
11761
+
11762
+ // ../baro-orchestrator/src/participants/forwarders/token-usage.ts
11763
+ var TokenUsageForwarder = class extends BaseObserver {
11764
+ async onExternalEvent(_source, event) {
11765
+ if (AgentResult.is(event)) {
11766
+ this.handleClaudeResult(event.data);
11767
+ return;
11768
+ }
11769
+ if (CodexTurnEvent.is(event)) {
11770
+ this.handleCodexTurnEvent(event.data);
11771
+ return;
11772
+ }
11773
+ }
11774
+ handleClaudeResult(item) {
11775
+ const usage = item.usage;
11776
+ const inputTokens = typeof usage?.input_tokens === "number" ? usage.input_tokens : 0;
11777
+ const outputTokens = typeof usage?.output_tokens === "number" ? usage.output_tokens : 0;
11778
+ emit({
11779
+ type: "token_usage",
11780
+ id: item.agentId,
11781
+ input_tokens: inputTokens,
11782
+ output_tokens: outputTokens
11783
+ });
11784
+ }
11785
+ handleCodexTurnEvent(item) {
11786
+ if (item.phase !== "completed") return;
11787
+ const raw = item.raw;
11788
+ const usage = raw.usage;
11789
+ if (!usage) return;
11790
+ const inputTokens = typeof usage.input_tokens === "number" ? usage.input_tokens : 0;
11791
+ const outputBase = typeof usage.output_tokens === "number" ? usage.output_tokens : 0;
11792
+ const reasoning = typeof usage.reasoning_output_tokens === "number" ? usage.reasoning_output_tokens : 0;
11793
+ const outputTokens = outputBase + reasoning;
11794
+ emit({
11795
+ type: "token_usage",
11796
+ id: item.agentId,
11797
+ input_tokens: inputTokens,
11798
+ output_tokens: outputTokens
11799
+ });
11800
+ }
11801
+ };
11802
+
11803
+ // ../baro-orchestrator/src/participants/forwarders/index.ts
11804
+ function joinBaroEventForwarders(env) {
11805
+ const forwarders = [
11806
+ new AgentStreamForwarder(),
11807
+ new StoryLifecycleForwarder(),
11808
+ new TokenUsageForwarder(),
11809
+ new ProgressForwarder(),
11810
+ new CoordinationForwarder(),
11811
+ new FinalizationForwarder()
11812
+ ];
11813
+ for (const f of forwarders) f.join(env);
11814
+ }
11815
+
11614
11816
  // ../baro-orchestrator/src/participants/librarian.ts
11615
11817
  var EXPLORATION_TOOLS = /* @__PURE__ */ new Set([
11616
11818
  "Read",
@@ -14456,12 +14658,6 @@ var SurgeonOpenAI = class extends BaseObserver {
14456
14658
  }
14457
14659
  };
14458
14660
 
14459
- // ../baro-orchestrator/src/tui-protocol.ts
14460
- function emit(event) {
14461
- const line = JSON.stringify(event) + "\n";
14462
- process.stdout.write(line);
14463
- }
14464
-
14465
14661
  // ../baro-orchestrator/src/orchestrate.ts
14466
14662
  async function orchestrate(config) {
14467
14663
  const env = new AgenticEnvironment();
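
Reviewer note: the `emit` helper relocated by this hunk writes one JSON object per line to stdout, and `emitMultiline` (shown elsewhere in this diff) turns a multi-line message into one `story_log` event per line. The sketch below copies that splitting logic but returns the events instead of writing them, to make the resulting stream visible. `toStoryLogEvents` and the story id `"S1"` are made up for illustration.

```javascript
// Sketch: same splitting logic as emitMultiline in this diff, but
// returning events instead of writing them to stdout.
function toStoryLogEvents(agentId, text) {
  if (!text) return [];
  const lines = text.split("\n");
  const events = [];
  for (const line of lines) {
    if (line.length === 0 && lines.length === 1) continue;
    events.push({ type: "story_log", id: agentId, line });
  }
  return events;
}

// "S1" is a made-up story id.
const events = toStoryLogEvents("S1", "first\n\nthird");

// One newline-terminated JSON object per event, as emit() would write them.
const ndjson = events.map((e) => JSON.stringify(e) + "\n").join("");
process.stdout.write(ndjson);
```

Note that the blank middle line is forwarded as its own event: the `lines.length === 1` guard only skips a lone empty line, and empty text has already returned early.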
@@ -14497,7 +14693,7 @@ async function orchestrate(config) {
14497
14693
  for (const p of config.extraParticipants) p.join(env);
14498
14694
  }
14499
14695
  if (emitTui) {
14500
- new BaroEventForwarder().join(env);
14696
+ joinBaroEventForwarders(env);
14501
14697
  }
14502
14698
  const operator = new Operator(config.operatorHooks ?? {});
14503
14699
  operator.setEnvironment(env);
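
Reviewer note: this hunk swaps the single `BaroEventForwarder` for `joinBaroEventForwarders(env)`, which joins six single-purpose observers to the same environment. The sketch below illustrates the pattern with stand-in classes, since `BaseObserver` and `AgenticEnvironment` are not defined in this diff. `StubEnvironment`, `StubObserver`, the `broadcast` method, and the `"pr_created"` type string are assumptions; the real code dispatches with `ConductorState.is(event)` and `PrCreated.is(event)` rather than comparing `event.type`.

```javascript
// Hypothetical stand-ins: the real BaseObserver and AgenticEnvironment
// are not shown in this diff.
class StubEnvironment {
  observers = [];
  broadcast(event) {
    for (const o of this.observers) o.onExternalEvent(null, event);
  }
}

class StubObserver {
  join(env) {
    env.observers.push(this);
  }
  onExternalEvent(_source, _event) {}
}

const emitted = [];

// Mirrors ProgressForwarder: reacts only to conductor_state events.
class ProgressOnly extends StubObserver {
  onExternalEvent(_source, event) {
    if (event.type !== "conductor_state") return;
    const { phase, currentLevel, totalLevels } = event.data;
    if (phase !== "running_level" || currentLevel == null || totalLevels == null) return;
    emitted.push({
      type: "progress",
      completed: currentLevel - 1,
      total: totalLevels,
      percentage: Math.round(((currentLevel - 1) / Math.max(1, totalLevels)) * 100),
    });
  }
}

// Mirrors the PrCreated branch of FinalizationForwarder.
class FinalizeOnly extends StubObserver {
  onExternalEvent(_source, event) {
    if (event.type !== "pr_created") return; // assumed type string
    emitted.push({ type: "finalize_complete", pr_url: event.data.url });
  }
}

const env = new StubEnvironment();
for (const f of [new ProgressOnly(), new FinalizeOnly()]) f.join(env);

env.broadcast({
  type: "conductor_state",
  data: { phase: "running_level", currentLevel: 3, totalLevels: 4 },
});
env.broadcast({ type: "pr_created", data: { url: "https://example.com/pr/1" } });
console.log(emitted);
```

Each observer sees every event and ignores the ones it does not own, so behavior stays the same as long as no two forwarders handle the same event type.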
@@ -14698,198 +14894,6 @@ async function orchestrate(config) {
14698
14894
  storyAgents: /* @__PURE__ */ new Map()
14699
14895
  };
14700
14896
  }
14701
- var BaroEventForwarder = class extends BaseObserver {
14702
- /** Story IDs that have already received a `story_start`. */
14703
- startedStories = /* @__PURE__ */ new Set();
14704
- /** Number of in-flight retry attempts per story (for `story_retry`). */
14705
- retryCounts = /* @__PURE__ */ new Map();
14706
- /** Token-usage tally per story (incrementally updated from results). */
14707
- tokensByStory = /* @__PURE__ */ new Map();
14708
- async onExternalModelMessage(source, item) {
14709
- this.handleModelMessage(source, item);
14710
- }
14711
- async onExternalFunctionCall(source, item) {
14712
- this.handleToolCall(source, item);
14713
- }
14714
- async onExternalFunctionCallOutput(source, item) {
14715
- this.handleToolResult(source, item);
14716
- }
14717
- async onExternalEvent(_source, event) {
14718
- if (ConductorState.is(event)) {
14719
- this.handleConductorState(event.data);
14720
- return;
14721
- }
14722
- if (StoryResult.is(event)) {
14723
- this.handleStoryResult(event.data);
14724
- return;
14725
- }
14726
- if (AgentResult.is(event)) {
14727
- this.handleClaudeResult(event.data);
14728
- return;
14729
- }
14730
- if (CodexTurnEvent.is(event)) {
14731
- this.handleCodexTurnEvent(event.data);
14732
- return;
14733
- }
14734
- if (AgentState.is(event)) {
14735
- this.handleAgentState(event.data);
14736
- return;
14737
- }
14738
- if (ClaudeSystem.is(event)) {
14739
- return;
14740
- }
14741
- if (Coordination.is(event)) {
14742
- this.handleCoordination(event.data);
14743
- return;
14744
- }
14745
- if (Critique.is(event)) {
14746
- this.handleCritique(event.data);
14747
- return;
14748
- }
14749
- if (FinalizeStarted.is(event)) {
14750
- emit({ type: "finalize_start" });
14751
- return;
14752
- }
14753
- if (PrCreated.is(event)) {
14754
- emit({ type: "finalize_complete", pr_url: event.data.url });
14755
- return;
14756
- }
14757
- }
14758
- handleCoordination(item) {
14759
- emit({
14760
- type: "story_log",
14761
- id: item.recipientId,
14762
- line: `[sentry/${item.kind}] ${item.reason}`
14763
- });
14764
- }
14765
- handleCritique(item) {
14766
- emit({
14767
- type: "story_log",
14768
- id: item.agentId,
14769
- line: `[critic/${item.verdict}] ${item.reasoning}`
14770
- });
14771
- }
14772
- handleConductorState(item) {
14773
- if (item.phase === "running_level" && item.currentLevel != null && item.totalLevels != null) {
14774
- emit({
14775
- type: "progress",
14776
- completed: item.currentLevel - 1,
14777
- total: item.totalLevels,
14778
- percentage: Math.round(
14779
- (item.currentLevel - 1) / Math.max(1, item.totalLevels) * 100
14780
- )
14781
- });
14782
- }
14783
- }
14784
- handleStoryResult(item) {
14785
- if (item.success) {
14786
- emit({
14787
- type: "story_complete",
14788
- id: item.storyId,
14789
- duration_secs: item.durationSecs,
14790
- files_created: 0,
14791
- files_modified: 0
14792
- });
14793
- } else {
14794
- emit({
14795
- type: "story_error",
14796
- id: item.storyId,
14797
- error: item.error ?? "unknown error",
14798
- attempt: item.attempts,
14799
- max_retries: item.attempts
14800
- });
14801
- }
14802
- }
14803
- handleClaudeResult(item) {
14804
- const usage = item.usage;
14805
- const inputTokens = typeof usage?.input_tokens === "number" ? usage.input_tokens : 0;
14806
- const outputTokens = typeof usage?.output_tokens === "number" ? usage.output_tokens : 0;
14807
- const tally = this.tokensByStory.get(item.agentId) ?? { input: 0, output: 0 };
14808
- tally.input += inputTokens;
14809
- tally.output += outputTokens;
14810
- this.tokensByStory.set(item.agentId, tally);
14811
- emit({
14812
- type: "token_usage",
14813
- id: item.agentId,
14814
- input_tokens: inputTokens,
14815
- output_tokens: outputTokens
14816
- });
14817
- }
14818
- /**
14819
- * Codex emits its usage stats inside `turn.completed` envelopes
14820
- * (shape: `{type:"turn.completed", usage:{input_tokens,
14821
- * cached_input_tokens, output_tokens, reasoning_output_tokens}}`).
14822
- * Translate to the same `token_usage` BaroEvent shape Claude uses
14823
- * so the TUI's existing counter works without backend-specific
14824
- * branching. `cached_input_tokens` is rolled into `input_tokens`
14825
- * (Codex reports both — Claude only reports the combined total —
14826
- * so we surface the same number here for parity). Reasoning
14827
- * tokens are billed as output tokens by OpenAI so we lump them
14828
- * with output_tokens.
14829
- */
14830
- handleCodexTurnEvent(item) {
14831
- if (item.phase !== "completed") return;
14832
- const raw = item.raw;
14833
- const usage = raw.usage;
14834
- if (!usage) return;
14835
- const inputTokens = typeof usage.input_tokens === "number" ? usage.input_tokens : 0;
14836
- const outputBase = typeof usage.output_tokens === "number" ? usage.output_tokens : 0;
14837
- const reasoning = typeof usage.reasoning_output_tokens === "number" ? usage.reasoning_output_tokens : 0;
14838
- const outputTokens = outputBase + reasoning;
14839
- const tally = this.tokensByStory.get(item.agentId) ?? {
14840
- input: 0,
14841
- output: 0
14842
- };
14843
- tally.input += inputTokens;
14844
- tally.output += outputTokens;
14845
- this.tokensByStory.set(item.agentId, tally);
14846
- emit({
14847
- type: "token_usage",
14848
- id: item.agentId,
14849
- input_tokens: inputTokens,
14850
- output_tokens: outputTokens
14851
- });
14852
- }
14853
- handleAgentState(item) {
14854
- if (item.phase === "running" && !this.startedStories.has(item.agentId)) {
14855
- this.startedStories.add(item.agentId);
14856
- emit({ type: "story_start", id: item.agentId, title: item.agentId });
14857
- }
14858
- if (item.phase === "waiting" && item.detail?.includes("retrying")) {
14859
- const count = (this.retryCounts.get(item.agentId) ?? 0) + 1;
14860
- this.retryCounts.set(item.agentId, count);
14861
- emit({ type: "story_retry", id: item.agentId, attempt: count });
14862
- }
14863
- }
14864
- handleModelMessage(source, item) {
14865
- const agentId = source.agentId;
14866
- if (typeof agentId !== "string") return;
14867
- const json = item.toJSON();
14868
- const text = json.content?.[0]?.text ?? "";
14869
- if (!text.trim()) return;
14870
- emitMultiline(agentId, text);
14871
- }
14872
- handleToolCall(source, item) {
14873
- const agentId = source.agentId;
14874
- if (typeof agentId !== "string") return;
14875
- emitMultiline(agentId, `[tool_call] ${item.name} ${item.args}`);
14876
- }
14877
- handleToolResult(source, item) {
14878
- const agentId = source.agentId;
14879
- if (typeof agentId !== "string") return;
14880
- const json = item.toJSON();
14881
- const text = json.output?.[0]?.text ?? "";
14882
- emitMultiline(agentId, `[tool_result ${json.call_id}] ${text}`);
14883
- }
14884
- };
14885
- function emitMultiline(agentId, text) {
14886
- if (!text) return;
14887
- const lines = text.split("\n");
14888
- for (const line of lines) {
14889
- if (line.length === 0 && lines.length === 1) continue;
14890
- emit({ type: "story_log", id: agentId, line });
14891
- }
14892
- }
14893
14897
  function tokenizeForHints(text) {
14894
14898
  const seen = /* @__PURE__ */ new Set();
14895
14899
  const out = [];
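
Reviewer note: the removed `BaroEventForwarder` kept a running `tokensByStory` tally, while the new `TokenUsageForwarder` only emits per-event `token_usage` counts (with Codex reasoning tokens folded into `output_tokens`). If running totals per story are wanted, a consumer of the stdout stream can sum the events itself. The following is a minimal sketch of such a consumer; `tallyTokenUsage` and the sample stream are hypothetical and not part of the package.

```javascript
// Hypothetical consumer of baro's line-delimited JSON stream.
function tallyTokenUsage(ndjson) {
  const totals = new Map();
  for (const line of ndjson.split("\n")) {
    if (!line.trim()) continue;
    let event;
    try {
      event = JSON.parse(line);
    } catch {
      continue; // Skip anything that is not a JSON event line.
    }
    if (!event || event.type !== "token_usage") continue;
    const tally = totals.get(event.id) ?? { input: 0, output: 0 };
    tally.input += event.input_tokens;
    tally.output += event.output_tokens;
    totals.set(event.id, tally);
  }
  return totals;
}

// Made-up sample stream; "S1" and "S2" are illustrative story ids.
const sample = [
  '{"type":"token_usage","id":"S1","input_tokens":100,"output_tokens":20}',
  '{"type":"story_log","id":"S1","line":"working"}',
  '{"type":"token_usage","id":"S1","input_tokens":50,"output_tokens":5}',
  '{"type":"token_usage","id":"S2","input_tokens":7,"output_tokens":3}',
].join("\n");

console.log(tallyTokenUsage(sample));
```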