npm - baro-ai - Versions diffs - 0.20.0 → 0.21.0 - Mend

baro-ai 0.20.0 → 0.21.0

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.

Files changed (4)

package/README.md CHANGED

@@ -124,18 +124,23 @@ Options:
   --resume                     Resume from existing prd.json (also runs dry-run plans)
   --skip-context               Skip CLAUDE.md auto-generation
   --cwd <path>                 Working directory (default: current)
-  --with-critic                Enable live Critic — reviews each agent turn
-                               against acceptance criteria via `claude --model haiku`
-                               and injects corrective feedback (default: off)
+  --no-critic                  Disable live Critic (default: on). The Critic
+                               reviews each agent turn against acceptance
+                               criteria via `claude --model haiku` and injects
+                               corrective feedback when the turn doesn't pass.
   --critic-model <name>        Model for the Critic (default: haiku)
   --no-librarian               Disable cross-agent runtime memory (default: on)
   --no-sentry                  Disable file-touch conflict detector (default: on)
-  --with-surgeon               Enable adaptive DAG: drop / replace failing stories
-                               at level boundaries instead of stalling (default: off)
-  --surgeon-use-llm            Use `claude --model …` for richer Surgeon replans
-                               (default: deterministic skip-only)
-  --surgeon-model <name>       Model for Surgeon when --surgeon-use-llm is on
-                               (default: opus)
+  --no-surgeon                 Disable Surgeon (default: on). The Surgeon
+                               observes terminal story failures and proposes
+                               replans (split / prereq / rewire) so failed
+                               work gets done in a different shape rather
+                               than dropped.
+  --no-surgeon-llm             Use deterministic Surgeon (skip-only) instead
+                               of the LLM-driven replanner. The LLM Surgeon
+                               is on by default; it costs an Opus call per
+                               terminal failure but produces richer replans.
+  --surgeon-model <name>       Model for the Surgeon LLM (default: opus)
   -h, --help                   Print help
 ```
@@ -150,15 +155,22 @@ react to one another's bus events:
   redundant exploration. Measurable token savings on multi-story runs.
 - **Sentry** (default ON) — flags overlapping Edit/Write tool calls
   across concurrent stories.
-- **Critic** (`--with-critic`, default OFF) — Haiku evaluator reviews
-  each agent turn against acceptance criteria; on a fail verdict, an
-  inline corrective message lands as the agent's next turn so it
-  self-corrects before commit.
-- **Surgeon** (`--with-surgeon`, default OFF) — when a story fails its
-  retry budget, a ReplanItem is emitted on the bus and the Conductor
-  recomputes the DAG at the next level boundary. The simplest mode just
-  drops failing stories so dependents unblock; with `--surgeon-use-llm`
-  Opus proposes splits, prerequisite inserts, or dependency rewires.
+- **Critic** (default ON) — Haiku evaluator reviews each agent turn
+  against acceptance criteria; on a fail verdict, an inline corrective
+  message lands as the agent's next turn so it self-corrects before
+  commit. Disable with `--no-critic`.
+- **Surgeon** (default ON, with LLM) — when a story fails its retry
+  budget, the Surgeon asks Opus for a richer replan and emits a
+  ReplanItem the Conductor applies at the next level boundary. The LLM
+  is biased toward keeping the work done — it prefers splitting a too-
+  large story into smaller pieces, inserting a prerequisite, or
+  rewiring dependencies, over dropping outright. A run is reported as
+  successful only when every original story passes; if the Surgeon
+  drops a story without replacement, the run terminates with a clear
+  "did not complete the goal" verdict instead of a green tick. Disable
+  the LLM with `--no-surgeon-llm` to fall back to deterministic
+  skip-only behavior, or `--no-surgeon` to remove adaptive replans
+  entirely.
 ## Requirements
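
The README hunks above flip the Critic and the Surgeon from opt-in (`--with-*`) to opt-out (`--no-*`). As a rough illustration of what those defaults imply, here is a minimal sketch of flag resolution. `resolveAgentFlags` and its result fields are hypothetical names, and this is not baro-ai's actual argument parser. Tying `surgeonUseLlm` to `surgeon` is also an assumption, based on the README saying `--no-surgeon` removes adaptive replans entirely.

```javascript
// Hypothetical helper, NOT baro-ai's real parser. Defaults follow the
// 0.21.0 README text: Critic and Surgeon are on unless explicitly negated.
function resolveAgentFlags(argv) {
  const has = (flag) => argv.includes(flag);
  // Return the token following `flag`, or `fallback` when the flag is absent.
  const valueOf = (flag, fallback) => {
    const i = argv.indexOf(flag);
    return i !== -1 && i + 1 < argv.length ? argv[i + 1] : fallback;
  };
  const surgeon = !has("--no-surgeon");
  return {
    critic: !has("--no-critic"),
    criticModel: valueOf("--critic-model", "haiku"),
    surgeon,
    // Assumption: the LLM replanner only matters while the Surgeon is enabled.
    surgeonUseLlm: surgeon && !has("--no-surgeon-llm"),
    surgeonModel: valueOf("--surgeon-model", "opus"),
  };
}

// No flags: everything on, default models.
console.log(resolveAgentFlags([]));
// Deterministic Surgeon plus a different Critic model.
console.log(resolveAgentFlags(["--no-surgeon-llm", "--critic-model", "sonnet"]));
```

Anyone who previously ran without `--with-critic` or `--with-surgeon` now gets both agents, and the Opus call per terminal failure, unless they pass the new `--no-*` flags.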

package/dist/cli.mjs CHANGED

@@ -8673,6 +8673,14 @@ var Conductor = class extends Participant {
   globalCompleted = [];
   /** All stories that have failed terminally (after retries) in this run. */
   globalFailed = [];
+  /**
+   * Stories removed from the PRD by a Surgeon replan without a
+   * replacement. These do NOT come back to globalFailed (the failing
+   * story is gone from the PRD and won't be retried) but they DO
+   * count against the run's success verdict — terminateRun(success)
+   * is true only when this list is empty.
+   */
+  globalDropped = [];
   totalAttempts = 0;
   appliedReplans = 0;
   currentLevel = null;
@@ -8761,7 +8769,8 @@ var Conductor = class extends Participant {
     if (!this.prd) return;
     const levels = buildDag(this.prd.userStories, { onlyIncomplete: true });
     if (levels.length === 0) {
-      this.terminateRun(this.globalFailed.length === 0, null);
+      const allPassed = this.prd.userStories.every((s) => s.passes) && this.globalDropped.length === 0;
+      this.terminateRun(allPassed, null);
       return;
     }
     const level = levels[0];
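
The hunk above tightens the success verdict: every story left in the PRD must pass, and no story may have been dropped without a replacement. A standalone sketch of that condition follows. `isRunSuccessful` is a hypothetical name used only for illustration, and the story objects are reduced to the two fields the check reads.

```javascript
// Standalone sketch of the 0.21.0 verdict condition shown in the hunk.
// `isRunSuccessful` is a hypothetical name, not a function in cli.mjs.
function isRunSuccessful(userStories, globalDropped) {
  return userStories.every((s) => s.passes) && globalDropped.length === 0;
}

// S2 was removed from the PRD by a replan, so only S1 and S3 remain.
const remaining = [
  { id: "S1", passes: true },
  { id: "S3", passes: true },
];

console.log(isRunSuccessful(remaining, []));     // true
console.log(isRunSuccessful(remaining, ["S2"])); // false
```

Under the removed check (`globalFailed.length === 0`), the second case could be reported as a success, because the old replan code spliced removed stories out of `globalFailed`.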
@@ -8898,9 +8907,17 @@ ${prompt}`;
         replannedThisLevel = true;
         if (replan.removedStoryIds.length > 0) {
           const removeSet = new Set(replan.removedStoryIds);
-          for (let i = this.globalFailed.length - 1; i >= 0; i--) {
-            if (removeSet.has(this.globalFailed[i])) {
-              this.globalFailed.splice(i, 1);
+          if (replan.addedStories.length > 0) {
+            for (let i = this.globalFailed.length - 1; i >= 0; i--) {
+              if (removeSet.has(this.globalFailed[i])) {
+                this.globalFailed.splice(i, 1);
+              }
+            }
+          } else {
+            for (const id of replan.removedStoryIds) {
+              if (!this.globalDropped.includes(id)) {
+                this.globalDropped.push(id);
+              }
             }
           }
         }
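
To make the two branches in the hunk above easier to follow, here is the same bookkeeping as a standalone function. `recordReplan` is a hypothetical name, and the arrays are passed in rather than read from `this`. The logic is copied from the added lines, not extended.

```javascript
// Hypothetical standalone version of the replan bookkeeping in the hunk.
// Mutates `globalFailed` and `globalDropped` in place, as the original does.
function recordReplan(replan, globalFailed, globalDropped) {
  if (replan.removedStoryIds.length === 0) return;
  if (replan.addedStories.length > 0) {
    // Replacement stories exist: forgive the removed ids.
    const removeSet = new Set(replan.removedStoryIds);
    for (let i = globalFailed.length - 1; i >= 0; i--) {
      if (removeSet.has(globalFailed[i])) globalFailed.splice(i, 1);
    }
  } else {
    // Removal without replacement: record each id once as dropped.
    for (const id of replan.removedStoryIds) {
      if (!globalDropped.includes(id)) globalDropped.push(id);
    }
  }
}

const failed = ["S2", "S4"];
const dropped = [];
recordReplan({ removedStoryIds: ["S2"], addedStories: [{ id: "S2a" }] }, failed, dropped);
recordReplan({ removedStoryIds: ["S4"], addedStories: [] }, failed, dropped);
console.log(failed);  // [ 'S4' ]
console.log(dropped); // [ 'S4' ]
```

As the hunk reads, the skip-only branch does not splice the id out of `globalFailed`, so a dropped story that had already been recorded as failed appears in both lists.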
@@ -8930,15 +8947,17 @@ ${prompt}`;
     if (this.phase === "done") return;
     this.phase = "done";
     const totalDurationSecs = Math.round((Date.now() - this.startedAt) / 1e3);
+    const droppedSegment = this.globalDropped.length > 0 ? `, ${this.globalDropped.length} dropped` : "";
     this.emit(
       new ConductorStateItem(
         success ? "done" : "failed",
-        abortReason ?? `${this.globalCompleted.length} passed, ${this.globalFailed.length} failed in ${totalDurationSecs}s`
+        abortReason ?? `${this.globalCompleted.length} passed, ${this.globalFailed.length} failed${droppedSegment} in ${totalDurationSecs}s`
       )
     );
     const summary = {
       completedStories: [...this.globalCompleted],
       failedStories: [...this.globalFailed],
+      droppedStories: [...this.globalDropped],
       totalDurationSecs,
       totalAttempts: this.totalAttempts
     };
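
The status message built in the hunk above gains an optional "dropped" segment. A small sketch of the formatting, using a hypothetical `describeRun` helper that takes counts instead of reading the Conductor's arrays:

```javascript
// Hypothetical helper reproducing the status string from the hunk.
function describeRun(passed, failed, dropped, totalDurationSecs) {
  // The segment is omitted entirely when nothing was dropped.
  const droppedSegment = dropped > 0 ? `, ${dropped} dropped` : "";
  return `${passed} passed, ${failed} failed${droppedSegment} in ${totalDurationSecs}s`;
}

console.log(describeRun(5, 0, 0, 42)); // 5 passed, 0 failed in 42s
console.log(describeRun(4, 0, 1, 42)); // 4 passed, 0 failed, 1 dropped in 42s
```

The message is unchanged from 0.20.0 for runs with no drops, so anything matching on the old string would only see a difference when a story is dropped.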
@@ -9563,18 +9582,38 @@ DAG when stories fail. Given:
 2. The id, title, description, and FAILURE REASON of the story that just
    exhausted its retry budget.
-Decide ONE of:
-  (a) "skip"      \u2014 the failure isn't load-bearing; remove only this story.
-  (b) "split"     \u2014 replace the failing story with 2-3 smaller stories.
-  (c) "prereq"    \u2014 insert a NEW story that the failing one depends on,
-                    AND remove the failing one (it can be re-attempted
-                    later by re-introducing it manually).
-  (d) "abort"     \u2014 nothing useful can be salvaged; emit no replan.
+Decide ONE of, in this order of preference:
+  (a) "split"     \u2014 replace the failing story with 2-3 smaller stories
+                    that together cover its acceptance criteria. Use
+                    this whenever the failure looks like the story was
+                    too broad \u2014 too many files, too many concerns,
+                    too much for one Claude session. Strongly preferred
+                    over removal whenever the goal still needs the work.
+  (b) "prereq"    \u2014 insert ONE OR MORE new prerequisite stories that
+                    the failing story now depends on, then ALSO add a
+                    replacement of the failing story (with updated
+                    dependsOn) so the original work still gets done.
+                    Removing without replacement is NOT prereq.
+  (c) "rewire"    \u2014 keep the failing story BUT modifyDeps so it runs
+                    in a different order, or change its dependsOn to
+                    unblock dependents. Use when the failure was
+                    timing-related, not scope-related.
+  (d) "skip"      \u2014 last resort. Use ONLY when the story is genuinely
+                    infeasible (e.g., asks for a library that doesn't
+                    exist, references files that aren't there). When
+                    you skip, modifyDeps for any dependents so the
+                    rest of the run can still complete.
+  (e) "abort"     \u2014 only when the entire run cannot continue.
+Strong bias: the run is only successful when EVERY original goal item
+gets done. Splitting into smaller stories is almost always better than
+dropping. Don't drop just because one attempt failed \u2014 propose a
+different approach.
 Respond ONLY with a JSON object \u2014 no prose, no markdown fences \u2014 in
 exactly this shape:
-{"action":"skip"|"split"|"prereq"|"abort",
+{"action":"split"|"prereq"|"rewire"|"skip"|"abort",
  "reason":"\u2026",
  "added":[ { "id":"S?","priority":N,"title":"\u2026","description":"\u2026",
              "dependsOn":["\u2026"], "acceptance":["\u2026"] } ],
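
The prompt above widens the allowed `action` values to five. How cli.mjs validates the reply is not visible in this diff, so the following is only a sketch under that caveat: `parseSurgeonReply` and `SURGEON_ACTIONS` are hypothetical names, and only the `action` field is checked because the rest of the JSON shape is cut off at the hunk boundary.

```javascript
// Hypothetical validation sketch; the real parsing code is not part of this diff.
const SURGEON_ACTIONS = ["split", "prereq", "rewire", "skip", "abort"];

// Returns the parsed reply, or null when it is not JSON or names an unknown action.
function parseSurgeonReply(text) {
  let reply;
  try {
    reply = JSON.parse(text);
  } catch {
    return null;
  }
  if (reply === null || typeof reply !== "object") return null;
  if (!SURGEON_ACTIONS.includes(reply.action)) return null;
  return reply;
}

console.log(parseSurgeonReply('{"action":"rewire","reason":"timing"}'));
console.log(parseSurgeonReply('{"action":"retry"}')); // null
```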
@@ -9595,9 +9634,9 @@ var Surgeon = class extends Participant {
   constructor(opts) {
     super();
     this.opts = {
-      useLlm: opts.useLlm ?? false,
+      useLlm: opts.useLlm ?? true,
       model: opts.model ?? "opus",
-      maxReplans: opts.maxReplans ?? 3,
+      maxReplans: opts.maxReplans ?? 10,
       claudeBin: opts.claudeBin ?? "claude",
       timeoutMs: opts.timeoutMs ?? 9e4,
       snapshot: opts.snapshot
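
The constructor hunk above changes two defaults through the nullish coalescing operator (`??`). The sketch below isolates that pattern to show why an explicit `false` or `0` from the caller still wins. `surgeonDefaults` is a hypothetical function covering only three of the options.

```javascript
// Hypothetical extraction of the defaulting pattern from the Surgeon constructor.
function surgeonDefaults(opts = {}) {
  return {
    useLlm: opts.useLlm ?? true,       // 0.20.0 default was false
    model: opts.model ?? "opus",
    maxReplans: opts.maxReplans ?? 10, // 0.20.0 default was 3
  };
}

// Nothing supplied: the new 0.21.0 defaults apply.
console.log(surgeonDefaults());
// `??` only falls back on null or undefined, so false and 0 are preserved.
console.log(surgeonDefaults({ useLlm: false, maxReplans: 0 }));
```

With `||` instead of `??`, the second call would silently turn the LLM back on and reset `maxReplans` to 10.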