npm - compound-agent - Versions diffs - 2.7.1 → 2.8.0 - Mend

compound-agent 2.7.1 → 2.8.0

Files changed (3)

package/CHANGELOG.md CHANGED

@@ -7,6 +7,53 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [Unreleased]
+## [2.8.0] - 2026-05-17
+### Upgrading (action required for the default path)
+- `ca loop` / `ca polish` now default to the `claude --bg` backend. The bg
+  backend **fails loud (exit 1) at startup** unless **both** operator
+  prerequisites are met: (1) the `--dangerously-skip-permissions` bypass
+  disclaimer is accepted on the machine, and (2) Claude's
+  `worktree.bgIsolation` setting is `none`. If you cannot set these, pin the
+  legacy behavior with `--backend p` (or `CA_BACKEND=p`) — that path is
+  byte-for-byte unchanged.
+### Changed
+- **Default loop/polish/review backend is now `claude --bg` (subscription-billed)**: `ca loop` and `ca polish` default to the bg backend (`claude --bg`). The legacy `claude -p` path remains fully supported via `--backend p` or `CA_BACKEND=p`. Precedence: explicit `--backend` flag > `CA_BACKEND` env > default (bg). The bg backend runs sessions as background jobs polled via `state.json`, auto-isolates each session into a git worktree (harvested before cleanup), and uses the existing `EPIC_COMPLETE`/`HUMAN_REQUIRED:`/`EPIC_FAILED` protocol unchanged.
+- **`--backend bg|p` flag added to `ca loop` and `ca polish`**: Explicit backend selection. Default is `bg`. `--backend p` restores the legacy `claude -p` streaming pipeline byte-for-byte.
+- **Bootstrap preflight added to bg-backend scripts**: Generated scripts with `--backend bg` (or default) now run a `bootstrap_preflight` step before the epic loop starts. It enforces two operator prerequisites and fails loud (exit 1, remediation) if either is missing: (1) the `--dangerously-skip-permissions` bypass disclaimer must be accepted on the machine; (2) Claude's `worktree.bgIsolation` setting must be `none` — otherwise `bd` (keyed to the main repo path) is unreachable from the worktree `claude --bg` auto-isolates into, so epics never close and the polish architect's `bd` writes are lost. The `bgIsolation` check tries `claude config get worktree.bgIsolation` first and falls back to the settings-JSON precedence (`.claude/settings.local.json` → project `.claude/settings.json` → `~/.claude/settings.json`); it errs toward failing loud when it cannot confirm `none`. Both checks are skipped entirely for `--backend p`.
+- **Polish architect runs synchronously regardless of backend**: the polish architect (which only runs `bd create --type=epic` / `bd dep add` and makes no code edits) always uses the synchronous `agent_invoke` path even under `CA_BACKEND=bg`, so its `bd` epic writes reach the main-tree Dolt instead of a discarded bg worktree. The reviewer fleet is unaffected and is still bg-dispatched.
+- **Worktree harvest**: with the bg backend, each `claude --bg` session auto-isolates into a git worktree (`worktree-<name>` branch). After terminal state, the loop merges the worktree into the working branch before `claude rm`. `claude rm` runs only when worktree safety is positively verified. On harvest failure (merge conflict, ambiguous worktrees, non-success marker) **or when the pre-dispatch worktree snapshot is missing** (worktree safety unverifiable), the worktree is retained, the epic is marked `HUMAN_REQUIRED`, and `claude rm` is skipped to preserve work — a missing snapshot is treated as unverifiable, never as "safe to remove".
+- **`ca watch` bg data source**: `ca watch` follows the `.latest` symlink updated by the harvest/collect step to a trace file in `.compound-agent/agent_logs/`.
+- **Polish inner-loop backend propagation**: the polish loop now propagates `CA_BACKEND` to the inner infinity loop via `CA_BACKEND="${CA_BACKEND:-bg}" bash "$inner_script"`.
+### Removed
+- **Improve loop (`ca improve`, `ca loop --improve`, `ca watch --improve`)**: Removed the improve loop command and all associated flags and shell-script plumbing. The feature is superseded; this removal is unrelated to the bg backend migration.
+## [2.7.2] - 2026-04-16
+### Added
+- **`ca uninstall` command**: reverses `ca init` / `ca setup` in three tiers. Default removes managed Claude Code hooks from `.claude/settings.json`. `--templates` additionally removes the `compound/` template directories (`.claude/agents/compound/`, `.claude/commands/compound/`, `.claude/skills/compound/`, `docs/compound/`) and `.claude/plugin.json`. `--all` additionally removes `.compound-agent/` runtime state and strips managed marker blocks from `AGENTS.md`, `.claude/CLAUDE.md`, root `.gitignore`, and `.claude/.gitignore`. `.claude/lessons/` and `.claude/compound-agent.json` are ALWAYS preserved. Requires `--yes` to skip interactive confirmation.
+- **Install profiles (`ca setup --profile <minimal|workflow|full>`)**: `minimal` installs only lesson-capture plumbing (lessons dir, AGENTS.md integration, plugin.json, and 3 hooks — SessionStart/PreCompact prime + UserPromptSubmit reminder) — no commands, no phase skills, no agent role skills, no docs. `workflow` adds the 5-phase cook-it workflow (all commands, phase skills, agent role skills, doc templates, all phase/failure hooks) but skips the `docs/compound/research/` tree. `full` (default, backward-compatible) installs everything. `--confirm-prune` is required when a lower profile would delete existing templates on disk.
+### Changed
+- **Default model bumped to Opus 4.7**: All default model references updated from `claude-opus-4-6[1m]` to `claude-opus-4-7[1m]`. Affects `ca loop --model`, `ca loop --review-model`, `ca polish --model`, `ca improve --model`, the `claude-opus` reviewer selector in the generated review/polish scripts, and the Simplicity lens in the architect advisory fleet. Template skills (`loop-launcher/SKILL.md`, `architect/references/infinity-loop/README.md`, `architect/references/infinity-loop/troubleshooting.md`, `architect/references/polish-loop/README.md`, `architect/references/advisory-fleet.md`) and shipped docs (`CLI_REFERENCE.md`) updated to match. The Windows PowerShell reference template `$MODEL` default also updated.
+### Fixed
+- **Concrete fallback quality-gate commands**: when the project stack cannot be detected, the fallback strings substituted for `{{QUALITY_GATE_TEST}}`, `{{QUALITY_GATE_LINT}}`, and `{{QUALITY_GATE_BUILD}}` are now shell commands that exit non-zero with a `[compound-agent]`-tagged diagnostic on stderr, e.g. `sh -c 'echo "[compound-agent] test command not configured..." >&2; exit 1'`. Previously the fallbacks were English prose ("detect and run the project's test suite") that, when rendered into SKILL.md as shell commands, broke sh parsing on the apostrophe and produced a confusing error rather than a clear failure. Agents now see a visible failure with actionable configuration guidance.
+- **`hasTransitionEvidence` robustness**: replaced hardcoded `5` with `len(Phases)` so the cook-it final-phase branch tracks the phase list automatically. No behavioral change today (architect at index 6 correctly falls through); regression tests (`TestHasTransitionEvidence_OutOfRangeReturnsFalse`, `TestHasTransitionEvidence_FinalPhaseAlwaysTrue`) pin the contract so a future `Phases` shrink can't introduce a panic.
+- **Empty hook arrays in `settings.json`**: `AddHooksForProfile` and `RemoveAllHooks` both drop empty `"PreToolUse": []` style entries at the end of their run (previously created eagerly by `upgradeNpxHooks` calling `getHookArray` for every known hook type). Cosmetic only; no behavioral change for in-profile hooks.
+- **`removeIfPresent` (formerly `removeIfExists`) semantics**: rewrote to suppress `os.ErrNotExist` internally and return `(existed bool, err error)`. `uninstallTemplates` now propagates real I/O errors (permission denied, read-only FS) when removing `.claude/plugin.json` instead of silently reporting success. Previously a permission error on plugin.json would make `ca uninstall --templates` report success without removing the file — flagged by the second-pass reviewer as a blocking issue. Regression tests `TestRemoveIfPresent_RealIOErrorSurfaces` and `TestUninstallTemplates_PropagatesRealIOError` pin the contract.
+- **`polish-loop.sh` dry-run leaked state** (#17, #16): `POLISH_DRY_RUN=1` now fully skips the post-loop commit/push block AND writes a distinct `"status":"dry-run-completed"` to `.polish-status.json` instead of overwriting it with `"status":"completed"`. Previously a dry run still mutated git *and* produced a status file indistinguishable from a real run, defeating the preflight contract and misleading any monitoring tool that polled the status file. Regression test `TestPolishCommand_PostLoopRespectsDryRun` pins the guard structure so this can't silently come back.
 ## [2.7.1] - 2026-04-10
 ### Added

package/README.md CHANGED

@@ -143,53 +143,32 @@ ca loop --reviewers claude-sonnet --review-every 3
 `ca loop` generates a bash script that processes your beads epics sequentially, running the full cook-it cycle on each one. No human intervention required between epics.
+The default backend is `claude --bg` (subscription-billed; requires accepting the bypass-permissions disclaimer once: `claude --dangerously-skip-permissions`). Use `--backend p` or `CA_BACKEND=p` for the legacy `claude -p` (pay-per-token) path.
 ```bash
-# Generate script for all ready epics
+# Generate script for all ready epics (bg backend by default)
 ca loop
+# Explicit backend selection
+ca loop --backend bg     # bg (default): subscription-billed
+ca loop --backend p      # p: legacy pay-per-token
 # With periodic review every 3 epics
 ca loop --reviewers claude-sonnet --review-every 3
 # Target specific epics
 ca loop --epics "beads-abc,beads-def,beads-ghi" --max-retries 2
-# Run it
-./.compound-agent/infinity-loop.sh
+# Run it (always use screen for durability)
+screen -dmS compound-loop bash ./.compound-agent/infinity-loop.sh
 ```
+**One-time bootstrap (bg backend)**: run `claude --dangerously-skip-permissions` once interactively to accept the bypass-permissions disclaimer. The generated script's bootstrap preflight detects a missing disclaimer and exits with remediation instructions before starting the loop.
 The loop respects beads dependency graphs — it only processes epics whose dependencies are complete. If an epic fails after `--max-retries` attempts, it stops and reports before proceeding.
 **Current maturity**: the loop works and has been used to ship real projects, including compound-agent itself. Two things still required human involvement: specifications had to be written before the loop started, and a human applied fixes after the first review pass surfaced real problems (missing error handling, a migration gap, insufficient test coverage). Fully unattended long-duration runs across many epics are the current area of hardening.
-## The improvement loop
-`ca improve` generates a bash script that iterates over `improve/*.md` program files, spawning Claude Code sessions to make focused improvements. Each program file defines what to improve, how to find work, and how to validate changes.
-```bash
-# Scaffold an example program file
-ca improve init
-# Creates improve/example.md with a linting template
-# Generate the improvement script
-ca improve
-# Filter to specific topics
-ca improve --topics lint tests --max-iters 3
-# Preview without generating
-ca improve --dry-run
-# Run the generated script
-./.compound-agent/improvement-loop.sh
-# Preview without executing Claude sessions
-IMPROVE_DRY_RUN=1 ./.compound-agent/improvement-loop.sh
-```
-Each iteration makes one focused improvement, commits it, and moves on. If an iteration finds nothing to improve or fails validation, it reverts cleanly and moves to the next topic. The loop tracks consecutive no-improvement results and stops early to avoid diminishing returns.
-Monitor progress with `ca watch --improve` to see live trace output from improvement sessions.
 ## Automatic hooks
 Once installed, seven Claude Code hooks fire without any commands:
@@ -301,7 +280,9 @@ The CLI binary is `ca` (alias: `compound-agent`).
 | Command | Description |
 |---------|-------------|
-| `ca loop` | Generate infinity loop script for autonomous epic processing |
+| `ca loop` | Generate infinity loop script (default: `claude --bg`, subscription-billed) |
+| `ca loop --backend bg` | Default bg backend: `claude --bg` (subscription-billed) |
+| `ca loop --backend p` | Legacy p backend: `claude -p` (pay-per-token) |
 | `ca loop --epics "id1,id2,id3"` | Target specific epic IDs (comma-separated) |
 | `ca loop -o <path>` | Custom output path (default: `./.compound-agent/infinity-loop.sh`) |
 | `ca loop --max-retries <n>` | Max retries per epic on failure (default: 1) |
@@ -310,19 +291,13 @@ The CLI binary is `ca` (alias: `compound-agent`).
 | `ca loop --review-every <n>` | Review every N completed epics (0 = end-only, default: 0) |
 | `ca loop --max-review-cycles <n>` | Max review/fix iterations (default: 3) |
 | `ca loop --review-blocking` | Fail loop if review not approved after max cycles |
-| `ca loop --review-model <model>` | Model for implementer fix sessions (default: claude-opus-4-6[1m]) |
-| `ca improve` | Generate improvement loop script from `improve/*.md` programs |
-| `ca improve --topics <names...>` | Run only specific topics |
-| `ca improve --max-iters <n>` | Max iterations per topic (default: 5) |
-| `ca improve --time-budget <seconds>` | Total time budget, 0=unlimited (default: 0) |
-| `ca improve --dry-run` | Validate and print plan without generating |
-| `ca improve --force` | Overwrite existing script |
-| `ca improve init` | Scaffold an example `improve/*.md` program file |
+| `ca loop --review-model <model>` | Model for implementer fix sessions (default: claude-opus-4-7[1m]) |
 | `ca watch` | Tail and pretty-print live trace from loop sessions |
 | `ca watch --epic <id>` | Watch a specific epic trace |
-| `ca watch --improve` | Watch improvement loop traces |
 | `ca watch --no-follow` | Print existing trace and exit (no live tail) |
-| `ca polish` | Generate polish loop script for iterative refinement |
+| `ca polish` | Generate polish loop script (default: `claude --bg`, subscription-billed) |
+| `ca polish --backend bg` | Default bg backend: `claude --bg` (subscription-billed) |
+| `ca polish --backend p` | Legacy p backend: `claude -p` (pay-per-token) |
 | `ca polish --spec-file <path>` | Specify the spec file for polish review |
 | `ca polish --reviewers <names>` | Comma-separated reviewer models |
 | `ca polish --cycles <n>` | Number of polish cycles (default: 1) |
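
Note on backend selection: the changelog for 2.8.0 states the precedence as explicit `--backend` flag > `CA_BACKEND` env > default (`bg`), and the README tables above list `bg` and `p` as the two values. The sketch below shows one way that resolution could be written. It is a hedged illustration, not the tool's code: the function name `resolveBackend`, the convention that an empty string means "not set", and the rejection of values other than `bg` or `p` are all assumptions.

```go
package main

import (
	"fmt"
	"os"
)

// defaultBackend mirrors the documented 2.8.0 default.
const defaultBackend = "bg"

// resolveBackend applies the documented precedence:
// explicit --backend flag > CA_BACKEND env > default (bg).
// flagValue is "" when the flag was not passed; envValue is ""
// when the variable is unset. Rejecting unknown values is an
// assumption; the source does not say how they are handled.
func resolveBackend(flagValue, envValue string) (string, error) {
	choice := defaultBackend
	switch {
	case flagValue != "":
		choice = flagValue
	case envValue != "":
		choice = envValue
	}
	if choice != "bg" && choice != "p" {
		return "", fmt.Errorf("unknown backend %q (want bg or p)", choice)
	}
	return choice, nil
}

func main() {
	// No flag given here, so CA_BACKEND (if set) decides, else the default.
	backend, err := resolveBackend("", os.Getenv("CA_BACKEND"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("backend:", backend)
}
```

Under these assumptions, `resolveBackend("p", "bg")` returns `p` because the flag wins over the environment, and `resolveBackend("", "")` returns `bg`.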

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "compound-agent",
-  "version": "2.7.1",
+  "version": "2.8.0",
   "type": "module",
   "description": "Learning system for Claude Code — avoids repeating mistakes across sessions",
   "bin": {
@@ -51,12 +51,12 @@
     "knowledge-management"
   ],
   "optionalDependencies": {
-    "@syottos/darwin-arm64": "2.7.1",
-    "@syottos/darwin-x64": "2.7.1",
-    "@syottos/linux-arm64": "2.7.1",
-    "@syottos/linux-x64": "2.7.1",
-    "@syottos/win32-x64": "2.7.1",
-    "@syottos/win32-arm64": "2.7.1"
+    "@syottos/darwin-arm64": "2.8.0",
+    "@syottos/darwin-x64": "2.8.0",
+    "@syottos/linux-arm64": "2.8.0",
+    "@syottos/linux-x64": "2.8.0",
+    "@syottos/win32-x64": "2.8.0",
+    "@syottos/win32-arm64": "2.8.0"
   },
   "author": "Nathan Delacrétaz",
   "license": "MIT",