compound-agent 2.7.1 → 2.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/CHANGELOG.md +47 -0
  2. package/README.md +18 -43
  3. package/package.json +7 -7
package/CHANGELOG.md CHANGED
@@ -7,6 +7,53 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [Unreleased]
+
+ ## [2.8.0] - 2026-05-17
+
+ ### Upgrading (action required for the default path)
+
+ - `ca loop` / `ca polish` now default to the `claude --bg` backend. The bg
+ backend **fails loud (exit 1) at startup** unless **both** operator
+ prerequisites are met: (1) the `--dangerously-skip-permissions` bypass
+ disclaimer is accepted on the machine, and (2) Claude's
+ `worktree.bgIsolation` setting is `none`. If you cannot set these, pin the
+ legacy behavior with `--backend p` (or `CA_BACKEND=p`) — that path is
+ byte-for-byte unchanged.
+
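The fail-loud rule for `worktree.bgIsolation` described in this release can be sketched as below. This is a minimal illustration, not the package's actual `bootstrap_preflight`: the function name `check_bg_isolation` and the message text are assumptions, and the value is passed in as an argument rather than looked up.

```shell
# Hypothetical sketch of a fail-loud check; not the shipped bootstrap_preflight.
# $1 is whatever value a settings lookup returned for worktree.bgIsolation
# (empty when the lookup could not determine it).
check_bg_isolation() {
  value="$1"
  if [ "$value" = "none" ]; then
    return 0
  fi
  # Anything other than a confirmed "none" (including empty) counts as a failure.
  printf 'preflight: worktree.bgIsolation is "%s", expected "none"; set it to none or use --backend p\n' "$value" >&2
  return 1
}
```

An empty or unknown value fails just like a wrong one, which matches the stated intent of erring toward failure when `none` cannot be confirmed.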
+ ### Changed
+
+ - **Default loop/polish/review backend is now `claude --bg` (subscription-billed)**: `ca loop` and `ca polish` default to the bg backend (`claude --bg`). The legacy `claude -p` path remains fully supported via `--backend p` or `CA_BACKEND=p`. Precedence: explicit `--backend` flag > `CA_BACKEND` env > default (bg). The bg backend runs sessions as background jobs polled via `state.json`, auto-isolates each session into a git worktree (harvested before cleanup), and uses the existing `EPIC_COMPLETE`/`HUMAN_REQUIRED:`/`EPIC_FAILED` protocol unchanged.
+ - **`--backend bg|p` flag added to `ca loop` and `ca polish`**: Explicit backend selection. Default is `bg`. `--backend p` restores the legacy `claude -p` streaming pipeline byte-for-byte.
+ - **Bootstrap preflight added to bg-backend scripts**: Generated scripts with `--backend bg` (or default) now run a `bootstrap_preflight` step before the epic loop starts. It enforces two operator prerequisites and fails loud (exit 1, remediation) if either is missing: (1) the `--dangerously-skip-permissions` bypass disclaimer must be accepted on the machine; (2) Claude's `worktree.bgIsolation` setting must be `none` — otherwise `bd` (keyed to the main repo path) is unreachable from the worktree `claude --bg` auto-isolates into, so epics never close and the polish architect's `bd` writes are lost. The `bgIsolation` check tries `claude config get worktree.bgIsolation` first and falls back to the settings-JSON precedence (`.claude/settings.local.json` → project `.claude/settings.json` → `~/.claude/settings.json`); it errs toward failing loud when it cannot confirm `none`. Both checks are skipped entirely for `--backend p`.
+ - **Polish architect runs synchronously regardless of backend**: the polish architect (which only runs `bd create --type=epic` / `bd dep add` and makes no code edits) always uses the synchronous `agent_invoke` path even under `CA_BACKEND=bg`, so its `bd` epic writes reach the main-tree Dolt instead of a discarded bg worktree. The reviewer fleet is unaffected and is still bg-dispatched.
+ - **Worktree harvest**: with the bg backend, each `claude --bg` session auto-isolates into a git worktree (`worktree-<name>` branch). After terminal state, the loop merges the worktree into the working branch before `claude rm`. `claude rm` runs only when worktree safety is positively verified. On harvest failure (merge conflict, ambiguous worktrees, non-success marker) **or when the pre-dispatch worktree snapshot is missing** (worktree safety unverifiable), the worktree is retained, the epic is marked `HUMAN_REQUIRED`, and `claude rm` is skipped to preserve work — a missing snapshot is treated as unverifiable, never as "safe to remove".
+ - **`ca watch` bg data source**: `ca watch` follows the `.latest` symlink updated by the harvest/collect step to a trace file in `.compound-agent/agent_logs/`.
+ - **Polish inner-loop backend propagation**: the polish loop now propagates `CA_BACKEND` to the inner infinity loop via `CA_BACKEND="${CA_BACKEND:-bg}" bash "$inner_script"`.
+
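The backend precedence stated in the entries above (explicit `--backend` flag, then `CA_BACKEND`, then the `bg` default) can be sketched in shell. The helper name `resolve_backend` and its single-argument calling convention are assumptions for illustration; only the `${CA_BACKEND:-bg}` fallback form appears in the changelog itself.

```shell
# Hypothetical helper: resolve the backend in the documented order.
# Precedence: explicit --backend flag > CA_BACKEND env > default (bg).
resolve_backend() {
  flag_value="$1"   # value given to --backend, or empty when the flag was absent
  backend="${flag_value:-${CA_BACKEND:-bg}}"
  case "$backend" in
    bg|p) printf '%s\n' "$backend" ;;
    *)    printf 'unknown backend: %s\n' "$backend" >&2; return 1 ;;
  esac
}
```

For example, `CA_BACKEND=p` with no flag resolves to `p`, while passing `bg` as the flag value overrides the environment and resolves to `bg`.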
+ ### Removed
+
+ - **Improve loop (`ca improve`, `ca loop --improve`, `ca watch --improve`)**: Removed the improve loop command and all associated flags and shell-script plumbing. The feature is superseded; this removal is unrelated to the bg backend migration.
+
+ ## [2.7.2] - 2026-04-16
+
+ ### Added
+
+ - **`ca uninstall` command**: reverses `ca init` / `ca setup` in three tiers. Default removes managed Claude Code hooks from `.claude/settings.json`. `--templates` additionally removes the `compound/` template directories (`.claude/agents/compound/`, `.claude/commands/compound/`, `.claude/skills/compound/`, `docs/compound/`) and `.claude/plugin.json`. `--all` additionally removes `.compound-agent/` runtime state and strips managed marker blocks from `AGENTS.md`, `.claude/CLAUDE.md`, root `.gitignore`, and `.claude/.gitignore`. `.claude/lessons/` and `.claude/compound-agent.json` are ALWAYS preserved. Requires `--yes` to skip interactive confirmation.
+ - **Install profiles (`ca setup --profile <minimal|workflow|full>`)**: `minimal` installs only lesson-capture plumbing (lessons dir, AGENTS.md integration, plugin.json, and 3 hooks — SessionStart/PreCompact prime + UserPromptSubmit reminder) — no commands, no phase skills, no agent role skills, no docs. `workflow` adds the 5-phase cook-it workflow (all commands, phase skills, agent role skills, doc templates, all phase/failure hooks) but skips the `docs/compound/research/` tree. `full` (default, backward-compatible) installs everything. `--confirm-prune` is required when a lower profile would delete existing templates on disk.
+
+ ### Changed
+
+ - **Default model bumped to Opus 4.7**: All default model references updated from `claude-opus-4-6[1m]` to `claude-opus-4-7[1m]`. Affects `ca loop --model`, `ca loop --review-model`, `ca polish --model`, `ca improve --model`, the `claude-opus` reviewer selector in the generated review/polish scripts, and the Simplicity lens in the architect advisory fleet. Template skills (`loop-launcher/SKILL.md`, `architect/references/infinity-loop/README.md`, `architect/references/infinity-loop/troubleshooting.md`, `architect/references/polish-loop/README.md`, `architect/references/advisory-fleet.md`) and shipped docs (`CLI_REFERENCE.md`) updated to match. The Windows PowerShell reference template `$MODEL` default also updated.
+
+ ### Fixed
+
+ - **Concrete fallback quality-gate commands**: when the project stack cannot be detected, the fallback strings substituted for `{{QUALITY_GATE_TEST}}`, `{{QUALITY_GATE_LINT}}`, and `{{QUALITY_GATE_BUILD}}` are now shell commands that exit non-zero with a `[compound-agent]`-tagged diagnostic on stderr, e.g. `sh -c 'echo "[compound-agent] test command not configured..." >&2; exit 1'`. Previously the fallbacks were English prose ("detect and run the project's test suite") that, when rendered into SKILL.md as shell commands, broke sh parsing on the apostrophe and produced a confusing error rather than a clear failure. Agents now see a visible failure with actionable configuration guidance.
+ - **`hasTransitionEvidence` robustness**: replaced hardcoded `5` with `len(Phases)` so the cook-it final-phase branch tracks the phase list automatically. No behavioral change today (architect at index 6 correctly falls through); regression tests (`TestHasTransitionEvidence_OutOfRangeReturnsFalse`, `TestHasTransitionEvidence_FinalPhaseAlwaysTrue`) pin the contract so a future `Phases` shrink can't introduce a panic.
+ - **Empty hook arrays in `settings.json`**: `AddHooksForProfile` and `RemoveAllHooks` both drop empty `"PreToolUse": []` style entries at the end of their run (previously created eagerly by `upgradeNpxHooks` calling `getHookArray` for every known hook type). Cosmetic only; no behavioral change for in-profile hooks.
+ - **`removeIfPresent` (formerly `removeIfExists`) semantics**: rewrote to suppress `os.ErrNotExist` internally and return `(existed bool, err error)`. `uninstallTemplates` now propagates real I/O errors (permission denied, read-only FS) when removing `.claude/plugin.json` instead of silently reporting success. Previously a permission error on plugin.json would make `ca uninstall --templates` report success without removing the file — flagged by the second-pass reviewer as a blocking issue. Regression tests `TestRemoveIfPresent_RealIOErrorSurfaces` and `TestUninstallTemplates_PropagatesRealIOError` pin the contract.
+ - **`polish-loop.sh` dry-run leaked state** (#17, #16): `POLISH_DRY_RUN=1` now fully skips the post-loop commit/push block AND writes a distinct `"status":"dry-run-completed"` to `.polish-status.json` instead of overwriting it with `"status":"completed"`. Previously a dry run still mutated git *and* produced a status file indistinguishable from a real run, defeating the preflight contract and misleading any monitoring tool that polled the status file. Regression test `TestPolishCommand_PostLoopRespectsDryRun` pins the guard structure so this can't silently come back.
+
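The quality-gate fallback fix in the 2.7.2 entries can be reproduced with plain `sh`. The sketch below compares the old prose fallback with the command-style fallback quoted in the changelog (the `...` is elided in the source and is kept verbatim); the variable names are assumptions for illustration.

```shell
# Old-style fallback: English prose dropped where a shell command is expected.
old_fallback="detect and run the project's test suite"
# New-style fallback, copied from the changelog entry (the "..." is elided there too).
new_fallback="sh -c 'echo \"[compound-agent] test command not configured...\" >&2; exit 1'"

# The prose contains an unmatched apostrophe, so sh rejects it as a syntax error.
old_status=0
sh -c "$old_fallback" 2>/dev/null || old_status=$?

# The command form runs, writes a tagged diagnostic to stderr, and exits 1.
new_status=0
new_msg=$(sh -c "$new_fallback" 2>&1 >/dev/null) || new_status=$?

printf 'old exit=%s new exit=%s msg=%s\n' "$old_status" "$new_status" "$new_msg"
```

Both forms fail, but only the new one fails with a predictable exit status and a message that names the missing configuration.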
  ## [2.7.1] - 2026-04-10
 
  ### Added
package/README.md CHANGED
@@ -143,53 +143,32 @@ ca loop --reviewers claude-sonnet --review-every 3
 
  `ca loop` generates a bash script that processes your beads epics sequentially, running the full cook-it cycle on each one. No human intervention required between epics.
 
+ The default backend is `claude --bg` (subscription-billed; requires accepting the bypass-permissions disclaimer once: `claude --dangerously-skip-permissions`). Use `--backend p` or `CA_BACKEND=p` for the legacy `claude -p` (pay-per-token) path.
+
  ```bash
- # Generate script for all ready epics
+ # Generate script for all ready epics (bg backend by default)
  ca loop
 
+ # Explicit backend selection
+ ca loop --backend bg # bg (default): subscription-billed
+ ca loop --backend p # p: legacy pay-per-token
+
  # With periodic review every 3 epics
  ca loop --reviewers claude-sonnet --review-every 3
 
  # Target specific epics
  ca loop --epics "beads-abc,beads-def,beads-ghi" --max-retries 2
 
- # Run it
- ./.compound-agent/infinity-loop.sh
+ # Run it (always use screen for durability)
+ screen -dmS compound-loop bash ./.compound-agent/infinity-loop.sh
  ```
 
+ **One-time bootstrap (bg backend)**: run `claude --dangerously-skip-permissions` once interactively to accept the bypass-permissions disclaimer. The generated script's bootstrap preflight detects a missing disclaimer and exits with remediation instructions before starting the loop.
+
  The loop respects beads dependency graphs — it only processes epics whose dependencies are complete. If an epic fails after `--max-retries` attempts, it stops and reports before proceeding.
 
  **Current maturity**: the loop works and has been used to ship real projects, including compound-agent itself. Two things still required human involvement: specifications had to be written before the loop started, and a human applied fixes after the first review pass surfaced real problems (missing error handling, a migration gap, insufficient test coverage). Fully unattended long-duration runs across many epics are the current area of hardening.
 
- ## The improvement loop
-
- `ca improve` generates a bash script that iterates over `improve/*.md` program files, spawning Claude Code sessions to make focused improvements. Each program file defines what to improve, how to find work, and how to validate changes.
-
- ```bash
- # Scaffold an example program file
- ca improve init
- # Creates improve/example.md with a linting template
-
- # Generate the improvement script
- ca improve
-
- # Filter to specific topics
- ca improve --topics lint tests --max-iters 3
-
- # Preview without generating
- ca improve --dry-run
-
- # Run the generated script
- ./.compound-agent/improvement-loop.sh
-
- # Preview without executing Claude sessions
- IMPROVE_DRY_RUN=1 ./.compound-agent/improvement-loop.sh
- ```
-
- Each iteration makes one focused improvement, commits it, and moves on. If an iteration finds nothing to improve or fails validation, it reverts cleanly and moves to the next topic. The loop tracks consecutive no-improvement results and stops early to avoid diminishing returns.
-
- Monitor progress with `ca watch --improve` to see live trace output from improvement sessions.
-
  ## Automatic hooks
 
  Once installed, seven Claude Code hooks fire without any commands:
@@ -301,7 +280,9 @@ The CLI binary is `ca` (alias: `compound-agent`).
 
  | Command | Description |
  |---------|-------------|
- | `ca loop` | Generate infinity loop script for autonomous epic processing |
+ | `ca loop` | Generate infinity loop script (default: `claude --bg`, subscription-billed) |
+ | `ca loop --backend bg` | Default bg backend: `claude --bg` (subscription-billed) |
+ | `ca loop --backend p` | Legacy p backend: `claude -p` (pay-per-token) |
  | `ca loop --epics "id1,id2,id3"` | Target specific epic IDs (comma-separated) |
  | `ca loop -o <path>` | Custom output path (default: `./.compound-agent/infinity-loop.sh`) |
  | `ca loop --max-retries <n>` | Max retries per epic on failure (default: 1) |
@@ -310,19 +291,13 @@ The CLI binary is `ca` (alias: `compound-agent`).
  | `ca loop --review-every <n>` | Review every N completed epics (0 = end-only, default: 0) |
  | `ca loop --max-review-cycles <n>` | Max review/fix iterations (default: 3) |
  | `ca loop --review-blocking` | Fail loop if review not approved after max cycles |
- | `ca loop --review-model <model>` | Model for implementer fix sessions (default: claude-opus-4-6[1m]) |
- | `ca improve` | Generate improvement loop script from `improve/*.md` programs |
- | `ca improve --topics <names...>` | Run only specific topics |
- | `ca improve --max-iters <n>` | Max iterations per topic (default: 5) |
- | `ca improve --time-budget <seconds>` | Total time budget, 0=unlimited (default: 0) |
- | `ca improve --dry-run` | Validate and print plan without generating |
- | `ca improve --force` | Overwrite existing script |
- | `ca improve init` | Scaffold an example `improve/*.md` program file |
+ | `ca loop --review-model <model>` | Model for implementer fix sessions (default: claude-opus-4-7[1m]) |
  | `ca watch` | Tail and pretty-print live trace from loop sessions |
  | `ca watch --epic <id>` | Watch a specific epic trace |
- | `ca watch --improve` | Watch improvement loop traces |
  | `ca watch --no-follow` | Print existing trace and exit (no live tail) |
- | `ca polish` | Generate polish loop script for iterative refinement |
+ | `ca polish` | Generate polish loop script (default: `claude --bg`, subscription-billed) |
+ | `ca polish --backend bg` | Default bg backend: `claude --bg` (subscription-billed) |
+ | `ca polish --backend p` | Legacy p backend: `claude -p` (pay-per-token) |
  | `ca polish --spec-file <path>` | Specify the spec file for polish review |
  | `ca polish --reviewers <names>` | Comma-separated reviewer models |
  | `ca polish --cycles <n>` | Number of polish cycles (default: 1) |
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "compound-agent",
- "version": "2.7.1",
+ "version": "2.8.0",
  "type": "module",
  "description": "Learning system for Claude Code — avoids repeating mistakes across sessions",
  "bin": {
@@ -51,12 +51,12 @@
  "knowledge-management"
  ],
  "optionalDependencies": {
- "@syottos/darwin-arm64": "2.7.1",
- "@syottos/darwin-x64": "2.7.1",
- "@syottos/linux-arm64": "2.7.1",
- "@syottos/linux-x64": "2.7.1",
- "@syottos/win32-x64": "2.7.1",
- "@syottos/win32-arm64": "2.7.1"
+ "@syottos/darwin-arm64": "2.8.0",
+ "@syottos/darwin-x64": "2.8.0",
+ "@syottos/linux-arm64": "2.8.0",
+ "@syottos/linux-x64": "2.8.0",
+ "@syottos/win32-x64": "2.8.0",
+ "@syottos/win32-arm64": "2.8.0"
  },
  "author": "Nathan Delacrétaz",
  "license": "MIT",