npm package compound-agent: version diff 2.8.0 → 2.11.0

Files changed (3)

package/CHANGELOG.md CHANGED

@@ -9,6 +9,129 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [2.11.0] - 2026-06-14
+Two additive workflow changes: per-epic specs become file-backed (the spec file
+is the single source of truth, the beads epic is a pointer stub), and the
+architect gains a third implementation mode (in-conversation live orchestration)
+alongside the existing detached infinity loop. No Go/CLI behavior changes; the
+loop generators and `ca` commands are untouched. `go build/vet/test` all green.
+### Added
+- **Spec-as-file source of truth**: `/compound:spec-dev` now writes each per-epic
+  spec to `docs/specs/<epic-id>-<slug>.md` (EARS requirements, scenario table,
+  diagrams, open questions, and an append-only `## Amendments` section). The beads
+  epic description becomes a short pointer stub (`Spec: docs/specs/...`) plus a
+  matching `Spec:` bead note, and the spec is registered in `docs/specs/index.md`
+  (created if missing). Benefit: specs stay readable and usable outside the beads
+  tooling.
+- **Architect live orchestration**: `/compound:architect` Phase 5 now offers a
+  mode choice -- (A) the existing detached infinity loop via `/compound:launch-loop`,
+  or (B) a new in-conversation orchestrator that stays live in the session and
+  drives each materialized epic through `/compound:cook-it` sequentially in
+  dependency order. It is harness-adaptive (uses the Workflow/ultracode tool when
+  available, else Task + AgentTeam + subagents), fully autonomous (a failed epic is
+  marked blocked, its dependents are skipped, the run continues), tracks progress
+  via a meta-epic checklist note (resumable), and emits one consolidated report.
+  Documented in `architect/references/live-orchestration.md` (new) with gotchas in
+  `architect/GOTCHA.md`.
+### Changed
+- **plan** now appends the `## Acceptance Criteria` table and `## Verification
+  Contract` to the spec file (before `## Amendments`) instead of the epic
+  description.
+- **work**, **review**, **compound**, and **cook-it** gates now read the spec, AC,
+  and Verification Contract from the resolved spec file, with a legacy fallback to
+  the epic description for pre-existing epics. **review** records contract
+  escalations in the file's Verification Contract section plus an `## Amendments`
+  entry; **compound** logs spec-drift reconciliations as amendments.
+- `.gemini/skills` mirror synced to canonical for all seven affected phase skills
+  (also fixing the pre-existing architect 4-phase mirror drift to the canonical
+  6-phase version).
+### Notes
+- Backward compatible: every downstream reader falls back to the epic description
+  when no spec-file pointer is present, so legacy epics keep working.
+- Live orchestration is entered via architect Phase 5 (no new slash command) and
+  does not use `ca loop` or a screen session.
+## [2.10.0] - 2026-06-14
+Adds `codex` and `gemini` as full loop implementers alongside `claude` (default)
+and `goose`, plus antigravity migration groundwork. Purely additive: the default
+`ca setup` and `ca loop --implementer claude|goose` paths are byte-for-byte
+unchanged and guarded by regression tests. The codex implementer invocation and
+the review loop were live-verified, and the whole change passed a `codex review`
+loop to a clean pass.
+### Added
+- **`ca loop --implementer codex`**: drive the loop with Codex (OpenAI). PID-based, in-tree engine (`CA_BACKEND=p`, own process group). Dispatch uses `codex exec --sandbox workspace-write -c approval_policy="never" --skip-git-repo-check`. Default model `gpt-5.5-codex` (plain model name, no `provider/` slash; override with `--model`). Preflight checks the codex CLI and login status.
+- **`ca loop --implementer gemini`**: drive the loop with the Gemini CLI. PID-based, in-tree engine. Dispatch uses `gemini -p "<prompt>" --yolo`. Default model `gemini-3.1-pro` (override with `--model`). Preflight checks the gemini CLI and `GEMINI_API_KEY`.
+- **Soft, in-prompt phase-gate for codex/gemini**: the compound protocol (`ca search`/`ca knowledge` before phases, `ca learn` after corrections, `ca phase-check` gate flow, Verification Contract, the three completion markers, explicit commit/push) is ported into the codex (`AGENTS.md`) and gemini (`GEMINI.md`) memory files. Codex `PreToolUse` and Antigravity `decision:deny` hooks can hard-block as a future enhancement.
+- **Implementer review reuse**: codex/gemini implementers reuse the existing CLI-reviewer dispatch. Valid reviewers for codex/gemini are `{codex, gemini}` (the CLI-direct branches); claude reviewers are rejected for these implementers because they route through the implementer-specific `agent_invoke`.
+- **`ca setup --harness antigravity`**: installs the `AGENTS.md` memory file for the Antigravity (`agy`) CLI, the successor to the Gemini CLI.
+- **Loop skill**: `templates/skills/loop-launcher/SKILL.md` now documents `--implementer codex` and `--implementer gemini` trigger scripts and the implementer/CLI syntax that ships with the plugin.
+### Notes
+- The standalone gemini CLI is sunset on 2026-06-18 in favor of the Antigravity CLI (`agy`). Antigravity is shipped here as groundwork only (a `--harness antigravity` setup target + this note), not as a functional loop implementer or reviewer: `agy` currently authenticates via OAuth keyring only (no API-key env), and `agy -p` drops stdout in non-TTY/subprocess contexts, which breaks marker capture. The full agy loop and reviewer are deferred until those are resolved.
+- A `codex review` loop over the implementation surfaced and fixed two real issues before release: a scaffolded antigravity reviewer that could falsely report approval (removed from the active reviewer set), and the claude-reviewer cross-wiring above.
+## [2.9.1] - 2026-06-14
+Hardening and live-verification of the Goose harness shipped in 2.9.0, against
+real Goose 1.37 + ollama qwen2.5-coder:14b. Fixes the phase-gate on the
+local-model path and wires the toolshim. The default `ca setup` (no `--harness`)
+and `ca loop --implementer claude` paths remain byte-for-byte unchanged.
+### Fixed
+- **Phase-gate now fires for local (toolshim) models.** Goose's toolshim (used for local models that do not emit native `tool_calls`) collapses `developer__text_editor` down to the bare names `write`/`edit`, which the phase-guard did not recognize, so the gate silently no-opped on exactly the local-model path it targets. `write` and `edit` are now guarded in both `phase_guard.go` and the goose `hooks.json` PreToolUse matcher (plus `str_replace_editor` for matcher/map symmetry). Verified live: an out-of-phase edit inside a Goose subrecipe is now genuinely blocked and the subagent obeys the deny reason.
+- **Phase-gate repo-root fallback.** The phase-guard now falls back to the `working_dir` Goose sends on stdin when `COMPOUND_AGENT_ROOT` is unset and the current directory has no `.compound-agent`, closing a latent no-op when the hook process runs from a foreign cwd. Additive and Claude-path-neutral (Claude runs from the project root and sends no `working_dir`).
+- **`GOOSE_TOOLSHIM=1` is set automatically for the ollama provider** in the generated loop script (no-clobber: a user-set value wins). Without it, local ollama models dispatch no tools and the goose implementer is inert. Documented in `.goosehints`.
+- `ca loop --reviewers` help text now lists the goose specialty reviewers (`security,correctness,quality`) alongside the claude reviewers.
+### Verified (live)
+- The PreToolUse phase-gate fires inside Goose `sub_recipes` (run as in-process subagents); cwd and `working_dir` correctly resolve to the repo root, and with the fix an out-of-phase edit is blocked.
+- End-to-end `ca loop --implementer goose` mechanics: script generation, dispatch in its own process group, tool dispatch under the toolshim, marker detection with bd-state fallback, retry, clean termination, and process cleanup with no orphaned processes.
+### Known issues (pre-existing shared-loop behavior, not goose-specific)
+- The loop stops on the first epic that exhausts its retries, so later targeted epics are not attempted.
+- An intentional non-zero exit on epic failure is logged as `CRASH` by the cleanup trap, so an expected failure shows as `status:crashed` in telemetry.
+- A failed epic can leave partial work uncommitted while the end-of-loop push still runs.
+- On a 16GB host a resident ~9.5GB local model trips the default memory thresholds (`MIN_FREE_MEMORY_PCT=20`), and `OLLAMA_CONTEXT_LENGTH` cannot resize an already-loaded model.
+## [2.9.0] - 2026-06-14
+This release adds an opt-in Goose harness so the infinity loop can run on
+open-weight models (local Ollama or API providers) alongside Claude. It is
+purely additive: the default `ca setup` (no `--harness`) and
+`ca loop --implementer claude` paths are byte-for-byte unchanged and guarded by
+regression tests.
+### Added
+- **`ca setup --harness {claude,codex,gemini,goose}`**: opt-in per-harness installer (comma-separated and/or repeatable). `--harness goose` installs only the Goose artifacts (global Open-Plugins hooks at `~/.agents/plugins/compound/hooks/hooks.json`, project `.goosehints`, and `.goose/recipes/`) without touching `.claude`. Omitting `--harness` installs the default Claude target unchanged. Codex (`config.toml`) and Gemini (`GEMINI.md`) install targets included.
+- **`ca loop --implementer goose`**: run the loop with Goose driving open-weight models. Provider and model are derived from `--model` (`ollama/<model>` for local, `<provider>/<model>` for API). A preflight verifies the goose CLI plus the provider API key (or local Ollama model presence) before the loop starts.
+- **Goose phase-gate (blocking PreToolUse hook)**: the compound phase-guard fires under real Goose. Verified live against Goose 1.37: Goose pipes a Claude-shaped `tool_name` payload and honors `exit 2` (deny reason read from stderr) to block out-of-phase edits on its `developer__text_editor` / `write` / `edit` tools. The matcher was corrected from the anchored Claude-only token set to match Goose's namespaced tool names.
+- **Compound primitives wired into the Goose loop**: the goose implementer prompt invokes `ca phase-check`, `ca search`, `ca knowledge`, `ca learn`, and `ca verify-gates`; `.goosehints` mirrors the mandatory-recall and phase-gate protocol; the `compound-cook-it.yaml` recipe ships the plan/work/review/compound workflow.
+- **Open-model review fleet (Goose subrecipes)**: `--reviewers security,correctness,quality` for the goose implementer dispatches self-contained `review-*.yaml` subrecipes (each declares the builtin developer extension and a response JSON-schema verdict). `--review-models security=ollama/qwen2.5-coder:14b,quality=glm/glm-4-plus` pins each reviewer to its own open model; unpinned reviewers inherit the loop model.
+### Fixed
+- Goose loop dispatch runs the goose child in its own process group (`set -m`) so the watchdog and `agent_stop` group-kill reach it instead of orphaning it (macOS lacks `setsid`).
+- Goose preflight API-key check uses bash indirect expansion (`${!key_var}`) instead of `eval`, closing a command-injection vector via a hostile `--model` provider half.
+- Marker detection on the goose path falls back to `bd` epic state when no completion-marker text is present.
+- `installGemini` writes a single root `GEMINI.md` (removed the redundant `.gemini/GEMINI.md` mirror).
+- `ca setup` surfaces a warning when no resolved binary path is available and the install falls back to `npx ca` (which requires npx on PATH).
+### Notes
+- The phase-gate is verified to block at the top level under real Goose; firing inside Goose subrecipes depends on the hook process working directory and is not yet empirically confirmed. A full end-to-end goose loop and review-fleet live run remains future verification. The recipe also enforces `ca phase-check` gates in-band as a fallback.
 ## [2.8.0] - 2026-05-17
 ### Upgrading (action required for the default path)
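
The 2.9.0 and 2.9.1 entries above describe three small shell techniques in the generated goose loop script: deriving provider and model from `--model`, checking the provider API key with bash indirect expansion instead of `eval`, and setting `GOOSE_TOOLSHIM=1` without clobbering a user value. The following is a minimal sketch of those techniques, not the package's actual script. The function names and the `<PROVIDER>_API_KEY` naming convention are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: helper names and the <PROVIDER>_API_KEY
# convention are hypothetical, not taken from the compound-agent source.

# Split "<provider>/<model>" at the first slash, so model names that contain
# colons (e.g. "qwen2.5-coder:14b") or further slashes stay intact.
split_model() {
  local spec="$1"
  PROVIDER="${spec%%/*}"
  MODEL="${spec#*/}"
}

# Build the name of the environment variable expected to hold the API key.
# Every character outside [A-Z0-9_] becomes "_", so a hostile provider
# string cannot smuggle shell syntax into the variable name.
api_key_var() {
  local upper
  upper="$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr -c 'A-Z0-9_' '_')"
  printf '%s_API_KEY' "$upper"
}

# Succeed if the named variable is set and non-empty. Indirect expansion
# (${!name}) only reads a variable; unlike eval it never executes the string.
has_api_key() {
  local key_var="$1"
  [[ "$key_var" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || return 1
  [ -n "${!key_var:-}" ]
}

# No-clobber default: export GOOSE_TOOLSHIM=1 for the ollama provider only
# when the user has not already set it (even to an empty value).
enable_toolshim_for() {
  if [ "$1" = "ollama" ]; then
    export GOOSE_TOOLSHIM="${GOOSE_TOOLSHIM-1}"
  fi
}
```

A caller might run `split_model "deepseek/deepseek-chat"`, then `has_api_key "$(api_key_var "$PROVIDER")"` before starting the loop. The identifier check in `has_api_key` is an extra guard added for this sketch; the changelog only states that `eval` was replaced by `${!key_var}`.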

package/README.md CHANGED

@@ -118,6 +118,8 @@ Run phases individually when you want more control:
 /compound:compound                        # Capture what was learned
 ```
+spec-dev writes each per-epic spec to `docs/specs/<epic-id>-<slug>.md` as the single source of truth; the beads epic description holds only a pointer stub. plan appends the Acceptance Criteria and Verification Contract to that file, and work, review, and compound read from it (falling back to the legacy epic-description spec when no file exists). Material changes are logged in an `## Amendments` section. Specs stay readable and usable even outside the beads tooling.
 ### Level 3 — Factory mode
 For systems too large for a single feature cycle. `/compound:architect` decomposes the system; `ca loop` processes the resulting epics autonomously.
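
The spec-file paragraph added in this hunk describes a lookup with a legacy fallback: readers follow the `Spec:` pointer in the epic description to a file under `docs/specs/`, and use the description itself when no file exists. A rough sketch of that lookup follows. The function names are hypothetical, and it assumes the pointer appears on its own line as `Spec: <path>`, as the changelog's stub example suggests.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: resolve_spec and read_spec are hypothetical
# names, not commands or functions shipped by compound-agent.

# Print the spec file path if the epic description carries a "Spec: <path>"
# pointer line and that file exists under the repo root; otherwise fail so
# the caller can fall back to the legacy description.
resolve_spec() {
  local description="$1" root="${2:-.}" path
  path="$(printf '%s\n' "$description" | sed -n 's/^Spec:[[:space:]]*//p' | head -n 1)"
  if [ -n "$path" ] && [ -f "$root/$path" ]; then
    printf '%s\n' "$root/$path"
    return 0
  fi
  return 1
}

# Emit the spec text: the file's contents when a pointer resolves, else the
# epic description itself (the pre-existing, legacy location of the spec).
read_spec() {
  local description="$1" root="${2:-.}" file
  if file="$(resolve_spec "$description" "$root")"; then
    cat "$file"
  else
    printf '%s\n' "$description"
  fi
}
```

With this shape, epics created before the change keep working without migration, because a description with no resolvable pointer is simply treated as the spec.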
@@ -143,7 +145,7 @@ ca loop --reviewers claude-sonnet --review-every 3
 `ca loop` generates a bash script that processes your beads epics sequentially, running the full cook-it cycle on each one. No human intervention required between epics.
-The default backend is `claude --bg` (subscription-billed; requires accepting the bypass-permissions disclaimer once: `claude --dangerously-skip-permissions`). Use `--backend p` or `CA_BACKEND=p` for the legacy `claude -p` (pay-per-token) path.
+`--implementer` selects the engine that runs each epic: **claude** (default), **goose**, **codex**, or **gemini**. With the default claude implementer, the backend is `claude --bg` (subscription-billed; requires accepting the bypass-permissions disclaimer once: `claude --dangerously-skip-permissions`); use `--backend p` or `CA_BACKEND=p` for the legacy `claude -p` (pay-per-token) path. **goose** runs open/local models via Goose (e.g. `--model ollama/qwen2.5-coder:14b` or `deepseek/deepseek-chat`; for the ollama provider the loop auto-exports `GOOSE_TOOLSHIM=1`). **codex** drives the OpenAI Codex CLI (default model `gpt-5.5-codex`, dispatched via `codex exec`). **gemini** drives the Gemini CLI (default model `gemini-3.1-pro`, dispatched via `gemini -p --yolo`). For the codex and gemini implementers, valid `--reviewers` are `codex` and `gemini`.
 ```bash
 # Generate script for all ready epics (bg backend by default)
@@ -195,8 +197,9 @@ AI agents work best on well-scoped problems. When a task exceeds what fits comfo
 2. **Spec** — produces system-level EARS requirements, C4 architecture diagrams, and a scenario table
 3. **Decompose** — runs 6 parallel subagents (bounded context mapping, dependency analysis, scope sizing, interface design, STPA hazard analysis, structural-semantic gap analysis) then synthesises into a proposed epic structure
 4. **Materialise** — creates beads epics with scope boundaries, interface contracts, and wired dependencies
+5. **Orchestrate** — two implementation modes for driving the materialised epics through cook-it. **(A) Detached infinity loop**: the `ca loop` script run in a `screen` session, processing epics autonomously with no in-session involvement. **(B) Live orchestration**: the architect model stays in the conversation and autonomously drives each epic through `/compound:cook-it` sequentially in dependency order, tracking progress via a beads-backed checklist note (resumable) and reporting at the end. Polish is a separate opt-in post-loop phase, not an implementation mode.
-Three human approval gates separate the phases. Each output epic is sized for one cook-it cycle and includes an EARS subset for traceability back to the system spec.
+Three human approval gates separate the phases. Each output epic is sized for one cook-it cycle and includes an EARS subset for traceability back to the system spec. Live orchestration is entered through Phase 5 — there is no separate slash command.
 ```bash
 /compound:architect "Build a data pipeline: ingestion, transformation, storage, and API layer"
@@ -281,6 +284,8 @@ The CLI binary is `ca` (alias: `compound-agent`).
 | Command | Description |
 |---------|-------------|
 | `ca loop` | Generate infinity loop script (default: `claude --bg`, subscription-billed) |
+| `ca loop --implementer <name>` | Engine that runs each epic: `claude` (default), `goose`, `codex`, `gemini` |
+| `ca loop --model <model>` | Implementer model (e.g. `ollama/qwen2.5-coder:14b`, `gpt-5.5-codex`, `gemini-3.1-pro`) |
 | `ca loop --backend bg` | Default bg backend: `claude --bg` (subscription-billed) |
 | `ca loop --backend p` | Legacy p backend: `claude -p` (pay-per-token) |
 | `ca loop --epics "id1,id2,id3"` | Target specific epic IDs (comma-separated) |
@@ -320,6 +325,7 @@ The CLI binary is `ca` (alias: `compound-agent`).
 | Command | Description |
 |---------|-------------|
 | `ca setup` | One-shot setup (hooks + templates) |
+| `ca setup --harness antigravity` | Groundwork: install an `AGENTS.md` for the `agy` CLI (the Gemini CLI successor); no functional antigravity loop/reviewer yet |
 | `ca setup --skip-hooks` | Setup without installing hooks |
 | `ca setup --json` | Output result as JSON |
 | `ca setup claude` | Install Claude Code hooks only |
@@ -366,7 +372,7 @@ A: Yes, completely. Embeddings run locally via the `ca-embed` Rust daemon (nomic
 A: ~278MB for the embedding model (one-time download, shared across projects) plus negligible space for lessons.
 **Q: Can I use it with other AI coding tools?**
-A: The CLI (`ca`) works standalone with any tool. Full hook integration is available for Claude Code and Gemini CLI.
+A: The CLI (`ca`) works standalone with any tool. Full hook integration is available for Claude Code and Gemini CLI. The Gemini CLI is being sunset (~2026-06-18) in favor of Antigravity; `ca setup --harness antigravity` installs groundwork (an `AGENTS.md` for the `agy` CLI) ahead of that migration.
 **Q: What happens if the embedding model isn't available?**
 A: Search gracefully falls back to keyword-only mode. Other commands that require embeddings will tell you what's missing. Run `ca doctor` to diagnose issues.
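
The Phase 5 live-orchestration text in the hunks above says that epics run sequentially in dependency order, that a failed epic is marked blocked, that its dependents are skipped, and that the run continues. The sketch below shows one way that control flow could look. It is an assumption-laden illustration: the real orchestrator is a model following a skill document, while the `orchestrate` function, its `id:dep1,dep2` input format, and the runner callback are all invented here.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: orchestrate and its input format are
# hypothetical. The runner argument stands in for driving one epic
# through /compound:cook-it and must return non-zero on failure.

# Reads lines of the form "epic-id:dep1,dep2" from stdin, already sorted in
# dependency order, and prints "<epic-id> <status>" for each one.
orchestrate() {
  local runner="$1" line id deps dep status
  local unavailable=" "   # space-delimited ids that were blocked or skipped
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    id="${line%%:*}"
    case "$line" in
      *:*) deps="${line#*:}" ;;
      *)   deps="" ;;
    esac
    status="run"
    # Skip this epic if any dependency failed or was itself skipped.
    for dep in ${deps//,/ }; do
      case "$unavailable" in
        *" $dep "*) status="skipped" ;;
      esac
    done
    if [ "$status" = "run" ]; then
      # Detach stdin so the runner cannot consume the remaining epic list.
      if "$runner" "$id" </dev/null; then status="done"; else status="blocked"; fi
    fi
    [ "$status" = "done" ] || unavailable="$unavailable$id "
    printf '%s %s\n' "$id" "$status"
  done
}
```

Because skipped epics are recorded alongside blocked ones, the skip propagates transitively: anything downstream of a failure is passed over, while independent epics still run.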

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "compound-agent",
-  "version": "2.8.0",
+  "version": "2.11.0",
   "type": "module",
   "description": "Learning system for Claude Code — avoids repeating mistakes across sessions",
   "bin": {
@@ -51,12 +51,12 @@
     "knowledge-management"
   ],
   "optionalDependencies": {
-    "@syottos/darwin-arm64": "2.8.0",
-    "@syottos/darwin-x64": "2.8.0",
-    "@syottos/linux-arm64": "2.8.0",
-    "@syottos/linux-x64": "2.8.0",
-    "@syottos/win32-x64": "2.8.0",
-    "@syottos/win32-arm64": "2.8.0"
+    "@syottos/darwin-arm64": "2.11.0",
+    "@syottos/darwin-x64": "2.11.0",
+    "@syottos/linux-arm64": "2.11.0",
+    "@syottos/linux-x64": "2.11.0",
+    "@syottos/win32-x64": "2.11.0",
+    "@syottos/win32-arm64": "2.11.0"
   },
   "author": "Nathan Delacrétaz",
   "license": "MIT",