npm - compound-agent - Versions diffs - 1.6.5 → 1.7.3 - Mend

compound-agent 1.6.5 → 1.7.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/CHANGELOG.md CHANGED

@@ -9,6 +9,78 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [1.7.3] - 2026-03-09
+### Added
+- **Update notification**: CLI checks npm registry for newer versions on startup (24h file-based cache, non-blocking). Notification displays after command output (TTY only) and in `ca prime` context.
+### Fixed
+- **Spec-dev epic type**: `bd create` in spec-dev Phase 4 now explicitly uses `--type=epic`, preventing epics from defaulting to task type. Plan phase also validates the epic type and corrects it if needed.
+- **Update-check hardening**: Added explicit `res.ok` check in `fetchLatestVersion`, removed dead `checkedAt` cache field, removed redundant type cast.
+## [1.7.2] - 2026-03-09
+### Added
+- **Loop review phase**: `ca loop` can now spawn independent AI reviewers (Claude Sonnet, Claude Opus, Gemini, Codex) in parallel after every N completed epics. Reviewers produce severity-tagged reports, an implementer session fixes findings, and reviewers are resumed (not fresh) to verify fixes. Iterates until all approve or max cycles reached. New CLI options: `--reviewers`, `--review-every`, `--max-review-cycles`, `--review-blocking`, `--review-model`. Gracefully skips unavailable CLIs.
+### Fixed
+- **Security: command injection in `ca test-summary --cmd`**: User-supplied test command is now validated against an allowlist of safe prefixes (`pnpm`, `npm`, `vitest`, etc.) and shell metacharacters are rejected.
+- **Security: shell injection in `ca doctor`**: Replaced `execSync(cmd, {shell})` with `execFileSync('bd', ['doctor'])` to avoid shell interpretation.
+- **Portable timeout for macOS**: Generated loop scripts now use a `portable_timeout()` wrapper that tries GNU `timeout`, then `gtimeout` (Homebrew coreutils), then a shell-based kill/watchdog fallback. Previously failed silently on macOS.
+- **Session ID python3 fallback**: Review phase session ID management now falls back to python3 when `jq` is unavailable, with a centralized `read_session_id()` helper.
+- **Git diff window stability**: Replaced fragile `HEAD~$N..HEAD` commit-count arithmetic with SHA-based `$REVIEW_BASE_SHA..HEAD` diff ranges, immune to rebases and cherry-picks.
+- **ID collision risk**: Memory item IDs now use 64-bit entropy (16 hex chars) instead of 32-bit (8 hex chars).
+- **JSONL resilience**: Malformed lines in JSONL files are now skipped with try/catch per line instead of crashing the entire read.
+- **Stdin timeout leak**: `clearTimeout` now called in `finally` block for stdin reads in retrieval and hooks.
+- **Double JSONL read eliminated**: `readMemoryItems()` now returns `deletedIds` set, removing the need for a separate `wasLessonDeleted()` file read.
+- **FTS5 trigger optimization**: SQLite update trigger now scoped to FTS-indexed columns only, reducing unnecessary FTS rebuilds.
+- **Clustering noise accuracy**: Single-item clusters now correctly returned as `noise` instead of an always-empty noise array.
+- **Embed-worker path validation**: `embed-worker` command now validates that `repoRoot` exists and is a directory before proceeding.
+- **Script check timeout**: Rule-based script checks now have a 30-second default timeout, configurable via `check.timeout`.
+### Changed
+- **Anchored approval detection**: Review loop now uses `^REVIEW_APPROVED` anchored grep to prevent false positives from partial-line matches.
+- **Numeric option validation**: `--review-every` and `--max-review-cycles` now reject NaN, negative, and non-integer values.
+- **`isModelUsable()` replaced**: `compound` command now uses lightweight `isModelAvailable()` (fs check) instead of loading the 278MB model just to probe.
+- **Dead code removed**: `addCompoundAgentHook()`, back-compat hook aliases (`CLAUDE_READ_TRACKER_HOOK_CONFIG`, `CLAUDE_STOP_AUDIT_HOOK_CONFIG`), and `wasLessonDeleted()` removed.
+- **Hardcoded model extracted**: Five occurrences of `'claude-opus-4-6'` in loop.ts extracted to `DEFAULT_MODEL` constant.
+- **EPIC_ID_PATTERN deduplicated**: `watch.ts` now imports `LOOP_EPIC_ID_PATTERN` from `loop.ts` instead of maintaining a duplicate.
+- **`warn()` output corrected**: `shared.ts` warn helper now writes to `stderr` instead of `stdout`.
+- **Templates import fixed**: `templates.ts` now imports `VERSION` from `../version.js` instead of barrel re-export.
+## [1.7.1] - 2026-03-09
+### Added
+- **Scenario testing integration**: Spec-dev Phase 3 now generates scenario tables from EARS requirements and Mermaid diagrams with five categories (happy, error, boundary, combinatorial, adversarial). Review phase verifies coverage via a new `scenario-coverage-reviewer` agent using heuristic AI-driven matching.
+- **Scenario coverage reviewer**: New medium-tier AgentTeam reviewer that matches test files against epic scenario tables and flags gaps (P1) or partial coverage (P2). Spawned for diffs >100 lines.
+### Fixed
+- **Stale reviewer count in tests**: Updated "5 reviewer perspectives" test to "6" with `scenario-coverage` assertion. Removed no-op `.replace('security-', 'security-')` in escalation wiring test.
+## [1.7.0] - 2026-03-08
+### Added
+- **Loop observability**: Generated loop script now writes `.loop-status.json` (real-time epic/attempt/status) and `loop-execution.jsonl` (append-only result log with per-epic duration and end-of-loop summary). Enables `ca watch --status` and post-mortem forensics.
+- **ESLint rule `no-solo-trivial-assertion`**: Custom rule that warns when a test's only assertion is `toBeDefined()`, `toBeTruthy()`, `toBeFalsy()`, or `toBeNull()`. Registered but not yet enabled (requires cleanup of ~40 existing violations).
+### Fixed
+- **Loop 0-byte log resilience**: `extract_text` pipeline could produce 0-byte log files while the trace JSONL had valid content, causing the loop to falsely detect failure. New `detect_marker()` function checks the macro log first (anchored grep), then falls back to the trace JSONL (unanchored grep). Includes health check warning on extraction failure.
+- **Search fallback when embeddings unavailable**: `retrieveForPlan()` no longer throws when the embedding model is missing or broken. Falls back to keyword-only search with a console warning.
+### Changed
+- **Anti-cargo-cult reviewer strengthened**: Added three new subtle anti-patterns to the reviewer agent: solo trivial assertions, substring-only `toContain()` checks, and keyword-presence tests. Each with bad/good examples.
+- **Loop template extraction**: Bash script templates moved to `loop-templates.ts` to stay within lint limits.
 ## [1.6.5] - 2026-03-07
 ### Fixed
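
The "JSONL resilience" entry in 1.7.2 describes wrapping each line's parse in its own try/catch so that one malformed line no longer aborts the whole read. The package's actual code is not part of this diff, so the following is only a minimal sketch of that technique; the function name `parseJsonlLenient` and the `skipped` counter are hypothetical, chosen for illustration.

```typescript
// Hypothetical helper illustrating per-line try/catch for JSONL input.
// This is NOT the package's implementation, only a sketch of the idea.
function parseJsonlLenient<T = unknown>(
  text: string,
): { items: T[]; skipped: number } {
  const items: T[] = [];
  let skipped = 0;

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // blank lines are ignored, not counted as errors

    try {
      items.push(JSON.parse(line) as T);
    } catch {
      // A malformed line is skipped; the remaining lines are still read.
      skipped += 1;
    }
  }

  return { items, skipped };
}
```

Returning the number of skipped lines (an assumption of this sketch) lets a caller decide whether to warn about a partially corrupted file instead of failing silently.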

package/README.md CHANGED

@@ -190,6 +190,11 @@ The CLI binary is `ca` (alias: `compound-agent`).
 | `ca loop -o <path>` | Custom output path (default: `./infinity-loop.sh`) |
 | `ca loop --max-retries <n>` | Max retries per epic on failure (default: 1) |
 | `ca loop --force` | Overwrite existing script |
+| `ca loop --reviewers <names...>` | Enable review phase with specified reviewers (claude-sonnet, claude-opus, gemini, codex) |
+| `ca loop --review-every <n>` | Review every N completed epics (0 = end-only, default: 0) |
+| `ca loop --max-review-cycles <n>` | Max review/fix iterations (default: 3) |
+| `ca loop --review-blocking` | Fail loop if review not approved after max cycles |
+| `ca loop --review-model <model>` | Model for implementer fix sessions (default: claude-opus-4-6) |
 ### Knowledge
@@ -263,7 +268,7 @@ pnpm lint             # Type check + ESLint
 | Script | Duration | Use Case |
 |--------|----------|----------|
-| `pnpm test:fast` | ~6s | Rapid feedback during development |
+| `pnpm test:fast` | ~12s | Rapid feedback during development |
 | `pnpm test` | ~60s | Full suite before committing |
 | `pnpm test:changed` | varies | Only tests affected by recent changes |
 | `pnpm test:watch` | - | Watch mode for TDD workflow |
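
The README hunk above adds the `--review-every <n>` and `--max-review-cycles <n>` options, and the 1.7.2 changelog notes that both now reject NaN, negative, and non-integer values. The validator itself is not shown in this diff, so the sketch below is only one plausible way to do it. The name `parseNonNegativeInt` is hypothetical, and accepting `0` is an assumption based on the README's "0 = end-only" note for `--review-every`; the real lower bound for `--max-review-cycles` may differ.

```typescript
// Hypothetical validator for numeric CLI options; a sketch, not the
// package's actual code. Accepts only plain decimal digits.
function parseNonNegativeInt(raw: string, flag: string): number {
  const trimmed = raw.trim();

  // Digits only: this rejects "", "abc", "-1", "1.5", and "1e3" up front.
  if (!/^\d+$/.test(trimmed)) {
    throw new Error(`${flag} must be a non-negative integer, got "${raw}"`);
  }

  const value = Number(trimmed);
  // Guard against digit strings too large to represent exactly.
  if (!Number.isSafeInteger(value)) {
    throw new Error(`${flag} is too large: "${raw}"`);
  }

  return value;
}
```

Checking the string shape before converting avoids a known pitfall: `Number("")` evaluates to `0`, so a conversion-only check would silently accept an empty argument.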