npm - eagle-mem - Versions diffs - 4.12.1 → 4.13.0 - Mend

eagle-mem 4.12.1 → 4.13.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (35) hide show

package/CHANGELOG.md +29 -0
package/README.md +4 -0
package/db/migrate.sh +11 -1
package/docs/agent-compatibility/claude-code.md +27 -0
package/docs/agent-compatibility/codex.md +1 -0
package/docs/reviews/2026-06-10-full-spectrum-hardening.md +90 -0
package/hooks/post-tool-use.sh +73 -22
package/hooks/session-end.sh +10 -0
package/hooks/session-start.sh +24 -1
package/hooks/stop.sh +7 -2
package/integrations/google_antigravity_hook.py +61 -26
package/lib/codex-hooks.sh +5 -1
package/lib/common.sh +104 -4
package/lib/db-core.sh +28 -0
package/lib/db-events.sh +13 -0
package/lib/db-observations.sh +10 -3
package/lib/db-sessions.sh +10 -1
package/lib/db-summaries.sh +4 -1
package/lib/hooks-sessionstart.sh +32 -13
package/lib/provider.sh +10 -2
package/lib/updater.sh +16 -2
package/package.json +1 -1
package/scripts/enrich-summary.sh +4 -1
package/scripts/logs.sh +44 -12
package/scripts/orchestrate.sh +34 -4
package/scripts/session.sh +5 -0
package/scripts/statusline-em.sh +5 -1
package/scripts/tasks.sh +6 -3
package/scripts/test.sh +11 -1
package/tests/test_context_budget.sh +117 -0
package/tests/test_data_integrity_hardening.sh +115 -0
package/tests/test_mod_tracker_concurrency.sh +142 -0
package/tests/test_redaction_coverage.sh +183 -0
package/tests/test_reliability_retention.sh +75 -0
package/tests/test_test_runner_no_abort.sh +86 -0

package/CHANGELOG.md CHANGED Viewed

@@ -4,6 +4,35 @@ All notable changes to the **Eagle Mem** project are documented here.
 ---
+## v4.13.0 Full-Spectrum Security & Reliability Hardening
+A six-lens fix-in-place review of the whole codebase (security, data integrity, reliability, token economy, code quality, architecture). 32 files hardened, 6 new regression suites, full smoke suite green. No behavioral surface for normal sessions changed — recall and capture are byte-for-byte the same; what changed is the failure, concurrency, and trust behavior underneath.
+**Security**
+- **Unattended workers no longer default to full access (highest severity).** Orchestration workers previously spawned `--sandbox danger-full-access` / `--permission-mode dontAsk` on prompts assembled from DB-stored lane text — a stored-prompt-injection path to unattended arbitrary execution. New `[orchestration] worker_autonomy` defaults to **safe** (`workspace-write` / `on-request` / `acceptEdits`); `danger` is now an explicit opt-in.
+- **LLM inputs are redacted, not just outputs.** Transcript excerpts sent to providers (including remote APIs via fallback) and persisted to the enrich job file are now run through `eagle_redact` before send/persist; `recall_events.prompt_snippet` is redacted before insert; the Bash command summary is redacted *before* truncation so a boundary-split secret can't leak.
+- **Antigravity binary-hijack fix.** The Python hook resolves `bin/eagle-mem` and hook scripts from its install dir instead of `os.getcwd()`, and validates `SESSION_ID`.
+- Prompts passed to provider CLIs now go over stdin (not `ps`-visible argv); `logs show|tail` canonicalizes paths (realpath) before the runs-root containment check; the Codex hook passes event names via jq `--arg`; the dead `[redaction] extra_patterns` knob is now wired into `eagle_redact`.
+**Data integrity & database**
+- **Fail-open `SQLITE_BUSY` reads fixed.** Standalone `sqlite3` reads (statusline, tasks, updater, project-identity lookups) now set `busy_timeout` so a momentary lock waits instead of being misread as "no data."
+- **Mod-tracker data-loss & lock-wedge fixed.** Stale-lock TTL reclaim, atomic mv-based pending drain (a concurrent append is never lost), and the edit-history append is now locked. Verified by a 40-parallel-writer regression test.
+- Summary `project` is sticky on conflict for agent-authored rows (hook-path project drift can no longer rekey them); added `eagle_db_strict` (`.bail on`) for fail-fast callers; observation dedup runs in `BEGIN IMMEDIATE`.
+**Reliability / self-healing**
+- **Auto-scan no longer self-blocks for 24h on failure.** A short-lived *in-flight* marker now debounces concurrent spawns, while the durable *freshness* marker is set only on genuine success — a crashed or output-less scan retries instead of being suppressed for a day.
+- **`eagle_events` retention.** The hook-observability table (≈99k unbounded rows in practice) is now pruned by age (30 days) at SessionEnd. `pending_feature_verifications` is deliberately left un-pruned (expiring it would let an unverified change ship).
+**Token economy**
+- **SessionStart injection ceiling.** A generous global budget (`context_budget.sessionstart_chars`, default 24000 chars / ~6K tokens) drops whole low-priority sections from the bottom when the recall body is pathologically large, always keeping overview/recent/memories + capture instructions and never splitting a section. Normal-sized recall is emitted unchanged.
+**Code quality**
+- **Test runner no longer aborts at the first failure.** `((errors++))` returns exit 1 when the count is 0, so under `set -euo pipefail` the first failing check killed the suite before it could report the count; switched to `errors=$((errors + 1))`.
+Full findings report (every item marked fixed-with-its-test or proposed) lives at `docs/reviews/2026-06-10-full-spectrum-hardening.md`, including the propose-only architecture backlog (`common.sh` decomposition, orchestrate lifecycle, curator→guardrail provenance, Grok/bare-shell gate parity).
+---
 ## v4.12.1 Deploy-Path Fixes for v4.12.0
 Two install/update-path bugs that defeated or destabilized the v4.12.0 rollout:

package/README.md CHANGED Viewed

@@ -13,6 +13,8 @@ Eagle Mem turns AI coding sessions into compounding project knowledge. It gives
 **v4.10.5 and onward focuses on Graph Memory, Dream Cycle Curation, OpenCode, Grok, Google Antigravity support, and Compaction Survival:** OpenCode users get a global local plugin plus linked skills, Grok users get first-class skill linking and `eagle-mem grok-bootstrap`, and Antigravity users get native Python SDK hook integration via `google_antigravity_hook.py`. Claude Code, Codex, OpenCode, and Antigravity receive the deepest automatic lifecycle support through hooks or plugins; Grok currently uses the shared CLI and skill workflow.
+**v4.13.0 is a full-spectrum security & reliability hardening:** orchestration workers no longer default to full filesystem access, all LLM inputs (not just outputs) are redacted before they leave the machine, fail-open `SQLITE_BUSY` reads now wait for the lock, the mod-tracker is concurrency-safe, failed background scans retry instead of self-blocking for a day, and SessionStart injection has a token-economy ceiling. Normal recall and capture are unchanged. See [`CHANGELOG.md`](CHANGELOG.md) and the [findings report](docs/reviews/2026-06-10-full-spectrum-hardening.md).
 **Website:** [Product](https://eagleisbatman.github.io/eagle-mem/) |
 [Architecture](https://eagleisbatman.github.io/eagle-mem/architecture.html) |
 [About](https://eagleisbatman.github.io/eagle-mem/about.html)
@@ -265,6 +267,8 @@ By default Eagle Mem uses the opposite-agent worker model:
 Worker models, effort, route, and worktree behavior are configurable in `~/.eagle-mem/config.toml` under `[orchestration]`.
+Spawned workers run in a sandbox with an approval/permission gate by default (`worker_autonomy = "safe"`). Because worker prompts are assembled from DB-stored lane descriptions, unattended full filesystem access is opt-in: set `worker_autonomy = "danger"` under `[orchestration]` to run workers with `codex --sandbox danger-full-access` / `claude --permission-mode dontAsk`. Only enable this when you trust the lane descriptions.
 ```bash
 eagle-mem orchestrate init "Ship auth cleanup"
 eagle-mem orchestrate lane add api --agent codex --desc "API fixes + tests" --validate "npm test"

package/db/migrate.sh CHANGED Viewed

@@ -44,7 +44,17 @@ run_migration() {
     if [ "$already_applied" = "0" ]; then
         # Strip PRAGMAs from migration body (they can't run inside transactions
-        # and are already set on every connection via lib/db.sh EAGLE_DB_SETUP)
+        # and are already set on every connection via lib/db.sh EAGLE_DB_SETUP).
+        #
+        # FUTURE-MIGRATION FOOTGUN: EVERY line matching `^\s*PRAGMA ` is removed
+        # from the body before it runs inside the BEGIN/COMMIT below. A migration
+        # that genuinely needs a persistent/connection PRAGMA (e.g. a one-shot
+        # `PRAGMA user_version=N;` or a `PRAGMA foreign_key_check;` validation)
+        # will have that line SILENTLY DROPPED — it will appear to succeed while
+        # doing nothing. Do not rely on PRAGMA statements inside a migration file;
+        # set connection PRAGMAs in EAGLE_DB_SETUP, and run any standalone schema
+        # PRAGMA as a separate `"$SQLITE_BIN" "$DB" "PRAGMA ...;"` call outside
+        # run_migration. This stripping behavior is intentional — do not remove it.
         local body
         body=$(grep -v -E '^[[:space:]]*PRAGMA ' "$file")

package/docs/agent-compatibility/claude-code.md CHANGED Viewed

@@ -57,3 +57,30 @@ Last verified: 2026-06-10
 When editing Claude Code hooks, update this file with the new verification date and include the exact hook event names, input fields, and output semantics used by the implementation. When editing brand visibility or statusline behavior, re-read the statusline reference first and update `claude-statusline.json` if the stdin schema changed.
+### Evidence: data-integrity hardening (2026-06-10)
+Phase 2 data-integrity hardening touched the PostToolUse hook and the statusline without changing any Claude Code contract — no hook event names, stdin field reads, or stdout/exit semantics changed; `claude-statusline.json` is unaffected. Specifically:
+- `hooks/post-tool-use.sh`: the `mod-tracker`/`edit-tracker` writers (internal `~/.eagle-mem` state, not Claude I/O) gained a stale-lock TTL reclaim, an atomic mv-based pending drain so a concurrent append is never lost, and a lock around the edit-history append. The PostToolUse stdin parse (`session_id`, `cwd`, `tool_name`, `hook_event_name`, `tool_input`) is unchanged.
+- `scripts/statusline-em.sh`: the hottest standalone stats query now runs `PRAGMA busy_timeout=10000;` so a momentary `SQLITE_BUSY` waits for the lock instead of being misread as a DB-integrity error. Statusline stdin schema and output rendering are unchanged.
+Covered by `tests/test_mod_tracker_concurrency.sh` and `tests/test_trust_surfaces.sh` (statusline integrity-status path).
+### Evidence: reliability/self-healing hardening (2026-06-10)
+Phase 3 reliability hardening touched SessionStart auto-provisioning and SessionEnd without changing any Claude Code contract — hook event names, stdin field reads, and stdout/exit semantics are unchanged. Specifically:
+- `lib/hooks-sessionstart.sh` (sourced by SessionStart): auto-scan/auto-index now debounce concurrent spawns with a short-lived in-flight marker and set the durable freshness marker ONLY when the background job genuinely succeeds, so a crashed/output-less scan no longer blocks retry for ~24h. Pure background-state behavior; no change to injected context format or stdin parse.
+- `hooks/session-end.sh`: now also prunes `eagle_events` (hook-observability telemetry) older than 30 days; `pending_feature_verifications` is deliberately left un-pruned (documented inline). SessionEnd input/exit semantics unchanged.
+Covered by `tests/test_reliability_retention.sh`.
+### Evidence: SessionStart injection ceiling (2026-06-10)
+Phase 4 token-economy hardening added a generous global size ceiling on the recall body that SessionStart injects, without changing any Claude Code contract — the hook still emits the same `additionalContext` via the existing `eagle_emit_context_for_agent` path, with unchanged stdin field reads (`session_id`, `cwd`, `source`, `model`) and exit semantics. Specifically:
+- `lib/common.sh`: added `eagle_sessionstart_inject_budget` (config key `context_budget.sessionstart_chars`, default 24000 chars / ~6K tokens, floored at 4000) and `eagle_trim_inject_body`, which drops whole `=== Eagle Mem: ...` sections from the END (lowest priority appended last) until the body fits. Sections are never split, so no recall surface is half-emitted, and the top-priority sections (overview, recent recall, memories) always survive.
+- `hooks/session-start.sh`: applies the ceiling to the accumulated recall body for the non-Codex path BEFORE the trailing Active/instructions block is appended, so capture instructions are never trimmed. When it trims it logs an observable `WARN` ("injection over budget … trimmed N low-priority section(s)") and records `sections_trimmed` in the hook observability detail. Normal-sized recall is emitted byte-for-byte unchanged (verified by test).
+Covered by `tests/test_context_budget.sh`.

package/docs/agent-compatibility/codex.md CHANGED Viewed

@@ -15,6 +15,7 @@ Last verified: 2026-06-10
 - `AGENTS.md` is the durable repo-instruction surface for Codex. Closer nested files override earlier guidance because Codex concatenates instruction files from the project root down to the current working directory.
 - Codex hooks are discovered from active config layers through `hooks.json` or inline `[hooks]` tables in `config.toml`.
 - `features.hooks` is the canonical feature key. `features.codex_hooks` is a deprecated alias and should not be the primary setting.
+- Hook entries are registered into `hooks.<Event>` via jq. The event name is passed as `--arg` and indexed dynamically (`.hooks[$event]`), never interpolated into the jq program; the emitted `hooks.json` shape is unchanged. Re-verified during the 2026-06-10 security hardening pass (jq injection hardening, secret redaction before provider calls/persistence); no Codex hook contract changed.
 - `SessionStart`, `PreToolUse`, `PostToolUse`, `PreCompact`, `PostCompact`, `UserPromptSubmit`, `SubagentStop`, and `Stop` run at the documented lifecycle scopes.
 - `SessionStart` matchers use `startup`, `resume`, `clear`, and `compact`.
 - `PreToolUse` and `PostToolUse` match tool names, including `Bash`, `apply_patch`, MCP tool names, and aliases such as `Edit` and `Write` for `apply_patch`.

package/docs/reviews/2026-06-10-full-spectrum-hardening.md ADDED Viewed

@@ -0,0 +1,90 @@
+# Eagle Mem — Full-Spectrum Review & Hardening (2026-06-10)
+Branch `review/arch-hardening` (off `main` @ v4.12.1). Six-lens review, fix-in-place.
+**32 files changed (+1275 / −101); 6 new regression suites (registered checks 33 → 39); full `scripts/test.sh` suite green.** Nothing published — release is a separate, user-gated step.
+| Phase | Lens | Commit |
+|---|---|---|
+| 1 | Security | `be08b31` |
+| 2 | Data integrity & DB | `1fdaa2e` |
+| 3 | Reliability / self-healing | `9451cdd` |
+| 4 | Performance / token economy | `0ef772e` |
+| 5 | Code quality | `d50ba86` |
+| 6 | Architecture / coupling | this document (assessment + proposals) |
+---
+## Fixed (with the test that proves it)
+### Phase 1 — Security (`be08b31`)
+1. **Unredacted LLM input** — transcript excerpts were sent to providers (remote APIs via fallback) and written to the enrich job file in the clear. Now `eagle_redact` runs on the input before send/persist (`hooks/stop.sh`, `scripts/enrich-summary.sh`).
+2. **Raw prompt in `recall_events`** — `prompt_snippet` now redacted before insert (`lib/db-observations.sh`).
+3. **Truncate-before-redact** — Bash command summary now redacted *then* truncated (`hooks/post-tool-use.sh`).
+4. **Dead `extra_patterns` knob** — `[redaction] extra_patterns` config now wired into `eagle_redact` (`lib/common.sh`).
+5. **Antigravity binary hijack** — hook resolves `bin/eagle-mem` + hook scripts from the install dir, not `os.getcwd()`; validates `SESSION_ID` (`integrations/google_antigravity_hook.py`).
+6. **Prompt via argv (`ps`-visible)** — Claude CLI provider + antigravity summary now pass prompts via stdin (`lib/provider.sh`, `scripts/session.sh`).
+7. **Non-canonical log containment** — `logs.sh` canonicalizes (realpath) before the runs-root containment check.
+8. **jq program-string interpolation** — codex-hooks event name passed via `--arg` (`lib/codex-hooks.sh`).
+9. **Unattended full-access workers (highest severity)** — orchestration workers no longer default to `--sandbox danger-full-access` / `dontAsk` on DB-assembled prompts. `[orchestration] worker_autonomy` defaults to **safe** (`workspace-write` / `on-request` / `acceptEdits`); `danger` is explicit opt-in (`scripts/orchestrate.sh`).
+   *Proof:* `tests/test_redaction_coverage.sh` (seeds fake secrets; asserts none reach provider input / recall_events / enrich job; asserts safe-mode run-scripts carry no full-access flags).
+### Phase 2 — Data integrity & DB (`1fdaa2e`)
+- **Fail-open `sqlite3` guards** — standalone reads now set `busy_timeout` so `SQLITE_BUSY` waits instead of reading as "no data" (`lib/common.sh` ×2, `scripts/statusline-em.sh`, `scripts/tasks.sh`, `lib/updater.sh`); `eagle_project_has_table_row` allowlists its table identifier; `tasks.sh` escapes a DB-read value + uses `eagle_db`.
+- **Mod-tracker data loss / lock wedge** — stale-lock TTL reclaim, atomic mv-based pending drain (no append lost), and the edit-history append is now locked (`hooks/post-tool-use.sh`).
+- **Project clobber** — summary `project` is now sticky on conflict for agent rows (`lib/db-summaries.sh`).
+- **Error masking** — added `eagle_db_strict` (`.bail on`) for fail-fast callers (left `eagle_db` continue-on-error to avoid breaking ~200 callers); observation dedup runs in `BEGIN IMMEDIATE`; the project-repair transaction now logs failure (`lib/db-core.sh`, `lib/db-observations.sh`, `lib/db-sessions.sh`).
+- **PRAGMA-stripping footgun** documented in `db/migrate.sh`.
+   *Proof:* `tests/test_data_integrity_hardening.sh` (migrate idempotency, SQL-escaping, summary precedence/clobber) + `tests/test_mod_tracker_concurrency.sh` (40 parallel writers, no lost append, lock reclaim, dedup race).
+### Phase 3 — Reliability / self-healing (`9451cdd`)
+- **24h auto-scan block** — the foreground touch now sets a short-lived *in-flight* marker (debounce); the durable *freshness* marker is set only on genuine background success, so a crashed/output-less scan no longer blocks retry for a day (`lib/hooks-sessionstart.sh`).
+- **Unbounded `eagle_events`** (≈99k rows on this machine) — pruned by age (30d) at SessionEnd via new `eagle_prune_events` (`lib/db-events.sh`, `hooks/session-end.sh`).
+- **`pending_feature_verifications`** — deliberately *not* TTL'd (documented; expiring it would let an unverified change ship).
+   *Proof:* `tests/test_reliability_retention.sh`.
+### Phase 4 — Performance / token economy (`0ef772e`)
+- **SessionStart injection ceiling** — added a generous global budget (`context_budget.sessionstart_chars`, default 24000 chars / ~6K tokens, floored at 4000) that drops whole low-priority sections from the bottom (never splits a section; always keeps overview/recent/memories + capture instructions), logs a trim, and records `sections_trimmed`. Normal recall is byte-for-byte unchanged. The earlier "3 git diffs per tool call" concern was verified overstated (gated behind release-boundary detection).
+   *Proof:* `tests/test_context_budget.sh`.
+### Phase 5 — Code quality (`d50ba86`)
+- **Test runner aborted at first failure** — `((errors++))` returns 1 when `errors==0`, so under `set -euo pipefail` the first failing check killed the runner before it could report the count. Switched to `errors=$((errors + 1))`. (This is why a failing check earlier showed truncated output.)
+   *Proof:* `tests/test_test_runner_no_abort.sh` (incl. a control proving the old form aborts).
+- Audited: all remaining `grep -F` sites are genuinely literal; `feature.sh`'s `changes()` pattern is benign; all shell tests registered; `extra_patterns` confirmed live.
+---
+## Phase 6 — Architecture assessment (propose only)
+The system is sound and unusually well-instrumented for a bash project (durable feature gate, hook observability, redaction, FTS recall). The structural risks below are about *coupling and blast radius*, not correctness; none were auto-refactored because they move code across gate-watched files and would destabilize every adapter mid-review.
+1. **`lib/common.sh` is a 2k-line nexus.** It owns project-identity resolution, redaction, command classification, release-boundary detection, the token-budget trim, and CLAUDE.md/AGENTS.md patching — and every hook and adapter depends on it. *Proposal:* decompose into focused libs (`lib/identity.sh`, `lib/redact.sh`, `lib/cmd-classify.sh`, `lib/claude-md.sh`) behind the current function names, one module at a time, each with the existing tests as the safety net. High value, medium risk; do it as its own series, not inside a review.
+2. **`scripts/orchestrate.sh` (largest script, 1.3k lines) ↔ `bin/eagle-mem` re-entrancy.** Worker lifecycle state is spread across DB rows + run-dir files (`exit_code`, `.done`) + PIDs, with recorded cancellation/resume gaps. Now that autonomy is gated (Phase 1), the next step is a single source of truth for lane state and an explicit cancel/resume contract. *Proposal, separate effort.*
+3. **Curator → guardrail → PreToolUse feedback loop.** LLM output (`PROMOTE:` lines, learned command rules) becomes enforcement input that steers future command execution, with only format-level validation. *Proposal:* tag model-derived guardrails/rules with provenance and require confirmation (or a confidence threshold) before they can rewrite commands — closes the loop between the curator and the autonomy work done in Phase 1.
+4. **Adapter parity is convention-held; Grok has no hook lifecycle.** Five agents × six events × three transports (native hooks / JS plugin / Python shim) are kept in line only by `docs/agent-compatibility/` + the docs gate. Grok is skills-only — no capture, no guardrails, **no release gate** — and bare-shell pushes also bypass the gate (it lives only in PreToolUse). *Proposal:* either document Grok/bare-shell as explicitly out-of-gate scope, or add a lightweight pre-push wrapper so governance isn't silently skippable.
+5. **install.sh / update.sh duplication.** Currently *in sync*, but the hook matchers are duplicated literals across both — the historical drift class. *Proposal (from Phase 5):* extract `eagle_register_claude_hooks` into `lib/hooks.sh` and call it from both.
+---
+## Deferred proposals (rationale)
+- **PostToolUse zero-state short-circuit** — skip stale-memory / decision FTS on empty projects; needs a cached count to avoid adding a query (recall-sensitive).
+- **UserPromptSubmit query consolidation** — the 3 FTS queries hit different tables/ranking; UNION would entangle parsing for ~1 process spawn saved. Recommend leaving as-is.
+- **Provider LLM-output cache** — correctness risk (stale enrichment); needs content-hash invalidation.
+- **`tests/test_antigravity_hook.py`** runs only manually — add a guarded python lane to `scripts/test.sh` or document as dev-only.
+- **Dead-function sweep** — ~18 lib functions have no shell caller, but several are public aliases / dynamic-dispatch; needs an owner-confirmed call-graph (incl. `.py`/`.js`) before removal.
+---
+## Verification
+- Per phase: `bash scripts/test.sh` fully green; `bash -n` on every touched shell file; `py_compile` + `tests/test_antigravity_hook.py` for the Python hook.
+- New coverage: redaction (provider input / recall_events / enrich job / autonomy / log paths), migration idempotency + SQL escaping + summary precedence, 40-writer mod-tracker concurrency, scan in-flight/freshness + events retention, context budget, test-runner no-abort.
+- The docs gate (`tests/test_agent_compatibility_docs_gate.sh`) was satisfied with truthful dated evidence notes for each phase that touched a sensitive integration file.
+## Releasing
+Not auto-shipped. To release (likely `v4.13.0`): bump + CHANGELOG, clear the feature gate, `npm publish`, `npm install -g eagle-mem@<v>` → `eagle-mem update`, verify the installed `~/.eagle-mem` runtime, then gh-pages + GitHub release.

package/hooks/post-tool-use.sh CHANGED Viewed

@@ -17,34 +17,72 @@ LIB_DIR="$SCRIPT_DIR/../lib"
 input=$(eagle_read_stdin)
 [ -z "$input" ] && exit 0
+# Stale lock TTL (seconds). A hook killed between mkdir and rmdir would
+# otherwise leak its lock dir forever and wedge every later writer; reclaim a
+# lock dir older than this so the tracker self-heals.
+EAGLE_MOD_LOCK_TTL="${EAGLE_MOD_LOCK_TTL:-30}"
+# Acquire a mkdir-based lock with stale-lock reclaim. Echoes 0 on success.
+# Returns non-zero (without acquiring) if the lock stays held by a live holder.
+eagle_acquire_dir_lock() {
+    local lock_dir="$1" attempt lock_age now mtime
+    for attempt in 1 2 3 4 5 6 7 8 9 10; do
+        if mkdir "$lock_dir" 2>/dev/null; then
+            return 0
+        fi
+        # Reclaim a stale lock: if the dir is older than the TTL, the holder
+        # almost certainly died mid-critical-section. rmdir + retry.
+        now=$(date +%s 2>/dev/null) || now=""
+        mtime=$(stat -f %m "$lock_dir" 2>/dev/null || stat -c %Y "$lock_dir" 2>/dev/null || echo "")
+        if [ -n "$now" ] && [ -n "$mtime" ]; then
+            lock_age=$(( now - mtime ))
+            if [ "$lock_age" -ge "$EAGLE_MOD_LOCK_TTL" ]; then
+                rmdir "$lock_dir" 2>/dev/null || true
+                eagle_log "WARN" "PostToolUse: reclaimed stale tracker lock ${lock_dir##*/} (age=${lock_age}s)"
+                continue
+            fi
+        fi
+        sleep 0.05
+    done
+    return 1
+}
 eagle_track_modified_path() {
     local path="$1" sid="$2"
     [ -n "$path" ] || return 0
     [ -n "$sid" ] && eagle_validate_session_id "$sid" || return 0
-    local mod_dir mod_file mod_lock mod_tmp attempt
+    local mod_dir mod_file mod_lock mod_tmp pending_file drain_file
     mod_dir="$EAGLE_MEM_DIR/mod-tracker"
     mkdir -p "$mod_dir" 2>/dev/null || return 0
     mod_file="$mod_dir/${sid}"
     mod_lock="${mod_file}.lock"
-    for attempt in 1 2 3 4 5 6 7 8 9 10; do
-        if mkdir "$mod_lock" 2>/dev/null; then
-            mod_tmp=$(mktemp "${mod_file}.XXXXXX" 2>/dev/null) || mod_tmp="${mod_file}.$$"
-            (
-                cat "$mod_file" 2>/dev/null
-                for pending_file in "${mod_file}".pending.*; do
-                    [ -f "$pending_file" ] && cat "$pending_file" 2>/dev/null
-                done
-                printf '%s\n' "$path"
-            ) | tail -3 > "$mod_tmp"
-            mv "$mod_tmp" "$mod_file" 2>/dev/null || rm -f "$mod_tmp"
-            rm -f "${mod_file}".pending.* 2>/dev/null || true
-            rmdir "$mod_lock" 2>/dev/null || true
-            return 0
-        fi
-        sleep 0.05
-    done
+    if eagle_acquire_dir_lock "$mod_lock"; then
+        mod_tmp=$(mktemp "${mod_file}.XXXXXX" 2>/dev/null) || mod_tmp="${mod_file}.$$"
+        # Drain pending atomically: mv each pending file to a private .draining
+        # name BEFORE reading it. A concurrent append landing in a NEW .pending.*
+        # after this glob is simply left for the next lock holder — never deleted
+        # mid-flight, so no append is ever lost between snapshot and cleanup.
+        for pending_file in "${mod_file}".pending.*; do
+            [ -e "$pending_file" ] || continue
+            drain_file="${pending_file}.draining.$$"
+            mv "$pending_file" "$drain_file" 2>/dev/null || continue
+        done
+        (
+            cat "$mod_file" 2>/dev/null
+            # Read every draining file (including any orphaned by a holder that
+            # died mid-drain) — we hold the lock, so we are the sole reader.
+            for drain_file in "${mod_file}".pending.*.draining.*; do
+                [ -e "$drain_file" ] && cat "$drain_file" 2>/dev/null
+            done
+            printf '%s\n' "$path"
+        ) | tail -3 > "$mod_tmp"
+        mv "$mod_tmp" "$mod_file" 2>/dev/null || rm -f "$mod_tmp"
+        rm -f "${mod_file}".pending.*.draining.* 2>/dev/null || true
+        rmdir "$mod_lock" 2>/dev/null || true
+        return 0
+    fi
     printf '%s\n' "$path" >> "${mod_file}.pending.$$" 2>/dev/null || true
     eagle_log "WARN" "PostToolUse: mod-tracker lock busy; queued pending modified file for session=$sid"
@@ -55,10 +93,21 @@ eagle_track_edit_history_path() {
     [ -n "$path" ] || return 0
     [ -n "$sid" ] && eagle_validate_session_id "$sid" || return 0
-    local edit_dir
+    local edit_dir edit_file edit_lock
     edit_dir="$EAGLE_MEM_DIR/edit-tracker"
     mkdir -p "$edit_dir" 2>/dev/null || return 0
-    printf '%s\n' "$path" >> "$edit_dir/${sid}" 2>/dev/null || true
+    edit_file="$edit_dir/${sid}"
+    edit_lock="${edit_file}.lock"
+    # Lock the append so it shares the mod-tracker's serialization discipline:
+    # mixing locked and unlocked writers can interleave/overwrite under load.
+    # Fall back to a bare append (atomic for small lines) if the lock is wedged.
+    if eagle_acquire_dir_lock "$edit_lock"; then
+        printf '%s\n' "$path" >> "$edit_file" 2>/dev/null || true
+        rmdir "$edit_lock" 2>/dev/null || true
+        return 0
+    fi
+    printf '%s\n' "$path" >> "$edit_file" 2>/dev/null || true
 }
 IFS=$'\x1f' read -r session_id cwd tool_name hook_event <<< \
@@ -167,8 +216,10 @@ case "$tool_name" in
         fi
         ;;
     Bash|exec_command|shell_command|unified_exec)
-        cmd=$(eagle_tool_command_from_json "$input" | cut -c1-200)
-        cmd=$(echo "$cmd" | eagle_redact)
+        # Redact BEFORE truncating: a secret split across the 200-char boundary
+        # could otherwise survive because the redaction prefix patterns no longer
+        # see the full token.
+        cmd=$(eagle_tool_command_from_json "$input" | eagle_redact | cut -c1-200)
         tool_summary="${tool_name}: $cmd"
         tool_output=$(echo "$input" | jq -r '.tool_response.stdout // empty' 2>/dev/null)

package/hooks/session-end.sh CHANGED Viewed

@@ -47,4 +47,14 @@ eagle_hook_observability_complete 0
 # Prune observations older than 90 days (keeps DB size bounded)
 eagle_prune_observations 90 "$project"
+# Prune hook-observability events older than 30 days. This telemetry table is
+# written on every hook fire and was previously never pruned (unbounded growth).
+eagle_prune_events 30 "$project"
+# NOTE: pending_feature_verifications is deliberately NOT pruned/TTL'd here. A
+# pending verification is an explicit "this changed and was not yet verified or
+# waived" gate; silently expiring it would let an unverified change ship. It is
+# resolved only by `eagle-mem feature verify|waive` (or reconciliation when the
+# diff fingerprint no longer matches), never by age.
 exit 0

package/hooks/session-start.sh CHANGED Viewed

@@ -620,6 +620,28 @@ Files you were modifying before compact.
     fi
 fi
+# ─── Global injection ceiling (safety only — trims pathological cases) ──
+# Recall sections are appended high→low priority (overview, recent, memories,
+# plans, tasks, lanes, pending features, core files, working set). If the body
+# is pathologically large, drop whole low-priority sections from the bottom so
+# the top surfaces survive. The Active/instructions block is appended AFTER
+# this and is never trimmed.
+inject_budget=$(eagle_sessionstart_inject_budget)
+inject_trimmed=0
+if [ "${#context}" -gt "$inject_budget" ] 2>/dev/null; then
+    pre_trim_chars=${#context}
+    EAGLE_INJECT_TRIM_COUNT="$EAGLE_MEM_DIR/.inject-trim-count.${session_id}"
+    context=$(printf '%s' "$context" | eagle_trim_inject_body "$inject_budget")
+    [ -f "$EAGLE_INJECT_TRIM_COUNT" ] && inject_trimmed=$(cat "$EAGLE_INJECT_TRIM_COUNT" 2>/dev/null | tr -d '[:space:]')
+    rm -f "$EAGLE_INJECT_TRIM_COUNT" 2>/dev/null
+    unset EAGLE_INJECT_TRIM_COUNT
+    inject_trimmed=${inject_trimmed:-0}
+    if [ "$inject_trimmed" -gt 0 ] 2>/dev/null; then
+        eagle_log "WARN" "SessionStart: injection over budget (${pre_trim_chars}>${inject_budget} chars); trimmed ${inject_trimmed} low-priority section(s) for session=$session_id project=$project"
+    fi
+fi
 # ─── Instructions (compressed) ───────────────────────────
 if [ "$agent" = "codex" ]; then
@@ -651,7 +673,8 @@ if [ -n "$context" ]; then
         --arg source_type "$source_type" \
         --arg recall_scope "$recall_scope" \
         --argjson injected_chars "${#context}" \
-        '{source_type:$source_type, recall_scope:$recall_scope, injected_chars:$injected_chars}')"
+        --argjson sections_trimmed "${inject_trimmed:-0}" \
+        '{source_type:$source_type, recall_scope:$recall_scope, injected_chars:$injected_chars, sections_trimmed:$sections_trimmed}')"
     eagle_emit_context_for_agent "$agent" "SessionStart" "$context"
 fi
 eagle_hook_observability_complete 0

package/hooks/stop.sh CHANGED Viewed

@@ -240,7 +240,9 @@ if [ "$needs_enrichment" -eq 1 ]; then
         defer_enrichment=1
         eagle_log "INFO" "Stop: LLM enrichment skipped on fast hook path; provider=$provider"
     elif [ "$provider" != "none" ] && [ -n "$text_content" ]; then
-        excerpt=$(echo "$text_content" | tail -c 3000)
+        # Redact secrets BEFORE the transcript tail is sent to an LLM provider
+        # (the fallback chain includes remote Anthropic/OpenAI APIs).
+        excerpt=$(echo "$text_content" | tail -c 3000 | eagle_redact)
         enrich_prompt="Extract facts from this AI coding session. Only include items with clear evidence in the session text. Do NOT invent or repeat example content.
@@ -437,11 +439,14 @@ if [ "$defer_enrichment" -eq 1 ] && [ "${EAGLE_MEM_STOP_BACKGROUND_ENRICH:-1}" =
     mkdir -p "$EAGLE_MEM_DIR/tmp" 2>/dev/null || true
     enrich_job=$(mktemp "$EAGLE_MEM_DIR/tmp/summary-enrich.XXXXXX.json" 2>/dev/null)
     if [ -n "$enrich_job" ]; then
+        # Redact secrets BEFORE persisting the transcript to the job file. The
+        # background enricher sends this text to an LLM provider (possibly remote).
+        enrich_text=$(printf '%s' "$text_content" | eagle_redact)
         jq -cn \
             --arg session_id "$session_id" \
             --arg project "$project" \
             --arg agent "$agent" \
-            --arg text "$text_content" \
+            --arg text "$enrich_text" \
             '{session_id:$session_id, project:$project, agent:$agent, text:$text}' > "$enrich_job"
         enrich_script="$SCRIPT_DIR/../scripts/enrich-summary.sh"

package/integrations/google_antigravity_hook.py CHANGED Viewed

@@ -5,6 +5,7 @@ and tasks, enforce release-boundary guardrails, and survive context compaction.
 """
 import os
+import re
 import sys
 import json
 import asyncio
@@ -16,6 +17,34 @@ from typing import Any, Dict, List, Optional
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger("eagle_mem.antigravity")
+# Trusted install roots, resolved relative to THIS hook file (never os.getcwd()),
+# so an opened/untrusted workspace cannot shadow the real eagle-mem binaries or
+# hook scripts by dropping bin/eagle-mem or hooks/*.sh into the working dir.
+_REPO_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+_EAGLE_HOME = os.path.join(os.path.expanduser("~"), ".eagle-mem")
+def _resolve_eagle_mem_bin() -> Optional[str]:
+    """Resolve the eagle-mem binary from trusted install roots only."""
+    for path in (
+        os.path.join(_EAGLE_HOME, "bin", "eagle-mem"),
+        os.path.join(_REPO_ROOT, "bin", "eagle-mem"),
+    ):
+        if os.path.exists(path):
+            return path
+    return None
+def _resolve_hook_script(script_name: str) -> Optional[str]:
+    """Resolve an Eagle Mem hook script from trusted install roots only."""
+    for path in (
+        os.path.join(_EAGLE_HOME, "hooks", script_name),
+        os.path.join(_REPO_ROOT, "hooks", script_name),
+    ):
+        if os.path.exists(path):
+            return path
+    return None
 # --- Import and Mock Fallback for google-antigravity SDK ---
 try:
     from google.antigravity import types
@@ -66,26 +95,27 @@ except ImportError:
 # --- Asynchronous Subprocess Helpers ---
-async def run_cmd_async(cmd: List[str]) -> str:
-    """Runs a shell command asynchronously and returns stdout."""
-    # Prioritize workspace-local bin/eagle-mem if running from workspace or if it exists in cwd or relative to this script
+async def run_cmd_async(cmd: List[str], stdin_data: Optional[str] = None) -> str:
+    """Runs a shell command asynchronously and returns stdout.
+    Resolves the eagle-mem binary from the install dir (next to this hook file),
+    never from os.getcwd(), so an opened workspace cannot shadow the real binary.
+    Optional stdin_data is piped in so sensitive values stay out of `ps`.
+    """
     if cmd and cmd[0] == "eagle-mem":
-        local_paths = [
-            os.path.join(os.getcwd(), "bin", "eagle-mem"),
-            os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "bin", "eagle-mem")
-        ]
-        for path in local_paths:
-            if os.path.exists(path):
-                cmd = [path] + cmd[1:]
-                break
+        resolved = _resolve_eagle_mem_bin()
+        if resolved:
+            cmd = [resolved] + cmd[1:]
     try:
         proc = await asyncio.create_subprocess_exec(
             *cmd,
+            stdin=asyncio.subprocess.PIPE if stdin_data is not None else None,
             stdout=asyncio.subprocess.PIPE,
             stderr=asyncio.subprocess.PIPE
         )
-        stdout, stderr = await proc.communicate()
+        stdin_bytes = stdin_data.encode('utf-8') if stdin_data is not None else None
+        stdout, stderr = await proc.communicate(input=stdin_bytes)
         if proc.returncode == 0:
             return stdout.decode('utf-8', errors='ignore')
         else:
@@ -102,14 +132,11 @@ async def run_cmd_async(cmd: List[str]) -> str:
 async def run_hook_async(script_name: str, input_data: Dict[str, Any]) -> str:
     """Runs an Eagle Mem bash hook script asynchronously, piping JSON via stdin."""
-    # Resolve the physical path of the hook script
-    # First check ~/.eagle-mem/hooks/, then fall back to workspace hooks/
-    home_dir = os.path.expanduser("~")
-    hook_path = os.path.join(home_dir, ".eagle-mem", "hooks", script_name)
-    if not os.path.exists(hook_path):
-        hook_path = os.path.join(os.getcwd(), "hooks", script_name)
-    if not os.path.exists(hook_path):
+    # Resolve the hook script from trusted install roots only (~/.eagle-mem/hooks
+    # then the repo's hooks/). Never os.getcwd() — an opened workspace could
+    # otherwise shadow the real hook with a malicious script.
+    hook_path = _resolve_hook_script(script_name)
+    if not hook_path:
         logger.error(f"Eagle Mem hook script not found: {script_name}")
         return ""
@@ -134,12 +161,19 @@ async def run_hook_async(script_name: str, input_data: Dict[str, Any]) -> str:
 # --- Utility Helpers ---
+_SESSION_ID_RE = re.compile(r"^[A-Za-z0-9_-]{1,128}$")
 def get_session_id() -> str:
     """Retrieves or generates a persistent session ID for the current context."""
-    # Use environment variables if present (matches Claude/Codex)
+    # Use environment variables if present (matches Claude/Codex). Validate the
+    # value: the session ID flows into file paths inside the bash hooks, so an
+    # unvalidated env var could enable path traversal. Reject anything that is
+    # not a UUID/hex-style token and fall back to a generated ID.
     session_id = os.environ.get("EAGLE_SESSION_ID") or os.environ.get("SESSION_ID")
-    if not session_id:
-        # Generate a unique hex session ID
+    if not session_id or not _SESSION_ID_RE.match(session_id):
+        if session_id:
+            logger.warning("Ignoring malformed session ID from environment; generating a new one")
         import uuid
         session_id = f"agy-{uuid.uuid4().hex[:16]}"
     return session_id
@@ -343,13 +377,14 @@ class EagleMemAntigravityHook:
         # 1. Trigger Stop hook (stop.sh) in background to capture transcript/summary
         asyncio.create_task(run_hook_async("stop.sh", input_data))
-        # 2. Concurrently trigger session save CLI in background
+        # 2. Concurrently trigger session save CLI in background. Pass the
+        # summary via stdin (not argv) so it is not visible in `ps`.
         summary = final_response[:200] if final_response else ""
         asyncio.create_task(run_cmd_async([
             "eagle-mem", "session", "save",
-            "--summary", summary,
+            "--summary-stdin",
             "--agent", self.agent_name
-        ]))
+        ], stdin_data=summary))
     async def on_compaction(self, data: Any):
         """Fires when context is compacted. Queries active tasks and injects them to survive compaction."""

package/lib/codex-hooks.sh CHANGED Viewed

@@ -149,7 +149,11 @@ eagle_patch_codex_hook() {
     local tmp
     tmp=$(mktemp)
-    jq --argjson entry "$entry" ".hooks.${event} = ((.hooks.${event} // []) + [\$entry])" "$hooks_file" > "$tmp" && mv "$tmp" "$hooks_file"
+    # Pass $event as data via --arg and index dynamically so the event name is
+    # never interpolated into the jq program string (injection-safe even if the
+    # caller ever passes an untrusted event name).
+    jq --argjson entry "$entry" --arg event "$event" \
+        '.hooks[$event] = ((.hooks[$event] // []) + [$entry])' "$hooks_file" > "$tmp" && mv "$tmp" "$hooks_file"
     chmod 600 "$hooks_file" 2>/dev/null || true
     [ -n "$description" ] && eagle_ok "$description"
 }