audrey 0.20.0 → 0.23.1

Files changed (156)
  1. package/CHANGELOG.md +191 -0
  2. package/README.md +216 -117
  3. package/SECURITY.md +29 -0
  4. package/dist/mcp-server/config.d.ts +29 -4
  5. package/dist/mcp-server/config.d.ts.map +1 -1
  6. package/dist/mcp-server/config.js +100 -17
  7. package/dist/mcp-server/config.js.map +1 -1
  8. package/dist/mcp-server/index.d.ts +302 -25
  9. package/dist/mcp-server/index.d.ts.map +1 -1
  10. package/dist/mcp-server/index.js +1077 -74
  11. package/dist/mcp-server/index.js.map +1 -1
  12. package/dist/src/adaptive.d.ts.map +1 -1
  13. package/dist/src/adaptive.js +3 -1
  14. package/dist/src/adaptive.js.map +1 -1
  15. package/dist/src/affect.d.ts +4 -1
  16. package/dist/src/affect.d.ts.map +1 -1
  17. package/dist/src/affect.js +6 -4
  18. package/dist/src/affect.js.map +1 -1
  19. package/dist/src/audrey.d.ts +58 -4
  20. package/dist/src/audrey.d.ts.map +1 -1
  21. package/dist/src/audrey.js +469 -62
  22. package/dist/src/audrey.js.map +1 -1
  23. package/dist/src/capsule.d.ts +2 -1
  24. package/dist/src/capsule.d.ts.map +1 -1
  25. package/dist/src/capsule.js +14 -4
  26. package/dist/src/capsule.js.map +1 -1
  27. package/dist/src/causal.d.ts.map +1 -1
  28. package/dist/src/causal.js +20 -2
  29. package/dist/src/causal.js.map +1 -1
  30. package/dist/src/confidence.d.ts.map +1 -1
  31. package/dist/src/confidence.js +3 -0
  32. package/dist/src/confidence.js.map +1 -1
  33. package/dist/src/consolidate.d.ts +1 -0
  34. package/dist/src/consolidate.d.ts.map +1 -1
  35. package/dist/src/consolidate.js +35 -19
  36. package/dist/src/consolidate.js.map +1 -1
  37. package/dist/src/controller.d.ts +38 -0
  38. package/dist/src/controller.d.ts.map +1 -0
  39. package/dist/src/controller.js +169 -0
  40. package/dist/src/controller.js.map +1 -0
  41. package/dist/src/db.d.ts.map +1 -1
  42. package/dist/src/db.js +12 -0
  43. package/dist/src/db.js.map +1 -1
  44. package/dist/src/decay.d.ts.map +1 -1
  45. package/dist/src/decay.js +57 -50
  46. package/dist/src/decay.js.map +1 -1
  47. package/dist/src/embedding.d.ts.map +1 -1
  48. package/dist/src/embedding.js +31 -3
  49. package/dist/src/embedding.js.map +1 -1
  50. package/dist/src/encode.d.ts +9 -2
  51. package/dist/src/encode.d.ts.map +1 -1
  52. package/dist/src/encode.js +21 -8
  53. package/dist/src/encode.js.map +1 -1
  54. package/dist/src/export.d.ts.map +1 -1
  55. package/dist/src/export.js +5 -3
  56. package/dist/src/export.js.map +1 -1
  57. package/dist/src/feedback.d.ts +29 -0
  58. package/dist/src/feedback.d.ts.map +1 -0
  59. package/dist/src/feedback.js +123 -0
  60. package/dist/src/feedback.js.map +1 -0
  61. package/dist/src/forget.d.ts.map +1 -1
  62. package/dist/src/forget.js +58 -50
  63. package/dist/src/forget.js.map +1 -1
  64. package/dist/src/fts.js +1 -1
  65. package/dist/src/fts.js.map +1 -1
  66. package/dist/src/hybrid-recall.d.ts +2 -1
  67. package/dist/src/hybrid-recall.d.ts.map +1 -1
  68. package/dist/src/hybrid-recall.js +35 -26
  69. package/dist/src/hybrid-recall.js.map +1 -1
  70. package/dist/src/impact.d.ts +47 -0
  71. package/dist/src/impact.d.ts.map +1 -0
  72. package/dist/src/impact.js +146 -0
  73. package/dist/src/impact.js.map +1 -0
  74. package/dist/src/import.d.ts +177 -1
  75. package/dist/src/import.d.ts.map +1 -1
  76. package/dist/src/import.js +206 -17
  77. package/dist/src/import.js.map +1 -1
  78. package/dist/src/index.d.ts +8 -0
  79. package/dist/src/index.d.ts.map +1 -1
  80. package/dist/src/index.js +4 -0
  81. package/dist/src/index.js.map +1 -1
  82. package/dist/src/interference.d.ts +5 -2
  83. package/dist/src/interference.d.ts.map +1 -1
  84. package/dist/src/interference.js +27 -20
  85. package/dist/src/interference.js.map +1 -1
  86. package/dist/src/llm.d.ts.map +1 -1
  87. package/dist/src/llm.js +1 -0
  88. package/dist/src/llm.js.map +1 -1
  89. package/dist/src/migrate.d.ts.map +1 -1
  90. package/dist/src/migrate.js +21 -9
  91. package/dist/src/migrate.js.map +1 -1
  92. package/dist/src/preflight.d.ts +52 -0
  93. package/dist/src/preflight.d.ts.map +1 -0
  94. package/dist/src/preflight.js +221 -0
  95. package/dist/src/preflight.js.map +1 -0
  96. package/dist/src/profile.d.ts +23 -0
  97. package/dist/src/profile.d.ts.map +1 -0
  98. package/dist/src/profile.js +51 -0
  99. package/dist/src/profile.js.map +1 -0
  100. package/dist/src/promote.d.ts.map +1 -1
  101. package/dist/src/promote.js +2 -3
  102. package/dist/src/promote.js.map +1 -1
  103. package/dist/src/prompts.d.ts.map +1 -1
  104. package/dist/src/prompts.js +76 -47
  105. package/dist/src/prompts.js.map +1 -1
  106. package/dist/src/recall.d.ts +9 -6
  107. package/dist/src/recall.d.ts.map +1 -1
  108. package/dist/src/recall.js +182 -40
  109. package/dist/src/recall.js.map +1 -1
  110. package/dist/src/redact.d.ts +7 -1
  111. package/dist/src/redact.d.ts.map +1 -1
  112. package/dist/src/redact.js +94 -11
  113. package/dist/src/redact.js.map +1 -1
  114. package/dist/src/reflexes.d.ts +35 -0
  115. package/dist/src/reflexes.d.ts.map +1 -0
  116. package/dist/src/reflexes.js +87 -0
  117. package/dist/src/reflexes.js.map +1 -0
  118. package/dist/src/rollback.d.ts.map +1 -1
  119. package/dist/src/rollback.js +9 -4
  120. package/dist/src/rollback.js.map +1 -1
  121. package/dist/src/routes.d.ts +1 -0
  122. package/dist/src/routes.d.ts.map +1 -1
  123. package/dist/src/routes.js +267 -11
  124. package/dist/src/routes.js.map +1 -1
  125. package/dist/src/rules-compiler.d.ts.map +1 -1
  126. package/dist/src/rules-compiler.js +36 -6
  127. package/dist/src/rules-compiler.js.map +1 -1
  128. package/dist/src/server.d.ts +2 -1
  129. package/dist/src/server.d.ts.map +1 -1
  130. package/dist/src/server.js +42 -4
  131. package/dist/src/server.js.map +1 -1
  132. package/dist/src/tool-trace.d.ts.map +1 -1
  133. package/dist/src/tool-trace.js +42 -29
  134. package/dist/src/tool-trace.js.map +1 -1
  135. package/dist/src/types.d.ts +28 -1
  136. package/dist/src/types.d.ts.map +1 -1
  137. package/dist/src/ulid.d.ts.map +1 -1
  138. package/dist/src/ulid.js +52 -2
  139. package/dist/src/ulid.js.map +1 -1
  140. package/dist/src/utils.d.ts.map +1 -1
  141. package/dist/src/utils.js +8 -1
  142. package/dist/src/utils.js.map +1 -1
  143. package/dist/src/validate.d.ts +2 -0
  144. package/dist/src/validate.d.ts.map +1 -1
  145. package/dist/src/validate.js +60 -29
  146. package/dist/src/validate.js.map +1 -1
  147. package/docs/assets/audrey-feature-grid.jpg +0 -0
  148. package/docs/assets/audrey-logo.svg +45 -0
  149. package/docs/assets/audrey-wordmark.png +0 -0
  150. package/examples/ollama-memory-agent.js +326 -0
  151. package/package.json +35 -22
  152. package/docs/assets/benchmarks/local-benchmark.svg +0 -45
  153. package/docs/assets/benchmarks/operations-benchmark.svg +0 -45
  154. package/docs/assets/benchmarks/published-memory-standards.svg +0 -50
  155. package/docs/benchmarking.md +0 -151
  156. package/docs/production-readiness.md +0 -124
package/CHANGELOG.md ADDED
@@ -0,0 +1,191 @@
1
+ # Changelog
2
+
3
+ ## 0.23.1 - 2026-05-08
4
+
5
+ ### Added - Audrey Guard chassis
6
+
7
+ - Added `MemoryController` as the first orchestration layer for memory-before-action workflows. `beforeAction()` returns `allow` / `warn` / `block` with evidence, reflexes, recommendations, and an optional capsule; `afterAction()` records redacted tool outcomes and turns failures into tool-result memories.
8
+ - Added `audrey guard --tool <Tool> "<action>"` with `--json`, `--explain`, `--override`, and `--fail-on-warn`.
9
+ - Added `audrey demo --scenario repeated-failure`, a deterministic no-network demo where Audrey records a failed deploy, blocks the repeat attempt, validates the lesson, and prints impact.
10
+ - `Audrey.encodeBatch()` now uses provider-level `embedBatch()` and validates the batch before embedding, avoiding N sequential cloud embedding calls for valid batches.
11
+ - Recall now surfaces partial vector/FTS failures on the returned result array. Capsules preserve those diagnostics, strict Guard preflights block when recall is degraded, and `/v1/status` / `memory_status` expose the latest recall degradation signal.
12
+ - Added `docs/AUDREY_PAPER_OUTLINE.md`, framing Audrey Guard as local-first pre-action memory control for tool-using agents and outlining the GuardBench evaluation plan.
13
+
14
+ ### Fixed
15
+
16
+ - Docker Compose now requires `AUDREY_API_KEY` instead of starting a non-loopback unauthenticated REST sidecar that the server correctly refuses.
17
+ - Guard exact-failure matching now redacts before trimming, matches tool names case-insensitively, and includes file scope in the action hash.
18
+ - Redaction-aware truncation keeps complete `[REDACTED:*]` markers in long tool errors and output summaries.
19
+ - `npm test` and `npm run test:watch` now set a repo-local Vitest temp directory before Vitest starts, avoiding locked-down Windows user-temp failures.
20
+ - `npm audit --omit=dev --audit-level=moderate` is clean after refreshing Hono, Zod, and transitive rate-limit packages.
21
+ - README benchmark sample values now match `benchmarks/snapshots/perf-0.22.2.json`; the paper evidence ledger was re-checked for the repeated-failure demo line range and live bibliography URLs before release prep.
22
+
23
+ ## 0.22.2 - 2026-05-01
24
+
25
+ ### Correctness — second CodeRabbit review pass and code-scanning audit
26
+
27
+ - `src/forget.ts` `WHERE v.state ...` was filtering on the denormalized state column on `vec_semantics` / `vec_procedures`. That column is only populated at INSERT and never updated, so dormant or superseded rows were still passing the filter. Switched to `s.state` / `p.state`. Same fix applied to `src/interference.ts` after the second review pass caught the duplicate.
28
+ - Wrapped `forgetMemory`, `purgeMemories`, `applyDecay`, `applyInterference`, and the contradiction insert + state update in `src/validate.ts` in transactions so partial failures can't leave inconsistent counts or orphan contradictions.
29
+ - `mcp-server/index.ts` `VALID_SOURCES` and `VALID_TYPES` were object literals fed to `z.enum()`, which expects a tuple. Converted to const tuples so the MCP schemas validate correctly.
30
+ - `src/utils.ts` `cosineSimilarity` now throws on length mismatch instead of silently returning NaN; `daysBetween` throws on invalid date strings.
31
+ - `src/ulid.ts` `generateDeterministicId` rebuilt as canonicalize → SHA-256 → first 16 bytes → Crockford Base32. The previous shape used `JSON.stringify` (object-key-order-unstable) and emitted hex characters, neither of which produced a real ULID. `canonicalize` now also rejects circular references.
32
+ - `src/audrey.ts` constructor and `consolidate`/`decay` now use `??` for default fallbacks so an explicit `0` survives. The previous `||` short-circuit silently replaced valid zero-value config.
33
+ - `src/audrey.ts` `recallStream` now respects `options.agent` (was hardcoded to `this.agent`) and waits for embedding warmup like the non-streaming path.
34
+ - `src/confidence.ts` `recencyDecay` throws `RangeError` on `halfLifeDays <= 0` to surface NaN/Infinity earlier in the pipeline.
35
+ - `src/causal.ts` and `src/validate.ts` now validate the LLM response shape before reading fields. `causal` rejects non-finite confidence; `validate` rejects non-object/array conditions and only counts new evidence toward `supporting_count`.
36
+ - `src/rollback.ts` UPDATEs now check `.changes` and aggregate real counts. Rolling back ids that don't exist no longer reports false success.
37
+ - `src/rules-compiler.ts` `quoteString` now also escapes newline, carriage return, and tab so promoted rule content with multiline values produces valid double-quoted YAML.
38
+ - `src/decay.ts` and `src/forget.ts purgeMemories` moved their SELECTs inside the surrounding transaction so concurrent writers can't slip rows in or out between read and write.
39
+ - `src/migrate.ts` `reembedAll` chunks `embedBatch` calls into 256-row batches and labels failures by kind + row range. Pre-fix a partial embed failure on a 50K-episode reembed printed a bare provider error and lost the location. `EpisodeMigrateRow.consolidated` was also retyped to `number | null` to match runtime usage.
40
+ - `src/embedding.ts` `embedBatch` validates response shape with clear errors instead of mapping over a missing or malformed `data` field.
41
+ - `src/encode.ts` `effectiveSalience` clamped to `[0, 1]`. The previous formula could go negative on a sufficiently negative arousal boost.
42
+ - `src/affect.ts` `timeDeltaDays` no longer propagates NaN from invalid `created_at`.
43
+ - `src/capsule.ts` failure entry `memory_id` no longer interpolates `'undefined'` when `tool_name` is missing; recall spread order keeps `scope: 'agent'` from being overridden by caller options.
44
+ - `src/import.ts` `isDatabaseEmpty` now also checks `memory_events`. Pre-fix you could `restore` into a "fresh" store that already contained audit-trail rows.
45
+ - `src/server.ts` shutdown awaits `server.close` (was fire-and-forget) and surfaces `audrey.closeAsync` errors to stderr instead of silently swallowing them. `ERR_SERVER_NOT_RUNNING` is treated as success.
46
+ - `src/feedback.ts` replaced a `findRow(id)!.row` non-null assertion with a defensive null check; if the row was concurrently forgotten between UPDATE and re-read, returns the values just written rather than crashing.
47
+ - `src/promote.ts` folded `trigger_conditions` into the main SELECT (was an N+1).
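
The `generateDeterministicId` pipeline described in the Correctness list above (canonicalize → SHA-256 → first 16 bytes → Crockford Base32) could look roughly like the sketch below. This is an illustration written for this diff, not the code in `src/ulid.ts`: the helper names `canonicalize`, `crockfordBase32`, and `deterministicId`, and the handling of `undefined` and functions, are assumptions.

```typescript
import { createHash } from "node:crypto";

// Crockford Base32 alphabet (omits I, L, O, U).
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Serialize a JSON-like value with sorted object keys so key order cannot
// change the output. Throws on circular references (assumed behavior).
function canonicalize(value: unknown, seen: Set<object> = new Set()): string {
  if (value === null || typeof value !== "object") {
    const s = JSON.stringify(value);
    return s === undefined ? "null" : s; // undefined and functions become null
  }
  if (seen.has(value)) throw new TypeError("circular reference");
  seen.add(value);
  let out: string;
  if (Array.isArray(value)) {
    out = "[" + value.map((v) => canonicalize(v, seen)).join(",") + "]";
  } else {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    out = "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(obj[k], seen)).join(",") + "}";
  }
  seen.delete(value); // shared (non-circular) references are still allowed
  return out;
}

// Encode exactly 16 bytes (128 bits) as 26 Crockford Base32 characters.
// Two zero bits are prepended so the 130 bits split into 26 groups of 5.
function crockfordBase32(bytes: Uint8Array): string {
  let bits = "00";
  for (let i = 0; i < bytes.length; i++) {
    bits += bytes[i].toString(2).padStart(8, "0");
  }
  let out = "";
  for (let i = 0; i < bits.length; i += 5) {
    out += CROCKFORD.charAt(parseInt(bits.slice(i, i + 5), 2));
  }
  return out;
}

function deterministicId(input: unknown): string {
  const digest = createHash("sha256").update(canonicalize(input)).digest();
  return crockfordBase32(digest.subarray(0, 16));
}
```

Because the keys are sorted before hashing, `deterministicId({ a: 1, b: 2 })` and `deterministicId({ b: 2, a: 1 })` produce the same 26-character id, which is the property the changelog says the old `JSON.stringify` shape lacked.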
48
+
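
The `??` versus `||` fix noted in the Correctness list above is easy to reproduce in isolation. The sketch below uses a hypothetical `dormantThreshold` option and a made-up default of `0.1`; neither name nor value comes from `src/audrey.ts`.

```typescript
// Hypothetical config shape, for illustration only.
interface DecayOptions {
  dormantThreshold?: number;
}

// `||` treats every falsy value as "missing", so an explicit 0 is replaced.
function thresholdWithOr(opts: DecayOptions): number {
  return opts.dormantThreshold || 0.1;
}

// `??` only falls back on null or undefined, so an explicit 0 survives.
function thresholdWithNullish(opts: DecayOptions): number {
  return opts.dormantThreshold ?? 0.1;
}
```

With `{ dormantThreshold: 0 }`, the first function returns the default and the second returns `0`; with `{}` both return the default.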
49
+ ### Security
50
+
51
+ - `src/routes.ts` API key auth uses padded-buffer constant-time comparison. The previous `provided.length !== expected.length || !timingSafeEqual(...)` shape leaked the expected key length via response timing on local untrusted callers. Both buffers are now padded to 1 KiB before `timingSafeEqual`, so the comparison runs identically regardless of header length.
52
+ - `src/redact.ts` raised the hex-secret length threshold from 40 to 80 chars so 40-character git SHAs and 64-character SHA-256 checksums are no longer redacted as secrets.
53
+ - The "Protect master" GitHub ruleset was updated to drop the stale `Node 18 on Ubuntu` required check (CI dropped Node 18 from the matrix in 0.22.1 to match `engines.node >=20`, but the protection rule kept requiring a check that would never run).
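
The padded-buffer comparison described in the first Security item above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `src/routes.ts`: the function names are hypothetical, and the trailing byte-length check (which keeps `"key"` and `"key\0"` from comparing equal after zero padding) is an assumption about how such a scheme would be completed.

```typescript
import { timingSafeEqual } from "node:crypto";

const PAD_BYTES = 1024; // 1 KiB, as stated in the changelog entry

// Copy the UTF-8 bytes of `value` into a zero-filled buffer of fixed size.
// Bytes beyond `size` are dropped.
function padTo(value: string, size: number): Buffer {
  const out = Buffer.alloc(size);
  const src = Buffer.from(value, "utf8");
  out.set(src.subarray(0, size));
  return out;
}

// Hypothetical helper: compares two keys over equal-sized padded buffers so
// that timingSafeEqual always does the same amount of work.
function apiKeyMatches(provided: string, expected: string): boolean {
  const equal = timingSafeEqual(padTo(provided, PAD_BYTES), padTo(expected, PAD_BYTES));
  // Assumption: also require equal byte lengths, evaluated after the padded
  // comparison has already run, so padding cannot create false matches.
  const sameLength = Buffer.byteLength(provided, "utf8") === Buffer.byteLength(expected, "utf8");
  return equal && sameLength;
}
```

`timingSafeEqual` throws when its arguments differ in byte length, which is why the earlier shape needed a length check before the call; padding both sides to a fixed size removes that early exit.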
54
+
55
+ ### Added — closed-loop visibility on REST and Python
56
+
57
+ - New `GET /v1/impact` route that mirrors `Audrey.impact()` and the `audrey impact` CLI. Bounds `windowDays` to 1-365 and `limit` to 1-100.
58
+ - Python sync and async clients gained an `impact(window_days=, limit=)` method. The previous `analytics()` no longer raises `NotImplementedError`; it's an alias of `impact()` for older callers.
59
+ - Python integration tests are no longer skipped. The suite spins up the real TS REST sidecar via `node dist/mcp-server/index.js serve` and exercises encode → recall → mark_used → impact → snapshot → restore end-to-end.
60
+
61
+ ### Benchmarks — legitimate performance snapshot, no marketing graphs
62
+
63
+ - New `npm run bench:perf-snapshot` (`benchmarks/perf-snapshot.js`) reports encode and hybrid-recall p50/p95/p99 across multiple corpus sizes (default 100, 1000, 5000) with full machine provenance (Node version, CPU model, RAM, git SHA) so the numbers are reproducible.
64
+ - Removed the synthetic-baseline SVG charts (`docs/assets/benchmarks/local-benchmark.svg`, `operations-benchmark.svg`, `published-memory-standards.svg`) from the repo and from the npm package's `files` field. They claimed Audrey beat naive baselines on 12 hand-crafted scenarios, which is not a useful marketing signal. The behavioral regression suite (`npm run bench:memory:check`) still runs as a release gate; it just no longer ships chart artifacts to the README.
65
+ - Removed the `bench:memory:readme-assets` script (it generated the SVGs above).
66
+ - README's Benchmarks section rewritten around the perf snapshot with explicit caveats about embedding-provider cost and what the numbers do and don't cover.
67
+
68
+ ### Fixed
69
+
70
+ - `mcp-server/index.ts` help banner: `memory_validate` was already registered but was missing from the in-session tool list.
71
+ - `CHANGELOG.md` 0.22.1 contradicted itself by stating `mark_used()` was both upgraded to a real call and still raises `NotImplementedError`. Removed the stale duplicate.
72
+
73
+ ### Personal-data cleanup
74
+
75
+ - `tests/http-api.test.js` no longer references "Tyler" — replaced with generic test fixtures so the public test suite has no personal identifiers.
76
+
77
+ ## 0.22.1 - 2026-04-30
78
+
79
+ ### Added — `audrey impact` report
80
+
81
+ - New `audrey impact` CLI command (also `--json` for automation, `--window N` for the lookback window in days, `--limit N` for how many rows in each list).
82
+ - Shows: total memories by type, all-time validated count, recent validations, top-N most-used memories, weakest-N (lowest salience — candidates to forget), and recent activity timeline.
83
+ - Backed by `src/impact.ts` (`buildImpactReport`, `formatImpactReport`) and `Audrey.impact({ windowDays, limit })`.
84
+ - This is the marketing surface the adversary called for: vital signs over CI verdicts. As agents start calling `memory_validate`, the report accumulates the "X failures prevented this week, Y procedures auto-promoted" story.
85
+
86
+ ### Added — closed-loop feedback (the "memory before action" wedge)
87
+
88
+ - New `memory_validate(id, outcome)` MCP tool. `outcome` is one of:
89
+ - `"helpful"` — the recalled memory drove a correct action. Reinforces salience and bumps `retrieval_count` for semantic/procedural rows.
90
+ - `"wrong"` — the memory was misleading. Decreases salience and bumps `challenge_count` for semantic memories.
91
+ - `"used"` — neutral signal that the memory was referenced (smaller salience delta than `helpful`).
92
+ - New REST endpoints `POST /v1/validate` (canonical) and `POST /v1/mark-used` (legacy alias defaulting to `outcome=used`).
93
+ - New `Audrey.validate({ id, outcome })` SDK method emits a `'validate'` event so consumers can audit feedback flow.
94
+ - New `src/feedback.ts` module with the `applyFeedback()` primitive — kept out of `audrey.ts` per architecture review (god-class concern).
95
+ - Python client `mark_used()` is no longer a `NotImplementedError`; calls `/v1/mark-used`. New `validate(memory_id, outcome="used"|"helpful"|"wrong")` method on both sync and async clients.
96
+ - 10 new tests (6 SDK math, 1 MCP enum, 3 HTTP roundtrip including 404 path).
97
+
98
+ This is the P0#1 item from `docs/PRODUCTION_BACKLOG.md` — the closed feedback loop that lifts the autopilot rubric's ALIVE dimension from 4 to 7+. The math reuses the existing `confidence.ts` reinforcement formula; the new column work is a no-op (`usage_count` and `last_used_at` were already added by migration 10 in v0.21).
99
+
100
+ ### Security
101
+
102
+ - HTTP `/v1/recall` and `/v1/capsule` no longer body-spread caller options into `audrey.recall()`. Pre-fix, `includePrivate: true` and `confidenceConfig` overrides could be passed in HTTP bodies, bypassing the private-memory ACL and integrity controls. The new `sanitizeRecallOptions()` allowlist drops anything not in a known-safe key set.
103
+ - `audrey serve` defaults to binding `127.0.0.1` (was `0.0.0.0`). Refuses to start on a non-loopback host without `AUDREY_API_KEY` unless `AUDREY_ALLOW_NO_AUTH=1`. New `AUDREY_HOST` env var explicitly opts in to network exposure.
104
+ - HTTP API key comparison uses `crypto.timingSafeEqual` instead of string `!==` to avoid prefix-match timing leaks on local untrusted callers.
105
+ - `audrey promote --yes` refuses to write `.claude/rules/*.md` outside `process.cwd()` unless the target path is in `AUDREY_PROMOTE_ROOTS`. Prevents a malicious MCP caller from writing persistent prompt-injection files into the user's `~/.claude/` directory.
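
The allowlist approach behind `sanitizeRecallOptions()` in the first Security item above can be illustrated with a short sketch. The real key set in `src/routes.ts` is not shown in this diff, so the `limit` and `retrieval` keys and the `sanitizeRecallOptionsSketch` name are assumptions chosen for illustration.

```typescript
// Hypothetical allowlist; the actual known-safe key set is not shown here.
const SAFE_RECALL_KEYS = ["limit", "retrieval"] as const;
type SafeRecallKey = (typeof SAFE_RECALL_KEYS)[number];

// Copy only allowlisted keys from an untrusted HTTP body. Anything else,
// including `includePrivate` and `confidenceConfig`, is silently dropped.
function sanitizeRecallOptionsSketch(body: unknown): Partial<Record<SafeRecallKey, unknown>> {
  const out: Partial<Record<SafeRecallKey, unknown>> = {};
  if (body === null || typeof body !== "object" || Array.isArray(body)) return out;
  const input = body as Record<string, unknown>;
  for (const key of SAFE_RECALL_KEYS) {
    if (Object.prototype.hasOwnProperty.call(input, key)) {
      out[key] = input[key];
    }
  }
  return out;
}
```

Building a fresh object from a fixed key list, rather than spreading the body and deleting known-bad keys, means any option added to `recall()` later stays unreachable over HTTP until it is deliberately allowlisted.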
106
+
107
+ ### First-contact UX
108
+
109
+ - `audrey --help`, `audrey --version`, and `audrey help`/`audrey version` now print help/version and exit 0 instead of silently dropping into the MCP stdio server. Unknown subcommands print error + help and exit 2.
110
+ - ONNX runtime EP-assignment warnings ("Some nodes were not assigned to the preferred execution providers...") are suppressed by default via per-session `logSeverityLevel`. Set `AUDREY_ONNX_VERBOSE=1` to restore the original behavior.
111
+ - `[audrey-mcp]` info boot logs (server started, connected via stdio, warmup completed) are gated behind `AUDREY_DEBUG=1`. Warmup-failure errors continue to log unconditionally.
112
+
113
+ ### Reliability
114
+
115
+ - `audrey.close()` now warns to stderr when called with pending post-encode consolidation work. New `audrey.closeAsync()` awaits `drainPostEncodeQueue()` before closing the database. All CLI subcommands (`reembed`, `dream`, `greeting`, `reflect`, `demo`, `observe-tool`, `promote`) use `closeAsync` to prevent the silent-data-loss race introduced in v0.22.0 where post-encode validation/interference could hit a closed DB.
116
+ - `_emitQueueError` reverted to the standard EventEmitter idiom: emit `error` when a listener is attached, fall back to `console.error` otherwise. v0.22.0 always called `console.error` and produced duplicate stderr lines for apps with structured error pipelines.
117
+ - `encodeBatch` now reuses the encode vector across post-encode stages and routes through `_enqueuePostEncode` (matching `encode`). Pre-fix, batch callers paid 4× embed cost per item and silently bypassed interference/resonance — a behavior divergence from single-encode that the v0.22.0 perf pass missed.
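
The "standard EventEmitter idiom" that the `_emitQueueError` item above refers to can be sketched like this. The class name, log prefix, and string return value are hypothetical and exist only to make the two paths visible; they are not taken from `src/audrey.ts`.

```typescript
import { EventEmitter } from "node:events";

class QueueOwner extends EventEmitter {
  // Emit `error` only when a listener is attached; otherwise log to stderr.
  // Emitting `error` with no listener would make EventEmitter throw.
  emitQueueError(err: Error): "emitted" | "logged" {
    if (this.listenerCount("error") > 0) {
      this.emit("error", err);
      return "emitted";
    }
    console.error("[queue] post-encode error:", err.message);
    return "logged";
  }
}
```

An application with a structured error pipeline attaches an `error` listener and receives each failure once; an application without one still sees the failure on stderr instead of an uncaught exception.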
118
+
119
+ ### Performance
120
+
121
+ - SQLite PRAGMA tuning at db creation: `synchronous=NORMAL` (durable under WAL), 64 MiB page cache, 256 MiB mmap, `temp_store=MEMORY`. Set `AUDREY_PRAGMA_DEFAULTS=0` to revert to better-sqlite3 defaults. Expected impact: 2-5× recall p95 at >10K episodes; 30-50% improvement on encode under sustained load.
122
+
123
+ ### Dependencies
124
+
125
+ - `sqlite-vec`: `0.1.7-alpha.2` → `0.1.9` (alpha to stable; the prior pin was 15 months old).
126
+ - `@modelcontextprotocol/sdk`: `1.26.0` → `1.29.0` (stricter schema validation, transport stability).
127
+ - `zod` `4.3.6` → `4.4.1`, `better-sqlite3` `12.6.2` → `12.9.0`, `hono` `4.12.14` → `4.12.15`, `@hono/node-server` `1.19.13` → `1.19.14`, `vitest` `4.0.18` → `4.1.5`, `typescript` `6.0.2` → `6.0.3`.
128
+ - `npm audit`: 0 vulnerabilities (production); transitive postcss CVE in vitest's vite resolved via `npm audit fix`.
129
+
130
+ ### SDK contract fixes (Python ↔ TS server)
131
+
132
+ - Python client `DEFAULT_BASE_URL` corrected from `http://127.0.0.1:3487` to `http://127.0.0.1:7437` to match the TS server's default port. Pre-fix, calling `Audrey()` with no args connected to nothing.
133
+ - Python `recall()` and `recall_response()` now decode the bare-list payload that `/v1/recall` actually returns, then wrap into `RecallResponse` client-side. Pre-fix, `recall_response()` would raise a Pydantic validation error against the real server.
134
+ - Python `restore()` now wraps the snapshot in `{"snapshot": ...}` to match the TS `/v1/import` handler that reads `body.snapshot`. Pre-fix, the server received `body.snapshot === undefined` and `audrey.import(undefined)` failed.
135
+ - Python `analytics()` raises `NotImplementedError` with a pointer to `docs/PRODUCTION_BACKLOG.md` until the analytics endpoint ships. Pre-fix, it produced a cryptic 404 from the TS sidecar that doesn't expose that endpoint. (Note: `mark_used()` was upgraded to a real call against `/v1/mark-used` in this same release — see the closed-loop section above.)
136
+ - README REST API row no longer claims `/openapi.json` or `/docs` — those routes aren't currently wired. The README now matches the actual surface (`/health` + `/v1/*`).
137
+
138
+ ### Removed
139
+
140
+ - `hybrid_strict` retrieval mode (was a silent alias of `hybrid` with no behavioral difference). Use `hybrid` (default) or `vector`.
141
+
142
+ ### Internal
143
+
144
+ - New `closeAsync(timeoutMs?: number)` on `Audrey`.
145
+ - New `sanitizeRecallOptions()` allowlist helper in `src/routes.ts`.
146
+ - `startServer` returns `hostname` alongside `port`.
147
+ - 5 new tests: CLI surface (`--help`/`--version`/unknown), HTTP recall sanitizer (privacy ACL, integrity, retrieval enum), HTTP bind safety (no-auth on LAN refused, `AUDREY_ALLOW_NO_AUTH` override).
148
+
149
+ ## 0.22.0 - 2026-04-28
150
+
151
+ ### Performance
152
+
153
+ - Encode response time: 24.7ms to 15.2ms p50, about 40% faster.
154
+ - Cold-start first encode: 525ms to 28ms with warmup, about 18.7x faster.
155
+ - Hybrid recall: 30.2ms to 14.3ms p50, about 2.1x faster.
156
+ - Eliminated 3 of 4 redundant embedding calls during encode. Validation, interference, and affect resonance now reuse the main content vector.
157
+
158
+ ### Added
159
+
160
+ - Added `memory_encode.wait_for_consolidation` parameter, default `false`, for opt-in read-after-write semantics.
161
+ - Added `memory_recall.retrieval` parameter with `"hybrid"` default and `"vector"` (FTS-bypass fast path).
162
+ - Added `pending_consolidation_count`, `embedding_warm`, `warmup_duration_ms`, and `default_retrieval_mode` to `memory_status`.
163
+ - Added background embedding pipeline warmup after MCP `server.connect()`.
164
+ - Added `AUDREY_PROFILE=1` for per-stage timings in MCP `_meta.diagnostics`.
165
+ - Added `AUDREY_DISABLE_WARMUP=1` to opt out of background embedding warmup.
166
+ - Added `benchmarks/perf.bench.js` and `npm run bench:perf` as a mock-embedding CI perf gate.
167
+
168
+ ### Changed
169
+
170
+ - Moved post-encode validation, interference, and affect resonance onto a serialized async queue so `memory_encode` no longer blocks on downstream consolidation work by default.
171
+ - Folded recall's three healthy-store vec-table count queries into one SQL roundtrip before KNN.
172
+ - Process shutdown now drains the post-encode consolidation queue with a 5-second timeout and logs pending row IDs if work remains.
173
+
174
+ ### Internal
175
+
176
+ - Added `src/profile.ts` with `ProfileRecorder`.
177
+ - Added `encodeWithDiagnostics()` and `recallWithDiagnostics()` for MCP profiling-mode response metadata.
178
+
179
+ ## 0.21.0 - Release Diagnostics and Host Setup
180
+
181
+ - Added `npx audrey doctor` for first-contact diagnostics, JSON automation, provider checks, MCP entrypoint validation, memory-store health, and host config generation.
182
+ - Added `npx audrey install --host <host> --dry-run` so Codex, Claude Code, Claude Desktop, Cursor, Windsurf, VS Code, JetBrains, and generic MCP hosts can preview setup without accidental config writes.
183
+ - Updated docs around the recommended first run: `doctor`, `demo`, safe host install preview, then host-specific verification.
184
+ - Kept Claude Code's direct installer intact while making the default release story host-neutral.
185
+ - Refreshed lockfile transitive packages through the npm resolver; vulnerability audit remains clean.
186
+
187
+ ## 0.20.0 - Memory Reflexes
188
+
189
+ - Added Memory Preflight and Memory Reflexes so agents can check memory before acting and turn repeated failures into trigger-response guidance.
190
+ - Added Ollama/local-agent guidance and runnable local-agent example.
191
+ - Expanded host-neutral MCP docs and Audrey for Dummies onboarding.
package/README.md CHANGED
@@ -1,83 +1,148 @@
1
- # Audrey
1
+ <div align="center">
2
+ <img src="docs/assets/audrey-wordmark.png" alt="Audrey wordmark" width="760">
2
3
 
3
- [![CI](https://github.com/Evilander/Audrey/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/Evilander/Audrey/actions/workflows/ci.yml)
4
- [![npm version](https://img.shields.io/npm/v/audrey.svg)](https://www.npmjs.com/package/audrey)
5
- [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
4
+ <p><strong>The local-first memory firewall for AI agents.</strong></p>
6
5
 
7
- Audrey is a persistent memory and continuity engine for Claude Code and AI agents.
6
+ <p>
7
+ Give Codex, Claude Code, Claude Desktop, Cursor, Windsurf, VS Code, JetBrains, Ollama-backed agents,
8
+ and custom agent services one durable memory layer they can check before they touch tools.
9
+ </p>
8
10
 
9
- It gives an agent a local memory store, durable recall, consolidation, contradiction handling, a REST sidecar, MCP tools, and benchmark gates without adding external infrastructure.
11
+ <p>
12
+ <a href="https://github.com/Evilander/Audrey/actions/workflows/ci.yml"><img alt="CI" src="https://github.com/Evilander/Audrey/actions/workflows/ci.yml/badge.svg?branch=master"></a>
13
+ <a href="https://www.npmjs.com/package/audrey"><img alt="npm version" src="https://img.shields.io/npm/v/audrey.svg"></a>
14
+ <a href="LICENSE"><img alt="MIT license" src="https://img.shields.io/badge/license-MIT-blue.svg"></a>
15
+ </p>
16
+ </div>
10
17
 
11
- Requires Node.js 20+.
18
+ ## Why Audrey Exists
19
+
20
+ Agents forget the exact mistakes they made yesterday. They repeat broken commands, lose project-specific rules, miss contradictions, and treat every new session like a cold start.
21
+
22
+ Audrey Guard is the headline loop: record what happened, remember what mattered, check before action, return `allow`, `warn`, or `block` with evidence, then validate whether the memory helped.
23
+
24
+ Audrey turns those hard-won lessons into a local memory runtime:
25
+
26
+ - `audrey guard --tool Bash "npm run deploy"` runs memory-before-action from the terminal.
27
+ - `memory_recall` finds durable context by semantic similarity.
28
+ - `memory_preflight` checks prior failures, risks, rules, and relevant procedures before an action.
29
+ - `memory_reflexes` converts remembered evidence into trigger-response guidance agents can follow.
30
+ - `memory_validate` closes the loop after the action — `helpful`, `used`, or `wrong` outcomes feed salience and decay.
31
+ - `memory_dream` consolidates episodes into principles and applies decay.
32
+ - `audrey impact` and `audrey doctor` tell a human or CI system whether the runtime is doing real work and is actually ready.
33
+
34
+ It is not a hosted vector database, a notes app, or a Claude-only plugin. Audrey is a SQLite-backed continuity layer that can sit under any local or sidecar agent loop.
35
+
36
+ <div align="center">
37
+ <img src="docs/assets/audrey-feature-grid.jpg" alt="Audrey feature marks: memory continuity, archive signal, recall loop, layered evidence, local node, and remembering before acting" width="760">
38
+ </div>
12
39
 
13
40
  ## Quick Start
14
41
 
15
- ### Claude Code
42
+ Requires Node.js 20+.
16
43
 
17
44
  ```bash
18
- npx audrey init
19
45
  npx audrey doctor
46
+ npx audrey demo --scenario repeated-failure
47
+ npx audrey guard --tool Bash "npm run deploy"
48
+ ```
49
+
50
+ `doctor` verifies Node, the MCP entrypoint, provider selection, memory-store health, and host config generation. The repeated-failure demo is no-key, no-host, and no-network: it creates a temporary store, records a failed deploy, teaches Audrey the fix, then shows Audrey Guard blocking the repeat attempt with evidence.
51
+
52
+ Expected first-run shape:
53
+
54
+ ```text
55
+ Audrey Doctor v0.23.1
56
+ Store health: not initialized
57
+ Verdict: ready
20
58
  ```
21
59
 
22
- This uses the default `local-offline` preset:
60
+ After the first real memory write, `doctor` should report the store as healthy.
23
61
 
24
- - registers Audrey with Claude Code
25
- - installs hooks for automatic recall and reflection
26
- - uses local embeddings by default
27
- - stores memory in one local SQLite-backed data directory
62
+ ## Install Into Agent Hosts
28
63
 
29
- ### REST or Docker Sidecar
64
+ Preview host setup without editing config files:
30
65
 
31
66
  ```bash
32
- npx audrey init sidecar-prod
33
- docker compose up -d --build
67
+ npx audrey install --host codex --dry-run
68
+ npx audrey install --host claude-code --dry-run
69
+ npx audrey install --host generic --dry-run
34
70
  ```
35
71
 
36
- Then verify:
72
+ Generate raw config blocks:
37
73
 
38
74
  ```bash
39
- npx audrey doctor
40
- curl http://localhost:3487/health
75
+ npx audrey mcp-config codex
76
+ npx audrey mcp-config generic
77
+ npx audrey mcp-config vscode
41
78
  ```
42
79
 
43
- ## Why Audrey
80
+ Claude Code can be registered directly:
44
81
 
45
- - Local-first: memory lives in SQLite with `sqlite-vec`, not a hosted vector database.
46
- - Practical: MCP, CLI, REST, JavaScript, Python, and Docker are all first-class.
47
- - Durable: snapshot, restore, health checks, benchmark gates, and graceful shutdown are built in.
48
- - Structured: Audrey does more than save notes. It consolidates, decays, tracks contradictions, and supports procedural memory.
82
+ ```bash
83
+ npx audrey install
84
+ claude mcp list
85
+ ```
49
86
 
50
- ## What Ships
87
+ All local MCP paths default to local embeddings and one shared SQLite-backed memory directory. Use `AUDREY_DATA_DIR` to isolate projects, tenants, or host identities.
51
88
 
52
- - Claude Code MCP server with 13 memory tools
53
- - Automatic hook-based recall and reflection for Claude Code sessions
54
- - JavaScript SDK
55
- - Python SDK packaged as `audrey-memory`
56
- - REST API for sidecar deployment
57
- - Docker and Compose deployment path
58
- - Snapshot and restore for portable memory state
59
- - Machine-readable health and benchmark gates
60
- - Local benchmark harness with retrieval and lifecycle-operation tracks
89
+ Installer-generated host config does not include provider API keys by default. Prefer setting `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`, or `GEMINI_API_KEY` in the host runtime environment; use `npx audrey install --include-secrets` only if you explicitly accept argv/config exposure.
61
90
 
62
- ## Setup Presets
91
+ ## Use With Ollama And Local Agents
63
92
 
64
- `npx audrey init` supports four named presets:
93
+ Ollama runs models; Audrey supplies memory. Start Audrey as a local REST sidecar and expose its routes as tools in your agent loop:
65
94
 
66
- | Preset | Best For | Behavior |
67
- |---|---|---|
68
- | `local-offline` | Claude Code on one machine | Local embeddings, MCP install, hooks install |
69
- | `hosted-fast` | Claude Code with provider keys already present | Auto-picks hosted providers from env, MCP install, hooks install |
70
- | `ci-mock` | CI and smoke tests | Mock embedding + LLM providers, no Claude-specific setup |
71
- | `sidecar-prod` | REST API and Docker deployment | Sidecar-oriented defaults, no Claude-specific setup |
95
+ ```bash
96
+ AUDREY_AGENT=ollama-local-agent npx audrey serve
97
+ curl http://localhost:7437/health
98
+ curl http://localhost:7437/v1/status
99
+ ```
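An agent loop usually has to wait for the sidecar to answer before it exposes Audrey's routes as tools. The following is a minimal Python sketch of that wait, assuming only that `GET /health` returns HTTP 200 once the sidecar is up. The response body is not inspected, and `probe` and `wait_for_sidecar` are hypothetical helper names, not part of Audrey.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx HTTP errors.
        return False


def wait_for_sidecar(url="http://localhost:7437/health", attempts=10, delay=0.5, check=probe):
    """Poll the health route until it responds or the attempts run out."""
    for attempt in range(attempts):
        if check(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # back off before the next probe
    return False
```

Calling `wait_for_sidecar()` right after starting `npx audrey serve` returns `True` as soon as the health route responds and `False` if it never does. The `check` parameter exists so the polling logic can be exercised without a running sidecar.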
72
100
 
73
- Useful checks:
101
+ Runnable example:
74
102
 
75
103
  ```bash
76
- npx audrey doctor
77
- npx audrey status
78
- npx audrey status --json --fail-on-unhealthy
104
+ AUDREY_AGENT=ollama-local-agent npx audrey serve
105
+ OLLAMA_MODEL=qwen3 node examples/ollama-memory-agent.js "What should you remember about Audrey?"
79
106
  ```
80
107
 
108
+ Core sidecar tools:
109
+
110
+ | Agent Need | REST Route |
111
+ |---|---|
112
+ | Check memory before acting | `POST /v1/preflight` |
113
+ | Get reflex rules for an action | `POST /v1/reflexes` |
114
+ | Store a useful observation | `POST /v1/encode` |
115
+ | Recall relevant context | `POST /v1/recall` |
116
+ | Get a turn-sized memory packet | `POST /v1/capsule` |
117
+ | Check health | `GET /v1/status` |
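The routes in the table above can be wrapped as plain JSON POST calls from an agent loop. The sketch below uses only the Python standard library. The route path and default port come from this README; the request body fields (`action`, `tool`) are illustrative assumptions, not Audrey's documented schema, and the `Bearer` header is inferred from the `AUDREY_API_KEY` description.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7437"  # default sidecar address from the README


def build_request(route, payload, api_key=None, base_url=BASE_URL):
    """Build (but do not send) a JSON POST for one of the /v1/* routes."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Bearer scheme is assumed from the AUDREY_API_KEY description.
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(base_url + route, data=body, headers=headers, method="POST")


def call(request, timeout=10.0):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical payload: the field names are illustrative, not Audrey's schema.
preflight = build_request("/v1/preflight", {"action": "npm run deploy", "tool": "Bash"})
```

With a sidecar running, `call(preflight)` would send the request. Check the actual request and response shapes against the sidecar before relying on any field name used here.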
118
+
119
+ ## What Ships
120
+
121
+ | Surface | Status |
122
+ |---|---|
123
+ | MCP stdio server | 20 tools plus status/recent/principles resources and briefing/recall/reflection prompts |
124
+ | CLI | `doctor`, `demo`, `guard`, `install`, `mcp-config`, `status`, `dream`, `reembed`, `observe-tool`, `promote`, `impact` |
125
+ | REST API | Hono server with `/health` and `/v1/*` routes |
126
+ | JavaScript SDK | Direct TypeScript/Node import from `audrey` |
127
+ | Python client | `pip install audrey-memory`, calls the REST sidecar |
128
+ | Storage | Local SQLite plus `sqlite-vec`, no hosted database required |
129
+ | Deployment | npm package, Docker, Compose, host-specific MCP config generation |
130
+ | Safety loop | preflight warnings, reflexes, redacted tool traces, contradiction handling |
131
+
132
+ ## Memory Model
133
+
134
+ Audrey is built around the parts of memory that matter for agents:
135
+
136
+ - Episodic memory: specific observations, tool results, preferences, and session facts.
137
+ - Semantic memory: consolidated principles extracted from repeated evidence.
138
+ - Procedural memory: remembered ways to act, avoid, retry, or verify.
139
+ - Affect and salience: emotional weight and importance influence recall.
140
+ - Interference and decay: stale, conflicting, or low-confidence memories lose authority over time.
141
+ - Contradiction handling: competing claims are tracked instead of silently overwritten.
142
+ - Tool-trace learning: failed commands and risky actions become future preflight warnings.
143
+
144
+ The product bet is simple: the next generation of useful agents will not just retrieve facts. They will remember what happened, decide whether a memory is still trustworthy, and use that memory before touching tools.
145
+
81
146
  ## Use Audrey From Code
82
147
 
83
148
  ### JavaScript
@@ -112,119 +177,153 @@ pip install audrey-memory
112
177
  ```python
113
178
  from audrey_memory import Audrey
114
179
 
115
- brain = Audrey(
116
- base_url="http://127.0.0.1:3487",
117
- api_key="secret",
118
- agent="support-agent",
119
- )
120
-
121
- memory_id = brain.encode(
122
- "Stripe returns HTTP 429 above 100 req/s",
123
- source="direct-observation",
124
- )
180
+ brain = Audrey(base_url="http://127.0.0.1:7437", agent="support-agent")
181
+ memory_id = brain.encode("Stripe returns HTTP 429 above 100 req/s", source="direct-observation")
125
182
  results = brain.recall("stripe rate limit", limit=5)
126
183
  brain.close()
127
184
  ```
128
185
 
129
- ## Key Commands
186
+ ## Production Readiness
130
187
 
131
- ```bash
132
- # Setup
133
- npx audrey init
134
- npx audrey init hosted-fast
135
- npx audrey init ci-mock
136
- npx audrey init sidecar-prod
188
+ Audrey is close to a 1.0-ready local memory runtime, but production readiness depends on how it is embedded. Treat it like stateful infrastructure.
137
189
 
138
- # Claude Code integration
139
- npx audrey install
140
- npx audrey hooks install
141
- npx audrey hooks uninstall
142
- npx audrey uninstall
190
+ Release gates used for this package:
143
191
 
144
- # Health and maintenance
192
+ ```bash
193
+ npm run release:gate
145
194
  npx audrey doctor
146
- npx audrey status
147
- npx audrey dream
148
- npx audrey reembed
195
+ npx audrey demo
196
+ ```
149
197
 
150
- # Versioning
151
- npx audrey snapshot
152
- npx audrey restore backup.json --force
198
+ Recommended runtime checks:
153
199
 
154
- # Sidecar
155
- npx audrey serve
156
- docker compose up -d --build
200
+ ```bash
201
+ npx audrey doctor --json
202
+ npx audrey status --json --fail-on-unhealthy
203
+ npx audrey install --host codex --dry-run
157
204
  ```
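In CI, the runtime checks above can be reduced to an exit-code gate. This is a minimal Python sketch, assuming that `--fail-on-unhealthy` makes `audrey status` exit non-zero when the store is unhealthy (implied by the flag name, not confirmed here). The JSON output is passed through unparsed because its schema is not described in this README; `gate` is a hypothetical helper name.

```python
import subprocess

# Command taken verbatim from the recommended runtime checks.
STATUS_CMD = ["npx", "audrey", "status", "--json", "--fail-on-unhealthy"]


def gate(cmd=STATUS_CMD, runner=subprocess.run):
    """Run the status command and return (healthy, raw_stdout).

    healthy is True only when the process exits with code 0.
    """
    result = runner(cmd, capture_output=True, text=True)
    return result.returncode == 0, result.stdout
```

A CI step could call `healthy, report = gate()` and fail the job when `healthy` is `False`, attaching `report` as a build artifact.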
158
205
 
206
+ Production controls you still own:
207
+
208
+ - Set one `AUDREY_DATA_DIR` per tenant, environment, or isolation boundary.
209
+ - Pin `AUDREY_EMBEDDING_PROVIDER` and `AUDREY_LLM_PROVIDER` explicitly.
210
+ - Back up the SQLite data directory before provider or dimension changes.
211
+ - Keep API keys and raw credentials out of encoded memory content.
212
+ - Use `AUDREY_API_KEY` if the REST sidecar is reachable beyond the local process boundary.
213
+ - Run `npx audrey dream` on a schedule so consolidation and decay stay current.
214
+ - Add application-level encryption, retention, access control, and audit logging for regulated environments.
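The first two controls in the list above (one data directory per isolation boundary, explicitly pinned providers) can be captured in a small launcher helper. The following is a sketch, not a supported deployment tool: the environment variable names come from this README, while `sidecar_env`, the directory layout, and the `<tenant>-agent` naming are assumptions made for illustration.

```python
import os
from pathlib import Path


def sidecar_env(tenant, port, base_dir, embedding_provider="local",
                llm_provider=None, api_key=None, parent_env=None):
    """Return an environment for one `npx audrey serve` process per tenant."""
    env = dict(os.environ if parent_env is None else parent_env)
    env["AUDREY_DATA_DIR"] = str(Path(base_dir) / tenant)  # one store per tenant
    env["AUDREY_AGENT"] = f"{tenant}-agent"                # hypothetical naming scheme
    env["AUDREY_PORT"] = str(port)
    env["AUDREY_EMBEDDING_PROVIDER"] = embedding_provider  # pinned explicitly
    if llm_provider:
        env["AUDREY_LLM_PROVIDER"] = llm_provider          # pinned explicitly
    if api_key:
        env["AUDREY_API_KEY"] = api_key
    return env
```

The result can be passed to `subprocess.Popen(["npx", "audrey", "serve"], env=env)`, one process per tenant, so that no two tenants share a SQLite store.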
215
+
216
+ ## Environment Variables
217
+
218
+ | Variable | Default | Purpose |
219
+ |---|---|---|
220
+ | `AUDREY_DATA_DIR` | `~/.audrey/data` | SQLite memory store path. Use one per tenant or agent identity for isolation. |
221
+ | `AUDREY_AGENT` | `local-agent` | Logical agent identity stamped on writes. |
222
+ | `AUDREY_EMBEDDING_PROVIDER` | `local` | `local`, `gemini`, `openai`, or `mock`. Cloud providers require explicit opt-in. |
223
+ | `AUDREY_LLM_PROVIDER` | auto | `anthropic`, `openai`, or `mock`. |
224
+ | `AUDREY_DEVICE` | `gpu` | Local embedding device (`gpu` or `cpu`). Falls back to CPU if GPU init fails. |
225
+ | `AUDREY_PORT` | `7437` | REST sidecar port. |
226
+ | `AUDREY_HOST` | `127.0.0.1` | REST sidecar bind address. Set to `0.0.0.0` only with `AUDREY_API_KEY`. |
227
+ | `AUDREY_API_KEY` | unset | Bearer token required for non-loopback REST traffic. |
228
+ | `AUDREY_ALLOW_NO_AUTH` | `0` | Set to `1` to allow non-loopback bind without an API key. Don't. |
229
+ | `AUDREY_ENABLE_ADMIN_TOOLS` | `0` | Set to `1` to enable export, import, and forget routes/tools. Disabled by default. |
230
+ | `AUDREY_PROMOTE_ROOTS` | unset | Colon/semicolon-separated extra roots for `audrey promote --yes` writes. By default writes are restricted to `process.cwd()`. |
231
+ | `AUDREY_DEBUG` | `0` | Set to `1` to print MCP info logs (server started, warmup completed). Errors always log. |
232
+ | `AUDREY_PROFILE` | `0` | Set to `1` to emit per-stage timings via MCP `_meta.diagnostics`. |
233
+ | `AUDREY_DISABLE_WARMUP` | `0` | Set to `1` to skip background embedding warmup at MCP boot. |
234
+ | `AUDREY_ONNX_VERBOSE` | `0` | Set to `1` to restore ONNX runtime EP-assignment warnings (suppressed by default). |
235
+ | `AUDREY_PRAGMA_DEFAULTS` | `1` | Set to `0` to revert SQLite PRAGMA tuning to better-sqlite3 defaults. |
236
+ | `AUDREY_CONTEXT_BUDGET_CHARS` | `4000` | Default Memory Capsule character budget. |
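The interaction between `AUDREY_HOST`, `AUDREY_API_KEY`, and `AUDREY_ALLOW_NO_AUTH` in the table above can be checked before launch. The sketch below restates the documented rule as a pre-flight check in Python. It is not Audrey's own startup code, and the `ok`/`unsafe`/`refuse` labels are hypothetical.

```python
import ipaddress


def is_loopback(host):
    """True for `localhost` and loopback IP literals such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # other hostnames are treated as non-loopback


def check_bind(env):
    """Classify a sidecar environment as 'ok', 'unsafe', or 'refuse'."""
    host = env.get("AUDREY_HOST", "127.0.0.1")  # documented default
    if is_loopback(host):
        return "ok"
    if env.get("AUDREY_API_KEY"):
        return "ok"      # non-loopback bind protected by a token
    if env.get("AUDREY_ALLOW_NO_AUTH", "0") == "1":
        return "unsafe"  # explicitly allowed, but unauthenticated
    return "refuse"      # non-loopback bind with no key and no override
```

A deploy script could call `check_bind(os.environ)` and abort on anything other than `ok`.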
237
+
159
238
  ## Benchmarks
160
239
 
161
- Audrey ships with a benchmark harness and release gate:
240
+ Audrey ships two benchmark commands.
241
+
242
+ ### Performance snapshot
243
+
244
+ `npm run bench:perf-snapshot` measures encode and hybrid recall latency at multiple corpus sizes against the in-process mock provider. It reports p50/p95/p99 plus machine provenance so the numbers are reproducible and honest about what they cover.
162
245
 
163
246
  ```bash
164
- npm run bench:memory
165
- npm run bench:memory:check
247
+ npm run build
248
+ npm run bench:perf-snapshot # default sizes 100, 1000, 5000
249
+ node benchmarks/perf-snapshot.js --sizes 1000,10000 --json # custom shape
166
250
  ```
167
251
 
168
- The benchmark suite measures:
169
-
170
- - retrieval behavior
171
- - update and overwrite behavior
172
- - delete and abstain behavior
173
- - semantic and procedural merge behavior
252
+ Sample output from `benchmarks/snapshots/perf-0.22.2.json` (12-core/24-thread Ryzen 9 7900X3D, Node 25.5.0, mock 64-dim embedding, hybrid recall, limit 5):
174
253
 
175
- Current repo snapshot:
254
+ | Corpus size | Encode p50 (ms) | Encode p95 (ms) | Recall p50 (ms) | Recall p95 (ms) | Recall p99 (ms) |
255
+ |---|---|---|---|---|---|
256
+ | 100 | 0.33 | 0.59 | 0.54 | 1.82 | 2.71 |
257
+ | 1,000 | 0.31 | 2.15 | 1.57 | 2.36 | 21.18 |
258
+ | 5,000 | 0.31 | 1.84 | 2.09 | 3.42 | 16.58 |
176
259
 
177
- ![Audrey local benchmark](docs/assets/benchmarks/local-benchmark.svg)
260
+ These numbers cover Audrey's own pipeline (SQLite + sqlite-vec + hybrid ranking) and exclude embedding-provider cost. Real-world recall p95 with a local 384-dim provider is typically 5-15x higher; with a hosted provider it is dominated by the API round-trip. Run on your own hardware before quoting numbers anywhere.
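To reproduce p50/p95/p99 figures for your own workload, latency samples can be summarized with a nearest-rank percentile. This is a generic Python sketch. The method used by `benchmarks/perf-snapshot.js` is not stated in this README and may differ (for example, it may interpolate), and `percentile`, `time_calls`, and `summarize` are hypothetical names.

```python
import time


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, clamped to at least rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def time_calls(fn, runs):
    """Call fn() `runs` times and return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Return the p50, p95, and p99 of a list of latencies."""
    return {f"p{p}": percentile(latencies, p) for p in (50, 95, 99)}
```

With few samples, p99 is driven by one or two slow calls, which is one plausible reason a p99 can sit far above the p95 in a small run.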
178
261
 
179
- For detailed methodology, published comparison anchors, and generated reports, see [docs/benchmarking.md](docs/benchmarking.md).
262
+ ### Behavioral regression suite
180
263
 
181
- ## Production
264
+ `npm run bench:memory:check` is a release gate. It runs a small set of retrieval and lifecycle scenarios (information extraction, knowledge updates, multi-session reasoning, conflict resolution, privacy boundary, overwrite, delete-and-abstain, semantic/procedural merge) against Audrey and three weak baselines (vector-only, keyword+recency, recent-window) and asserts Audrey doesn't regress. The baseline comparisons exist to catch correctness regressions in retrieval logic, not to make marketing claims.
182
265
 
183
- Audrey is strongest in workflows where memory must stay local, reviewable, and durable. It already fits well as a sidecar for internal agents in operational domains like financial services and healthcare operations, but it is a memory layer, not a compliance boundary.
266
+ ```bash
267
+ npm run bench:memory # full regression suite (writes JSON + report)
268
+ npm run bench:memory:check # release gate, exits non-zero on regression
269
+ ```
184
270
 
185
- Production guide: [docs/production-readiness.md](docs/production-readiness.md)
271
+ ## Command Reference
186
272
 
187
- Examples:
273
+ ```bash
274
+ # First contact
275
+ npx audrey doctor
276
+ npx audrey demo
188
277
 
189
- - [examples/fintech-ops-demo.js](examples/fintech-ops-demo.js)
190
- - [examples/healthcare-ops-demo.js](examples/healthcare-ops-demo.js)
191
- - [examples/stripe-demo.js](examples/stripe-demo.js)
278
+ # MCP setup
279
+ npx audrey install --host codex --dry-run
280
+ npx audrey mcp-config codex
281
+ npx audrey mcp-config generic
282
+ npx audrey install
283
+ npx audrey uninstall
192
284
 
193
- ## Environment
285
+ # Health and maintenance
286
+ npx audrey status
287
+ npx audrey status --json --fail-on-unhealthy
288
+ npx audrey dream
289
+ npx audrey reembed
194
290
 
195
- Starter config:
291
+ # Closed-loop visibility
292
+ npx audrey impact
293
+ npx audrey impact --json --window 7 --limit 5
196
294
 
197
- - [.env.example](.env.example)
198
- - [.env.docker.example](.env.docker.example)
295
+ # Tool-trace learning
296
+ npx audrey observe-tool --event PostToolUse --tool Bash --outcome failed
297
+ npx audrey promote --dry-run
199
298
 
200
- Key environment variables:
299
+ # REST sidecar
300
+ npx audrey serve
301
+ cp .env.docker.example .env
302
+ # edit AUDREY_API_KEY in .env
303
+ docker compose up -d --build
304
+ ```
201
305
 
202
- - `AUDREY_DATA_DIR`
203
- - `AUDREY_EMBEDDING_PROVIDER`
204
- - `AUDREY_LLM_PROVIDER`
205
- - `AUDREY_DEVICE`
206
- - `AUDREY_API_KEY`
207
- - `AUDREY_HOST`
208
- - `AUDREY_PORT`
306
+ The Node sidecar defaults to `127.0.0.1:7437`. The Docker image intentionally binds inside the container on `3487`, so Compose requires `AUDREY_API_KEY` in `.env` before startup. Override the published host port with `AUDREY_PUBLISHED_PORT` when using Compose.
209
307
 
210
308
  ## Documentation
211
309
 
212
- - [docs/benchmarking.md](docs/benchmarking.md)
213
- - [docs/production-readiness.md](docs/production-readiness.md)
214
- - [CONTRIBUTING.md](CONTRIBUTING.md)
215
- - [SECURITY.md](SECURITY.md)
310
+ - [Security policy](SECURITY.md)
311
+ - [Audrey paper outline](docs/AUDREY_PAPER_OUTLINE.md)
312
+ - Public setup, runtime, benchmark, and command guidance is maintained in this README.
216
313
 
217
314
  ## Development
218
315
 
219
316
  ```bash
220
317
  npm ci
221
- npm test
222
- npm run bench:memory:check
223
- npm run pack:check
318
+ npm run release:gate
224
319
  python -m unittest discover -s python/tests -v
225
320
  python -m build --no-isolation python
226
321
  ```
227
322
 
323
+ `npm test` uses a repo-local Vitest launcher so locked-down Windows temp
324
+ directories do not block test startup. `npm run release:gate:sandbox` remains
325
+ available for hosts that block child-process spawning entirely.
326
+
228
327
  ## License
229
328
 
230
329
  MIT. See [LICENSE](LICENSE).
package/SECURITY.md ADDED
@@ -0,0 +1,29 @@
1
+ # Security Policy
2
+
3
+ ## Supported Versions
4
+
5
+ Security fixes are best-effort for the current published release line and the current default branch.
6
+
7
+ | Version | Supported |
8
+ |---|---|
9
+ | `0.22.x` | Yes |
10
+ | `< 0.22.0` | No |
11
+
12
+ ## Reporting a Vulnerability
13
+
14
+ Do not open a public GitHub issue for a security vulnerability.
15
+
16
+ Report vulnerabilities through this channel:
17
+
18
+ - GitHub Security Advisories for this repository
19
+
20
+ Include:
21
+
22
+ - affected version
23
+ - reproduction steps or proof of concept
24
+ - impact description
25
+ - suggested mitigation, if you have one
26
+
27
+ ## Scope Notes
28
+
29
+ Audrey is a memory layer. Security posture also depends on the host application, deployment environment, provider configuration, access controls, and data-handling rules around it.