hippo-memory 1.7.4 → 1.7.6

package/README.md CHANGED
@@ -85,6 +85,20 @@ hippo recall "data pipeline issues" --budget 2000
 
  ---
 
+ ### What's new in v1.7.6
+
+ - **Fresh-tail pinned context injection.** `hippo context --pinned-only --include-recent <n>` now includes the last N writes regardless of pinning, so memories saved mid-session can appear in the next Claude Code `UserPromptSubmit` injection before they are explicitly pinned. New Claude hook installs use `--include-recent 5`; legacy pinned-only hooks are migrated on `hippo hook install`.
+ - **Calibration sweep on the sequential-learning benchmark.** Adds `--budget` plumbing through the runner + a calibration script (`calibrate.mjs`) with a mechanical B* selection rule. Used to test "would smaller budget recover headroom for the goal-stack hypothesis?" on the v1.7.5 floor.
+ - **Calibration verdict: budget reduction does not produce a discriminating workload.** 5 budgets × 10 seeds = 50 single-seed runs all returned 0% late-phase trap rate. Floor effect is structural, not budget-tunable. B\* = NULL. Per pre-registered escalation, v1.7.7 will sweep `--restrict-late-to last-4` instead.
+ - **Bug-fix on `calibrate.mjs` starvation guard.** Read a non-existent JSON field; false-positive `starved=true` on every candidate. Did not affect the verdict (lateMean=0% was load-bearing). Fix: drop the broken extraction.
+ - **Hypothesis still untested.** The −10pp goal-stack lift claim remains unsupported by a discriminating workload. Mechanism still shipped from v1.7.4. Honest reporting: see `docs/evals/2026-05-09-v1.7.6-calibration-result.md`.
+
+ ### What's new in v1.7.5
+
+ - **Sequential-learning benchmark gains `pushGoal`/`completeGoal` hooks** + a multi-seed eval harness with seeded category-to-slot variance, exact paired permutation CI, and `--eval-strict` mode. The dlPFC goal-stack mechanism is now exercisable on the public benchmark.
+ - **Tag-fix on memory store** so the goal-stack boost can actually match. Pre-fix the boost would have matched zero memories.
+ - **Eval ran but stopped per pre-registered sanity gate.** Both hippo-base and hippo+goal-stack hit 0% late-phase trap rate across 20 seeds — floor effect prevents H1/H0 discrimination. The −10pp hypothesis remains untested on a discriminating workload. Mechanism shipped, hypothesis open. Pre-reg + result in `docs/evals/`.
+
  ### What's new in v1.7.4
 
  - **Goal-stack boost on MCP + HTTP.** Set `RecallOpts.sessionId` (or HTTP `?session_id=...`, or MCP `hippo_recall { session_id }`) and the dlPFC goal-stack boost — previously CLI-only — applies on MCP and HTTP too. Both `api.recall` (primary BM25 band, before fresh-tail / summary appendix) AND MCP's separate `physicsSearch`/`hybridSearch` path are boosted. New `RecallOpts.goalTag` lets callers opt out per-call.
@@ -797,7 +811,7 @@ This adds a `<!-- hippo:start -->` ... `<!-- hippo:end -->` block that tells the
  For Claude Code, it also adds:
  - a `SessionEnd` hook so `hippo sleep` runs automatically when the session exits
  - a `SessionStart` hook that prints the previous session's consolidation output
- - a `UserPromptSubmit` hook that re-injects pinned memories (`hippo remember <text> --pin`) into every turn's context so invariants survive long sessions where Opus 4.7 might otherwise "forget" them. Budget: 500 tokens per turn, skipped entirely when no pinned memories exist. Opt out with `{"pinnedInject":{"enabled":false}}` in `.hippo/config.json`.
+ - a `UserPromptSubmit` hook that runs `hippo context --pinned-only --include-recent 5 --format additional-context` every turn. It re-injects pinned memories (`hippo remember <text> --pin`) plus the last 5 writes, so fresh same-session lessons appear on the next prompt before you pin them. Opt out with `{"pinnedInject":{"enabled":false}}` in `.hippo/config.json`.
 
  To remove: `hippo hook uninstall claude-code`
 
package/dist/cli.js CHANGED
@@ -79,6 +79,12 @@ function parseLimitFlag(value) {
      const parsed = parseInt(String(value), 10);
      return Number.isFinite(parsed) && parsed >= 1 ? parsed : Infinity;
  }
+ function parseCountFlag(value) {
+     if (!value || value === true || Array.isArray(value))
+         return 0;
+     const parsed = parseInt(String(value), 10);
+     return Number.isFinite(parsed) && parsed >= 1 ? parsed : 0;
+ }
  /**
   * Emit an audit event against `hippoRoot`'s db. Opens its own short-lived
   * connection so callers don't have to thread a db handle. Swallows all errors
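
The new `parseCountFlag` helper falls back to `0` for anything that is not a positive integer, whereas the visible return line of `parseLimitFlag` falls back to `Infinity`. An absent or malformed `--include-recent` therefore turns the fresh-tail pass off rather than including every entry. The sketch below copies the function body from the hunk above and runs it on a few sample inputs; the sample values are illustrative assumptions about what a flag parser might pass in, not taken from the package.

```javascript
// Function body copied from the hunk above; only the sample inputs are illustrative.
function parseCountFlag(value) {
    if (!value || value === true || Array.isArray(value))
        return 0;
    const parsed = parseInt(String(value), 10);
    return Number.isFinite(parsed) && parsed >= 1 ? parsed : 0;
}

// Hypothetical flag values: missing, bare boolean flag, valid count,
// zero, negative, non-numeric, fractional, and a repeated flag (array).
const samples = [undefined, true, '5', '0', '-2', 'abc', '3.9', ['1', '2']];
const results = samples.map((value) => parseCountFlag(value));

console.log(results); // [ 0, 0, 5, 0, 0, 0, 3, 0 ]
```

Note that `'3.9'` yields `3` because `parseInt` truncates at the first non-digit character, and that a flag passed without a value (`true`) is treated the same as a missing flag.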
@@ -2823,6 +2829,7 @@ async function cmdContext(hippoRoot, args, flags) {
      }
      const budget = parseInt(String(flags['budget'] ?? '1500'), 10);
      const limit = parseLimitFlag(flags['limit']);
+     const includeRecent = parseCountFlag(flags['include-recent']);
      const ctxExplicitScope = flags['scope'] !== undefined ? String(flags['scope']).trim() : null;
      const ctxActiveScope = ctxExplicitScope || detectScope();
      // If budget is 0, skip entirely (zero token cost)
@@ -2874,11 +2881,39 @@ async function cmdContext(hippoRoot, args, flags) {
              return; // user disabled via config
          // Effective budget: explicit --budget wins over config.
          const effBudget = flags['budget'] !== undefined ? budget : pinnedCfg.pinnedInject.budget;
+         const nowP = new Date();
+         const selectedIds = new Set();
+         let usedP = 0;
+         if (includeRecent > 0) {
+             const recent = [
+                 ...localEntries.map((entry) => ({ entry, isGlobal: false })),
+                 ...globalEntries.map((entry) => ({ entry, isGlobal: true })),
+             ]
+                 .sort((a, b) => {
+                     const byCreated = Date.parse(b.entry.created) - Date.parse(a.entry.created);
+                     return byCreated !== 0 ? byCreated : b.entry.id.localeCompare(a.entry.id);
+                 })
+                 .slice(0, includeRecent)
+                 .map(({ entry, isGlobal }) => ({
+                     entry,
+                     score: calculateStrength(entry, nowP) * (isGlobal ? 1 / 1.2 : 1),
+                     tokens: estimateTokens(entry.content),
+                     isGlobal,
+                 }));
+             for (const r of recent) {
+                 if (selectedIds.has(r.entry.id))
+                     continue;
+                 if (usedP + r.tokens > effBudget)
+                     continue;
+                 selectedItems.push(r);
+                 selectedIds.add(r.entry.id);
+                 usedP += r.tokens;
+             }
+         }
          const pinnedLocal = localEntries.filter((e) => e.pinned);
          const pinnedGlobal = globalEntries.filter((e) => e.pinned);
-         if (pinnedLocal.length === 0 && pinnedGlobal.length === 0)
+         if (pinnedLocal.length === 0 && pinnedGlobal.length === 0 && selectedItems.length === 0)
              return; // zero output
-         const nowP = new Date();
          const rankedPinned = [
              ...pinnedLocal.map((e) => ({ entry: e, isGlobal: false })),
              ...pinnedGlobal.map((e) => ({ entry: e, isGlobal: true })),
@@ -2894,11 +2929,13 @@ async function cmdContext(hippoRoot, args, flags) {
                  };
              })
              .sort((a, b) => b.score - a.score);
-         let usedP = 0;
          for (const r of rankedPinned) {
+             if (selectedIds.has(r.entry.id))
+                 continue;
              if (usedP + r.tokens > effBudget)
                  continue;
              selectedItems.push(r);
+             selectedIds.add(r.entry.id);
              usedP += r.tokens;
          }
          totalTokens = usedP;
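
Taken together, the two `cmdContext` hunks above appear to implement a two-pass selection over one shared token budget: the newest `--include-recent` writes are admitted first, then pinned entries in descending score order, with `selectedIds` preventing an entry from being emitted twice. The following is a simplified standalone sketch of that flow, not the package's code. `selectForInjection`, the sample entries, and the stand-in `calculateStrength` and `estimateTokens` are hypothetical, and the local/global split with its `1 / 1.2` down-weighting is omitted for brevity.

```javascript
// Hypothetical stand-ins: the real calculateStrength and estimateTokens live
// elsewhere in cli.js and are not shown in this diff.
const calculateStrength = (entry) => entry.strength;
const estimateTokens = (text) => Math.ceil(text.length / 4);

function selectForInjection(entries, includeRecent, budget) {
    const selected = [];
    const selectedIds = new Set();
    let used = 0;

    // Shared admission rule: skip duplicates and anything that would overflow the budget.
    const admit = (entry) => {
        const tokens = estimateTokens(entry.content);
        if (selectedIds.has(entry.id) || used + tokens > budget) return;
        selected.push(entry.id);
        selectedIds.add(entry.id);
        used += tokens;
    };

    // Pass 1: the newest `includeRecent` writes, pinned or not.
    // Timestamp ties fall back to descending id, as in the diff.
    [...entries]
        .sort((a, b) => {
            const byCreated = Date.parse(b.created) - Date.parse(a.created);
            return byCreated !== 0 ? byCreated : b.id.localeCompare(a.id);
        })
        .slice(0, includeRecent)
        .forEach(admit);

    // Pass 2: pinned entries, strongest first, within whatever budget remains.
    entries
        .filter((e) => e.pinned)
        .sort((a, b) => calculateStrength(b) - calculateStrength(a))
        .forEach(admit);

    return { selected, used };
}

// Four 40-character entries, so each costs 10 tokens under the stand-in estimator.
const entries = [
    { id: 'm1', created: '2026-05-01T10:00:00Z', pinned: true, strength: 0.9, content: 'x'.repeat(40) },
    { id: 'm2', created: '2026-05-09T10:00:00Z', pinned: false, strength: 0.2, content: 'x'.repeat(40) },
    { id: 'm3', created: '2026-05-09T11:00:00Z', pinned: true, strength: 0.5, content: 'x'.repeat(40) },
    { id: 'm4', created: '2026-05-02T10:00:00Z', pinned: true, strength: 0.7, content: 'x'.repeat(40) },
];

const outcome = selectForInjection(entries, 2, 30);
console.log(outcome); // { selected: [ 'm3', 'm2', 'm1' ], used: 30 }
```

In this run the unpinned `m2` is admitted because it is one of the two newest writes, `m3` is not repeated in the pinned pass, and the pinned `m4` is skipped once the budget is exhausted. One consequence worth noting, assuming the sketch mirrors the diff faithfully: because the fresh-tail pass runs first, recent writes can crowd out pinned memories when the budget is tight.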
@@ -4673,6 +4710,7 @@ Commands:
      --auto                 Auto-detect task from git state
      --budget <n>           Token budget (default: 1500)
      --pinned-only          Only inject pinned memories (used by UserPromptSubmit hook)
+     --include-recent <n>   With --pinned-only, also inject the last N writes regardless of pinning
      --format <fmt>         Output format: markdown (default), json, or additional-context (Claude Code hook JSON)
      --framing <mode>       Framing: observe (default), suggest, assert
    sleep                    Run consolidation pass (auto-learns + dedup + auto-shares)