npm - @mastra/memory - Versions diffs - 1.9.0-alpha.2 → 1.9.1-alpha.0 - Mend

@mastra/memory 1.9.0-alpha.2 → 1.9.1-alpha.0

Files changed (22)

package/CHANGELOG.md CHANGED

@@ -1,5 +1,56 @@
 # @mastra/memory
+## 1.9.1-alpha.0
+### Patch Changes
+- Fixed observational memory reflection compression for `google/gemini-2.5-flash` by using stronger compression guidance and starting it at a higher compression level during reflection. `google/gemini-2.5-flash` is unusually good at generating long, faithful outputs. That made reflection retries more likely to preserve too much detail and miss the compression target, wasting tokens in the process. ([#14612](https://github.com/mastra-ai/mastra/pull/14612))
+- Updated dependencies [[`be37de4`](https://github.com/mastra-ai/mastra/commit/be37de4391bd1d5486ce38efacbf00ca51637262), [`f3ce603`](https://github.com/mastra-ai/mastra/commit/f3ce603fd76180f4a5be90b6dc786d389b6b3e98), [`2871451`](https://github.com/mastra-ai/mastra/commit/2871451703829aefa06c4a5d6eca7fd3731222ef), [`d3930ea`](https://github.com/mastra-ai/mastra/commit/d3930eac51c30b0ecf7eaa54bb9430758b399777)]:
+  - @mastra/core@1.16.0-alpha.2
+  - @mastra/schema-compat@1.2.7-alpha.0
+## 1.9.0
+### Minor Changes
+- Added experimental retrieval-mode recall tooling for observational memory. ([#14437](https://github.com/mastra-ai/mastra/pull/14437))
+  When `observationalMemory.retrieval` is enabled with `scope: 'thread'`, observation groups store colon-delimited message ranges (`startId:endId`) pointing back to the raw messages they were derived from. A `recall` tool is registered that lets agents retrieve those source messages via cursor-based pagination.
+  The recall tool supports:
+  - **Detail levels**: `detail: 'low'` (default) returns truncated text with part indices; `detail: 'high'` returns full content clamped to one part per call with continuation hints
+  - **Part-level fetch**: `partIndex` targets a single message part at full detail
+  - **Pagination flags**: `hasNextPage` and `hasPrevPage` in results
+  - **Token limiting**: results are capped at a token budget with `truncated` and `tokenOffset` reporting
+  - **Smart range detection**: passing a range as a cursor returns a helpful hint explaining how to extract individual IDs
+- Added opt-in Observational Memory thread titles. ([#14436](https://github.com/mastra-ai/mastra/pull/14436))
+  When enabled, the Observer suggests a short thread title and updates it as the conversation topic changes. Harness consumers can detect these updates via the new `om_thread_title_updated` event.
+  **Example**
+  ```ts
+  const memory = new Memory({
+    options: {
+      observationalMemory: {
+        observation: {
+          threadTitle: true,
+        },
+      },
+    },
+  });
+  ```
+### Patch Changes
+- Improved observational memory so completed tasks and answered questions are explicitly tracked and retained, reducing repeated follow-up on resolved topics. ([#14419](https://github.com/mastra-ai/mastra/pull/14419))
+- Updated dependencies [[`cb611a1`](https://github.com/mastra-ai/mastra/commit/cb611a1e89a4f4cf74c97b57e0c27bb56f2eceb5), [`da93115`](https://github.com/mastra-ai/mastra/commit/da931155c1a9bc63d455d3d86b4ec984db5991fe), [`b71bce1`](https://github.com/mastra-ai/mastra/commit/b71bce144912ed33f76c52a94e594988a649c3e1), [`62d1d3c`](https://github.com/mastra-ai/mastra/commit/62d1d3cc08fe8182e7080237fd975de862ec8c91), [`9e1a3ed`](https://github.com/mastra-ai/mastra/commit/9e1a3ed07cfafb5e8e19a796ce0bee817002d7c0), [`8681ecb`](https://github.com/mastra-ai/mastra/commit/8681ecb86184d5907267000e4576cc442a9a83fc), [`28d0249`](https://github.com/mastra-ai/mastra/commit/28d0249295782277040ad1e0d243e695b7ab1ce4), [`cd7b568`](https://github.com/mastra-ai/mastra/commit/cd7b568fe427b1b4838abe744fa5367a47539db3), [`681ee1c`](https://github.com/mastra-ai/mastra/commit/681ee1c811359efd1b8bebc4bce35b9bb7b14bec), [`bb0f09d`](https://github.com/mastra-ai/mastra/commit/bb0f09dbac58401b36069f483acf5673202db5b5), [`a579f7a`](https://github.com/mastra-ai/mastra/commit/a579f7a31e582674862b5679bc79af7ccf7429b8), [`5f7e9d0`](https://github.com/mastra-ai/mastra/commit/5f7e9d0db664020e1f3d97d7d18c6b0b9d4843d0), [`d7f14c3`](https://github.com/mastra-ai/mastra/commit/d7f14c3285cd253ecdd5f58139b7b6cbdf3678b5), [`0efe12a`](https://github.com/mastra-ai/mastra/commit/0efe12a5f008a939a1aac71699486ba40138054e)]:
+  - @mastra/core@1.15.0
+  - @mastra/schema-compat@1.2.6
 ## 1.9.0-alpha.2
 ### Minor Changes
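
The 1.9.0 entry above says observation groups store colon-delimited message ranges (`startId:endId`) and that passing a whole range as a cursor yields a hint about extracting individual IDs. The diff does not show how the package parses ranges or words that hint, so the following is only a minimal sketch of the idea. `parseMessageRange` and `cursorHint` are hypothetical names, and the sketch assumes message IDs never contain a colon.

```typescript
// Hypothetical helpers for illustration; not part of @mastra/memory.
type MessageRange = { startId: string; endId: string };

// Split "startId:endId" into its two IDs.
// Assumption: IDs themselves contain no ":" characters.
function parseMessageRange(value: string): MessageRange | null {
  const idx = value.indexOf(":");
  const hasSecondColon = idx !== -1 && value.indexOf(":", idx + 1) !== -1;
  if (idx <= 0 || idx === value.length - 1 || hasSecondColon) {
    return null; // not a well-formed two-part range
  }
  return { startId: value.slice(0, idx), endId: value.slice(idx + 1) };
}

// If a caller passes a whole range where a single-ID cursor is expected,
// return a hint naming the two IDs; otherwise return null.
// The wording here is invented, the real hint text is not shown in the diff.
function cursorHint(cursor: string): string | null {
  const range = parseMessageRange(cursor);
  if (range === null) return null;
  return `"${cursor}" looks like a range; use "${range.startId}" or "${range.endId}" as the cursor.`;
}
```

A plain message ID passes through with no hint, while a value such as `msg-1:msg-9` is recognized as a range and split into its two endpoints.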

package/dist/{chunk-LVV2RT42.cjs → chunk-CNOHXG5O.cjs} RENAMED

@@ -1736,6 +1736,7 @@ User messages are extremely important. If the user asks a question or gives a ne
 ${instruction}` : ""}`;
 }
+var MAX_COMPRESSION_LEVEL = 4;
 var COMPRESSION_GUIDANCE = {
   0: "",
   1: `
@@ -1748,11 +1749,11 @@ Please re-process with slightly more compression:
 - Closer to the end, retain more fine details (recent context matters more)
 - Memory is getting long - use a more condensed style throughout
 - Combine related items more aggressively but do not lose important specific details of names, places, events, and people
+- Combine repeated similar tool calls (e.g. multiple file views, searches, or edits in the same area) into a single summary line describing what was explored/changed and the outcome
 - Preserve \u2705 completion markers \u2014 they are memory signals that tell the assistant what is already resolved and help prevent repeated work
 - Preserve the concrete resolved outcome captured by \u2705 markers so the assistant knows what exactly is done
-- For example if there is a long nested observation list about repeated tool calls, you can combine those into a single line and observe that the tool was called multiple times for x reason, and finally y outcome happened.
-Your current detail level was a 10/10, lets aim for a 8/10 detail level.
+Aim for a 8/10 detail level.
 `,
   2: `
 ## AGGRESSIVE COMPRESSION REQUIRED
@@ -1764,12 +1765,13 @@ Please re-process with much more aggressive compression:
 - Closer to the end, retain fine details (recent context matters more)
 - Memory is getting very long - use a significantly more condensed style throughout
 - Combine related items aggressively but do not lose important specific details of names, places, events, and people
+- Combine repeated similar tool calls (e.g. multiple file views, searches, or edits in the same area) into a single summary line describing what was explored/changed and the outcome
+- If the same file or module is mentioned across many observations, merge into one entry covering the full arc
 - Preserve \u2705 completion markers \u2014 they are memory signals that tell the assistant what is already resolved and help prevent repeated work
 - Preserve the concrete resolved outcome captured by \u2705 markers so the assistant knows what exactly is done
-- For example if there is a long nested observation list about repeated tool calls, you can combine those into a single line and observe that the tool was called multiple times for x reason, and finally y outcome happened.
 - Remove redundant information and merge overlapping observations
-Your current detail level was a 10/10, lets aim for a 6/10 detail level.
+Aim for a 6/10 detail level.
 `,
   3: `
 ## CRITICAL COMPRESSION REQUIRED
@@ -1780,13 +1782,32 @@ Please re-process with maximum compression:
 - Summarize the oldest observations (first 50-70%) into brief high-level paragraphs \u2014 only key facts, decisions, and outcomes
 - For the most recent observations (last 30-50%), retain important details but still use a condensed style
 - Ruthlessly merge related observations \u2014 if 10 observations are about the same topic, combine into 1-2 lines
+- Combine all tool call sequences (file views, searches, edits, builds) into outcome-only summaries \u2014 drop individual steps entirely
 - Drop procedural details (tool calls, retries, intermediate steps) \u2014 keep only final outcomes
 - Drop observations that are no longer relevant or have been superseded by newer information
 - Preserve \u2705 completion markers \u2014 they are memory signals that tell the assistant what is already resolved and help prevent repeated work
 - Preserve the concrete resolved outcome captured by \u2705 markers so the assistant knows what exactly is done
 - Preserve: names, dates, decisions, errors, user preferences, and architectural choices
-Your current detail level was a 10/10, lets aim for a 4/10 detail level.
+Aim for a 4/10 detail level.
+`,
+  4: `
+## EXTREME COMPRESSION REQUIRED
+Multiple compression attempts have failed. The content may already be dense from a prior reflection.
+You MUST dramatically reduce the number of observations while keeping the standard observation format (date groups with bullet points and priority emojis):
+- Tool call observations are the biggest source of bloat. Collapse ALL tool call sequences into outcome-only observations \u2014 e.g. 10 observations about viewing/searching/editing files become 1 observation about what was actually learned or achieved (e.g. "Investigated auth module and found token validation was skipping expiry check")
+- Never preserve individual tool calls (viewed file X, searched for Y, ran build) \u2014 only preserve what was discovered or accomplished
+- Consolidate many related observations into single, more generic observations
+- Merge all same-day date groups into at most 2-3 date groups per day
+- For older content, each topic or task should be at most 1-2 observations capturing the key outcome
+- For recent content, retain more detail but still merge related items aggressively
+- If multiple observations describe incremental progress on the same task, keep only the final state
+- Preserve \u2705 completion markers and their outcomes but merge related completions into fewer lines
+- Preserve: user preferences, key decisions, architectural choices, and unresolved issues
+Aim for a 2/10 detail level. Fewer, more generic observations are better than many specific ones that exceed the budget.
 `
 };
 function buildReflectorPrompt(observations, manualPrompt, compressionLevel, skipContinuationHints) {
@@ -3912,6 +3933,22 @@ Async buffering is enabled by default \u2014 this opt-out is only needed when us
       modelId: resolved.modelId
     };
   }
+  /**
+   * Get the default compression start level based on model behavior.
+   * gemini-2.5-flash is a faithful transcriber that needs explicit pressure to compress effectively.
+   */
+  async getCompressionStartLevel(requestContext) {
+    try {
+      const resolved = await this.resolveModelContext(this.reflectionConfig.model, requestContext);
+      const modelId = resolved?.modelId ?? "";
+      if (modelId.includes("gemini-2.5-flash")) {
+        return 2;
+      }
+      return 1;
+    } catch {
+      return 1;
+    }
+  }
   getRuntimeModelContext(model) {
     if (!model?.modelId) {
       return void 0;
@@ -4709,8 +4746,9 @@ ${unreflectedContent}` : bufferedReflection;
     const originalTokens = this.tokenCounter.countObservations(observations);
     const targetThreshold = observationTokensThreshold ?? getMaxThreshold(this.reflectionConfig.observationTokens);
     let totalUsage = { inputTokens: 0, outputTokens: 0, totalTokens: 0 };
-    let currentLevel = compressionStartLevel ?? 0;
-    const maxLevel = 3;
+    const startLevel = compressionStartLevel ?? 0;
+    let currentLevel = startLevel;
+    const maxLevel = Math.min(MAX_COMPRESSION_LEVEL, startLevel + 3);
     let parsed = { observations: "", suggestedContinuation: void 0 };
     let reflectedTokens = 0;
     let attemptNumber = 0;
@@ -4781,13 +4819,17 @@ ${unreflectedContent}` : bufferedReflection;
         omDebug(`[OM:callReflector] degenerate output persists at maxLevel=${maxLevel}, breaking`);
         break;
       }
+      if (currentLevel >= maxLevel) {
+        break;
+      }
+      const nextLevel = currentLevel + 1;
       if (streamContext?.writer) {
         const failedMarker = createObservationFailedMarker({
           cycleId: streamContext.cycleId,
           operationType: "reflection",
           startedAt: streamContext.startedAt,
           tokensAttempted: originalTokens,
-          error: `Did not compress below threshold (${originalTokens} \u2192 ${reflectedTokens}, target: ${targetThreshold}), retrying at level ${currentLevel + 1}`,
+          error: `Did not compress below threshold (${originalTokens} \u2192 ${reflectedTokens}, target: ${targetThreshold}), retrying at level ${nextLevel}`,
           recordId: streamContext.recordId,
           threadId: streamContext.threadId
         });
@@ -4808,7 +4850,7 @@ ${unreflectedContent}` : bufferedReflection;
         await streamContext.writer.custom(startMarker).catch(() => {
         });
       }
-      currentLevel = Math.min(currentLevel + 1, maxLevel);
+      currentLevel = nextLevel;
     }
     return {
       observations: parsed.observations,
@@ -6709,9 +6751,6 @@ ${bufferedObservations}`;
     omDebug(
       `[OM:reflect] doAsyncBufferedReflection: slicing observations for reflection \u2014 totalLines=${totalLines}, avgTokPerLine=${avgTokensPerLine.toFixed(1)}, activationPointTokens=${activationPointTokens}, linesToReflect=${linesToReflect}/${totalLines}, sliceTokenEstimate=${sliceTokenEstimate}, compressionTarget=${compressionTarget}`
     );
-    omDebug(
-      `[OM:reflect] doAsyncBufferedReflection: starting reflector call, recordId=${currentRecord.id}, observationTokens=${sliceTokenEstimate}, compressionTarget=${compressionTarget} (inputTokens), activeObsLength=${activeObservations.length}, reflectedLineCount=${reflectedObservationLineCount}`
-    );
     if (writer) {
       const startMarker = createBufferingStartMarker({
         cycleId,
@@ -6725,6 +6764,9 @@ ${bufferedObservations}`;
       void writer.custom(startMarker).catch(() => {
       });
     }
+    omDebug(
+      `[OM:reflect] doAsyncBufferedReflection: starting reflector call, recordId=${currentRecord.id}, observationTokens=${sliceTokenEstimate}, compressionTarget=${compressionTarget} (inputTokens), activeObsLength=${activeObservations.length}, reflectedLineCount=${reflectedObservationLineCount}`
+    );
     const reflectResult = await this.callReflector(
       activeObservations,
       void 0,
@@ -6736,8 +6778,7 @@ ${bufferedObservations}`;
       // No abort signal for background ops
       true,
       // Skip continuation hints for async buffering
-      1,
-      // Start at compression level 1 for buffered reflection
+      await this.getCompressionStartLevel(requestContext),
       requestContext
     );
     const reflectionTokenCount = this.tokenCounter.countObservations(reflectResult.observations);
@@ -7598,5 +7639,5 @@ exports.stripEphemeralAnchorIds = stripEphemeralAnchorIds;
 exports.stripObservationGroups = stripObservationGroups;
 exports.truncateStringByTokens = truncateStringByTokens;
 exports.wrapInObservationGroup = wrapInObservationGroup;
-//# sourceMappingURL=chunk-LVV2RT42.cjs.map
-//# sourceMappingURL=chunk-LVV2RT42.cjs.map
+//# sourceMappingURL=chunk-CNOHXG5O.cjs.map
+//# sourceMappingURL=chunk-CNOHXG5O.cjs.map
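
The hunks above change how reflection retries escalate: the start level now depends on the model, and the ceiling becomes `Math.min(MAX_COMPRESSION_LEVEL, startLevel + 3)`. The following is a simplified standalone sketch of that control flow, not the package's code. `pickStartLevel` and `runRetries` are hypothetical names, the `attempt` callback stands in for the real reflector call, and the sketch assumes an attempt succeeds only when its token count is strictly below the target. The degenerate-output handling visible in the diff is left out.

```typescript
// Simplified sketch; constant name and values mirror the diff, the rest is illustrative.
const MAX_COMPRESSION_LEVEL = 4;

// Mirrors the model check in getCompressionStartLevel: gemini-2.5-flash starts at 2, others at 1.
function pickStartLevel(modelId: string | undefined): number {
  return (modelId ?? "").includes("gemini-2.5-flash") ? 2 : 1;
}

// Retry with increasing compression until the output is below the target
// or the level ceiling is reached. `attempt` returns the reflected token count.
function runRetries(
  startLevel: number,
  target: number,
  attempt: (level: number) => number,
): { level: number; tokens: number } {
  const maxLevel = Math.min(MAX_COMPRESSION_LEVEL, startLevel + 3);
  let level = startLevel;
  let tokens = attempt(level);
  // Assumption: success means strictly below the target threshold.
  while (tokens >= target && level < maxLevel) {
    level += 1; // corresponds to `currentLevel = nextLevel` in the diff
    tokens = attempt(level);
  }
  return { level, tokens };
}

// Usage: record which levels are tried when every attempt misses the target.
const tried: number[] = [];
runRetries(pickStartLevel("google/gemini-2.5-flash"), 500, (level) => {
  tried.push(level);
  return 1000; // always above target, so every level up to the ceiling is tried
});
console.log(tried);
```

Under these assumptions, a start level of 2 allows attempts at levels 2, 3, and 4, while the default start of 0 inside `callReflector` is capped at level 3, matching the previous `maxLevel` for that case.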