npm - ultra-memory - Versions diffs - 3.0.5 → 3.2.0 - Mend

ultra-memory 3.0.5 → 3.2.0

Files changed (16)

package/SKILL.md +141 -0
package/integrations/__init__.py +1 -0
package/integrations/langchain_memory.py +118 -0
package/integrations/langgraph_checkpointer.py +76 -0
package/integrations/n8n_nodes.py +150 -0
package/package.json +121 -108
package/scripts/auto_decay.py +351 -0
package/scripts/cleanup.py +21 -0
package/scripts/detect_contradictions.py +537 -0
package/scripts/evolve_profile.py +414 -0
package/scripts/extract_facts.py +471 -0
package/scripts/log_op.py +42 -0
package/scripts/multimodal/__init__.py +2 -0
package/scripts/multimodal/extract_from_image.py +138 -0
package/scripts/multimodal/extract_from_pdf.py +182 -0
package/scripts/multimodal/transcribe_video.py +157 -0

package/SKILL.md CHANGED

@@ -269,6 +269,143 @@ python3 $SKILL_DIR/scripts/log_op.py \
 ---
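+## Step 7: Meta-reflection and evolution
+Accumulating memories is not the same as evolving. Evolution requires a second pass over memory: distilling patterns, correcting biases, and discarding noise.
+### 7A: Periodic meta-reflection
+**Trigger conditions (any one suffices):**
+1. The current session has accumulated **5** milestones (determined from the op_count returned by init.py; check after every milestone)
+2. The user says: 回顾一下 (look back), 总结经验 (summarize lessons), 我们学到了什么 (what have we learned), reflect, what have we learned, review progress
+3. More than **3 days** have passed since the last meta-reflection (determined from the `last_reflection` field in user_profile.json; if the field is missing, treat it as never having reflected)
+**Steps (run in order; none may be skipped):**
+**Step 1: Read the recent knowledge base**
+```bash
+# Read the 20 most recent knowledge base entries
+tail -20 $ULTRA_MEMORY_HOME/semantic/knowledge_base.jsonl
+```
+**Step 2: Read the recent session summary**
+```bash
+# Read the current session summary
+cat $ULTRA_MEMORY_HOME/sessions/$SESSION_ID/summary.md 2>/dev/null || echo "No summary yet"
+```
+**Step 3: Model-driven distillation (core step)**
+Based on what was read, the model makes each of the following checks; every one must be completed:
+| Check | Action |
+|-------|---------|
+| Two or more knowledge entries have similar content | Merge them into one more concise entry, write it to knowledge_base.jsonl, and mark the original entries with `"merged": true` |
+| A piece of knowledge recurs across multiple operations | Mark it `"importance": "high"` and write the entry back |
+| A piece of knowledge has not been retrieved for more than 30 days and is not high importance | Mark it `"stale": true` |
+| The user's behavior does not match what user_profile.json records | Update the corresponding field in user_profile.json and add a `"corrected_at"` timestamp |
+| A new pattern in how the user works has been identified | Append it to the `observed_patterns` array in user_profile.json |
+**Step 4: Write the reflection record**
+```bash
+python3 $SKILL_DIR/scripts/log_op.py \
+  --session $SESSION_ID \
+  --type reasoning \
+  --summary "Meta-reflection complete: <one sentence describing what was distilled this time>" \
+  --tags "reflection,evolution"
+```
+**Step 5: Update the reflection timestamp**
+Set the `last_reflection` field in `user_profile.json` to the current UTC time (ISO 8601 format).
+**Step 6: Tell the user (briefly)**
+Report the reflection result to the user in a single sentence. A full report is not needed; one sentence is enough and does not interrupt the main task.
+---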
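+### 7B: Error correction
+**Trigger conditions (any one suffices):**
+1. The user says: 不对 (that's wrong), 你记错了 (you remembered it wrong), 不是这样的 (that's not how it is), 纠正一下 (please correct that), wrong, that's not right, correct that
+2. What the user describes clearly contradicts a record in user_profile.json
+**Steps:**
+**Step 1: Locate the incorrect record**
+```bash
+cat $ULTRA_MEMORY_HOME/semantic/user_profile.json
+```
+**Step 2: The model decides which field needs correcting**
+Find the field that contradicts the user's current description.
+**Step 3: Correct and annotate**
+Update the corresponding field in user_profile.json and add the following alongside it:
+```json
+"_correction_note": "Corrected by the user on <date>; previous value was <old value>"
+```
+**Step 4: Log the correction**
+```bash
+python3 $SKILL_DIR/scripts/log_op.py \
+  --session $SESSION_ID \
+  --type decision \
+  --summary "User profile correction: <field name> changed from <old value> to <new value>" \
+  --tags "correction,profile"
+```
+**Step 5: Tell the user**
+"OK, I've updated the record. <field name> is now <new value>."
+---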
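+### 7C: Knowledge distillation (monthly)
+**Trigger condition:**
+The `last_distillation` field in `user_profile.json` is more than **30 days** old, or the field does not exist.
+When this holds, run this step in addition, once step 7A has finished.
+**Steps:**
+**Step 1: Measure the knowledge base size**
+```bash
+wc -l $ULTRA_MEMORY_HOME/semantic/knowledge_base.jsonl
+```
+Skip this step if there are fewer than 10 entries.
+**Step 2: Extract high-value entries**
+Read all of knowledge_base.jsonl and select the entries that meet any of the following conditions:
+- `"importance": "high"`
+- `tags` contains `"reusable"` or `"pattern"`
+- The same `title` keyword appears more than 2 times
+**Step 3: Generate a project-level knowledge summary**
+From the selected entries, generate a project-level knowledge summary of at most 200 characters, in this format:
+```json
+{"ts": "<current UTC time>", "type": "distillation", "period": "<YYYY-MM>", "project": "<project name>", "summary": "<knowledge summary, at most 200 characters>", "source_count": <number of source entries>, "tags": ["distillation"]}
+```
+Append it to `knowledge_base.jsonl`.
+**Step 4: Update the distillation timestamp**
+Set `last_distillation` in `user_profile.json` to the current UTC time.
+---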
 ## Signal reference table
 | Script output | Immediate action |
@@ -278,6 +415,8 @@ python3 $SKILL_DIR/scripts/log_op.py \
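 | `COMPRESS_SUGGESTED` | Run summarize.py --session $SESSION_ID |
 | `SESSION_ID=sess_xxxxx` | Update SESSION_ID to that value |
 | Non-zero exit code | Skip silently and continue the main task |
+| `op_count` reaches a multiple of 5 and includes a milestone | Trigger step 7A |
+| user_profile.json `last_reflection` is more than 3 days old | Trigger step 7A after the next session initialization |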
 ---
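
The two automatic 7A triggers added to the signal table can be sketched as a single predicate. This is a minimal sketch, not code from the package: `should_reflect` and `milestone_count` are hypothetical names, and it assumes `last_reflection` is stored as an ISO 8601 string. The user-phrase trigger is left out because it depends on the conversation rather than on stored state.

```python
from datetime import datetime, timedelta, timezone

REFLECTION_INTERVAL = timedelta(days=3)   # "more than 3 days" from SKILL.md
MILESTONES_PER_REFLECTION = 5             # "5 milestones" from SKILL.md


def should_reflect(profile: dict, milestone_count: int, now: datetime) -> bool:
    """Return True when either automatic 7A trigger fires (hypothetical helper)."""
    # Milestone trigger: the count has reached a non-zero multiple of 5.
    if milestone_count > 0 and milestone_count % MILESTONES_PER_REFLECTION == 0:
        return True
    # Time trigger: a missing field counts as "never reflected".
    last = profile.get("last_reflection")
    if last is None:
        return True
    # Older Python versions reject a trailing "Z", so normalize it first.
    last_dt = datetime.fromisoformat(last.replace("Z", "+00:00"))
    if last_dt.tzinfo is None:
        last_dt = last_dt.replace(tzinfo=timezone.utc)  # assume naive values are UTC
    return now - last_dt > REFLECTION_INTERVAL


# Usage: check once after session initialization and again after each milestone.
due = should_reflect({"last_reflection": "2024-01-01T00:00:00Z"}, 2,
                     datetime.now(timezone.utc))
```

Passing `now` in explicitly keeps the check deterministic and easy to test.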
@@ -301,6 +440,8 @@ python3 $SKILL_DIR/scripts/log_op.py \
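 | Any script exits with a non-zero code | **Skip silently**; do not interrupt the main task |
 | summarize.py has too few entries | Re-run with the --force flag |
 | The user explicitly says "不用记录" (no need to record) | Stop recording immediately; do not call log_op.py for subsequent operations |
+| knowledge_base.jsonl does not exist | Skip the knowledge base read in 7A and reflect based on the summary alone |
+| user_profile.json fails to parse | Recreate it as an empty file; do not interrupt the evolution flow |
 **Most important principle: a memory failure must never affect the main task. Handle it silently and print no errors.**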
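
The new row for a `user_profile.json` parse failure describes a silent fallback. One way it might look is sketched below; `load_profile` is a hypothetical helper, and the sketch assumes "empty file" means an empty JSON object (`{}`) so that the next read parses cleanly.

```python
import json
from pathlib import Path


def load_profile(path: Path) -> dict:
    """Load user_profile.json; on any failure, reset it and return {}."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
        if isinstance(data, dict):
            return data
    except (OSError, ValueError):
        # ValueError covers json.JSONDecodeError and UnicodeDecodeError.
        pass  # stay silent: memory failures must not affect the main task
    try:
        # Assumption: recreate the profile as an empty JSON object.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("{}", encoding="utf-8")
    except OSError:
        pass  # even the reset is best-effort
    return {}
```

Callers can then treat a missing `last_reflection` or `last_distillation` key the same way as a freshly created profile.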

package/integrations/__init__.py ADDED

@@ -0,0 +1 @@
+# ultra-memory integrations package

package/integrations/langchain_memory.py ADDED

@@ -0,0 +1,118 @@
+#!/usr/bin/env python3
+"""
+ultra-memory: LangChain Memory integration
+Provides the UltraMemoryMemory class, which mirrors the LangChain BaseMemory interface
+(without subclassing it) and can be used directly with LC agents.
+Usage:
+    from integrations.langchain_memory import UltraMemoryMemory
+    memory = UltraMemoryMemory(session_id="sess_langchain_test", project="my-agent")
+    agent = OpenAIAgent(..., memory=memory)
+"""
+import json
+import os
+import sys
+from pathlib import Path
+from typing import Any
+try:
+    from langchain.schema import BaseMemory
+    from langchain.schema import HumanMessage, AIMessage
+    HAS_LANGCHAIN = True
+except ImportError:
+    HAS_LANGCHAIN = False
+ULTRA_MEMORY_HOME = Path(os.environ.get("ULTRA_MEMORY_HOME", Path.home() / ".ultra-memory"))
+_SCRIPTS_DIR = Path(__file__).parent.parent / "scripts"
+class UltraMemoryMemory:
+    """
+    LangChain memory backed by ultra-memory's 5-layer system.
+    Implements BaseMemory-compatible interface.
+    """
+    def __init__(
+        self,
+        session_id: str,
+        project: str = "langchain",
+        top_k: int = 5,
+    ):
+        self.session_id = session_id
+        self.project = project
+        self.top_k = top_k
+    @property
+    def memory_variables(self) -> list[str]:
+        return ["ultra_memory_context"]
+    def load_memory_variables(self, inputs: dict) -> dict:
+        """Load memories relevant to the current context."""
+        query = inputs.get("query", "")
+        if not self.session_id:
+            return {"ultra_memory_context": ""}
+        if query:
+            # Use recall.py to fetch relevant memories
+            import subprocess
+            recall_script = _SCRIPTS_DIR / "recall.py"
+            try:
+                # With capture_output=True the child's output is returned on the
+                # result object; redirecting sys.stdout would capture nothing.
+                result = subprocess.run(
+                    [sys.executable, str(recall_script),
+                     "--session", self.session_id,
+                     "--query", query,
+                     "--top-k", str(self.top_k)],
+                    capture_output=True,
+                    text=True,
+                    timeout=30,
+                )
+                context = result.stdout
+            except Exception:
+                context = ""
+        else:
+            # Load the latest summary
+            summary_file = ULTRA_MEMORY_HOME / "sessions" / self.session_id / "summary.md"
+            if summary_file.exists():
+                context = summary_file.read_text(encoding="utf-8")
+            else:
+                context = ""
+        return {"ultra_memory_context": context}
+    def save_context(self, inputs: dict, outputs: dict) -> None:
+        """Save one conversation turn to ultra-memory."""
+        import subprocess
+        input_text = inputs.get("input", "")[:200]
+        output_text = outputs.get("output", "")[:200]
+        detail = {
+            "input": inputs.get("input", ""),
+            "output": outputs.get("output", ""),
+        }
+        try:
+            subprocess.run(
+                [
+                    sys.executable,
+                    str(_SCRIPTS_DIR / "log_op.py"),
+                    "--session", self.session_id,
+                    "--type", "tool_call",
+                    "--summary", f"LC: {input_text[:60]}",
+                    "--detail", json.dumps(detail, ensure_ascii=False),
+                    "--tags", "langchain",
+                ],
+                capture_output=True,
+                timeout=5,
+            )
+        except Exception:
+            pass
+    def clear(self) -> None:
+        """Clear the current memory (does not delete the session)."""
+        self.session_id = None
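
A note on capturing child-process output, as `load_memory_variables` needs to do for `recall.py`: output captured with `capture_output=True` is returned on the `CompletedProcess` object and never passes through the parent's `sys.stdout`. The sketch below shows the pattern on its own; `run_and_capture` is a hypothetical helper, and a one-line child interpreter stands in for `recall.py`.

```python
import subprocess
import sys


def run_and_capture(argv: list[str], timeout: float = 30.0) -> str:
    """Run a command and return its stdout as text, or "" on failure."""
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""  # fail silently, in line with the skill's error policy
    return result.stdout


# Stand-in for recall.py: a child interpreter that prints one line.
context = run_and_capture([sys.executable, "-c", "print('recalled memory')"])
```

Whether a non-zero exit code should also yield an empty context is a design choice the sketch leaves open; `result.returncode` is available for that check.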

package/integrations/langgraph_checkpointer.py ADDED

@@ -0,0 +1,76 @@
+#!/usr/bin/env python3
+"""
+ultra-memory: LangGraph Checkpointer integration
+Provides the UltraMemoryCheckpointer class as a state persistence backend for LangGraph.
+Usage:
+    from integrations.langgraph_checkpointer import UltraMemoryCheckpointer
+    checkpointer = UltraMemoryCheckpointer(session_id="sess_langgraph_proj")
+    compiled = graph.compile(checkpointer=checkpointer)
+"""
+import json
+import os
+from pathlib import Path
+from typing import Any, Optional
+ULTRA_MEMORY_HOME = Path(os.environ.get("ULTRA_MEMORY_HOME", Path.home() / ".ultra-memory"))
+class UltraMemoryCheckpointer:
+    """
+    LangGraph checkpointer backed by ultra-memory.
+    Saves and restores the agent graph state after each node runs.
+    """
+    def __init__(self, session_id: str):
+        self.session_id = session_id
+        self.checkpoint_dir = ULTRA_MEMORY_HOME / "sessions" / session_id / "checkpoints"
+        self.checkpoint_dir.mkdir(parents=True, exist_ok=True)
+    def _checkpoint_file(self, thread_id: str, step: int) -> Path:
+        """Return the path of the checkpoint file."""
+        return self.checkpoint_dir / f"thread_{thread_id}_step_{step:04d}.json"
+    def get(self, thread_id: str, step: int) -> Optional[dict[str, Any]]:
+        """Return the checkpoint state for the given thread and step."""
+        checkpoint_file = self._checkpoint_file(thread_id, step)
+        if not checkpoint_file.exists():
+            return None
+        try:
+            with open(checkpoint_file, encoding="utf-8") as f:
+                data = json.load(f)
+            return data.get("state")
+        except (json.JSONDecodeError, IOError):
+            return None
+    def put(self, thread_id: str, step: int, state: dict[str, Any]) -> None:
+        """Save a checkpoint state."""
+        checkpoint_file = self._checkpoint_file(thread_id, step)
+        data = {
+            "step": step,
+            "state": state,
+            "session_id": self.session_id,
+            "thread_id": thread_id,
+        }
+        with open(checkpoint_file, "w", encoding="utf-8") as f:
+            json.dump(data, f, ensure_ascii=False, indent=2)
+    def get_latest(self, thread_id: str) -> Optional[dict[str, Any]]:
+        """Return the latest checkpoint for the given thread."""
+        checkpoints = sorted(
+            self.checkpoint_dir.glob(f"thread_{thread_id}_step_*.json"),
+            key=lambda p: int(p.stem.split("_")[-1]),
+        )
+        if not checkpoints:
+            return None
+        return self.get(thread_id, int(checkpoints[-1].stem.split("_")[-1]))
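+    def list_threads(self) -> list[str]:
+        """List all existing thread IDs."""
+        threads = set()
+        for f in self.checkpoint_dir.glob("thread_*_step_*.json"):
+            # Strip the "thread_" prefix and the trailing "_step_NNNN" so that
+            # thread IDs containing underscores are returned intact.
+            thread_id = f.stem[len("thread_"):].rsplit("_step_", 1)[0]
+            threads.add(thread_id)
+        return sorted(threads)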
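
The checkpoint file naming scheme (`thread_<thread_id>_step_<NNNN>.json`) can be parsed in one place so that thread IDs containing underscores survive the round trip. This is a minimal sketch; `parse_checkpoint_stem` and the regular expression are illustrative and not part of the package.

```python
import re
from typing import Optional

# Greedy ".+" makes the last "_step_<digits>" the separator, so a thread ID
# may itself contain underscores.
_CHECKPOINT_RE = re.compile(r"^thread_(?P<thread>.+)_step_(?P<step>\d+)$")


def parse_checkpoint_stem(stem: str) -> Optional[tuple[str, int]]:
    """Split a checkpoint file stem into (thread_id, step), or return None."""
    match = _CHECKPOINT_RE.match(stem)
    if match is None:
        return None
    return match.group("thread"), int(match.group("step"))


parsed = parse_checkpoint_stem("thread_user_42_step_0007")
```

A helper like this could back both the thread listing and the latest-checkpoint lookup, which otherwise each split the stem on `_` independently.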

package/integrations/n8n_nodes.py ADDED

@@ -0,0 +1,150 @@
+#!/usr/bin/env python3
+"""
+ultra-memory: n8n integration node
+Python script backend for the n8n "Execute Command" node;
+supports the init / log / recall / profile operations.
+n8n configuration example:
+  Execute Command node
+    Command: python3
+    Arguments: /path/to/ultra-memory/integrations/n8n_nodes.py <operation> <args>
+Operations:
+  init --project=<proj>                    → returns session_id
+  log --session=<id> --summary="..." --type=<type> --detail='{}'
+  recall --session=<id> --query="..."
+  profile --action=read|update --field=<field> --value=<value>
+"""
+import json
+import os
+import sys
+import re
+from pathlib import Path
+ULTRA_MEMORY_HOME = Path(os.environ.get("ULTRA_MEMORY_HOME", Path.home() / ".ultra-memory"))
+_SCRIPTS_DIR = Path(__file__).parent.parent / "scripts"
+def _run_script(script_name: str, args: list[str]) -> str:
+    """Run a script and return its combined stdout and stderr."""
+    import subprocess
+    script_path = _SCRIPTS_DIR / script_name
+    result = subprocess.run(
+        [sys.executable, str(script_path)] + args,
+        capture_output=True, text=True, timeout=30,
+    )
+    return result.stdout + result.stderr
+def cmd_init(project: str) -> dict:
+    """Initialize a session."""
+    output = _run_script("init.py", ["--project", project, "--resume"])
+    session_id = None
+    memory_ready = False
+    for line in output.split("\n"):
+        if "session_id:" in line:
+            match = re.search(r"session_id:\s*(sess_\w+)", line)
+            if match:
+                session_id = match.group(1)
+        if "MEMORY_READY" in line:
+            memory_ready = True
+    return {
+        "success": memory_ready,
+        "session_id": session_id,
+        "output": output,
+    }
+def cmd_log(session_id: str, summary: str, op_type: str, detail: str = "{}") -> dict:
+    """Log an operation."""
+    output = _run_script("log_op.py", [
+        "--session", session_id,
+        "--type", op_type,
+        "--summary", summary,
+        "--detail", detail,
+    ])
+    return {"success": True, "output": output}
+def cmd_recall(session_id: str, query: str, top_k: int = 5) -> dict:
+    """Retrieve memories."""
+    output = _run_script("recall.py", [
+        "--session", session_id,
+        "--query", query,
+        "--top-k", str(top_k),
+    ])
+    return {"success": True, "output": output}
+def cmd_profile(action: str, field: str = None, value: str = None) -> dict:
+    """Read or update the user profile."""
+    if action == "read":
+        output = _run_script("evolve_profile.py", [])
+        return {"success": True, "output": output}
+    elif action == "update" and field and value:
+        output = _run_script("evolve_profile.py", [
+            "--field", field, "--value", value,
+        ])
+        return {"success": True, "output": output}
+    return {"success": False, "error": "invalid profile command"}
+# ── CLI ─────────────────────────────────────────────────────────────────────
+def _opt(args: list[str], name: str, default: str = None) -> str:
+    """Return the value of --name, given as "--name=value" or "--name value"."""
+    flag = f"--{name}"
+    for i, arg in enumerate(args):
+        if arg.startswith(flag + "="):
+            return arg.split("=", 1)[1]
+        if arg == flag and i + 1 < len(args):
+            return args[i + 1]
+    return default
+def _required(args: list[str], name: str) -> str:
+    """Like _opt, but raise a readable error when the option is missing."""
+    value = _opt(args, name)
+    if value is None:
+        raise ValueError(f"missing required option --{name}")
+    return value
+if __name__ == "__main__":
+    if len(sys.argv) < 2:
+        print("Usage: n8n_nodes.py <init|log|recall|profile> [args...]")
+        sys.exit(1)
+    operation = sys.argv[1].lower()
+    args = sys.argv[2:]
+    result = {}
+    try:
+        if operation == "init":
+            result = cmd_init(_opt(args, "project", "default"))
+        elif operation == "log":
+            result = cmd_log(
+                _required(args, "session"),
+                _opt(args, "summary", ""),
+                _opt(args, "type", "tool_call"),
+                _opt(args, "detail", "{}"),
+            )
+        elif operation == "recall":
+            result = cmd_recall(_required(args, "session"), _opt(args, "query", ""))
+        elif operation == "profile":
+            result = cmd_profile(
+                _opt(args, "action", "read"),
+                _opt(args, "field"),
+                _opt(args, "value"),
+            )
+        else:
+            result = {"success": False, "error": f"unknown operation: {operation}"}
+    except Exception as e:
+        result = {"success": False, "error": str(e)}
+    print(json.dumps(result, ensure_ascii=False, indent=2))
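
As an alternative to scanning `sys.argv` by hand, the four operations could be declared with `argparse` subparsers, which accept both `--opt value` and `--opt=value` and report missing required options. This is a sketch of that design rather than the package's code; `build_parser` is a hypothetical name, and the options and defaults mirror those in the CLI block above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the init / log / recall / profile operations as subcommands."""
    parser = argparse.ArgumentParser(prog="n8n_nodes.py")
    sub = parser.add_subparsers(dest="operation", required=True)

    p_init = sub.add_parser("init")
    p_init.add_argument("--project", default="default")

    p_log = sub.add_parser("log")
    p_log.add_argument("--session", required=True)
    p_log.add_argument("--summary", default="")
    p_log.add_argument("--type", dest="op_type", default="tool_call")
    p_log.add_argument("--detail", default="{}")

    p_recall = sub.add_parser("recall")
    p_recall.add_argument("--session", required=True)
    p_recall.add_argument("--query", default="")

    p_profile = sub.add_parser("profile")
    p_profile.add_argument("--action", choices=["read", "update"], default="read")
    p_profile.add_argument("--field")
    p_profile.add_argument("--value")
    return parser


# Usage: both option spellings parse to the same namespace.
ns = build_parser().parse_args(["log", "--session", "sess_1", "--summary=hello"])
```

One trade-off: on invalid input `argparse` prints a usage message to stderr and exits with status 2, so a caller that expects a JSON error object on stdout would need to catch `SystemExit` and emit one itself.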