ultra-memory 3.2.0 → 4.1.0
- package/README.md +64 -32
- package/SKILL.md +80 -1
- package/package.json +13 -6
- package/platform/server.py +23 -1
- package/scripts/__pycache__/auto_decay.cpython-313.pyc +0 -0
- package/scripts/__pycache__/conflict_detector.cpython-313.pyc +0 -0
- package/scripts/__pycache__/log_op.cpython-313.pyc +0 -0
- package/scripts/__pycache__/recall.cpython-313.pyc +0 -0
- package/scripts/__pycache__/summarize.cpython-313.pyc +0 -0
- package/scripts/auto_decay.py +75 -3
- package/scripts/conflict_detector.py +320 -0
- package/scripts/log_op.py +116 -16
- package/scripts/manage.py +426 -0
- package/scripts/multimodal/extract_from_docx.py +167 -0
- package/scripts/recall.py +627 -95
- package/scripts/summarize.py +38 -1
package/README.md
CHANGED
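@@ -14,12 +14,19 @@
 | Feature | Description |
 |------|------|
 | **5-layer memory architecture** | ops log → summary → semantic → entity index → vector semantics |
-| **Zero external dependencies** | Pure Python stdlib
+| **Zero external dependencies** | Pure Python stdlib, optionally enhanced with sentence-transformers |
+| **RRF multi-channel fusion** | Reciprocal rank fusion over three channels (BM25 + TF-IDF + vector), removing score-scale mismatches between channels |
+| **Local Cross-Encoder** | Reranking with `cross-encoder/ms-marco-MiniLM-L-6-v2`, fully offline, zero API calls |
+| **Weibull decay** | `exp(-(age/λ)^0.75)`; long-term retention is 2.7× that of simple exponential decay (after 7 days) |
+| **Three-tier memory classification** | Automatic classification into core / working / peripheral; gc can clean up peripheral memories |
+| **Snippet extraction** | recall returns relevant snippets instead of full records, cutting token consumption by ~70% |
+| **Feedback-loop protection** | Automatically filters memory-injection markers so self-referential noise does not accumulate |
 | **Structured entities** | Automatically extracts functions/files/dependencies/decisions/errors/classes for exact recall |
 | **Hierarchical compression** | O(log n) context growth; the context never overflows |
 | **Cross-language retrieval** | Bidirectional Chinese-English synonym mapping ("数据清洗" ↔ "clean_df") |
-| **All-platform support** | MCP Server / REST API / Claude Code Skill |
-|
+| **All-platform support** | MCP Server / REST API / Claude Code Skill / OpenClaw |
+| **Management CLI** | Six subcommands: list / search / stats / export / gc / tier |
+| **Bearer Token authentication** | Optional token protection for the REST API, with environment-variable injection |

 ---
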
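The Weibull row in the feature table gives the weight as `exp(-(age/λ)^0.75)`. The following is a minimal sketch of that formula next to plain exponential decay. The function names and the scale `λ = 2` days are illustrative assumptions; the diff does not show the value the package uses.

```python
import math


def weibull_weight(age_days: float, scale_days: float, shape: float = 0.75) -> float:
    """Weibull survival weight exp(-(age/scale)^shape); shape 0.75 is the value quoted in the README."""
    if age_days <= 0:
        return 1.0
    return math.exp(-((age_days / scale_days) ** shape))


def exponential_weight(age_days: float, scale_days: float) -> float:
    """Plain exponential decay exp(-age/scale), included only for comparison."""
    if age_days <= 0:
        return 1.0
    return math.exp(-age_days / scale_days)


lam = 2.0  # hypothetical scale in days, not taken from the package
for age in (0.5, 1.0, 2.0, 7.0, 30.0):
    print(age, round(weibull_weight(age, lam), 4), round(exponential_weight(age, lam), 4))
```

With a shape below 1 the Weibull curve drops faster than the exponential while `age < λ` and more slowly once `age > λ`; the two curves cross at `age = λ`. The size of the gap at 7 days depends on λ.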
@@ -69,42 +76,63 @@ ultra-memory # Start the MCP Server

 ---

-## Architecture: 5-layer memory model
+## Architecture: 5-layer memory model + RRF retrieval engine

 ```
-
-│ Layer 5: Vector semantic layer (TF-IDF / sentence-transformers)
-│
-
-│ Layer 4: Structured entity index (entities.jsonl)
-│
-
-│ Layer 3: Cross-session semantic layer (semantic/)
-│ Knowledge base ·
-
-│ Layer 2: Session summary layer (summary.md)
-│ Milestones · key decisions ·
-
-│ Layer 1: Operation log layer (ops.jsonl) ←
-│
-
+┌────────────────────────────────────────────────────────────────────────────────────┐
+│ Layer 5: Vector semantic layer (TF-IDF / sentence-transformers)                    │
+│   Fuzzy recall · all-MiniLM-L6-v2 · incremental cache                              │
+├────────────────────────────────────────────────────────────────────────────────────┤
+│ Layer 4: Structured entity index (entities.jsonl)                                  │
+│   Exact recall: functions / files / dependencies / decisions / errors / classes    │
+├────────────────────────────────────────────────────────────────────────────────────┤
+│ Layer 3: Cross-session semantic layer (semantic/)                                  │
+│   Knowledge base · user profile · conflict detection · time-travel queries         │
+├────────────────────────────────────────────────────────────────────────────────────┤
+│ Layer 2: Session summary layer (summary.md)                                        │
+│   Milestones · key decisions · three-tier memory stats · meta-compression O(log n) │
+├────────────────────────────────────────────────────────────────────────────────────┤
+│ Layer 1: Operation log layer (ops.jsonl) ← append-only core                        │
+│   Weibull decay · tier classification · context window · feedback-loop protection  │
+└────────────────────────────────────────────────────────────────────────────────────┘
+                                  ↓ at recall time
+┌────────────────────────────────────────────────────────────────────────────────────┐
+│ Retrieval engine (new in v4.1)                                                     │
+│ BM25   ──┐                                                                         │
+│ TF-IDF ──┼→ RRF fusion → [optional] Cross-Encoder reranking → snippet extraction   │
+│ Vector ──┘                                                                         │
+└────────────────────────────────────────────────────────────────────────────────────┘
 ```

 ---

 ## Tool interface

+### MCP / REST API tools
+
 | Tool | Function |
 |------|------|
 | `memory_init` | Initialize a session and create the three-layer memory structure |
-| `memory_log` |
-| `memory_recall` | 5
-| `memory_summarize` |
+| `memory_log` | Log an operation (automatic entity extraction + tier classification) |
+| `memory_recall` | Unified retrieval across the 5 layers (RRF fusion + Cross-Encoder reranking) |
+| `memory_summarize` | Trigger summary compression (including three-tier stats + meta-compression) |
 | `memory_restore` | Restore the previous session's context |
 | `memory_profile` | Read and write the user profile |
 | `memory_status` | Query session status and context pressure |
 | `memory_entities` | Query the structured entity index |
 | `memory_extract_entities` | Re-extract all entities from scratch |
+| `memory_knowledge_add` | Append a knowledge-base entry |
+
+### manage.py management CLI
+
+```bash
+python3 scripts/manage.py list                # List all sessions
+python3 scripts/manage.py search "keyword"    # Full-text search across sessions
+python3 scripts/manage.py stats               # Global statistics (tier distribution, knowledge-base size)
+python3 scripts/manage.py export --format json --output backup.json
+python3 scripts/manage.py gc --days 90        # Garbage-collect old sessions (dry-run by default)
+python3 scripts/manage.py tier                # Backfill tier classification for historical data
+```

 ---

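The retrieval engine in the architecture diagram fuses the BM25, TF-IDF, and vector channels with reciprocal rank fusion (RRF). A minimal sketch of RRF under stated assumptions: each channel returns an ordered list of ids, the score is the sum of `1 / (k + rank)` over channels, and `k = 60` is the conventional constant from the RRF literature. The function name, the sample ids, and the value of `k` are illustrative and not taken from `recall.py`.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked id lists: score(d) = sum over channels of 1 / (k + rank), with rank starting at 1."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical per-channel results (best match first).
bm25 = ["op3", "op1", "op7"]
tfidf = ["op1", "op3", "op9"]
vector = ["op1", "op7", "op3"]

fused = rrf_fuse([bm25, tfidf, vector])
print([doc_id for doc_id, _ in fused])
```

Because only ranks enter the formula, the raw scores of the three channels never need to be normalized against each other, which is the scale-mismatch point the README makes.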
@@ -168,15 +196,19 @@ Claude: [retrieving ops #23] In clean_df() in src/cleaner.py:

 ## Comparison with mainstream solutions

-| Capability | Claude
-
-|
-|
-|
-|
-|
-|
-|
+| Capability | Claude native | mem0 | memory-lancedb-pro | ultra-memory |
+|------|:-----------:|:----:|:-----------------:|:-----------:|
+| Zero external dependencies (core) | ✅ | ❌ | ❌ | **✅** |
+| RRF multi-channel fusion | ❌ | ❌ | ✅ | **✅** |
+| Cross-Encoder reranking | ❌ | ❌ | ✅ (external API) | **✅** (local, offline) |
+| Conflict detection | ❌ | ❌ | ❌ | **✅** |
+| Three-tier memory classification | ❌ | ❌ | ✅ | **✅** |
+| Feedback-loop protection | ❌ | ❌ | ✅ | **✅** |
+| Structured entity extraction | ❌ | Partial | LLM classification | **✅** (regex, zero API) |
+| Hierarchical compression O(log n) | ❌ | ❌ | ❌ | **✅** |
+| Fully local data | ❌ (cloud) | ⚠️ | ✅ | **✅** |
+| All-platform support | Claude only | ⚠️ | ⚠️ | **✅** |
+| Management CLI | ❌ | ❌ | ✅ | **✅** |

 ---

package/SKILL.md
CHANGED
@@ -1,7 +1,10 @@
 ---
 name: ultra-memory
+version: 4.1.0
 description: >
-  ultra-memory is a multi-model AI
+  ultra-memory is an ultra-long-session memory system for multi-model AI (v4.1).
+  Retrieval engine: RRF multi-channel fusion + local Cross-Encoder reranking + Weibull decay + snippet extraction.
+  Memory tiers: automatic three-tier classification into core / working / peripheral, feedback-loop protection, zero forgetting across sessions.
   [Mandatory trigger, Chinese] The user says any of: 记住、别忘了、记录一下、不要忘记、上次我们做了什么、帮我回忆、继续上次的、从上次继续、记忆、帮我记、追踪进度
   [Mandatory trigger, English] The user says any of: remember, don't forget, recall, what did we do, pick up where we left off, continue from last time, memory, keep track, track progress, log this
   [Implicit trigger, A+B] Triggers when both conditions hold: (A) the message contains an ongoing-task verb: 开发、实现、处理、完成、构建、develop、implement、create、fix; (B) the message contains a project noun (proper noun / file name / system name)
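The description above names feedback-loop protection, and Appendix A of SKILL.md lists the filtered markers as `[ultra-memory]`, `MEMORY_READY`, `[RECALL]`, and `[ops #N]`. The sketch below shows one way such a filter could look. The regular expressions, the function names, and the choice to drop a whole line (rather than strip the marker) are assumptions; the package's own filter in `log_op.py` is not shown in this part of the diff.

```python
import re

# Marker list from Appendix A of SKILL.md; the regex forms are an assumption.
_MARKER_PATTERNS = [
    re.compile(r"\[ultra-memory\]"),
    re.compile(r"MEMORY_READY"),
    re.compile(r"\[RECALL\]"),
    re.compile(r"\[ops #\d+\]"),
]


def contains_memory_marker(text: str) -> bool:
    """True when the text carries any memory-injection marker."""
    return any(pattern.search(text) for pattern in _MARKER_PATTERNS)


def filter_loggable(lines: list[str]) -> list[str]:
    """Drop lines that carry a marker so recalled output is not logged a second time."""
    return [line for line in lines if not contains_memory_marker(line)]


print(filter_loggable(["[RECALL] clean_df() drops NaN rows", "refactored clean_df()", "MEMORY_READY"]))
```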
@@ -474,3 +477,79 @@ wc -l $ULTRA_MEMORY_HOME/semantic/knowledge_base.jsonl
 ---

 For advanced configuration (filter rules, upgrading to LanceDB vector retrieval, automatic hook configuration, security notes, and so on), see `references/advanced-config.md`.
+
+---
+
+## Appendix A: v4.1 retrieval enhancements
+
+The following optimizations are built in, need no configuration, and take effect automatically:
+
+| Optimization | Description |
+|--------|------|
+| **RRF multi-channel fusion** | Results from the BM25, TF-IDF, and vector channels are combined with reciprocal rank fusion, removing the score-scale mismatch across channels |
+| **Local Cross-Encoder reranking** | Enabled automatically once `sentence-transformers` is installed, using `cross-encoder/ms-marco-MiniLM-L-6-v2`; runs entirely locally with zero API calls and improves reranking accuracy by roughly 5-8% |
+| **Weibull decay** | `exp(-(age/λ)^0.75)` replaces simple exponential decay; with k<1 the early decay is faster (less operational noise), and after 7 days the retained weight is 2.7× that of simple exponential decay (better long-term retention) |
+| **Snippet extraction** | recall output cuts each full record down to a 150-character relevant snippet, reducing token consumption by about 70% |
+| **Feedback-loop protection** | Automatically filters memory-injection markers such as `[ultra-memory]`, `MEMORY_READY`, `[RECALL]`, and `[ops #N]`, so the AI does not log its own output again and build up self-referential noise |
+
+---
+
+## Appendix B: Memory tiers (three-tier model)
+
+After compression, a `tier` field is written automatically to each op:
+
+| Tier | Type | Operation types | Recall strategy |
+|------|------|----------|---------|
+| **core** | Core memory | milestone / decision / error / user_instruction | Retained long term, high priority |
+| **working** | Working memory | reasoning / file_write / bash_exec | Active in the current session, compressed periodically |
+| **peripheral** | Peripheral memory | file_read / tool_call | Historical detail, low priority, reclaimable by gc |
+
+summary.md automatically reports per-tier statistics after each compression.
+
+---
+
+## Appendix C: manage.py management tool
+
+```bash
+# List all sessions
+python3 $SKILL_DIR/scripts/manage.py list
+python3 $SKILL_DIR/scripts/manage.py list --project my-project
+
+# Full-text search across all sessions
+python3 $SKILL_DIR/scripts/manage.py search "pandas 数据清洗"
+python3 $SKILL_DIR/scripts/manage.py search "error" --limit 50
+
+# Global statistics (operation count, tier distribution, project distribution, knowledge-base size)
+python3 $SKILL_DIR/scripts/manage.py stats
+
+# Export a full memory backup
+python3 $SKILL_DIR/scripts/manage.py export --format json --output backup.json
+python3 $SKILL_DIR/scripts/manage.py export --format markdown --output memory.md
+
+# Garbage collection: clean up sessions inactive for 90 days with no core operations (dry run by default)
+python3 $SKILL_DIR/scripts/manage.py gc
+python3 $SKILL_DIR/scripts/manage.py gc --days 30 --no-dry-run  # Actually execute
+
+# Backfill tier markers (historical data migration)
+python3 $SKILL_DIR/scripts/manage.py tier
+python3 $SKILL_DIR/scripts/manage.py tier --session sess_xxxxx
+```
+
+---
+
+## Appendix D: REST API authentication (optional)
+
+A Bearer Token can be configured to protect the REST server at startup:
+
+```bash
+# Option 1: command-line argument
+python3 $SKILL_DIR/platform/server.py --token your-secret-token
+
+# Option 2: environment variable
+ULTRA_MEMORY_TOKEN=your-secret-token python3 $SKILL_DIR/platform/server.py
+
+# Clients send this header with each request:
+# Authorization: Bearer your-secret-token
+```
+
+When no token is configured, the server keeps its previous behavior (listens on 127.0.0.1 only, unreachable from the LAN).
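The tier table in Appendix B maps operation types to `core`, `working`, and `peripheral`. As a rough illustration, that table can be expressed as a lookup. The record key `type`, the function name, and the fallback to `working` for unlisted operation types are assumptions; the diff of `log_op.py` that actually writes the `tier` field is not reproduced here.

```python
# Operation-type groups copied from the Appendix B table.
TIER_BY_OP_TYPE = {
    "milestone": "core",
    "decision": "core",
    "error": "core",
    "user_instruction": "core",
    "reasoning": "working",
    "file_write": "working",
    "bash_exec": "working",
    "file_read": "peripheral",
    "tool_call": "peripheral",
}


def classify_tier(op: dict, default: str = "working") -> str:
    """Return the tier for an op record; the `type` key and the default are hypothetical."""
    return TIER_BY_OP_TYPE.get(op.get("type", ""), default)


for op in ({"type": "decision"}, {"type": "file_read"}, {"type": "unknown_kind"}):
    print(op["type"], "->", classify_tier(op))
```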
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
   "name": "ultra-memory",
-  "version": "
-  "description": "Ultra-long-session memory system:
+  "version": "4.1.0",
+  "description": "Ultra-long-session memory system: RRF multi-channel fusion + local Cross-Encoder reranking + Weibull decay + three-tier memory classification, supports all LLM platforms",
   "keywords": [
     "ai",
     "memory",
@@ -18,7 +18,12 @@
     "mem0-alternative",
     "openai",
     "gemini",
-    "qwen"
+    "qwen",
+    "rrf",
+    "cross-encoder",
+    "weibull-decay",
+    "memory-tier",
+    "privacy-first"
   ],
   "homepage": "https://github.com/nanjingya/ultra-memory",
   "bugs": {
@@ -48,10 +53,11 @@
     "ultra-memory-restore": "scripts/restore.py",
     "ultra-memory-knowledge": "scripts/log_knowledge.py",
     "ultra-memory-extract-facts": "scripts/extract_facts.py",
-    "ultra-memory-evolve": "scripts/evolve_profile.py"
+    "ultra-memory-evolve": "scripts/evolve_profile.py",
+    "ultra-memory-manage": "scripts/manage.py"
   },
   "scripts": {
-    "test": "python3 test_e2e.py",
+    "test": "python3 -m unittest tests.test_core -v && python3 test_e2e.py",
     "start": "node scripts/mcp-server.js",
     "server": "python3 platform/server.py"
   },
@@ -62,7 +68,8 @@
   "optionalDependencies": {
     "pdfminer.six": "*",
     "pytesseract": "*",
-    "whisper": "*"
+    "whisper": "*",
+    "python-docx": "*"
   },
   "peerDependencies": {
     "langchain": ">=0.1.0"
package/platform/server.py
CHANGED
@@ -58,6 +58,9 @@ ULTRA_MEMORY_HOME = Path(os.environ.get("ULTRA_MEMORY_HOME", Path.home() / ".ult

 VERSION = "3.0.0"

+# Bearer Token (optional): set via the --token argument or the ULTRA_MEMORY_TOKEN environment variable
+_BEARER_TOKEN: str = os.environ.get("ULTRA_MEMORY_TOKEN", "")
+
 logging.basicConfig(
     level=logging.INFO,
     format="[ultra-memory] %(asctime)s %(levelname)s %(message)s",
@@ -312,6 +315,16 @@ class MemoryHandler(BaseHTTPRequestHandler):
     def log_message(self, fmt, *args):
         log.info(f"{self.address_string()} {fmt % args}")

+    def _check_auth(self) -> bool:
+        """If a Bearer Token is configured, validate the request header; otherwise let the request through."""
+        if not _BEARER_TOKEN:
+            return True
+        auth = self.headers.get("Authorization", "")
+        if auth.startswith("Bearer ") and auth[7:] == _BEARER_TOKEN:
+            return True
+        self._send_json(401, {"error": "Unauthorized: a valid Bearer Token is required"})
+        return False
+
     def _send_json(self, status: int, data: dict):
         body = json.dumps(data, ensure_ascii=False, indent=2).encode("utf-8")
         self.send_response(status)
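The `_check_auth` method added above compares the presented token with `==`. A common hardening step, sketched here as a suggestion rather than something this diff does, is a constant-time comparison with the standard library's `hmac.compare_digest`. The helper name `token_matches` is hypothetical.

```python
import hmac


def token_matches(auth_header: str, expected_token: str) -> bool:
    """Return True when auth_header is 'Bearer <token>' and <token> equals expected_token."""
    prefix = "Bearer "
    if not expected_token:
        # No token configured: mirror the diff's behavior and allow the request.
        return True
    if not auth_header.startswith(prefix):
        return False
    presented = auth_header[len(prefix):]
    # compare_digest runs in time that does not depend on where the first mismatch occurs.
    return hmac.compare_digest(presented.encode("utf-8"), expected_token.encode("utf-8"))


print(token_matches("Bearer your-secret-token", "your-secret-token"))
print(token_matches("Bearer wrong", "your-secret-token"))
```

Encoding both sides to bytes keeps the comparison valid for tokens that contain non-ASCII characters, which `compare_digest` rejects when given `str` arguments.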
@@ -343,6 +356,8 @@ class MemoryHandler(BaseHTTPRequestHandler):
         self.end_headers()

     def do_GET(self):
+        if not self._check_auth():
+            return
         parsed = urlparse(self.path)
         path = parsed.path.rstrip("/")

@@ -391,6 +406,8 @@ class MemoryHandler(BaseHTTPRequestHandler):
         self._send_json(404, {"error": f"Path not found: {path}"})

     def do_POST(self):
+        if not self._check_auth():
+            return
         parsed = urlparse(self.path)
         path = parsed.path.rstrip("/")

@@ -436,17 +453,22 @@ def main():
                         help="Listening port (default 3200)")
     parser.add_argument("--storage", default=None,
                         help="Override the ULTRA_MEMORY_HOME path")
+    parser.add_argument("--token", default=None,
+                        help="Bearer Token secret (no authentication if unset)")
     args = parser.parse_args()

-    global ULTRA_MEMORY_HOME
+    global ULTRA_MEMORY_HOME, _BEARER_TOKEN
     if args.storage:
         ULTRA_MEMORY_HOME = Path(args.storage)
         os.environ["ULTRA_MEMORY_HOME"] = str(ULTRA_MEMORY_HOME)
+    if args.token:
+        _BEARER_TOKEN = args.token

     server = HTTPServer((args.host, args.port), MemoryHandler)

     log.info(f"ultra-memory REST Server v{VERSION} started")
     log.info(f"Address: http://{args.host}:{args.port}")
+    log.info(f"Auth: {'Bearer Token enabled' if _BEARER_TOKEN else 'disabled (local access only)'}")
     log.info(f"Storage: {ULTRA_MEMORY_HOME}")
     log.info(f"Scripts: {SCRIPTS_DIR}")
     log.info(f"Tools: {list(TOOL_HANDLERS.keys())}")
Binary file
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
package/scripts/auto_decay.py
CHANGED
@@ -40,6 +40,32 @@ ULTRA_MEMORY_HOME = Path(os.environ.get("ULTRA_MEMORY_HOME", Path.home() / ".ult
 DEFAULT_HALF_LIFE_DAYS = 30
 DEFAULT_FORGET_THRESHOLD = 0.05

+# ── Memory-type TTL tiers (modeled on sdr-supermemory) ─────────────────────────────
+# permanent: never expires (user profile, key decisions, milestones)
+# insight: session insights, preferences (90 days)
+# signal: temporary signals, error records (30 days)
+# default: other facts (30 days)
+
+MEMORY_TYPE_TTL = {
+    "permanent": None,  # never expires
+    "insight": 90,      # days
+    "signal": 30,       # days
+    "default": 30,      # days
+}
+
+# Entity type / tag → memory type mapping
+ENTITY_TYPE_TO_MEMORY_TYPE = {
+    "dependency": "permanent",  # dependencies rarely change
+    "decision": "permanent",    # key decisions are kept permanently
+    "class": "permanent",       # class definitions are stable
+    "function": "insight",      # function implementations may evolve
+    "file": "insight",          # files may change
+    "error": "signal",          # error records are temporary
+    "preference": "permanent",  # user preferences are permanent
+    "person": "permanent",      # person information is permanent
+    "project": "permanent",     # project configuration is permanent
+}
+
 # Decay level boundaries
 DECAY_LEVELS = [
     (0.6, "none"),
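The `MEMORY_TYPE_TTL` table above assigns a lifetime in days to each memory type, with `None` meaning the entry never expires. How the package turns `ttl_days` into an `expires_at` value is not visible in this hunk, so the following is only a sketch of one plausible reading. The function name `expiry_for` and the fallback to the `default` TTL for unknown types are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# TTL values in days, copied from the diff; None means the entry never expires.
MEMORY_TYPE_TTL = {"permanent": None, "insight": 90, "signal": 30, "default": 30}


def expiry_for(memory_type: str, created_at: datetime) -> Optional[datetime]:
    """Return the expiry time for a memory type, or None for types that never expire."""
    ttl_days = MEMORY_TYPE_TTL.get(memory_type, MEMORY_TYPE_TTL["default"])
    if ttl_days is None:
        return None
    return created_at + timedelta(days=ttl_days)


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
for memory_type in ("permanent", "insight", "signal"):
    print(memory_type, expiry_for(memory_type, created))
```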
@@ -114,7 +140,45 @@ def compute_decay_score(
     return max(0.0, min(1.0, score))


-def
+def detect_memory_type(fact: dict) -> str:
+    """
+    Determine the memory type from the fact's entity type, tags, or source.
+    Returns "permanent" | "insight" | "signal" | "default".
+    """
+    # 1. Entity type takes priority
+    entity_type = fact.get("entity_type", "")
+    if entity_type and entity_type in ENTITY_TYPE_TO_MEMORY_TYPE:
+        return ENTITY_TYPE_TO_MEMORY_TYPE[entity_type]
+
+    # 2. Check tags
+    tags = set(fact.get("tags", []))
+    if tags & {"preference", "person", "project", "milestone"}:
+        return "permanent"
+    if tags & {"error", "debug", "signal"}:
+        return "signal"
+
+    # 3. Check source type
+    source_type = fact.get("source_type", "")
+    if source_type in ("milestone", "decision"):
+        return "permanent"
+    if source_type in ("error",):
+        return "signal"
+
+    # 4. Keywords in the fact content (for facts.jsonl entries without an entity_type)
+    content = (fact.get("subject", "") + " " + fact.get("predicate", "") + " " + fact.get("object", "")).lower()
+    permanent_kw = ["用户", "偏好", "决策", "项目", "配置", "住在", "工作", "职"]
+    signal_kw = ["错误", "报错", "失败", "异常", "bug"]
+    for kw in permanent_kw:
+        if kw in content:
+            return "permanent"
+    for kw in signal_kw:
+        if kw in content:
+            return "signal"
+
+    return "default"
+
+
+
     """Determine the decay level from the decay score"""
     for threshold, level in DECAY_LEVELS:
         if score >= threshold:
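The order of the four checks in `detect_memory_type` decides the result when a fact matches more than one rule. The block below is a condensed copy of the function from the diff, followed by made-up example facts that exercise each rule in turn. The example records are illustrative only.

```python
ENTITY_TYPE_TO_MEMORY_TYPE = {
    "dependency": "permanent", "decision": "permanent", "class": "permanent",
    "function": "insight", "file": "insight", "error": "signal",
    "preference": "permanent", "person": "permanent", "project": "permanent",
}


def detect_memory_type(fact: dict) -> str:
    # 1. Entity type takes priority.
    entity_type = fact.get("entity_type", "")
    if entity_type in ENTITY_TYPE_TO_MEMORY_TYPE:
        return ENTITY_TYPE_TO_MEMORY_TYPE[entity_type]
    # 2. Tags.
    tags = set(fact.get("tags", []))
    if tags & {"preference", "person", "project", "milestone"}:
        return "permanent"
    if tags & {"error", "debug", "signal"}:
        return "signal"
    # 3. Source type.
    source_type = fact.get("source_type", "")
    if source_type in ("milestone", "decision"):
        return "permanent"
    if source_type == "error":
        return "signal"
    # 4. Substring keywords in subject / predicate / object (keyword lists from the diff).
    content = " ".join(fact.get(key, "") for key in ("subject", "predicate", "object")).lower()
    if any(kw in content for kw in ["用户", "偏好", "决策", "项目", "配置", "住在", "工作", "职"]):
        return "permanent"
    if any(kw in content for kw in ["错误", "报错", "失败", "异常", "bug"]):
        return "signal"
    return "default"


examples = [
    {"entity_type": "function", "tags": ["error"]},               # rule 1 wins over rule 2
    {"tags": ["debug"]},                                          # rule 2
    {"source_type": "milestone"},                                 # rule 3
    {"subject": "build", "predicate": "hit a", "object": "bug"},  # rule 4, signal keyword
    {"subject": "pandas", "predicate": "version", "object": "2.2"},
]
for fact in examples:
    print(detect_memory_type(fact))
```

Rule 4 is a plain substring test, so a fact whose text contains "debug" also matches the `bug` keyword and is classified as `signal`.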
@@ -239,6 +303,9 @@ def run_decay_pass(session_id: str | None = None):
         if not fid:
             continue
         if fid not in meta["facts"]:
+            mem_type = detect_memory_type(fact)
+            ttl_days = MEMORY_TYPE_TTL[mem_type]
+
             meta["facts"][fid] = {
                 "confidence": fact.get("confidence", 0.7),
                 "access_count": fact.get("access_count", 1),
@@ -246,7 +313,8 @@ def run_decay_pass(session_id: str | None = None):
                 "last_updated": fact.get("ts", _now_iso()),
                 "importance_score": compute_importance_score(fact, {}),
                 "decay_level": "none",
-                "ttl_days":
+                "ttl_days": ttl_days,
+                "memory_type": mem_type,
                 "expires_at": None,
                 "status": "active",
                 "contradiction_count": fact.get("contradiction_count", 0),
@@ -294,8 +362,12 @@ def run_decay_pass(session_id: str | None = None):
         new_level = compute_decay_level(decay_score)
         fact_meta["decay_level"] = new_level

-        # Update ttl_days
+        # Update the ttl_days setting (non-permanent types only)
         ttl_days = fact_meta.get("ttl_days", DEFAULT_HALF_LIFE_DAYS)
+        # permanent types never expire
+        if ttl_days is None:
+            fact_meta["decay_level"] = "none"
+            continue

         # Trigger forgetting
         if new_level == "forgotten" and old_level != "forgotten":