keepsake-memory 1.0.0__tar.gz
- keepsake_memory-1.0.0/LICENSE +21 -0
- keepsake_memory-1.0.0/MANIFEST.in +1 -0
- keepsake_memory-1.0.0/PKG-INFO +424 -0
- keepsake_memory-1.0.0/README.md +401 -0
- keepsake_memory-1.0.0/pyproject.toml +35 -0
- keepsake_memory-1.0.0/setup.cfg +4 -0
- keepsake_memory-1.0.0/src/keepsake/__init__.py +558 -0
- keepsake_memory-1.0.0/src/keepsake/attention.py +146 -0
- keepsake_memory-1.0.0/src/keepsake/consolidator.py +395 -0
- keepsake_memory-1.0.0/src/keepsake/embedder.py +155 -0
- keepsake_memory-1.0.0/src/keepsake/emotion.py +136 -0
- keepsake_memory-1.0.0/src/keepsake/forgetter.py +262 -0
- keepsake_memory-1.0.0/src/keepsake/py.typed +0 -0
- keepsake_memory-1.0.0/src/keepsake/splitter.py +436 -0
- keepsake_memory-1.0.0/src/keepsake/storage.py +1360 -0
- keepsake_memory-1.0.0/src/keepsake_memory.egg-info/PKG-INFO +424 -0
- keepsake_memory-1.0.0/src/keepsake_memory.egg-info/SOURCES.txt +18 -0
- keepsake_memory-1.0.0/src/keepsake_memory.egg-info/dependency_links.txt +1 -0
- keepsake_memory-1.0.0/src/keepsake_memory.egg-info/requires.txt +2 -0
- keepsake_memory-1.0.0/src/keepsake_memory.egg-info/top_level.txt +1 -0
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 j-zly
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
@@ -0,0 +1 @@
+include src/fragmented_memory/py.typed
@@ -0,0 +1,424 @@
+Metadata-Version: 2.4
+Name: keepsake-memory
+Version: 1.0.0
+Summary: Keepsake - full-entry memory system for Hermes Agent. On-demand storage of complete memories with semantic search.
+Author: j-zly
+License: MIT
+Keywords: hermes-agent,memory,rag,redisearch,llm
+Classifier: Development Status :: 4 - Beta
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: redis>=5.0
+Requires-Dist: jieba>=0.42
+Dynamic: license-file
+
+# Keepsake: Memory Plugin for Hermes Agent
+
+Keepsake automatically retrieves relevant memories and injects them into the conversation context on every user message.
+
+```text
+User: "How did we set up that React project structure last time?"
+                          ↓
+          Keepsake System ← Redis + RediSearch
+                          ↓
+┌────────────────────────────────────────────────────┐
+│ [1] User prefers TypeScript + Vite                 │
+│ [2] Previous projects used pinia state management  │
+│ [3] Backend suggested using .NET 10 implementation │
+└────────────────────────────────────────────────────┘
+                          ↓
+        Model directly uses memories to answer
+```
+
+## Features
+
+- **Full Entry Storage**: stores complete text as-is, no semantic splitting
+- **BM25 Full-Text Search**: works out of the box with no external API
+- **Optional Vector Search**: KNN via RediSearch (OpenAI / DashScope embedder)
+- **Time Decay**: newer entries rank higher (half-life configurable, 60 days by default)
+- **Sentiment Weighting**: emotional entries get priority
+- **User Feedback**: mark entries useful/useless to improve ranking
+- **Hot Topic Boost**: frequently discussed topics rank higher
+- **Entity Extraction**: auto-tags entries with entities (people, places, crypto tickers, domain terms) at store time; entities are searched alongside content text for higher recall
+- **Entity Co-occurrence**: automatically tracks which entities appear together and expands search to co-occurring entities for associative recall ("Python" → also finds entries mentioning "Django")
+- **Domain Dictionary**: jieba user dictionary auto-generated from corpus + synonym table, loaded on `/new` for better Chinese tokenization
+- **Workflow Lock**: set `keepsake:workflow_lock` in Redis to globally disable memory retrieval (e.g. during automated workflows)
+- **Skip Patterns**: define skip lists (via file) to avoid searching on trivial queries like "ok", "got it"
+- **On-Demand Storage**: only `memory(action='add')` stores data; no automatic per-turn archiving
+- **Search-Time Expiry**: `invalid_at` field in index; set a timestamp and the entry is filtered out at search time (no data loss, can be reverted)
+- **Auto Maintenance**: consolidation (keyword clustering + LLM summarization) and selective forgetting (multi-dimension low-value detection) run every 2h to keep storage tidy
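
As a rough illustration of the Time Decay feature, a half-life can be modelled as an exponential multiplier applied to a base relevance score. This is a minimal sketch, not Keepsake's actual code: the function name `time_decay` and the way it is multiplied into a score are assumptions; only the 60-day default comes from the feature list.

```python
def time_decay(age_days: float, half_life_days: float = 60.0) -> float:
    """Return a multiplier in (0, 1] that halves every `half_life_days` (hypothetical helper)."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    # Negative ages (clock skew) are clamped to zero, i.e. no decay.
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


# Hypothetical usage: a 60-day-old entry keeps half of its base score.
base_score = 2.0
print(base_score * time_decay(60))  # 1.0
print(base_score * time_decay(0))   # 2.0
```

With this shape, doubling the half-life slows the fade without changing the ranking of entries stored on the same day.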
+
+## Design Philosophy: Clean Memory for LLMs
+
+Keepsake stores **full, self-contained entries**, not fragmented conversation snippets. The key insight is that LLMs need complete context to make use of stored information. A fragment like "prefers TypeScript + Vite" without its surrounding context is useless; the full entry "User prefers TypeScript + Vite for frontend projects" is immediately actionable.
+
+| Mechanism | Implementation |
+|-----------|---------------|
+| Complete Context | Stores full text entries, no splitting |
+| Forgetting Curve | Time decay (60-day half-life): old memories fade naturally |
+| Emotion Deepens Memory | Emotional weight boost: intense moments stick |
+| Repetition Reinforces | Attention tracking + hot topic scoring |
+| Use It or Lose It | Feedback reinforcement (keepsake_feedback) |
+| Association & Analogy | Synonym discovery (Jaccard co-occurrence statistics): "deploy" ↔ "release" |
+| Entity Association | Entity co-occurrence tracking: entries mentioning "BTC" also recall "halving" without being synonyms |
+| Entity Tagging | Like the brain tagging memories with people/places/things: auto-extracted entities searched alongside content |
+| On-Demand Storage | No automatic archiving; only saves when explicitly told to (memory tool) |
+| Sleep Consolidation | Background maintenance every 2h: keyword-based clustering + LLM summarization |
+| Context Isolation | agent_id tagging: different identities, separate memories |
+| Fuzzy but Enough | BM25 full-text search: doesn't need an exact match to recall |
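
The "Jaccard co-occurrence statistics" row can be made concrete with a small sketch. It assumes each term maps to the set of entry IDs it appears in and treats two terms as synonym candidates when their sets overlap enough. The function names, the sample data, and the exact filtering rule are illustrative assumptions; the default values mirror the `synonym_jaccard_threshold` and `synonym_min_co_occurrence` settings in the README.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets of entry IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def synonym_candidates(postings: dict[str, set[str]],
                       threshold: float = 0.5,
                       min_co_occurrence: int = 3) -> list[tuple[str, str]]:
    """Return term pairs that co-occur often enough (hypothetical helper)."""
    terms = sorted(postings)
    pairs = []
    for i, first in enumerate(terms):
        for second in terms[i + 1:]:
            shared = postings[first] & postings[second]
            if (len(shared) >= min_co_occurrence
                    and jaccard(postings[first], postings[second]) >= threshold):
                pairs.append((first, second))
    return pairs


# Made-up postings: "deploy" and "release" share 3 of 5 entries (Jaccard 0.6).
postings = {
    "deploy": {"e1", "e2", "e3", "e4"},
    "release": {"e2", "e3", "e4", "e5"},
    "vite": {"e9"},
}
print(synonym_candidates(postings))  # [('deploy', 'release')]
```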
+
+In the default BM25-only mode, retrieval needs no vector database, no embedding API calls, and no LLM inference. Ranking relies on **pure statistical methods** running on Redis + RediSearch, the same signals the brain uses: frequency, recency, emotional salience, association, and feedback.
+
+## Requirements
+
+- **Python 3.10+**
+- **Hermes Agent 0.12+**: provides the `MemoryProvider` interface
+- **Redis 7+**: with the RediSearch module (v2.6+)
+- **jieba**: Chinese tokenization (auto-installed)
+- **Embedding API** (optional): OpenAI / DashScope / any compatible `/v1/embeddings` service
+
+## Installation
+
+```bash
+pip install keepsake-memory
+```
+
+Or install directly from GitHub:
+
+```bash
+pip install git+https://github.com/j-zly/keepsake.git
+```
+
+## Configuration
+
+Configuration precedence (high to low): **Environment variables > JSON config file > config.yaml inline > defaults**
+
+### 1. Configuration Methods
+
+There are three ways to configure Keepsake, listed in order of priority:
+
+1. **Environment Variables** (Highest precedence)
+   Set environment variables like `KEEPSAKE_REDIS_HOST`, `KEEPSAKE_REDIS_PASSWORD`, etc.
+
+2. **JSON Config File** (~/.config/keepsake/config.json)
+   A complete JSON configuration file for all settings.
+
+3. **Code Defaults** (Lowest precedence)
+   Default values defined in the code.
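
The precedence order above can be sketched as a simple layered merge. This is an illustrative loader, not the package's own: `load_config`, `DEFAULTS`, and `ENV_MAP` are hypothetical names, only three settings are covered, and the file is assumed to be plain JSON without comments.

```python
import json
import os
from pathlib import Path

# Hypothetical subset of defaults and environment-variable mappings.
DEFAULTS = {"redis_host": "127.0.0.1", "redis_port": 6379, "top_k": 5}
ENV_MAP = {
    "KEEPSAKE_REDIS_HOST": ("redis_host", str),
    "KEEPSAKE_REDIS_PORT": ("redis_port", int),
    "KEEPSAKE_TOP_K": ("top_k", int),
}


def load_config(path: str, env=None) -> dict:
    """Merge defaults < JSON file < environment variables."""
    env = os.environ if env is None else env
    config = dict(DEFAULTS)                      # lowest precedence
    file_path = Path(path).expanduser()
    if file_path.is_file():                      # middle: JSON config file
        config.update(json.loads(file_path.read_text(encoding="utf-8")))
    for var, (key, cast) in ENV_MAP.items():     # highest: environment
        if env.get(var):
            config[key] = cast(env[var])
    return config


# Hypothetical usage with an explicit environment mapping.
print(load_config("~/.config/keepsake/config.json", env={"KEEPSAKE_TOP_K": "8"})["top_k"])  # 8
```

Passing `env` explicitly keeps the function easy to test; in normal use it falls back to `os.environ`.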
+
+### 2. Complete Configuration Example
+
+Here's a comprehensive example of the configuration file `~/.config/keepsake/config.json` covering the main options. The `//` comments are annotations only: standard JSON does not allow comments, so remove them unless your loader accepts them.
+
+```json
+{
+  // Redis connection
+  "redis_host": "127.0.0.1",
+  "redis_port": 6379,
+  "redis_password": "",
+
+  // Search settings
+  "top_k": 5,
+  "candidate_k": 10,
+  "bm25_limit": 10,
+  "tag_filter": "",
+
+  // Skip patterns
+  "skip_min_length": 2,
+  "skip_patterns_file": "~/.config/keepsake/skip_patterns.txt",
+
+  // Time decay
+  "decay_half_days": 60,
+  "hot_topic_decay_half_days": 30,
+
+  // Ranking weights
+  "sentiment_boost_positive": 1.5,
+  "sentiment_boost_negative": 1.3,
+  "feedback_positive_boost": 1.3,
+  "feedback_negative_penalty": 0.5,
+  "hot_topic_boost": 1.2,
+
+  // Attention
+  "attention_boost_max": 1.5,
+  "attention_base_increment": 2.0,
+  "attention_emotion_factor": 1.5,
+
+  // Embedding (optional)
+  "embedder": {
+    "provider": "dashscope",
+    "api_key": "sk-xxx",
+    "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
+    "model": "text-embedding-v2"
+  },
+
+  // Auto maintenance
+  "consolidate_min_group": 2,
+  "consolidate_max_age_hours": 72,
+  "forget_max_age_days": 30,
+  "forget_dry_run": false,
+
+  // Agent isolation
+  "agent_id": "main-brain",
+  "is_primary": true,
+
+  // Synonym discovery
+  "synonym_min_word_freq": 10,
+  "synonym_jaccard_threshold": 0.5,
+  "synonym_min_co_occurrence": 3,
+
+  // Entity co-occurrence
+  "entity_cooc_top_n": 3,
+  "entity_cooc_min_count": 2,
+
+  // Emotion intensity factor
+  "emotion_intensity_factor": 0.4
+}
+```
+
+> Note: Leave `redis_password` empty to connect without authentication; if a password is set, the AUTH command is sent automatically.
+
+### 3. Environment Variables Reference
+
+| Environment Variable | Corresponding Config Item | Description |
+|----------------------|----------------------------|-------------|
+| `KEEPSAKE_REDIS_HOST` | `redis_host` | Redis server host |
+| `KEEPSAKE_REDIS_PORT` | `redis_port` | Redis server port |
+| `KEEPSAKE_REDIS_PASSWORD` | `redis_password` | Redis password for authentication |
+| `KEEPSAKE_TOP_K` | `top_k` | Number of final entries returned |
+| `KEEPSAKE_CANDIDATE_K` | `candidate_k` | Candidate entries count (for KNN) |
+| `KEEPSAKE_BM25_LIMIT` | `bm25_limit` | BM25 search candidate count |
+| `KEEPSAKE_TAG_FILTER` | `tag_filter` | Tag filtering (comma-separated) |
+| `KEEPSAKE_DECAY_HALF_DAYS` | `decay_half_days` | Time decay half-life (days) |
+| `KEEPSAKE_HOT_TOPIC_DECAY_HALF_DAYS` | `hot_topic_decay_half_days` | Hot topic time decay half-life (days) |
+| `KEEPSAKE_EMBED_CACHE_TTL` | `embed_cache_ttl` | Embedding cache TTL (seconds) |
+| `KEEPSAKE_EMBEDDER` | `embedder.provider` | Embedding provider (`openai`, `dashscope`) |
+| `KEEPSAKE_EMBEDDER_URL` | `embedder.base_url` | Embedding API endpoint |
+| `KEEPSAKE_EMBEDDER_MODEL` | `embedder.model` | Embedding model name |
+| `KEEPSAKE_CONSOLIDATE_MIN_GROUP` | `consolidate_min_group` | Minimum entries to trigger consolidation |
+| `KEEPSAKE_CONSOLIDATE_MAX_AGE_HOURS` | `consolidate_max_age_hours` | Minimum age (hours) before entries can be consolidated |
+| `KEEPSAKE_FORGET_MAX_AGE_DAYS` | `forget_max_age_days` | Number of days before entries might be forgotten |
+| `KEEPSAKE_FORGET_DRY_RUN` | `forget_dry_run` | Safe mode: `true` = count only, `false` = actually delete |
+| `KEEPSAKE_EMOTION_INTENSITY_FACTOR` | `emotion_intensity_factor` | Emotion intensity → weight coefficient (0=disabled, 1=max) |
+
+> Note: `KEEPSAKE_REDIS_PASSWORD` follows the same rule as `redis_password`: empty means no authentication, otherwise AUTH is sent.
+> Note: Changes to config.json take effect without restarting the process; send `/new` to reload them.
+
+### 4. Create Redis Index (first-time usage)
+
+The code creates the index automatically (`ensure_index()`); to create it manually, run:
+
+```bash
+redis-cli FT.CREATE idx:memories ON HASH PREFIX 1 "memory:frag:" SCHEMA \
+  content TEXT WEIGHT 1 \
+  tags TAG SEPARATOR "," \
+  category TAG SEPARATOR "," \
+  source TEXT WEIGHT 1 \
+  created TEXT WEIGHT 0 \
+  entry_type TAG SEPARATOR "," \
+  invalid_at TAG SEPARATOR "," \
+  entities TAG SEPARATOR "," \
+  embed_bin VECTOR FLAT 6 TYPE FLOAT32 DIM 1536 DISTANCE_METRIC COSINE
+```
+
+> The vector dimension (`DIM`) is adjusted dynamically to match the embedding model in use; the default is 1536.
+> For Docker: `docker run -d --name redis-stack -p 6379:6379 redis/redis-stack:latest`
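
For readers who prefer to create the index from Python, the same `FT.CREATE` call can be assembled as an argument list and sent through redis-py's generic `execute_command`. This is a sketch under assumptions: `build_ft_create_args` and `create_index` are hypothetical helpers (the package's own `ensure_index()` may work differently), and the schema is copied from the manual command above with only `DIM` parameterized.

```python
def build_ft_create_args(dim: int = 1536) -> list[str]:
    """Build the FT.CREATE arguments shown in the manual command (hypothetical helper)."""
    return [
        "FT.CREATE", "idx:memories", "ON", "HASH",
        "PREFIX", "1", "memory:frag:",
        "SCHEMA",
        "content", "TEXT", "WEIGHT", "1",
        "tags", "TAG", "SEPARATOR", ",",
        "category", "TAG", "SEPARATOR", ",",
        "source", "TEXT", "WEIGHT", "1",
        "created", "TEXT", "WEIGHT", "0",
        "entry_type", "TAG", "SEPARATOR", ",",
        "invalid_at", "TAG", "SEPARATOR", ",",
        "entities", "TAG", "SEPARATOR", ",",
        # "6" is the number of vector attribute tokens that follow.
        "embed_bin", "VECTOR", "FLAT", "6",
        "TYPE", "FLOAT32", "DIM", str(dim), "DISTANCE_METRIC", "COSINE",
    ]


def create_index(host: str = "127.0.0.1", port: int = 6379, dim: int = 1536):
    """Send the command to a running Redis with RediSearch (requires redis-py)."""
    import redis  # imported lazily so the builder works without the package

    client = redis.Redis(host=host, port=port)
    return client.execute_command(*build_ft_create_args(dim))


# Inspect the command for a 1024-dimensional model without touching Redis.
print(" ".join(build_ft_create_args(1024)))
```

Keeping the argument builder separate from the network call makes it possible to review or log the exact command before running it.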
+
+### 5. Hermes Configuration
+
+Enable in `~/.hermes/config.yaml`:
+
+```yaml
+memory:
+  provider: keepsake
+```
+
+If `embedder` is not configured, Keepsake runs in BM25-only full-text search mode.
+
+Environment variables are also supported and take the highest precedence:
+
+```bash
+export KEEPSAKE_REDIS_HOST=127.0.0.1
+export KEEPSAKE_REDIS_PORT=6379
+export KEEPSAKE_TOP_K=5
+export KEEPSAKE_EMBEDDER=dashscope
+export KEEPSAKE_EMBEDDER_MODEL=text-embedding-v2
+export OPENAI_API_KEY=sk-xxx  # embedder API key
+```
+
+### 6. Workflow Lock
+
+Temporarily disable memory retrieval during automated workflows (such as batch processing):
+
+```bash
+# Lock (3600s TTL)
+redis-cli SET keepsake:workflow_lock 1 EX 3600
+
+# Unlock
+redis-cli DEL keepsake:workflow_lock
+```
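
The lock check itself can be pictured as a single key lookup before retrieval. The sketch below is an assumption about how such a guard might look, not the package's code: `retrieval_enabled` is a hypothetical helper, and `FakeRedis` is a tiny in-memory stand-in that mimics the redis-py methods `set`, `delete`, and `exists` so the example runs without a server.

```python
WORKFLOW_LOCK_KEY = "keepsake:workflow_lock"


def retrieval_enabled(client) -> bool:
    """Return False while the workflow lock key exists.

    `client` is any object with a redis-py style `exists(key)` that
    returns the number of matching keys.
    """
    return client.exists(WORKFLOW_LOCK_KEY) == 0


class FakeRedis:
    """Minimal in-memory stand-in for a Redis client (illustration only, ignores TTL)."""

    def __init__(self):
        self._keys = set()

    def set(self, key, value, ex=None):
        self._keys.add(key)
        return True

    def delete(self, key):
        removed = int(key in self._keys)
        self._keys.discard(key)
        return removed

    def exists(self, key):
        return int(key in self._keys)


client = FakeRedis()
client.set(WORKFLOW_LOCK_KEY, 1, ex=3600)   # lock
print(retrieval_enabled(client))            # False
client.delete(WORKFLOW_LOCK_KEY)            # unlock
print(retrieval_enabled(client))            # True
```

With a real `redis.Redis` client the same function should work unchanged, and the `EX` TTL means a forgotten lock expires on its own.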
+
+### 7. Skip Patterns File
+
+Create a file (one pattern per line, `#` for comments):
+
+```text
+# ~/.config/keepsake/skip_patterns.txt
+好的
+嗯
+对
+是
+哦
+可以
+没错
+ok
+okay
+yes
+yeah
+```
+
+Then reference it in config.json:
+
+```json
+{
+  "skip_min_length": 2,
+  "skip_patterns_file": "~/.config/keepsake/skip_patterns.txt"
+}
+```
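
A minimal sketch of how such a file could be parsed and applied follows. It assumes exact, case-insensitive matching and that queries shorter than `skip_min_length` are skipped; `parse_skip_patterns` and `should_skip` are hypothetical names, and the real matching rules may differ.

```python
def parse_skip_patterns(text: str) -> set[str]:
    """Parse one pattern per line, ignoring blank lines and `#` comments."""
    patterns = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.add(line.lower())
    return patterns


def should_skip(query: str, patterns: set[str], min_length: int = 2) -> bool:
    """Skip retrieval for very short queries or exact skip-list matches."""
    normalized = query.strip().lower()
    return len(normalized) < min_length or normalized in patterns


sample_file = """# ~/.config/keepsake/skip_patterns.txt
好的
ok
yeah
"""
patterns = parse_skip_patterns(sample_file)
print(should_skip("OK", patterns))                               # True
print(should_skip("How did we set up the project?", patterns))   # False
```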
+
+### 8. Restart Gateway
+
+```bash
+# CLI mode: restarting the session is sufficient
+# Gateway mode: restart the process
+```
+
+## Configuration Reference
+
+| Config Item | Environment Variable | Default Value | Description |
+|-------------|---------------------|---------------|-------------|
+| `redis_host` | `KEEPSAKE_REDIS_HOST` | `127.0.0.1` | Redis address |
+| `redis_port` | `KEEPSAKE_REDIS_PORT` | `6379` | Redis port |
+| `top_k` | `KEEPSAKE_TOP_K` | `5` | Number of final entries returned |
+| `candidate_k` | `KEEPSAKE_CANDIDATE_K` | `10` | Candidate entries count (for KNN) |
+| `tag_filter` | `KEEPSAKE_TAG_FILTER` | `""` | Tag filtering (comma-separated) |
+| `bm25_limit` | `KEEPSAKE_BM25_LIMIT` | `10` | BM25 search candidate count |
+| `decay_half_days` | `KEEPSAKE_DECAY_HALF_DAYS` | `60` | Time decay half-life (days) |
+| `embed_cache_ttl` | `KEEPSAKE_EMBED_CACHE_TTL` | `3600` | Embedding cache TTL (seconds) |
+| `sentiment_boost_positive` | - | `1.5` | Positive entry weight multiplier |
+| `sentiment_boost_negative` | - | `1.3` | Negative entry weight multiplier |
+| `feedback_positive_boost` | - | `1.3` | Positive feedback bonus weight |
+| `feedback_negative_penalty` | - | `0.5` | Negative feedback penalty coefficient |
+| `hot_topic_boost` | - | `1.2` | Hot topic weighting multiplier |
+| `embedder.provider` | `KEEPSAKE_EMBEDDER` | `openai` | `openai` / `dashscope` |
+| `embedder.api_key` | `OPENAI_API_KEY` | - | Embedding API key |
+| `embedder.base_url` | `KEEPSAKE_EMBEDDER_URL` | `https://api.openai.com/v1` | API endpoint |
+| `embedder.model` | `KEEPSAKE_EMBEDDER_MODEL` | `text-embedding-3-small` | Embedding model name |
+| `consolidate_min_group` | - | `2` | Minimum entries to trigger consolidation |
+| `consolidate_max_age_hours` | - | `72` | Minimum age (hours) before consolidation |
+| `forget_max_age_days` | - | `30` | Max age (days) before deletion |
+| `forget_dry_run` | - | `true` | Safe mode: `true` = count only, `false` = delete |
+| `agent_id` | - | `""` | Agent identity tag for isolation (e.g. `"main-brain"`) |
+| `is_primary` | - | `false` | `true` = sees all entries; `false` = only tagged ones |
+| `hot_topic_decay_half_days` | - | `30` | Hot topic time decay half-life (days) |
+| `emotion_intensity_factor` | - | `0.4` | Emotion intensity → weight coefficient |
+| `skip_min_length` | - | `2` | Minimum query length to trigger search |
+| `skip_patterns_file` | - | `""` | Path to skip patterns file |
+| `attention_boost_max` | - | `1.5` | Max attention weighting value |
+| `attention_base_increment` | - | `2.0` | Base attention increment per mention |
+| `attention_emotion_factor` | - | `1.5` | Emotion amplification for attention |
+| `synonym_min_word_freq` | - | `10` | Min frequency for synonym candidate |
+| `synonym_jaccard_threshold` | - | `0.5` | Jaccard threshold for synonym detection |
+| `synonym_min_co_occurrence` | - | `3` | Min co-occurrence for synonym detection |
+| `entity_cooc_top_n` | - | `3` | Number of co-occurring entities to expand search |
+| `entity_cooc_min_count` | - | `2` | Min co-occurrence for entity association |
+
+> Ranking weight parameters such as `sentiment_*`, `feedback_*`, and `hot_topic_*` can currently be set only in the JSON config file, not through environment variables. Set a weight to `1.0` to disable that dimension.
+
+### Embedding Models and Dimensions
+
+| Model | Dimensions |
+|-------|------------|
+| OpenAI text-embedding-3-small | 1536 |
+| OpenAI text-embedding-3-large | 3072 |
+| OpenAI text-embedding-ada-002 | 1536 |
+| DashScope text-embedding-v2 | 1536 |
+| DashScope text-embedding-v3 | 1024 |
+
+Dimensions are detected automatically, so switching models does not require reconfiguration.
+
+### Synonym Table
+
+Synonyms are stored in the Redis hash `keepsake:synonyms` and expanded at search time to improve recall:
+
+```bash
+redis-cli HSET keepsake:synonyms setup '["install","configure","deploy","setup"]'
+redis-cli HSET keepsake:synonyms fix '["fix","modify","correct","repair","solve"]'
+```
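
Search-time expansion over this hash might look roughly like the sketch below. It assumes the hash has already been read into a plain dict of JSON strings (for example via redis-py `hgetall` with `decode_responses=True`); `expand_terms` is a hypothetical helper and the real query builder may differ.

```python
import json


def expand_terms(terms: list[str], synonym_hash: dict[str, str]) -> list[str]:
    """Append each term's synonyms, keeping first-seen order and dropping duplicates."""
    expanded: list[str] = []
    for term in terms:
        synonyms = json.loads(synonym_hash.get(term, "[]"))
        for candidate in [term, *synonyms]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# Values mirror the HSET example: each field holds a JSON array of synonyms.
synonym_hash = {"setup": '["install","configure","deploy","setup"]'}
print(expand_terms(["setup", "redis"], synonym_hash))
# ['setup', 'install', 'configure', 'deploy', 'redis']
```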
+
+## Verification
+
+Check the logs after startup:
+
+```
+Memory provider 'entryed' registered (0 tools)
+entryed: connected (session=xxx, top_k=5, tag_filter=(none))
+entryed: BM25-only mode (no embedder configured)
+```
+
+## Architecture
+
+```
+┌────────────────────────────────────────────────────────┐
+│                   User sends message                   │
+└──────────────────┬─────────────────────────────────────┘
+                   │
+    ┌──────────────▼───────────────┐
+    │ prefetch()                   │ ← Automatically triggered on every user message
+    │              ↓               │
+    │ Workflow Lock?               │ ← Checks keepsake:workflow_lock
+    │              ↓               │
+    │ Skip patterns?               │ ← Length / exact match against skip list
+    │              ↓               │
+    │ BM25 Full-Text Search        │ ← Default, zero cost, searches full entries
+    │ (KNN Vector search)          │ ← Optional (needs embedder)
+    │ Entity co-occurrence         │ ← Expand query with co-occurring entities
+    │              ↓               │
+    │ Six-dimensional Re-ranking   │ ← Similarity × Time decay
+    │                              │   × Emotion × Feedback × Hot Topic × Attention
+    │              ↓               │
+    │ Top N Injected into Context  │ ← Full entries returned as-is
+    └──────────────┬───────────────┘
+                   │
+    ┌──────────────▼───────────────┐
+    │ Model Response               │ ← Entries used directly (complete text)
+    └──────────────────────────────┘
+                   │
+    ┌──────────────▼───────────────┐
+    │ on_memory_write()            │ ← Only on memory(action='add')
+    │ Stores Full Text             │ ← Complete entry, no splitting
+    │ Entity Extraction            │ ← jieba + regex → entities TAG field
+    │ Entity Co-occur.             │ ← Track entity pairs in ZSET
+    │ Attention Track              │ ← Extract keywords, increase attention score
+    │              ↓               │
+    │ Stored in Redis              │ ← Available for next retrieval as full text
+    └──────────────────────────────┘
+                   │
+    ┌──────────────▼───────────────┐
+    │ [cron] Every 2h              │ ← Background maintenance
+    │ ① Multi-level Consolidation  │ ← Same topic → keyword clustering → LLM → level+1
+    │ ② Selective Forgetting       │ ← Age>30d + no feedback + low emotion + low attention → delete
+    └──────────────────────────────┘
+```
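
The six-dimensional re-ranking step in the diagram multiplies a similarity score by five weighting factors. The sketch below shows that product in its simplest form. The `Candidate` class, its field names, and the sample numbers are illustrative assumptions; how Keepsake actually derives each factor is not shown in this README.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A retrieved entry with its six ranking factors (hypothetical structure)."""
    text: str
    similarity: float
    time_decay: float = 1.0
    emotion: float = 1.0
    feedback: float = 1.0
    hot_topic: float = 1.0
    attention: float = 1.0

    def score(self) -> float:
        # A factor of 1.0 leaves the score unchanged, i.e. disables that dimension.
        return (self.similarity * self.time_decay * self.emotion
                * self.feedback * self.hot_topic * self.attention)


def rerank(candidates: list[Candidate], top_k: int = 5) -> list[Candidate]:
    """Sort by the combined score and keep the best `top_k` entries."""
    return sorted(candidates, key=lambda c: c.score(), reverse=True)[:top_k]


candidates = [
    Candidate("old but liked", similarity=0.8, time_decay=0.5, feedback=1.3),
    Candidate("fresh", similarity=0.6),
    Candidate("disliked", similarity=0.9, feedback=0.5),
]
print([c.text for c in rerank(candidates, top_k=2)])  # ['fresh', 'old but liked']
```

Because the factors multiply, a strong penalty in one dimension (such as negative feedback) can outweigh a high raw similarity.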
+
+## License
+
+MIT