@parkgogogo/openclaw-reflection 0.1.0 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/INSTALL.md CHANGED
@@ -1,78 +1,90 @@
1
- # OpenClaw Install Guide
1
+ # OpenClaw Plugin Install Skill
2
2
 
3
- This file is for OpenClaw itself to follow when installing `@parkgogogo/openclaw-reflection`.
3
+ Use this file when installing `@parkgogogo/openclaw-reflection`.
4
4
 
5
- ## Goal
5
+ ## Role
6
6
 
7
- Install the packaged plugin, enable it under the `openclaw-reflection` plugin id, provide a valid LLM config, and restart the gateway so the plugin becomes active.
7
+ Install the package, collect only missing config, write the plugin entry, validate config, and restart the gateway.
8
8
 
9
- ## Install From npm
9
+ ## Ask Only If Missing
10
10
 
11
- Use the published package, not a linked workspace checkout:
11
+ - `workspaceDir`
12
+ Recommendation: use OpenClaw's current workspace for the active profile.
13
+ Only ask if the current workspace is unknown or the operator wants a different one.
14
+ - `llm.baseURL`
15
+ - `llm.apiKey`
16
+ - `llm.model`
17
+ - whether the profile already uses `plugins.allow`
12
18
 
13
- ```bash
14
- openclaw plugins install @parkgogogo/openclaw-reflection@0.1.0
15
- ```
19
+ ## Defaults
20
+
21
+ - `bufferSize`: `50`
22
+ - `logLevel`: `info`
23
+ - `memoryGate.enabled`: `true`
24
+ - `memoryGate.windowSize`: `10`
25
+ - `consolidation.enabled`: `false`
26
+ - `consolidation.schedule`: `0 2 * * *`
27
+ - recommended model: `x-ai/grok-4.1-fast`
28
+
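For reference, the default `consolidation.schedule` value is a standard five-field cron expression; a quick sketch of how its fields read, assuming standard cron semantics:

```python
# Split the default consolidation schedule into its five cron fields.
minute, hour, day_of_month, month, day_of_week = "0 2 * * *".split()

# "*" means "every"; fixed values pin the time.
print(f"minute={minute} hour={hour} -> runs daily at 02:00")
```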
29
+ ## Rules
16
30
 
17
- If you are installing from a local release artifact instead:
31
+ - Install from npm, not a linked checkout.
32
+ - Write config under `plugins.entries.openclaw-reflection`.
33
+ - Do not leave `llm.baseURL`, `llm.apiKey`, or `llm.model` empty.
34
+ - Only add `openclaw-reflection` to `plugins.allow` if the profile already uses an allowlist.
35
+
36
+ ## Steps
37
+
38
+ ### 1. Install
18
39
 
19
40
  ```bash
20
- openclaw plugins install ./parkgogogo-openclaw-reflection-0.1.0.tgz
41
+ openclaw plugins install @parkgogogo/openclaw-reflection@0.1.0
21
42
  ```
22
43
 
23
- ## Write Config
44
+ ### 2. Write Config
24
45
 
25
- Add this under `plugins.entries.openclaw-reflection` in the active OpenClaw profile config:
46
+ Use the current OpenClaw workspace as `workspaceDir` unless the operator explicitly wants another workspace.
26
47
 
27
- ```json
48
+ ```jsonc
28
49
  {
29
50
  "enabled": true,
30
51
  "config": {
31
- "workspaceDir": "/absolute/path/to/agent-workspace",
52
+ "workspaceDir": "/absolute/path/to/current-openclaw-workspace",
32
53
  "bufferSize": 50,
33
54
  "logLevel": "info",
34
55
  "llm": {
35
- "baseURL": "https://api.openai.com/v1",
56
+ "baseURL": "https://openrouter.ai/api/v1",
36
57
  "apiKey": "YOUR_API_KEY",
37
- "model": "gpt-4.1-mini"
58
+ "model": "x-ai/grok-4.1-fast"
38
59
  },
39
60
  "memoryGate": {
40
61
  "enabled": true,
41
62
  "windowSize": 10
42
63
  },
43
64
  "consolidation": {
44
- "enabled": true,
65
+ "enabled": false,
45
66
  "schedule": "0 2 * * *"
46
67
  }
47
68
  }
48
69
  }
49
70
  ```
50
71
 
51
- Make sure `plugins.allow` includes `openclaw-reflection` if the profile uses an allowlist.
72
+ If the profile uses `plugins.allow`, ensure it contains:
52
73
 
53
- ## Validate And Restart
74
+ ```json
75
+ ["openclaw-reflection"]
76
+ ```
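For orientation, the two keys live side by side under `plugins` in the profile config. A minimal sketch; the surrounding layout is assumed from the paths used in this guide:

```jsonc
{
  "plugins": {
    "allow": ["openclaw-reflection"],  // only when the profile already uses an allowlist
    "entries": {
      "openclaw-reflection": { "enabled": true, "config": { /* see step 2 */ } }
    }
  }
}
```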
54
77
 
55
- Run:
78
+ ### 3. Validate
56
79
 
57
80
  ```bash
58
81
  openclaw config validate --json
59
- openclaw gateway run --force --verbose
60
82
  ```
61
83
 
62
- In another shell, confirm the gateway is live:
84
+ Stop if validation fails.
85
+
86
+ ### 4. Restart
63
87
 
64
88
  ```bash
65
- curl --fail http://127.0.0.1:PORT/health
89
+ openclaw gateway restart
66
90
  ```
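Steps 3 and 4 can be chained so the restart only happens after a clean validation. A sketch using only the `openclaw` commands shown above; it returns `False` when the CLI is not on `PATH`:

```python
import shutil
import subprocess

def validate_then_restart() -> bool:
    """Run `openclaw config validate --json`; restart the gateway only on success."""
    if shutil.which("openclaw") is None:
        return False  # openclaw CLI not installed in this environment
    validate = subprocess.run(["openclaw", "config", "validate", "--json"])
    if validate.returncode != 0:
        return False  # stop here, per step 3
    subprocess.run(["openclaw", "gateway", "restart"], check=True)
    return True

print(validate_then_restart())
```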
67
-
68
- ## Expected Success Signals
69
-
70
- - The gateway starts without plugin registration errors
71
- - The plugin log contains `Plugin registered successfully, all hooks active`
72
- - `GET /health` returns JSON containing `"ok": true`
73
-
74
- ## Do Not Do This
75
-
76
- - Do not use `openclaw plugins install -l` unless you are actively developing this repository
77
- - Do not leave `llm.baseURL`, `llm.apiKey`, or `llm.model` empty
78
- - Do not configure the plugin under the old id `reflection-plugin`
package/README.md CHANGED
@@ -1,14 +1,29 @@
1
1
  # OpenClaw Reflection
2
2
 
3
+ <p align="center">
4
+ <img src="./assets/openclaw-reflection-logo.png" alt="OpenClaw Reflection logo" width="180" />
5
+ </p>
6
+
7
+ <p align="center"><strong>Make OpenClaw's native memory system sharper without replacing it.</strong></p>
8
+
3
9
  ![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-111111?style=flat-square)
4
10
  ![TypeScript](https://img.shields.io/badge/TypeScript-5.x-3178c6?style=flat-square)
5
- ![memoryGate 16/16](https://img.shields.io/badge/memoryGate-16%2F16%20passed-2ea043?style=flat-square)
6
- ![writer guardian 16/16](https://img.shields.io/badge/writer%20guardian-16%2F16%20passed-2ea043?style=flat-square)
11
+ ![memory_gate 18 cases](https://img.shields.io/badge/memory_gate-18%20benchmark%20cases-2ea043?style=flat-square)
12
+ ![write_guardian 14 cases](https://img.shields.io/badge/write_guardian-14%20benchmark%20cases-2ea043?style=flat-square)
7
13
 
8
- **Make OpenClaw's native memory system sharper without replacing it.**
14
+ Chinese version: [README.zh-CN.md](./README.zh-CN.md)
9
15
 
10
16
  OpenClaw Reflection is an additive layer on top of OpenClaw's built-in Markdown memory system. It captures message flow, keeps thread noise out of long-term memory, writes durable knowledge into the same human-readable memory files OpenClaw already uses, and periodically consolidates them so your agent gets sharper over time instead of messier.
11
17
 
18
+ ## Current Scope
19
+
20
+ Reflection currently supports:
21
+
22
+ - a single agent
23
+ - multiple sessions for that same agent
24
+
25
+ Reflection does not currently support multi-agent memory coordination or per-agent routing across multiple agents in one OpenClaw setup.
26
+
12
27
  ## Built On OpenClaw Memory
13
28
 
14
29
  OpenClaw memory is already workspace-native: the source of truth is Markdown files in the agent workspace, not a hidden database. In the official model, daily logs live under `memory/YYYY-MM-DD.md`, while `MEMORY.md` is the curated long-term layer.
@@ -19,17 +34,15 @@ Reflection builds on top of that system instead of replacing it.
19
34
  - It does **not** require replacing OpenClaw's default `memory-core`
20
35
  - It does **not** take over the active `plugins.slots.memory` role
21
36
  - It works by listening to message hooks and curating the same workspace memory files
37
+ - It analyzes and curates `USER.md`, `MEMORY.md`, `TOOLS.md`, `IDENTITY.md`, and `SOUL.md` based on conversation flow
22
38
 
23
39
  In practice, that means low migration risk and low conceptual overhead: you keep OpenClaw's native MEMORY workflow, and Reflection enhances the capture, filtering, routing, and consolidation steps around it.
24
40
 
25
41
  ## Why People Install It
26
42
 
27
- Most chat memory systems fail in one of two ways:
43
+ OpenClaw's core long-term files such as `USER.md`, `TOOLS.md`, `IDENTITY.md`, and `SOUL.md` are hard to improve continuously in the default setup.
28
44
 
29
- - they forget too much, so you keep re-explaining the same context
30
- - they remember too much, so temporary thread noise pollutes long-term memory
31
-
32
- Reflection is built to fix both.
45
+ Reflection is built to solve that:

33
46
 
34
47
  - Keep stable user preferences and collaboration habits
35
48
  - Preserve durable shared context across sessions
@@ -37,15 +50,20 @@ Reflection is built to fix both.
37
50
  - Refuse one-off tasks, active thread chatter, and misrouted writes
38
51
  - Periodically consolidate memory so it stays usable
39
52
 
53
+ ## Core Mechanism
54
+
55
+ Reflection uses LLM analysis over recent conversation context and adds two control points: `memory_gate` and `write_guardian`.
56
+
57
+ - `memory_gate` analyzes the conversation and decides which durable fact, if any, should be written and which target file it belongs to
58
+ - `write_guardian` acts as the write gate and follows OpenClaw's file responsibilities to decide whether a write should be accepted, rejected, or merged into the target file
59
+
40
60
  ## Install
41
61
 
42
62
  ### Recommended for users: install the plugin package
43
63
 
44
- OpenClaw can install plugins directly from a package source. That is the right distribution path for Reflection, because users should not need to clone the repository or run `pnpm install` just to use the plugin.
64
+ For an install flow that OpenClaw itself can follow, including which config questions to ask first, see [INSTALL.md](./INSTALL.md).
45
65
 
46
- For a step-by-step installation flow that OpenClaw can follow directly, see [INSTALL.md](./INSTALL.md).
47
-
48
- Registry install after publishing:
66
+ Install from the registry:
49
67
 
50
68
  ```bash
51
69
  openclaw plugins install <npm-spec>
@@ -61,25 +79,25 @@ openclaw plugins install @parkgogogo/openclaw-reflection
61
79
 
62
80
  Put the following under `plugins.entries.openclaw-reflection` in your OpenClaw config:
63
81
 
64
- ```json
82
+ ```jsonc
65
83
  {
66
- "enabled": true,
84
+ "enabled": true, // Enable the plugin entry
67
85
  "config": {
68
- "workspaceDir": "/absolute/path/to/your-agent-workspace",
69
- "bufferSize": 50,
70
- "logLevel": "info",
86
+ "workspaceDir": "/absolute/path/to/your-agent-workspace", // Workspace where MEMORY.md, USER.md, TOOLS.md, IDENTITY.md, and SOUL.md live
87
+ "bufferSize": 50, // Session buffer size used to collect recent messages
88
+ "logLevel": "info", // Runtime log verbosity: debug, info, warn, or error
71
89
  "llm": {
72
- "baseURL": "https://api.openai.com/v1",
73
- "apiKey": "YOUR_API_KEY",
74
- "model": "gpt-4.1-mini"
90
+ "baseURL": "https://openrouter.ai/api/v1", // OpenAI-compatible provider base URL
91
+ "apiKey": "YOUR_API_KEY", // Provider API key used for analysis and writing
92
+ "model": "x-ai/grok-4.1-fast" // Recommended model for plugin runtime
75
93
  },
76
94
  "memoryGate": {
77
- "enabled": true,
78
- "windowSize": 10
95
+ "enabled": true, // Enable durable-memory filtering before any write
96
+ "windowSize": 10 // Number of recent messages included in memory_gate analysis
79
97
  },
80
98
  "consolidation": {
81
- "enabled": true,
82
- "schedule": "0 2 * * *"
99
+ "enabled": false, // Keep disabled by default; enable only if you want scheduled cleanup
100
+ "schedule": "0 2 * * *" // Cron expression used when consolidation is enabled
83
101
  }
84
102
  }
85
103
  }
@@ -89,6 +107,13 @@ Put the following under `plugins.entries.openclaw-reflection` in your OpenClaw c
89
107
 
90
108
  Once the gateway restarts, Reflection will begin listening to `message_received` and `before_message_write`, then writing curated memory files into your configured `workspaceDir`.
91
109
 
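As a mental model for the `bufferSize` and `memoryGate.windowSize` settings above (not the plugin's actual implementation): the session buffer behaves like a bounded queue, and `memory_gate` analyzes only its tail:

```python
from collections import deque

BUFFER_SIZE = 50   # config.bufferSize
WINDOW_SIZE = 10   # config.memoryGate.windowSize

buffer = deque(maxlen=BUFFER_SIZE)      # oldest messages fall off automatically
for i in range(120):                    # simulate a long session
    buffer.append(f"msg-{i}")

window = list(buffer)[-WINDOW_SIZE:]    # slice seen by memory_gate analysis
print(len(buffer), window[0], window[-1])
```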
110
+ ### Observability command
111
+
112
+ - Reflection now writes an independent write_guardian audit log to:
113
+ - `<workspaceDir>/.openclaw-reflection/write-guardian.log.jsonl`
114
+ - Registered command: `/openclaw-reflection`
115
+ - Returns the most recent 10 write_guardian behaviors (written/refused/failed/skipped), including decision, target file, and reason.
116
+
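The audit log is JSON Lines, one record per write_guardian decision. The exact field names below are an assumption based on the description above (decision, target file, reason); a sketch of tallying the last 10 entries:

```python
import json
from collections import Counter

# Synthetic records in the described shape; the real file lives at
# <workspaceDir>/.openclaw-reflection/write-guardian.log.jsonl
sample = "\n".join([
    '{"decision": "written", "target": "USER.md", "reason": "durable preference"}',
    '{"decision": "refused", "target": "MEMORY.md", "reason": "thread noise"}',
    '{"decision": "skipped", "target": "TOOLS.md", "reason": "duplicate fact"}',
])

entries = [json.loads(line) for line in sample.splitlines() if line]
tally = Counter(entry["decision"] for entry in entries[-10:])
print(dict(tally))
```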
92
117
  ## What You Get
93
118
 
94
119
  | You want | Reflection gives you |
@@ -96,42 +121,30 @@ Once the gateway restarts, Reflection will begin listening to `message_received`
96
121
  | A memory system you can inspect | Plain Markdown files you can open, edit, diff, and version |
97
122
  | Better continuity across sessions | Durable facts routed into the right long-term file |
98
123
  | Less memory pollution | Gatekeeping that refuses temporary or misrouted content |
99
- | A system that stays usable over time | Scheduled consolidation for existing memory files |
100
-
101
- ## Why This Beats Naive Memory
102
-
103
- | Naive memory | Reflection |
104
- | -------------------------------- | ------------------------------------------------ |
105
- | Appends whatever seems memorable | Filters for durable signal before writing |
106
- | Hides memory in a black box | Stores memory in readable Markdown files |
107
- | Mixes all facts together | Routes facts into purpose-specific files |
108
- | Lets bad writes accumulate | Adds writer guarding and scheduled consolidation |
124
+ | A system that stays usable over time | Optional scheduled consolidation for existing memory files |
109
125
 
110
126
  ## How It Works
111
127
 
112
- ```mermaid
113
- flowchart LR
114
- A["Incoming conversation"] --> B["Session buffer"]
115
- B --> C["memoryGate"]
116
- C -->|durable fact| D["Writer guardian"]
117
- C -->|thread noise| E["No write"]
118
- D --> F["MEMORY.md / USER.md / SOUL.md / IDENTITY.md / TOOLS.md"]
119
- F --> G["Scheduled consolidation"]
120
- ```
128
+ ![OpenClaw Reflection flowchart](./assets/memory-flowchart.png)
121
129
 
122
130
  In practice, the pipeline is simple:
123
131
 
124
132
  1. Reflection captures conversation context from OpenClaw hooks.
125
- 2. `memoryGate` decides whether the candidate fact is durable enough to keep.
133
+ 2. `memory_gate` decides whether the candidate fact is durable enough to keep.
126
134
  3. A file-specific guardian either rewrites the target memory file or refuses the write.
127
- 4. Scheduled consolidation keeps `MEMORY.md`, `USER.md`, `SOUL.md`, and `TOOLS.md` compact over time.
135
+ 4. When enabled, scheduled consolidation keeps `MEMORY.md`, `USER.md`, `SOUL.md`, and `TOOLS.md` compact over time.
128
136
 
129
137
  ## Proof, Not Just Promises
130
138
 
131
- This repo already includes offline eval coverage for the two hardest parts of the system:
139
+ The active default offline benchmark currently includes:
140
+
141
+ - `memory_gate`: `18` benchmark cases
142
+ - `write_guardian`: `14` benchmark cases
132
143
 
133
- - [`memoryGate`: 16/16 passed on V2](./evals/results/2026-03-08-memory-gate-v2-16-of-16.md)
134
- - [`writer guardian`: 16/16 passed on V2](./evals/results/2026-03-08-writer-guardian-v2-16-of-16.md)
144
+ The most recent archived result snapshots in this repo are:
145
+
146
+ - [`memory_gate`: 16/16 passed on V2](./evals/results/2026-03-08-memory-gate-v2-16-of-16.md)
147
+ - [`write_guardian`: 16/16 passed on V2](./evals/results/2026-03-08-write-guardian-v2-16-of-16.md)
135
148
 
136
149
  These evals focus on the failure modes that make long-term memory systems unreliable:
137
150
 
@@ -163,32 +176,69 @@ These evals focus on the failure modes that make long-term memory systems unreli
163
176
  | `llm.model` | `gpt-4.1-mini` | Model used for analysis and consolidation |
164
177
  | `memoryGate.enabled` | `true` | Enable long-term memory filtering |
165
178
  | `memoryGate.windowSize` | `10` | Message window used during analysis |
166
- | `consolidation.enabled` | `true` | Enable scheduled consolidation |
179
+ | `consolidation.enabled` | `false` | Enable scheduled consolidation |
167
180
  | `consolidation.schedule` | `0 2 * * *` | Cron expression for consolidation |
168
181
 
169
182
  ## Built For
170
183
 
171
184
  - personal agents that should get better over weeks, not just one session
185
+ - single-agent OpenClaw setups with many sessions
172
186
  - teams that want memory with reviewability and version control
173
187
  - OpenClaw users who do not want a black-box memory store
174
188
  - agents that need stronger continuity without turning every chat into permanent history
175
189
 
176
190
  ## Development And Evals
177
191
 
192
+ Recommended model for real plugin use:
193
+
194
+ - `x-ai/grok-4.1-fast`
195
+
196
+ The development eval setup in this repository currently uses:
197
+
198
+ - eval model: `x-ai/grok-4.1-fast`
199
+ - judge model: `openai/gpt-5.4`
200
+
178
201
  ```bash
179
202
  pnpm run typecheck
180
203
  pnpm run eval:memory-gate
181
- pnpm run eval:writer-guardian
204
+ pnpm run eval:write-guardian
182
205
  pnpm run eval:all
206
+
207
+ node evals/run.mjs \
208
+ --suite memory-gate \
209
+ --models-config evals/models.json \
210
+ --baseline grok-fast \
211
+ --output evals/results/$(date +%F)-memory-gate-matrix.json \
212
+ --markdown-output evals/results/$(date +%F)-memory-gate-matrix.md
183
213
  ```
184
214
 
215
+ `evals/models.json` defines only the comparison matrix. The shared provider endpoint and key still come from `EVAL_BASE_URL` and `EVAL_API_KEY`. JSON output is the source of truth for automation and history, while the Markdown artifact is the readable leaderboard summary.
216
+
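The schema of `evals/models.json` is not shown in this README; purely as an illustration of "matrix only, no credentials", a file consistent with the `--baseline grok-fast` flag above might look like:

```jsonc
{
  // Hypothetical shape: ids are local labels, models are provider model names.
  "models": [
    { "id": "grok-fast", "model": "x-ai/grok-4.1-fast" },
    { "id": "qwen-flash", "model": "qwen/qwen3.5-flash-02-23" }
  ]
  // No baseURL or apiKey here: those come from EVAL_BASE_URL / EVAL_API_KEY.
}
```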
185
217
  More eval details: [evals/README.md](./evals/README.md)
186
218
 
187
- Fast packaged-plugin regression on a reused local OpenClaw profile:
219
+ ## Model Selection
188
220
 
189
- ```bash
190
- pnpm run e2e:openclaw-plugin
191
- ```
221
+ Benchmark date: `2026-03-09`
222
+ Scope: `memory_gate` only, `18` cases, shared OpenRouter-compatible `EVAL_*` route
223
+
224
+ | Model | Pass/Total | Accuracy | Errors (P/S/E) | Recommendation | Best For |
225
+ | --- | --- | --- | --- | --- | --- |
226
+ | `x-ai/grok-4.1-fast` | `17/18` | `94.4%` | `0/0/0` | Default baseline | Daily eval baseline |
227
+ | `qwen/qwen3.5-flash-02-23` | `17/18` | `94.4%` | `0/1/0` | Good backup option | Cost-sensitive cross-checks |
228
+ | `google/gemini-2.5-flash-lite` | `16/18` | `88.9%` | `0/0/0` | Fast iteration candidate | Cheap prompt iteration |
229
+ | `inception/mercury-2` | `11/18` | `61.1%` | `0/0/0` | Not recommended as default | Exploratory comparisons only |
230
+ | `minimax/minimax-m2.5` | `9/18` | `50.0%` | `0/0/0` | Not recommended as default | Occasional sanity checks only |
231
+ | `openai/gpt-4o-mini` | `4/18` | `22.2%` | `18/0/0` | Not recommended on current route | Avoid on current OpenRouter path |
232
+
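The accuracy column is just Pass/Total rounded to one decimal place; recomputing it from the table's figures:

```python
# Pass/Total pairs copied from the benchmark table above.
rows = {
    "x-ai/grok-4.1-fast": (17, 18),
    "qwen/qwen3.5-flash-02-23": (17, 18),
    "google/gemini-2.5-flash-lite": (16, 18),
    "inception/mercury-2": (11, 18),
    "minimax/minimax-m2.5": (9, 18),
    "openai/gpt-4o-mini": (4, 18),
}
for model, (passed, total) in rows.items():
    print(f"{model}: {passed / total:.1%}")
```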
233
+ How to choose:
234
+
235
+ - Default to `x-ai/grok-4.1-fast` because it had the best overall stability in this round with no internal errors.
236
+ - Use `qwen/qwen3.5-flash-02-23` as the strongest backup when you want similar accuracy but can tolerate one schema failure in this benchmark.
237
+ - Use `google/gemini-2.5-flash-lite` for cheaper, faster prompt iteration when slightly lower boundary accuracy is acceptable.
238
+ - Avoid `inception/mercury-2` and `minimax/minimax-m2.5` as defaults because they frequently collapse `SOUL`, `IDENTITY`, or `NO_WRITE` boundaries into the wrong bucket.
239
+ - Avoid `openai/gpt-4o-mini` on the current OpenRouter/Azure-backed route because all `18` cases surfaced provider-side structured-output errors.
240
+
241
+ Source artifact: [2026-03-09-memory-gate-openrouter-model-benchmark.md](./evals/results/2026-03-09-memory-gate-openrouter-model-benchmark.md)
192
242
 
193
243
  ## Links
194
244
 
@@ -0,0 +1,219 @@
1
+ # OpenClaw Reflection
2
+
3
+ <p align="center">
4
+ <img src="./assets/openclaw-reflection-logo.png" alt="OpenClaw Reflection logo" width="180" />
5
+ </p>
6
+
7
+ <p align="center"><strong>Make OpenClaw's Markdown memory cleaner, more stable, and more sustainable without replacing the native memory system.</strong></p>
8
+
9
+ English version: [README.md](./README.md)
10
+
11
+ ![OpenClaw Plugin](https://img.shields.io/badge/OpenClaw-Plugin-111111?style=flat-square)
12
+ ![TypeScript](https://img.shields.io/badge/TypeScript-5.x-3178c6?style=flat-square)
13
+ ![memory_gate 18 cases](https://img.shields.io/badge/memory_gate-18%20benchmark%20cases-2ea043?style=flat-square)
14
+ ![write_guardian 14 cases](https://img.shields.io/badge/write_guardian-14%20benchmark%20cases-2ea043?style=flat-square)
15
+
16
+ OpenClaw Reflection is an enhancement layer on top of OpenClaw's native Markdown memory. It listens to message flow, filters out thread noise, writes genuinely long-lived information back into OpenClaw's core memory files, and periodically consolidates those files so long-term use does not make them messier.
17
+
18
+ ## Current Scope
19
+
20
+ Reflection currently supports:
21
+
22
+ - a single agent
23
+ - multiple sessions for that same agent
24
+
25
+ It does not yet support memory coordination across multiple agents, or per-agent routing of long-term memory in a multi-agent OpenClaw setup.
26
+
27
+ ## Built On OpenClaw's Native Memory
28
+
29
+ OpenClaw's memory is already workspace-native: the source of truth is Markdown files in the agent workspace, not a hidden database. In the official model, daily records usually live under `memory/YYYY-MM-DD.md`, while `MEMORY.md` is the curated long-term layer.
30
+
31
+ Reflection is positioned as an enhancement, not a replacement:
32
+
33
+ - it introduces no new private memory store
34
+ - it does not require replacing OpenClaw's default `memory-core`
35
+ - it does not take over `plugins.slots.memory`
36
+ - it captures, filters, routes, and consolidates directly around the existing Markdown memory files
37
+ - based on the conversation, it analyzes and curates `USER.md`, `MEMORY.md`, `TOOLS.md`, `IDENTITY.md`, and `SOUL.md`
38
+
39
+ This means low migration cost, low conceptual overhead, and files that are easy to review by hand and keep under version control.
40
+
41
+ ## Why Install It
42
+
43
+ In OpenClaw's default setup, the core `USER.md`, `TOOLS.md`, `IDENTITY.md`, and `SOUL.md` files are hard to improve iteratively on their own.
44
+
45
+ Reflection is built to solve exactly that:
46
+
47
+ - keep stable user preferences and collaboration habits
48
+ - preserve long-term context that stays valuable across sessions
49
+ - split long-term memory across `MEMORY.md`, `USER.md`, `SOUL.md`, `IDENTITY.md`, and `TOOLS.md`
50
+ - refuse one-off tasks, short-lived thread chatter, and misrouted content
51
+ - periodically consolidate long-term memory to keep files from bloating and drifting
52
+
53
+ ## Core Mechanism
54
+
55
+ Reflection uses LLM analysis over recent conversation and adds two tools: `memory_gate` and `write_guardian`.
56
+
57
+ - `memory_gate` analyzes the conversation to decide which facts should be recorded and which file each belongs to
58
+
59
+ - `write_guardian` acts as the write gate: following OpenClaw's official guidance, it decides whether a write should happen and consolidates the facts into the target file
60
+
61
+ ## Install
62
+
63
+ ### Recommended: install the packaged plugin
64
+
65
+ For a more detailed install guide, see [INSTALL.md](./INSTALL.md). That file is now written as an install skill for OpenClaw itself to execute, including which config questions to ask the operator before installing.
66
+
67
+ Manual install:
68
+
69
+ ```bash
70
+ openclaw plugins install @parkgogogo/openclaw-reflection
71
+ ```
72
+
73
+ ### Add The Plugin Config
74
+
75
+ Put the following under `plugins.entries.openclaw-reflection` in your OpenClaw profile:
76
+
77
+ ```jsonc
78
+ {
79
+ "enabled": true, // Enable this plugin entry
80
+ "config": {
81
+ "workspaceDir": "/absolute/path/to/your-agent-workspace", // Agent workspace directory that holds the long-term memory files
82
+ "bufferSize": 50, // Session buffer size used to retain recent message context
83
+ "logLevel": "info", // Runtime log level: debug, info, warn, or error
84
+ "llm": {
85
+ "baseURL": "https://openrouter.ai/api/v1", // OpenAI-compatible provider base URL
86
+ "apiKey": "YOUR_API_KEY", // Provider API key used for analysis and write decisions
87
+ "model": "x-ai/grok-4.1-fast" // Recommended model for the plugin runtime
88
+ },
89
+ "memoryGate": {
90
+ "enabled": true, // Enable filtering before long-term memory writes
91
+ "windowSize": 10 // Recent-message window used by memory_gate analysis
92
+ },
93
+ "consolidation": {
94
+ "enabled": false, // Disabled by default; enable only if you want scheduled consolidation
95
+ "schedule": "0 2 * * *" // Cron expression used once consolidation is enabled
96
+ }
97
+ }
98
+ }
99
+ ```
100
+
101
+ ### Restart The OpenClaw Gateway
102
+
103
+ Once the gateway restarts, Reflection starts listening to `message_received` and `before_message_write`, and writes curated long-term information into your configured `workspaceDir`.
104
+
105
+ ### Observability Command
106
+
107
+ - Reflection now writes a separate audit log for write_guardian:
108
+ - `<workspaceDir>/.openclaw-reflection/write-guardian.log.jsonl`
109
+ - Registered command: `/openclaw-reflection`
110
+ - Returns the most recent 10 write_guardian behaviors (written/refused/failed/skipped), including the decision, target file, and reason.
111
+
112
+ ## What You Get
113
+
114
+ | What you want | What Reflection gives you |
115
+ | ------------------------ | ---------------------------------------------- |
116
+ | A memory system you can inspect and edit | Plain Markdown files you can open, diff, and version |
117
+ | More stable cross-session continuity | Long-term facts routed into the right file |
118
+ | Less memory pollution | Filtering of temporary thread content and misrouted writes |
119
+ | Still maintainable after long-term use | Optional scheduled consolidation that keeps files from degrading |
120
+
121
+ ## How It Works
122
+
123
+ ![OpenClaw Reflection flowchart](./assets/memory-flowchart.png)
124
+
125
+ The pipeline is straightforward:
126
+
127
+ 1. Reflection captures conversation context from OpenClaw hooks.
128
+ 2. `memory_gate` judges whether a candidate fact is durable and stable enough to keep.
129
+ 3. A file-specific `write_guardian` decides whether to write to the target file, rewriting its content when needed.
130
+ 4. When enabled, `consolidation` periodically tidies the long-term files to control redundancy and stale information.
131
+
132
+ ## Eval Coverage
133
+
134
+ We maintain a small, manually verified dataset and use x-ai/grok-4.1-fast to optimize the prompts and refine `memory_gate` and `write_guardian`.
135
+
136
+ The current default offline benchmark includes:
137
+
138
+ - `memory_gate`: `18` benchmark cases
139
+ - `write_guardian`: `14` benchmark cases
140
+
141
+ The most recent archived result snapshots in this repo are:
142
+
143
+ - [`memory_gate`: 16/16 passed on V2](./evals/results/2026-03-08-memory-gate-v2-16-of-16.md)
144
+ - [`write_guardian`: 16/16 passed on V2](./evals/results/2026-03-08-write-guardian-v2-16-of-16.md)
145
+
146
+ These evals focus on:
147
+
148
+ - refusing current-thread noise
149
+ - preventing user facts from being written to the wrong file
150
+ - preserving `SOUL` continuity rules
151
+ - correctly replacing outdated `IDENTITY` metadata
152
+ - keeping `TOOLS.md` limited to local tool mappings instead of treating it as a tool registry
153
+
154
+ ## Long-Term Memory Files
155
+
156
+ | File | Purpose |
157
+ | ------------- | ---------------------------------------------- |
158
+ | `MEMORY.md` | Durable shared context, key conclusions, long-term background facts |
159
+ | `USER.md` | Stable user preferences, collaboration style, personal background that stays helpful |
160
+ | `SOUL.md` | Assistant principles, boundaries, continuity rules |
161
+ | `IDENTITY.md` | Explicit identity metadata such as name, temperament, appearance description |
162
+ | `TOOLS.md` | Environment-specific tool aliases, endpoints, device names, local tool mappings |
163
+
164
+ ## Development And Eval Commands
165
+
166
+ Recommended model for real plugin use:
167
+
168
+ - `x-ai/grok-4.1-fast`
169
+
170
+ The development eval setup in this repository currently uses:
171
+
172
+ - eval model: `x-ai/grok-4.1-fast`
173
+ - judge model: `openai/gpt-5.4`
174
+
175
+ ```bash
176
+ pnpm run typecheck
177
+ pnpm run eval:memory-gate
178
+ pnpm run eval:write-guardian
179
+ pnpm run eval:all
180
+
181
+ node evals/run.mjs \
182
+ --suite memory-gate \
183
+ --models-config evals/models.json \
184
+ --baseline grok-fast \
185
+ --output evals/results/$(date +%F)-memory-gate-matrix.json \
186
+ --markdown-output evals/results/$(date +%F)-memory-gate-matrix.md
187
+ ```
188
+
189
+ `evals/models.json` only defines the multi-model comparison matrix; the shared provider endpoint and key still come from `EVAL_BASE_URL` and `EVAL_API_KEY`. The JSON output is the baseline for automation and history tracking, while the Markdown output is the human-readable leaderboard summary.
190
+
191
+ More eval details: [evals/README.md](./evals/README.md)
192
+
193
+ ## Model Selection
194
+
195
+ Benchmark date: `2026-03-09`
196
+ Scope: `memory_gate` only, `18` cases, shared OpenRouter-compatible `EVAL_*` route
197
+
198
+ | Model | Pass/Total | Accuracy | Errors (P/S/E) | Recommendation | Best For |
199
+ | --- | --- | --- | --- | --- | --- |
200
+ | `x-ai/grok-4.1-fast` | `17/18` | `94.4%` | `0/0/0` | Default baseline | Daily eval baseline |
201
+ | `qwen/qwen3.5-flash-02-23` | `17/18` | `94.4%` | `0/1/0` | Strong backup | Cost-sensitive cross-checks |
202
+ | `google/gemini-2.5-flash-lite` | `16/18` | `88.9%` | `0/0/0` | Cheap, fast candidate | Low-cost prompt iteration |
203
+ | `inception/mercury-2` | `11/18` | `61.1%` | `0/0/0` | Not recommended as default | Exploratory comparisons only |
204
+ | `minimax/minimax-m2.5` | `9/18` | `50.0%` | `0/0/0` | Not recommended as default | Occasional sanity checks only |
205
+ | `openai/gpt-4o-mini` | `4/18` | `22.2%` | `18/0/0` | Not recommended on current route | Avoid on current OpenRouter path |
206
+
207
+ How to choose:
208
+
209
+ - Default to `x-ai/grok-4.1-fast`: it had the best overall stability in this round, with no internal errors.
210
+ - For similar accuracy where one schema failure is acceptable, `qwen/qwen3.5-flash-02-23` is the strongest backup.
211
+ - If low cost and fast iteration matter more, use `google/gemini-2.5-flash-lite`, accepting that it is slightly weaker on some `TOOLS` boundaries.
212
+ - Do not use `inception/mercury-2` or `minimax/minimax-m2.5` as default baselines; they often classify `SOUL`, `IDENTITY`, or `NO_WRITE` into the wrong bucket.
213
+ - Do not pick `openai/gpt-4o-mini` on the current OpenRouter/Azure route: all `18` cases hit provider-side structured-output errors.
214
+
215
+ Source results: [2026-03-09-memory-gate-openrouter-model-benchmark.md](./evals/results/2026-03-09-memory-gate-openrouter-model-benchmark.md)
216
+
217
+ ## Links
218
+
219
+ - OpenClaw plugin docs: [docs.openclaw.ai/tools/plugin](https://docs.openclaw.ai/tools/plugin)
Binary file
@@ -54,7 +54,7 @@
54
54
  "properties": {
55
55
  "enabled": {
56
56
  "type": "boolean",
57
- "default": true
57
+ "default": false
58
58
  },
59
59
  "schedule": {
60
60
  "type": "string",
package/package.json CHANGED
@@ -1,13 +1,15 @@
1
1
  {
2
2
  "name": "@parkgogogo/openclaw-reflection",
3
- "version": "0.1.0",
3
+ "version": "0.1.3",
4
4
  "description": "OpenClaw plugin that enhances native Markdown memory with filtering, curation, and consolidation",
5
5
  "type": "module",
6
6
  "main": "src/index.ts",
7
7
  "files": [
8
+ "assets/",
8
9
  "src/",
9
10
  "openclaw.plugin.json",
10
11
  "README.md",
12
+ "README.zh-CN.md",
11
13
  "INSTALL.md"
12
14
  ],
13
15
  "repository": {
@@ -19,12 +21,13 @@
19
21
  "url": "https://github.com/parkgogogo/openclaw-reflection/issues"
20
22
  },
21
23
  "scripts": {
22
- "build": "tsc --noEmit",
24
+ "build": "tsc -p tsconfig.json",
23
25
  "clean": "rm -rf logs",
26
+ "test": "pnpm run build && node --test tests/*.test.mjs",
24
27
  "typecheck": "tsc --noEmit",
25
28
  "e2e:openclaw-plugin": "bash scripts/e2e-openclaw-plugin.sh",
26
29
  "eval:memory-gate": "pnpm exec tsc && node evals/run.mjs --suite memory-gate",
27
- "eval:writer-guardian": "pnpm exec tsc && node evals/run.mjs --suite writer-guardian",
30
+ "eval:write-guardian": "pnpm exec tsc && node evals/run.mjs --suite write-guardian",
28
31
  "eval:all": "pnpm exec tsc && node evals/run.mjs --suite all"
29
32
  },
30
33
  "keywords": [
package/src/config.ts CHANGED
@@ -21,7 +21,7 @@ const DEFAULT_CONFIG: PluginConfig = {
21
21
  windowSize: 10,
22
22
  },
23
23
  consolidation: {
24
- enabled: true,
24
+ enabled: false,
25
25
  schedule: "0 2 * * *",
26
26
  },
27
27
  };