@oyasmi/pipiclaw 0.4.0 → 0.5.0
- package/README.md +43 -5
- package/dist/agent.d.ts.map +1 -1
- package/dist/agent.js +156 -57
- package/dist/agent.js.map +1 -1
- package/dist/context.d.ts +18 -0
- package/dist/context.d.ts.map +1 -1
- package/dist/context.js +26 -0
- package/dist/context.js.map +1 -1
- package/dist/index.d.ts +7 -3
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +6 -2
- package/dist/index.js.map +1 -1
- package/dist/llm-json.d.ts +7 -0
- package/dist/llm-json.d.ts.map +1 -0
- package/dist/llm-json.js +77 -0
- package/dist/llm-json.js.map +1 -0
- package/dist/markdown-sections.d.ts +6 -0
- package/dist/markdown-sections.d.ts.map +1 -0
- package/dist/markdown-sections.js +34 -0
- package/dist/markdown-sections.js.map +1 -0
- package/dist/memory-candidates.d.ts +21 -0
- package/dist/memory-candidates.d.ts.map +1 -0
- package/dist/memory-candidates.js +126 -0
- package/dist/memory-candidates.js.map +1 -0
- package/dist/memory-consolidation.d.ts.map +1 -1
- package/dist/memory-consolidation.js +28 -49
- package/dist/memory-consolidation.js.map +1 -1
- package/dist/memory-files.d.ts +3 -0
- package/dist/memory-files.d.ts.map +1 -1
- package/dist/memory-files.js +51 -0
- package/dist/memory-files.js.map +1 -1
- package/dist/memory-lifecycle.d.ts +9 -0
- package/dist/memory-lifecycle.d.ts.map +1 -1
- package/dist/memory-lifecycle.js +66 -0
- package/dist/memory-lifecycle.js.map +1 -1
- package/dist/memory-recall.d.ts +29 -0
- package/dist/memory-recall.d.ts.map +1 -0
- package/dist/memory-recall.js +218 -0
- package/dist/memory-recall.js.map +1 -0
- package/dist/prompt-builder.d.ts.map +1 -1
- package/dist/prompt-builder.js +7 -2
- package/dist/prompt-builder.js.map +1 -1
- package/dist/session-memory-files.d.ts +2 -0
- package/dist/session-memory-files.d.ts.map +1 -0
- package/dist/session-memory-files.js +2 -0
- package/dist/session-memory-files.js.map +1 -0
- package/dist/session-memory.d.ts +22 -0
- package/dist/session-memory.d.ts.map +1 -0
- package/dist/session-memory.js +274 -0
- package/dist/session-memory.js.map +1 -0
- package/dist/sidecar-worker.d.ts +27 -0
- package/dist/sidecar-worker.d.ts.map +1 -0
- package/dist/sidecar-worker.js +105 -0
- package/dist/sidecar-worker.js.map +1 -0
- package/dist/sub-agents.d.ts +10 -0
- package/dist/sub-agents.d.ts.map +1 -1
- package/dist/sub-agents.js +90 -0
- package/dist/sub-agents.js.map +1 -1
- package/dist/tools/index.d.ts +3 -0
- package/dist/tools/index.d.ts.map +1 -1
- package/dist/tools/index.js +2 -0
- package/dist/tools/index.js.map +1 -1
- package/dist/tools/subagent.d.ts +6 -0
- package/dist/tools/subagent.d.ts.map +1 -1
- package/dist/tools/subagent.js +127 -12
- package/dist/tools/subagent.js.map +1 -1
- package/docs/improve-memory/design.md +537 -0
- package/docs/improve-memory/interfaces-and-tests.md +473 -0
- package/docs/improve-memory/spec.md +357 -0
- package/docs/memory-rfc.md +7 -1
- package/docs/proj-review.md +188 -0
- package/docs/test-supplementation-plan.md +553 -0
- package/package.json +3 -1
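@@ -0,0 +1,553 @@
# Pipiclaw Test Supplementation Plan

## Context

Pipiclaw currently has 22 test files. The tool layer and memory-file I/O are well covered, but **the core orchestration layer (agent.ts), the message delivery layer (delivery.ts), the command extension (command-extension.ts), and the DingTalk protocol layer (dingtalk.ts) have no tests at all**. The memory lifecycle (memory-lifecycle.ts) and the consolidation logic (memory-consolidation.ts) have only basic mock-based verification. **There are no integration tests or end-to-end smoke tests.** This plan aims to close these key gaps.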

### Current test coverage

**Covered (22 files, unit tests):**
- Tool layer: read, bash, edit, write, truncate, attach, shell-escape
- Memory file I/O: memory-files, memory-recall, memory-candidates, session-memory
- Memory lifecycle: memory-lifecycle (only 2 basic hook tests)
- Sub-agents: sub-agents (discovery/parsing), subagent tool (execution/context injection)
- Config/runtime: config-loader, context, model-utils, prompt-builder, sandbox
- Infrastructure: commands, events, store, log, sidecar-worker, llm-json, tools-index

**Not covered:**

| Module | Lines | Risk | Notes |
|------|------|----------|------|
| `agent.ts` | 907 | **High** | ChannelRunner: run(), event subscription, command dispatch, steer/followup |
| `delivery.ts` | 247 | **High** | Throttled delivery state machine; progress/finalize/silent modes |
| `command-extension.ts` | 169 | **Medium** | /session, /model, /new, /compact commands |
| `dingtalk.ts` | 881 | **High** | Connection lifecycle, message deduplication, authorization, Card API, message routing |

**Under-covered:**

| Module | Existing tests | Missing |
|------|----------|------|
| `memory-lifecycle.ts` | 2 hook tests | Threshold triggering, background queue, error recovery |
| `memory-consolidation.ts` | Verified only through mocks | Merge/cleanup/folding logic against real file I/O |

---

## Prerequisite: shared test infrastructure

### `test/helpers/fake-bot.ts`: DingTalkBot test double

Provides a configurable DingTalkBot stand-in that records every method call with its arguments and lets tests control return values through `configure()`.

```typescript
// Signature sketch only; method bodies are omitted in this plan.
export class FakeDingTalkBot {
  calls: { method: string; args: unknown[] }[] = [];
  private returnValues: Map<string, unknown> = new Map();

  configure(method: string, returnValue: unknown): void;

  async streamToCard(channelId: string, content: string): Promise<boolean>;
  async finalizeCard(channelId: string, content: string): Promise<boolean>;
  async finalizeExistingCard(channelId: string, content: string): Promise<boolean>;
  discardCard(channelId: string): void;
  async sendPlain(channelId: string, text: string): Promise<boolean>;
  enqueueEvent(event: DingTalkEvent): boolean;
}
```
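One way the signatures above might be filled in is sketched below. This is an assumption about the eventual helper, not the project's code: the private `record` helper, the default return value of `true`, and the reduced `DingTalkEvent` shape are all hypothetical.

```typescript
// Hypothetical minimal event shape; the real DingTalkEvent type lives in the project.
type DingTalkEvent = { channelId: string; text: string };

class FakeDingTalkBot {
  calls: { method: string; args: unknown[] }[] = [];
  private returnValues = new Map<string, unknown>();

  // Override what the named method returns on later calls.
  configure(method: string, returnValue: unknown): void {
    this.returnValues.set(method, returnValue);
  }

  // Record the call, then return the configured value (assumed default: true).
  private record(method: string, args: unknown[]): boolean {
    this.calls.push({ method, args });
    return this.returnValues.has(method) ? Boolean(this.returnValues.get(method)) : true;
  }

  async streamToCard(channelId: string, content: string): Promise<boolean> {
    return this.record("streamToCard", [channelId, content]);
  }
  async finalizeCard(channelId: string, content: string): Promise<boolean> {
    return this.record("finalizeCard", [channelId, content]);
  }
  async finalizeExistingCard(channelId: string, content: string): Promise<boolean> {
    return this.record("finalizeExistingCard", [channelId, content]);
  }
  discardCard(channelId: string): void {
    this.record("discardCard", [channelId]);
  }
  async sendPlain(channelId: string, text: string): Promise<boolean> {
    return this.record("sendPlain", [channelId, text]);
  }
  enqueueEvent(event: DingTalkEvent): boolean {
    return this.record("enqueueEvent", [event]);
  }
}
```

A test would typically call `configure("sendPlain", false)` to simulate a delivery failure and then assert on `calls` to check method order and arguments.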

### `test/helpers/fake-store.ts`: ChannelStore test double

```typescript
// Signature sketch only; method bodies are omitted in this plan.
export class FakeChannelStore {
  logged: { method: string; args: unknown[] }[] = [];
  async logMessage(channelId: string, message: unknown): Promise<void>;
  async logBotResponse(channelId: string, text: string, ts: string): Promise<void>;
  async logSubAgentRun(channelId: string, run: unknown): Promise<void>;
  getChannelDir(channelId: string): string;
}
```

### `test/helpers/fake-extension-api.ts`: ExtensionAPI test double

```typescript
// Signature sketch only; method bodies are omitted in this plan.
export class FakeExtensionAPI {
  registeredCommands = new Map<string, { description: string; handler: Function }>();
  sentMessages: unknown[] = [];
  handlers = new Map<string, Function>();

  registerCommand(name: string, config: { description: string; handler: Function }): void;
  sendMessage(msg: unknown): void;
  on(event: string, handler: Function): void;
}
```

### `test/helpers/fixtures.ts`: shared fixture factories

```typescript
// Signature sketch only; function bodies are omitted in this plan.
export function createFakeEvent(overrides?: Partial<DingTalkEvent>): DingTalkEvent;
export function createTempWorkspace(prefix?: string): string;
export function setupChannelFiles(dir: string, content?: {
  memory?: string; session?: string; history?: string;
}): void;
```

**Design principles:**
- Reuse the project's existing `FakeWorker` and `createAssistantMessage` patterns (from `subagent-phase1.test.ts`)
- Reuse the existing `createFakePi` pattern (from `memory-lifecycle.test.ts`)
- Reuse the existing temp-directory pattern: `mkdtempSync` plus `afterEach` cleanup

---

## Phase 1: unit tests for uncovered modules

The three Phase 1 test files do not depend on each other and can be implemented in parallel.

### 1.1 `test/delivery.test.ts`: message delivery state machine

**Under test:** `src/delivery.ts`, `ChannelDeliveryController` (exposed through `createDingTalkContext`)

**Mock strategy:**
- `FakeDingTalkBot` records all card/sendPlain calls
- `FakeChannelStore` records logBotResponse calls
- `vi.useFakeTimers()` controls the 800 ms throttle window (`MIN_UPDATE_INTERVAL_MS`)

**Test cases:**

#### describe("buildContext")

| # | Case | Description | Key assertion |
|---|------|------|----------|
| 1 | Returns the full DingTalkContext interface | Verify that the returned object contains all 10 methods | Every method exists and is of type function |

#### describe("respond / progress streaming")

| # | Case | Description | Key assertion |
|---|------|------|----------|
| 2 | Accumulates progress text | respond("A"), respond("B"), then flush | `streamToCard` receives `"A\n\nB"` |
| 3 | Ignores whitespace-only text | respond(" ") | `streamToCard` is not called |
| 4 | Throttles to 800 ms | Two respond calls less than 800 ms apart | Only one streamToCard call within the window |
| 5 | Delivers immediately once the window has passed | Advance 800 ms or more, then respond again | The second call is delivered immediately |
| 6 | Writes to the store when shouldLog=true | respond("text", true) | `store.logBotResponse` is called |
| 7 | Skips the store when shouldLog=false | respond("text", false) | `store.logBotResponse` is not called |

#### describe("respondPlain")

| # | Case | Description | Key assertion |
|---|------|------|----------|
| 8 | Sends the final message | respondPlain("final") | `bot.sendPlain` is called and returns true |
| 9 | Returns false on failure | sendPlain configured to return false | respondPlain returns false |
| 10 | Blocks further progress | respond after respondPlain | `streamToCard` is no longer called |

#### describe("replaceMessage / deleteMessage")

| # | Case | Description | Key assertion |
|---|------|------|----------|
| 11 | replaceMessage sets finalize-with-fallback | replaceMessage("text"), flush | `finalizeCard` is called |
| 12 | deleteMessage sets silent mode | deleteMessage(), flush | `discardCard` is called |

#### describe("flush / close")

| # | Case | Description | Key assertion |
|---|------|------|----------|
| 13 | flush returns immediately when nothing is pending | Call flush directly | Completes synchronously |
| 14 | flush waits for an in-flight delivery | flush right after respond | flush resolves after the delivery completes |
| 15 | close blocks later operations | respond/respondPlain after close | Both are no-ops |
| 16 | close is idempotent | Call close twice | No error |

#### describe("error recovery")

| # | Case | Description | Key assertion |
|---|------|------|----------|
| 17 | Discards the card when streamToCard fails | streamToCard returns false | `discardCard` is called |
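To make cases 2 to 5 concrete, here is a small stand-in for the throttled progress path. It is not the code in `src/delivery.ts`: the class name `ThrottledProgress`, the injected `deliver` and `now` callbacks, and the exact send rule are assumptions for illustration. Only the 800 ms value and the `"\n\n"` join come from the plan.

```typescript
// Hypothetical stand-in for the progress path of a throttled delivery controller.
const MIN_UPDATE_INTERVAL_MS = 800; // value taken from the plan

class ThrottledProgress {
  private parts: string[] = [];
  private lastSentAt = Number.NEGATIVE_INFINITY;
  private dirty = false;
  private deliver: (text: string) => void;
  private now: () => number;

  constructor(deliver: (text: string) => void, now: () => number) {
    this.deliver = deliver;
    this.now = now;
  }

  // Accumulate non-blank text; deliver only if the throttle window has elapsed.
  respond(text: string): void {
    if (text.trim() === "") return;
    this.parts.push(text);
    this.dirty = true;
    if (this.now() - this.lastSentAt >= MIN_UPDATE_INTERVAL_MS) this.send();
  }

  // Deliver whatever is still pending, regardless of the window.
  flush(): void {
    if (this.dirty) this.send();
  }

  private send(): void {
    this.deliver(this.parts.join("\n\n"));
    this.lastSentAt = this.now();
    this.dirty = false;
  }
}
```

Injecting the clock plays the role that `vi.useFakeTimers()` plays in the planned tests: the test advances time explicitly instead of sleeping.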

---

### 1.2 `test/command-extension.test.ts`: command extension

**Under test:** `src/command-extension.ts`, `createCommandExtension`

**Mock strategy:**
- `FakeExtensionAPI` captures registerCommand / sendMessage calls
- `vi.fn()` stubs the callbacks in options (getCurrentModel, switchModel, etc.)
- Build a `FakeExtensionCommandContext` (with newSession, compact, sessionManager.getSessionId)

**Test cases:**

#### describe("registration")

| # | Case | Key assertion |
|---|------|----------|
| 1 | Registers the four commands session/model/new/compact | registerCommand is called 4 times, each with the correct name |

#### describe("/session")

| # | Case | Key assertion |
|---|------|----------|
| 2 | Renders session statistics | sendMessage content includes the Session ID, model ref, token count, and cost |

#### describe("/model")

| # | Case | Key assertion |
|---|------|----------|
| 3 | With no argument, shows the current model and the available list | Output includes "Current model" and the model list |
| 4 | An exact match switches the model | `switchModel` is called; the message contains "已切换模型" ("model switched") |
| 5 | An ambiguous reference reports an error | Output contains "匹配到多个模型" ("multiple models matched") |
| 6 | An unknown reference reports an error | Output contains "未找到模型" ("model not found") |

#### describe("/new")

| # | Case | Key assertion |
|---|------|----------|
| 7 | Creates a new session successfully | `refreshSessionResources` is called; output includes the new session ID |
| 8 | Creation is cancelled | `refreshSessionResources` is not called; output contains "已取消" ("cancelled") |

#### describe("/compact")

| # | Case | Key assertion |
|---|------|----------|
| 9 | Runs compaction | Output includes tokensBefore and summary |
| 10 | Passes custom instructions | compact receives customInstructions |

#### describe("message format")

| # | Case | Key assertion |
|---|------|----------|
| 11 | Uses the correct customType | sendMessage argument includes `customType: "pipiclaw.command_result"` |

---

### 1.3 `test/dingtalk.test.ts`: DingTalk protocol layer

**Under test:** `src/dingtalk.ts`, `DingTalkBot`

**Mock strategy:**
- `vi.mock("dingtalk-stream")` replaces DWClient (a FakeDWClient records connect/disconnect/registerCallbackListener)
- `vi.mock("axios")` intercepts all HTTP requests (token, card, message sending)
- `FakeDingTalkHandler` implements handleEvent / handleStop / handleBusyMessage with `vi.fn()`
- `vi.useFakeTimers()` controls reconnect backoff and keep-alive
- A real temp directory is used for the conversation metadata persistence tests

**Test cases (grouped by feature):**

#### describe("message deduplication")

| # | Case | Key assertion |
|---|------|----------|
| 1 | A new messageId is processed normally | markProcessed returns true |
| 2 | A duplicate messageId is skipped | markProcessed returns false |
| 3 | The FIFO evicts the oldest ID beyond 200 entries | The evicted ID passes again |
| 4 | The same msgId with a different messageId is deduplicated | The second one is skipped |
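The behavior that cases 1 to 3 describe can be pictured as a bounded FIFO set. The sketch below is an assumption about the shape of that logic, not the code in `src/dingtalk.ts`: the class name `ProcessedIds` and its constructor parameter are hypothetical, while `markProcessed` and the 200-entry limit come from the table. It does not model the msgId/messageId distinction of case 4.

```typescript
// Hypothetical bounded FIFO of already-seen message IDs.
class ProcessedIds {
  private seen = new Set<string>();
  private order: string[] = [];
  private limit: number;

  constructor(limit = 200) {
    this.limit = limit;
  }

  // Returns true for a first-time ID and false for a duplicate.
  markProcessed(id: string): boolean {
    if (this.seen.has(id)) return false;
    this.seen.add(id);
    this.order.push(id);
    // Evict the oldest ID once the limit is exceeded.
    if (this.order.length > this.limit) {
      const oldest = this.order.shift();
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    return true;
  }
}
```

With the default limit, the 201st distinct ID pushes out the first one, which is why case 3 expects the evicted ID to be accepted again.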

#### describe("message extraction and routing")

| # | Case | Key assertion |
|---|------|----------|
| 5 | Extracts the text.content field | Text content is correct |
| 6 | Extracts richText content | Concatenated content is correct |
| 7 | An empty message yields an empty string | Does not throw |
| 8 | A DM routes to `dm_{senderId}` | channelId format is correct |
| 9 | A group message routes to `group_{conversationId}` | channelId format is correct |

#### describe("authorization")

| # | Case | Key assertion |
|---|------|----------|
| 10 | An empty allowFrom allows everyone | handleEvent is called |
| 11 | A sender missing from allowFrom is ignored | handleEvent is not called |
| 12 | A sender present in allowFrom is processed | handleEvent is called |

#### describe("busy command routing")

| # | Case | Key assertion |
|---|------|----------|
| 13 | /help → sendPlain with the help text | Bypasses the handler |
| 14 | /stop → handler.handleStop | handleStop is called and sendPlain sends "Stopping" |
| 15 | /steer msg → handleBusyMessage("steer") | Mode and text are correct |
| 16 | /followup msg → handleBusyMessage("followUp") | Mode is correct |
| 17 | Plain text → defaults to steer | handleBusyMessage mode="steer" |

#### describe("access token management")

| # | Case | Key assertion |
|---|------|----------|
| 18 | The first call refreshes the token | axios POST to oauth2/accessToken |
| 19 | The cached token is used before expiry | The second call sends no HTTP request |
| 20 | Concurrent refreshes are coalesced | Two concurrent calls send a single HTTP request |
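Cases 18 to 20 describe a cache with in-flight coalescing. A minimal sketch of that pattern follows, assuming a hypothetical `TokenCache` class with an injected `fetchToken` function and clock; the real bot issues the request through axios, which is left out here.

```typescript
type Token = { value: string; expiresAt: number };

// Hypothetical token cache: one shared in-flight refresh, reused until expiry.
class TokenCache {
  private cached: Token | null = null;
  private inFlight: Promise<Token> | null = null;
  private fetchToken: () => Promise<Token>;
  private now: () => number;

  constructor(fetchToken: () => Promise<Token>, now: () => number = Date.now) {
    this.fetchToken = fetchToken;
    this.now = now;
  }

  async get(): Promise<string> {
    // Serve from cache while the token is still valid.
    if (this.cached && this.now() < this.cached.expiresAt) return this.cached.value;
    // Start a refresh only if none is running; concurrent callers share it.
    if (!this.inFlight) {
      this.inFlight = Promise.resolve()
        .then(() => this.fetchToken())
        .then((token) => {
          this.cached = token;
          return token;
        })
        .finally(() => {
          this.inFlight = null;
        });
    }
    return (await this.inFlight).value;
  }
}
```

Counting calls to the injected `fetchToken` is the analogue of asserting on the mocked axios POST in the planned tests.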

#### describe("conversation metadata persistence")

| # | Case | Key assertion |
|---|------|----------|
| 21 | Writes to the file system | .channel-meta.json content is correct |
| 22 | Falls back to reading from the file system | Still readable after in-memory state is cleared |

#### describe("channel queue")

| # | Case | Key assertion |
|---|------|----------|
| 23 | Processes work items in order | 3 enqueued items complete in order |
| 24 | stop blocks further processing | Later enqueue calls return false |
| 25 | Rejects when the queue limit is exceeded | Returns false |
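The three queue cases can be illustrated with a small serial queue. This is a sketch under assumptions, not the project's implementation: `ChannelQueue`, `WorkItem`, and `maxSize` are hypothetical names, and in this version the item currently running does not count toward the limit.

```typescript
type WorkItem = () => Promise<void>;

// Hypothetical per-channel queue that runs one work item at a time.
class ChannelQueue {
  private items: WorkItem[] = [];
  private running = false;
  private stopped = false;
  private maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  // Returns false when the queue is stopped or full; otherwise queues the item.
  enqueue(item: WorkItem): boolean {
    if (this.stopped || this.items.length >= this.maxSize) return false;
    this.items.push(item);
    void this.drain();
    return true;
  }

  // Drop pending items and refuse new ones.
  stop(): void {
    this.stopped = true;
    this.items = [];
  }

  private async drain(): Promise<void> {
    if (this.running) return;
    this.running = true;
    try {
      while (this.items.length > 0) {
        const next = this.items.shift();
        if (!next) break;
        try {
          await next();
        } catch {
          // A failed item must not stall the items behind it.
        }
      }
    } finally {
      this.running = false;
    }
  }
}
```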

#### describe("lifecycle")

| # | Case | Key assertion |
|---|------|----------|
| 26 | stop clears the queue and disconnects | disconnect is called |

---

## Phase 2: integration tests

Integration tests live under `test/integration/` and verify that modules cooperate correctly. The three files do not depend on each other and can be implemented in parallel.

### 2.1 `test/integration/memory-lifecycle.test.ts`: full memory lifecycle chain

**Test boundary:** `MemoryLifecycle` → `session-memory` → `memory-consolidation` → `memory-files`

**Mock strategy:**
- **Mock only `sidecar-worker.js`** (`runSidecarTask` returns controllable JSON output)
- Real file system plus real memory-files I/O
- A real MemoryLifecycle instance (not a mock)

**Difference from the existing `memory-lifecycle.test.ts`:**
The existing test mocks `session-memory.js` and `memory-consolidation.js` and only verifies that the calls happen. This integration test lets those modules run for real, mocks only the lowest-level LLM call (sidecar-worker), and verifies the final state of the files.

**Test cases:**

| # | Case | Description | Key assertion |
|---|------|------|----------|
| 1 | The turn threshold triggers a session memory update | Call `noteCompletedAssistantTurn` N times | SESSION.md is rewritten with the structured content returned by the sidecar |
| 2 | The tool-call threshold triggers an update | `noteToolCall` M times plus 1 turn | SESSION.md is rewritten |
| 3 | Counters reset after an update | Keep counting after a trigger | A full N calls are needed to trigger again |
| 4 | session_before_compact runs as a chain | Fire the event | Order: 1) SESSION.md updated → 2) MEMORY.md appended → 3) HISTORY.md appended |
| 5 | session_compact triggers background maintenance | Fire the event | `runBackgroundMaintenance` runs to completion |
| 6 | The background queue runs serially | Fire twice in quick succession | Both tasks complete in order; file state is consistent |
| 7 | A consolidation failure does not block later operations | The first sidecar call throws | Later operations run normally; the main flow is unaffected |
| 8 | enabled=false skips everything | Set disabled | No file reads or writes occur |

---

### 2.2 `test/integration/memory-consolidation.test.ts`: full consolidation pipeline

**Test boundary:** `runInlineConsolidation` / `runBackgroundMaintenance` → `sidecar-worker` (mock) → `memory-files` (real I/O)

**Mock strategy:**
- Mock only `sidecar-worker.js` (returns controllable JSON/Markdown output)
- Real file system (temp directory plus afterEach cleanup)

**Test cases:**

| # | Case | Description | Key assertion |
|---|------|------|----------|
| 1 | Skips when there are no substantive messages | Pass a single short message | Returns `{ skipped: true }`; files are unchanged |
| 2 | Inline consolidation appends memory entries | LLM returns `{memoryEntries: ["fact1", "fact2"]}` | MEMORY.md contains timestamped fact1 and fact2 |
| 3 | Inline consolidation appends a history block | LLM returns `{historyBlock: "summary"}` | HISTORY.md contains a new timestamped block |
| 4 | Splits messages at the compaction boundary | sessionEntries contains a compaction marker | Only message content after the boundary is processed |
| 5 | Background cleanup of MEMORY.md | Pre-fill with more than 8000 characters | MEMORY.md is fully rewritten with the LLM output |
| 6 | Background folding of HISTORY.md | Pre-fill with more than 8 `##` blocks | Older blocks are folded; the latest 4 are kept |
| 7 | Skips below the thresholds | Small files | Returns `{cleaned: false, folded: false}` |
| 8 | Degrades gracefully on malformed LLM output | The sidecar throws SidecarParseError | Does not crash; returns a result containing the error information |

---

### 2.3 `test/integration/recall-scoring.test.ts`: full memory recall scoring chain

**Test boundary:** `recallRelevantMemory` → `buildMemoryCandidates` → scoring → rendering

**Mock strategy:**
- Basic recall needs no mock (pure keyword scoring)
- Rerank tests mock `sidecar-worker.js`
- Real file system

**Difference from the existing `memory-recall.test.ts`:**
The existing test verifies candidate building and basic recall. This test focuses on scoring accuracy, priority ordering, rendering limits, and rerank integration.

**Test cases:**

| # | Case | Description | Key assertion |
|---|------|------|----------|
| 1 | Candidates are ordered by keyword relevance | Multiple files with different keyword densities | Higher-density candidates rank first |
| 2 | Session candidates outrank channel memory | The same keyword appears in both places | The session candidate scores higher |
| 3 | The maxInjected limit is enforced | maxInjected=1 | The result contains only 1 candidate |
| 4 | The maxChars limit is enforced | maxChars=100 | renderedText length ≤ 100 |
| 5 | Returns empty when there are no memory files | Empty directory | Empty result, no error |
| 6 | The candidate cache avoids repeated reads | Two consecutive calls with the same cache object | Files are read only once |
| 7 | Rerank mode calls the sidecar | rerankWithModel=true | The sidecar is called; results are filtered by the LLM's selection |
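The ordering and limit cases (1 to 5) can be illustrated with a toy scorer. The real scoring in `memory-recall` is not shown in this plan, so everything below is an assumption: the `Candidate` shape, the hit-count formula, the `SESSION_BONUS` constant, and the `recall` function are hypothetical. Only the names `maxInjected`, `maxChars`, and `renderedText` come from the table.

```typescript
type Candidate = { source: "session" | "channel"; text: string };
type RecallResult = { selected: Candidate[]; renderedText: string };

const SESSION_BONUS = 1; // hypothetical boost that favors session memory

// Split into lowercase ASCII words; kept simple for brevity.
function words(text: string): string[] {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Score = number of query-term hits, plus a bonus for session candidates.
function score(query: string, candidate: Candidate): number {
  const terms = new Set(words(query));
  const hits = words(candidate.text).filter((w) => terms.has(w)).length;
  if (hits === 0) return 0;
  return hits + (candidate.source === "session" ? SESSION_BONUS : 0);
}

function recall(query: string, candidates: Candidate[], maxInjected: number, maxChars: number): RecallResult {
  const selected = candidates
    .map((candidate) => ({ candidate, value: score(query, candidate) }))
    .filter((entry) => entry.value > 0)
    .sort((a, b) => b.value - a.value)
    .slice(0, maxInjected)
    .map((entry) => entry.candidate);
  // Render as a bullet list, then truncate to the character budget.
  const renderedText = selected.map((c) => "- " + c.text).join("\n").slice(0, maxChars);
  return { selected, renderedText };
}
```

Because both limits are plain parameters, each of the limit cases reduces to one call with a small value and one length assertion.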

---

## Phase 3: end-to-end smoke test

### `test/e2e/smoke.test.ts`: full message processing pipeline

**Goal:** Verify the full path from receiving a DingTalk event to delivering the final response, covering as much real code as possible with as little mocking as possible.

#### Architecture

```
DingTalkEvent (constructed)
    ↓
createDingTalkContext()        [real delivery.ts]
    ↓
ChannelRunner.run()            [real agent.ts, external dependencies mocked]
    ├─ recallRelevantMemory()  [real recall, real file system]
    ├─ session.prompt()        [mock AgentSession → scripted event sequence]
    ├─ event subscription      [real event handling logic]
    └─ delivery                [real delivery state machine → FakeDingTalkBot]
```

#### Mock levels

**Principle:** Mock only the external boundaries (LLM API, DingTalk API, pi-agent SDK) and maximize real execution of internal modules.

| Component | Strategy | Notes |
|------|------|------|
| `@mariozechner/pi-agent-core` Agent | mock `prompt()` to emit a scripted event sequence | Simulates the agent's think → tool → reply flow |
| `@mariozechner/pi-coding-agent` AgentSession | mock the constructor and delegate to the mock Agent | Keeps the session management logic |
| `@mariozechner/pi-coding-agent` SessionManager | In-memory implementation | No real file-backed session persistence needed |
| `@mariozechner/pi-coding-agent` ModelRegistry | Returns a fixed model list | Avoids the auth.json dependency |
| `sidecar-worker.js` | Returns controllable LLM output | Memory consolidation/update uses the mock |
| DingTalkBot | **FakeDingTalkBot** | Records all calls to verify delivery behavior |
| File system | **Real temp directory** | Verifies the final state of the memory files |
| Timers | `vi.useFakeTimers()` | Controls delivery throttling |

#### Scripted agent behavior

When `prompt()` is called, the mock Agent emits the following events in order:

```typescript
// Pseudocode event scripts; not meant to compile as-is.

// Scenario 1: simple reply
[
  { type: "message_start", message: { role: "assistant" } },
  { type: "message_end", message: assistantMsg("Here is the analysis...") },
  { type: "turn_end", message: assistantMsg, toolResults: [] },
]

// Scenario 2: tool call + reply
[
  { type: "message_start", message: { role: "assistant" } },
  { type: "tool_execution_start", toolName: "read", toolCallId: "tc-1" },
  { type: "tool_execution_end", toolCallId: "tc-1", result: "file content" },
  { type: "message_end", message: assistantMsg("Analysis complete.") },
  { type: "turn_end", message: assistantMsg, toolResults: [...] },
]

// Scenario 3: silent
[
  { type: "message_end", message: assistantMsg("[SILENT]") },
  { type: "turn_end", message: assistantMsg("[SILENT]") },
]

// Scenario 4: error
prompt() → reject(new Error("Model overloaded"))
```
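A scripted mock along these lines could be as small as the class below. It is a sketch, not the pi-agent SDK: `ScriptedAgent`, its `subscribe` method, and the loose `AgentEvent` type are hypothetical, and the real Agent and AgentSession interfaces may differ.

```typescript
// Hypothetical loose event type; the real SDK defines richer event shapes.
type AgentEvent = { type: string; [key: string]: unknown };
type Listener = (event: AgentEvent) => void;

class ScriptedAgent {
  prompts: string[] = [];
  private listeners: Listener[] = [];
  private script: AgentEvent[] | Error;

  constructor(script: AgentEvent[] | Error) {
    this.script = script;
  }

  // Register a listener; the returned function unsubscribes it.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Record the input, then replay the script or reject with the scripted error.
  async prompt(input: string): Promise<void> {
    this.prompts.push(input);
    if (this.script instanceof Error) throw this.script;
    for (const event of this.script) {
      for (const listener of this.listeners) listener(event);
    }
  }
}
```

Recording `prompts` lets a test assert what text reached the agent, and passing an `Error` as the script covers the rejection scenario.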

#### Test cases

| # | Case | Description | Key assertion |
|---|------|------|----------|
| 1 | A simple message is processed and the final response delivered | Build event → createDingTalkContext → run() | `bot.sendPlain` or `finalizeCard` receives the assistant text |
| 2 | Recalled memory is injected into the prompt | Pre-fill MEMORY.md and SESSION.md | The Agent.prompt input contains the recalled content wrapped in `<runtime_context>` |
| 3 | Tool execution event chain | Agent emits tool_start → tool_end → message_end → turn_end | Progress delivery includes the tool name; final usage is accumulated correctly |
| 4 | A [SILENT] response goes through deleteMessage | Agent returns "[SILENT]" | `bot.discardCard` is called; `sendPlain` is not |
| 5 | An Agent error goes through error delivery | Agent prompt() rejects | The bot receives a message containing an error notice |
| 6 | The /help command does not trigger an Agent run | Build a /help event | `bot.sendPlain` receives the help text; Agent.prompt is not called |
| 7 | The /session command returns session info | Handled through the command extension | Output includes the Session ID and token statistics |
| 8 | A steer message is queued correctly while busy | Call queueSteer while run is executing | The steer text is injected into the session.prompt call |

#### Validation dimensions for the smoke tests

Each test case validates the following dimensions:

1. **Delivery correctness**: arguments and order of the bot method calls
2. **File state**: final content of the memory files (SESSION.md, MEMORY.md)
3. **Store logs**: messages and responses recorded by ChannelStore
4. **Usage accumulation**: token statistics and cost calculation
5. **Error isolation**: errors do not leak into unrelated channels

---

## Implementation order

```
Phase 0: test/helpers/ (4 files, prerequisite)
    │
    ▼
Phase 1 (can run in parallel):
    ├─ test/delivery.test.ts              (~17 cases)
    ├─ test/command-extension.test.ts     (~11 cases)
    └─ test/dingtalk.test.ts              (~26 cases)
    │
    ▼
Phase 2 (can run in parallel):
    ├─ test/integration/memory-lifecycle.test.ts      (~8 cases)
    ├─ test/integration/memory-consolidation.test.ts  (~8 cases)
    └─ test/integration/recall-scoring.test.ts        (~7 cases)
    │
    ▼
Phase 3:
    └─ test/e2e/smoke.test.ts             (~8 cases)
```

## New files

| File path | Type | Estimated cases |
|----------|------|-----------|
| `test/helpers/fake-bot.ts` | Helper | - |
| `test/helpers/fake-store.ts` | Helper | - |
| `test/helpers/fake-extension-api.ts` | Helper | - |
| `test/helpers/fixtures.ts` | Helper | - |
| `test/delivery.test.ts` | Unit test | 17 |
| `test/command-extension.test.ts` | Unit test | 11 |
| `test/dingtalk.test.ts` | Unit test | 26 |
| `test/integration/memory-lifecycle.test.ts` | Integration test | 8 |
| `test/integration/memory-consolidation.test.ts` | Integration test | 8 |
| `test/integration/recall-scoring.test.ts` | Integration test | 7 |
| `test/e2e/smoke.test.ts` | Smoke test | 8 |
| **Total** | **4 helpers + 7 tests** | **~85** |

---

## vitest.config.ts changes

```typescript
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    environment: "node",
    include: ["test/**/*.test.ts"], // already matches subdirectories; no change needed
    coverage: {
      provider: "v8",
      reporter: ["text", "json-summary", "html"],
      reportsDirectory: "./coverage",
      include: ["src/**/*.ts"],
      // Exclude entry/constant files that are hard to test
      exclude: ["src/main.ts", "src/index.ts", "src/paths.ts"],
      // Enable thresholds once Phase 3 is complete
      thresholds: {
        statements: 70,
        branches: 60,
        functions: 70,
        lines: 70,
      },
    },
  },
});
```

## Coverage targets

| Stage | Statements | Branches | Functions | Lines |
|------|-----------|----------|-----------|-------|
| Current (estimated) | ~50% | ~40% | ~50% | ~50% |
| After Phase 1 | 65% | 55% | 65% | 65% |
| After Phase 2 | 70% | 60% | 70% | 70% |
| After Phase 3 | **75%+** | **65%+** | **75%+** | **75%+** |

## Verification

After each phase:

1. `npm run test`: all tests pass (including the new ones)
2. `npm run test:coverage`: coverage meets the phase target
3. `npm run check`: lint + typecheck + test all green

package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@oyasmi/pipiclaw",
-  "version": "0.
+  "version": "0.5.0",
   "description": "An AI assistant runtime for coding and team workflows, with DingTalk AI Cards, sub-agents, memory, and scheduled events.",
   "type": "module",
   "bin": {
@@ -29,6 +29,7 @@
     "lint": "biome check .",
     "typecheck": "tsc --noEmit -p tsconfig.json",
     "test": "vitest --run",
+    "test:coverage": "vitest --run --coverage",
     "check": "npm run lint && npm run typecheck && npm run test",
     "prepublishOnly": "npm run clean && npm run build && npm run check"
   },
@@ -47,6 +48,7 @@
     "@biomejs/biome": "2.3.5",
     "@types/diff": "^7.0.2",
     "@types/node": "^24.3.0",
+    "@vitest/coverage-v8": "^3.2.4",
     "shx": "^0.4.0",
     "typescript": "^5.7.3",
     "vitest": "^3.2.4"