@oyasmi/pipiclaw 0.4.0 → 0.5.1
- package/README.md +43 -5
- package/dist/agent.d.ts.map +1 -1
- package/dist/agent.js +156 -57
- package/dist/agent.js.map +1 -1
- package/dist/context.d.ts +18 -0
- package/dist/context.d.ts.map +1 -1
- package/dist/context.js +26 -0
- package/dist/context.js.map +1 -1
- package/dist/index.d.ts +7 -3
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +6 -2
- package/dist/index.js.map +1 -1
- package/dist/llm-json.d.ts +7 -0
- package/dist/llm-json.d.ts.map +1 -0
- package/dist/llm-json.js +77 -0
- package/dist/llm-json.js.map +1 -0
- package/dist/markdown-sections.d.ts +6 -0
- package/dist/markdown-sections.d.ts.map +1 -0
- package/dist/markdown-sections.js +34 -0
- package/dist/markdown-sections.js.map +1 -0
- package/dist/memory-candidates.d.ts +21 -0
- package/dist/memory-candidates.d.ts.map +1 -0
- package/dist/memory-candidates.js +126 -0
- package/dist/memory-candidates.js.map +1 -0
- package/dist/memory-consolidation.d.ts.map +1 -1
- package/dist/memory-consolidation.js +28 -49
- package/dist/memory-consolidation.js.map +1 -1
- package/dist/memory-files.d.ts +3 -0
- package/dist/memory-files.d.ts.map +1 -1
- package/dist/memory-files.js +51 -0
- package/dist/memory-files.js.map +1 -1
- package/dist/memory-lifecycle.d.ts +9 -0
- package/dist/memory-lifecycle.d.ts.map +1 -1
- package/dist/memory-lifecycle.js +66 -0
- package/dist/memory-lifecycle.js.map +1 -1
- package/dist/memory-recall.d.ts +29 -0
- package/dist/memory-recall.d.ts.map +1 -0
- package/dist/memory-recall.js +218 -0
- package/dist/memory-recall.js.map +1 -0
- package/dist/prompt-builder.d.ts.map +1 -1
- package/dist/prompt-builder.js +7 -2
- package/dist/prompt-builder.js.map +1 -1
- package/dist/session-memory-files.d.ts +2 -0
- package/dist/session-memory-files.d.ts.map +1 -0
- package/dist/session-memory-files.js +2 -0
- package/dist/session-memory-files.js.map +1 -0
- package/dist/session-memory.d.ts +22 -0
- package/dist/session-memory.d.ts.map +1 -0
- package/dist/session-memory.js +274 -0
- package/dist/session-memory.js.map +1 -0
- package/dist/sidecar-worker.d.ts +27 -0
- package/dist/sidecar-worker.d.ts.map +1 -0
- package/dist/sidecar-worker.js +105 -0
- package/dist/sidecar-worker.js.map +1 -0
- package/dist/sub-agents.d.ts +10 -0
- package/dist/sub-agents.d.ts.map +1 -1
- package/dist/sub-agents.js +90 -0
- package/dist/sub-agents.js.map +1 -1
- package/dist/tools/index.d.ts +3 -0
- package/dist/tools/index.d.ts.map +1 -1
- package/dist/tools/index.js +2 -0
- package/dist/tools/index.js.map +1 -1
- package/dist/tools/subagent.d.ts +6 -0
- package/dist/tools/subagent.d.ts.map +1 -1
- package/dist/tools/subagent.js +127 -12
- package/dist/tools/subagent.js.map +1 -1
- package/docs/improve-memory/design.md +537 -0
- package/docs/improve-memory/interfaces-and-tests.md +473 -0
- package/docs/improve-memory/spec.md +357 -0
- package/docs/memory-rfc.md +7 -1
- package/docs/proj-review.md +188 -0
- package/docs/test-supplementation-plan.md +553 -0
- package/package.json +3 -1
package/docs/improve-memory/spec.md
@@ -0,0 +1,357 @@
# Pipiclaw Context Upgrade Spec

## Status

Draft

## Purpose

This spec defines the next-stage context architecture for `pipiclaw`.

It extends the existing [memory RFC](/Users/oyasmi/projects/pipiclaw/docs/memory-rfc.md) rather than replacing it wholesale.

The primary goal is not "more memory files". The goal is to materially improve:

1. cross-turn continuity
2. long-task stability
3. compaction resilience
4. sub-agent usefulness
5. skill usefulness in a long-running DingTalk channel workspace

The design remains file-based and runtime-local. It does not introduce embeddings, vector search, or a dedicated memory database.

## Problems To Solve

The current model is good enough to persist durable facts, but still leaves several gaps:

1. `MEMORY.md` and `HISTORY.md` are available but not proactively surfaced, so the model often fails to use them at the right time.
2. "current working state" is mixed into durable memory and is therefore either too weak, too stale, or too noisy.
3. compaction happens with only indirect help from memory consolidation, which is weaker than having an explicit working-state artifact.
4. sub-agents are context-isolated, but there is no standard way to give them the most relevant channel memory automatically.
5. skills are loaded, but they do not yet participate in a richer lifecycle where memory and runtime hooks reinforce one another.

## Non-Goals

1. No vector retrieval, embedding index, semantic database, or memory plugin framework.
2. No special standalone `memory_search` tool for the model.
3. No automatic mutation of workspace-level `SOUL.md`, `AGENTS.md`, or workspace `MEMORY.md`.
4. No proactive scanning of `log.jsonl` or `context.jsonl` as normal memory inputs.
5. No attempt to clone the full `claude-code` agent OS, swarm runtime, or prompt-cache infrastructure in this phase.

## Context Layers

The upgraded context model has five layers.

### 1. Identity Layer

- `workspace/SOUL.md`
- `workspace/AGENTS.md`

Semantics:

- loaded into session context at session start
- human-managed
- authoritative
- not rewritten by runtime maintenance

### 2. Shared Durable Memory

- `workspace/MEMORY.md`

Semantics:

- stable shared background
- read on demand
- not auto-rewritten by runtime

### 3. Channel Durable Memory

- `<channel>/MEMORY.md`

Semantics:

- durable facts, decisions, preferences, constraints, medium-horizon open loops
- append-first, then cleanup/fold
- channel-scoped
- runtime-managed

Important change:

`MEMORY.md` is no longer the primary owner of minute-by-minute "what am I doing right now" state. It may still contain open loops, but detailed active execution state belongs to `SESSION.md`.

### 4. Channel Working Memory

- `<channel>/SESSION.md`

Semantics:

- current task state
- active files and commands
- current hypotheses and next steps
- recent important errors and corrections
- channel-scoped handoff artifact across sessions and compactions
- runtime-managed

This is the major addition in this spec.

### 5. Channel History

- `<channel>/HISTORY.md`

Semantics:

- summarized chronological recovery material
- older work periods, decisions, milestones
- runtime-managed
- not intended for normal manual maintenance

## Cold Storage

These files remain cold:

- `<channel>/log.jsonl`
- `<channel>/context.jsonl`

They are not part of normal memory recall and are not proactively loaded into prompts.

## Core Invariants

The upgraded design must preserve these invariants:

1. File-based operation remains the source of truth.
2. The model should not need to remember to manually read the right memory file in common cases.
3. `SESSION.md` is the working-state artifact.
4. `MEMORY.md` is the durable-facts artifact.
5. `HISTORY.md` is the chronological-summary artifact.
6. Workspace memory remains human-managed.
7. Raw transport/session archives remain cold.
8. Failure of a background updater must not corrupt persisted memory files.
9. The system must degrade gracefully when an updater fails.
10. Memory improvements must help sub-agents and skills, not just the main agent.

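One common way to approach invariant 8 is a write-then-rename pattern, which on POSIX filesystems swaps the file in a single step. The sketch below assumes Node.js `fs`; `writeFileAtomic` is a hypothetical helper name and is not taken from the pipiclaw source.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

/**
 * Hypothetical helper: replace `target` with `content` so that a failed
 * write leaves the previous file in place instead of a half-written one.
 */
export function writeFileAtomic(target: string, content: string): void {
  const dir = path.dirname(target);
  // Keep the temp file in the same directory so the rename stays on one filesystem.
  const tmp = path.join(dir, `.${path.basename(target)}.${process.pid}.tmp`);
  try {
    fs.writeFileSync(tmp, content, "utf8");
    fs.renameSync(tmp, target);
  } catch (err) {
    // Best-effort cleanup; the previous `target` has not been modified.
    fs.rmSync(tmp, { force: true });
    throw err;
  }
}
```

An updater that crashes before the rename leaves only a stray temp file behind, which a later maintenance pass could delete.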
## File Semantics

### `SESSION.md`

`SESSION.md` is channel-scoped hot memory.

It should answer:

1. What is the user currently trying to achieve?
2. What is the current state of work?
3. Which files and commands matter right now?
4. What constraints and recent failures should not be forgotten?
5. What are the next likely steps?

It should not become:

1. a raw transcript
2. a duplicate of all durable facts already in `MEMORY.md`
3. an infinite worklog

Read/write rule:

1. runtime-managed by default
2. eligible for automatic targeted recall
3. not intended for normal manual maintenance by the main agent
4. may be edited only when an explicit user/admin instruction makes that the task itself

### `MEMORY.md`

`MEMORY.md` remains the durable channel memory, but the threshold for what belongs here gets stricter.

It should keep:

1. stable preferences
2. durable decisions
3. medium-horizon constraints
4. important open loops that must survive beyond the current execution burst

It should avoid:

1. step-by-step active worklog
2. temporary local debugging observations unless they have lasting value
3. detailed "current state" that will likely be obsolete after a few turns

Read/write rule:

1. readable on demand
2. writable by runtime consolidation
3. still manually writable when necessary
4. cleanup is allowed to remove transient state that should now live in `SESSION.md`

### `HISTORY.md`

`HISTORY.md` remains the recovery narrative.

It should keep:

1. notable work periods
2. milestones
3. decision outcomes
4. enough chronology for later recovery

It should avoid:

1. dense per-turn detail
2. raw snippets copied from transcript
3. facts better represented in `MEMORY.md`

Read/write rule:

1. readable on demand
2. runtime-managed
3. not intended for ordinary manual edits

## Closed-Loop Lifecycle

The upgraded model introduces an explicit closed loop:

1. active session work happens in warm context
2. recent work updates `SESSION.md`
3. stable facts and medium-horizon open loops are promoted into `MEMORY.md`
4. older narrative is summarized into `HISTORY.md`
5. future prompts receive targeted recall from `SESSION.md`, `MEMORY.md`, and `HISTORY.md`

This loop must work even when:

1. the session compacts
2. the user starts a new session in the same channel
3. the process restarts
4. the main agent delegates work to a sub-agent

## Recall Model

The system should stop relying purely on "the model may remember to read memory files".

Instead, each turn may inject a small amount of relevant memory context chosen from:

1. `SESSION.md`
2. channel `MEMORY.md`
3. workspace `MEMORY.md`
4. channel `HISTORY.md`

Selection rules:

1. small budget
2. high precision
3. recency-aware
4. section-aware
5. prefer `SESSION.md` when current work state matters
6. prefer `MEMORY.md` when durable constraints or preferences matter
7. prefer `HISTORY.md` when recovery of older narrative matters

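Selection rules 1 to 4 can be illustrated with a small ranking function. This is only a sketch under stated assumptions: the `Candidate` shape, the `selectRecall` name, the character budget, and the scoring weights are all hypothetical, and the source-preference rules 5 to 7 are left out because the spec does not say how "matters" is decided.

```typescript
type MemorySource = "SESSION.md" | "MEMORY.md" | "HISTORY.md";

interface Candidate {
  source: MemorySource;
  heading: string; // one candidate per Markdown section ("section-aware")
  text: string;
  updatedAt: number; // epoch milliseconds, used for the recency bonus
}

/** Naive lowercase word tokens; deliberately simple for the sketch. */
function wordSet(s: string): Set<string> {
  return new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
}

export function selectRecall(
  query: string,
  candidates: Candidate[],
  budgetChars: number,
  minOverlap = 1,
  now = Date.now(),
): Candidate[] {
  const q = wordSet(query);
  const ranked = candidates
    .map((c) => {
      const words = wordSet(`${c.heading} ${c.text}`);
      let overlap = 0;
      q.forEach((w) => {
        if (words.has(w)) overlap++;
      });
      const ageHours = Math.max(0, now - c.updatedAt) / 3_600_000;
      // The recency bonus is at most 1, so it only orders sections with equal overlap.
      return { c, overlap, score: overlap + 1 / (1 + ageHours) };
    })
    .filter((x) => x.overlap >= minOverlap) // high precision: drop non-matching sections
    .sort((a, b) => b.score - a.score);

  const picked: Candidate[] = [];
  let used = 0;
  for (const { c } of ranked) {
    if (used + c.text.length > budgetChars) continue; // small budget: skip what does not fit
    picked.push(c);
    used += c.text.length;
  }
  return picked;
}
```

Filtering before ranking keeps unrelated sections out even when the budget has room, which is the "high precision" side of the trade-off.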
## SESSION.md Lifecycle Contract

`SESSION.md` must follow this lifecycle:

1. created automatically with the channel memory files
2. not loaded wholesale into every prompt
3. eligible for targeted recall injection
4. updated in the background during normal work
5. synchronously refreshed before context-reduction boundaries when possible
6. carried across `/new` session boundaries within the same channel
7. cleaned and condensed periodically so it remains current
8. recreated automatically if missing on an old channel directory

Important semantic rule:

`/new` starts a new model session, but does not imply "forget current channel work". `SESSION.md` is allowed to survive `/new` if the channel still has active work.

## Relationship Between SESSION.md And Existing Memory Files

The relationship is:

1. `SESSION.md` owns high-churn working state
2. `MEMORY.md` owns durable and semi-durable channel knowledge
3. `HISTORY.md` owns older narrative recovery

Promotion rules:

1. stable facts discovered in `SESSION.md` may be promoted into `MEMORY.md`
2. resolved work periods may be summarized into `HISTORY.md`
3. stale transient content should be removed from `SESSION.md`
4. cleanup may also remove overly transient content that had historically accumulated in `MEMORY.md`

This lets the system gradually migrate away from using channel `MEMORY.md` as the sole carrier of both durable and active state.

## Source Of Truth Precedence

When the same topic appears in multiple files, runtime and prompts should treat them in this precedence order:

1. `SESSION.md` for current active work state
2. `MEMORY.md` for durable constraints and decisions
3. `HISTORY.md` for older chronology

Practical implication:

1. a stale `MEMORY.md` note must not override a fresher `SESSION.md` current-state section
2. a stale `HISTORY.md` block must not override a fresher durable decision in `MEMORY.md`

## Compaction Contract

Compaction should become `SESSION.md`-aware.

The preferred order is:

1. refresh `SESSION.md`
2. run inline durable consolidation into `MEMORY.md` and `HISTORY.md`
3. compact using the latest `SESSION.md` as part of the retained context strategy

Graceful degradation rules:

1. if `SESSION.md` refresh fails, use the last persisted `SESSION.md`
2. if durable consolidation fails, compaction may still proceed if `SESSION.md` is available
3. failures must be logged and retried in background maintenance

This keeps compaction safe without turning memory maintenance into a hard availability dependency.

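The preferred order and the degradation rules can be sketched as a small orchestration function. Everything here is hypothetical: the `CompactionSteps` interface and `compactWithMemory` are illustrative names, and aborting when consolidation fails and no `SESSION.md` exists is an assumption, since the spec only covers the case where `SESSION.md` is available.

```typescript
interface CompactionSteps {
  refreshSession(): Promise<void>; // step 1
  consolidateDurable(): Promise<void>; // step 2
  readPersistedSession(): Promise<string | null>; // last SESSION.md on disk, if any
  compact(sessionSnapshot: string | null): Promise<void>; // step 3
  log(message: string): void;
  scheduleRetry(step: "session" | "durable"): void;
}

export async function compactWithMemory(steps: CompactionSteps): Promise<void> {
  try {
    await steps.refreshSession();
  } catch (err) {
    // Rule 1: fall back to whatever SESSION.md was persisted last.
    steps.log(`SESSION.md refresh failed: ${String(err)}`);
    steps.scheduleRetry("session");
  }
  const snapshot = await steps.readPersistedSession();

  try {
    await steps.consolidateDurable();
  } catch (err) {
    // Rule 2: proceed only if a SESSION.md snapshot is available.
    steps.log(`durable consolidation failed: ${String(err)}`);
    steps.scheduleRetry("durable");
    if (snapshot === null) {
      // Assumption: with neither artifact available, compaction is not safe.
      throw new Error("compaction aborted: no SESSION.md and consolidation failed");
    }
  }

  await steps.compact(snapshot);
}
```

Injecting the steps keeps the ordering logic testable without an LLM or a real channel directory.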
## Migration Contract

Existing channel directories already containing only `MEMORY.md` and `HISTORY.md` must migrate safely.

Migration rules:

1. missing `SESSION.md` is treated as normal and repaired by ensure-file bootstrap
2. existing `MEMORY.md` and `HISTORY.md` content remains authoritative
3. no one-time destructive rewrite of historical `MEMORY.md` content is required
4. transient content historically living in `MEMORY.md` is cleaned gradually by normal maintenance

This keeps rollout incremental and low-risk for live DingTalk channels.

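Migration rules 1 and 2 amount to an idempotent "create if missing" step. A minimal sketch, assuming Node.js `fs`; `ensureChannelMemoryFiles` is a hypothetical name, and the empty initial content is a placeholder rather than the template pipiclaw actually writes.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const CHANNEL_FILES = ["MEMORY.md", "HISTORY.md", "SESSION.md"];

/** Hypothetical bootstrap: create any missing channel memory file, never touch existing ones. */
export function ensureChannelMemoryFiles(channelDir: string): string[] {
  fs.mkdirSync(channelDir, { recursive: true });
  const created: string[] = [];
  for (const name of CHANNEL_FILES) {
    try {
      // The "wx" flag fails with EEXIST instead of overwriting,
      // so existing content stays authoritative.
      fs.writeFileSync(path.join(channelDir, name), "", { flag: "wx" });
      created.push(name);
    } catch (err) {
      if ((err as { code?: string }).code !== "EEXIST") throw err;
    }
  }
  return created;
}
```

Calling it on every channel start is safe: on an already-migrated directory it returns an empty list and writes nothing.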
## Sub-Agent Contract

Sub-agents remain isolated by default, but the runtime gains a standard way to provide them with the right memory context.

Two modes should exist:

1. isolated
   - current behavior
   - only runtime basics plus the explicitly provided task
2. contextual
   - runtime prepends relevant memory and working-state context automatically

The contextual mode should pull from the same recall pipeline as the main agent, but with stricter budgets.

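The difference between the two modes can be shown with a prompt-building sketch. The names `buildSubAgentPrompt` and `RecallSnippet`, the 800-character default budget, and the prompt wording are all assumptions for illustration; the spec only says that contextual mode prepends recalled context under a stricter budget.

```typescript
type SubAgentMode = "isolated" | "contextual";

interface RecallSnippet {
  source: string; // e.g. "SESSION.md"
  text: string;
}

export function buildSubAgentPrompt(
  task: string,
  mode: SubAgentMode,
  recall: RecallSnippet[],
  budgetChars = 800, // hypothetical sub-agent budget, smaller than the main agent's
): string {
  if (mode === "isolated") return task; // current behavior: the task only

  const lines: string[] = [];
  let used = 0;
  for (const r of recall) {
    const line = `- [${r.source}] ${r.text}`;
    if (used + line.length > budgetChars) break;
    lines.push(line);
    used += line.length;
  }
  // If nothing fits, degrade to the isolated prompt rather than send an empty header.
  if (lines.length === 0) return task;
  return `Relevant channel context:\n${lines.join("\n")}\n\nTask:\n${task}`;
}
```

The `recall` list would come from the same recall pipeline the main agent uses, already ordered by relevance.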
## Skill Contract

Skills should become lightweight participants in the upgraded context system.

Expected improvements:

1. richer frontmatter describing when a skill should be used
2. optional scoped hooks for session lifecycle events
3. optional declaration of memory relevance or path relevance

The skill system is still intentionally lighter than the `claude-code` version. The goal is stronger reuse and better context shaping, not a full plugin platform.

## Rollout Principle

This spec is designed for staged implementation:

1. first improve recall
2. then add `SESSION.md`
3. then bridge compaction
4. then upgrade sub-agent and skill integration

At each stage, existing `MEMORY.md` and `HISTORY.md` behavior must continue to work.

package/docs/memory-rfc.md
CHANGED
@@ -2,7 +2,13 @@
 
 ## Status
 
-
+Superseded in part by [`docs/improve-memory/spec.md`](/Users/oyasmi/projects/pipiclaw/docs/improve-memory/spec.md) and [`docs/improve-memory/design.md`](/Users/oyasmi/projects/pipiclaw/docs/improve-memory/design.md).
+
+This RFC remains useful for the original file-based memory rationale, but it no longer fully describes the current design because the runtime now introduces:
+
+1. `SESSION.md` as a distinct channel working-memory layer
+2. proactive relevant-memory injection
+3. a revised lifecycle between `SESSION.md`, `MEMORY.md`, and `HISTORY.md`
 
 ## Goals
 

package/docs/proj-review.md
@@ -0,0 +1,188 @@
Pipiclaw Comprehensive Project Review

Project Overview

Pipiclaw is a DingTalk AI coding-assistant runtime built on the @mariozechner/pi-coding-agent SDK. It provides enterprise-oriented features such as persistent memory, sub-agent delegation, and scheduled events. The current version is v0.4.0, with about 6,844 lines of TypeScript and 22 test files.

---
I. Current Design and Implementation Issues

1. Heavy code duplication

extractJsonObject() is implemented separately in 3 places:
- memory-consolidation.ts
- memory-recall.ts
- session-memory.ts

clipText() is duplicated in 3 places:
- session-memory.ts
- tools/subagent.ts
- memory-recall.ts

Markdown section parsing exists in 3 variants:
- splitMarkdownSections() (## level) in memory-files.ts
- splitLevelOneSections() in tools/subagent.ts
- splitLevelOneSections() in memory-candidates.ts

Recommendation: extract these into a shared module under src/utils/ with a single implementation.

---
2. Core modules carry too many responsibilities

┌───────────────────┬───────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ File              │ Lines │ Problem                                                                                                                 │
├───────────────────┼───────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ agent.ts          │ 907   │ Session management, event subscription, message formatting, memory lifecycle, and tool configuration all mixed together │
├───────────────────┼───────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ dingtalk.ts       │ 881   │ Protocol handling, message queue, AI Card state, and token cache are coupled                                            │
├───────────────────┼───────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ sub-agents.ts     │ 511   │ Discovery, config parsing, validation, and merge logic are intermingled                                                 │
├───────────────────┼───────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ tools/subagent.ts │ ~600  │ Config parsing, context building, tool filtering, worker creation, event handling, and budget tracking                  │
└───────────────────┴───────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

ChannelRunner in agent.ts takes on too many responsibilities: it is at once the session orchestrator, the event dispatcher, and the memory coordinator.

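---
3. Type-safety gaps

- Many as any type assertions: event handling in agent.ts frequently uses event as any and lacks a proper union type definition
- noExplicitAny: Off: the biome configuration disables the any check, which weakens type safety
- Regex-based JSON extraction: extractJsonObject() matches {...} with a regular expression, which is fragile; edge cases such as nested JSON or braces inside strings easily break it
- Partial config merging: configs are merged with the spread operator and there is no runtime validation
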
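The regex fragility described above can be avoided by scanning for a balanced object while tracking string state, then handing the slice to JSON.parse. The file list for this release shows a new dist/llm-json.js whose contents are not visible in this diff, so the following is only an independent sketch of the technique, not the shipped implementation.

```typescript
/**
 * Return the first balanced {...} span in `text` that parses as JSON,
 * or null if there is none. Braces inside string literals are ignored.
 */
export function extractJsonObject(text: string): Record<string, unknown> | null {
  for (let start = text.indexOf("{"); start !== -1; start = text.indexOf("{", start + 1)) {
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < text.length; i++) {
      const ch = text[i];
      if (inString) {
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') inString = true;
      else if (ch === "{") depth++;
      else if (ch === "}" && --depth === 0) {
        try {
          return JSON.parse(text.slice(start, i + 1)) as Record<string, unknown>;
        } catch {
          break; // balanced but not valid JSON: try the next "{"
        }
      }
    }
  }
  return null;
}
```

Retrying from the next opening brace lets the function skip prose such as "{bad}" that precedes the real payload in an LLM reply.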
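---
4. Memory system design issues

4.1 Recall algorithm is too simple

The current token-overlap scoring is essentially keyword matching:
- Splits on whitespace and sums the number of matching tokens
- Has no notion of semantic similarity (for example, "登录" (login) and "认证" (authentication) cannot be related)
- Has no Chinese word segmentation at all (tokenization splits on \W+, which is almost useless for Chinese text)

4.2 Consolidation pipeline has no timeout protection

runInlineConsolidation() and runBackgroundMaintenance() depend on the LLM sidecar worker but have no timeout mechanism. If the LLM responds slowly or hangs, the whole pipeline waits indefinitely.

4.3 Memory candidates are not cached

buildMemoryCandidates() reads and parses 4 files on every call. It may be called several times within a single run (recall + consolidation), yet each call re-reads from disk.

4.4 LLM-driven SESSION.md updates lack consistency guarantees

Session memory updates rely on the LLM producing structured JSON, but LLM output format is unstable, so SESSION.md content can degrade or lose key information.
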
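The \W+ problem in 4.1 is easy to reproduce: every Chinese character is a non-word character for that pattern, so a Chinese query yields no tokens at all. One dependency-free workaround is to index CJK runs as character bigrams. This is a swapped-in technique for illustration, not the segmentation library the review suggests later, and it does not address semantic similarity ("登录" vs "认证"); the function names are hypothetical.

```typescript
/** Tokenize mixed Latin/CJK text: ASCII words stay whole, CJK runs become character bigrams. */
export function tokenize(text: string): string[] {
  const out: string[] = [];
  const runs = text.toLowerCase().match(/[a-z0-9_]+|[\u4e00-\u9fff]+/g) ?? [];
  for (const run of runs) {
    const isCjk = run.charCodeAt(0) >= 0x4e00;
    if (!isCjk || run.length === 1) {
      out.push(run);
      continue;
    }
    // Overlapping two-character windows approximate word boundaries without a dictionary.
    for (let i = 0; i + 1 < run.length; i++) out.push(run.slice(i, i + 2));
  }
  return out;
}

/** Number of distinct tokens shared by two strings. */
export function tokenOverlap(a: string, b: string): number {
  const inB = new Set(tokenize(b));
  let hits = 0;
  new Set(tokenize(a)).forEach((t) => {
    if (inB.has(t)) hits++;
  });
  return hits;
}
```

With this scheme "登录失败" and "修复登录超时" share the bigram "登录", whereas splitting either string on \W+ produces nothing to compare.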
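---
5. Test coverage gaps

Untested critical paths:

┌─────────────────────────────────────────────────┬──────────────────────────────────────────────────────────┐
│ Module                                          │ Impact                                                   │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────────────┤
│ agent.ts (ChannelRunner core orchestration)     │ The most central business logic has no unit tests at all │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────────────┤
│ dingtalk.ts (connect/reconnect/message routing) │ The part most likely to fail in production               │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────────────┤
│ main.ts (startup flow/config validation)        │ First-start failures cannot be diagnosed quickly         │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────────────┤
│ delivery.ts (response delivery)                 │ Risk of lost or duplicated messages                      │
├─────────────────────────────────────────────────┼──────────────────────────────────────────────────────────┤
│ Integration test of the full message flow       │ End-to-end behavior cannot be verified                   │
└─────────────────────────────────────────────────┴──────────────────────────────────────────────────────────┘

The existing tests are of good quality (factory pattern, temp-directory cleanup, edge cases), but coverage is concentrated in the tool layer and memory-file I/O; the orchestration layer is almost entirely untested.
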
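---
6. Error handling and resilience

- No limit on user message size: a malicious or accidental oversized message can be injected straight into the prompt
- Concurrency control is not robust enough: per-channel channelStates rely on a simple status flag, so a race condition can produce duplicate runs
- Disk writes have no capacity check: log.jsonl grows without bound, and there is no fallback when a memory-file consolidation fails
- Sub-agent error propagation is unclear: once a sub-agent timeout or failure is formatted, the original stack trace is lost
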
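The duplicate-run race can be closed by serializing runs per channel through a promise chain instead of checking a flag. A minimal sketch; `ChannelLocks` is a hypothetical class name and is not taken from the pipiclaw source.

```typescript
/** Serializes async tasks per channel; different channels still run concurrently. */
export class ChannelLocks {
  private tails = new Map<string, Promise<void>>();

  /** Run `task` after every earlier task for the same channel has settled. */
  run<T>(channelId: string, task: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(channelId) ?? Promise.resolve();
    const result = prev.then(task);
    // The stored tail never rejects, so one failed run cannot block the queue.
    const tail = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(channelId, tail);
    // Drop the entry once the queue for this channel is empty.
    tail.then(() => {
      if (this.tails.get(channelId) === tail) this.tails.delete(channelId);
    });
    return result;
  }
}
```

Because the queue position is claimed synchronously inside `run`, two messages arriving in the same tick cannot both observe an "idle" state.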
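---
7. Architectural coupling

- DingTalk coupling reaches into the core: the DingTalkContext type runs through agent.ts, delivery.ts, and commands.ts, making it costly to support other IM platforms (Feishu, WeCom)
- The memory system is tightly bound to the filesystem: every memory operation reads and writes Markdown files directly, with no abstraction layer, so switching to a database or vector store is hard
- Hard-coded config path: the ~/.pi/pipiclaw/ path in paths.ts is hard-coded, which hinders multi-instance deployment
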
---
8. Other issues

- release.yml uses Node 20, but the project requires >=22 and CI only tests on 22/24
- attach.ts is an empty shell: it throws a not implemented error yet is already exported as public API
- shell-escape.ts is only 7 lines: it is too simplistic and does not handle special characters (such as \0 or unicode)
- No log rotation: log.jsonl and context.jsonl grow without bound

---
II. Directions for Further Iteration

High priority

1. Split up ChannelRunner
   - Split agent.ts into session-orchestrator.ts (session lifecycle), event-dispatcher.ts (event stream handling), and tool-configurator.ts (tool assembly)
   - Make the core orchestration testable
2. Introduce a transport-layer abstraction
   - Define a Transport interface (send/receive/stream) with DingTalk as one implementation
   - Pave the way for future support of Feishu, Slack, a CLI, and so on
3. Fix Chinese memory recall
   - The current tokenization does not work for Chinese; introduce a segmentation library (such as jieba-wasm) or move to embedding-based vector recall
   - At minimum, use it as a supplement in the rerank stage
4. Fill in orchestration-layer tests
   - Write integration tests for the ChannelRunner run() flow
   - Mock AgentSession and verify the full chain: memory recall → prompt construction → event handling
5. Extract shared utility functions
   - Consolidate extractJsonObject(), clipText(), and splitMarkdownSections() under src/utils/

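Item 2 names three operations (send/receive/stream) but no signatures, so the shape below is a guess for illustration only. `IncomingMessage`, the method parameters, and the `InMemoryTransport` test double are all hypothetical; a DingTalk implementation would sit behind the same interface.

```typescript
export interface IncomingMessage {
  channelId: string;
  text: string;
}

/** Hypothetical transport contract; the review only names send/receive/stream. */
export interface Transport {
  send(channelId: string, text: string): Promise<void>;
  receive(handler: (msg: IncomingMessage) => Promise<void>): void;
  stream(channelId: string, chunks: AsyncIterable<string>): Promise<void>;
}

/** Test double: records outbound text instead of calling any IM platform. */
export class InMemoryTransport implements Transport {
  readonly sent: Array<{ channelId: string; text: string }> = [];
  private handler: ((msg: IncomingMessage) => Promise<void>) | null = null;

  async send(channelId: string, text: string): Promise<void> {
    this.sent.push({ channelId, text });
  }

  receive(handler: (msg: IncomingMessage) => Promise<void>): void {
    this.handler = handler;
  }

  async stream(channelId: string, chunks: AsyncIterable<string>): Promise<void> {
    // This double buffers the stream and delivers it as one message.
    let buffer = "";
    for await (const chunk of chunks) buffer += chunk;
    await this.send(channelId, buffer);
  }

  /** Simulate an inbound message from the platform. */
  async inject(msg: IncomingMessage): Promise<void> {
    if (this.handler) await this.handler(msg);
  }
}
```

A double like this would also serve item 4, since orchestration tests could drive the runner through `inject` without a DingTalk connection.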
Medium priority

6. Memory system improvements
   - Add a timeout to the sidecar worker (30s-60s)
   - Add run-level caching for memory candidates
   - Add schema validation and a fallback mechanism to SESSION.md updates
   - Consider introducing embedding storage for semantic retrieval
7. Add safeguards
   - Limit user message length
   - Rotate log.jsonl (by size or time)
   - Check available space before writing to disk
   - Use a mutex for concurrent runs (replacing the status flag)
8. Harden type safety
   - Define an AgentEvent union type and eliminate as any
   - Enable noExplicitAny and fix violations incrementally
   - Switch JSON extraction to a proper parser (for example, find the balanced {} first, then JSON.parse)
9. Improve observability
   - Replace the current chalk colored output with structured (JSON) logs
   - Add metrics for key operations (memory recall latency, consolidation frequency, sub-agent usage rate)
   - Add a health-check endpoint

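The sidecar timeout in item 6 can be sketched with Promise.race and an AbortController. `withTimeout` and `TimeoutError` are hypothetical names; whether the real sidecar worker accepts an abort signal is not visible in this diff, so passing one through is an assumption.

```typescript
export class TimeoutError extends Error {
  constructor(ms: number) {
    super(`timed out after ${ms} ms`);
    this.name = "TimeoutError";
  }
}

/**
 * Run `work` and reject with TimeoutError if it has not settled within `ms`.
 * The signal lets cooperative work stop early; uncooperative work is only abandoned.
 */
export function withTimeout<T>(work: (signal: AbortSignal) => Promise<T>, ms: number): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      reject(new TimeoutError(ms));
    }, ms);
  });
  // Clear the timer on either outcome so it cannot keep the process alive.
  return Promise.race([work(controller.signal), timeout]).finally(() => clearTimeout(timer));
}
```

A caller in the suggested 30s-60s range might wrap a consolidation call as `withTimeout((signal) => runSidecar(prompt, signal), 45_000)`, where `runSidecar` stands in for whatever function actually invokes the worker.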
Low priority

10. Memory storage abstraction
    - Define a MemoryStore interface (read/write/query) with the current filesystem as the default implementation
    - Prepare for a future SQLite or vector database backend
11. Sub-agent improvements
    - Support communication between sub-agents
    - Structured output for sub-agent results (rather than plain text)
    - Sub-agent pooling (avoid creating a new instance every time)
12. Fix release.yml
    - Change the Node version to 22 to match the engines field
13. Documentation additions
    - Architecture decision records (ADRs)
    - An operations guide for the memory system (how to clean up or rebuild manually)
    - A sub-agent development guide

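---
III. Summary

As of v0.4.0, Pipiclaw already has a clearly layered architecture, a well-developed memory pipeline, and a flexible sub-agent system. The main weaknesses are:

1. The core orchestration layer (agent.ts) is bloated and untested; this is the biggest risk
2. Memory recall is largely ineffective for Chinese text; for a Chinese-language product built for DingTalk this is a critical defect
3. DingTalk coupling is too deep, which limits the ability to extend to other platforms
4. Code duplication and type-safety gaps, which hurt long-term maintainability

For the next stage, prioritize splitting ChannelRunner, fixing Chinese recall, and filling in orchestration-layer tests; together these three items reduce risk and raise product quality.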