@mariozechner/pi-coding-agent 0.23.2 → 0.23.3

# Compaction Research & Redesign

## Current Pi Compaction Implementation

### Settings (defaults)
- `reserveTokens: 16384` - Buffer to leave for new responses
- `keepRecentTokens: 20000` - How many tokens of recent messages to keep

### Trigger Conditions
1. **Threshold**: After each turn, if `contextTokens > contextWindow - reserveTokens`
2. **Overflow**: If the LLM returns a context-overflow error, compact and retry

### Current Process
1. Find the cut point by walking backwards until `keepRecentTokens` have accumulated
2. Generate a single summary of everything before the cut point
3. If cutting mid-turn, also generate a "turn prefix summary"
4. Save a `CompactionEntry` with the summary and `firstKeptEntryIndex`
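The backwards walk in step 1 can be sketched as follows; the entry shape, the `tokens` field, and the exact boundary semantics are assumptions for illustration, not Pi's actual implementation:

```typescript
// Illustrative sketch of the cut-point walk (not Pi's actual code).
// Each history entry is assumed to carry a precomputed token count.
interface HistoryEntry {
  tokens: number;
}

// Walk backwards from the newest entry, accumulating tokens; the returned
// index is the first entry that is kept verbatim. Everything before it
// gets summarized.
function findCutPoint(entries: HistoryEntry[], keepRecentTokens: number): number {
  let kept = 0;
  for (let i = entries.length - 1; i >= 0; i--) {
    kept += entries[i].tokens;
    if (kept > keepRecentTokens) {
      return i + 1;
    }
  }
  return 0; // the whole history fits within the budget; nothing to cut
}
```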
19
- ### Current Prompt
20
- ```
21
- You are performing a CONTEXT CHECKPOINT COMPACTION. Create a handoff summary for another LLM that will resume the task.
22
-
23
- Include:
24
- - Current progress and key decisions made
25
- - Important context, constraints, or user preferences
26
- - Absolute file paths of any relevant files that were read or modified
27
- - What remains to be done (clear next steps)
28
- - Any critical data, examples, or references needed to continue
29
-
30
- Be concise, structured, and focused on helping the next LLM seamlessly continue the work.
31
- ```
32
-
33
- ### maxTokens for Summarization
34
- - History summary: `0.8 * reserveTokens` (≈13K tokens)
35
- - Turn prefix summary: `0.5 * reserveTokens` (≈8K tokens)
36
-
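With the default `reserveTokens` of 16384, the approximate budgets above fall out directly; whether Pi floors the fractional result is an assumption here:

```typescript
// Budget arithmetic for the defaults above; flooring is an assumption.
const reserveTokens = 16384;
const historySummaryBudget = Math.floor(0.8 * reserveTokens);    // 13107, i.e. ≈13K
const turnPrefixSummaryBudget = Math.floor(0.5 * reserveTokens); // 8192, i.e. ≈8K
```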

---

## Claude Code's Approach

### Key Differences
- Much more structured, detailed prompt
- Uses `<analysis>` tags for chain-of-thought before the summary
- Uses `<summary>` tags for structured output
- 9-section format with explicit requirements
- Supports custom summarization instructions via user input

### Full Prompt (reconstructed from cli.js)

```
Your task is to create a detailed summary of the conversation so far, paying close attention to the user's explicit requests and your previous actions.
This summary should be thorough in capturing technical details, code patterns, and architectural decisions that would be essential for continuing development work without losing context.

Before providing your final summary, wrap your analysis in <analysis> tags to organize your thoughts and ensure you've covered all necessary points. In your analysis process:

1. Chronologically analyze each message and section of the conversation. For each section thoroughly identify:
   - The user's explicit requests and intents
   - Your approach to addressing the user's requests
   - Key decisions, technical concepts and code patterns
   - Specific details like:
     - file names
     - full code snippets
     - function signatures
     - file edits
   - Errors that you ran into and how you fixed them
   - Pay special attention to specific user feedback that you received, especially if the user told you to do something differently.
2. Double-check for technical accuracy and completeness, addressing each required element thoroughly.

Your summary should include the following sections:

1. Primary Request and Intent: Capture all of the user's explicit requests and intents in detail
2. Key Technical Concepts: List all important technical concepts, technologies, and frameworks discussed.
3. Files and Code Sections: Enumerate specific files and code sections examined, modified, or created. Pay special attention to the most recent messages and include full code snippets where applicable and include a summary of why this file read or edit is important.
4. Errors and fixes: List all errors that you ran into, and how you fixed them. Pay special attention to specific user feedback that you received, especially if the user told you to do something differently.
5. Problem Solving: Document problems solved and any ongoing troubleshooting efforts.
6. All user messages: List ALL user messages that are not tool results. These are critical for understanding the users' feedback and changing intent.
7. Pending Tasks: Outline any pending tasks that you have explicitly been asked to work on.
8. Current Work: Describe in detail precisely what was being worked on immediately before this summary request, paying special attention to the most recent messages from both user and assistant. Include file names and code snippets where applicable.
9. Optional Next Step: List the next step that you will take that is related to the most recent work you were doing. IMPORTANT: ensure that this step is DIRECTLY in line with the user's most recent explicit requests, and the task you were working on immediately before this summary request. If your last task was concluded, then only list next steps if they are explicitly in line with the users request. Do not start on tangential requests or really old requests that were already completed without confirming with the user first.
If there is a next step, include direct quotes from the most recent conversation showing exactly what task you were working on and where you left off. This should be verbatim to ensure there's no drift in task interpretation.

<example>
<analysis>
[Your thought process, ensuring all points are covered thoroughly and accurately]
</analysis>

<summary>
1. Primary Request and Intent:
   [Detailed description]

2. Key Technical Concepts:
   - [Concept 1]
   - [Concept 2]
   - [...]

3. Files and Code Sections:
   - [File Name 1]
     - [Summary of why this file is important]
     - [Summary of the changes made to this file, if any]
     - [Important Code Snippet]
   - [...]

4. Errors and fixes:
   - [Detailed description of error 1]:
     - [How you fixed the error]
     - [User feedback on the error if any]
   - [...]

5. Problem Solving:
   [Description of solved problems and ongoing troubleshooting]

6. All user messages:
   - [Detailed non tool use user message]
   - [...]

7. Pending Tasks:
   - [Task 1]
   - [Task 2]
   - [...]

8. Current Work:
   [Precise description of current work]

9. Optional Next Step:
   [Optional Next step to take]

</summary>
</example>

There may be additional summarization instructions provided in the included context. If so, remember to follow these instructions when creating the above summary. Examples of instructions include:
<example>
## Compact Instructions
When summarizing the conversation focus on typescript code changes and also remember the mistakes you made and how you fixed them.
</example>
```

### Additional Features
- Supports custom instructions: `When you are using compact - please focus on test output and code changes. Include file reads verbatim.`
- Post-processes the response to extract the `<analysis>` and `<summary>` sections
- Has "microcompact" for tool results (abbreviated tool outputs)

---

## OpenAI Codex's Approach

### Compaction Prompt (`codex-rs/core/templates/compact/prompt.md`)
```
You are performing a CONTEXT CHECKPOINT COMPACTION. Create a handoff summary for another LLM that will resume the task.

Include:
- Current progress and key decisions made
- Important context, constraints, or user preferences
- What remains to be done (clear next steps)
- Any critical data, examples, or references needed to continue

Be concise, structured, and focused on helping the next LLM seamlessly continue the work.
```

### Summary Prefix (`codex-rs/core/templates/compact/summary_prefix.md`)
```
Another language model started to solve this problem and produced a summary of its thinking process. You also have access to the state of the tools that were used by that language model. Use this to build on the work that has already been done and avoid duplicating work. Here is the summary produced by the other language model, use the information in this summary to assist with your own analysis:
```

### Notes
- Very similar to our current prompt (we likely derived ours from the same source)
- Supports a custom `compact_prompt` override in config
- Has `experimental_compact_prompt_file` for loading the prompt from a file

---

## SST OpenCode's Approach

### Compaction System Prompt (`session/prompt/compaction.txt`)
```
You are a helpful AI assistant tasked with summarizing conversations.

When asked to summarize, provide a detailed but concise summary of the conversation.
Focus on information that would be helpful for continuing the conversation, including:
- What was done
- What is currently being worked on
- Which files are being modified
- What needs to be done next
- Key user requests, constraints, or preferences that should persist
- Important technical decisions and why they were made

Your summary should be comprehensive enough to provide context but concise enough to be quickly understood.
```

### User Message for Compaction
```
Provide a detailed prompt for continuing our conversation above. Focus on information that would be helpful for continuing the conversation, including what we did, what we're doing, which files we're working on, and what we're going to do next considering new session will not have access to our conversation.
```

### Short Summary Prompt (`session/prompt/summarize.txt`)
```
Summarize the following conversation into 2 sentences MAX explaining what the assistant did and why
Do not explain the user's input.
Do not speak in the third person about the assistant.
```

### Additional Features
- **Pruning**: Walks backwards through message parts; once 40K tokens of tool output have accumulated, erases the output of older tool calls
- **Prune thresholds**: `PRUNE_MINIMUM = 20_000`, `PRUNE_PROTECT = 40_000`
- Marks pruned tool outputs as `compacted` with a timestamp to avoid re-pruning

---

## Factory Droid's Approach (from binary strings)

### Scratchpad Feature
From extracted strings:
```
Edit the session scratchpad using multiple operations in a single call. Operations can be str_replace, insert, or overwrite commands and are applied in order. The scratchpad is working memory that persists when conversation history is compacted or summarized.
```
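The operation semantics described in that string suggest an applier along these lines; the type and function names are hypothetical, and only the three operation kinds come from the extracted string:

```typescript
// Hypothetical scratchpad editor: operations are applied in order to a
// persistent text buffer that survives compaction.
type ScratchpadOp =
  | { kind: "str_replace"; oldText: string; newText: string }
  | { kind: "insert"; at: number; text: string }
  | { kind: "overwrite"; text: string };

function applyScratchpadOps(scratchpad: string, ops: ScratchpadOp[]): string {
  for (const op of ops) {
    switch (op.kind) {
      case "str_replace":
        // Replace the first occurrence of oldText
        scratchpad = scratchpad.replace(op.oldText, op.newText);
        break;
      case "insert":
        // Insert text at a character offset
        scratchpad = scratchpad.slice(0, op.at) + op.text + scratchpad.slice(op.at);
        break;
      case "overwrite":
        // Replace the entire buffer
        scratchpad = op.text;
        break;
    }
  }
  return scratchpad;
}
```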

### Summary Guidance
```
Once you are done with the task, you can summarize the changes you made in a 1-4 sentences, don't go into too much detail.
```

### Compaction Model
Uses an external model for summarization with configurable providers (Anthropic, OpenAI, generic chat-completion API).

---

## Proposed Slice-Based Compaction

### Concept
Instead of summarizing the entire history in one call:
1. Segment the session into slices (possibly overlapping)
2. Summarize each slice with a budget of roughly 1/10th of the slice's token count
3. Stitch the slice summaries together into a unified summary

### Benefits
- More parallelizable (summarize slices concurrently)
- Less risk of losing detail in long sessions
- Better "compression ratio" control per slice
- Overlapping slices can preserve continuity/context
- Can prioritize recent slices with larger budgets

### Proposed Algorithm

```typescript
interface SliceConfig {
  sliceTokens: number;      // Target tokens per slice (e.g., 20K)
  overlapTokens: number;    // Overlap between slices (e.g., 2K)
  compressionRatio: number; // Summary budget as fraction of slice (e.g., 0.1)
  recentBoost: number;      // Multiplier for most recent slice budget (e.g., 2.0)
}

async function sliceBasedCompaction(
  messages: Message[],
  config: SliceConfig
): Promise<string> {
  // 1. Segment into slices
  const slices = segmentIntoSlices(messages, config.sliceTokens, config.overlapTokens);

  // 2. Calculate the summary budget per slice
  const budgets = slices.map((slice, i) => {
    const base = estimateTokens(slice) * config.compressionRatio;
    // Boost the two most recent slices
    const isRecent = i >= slices.length - 2;
    return Math.floor(isRecent ? base * config.recentBoost : base);
  });

  // 3. Summarize slices in parallel
  const summaries = await Promise.all(
    slices.map((slice, i) => summarizeSlice(slice, budgets[i], i, slices.length))
  );

  // 4. Stitch the summaries together
  return stitchSummaries(summaries);
}
```
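One possible shape for the `segmentIntoSlices` helper used above, assuming each `Message` carries a precomputed token count and that the overlap is realized by re-seeding the next slice with the previous slice's trailing messages; all of this is an illustrative sketch, not a committed design:

```typescript
interface Message {
  tokens: number;
}

// Sketch: greedily fill slices to sliceTokens, then carry roughly
// overlapTokens of trailing messages into the next slice for continuity.
function segmentIntoSlices(
  messages: Message[],
  sliceTokens: number,
  overlapTokens: number
): Message[][] {
  const slices: Message[][] = [];
  let current: Message[] = [];
  let currentTokens = 0;
  let fresh = 0; // messages added since the last slice boundary

  for (const msg of messages) {
    current.push(msg);
    currentTokens += msg.tokens;
    fresh++;
    if (currentTokens >= sliceTokens) {
      slices.push(current);
      // Seed the next slice with trailing messages up to overlapTokens
      const overlap: Message[] = [];
      let overlapSum = 0;
      for (let i = current.length - 1; i >= 0 && overlapSum < overlapTokens; i--) {
        overlap.unshift(current[i]);
        overlapSum += current[i].tokens;
      }
      current = overlap;
      currentTokens = overlapSum;
      fresh = 0;
    }
  }
  if (fresh > 0) slices.push(current); // trailing partial slice
  return slices;
}
```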

### Slice Summarization Prompt (per slice)

```
You are summarizing slice ${sliceIndex + 1} of ${totalSlices} from a coding session.

${sliceIndex === 0 ? 'This is the BEGINNING of the session.' : ''}
${sliceIndex === totalSlices - 1 ? 'This is the MOST RECENT activity.' : ''}

Summarize the key information in this slice:
- User requests and intent changes
- Files read, created, or modified (with paths)
- Key code changes or patterns
- Errors encountered and how they were resolved
- Decisions made and their rationale

${sliceIndex === totalSlices - 1 ? `
For the most recent slice, also include:
- Current work in progress
- Exact state of any pending tasks
- Next steps that were planned
` : ''}

Be precise and technical. Preserve file paths and important code snippets.
Budget: approximately ${budget} tokens.
```

### Stitching Prompt

```
You have ${summaries.length} chronological slice summaries from a coding session.
Combine them into a single coherent handoff summary for another LLM.

Requirements:
- Preserve chronological flow
- Deduplicate information that appears in overlapping sections
- Emphasize the most recent work and next steps
- Keep all file paths and critical code snippets
- Total budget: ${totalBudget} tokens

Slice summaries:
${summaries.map((s, i) => `--- Slice ${i + 1} ---\n${s}`).join('\n\n')}
```

---

## Comparison Table

| Feature | Pi (Current) | Claude Code | OpenAI Codex | SST OpenCode |
|---------|--------------|-------------|--------------|--------------|
| Prompt detail | Basic | Very detailed | Basic | Medium |
| Structured output | No | Yes (`<summary>`) | No | No |
| Chain-of-thought | No | Yes (`<analysis>`) | No | No |
| Custom instructions | Yes | Yes | Yes (config) | No |
| Tool output pruning | No | Yes (microcompact) | No | Yes |
| Parallel summarization | No | No | No | No |
| Scratchpad/persistent memory | No | No | No | No |

---

## Test Harness

A CLI test tool is available at [compaction-strategies.ts](./compaction-strategies.ts) to compare strategies:

```bash
npx tsx docs/compaction-strategies.ts before-compaction
npx tsx docs/compaction-strategies.ts large-session
```

This writes results to `compaction-results/[fixture]-[strategy].md` (in the repo root) plus a comparison file.

### Implemented Strategies

1. **single-shot**: Current approach; one LLM call with the full transcript
2. **parallel-stitch**: Slice into chunks, summarize in parallel, LLM-merge the results
3. **sequential-accumulated**: Slice into chunks, summarize each with all previous summaries as context
4. **sequential-rolling**: Slice into chunks; each call updates/rewrites the running summary

### Example Results (30K-token session, 4 slices)

| Strategy | Input Tokens | Output Tokens | API Calls | Time (ms) |
|----------|--------------|---------------|-----------|-----------|
| single-shot | 35,706 | 1,284 | 1 | 31,914 |
| parallel-stitch | 37,850 | 3,087 | 5 | 34,010 |
| sequential-accumulated | 39,136 | 2,996 | 4 | 66,907 |
| sequential-rolling | 38,873 | 4,557 | 4 | 98,032 |

Observations:
- **single-shot**: Fastest and simplest, but the entire context goes into one call
- **parallel-stitch**: Similar wall-clock time (slices run in parallel), but needs an extra stitch call
- **sequential-accumulated**: Roughly 2x the time, but each slice sees the full prior context
- **sequential-rolling**: Slowest and produces the most output (rewrites the summary each time)

---

## Recommendations

### Short Term
1. **Improve the prompt**: Adopt Claude Code's structured format with explicit sections
2. **Add pruning**: Implement tool-output pruning like OpenCode's (mark old outputs as compacted)
3. **Better token estimation**: Use an actual tokenizer instead of the chars/4 heuristic
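For reference, the chars/4 heuristic referred to in item 3 looks like this; it is a rough average for English prose and can be far off for code, whitespace-heavy text, or non-English content:

```typescript
// The chars/4 heuristic: roughly four characters per token on average.
function estimateTokensCheap(text: string): number {
  return Math.ceil(text.length / 4);
}
```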

### Medium Term
1. **Slice-based compaction**: Implement parallel slice summarization
2. **Persistent scratchpad**: Add working memory that survives compaction
3. **Custom instructions**: Support a user-provided compaction focus

### Long Term
1. **Semantic chunking**: Use embeddings to find natural break points
2. **Importance scoring**: Weight messages by relevance to the current task
3. **Incremental compaction**: Compact older portions while keeping recent history detailed