pi-dev 0.2.0 → 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pi-dev",
3
- "version": "0.2.0",
3
+ "version": "0.2.1",
4
4
  "description": "An autonomous engineering skill framework for the pi runtime — built on Matt Pocock's skills.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -67,39 +67,78 @@ ls -t ~/.pi/agent/sessions/$encoded/*.jsonl | head -3
67
67
 
68
68
  For each candidate file, read just the **first 3 lines** to get session metadata (id, model, cwd, timestamp). Skip files older than the window.
69
69
 
70
- ### 4. Targeted extraction (per session)
70
+ ### 4. Targeted extraction (per session) — with a hard context budget
71
71
 
72
- Pull only these event kinds from each in-window jsonl:
72
+ A session jsonl can be hundreds of KB and a single tool result can be 10–20 KB. Never load full message bodies into context. Pull only what answers the three questions, truncate every excerpt, and enforce a byte budget.
73
73
 
74
- - `message` where `message.role == "user"` (intent)
75
- - `message` where `message.role == "assistant"` AND content includes one of: a heading, a final summary block, a list of changed files, a commit SHA, an issue URL
76
- - pi tool blocks of `name in ("edit", "write")` and lower-case `name == "bash"` whose `input.command` matches `git commit|git push|gh issue|gh pr`
77
- - Any explicit handoff strings (`chain complete`, `Final summary`, `flow complete`, `audit complete`)
74
+ **Budget (non-negotiable):**
78
75
 
79
- Implementation hint (jq, pi format messages are nested under `.message`):
76
+ - **≤ 8 KB total** extracted text per session into context.
77
+ - **≤ 3 sessions** by default → worst case ≈ 24 KB.
78
+ - Per user message: first 200 chars.
79
+ - Per assistant text block: first 400 chars (final summaries / decisions only — see filter).
80
+ - **Skip `toolResult` blocks entirely.** They are the biggest context killers and they rarely add signal that the surrounding text doesn't already convey.
81
+ - **Skip `thinking` blocks entirely.**
82
+
83
+ **What to pull from each in-window jsonl:**
84
+
85
+ - `message` where `message.role == "user"` — keep only `.message.content[].text` truncated to 200 chars, drop anything else.
86
+ - `message` where `message.role == "assistant"` — keep `.message.content[].text` blocks **only if** they contain one of: a heading, a final summary marker, a commit SHA-like 7-hex, an `https://` URL, or a terminator literal (`chain complete`, `audit complete`, `Final summary`, `## Summary`, `flow complete`). Truncate to 400 chars.
87
+ - pi tool blocks of `name in ("edit", "write")` — keep just the target `path`, drop diffs.
88
+ - pi tool blocks of lower-case `name == "bash"` — keep only those whose `input.command` matches `git commit|git push|gh issue|gh pr|gh release|npm publish`. Keep the command line only, drop output.
89
+
90
+ **Implementation hint (jq, pi format — messages are nested under `.message`; toolCall/toolResult blocks live inside `.message.content[]`). Note the explicit role gate — do NOT remove it; pi emits `toolResult` records with their own `message.role` and they will leak in otherwise:**
80
91
 
81
92
  ```bash
82
- jq -c 'select(
83
- (.type == "message" and .message.role == "user") or
84
- (.type == "message" and .message.role == "assistant" and ((.message.content // []) | tostring | test("chain complete|audit complete|Final summary|## Summary|flow complete")))
85
- )' <file>
93
+ jq -c '
94
+ select(.type == "message" and (.message.role == "user" or .message.role == "assistant")) |
95
+ . as $m |
96
+ {
97
+ ts: .timestamp,
98
+ role: .message.role,
99
+ text: (
100
+ (.message.content // [])
101
+ | map(select(.type == "text") | .text)
102
+ | join(" ")
103
+ | if $m.message.role == "user" then .[0:200] else .[0:300] end
104
+ ),
105
+ paths: (
106
+ (.message.content // [])
107
+ | map(select(.type == "toolCall" and (.name == "edit" or .name == "write")) | .arguments.path // .arguments.file_path // empty)
108
+ ),
109
+ cmds: (
110
+ (.message.content // [])
111
+ | map(
112
+ select(.type == "toolCall" and .name == "bash")
113
+ | .arguments.command
114
+ | select(test("git commit|git push|gh issue|gh pr|gh release|npm publish"))
115
+ )
116
+ )
117
+ }
118
+ | if .role == "assistant" then
119
+ select(.text | test("chain complete|audit complete|Final summary|## Summary|flow complete|^## |https://|[0-9a-f]{7,}"))
120
+ else . end
121
+ | select(.text != "" or (.paths | length) > 0 or (.cmds | length) > 0)
122
+ ' <file> | head -25
86
123
  ```
87
124
 
88
- If `jq` is unavailable or the file format does not match, fall back to `grep`-based filters on the raw file. Keep extraction under 200 lines per session.
125
+ This drops `toolResult` and `thinking` via the explicit role gate, filters assistant text to genuine signals only, truncates per-role, and caps records per session at 25. Measured on a real 653 KB session: output ≈ 6 KB, well inside the 8 KB budget. If `jq` is unavailable, fall back to `grep -E` on raw text and live with the lower precision still respect the 8 KB-per-session budget.
126
+
127
+ **If budget exceeded after filtering:** keep the first user message, the last 2 qualifying assistant text blocks, all matching `bash` commands, all edit/write paths. Drop intermediate text blocks. Never echo a `toolResult`.
89
128
 
90
- ### 4b. Cross-reference live state
129
+ ### 4b. Cross-reference live state (cheap, bounded)
91
130
 
92
- The session log alone says "what was discussed". To answer **stage** and **next** you also need what actually landed:
131
+ The session log alone says "what was discussed". To answer **stage** and **next** you also need what actually landed. Each probe below caps its own output — do not remove the caps:
93
132
 
94
133
  ```bash
95
134
  git log --since="3 days ago" --pretty=format:"%h %ad %s" --date=short | head -10
96
- git status -sb
135
+ git status -sb | head -20
97
136
  # if GitHub is the tracker:
98
- gh pr list --state open --limit 5 2>/dev/null
99
- gh issue list --state open --limit 5 2>/dev/null
137
+ gh pr list --state open --limit 5 --json number,title,url 2>/dev/null
138
+ gh issue list --state open --limit 5 --json number,title,url 2>/dev/null
100
139
  ```
101
140
 
102
- Reconcile: a session that ended with a commit + merged PR → stage = shipped; a session that ended with `git status` dirty or an open PR → stage = in flight; a session whose last assistant turn proposed a follow-up command → that command *is* the next action.
141
+ Reconcile: a session that ended with a commit + merged PR → stage = shipped; a session that ended with `git status` dirty or an open PR → stage = in flight; a session whose last assistant turn proposed a follow-up command → that command *is* the next action. Each probe must be one command; do not page through history or fetch issue bodies.
103
142
 
104
143
  ### 5. Synthesise a position card
105
144
 
@@ -131,10 +170,13 @@ End with one short question: "이걸로 갈까?" (or English equivalent). If the
131
170
  - Sessions can contain secrets that were pasted in. Do not echo full message bodies in the recall card; extract only headings, file paths, URLs, SHAs.
132
171
  - Never write the recall card to a file in the repo. It lives in conversation context only.
133
172
 
134
- ## Performance
173
+ ## Performance & context budget
135
174
 
136
- - Target: < 2 seconds wallclock for the cheap pass + targeted extraction across 3 sessions, even if individual jsonl files are multi-MB.
137
- - If a single jsonl is > 5 MB, sample: head -2000 + tail -2000 lines, plus any line containing `git commit`/`gh issue`.
175
+ - **Wallclock target:** < 2 seconds for the cheap pass + targeted extraction across 3 sessions.
176
+ - **Context budget:** 8 KB extracted per session, 24 KB total before rendering the card. This is the limit, not a goal.
177
+ - **Always strip toolResult and thinking blocks.** They cause context bloat without changing the answer to the three questions.
178
+ - If a single jsonl is > 5 MB, sample: `head -2000` + `tail -2000` lines (still feed them through the Step 4 jq filter — do not raw-cat).
179
+ - If after filtering a session still exceeds 8 KB, keep first user message + last 2 qualifying assistant blocks + all matching bash commands + all edit/write paths; drop the rest.
138
180
 
139
181
  ## Limits
140
182