pine-of-glass 0.2.4

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Thomas Mustier
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,20 @@
1
+ # pine-of-glass
2
+
3
+ Small observability and context management tools for the [Pi coding agent](https://github.com/earendil-works/pi-coding-agent).
4
+ 1. [`contextimate`](./extensions/pi-contextimate) breaks down what is filling your context window: sysprompt, AGENTS.md, Skill frontmatter, Tool schemas, and session material. Toggle with ctrl+o on start and /reload.
5
+ 2. [`traceline`](./extensions/pi-traceline) collapses tool calls to one trace line each, so you can see the full arc of what Pi did (path taken, context read, bloated tool results). Toggle with ctrl+t.
6
+ 3. `cachemire` coming soon.
7
+
8
+ See each extension's own `README.md` for details, and `docs/` for deeper reference.
9
+
10
+ ## Installation
11
+ - From GitHub: `pi install git:github.com/tmustier/pine-of-glass`
12
+ - From npm (installs both `contextimate` and `traceline`): `pi install npm:pine-of-glass`
13
+
14
+ ## Screenshots
15
+ ### [Contextimate](./extensions/pi-contextimate)
16
+ <img width="1477" height="589" alt="image" src="https://github.com/user-attachments/assets/8ff81aa2-f61b-4d8d-9507-f455f37c12cc" />
17
+
18
+ ### [Traceline](./extensions/pi-traceline)
19
+ <img width="709" height="409" alt="image" src="https://github.com/user-attachments/assets/4a59fbae-8270-46d3-a4fc-fdf2e5c3ba8c" />
20
+
@@ -0,0 +1,245 @@
1
+ # Codex context-accounting comparison for pi-contextimate
2
+
3
+ This note captures how upstream OpenAI Codex estimates active context usage, especially around interruption and compaction, so `pi-contextimate` can compare its own policy against Codex without mixing the two models.
4
+
5
+ Source snapshot used for this note: `openai/codex` commit [`0c5ccd18abda96efaed9e94e26ffe22def5e28ed`](https://github.com/openai/codex/tree/0c5ccd18abda96efaed9e94e26ffe22def5e28ed).
6
+
7
+ ## Executive summary
8
+
9
+ Codex keeps two token-usage concepts:
10
+
11
+ - `total_token_usage`: accumulated session spend.
12
+ - `last_token_usage`: Codex's active-context number, used by the CLI/TUI context-window indicator.
13
+
14
+ For normal completed model turns, `last_token_usage.total_tokens` comes from provider usage on `response.completed`. When local history has grown after that last successful response, Codex adds an estimate for every item after the most recent model-generated item.
15
+
16
+ For compaction, Codex does **not** wait for a later assistant response to get a fresh provider usage number. After installing compacted history it recomputes a local estimate and writes that estimate into `last_token_usage.total_tokens`.
17
+
18
+ Pi currently differs: after compaction, if there is no post-compaction assistant usage yet, Pi can report context usage as `null`/unknown and let the inspector fall back to heuristics.
19
+
20
+ ## Latest active-context estimate
21
+
22
+ Codex's active-context estimator is:
23
+
24
+ ```text
25
+ active_context_tokens = latest_provider_last_token_usage.total_tokens
26
+ + estimated_tokens_of_items_after_last_model_generated_item
27
+ ```
28
+
29
+ There is one extra branch: if the server has not indicated that past reasoning is included, Codex also adds estimated tokens for non-last encrypted reasoning items.
30
+
31
+ Important boundary: `items_after_last_model_generated_item()` starts after the newest model-generated item. Model-generated items include assistant messages, reasoning, function/tool calls, shell calls, web/search/image calls, compaction items, and context-compaction items. Tool outputs and new user messages after that boundary are estimated locally.
32
+
33
+ Source:
34
+
35
+ - [`ContextManager::items_after_last_model_generated_item`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L303-L310)
36
+ - [`ContextManager::get_total_token_usage`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L314-L331)
37
+ - [`ContextManager::get_total_token_usage_breakdown`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L334-L358)
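The estimator above can be sketched in a few lines of Python. This is a minimal illustration, not Codex's code: `HistoryItem`, its fields, and the example byte counts are hypothetical, and the extra encrypted-reasoning branch is omitted. The note does not say what the tail is when history contains no model-generated item, so the sketch assumes an empty tail in that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HistoryItem:
    """Hypothetical stand-in for a history item (not a real Codex type)."""
    kind: str                 # e.g. "user", "tool_call", "tool_output"
    model_generated: bool     # True for assistant messages, reasoning, tool calls, ...
    model_visible_bytes: int  # local byte estimate for this item


def items_after_last_model_generated_item(items: List[HistoryItem]) -> List[HistoryItem]:
    """Return the part of history that follows the newest model-generated item."""
    for index in range(len(items) - 1, -1, -1):
        if items[index].model_generated:
            return items[index + 1:]
    # Assumption of this sketch: no model-generated item means an empty tail.
    return []


def active_context_tokens(last_provider_total: int, items: List[HistoryItem]) -> int:
    """Provider-reported total plus a local estimate for the post-model tail."""
    tail = items_after_last_model_generated_item(items)
    tail_tokens = sum((item.model_visible_bytes + 3) // 4 for item in tail)  # ceil(bytes / 4)
    return last_provider_total + tail_tokens


history = [
    HistoryItem("user", False, 400),
    HistoryItem("tool_call", True, 100),
    HistoryItem("tool_output", False, 10),
    HistoryItem("user", False, 5),
]
print(active_context_tokens(1000, history))  # 1000 + ceil(10/4) + ceil(5/4) = 1005
```

Only the tool output and the trailing user message are estimated locally; everything up to and including the tool call is assumed to be covered by the provider-reported total.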
38
+
39
+ ## Text and item token heuristic
40
+
41
+ Plain text uses a four-bytes-per-token heuristic:
42
+
43
+ ```text
44
+ approx_token_count(text) = ceil(text.length / 4)
45
+ approx_tokens_from_byte_count(bytes) = ceil(bytes / 4)
46
+ ```
47
+
48
+ For full history items:
49
+
50
+ ```text
51
+ estimate_item_token_count(item) = ceil(model_visible_bytes(item) / 4)
52
+ ```
53
+
54
+ Source:
55
+
56
+ - [`approx_token_count`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/utils/string/src/truncate.rs#L71-L83)
57
+ - [`estimate_item_token_count`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L516-L518)
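A minimal Python sketch of the heuristic, for comparison purposes only. It assumes `text.length` means the UTF-8 byte length, which matches the "four bytes per token" wording; if Pi measures JavaScript string length instead, non-ASCII text will give different numbers.

```python
def approx_tokens_from_byte_count(byte_count: int) -> int:
    """Four-bytes-per-token heuristic, rounded up with integer arithmetic."""
    return (byte_count + 3) // 4


def approx_token_count(text: str) -> int:
    """Estimate tokens for plain text (assumption: length is UTF-8 bytes)."""
    return approx_tokens_from_byte_count(len(text.encode("utf-8")))


def estimate_item_token_count(model_visible_bytes: int) -> int:
    """Estimate tokens for a history item from its model-visible byte count."""
    return approx_tokens_from_byte_count(model_visible_bytes)


print(approx_token_count("abcd"))   # 1
print(approx_token_count("abcde"))  # 2
```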
58
+
59
+ ## What Codex means by “model-visible bytes”
60
+
61
+ “Model-visible bytes” is Codex's local proxy for what counts toward context, not simply local object size and not necessarily human-visible text.
62
+
63
+ For ordinary `ResponseItem`s, Codex starts with `serde_json::to_string(item).len()` and then adjusts special payloads:
64
+
65
+ 1. Inline image data URLs: subtract raw base64 payload bytes and add an image-cost estimate.
66
+ 2. Encrypted function-output content items: subtract the encrypted string length and add a smaller decoded-size estimate.
67
+ 3. Encrypted reasoning / compaction blocks: do not use serialized JSON length; estimate from encrypted payload length.
68
+
69
+ Source: [`estimate_response_item_model_visible_bytes`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L543-L570).
70
+
71
+ ### Encrypted reasoning / compaction blocks
72
+
73
+ For these items:
74
+
75
+ - `ResponseItem::Reasoning { encrypted_content: Some(...) }`
76
+ - `ResponseItem::Compaction { encrypted_content }`
77
+ - `ResponseItem::ContextCompaction { encrypted_content: Some(...) }`
78
+
79
+ Codex estimates:
80
+
81
+ ```text
82
+ model_visible_bytes = max(0, floor(encrypted_content.length * 3 / 4) - 650)
83
+ tokens = ceil(model_visible_bytes / 4)
84
+ ```
85
+
86
+ This is effectively an estimate for opaque encrypted thinking / compaction payloads that are carried client-side and replayed into later requests. It does not count visible reasoning-summary text; it estimates the model-visible cost of the encrypted carry-forward blob.
87
+
88
+ Source:
89
+
90
+ - [`estimate_reasoning_length`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L504-L509)
91
+ - reasoning/compaction branch in [`estimate_response_item_model_visible_bytes`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L543-L555)
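As a worked example, the formula can be written in Python as below. The function names are illustrative and the input lengths are made up. The 3/4 factor is the ratio of decoded bytes to base64 characters; reading the 650 as a fixed per-blob overhead is an interpretation, not something this note confirms.

```python
def estimate_reasoning_bytes(encrypted_len: int) -> int:
    """max(0, floor(len * 3 / 4) - 650), using integer arithmetic."""
    return max(0, (encrypted_len * 3) // 4 - 650)


def estimate_reasoning_tokens(encrypted_len: int) -> int:
    """ceil(model_visible_bytes / 4)."""
    return (estimate_reasoning_bytes(encrypted_len) + 3) // 4


print(estimate_reasoning_tokens(4000))  # floor(3000) - 650 = 2350 bytes -> 588 tokens
print(estimate_reasoning_tokens(800))   # 600 - 650 is negative, clamped to 0 -> 0 tokens
```

Short blobs (under roughly 867 characters) therefore contribute zero tokens to the estimate.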
92
+
93
+ ### Encrypted function outputs
94
+
95
+ Encrypted function-output content uses a different adjustment:
96
+
97
+ ```text
98
+ replacement_bytes = ceil(encrypted_content.length * 9 / 16)
99
+ ```
100
+
101
+ Source:
102
+
103
+ - [`estimate_encrypted_function_output_length`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L512-L513)
104
+ - [`encrypted_function_output_estimate_adjustment`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L692-L714)
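A small Python sketch of how this replacement could feed into the model-visible byte count, assuming the adjustment is "serialized length, minus the encrypted string, plus the replacement" as described earlier in this note. The function names and the example sizes are hypothetical.

```python
def estimate_encrypted_function_output_bytes(encrypted_len: int) -> int:
    """ceil(len * 9 / 16), using integer arithmetic."""
    return (encrypted_len * 9 + 15) // 16


def adjusted_model_visible_bytes(serialized_len: int, encrypted_len: int) -> int:
    """Swap the encrypted string's length for its smaller decoded-size estimate."""
    replacement = estimate_encrypted_function_output_bytes(encrypted_len)
    return serialized_len - encrypted_len + replacement


# An item whose JSON is 1,700 bytes, of which 1,600 are encrypted content.
visible = adjusted_model_visible_bytes(1700, 1600)
print(visible)             # 1700 - 1600 + 900 = 1000
print((visible + 3) // 4)  # 250 tokens
```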
105
+
106
+ ### Images
107
+
108
+ For non-`original` image detail, Codex replaces the base64 image payload with a fixed byte estimate:
109
+
110
+ ```text
111
+ RESIZED_IMAGE_BYTES_ESTIMATE = 7,373 bytes
112
+ ≈ 1,844 tokens with ceil(bytes / 4)
113
+ ```
114
+
115
+ For `detail: "original"`, Codex attempts to decode dimensions and estimates a 32px patch count capped at 10,000 patches, then converts patches to bytes with the same four-bytes-per-token helper.
116
+
117
+ Source:
118
+
119
+ - image constants and comment: [`history.rs#L521-L533`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L521-L533)
120
+ - original-detail patch estimate: [`history.rs#L609-L642`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L609-L642)
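The two image paths can be sketched as follows. Only the constants come from this note. The patch-grid rounding (`ceil(width / 32) * ceil(height / 32)`) and the conversion of one patch to one token's worth of bytes are assumptions made for illustration; check the linked source before relying on them.

```python
RESIZED_IMAGE_BYTES_ESTIMATE = 7_373
PATCH_SIZE_PX = 32
MAX_PATCHES = 10_000
BYTES_PER_TOKEN = 4


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def resized_image_tokens() -> int:
    """Token estimate for any image that is not `detail: "original"`."""
    return ceil_div(RESIZED_IMAGE_BYTES_ESTIMATE, BYTES_PER_TOKEN)


def original_image_patches(width: int, height: int) -> int:
    """Assumed patch grid: 32px tiles in each dimension, capped at 10,000."""
    patches = ceil_div(width, PATCH_SIZE_PX) * ceil_div(height, PATCH_SIZE_PX)
    return min(patches, MAX_PATCHES)


def original_image_bytes(width: int, height: int) -> int:
    """Assumption: one patch is treated as one token, i.e. four bytes."""
    return original_image_patches(width, height) * BYTES_PER_TOKEN


print(resized_image_tokens())              # 1844
print(original_image_patches(1024, 768))   # 32 * 24 = 768
print(original_image_patches(4000, 4000))  # 125 * 125 = 15625, capped to 10000
```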
121
+
122
+ ## Post-compaction estimate
123
+
124
+ After compaction, Codex replaces the in-memory history and then calls `recompute_token_usage()`.
125
+
126
+ That recompute does:
127
+
128
+ ```text
129
+ post_compaction_context_tokens = ceil(base_instructions.length / 4)
130
+ + sum(estimate_item_token_count(item) for item in compacted_history)
131
+ ```
132
+
133
+ Then it stores:
134
+
135
+ ```text
136
+ last_token_usage = {
137
+ input_tokens: 0,
138
+ cached_input_tokens: 0,
139
+ output_tokens: 0,
140
+ reasoning_output_tokens: 0,
141
+ total_tokens: post_compaction_context_tokens
142
+ }
143
+ ```
144
+
145
+ Source:
146
+
147
+ - [`ContextManager::estimate_token_count_with_base_instructions`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/context_manager/history.rs#L149-L162)
148
+ - [`Session::recompute_token_usage`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/session/mod.rs#L2970-L3005)
149
+ - local compaction calls recompute after replacement: [`compact.rs#L292-L294`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact.rs#L292-L294)
150
+ - remote compaction calls recompute after replacement: [`compact_remote.rs#L261-L263`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact_remote.rs#L261-L263)
151
+ - remote compaction v2 calls recompute after replacement: [`compact_remote_v2.rs#L275-L277`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact_remote_v2.rs#L275-L277)
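Put together, the recompute step looks roughly like this in Python. It is a sketch under stated assumptions: the function name is made up, items are represented only by their model-visible byte counts, and the instruction length is taken as UTF-8 bytes.

```python
from typing import Dict, List


def ceil_div4(n: int) -> int:
    """ceil(n / 4) with integer arithmetic."""
    return (n + 3) // 4


def post_compaction_usage(base_instructions: str, item_bytes: List[int]) -> Dict[str, int]:
    """Build a last_token_usage-shaped record from a purely local estimate."""
    instruction_tokens = ceil_div4(len(base_instructions.encode("utf-8")))
    history_tokens = sum(ceil_div4(b) for b in item_bytes)
    return {
        "input_tokens": 0,
        "cached_input_tokens": 0,
        "output_tokens": 0,
        "reasoning_output_tokens": 0,
        "total_tokens": instruction_tokens + history_tokens,
    }


usage = post_compaction_usage("x" * 1000, [401, 8])
print(usage["total_tokens"])  # 250 + 101 + 2 = 353
```

Note that every field except `total_tokens` is zero, so a consumer cannot distinguish input from output tokens after compaction; only the active-context total is meaningful.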
152
+
153
+ ## What is in compacted history
154
+
155
+ ### Local summarization compaction
156
+
157
+ Local compaction builds replacement history from:
158
+
159
+ 1. retained recent user messages, capped by `COMPACT_USER_MESSAGE_MAX_TOKENS = 20_000` using `chars / 4`;
160
+ 2. a user-role summary message;
161
+ 3. optionally re-injected initial context before the last real user message.
162
+
163
+ Source:
164
+
165
+ - token cap: [`compact.rs#L47-L49`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact.rs#L47-L49)
166
+ - replacement history builder: [`compact.rs#L474-L538`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact.rs#L474-L538)
167
+
168
+ ### Remote compaction v2
169
+
170
+ Remote compaction v2 constructs a compact request by appending `ResponseItem::CompactionTrigger` to the prompt input. The compacted history it installs is:
171
+
172
+ 1. retained `user` / `developer` / `system` messages from the prompt input;
173
+ 2. filtered through `should_keep_compacted_history_item`;
174
+ 3. truncated newest-first to `RETAINED_MESSAGE_TOKEN_BUDGET = 64_000` approximate text tokens;
175
+ 4. appended with the returned `ResponseItem::Compaction { encrypted_content }`.
176
+
177
+ The returned encrypted compaction item is exactly the kind of opaque payload covered by the encrypted-block estimate above.
178
+
179
+ Source:
180
+
181
+ - retained cap: [`compact_remote_v2.rs#L48-L50`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact_remote_v2.rs#L48-L50)
182
+ - request appends `CompactionTrigger`: [`compact_remote_v2.rs#L209-L217`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact_remote_v2.rs#L209-L217)
183
+ - compacted-history shape: [`compact_remote_v2.rs#L409-L421`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact_remote_v2.rs#L409-L421)
184
+ - newest-first retained-message truncation: [`compact_remote_v2.rs#L433-L456`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact_remote_v2.rs#L433-L456)
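The newest-first retention step can be illustrated with the sketch below. It assumes a message that does not fit the remaining budget ends the walk and is dropped whole; Codex may instead truncate that message's text, which this note does not specify. Function and parameter names are hypothetical.

```python
from typing import List

RETAINED_MESSAGE_TOKEN_BUDGET = 64_000


def approx_token_count(text: str) -> int:
    """Four-bytes-per-token heuristic (assumption: UTF-8 byte length)."""
    return (len(text.encode("utf-8")) + 3) // 4


def retain_newest_first(messages: List[str],
                        budget_tokens: int = RETAINED_MESSAGE_TOKEN_BUDGET) -> List[str]:
    """Keep the newest messages that fit the budget, returned in original order."""
    kept: List[str] = []
    remaining = budget_tokens
    for text in reversed(messages):  # walk from newest to oldest
        cost = approx_token_count(text)
        if cost > remaining:
            break  # assumption: stop at the first message that does not fit
        kept.append(text)
        remaining -= cost
    kept.reverse()  # restore chronological order
    return kept


messages = ["a" * 40, "b" * 40, "c" * 40]  # 10 tokens each
print(retain_newest_first(messages, budget_tokens=25))  # keeps the "b" and "c" messages
```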
185
+
186
+ ## Remote-compaction failure diagnostics
187
+
188
+ When remote compaction fails, Codex logs the following diagnostics:
189
+
190
+ - `last_api_response_total_tokens`
191
+ - `all_history_items_model_visible_bytes`
192
+ - `estimated_tokens_of_items_added_since_last_successful_api_response`
193
+ - `estimated_bytes_of_items_added_since_last_successful_api_response`
194
+ - `failing_compaction_request_model_visible_bytes`
195
+
196
+ The failing request byte estimate is:
197
+
198
+ ```text
199
+ instructions.length + sum(model_visible_bytes(input_item))
200
+ ```
201
+
202
+ Source: [`compact_remote.rs#L335-L373`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact_remote.rs#L335-L373).
203
+
204
+ Before remote compaction, Codex also tries to trim trailing Codex-generated history while the full estimate exceeds the context window. It only removes trailing Codex-generated items, not arbitrary user history.
205
+
206
+ Source: [`trim_function_call_history_to_fit_context_window`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/compact_remote.rs#L376-L402).
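A rough Python sketch of that trimming loop. The `Item` type and its `codex_generated` flag are hypothetical, and the estimate is simplified to base-instruction tokens plus per-item `ceil(bytes / 4)`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    """Hypothetical history item; `codex_generated` marks trimmable items."""
    codex_generated: bool
    model_visible_bytes: int


def estimate_tokens(items: List[Item], base_tokens: int) -> int:
    """Base instructions plus ceil(bytes / 4) for every item."""
    return base_tokens + sum((i.model_visible_bytes + 3) // 4 for i in items)


def trim_trailing_generated(items: List[Item], base_tokens: int, context_window: int) -> List[Item]:
    """Drop trailing Codex-generated items while the estimate exceeds the window."""
    trimmed = list(items)
    while (trimmed
           and trimmed[-1].codex_generated
           and estimate_tokens(trimmed, base_tokens) > context_window):
        trimmed.pop()
    return trimmed


history = [Item(False, 4000), Item(True, 2000), Item(True, 2000)]
print(len(trim_trailing_generated(history, base_tokens=0, context_window=1200)))  # 1
```

The loop stops as soon as the last item is user history, even if the estimate still exceeds the window, which mirrors the "not arbitrary user history" constraint described above.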
207
+
208
+ ## Interruption behavior
209
+
210
+ If a turn is interrupted before `response.completed`, Codex does not have fresh provider token usage for the partial turn. It records a model-visible interrupted-turn marker and emits `TurnAborted`, but it does not synthesize a new provider-like `TokenCount` from the partial response.
211
+
212
+ Any later active-context calculation therefore falls back to the previous `last_token_usage.total_tokens` plus the local post-model tail estimate described above.
213
+
214
+ Source:
215
+
216
+ - interrupted-turn marker: [`tasks/mod.rs#L87-L106`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/tasks/mod.rs#L87-L106)
217
+ - abort records marker and emits `TurnAborted`: [`tasks/mod.rs#L811-L864`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/tasks/mod.rs#L811-L864)
218
+ - provider usage is recorded on `ResponseEvent::Completed`: [`turn.rs#L2000-L2014`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/core/src/session/turn.rs#L2000-L2014)
219
+
220
+ ## TUI/status implication
221
+
222
+ Codex's UI uses `last_token_usage`, not accumulated `total_token_usage`, for context-window percentage.
223
+
224
+ Source:
225
+
226
+ - TUI token model: [`tui/src/token_usage.rs#L37-L50`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/tui/src/token_usage.rs#L37-L50)
227
+ - status card uses `last_token_usage`: [`status/card.rs#L327-L337`](https://github.com/openai/codex/blob/0c5ccd18abda96efaed9e94e26ffe22def5e28ed/codex-rs/tui/src/status/card.rs#L327-L337)
228
+
229
+ ## Comparison policy for pi-contextimate
230
+
231
+ Recommended interpretation for `pi-contextimate`:
232
+
233
+ 1. Keep Pi's top-line current-context usage as source of truth when `ctx.getContextUsage()` returns tokens.
234
+ 2. Keep Pi's post-compaction `null`/unknown behavior explicit; do not silently pretend a provider usage exists.
235
+ 3. Use Codex's method only as a comparison or optional Codex-native fallback after compaction:
236
+ - base instructions `chars / 4`;
237
+ - provider-shaped Responses items via serialized JSON bytes;
238
+ - special estimates for encrypted reasoning / compaction blocks if Pi is actually replaying Codex-native encrypted items.
239
+ 4. Do **not** estimate hidden thinking from arbitrary local encrypted/signature string length unless that string is actually a provider-native item being replayed into the model. Otherwise keep Pi's current residual bucket:
240
+
241
+ ```text
242
+ Other / reasoning = Pi current-context total - locally countable visible pieces
243
+ ```
244
+
245
+ That residual is safer because it absorbs opaque reasoning replay, provider overhead, images and other unusual blocks, and estimation error without claiming that the encrypted characters are themselves the true context payload.
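For illustration, the residual bucket could be computed as in this Python sketch. The function name, the bucket labels, and the numbers are made up; clamping at zero when local estimates overshoot the provider total is an assumption, and `None` stands in for Pi's `null`/unknown post-compaction state.

```python
from typing import Dict, Optional


def other_reasoning_bucket(context_total: Optional[int],
                           countable: Dict[str, int]) -> Optional[int]:
    """Residual = current-context total minus locally countable pieces."""
    if context_total is None:
        # Keep the unknown state explicit instead of inventing a total.
        return None
    # Assumption: clamp at zero if the local estimates exceed the total.
    return max(0, context_total - sum(countable.values()))


pieces = {"system prompt": 8_000, "tool schemas": 2_000, "session": 30_000}
print(other_reasoning_bucket(50_000, pieces))  # 10000
print(other_reasoning_bucket(None, pieces))    # None
```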