@cad0p/pi-tree-navigator 0.1.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,26 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ ## [0.1.0] - 2026-05-25
6
+
7
+ <!-- USER-EDITABLE SECTION START -->
8
+
9
+ Initial release.
10
+
11
+ `navigate_tree` is an agent-callable pi tool with three actions:
12
+
13
+ - `anchor` — label the current point in the conversation as a milestone.
14
+ - `rewind` — collapse work between an anchor and the current leaf into a model-generated `branch_summary`, freeing context.
15
+ - `list` — show all anchors on the active branch with cumulative context %.
16
+
17
+ Designed for long autonomous sessions where the agent itself decides when to summarize. Survives mid-loop rewinds (the next assistant turn within the same `prompt()` call sees the reduced context) and produces structurally valid Anthropic chains by injecting a synthetic `tool_use` to pair with the rewind's `tool_result`.
18
+
19
+ User-visible specifics worth knowing on day one:
20
+
21
+ - Anchor names are kebab-case (lowercase alphanumeric segments separated by single hyphens; max 40 chars). Re-anchoring with a name already on the active branch moves the prior label to the new leaf rather than duplicating it; the same move-on-collision applies to `rewind`'s `labelEnd`.
22
+ - `rewind` requires a `summaryFocus` of ≥20 chars after trim; the rejection message lists what the focus should preserve so the agent can self-correct without user intervention.
23
+ - The `branch_summary` boilerplate strip in `list` hints is sentinel-anchored — a user-authored doc whose first H2 happens to be `## Goal` is preserved untouched.
24
+ - If the `AgentSession.prototype` patch isn't installed (typically only after a pi internals shape change), `list` and `rewind` surface a `⚠ reflection bootstrap missing` warning. The hint suggests `/reload` first (lighter — re-runs the prototype patch on the current process) and `Restart pi` as the heavier-handed alternative.
25
+
26
+ <!-- USER-EDITABLE SECTION END -->
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Pier Carlo Cadoppi
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,120 @@
1
+ # pi-tree-navigator
2
+
3
+ 🌳 Agent-callable session tree navigation for [pi](https://github.com/badlogic/pi-mono).
4
+
5
+ Lets a pi agent anchor named milestones in its own conversation, then collapse work between them into a model-generated `branch_summary` to free up context — without tripping Anthropic's `tool_use` ↔ `tool_result` validation, and with the freed context immediately available to the next assistant turn (even within the same `prompt()` call).
6
+
7
+ ## Install
8
+
9
+ ```bash
10
+ pi install git:github.com/cad0p/pi-tree-navigator
11
+ ```
12
+
13
+ > **Status:** v0.1.0 is not yet on the npm registry (pending OIDC trusted-publisher setup). Install from the git source for now.
14
+
15
+ <details>
16
+ <summary>Once published / pre-release installs</summary>
17
+
18
+ - Stable (once published to npm):
19
+
20
+ ```bash
21
+ pi install npm:@cad0p/pi-tree-navigator
22
+ ```
23
+
24
+ - Pre-release (calver snapshots from `main`, published to npm `@next` on every push):
25
+
26
+ ```bash
27
+ pi install npm:@cad0p/pi-tree-navigator@next
28
+ ```
29
+
30
+ </details>
31
+
32
+ ### Requirements
33
+
34
+ - **pi 0.74+** with at least one model provider configured.
35
+ - Peer dependencies (the source of truth is `package.json` `peerDependencies`):
36
+ - `@earendil-works/pi-coding-agent >=0.74.0`
37
+ - `@earendil-works/pi-agent-core >=0.74.0`
38
+ - `typebox ^1.0.0` (used to declare the tool's parameter schema; bundled with pi but listed explicitly so a standalone install resolves correctly).
39
+ - The reflection bootstrap depends on five plain (not `#`-private) internal pi/agent fields: `AgentSession.prototype.prompt`, `agent.state.messages`, `agent.state.systemPrompt`, `agent.state.tools`, and `agent.prepareNextTurn`. Verified against pi 0.75.x.
40
+
41
+ ## What you get
42
+
43
+ A single agent-callable tool, `navigate_tree`, with three actions:
44
+
45
+ | action | params | effect |
46
+ |---|---|---|
47
+ | `anchor` | `name` | Label the current point in the conversation as a milestone. |
48
+ | `rewind` | `labelStart`, `labelEnd`, `summaryFocus` | Collapse work between `labelStart` and the current leaf into a `branch_summary` entry. The summary is itself labeled with `labelEnd`, so you can chain rewinds. `summaryFocus` is required and must be non-trivial (the minimum length after trimming is enforced at runtime by `MIN_SUMMARY_FOCUS_LENGTH`). Despite the verb, `rewind` does not restore prior state: it forks a sibling branch from `labelStart` and continues forward from a model-generated summary; the original subtree is preserved on disk but is no longer on the active path. |
49
+ | `list` | — | Show all anchors on the active branch with cumulative context %. |
50
+
51
+ `name` (written by `anchor`) and `labelEnd` (written by `rewind`) both share the reserved `anchor:` label prefix; `labelStart` resolves against that same namespace. Every label written by `anchor` and every `labelEnd` written by `rewind` is referenceable by any subsequent `rewind`'s `labelStart`, and `list` shows all of them.
52
+
53
+ ## How it works
54
+
55
+ A typical autonomous-loop pattern:
56
+
57
+ ```
58
+ agent: navigate_tree(action="anchor", name="impl-start")
59
+ → [anchor 'impl-start'] set at 1.9% of 1.0M (after: “implement the parser”)
60
+
61
+ agent: ...does work, runs tools, accumulates context to 30%...
62
+
63
+ agent: navigate_tree(action="rewind", labelStart="impl-start", labelEnd="impl-end",
64
+ summaryFocus="record only the public API of the parser
65
+ and the open issue with edge case X")
66
+ → [rewind 'impl-start' → 'impl-end'] · context 30.4% → 4.1% of 1.0M
67
+ → A branch_summary recording the work just collapsed has been appended
68
+ to your context. Items under '### Done' are complete. ...
69
+
70
+ agent: ...continues with the freed context, the next API call is back at ~4%...
71
+ ```
72
+
73
+ The freed context is available to the **next assistant turn within the same `prompt()` call**, not just on the next user prompt. This is the key feature — autonomous agents don't have to wait for a user round-trip to benefit from a rewind.
74
+
75
+ ## Implementation notes
76
+
77
+ Why this is more involved than just calling pi's `branchWithSummary`:
78
+
79
+ 1. **Anthropic's tool_use ↔ tool_result pairing.** When a tool call rewinds the session tree, the tool's own `tool_use` lives in the assistant message that issued it — which `branchWithSummary` puts on the abandoned branch. Pi unconditionally writes the tool's `tool_result` to the new branch, leaving the result orphaned. Anthropic 400s the next API call with `Improperly formed request`. The fix is to inject a synthetic assistant message whose single `tool_call` has the same id as the in-flight call, *after* `branchWithSummary` but *before* the tool returns. Pi then writes the real `tool_result` as a child of that synthetic assistant — and the chain stays structurally valid.
80
+
81
+ 2. **In-loop context refresh.** Pi's `Agent` class snapshots `state.messages` once at the start of `prompt()` and pushes new messages onto its own array, so by default a rewind issued mid-loop wouldn't reduce the next API call's size until the user sent a fresh prompt. We wire `agent.prepareNextTurn` from a prototype patch on `AgentSession.prototype.prompt`, returning a fresh context built from `sessionManager.buildSessionContext()` at every turn boundary. After a rewind, the very next assistant turn within the same `prompt()` sees the rewound chain.
82
+
83
+ 3. **Reflection bootstrap.** Pi's slash-command `navigateTree` has access to `commandCtx.navigateTree`, which mutates `agent.state.messages`. Tool executions don't receive that context, so we capture every `AgentSession` instance via the prompt patch and replicate the mutation manually. Without it, the on-disk leaf moves but `agent.state.messages` stays stale.
84
+
85
+ 4. **`summaryFocus` is mandatory.** The summary is the only thing the agent will see of the collapsed work. With a blanket prompt, the first rewind already produces a vague summary, and subsequent rewinds are weaker still. Forcing the agent to articulate `summaryFocus` (passed to pi's `generateBranchSummary` as `customInstructions`) measurably improves what survives.
86
+
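The pairing problem in note 1 can be illustrated with a toy model. This is a minimal sketch under stated assumptions: the `Msg` shape, `findOrphanResults`, and the sample chain are hypothetical simplifications invented for illustration, not pi's or Anthropic's actual message types, and the check only looks at the message immediately before each tool result.

```typescript
// Hypothetical, simplified message shapes (NOT pi's real types).
type ToolCall = { type: "toolCall"; id: string; name: string };
type Msg =
  | { role: "user"; text: string }
  | { role: "assistant"; calls: ToolCall[] }
  | { role: "toolResult"; toolCallId: string; text: string };

// Report every tool result whose preceding message is not an assistant
// message carrying a tool call with the same id.
function findOrphanResults(chain: Msg[]): string[] {
  const orphans: string[] = [];
  for (let i = 0; i < chain.length; i++) {
    const msg = chain[i];
    if (msg.role !== "toolResult") continue;
    const id = msg.toolCallId;
    const prev = i > 0 ? chain[i - 1] : undefined;
    const paired =
      prev !== undefined &&
      prev.role === "assistant" &&
      prev.calls.some((c) => c.id === id);
    if (!paired) orphans.push(id);
  }
  return orphans;
}

const inFlightId = "call_1"; // id of the rewind call that is still executing
const kept: Msg[] = [{ role: "user", text: "implement the parser" }];
const summary: Msg = { role: "user", text: "branch_summary: parser API done" };
const result: Msg = { role: "toolResult", toolCallId: inFlightId, text: "[rewind ok]" };

// Without the fix: the assistant turn that issued the call sits on the
// abandoned branch, so the result has nothing to pair with.
const withoutFix: Msg[] = [...kept, summary, result];

// With the fix: a synthetic assistant message re-declares the same call id
// between the summary and the result.
const synthetic: Msg = {
  role: "assistant",
  calls: [{ type: "toolCall", id: inFlightId, name: "navigate_tree" }],
};
const withFix: Msg[] = [...kept, summary, synthetic, result];

console.log(findOrphanResults(withoutFix)); // one orphaned id
console.log(findOrphanResults(withFix)); // no orphans
```

The point of the sketch is only the ordering: the synthetic message has to exist on the new branch before the real tool result is appended.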
87
+ ### Synthetic assistant token bias
88
+
89
+ The synthetic assistant we inject after each rewind carries the **post-rewind chain estimate** in `usage.totalTokens` (so `estimateContextTokens` reads a sensible baseline immediately after the move). The synthetic itself adds a ~50-token toolCall block that is re-emitted on every subsequent turn until the next rewind. That overhead is **not** reflected in any `usage.*` field, so future `estimateContextTokens` calls understate the chain by ~50 tokens until the next assistant turn writes a fresh usage block. This is negligible at typical anchor cadence: account for it if you're benchmarking exact token deltas, and ignore it otherwise.
90
+
91
+ ## Limitations
92
+
93
+ - **Brittle to pi version bumps.** The extension relies on five independent reflection points on internals that aren't part of pi's public API: `AgentSession.prototype.prompt`, `agent.state.messages`, `agent.state.systemPrompt`, `agent.state.tools`, and `agent.prepareNextTurn`. If a future pi release renames any of these, switches them to private (`#`) fields, or restructures the class hierarchy, this breaks. The extension fails loudly: `anchor` still works, `rewind` reports ``⚠ reflection bootstrap missing — the rewind landed on disk but the next assistant turn may still see the pre-rewind context. Run `/reload` (or restart pi) to recover.``, and you'd see context corruption return on the next prompt.
94
+
95
+ - **Anchor early in the turn.** Whatever's in `agent.state.messages` *before* the `anchor` tool call stays in the kept chain. Everything after gets summarized. Anchor at the *start* of a stage for maximum context savings.
96
+
97
+ - **Abandoned branches grow the JSONL forever.** Each rewind preserves the abandoned subtree on disk. Session files get bigger over time even as live context shrinks. For very long autonomous runs (days), session files can hit hundreds of MB.
98
+
99
+ - **Tested against Anthropic and Kiro providers.** The synthetic-tool_use trick is specifically for Anthropic's strict tool_use/tool_result pairing; the synthetic's `stopReason: "toolUse"` survives Kiro's `normalizeMessages` filter. Other providers may have different validation rules — untested.
100
+
101
+ - **Loading the extension monkey-patches `AgentSession.prototype.prompt` globally.** Every session in the host pi process picks up the patch on import, including sessions that never call `navigate_tree`. The patch is install-on-import and not reversible within a running pi process; restart pi to fully unload it.
102
+
103
+ - **`anchor:` is a reserved label prefix.** Any label written via pi's `/label` command or by another extension that begins with `anchor:` will be picked up by `list` and addressable by `rewind`'s `labelStart` / `labelEnd`. Avoid the prefix in manually-set labels.
104
+
105
+ - **Disk-fault during `rewind` (rare).** Pi's `branchWithSummary` advances the in-memory leaf before persisting the new entry to disk. If pi's session-write fails mid-call (full disk, FS error on a persisted session), the in-memory leaf has already moved past the original assistant turn but the synthetic-assistant injection in this extension never runs. Pi's tool-result then lands without a matching tool_use, surfacing as the same `Improperly formed request` 400 the synthetic exists to prevent. Production risk: low (in-memory tests don't reach this case; pi's session-write is robust on POSIX disk). Tracked for an additional salvage layer wrapping `branchWithSummary` itself in v0.2.0.
106
+
107
+ ## Development
108
+
109
+ ```bash
110
+ bun install
111
+ bun test # helpers + dispatch / reflection bootstrap / salvage path
112
+ bunx biome check extensions/
113
+ bunx tsc --noEmit
114
+ ```
115
+
116
+ Tests cover `extensions/navigate-tree/helpers.ts` (pure helpers, in `helpers.test.ts`) and `extensions/navigate-tree/index.ts` (action dispatch, schema shape, synthetic-assistant injection, reflection bootstrap, and salvage path, in `index.test.ts`). The `summarize` factory option injects a stub for `generateBranchSummary` so no real LLM call fires during rewind tests. Manual e2e validation is recommended after any pi version bump, because the reflection bootstrap depends on internal field shapes (last verified against pi 0.75.x).
117
+
118
+ ## License
119
+
120
+ MIT — see [LICENSE](LICENSE).
@@ -0,0 +1,135 @@
1
+ /**
2
+ * Pure helpers for the navigate-tree extension.
3
+ *
4
+ * Imported by `./index.ts` and `./helpers.test.ts`. Pi's extension loader
5
+ * loads `./index.ts` and ignores everything else in this directory unless
6
+ * referenced from there — so this file isn't loaded as a separate extension.
7
+ *
8
+ * No pi runtime imports — these are pure functions over plain JS values.
9
+ */
10
+
11
+ // ---------------------------------------------------------------------------
12
+ // Exported boundary constants below (MAX_NAME_LENGTH, MAX_BOILERPLATE_LEAD_IN).
13
+ //
14
+ // Stability: these are internal tunables. Exported only so the test suite
15
+ // can pin boundary cases by constant rather than literal. Re-tuning is
16
+ // NOT a semver-breaking change for this package — production callers
17
+ // should rely on the registered `navigate_tree` tool surface, not import
18
+ // these constants directly.
19
+ // ---------------------------------------------------------------------------
20
+
21
+ // Hard cap on label-name length. 40 chars accommodates descriptive names
22
+ // (e.g. 'parser-edge-case-investigation', 30 chars) while keeping list
23
+ // output column-friendly under common terminal widths and preventing a
24
+ // runaway label string from poisoning the JSONL on disk.
25
+ export const MAX_NAME_LENGTH = 40;
26
+ /**
27
+ * Kebab-case anchor name: lowercase alphanumeric segments separated by
28
+ * single hyphens. No leading / trailing / double hyphens (the `isValidName`
29
+ * test suite enforces these rejections). Mirrors the pattern documented in
30
+ * the tool description and README — if this regex relaxes, both surfaces
31
+ * need the same update.
32
+ */
33
+ const NAME_RE = /^[a-z0-9]+(-[a-z0-9]+)*$/;
34
+
35
+ // Maximum lead-in distance for pi's branch-summary boilerplate marker.
36
+ // Pi's standard prelude ("The user explored a different conversation
37
+ // branch...") fits in the first ~150 chars; 200 is a generous upper bound.
38
+ // A "## Goal" found later than this is treated as in-content prose, not
39
+ // the boilerplate marker, and the strip is a no-op.
40
+ export const MAX_BOILERPLATE_LEAD_IN = 200;
41
+
42
+ // Sentinel that pi's branch-summary prelude always starts with. Gating
43
+ // the strip on this prefix makes the helper a no-op for any non-
44
+ // boilerplate text that happens to contain `## Goal` early (e.g. a
45
+ // user-authored markdown doc whose first H2 is "Goal"). Verified
46
+ // against pi 0.75.5's `dist/core/compaction/branch-summarization.js`;
47
+ // if pi reworks the prelude wording, the strip falls back to a no-op
48
+ // (`findLabelHint` shows the unmodified prelude). Verify when bumping
49
+ // the peer-dep floor.
50
+ const BRANCH_SUMMARY_SENTINEL =
51
+ "The user explored a different conversation branch";
52
+
53
+ // Pi 0.75.5's branch-summary boilerplate places `## Goal` after the prelude
54
+ // (verified against `dist/core/compaction/branch-summarization.js`). Used to
55
+ // anchor the strip cut-point in `stripBranchSummaryBoilerplate` — a single
56
+ // constant so the indexOf probe and the slice-length advance can't drift.
57
+ const GOAL_HEADER = "## Goal";
58
+
59
+ /** Validate a kebab-case name suitable for use as a navigate-tree label. */
60
+ export function isValidName(s: unknown): s is string {
61
+ return (
62
+ typeof s === "string" &&
63
+ s.length > 0 &&
64
+ s.length <= MAX_NAME_LENGTH &&
65
+ NAME_RE.test(s)
66
+ );
67
+ }
68
+
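A short sketch of how the validator behaves at its boundaries. The regex and length cap mirror the source above; `toLabel` and the sample names are illustrative assumptions and are not part of the package.

```typescript
// Mirrors helpers.ts so the sketch is self-contained.
const MAX_NAME_LENGTH = 40;
const NAME_RE = /^[a-z0-9]+(-[a-z0-9]+)*$/;

function isValidName(s: unknown): s is string {
  return (
    typeof s === "string" &&
    s.length > 0 &&
    s.length <= MAX_NAME_LENGTH &&
    NAME_RE.test(s)
  );
}

// Hypothetical helper (not in the package): prefix a validated name with the
// reserved `anchor:` namespace that the README describes.
function toLabel(name: unknown): string | null {
  return isValidName(name) ? `anchor:${name}` : null;
}

const samples: unknown[] = [
  "impl-start", // valid kebab-case
  "Impl-Start", // rejected: uppercase
  "impl--start", // rejected: double hyphen
  "-impl", // rejected: leading hyphen
  "a".repeat(41), // rejected: one char over the cap
  42, // rejected: not a string
];
for (const s of samples) {
  console.log(JSON.stringify(s), "->", toLabel(s));
}
```

Only the first sample produces a label; every other sample maps to `null`.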
69
+ /** Truncate text to a one-line preview, collapsing whitespace. */
70
+ export function toOneLine(text: string, maxLen: number): string | null {
71
+ const t = text.replace(/\s+/g, " ").trim();
72
+ if (t.length === 0) return null;
73
+ if (maxLen <= 1) return "…";
74
+ return t.length > maxLen ? `${t.slice(0, maxLen - 1)}…` : t;
75
+ }
76
+
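To make the truncation rule concrete, here is a small usage sketch. The function body mirrors the helper above (behaviorally identical); the input strings are made up for illustration.

```typescript
// Mirrors toOneLine from helpers.ts so the sketch runs on its own.
function toOneLine(text: string, maxLen: number): string | null {
  const t = text.replace(/\s+/g, " ").trim();
  if (t.length === 0) return null;
  if (maxLen <= 1) return "…";
  // Reserve one character for the ellipsis so the result never exceeds maxLen.
  return t.length > maxLen ? `${t.slice(0, maxLen - 1)}…` : t;
}

// Whitespace runs (including newlines) collapse to single spaces.
const collapsed = toOneLine("  implement\n  the   parser ", 60);
// Text longer than maxLen is cut to maxLen - 1 chars plus an ellipsis.
const truncated = toOneLine("implement the parser", 10);
// Whitespace-only input yields null rather than an empty string.
const blank = toOneLine("   \n\t", 10);

console.log(collapsed); // "implement the parser"
console.log(truncated); // "implement…"
console.log(blank); // null
```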
77
+ /**
78
+ * Format token count as a percentage with one decimal, or as `Nk` if no window.
79
+ */
80
+ export function formatPct1(tokens: number, contextWindow: number): string {
81
+ if (contextWindow <= 0) return `${(tokens / 1000).toFixed(1)}k`;
82
+ return `${((tokens / contextWindow) * 100).toFixed(1)}%`;
83
+ }
84
+
85
+ /** Format a context window size as `1.0M` or `200k`. Empty string when unknown. */
86
+ export function formatWindow(contextWindow: number): string {
87
+ if (contextWindow <= 0) return "";
88
+ if (contextWindow >= 1_000_000)
89
+ return `${(contextWindow / 1_000_000).toFixed(1)}M`;
90
+ return `${(contextWindow / 1000).toFixed(0)}k`;
91
+ }
92
+
93
+ export function formatContextDelta(
94
+ beforeTokens: number,
95
+ afterTokens: number,
96
+ contextWindow: number,
97
+ ): string {
98
+ if (contextWindow > 0) {
99
+ return `context ${formatPct1(beforeTokens, contextWindow)} → ${formatPct1(afterTokens, contextWindow)} of ${formatWindow(contextWindow)}`;
100
+ }
101
+ return `tokens ${beforeTokens} → ${afterTokens}`;
102
+ }
103
+
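As a worked example, the three formatters above can reproduce the rewind line shown in the README. The functions mirror the source; the token counts (304,000 and 41,000 on a 1,000,000-token window) are assumptions chosen only so the output matches that README line.

```typescript
// Mirrors the formatting helpers from helpers.ts.
function formatPct1(tokens: number, contextWindow: number): string {
  if (contextWindow <= 0) return `${(tokens / 1000).toFixed(1)}k`;
  return `${((tokens / contextWindow) * 100).toFixed(1)}%`;
}

function formatWindow(contextWindow: number): string {
  if (contextWindow <= 0) return "";
  if (contextWindow >= 1_000_000)
    return `${(contextWindow / 1_000_000).toFixed(1)}M`;
  return `${(contextWindow / 1000).toFixed(0)}k`;
}

function formatContextDelta(
  beforeTokens: number,
  afterTokens: number,
  contextWindow: number,
): string {
  if (contextWindow > 0) {
    return `context ${formatPct1(beforeTokens, contextWindow)} → ${formatPct1(afterTokens, contextWindow)} of ${formatWindow(contextWindow)}`;
  }
  return `tokens ${beforeTokens} → ${afterTokens}`;
}

// Known window: percentages of the window.
const known = formatContextDelta(304_000, 41_000, 1_000_000);
// Unknown window (0): falls back to raw token counts.
const unknownWindow = formatContextDelta(304_000, 41_000, 0);

console.log(known); // context 30.4% → 4.1% of 1.0M
console.log(unknownWindow); // tokens 304000 → 41000
console.log(formatWindow(200_000)); // 200k
```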
104
+ /**
105
+ * Strip pi's standard branch-summary boilerplate ("The user explored a
106
+ * different conversation branch...") so the hint shows the actual content.
107
+ */
108
+ export function stripBranchSummaryBoilerplate(text: string): string {
109
+ if (!text.startsWith(BRANCH_SUMMARY_SENTINEL)) return text;
110
+ const goalIdx = text.indexOf(GOAL_HEADER);
111
+ if (goalIdx > 0 && goalIdx < MAX_BOILERPLATE_LEAD_IN) {
112
+ return text.slice(goalIdx + GOAL_HEADER.length);
113
+ }
114
+ return text;
115
+ }
116
+
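The sentinel gate is easiest to see with three inputs side by side. The constants and function mirror the source above; everything after the sentinel in the sample strings is wording invented for this sketch and is not pi's actual prelude.

```typescript
// Mirrors helpers.ts so the sketch is self-contained.
const MAX_BOILERPLATE_LEAD_IN = 200;
const BRANCH_SUMMARY_SENTINEL =
  "The user explored a different conversation branch";
const GOAL_HEADER = "## Goal";

function stripBranchSummaryBoilerplate(text: string): string {
  if (!text.startsWith(BRANCH_SUMMARY_SENTINEL)) return text;
  const goalIdx = text.indexOf(GOAL_HEADER);
  if (goalIdx > 0 && goalIdx < MAX_BOILERPLATE_LEAD_IN) {
    return text.slice(goalIdx + GOAL_HEADER.length);
  }
  return text;
}

// 1. Starts with the sentinel and has an early "## Goal": prelude is stripped.
const boilerplate = `${BRANCH_SUMMARY_SENTINEL} (wording invented for this sketch).\n\n## Goal\nShip the parser`;
// 2. A user-authored doc whose first H2 is "Goal": no sentinel, left untouched.
const userDoc = "## Goal\nWrite the docs";
// 3. Sentinel present, but "## Goal" appears past the lead-in cap: left untouched.
const lateGoal = `${BRANCH_SUMMARY_SENTINEL} ${"x".repeat(300)}\n## Goal\nlate`;

const stripped = stripBranchSummaryBoilerplate(boilerplate);
console.log(JSON.stringify(stripped)); // "\nShip the parser"
console.log(stripBranchSummaryBoilerplate(userDoc) === userDoc); // true
console.log(stripBranchSummaryBoilerplate(lateGoal) === lateGoal); // true
```

Note that the stripped result keeps the newline that followed the header; a caller that wants a one-line preview still has to collapse whitespace afterwards.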
117
+ /**
118
+ * Extract a string from a message-like content field. Handles both the
119
+ * legacy string shape and the modern array-of-blocks shape, joining all
120
+ * text blocks with spaces.
121
+ */
122
+ export function extractTextContent(content: unknown): string {
123
+ if (typeof content === "string") return content;
124
+ if (!Array.isArray(content)) return "";
125
+ return content
126
+ .filter(
127
+ (c): c is { type: "text"; text: string } =>
128
+ typeof c === "object" &&
129
+ c !== null &&
130
+ (c as { type?: string }).type === "text" &&
131
+ typeof (c as { text?: unknown }).text === "string",
132
+ )
133
+ .map((c) => c.text)
134
+ .join(" ");
135
+ }
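A brief usage sketch for the two content shapes the helper accepts. The function mirrors the source above; the sample blocks (including the `toolCall` block shape) are made up for illustration.

```typescript
// Mirrors extractTextContent from helpers.ts.
function extractTextContent(content: unknown): string {
  if (typeof content === "string") return content;
  if (!Array.isArray(content)) return "";
  return content
    .filter(
      (c): c is { type: "text"; text: string } =>
        typeof c === "object" &&
        c !== null &&
        (c as { type?: string }).type === "text" &&
        typeof (c as { text?: unknown }).text === "string",
    )
    .map((c) => c.text)
    .join(" ");
}

// Legacy shape: a plain string passes through unchanged.
const legacy = extractTextContent("implement the parser");

// Block shape: only well-formed text blocks survive, joined with spaces.
const blocks: unknown = [
  { type: "text", text: "first" },
  { type: "toolCall", id: "call_1" }, // hypothetical non-text block, skipped
  { type: "text", text: 5 }, // malformed text block, skipped
  { type: "text", text: "second" },
];
const joined = extractTextContent(blocks);

console.log(legacy); // implement the parser
console.log(joined); // first second
console.log(JSON.stringify(extractTextContent(null))); // ""
```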
@@ -0,0 +1,952 @@
1
+ /**
2
+ * navigate-tree — agent-callable session tree navigation.
3
+ *
4
+ * See README “Implementation notes” for the user-facing narrative
5
+ * (Anthropic tool_use↔tool_result pairing, same-loop context refresh,
6
+ * reflection bootstrap). This file-level JSDoc carries only
7
+ * source-internal facts the README doesn't.
8
+ *
9
+ * `/tree`-visible artifacts (empirical):
10
+ * • Synthetic assistant message right after each `branch_summary`,
11
+ * with one tool_call sharing the in-flight `toolCallId`.
12
+ * • Dangling tool_use on the anchor entry whose original
13
+ * tool_results were cut off by the rewind. Anthropic accepts this
14
+ * — the dangling tool_use is buffered behind the branch_summary's
15
+ * user-text rendering and the API doesn't reject it. No walk-up
16
+ * logic at anchor time.
17
+ *
18
+ * Reflection bootstrap replicates pi's own slash-command line
19
+ * verbatim (kept symmetric so a pi rename here surfaces as the
20
+ * runtime warning rather than silently drifting):
21
+ *
22
+ * this.agent.state.messages = this.sessionManager.buildSessionContext().messages;
23
+ *
24
+ * Risks of the reflection approach:
25
+ * • If pi switches any of the five fields this extension reads —
26
+ * `AgentSession.prototype.prompt`, `agent.state.messages`,
27
+ * `agent.state.systemPrompt`, `agent.state.tools`, or
28
+ * `agent.prepareNextTurn` — to ES `#` private fields, this breaks
29
+ * fundamentally.
30
+ * • If pi renames or restructures any of these fields, this breaks.
31
+ * • Patches `AgentSession.prototype.prompt` globally on import; not
32
+ * reversible without a process restart; affects every session in
33
+ * the pi process, including sessions that never call
34
+ * `navigate_tree`.
35
+ *
36
+ * Verified against pi 0.75.5.
37
+ */
38
+
39
+ import { estimateContextTokens } from "@earendil-works/pi-agent-core";
40
+ import {
41
+ AgentSession,
42
+ buildSessionContext,
43
+ collectEntriesForBranchSummary,
44
+ type ExtensionAPI,
45
+ generateBranchSummary,
46
+ type SessionEntry,
47
+ type SessionManager,
48
+ } from "@earendil-works/pi-coding-agent";
49
+ import { Type } from "typebox";
50
+ import {
51
+ extractTextContent,
52
+ formatContextDelta,
53
+ formatPct1,
54
+ formatWindow,
55
+ isValidName,
56
+ MAX_NAME_LENGTH,
57
+ stripBranchSummaryBoilerplate,
58
+ toOneLine,
59
+ } from "./helpers.ts";
60
+
61
+ const LABEL_PREFIX = "anchor:";
62
+
63
+ // ---------------------------------------------------------------------------
64
+ // Exported boundary constants below (MAX_SESSION_REFS, MAX_HINT_WALK_DEPTH,
65
+ // MIN_SUMMARY_FOCUS_LENGTH, MAX_SYNTHETIC_FOCUS_LENGTH).
66
+ //
67
+ // Stability: these are internal tunables. Exported only so the test suite
68
+ // can pin boundary cases by constant rather than literal. Re-tuning is
69
+ // NOT a semver-breaking change for this package — production callers
70
+ // should rely on the registered `navigate_tree` tool surface, not import
71
+ // these constants directly. The `__testHooks` JSDoc carries the same
72
+ // caveat for module-internal helpers.
73
+ // ---------------------------------------------------------------------------
74
+
75
+ // Cap on captured AgentSession refs across /new + /resume + /reload cycles.
76
+ // Worst case is ~one ref per long-lived session before reaping dead WeakRefs;
77
+ // 16 leaves headroom for the deepest session-fanout pattern observed (a few
78
+ // /resume cycles on top of a couple of /new cycles) without prematurely
79
+ // reaping a still-live session. Bump if the reaper fires while a session
80
+ // is still live.
81
+ export const MAX_SESSION_REFS = 16;
82
+ // Cap on parentId chain walks in `findLabelHint` (UX preview only;
83
+ // no need to walk to the root for a 50-char snippet).
84
+ export const MAX_HINT_WALK_DEPTH = 50;
85
+ // Floor on `summaryFocus` length (after trim) for `rewind`. The user's most
86
+ // recent instruction lives on the chain about to be collapsed; if the focus
87
+ // is shorter than this, it almost always elides that instruction (a terse
88
+ // "finish parser fix" is 17 chars and conveys nothing the next turn can
89
+ // act on). 20 is the empirical threshold below which the post-rewind turn
90
+ // reliably loses continuity — raising it forces more useful focus text
91
+ // without inviting verbosity.
92
+ export const MIN_SUMMARY_FOCUS_LENGTH = 20;
93
+ // Cap on the `summaryFocus` length stored in the synthetic assistant's
94
+ // arguments. The full focus is passed live to `generateBranchSummary`, so
95
+ // the summarizer always sees the original; we only need a trimmed copy in
96
+ // the synthetic's args because pi's `convertToLlm` re-emits the synthetic's
97
+ // toolCall block (including its arguments) on every subsequent turn until
98
+ // another rewind. Without a cap, a 100K-char focus inflates every later
99
+ // turn's input by ~100K chars indefinitely. 1024 chars is generous — well
100
+ // above empirically useful focus length, and the agent already saw the
101
+ // full focus string when it issued the rewind.
102
+ export const MAX_SYNTHETIC_FOCUS_LENGTH = 1024;
103
+ // Hint length cap for the per-row hint shown in `list` output. 50 chars
104
+ // fits one terminal column without wrapping in typical 80-column TUIs.
105
+ const LIST_HINT_MAX_LENGTH = 50;
106
+ // Hint length cap for the hint shown in the `anchor` response. The anchor
107
+ // response is a single block of prose (not a column-aligned table) so it
108
+ // can afford a longer hint than `list`'s per-row preview.
109
+ const ANCHOR_HINT_MAX_LENGTH = 60;
110
+ // padStart width for the percentage column in `list` output. The longest
111
+ // percent label is "100.0%" = 6 chars; "99.9%" = 5 chars covers the
112
+ // realistic worst case and keeps the column tight.
113
+ const LIST_PCT_COL_WIDTH = 5;
114
+ // padEnd width for the anchor-name column in `list` output. MAX_NAME_LENGTH
115
+ // is 40, but the typical kebab-case name is 8–20 chars; 28 keeps the
116
+ // hint column visible without truncating common names.
117
+ const LIST_LABEL_COL_WIDTH = 28;
118
+ const PNT_MARKER = Symbol.for("navigate-tree.pnt-installed");
119
+ const ORIG_PROMPT_KEY = Symbol.for("navigate-tree.orig-prompt");
120
+
121
+ // Two warnings: list-site (read-only path; warns about the next turn's
122
+ // context view) and rewind-site (wrote to disk; leads with that). Both
123
+ // suggest /reload first, then restart pi, in that order.
124
+ const REFLECTION_BOOTSTRAP_WARNING_LIST =
125
+ "⚠ reflection bootstrap missing — anchors and rewinds still work, but the next assistant turn may snapshot pre-bootstrap context. Run `/reload` (or restart pi) to recover.";
126
+ const REFLECTION_BOOTSTRAP_WARNING_REWIND =
127
+ "⚠ reflection bootstrap missing — the rewind landed on disk but the next assistant turn may still see the pre-rewind context. Run `/reload` (or restart pi) to recover.";
128
+
129
+ // ---------------------------------------------------------------------------
130
+ // Typed views over pi internals.
131
+ //
132
+ // pi-coding-agent doesn't expose `agent`, `state`, `prepareNextTurn`, or
133
+ // `sessionManager` on `AgentSession` in its public types, but they are plain
134
+ // (non-`#`-private) fields on the class. Each cast point is a fragility
135
+ // surface for pi version bumps; grouping them here makes the dependency
136
+ // surface explicit.
137
+ // ---------------------------------------------------------------------------
138
+
139
+ interface PiInternals {
140
+ agent: {
141
+ state: {
142
+ systemPrompt: string;
143
+ messages: unknown[];
144
+ tools: unknown[];
145
+ };
146
+ prepareNextTurn?: unknown;
147
+ };
148
+ sessionManager: SessionManager;
149
+ }
150
+
151
+ function asInternals(session: AgentSession): PiInternals {
152
+ return session as unknown as PiInternals;
153
+ }
154
+
155
+ type PntResult = {
156
+ context?: {
157
+ systemPrompt?: unknown;
158
+ messages?: unknown[];
159
+ tools?: unknown[];
160
+ [k: string]: unknown;
161
+ };
162
+ model?: unknown;
163
+ thinkingLevel?: unknown;
164
+ };
165
+ // pi 0.75.5 invokes `agent.prepareNextTurn(signal)` from
166
+ // `Agent.createLoopConfig` — a single AbortSignal argument. This differs
167
+ // from the documented `AgentLoopConfig.prepareNextTurn(context: PrepareNextTurnContext)`
168
+ // shape, which `Agent` is bridging. We accept whatever pi passes and forward
169
+ // it verbatim to the prior wrapper so we don't fight a future signature
170
+ // alignment. Verified against pi-coding-agent 0.75.5; revisit if the call
171
+ // site changes.
172
+ type PntFn = (...args: unknown[]) => Promise<PntResult> | PntResult;
173
+ type MarkedPntFn = PntFn & { [PNT_MARKER]?: boolean; __prior?: PntFn };
174
+
175
+ // =============================================================================
176
+ // Reflection bootstrap & in-loop refresh
177
+ // =============================================================================
178
+
179
+ const sessionInstances: WeakRef<AgentSession>[] = [];
180
+ let seenSessions = new WeakSet<AgentSession>();
181
+
182
+ function captureSession(session: AgentSession): void {
183
+ if (seenSessions.has(session)) return;
184
+ seenSessions.add(session);
185
+ sessionInstances.push(new WeakRef(session));
186
+ // Reap dead WeakRefs occasionally so the array doesn't grow unbounded
187
+ // across /new and /resume cycles.
188
+ if (sessionInstances.length > MAX_SESSION_REFS) {
189
+ for (let i = sessionInstances.length - 1; i >= 0; i--) {
190
+ if (!sessionInstances[i].deref()) sessionInstances.splice(i, 1);
191
+ }
192
+ }
193
+ }
194
+
195
+ function patchAgentSessionPrototype(): void {
196
+ const proto = AgentSession.prototype as unknown as Record<
197
+ PropertyKey,
198
+ unknown
199
+ >;
200
+ // Stash the truly-original prompt the FIRST time we patch. On subsequent
201
+ // /reloads the value is already there — we don't overwrite, we just read it
202
+ // back so the new wrapper still calls the original (not a previous wrapper).
203
+ if (!proto[ORIG_PROMPT_KEY]) {
204
+ proto[ORIG_PROMPT_KEY] = proto.prompt;
205
+ }
206
+ const orig = proto[ORIG_PROMPT_KEY] as (...args: unknown[]) => unknown;
207
+
208
+ // Always replace the wrapper, even if a previous load already patched. On
209
+ // /reload the previous wrapper closes over the previous module's
210
+ // `sessionInstances` — if we don't replace, captures land in the dead
211
+ // module and reflection finds nothing.
212
+ const patched = function (this: AgentSession, ...args: unknown[]) {
213
+ captureSession(this);
214
+ installPrepareNextTurn(this);
215
+ return orig.apply(this, args);
216
+ };
217
+ proto.prompt = patched;
218
+ }
219
+
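The stash-once, replace-always pattern used by the patch can be shown on a toy class. This is a sketch under stated assumptions: `Session`, `greet`, `patch`, and the `sketch.orig-greet` symbol are hypothetical stand-ins for `AgentSession`, `prompt`, and the extension's own keys.

```typescript
// Global symbol registry key, so a reloaded module finds the same stash.
const ORIG_KEY = Symbol.for("sketch.orig-greet");

// Toy stand-in for AgentSession (hypothetical).
class Session {
  greet(name: string): string {
    return `hello ${name}`;
  }
}

const seen: string[] = [];

function patch(tag: string): void {
  const proto = Session.prototype as unknown as Record<PropertyKey, unknown>;
  // Stash the true original only on the first patch.
  if (!proto[ORIG_KEY]) proto[ORIG_KEY] = proto.greet;
  const orig = proto[ORIG_KEY] as (this: Session, ...args: unknown[]) => unknown;
  // Always install a fresh wrapper that calls the original directly,
  // never a previous wrapper.
  proto.greet = function (this: Session, ...args: unknown[]) {
    seen.push(tag);
    return orig.apply(this, args);
  };
}

patch("load-1");
patch("load-2"); // simulates a second load of the module (e.g. after a reload)

const out = new Session().greet("pi");
console.log(out); // hello pi
console.log(seen); // only the latest wrapper ran
```

Because each wrapper calls the stashed original rather than whatever is currently on the prototype, repeated loads do not stack wrappers.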
220
+ /**
221
+ * Wire `agent.prepareNextTurn` so the in-flight agent loop refreshes its
222
+ * context from sessionManager between turns within the same prompt() call.
223
+ * Without this, the loop snapshots agent.state.messages once at prompt start
224
+ * and pushes new messages onto its own array — a rewind issued mid-loop
225
+ * doesn't reduce the next API call's size until the user sends a new prompt.
226
+ *
227
+ * The Agent class's `createLoopConfig` dereferences `this.prepareNextTurn`
228
+ * at the closure call site, so the value here is read at every turn boundary.
229
+ * But it gates the closure on `this.prepareNextTurn` being truthy at config
230
+ * creation — so we have to set this BEFORE prompt() runs, hence wiring it
231
+ * from inside the prompt patch.
232
+ */
233
+ function installPrepareNextTurn(session: AgentSession): void {
234
+ const internals = asInternals(session);
235
+ const agent = internals.agent;
236
+ if (!agent) return;
237
+
238
+ const sm = internals.sessionManager;
239
+
240
+ // If the existing prepareNextTurn was installed by a previous load of THIS
241
+ // extension, recover the chain it captured (its `__prior`) so we don't
242
+ // strand other extensions' closures across /reload. Preserve any other
243
+ // extension's prepareNextTurn so we compose with them.
244
+ const existing = agent.prepareNextTurn as MarkedPntFn | undefined;
245
+ const prior: PntFn | undefined =
246
+ typeof existing === "function" && existing[PNT_MARKER]
247
+ ? existing.__prior
248
+ : (existing as PntFn | undefined);
249
+
250
+ const next: MarkedPntFn = async (...args: unknown[]) => {
251
+ let priorResult: PntResult | undefined;
252
+ if (typeof prior === "function") {
253
+ priorResult = await prior(...args);
254
+ }
255
+ // Pi's loop replaces context wholesale (`currentContext = ctx ??
256
+ // currentContext`), not field-merges — a prior wrapper that returns
257
+ // a partial context (e.g. only systemPrompt) would silently drop
258
+ // tools. Spread the prior first so its fields survive, then fall
259
+ // back to `agent.state` for any field the prior left undefined.
260
+ // `messages` is owned by this wrapper.
261
+ const priorContext = priorResult?.context;
262
+ return {
263
+ context: {
264
+ ...priorContext,
265
+ systemPrompt: priorContext?.systemPrompt ?? agent.state.systemPrompt,
266
+ tools: priorContext?.tools ?? agent.state.tools,
267
+ messages: sm.buildSessionContext().messages,
268
+ },
269
+ model: priorResult?.model,
270
+ thinkingLevel: priorResult?.thinkingLevel,
271
+ };
272
+ };
273
+ next[PNT_MARKER] = true;
274
+ next.__prior = prior;
275
+ agent.prepareNextTurn = next;
276
+ }
277
+
278
+ function findOwningSession(sm: SessionManager): AgentSession | null {
279
+ for (const ref of sessionInstances) {
280
+ const s = ref.deref();
281
+ if (s && asInternals(s).sessionManager === sm) {
282
+ return s;
283
+ }
284
+ }
285
+ return null;
286
+ }
287
+
288
+ function refreshAgentMessages(sm: SessionManager): boolean {
289
+ // Manually replicate the agent-state refresh that pi's
290
+ // commandCtx.navigateTree does after branchWithSummary. Returns true on
291
+ // success, false if reflection couldn't find the owning AgentSession (in
292
+ // which case the rewind is structurally complete on disk, but the next LLM
293
+ // call will still see stale messages).
294
+ const session = findOwningSession(sm);
295
+ if (!session) return false;
296
+ try {
297
+ const sessionContext = sm.buildSessionContext();
298
+ const agent = asInternals(session).agent;
299
+ if (!agent?.state) return false;
300
+ agent.state.messages = sessionContext.messages;
301
+ return true;
302
+ } catch {
303
+ return false;
304
+ }
305
+ }
306
+
307
+ // =============================================================================
308
+ // Helpers (extension-internal; pure helpers in ./helpers.ts)
309
+ // =============================================================================
310
+
311
+ function findLabeledEntry(
312
+ sm: SessionManager,
313
+ fullLabel: string,
314
+ ): string | null {
315
+ const path = sm.getBranch();
316
+ for (let i = path.length - 1; i >= 0; i--) {
317
+ if (sm.getLabel(path[i].id) === fullLabel) return path[i].id;
318
+ }
319
+ return null;
320
+ }
321
+
322
+ function estimateActiveBranchTokens(sm: SessionManager): number {
323
+ return estimateContextTokens(sm.buildSessionContext().messages).tokens;
324
+ }
325
+
326
+ function estimateAtEntry(
327
+ entries: SessionEntry[],
328
+ entryId: string,
329
+ byId: Map<string, SessionEntry>,
330
+ ): number {
331
+ return estimateContextTokens(
332
+ buildSessionContext(entries, entryId, byId).messages,
333
+ ).tokens;
334
+ }
335
+
336
+ /**
337
+ * Walk the parentId chain back from `fromId` and return a one-line preview of
338
+ * the first entry that has meaningful text content. Branch summaries are
339
+ * prefixed with `summary:` so the source is clear; user/assistant text is
340
+ * shown as-is.
341
+ */
342
+ function findLabelHint(
343
+ sm: SessionManager,
344
+ fromId: string,
345
+ maxLen: number,
346
+ ): string | null {
347
+ let cur: string | null | undefined = fromId;
348
+ let depth = 0;
349
+ while (cur && depth < MAX_HINT_WALK_DEPTH) {
350
+ const e = sm.getEntry(cur);
351
+ if (!e) break;
352
+ let text = "";
353
+ let prefix = "";
354
+ if (e.type === "branch_summary" && e.summary) {
355
+ text = stripBranchSummaryBoilerplate(e.summary);
356
+ prefix = "summary: ";
357
+ } else if (e.type === "message") {
358
+ const role = e.message.role;
359
+ if (role === "user" || role === "assistant") {
360
+ text = extractTextContent(e.message.content);
361
+ }
362
+ } else if (e.type === "custom_message") {
363
+ text = extractTextContent(e.content);
364
+ }
365
+ const oneLine = toOneLine(text, maxLen - prefix.length);
366
+ if (oneLine) return prefix + oneLine;
367
+ cur = e.parentId;
368
+ depth++;
369
+ }
370
+ return null;
371
+ }
372
+
373
+ /**
374
+ * Build a synthetic assistant message containing a single tool_call whose id
375
+ * matches the in-flight tool_call id. Appended after `branchWithSummary` so
376
+ * the real tool_result lands paired with a matching tool_use.
377
+ *
378
+ * `usage` fields are zeroed except `totalTokens`, which is set to the
379
+ * chain size measured BEFORE this synthetic is appended (the post-rewind
380
+ * baseline that pi-agent-core's `estimateContextTokens` reads off the last
381
+ * assistant). `stopReason: "toolUse"` survives Kiro's `normalizeMessages`
382
+ * filter (which strips `error` / `aborted`); without it the synthetic would
383
+ * be filtered out and the tool_result would re-orphan.
384
+ */
385
+ function buildSyntheticAssistant(
386
+ toolCallId: string,
387
+ toolName: string,
388
+ args: Record<string, unknown>,
389
+ model: { api?: string; provider?: string; id?: string } | undefined,
390
+ totalTokens: number,
391
+ ) {
392
+ return {
393
+ role: "assistant" as const,
394
+ content: [
395
+ {
396
+ type: "toolCall" as const,
397
+ id: toolCallId,
398
+ name: toolName,
399
+ arguments: args,
400
+ },
401
+ ],
402
+ api: model?.api ?? "unknown",
403
+ provider: model?.provider ?? "unknown",
404
+ model: model?.id ?? "unknown",
405
+ stopReason: "toolUse" as const,
406
+ timestamp: Date.now(),
407
+ usage: {
408
+ input: 0,
409
+ output: 0,
410
+ cacheRead: 0,
411
+ cacheWrite: 0,
412
+ totalTokens,
413
+ cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 },
414
+ },
415
+ };
416
+ }
417
+
418
+ // =============================================================================
419
+ // Extension
420
+ // =============================================================================
421
+
422
+ interface ToolResult {
423
+ content: Array<{ type: "text"; text: string }>;
424
+ details: Record<string, unknown>;
425
+ isError?: boolean;
426
+ }
427
+
428
+ function toolError(
429
+ text: string,
430
+ details: Record<string, unknown> = {},
431
+ ): ToolResult {
432
+ return {
433
+ content: [{ type: "text", text }],
434
+ details,
435
+ isError: true,
436
+ };
437
+ }
438
+
439
+ export default function (
440
+ pi: ExtensionAPI,
441
+ opts?: { summarize?: typeof generateBranchSummary },
442
+ ) {
443
+ // DI seam: tests inject a stub `summarize` to avoid hitting the real
444
+ // model. Production callers (pi's extension loader) pass no second
445
+ // argument, so this falls back to the real `generateBranchSummary`
446
+ // import.
447
+ const summarize = opts?.summarize ?? generateBranchSummary;
448
+ patchAgentSessionPrototype();
449
+
450
+ pi.registerTool({
451
+ name: "navigate_tree",
452
+ label: "Navigate Tree",
453
+ // Stateful: every action mutates SessionManager; concurrent calls would
454
+ // race on `leafId` / `labelsById` and produce an undefined tree.
455
+ executionMode: "sequential",
456
+ description: `Long-session context management via the pi session tree. Anchor named milestones, then collapse work between them into a model-generated summary while preserving the full history on a sibling branch. Despite the verb, \`rewind\` does not restore prior state — it forks a sibling branch from the anchor and continues forward from a model-generated summary; the abandoned subtree is preserved on disk but no longer on the active path.
457
+
458
+ Operations (set the \`action\` parameter):
459
+ • action='anchor', name='<milestone-name>': label the current point so a later rewind can target it. Use at the start of a stage you'll summarize (e.g. 'design-start', 'impl-start'). If the same name already exists on the active branch, the prior label is moved to the new leaf (no duplicates).
460
+ • action='rewind', labelStart='<existing>', labelEnd='<new>': collapse work between labelStart and the current leaf into a branch_summary. The new summary entry is itself labeled with labelEnd, so you can chain rewinds.
461
+ • action='list': show all named anchors on the active branch in chronological order, with cumulative context % at each anchor.
462
+
463
+ Both \`name\` (anchor) and \`labelEnd\` (rewind) write into the same anchor namespace — either becomes addressable as a future \`labelStart\`. \`list\` shows every anchor under the \`anchor:\` prefix, regardless of which action wrote it. Pi labels written via \`/label anchor:foo\` (manually or by other extensions) are also addressable here. Avoid the \`anchor:\` prefix in manually-set labels.
464
+
465
+ \`summaryFocus\` is required when \`action='rewind'\` (≥${MIN_SUMMARY_FOCUS_LENGTH} chars after trim). Calls without it are rejected. It's passed to pi's \`generateBranchSummary\` as \`customInstructions\`, biasing the summarizer LLM toward the agent's specified focus while it rewrites the collapsed work into pi's structured summary format. To preserve continuity, instruct the summarizer to keep: (1) the user's most recent message verbatim, (2) what's done in the collapsed segment, (3) what's left to do as a next action.`,
466
+ promptSnippet:
467
+ "Use to anchor named milestones and rewind the conversation tree to a prior point with a model-generated summary, for token-efficient long autonomous sessions.",
468
+ // The schema is intentionally a flat `Type.Object` with everything-but-
469
+ // `action` optional, with action-conditional required-ness enforced at
470
+ // runtime in `execute`. Discriminated unions / property-level required-
471
+ // when-action shapes break the Kiro/CodeWhisperer adapter, which forwards
472
+ // `inputSchema.json` verbatim and 400s on non-`type: "object"` roots.
473
+ // The runtime guards in `execute` provide the conditional-required
474
+ // behavior the schema can't express.
475
+ parameters: Type.Object({
476
+ action: Type.Union(
477
+ [Type.Literal("anchor"), Type.Literal("rewind"), Type.Literal("list")],
478
+ {
479
+ description:
480
+ "Which operation to perform. 'anchor' labels the current point, 'rewind' collapses work between an anchor and the current leaf into a branch_summary, 'list' shows every anchor on the active branch.",
481
+ },
482
+ ),
483
+ name: Type.Optional(
484
+ Type.String({
485
+ description: `Required when action='anchor'. Kebab-case label (max ${MAX_NAME_LENGTH} chars) for the milestone. If a label with this name already exists on the active branch, it is moved to the new leaf.`,
486
+ }),
487
+ ),
488
+ labelStart: Type.Optional(
489
+ Type.String({
490
+ description: `Required when action='rewind'. Kebab-case name (max ${MAX_NAME_LENGTH} chars) of an existing anchor on the active branch — work between this anchor and the current leaf is summarized.`,
491
+ }),
492
+ ),
493
+ labelEnd: Type.Optional(
494
+ Type.String({
495
+ description: `Required when action='rewind'. Kebab-case name (max ${MAX_NAME_LENGTH} chars) for the resulting branch_summary entry. If a label with this name already exists on the active branch, it is moved to the new entry (mirrors anchor's move-on-collision). Becomes addressable as a future labelStart.`,
496
+ }),
497
+ ),
498
+ summaryFocus: Type.Optional(
499
+ Type.String({
500
+ description: `Required when action='rewind'. ≥${MIN_SUMMARY_FOCUS_LENGTH} chars after trim. Should encode (1) the user's most recent instruction verbatim, (2) what was done in the collapsed segment, (3) what's left to do as a next action.`,
501
+ }),
502
+ ),
503
+ }),
504
+ execute: async (toolCallId, params, signal, _onUpdate, ctx) => {
505
+ const sm = ctx.sessionManager as SessionManager;
506
+ const p = params as {
507
+ action: "anchor" | "rewind" | "list";
508
+ name?: string;
509
+ labelStart?: string;
510
+ labelEnd?: string;
511
+ summaryFocus?: string;
512
+ };
513
+
514
+ // --- list ---
515
+ if (p.action === "list") {
516
+ const path = sm.getBranch();
517
+ const allEntries = sm.getEntries();
518
+ const byId = new Map<string, SessionEntry>();
519
+ for (const e of allEntries) byId.set(e.id, e);
520
+ const cw = ctx.model?.contextWindow ?? 0;
521
+ const totalTokens = estimateActiveBranchTokens(sm);
522
+ const reflectionOk = !!findOwningSession(sm);
523
+
524
+ const lines: string[] = [];
525
+ for (const e of path) {
526
+ const lbl = sm.getLabel(e.id);
527
+ if (lbl?.startsWith(LABEL_PREFIX)) {
528
+ const name = lbl.slice(LABEL_PREFIX.length);
529
+ const tokensAt = estimateAtEntry(allEntries, e.id, byId);
530
+ const pct = formatPct1(tokensAt, cw).padStart(LIST_PCT_COL_WIDTH);
531
+ const hint = findLabelHint(sm, e.id, LIST_HINT_MAX_LENGTH);
532
+ const hintPart = hint ? ` (after: “${hint}”)` : "";
533
+ lines.push(
534
+ ` ${pct} ${name.padEnd(LIST_LABEL_COL_WIDTH)}${hintPart}`,
535
+ );
536
+ }
537
+ }
538
+
539
+ const reflectionWarning = reflectionOk
540
+ ? ""
541
+ : ` · ${REFLECTION_BOOTSTRAP_WARNING_LIST}`;
542
+ const header = `[list] · ${lines.length} label${lines.length === 1 ? "" : "s"} · ctx ${formatPct1(totalTokens, cw)}${cw > 0 ? ` of ${formatWindow(cw)}` : ""}${reflectionWarning}`;
543
+ const body = lines.length
544
+ ? `Active labels (root → leaf):\n${lines.join("\n")}`
545
+ : "No labels on the active branch.";
546
+ return {
547
+ content: [{ type: "text", text: `${header}\n\n${body}` }],
548
+ details: {
549
+ count: lines.length,
550
+ contextTokens: totalTokens,
551
+ contextWindow: cw,
552
+ reflectionOk,
553
+ },
554
+ };
555
+ }
556
+
557
+ // --- anchor ---
558
+ if (p.action === "anchor") {
559
+ if (!isValidName(p.name)) {
560
+ return toolError(
561
+ `anchor requires \`name\` in kebab-case, max ${MAX_NAME_LENGTH} chars (e.g. 'impl-start').`,
562
+ );
563
+ }
564
+ const leafId = sm.getLeafId();
565
+ if (!leafId) {
566
+ return toolError("No session entries yet — nothing to anchor.");
567
+ }
568
+ // Write the new label first, then clear the prior. If the second
569
+ // setLabel throws, two labels of the same name briefly coexist on
570
+ // the active branch — `findLabeledEntry` walks leaf→root and
571
+ // returns the leaf-side match, so navigation behavior is correct
572
+ // during the overlap. The pre-PR "no enforcement" semantics already
573
+ // tolerated this. The reverse order (clear-then-set) was move-then-
574
+ // lose under failure: a partial collapse left the active branch
575
+ // with no anchor of the requested name at all.
576
+ const fullLabel = LABEL_PREFIX + p.name;
577
+ const prior = findLabeledEntry(sm, fullLabel);
578
+ pi.setLabel(leafId, fullLabel);
579
+ if (prior && prior !== leafId) {
580
+ pi.setLabel(prior, undefined);
581
+ }
582
+ const cw = ctx.model?.contextWindow ?? 0;
583
+ const tokensHere = estimateActiveBranchTokens(sm);
584
+ const labelHint = findLabelHint(sm, leafId, ANCHOR_HINT_MAX_LENGTH);
585
+ const positionLine = `${formatPct1(tokensHere, cw)}${cw > 0 ? ` of ${formatWindow(cw)}` : ""}`;
586
+ const hintLine = labelHint ? ` (after: “${labelHint}”)` : "";
587
+ return {
588
+ content: [
589
+ {
590
+ type: "text",
591
+ text:
592
+ `[anchor '${p.name}'] set at ${positionLine}${hintLine}\n\n` +
593
+ `When you finish this stage, call: navigate_tree(action='rewind', labelStart='${p.name}', labelEnd='<milestone-name>', summaryFocus='<≥${MIN_SUMMARY_FOCUS_LENGTH}-char focus: latest user instruction + done + remaining>').`,
594
+ },
595
+ ],
596
+ details: {
597
+ label: p.name,
598
+ entryId: leafId,
599
+ contextTokens: tokensHere,
600
+ labelHint,
601
+ movedFromPriorEntry: prior && prior !== leafId ? prior : null,
602
+ },
603
+ };
604
+ }
605
+
606
+ // --- rewind ---
607
+ if (!isValidName(p.labelStart)) {
608
+ return toolError(
609
+ `rewind requires \`labelStart\` in kebab-case, max ${MAX_NAME_LENGTH} chars.`,
610
+ );
611
+ }
612
+ if (!isValidName(p.labelEnd)) {
613
+ return toolError(
614
+ `rewind requires \`labelEnd\` in kebab-case, max ${MAX_NAME_LENGTH} chars.`,
615
+ );
616
+ }
617
+ if (
618
+ !p.summaryFocus ||
619
+ p.summaryFocus.trim().length < MIN_SUMMARY_FOCUS_LENGTH
620
+ ) {
621
+ const focusLen = p.summaryFocus?.trim().length ?? 0;
622
+ return toolError(
623
+ `\`summaryFocus\` must be ≥${MIN_SUMMARY_FOCUS_LENGTH} chars after trim (got ${focusLen}). The user's most recent instruction (which triggered this rewind) lives on the chain that's about to be collapsed — if summaryFocus doesn't preserve it, the post-rewind turn won't know what's left to do.\n\n` +
624
+ `Include in summaryFocus:\n` +
625
+ ` 1. the user's most recent instruction verbatim,\n` +
626
+ ` 2. which parts have already been done in the work being collapsed,\n` +
627
+ ` 3. which parts remain unactioned.`,
628
+ );
629
+ }
630
+
631
+ const target = findLabeledEntry(sm, LABEL_PREFIX + p.labelStart);
632
+ if (!target) {
633
+ return toolError(
634
+ `No label '${p.labelStart}' on the active branch. Use action='list' to see available labels.`,
635
+ );
636
+ }
637
+
638
+ const oldLeaf = sm.getLeafId();
639
+ if (!oldLeaf || oldLeaf === target) {
640
+ return toolError(
641
+ `Already at '${p.labelStart}' — nothing to summarize.`,
642
+ );
643
+ }
644
+
645
+ if (!ctx.model) {
646
+ return toolError("No model configured for summarization.");
647
+ }
648
+
649
+ const auth = await ctx.modelRegistry.getApiKeyAndHeaders(ctx.model);
650
+ if (!auth.ok) {
651
+ return toolError(`Auth resolution failed: ${auth.error}`);
652
+ }
653
+
654
+ // The leaf at execute time is the assistant that just streamed the
655
+ // rewind tool call. Its `usage.input` is the *minimum* of recent API
656
+ // calls in this turn, so estimating from it understates what the
657
+ // user just saw. Use the chain up to its parent (which has the prior
658
+ // assistant's usage as baseline) so beforeTokens matches the value
659
+ // `list` would have reported on the previous turn.
660
+ const oldLeafEntry = sm.getEntry(oldLeaf);
661
+ let beforeTokens: number;
662
+ if (
663
+ oldLeafEntry &&
664
+ oldLeafEntry.type === "message" &&
665
+ oldLeafEntry.message.role === "assistant" &&
666
+ oldLeafEntry.parentId
667
+ ) {
668
+ const allEntries = sm.getEntries();
669
+ const byId = new Map<string, SessionEntry>();
670
+ for (const e of allEntries) byId.set(e.id, e);
671
+ beforeTokens = estimateAtEntry(allEntries, oldLeafEntry.parentId, byId);
672
+ } else {
673
+ beforeTokens = estimateActiveBranchTokens(sm);
674
+ }
675
+ const contextWindow = ctx.model.contextWindow ?? 0;
676
+
677
+ const { entries } = collectEntriesForBranchSummary(sm, oldLeaf, target);
678
+ if (entries.length === 0) {
679
+ return toolError(
680
+ `No entries between leaf and '${p.labelStart}' — nothing to summarize.`,
681
+ );
682
+ }
683
+ // Chained-rewind-no-turns guard: bail if the only message between
684
+ // leaf and target matches our synthetic shape (single navigate_tree
685
+ // toolCall block + stopReason "toolUse" + zero usage), meaning the
686
+ // agent didn't append a real turn between rewinds. Keep only message
687
+ // entries — label / compaction / branch_summary / etc. carry no
688
+ // rewindable semantic content for this guard. The synthetic-shape
689
+ // predicate avoids false-positives on real navigate_tree calls
690
+ // (which have nonzero usage from the model).
691
+ const messageEntries = entries.filter((e) => e.type === "message");
692
+ if (messageEntries.length === 1) {
693
+ const lone = messageEntries[0];
694
+ if (lone.type === "message" && lone.message.role === "assistant") {
695
+ const msg = lone.message as {
696
+ content: Array<{ type?: string; name?: string }>;
697
+ stopReason?: string;
698
+ usage?: { input?: number; output?: number };
699
+ };
700
+ const block = msg.content[0];
701
+ const isSyntheticShape =
702
+ msg.stopReason === "toolUse" &&
703
+ (msg.usage?.input ?? 0) === 0 &&
704
+ (msg.usage?.output ?? 0) === 0;
705
+ if (
706
+ block &&
707
+ block.type === "toolCall" &&
708
+ block.name === "navigate_tree" &&
709
+ isSyntheticShape
710
+ ) {
711
+ return toolError(
712
+ `Already at synthetic boundary — no work to summarize. Append at least one turn between rewinds.`,
713
+ );
714
+ }
715
+ }
716
+ }
717
+
718
+ const result = await summarize(entries, {
719
+ model: ctx.model,
720
+ apiKey: auth.apiKey ?? "",
721
+ headers: auth.headers,
722
+ signal: signal ?? new AbortController().signal,
723
+ customInstructions: p.summaryFocus,
724
+ });
725
+ if (result.aborted) {
726
+ return toolError("Summarization aborted.");
727
+ }
728
+ if (result.error || !result.summary) {
729
+ return toolError(
730
+ `Summarization failed: ${result.error ?? "no summary text"}`,
731
+ );
732
+ }
733
+
734
+ // Move the tree.
735
+ const summaryId = sm.branchWithSummary(target, result.summary, {
736
+ readFiles: result.readFiles ?? [],
737
+ modifiedFiles: result.modifiedFiles ?? [],
738
+ });
739
+
740
+ // Chain-validity invariants once `branchWithSummary` succeeds: a
741
+ // synthetic must land on every path with toolCallId === this
742
+ // in-flight call (so pi's appended tool_result pairs), and
743
+ // stopReason: "toolUse" (survives Kiro's normalizeMessages filter
744
+ // — see `buildSyntheticAssistant` JSDoc). The synthetic append
745
+ // sits OUTSIDE the try so it runs exactly once regardless of
746
+ // which earlier step threw. The labelEnd write precedes the clear,
747
+ // mirroring `anchor`'s move-on-collision so duplicate anchors
748
+ // can't survive a chained rewind.
749
+ const fullLabelEnd = LABEL_PREFIX + p.labelEnd;
750
+ let priorLabelEnd: ReturnType<typeof findLabeledEntry> = null;
751
+ let tokensAtNewLeaf = 0;
752
+ let originalErr: unknown;
753
+ let salvageDetail = "";
754
+ let failedStep:
755
+ | "lookup"
756
+ | "setLabelEnd"
757
+ | "clearPrior"
758
+ | "estimate"
759
+ | null = null;
760
+ try {
761
+ failedStep = "lookup";
762
+ priorLabelEnd = findLabeledEntry(sm, fullLabelEnd);
763
+ failedStep = "setLabelEnd";
764
+ pi.setLabel(summaryId, fullLabelEnd);
765
+ // `summaryId` was freshly allocated by branchWithSummary above;
766
+ // no pre-existing label can already point at it.
767
+ if (priorLabelEnd) {
768
+ failedStep = "clearPrior";
769
+ pi.setLabel(priorLabelEnd, undefined);
770
+ }
771
+
772
+ // Compute afterTokens NOW — before we append the synthetic. This
773
+ // captures the chain size at the new leaf (branch_summary) using
774
+ // the prior real assistant's usage as the baseline.
775
+ failedStep = "estimate";
776
+ tokensAtNewLeaf = estimateActiveBranchTokens(sm);
777
+ failedStep = null;
778
+ } catch (err) {
779
+ originalErr = err;
780
+ // Best-effort retry of the specific failed step (pi.setLabel is
781
+ // idempotent under re-application). Per-step recovery shape:
782
+ // - setLabelEnd: retry pi.setLabel(summaryId, fullLabelEnd).
783
+ // - clearPrior: retry pi.setLabel(priorLabelEnd, undefined).
784
+ // - lookup / estimate: no retry (either the prior state is unknown
785
+ // or both labels were already written); a redundant retry would mask
786
+ // the real cause.
787
+ if (failedStep === "setLabelEnd") {
788
+ try {
789
+ pi.setLabel(summaryId, fullLabelEnd);
790
+ } catch (retryErr) {
791
+ salvageDetail = `labelEnd retry failed: ${
792
+ retryErr instanceof Error ? retryErr.message : String(retryErr)
793
+ }`;
794
+ }
795
+ } else if (failedStep === "clearPrior" && priorLabelEnd) {
796
+ try {
797
+ pi.setLabel(priorLabelEnd, undefined);
798
+ } catch (retryErr) {
799
+ salvageDetail = `prior-clear retry failed: ${
800
+ retryErr instanceof Error ? retryErr.message : String(retryErr)
801
+ }`;
802
+ }
803
+ }
804
+ }
805
+
806
+ // Synthetic append: runs in BOTH the happy path and the salvage path.
807
+ // If `originalErr` is set we use a degenerate synthetic
808
+ // (totalTokens=0, since `tokensAtNewLeaf` may not have been computed).
809
+ // The synthetic's matching toolCallId is the only structural
810
+ // requirement for chain validity — pi's appended tool_result pairs
811
+ // with this synthetic regardless of which earlier step threw.
812
+ //
813
+ // The full `summaryFocus` is already live in the LLM call to
814
+ // `generateBranchSummary`; we only need a trimmed copy in the
815
+ // synthetic's args (which pi will re-emit on every subsequent turn).
816
+ // Truncate to MAX_SYNTHETIC_FOCUS_LENGTH so a long focus string
817
+ // doesn't inflate every later turn indefinitely.
818
+ const syntheticArgs: Record<string, unknown> = {
819
+ ...(p as unknown as Record<string, unknown>),
820
+ };
821
+ if (
822
+ typeof p.summaryFocus === "string" &&
823
+ p.summaryFocus.length > MAX_SYNTHETIC_FOCUS_LENGTH
824
+ ) {
825
+ syntheticArgs.summaryFocus = `${p.summaryFocus.slice(0, MAX_SYNTHETIC_FOCUS_LENGTH)}… [truncated]`;
826
+ }
827
+ const syntheticMsg = buildSyntheticAssistant(
828
+ toolCallId,
829
+ "navigate_tree",
830
+ syntheticArgs,
831
+ ctx.model as
832
+ | { api?: string; provider?: string; id?: string }
833
+ | undefined,
834
+ originalErr ? 0 : tokensAtNewLeaf,
835
+ );
836
+ const syntheticId = sm.appendMessage(syntheticMsg);
837
+
838
+ // Refresh agent.state.messages so the next prompt() snapshot reflects
839
+ // the rewound chain. Runs in both paths; `refreshAgentMessages`
840
+ // already swallows internal throws, so it can't re-trigger salvage.
841
+ const refreshed = refreshAgentMessages(sm);
842
+
843
+ if (originalErr) {
844
+ // Salvage path: synthetic landed (chain is valid), labelEnd retry
845
+ // and refresh were best-effort. Re-throw the original error with
846
+ // any salvage detail attached so the failure surfaces to the
847
+ // agent and post-mortem reviewers can tell what was recovered.
848
+ // Preserve the original via `Error.cause` (ES2022) so callers
849
+ // doing `instanceof` checks against typed subclasses, or
850
+ // post-mortem readers walking the cause chain, can recover the
851
+ // original throw. Older runtimes silently ignore the options
852
+ // bag, so this stays backward-compatible without a feature gate.
853
+ const baseMsg =
854
+ originalErr instanceof Error
855
+ ? originalErr.message
856
+ : String(originalErr);
857
+ throw new Error(
858
+ salvageDetail ? `${baseMsg} (salvage: ${salvageDetail})` : baseMsg,
859
+ { cause: originalErr },
860
+ );
861
+ }
862
+
863
+ const afterTokens = tokensAtNewLeaf;
864
+
865
+ return {
866
+ content: [
867
+ {
868
+ type: "text",
869
+ text:
870
+ `[rewind '${p.labelStart}' → '${p.labelEnd}'] · ${formatContextDelta(beforeTokens, afterTokens, contextWindow)}\n\n` +
871
+ `A branch_summary recording the work just collapsed has been appended to your context. Items under '### Done' are complete. Items under '### In Progress', '### Blocked', or '## Next Steps' are pending — execute them next without re-confirming with the user. Other branch_summary messages, if present, record earlier collapsed segments.` +
872
+ (refreshed ? "" : `\n\n${REFLECTION_BOOTSTRAP_WARNING_REWIND}`),
873
+ },
874
+ ],
875
+ details: {
876
+ labelStart: p.labelStart,
877
+ labelEnd: p.labelEnd,
878
+ targetId: target,
879
+ summaryId,
880
+ syntheticAssistantId: syntheticId,
881
+ collapsedEntries: entries.length,
882
+ contextBefore: beforeTokens,
883
+ contextAfter: afterTokens,
884
+ contextWindow,
885
+ agentMessagesRefreshed: refreshed,
886
+ readFiles: result.readFiles ?? [],
887
+ modifiedFiles: result.modifiedFiles ?? [],
888
+ },
889
+ };
890
+ },
891
+ });
892
+ }
893
+
894
+ /**
895
+ * Unstable, test-only hooks. **Do NOT import in production code.**
896
+ *
897
+ * The `__` prefix and individual member names are subject to change in any
898
+ * release without a semver-major bump. Intended exclusively for hermetic
899
+ * tests within this package; the hooks reach into module-internal state
900
+ * (the `AgentSession.prototype` patch, the `seenSessions` WeakSet, the
901
+ * `sessionInstances` array, the `prepareNextTurn` marker symbol) and are
902
+ * not designed for external consumption.
903
+ *
904
+ * If you found this via `node_modules` archaeology, you're holding it
905
+ * wrong — use the registered tool surface (`navigate_tree`) instead.
906
+ */
907
+ export const __testHooks = {
908
+ /**
909
+ * Restore the original `AgentSession.prototype.prompt` (stashed by
910
+ * `patchAgentSessionPrototype` under the `ORIG_PROMPT_KEY` symbol) and
911
+ * drain the captured-session refs. Idempotent: a no-op if the patch was
912
+ * never installed or has already been reset.
913
+ */
914
+ resetPrototype(): void {
915
+ const proto = AgentSession.prototype as unknown as Record<
916
+ PropertyKey,
917
+ unknown
918
+ >;
919
+ const orig = proto[ORIG_PROMPT_KEY];
920
+ if (typeof orig === "function") {
921
+ proto.prompt = orig;
922
+ delete proto[ORIG_PROMPT_KEY];
923
+ }
924
+ sessionInstances.length = 0;
925
+ // WeakSet has no .clear(); rebind to a fresh instance so a test that
926
+ // re-captures the SAME session identity post-reset isn't deduped by
927
+ // stale state from the previous test's capture.
928
+ seenSessions = new WeakSet();
929
+ },
930
+ /** Module-internal helpers exposed for hermetic unit tests. */
931
+ buildSyntheticAssistant,
932
+ findLabelHint,
933
+ findLabeledEntry,
934
+ installPrepareNextTurn,
935
+ refreshAgentMessages,
936
+ captureSession,
937
+ /** Symbol used to mark the wrapper installed by `installPrepareNextTurn`. */
938
+ PNT_MARKER,
939
+ /** Read-only view of captured-session ref count for reaping assertions. */
940
+ sessionRefCount(): number {
941
+ return sessionInstances.length;
942
+ },
943
+ /**
944
+ * The reflection-bootstrap-missing warning strings, split per site.
945
+ * Exported so tests can pin the per-site verbatim wording (the `list`
946
+ * site is read-only and uses the read-only phrasing; the `rewind` site
947
+ * writes to disk and uses the write phrasing). Tests assert literal
948
+ * containment at each site to catch drift.
949
+ */
950
+ REFLECTION_BOOTSTRAP_WARNING_LIST,
951
+ REFLECTION_BOOTSTRAP_WARNING_REWIND,
952
+ };
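The reload-safe composition performed by `installPrepareNextTurn` above can be illustrated with a minimal, self-contained sketch. It is not the package's implementation: `PntResult`, `PntFn`, `MarkedFn`, `FakeAgent`, `MARKER`, and `install` are hypothetical stand-ins for pi's real types, and the context is reduced to `messages` and `tools`. The idea shown is that a wrapper marked as "ours" is unwrapped to the hook it captured before re-wrapping, so repeated loads do not stack wrappers and another extension's hook still runs once per turn.

```typescript
// Sketch only: every name below is a hypothetical stand-in, not pi's API.
type PntResult = { context?: { messages?: string[]; tools?: string[] } };
type PntFn = (...args: unknown[]) => Promise<PntResult | undefined>;

// Marker symbol identifying wrappers installed by this sketch.
const MARKER = Symbol("pnt-sketch");
type MarkedFn = PntFn & { [MARKER]?: boolean; prior?: PntFn };

interface FakeAgent {
  prepareNextTurn?: PntFn;
}

function install(agent: FakeAgent, buildMessages: () => string[]): void {
  const existing = agent.prepareNextTurn as MarkedFn | undefined;
  // If the existing hook is one of ours, recover the hook it captured
  // instead of wrapping our own previous wrapper.
  const prior: PntFn | undefined =
    existing !== undefined && existing[MARKER] ? existing.prior : existing;

  const next: MarkedFn = async (...args: unknown[]) => {
    const priorResult = prior ? await prior(...args) : undefined;
    // Spread the prior context first so its fields survive; `messages`
    // is always supplied by this wrapper.
    return { context: { ...priorResult?.context, messages: buildMessages() } };
  };
  next[MARKER] = true;
  next.prior = prior;
  agent.prepareNextTurn = next;
}

// Usage: another extension's hook is installed first, then we load twice.
const otherExtension: PntFn = async () => ({ context: { tools: ["bash"] } });
const agent: FakeAgent = { prepareNextTurn: otherExtension };
install(agent, () => ["first load"]);
install(agent, () => ["after reload"]); // simulated /reload
```

After the second `install`, the active wrapper's captured `prior` is `otherExtension` again rather than the first wrapper, which is the property the `__prior` recovery in the source is meant to preserve.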
package/package.json ADDED
@@ -0,0 +1,55 @@
1
+ {
2
+ "name": "@cad0p/pi-tree-navigator",
3
+ "version": "0.1.0",
4
+ "description": "agent-callable session tree navigation for pi: anchor named milestones, rewind work into a branch summary, free up context for long autonomous sessions",
5
+ "publishConfig": {
6
+ "access": "public",
7
+ "registry": "https://registry.npmjs.org"
8
+ },
9
+ "keywords": [
10
+ "pi",
11
+ "pi-package",
12
+ "pi-extension",
13
+ "session-tree",
14
+ "branch-summary",
15
+ "agent-memory",
16
+ "context-management"
17
+ ],
18
+ "repository": {
19
+ "type": "git",
20
+ "url": "git+https://github.com/cad0p/pi-tree-navigator.git"
21
+ },
22
+ "homepage": "https://github.com/cad0p/pi-tree-navigator#readme",
23
+ "bugs": {
24
+ "url": "https://github.com/cad0p/pi-tree-navigator/issues"
25
+ },
26
+ "files": [
27
+ "extensions/**/*.ts",
28
+ "!extensions/**/*.test.ts",
29
+ "README.md",
30
+ "CHANGELOG.md",
31
+ "LICENSE"
32
+ ],
33
+ "pi": {
34
+ "extensions": [
35
+ "extensions/navigate-tree"
36
+ ]
37
+ },
38
+ "scripts": {
39
+ "lint": "bunx biome check extensions/",
40
+ "lint:fix": "bunx biome check --write extensions/",
41
+ "typecheck": "bunx tsc --noEmit",
42
+ "test": "bun test"
43
+ },
44
+ "devDependencies": {
45
+ "@biomejs/biome": "^2.3.14",
46
+ "@types/bun": "^1.3.0",
47
+ "typescript": "^5.8.0"
48
+ },
49
+ "peerDependencies": {
50
+ "@earendil-works/pi-agent-core": ">=0.74.0",
51
+ "@earendil-works/pi-coding-agent": ">=0.74.0",
52
+ "typebox": "^1.0.0"
53
+ },
54
+ "license": "MIT"
55
+ }