pi-cursor-sdk 0.1.14 → 0.1.15

Files changed (16)

package/CHANGELOG.md +30 -0
package/README.md +55 -13
package/docs/cursor-model-ux-spec.md +17 -3
package/docs/cursor-native-tool-replay.md +88 -0
package/docs/cursor-native-tool-visual-audit.md +183 -0
package/package.json +5 -2
package/src/context.ts +34 -11
package/src/cursor-mcp-timeout-override.ts +111 -0
package/src/cursor-native-tool-display.ts +397 -46
package/src/cursor-pi-tool-bridge.ts +637 -0
package/src/cursor-provider.ts +477 -81
package/src/cursor-question-tool.ts +247 -0
package/src/cursor-session-cwd.ts +33 -0
package/src/cursor-tool-names.ts +67 -0
package/src/cursor-tool-transcript.ts +730 -61
package/src/index.ts +7 -0

package/CHANGELOG.md CHANGED

@@ -1,5 +1,35 @@
 # Changelog
+## Unreleased
+## 0.1.15 - 2026-05-21
+### Added
+- Add the default-on local pi MCP tool bridge, which exposes bridgeable active pi tools to local Cursor agents while executing calls through pi's normal tool path.
+- Add `cursor_ask_question` through the bridge so Cursor can ask users through pi UI as `pi__cursor_ask_question`.
+- Add `PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1` for opting in to overlapping built-in pi tools that are hidden from the Cursor bridge by default.
+- Add Cursor SDK MCP tool-call timeout overrides via `PI_CURSOR_MCP_TOOL_TIMEOUT_SECONDS` and `PI_CURSOR_MCP_TOOL_TIMEOUT_MS` for long-running local MCP tools, including bridged pi tools.
+- Replay Cursor SDK `grep` activity through native pi `grep` cards and `glob` activity through native pi `find` cards, so search activity matches built-in tool UX in interactive TTY sessions.
+### Changed
+- Load Cursor setting sources with `PI_CURSOR_SETTING_SOURCES=all` by default while filtering direct Cursor SDK startup logs so settings, rules, plugins, and configured Cursor MCP servers are available without corrupting pi's TUI.
+### Fixed
+- Replay recorded Cursor tool errors, including nonzero shell exits and timeout-backgrounded shell commands, as native pi tool errors instead of successful green cards.
+- Format zero-match Cursor grep results as `(no matches)` instead of raw `{ "totalMatches": 0 }` JSON in native replay and transcript output.
+- Strip trailing colons from Cursor grep file-list replay output.
+- Make native Cursor read replay closer to pi's built-in read cards by displaying session-relative paths and 20-line continuation hints.
+- Convert Cursor SDK shell timeouts from milliseconds to seconds in native bash replay cards instead of rendering `30000ms` as `30000s`.
+- Use the pi session cwd for Cursor `Agent.create`, not only native tool replay display. Completes the 0.1.10 cwd work that previously updated replay registration but left the Cursor agent runtime on `process.cwd()`.
+- Replay path-only Cursor `write` activity through neutral recorded Cursor activity instead of invalid native pi `write` calls.
+- Preserve literal `cursor_edit`, `cursor_write`, and `cursor_mcp` text in user messages, assistant text, tool args, and tool results while still relabeling structured replay tool names.
+- Avoid hiding unrelated MCP activity whose result payload merely contains a bridge tool name, while still suppressing real bridge-owned Cursor MCP replay by invocation identity and call ID.
+- Clean up pending native replay waits when abort signals are already aborted or abort before listener registration.
+- Suppress direct Cursor SDK settings/skills startup noise, including late `managed_skills.removed` lines, without swallowing unrelated non-startup stdout/stderr output.
 ## 0.1.14 - 2026-05-18
 ### Changed
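
The shell-timeout fix in the 0.1.15 changelog entry is a unit conversion: Cursor SDK reports milliseconds, while the native bash replay card labels seconds. A minimal sketch of that conversion follows. The function name `formatShellTimeoutSeconds` and its rounding behavior are hypothetical and are not taken from the package source.

```typescript
// Hypothetical helper, not the package's actual code.
// Cursor SDK shell timeouts arrive in milliseconds; the replay card shows seconds.
function formatShellTimeoutSeconds(timeoutMs: number): string {
  // Assumption: missing or non-positive timeouts render as an empty label.
  if (!Number.isFinite(timeoutMs) || timeoutMs <= 0) return "";
  const seconds = timeoutMs / 1000;
  // Whole seconds stay clean ("30s"); fractions keep one decimal place ("1.5s").
  const shown = Number.isInteger(seconds) ? seconds : Number(seconds.toFixed(1));
  return `${shown}s`;
}
```

With this sketch, a 30000 ms timeout renders as `30s` rather than the `30000s` the changelog describes as the bug.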

package/README.md CHANGED

@@ -165,7 +165,7 @@ For Claude models with both `thinking` and `effort`, pi thinking `off` sends `th
 In `pi --list-models`, `thinking=no` means pi cannot control the model's thinking level with `--thinking`, a final `:medium` model suffix, or shift+tab. It does not mean the Cursor model cannot think.
-Some Cursor SDK models do not expose a `reasoning`, `effort`, or `thinking` parameter for the extension to set. Cursor thinking is still enabled/supported by the model, and Cursor may still emit thinking deltas. The extension does not disable Cursor's default reasoning behavior.
+Some Cursor SDK models do not expose a `reasoning`, `effort`, or `thinking` parameter for the extension to set. Cursor thinking is still enabled/supported by the model, and Cursor may still emit thinking deltas. The extension surfaces those deltas through pi's native thinking rendering when the SDK emits them.
 ## Fast mode
@@ -197,6 +197,31 @@ If you do not see `cursor fast`, fast mode is off.
 Images from the latest user message are forwarded to Cursor. Historical images are kept out of the transcript and appear only as `[image omitted from transcript]` placeholders, so follow-up questions about an earlier image should reattach the image or include a textual description. The extension advertises `text` and `image` input for Cursor models because Cursor's SDK accepts image messages and Cursor models are expected to support them.
+## Tools and local pi bridge
+Cursor runs use local Cursor SDK agents. Cursor's own local-agent tools, Cursor settings, plugins, and configured Cursor MCP servers remain available through the SDK.
+In addition, pi-cursor-sdk exposes the current pi session's bridgeable active tools to Cursor through a local loopback MCP bridge by default. The bridge snapshots `pi.getActiveTools()` and `pi.getAllTools()` for each Cursor run, excludes internal Cursor replay activity names, and hides overlapping built-in pi tools (`read`, `bash`, `write`, `edit`, `grep`, `find`, `ls`) by default because Cursor local agents already have native equivalents. Non-overlapping active tools and extension/custom tools present in pi's active tool registry are exposed as MCP tools with collision-safe names such as `pi__sem_reindex`, and calls map back to the real pi tool names. When Cursor calls a bridged tool, pi executes it through the normal pi tool path, so confirmations, tool hooks, renderers, session history, and abort behavior stay pi-native. The bridge also exposes `cursor_ask_question` as `pi__cursor_ask_question` when enabled, allowing Cursor to ask the user through pi UI instead of silently choosing a default. The bridge does not call tool `execute()` handlers directly.
+Bridge controls:
+```bash
+# Roll back to Cursor SDK tools/settings/MCP only; do not expose active pi tools through the bridge.
+PI_CURSOR_PI_TOOL_BRIDGE=0 pi --model cursor/composer-2.5
+# Opt in to also expose overlapping pi tool names through the bridge.
+PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1 pi --model cursor/composer-2.5
+# Override Cursor SDK MCP tool-call timeout, including bridged pi tools and configured Cursor MCP servers.
+PI_CURSOR_MCP_TOOL_TIMEOUT_SECONDS=7200 pi --model cursor/composer-2.5
+PI_CURSOR_MCP_TOOL_TIMEOUT_MS=7200000 pi --model cursor/composer-2.5
+```
+`PI_CURSOR_PI_TOOL_BRIDGE=0` is the supported rollback flag and disables the bridge entirely. The bridge also treats `false`, `off`, `none`, `no`, and `disabled` as off; `1`, `true`, `on`, `yes`, and `enabled` as on. `PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1` opts in to exposing overlapping pi tool names that Cursor already has native equivalents for (`read`, `bash`, `write`, `edit`, `grep`, `find`, and `ls`). By default those names are hidden even when pi's Cursor replay wrapper has registered them as extension tools. The Cursor MCP timeout override defaults to 3600 seconds because the installed Cursor SDK has a 60-second MCP request default that is too short for some local MCP tools, including bridged pi tools and configured Cursor MCP servers.
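
The on/off values listed above can be read as a small toggle parser. The sketch below is one way to model it, not the package's implementation: the helper names, the case-insensitive matching, and the fallback to the default for unrecognized values are all assumptions.

```typescript
type Env = Record<string, string | undefined>;

// Values named in the README text above.
const OFF_VALUES = new Set(["0", "false", "off", "none", "no", "disabled"]);
const ON_VALUES = new Set(["1", "true", "on", "yes", "enabled"]);

// Hypothetical helper. Trimming, lowercasing, and falling back to the
// default for unknown values are assumptions, not documented behavior.
function parseToggle(raw: string | undefined, defaultValue: boolean): boolean {
  if (raw === undefined) return defaultValue;
  const value = raw.trim().toLowerCase();
  if (OFF_VALUES.has(value)) return false;
  if (ON_VALUES.has(value)) return true;
  return defaultValue;
}

// The bridge is default-on; exposing overlapping built-ins is default-off.
function bridgeEnabled(env: Env): boolean {
  return parseToggle(env["PI_CURSOR_PI_TOOL_BRIDGE"], true);
}

function exposeBuiltinTools(env: Env): boolean {
  return parseToggle(env["PI_CURSOR_EXPOSE_BUILTIN_TOOLS"], false);
}
```

Passing the environment in as a plain object keeps the sketch independent of Node typings; a caller would typically hand it `process.env`.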
+Cursor-native tool replay is separate from the bridge. Replay only displays completed Cursor SDK tool activity as pi-native-looking cards with recorded results, using pi's normal success/error card shell for neutral Cursor activity too. It never re-runs Cursor-side commands, reapplies Cursor edits, calls MCP servers, or mutates pi state. See [Cursor native tool replay](docs/cursor-native-tool-replay.md).
 ## Fallback models
 If no key is available from `/login`, `CURSOR_API_KEY`, or `--api-key`, model discovery fails, or discovery returns no models, the extension registers a bundled fallback snapshot of the latest reviewed Cursor SDK model catalog and notifies interactive users when possible.
@@ -207,11 +232,11 @@ Actual Cursor runs still need a key from `/login`, `CURSOR_API_KEY`, or `--api-k
 ## Limits
-- **Local Cursor SDK agents only.** This extension does not use Cursor cloud agents.
-- **Cursor-side tool use is not re-executed by pi.** Cursor still uses its own internal SDK tools. The extension records completed Cursor tool activity and, in interactive TTY sessions, replays supported `read`, `bash`, `ls`, `edit`, and `write` activity through pi's native tool-call path with recorded results (for example green `read`, `$ ...`, and Cursor edit/write cards) without forcing Cursor to call pi tools or rerun commands. Cursor edit/write activity uses replay-only `cursor_edit` and `cursor_write` tool cards because Cursor's file-editing schema is not the same as pi's built-in `edit`/`write` schemas; those replay tools only display recorded Cursor results and never mutate files directly. If a Cursor read completion reports no content, the extension may include a bounded local file preview for safe in-workspace paths; that preview is explicitly labeled as a local preview captured at transcript time, not guaranteed Cursor-observed content. Native replay wrappers are registered only for tool names not already owned by another extension; skipped tools fall back to the scrubbed Cursor activity transcript. As Cursor SDK tool completions arrive, the extension mirrors native Codex ordering by ending a tool-use turn, letting pi render the recorded tool results immediately, then continuing with live post-tool Cursor thinking/text, any later Cursor tool batches, or Cursor's final answer as the next assistant turn. Non-interactive/session consumers still get bounded scrubbed transcript data so `pi -p` keeps printing normal assistant text.
-- **Pi tool schemas are not passed through to Cursor.** This extension is a Cursor provider, not a bridge that forwards pi's tool system into Cursor.
-- **One fresh Cursor agent is created per provider call.** Cursor agent state is not reused between pi provider calls.
-- **Cursor setting sources are opt-in.** The extension does not pass `local.settingSources` by default because the Cursor SDK can print settings/skills loading output directly to the terminal during startup. To load configured Cursor MCP servers, plugin tools, project/user settings, and related Cursor-native capabilities, start pi with `PI_CURSOR_SETTING_SOURCES=all`. To narrow loading, set a comma-separated list such as `PI_CURSOR_SETTING_SOURCES=project,user,plugins`.
+- **Local Cursor SDK agents only.** This extension does not use Cursor cloud agents. Cloud pi tool bridging is out of scope because it needs a separate auth, transport, lifetime, and remote trust design.
+- **The pi tool bridge is local and MCP-backed.** Bridgeable active pi tools are exposed to local Cursor agents through a tokenized `127.0.0.1` MCP endpoint; internal Cursor replay activity names are excluded, and overlapping built-in pi tools are hidden by default. Set `PI_CURSOR_PI_TOOL_BRIDGE=0` to disable it or `PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1` to expose overlapping built-ins too.
+- **Cursor native tool replay is display-only.** Replay renders recorded Cursor SDK activity and never re-runs Cursor-side commands, reapplies Cursor edits, calls MCP servers, or mutates pi state. Workflow tools such as Cursor `SwitchMode` and Cursor todo state are not pi workflow controls. See [Cursor native tool replay](docs/cursor-native-tool-replay.md) for supported replay cards, ordering, conflict handling, and opt-out flags.
+- **Cursor run state can span tool-use turns.** A new Cursor SDK agent starts for a new provider run. When Cursor calls bridged pi tools or emits replayed Cursor tool activity, the same Cursor SDK run can stay alive across pi `toolUse` turns so results resume in the original Cursor run.
+- **Cursor setting sources default to all.** The extension passes `local.settingSources: ["all"]` by default so configured Cursor MCP servers, plugin tools, project/user settings, and related Cursor-native capabilities are available like they are in Cursor. To narrow loading, set a comma-separated list such as `PI_CURSOR_SETTING_SOURCES=project,user,plugins`. To disable ambient setting sources, set `PI_CURSOR_SETTING_SOURCES=none`. Direct Cursor SDK startup logs are suppressed so setting/skill loading messages do not pollute the TUI.
 - **Max Mode is not a manual pi variant.** Cursor's SDK may enable Max Mode automatically for models that require it. This extension only advertises exact context-window variants that the SDK catalog exposes and otherwise uses conservative SDK-derived default/non-Max context windows.
 - **Output token limits are conservative.** Cursor SDK model metadata does not currently expose output token limits directly.
 - **Token usage is approximate in pi.** Cursor SDK usage events include internal agent/tool/cache work, so the extension reports an approximate replayable pi prompt/output size for context display and compaction decisions.
@@ -251,7 +276,7 @@ pi install npm:pi-cursor-sdk
 ### `pi --list-models` shows `thinking=no`
-That does not mean the model cannot think. It means the Cursor SDK does not expose a pi-controllable thinking parameter for that model. The model may still think internally and may still emit thinking deltas.
+That does not mean the model cannot think. It means the Cursor SDK does not expose a pi-controllable thinking parameter for that model. The model may still think internally and may still emit thinking deltas that pi renders natively.
 ### I do not see `cursor fast` in the footer
@@ -259,21 +284,38 @@ Fast mode is currently off. The footer only shows `cursor fast` when fast mode i
 ### My Cursor app settings or rules do not seem to apply
-Cursor setting sources are not loaded by default because the Cursor SDK can print settings/skills loading output directly to the terminal. Start pi with `PI_CURSOR_SETTING_SOURCES=all`, or choose a narrower list such as `PI_CURSOR_SETTING_SOURCES=project,user,plugins`.
+Cursor setting sources are loaded with `PI_CURSOR_SETTING_SOURCES=all` by default. To narrow loading, set `PI_CURSOR_SETTING_SOURCES=project,user,plugins` or another comma-separated list. If you explicitly disabled sources with `PI_CURSOR_SETTING_SOURCES=none`, remove that override.
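
As a rough illustration of the three cases described above (default `all`, a comma-separated list, and `none`), the sketch below resolves `PI_CURSOR_SETTING_SOURCES` into a list. The function name is hypothetical, and representing `none` as `undefined` (meaning "pass no setting sources") is an assumption about how the provider handles it.

```typescript
// Hypothetical resolver, not the package's actual code.
// Returns the list to pass as setting sources, or undefined to pass none.
function resolveSettingSources(raw: string | undefined): string[] | undefined {
  // Unset or blank: the documented default is "all".
  if (raw === undefined || raw.trim() === "") return ["all"];
  const parts = raw
    .split(",")
    .map((part) => part.trim().toLowerCase())
    .filter((part) => part.length > 0);
  if (parts.length === 0) return ["all"];
  // Assumption: "none" anywhere in the list disables ambient setting sources.
  if (parts.indexOf("none") !== -1) return undefined;
  if (parts.indexOf("all") !== -1) return ["all"];
  // Drop duplicates while keeping the order the user wrote.
  return Array.from(new Set(parts));
}
```

For example, `project,user,plugins` resolves to those three sources in order, while an unset variable resolves to `["all"]`.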
 ### Cursor does not call my web search MCP/tool
-Cursor SDK local agents load MCP servers from Cursor setting sources and inline SDK config. This extension leaves Cursor setting sources off by default to avoid startup log noise, so a web search tool needs to be configured in Cursor and settings sources need to be enabled with `PI_CURSOR_SETTING_SOURCES=all` or a narrower list.
+Cursor SDK local agents load MCP servers from Cursor setting sources and inline SDK config. This extension enables all Cursor setting sources by default, so a missing web search tool usually means it is not configured in Cursor or the run was started with a narrowing/disable override such as `PI_CURSOR_SETTING_SOURCES=none`.
-### Cursor native tool cards conflict with another extension
+### Cursor does not call my pi extension tool
+The local pi bridge only exposes tools that are active in the current pi session and present in pi's tool registry at Cursor run start. By default, it does not expose overlapping pi tool names that Cursor already has native equivalents for (`read`, `bash`, `write`, `edit`, `grep`, `find`, and `ls`). Opt in if you intentionally want Cursor to see both the Cursor-native tool and an overlapping built-in pi tool:
+```bash
+PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1 pi --model cursor/composer-2.5
+```
+To disable the bridge for rollback or isolation, start pi with:
-Cursor native replay is a UI enhancement for interactive TTY sessions. If another extension already owns `read`, `bash`, `ls`, `cursor_edit`, or `cursor_write`, this extension skips only the conflicting native replay wrapper and uses the scrubbed Cursor activity transcript for that tool instead. To disable Cursor native replay registration entirely, start pi with:
+```bash
+PI_CURSOR_PI_TOOL_BRIDGE=0 pi --model cursor/composer-2.5
+```
+### A Cursor MCP tool times out
+The extension raises Cursor SDK's MCP tool-call timeout from 60 seconds to 3600 seconds by default for Cursor SDK MCP `callTool` requests, including the local pi bridge and configured Cursor MCP servers. For longer local MCP tools, set one override:
 ```bash
-PI_CURSOR_NATIVE_TOOL_DISPLAY=0 pi --model cursor/composer-2.5
+PI_CURSOR_MCP_TOOL_TIMEOUT_SECONDS=7200 pi --model cursor/composer-2.5
+PI_CURSOR_MCP_TOOL_TIMEOUT_MS=7200000 pi --model cursor/composer-2.5
 ```
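
The two overrides above resolve to a single timeout value. The sketch below shows one plausible resolution order. It is an assumption that the millisecond variable wins when both are set and that invalid values fall back to the 3600-second default; the README only says to set one override. All names here are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Documented default: 3600 seconds.
const DEFAULT_MCP_TOOL_TIMEOUT_MS = 3600 * 1000;

// Accept only finite, positive numbers; anything else counts as "not set".
function parsePositive(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : undefined;
}

// Hypothetical resolver. Precedence (MS before SECONDS) is an assumption.
function resolveMcpToolTimeoutMs(env: Env): number {
  const ms = parsePositive(env["PI_CURSOR_MCP_TOOL_TIMEOUT_MS"]);
  if (ms !== undefined) return Math.round(ms);
  const seconds = parsePositive(env["PI_CURSOR_MCP_TOOL_TIMEOUT_SECONDS"]);
  if (seconds !== undefined) return Math.round(seconds * 1000);
  return DEFAULT_MCP_TOOL_TIMEOUT_MS;
}
```

Both example commands above resolve to the same value in this sketch: 7200 seconds and 7200000 milliseconds are each two hours.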
-`PI_CURSOR_REGISTER_NATIVE_TOOLS=0` is also accepted as a registration-only opt-out.
+### Cursor native tool cards conflict with another extension
+Cursor native replay is a UI enhancement for interactive TTY sessions. See [Cursor native tool replay](docs/cursor-native-tool-replay.md) for conflict behavior and opt-out flags.
 ## Development

package/docs/cursor-model-ux-spec.md CHANGED

@@ -16,10 +16,16 @@ Current implementation notes:
 - Image payload forwarding sends images only from the latest user message. If the latest user turn is plain text after an earlier image turn, the transcript keeps an `[image omitted from transcript]` placeholder but no image bytes are sent to Cursor. The prompt explicitly tells Cursor that prior image bytes are unavailable and to ask the user to reattach or describe a prior image when needed. Carrying images forward across turns remains a future product decision because it affects token cost, privacy, stale visual context, and expected multimodal follow-up behavior.
 - `@cursor/sdk` is a package dependency of this extension; users should not need a global SDK install.
 - Cursor auth uses pi-native API-key resolution for provider `cursor`: CLI `--api-key`, stored `~/.pi/agent/auth.json` API key from `/login`, then `CURSOR_API_KEY`. The extension config file stores only non-secret Cursor-only state such as fast defaults.
-- Local agents do not pass `settingSources` by default because the Cursor SDK can print settings/skills loading output directly to the terminal during startup. Users can opt in with `PI_CURSOR_SETTING_SOURCES=all` or narrow loading with a comma-separated list such as `PI_CURSOR_SETTING_SOURCES=project,user,plugins`.
+- Local agents pass `settingSources: ["all"]` by default so Cursor MCP servers, plugin tools, project/user settings, and related Cursor-native capabilities are available. Users can narrow loading with a comma-separated list such as `PI_CURSOR_SETTING_SOURCES=project,user,plugins`, or disable ambient setting sources with `PI_CURSOR_SETTING_SOURCES=none`. The provider suppresses direct Cursor SDK startup writes around agent creation so setting/skill loading logs do not pollute pi's TUI.
 - Cursor SDK models are treated as thinking-capable even when pi reports `thinking=no`; that pi column only means the SDK did not expose a pi-controllable thinking parameter for that model.
-- Cursor-side thinking remains visible. Cursor internal tool activity is recorded from SDK events and scrubbed. In interactive TTY sessions, supported completed `read`, `bash`, `ls`, `edit`, and `write` activity is replayed through pi's native tool-call rendering path with recorded Cursor results, so the TUI can show native green cards without forcing Cursor to call pi tools or rerunning Cursor's reads/shell commands/file edits. Cursor edit/write activity is replayed through `cursor_edit` and `cursor_write` cards rather than pi's built-in `edit`/`write` names because Cursor's edit/write schemas differ from pi's schemas; these replay-only tools display recorded Cursor results and fail closed if called without a recorded result. Native replay wrappers are registered only for tool names not already owned by another extension; conflicting tools use the bounded scrubbed transcript fallback. `PI_CURSOR_NATIVE_TOOL_DISPLAY=0` disables native replay, and `PI_CURSOR_REGISTER_NATIVE_TOOLS=0` is a registration-only opt-out that keeps the transcript fallback without shadowing pi tool names. When these native cards are emitted, the provider mirrors Codex's turn shape as Cursor SDK completions arrive: assistant `toolUse`, pi `toolResult`s, live post-tool Cursor thinking/text, any later Cursor tool batches as further `toolUse` turns, then Cursor's final assistant answer. Non-interactive runs keep bounded scrubbed transcript output instead, preserving `pi -p` assistant text output. Cursor text deltas stream live when native tool replay is not active.
+- Cursor-side thinking remains visible through pi's native thinking rendering when the Cursor SDK emits thinking or summary deltas.
+- Local Cursor agents get two tool paths. First, Cursor keeps the Cursor SDK local-agent tool surface plus configured Cursor settings, plugins, and Cursor MCP servers. Second, pi-cursor-sdk exposes active pi tools through a default-on, tokenized loopback MCP bridge when `pi.getActiveTools()` and `pi.getAllTools()` contain exposable tools. Cursor sees collision-safe MCP names such as `pi__sem_reindex`, while pi emits and executes the real pi tool name. Overlapping built-in pi tools (`read`, `bash`, `write`, `edit`, `grep`, `find`, `ls`) are hidden by default because Cursor local agents already have native equivalents; `PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1` opts into exposing them too. The provider also registers `cursor_ask_question` for Cursor models when the bridge is enabled; Cursor sees it as `pi__cursor_ask_question`, and pi executes it through the normal tool path so interactive users can choose options from pi UI. In non-UI modes it reports that UI is unavailable so Cursor can state a default assumption instead. The bridge queues MCP calls, emits provider `toolcall_*` events, waits for matching pi `toolResult` messages by `toolCallId`, resolves the result back to the same Cursor SDK run, and never calls tool `execute()` handlers directly. `PI_CURSOR_PI_TOOL_BRIDGE=0` disables this local bridge, including question bridging. Cloud Cursor agents remain out of scope for the bridge.
+- Cursor SDK MCP tool calls use a guarded timeout override because installed `@cursor/sdk` 1.0.13 has a 60-second MCP request default with no public per-server timeout option. The extension extends that Cursor SDK MCP `callTool` timeout path to 3600 seconds by default. Users can override it with `PI_CURSOR_MCP_TOOL_TIMEOUT_MS` or `PI_CURSOR_MCP_TOOL_TIMEOUT_SECONDS`.
+- Cursor internal tool activity is recorded from SDK events and scrubbed. In interactive TTY sessions, supported completed `read`, `bash`, `grep`, `find`, `ls`, `edit`, `write`, diagnostics, delete, todo/plan, task, image generation, and MCP activity is replayed through pi's native tool-call rendering path with recorded Cursor results, so the TUI can show native-looking cards without rerunning Cursor's reads/shell commands/file edits. Cursor `glob` activity is replayed through native `find` cards. Cursor write activity is replayed through native-looking `write` cards, and Cursor StrReplace/edit activity uses native-looking `edit` only when recorded arguments truthfully satisfy pi's `edit` schema; path-only Cursor edit and notebook edit replay falls back to neutral Cursor activity before pi validation. Diagnostics, delete, todos/plans, task, image, and MCP activity use neutral Cursor activity cards with pi's default success/error shell. Neutral Cursor activity calls include `activityTitle` and, when available, `activitySummary` so partial/collapsed cards preserve identity such as `Cursor plan`, `Cursor todos`, `Cursor MCP`, or `Cursor edit`. Replay-only tools display recorded Cursor results, normalize workspace-local paths/diff headers for display, use pi diff colors for edit previews and path-inferred syntax highlighting for write previews, and fail closed if called without a recorded result. Native replay wrappers are registered only for tool names not already owned by another extension; conflicting tools use the bounded scrubbed transcript fallback. Cursor workflow tools such as `SwitchMode` and Cursor todo state are not pi workflow controls; reported todo/plan events are displayed as Cursor activity only. Plan/todo replay cards can be followed by Cursor's final plan text, selected from `run.wait().result` when Cursor provides one and trimmed against already-emitted text. Started Cursor SDK tool calls that never receive a completion event are discarded without synthetic replay errors; explicit failures remain visible when Cursor reports them through completed tool calls or step results. `PI_CURSOR_NATIVE_TOOL_DISPLAY=0` disables native replay, and `PI_CURSOR_REGISTER_NATIVE_TOOLS=0` is a registration-only opt-out that keeps the transcript fallback without shadowing pi tool names. When bridge or native replay cards are emitted, the provider mirrors Codex's turn shape as Cursor SDK activity arrives: assistant `toolUse`, pi `toolResult`s, live post-tool Cursor thinking/text, any later tool batches as further `toolUse` turns, then Cursor's final assistant answer. For shell replay, completed `stdout` / `stderr` are primary; unambiguous `shell-output-delta` data is used only as display-only fallback for empty successful shell completions, and overlapping shell calls drop ambiguous deltas instead of guessing. Non-interactive runs keep bounded scrubbed transcript output instead, preserving `pi -p` assistant text output. Cursor text deltas stream live when no live-run turn split is active.
+- Synthetic replay names are internal compatibility details. New model-facing prompt text and user-visible cards use native tool names when renderer-compatible, or neutral Cursor activity labels when not. Legacy sessions containing old internal replay names are sanitized before prompt/display. Bridge MCP names such as `pi__sem_reindex` are MCP-only; pi session output uses real pi tool names.
 - Cursor SDK usage events report cumulative internal agent/tool/cache work, not the replayable pi prompt context. The extension reports approximate prompt/output usage for pi context display and compaction decisions instead of copying raw Cursor SDK usage. When native replay splits one Cursor SDK run into multiple pi turns, prompt input is counted once for the run; later synthetic replay turns report `input: 0` and only their own output estimate.
+- Audit observation, 2026-05-19, superseded by the 2026-05-21 replay pass: a missing-file read with Composer 2.5 emitted `tool-call-started` for Cursor `read`, then streamed final text `Error: File not found`, but did not emit `tool-call-completed` or an `onStep` `toolCall` error result. Leftover started calls are now discarded at run completion instead of becoming synthetic replay errors. Cursor-reported completed/step errors remain visible.
+- Maintainer visual verification for replay-card changes should follow [Cursor Native Tool Visual Audit Workflow](./cursor-native-tool-visual-audit.md): offscreen PTY-driven pi run, xterm.js/Playwright screenshot rendering, and JSONL inspection before accepting commits or PRs.
 - For models without a catalog `context` parameter, context windows are not hardcoded. The extension ships a bundled SDK-derived default/non-Max cache generated from `createAgentPlatform().checkpointStore.loadLatest(agentId).tokenDetails.maxTokens`. Successful runs can update a local override cache, but model discovery does not probe models at startup.
 - Max Mode context windows are distinct from default/non-Max context windows. `@cursor/sdk` 1.0.13 documentation says the SDK may enable Max Mode automatically when a selected model requires it, but the public local-agent `ModelSelection` path still does not expose a manual Max Mode selector. Do not advertise Max Mode context windows unless the SDK catalog exposes an exact parameter/variant or the SDK public API adds a Max Mode selector that the extension actually sends.
 - `@cursor/sdk` 1.0.13 adds latest-style `ModelListItem.aliases`. The extension registers only unambiguous aliases as pi model IDs (with the same context suffixes when applicable) and sends the alias back in `ModelSelection.id`, while sharing Cursor-only state such as fast defaults with the underlying catalog `id`. Aliases shared by multiple base models, such as generic family aliases, are skipped because the pi row metadata would otherwise imply one base model while Cursor may resolve the alias to another.
@@ -236,7 +242,7 @@ Important distinction:
 - **Cursor thinking support** applies to all Cursor SDK models. The extension should assume Cursor models can think and may emit thinking deltas.
 - **Pi-controllable thinking** means Cursor exposes a `reasoning`, `effort`, or `thinking` parameter that the extension can set from pi's native thinking level. These models register `reasoning: true` and show `thinking=yes` in `pi --list-models`.
-- **Cursor SDK thinking-control gap** means the model can still think, but the SDK does not expose a user-controllable thinking parameter for that model. These models register `reasoning: false` and show `thinking=no` in `pi --list-models` because pi cannot control a level for them. The extension still parses Cursor `thinking-delta` events if they are emitted.
+- **Cursor SDK thinking-control gap** means the model can still think, but the SDK does not expose a user-controllable thinking parameter for that model. These models register `reasoning: false` and show `thinking=no` in `pi --list-models` because pi cannot control a level for them. The extension still surfaces Cursor `thinking-delta` and summary events through pi's native thinking rendering when they are emitted.
 Do not mark a model `reasoning: true` only because it can think. That would make pi show controls such as `--thinking`, `:medium`, and shift+tab even though the extension cannot translate them into Cursor SDK params.
@@ -655,3 +661,11 @@ Before calling done:
    - `pi --model cursor/gpt-5.5@272k --thinking xhigh -p "Say ok only"`
    - `pi --model cursor/gpt-5.5@1m --cursor-fast -p "Say ok only"`
    - confirm requests use selected context, pi thinking, and fast flag state
+4. Tool bridge and replay:
+   - `npm test -- test/cursor-pi-tool-bridge.test.ts test/cursor-provider.test.ts test/cursor-mcp-timeout-override.test.ts`
+   - confirm `Agent.create()` gets `mcpServers.pi_tools` when active pi tools exist and omits it when `PI_CURSOR_PI_TOOL_BRIDGE=0` or the active snapshot is empty
+   - confirm bridged MCP requests emit real pi tool calls and resolve matching pi tool results back to the same Cursor SDK run
+   - confirm bridge MCP activity is suppressed from Cursor replay while non-bridge Cursor MCP activity remains visible
+   - confirm `PI_CURSOR_MCP_TOOL_TIMEOUT_MS` and `PI_CURSOR_MCP_TOOL_TIMEOUT_SECONDS` override the Cursor SDK MCP callTool timeout seam
+   - run the visual audit workflow when replay card visuals or bridge card visuals change; JSONL should show real pi tool names for bridged calls and no duplicate MCP replay for bridge calls
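
The checklist item about resolving matching pi tool results back to the same Cursor SDK run implies some bookkeeping keyed by `toolCallId`. The sketch below models only that bookkeeping. The class, its method names, and the `ToolResult` shape are hypothetical, and the real bridge also handles queuing, abort signals, and MCP transport, none of which appear here.

```typescript
// Hypothetical result shape; the real pi toolResult message may differ.
interface ToolResult {
  toolCallId: string;
  content: string;
  isError: boolean;
}

// Tracks bridged calls that are waiting for their matching pi tool result.
class PendingBridgeCalls {
  private readonly waiters = new Map<string, (result: ToolResult) => void>();

  // Register a wait for one tool call; the promise resolves when settle() sees its ID.
  wait(toolCallId: string): Promise<ToolResult> {
    if (this.waiters.has(toolCallId)) {
      return Promise.reject(new Error(`duplicate toolCallId: ${toolCallId}`));
    }
    return new Promise<ToolResult>((resolve) => {
      this.waiters.set(toolCallId, resolve);
    });
  }

  // Deliver a result. Returns false when no call is waiting on that ID,
  // so unrelated results are ignored instead of resolving the wrong call.
  settle(result: ToolResult): boolean {
    const resolve = this.waiters.get(result.toolCallId);
    if (resolve === undefined) return false;
    this.waiters.delete(result.toolCallId);
    resolve(result);
    return true;
  }

  get size(): number {
    return this.waiters.size;
  }
}
```

Matching strictly on `toolCallId` is what lets several bridged calls be in flight at once without one result being handed to another call.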

package/docs/cursor-native-tool-replay.md ADDED

@@ -0,0 +1,88 @@
+# Cursor native tool replay
+pi-cursor-sdk has two separate tool paths:
+1. **Local pi MCP bridge:** default-on for local Cursor agents. It exposes the current pi session's bridgeable active tools to Cursor through a tokenized `127.0.0.1` MCP endpoint, excluding internal Cursor replay activity names and, by default, overlapping built-in pi tools (`read`, `bash`, `write`, `edit`, `grep`, `find`, `ls`). When Cursor calls one of those MCP tools, pi executes the real pi tool through the normal pi tool path.
+2. **Cursor native tool replay:** display-only. It renders completed Cursor SDK tool activity as pi-native-looking cards using recorded Cursor results.
+This document is about replay. Replay is not execution and is not the local pi bridge.
+## Local pi bridge summary
+The bridge is enabled by default when bridgeable active pi tools exist. Cursor sees bridge-owned MCP names such as `pi__sem_reindex`, while pi history and tool cards use the real pi tool name, such as `sem_reindex`. The bridge hides overlapping built-in pi tools by default because Cursor already has native equivalents; extension/custom tools and non-overlapping active tools present in pi's active tool registry normally remain exposed. pi-cursor-sdk also registers `cursor_ask_question` for Cursor models when the bridge is enabled, exposed to Cursor as `pi__cursor_ask_question`, so that, when the pi UI is available, Cursor can ask the user to choose instead of silently defaulting. The bridge does not call pi tool `execute()` handlers directly; it queues the request, emits a real pi `toolCall`, waits for the matching pi `toolResult`, and resolves the Cursor MCP call back into the same Cursor SDK run.
+Rollback and timeout controls:
+```bash
+PI_CURSOR_PI_TOOL_BRIDGE=0 pi --model cursor/composer-2.5
+PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1 pi --model cursor/composer-2.5
+PI_CURSOR_MCP_TOOL_TIMEOUT_SECONDS=7200 pi --model cursor/composer-2.5
+PI_CURSOR_MCP_TOOL_TIMEOUT_MS=7200000 pi --model cursor/composer-2.5
+```
+`PI_CURSOR_PI_TOOL_BRIDGE=0` disables the bridge, including `pi__cursor_ask_question`. `PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1` opts in to exposing overlapping pi tool names that Cursor already has native equivalents for (`read`, `bash`, `write`, `edit`, `grep`, `find`, and `ls`). By default those names are hidden even when pi's Cursor replay wrapper has registered them as extension tools; non-overlapping active built-ins remain bridgeable by default. Cursor-native tools, Cursor settings, plugins, and configured Cursor MCP servers still come from the Cursor SDK local agent path. Cloud Cursor agents are out of scope for this bridge.
+## What gets replayed
+When Cursor reports completed tool activity, the extension can display recorded results for:
+- `read`
+- `bash`
+- `grep`
+- `find`
+- `ls`
+- `edit`
+- `write`
+- diagnostics
+- delete
+- todos and plans
+- tasks
+- image generation
+- MCP activity
+Cursor `glob` activity is displayed through native `find` cards.
+Edit and write activity replays through pi-facing `edit` and `write` cards only when replay arguments truthfully satisfy the matching pi schema, but still uses recorded Cursor results only. The adapter passes through truthful Cursor paths, content when Cursor reported it, and recorded diff/details; it does not pretend Cursor's editing schema is pi's schema and it fails closed if a recorded replay result is missing. Cursor `StrReplace` with recorded replacement text displays as native-looking `edit`; path-only Cursor `edit` and notebook edit activity fall back to neutral Cursor activity so pi does not reject the replay before recorded-result handling. Cursor `write` displays as native-looking `write`.
+Diagnostics, delete, todos/plans, task, image, and MCP activity use neutral Cursor activity cards with pi's default success/error tool shell. Neutral Cursor activity cards carry display metadata such as `activityTitle` and `activitySummary`, so partial/collapsed cards can say `Cursor plan`, `Cursor todos`, `Cursor MCP`, or `Cursor edit` instead of only `Cursor activity`. These replay tools only display recorded Cursor results; they never mutate files or execute tool work directly.
+Replay paths are normalized to workspace-relative paths when possible. Collapsed replay cards include bounded previews for diffs and text details so small edits, todos, task output, and MCP results are visible without expanding; edit previews omit raw unified diff headers and show compact numbered changed/context lines using pi's native diff added/removed/context colors, and write previews use syntax highlighting when pi can infer a language from the path. Image generation replay cards show the saved image path in the collapsed summary and render the image inline when pi terminal image display is enabled and the generated file is still readable.
+## What replay does not do
+Native replay is display-only:
+- pi does not re-run Cursor-side commands.
+- pi does not apply Cursor-side edits or deletes.
+- pi does not call Cursor-side MCP servers.
+- replay-only cards do not update pi state or generate images.
+- replay does not expose pi tool schemas to Cursor; the local pi MCP bridge is the separate path that exposes active pi tools.
+- Cursor workflow tools such as `SwitchMode` and Cursor todo state are not pi workflow controls; reported todo/plan events are displayed as Cursor activity only. Plan/todo replay cards do not drive pi plan-mode state.
+If a Cursor read completion reports no content, the extension may include a bounded local file preview for safe in-workspace paths. That preview is labeled as a local preview captured at transcript time, not guaranteed Cursor-observed content.
+Other unsupported Cursor SDK tools may still be described through a bounded scrubbed activity transcript when the SDK reports completed tool-call data. Started Cursor SDK tool calls that never receive a completion event are discarded without a synthetic replay error; missing completion is not itself treated as a Cursor tool failure. Explicit failures remain visible when Cursor reports an error through a completed tool call or step result. Some Cursor-internal workflow actions may only appear in Cursor's own thinking stream or not be reported as replayable SDK tool completions.
+## Ordering and non-interactive output
+As Cursor SDK tool completions arrive, the extension mirrors native Codex ordering by ending a tool-use turn, letting pi render the recorded tool results, then continuing with live post-tool Cursor thinking/text, later Cursor tool batches, or Cursor's final answer as the next assistant turn. For plan-mode runs, neutral Cursor plan/todo cards can therefore appear before the final Cursor plan text.
+Bridged pi tool calls follow the same visible pi `toolUse` turn shape, but they are real pi tool executions rather than replayed Cursor results.
+For shell replay, completed `stdout` / `stderr` remain the primary source. If a successful completed shell result is empty and Cursor emitted unambiguous `shell-output-delta` data while exactly one shell call was active, the replay card uses that delta as display-only fallback data. Overlapping shell calls make delta attribution ambiguous, so those fallback deltas are dropped rather than guessed. `(no output)` is kept only when no completed output or safe delta fallback is available.
+Non-interactive and session consumers still receive bounded scrubbed transcript data so `pi -p` keeps printing normal assistant text.
+## Synthetic-name policy
+Synthetic replay names are internal compatibility details. New model-facing prompt text and user-visible cards use native tool names when renderer-compatible, or neutral Cursor activity labels when not. Legacy sessions that already contain old internal replay names are rewritten to safe labels in prompt text and display surfaces.
+Bridge MCP names are also not pi tool names. Cursor may see names such as `pi__sem_reindex` inside the local MCP bridge, but pi session output uses the real pi tool name.
+## Conflicts and opt out
+Native replay wrappers are registered only for tool names not already owned by another extension. If another extension already owns a wrapper name needed for replay, pi-cursor-sdk skips only the conflicting wrapper and uses the scrubbed Cursor activity transcript for that tool instead. Legacy replay wrappers remain registered for old sessions, but their model-facing and user-visible labels are sanitized.
+Disable native replay registration entirely:
+```bash
+PI_CURSOR_NATIVE_TOOL_DISPLAY=0 pi --model cursor/composer-2.5
+```
+`PI_CURSOR_REGISTER_NATIVE_TOOLS=0` is also accepted as a registration-only opt-out.
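
Reviewer sketch (not part of the package diff): the shell replay fallback described under "Ordering and non-interactive output" can be read as the decision below. `CompletedShell`, `ShellDelta`, `activeShellCalls`, and `resolveShellDisplay` are hypothetical names, not Cursor SDK types. Joining `stdout` and `stderr` with a newline, and keeping deltas from non-overlapping periods while dropping those seen during overlap, are both assumptions.

```typescript
// Hypothetical shapes; the Cursor SDK's real event types are not reproduced here.
interface CompletedShell {
	stdout: string;
	stderr: string;
	success: boolean;
}

interface ShellDelta {
	text: string;
	// How many shell calls were running when this delta arrived (assumed bookkeeping).
	activeShellCalls: number;
}

function resolveShellDisplay(completed: CompletedShell, deltas: ShellDelta[]): string {
	// Completed stdout/stderr is the primary source and always wins when non-empty.
	const recorded = [completed.stdout, completed.stderr].filter((s) => s.length > 0).join("\n");
	if (recorded.length > 0) return recorded;

	if (completed.success) {
		// Only deltas seen while exactly one shell call was active can be attributed safely.
		const safe = deltas
			.filter((d) => d.activeShellCalls === 1)
			.map((d) => d.text)
			.join("");
		if (safe.length > 0) return safe;
	}

	// Neither completed output nor a safe delta fallback is available.
	return "(no output)";
}

console.log(resolveShellDisplay({ stdout: "", stderr: "", success: true }, [{ text: "hello\n", activeShellCalls: 1 }]));
```

The fallback is display-only in this reading: it chooses what a replay card shows and never re-runs the command.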

package/docs/cursor-native-tool-visual-audit.md ADDED

@@ -0,0 +1,183 @@
+# Cursor Native Tool Visual Audit Workflow
+This workflow verifies Cursor SDK tool replay the way a human sees it in pi's interactive TUI, without stealing macOS focus.
+Use it before accepting replay-card commits or PRs. Text logs and JSONL are necessary, but they are not enough when the claim is visual parity: always keep before/after PNGs for the exact prompt.
+## When to use this
+Use this workflow when changing or reviewing:
+- Cursor native tool replay cards.
+- Tool-call turn ordering.
+- Tool-result error styling.
+- Truncation, continuation hints, timeout labels, or path display.
+- Any PR claiming native TUI parity.
+Do not use this for ordinary unit-only logic changes.
+## Why this workflow exists
+Earlier manual verification used a visible Terminal window plus `screencapture`. That worked, but it stole system focus and made it easy for the user to type into the audit window by accident.
+The preferred workflow is now offscreen:
+1. Spawn `pi` in a pseudo-terminal at a fixed size.
+2. Feed the prompt programmatically.
+3. Save raw ANSI output and plain text output.
+4. Render the terminal buffer through xterm.js in headless Playwright.
+5. Save a PNG screenshot.
+6. Inspect the session JSONL for exact persisted `toolCall` / `toolResult` data.
+This gives human-like visual evidence without activating Terminal, iTerm, or a browser window.
+## Tool stack
+Install the harness outside this repo so generated assets and temporary dependencies do not pollute commits:
+```bash
+HARNESS=/tmp/pi-visual-harness
+rm -rf "$HARNESS"
+mkdir -p "$HARNESS"
+cd "$HARNESS"
+npm init -y
+npm install node-pty @xterm/xterm playwright
+npm rebuild node-pty
+```
+`npm rebuild node-pty` is useful after Node upgrades; without it, `node-pty` may fail with `posix_spawnp failed`.
+## Runner contract
+A runner script should:
+- Spawn `pi -e <extension-dir> --model cursor/composer-2.5` with:
+  - `PI_CURSOR_NATIVE_TOOL_DISPLAY=1`
+  - `TERM=xterm-256color`
+  - fixed PTY size, for example `150x45`
+  - cwd set to the target audit repo.
+- Wait for startup.
+- Write the exact prompt and carriage return to the PTY.
+- Wait a bounded amount of time.
+- Save:
+  - `<label>.ansi` raw terminal bytes.
+  - `<label>.txt` stripped text for quick search.
+  - `<label>.png` rendered xterm screenshot.
+  - `<label>.jsonl.path` pointing to the latest pi session JSONL.
+- Kill the PTY child after capture.
+- Check for leftover commands when prompts can background work, especially shell timeout tests.
+Example invocation shape:
+```bash
+node /tmp/pi-visual-harness/run-pi-visual.mjs \
+  --label after-shell-nonzero \
+  --ext /path/to/pi-cursor-sdk \
+  --cwd /path/to/test-workspace \
+  --prompt "Run \`printf 'cursor-shell-stderr\\n' >&2; exit 7\` using only the shell/terminal tool. Do not use read, grep, glob, find, ls, edit, or write. Print the command result exactly, then stop." \
+  --wait-ms 30000 \
+  --out-dir /tmp/pi-visual-harness/review-current
+```
+Keep the runner in `/tmp` unless the project explicitly decides to check in a maintained audit harness.
+## Before/after comparison
+Use a clean worktree for the baseline and the active worktree for the candidate change:
+```bash
+BASE=/tmp/pi-cursor-visual-review
+BEFORE_WT=$BASE/before-main
+AFTER_WT=/path/to/pi-cursor-sdk
+TARGET=/path/to/test-workspace
+rm -rf "$BASE"
+git fetch origin main
+BASE_COMMIT=$(git merge-base origin/main HEAD)
+git worktree add --detach "$BEFORE_WT" "$BASE_COMMIT"
+# Optional speedup when the before worktree has no install of its own.
+ln -s "$AFTER_WT/node_modules" "$BEFORE_WT/node_modules"
+```
+Then run the same prompt against both extension dirs:
+```bash
+node /tmp/pi-visual-harness/run-pi-visual.mjs \
+  --label before-glob-single \
+  --ext "$BEFORE_WT" \
+  --cwd "$TARGET" \
+  --prompt "Find files matching \`src/tools/reindex.ts\` using only the glob/file-search tool. Do not use shell, bash, grep, read, or ls. Print the matched files exactly as found, then stop." \
+  --wait-ms 16000 \
+  --out-dir /tmp/pi-visual-harness/review-current
+node /tmp/pi-visual-harness/run-pi-visual.mjs \
+  --label after-glob-single \
+  --ext "$AFTER_WT" \
+  --cwd "$TARGET" \
+  --prompt "Find files matching \`src/tools/reindex.ts\` using only the glob/file-search tool. Do not use shell, bash, grep, read, or ls. Print the matched files exactly as found, then stop." \
+  --wait-ms 16000 \
+  --out-dir /tmp/pi-visual-harness/review-current
+```
+For review, create a simple HTML/PNG gallery that places `before-*.png` and `after-*.png` side by side. Keep the generated gallery in `/tmp` unless explicitly asked to commit visual artifacts.
+## JSONL inspection
+For each visual claim, inspect the JSONL path written by the runner. Confirm at least:
+- `toolCall.name` is the expected pi-facing replay tool name.
+- `toolCall.arguments` show the expected user-facing args.
+- `toolResult.toolName` matches the call.
+- `toolResult.content[0].text` contains the recorded body expected in the card.
+- `toolResult.isError` matches the visual card state.
+For local pi MCP bridge claims, also confirm:
+- Bridged calls appear as the real pi tool name (for example `sem_reindex`, or `read` when overlapping built-ins are explicitly exposed), not the MCP bridge name (for example `pi__sem_reindex` or `pi__read`).
+- The JSONL has no second Cursor MCP replay card for the same bridged call.
+- Non-bridge Cursor MCP activity, if present, still renders as neutral Cursor activity instead of being suppressed.
+Small helper pattern:
+```bash
+python3 - <<'PY'
+import json, pathlib
+path = pathlib.Path('/tmp/pi-visual-harness/review-current/after-shell-nonzero.jsonl.path').read_text().strip()
+for line in pathlib.Path(path).read_text().splitlines():
+    obj = json.loads(line)
+    msg = obj.get('message', {})
+    if msg.get('role') == 'assistant':
+        for part in msg.get('content', []):
+            if part.get('type') == 'toolCall':
+                print('CALL', part.get('name'), part.get('arguments'))
+    if msg.get('role') == 'toolResult':
+        text = msg.get('content', [{}])[0].get('text', '')
+        print('RESULT', msg.get('toolName'), 'isError=', msg.get('isError'), repr(text[:160]))
+PY
+```
+## Safety rules
+- Prefer the offscreen PTY renderer. Do not use `osascript`, visible Terminal windows, or `screencapture` unless a user explicitly asks for a real desktop screenshot.
+- Keep generated screenshots, HTML galleries, ANSI logs, and temporary harness dependencies out of the repo by default.
+- Use short, deterministic prompts with bounded wait times.
+- For timeout/background prompts, always check for leftovers:
+```bash
+ps -axo pid,etime,command | rg "sleep 2|should-not-print|<audit-session-label>" || true
+```
+- If the model uses a different tool than requested, record it as model/provider behavior unless JSONL shows replay lost or misrendered a completed Cursor tool event.
+- Visual output can differ slightly from macOS Terminal fonts because xterm.js renders offscreen. Treat this workflow as evidence for card class, color state, labels, ordering, truncation, and content. Use a real terminal screenshot only for pixel-level terminal-specific bugs.
+## Required evidence before commit or merge
+Before accepting a replay-card change, provide:
+- Before and after PNG paths.
+- The prompt used for each pair.
+- JSONL paths for each run.
+- A short statement of what changed visually.
+- The relevant JSONL `toolCall` / `toolResult` facts.
+- `npm test` and `npm run typecheck` results, unless the change is documentation-only.
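
Reviewer sketch (not part of the package diff): a TypeScript counterpart to the Python helper in the JSONL inspection section. It also flags any `toolCall` recorded under a `pi__` bridge name. The entry shape is inferred from that Python helper and is an assumption, not a published pi session schema. `SessionMessage`, `ToolFacts`, and `collectToolFacts` are hypothetical names.

```typescript
// Assumed entry shape, inferred from the Python helper in the audit doc.
interface SessionMessage {
	role?: string;
	toolName?: string;
	isError?: boolean;
	content?: Array<{ type?: string; name?: string; arguments?: unknown; text?: string }>;
}

interface SessionEntry {
	message?: SessionMessage;
}

interface ToolFacts {
	calls: string[];
	results: Array<{ toolName: string; isError: boolean; preview: string }>;
	leakedBridgeNames: string[];
}

function collectToolFacts(jsonl: string): ToolFacts {
	const facts: ToolFacts = { calls: [], results: [], leakedBridgeNames: [] };
	for (const line of jsonl.split("\n")) {
		if (line.trim() === "") continue;
		const entry = JSON.parse(line) as SessionEntry;
		const msg: SessionMessage = entry.message ?? {};
		const content = Array.isArray(msg.content) ? msg.content : [];
		if (msg.role === "assistant") {
			for (const part of content) {
				if (part.type === "toolCall" && typeof part.name === "string") facts.calls.push(part.name);
			}
		}
		if (msg.role === "toolResult") {
			facts.results.push({
				toolName: msg.toolName ?? "",
				isError: msg.isError === true,
				preview: (content[0]?.text ?? "").slice(0, 160),
			});
		}
	}
	// Bridged calls should be recorded under the real pi name, never the pi__ MCP name.
	facts.leakedBridgeNames = facts.calls.filter((name) => name.startsWith("pi__"));
	return facts;
}

// Made-up two-line session used only to exercise the function.
const demoJsonl = [
	JSON.stringify({ message: { role: "assistant", content: [{ type: "toolCall", name: "sem_reindex", arguments: {} }] } }),
	JSON.stringify({ message: { role: "toolResult", toolName: "sem_reindex", isError: false, content: [{ type: "text", text: "done" }] } }),
].join("\n");
console.log(collectToolFacts(demoJsonl));
```

A non-empty `leakedBridgeNames` would suggest that a bridged call was persisted under its MCP name, which the checklist above treats as a failure.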

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
 	"name": "pi-cursor-sdk",
-	"version": "0.1.14",
+	"version": "0.1.15",
 	"description": "pi provider extension backed by @cursor/sdk local agents",
 	"author": "Mitch Fultz (https://github.com/fitchmultz)",
 	"license": "MIT",
@@ -26,6 +26,8 @@
 		"scripts/refresh-cursor-model-snapshots.mjs",
 		"README.md",
 		"docs/cursor-model-ux-spec.md",
+		"docs/cursor-native-tool-replay.md",
+		"docs/cursor-native-tool-visual-audit.md",
 		"LICENSE",
 		"CHANGELOG.md"
 	],
@@ -40,7 +42,8 @@
 		"refresh:cursor-snapshots": "node scripts/refresh-cursor-model-snapshots.mjs"
 	},
 	"dependencies": {
-		"@cursor/sdk": "^1.0.13"
+		"@cursor/sdk": "^1.0.13",
+		"@modelcontextprotocol/sdk": "^1.29.0"
 	},
 	"peerDependencies": {
 		"@earendil-works/pi-ai": "*",

package/src/context.ts CHANGED

@@ -1,5 +1,6 @@
 import type { Context, Message, ToolCall } from "@earendil-works/pi-ai";
 import type { SDKImage } from "@cursor/sdk";
+import { getCursorReplayPromptLabel } from "./cursor-tool-names.js";
 export interface CursorPrompt {
 	text: string;
@@ -58,8 +59,26 @@ function formatContentBlocks(content: string | { type: string; text?: string; da
 }
 function formatToolCall(toolCall: ToolCall): string {
-	const args = JSON.stringify(toolCall.arguments);
-	return `Tool call (${toolCall.name}, call ${toolCall.id}): ${args}`;
+	const args = JSON.stringify(toolCall.arguments) ?? "";
+	return `Tool call (${getCursorReplayPromptLabel(toolCall.name)}, call ${toolCall.id}): ${args}`;
+}
+function sanitizeSystemPromptForCursor(systemPrompt: string): string {
+	let sanitized = systemPrompt;
+	sanitized = sanitized.replace(
+		/Available tools:\n[\s\S]*?\n\nIn addition to the tools above, you may have access to other custom tools depending on the project\.\n\n/g,
+		"Pi tool catalog omitted: Cursor can call only Cursor SDK tools exposed in this run.\n\n",
+	);
+	sanitized = sanitized.replace(
+		/Guidelines:\n[\s\S]*?\n\nPi documentation /g,
+		"Guidelines:\n- Be concise in your responses.\n- Show file paths clearly when working with files.\n\nPi documentation ",
+	);
+	sanitized = sanitized.replace(
+		/\n\nThe following skills provide specialized instructions for specific tasks\.[\s\S]*?<\/available_skills>/g,
+		"",
+	);
+	sanitized = sanitized.replace(/\n+Semantic code intelligence priority:[\s\S]*$/g, "");
+	return sanitized.trim();
 }
 function formatMessage(msg: Message): string | undefined {
@@ -84,7 +103,7 @@ function formatMessage(msg: Message): string | undefined {
 		case "toolResult": {
 			const text = formatContentBlocks(msg.content);
 			const label = msg.isError ? "Tool error" : "Tool result";
-			return `${label} (${msg.toolName}, call ${msg.toolCallId}): ${text}`;
+			return `${label} (${getCursorReplayPromptLabel(msg.toolName)}, call ${msg.toolCallId}): ${text}`;
 		}
 	}
 }
@@ -152,15 +171,17 @@ export function buildCursorPrompt(context: Context, options: CursorPromptOptions
 	const sectionsBeforeMessages: string[] = [
 		[
 			"Cursor SDK tool boundary:",
-			"Only tools exposed by the Cursor SDK in this run are callable. The pi system prompt and transcript are context only; they do not grant access to pi tools or tool names mentioned there.",
-			"If the user asks you to search, fetch, browse, or research the web, use an actual Cursor SDK web/search/browser/MCP tool call. If no such Cursor SDK tool is available, say that web search is not configured for this Cursor SDK run.",
-			"Do not plan to use or claim to have used pi-only tools such as WebSearch or WebFetch unless the Cursor SDK actually exposes and executes that tool in this run.",
-			"Image payload boundary: only images attached to the latest user message are available as image bytes. Earlier images appear only as [image omitted from transcript] placeholders; ask the user to reattach or describe a prior image if the latest request depends on it.",
+			"You can call only tools actually exposed by Cursor SDK in this run. Pi tool names, replay tool names, and transcript tool names are context only, not callable capabilities.",
+			"If asked to list or exercise available tools, list and exercise Cursor SDK tools only; do not claim access to pi-side tools from the system prompt unless Cursor exposes an equivalent tool that runs.",
+			"Use pi__cursor_ask_question for material choices if exposed.",
+			"Web: use Cursor web/search/browser/MCP or say web search is not configured; do not claim WebSearch/WebFetch unless Cursor executes them.",
+			"Replay: pi may display recorded Cursor tool activity as pi-style cards, but replay is display-only and not a capability to invoke.",
+			"Images: only latest user images are sent; ask to reattach or describe prior images.",
 		].join("\n"),
 	];
 	if (context.systemPrompt) {
-		sectionsBeforeMessages.push(`System instructions from pi:\n${context.systemPrompt}`);
+		sectionsBeforeMessages.push(`System instructions from pi:\n${sanitizeSystemPromptForCursor(context.systemPrompt)}`);
 	}
 	const messageSections = context.messages
@@ -171,8 +192,8 @@ export function buildCursorPrompt(context: Context, options: CursorPromptOptions
 		.filter((section): section is { index: number; text: string } => section !== undefined);
 	const sectionsAfterMessages = [
 		[
-			"Answer the latest user request above using your capabilities. Do not assume access to pi tools.",
-			"If the user asks for web research, do not claim to have searched the web unless a Cursor SDK web/search/browser/MCP tool was actually used.",
+			"Answer the latest user request above using Cursor SDK capabilities only. Do not list, promise, or call pi-only tools from the system prompt as if they were available.",
+			"If web research is requested, do not claim it unless a Cursor web/search/browser/MCP tool ran.",
 		].join("\n"),
 	];
 	const images = extractLatestImages(context.messages);
@@ -188,6 +209,8 @@ export function buildCursorPrompt(context: Context, options: CursorPromptOptions
 		getLatestUserMessageIndex(context.messages),
 		budgetOptions,
 	);
+	const text = parts.join(SECTION_SEPARATOR);
-	return { text: parts.join(SECTION_SEPARATOR), images };
+	return { text, images };
 }
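
Reviewer sketch (not part of the package diff): to see what the first replacement in `sanitizeSystemPromptForCursor` does, the snippet below copies that one regular expression into a standalone function and runs it on a made-up prompt. `stripToolCatalog` and the sample prompt text are hypothetical; only the pattern and the replacement string come from the diff.

```typescript
// The pattern and replacement are copied from the first replace() call in the diff above.
function stripToolCatalog(systemPrompt: string): string {
	return systemPrompt.replace(
		/Available tools:\n[\s\S]*?\n\nIn addition to the tools above, you may have access to other custom tools depending on the project\.\n\n/g,
		"Pi tool catalog omitted: Cursor can call only Cursor SDK tools exposed in this run.\n\n",
	);
}

// Made-up system prompt; only the section markers matter for the match.
const samplePrompt = [
	"You are a coding assistant.",
	"",
	"Available tools:",
	"- read: Read a file",
	"- bash: Run a command",
	"",
	"In addition to the tools above, you may have access to other custom tools depending on the project.",
	"",
	"Guidelines:",
].join("\n");

console.log(stripToolCatalog(samplePrompt));
```

The lazy `[\s\S]*?` stops at the first "In addition to the tools above" sentence, so the whole pi tool list collapses into the single "Pi tool catalog omitted" line while the text before and after it is preserved. A prompt that lacks either marker passes through unchanged.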