pi-cursor-sdk 0.1.19 → 0.1.21
- package/CHANGELOG.md +52 -0
- package/README.md +72 -11
- package/docs/cursor-dogfood-checklist.md +57 -0
- package/docs/cursor-live-smoke-checklist.md +116 -10
- package/docs/cursor-model-ux-spec.md +60 -19
- package/docs/cursor-native-tool-replay.md +21 -11
- package/docs/cursor-native-tool-visual-audit.md +104 -59
- package/docs/cursor-testing-lessons.md +10 -5
- package/docs/cursor-tool-surfaces.md +69 -0
- package/package.json +37 -11
- package/scripts/debug-provider-events.d.mts +59 -0
- package/scripts/debug-provider-events.mjs +70 -175
- package/scripts/debug-sdk-events.d.mts +90 -0
- package/scripts/debug-sdk-events.mjs +36 -98
- package/scripts/fixtures/plan-strip-shim/index.ts +12 -0
- package/scripts/isolated-cursor-smoke.sh +264 -102
- package/scripts/lib/cursor-child-process.d.mts +10 -0
- package/scripts/lib/cursor-child-process.mjs +50 -0
- package/scripts/lib/cursor-cli-args.d.mts +63 -0
- package/scripts/lib/cursor-cli-args.mjs +129 -0
- package/scripts/lib/cursor-script-fail.d.mts +1 -0
- package/scripts/lib/cursor-script-fail.mjs +13 -0
- package/scripts/lib/cursor-sdk-output-filter.d.mts +5 -0
- package/scripts/lib/cursor-smoke-env.d.mts +38 -0
- package/scripts/lib/cursor-smoke-env.mjs +81 -0
- package/scripts/lib/cursor-smoke-shell.sh +174 -0
- package/scripts/lib/cursor-visual-render.d.mts +15 -0
- package/scripts/lib/cursor-visual-render.mjs +131 -0
- package/scripts/probe-mcp-coldstart.mjs +226 -0
- package/scripts/refresh-cursor-model-snapshots.mjs +29 -65
- package/scripts/steering-rpc-smoke.mjs +170 -65
- package/scripts/tmux-live-smoke.sh +152 -98
- package/scripts/visual-tui-smoke.mjs +659 -0
- package/shared/cursor-sdk-event-debug-env.d.mts +12 -0
- package/shared/cursor-sdk-event-debug-env.mjs +13 -0
- package/shared/cursor-sensitive-text.d.mts +1 -0
- package/{scripts/lib/cursor-probe-utils.mjs → shared/cursor-sensitive-text.mjs} +1 -13
- package/shared/cursor-setting-sources.d.mts +5 -0
- package/shared/cursor-setting-sources.mjs +22 -0
- package/src/context.ts +21 -12
- package/src/cursor-bridge-contract.ts +1 -3
- package/src/cursor-incomplete-tool-visibility.ts +72 -49
- package/src/cursor-mcp-timeout-override.ts +66 -11
- package/src/cursor-native-tool-display-registration.ts +63 -27
- package/src/cursor-native-tool-display-replay.ts +246 -143
- package/src/cursor-native-tool-display-state.ts +2 -0
- package/src/cursor-native-tool-display-tools.ts +149 -41
- package/src/cursor-provider-live-run-drain.ts +1 -52
- package/src/cursor-provider-run-finalizer.ts +235 -0
- package/src/cursor-provider-run-outcome.ts +149 -0
- package/src/cursor-provider-turn-api-key.ts +8 -0
- package/src/cursor-provider-turn-coordinator.ts +113 -440
- package/src/cursor-provider-turn-display-router.ts +216 -0
- package/src/cursor-provider-turn-emit.ts +59 -0
- package/src/cursor-provider-turn-finalize.ts +119 -0
- package/src/cursor-provider-turn-lifecycle-emitter.ts +97 -0
- package/src/cursor-provider-turn-message-offset.ts +15 -0
- package/src/cursor-provider-turn-prepare.ts +216 -0
- package/src/cursor-provider-turn-runner.ts +138 -0
- package/src/cursor-provider-turn-sdk-normalizer.ts +88 -0
- package/src/cursor-provider-turn-send.ts +103 -0
- package/src/cursor-provider-turn-shell-output.ts +107 -0
- package/src/cursor-provider-turn-tool-ledger.ts +126 -0
- package/src/cursor-provider-turn-types.ts +87 -0
- package/src/cursor-provider.ts +16 -482
- package/src/cursor-replay-activity-builders.ts +276 -0
- package/src/cursor-replay-source-names.ts +33 -0
- package/src/cursor-replay-summary-args.ts +191 -0
- package/src/cursor-replay-tool-details.ts +464 -0
- package/src/cursor-run-final-text.ts +56 -0
- package/src/cursor-sdk-abort-error-guard.ts +4 -0
- package/src/cursor-sdk-event-debug-constants.ts +14 -5
- package/src/cursor-sdk-event-debug.ts +8 -2
- package/src/cursor-sensitive-text.ts +3 -36
- package/src/cursor-session-agent.ts +265 -88
- package/src/cursor-setting-sources.ts +7 -10
- package/src/cursor-state.ts +232 -28
- package/src/cursor-tool-lifecycle.ts +17 -42
- package/src/cursor-tool-manifest.ts +41 -0
- package/src/cursor-tool-names.ts +18 -79
- package/src/cursor-tool-presentation-registry.ts +556 -0
- package/src/cursor-tool-transcript.ts +1 -1
- package/src/cursor-tool-visibility.ts +39 -0
- package/src/cursor-transcript-tool-formatters.ts +0 -59
- package/src/cursor-transcript-tool-specs.ts +169 -232
- package/src/cursor-transcript-utils.ts +0 -44
- package/src/cursor-web-tool-activity.ts +10 -60
- package/src/cursor-web-tool-args.ts +39 -0
- package/src/index.ts +4 -10
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,58 @@
 
 ## Unreleased
 
+## 0.1.21 - 2026-05-28
+
+**Upgrade:** Requires **pi 0.76.0+** and installs exact **`@cursor/sdk@1.0.14`**. Older pi or Cursor SDK combinations are not supported on this release line.
+
+### Added
+
+- Add Cursor SDK **`agent` / `plan` mode** controls: `--cursor-mode agent|plan` for one run, `/cursor-mode agent|plan` (persisted in the session), and `/cursor-mode` to show current mode. Default is `agent`. Plan-mode `createPlan` / `updateTodos` activity stays display-only in pi replay.
+- Add a bootstrap **callable tool surfaces** block on the first Cursor send (default on). It summarizes Cursor host tools, exposed `pi__*` bridge tools for the run, and that configured Cursor MCP servers are discovered at runtime. Disable with `PI_CURSOR_TOOL_MANIFEST=0`. See [cursor-tool-surfaces.md](docs/cursor-tool-surfaces.md).
+- Add maintainer `/cursor-tools` to print the effective callable-surface manifest for the current session.
+
+### Changed
+
+- Cut over to exact `@cursor/sdk@1.0.14` and drop compatibility paths for older Cursor SDK releases.
+- Raise the documented pi floor to **0.76.0+** (`peerDependencies` are minimum-only `>=0.76.0` with no upper bound).
+- Seed Cursor SDK mode through `Agent.create({ mode })` and pass the effective mode on every `agent.send(..., { mode })` so CLI and slash-command mode stay authoritative for pooled session agents.
+- Shorten bootstrap tool-boundary prompt text; move the full pi bridge contract out of per-tool MCP descriptions (one-line pointer to the bootstrap block); add a shell `cd` hint to the tool tail guard.
+- Refactor the Cursor provider turn pipeline (prepare / send / finalize / emit / coordinator) and centralize tool presentation, replay details, and run outcomes. Behavior is intended to be the same or stricter; see **Fixed** for user-visible corrections.
+
+### Fixed
+
+- Fix edit/write **activity replay diff previews** so path-only fallbacks still show diff content instead of title-only cards.
+- Fix **replay diff card colors** in the TUI.
+- Fix **`generateImage` error replay titles** when the SDK reports a failed image call.
+- Fix **abort races** during turn finalize and send cleanup so user aborts and overlapping runs tear down more reliably.
+- Fix maintainer **`smoke:isolated` / `smoke:live` print-mode** captures hanging on pi 0.76 when stdout is redirected but stdin stays open (close stdin for `-p` / `--print` runs).
+
+### Maintainer
+
+- Add `npm run smoke:visual` for offscreen TUI visual smoke (ANSI/text/HTML/PNG/JSONL).
+- Add [Cursor dogfood checklist](docs/cursor-dogfood-checklist.md) and tighten live/visual smoke env isolation for pi 0.76 `--session-id`, plan mode, and native-replay card proof.
+- Add package metadata regression tests for the SDK/pi cutover baselines.
+
+## 0.1.20 - 2026-05-26
+
+### Added
+
+- Shorten known Cursor SDK MCP initialize/listTools timeouts to 10 seconds by default so unavailable configured MCP servers fail fast on first send instead of blocking for the SDK's 60-second protocol default; unknown MCP protocol timeout stacks keep the SDK default. Override with `PI_CURSOR_MCP_CONNECT_TIMEOUT_MS` or `PI_CURSOR_MCP_CONNECT_TIMEOUT_SECONDS`.
+- Add maintainer cold-start timing probe `scripts/probe-mcp-coldstart.mjs` and `npm run debug:mcp-coldstart`.
+
+### Changed
+
+- Document first-send MCP cold-start behavior and initialize/listTools timeout defaults in README troubleshooting.
+- Centralize Cursor started-tool visibility classification across incomplete-tool cards, lifecycle progress, fast local discovery suppression, and completed replay titles.
+- Rework the cold-start probe to run each scenario in a fresh child process before the first Cursor SDK import.
+
+### Fixed
+
+- Make pooled Cursor session agents idle before send planning/reuse by awaiting fire-and-forget live-run `run.wait()` cleanup in `acquireSessionCursorAgent()`, scoped to the pooled agent instance id, so pi auto-compaction summarization does not hit Cursor SDK `AgentBusyError` (`already has active run`) or plan against stale send state while manual `/compact` after idle still works.
+- Fix stale busy pooled-agent waits so reset, terminal disposal, and pool-key replacement wake blocked acquires even when an old SDK `run.wait()` never settles.
+- Remove test-only live-run coordinator detachment hooks and keep race invariants inside the session-agent lease/pool contract.
+- Keep non-60-second timer scheduling on the cheap path by only capturing timeout stack traces for Cursor SDK's 60-second MCP protocol default.
+
 ## 0.1.19 - 2026-05-25
 
 ### Added
package/README.md
CHANGED
@@ -31,10 +31,10 @@ If pi started without a key, run `/cursor-refresh-models` after `/login` to refr
 ## Requirements
 
 - Node.js 22.19+
-- pi
+- pi 0.76.0 or newer
 - a Cursor API key saved through `/login`, available as `CURSOR_API_KEY`, or passed with pi's `--api-key`
 
-No global `@cursor/sdk` install is required. This package depends on `@cursor/sdk`, so normal package installation brings in the SDK version this extension was built and tested against.
+No global `@cursor/sdk` install is required. This package depends on exact `@cursor/sdk@1.0.14`, so normal package installation brings in the SDK version this extension was built and tested against. This cutover supports pi 0.76.0+ and Cursor SDK 1.0.14; older pi or Cursor SDK compatibility paths are not maintained.
 
 ## Install
 
@@ -147,7 +147,7 @@ pi --model cursor/gpt-5.5@272k:xhigh
 pi --model cursor/gpt-5.5@1m --thinking medium
 ```
 
-Cursor-only parameters are not encoded into pi model IDs. Cursor `context` becomes a pi-visible model variant because it changes pi's native `contextWindow`; Cursor `fast`
+Cursor-only parameters are not encoded into pi model IDs. Cursor `context` becomes a pi-visible model variant because it changes pi's native `contextWindow`; Cursor `fast` and Cursor SDK conversation mode are extension state, not model identity. Alias model IDs still share Cursor-only state, such as fast defaults, with their underlying Cursor base model.
 
 ## Thinking support
 
@@ -185,13 +185,42 @@ pi --model cursor/composer-2.5 --cursor-no-fast -p "Say ok only"
 
 Composer 2 and Composer 2.5 can default to fast. Use `--cursor-no-fast` for a one-shot no-fast Composer run. In print mode (`-p`), `--cursor-no-fast` is silent and does not write `~/.pi/agent/cursor-sdk.json`.
 
-In interactive mode, the footer only shows fast mode when fast is enabled:
+In interactive mode, the footer only shows fast mode when fast is enabled and Cursor mode when it is non-default. Fast and plan mode share one Cursor status value, so they do not overwrite each other:
 
 ```text
 cursor fast
+cursor plan
+cursor fast · plan
 ```
 
-If you do not see `cursor fast`, fast mode is off.
+If you do not see `cursor fast`, fast mode is off. If you do not see `cursor plan`, Cursor SDK mode is the default `agent` mode.
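Editor's sketch (not part of the package diff): the combined footer status described above can be pictured as one function over two independent flags. The name `formatCursorStatus` and its signature are hypothetical; only the three rendered strings come from the README text, and the package's real implementation may differ.

```typescript
type CursorMode = "agent" | "plan";

// Hypothetical helper: builds one footer status string from both flags,
// so fast and plan never overwrite each other.
function formatCursorStatus(fast: boolean, mode: CursorMode): string | undefined {
  const parts: string[] = [];
  if (fast) parts.push("fast");
  // The default `agent` mode is not shown; only non-default modes appear.
  if (mode !== "agent") parts.push(mode);
  // No parts means nothing is rendered in the footer.
  return parts.length > 0 ? `cursor ${parts.join(" · ")}` : undefined;
}
```

Keeping a single status value is what lets `cursor fast · plan` appear as one entry instead of two competing ones.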
+
+## Cursor SDK mode
+
+Cursor SDK conversation mode is Cursor-only extension state. It is not a pi model variant, not pi thinking/reasoning, not Cursor `fast`, and not pi's separate read-only plan-mode extension.
+
+Default mode is `agent`. Start a one-shot run in a specific mode:
+
+```bash
+pi --model cursor/composer-2.5 --cursor-mode agent
+pi --model cursor/composer-2.5 --cursor-mode plan
+```
+
+Change the session mode interactively:
+
+```text
+/cursor-mode agent
+/cursor-mode plan
+/cursor-mode
+```
+
+`/cursor-mode` with no argument reports the current mode and usage. The CLI flag does not persist to the session; slash-command changes are persisted with `pi.appendEntry()`.
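Editor's sketch (not part of the package diff): one way to model the three `/cursor-mode` forms listed above. The names `ModeCommand` and `parseCursorModeArgs` are hypothetical, and trimming, lowercasing, and the `invalid` branch are assumptions; the README only documents `agent`, `plan`, and the no-argument form.

```typescript
type CursorMode = "agent" | "plan";

type ModeCommand =
  | { kind: "show" }                      // `/cursor-mode` with no argument
  | { kind: "set"; mode: CursorMode }     // `/cursor-mode agent|plan`
  | { kind: "invalid"; input: string };   // assumption: anything else is rejected

// Hypothetical parser for the text that follows `/cursor-mode`.
function parseCursorModeArgs(args: string): ModeCommand {
  const value = args.trim().toLowerCase();
  if (value === "") return { kind: "show" };
  if (value === "agent" || value === "plan") return { kind: "set", mode: value };
  return { kind: "invalid", input: args.trim() };
}
```

A caller would persist only the `set` result to the session; the `show` result would report the current mode and usage without changing state.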
+
+Maintainers can run `/cursor-tools` in a Cursor model session to print the current bridge enablement, bootstrap manifest enablement, effective `PI_CURSOR_SETTING_SOURCES`, and callable-surface snapshot (host tools summary plus current `pi__*` names). See [Cursor dogfood checklist](docs/cursor-dogfood-checklist.md).
+
+When a new local Cursor SDK agent is created, the extension seeds the mode through `Agent.create({ mode })`. The extension also sends the effective Cursor mode on every `agent.send(..., { mode })` call so `/cursor-mode` and `--cursor-mode` remain the source of truth even when a pooled SDK agent is reused.
+
+Cursor SDK `plan` mode can produce plan-oriented output and Cursor todo/plan activity, but those replay cards remain display-only. They do not drive pi's plan-mode extension, pi todos, or active tool state.
 
 ## Images
 
@@ -200,6 +229,8 @@ Images from the latest user message are forwarded to Cursor. Historical images a
 
 ## Cursor provider tool contract
 
+See [Cursor tool surfaces in pi](docs/cursor-tool-surfaces.md) for a concise guide to callable vs display-only tools, MCP catalog limits, JSONL ID patterns, and how pi toggles differ from Cursor ambient MCP.
+
 Cursor runs use local Cursor SDK agents with two separate tool surfaces:
 
 - **Cursor-native surface:** Cursor local-agent tools, Cursor settings, plugins, and configured Cursor MCP servers. These remain owned by the Cursor SDK local agent path.
@@ -224,15 +255,24 @@ PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1 pi --model cursor/composer-2.5
 PI_CURSOR_MCP_TOOL_TIMEOUT_SECONDS=7200 pi --model cursor/composer-2.5
 PI_CURSOR_MCP_TOOL_TIMEOUT_MS=7200000 pi --model cursor/composer-2.5
 
+# Override known MCP initialize/listTools timeouts on first send (default 10s).
+PI_CURSOR_MCP_CONNECT_TIMEOUT_SECONDS=5 pi --model cursor/composer-2.5
+PI_CURSOR_MCP_CONNECT_TIMEOUT_MS=5000 pi --model cursor/composer-2.5
+
+# Disable bootstrap callable-surface manifest (on by default).
+PI_CURSOR_TOOL_MANIFEST=0 pi --model cursor/composer-2.5
+
 # Emit scrubbed bridge diagnostics as JSONL to stderr with prefix [pi-cursor-sdk:bridge].
 PI_CURSOR_PI_TOOL_BRIDGE_DEBUG=1 pi --model cursor/composer-2.5
 ```
 
-
+On bootstrap sends, a compact **callable tool surfaces** block is injected into the Cursor prompt by default so models see host-tool categories, exposed `pi__*` bridge names for the current run, and a reminder that configured Cursor MCP servers are discovered at runtime (not via pi's tool catalog). Disable with `PI_CURSOR_TOOL_MANIFEST=0`.
+
+`PI_CURSOR_PI_TOOL_BRIDGE=0` is the supported rollback flag and disables the bridge entirely. The bridge also treats `false`, `off`, `none`, `no`, and `disabled` as off; `1`, `true`, `on`, `yes`, and `enabled` as on. `PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1` opts in to exposing overlapping pi tool names that Cursor already has native equivalents for. The installed Cursor SDK uses a 60-second MCP protocol default with no public per-server timeout option. pi-cursor-sdk overrides that seam in two directions by default: MCP `callTool` requests are extended to 3600 seconds for long-running local MCP tools (including the pi bridge and configured Cursor MCP servers), and known MCP initialize/listTools requests on first send are shortened to 10 seconds so unavailable configured MCP servers fail fast instead of blocking for a full minute. Unknown Cursor SDK MCP protocol timeout stacks keep the SDK default instead of being shortened. Override tool-call timeouts with `PI_CURSOR_MCP_TOOL_TIMEOUT_MS` or `PI_CURSOR_MCP_TOOL_TIMEOUT_SECONDS`, and first-send initialize/listTools timeouts with `PI_CURSOR_MCP_CONNECT_TIMEOUT_MS` or `PI_CURSOR_MCP_CONNECT_TIMEOUT_SECONDS`. `PI_CURSOR_PI_TOOL_BRIDGE_DEBUG=1` is off by default and emits typed, allowlisted, scrubbed single-line JSONL records to `process.stderr`. These records are operational diagnostics, not anonymous telemetry: they intentionally include tool names, safe correlation IDs, bridge run state, exposed pi↔MCP name pairs, queued requests, result resolution, rejection, cancellation, and pending counts. They must not include endpoint URLs, endpoint path components, endpoint tokens, raw args/results, stdout/stderr payloads, file contents, Cursor settings output, API keys, bearer tokens, cookies, session credentials, or secrets. Do not enable or share bridge debug logs where tool names themselves are sensitive.
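Editor's sketch (not part of the package diff): the on/off spellings listed above for `PI_CURSOR_PI_TOOL_BRIDGE` can be read as a small toggle parser. The name `parseToggle` is hypothetical. Case-insensitive matching, trimming, and falling back to the default for unrecognized values are assumptions the README does not confirm.

```typescript
// Spellings taken from the README paragraph above.
const OFF_VALUES = ["0", "false", "off", "none", "no", "disabled"];
const ON_VALUES = ["1", "true", "on", "yes", "enabled"];

// Hypothetical helper: maps an env value to a boolean, using `fallback`
// when the variable is unset or (assumption) not a recognized spelling.
function parseToggle(raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined) return fallback;
  const value = raw.trim().toLowerCase();
  if (OFF_VALUES.indexOf(value) !== -1) return false;
  if (ON_VALUES.indexOf(value) !== -1) return true;
  return fallback;
}
```

For the bridge, the fallback would be `true`, since the text describes `PI_CURSOR_PI_TOOL_BRIDGE=0` as a rollback flag for something that is otherwise enabled.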
 
 ### Maintainer live smoke release gate
 
-For Cursor provider/runtime changes, follow the manual [Cursor live smoke checklist](docs/cursor-live-smoke-checklist.md) before release. See [Cursor testing lessons](docs/cursor-testing-lessons.md) for auth.json seeding, isolated `/tmp` harness layout, JSONL replay-error scans, and other regression traps. Assume every runtime surface is in scope. The checklist uses real `pi -e . --cursor-no-fast --model cursor/composer-2.5` runs with temporary session dirs and
+For Cursor provider/runtime changes, follow the manual [Cursor live smoke checklist](docs/cursor-live-smoke-checklist.md) before release. For a faster minimal-surface pass first, see [Cursor dogfood checklist](docs/cursor-dogfood-checklist.md). See [Cursor testing lessons](docs/cursor-testing-lessons.md) for auth.json seeding, isolated `/tmp` harness layout, JSONL replay-error scans, and other regression traps. Assume every runtime surface is in scope. The checklist uses real `pi -e . --cursor-no-fast --model cursor/composer-2.5` runs with temporary session dirs, pi 0.76.0 `--session-id`, sealed smoke-runner PATH/env wrappers, Cursor SDK `plan` mode, and mandatory visual TUI card/color inspection. The canonical visual path is `npm run smoke:visual`: offscreen PTY capture rendered through a browser/xterm view and saved as PNG screenshots with Playwright, or with `agent_browser` from the generated HTML when available. Its default matrix is native replay only: native replay registration is forced on, Cursor setting sources are disabled, the pi bridge is off, overlapping built-in pi tools are not exposed, and inherited Cursor SDK event-debug artifact env is cleared; `--event-debug` writes to a deterministic debug directory under the visual output directory. The visible TUI/output, rendered screenshots, scrubbed diagnostics, and persisted JSONL must agree. Do not mark a release ready with optional, deferred, mostly-passing, or unobserved smoke checks outstanding.
 
 ### Maintainer Cursor SDK event capture
 
@@ -254,7 +294,7 @@ Actual Cursor runs still need a key from `/login`, `CURSOR_API_KEY`, or `--api-k
 
 - **Local Cursor SDK agents only.** This extension does not use Cursor cloud agents. Cloud pi tool bridging is out of scope because it needs a separate auth, transport, lifetime, and remote trust design.
 - **The pi tool bridge is local and MCP-backed.** Bridgeable active pi tools are exposed to local Cursor agents through a tokenized `127.0.0.1` MCP endpoint; internal Cursor replay activity names are excluded, and overlapping built-in pi tools are hidden by default. Set `PI_CURSOR_PI_TOOL_BRIDGE=0` to disable it or `PI_CURSOR_EXPOSE_BUILTIN_TOOLS=1` to expose overlapping built-ins too.
-- **Cursor native tool replay is display-only.** Replay renders recorded Cursor SDK activity and never re-runs Cursor-side commands, reapplies Cursor edits, calls MCP servers, or mutates pi state. Workflow tools such as Cursor
+- **Cursor native tool replay is display-only.** Replay renders recorded Cursor SDK activity and never re-runs Cursor-side commands, reapplies Cursor edits, calls MCP servers, or mutates pi state. Workflow tools such as Cursor mode/task/todo/plan activity are not pi workflow controls. See [Cursor native tool replay](docs/cursor-native-tool-replay.md) for supported replay cards, ordering, conflict handling, and opt-out flags.
 - **Cursor run state can span tool-use turns.** Within a pi session, the extension reuses one Cursor SDK agent across compatible follow-up turns and sends incremental prompts when context still matches. It recreates the agent when context diverges, after compaction or `/tree` navigation, on API key changes, after send errors, or on session shutdown. For bridged pi tools, the matching pi `toolResult` resolves into the same live Cursor SDK run without creating a new `Agent`, unless the run was disposed, aborted, or cancelled. Replay can also split one live Cursor SDK run across pi `toolUse` turns for display.
 - **Cursor setting sources default to all.** The extension passes `local.settingSources: ["all"]` by default so configured Cursor MCP servers, plugin tools, project/user settings, and related Cursor-native capabilities are available like they are in Cursor. To narrow loading, set a comma-separated list such as `PI_CURSOR_SETTING_SOURCES=project,user,plugins`. To disable ambient setting sources, set `PI_CURSOR_SETTING_SOURCES=none`. Direct Cursor SDK bootstrap logs (settings, skills, hook-load compatibility warnings, and similar) are suppressed so they do not pollute the TUI.
 - **AGENTS.md / CLAUDE.md are not duplicated on Cursor models when Cursor loads the same rules.** Pi discovers global and project context files (`AGENTS.md`, `CLAUDE.md`, and case variants) unless you start with `-nc`. On `cursor/*` models the extension removes only `<project_instructions>` blocks that overlap Cursor `settingSources` via the `before_agent_start` hook: `user` for `~/.pi/agent/AGENTS.md`, `project` for repo/parent `AGENTS.md` and `CLAUDE.md` (verified Cursor behavior: local agents load project `AGENTS.md` and `CLAUDE.md` alongside Cursor rules). `~/.pi/agent/CLAUDE.md` is not stripped (Cursor user rules use `~/.claude/CLAUDE.md`, not pi's agent dir). With `PI_CURSOR_SETTING_SOURCES=none` or `plugins`-only, pi context is left intact. Set `PI_CURSOR_PRESERVE_PI_AGENTS_MD=1` to keep duplicate injection.
@@ -305,9 +345,9 @@ pi install npm:pi-cursor-sdk
 
 That does not mean the model cannot think. It means the Cursor SDK does not expose a pi-controllable thinking parameter for that model. The model may still think internally and may still emit thinking deltas that pi renders natively.
 
-### I do not see `cursor fast` in the footer
+### I do not see `cursor fast` or `cursor plan` in the footer
 
-Fast mode is currently off.
+Fast mode is currently off when `cursor fast` is absent. Cursor SDK mode is the default `agent` mode when `cursor plan` is absent. When both are active, pi shows one combined Cursor status: `cursor fast · plan`.
 
 ### My Cursor app settings or rules do not seem to apply
 
@@ -327,6 +367,10 @@ Many runs never expose web activity as replayable SDK tool completions or local
 
 **Web fetch:** `pi-cursor-sdk` can display `webFetchToolCall` transcript records and web-fetch-shaped MCP/host completions when Cursor reports them. It cannot make Cursor expose or execute a `WebFetch` tool. If Cursor's current local SDK tool set does not include WebFetch, pi cannot fetch a URL through Cursor web fetch; use an allowed browser/shell/MCP tool instead.
 
+### I disabled MCP in pi but Cursor still has extra tools
+
+pi extension toggles and pi's MCP catalog do not control Cursor ambient MCP. Local Cursor agents load MCP servers from Cursor setting sources (`PI_CURSOR_SETTING_SOURCES=all` by default), including `~/.cursor/mcp.json`. To remove a server, edit or clear that file (or Cursor MCP settings) and restart the pi session, or narrow/disable sources with `PI_CURSOR_SETTING_SOURCES=none` or a comma-separated subset. See [Cursor tool surfaces in pi](docs/cursor-tool-surfaces.md).
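Editor's sketch (not part of the package diff): the three documented forms of `PI_CURSOR_SETTING_SOURCES` (unset meaning `all`, `none`, or a comma-separated subset) could be normalized roughly as follows. The function name `parseSettingSources` is hypothetical and is not the package's `shared/cursor-setting-sources.mjs`. Lowercasing, whitespace trimming, and treating an empty value as the default are assumptions.

```typescript
// Hypothetical normalizer for PI_CURSOR_SETTING_SOURCES.
// Returns the list of source names to load; an empty list means
// no ambient Cursor setting sources.
function parseSettingSources(raw: string | undefined): string[] {
  if (raw === undefined || raw.trim() === "") return ["all"]; // documented default
  const names = raw
    .split(",")
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name.length > 0);
  if (names.indexOf("none") !== -1) return []; // `none` disables ambient sources
  return names.length > 0 ? names : ["all"];
}
```

With this reading, a value such as `project,plugins` excludes user-level sources, which is the narrowing the troubleshooting entry suggests when an unwanted server comes from `~/.cursor/mcp.json`.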
+
 ### Cursor does not call my pi extension tool
 
 The local pi bridge only exposes tools that are active in the current pi session and present in pi's tool registry at Cursor run start. By default, it does not expose overlapping pi tool names that Cursor already has native equivalents for (`read`, `bash`, `write`, `edit`, `grep`, `find`, and `ls`). Opt in if you intentionally want Cursor to see both the Cursor-native tool and an overlapping built-in pi tool:
@@ -341,6 +385,23 @@ To disable the bridge for rollback or isolation, start pi with:
 PI_CURSOR_PI_TOOL_BRIDGE=0 pi --model cursor/composer-2.5
 ```
 
+### First Cursor message is slow (10+ seconds)
+
+The extension loads Cursor setting sources with `PI_CURSOR_SETTING_SOURCES=all` by default, which includes user MCP servers from `~/.cursor/mcp.json`. On the first send of a session, the Cursor SDK connects to each configured MCP server before streaming a reply. pi-cursor-sdk shortens the known MCP initialize/listTools timeout path to **10 seconds by default** (the raw Cursor SDK default is 60 seconds), so a dead server should fail fast instead of blocking for a full minute. Unknown MCP protocol timeout stacks keep the SDK default instead of being shortened. A slow or unavailable server can still add roughly that connect timeout before the first reply. Tighten further with:
+
+```bash
+PI_CURSOR_MCP_CONNECT_TIMEOUT_SECONDS=5 pi --model cursor/composer-2.5
+PI_CURSOR_MCP_CONNECT_TIMEOUT_MS=5000 pi --model cursor/composer-2.5
+```
+
+Workarounds if you do not need user-level MCP in pi:
+
+```bash
+PI_CURSOR_SETTING_SOURCES=project,plugins,team pi --model cursor/composer-2.5
+```
+
+Or fix/disable the slow MCP server in Cursor settings. Maintainer timing probe: `npm run debug:mcp-coldstart`.
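Editor's sketch (not part of the package diff): how the two connect-timeout overrides and the 10-second default might resolve to one millisecond value. `resolveConnectTimeoutMs` and `parsePositive` are hypothetical names. The precedence (the `_MS` variable wins over `_SECONDS`) and the rule that invalid or non-positive values fall back to the default are assumptions; the README names the variables and the default only.

```typescript
type Env = Record<string, string | undefined>;

// Documented default for known MCP initialize/listTools requests.
const DEFAULT_CONNECT_TIMEOUT_MS = 10000;

// Accepts only finite, positive numbers; anything else counts as unset.
function parsePositive(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const n = Number(raw.trim());
  return isFinite(n) && n > 0 ? n : undefined;
}

// Hypothetical resolver. Assumption: the millisecond variable takes
// precedence when both are set.
function resolveConnectTimeoutMs(env: Env): number {
  const ms = parsePositive(env["PI_CURSOR_MCP_CONNECT_TIMEOUT_MS"]);
  if (ms !== undefined) return Math.round(ms);
  const seconds = parsePositive(env["PI_CURSOR_MCP_CONNECT_TIMEOUT_SECONDS"]);
  if (seconds !== undefined) return Math.round(seconds * 1000);
  return DEFAULT_CONNECT_TIMEOUT_MS;
}
```

The environment is passed in as a plain object rather than read from `process.env` so the sketch can be exercised without touching real process state.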
+
 ### A Cursor MCP tool times out
 
 The extension raises Cursor SDK's MCP tool-call timeout from 60 seconds to 3600 seconds by default for Cursor SDK MCP `callTool` requests, including the local pi bridge and configured Cursor MCP servers. For longer local MCP tools, set one override:
@@ -357,7 +418,7 @@ This usually needs session JSONL to classify. Common cases:
 - **Model text echo:** Assistant `text` blocks contain lines like `Tool call`, `Cursor activity`, or `call cursor-replay-…` without matching `toolCall` blocks — the Cursor model narrated pi prompt transcript format instead of invoking SDK tools. See [Tool calls listed as plain text (#40 triage)](docs/cursor-testing-lessons.md#tool-calls-listed-as-plain-text-40-triage).
 - **Stale replay routing / plan-strip:** Error `toolResult` or error assistant messages contain `Tool grep/cursor/find/ls not found`, or provider debug shows `inactive_trace` after plan-mode execute stripped active tools — tracked in **#52** (distinct from model text echo and #55).
 - **Replay vs execution:** `cursor-replay-*` IDs and neutral **Cursor MCP** activity cards are display-only recorded Cursor results; they do not re-run browser/MCP work. See [Cursor native tool replay](docs/cursor-native-tool-replay.md).
-- **Run failure / discarded tools:** A red toast with scrubbed detail may indicate an SDK failure (#55). Started-but-never-completed Cursor tools
+- **Run failure / discarded tools:** A red toast with scrubbed detail may indicate an SDK failure (#55). Started-but-never-completed Cursor tools surface neutral **Cursor … did not complete** activity cards with a bounded reason when the run failed/aborted, produced no assistant text, or involved external/side-effectful tools. Incomplete fast local discovery starts (`read`, `grep`, `glob`, `ls`) are debug-only after a successful text-producing run so stale SDK start events do not create red post-answer cards; maintainer debug for the same gap remains in **#52** (`PI_CURSOR_SDK_EVENT_DEBUG=1`).
 - **Hard network crash:** pi exited with uncaught `ConnectError` / `ETIMEDOUT` — **#43**, not #40 text echo.
 
 Capture `pi --version`, extension version, model, flags, the exact prompt, and a redacted session dir before filing bugs.
package/docs/cursor-dogfood-checklist.md
ADDED
@@ -0,0 +1,57 @@
|
|
|
1
|
+
# Cursor dogfood checklist
|
|
2
|
+
|
|
3
|
+
Short maintainer checklist for **minimal-surface** validation after prompt, bridge, replay, or manifest changes. This is the fast path from pi-cursor-composer dogfood sessions—not a substitute for the full [Cursor live smoke checklist](./cursor-live-smoke-checklist.md).
|
|
4
|
+
|
|
5
|
+
## Minimal environment
|
|
6
|
+
|
|
7
|
+
- Extension only: `pi -e . --cursor-no-fast --model cursor/composer-2.5`
|
|
8
|
+
- Fresh session dir: `--session-dir /tmp/pi-cursor-dogfood-<id>`
|
|
9
|
+
- Baseline surface (no ambient Cursor MCP/rules):
|
|
10
|
+
- `PI_CURSOR_SETTING_SOURCES=none`, **or**
|
|
11
|
+
- empty / minimal `~/.cursor/mcp.json` when you need to verify user MCP config separately
|
|
12
|
+
- Optional: `PI_CURSOR_TOOL_MANIFEST=0` to confirm bootstrap behavior without the manifest block
|
|
13
|
+
|
|
14
|
+
## One-turn exercise
|
|
15
|
+
|
|
16
|
+
1. **Native Cursor host tool** — one `read` or `shell` call (Cursor SDK host tools; not listed in MCP `listTools`).
|
|
17
|
+
2. **Pi bridge** (if enabled) — one bridged call via exposed `pi__*` MCP name, e.g. `pi__cursor_ask_question` when active.
|
|
18
|
+
3. **Configured MCP** (optional) — only when you intentionally load Cursor MCP via settings; skip for minimal baseline.
|
|
19
|
+
|
|
20
|
+
In-session debug: `/cursor-tools` prints bridge enablement, bootstrap manifest enablement, effective `PI_CURSOR_SETTING_SOURCES`, and the callable-surface manifest snapshot for the current session.
|
|
21
|
+
|
|
22
|
+
## JSONL spot-check
|
|
23
|
+
|
|
24
|
+
Inspect the session JSONL under the temp `--session-dir`:
|
|
25
|
+
|
|
26
|
+
| Pattern | Meaning |
|
|
27
|
+
| --- | --- |
|
|
28
|
+
| `cursor-replay-*` | Display-only replay of Cursor SDK activity—not callable |
|
|
29
|
+
| `cursor-pi-bridge-run-*` | Live pi execution via bridge |
|
|
30
|
+
| Callable tools | Cursor SDK host + MCP `listTools` + exposed `pi__*` only |
|
|
31
|
+
|
|
32
|
+
Common mistake: treating `cursor-replay-*` IDs or pi transcript tool labels as tools to invoke.
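
As a rough illustration of the table above, tool-call IDs can be sorted by prefix before deciding whether anything is callable. This is a minimal sketch: `classify_tool_id` is a hypothetical helper name, and how the ID is pulled out of a JSONL record is deliberately left out because the record layout is not specified here.

```bash
# Sketch only: classify a tool-call ID by the prefixes listed in the table.
# classify_tool_id is a hypothetical helper, not part of the repo.
classify_tool_id() {
  case "$1" in
    cursor-replay-*)        echo "display-only replay" ;;
    cursor-pi-bridge-run-*) echo "live pi bridge run" ;;
    *)                      echo "other" ;;
  esac
}
```

For example, `classify_tool_id cursor-replay-abc` prints `display-only replay`, which is a reminder that the ID names a replay record rather than a tool to invoke.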
|
|
33
|
+
|
|
34
|
+
## Bootstrap prompt
|
|
35
|
+
|
|
36
|
+
First send (bootstrap) should include:
|
|
37
|
+
|
|
38
|
+
- Short **Cursor SDK tool boundary** block
|
|
39
|
+
- **Callable tool surfaces this run** manifest (unless `PI_CURSOR_TOOL_MANIFEST=0`)
|
|
40
|
+
- Tail guard with shell `cd` hint
|
|
41
|
+
|
|
42
|
+
Incremental sends omit the full boundary; tail guard remains.
|
|
43
|
+
|
|
44
|
+
## Activity replay — Cursor edit card
|
|
45
|
+
|
|
46
|
+
After a Cursor **edit** tool call, confirm the activity card:
|
|
47
|
+
|
|
48
|
+
- `details.diffString` present on the replay record
|
|
49
|
+
- Collapsed diff preview with colored add/remove lines in the TUI
|
|
50
|
+
|
|
51
|
+
Canonical visual evidence: `npm run smoke:visual` (see [Cursor native tool visual audit](./cursor-native-tool-visual-audit.md)).
|
|
52
|
+
|
|
53
|
+
## Related docs
|
|
54
|
+
|
|
55
|
+
- [Cursor tool surfaces in pi](./cursor-tool-surfaces.md) — three namespaces and discoverability
|
|
56
|
+
- [Cursor live smoke checklist](./cursor-live-smoke-checklist.md) — full pre-release gate
|
|
57
|
+
- [Cursor testing lessons](./cursor-testing-lessons.md) — auth, JSONL scans, plan-mode traps
|
|
@@ -19,6 +19,8 @@ Use this manual checklist before releasing Cursor provider/runtime changes. Unit
|
|
|
19
19
|
```bash
|
|
20
20
|
export SMOKE_DIR="/tmp/pi-cursor-sdk-live-smoke-$(date +%Y%m%dT%H%M%S)"
|
|
21
21
|
mkdir -p "$SMOKE_DIR"
|
|
22
|
+
pi --version
|
|
23
|
+
npm ls @cursor/sdk @earendil-works/pi-coding-agent @earendil-works/pi-ai @earendil-works/pi-tui
|
|
22
24
|
pi -e . --list-models cursor
|
|
23
25
|
```
|
|
24
26
|
|
|
@@ -30,14 +32,26 @@ The repo also ships partial automation for the prerequisite/basic/default-settin
|
|
|
30
32
|
npm run smoke:live
|
|
31
33
|
```
|
|
32
34
|
|
|
35
|
+
`npm run smoke:live` resolves `pi`, `node`, `npm`, `rg`, and `tmux` once in the parent shell, then runs all `pi` shims with the resolved Node directory first on `PATH`. It clears inherited Cursor SDK event-debug env for every child pi run. Isolated helper cases force `PI_CURSOR_SETTING_SOURCES=none`; the `default-settings` helper case explicitly unsets `PI_CURSOR_SETTING_SOURCES` so it exercises the default ambient setting-source path.
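
The resolve-once contract described above can be sketched roughly as follows. This is an illustration under assumptions, not the helper's actual code: `resolve_tool` and `sealed_path` are hypothetical names, and the real scripts may resolve and validate tools differently.

```bash
# Sketch only: resolve_tool and sealed_path are hypothetical helper names,
# not functions taken from the repo's smoke scripts.
resolve_tool() {
  # Print the resolved location of a command, or fail with a message on stderr.
  local resolved
  resolved="$(command -v "$1")" || { echo "missing tool: $1" >&2; return 1; }
  printf '%s\n' "$resolved"
}

sealed_path() {
  # Build a PATH value with the directory of the given binary placed first.
  local bin_dir
  bin_dir="$(dirname "$1")"
  printf '%s:%s\n' "$bin_dir" "$PATH"
}
```

A child run could then be launched with something like `PATH="$(sealed_path "$(resolve_tool node)")" PI_CURSOR_SETTING_SOURCES=none pi ...`, so that `#!/usr/bin/env node` shims pick up the Node binary that was resolved in the parent shell.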
|
|
36
|
+
|
|
37
|
+
The canonical visual runner for section 4 is checked in separately:
|
|
38
|
+
|
|
39
|
+
```bash
|
|
40
|
+
npm run smoke:visual -- --help
|
|
41
|
+
```
|
|
42
|
+
|
|
33
43
|
For native replay regression checks (packed install, plan-strip resync, JSONL replay-error scan), use the isolated helper:
|
|
34
44
|
|
|
35
45
|
```bash
|
|
36
46
|
npm run smoke:isolated
|
|
37
47
|
# unit tests + pack only (no live Cursor):
|
|
38
48
|
SKIP_LIVE=1 npm run smoke:isolated
|
|
49
|
+
# sealed PATH/debug-env guard for the isolated helper:
|
|
50
|
+
npm run smoke:isolated -- --self-test
|
|
39
51
|
```
|
|
40
52
|
|
|
53
|
+
`npm run smoke:isolated` follows the same smoke-runner env contract as live/visual/steering helpers: pack-only work resolves only `node`, `npm`, and `env` from the parent shell and does not require `pi`; live checks then resolve `pi` and `rg`. It runs pi/npm shims with the resolved Node directory first on `PATH`, clears Cursor SDK event-debug env, forces `PI_CURSOR_SETTING_SOURCES=none` for provider checks, and explicitly unsets `PI_CURSOR_SETTING_SOURCES` for install/list checks.
|
|
54
|
+
|
|
41
55
|
Scan persisted sessions for native replay tool failures:
|
|
42
56
|
|
|
43
57
|
```bash
|
|
@@ -47,10 +61,12 @@ node scripts/validate-smoke-jsonl.mjs --replay-errors-only "$SMOKE_DIR/session-s
|
|
|
47
61
|
|
|
48
62
|
The replay scan flags only error `toolResult` / error assistant messages with `Tool grep/cursor/find/ls not found`, not successful reads of docs that mention those strings. See [Cursor testing lessons](./cursor-testing-lessons.md#what-counts-as-a-replay-failure).
|
|
49
63
|
|
|
50
|
-
|
|
64
|
+
`npm run smoke:live` is a helper only; it polls the section 3 TUI for answer/footer evidence and then cleans up the tmux session, but it does not replace the canonical rendered-PNG visual review in section 4. Run the relevant helper `--self-test` (`smoke:live`, `smoke:visual`, `smoke:steering`, or `smoke:isolated`) when changing sealed PATH or env wrappers. Release readiness still requires the manual checks below for detailed visual TUI behavior, bridge, standalone native replay, abort/cancel, packaging, cleanup, and any touched runtime surface not covered by the helper.
|
|
51
65
|
|
|
52
66
|
Pass criteria:
|
|
53
67
|
|
|
68
|
+
- `pi --version` reports pi 0.76.0 for this cutover baseline.
|
|
69
|
+
- `npm ls` shows `@cursor/sdk@1.0.14` and local `@earendil-works/*@0.76.0` packages.
|
|
54
70
|
- `cursor/composer-2.5` appears in the model list.
|
|
55
71
|
- No Cursor key or auth token is printed.
|
|
56
72
|
- If neither `~/.pi/agent/auth.json` cursor auth nor `CURSOR_API_KEY` is available, stop and report the live smoke as blocked.
|
|
@@ -99,7 +115,7 @@ Run a real interactive session under tmux:
|
|
|
99
115
|
```bash
|
|
100
116
|
SESSION="pi-cursor-sdk-smoke-$(date +%s)"
|
|
101
117
|
tmux new-session -d -s "$SESSION" -x 120 -y 40 -- zsh -lc \
|
|
102
|
-
"cd '$PWD' && PI_CURSOR_SETTING_SOURCES=none pi -e . --cursor-no-fast --model cursor/composer-2.5 --session-dir '$SMOKE_DIR/tui' --no-tools 'TUI smoke. Compute 19 + 23. Reply only with SUM=<number>.'"
|
|
118
|
+
"cd '$PWD' && PI_CURSOR_SETTING_SOURCES=none pi -e . --cursor-no-fast --model cursor/composer-2.5 --session-dir '$SMOKE_DIR/tui' --session-id cursor-sdk-1014-tui --no-tools 'TUI smoke. Compute 19 + 23. Reply only with SUM=<number>.'"
|
|
103
119
|
```
|
|
104
120
|
|
|
105
121
|
Observe with `tmux capture-pane -pt "$SESSION"` or attach manually.
|
|
@@ -107,12 +123,102 @@ Observe with `tmux capture-pane -pt "$SESSION"` or attach manually.
|
|
|
107
123
|
Pass criteria:
|
|
108
124
|
|
|
109
125
|
- Footer shows `(cursor) composer-2.5`. With `--cursor-no-fast`, Cursor fast mode is off and the Cursor extension status should not show `cursor fast`; ignore unrelated status text from other extensions.
|
|
126
|
+
- The run uses pi 0.76.0 `--session-id` successfully.
|
|
110
127
|
- Assistant answer appears correctly.
|
|
111
128
|
- `/session` shows one user and one assistant message for the simple run.
|
|
112
129
|
- Persisted JSONL has one assistant message. If the screen appears duplicated, inspect JSONL before deciding whether it is a rendering bug.
|
|
113
130
|
- Kill the tmux session after the check and verify no smoke tmux sessions remain.
|
|
114
131
|
|
|
115
|
-
## 4.
|
|
132
|
+
## 4. Mandatory visual card/color rendering check
|
|
133
|
+
|
|
134
|
+
This is the canonical visual release path for Cursor provider/runtime changes. It requires offscreen TUI visual inspection, not only JSONL or code review. Use pi 0.76.0, `@cursor/sdk@1.0.14`, a fresh temporary session dir, Cursor SDK `plan` mode, native replay enabled, and the checked-in visual runner. The runner resolves `pi` by directly walking the parent `PATH`, uses `process.execPath` for Node, and prepends that Node directory for both prereq checks and tmux launches so `#!/usr/bin/env node` shims use the validated Node. The default matrix is native replay only: native replay registration is forced on, settings sources are `none`, the pi bridge is off, overlapping built-in pi tools are not exposed, and inherited Cursor SDK event-debug artifact env is cleared. With `--event-debug`, debug capture writes to a deterministic directory under `VISUAL_DIR`.
|
|
135
|
+
|
|
136
|
+
```bash
|
|
137
|
+
VISUAL_DIR="$(mktemp -d /tmp/pi-cursor-sdk-1014-visual.XXXXXX)"
|
|
138
|
+
VISUAL_ARGS=(
|
|
139
|
+
--ext "$PWD"
|
|
140
|
+
--cwd "$PWD"
|
|
141
|
+
--out-dir "$VISUAL_DIR"
|
|
142
|
+
--wait-ms 60000
|
|
143
|
+
--event-debug
|
|
144
|
+
)
|
|
145
|
+
|
|
146
|
+
npm run smoke:visual -- "${VISUAL_ARGS[@]}" \
|
|
147
|
+
--label read-package \
|
|
148
|
+
--prompt 'Use only your file read tool. Read ./package.json and answer with only the package name. Do not use shell, grep, glob, find, or list tools.'
|
|
149
|
+
|
|
150
|
+
npm run smoke:visual -- "${VISUAL_ARGS[@]}" \
|
|
151
|
+
--label grep-readme \
|
|
152
|
+
--prompt 'Use only your grep/search tool to search ./README.md for the literal string "pi-cursor-sdk". Do not use shell, read, glob, find, ls, or list tools. Report only the first matching file path.'
|
|
153
|
+
|
|
154
|
+
npm run smoke:visual -- "${VISUAL_ARGS[@]}" \
|
|
155
|
+
--label find-readme \
|
|
156
|
+
--prompt 'Use only your glob/file-search/find tool to find README.md from the repository root. Do not use shell, read, grep, ls, or list tools. Report matched paths exactly.'
|
|
157
|
+
|
|
158
|
+
npm run smoke:visual -- "${VISUAL_ARGS[@]}" \
|
|
159
|
+
--label list-src \
|
|
160
|
+
--prompt 'Use only your directory listing tool to list ./src. Do not use shell, read, grep, glob, or find tools. Report whether cursor-provider.ts is present.'
|
|
161
|
+
|
|
162
|
+
npm run smoke:visual -- "${VISUAL_ARGS[@]}" \
|
|
163
|
+
--label shell-success \
|
|
164
|
+
--prompt "Use only your shell/terminal tool to run printf 'cursor visual smoke\\n'. Do not use read, grep, glob, find, ls, edit, or write. Report the output."
|
|
165
|
+
|
|
166
|
+
npm run smoke:visual -- "${VISUAL_ARGS[@]}" \
|
|
167
|
+
--label write-file \
|
|
168
|
+
--prompt 'Use your normal file write tool to create .debug/visual-smoke/cursor-mode.txt with exactly two lines: alpha and beta. Do not use shell.'
|
|
169
|
+
|
|
170
|
+
npm run smoke:visual -- "${VISUAL_ARGS[@]}" \
|
|
171
|
+
--label edit-file \
|
|
172
|
+
--prompt 'Use your normal file edit/str-replace tool to change beta to gamma in .debug/visual-smoke/cursor-mode.txt. Do not use shell.'
|
|
173
|
+
|
|
174
|
+
npm run smoke:visual -- "${VISUAL_ARGS[@]}" \
|
|
175
|
+
--label read-missing \
|
|
176
|
+
--prompt 'Use only your file read tool to read .debug/visual-smoke/does-not-exist.txt. Then explain the result. Do not use shell, grep, glob, find, ls, edit, or write.'
|
|
177
|
+
|
|
178
|
+
npm run smoke:visual -- "${VISUAL_ARGS[@]}" \
|
|
179
|
+
--label workflow-activity \
|
|
180
|
+
--prompt 'Stay in Cursor plan mode. If Cursor exposes plan, todo, task, or mode activity for this request, use that capability to outline a tiny unit test without editing files. Otherwise answer with a concise numbered plan. Do not use shell or file mutation tools.'
|
|
181
|
+
```
|
|
182
|
+
|
|
183
|
+
By default, `npm run smoke:visual` writes `.ansi`, `.txt`, `.html`, `.png`, and `.jsonl.path` artifacts. If Playwright Chromium is unavailable in an agent-harness run, rerun with `--no-screenshot`, open the generated `.html` with `agent_browser`, save a PNG screenshot, and record that PNG path beside the runner artifacts. To visually audit bridge behavior or ambient Cursor settings, opt in with `--bridge`, `--bridge --expose-builtin-tools`, or `--setting-sources <value>` and label that evidence separately; do not count those opt-in runs as default native replay matrix proof.
|
|
184
|
+
|
|
185
|
+
Expected proof for each category is defined in [Cursor Native Tool Visual Audit Workflow](./cursor-native-tool-visual-audit.md). Do not mark a category passed because the prompt was sent. A category passes only when the PNG shows the expected card and the JSONL shows the expected completed `toolCall` / `toolResult` pair with the expected `isError` state.
|
|
186
|
+
|
|
187
|
+
Pass criteria:
|
|
188
|
+
|
|
189
|
+
- PNG screenshots exist for every claimed card category, not only text/JSONL logs.
|
|
190
|
+
- JSONL paths exist for every claimed card category.
|
|
191
|
+
- Required cutover categories have matching PNG + JSONL proof from the default native replay matrix: read, grep/search, find/glob, list, shell success, write, edit/diff, and true read failure.
|
|
192
|
+
- Native-looking read/search/find/list/shell/write/edit cards use intended pi card styling.
|
|
193
|
+
- Shell success is not red/error-styled; stdout is readable.
|
|
194
|
+
- Edit/diff previews show removed lines in red and added lines in green, with readable paths.
|
|
195
|
+
- True failures are visible, bounded, and distinct from neutral activity.
|
|
196
|
+
- Footer/status is readable in Cursor `plan` mode and combines with the `cursor fast` status when applicable.
|
|
197
|
+
- Neutral Cursor plan/todo/task/mode activity is claimed only if JSONL contains a completed Cursor workflow event; if Cursor only returns plan text, record workflow activity as not exercised instead of passed.
|
|
198
|
+
- Evidence paths for ANSI capture, rendered PNG screenshots, JSONL, and debug artifact directories are recorded in [Cursor native tool visual audit](./cursor-native-tool-visual-audit.md) or the release handoff.
|
|
199
|
+
- No secrets, raw debug artifacts, or scratch output are committed.
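
The first two pass criteria (a PNG and a JSONL path for every claimed category) lend themselves to a quick presence check before the manual review. The sketch below is an assumption-heavy illustration: `missing_visual_artifacts` is a hypothetical helper, and it assumes artifacts are named `<label>.png` and `<label>.jsonl.path` directly under the output directory, which may not match what the runner actually writes.

```bash
# Sketch only. ASSUMPTION: artifacts are named "<label>.png" and
# "<label>.jsonl.path" directly under the output directory; adjust the
# patterns to whatever the visual runner really produces.
missing_visual_artifacts() {
  local dir="$1"; shift
  local label ext missing=0
  for label in "$@"; do
    for ext in png jsonl.path; do
      # -s: the file must exist and be non-empty.
      if [ ! -s "$dir/$label.$ext" ]; then
        echo "missing: $dir/$label.$ext"
        missing=1
      fi
    done
  done
  return "$missing"
}
```

It could be called as `missing_visual_artifacts "$VISUAL_DIR" read-package grep-readme find-readme`. A zero exit status only shows that the files exist; it does not replace inspecting the PNG and the `toolCall` / `toolResult` pair for each category.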
|
|
200
|
+
|
|
201
|
+
## 5. Cursor SDK plan-mode provider check
|
|
202
|
+
|
|
203
|
+
```bash
|
|
204
|
+
PI_CURSOR_SETTING_SOURCES=none \
|
|
205
|
+
pi -e . --cursor-no-fast --cursor-mode plan --model cursor/composer-2.5 \
|
|
206
|
+
--session-dir "$SMOKE_DIR/cursor-mode-plan" \
|
|
207
|
+
--session-id cursor-sdk-1014-plan \
|
|
208
|
+
--no-tools \
|
|
209
|
+
-p 'Cursor mode smoke. Reply with one short implementation plan for printing hello.' \
|
|
210
|
+
> "$SMOKE_DIR/cursor-mode-plan.stdout.txt" \
|
|
211
|
+
2> "$SMOKE_DIR/cursor-mode-plan.stderr.txt"
|
|
212
|
+
```
|
|
213
|
+
|
|
214
|
+
Pass criteria:
|
|
215
|
+
|
|
216
|
+
- Exit code is `0`.
|
|
217
|
+
- stdout contains a short plan-like answer.
|
|
218
|
+
- stderr is empty or contains only expected non-secret diagnostics.
|
|
219
|
+
- No pi active-tool or pi plan-mode state is mutated merely because Cursor SDK mode is `plan`.
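
The mechanical parts of these criteria (exit code, non-empty stdout, stderr review) could be scripted along the following lines. This is only a sketch: `check_plan_smoke` is a hypothetical helper, and it does not judge whether the answer is plan-like or whether pi state was mutated, which still need a manual look.

```bash
# Sketch only: check_plan_smoke is a hypothetical helper, not part of the repo.
# Arguments: exit code, stdout capture file, stderr capture file.
check_plan_smoke() {
  local code="$1" out="$2" err="$3"
  [ "$code" -eq 0 ] || { echo "FAIL: exit code $code"; return 1; }
  [ -s "$out" ] || { echo "FAIL: empty stdout"; return 1; }
  if [ -s "$err" ]; then
    # Non-empty stderr is not an automatic failure; it needs a human review.
    echo "REVIEW: stderr is not empty; confirm it holds only expected non-secret diagnostics"
  fi
  echo "OK"
}
```

Run immediately after the `pi` command, it might look like `check_plan_smoke "$?" "$SMOKE_DIR/cursor-mode-plan.stdout.txt" "$SMOKE_DIR/cursor-mode-plan.stderr.txt"`.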
|
|
220
|
+
|
|
221
|
+
## 6. Bridge multi-tool success and failure
|
|
116
222
|
|
|
117
223
|
```bash
|
|
118
224
|
PI_CURSOR_SETTING_SOURCES=none \
|
|
@@ -133,7 +239,7 @@ Pass criteria:
|
|
|
133
239
|
- Persisted JSONL contains real pi tool calls named `read`, matching `toolResult` messages, and final assistant output.
|
|
134
240
|
- Later assistant usage counts consumed tool-result input; no assistant usage has negative values or nonzero cache fields.
|
|
135
241
|
|
|
136
|
-
##
|
|
242
|
+
## 7. Native replay cards without the pi bridge
|
|
137
243
|
|
|
138
244
|
```bash
|
|
139
245
|
PI_CURSOR_SETTING_SOURCES=none \
|
|
@@ -152,7 +258,7 @@ Pass criteria:
|
|
|
152
258
|
- Persisted JSONL shows an assistant `toolUse` turn with a replayed `read` tool call, a pi `read` `toolResult`, and a final assistant turn.
|
|
153
259
|
- Native replay is display-only: it must not re-run Cursor-side mutations or create duplicate pi mutations.
|
|
154
260
|
|
|
155
|
-
##
|
|
261
|
+
## 8. Diagnostics safety contract
|
|
156
262
|
|
|
157
263
|
Bridge diagnostics are scrubbed operational logs, not anonymous telemetry.
|
|
158
264
|
|
|
@@ -203,7 +309,7 @@ Pass criteria:
|
|
|
203
309
|
- The scan returns no matching files except deliberately planted test strings that are asserted not to appear in serialized diagnostics, and it does not print matched secret-bearing lines.
|
|
204
310
|
- If tool names themselves are considered sensitive for a release target, do not enable `PI_CURSOR_PI_TOOL_BRIDGE_DEBUG=1` for shared logs. The diagnostics contract intentionally allows tool names.
|
|
205
311
|
|
|
206
|
-
##
|
|
312
|
+
## 9. Long-running bridge and abort/cancel
|
|
207
313
|
|
|
208
314
|
This check is release-blocking for every Cursor provider/runtime release.
|
|
209
315
|
|
|
@@ -224,7 +330,7 @@ Pass criteria:
|
|
|
224
330
|
- Diagnostics either show clean cancellation/disposal or the process exits cleanly without orphaning children.
|
|
225
331
|
- Persisted JSONL does not contain a false successful final answer.
|
|
226
332
|
|
|
227
|
-
##
|
|
333
|
+
## 10. Final structural session scan
|
|
228
334
|
|
|
229
335
|
After all live runs, scan JSONL structurally instead of reading raw content into a report:
|
|
230
336
|
|
|
@@ -245,7 +351,7 @@ Additional manual usage checks for provider/accounting changes:
|
|
|
245
351
|
- Tool-heavy runs should show nonzero output for visible assistant/tool-call activity.
|
|
246
352
|
- Split runs should count consumed tool-result input once on the following assistant turn.
|
|
247
353
|
|
|
248
|
-
##
|
|
354
|
+
## 11. Standard local gates
|
|
249
355
|
|
|
250
356
|
```bash
|
|
251
357
|
git diff --check
|
|
@@ -259,7 +365,7 @@ Pass criteria:
|
|
|
259
365
|
- All commands exit `0`.
|
|
260
366
|
- `npm pack --dry-run` includes all new runtime source files and excludes local smoke artifacts, sessions, package tarballs, `.env*`, `.pi/`, `dist/`, and `coverage/`.
|
|
261
367
|
|
|
262
|
-
##
|
|
368
|
+
## 12. Cleanup
|
|
263
369
|
|
|
264
370
|
```bash
|
|
265
371
|
tmux list-sessions | grep 'pi-cursor-sdk-smoke' || true
|
|
@@ -279,7 +385,7 @@ Everything in this section is in scope for Cursor provider/runtime releases. The
|
|
|
279
385
|
- Long-running bridged tool abort/cancel cleanup.
|
|
280
386
|
- Native replay cards beyond read, especially shell/edit/write cards, when those renderers change.
|
|
281
387
|
- Bridge question UI when `cursor_ask_question` changes.
|
|
282
|
-
- MCP timeout override behavior when timeout code changes.
|
|
388
|
+
- MCP timeout override behavior (3600s `callTool` default, 10s `initialize`/`listTools` default, and SDK defaults for unknown protocol stacks) when timeout code changes.
|
|
283
389
|
- SDK `semSearch` / `recordScreen` activity replay when those formatters change. There is no reliable local prompt that forces Cursor to call these built-in SDK tools on demand; regression is covered by `test/cursor-tool-transcript.test.ts`. Opportunistically confirm neutral `Cursor semantic search` / `Cursor screen recording` cards if a live run surfaces them.
|
|
284
390
|
- Ambient Cursor setting-source behavior when startup filtering or local Cursor settings handling changes.
|
|
285
391
|
- Model discovery aliases/context variants when model-discovery code or Cursor SDK versions change.
|