pi-interactive-shell 0.8.2 → 0.10.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +47 -1
- package/README.md +22 -14
- package/SKILL.md +4 -2
- package/background-widget.ts +76 -0
- package/config.ts +4 -4
- package/examples/prompts/codex-implement-plan.md +18 -7
- package/examples/prompts/codex-review-impl.md +16 -5
- package/examples/prompts/codex-review-plan.md +20 -10
- package/examples/skills/codex-5-3-prompting/SKILL.md +161 -0
- package/examples/skills/codex-cli/SKILL.md +16 -8
- package/examples/skills/gpt-5-4-prompting/SKILL.md +202 -0
- package/handoff-utils.ts +92 -0
- package/headless-monitor.ts +16 -3
- package/index.ts +240 -384
- package/notification-utils.ts +134 -0
- package/overlay-component.ts +61 -248
- package/package.json +26 -6
- package/pty-log.ts +59 -0
- package/pty-protocol.ts +33 -0
- package/pty-session.ts +11 -134
- package/reattach-overlay.ts +6 -74
- package/runtime-coordinator.ts +69 -0
- package/scripts/install.js +18 -3
- package/session-manager.ts +21 -11
- package/session-query.ts +170 -0
- package/spawn-helper.ts +37 -0
- package/tool-schema.ts +6 -2
- package/types.ts +6 -0
package/CHANGELOG.md
CHANGED

@@ -2,7 +2,53 @@

 All notable changes to the `pi-interactive-shell` extension will be documented in this file.

-## [
+## [0.10.0] - 2026-03-13
+
+### Added
+- **Test harness** - Added vitest with 20 tests covering session queries, key encoding, notification formatting, headless monitor lifecycle, session manager, config/docs parity, and module loading.
+- **`gpt-5-4-prompting` skill** - New bundled skill with GPT-5.4 prompting best practices for Codex workflows.
+
+### Changed
+- **Architecture refactor** - Extracted shared logic into focused modules for better maintainability:
+  - `session-query.ts` - Unified output/query logic (rate limiting, incremental, drain, offset modes)
+  - `notification-utils.ts` - Message formatting for dispatch/hands-free notifications
+  - `handoff-utils.ts` - Snapshot/preview capture on session exit/transfer
+  - `runtime-coordinator.ts` - Centralized overlay/monitor/widget state management
+  - `pty-log.ts` - Raw output trimming and line slicing
+  - `pty-protocol.ts` - DSR cursor position query handling
+  - `spawn-helper.ts` - macOS node-pty permission fix
+  - `background-widget.ts` - TUI widget for background sessions
+- README, `SKILL.md`, install output, and the packaged Codex workflow examples now tell the same story about dispatch being the recommended delegated mode, the current 8s quiet-threshold / 15s grace-period defaults, and the bundled prompt-skill surface.
+- The Codex workflow docs now point at the packaged `gpt-5-4-prompting`, `codex-5-3-prompting`, and `codex-cli` skills instead of describing a runtime fetch of the old 5.2 prompting guide.
+- Example prompts and skill docs are aligned around `gpt-5.4` as the default Codex model, with `gpt-5.3-codex` remaining the explicit opt-in fallback.
+- Renamed the `codex-5.3-prompting` example skill to `codex-5-3-prompting` (filesystem-friendly path).
+
+### Fixed
+- **Map iteration bug** - Fixed `disposeAllMonitors()` modifying the Map during iteration, which could cause unpredictable behavior.
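A minimal sketch of the pattern behind this fix (the `Monitor` type and `monitors` map are illustrative assumptions, not the package's actual internals): snapshot the Map's values before disposing, so a `dispose()` that deletes entries cannot perturb the iteration.

```typescript
// Illustrative only: "Monitor" and "monitors" are assumed names.
interface Monitor {
  id: string;
  dispose(): void;
}

const monitors = new Map<string, Monitor>();
const disposed: string[] = [];

function makeMonitor(id: string): Monitor {
  return {
    id,
    dispose() {
      disposed.push(id);
      monitors.delete(id); // mutates the Map while it may be iterated
    },
  };
}

for (const id of ["a", "b", "c"]) monitors.set(id, makeMonitor(id));

function disposeAllMonitors(): void {
  // Snapshot first: iterating the live Map while dispose() deletes
  // entries is the hazard this release fixes.
  for (const monitor of [...monitors.values()]) {
    monitor.dispose();
  }
}

disposeAllMonitors();
```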
+- **Array iteration bug** - Fixed PTY listener notifications modifying arrays during iteration if a listener unsubscribed itself.
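The same snapshot-before-iterating idea applies to the listener fix; this sketch (listener and helper names are assumptions) shows a self-unsubscribing listener that would otherwise shift its neighbor out of the notification pass:

```typescript
// Illustrative sketch: names are assumptions, not the package's PTY code.
type DataListener = (chunk: string) => void;

const listeners: DataListener[] = [];
const seen: string[] = [];

function unsubscribe(fn: DataListener): void {
  const i = listeners.indexOf(fn);
  if (i !== -1) listeners.splice(i, 1); // mutates the live array
}

const first: DataListener = (chunk) => {
  seen.push(`first:${chunk}`);
  unsubscribe(first); // self-removal mid-notification
};
const second: DataListener = (chunk) => seen.push(`second:${chunk}`);
listeners.push(first, second);

function notifyData(chunk: string): void {
  // Iterate a copy so a splice() inside a listener can't skip the
  // next listener in this pass.
  for (const listener of [...listeners]) {
    listener(chunk);
  }
}

notifyData("hello");
```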
+- **Missing runtime dependency** - Added `@sinclair/typebox` to dependencies (was imported but not declared).
+- Documented the packaged prompt/skill onboarding path more clearly so users can either rely on the exported package metadata or copy the bundled examples into their own prompt and skill directories.
+
+## [0.9.0] - 2026-02-23
+
+### Added
+- `examples/skills/codex-5.3-prompting/` skill with a GPT-5.3-Codex prompting guide -- self-contained best practices for verbosity control, scope discipline, forced upfront reading, plan mode, mid-task steering, context management, and reasoning effort recommendations.
+- **`interactive-shell:update` event** — All hands-free update callbacks now emit `pi.events.emit("interactive-shell:update", update)` with the full `HandsFreeUpdate` payload. Extensions can listen for quiet, exit, kill, and user-takeover events regardless of which code path started the session (blocking, non-blocking, or reattach).
+- **`triggerTurn` on terminal events** — Non-blocking hands-free sessions now send `pi.sendMessage` with `triggerTurn: true` when the session exits, is killed, or the user takes over. Periodic "running" updates emit only on the event bus (cheap for extensions) without waking the agent.
+
+### Fixed
+- **Quiet detection broken for TUI apps** — Ink-based CLIs (Claude Code, etc.) emit periodic ANSI-only PTY data (cursor blink, frame redraws) that reset the quiet timer on every event, preventing quiet detection from ever triggering. Now filters data through `stripVTControlCharacters` and only resets the quiet timer when there's visible content. Fixed in both the overlay (`overlay-component.ts`) and headless dispatch monitor (`headless-monitor.ts`). Also seeds the quiet timer at startup when `autoExitOnQuiet` is enabled, so sessions that never produce visible output still get killed after the grace period.
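The filtering described here reduces to a small predicate over Node's `stripVTControlCharacters` (in `node:util` since Node 16.11); the helper name below is illustrative, not the package's actual function:

```typescript
import { stripVTControlCharacters } from "node:util";

// True only when the PTY chunk contains user-visible text. ANSI-only
// frames (cursor hides, erase-line redraws) strip to nothing, so they
// no longer reset the quiet timer. Helper name is an assumption.
function hasVisibleContent(chunk: string): boolean {
  return stripVTControlCharacters(chunk).trim().length > 0;
}

// Ink-style redraw frame: hide cursor + erase line + cursor move.
const redrawOnly = "\x1b[?25l\x1b[2K\x1b[G";
// Real output mixed with color codes: visible text remains.
const realOutput = "\x1b[32mbuild passed\x1b[0m\n";
```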
+- **Lifecycle guard decoupled from callback** — The overlay used `onHandsFreeUpdate` presence as a proxy for "blocking tool call" to decide whether to unregister sessions on completion. Wiring the callback in non-blocking paths (for event emission) would cause premature session cleanup. Introduced a `streamingMode` flag to separate "has update callback" from "should unregister on completion," so non-blocking sessions stay queryable after the callback fires.
+- **`autoExitOnQuiet` broken in interval update mode** — The `onData` handler only reset the quiet timer in `on-quiet` mode, so `autoExitOnQuiet` never fired with `updateMode: "interval"`. Also, the interval timer's safety-net flush unconditionally stopped the quiet timer, preventing `autoExitOnQuiet` from firing if the interval flushed before the quiet threshold. Both fixed: the data handler now resets the timer whenever `autoExitOnQuiet` is enabled regardless of update mode, and the interval flush restarts (rather than stops) the quiet timer when `autoExitOnQuiet` is active.
+- **RangeError on narrow terminals** — `render()` computed `width - 2` for border strings without a lower bound, causing `String.prototype.repeat()` to throw with negative counts when terminal width < 4. Clamped in both the main overlay and reattach overlay. Fixes #2.
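An illustrative reconstruction of the clamp (function name and exact border characters are assumptions): `String.prototype.repeat` throws a `RangeError` for negative counts, so the inner width is bounded at zero before the border string is built.

```typescript
// Sketch of the fix: clamp before repeat so narrow terminals degrade
// instead of throwing. Names are assumptions, not the overlay's code.
function borderTop(width: number): string {
  const inner = Math.max(0, width - 2); // width - 2 can go negative
  return "┌" + "─".repeat(inner) + "┐";
}

// width 1 would previously throw; now it degrades to just the corners.
const narrow = borderTop(1);
const normal = borderTop(10);
```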
+- **Hardcoded `~/.pi/agent` path** — Config loading, snapshot writing, and the install script all hardcoded `~/.pi/agent`, ignoring `PI_CODING_AGENT_DIR`. Now uses `getAgentDir()` from pi's API in all runtime paths and reads the env var in the install script. Fixes #1.
+
+### Changed
+- Default `handsFreeQuietThreshold` increased from 5000ms to 8000ms and `autoExitGracePeriod` reduced from 30000ms to 15000ms. Both remain adjustable per-call via `handsFree.quietThreshold` and `handsFree.gracePeriod`, and via config file.
+- Dispatch mode is now the recommended default for delegated Codex runs. Updated `README.md`, `SKILL.md`, `tool-schema.ts`, `examples/skills/codex-cli/SKILL.md`, and all three codex prompt templates to prefer `mode: "dispatch"` over hands-free for fire-and-forget delegations.
+- Rewrote the `codex-5.3-prompting` skill from a descriptive model-behavior guide into a directive prompt-construction reference. Cut the behavioral comparison, mid-task steering, and context management prose sections; reframed each prompt block with a one-line "include when X" directive so the agent knows what to inject and when.
+- Added a "Backwards compatibility hedging" section to the `codex-5.3-prompting` skill covering the "cutover" keyword trick -- GPT-5.3-Codex inserts compatibility shims and fallback code even when told not to; using "cutover" + "no backwards compatibility" + "do not preserve legacy code" produces cleaner breaks than vague "don't worry about backwards compatibility" phrasing.
+- Example prompts (`codex-implement-plan`, `codex-review-impl`, `codex-review-plan`) updated for GPT-5.3-Codex: load the `codex-5.3-prompting` and `codex-cli` skills instead of fetching the 5.2 guide URL at runtime, added scope fencing instructions to counter 5.3's aggressive refactoring, added "don't ask clarifying questions" and "brief updates" constraints, and strengthened `codex-review-plan` to force reading codebase files referenced in the plan and constrain edit scope.

 ## [0.8.2] - 2026-02-10

package/README.md
CHANGED

@@ -49,7 +49,7 @@ Three modes control how the agent engages with a session:

 **Hands-free** returns immediately so the agent can do other work, but the agent must poll periodically to discover output and completion. Good for processes the agent needs to monitor and react to mid-flight, like watching build output and sending follow-up commands.

-**Dispatch** also returns immediately, but the agent doesn't poll at all. When the session completes — whether by natural exit, quiet detection, timeout, or user intervention — the agent gets woken up with a notification containing the tail output. This is the right mode for delegating a task to a subagent and moving on. Add `background: true` to skip the overlay entirely and run headless.
+**Dispatch** also returns immediately, but the agent doesn't poll at all. When the session completes — whether by natural exit, quiet detection, timeout, or user intervention — the agent gets woken up with a notification containing the tail output. This is the right mode for delegating a task to a subagent and moving on. For fire-and-forget delegated runs and QA checks, prefer dispatch by default. Add `background: true` to skip the overlay entirely and run headless.

 ## Quick Start

@@ -115,7 +115,7 @@ Attach to review full output: interactive_shell({ attach: "calm-reef" })

 The notification includes a brief tail (last 5 lines) and a reattach instruction. The PTY is preserved for 5 minutes so the agent can attach to review full scrollback.

-Dispatch defaults `autoExitOnQuiet: true` — the session gets a
+Dispatch defaults `autoExitOnQuiet: true` — the session gets a 15s startup grace period, then is killed after output goes silent (8s by default), which signals completion for task-oriented subagents. Tune the grace period with `handsFree: { gracePeriod: 60000 }` or opt out entirely with `handsFree: { autoExitOnQuiet: false }`.

 The overlay still shows for the user, who can Ctrl+T to transfer output, Ctrl+B to background, take over by typing, or Ctrl+Q for more options.

@@ -151,7 +151,7 @@ interactive_shell({

 ### Auto-Exit on Quiet

-For fire-and-forget single-task delegations, enable auto-exit to kill the session after
+For fire-and-forget single-task delegations, enable auto-exit to kill the session after 8s of output silence:

 ```typescript
 interactive_shell({
@@ -161,7 +161,7 @@ interactive_shell({
 })
 ```

-A
+A 15s startup grace period prevents the session from being killed before the subprocess has time to produce output. Customize it per-call with `gracePeriod`:

 ```typescript
 interactive_shell({
@@ -282,8 +282,8 @@ Configuration files (project overrides global):
   "completionNotifyMaxChars": 5000,
   "handsFreeUpdateMode": "on-quiet",
   "handsFreeUpdateInterval": 60000,
-  "handsFreeQuietThreshold":
-  "autoExitGracePeriod":
+  "handsFreeQuietThreshold": 8000,
+  "autoExitGracePeriod": 15000,
   "handsFreeUpdateMaxChars": 1500,
   "handsFreeMaxTotalChars": 100000,
   "handoffPreviewEnabled": true,
@@ -306,8 +306,8 @@ Configuration files (project overrides global):
 | `completionNotifyLines` | 50 | Lines in dispatch completion notification (10-500) |
 | `completionNotifyMaxChars` | 5000 | Max chars in completion notification (1KB-50KB) |
 | `handsFreeUpdateMode` | "on-quiet" | "on-quiet" or "interval" |
-| `handsFreeQuietThreshold` |
-| `autoExitGracePeriod` |
+| `handsFreeQuietThreshold` | 8000 | Silence duration before update (ms) |
+| `autoExitGracePeriod` | 15000 | Startup grace before `autoExitOnQuiet` kill (ms) |
 | `handsFreeUpdateInterval` | 60000 | Max interval between updates (ms) |
 | `handsFreeUpdateMaxChars` | 1500 | Max chars per update |
 | `handsFreeMaxTotalChars` | 100000 | Total char budget for updates |
@@ -331,7 +331,7 @@ Full PTY. The subprocess thinks it's in a real terminal.

 ## Example Workflow: Plan, Implement, Review

-The `examples/prompts/` directory includes three prompt templates that chain together into a complete development workflow using Codex CLI. Each template
+The `examples/prompts/` directory includes three prompt templates that chain together into a complete development workflow using Codex CLI. Each template now loads the bundled `gpt-5-4-prompting` skill by default, falls back to `codex-5-3-prompting` when the user explicitly asks for Codex 5.3, and launches Codex in an interactive overlay.

 ### The Pipeline

@@ -347,14 +347,22 @@ Write a plan

 ### Installing the Templates

-
+Install the package first so pi can discover the bundled prompt and skill directories via the package metadata:
+
+```bash
+pi install npm:pi-interactive-shell
+```
+
+If you want your own slash commands and local skill copies, copy the examples into your agent config:

 ```bash
 # Prompt templates (slash commands)
 cp ~/.pi/agent/extensions/interactive-shell/examples/prompts/*.md ~/.pi/agent/prompts/

-#
+# Skills used by the templates
 cp -r ~/.pi/agent/extensions/interactive-shell/examples/skills/codex-cli ~/.pi/agent/skills/
+cp -r ~/.pi/agent/extensions/interactive-shell/examples/skills/gpt-5-4-prompting ~/.pi/agent/skills/
+cp -r ~/.pi/agent/extensions/interactive-shell/examples/skills/codex-5-3-prompting ~/.pi/agent/skills/
 ```

 ### Usage

@@ -388,9 +396,9 @@ Say you have a plan at `docs/auth-redesign-plan.md`:

 These templates demonstrate a "meta-prompt generation" pattern:

-1. **Pi gathers context** — reads the plan, runs git diff,
-2. **Pi generates a calibrated prompt** — tailored to the specific plan/diff, following the
-3. **Pi launches Codex in the overlay** —
+1. **Pi gathers context** — reads the plan, runs git diff, and loads the local `gpt-5-4-prompting` or `codex-5-3-prompting` skill
+2. **Pi generates a calibrated prompt** — tailored to the specific plan/diff, following the selected skill's best practices
+3. **Pi launches Codex in the overlay** — defaulting to `-m gpt-5.4 -a never` and switching to `-m gpt-5.3-codex -a never` only when the user explicitly asks for Codex 5.3

 The user watches Codex work in the overlay and can take over anytime (type to intervene, Ctrl+T to transfer output back to pi, Ctrl+Q for options).

package/SKILL.md
CHANGED

@@ -5,7 +5,7 @@ description: Cheat sheet + workflow for launching interactive coding-agent CLIs

 # Interactive Shell (Skill)

-Last verified: 2026-
+Last verified: 2026-03-12

 ## Foreground vs Background Subagents

@@ -84,6 +84,8 @@ interactive_shell({

 Dispatch defaults `autoExitOnQuiet: true`. The agent can still query the sessionId if needed, but doesn't have to.

+For fire-and-forget delegated runs (including QA-style delegated checks), prefer dispatch as the default mode.
+
 #### Background Dispatch (Headless)
 No overlay opens. Multiple headless dispatches can run concurrently:

@@ -163,7 +165,7 @@ interactive_shell({
   reason: "Security review",
   handsFree: { autoExitOnQuiet: true }
 })
-// Session auto-kills after ~
+// Session auto-kills after ~8s of quiet (after the startup grace period)
 // Read results from file:
 // read("/tmp/security-review.md")
 ```
package/background-widget.ts
ADDED

@@ -0,0 +1,76 @@
+import { truncateToWidth, visibleWidth } from "@mariozechner/pi-tui";
+import { formatDuration } from "./types.js";
+import type { ShellSessionManager } from "./session-manager.js";
+
+export function setupBackgroundWidget(
+  ctx: { ui: { setWidget: Function }; hasUI?: boolean },
+  sessionManager: ShellSessionManager,
+): (() => void) | null {
+  if (!ctx.hasUI) return null;
+
+  let durationTimer: ReturnType<typeof setInterval> | null = null;
+  let tuiRef: { requestRender: () => void } | null = null;
+
+  const requestRender = () => tuiRef?.requestRender();
+  const unsubscribe = sessionManager.onChange(() => {
+    manageDurationTimer();
+    requestRender();
+  });
+
+  function manageDurationTimer() {
+    const sessions = sessionManager.list();
+    const hasRunning = sessions.some((s) => !s.session.exited);
+    if (hasRunning && !durationTimer) {
+      durationTimer = setInterval(requestRender, 10_000);
+    } else if (!hasRunning && durationTimer) {
+      clearInterval(durationTimer);
+      durationTimer = null;
+    }
+  }
+
+  ctx.ui.setWidget(
+    "bg-sessions",
+    (tui: any, theme: any) => {
+      tuiRef = tui;
+      return {
+        render: (width: number) => {
+          const sessions = sessionManager.list();
+          if (sessions.length === 0) return [];
+          const cols = width || tui.terminal?.columns || 120;
+          const lines: string[] = [];
+          for (const s of sessions) {
+            const exited = s.session.exited;
+            const dot = exited ? theme.fg("dim", "○") : theme.fg("accent", "●");
+            const id = theme.fg("dim", s.id);
+            const cmd = s.command.replace(/\s+/g, " ").trim();
+            const truncCmd = cmd.length > 60 ? cmd.slice(0, 57) + "..." : cmd;
+            const reason = s.reason ? theme.fg("dim", ` · ${s.reason}`) : "";
+            const status = exited ? theme.fg("dim", "exited") : theme.fg("success", "running");
+            const duration = theme.fg("dim", formatDuration(Date.now() - s.startedAt.getTime()));
+            const oneLine = ` ${dot} ${id} ${truncCmd}${reason} ${status} ${duration}`;
+            if (visibleWidth(oneLine) <= cols) {
+              lines.push(oneLine);
+            } else {
+              lines.push(truncateToWidth(` ${dot} ${id} ${cmd}`, cols, "…"));
+              lines.push(truncateToWidth(`   ${status} ${duration}${reason}`, cols, "…"));
+            }
+          }
+          return lines;
+        },
+        invalidate: () => {},
+      };
+    },
+    { placement: "belowEditor" },
+  );
+
+  manageDurationTimer();
+
+  return () => {
+    unsubscribe();
+    if (durationTimer) {
+      clearInterval(durationTimer);
+      durationTimer = null;
+    }
+    ctx.ui.setWidget("bg-sessions", undefined);
+  };
+}
package/config.ts
CHANGED

@@ -1,6 +1,6 @@
 import { existsSync, readFileSync } from "node:fs";
-import { homedir } from "node:os";
 import { join } from "node:path";
+import { getAgentDir } from "@mariozechner/pi-coding-agent";

 export interface InteractiveShellConfig {
   exitAutoCloseDelay: number;
@@ -52,8 +52,8 @@ const DEFAULT_CONFIG: InteractiveShellConfig = {
   // Hands-free mode defaults
   handsFreeUpdateMode: "on-quiet" as const,
   handsFreeUpdateInterval: 60000,
-  handsFreeQuietThreshold:
-  autoExitGracePeriod:
+  handsFreeQuietThreshold: 8000,
+  autoExitGracePeriod: 15000,
   handsFreeUpdateMaxChars: 1500,
   handsFreeMaxTotalChars: 100000,
   // Query rate limiting (default 60 seconds between queries)
@@ -62,7 +62,7 @@ const DEFAULT_CONFIG: InteractiveShellConfig = {

 export function loadConfig(cwd: string): InteractiveShellConfig {
   const projectPath = join(cwd, ".pi", "interactive-shell.json");
-  const globalPath = join(
+  const globalPath = join(getAgentDir(), "interactive-shell.json");

   let globalConfig: Partial<InteractiveShellConfig> = {};
   let projectConfig: Partial<InteractiveShellConfig> = {};
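The "project overrides global" precedence that `loadConfig` documents can be sketched as ordered spreads, where later sources win; the merge shape below is an assumption about the implementation, not a quote of it, and only two of the config fields are shown:

```typescript
// Sketch of config precedence: defaults < global < project.
interface QuietConfig {
  handsFreeQuietThreshold: number;
  autoExitGracePeriod: number;
}

const defaults: QuietConfig = {
  handsFreeQuietThreshold: 8000,
  autoExitGracePeriod: 15000,
};

// e.g. parsed from the global and project interactive-shell.json files
const globalConfig: Partial<QuietConfig> = { autoExitGracePeriod: 30000 };
const projectConfig: Partial<QuietConfig> = { autoExitGracePeriod: 60000 };

// Later spreads win, so project values shadow global values, which
// shadow the built-in defaults.
const merged: QuietConfig = { ...defaults, ...globalConfig, ...projectConfig };
```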
package/examples/prompts/codex-implement-plan.md
CHANGED

@@ -1,23 +1,34 @@
 ---
 description: Launch Codex CLI in overlay to fully implement an existing plan/spec document
 ---
-
+Determine which prompting skill to load based on model:
+- Default: Load `gpt-5-4-prompting` skill (for `gpt-5.4`)
+- If user explicitly requests Codex 5.3: Load `codex-5-3-prompting` skill (for `gpt-5.3-codex`)
+
+Also load the `codex-cli` skill. Then read the plan at `$1`.

 Analyze the plan to understand: how many files are created vs modified, whether there's a prescribed implementation order or prerequisites, what existing code is referenced, and roughly how large the implementation is.

-Based on the prompting
+Based on the prompting skill's best practices and the plan's content, generate a comprehensive meta prompt tailored for Codex CLI. The meta prompt should instruct Codex to:

 1. Read and internalize the full plan document. Identify every file to be created, every file to be modified, and any prerequisites or ordering constraints.
 2. Before writing any code, read all existing files that will be modified — in full, not just the sections mentioned in the plan. Also read key files they import from or that import them, to absorb the surrounding patterns, naming conventions, and architecture.
 3. If the plan specifies an implementation order or prerequisites (e.g., "extract module X before building Y"), follow that order exactly. Otherwise, implement bottom-up: shared utilities and types first, then the modules that depend on them, then integration/registration code last.
 4. Implement each piece completely. No stubs, no TODOs, no placeholder comments, no "implement this later" shortcuts. Every function body, every edge case handler, every error path described in the plan must be real code.
 5. Match existing code patterns exactly — same formatting, same import style, same error handling conventions, same naming. Read the surrounding codebase to absorb these patterns before writing. If the plan references patterns from specific files (e.g., "same pattern as X"), read those files and replicate the pattern faithfully.
-6.
-7.
-8.
+6. Stay within scope. Do not refactor, rename, or restructure adjacent code that the plan does not mention. No "while I'm here" improvements. If something adjacent looks wrong, note it in the summary but do not touch it.
+7. Keep files reasonably sized. If a file grows beyond ~500 lines, split it as the plan describes or refactor into logical sub-modules.
+8. After implementing all files, do a self-review pass: re-read the plan from top to bottom and verify every requirement, every edge case, every design decision is addressed in the code. Check for: missing imports, type mismatches, unreachable code paths, inconsistent field names between modules, and any plan requirement that was overlooked.
+9. Do NOT commit or push. Write a summary listing every file created or modified, what was implemented in each, and any plan ambiguities that required judgment calls.
+
+The meta prompt should follow the prompting skill's patterns: clear system context, explicit scope and verbosity constraints, step-by-step instructions, and expected output format. Instruct Codex not to ask clarifying questions about things answerable by reading the plan or codebase — read first, then act. Keep progress updates brief and concrete (no narrating routine file reads or tool calls). Emphasize that the plan has already been thoroughly reviewed — the job is faithful execution, not second-guessing the design. Emphasize scope discipline and verification requirements per the prompting skill.
+
+Determine the model flag:
+- Default: `-m gpt-5.4`
+- If user explicitly requests Codex 5.3: `-m gpt-5.3-codex`

-
+Then launch Codex CLI in the interactive shell overlay with that meta prompt using the chosen model flag plus `-a never`.

-
+Use `interactive_shell` with `mode: "dispatch"` for this delegated run (fire-and-forget with completion notification). Do NOT pass sandbox flags in interactive_shell. Dispatch mode only. End turn immediately. Do not poll. Wait for completion notification.

 $@
package/examples/prompts/codex-review-impl.md
CHANGED

@@ -1,24 +1,35 @@
 ---
 description: Launch Codex CLI in overlay to review implemented code changes (optionally against a plan)
 ---
-
+Determine which prompting skill to load based on model:
+- Default: Load `gpt-5-4-prompting` skill (for `gpt-5.4`)
+- If user explicitly requests Codex 5.3: Load `codex-5-3-prompting` skill (for `gpt-5.3-codex`)
+
+Also load the `codex-cli` skill. Then determine the review scope:

 - If `$1` looks like a file path (contains `/` or ends in `.md`): read it as the plan/spec these changes were based on. The diff scope is uncommitted changes vs HEAD, or if clean, the current branch vs main.
 - Otherwise: no plan file. Diff scope is the same. Treat all of `$@` as additional review context or focus areas.

 Run the appropriate git diff to identify which files changed and how many lines are involved. This context helps you generate a better-calibrated meta prompt.

-Based on the prompting
+Based on the prompting skill's best practices, the diff scope, and the optional plan, generate a comprehensive meta prompt tailored for Codex CLI. The meta prompt should instruct Codex to:

 1. Identify all changed files via git diff, then read every changed file in full — not just the diff hunks. For each changed file, also read the files it imports from and key files that depend on it, to understand integration points and downstream effects.
 2. If a plan/spec was provided, read it and verify the implementation is complete — every requirement addressed, no steps skipped, nothing invented beyond scope, no partial stubs left behind.
 3. Review each changed file for: bugs, logic errors, race conditions, resource leaks (timers, event listeners, file handles, unclosed connections), null/undefined hazards, off-by-one errors, error handling gaps, type mismatches, dead code, unused imports/variables/parameters, unnecessary complexity, and inconsistency with surrounding code patterns and naming conventions.
 4. Trace key code paths end-to-end across function and file boundaries — verify data flows, state transitions, error propagation, and cleanup ordering. Don't evaluate functions in isolation.
 5. Check for missing or inadequate tests, stale documentation, and missing changelog entries.
-6. Fix every issue found with direct code edits.
+6. Fix every issue found with direct code edits. Keep fixes scoped to the actual issues identified — do not expand into refactoring or restructuring code that wasn't flagged in the review. If adjacent code looks problematic, note it in the summary but don't touch it.
+7. After all fixes, write a clear summary listing what was found, what was fixed, and any remaining concerns that require human judgment.
+
+The meta prompt should follow the prompting skill's patterns: clear system context, explicit scope and verbosity constraints, step-by-step instructions, and expected output format. Instruct Codex not to ask clarifying questions — if intent is unclear, read the surrounding code for context instead of asking. Keep progress updates brief and concrete (no narrating routine file reads or tool calls). Emphasize thoroughness — read the actual code deeply before making judgments, question every assumption, and never rubber-stamp. Emphasize scope discipline and verification requirements per the prompting skill.
+
+Determine the model flag:
+- Default: `-m gpt-5.4`
+- If user explicitly requests Codex 5.3: `-m gpt-5.3-codex`

-
+Then launch Codex CLI in the interactive shell overlay with that meta prompt using the chosen model flag plus `-a never`.

-
+Use `interactive_shell` with `mode: "dispatch"` for this delegated run (fire-and-forget with completion notification). Do NOT pass sandbox flags in interactive_shell. Dispatch mode only. End turn immediately. Do not poll. Wait for completion notification.

 $@
@@ -1,19 +1,29 @@
 ---
 description: Launch Codex CLI in overlay to review an implementation plan against the codebase
 ---
-
+Determine which prompting skill to load based on model:
+- Default: Load `gpt-5-4-prompting` skill (for `gpt-5.4`)
+- If user explicitly requests Codex 5.3: Load `codex-5-3-prompting` skill (for `gpt-5.3-codex`)
 
-
+Also load the `codex-cli` skill. Then read the plan at `$1`.
 
-
-2. Systematically review the plan against the reference docs/links/code
-3. Verify every assumption, file path, API shape, data flow, and integration point mentioned in the plan
-4. Check that the plan's approach is logically sound, complete, and accounts for edge cases
-5. Identify any gaps, contradictions, incorrect assumptions, or missing steps
-6. Make direct edits to the plan file to fix any issues found, adding inline notes where changes were made
+Based on the prompting skill's best practices and the plan's content, generate a comprehensive meta prompt tailored for Codex CLI. The meta prompt should instruct Codex to:
 
-
+1. Read and internalize the full plan. Then read every codebase file the plan references — in full, not just the sections mentioned. Also read key files adjacent to those (imports, dependents) to understand the real state of the code the plan targets.
+2. Systematically review the plan against what the code actually looks like, not what the plan assumes it looks like.
+3. Verify every assumption, file path, API shape, data flow, and integration point mentioned in the plan against the actual codebase.
+4. Check that the plan's approach is logically sound, complete, and accounts for edge cases.
+5. Identify any gaps, contradictions, incorrect assumptions, or missing steps.
+6. Make targeted edits to the plan file to fix issues found, adding inline notes where changes were made. Fix what's wrong — do not restructure or rewrite sections that are correct.
 
-
+The meta prompt should follow the prompting skill's patterns (clear system context, explicit constraints, step-by-step instructions, expected output format). Instruct Codex not to ask clarifying questions — read the codebase to resolve ambiguities instead of asking. Keep progress updates brief and concrete. Emphasize scope discipline and verification requirements per the prompting skill.
+
+Determine the model flag:
+- Default: `-m gpt-5.4`
+- If user explicitly requests Codex 5.3: `-m gpt-5.3-codex`
+
+Then launch Codex CLI in the interactive shell overlay with that meta prompt using the chosen model flag plus `-a never`.
+
+Use `interactive_shell` with `mode: "dispatch"` for this delegated run (fire-and-forget with completion notification). Do NOT pass sandbox flags in interactive_shell. Dispatch mode only. End turn immediately. Do not poll. Wait for completion notification.
 
 $@
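Both review prompts end by delegating the run to `interactive_shell` in dispatch mode. As a rough sketch, the resulting tool call might look like the following; the `mode: "dispatch"` value and the `-m`/`-a never` flags come from the prompts above, while the envelope shape and the `command` field name are illustrative assumptions, not the tool's documented schema:

```json
{
  "tool": "interactive_shell",
  "arguments": {
    "mode": "dispatch",
    "command": "codex -m gpt-5.4 -a never \"<generated meta prompt>\""
  }
}
```

In dispatch mode the caller ends its turn immediately and is notified when the delegated Codex run completes, so no polling loop is needed.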
@@ -0,0 +1,161 @@
+---
+name: codex-5-3-prompting
+description: How to write system prompts and instructions for GPT-5.3-Codex. Use when constructing or tuning prompts targeting Codex 5.3.
+---
+
+# GPT-5.3-Codex Prompting Guide
+
+GPT-5.3-Codex is fast, capable, and eager. It moves quickly and will skip reading, over-refactor, and drift scope if prompts aren't tight. Explicit constraints matter more than with GPT-5.2-Codex. Include the following blocks as needed when constructing system prompts.
+
+## Output shape
+
+Always include. Controls verbosity and response structure.
+
+```
+<output_verbosity_spec>
+- Default: 3-6 sentences or <=5 bullets for typical answers.
+- Simple yes/no questions: <=2 sentences.
+- Complex multi-step or multi-file tasks:
+  - 1 short overview paragraph
+  - then <=5 bullets tagged: What changed, Where, Risks, Next steps, Open questions.
+- Avoid long narrative paragraphs; prefer compact bullets and short sections.
+- Do not rephrase the user's request unless it changes semantics.
+</output_verbosity_spec>
+```
+
+## Scope constraints
+
+Always include. GPT-5.3-Codex will add features, refactor adjacent code, and invent UI elements if you don't fence it in.
+
+```
+<design_and_scope_constraints>
+- Explore any existing design systems and understand them deeply.
+- Implement EXACTLY and ONLY what the user requests.
+- No extra features, no added components, no UX embellishments.
+- Style aligned to the design system at hand.
+- Do NOT invent colors, shadows, tokens, animations, or new UI elements unless requested or necessary.
+- If any instruction is ambiguous, choose the simplest valid interpretation.
+</design_and_scope_constraints>
+```
+
+## Context loading
+
+Always include. GPT-5.3-Codex skips reading and starts writing if you don't force it.
+
+```
+<context_loading>
+- Read ALL files that will be modified -- in full, not just the sections mentioned in the task.
+- Also read key files they import from or that depend on them.
+- Absorb surrounding patterns, naming conventions, error handling style, and architecture before writing any code.
+- Do not ask clarifying questions about things that are answerable by reading the codebase.
+</context_loading>
+```
+
+## Plan-first mode
+
+Include for multi-file work, large refactors, or any task with ordering dependencies.
+
+```
+<plan_first>
+- Before writing any code, produce a brief implementation plan:
+  - Files to create vs. modify
+  - Implementation order and prerequisites
+  - Key design decisions and edge cases
+  - Acceptance criteria for "done"
+- Get the plan right first. Then implement step by step following the plan.
+- If the plan is provided externally, follow it faithfully -- the job is execution, not second-guessing the design.
+</plan_first>
+```
+
+## Long-context handling
+
+Include when inputs exceed ~10k tokens (multi-chapter docs, long threads, multiple PDFs).
+
+```
+<long_context_handling>
+- For inputs longer than ~10k tokens:
+  - First, produce a short internal outline of the key sections relevant to the task.
+  - Re-state the constraints explicitly before answering.
+  - Anchor claims to sections ("In the 'Data Retention' section...") rather than speaking generically.
+  - If the answer depends on fine details (dates, thresholds, clauses), quote or paraphrase them.
+</long_context_handling>
+```
+
+## Uncertainty and ambiguity
+
+Include when the task involves underspecified requirements or hallucination-prone domains.
+
+```
+<uncertainty_and_ambiguity>
+- If the question is ambiguous or underspecified:
+  - Ask up to 1-3 precise clarifying questions, OR
+  - Present 2-3 plausible interpretations with clearly labeled assumptions.
+- Never fabricate exact figures, line numbers, or external references when uncertain.
+- When unsure, prefer "Based on the provided context..." over absolute claims.
+</uncertainty_and_ambiguity>
+```
+
+## User updates
+
+Include for agentic / long-running tasks.
+
+```
+<user_updates_spec>
+- Send brief updates (1-2 sentences) only when:
+  - You start a new major phase of work, or
+  - You discover something that changes the plan.
+- Avoid narrating routine tool calls ("reading file...", "running tests...").
+- Each update must include at least one concrete outcome ("Found X", "Confirmed Y", "Updated Z").
+- Do not expand the task beyond what was asked; if you notice new work, call it out as optional.
+</user_updates_spec>
+```
+
+## Tool usage
+
+Include when the prompt involves tool-calling agents.
+
+```
+<tool_usage_rules>
+- Prefer tools over internal knowledge whenever:
+  - You need fresh or user-specific data (tickets, orders, configs, logs).
+  - You reference specific IDs, URLs, or document titles.
+- Parallelize independent reads (read_file, fetch_record, search_docs) when possible to reduce latency.
+- After any write/update tool call, briefly restate:
+  - What changed
+  - Where (ID or path)
+  - Any follow-up validation performed
+</tool_usage_rules>
+```
+
+## Reasoning effort
+
+Set `model_reasoning_effort` via Codex CLI: `-c model_reasoning_effort="high"`
+
+| Task type | Effort |
+|---|---|
+| Simple code generation, formatting | `low` or `medium` |
+| Standard implementation from clear specs | `high` |
+| Complex refactors, plan review, architecture | `xhigh` |
+| Code review (thorough) | `high` or `xhigh` |
+
+## Backwards compatibility hedging
+
+GPT-5.3-Codex has a strong tendency to preserve old patterns, add compatibility shims, and provide fallback code "just in case" -- even when explicitly told not to worry about backwards compatibility. Vague instructions like "don't worry about backwards compatibility" get interpreted weakly; the model may still hedge.
+
+Use **"cutover"** to signal a clean, irreversible break. It's a precise industry term that conveys finality and intentional deprecation -- no dual-support phase, no gradual migration, no preserving old behavior.
+
+Instead of:
+> "Rewrite this and don't worry about backwards compatibility"
+
+Say:
+> "This is a cutover. No backwards compatibility. Rewrite using only Python 3.12+ features and current best practices. Do not preserve legacy code, polyfills, or deprecated patterns."
+
+## Quick reference
+
+- **Force reading first.** "Read all necessary files before you ask any dumb question."
+- **Use plan mode.** Draft the full task with acceptance criteria before implementing.
+- **Steer aggressively mid-task.** GPT-5.3-Codex handles redirects without losing context. Be direct: "Stop. Fix the actual cause." / "Simplest valid implementation only."
+- **Constrain scope hard.** GPT-5.3-Codex will refactor aggressively if you don't fence it in.
+- **Watch context burn.** Faster model = faster context consumption. Start fresh at ~40%.
+- **Use domain jargon.** "Cutover," "golden-path," "no fallbacks," "domain split" get cleaner, faster responses.
+- **Download libraries locally.** Tell it to read them for better context than relying on training data.
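Putting the reasoning-effort setting together with the model and approval flags used elsewhere in this package, a minimal sketch of a full invocation follows. The flag names (`-m`, `-a never`, `-c model_reasoning_effort`) are the ones these files document; the command is composed into a variable purely for illustration, and the effort tier is an example choice from the table above:

```shell
# Compose a Codex CLI invocation for a complex plan review.
# -m selects the model, -a never disables approval prompts, and
# -c model_reasoning_effort picks the effort tier from the table above.
model="gpt-5.3-codex"
effort="xhigh"
cmd="codex -m $model -a never -c model_reasoning_effort=\"$effort\""
echo "$cmd"
```

Swapping `model="gpt-5.4"` gives the default-model variant the prompt files describe.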