agent-sh 0.3.0 → 0.4.0

package/README.md CHANGED
@@ -5,7 +5,7 @@
 
  Not a shell that lives in an agent — an agent that lives in a shell.
 
- agent-sh is a real terminal first. Every keystroke goes to a real PTY. `cd`, pipes, vim, job control — they all just work. But type `>` at the start of a line, and you're talking to an AI agent that has full context of what you've been doing: your working directory, recent commands, their output.
+ agent-sh is a real terminal first. Every keystroke goes to a real PTY. `cd`, pipes, vim, job control — they all just work. But type `?` or `>` at the start of a line, and you're talking to an AI agent that has full context of what you've been doing: your working directory, recent commands, their output.
 
  The agent connects via the [Agent Client Protocol (ACP)](https://agentclientprotocol.com/), so you can plug in **any** ACP-compatible agent: [pi](https://github.com/svkozak/pi-acp), claude-code, codex, gemini-cli, goose, etc.
 
@@ -13,15 +13,17 @@ The agent connects via the [Agent Client Protocol (ACP)](https://agentclientprot
  ⚡ src $ ls -la # real shell command
  ⚡ src $ cd ../tests && npm test # real cd, env, aliases — all just work
  ⚡ src $ vim file.ts # opens vim in the same PTY
- ⚡ src $ > refactor the auth middleware # sent to agent via ACP
- ⚡ src $ > explain the last error # agent sees your recent commands + output
+ ⚡ src $ ? explain the last error # query mode → agent investigates using its own tools
+ ⚡ src $ > deploy to staging # execute mode → agent runs it in your live shell
  ```
 
  ## Why shell-first?
 
- Most AI coding tools are agent-first: the LLM drives the experience and the shell is bolted on. That means no real PTY, no job control, no interactive commands, and fragile `cd` tracking that reimplements what bash gives you for free.
+ I live mostly in a terminal. I don't just want an agent that has access to my shell — I want a shell that has access to my agent.
 
- agent-sh starts from the opposite end. The shell is the primary interface — it's your terminal, not the agent's. The agent is a tool you reach for when you need it, not the other way around.
+ Most AI coding tools get this backwards: the LLM drives the experience and the shell is bolted on. That means no real PTY, no job control, no interactive commands, and fragile `cd` tracking that reimplements what bash gives you for free.
+
+ agent-sh starts from the opposite end. The shell is the primary interface — it's your terminal, not the agent's. The agent is a tool you reach for when you need it, not the other way around. Two modes give you fine-grained control: `?` for questions and tasks (agent uses its own tools), `>` for commands that run directly in your live shell.
 
  ### Why ACP?
 
@@ -40,6 +42,8 @@ The [Agent Client Protocol](https://agentclientprotocol.com/) decouples the shel
  - **Real-time Streaming** — Agent responses stream live with syntax highlighting
  - **Zero Latency** — Direct PTY access, full terminal compatibility
  - **Context Aware** — Agent sees your cwd, recent commands, and their output
+ - **Dual Input Modes** — `?` for questions/tasks (agent tools), `>` for live shell execution
+ - **Extensible Modes** — Extensions can register custom input modes with their own triggers
  - **Multiple Agents** — Easy switching between pi-acp, claude, and other ACP agents
  - **Inline Diff Preview** — File writes show syntax-highlighted diffs inline (Ctrl+O to expand)
  - **Thinking Display** — Toggle agent thinking/reasoning text with Ctrl+T
@@ -67,22 +71,34 @@ See the [Usage Guide](docs/usage.md) for all options, model configuration, and e
 
  ## Input Modes
 
+ agent-sh has two agent input modes, each triggered by a single character at the start of an empty line:
+
+ | Trigger | Mode | Behavior |
+ |---|---|---|
+ | `?` | **Query** | Agent uses its own tools (bash, file read/write, search) to investigate and answer. Stays in query mode after each response. |
+ | `>` | **Execute** | Agent runs a command in your live shell via `user_shell`. Your aliases, env vars, and cwd apply. Returns to shell after execution. |
+
+ Regular shell input works as before — commands go straight to the PTY:
+
  | Input | Behavior |
  |---|---|
  | `ls -la` | Runs in real shell (PTY), output displayed normally |
  | `cd src && make` | Real shell — cd, env, aliases all just work |
  | `vim file.ts` | Opens vim in the same PTY, no hacks needed |
- | `> refactor this fn` | Sends to agent via ACP, streams response inline |
- | `> /help` | Shows available slash commands |
+ | `? refactor this fn` | Query mode — agent investigates and responds |
+ | `> restart the server` | Execute mode — agent runs it in your live shell |
+ | `? /help` | Shows available slash commands (works in either mode) |
  | `Ctrl-C` | Standard signal to shell, or cancels active agent response |
  | `Ctrl-O` | Expand/collapse truncated diff preview |
  | `Ctrl-T` | Toggle thinking/reasoning text display |
  | `Shift-Tab` | Cycle thinking level (off → minimal → low → medium → high → xhigh) |
- | `Escape` | Exit agent input mode (when typing after `>`) |
+ | `Escape` | Exit agent input mode |
+
+ Modes are extensible — extensions can register new modes via the `input-mode:register` event (see [Extensions](docs/extensions.md#custom-input-modes)).
 
  ### Agent Input Keybindings
 
- When typing after `>`, full readline-style keybindings are available:
+ When typing in either agent mode (`?` or `>`), full readline-style keybindings are available:
 
  | Key | Action |
  |---|---|
@@ -103,10 +119,11 @@ When typing after `>`, full readline-style keybindings are available:
 
  ### Thinking Level
 
- The agent prompt shows the current thinking level next to the model name:
+ The agent prompt shows the current thinking level next to the model name, with a mode-specific indicator:
 
  ```
- pi (claude-3.5-sonnet) [medium]
+ pi (claude-sonnet-4-6) [medium] # query mode
+ pi (claude-sonnet-4-6) [medium] ● ⟩ # execute mode
  ```
 
  Press **Shift-Tab** in agent input mode to cycle through levels. The levels are advertised by the agent via ACP session modes — different agents may offer different options. The spinner label reflects the mode: "Thinking" when thinking is enabled, "Working" when it's off.
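The README hunks above describe a single trigger character deciding where a line goes: `?` for query mode, `>` for execute mode, anything else straight to the PTY. A minimal sketch of that dispatch follows. It classifies a whole line at once, whereas the README says the trigger is typed at the start of an empty line, so this is a simplification. `classifyInput`, `ClassifiedInput`, and `TRIGGERS` are hypothetical names, not part of the agent-sh API.

```typescript
// Hypothetical sketch: these names are illustrative and do not appear in the diff.
type InputMode = "shell" | "query" | "execute";

interface ClassifiedInput {
  mode: InputMode;
  text: string; // the line without the trigger character and the whitespace after it
}

// Trigger characters taken from the README's Input Modes table.
const TRIGGERS = new Map<string, InputMode>([
  ["?", "query"],
  [">", "execute"],
]);

function classifyInput(line: string): ClassifiedInput {
  const mode = TRIGGERS.get(line.charAt(0));
  if (mode === undefined) {
    // No trigger: treat the whole line as regular shell input.
    return { mode: "shell", text: line };
  }
  return { mode, text: line.slice(1).trimStart() };
}
```

Keeping the triggers in a map rather than a chain of comparisons leaves room for additional modes, which is the direction the `input-mode:register` event points in.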
@@ -30,7 +30,12 @@ export declare class AcpClient {
      /**
       * Send a user query to the agent.
       */
-     sendPrompt(query: string): Promise<void>;
+     private firstPromptSent;
+     private static readonly SESSION_ORIENTATION;
+     sendPrompt(query: string, opts?: {
+         modeInstruction?: string;
+         modeLabel?: string;
+     }): Promise<void>;
      /**
       * Silently cancel the prompt after a shell tool completes.
       * Unlike user-initiated cancel(), this doesn't show "(cancelled)" —
@@ -21,7 +21,7 @@ export class AcpClient {
      terminalDonePromises = new Map();
      terminalCounter = 0;
      fileWatcher;
-     pendingToolCalls = new Map(); // toolCallId → title
+     pendingToolCalls = new Map();
      autoCancelled = false;
      pendingToolCounter = 0;
      agentInfo = null;
@@ -129,7 +129,29 @@ export class AcpClient {
      /**
       * Send a user query to the agent.
       */
-     async sendPrompt(query) {
+     firstPromptSent = false;
+     static SESSION_ORIENTATION = [
+         "You are running inside agent-sh, a terminal wrapper that gives the user two interaction modes:",
+         "",
+         "QUERY mode (triggered by '?'): The user is asking questions or requesting tasks.",
+         "Use your internal tools (bash, file operations, etc.) to accomplish tasks.",
+         "Do NOT use user_shell in this mode.",
+         "",
+         "EXECUTE mode (triggered by '>'): The user wants a command run in their live shell session.",
+         "You may use shell_recall to understand previous context and your own tools to investigate,",
+         "but the final action must be sending the command via user_shell,",
+         "which executes in the user's actual shell (with their aliases, env vars, and cwd).",
+         "Do not explain or ask for confirmation — just run it.",
+         "",
+         "Each prompt includes a per-query mode instruction — follow it.",
+         "",
+         "Available tools:",
+         "- user_shell: Runs commands in the user's live shell session (their PTY). Use in EXECUTE mode.",
+         "- shell_recall: Retrieves recent shell command history and output from the user's session.",
+         " Use this to understand what the user has been doing before answering questions.",
+         "- Your standard tools (bash, file read/write, etc.): Use in QUERY mode.",
+     ].join("\n");
+     async sendPrompt(query, opts) {
          if (!this.connection || !this.sessionId) {
              this.bus.emit("agent:error", { message: "Not connected to agent" });
              return;
@@ -141,24 +163,25 @@ export class AcpClient {
          this.autoCancelled = false;
          let cancelled = false;
          // Emit agent query event (TUI renders echo+spinner, ContextManager records it)
-         this.bus.emit("agent:query", { query });
+         this.bus.emit("agent:query", { query, modeLabel: opts?.modeLabel });
          // Build structured context from ContextManager
          const contextBlock = this.contextManager.getContext();
          try {
              this.log("sending prompt...");
-             const promptTimeoutMs = 300000; // 5 minutes timeout for LLM response
-             const response = await Promise.race([
-                 this.connection.prompt({
-                     sessionId: this.sessionId,
-                     prompt: [
-                         {
-                             type: "text",
-                             text: contextBlock + "\n" + query,
-                         },
-                     ],
-                 }),
-                 new Promise((_, reject) => setTimeout(() => reject(new Error(`Prompt timeout after ${promptTimeoutMs}ms`)), promptTimeoutMs)),
-             ]);
+             const promptContent = [];
+             // Send session orientation on first prompt
+             if (!this.firstPromptSent) {
+                 promptContent.push({ type: "text", text: AcpClient.SESSION_ORIENTATION });
+                 this.firstPromptSent = true;
+             }
+             if (opts?.modeInstruction) {
+                 promptContent.push({ type: "text", text: opts.modeInstruction });
+             }
+             promptContent.push({ type: "text", text: contextBlock + "\n" + query });
+             const response = await this.connection.prompt({
+                 sessionId: this.sessionId,
+                 prompt: promptContent,
+             });
              this.log(`prompt resolved: stopReason=${response.stopReason}`);
              if (response.stopReason === "cancelled") {
                  cancelled = true;
@@ -176,7 +199,7 @@ export class AcpClient {
          finally {
              this.log("restoring shell mode");
              if (!cancelled) {
-                 this.bus.emit("agent:response-done", {
+                 this.bus.emitTransform("agent:response-done", {
                      response: this.currentResponseText,
                  });
              }
@@ -244,6 +267,7 @@ export class AcpClient {
          this.sessionId = sessionResponse.sessionId;
          this.lastResponseText = "";
          this.currentResponseText = "";
+         this.firstPromptSent = false;
          this.updateModes(sessionResponse);
      }
      /**
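The `sendPrompt` hunks above assemble the prompt from up to three text blocks: the session orientation (first prompt of a session only), an optional per-query mode instruction, and the context block joined to the query. The sketch below isolates that assembly so the ordering and the once-per-session behaviour are easier to check. `PromptBuilder`, `build`, and `resetSession` are hypothetical names; the real logic lives inline in `AcpClient`.

```typescript
// Hypothetical sketch: PromptBuilder is an illustrative stand-in, not a class in agent-sh.
interface TextBlock {
  type: "text";
  text: string;
}

interface PromptOptions {
  modeInstruction?: string;
  modeLabel?: string;
}

class PromptBuilder {
  private firstPromptSent = false;
  private orientation: string;

  constructor(orientation: string) {
    this.orientation = orientation;
  }

  build(query: string, contextBlock: string, opts?: PromptOptions): TextBlock[] {
    const content: TextBlock[] = [];
    // The orientation is sent once, on the first prompt of a session.
    if (!this.firstPromptSent) {
      content.push({ type: "text", text: this.orientation });
      this.firstPromptSent = true;
    }
    // The mode instruction, when present, precedes the context and query.
    if (opts?.modeInstruction) {
      content.push({ type: "text", text: opts.modeInstruction });
    }
    content.push({ type: "text", text: contextBlock + "\n" + query });
    return content;
  }

  // Mirrors the hunk that clears firstPromptSent when a new session starts.
  resetSession(): void {
    this.firstPromptSent = false;
  }
}
```

Note that the same hunk also drops the previous five-minute `Promise.race` timeout, so in 0.4.0 the call waits for `connection.prompt` to settle on its own.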
@@ -327,8 +351,15 @@ export class AcpClient {
      createClientHandler() {
          return {
              // Required: handle session update notifications (streaming)
+             // Errors must not propagate — the ACP SDK returns them as error
+             // responses to the agent, which can stall the stream.
              sessionUpdate: async (params) => {
-                 this.handleSessionUpdate(params);
+                 try {
+                     this.handleSessionUpdate(params);
+                 }
+                 catch (err) {
+                     this.log(`Error in sessionUpdate handler: ${err instanceof Error ? err.stack : err}`);
+                 }
              },
              // Required: handle permission requests
              requestPermission: async (params) => {
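The `sessionUpdate` change above contains handler errors so they are logged rather than returned to the agent. The same pattern can be written as a small wrapper, sketched here under the assumption that the handler is synchronous, as `handleSessionUpdate` appears to be in the hunk. `containErrors` and `Logger` are hypothetical names.

```typescript
// Hypothetical sketch: containErrors is an illustrative helper, not part of agent-sh.
type Logger = (message: string) => void;

// Wrap a synchronous handler so that a thrown error is logged instead of propagating.
function containErrors<T>(
  handler: (params: T) => void,
  log: Logger,
): (params: T) => Promise<void> {
  return async (params: T): Promise<void> => {
    try {
      handler(params);
    } catch (err) {
      log(`Error in sessionUpdate handler: ${err instanceof Error ? err.stack : String(err)}`);
    }
  };
}
```

The returned promise always resolves, which is the property the comment in the hunk asks for: a failed render must not turn into an error response on the protocol stream.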
@@ -370,40 +401,53 @@ export class AcpClient {
                  const content = update.content;
                  if (content.type === "text") {
                      this.currentResponseText += content.text;
-                     this.bus.emit("agent:response-chunk", { text: content.text });
+                     this.bus.emitTransform("agent:response-chunk", { text: content.text });
                  }
                  break;
              }
              case "agent_thought_chunk": {
                  const thought = update.content;
                  if (thought.type === "text" && thought.text) {
-                     this.bus.emit("agent:thinking-chunk", { text: thought.text });
+                     this.bus.emitTransform("agent:thinking-chunk", { text: thought.text });
                  }
                  break;
              }
              case "tool_call": {
                  const toolId = update.toolCallId || `tool-${this.pendingToolCounter++}`;
-                 this.pendingToolCalls.set(toolId, update.title ?? "");
-                 this.bus.emit("agent:tool-started", {
+                 const payload = {
                      title: update.title,
                      toolCallId: toolId,
                      kind: update.kind ?? undefined,
                      locations: update.locations?.map((l) => ({ path: l.path, line: l.line })),
                      rawInput: update.rawInput,
+                 };
+                 const defer = this.pendingToolCalls.size > 0;
+                 this.pendingToolCalls.set(toolId, {
+                     title: update.title ?? "",
+                     deferredPayload: defer ? payload : undefined,
                  });
+                 if (!defer) {
+                     this.bus.emit("agent:tool-started", payload);
+                 }
                  break;
              }
              case "tool_call_update": {
                  const toolId = update.toolCallId;
-                 const toolTitle = toolId ? this.pendingToolCalls.get(toolId) : undefined;
+                 const toolInfo = toolId ? this.pendingToolCalls.get(toolId) : undefined;
+                 const toolTitle = toolInfo?.title;
                  if (update.status === "completed" || update.status === "failed") {
+                     // Emit deferred tool-started before output (parallel tools)
+                     if (toolInfo?.deferredPayload) {
+                         this.bus.emit("agent:tool-started", toolInfo.deferredPayload);
+                         toolInfo.deferredPayload = undefined;
+                     }
                      // Show content only on final status. Skip tools whose output the
                      // user already sees (user_shell → PTY) or is agent-only (shell_recall).
                      const skipOutput = toolTitle === "user_shell" || toolTitle === "shell_recall";
                      if (!skipOutput && update.content && Array.isArray(update.content)) {
                          for (const block of update.content) {
                              if (block.type === "content" && block.content?.type === "text" && block.content.text) {
-                                 this.bus.emit("agent:tool-output-chunk", { chunk: block.content.text });
+                                 this.bus.emitTransform("agent:tool-output-chunk", { chunk: block.content.text });
                              }
                          }
                      }
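The `tool_call` and `tool_call_update` changes above defer the `agent:tool-started` event whenever another tool call is already pending, then emit it just before that tool's final output. A minimal sketch of the bookkeeping follows. `ToolCallTracker` and its method names are hypothetical, and removing the entry on completion is an assumption, since the hunk does not show where entries leave `pendingToolCalls`.

```typescript
// Hypothetical sketch: ToolCallTracker is illustrative and does not exist in agent-sh.
interface ToolStartedPayload {
  toolCallId: string;
  title: string;
}

interface PendingTool {
  title: string;
  deferredPayload: ToolStartedPayload | undefined;
}

class ToolCallTracker {
  private pending = new Map<string, PendingTool>();
  private emitStarted: (payload: ToolStartedPayload) => void;

  constructor(emitStarted: (payload: ToolStartedPayload) => void) {
    this.emitStarted = emitStarted;
  }

  onToolCall(payload: ToolStartedPayload): void {
    // Defer the announcement when another tool call is already in flight.
    const defer = this.pending.size > 0;
    this.pending.set(payload.toolCallId, {
      title: payload.title,
      deferredPayload: defer ? payload : undefined,
    });
    if (!defer) {
      this.emitStarted(payload);
    }
  }

  onToolFinished(toolCallId: string): void {
    const info = this.pending.get(toolCallId);
    // Announce a deferred tool right before its output would be shown.
    if (info?.deferredPayload) {
      this.emitStarted(info.deferredPayload);
      info.deferredPayload = undefined;
    }
    // Assumption: the entry is dropped once the tool reaches a final status.
    this.pending.delete(toolCallId);
  }
}
```

With two parallel tools, the first is announced immediately and the second only when it completes, so each tool's header stays adjacent to its own output.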
package/dist/core.js CHANGED
@@ -20,17 +20,21 @@ import { EventBus } from "./event-bus.js";
  import { ContextManager } from "./context-manager.js";
  import { AcpClient } from "./acp-client.js";
  import { setPalette } from "./utils/palette.js";
+ import * as streamTransform from "./utils/stream-transform.js";
+ import * as settingsMod from "./settings.js";
+ import { HandlerRegistry } from "./utils/handler-registry.js";
  // Re-export types that library consumers need
  export { EventBus } from "./event-bus.js";
  export { palette, setPalette, resetPalette } from "./utils/palette.js";
  export function createCore(config) {
      const bus = new EventBus();
+     const handlers = new HandlerRegistry();
      const contextManager = new ContextManager(bus);
      const client = new AcpClient({ bus, contextManager, config });
      let connected = false;
      // Route frontend events to the agent — any frontend (Shell, WebSocket,
      // REST handler, test harness) can emit these without knowing about AcpClient.
-     bus.on("agent:submit", ({ query }) => {
+     bus.on("agent:submit", ({ query, modeInstruction, modeLabel }) => {
          (async () => {
              // Wait briefly for agent connection if start() is still in progress
              if (!connected) {
@@ -42,7 +46,7 @@ export function createCore(config) {
                  bus.emit("ui:error", { message: "Agent not connected. Please wait a moment and try again." });
                  return;
              }
-             await client.sendPrompt(query);
+             await client.sendPrompt(query, { modeInstruction, modeLabel });
          })().catch((err) => {
              bus.emit("agent:error", {
                  message: err instanceof Error ? err.message : String(err),
@@ -67,6 +71,12 @@ export function createCore(config) {
                  getAcpClient: () => client,
                  quit: opts.quit,
                  setPalette,
+                 createBlockTransform: (o) => streamTransform.createBlockTransform(bus, o),
+                 createFencedBlockTransform: (o) => streamTransform.createFencedBlockTransform(bus, o),
+                 getExtensionSettings: settingsMod.getExtensionSettings,
+                 define: (name, fn) => handlers.define(name, fn),
+                 advise: (name, wrapper) => handlers.advise(name, wrapper),
+                 call: (name, ...args) => handlers.call(name, ...args),
              };
          },
          kill() {
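The extension context above gains `define`, `advise`, and `call`, all delegating to a `HandlerRegistry` whose implementation is not part of this diff. The sketch below shows one plausible reading of those three names, in which `advise` wraps an existing named handler. Everything here is an assumption: the class `SketchHandlerRegistry`, the wrapper signature `(next, ...args)`, and the error on unknown names are illustrative guesses, not the package's behaviour.

```typescript
// Hypothetical sketch: a guess at define/advise/call semantics, not agent-sh's HandlerRegistry.
type Handler = (...args: unknown[]) => unknown;
type Advice = (next: Handler, ...args: unknown[]) => unknown;

class SketchHandlerRegistry {
  private handlers = new Map<string, Handler>();

  // Register (or replace) a named handler.
  define(name: string, fn: Handler): void {
    this.handlers.set(name, fn);
  }

  // Wrap the current handler; the wrapper decides whether and how to call it.
  advise(name: string, wrapper: Advice): void {
    const original = this.handlers.get(name);
    if (!original) {
      throw new Error(`No handler named "${name}"`);
    }
    this.handlers.set(name, (...args: unknown[]) => wrapper(original, ...args));
  }

  // Invoke a named handler with the given arguments.
  call(name: string, ...args: unknown[]): unknown {
    const fn = this.handlers.get(name);
    if (!fn) {
      throw new Error(`No handler named "${name}"`);
    }
    return fn(...args);
  }
}
```

Under this reading, an extension could change a behaviour defined by another extension without either one holding a reference to the other, which matches how the three functions are exposed on the shared context.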
@@ -22,16 +22,21 @@ export interface ShellEvents {
      "shell:agent-exec-done": Record<string, never>;
      "agent:submit": {
          query: string;
+         modeInstruction?: string;
+         modeLabel?: string;
      };
      "agent:cancel-request": Record<string, never>;
+     "input-mode:register": import("./types.js").InputModeConfig;
      "agent:query": {
          query: string;
+         modeLabel?: string;
      };
      "agent:thinking-chunk": {
          text: string;
      };
      "agent:response-chunk": {
          text: string;
+         blocks?: ContentBlock[];
      };
      "agent:response-done": {
          response: string;
@@ -126,6 +131,20 @@ export interface ShellEvents {
          }[];
      };
  }
+ export type ContentBlock = {
+     type: "text";
+     text: string;
+ } | {
+     type: "code-block";
+     language: string;
+     code: string;
+ } | {
+     type: "image";
+     data: Buffer;
+ } | {
+     type: "raw";
+     escape: string;
+ };
  type Listener<T> = (payload: T) => void;
  type PipeListener<T> = (payload: T) => T;
  type AsyncPipeListener<T> = (payload: T) => T | Promise<T>;
@@ -145,6 +164,13 @@ export declare class EventBus {
      off<K extends keyof ShellEvents>(event: K, fn: Listener<ShellEvents[K]>): void;
      /** Emit a fire-and-forget event. */
      emit<K extends keyof ShellEvents>(event: K, payload: ShellEvents[K]): void;
+     /**
+      * Transform-then-notify: run the payload through any registered pipe
+      * listeners (transforms), then emit the final result to regular `on`
+      * listeners (renderers). This enables content pipelines where extensions
+      * modify data (e.g. render LaTeX → terminal image) before renderers see it.
+      */
+     emitTransform<K extends keyof ShellEvents>(event: K, payload: ShellEvents[K]): void;
      /** Register a transform listener for a pipeline event. */
      onPipe<K extends keyof ShellEvents>(event: K, fn: PipeListener<ShellEvents[K]>): void;
      /**
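The new `ContentBlock` type above is a discriminated union on `type`, so a renderer can switch over it and let the compiler check that every variant is handled. A minimal sketch follows. `SketchBlock` copies the union but uses `Uint8Array` in place of `Buffer` to stay free of Node typings, and `describeBlock` with its output format is a hypothetical helper, not how agent-sh renders blocks.

```typescript
// Hypothetical sketch: mirrors ContentBlock from the hunk, with Uint8Array standing in for Buffer.
type SketchBlock =
  | { type: "text"; text: string }
  | { type: "code-block"; language: string; code: string }
  | { type: "image"; data: Uint8Array }
  | { type: "raw"; escape: string };

// Produce a plain-text description of a block; the format is illustrative only.
function describeBlock(block: SketchBlock): string {
  switch (block.type) {
    case "text":
      return block.text;
    case "code-block":
      return "[" + block.language + "]\n" + block.code;
    case "image":
      return `[image: ${block.data.length} bytes]`;
    case "raw":
      // Raw blocks carry an escape sequence that is passed through untouched.
      return block.escape;
  }
}
```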
package/dist/event-bus.js CHANGED
@@ -21,6 +21,16 @@ export class EventBus {
      emit(event, payload) {
          this.emitter.emit(event, payload);
      }
+     /**
+      * Transform-then-notify: run the payload through any registered pipe
+      * listeners (transforms), then emit the final result to regular `on`
+      * listeners (renderers). This enables content pipelines where extensions
+      * modify data (e.g. render LaTeX → terminal image) before renderers see it.
+      */
+     emitTransform(event, payload) {
+         const transformed = this.emitPipe(event, payload);
+         this.emitter.emit(event, transformed);
+     }
      /** Register a transform listener for a pipeline event. */
      onPipe(event, fn) {
          let listeners = this.pipeListeners.get(event);
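`emitTransform` above runs a payload through the pipe listeners and then hands the result to the regular listeners. The sketch below reduces that to a single-event bus so the ordering is visible. `SketchBus` is a hypothetical stand-in for `EventBus`, and its `emitPipe` assumes transforms run in registration order, each receiving the previous one's output; the real `emitPipe` is not shown in this diff.

```typescript
// Hypothetical sketch: a single-event stand-in for EventBus, not the package's implementation.
type Listener<T> = (payload: T) => void;
type PipeListener<T> = (payload: T) => T;

class SketchBus<T> {
  private listeners: Listener<T>[] = [];
  private pipeListeners: PipeListener<T>[] = [];

  on(fn: Listener<T>): void {
    this.listeners.push(fn);
  }

  onPipe(fn: PipeListener<T>): void {
    this.pipeListeners.push(fn);
  }

  // Assumption: transforms are applied in registration order.
  emitPipe(payload: T): T {
    let current = payload;
    for (const fn of this.pipeListeners) {
      current = fn(current);
    }
    return current;
  }

  // Transform first, then notify renderers with the transformed payload.
  emitTransform(payload: T): void {
    const transformed = this.emitPipe(payload);
    for (const fn of this.listeners) {
      fn(transformed);
    }
  }
}
```

A renderer registered with `on` therefore never sees the untransformed chunk, which is what lets an extension rewrite streamed text before it reaches the terminal.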
@@ -1,2 +1,2 @@
  import type { ExtensionContext } from "../types.js";
- export default function activate({ bus, getAcpClient }: ExtensionContext): void;
+ export default function activate(ctx: ExtensionContext): void;