agent-sh 0.5.0 → 0.7.0 (npm)

Files changed (54)

package/README.md +12 -43
package/dist/agent/agent-loop.d.ts +1 -0
package/dist/agent/agent-loop.js +119 -26
package/dist/agent/subagent.js +3 -1
package/dist/agent/system-prompt.d.ts +1 -1
package/dist/agent/system-prompt.js +21 -16
package/dist/agent/tools/bash.js +10 -1
package/dist/agent/tools/display.d.ts +13 -0
package/dist/agent/tools/display.js +70 -0
package/dist/agent/tools/edit-file.js +60 -7
package/dist/agent/tools/glob.js +39 -7
package/dist/agent/tools/grep.js +111 -20
package/dist/agent/tools/ls.js +31 -2
package/dist/agent/tools/read-file.d.ts +9 -1
package/dist/agent/tools/read-file.js +50 -4
package/dist/agent/tools/user-shell.js +40 -13
package/dist/agent/tools/write-file.js +9 -1
package/dist/agent/types.d.ts +35 -1
package/dist/context-manager.d.ts +3 -1
package/dist/context-manager.js +11 -1
package/dist/core.d.ts +1 -3
package/dist/core.js +23 -12
package/dist/event-bus.d.ts +41 -3
package/dist/extension-loader.d.ts +1 -1
package/dist/extension-loader.js +1 -3
package/dist/extensions/overlay-agent.d.ts +11 -0
package/dist/extensions/overlay-agent.js +43 -0
package/dist/extensions/terminal-buffer.d.ts +14 -0
package/dist/extensions/terminal-buffer.js +120 -0
package/dist/extensions/tui-renderer.js +344 -83
package/dist/index.js +45 -36
package/dist/input-handler.js +10 -3
package/dist/output-parser.js +8 -0
package/dist/settings.js +1 -1
package/dist/shell.d.ts +5 -0
package/dist/shell.js +29 -4
package/dist/types.d.ts +13 -0
package/dist/utils/diff.js +10 -0
package/dist/utils/floating-panel.d.ts +198 -0
package/dist/utils/floating-panel.js +590 -0
package/dist/utils/markdown.d.ts +1 -0
package/dist/utils/markdown.js +23 -1
package/dist/utils/output-writer.d.ts +14 -0
package/dist/utils/output-writer.js +16 -0
package/dist/utils/terminal-buffer.d.ts +65 -0
package/dist/utils/terminal-buffer.js +166 -0
package/dist/utils/tool-display.d.ts +4 -0
package/dist/utils/tool-display.js +22 -5
package/examples/extensions/claude-code-bridge/index.ts +8 -12
package/examples/extensions/overlay-agent.ts +70 -0
package/examples/extensions/pi-bridge/index.ts +10 -12
package/examples/extensions/secret-guard.ts +100 -0
package/examples/extensions/terminal-buffer.ts +184 -0
package/package.json +5 -1

package/README.md CHANGED

@@ -7,23 +7,23 @@ Not a shell that lives in an agent — an agent that lives in a shell.
 I live in a terminal. I don't want an agent that can run shell commands when it needs to — I want my shell, with an agent I can reach for when *I* need to. Most AI tools get this backwards: the LLM drives the experience and the shell is bolted on as an afterthought. No real PTY, no job control, no vim, fragile `cd` tracking. The agent is the main character and your terminal is a prop.
-agent-sh flips this. It's your shell first — full PTY, your rc config, your aliases, everything just works. But type `?` or `>` at the start of a line, and you're talking to an agent that has full context of what you've been doing.
+agent-sh flips this. It's your shell first — full PTY, your rc config, your aliases, everything just works. But type `>` at the start of a line, and you're talking to an agent that has full context of what you've been doing.
 ```
 ⚡ src $ ls -la                          # real shell command
 ⚡ src $ cd ../tests && npm test          # real cd, env, aliases — all just work
 ⚡ src $ vim file.ts                      # opens vim in the same PTY
-⚡ src $ > explain the last error          # execute mode → agent investigates using its own tools
-⚡ src $ ? deploy to staging              # help mode → agent runs it in your live shell
+⚡ src $ > explain the last error          # agent investigates using its own tools
+⚡ src $ > deploy to staging              # agent runs it in your live shell
 ```
 ## Key Features
 **Real terminal, zero compromise.** Full PTY with your shell config, aliases, and environment. Shell starts instantly — the agent connects asynchronously in the background.
-**Context-aware agent.** Every query includes your cwd, recent commands, and their output. Run a failing test, type `? fix this`, and the agent knows exactly what happened. It has built-in tools for file read/write/edit, bash, grep, glob — no external setup needed.
+**Context-aware agent.** Every query includes your cwd, recent commands, and their output. Run a failing test, type `> fix this`, and the agent knows exactly what happened. It has built-in tools for file read/write/edit, bash, grep, glob — no external setup needed.
-**Two input modes.** `>` for questions and tasks — the agent investigates using its own isolated tools. `?` for commands that run directly in your live shell, affecting your real environment. The agent knows which mode it's in and behaves accordingly.
+**Agent decides how to help.** One entry point (`>`), three tool categories. The agent uses scratchpad tools to investigate, `display` to show you output, and `user_shell` for commands with lasting effects. No need to pick a mode — the agent reasons about which tools to use based on your intent.
 **Any LLM, any backend.** Works with any OpenAI-compatible API out of the box. Define multiple providers in settings and cycle between models at runtime with Shift+Tab. Or swap in a completely different agent — [Claude Code](examples/extensions/claude-code-bridge/) and [pi](examples/extensions/pi-bridge/) run as drop-in backend extensions.
@@ -42,14 +42,15 @@ Set `OPENAI_API_KEY` in your environment (or configure providers in `~/.agent-sh
 Requires Node.js 18+.
-## Input Modes
+## Agent Mode
-- **`>` Execute mode** — Agent uses its own tools (bash, file read/write, search) to investigate and answer. Stays in execute mode for follow-ups.
-- **`?` Help mode** — Agent runs a command in your live shell. Your aliases, env vars, and cwd apply. Returns to shell after.
+Type `>` at the start of a line to talk to the agent. The agent decides how to help:
-Everything else works as a normal shell — commands go straight to the PTY. Modes are extensible — see [Extensions: Custom Input Modes](docs/extensions.md#custom-input-modes).
+- **Scratchpad tools** (`bash`, `read_file`, `grep`, `glob`, etc.) — for investigation. Output goes to the agent, not your terminal.
+- **`display`** — shows output in your terminal (e.g. `cat`, `git log`). You see it; the agent doesn't process it.
+- **`user_shell`** — runs commands with lasting effects (`cd`, `npm install`, etc.) in your live shell.
-> **Why `>` for the main mode?** `>` is easy to type and the most common interaction — asking the agent to do things. `?` is reserved for when you need the agent to run something directly in your live shell.
+Everything else works as a normal shell — commands go straight to the PTY. Input modes are extensible — see [Extensions: Custom Input Modes](docs/extensions.md#custom-input-modes).
 ### Slash Commands
@@ -61,39 +62,7 @@ Everything else works as a normal shell — commands go straight to the PTY. Mod
 ## Configuration
-Configure via `~/.agent-sh/settings.json`. Define named providers with multiple models:
-```json
-{
-  "defaultProvider": "openai",
-  "providers": {
-    "openai": {
-      "apiKey": "$OPENAI_API_KEY",
-      "defaultModel": "gpt-4o",
-      "models": ["gpt-4o", "gpt-4o-mini"]
-    },
-    "ollama": {
-      "apiKey": "not-needed",
-      "baseURL": "http://localhost:11434/v1",
-      "defaultModel": "llama3",
-      "models": ["llama3", "mistral"]
-    }
-  }
-}
-```
-Cycle models with **Shift+Tab**, switch providers with `/provider <name>`, switch backends with `/backend <name>`. API keys support `$ENV_VAR` syntax.
-Additional options:
-| Key | Default | Description |
-|---|---|---|
-| `startupBanner` | `true` | Show startup banner with model info and usage hints |
-| `promptIndicator` | `true` | Show `⚡ agent-sh` in terminal tab/window title |
-Set either to `false` to disable.
-See the [Usage Guide](docs/usage.md#configuration) for the full settings reference.
+Configure via `~/.agent-sh/settings.json`. See the [Usage Guide](docs/usage.md#configuration) for the full settings reference (providers, models, extensions, skills, and more).
 ## Documentation

package/dist/agent/agent-loop.d.ts CHANGED

@@ -26,6 +26,7 @@ export declare class AgentLoop implements AgentBackend {
     private abortController;
     private toolRegistry;
     private conversation;
+    private fileReadCache;
     private modes;
     private currentModeIndex;
     private boundListeners;

package/dist/agent/agent-loop.js CHANGED

@@ -14,6 +14,7 @@ import { createGrepTool } from "./tools/grep.js";
 import { createGlobTool } from "./tools/glob.js";
 import { createLsTool } from "./tools/ls.js";
 import { createUserShellTool } from "./tools/user-shell.js";
+import { createDisplayTool } from "./tools/display.js";
 import { createListSkillsTool } from "./tools/list-skills.js";
 import { discoverProjectSkills } from "./skills.js";
 export class AgentLoop {
@@ -24,6 +25,7 @@ export class AgentLoop {
     abortController = null;
     toolRegistry = new ToolRegistry();
     conversation = new ConversationState();
+    fileReadCache = new Map();
     modes;
     currentModeIndex = 0;
     boundListeners = [];
@@ -51,8 +53,8 @@ export class AgentLoop {
             this.bus.on(event, fn);
             this.boundListeners.push({ event, fn });
         };
-        on("agent:submit", ({ query, modeInstruction, modeLabel }) => {
-            this.handleQuery(query, modeInstruction, modeLabel).catch(() => { });
+        on("agent:submit", ({ query }) => {
+            this.handleQuery(query).catch(() => { });
         });
         on("agent:cancel-request", (e) => {
             this.abortController?.abort(e.silent ? "silent" : undefined);
@@ -278,13 +280,14 @@ export class AgentLoop {
             return env;
         };
         this.toolRegistry.register(createBashTool({ getCwd, getEnv, bus: this.bus }));
-        this.toolRegistry.register(createReadFileTool(getCwd));
+        this.toolRegistry.register(createReadFileTool(getCwd, this.fileReadCache));
         this.toolRegistry.register(createWriteFileTool(getCwd));
         this.toolRegistry.register(createEditFileTool(getCwd));
         this.toolRegistry.register(createGrepTool(getCwd));
         this.toolRegistry.register(createGlobTool(getCwd));
         this.toolRegistry.register(createLsTool(getCwd));
         this.toolRegistry.register(createUserShellTool({ getCwd, bus: this.bus }));
+        this.toolRegistry.register(createDisplayTool({ getCwd, bus: this.bus }));
         this.toolRegistry.register(createListSkillsTool(getCwd));
     }
     /**
@@ -301,6 +304,8 @@ export class AgentLoop {
         h.define("conversation:prepare", (messages) => messages);
         // Wraps each tool call: permission → execute → emit events.
         // Extensions advise to add safe-mode, logging, metrics, custom policies.
+        // The ctx.onChunk callback is exposed so advisors can wrap it to
+        // intercept/transform streamed tool output (e.g. secret redaction).
         h.define("tool:execute", async (ctx) => {
             const { name, id, args, tool } = ctx;
             const display = tool.getDisplayInfo?.(args) ?? { kind: "execute" };
@@ -308,7 +313,9 @@ export class AgentLoop {
             // Permission gating
             if (tool.requiresPermission) {
                 let permKind = "tool-call";
-                let permTitle = name;
+                let permTitle = typeof args.description === "string"
+                    ? `${name}: ${args.description}`
+                    : name;
                 let metadata = { args };
                 // For file-modifying tools, pre-compute diff for display
                 if (tool.modifiesFiles && typeof args.path === "string") {
@@ -359,20 +366,33 @@ export class AgentLoop {
                 }
             }
             // Emit tool-started for TUI
+            const label = tool.displayName ?? name;
             this.bus.emit("agent:tool-started", {
-                title: name, toolCallId: id,
-                kind: display.kind, locations: display.locations, rawInput: args,
+                title: typeof args.description === "string" ? `${label}: ${args.description}` : label,
+                toolCallId: id,
+                kind: display.kind, icon: display.icon, locations: display.locations, rawInput: args,
+                displayDetail: tool.formatCall?.(args),
+                batchIndex: ctx.batchIndex, batchTotal: ctx.batchTotal,
             });
             this.bus.emit("agent:tool-call", { tool: name, args });
-            // Execute — suppress streaming output if diff was already shown
+            // Execute — use ctx.onChunk so advisors can wrap the streaming callback.
+            // Suppress streaming output if diff was already shown.
             const onChunk = (tool.showOutput !== false && !diffShown)
-                ? (chunk) => { this.bus.emit("agent:tool-output-chunk", { chunk }); }
+                ? ctx.onChunk
                 : undefined;
             const result = await tool.execute(args, onChunk);
-            // Emit completion events
-            this.bus.emit("agent:tool-completed", {
+            // Invalidate read cache when a file is modified
+            if (tool.modifiesFiles && typeof args.path === "string" && !result.isError) {
+                const absPath = path.resolve(process.cwd(), args.path);
+                this.fileReadCache.delete(absPath);
+            }
+            // Compute result display: tool-provided → default (none)
+            const resultDisplay = tool.formatResult?.(args, result);
+            // Emit completion events (via transform pipe so extensions can override)
+            this.bus.emitTransform("agent:tool-completed", {
                 toolCallId: id, exitCode: result.exitCode,
                 rawOutput: result.content, kind: display.kind,
+                resultDisplay,
             });
             this.bus.emit("agent:tool-output", {
                 tool: name, output: result.content, exitCode: result.exitCode,
@@ -380,7 +400,7 @@ export class AgentLoop {
             return result;
         });
     }
-    async handleQuery(query, modeInstruction, modeLabel) {
+    async handleQuery(query) {
         // Cancel any in-flight loop (concurrent prompt handling)
         if (this.abortController) {
             this.abortController.abort();
@@ -390,15 +410,11 @@ export class AgentLoop {
         // Each loop iteration adds an abort listener (via OpenAI SDK stream);
         // raise the limit to avoid spurious warnings on multi-tool queries.
         setMaxListeners(50, signal);
-        this.bus.emit("agent:query", { query, modeLabel });
+        this.bus.emit("agent:query", { query });
         this.bus.emit("agent:processing-start", {});
         let responseText = "";
         try {
-            // Prepend mode instruction to the user message
-            const userMessage = modeInstruction
-                ? `${modeInstruction}\n${query}`
-                : query;
-            this.conversation.addUserMessage(userMessage);
+            this.conversation.addUserMessage(query);
             responseText = await this.executeLoop(signal);
         }
         catch (e) {
@@ -453,14 +469,38 @@ export class AgentLoop {
             // No tool calls → agent is done
             if (toolCalls.length === 0)
                 break;
-            // Execute each tool call
-            for (const tc of toolCalls) {
-                if (signal.aborted)
-                    break;
+            // Emit batch info so the TUI can render group headers upfront
+            {
+                const groupMap = new Map();
+                for (const tc of toolCalls) {
+                    const tool = this.toolRegistry.get(tc.name);
+                    const kind = tool?.getDisplayInfo?.((() => { try {
+                        return JSON.parse(tc.argumentsJson);
+                    }
+                    catch {
+                        return {};
+                    } })())?.kind ?? "execute";
+                    let args = {};
+                    try {
+                        args = JSON.parse(tc.argumentsJson);
+                    }
+                    catch { }
+                    const detail = tool?.formatCall?.(args);
+                    if (!groupMap.has(kind))
+                        groupMap.set(kind, []);
+                    groupMap.get(kind).push({ name: tc.name, displayDetail: detail });
+                }
+                const groups = Array.from(groupMap.entries()).map(([kind, tools]) => ({ kind, tools }));
+                this.bus.emit("agent:tool-batch", { groups });
+            }
+            // Execute tool calls — run read-only tools in parallel, permission-
+            // requiring tools sequentially (to avoid overlapping permission prompts).
+            const batchTotal = toolCalls.length;
+            const executeSingle = async (tc, batchIndex) => {
                 const tool = this.toolRegistry.get(tc.name);
                 if (!tool) {
                     this.conversation.addToolResult(tc.id, `Error: Unknown tool "${tc.name}"`);
-                    continue;
+                    return;
                 }
                 let args;
                 try {
@@ -468,16 +508,69 @@ export class AgentLoop {
                 }
                 catch {
                     this.conversation.addToolResult(tc.id, `Error: Invalid JSON arguments for ${tc.name}`);
-                    continue;
+                    return;
                 }
                 // Execute via handler — extensions can advise to add safe-mode,
                 // logging, metrics, custom permission policies, etc.
-                const result = await this.handlers.call("tool:execute", { name: tc.name, id: tc.id, args, tool });
-                // Add tool result to conversation
-                const content = result.isError
+                const defaultOnChunk = (chunk) => {
+                    this.bus.emit("agent:tool-output-chunk", { chunk });
+                };
+                const result = await this.handlers.call("tool:execute", { name: tc.name, id: tc.id, args, tool, onChunk: defaultOnChunk,
+                    batchIndex, batchTotal: batchTotal > 1 ? batchTotal : undefined });
+                // Add tool result to conversation (truncate large outputs to avoid
+                // blowing through the context window on a single tool call)
+                let content = result.isError
                     ? `Error: ${result.content}`
                     : result.content;
+                const maxBytes = 16_384; // ~4k tokens
+                if (content.length > maxBytes) {
+                    const headBytes = Math.floor(maxBytes * 0.6);
+                    const tailBytes = maxBytes - headBytes;
+                    const lines = content.split("\n");
+                    let headEnd = 0, headLen = 0;
+                    for (let i = 0; i < lines.length && headLen + lines[i].length + 1 <= headBytes; i++) {
+                        headLen += lines[i].length + 1;
+                        headEnd = i + 1;
+                    }
+                    let tailStart = lines.length, tailLen = 0;
+                    for (let i = lines.length - 1; i >= headEnd && tailLen + lines[i].length + 1 <= tailBytes; i--) {
+                        tailLen += lines[i].length + 1;
+                        tailStart = i;
+                    }
+                    const omitted = tailStart - headEnd;
+                    content = [
+                        ...lines.slice(0, headEnd),
+                        `\n[… ${omitted} lines omitted (output truncated to ${Math.round(maxBytes / 1024)}KB) …]\n`,
+                        ...lines.slice(tailStart),
+                    ].join("\n");
+                }
                 this.conversation.addToolResult(tc.id, content);
+            };
+            // Partition into parallel-safe (read-only) and sequential (needs permission)
+            const parallel = [];
+            const sequential = [];
+            for (const tc of toolCalls) {
+                const tool = this.toolRegistry.get(tc.name);
+                if (tool && !tool.requiresPermission && !tool.modifiesFiles) {
+                    parallel.push(tc);
+                }
+                else {
+                    sequential.push(tc);
+                }
+            }
+            // Run read-only tools in parallel
+            let batchIdx = 0;
+            if (parallel.length > 0 && !signal.aborted) {
+                await Promise.all(parallel.map(tc => {
+                    const idx = ++batchIdx;
+                    return signal.aborted ? Promise.resolve() : executeSingle(tc, idx);
+                }));
+            }
+            // Run permission-requiring tools sequentially
+            for (const tc of sequential) {
+                if (signal.aborted)
+                    break;
+                await executeSingle(tc, ++batchIdx);
             }
             // Loop back — LLM sees tool results
         }
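
Note on the output truncation in the hunk above: the new code keeps whole lines from the head (60% of the budget) and the tail (the remaining 40%) of a large tool result and replaces the middle with a marker. The following is a minimal standalone sketch of that idea, not the package's code; `truncateMiddle` and its parameter names are hypothetical, and the marker text is simplified.

```javascript
// Hypothetical helper (not part of agent-sh): keep whole lines from the head
// and tail of a large output and replace the middle with a one-line marker.
// The 60/40 split and the 16,384 default mirror the values in the diff.
function truncateMiddle(content, maxChars = 16_384) {
  if (content.length <= maxChars) return content;

  const headBudget = Math.floor(maxChars * 0.6);
  const tailBudget = maxChars - headBudget;
  const lines = content.split("\n");

  // Take whole lines from the top while they still fit in the head budget.
  // The +1 accounts for the newline that joined each line.
  let headEnd = 0;
  let headLen = 0;
  while (headEnd < lines.length && headLen + lines[headEnd].length + 1 <= headBudget) {
    headLen += lines[headEnd].length + 1;
    headEnd++;
  }

  // Take whole lines from the bottom, never crossing into the head.
  let tailStart = lines.length;
  let tailLen = 0;
  while (tailStart > headEnd && tailLen + lines[tailStart - 1].length + 1 <= tailBudget) {
    tailLen += lines[tailStart - 1].length + 1;
    tailStart--;
  }

  const omitted = tailStart - headEnd;
  return [
    ...lines.slice(0, headEnd),
    `[… ${omitted} lines omitted …]`,
    ...lines.slice(tailStart),
  ].join("\n");
}
```

Two things follow from this shape. The limit is compared against `String.prototype.length`, which counts UTF-16 code units rather than bytes, so the `maxBytes` name in the diff is approximate for non-ASCII output. And because only whole lines are kept, a single line longer than either budget is dropped entirely, so an output consisting of one very long line reduces to the marker alone.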

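Note on the scheduling change in the same hunk: tool calls are split into a read-only group that runs concurrently and a group that runs one at a time so permission prompts do not overlap. A minimal sketch of that partition, assuming a tool object carries only the two flags the diff reads (`requiresPermission`, `modifiesFiles`); `partitionToolCalls`, `runBatch`, and `getTool` are hypothetical names, and abort handling is left out.

```javascript
// Hypothetical sketch of the partition step. Unknown tools (getTool returns
// undefined) fall into the sequential group, as in the diff.
function partitionToolCalls(toolCalls, getTool) {
  const parallel = [];
  const sequential = [];
  for (const tc of toolCalls) {
    const tool = getTool(tc.name);
    if (tool && !tool.requiresPermission && !tool.modifiesFiles) {
      parallel.push(tc);
    } else {
      sequential.push(tc);
    }
  }
  return { parallel, sequential };
}

// Run the read-only group concurrently, then the rest in order.
// executeSingle(tc, batchIndex) is supplied by the caller.
async function runBatch(toolCalls, getTool, executeSingle) {
  const { parallel, sequential } = partitionToolCalls(toolCalls, getTool);
  let batchIdx = 0;
  await Promise.all(parallel.map((tc) => executeSingle(tc, ++batchIdx)));
  for (const tc of sequential) {
    await executeSingle(tc, ++batchIdx);
  }
}
```

One consequence worth checking when reviewing the change: batch indices follow this execution order rather than the order in which the model issued the calls, and results from the parallel group are recorded as each call finishes.
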
package/dist/agent/subagent.js CHANGED

@@ -62,11 +62,13 @@ export async function runSubagent(opts) {
             const result = await tool.execute(args, onChunk);
             if (bus) {
                 const display = tool.getDisplayInfo?.(args) ?? { kind: "execute" };
-                bus.emit("agent:tool-completed", {
+                const resultDisplay = tool.formatResult?.(args, result);
+                bus.emitTransform("agent:tool-completed", {
                     toolCallId: tc.id,
                     exitCode: result.exitCode,
                     rawOutput: result.content,
                     kind: display.kind,
+                    resultDisplay,
                 });
             }
             const content = result.isError ? `Error: ${result.content}` : result.content;

package/dist/agent/system-prompt.d.ts CHANGED

@@ -4,7 +4,7 @@ import type { ContextManager } from "../context-manager.js";
  * Static system prompt — identical across all queries, cacheable.
  * Contains only identity and behavioral instructions.
  */
-export declare const STATIC_SYSTEM_PROMPT = "You are an AI coding assistant embedded in agent-sh, a terminal shell.\nYou have access to the user's shell environment and can read, write, and execute code.\nYou share the user's working directory, environment variables, and shell history.\n\n# Input Modes\n\nThe user interacts with you through two modes:\n\nEXECUTE mode (triggered by '>'): The user is asking questions or requesting tasks.\nUse your internal tools (bash, file operations, etc.) to accomplish tasks.\nDo NOT use user_shell in this mode unless the user explicitly asks to run\nsomething in their live shell.\n\nHELP mode (triggered by '?'): The user wants a command run in their live shell.\nYou may use your tools to investigate first (read files, grep, etc.), but the\nfinal action must be running the command via user_shell with return_output=false.\nThe user sees the output directly \u2014 you don't need to see or summarize it.\nDo not explain, confirm, or comment on the result \u2014 just run it and stop.\n\nEach prompt includes a per-query mode instruction \u2014 follow it.\n\n# Tool Usage Guidelines\n- Use read_file before editing a file you haven't seen\n- Prefer edit_file over write_file for modifying existing files\n- Use grep/glob to find files before reading them\n- Keep bash commands focused; avoid long-running blocking commands\n- Always check command exit codes for errors\n- user_shell runs commands in the user's live terminal \u2014 use for cd, export, source, etc.\n- user_shell output is shown directly to the user but NOT returned to you by default.\n  Set return_output=true if you need to inspect the result to answer a question.";
+export declare const STATIC_SYSTEM_PROMPT = "You are an AI coding assistant embedded in agent-sh, a terminal shell.\nYou have access to the user's shell environment and can read, write, and execute code.\nYou share the user's working directory, environment variables, and shell history.\n\n# Tool Decision Guide\n\nYou have three categories of tools \u2014 choose based on who needs the output and\nwhether the command has lasting effects:\n\n**Scratchpad tools** (bash, read_file, grep, glob, ls, edit_file, write_file):\nUse these to investigate, search, read, and modify files. Output is returned\nto you for reasoning \u2014 the user doesn't see it directly.\n\n**Display** (display):\nUse this to show output to the user in their terminal. The user sees the\noutput directly, but it is NOT returned to you. Use when:\n- The user asks to see something (cat a file, git log, git diff, man page)\n- The output is for the user to read, not for you to process\n\n**Live shell** (user_shell):\nUse this to run commands with lasting effects in the user's real shell. Use for:\n- Commands that affect shell state (cd, export, source)\n- Installing packages, starting servers, running builds\n- Any command where the user wants real side effects\n- Set return_output=true only if you need to inspect the result\n\nDefault to scratchpad tools for your own investigation. Use display when the\nuser is the intended audience. Use user_shell when the command has real effects.\n\n# Tool Usage Guidelines\n- Use read_file before editing a file you haven't seen\n- Prefer edit_file over write_file for modifying existing files\n- Use grep/glob to find files before reading them\n- Keep bash commands focused; avoid long-running blocking commands\n- Always check command exit codes for errors";
 /**
  * Build the dynamic context — injected as a user message before each query.
  * Contains everything that changes: tools, shell context, conventions, cwd.

package/dist/agent/system-prompt.js CHANGED

@@ -40,32 +40,37 @@ export const STATIC_SYSTEM_PROMPT = `You are an AI coding assistant embedded in
 You have access to the user's shell environment and can read, write, and execute code.
 You share the user's working directory, environment variables, and shell history.
-# Input Modes
+# Tool Decision Guide
-The user interacts with you through two modes:
+You have three categories of tools — choose based on who needs the output and
+whether the command has lasting effects:
-EXECUTE mode (triggered by '>'): The user is asking questions or requesting tasks.
-Use your internal tools (bash, file operations, etc.) to accomplish tasks.
-Do NOT use user_shell in this mode unless the user explicitly asks to run
-something in their live shell.
+**Scratchpad tools** (bash, read_file, grep, glob, ls, edit_file, write_file):
+Use these to investigate, search, read, and modify files. Output is returned
+to you for reasoning — the user doesn't see it directly.
-HELP mode (triggered by '?'): The user wants a command run in their live shell.
-You may use your tools to investigate first (read files, grep, etc.), but the
-final action must be running the command via user_shell with return_output=false.
-The user sees the output directly — you don't need to see or summarize it.
-Do not explain, confirm, or comment on the result — just run it and stop.
+**Display** (display):
+Use this to show output to the user in their terminal. The user sees the
+output directly, but it is NOT returned to you. Use when:
+- The user asks to see something (cat a file, git log, git diff, man page)
+- The output is for the user to read, not for you to process
-Each prompt includes a per-query mode instruction — follow it.
+**Live shell** (user_shell):
+Use this to run commands with lasting effects in the user's real shell. Use for:
+- Commands that affect shell state (cd, export, source)
+- Installing packages, starting servers, running builds
+- Any command where the user wants real side effects
+- Set return_output=true only if you need to inspect the result
+Default to scratchpad tools for your own investigation. Use display when the
+user is the intended audience. Use user_shell when the command has real effects.
 # Tool Usage Guidelines
 - Use read_file before editing a file you haven't seen
 - Prefer edit_file over write_file for modifying existing files
 - Use grep/glob to find files before reading them
 - Keep bash commands focused; avoid long-running blocking commands
-- Always check command exit codes for errors
-- user_shell runs commands in the user's live terminal — use for cd, export, source, etc.
-- user_shell output is shown directly to the user but NOT returned to you by default.
-  Set return_output=true if you need to inspect the result to answer a question.`;
+- Always check command exit codes for errors`;
 /**
  * Build the dynamic context — injected as a user message before each query.
  * Contains everything that changes: tools, shell context, conventions, cwd.

package/dist/agent/tools/bash.js CHANGED

@@ -2,7 +2,11 @@ import { executeCommand } from "../../executor.js";
 export function createBashTool(opts) {
     return {
         name: "bash",
-        description: "Execute a bash command in an isolated subprocess. Output is captured and returned to you. Does not affect the user's shell state.",
+        description: "Execute a bash command in an isolated subprocess. Output is captured and returned. " +
+            "Does not affect the user's shell state (use user_shell for cd, export, source). " +
+            "Do NOT use bash for file searching — use grep/glob instead. " +
+            "Do NOT use bash for reading files — use read_file instead. " +
+            "Provide a description parameter to explain what the command does.",
         input_schema: {
             type: "object",
             properties: {
@@ -14,6 +18,10 @@ export function createBashTool(opts) {
                     type: "number",
                     description: "Timeout in seconds (default: 60)",
                 },
+                description: {
+                    type: "string",
+                    description: "Short description of what this command does (e.g., 'Install dependencies', 'Run test suite')",
+                },
             },
             required: ["command"],
         },
@@ -22,6 +30,7 @@ export function createBashTool(opts) {
         requiresPermission: true,
         getDisplayInfo: (args) => ({
             kind: "execute",
+            icon: "▶",
             locations: [],
         }),
         async execute(args, onChunk) {
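
Note on the new `description` parameter: in the agent-loop.js diff, both the permission title and the `agent:tool-started` title append `args.description` to the tool label when it is a string, and fall back to the bare label otherwise. A small sketch of that rule; `toolTitle` is a hypothetical name, not a function in the package.

```javascript
// Hypothetical helper mirroring the title rule in agent-loop.js:
// "<label>: <description>" when a string description is present, else "<label>".
function toolTitle(label, args) {
  return typeof args.description === "string"
    ? `${label}: ${args.description}`
    : label;
}

// Example call arguments, using the sample description from the bash schema.
const title = toolTitle("bash", {
  command: "npm ci",
  description: "Install dependencies",
});
console.log(title); // prints "bash: Install dependencies"
```

The `typeof` check means a model that sends a non-string `description` still gets a usable title instead of text like `bash: [object Object]`.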

package/dist/agent/tools/display.d.ts ADDED

@@ -0,0 +1,13 @@
+import type { EventBus } from "../../event-bus.js";
+import type { ToolDefinition } from "../types.js";
+/**
+ * display — shows command output to the user in their live terminal.
+ *
+ * Unlike bash (scratchpad), the user sees the output directly in their shell.
+ * Unlike user_shell, this is for read-only display — no lasting side effects.
+ * The agent does NOT receive the output back.
+ */
+export declare function createDisplayTool(opts: {
+    getCwd: () => string;
+    bus: EventBus;
+}): ToolDefinition;

package/dist/agent/tools/display.js ADDED

@@ -0,0 +1,70 @@
+/**
+ * display — shows command output to the user in their live terminal.
+ *
+ * Unlike bash (scratchpad), the user sees the output directly in their shell.
+ * Unlike user_shell, this is for read-only display — no lasting side effects.
+ * The agent does NOT receive the output back.
+ */
+export function createDisplayTool(opts) {
+    return {
+        name: "display",
+        description: "Show command output to the user in their terminal. Use when the user asks to see something (cat, git log, diff, man, etc.) and you don't need to process the output yourself. Output is NOT returned to you.",
+        input_schema: {
+            type: "object",
+            properties: {
+                command: {
+                    type: "string",
+                    description: "Command to run and display output to the user",
+                },
+                timeout: {
+                    type: "number",
+                    description: "Timeout in seconds (default: 30)",
+                },
+            },
+            required: ["command"],
+        },
+        showOutput: false,
+        modifiesFiles: false,
+        getDisplayInfo: () => ({
+            kind: "display",
+            icon: "◇",
+            locations: [],
+        }),
+        async execute(args) {
+            const command = args.command;
+            const timeoutSec = args.timeout ?? 30;
+            let result;
+            try {
+                const execPromise = opts.bus.emitPipeAsync("shell:exec-request", {
+                    command,
+                    output: "",
+                    cwd: opts.getCwd(),
+                    exitCode: null,
+                    done: false,
+                });
+                const timeoutPromise = new Promise((_, reject) => setTimeout(() => reject(new Error("timeout")), timeoutSec * 1000));
+                result = await Promise.race([execPromise, timeoutPromise]);
+            }
+            catch (err) {
+                const msg = err instanceof Error ? err.message : String(err);
+                if (msg === "timeout") {
+                    return {
+                        content: `Command timed out after ${timeoutSec}s.`,
+                        exitCode: -1,
+                        isError: true,
+                    };
+                }
+                return { content: `Error: ${msg}`, exitCode: -1, isError: true };
+            }
+            const exitCode = result.exitCode ?? 0;
+            const isError = exitCode !== 0 && exitCode !== null;
+            return {
+                content: isError
+                    ? `Command failed with exit code ${exitCode}.`
+                    : "Output displayed to user.",
+                exitCode,
+                isError,
+            };
+        },
+    };
+}
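
Note on the timeout in `display.js`: the tool races the `shell:exec-request` promise against a `setTimeout` that rejects with `"timeout"`. A minimal sketch of the same pattern as a reusable helper, with one difference that is an addition here and not something the diff does: the timer is cleared once the race settles. `withTimeout` is a hypothetical name.

```javascript
// Hypothetical helper: reject with Error("timeout") if `promise` has not
// settled within `ms` milliseconds. Unlike the code in the diff, the timer
// is cleared when the race settles, so it does not linger until it fires.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("timeout")), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

`Promise.race` only decides which result the caller sees; it does not cancel the losing promise. As far as this diff shows, a command that exceeds the timeout is reported as timed out to the agent while the underlying shell request is left to finish on its own.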