npm - @zhijiewang/openharness - Versions diffs - 2.20.0 → 2.21.0 - Mend

@zhijiewang/openharness 2.20.0 → 2.21.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18) hide show

package/README.md +21 -1
package/README.zh-CN.md +21 -1
package/dist/commands/ai.js +10 -0
package/dist/commands/session.d.ts +18 -1
package/dist/commands/session.js +82 -2
package/dist/commands/settings.d.ts +1 -1
package/dist/commands/settings.js +71 -1
package/dist/harness/config.d.ts +13 -0
package/dist/harness/hooks.d.ts +12 -1
package/dist/harness/hooks.js +27 -0
package/dist/main.js +185 -45
package/dist/mcp/loader.d.ts +29 -2
package/dist/mcp/loader.js +59 -3
package/dist/tools/EnterWorktreeTool/index.js +4 -0
package/dist/tools/ExitWorktreeTool/index.js +7 -0
package/dist/utils/debug.d.ts +63 -0
package/dist/utils/debug.js +122 -0
package/package.json +1 -1

package/README.md CHANGED Viewed

@@ -241,6 +241,7 @@ Over 80 commands are registered. The most-used ones are grouped below; see `/hel
 | `/memory` | View and search memories |
 | `/doctor` | Run diagnostic health checks |
 | `/hooks` | List loaded hooks grouped by event |
+| `/reload-plugins` | Hot-reload plugins, skills, hooks, and MCP server connections without restarting the session |
 **Settings:**
 | Command | Description |
@@ -298,7 +299,7 @@ hooks:
     command: "scripts/cleanup.sh"
 ```
-**Event types** (17 total):
+**Event types** (23 total):
 | Event | When it fires | Can block? |
 |-------|---------------|------------|
@@ -307,10 +308,13 @@ hooks:
 | `turnStart` | Top-level agent turn begins (after user prompt accepted) | — |
 | `turnStop` | Top-level agent turn ends (mirrors Claude Code's `Stop`) | — |
 | `userPromptSubmit` | Before user prompt reaches the LLM | yes — `decision: deny` |
+| `userPromptExpansion` | Slash command produces an expanded prompt (audit trail) | — |
 | `preToolUse` | Before each tool call | yes — exit code 1 / `decision: deny` |
 | `postToolUse` | After successful tool execution | — |
 | `postToolUseFailure` | After tool throws or returns `isError: true` | — |
+| `postToolBatch` | Once after a turn's full set of tool calls all resolve, before the next model call | — |
 | `permissionRequest` | When a tool needs approval (between `preToolUse` and the prompt) | yes — `decision: allow\|deny\|ask` |
+| `permissionDenied` | When a tool call is denied (hook / user / headless / policy) | — |
 | `fileChanged` | After a tool modifies a file | — |
 | `cwdChanged` | After working directory changes | — |
 | `subagentStart` | A sub-agent is spawned | — |
@@ -319,6 +323,9 @@ hooks:
 | `postCompact` | After conversation compaction | — |
 | `configChange` | `.oh/config.yaml` is modified during the session | — |
 | `notification` | A notification is dispatched | — |
+| `taskCreated` | `TaskCreate` persists a new task | — |
+| `taskCompleted` | `TaskUpdate` transitions a task to `completed` | — |
+| `instructionsLoaded` | `loadRulesAsPrompt` rebuilt the system prompt with rules in scope | — |
 Live introspection: run `/hooks` in-session to see exactly which hooks are loaded, grouped by event.
@@ -390,6 +397,15 @@ mcpServers:
 MCP tools appear alongside built-in tools. `/status` shows connected servers.
+**MCP server prompts as slash commands** — servers that expose `prompts/list` (e.g., GitHub, Sentry, Linear) get their prompts surfaced as `/<server>:<prompt>` slash commands automatically. Arguments use a `key=value` syntax with quoting:
+```
+/github:summarize-pr repo=acme/widget pr=42
+/sentry:triage-issue issue=ABC-123 severity="high priority"
+```
+Required arguments declared by the prompt template surface a usage error if missing (no model call). Run `/reload-plugins` to re-discover prompts after editing your MCP config.
 ### Remote MCP servers (HTTP / SSE)
 ```yaml
@@ -535,6 +551,10 @@ oh run "add error handling to api.ts" --json    # JSON output
 # Pipe stdin
 cat error.log | oh run "what's wrong here?"
 git diff | oh run "review these changes"
+# Hard cap on session cost — agent halts at the threshold with reason: "budget_exceeded"
+oh run "review the diff" --model claude-sonnet-4-6 --max-budget-usd 0.50
+oh session --model gpt-4o --max-budget-usd 5
 ```
 ### Structured output with `--json-schema`

package/README.zh-CN.md CHANGED Viewed

@@ -241,6 +241,7 @@ OH 注册了 80+ 个斜杠命令；下表只列出最常用的一部分。在会
 | `/memory` | 查看并搜索记忆 |
 | `/doctor` | 运行诊断健康检查 |
 | `/hooks` | 按事件列出已加载的钩子 |
+| `/reload-plugins` | 不重启会话即可热重载插件、技能、钩子和 MCP 服务器连接 |
 **设置：**
 | 命令 | 描述 |
@@ -298,7 +299,7 @@ hooks:
     command: "scripts/cleanup.sh"
 ```
-**事件类型**（共 17 个）：
+**事件类型**（共 23 个）：
 | 事件 | 触发时机 | 是否可阻止 |
 |-------|---------------|------------|
@@ -307,10 +308,13 @@ hooks:
 | `turnStart` | 顶层代理回合开始（用户提示词被接受后） | — |
 | `turnStop` | 顶层代理回合结束（对应 Claude Code 的 `Stop`） | — |
 | `userPromptSubmit` | 用户提示词到达 LLM 之前 | 是 —— `decision: deny` |
+| `userPromptExpansion` | 斜杠命令展开成模型提示词时（用于审计追踪） | — |
 | `preToolUse` | 工具调用之前 | 是 —— 退出码 1 / `decision: deny` |
 | `postToolUse` | 工具成功执行之后 | — |
 | `postToolUseFailure` | 工具抛错或返回 `isError: true` | — |
+| `postToolBatch` | 一个回合内全部工具调用都完成后、下一次模型调用之前 | — |
 | `permissionRequest` | 工具需要授权时（`preToolUse` 与询问之间） | 是 —— `decision: allow\|deny\|ask` |
+| `permissionDenied` | 工具调用被拒绝时（钩子 / 用户 / 无头 / 策略） | — |
 | `fileChanged` | 工具修改文件之后 | — |
 | `cwdChanged` | 工作目录变更之后 | — |
 | `subagentStart` | 子代理被派生 | — |
@@ -319,6 +323,9 @@ hooks:
 | `postCompact` | 对话压缩之后 | — |
 | `configChange` | 会话过程中 `.oh/config.yaml` 被修改 | — |
 | `notification` | 通知被派发 | — |
+| `taskCreated` | `TaskCreate` 持久化新任务后 | — |
+| `taskCompleted` | `TaskUpdate` 将任务状态切换为 `completed` 时 | — |
+| `instructionsLoaded` | `loadRulesAsPrompt` 重新构建系统提示并加载规则之后 | — |
 实时查看：在会话中运行 `/hooks` 可以按事件分组查看当前已加载的钩子。
@@ -390,6 +397,15 @@ mcpServers:
 MCP 工具会与内置工具并列出现。`/status` 会显示已连接的服务器。
+**MCP 服务器提示词作为斜杠命令** —— 实现了 `prompts/list` 的服务器（如 GitHub、Sentry、Linear）会自动把它们的提示词暴露为 `/<server>:<prompt>` 斜杠命令。参数采用 `key=value` 语法，支持加引号：
+```
+/github:summarize-pr repo=acme/widget pr=42
+/sentry:triage-issue issue=ABC-123 severity="high priority"
+```
+提示词模板声明的 required 参数缺失时会直接报用法错误（不会调用模型）。修改 MCP 配置后运行 `/reload-plugins` 即可重新发现提示词。
 ### 远程 MCP 服务器（HTTP / SSE）
 ```yaml
@@ -535,6 +551,10 @@ oh run "add error handling to api.ts" --json    # JSON 输出
 # 通过 stdin 输入
 cat error.log | oh run "what's wrong here?"
 git diff | oh run "review these changes"
+# 会话总成本硬上限 —— 达到阈值时代理会以 reason: "budget_exceeded" 终止
+oh run "review the diff" --model claude-sonnet-4-6 --max-budget-usd 0.50
+oh session --model gpt-4o --max-budget-usd 5
 ```
 ### 使用 `--json-schema` 约束结构化输出

package/dist/commands/ai.js CHANGED Viewed

@@ -232,6 +232,16 @@ export function registerAICommands(register) {
             prependToPrompt: `Summarize this conversation concisely. Highlight the main topics discussed, decisions made, and any pending action items. Be brief but thorough.`,
         };
     });
+    register("recap", "One-sentence recap of the current session (lighter than /summarize)", (_args, ctx) => {
+        if (ctx.messages.length === 0) {
+            return { output: "No messages to recap.", handled: true };
+        }
+        return {
+            output: `[recap] ${ctx.messages.length} messages`,
+            handled: false,
+            prependToPrompt: `In ONE sentence (max ~25 words), recap what this session has accomplished so far. No preamble, no bullet list, no markdown — just the sentence. Skip topics still in progress; lead with what's done.`,
+        };
+    });
     register("explain", "Explain a file or concept", (args) => {
         const topic = args.trim();
         if (!topic) {

package/dist/commands/session.d.ts CHANGED Viewed

@@ -1,6 +1,23 @@
 /**
- * Session commands — /clear, /compact, /export, /history, /browse, /resume, /fork, /pin, /unpin
+ * Session commands — /clear, /compact, /copy, /export, /history, /browse, /resume, /fork, /pin, /unpin
  */
 import type { CommandHandler } from "./types.js";
+/**
+ * Copy text to the system clipboard. Picks a platform-specific external tool
+ * — `clip.exe` (Windows / WSL), `pbcopy` (macOS), then `wl-copy` then
+ * `xclip -selection clipboard` (Linux). The first one that exits 0 wins.
+ *
+ * Pure helper exported for tests; everything is synchronous so the slash
+ * command response can include success/failure inline.
+ *
+ * @internal
+ */
+export declare function copyToClipboard(text: string): {
+    ok: true;
+    tool: string;
+} | {
+    ok: false;
+    reason: string;
+};
 export declare function registerSessionCommands(register: (name: string, description: string, handler: CommandHandler) => void): void;
 //# sourceMappingURL=session.d.ts.map

package/dist/commands/session.js CHANGED Viewed

@@ -1,12 +1,53 @@
 /**
- * Session commands — /clear, /compact, /export, /history, /browse, /resume, /fork, /pin, /unpin
+ * Session commands — /clear, /compact, /copy, /export, /history, /browse, /resume, /fork, /pin, /unpin
  */
+import { spawnSync } from "node:child_process";
 import { existsSync, mkdirSync } from "node:fs";
-import { homedir } from "node:os";
+import { homedir, platform } from "node:os";
 import { dirname, join, resolve } from "node:path";
 import { getContextWindow } from "../harness/cost.js";
 import { createSession, listSessions, loadSession, saveSession } from "../harness/session.js";
 import { compressMessages } from "../query/index.js";
+/**
+ * Copy text to the system clipboard. Picks a platform-specific external tool
+ * — `clip.exe` (Windows / WSL), `pbcopy` (macOS), then `wl-copy` then
+ * `xclip -selection clipboard` (Linux). The first one that exits 0 wins.
+ *
+ * Pure helper exported for tests; everything is synchronous so the slash
+ * command response can include success/failure inline.
+ *
+ * @internal
+ */
+export function copyToClipboard(text) {
+    const candidates = platform() === "win32"
+        ? [{ cmd: "clip", args: [] }]
+        : platform() === "darwin"
+            ? [{ cmd: "pbcopy", args: [] }]
+            : [
+                { cmd: "wl-copy", args: [] },
+                { cmd: "xclip", args: ["-selection", "clipboard"] },
+                { cmd: "xsel", args: ["--clipboard", "--input"] },
+                // WSL / generic-Linux-with-clip.exe-on-PATH fallback
+                { cmd: "clip.exe", args: [] },
+            ];
+    for (const { cmd, args } of candidates) {
+        try {
+            const result = spawnSync(cmd, args, { input: text, encoding: "utf8", shell: false });
+            if (!result.error && result.status === 0) {
+                return { ok: true, tool: cmd };
+            }
+        }
+        catch {
+            /* try the next candidate */
+        }
+    }
+    return {
+        ok: false,
+        reason: platform() === "linux"
+            ? "no clipboard tool found — install one of: wl-copy, xclip, xsel"
+            : `no working clipboard tool found for ${platform()}`,
+    };
+}
 function formatMessagesAsMarkdown(messages) {
     const blocks = [];
     for (const m of messages) {
@@ -216,6 +257,45 @@ export function registerSessionCommands(register) {
             compactedMessages: kept,
         };
     });
+    register("copy", "Copy the Nth-last assistant response to the clipboard (default: most recent)", (args, ctx) => {
+        const raw = args.trim();
+        const n = raw ? parseInt(raw, 10) : 1;
+        if (raw && (Number.isNaN(n) || n < 1)) {
+            return {
+                output: `Usage: /copy [n]\n\nCopies the Nth-last assistant response (default: 1 = most recent).`,
+                handled: true,
+            };
+        }
+        // Walk messages newest-to-oldest, keeping only assistant messages with text.
+        // /export already filters out tool-result messages from rendering — same rule
+        // applies here: we want the model's reply text, not its tool plumbing.
+        const assistantMessages = ctx.messages
+            .filter((m) => m.role === "assistant" && typeof m.content === "string" && m.content.trim().length > 0)
+            .reverse();
+        if (assistantMessages.length === 0) {
+            return { output: "No assistant responses to copy yet.", handled: true };
+        }
+        if (n > assistantMessages.length) {
+            return {
+                output: `Only ${assistantMessages.length} assistant response(s) available; can't copy #${n}.`,
+                handled: true,
+            };
+        }
+        const target = assistantMessages[n - 1];
+        const text = target.content;
+        const result = copyToClipboard(text);
+        if (!result.ok) {
+            return {
+                output: `Could not copy to clipboard: ${result.reason}\n\nResponse text (${text.length} chars):\n${text}`,
+                handled: true,
+            };
+        }
+        const preview = text.length > 80 ? `${text.slice(0, 80)}…` : text;
+        return {
+            output: `Copied assistant response #${n} (${text.length} chars) via ${result.tool}: ${preview}`,
+            handled: true,
+        };
+    });
     register("search", "Search current conversation", (args, ctx) => {
         const term = args.trim().toLowerCase();
         if (!term) {

package/dist/commands/settings.d.ts CHANGED Viewed

@@ -1,5 +1,5 @@
 /**
- * Settings commands — /theme, /companion, /fast, /keys, /effort, /sandbox, /permissions, /allowed-tools
+ * Settings commands — /theme, /companion, /fast, /keys, /keybindings, /effort, /sandbox, /permissions, /allowed-tools
  */
 import type { CommandHandler } from "./types.js";
 export declare function registerSettingsCommands(register: (name: string, description: string, handler: CommandHandler) => void): void;

package/dist/commands/settings.js CHANGED Viewed

@@ -1,8 +1,47 @@
 /**
- * Settings commands — /theme, /companion, /fast, /keys, /effort, /sandbox, /permissions, /allowed-tools
+ * Settings commands — /theme, /companion, /fast, /keys, /keybindings, /effort, /sandbox, /permissions, /allowed-tools
  */
+import { spawn } from "node:child_process";
+import { existsSync, mkdirSync, writeFileSync } from "node:fs";
+import { homedir, platform } from "node:os";
+import { dirname, join } from "node:path";
 import { readOhConfig } from "../harness/config.js";
 import { loadKeybindings } from "../harness/keybindings.js";
+const KEYBINDINGS_TEMPLATE = `[
+  { "key": "ctrl+d", "action": "/diff" },
+  { "key": "ctrl+l", "action": "/clear" },
+  { "key": "ctrl+u", "action": "/undo" },
+  { "key": "ctrl+s", "action": "/status" },
+  { "key": "ctrl+k ctrl+c", "action": "/cost" },
+  { "key": "ctrl+k ctrl+f", "action": "/fast" },
+  { "key": "ctrl+k ctrl+l", "action": "/log" }
+]
+`;
+/**
+ * Open a file in the user's editor — `$VISUAL` → `$EDITOR` → `notepad`
+ * (Windows) → `vi` (POSIX). The child uses `stdio: "ignore"` + `detached: true`
+ * + `unref()` so it is fully decoupled from the REPL and from `node --test`
+ * (which would otherwise hang waiting for the inherited stdio handle to
+ * close). The trade-off: terminal editors like `vi` / `vim` are not usable
+ * here — they need a TTY. That's fine for `/keybindings`, which targets a
+ * GUI-editor flow; an interactive in-REPL edit would be its own command.
+ */
+function openInEditor(filePath) {
+    const editor = process.env.VISUAL || process.env.EDITOR || (platform() === "win32" ? "notepad" : "vi");
+    if (process.env.OH_NO_OPEN_EDITOR === "1") {
+        // Test / CI escape hatch — pretend the editor launched without actually
+        // spawning anything. Used so suite runs don't pop a notepad window.
+        return { command: editor, spawned: true };
+    }
+    try {
+        const child = spawn(editor, [filePath], { stdio: "ignore", shell: true, detached: true });
+        child.unref();
+        return { command: editor, spawned: true };
+    }
+    catch {
+        return { command: editor, spawned: false };
+    }
+}
 export function registerSettingsCommands(register) {
     register("theme", "Switch theme (dark/light)", (args) => {
         const theme = args.trim().toLowerCase();
@@ -52,6 +91,37 @@ export function registerSettingsCommands(register) {
         shortcuts.push("", "  Session:", "    /vim              Toggle Vim mode", "    /browse           Interactive session browser", "    /theme dark|light Switch theme");
         return { output: shortcuts.join("\n"), handled: true };
     });
+    register("keybindings", "Open ~/.oh/keybindings.json in $EDITOR (creates a starter file if missing)", () => {
+        const path = join(homedir(), ".oh", "keybindings.json");
+        let createdNew = false;
+        if (!existsSync(path)) {
+            try {
+                mkdirSync(dirname(path), { recursive: true });
+                writeFileSync(path, KEYBINDINGS_TEMPLATE);
+                createdNew = true;
+            }
+            catch (err) {
+                return {
+                    output: `Could not create ${path}: ${err instanceof Error ? err.message : String(err)}`,
+                    handled: true,
+                };
+            }
+        }
+        const { command, spawned } = openInEditor(path);
+        if (!spawned) {
+            return {
+                output: `Could not launch ${command}. File path: ${path}\nSet $EDITOR or open it manually.`,
+                handled: true,
+            };
+        }
+        const lines = [
+            createdNew ? `Created starter file at ${path}` : `Opening ${path}`,
+            `Editor: ${command}`,
+            "",
+            "Edits take effect on the next session start. Reload now with /reload-plugins.",
+        ];
+        return { output: lines.join("\n"), handled: true };
+    });
     register("effort", "Set reasoning effort level (low/medium/high/max)", (args) => {
         const level = args.trim().toLowerCase();
         const valid = ["low", "medium", "high", "max"];

package/dist/harness/config.d.ts CHANGED Viewed

@@ -76,6 +76,10 @@ export type HooksConfig = {
     taskCreated?: HookDef[];
     /** Fires when a TaskUpdate tool call transitions a task to status "completed". */
     taskCompleted?: HookDef[];
+    /** Fires after EnterWorktreeTool successfully creates an isolated git worktree. */
+    worktreeCreate?: HookDef[];
+    /** Fires after ExitWorktreeTool successfully removes a git worktree. */
+    worktreeRemove?: HookDef[];
     /** Fires once per system-prompt build after CLAUDE.md / global-rules / project RULES.md / user profile have been concatenated. Useful for audit trails. */
     instructionsLoaded?: HookDef[];
 };
@@ -111,6 +115,15 @@ export type OhConfig = {
     baseUrl?: string;
     mcpServers?: McpServerConfig[];
     hooks?: HooksConfig;
+    /**
+     * Global kill switch for the hook system. When `true`, every `emitHook` /
+     * `emitHookAsync` / `emitHookWithOutcome` call short-circuits as if no
+     * hooks were configured — useful for one-off CI runs where the configured
+     * hooks would interfere. Configured hooks remain in `.oh/config.yaml` and
+     * are visible via `/hooks` so the off-state is auditable. Mirrors
+     * Claude Code's `disableAllHooks` setting.
+     */
+    disableAllHooks?: boolean;
     toolPermissions?: ToolPermissionRule[];
     statusLineFormat?: string;
     /** Verification loops — auto-run lint/typecheck after file edits */

package/dist/harness/hooks.d.ts CHANGED Viewed

@@ -10,7 +10,7 @@
  * - prompt: LLM yes/no check via provider.complete()
  */
 import type { HookDef, HooksConfig } from "./config.js";
-export type HookEvent = "sessionStart" | "sessionEnd" | "preToolUse" | "postToolUse" | "postToolUseFailure" | "postToolBatch" | "userPromptSubmit" | "userPromptExpansion" | "permissionRequest" | "permissionDenied" | "fileChanged" | "cwdChanged" | "subagentStart" | "subagentStop" | "preCompact" | "postCompact" | "configChange" | "notification" | "turnStart" | "turnStop" | "taskCreated" | "taskCompleted" | "instructionsLoaded";
+export type HookEvent = "sessionStart" | "sessionEnd" | "preToolUse" | "postToolUse" | "postToolUseFailure" | "postToolBatch" | "userPromptSubmit" | "userPromptExpansion" | "permissionRequest" | "permissionDenied" | "fileChanged" | "cwdChanged" | "subagentStart" | "subagentStop" | "preCompact" | "postCompact" | "configChange" | "notification" | "turnStart" | "turnStop" | "taskCreated" | "taskCompleted" | "worktreeCreate" | "worktreeRemove" | "instructionsLoaded";
 export type HookContext = {
     toolName?: string;
     toolArgs?: string;
@@ -60,6 +60,12 @@ export type HookContext = {
     taskSubject?: string;
     /** For taskCompleted: the previous status before completion (usually "in_progress") */
     taskPreviousStatus?: string;
+    /** For worktreeCreate/worktreeRemove: absolute path to the worktree directory */
+    worktreePath?: string;
+    /** For worktreeCreate/worktreeRemove: the parent repo directory the worktree was forked from */
+    worktreeParent?: string;
+    /** For worktreeRemove: whether `force: true` was passed to skip the dirty-state check */
+    worktreeForced?: string;
     /** For instructionsLoaded: count of rules concatenated (as a string for env-var parity) */
     rulesCount?: string;
     /** For instructionsLoaded: total character length of the loaded rules */
@@ -68,6 +74,11 @@ export type HookContext = {
 export declare function getHooks(): HooksConfig | null;
 /** Clear hook cache (call after config changes) */
 export declare function invalidateHookCache(): void;
+/**
+ * Whether the configured `disableAllHooks` kill switch is set.
+ * Cached so the per-emit cost is a single boolean read.
+ */
+export declare function areHooksEnabled(): boolean;
 /**
  * Evaluate a hook matcher against the current tool name.
  *

package/dist/harness/hooks.js CHANGED Viewed

@@ -10,6 +10,7 @@
  * - prompt: LLM yes/no check via provider.complete()
  */
 import { spawn, spawnSync } from "node:child_process";
+import { debug } from "../utils/debug.js";
 import { readOhConfig } from "./config.js";
 let cachedHooks;
 export function getHooks() {
@@ -22,6 +23,18 @@ export function getHooks() {
 /** Clear hook cache (call after config changes) */
 export function invalidateHookCache() {
     cachedHooks = undefined;
+    cachedDisableAllHooks = undefined;
+}
+let cachedDisableAllHooks;
+/**
+ * Whether the configured `disableAllHooks` kill switch is set.
+ * Cached so the per-emit cost is a single boolean read.
+ */
+export function areHooksEnabled() {
+    if (cachedDisableAllHooks === undefined) {
+        cachedDisableAllHooks = readOhConfig()?.disableAllHooks === true;
+    }
+    return !cachedDisableAllHooks;
 }
 function buildEnv(event, ctx) {
     const env = {
@@ -71,6 +84,12 @@ function buildEnv(event, ctx) {
         env.OH_TURN_NUMBER = ctx.turnNumber;
     if (ctx.turnReason !== undefined)
         env.OH_TURN_REASON = ctx.turnReason;
+    if (ctx.worktreePath !== undefined)
+        env.OH_WORKTREE_PATH = ctx.worktreePath;
+    if (ctx.worktreeParent !== undefined)
+        env.OH_WORKTREE_PARENT = ctx.worktreeParent;
+    if (ctx.worktreeForced !== undefined)
+        env.OH_WORKTREE_FORCED = ctx.worktreeForced;
     return env;
 }
 /**
@@ -400,10 +419,14 @@ async function executeHookDef(def, event, ctx) {
  * All other hooks run asynchronously to avoid blocking the event loop.
  */
 export function emitHook(event, ctx = {}) {
+    if (!areHooksEnabled())
+        return true;
     const hooks = getHooks();
     if (!hooks)
         return true;
     const defs = hooks[event] ?? [];
+    if (defs.length > 0)
+        debug("hooks", "fire", { event, count: defs.length, tool: ctx.toolName });
     const env = buildEnv(event, ctx);
     if (event === "preToolUse") {
         // preToolUse command hooks must be synchronous — they gate tool execution
@@ -456,6 +479,8 @@ export function emitHook(event, ctx = {}) {
  * Supports all hook types (command, HTTP, prompt).
  */
 export async function emitHookAsync(event, ctx = {}) {
+    if (!areHooksEnabled())
+        return true;
     const hooks = getHooks();
     if (!hooks)
         return true;
@@ -570,6 +595,8 @@ async function runHookForOutcome(def, event, ctx) {
  *   from hooks is ignored — outcome.allowed is always true. additionalContext is still collected.
  */
 export async function emitHookWithOutcome(event, ctx = {}) {
+    if (!areHooksEnabled())
+        return { allowed: true };
     const hooks = getHooks();
     const list = hooks?.[event];
     if (!list || list.length === 0)

package/dist/main.js CHANGED Viewed

@@ -23,9 +23,10 @@ import { detectProject, projectContextToPrompt } from "./harness/onboarding.js";
 import { discoverSkills, skillsToPrompt } from "./harness/plugins.js";
 import { createRulesFile, loadRules, loadRulesAsPrompt } from "./harness/rules.js";
 import { listSessions } from "./harness/session.js";
-import { connectedMcpServers, disconnectMcpClients, getMcpInstructions, loadMcpPrompts, loadMcpTools, } from "./mcp/loader.js";
+import { connectedMcpServers, disconnectMcpClients, getMcpInstructions, loadMcpPrompts, loadMcpTools, parseMcpConfigFile, } from "./mcp/loader.js";
 import { loadOutputStyle } from "./outputStyles/index.js";
 import { getAllTools } from "./tools.js";
+import { configureDebug, debug } from "./utils/debug.js";
 import { validateAgainstJsonSchema } from "./utils/json-schema.js";
 import { parseMaxBudgetUsd } from "./utils/parse-budget.js";
 const _require = createRequire(import.meta.url);
@@ -75,6 +76,40 @@ You have access to tools for reading, writing, and searching files, running shel
 - When referencing code, include file_path:line_number.
 - Do not restate what the user said. Do not add trailing summaries unless asked.
 - Keep responses short and direct. If you can say it in one sentence, don't use three.`;
+/**
+ * Read a system prompt from a file path, or exit 2 with a stderr message.
+ * Used by `--system-prompt-file` / `--append-system-prompt-file` so callers
+ * can keep prompts as version-controlled files instead of stuffing them on
+ * the command line. Trailing newline is stripped (most editors add one).
+ */
+function readSystemPromptFile(path, label) {
+    try {
+        return readFileSync(path, "utf8").replace(/\n$/, "");
+    }
+    catch (err) {
+        const message = err instanceof Error ? err.message : String(err);
+        process.stderr.write(`Error: ${label} '${path}' could not be read: ${message}\n`);
+        process.exit(2);
+    }
+}
+/**
+ * Parse `--mcp-config <path>` (and the optional `--strict-mcp-config` flag)
+ * into a `LoadMcpOptions` shape ready to pass to `loadMcpTools`. Returns
+ * undefined when the user didn't pass `--mcp-config`. Exits 2 with a stderr
+ * message on parse / shape errors.
+ */
+function buildMcpLoadOpts(opts) {
+    if (!opts.mcpConfig)
+        return undefined;
+    try {
+        const extraServers = parseMcpConfigFile(opts.mcpConfig);
+        return { extraServers, strict: opts.strictMcpConfig === true };
+    }
+    catch (err) {
+        process.stderr.write(`Error: ${err instanceof Error ? err.message : String(err)}\n`);
+        process.exit(2);
+    }
+}
 /**
  * Parse the `--max-budget-usd` CLI argument into a positive USD amount, or
  * exit 2 with an error message. The pure parser lives in
@@ -89,7 +124,19 @@ function parseMaxBudgetUsdOrExit(raw) {
     }
     return result.value;
 }
-function buildSystemPrompt(model) {
+/**
+ * Build the assembled system prompt for a session.
+ *
+ * In `bare` mode (audit A4 — `--bare`) every optional contributor is skipped:
+ * no project context, no rules, no user profile, no remembered memories, no
+ * skill catalog, no MCP server instructions, no language directive, no output
+ * style. The result is exactly `DEFAULT_SYSTEM_PROMPT`. Used for fast SDK /
+ * CI invocations where the model just needs the tool-use baseline and the
+ * caller will supply its own context.
+ */
+function buildSystemPrompt(model, opts = {}) {
+    if (opts.bare)
+        return DEFAULT_SYSTEM_PROMPT;
     const cfg = readOhConfig();
     // Output-style preface (first — sets personality for everything that follows).
     // Skipped silently for the "default" style (empty prompt).
@@ -146,13 +193,27 @@ program
     .addOption(new Option("--output-format <format>", "Output format").choices(["json", "text", "stream-json"]).default("text"))
     .option("--max-turns <n>", "Maximum turns", "20")
     .option("--system-prompt <prompt>", "Override the system prompt")
+    .option("--system-prompt-file <path>", "Read the system prompt from a file (overrides --system-prompt)")
     .option("--append-system-prompt <text>", "Append text to the system prompt")
+    .option("--append-system-prompt-file <path>", "Append the contents of a file to the system prompt")
     .option("--allowed-tools <tools>", "Comma-separated list of allowed tools")
     .option("--disallowed-tools <tools>", "Comma-separated list of disallowed tools")
     .option("--resume <id>", "Resume a saved session (replays its message history before this prompt)")
     .option("--setting-sources <sources>", "Comma-separated list of setting sources to merge (e.g. 'user,project,local'). Mirrors Claude Code's setting_sources.")
     .option("--max-budget-usd <amount>", "Hard cap on session cost in USD. The agent halts with reason 'budget_exceeded' once totalCost reaches this amount. Mirrors Claude Code's --max-budget-usd.")
+    .option("--no-session-persistence", "Skip writing the session to disk under ~/.oh/sessions/. Useful for ephemeral CI runs that don't need resume.")
+    .option("--mcp-config <path>", 'Load MCP servers from a JSON file (in addition to .oh/config.yaml). File format: {"mcpServers": [...]} or a bare array.')
+    .option("--strict-mcp-config", "With --mcp-config, ignore .oh/config.yaml mcpServers — use only the file's servers.")
+    .option("--bare", "Skip optional startup work (project detection, plugins, memory, skills, MCP). System prompt is just the tool-use baseline. Useful for fast CI / SDK invocations.")
+    .option("--debug [categories]", "Enable categorized debug logs to stderr. Pass comma-separated categories (e.g. 'mcp,hooks') or no value for all. Also reads OH_DEBUG.")
+    .option("--debug-file <path>", "When --debug is set, append debug lines to this file instead of stderr.")
     .action(async (promptArg, opts) => {
+    configureDebug({
+        categories: opts.debug,
+        ...(opts.debugFile ? { file: opts.debugFile } : {}),
+    });
+    const bare = opts.bare === true;
+    debug("startup", "oh run", { bare, model: opts.model });
     // Read from stdin if prompt is "-" or omitted and stdin is not a TTY
     let prompt;
     if (!promptArg || promptArg === "-" || !process.stdin.isTTY) {
@@ -189,8 +250,14 @@ program
         overrides.baseUrl = savedConfig.baseUrl;
     const { provider, model } = await createProvider(effectiveModel, Object.keys(overrides).length ? overrides : undefined);
     const { query } = await import("./query.js");
-    // Tool filtering
-    let tools = getAllTools();
+    // Tool list = built-ins + MCP server tools (project config + --mcp-config).
+    // Previously oh run skipped MCP entirely, which silently broke the SDK
+    // `tools=[...]` feature (the SDK injects mcpServers into a temp config but
+    // the CLI never read it back). `--bare` opts back out — built-ins only.
+    const mcpLoadOpts = buildMcpLoadOpts(opts);
+    const mcpTools = bare ? [] : await loadMcpTools(mcpLoadOpts);
+    debug("mcp", "loaded", { count: mcpTools.length, bare });
+    let tools = [...getAllTools(), ...mcpTools];
     if (opts.allowedTools) {
         const allowed = new Set(opts.allowedTools.split(",").map((s) => s.trim()));
         tools = tools.filter((t) => allowed.has(t.name));
@@ -199,13 +266,22 @@ program
         const disallowed = new Set(opts.disallowedTools.split(",").map((s) => s.trim()));
         tools = tools.filter((t) => !disallowed.has(t.name));
     }
-    // System prompt
+    process.on("exit", () => disconnectMcpClients());
+    // System prompt — file variants take precedence over inline string variants
+    // so callers can override-from-file without removing a stale --system-prompt
+    // they were previously passing.
     let systemPrompt;
-    if (opts.systemPrompt) {
+    if (opts.systemPromptFile) {
+        systemPrompt = readSystemPromptFile(opts.systemPromptFile, "--system-prompt-file");
+    }
+    else if (opts.systemPrompt) {
         systemPrompt = opts.systemPrompt;
     }
     else {
-        systemPrompt = buildSystemPrompt(model);
+        systemPrompt = buildSystemPrompt(model, { bare });
+    }
+    if (opts.appendSystemPromptFile) {
+        systemPrompt += `\n\n${readSystemPromptFile(opts.appendSystemPromptFile, "--append-system-prompt-file")}`;
     }
     if (opts.appendSystemPrompt) {
         systemPrompt += `\n\n${opts.appendSystemPrompt}`;
@@ -233,6 +309,8 @@ program
     // --resume <id> on a later run. Without this, every fresh `oh run` was
     // a programmatic dead-end for resumption (issue #60).
     const { createSession, loadSession, saveSession } = await import("./harness/session.js");
+    // Commander rewrites --no-session-persistence to opts.sessionPersistence === false.
+    const persistSession = opts.sessionPersistence !== false;
     let priorMessages;
     let sessionId;
     let sessionRecord;
@@ -250,7 +328,8 @@ program
     else {
         sessionRecord = createSession(provider.name, model);
         sessionId = sessionRecord.id;
-        saveSession(sessionRecord);
+        if (persistSession)
+            saveSession(sessionRecord);
     }
     if (outputFormat === "stream-json") {
         // Emit a session_start event so SDK callers can capture the id for
@@ -352,16 +431,18 @@ program
     // they're per-tool ephemerals; the assistant's final text is what
     // matters for context resumption. Mirrors the REPL's save-on-exit pattern
     // (src/components/REPL.tsx:120) but at one-shot scope.
-    try {
-        const { createUserMessage, createAssistantMessage } = await import("./types/message.js");
-        const newMessages = [...(priorMessages ?? []), createUserMessage(prompt)];
-        if (fullOutput)
-            newMessages.push(createAssistantMessage(fullOutput));
-        sessionRecord.messages = newMessages;
-        saveSession(sessionRecord);
-    }
-    catch {
-        /* persistence is best-effort — never fail the user's run on a save error */
+    if (persistSession) {
+        try {
+            const { createUserMessage, createAssistantMessage } = await import("./types/message.js");
+            const newMessages = [...(priorMessages ?? []), createUserMessage(prompt)];
+            if (fullOutput)
+                newMessages.push(createAssistantMessage(fullOutput));
+            sessionRecord.messages = newMessages;
+            saveSession(sessionRecord);
+        }
+        catch {
+            /* persistence is best-effort — never fail the user's run on a save error */
+        }
     }
 });
 // ── `oh session`: long-lived stateful session for the Python SDK ──
@@ -376,10 +457,25 @@ program
     .option("--disallowed-tools <tools>", "Comma-separated disallowed tool names")
     .option("--max-turns <n>", "Maximum turns per prompt", "20")
     .option("--system-prompt <prompt>", "Override the system prompt")
+    .option("--system-prompt-file <path>", "Read the system prompt from a file (overrides --system-prompt)")
+    .option("--append-system-prompt <text>", "Append text to the system prompt")
+    .option("--append-system-prompt-file <path>", "Append the contents of a file to the system prompt")
     .option("--resume <id>", "Resume a saved session (seeds the conversation with its prior message history)")
     .option("--setting-sources <sources>", "Comma-separated list of setting sources to merge (mirrors Claude Code's setting_sources).")
     .option("--max-budget-usd <amount>", "Hard cap on session cost in USD. Each prompt's cost accumulates; the agent halts with reason 'budget_exceeded' once totalCost reaches this amount.")
+    .option("--no-session-persistence", "Skip writing the session to disk under ~/.oh/sessions/. Useful for ephemeral SDK clients that don't need resume.")
+    .option("--mcp-config <path>", 'Load MCP servers from a JSON file (in addition to .oh/config.yaml). File format: {"mcpServers": [...]} or a bare array.')
+    .option("--strict-mcp-config", "With --mcp-config, ignore .oh/config.yaml mcpServers — use only the file's servers.")
+    .option("--bare", "Skip optional startup work (project detection, plugins, memory, skills, MCP). System prompt is just the tool-use baseline.")
+    .option("--debug [categories]", "Enable categorized debug logs to stderr. Pass comma-separated categories (e.g. 'mcp,hooks') or no value for all. Also reads OH_DEBUG.")
+    .option("--debug-file <path>", "When --debug is set, append debug lines to this file instead of stderr.")
     .action(async (opts) => {
+    configureDebug({
+        categories: opts.debug,
+        ...(opts.debugFile ? { file: opts.debugFile } : {}),
+    });
+    const bare = opts.bare === true;
+    debug("startup", "oh session", { bare, model: opts.model });
     const settingSources = parseSettingSources(opts.settingSources);
     const savedConfig = readOhConfig(undefined, settingSources);
     const permissionMode = (opts.permissionMode ??
@@ -395,7 +491,14 @@ program
     const { provider, model } = await createProvider(effectiveModel, Object.keys(overrides).length ? overrides : undefined);
     const { query } = await import("./query.js");
     const { createAssistantMessage, createToolResultMessage, createUserMessage } = await import("./types/message.js");
-    let tools = getAllTools();
+    // Tool list = built-ins + MCP server tools (project config + --mcp-config).
+    // Same fix as `oh run` — `oh session` previously skipped MCP entirely,
+    // which silently broke the SDK `tools=[...]` feature for stateful clients.
+    // `--bare` opts back out — built-ins only.
+    const mcpLoadOpts = buildMcpLoadOpts(opts);
+    const mcpTools = bare ? [] : await loadMcpTools(mcpLoadOpts);
+    debug("mcp", "loaded", { count: mcpTools.length, bare });
+    let tools = [...getAllTools(), ...mcpTools];
     if (opts.allowedTools) {
         const allowed = new Set(opts.allowedTools.split(",").map((s) => s.trim()));
         tools = tools.filter((t) => allowed.has(t.name));
@@ -404,7 +507,23 @@ program
         const disallowed = new Set(opts.disallowedTools.split(",").map((s) => s.trim()));
         tools = tools.filter((t) => !disallowed.has(t.name));
     }
-    const systemPrompt = opts.systemPrompt ?? buildSystemPrompt(model);
+    process.on("exit", () => disconnectMcpClients());
+    let systemPrompt;
+    if (opts.systemPromptFile) {
+        systemPrompt = readSystemPromptFile(opts.systemPromptFile, "--system-prompt-file");
+    }
+    else if (opts.systemPrompt) {
+        systemPrompt = opts.systemPrompt;
+    }
+    else {
+        systemPrompt = buildSystemPrompt(model, { bare });
+    }
+    if (opts.appendSystemPromptFile) {
+        systemPrompt += `\n\n${readSystemPromptFile(opts.appendSystemPromptFile, "--append-system-prompt-file")}`;
+    }
+    if (opts.appendSystemPrompt) {
+        systemPrompt += `\n\n${opts.appendSystemPrompt}`;
+    }
     const config = {
         provider,
         tools,
@@ -420,6 +539,8 @@ program
     // event for later resume (issue #60).
     const conversation = [];
     const { createSession, loadSession, saveSession } = await import("./harness/session.js");
+    // Commander rewrites --no-session-persistence to opts.sessionPersistence === false.
+    const persistSession = opts.sessionPersistence !== false;
     let sessionId;
     let sessionRecord;
     if (opts.resume) {
@@ -436,7 +557,8 @@ program
     else {
         sessionRecord = createSession(provider.name, model);
         sessionId = sessionRecord.id;
-        saveSession(sessionRecord);
+        if (persistSession)
+            saveSession(sessionRecord);
     }
     let turnCounter = 0;
     // Will be set to the current prompt id before each turn so hook_decision
@@ -549,12 +671,15 @@ program
         }
         // Persist after every completed turn so a later --resume picks up the
         // history. Best-effort — a save failure shouldn't break the live session.
-        try {
-            sessionRecord.messages = conversation.slice();
-            saveSession(sessionRecord);
-        }
-        catch {
-            /* save errors don't propagate to the client */
+        // Skipped entirely when --no-session-persistence was passed.
+        if (persistSession) {
+            try {
+                sessionRecord.messages = conversation.slice();
+                saveSession(sessionRecord);
+            }
+            catch {
+                /* save errors don't propagate to the client */
+            }
         }
     }
 });
@@ -578,7 +703,16 @@ program
     .option("--json-schema <schema>", "Constrain output to match a JSON schema (headless mode)")
     .option("--input-format <format>", "Input format: text (default) or stream-json (NDJSON on stdin)")
     .option("--replay-user-messages", "Re-emit user messages on stdout (requires stream-json output)")
+    .option("--bare", "Skip optional startup work (project detection, plugins, memory, skills, MCP). System prompt is just the tool-use baseline.")
+    .option("--debug [categories]", "Enable categorized debug logs to stderr. Pass comma-separated categories (e.g. 'mcp,hooks') or no value for all. Also reads OH_DEBUG.")
+    .option("--debug-file <path>", "When --debug is set, append debug lines to this file instead of stderr.")
     .action(async (opts) => {
+    configureDebug({
+        categories: opts.debug,
+        ...(opts.debugFile ? { file: opts.debugFile } : {}),
+    });
+    const bare = opts.bare === true;
+    debug("startup", "oh chat", { bare, model: opts.model, print: !!opts.print });
     // Load saved config as defaults (env vars + CLI flags override)
     const savedConfig = readOhConfig();
     const effectiveModel = opts.model ?? savedConfig?.model;
@@ -648,25 +782,31 @@ program
             process.exit(0);
         }
     }
-    const mcpTools = await loadMcpTools();
-    const mcpNames = connectedMcpServers();
-    if (mcpNames.length > 0) {
-        console.log(`[mcp] Connected: ${mcpNames.join(", ")}`);
-    }
-    // Surface MCP-server prompts (`prompts/list`) as `/server:prompt` slash
-    // commands. Errors are swallowed inside loadMcpPrompts — servers that
-    // don't implement the prompts capability return [] without throwing.
-    try {
-        const { registerMcpPromptCommands } = await import("./commands/index.js");
-        const prompts = await loadMcpPrompts();
-        registerMcpPromptCommands(prompts);
-        if (prompts.length > 0) {
-            console.log(`[mcp] Prompts: ${prompts.map((p) => `/${p.qualifiedName}`).join(", ")}`);
+    // `--bare` skips MCP entirely (servers, prompts, instructions). The
+    // built-in tool set is still loaded — bare is about reducing optional
+    // startup work, not stripping the agent's tool surface.
+    const mcpTools = bare ? [] : await loadMcpTools();
+    if (!bare) {
+        const mcpNames = connectedMcpServers();
+        if (mcpNames.length > 0) {
+            console.log(`[mcp] Connected: ${mcpNames.join(", ")}`);
+        }
+        // Surface MCP-server prompts (`prompts/list`) as `/server:prompt` slash
+        // commands. Errors are swallowed inside loadMcpPrompts — servers that
+        // don't implement the prompts capability return [] without throwing.
+        try {
+            const { registerMcpPromptCommands } = await import("./commands/index.js");
+            const prompts = await loadMcpPrompts();
+            registerMcpPromptCommands(prompts);
+            if (prompts.length > 0) {
+                console.log(`[mcp] Prompts: ${prompts.map((p) => `/${p.qualifiedName}`).join(", ")}`);
+            }
+        }
+        catch {
+            /* prompt registration is best-effort; never block the REPL */
         }
     }
-    catch {
-        /* prompt registration is best-effort; never block the REPL */
-    }
+    debug("mcp", "loaded", { count: mcpTools.length, bare });
     const tools = [...getAllTools(), ...mcpTools];
     process.on("exit", () => disconnectMcpClients());
     // Compute working directory and git branch
@@ -728,7 +868,7 @@ program
         const qConfig = {
             provider,
             tools,
-            systemPrompt: buildSystemPrompt(resolvedModel),
+            systemPrompt: buildSystemPrompt(resolvedModel, { bare }),
             permissionMode: effectivePermMode,
             maxTurns: 20,
             model: resolvedModel,

package/dist/mcp/loader.d.ts CHANGED Viewed

@@ -1,6 +1,33 @@
+import type { McpServerConfig } from "../harness/config.js";
 import type { Tool } from "../Tool.js";
-/** Load MCP tools from .oh/config.yaml mcpServers list. Returns empty array if none configured. */
-export declare function loadMcpTools(): Promise<Tool[]>;
+/**
+ * Parse a `--mcp-config <path>` file. Format:
+ *   - `{ "mcpServers": [...] }` — Claude Code convention (preferred)
+ *   - `[ ... ]` — bare array of server configs (also accepted)
+ *   - `{ "name": ..., ... }` — single-server object (also accepted)
+ *
+ * Validation is shape-only: each entry must be an object with a `name`.
+ * Connection-time validation happens in `McpClient.connect`. Throws on
+ * malformed JSON or unrecognised top-level shape.
+ */
+export declare function parseMcpConfigFile(path: string): McpServerConfig[];
+export interface LoadMcpOptions {
+    /**
+     * MCP servers loaded from sources outside `.oh/config.yaml` — typically
+     * a `--mcp-config <path>` file. Merged with the config-file servers
+     * unless `strict` is set, in which case these REPLACE the config-file
+     * servers entirely.
+     */
+    extraServers?: import("../harness/config.js").McpServerConfig[];
+    /**
+     * When `true`, ignore `cfg.mcpServers` and use only `extraServers`.
+     * No-op when `extraServers` is undefined (the config-file servers
+     * still load). Mirrors Claude Code's `--strict-mcp-config`.
+     */
+    strict?: boolean;
+}
+/** Load MCP tools from .oh/config.yaml mcpServers list (and/or `--mcp-config` overrides). Returns empty array if none configured. */
+export declare function loadMcpTools(opts?: LoadMcpOptions): Promise<Tool[]>;
 /** Disconnect all MCP clients (call on exit) */
 export declare function disconnectMcpClients(): void;
 /** Names of connected MCP servers */

package/dist/mcp/loader.js CHANGED Viewed

@@ -1,7 +1,52 @@
+import { readFileSync } from "node:fs";
 import { readOhConfig } from "../harness/config.js";
+import { debug } from "../utils/debug.js";
 import { McpClient } from "./client.js";
 import { DeferredMcpTool } from "./DeferredMcpTool.js";
 import { McpTool } from "./McpTool.js";
+/**
+ * Parse a `--mcp-config <path>` file. Format:
+ *   - `{ "mcpServers": [...] }` — Claude Code convention (preferred)
+ *   - `[ ... ]` — bare array of server configs (also accepted)
+ *   - `{ "name": ..., ... }` — single-server object (also accepted)
+ *
+ * Validation is shape-only: each entry must be an object with a `name`.
+ * Connection-time validation happens in `McpClient.connect`. Throws on
+ * malformed JSON or unrecognised top-level shape.
+ */
+export function parseMcpConfigFile(path) {
+    const raw = readFileSync(path, "utf8");
+    let parsed;
+    try {
+        parsed = JSON.parse(raw);
+    }
+    catch (err) {
+        throw new Error(`--mcp-config '${path}' is not valid JSON: ${err instanceof Error ? err.message : String(err)}`);
+    }
+    let servers;
+    if (Array.isArray(parsed)) {
+        servers = parsed;
+    }
+    else if (parsed && typeof parsed === "object" && "mcpServers" in parsed) {
+        const list = parsed.mcpServers;
+        if (!Array.isArray(list)) {
+            throw new Error(`--mcp-config '${path}': mcpServers must be an array`);
+        }
+        servers = list;
+    }
+    else if (parsed && typeof parsed === "object" && "name" in parsed) {
+        servers = [parsed];
+    }
+    else {
+        throw new Error(`--mcp-config '${path}': expected an mcpServers array, a bare array, or a single server object`);
+    }
+    for (const s of servers) {
+        if (!s || typeof s !== "object" || typeof s.name !== "string") {
+            throw new Error(`--mcp-config '${path}': every server entry must be an object with a 'name' string`);
+        }
+    }
+    return servers;
+}
 const connectedClients = [];
 let exitHandlerInstalled = false;
 function installExitHandler() {
@@ -28,11 +73,20 @@ function installExitHandler() {
 }
 /** Threshold: servers with more tools than this use deferred loading */
 const DEFERRED_THRESHOLD = 10;
-/** Load MCP tools from .oh/config.yaml mcpServers list. Returns empty array if none configured. */
-export async function loadMcpTools() {
+/** Load MCP tools from .oh/config.yaml mcpServers list (and/or `--mcp-config` overrides). Returns empty array if none configured. */
+export async function loadMcpTools(opts = {}) {
     installExitHandler();
     const cfg = readOhConfig();
-    const servers = cfg?.mcpServers ?? [];
+    const fromConfig = opts.strict ? [] : (cfg?.mcpServers ?? []);
+    const fromExtra = opts.extraServers ?? [];
+    // Dedup by name — extras win on conflict so --mcp-config can override a
+    // project-config entry without --strict.
+    const byName = new Map();
+    for (const s of fromConfig)
+        byName.set(s.name, s);
+    for (const s of fromExtra)
+        byName.set(s.name, s);
+    const servers = Array.from(byName.values());
     if (servers.length === 0)
         return [];
     const tools = [];
@@ -45,10 +99,12 @@ export async function loadMcpTools() {
     for (const result of results) {
         if (result.status === "rejected") {
             console.warn(`[mcp] Failed to connect: ${result.reason instanceof Error ? result.reason.message : String(result.reason)}`);
+            debug("mcp", "connect failed", result.reason);
             continue;
         }
         const { client, defs, server } = result.value;
         connectedClients.push(client);
+        debug("mcp", "connected", { server: server.name, tools: defs.length, deferred: defs.length > DEFERRED_THRESHOLD });
         if (defs.length > DEFERRED_THRESHOLD) {
             for (const def of defs) {
                 tools.push(new DeferredMcpTool(client, def.name, def.description ?? "", server.riskLevel));

package/dist/tools/EnterWorktreeTool/index.js CHANGED Viewed

@@ -1,5 +1,6 @@
 import { z } from "zod";
 import { createWorktree, isGitRepo } from "../../git/index.js";
+import { emitHook } from "../../harness/hooks.js";
 const inputSchema = z.object({
     branch: z.string().optional().describe("Branch name for the worktree (auto-generated if omitted)"),
 });
@@ -22,6 +23,9 @@ export const EnterWorktreeTool = {
         if (!path) {
             return { output: "Failed to create worktree.", isError: true };
         }
+        // Symmetric to taskCreated — fire only on the success path so audit hooks
+        // can react to the new worktree (e.g. set up a per-worktree scratch dir).
+        emitHook("worktreeCreate", { worktreePath: path, worktreeParent: context.workingDir });
         return { output: `Worktree created at: ${path}\nUse ExitWorktree to clean up when done.`, isError: false };
     },
     prompt() {

package/dist/tools/ExitWorktreeTool/index.js CHANGED Viewed

@@ -1,5 +1,6 @@
 import { z } from "zod";
 import { hasWorktreeChanges, removeWorktree } from "../../git/index.js";
+import { emitHook } from "../../harness/hooks.js";
 const inputSchema = z.object({
     path: z.string().describe("Path to the worktree to remove"),
     force: z.boolean().optional().describe("Force removal even with uncommitted changes"),
@@ -24,6 +25,12 @@ export const ExitWorktreeTool = {
         }
         try {
             removeWorktree(input.path);
+            // Fire after removeWorktree resolves so the hook only sees confirmed
+            // removals — symmetric to worktreeCreate firing on success.
+            emitHook("worktreeRemove", {
+                worktreePath: input.path,
+                worktreeForced: input.force ? "true" : "false",
+            });
             return { output: `Worktree removed: ${input.path}`, isError: false };
         }
         catch (err) {

package/dist/utils/debug.d.ts ADDED Viewed

@@ -0,0 +1,63 @@
+/**
+ * Categorized debug logger — gates verbose internal traces behind a runtime
+ * switch so they're silent by default but easy to flip on for support / CI.
+ *
+ * Activation precedence (highest first):
+ *   1. `configureDebug({ categories })` from a CLI flag (`--debug [cats]`)
+ *   2. `OH_DEBUG` env var
+ *
+ * Sink precedence:
+ *   1. `configureDebug({ file })` from `--debug-file <path>`
+ *   2. `OH_DEBUG_FILE` env var
+ *   3. `process.stderr` (default)
+ *
+ * Categories are arbitrary strings — call sites pick them. The CLI accepts a
+ * comma-separated list (`--debug mcp,hooks`) or `--debug` alone for "all".
+ *
+ * Wire pattern:
+ *   import { configureDebug, debug } from "./utils/debug.js";
+ *   configureDebug({ categories: opts.debug, file: opts.debugFile });
+ *   debug("mcp", "connected", server.name);
+ */
+/**
+ * Parse the raw flag value into a Set of enabled categories.
+ *
+ * Accepted values:
+ *   - `undefined` / empty / `false`     → no debug
+ *   - `true` / `"*"` / `"all"` / `"1"`  → all categories
+ *   - `"mcp,hooks,provider"`            → comma-separated explicit list
+ *
+ * Whitespace is trimmed and empty entries dropped, so `"mcp, ,hooks"` is
+ * equivalent to `"mcp,hooks"`. Pure function — exposed for testability.
+ */
+export declare function parseDebugCategories(raw: string | boolean | undefined): Set<string>;
+export interface ConfigureDebugOptions {
+    /** CLI flag value: `--debug` → true, `--debug mcp` → "mcp", absent → undefined. */
+    categories?: string | boolean | undefined;
+    /** CLI flag value: `--debug-file <path>` — appended to, never truncated. */
+    file?: string;
+    /** Test injection — overrides the file/stderr sink. Not used at runtime. */
+    sink?: NodeJS.WritableStream;
+}
+/**
+ * Apply debug configuration. Safe to call multiple times — later calls fully
+ * replace earlier state. When `categories` is undefined, falls back to
+ * `OH_DEBUG`; when `file` is undefined, falls back to `OH_DEBUG_FILE`.
+ *
+ * File output uses `appendFileSync` rather than a `WriteStream` so each
+ * `debug()` line lands on disk before the function returns. That trades a
+ * little throughput for ordering guarantees that matter when debugging
+ * crashes — a streamed sink could lose its tail buffer on `process.exit`.
+ */
+export declare function configureDebug(opts?: ConfigureDebugOptions): void;
+/** Whether the given category is currently emitting. Cheap — a Set lookup. */
+export declare function isDebugEnabled(category: string): boolean;
+/**
+ * Emit a debug line for the given category. Cheap no-op when the category is
+ * disabled — argument formatting is skipped entirely. Each line is prefixed
+ * with `[debug:<cat>] +<elapsed_ms>ms` so categories interleave readably.
+ */
+export declare function debug(category: string, ...args: unknown[]): void;
+/** @internal Test-only: reset module-level state between cases. */
+export declare function _resetDebugForTest(): void;
+//# sourceMappingURL=debug.d.ts.map

package/dist/utils/debug.js ADDED Viewed

@@ -0,0 +1,122 @@
+/**
+ * Categorized debug logger — gates verbose internal traces behind a runtime
+ * switch so they're silent by default but easy to flip on for support / CI.
+ *
+ * Activation precedence (highest first):
+ *   1. `configureDebug({ categories })` from a CLI flag (`--debug [cats]`)
+ *   2. `OH_DEBUG` env var
+ *
+ * Sink precedence:
+ *   1. `configureDebug({ file })` from `--debug-file <path>`
+ *   2. `OH_DEBUG_FILE` env var
+ *   3. `process.stderr` (default)
+ *
+ * Categories are arbitrary strings — call sites pick them. The CLI accepts a
+ * comma-separated list (`--debug mcp,hooks`) or `--debug` alone for "all".
+ *
+ * Wire pattern:
+ *   import { configureDebug, debug } from "./utils/debug.js";
+ *   configureDebug({ categories: opts.debug, file: opts.debugFile });
+ *   debug("mcp", "connected", server.name);
+ */
+import { appendFileSync } from "node:fs";
+const ALL = "*";
+let enabledCategories = new Set();
+let debugFilePath;
+let sinkOverride;
+let started = Date.now();
+/**
+ * Parse the raw flag value into a Set of enabled categories.
+ *
+ * Accepted values:
+ *   - `undefined` / empty / `false`     → no debug
+ *   - `true` / `"*"` / `"all"` / `"1"`  → all categories
+ *   - `"mcp,hooks,provider"`            → comma-separated explicit list
+ *
+ * Whitespace is trimmed and empty entries dropped, so `"mcp, ,hooks"` is
+ * equivalent to `"mcp,hooks"`. Pure function — exposed for testability.
+ */
+export function parseDebugCategories(raw) {
+    if (raw === undefined || raw === false || raw === "")
+        return new Set();
+    if (raw === true)
+        return new Set([ALL]);
+    const lower = raw.toLowerCase();
+    if (lower === "*" || lower === "all" || lower === "true" || lower === "1")
+        return new Set([ALL]);
+    return new Set(raw
+        .split(",")
+        .map((s) => s.trim())
+        .filter(Boolean));
+}
+/**
+ * Apply debug configuration. Safe to call multiple times — later calls fully
+ * replace earlier state. When `categories` is undefined, falls back to
+ * `OH_DEBUG`; when `file` is undefined, falls back to `OH_DEBUG_FILE`.
+ *
+ * File output uses `appendFileSync` rather than a `WriteStream` so each
+ * `debug()` line lands on disk before the function returns. That trades a
+ * little throughput for ordering guarantees that matter when debugging
+ * crashes — a streamed sink could lose its tail buffer on `process.exit`.
+ */
+export function configureDebug(opts = {}) {
+    const rawCats = opts.categories !== undefined ? opts.categories : process.env.OH_DEBUG;
+    enabledCategories = parseDebugCategories(rawCats);
+    sinkOverride = opts.sink;
+    debugFilePath = opts.sink ? undefined : (opts.file ?? process.env.OH_DEBUG_FILE);
+    started = Date.now();
+}
+/** Whether the given category is currently emitting. Cheap — a Set lookup. */
+export function isDebugEnabled(category) {
+    return enabledCategories.has(ALL) || enabledCategories.has(category);
+}
+/**
+ * Emit a debug line for the given category. Cheap no-op when the category is
+ * disabled — argument formatting is skipped entirely. Each line is prefixed
+ * with `[debug:<cat>] +<elapsed_ms>ms` so categories interleave readably.
+ */
+export function debug(category, ...args) {
+    if (!isDebugEnabled(category))
+        return;
+    const elapsed = Date.now() - started;
+    const formatted = args
+        .map((a) => {
+        if (typeof a === "string")
+            return a;
+        if (a instanceof Error)
+            return a.stack ?? a.message;
+        try {
+            return JSON.stringify(a);
+        }
+        catch {
+            return String(a);
+        }
+    })
+        .join(" ");
+    const line = `[debug:${category}] +${elapsed}ms ${formatted}\n`;
+    if (sinkOverride) {
+        sinkOverride.write(line);
+    }
+    else if (debugFilePath) {
+        try {
+            appendFileSync(debugFilePath, line);
+        }
+        catch (err) {
+            // Fall back to stderr so a broken --debug-file doesn't swallow output.
+            process.stderr.write(`[debug] could not append to '${debugFilePath}': ${err instanceof Error ? err.message : String(err)}\n`);
+            process.stderr.write(line);
+            debugFilePath = undefined;
+        }
+    }
+    else {
+        process.stderr.write(line);
+    }
+}
+/** @internal Test-only: reset module-level state between cases. */
+export function _resetDebugForTest() {
+    enabledCategories = new Set();
+    debugFilePath = undefined;
+    sinkOverride = undefined;
+    started = Date.now();
+}
+//# sourceMappingURL=debug.js.map

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@zhijiewang/openharness",
-  "version": "2.20.0",
+  "version": "2.21.0",
   "description": "Open-source terminal coding agent. Works with any LLM.",
   "type": "module",
   "bin": {