@zhihand/mcp 0.19.1 → 0.22.0 (npm version diff, via Mend)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

package/README.md +44 -14
package/bin/zhihand +21 -11
package/dist/core/config.d.ts +9 -0
package/dist/core/config.js +12 -0
package/dist/core/resolve-path.js +2 -2
package/dist/daemon/dispatcher.d.ts +1 -2
package/dist/daemon/dispatcher.js +207 -126
package/dist/daemon/heartbeat.d.ts +7 -0
package/dist/daemon/heartbeat.js +11 -1
package/dist/daemon/index.js +27 -8
package/dist/index.d.ts +1 -0
package/dist/index.js +1 -1
package/package.json +1 -1

package/README.md CHANGED

@@ -2,7 +2,7 @@
 ZhiHand MCP Server — let AI agents see and control your phone.
-Version: `0.16.0`
+Version: `0.20.0`
 ## What is this?
@@ -74,6 +74,8 @@ zhihand start -d           # Start daemon in background (detached)
 The daemon runs the MCP Server on `localhost:18686/mcp` (HTTP Streamable transport), maintains a brain heartbeat every 30 seconds (keeps the phone Brain indicator green), and listens for phone-initiated prompts.
+When started with `-d`, daemon logs are written to `~/.zhihand/daemon.log`.
 ### 3. Start using it
 Once configured, your AI agent can use ZhiHand tools directly. For example, in Claude Code:
@@ -90,18 +92,19 @@ Once configured, your AI agent can use ZhiHand tools directly. For example, in C
 ```
 zhihand setup              Interactive setup: pair + detect tools + auto-select + configure MCP + start daemon
 zhihand start              Start daemon (MCP Server + Relay + Config API)
-zhihand start -d           Start daemon in background (detached)
+zhihand start -d           Start daemon in background (logs to ~/.zhihand/daemon.log)
 zhihand stop               Stop the running daemon
-zhihand status             Show daemon status, pairing info, device, and active backend
+zhihand status             Show daemon status, pairing info, device, backend, and model
 zhihand pair               Pair with a phone (QR code in terminal)
 zhihand detect             List detected CLI tools and their login status
 zhihand serve              Start MCP Server (stdio mode, backward compatible)
 zhihand --help             Show help
-zhihand claude             Switch backend to Claude Code (sends IPC to daemon, auto-configures MCP)
-zhihand codex              Switch backend to Codex CLI (sends IPC to daemon, auto-configures MCP)
-zhihand gemini             Switch backend to Gemini CLI (sends IPC to daemon, auto-configures MCP)
+zhihand gemini             Switch backend to Gemini CLI (default model: flash)
+zhihand claude             Switch backend to Claude Code (default model: sonnet)
+zhihand codex              Switch backend to Codex CLI (default model: gpt-5.4-mini)
+zhihand gemini --model pro Switch backend with custom model
 ```
 ### Daemon Lifecycle
@@ -123,15 +126,28 @@ The daemon is a single persistent process that runs:
 Use `zhihand claude`, `zhihand codex`, or `zhihand gemini` to switch the active backend:
 ```bash
-zhihand gemini             # Switch to Gemini CLI
-zhihand claude             # Switch to Claude Code
-zhihand codex              # Switch to Codex CLI
+zhihand gemini                # Switch to Gemini CLI (model: flash)
+zhihand claude                # Switch to Claude Code (model: sonnet)
+zhihand codex                 # Switch to Codex CLI (model: gpt-5.4-mini)
+zhihand gemini --model pro    # Use a custom model
+zhihand claude -m opus        # Short flag form
 ```
+Each backend has a **default model alias** that resolves to the latest version:
+| Backend | Default | Alias examples | Resolution |
+|---------|---------|---------------|------------|
+| Gemini CLI | `flash` | `flash`, `pro` | Gemini CLI resolves natively (e.g. flash → gemini-2.5-flash) |
+| Claude Code | `sonnet` | `sonnet`, `opus`, `haiku` | Claude Code resolves natively (e.g. sonnet → claude-sonnet-4) |
+| Codex CLI | `gpt-5.4-mini` | any full model name | Codex requires full model names |
+Model resolution priority: `--model` flag > `ZHIHAND_MODEL` env > `ZHIHAND_<BACKEND>_MODEL` env > default.
 When you switch:
 - The command sends an **IPC message to the running daemon**
 - MCP config is **automatically added** to the new backend
 - MCP config is **automatically removed** from the previous backend
+- The model selection is **persisted** to `~/.zhihand/backend.json`
 - If the tool is not installed, an error is shown
 ### Options
@@ -139,6 +155,9 @@ When you switch:
 | Option | Description |
 |---|---|
 | `--device <name>` | Use a specific paired device (if you have multiple) |
+| `--model, -m <name>` | Set model alias (e.g. `flash`, `pro`, `sonnet`, `opus`, `gpt-5.4-mini`) |
+| `--port <port>` | Override daemon port (default: 18686) |
+| `-d, --detach` | Run daemon in background |
 | `-h, --help` | Show help |
 ### Environment Variables
@@ -147,6 +166,10 @@ When you switch:
 |---|---|
 | `ZHIHAND_DEVICE` | Default device name (same as `--device`) |
 | `ZHIHAND_CLI` | Override CLI tool selection for mobile-initiated tasks |
+| `ZHIHAND_MODEL` | Override model for all backends |
+| `ZHIHAND_GEMINI_MODEL` | Override model for Gemini only |
+| `ZHIHAND_CLAUDE_MODEL` | Override model for Claude only |
+| `ZHIHAND_CODEX_MODEL` | Override model for Codex only |
 ## MCP Tools
@@ -160,12 +183,17 @@ The main phone control tool. Supports these actions:
 |---|---|---|
 | `click` | `xRatio`, `yRatio` | Tap at normalized coordinates [0,1] |
 | `doubleclick` | `xRatio`, `yRatio` | Double-tap |
-| `rightclick` | `xRatio`, `yRatio` | Right-click (long press) |
-| `middleclick` | `xRatio`, `yRatio` | Middle-click |
+| `longclick` | `xRatio`, `yRatio`, `durationMs` | Long press (default 800ms) |
+| `rightclick` | `xRatio`, `yRatio` | Right-click (desktop/BLE HID) |
+| `middleclick` | `xRatio`, `yRatio` | Middle-click (desktop/BLE HID) |
 | `type` | `text` | Type text into the focused field |
-| `swipe` | `startXRatio`, `startYRatio`, `endXRatio`, `endYRatio` | Swipe gesture |
+| `swipe` | `startXRatio`, `startYRatio`, `endXRatio`, `endYRatio`, `durationMs` | Swipe gesture (default 300ms) |
 | `scroll` | `xRatio`, `yRatio`, `direction`, `amount` | Scroll up/down/left/right |
 | `keycombo` | `keys` | Key combination (e.g. `"ctrl+c"`, `"alt+tab"`) |
+| `back` | — | Press system Back button |
+| `home` | — | Press system Home button |
+| `enter` | — | Press Enter key |
+| `open_app` | `appPackage`, `bundleId`, `urlScheme`, `appName` | Open an application |
 | `clipboard` | `clipboardAction` (`get`/`set`), `text` | Read or write clipboard |
 | `wait` | `durationMs` | Wait (local sleep, no server round-trip) |
 | `screenshot` | — | Capture screen immediately |
@@ -234,8 +262,9 @@ Pairing credentials are stored at:
 ```
 ~/.zhihand/
 ├── credentials.json    # Device credentials (credentialId, controllerToken, endpoint)
-├── backend.json        # Active backend selection (claudecode/codex/gemini)
+├── backend.json        # Active backend + model selection
 ├── daemon.pid          # Daemon PID file (for zhihand stop)
+├── daemon.log          # Daemon log output (when started with -d)
 └── state.json          # Current pairing session state
 ```
@@ -267,7 +296,8 @@ packages/mcp/
 │   ├── index.ts             # MCP Server (stdio transport, legacy)
 │   ├── openclaw.adapter.ts  # OpenClaw Plugin adapter (thin wrapper)
 │   ├── core/
-│   │   ├── config.ts        # Credential & config management (~/.zhihand/)
+│   │   ├── config.ts        # Credential & config management (~/.zhihand/), default models
+│   │   ├── resolve-path.ts  # Platform-aware executable path resolution (gemini/claude/codex)
 │   │   ├── command.ts       # Command creation, enqueue, ACK formatting
 │   │   ├── screenshot.ts    # Binary screenshot fetch (JPEG)
 │   │   ├── sse.ts           # SSE client + hybrid ACK (SSE push + polling fallback)
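
The README hunk above states the model resolution priority as `--model` flag > `ZHIHAND_MODEL` env > `ZHIHAND_<BACKEND>_MODEL` env > default. A minimal sketch of that order follows, written as a pure function so it can be exercised without touching `process.env`. The helper name `pickModel`, its parameter shape, and the `ENV_SUFFIX` table are hypothetical and not part of the package; the default aliases are copied from the README table.

```javascript
// Default aliases as listed in the README table of this diff.
const DEFAULTS = { gemini: "flash", claudecode: "sonnet", codex: "gpt-5.4-mini" };

// Hypothetical lookup: backend name -> infix of its per-backend env variable.
const ENV_SUFFIX = { gemini: "GEMINI", claudecode: "CLAUDE", codex: "CODEX" };

// Hypothetical helper mirroring the stated priority order.
function pickModel(backend, flag, env) {
  if (flag) return flag;                                   // 1. --model / -m
  if (env.ZHIHAND_MODEL) return env.ZHIHAND_MODEL;         // 2. global env
  const perBackend = env[`ZHIHAND_${ENV_SUFFIX[backend]}_MODEL`];
  if (perBackend) return perBackend;                       // 3. per-backend env
  return DEFAULTS[backend];                                // 4. default alias
}

// Both variables set: the global one is consulted first.
console.log(pickModel("gemini", undefined, {
  ZHIHAND_MODEL: "pro",
  ZHIHAND_GEMINI_MODEL: "flash",
})); // pro
```

One consequence of the stated order is that a global `ZHIHAND_MODEL` takes precedence over a backend-specific variable, so the per-backend variables only take effect when the global one is unset.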

package/bin/zhihand CHANGED

@@ -6,7 +6,7 @@ import { startStdioServer } from "../dist/index.js";
 import { startDaemon, stopDaemon, isAlreadyRunning } from "../dist/daemon/index.js";
 import { detectCLITools, formatDetectedTools } from "../dist/cli/detect.js";
 import { detectAndSetupOpenClaw } from "../dist/cli/openclaw.js";
-import { loadDefaultCredential, loadBackendConfig, saveBackendConfig } from "../dist/core/config.js";
+import { loadDefaultCredential, loadBackendConfig, saveBackendConfig, DEFAULT_MODELS } from "../dist/core/config.js";
 import { executePairing } from "../dist/core/pair.js";
 import { configureMCP, displayName } from "../dist/cli/mcp-config.js";
@@ -23,6 +23,7 @@ const { positionals, values } = parseArgs({
   strict: false,
   options: {
     device: { type: "string" },
+    model: { type: "string", short: "m" },
     help: { type: "boolean", short: "h", default: false },
     detach: { type: "boolean", short: "d", default: false },
     port: { type: "string" },
@@ -41,9 +42,10 @@ Usage:
   zhihand stop               Stop daemon
   zhihand status             Show status (pairing, backend, brain)
-  zhihand gemini             Switch backend to Gemini CLI
-  zhihand claude             Switch backend to Claude Code
-  zhihand codex              Switch backend to Codex CLI
+  zhihand gemini             Switch backend to Gemini CLI (default model: flash)
+  zhihand claude             Switch backend to Claude Code (default model: sonnet)
+  zhihand codex              Switch backend to Codex CLI (default model: gpt-5.4-mini)
+  zhihand gemini --model pro Switch backend with custom model
   zhihand setup              Interactive setup: pair + configure + start
   zhihand pair               Pair with a phone device
@@ -53,6 +55,7 @@ Usage:
 Options:
   --device <name>    Use a specific paired device
+  --model, -m <name> Set model alias (e.g. flash, pro, sonnet, opus, gpt-5.4-mini)
   --port <port>      Override daemon port (default: 18686)
   -d, --detach       Run daemon in background
   -h, --help         Show this help
@@ -82,12 +85,15 @@ if (Object.prototype.hasOwnProperty.call(CLI_TOOL_MAP, command)) {
   const config = loadBackendConfig();
   const previous = config.activeBackend;
-  if (previous === backendName) {
-    console.log(`Already using ${displayName(backendName)} as backend.`);
+  const userModel = values.model ?? null;
+  const effectiveModel = userModel ?? DEFAULT_MODELS[backendName];
+  if (previous === backendName && !userModel) {
+    console.log(`Already using ${displayName(backendName)} as backend (model: ${effectiveModel}).`);
     process.exit(0);
   }
-  console.log(`Switching backend to ${displayName(backendName)}...`);
+  console.log(`Switching backend to ${displayName(backendName)} (model: ${effectiveModel})...`);
   // Configure MCP (HTTP transport)
   const { configured, removed } = configureMCP(backendName, previous);
@@ -100,7 +106,7 @@ if (Object.prototype.hasOwnProperty.call(CLI_TOOL_MAP, command)) {
         const res = await fetch(`http://127.0.0.1:${port}/internal/backend`, {
           method: "POST",
           headers: { "Content-Type": "application/json" },
-          body: JSON.stringify({ backend: backendName }),
+          body: JSON.stringify({ backend: backendName, model: userModel }),
           signal: AbortSignal.timeout(5000),
         });
         if (res.ok) {
@@ -108,11 +114,11 @@ if (Object.prototype.hasOwnProperty.call(CLI_TOOL_MAP, command)) {
         }
       } catch {
         // Daemon not responding, just save config
-        saveBackendConfig({ activeBackend: backendName });
+        saveBackendConfig({ activeBackend: backendName, model: userModel });
         console.log(`\nBackend config saved. Daemon not responding — restart with 'zhihand start'.`);
       }
     } else {
-      saveBackendConfig({ activeBackend: backendName });
+      saveBackendConfig({ activeBackend: backendName, model: userModel });
       console.log(`\nBackend switched to ${displayName(backendName)}.`);
       console.log(`Start the daemon to receive prompts: zhihand start`);
     }
@@ -199,7 +205,11 @@ switch (command) {
     } else {
       console.log("No paired device. Run: zhihand setup");
     }
-    console.log(`Active backend: ${backend.activeBackend ? displayName(backend.activeBackend) : "(none)"}`);
+    const backendLabel = backend.activeBackend ? displayName(backend.activeBackend) : "(none)";
+    const modelLabel = backend.activeBackend
+      ? (backend.model ?? DEFAULT_MODELS[backend.activeBackend])
+      : "-";
+    console.log(`Active backend: ${backendLabel} (model: ${modelLabel})`);
     console.log(`Daemon: ${daemonPid ? `running (PID ${daemonPid})` : "not running"}`);
     // If daemon running, get live status
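
In the hunks above, the switch command saves `model: userModel`, which is `null` when no `--model` flag is given, and `zhihand status` then falls back to the backend default. A minimal sketch of that fallback, assuming a config object shaped like `BackendConfig`; the helper name `modelLabel` is hypothetical.

```javascript
// Default aliases as defined in DEFAULT_MODELS in this diff.
const DEFAULT_MODELS = { gemini: "flash", claudecode: "sonnet", codex: "gpt-5.4-mini" };

// Hypothetical helper reproducing the status label logic from the hunk above.
function modelLabel(config) {
  if (!config.activeBackend) return "-";                      // no backend selected
  return config.model ?? DEFAULT_MODELS[config.activeBackend]; // null/undefined -> default
}

console.log(modelLabel({ activeBackend: "codex", model: null }));    // gpt-5.4-mini
console.log(modelLabel({ activeBackend: "claudecode", model: "opus" })); // opus
console.log(modelLabel({ activeBackend: null }));                    // -
```

Because `??` only falls through on `null` or `undefined`, a persisted `model: null` and a `backend.json` written before this change (no `model` key at all) both display the default alias.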

package/dist/core/config.d.ts CHANGED

@@ -19,7 +19,16 @@ export interface ZhiHandConfig {
 export type BackendName = "claudecode" | "codex" | "gemini" | "openclaw";
 export interface BackendConfig {
     activeBackend: BackendName | null;
+    model?: string | null;
 }
+/**
+ * Default model aliases per backend.
+ * These are generic aliases that the respective CLIs resolve to the latest version:
+ *   - Gemini CLI: "flash" → latest flash model (e.g. gemini-2.5-flash)
+ *   - Claude Code: "sonnet" → latest sonnet (e.g. claude-sonnet-4-20250514)
+ *   - Codex CLI: requires full model name, no alias support
+ */
+export declare const DEFAULT_MODELS: Record<Exclude<BackendName, "openclaw">, string>;
 export declare function resolveZhiHandDir(): string;
 export declare function ensureZhiHandDir(): void;
 export declare function loadCredentialStore(): CredentialStore | null;

package/dist/core/config.js CHANGED

@@ -1,6 +1,18 @@
 import fs from "node:fs";
 import path from "node:path";
 import os from "node:os";
+/**
+ * Default model aliases per backend.
+ * These are generic aliases that the respective CLIs resolve to the latest version:
+ *   - Gemini CLI: "flash" → latest flash model (e.g. gemini-2.5-flash)
+ *   - Claude Code: "sonnet" → latest sonnet (e.g. claude-sonnet-4-20250514)
+ *   - Codex CLI: requires full model name, no alias support
+ */
+export const DEFAULT_MODELS = {
+    gemini: "flash", // Gemini CLI resolves to latest flash
+    claudecode: "sonnet", // Claude Code resolves to latest sonnet
+    codex: "gpt-5.4-mini", // Codex default: latest GPT mini model
+};
 const ZHIHAND_DIR = path.join(os.homedir(), ".zhihand");
 const CREDENTIALS_PATH = path.join(ZHIHAND_DIR, "credentials.json");
 const STATE_PATH = path.join(ZHIHAND_DIR, "state.json");

package/dist/core/resolve-path.js CHANGED

@@ -2,7 +2,7 @@
  * Platform-aware executable path resolution.
  * Shared by both the CLI detection layer and the daemon dispatcher.
  */
-import { execSync } from "node:child_process";
+import { execFileSync } from "node:child_process";
 import fs from "node:fs";
 import path from "node:path";
 import os from "node:os";
@@ -19,7 +19,7 @@ export function resolveExecutable(name, fallbackPaths) {
         return cached;
     // Try `which` first (works when the binary is in PATH)
     try {
-        const resolved = execSync(`which ${name}`, { encoding: "utf8", timeout: 5000, stdio: ["pipe", "pipe", "pipe"] }).trim();
+        const resolved = execFileSync("which", [name], { encoding: "utf8", timeout: 5000, stdio: ["pipe", "pipe", "pipe"] }).trim();
         if (resolved) {
             cache.set(name, resolved);
             return resolved;
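
The hunk above swaps `execSync` with an interpolated command string for `execFileSync` with an argument array. `execSync` runs its command through a shell, so a name containing shell metacharacters would be parsed by that shell. `execFileSync` passes each array element to the program as a literal argument. A minimal sketch of the difference follows; the wrapper `whichOrNull` is hypothetical and is not the package's `resolveExecutable`. It uses `require` so it runs as a plain script, whereas the package itself uses ESM imports.

```javascript
const { execFileSync } = require("node:child_process");

// Hypothetical wrapper around `which`, using the same options as the diff.
function whichOrNull(name) {
  try {
    // `name` reaches `which` as a single argv entry; no shell parses it.
    const out = execFileSync("which", [name], {
      encoding: "utf8",
      timeout: 5000,
      stdio: ["pipe", "pipe", "pipe"],
    }).trim();
    return out || null;
  } catch {
    // Non-zero exit (not found), timeout, or `which` itself is unavailable.
    return null;
  }
}

// The whole string is looked up as one (nonexistent) program name;
// the `echo` part is never executed as a separate command.
console.log(whichOrNull("node; echo injected"));
```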

package/dist/daemon/dispatcher.d.ts CHANGED

@@ -5,8 +5,7 @@ export interface DispatchResult {
     durationMs: number;
 }
 /**
- * Kill the active child process. Returns a promise that resolves
- * when the child has exited (or immediately if no child).
+ * Kill the active session. Called by daemon on shutdown or backend switch.
  */
 export declare function killActiveChild(): Promise<void>;
 export declare function dispatchToCLI(backend: Exclude<BackendName, "openclaw">, prompt: string, log: (msg: string) => void, model?: string): Promise<DispatchResult>;

package/dist/daemon/dispatcher.js CHANGED

@@ -3,10 +3,12 @@ import fsp from "node:fs/promises";
 import path from "node:path";
 import os from "node:os";
 import { fileURLToPath } from "node:url";
+import { DEFAULT_MODELS } from "../core/config.js";
 import { resolveGemini, resolveClaude, resolveCodex } from "../core/resolve-path.js";
-const CLI_TIMEOUT = 120_000; // 120s
+const CLI_TIMEOUT = 120_000; // 120s per prompt
 const SIGKILL_DELAY = 2_000; // 2s after SIGTERM
-const MAX_OUTPUT_BYTES = 100 * 1024; // 100KB
+const MAX_OUTPUT_BYTES = 100 * 1024; // 100KB (for one-shot backends)
+const MAX_HISTORY_TURNS = 20; // keep last N exchanges in conversation history
 // Gemini session file polling
 const SESSION_POLL_INTERVAL = 1_000; // 1s
 const SESSION_STABILITY_DELAY = 2_000; // wait 2s after outcome before returning
@@ -15,7 +17,8 @@ const __dirname = path.dirname(fileURLToPath(import.meta.url));
 const PTY_WRAP_SCRIPT = path.resolve(__dirname, "../../scripts/pty-wrap.py");
 // Gemini session directories
 const GEMINI_TMP_DIR = path.join(os.homedir(), ".gemini", "tmp");
-let activeChild = null;
+let session = null;
+const conversationHistory = [];
 // ── Gemini Session File Monitoring ─────────────────────────
 /** Safely read and parse a JSON file (single attempt, async). */
 async function loadJsonFile(filePath) {
@@ -25,7 +28,6 @@ async function loadJsonFile(filePath) {
         return typeof parsed === "object" && parsed !== null ? parsed : null;
     }
     catch {
-        // File locked or partial write — next poll cycle will retry
         return null;
     }
 }
@@ -55,7 +57,6 @@ function extractMessageText(message) {
         if (typeof obj.text === "string")
             return obj.text;
     }
-    // Fallback to displayContent
     const display = message.displayContent;
     if (typeof display === "string")
         return display;
@@ -85,7 +86,6 @@ function hasActiveToolCalls(message) {
 function checkSessionOutcome(messages) {
     if (messages.length === 0)
         return null;
-    // Get the latest turn messages (trailing messages from last user input)
     const trailing = [];
     for (let i = messages.length - 1; i >= 0; i--) {
         const msg = messages[i];
@@ -95,22 +95,18 @@ function checkSessionOutcome(messages) {
     }
     if (trailing.length === 0)
         return null;
-    // If any message has active tool calls, still in progress
     for (const msg of trailing) {
         if (hasActiveToolCalls(msg))
             return null;
     }
-    // Check from last message backwards for a result
     for (let i = trailing.length - 1; i >= 0; i--) {
         const msg = trailing[i];
         const msgType = String(msg.type ?? "").trim();
-        // Error/warning/info messages
         if (["error", "warning", "info"].includes(msgType)) {
             const text = extractMessageText(msg).trim();
             if (text)
                 return ["error", text];
         }
-        // Gemini response message
         if (msgType === "gemini") {
             const text = extractMessageText(msg).trim();
             if (text)
@@ -151,21 +147,19 @@ async function findLatestSessionFile(afterTime, promptText) {
                 }
             }
         }
-        // Sort newest first, then validate content matches our prompt
         candidates.sort((a, b) => b.mtime - a.mtime);
         const promptPrefix = promptText.slice(0, 50);
         for (const candidate of candidates) {
             const data = await loadJsonFile(candidate.path);
             if (!data || !Array.isArray(data.messages))
                 continue;
-            // Check first user message matches our prompt
             for (const msg of data.messages) {
                 if (String(msg.type ?? "").trim() !== "user")
                     continue;
                 const text = extractMessageText(msg);
                 if (text.startsWith(promptPrefix))
                     return candidate.path;
-                break; // Only check first user message
+                break;
             }
         }
         return null;
@@ -174,33 +168,44 @@ async function findLatestSessionFile(afterTime, promptText) {
         return null;
     }
 }
+/** Count how many "user" type messages are in the session */
+function countUserMessages(messages) {
+    return messages.filter(m => String(m.type ?? "").trim() === "user").length;
+}
 /**
- * Poll gemini session files for the response.
- * Returns the final text when gemini completes, or null on timeout.
+ * Poll gemini session file for the response to the current prompt.
+ *
+ * For persistent sessions:
+ * - First prompt: find the session file, wait for first response, keep process alive
+ * - Subsequent: session file known, wait for new user message + response
  */
-function pollGeminiSession(child, startTime, promptText, log) {
+function pollGeminiSession(child, startTime, promptText, log, knownSessionFile, expectedUserCount) {
     return new Promise((resolve) => {
-        let sessionFile = null;
+        let sessionFile = knownSessionFile;
         let outcomeAt = null;
         let finalResult = null;
         let settled = false;
         let pollTimeout = null;
+        let newUserSeen = knownSessionFile === null; // first prompt: don't wait for user msg
         function settle(result) {
             if (settled)
                 return;
             settled = true;
             if (pollTimeout)
                 clearTimeout(pollTimeout);
-            // Kill the gemini process now that we have the answer
-            closeChild(child);
+            // DON'T kill the child — persistent session keeps it alive
             resolve(result);
         }
         async function poll() {
             if (settled)
                 return;
             const elapsed = Date.now() - startTime;
-            // Timeout
             if (elapsed > CLI_TIMEOUT) {
+                // Kill the timed-out session to prevent zombie processes
+                if (session?.child === child) {
+                    session.alive = false;
+                    log(`[gemini] Session timed out — killing process`);
+                }
                 closeChild(child);
                 settle({
                     text: "Gemini timed out after 120s.",
@@ -209,16 +214,18 @@ function pollGeminiSession(child, startTime, promptText, log) {
                 });
                 return;
             }
-            // Find session file if not yet found
+            // Find session file if not yet found (first prompt only)
             if (!sessionFile) {
                 sessionFile = await findLatestSessionFile(startTime, promptText);
                 if (sessionFile) {
                     log(`[gemini] Session file found: ${path.basename(sessionFile)}`);
+                    if (session)
+                        session.geminiSessionFile = sessionFile;
                 }
                 schedulePoll();
                 return;
             }
-            // Read session file and check for outcome
+            // Read session file
             const conversation = await loadJsonFile(sessionFile);
             if (!conversation) {
                 schedulePoll();
@@ -229,15 +236,23 @@ function pollGeminiSession(child, startTime, promptText, log) {
                 schedulePoll();
                 return;
             }
+            // For subsequent prompts: wait until the new user message appears
+            if (!newUserSeen) {
+                const userCount = countUserMessages(messages);
+                if (userCount < expectedUserCount) {
+                    schedulePoll();
+                    return;
+                }
+                newUserSeen = true;
+                log(`[gemini] New user message detected (turn #${expectedUserCount})`);
+            }
             const outcome = checkSessionOutcome(messages);
             if (!outcome) {
-                // Still in progress, reset stability timer
                 outcomeAt = null;
                 finalResult = null;
                 schedulePoll();
                 return;
             }
-            // Outcome detected — wait for stability (2s) before returning
             if (!outcomeAt) {
                 outcomeAt = Date.now();
                 finalResult = outcome;
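
For reused sessions, the hunk above delays the outcome check until the session file contains the user message for the current turn. Presumably this keeps the previous turn's reply from being read as the answer to the new prompt. A minimal sketch of that gate follows. The helper `turnHasStarted` is hypothetical, and the `text` field on the sample messages is an assumption for illustration; only the `type` field is taken from the diff.

```javascript
// Same predicate as countUserMessages in the diff.
function countUserMessages(messages) {
  return messages.filter((m) => String(m.type ?? "").trim() === "user").length;
}

// Hypothetical helper: true once the user message for the expected turn is present.
function turnHasStarted(messages, expectedUserCount) {
  return countUserMessages(messages) >= expectedUserCount;
}

// Session file after turn 1 (message shape is illustrative).
const sessionMessages = [
  { type: "user", text: "open settings" },
  { type: "gemini", text: "Done." },
];

console.log(turnHasStarted(sessionMessages, 2)); // false

// Turn 2's prompt has now been written to the session file.
sessionMessages.push({ type: "user", text: "go back" });
console.log(turnHasStarted(sessionMessages, 2)); // true
```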
@@ -261,12 +276,16 @@ function pollGeminiSession(child, startTime, promptText, log) {
                 return;
             pollTimeout = setTimeout(() => { poll(); }, SESSION_POLL_INTERVAL);
         }
-        // Start polling
         schedulePoll();
-        // Also handle process exit (in case it crashes before producing session file)
-        child.on("close", (code) => {
+        // Handle unexpected process exit
+        const onClose = (code) => {
             if (settled)
                 return;
+            // Mark session as dead
+            if (session?.child === child) {
+                session.alive = false;
+                log(`[gemini] Session process exited with code ${code}`);
+            }
             // Give a final chance to read the session file
             setTimeout(async () => {
                 if (settled)
@@ -291,14 +310,14 @@ function pollGeminiSession(child, startTime, promptText, log) {
                     durationMs: Date.now() - startTime,
                 });
             }, 500);
-        });
+        };
+        child.on("close", onClose);
     });
 }
-/** Gracefully close a child process: EOF → SIGTERM → SIGKILL. */
+/** Gracefully close a child process: SIGTERM → SIGKILL. */
 function closeChild(child) {
     if (child.killed || child.exitCode !== null)
         return;
-    // Try SIGTERM first
     child.kill("SIGTERM");
     setTimeout(() => {
         if (!child.killed && child.exitCode === null) {
@@ -306,29 +325,29 @@ function closeChild(child) {
         }
     }, SIGKILL_DELAY);
 }
-/**
- * Kill the active child process. Returns a promise that resolves
- * when the child has exited (or immediately if no child).
- */
-export function killActiveChild() {
-    if (!activeChild || activeChild.killed) {
+/** Close the persistent session and clear conversation history. */
+function closeSession() {
+    if (!session)
+        return Promise.resolve();
+    const s = session;
+    session = null;
+    if (!s.alive)
         return Promise.resolve();
-    }
     return new Promise((resolve) => {
-        const child = activeChild;
-        child.once("close", () => resolve());
-        closeChild(child);
-        // Safety: resolve after SIGKILL_DELAY + 1s even if no close event
+        s.child.once("close", () => resolve());
+        closeChild(s.child);
         setTimeout(() => resolve(), SIGKILL_DELAY + 1000);
     });
 }
-// ── System Prompt ─────────────────────────────────────────
 /**
- * Wrap the user's raw prompt with system context so the CLI backend
- * knows about the connected phone and how to use zhihand MCP tools.
+ * Kill the active session. Called by daemon on shutdown or backend switch.
  */
-function wrapPrompt(userPrompt) {
-    return `You are ZhiHand, an AI assistant connected to the user's mobile phone via MCP tools.
+export async function killActiveChild() {
+    await closeSession();
+    conversationHistory.length = 0;
+}
+// ── System Prompt ─────────────────────────────────────────
+const SYSTEM_CONTEXT = `You are ZhiHand, an AI assistant connected to the user's mobile phone via MCP tools.
 ## Available MCP Tools
@@ -358,23 +377,54 @@ Control the phone. Requires "action" parameter. All coordinates use normalized r
 - When the user asks to see their screen, ALWAYS call zhihand_screenshot first.
 - When the user asks to open an app (e.g. WeChat, Settings), use open_app action.
 - When the user asks to go back/home, use back/home actions.
-- For all tap/click operations, use xRatio and yRatio (0-1 normalized coordinates based on the screenshot).
-User message:
-${userPrompt}`;
+- For all tap/click operations, use xRatio and yRatio (0-1 normalized coordinates based on the screenshot).`;
+/**
+ * Build the full system prompt with optional conversation history.
+ * Used for first prompt in persistent sessions and all one-shot calls.
+ */
+function wrapPrompt(userPrompt, history) {
+    let result = SYSTEM_CONTEXT;
+    if (history && history.length > 0) {
+        result += "\n\n## Recent Conversation\n";
+        for (const turn of history) {
+            const label = turn.role === "user" ? "User" : "Assistant";
+            // Truncate long assistant responses in history to save tokens
+            const text = turn.text.length > 500 ? turn.text.slice(0, 500) + "..." : turn.text;
+            result += `\n${label}: ${text}\n`;
+        }
+    }
+    result += `\nUser message:\n${userPrompt}`;
+    return result;
+}
+// ── Conversation History Helpers ─────────────────────────────
+function recordTurn(role, text) {
+    conversationHistory.push({ role, text });
+    // Trim to keep last N exchanges (2 turns per exchange)
+    while (conversationHistory.length > MAX_HISTORY_TURNS * 2) {
+        conversationHistory.shift();
+    }
 }
 // ── Dispatch Entrypoint ────────────────────────────────────
 export function dispatchToCLI(backend, prompt, log, model) {
     const startTime = Date.now();
-    const wrappedPrompt = wrapPrompt(prompt);
+    const resolvedModel = resolveModel(backend, model);
+    // Check if existing session matches — if not, close it
+    const canReuse = session?.alive && session.backend === backend && session.model === resolvedModel;
+    if (session && !canReuse) {
+        log(`[dispatch] Session mismatch (was ${session.backend}/${session.model}), closing old session`);
+        closeSession();
+        conversationHistory.length = 0;
+    }
+    const sessionLabel = canReuse ? `#${session.promptCount + 1}` : "new";
+    log(`[dispatch] Backend: ${backend}, Model: ${resolvedModel}, Session: ${sessionLabel}`);
     if (backend === "gemini") {
-        return dispatchGemini(wrappedPrompt, startTime, log, model);
+        return dispatchGeminiPersistent(prompt, startTime, log, resolvedModel);
     }
     if (backend === "codex") {
-        return dispatchCodex(wrappedPrompt, startTime, model);
+        return dispatchCodexWithHistory(prompt, startTime, log, resolvedModel);
     }
     if (backend === "claudecode") {
-        return dispatchClaude(wrappedPrompt, startTime, model);
+        return dispatchClaudeWithHistory(prompt, startTime, log, resolvedModel);
     }
     return Promise.resolve({
         text: `Unsupported backend: ${backend}`,
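
The history handling in the hunk above can be sketched in isolation. The cap is `MAX_HISTORY_TURNS * 2` entries (one user and one assistant entry per exchange), and each entry is clipped to 500 characters when rendered into the prompt. `recordTurn` follows the diff; `renderHistory` is a hypothetical stand-in for the history loop inside `wrapPrompt`.

```javascript
const MAX_HISTORY_TURNS = 20;
const history = [];

// As in the diff: append, then drop the oldest entries beyond the cap.
function recordTurn(role, text) {
  history.push({ role, text });
  while (history.length > MAX_HISTORY_TURNS * 2) {
    history.shift();
  }
}

// Hypothetical helper: formats entries the way the wrapPrompt loop does.
function renderHistory(turns) {
  return turns
    .map(({ role, text }) => {
      const label = role === "user" ? "User" : "Assistant";
      const clipped = text.length > 500 ? text.slice(0, 500) + "..." : text;
      return `${label}: ${clipped}`;
    })
    .join("\n");
}

// 25 exchanges produce 50 entries; only the newest 40 survive.
for (let i = 1; i <= 25; i++) {
  recordTurn("user", `question ${i}`);
  recordTurn("assistant", `answer ${i}`);
}

console.log(history.length);  // 40
console.log(history[0].text); // question 6
```

The 500-character clip applies to every entry regardless of role, so a long user prompt is shortened in later history as well.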
@@ -382,13 +432,46 @@ export function dispatchToCLI(backend, prompt, log, model) {
         durationMs: 0,
     });
 }
-// ── Gemini Dispatch (PTY + Session File Monitoring) ────────
-function dispatchGemini(prompt, startTime, log, model) {
-    const geminiModel = model ?? process.env.CLAUDE_GEMINI_MODEL ?? "gemini-3.1-pro-preview";
+/**
+ * Resolve the model to use for a backend.
+ * Priority: explicit parameter > ZHIHAND_MODEL env > backend-specific env > default alias.
+ */
+function resolveModel(backend, explicit) {
+    if (explicit)
+        return explicit;
+    const globalEnv = process.env.ZHIHAND_MODEL;
+    if (globalEnv)
+        return globalEnv;
+    const envMap = {
+        gemini: process.env.ZHIHAND_GEMINI_MODEL,
+        claudecode: process.env.ZHIHAND_CLAUDE_MODEL,
+        codex: process.env.ZHIHAND_CODEX_MODEL,
+    };
+    const perBackend = envMap[backend];
+    if (perBackend)
+        return perBackend;
+    return DEFAULT_MODELS[backend];
+}
+// ── Gemini Dispatch (Persistent PTY Session) ─────────────────
+async function dispatchGeminiPersistent(prompt, startTime, log, model) {
+    // Reuse existing session?
+    if (session?.alive && session.backend === "gemini") {
+        session.promptCount++;
+        const turnNum = session.promptCount;
+        log(`[gemini] Reusing session — sending prompt #${turnNum}`);
+        // Write raw prompt to PTY stdin (gemini already has system context from first prompt)
+        session.child.stdin?.write(prompt + "\n");
+        const result = await pollGeminiSession(session.child, startTime, prompt, log, session.geminiSessionFile, turnNum);
+        recordTurn("user", prompt);
+        recordTurn("assistant", result.text);
+        return result;
+    }
+    // New session — spawn gemini with first prompt
+    const wrappedPrompt = wrapPrompt(prompt);
     const cliArgs = [
         "--approval-mode", "yolo",
-        "--model", geminiModel,
-        "-i", prompt,
+        "--model", model,
+        "-i", wrappedPrompt,
     ];
     const env = {
         ...process.env,
@@ -396,54 +479,80 @@ function dispatchGemini(prompt, startTime, log, model) {
         TERM: "xterm-256color",
         COLORTERM: "truecolor",
     };
-    // Wrap with PTY so gemini sees isatty()==true
     const geminiPath = resolveGemini();
+    log(`[gemini] Starting new persistent session (model: ${model})`);
     const child = spawn("python3", [PTY_WRAP_SCRIPT, geminiPath, ...cliArgs], {
         env,
-        stdio: ["ignore", "pipe", "pipe"],
+        stdio: ["pipe", "pipe", "pipe"], // stdin=pipe for subsequent prompts
         detached: false,
     });
-    activeChild = child;
-    // Drain PTY output (discard — we read from session file instead)
+    session = {
+        child,
+        backend: "gemini",
+        model,
+        promptCount: 1,
+        alive: true,
+        geminiSessionFile: null,
+    };
+    // Handle unexpected exit — mark session dead
+    child.on("close", (code) => {
+        if (session?.child === child) {
+            session.alive = false;
+            log(`[gemini] Session process exited (code ${code})`);
+        }
+    });
+    // Drain PTY stdout/stderr (we read from session file, not stdout)
     child.stdout?.resume();
     child.stderr?.resume();
-    return pollGeminiSession(child, startTime, prompt, log);
+    const result = await pollGeminiSession(child, startTime, wrappedPrompt, log, null, // no known session file yet
+    1);
+    recordTurn("user", prompt);
+    recordTurn("assistant", result.text);
+    return result;
 }
-// ── Codex Dispatch ─────────────────────────────────────────
-function dispatchCodex(prompt, startTime, model) {
-    // --dangerously-bypass-approvals-and-sandbox is required so MCP tool calls
-    // are not auto-cancelled in non-interactive mode (--full-auto cancels them)
+// ── Codex Dispatch (One-shot with History) ────────────────────
+async function dispatchCodexWithHistory(prompt, startTime, log, model) {
+    // Include conversation history in the prompt for context
+    const fullPrompt = wrapPrompt(prompt, conversationHistory);
     const args = ["exec", "--dangerously-bypass-approvals-and-sandbox", "--skip-git-repo-check", "--json"];
-    const codexModel = model ?? process.env.CLAUDE_CODEX_MODEL;
-    if (codexModel) {
-        args.push("-m", codexModel);
-    }
-    args.push(prompt);
+    args.push("-m", model);
+    // Pass prompt via stdin to avoid ARG_MAX limit with long conversation history
+    args.push("-");
     const codexPath = resolveCodex();
+    log(`[codex] One-shot dispatch (history: ${conversationHistory.length} turns)`);
     const child = spawn(codexPath, args, {
         env: process.env,
-        stdio: ["ignore", "pipe", "pipe"],
+        stdio: ["pipe", "pipe", "pipe"],
         detached: false,
     });
-    activeChild = child;
-    return collectCodexOutput(child, startTime);
+    // Write prompt to stdin, then close to signal EOF
+    child.stdin?.write(fullPrompt);
+    child.stdin?.end();
+    const result = await collectCodexOutput(child, startTime);
+    recordTurn("user", prompt);
+    recordTurn("assistant", result.text);
+    return result;
 }
-// ── Claude Dispatch ────────────────────────────────────────
-function dispatchClaude(prompt, startTime, model) {
+// ── Claude Dispatch (One-shot with History) ───────────────────
+async function dispatchClaudeWithHistory(prompt, startTime, log, model) {
+    const fullPrompt = wrapPrompt(prompt, conversationHistory);
     const claudePath = resolveClaude();
-    const child = spawn(claudePath, ["-p", prompt, "--output-format", "json"], {
+    log(`[claude] One-shot dispatch (history: ${conversationHistory.length} turns)`);
+    // Pass prompt via stdin (-p -) to avoid ARG_MAX limit with long conversation history
+    const child = spawn(claudePath, ["-p", "-", "--model", model, "--output-format", "json"], {
         env: process.env,
-        stdio: ["ignore", "pipe", "pipe"],
+        stdio: ["pipe", "pipe", "pipe"],
         detached: false,
     });
-    activeChild = child;
-    return collectChildOutput(child, startTime);
+    // Write prompt to stdin, then close to signal EOF
+    child.stdin?.write(fullPrompt);
+    child.stdin?.end();
+    const result = await collectChildOutput(child, startTime);
+    recordTurn("user", prompt);
+    recordTurn("assistant", result.text);
+    return result;
 }
-/**
- * Collect codex JSONL output with streaming line parsing.
- * Processes each JSONL line as it arrives so we extract agent text
- * without buffering large binary payloads (e.g. base64 screenshots).
- */
+// ── Codex JSONL Output Collector ──────────────────────────────
 function collectCodexOutput(child, startTime) {
     return new Promise((resolve) => {
         const texts = [];
@@ -478,50 +587,35 @@ function collectCodexOutput(child, startTime) {
                     hasError = true;
                 }
             }
-            catch {
-                // Not valid JSON — skip
-            }
+            catch { /* skip non-JSON */ }
         }
         const timer = setTimeout(() => { closeChild(child); }, CLI_TIMEOUT);
-        const onData = (data) => {
+        child.stdout?.on("data", (data) => {
             lineBuffer += data.toString("utf8");
             const lines = lineBuffer.split("\n");
-            // Keep the last (possibly incomplete) line in the buffer
             lineBuffer = lines.pop() ?? "";
-            for (const line of lines) {
+            for (const line of lines)
                 processLine(line);
-            }
-        };
-        child.stdout?.on("data", onData);
-        // stderr is not JSONL, just discard
+        });
         child.stderr?.resume();
         child.on("close", (code) => {
             clearTimeout(timer);
-            activeChild = null;
-            // Process any remaining data in the buffer
             if (lineBuffer.trim())
                 processLine(lineBuffer);
             const durationMs = Date.now() - startTime;
             let text = texts.join("\n\n");
             if (!text) {
-                text = code === 0
-                    ? "Task completed (no output)."
-                    : `CLI process exited with code ${code}.`;
+                text = code === 0 ? "Task completed (no output)." : `CLI process exited with code ${code}.`;
             }
             settle({ text, success: !hasError && code === 0, durationMs });
         });
         child.on("error", (err) => {
             clearTimeout(timer);
-            activeChild = null;
-            settle({
-                text: `CLI launch failed: ${err.message}`,
-                success: false,
-                durationMs: Date.now() - startTime,
-            });
+            settle({ text: `CLI launch failed: ${err.message}`, success: false, durationMs: Date.now() - startTime });
         });
     });
 }
-// ── Shared: Collect stdout/stderr from a child process ─────
+// ── Shared: Collect stdout/stderr from a child process ───────
 function collectChildOutput(child, startTime) {
     return new Promise((resolve) => {
         const chunks = [];
@@ -534,10 +628,7 @@ function collectChildOutput(child, startTime) {
             settled = true;
             resolve(result);
         }
-        // Timeout with two-stage kill
-        const timer = setTimeout(() => {
-            closeChild(child);
-        }, CLI_TIMEOUT);
+        const timer = setTimeout(() => { closeChild(child); }, CLI_TIMEOUT);
         const collectOutput = (data) => {
             if (truncated)
                 return;
@@ -554,27 +645,18 @@ function collectChildOutput(child, startTime) {
         child.stderr?.on("data", collectOutput);
         child.on("close", (code) => {
             clearTimeout(timer);
-            activeChild = null;
             const durationMs = Date.now() - startTime;
             let text = Buffer.concat(chunks).toString("utf8").trim();
-            if (truncated) {
+            if (truncated)
                 text += "\n\n[Output truncated at 100KB]";
-            }
             if (!text) {
-                text = code === 0
-                    ? "Task completed (no output)."
-                    : `CLI process exited with code ${code}.`;
+                text = code === 0 ? "Task completed (no output)." : `CLI process exited with code ${code}.`;
             }
             settle({ text, success: code === 0, durationMs });
         });
         child.on("error", (err) => {
             clearTimeout(timer);
-            activeChild = null;
-            settle({
-                text: `CLI launch failed: ${err.message}`,
-                success: false,
-                durationMs: Date.now() - startTime,
-            });
+            settle({ text: `CLI launch failed: ${err.message}`, success: false, durationMs: Date.now() - startTime });
         });
     });
 }
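
Note on the `collectCodexOutput` hunk earlier in this file: the stdout handler keeps the last, possibly incomplete, line in `lineBuffer` and only parses complete lines, then flushes whatever remains on `close`. A minimal standalone sketch of that buffering pattern follows. The `createLineSplitter` helper and the sample chunks are hypothetical and are not part of the package.

```javascript
// Hypothetical helper illustrating the line-buffering pattern from the diff.
// Chunks may split a JSONL record anywhere, so only complete lines are emitted.
function createLineSplitter(onLine) {
    let buffer = "";
    return {
        push(chunk) {
            buffer += chunk;
            const lines = buffer.split("\n");
            // The last element is either "" or an incomplete line: keep it buffered.
            buffer = lines.pop() ?? "";
            for (const line of lines) onLine(line);
        },
        flush() {
            // Mirrors the "close" handler: process any remaining buffered data.
            if (buffer.trim()) onLine(buffer);
            buffer = "";
        },
    };
}

const seen = [];
const splitter = createLineSplitter((line) => {
    try {
        seen.push(JSON.parse(line));
    } catch {
        /* skip non-JSON lines, as the diff does */
    }
});

// A record split across two chunks, one non-JSON line, and an unterminated tail.
splitter.push('{"a":1}\n{"b"');
splitter.push(':2}\nnot json\n{"c":3}');
splitter.flush();
console.log(JSON.stringify(seen));
```

Parsing line by line rather than concatenating all output means a record that arrives in pieces is still parsed once it is complete, and a malformed line does not discard the records around it.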
@@ -591,7 +673,6 @@ export async function postReply(config, promptId, text) {
             body: JSON.stringify({ role: "assistant", text }),
             signal: AbortSignal.timeout(30_000),
         });
-        // 4xx = prompt cancelled, that's OK
         return response.ok || (response.status >= 400 && response.status < 500);
     }
     catch {
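
Note on `resolveModel` in this file: its doc comment gives the priority as explicit parameter, then `ZHIHAND_MODEL`, then the backend-specific variable, then the default alias. The sketch below reproduces that ordering in isolation. It is an illustration under assumptions: the environment is passed in as an argument instead of being read from `process.env`, the function name `resolveModelSketch` is hypothetical, and the `DEFAULT_MODELS` values are placeholders because the real table lives in `core/config.js` and is not shown in this diff.

```javascript
// Placeholder defaults. The real DEFAULT_MODELS values are defined in
// core/config.js and are not visible in this diff.
const DEFAULT_MODELS = {
    gemini: "gemini-default",
    claudecode: "claude-default",
    codex: "codex-default",
};

// Same priority as the diff: explicit > ZHIHAND_MODEL > per-backend env > default.
// `env` is injected here (an assumption of this sketch) so the function is testable.
function resolveModelSketch(backend, explicit, env = {}) {
    if (explicit) return explicit;
    if (env.ZHIHAND_MODEL) return env.ZHIHAND_MODEL;
    const perBackend = {
        gemini: env.ZHIHAND_GEMINI_MODEL,
        claudecode: env.ZHIHAND_CLAUDE_MODEL,
        codex: env.ZHIHAND_CODEX_MODEL,
    }[backend];
    return perBackend || DEFAULT_MODELS[backend];
}

const env = { ZHIHAND_MODEL: "global-model", ZHIHAND_GEMINI_MODEL: "gemini-env-model" };
console.log(resolveModelSketch("gemini", "explicit-model", env)); // explicit-model
console.log(resolveModelSketch("gemini", undefined, env)); // global-model
console.log(resolveModelSketch("gemini", undefined, { ZHIHAND_GEMINI_MODEL: "gemini-env-model" })); // gemini-env-model
console.log(resolveModelSketch("codex", undefined, {})); // codex-default
```

Two consequences are visible in the hunks shown here. The global `ZHIHAND_MODEL` takes precedence over a backend-specific variable when both are set, and the removed code read `CLAUDE_GEMINI_MODEL` and `CLAUDE_CODEX_MODEL`, which the new `resolveModel` no longer consults.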

package/dist/daemon/heartbeat.d.ts CHANGED

@@ -1,4 +1,11 @@
 import type { ZhiHandConfig } from "../core/config.ts";
+/** Brain metadata included in every heartbeat, so the app always knows the current backend/model. */
+export interface BrainMeta {
+    backend?: string | null;
+    model?: string | null;
+}
+/** Update the backend/model metadata that will be sent with the next heartbeat. */
+export declare function setBrainMeta(meta: BrainMeta): void;
 export declare function sendBrainOnline(config: ZhiHandConfig): Promise<boolean>;
 export declare function sendBrainOffline(config: ZhiHandConfig): Promise<boolean>;
 export declare function startHeartbeatLoop(config: ZhiHandConfig, log: (msg: string) => void): void;

package/dist/daemon/heartbeat.js CHANGED

@@ -2,18 +2,28 @@ const HEARTBEAT_INTERVAL = 30_000; // 30s
 const HEARTBEAT_RETRY_INTERVAL = 5_000; // 5s on failure
 let heartbeatTimer;
 let retryTimer;
+let currentMeta = {};
+/** Update the backend/model metadata that will be sent with the next heartbeat. */
+export function setBrainMeta(meta) {
+    currentMeta = meta;
+}
 function buildUrl(config) {
     return `${config.controlPlaneEndpoint}/v1/credentials/${encodeURIComponent(config.credentialId)}/brain-status`;
 }
 async function sendHeartbeat(config, online) {
     try {
+        const body = { plugin_online: online };
+        if (currentMeta.backend)
+            body.backend = currentMeta.backend;
+        if (currentMeta.model)
+            body.model = currentMeta.model;
         const response = await fetch(buildUrl(config), {
             method: "POST",
             headers: {
                 "Content-Type": "application/json",
                 "x-zhihand-controller-token": config.controllerToken,
             },
-            body: JSON.stringify({ plugin_online: online }),
+            body: JSON.stringify(body),
             signal: AbortSignal.timeout(10_000),
         });
         return response.ok;
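
Note on the heartbeat payload: after this change the body always carries `plugin_online`, and adds `backend` and `model` only when `setBrainMeta` has supplied truthy values. The sketch below isolates that logic so the resulting JSON can be inspected without a network call. The `buildHeartbeatBody` name is hypothetical (the package builds the body inline in `sendHeartbeat`), and the backend and model strings are example values.

```javascript
let currentMeta = {};

// As in the diff, the new metadata replaces the previous object; it is not merged.
function setBrainMeta(meta) {
    currentMeta = meta;
}

// Hypothetical helper: the same body construction that the diff performs
// inline in sendHeartbeat, extracted so it can be inspected.
function buildHeartbeatBody(online) {
    const body = { plugin_online: online };
    if (currentMeta.backend) body.backend = currentMeta.backend;
    if (currentMeta.model) body.model = currentMeta.model;
    return body;
}

const before = JSON.stringify(buildHeartbeatBody(true));
setBrainMeta({ backend: "gemini", model: "example-model" });
const withMeta = JSON.stringify(buildHeartbeatBody(true));
setBrainMeta({ backend: "codex", model: null });
const withoutModel = JSON.stringify(buildHeartbeatBody(false));

console.log(before); // {"plugin_online":true}
console.log(withMeta); // {"plugin_online":true,"backend":"gemini","model":"example-model"}
console.log(withoutModel); // {"plugin_online":false,"backend":"codex"}
```

Because `setBrainMeta` assigns the whole object, a caller that passes only `backend` clears any previously set `model`. That matches how `daemon/index.js` in this diff always passes both fields together.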

package/dist/daemon/index.js CHANGED

@@ -5,14 +5,16 @@ import path from "node:path";
 import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
 // Transport type used only for cleanup interface
 import { createServer as createMcpServer } from "../index.js";
-import { resolveConfig, loadBackendConfig, saveBackendConfig, resolveZhiHandDir, ensureZhiHandDir, } from "../core/config.js";
-import { startHeartbeatLoop, stopHeartbeatLoop, sendBrainOffline } from "./heartbeat.js";
+import { resolveConfig, loadBackendConfig, saveBackendConfig, resolveZhiHandDir, ensureZhiHandDir, DEFAULT_MODELS, } from "../core/config.js";
+import { PACKAGE_VERSION } from "../index.js";
+import { startHeartbeatLoop, stopHeartbeatLoop, sendBrainOffline, setBrainMeta } from "./heartbeat.js";
 import { PromptListener } from "./prompt-listener.js";
 import { dispatchToCLI, postReply, killActiveChild } from "./dispatcher.js";
 const DEFAULT_PORT = 18686;
 const PID_FILE = "daemon.pid";
 // ── State ──────────────────────────────────────────────────
 let activeBackend = null;
+let activeModel = null; // user-selected model alias, null = use default
 let isProcessing = false;
 const promptQueue = [];
 function log(msg) {
@@ -28,7 +30,7 @@ async function processPrompt(config, prompt) {
     }
     const preview = prompt.text.length > 40 ? prompt.text.slice(0, 40) + "..." : prompt.text;
     log(`[relay] Prompt: "${preview}" → dispatching to ${activeBackend}...`);
-    const result = await dispatchToCLI(activeBackend, prompt.text, log);
+    const result = await dispatchToCLI(activeBackend, prompt.text, log, activeModel ?? undefined);
     const ok = await postReply(config, prompt.id, result.text);
     const dur = (result.durationMs / 1000).toFixed(1);
     if (ok) {
@@ -69,7 +71,7 @@ function handleInternalAPI(req, res) {
         });
         req.on("end", () => {
             try {
-                const { backend } = JSON.parse(body);
+                const { backend, model } = JSON.parse(body);
                 const allowed = ["claudecode", "codex", "gemini"];
                 if (!allowed.includes(backend)) {
                     res.writeHead(400, { "Content-Type": "application/json" });
@@ -77,10 +79,13 @@ function handleInternalAPI(req, res) {
                     return;
                 }
                 activeBackend = backend;
-                saveBackendConfig({ activeBackend });
-                log(`[config] Backend switched to ${activeBackend}.`);
+                activeModel = model ?? null;
+                saveBackendConfig({ activeBackend, model: activeModel });
+                const effectiveModel = activeModel ?? DEFAULT_MODELS[activeBackend];
+                setBrainMeta({ backend: activeBackend, model: effectiveModel });
+                log(`[config] Backend switched to ${activeBackend}, model: ${effectiveModel}`);
                 res.writeHead(200, { "Content-Type": "application/json" });
-                res.end(JSON.stringify({ ok: true, backend: activeBackend }));
+                res.end(JSON.stringify({ ok: true, backend: activeBackend, model: effectiveModel }));
             }
             catch {
                 res.writeHead(400, { "Content-Type": "application/json" });
@@ -90,9 +95,12 @@ function handleInternalAPI(req, res) {
         return true;
     }
     if (url === "/internal/status" && req.method === "GET") {
+        const effectiveModel = activeBackend ? (activeModel ?? DEFAULT_MODELS[activeBackend]) : null;
         res.writeHead(200, { "Content-Type": "application/json" });
         res.end(JSON.stringify({
+            version: PACKAGE_VERSION,
             backend: activeBackend,
+            model: effectiveModel,
             processing: isProcessing,
             queueLength: promptQueue.length,
             pid: process.pid,
@@ -155,9 +163,20 @@ export async function startDaemon(options) {
         log("Run 'zhihand setup' to pair a device first.");
         process.exit(1);
     }
-    // Load backend
+    // Load backend + model
     const backendConfig = loadBackendConfig();
     activeBackend = backendConfig.activeBackend ?? null;
+    activeModel = backendConfig.model ?? null;
+    // Log startup info + set brain meta for heartbeat
+    log(`ZhiHand v${PACKAGE_VERSION} starting...`);
+    if (activeBackend) {
+        const effectiveModel = activeModel ?? DEFAULT_MODELS[activeBackend];
+        log(`[config] Backend: ${activeBackend}, Model: ${effectiveModel}`);
+        setBrainMeta({ backend: activeBackend, model: effectiveModel });
+    }
+    else {
+        log(`[config] No backend configured. Use: zhihand gemini / zhihand claude / zhihand codex`);
+    }
     // MCP sessions: each client gets its own McpServer + Transport pair
     // because McpServer.connect() can only be called once per instance
     const MAX_MCP_SESSIONS = 20;
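
Note on the "effective model" reported by the daemon: the startup log, the backend-switch response, and `/internal/status` all compute it as `activeModel ?? DEFAULT_MODELS[activeBackend]`, with `null` when no backend is configured. A minimal sketch of that fallback follows. The `effectiveModel` function name and the `DEFAULT_MODELS` values are assumptions for illustration; the real defaults come from `core/config.js`, which is outside this excerpt.

```javascript
// Placeholder defaults. The real values live in core/config.js.
const DEFAULT_MODELS = {
    gemini: "gemini-default",
    claudecode: "claude-default",
    codex: "codex-default",
};

// Hypothetical helper mirroring the expression used in /internal/status:
// no backend -> null; otherwise the user-selected model, else the default.
function effectiveModel(activeBackend, activeModel) {
    return activeBackend ? (activeModel ?? DEFAULT_MODELS[activeBackend]) : null;
}

console.log(effectiveModel(null, null)); // null
console.log(effectiveModel("gemini", null)); // gemini-default
console.log(effectiveModel("gemini", "example-model")); // example-model
```

This expression does not consult the `ZHIHAND_MODEL` or backend-specific environment variables that `resolveModel` in `dispatcher.js` checks. Going by the hunks shown, the model reported in the status and heartbeat can therefore differ from the one actually dispatched when those variables are set and no model was selected explicitly.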

package/dist/index.d.ts CHANGED

@@ -1,3 +1,4 @@
 import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
+export declare const PACKAGE_VERSION = "0.22.0";
 export declare function createServer(deviceName?: string): McpServer;
 export declare function startStdioServer(deviceName?: string): Promise<void>;

package/dist/index.js CHANGED

@@ -5,7 +5,7 @@ import { controlSchema, screenshotSchema, pairSchema } from "./tools/schemas.js"
 import { executeControl } from "./tools/control.js";
 import { handleScreenshot } from "./tools/screenshot.js";
 import { handlePair } from "./tools/pair.js";
-const PACKAGE_VERSION = "0.19.1";
+export const PACKAGE_VERSION = "0.22.0";
 export function createServer(deviceName) {
     const server = new McpServer({
         name: "zhihand",

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@zhihand/mcp",
-  "version": "0.19.1",
+  "version": "0.22.0",
   "private": false,
   "type": "module",
   "description": "ZhiHand MCP Server — phone control tools for Claude Code, Codex, Gemini CLI, and OpenClaw",