npm - @pdhaku0/gemini-cli-agent-sdk - Versions diffs - 0.1.0 - Mend

@pdhaku0/gemini-cli-agent-sdk 0.1.0

Files changed (54)

package/README.md +109 -0
package/client/index.d.ts +1 -0
package/client/index.js +1 -0
package/client/package.json +1 -0
package/dist/client.d.ts +5 -0
package/dist/client.js +5 -0
package/dist/client.js.map +1 -0
package/dist/common/types.d.ts +191 -0
package/dist/common/types.js +18 -0
package/dist/common/types.js.map +1 -0
package/dist/core/AcpWebSocketTransport.d.ts +25 -0
package/dist/core/AcpWebSocketTransport.js +222 -0
package/dist/core/AcpWebSocketTransport.js.map +1 -0
package/dist/core/AgentChatClient.d.ts +75 -0
package/dist/core/AgentChatClient.js +679 -0
package/dist/core/AgentChatClient.js.map +1 -0
package/dist/core/ToolPermissionManager.d.ts +26 -0
package/dist/core/ToolPermissionManager.js +88 -0
package/dist/core/ToolPermissionManager.js.map +1 -0
package/dist/core/diff-utils.d.ts +1 -0
package/dist/core/diff-utils.js +7 -0
package/dist/core/diff-utils.js.map +1 -0
package/dist/core/stream-utils.d.ts +14 -0
package/dist/core/stream-utils.js +57 -0
package/dist/core/stream-utils.js.map +1 -0
package/dist/extras/index.d.ts +1 -0
package/dist/extras/index.js +2 -0
package/dist/extras/index.js.map +1 -0
package/dist/extras/sys-tags.d.ts +38 -0
package/dist/extras/sys-tags.js +150 -0
package/dist/extras/sys-tags.js.map +1 -0
package/dist/index.d.ts +1 -0
package/dist/index.js +2 -0
package/dist/index.js.map +1 -0
package/dist/server/GeminiBridge.d.ts +50 -0
package/dist/server/GeminiBridge.js +500 -0
package/dist/server/GeminiBridge.js.map +1 -0
package/dist/server.d.ts +7 -0
package/dist/server.js +7 -0
package/dist/server.js.map +1 -0
package/dist/ui/AgentChatStore.d.ts +16 -0
package/dist/ui/AgentChatStore.js +59 -0
package/dist/ui/AgentChatStore.js.map +1 -0
package/docs/API.md +100 -0
package/docs/EVENTS.md +100 -0
package/docs/INTEGRATION.md +109 -0
package/docs/SPECIFICATION.md +93 -0
package/docs/TROUBLESHOOTING.md +44 -0
package/docs/USAGE.md +270 -0
package/docs/design.md +62 -0
package/package.json +71 -0
package/server/index.d.ts +1 -0
package/server/index.js +1 -0
package/server/package.json +1 -0

package/docs/API.md ADDED

@@ -0,0 +1,100 @@
+# API Reference
+This document defines the public API surface for the SDK.
+## Module Entrypoints
+- `@pdhaku0/gemini-cli-agent-sdk/client`
+  - Browser/Next.js client usage
+- `@pdhaku0/gemini-cli-agent-sdk/server`
+  - Node.js bridge usage
+- `@pdhaku0/gemini-cli-agent-sdk/extras`
+  - Optional helpers (SYS tag capture, etc.)
+## AgentChatClient
+```ts
+new AgentChatClient(options: AgentChatClientOptions)
+```
+### AgentChatClientOptions
+- `url: string` (required)
+- `model?: string`
+- `cwd?: string`
+- `diffContextLines?: number`
+- `sessionId?: string`
+- `replay?: { limit?: number; since?: number; before?: number }`
+### Methods
+- `connect(options?: { autoSession?: boolean }): Promise<void>`
+  - Connects the WebSocket.
+  - If `sessionId` is already set, the client reuses it and **does not** call `session/new`.
+  - If `autoSession` is `true` (default) and no session exists, sends `session/new`.
+- `sendMessage(text: string, options?: { hidden?: HiddenMode }): Promise<void>`
+- `submitAuthCode(code: string): Promise<void>`
+- `approveTool(optionId: string): Promise<void>`
+- `cancel(): Promise<void>`
+- `getMessages(options?: { includeHidden?: boolean }): ChatMessage[]`
+- `getAuthUrl(): string | null`
+- `getPendingApproval(): PendingApproval | null`
+- `getConnectionState(): ConnectionState`
+- `prependMessages(messages: ChatMessage[]): void`
+  - Prepends messages (for replay/infinite scroll).
+- `setSessionId(sessionId: string | null): void`
+- `getSessionId(): string | null`
+- `dispose(): void`
+### Static
+- `AgentChatClient.fetchReplay(url, replay, options?): Promise<ChatMessage[]>`
+  - Uses bridge replay query params to fetch older messages.
+  - `options.idleMs` controls the inactivity timeout once the first replay message arrives.
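To make the replay query params concrete, here is a minimal sketch of how a replay URL could be assembled. `buildReplayUrl` and `ReplayParams` are hypothetical names used only for illustration; the SDK's own `fetchReplay` may build the URL differently.

```typescript
// Hypothetical helper, not part of the SDK: appends replay params to a bridge URL.
interface ReplayParams {
  limit?: number;  // last N turns
  since?: number;  // UNIX timestamp in ms
  before?: number; // UNIX timestamp in ms
}

function buildReplayUrl(baseUrl: string, replay: ReplayParams): string {
  const keys: (keyof ReplayParams)[] = ['limit', 'since', 'before'];
  const pairs: string[] = [];
  for (const key of keys) {
    const value = replay[key];
    // Skip params the caller did not set.
    if (value !== undefined) {
      pairs.push(`${key}=${encodeURIComponent(String(value))}`);
    }
  }
  if (pairs.length === 0) return baseUrl;
  // Reuse an existing query string if the base URL already has one.
  const separator = baseUrl.indexOf('?') === -1 ? '?' : '&';
  return baseUrl + separator + pairs.join('&');
}
```

For example, `buildReplayUrl('ws://localhost:4444', { limit: 5 })` yields `ws://localhost:4444?limit=5`.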
+## AgentChatStore
+```ts
+new AgentChatStore(client: AgentChatClient)
+```
+### Methods
+- `subscribe(listener): () => void`
+- `getState(): AgentChatState`
+- `dispose(): void`
+## GeminiBridge (server)
+```ts
+new GeminiBridge(options?: GeminiBridgeOptions)
+```
+### GeminiBridgeOptions
+- `model?: string`
+- `port?: number` (default 4444)
+- `approvalMode?: string`
+- `geminiBin?: string`
+- `cliPackage?: string`
+- `hostApiUrl?: string`
+- `sessionId?: string`
+- `bridgeSecret?: string`
+- `projectRoot?: string`
+- `outgoingTransform?: (msg) => { forward?: any | null; extra?: any[] } | null`
+### Methods
+- `start(): void`
+- `stop(): void`
+### Events
+- `gemini:message` (message from Gemini CLI)
+- `client:message` (message from a WebSocket client)
+## Types
+See `src/common/types.ts` for canonical type definitions.

package/docs/EVENTS.md ADDED

@@ -0,0 +1,100 @@
+# Events and Rendering Rules
+This document defines the event model and how to render messages correctly without ordering bugs.
+## Core Events (AgentChatClient)
+### Connection
+- `connection_state_changed`
+  - payload: `{ state: ConnectionState }`
+  - values: `connecting | connected | reconnecting | disconnected`
+- `session_ready`
+  - payload: `sessionId: string`
+- `error`
+  - payload: `unknown`
+### Messages
+- `message`
+  - emitted when a **new message object** is created (user or assistant)
+- `message_update`
+  - emitted when streaming text/thought/tool updates modify a message
+### Streaming deltas
+- `text_delta`
+- `assistant_text_delta`
+- `thought_delta`
+- `assistant_thought_delta`
+  - payload: `{ messageId, delta, text|thought }`
+### Tools
+- `tool_update`
+- `tool_call_started`
+- `tool_call_updated`
+- `tool_call_completed`
+  - payload: `{ messageId, toolCall }`
+### Turn lifecycle
+- `turn_started`
+  - payload: `{ userMessageId }`
+- `turn_completed`
+  - payload: `stopReason` (string)
+### Auth / Permission
+- `auth_required` (string URL)
+- `auth_resolved`
+- `permission_required` (PendingApproval)
+- `approval_required` (PendingApproval)
+- `approval_resolved`
+### Replay
+- `messages_replayed`
+  - payload: `{ count }`
+  - emitted after `prependMessages()`
+### Structured events (optional)
+- `bridge/structured_event`
+  - payload: `{ type, payload, raw, error? }`
+  - emitted when SYS tags are captured by the bridge (see `docs/USAGE.md`)
+## Rendering Rules (Important)
+### 1) Always render assistant content using `content[]`
+The SDK maintains a `content` array on assistant messages that preserves the **true order** of:
+- text
+- thought
+- tool calls
+If you render `m.text` + `m.toolCalls` separately, tool calls can appear out of order.
+### 2) Render tool approvals next to the tool call
+Tool approvals are tied to a specific tool call via `toolCallId`. Put the approval UI inside the matching tool block.
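Rules 1 and 2 can be sketched together: walk `content[]` in order and attach the pending approval to the tool call whose `toolCallId` matches. The type shapes below (`SketchToolCall`, `SketchPart`, `SketchApproval`, and the `options` array) are assumptions made for this example; see the SDK's type definitions for the real ones. Rendering is reduced to strings so the sketch stays self-contained.

```typescript
// Assumed shapes for illustration only; the SDK's real types may differ.
interface SketchToolCall { toolCallId: string }
type SketchPart =
  | { type: 'text'; text: string }
  | { type: 'thought'; thought: string }
  | { type: 'tool_call'; call: SketchToolCall };
interface SketchApproval {
  toolCall: { toolCallId: string };
  options: { optionId: string }[];
}

function renderParts(parts: SketchPart[], pending: SketchApproval | null): string[] {
  // map() preserves the order of content[], so tool calls stay where they occurred.
  return parts.map((part) => {
    if (part.type === 'text') return `text: ${part.text}`;
    if (part.type === 'thought') return `thought: ${part.thought}`;
    // Only the tool call that the approval refers to gets the approval options.
    const approval =
      pending !== null && pending.toolCall.toolCallId === part.call.toolCallId
        ? ` [approve: ${pending.options.map((o) => o.optionId).join(', ')}]`
        : '';
    return `tool: ${part.call.toolCallId}${approval}`;
  });
}
```

In a real UI the strings would be components, and each option would call `approveTool(optionId)` when clicked.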
+### 3) User messages are local
+The server does **not** echo user messages. You must render them from SDK state (`message` event or store state).
+## Suggested UI Pattern
+```ts
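+// Use forEach: these calls render for their side effects and return nothing.
+messages.forEach((msg) => {
+  if (msg.role === 'user') {
+    renderUser(msg.text);
+  } else if (msg.role === 'assistant') {
+    // Walk content[] in order so text, thoughts, and tool calls stay interleaved.
+    msg.content.forEach((part) => {
+      if (part.type === 'text') renderText(part.text);
+      else if (part.type === 'thought') renderThought(part.thought);
+      else if (part.type === 'tool_call') renderTool(part.call);
+    });
+  }
+});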
+```

package/docs/INTEGRATION.md ADDED

@@ -0,0 +1,109 @@
+# Integration Guide
+This guide focuses on real-world integration details and common pitfalls.
+## Use the right entrypoint
+- **Browser/Next.js**: `@pdhaku0/gemini-cli-agent-sdk/client`
+- **Node.js Bridge**: `@pdhaku0/gemini-cli-agent-sdk/server`
+If you import the root package in a client component, Next.js will try to bundle server code (`fs`) and fail.
+## Next.js (App Router)
+### Use a singleton client/store
+React Strict Mode mounts components twice in dev, which can create **multiple WebSocket connections** and inconsistent state.
+Use a module-level singleton or a guard to ensure you connect only once.
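One way to get a module-level singleton is a small lazy wrapper around the constructor. `createSingleton` is a hypothetical helper written for this sketch, not an SDK export; it only guarantees that the factory runs once per module instance.

```typescript
// Hypothetical helper: returns a getter that runs `factory` at most once.
function createSingleton<T>(factory: () => T): () => T {
  let instance: T | undefined;
  let created = false;
  return () => {
    if (!created) {
      // First call: build the instance and remember it.
      instance = factory();
      created = true;
    }
    return instance as T;
  };
}
```

A client module could then export something like `getClient = createSingleton(() => new AgentChatClient({ url }))` and have every component call `getClient()`, so a double mount in Strict Mode reuses the same connection. If dev-mode hot reloading re-evaluates the module, the instance is recreated, so this is a guard against double mounting rather than a full solution.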
+### Use `use client`
+Instantiate `AgentChatClient` in client components only.
+### Persist session across reloads
+If you want to keep the same ACP session across a page reload, store the session ID and pass it back to the client:
+```ts
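+import { AgentChatClient } from '@pdhaku0/gemini-cli-agent-sdk/client';
+// wsUrl is your bridge WebSocket URL, for example ws://<host>:4444.
+const sessionId = localStorage.getItem('agentchat_session_id') || undefined;
+const client = new AgentChatClient({ url: wsUrl, sessionId });
+// Store the session ID whenever a session becomes ready so the next reload can reuse it.
+client.on('session_ready', (id) => localStorage.setItem('agentchat_session_id', id));
+await client.connect();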
+```
+### Example reference
+See `examples/next-app` for a working App Router implementation (auth, approvals, replay, session persistence).
+### Working directory (cwd)
+The client sends a `cwd` in `session/new`. For Next apps, you can set:
+```bash
+NEXT_PUBLIC_GEMINI_CWD=/path/to/project
+```
+## WebSocket URL in remote/SSH setups
+If you SSH into a remote host, `localhost` points to **the remote**, not your local machine.
+Set the WebSocket URL accordingly, for example:
+```bash
+NEXT_PUBLIC_GEMINI_WS_URL=ws://<host>:4444
+```
+## Auth flow
+When `auth_required` fires, you must call `submitAuthCode()` before the CLI will process prompts.
+## Tool approvals
+Use `pendingApproval.toolCall.toolCallId` to attach permission options to the correct tool block.
+## Replay performance tips
+- `limit` is in **turns**, not messages.
+- Use a small `limit` on connect, then fetch older messages as needed.
+- Replay is in-memory only; restarting the bridge clears history.
+## Optional SYS tag capture
+If you want to extract structured JSON from assistant output, use the extras helper:
+```ts
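+import { GeminiBridge } from '@pdhaku0/gemini-cli-agent-sdk/server';
+import { createSysTagTransform } from '@pdhaku0/gemini-cli-agent-sdk/extras';
+// Capture SYS tags in outgoing messages and surface them as structured events.
+const bridge = new GeminiBridge({
+  outgoingTransform: createSysTagTransform({ mode: 'event' }),
+});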
+```
+### Recommended priming prompt
+Instruct the agent to use SYS tags for structured data so the bridge can capture it:
+```text
+When you need to emit machine-readable JSON, wrap it in <SYS_JSON>...</SYS_JSON>.
+When you want to group work, use <SYS_BLOCK>{"type":"start"...}</SYS_BLOCK> and
+<SYS_BLOCK>{"type":"end"...}</SYS_BLOCK>.
+```
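To show what capturing a `<SYS_JSON>` tag involves, here is a rough sketch that pulls tagged JSON out of a complete string. `extractSysJson` is a hypothetical function written for this example; it is not the implementation behind `createSysTagTransform`, and it assumes the whole tag is present in one string, whereas streamed output may split a tag across chunks.

```typescript
// Hypothetical sketch: separate <SYS_JSON> payloads from the visible text.
interface SysJsonResult {
  visibleText: string; // text with the tags removed
  payloads: unknown[]; // successfully parsed JSON bodies
  errors: string[];    // parse errors for malformed bodies
}

function extractSysJson(text: string): SysJsonResult {
  const payloads: unknown[] = [];
  const errors: string[] = [];
  // Non-greedy match so multiple tags in one string are handled separately.
  const visibleText = text.replace(
    /<SYS_JSON>([\s\S]*?)<\/SYS_JSON>/g,
    (_match: string, body: string) => {
      try {
        payloads.push(JSON.parse(body));
      } catch (err) {
        errors.push(err instanceof Error ? err.message : String(err));
      }
      return ''; // strip the tag from what the user sees
    },
  );
  return { visibleText, payloads, errors };
}
```

Malformed JSON is recorded in `errors` rather than thrown, loosely mirroring the optional `error?` field on the `bridge/structured_event` payload.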
+## Structured events → backend tools
+If you want the assistant to trigger backend tools, use SYS tags and process
+`bridge/structured_event` on the bridge:
+```ts
+bridge.on('client:message', (msg) => {
+  if (msg?.method !== 'bridge/structured_event') return;
+  const { type, payload } = msg.params || {};
+  if (type !== 'sys_json') return;
+  if (payload?.type === 'tool.invoke') {
+    // Example: run a custom backend tool
+    runTool(payload.payload);
+  }
+});
+```

package/docs/SPECIFICATION.md ADDED

@@ -0,0 +1,93 @@
+# Technical Specification
+This document details the internal architecture, protocol specifications, and custom logic implementation of the SDK and bridge.
+## Architecture Overview
+```
+[ Client Application (SDK) ]
+        |
+        | WebSocket (JSON-RPC 2.0)
+        |
+[ Node.js Bridge (gemini-bridge.cjs) ]
+        |
+        | Stdio (Pipe)
+        |
+[ Gemini CLI Binary ]
+```
+### 1. Client SDK (`src/core/`)
+- **AgentChatClient**: Manages the WebSocket connection, session state, and event emission.
+- **Event System**: Emits normalized events (`text_delta`, `thought_delta`, `tool_update`) to the UI.
+- **Tool Parsing**: Recovers structured tool data from Gemini CLI output.
+### 2. Bridge (`scripts/gemini-bridge.cjs`)
+- **Process Management**: Spawns the `gemini` binary with `--experimental-acp`.
+- **Protocol Translation**: Forwards JSON-RPC between WebSocket and stdio.
+- **Log Management**: Handles `gemini-acp.log` rotation (max 2MB).
+- **History Replay**: Maintains a small in-memory history for late-joiners.
+## ACP Protocol & Extensions
+### Session Handshake
+1. Client connects via WebSocket.
+2. Bridge spawns Gemini CLI.
+3. Client sends `session/new` (unless reusing a stored session ID).
+4. Bridge relays response with `sessionId`.
+### Session Reuse
+Clients may reuse an existing session by supplying a known `sessionId` and skipping `session/new`.
+This is useful for page reloads. It only works while the bridge/CLI process remains alive.
+### Message Flow
+- **User Input**: Client sends `session/prompt` with text and `sessionId`.
+- **Streaming Response**: CLI sends `session/update` events.
+  - `agent_thought_chunk`: Internal reasoning text.
+  - `agent_message_chunk`: User-facing assistant text.
+  - `tool_call`: Request to execute a tool.
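The requests in this flow are ordinary JSON-RPC 2.0 messages. The sketch below frames `session/new` and `session/prompt` requests with incrementing IDs; `createRequestFactory` is a hypothetical helper, and the `params` field names are illustrative assumptions rather than the exact ACP schema.

```typescript
// JSON-RPC 2.0 request envelope (jsonrpc, id, method, optional params).
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Hypothetical helper: each factory keeps its own request ID counter.
function createRequestFactory(): (method: string, params?: Record<string, unknown>) => JsonRpcRequest {
  let nextId = 1;
  return (method, params) => ({
    jsonrpc: '2.0',
    id: nextId++,
    method,
    // Omit `params` entirely when none are given.
    ...(params ? { params } : {}),
  });
}

const request = createRequestFactory();
// Param names below are assumptions for illustration.
const newSession = request('session/new', { cwd: '/path/to/project' });
const prompt = request('session/prompt', { sessionId: 'example-session', text: 'Hello' });
```

Responses carry the same `id` as the request, which is how a client can pair the `sessionId` reply with its `session/new` call.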
+## Backend Event Hooks
+`GeminiBridge` extends `EventEmitter` and emits:
+- `gemini:message`: JSON-RPC messages from Gemini CLI.
+- `client:message`: JSON-RPC messages from WebSocket clients.
+## Custom Logic: Tool Description Parsing
+The Gemini CLI does not consistently emit a `description` field in the `tool_call` object. The SDK implements:
+1. **Title Parsing**: Analyze the tool title string.
+2. **Nested Parentheses**: Extract the *last balanced parentheses group*.
+   - Example: `"ls -F [cwd] (List files (detailed))"` => `"List files (detailed)"`
+3. **CWD Extraction**: Capture `[current working directory ...]` as `workingDir`.
+4. **Fallback**: Leave `description` empty if not found.
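Step 2 can be sketched as a backward scan from the last closing parenthesis. `extractLastParenGroup` is a hypothetical name and this is one possible implementation of the described rule, not necessarily the SDK's code.

```typescript
// Hypothetical sketch: return the contents of the last balanced (...) group,
// or an empty string when there is none (the fallback in step 4).
function extractLastParenGroup(title: string): string {
  const close = title.lastIndexOf(')');
  if (close === -1) return '';
  let depth = 0;
  // Scan backwards, tracking nesting depth until the matching '(' is found.
  for (let i = close; i >= 0; i--) {
    const ch = title[i];
    if (ch === ')') {
      depth++;
    } else if (ch === '(') {
      depth--;
      if (depth === 0) return title.slice(i + 1, close);
    }
  }
  return ''; // unbalanced parentheses
}
```

With the example title from step 2, `extractLastParenGroup('ls -F [cwd] (List files (detailed))')` returns `List files (detailed)`.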
+## Operational Details
+### Log Rotation
+The `gemini-bridge.cjs` script checks `gemini-acp.log` size on startup.
+- **Limit**: 2MB (2 * 1024 * 1024 bytes)
+- **Action**: Renames the current log to `.old` if the limit is exceeded
+### History Replay (Late Joiners)
+The bridge keeps an in-memory ring buffer of recent JSON-RPC messages (max 2000).
+Clients can request a replay using WebSocket query params:
+- `limit`: last N **turns** (to avoid slicing a response mid-stream)
+- `since`: only messages after this UNIX timestamp (ms)
+- `before`: only messages before this UNIX timestamp (ms)
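The replay behaviour described above can be approximated with a bounded history and a turn-aware selection step. Everything in this sketch is an assumption for illustration: `BoundedHistory`, the `HistoryEntry` shape, and the rule that a turn starts at each `session/prompt` message. The bridge's actual buffer and turn detection may differ.

```typescript
// Assumed entry shape: arrival time plus the JSON-RPC method, if any.
interface HistoryEntry {
  timestamp: number; // UNIX ms
  method?: string;
}
interface ReplayQuery { limit?: number; since?: number; before?: number }

class BoundedHistory {
  private readonly entries: HistoryEntry[] = [];
  private readonly maxEntries: number;

  constructor(maxEntries: number = 2000) {
    this.maxEntries = maxEntries;
  }

  push(entry: HistoryEntry): void {
    this.entries.push(entry);
    // Drop the oldest entry once capacity is exceeded (simple, not a true ring buffer).
    if (this.entries.length > this.maxEntries) this.entries.shift();
  }

  select(query: ReplayQuery = {}): HistoryEntry[] {
    const inWindow = this.entries.filter(
      (e) =>
        (query.since === undefined || e.timestamp > query.since) &&
        (query.before === undefined || e.timestamp < query.before),
    );
    if (query.limit === undefined || query.limit <= 0) return inWindow;
    // Walk backwards counting turn starts so a turn is never cut mid-stream.
    let turnsSeen = 0;
    for (let i = inWindow.length - 1; i >= 0; i--) {
      if (inWindow[i]?.method === 'session/prompt') {
        turnsSeen++;
        if (turnsSeen === query.limit) return inWindow.slice(i);
      }
    }
    return inWindow; // fewer turns than requested: return everything in the window
  }
}
```

Slicing at a turn boundary rather than at a message count is what keeps a replayed response from starting halfway through its stream.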
+### Environment Variables
+- `GEMINI_PORT`: WebSocket port (default 4444)
+- `GEMINI_MODEL`: Model ID (default `gemini-3-flash-preview`)
+- `GEMINI_APPROVAL_MODE`: Tool approval mode (default `default`)

package/docs/TROUBLESHOOTING.md ADDED

@@ -0,0 +1,44 @@
+# Troubleshooting
+## Connected but no UI messages
+- Ensure you render using `content[]` for assistant messages.
+- Ensure you listen to `message_update` (or use `AgentChatStore`).
+- User messages are **not echoed** from the server; render them locally.
+## WebSocket connects but session never initializes
+- Verify `session/new` is being sent (unless you provide `sessionId`).
+- Confirm Gemini CLI is running with `--experimental-acp`.
+- Check bridge logs for JSON-RPC errors.
+## New session after page reload
+- A full page reload reinitializes the client and creates a new session.
+- Persist `sessionId` (localStorage) and pass it back to the client to reuse a session.
+- If the bridge restarts, the previous session ID is invalid and a new session is expected.
+- If you keep reusing an invalid session, clear the stored session ID and refresh.
+## History replay not working
+- `limit` is **turns**, not messages.
+- Make sure you pass `before` as a UNIX timestamp in **ms**.
+- Replay relies on bridge in-memory history; restarting the bridge clears it.
+## Replay returns empty even though bridge says "Replaying"
+- Ensure you are running the updated SDK build (rebuild after local changes if using `file:` dependency).
+- Increase `idleMs` if the bridge or browser is slow.
+## Tool approval UI does not show
+- Approvals are per tool call. Match on `toolCallId` and render inside that tool's block.
+## "WebSocket constructor not found"
+- In Node, ensure `ws` is installed and Node >= 18.
+- In Next.js, instantiate the client in a `use client` component.
+## "Blocked message during pending auth"
+- Gemini CLI requires auth. Use the URL from `auth_required` and call `submitAuthCode()`.