@wong2kim/wmux 2.0.0 → 2.0.1

package/README.md CHANGED
@@ -1,144 +1,158 @@
- # wmux
+ # wmux — AI Agent Terminal for Windows
 
- **AI Agent Terminal for Windows**
+ > **Run Claude Code + Codex + Gemini CLI side by side.**
+ > Split terminals, browser automation, MCP integration — the only proper way to use AI agents on Windows.
 
- Run Claude Code, Codex, Gemini CLI side by side — with built-in browser automation, smart notifications, and MCP integration.
+ [![Windows 10/11](https://img.shields.io/badge/Windows-10%2F11-0078D6?logo=windows&logoColor=white)](https://github.com/openwong2kim/wmux/releases/latest)
+ [![npm](https://img.shields.io/npm/v/@wong2kim/wmux?color=CB3837&logo=npm)](https://www.npmjs.com/package/@wong2kim/wmux)
+ [![Electron 41](https://img.shields.io/badge/Electron-41-47848F?logo=electron&logoColor=white)](https://www.electronjs.org/)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
+ [![GitHub stars](https://img.shields.io/github/stars/openwong2kim/wmux?style=social)](https://github.com/openwong2kim/wmux)
 
- Inspired by [cmux](https://github.com/manaflow-ai/cmux) (macOS), wmux brings the same philosophy to Windows: **a primitive, not a solution.** Composable building blocks for multi-agent workflows.
+ ---
+
+ ## Still using one terminal for your AI coding agents on Windows?
+
+ macOS has [cmux](https://github.com/manaflow-ai/cmux) — a tmux-based terminal multiplexer for AI agents.
+
+ **Windows has no tmux.** Without WSL, there was no way.
 
- ![Windows](https://img.shields.io/badge/Windows-10%2F11-0078D6?logo=windows)
- ![Electron](https://img.shields.io/badge/Electron-41-47848F?logo=electron)
- ![npm](https://img.shields.io/npm/v/@wong2kim/wmux?color=CB3837&logo=npm)
- ![License](https://img.shields.io/badge/License-MIT-green)
+ wmux fixes this. Native Windows terminal multiplexer + browser automation + MCP server. Your AI agent reads the terminal, controls the browser, and works autonomously.
+
+ ```
+ Claude Code writes the backend on the left
+ Codex builds the frontend on the right
+ Gemini CLI runs tests at the bottom
+ — all on one screen, simultaneously.
+ ```
 
  ---
 
- ## Install
+ ## Install in 30 seconds
 
- **Download:** [wmux-2.0.0 Setup.exe](https://github.com/openwong2kim/wmux/releases/latest)
+ **Installer:**
 
- Or build from source:
+ [Download wmux Setup.exe](https://github.com/openwong2kim/wmux/releases/latest)
+
+ **One-liner (PowerShell):**
  ```powershell
  irm https://raw.githubusercontent.com/openwong2kim/wmux/main/install.ps1 | iex
  ```
 
- **npm (CLI + MCP server only):**
+ **npm (CLI + MCP server):**
  ```bash
  npm install -g @wong2kim/wmux
  ```
 
  ---
 
- ## What's New in v2.0.0
+ ## Why wmux?
 
- - **Browser automation via CDP** — Click, fill, type, screenshot directly through Chrome DevTools Protocol. Works with React inputs, CJK text, and controlled components.
- - **Security hardening** — Token auth on all pipes, SSRF protection, input sanitization, randomized CDP ports, memory pressure watchdog.
- - **Workspace reset** — One-click reset in Settings to clean all workspaces and start fresh.
- - **Daemon process** — Background session management with suspend/resume, scrollback persistence, and auto-recovery.
+ ### 1. Your AI agent controls the browser for real
 
- ---
+ Tell Claude Code "search Google for this" and it actually does it.
 
- ## Why wmux?
+ wmux's built-in browser connects via Chrome DevTools Protocol. Click, type, screenshot, execute JS — all done by the AI directly. Works perfectly with React controlled inputs and CJK text.
+
+ ```
+ You: "Search for wmux on Google"
+ Claude: browser_open → browser_snapshot → browser_fill(ref=13, "wmux") → browser_press_key("Enter")
+ → Actually searches Google. Done.
+ ```
 
- | Problem | wmux |
- |---------|------|
- | Windows has no cmux | Native Windows terminal multiplexer for AI agents |
- | Agents can't control the browser | Built-in browser with CDP — Claude clicks, fills, types, screenshots |
- | "Is it done yet?" | Smart activity-based notifications + taskbar flash |
- | Can't compare agents | Multiview — Ctrl+click workspaces to view side by side |
- | Hard to describe UI elements to LLM | Inspector — click any element, LLM-friendly context copied |
+ ### 2. Multiple terminals in one window
+
+ `Ctrl+D` to split, `Ctrl+N` for new workspace. Place multiple terminals and browsers in each workspace. `Ctrl+click` for multiview: see multiple workspaces at once.
+
+ ConPTY-based native Windows terminal. xterm.js + WebGL hardware-accelerated rendering. 999K lines of scrollback. Terminal content persists even after restart.
+
+ ### 3. No more asking "is it done yet?"
+
+ wmux tells you when your AI agent finishes.
+
+ - Task complete → desktop notification + taskbar flash
+ - Abnormal exit → immediate warning
+ - `git push --force`, `rm -rf`, `DROP TABLE` → dangerous action detection
+
+ Not pattern matching — output throughput-based detection. Works with any agent.
+
+ ### 4. Automatic Claude Code integration
+
+ Launch wmux and the MCP server registers automatically. Claude Code just works:
+
+ | What Claude can do | MCP Tool |
+ |---|---|
+ | Open browser | `browser_open` |
+ | Navigate to URL | `browser_navigate` |
+ | Take screenshot | `browser_screenshot` |
+ | Read page structure | `browser_snapshot` |
+ | Click element | `browser_click` |
+ | Fill form | `browser_fill` / `browser_type` |
+ | Execute JS | `browser_evaluate` |
+ | Press key | `browser_press_key` |
+ | Read terminal | `terminal_read` |
+ | Send command | `terminal_send` |
+ | Manage workspaces | `workspace_list` / `surface_list` / `pane_list` |
+
+ **Multi-agent:** Every browser tool accepts `surfaceId` — each Claude Code session controls its own browser independently.
+
+ ### 5. Security that actually matters
+
+ - Token authentication on all IPC pipes
+ - SSRF protection — blocks private IPs, `file://`, `javascript:` schemes
+ - PTY input sanitization — prevents command injection
+ - Randomized CDP port — no fixed debug port
+ - Memory pressure watchdog — reaps dead sessions at 750MB, blocks new ones at 1GB
+ - Electron Fuses — RunAsNode disabled, cookie encryption enabled
 
  ---
 
- ## Features
+ ## All Features
 
  ### Terminal
- - **xterm.js + WebGL** GPU-accelerated rendering
- - **ConPTY** native Windows pseudo-terminal
- - **Split panes** — `Ctrl+D` horizontal, `Ctrl+Shift+D` vertical
- - **Tabs** — multiple surfaces per pane
- - **Vi copy mode** — `Ctrl+Shift+X`
- - **Search** — `Ctrl+F`
- - **Unlimited scrollback** 999,999 lines default
- - **Scrollback persistence** — terminal content saved to disk, restored on restart
+ - xterm.js + WebGL GPU-accelerated rendering
+ - ConPTY native Windows pseudo-terminal
+ - Split panes — `Ctrl+D` horizontal, `Ctrl+Shift+D` vertical
+ - Tabs — multiple surfaces per pane
+ - Vi copy mode — `Ctrl+Shift+X`
+ - Search — `Ctrl+F`
+ - 999K line scrollback with disk persistence
 
  ### Workspaces
  - Sidebar with drag-and-drop reordering
- - `Ctrl+1` ~ `Ctrl+9` quick switch
- - **Multiview** — `Ctrl+click` workspaces to split-view them simultaneously
- - **Session persistence**: workspace layout, tabs, cwd, and terminal scrollback all restored on restart
- - **One-click reset** Settings > General > Reset to clean all workspaces
+ - `Ctrl+1~9` quick switch
+ - Multiview — `Ctrl+click` to view multiple workspaces side by side
+ - Full session persistence — layout, tabs, cwd, scrollback all restored
+ - One-click reset in Settings
 
  ### Browser + CDP Automation
  - Built-in browser panel — `Ctrl+Shift+L`
  - Navigation bar, DevTools, back/forward
- - **Element Inspector** — hover to highlight, click to copy LLM-friendly context
- - **Full CDP automation via MCP:**
-   - Click elements by ref or CSS selector
-   - Fill forms with real keyboard input (handles React, CJK)
-   - Take screenshots via CDP `Page.captureScreenshot`
-   - Evaluate JavaScript with user gesture context
-   - Navigate, go back, press keys
+ - Element Inspector — hover to highlight, click to copy LLM-friendly context
+ - Full CDP automation: click, fill, type, screenshot, JS eval, key press
 
  ### Notifications
- - **Activity-based detection** — monitors output throughput, no fragile pattern matching
- - **Taskbar flash** orange flash when notifications arrive while unfocused
- - **Windows toast** — native OS notification with click-to-focus
- - **Process exit alerts** notifies on non-zero exit codes
- - **Notification panel** `Ctrl+I`, read/unread tracking, per-workspace filtering
- - **Sound** — Web Audio synthesized tones per notification type
-
- ### MCP Server (Claude Code Integration)
- wmux automatically registers its MCP server when launched. Claude Code can:
-
- | Tool | What it does |
- |------|-------------|
- | `browser_open` | Open a new browser panel |
- | `browser_navigate` | Go to URL |
- | `browser_screenshot` | Capture page as PNG (CDP) |
- | `browser_snapshot` | Get page structure with interactive element refs |
- | `browser_click` | Click element by ref number |
- | `browser_fill` | Fill form fields by ref |
- | `browser_type` | Type text into element (CDP keyboard input) |
- | `browser_press_key` | Press keyboard key (Enter, Tab, etc.) |
- | `browser_evaluate` | Execute JavaScript in page context |
- | `browser_hover` | Hover over element |
- | `browser_select` | Select dropdown options |
- | `browser_scroll_into_view` | Scroll element into viewport |
- | `terminal_read` | Read terminal screen |
- | `terminal_send` | Send text to terminal |
- | `terminal_send_key` | Send key (enter, ctrl+c, etc.) |
- | `workspace_list` | List all workspaces |
- | `surface_list` | List surfaces |
- | `pane_list` | List panes |
-
- **Multi-agent:** All browser tools accept `surfaceId` — each Claude Code session controls its own browser independently.
-
- ### Security
- - **Token authentication** on all IPC pipes (named pipe + session pipes)
- - **SSRF protection** — URL validation blocks private IPs, file://, javascript: schemes
- - **Input sanitization** — PTY command injection prevention
- - **CDP port randomization** — no fixed debug port
- - **Memory pressure watchdog** — auto-reaps dead sessions at 750MB, blocks new at 1GB
- - **Electron Fuses** — RunAsNode disabled, cookie encryption enabled
-
- ### Agent Status Detection
- Gate-based detection for AI coding agents:
- - Claude Code, Cursor, Aider, Codex CLI, Gemini CLI, OpenCode, GitHub Copilot CLI
- - Detects agent startup, monitors activity
- - Critical action warnings (git push --force, rm -rf, DROP TABLE, etc.)
+ - Output throughput-based activity detection
+ - Taskbar flash + Windows toast notifications
+ - Process exit alerts
+ - Notification panel: `Ctrl+I`
+ - Web Audio sound effects
+
+ ### Agent Detection
+ Claude Code, Cursor, Aider, Codex CLI, Gemini CLI, OpenCode, GitHub Copilot CLI
+ - Detects agent start → activates monitoring
+ - Critical action warnings
 
  ### Daemon Process
  - Background session management (survives app restart)
- - Suspend/resume with scrollback buffer dump
- - Auto-recovery of sessions on daemon restart
+ - Scrollback buffer dump and auto-recovery
  - Dead session TTL reaping (24h default)
 
  ### Themes
  Catppuccin, Tokyo Night, Dracula, Nord, Gruvbox, Solarized, One Dark, and more.
 
  ### i18n
- English, 한국어, 日本語, 中文
+ English, Korean, Japanese, Chinese
 
  ---
 
@@ -152,7 +166,7 @@ English, 한국어, 日本語, 中文
  | `Ctrl+W` | Close tab |
  | `Ctrl+N` | New workspace |
  | `Ctrl+1~9` | Switch workspace |
- | `Ctrl+click` | Add workspace to multiview |
+ | `Ctrl+click` | Add to multiview |
  | `Ctrl+Shift+G` | Exit multiview |
  | `Ctrl+Shift+L` | Open browser |
  | `Ctrl+B` | Toggle sidebar |
@@ -161,8 +175,6 @@ English, 한국어, 日本語, 中文
  | `Ctrl+,` | Settings |
  | `Ctrl+F` | Search terminal |
  | `Ctrl+Shift+X` | Vi copy mode |
- | `Ctrl+Shift+H` | Flash pane |
- | `Alt+Ctrl+Arrow` | Focus adjacent pane |
  | `F12` | Browser DevTools |
 
  ---
@@ -191,7 +203,7 @@ npm start # Dev mode
  npm run make # Build installer
  ```
 
- ### Requirements (development only)
+ ### Requirements (dev only)
  - Node.js 18+
  - Python 3.x (for node-gyp)
  - Visual Studio Build Tools with C++ workload
@@ -208,11 +220,11 @@ Electron Main Process
  ├── PTYBridge (data forwarding + ActivityMonitor)
  ├── AgentDetector (gate-based agent status)
  ├── SessionManager (atomic save with .bak recovery)
- ├── ScrollbackPersistence (dump/load terminal buffers)
+ ├── ScrollbackPersistence (terminal buffer dump/load)
  ├── PipeServer (Named Pipe JSON-RPC + token auth)
  ├── McpRegistrar (auto-registers MCP in ~/.claude.json)
- ├── WebviewCdpManager (CDP proxy to <webview> via debugger)
- ├── DaemonClient (optional daemon mode connector)
+ ├── WebviewCdpManager (CDP proxy to <webview>)
+ ├── DaemonClient (daemon mode connector)
  └── ToastManager (OS notifications + taskbar flash)
 
  Renderer Process (React 19 + Zustand)
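Review note: the tree above describes SessionManager as "atomic save with .bak recovery", but this diff does not show how that works. The following is a minimal sketch of one common way to get that behavior, not wmux's actual code. The names `saveAtomic` and `loadWithRecovery` and the `.tmp` suffix are hypothetical.

```javascript
const fs = require('fs');

// Hypothetical helper: write to a temp file first, keep the previous
// good copy as <file>.bak, then rename the temp file into place.
function saveAtomic(file, data) {
  const tmp = file + '.tmp';
  fs.writeFileSync(tmp, JSON.stringify(data));
  if (fs.existsSync(file)) {
    fs.copyFileSync(file, file + '.bak');
  }
  fs.renameSync(tmp, file);
}

// Hypothetical helper: try the main file, then the backup.
// Returns null when neither can be read and parsed.
function loadWithRecovery(file) {
  for (const candidate of [file, file + '.bak']) {
    try {
      return JSON.parse(fs.readFileSync(candidate, 'utf-8'));
    } catch {
      // Missing or corrupt: fall through to the next candidate.
    }
  }
  return null;
}
```

Because the main file is only ever replaced by a rename, a crash during the write leaves either the old file or the new one, and a corrupt main file can still be recovered from the backup.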
@@ -223,7 +235,7 @@ Renderer Process (React 19 + Zustand)
  ├── SettingsPanel (workspace reset)
  └── Multiview grid
 
- Daemon Process (optional, standalone)
+ Daemon Process (standalone)
  ├── DaemonSessionManager (ConPTY lifecycle)
  ├── RingBuffer (circular scrollback buffer)
  ├── StateWriter (session suspend/resume)
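Review note: the daemon tree lists a RingBuffer described as a "circular scrollback buffer". For readers unfamiliar with the structure, here is a small illustrative sketch. It assumes the buffer stores whole lines and drops the oldest line when full. The class name `LineRing` is hypothetical and the real implementation may differ (for example, by storing bytes rather than lines).

```javascript
// Hypothetical fixed-capacity ring: once full, each push overwrites
// the oldest entry, so memory use stays bounded.
class LineRing {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = new Array(capacity);
    this.start = 0; // index of the oldest stored line
    this.length = 0; // number of lines currently stored
  }

  push(line) {
    const end = (this.start + this.length) % this.capacity;
    this.items[end] = line;
    if (this.length < this.capacity) {
      this.length++;
    } else {
      // Buffer was full: the slot we wrote held the oldest line.
      this.start = (this.start + 1) % this.capacity;
    }
  }

  // Returns the stored lines from oldest to newest.
  toArray() {
    const out = [];
    for (let i = 0; i < this.length; i++) {
      out.push(this.items[(this.start + i) % this.capacity]);
    }
    return out;
  }
}
```

With a capacity of 3, pushing `a`, `b`, `c`, `d` leaves `b`, `c`, `d` in the buffer.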
@@ -233,8 +245,8 @@ Daemon Process (optional, standalone)
 
  MCP Server (stdio)
  ├── PlaywrightEngine (CDP connection, fast-fail)
- ├── CDP RPC fallback (browser.screenshot, browser.evaluate, etc.)
- └── Bridges Claude Code <-> wmux via Named Pipe RPC
+ ├── CDP RPC fallback (screenshot, evaluate, type, click)
+ └── Claude Code <-> wmux Named Pipe RPC bridge
  ```
 
  ---
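Review note: the README's security list says SSRF protection blocks private IPs and the `file://` and `javascript:` schemes, but the validator itself is not part of this diff. The sketch below shows what such a check could look like under those stated rules. The function names `isPrivateIPv4` and `isUrlAllowed` are hypothetical, and the sketch only inspects the URL text; it does not resolve DNS names, so a hostname that resolves to a private address is outside what it catches.

```javascript
const net = require('net');

// Hypothetical check: true for IPv4 literals in loopback, private,
// or link-local ranges.
function isPrivateIPv4(host) {
  if (net.isIP(host) !== 4) return false;
  const [a, b] = host.split('.').map(Number);
  return (
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

// Hypothetical validator: allow only http(s) URLs whose host is not
// localhost or a private IPv4 literal.
function isUrlAllowed(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return false; // unparseable input is rejected
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    return false; // rejects file:, javascript:, and other schemes
  }
  const host = url.hostname;
  if (host === 'localhost' || host === '[::1]') return false;
  return !isPrivateIPv4(host);
}
```

An allow-list of schemes is used here instead of a block-list, so schemes the README does not mention (such as `data:`) are rejected as well. That is a design choice of this sketch, not a documented wmux behavior.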
@@ -1,8 +1,19 @@
  "use strict";
  Object.defineProperty(exports, "__esModule", { value: true });
  exports.handleSystem = handleSystem;
+ const fs_1 = require("fs");
+ const path_1 = require("path");
  const client_1 = require("../client");
  const utils_1 = require("../utils");
+ function getFallbackVersion() {
+     try {
+         const pkg = JSON.parse((0, fs_1.readFileSync)((0, path_1.join)(__dirname, '..', '..', 'package.json'), 'utf-8'));
+         return pkg.version ?? '0.0.0';
+     }
+     catch {
+         return '0.0.0';
+     }
+ }
  async function handleSystem(cmd, args, jsonMode) {
      let response;
      switch (cmd) {
@@ -18,7 +29,7 @@ async function handleSystem(cmd, args, jsonMode) {
                  }
                  const info = response.result;
                  console.log(`app: ${info?.app ?? 'wmux'}`);
-                 console.log(`version: ${info?.version ?? '1.0.0'}`);
+                 console.log(`version: ${info?.version ?? getFallbackVersion()}`);
                  console.log(`platform: ${info?.platform ?? process.platform}`);
              }
              break;
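Review note: the change above replaces the hard-coded `'1.0.0'` fallback with a value read from `package.json`. The sketch below restates that logic in a form that can be run in isolation. It mirrors the helper in the hunk, except that the file path is passed in as a parameter, which is an assumption made here for testability. The names `readPackageVersion` and `displayVersion` are hypothetical.

```javascript
const fs = require('fs');

// Mirrors the hunk's helper, with the package.json path as a
// parameter (an assumption of this sketch): any read or parse
// failure yields '0.0.0'.
function readPackageVersion(pkgPath) {
  try {
    const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf-8'));
    return pkg.version ?? '0.0.0';
  } catch {
    return '0.0.0';
  }
}

// The app-reported version wins; the package.json value is used only
// when info or info.version is null or undefined.
function displayVersion(info, pkgPath) {
  return info?.version ?? readPackageVersion(pkgPath);
}
```

Note that `??` falls back only on `null` and `undefined`, so an empty-string version reported by the app would be printed as-is rather than replaced.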
@@ -14,9 +14,20 @@ const wait_1 = require("./playwright/tools/wait");
  const file_1 = require("./playwright/tools/file");
  const utility_1 = require("./playwright/tools/utility");
  const extraction_1 = require("./playwright/tools/extraction");
+ const fs_1 = require("fs");
+ const path_1 = require("path");
+ function getVersion() {
+     try {
+         const pkg = JSON.parse((0, fs_1.readFileSync)((0, path_1.join)(__dirname, '..', '..', 'package.json'), 'utf-8'));
+         return pkg.version ?? '0.0.0';
+     }
+     catch {
+         return '0.0.0';
+     }
+ }
  const server = new mcp_js_1.McpServer({
      name: 'wmux',
-     version: '1.0.0',
+     version: getVersion(),
  });
  // Helper: wrap an RPC call as an MCP tool result
  async function callRpc(method, params = {}) {
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
    "name": "@wong2kim/wmux",
    "productName": "wmux",
-   "version": "2.0.0",
+   "version": "2.0.1",
    "description": "Windows terminal multiplexer with MCP server for AI agents - run multiple CLI sessions in parallel, control via Claude Code and other AI tools",
    "main": ".vite/build/index.js",
    "scripts": {