agentgui 1.0.576 → 1.0.577

Files changed (2)
  1. package/package.json +1 -1
  2. package/readme.md +238 -94
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "agentgui",
-   "version": "1.0.576",
+   "version": "1.0.577",
    "description": "Multi-agent ACP client with real-time communication",
    "type": "module",
    "main": "server.js",
package/readme.md CHANGED
@@ -1,78 +1,167 @@
  # AgentGUI

- [![GitHub Pages](https://img.shields.io/badge/GitHub_Pages-Enabled-blue?logo=github)](https://anentrypoint.github.io/agentgui/)
- [![npm version](https://badge.fury.io/js/agentgui.svg)](https://www.npmjs.com/package/agentgui)
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-
- **Multi-agent GUI client for AI coding agents** with real-time streaming, WebSocket sync, and SQLite persistence.
+ <div align="center">

  ![AgentGUI Main Interface](docs/screenshot-main.png)

- ## Features
+ **Multi-agent GUI for AI coding assistants**

- - **🤖 Multi-Agent Support** - Claude Code, Gemini CLI, OpenCode, Goose, Kilo, and more
- - **📡 Real-Time Streaming** - Live execution visualization with WebSocket sync
- - **💾 Persistent Storage** - SQLite-based conversation and session history
- - **🎤 Voice I/O** - Built-in speech-to-text and text-to-speech with @huggingface/transformers
- - **📁 File Browser** - Integrated file system explorer with drag-drop upload
- - **🔧 Tool Manager** - Install and update agent plugins directly from UI
- - **🎨 Modern UI** - Dark/light themes with responsive design
- - **🔌 ACP Protocol** - Auto-discovery and lifecycle management for ACP tools
+ [![GitHub Pages](https://img.shields.io/badge/GitHub_Pages-Live-blue?logo=github)](https://anentrypoint.github.io/agentgui/)
+ [![npm](https://img.shields.io/npm/v/agentgui?color=brightgreen)](https://www.npmjs.com/package/agentgui)
+ [![Weekly Downloads](https://img.shields.io/npm/dw/agentgui?color=brightgreen)](https://www.npmjs.com/package/agentgui)
+ [![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

- ## 📸 Screenshots
+ [Quick Start](#quick-start) • [Features](#features) • [Screenshots](#screenshots) • [Architecture](#architecture) • [Documentation](https://anentrypoint.github.io/agentgui/)

- <table>
- <tr>
- <td><img src="docs/screenshot-chat.png" alt="Chat View" width="400"/><br/><em>Chat & Conversation View</em></td>
- <td><img src="docs/screenshot-files.png" alt="Files Browser" width="400"/><br/><em>File System Browser</em></td>
- </tr>
- <tr>
- <td><img src="docs/screenshot-terminal.png" alt="Terminal" width="400"/><br/><em>Terminal & Execution Output</em></td>
- <td><img src="docs/screenshot-tools-popup.png" alt="Tools" width="400"/><br/><em>Tool Management</em></td>
- </tr>
- </table>
+ </div>

- ## 🚀 Quick Start
+ ---

- ```bash
- # Install globally
- npm install -g agentgui
+ ## Overview
+
+ AgentGUI provides a unified web interface for AI coding agents. Connect to any CLI-based agent (Claude Code, Gemini CLI, OpenCode, Goose, Kilo, Codex, and more) and interact through a real-time streaming interface with SQLite persistence, file management, and speech capabilities.

- # Run the server
- agentgui
+ ## Quick Start

- # Or use npx
+ ### One-Line Install
+
+ ```bash
  npx agentgui
  ```

- Server starts on `http://localhost:3000` and redirects to `/gm/`.
+ The server starts at `http://localhost:3000/gm/`
+
+ ### Manual Installation
+
+ ```bash
+ git clone https://github.com/AnEntrypoint/agentgui.git
+ cd agentgui
+ npm install
+ npm run dev
+ ```
+
+ ### System Requirements
+
+ - **Node.js** 18+ (LTS recommended)
+ - **npm** or **bun**
+ - **AI Coding Agents**: Claude Code, Gemini CLI, OpenCode, Goose, Kilo, or Codex
+ - **Optional**: Python 3.9+ for text-to-speech on Windows
+
+ ## Features
+
+ ### 🤖 Multi-Agent Support
+ Auto-discovers and connects to all installed AI coding agents:
+ - Claude Code (`@anthropic-ai/claude-code`)
+ - Gemini CLI (`@google/gemini-cli`)
+ - OpenCode (`opencode-ai`)
+ - Goose (`goose-ai`)
+ - Kilo (`@kilocode/cli`)
+ - Codex and other CLI-based agents
+
+ ### ⚡ Real-Time Streaming
+ - WebSocket-based streaming for instant agent responses
+ - Live execution visualization with syntax highlighting
+ - Progress indicators for long-running operations
+ - Concurrent agent sessions
+
+ ### 💾 Persistent Storage
+ - SQLite database (`~/.gmgui/data.db`) in WAL mode
+ - Conversation history with full context
+ - Session management and resumption
+ - Message threading and organization
+
+ ### 📁 File Management
+ - Integrated file browser for agent working directories
+ - Drag-and-drop file uploads
+ - Direct file editing and viewing
+ - Context-aware file operations
+
+ ### 🎤 Speech Capabilities
+ - Speech-to-text via Hugging Face Whisper
+ - Text-to-speech with multiple voice options
+ - Automatic model downloading (~470MB)
+ - No API keys required
+
+ ### 🔧 Developer Experience
+ - Hot reload during development
+ - Extensible agent framework
+ - REST API + WebSocket endpoints
+ - Plugin system for custom agents
+
+ ## Screenshots
+
+ ### Main Interface
+ ![Main Interface](docs/screenshot-main.png)

- ## 📋 System Requirements
+ ### Chat & Conversation Views
+ ![Chat View](docs/screenshot-chat.png)

- - **Node.js**: v18+ or Bun v1.0+
- - **OS**: Linux, macOS, or Windows
- - **RAM**: 2GB+ recommended
- - **Disk**: 500MB for voice models (auto-downloaded)
+ ![Conversation History](docs/screenshot-conversation.png)

- ## 🏗️ Architecture
+ ### File Browser
+ ![File Browser](docs/screenshot-files.png)
+
+ ### Terminal Execution
+ ![Terminal View](docs/screenshot-terminal.png)
+
+ ### Tools Management
+ ![Tools Popup](docs/screenshot-tools-popup.png)
+
+ ## Architecture

  ```
- server.js              HTTP server + WebSocket + API routes
- database.js            SQLite (WAL mode) + queries
- lib/claude-runner.js   Agent framework - spawns CLI processes
- lib/acp-manager.js     ACP tool lifecycle management
- lib/speech.js          Speech-to-text + text-to-speech
- static/                Frontend (vanilla JS, no build step)
+ ┌─────────────────────────────────────────────────────────────────┐
+ │                         Browser Client                          │
+ │ ┌──────────────┐ ┌──────────────┐  ┌──────────────────────────┐ │
+ │ │   UI Layer   │ │  WebSocket   │  │    Streaming Renderer    │ │
+ │ │  Components  │ │   Manager    │  │    (Event Processor)     │ │
+ │ └──────────────┘ └──────────────┘  └──────────────────────────┘ │
+ └─────────────────────────────────────────────────────────────────┘
+
+                        ┌─────────▼─────────┐
+                        │     HTTP + WS     │
+                        │      Server       │
+                        │    (server.js)    │
+                        └─────────┬─────────┘
+
+            ┌─────────────────────┼─────────────────────┐
+            │                     │                     │
+    ┌───────▼────────┐  ┌─────────▼─────────┐   ┌───────▼────────┐
+    │   SQLite DB    │  │   Agent Runner    │   │     Speech     │
+    │ (database.js)  │  │ (claude-runner.js)│   │  (speech.js)   │
+    └────────────────┘  └─────────┬─────────┘   └────────────────┘
+
+                        ┌─────────▼─────────┐
+                        │  Agent CLI Tools  │
+                        │  (spawned procs)  │
+                        └───────────────────┘
  ```

  ### Key Components

- - **Agent Discovery**: Scans PATH for known CLI binaries at startup
- - **Database**: `~/.gmgui/data.db` - conversations, messages, events, sessions, stream chunks
- - **WebSocket**: Real-time sync at `BASE_URL/sync` with subscribe/unsubscribe
- - **ACP Tools**: Auto-launches OpenCode (port 18100) and Kilo (port 18101) as HTTP servers
+ | Component | Purpose | Location |
+ |-----------|---------|----------|
+ | **HTTP Server** | REST API, static files, routing | `server.js` |
+ | **Database** | SQLite persistence (WAL mode) | `database.js` |
+ | **Agent Runner** | CLI spawning, stream parsing | `lib/claude-runner.js` |
+ | **Speech Engine** | STT/TTS via transformers | `lib/speech.js` |
+ | **Client Core** | Main browser logic | `static/js/client.js` |
+ | **WebSocket Manager** | Real-time communication | `static/js/websocket-manager.js` |
+ | **Streaming Renderer** | Event-based UI updates | `static/js/streaming-renderer.js` |
+ | **CLI Entry** | `npx agentgui` handler | `bin/gmgui.cjs` |
+
+ ## Configuration
+
+ Environment variables:
+
+ | Variable | Default | Description |
+ |----------|---------|-------------|
+ | `PORT` | `3000` | Server port |
+ | `BASE_URL` | `/gm` | URL prefix for all routes |
+ | `STARTUP_CWD` | Current dir | Working directory for agents |
+ | `HOT_RELOAD` | `true` | Enable watch mode |
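
The defaults in the table above can be read as a small resolution step at startup. The sketch below is illustrative only: `resolveConfig` is a hypothetical helper, not a function from AgentGUI, and treating only the literal string "false" as disabling hot reload is an assumption taken from the removed README wording.

```javascript
// Hypothetical helper: resolve settings from an env-like object,
// falling back to the defaults listed in the Configuration table.
function resolveConfig(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '3000', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`Invalid PORT: ${env.PORT}`);
  }
  return {
    port,
    // Normalize the prefix: one leading slash, no trailing slash.
    baseUrl: '/' + (env.BASE_URL ?? '/gm').replace(/^\/+|\/+$/g, ''),
    startupCwd: env.STARTUP_CWD ?? process.cwd(),
    // Assumption: only the literal string "false" disables watch mode.
    hotReload: env.HOT_RELOAD !== 'false',
  };
}

console.log(resolveConfig({ PORT: '4000', BASE_URL: '/gm/' }));
```

Passing an explicit object instead of `process.env` keeps the helper easy to exercise without touching the real environment.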

- ## 🔌 API Endpoints
+ ## REST API

  All routes prefixed with `BASE_URL` (default `/gm`):

@@ -80,74 +169,129 @@ All routes prefixed with `BASE_URL` (default `/gm`):
  - `GET /api/conversations` - List all conversations
  - `POST /api/conversations` - Create new conversation
  - `GET /api/conversations/:id` - Get conversation details
- - `POST /api/conversations/:id/messages` - Send message
+ - `DELETE /api/conversations/:id` - Delete conversation
+
+ ### Messages & Streaming
+ - `POST /api/conversations/:id/messages` - Send message to agent
  - `POST /api/conversations/:id/stream` - Start streaming execution
+ - `GET /api/conversations/:id/chunks` - Get stream chunks

- ### Tools
- - `GET /api/tools` - List detected tools with installation status
+ ### Agents & Tools
+ - `GET /api/agents` - List discovered agents
+ - `GET /api/tools` - List available tools
  - `POST /api/tools/:id/install` - Install tool
  - `POST /api/tools/:id/update` - Update tool
- - `POST /api/tools/update` - Batch update all tools

- ### Voice
+ ### Speech
  - `POST /api/stt` - Speech-to-text (raw audio)
- - `POST /api/tts` - Text-to-speech
- - `GET /api/speech-status` - Model loading status
+ - `POST /api/tts` - Text-to-speech (returns audio)
+ - `GET /api/speech-status` - Model download status
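
Because every route carries the `BASE_URL` prefix, a client has to join the prefix and the route path before calling the API. A minimal sketch, assuming the default prefix `/gm` and a local server on port 3000; `apiUrl` is a hypothetical helper for illustration, not part of AgentGUI.

```javascript
// Hypothetical helper: build a full URL for one of the routes listed above.
function apiUrl(origin, route, baseUrl = '/gm') {
  // Normalize the prefix to one leading slash and no trailing slash.
  const prefix = '/' + baseUrl.replace(/^\/+|\/+$/g, '');
  const path = '/' + route.replace(/^\/+/, '');
  return new URL((prefix === '/' ? '' : prefix) + path, origin).toString();
}

console.log(apiUrl('http://localhost:3000', '/api/conversations'));
// -> http://localhost:3000/gm/api/conversations
console.log(apiUrl('http://localhost:3000', 'api/speech-status'));
// -> http://localhost:3000/gm/api/speech-status
```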

- ## 🎙️ Voice Models
+ ### WebSocket
+ - Endpoint: `BASE_URL + /sync`
+ - Events: `streaming_start`, `streaming_progress`, `streaming_complete`, `streaming_error`
+ - Subscribe: `{ type: "subscribe", sessionId }` or `{ type: "subscribe", conversationId }`
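
The subscribe messages above are plain JSON objects. The following is a minimal sketch of building the sync URL and a subscribe payload, assuming the default `BASE_URL` of `/gm` and a `ws://` scheme on a local server. `syncUrl` and `subscribeMessage` are hypothetical helper names, and the connection itself is left as a comment because it needs a running server.

```javascript
// Hypothetical helpers for the sync endpoint described above.
function syncUrl(host, baseUrl = '/gm', secure = false) {
  const prefix = '/' + baseUrl.replace(/^\/+|\/+$/g, '');
  return `${secure ? 'wss' : 'ws'}://${host}${prefix === '/' ? '' : prefix}/sync`;
}

// Build a subscribe payload for either a session or a conversation.
function subscribeMessage({ sessionId, conversationId }) {
  if (sessionId) return JSON.stringify({ type: 'subscribe', sessionId });
  if (conversationId) return JSON.stringify({ type: 'subscribe', conversationId });
  throw new TypeError('sessionId or conversationId is required');
}

console.log(syncUrl('localhost:3000'));             // ws://localhost:3000/gm/sync
console.log(subscribeMessage({ sessionId: 's1' })); // {"type":"subscribe","sessionId":"s1"}

// In a browser (assumed usage, requires a running server):
// const ws = new WebSocket(syncUrl(location.host));
// ws.onopen = () => ws.send(subscribeMessage({ conversationId: 'c1' }));
```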

- Speech models (~470MB) are auto-downloaded on first launch:
- - **Whisper Base** (~280MB) - STT from HuggingFace
- - **TTS Models** (~190MB) - Custom text-to-speech
+ ## Text-to-Speech Setup (Windows)

- Models cached at `~/.gmgui/models/`.
+ AgentGUI automatically configures text-to-speech on first use:

- ## 🛠️ Development
+ 1. Detects Python 3.9+ installation
+ 2. Creates virtual environment at `~/.gmgui/pocket-venv`
+ 3. Installs `pocket-tts` via pip
+ 4. Caches setup for subsequent requests

- ```bash
- # Clone repository
- git clone https://github.com/AnEntrypoint/agentgui.git
- cd agentgui
+ **Requirements**: Python 3.9+, ~200MB disk space, internet connection

- # Install dependencies
- npm install
+ **Troubleshooting**:
+ - **Python not found**: Install from [python.org](https://www.python.org) with "Add Python to PATH"
+ - **Setup fails**: Check write access to `~/.gmgui/`
+ - **Manual cleanup**: Delete `%USERPROFILE%\.gmgui\pocket-venv` and retry
+
+ ## Development

- # Run dev server with watch mode
+ ### Running in Dev Mode
+
+ ```bash
  npm run dev
+ ```

- # Build portable binaries
- npm run build:portable
+ Server auto-reloads on file changes.
+
+ ### Project Structure
+
+ ```
+ agentgui/
+ ├── server.js              # Main server (HTTP + WebSocket + API)
+ ├── database.js            # SQLite schema and queries
+ ├── lib/
+ │   ├── claude-runner.js   # Agent execution framework
+ │   ├── acp-manager.js     # ACP tool lifecycle
+ │   ├── speech.js          # STT/TTS processing
+ │   └── tool-manager.js    # Tool installation/updates
+ ├── static/
+ │   ├── index.html         # Main app shell
+ │   ├── js/
+ │   │   ├── client.js      # Core client logic
+ │   │   ├── websocket-manager.js
+ │   │   ├── streaming-renderer.js
+ │   │   └── ...
+ │   └── templates/         # HTML event templates
+ └── bin/
+     └── gmgui.cjs          # CLI entry point
  ```

- ## 📦 Tool Detection
+ ### Adding Custom Agents
+
+ 1. Add agent descriptor to `lib/agent-descriptors.js`
+ 2. Implement CLI detection logic
+ 3. Configure spawn parameters
+ 4. Add to agent discovery scan
+
+ ## Troubleshooting
+
+ ### Server Won't Start
+ - Check if port 3000 is already in use: `lsof -i :3000` (macOS/Linux) or `netstat -ano | findstr :3000` (Windows)
+ - Try a different port: `PORT=4000 npm run dev`
+
+ ### Agent Not Detected
+ - Verify agent is installed globally: `which claude` / `where claude`
+ - Check PATH includes agent binary location
+ - Restart server after installing new agents
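
Checks like `which claude` amount to walking the directories in `PATH` and testing whether the binary exists in each one. The sketch below shows that lookup in isolation. `findOnPath` and its injected `exists` predicate are hypothetical, and the sketch does not describe how AgentGUI's own discovery is implemented. On Windows a real lookup would also use `;` as the delimiter and try the extensions in `PATHEXT`, which is omitted here.

```javascript
// Hypothetical lookup: return the first PATH entry that contains `binary`.
// `exists` is injected so the logic can run without touching the disk;
// in Node it could be fs.existsSync.
function findOnPath(binary, pathVar, exists, delimiter = ':', sep = '/') {
  for (const dir of pathVar.split(delimiter)) {
    if (!dir) continue; // skip empty entries such as "a::b"
    const candidate = dir.endsWith(sep) ? dir + binary : dir + sep + binary;
    if (exists(candidate)) return candidate;
  }
  return null;
}

// Example with a fake file system containing a single binary.
const fakeFiles = new Set(['/usr/local/bin/claude']);
const hit = findOnPath('claude', '/usr/bin:/usr/local/bin', (p) => fakeFiles.has(p));
console.log(hit); // /usr/local/bin/claude
```

If the lookup returns `null` for an agent you expect to be installed, the directory holding its binary is probably missing from `PATH`.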
+
+ ### WebSocket Connection Fails
+ - Verify BASE_URL matches your deployment
+ - Check browser console for connection errors
+ - Ensure no proxy/firewall blocking WebSocket

- AgentGUI auto-detects installed AI coding tools:
- - **Claude Code**: `@anthropic-ai/claude-code`
- - **Gemini CLI**: `@google/gemini-cli`
- - **OpenCode**: `opencode-ai`
- - **Kilo**: `@kilocode/cli`
- - **Codex**: `@openai/codex`
+ ### Speech Models Not Downloading
+ - Check internet connection
+ - Verify `~/.gmgui/models/` is writable
+ - Monitor download via `/api/speech-status`

- Install/update directly from the Tools UI.
+ ## Contributing

- ## 🌐 Environment Variables
+ Contributions welcome! Please:

- - `PORT` - Server port (default: 3000)
- - `BASE_URL` - URL prefix (default: /gm)
- - `STARTUP_CWD` - Working directory for agents
- - `HOT_RELOAD` - Set to "false" to disable watch mode
+ 1. Fork the repository
+ 2. Create a feature branch (`git checkout -b feature/amazing-feature`)
+ 3. Commit changes (`git commit -m 'Add amazing feature'`)
+ 4. Push to branch (`git push origin feature/amazing-feature`)
+ 5. Open a Pull Request

- ## 📝 License
+ ## License

- MIT License - see [LICENSE](LICENSE) file for details.
+ MIT © [AnEntrypoint](https://github.com/AnEntrypoint)

- ## 🤝 Contributing
+ ## Links

- Contributions welcome! Please read our contributing guidelines before submitting PRs.
+ - **GitHub**: https://github.com/AnEntrypoint/agentgui
+ - **npm**: https://www.npmjs.com/package/agentgui
+ - **Documentation**: https://anentrypoint.github.io/agentgui/
+ - **Issues**: https://github.com/AnEntrypoint/agentgui/issues

- ## 🔗 Links
+ ---

- - [GitHub Repository](https://github.com/AnEntrypoint/agentgui)
- - [npm Package](https://www.npmjs.com/package/agentgui)
- - [Documentation](https://anentrypoint.github.io/agentgui/)
- - [Issue Tracker](https://github.com/AnEntrypoint/agentgui/issues)
+ <div align="center">
+ Made with ❤️ by the AgentGUI team
+ </div>