agentgui 1.0.385 → 1.0.386
- package/.prd +255 -0
- package/package.json +1 -1
- package/server.js +2 -4
package/.prd
ADDED
@@ -0,0 +1,255 @@
+# AgentGUI ACP Compliance PRD
+
+## Overview
+Transform AgentGUI into a fully ACP (Agent Connect Protocol) v0.2.3 compliant server while fixing UI consistency issues and optimizing WebSocket usage.
+
+**Current Status**: ~30% ACP compliant (basic conversation/message CRUD exists)
+**Target**: 100% ACP compliant with all endpoints, thread management, stateless runs, and run control
+
+**Note on "Slash Commands"**: ACP spec contains no slash command concept. This is purely a client-side UI feature outside ACP scope. If user wants slash commands implemented, that would be a separate UI enhancement task.
+
+---
+
+## Dependency Graph & Execution Waves
+
+### WAVE 3: Streaming & Run Control (2 items - after Wave 2)
+
+**3.1** SSE (Server-Sent Events) Streaming
+- BLOCKS: 2.1, 2.2, 2.3
+- BLOCKED_BY: 4.1
+- Implement SSE endpoint format (Content-Type: text/event-stream)
+- Stream run outputs as ACP `RunOutputStream` format
+- Support both `ValueRunResultUpdate` and `CustomRunResultUpdate` modes
+- Event types: data, error, done
+- Keep-alive pings every 15 seconds
+- Handle client disconnect gracefully
+- Convert existing chunk/event stream to SSE format
+- Parallel SSE + WebSocket support (both work simultaneously)
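
Editor's sketch for item 3.1: the SSE requirements above (the `text/event-stream` content type, the data/error/done event types, 15-second keep-alives, and disconnect handling) could be wired up roughly as follows. The helper names `formatSseEvent`, `formatSseKeepAlive`, and `openSseStream` are hypothetical and do not appear in the package; `res` is assumed to be a Node `http.ServerResponse`.

```javascript
// Hypothetical helper: serialize one SSE frame (not part of AgentGUI).
// JSON.stringify emits no raw newlines, so a single "data:" line suffices.
function formatSseEvent(eventType, payload) {
  return `event: ${eventType}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Lines starting with ':' are comments that SSE clients ignore,
// which makes them usable as keep-alive pings.
function formatSseKeepAlive() {
  return ': ping\n\n';
}

// Assumed wiring onto a Node http.ServerResponse.
function openSseStream(res, keepAliveMs = 15000) {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });
  const timer = setInterval(() => res.write(formatSseKeepAlive()), keepAliveMs);
  // Stop pinging once the client disconnects.
  res.on('close', () => clearInterval(timer));
  return {
    send: (type, payload) => res.write(formatSseEvent(type, payload)),
    end: () => {
      clearInterval(timer);
      res.end();
    },
  };
}
```

A handler would call `openSseStream(res)` once, then `send('data', update)` for each run output and `send('done', {})` before `end()`. How the payload maps onto `RunOutputStream` is left open here.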
+
+**3.2** Run Cancellation & Control
+- BLOCKS: 1.1, 1.2
+- BLOCKED_BY: 4.1
+- Implement run status state machine: pending → active → completed/error/cancelled
+- Cancel endpoint kills agent process (SIGTERM then SIGKILL)
+- Update run status to 'cancelled' in database
+- Broadcast cancellation via WebSocket
+- Clean up active execution tracking
+- Return 409 if run already completed/cancelled
+- Wait endpoint implements long-polling (30s timeout, return current status)
+- Handle graceful degradation if agent doesn't support cancellation
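
Editor's sketch for item 3.2: one possible shape for the status state machine and the SIGTERM-then-SIGKILL escalation. All names (`TRANSITIONS`, `canTransition`, `cancelRun`, `terminateProcess`) are hypothetical. Allowing `pending → cancelled` and the 3-second grace period are assumptions; the PRD only says the process must be killed within 5 seconds.

```javascript
// Assumed transition table; terminal states have no outgoing edges.
const TRANSITIONS = {
  pending: ['active', 'cancelled'], // pending -> cancelled is an assumption
  active: ['completed', 'error', 'cancelled'],
  completed: [],
  error: [],
  cancelled: [],
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}

// Returns the HTTP status a cancel endpoint might send, plus the updated run.
function cancelRun(run) {
  if (!canTransition(run.status, 'cancelled')) {
    return { httpStatus: 409, run }; // already completed, errored, or cancelled
  }
  return { httpStatus: 200, run: { ...run, status: 'cancelled' } };
}

// child is a Node ChildProcess; escalate to SIGKILL if it ignores SIGTERM.
function terminateProcess(child, graceMs = 3000) {
  child.kill('SIGTERM');
  const timer = setTimeout(() => child.kill('SIGKILL'), graceMs);
  child.once('exit', () => clearTimeout(timer));
}
```

Keeping the table as data makes the 409 rule fall out of the same check that guards every other transition.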
+
+### WAVE 4: UI Fixes & Optimization (3 items - after Wave 3)
+
+**4.1** Thread Sidebar UI Consistency
+- BLOCKS: 2.1, 2.2, 3.1
+- BLOCKED_BY: nothing
+- Audit conversation list rendering: verify agent display matches conversation.agentId
+- Ensure model selection persists when loading existing conversation
+- On conversation resume: restore last-used agent and model to UI selectors
+- Fix any duplicate agent/model displays in sidebar or header
+- Test: create conversation with agent A, reload page, verify agent A shown
+- Test: switch to agent B mid-conversation, reload, verify agent B shown
+- Store agent/model in conversation record, use as source of truth
+
+**4.2** WebSocket Usage Optimization
+- BLOCKS: 3.1
+- BLOCKED_BY: nothing
+- Audit all broadcastSync calls: identify high-frequency low-value messages
+- Batch streaming_progress events (max 10 events per 100ms window)
+- Only broadcast to subscribed clients (per sessionId or conversationId)
+- Compress large payloads before WebSocket send
+- Add message priority: high (errors, completion), normal (progress), low (status)
+- Rate limit per client: max 100 msg/sec
+- Implement message deduplication for identical consecutive events
+- Monitor: track bytes sent per client, log if >1MB/sec sustained
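
Editor's sketch for item 4.2: one reading of the batching and deduplication bullets. `createBatcher` is a hypothetical helper. It assumes "max 10 events per 100ms window" means a batch is flushed once it holds 10 events or once 100 ms have passed, whichever comes first; the PRD does not confirm that reading.

```javascript
// Hypothetical batcher: collects events and hands them to flush() in groups.
function createBatcher(flush, { maxEvents = 10, windowMs = 100 } = {}) {
  let queue = [];
  let timer = null;
  let lastKey = null;

  function drain() {
    if (timer) {
      clearTimeout(timer);
      timer = null;
    }
    if (queue.length === 0) return;
    const batch = queue;
    queue = [];
    flush(batch);
  }

  function push(event) {
    const key = JSON.stringify(event);
    if (key === lastKey) return false; // drop identical consecutive event
    lastKey = key;
    queue.push(event);
    if (queue.length >= maxEvents) {
      drain(); // size cap reached: flush immediately
    } else if (!timer) {
      timer = setTimeout(drain, windowMs); // otherwise flush when the window ends
    }
    return true;
  }

  return { push, drain };
}
```

In use, `flush` would wrap the existing broadcast call so that each batch goes out as one WebSocket message, for example `createBatcher((batch) => send({ type: 'streaming_progress_batch', events: batch }))`, where `send` and the message type are likewise placeholders.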
+
+**4.3** Consolidate Duplicate Displays
+- BLOCKS: 4.1
+- BLOCKED_BY: nothing
+- Identify all places where agent/model info is displayed
+- Remove duplicate displays: keep one authoritative location per UI section
+- Sidebar: show agent name only (remove if duplicated elsewhere)
+- Header/toolbar: show model + agent if conversation active
+- Message bubbles: show agent avatar/name per message only if multi-agent conversation
+- Test: verify no redundant agent/model text after changes
+
+---
+
+## Additional Enhancements (Non-blocking)
+
+### NICE-TO-HAVE 1: Webhook Callbacks
+- Implement webhook support for run status changes
+- POST to webhook URL when run status changes (pending → active → completed)
+- Retry logic: 3 attempts with exponential backoff
+- Store webhook config in run_metadata table
+- Validate webhook URL format on run creation
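
Editor's sketch for the webhook retry bullet: "3 attempts with exponential backoff" could look like the following. `backoffDelayMs` and `postWebhook` are hypothetical names, the 500 ms base delay is an assumption, and the global `fetch` is assumed to be available (Node 18 or later); a different client can be injected through `fetchImpl`.

```javascript
// Delay before retry number retryIndex (0-based): base, 2x base, 4x base, ...
function backoffDelayMs(retryIndex, baseMs = 500) {
  return baseMs * 2 ** retryIndex;
}

// POST a status change, retrying on network errors and non-2xx responses.
async function postWebhook(url, body, { attempts = 3, baseMs = 500, fetchImpl = fetch } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetchImpl(url, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(body),
      });
      if (res.ok) return res;
      lastError = new Error(`Webhook responded with HTTP ${res.status}`);
    } catch (err) {
      lastError = err;
    }
    // Sleep only between attempts, not after the final one.
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(i, baseMs)));
    }
  }
  throw lastError;
}
```

With the defaults this waits 500 ms and then 1000 ms between the three attempts.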
+
+### NICE-TO-HAVE 2: Run Interrupts
+- Support interrupt mechanism for agents that implement it
+- Interrupt types: user feedback request, tool approval, configuration needed
+- Store interrupt state in sessions table
+- API endpoints: GET /runs/{id}/interrupts, POST /runs/{id}/resume with interrupt response
+- UI: show interrupt prompt, collect user input, resume run
+
+### NICE-TO-HAVE 3: Enhanced Search & Filtering
+- Full-text search on thread content (messages, agent responses)
+- Filter by agent type, date range, status, metadata fields
+- Search history: recent searches saved per user
+- Autocomplete for search filters
+- Export search results as JSON
+
+### NICE-TO-HAVE 4: Thread Templates
+- Save thread configuration as template
+- Templates include: agent, model, initial prompt, working directory
+- Clone thread from template
+- Share templates between users (if multi-user support added)
+
+---
+
+## Testing Requirements (Per Item)
+
+Each implementation item must include:
+1. Execute in plugin:gm:dev: create test run for every endpoint/function
+2. Success paths: valid inputs, expected outputs verified
+3. Error paths: invalid inputs, 404s, 409s, 422s verified
+4. Edge cases: empty results, large payloads, concurrent requests
+5. Integration tests: end-to-end flow (create thread → run → stream → cancel)
+6. Database verification: inspect tables after operations, verify foreign keys
+7. WebSocket verification: subscribe, receive events, verify payload format
+8. SSE verification: curl endpoint, verify event-stream format
+
+---
+
+## Acceptance Criteria (All Must Pass)
+
+### Core ACP Compliance
+- [ ] All 23 ACP endpoints implemented and tested
+- [ ] All ACP data models match spec (Thread, ThreadState, Run, Agent, etc.)
+- [ ] Error responses follow ACP format (ErrorResponse schema)
+- [ ] SSE streaming works with curl: `curl -N /threads/{id}/runs/stream`
+- [ ] Stateless runs work without thread context
+- [ ] Run cancellation kills agent process within 5 seconds
+- [ ] Thread copy duplicates all states and checkpoints
+- [ ] Agent descriptors return valid JSON matching AgentACPDescriptor schema
+
+### Database Integrity
+- [ ] No orphaned records after thread/run deletion
+- [ ] Foreign key constraints enforced
+- [ ] Thread status correctly reflects run states
+- [ ] Checkpoint sequences monotonically increase
+- [ ] WAL mode enabled, queries under 100ms for typical operations
+
+### UI Consistency
+- [ ] Sidebar shows correct agent for each conversation
+- [ ] Model selection persists after page reload
+- [ ] No duplicate agent/model displays found
+- [ ] Agent/model changes reflected in database immediately
+
+### WebSocket Optimization
+- [ ] Streaming progress events batched (max 10/100ms)
+- [ ] Only subscribed clients receive messages
+- [ ] No client exceeds 1MB/sec sustained WebSocket traffic
+- [ ] Message deduplication prevents identical consecutive events
+
+### Integration & E2E
+- [ ] Full flow: create thread → start run → stream events → cancel → verify cancelled
+- [ ] Stateless run: create run → stream → complete → verify output
+- [ ] Thread search: create 10 threads → search by metadata → verify correct results
+- [ ] Agent search: search by capability "streaming" → verify all streaming agents returned
+- [ ] Thread copy: create thread with 5 runs → copy → verify new thread has all history
+- [ ] Concurrent runs blocked: start run on thread → start second run → verify 409 conflict
+
+---
+
+## Migration Strategy
+
+### Backward Compatibility
+- Existing conversations map to threads (1:1)
+- Existing sessions map to thread runs
+- `/api/conversations/*` endpoints remain functional (alias to `/threads/*`)
+- Old WebSocket message formats supported alongside new ACP formats
+- No breaking changes to current client code
+
+### Rollout Plan
+1. Deploy database schema changes (additive only, no drops)
+2. Deploy new ACP endpoints alongside existing endpoints
+3. Update client to use ACP endpoints where beneficial
+4. Deprecation notice for old endpoints (6 month window)
+5. Remove old endpoints after deprecation period
+
+---
+
+## Out of Scope
+
+- Multi-user authentication/authorization
+- Slash command implementation (not in ACP spec, pure client feature)
+- Agent marketplace or discovery service
+- Real-time collaboration on threads
+- Thread branching/forking (beyond simple copy)
+- Custom agent development framework
+- Billing/metering for agent usage
+
+---
+
+## Technical Notes
+
+### ACP Terminology Mapping
+- AgentGUI "conversations" = ACP "threads"
+- AgentGUI "sessions" = ACP "runs" (stateful, on a thread)
+- AgentGUI "chunks/events" = ACP "run output stream"
+- AgentGUI "claudeSessionId" = ACP checkpoint ID concept
+
+### Known Gotchas
+- ACP requires UUID format for thread_id, run_id, agent_id (current AgentGUI uses strings)
+- SSE requires newline-delimited format, different from current JSON streaming
+- Run cancellation must handle agents that don't support it gracefully
+- Thread status "idle" means no pending runs; must validate on run creation
+- Webhook URLs must be validated to prevent SSRF attacks
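
Editor's sketch for two of the gotchas above: checking identifier format and screening webhook URLs. `isUuid`, `isPrivateIpv4`, and `isAllowedWebhookUrl` are hypothetical helpers. The URL check is a first-pass filter only: it inspects literal hosts and does not resolve DNS, so a hostname that resolves to a private address would still pass and needs a separate check at request time.

```javascript
// Canonical 8-4-4-4-12 hexadecimal UUID layout, any version.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isUuid(value) {
  return typeof value === 'string' && UUID_RE.test(value);
}

// True for loopback, link-local, and RFC 1918 private IPv4 literals.
function isPrivateIpv4(host) {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || a === 10 || a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

function isAllowedWebhookUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;
  const host = url.hostname.toLowerCase();
  if (host === 'localhost' || host.endsWith('.localhost')) return false;
  if (host.startsWith('[')) return false; // IPv6 literal: rejected conservatively
  return !isPrivateIpv4(host);
}
```

Rejecting every IPv6 literal is a deliberately blunt choice for a sketch; a production filter would classify IPv6 ranges instead.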
+
+### Performance Targets
+- Thread search: <200ms for 10,000 threads
+- Run creation: <50ms (background processing)
+- SSE streaming: <10ms latency per event
+- WebSocket batch: <100ms accumulation window
+- Database writes: <20ms per transaction
+
+---
+
+## Dependencies
+
+**External**:
+- None (all features implemented with existing dependencies)
+
+**Internal**:
+- database.js (extended with new tables/queries)
+- server.js (new route handlers)
+- lib/claude-runner.js (run cancellation support)
+- static/js/client.js (UI consistency fixes)
+- static/js/conversations.js (agent/model persistence)
+- static/js/websocket-manager.js (optimization)
+
+**Configuration**:
+- No new env vars required
+- Existing BASE_URL, PORT, STARTUP_CWD remain unchanged
+
+---
+
+## Success Metrics
+
+- ACP compliance score: 0% → 100%
+- API endpoint coverage: 20 → 43 endpoints
+- WebSocket bandwidth: <50% reduction in bytes/sec per client
+- UI consistency issues: 4 identified → 0 remaining
+- Database tables: 5 → 8 (conversations, messages, sessions, events, chunks, thread_states, checkpoints, run_metadata)
+- Test coverage: endpoint tests for all 43 routes, integration tests for all critical flows
+
+---
+
+## Timeline Estimate
+
+- Wave 1 (Foundation): 3 parallel tasks = 1 completion cycle
+- Wave 2 (Core APIs): 3 parallel tasks = 1 completion cycle
+- Wave 3 (Streaming): 2 tasks = 1 completion cycle
+- Wave 4 (UI Fixes): 3 tasks = 1 completion cycle
+
+**Total**: 4 completion cycles (waves executed sequentially, items within wave executed in parallel with max 3 concurrent subagents per wave)
package/package.json
CHANGED
package/server.js
CHANGED
@@ -12,8 +12,7 @@ import { OAuth2Client } from 'google-auth-library';
 import express from 'express';
 import Busboy from 'busboy';
 import fsbrowse from 'fsbrowse';
-import { queries
-import { createACPQueries } from './acp-queries.js';
+import { queries } from './database.js';
 import { runClaudeWithStreaming } from './lib/claude-runner.js';
 import { initializeDescriptors, getAgentDescriptor } from './lib/agent-descriptors.js';
 
@@ -341,8 +340,6 @@ function discoverAgents() {
 
 const discoveredAgents = discoverAgents();
 initializeDescriptors(discoveredAgents);
-const acpQueries = createACPQueries(db, prepare);
-acpQueries.getAgentDescriptor = getAgentDescriptor;
 
 const modelCache = new Map();
 
@@ -2608,6 +2605,7 @@ const server = http.createServer(async (req, res) => {
 
 // POST /threads - Create empty thread
 if (pathOnly === '/api/threads' && req.method === 'POST') {
+console.log('[ACP] POST /api/threads HIT');
 try {
 const body = await parseBody(req);
 const metadata = body.metadata || {};