computer-agents 0.6.2 → 0.6.4

Files changed (3)
  1. package/README.md +163 -629
  2. package/dist/metadata.js +2 -2
  3. package/package.json +2 -2
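
Reviewer note: most of the README hunks below swap the `type` option for `agentType` in the `new Agent({ ... })` examples. The only other files touched are `package/dist/metadata.js` and `package/package.json` (two lines each), so this diff alone does not show which key the runtime actually reads. As a hedged illustration, the sketch below normalizes a config object so that either spelling ends up under `agentType`. The names `AgentKind`, `LegacyAgentConfig`, `NormalizedAgentConfig`, and `normalizeAgentConfig` are hypothetical and not part of the package; check the published typings before relying on either key.

```typescript
// Hypothetical helper, NOT part of computer-agents. It accepts the 0.6.2 README
// spelling (`type`) or the 0.6.4 README spelling (`agentType`).
type AgentKind = 'llm' | 'computer';

interface LegacyAgentConfig {
  type?: AgentKind;
  agentType?: AgentKind;
  [option: string]: unknown; // workspace, model, instructions, ...
}

interface NormalizedAgentConfig {
  agentType: AgentKind;
  [option: string]: unknown;
}

function normalizeAgentConfig(config: LegacyAgentConfig): NormalizedAgentConfig {
  // Pull both spellings out so the legacy key never reaches the result.
  const { type, agentType, ...rest } = config;

  // Prefer the newer spelling when both are present.
  const kind = agentType ?? type;
  if (kind === undefined) {
    throw new Error("Agent config needs either 'agentType' or 'type'");
  }
  return { ...rest, agentType: kind };
}
```

For example, `normalizeAgentConfig({ type: 'computer', workspace: './my-project' })` yields an object with `agentType: 'computer'` and no `type` key. When both keys are present, `agentType` takes precedence.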
package/README.md CHANGED
@@ -3,42 +3,7 @@
3
3
  [![npm version](https://img.shields.io/npm/v/computer-agents.svg)](https://www.npmjs.com/package/computer-agents)
4
4
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
5
5
 
6
- **The first orchestration framework for parallel computer-use agents.**
7
-
8
- Scale from 1 to 100+ agents. Run experiments in parallel. Test multiple approaches simultaneously. computer-agents enables agent workflows that were previously impossible.
9
-
10
- ## What Makes This Different
11
-
12
- Traditional agent frameworks focus on chat-based LLM agents. computer-agents is built for **computer-use agents** that write code, run tests, and modify files—with native support for **parallel execution at scale**.
13
-
14
- ### Before computer-agents
15
-
16
- - ❌ No parallel orchestration for computer-use agents
17
- - ❌ Single agent, single workspace, sequential execution
18
- - ❌ Hours to run experiments sequentially
19
- - ❌ Limited to local machine resources
20
-
21
- ### With computer-agents
22
-
23
- - ✅ **Parallel Orchestration** - Run 10, 50, 100+ agents simultaneously
24
- - ✅ **Unified Interface** - Seamless local ↔ cloud execution with one config change
25
- - ✅ **Workspace Collaboration** - Multiple agents working on the same codebase
26
- - ✅ **Cloud Scalability** - Effortless scaling beyond local machine limits
27
- - ✅ **Session Continuity** - Automatic multi-turn conversations
28
-
29
- ## Revolutionary Use Cases
30
-
31
- **🔬 Scientific Experiments**
32
- Run 20 experimental variations in parallel instead of sequentially. What took hours now takes minutes.
33
-
34
- **🧪 ML/AI Development**
35
- Test dozens of hyperparameter configurations simultaneously. Systematic exploration of model architectures at scale.
36
-
37
- **⚡️ Multi-Approach Problem Solving**
38
- Try 5 different implementation approaches in parallel. Let the agents find the best solution.
39
-
40
- **🚀 A/B Testing at Scale**
41
- Test multiple implementations, frameworks, or approaches concurrently. Data-driven decision making.
6
+ Build agents that write code, run tests, and modify files. Supports both local and cloud execution with automatic session management.
42
7
 
43
8
  ## Installation
44
9
 
@@ -54,400 +19,246 @@ npm install computer-agents
54
19
  import { Agent, run } from 'computer-agents';
55
20
 
56
21
  const agent = new Agent({
57
- type: "computer",
58
- workspace: "./my-project", // String path = automatic local execution
59
- instructions: "You are an expert developer.",
60
- debug: true // Optional: show detailed logs
22
+ agentType: "computer",
23
+ workspace: "./my-project",
24
+ instructions: "You are an expert developer."
61
25
  });
62
26
 
63
27
  const result = await run(agent, "Create a Python script that calculates fibonacci numbers");
64
28
  console.log(result.finalOutput);
65
29
  ```
66
30
 
67
- ### Cloud Computer Agent (Coming Soon)
68
-
69
- > **Note**: Cloud execution for remote execution is under development and will be available in an upcoming release. The infrastructure is production-ready, and we're finalizing API access for public use.
70
-
71
- When cloud execution becomes available, you'll use Projects for cloud workspaces:
31
+ ### LLM Agent
72
32
 
73
33
  ```typescript
74
- import { Agent, run, CloudClient } from 'computer-agents';
75
-
76
- // Cloud execution will be available soon
77
- const client = new CloudClient({ apiKey: process.env.TESTBASE_API_KEY });
78
- const project = await client.createProject({
79
- name: 'my-project',
80
- localPath: './my-project' // Enable local sync
81
- });
34
+ import { Agent, run } from 'computer-agents';
82
35
 
83
36
  const agent = new Agent({
84
- type: "computer",
85
- workspace: project, // Project = automatic cloud execution
86
- instructions: "You are an expert developer."
37
+ agentType: "llm",
38
+ model: "gpt-4o",
39
+ instructions: "You create detailed implementation plans."
87
40
  });
88
41
 
89
- const result = await run(agent, "Add unit tests to the fibonacci module");
42
+ const result = await run(agent, "Plan how to add user authentication");
90
43
  console.log(result.finalOutput);
91
- // Files automatically synced from cloud to local workspace
92
44
  ```
93
45
 
94
- For now, use local execution (workspace as string) for all computer agent tasks.
46
+ ## Agent Types
47
+
48
+ The SDK supports two agent types:
49
+
50
+ | Type | Execution | Use Cases |
51
+ |------|-----------|-----------|
52
+ | `computer` | Codex SDK | Code generation, file operations, terminal commands |
53
+ | `llm` | OpenAI API | Planning, reasoning, text generation |
95
54
 
96
- ### Streaming Progress (Real-Time Visibility)
55
+ ### Computer Agents
97
56
 
98
- Track agent execution in real-time with `runStreamed()`:
57
+ Computer agents execute code changes using the Codex SDK. They can create files, run commands, and modify codebases.
99
58
 
100
59
  ```typescript
101
- import { Agent, runStreamed } from 'computer-agents';
102
-
103
60
  const agent = new Agent({
104
- type: "computer",
61
+ agentType: "computer",
105
62
  workspace: "./my-project",
63
+ instructions: "You are a Python developer."
106
64
  });
107
65
 
108
- // Stream events in real-time
109
- for await (const event of runStreamed(agent, 'Create a Python web scraper')) {
110
- switch (event.type) {
111
- case 'thread.started':
112
- console.log(`🔗 Thread: ${event.thread_id}`);
113
- break;
114
- case 'turn.started':
115
- console.log('🎬 Turn started');
116
- break;
117
- case 'item.completed':
118
- if (event.item.type === 'file_change') {
119
- const files = event.item.changes.map(c => `${c.kind} ${c.path}`).join(', ');
120
- console.log(`✅ Files: ${files}`);
121
- }
122
- break;
123
- case 'turn.completed':
124
- console.log(`🎉 Completed (${event.usage.input_tokens + event.usage.output_tokens} tokens)`);
125
- break;
126
- }
127
- }
66
+ await run(agent, "Add unit tests for the fibonacci module");
128
67
  ```
129
68
 
130
- **Event Types:**
131
- - `thread.started` - Session initialized with thread ID
132
- - `turn.started` - Agent begins processing
133
- - `item.started` - Tool call or action begins
134
- - `item.completed` - Tool call or action completes (includes file changes)
135
- - `turn.completed` - Processing finished (includes token usage)
136
- - `turn.failed` - Error occurred
137
-
138
- **Use Cases:**
139
- - Progress bars for long-running tasks
140
- - Real-time logging and debugging
141
- - Live UI updates in applications
142
- - Better UX for multi-step operations
69
+ ### LLM Agents
143
70
 
144
- **API Consistency:** `runStreamed()` mirrors `run()` - same signature, just with streaming!
71
+ LLM agents use the OpenAI API for text generation and reasoning tasks.
145
72
 
146
73
  ```typescript
147
- // Standard execution
148
- const result = await run(agent, task);
74
+ const agent = new Agent({
75
+ agentType: "llm",
76
+ model: "gpt-4o",
77
+ instructions: "You review code for quality and security."
78
+ });
149
79
 
150
- // Streaming execution
151
- for await (const event of runStreamed(agent, task)) {
152
- // Real-time progress
153
- }
80
+ await run(agent, "Review the authentication implementation");
154
81
  ```
155
82
 
156
- ### Project Management (Efficient Workspace Sync) - Coming Soon
83
+ ## Cloud Execution
157
84
 
158
- > **Note**: Project Management for cloud execution is under development and will be available in an upcoming release.
85
+ ### Cloud Agents
159
86
 
160
- When available, manage workspaces with the Project API - perfect for organizing code and syncing with cloud storage:
87
+ Execute agents in cloud environments using the CloudClient:
161
88
 
162
89
  ```typescript
163
- import { CloudClient, Agent, run } from 'computer-agents';
164
-
165
- const client = new CloudClient({ apiKey: process.env.TESTBASE_API_KEY });
90
+ import { Agent, run, CloudClient } from 'computer-agents';
166
91
 
167
- // Create a synced project (local ↔ cloud)
168
- const project = await client.createProject({
169
- name: 'my-app',
170
- localPath: './src' // Enables bidirectional sync
92
+ const client = new CloudClient({
93
+ apiKey: process.env.TESTBASE_API_KEY
171
94
  });
172
95
 
173
- // Incremental sync - only uploads changed files (10x faster!)
174
- await project.sync({ direction: 'up' }); // Upload changes
175
- await project.sync({ direction: 'down' }); // Download changes
176
- await project.sync({ direction: 'both' }); // Bi-directional sync
96
+ // Create a cloud project
97
+ const project = await client.createProject({ name: 'my-app' });
177
98
 
178
- // Agents automatically use project workspaces
99
+ // Cloud agent
179
100
  const agent = new Agent({
180
- type: 'computer',
181
- workspace: project // Project = cloud execution
101
+ agentType: "computer",
102
+ workspace: project,
103
+ instructions: "You are an expert developer."
182
104
  });
183
105
 
184
- await run(agent, 'Add user authentication');
185
- // Changes are tracked, next sync will be incremental!
186
- ```
187
-
188
- **Key Benefits:**
189
- - **10x faster sync** - Only transfers changed files (SHA-256 hashing)
190
- - **Organized workspaces** - Manage multiple projects easily
191
- - **Automatic tracking** - Sync state persisted in `.testbase/sync-state.json`
192
- - **Flexible sync** - Choose `up`, `down`, or `both` directions
193
-
194
- **Example: Incremental Sync Performance**
195
- - Full workspace (500MB): ~35 seconds
196
- - Incremental (5MB changes): ~3 seconds
197
-
198
- ```typescript
199
- // List all projects
200
- const projects = await client.listProjects();
201
-
202
- // Get existing project
203
- const project = await client.getProject('project-id');
204
-
205
- // Get sync statistics
206
- const stats = await project.getSyncStats();
207
- console.log(stats); // { lastSyncAt, fileCount, version }
208
-
209
- // Manual file operations
210
- await project.upload(['file1.txt', 'file2.txt']);
211
- await project.download(['file1.txt']);
212
- await project.readFile('config.json');
213
- await project.writeFile('config.json', '{ "new": "data" }');
214
-
215
- // Delete project
216
- await client.deleteProject(project.id);
106
+ const result = await run(agent, "Add error handling to the API");
107
+ console.log(result.finalOutput);
217
108
  ```
218
109
 
219
- ### Parallel Execution
110
+ ### Project Management
220
111
 
221
- **Local Parallel Execution (Available Now):**
222
-
223
- You can run multiple agents in parallel for local development:
112
+ Manage cloud workspaces with the Project API:
224
113
 
225
114
  ```typescript
226
- import { Agent, run } from 'computer-agents';
115
+ import { CloudClient } from 'computer-agents';
227
116
 
228
- // Create 5 agents to test different approaches
229
- const frameworks = ['Express', 'Fastify', 'Koa', 'Hapi', 'Restify'];
230
- const agents = frameworks.map(framework => new Agent({
231
- name: `${framework} Agent`,
232
- type: 'computer',
233
- workspace: `./test-${framework.toLowerCase()}`,
234
- instructions: `You are an expert in ${framework}.`
235
- }));
236
-
237
- // Run all 5 in parallel!
238
- const results = await Promise.all(
239
- agents.map((agent, i) => run(agent, `Create a REST API with ${frameworks[i]}`))
240
- );
241
-
242
- // All 5 implementations complete in the time it takes to run 1
243
- console.log('All 5 frameworks tested in parallel!');
244
- ```
245
-
246
- **Cloud Parallel Execution (Coming Soon):**
247
-
248
- > **Note**: Large-scale parallel execution with cloud infrastructure is coming soon. When available, you'll be able to scale to 100+ concurrent agents using CloudClient and Projects.
249
-
250
- ### LLM Agent (for planning and reasoning)
117
+ const client = new CloudClient({ apiKey: process.env.TESTBASE_API_KEY });
251
118
 
252
- ```typescript
253
- const planner = new Agent({
254
- type: "llm",
255
- model: "gpt-4o",
256
- instructions: "You create detailed implementation plans."
119
+ // Create project with local sync
120
+ const project = await client.createProject({
121
+ name: 'my-app',
122
+ localPath: './src' // Enables bidirectional sync
257
123
  });
258
124
 
259
- const plan = await run(planner, "Plan how to add user authentication");
260
- console.log(plan.finalOutput);
261
- ```
262
-
263
- ## Core Concepts
264
-
265
- computer-agents has just **2 core concepts**:
266
-
267
- 1. **Agent** - Single unified interface for both LLM and computer-use agents
268
- 2. **CloudClient** - Manage cloud projects and infrastructure (coming soon)
269
-
270
- ### 1. Agent - Unified Interface
271
-
272
- ```typescript
273
- type AgentType = 'llm' | 'computer';
274
- ```
275
-
276
- | Type | Execution | Use Cases |
277
- |------|-----------|-----------|
278
- | `'llm'` | OpenAI API | Planning, reasoning, reviewing |
279
- | `'computer'` | Codex SDK | Code, tests, file operations, terminal commands |
125
+ // Sync files
126
+ await project.sync({ direction: 'both' });
280
127
 
281
- **Key insight:** Workspace type determines execution mode automatically:
128
+ // File operations
129
+ const files = await project.listFiles();
130
+ const content = await project.readFile('config.json');
131
+ await project.writeFile('settings.json', '{"debug": true}');
282
132
 
283
- ```typescript
284
- // Local execution (workspace = string path)
285
- const localAgent = new Agent({
286
- type: 'computer',
287
- workspace: './my-project' // String = local execution
288
- });
289
-
290
- // Cloud execution (workspace = Project) - Coming Soon
291
- const cloudAgent = new Agent({
292
- type: 'computer',
293
- workspace: project // Project = cloud execution
294
- });
133
+ // Sync statistics
134
+ const stats = await project.getSyncStats();
135
+ console.log(stats); // { lastSyncAt, fileCount, version }
295
136
  ```
296
137
 
297
- ### 2. CloudClient - Infrastructure Management (Coming Soon)
138
+ ### Streaming Events
298
139
 
299
- Single entry point for cloud operations:
140
+ Monitor agent execution in real-time:
300
141
 
301
142
  ```typescript
302
- const client = new CloudClient({ apiKey: process.env.TESTBASE_API_KEY });
303
-
304
- // Project management
305
- const projects = await client.listProjects();
306
- const project = await client.createProject({ name: 'my-app' });
307
- await client.deleteProject(project.id);
308
-
309
- // Infrastructure (future)
310
- const containers = await client.listContainers();
311
- const stats = await client.getContainerStats(containerId);
312
- ```
313
-
314
- ### Session Continuity
315
-
316
- Agents automatically maintain context across multiple runs:
143
+ import { Agent, runStreamed } from 'computer-agents';
317
144
 
318
- ```typescript
319
145
  const agent = new Agent({
320
- type: 'computer',
321
- workspace: './my-project'
146
+ agentType: "computer",
147
+ workspace: "./my-project"
322
148
  });
323
149
 
324
- await run(agent, 'Create app.py'); // New session
325
- await run(agent, 'Add error handling'); // Continues same session!
326
- await run(agent, 'Add tests'); // Still same session!
327
-
328
- console.log(agent.currentThreadId); // Thread ID maintained
329
-
330
- agent.resetSession(); // Start fresh when needed
331
- await run(agent, 'New project'); // New session
332
- ```
333
-
334
- ## Examples
335
-
336
- Comprehensive examples demonstrating the power of computer-agents:
337
-
338
- ```bash
339
- # Clone the repository
340
- git clone https://github.com/TestBase-ai/computer-agents.git
341
- cd computer-agents
342
- npm install
343
- npm run build
344
-
345
- # Streaming progress (real-time event visibility)
346
- node examples/testbase/streaming-progress.cjs
347
-
348
- # Workspace sync modes (default vs cloud-only)
349
- node examples/testbase/workspace-sync-modes.mjs
350
-
351
- # Parallel execution (the game changer!)
352
- node examples/testbase/parallel-execution.mjs
353
-
354
- # Scale experiments (ML hyperparameter tuning, algorithm comparison)
355
- node examples/testbase/scale-experiments.mjs
356
-
357
- # Multi-agent workflows (planner → executor → reviewer)
358
- node examples/testbase/multi-agent-workflow.mjs
359
-
360
- # Session continuity demonstration
361
- node examples/testbase/hello-world.mjs
150
+ for await (const event of runStreamed(agent, 'Create a web scraper')) {
151
+ switch (event.type) {
152
+ case 'thread.started':
153
+ console.log(`Thread: ${event.thread_id}`);
154
+ break;
155
+ case 'item.completed':
156
+ if (event.item.type === 'file_change') {
157
+ const files = event.item.changes.map(c => `${c.kind} ${c.path}`).join(', ');
158
+ console.log(`Files: ${files}`);
159
+ }
160
+ break;
161
+ case 'turn.completed':
162
+ console.log(`Completed (${event.usage.input_tokens + event.usage.output_tokens} tokens)`);
163
+ break;
164
+ }
165
+ }
362
166
  ```
363
167
 
364
- **[📂 View all examples →](https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase)**
365
-
366
168
  ## Multi-Agent Workflows
367
169
 
368
- Build custom workflows by composing agents:
170
+ Compose multiple agents for complex tasks:
369
171
 
370
172
  ```typescript
371
173
  import { Agent, run } from 'computer-agents';
372
174
 
373
175
  // LLM creates plan
374
176
  const planner = new Agent({
375
- type: 'llm',
177
+ agentType: 'llm',
376
178
  model: 'gpt-4o',
377
- instructions: 'Create detailed implementation plans.'
179
+ instructions: 'Create implementation plans.'
378
180
  });
379
181
 
380
- // Computer agent executes plan
182
+ // Computer agent executes
381
183
  const executor = new Agent({
382
- type: 'computer',
184
+ agentType: 'computer',
383
185
  workspace: './my-project',
384
186
  instructions: 'Execute implementation plans.'
385
187
  });
386
188
 
387
189
  // LLM reviews result
388
190
  const reviewer = new Agent({
389
- type: 'llm',
191
+ agentType: 'llm',
390
192
  model: 'gpt-4o',
391
- instructions: 'Review implementations for quality.'
193
+ instructions: 'Review code quality.'
392
194
  });
393
195
 
394
- // Manual workflow composition - you control the flow
395
196
  const task = "Add user authentication";
396
197
  const plan = await run(planner, `Plan: ${task}`);
397
198
  const code = await run(executor, plan.finalOutput);
398
199
  const review = await run(reviewer, `Review: ${code.finalOutput}`);
399
200
  ```
400
201
 
401
- ## Configuration
202
+ ## Session Continuity
402
203
 
403
- ### Environment Variables
204
+ Agents automatically maintain context across multiple runs:
404
205
 
405
- ```bash
406
- # Required for LLM agents and computer agents (Codex SDK uses OpenAI)
407
- OPENAI_API_KEY=your-openai-key
206
+ ```typescript
207
+ const agent = new Agent({
208
+ agentType: 'computer',
209
+ workspace: './my-project'
210
+ });
408
211
 
409
- # Optional for CloudClient (when cloud execution becomes available)
410
- TESTBASE_API_KEY=your-testbase-key # Get from testbase.ai
212
+ await run(agent, 'Create app.py');
213
+ await run(agent, 'Add error handling'); // Continues same session
214
+ await run(agent, 'Add tests'); // Still same session
215
+
216
+ console.log(agent.currentThreadId); // Thread ID maintained
217
+
218
+ agent.resetSession(); // Start fresh
219
+ await run(agent, 'New project'); // New session
411
220
  ```
412
221
 
222
+ ## Configuration
223
+
413
224
  ### Agent Configuration
414
225
 
415
226
  ```typescript
416
227
  const agent = new Agent({
417
228
  name: "My Agent", // Optional, auto-generated if omitted
418
- type: 'computer', // 'llm' | 'computer'
419
-
420
- // Computer agent specific
229
+ agentType: 'computer', // 'llm' | 'computer'
421
230
  workspace: './my-project', // Required for computer agents
422
- // String = local, Project = cloud
423
-
424
- // LLM agent specific
425
231
  model: 'gpt-4o', // Required for LLM agents
426
-
427
- // Execution settings (computer agents only)
428
- debug: true, // Show detailed logs
429
- timeout: 600000, // 10 minutes (default)
430
- skipGitRepoCheck: true, // Allow execution outside git repos (default: true)
431
-
432
- // Shared
433
232
  instructions: "You are helpful.", // System prompt
434
- mcpServers: [...], // MCP server configurations (optional)
233
+ debug: false, // Show detailed logs
234
+ timeout: 600000, // 10 minutes (default)
235
+ mcpServers: [], // MCP server configurations
435
236
  });
436
237
  ```
437
238
 
438
- ### CloudClient Configuration (Coming Soon)
239
+ ### CloudClient Configuration
439
240
 
440
241
  ```typescript
441
242
  const client = new CloudClient({
442
- apiKey: process.env.TESTBASE_API_KEY, // Required (or use env var)
443
- debug: true, // Show detailed logs
243
+ apiKey: process.env.TESTBASE_API_KEY, // Required
244
+ debug: false, // Show detailed logs
444
245
  timeout: 600000, // 10 minutes (default)
445
246
  });
446
247
  ```
447
248
 
249
+ ### Environment Variables
250
+
251
+ ```bash
252
+ # Required for all agents
253
+ OPENAI_API_KEY=your-openai-key
254
+
255
+ # Required for CloudClient
256
+ TESTBASE_API_KEY=your-testbase-key
257
+ ```
258
+
448
259
  ## MCP Server Integration
449
260
 
450
- Unified MCP configuration works for both agent types:
261
+ Use Model Context Protocol servers with your agents:
451
262
 
452
263
  ```typescript
453
264
  import type { McpServerConfig } from 'computer-agents';
@@ -467,35 +278,13 @@ const mcpServers: McpServerConfig[] = [
467
278
  }
468
279
  ];
469
280
 
470
- // Works for both LLM and computer agents!
471
281
  const agent = new Agent({
472
- type: 'computer',
282
+ agentType: 'computer',
473
283
  workspace: './my-project',
474
- mcpServers // Automatically converted to appropriate format
284
+ mcpServers
475
285
  });
476
286
  ```
477
287
 
478
- The SDK handles conversion automatically:
479
- - **LLM agents**: MCP servers → function tools
480
- - **Computer agents**: MCP servers → Codex SDK config
481
-
482
- ## Performance
483
-
484
- ### Local Execution (workspace = string path)
485
- - **Cold start**: <1 second
486
- - **Warm execution**: <100ms overhead
487
- - **Parallelization**: Limited by local CPU/memory
488
-
489
- ### Cloud Execution (workspace = Project) - Coming Soon
490
- - **First execution**: 30-45 seconds (includes workspace sync)
491
- - **Subsequent runs**: ~5-10 seconds
492
- - **Parallelization**: Scale to 100+ agents
493
-
494
- ### Cloud-Only Mode (no localPath in Project) - Coming Soon
495
- - **Execution**: Faster (no sync overhead)
496
- - **Parallelization**: Scale to 100+ agents
497
- - **Perfect for**: CI/CD, experiments, parallel tasks
498
-
499
288
  ## API Reference
500
289
 
501
290
  ### Agent
@@ -504,31 +293,10 @@ The SDK handles conversion automatically:
504
293
  class Agent {
505
294
  constructor(config: AgentConfiguration);
506
295
 
507
- currentThreadId: string | undefined; // Current session thread ID
508
- resetSession(): void; // Start new session
509
- workspace: string; // Workspace path
510
- type: 'llm' | 'computer'; // Agent type
511
- }
512
-
513
- interface AgentConfiguration {
514
- name?: string; // Optional, auto-generated if omitted
515
- type: 'llm' | 'computer'; // Agent type
516
-
517
- // Computer agent specific
518
- workspace?: string | Project; // Required for computer agents
519
- // String = local, Project = cloud
520
-
521
- // LLM agent specific
522
- model?: string; // Required for LLM agents
523
-
524
- // Execution settings (computer agents only)
525
- debug?: boolean; // Show detailed logs
526
- timeout?: number; // Execution timeout (default: 600000ms)
527
- skipGitRepoCheck?: boolean; // Allow execution outside git repos (default: true)
528
-
529
- // Shared
530
- instructions?: string; // System prompt
531
- mcpServers?: McpServerConfig[]; // MCP server configurations
296
+ currentThreadId: string | undefined;
297
+ resetSession(): void;
298
+ workspace: string;
299
+ agentType: 'llm' | 'computer';
532
300
  }
533
301
  ```
534
302
 
@@ -552,324 +320,97 @@ function runStreamed(
552
320
  ): AsyncGenerator<Event>;
553
321
  ```
554
322
 
555
- Stream real-time events during agent execution. Returns an async generator that yields:
556
- - `thread.started` - Session initialized
557
- - `turn.started` - Agent begins processing
558
- - `item.started` / `item.completed` - Tool calls and file changes
559
- - `turn.completed` - Processing finished with usage stats
560
- - `turn.failed` - Error occurred
561
-
562
- **Example:**
563
- ```typescript
564
- for await (const event of runStreamed(agent, 'Create app.py')) {
565
- console.log(event.type, event);
566
- }
567
- ```
568
-
569
- ### CloudClient (Coming Soon)
570
-
571
- > **Note**: CloudClient for cloud execution is under development and will be available in an upcoming release.
323
+ ### CloudClient
572
324
 
573
325
  ```typescript
574
326
  class CloudClient {
575
- constructor(config?: {
576
- apiKey?: string; // Required (or env var TESTBASE_API_KEY)
577
- debug?: boolean;
578
- timeout?: number; // default: 600000ms (10 min)
579
- });
327
+ constructor(config?: CloudClientConfig);
580
328
 
581
- // Project management
582
329
  async createProject(config: CreateProjectConfig): Promise<Project>;
583
330
  async listProjects(): Promise<Project[]>;
584
331
  async getProject(id: string): Promise<Project>;
585
332
  async deleteProject(id: string, hard?: boolean): Promise<void>;
586
-
587
- // Infrastructure (future)
588
- async listContainers(): Promise<Container[]>;
589
- async getContainerStats(id: string): Promise<ContainerStats>;
590
333
  }
591
334
  ```
592
335
 
593
- ### Project (Coming Soon)
594
-
595
- > **Note**: Project API for cloud execution is under development and will be available in an upcoming release.
336
+ ### Project
596
337
 
597
338
  ```typescript
598
339
  class Project {
599
- // Properties
600
340
  readonly id: string;
601
341
  readonly name: string;
602
342
  readonly localPath: string | undefined;
603
343
  readonly cloudPath: string;
604
344
 
605
- // Sync operations (when localPath provided)
606
345
  async sync(options?: SyncOptions): Promise<SyncResult>;
607
- async upload(files: string[]): Promise<void>;
608
- async download(files: string[]): Promise<void>;
609
-
610
- // File operations
611
346
  async listFiles(pattern?: string): Promise<ProjectFile[]>;
612
347
  async readFile(path: string): Promise<string>;
613
348
  async writeFile(path: string, content: string): Promise<void>;
614
-
615
- // Management
349
+ async upload(files: string[]): Promise<void>;
350
+ async download(files: string[]): Promise<void>;
616
351
  async delete(hard?: boolean): Promise<void>;
617
352
  async getStats(): Promise<ProjectStats>;
618
353
  async getSyncStats(): Promise<SyncStats>;
619
- async resetSyncState(): Promise<void>;
620
354
 
621
- // Workspace path for agents
622
355
  getWorkspacePath(): string;
623
356
  }
624
-
625
- // Create project via CloudClient
626
- const client = new CloudClient({ apiKey: process.env.TESTBASE_API_KEY });
627
- const project = await client.createProject({
628
- name: 'my-app',
629
- localPath: './src', // Optional - enables sync if provided
630
- description: 'My app', // Optional
631
- metadata: { ... }, // Optional
632
- });
633
-
634
- // Sync options (when localPath provided)
635
- await project.sync({
636
- direction: 'both', // 'up' | 'down' | 'both'
637
- force: false, // Force full sync (skip incremental)
638
- pattern: '*.ts' // Optional glob pattern
639
- });
640
- ```
641
-
642
- ### Runtime Classes (Internal Use)
643
-
644
- > **Note**: Runtime classes are used internally by the SDK. Most users should not need to interact with them directly. Use CloudClient for cloud operations instead.
645
-
646
- These classes are marked `@internal` and are primarily for advanced use cases:
647
-
648
- ```typescript
649
- // For advanced use only - not needed for typical usage
650
- class LocalRuntime implements Runtime { ... }
651
- class CloudRuntime implements Runtime { ... }
652
- ```
653
-
654
- ## Architecture
655
-
656
- ```
657
- computer-agents/
658
- ├── packages/
659
- │ ├── agents-core/ # Core SDK
660
- │ │ ├── src/
661
- │ │ │ ├── agent.ts # Agent class
662
- │ │ │ ├── run.ts # Run loop
663
- │ │ │ ├── runtime/ # Runtime abstraction
664
- │ │ │ │ ├── LocalRuntime.ts
665
- │ │ │ │ ├── CloudRuntime.ts
666
- │ │ │ │ └── gcsWorkspace.ts
667
- │ │ │ ├── codex/ # Codex SDK integration
668
- │ │ │ ├── cloud/ # Cloud API client
669
- │ │ │ └── mcpConfig.ts # Unified MCP types
670
- │ │ └── package.json
671
- │ │
672
- │ ├── agents/ # Main package export
673
- │ ├── agents-openai/ # OpenAI provider
674
- │ └── cloud-infrastructure/ # GCE cloud execution server
675
-
676
- └── examples/testbase/ # Working examples
677
- ```
678
-
679
- ## Best Practices
680
-
681
- ### Choosing Local vs Cloud Execution
682
-
683
- **Use Local Execution (workspace = string) when:**
684
- - Development and rapid iteration
685
- - Working with local files/tools
686
- - No cloud infrastructure needed
687
- - Testing and debugging
688
-
689
- **Use Cloud Execution (workspace = Project) when:** *(Coming Soon)*
690
- - Parallel execution at scale
691
- - Production deployments
692
- - CI/CD pipelines
693
- - Need isolated execution environments
694
- - Experiments requiring multiple concurrent agents
695
-
696
- ### Choosing Workspace Sync Mode
697
-
698
- **With localPath (bidirectional sync):** *(Coming Soon)*
699
- - You need results in your local filesystem
700
- - Continuing work locally after cloud execution
701
- - Interactive development workflows
702
-
703
- **Without localPath (cloud-only):** *(Coming Soon)*
704
- - CI/CD pipelines (no local filesystem)
705
- - Running experiments at scale
706
- - Parallel task execution
707
- - Faster execution (skip sync overhead)
708
-
709
- ### Session Management
710
-
711
- Always use the **same agent instance** for session continuity:
712
-
713
- ```typescript
714
- // ✅ Correct - same agent, continuous session
715
- const agent = new Agent({ type: 'computer', workspace: './project' });
716
- await run(agent, 'Task 1');
717
- await run(agent, 'Task 2'); // Continues session
718
-
719
- // ❌ Wrong - different agents, new sessions
720
- await run(new Agent({ type: 'computer', workspace: './project' }), 'Task 1');
721
- await run(new Agent({ type: 'computer', workspace: './project' }), 'Task 2'); // Different session!
722
357
  ```
723
358
 
724
- ### Parallel Execution
725
-
726
- Use `Promise.all()` for parallel execution:
727
-
728
- ```typescript
729
- const agents = [agent1, agent2, agent3];
730
- const tasks = ['Task 1', 'Task 2', 'Task 3'];
359
+ ## Examples
731
360
 
732
- // ✅ Parallel - all execute simultaneously
733
- const results = await Promise.all(
734
- agents.map((agent, i) => run(agent, tasks[i]))
735
- );
361
+ ```bash
362
+ # Clone the repository
363
+ git clone https://github.com/TestBase-ai/computer-agents.git
364
+ cd computer-agents
365
+ npm install
366
+ npm run build
736
367
 
737
- // Sequential - one at a time
738
- for (let i = 0; i < agents.length; i++) {
739
- await run(agents[i], tasks[i]); // Slower!
740
- }
368
+ # Run examples
369
+ node examples/testbase/hello-world.mjs
370
+ node examples/testbase/multi-agent-workflow.mjs
371
+ node examples/testbase/streaming-progress.cjs
741
372
  ```
742
373
 
- ## Cloud Infrastructure
-
- > **Coming Soon**: Public access to cloud execution infrastructure is under development.
-
- computer-agents includes production-ready cloud infrastructure that will soon be available:
-
- - **GCS Bucket** - Workspace storage (`gs://testbase-workspaces`)
- - **GCE VM** - Codex SDK execution server
- - **Pay-per-token** - Credit-based billing system
- - **API Keys** - Database-backed authentication
- - **Budget Protection** - Daily/monthly spending limits
- - **Project Management** - Incremental sync with SHA-256 hashing
-
- The infrastructure is fully built and tested. We're finalizing API access for public use. Stay tuned for updates!
-
- For now, `LocalRuntime` provides full computer-use agent capabilities for local development.
-
- ## Documentation
-
- - **[Examples](https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase)** - Comprehensive working examples
- - **[Cloud Infrastructure](./packages/cloud-infrastructure/README.md)** - Deployment and configuration
- - **[Architecture](../docs/ARCHITECTURE.md)** - System design and internals
+ [View all examples →](https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase)
  
  ## Troubleshooting
  
  ### "OPENAI_API_KEY not set"
+
  ```bash
  export OPENAI_API_KEY=sk-...
  ```
  
- ### "TESTBASE_API_KEY required" *(When using CloudClient)*
+ ### "TESTBASE_API_KEY required"
+
  ```bash
  export TESTBASE_API_KEY=your-key
  # Or provide in constructor:
  new CloudClient({ apiKey: 'your-key' })
  ```
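The troubleshooting entry above offers two ways to supply the key: the `TESTBASE_API_KEY` environment variable or the `apiKey` constructor option. A minimal sketch of how such a lookup might be written follows. `resolveApiKey` and `Env` are hypothetical names, and the precedence shown (explicit option first, then environment) is an assumption, not confirmed behaviour of `CloudClient`.

```typescript
// Hypothetical helper; not part of computer-agents.
type Env = Record<string, string | undefined>;

// Assumed precedence: an explicit option wins over the environment variable.
function resolveApiKey(options: { apiKey?: string }, env: Env): string {
  const key = options.apiKey ?? env['TESTBASE_API_KEY'];
  if (!key) {
    // Mirrors the error text quoted in the troubleshooting heading.
    throw new Error('TESTBASE_API_KEY required');
  }
  return key;
}
```

In Node.js a caller would typically pass `process.env` as the second argument. Taking the environment as a parameter keeps the helper easy to test.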
 
- ### Session continuity not working
- Ensure you're using the **same agent instance** across runs:
- ```typescript
- const agent = new Agent({ type: 'computer', workspace: './project' });
- await run(agent, 'Task 1');
- await run(agent, 'Task 2'); // Same instance = session continues
- ```
-
  ### "Computer agents require a workspace"
+
  Computer agents need a workspace parameter:
+
  ```typescript
- // Correct
- new Agent({ type: 'computer', workspace: './my-project' })
+ // Correct
+ new Agent({ agentType: 'computer', workspace: './my-project' })
  
- // Missing workspace
- new Agent({ type: 'computer' }) // Error!
+ // Missing workspace - Error
+ new Agent({ agentType: 'computer' })
  ```
  
- ## What's New
-
- ### v0.6.0 - Major UX Simplification
-
- **Breaking Changes:**
- - Runtime objects removed from public API - no more `LocalRuntime` or `CloudRuntime` in user code
- - Agent configuration simplified: `agentType` → `type`, workspace accepts `string | Project`
- - CloudClient introduced as single entry point for cloud operations
- - Execution settings moved from Runtime to Agent level
+ ### Session continuity not working
  
- **Benefits:**
- - 40-50% less code for typical use cases
- - Reduced core concepts from 5 to 2 (Agent + CloudClient)
- - More intuitive API with better TypeScript inference
- - Automatic runtime selection based on workspace type
+ Use the same agent instance across runs:
  
- **Migration:**
  ```typescript
- // Before (v0.5.x)
- const agent = new Agent({
- agentType: 'computer',
- runtime: new LocalRuntime({ debug: true }),
- workspace: './project'
- });
-
- // After (v0.6.0)
- const agent = new Agent({
- type: 'computer',
- workspace: './project',
- debug: true
- });
+ const agent = new Agent({ agentType: 'computer', workspace: './project' });
+ await run(agent, 'Task 1');
+ await run(agent, 'Task 2'); // Same instance = session continues
  ```
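One way to see why the same instance matters is to picture the session identifier as state stored on the agent object. The sketch below is a toy model under that assumption. `SketchAgent` and `sketchRun` are hypothetical names, and the real `Agent` and `run()` internals may work differently.

```typescript
// Toy model only; assumes the session (thread) id lives on the agent instance.
let nextThreadNumber = 1;

class SketchAgent {
  workspace: string;
  threadId: string | undefined; // set on the first run, reused afterwards

  constructor(workspace: string) {
    this.workspace = workspace;
  }
}

// Creates a thread id on first use and reuses it on later runs.
function sketchRun(agent: SketchAgent, task: string): string {
  if (agent.threadId === undefined) {
    agent.threadId = `thread-${nextThreadNumber++}`;
  }
  return `${agent.threadId}: ${task}`;
}
```

In this model, two runs on one `SketchAgent` share a thread id, while a freshly constructed agent starts a new one even when it points at the same workspace.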
 
- See [MIGRATION_v0.5_to_v0.6.md](https://github.com/TestBase-ai/computer-agents/blob/main/MIGRATION_v0.5_to_v0.6.md) for complete migration guide.
-
- ### v0.5.0
- - **Project Management System**: Organize and sync workspaces efficiently
- - Incremental sync with SHA-256 hashing - 10x faster than full sync
- - Track sync state automatically in `.testbase/sync-state.json`
- - Native Web API FormData for reliable file uploads
- - Seamless agent integration with project workspaces
-
- ### v0.4.9
- - **Streaming Progress**: New `runStreamed()` function for real-time visibility
- - Stream events: thread.started, turn.started, item.completed, turn.completed
- - API consistency - mirrors `run()` signature for easy adoption
- - Perfect for progress bars, real-time logging, and better UX
-
- ### v0.4.6
- - **Cloud-Only Mode**: `skipWorkspaceSync` option for CloudRuntime
- - Perfect for CI/CD and parallel experiments
- - Faster cloud execution (no sync overhead)
-
- ### v0.4.5
- - Fixed maxBuffer overflow for large workspace syncs
- - Improved GCS operation stability
-
- ### v0.4.0
- - Initial public release
- - Parallel computer-use agent orchestration
- - Unified local/cloud runtime abstraction
- - Session continuity
-
- ## Differences from OpenAI Agents SDK
-
- computer-agents extends OpenAI's Agents SDK with:
-
- 1. **Computer-use agent type** - Direct Codex SDK integration
- 2. **Simplified API** - No runtime objects needed, workspace type determines execution
- 3. **CloudClient** - Unified interface for cloud operations and project management
- 4. **Parallel orchestration** - Native support for concurrent agents
- 5. **Session continuity** - Automatic thread management
- 6. **Cloud infrastructure** - Production-ready execution platform (coming soon for public access)
- 7. **Unified MCP config** - Single configuration for all agent types
-
  ## License
  
  MIT
@@ -877,17 +418,10 @@ MIT
  ## Links
  
  - **GitHub**: [https://github.com/TestBase-ai/computer-agents](https://github.com/TestBase-ai/computer-agents)
- - **Examples**: [https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase](https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase)
  - **npm**: [https://www.npmjs.com/package/computer-agents](https://www.npmjs.com/package/computer-agents)
- - **Website**: [https://testbase.ai/computer-agents](https://testbase.ai/computer-agents)
+ - **Documentation**: [https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase](https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase)
  
  ## Support
  
  - **Issues**: [GitHub Issues](https://github.com/TestBase-ai/computer-agents/issues)
  - **Website**: [testbase.ai](https://testbase.ai)
-
- ---
-
- **Built with ❤️ by [TestBase](https://testbase.ai)**
-
- *Based on [OpenAI Agents SDK](https://github.com/openai/openai-agents-sdk) • Powered by [Codex SDK](https://github.com/anthropics/claude-code) • Cloud infrastructure on GCP*
package/dist/metadata.js CHANGED
@@ -4,9 +4,9 @@ Object.defineProperty(exports, "__esModule", { value: true });
  exports.METADATA = void 0;
  exports.METADATA = {
  "name": "computer-agents",
- "version": "0.6.2",
+ "version": "0.6.4",
  "versions": {
- "computer-agents": "0.6.2"
+ "computer-agents": "0.6.4"
  }
  };
  exports.default = exports.METADATA;
package/package.json CHANGED
@@ -2,7 +2,7 @@
  "name": "computer-agents",
  "repository": "https://github.com/TestBase-ai/computer-agents",
  "homepage": "https://testbase.ai/computer-agents",
- "version": "0.6.2",
+ "version": "0.6.4",
  "description": "Build computer-use agents that write code, run tests, and deploy apps. Seamless local and cloud execution with automatic session continuity.",
  "author": "Testbase",
  "main": "dist/index.js",
@@ -20,7 +20,7 @@
  "build-check": "tsc --noEmit -p ./tsconfig.test.json"
  },
  "dependencies": {
- "computer-agents-core": "0.6.0",
+ "computer-agents-core": "0.6.2",
  "computer-agents-openai": "0.6.1"
  },
  "keywords": [