computer-agents 0.6.3 → 0.6.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/README.md +165 -645
  2. package/dist/metadata.js +2 -2
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -3,42 +3,31 @@
  [![npm version](https://img.shields.io/npm/v/computer-agents.svg)](https://www.npmjs.com/package/computer-agents)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 
- **The first orchestration framework for parallel computer-use agents.**
+ Build agents that write code, run tests, and modify files. Orchestrate unlimited agents in parallel with seamless local and cloud execution.
 
- Scale from 1 to 100+ agents. Run experiments in parallel. Test multiple approaches simultaneously. computer-agents enables agent workflows that were previously impossible.
+ ## Why Computer Agents?
 
- ## What Makes This Different
+ Traditional agent frameworks limit you to single-agent workflows or rigid orchestration patterns. Computer Agents is designed for programmatic multi-agent orchestration at scale.
 
- Traditional agent frameworks focus on chat-based LLM agents. computer-agents is built for **computer-use agents** that write code, run tests, and modify files—with native support for **parallel execution at scale**.
+ **Unlimited Parallel Orchestration**
+ Compose and run unlimited agents concurrently. Build custom multi-agent workflows programmatically—you control execution flow and agent communication. No framework constraints, just code.
 
- ### Before computer-agents
+ **Seamless Local ↔ Cloud Execution**
+ Develop locally, scale to cloud by changing workspace configuration. Runtime abstraction handles the complexity while your code remains identical. Switch execution environments without rewriting workflows.
 
- - No parallel orchestration for computer-use agents
- - Single agent, single workspace, sequential execution
- - Hours to run experiments sequentially
- - ❌ Limited to local machine resources
+ **Two Powerful Agent Types**
+ - **LLM agents** (OpenAI API) for planning, reasoning, and code review
+ - **Computer agents** (Codex SDK) for code generation and file operations
 
- ### With computer-agents
+ Mix agent types to build sophisticated workflows. Computer agents bypass LLM for tool selection, providing faster execution and lower costs.
 
- - **Parallel Orchestration** - Run 10, 50, 100+ agents simultaneously
- - **Unified Interface** - Seamless local ↔ cloud execution with one config change
- - **Workspace Collaboration** - Multiple agents working on the same codebase
- - **Cloud Scalability** - Effortless scaling beyond local machine limits
- - **Session Continuity** - Automatic multi-turn conversations
+ **Production-Ready Infrastructure**
+ - Automatic session continuity across runs
+ - Efficient workspace synchronization
+ - Built on OpenAI Codex SDK
+ - Type-safe TypeScript with comprehensive documentation
 
- ## Revolutionary Use Cases
-
- **🔬 Scientific Experiments**
- Run 20 experimental variations in parallel instead of sequentially. What took hours now takes minutes.
-
- **🧪 ML/AI Development**
- Test dozens of hyperparameter configurations simultaneously. Systematic exploration of model architectures at scale.
-
- **⚡️ Multi-Approach Problem Solving**
- Try 5 different implementation approaches in parallel. Let the agents find the best solution.
-
- **🚀 A/B Testing at Scale**
- Test multiple implementations, frameworks, or approaches concurrently. Data-driven decision making.
+ **Use cases:** Parallel test generation, distributed code review, large-scale refactoring, automated debugging workflows, multi-repository updates.
 
  ## Installation
 
@@ -54,400 +43,216 @@ npm install computer-agents
  import { Agent, run } from 'computer-agents';
 
  const agent = new Agent({
-   type: "computer",
-   workspace: "./my-project", // String path = automatic local execution
-   instructions: "You are an expert developer.",
-   debug: true // Optional: show detailed logs
+   agentType: "computer",
+   workspace: "./my-project",
+   instructions: "You are an expert developer."
  });
 
  const result = await run(agent, "Create a Python script that calculates fibonacci numbers");
  console.log(result.finalOutput);
  ```
 
- ### Cloud Computer Agent (Coming Soon)
-
- > **Note**: Cloud execution for remote execution is under development and will be available in an upcoming release. The infrastructure is production-ready, and we're finalizing API access for public use.
-
- When cloud execution becomes available, you'll use Projects for cloud workspaces:
+ ### LLM Agent
 
  ```typescript
- import { Agent, run, CloudClient } from 'computer-agents';
-
- // Cloud execution will be available soon
- const client = new CloudClient({ apiKey: process.env.TESTBASE_API_KEY });
- const project = await client.createProject({
-   name: 'my-project',
-   localPath: './my-project' // Enable local sync
- });
+ import { Agent, run } from 'computer-agents';
 
  const agent = new Agent({
-   type: "computer",
-   workspace: project, // Project = automatic cloud execution
-   instructions: "You are an expert developer."
+   agentType: "llm",
+   model: "gpt-4o",
+   instructions: "You create detailed implementation plans."
  });
 
- const result = await run(agent, "Add unit tests to the fibonacci module");
+ const result = await run(agent, "Plan how to add user authentication");
  console.log(result.finalOutput);
- // Files automatically synced from cloud to local workspace
  ```
 
- For now, use local execution (workspace as string) for all computer agent tasks.
+ ## Agent Types
 
- ### Streaming Progress (Real-Time Visibility)
+ The SDK supports two agent types:
 
- Track agent execution in real-time with `runStreamed()`:
+ | Type | Execution | Use Cases |
+ |------|-----------|-----------|
+ | `computer` | Codex SDK | Code generation, file operations, terminal commands |
+ | `llm` | OpenAI API | Planning, reasoning, text generation |
 
- ```typescript
- import { Agent, runStreamed } from 'computer-agents';
+ ### Computer Agents
+
+ Computer agents execute code changes using the Codex SDK. They can create files, run commands, and modify codebases.
 
+ ```typescript
  const agent = new Agent({
-   type: "computer",
+   agentType: "computer",
    workspace: "./my-project",
+   instructions: "You are a Python developer."
  });
 
- // Stream events in real-time
- for await (const event of runStreamed(agent, 'Create a Python web scraper')) {
-   switch (event.type) {
-     case 'thread.started':
-       console.log(`🔗 Thread: ${event.thread_id}`);
-       break;
-     case 'turn.started':
-       console.log('🎬 Turn started');
-       break;
-     case 'item.completed':
-       if (event.item.type === 'file_change') {
-         const files = event.item.changes.map(c => `${c.kind} ${c.path}`).join(', ');
-         console.log(`✅ Files: ${files}`);
-       }
-       break;
-     case 'turn.completed':
-       console.log(`🎉 Completed (${event.usage.input_tokens + event.usage.output_tokens} tokens)`);
-       break;
-   }
- }
+ await run(agent, "Add unit tests for the fibonacci module");
  ```
 
- **Event Types:**
- - `thread.started` - Session initialized with thread ID
- - `turn.started` - Agent begins processing
- - `item.started` - Tool call or action begins
- - `item.completed` - Tool call or action completes (includes file changes)
- - `turn.completed` - Processing finished (includes token usage)
- - `turn.failed` - Error occurred
+ ### LLM Agents
 
- **Use Cases:**
- - Progress bars for long-running tasks
- - Real-time logging and debugging
- - Live UI updates in applications
- - Better UX for multi-step operations
-
- **API Consistency:** `runStreamed()` mirrors `run()` - same signature, just with streaming!
+ LLM agents use the OpenAI API for text generation and reasoning tasks.
 
  ```typescript
- // Standard execution
- const result = await run(agent, task);
-
- // Streaming execution
- for await (const event of runStreamed(agent, task)) {
-   // Real-time progress
- }
- ```
-
- ### Project Management (Efficient Workspace Sync) - Coming Soon
-
- > **Note**: Project Management for cloud execution is under development and will be available in an upcoming release.
-
- When available, manage workspaces with the Project API - perfect for organizing code and syncing with cloud storage:
-
- ```typescript
- import { CloudClient, Agent, run } from 'computer-agents';
-
- const client = new CloudClient({ apiKey: process.env.TESTBASE_API_KEY });
-
- // Create a synced project (local ↔ cloud)
- const project = await client.createProject({
-   name: 'my-app',
-   localPath: './src' // Enables bidirectional sync
- });
-
- // Incremental sync - only uploads changed files (10x faster!)
- await project.sync({ direction: 'up' }); // Upload changes
- await project.sync({ direction: 'down' }); // Download changes
- await project.sync({ direction: 'both' }); // Bi-directional sync
-
- // Agents automatically use project workspaces
  const agent = new Agent({
-   type: 'computer',
-   workspace: project // Project = cloud execution
+   agentType: "llm",
+   model: "gpt-4o",
+   instructions: "You review code for quality and security."
  });
 
- await run(agent, 'Add user authentication');
- // Changes are tracked, next sync will be incremental!
+ await run(agent, "Review the authentication implementation");
  ```
 
- **Key Benefits:**
- - **10x faster sync** - Only transfers changed files (SHA-256 hashing)
- - **Organized workspaces** - Manage multiple projects easily
- - **Automatic tracking** - Sync state persisted in `.testbase/sync-state.json`
- - **Flexible sync** - Choose `up`, `down`, or `both` directions
-
- **Example: Incremental Sync Performance**
- - Full workspace (500MB): ~35 seconds
- - Incremental (5MB changes): ~3 seconds
-
- ```typescript
- // List all projects
- const projects = await client.listProjects();
-
- // Get existing project
- const project = await client.getProject('project-id');
-
- // Get sync statistics
- const stats = await project.getSyncStats();
- console.log(stats); // { lastSyncAt, fileCount, version }
-
- // Manual file operations
- await project.upload(['file1.txt', 'file2.txt']);
- await project.download(['file1.txt']);
- await project.readFile('config.json');
- await project.writeFile('config.json', '{ "new": "data" }');
-
- // Delete project
- await client.deleteProject(project.id);
- ```
-
- ### Parallel Execution
-
- **Local Parallel Execution (Available Now):**
+ ## Multi-Agent Workflows
 
- You can run multiple agents in parallel for local development:
+ Compose multiple agents for complex tasks:
 
  ```typescript
  import { Agent, run } from 'computer-agents';
 
- // Create 5 agents to test different approaches
- const frameworks = ['Express', 'Fastify', 'Koa', 'Hapi', 'Restify'];
- const agents = frameworks.map(framework => new Agent({
-   name: `${framework} Agent`,
-   type: 'computer',
-   workspace: `./test-${framework.toLowerCase()}`,
-   instructions: `You are an expert in ${framework}.`
- }));
-
- // Run all 5 in parallel!
- const results = await Promise.all(
-   agents.map((agent, i) => run(agent, `Create a REST API with ${frameworks[i]}`))
- );
-
- // All 5 implementations complete in the time it takes to run 1
- console.log('All 5 frameworks tested in parallel!');
- ```
-
- **Cloud Parallel Execution (Coming Soon):**
-
- > **Note**: Large-scale parallel execution with cloud infrastructure is coming soon. When available, you'll be able to scale to 100+ concurrent agents using CloudClient and Projects.
-
- ### LLM Agent (for planning and reasoning)
-
- ```typescript
+ // LLM creates plan
  const planner = new Agent({
-   type: "llm",
-   model: "gpt-4o",
-   instructions: "You create detailed implementation plans."
+   agentType: 'llm',
+   model: 'gpt-4o',
+   instructions: 'Create implementation plans.'
  });
 
- const plan = await run(planner, "Plan how to add user authentication");
- console.log(plan.finalOutput);
- ```
-
- ## Core Concepts
-
- computer-agents has just **2 core concepts**:
-
- 1. **Agent** - Single unified interface for both LLM and computer-use agents
- 2. **CloudClient** - Manage cloud projects and infrastructure (coming soon)
-
- ### 1. Agent - Unified Interface
-
- ```typescript
- type AgentType = 'llm' | 'computer';
- ```
-
- | Type | Execution | Use Cases |
- |------|-----------|-----------|
- | `'llm'` | OpenAI API | Planning, reasoning, reviewing |
- | `'computer'` | Codex SDK | Code, tests, file operations, terminal commands |
-
- **Key insight:** Workspace type determines execution mode automatically:
-
- ```typescript
- // Local execution (workspace = string path)
- const localAgent = new Agent({
-   type: 'computer',
-   workspace: './my-project' // String = local execution
+ // Computer agent executes
+ const executor = new Agent({
+   agentType: 'computer',
+   workspace: './my-project',
+   instructions: 'Execute implementation plans.'
  });
 
- // Cloud execution (workspace = Project) - Coming Soon
- const cloudAgent = new Agent({
-   type: 'computer',
-   workspace: project // Project = cloud execution
+ // LLM reviews result
+ const reviewer = new Agent({
+   agentType: 'llm',
+   model: 'gpt-4o',
+   instructions: 'Review code quality.'
  });
+
+ const task = "Add user authentication";
+ const plan = await run(planner, `Plan: ${task}`);
+ const code = await run(executor, plan.finalOutput);
+ const review = await run(reviewer, `Review: ${code.finalOutput}`);
  ```
 
- ### 2. CloudClient - Infrastructure Management (Coming Soon)
+ ## Streaming Events
 
- Single entry point for cloud operations:
+ Monitor agent execution in real-time:
 
  ```typescript
- const client = new CloudClient({ apiKey: process.env.TESTBASE_API_KEY });
+ import { Agent, runStreamed } from 'computer-agents';
 
- // Project management
- const projects = await client.listProjects();
- const project = await client.createProject({ name: 'my-app' });
- await client.deleteProject(project.id);
+ const agent = new Agent({
+   agentType: "computer",
+   workspace: "./my-project"
+ });
 
- // Infrastructure (future)
- const containers = await client.listContainers();
- const stats = await client.getContainerStats(containerId);
+ for await (const event of runStreamed(agent, 'Create a web scraper')) {
+   switch (event.type) {
+     case 'thread.started':
+       console.log(`Thread: ${event.thread_id}`);
+       break;
+     case 'item.completed':
+       if (event.item.type === 'file_change') {
+         const files = event.item.changes.map(c => `${c.kind} ${c.path}`).join(', ');
+         console.log(`Files: ${files}`);
+       }
+       break;
+     case 'turn.completed':
+       console.log(`Completed (${event.usage.input_tokens + event.usage.output_tokens} tokens)`);
+       break;
+   }
+ }
  ```
 
- ### Session Continuity
+ ## Session Continuity
 
  Agents automatically maintain context across multiple runs:
 
  ```typescript
  const agent = new Agent({
-   type: 'computer',
+   agentType: 'computer',
    workspace: './my-project'
  });
 
- await run(agent, 'Create app.py'); // New session
- await run(agent, 'Add error handling'); // Continues same session!
- await run(agent, 'Add tests'); // Still same session!
+ await run(agent, 'Create app.py');
+ await run(agent, 'Add error handling'); // Continues same session
+ await run(agent, 'Add tests'); // Still same session
 
- console.log(agent.currentThreadId); // Thread ID maintained
+ console.log(agent.currentThreadId); // Thread ID maintained
 
- agent.resetSession(); // Start fresh when needed
- await run(agent, 'New project'); // New session
+ agent.resetSession(); // Start fresh
+ await run(agent, 'New project'); // New session
  ```
 
- ## Examples
-
- Comprehensive examples demonstrating the power of computer-agents:
-
- ```bash
- # Clone the repository
- git clone https://github.com/TestBase-ai/computer-agents.git
- cd computer-agents
- npm install
- npm run build
-
- # Streaming progress (real-time event visibility)
- node examples/testbase/streaming-progress.cjs
+ ## Cloud Execution (Coming Soon)
 
- # Workspace sync modes (default vs cloud-only)
- node examples/testbase/workspace-sync-modes.mjs
+ > **Note**: Cloud execution with `CloudClient` and `Project` management is currently in private beta. Public access coming soon.
+ >
+ > Interested in early access? [Join the waitlist →](https://testbase.ai)
 
- # Parallel execution (the game changer!)
- node examples/testbase/parallel-execution.mjs
+ Cloud execution will enable:
+ - Run agents in isolated cloud environments
+ - Project-based workspace management with bidirectional sync
+ - Seamless transition from local to cloud with workspace configuration
+ - Parallel execution at scale without local resource constraints
 
- # Scale experiments (ML hyperparameter tuning, algorithm comparison)
- node examples/testbase/scale-experiments.mjs
-
- # Multi-agent workflows (planner → executor → reviewer)
- node examples/testbase/multi-agent-workflow.mjs
-
- # Session continuity demonstration
- node examples/testbase/hello-world.mjs
- ```
-
- **[📂 View all examples →](https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase)**
-
- ## Multi-Agent Workflows
-
- Build custom workflows by composing agents:
+ Example (available in private beta):
 
  ```typescript
- import { Agent, run } from 'computer-agents';
+ import { Agent, run, CloudClient } from 'computer-agents';
 
- // LLM creates plan
- const planner = new Agent({
-   type: 'llm',
-   model: 'gpt-4o',
-   instructions: 'Create detailed implementation plans.'
+ const client = new CloudClient({
+   apiKey: process.env.TESTBASE_API_KEY
  });
 
- // Computer agent executes plan
- const executor = new Agent({
-   type: 'computer',
-   workspace: './my-project',
-   instructions: 'Execute implementation plans.'
- });
+ // Create a cloud project
+ const project = await client.createProject({ name: 'my-app' });
 
- // LLM reviews result
- const reviewer = new Agent({
-   type: 'llm',
-   model: 'gpt-4o',
-   instructions: 'Review implementations for quality.'
+ // Cloud agent
+ const agent = new Agent({
+   agentType: "computer",
+   workspace: project, // Use cloud project as workspace
+   instructions: "You are an expert developer."
  });
 
- // Manual workflow composition - you control the flow
- const task = "Add user authentication";
- const plan = await run(planner, `Plan: ${task}`);
- const code = await run(executor, plan.finalOutput);
- const review = await run(reviewer, `Review: ${code.finalOutput}`);
+ const result = await run(agent, "Add error handling to the API");
+ console.log(result.finalOutput);
  ```
 
  ## Configuration
 
- ### Environment Variables
-
- ```bash
- # Required for LLM agents and computer agents (Codex SDK uses OpenAI)
- OPENAI_API_KEY=your-openai-key
-
- # Optional for CloudClient (when cloud execution becomes available)
- TESTBASE_API_KEY=your-testbase-key # Get from testbase.ai
- ```
-
  ### Agent Configuration
 
  ```typescript
  const agent = new Agent({
    name: "My Agent", // Optional, auto-generated if omitted
-   type: 'computer', // 'llm' | 'computer'
-
-   // Computer agent specific
+   agentType: 'computer', // 'llm' | 'computer'
    workspace: './my-project', // Required for computer agents
-   // String = local, Project = cloud
-
-   // LLM agent specific
    model: 'gpt-4o', // Required for LLM agents
-
-   // Execution settings (computer agents only)
-   debug: true, // Show detailed logs
-   timeout: 600000, // 10 minutes (default)
-   skipGitRepoCheck: true, // Allow execution outside git repos (default: true)
-
-   // Shared
    instructions: "You are helpful.", // System prompt
-   mcpServers: [...], // MCP server configurations (optional)
+   debug: false, // Show detailed logs
+   timeout: 600000, // 10 minutes (default)
+   mcpServers: [], // MCP server configurations
  });
  ```
 
- ### CloudClient Configuration (Coming Soon)
+ ### Environment Variables
 
- ```typescript
- const client = new CloudClient({
-   apiKey: process.env.TESTBASE_API_KEY, // Required (or use env var)
-   debug: true, // Show detailed logs
-   timeout: 600000, // 10 minutes (default)
- });
+ ```bash
+ # Required for all agents
+ OPENAI_API_KEY=your-openai-key
+
+ # Required for CloudClient (private beta)
+ TESTBASE_API_KEY=your-testbase-key
  ```
 
  ## MCP Server Integration
 
- Unified MCP configuration works for both agent types:
+ Use Model Context Protocol servers with your agents:
 
  ```typescript
  import type { McpServerConfig } from 'computer-agents';
@@ -467,35 +272,13 @@ const mcpServers: McpServerConfig[] = [
    }
  ];
 
- // Works for both LLM and computer agents!
  const agent = new Agent({
-   type: 'computer',
+   agentType: 'computer',
    workspace: './my-project',
-   mcpServers // Automatically converted to appropriate format
+   mcpServers
  });
  ```
 
- The SDK handles conversion automatically:
- - **LLM agents**: MCP servers → function tools
- - **Computer agents**: MCP servers → Codex SDK config
-
- ## Performance
-
- ### Local Execution (workspace = string path)
- - **Cold start**: <1 second
- - **Warm execution**: <100ms overhead
- - **Parallelization**: Limited by local CPU/memory
-
- ### Cloud Execution (workspace = Project) - Coming Soon
- - **First execution**: 30-45 seconds (includes workspace sync)
- - **Subsequent runs**: ~5-10 seconds
- - **Parallelization**: Scale to 100+ agents
-
- ### Cloud-Only Mode (no localPath in Project) - Coming Soon
- - **Execution**: Faster (no sync overhead)
- - **Parallelization**: Scale to 100+ agents
- - **Perfect for**: CI/CD, experiments, parallel tasks
-
  ## API Reference
 
  ### Agent
@@ -504,31 +287,10 @@ The SDK handles conversion automatically:
  class Agent {
    constructor(config: AgentConfiguration);
 
-   currentThreadId: string | undefined; // Current session thread ID
-   resetSession(): void; // Start new session
-   workspace: string; // Workspace path
-   type: 'llm' | 'computer'; // Agent type
- }
-
- interface AgentConfiguration {
-   name?: string; // Optional, auto-generated if omitted
-   type: 'llm' | 'computer'; // Agent type
-
-   // Computer agent specific
-   workspace?: string | Project; // Required for computer agents
-   // String = local, Project = cloud
-
-   // LLM agent specific
-   model?: string; // Required for LLM agents
-
-   // Execution settings (computer agents only)
-   debug?: boolean; // Show detailed logs
-   timeout?: number; // Execution timeout (default: 600000ms)
-   skipGitRepoCheck?: boolean; // Allow execution outside git repos (default: true)
-
-   // Shared
-   instructions?: string; // System prompt
-   mcpServers?: McpServerConfig[]; // MCP server configurations
+   currentThreadId: string | undefined;
+   resetSession(): void;
+   workspace: string;
+   agentType: 'llm' | 'computer';
  }
  ```
 
@@ -552,324 +314,89 @@ function runStreamed(
  ): AsyncGenerator<Event>;
  ```
 
- Stream real-time events during agent execution. Returns an async generator that yields:
- - `thread.started` - Session initialized
- - `turn.started` - Agent begins processing
- - `item.started` / `item.completed` - Tool calls and file changes
- - `turn.completed` - Processing finished with usage stats
- - `turn.failed` - Error occurred
-
- **Example:**
- ```typescript
- for await (const event of runStreamed(agent, 'Create app.py')) {
-   console.log(event.type, event);
- }
- ```
-
  ### CloudClient (Coming Soon)
 
- > **Note**: CloudClient for cloud execution is under development and will be available in an upcoming release.
-
  ```typescript
  class CloudClient {
-   constructor(config?: {
-     apiKey?: string; // Required (or env var TESTBASE_API_KEY)
-     debug?: boolean;
-     timeout?: number; // default: 600000ms (10 min)
-   });
+   constructor(config?: CloudClientConfig);
 
-   // Project management
    async createProject(config: CreateProjectConfig): Promise<Project>;
    async listProjects(): Promise<Project[]>;
    async getProject(id: string): Promise<Project>;
    async deleteProject(id: string, hard?: boolean): Promise<void>;
-
-   // Infrastructure (future)
-   async listContainers(): Promise<Container[]>;
-   async getContainerStats(id: string): Promise<ContainerStats>;
  }
  ```
 
  ### Project (Coming Soon)
 
- > **Note**: Project API for cloud execution is under development and will be available in an upcoming release.
-
  ```typescript
  class Project {
-   // Properties
    readonly id: string;
    readonly name: string;
    readonly localPath: string | undefined;
    readonly cloudPath: string;
 
-   // Sync operations (when localPath provided)
    async sync(options?: SyncOptions): Promise<SyncResult>;
-   async upload(files: string[]): Promise<void>;
-   async download(files: string[]): Promise<void>;
-
-   // File operations
    async listFiles(pattern?: string): Promise<ProjectFile[]>;
    async readFile(path: string): Promise<string>;
    async writeFile(path: string, content: string): Promise<void>;
-
-   // Management
+   async upload(files: string[]): Promise<void>;
+   async download(files: string[]): Promise<void>;
    async delete(hard?: boolean): Promise<void>;
    async getStats(): Promise<ProjectStats>;
    async getSyncStats(): Promise<SyncStats>;
-   async resetSyncState(): Promise<void>;
 
-   // Workspace path for agents
    getWorkspacePath(): string;
  }
-
- // Create project via CloudClient
- const client = new CloudClient({ apiKey: process.env.TESTBASE_API_KEY });
- const project = await client.createProject({
-   name: 'my-app',
-   localPath: './src', // Optional - enables sync if provided
-   description: 'My app', // Optional
-   metadata: { ... }, // Optional
- });
-
- // Sync options (when localPath provided)
- await project.sync({
-   direction: 'both', // 'up' | 'down' | 'both'
-   force: false, // Force full sync (skip incremental)
-   pattern: '*.ts' // Optional glob pattern
- });
- ```
-
- ### Runtime Classes (Internal Use)
-
- > **Note**: Runtime classes are used internally by the SDK. Most users should not need to interact with them directly. Use CloudClient for cloud operations instead.
-
- These classes are marked `@internal` and are primarily for advanced use cases:
-
- ```typescript
- // For advanced use only - not needed for typical usage
- class LocalRuntime implements Runtime { ... }
- class CloudRuntime implements Runtime { ... }
- ```
-
- ## Architecture
-
656
351
  ```
657
- computer-agents/
658
- ├── packages/
659
- │ ├── agents-core/ # Core SDK
660
- │ │ ├── src/
661
- │ │ │ ├── agent.ts # Agent class
662
- │ │ │ ├── run.ts # Run loop
663
- │ │ │ ├── runtime/ # Runtime abstraction
664
- │ │ │ │ ├── LocalRuntime.ts
665
- │ │ │ │ ├── CloudRuntime.ts
666
- │ │ │ │ └── gcsWorkspace.ts
667
- │ │ │ ├── codex/ # Codex SDK integration
668
- │ │ │ ├── cloud/ # Cloud API client
669
- │ │ │ └── mcpConfig.ts # Unified MCP types
670
- │ │ └── package.json
671
- │ │
672
- │ ├── agents/ # Main package export
673
- │ ├── agents-openai/ # OpenAI provider
674
- │ └── cloud-infrastructure/ # GCE cloud execution server
675
-
676
- └── examples/testbase/ # Working examples
677
- ```
678
-
679
- ## Best Practices
680
-
681
- ### Choosing Local vs Cloud Execution
682
-
683
- **Use Local Execution (workspace = string) when:**
684
- - Development and rapid iteration
685
- - Working with local files/tools
686
- - No cloud infrastructure needed
687
- - Testing and debugging
688
-
689
- **Use Cloud Execution (workspace = Project) when:** *(Coming Soon)*
690
- - Parallel execution at scale
691
- - Production deployments
692
- - CI/CD pipelines
693
- - Need isolated execution environments
694
- - Experiments requiring multiple concurrent agents
695
-
696
- ### Choosing Workspace Sync Mode
697
-
698
- **With localPath (bidirectional sync):** *(Coming Soon)*
699
- - You need results in your local filesystem
700
- - Continuing work locally after cloud execution
701
- - Interactive development workflows
702
-
703
- **Without localPath (cloud-only):** *(Coming Soon)*
704
- - CI/CD pipelines (no local filesystem)
705
- - Running experiments at scale
706
- - Parallel task execution
707
- - Faster execution (skip sync overhead)
708
-
709
- ### Session Management
710
352
 
711
- Always use the **same agent instance** for session continuity:
712
-
713
- ```typescript
714
- // ✅ Correct - same agent, continuous session
715
- const agent = new Agent({ type: 'computer', workspace: './project' });
716
- await run(agent, 'Task 1');
717
- await run(agent, 'Task 2'); // Continues session
718
-
719
- // ❌ Wrong - different agents, new sessions
720
- await run(new Agent({ type: 'computer', workspace: './project' }), 'Task 1');
721
- await run(new Agent({ type: 'computer', workspace: './project' }), 'Task 2'); // Different session!
722
- ```
723
-
724
- ### Parallel Execution
725
-
726
- Use `Promise.all()` for parallel execution:
727
-
728
- ```typescript
729
- const agents = [agent1, agent2, agent3];
730
- const tasks = ['Task 1', 'Task 2', 'Task 3'];
353
+ ## Examples
731
354
 
732
- // ✅ Parallel - all execute simultaneously
733
- const results = await Promise.all(
734
- agents.map((agent, i) => run(agent, tasks[i]))
735
- );
355
+ ```bash
356
+ # Clone the repository
357
+ git clone https://github.com/TestBase-ai/computer-agents.git
358
+ cd computer-agents
359
+ npm install
360
+ npm run build
736
361
 
737
- // Sequential - one at a time
738
- for (let i = 0; i < agents.length; i++) {
739
- await run(agents[i], tasks[i]); // Slower!
740
- }
362
+ # Run examples
363
+ node examples/testbase/hello-world.mjs
364
+ node examples/testbase/multi-agent-workflow.mjs
365
+ node examples/testbase/streaming-progress.cjs
741
366
  ```
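The Parallel Execution guidance removed in this hunk contrasts `Promise.all()` with a sequential loop. A minimal, self-contained sketch of that contrast follows. It is not the package's code: `stubRun` and `StubAgent` are hypothetical stand-ins that only simulate latency, so the timing difference is visible without an API key.

```typescript
// Hypothetical stand-ins: StubAgent and stubRun are NOT part of computer-agents.
type StubAgent = { name: string };
type Job = { agent: StubAgent; task: string };

// Simulates one agent run by resolving after a fixed delay.
function stubRun(agent: StubAgent, task: string, delayMs = 50): Promise<string> {
  return new Promise((resolve) =>
    setTimeout(() => resolve(`${agent.name} finished: ${task}`), delayMs),
  );
}

// Parallel: every run is started before any of them is awaited.
function runParallel(jobs: Job[]): Promise<string[]> {
  return Promise.all(jobs.map((job) => stubRun(job.agent, job.task)));
}

// Sequential: each run starts only after the previous one has resolved.
async function runSequential(jobs: Job[]): Promise<string[]> {
  const results: string[] = [];
  for (const job of jobs) {
    results.push(await stubRun(job.agent, job.task));
  }
  return results;
}

async function main(): Promise<void> {
  const jobs: Job[] = [
    { agent: { name: 'agent1' }, task: 'Task 1' },
    { agent: { name: 'agent2' }, task: 'Task 2' },
    { agent: { name: 'agent3' }, task: 'Task 3' },
  ];

  let start = Date.now();
  await runParallel(jobs);
  console.log(`parallel:   ${Date.now() - start} ms`);

  start = Date.now();
  await runSequential(jobs);
  console.log(`sequential: ${Date.now() - start} ms`);
}

main();
```

Both helpers return results in job order. With the stub, the parallel variant takes roughly one delay and the sequential variant roughly one delay per job. Note that `Promise.all()` rejects as soon as any run rejects; `Promise.allSettled()` is the standard alternative when partial results should be kept.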
742
367
 
743
- ## Cloud Infrastructure
744
-
745
- > **Coming Soon**: Public access to cloud execution infrastructure is under development.
746
-
747
- computer-agents includes production-ready cloud infrastructure that will soon be available:
748
-
749
- - **GCS Bucket** - Workspace storage (`gs://testbase-workspaces`)
750
- - **GCE VM** - Codex SDK execution server
751
- - **Pay-per-token** - Credit-based billing system
752
- - **API Keys** - Database-backed authentication
753
- - **Budget Protection** - Daily/monthly spending limits
754
- - **Project Management** - Incremental sync with SHA-256 hashing
755
-
756
- The infrastructure is fully built and tested. We're finalizing API access for public use. Stay tuned for updates!
757
-
758
- For now, `LocalRuntime` provides full computer-use agent capabilities for local development.
759
-
760
- ## Documentation
761
-
762
- - **[Examples](https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase)** - Comprehensive working examples
763
- - **[Cloud Infrastructure](./packages/cloud-infrastructure/README.md)** - Deployment and configuration
764
- - **[Architecture](../docs/ARCHITECTURE.md)** - System design and internals
368
+ [View all examples →](https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase)
765
369
 
766
370
  ## Troubleshooting
767
371
 
768
372
  ### "OPENAI_API_KEY not set"
769
- ```bash
770
- export OPENAI_API_KEY=sk-...
771
- ```
772
373
 
773
- ### "TESTBASE_API_KEY required" *(When using CloudClient)*
774
374
  ```bash
775
- export TESTBASE_API_KEY=your-key
776
- # Or provide in constructor:
777
- new CloudClient({ apiKey: 'your-key' })
778
- ```
779
-
780
- ### Session continuity not working
781
- Ensure you're using the **same agent instance** across runs:
782
- ```typescript
783
- const agent = new Agent({ type: 'computer', workspace: './project' });
784
- await run(agent, 'Task 1');
785
- await run(agent, 'Task 2'); // Same instance = session continues
375
+ export OPENAI_API_KEY=sk-...
786
376
  ```
787
377
 
788
378
  ### "Computer agents require a workspace"
379
+
789
380
  Computer agents need a workspace parameter:
381
+
790
382
  ```typescript
791
- // Correct
792
- new Agent({ type: 'computer', workspace: './my-project' })
383
+ // Correct
384
+ new Agent({ agentType: 'computer', workspace: './my-project' })
793
385
 
794
- // Missing workspace
795
- new Agent({ type: 'computer' }) // Error!
386
+ // Missing workspace - Error
387
+ new Agent({ agentType: 'computer' })
796
388
  ```
797
389
 
798
- ## What's New
799
-
800
- ### v0.6.0 - Major UX Simplification
801
-
802
- **Breaking Changes:**
803
- - Runtime objects removed from public API - no more `LocalRuntime` or `CloudRuntime` in user code
804
- - Agent configuration simplified: `agentType` → `type`, workspace accepts `string | Project`
805
- - CloudClient introduced as single entry point for cloud operations
806
- - Execution settings moved from Runtime to Agent level
390
+ ### Session continuity not working
807
391
 
808
- **Benefits:**
809
- - 40-50% less code for typical use cases
810
- - Reduced core concepts from 5 to 2 (Agent + CloudClient)
811
- - More intuitive API with better TypeScript inference
812
- - Automatic runtime selection based on workspace type
392
+ Use the same agent instance across runs:
813
393
 
814
- **Migration:**
815
394
  ```typescript
816
- // Before (v0.5.x)
817
- const agent = new Agent({
818
- agentType: 'computer',
819
- runtime: new LocalRuntime({ debug: true }),
820
- workspace: './project'
821
- });
822
-
823
- // After (v0.6.0)
824
- const agent = new Agent({
825
- type: 'computer',
826
- workspace: './project',
827
- debug: true
828
- });
395
+ const agent = new Agent({ agentType: 'computer', workspace: './project' });
396
+ await run(agent, 'Task 1');
397
+ await run(agent, 'Task 2'); // Same instance = session continues
829
398
  ```
830
399
 
831
- See [MIGRATION_v0.5_to_v0.6.md](https://github.com/TestBase-ai/computer-agents/blob/main/MIGRATION_v0.5_to_v0.6.md) for complete migration guide.
832
-
833
- ### v0.5.0
834
- - **Project Management System**: Organize and sync workspaces efficiently
835
- - Incremental sync with SHA-256 hashing - 10x faster than full sync
836
- - Track sync state automatically in `.testbase/sync-state.json`
837
- - Native Web API FormData for reliable file uploads
838
- - Seamless agent integration with project workspaces
839
-
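The removed v0.5.0 note mentions incremental sync based on SHA-256 hashing. The general idea can be sketched as below, using Node's built-in `crypto` module. The README does not show the package's actual sync logic or the layout of `.testbase/sync-state.json`, so the `SyncState` shape and the helper names here are assumptions made for illustration only.

```typescript
import { createHash } from 'node:crypto';

// Assumed shape: relative path -> SHA-256 hex digest of the file content.
type SyncState = Record<string, string>;

// Hash a file's content; a changed digest means the file must be re-uploaded.
function sha256Hex(content: string): string {
  return createHash('sha256').update(content, 'utf8').digest('hex');
}

// Build a state snapshot from in-memory files (path -> content).
function buildState(files: Record<string, string>): SyncState {
  const state: SyncState = {};
  for (const [path, content] of Object.entries(files)) {
    state[path] = sha256Hex(content);
  }
  return state;
}

// Compare snapshots: new or modified paths are "changed", vanished paths are "removed".
function diffStates(
  previous: SyncState,
  current: SyncState,
): { changed: string[]; removed: string[] } {
  const changed = Object.keys(current)
    .filter((path) => previous[path] !== current[path])
    .sort();
  const removed = Object.keys(previous)
    .filter((path) => !(path in current))
    .sort();
  return { changed, removed };
}

const before = buildState({ 'a.ts': 'one', 'b.ts': 'two' });
const after = buildState({ 'a.ts': 'one', 'b.ts': 'TWO', 'c.ts': 'three' });

// Only b.ts (modified) and c.ts (new) would need to be transferred.
console.log(diffStates(before, after));
```

Unchanged files keep the same digest and are skipped, which is where an incremental sync saves time over a full upload.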
840
- ### v0.4.9
841
- - **Streaming Progress**: New `runStreamed()` function for real-time visibility
842
- - Stream events: thread.started, turn.started, item.completed, turn.completed
843
- - API consistency - mirrors `run()` signature for easy adoption
844
- - Perfect for progress bars, real-time logging, and better UX
845
-
846
- ### v0.4.6
847
- - **Cloud-Only Mode**: `skipWorkspaceSync` option for CloudRuntime
848
- - Perfect for CI/CD and parallel experiments
849
- - Faster cloud execution (no sync overhead)
850
-
851
- ### v0.4.5
852
- - Fixed maxBuffer overflow for large workspace syncs
853
- - Improved GCS operation stability
854
-
855
- ### v0.4.0
856
- - Initial public release
857
- - Parallel computer-use agent orchestration
858
- - Unified local/cloud runtime abstraction
859
- - Session continuity
860
-
861
- ## Differences from OpenAI Agents SDK
862
-
863
- computer-agents extends OpenAI's Agents SDK with:
864
-
865
- 1. **Computer-use agent type** - Direct Codex SDK integration
866
- 2. **Simplified API** - No runtime objects needed, workspace type determines execution
867
- 3. **CloudClient** - Unified interface for cloud operations and project management
868
- 4. **Parallel orchestration** - Native support for concurrent agents
869
- 5. **Session continuity** - Automatic thread management
870
- 6. **Cloud infrastructure** - Production-ready execution platform (coming soon for public access)
871
- 7. **Unified MCP config** - Single configuration for all agent types
872
-
873
400
  ## License
874
401
 
875
402
  MIT
@@ -877,17 +404,10 @@ MIT
877
404
  ## Links
878
405
 
879
406
  - **GitHub**: [https://github.com/TestBase-ai/computer-agents](https://github.com/TestBase-ai/computer-agents)
880
- - **Examples**: [https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase](https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase)
881
407
  - **npm**: [https://www.npmjs.com/package/computer-agents](https://www.npmjs.com/package/computer-agents)
882
- - **Website**: [https://testbase.ai/computer-agents](https://testbase.ai/computer-agents)
408
+ - **Documentation**: [https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase](https://github.com/TestBase-ai/computer-agents/tree/main/examples/testbase)
883
409
 
884
410
  ## Support
885
411
 
886
412
  - **Issues**: [GitHub Issues](https://github.com/TestBase-ai/computer-agents/issues)
887
413
  - **Website**: [testbase.ai](https://testbase.ai)
888
-
889
- ---
890
-
891
- **Built with ❤️ by [TestBase](https://testbase.ai)**
892
-
893
- *Based on [OpenAI Agents SDK](https://github.com/openai/openai-agents-sdk) • Powered by [Codex SDK](https://github.com/anthropics/claude-code) • Cloud infrastructure on GCP*
package/dist/metadata.js CHANGED
@@ -4,9 +4,9 @@ Object.defineProperty(exports, "__esModule", { value: true });
4
4
  exports.METADATA = void 0;
5
5
  exports.METADATA = {
6
6
  "name": "computer-agents",
7
- "version": "0.6.3",
7
+ "version": "0.6.5",
8
8
  "versions": {
9
- "computer-agents": "0.6.3"
9
+ "computer-agents": "0.6.5"
10
10
  }
11
11
  };
12
12
  exports.default = exports.METADATA;
package/package.json CHANGED
@@ -2,7 +2,7 @@
2
2
  "name": "computer-agents",
3
3
  "repository": "https://github.com/TestBase-ai/computer-agents",
4
4
  "homepage": "https://testbase.ai/computer-agents",
5
- "version": "0.6.3",
5
+ "version": "0.6.5",
6
6
  "description": "Build computer-use agents that write code, run tests, and deploy apps. Seamless local and cloud execution with automatic session continuity.",
7
7
  "author": "Testbase",
8
8
  "main": "dist/index.js",