@lov3kaizen/agentsea-surf 0.5.1

package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2024 lovekaizen
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,224 @@
+ # @lov3kaizen/agentsea-surf
+
+ Surf - Computer-use agent for AgentSea. Control desktop environments through screen capture, mouse, and keyboard actions using Claude's vision capabilities.
+
+ ## Features
+
+ - **8 Computer-Use Tools**: screenshot, click, type, scroll, drag, key press, cursor move, wait
+ - **Multiple Backends**: Native (macOS, Linux, Windows), Puppeteer browser, Docker container
+ - **Claude Vision Integration**: Automatic screen analysis and action determination
+ - **NestJS Integration**: Full REST API and WebSocket support
+ - **Security Sandboxing**: Rate limiting, command blocking, domain/path restrictions
+
+ ## Installation
+
+ ```bash
+ npm install @lov3kaizen/agentsea-surf
+ # or
+ pnpm add @lov3kaizen/agentsea-surf
+ ```
+
+ ### Optional Dependencies
+
+ ```bash
+ # For browser automation
+ npm install puppeteer
+
+ # For image processing
+ npm install sharp
+ ```
+
+ ## Quick Start
+
+ ### Basic Usage
+
+ ```typescript
+ import { SurfAgent, createNativeBackend } from '@lov3kaizen/agentsea-surf';
+
+ async function main() {
+   // Create a native backend for your platform
+   const backend = createNativeBackend();
+   await backend.connect();
+
+   // Create the agent
+   const agent = new SurfAgent('session-1', backend, {
+     maxSteps: 20,
+     vision: {
+       model: 'claude-sonnet-4-20250514',
+       maxTokens: 4096,
+       includeScreenshotInResponse: true,
+     },
+   });
+
+   // Execute a task
+   const result = await agent.execute('Open Chrome and navigate to google.com');
+
+   console.log('Result:', result.response);
+   console.log('Steps taken:', result.state.actionHistory.length);
+
+   await backend.disconnect();
+ }
+
+ main().catch(console.error);
+ ```
+
+ ### With Streaming
+
+ ```typescript
+ const agent = new SurfAgent('session-1', backend, config);
+
+ for await (const event of agent.executeStream('Search for weather')) {
+   switch (event.type) {
+     case 'screenshot':
+       console.log('Screenshot taken');
+       break;
+     case 'action':
+       console.log(`Executing: ${event.action.description}`);
+       break;
+     case 'complete':
+       console.log('Task completed:', event.response);
+       break;
+   }
+ }
+ ```
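The streaming loop can be exercised without a desktop backend by feeding it a stand-in stream. The following is a minimal sketch, not the package's implementation: the `SurfEvent` union is an assumption inferred only from the fields the README snippet reads (`type`, `action.description`, `response`), and `fakeStream` and `summarize` are hypothetical helpers introduced for illustration.

```typescript
// Assumed event shape, inferred from the fields used in the README loop.
// The real package may define more event types or additional fields.
type SurfEvent =
  | { type: 'screenshot' }
  | { type: 'action'; action: { description: string } }
  | { type: 'complete'; response: string };

// Hypothetical stand-in for agent.executeStream(): yields a fixed sequence.
async function* fakeStream(): AsyncGenerator<SurfEvent> {
  yield { type: 'screenshot' };
  yield { type: 'action', action: { description: 'click search box' } };
  yield { type: 'complete', response: 'done' };
}

// Collects one log line per event, mirroring the switch in the README.
async function summarize(stream: AsyncIterable<SurfEvent>): Promise<string[]> {
  const lines: string[] = [];
  for await (const event of stream) {
    switch (event.type) {
      case 'screenshot':
        lines.push('Screenshot taken');
        break;
      case 'action':
        lines.push(`Executing: ${event.action.description}`);
        break;
      case 'complete':
        lines.push(`Task completed: ${event.response}`);
        break;
    }
  }
  return lines;
}

summarize(fakeStream()).then((lines) => console.log(lines.join('\n')));
```

Modelling the events as a discriminated union lets the compiler narrow `event` inside each `case`, so `event.action` is only reachable where the event type is `'action'`.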
+
+ ### NestJS Integration
+
+ ```typescript
+ import { Module } from '@nestjs/common';
+ import { SurfModule } from '@lov3kaizen/agentsea-surf/nestjs';
+
+ @Module({
+   imports: [
+     SurfModule.forRoot({
+       backend: { type: 'native' },
+       config: {
+         maxSteps: 50,
+         sandbox: { enabled: true },
+       },
+       enableRestApi: true,
+       enableWebSocket: true,
+     }),
+   ],
+ })
+ export class AppModule {}
+ ```
+
+ ## Backends
+
+ ### Native Backend
+
+ Automatically selects the appropriate backend for your platform:
+
+ ```typescript
+ import { createNativeBackend } from '@lov3kaizen/agentsea-surf';
+
+ const backend = createNativeBackend({ displayIndex: 0 });
+ ```
+
+ ### Browser Backend (Puppeteer)
+
+ ```typescript
+ import { PuppeteerBackend } from '@lov3kaizen/agentsea-surf';
+
+ const backend = new PuppeteerBackend({
+   headless: false,
+   viewport: { width: 1920, height: 1080 },
+   initialUrl: 'https://example.com',
+ });
+ ```
+
+ ### Docker Backend
+
+ ```typescript
+ import { DockerBackend } from '@lov3kaizen/agentsea-surf';
+
+ const backend = new DockerBackend({
+   image: 'agentsea/desktop:ubuntu-22.04',
+   resolution: { width: 1920, height: 1080, scaleFactor: 1 },
+   removeOnDisconnect: true,
+ });
+ ```
+
+ ## Tools
+
+ All tools can be used independently:
+
+ ```typescript
+ import {
+   createSurfTools,
+   createNativeBackend,
+ } from '@lov3kaizen/agentsea-surf';
+
+ const backend = createNativeBackend();
+ await backend.connect();
+
+ const tools = createSurfTools(backend);
+
+ // Use individual tools
+ await tools.screenshot.execute({});
+ await tools.click.execute({ x: 100, y: 200 });
+ await tools.typeText.execute({ text: 'Hello World' });
+ ```
+
+ ## Security
+
+ The sandbox configuration allows you to restrict agent capabilities:
+
+ ```typescript
+ const agent = new SurfAgent('session', backend, {
+   sandbox: {
+     enabled: true,
+     maxActionsPerMinute: 60,
+     blockedDomains: ['malicious-site.com'],
+     blockedCommands: ['rm -rf', 'sudo'],
+     blockedPaths: ['/etc', '/root'],
+   },
+ });
+ ```
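The README does not say how these restrictions are enforced. As a rough illustration of what such checks could look like, here is a self-contained sketch. The option names come from the snippet above, but `createSandboxGuard`, the `Verdict` type, and every matching rule (substring match for commands, prefix match for paths, subdomain match for domains, a 60-second sliding window for the rate limit) are assumptions, not the package's actual behavior.

```typescript
// Field names follow the README's sandbox options; everything else is hypothetical.
interface SandboxPolicy {
  maxActionsPerMinute: number;
  blockedDomains: string[];
  blockedCommands: string[];
  blockedPaths: string[];
}

type Verdict = { allowed: true } | { allowed: false; reason: string };

function createSandboxGuard(policy: SandboxPolicy, now: () => number = Date.now) {
  let recent: number[] = []; // timestamps (ms) of actions accepted in the last minute

  return {
    // Sliding-window rate limit: count actions accepted in the previous 60 seconds.
    checkRate(): Verdict {
      const t = now();
      recent = recent.filter((stamp) => t - stamp < 60000);
      if (recent.length >= policy.maxActionsPerMinute) {
        return { allowed: false, reason: 'rate limit exceeded' };
      }
      recent.push(t);
      return { allowed: true };
    },
    // Assumed rule: a command is blocked if it contains any blocked fragment.
    checkCommand(command: string): Verdict {
      const hit = policy.blockedCommands.find((c) => command.includes(c));
      return hit ? { allowed: false, reason: `blocked command: ${hit}` } : { allowed: true };
    },
    // Assumed rule: a path is blocked if it equals or lies under a blocked path.
    checkPath(path: string): Verdict {
      const hit = policy.blockedPaths.find((p) => path === p || path.startsWith(p + '/'));
      return hit ? { allowed: false, reason: `blocked path: ${hit}` } : { allowed: true };
    },
    // Assumed rule: a host is blocked if it equals or is a subdomain of a blocked domain.
    checkDomain(host: string): Verdict {
      const h = host.toLowerCase();
      const hit = policy.blockedDomains.find((d) => h === d || h.endsWith('.' + d));
      return hit ? { allowed: false, reason: `blocked domain: ${hit}` } : { allowed: true };
    },
  };
}

const guard = createSandboxGuard({
  maxActionsPerMinute: 60,
  blockedDomains: ['malicious-site.com'],
  blockedCommands: ['rm -rf', 'sudo'],
  blockedPaths: ['/etc', '/root'],
});
console.log(guard.checkCommand('sudo reboot'));
console.log(guard.checkPath('/home/user/notes.txt'));
```

Substring matching on commands is easy to bypass (for example through shell aliases or extra whitespace), so a blocklist like this is best treated as one layer alongside an isolated backend such as the Docker one.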
+
+ ## API Reference
+
+ ### SurfAgent
+
+ Main agent class for executing computer automation tasks.
+
+ **Constructor:**
+
+ ```typescript
+ new SurfAgent(
+   sessionId: string,
+   backend: DesktopBackend,
+   config?: Partial<SurfConfig>
+ )
+ ```
+
+ **Methods:**
+
+ - `execute(task: string, context?: AgentContext)` - Execute a task
+ - `executeStream(task: string, context?: AgentContext)` - Execute with streaming
+ - `stop()` - Stop the current execution
+ - `getState()` - Get current agent state
+
+ ### REST API Endpoints
+
+ When using NestJS integration:
+
+ - `POST /surf/execute` - Execute a task
+ - `POST /surf/action` - Execute single action
+ - `POST /surf/screenshot` - Take a screenshot
+ - `GET /surf/screen` - Get screen state
+ - `GET /surf/sessions` - List active sessions
+ - `GET /surf/status` - Get backend status
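The README lists the routes but not their request or response schemas. The sketch below encodes only what is listed (method and path) and leaves the payload to the caller. `SURF_ROUTES`, `buildSurfRequest`, the `http://localhost:3000` host, and the `task` field in the example payload are all assumptions made for illustration.

```typescript
// Routes exactly as listed in the README; payload schemas are not documented there.
const SURF_ROUTES = {
  execute: { method: 'POST', path: '/surf/execute' },
  action: { method: 'POST', path: '/surf/action' },
  screenshot: { method: 'POST', path: '/surf/screenshot' },
  screen: { method: 'GET', path: '/surf/screen' },
  sessions: { method: 'GET', path: '/surf/sessions' },
  status: { method: 'GET', path: '/surf/status' },
} as const;

type RouteName = keyof typeof SURF_ROUTES;

interface RequestSpec {
  url: string;
  method: 'GET' | 'POST';
  headers: Record<string, string>;
  body?: string;
}

// Hypothetical helper: builds a plain request description for an HTTP client.
function buildSurfRequest(baseUrl: string, route: RouteName, payload?: unknown): RequestSpec {
  const { method, path } = SURF_ROUTES[route];
  const spec: RequestSpec = {
    url: baseUrl.replace(/\/+$/, '') + path, // strip trailing slashes to avoid "//surf"
    method,
    headers: {},
  };
  // Only POST routes carry a JSON body in this sketch.
  if (method === 'POST' && payload !== undefined) {
    spec.headers['Content-Type'] = 'application/json';
    spec.body = JSON.stringify(payload);
  }
  return spec;
}

// Placeholder host; the `task` field name is a guess, not a documented schema.
const example = buildSurfRequest('http://localhost:3000/', 'execute', { task: 'Search for weather' });
console.log(example.method, example.url);
```

In Node 18 or later, the result can be passed to the built-in client as `fetch(example.url, { method: example.method, headers: example.headers, body: example.body })`.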
+
+ ### WebSocket Events
+
+ - `execute` - Start task execution (emits `stream`, `complete`, `error`)
+ - `action` - Execute single action (emits `actionResult`)
+ - `screenshot` - Take screenshot (emits `screenshotResult`)
+ - `stop` - Stop current execution
+ - `status` - Get backend status
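The request and response pairing in this list can be written down as a small lookup table, which is handy when wiring client-side listeners. This is only a sketch: the event names are copied from the list, `SURF_WS_EVENTS` and `requestsEmitting` are hypothetical names, and since the README names no response events for `stop` or `status`, none are assumed here.

```typescript
// Request events and the events the README says they emit in response.
// `stop` and `status` have no documented response events, so they map to [].
const SURF_WS_EVENTS = {
  execute: ['stream', 'complete', 'error'],
  action: ['actionResult'],
  screenshot: ['screenshotResult'],
  stop: [],
  status: [],
} as const;

type SurfWsRequest = keyof typeof SURF_WS_EVENTS;

// Hypothetical helper: which request events can produce a given incoming event?
function requestsEmitting(eventName: string): SurfWsRequest[] {
  return (Object.keys(SURF_WS_EVENTS) as SurfWsRequest[]).filter(
    (req) => (SURF_WS_EVENTS[req] as readonly string[]).indexOf(eventName) !== -1,
  );
}

console.log(requestsEmitting('actionResult'));
```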
+
+ ## License
+
+ MIT