@browserbasehq/convex-stagehand 0.0.1

Files changed (55)
  1. package/LICENSE +21 -0
  2. package/README.md +477 -0
  3. package/dist/client/index.d.ts +258 -0
  4. package/dist/client/index.d.ts.map +1 -0
  5. package/dist/client/index.js +216 -0
  6. package/dist/client/index.js.map +1 -0
  7. package/dist/component/_generated/api.d.ts +36 -0
  8. package/dist/component/_generated/api.d.ts.map +1 -0
  9. package/dist/component/_generated/api.js +31 -0
  10. package/dist/component/_generated/api.js.map +1 -0
  11. package/dist/component/_generated/component.d.ts +24 -0
  12. package/dist/component/_generated/component.d.ts.map +1 -0
  13. package/dist/component/_generated/component.js +11 -0
  14. package/dist/component/_generated/component.js.map +1 -0
  15. package/dist/component/_generated/dataModel.d.ts +15 -0
  16. package/dist/component/_generated/dataModel.d.ts.map +1 -0
  17. package/dist/component/_generated/dataModel.js +11 -0
  18. package/dist/component/_generated/dataModel.js.map +1 -0
  19. package/dist/component/_generated/server.d.ts +121 -0
  20. package/dist/component/_generated/server.d.ts.map +1 -0
  21. package/dist/component/_generated/server.js +78 -0
  22. package/dist/component/_generated/server.js.map +1 -0
  23. package/dist/component/api.d.ts +90 -0
  24. package/dist/component/api.d.ts.map +1 -0
  25. package/dist/component/api.js +112 -0
  26. package/dist/component/api.js.map +1 -0
  27. package/dist/component/convex.config.d.ts +3 -0
  28. package/dist/component/convex.config.d.ts.map +1 -0
  29. package/dist/component/convex.config.js +3 -0
  30. package/dist/component/convex.config.js.map +1 -0
  31. package/dist/component/lib.d.ts +62 -0
  32. package/dist/component/lib.d.ts.map +1 -0
  33. package/dist/component/lib.js +356 -0
  34. package/dist/component/lib.js.map +1 -0
  35. package/dist/component/schema.d.ts +23 -0
  36. package/dist/component/schema.d.ts.map +1 -0
  37. package/dist/component/schema.js +14 -0
  38. package/dist/component/schema.js.map +1 -0
  39. package/dist/test.d.ts +48 -0
  40. package/dist/test.d.ts.map +1 -0
  41. package/dist/test.js +25 -0
  42. package/dist/test.js.map +1 -0
  43. package/package.json +87 -0
  44. package/src/client/index.ts +348 -0
  45. package/src/component/README.md +90 -0
  46. package/src/component/_generated/api.ts +52 -0
  47. package/src/component/_generated/component.ts +26 -0
  48. package/src/component/_generated/dataModel.ts +17 -0
  49. package/src/component/_generated/server.ts +156 -0
  50. package/src/component/api.ts +243 -0
  51. package/src/component/convex.config.ts +3 -0
  52. package/src/component/lib.ts +416 -0
  53. package/src/component/schema.ts +23 -0
  54. package/src/component/tsconfig.json +25 -0
  55. package/src/test.ts +33 -0
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Browserbase Inc.
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,477 @@
1
+ # convex-stagehand
2
+
3
+ AI-powered browser automation for Convex applications. Extract data, perform actions, and automate workflows using natural language - no Playwright knowledge required.
4
+
5
+ ## Features
6
+
7
+ - **Simple API** - Describe what you want in plain English
8
+ - **Type-safe** - Full TypeScript support with Zod schemas
9
+ - **Session management** - Reuse browser sessions across multiple operations
10
+ - **Agent mode** - Autonomous multi-step task execution
11
+ - **Powered by Stagehand** - Uses the [Stagehand](https://github.com/browserbase/stagehand) REST API
12
+
13
+ ## Quick Start
14
+
15
+ ### 1. Install the Component
16
+
17
+ ```bash
18
+ npm install github:browserbase/convex-stagehand zod
19
+ ```
20
+
21
+ ### 2. Configure Convex
22
+
23
+ Add the component to your `convex/convex.config.ts`:
24
+
25
+ ```typescript
26
+ import { defineApp } from "convex/server";
27
+ import stagehand from "convex-stagehand/convex.config";
28
+
29
+ const app = defineApp();
30
+ app.use(stagehand, { name: "stagehand" });
31
+
32
+ export default app;
33
+ ```
34
+
35
+ ### 3. Set Up Environment Variables
36
+
37
+ Add these to your [Convex Dashboard](https://dashboard.convex.dev) → Settings → Environment Variables:
38
+
39
+ | Variable | Description |
40
+ |----------|-------------|
41
+ | `BROWSERBASE_API_KEY` | Your Browserbase API key |
42
+ | `BROWSERBASE_PROJECT_ID` | Your Browserbase project ID |
43
+ | `MODEL_API_KEY` | Your LLM provider API key (OpenAI, Anthropic, etc.) |
44
+
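A minimal sketch of failing fast when one of the three variables in the table above is unset. `readStagehandEnv` is a hypothetical helper and not part of this package; it takes the environment as a plain object so it can be exercised without a Convex deployment.

```typescript
// Hypothetical helper (not part of convex-stagehand): validates the three
// variables from the table above and reports every missing one at once.
type Env = Record<string, string | undefined>;

interface StagehandEnv {
  browserbaseApiKey: string;
  browserbaseProjectId: string;
  modelApiKey: string;
}

function readStagehandEnv(env: Env): StagehandEnv {
  const required = [
    "BROWSERBASE_API_KEY",
    "BROWSERBASE_PROJECT_ID",
    "MODEL_API_KEY",
  ] as const;

  // Treat both undefined and empty strings as missing.
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  return {
    browserbaseApiKey: env.BROWSERBASE_API_KEY as string,
    browserbaseProjectId: env.BROWSERBASE_PROJECT_ID as string,
    modelApiKey: env.MODEL_API_KEY as string,
  };
}
```

Inside a Convex function, the returned object could be passed as the second argument to `new Stagehand(components.stagehand, ...)` by calling `readStagehandEnv(process.env)`.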
45
+ ### 4. Use the Component
46
+
47
+ ```typescript
48
+ import { action } from "./_generated/server";
49
+ import { Stagehand } from "convex-stagehand";
50
+ import { components } from "./_generated/api";
51
+ import { z } from "zod";
52
+
53
+ const stagehand = new Stagehand(components.stagehand, {
54
+ browserbaseApiKey: process.env.BROWSERBASE_API_KEY!,
55
+ browserbaseProjectId: process.env.BROWSERBASE_PROJECT_ID!,
56
+ modelApiKey: process.env.MODEL_API_KEY!,
57
+ });
58
+
59
+ export const scrapeHackerNews = action({
60
+ handler: async (ctx) => {
61
+ return await stagehand.extract(ctx, {
62
+ url: "https://news.ycombinator.com",
63
+ instruction: "Extract the top 5 stories with title, score, and link",
64
+ schema: z.object({
65
+ stories: z.array(z.object({
66
+ title: z.string(),
67
+ score: z.string(),
68
+ link: z.string(),
69
+ }))
70
+ })
71
+ });
72
+ }
73
+ });
74
+ ```
75
+
76
+ ## API Reference
77
+
78
+ ### `startSession(ctx, args)`
79
+
80
+ Start a new browser session. Returns session info for use with other operations.
81
+
82
+ ```typescript
83
+ const session = await stagehand.startSession(ctx, {
84
+ url: "https://example.com",
85
+ browserbaseSessionId: "optional-existing-session-id",
86
+ options: {
87
+ timeout: 30000,
88
+ waitUntil: "networkidle",
89
+ domSettleTimeoutMs: 2000,
90
+ selfHeal: true,
91
+ systemPrompt: "Custom system prompt for the session",
92
+ }
93
+ });
94
+ // { sessionId: "...", browserbaseSessionId: "...", cdpUrl: "wss://..." }
95
+ ```
96
+
97
+ **Parameters:**
98
+ - `url` - The URL to navigate to
99
+ - `browserbaseSessionId` - Optional: Resume an existing Browserbase session
100
+ - `options.timeout` - Navigation timeout in milliseconds
101
+ - `options.waitUntil` - When to consider navigation complete: `"load"`, `"domcontentloaded"`, or `"networkidle"`
102
+ - `options.domSettleTimeoutMs` - Time in milliseconds to wait for the DOM to settle before the page is considered loaded
103
+ - `options.selfHeal` - Enable self-healing capabilities for more robust automation
104
+ - `options.systemPrompt` - Custom system prompt to guide the AI's behavior during the session
105
+
106
+ **Returns:**
107
+ ```typescript
108
+ {
109
+ sessionId: string; // Use with other operations
110
+ browserbaseSessionId?: string; // Store to resume later
111
+ cdpUrl?: string; // For advanced Playwright/Puppeteer usage
112
+ }
113
+ ```
114
+
115
+ ---
116
+
117
+ ### `endSession(ctx, args)`
118
+
119
+ End a browser session.
120
+
121
+ ```typescript
122
+ await stagehand.endSession(ctx, { sessionId: session.sessionId });
123
+ ```
124
+
125
+ **Parameters:**
126
+ - `sessionId` - The session to end
127
+
128
+ **Returns:** `{ success: boolean }`
129
+
130
+ ---
131
+
132
+ ### `extract(ctx, args)`
133
+
134
+ Extract structured data from a web page using AI.
135
+
136
+ ```typescript
137
+ // Without session (creates and destroys its own)
138
+ const data = await stagehand.extract(ctx, {
139
+ url: "https://example.com",
140
+ instruction: "Extract all product names and prices",
141
+ schema: z.object({
142
+ products: z.array(z.object({
143
+ name: z.string(),
144
+ price: z.string(),
145
+ }))
146
+ }),
147
+ });
148
+
149
+ // With existing session (reuses session, doesn't end it)
150
+ const data = await stagehand.extract(ctx, {
151
+ sessionId: session.sessionId,
152
+ instruction: "Extract all product names and prices",
153
+ schema: z.object({ ... }),
154
+ });
155
+ ```
156
+
157
+ **Parameters:**
158
+ - `sessionId` - Optional: Use an existing session
159
+ - `url` - The URL to navigate to (required if no sessionId)
160
+ - `instruction` - Natural language description of what to extract
161
+ - `schema` - Zod schema defining the expected output structure
162
+ - `options.timeout` - Navigation timeout in milliseconds
163
+ - `options.waitUntil` - When to consider navigation complete: `"load"`, `"domcontentloaded"`, or `"networkidle"`
164
+
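The rule that `url` is required when no `sessionId` is given can be checked before an operation is called. The sketch below is illustrative only: `resolveTarget` is a hypothetical helper, and preferring `sessionId` when both fields are present is an assumption, since the README does not say how that case is handled.

```typescript
// Hypothetical pre-flight check (not part of convex-stagehand).
interface TargetArgs {
  sessionId?: string;
  url?: string;
}

type Target =
  | { kind: "session"; sessionId: string }
  | { kind: "url"; url: string };

function resolveTarget(args: TargetArgs): Target {
  // Assumption: an existing session wins when both fields are supplied.
  if (args.sessionId) {
    return { kind: "session", sessionId: args.sessionId };
  }
  if (args.url) {
    return { kind: "url", url: args.url };
  }
  throw new Error("Provide either sessionId or url");
}
```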
165
+ **Returns:** Data matching your Zod schema
166
+
167
+ ---
168
+
169
+ ### `act(ctx, args)`
170
+
171
+ Execute browser actions using natural language.
172
+
173
+ ```typescript
174
+ // Without session
175
+ const result = await stagehand.act(ctx, {
176
+ url: "https://example.com/login",
177
+ action: "Click the login button and wait for the page to load",
178
+ });
179
+
180
+ // With existing session
181
+ const result = await stagehand.act(ctx, {
182
+ sessionId: session.sessionId,
183
+ action: "Fill in the email field with 'user@example.com'",
184
+ });
185
+ ```
186
+
187
+ **Parameters:**
188
+ - `sessionId` - Optional: Use an existing session
189
+ - `url` - The URL to navigate to (required if no sessionId)
190
+ - `action` - Natural language description of the action to perform
191
+ - `options.timeout` - Navigation timeout in milliseconds
192
+ - `options.waitUntil` - When to consider navigation complete: `"load"`, `"domcontentloaded"`, or `"networkidle"`
193
+
194
+ **Returns:**
195
+ ```typescript
196
+ {
197
+ success: boolean;
198
+ message: string;
199
+ actionDescription: string;
200
+ }
201
+ ```
202
+
203
+ ---
204
+
205
+ ### `observe(ctx, args)`
206
+
207
+ Find available actions on a web page.
208
+
209
+ ```typescript
210
+ const actions = await stagehand.observe(ctx, {
211
+ url: "https://example.com",
212
+ instruction: "Find all clickable navigation links",
213
+ });
214
+ // [{ description: "Home link", selector: "a.nav-home", method: "click" }, ...]
215
+ ```
216
+
217
+ **Parameters:**
218
+ - `sessionId` - Optional: Use an existing session
219
+ - `url` - The URL to navigate to (required if no sessionId)
220
+ - `instruction` - Natural language description of what actions to find
221
+ - `options.timeout` - Navigation timeout in milliseconds
222
+ - `options.waitUntil` - When to consider navigation complete: `"load"`, `"domcontentloaded"`, or `"networkidle"`
223
+
224
+ **Returns:**
225
+ ```typescript
226
+ Array<{
227
+ description: string;
228
+ selector: string;
229
+ method: string;
230
+ arguments?: string[];
231
+ }>
232
+ ```
233
+
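As a small illustration of working with the array returned by `observe`, the sketch below keeps only entries with a given `method`. The `ObservedAction` interface copies the return shape documented above; `pickByMethod` and the sample data are hypothetical.

```typescript
// Shape copied from the documented return type of observe().
interface ObservedAction {
  description: string;
  selector: string;
  method: string;
  arguments?: string[];
}

// Hypothetical helper: keep only the actions that use the given method.
function pickByMethod(actions: ObservedAction[], method: string): ObservedAction[] {
  return actions.filter((action) => action.method === method);
}

// Made-up sample data in the documented shape.
const sampleActions: ObservedAction[] = [
  { description: "Home link", selector: "a.nav-home", method: "click" },
  { description: "Search box", selector: "input#q", method: "fill", arguments: ["convex"] },
  { description: "About link", selector: "a.nav-about", method: "click" },
];

const clickable = pickByMethod(sampleActions, "click");
console.log(clickable.map((action) => action.selector));
```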
234
+ ---
235
+
236
+ ### `agent(ctx, args)`
237
+
238
+ Execute autonomous multi-step browser automation using an AI agent. The agent interprets the instruction and decides what actions to take.
239
+
240
+ ```typescript
241
+ // Agent creates its own session
242
+ const result = await stagehand.agent(ctx, {
243
+ url: "https://google.com",
244
+ instruction: "Search for 'convex database' and extract the top 3 results with title and URL",
245
+ options: { maxSteps: 10 },
246
+ });
247
+
248
+ // Agent with existing session
249
+ const result = await stagehand.agent(ctx, {
250
+ sessionId: session.sessionId,
251
+ instruction: "Fill out the contact form and submit",
252
+ options: { maxSteps: 5 },
253
+ });
254
+ ```
255
+
256
+ **Parameters:**
257
+ - `sessionId` - Optional: Use an existing session
258
+ - `url` - The URL to navigate to (required if no sessionId)
259
+ - `instruction` - Natural language description of the task to complete
260
+ - `options.cua` - Enable Computer Use Agent mode
261
+ - `options.maxSteps` - Maximum steps the agent can take
262
+ - `options.systemPrompt` - Custom system prompt for the agent
263
+ - `options.timeout` - Navigation timeout in milliseconds
264
+ - `options.waitUntil` - When to consider navigation complete: `"load"`, `"domcontentloaded"`, or `"networkidle"`
265
+
266
+ **Returns:**
267
+ ```typescript
268
+ {
269
+ actions: Array<{
270
+ type: string;
271
+ action?: string;
272
+ reasoning?: string;
273
+ timeMs?: number;
274
+ }>;
275
+ completed: boolean;
276
+ message: string;
277
+ success: boolean;
278
+ }
279
+ ```
280
+
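The result object can be condensed into a one-line summary for logging. This is a sketch under assumptions: the interfaces copy the documented return shape, `summarizeAgentRun` is a hypothetical helper, and treating a run as "ok" only when both `success` and `completed` are true is an interpretation the README does not confirm.

```typescript
// Shapes copied from the documented return type of agent().
interface AgentAction {
  type: string;
  action?: string;
  reasoning?: string;
  timeMs?: number;
}

interface AgentResult {
  actions: AgentAction[];
  completed: boolean;
  message: string;
  success: boolean;
}

// Hypothetical helper: one-line summary of an agent run.
function summarizeAgentRun(result: AgentResult): string {
  // Steps without a timeMs value count as 0 ms.
  const totalMs = result.actions.reduce((sum, step) => sum + (step.timeMs ?? 0), 0);
  // Assumption: a run is "ok" only if it both succeeded and completed.
  const status = result.success && result.completed ? "ok" : "incomplete";
  return `${status}: ${result.actions.length} steps, ${totalMs} ms - ${result.message}`;
}
```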
281
+ ## Examples
282
+
283
+ ### Simple extraction (automatic session)
284
+
285
+ ```typescript
286
+ const news = await stagehand.extract(ctx, {
287
+ url: "https://news.ycombinator.com",
288
+ instruction: "Get the top 10 stories with title, points, and comment count",
289
+ schema: z.object({
290
+ stories: z.array(z.object({
291
+ title: z.string(),
292
+ points: z.string(),
293
+ comments: z.string(),
294
+ }))
295
+ })
296
+ });
297
+ ```
298
+
299
+ ### Manual session management
300
+
301
+ Use session management when you need to perform multiple operations while preserving browser state (cookies, login, etc.):
302
+
303
+ ```typescript
304
+ // Start a session
305
+ const session = await stagehand.startSession(ctx, {
306
+ url: "https://google.com"
307
+ });
308
+
309
+ // Perform multiple operations in the same session
310
+ await stagehand.act(ctx, {
311
+ sessionId: session.sessionId,
312
+ action: "Search for 'convex database'"
313
+ });
314
+
315
+ const data = await stagehand.extract(ctx, {
316
+ sessionId: session.sessionId,
317
+ instruction: "Extract the top 3 results",
318
+ schema: z.object({
319
+ results: z.array(z.object({
320
+ title: z.string(),
321
+ url: z.string(),
322
+ }))
323
+ })
324
+ });
325
+
326
+ // End the session when done
327
+ await stagehand.endSession(ctx, { sessionId: session.sessionId });
328
+ ```
329
+
330
+ ### Autonomous agent
331
+
332
+ Let the AI agent figure out how to complete a complex task:
333
+
334
+ ```typescript
335
+ const result = await stagehand.agent(ctx, {
336
+ url: "https://www.google.com",
337
+ instruction: "Search for 'best pizza in NYC', click on the first result, and extract the restaurant name and address",
338
+ options: { maxSteps: 10 }
339
+ });
340
+
341
+ console.log(result.message); // Summary of what the agent did
342
+ console.log(result.actions); // Detailed log of each action taken
343
+ ```
344
+
345
+ ### Resume session across Convex actions
346
+
347
+ Store the `browserbaseSessionId` to resume sessions across different Convex action calls:
348
+
349
+ ```typescript
350
+ // Action 1: Start session and return browserbaseSessionId
351
+ export const startBrowsing = action({
352
+ handler: async (ctx) => {
353
+ const session = await stagehand.startSession(ctx, {
354
+ url: "https://example.com/login"
355
+ });
356
+ // Store browserbaseSessionId in your database
357
+ return session.browserbaseSessionId;
358
+ }
359
+ });
360
+
361
+ // Action 2: Resume session later
362
+ export const continueBrowsing = action({
363
+ args: { browserbaseSessionId: v.string() }, // requires: import { v } from "convex/values"
364
+ handler: async (ctx, args) => {
365
+ const session = await stagehand.startSession(ctx, {
366
+ url: "https://example.com/dashboard",
367
+ browserbaseSessionId: args.browserbaseSessionId,
368
+ });
369
+ // Continue using the same browser instance
370
+ return await stagehand.extract(ctx, {
371
+ sessionId: session.sessionId,
372
+ instruction: "Extract user data",
373
+ schema: z.object({ ... }),
374
+ });
375
+ }
376
+ });
377
+ ```
378
+
379
+ ## Configuration Options
380
+
381
+ ### AI Model
382
+
383
+ By default, the component uses `openai/gpt-4o`. You can use any model supported by the [Vercel AI SDK](https://sdk.vercel.ai/providers/ai-sdk-providers) that supports structured outputs:
384
+
385
+ ```typescript
386
+ const stagehand = new Stagehand(components.stagehand, {
387
+ browserbaseApiKey: process.env.BROWSERBASE_API_KEY!,
388
+ browserbaseProjectId: process.env.BROWSERBASE_PROJECT_ID!,
389
+ modelApiKey: process.env.ANTHROPIC_API_KEY!, // Use Anthropic
390
+ modelName: "anthropic/claude-3-5-sonnet-20241022",
391
+ });
392
+ ```
393
+
394
+ For the full list of supported models and providers, see the [Stagehand Models documentation](https://docs.stagehand.dev/configuration/models).
395
+
396
+ ## Requirements
397
+
398
+ - [Browserbase](https://browserbase.com) account and API key
399
+ - LLM provider API key (see [supported models](https://docs.stagehand.dev/configuration/models))
400
+ - Convex 1.29.3 or later
401
+
402
+ ## How It Works
403
+
404
+ This component uses the [Stagehand REST API](https://stagehand.stldocs.app/api) to power browser automation. Each operation:
405
+
406
+ 1. Starts a cloud browser session via Browserbase (or reuses an existing one)
407
+ 2. Navigates to the target URL
408
+ 3. Uses AI to understand the page and perform the requested operation
409
+ 4. Ends the session if the operation created it, then returns the results
410
+
411
+ With session management, you control when sessions start and end, allowing you to maintain browser state across multiple operations.
412
+
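When managing sessions manually, one way to make sure a session is always ended is to wrap the work in `try`/`finally`. The sketch below is not part of the package: `withSession` and `SessionClient` are hypothetical names, and the interface only mirrors the `startSession` and `endSession` signatures documented above so that a test double can stand in for the real client.

```typescript
// Hypothetical interface mirroring the documented startSession/endSession calls.
interface SessionClient<Ctx> {
  startSession(ctx: Ctx, args: { url: string }): Promise<{ sessionId: string }>;
  endSession(ctx: Ctx, args: { sessionId: string }): Promise<{ success: boolean }>;
}

// Hypothetical helper: start a session, run the work, and always end the session.
async function withSession<Ctx, T>(
  client: SessionClient<Ctx>,
  ctx: Ctx,
  url: string,
  work: (sessionId: string) => Promise<T>,
): Promise<T> {
  const { sessionId } = await client.startSession(ctx, { url });
  try {
    return await work(sessionId);
  } finally {
    // Runs whether work() resolved or threw.
    await client.endSession(ctx, { sessionId });
  }
}
```

One caveat of this design: if `endSession` itself throws, that error replaces any error raised by `work`.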
413
+ ## Development
414
+
415
+ ### Component Structure
416
+
417
+ The component exposes its API through Convex's component system. All functions are in a single `lib.ts` module:
418
+
419
+ ```
420
+ component.lib.<function>
421
+ ```
422
+
423
+ For example:
424
+ - `component.lib.startSession` - Start a browser session
425
+ - `component.lib.endSession` - End a browser session
426
+ - `component.lib.extract` - Extract data from web pages
427
+ - `component.lib.act` - Perform browser actions
428
+ - `component.lib.observe` - Find interactive elements
429
+ - `component.lib.agent` - Autonomous multi-step automation
430
+
431
+ The `Stagehand` client class wraps these internal paths to provide a clean user API:
432
+
433
+ ```typescript
434
+ // User calls:
435
+ stagehand.extract(ctx, {...})
436
+
437
+ // Internally calls:
438
+ ctx.runAction(component.lib.extract, {...})
439
+ ```
440
+
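The wrapping pattern described above can be sketched with a stand-in context. Everything here is illustrative: `MiniStagehand`, `RunActionCtx`, `ComponentRefs`, and `ClientConfig` are hypothetical, and merging the client configuration into the action arguments is an assumption about how credentials reach the component, not a description of the actual `Stagehand` class.

```typescript
// Stand-in for the part of a Convex action context used here.
interface RunActionCtx {
  runAction(ref: unknown, args: Record<string, unknown>): Promise<unknown>;
}

// Opaque function reference, as exposed under component.lib.
interface ComponentRefs {
  lib: { extract: unknown };
}

interface ClientConfig {
  browserbaseApiKey: string;
  browserbaseProjectId: string;
  modelApiKey: string;
}

// Hypothetical miniature client: forwards a call to the component function.
class MiniStagehand {
  private readonly component: ComponentRefs;
  private readonly config: ClientConfig;

  constructor(component: ComponentRefs, config: ClientConfig) {
    this.component = component;
    this.config = config;
  }

  extract(ctx: RunActionCtx, args: { url: string; instruction: string }): Promise<unknown> {
    // Assumption: configuration and call arguments travel in one args object.
    return ctx.runAction(this.component.lib.extract, { ...this.config, ...args });
  }
}
```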
441
+ ### Building the Component
442
+
443
+ To build the component locally:
444
+
445
+ ```bash
446
+ # Install dependencies
447
+ npm install
448
+
449
+ # Build with Convex codegen (generates component API)
450
+ npm run build:codegen
451
+
452
+ # Or just build TypeScript
453
+ npm run build:esm
454
+ ```
455
+
456
+ The component requires a Convex deployment to generate proper component API types (`_generated/component.ts`).
457
+
458
+ ### Example App
459
+
460
+ Check out the full example app in the [`example/`](./example) directory:
461
+
462
+ ```bash
463
+ git clone https://github.com/browserbase/convex-stagehand
464
+ cd convex-stagehand/example
465
+ npm install
466
+ npm run dev
467
+ ```
468
+
469
+ The example includes:
470
+ - HackerNews story extraction with AI
471
+ - Type-safe data extraction using Zod schemas
472
+ - Database persistence with Convex
473
+ - Real-time updates and automatic refresh
474
+
475
+ ## License
476
+
477
+ MIT