testdriverai 7.2.19 → 7.2.20

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/README.md +169 -258
  2. package/package.json +1 -1
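
The single-line change to `package/package.json` corresponds to the version bump named in the title. As a rough sketch of how such a bump can be classified, the snippet below compares two `major.minor.patch` strings. The helper name `classifyBump` is hypothetical (it is not part of `testdriverai`), and it assumes plain numeric versions with no pre-release or build suffix.

```javascript
// Hypothetical helper: classify the change between two "major.minor.patch" strings.
// Assumes purely numeric components (no "-beta.1" or "+build" suffixes).
function classifyBump(from, to) {
  const parse = (version) =>
    version.split('.').map((part) => Number.parseInt(part, 10));

  const [fromMajor, fromMinor, fromPatch] = parse(from);
  const [toMajor, toMinor, toPatch] = parse(to);

  if (toMajor !== fromMajor) return 'major';
  if (toMinor !== fromMinor) return 'minor';
  if (toPatch !== fromPatch) return 'patch';
  return 'none';
}

console.log(classifyBump('7.2.19', '7.2.20')); // prints "patch"
```

Under those assumptions the release is a patch-level bump, which fits a change set that touches only the README and the version field.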
package/README.md CHANGED
@@ -1,314 +1,225 @@
1
- <a href="https://testdriver.ai"><img src="https://github.com/dashcamio/testdriver/assets/318295/2a0ad981-8504-46f0-ad97-60cb6c26f1e7"/></a>
2
-
3
- # TestDriver.ai
4
-
5
- Automate and scale QA with computer-use agents.
6
-
7
- [Docs](https://docs.testdriver.ai) | [Website](https://testdriver.ai) | [GitHub Action](https://github.com/marketplace/actions/testdriver-ai) | [Join our Discord](https://discord.com/invite/cWDFW8DzPm)
8
-
9
- # Install via NPM
10
-
11
- [Follow the instructions on our docs for more.](https://docs.testdriver.ai/overview/quickstart).
12
-
13
- ## v7 SDK - Progressive Disclosure
14
-
15
- TestDriver v7 introduces **three levels of API** to match your experience level:
16
-
17
- ### 🟢 Beginner: Presets (Zero Config)
18
-
19
- ```javascript
20
- import { test } from 'vitest';
21
- import { chromePreset } from 'testdriverai/presets';
22
-
23
- test('login test', async (context) => {
24
- const { client } = await chromePreset(context, {
25
- url: 'https://myapp.com'
26
- });
1
+ <div align="center">
2
+ <a href="https://testdriver.ai">
3
+ <img src="https://github.com/dashcamio/testdriver/assets/318295/2a0ad981-8504-46f0-ad97-60cb6c26f1e7" height="200" alt="TestDriver.ai"/>
4
+ </a>
5
+ </div>
6
+ <h4 align="center">
7
+ Reliably test your most difficult flows. Don't ship bugs because flows are too hard to test.
8
+ </h4>
9
+
10
+ <p align="center">
11
+ TestDriver helps engineering teams easily test, debug, and monitor E2E flows that are hard or impossible to cover with other tools.
12
+ </p>
13
+
14
+ <div align="center">
27
15
 
28
- await client.find('Login button').click();
29
- });
30
- ```
16
+ [🚀 **Quick Start**](#-quick-start) • [📖 **Documentation**](https://docs.testdriver.ai) • [💻 **Examples**](https://github.com/testdriverai/testdriverai/tree/main/test/testdriver) • [💬 **Discord**](https://discord.com/invite/cWDFW8DzPm) • [🌐 **Website**](https://testdriver.ai)
31
17
 
32
- **Built-in presets:** Chrome, VS Code, Electron, and create your own!
18
+ </div>
33
19
 
34
- ### 🟡 Intermediate: Hooks (Flexible)
20
+ ---
35
21
 
36
- ```javascript
37
- import { test } from 'vitest';
38
- import { useTestDriver, useDashcam } from 'testdriverai/vitest/hooks';
39
-
40
- test('my test', async (context) => {
41
- const client = useTestDriver(context, { os: 'linux' });
42
- const dashcam = useDashcam(context, client, {
43
- autoStart: true,
44
- autoStop: true
45
- });
46
-
47
- await client.find('button').click();
48
- });
49
- ```
22
+ ## 🎬 What Can You Test?
50
23
 
51
- **Automatic lifecycle management** - no more forgetting cleanup!
24
+ <div align="center">
52
25
 
53
- ### 🔴 Advanced: Core Classes (Full Control)
26
+ **Third-Party Web Apps** • **Desktop Apps** • **VS Code Extensions** • **Chrome Extensions** • **AI Chatbots** • **OAuth Flows** • **PDF Content** • **Spelling & Grammar** • **File System & Uploads** • **OS Accessibility** • **Visual Content** • **`<iframe>`** • **`<canvas>`** • **`<video>`**
54
27
 
55
- ```javascript
56
- import { test } from 'vitest';
57
- import { TestDriver, Dashcam } from 'testdriverai/core';
28
+ </div>
58
29
 
59
- test('my test', async () => {
60
- const client = new TestDriver(apiKey, { os: 'linux' });
61
- const dashcam = new Dashcam(client);
62
-
63
- await client.auth();
64
- await client.connect();
65
- await dashcam.start();
66
-
67
- await client.find('button').click();
68
-
69
- await dashcam.stop();
70
- await client.disconnect();
71
- });
72
- ```
30
+ ---
73
31
 
74
- **Full manual control** for advanced scenarios.
32
+ ## 🎯 Why TestDriver?
75
33
 
76
- 📖 **Learn more:** [MIGRATION.md](./docs/MIGRATION.md) | [PRESETS.md](./docs/PRESETS.md) | [HOOKS.md](./docs/HOOKS.md)
34
+ TestDriver isn't just another testing framework; it's a **computer-use agent for QA**. Using AI vision and mouse/keyboard emulation, TestDriver can test anything you can see on screen, just like a human QA engineer would.
77
35
 
78
- # About
36
+ ### The Problem with Traditional Testing
79
37
 
80
- TestDriver isn't like any test framework you've used before. TestDriver is an OS Agent for QA. TestDriver uses AI vision along with mouse and keyboard emulation to control the entire desktop. It's more like a QA employee than a test framework. This kind of black-box testing has some major advantages:
38
+ Modern testing tools like Playwright are powerful but limited to selector-based testing in single browser tabs. This breaks down when you need to test:
81
39
 
82
- - **Easier set up:** No need to add test IDs or craft complex selectors
83
- - **Less Maintenance:** Tests don't break when code changes
84
- - **More Power:** TestDriver can test any application and control any OS setting
40
+ | Challenge | Traditional Tools | TestDriver |
41
+ |-----------|------------------|------------|
42
+ | **Dynamic AI Content** | ❌ No selectors for chatbots, images, videos | ✅ AI vision sees everything |
43
+ | **Fast-Moving Teams** | ❌ Brittle selectors break constantly | ✅ Natural language adapts to changes |
44
+ | **Desktop Applications** | ❌ Web-only tools | ✅ Full OS control |
45
+ | **Third-Party Software** | ❌ No access to selectors | ✅ Tests anything visible |
46
+ | **Visual States** | ❌ Can't verify layouts, charts, images | ✅ Computer vision validation |
47
+ | **Multi-App Workflows** | ❌ Single-app limitation | ✅ Cross-application testing |
85
48
 
86
- ### Demo (Playing Balatro Desktop)
49
+ ### The TestDriver Solution
87
50
 
88
- https://github.com/user-attachments/assets/7cb9ee5a-0d05-4ff0-a4fa-084bcee12e98
89
-
90
- # Examples
91
-
92
- - Test any user flow on any website in any browser
93
- - Clone, build, and test any desktop app
94
- - Render multiple browser windows and popups like 3rd party auth
95
- - Test `<canvas>`, `<iframe>`, and `<video>` tags with ease
96
- - Use file selectors to upload files to the browser
97
- - Test chrome extensions
98
- - Test integrations between applications
99
- - Integrates into CI/CD via GitHub Actions ($)
100
-
101
- Check out [the docs](https://docs.testdriver.ai/).
102
-
103
- # Workflow
104
-
105
- 1. Tell TestDriver what to do in natural language on your local machine using `npm i testdriverai -g`
106
- 2. TestDriver looks at the screen and uses mouse and keyboard emulation to accomplish the goal
107
- 3. Run TestDriver tests on our test infrastructure
108
-
109
- # Quickstart
110
-
111
- ## Initialize TestDriver
112
-
113
- In your project directory:
114
-
115
- ```sh
116
- npx testdriverai@latest init
117
- ```
118
-
119
- ## Teach TestDriver a test
120
-
121
- Let's show TestDriver what we want to test. Run the following command:
51
+ ```javascript
52
+ // Instead of fragile selectors...
53
+ await page.locator('#user-menu > div.dropdown > button[data-testid="profile-btn"]').click();
122
54
 
123
- ```sh
124
- npx testdriverai@latest .testdriver/test.yaml
55
+ // ...use natural language that adapts to UI changes
56
+ await testdriver.find('profile button in the top right').click();
125
57
  ```
126
58
 
127
- ## Reset the test state
128
-
129
- TestDriver best practice is to start instructing TestDriver with your app in its initial state. For browsers, this means creating a new tab with the website you want to test.
130
-
131
- If you have multiple monitors, make sure you do this on your primary display.
59
+ ---
132
60
 
133
- ## Instruct TestDriver
61
+ ## 🚀 Quick Start
134
62
 
135
- Now, just tell TestDriver what you want it to do. For now, stick with single commands like "click sign up" and "scroll down."
63
+ Get your first test running in under 5 minutes:
136
64
 
137
- Later, try to perform higher level objectives like "complete the onboarding."
65
+ ### Step 1: Create a TestDriver Account
138
66
 
139
- ```yaml
140
- > Click on sign up
141
- TestDriver Generates a Test
142
- TestDriver will look at your screen and generate a test script. TestDriver can see the screen, control the mouse, keyboard, and more!
143
- TestDriver can only see your primary display!
144
- To navigate to testdriver.ai, we need to focus on the
145
- Google Chrome application, click on the search bar, type
146
- the URL, and then press Enter.
67
+ <a href="https://app.testdriver.ai/team"><img src="https://img.shields.io/badge/Sign_Up-Free_Account-blue?style=for-the-badge" alt="Sign Up"/></a>
147
68
 
148
- Here are the steps:
69
+ *No credit card required!*
149
70
 
150
- 1. Focus on the Google Chrome application.
151
- 2. Click on the search bar.
152
- 3. Type "testdriver.ai".
153
- 4. Press Enter.
71
+ ### Step 2: Initialize Your Project
154
72
 
155
- Let's start with focusing on the Google Chrome application
156
- and clicking on the search bar.
157
-
158
- commands:
159
- - command: focus-application
160
- name: Google Chrome
161
- - command: hover-text
162
- text: Search Google or type a URL
163
- description: main google search
164
- action: click
165
-
166
- After this, we will type the URL and press Enter.
73
+ ```bash
74
+ npx testdriverai@beta init
167
75
  ```
168
76
 
169
- ## TestDriver executes the test script
77
+ This will:
78
+ - Create a project folder
79
+ - Install dependencies (Vitest + TestDriver)
80
+ - Set up your API key
81
+ - Generate an example test
170
82
 
171
- TestDriver will execute the commands found in the yml codeblocks of the response.
83
+ ### Step 3: Run Your First Test
172
84
 
173
- See the yml TestDriver generated? That's our own schema. You can learn more about it in the [reference](https://docs.testdriver.ai/getting-started/editing).
174
-
175
- > Take your hands off the mouse and keyboard while TestDriver executes! TestDriver is not a fan of backseat drivers.
176
-
177
- ## Keep going!
178
-
179
- Feel free to ask TestDriver to perform some more tasks. Every time you prompt TestDriver it will look at your screen and generate more test steps to complete your goal.
180
-
181
- ```sh
182
- > navigate to airbnb.com
183
- > search for destinations in austin tx
184
- > click check in
185
- > select august 8
85
+ ```bash
86
+ vitest run
186
87
  ```
187
88
 
188
- If something didn't work, you can use `/undo` to remove all of the test steps added since the last prompt.
89
+ Watch as TestDriver:
90
+ 1. Spawns a cloud sandbox
91
+ 2. Launches Chrome
92
+ 3. Runs your test using AI vision
93
+ 4. Returns results with video replay
189
94
 
190
- ## Test the test locally
95
+ <div align="center">
96
+ <a href="https://docs.testdriver.ai/v7/quickstart"><img src="https://img.shields.io/badge/📖_Read_Full_Quickstart-4A90E2?style=for-the-badge" alt="Full Quickstart"/></a>
97
+ </div>
191
98
 
192
- Now it's time to make sure the test plan works before we deploy it. Use `testdriver run` to run the test file you just created with `/save`.
99
+ ---
193
100
 
194
- ```sh
195
- npx testdriverai@latest run testdriver/test.yaml
196
- ```
101
+ ## Core Concepts
197
102
 
198
- Make sure to reset the test state!
103
+ ### Real-World Examples
199
104
 
200
- ## Deploy
105
+ #### Web Applications
201
106
 
202
- Now it's time to deploy your test using our GitHub action! `testdriver init` already did the work for you and will start triggering tests once you commit the new files to your repository.
107
+ ```javascript
108
+ // Test dynamic AI chatbots (no selectors needed!)
109
+ test('chatbot conversation', async (context) => {
110
+ const { testdriver } = await chrome(context, { url: 'https://chatapp.com' });
111
+
112
+ await testdriver.find('message input').type('What is TestDriver?');
113
+ await testdriver.find('send button').click();
114
+
115
+ const response = await testdriver.assert('AI response is visible');
116
+ expect(response).toBeTruthy();
117
+ });
203
118
 
204
- ```sh
205
- git add .
206
- git commit -am "Add TestDriver tests"
207
- gh pr create --web
119
+ // Test OAuth flows across multiple domains
120
+ test('OAuth login', async (context) => {
121
+ const { testdriver } = await chrome(context, { url: 'https://myapp.com' });
122
+
123
+ await testdriver.find('Login with Google').click();
124
+ // Handles popup, enters credentials, returns to app
125
+ await testdriver.find('email input').type('user@gmail.com');
126
+ await testdriver.find('password input').type('password');
127
+ await testdriver.find('Sign in').click();
128
+
129
+ await testdriver.assert('successfully logged into the app');
130
+ });
208
131
  ```
209
132
 
210
- Your test will run on every commit and the results will be posted as a Dashcam.io video within your GitHub summary! Learn more about deploying on CI [here](https://docs.testdriver.ai/action/setup).
211
-
212
- ## Using as a Module
213
-
214
- TestDriver can also be used programmatically as a Node.js module. This is useful when you want to integrate TestDriver into your own applications or customize the test file paths.
215
-
216
- ### Custom Test File Paths
217
-
218
- By default, TestDriver looks for test files at `testdriver/testdriver.yaml` relative to the current working directory. You can customize this:
133
+ #### Desktop Applications
219
134
 
220
135
  ```javascript
221
- const TestDriverAgent = require("testdriverai");
222
-
223
- // Option 1: Set default via environment variable
224
- const agent1 = new TestDriverAgent({
225
- TD_DEFAULT_TEST_FILE: "my-tests/integration.yaml",
136
+ // Install and test native desktop apps
137
+ test('desktop app', async (context) => {
138
+ const testdriver = TestDriver(context, { os: 'windows' });
139
+
140
+ await testdriver.provision.installer({
141
+ url: 'https://example.com/MyApp.msi',
142
+ launch: true
143
+ });
144
+
145
+ await testdriver.find('main window').assert('app launched successfully');
226
146
  });
227
-
228
- // Option 2: Explicitly specify test file
229
- const agent2 = new TestDriverAgent(
230
- {},
231
- {
232
- args: ["path/to/specific/test.yaml"],
233
- },
234
- );
235
-
236
- // Option 3: Custom working directory + relative path
237
- const agent3 = new TestDriverAgent(
238
- { TD_DEFAULT_TEST_FILE: "tests/smoke.yaml" },
239
- { options: { workingDir: "/path/to/your/project" } },
240
- );
241
-
242
- // Run the test
243
- await agent1.run();
244
147
  ```
245
148
 
246
- ### Environment Variables
149
+ #### Browser Extensions
247
150
 
248
- You can also set the default test file path using environment variables:
249
-
250
- ```bash
251
- export TD_DEFAULT_TEST_FILE="custom/path/test.yaml"
252
- node your-script.js
151
+ ```javascript
152
+ // Test Chrome extensions from the Web Store
153
+ test('chrome extension', async (context) => {
154
+ const { testdriver } = await chrome(context);
155
+
156
+ await testdriver.provision.chromeExtension({
157
+ extensionId: 'liecbddmkiiihnedobmlmillhodjkdmb' // Loom
158
+ });
159
+
160
+ const button = await testdriver.find('extension icon in toolbar');
161
+ await button.click();
162
+
163
+ const panel = await testdriver.find('extension popup');
164
+ expect(panel.found()).toBeTruthy();
165
+ });
253
166
  ```
254
167
 
255
- ## MCP Server for AI Agents
256
-
257
- TestDriver includes a Model Context Protocol (MCP) server that enables AI agents to **interactively create Vitest test files**.
168
+ #### Visual Validation
258
169
 
259
- ### How It Works
260
-
261
- 1. **AI agent connects** to a persistent TestDriver sandbox
262
- 2. **User describes** what they want to test
263
- 3. **AI explores** the application using TestDriver commands
264
- 4. **AI generates** Vitest test code from successful interactions
265
-
266
- ### Quick Start
267
-
268
- ```bash
269
- cd mcp-server
270
- npm install && npm run build
271
- npm run deploy # Install to ~/.mcp/testdriver
272
- ```
273
-
274
- ### Configuration
170
+ ```javascript
171
+ // Verify visual states, charts, images
172
+ test('dashboard chart', async (context) => {
173
+ const { testdriver } = await chrome(context, { url: 'https://analytics.example.com' });
174
+
175
+ await testdriver.assert('line chart shows upward trend');
176
+ await testdriver.assert('data points are visible on the graph');
177
+
178
+ // Extract text from images/PDFs
179
+ const value = await testdriver.extract('the total revenue number');
180
+ console.log('Revenue:', value);
181
+ });
275
182
 
276
- Add to your MCP client configuration:
183
+ // Test spelling and grammar checking
184
+ test('content validation', async (context) => {
185
+ const { testdriver } = await chrome(context, { url: 'https://docs.example.com' });
186
+
187
+ await testdriver.assert('there are no spelling errors on the page');
188
+ await testdriver.assert('all headings use title case');
189
+ });
277
190
 
278
- ```json
279
- {
280
- "servers": {
281
- "testdriverai": {
282
- "type": "stdio",
283
- "command": "node",
284
- "args": ["/path/to/cli/mcp-server/dist/index.js"]
285
- }
286
- }
287
- }
191
+ // Test canvas and video elements
192
+ test('canvas rendering', async (context) => {
193
+ const { testdriver } = await chrome(context, { url: 'https://canvas-app.com' });
194
+
195
+ await testdriver.find('draw tool').click();
196
+ await testdriver.assert('canvas shows a red circle');
197
+
198
+ // Video content validation
199
+ await testdriver.assert('video is playing');
200
+ await testdriver.assert('video progress bar shows 50% complete');
201
+ });
288
202
  ```
289
203
 
290
- ### Example Workflow
204
+ #### Multi-Application Workflows
291
205
 
206
+ ```javascript
207
+ // Test interactions across multiple apps
208
+ test('copy from browser to VS Code', async (context) => {
209
+ const testdriver = TestDriver(context, { os: 'linux' });
210
+
211
+ await testdriver.provision.chrome({ url: 'https://example.com' });
212
+ await testdriver.find('code snippet').click();
213
+ await testdriver.pressKeys(['ctrl', 'c']);
214
+
215
+ await testdriver.provision.vscode();
216
+ await testdriver.find('editor').click();
217
+ await testdriver.pressKeys(['ctrl', 'v']);
218
+
219
+ await testdriver.assert('code is pasted in editor');
220
+ });
292
221
  ```
293
- User: "Create a test that logs into the app"
294
-
295
- AI: [Connects to sandbox with user's API key]
296
- AI: [Takes screenshot to see login page]
297
- AI: [Finds username field: await testdriver_find({ description: "username field" })]
298
- AI: [Clicks and types: await testdriver_type({ text: "test_user" })]
299
- AI: [Finds password field, enters password]
300
- AI: [Clicks login button]
301
- AI: [Asserts login succeeded]
302
- AI: [Generates Vitest test file from these steps]
303
- AI: [Saves test/login.test.mjs]
304
- ```
305
-
306
- ### Key Features
307
-
308
- - **Persistent sandbox** - Connection stays alive throughout test creation
309
- - **Live debugger URL** - User can watch the AI test in real-time
310
- - **Full SDK access** - All v7 SDK methods available
311
- - **Code generation** - AI translates interactions into proper Vitest code
312
-
313
- See [mcp-server/TEST_CREATION_GUIDE.md](mcp-server/TEST_CREATION_GUIDE.md) for the complete guide.
314
222
 
223
+ <div align="center">
224
+ <a href="https://github.com/testdriverai/testdriverai/tree/main/test/testdriver"><img src="https://img.shields.io/badge/💻_Browse_More_Examples-gray?style=for-the-badge" alt="More Examples"/></a>
225
+ </div>
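
The README hunk header `@@ -1,314 +1,225 @@` and the per-file summary `+169 -258` describe the same change in two ways, so they can be cross-checked. The sketch below is one way to do that. `parseHunkHeader` is a hypothetical helper written for illustration (not part of any tool shown here), and it assumes the standard unified-diff header layout `@@ -oldStart,oldCount +newStart,newCount @@`, where an omitted count means 1.

```javascript
// Hypothetical helper: parse a unified-diff hunk header into its four numbers.
// A missing ",count" defaults to 1, as in the unified diff format.
function parseHunkHeader(header) {
  const match = /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/.exec(header);
  if (!match) {
    throw new Error(`Not a hunk header: ${header}`);
  }
  const [, oldStart, oldCount = '1', newStart, newCount = '1'] = match;
  return {
    oldStart: Number(oldStart),
    oldCount: Number(oldCount),
    newStart: Number(newStart),
    newCount: Number(newCount),
  };
}

// Values copied from this diff: the README hunk header and its "+169 -258" summary.
const readme = parseHunkHeader('@@ -1,314 +1,225 @@');
const added = 169;
const removed = 258;

// Net change seen from the header (new length minus old length)...
const netFromHeader = readme.newCount - readme.oldCount;
// ...and from the summary (lines added minus lines removed).
const netFromSummary = added - removed;

console.log(netFromHeader, netFromSummary); // prints "-89 -89"

// Unchanged context lines must also agree from both sides.
console.log(readme.oldCount - removed, readme.newCount - added); // prints "56 56"
```

Both views agree: the README shrinks by 89 lines and keeps 56 lines unchanged. The same check on `package/package.json` (`@@ -1,6 +1,6 @@` with `+1 -1`) gives a net change of zero.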
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "testdriverai",
3
- "version": "7.2.19",
3
+ "version": "7.2.20",
4
4
  "description": "Next generation autonomous AI agent for end-to-end testing of web & desktop",
5
5
  "main": "sdk.js",
6
6
  "exports": {