misoai-web 1.0.6 → 1.5.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (76)
  1. package/README.md +5 -349
  2. package/dist/es/agent.js +165 -428
  3. package/dist/es/agent.js.map +1 -1
  4. package/dist/es/bridge-mode-browser.js +10 -9
  5. package/dist/es/bridge-mode-browser.js.map +1 -1
  6. package/dist/es/bridge-mode.js +167 -430
  7. package/dist/es/bridge-mode.js.map +1 -1
  8. package/dist/es/chrome-extension.js +173 -435
  9. package/dist/es/chrome-extension.js.map +1 -1
  10. package/dist/es/index.js +185 -432
  11. package/dist/es/index.js.map +1 -1
  12. package/dist/es/midscene-playground.js +165 -428
  13. package/dist/es/midscene-playground.js.map +1 -1
  14. package/dist/es/midscene-server.js.map +1 -1
  15. package/dist/es/playground.js +165 -428
  16. package/dist/es/playground.js.map +1 -1
  17. package/dist/es/playwright-report.js +1 -1
  18. package/dist/es/playwright-report.js.map +1 -1
  19. package/dist/es/playwright.js +182 -429
  20. package/dist/es/playwright.js.map +1 -1
  21. package/dist/es/puppeteer-agent-launcher.js +169 -432
  22. package/dist/es/puppeteer-agent-launcher.js.map +1 -1
  23. package/dist/es/puppeteer.js +169 -432
  24. package/dist/es/puppeteer.js.map +1 -1
  25. package/dist/es/ui-utils.js.map +1 -1
  26. package/dist/es/utils.js +7 -4
  27. package/dist/es/utils.js.map +1 -1
  28. package/dist/es/yaml.js +29 -3
  29. package/dist/es/yaml.js.map +1 -1
  30. package/dist/lib/agent.js +163 -426
  31. package/dist/lib/agent.js.map +1 -1
  32. package/dist/lib/bridge-mode-browser.js +10 -9
  33. package/dist/lib/bridge-mode-browser.js.map +1 -1
  34. package/dist/lib/bridge-mode.js +165 -428
  35. package/dist/lib/bridge-mode.js.map +1 -1
  36. package/dist/lib/chrome-extension.js +171 -433
  37. package/dist/lib/chrome-extension.js.map +1 -1
  38. package/dist/lib/index.js +183 -430
  39. package/dist/lib/index.js.map +1 -1
  40. package/dist/lib/midscene-playground.js +163 -426
  41. package/dist/lib/midscene-playground.js.map +1 -1
  42. package/dist/lib/midscene-server.js.map +1 -1
  43. package/dist/lib/playground.js +163 -426
  44. package/dist/lib/playground.js.map +1 -1
  45. package/dist/lib/playwright-report.js +1 -1
  46. package/dist/lib/playwright-report.js.map +1 -1
  47. package/dist/lib/playwright.js +180 -427
  48. package/dist/lib/playwright.js.map +1 -1
  49. package/dist/lib/puppeteer-agent-launcher.js +167 -430
  50. package/dist/lib/puppeteer-agent-launcher.js.map +1 -1
  51. package/dist/lib/puppeteer.js +167 -430
  52. package/dist/lib/puppeteer.js.map +1 -1
  53. package/dist/lib/ui-utils.js.map +1 -1
  54. package/dist/lib/utils.js +7 -4
  55. package/dist/lib/utils.js.map +1 -1
  56. package/dist/lib/yaml.js +29 -3
  57. package/dist/lib/yaml.js.map +1 -1
  58. package/dist/types/agent.d.ts +13 -51
  59. package/dist/types/bridge-mode-browser.d.ts +2 -3
  60. package/dist/types/bridge-mode.d.ts +2 -3
  61. package/dist/types/{browser-aec1055d.d.ts → browser-9b472ffb.d.ts} +1 -1
  62. package/dist/types/chrome-extension.d.ts +2 -3
  63. package/dist/types/index.d.ts +1 -2
  64. package/dist/types/midscene-server.d.ts +1 -2
  65. package/dist/types/{page-86ab0fe1.d.ts → page-ed0ecb44.d.ts} +19 -9
  66. package/dist/types/playground.d.ts +2 -3
  67. package/dist/types/playwright.d.ts +9 -2
  68. package/dist/types/puppeteer-agent-launcher.d.ts +1 -2
  69. package/dist/types/puppeteer.d.ts +6 -5
  70. package/dist/types/ui-utils.d.ts +1 -1
  71. package/dist/types/utils.d.ts +1 -2
  72. package/dist/types/yaml.d.ts +1 -2
  73. package/iife-script/htmlElement.js +53 -75
  74. package/iife-script/htmlElementDebug.js +35 -56
  75. package/package.json +24 -24
  76. package/LICENSE +0 -21
package/README.md CHANGED
@@ -1,353 +1,9 @@
- # @midscene/web - AI-Powered Web Automation
+ ## Documentation
 
- [![npm version](https://badge.fury.io/js/@midscene%2Fweb.svg)](https://badge.fury.io/js/@midscene%2Fweb)
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ Automate UI actions, extract data, and perform assertions using AI. It offers JavaScript SDK, Chrome extension, and support for scripting in YAML.
 
- AI-powered web automation library with advanced cumulative context system for intelligent, multi-step browser interactions.
+ See https://midscenejs.com/ for details.
 
- ## 🚀 Features
+ ## License
 
- - **🧠 Cumulative Context System**: AI remembers previous actions and data across steps
- - **🗣️ Natural Language Data References**: No more `{stored.key}` syntax needed
- - **🎯 Smart Context-Aware Assertions**: Intelligent data replacement in assertions
- - **🔄 Multi-Step Workflows**: Seamless information flow between actions
- - **🛡️ Built-in CAPTCHA Solving**: Advanced AI-powered CAPTCHA detection and solving
- - **📊 Comprehensive Logging**: Full visibility into context operations and AI calls
- - **🎭 Framework Support**: Works with Puppeteer, Playwright, and more
-
- ## 📦 Installation
-
- ```bash
- npm install @midscene/web puppeteer
- # or
- yarn add @midscene/web puppeteer
- ```
-
- ## 🎯 Quick Start
-
- ### Basic Usage with Cumulative Context
-
- ```javascript
- import { PuppeteerAgent } from '@midscene/web';
- import puppeteer from 'puppeteer';
-
- const browser = await puppeteer.launch({ headless: false });
- const page = await browser.newPage();
-
- // Create agent with cumulative context enabled
- const agent = new PuppeteerAgent(page, {
- enableCumulativeContext: true, // 🔥 Enable context system
- autoClearContext: true, // 🧹 Start with clean context
- testId: 'my-automation'
- });
-
- await page.goto('https://example.com');
-
- // Step 1: Store data with natural language
- await agent.aiQuery('kullanıcı adını "john_doe" olarak username şeklinde kaydet');
-
- // Step 2: Use stored data without {stored.key} syntax!
- await agent.aiAction('arama kutusuna username verisini yaz');
-
- // Step 3: Context-aware assertion
- await agent.aiAssert('arama kutusunda username değeri görünüyor');
-
- await browser.close();
- ```
-
- ## 🧠 Cumulative Context System
-
- ### Natural Language Data Storage
-
- ```javascript
- // Turkish natural language storage
- await agent.aiQuery('ürün adını "iPhone 15" olarak urun_adi şeklinde kaydet');
- await agent.aiQuery('fiyatı "999$" olarak fiyat şeklinde kaydet');
-
- // English natural language storage
- await agent.aiQuery('extract username and store as user_name');
- await agent.aiQuery('get email address, store as email');
- ```
-
- ### Automatic Data Usage
-
- ```javascript
- // No need for {stored.key} syntax!
- await agent.aiAction('formu urun_adi ve fiyat ile doldur');
- await agent.aiAction('use user_name and email to fill the form');
-
- // Multiple data references in one action
- await agent.aiAction('sepete urun_adi ürününü fiyat fiyatıyla ekle');
- ```
-
- ### Context-Aware Assertions
-
- ```javascript
- // Smart replacement - AI decides when to replace
- await agent.aiAssert('sepette urun_adi ürünü görünüyor'); // ✅ Will replace
- await agent.aiAssert('sayfada iPhone yazısı var'); // ❓ Context-dependent
- ```
-
- ## 🛡️ AI CAPTCHA Solving
-
- The `aiCaptcha` method provides intelligent CAPTCHA detection and solving capabilities with automatic execution:
-
- ```javascript
- // Basic CAPTCHA solving (auto-detects complexity)
- await agent.aiCaptcha();
-
- // Advanced CAPTCHA solving with options
- await agent.aiCaptcha({
- deepThink: true, // Force deep analysis
- autoDetectComplexity: true // Auto-detect if deep thinking needed (default: true)
- });
- ```
-
- ### CAPTCHA Features
-
- - **🔍 Automatic Detection**: Identifies CAPTCHA type (text, image, unknown)
- - **🧠 Complexity Analysis**: Auto-detects if deep thinking is needed
- - **📝 Text CAPTCHAs**: Solves distorted text and automatically inputs solution
- - **🖼️ Image CAPTCHAs**: Handles "select all images with..." challenges with coordinate clicking
- - **🎯 Automatic Execution**: Performs all required actions (click, input, verify) automatically
- - **⚡ Smart Processing**: Uses appropriate AI model based on complexity
- - **🔄 Action Sequence**: Executes complete CAPTCHA solving workflow
-
- ### CAPTCHA Response Format
-
- The AI returns a structured response with:
-
- ```javascript
- {
- captchaType: "text" | "image" | "unknown",
- solution: "The solution text or description",
- thought: "AI reasoning process",
- actions: [
- {
- type: "click" | "input" | "verify",
- target: "Description of target element",
- value: "Text to input (for input actions)",
- coordinates: [x, y] // For precise clicking
- }
- ]
- }
- ```
-
- ### CAPTCHA Example
-
- ```javascript
- // Navigate to page with CAPTCHA
- await page.goto('https://example.com/login');
-
- // Fill login form
- await agent.aiInput('john@example.com', 'email field');
- await agent.aiInput('password123', 'password field');
-
- // Solve CAPTCHA automatically - AI will:
- // 1. Analyze the CAPTCHA type and complexity
- // 2. Generate solution
- // 3. Execute all required actions automatically
- const captchaResult = await agent.aiCaptcha({
- autoDetectComplexity: true
- });
-
- console.log('CAPTCHA solved:', captchaResult.result.captchaType);
- console.log('Solution:', captchaResult.result.solution);
- console.log('Actions performed:', captchaResult.result.actions.length);
-
- // Form is now ready to submit
- await agent.aiTap('login button');
- ```
-
- ### Advanced CAPTCHA Scenarios
-
- ```javascript
- // Complex image CAPTCHA with deep thinking
- await agent.aiCaptcha({
- deepThink: true // Forces detailed analysis for complex puzzles
- });
-
- // Text CAPTCHA with automatic input
- // AI will automatically:
- // - Click on input field
- // - Type the solution
- // - Click verify button
- await agent.aiCaptcha();
-
- // Using with cumulative context
- await agent.aiQuery('CAPTCHA çözümünü captcha_cozum olarak kaydet');
- await agent.aiCaptcha();
- await agent.aiAssert('CAPTCHA başarıyla çözüldü');
- ```
-
- ## ⚙️ Configuration Options
-
- ```javascript
- const agent = new PuppeteerAgent(page, {
- // Context System
- enableCumulativeContext: true, // Enable/disable context system
- autoClearContext: false, // Auto-clear on agent creation
-
- // Basic Options
- testId: 'my-test', // Test identifier
- cacheId: 'my-cache', // Cache identifier
- aiActionContext: 'context', // AI behavior context
-
- // Reporting
- generateReport: true, // Generate HTML reports
- autoPrintReportMsg: true, // Print report messages
-
- // Navigation
- forceSameTabNavigation: true, // Force same-tab navigation
- waitForNavigationTimeout: 30000, // Navigation timeout
- waitForNetworkIdleTimeout: 5000 // Network idle timeout
- });
- ```
-
- ## 📊 Context Management
-
- ### Direct Context Access
-
- ```javascript
- // Get stored data
- const storedData = agent.getStoredData();
- console.log('Stored:', storedData);
-
- // Get step summary
- const summary = agent.getStepSummary();
- console.log('Steps:', summary);
-
- // Clear context manually
- agent.clearContext();
-
- // Get context store instance
- const contextStore = agent.getContextStore();
- ```
-
- ### Manual Data Storage
-
- ```javascript
- const contextStore = agent.getContextStore();
-
- // Store data manually
- contextStore.storeData('customKey', 'customValue');
-
- // Store with aliases for natural language
- contextStore.storeDataWithAliases('productName', 'iPhone 15', ['urun', 'product']);
-
- // Get recent steps
- const recentSteps = contextStore.getRecentSteps(5);
- ```
-
- ## 🔍 Debug Logging
-
- Enable comprehensive logging to see context operations:
-
- ```bash
- # All debug logs
- DEBUG=midscene:* node your-script.js
-
- # Context-specific logs
- DEBUG=midscene:agent node your-script.js
-
- # AI call logs
- DEBUG=midscene:ai:*,midscene:agent node your-script.js
- ```
-
- ### Log Examples
-
- ```
- DEBUG midscene:agent Context replacement in aiAction: {
- original: "type username in search box",
- processed: "type john_doe in search box",
- storedData: { "username": "john_doe" }
- }
-
- DEBUG midscene:agent Stored query result with aliases: {
- key: "username",
- value: "john_doe",
- aliases: ["username", "user", "kullanici"]
- }
- ```
-
- ## 🎭 Multi-Step Workflow Example
-
- ```javascript
- // E-commerce automation with context
- const agent = new PuppeteerAgent(page, {
- enableCumulativeContext: true,
- autoClearContext: true
- });
-
- await page.goto('https://shop.example.com');
-
- // Step 1: Search and store product info
- await agent.aiAction('arama kutusuna "laptop" yaz ve ara');
- await agent.aiQuery('ilk ürünün adını urun_adi olarak kaydet');
- await agent.aiQuery('ilk ürünün fiyatını urun_fiyati olarak kaydet');
-
- // Step 2: Add to cart using stored data
- await agent.aiAction('urun_adi ürününü sepete ekle');
-
- // Step 3: Verify cart contents
- await agent.aiAction('sepete git');
- await agent.aiAssert('sepette urun_adi ürünü urun_fiyati fiyatıyla görünüyor');
-
- // Step 4: Proceed to checkout
- await agent.aiAction('ödeme sayfasına git');
- await agent.aiAssert('ödeme sayfasında urun_adi ürünü listeleniyor');
- ```
-
- ## 🔄 Backward Compatibility
-
- The new context system is fully backward compatible:
-
- ```javascript
- // Old syntax still works
- await agent.aiQuery('extract username, store as user');
- await agent.aiAction('type {stored.user} in field');
-
- // New syntax (recommended)
- await agent.aiQuery('kullanıcı adını user olarak kaydet');
- await agent.aiAction('alana user verisini yaz');
-
- // Mixed usage
- await agent.aiAction('use {stored.email} and user verisi together');
- ```
-
- ## 📚 API Reference
-
- ### Core Methods
-
- - `aiAction(prompt)` - Perform actions with context awareness
- - `aiQuery(prompt)` - Extract data with automatic storage
- - `aiAssert(assertion)` - Context-aware assertions
- - `aiCaptcha(options)` - Intelligent CAPTCHA solving
- - `aiTap(target, options)` - Click elements
- - `aiInput(text, target, options)` - Input text
-
- ### Context Methods
-
- - `getStoredData()` - Get all stored data
- - `getStepSummary()` - Get step summary
- - `clearContext()` - Clear context store
- - `getContextStore()` - Get context store instance
-
- ## 🤝 Contributing
-
- We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
-
- ## 📄 License
-
- MIT License - see [LICENSE](LICENSE) file for details.
-
- ## 🔗 Links
-
- - [Documentation](https://midscenejs.com)
- - [GitHub Repository](https://github.com/web-infra-dev/midscene)
- - [NPM Package](https://www.npmjs.com/package/@midscene/web)
- - [Examples](https://github.com/web-infra-dev/midscene/tree/main/packages/web-integration/examples)
-
- ---
-
- **Made with ❤️ by the Midscene Team**
+ Midscene is MIT licensed.