testchimp-runner-core 0.0.34 → 0.0.35

Files changed (114)
  1. package/dist/execution-service.d.ts +1 -4
  2. package/dist/execution-service.d.ts.map +1 -1
  3. package/dist/execution-service.js +155 -468
  4. package/dist/execution-service.js.map +1 -1
  5. package/dist/index.d.ts +3 -1
  6. package/dist/index.d.ts.map +1 -1
  7. package/dist/index.js +11 -1
  8. package/dist/index.js.map +1 -1
  9. package/dist/orchestrator/decision-parser.d.ts +18 -0
  10. package/dist/orchestrator/decision-parser.d.ts.map +1 -0
  11. package/dist/orchestrator/decision-parser.js +127 -0
  12. package/dist/orchestrator/decision-parser.js.map +1 -0
  13. package/dist/orchestrator/index.d.ts +4 -2
  14. package/dist/orchestrator/index.d.ts.map +1 -1
  15. package/dist/orchestrator/index.js +14 -2
  16. package/dist/orchestrator/index.js.map +1 -1
  17. package/dist/orchestrator/orchestrator-agent.d.ts +17 -14
  18. package/dist/orchestrator/orchestrator-agent.d.ts.map +1 -1
  19. package/dist/orchestrator/orchestrator-agent.js +534 -204
  20. package/dist/orchestrator/orchestrator-agent.js.map +1 -1
  21. package/dist/orchestrator/orchestrator-prompts.d.ts +14 -2
  22. package/dist/orchestrator/orchestrator-prompts.d.ts.map +1 -1
  23. package/dist/orchestrator/orchestrator-prompts.js +529 -247
  24. package/dist/orchestrator/orchestrator-prompts.js.map +1 -1
  25. package/dist/orchestrator/page-som-handler.d.ts +106 -0
  26. package/dist/orchestrator/page-som-handler.d.ts.map +1 -0
  27. package/dist/orchestrator/page-som-handler.js +1353 -0
  28. package/dist/orchestrator/page-som-handler.js.map +1 -0
  29. package/dist/orchestrator/som-types.d.ts +149 -0
  30. package/dist/orchestrator/som-types.d.ts.map +1 -0
  31. package/dist/orchestrator/som-types.js +87 -0
  32. package/dist/orchestrator/som-types.js.map +1 -0
  33. package/dist/orchestrator/tool-registry.d.ts +2 -0
  34. package/dist/orchestrator/tool-registry.d.ts.map +1 -1
  35. package/dist/orchestrator/tool-registry.js.map +1 -1
  36. package/dist/orchestrator/tools/index.d.ts +4 -1
  37. package/dist/orchestrator/tools/index.d.ts.map +1 -1
  38. package/dist/orchestrator/tools/index.js +7 -2
  39. package/dist/orchestrator/tools/index.js.map +1 -1
  40. package/dist/orchestrator/tools/refresh-som-markers.d.ts +12 -0
  41. package/dist/orchestrator/tools/refresh-som-markers.d.ts.map +1 -0
  42. package/dist/orchestrator/tools/refresh-som-markers.js +64 -0
  43. package/dist/orchestrator/tools/refresh-som-markers.js.map +1 -0
  44. package/dist/orchestrator/tools/view-previous-screenshot.d.ts +15 -0
  45. package/dist/orchestrator/tools/view-previous-screenshot.d.ts.map +1 -0
  46. package/dist/orchestrator/tools/view-previous-screenshot.js +92 -0
  47. package/dist/orchestrator/tools/view-previous-screenshot.js.map +1 -0
  48. package/dist/orchestrator/types.d.ts +23 -1
  49. package/dist/orchestrator/types.d.ts.map +1 -1
  50. package/dist/orchestrator/types.js +11 -1
  51. package/dist/orchestrator/types.js.map +1 -1
  52. package/dist/scenario-service.d.ts +5 -0
  53. package/dist/scenario-service.d.ts.map +1 -1
  54. package/dist/scenario-service.js +17 -0
  55. package/dist/scenario-service.js.map +1 -1
  56. package/dist/scenario-worker-class.d.ts +4 -0
  57. package/dist/scenario-worker-class.d.ts.map +1 -1
  58. package/dist/scenario-worker-class.js +18 -3
  59. package/dist/scenario-worker-class.js.map +1 -1
  60. package/dist/testing/agent-tester.d.ts +35 -0
  61. package/dist/testing/agent-tester.d.ts.map +1 -0
  62. package/dist/testing/agent-tester.js +84 -0
  63. package/dist/testing/agent-tester.js.map +1 -0
  64. package/dist/testing/ref-translator-tester.d.ts +44 -0
  65. package/dist/testing/ref-translator-tester.d.ts.map +1 -0
  66. package/dist/testing/ref-translator-tester.js +104 -0
  67. package/dist/testing/ref-translator-tester.js.map +1 -0
  68. package/dist/utils/hierarchical-selector.d.ts +47 -0
  69. package/dist/utils/hierarchical-selector.d.ts.map +1 -0
  70. package/dist/utils/hierarchical-selector.js +212 -0
  71. package/dist/utils/hierarchical-selector.js.map +1 -0
  72. package/dist/utils/page-info-retry.d.ts +14 -0
  73. package/dist/utils/page-info-retry.d.ts.map +1 -0
  74. package/dist/utils/page-info-retry.js +60 -0
  75. package/dist/utils/page-info-retry.js.map +1 -0
  76. package/dist/utils/page-info-utils.d.ts +1 -0
  77. package/dist/utils/page-info-utils.d.ts.map +1 -1
  78. package/dist/utils/page-info-utils.js +46 -18
  79. package/dist/utils/page-info-utils.js.map +1 -1
  80. package/dist/utils/ref-attacher.d.ts +21 -0
  81. package/dist/utils/ref-attacher.d.ts.map +1 -0
  82. package/dist/utils/ref-attacher.js +149 -0
  83. package/dist/utils/ref-attacher.js.map +1 -0
  84. package/dist/utils/ref-translator.d.ts +49 -0
  85. package/dist/utils/ref-translator.d.ts.map +1 -0
  86. package/dist/utils/ref-translator.js +276 -0
  87. package/dist/utils/ref-translator.js.map +1 -0
  88. package/package.json +1 -1
  89. package/plandocs/exploratory-mode-support-v2.plan.md +953 -0
  90. package/plandocs/exploratory-mode-support.plan.md +928 -0
  91. package/plandocs/journey-id-tracking-addendum.md +227 -0
  92. package/src/execution-service.ts +179 -596
  93. package/src/index.ts +10 -0
  94. package/src/orchestrator/decision-parser.ts +139 -0
  95. package/src/orchestrator/index.ts +25 -1
  96. package/src/orchestrator/orchestrator-agent.ts +656 -236
  97. package/src/orchestrator/orchestrator-prompts.ts +559 -247
  98. package/src/orchestrator/page-som-handler.ts +1565 -0
  99. package/src/orchestrator/som-types.ts +188 -0
  100. package/src/orchestrator/tool-registry.ts +2 -0
  101. package/src/orchestrator/tools/index.ts +4 -1
  102. package/src/orchestrator/tools/refresh-som-markers.ts +69 -0
  103. package/src/orchestrator/tools/view-previous-screenshot.ts +103 -0
  104. package/src/orchestrator/types.ts +49 -6
  105. package/src/scenario-service.ts +20 -0
  106. package/src/scenario-worker-class.ts +24 -3
  107. package/src/utils/page-info-retry.ts +65 -0
  108. package/src/utils/page-info-utils.ts +53 -18
  109. package/testchimp-runner-core-0.0.35.tgz +0 -0
  110. package/src/orchestrator/orchestrator-agent.ts.backup +0 -1386
  111. package/testchimp-runner-core-0.0.33.tgz +0 -0
  112. /package/{RELEASE_0.0.26.md → releasenotes/RELEASE_0.0.26.md} +0 -0
  113. /package/{RELEASE_0.0.27.md → releasenotes/RELEASE_0.0.27.md} +0 -0
  114. /package/{RELEASE_0.0.28.md → releasenotes/RELEASE_0.0.28.md} +0 -0
@@ -0,0 +1,928 @@
1
+ # Add Exploratory Mode Support to Runner-Core
2
+
3
+ ## Overview
4
+
5
+ Enable runner-core's orchestrator to support an exploratory mode in which the agent autonomously decides its next actions from a high-level exploration prompt, rather than following pre-defined test steps. This reuses the existing infrastructure (tools, memory, command execution) with modified prompting strategies.
6
+
7
+ **Key Principle**: The orchestrator fires `onStepComplete` for each autonomous action, so test-based-explorer neither knows nor cares whether steps were pre-defined or autonomously decided. All analytics, bug capture, and screen state detection work identically.
8
+
9
+ ---
10
+
11
+ ## Phase 1: Extend Runner-Core Types & Config
12
+
13
+ ### 1.1 Add Exploration Mode Types
14
+
15
+ **File**: `runner-core/src/orchestrator/types.ts`
16
+
17
+ Add a new interface:
18
+
19
+ ```typescript
20
+ /**
21
+ * Exploration mode configuration
22
+ */
23
+ export interface ExplorationMode {
24
+ enabled: boolean; // Whether exploration mode is active
25
+ explorationPrompt: string; // High-level goal: "Explore all menu options"
26
+ testDataPrompt?: string; // Test data, credentials context
27
+ maxExplorationSteps?: number; // Budget limit (default: 50) - agent can stop earlier
28
+ }
29
+ ```
30
+
31
+ Add to `AgentConfig`:
32
+
33
+ ```typescript
34
+ export interface AgentConfig {
35
+ // ... existing fields
36
+
37
+ // Exploration mode (NEW)
38
+ explorationMode?: ExplorationMode;
39
+ }
40
+ ```
41
+
42
+ Update `DEFAULT_AGENT_CONFIG`:
43
+
44
+ ```typescript
45
+ export const DEFAULT_AGENT_CONFIG: Required<AgentConfig> = {
46
+ // ... existing defaults
47
+
48
+ explorationMode: {
49
+ enabled: false,
50
+ explorationPrompt: '',
51
+ testDataPrompt: undefined,
52
+ maxExplorationSteps: 50
53
+ }
54
+ };
55
+ ```
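The defaults above only help if caller-supplied values are overlaid on them field by field. A minimal sketch of that merge follows, assuming a hypothetical helper named `resolveExplorationMode` (it does not appear in the plan). It uses `??` so that only missing values fall back to the defaults.

```typescript
// Mirrors the ExplorationMode interface from section 1.1.
interface ExplorationMode {
  enabled: boolean;
  explorationPrompt: string;
  testDataPrompt?: string;
  maxExplorationSteps?: number;
}

// Same values as the explorationMode entry in DEFAULT_AGENT_CONFIG.
const DEFAULT_EXPLORATION_MODE = {
  enabled: false,
  explorationPrompt: '',
  maxExplorationSteps: 50,
};

// Hypothetical helper (an assumption, not part of the plan):
// overlay caller-supplied fields on the defaults, one field at a time.
function resolveExplorationMode(partial: Partial<ExplorationMode> = {}): ExplorationMode {
  return {
    enabled: partial.enabled ?? DEFAULT_EXPLORATION_MODE.enabled,
    explorationPrompt: partial.explorationPrompt ?? DEFAULT_EXPLORATION_MODE.explorationPrompt,
    testDataPrompt: partial.testDataPrompt,
    maxExplorationSteps: partial.maxExplorationSteps ?? DEFAULT_EXPLORATION_MODE.maxExplorationSteps,
  };
}

const resolved = resolveExplorationMode({
  enabled: true,
  explorationPrompt: 'Explore all menu options',
});
console.log(resolved.maxExplorationSteps); // falls back to the default budget
```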
56
+
57
+ ### 1.2 Reuse Existing Journey Memory
58
+
59
+ **No code changes needed** - existing `JourneyMemory` fields are sufficient:
60
+
61
+ - **`history`**: The agent reviews this to understand which screens/areas were visited
61
+ - **`experiences`**: Used for BOTH app patterns AND exploration progress
62
+   - Example: "Dashboard fully explored - tested all widgets"
63
+   - Example: "Discovered Admin menu but not explored yet"
64
+ - **`extractedData`**: Stores discovered areas under special keys
65
+   - Example: `{ "menuItems": "Dashboard,Settings,Admin,Profile" }`
66
+   - Example: `{ "explored": "Dashboard,Settings" }`
67
+ - **`latestNote`**: Tactical memory for exploration strategy
69
+
70
+ The exploratory prompts guide the agent to use these fields appropriately.
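To illustrate how the comma-separated `extractedData` values could drive "what to visit next", here is a minimal sketch. It assumes `extractedData` is a string-to-string map and that the key names are `menuItems` and `explored`, as in the examples above. The helper names `splitList` and `unexploredAreas` are hypothetical.

```typescript
// Assumed shape: string keys mapped to comma-separated string values.
type ExtractedData = Record<string, string>;

// Split a comma-separated value into trimmed, non-empty items.
function splitList(value: string | undefined): string[] {
  return (value ?? '')
    .split(',')
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Hypothetical helper: discovered areas that are not yet marked as explored.
function unexploredAreas(
  data: ExtractedData,
  discoveredKey = 'menuItems',
  exploredKey = 'explored'
): string[] {
  const explored = new Set(splitList(data[exploredKey]));
  return splitList(data[discoveredKey]).filter((item) => !explored.has(item));
}

const remaining = unexploredAreas({
  menuItems: 'Dashboard,Settings,Admin,Profile',
  explored: 'Dashboard,Settings',
});
console.log(remaining); // [ 'Admin', 'Profile' ]
```

In the plan itself this bookkeeping is left to the LLM via prompting. A helper like this would only matter if the runner wanted to verify or summarize progress deterministically.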
71
+
72
+ ---
73
+
74
+ ## Phase 2: Add Exploratory Prompts
75
+
76
+ ### 2.1 Create Exploratory System Prompt
77
+
78
+ **File**: `runner-core/src/orchestrator/orchestrator-prompts.ts`
79
+
80
+ Add a new method:
81
+
82
+ ```typescript
83
+ static buildExploratorySystemPrompt(toolDescriptions: string): string {
84
+ return `You are an autonomous exploration agent that discovers and tests web application features.
85
+
86
+ ${toolDescriptions}
87
+
88
+ YOUR RESPONSE FORMAT - Output JSON matching this interface:
89
+
90
+ interface AgentDecisionLLMResponse {
91
+ status: string; // "continue" | "complete" | "stuck"
92
+ reasoning: string; // What you're exploring and why
93
+ commands?: string[]; // Playwright commands to execute
94
+ commandReasoning?: string; // Why these commands
95
+ toolCalls?: Array<{ // Tools to call (extract_data for menus, etc.)
96
+ name: string;
97
+ params: Record<string, any>;
98
+ }>;
99
+ toolReasoning?: string;
100
+ needsToolResults?: boolean;
101
+ noteToFutureSelf?: string;
102
+ coordinateAction?: { ... };
103
+ experiences?: string[]; // Use for BOTH app patterns AND exploration progress
104
+ blockerDetected?: { ... };
105
+ }
106
+
107
+ EXPLORATION MODE GUIDELINES:
108
+
109
+ 1. **GOAL-DRIVEN EXPLORATION**: Follow the exploration prompt as your north star
110
+ - "Explore all menu options" → Extract menu items, visit each one systematically
111
+ - "Test dashboard features" → Discover widgets/interactions, test them thoroughly
112
+ - "Find bugs in settings" → Navigate to settings, try various configurations
113
+
114
+ 2. **AUTONOMOUS DISCOVERY**: You decide what to explore next based on:
115
+ - The exploration prompt (main goal)
116
+ - Current page state (what's available now)
117
+ - Journey history (what's been explored - check history, experiences, extractedData)
118
+ - Discovered but unvisited areas (stored in extractedData or experiences)
119
+
120
+ 3. **SYSTEMATIC EXPLORATION**:
121
+ - Use extract_data tool to discover elements (menus, buttons, links)
122
+ - Store discoveries in extractedData: { "menuItems": "Dashboard,Settings,Admin" }
123
+ - Track progress in experiences: "Explored Dashboard - all widgets working"
124
+ - Check history to avoid re-visiting same areas
125
+ - Prioritize unexplored areas
126
+
127
+ 4. **CREATIVE TESTING**: Test functionality, don't just navigate
128
+ - Try different input combinations
129
+ - Explore edge cases (empty inputs, max lengths, special characters)
130
+ - Verify features work as expected
131
+ - Look for visual bugs, console errors, broken functionality
132
+
133
+ 5. **BLOCKER HANDLING**: Clear obstacles autonomously
134
+ - Cookie modals → dismiss with blockerDetected.clearingCommands
135
+ - Tour popups → close them
136
+ - Login required → use credentials from test data prompt
137
+ - Navigation blockers → clear before continuing
138
+
139
+ 6. **STATUS DECISIONS** (CRITICAL):
140
+ - "continue": More exploration needed to achieve goal
141
+ - "complete": Exploration goal ACHIEVED (all menus explored, features tested, OR budget running low)
142
+ - "stuck": Cannot proceed (auth permanently blocked, critical error)
143
+
144
+ You should mark "complete" when:
145
+ - You've achieved the exploration goal (e.g., all menus explored)
146
+ - You've made good progress and are approaching budget limit
147
+ - Further exploration would be repetitive
148
+
149
+ DON'T wait to hit maxExplorationSteps - stop when goal is met!
150
+
151
+ 7. **MEMORY USAGE**:
152
+ - experiences: Both app patterns AND exploration notes
153
+ Example: "Settings menu requires admin role to access"
154
+ Example: "Explored Dashboard completely - 5 widgets all functional"
155
+ - extractedData: Discovered elements and tracking
156
+ Example: { "menuItems": "Dashboard,Settings,Admin,Profile" }
157
+ Example: { "exploredMenus": "Dashboard,Settings" }
158
+ - history: Review to see what actions were taken
159
+ - noteToFutureSelf: Tactical plans for next iteration
160
+
161
+ 8. **TOOLS FOR EXPLORATION**:
162
+ - extract_data: Discover menus, links, interactive elements
163
+ - take_screenshot: Understand visual layout when DOM unclear
164
+ - recall_history: Check what was already explored
165
+
166
+ CRITICAL: You're fully autonomous - no step-by-step instructions will be provided.
167
+ YOU decide the exploration path based on the goal, current state, and memory.`;
168
+ }
169
+ ```
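Because the prompt declares `status` as a plain `string`, the caller has to check that the model actually returned one of the three allowed values. The sketch below shows one way to validate a subset of the response. `parseMinimalDecision` and `isDecisionStatus` are hypothetical names, and this is not the package's own decision parser.

```typescript
const STATUSES = ['continue', 'complete', 'stuck'] as const;
type DecisionStatus = (typeof STATUSES)[number];

// Subset of AgentDecisionLLMResponse used for illustration only.
interface MinimalDecision {
  status: DecisionStatus;
  reasoning: string;
  commands: string[];
}

function isDecisionStatus(value: unknown): value is DecisionStatus {
  return typeof value === 'string' && (STATUSES as readonly string[]).includes(value);
}

// Hypothetical validator: throws on malformed JSON or an unknown status.
function parseMinimalDecision(raw: string): MinimalDecision {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('Decision must be a JSON object');
  }
  const obj = parsed as Record<string, unknown>;
  const status = obj['status'];
  if (!isDecisionStatus(status)) {
    throw new Error(`Unknown status: ${String(status)}`);
  }
  const rawCommands = obj['commands'];
  const commands = Array.isArray(rawCommands)
    ? rawCommands.filter((c): c is string => typeof c === 'string')
    : [];
  const reasoning = obj['reasoning'];
  return { status, reasoning: typeof reasoning === 'string' ? reasoning : '', commands };
}

const decision = parseMinimalDecision(
  '{"status":"continue","reasoning":"Open Settings","commands":["await page.click(\'text=Settings\')"]}'
);
console.log(decision.status, decision.commands.length);
```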
170
+
171
+ ### 2.2 Create Exploratory User Prompt
172
+
173
+ Add a method to `OrchestratorPrompts`:
174
+
175
+ ```typescript
176
+ static buildExploratoryUserPrompt(
177
+ context: AgentContext,
178
+ explorationPrompt: string,
179
+ testDataPrompt?: string,
180
+ stepNumber?: number,
181
+ maxSteps?: number
182
+ ): string {
183
+ const parts: string[] = [];
184
+
185
+ parts.push('=== EXPLORATION CONTEXT ===\n');
186
+ parts.push(`🎯 EXPLORATION GOAL: ${explorationPrompt}`);
187
+
188
+ if (testDataPrompt) {
189
+ parts.push(`📋 TEST DATA/CREDENTIALS: ${testDataPrompt}`);
190
+ }
191
+
192
+ if (stepNumber && maxSteps) {
193
+ parts.push(`📊 PROGRESS: Step ${stepNumber}/${maxSteps} (you can complete earlier if goal met)\n`);
194
+ }
195
+
196
+ // Show what's been explored (from extractedData and experiences)
197
+ if (context.extractedData && Object.keys(context.extractedData).length > 0) {
198
+ parts.push(`\n💾 DISCOVERED DATA:`);
199
+ for (const [key, value] of Object.entries(context.extractedData)) {
200
+ parts.push(` ${key}: ${value}`);
201
+ }
202
+ }
203
+
204
+ parts.push(`\nCURRENT PAGE:`);
205
+ parts.push(`URL: ${context.currentURL}`);
206
+ parts.push(`Title: ${context.currentPageInfo.title}`);
207
+ parts.push(`\nINTERACTIVE ELEMENTS (with positions and selectors):`);
208
+ parts.push(context.currentPageInfo.formattedElements);
209
+ parts.push(`\nARIA TREE (hierarchical structure):`);
210
+ parts.push(JSON.stringify(context.currentPageInfo.ariaSnapshot, null, 2).substring(0, 5000));
211
+ if (JSON.stringify(context.currentPageInfo.ariaSnapshot).length > 5000) {
212
+ parts.push('... (truncated)');
213
+ }
214
+
215
+ // Recent actions
216
+ if (context.recentSteps.length > 0) {
217
+ parts.push(`\nRECENT EXPLORATION ACTIONS (last ${context.recentSteps.length}):`);
218
+ for (const step of context.recentSteps) {
219
+ const status = step.result === 'success' ? '✓' : '✗';
220
+ parts.push(` ${status} ${step.action}`);
221
+ parts.push(` ${step.observation}`);
222
+ }
223
+ }
224
+
225
+ // Learnings and exploration progress
226
+ if (context.experiences && context.experiences.length > 0) {
227
+ parts.push(`\nEXPLORATION NOTES & APP PATTERNS:`);
228
+ for (const exp of context.experiences) {
229
+ parts.push(` • ${exp}`);
230
+ }
231
+ }
232
+
233
+ // Note from previous iteration
234
+ if (context.noteFromPreviousIteration) {
235
+ parts.push(`\n📝 YOUR NOTE FROM LAST ITERATION:`);
236
+ parts.push(` ${context.noteFromPreviousIteration.content}`);
237
+ }
238
+
239
+ parts.push(`\n🤔 DECIDE YOUR NEXT EXPLORATION ACTION:`);
240
+ parts.push(`1. What does the exploration goal require?`);
241
+ parts.push(`2. What's available on the current page?`);
242
+ parts.push(`3. What have you already explored? (check history, experiences, extractedData)`);
243
+ parts.push(`4. What should you explore next to achieve the goal?`);
244
+ parts.push(`5. Is the goal achieved? If yes, mark status="complete" (don't wait for max steps)`);
245
+
246
+ return parts.join('\n');
247
+ }
248
+ ```
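In the prompt builder above, the ARIA snapshot is serialized twice: pretty-printed for the text that is cut to 5000 characters, and compact for the length check. The compact form is shorter, so the "(truncated)" marker can be skipped even when the pretty-printed text was cut. A minimal sketch that serializes once, using a hypothetical helper `serializeWithLimit`:

```typescript
// Hypothetical helper: pretty-print once, then truncate and mark based on that same string.
function serializeWithLimit(value: unknown, maxChars = 5000): string {
  const json = JSON.stringify(value, null, 2) ?? '';
  if (json.length <= maxChars) {
    return json;
  }
  return `${json.substring(0, maxChars)}\n... (truncated)`;
}

const small = serializeWithLimit({ role: 'button', name: 'Save' });
const large = serializeWithLimit({ text: 'x'.repeat(100) }, 20);
console.log(small);
console.log(large.endsWith('... (truncated)'));
```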
249
+
250
+ ---
251
+
252
+ ## Phase 3: Extend OrchestratorAgent for Exploration
253
+
254
+ ### 3.1 Add Exploration Execution Method
255
+
256
+ **File**: `runner-core/src/orchestrator/orchestrator-agent.ts`
257
+
258
+ Add a new method that runs the exploration loop (similar to `executeStep`, but for autonomous exploration):
259
+
260
+ ```typescript
261
+ /**
262
+ * Execute exploration mode - agent autonomously explores based on prompt
263
+ * Fires callbacks for each autonomous action (transparent to caller)
264
+ */
265
+ async executeExploration(
266
+ page: any,
267
+ explorationConfig: ExplorationMode,
268
+ progressReporter: ProgressReporter | undefined,
269
+ jobId: string
270
+ ): Promise<OrchestratorStepResult> {
271
+ this.logger?.(`\n[Orchestrator] ========== EXPLORATORY MODE ==========`);
272
+ this.logger?.(`[Orchestrator] 🎯 Goal: ${explorationConfig.explorationPrompt}`);
273
+ if (explorationConfig.testDataPrompt) {
274
+ this.logger?.(`[Orchestrator] 📋 Test Data: ${explorationConfig.testDataPrompt}`);
275
+ }
276
+
277
+ const memory: JourneyMemory = {
278
+ history: [],
279
+ experiences: [],
280
+ extractedData: {}
281
+ };
282
+
283
+ const maxSteps = explorationConfig.maxExplorationSteps || 50;
284
+ let stepNumber = 0;
285
+ const commandsExecuted: string[] = [];
286
+
287
+ while (stepNumber < maxSteps) {
288
+ stepNumber++;
289
+
290
+ this.logger?.(`\n[Orchestrator] === Exploration Step ${stepNumber}/${maxSteps} ===`);
291
+
292
+ // Build exploratory context
293
+ const context = await this.buildExploratoryContext(
294
+ page,
295
+ explorationConfig.explorationPrompt,
296
+ explorationConfig.testDataPrompt,
297
+ memory,
298
+ stepNumber,
299
+ maxSteps
300
+ );
301
+
302
+ // Call agent with exploratory prompt
303
+ const decision = await this.callExploratoryAgent(
304
+ context,
305
+ jobId,
306
+ stepNumber
307
+ );
308
+
309
+ this.logAgentDecision(decision, stepNumber);
310
+
311
+ // Report progress via callback (CRITICAL - this is how test-based-explorer tracks steps)
312
+ if (progressReporter?.onStepProgress) {
313
+ await progressReporter.onStepProgress({
314
+ stepNumber,
315
+ stepId: `exploration-${stepNumber}`,
316
+ description: decision.reasoning,
317
+ code: decision.commands?.join('\n') || '',
318
+ status: StepExecutionStatus.IN_PROGRESS,
319
+ wasRepaired: false
320
+ });
321
+ }
322
+
323
+ // Execute tools if requested
324
+ if (decision.toolCalls && decision.toolCalls.length > 0) {
325
+ const toolResults = await this.executeTools(decision.toolCalls, page, memory, stepNumber);
326
+
327
+ // If needs tool results, call agent again
328
+ if (decision.needsToolResults) {
329
+ const updatedContext = { ...context, toolResults };
330
+ const continuedDecision = await this.callExploratoryAgent(updatedContext, jobId, stepNumber);
331
+
332
+ decision.commands = continuedDecision.commands || decision.commands;
333
+ decision.commandReasoning = continuedDecision.commandReasoning || decision.commandReasoning;
334
+ decision.status = continuedDecision.status;
335
+ }
336
+ }
337
+
338
+ // Handle blocker clearing
339
+ if (decision.blockerDetected && decision.blockerDetected.clearingCommands) {
340
+ this.logger?.(`[Orchestrator] 🚧 Clearing blocker: ${decision.blockerDetected.description}`);
341
+ const blockerResult = await this.executeCommandsSequentially(
342
+ decision.blockerDetected.clearingCommands,
343
+ page,
344
+ memory,
345
+ stepNumber,
346
+ 1,
347
+ jobId
348
+ );
349
+ commandsExecuted.push(...blockerResult.executed);
350
+ }
351
+
352
+ // Execute exploration commands
353
+ let commandsSucceeded = true;
354
+ if (decision.commands && decision.commands.length > 0) {
355
+ const executeResult = await this.executeCommandsSequentially(
356
+ decision.commands,
357
+ page,
358
+ memory,
359
+ stepNumber,
360
+ 1,
361
+ jobId
362
+ );
363
+ commandsExecuted.push(...executeResult.executed);
364
+ commandsSucceeded = executeResult.allSucceeded;
365
+ }
366
+
367
+ // Report step completion (CRITICAL - fires test-based-explorer callbacks)
368
+ if (progressReporter?.onStepProgress) {
369
+ await progressReporter.onStepProgress({
370
+ stepNumber,
371
+ stepId: `exploration-${stepNumber}`,
372
+ description: decision.reasoning,
373
+ code: decision.commands?.join('\n') || '',
374
+ status: commandsSucceeded ? StepExecutionStatus.SUCCESS : StepExecutionStatus.FAILED,
375
+ error: commandsSucceeded ? undefined : 'Command execution failed',
376
+ wasRepaired: false
377
+ });
378
+ }
379
+
380
+ // Add experiences (both app patterns AND exploration progress)
381
+ if (decision.experiences) {
382
+ memory.experiences.push(...decision.experiences);
383
+ if (memory.experiences.length > this.config.maxExperiences) {
384
+ memory.experiences = memory.experiences.slice(-this.config.maxExperiences);
385
+ }
386
+ }
387
+
388
+ // Store note for next iteration
389
+ if (decision.noteToFutureSelf) {
390
+ memory.latestNote = {
391
+ fromIteration: stepNumber,
392
+ content: decision.noteToFutureSelf
393
+ };
394
+ }
395
+
396
+ // Check termination
397
+ if (decision.status === 'complete') {
398
+ this.logger?.(`[Orchestrator] ✅ Exploration complete: ${decision.statusReasoning}`);
399
+ return {
400
+ success: true,
401
+ commands: commandsExecuted,
402
+ iterations: stepNumber,
403
+ terminationReason: 'complete',
404
+ memory
405
+ };
406
+ } else if (decision.status === 'stuck') {
407
+ this.logger?.(`[Orchestrator] ❌ Exploration stuck: ${decision.statusReasoning}`);
408
+ return {
409
+ success: false,
410
+ commands: commandsExecuted,
411
+ iterations: stepNumber,
412
+ terminationReason: 'agent_stuck',
413
+ memory,
414
+ error: decision.statusReasoning
415
+ };
416
+ }
417
+ }
418
+
419
+ // Hit max steps - not necessarily a failure
420
+ this.logger?.(`[Orchestrator] ⚠ Maximum exploration steps reached (budget limit)`);
421
+ return {
422
+ success: true, // Not a failure - just budget limit
423
+ commands: commandsExecuted,
424
+ iterations: stepNumber,
425
+ terminationReason: 'system_limit',
426
+ memory
427
+ };
428
+ }
429
+
430
+ private async buildExploratoryContext(
431
+ page: any,
432
+ explorationPrompt: string,
433
+ testDataPrompt: string | undefined,
434
+ memory: JourneyMemory,
435
+ stepNumber: number,
436
+ maxSteps: number
437
+ ): Promise<AgentContext> {
438
+ const currentPageInfo = await getEnhancedPageInfo(page);
439
+ const currentURL = page.url();
440
+ const recentSteps = memory.history.slice(-this.config.recentStepsCount);
441
+
442
+ return {
443
+ overallGoal: explorationPrompt,
444
+ currentStepGoal: explorationPrompt, // Same as overall in exploratory mode
445
+ stepNumber,
446
+ totalSteps: maxSteps,
447
+ completedSteps: [],
448
+ remainingSteps: [],
449
+ currentPageInfo,
450
+ currentURL,
451
+ recentSteps,
452
+ experiences: memory.experiences,
453
+ extractedData: memory.extractedData,
454
+ noteFromPreviousIteration: memory.latestNote
455
+ };
456
+ }
457
+
458
+ private async callExploratoryAgent(
459
+ context: AgentContext,
460
+ jobId: string,
461
+ stepNumber: number
462
+ ): Promise<AgentDecision> {
463
+ const toolDescriptions = this.toolRegistry.generateToolDescriptions();
464
+ const systemPrompt = OrchestratorPrompts.buildExploratorySystemPrompt(toolDescriptions);
465
+ const userPrompt = OrchestratorPrompts.buildExploratoryUserPrompt(
466
+ context,
467
+ context.overallGoal,
468
+ undefined, // NOTE: testDataPrompt is not forwarded here; the AgentContext built above has no field for it
469
+ stepNumber,
470
+ context.totalSteps
471
+ );
472
+
473
+ // Call LLM (same as regular mode)
474
+ const llmRequest = {
475
+ model: DEFAULT_MODEL,
476
+ systemPrompt,
477
+ userPrompt
478
+ };
479
+
480
+ const response = await this.llmFacade.llmProvider.callLLM(llmRequest);
481
+
482
+ // Report token usage
483
+ if (response.usage && this.progressReporter?.onTokensUsed) {
484
+ await this.progressReporter.onTokensUsed({
485
+ jobId,
486
+ stepNumber,
487
+ iteration: 1,
488
+ inputTokens: response.usage.inputTokens,
489
+ outputTokens: response.usage.outputTokens,
490
+ includesImage: false,
491
+ model: DEFAULT_MODEL,
492
+ timestamp: Date.now()
493
+ });
494
+ }
495
+
496
+ // Parse response (same JSON format as regular mode)
497
+ const decision = this.parseAgentDecision(response.content);
498
+ return decision;
499
+ }
500
+ ```
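Stripped of page handling, tools, and callbacks, the loop above has three exits: the agent reports `complete`, the agent reports `stuck`, or the step budget runs out. The sketch below isolates that control flow. It is synchronous for brevity, and `decide` is a hypothetical stand-in for the LLM call plus command execution.

```typescript
type AgentStatus = 'continue' | 'complete' | 'stuck';
type TerminationReason = 'complete' | 'agent_stuck' | 'system_limit';

interface ExplorationOutcome {
  success: boolean;
  iterations: number;
  terminationReason: TerminationReason;
}

// Hypothetical reduction of executeExploration to its termination logic.
function runExplorationLoop(
  decide: (stepNumber: number) => AgentStatus,
  maxSteps = 50
): ExplorationOutcome {
  let stepNumber = 0;
  while (stepNumber < maxSteps) {
    stepNumber++;
    const status = decide(stepNumber);
    if (status === 'complete') {
      return { success: true, iterations: stepNumber, terminationReason: 'complete' };
    }
    if (status === 'stuck') {
      return { success: false, iterations: stepNumber, terminationReason: 'agent_stuck' };
    }
  }
  // Budget exhausted: the plan treats this as success, not failure.
  return { success: true, iterations: stepNumber, terminationReason: 'system_limit' };
}

const finishedEarly = runExplorationLoop((step) => (step >= 3 ? 'complete' : 'continue'));
const ranOutOfBudget = runExplorationLoop(() => 'continue', 5);
console.log(finishedEarly.terminationReason, ranOutOfBudget.terminationReason);
```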
501
+
502
+ ---
503
+
504
+ ## Phase 4: Wire Through ScenarioService & Worker
505
+
506
+ ### 4.1 Update ScenarioService Interface
507
+
508
+ **File**: `runner-core/src/scenario-service.ts`
509
+
510
+ Add an `explorationMode` parameter to `processScenario` (if using the orchestrator):
511
+
512
+ ```typescript
513
+ processScenario(
514
+ scenario: string,
515
+ testName?: string,
516
+ config?: PlaywrightConfig,
517
+ model?: string,
518
+ scenarioFileName?: string,
519
+ existingBrowser?: any,
520
+ existingContext?: any,
521
+ existingPage?: any,
522
+ explorationMode?: ExplorationMode // NEW parameter
523
+ ): string {
524
+ const jobId = `scenario_${Date.now()}_${Math.random().toString(36).substring(2, 11)}`;
525
+
526
+ const job: ScenarioRunJob = {
527
+ id: jobId,
528
+ scenario,
529
+ testName,
530
+ playwrightConfig: config,
531
+ model,
532
+ scenarioFileName,
533
+ existingBrowser,
534
+ existingContext,
535
+ existingPage,
536
+ explorationMode // NEW: Pass through
537
+ };
538
+
539
+ this.jobQueue.push(job);
540
+ this.processNextJob();
541
+
542
+ return jobId;
543
+ }
544
+ ```
545
+
546
+ ### 4.2 Update ScenarioWorker
547
+
548
+ **File**: `runner-core/src/scenario-worker-class.ts`
549
+
550
+ Handle `explorationMode` in `processScenarioJob`: when exploration mode is enabled and the orchestrator is active, call `executeExploration` instead of building a scenario script.
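The branching described for the worker can be sketched as a small pure function. `ScenarioRunJobLike` and `selectExecutionPath` are hypothetical names used for illustration. The real `processScenarioJob` would call `executeExploration` or the existing scenario path based on the result.

```typescript
interface ExplorationMode {
  enabled: boolean;
  explorationPrompt: string;
  testDataPrompt?: string;
  maxExplorationSteps?: number;
}

// Assumed minimal subset of the ScenarioRunJob fields shown in section 4.1.
interface ScenarioRunJobLike {
  id: string;
  scenario: string;
  explorationMode?: ExplorationMode;
}

type ExecutionPath = 'exploration' | 'scenario';

// Exploration runs only when the job enables it AND the orchestrator is active.
function selectExecutionPath(job: ScenarioRunJobLike, useOrchestrator: boolean): ExecutionPath {
  return useOrchestrator && job.explorationMode?.enabled === true ? 'exploration' : 'scenario';
}

const exploratoryJob: ScenarioRunJobLike = {
  id: 'job-1',
  scenario: '',
  explorationMode: { enabled: true, explorationPrompt: 'Explore all menu options' },
};
console.log(selectExecutionPath(exploratoryJob, true));  // exploration
console.log(selectExecutionPath(exploratoryJob, false)); // scenario
```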
551
+
552
+ ---
553
+
554
+ ## Phase 5: Extend SmartTestRunnerCoreV2
555
+
556
+ ### 5.1 Pass Exploration Config to Runner-Core
557
+
558
+ **File**: `scriptservice/smart-test-runner-core-v2.ts`
559
+
560
+ Modify the constructor to accept `explorationMode`:
561
+
562
+ ```typescript
563
+ export interface RunnerConfig {
564
+ playwrightConfig?: string;
565
+ model?: string;
566
+ repairFlexibility?: number;
567
+ callbacks?: RunnerLifecycleCallbacks;
568
+ page?: Page;
569
+ browser?: Browser;
570
+ context?: BrowserContext;
571
+ maxRetriesPerStep?: number;
572
+ explorationMode?: ExplorationMode; // NEW
573
+ }
574
+
575
+ constructor(config: RunnerConfig) {
576
+ this.config = config;
577
+ // ... existing code
578
+
579
+ // Initialize runner-core with exploration mode if provided
580
+ this.runnerCore = new TestChimpService(
581
+ undefined,
582
+ undefined,
583
+ undefined,
584
+ 1,
585
+ llmProvider,
586
+ progressReporter,
587
+ {
588
+ useOrchestrator: true, // Enable orchestrator for exploration
589
+ orchestratorConfig: {
590
+ explorationMode: config.explorationMode // Pass exploration config
591
+ }
592
+ }
593
+ );
594
+ }
595
+ ```
596
+
597
+ Add exploration execution method:
598
+
599
+ ```typescript
600
+ /**
601
+ * Run exploration mode - agent autonomously explores based on prompt
602
+ * Delegates to runner-core's orchestrator which fires onStepComplete for each action
603
+ */
604
+ async runExploration(jobId?: string): Promise<RunExactlyResult> {
605
+ try {
606
+ logger.info(`SmartTestRunnerCoreV2.runExploration: Starting`);
607
+
608
+ if (!this.config.explorationMode?.enabled) {
609
+ throw new Error('Exploration mode not enabled in config');
610
+ }
611
+
612
+ // Call beforeStartTest
613
+ if (this.config.callbacks?.beforeStartTest) {
614
+ await this.config.callbacks.beforeStartTest(this.page, this.browser, this.context);
615
+ }
616
+
617
+ // Call runner-core's orchestrator in exploration mode
618
+ // Orchestrator will autonomously explore and fire onStepComplete for each action
619
+ const result = await this.runnerCore.executeExploration(
620
+ this.page,
621
+ this.config.explorationMode,
622
+ jobId
623
+ );
624
+
625
+ // Call afterEndTest
626
+ if (this.config.callbacks?.afterEndTest) {
627
+ const status = result.success ? 'passed' : 'failed';
628
+ await this.config.callbacks.afterEndTest(status, result.error, this.page);
629
+ }
630
+
631
+ return {
632
+ success: result.success,
633
+ error: result.error
634
+ };
635
+ } catch (error: any) {
636
+ logger.error(`SmartTestRunnerCoreV2.runExploration: Error - ${error.message}`);
637
+
638
+ if (this.config.callbacks?.afterEndTest) {
639
+ await this.config.callbacks.afterEndTest('failed', error.message, this.page);
640
+ }
641
+
642
+ return {
643
+ success: false,
644
+ error: error.message
645
+ };
646
+ }
647
+ }
648
+ ```
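One detail in `runExploration` above: `afterEndTest` is called inside the `try` block and again in the `catch` block. If the callback itself throws on the first call, it is invoked a second time. The sketch below shows one alternative, with the callback invoked exactly once after the outcome is known. It is synchronous for brevity, and `runWithLifecycle` is a hypothetical name.

```typescript
interface RunResult {
  success: boolean;
  error?: string;
}

type AfterEndTest = (status: 'passed' | 'failed', error?: string) => void;

// Hypothetical wrapper: normalize the outcome first, then notify once.
function runWithLifecycle(run: () => RunResult, afterEndTest?: AfterEndTest): RunResult {
  let result: RunResult;
  try {
    result = run();
  } catch (error: unknown) {
    result = { success: false, error: error instanceof Error ? error.message : String(error) };
  }
  // Outside the try block: an exception from the callback propagates to the caller
  // instead of triggering a second invocation.
  afterEndTest?.(result.success ? 'passed' : 'failed', result.error);
  return result;
}

let callbackCalls = 0;
const outcome = runWithLifecycle(
  () => { throw new Error('Exploration mode not enabled in config'); },
  () => { callbackCalls++; }
);
console.log(outcome.success, outcome.error, callbackCalls);
```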
649
+
650
+ ---
651
+
652
+ ## Phase 6: Minimal Changes to Test-Based-Explorer
653
+
654
+ ### 6.1 Extend TestBasedExplorationTask
655
+
656
+ **File**: `scriptservice/utils/models.ts`
657
+
658
+ Add exploration mode fields:
659
+
660
+ ```typescript
661
+ export interface TestBasedExplorationTask {
662
+ // ... existing fields
663
+
664
+ // NEW: Exploration mode support
665
+ explorationMode?: {
666
+ enabled: boolean;
667
+ explorationPrompt: string;
668
+ testDataPrompt?: string;
669
+ };
670
+ }
671
+ ```
672
+
673
+ ### 6.2 Update Test-Based-Explorer Constructor
674
+
675
+ **File**: `scriptservice/workers/test-based-explorer.ts`
676
+
677
+ Store exploration config:
678
+
679
+ ```typescript
680
+ private explorationMode?: { enabled: boolean; explorationPrompt: string; testDataPrompt?: string; };
681
+
682
+ constructor(task: TestBasedExplorationTask, ...callbacks) {
683
+ // ... existing code
684
+ this.explorationMode = task.explorationMode;
685
+ }
686
+ ```
687
+
688
+ ### 6.3 Modify run() Method - Minimal Changes
689
+
690
+ **File**: `scriptservice/workers/test-based-explorer.ts`
691
+
692
+ Change around lines 420-690:
693
+
694
+ ```typescript
695
+ public async run() {
696
+ let testFullName = `${this.testName?.suite}#${this.testName?.name}`;
697
+ try {
698
+ logger.info("Running test...");
699
+ await this.setup();
700
+ await this.journeyReporter?.startJourney();
701
+ } catch (error) {
702
+ logger.error("Exception in worker:", error);
703
+ await updateJourneyExecutionStatus(this.invocationId, JourneyExecutionStatus.EXCEPTION_IN_JOURNEY_EXECUTION);
704
+ return {
705
+ status: 'error',
706
+ error: error instanceof Error ? error.message : String(error),
707
+ };
708
+ }
709
+
710
+ let startTimeMillis: number = Date.now();
711
+ let endReason = "";
712
+
713
+ try {
714
+ // Parse test steps OR use empty array for exploration
715
+ let codeUnits: CodeUnit[] = [];
716
+ if (!this.explorationMode?.enabled) {
717
+ // TEST-BASED mode: Parse smart test into code units
718
+ if (!this.testId) {
719
+ logger.error('No testId provided');
720
+ return { result: null };
721
+ }
722
+ codeUnits = await this.parseSmartTestIntoCodeUnits();
723
+ if (codeUnits.length === 0) {
724
+ logger.error(`No codeUnits found for test ${this.testId}`);
725
+ return { result: null };
726
+ }
727
+ codeUnits = codeUnits.filter(cu => !isNonActionCodeUnit(cu));
728
+ } else {
729
+ // EXPLORATION mode: No pre-defined steps
730
+ logger.info(`EXPLORATION MODE: ${this.explorationMode.explorationPrompt}`);
731
+ }
732
+
733
+ // Convert to TestStepWithId format
734
+ const stepsWithId: TestStepWithId[] = codeUnits.map(cu => ({
735
+ stepId: cu.id || uuidv4(),
736
+ description: cu.description,
737
+ code: cu.code,
738
+ }));
739
+
740
+ // Track state for step callbacks (SAME FOR BOTH MODES)
741
+ const stepTimestamps = new Map<string, number>();
742
+
743
+ // Define callbacks (IDENTICAL FOR BOTH TEST-BASED AND EXPLORATION)
744
+ const handleBeforeStartTest = async (page: Page, browser: Browser, context: BrowserContext) => {
745
+ await this.initializeWatcherAndListeners();
746
+ logger.info('Test setup complete, starting execution');
747
+ };
748
+
749
+ const handleBeforeStepStart = async (step: TestStepWithId, page: Page) => {
750
+ const stepId = step.stepId || '';
751
+ const stepTimestamp = Date.now();
752
+ stepTimestamps.set(stepId, stepTimestamp);
753
+ this.journeyReporter?.startStep(stepId);
754
+ this.currentStepCodeExecutions = [];
755
+ };
756
+
757
+ const handleStepComplete = async (step: TestStepWithId, isRepairStep: boolean, repairForStepId: string | undefined, error: string | undefined, page: Page) => {
758
+ // ... EXISTING CODE (lines 466-648) - NO CHANGES
759
+ // All analytics, screenshot, bug capture, screen state detection - IDENTICAL
760
+ };
761
+
762
+ const handleAfterEndTest = async (status: 'passed' | 'failed', error: string | undefined, page: Page) => {
763
+ // ... EXISTING CODE (lines 650-666) - NO CHANGES
764
+ };
765
+
766
+ // Create runner config (SAME FOR BOTH MODES)
767
+ const runnerConfig: RunnerConfig = {
768
+ page: this.page!,
769
+ browser: this.browser!,
770
+ context: this.context!,
771
+ maxRetriesPerStep: 3,
772
+ repairFlexibility: this.aiHealingSettings?.freedomLevel,
773
+ explorationMode: this.explorationMode, // NEW: Pass exploration config
774
+ callbacks: {
775
+ beforeStartTest: handleBeforeStartTest,
776
+ beforeStepStart: handleBeforeStepStart,
777
+ onStepComplete: handleStepComplete,
778
+ afterEndTest: handleAfterEndTest
779
+ }
780
+ };
781
+
782
+ // Create runner (SAME FOR BOTH MODES)
783
+ const runner = new SmartTestRunnerCoreV2(runnerConfig);
784
+ logger.info(`test-based-explorer: using SmartTestRunnerCoreV2 (runner-core)`);
785
+
786
+ // Execute based on mode
787
+ let result;
788
+ if (this.explorationMode?.enabled) {
789
+ // EXPLORATION: Let orchestrator decide next steps autonomously
790
+ result = await runner.runExploration(this.invocationId);
791
+ } else {
792
+ // TEST-BASED: Execute predefined steps with repair
793
+ result = await runner.runWithRepair(stepsWithId, this.invocationId, false, 0);
794
+ }
795
+
796
+ if (!result.success) {
797
+ throw new Error(`Execution failed: ${result.error}`);
798
+ }
799
+
800
+ } catch (error) {
801
+ // ... EXISTING ERROR HANDLING - NO CHANGES
802
+ } finally {
803
+ // ... EXISTING CLEANUP - NO CHANGES
804
+ }
805
+ }
806
+ ```
807
+
808
+ **That's it!** Only about 10 lines change in test-based-explorer:
809
+ 1. Store explorationMode in constructor
810
+ 2. Skip parsing steps if exploration mode
811
+ 3. Pass explorationMode to RunnerConfig
812
+ 4. Call runExploration() instead of runWithRepair()
813
+
814
+ All callbacks remain identical: the orchestrator fires them for each autonomous action.
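The claim that one callback set serves both modes can be illustrated with a small synchronous sketch. Everything here (`RunnerSketch`, `makeRunner`, `execute`, and the fixed `auto-1`/`auto-2` actions) is hypothetical and only mimics the dispatch between `runExploration()` and `runWithRepair()`. It is not the real `SmartTestRunnerCoreV2` API.

```typescript
interface StepSketch {
  stepId: string;
  description: string;
}

interface RunResultSketch {
  success: boolean;
  executed: string[];
}

// Hypothetical runner surface with the two entry points named in the plan.
interface RunnerSketch {
  runExploration(): RunResultSketch;
  runWithRepair(steps: StepSketch[]): RunResultSketch;
}

// Builds a fake runner that reports every executed step to one callback.
function makeRunner(onStepComplete: (stepId: string) => void): RunnerSketch {
  return {
    runWithRepair(steps) {
      const executed: string[] = [];
      for (const step of steps) {
        executed.push(step.stepId);
        onStepComplete(step.stepId);
      }
      return { success: true, executed };
    },
    runExploration() {
      // Pretend the orchestrator chose two actions on its own.
      const executed = ["auto-1", "auto-2"];
      for (const id of executed) {
        onStepComplete(id);
      }
      return { success: true, executed };
    },
  };
}

// Mode dispatch: the only branch the explorer needs.
function execute(
  explorationEnabled: boolean,
  steps: StepSketch[],
  runner: RunnerSketch
): RunResultSketch {
  return explorationEnabled
    ? runner.runExploration()
    : runner.runWithRepair(steps);
}

const seen: string[] = [];
const runner = makeRunner((stepId) => seen.push(stepId));
execute(false, [{ stepId: "s1", description: "Click login" }], runner);
execute(true, [], runner);
```

After both calls, `seen` holds the predefined step id followed by the two autonomous ones, all delivered through the same callback.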
815
+
816
+ ---
817
+
818
+ ## Phase 7: Wire Up AppExplorer
819
+
820
+ ### 7.1 Handle PromptConfig in AppExplorer
821
+
822
+ **File**: `scriptservice/workers/app-explorer.ts`
823
+
824
+ Modify the code around line 154:
825
+
826
+ ```typescript
827
+ if (this.config.promptConfig) {
828
+ // Prompt-based exploration - use test-based-explorer with exploration mode
829
+ logger.info("Running prompt-based exploration");
830
+ await this.runPromptBasedExploration();
831
+ }
832
+
833
+ // Add a new method to the AppExplorer class (separate from the branch above)
834
+ private async runPromptBasedExploration(): Promise<void> {
835
+ const promptConfig = this.config.promptConfig!;
836
+
837
+ // Create exploration task (reuses TestBasedExplorationTask)
838
+ const task: TestBasedExplorationTask = {
839
+ invocationId: uuidv4(),
840
+ invocationBatchId: this.explorationId,
841
+ projectId: this.task.projectId,
842
+ appReleaseId: this.config.webappReleaseVersion ?? "default",
843
+ autohealEnabled: false, // Not needed for exploration
844
+ sessionRecordApiKey: this.task.sessionRecordApiKey,
845
+ ingressEndpoint: await ConfigService.get('test_service_endpoint'),
846
+ enableTestchimpSdkOnExec: false,
847
+ urlRegexToCapture: ".*",
848
+ playwrightConfig: this.config.playwrightConfig,
849
+ creditBudget: this.getCreditsForNextJourney(),
850
+ bugCaptureSettings: this.config.bugCaptureSettings,
851
+ viewportConfig: this.config.viewportConfig,
852
+
853
+ // NEW: Enable exploration mode
854
+ explorationMode: {
855
+ enabled: true,
856
+ explorationPrompt: promptConfig.explorePrompt || "Explore the application",
857
+ testDataPrompt: promptConfig.testDataPrompt
858
+ }
859
+ };
860
+
861
+ // Use test-based-explorer (it handles both modes transparently)
862
+ const explorer = new TestBasedExplorer(
863
+ task,
864
+ this.onBugsDiscovered.bind(this),
865
+ this.onScreenStateVisited.bind(this),
866
+ this.shouldVisualAnalyzeScreen?.bind(this),
867
+ this.getVisualAnalysisDecision?.bind(this),
868
+ this.shouldAnalyzeApiCall?.bind(this),
869
+ this.reportVisualAnalyzedScreen?.bind(this),
870
+ this.reportVisualAnalyzedScreenshot?.bind(this),
871
+ this.reportAnalyzedApiCalls?.bind(this),
872
+ this.getKnownScreenStates?.bind(this),
873
+ this.updateInputValueCombinations?.bind(this),
874
+ this.reportPageNavigation?.bind(this),
875
+ this.onTokensUsed?.bind(this),
876
+ this.updateLatestScreenshot?.bind(this)
877
+ );
878
+
879
+ await explorer.run();
880
+ }
881
+ ```
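The mapping from `promptConfig` to `explorationMode` is the only new logic in this phase, so it may be worth isolating. The sketch below assumes a hypothetical `PromptConfigSketch` shape with optional `explorePrompt` and `testDataPrompt` fields. The helper name `toExplorationMode` is also an illustration, not part of the plan.

```typescript
interface ExplorationMode {
  enabled: boolean;
  explorationPrompt: string;
  testDataPrompt?: string;
}

// Hypothetical shape; the real PromptConfig may carry more fields.
interface PromptConfigSketch {
  explorePrompt?: string;
  testDataPrompt?: string;
}

const DEFAULT_EXPLORATION_PROMPT = "Explore the application";

// Mirrors the object literal in runPromptBasedExploration().
function toExplorationMode(promptConfig: PromptConfigSketch): ExplorationMode {
  return {
    enabled: true,
    // `||` (not `??`) so an empty string also falls back to the default.
    explorationPrompt: promptConfig.explorePrompt || DEFAULT_EXPLORATION_PROMPT,
    testDataPrompt: promptConfig.testDataPrompt,
  };
}

const withPrompt = toExplorationMode({ explorePrompt: "Try the signup flow" });
const withEmptyPrompt = toExplorationMode({ explorePrompt: "" });
const withTestData = toExplorationMode({ testDataPrompt: "Use user a@b.c" });
```

Using `||` here matches the plan's snippet. If an empty prompt should be passed through unchanged, `??` would be the operator to use instead.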
882
+
883
+ ---
884
+
885
+ ## Summary
886
+
887
+ This plan enables exploratory mode with minimal changes:
888
+
889
+ 1. **Runner-Core Extensions**:
890
+ - Add `ExplorationMode` type (with `explorationStrategy` removed)
891
+ - Rename `explorationContext` to `testDataPrompt`
892
+ - Reuse existing `JourneyMemory` fields
893
+ - Add exploratory prompts that guide agent to use memory creatively
894
+ - Add `executeExploration` method to orchestrator
895
+
896
+ 2. **Autonomous Decision-Making**:
897
+ - Agent decides next actions based on: exploration prompt + page state + memory
898
+ - Agent can stop early once the goal is achieved (it does not wait for maxSteps)
899
+ - Agent tracks progress using existing memory fields (experiences, extractedData)
900
+
901
+ 3. **Reuse Infrastructure**:
902
+ - Same tools (extract_data, screenshot, recall_history)
903
+ - Same command execution
904
+ - Same callback system (onStepComplete fires for each autonomous action)
905
+
906
+ 4. **Minimal Test-Based-Explorer Changes** (~10 lines):
907
+ - Check if explorationMode enabled
908
+ - Skip step parsing if exploration
909
+ - Pass explorationMode to runner config
910
+ - Call runExploration() instead of runWithRepair()
911
+ - **All callbacks identical** - works transparently
912
+
913
+ 5. **Preserve Analytics**:
914
+ - All bug capture, screenshot analysis, screen state detection, console/network analytics work identically
915
+ - test-based-explorer doesn't know if steps were pre-defined or autonomous
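The autonomous loop summarised in point 2 can be sketched as follows. This is a simplified illustration under stated assumptions. `Decision`, `MemorySketch`, and `explore` are hypothetical names, the `decide` function stands in for the LLM-backed orchestrator, and the memory shape only borrows the `experiences` and `extractedData` field names from the summary.

```typescript
// Hypothetical decision type: either take an action or declare the goal met.
type Decision =
  | { kind: "act"; action: string }
  | { kind: "done"; reason: string };

// Hypothetical, reduced view of JourneyMemory.
interface MemorySketch {
  experiences: string[];
  extractedData: Record<string, string>;
}

interface ExplorationOutcome {
  steps: number;
  endReason: string;
  memory: MemorySketch;
}

// Runs until the agent reports "done" or the step budget is exhausted.
function explore(
  decide: (memory: MemorySketch) => Decision,
  maxSteps: number
): ExplorationOutcome {
  const memory: MemorySketch = { experiences: [], extractedData: {} };
  for (let i = 0; i < maxSteps; i++) {
    const decision = decide(memory);
    if (decision.kind === "done") {
      // Early stop: the goal was reached before maxSteps.
      return { steps: i, endReason: decision.reason, memory };
    }
    // Record the action so later decisions can build on it.
    memory.experiences.push(decision.action);
  }
  return { steps: maxSteps, endReason: "maxSteps reached", memory };
}

// Toy policy: stop once three actions have been recorded.
const outcome = explore(
  (memory) =>
    memory.experiences.length >= 3
      ? { kind: "done", reason: "goal achieved" }
      : { kind: "act", action: `action-${memory.experiences.length + 1}` },
  10
);
```

With this toy policy the loop ends after three actions even though the budget allows ten, which is the early-stop behaviour the summary describes.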
916
+
917
+ ### Implementation Order
918
+
919
+ 1. Add types to runner-core (`ExplorationMode`, extend `AgentConfig`)
920
+ 2. Add exploratory prompts to orchestrator-prompts.ts
921
+ 3. Add `executeExploration` to orchestrator-agent.ts
922
+ 4. Add `explorationMode` to `RunnerConfig` in SmartTestRunnerCoreV2
923
+ 5. Add `runExploration` method to SmartTestRunnerCoreV2
924
+ 6. Add `explorationMode` field to `TestBasedExplorationTask`
925
+ 7. Modify test-based-explorer.ts `run()` method (~10 lines)
926
+ 8. Add `runPromptBasedExploration` to app-explorer.ts
927
+ 9. Test with sample exploration prompts
928
+