testchimp-runner-core 0.0.33 → 0.0.35
- package/dist/execution-service.d.ts +1 -4
- package/dist/execution-service.d.ts.map +1 -1
- package/dist/execution-service.js +155 -468
- package/dist/execution-service.js.map +1 -1
- package/dist/index.d.ts +3 -1
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +11 -1
- package/dist/index.js.map +1 -1
- package/dist/llm-facade.d.ts.map +1 -1
- package/dist/llm-facade.js +7 -7
- package/dist/llm-facade.js.map +1 -1
- package/dist/llm-provider.d.ts +9 -0
- package/dist/llm-provider.d.ts.map +1 -1
- package/dist/model-constants.d.ts +16 -5
- package/dist/model-constants.d.ts.map +1 -1
- package/dist/model-constants.js +17 -6
- package/dist/model-constants.js.map +1 -1
- package/dist/orchestrator/decision-parser.d.ts +18 -0
- package/dist/orchestrator/decision-parser.d.ts.map +1 -0
- package/dist/orchestrator/decision-parser.js +127 -0
- package/dist/orchestrator/decision-parser.js.map +1 -0
- package/dist/orchestrator/index.d.ts +4 -2
- package/dist/orchestrator/index.d.ts.map +1 -1
- package/dist/orchestrator/index.js +15 -2
- package/dist/orchestrator/index.js.map +1 -1
- package/dist/orchestrator/orchestrator-agent.d.ts +17 -22
- package/dist/orchestrator/orchestrator-agent.d.ts.map +1 -1
- package/dist/orchestrator/orchestrator-agent.js +708 -577
- package/dist/orchestrator/orchestrator-agent.js.map +1 -1
- package/dist/orchestrator/orchestrator-prompts.d.ts +32 -0
- package/dist/orchestrator/orchestrator-prompts.d.ts.map +1 -0
- package/dist/orchestrator/orchestrator-prompts.js +737 -0
- package/dist/orchestrator/orchestrator-prompts.js.map +1 -0
- package/dist/orchestrator/page-som-handler.d.ts +106 -0
- package/dist/orchestrator/page-som-handler.d.ts.map +1 -0
- package/dist/orchestrator/page-som-handler.js +1353 -0
- package/dist/orchestrator/page-som-handler.js.map +1 -0
- package/dist/orchestrator/som-types.d.ts +149 -0
- package/dist/orchestrator/som-types.d.ts.map +1 -0
- package/dist/orchestrator/som-types.js +87 -0
- package/dist/orchestrator/som-types.js.map +1 -0
- package/dist/orchestrator/tool-registry.d.ts +2 -0
- package/dist/orchestrator/tool-registry.d.ts.map +1 -1
- package/dist/orchestrator/tool-registry.js.map +1 -1
- package/dist/orchestrator/tools/index.d.ts +5 -1
- package/dist/orchestrator/tools/index.d.ts.map +1 -1
- package/dist/orchestrator/tools/index.js +9 -2
- package/dist/orchestrator/tools/index.js.map +1 -1
- package/dist/orchestrator/tools/refresh-som-markers.d.ts +12 -0
- package/dist/orchestrator/tools/refresh-som-markers.d.ts.map +1 -0
- package/dist/orchestrator/tools/refresh-som-markers.js +64 -0
- package/dist/orchestrator/tools/refresh-som-markers.js.map +1 -0
- package/dist/orchestrator/tools/verify-action-result.d.ts +17 -0
- package/dist/orchestrator/tools/verify-action-result.d.ts.map +1 -0
- package/dist/orchestrator/tools/verify-action-result.js +140 -0
- package/dist/orchestrator/tools/verify-action-result.js.map +1 -0
- package/dist/orchestrator/tools/view-previous-screenshot.d.ts +15 -0
- package/dist/orchestrator/tools/view-previous-screenshot.d.ts.map +1 -0
- package/dist/orchestrator/tools/view-previous-screenshot.js +92 -0
- package/dist/orchestrator/tools/view-previous-screenshot.js.map +1 -0
- package/dist/orchestrator/types.d.ts +49 -1
- package/dist/orchestrator/types.d.ts.map +1 -1
- package/dist/orchestrator/types.js +11 -1
- package/dist/orchestrator/types.js.map +1 -1
- package/dist/prompts.d.ts.map +1 -1
- package/dist/prompts.js +40 -34
- package/dist/prompts.js.map +1 -1
- package/dist/scenario-service.d.ts +5 -0
- package/dist/scenario-service.d.ts.map +1 -1
- package/dist/scenario-service.js +17 -0
- package/dist/scenario-service.js.map +1 -1
- package/dist/scenario-worker-class.d.ts +4 -0
- package/dist/scenario-worker-class.d.ts.map +1 -1
- package/dist/scenario-worker-class.js +21 -3
- package/dist/scenario-worker-class.js.map +1 -1
- package/dist/testing/agent-tester.d.ts +35 -0
- package/dist/testing/agent-tester.d.ts.map +1 -0
- package/dist/testing/agent-tester.js +84 -0
- package/dist/testing/agent-tester.js.map +1 -0
- package/dist/testing/ref-translator-tester.d.ts +44 -0
- package/dist/testing/ref-translator-tester.d.ts.map +1 -0
- package/dist/testing/ref-translator-tester.js +104 -0
- package/dist/testing/ref-translator-tester.js.map +1 -0
- package/dist/utils/coordinate-converter.d.ts +32 -0
- package/dist/utils/coordinate-converter.d.ts.map +1 -0
- package/dist/utils/coordinate-converter.js +130 -0
- package/dist/utils/coordinate-converter.js.map +1 -0
- package/dist/utils/hierarchical-selector.d.ts +47 -0
- package/dist/utils/hierarchical-selector.d.ts.map +1 -0
- package/dist/utils/hierarchical-selector.js +212 -0
- package/dist/utils/hierarchical-selector.js.map +1 -0
- package/dist/utils/page-info-retry.d.ts +14 -0
- package/dist/utils/page-info-retry.d.ts.map +1 -0
- package/dist/utils/page-info-retry.js +60 -0
- package/dist/utils/page-info-retry.js.map +1 -0
- package/dist/utils/page-info-utils.d.ts +1 -0
- package/dist/utils/page-info-utils.d.ts.map +1 -1
- package/dist/utils/page-info-utils.js +46 -18
- package/dist/utils/page-info-utils.js.map +1 -1
- package/dist/utils/ref-attacher.d.ts +21 -0
- package/dist/utils/ref-attacher.d.ts.map +1 -0
- package/dist/utils/ref-attacher.js +149 -0
- package/dist/utils/ref-attacher.js.map +1 -0
- package/dist/utils/ref-translator.d.ts +49 -0
- package/dist/utils/ref-translator.d.ts.map +1 -0
- package/dist/utils/ref-translator.js +276 -0
- package/dist/utils/ref-translator.js.map +1 -0
- package/package.json +1 -1
- package/plandocs/BEFORE_AFTER_VERIFICATION.md +148 -0
- package/plandocs/COORDINATE_MODE_DIAGNOSIS.md +144 -0
- package/plandocs/IMPLEMENTATION_STATUS.md +108 -0
- package/plandocs/PHASE_1_COMPLETE.md +165 -0
- package/plandocs/PHASE_1_SUMMARY.md +184 -0
- package/plandocs/PROMPT_OPTIMIZATION_ANALYSIS.md +120 -0
- package/plandocs/PROMPT_SANITY_CHECK.md +120 -0
- package/plandocs/SESSION_SUMMARY_v0.0.33.md +151 -0
- package/plandocs/TROUBLESHOOTING_SESSION.md +72 -0
- package/plandocs/VISUAL_AGENT_EVOLUTION_PLAN.md +396 -0
- package/plandocs/WHATS_NEW_v0.0.33.md +183 -0
- package/plandocs/exploratory-mode-support-v2.plan.md +953 -0
- package/plandocs/exploratory-mode-support.plan.md +928 -0
- package/plandocs/journey-id-tracking-addendum.md +227 -0
- package/src/execution-service.ts +179 -596
- package/src/index.ts +10 -0
- package/src/llm-facade.ts +8 -8
- package/src/llm-provider.ts +11 -1
- package/src/model-constants.ts +17 -5
- package/src/orchestrator/decision-parser.ts +139 -0
- package/src/orchestrator/index.ts +27 -2
- package/src/orchestrator/orchestrator-agent.ts +868 -623
- package/src/orchestrator/orchestrator-prompts.ts +786 -0
- package/src/orchestrator/page-som-handler.ts +1565 -0
- package/src/orchestrator/som-types.ts +188 -0
- package/src/orchestrator/tool-registry.ts +2 -0
- package/src/orchestrator/tools/index.ts +5 -1
- package/src/orchestrator/tools/refresh-som-markers.ts +69 -0
- package/src/orchestrator/tools/verify-action-result.ts +159 -0
- package/src/orchestrator/tools/view-previous-screenshot.ts +103 -0
- package/src/orchestrator/types.ts +95 -4
- package/src/prompts.ts +40 -34
- package/src/scenario-service.ts +20 -0
- package/src/scenario-worker-class.ts +30 -4
- package/src/utils/coordinate-converter.ts +162 -0
- package/src/utils/page-info-retry.ts +65 -0
- package/src/utils/page-info-utils.ts +53 -18
- package/testchimp-runner-core-0.0.35.tgz +0 -0
- /package/{CREDIT_CALLBACK_ARCHITECTURE.md → plandocs/CREDIT_CALLBACK_ARCHITECTURE.md} +0 -0
- /package/{INTEGRATION_COMPLETE.md → plandocs/INTEGRATION_COMPLETE.md} +0 -0
- /package/{VISION_DIAGNOSTICS_IMPROVEMENTS.md → plandocs/VISION_DIAGNOSTICS_IMPROVEMENTS.md} +0 -0
- /package/{RELEASE_0.0.26.md → releasenotes/RELEASE_0.0.26.md} +0 -0
- /package/{RELEASE_0.0.27.md → releasenotes/RELEASE_0.0.27.md} +0 -0
- /package/{RELEASE_0.0.28.md → releasenotes/RELEASE_0.0.28.md} +0 -0
@@ -0,0 +1,928 @@
# Add Exploratory Mode Support to Runner-Core

## Overview

Enable runner-core's orchestrator to support exploratory mode where the agent autonomously decides next actions based on a high-level exploration prompt, rather than following pre-defined test steps. This reuses existing infrastructure (tools, memory, command execution) with modified prompting strategies.

**Key Principle**: The orchestrator fires `onStepComplete` for each autonomous action, so test-based-explorer doesn't know or care if steps were pre-defined or autonomously decided. All analytics, bug capture, and screen state detection work identically.

---

## Phase 1: Extend Runner-Core Types & Config

### 1.1 Add Exploration Mode Types

**File**: `runner-core/src/orchestrator/types.ts`

Add new interface:

```typescript
/**
 * Exploration mode configuration
 */
export interface ExplorationMode {
  enabled: boolean; // Whether exploration mode is active
  explorationPrompt: string; // High-level goal: "Explore all menu options"
  testDataPrompt?: string; // Test data, credentials context
  maxExplorationSteps?: number; // Budget limit (default: 50) - agent can stop earlier
}
```

Add to `AgentConfig`:

```typescript
export interface AgentConfig {
  // ... existing fields

  // Exploration mode (NEW)
  explorationMode?: ExplorationMode;
}
```

Update `DEFAULT_AGENT_CONFIG`:

```typescript
export const DEFAULT_AGENT_CONFIG: Required<AgentConfig> = {
  // ... existing defaults

  explorationMode: {
    enabled: false,
    explorationPrompt: '',
    testDataPrompt: undefined,
    maxExplorationSteps: 50
  }
};
```
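
Because `explorationMode` is a nested object, a shallow spread of a caller's config over `DEFAULT_AGENT_CONFIG` would discard any nested defaults the caller leaves out. The sketch below shows one way to resolve the nested config. It assumes a hypothetical helper `resolveExplorationMode` and a constant `DEFAULT_EXPLORATION_MODE`, neither of which is part of runner-core, and it redeclares `ExplorationMode` locally so it runs on its own.

```typescript
// Local copy of the interface from section 1.1, so the sketch is self-contained.
interface ExplorationMode {
  enabled: boolean;
  explorationPrompt: string;
  testDataPrompt?: string;
  maxExplorationSteps?: number;
}

// Assumed defaults, mirroring the values proposed for DEFAULT_AGENT_CONFIG.
// testDataPrompt is simply left unset here.
const DEFAULT_EXPLORATION_MODE: ExplorationMode = {
  enabled: false,
  explorationPrompt: '',
  maxExplorationSteps: 50
};

// Hypothetical helper: fields the caller omits fall back to the defaults.
function resolveExplorationMode(partial?: Partial<ExplorationMode>): ExplorationMode {
  return { ...DEFAULT_EXPLORATION_MODE, ...partial };
}

const resolved = resolveExplorationMode({
  enabled: true,
  explorationPrompt: 'Explore all menu options'
});
console.log(resolved.maxExplorationSteps); // 50
```

Note that a field passed explicitly as `undefined` would still override its default under this spread, so callers should omit fields rather than set them to `undefined`.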

### 1.2 Reuse Existing Journey Memory

**No code changes needed** - existing `JourneyMemory` fields are sufficient:

- **`history`**: Agent reviews to understand visited screens/areas
- **`experiences`**: Used for BOTH app patterns AND exploration progress
  - Examples: "Dashboard fully explored - tested all widgets"
  - Examples: "Discovered Admin menu but not explored yet"
- **`extractedData`**: Store discovered areas with special keys
  - Examples: `{ "menuItems": "Dashboard,Settings,Admin,Profile" }`
  - Examples: `{ "explored": "Dashboard,Settings" }`
- **`latestNote`**: Tactical memory for exploration strategy

The exploratory prompts guide the agent to use these fields appropriately.
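
To illustrate how the comma-separated `extractedData` keys above could be turned into a list of areas still to visit, here is a minimal sketch. It assumes `extractedData` is a flat string-to-string map, as the examples suggest. The helpers `parseList` and `unexploredAreas` are hypothetical and exist only for this illustration.

```typescript
// Hypothetical helper: split a comma-separated value into trimmed, non-empty names.
function parseList(value: string | undefined): string[] {
  return (value ?? '')
    .split(',')
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Hypothetical helper: discovered menu items that are not yet marked as explored.
// The key names "menuItems" and "explored" follow the examples in this section.
function unexploredAreas(extractedData: Record<string, string>): string[] {
  const explored = new Set(parseList(extractedData['explored']));
  return parseList(extractedData['menuItems']).filter((item) => !explored.has(item));
}

const remaining = unexploredAreas({
  menuItems: 'Dashboard,Settings,Admin,Profile',
  explored: 'Dashboard,Settings'
});
console.log(remaining.join(',')); // Admin,Profile
```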

---

## Phase 2: Add Exploratory Prompts

### 2.1 Create Exploratory System Prompt

**File**: `runner-core/src/orchestrator/orchestrator-prompts.ts`

Add new method:

```typescript
static buildExploratorySystemPrompt(toolDescriptions: string): string {
  return `You are an autonomous exploration agent that discovers and tests web application features.

${toolDescriptions}

YOUR RESPONSE FORMAT - Output JSON matching this interface:

interface AgentDecisionLLMResponse {
  status: string; // "continue" | "complete" | "stuck"
  reasoning: string; // What you're exploring and why
  commands?: string[]; // Playwright commands to execute
  commandReasoning?: string; // Why these commands
  toolCalls?: Array<{ // Tools to call (extract_data for menus, etc.)
    name: string;
    params: Record<string, any>;
  }>;
  toolReasoning?: string;
  needsToolResults?: boolean;
  noteToFutureSelf?: string;
  coordinateAction?: { ... };
  experiences?: string[]; // Use for BOTH app patterns AND exploration progress
  blockerDetected?: { ... };
}

EXPLORATION MODE GUIDELINES:

1. **GOAL-DRIVEN EXPLORATION**: Follow the exploration prompt as your north star
   - "Explore all menu options" → Extract menu items, visit each one systematically
   - "Test dashboard features" → Discover widgets/interactions, test them thoroughly
   - "Find bugs in settings" → Navigate to settings, try various configurations

2. **AUTONOMOUS DISCOVERY**: You decide what to explore next based on:
   - The exploration prompt (main goal)
   - Current page state (what's available now)
   - Journey history (what's been explored - check history, experiences, extractedData)
   - Discovered but unvisited areas (stored in extractedData or experiences)

3. **SYSTEMATIC EXPLORATION**:
   - Use extract_data tool to discover elements (menus, buttons, links)
   - Store discoveries in extractedData: { "menuItems": "Dashboard,Settings,Admin" }
   - Track progress in experiences: "Explored Dashboard - all widgets working"
   - Check history to avoid re-visiting same areas
   - Prioritize unexplored areas

4. **CREATIVE TESTING**: Test functionality, don't just navigate
   - Try different input combinations
   - Explore edge cases (empty inputs, max lengths, special characters)
   - Verify features work as expected
   - Look for visual bugs, console errors, broken functionality

5. **BLOCKER HANDLING**: Clear obstacles autonomously
   - Cookie modals → dismiss with blockerDetected.clearingCommands
   - Tour popups → close them
   - Login required → use credentials from test data prompt
   - Navigation blockers → clear before continuing

6. **STATUS DECISIONS** (CRITICAL):
   - "continue": More exploration needed to achieve goal
   - "complete": Exploration goal ACHIEVED (all menus explored, features tested, OR budget running low)
   - "stuck": Cannot proceed (auth permanently blocked, critical error)

   You should mark "complete" when:
   - You've achieved the exploration goal (e.g., all menus explored)
   - You've made good progress and are approaching budget limit
   - Further exploration would be repetitive

   DON'T wait to hit maxExplorationSteps - stop when goal is met!

7. **MEMORY USAGE**:
   - experiences: Both app patterns AND exploration notes
     Example: "Settings menu requires admin role to access"
     Example: "Explored Dashboard completely - 5 widgets all functional"
   - extractedData: Discovered elements and tracking
     Example: { "menuItems": "Dashboard,Settings,Admin,Profile" }
     Example: { "exploredMenus": "Dashboard,Settings" }
   - history: Review to see what actions were taken
   - noteToFutureSelf: Tactical plans for next iteration

8. **TOOLS FOR EXPLORATION**:
   - extract_data: Discover menus, links, interactive elements
   - take_screenshot: Understand visual layout when DOM unclear
   - recall_history: Check what was already explored

CRITICAL: You're fully autonomous - no step-by-step instructions will be provided.
YOU decide the exploration path based on the goal, current state, and memory.`;
}
```
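
The prompt declares `status` as a plain `string` while the loop in Phase 3 branches on three specific values, so the response has to be validated somewhere. The sketch below is one way to do that defensively, under assumed names: `parseExplorationDecision` and `isStatus` are hypothetical and are not the package's own `parseAgentDecision`. Treating malformed output as `"stuck"` is a design choice of this sketch, not something the plan prescribes.

```typescript
type ExplorationStatus = 'continue' | 'complete' | 'stuck';

// Reduced decision shape: only the fields this sketch validates.
interface ParsedDecision {
  status: ExplorationStatus;
  reasoning: string;
  commands: string[];
}

function isStatus(value: unknown): value is ExplorationStatus {
  return value === 'continue' || value === 'complete' || value === 'stuck';
}

// Hypothetical validator: malformed output becomes "stuck" instead of being trusted.
function parseExplorationDecision(raw: string): ParsedDecision {
  const stuck = (why: string): ParsedDecision => ({ status: 'stuck', reasoning: why, commands: [] });

  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return stuck('Response was not valid JSON');
  }
  if (typeof data !== 'object' || data === null) {
    return stuck('Response was not a JSON object');
  }

  const obj = data as Record<string, unknown>;
  const status = obj['status'];
  if (!isStatus(status)) {
    return stuck(`Unknown status: ${String(status)}`);
  }

  const rawCommands = obj['commands'];
  const rawReasoning = obj['reasoning'];
  return {
    status,
    reasoning: typeof rawReasoning === 'string' ? rawReasoning : '',
    // Non-string entries are dropped rather than executed.
    commands: Array.isArray(rawCommands)
      ? rawCommands.filter((c): c is string => typeof c === 'string')
      : []
  };
}

const parsed = parseExplorationDecision('{"status":"continue","reasoning":"Open Settings","commands":["a", 1]}');
console.log(parsed.status, parsed.commands.length);
```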

### 2.2 Create Exploratory User Prompt

Add method to `OrchestratorPrompts`:

```typescript
static buildExploratoryUserPrompt(
  context: AgentContext,
  explorationPrompt: string,
  testDataPrompt?: string,
  stepNumber?: number,
  maxSteps?: number
): string {
  const parts: string[] = [];

  parts.push('=== EXPLORATION CONTEXT ===\n');
  parts.push(`🎯 EXPLORATION GOAL: ${explorationPrompt}`);

  if (testDataPrompt) {
    parts.push(`📋 TEST DATA/CREDENTIALS: ${testDataPrompt}`);
  }

  if (stepNumber && maxSteps) {
    parts.push(`📊 PROGRESS: Step ${stepNumber}/${maxSteps} (you can complete earlier if goal met)\n`);
  }

  // Show what's been explored (from extractedData and experiences)
  if (context.extractedData && Object.keys(context.extractedData).length > 0) {
    parts.push(`\n💾 DISCOVERED DATA:`);
    for (const [key, value] of Object.entries(context.extractedData)) {
      parts.push(` ${key}: ${value}`);
    }
  }

  parts.push(`\nCURRENT PAGE:`);
  parts.push(`URL: ${context.currentURL}`);
  parts.push(`Title: ${context.currentPageInfo.title}`);
  parts.push(`\nINTERACTIVE ELEMENTS (with positions and selectors):`);
  parts.push(context.currentPageInfo.formattedElements);
  parts.push(`\nARIA TREE (hierarchical structure):`);
  // Serialize once so the truncation check measures the same string that is emitted
  const ariaJson = JSON.stringify(context.currentPageInfo.ariaSnapshot, null, 2);
  parts.push(ariaJson.substring(0, 5000));
  if (ariaJson.length > 5000) {
    parts.push('... (truncated)');
  }

  // Recent actions
  if (context.recentSteps.length > 0) {
    parts.push(`\nRECENT EXPLORATION ACTIONS (last ${context.recentSteps.length}):`);
    for (const step of context.recentSteps) {
      const status = step.result === 'success' ? '✓' : '✗';
      parts.push(` ${status} ${step.action}`);
      parts.push(` ${step.observation}`);
    }
  }

  // Learnings and exploration progress
  if (context.experiences && context.experiences.length > 0) {
    parts.push(`\nEXPLORATION NOTES & APP PATTERNS:`);
    for (const exp of context.experiences) {
      parts.push(` • ${exp}`);
    }
  }

  // Note from previous iteration
  if (context.noteFromPreviousIteration) {
    parts.push(`\n📝 YOUR NOTE FROM LAST ITERATION:`);
    parts.push(` ${context.noteFromPreviousIteration.content}`);
  }

  parts.push(`\n🤔 DECIDE YOUR NEXT EXPLORATION ACTION:`);
  parts.push(`1. What does the exploration goal require?`);
  parts.push(`2. What's available on the current page?`);
  parts.push(`3. What have you already explored? (check history, experiences, extractedData)`);
  parts.push(`4. What should you explore next to achieve the goal?`);
  parts.push(`5. Is the goal achieved? If yes, mark status="complete" (don't wait for max steps)`);

  return parts.join('\n');
}
```

---

## Phase 3: Extend OrchestratorAgent for Exploration

### 3.1 Add Exploration Execution Method

**File**: `runner-core/src/orchestrator/orchestrator-agent.ts`

Add a new method that runs the exploration loop (similar to `executeStep`, but for autonomous exploration):

```typescript
/**
 * Execute exploration mode - agent autonomously explores based on prompt
 * Fires callbacks for each autonomous action (transparent to caller)
 */
async executeExploration(
  page: any,
  explorationConfig: ExplorationMode,
  progressReporter: ProgressReporter | undefined,
  jobId: string
): Promise<OrchestratorStepResult> {
  this.logger?.(`\n[Orchestrator] ========== EXPLORATORY MODE ==========`);
  this.logger?.(`[Orchestrator] 🎯 Goal: ${explorationConfig.explorationPrompt}`);
  if (explorationConfig.testDataPrompt) {
    this.logger?.(`[Orchestrator] 📋 Test Data: ${explorationConfig.testDataPrompt}`);
  }

  const memory: JourneyMemory = {
    history: [],
    experiences: [],
    extractedData: {}
  };

  const maxSteps = explorationConfig.maxExplorationSteps || 50;
  let stepNumber = 0;
  const commandsExecuted: string[] = [];

  while (stepNumber < maxSteps) {
    stepNumber++;

    this.logger?.(`\n[Orchestrator] === Exploration Step ${stepNumber}/${maxSteps} ===`);

    // Build exploratory context
    const context = await this.buildExploratoryContext(
      page,
      explorationConfig.explorationPrompt,
      explorationConfig.testDataPrompt,
      memory,
      stepNumber,
      maxSteps
    );

    // Call agent with exploratory prompt
    const decision = await this.callExploratoryAgent(
      context,
      explorationConfig.testDataPrompt,
      jobId,
      stepNumber
    );

    this.logAgentDecision(decision, stepNumber);

    // Report progress via callback (CRITICAL - this is how test-based-explorer tracks steps)
    if (progressReporter?.onStepProgress) {
      await progressReporter.onStepProgress({
        stepNumber,
        stepId: `exploration-${stepNumber}`,
        description: decision.reasoning,
        code: decision.commands?.join('\n') || '',
        status: StepExecutionStatus.IN_PROGRESS,
        wasRepaired: false
      });
    }

    // Execute tools if requested
    if (decision.toolCalls && decision.toolCalls.length > 0) {
      const toolResults = await this.executeTools(decision.toolCalls, page, memory, stepNumber);

      // If needs tool results, call agent again
      if (decision.needsToolResults) {
        const updatedContext = { ...context, toolResults };
        const continuedDecision = await this.callExploratoryAgent(
          updatedContext,
          explorationConfig.testDataPrompt,
          jobId,
          stepNumber
        );

        decision.commands = continuedDecision.commands || decision.commands;
        decision.commandReasoning = continuedDecision.commandReasoning || decision.commandReasoning;
        decision.status = continuedDecision.status;
      }
    }

    // Handle blocker clearing
    if (decision.blockerDetected && decision.blockerDetected.clearingCommands) {
      this.logger?.(`[Orchestrator] 🚧 Clearing blocker: ${decision.blockerDetected.description}`);
      const blockerResult = await this.executeCommandsSequentially(
        decision.blockerDetected.clearingCommands,
        page,
        memory,
        stepNumber,
        1,
        jobId
      );
      commandsExecuted.push(...blockerResult.executed);
    }

    // Execute exploration commands
    let commandsSucceeded = true;
    if (decision.commands && decision.commands.length > 0) {
      const executeResult = await this.executeCommandsSequentially(
        decision.commands,
        page,
        memory,
        stepNumber,
        1,
        jobId
      );
      commandsExecuted.push(...executeResult.executed);
      commandsSucceeded = executeResult.allSucceeded;
    }

    // Report step completion (CRITICAL - fires test-based-explorer callbacks)
    if (progressReporter?.onStepProgress) {
      await progressReporter.onStepProgress({
        stepNumber,
        stepId: `exploration-${stepNumber}`,
        description: decision.reasoning,
        code: decision.commands?.join('\n') || '',
        status: commandsSucceeded ? StepExecutionStatus.SUCCESS : StepExecutionStatus.FAILED,
        error: commandsSucceeded ? undefined : 'Command execution failed',
        wasRepaired: false
      });
    }

    // Add experiences (both app patterns AND exploration progress)
    if (decision.experiences) {
      memory.experiences.push(...decision.experiences);
      if (memory.experiences.length > this.config.maxExperiences) {
        memory.experiences = memory.experiences.slice(-this.config.maxExperiences);
      }
    }

    // Store note for next iteration
    if (decision.noteToFutureSelf) {
      memory.latestNote = {
        fromIteration: stepNumber,
        content: decision.noteToFutureSelf
      };
    }

    // Check termination
    if (decision.status === 'complete') {
      this.logger?.(`[Orchestrator] ✅ Exploration complete: ${decision.statusReasoning}`);
      return {
        success: true,
        commands: commandsExecuted,
        iterations: stepNumber,
        terminationReason: 'complete',
        memory
      };
    } else if (decision.status === 'stuck') {
      this.logger?.(`[Orchestrator] ❌ Exploration stuck: ${decision.statusReasoning}`);
      return {
        success: false,
        commands: commandsExecuted,
        iterations: stepNumber,
        terminationReason: 'agent_stuck',
        memory,
        error: decision.statusReasoning
      };
    }
  }

  // Hit max steps - not necessarily a failure
  this.logger?.(`[Orchestrator] ⚠ Maximum exploration steps reached (budget limit)`);
  return {
    success: true, // Not a failure - just budget limit
    commands: commandsExecuted,
    iterations: stepNumber,
    terminationReason: 'system_limit',
    memory
  };
}

private async buildExploratoryContext(
  page: any,
  explorationPrompt: string,
  testDataPrompt: string | undefined,
  memory: JourneyMemory,
  stepNumber: number,
  maxSteps: number
): Promise<AgentContext> {
  const currentPageInfo = await getEnhancedPageInfo(page);
  const currentURL = page.url();
  const recentSteps = memory.history.slice(-this.config.recentStepsCount);

  return {
    overallGoal: explorationPrompt,
    currentStepGoal: explorationPrompt, // Same as overall in exploratory mode
    stepNumber,
    totalSteps: maxSteps,
    completedSteps: [],
    remainingSteps: [],
    currentPageInfo,
    currentURL,
    recentSteps,
    experiences: memory.experiences,
    extractedData: memory.extractedData,
    noteFromPreviousIteration: memory.latestNote
  };
}

private async callExploratoryAgent(
  context: AgentContext,
  testDataPrompt: string | undefined,
  jobId: string,
  stepNumber: number
): Promise<AgentDecision> {
  const toolDescriptions = this.toolRegistry.generateToolDescriptions();
  const systemPrompt = OrchestratorPrompts.buildExploratorySystemPrompt(toolDescriptions);
  const userPrompt = OrchestratorPrompts.buildExploratoryUserPrompt(
    context,
    context.overallGoal,
    testDataPrompt, // The context built above carries no test data, so forward it explicitly
    stepNumber,
    context.totalSteps
  );

  // Call LLM (same as regular mode)
  const llmRequest = {
    model: DEFAULT_MODEL,
    systemPrompt,
    userPrompt
  };

  const response = await this.llmFacade.llmProvider.callLLM(llmRequest);

  // Report token usage
  if (response.usage && this.progressReporter?.onTokensUsed) {
    await this.progressReporter.onTokensUsed({
      jobId,
      stepNumber,
      iteration: 1,
      inputTokens: response.usage.inputTokens,
      outputTokens: response.usage.outputTokens,
      includesImage: false,
      model: DEFAULT_MODEL,
      timestamp: Date.now()
    });
  }

  // Parse response (same JSON format as regular mode)
  const decision = this.parseAgentDecision(response.content);
  return decision;
}
```
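
The termination behaviour of the loop above can be isolated from the browser and LLM plumbing. The sketch below keeps only the control flow, with the agent call replaced by a stub. `runExplorationLoop`, `Decide`, and `LoopResult` are hypothetical names for this illustration; the three `terminationReason` values are the ones used in the method above. It shows that an exhausted budget is reported as a success, while `"stuck"` is the only failing outcome.

```typescript
type Status = 'continue' | 'complete' | 'stuck';
type TerminationReason = 'complete' | 'agent_stuck' | 'system_limit';

interface LoopResult {
  success: boolean;
  iterations: number;
  terminationReason: TerminationReason;
}

// Hypothetical stand-in for the LLM call: yields the agent's status for a step.
type Decide = (stepNumber: number) => Promise<Status>;

async function runExplorationLoop(decide: Decide, maxSteps: number): Promise<LoopResult> {
  let stepNumber = 0;
  while (stepNumber < maxSteps) {
    stepNumber++;
    const status = await decide(stepNumber);
    if (status === 'complete') {
      return { success: true, iterations: stepNumber, terminationReason: 'complete' };
    }
    if (status === 'stuck') {
      return { success: false, iterations: stepNumber, terminationReason: 'agent_stuck' };
    }
  }
  // Running out of budget is not treated as a failure.
  return { success: true, iterations: stepNumber, terminationReason: 'system_limit' };
}

// Usage: a stub agent that declares the goal met on its third step.
runExplorationLoop(async (step) => (step >= 3 ? 'complete' : 'continue'), 50)
  .then((result) => console.log(result.terminationReason, result.iterations));
```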

---

## Phase 4: Wire Through ScenarioService & Worker

### 4.1 Update ScenarioService Interface

**File**: `runner-core/src/scenario-service.ts`

Add an `explorationMode` parameter to `processScenario` (if using orchestrator):

```typescript
processScenario(
  scenario: string,
  testName?: string,
  config?: PlaywrightConfig,
  model?: string,
  scenarioFileName?: string,
  existingBrowser?: any,
  existingContext?: any,
  existingPage?: any,
  explorationMode?: ExplorationMode // NEW parameter
): string {
  // slice(2, 11) keeps the same 9 characters as the deprecated substr(2, 9)
  const jobId = `scenario_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`;

  const job: ScenarioRunJob = {
    id: jobId,
    scenario,
    testName,
    playwrightConfig: config,
    model,
    scenarioFileName,
    existingBrowser,
    existingContext,
    existingPage,
    explorationMode // NEW: Pass through
  };

  this.jobQueue.push(job);
  this.processNextJob();

  return jobId;
}
```

### 4.2 Update ScenarioWorker

**File**: `runner-core/src/scenario-worker-class.ts`

Handle `explorationMode` in `processScenarioJob`: when exploration mode is enabled and the orchestrator is active, call `executeExploration` instead of building a scenario script.
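
The branching described for the worker can be sketched as a small pure function, which keeps the decision easy to test apart from the worker itself. This is only an illustration: `selectExecutionPath`, `JobLike`, and the `useOrchestrator` argument are assumptions for the sketch, not the worker's actual code, and the real `processScenarioJob` would invoke `executeExploration` or the scenario path rather than return a label.

```typescript
// Trimmed local copy of the ExplorationMode interface from section 1.1.
interface ExplorationMode {
  enabled: boolean;
  explorationPrompt: string;
}

// Hypothetical job shape: only the fields this sketch needs.
interface JobLike {
  scenario: string;
  explorationMode?: ExplorationMode;
}

type ExecutionPath = 'exploration' | 'scenario';

// Exploration runs only when it is explicitly enabled AND the orchestrator is active;
// every other combination falls back to the existing scenario path.
function selectExecutionPath(job: JobLike, useOrchestrator: boolean): ExecutionPath {
  const explorationRequested = job.explorationMode?.enabled === true;
  return useOrchestrator && explorationRequested ? 'exploration' : 'scenario';
}

const exploratoryJob: JobLike = {
  scenario: '',
  explorationMode: { enabled: true, explorationPrompt: 'Explore all menu options' }
};
console.log(selectExecutionPath(exploratoryJob, true)); // exploration
console.log(selectExecutionPath(exploratoryJob, false)); // scenario
```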

---

## Phase 5: Extend SmartTestRunnerCoreV2

### 5.1 Pass Exploration Config to Runner-Core

**File**: `scriptservice/smart-test-runner-core-v2.ts`

Modify constructor to accept explorationMode:

```typescript
export interface RunnerConfig {
  playwrightConfig?: string;
  model?: string;
  repairFlexibility?: number;
  callbacks?: RunnerLifecycleCallbacks;
  page?: Page;
  browser?: Browser;
  context?: BrowserContext;
  maxRetriesPerStep?: number;
  explorationMode?: ExplorationMode; // NEW
}

constructor(config: RunnerConfig) {
  this.config = config;
  // ... existing code

  // Initialize runner-core with exploration mode if provided
  this.runnerCore = new TestChimpService(
    undefined,
    undefined,
    undefined,
    1,
    llmProvider,
    progressReporter,
    {
      useOrchestrator: true, // Enable orchestrator for exploration
      orchestratorConfig: {
        explorationMode: config.explorationMode // Pass exploration config
      }
    }
  );
}
```

Add exploration execution method:

```typescript
/**
 * Run exploration mode - agent autonomously explores based on prompt
 * Delegates to runner-core's orchestrator which fires onStepComplete for each action
 */
async runExploration(jobId?: string): Promise<RunExactlyResult> {
  try {
    logger.info(`SmartTestRunnerCoreV2.runExploration: Starting`);

    if (!this.config.explorationMode?.enabled) {
      throw new Error('Exploration mode not enabled in config');
    }

    // Call beforeStartTest
    if (this.config.callbacks?.beforeStartTest) {
      await this.config.callbacks.beforeStartTest(this.page, this.browser, this.context);
    }

    // Call runner-core's orchestrator in exploration mode
    // Orchestrator will autonomously explore and fire onStepComplete for each action
    const result = await this.runnerCore.executeExploration(
      this.page,
      this.config.explorationMode,
      jobId
|
+
);
|
|
624
|
+
|
|
625
|
+
// Call afterEndTest
|
|
626
|
+
if (this.config.callbacks?.afterEndTest) {
|
|
627
|
+
const status = result.success ? 'passed' : 'failed';
|
|
628
|
+
await this.config.callbacks.afterEndTest(status, result.error, this.page);
|
|
629
|
+
}
|
|
630
|
+
|
|
631
|
+
return {
|
|
632
|
+
success: result.success,
|
|
633
|
+
error: result.error
|
|
634
|
+
};
|
|
635
|
+
} catch (error: any) {
|
|
636
|
+
logger.error(`SmartTestRunnerCoreV2.runExploration: Error - ${error.message}`);
|
|
637
|
+
|
|
638
|
+
if (this.config.callbacks?.afterEndTest) {
|
|
639
|
+
await this.config.callbacks.afterEndTest('failed', error.message, this.page);
|
|
640
|
+
}
|
|
641
|
+
|
|
642
|
+
return {
|
|
643
|
+
success: false,
|
|
644
|
+
error: error.message
|
|
645
|
+
};
|
|
646
|
+
}
|
|
647
|
+
}
|
|
648
|
+
```
|
|
649
|
+
|
|
650
|
+
---
|
|
651
|
+
|
|
652
|
+
## Phase 6: Minimal Changes to Test-Based-Explorer
|
|
653
|
+
|
|
654
|
+
### 6.1 Extend TestBasedExplorationTask
|
|
655
|
+
|
|
656
|
+
**File**: `scriptservice/utils/models.ts`
|
|
657
|
+
|
|
658
|
+
Add exploration mode fields:
|
|
659
|
+
|
|
660
|
+
```typescript
|
|
661
|
+
export interface TestBasedExplorationTask {
|
|
662
|
+
// ... existing fields
|
|
663
|
+
|
|
664
|
+
// NEW: Exploration mode support
|
|
665
|
+
explorationMode?: {
|
|
666
|
+
enabled: boolean;
|
|
667
|
+
explorationPrompt: string;
|
|
668
|
+
testDataPrompt?: string;
|
|
669
|
+
};
|
|
670
|
+
}
|
|
671
|
+
```
|
|
672
|
+
|
|
673
|
+
### 6.2 Update Test-Based-Explorer Constructor
|
|
674
|
+
|
|
675
|
+
**File**: `scriptservice/workers/test-based-explorer.ts`
|
|
676
|
+
|
|
677
|
+
Store exploration config:
|
|
678
|
+
|
|
679
|
+
```typescript
|
|
680
|
+
private explorationMode?: { enabled: boolean; explorationPrompt: string; testDataPrompt?: string; };
|
|
681
|
+
|
|
682
|
+
constructor(task: TestBasedExplorationTask, ...callbacks) {
|
|
683
|
+
// ... existing code
|
|
684
|
+
this.explorationMode = task.explorationMode;
|
|
685
|
+
}
|
|
686
|
+
```
|
|
687
|
+
|
|
688
|
+
### 6.3 Modify run() Method - Minimal Changes
|
|
689
|
+
|
|
690
|
+
**File**: `scriptservice/workers/test-based-explorer.ts`
|
|
691
|
+
|
|
692
|
+
Change around lines 420-690:
|
|
693
|
+
|
|
694
|
+
```typescript
|
|
695
|
+
public async run() {
|
|
696
|
+
let testFullName = `${this.testName?.suite}#${this.testName?.name}`;
|
|
697
|
+
try {
|
|
698
|
+
logger.info("Running test...");
|
|
699
|
+
await this.setup();
|
|
700
|
+
await this.journeyReporter?.startJourney();
|
|
701
|
+
} catch (error) {
|
|
702
|
+
logger.error("Exception in worker:", error);
|
|
703
|
+
await updateJourneyExecutionStatus(this.invocationId, JourneyExecutionStatus.EXCEPTION_IN_JOURNEY_EXECUTION);
|
|
704
|
+
return {
|
|
705
|
+
status: 'error',
|
|
706
|
+
error: error instanceof Error ? error.message : String(error),
|
|
707
|
+
};
|
|
708
|
+
}
|
|
709
|
+
|
|
710
|
+
let startTimeMillis: number = Date.now();
|
|
711
|
+
let endReason = "";
|
|
712
|
+
|
|
713
|
+
try {
|
|
714
|
+
// Parse test steps OR use empty array for exploration
|
|
715
|
+
let codeUnits: CodeUnit[] = [];
|
|
716
|
+
if (!this.explorationMode?.enabled) {
|
|
717
|
+
// TEST-BASED mode: Parse smart test into code units
|
|
718
|
+
if (!this.testId) {
|
|
719
|
+
logger.error('No testId provided');
|
|
720
|
+
return { result: null };
|
|
721
|
+
}
|
|
722
|
+
codeUnits = await this.parseSmartTestIntoCodeUnits();
|
|
723
|
+
if (codeUnits.length === 0) {
|
|
724
|
+
logger.error(`No codeUnits found for test ${this.testId}`);
|
|
725
|
+
return { result: null };
|
|
726
|
+
}
|
|
727
|
+
codeUnits = codeUnits.filter(cu => !isNonActionCodeUnit(cu));
|
|
728
|
+
} else {
|
|
729
|
+
// EXPLORATION mode: No pre-defined steps
|
|
730
|
+
logger.info(`EXPLORATION MODE: ${this.explorationMode.explorationPrompt}`);
|
|
731
|
+
}
|
|
732
|
+
|
|
733
|
+
// Convert to TestStepWithId format
|
|
734
|
+
const stepsWithId: TestStepWithId[] = codeUnits.map(cu => ({
|
|
735
|
+
stepId: cu.id || uuidv4(),
|
|
736
|
+
description: cu.description,
|
|
737
|
+
code: cu.code,
|
|
738
|
+
}));
|
|
739
|
+
|
|
740
|
+
// Track state for step callbacks (SAME FOR BOTH MODES)
|
|
741
|
+
let stepTimestamps = new Map<string, number>();
|
|
742
|
+
|
|
743
|
+
// Define callbacks (IDENTICAL FOR BOTH TEST-BASED AND EXPLORATION)
|
|
744
|
+
const handleBeforeStartTest = async (page: Page, browser: Browser, context: BrowserContext) => {
|
|
745
|
+
await this.initializeWatcherAndListeners();
|
|
746
|
+
logger.info('Test setup complete, starting execution');
|
|
747
|
+
};
|
|
748
|
+
|
|
749
|
+
const handleBeforeStepStart = async (step: TestStepWithId, page: Page) => {
|
|
750
|
+
const stepId = step.stepId || '';
|
|
751
|
+
const stepTimestamp = Date.now();
|
|
752
|
+
stepTimestamps.set(stepId, stepTimestamp);
|
|
753
|
+
this.journeyReporter?.startStep(stepId);
|
|
754
|
+
this.currentStepCodeExecutions = [];
|
|
755
|
+
};
|
|
756
|
+
|
|
757
|
+
const handleStepComplete = async (step: TestStepWithId, isRepairStep: boolean, repairForStepId: string | undefined, error: string | undefined, page: Page) => {
|
|
758
|
+
// ... EXISTING CODE (lines 466-648) - NO CHANGES
|
|
759
|
+
// All analytics, screenshot, bug capture, screen state detection - IDENTICAL
|
|
760
|
+
};
|
|
761
|
+
|
|
762
|
+
const handleAfterEndTest = async (status: 'passed' | 'failed', error: string | undefined, page: Page) => {
|
|
763
|
+
// ... EXISTING CODE (lines 650-666) - NO CHANGES
|
|
764
|
+
};
|
|
765
|
+
|
|
766
|
+
// Create runner config (SAME FOR BOTH MODES)
|
|
767
|
+
const runnerConfig: RunnerConfig = {
|
|
768
|
+
page: this.page!,
|
|
769
|
+
browser: this.browser!,
|
|
770
|
+
context: this.context!,
|
|
771
|
+
maxRetriesPerStep: 3,
|
|
772
|
+
repairFlexibility: this.aiHealingSettings?.freedomLevel,
|
|
773
|
+
explorationMode: this.explorationMode, // NEW: Pass exploration config
|
|
774
|
+
callbacks: {
|
|
775
|
+
beforeStartTest: handleBeforeStartTest,
|
|
776
|
+
beforeStepStart: handleBeforeStepStart,
|
|
777
|
+
onStepComplete: handleStepComplete,
|
|
778
|
+
afterEndTest: handleAfterEndTest
|
|
779
|
+
}
|
|
780
|
+
};
|
|
781
|
+
|
|
782
|
+
// Create runner (SAME FOR BOTH MODES)
|
|
783
|
+
const runner = new SmartTestRunnerCoreV2(runnerConfig);
|
|
784
|
+
logger.info(`test-based-explorer: using SmartTestRunnerCoreV2 (runner-core)`);
|
|
785
|
+
|
|
786
|
+
// Execute based on mode
|
|
787
|
+
let result;
|
|
788
|
+
if (this.explorationMode?.enabled) {
|
|
789
|
+
// EXPLORATION: Let orchestrator decide next steps autonomously
|
|
790
|
+
result = await runner.runExploration(this.invocationId);
|
|
791
|
+
} else {
|
|
792
|
+
// TEST-BASED: Execute predefined steps with repair
|
|
793
|
+
result = await runner.runWithRepair(stepsWithId, this.invocationId, false, 0);
|
|
794
|
+
}
|
|
795
|
+
|
|
796
|
+
if (!result.success) {
|
|
797
|
+
throw new Error(`Execution failed: ${result.error}`);
|
|
798
|
+
}
|
|
799
|
+
|
|
800
|
+
} catch (error) {
|
|
801
|
+
// ... EXISTING ERROR HANDLING - NO CHANGES
|
|
802
|
+
} finally {
|
|
803
|
+
// ... EXISTING CLEANUP - NO CHANGES
|
|
804
|
+
}
|
|
805
|
+
}
|
|
806
|
+
```
|
|
807
|
+
|
|
808
|
+
**That's it!** Only ~10 lines of changes in test-based-explorer:
|
|
809
|
+
1. Store explorationMode in constructor
|
|
810
|
+
2. Skip parsing steps if exploration mode
|
|
811
|
+
3. Pass explorationMode to RunnerConfig
|
|
812
|
+
4. Call runExploration() instead of runWithRepair()
|
|
813
|
+
|
|
814
|
+
All callbacks remain identical - orchestrator fires them for each autonomous action.
|
|
815
|
+
|
|
816
|
+
---
|
|
817
|
+
|
|
818
|
+
## Phase 7: Wire Up AppExplorer
|
|
819
|
+
|
|
820
|
+
### 7.1 Handle PromptConfig in AppExplorer
|
|
821
|
+
|
|
822
|
+
**File**: `scriptservice/workers/app-explorer.ts`
|
|
823
|
+
|
|
824
|
+
Modify around line 154:
|
|
825
|
+
|
|
826
|
+
```typescript
|
|
827
|
+
if (this.config.promptConfig) {
|
|
828
|
+
// Prompt-based exploration - use test-based-explorer with exploration mode
|
|
829
|
+
logger.info("Running prompt-based exploration");
|
|
830
|
+
await this.runPromptBasedExploration();
|
|
831
|
+
}
|
|
832
|
+
|
|
833
|
+
// Add new method
|
|
834
|
+
private async runPromptBasedExploration(): Promise<void> {
|
|
835
|
+
const promptConfig = this.config.promptConfig!;
|
|
836
|
+
|
|
837
|
+
// Create exploration task (reuses TestBasedExplorationTask)
|
|
838
|
+
const task: TestBasedExplorationTask = {
|
|
839
|
+
invocationId: uuidv4(),
|
|
840
|
+
invocationBatchId: this.explorationId,
|
|
841
|
+
projectId: this.task.projectId,
|
|
842
|
+
appReleaseId: this.config.webappReleaseVersion ?? "default",
|
|
843
|
+
autohealEnabled: false, // Not needed for exploration
|
|
844
|
+
sessionRecordApiKey: this.task.sessionRecordApiKey,
|
|
845
|
+
ingressEndpoint: await ConfigService.get('test_service_endpoint'),
|
|
846
|
+
enableTestchimpSdkOnExec: false,
|
|
847
|
+
urlRegexToCapture: ".*",
|
|
848
|
+
playwrightConfig: this.config.playwrightConfig,
|
|
849
|
+
creditBudget: this.getCreditsForNextJourney(),
|
|
850
|
+
bugCaptureSettings: this.config.bugCaptureSettings,
|
|
851
|
+
viewportConfig: this.config.viewportConfig,
|
|
852
|
+
|
|
853
|
+
// NEW: Enable exploration mode
|
|
854
|
+
explorationMode: {
|
|
855
|
+
enabled: true,
|
|
856
|
+
explorationPrompt: promptConfig.explorePrompt || "Explore the application",
|
|
857
|
+
testDataPrompt: promptConfig.testDataPrompt
|
|
858
|
+
}
|
|
859
|
+
};
|
|
860
|
+
|
|
861
|
+
// Use test-based-explorer (it handles both modes transparently)
|
|
862
|
+
const explorer = new TestBasedExplorer(
|
|
863
|
+
task,
|
|
864
|
+
this.onBugsDiscovered.bind(this),
|
|
865
|
+
this.onScreenStateVisited.bind(this),
|
|
866
|
+
this.shouldVisualAnalyzeScreen?.bind(this),
|
|
867
|
+
this.getVisualAnalysisDecision?.bind(this),
|
|
868
|
+
this.shouldAnalyzeApiCall?.bind(this),
|
|
869
|
+
this.reportVisualAnalyzedScreen?.bind(this),
|
|
870
|
+
this.reportVisualAnalyzedScreenshot?.bind(this),
|
|
871
|
+
this.reportAnalyzedApiCalls?.bind(this),
|
|
872
|
+
this.getKnownScreenStates?.bind(this),
|
|
873
|
+
this.updateInputValueCombinations?.bind(this),
|
|
874
|
+
this.reportPageNavigation?.bind(this),
|
|
875
|
+
this.onTokensUsed?.bind(this),
|
|
876
|
+
this.updateLatestScreenshot?.bind(this)
|
|
877
|
+
);
|
|
878
|
+
|
|
879
|
+
await explorer.run();
|
|
880
|
+
}
|
|
881
|
+
```
|
|
882
|
+
|
|
883
|
+
---
|
|
884
|
+
|
|
885
|
+
## Summary
|
|
886
|
+
|
|
887
|
+
This plan enables exploratory mode with minimal changes:
|
|
888
|
+
|
|
889
|
+
1. **Runner-Core Extensions**:
|
|
890
|
+
- Add `ExplorationMode` type (removed `explorationStrategy`)
|
|
891
|
+
- Rename `explorationContext` to `testDataPrompt`
|
|
892
|
+
- Reuse existing `JourneyMemory` fields
|
|
893
|
+
- Add exploratory prompts that guide agent to use memory creatively
|
|
894
|
+
- Add `executeExploration` method to orchestrator
|
|
895
|
+
|
|
896
|
+
2. **Autonomous Decision-Making**:
|
|
897
|
+
- Agent decides next actions based on: exploration prompt + page state + memory
|
|
898
|
+
- Agent can stop early when goal achieved (doesn't wait for maxSteps)
|
|
899
|
+
- Agent tracks progress using existing memory fields (experiences, extractedData)
|
|
900
|
+
|
|
901
|
+
3. **Reuse Infrastructure**:
|
|
902
|
+
- Same tools (extract_data, screenshot, recall_history)
|
|
903
|
+
- Same command execution
|
|
904
|
+
- Same callback system (onStepComplete fires for each autonomous action)
|
|
905
|
+
|
|
906
|
+
4. **Minimal Test-Based-Explorer Changes** (~10 lines):
|
|
907
|
+
- Check if explorationMode enabled
|
|
908
|
+
- Skip step parsing if exploration
|
|
909
|
+
- Pass explorationMode to runner config
|
|
910
|
+
- Call runExploration() vs runWithRepair()
|
|
911
|
+
- **All callbacks identical** - works transparently
|
|
912
|
+
|
|
913
|
+
5. **Preserve Analytics**:
|
|
914
|
+
- All bug capture, screenshot analysis, screen state detection, console/network analytics work identically
|
|
915
|
+
- test-based-explorer doesn't know if steps were pre-defined or autonomous
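
The autonomous loop summarized above (decide from prompt, page state, and memory; stop early once the goal is achieved; fire `onStepComplete` per action) might look roughly like the sketch below. It is a synchronous simplification under stated assumptions: `runExplorationLoop`, `AgentDecision`, `StepRecord`, and the `decide` callback are hypothetical stand-ins, and the real orchestrator would be asynchronous and drive a Playwright page instead of the placeholder page-state string used here.

```typescript
// Fields as listed for explorationMode in this plan.
interface ExplorationMode {
  enabled: boolean;
  explorationPrompt: string;
  testDataPrompt?: string;
}

// Hypothetical result of one agent decision.
interface AgentDecision {
  action: string;
  goalAchieved: boolean;
}

// Hypothetical record passed to the step callback.
interface StepRecord {
  stepId: string;
  description: string;
}

// Stand-in for the LLM call: exploration prompt + page state + memory in, decision out.
type Decide = (prompt: string, pageState: string, memory: string[]) => AgentDecision;

function runExplorationLoop(
  mode: ExplorationMode,
  maxSteps: number,
  decide: Decide,
  onStepComplete: (step: StepRecord) => void
): StepRecord[] {
  if (!mode.enabled) {
    throw new Error('Exploration mode not enabled in config');
  }
  const memory: string[] = [];
  const steps: StepRecord[] = [];
  for (let i = 0; i < maxSteps; i++) {
    // Placeholder page state; a real run would inspect the live page.
    const decision = decide(mode.explorationPrompt, `page-state-${i}`, memory);
    if (decision.goalAchieved) {
      break; // Stop early instead of waiting for maxSteps.
    }
    const step: StepRecord = { stepId: `step-${i + 1}`, description: decision.action };
    memory.push(decision.action); // Progress is tracked in memory for later decisions.
    steps.push(step);
    onStepComplete(step); // Same callback the test-based path relies on.
  }
  return steps;
}

// Example agent: performs two actions, then reports the goal as achieved.
const twoActionAgent: Decide = (_prompt, _pageState, memory) =>
  memory.length >= 2
    ? { action: '', goalAchieved: true }
    : { action: `action ${memory.length + 1}`, goalAchieved: false };

const completed: StepRecord[] = [];
const steps = runExplorationLoop(
  { enabled: true, explorationPrompt: 'Explore the application' },
  10,
  twoActionAgent,
  (step) => completed.push(step)
);
console.log(steps.length, completed.length); // 2 2
```

Because the callback receives the same kind of step record in both modes, the consumer does not need to know whether a step was predefined or chosen by the agent.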
|
|
916
|
+
|
|
917
|
+
### Implementation Order
|
|
918
|
+
|
|
919
|
+
1. Add types to runner-core (`ExplorationMode`, extend `AgentConfig`)
|
|
920
|
+
2. Add exploratory prompts to orchestrator-prompts.ts
|
|
921
|
+
3. Add `executeExploration` to orchestrator-agent.ts
|
|
922
|
+
4. Add `explorationMode` to `RunnerConfig` in SmartTestRunnerCoreV2
|
|
923
|
+
5. Add `runExploration` method to SmartTestRunnerCoreV2
|
|
924
|
+
6. Add `explorationMode` field to `TestBasedExplorationTask`
|
|
925
|
+
7. Modify test-based-explorer.ts `run()` method (~10 lines)
|
|
926
|
+
8. Add `runPromptBasedExploration` to app-explorer.ts
|
|
927
|
+
9. Test with sample exploration prompts
|
|
928
|
+
|