testchimp-runner-core 0.0.33 → 0.0.35
- package/dist/execution-service.d.ts +1 -4
- package/dist/execution-service.d.ts.map +1 -1
- package/dist/execution-service.js +155 -468
- package/dist/execution-service.js.map +1 -1
- package/dist/index.d.ts +3 -1
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +11 -1
- package/dist/index.js.map +1 -1
- package/dist/llm-facade.d.ts.map +1 -1
- package/dist/llm-facade.js +7 -7
- package/dist/llm-facade.js.map +1 -1
- package/dist/llm-provider.d.ts +9 -0
- package/dist/llm-provider.d.ts.map +1 -1
- package/dist/model-constants.d.ts +16 -5
- package/dist/model-constants.d.ts.map +1 -1
- package/dist/model-constants.js +17 -6
- package/dist/model-constants.js.map +1 -1
- package/dist/orchestrator/decision-parser.d.ts +18 -0
- package/dist/orchestrator/decision-parser.d.ts.map +1 -0
- package/dist/orchestrator/decision-parser.js +127 -0
- package/dist/orchestrator/decision-parser.js.map +1 -0
- package/dist/orchestrator/index.d.ts +4 -2
- package/dist/orchestrator/index.d.ts.map +1 -1
- package/dist/orchestrator/index.js +15 -2
- package/dist/orchestrator/index.js.map +1 -1
- package/dist/orchestrator/orchestrator-agent.d.ts +17 -22
- package/dist/orchestrator/orchestrator-agent.d.ts.map +1 -1
- package/dist/orchestrator/orchestrator-agent.js +708 -577
- package/dist/orchestrator/orchestrator-agent.js.map +1 -1
- package/dist/orchestrator/orchestrator-prompts.d.ts +32 -0
- package/dist/orchestrator/orchestrator-prompts.d.ts.map +1 -0
- package/dist/orchestrator/orchestrator-prompts.js +737 -0
- package/dist/orchestrator/orchestrator-prompts.js.map +1 -0
- package/dist/orchestrator/page-som-handler.d.ts +106 -0
- package/dist/orchestrator/page-som-handler.d.ts.map +1 -0
- package/dist/orchestrator/page-som-handler.js +1353 -0
- package/dist/orchestrator/page-som-handler.js.map +1 -0
- package/dist/orchestrator/som-types.d.ts +149 -0
- package/dist/orchestrator/som-types.d.ts.map +1 -0
- package/dist/orchestrator/som-types.js +87 -0
- package/dist/orchestrator/som-types.js.map +1 -0
- package/dist/orchestrator/tool-registry.d.ts +2 -0
- package/dist/orchestrator/tool-registry.d.ts.map +1 -1
- package/dist/orchestrator/tool-registry.js.map +1 -1
- package/dist/orchestrator/tools/index.d.ts +5 -1
- package/dist/orchestrator/tools/index.d.ts.map +1 -1
- package/dist/orchestrator/tools/index.js +9 -2
- package/dist/orchestrator/tools/index.js.map +1 -1
- package/dist/orchestrator/tools/refresh-som-markers.d.ts +12 -0
- package/dist/orchestrator/tools/refresh-som-markers.d.ts.map +1 -0
- package/dist/orchestrator/tools/refresh-som-markers.js +64 -0
- package/dist/orchestrator/tools/refresh-som-markers.js.map +1 -0
- package/dist/orchestrator/tools/verify-action-result.d.ts +17 -0
- package/dist/orchestrator/tools/verify-action-result.d.ts.map +1 -0
- package/dist/orchestrator/tools/verify-action-result.js +140 -0
- package/dist/orchestrator/tools/verify-action-result.js.map +1 -0
- package/dist/orchestrator/tools/view-previous-screenshot.d.ts +15 -0
- package/dist/orchestrator/tools/view-previous-screenshot.d.ts.map +1 -0
- package/dist/orchestrator/tools/view-previous-screenshot.js +92 -0
- package/dist/orchestrator/tools/view-previous-screenshot.js.map +1 -0
- package/dist/orchestrator/types.d.ts +49 -1
- package/dist/orchestrator/types.d.ts.map +1 -1
- package/dist/orchestrator/types.js +11 -1
- package/dist/orchestrator/types.js.map +1 -1
- package/dist/prompts.d.ts.map +1 -1
- package/dist/prompts.js +40 -34
- package/dist/prompts.js.map +1 -1
- package/dist/scenario-service.d.ts +5 -0
- package/dist/scenario-service.d.ts.map +1 -1
- package/dist/scenario-service.js +17 -0
- package/dist/scenario-service.js.map +1 -1
- package/dist/scenario-worker-class.d.ts +4 -0
- package/dist/scenario-worker-class.d.ts.map +1 -1
- package/dist/scenario-worker-class.js +21 -3
- package/dist/scenario-worker-class.js.map +1 -1
- package/dist/testing/agent-tester.d.ts +35 -0
- package/dist/testing/agent-tester.d.ts.map +1 -0
- package/dist/testing/agent-tester.js +84 -0
- package/dist/testing/agent-tester.js.map +1 -0
- package/dist/testing/ref-translator-tester.d.ts +44 -0
- package/dist/testing/ref-translator-tester.d.ts.map +1 -0
- package/dist/testing/ref-translator-tester.js +104 -0
- package/dist/testing/ref-translator-tester.js.map +1 -0
- package/dist/utils/coordinate-converter.d.ts +32 -0
- package/dist/utils/coordinate-converter.d.ts.map +1 -0
- package/dist/utils/coordinate-converter.js +130 -0
- package/dist/utils/coordinate-converter.js.map +1 -0
- package/dist/utils/hierarchical-selector.d.ts +47 -0
- package/dist/utils/hierarchical-selector.d.ts.map +1 -0
- package/dist/utils/hierarchical-selector.js +212 -0
- package/dist/utils/hierarchical-selector.js.map +1 -0
- package/dist/utils/page-info-retry.d.ts +14 -0
- package/dist/utils/page-info-retry.d.ts.map +1 -0
- package/dist/utils/page-info-retry.js +60 -0
- package/dist/utils/page-info-retry.js.map +1 -0
- package/dist/utils/page-info-utils.d.ts +1 -0
- package/dist/utils/page-info-utils.d.ts.map +1 -1
- package/dist/utils/page-info-utils.js +46 -18
- package/dist/utils/page-info-utils.js.map +1 -1
- package/dist/utils/ref-attacher.d.ts +21 -0
- package/dist/utils/ref-attacher.d.ts.map +1 -0
- package/dist/utils/ref-attacher.js +149 -0
- package/dist/utils/ref-attacher.js.map +1 -0
- package/dist/utils/ref-translator.d.ts +49 -0
- package/dist/utils/ref-translator.d.ts.map +1 -0
- package/dist/utils/ref-translator.js +276 -0
- package/dist/utils/ref-translator.js.map +1 -0
- package/package.json +1 -1
- package/plandocs/BEFORE_AFTER_VERIFICATION.md +148 -0
- package/plandocs/COORDINATE_MODE_DIAGNOSIS.md +144 -0
- package/plandocs/IMPLEMENTATION_STATUS.md +108 -0
- package/plandocs/PHASE_1_COMPLETE.md +165 -0
- package/plandocs/PHASE_1_SUMMARY.md +184 -0
- package/plandocs/PROMPT_OPTIMIZATION_ANALYSIS.md +120 -0
- package/plandocs/PROMPT_SANITY_CHECK.md +120 -0
- package/plandocs/SESSION_SUMMARY_v0.0.33.md +151 -0
- package/plandocs/TROUBLESHOOTING_SESSION.md +72 -0
- package/plandocs/VISUAL_AGENT_EVOLUTION_PLAN.md +396 -0
- package/plandocs/WHATS_NEW_v0.0.33.md +183 -0
- package/plandocs/exploratory-mode-support-v2.plan.md +953 -0
- package/plandocs/exploratory-mode-support.plan.md +928 -0
- package/plandocs/journey-id-tracking-addendum.md +227 -0
- package/src/execution-service.ts +179 -596
- package/src/index.ts +10 -0
- package/src/llm-facade.ts +8 -8
- package/src/llm-provider.ts +11 -1
- package/src/model-constants.ts +17 -5
- package/src/orchestrator/decision-parser.ts +139 -0
- package/src/orchestrator/index.ts +27 -2
- package/src/orchestrator/orchestrator-agent.ts +868 -623
- package/src/orchestrator/orchestrator-prompts.ts +786 -0
- package/src/orchestrator/page-som-handler.ts +1565 -0
- package/src/orchestrator/som-types.ts +188 -0
- package/src/orchestrator/tool-registry.ts +2 -0
- package/src/orchestrator/tools/index.ts +5 -1
- package/src/orchestrator/tools/refresh-som-markers.ts +69 -0
- package/src/orchestrator/tools/verify-action-result.ts +159 -0
- package/src/orchestrator/tools/view-previous-screenshot.ts +103 -0
- package/src/orchestrator/types.ts +95 -4
- package/src/prompts.ts +40 -34
- package/src/scenario-service.ts +20 -0
- package/src/scenario-worker-class.ts +30 -4
- package/src/utils/coordinate-converter.ts +162 -0
- package/src/utils/page-info-retry.ts +65 -0
- package/src/utils/page-info-utils.ts +53 -18
- package/testchimp-runner-core-0.0.35.tgz +0 -0
- /package/{CREDIT_CALLBACK_ARCHITECTURE.md → plandocs/CREDIT_CALLBACK_ARCHITECTURE.md} +0 -0
- /package/{INTEGRATION_COMPLETE.md → plandocs/INTEGRATION_COMPLETE.md} +0 -0
- /package/{VISION_DIAGNOSTICS_IMPROVEMENTS.md → plandocs/VISION_DIAGNOSTICS_IMPROVEMENTS.md} +0 -0
- /package/{RELEASE_0.0.26.md → releasenotes/RELEASE_0.0.26.md} +0 -0
- /package/{RELEASE_0.0.27.md → releasenotes/RELEASE_0.0.27.md} +0 -0
- /package/{RELEASE_0.0.28.md → releasenotes/RELEASE_0.0.28.md} +0 -0
@@ -0,0 +1,953 @@
# Add Exploratory Mode Support to Runner-Core

## Overview

Enable runner-core's orchestrator to support exploratory mode where the agent autonomously decides next actions based on a high-level exploration prompt, rather than following pre-defined test steps.

**Architecture:**

- **AppExplorer**: Multi-journey orchestrator - decides WHAT journeys to run
  - Test-based: picks tests from test tree (existing)
  - **Prompt-based (NEW)**: LLM generates journey-specific prompts based on overall goal + learnings from prior journeys

- **JourneyRunner** (renamed from TestBasedExplorer): Single-journey executor - executes ONE journey
  - Test-based: follows pre-defined test steps (existing)
  - **Prompt-based (NEW)**: autonomous exploration with journey-specific focus prompt via runner-core orchestrator

**Key Principle**: JourneyRunner fires `onStepComplete` for each action (whether pre-defined or autonomous), so all analytics, bug capture, and screen state detection work identically for both modes.

---

## Phase 1: Extend Runner-Core Types & Config

### 1.1 Add Exploration Mode Types

**File**: `runner-core/src/orchestrator/types.ts`

Add new interface:

```typescript
/**
 * Exploration mode configuration
 */
export interface ExplorationMode {
  enabled: boolean;              // Whether exploration mode is active
  explorationPrompt: string;     // Journey-specific focus: "Explore Dashboard and test all widgets"
  testDataPrompt?: string;       // Test data, credentials context
  maxExplorationSteps?: number;  // Budget limit (default: 50) - agent can stop earlier
}
```

Add to `AgentConfig`:

```typescript
export interface AgentConfig {
  // ... existing fields

  // Exploration mode (NEW)
  explorationMode?: ExplorationMode;
}
```

Update `DEFAULT_AGENT_CONFIG`:

```typescript
export const DEFAULT_AGENT_CONFIG: Required<AgentConfig> = {
  // ... existing defaults

  explorationMode: {
    enabled: false,
    explorationPrompt: '',
    testDataPrompt: undefined,
    maxExplorationSteps: 50
  }
};
```
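
As a rough illustration of how a caller might combine these defaults with per-journey overrides, the sketch below merges a partial config onto the default values. `DEFAULT_EXPLORATION_MODE` and `resolveExplorationMode` are hypothetical names introduced only for this example; the plan itself does not define them.

```typescript
// Local copy of the plan's ExplorationMode shape, so the sketch is self-contained.
interface ExplorationMode {
  enabled: boolean;
  explorationPrompt: string;
  testDataPrompt?: string;
  maxExplorationSteps?: number;
}

// Hypothetical constant mirroring DEFAULT_AGENT_CONFIG.explorationMode.
const DEFAULT_EXPLORATION_MODE: ExplorationMode = {
  enabled: false,
  explorationPrompt: '',
  testDataPrompt: undefined,
  maxExplorationSteps: 50,
};

// Hypothetical helper: merge caller overrides onto the defaults and reject
// an enabled config that has no goal for the agent to pursue.
function resolveExplorationMode(overrides: Partial<ExplorationMode> = {}): ExplorationMode {
  const resolved: ExplorationMode = { ...DEFAULT_EXPLORATION_MODE, ...overrides };
  if (resolved.enabled && resolved.explorationPrompt.trim() === '') {
    throw new Error('explorationPrompt is required when exploration mode is enabled');
  }
  return resolved;
}

const mode = resolveExplorationMode({
  enabled: true,
  explorationPrompt: 'Explore Dashboard and test all widgets',
});
console.log(mode.maxExplorationSteps); // falls back to the default budget
```

Validating at resolution time is one possible design choice: it surfaces an empty prompt before any browser work starts, rather than on the first LLM call.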

### 1.2 Reuse Existing Journey Memory

**No code changes** - the existing `JourneyMemory` fields already cover what exploration needs:

- **`history`**: Reviewed by the agent to see which screens/areas were already visited
- **`experiences`**: Used for BOTH app patterns AND exploration progress
  - Example: "Dashboard fully explored - tested all 5 widgets successfully"
  - Example: "Discovered Admin menu (locked - requires admin role)"
- **`extractedData`**: Stores discovered areas under special keys
  - Example: `{ "menuItems": "Dashboard,Settings,Admin,Profile" }`
  - Example: `{ "exploredMenus": "Dashboard,Settings" }`
- **`latestNote`**: Tactical memory for exploration strategy

Exploratory prompts guide the agent to use these fields appropriately.
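
To make the `extractedData` convention concrete, here is a minimal sketch of how discovered-versus-explored tracking could be read back from comma-separated values. The helpers `parseList` and `unexplored` are hypothetical and exist only for this illustration; in the plan, the agent itself reasons over these values through the prompt.

```typescript
// extractedData as described above: string keys mapped to comma-separated strings.
type ExtractedData = Record<string, string>;

// Hypothetical helper: split a comma-separated value into trimmed, non-empty names.
function parseList(value: string | undefined): string[] {
  return (value ?? '')
    .split(',')
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Hypothetical helper: items listed under discoveredKey but not yet under exploredKey.
function unexplored(data: ExtractedData, discoveredKey: string, exploredKey: string): string[] {
  const explored = new Set(parseList(data[exploredKey]));
  return parseList(data[discoveredKey]).filter((item) => !explored.has(item));
}

const extractedData: ExtractedData = {
  menuItems: 'Dashboard,Settings,Admin,Profile',
  exploredMenus: 'Dashboard,Settings',
};
const remaining = unexplored(extractedData, 'menuItems', 'exploredMenus');
console.log(remaining); // ['Admin', 'Profile']
```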

---

## Phase 2: Add Exploratory Prompts to Runner-Core

### 2.1 Create Exploratory System Prompt

**File**: `runner-core/src/orchestrator/orchestrator-prompts.ts`

Add new method:

```typescript
static buildExploratorySystemPrompt(toolDescriptions: string): string {
  return `You are an autonomous exploration agent that discovers and tests web application features.

${toolDescriptions}

YOUR RESPONSE FORMAT - Output JSON matching this interface:

interface AgentDecisionLLMResponse {
  status: string;               // "continue" | "complete" | "stuck"
  reasoning: string;            // What you're exploring and why
  commands?: string[];          // Playwright commands to execute
  commandReasoning?: string;    // Why these commands
  toolCalls?: Array<{           // Tools to call (extract_data for menus, etc.)
    name: string;
    params: Record<string, any>;
  }>;
  toolReasoning?: string;
  needsToolResults?: boolean;
  noteToFutureSelf?: string;
  coordinateAction?: { ... };
  experiences?: string[];       // Use for BOTH app patterns AND exploration progress
  blockerDetected?: { ... };
}

EXPLORATION MODE GUIDELINES:

1. **JOURNEY-FOCUSED EXPLORATION**: Follow the exploration prompt as your goal for THIS journey
   - Example prompt: "Explore Dashboard and test all widgets"
   - You should systematically test dashboard widgets, not wander off to other sections
   - Stay focused on the given journey goal

2. **AUTONOMOUS DISCOVERY**: You decide HOW to explore based on:
   - The journey exploration prompt (your specific goal for this journey)
   - Current page state (what's available now)
   - Journey history (what's been explored already in THIS journey)
   - Discovered but unvisited areas (stored in extractedData or experiences)

3. **SYSTEMATIC EXPLORATION**:
   - Use extract_data tool to discover elements (menus, buttons, links, widgets)
   - Store discoveries in extractedData: { "dashboardWidgets": "Sales,Analytics,Tasks,Calendar,Reports" }
   - Track progress in experiences: "Explored Sales widget - chart renders correctly"
   - Check history to avoid re-visiting same areas within this journey
   - Prioritize unexplored areas related to journey goal

4. **CREATIVE TESTING**: Test functionality thoroughly, don't just navigate
   - Try different input combinations
   - Explore edge cases (empty inputs, max lengths, special characters)
   - Verify features work as expected (data loads, actions complete, errors handled)
   - Look for visual bugs, console errors, broken functionality

5. **BLOCKER HANDLING**: Clear obstacles autonomously
   - Cookie modals → dismiss with blockerDetected.clearingCommands
   - Tour popups → close them
   - Login required → use credentials from test data prompt
   - Navigation blockers → clear before continuing exploration

6. **STATUS DECISIONS** (CRITICAL):
   - "continue": More exploration needed to achieve THIS journey's goal
   - "complete": Journey goal ACHIEVED (e.g., "All dashboard widgets tested") OR budget running low
   - "stuck": Cannot proceed (auth permanently blocked, critical error preventing goal)

   **Mark "complete" when**:
   - You've achieved the journey-specific exploration goal
   - You've made good progress and are approaching step budget
   - Further exploration would be repetitive or off-topic for this journey

   **DON'T** wait to hit maxExplorationSteps - complete when journey goal is met!

7. **MEMORY USAGE**:
   - experiences: BOTH app patterns AND exploration notes
     Example: "Settings menu requires admin role to access"
     Example: "Dashboard Analytics widget - tested filtering, all variants work correctly"
   - extractedData: Discovered elements and tracking
     Example: { "discoveredWidgets": "Sales,Analytics,Tasks,Calendar,Reports" }
     Example: { "testedWidgets": "Sales,Analytics" }
   - history: Review to see what actions were taken in this journey
   - noteToFutureSelf: Tactical plans for next iteration

8. **TOOLS FOR EXPLORATION**:
   - extract_data: Discover menus, links, interactive elements
   - take_screenshot: Understand visual layout when DOM unclear
   - recall_history: Check what was already explored in this journey

CRITICAL: You're fully autonomous for THIS journey - no step-by-step instructions provided.
YOU decide the exploration path to meet the journey goal based on: journey prompt, current state, and memory.`;
}
```
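
Because the prompt asks the model for JSON with one of three status values, the caller presumably needs to validate the reply before acting on it. The sketch below shows one way that check could look. `parseDecision`, `ParsedDecision`, and `VALID_STATUSES` are hypothetical names for illustration only; this is not the package's own `decision-parser` module, whose API is not shown in this plan.

```typescript
type DecisionStatus = 'continue' | 'complete' | 'stuck';

// Hypothetical, reduced view of the response: only the fields checked here.
interface ParsedDecision {
  status: DecisionStatus;
  reasoning: string;
  commands: string[];
}

const VALID_STATUSES: readonly DecisionStatus[] = ['continue', 'complete', 'stuck'];

// Hypothetical helper: extract the JSON object from a raw LLM reply and
// validate the fields the exploration loop depends on.
function parseDecision(raw: string): ParsedDecision {
  // Models sometimes surround the JSON with extra text; keep the outermost {...}.
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in LLM response');
  }
  const parsed = JSON.parse(raw.slice(start, end + 1)) as Record<string, unknown>;

  const status = parsed.status;
  if (typeof status !== 'string' || VALID_STATUSES.indexOf(status as DecisionStatus) === -1) {
    throw new Error(`Unexpected status: ${String(status)}`);
  }

  // Drop any non-string entries rather than failing the whole step.
  const commands = Array.isArray(parsed.commands)
    ? parsed.commands.filter((c): c is string => typeof c === 'string')
    : [];

  return {
    status: status as DecisionStatus,
    reasoning: typeof parsed.reasoning === 'string' ? parsed.reasoning : '',
    commands,
  };
}

const sample = parseDecision(
  'Decision: {"status":"complete","reasoning":"All widgets tested","commands":[]}'
);
console.log(sample.status);
```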

### 2.2 Create Exploratory User Prompt

Add method to `OrchestratorPrompts`:

```typescript
static buildExploratoryUserPrompt(
  context: AgentContext,
  explorationPrompt: string,
  testDataPrompt?: string,
  stepNumber?: number,
  maxSteps?: number
): string {
  const parts: string[] = [];

  parts.push('=== JOURNEY EXPLORATION CONTEXT ===\n');
  parts.push(`🎯 THIS JOURNEY'S GOAL: ${explorationPrompt}`);
  parts.push(` (Focus on THIS specific goal - don't wander to unrelated areas)\n`);

  if (testDataPrompt) {
    parts.push(`📋 TEST DATA/CREDENTIALS: ${testDataPrompt}`);
  }

  if (stepNumber && maxSteps) {
    parts.push(`📊 PROGRESS: Step ${stepNumber}/${maxSteps} (you can complete earlier if journey goal met)\n`);
  }

  // Show discovered and tracked data from extractedData
  if (context.extractedData && Object.keys(context.extractedData).length > 0) {
    parts.push(`\n💾 DISCOVERED DATA (this journey):`);
    for (const [key, value] of Object.entries(context.extractedData)) {
      parts.push(` ${key}: ${value}`);
    }
  }

  parts.push(`\nCURRENT PAGE:`);
  parts.push(`URL: ${context.currentURL}`);
  parts.push(`Title: ${context.currentPageInfo.title}`);
  parts.push(`\nINTERACTIVE ELEMENTS (with positions and selectors):`);
  parts.push(context.currentPageInfo.formattedElements);
  parts.push(`\nARIA TREE (hierarchical structure):`);
  // Serialize once so the truncation notice is based on the same string that is
  // actually truncated (JSON.stringify returns undefined for an undefined snapshot)
  const ariaJson = JSON.stringify(context.currentPageInfo.ariaSnapshot, null, 2) ?? '';
  parts.push(ariaJson.substring(0, 5000));
  if (ariaJson.length > 5000) {
    parts.push('... (truncated)');
  }

  // Recent actions
  if (context.recentSteps.length > 0) {
    parts.push(`\nRECENT ACTIONS (last ${context.recentSteps.length}):`);
    for (const step of context.recentSteps) {
      const status = step.result === 'success' ? '✓' : '✗';
      parts.push(` ${status} ${step.action}`);
      parts.push(` ${step.observation}`);
    }
  }

  // Learnings and exploration progress
  if (context.experiences && context.experiences.length > 0) {
    parts.push(`\nEXPLORATION NOTES & APP PATTERNS:`);
    for (const exp of context.experiences) {
      parts.push(` • ${exp}`);
    }
  }

  // Note from previous iteration
  if (context.noteFromPreviousIteration) {
    parts.push(`\n📝 YOUR NOTE FROM LAST ITERATION:`);
    parts.push(` ${context.noteFromPreviousIteration.content}`);
  }

  parts.push(`\n🤔 DECIDE YOUR NEXT ACTION TO ACHIEVE THIS JOURNEY'S GOAL:`);
  parts.push(`1. What does THIS journey's goal require?`);
  parts.push(`2. What's available on the current page relevant to the goal?`);
  parts.push(`3. What have you already explored in this journey? (check history, experiences, extractedData)`);
  parts.push(`4. What should you explore/test next?`);
  parts.push(`5. Is THIS journey's goal achieved? If yes, mark status="complete"`);

  return parts.join('\n');
}
```

---

## Phase 3: Add Exploration Execution to OrchestratorAgent

### 3.1 Add executeExploration Method

**File**: `runner-core/src/orchestrator/orchestrator-agent.ts`

Add method that runs one journey's exploration loop:

```typescript
/**
 * Execute exploration mode - agent autonomously explores to achieve journey goal
 * Fires onStepProgress callbacks for each autonomous action (transparent to caller)
 */
async executeExploration(
  page: any,
  explorationConfig: ExplorationMode,
  jobId: string
): Promise<OrchestratorStepResult> {
  this.logger?.(`\n[Orchestrator] ========== EXPLORATION MODE ==========`);
  this.logger?.(`[Orchestrator] 🎯 Journey Goal: ${explorationConfig.explorationPrompt}`);
  if (explorationConfig.testDataPrompt) {
    this.logger?.(`[Orchestrator] 📋 Test Data: ${explorationConfig.testDataPrompt}`);
  }

  const memory: JourneyMemory = {
    history: [],
    experiences: [],
    extractedData: {}
  };

  const maxSteps = explorationConfig.maxExplorationSteps || 50;
  let stepNumber = 0;
  const commandsExecuted: string[] = [];

  while (stepNumber < maxSteps) {
    stepNumber++;

    this.logger?.(`\n[Orchestrator] === Exploration Step ${stepNumber}/${maxSteps} ===`);

    // Build exploratory context
    const context = await this.buildExploratoryContext(
      page,
      explorationConfig.explorationPrompt,
      explorationConfig.testDataPrompt,
      memory,
      stepNumber,
      maxSteps
    );

    // Call agent with exploratory prompt (test data is forwarded so credentials reach the LLM)
    const decision = await this.callExploratoryAgent(
      context,
      jobId,
      stepNumber,
      explorationConfig.testDataPrompt
    );

    this.logAgentDecision(decision, stepNumber);

    // Generate the stepId once so the IN_PROGRESS and SUCCESS/FAILED reports share it
    const stepId = `exploration-${stepNumber}-${Date.now()}`;

    // Report step start (CRITICAL - fires JourneyRunner's beforeStepStart callback)
    if (this.progressReporter?.onStepProgress) {
      // Note: We fire onStepProgress with IN_PROGRESS first, then again with SUCCESS/FAILED
      // This allows journey-runner to call beforeStepStart, then onStepComplete
      // The stepId format allows journey-runner to track these as autonomous exploration steps
      const stepInfo = {
        stepNumber,
        stepId,
        description: decision.reasoning,
        code: '', // Will be filled after commands execute
        status: StepExecutionStatus.IN_PROGRESS,
        wasRepaired: false
      };
      await this.progressReporter.onStepProgress(stepInfo);
    }

    // Execute tools if requested
    if (decision.toolCalls && decision.toolCalls.length > 0) {
      const toolResults = await this.executeTools(decision.toolCalls, page, memory, stepNumber);

      // If needs tool results, call agent again
      if (decision.needsToolResults) {
        const updatedContext = { ...context, toolResults };
        const continuedDecision = await this.callExploratoryAgent(
          updatedContext,
          jobId,
          stepNumber,
          explorationConfig.testDataPrompt
        );

        decision.commands = continuedDecision.commands || decision.commands;
        decision.commandReasoning = continuedDecision.commandReasoning || decision.commandReasoning;
        decision.status = continuedDecision.status;
      }
    }

    // Handle blocker clearing
    if (decision.blockerDetected && decision.blockerDetected.clearingCommands) {
      this.logger?.(`[Orchestrator] 🚧 Clearing blocker: ${decision.blockerDetected.description}`);
      const blockerResult = await this.executeCommandsSequentially(
        decision.blockerDetected.clearingCommands,
        page,
        memory,
        stepNumber,
        1,
        jobId
      );
      commandsExecuted.push(...blockerResult.executed);
    }

    // Execute exploration commands
    let commandsSucceeded = true;
    if (decision.commands && decision.commands.length > 0) {
      const executeResult = await this.executeCommandsSequentially(
        decision.commands,
        page,
        memory,
        stepNumber,
        1,
        jobId
      );
      commandsExecuted.push(...executeResult.executed);
      commandsSucceeded = executeResult.allSucceeded;
    }

    // Report step completion (CRITICAL - fires JourneyRunner's onStepComplete callback)
    if (this.progressReporter?.onStepProgress) {
      const stepInfo = {
        stepNumber,
        stepId,
        description: decision.reasoning,
        code: decision.commands?.join('\n') || '',
        status: commandsSucceeded ? StepExecutionStatus.SUCCESS : StepExecutionStatus.FAILED,
        error: commandsSucceeded ? undefined : 'Command execution failed',
        wasRepaired: false
      };
      await this.progressReporter.onStepProgress(stepInfo);
    }

    // Add experiences (both app patterns AND exploration progress)
    if (decision.experiences) {
      memory.experiences.push(...decision.experiences);
      if (memory.experiences.length > this.config.maxExperiences) {
        memory.experiences = memory.experiences.slice(-this.config.maxExperiences);
      }
    }

    // Store note for next iteration
    if (decision.noteToFutureSelf) {
      memory.latestNote = {
        fromIteration: stepNumber,
        content: decision.noteToFutureSelf
      };
    }

    // Check termination
    if (decision.status === 'complete') {
      this.logger?.(`[Orchestrator] ✅ Journey exploration complete: ${decision.statusReasoning}`);
      return {
        success: true,
        commands: commandsExecuted,
        iterations: stepNumber,
        terminationReason: 'complete',
        memory
      };
    } else if (decision.status === 'stuck') {
      this.logger?.(`[Orchestrator] ❌ Exploration stuck: ${decision.statusReasoning}`);
      return {
        success: false,
        commands: commandsExecuted,
        iterations: stepNumber,
        terminationReason: 'agent_stuck',
        memory,
        error: decision.statusReasoning
      };
    }
  }

  // Hit max steps - not necessarily a failure
  this.logger?.(`[Orchestrator] ⚠ Maximum exploration steps reached (budget limit)`);
  return {
    success: true, // Not a failure - just budget limit
    commands: commandsExecuted,
    iterations: stepNumber,
    terminationReason: 'system_limit',
    memory
  };
}

private async buildExploratoryContext(
  page: any,
  explorationPrompt: string,
  testDataPrompt: string | undefined, // Not stored on the context; forwarded via callExploratoryAgent
  memory: JourneyMemory,
  stepNumber: number,
  maxSteps: number
): Promise<AgentContext> {
  const currentPageInfo = await getEnhancedPageInfo(page);
  const currentURL = page.url();
  const recentSteps = memory.history.slice(-this.config.recentStepsCount);

  return {
    overallGoal: explorationPrompt,
    currentStepGoal: explorationPrompt, // Same as overall for single journey
    stepNumber,
    totalSteps: maxSteps,
    completedSteps: [],
    remainingSteps: [],
    currentPageInfo,
    currentURL,
    recentSteps,
    experiences: memory.experiences,
    extractedData: memory.extractedData,
    noteFromPreviousIteration: memory.latestNote
  };
}

private async callExploratoryAgent(
  context: AgentContext,
  jobId: string,
  stepNumber: number,
  testDataPrompt?: string
): Promise<AgentDecision> {
  const toolDescriptions = this.toolRegistry.generateToolDescriptions();
  const systemPrompt = OrchestratorPrompts.buildExploratorySystemPrompt(toolDescriptions);
  const userPrompt = OrchestratorPrompts.buildExploratoryUserPrompt(
    context,
    context.overallGoal,
    testDataPrompt, // Without this, test data/credentials never appear in the user prompt
    stepNumber,
    context.totalSteps
  );

  const llmRequest = {
    model: DEFAULT_MODEL,
    systemPrompt,
    userPrompt
  };

  const response = await this.llmFacade.llmProvider.callLLM(llmRequest);

  // Report token usage
  if (response.usage && this.progressReporter?.onTokensUsed) {
    await this.progressReporter.onTokensUsed({
      jobId,
      stepNumber,
      iteration: 1,
      inputTokens: response.usage.inputTokens,
      outputTokens: response.usage.outputTokens,
      includesImage: false,
      model: DEFAULT_MODEL,
      timestamp: Date.now()
    });
  }

  // Parse response (same JSON format as regular mode)
  const decision = this.parseAgentDecision(response.content);
  return decision;
}
```
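
The loop above reports each step twice through `onStepProgress`: once as IN_PROGRESS and once with a terminal status. As a hedged sketch of the consuming side, the adapter below turns those two reports into `beforeStepStart` and `onStepComplete` calls. `StepCallbacks`, `StepInfo`, and `createStepProgressAdapter` are hypothetical stand-ins written for this example, and pairing by `stepNumber` is an assumption; the real JourneyRunner wiring is not shown in this plan.

```typescript
// Local stand-in for the plan's status enum; only the values used here.
enum StepExecutionStatus {
  IN_PROGRESS = 'IN_PROGRESS',
  SUCCESS = 'SUCCESS',
  FAILED = 'FAILED',
}

// Local stand-in for the step report shape built in executeExploration.
interface StepInfo {
  stepNumber: number;
  stepId: string;
  description: string;
  code: string;
  status: StepExecutionStatus;
  error?: string;
}

// Hypothetical consumer callbacks, named after the ones the plan mentions.
interface StepCallbacks {
  beforeStepStart(step: StepInfo): Promise<void>;
  onStepComplete(step: StepInfo): Promise<void>;
}

// Hypothetical adapter: pairs the start and terminal reports of a step by
// stepNumber, which both reports carry.
function createStepProgressAdapter(callbacks: StepCallbacks) {
  const started = new Set<number>();
  return {
    async onStepProgress(step: StepInfo): Promise<void> {
      if (step.status === StepExecutionStatus.IN_PROGRESS) {
        started.add(step.stepNumber);
        await callbacks.beforeStepStart(step);
        return;
      }
      // Terminal report: make sure beforeStepStart ran even if the start report was missed.
      if (!started.has(step.stepNumber)) {
        await callbacks.beforeStepStart(step);
      }
      started.delete(step.stepNumber);
      await callbacks.onStepComplete(step);
    },
  };
}

// Usage: record the order in which callbacks fire for one step.
const calls: string[] = [];
const adapter = createStepProgressAdapter({
  beforeStepStart: async (s) => { calls.push(`start:${s.stepNumber}`); },
  onStepComplete: async (s) => { calls.push(`done:${s.stepNumber}:${s.status}`); },
});
const base = { stepNumber: 1, stepId: 'exploration-1', description: 'Open Sales widget', code: '' };
const finished = adapter
  .onStepProgress({ ...base, status: StepExecutionStatus.IN_PROGRESS })
  .then(() => adapter.onStepProgress({ ...base, status: StepExecutionStatus.SUCCESS }));
```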

---

## Phase 4: Extend SmartTestRunnerCoreV2

### 4.1 Add Exploration Mode Support

**File**: `scriptservice/smart-test-runner-core-v2.ts`

Add `journeyExplorationMode` to the config:

```typescript
export interface RunnerConfig {
  playwrightConfig?: string;
  model?: string;
  repairFlexibility?: number;
  callbacks?: RunnerLifecycleCallbacks;
  page?: Page;
  browser?: Browser;
  context?: BrowserContext;
  maxRetriesPerStep?: number;
  journeyExplorationMode?: { // NEW
    enabled: boolean;
    journeyFocusPrompt: string;
    testDataPrompt?: string;
    maxExplorationSteps?: number;
  };
}
```
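
Note that the runner-level field is `journeyFocusPrompt`, while runner-core's `ExplorationMode` calls it `explorationPrompt`. A small mapping function is one way to keep that rename in a single place; the sketch below assumes the two shapes shown in this plan, and `toExplorationMode` is a hypothetical name introduced only for illustration.

```typescript
// Local copies of the two config shapes from the plan, so the sketch is self-contained.
interface ExplorationMode {
  enabled: boolean;
  explorationPrompt: string;
  testDataPrompt?: string;
  maxExplorationSteps?: number;
}

interface JourneyExplorationMode {
  enabled: boolean;
  journeyFocusPrompt: string;
  testDataPrompt?: string;
  maxExplorationSteps?: number;
}

// Hypothetical helper: translate the runner-level config into runner-core's shape,
// applying the default step budget of 50 when none is given.
function toExplorationMode(cfg: JourneyExplorationMode): ExplorationMode {
  return {
    enabled: cfg.enabled,
    explorationPrompt: cfg.journeyFocusPrompt,
    testDataPrompt: cfg.testDataPrompt,
    maxExplorationSteps: cfg.maxExplorationSteps ?? 50,
  };
}

const coreConfig = toExplorationMode({
  enabled: true,
  journeyFocusPrompt: 'Explore Dashboard and test all widgets',
  maxExplorationSteps: 20,
});
console.log(coreConfig.explorationPrompt);
```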

Add exploration execution method:

```typescript
/**
 * Run exploration mode - agent autonomously explores to achieve journey goal
 * Delegates to runner-core's orchestrator which fires onStepComplete for each action
 */
async runJourneyExploration(jobId?: string): Promise<RunExactlyResult> {
  try {
    logger.info(`SmartTestRunnerCoreV2.runJourneyExploration: Starting`);

    if (!this.config.journeyExplorationMode?.enabled) {
      throw new Error('Journey exploration mode not enabled in config');
    }

    // Call beforeStartTest
    if (this.config.callbacks?.beforeStartTest) {
      await this.config.callbacks.beforeStartTest(this.page, this.browser, this.context);
    }

    // Call runner-core's orchestrator in exploration mode
    // Orchestrator will autonomously explore and fire onStepComplete for each action
    const result = await this.runnerCore.executeExploration(
      this.page,
      {
        enabled: true,
        explorationPrompt: this.config.journeyExplorationMode.journeyFocusPrompt,
        testDataPrompt: this.config.journeyExplorationMode.testDataPrompt,
        maxExplorationSteps: this.config.journeyExplorationMode.maxExplorationSteps || 50
      },
      jobId || ''
    );

    // Call afterEndTest
    if (this.config.callbacks?.afterEndTest) {
      const status = result.success ? 'passed' : 'failed';
      await this.config.callbacks.afterEndTest(status, result.error, this.page);
    }

    return {
      success: result.success,
      error: result.error
    };
  } catch (error: any) {
    logger.error(`SmartTestRunnerCoreV2.runJourneyExploration: Error - ${error.message}`);

    if (this.config.callbacks?.afterEndTest) {
      await this.config.callbacks.afterEndTest('failed', error.message, this.page);
    }
|
|
594
|
+
return {
|
|
595
|
+
success: false,
|
|
596
|
+
error: error.message
|
|
597
|
+
};
|
|
598
|
+
}
|
|
599
|
+
}
|
|
600
|
+
```
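
One detail worth checking in the method above: `afterEndTest` is called inside the `try` block, so an exception thrown by the callback itself would reach the `catch` block and invoke it a second time. A minimal sketch of a wrapper that fires the callback exactly once on either path, assuming hypothetical page-free callback signatures and a hypothetical `runWithLifecycle` name:

```typescript
type TestStatus = 'passed' | 'failed';

interface RunResult {
  success: boolean;
  error?: string;
}

// Hypothetical, simplified versions of the lifecycle callbacks (no Page/Browser arguments).
interface LifecycleCallbacks {
  beforeStartTest?: () => Promise<void>;
  afterEndTest?: (status: TestStatus, error?: string) => Promise<void>;
}

// Runs `execute` between the two callbacks and converts a thrown error into a
// failed result, so afterEndTest is invoked once whether execution succeeds or throws.
async function runWithLifecycle(
  callbacks: LifecycleCallbacks,
  execute: () => Promise<RunResult>
): Promise<RunResult> {
  let result: RunResult;
  try {
    await callbacks.beforeStartTest?.();
    result = await execute();
  } catch (error: unknown) {
    const message = error instanceof Error ? error.message : String(error);
    result = { success: false, error: message };
  }
  // Outside the try block: an error thrown here propagates instead of re-triggering the callback.
  await callbacks.afterEndTest?.(result.success ? 'passed' : 'failed', result.error);
  return result;
}
```

Whether a failing `afterEndTest` should propagate or be swallowed is a design choice the plan does not specify; the sketch lets it propagate.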
|
|
601
|
+
|
|
602
|
+
---
|
|
603
|
+
|
|
604
|
+
## Phase 5: Rename and Extend JourneyRunner (minimal changes)
|
|
605
|
+
|
|
606
|
+
### 5.1 Rename Files and Types
|
|
607
|
+
|
|
608
|
+
**Renames:**
|
|
609
|
+
- `test-based-explorer.ts` → `journey-runner.ts`
|
|
610
|
+
- `TestBasedExplorer` class → `JourneyRunner`
|
|
611
|
+
- `TestBasedExplorationTask` → `JourneyTask`
|
|
612
|
+
|
|
613
|
+
### 5.2 Extend JourneyTask Model
|
|
614
|
+
|
|
615
|
+
**File**: `scriptservice/utils/models.ts`
|
|
616
|
+
|
|
617
|
+
```typescript
|
|
618
|
+
export interface JourneyTask {
|
|
619
|
+
// ... all existing fields
|
|
620
|
+
|
|
621
|
+
// NEW: Journey-specific exploration mode
|
|
622
|
+
journeyExplorationMode?: {
|
|
623
|
+
enabled: boolean;
|
|
624
|
+
journeyFocusPrompt: string; // Journey-specific focus prompt
|
|
625
|
+
testDataPrompt?: string;
|
|
626
|
+
};
|
|
627
|
+
}
|
|
628
|
+
```
|
|
629
|
+
|
|
630
|
+
### 5.3 Update JourneyRunner.run() - Minimal Changes
|
|
631
|
+
|
|
632
|
+
**File**: `scriptservice/workers/journey-runner.ts`
|
|
633
|
+
|
|
634
|
+
Only ~15 lines of changes:
|
|
635
|
+
|
|
636
|
+
```typescript
|
|
637
|
+
// Constructor - store exploration mode
|
|
638
|
+
private journeyExplorationMode?: { enabled: boolean; journeyFocusPrompt: string; testDataPrompt?: string; };
|
|
639
|
+
|
|
640
|
+
constructor(task: JourneyTask, ...callbacks) {
|
|
641
|
+
// ... existing code
|
|
642
|
+
this.journeyExplorationMode = task.journeyExplorationMode;
|
|
643
|
+
}
|
|
644
|
+
|
|
645
|
+
public async run() {
|
|
646
|
+
// ... existing setup code (unchanged)
|
|
647
|
+
|
|
648
|
+
try {
|
|
649
|
+
// Parse test steps OR use empty array for exploration
|
|
650
|
+
let codeUnits: CodeUnit[] = [];
|
|
651
|
+
if (!this.journeyExplorationMode?.enabled) {
|
|
652
|
+
// TEST-BASED mode: Parse smart test into code units
|
|
653
|
+
if (!this.testId) {
|
|
654
|
+
logger.error('No testId provided');
|
|
655
|
+
return { result: null };
|
|
656
|
+
}
|
|
657
|
+
codeUnits = await this.parseSmartTestIntoCodeUnits();
|
|
658
|
+
if (codeUnits.length === 0) {
|
|
659
|
+
logger.error(`No codeUnits found for test ${this.testId}`);
|
|
660
|
+
return { result: null };
|
|
661
|
+
}
|
|
662
|
+
codeUnits = codeUnits.filter(cu => !isNonActionCodeUnit(cu));
|
|
663
|
+
logger.info('Running TEST-BASED journey');
|
|
664
|
+
} else {
|
|
665
|
+
// EXPLORATION mode: No pre-defined steps
|
|
666
|
+
logger.info(`Running PROMPT-BASED journey with focus: ${this.journeyExplorationMode.journeyFocusPrompt}`);
|
|
667
|
+
}
|
|
668
|
+
|
|
669
|
+
const stepsWithId: TestStepWithId[] = codeUnits.map(cu => ({ ... }));
|
|
670
|
+
|
|
671
|
+
// Define callbacks (IDENTICAL FOR BOTH MODES - all existing code)
|
|
672
|
+
const handleBeforeStartTest = async (...) => { /* existing code */ };
|
|
673
|
+
const handleBeforeStepStart = async (...) => { /* existing code */ };
|
|
674
|
+
const handleStepComplete = async (...) => { /* existing code - ALL analytics */ };
|
|
675
|
+
const handleAfterEndTest = async (...) => { /* existing code */ };
|
|
676
|
+
|
|
677
|
+
// Create runner config (SAME FOR BOTH MODES)
|
|
678
|
+
const runnerConfig: RunnerConfig = {
|
|
679
|
+
page: this.page!,
|
|
680
|
+
browser: this.browser!,
|
|
681
|
+
context: this.context!,
|
|
682
|
+
maxRetriesPerStep: 3,
|
|
683
|
+
repairFlexibility: this.aiHealingSettings?.freedomLevel,
|
|
684
|
+
journeyExplorationMode: this.journeyExplorationMode, // NEW: Pass through
|
|
685
|
+
callbacks: { ... } // Same callbacks
|
|
686
|
+
};
|
|
687
|
+
|
|
688
|
+
// Create runner (SAME FOR BOTH MODES)
|
|
689
|
+
const runner = new SmartTestRunnerCoreV2(runnerConfig);
|
|
690
|
+
|
|
691
|
+
// Execute based on mode
|
|
692
|
+
let result;
|
|
693
|
+
if (this.journeyExplorationMode?.enabled) {
|
|
694
|
+
// EXPLORATION: Orchestrator decides next steps autonomously
|
|
695
|
+
result = await runner.runJourneyExploration(this.invocationId);
|
|
696
|
+
} else {
|
|
697
|
+
// TEST-BASED: Execute predefined steps with repair
|
|
698
|
+
result = await runner.runWithRepair(stepsWithId, this.invocationId, false, 0);
|
|
699
|
+
}
|
|
700
|
+
|
|
701
|
+
if (!result.success) {
|
|
702
|
+
throw new Error(`Execution failed: ${result.error}`);
|
|
703
|
+
}
|
|
704
|
+
|
|
705
|
+
} catch (error) {
|
|
706
|
+
// ... existing error handling
|
|
707
|
+
} finally {
|
|
708
|
+
// ... existing cleanup
|
|
709
|
+
}
|
|
710
|
+
}
|
|
711
|
+
```
|
|
712
|
+
|
|
713
|
+
**That's it!** Only about 15 lines change:
|
|
714
|
+
1. Store `journeyExplorationMode` in constructor
|
|
715
|
+
2. Skip step parsing if exploration mode enabled
|
|
716
|
+
3. Pass `journeyExplorationMode` to `RunnerConfig`
|
|
717
|
+
4. Call `runJourneyExploration()` vs `runWithRepair()`
|
|
718
|
+
|
|
719
|
+
All callbacks are unchanged; they fire identically in both modes.
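
The claim that analytics stay mode-agnostic can be illustrated with a small sketch. `FakeRunner` below is a hypothetical stand-in for `SmartTestRunnerCoreV2`, with simplified signatures and two hard-coded exploration actions; only the choice of entry point differs between modes:

```typescript
interface StepCallbacks {
  onStepComplete: (description: string) => void;
}

// Hypothetical stand-in for the real runner: both entry points report each
// action through the same callback.
class FakeRunner {
  private readonly callbacks: StepCallbacks;

  constructor(callbacks: StepCallbacks) {
    this.callbacks = callbacks;
  }

  async runWithRepair(steps: string[]): Promise<{ success: boolean }> {
    for (const step of steps) this.callbacks.onStepComplete(step);
    return { success: true };
  }

  async runJourneyExploration(): Promise<{ success: boolean }> {
    // A real orchestrator would choose actions itself; two are hard-coded here.
    for (const action of ['open menu', 'click Settings']) this.callbacks.onStepComplete(action);
    return { success: true };
  }
}

// The only mode-specific decision: which entry point to call.
async function execute(
  runner: FakeRunner,
  explorationEnabled: boolean,
  steps: string[]
): Promise<{ success: boolean }> {
  return explorationEnabled ? runner.runJourneyExploration() : runner.runWithRepair(steps);
}
```

Because the callback object is built once and handed to the runner before the mode branch, downstream consumers cannot tell which mode produced a step.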
|
|
720
|
+
|
|
721
|
+
---
|
|
722
|
+
|
|
723
|
+
## Phase 6: Extend AppExplorer for Prompt-Based Journey Orchestration
|
|
724
|
+
|
|
725
|
+
### 6.1 Add Journey-Prompt Generation
|
|
726
|
+
|
|
727
|
+
**File**: `scriptservice/workers/app-explorer.ts`
|
|
728
|
+
|
|
729
|
+
Modify `run()` around line 154:
|
|
730
|
+
|
|
731
|
+
```typescript
|
|
732
|
+
if (this.config.promptConfig) {
|
|
733
|
+
// Prompt-based exploration - orchestrate multiple journeys
|
|
734
|
+
logger.info("Running prompt-based exploration (multi-journey)");
|
|
735
|
+
await this.runPromptBasedMultiJourneyExploration();
|
|
736
|
+
}
|
|
737
|
+
```
|
|
738
|
+
|
|
739
|
+
Add new method:
|
|
740
|
+
|
|
741
|
+
```typescript
|
|
742
|
+
/**
|
|
743
|
+
* Prompt-based exploration: Orchestrate multiple journeys
|
|
744
|
+
* Uses LLM to generate journey-specific prompts based on overall goal and learnings
|
|
745
|
+
*/
|
|
746
|
+
private async runPromptBasedMultiJourneyExploration(): Promise<void> {
|
|
747
|
+
const promptConfig = this.config.promptConfig!;
|
|
748
|
+
const overallExplorationGoal = promptConfig.explorePrompt || "Explore the application";
|
|
749
|
+
const testDataPrompt = promptConfig.testDataPrompt;
|
|
750
|
+
|
|
751
|
+
logger.info(`Overall exploration goal: ${overallExplorationGoal}`);
|
|
752
|
+
logger.info(`Max journeys: ${this.config.maxJourneys || 10}, Max credits: ${this.config.maxCredits || 100}`); // log the effective limits, not possibly-undefined config values
|
|
753
|
+
|
|
754
|
+
let journeysCompleted = 0;
|
|
755
|
+
const maxJourneys = this.config.maxJourneys || 10;
|
|
756
|
+
|
|
757
|
+
while (journeysCompleted < maxJourneys && this.creditsUsed < (this.config.maxCredits || 100)) {
|
|
758
|
+
journeysCompleted++;
|
|
759
|
+
|
|
760
|
+
logger.info(`\n=== Journey ${journeysCompleted}/${maxJourneys} ===`);
|
|
761
|
+
|
|
762
|
+
// Use LLM to decide next journey focus based on:
|
|
763
|
+
// - Overall exploration goal
|
|
764
|
+
// - Mindmap (what's been explored via getExploredScreenStates())
|
|
765
|
+
// - Learnings from previous journeys
|
|
766
|
+
const journeyFocusPrompt = await this.generateNextJourneyPrompt(
|
|
767
|
+
overallExplorationGoal,
|
|
768
|
+
journeysCompleted,
|
|
769
|
+
maxJourneys
|
|
770
|
+
);
|
|
771
|
+
|
|
772
|
+
if (!journeyFocusPrompt) {
|
|
773
|
+
logger.info('No more journeys needed - exploration goal met');
|
|
774
|
+
journeysCompleted--; // this journey never ran; keep the final count accurate
break;
|
|
775
|
+
}
|
|
776
|
+
|
|
777
|
+
logger.info(`Journey ${journeysCompleted} focus: ${journeyFocusPrompt}`);
|
|
778
|
+
|
|
779
|
+
// Run this journey
|
|
780
|
+
await this.runSinglePromptBasedJourney(journeyFocusPrompt, testDataPrompt);
|
|
781
|
+
|
|
782
|
+
// Update mindmap and progress
|
|
783
|
+
await this.updateMindMapFromJourney();
|
|
784
|
+
}
|
|
785
|
+
|
|
786
|
+
logger.info(`Prompt-based exploration complete: ${journeysCompleted} journeys, ${this.creditsUsed} credits used`);
|
|
787
|
+
}
|
|
788
|
+
|
|
789
|
+
/**
|
|
790
|
+
* Use LLM to generate next journey-specific prompt
|
|
791
|
+
* Based on overall goal, mindmap, and learnings from prior journeys
|
|
792
|
+
*/
|
|
793
|
+
private async generateNextJourneyPrompt(
|
|
794
|
+
overallGoal: string,
|
|
795
|
+
journeyNumber: number,
|
|
796
|
+
maxJourneys: number
|
|
797
|
+
): Promise<string | null> {
|
|
798
|
+
// Get explored screen states from mindmap
|
|
799
|
+
const exploredScreens = this.getExploredScreenStates();
|
|
800
|
+
const unexploredScreens = this.getUnexploredScreenStates();
|
|
801
|
+
|
|
802
|
+
// Build prompt for LLM to decide next journey
|
|
803
|
+
const prompt = `You are planning the next exploration journey for a web application.
|
|
804
|
+
|
|
805
|
+
OVERALL EXPLORATION GOAL: ${overallGoal}
|
|
806
|
+
|
|
807
|
+
PROGRESS SO FAR:
|
|
808
|
+
- Journey ${journeyNumber}/${maxJourneys}
|
|
809
|
+
- Credits used: ${this.creditsUsed}/${this.config.maxCredits}
|
|
810
|
+
- Explored screens: ${exploredScreens.map(s => s.name).join(', ') || 'None yet'}
|
|
811
|
+
- Unexplored screens: ${unexploredScreens.map(s => s.name).join(', ') || 'None discovered'}
|
|
812
|
+
|
|
813
|
+
LEARNINGS FROM PREVIOUS JOURNEYS:
|
|
814
|
+
${this.getJourneyLearnings()}
|
|
815
|
+
|
|
816
|
+
YOUR TASK: Decide what to explore in the NEXT journey.
|
|
817
|
+
- Generate a specific, focused exploration prompt for ONE journey
|
|
818
|
+
- Example: "Explore Dashboard and test all widgets (Sales, Analytics, Tasks)"
|
|
819
|
+
- Example: "Navigate to Settings and test user profile management"
|
|
820
|
+
- If the overall goal is largely met or there is no clear next journey, return an empty string
|
|
821
|
+
|
|
822
|
+
OUTPUT FORMAT (JSON):
|
|
823
|
+
{
|
|
824
|
+
"journeyFocusPrompt": "Specific prompt for next journey" OR "",
|
|
825
|
+
"reasoning": "Why this journey focus"
|
|
826
|
+
}`;
|
|
827
|
+
|
|
828
|
+
try {
|
|
829
|
+
const result = await callLLM(prompt);
|
|
830
|
+
const parsed = JSON.parse(result);
|
|
831
|
+
|
|
832
|
+
if (!parsed.journeyFocusPrompt || parsed.journeyFocusPrompt.trim() === '') {
|
|
833
|
+
return null;
|
|
834
|
+
}
|
|
835
|
+
|
|
836
|
+
logger.info(`Next journey reasoning: ${parsed.reasoning}`);
|
|
837
|
+
return parsed.journeyFocusPrompt;
|
|
838
|
+
} catch (error) {
|
|
839
|
+
logger.error(`Failed to generate journey prompt: ${error}`);
|
|
840
|
+
return null;
|
|
841
|
+
}
|
|
842
|
+
}
|
|
843
|
+
|
|
844
|
+
/**
|
|
845
|
+
* Execute a single prompt-based journey with given focus
|
|
846
|
+
*/
|
|
847
|
+
private async runSinglePromptBasedJourney(
|
|
848
|
+
journeyFocusPrompt: string,
|
|
849
|
+
testDataPrompt?: string
|
|
850
|
+
): Promise<void> {
|
|
851
|
+
// Create journey task
|
|
852
|
+
const task: JourneyTask = {
|
|
853
|
+
invocationId: uuidv4(),
|
|
854
|
+
invocationBatchId: this.explorationId,
|
|
855
|
+
projectId: this.task.projectId,
|
|
856
|
+
appReleaseId: this.config.webappReleaseVersion ?? "default",
|
|
857
|
+
autohealEnabled: false,
|
|
858
|
+
sessionRecordApiKey: this.task.sessionRecordApiKey,
|
|
859
|
+
ingressEndpoint: await ConfigService.get('test_service_endpoint'),
|
|
860
|
+
enableTestchimpSdkOnExec: false,
|
|
861
|
+
urlRegexToCapture: ".*",
|
|
862
|
+
playwrightConfig: this.config.playwrightConfig,
|
|
863
|
+
creditBudget: this.getCreditsForNextJourney(),
|
|
864
|
+
bugCaptureSettings: this.config.bugCaptureSettings,
|
|
865
|
+
viewportConfig: this.config.viewportConfig,
|
|
866
|
+
|
|
867
|
+
// Enable journey exploration mode with focused prompt
|
|
868
|
+
journeyExplorationMode: {
|
|
869
|
+
enabled: true,
|
|
870
|
+
journeyFocusPrompt, // Journey-specific focus
|
|
871
|
+
testDataPrompt
|
|
872
|
+
}
|
|
873
|
+
};
|
|
874
|
+
|
|
875
|
+
// Use JourneyRunner (handles both test-based and prompt-based)
|
|
876
|
+
const runner = new JourneyRunner(
|
|
877
|
+
task,
|
|
878
|
+
this.onBugsDiscovered.bind(this),
|
|
879
|
+
this.onScreenStateVisited.bind(this),
|
|
880
|
+
this.shouldVisualAnalyzeScreen?.bind(this),
|
|
881
|
+
this.getVisualAnalysisDecision?.bind(this),
|
|
882
|
+
this.shouldAnalyzeApiCall?.bind(this),
|
|
883
|
+
this.reportVisualAnalyzedScreen?.bind(this),
|
|
884
|
+
this.reportVisualAnalyzedScreenshot?.bind(this),
|
|
885
|
+
this.reportAnalyzedApiCalls?.bind(this),
|
|
886
|
+
this.getKnownScreenStates?.bind(this),
|
|
887
|
+
this.updateInputValueCombinations?.bind(this),
|
|
888
|
+
this.reportPageNavigation?.bind(this),
|
|
889
|
+
this.onTokensUsed?.bind(this),
|
|
890
|
+
this.updateLatestScreenshot?.bind(this)
|
|
891
|
+
);
|
|
892
|
+
|
|
893
|
+
await runner.run();
|
|
894
|
+
}
|
|
895
|
+
|
|
896
|
+
// Helper methods
|
|
897
|
+
private getExploredScreenStates(): ScreenState[] {
|
|
898
|
+
// Return screens that have been explored (from mindmap)
|
|
899
|
+
// Implementation depends on existing mindmap structure
|
|
900
|
+
return this.screenStateRegistry.filter(s => s.explorationStatus === 'EXPLORED');
|
|
901
|
+
}
|
|
902
|
+
|
|
903
|
+
private getUnexploredScreenStates(): ScreenState[] {
|
|
904
|
+
return this.screenStateRegistry.filter(s => s.explorationStatus === 'NOT_EXPLORED');
|
|
905
|
+
}
|
|
906
|
+
|
|
907
|
+
private getJourneyLearnings(): string {
|
|
908
|
+
// Aggregate learnings from completed journeys
|
|
909
|
+
// Could be stored in AppExplorer state as journeys complete
|
|
910
|
+
return this.journeyLearnings.join('\n') || 'No learnings yet';
|
|
911
|
+
}
|
|
912
|
+
|
|
913
|
+
private async updateMindMapFromJourney(): Promise<void> {
|
|
914
|
+
// Update mindmap based on journey results
|
|
915
|
+
// Existing mindmap update logic can be reused
|
|
916
|
+
}
|
|
917
|
+
```
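
In `generateNextJourneyPrompt` above, a parse failure and an intentionally empty `journeyFocusPrompt` both return `null`, so the loop logs "exploration goal met" even when the LLM call or `JSON.parse` failed. A minimal sketch of a parser that keeps the two cases apart, assuming a hypothetical `parseJourneyPlan` helper and assuming the model may wrap its JSON in a Markdown fence:

```typescript
interface JourneyPlan {
  journeyFocusPrompt: string;
  reasoning?: string;
}

// Hypothetical helper: returns the plan, or null when the model signals that
// no further journey is needed. Throws on malformed output so the caller can
// tell "done" apart from "failed".
function parseJourneyPlan(raw: string): JourneyPlan | null {
  // Strip one surrounding Markdown fence if present (three backticks, optional "json" tag).
  const fence = '`'.repeat(3);
  const unfenced = raw
    .trim()
    .replace(new RegExp('^' + fence + '(?:json)?\\s*', 'i'), '')
    .replace(new RegExp('\\s*' + fence + '$'), '');

  const parsed: unknown = JSON.parse(unfenced); // throws SyntaxError on invalid JSON
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('Journey plan must be a JSON object');
  }
  const { journeyFocusPrompt, reasoning } = parsed as Record<string, unknown>;
  if (typeof journeyFocusPrompt !== 'string') {
    throw new Error('journeyFocusPrompt must be a string');
  }
  const focus = journeyFocusPrompt.trim();
  if (focus === '') return null; // explicit "no more journeys" signal
  return {
    journeyFocusPrompt: focus,
    reasoning: typeof reasoning === 'string' ? reasoning : undefined,
  };
}
```

With this split, the caller could retry or abort on a thrown error and reserve the "goal met" log line for a genuine `null`.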
|
|
918
|
+
|
|
919
|
+
---
|
|
920
|
+
|
|
921
|
+
## Summary
|
|
922
|
+
|
|
923
|
+
This plan enables prompt-based exploration with proper multi-journey orchestration:
|
|
924
|
+
|
|
925
|
+
1. **Runner-Core Extensions**:
|
|
926
|
+
- Add `ExplorationMode` type (journey-specific)
|
|
927
|
+
- Reuse existing `JourneyMemory` fields
|
|
928
|
+
- Add exploratory prompts to guide autonomous exploration
|
|
929
|
+
- Add `executeExploration` to orchestrator
|
|
930
|
+
|
|
931
|
+
2. **JourneyRunner** (formerly `TestBasedExplorer`; minimal changes, ~15 lines):
|
|
932
|
+
- Rename for clarity
|
|
933
|
+
- Accept `journeyExplorationMode` config
|
|
934
|
+
- Skip step parsing if exploration mode
|
|
935
|
+
- Call `runJourneyExploration()` vs `runWithRepair()`
|
|
936
|
+
- **All callbacks identical** - transparent to analytics
|
|
937
|
+
|
|
938
|
+
3. **AppExplorer** (prompt-based multi-journey orchestration - NEW):
|
|
939
|
+
- Loop through journeys until `maxJourneys` or `maxCredits` is reached
|
|
940
|
+
- Use LLM to generate journey-specific prompts based on:
|
|
941
|
+
- Overall exploration goal
|
|
942
|
+
- Mindmap (explored vs unexplored screens)
|
|
943
|
+
- Learnings from prior journeys
|
|
944
|
+
- Run each journey via JourneyRunner with focused prompt
|
|
945
|
+
- Update mindmap and learnings after each journey
|
|
946
|
+
|
|
947
|
+
4. **Key Insight**:
|
|
948
|
+
- **AppExplorer** = Creative journey planning (WHAT to explore next)
|
|
949
|
+
- **JourneyRunner** = Journey execution (HOW to explore it)
|
|
950
|
+
- **Runner-Core Orchestrator** = Autonomous action decisions within journey
|
|
951
|
+
|
|
952
|
+
All analytics, bug capture, and screen state detection work identically in both test-based and prompt-based modes.
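
The AppExplorer loop summarized in item 3 can be reduced to a small budgeted loop. This is a sketch under stated assumptions: `planNext` and `runJourney` are hypothetical hooks standing in for the LLM planner and `JourneyRunner`, and `runJourney` is assumed to resolve to the credits it spent:

```typescript
interface Budget {
  maxJourneys: number;
  maxCredits: number;
}

// Hypothetical hooks standing in for the LLM planner and JourneyRunner.
interface ExplorationHooks {
  planNext: (journeyNumber: number) => Promise<string | null>;
  runJourney: (focusPrompt: string) => Promise<number>; // resolves to credits spent
}

// Runs journeys until the planner returns null or a budget is exhausted.
// Returns the number of journeys that actually ran and the credits used.
async function exploreWithinBudget(
  budget: Budget,
  hooks: ExplorationHooks
): Promise<{ journeysRun: number; creditsUsed: number }> {
  let journeysRun = 0;
  let creditsUsed = 0;
  while (journeysRun < budget.maxJourneys && creditsUsed < budget.maxCredits) {
    const focus = await hooks.planNext(journeysRun + 1);
    if (focus === null) break; // planner sees nothing left to explore
    const spent = await hooks.runJourney(focus);
    creditsUsed += spent;
    journeysRun++; // count only journeys that ran
  }
  return { journeysRun, creditsUsed };
}
```

Note that the credit check happens before each journey starts, so the final journey can overshoot `maxCredits`; a per-journey `creditBudget`, as passed to `JourneyTask` in Phase 6, is one way to bound that.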
|
|
953
|
+
|