testchimp-runner-core 0.0.35 → 0.0.37

Files changed (81)
  1. package/dist/orchestrator/orchestrator-agent.d.ts.map +1 -1
  2. package/dist/orchestrator/orchestrator-agent.js +7 -4
  3. package/dist/orchestrator/orchestrator-agent.js.map +1 -1
  4. package/dist/orchestrator/orchestrator-prompts.d.ts.map +1 -1
  5. package/dist/orchestrator/orchestrator-prompts.js +73 -15
  6. package/dist/orchestrator/orchestrator-prompts.js.map +1 -1
  7. package/dist/orchestrator/page-som-handler.d.ts +1 -2
  8. package/dist/orchestrator/page-som-handler.d.ts.map +1 -1
  9. package/dist/orchestrator/page-som-handler.js +51 -25
  10. package/dist/orchestrator/page-som-handler.js.map +1 -1
  11. package/package.json +6 -1
  12. package/plandocs/BEFORE_AFTER_VERIFICATION.md +0 -148
  13. package/plandocs/COORDINATE_MODE_DIAGNOSIS.md +0 -144
  14. package/plandocs/CREDIT_CALLBACK_ARCHITECTURE.md +0 -253
  15. package/plandocs/HUMAN_LIKE_IMPROVEMENTS.md +0 -642
  16. package/plandocs/IMPLEMENTATION_STATUS.md +0 -108
  17. package/plandocs/INTEGRATION_COMPLETE.md +0 -322
  18. package/plandocs/MULTI_AGENT_ARCHITECTURE_REVIEW.md +0 -844
  19. package/plandocs/ORCHESTRATOR_MVP_SUMMARY.md +0 -539
  20. package/plandocs/PHASE1_ABSTRACTION_COMPLETE.md +0 -241
  21. package/plandocs/PHASE1_FINAL_STATUS.md +0 -210
  22. package/plandocs/PHASE_1_COMPLETE.md +0 -165
  23. package/plandocs/PHASE_1_SUMMARY.md +0 -184
  24. package/plandocs/PLANNING_SESSION_SUMMARY.md +0 -372
  25. package/plandocs/PROMPT_OPTIMIZATION_ANALYSIS.md +0 -120
  26. package/plandocs/PROMPT_SANITY_CHECK.md +0 -120
  27. package/plandocs/SCRIPT_CLEANUP_FEATURE.md +0 -201
  28. package/plandocs/SCRIPT_GENERATION_ARCHITECTURE.md +0 -364
  29. package/plandocs/SELECTOR_IMPROVEMENTS.md +0 -139
  30. package/plandocs/SESSION_SUMMARY_v0.0.33.md +0 -151
  31. package/plandocs/TROUBLESHOOTING_SESSION.md +0 -72
  32. package/plandocs/VISION_DIAGNOSTICS_IMPROVEMENTS.md +0 -336
  33. package/plandocs/VISUAL_AGENT_EVOLUTION_PLAN.md +0 -396
  34. package/plandocs/WHATS_NEW_v0.0.33.md +0 -183
  35. package/plandocs/exploratory-mode-support-v2.plan.md +0 -953
  36. package/plandocs/exploratory-mode-support.plan.md +0 -928
  37. package/plandocs/journey-id-tracking-addendum.md +0 -227
  38. package/releasenotes/RELEASE_0.0.26.md +0 -165
  39. package/releasenotes/RELEASE_0.0.27.md +0 -236
  40. package/releasenotes/RELEASE_0.0.28.md +0 -286
  41. package/src/auth-config.ts +0 -84
  42. package/src/credit-usage-service.ts +0 -188
  43. package/src/env-loader.ts +0 -103
  44. package/src/execution-service.ts +0 -996
  45. package/src/file-handler.ts +0 -104
  46. package/src/index.ts +0 -432
  47. package/src/llm-facade.ts +0 -821
  48. package/src/llm-provider.ts +0 -53
  49. package/src/model-constants.ts +0 -35
  50. package/src/orchestrator/decision-parser.ts +0 -139
  51. package/src/orchestrator/index.ts +0 -58
  52. package/src/orchestrator/orchestrator-agent.ts +0 -1282
  53. package/src/orchestrator/orchestrator-prompts.ts +0 -786
  54. package/src/orchestrator/page-som-handler.ts +0 -1565
  55. package/src/orchestrator/som-types.ts +0 -188
  56. package/src/orchestrator/tool-registry.ts +0 -184
  57. package/src/orchestrator/tools/check-page-ready.ts +0 -75
  58. package/src/orchestrator/tools/extract-data.ts +0 -92
  59. package/src/orchestrator/tools/index.ts +0 -15
  60. package/src/orchestrator/tools/inspect-page.ts +0 -42
  61. package/src/orchestrator/tools/recall-history.ts +0 -72
  62. package/src/orchestrator/tools/refresh-som-markers.ts +0 -69
  63. package/src/orchestrator/tools/take-screenshot.ts +0 -128
  64. package/src/orchestrator/tools/verify-action-result.ts +0 -159
  65. package/src/orchestrator/tools/view-previous-screenshot.ts +0 -103
  66. package/src/orchestrator/types.ts +0 -291
  67. package/src/playwright-mcp-service.ts +0 -224
  68. package/src/progress-reporter.ts +0 -144
  69. package/src/prompts.ts +0 -842
  70. package/src/providers/backend-proxy-llm-provider.ts +0 -91
  71. package/src/providers/local-llm-provider.ts +0 -38
  72. package/src/scenario-service.ts +0 -252
  73. package/src/scenario-worker-class.ts +0 -1110
  74. package/src/script-utils.ts +0 -203
  75. package/src/types.ts +0 -239
  76. package/src/utils/browser-utils.ts +0 -348
  77. package/src/utils/coordinate-converter.ts +0 -162
  78. package/src/utils/page-info-retry.ts +0 -65
  79. package/src/utils/page-info-utils.ts +0 -285
  80. package/testchimp-runner-core-0.0.35.tgz +0 -0
  81. package/tsconfig.json +0 -19
package/plandocs/PHASE1_ABSTRACTION_COMPLETE.md +0 -241
@@ -1,241 +0,0 @@
1
- # Phase 1: Runner-Core Abstraction Layer - COMPLETE ✅
2
-
3
- ## Implementation Summary
4
-
5
- Successfully refactored runner-core to use a pluggable adapter pattern, making it reusable across the VS Extension, the GitHub Runner, and (in Phase 2) the Script Service.
6
-
7
- ---
8
-
9
- ## What Was Implemented
10
-
11
- ### 1. New Interface Files
12
-
13
- #### `/src/llm-provider.ts`
14
- - `LLMProvider` interface - Abstract LLM calling
15
- - `LLMRequest` interface - Request format (camelCase)
16
- - `LLMResponse` interface - Response format
17
-
18
- **Key Feature**: Allows different LLM implementations (backend proxy, local OpenAI, etc.)
19
-
20
- #### `/src/progress-reporter.ts`
21
- - `ProgressReporter` interface - Progress tracking callbacks
22
- - `StepProgress` interface - Step-level progress (jobId, stepNumber, code, screenshot, etc.)
23
- - `JobProgress` interface - Job-level progress
24
- - `StepExecutionStatus` enum - Status constants
25
-
26
- **Key Feature**: Structured progress reporting (logs, DB writes, etc.)
27
-
28
- ### 2. Provider Implementations
29
-
30
- #### `/src/providers/backend-proxy-llm-provider.ts`
31
- - Default implementation for VS Extension and GitHub Runner
32
- - Calls `/localagent/call_llm` backend endpoint
33
- - Handles authentication via AuthConfig
34
- - **Converts camelCase → snake_case** for backend API compatibility
35
-
36
- #### `/src/providers/local-llm-provider.ts`
37
- - For Script Service (Phase 2)
38
- - Accepts callback function to call LLM directly
39
- - No backend proxy dependency
40
-
41
- ### 3. Core Refactors
42
-
43
- #### `LLMFacade` (`/src/llm-facade.ts`)
44
- **Changes:**
45
- - ✅ Constructor now accepts `LLMProvider` instead of authConfig/backendUrl
46
- - ✅ Removed axios calls - delegated to provider
47
- - ✅ Removed setAuthConfig/getAuthConfig - handled by provider
48
- - ✅ All LLMRequest fields use **camelCase** (systemPrompt, userPrompt, imageUrl)
49
- - ✅ Simplified callLLM method - just delegates to provider
50
-
51
- #### `ScenarioWorker` (`/src/scenario-worker-class.ts`)
52
- **Changes:**
53
- - ✅ Constructor accepts `LLMProvider` and `ProgressReporter`
54
- - ✅ Defaults to `BackendProxyLLMProvider` if not provided (backward compatible)
55
- - ✅ Added `captureStepScreenshot()` - returns data URL (data:image/png;base64,...)
56
- - ✅ Added `reportStepProgress()` - reports to ProgressReporter
57
- - ✅ Added `reportJobProgress()` - reports to ProgressReporter
58
- - ✅ Tracks `currentJobId` for progress keying
59
- - ✅ Reports job started, in_progress, completed
60
-
61
- #### `ScenarioService` (`/src/scenario-service.ts`)
62
- **Changes:**
63
- - ✅ Constructor accepts `LLMProvider` and `ProgressReporter`
64
- - ✅ Passes providers to all workers
65
- - ✅ Maintains backward compatibility
66
-
67
- #### `ExecutionService` (`/src/execution-service.ts`)
68
- **Changes:**
69
- - ✅ Constructor accepts `LLMProvider` and `ProgressReporter`
70
- - ✅ Defaults to `BackendProxyLLMProvider` if not provided
71
- - ✅ setAuthConfig recreates provider with new auth
72
-
73
- #### `TestChimpService` (`/src/index.ts`)
74
- **Changes:**
75
- - ✅ Constructor accepts optional `LLMProvider` and `ProgressReporter`
76
- - ✅ Defaults to `BackendProxyLLMProvider` (backward compatible)
77
- - ✅ Passes providers to all services
78
- - ✅ Exports all new interfaces and implementations
79
-
80
- ### 4. Consumer Updates
81
-
82
- #### VS Extension (`/local/vs-ext/src/embedded-service.ts`)
83
- **Changes:**
84
- - ✅ Imports new types: `ProgressReporter`, `StepProgress`, `JobProgress`, `StepExecutionStatus`
85
- - ✅ Creates explicit `ProgressReporter` implementation
86
- - ✅ Reports step progress with emojis (✅ ❌ 🔄)
87
- - ✅ Reports job progress
88
- - ✅ Logs commands and errors to `outputChannel`
89
- - ✅ Passes progressReporter to TestChimpService
90
-
91
- #### GitHub Runner (`/testchimp-github-testrunner/src/index.ts`)
92
- **Changes:**
93
- - ✅ Imports new types: `ProgressReporter`, `StepProgress`, `JobProgress`, `StepExecutionStatus`
94
- - ✅ Creates explicit `ProgressReporter` implementation
95
- - ✅ Reports to GitHub Actions core.info/error
96
- - ✅ Passes progressReporter to TestChimpService
97
-
98
- ---
99
-
100
- ## Architecture Benefits
101
-
102
- ### 1. Pluggable LLM Provider
103
- ```
104
- VS Extension → BackendProxyLLMProvider → /call-llm-internal → OpenAI
105
- GitHub Runner → BackendProxyLLMProvider → /call-llm-internal → OpenAI
106
- Script Service (Phase 2) → LocalLLMProvider → OpenAI directly
107
- ```
108
-
109
- ### 2. Flexible Progress Reporting
110
- ```
111
- VS Extension → ProgressReporter → outputChannel.appendLine()
112
- GitHub Runner → ProgressReporter → core.info()
113
- Script Service (Phase 2) → ProgressReporter → DB write + GCS upload
114
- ```
115
-
116
- ### 3. Screenshot Data URLs
117
- - Core captures as Buffer → converts to data URL
118
- - VS Extension: Can save to file or ignore
119
- - GitHub Runner: Can save to artifacts or ignore
120
- - Script Service (Phase 2): Upload to GCS
121
-
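Consumers that store or upload the screenshot first have to split the data URL back into its MIME type and base64 payload. The following is a minimal sketch of that step, not code from the package; `ParsedDataUrl` and `parseDataUrl` are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper types and names; only the data URL format
// (data:image/png;base64,...) comes from the notes above.
interface ParsedDataUrl {
  mimeType: string;
  base64: string;
}

// Split "data:<mime>;base64,<payload>" into its parts.
// Returns null when the input is not a base64 data URL.
function parseDataUrl(dataUrl: string): ParsedDataUrl | null {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (match === null) {
    return null;
  }
  const [, mimeType = "", base64 = ""] = match;
  return { mimeType, base64 };
}
```

A consumer such as the Script Service could decode `base64` into bytes before uploading, while the VS Extension or GitHub Runner could ignore the screenshot entirely.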
122
- ---
123
-
124
- ## Backward Compatibility
125
-
126
- ✅ **All existing code continues to work without changes**
127
-
128
- Old usage (still works):
129
- ```typescript
130
- const service = new TestChimpService(fileHandler, authConfig, backendUrl);
131
- ```
132
-
133
- New usage (explicit):
134
- ```typescript
135
- const llmProvider = new BackendProxyLLMProvider(authConfig, backendUrl);
136
- const progressReporter = { onStepProgress: async (p) => {...} };
137
- const service = new TestChimpService(
138
- fileHandler, authConfig, backendUrl, maxWorkers,
139
- llmProvider, progressReporter
140
- );
141
- ```
142
-
143
- ---
144
-
145
- ## Naming Conventions
146
-
147
- ✅ **Enforced camelCase for all TypeScript/JavaScript**
148
- - `systemPrompt`, `userPrompt`, `imageUrl` in code
149
- - `system_prompt`, `user_prompt`, `image_url` in proto/backend APIs
150
- - Conversion handled by `BackendProxyLLMProvider`
151
-
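The naming convention above can be sketched as a small mapping function. This is an illustration under stated assumptions rather than the actual `BackendProxyLLMProvider` code: the interface shapes, the optionality of each field, and the name `toBackendPayload` are hypothetical; only the three field names in each casing come from the notes.

```typescript
// Assumed request shape in TypeScript code (camelCase).
interface LLMRequest {
  systemPrompt?: string;
  userPrompt: string;
  imageUrl?: string;
}

// Assumed wire shape for the backend API (snake_case).
interface BackendLLMPayload {
  system_prompt?: string;
  user_prompt: string;
  image_url?: string;
}

// Map the camelCase request onto the snake_case payload,
// omitting optional fields that were not provided.
function toBackendPayload(req: LLMRequest): BackendLLMPayload {
  const payload: BackendLLMPayload = { user_prompt: req.userPrompt };
  if (req.systemPrompt !== undefined) {
    payload.system_prompt = req.systemPrompt;
  }
  if (req.imageUrl !== undefined) {
    payload.image_url = req.imageUrl;
  }
  return payload;
}
```

Keeping the conversion in one function at the provider boundary means the rest of the TypeScript code never sees snake_case names.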
152
- ---
153
-
154
- ## Build & Test Results
155
-
156
- ### ✅ Runner-Core
157
- - TypeScript compilation: **SUCCESS**
158
- - Package created: `testchimp-runner-core-0.0.22.tgz`
159
-
160
- ### ✅ VS Extension
161
- - TypeScript check: **SUCCESS**
162
- - Webpack build: **SUCCESS**
163
- - File: `out/extension.js`
164
-
165
- ### ✅ GitHub Runner
166
- - TypeScript compilation: **SUCCESS**
167
- - NCC bundle: **SUCCESS**
168
- - File: `dist/index.js`
169
-
170
- ---
171
-
172
- ## Phase 2 Preview
173
-
174
- With Phase 1 complete, Script Service can now:
175
-
176
- 1. Create `LocalLLMProvider` with its own OpenAI integration
177
- 2. Create `DatabaseProgressReporter` that:
178
- - Writes step progress to DB
179
- - Uploads screenshots to GCS
180
- - Updates job status
181
- 3. Use runner-core with **zero code duplication**
182
- 4. Get all future improvements automatically
183
-
184
- ---
185
-
186
- ## Testing Checklist
187
-
188
- - [ ] VS Extension: Generate script from scenario
189
- - [ ] VS Extension: Repair failing script
190
- - [ ] GitHub Runner: Execute smart tests
191
- - [ ] GitHub Runner: AI repair mode
192
- - [ ] Verify progress reporting in both environments
193
- - [ ] Verify screenshots captured as data URLs
194
- - [ ] Verify backward compatibility
195
-
196
- ---
197
-
198
- ## Files Created
199
-
200
- 1. `/Users/nuwansam/IdeaProjects/AwareRepo/local/runner-core/src/llm-provider.ts`
201
- 2. `/Users/nuwansam/IdeaProjects/AwareRepo/local/runner-core/src/progress-reporter.ts`
202
- 3. `/Users/nuwansam/IdeaProjects/AwareRepo/local/runner-core/src/providers/backend-proxy-llm-provider.ts`
203
- 4. `/Users/nuwansam/IdeaProjects/AwareRepo/local/runner-core/src/providers/local-llm-provider.ts`
204
-
205
- ## Files Modified
206
-
207
- **Runner-core:**
208
- 1. `/Users/nuwansam/IdeaProjects/AwareRepo/local/runner-core/src/llm-facade.ts`
209
- 2. `/Users/nuwansam/IdeaProjects/AwareRepo/local/runner-core/src/scenario-worker-class.ts`
210
- 3. `/Users/nuwansam/IdeaProjects/AwareRepo/local/runner-core/src/execution-service.ts`
211
- 4. `/Users/nuwansam/IdeaProjects/AwareRepo/local/runner-core/src/scenario-service.ts`
212
- 5. `/Users/nuwansam/IdeaProjects/AwareRepo/local/runner-core/src/index.ts`
213
-
214
- **Consumers:**
215
- 6. `/Users/nuwansam/IdeaProjects/AwareRepo/local/vs-ext/src/embedded-service.ts`
216
- 7. `/Users/nuwansam/IdeaProjects/testchimp-github-testrunner/src/index.ts`
217
-
218
- ---
219
-
220
- ## Next Steps
221
-
222
- ### Ready for Multi-Agent Architecture!
223
-
224
- With these abstractions in place, implementing the multi-agent architecture (Planner, Executor, Vision Advisor, etc.) will work across all three environments:
225
-
226
- - **Planner Agent** uses `LLMProvider` → works in all environments
227
- - **Executor Agent** uses `LLMProvider` → works in all environments
228
- - **Vision Advisor** uses `LLMProvider` with imageUrl → works in all environments
229
- - **Progress** reported uniformly via `ProgressReporter`
230
- - **Script Service** gets all intelligence automatically in Phase 2
231
-
232
- ---
233
-
234
- ## Version
235
-
236
- **Runner-Core**: v1.5.0-vision-preserve-values (with Phase 1 abstraction layer)
237
-
238
- **Date**: October 11, 2025
239
-
240
- **Status**: ✅ COMPLETE - Ready for testing and Phase 2
241
-
package/plandocs/PHASE1_FINAL_STATUS.md +0 -210
@@ -1,210 +0,0 @@
1
- # Phase 1: Runner-Core Abstraction - COMPLETE & TESTED ✅
2
-
3
- ## Final Status
4
-
5
- ✅ **Phase 1 is fully implemented, tested, and working!**
6
-
7
- All three components successfully built:
8
- - ✅ Runner-Core
9
- - ✅ VS Extension
10
- - ✅ GitHub Runner
11
-
12
- Authentication verified working in VS Extension.
13
-
14
- ---
15
-
16
- ## What Was Implemented
17
-
18
- ### 1. Core Abstractions
19
-
20
- #### LLM Provider Interface (`llm-provider.ts`)
21
- - `LLMProvider` interface - pluggable LLM calling
22
- - `LLMRequest` / `LLMResponse` - **camelCase** for TypeScript
23
- - Enables different implementations (backend proxy, local, etc.)
24
-
25
- #### Progress Reporter Interface (`progress-reporter.ts`)
26
- - `ProgressReporter` interface - structured progress callbacks
27
- - `StepProgress` - includes jobId, stepNumber, code, screenshotDataUrl (data URL)
28
- - `JobProgress` - job-level status tracking
29
- - `StepExecutionStatus` enum
30
-
31
- ### 2. Provider Implementations
32
-
33
- #### BackendProxyLLMProvider (`providers/backend-proxy-llm-provider.ts`)
34
- - For VS Extension and GitHub Runner
35
- - Calls `/localagent/call_llm` backend endpoint
36
- - **Converts camelCase → snake_case** for backend API
37
- - Handles authentication via AuthConfig
38
-
39
- #### LocalLLMProvider (`providers/local-llm-provider.ts`)
40
- - Ready for Script Service (Phase 2)
41
- - Accepts callback function
42
- - No backend proxy dependency
43
-
44
- ### 3. Refactored Components
45
-
46
- All components updated to use adapter pattern:
47
- - ✅ `LLMFacade` - uses LLMProvider
48
- - ✅ `ScenarioWorker` - accepts providers, reports progress, captures screenshots as data URLs
49
- - ✅ `ExecutionService` - accepts providers
50
- - ✅ `ScenarioService` - passes providers to workers
51
- - ✅ `TestChimpService` - orchestrates with optional providers
52
-
53
- ### 4. Consumer Updates
54
-
55
- #### VS Extension (`vs-ext/src/embedded-service.ts`)
56
- - ✅ Creates `ProgressReporter` with step/job callbacks
57
- - ✅ **Critical fix**: Sets logger BEFORE setAuthConfig
58
- - ✅ Progress logged with emojis (✅ ❌ 🔄)
59
-
60
- #### GitHub Runner (`testchimp-github-testrunner/src/index.ts`)
61
- - ✅ Creates `ProgressReporter` for GitHub Actions
62
- - ✅ Reports to core.info/error
63
-
64
- ---
65
-
66
- ## Critical Issue Found & Fixed
67
-
68
- ### Problem
69
- After refactor, VS Extension got **401 Unauthorized errors**:
70
- ```
71
- LLM call failed: AxiosError: Request failed with status code 401
72
- ```
73
-
74
- ### Root Cause
75
- **Initialization order issue** in VS Extension:
76
- 1. Constructor creates provider **without auth** (authConfig = undefined)
77
- 2. `setAuthConfig()` recreates provider **with auth**
78
- 3. ❌ But logger was set **after** setAuthConfig
79
- 4. New provider created without logger → provider can't log errors properly
80
-
81
- ### Fix
82
- **Moved `setLogger()` before `setAuthConfig()`** in `embedded-service.ts`:
83
- ```typescript
84
- // BEFORE (broken):
85
- await this.service.setAuthConfig(authConfig); // Line 115
86
- this.service.setLogger(...); // Line 135
87
-
88
- // AFTER (fixed):
89
- this.service.setLogger(...); // Line 109
90
- await this.service.setAuthConfig(authConfig); // Line 123
91
- ```
92
-
93
- ### Additional Fixes
94
- 1. **TestChimpService.setAuthConfig()** - Preserves logger when recreating provider
95
- 2. **TestChimpService.setBackendUrl()** - Preserves logger when recreating provider
96
- 3. **All service recreations** - Pass llmProvider and progressReporter
97
-
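One way to make this kind of ordering bug impossible is to rebuild the provider from the service's full current state on every setter call. The sketch below illustrates that idea only; `SketchProvider`, `SketchService`, and `setAuthToken` are hypothetical stand-ins, not the real `TestChimpService` API.

```typescript
type Logger = (message: string) => void;

// Hypothetical provider that needs both auth and a logger.
class SketchProvider {
  authToken: string | undefined;
  logger: Logger | undefined;

  constructor(authToken: string | undefined, logger: Logger | undefined) {
    this.authToken = authToken;
    this.logger = logger;
  }
}

// Hypothetical service: each setter stores its value, then rebuilds the
// provider from everything the service currently knows, so a recreated
// provider never loses a previously configured logger or token.
class SketchService {
  private authToken: string | undefined;
  private logger: Logger | undefined;
  provider: SketchProvider = new SketchProvider(undefined, undefined);

  setLogger(logger: Logger): void {
    this.logger = logger;
    this.rebuildProvider();
  }

  setAuthToken(token: string): void {
    this.authToken = token;
    this.rebuildProvider();
  }

  private rebuildProvider(): void {
    this.provider = new SketchProvider(this.authToken, this.logger);
  }
}
```

With this shape, calling `setAuthToken` before `setLogger` yields the same provider state as the reverse order.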
98
- ---
99
-
100
- ## Build Verification
101
-
102
- ✅ **Runner-Core**: TypeScript compiled
103
- ✅ **VS Extension**: Webpack build successful
104
- ✅ **GitHub Runner**: TypeScript + NCC bundle successful
105
- ✅ **Authentication**: Working (401 errors resolved)
106
- ✅ **Progress Reporting**: Working (callbacks invoked)
107
-
108
- ---
109
-
110
- ## Architecture Benefits
111
-
112
- 1. **Pluggable LLM**: Easy to add local OpenAI, Anthropic, etc.
113
- 2. **Structured Progress**: Each environment reports differently
114
- 3. **Screenshot Data URLs**: Core captures, consumers decide storage
115
- 4. **Backward Compatible**: Old code works without changes
116
- 5. **Type Safe**: All TypeScript interfaces use camelCase
117
- 6. **Ready for Multi-Agent**: Planner/Executor agents can use same providers
118
-
119
- ---
120
-
121
- ## Usage Patterns
122
-
123
- ### Current Pattern (Backward Compatible)
124
- ```typescript
125
- const service = new TestChimpService(fileHandler, authConfig, backendUrl);
126
- // Still works!
127
- ```
128
-
129
- ### New Pattern (Explicit)
130
- ```typescript
131
- const progressReporter: ProgressReporter = {
132
- onStepProgress: async (progress) => { ... },
133
- onJobProgress: async (progress) => { ... }
134
- };
135
-
136
- const service = new TestChimpService(
137
- fileHandler, authConfig, backendUrl, workers,
138
- undefined, // uses BackendProxyLLMProvider by default
139
- progressReporter
140
- );
141
- ```
142
-
143
- ### Phase 2 Pattern (Script Service)
144
- ```typescript
145
- const llmProvider = new LocalLLMProvider(async (req) => {
146
- return await callLLM(`${req.systemPrompt}\n\n${req.userPrompt}`);
147
- });
148
-
149
- const progressReporter: ProgressReporter = {
150
- onStepProgress: async (progress) => {
151
- const url = await uploadToGCS(progress.screenshotDataUrl);
152
- await db.updateStep(progress.jobId, { ...progress, screenshotPath: url });
153
- }
154
- };
155
-
156
- const service = new TestChimpService(
157
- undefined, undefined, undefined, undefined,
158
- llmProvider, progressReporter
159
- );
160
- ```
161
-
162
- ---
163
-
164
- ## Files Summary
165
-
166
- **Created:**
167
- - `src/llm-provider.ts`
168
- - `src/progress-reporter.ts`
169
- - `src/providers/backend-proxy-llm-provider.ts`
170
- - `src/providers/local-llm-provider.ts`
171
-
172
- **Modified:**
173
- - `src/llm-facade.ts`
174
- - `src/scenario-worker-class.ts`
175
- - `src/execution-service.ts`
176
- - `src/scenario-service.ts`
177
- - `src/index.ts`
178
- - `vs-ext/src/embedded-service.ts`
179
- - `testchimp-github-testrunner/src/index.ts`
180
-
181
- ---
182
-
183
- ## Testing Checklist
184
-
185
- - [x] Runner-core builds successfully
186
- - [x] VS Extension builds successfully
187
- - [x] GitHub Runner builds successfully
188
- - [x] Authentication working (no 401 errors)
189
- - [x] Logger initialization order fixed
190
- - [ ] Test script generation from scenario
191
- - [ ] Test script repair
192
- - [ ] Verify progress callbacks
193
- - [ ] Verify screenshot data URLs
194
-
195
- ---
196
-
197
- ## Phase 2 Ready
198
-
199
- With Phase 1 complete and tested, Script Service can now:
200
- 1. Use `LocalLLMProvider` with OpenAI directly
201
- 2. Create `DatabaseProgressReporter` for DB + GCS
202
- 3. Remove duplicate code from `script-generation/`
203
- 4. Share all improvements automatically
204
-
205
- ---
206
-
207
- **Version**: v1.5.0-vision-preserve-values + Phase 1 Abstraction Layer
208
- **Date**: October 11, 2025
209
- **Status**: ✅ **COMPLETE, TESTED, WORKING**
210
-
package/plandocs/PHASE_1_COMPLETE.md +0 -165
@@ -1,165 +0,0 @@
1
- # Phase 1 Implementation - COMPLETE ✅
2
-
3
- ## Version: runner-core v0.0.33
4
-
5
- ## What's Been Implemented
6
-
7
- ### 1. Free-Form "Note to Future Self"
8
- **Purpose:** Tactical memory - agent leaves notes that persist across iterations AND steps.
9
-
10
- **Type:**
11
- ```typescript
12
- interface NoteToFutureSelf {
13
- fromIteration: number;
14
- content: string; // FREE-FORM - agent writes whatever it wants
15
- }
16
- ```
17
-
18
- **How it works:**
19
- - Agent includes `"noteToFutureSelf": "..."` in response
20
- - System stores it in `memory.latestNote` (persists across steps!)
21
- - Passed to next iteration AND next step
22
- - Displayed prominently at top of prompt
23
- - Agent reads it FIRST before making decision
24
-
25
- **Scope:** Entire scenario journey (not just current step)
26
-
27
- **Example notes:**
28
-
29
- *Iteration-specific:*
30
- - "Tried #sidebar-toggle, failed with 'not clickable'. Will try child SVG element next."
31
-
32
- *Step-spanning:*
33
- - "This app has slow-loading modals. Always wait 2s after page load before clicking."
34
- - "Cookie consent appears on every page. Check for and dismiss it first."
35
- - "Sidebar only visible on desktop viewport (>1024px width)."
36
-
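The note flow described above (store the latest note, then show it first in the next prompt) might look roughly like this. It is a sketch under assumptions: `JourneyMemory`, `recordNote`, `buildPromptWithNote`, and the prompt header text are hypothetical; only `NoteToFutureSelf` and `latestNote` come from the notes.

```typescript
interface NoteToFutureSelf {
  fromIteration: number;
  content: string;
}

// Hypothetical memory holder; only the `latestNote` field is named in the doc.
interface JourneyMemory {
  latestNote?: NoteToFutureSelf;
}

// Store the note from an agent response. A missing or blank note leaves
// the previous one in place, so it keeps persisting across steps.
function recordNote(memory: JourneyMemory, iteration: number, note?: string): void {
  if (note !== undefined && note.trim() !== "") {
    memory.latestNote = { fromIteration: iteration, content: note };
  }
}

// Put the latest note at the top of the prompt so the agent reads it first.
function buildPromptWithNote(memory: JourneyMemory, stepPrompt: string): string {
  const note = memory.latestNote;
  if (note === undefined) {
    return stepPrompt;
  }
  const header = `NOTE FROM YOUR PAST SELF (iteration ${note.fromIteration}):`;
  return `${header}\n${note.content}\n\n${stepPrompt}`;
}
```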
37
- ### 2. Percentage-Based Coordinate Fallback
38
- **Purpose:** Last-resort mechanism when selector generation repeatedly fails.
39
-
40
- **Type:**
41
- ```typescript
42
- interface CoordinateAction {
43
- type: 'coordinate';
44
- action: 'click' | 'doubleClick' | 'rightClick' | 'hover' | 'drag' | 'fill' | 'scroll';
45
- xPercent: number; // 0-100, 3 decimal precision
46
- yPercent: number;
47
- toXPercent?: number; // For drag
48
- toYPercent?: number;
49
- value?: string; // For fill
50
- scrollAmount?: number; // For scroll
51
- }
52
- ```
53
-
54
- **How it works:**
55
- - LLM outputs percentages: `{xPercent: 15.755, yPercent: 8.500}`
56
- - CoordinateConverter converts to pixels: `15.755% → 252px`
57
- - Generates Playwright command: `await page.mouse.click(252, 68);`
58
-
59
- **Supported actions:**
60
- - click, doubleClick, rightClick, hover
61
- - fill (clicks then types value)
62
- - drag (from x%,y% to toX%,toY%)
63
- - scroll (at position, by amount)
64
-
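The percentage-to-pixel step can be sketched as below. This is not the package's `CoordinateConverter`; the function names, the clamping, and the 1600×800 viewport used in the usage note are assumptions chosen so the numbers match the example above.

```typescript
interface ViewportSize {
  width: number;
  height: number;
}

interface PixelPoint {
  x: number;
  y: number;
}

// Convert viewport percentages (0-100) into rounded pixel coordinates.
function percentToPixels(xPercent: number, yPercent: number, viewport: ViewportSize): PixelPoint {
  // Clamping is an added assumption: it keeps out-of-range input inside the viewport.
  const clamp = (p: number): number => Math.min(100, Math.max(0, p));
  return {
    x: Math.round((clamp(xPercent) / 100) * viewport.width),
    y: Math.round((clamp(yPercent) / 100) * viewport.height),
  };
}

// Render the Playwright mouse call as a command string.
function toClickCommand(point: PixelPoint): string {
  return `await page.mouse.click(${point.x}, ${point.y});`;
}
```

On an assumed 1600×800 viewport, `percentToPixels(15.755, 8.5, { width: 1600, height: 800 })` gives `{ x: 252, y: 68 }`, which `toClickCommand` renders as `await page.mouse.click(252, 68);`.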
65
- ### 3. Two-Tier Auto-Escalation
66
- **Trigger:** Code-controlled (not LLM-decided)
67
-
68
- ```
69
- Tier 1 (iterations 1-3): Playwright Selector Mode
70
- ├─ Normal buildSystemPrompt()
71
- ├─ Agent generates: await page.getByRole(...).click()
72
- ├─ Leaves noteToFutureSelf for continuity
73
- └─ 3 attempts, then escalate
74
-
75
- Tier 2 (iterations 4-5): Coordinate Mode
76
- ├─ Auto-activates when consecutiveFailures >= 3
77
- ├─ Uses buildCoordinateSystemPrompt()
78
- ├─ Agent outputs: {xPercent: 15.755, yPercent: 8.500}
79
- ├─ CoordinateConverter → mouse.click(x, y)
80
- └─ 2 attempts max, then give up
81
-
82
- Total: Maximum 5 iterations per step
83
- ```
84
-
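The code-controlled escalation above can be expressed as a pure function of the failure count. This is a sketch rather than the orchestrator's actual logic: `Mode`, `selectMode`, and the constant names are hypothetical, while the thresholds (3 selector attempts, then 2 coordinate attempts) come from the diagram.

```typescript
type Mode = "selector" | "coordinate" | "giveUp";

const SELECTOR_ATTEMPTS = 3; // Tier 1: iterations 1-3
const COORDINATE_ATTEMPTS = 2; // Tier 2: iterations 4-5

// Choose the mode for the next iteration from the number of
// consecutive failures recorded so far in the current step.
function selectMode(consecutiveFailures: number): Mode {
  if (consecutiveFailures < SELECTOR_ATTEMPTS) {
    return "selector";
  }
  if (consecutiveFailures < SELECTOR_ATTEMPTS + COORDINATE_ATTEMPTS) {
    return "coordinate";
  }
  return "giveUp";
}
```

Because the decision depends only on a counter, the LLM cannot opt in or out of coordinate mode, which matches the "code-controlled" trigger described above.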
85
- ### 4. Precision & Accuracy
86
- - **3-decimal precision** for percentage coordinates (sub-pixel resolution; ~1px accuracy after rounding on most screens)
87
- - **Resolution-independent** - works on any viewport size
88
- - **Percentage reference:**
89
- - Top-left: (0, 0)
90
- - Top-right: (100, 0)
91
- - Center: (50, 50)
92
- - Bottom-right: (100, 100)
93
-
94
- ## Files Modified
95
-
96
- 1. **orchestrator/types.ts**
97
- - Added `NoteToFutureSelf` interface
98
- - Added `CoordinateAction` interface
99
- - Updated `AgentDecision` with new fields
100
- - Updated `AgentContext` with noteFromPreviousIteration
101
-
102
- 2. **orchestrator/orchestrator-agent.ts**
103
- - Added note tracking in executeStep()
104
- - Added coordinate action execution
105
- - Added buildCoordinateSystemPrompt()
106
- - Updated buildUserPrompt() to display notes
107
- - Added mode switching in callAgent()
108
- - Updated response format documentation
109
-
110
- 3. **utils/coordinate-converter.ts** (NEW)
111
- - percentToPixels() - Convert % to pixels
112
- - getViewportSize() - Get current viewport dimensions
113
- - generateCommands() - Create Playwright commands from percentages
114
- - executeAction() - Direct execution helper
115
-
116
- 4. **scenario-worker-class.ts** (Earlier fix)
117
- - Smart timeout handling for waitForLoadState
118
-
119
- 5. **execution-service.ts** (Earlier fix)
120
- - Smart timeout handling for navigation commands
121
-
122
- ## How to Use
123
-
124
- **No code changes needed!** The features activate automatically:
125
-
126
- 1. **Note to self:** Agent can optionally include `noteToFutureSelf` in any iteration
127
- 2. **Coordinates:** Auto-activate at iteration 4 if selectors keep failing
128
-
129
- ## Testing Phase 1
130
-
131
- To validate the implementation:
132
-
133
- 1. **Run PeopleHR scenario** (previously failed on hamburger menu)
134
- - Should now succeed with note guidance
135
- - May use coordinates if SVG selector still fails
136
-
137
- 2. **Check logs for:**
138
- - `📝 Note to self: ...` (agent leaving tactical notes)
139
- - `🎯 COORDINATE MODE ACTIVATED` (tier 2 triggered)
140
- - `🎯 Coordinate Action: click at (X%, Y%)` (using fallback)
141
-
142
- 3. **Expected improvements:**
143
- - 20-30% fewer iterations per step (thanks to notes)
144
- - < 5% of scenarios need the coordinate fallback
145
- - Coordinates work when everything else fails
146
-
147
- ## Phase 2 Preview (Not Yet Implemented)
148
-
149
- When Phase 2 is added, the auto-escalation will become a **three-tier** system:
150
- - Tier 1 (iterations 1-2): Playwright selectors
151
- - Tier 2 (iterations 3-4): Numbered elements (CLICK[3])
152
- - Tier 3 (iterations 5+): Percentage coordinates
153
-
154
- Phase 2 adds visual markers [1], [2], [3] on elements with structured commands.
155
-
156
- ---
157
-
158
- ## Status: ✅ READY FOR TESTING
159
-
160
- Runner-core v0.0.33 is built and ready. Test it with:
161
- - VS Code extension "Run Test" on peoplehr-corrected.smart.spec.ts
162
- - Or generate new script from peoplehr.txt scenario
163
-
164
- **Next:** Validate Phase 1 works before starting Phase 2.
165
-