npm - testchimp-runner-core - Versions diffs - 0.0.34 → 0.0.36 - Mend

testchimp-runner-core 0.0.34 → 0.0.36

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (150)

package/dist/execution-service.d.ts +1 -4
package/dist/execution-service.d.ts.map +1 -1
package/dist/execution-service.js +155 -468
package/dist/execution-service.js.map +1 -1
package/dist/index.d.ts +3 -1
package/dist/index.d.ts.map +1 -1
package/dist/index.js +11 -1
package/dist/index.js.map +1 -1
package/dist/orchestrator/decision-parser.d.ts +18 -0
package/dist/orchestrator/decision-parser.d.ts.map +1 -0
package/dist/orchestrator/decision-parser.js +127 -0
package/dist/orchestrator/decision-parser.js.map +1 -0
package/dist/orchestrator/index.d.ts +4 -2
package/dist/orchestrator/index.d.ts.map +1 -1
package/dist/orchestrator/index.js +14 -2
package/dist/orchestrator/index.js.map +1 -1
package/dist/orchestrator/orchestrator-agent.d.ts +17 -14
package/dist/orchestrator/orchestrator-agent.d.ts.map +1 -1
package/dist/orchestrator/orchestrator-agent.js +534 -204
package/dist/orchestrator/orchestrator-agent.js.map +1 -1
package/dist/orchestrator/orchestrator-prompts.d.ts +14 -2
package/dist/orchestrator/orchestrator-prompts.d.ts.map +1 -1
package/dist/orchestrator/orchestrator-prompts.js +529 -247
package/dist/orchestrator/orchestrator-prompts.js.map +1 -1
package/dist/orchestrator/page-som-handler.d.ts +106 -0
package/dist/orchestrator/page-som-handler.d.ts.map +1 -0
package/dist/orchestrator/page-som-handler.js +1353 -0
package/dist/orchestrator/page-som-handler.js.map +1 -0
package/dist/orchestrator/som-types.d.ts +149 -0
package/dist/orchestrator/som-types.d.ts.map +1 -0
package/dist/orchestrator/som-types.js +87 -0
package/dist/orchestrator/som-types.js.map +1 -0
package/dist/orchestrator/tool-registry.d.ts +2 -0
package/dist/orchestrator/tool-registry.d.ts.map +1 -1
package/dist/orchestrator/tool-registry.js.map +1 -1
package/dist/orchestrator/tools/index.d.ts +4 -1
package/dist/orchestrator/tools/index.d.ts.map +1 -1
package/dist/orchestrator/tools/index.js +7 -2
package/dist/orchestrator/tools/index.js.map +1 -1
package/dist/orchestrator/tools/refresh-som-markers.d.ts +12 -0
package/dist/orchestrator/tools/refresh-som-markers.d.ts.map +1 -0
package/dist/orchestrator/tools/refresh-som-markers.js +64 -0
package/dist/orchestrator/tools/refresh-som-markers.js.map +1 -0
package/dist/orchestrator/tools/view-previous-screenshot.d.ts +15 -0
package/dist/orchestrator/tools/view-previous-screenshot.d.ts.map +1 -0
package/dist/orchestrator/tools/view-previous-screenshot.js +92 -0
package/dist/orchestrator/tools/view-previous-screenshot.js.map +1 -0
package/dist/orchestrator/types.d.ts +23 -1
package/dist/orchestrator/types.d.ts.map +1 -1
package/dist/orchestrator/types.js +11 -1
package/dist/orchestrator/types.js.map +1 -1
package/dist/scenario-service.d.ts +5 -0
package/dist/scenario-service.d.ts.map +1 -1
package/dist/scenario-service.js +17 -0
package/dist/scenario-service.js.map +1 -1
package/dist/scenario-worker-class.d.ts +4 -0
package/dist/scenario-worker-class.d.ts.map +1 -1
package/dist/scenario-worker-class.js +18 -3
package/dist/scenario-worker-class.js.map +1 -1
package/dist/testing/agent-tester.d.ts +35 -0
package/dist/testing/agent-tester.d.ts.map +1 -0
package/dist/testing/agent-tester.js +84 -0
package/dist/testing/agent-tester.js.map +1 -0
package/dist/testing/ref-translator-tester.d.ts +44 -0
package/dist/testing/ref-translator-tester.d.ts.map +1 -0
package/dist/testing/ref-translator-tester.js +104 -0
package/dist/testing/ref-translator-tester.js.map +1 -0
package/dist/utils/hierarchical-selector.d.ts +47 -0
package/dist/utils/hierarchical-selector.d.ts.map +1 -0
package/dist/utils/hierarchical-selector.js +212 -0
package/dist/utils/hierarchical-selector.js.map +1 -0
package/dist/utils/page-info-retry.d.ts +14 -0
package/dist/utils/page-info-retry.d.ts.map +1 -0
package/dist/utils/page-info-retry.js +60 -0
package/dist/utils/page-info-retry.js.map +1 -0
package/dist/utils/page-info-utils.d.ts +1 -0
package/dist/utils/page-info-utils.d.ts.map +1 -1
package/dist/utils/page-info-utils.js +46 -18
package/dist/utils/page-info-utils.js.map +1 -1
package/dist/utils/ref-attacher.d.ts +21 -0
package/dist/utils/ref-attacher.d.ts.map +1 -0
package/dist/utils/ref-attacher.js +149 -0
package/dist/utils/ref-attacher.js.map +1 -0
package/dist/utils/ref-translator.d.ts +49 -0
package/dist/utils/ref-translator.d.ts.map +1 -0
package/dist/utils/ref-translator.js +276 -0
package/dist/utils/ref-translator.js.map +1 -0
package/package.json +6 -1
package/RELEASE_0.0.26.md +0 -165
package/RELEASE_0.0.27.md +0 -236
package/RELEASE_0.0.28.md +0 -286
package/plandocs/BEFORE_AFTER_VERIFICATION.md +0 -148
package/plandocs/COORDINATE_MODE_DIAGNOSIS.md +0 -144
package/plandocs/CREDIT_CALLBACK_ARCHITECTURE.md +0 -253
package/plandocs/HUMAN_LIKE_IMPROVEMENTS.md +0 -642
package/plandocs/IMPLEMENTATION_STATUS.md +0 -108
package/plandocs/INTEGRATION_COMPLETE.md +0 -322
package/plandocs/MULTI_AGENT_ARCHITECTURE_REVIEW.md +0 -844
package/plandocs/ORCHESTRATOR_MVP_SUMMARY.md +0 -539
package/plandocs/PHASE1_ABSTRACTION_COMPLETE.md +0 -241
package/plandocs/PHASE1_FINAL_STATUS.md +0 -210
package/plandocs/PHASE_1_COMPLETE.md +0 -165
package/plandocs/PHASE_1_SUMMARY.md +0 -184
package/plandocs/PLANNING_SESSION_SUMMARY.md +0 -372
package/plandocs/PROMPT_OPTIMIZATION_ANALYSIS.md +0 -120
package/plandocs/PROMPT_SANITY_CHECK.md +0 -120
package/plandocs/SCRIPT_CLEANUP_FEATURE.md +0 -201
package/plandocs/SCRIPT_GENERATION_ARCHITECTURE.md +0 -364
package/plandocs/SELECTOR_IMPROVEMENTS.md +0 -139
package/plandocs/SESSION_SUMMARY_v0.0.33.md +0 -151
package/plandocs/TROUBLESHOOTING_SESSION.md +0 -72
package/plandocs/VISION_DIAGNOSTICS_IMPROVEMENTS.md +0 -336
package/plandocs/VISUAL_AGENT_EVOLUTION_PLAN.md +0 -396
package/plandocs/WHATS_NEW_v0.0.33.md +0 -183
package/src/auth-config.ts +0 -84
package/src/credit-usage-service.ts +0 -188
package/src/env-loader.ts +0 -103
package/src/execution-service.ts +0 -1413
package/src/file-handler.ts +0 -104
package/src/index.ts +0 -422
package/src/llm-facade.ts +0 -821
package/src/llm-provider.ts +0 -53
package/src/model-constants.ts +0 -35
package/src/orchestrator/index.ts +0 -34
package/src/orchestrator/orchestrator-agent.ts +0 -862
package/src/orchestrator/orchestrator-agent.ts.backup +0 -1386
package/src/orchestrator/orchestrator-prompts.ts +0 -474
package/src/orchestrator/tool-registry.ts +0 -182
package/src/orchestrator/tools/check-page-ready.ts +0 -75
package/src/orchestrator/tools/extract-data.ts +0 -92
package/src/orchestrator/tools/index.ts +0 -12
package/src/orchestrator/tools/inspect-page.ts +0 -42
package/src/orchestrator/tools/recall-history.ts +0 -72
package/src/orchestrator/tools/take-screenshot.ts +0 -128
package/src/orchestrator/tools/verify-action-result.ts +0 -159
package/src/orchestrator/types.ts +0 -248
package/src/playwright-mcp-service.ts +0 -224
package/src/progress-reporter.ts +0 -144
package/src/prompts.ts +0 -842
package/src/providers/backend-proxy-llm-provider.ts +0 -91
package/src/providers/local-llm-provider.ts +0 -38
package/src/scenario-service.ts +0 -232
package/src/scenario-worker-class.ts +0 -1089
package/src/script-utils.ts +0 -203
package/src/types.ts +0 -239
package/src/utils/browser-utils.ts +0 -348
package/src/utils/coordinate-converter.ts +0 -162
package/src/utils/page-info-utils.ts +0 -250
package/testchimp-runner-core-0.0.33.tgz +0 -0
package/tsconfig.json +0 -19

package/RELEASE_0.0.27.md DELETED

@@ -1,236 +0,0 @@
-# Release 0.0.27 - Clean Logs and Credit Callback Architecture
-## Summary
-Major cleanup of logging architecture with timestamps moved to consumers, minimal initialization logs, credit callback support for server-side integration, and version read from package.json.
-## Changes in 0.0.27
-### 1. Timestamps at Consumer Level
-**Architecture Fix:**
-- ❌ Before: runner-core added timestamps
-- ✅ After: Consumer adds timestamps in their timezone
-**runner-core:**
-- Removed all timestamp formatting
-- Reports raw messages via callbacks
-**vs-extension:**
-- Added `formatLocalTimestamp()` utility function
-- Wraps outputChannel to add timestamps automatically
-- Format: `HH:MM:SS.mmm` in local timezone
-### 2. Minimal Initialization Logs
-**Before:**
-```
-🤖 Initializing Orchestrator Mode
-✓ Orchestrator initialized with 5 tools (DEBUG MODE)
-═══════════════════════════════════════════════════════
-🚀 RUNNER-CORE VERSION: v1.5.0-vision-preserve-values
-═══════════════════════════════════════════════════════
-Initializing Scenario worker...
-Scenario worker initialized with session: scenario_worker_XXX
-Scenario worker initialized (Orchestrator Mode) with session...
-Scenario service initialized
-```
-**After:**
-```
-testchimp-runner-core v0.0.27
-📋 Processing scenario: [scenario description]
-```
-**Changes:**
-- Single version log (reads from package.json)
-- No internal initialization details
-- Clean, minimal output
-### 3. Credit Callback Architecture
-**Purpose:** Allow server-side integration to update DB directly without axios calls
-**Implementation:**
-```typescript
-export interface CreditUsage {
-  credits: number;
-  usageReason: CreditUsageReason;
-  jobId?: string;
-  timestamp: number;
-}
-export type CreditUsageCallback = (usage: CreditUsage) => void | Promise<void>;
-```
-**Behavior:**
-1. **If callback provided** (server-side):
-   - ✅ Call callback → Direct DB update
-   - ❌ NO axios calls made
-2. **If NO callback but auth configured** (client-side):
-   - ❌ No callback
-   - ✅ Makes axios call to backend API
-3. **If neither** (development):
-   - Logs warning
-   - Continues without tracking
-**Usage:**
-```typescript
-// Server-side
-const service = new TestChimpService(
-  fileHandler, undefined, backendUrl, maxWorkers,
-  llmProvider, progressReporter, orchestratorOptions,
-  async (creditUsage) => {
-    await db.insertCreditUsage(creditUsage);  // Direct DB
-  }
-);
-// Client-side (vs-ext, github-action)
-const service = new TestChimpService(
-  fileHandler, authConfig, backendUrl
-  // No callback - uses axios
-);
-```
-### 4. Wrapped OutputChannel
-**Problem:** runner-core wrote directly to outputChannel without timestamps
-**Solution:** vs-extension wraps the outputChannel:
-```typescript
-const wrappedChannel = {
-  appendLine: (message) => {
-    const timestamp = formatLocalTimestamp();
-    const shouldLog = isDev || !isVerboseDebugLog;
-    if (shouldLog) {
-      this.outputChannel?.appendLine(`[${timestamp}] ${message}`);
-    }
-  }
-};
-service.setOutputChannel(wrappedChannel);
-```
-**Result:**
-- ✅ All logs get timestamps
-- ✅ Filtering applied
-- ✅ Consumer controls presentation
-### 5. Dynamic Version in build_local.sh
-**Before:**
-```bash
-cp testchimp-runner-core-0.0.22.tgz .  # Hardcoded!
-```
-**After:**
-```bash
-RUNNER_VERSION=$(node -p "require('./package.json').version")
-TARBALL_NAME="testchimp-runner-core-${RUNNER_VERSION}.tgz"
-cp "$PARENT_DIR/runner-core/${TARBALL_NAME}" .
-```
-**Benefit:** Always uses correct version, no manual updates needed
-## New Exports
-```typescript
-export {
-  // Credit usage types
-  CreditUsageCallback,
-  CreditUsage,
-  CreditUsageReason,
-  // Existing exports...
-  TestChimpService,
-  // ...
-}
-```
-## Files Modified
-### runner-core
-1. `/src/scenario-worker-class.ts` - Removed timestamps, minimal logs, version from package.json
-2. `/src/scenario-service.ts` - Removed initialization logs
-3. `/src/credit-usage-service.ts` - Added callback architecture, callback-first logic
-4. `/src/index.ts` - Credit callback support, preserve across recreations
-### vs-extension
-1. `/src/embedded-service.ts` - `formatLocalTimestamp()` utility, wrapped outputChannel, consistent filtering
-2. `/build_local.sh` - Dynamic version detection
-3. `/package.json` - Updated to `^0.0.27`
-### github-action
-1. `/package.json` - Updated to `^0.0.27`
-## Published to npm
-```
-✅ Published: testchimp-runner-core@0.0.27
-📦 Package Size: ~245 kB
-📋 Registry: https://registry.npmjs.org/
-```
-## Benefits
-### Logging
-1. **Local Timezone** - Timestamps match user's clock
-2. **Clean Output** - Only essential information
-3. **Consumer Control** - Consumer decides format and filtering
-4. **No Verbose Init** - Single version log instead of 10+ lines
-### Credit Tracking
-1. **Server-Side** - Direct DB updates, no HTTP overhead
-2. **Client-Side** - Existing axios behavior preserved
-3. **Flexible** - Each consumer decides how to track
-4. **Observable** - Callback provides visibility
-### Architecture
-1. **Separation of Concerns** - Library reports, consumer presents
-2. **Environment Agnostic** - Library doesn't assume timezone/environment
-3. **Testable** - Easy to mock callbacks
-4. **Maintainable** - Clear boundaries
-## Migration
-### For All Consumers
-Update package.json:
-```json
-{
-  "dependencies": {
-    "testchimp-runner-core": "^0.0.27"
-  }
-}
-```
-### For Server-Side (Optional)
-Add credit callback:
-```typescript
-const service = new TestChimpService(
-  fileHandler, undefined, backendUrl, maxWorkers,
-  llmProvider, progressReporter, orchestratorOptions,
-  async (creditUsage) => {
-    await creditRepository.insert(creditUsage);
-  }
-);
-```
-## Backward Compatibility
-✅ Fully backward compatible
-- All parameters optional
-- Existing behavior preserved
-- No breaking changes
-## Complete Feature Set
-All improvements from today are now in 0.0.27:
-1. ✅ Semantic selector preference
-2. ✅ Playwright expect() assertions
-3. ✅ Script cleanup feature
-4. ✅ Fixed comment placement
-5. ✅ Focused step execution
-6. ✅ Orchestrator reasoning logs in output channel
-7. ✅ Environment-aware log filtering (consumer-side)
-8. ✅ Local timezone timestamps
-9. ✅ Clean initialization logs
-10. ✅ Credit callback architecture
-11. ✅ Version from package.json
-Ready for production! 🚀
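
The three credit-tracking paths listed under "Credit Callback Architecture" in the deleted release notes can be sketched as follows. This is an illustrative reading of those notes, not the package's code: `ReportPath`, `selectReportPath`, `reportCreditUsage`, and `postToBackend` are hypothetical names, and `CreditUsageReason` is modelled as a plain string because its members are not shown in the notes.

```typescript
// Shapes copied from the release notes. CreditUsageReason's members are not
// shown there, so a plain string stands in for it (assumption).
type CreditUsageReason = string;

interface CreditUsage {
  credits: number;
  usageReason: CreditUsageReason;
  jobId?: string;
  timestamp: number;
}

type CreditUsageCallback = (usage: CreditUsage) => void | Promise<void>;

// Hypothetical label for which of the three documented paths applies.
type ReportPath = "callback" | "backend" | "untracked";

function selectReportPath(hasCallback: boolean, hasAuth: boolean): ReportPath {
  if (hasCallback) return "callback"; // server-side: direct DB update, no HTTP call
  if (hasAuth) return "backend";      // client-side: HTTP call to the backend API
  return "untracked";                 // development: warn and continue
}

// Hypothetical dispatcher; `postToBackend` stands in for the axios call.
async function reportCreditUsage(
  usage: CreditUsage,
  callback?: CreditUsageCallback,
  postToBackend?: (usage: CreditUsage) => Promise<void>,
): Promise<ReportPath> {
  const path = selectReportPath(callback !== undefined, postToBackend !== undefined);
  if (path === "callback") {
    await callback?.(usage);
  } else if (path === "backend") {
    await postToBackend?.(usage);
  } else {
    console.warn("Credit usage not tracked: no callback or auth configured");
  }
  return path;
}
```

Keeping the path selection in a small pure function makes the callback-first rule easy to check without a database or a network.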

package/RELEASE_0.0.28.md DELETED

@@ -1,286 +0,0 @@
-# Runner-Core 0.0.28 Release - Scriptservice Integration
-## Overview
-Extended runner-core with lifecycle callbacks and created SmartTestRunnerCoreV2 wrapper to enable scriptservice integration. This release makes runner-core a drop-in replacement for scriptservice's duplicated SmartTestRunnerCore.
-## Changes
-### 1. Extended ProgressReporter with Lifecycle Callbacks
-**File:** `src/progress-reporter.ts`
-Added optional lifecycle callbacks to ProgressReporter interface:
-```typescript
-export interface StepInfo {
-  stepId?: string;
-  stepNumber: number;
-  description: string;
-  code?: string;
-}
-export interface ProgressReporter {
-  // ... existing callbacks ...
-  // NEW: Lifecycle callbacks (used by scriptservice, ignored by local clients)
-  beforeStartTest?(page: any, browser: any, context: any): Promise<void>;
-  beforeStepStart?(step: StepInfo, page: any): Promise<void>;
-  afterEndTest?(status: 'passed' | 'failed', error?: string, page?: any): Promise<void>;
-}
-```
-**Purpose:**
-- `beforeStartTest`: Initialize browser context, set up DB records (scriptservice only)
-- `beforeStepStart`: Update step status to IN_PROGRESS in DB (scriptservice only)
-- `afterEndTest`: Write final status to DB, cleanup resources (scriptservice only)
-- Local clients (vs-ext, github-action) can ignore these - they use return values
-### 2. Integrated Lifecycle Callbacks
-**Files:**
-- `src/execution-service.ts` - RUN_EXACTLY and AI_REPAIR modes
-- `src/scenario-worker-class.ts` - Orchestrator mode
-**Integration Points:**
-**ExecutionService (RUN_EXACTLY):**
-```typescript
-// Before script execution
-if (this.progressReporter?.beforeStartTest) {
-  await this.progressReporter.beforeStartTest(page, browser, context);
-}
-// After execution (success or failure)
-if (this.progressReporter?.afterEndTest) {
-  await this.progressReporter.afterEndTest(
-    success ? 'passed' : 'failed',
-    error,
-    page
-  );
-}
-```
-**ExecutionService (AI_REPAIR):**
-```typescript
-// Before each step
-if (this.progressReporter?.beforeStepStart) {
-  await this.progressReporter.beforeStepStart(
-    { stepNumber, description, code },
-    page
-  );
-}
-```
-**ScenarioWorker (Orchestrator):**
-```typescript
-// After browser initialization
-if (this.progressReporter?.beforeStartTest) {
-  await this.progressReporter.beforeStartTest(page, browser, context);
-}
-// Before each orchestrator step
-if (this.progressReporter?.beforeStepStart) {
-  await this.progressReporter.beforeStepStart({ stepNumber, description }, page);
-}
-// In finally block (before browser close)
-if (this.progressReporter?.afterEndTest) {
-  await this.progressReporter.afterEndTest(
-    overallSuccess ? 'passed' : 'failed',
-    error,
-    page
-  );
-}
-```
-### 3. Version Bump
-**File:** `package.json`
-Updated version from `0.0.27` to `0.0.28`
-## Scriptservice Integration
-### 1. Created ScriptserviceLLMProvider
-**File:** `services/scriptservice/providers/scriptservice-llm-provider.ts` (NEW)
-Implements `LLMProvider` interface for scriptservice:
-- Wraps scriptservice's existing OpenAI client
-- Supports both text and vision prompts
-- Uses scriptservice's OpenAI API key from ConfigService
-- No backend proxy - all calls are local
-- No token tracking needed (scriptservice doesn't use this interface for tracking)
-### 2. Created SmartTestRunnerCoreV2
-**File:** `services/scriptservice/smart-test-runner-core-v2.ts` (NEW)
-Drop-in replacement for SmartTestRunnerCore:
-- **Same Interface:** Maintains exact same constructor and methods (runExactly, runWithRepair)
-- **Uses runner-core:** Delegates to TestChimpService internally
-- **Callback Mapping:** Maps scriptservice callbacks to runner-core ProgressReporter
-- **Lifecycle Support:** Properly calls beforeStartTest, beforeStepStart, afterEndTest
-- **No Breaking Changes:** Can replace SmartTestRunnerCore with zero code changes
-**Key Features:**
-```typescript
-export class SmartTestRunnerCoreV2 {
-  constructor(config: RunnerConfig) {
-    // Create local LLM provider (no backend)
-    const llmProvider = new ScriptserviceLLMProvider();
-    // Map callbacks to progress reporter
-    const progressReporter: ProgressReporter = {
-      onStepProgress: async (stepProgress) => {
-        // Call scriptservice's onStepComplete
-      },
-      beforeStartTest: config.callbacks?.beforeStartTest,
-      beforeStepStart: config.callbacks?.beforeStepStart,
-      afterEndTest: config.callbacks?.afterEndTest
-    };
-    // Initialize runner-core (no auth, no backend)
-    this.runnerCore = new TestChimpService(
-      undefined, // No file handler
-      undefined, // No auth
-      undefined, // No backend URL
-      1,         // Single worker
-      llmProvider,
-      progressReporter
-    );
-  }
-  async runExactly(script: string): Promise<RunExactlyResult> { /* ... */ }
-  async runWithRepair(steps: TestStepWithId[]): Promise<RunWithRepairResult> { /* ... */ }
-}
-```
-### 3. Updated Scriptservice Consumers
-**Files Updated:**
-- `services/scriptservice/smart-test-execution-handler.ts`
-- `services/scriptservice/workers/test-based-explorer.ts`
-- `services/scriptservice/package.json` - Added `testchimp-runner-core: ^0.0.28`
-**Migration Strategy:**
-```typescript
-// Environment flag for gradual rollout
-const useV2 = process.env.USE_RUNNER_CORE_V2 !== 'false'; // Default: true
-const runner = useV2
-  ? new SmartTestRunnerCoreV2(runnerConfig)
-  : new SmartTestRunnerCore(runnerConfig);
-```
-**Rollout Plan:**
-1. Deploy with `USE_RUNNER_CORE_V2=true` (default)
-2. Monitor scriptservice execution logs
-3. If issues occur, set `USE_RUNNER_CORE_V2=false` to rollback
-4. Once validated, remove flag and delete old SmartTestRunnerCore
-## Benefits
-### For Scriptservice:
-1. **Zero Axios Calls:** All LLM calls local, all DB writes via callbacks
-2. **Auto-Updates:** Automatically get runner-core improvements (better prompts, vision, orchestrator)
-3. **Code Deduplication:** Remove ~600 lines of duplicated execution/repair logic
-4. **Consistency:** Same prompts and logic as vs-extension and github-action
-### For Runner-Core:
-1. **Universal Library:** Works for both client-side (vs-ext) and server-side (scriptservice)
-2. **Flexible Architecture:** Callbacks allow different execution patterns
-3. **Backward Compatible:** Existing consumers unaffected (callbacks are optional)
-## Testing
-### Scriptservice Testing:
-```bash
-# Test with V2 (default)
-USE_RUNNER_CORE_V2=true npm start
-# Test with V1 (legacy fallback)
-USE_RUNNER_CORE_V2=false npm start
-```
-### Expected Behavior:
-- V2: Logs show "using SmartTestRunnerCoreV2 (runner-core)"
-- V1: Logs show "using SmartTestRunnerCore (legacy)"
-- All callbacks (beforeStartTest, beforeStepStart, afterEndTest, onStepComplete) should fire
-- DB writes should happen via callbacks
-- Screenshots should upload to GCS
-- Journey reporting should work
-## Publishing
-### Prerequisites:
-1. ✅ Build runner-core: `npm run build`
-2. ✅ No lint errors
-3. ✅ Version bumped to 0.0.28
-4. ⏸️ **Awaiting user permission to publish to npm**
-### Publish Command (when approved):
-```bash
-cd /Users/nuwansam/IdeaProjects/AwareRepo/local/runner-core
-npm publish
-```
-### Post-Publish:
-1. Install in scriptservice: `npm install testchimp-runner-core@0.0.28`
-2. Install in vs-ext: Update to 0.0.28
-3. Install in github-action: Update to 0.0.28
-## Migration Checklist
-### Phase 1: Deploy with V2 (Default)
-- [x] Add lifecycle callbacks to ProgressReporter
-- [x] Integrate callbacks in ExecutionService and ScenarioWorker
-- [x] Create ScriptserviceLLMProvider
-- [x] Create SmartTestRunnerCoreV2
-- [x] Update scriptservice consumers with environment flag
-- [ ] Publish runner-core 0.0.28 (needs user permission)
-- [ ] Install 0.0.28 in scriptservice
-- [ ] Deploy scriptservice with USE_RUNNER_CORE_V2=true
-- [ ] Monitor logs and execution results
-### Phase 2: Validate (1-2 weeks)
-- [ ] Verify all smart test executions work
-- [ ] Verify explorer mode works
-- [ ] Verify script generation works
-- [ ] Check DB writes happen correctly
-- [ ] Check screenshot uploads work
-- [ ] Monitor for any errors or regressions
-### Phase 3: Full Migration
-- [ ] Remove environment flag check
-- [ ] Replace all `new SmartTestRunnerCore` with `new SmartTestRunnerCoreV2`
-- [ ] Delete old `smart-test-runner-core.ts` file
-- [ ] Optional: Rename V2 to just `SmartTestRunnerCore`
-## Breaking Changes
-None. All changes are backward compatible.
-## Files Changed
-### Runner-Core:
-- `src/progress-reporter.ts` - Added lifecycle callbacks
-- `src/execution-service.ts` - Integrated callbacks
-- `src/scenario-worker-class.ts` - Integrated callbacks
-- `package.json` - Version bump to 0.0.28
-### Scriptservice:
-- `providers/scriptservice-llm-provider.ts` - NEW
-- `smart-test-runner-core-v2.ts` - NEW
-- `smart-test-execution-handler.ts` - Added V2 usage with flag
-- `workers/test-based-explorer.ts` - Added V2 usage with flag
-- `package.json` - Added testchimp-runner-core dependency
-## Notes
-- Lifecycle callbacks are **optional** - local clients (vs-ext, github-action) don't need them
-- Scriptservice uses callbacks for DB writes and resource management
-- V2 wrapper uses existing page/browser/context (doesn't create new ones)
-- LLM calls in V2 are all local (no backend proxy)
-- No authentication needed for scriptservice (already authenticated at service level)
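
The call order of the optional lifecycle callbacks described in the deleted 0.0.28 notes can be sketched as below. The interfaces repeat what the notes quote, with the pre-existing callbacks left out. `runWithLifecycle` and `runStep` are hypothetical names introduced only for illustration, and treating a thrown error as a `'failed'` status is an assumption, not something the notes confirm.

```typescript
// Interfaces as quoted in the release notes (existing callbacks omitted).
interface StepInfo {
  stepId?: string;
  stepNumber: number;
  description: string;
  code?: string;
}

interface ProgressReporter {
  beforeStartTest?(page: any, browser: any, context: any): Promise<void>;
  beforeStepStart?(step: StepInfo, page: any): Promise<void>;
  afterEndTest?(status: 'passed' | 'failed', error?: string, page?: any): Promise<void>;
}

// Hypothetical driver: `runStep` stands in for whatever executes one step.
async function runWithLifecycle(
  steps: StepInfo[],
  reporter: ProgressReporter | undefined,
  runStep: (step: StepInfo) => Promise<void>,
  page: any = {},
  browser: any = {},
  context: any = {},
): Promise<'passed' | 'failed'> {
  let status: 'passed' | 'failed' = 'passed';
  let error: string | undefined;
  try {
    // Every callback is optional, so local clients can pass no reporter at all.
    await reporter?.beforeStartTest?.(page, browser, context);
    for (const step of steps) {
      await reporter?.beforeStepStart?.(step, page);
      await runStep(step);
    }
  } catch (e) {
    status = 'failed';
    error = e instanceof Error ? e.message : String(e);
  } finally {
    // Runs on success and failure alike, mirroring the "finally block" note.
    await reporter?.afterEndTest?.(status, error, page);
  }
  return status;
}
```

Because each hook is invoked through optional chaining, a consumer that supplies only some of the callbacks, or none, follows the same code path as one that supplies all three.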

package/plandocs/BEFORE_AFTER_VERIFICATION.md DELETED

@@ -1,148 +0,0 @@
-# Before/After Screenshot Verification
-## Feature: Visual Goal Verification for Coordinate Actions
-### Problem Solved:
-When using coordinate-based actions (clicking at x,y%), the agent has no way to know if the click achieved the goal:
-- No element reference to check state
-- No selector feedback
-- Can't verify if expected page loaded or modal opened
-This led to:
-- False positives (click succeeded but goal not achieved)
-- Infinite loops (agent keeps clicking, unsure if it worked)
-### Solution:
-Automatic before/after screenshot comparison after coordinate clicks.
-## How It Works:
-### 1. **Automatic Trigger** (No Agent Action Required)
-When agent uses coordinate action:
-```
-Iteration 4: 🎯 Coordinate mode activated
-  Step 1: Capture BEFORE screenshot
-  Step 2: Execute coordinate click (x%, y%)
-  Step 3: Wait 1000ms for UI to settle
-  Step 4: Capture AFTER screenshot
-  Step 5: Call LLM with both images (labeled "BEFORE", "AFTER")
-  Step 6: LLM responds: { goalAchieved: true/false, reasoning: "..." }
-  Step 7a: If TRUE → Mark complete, exit step ✅
-  Step 7b: If FALSE → Continue to next iteration, try different coordinates
-```
-### 2. **LLM Prompt for Verification**
-```
-Goal: [Current step goal]
-Compare the BEFORE and AFTER screenshots.
-Did the action achieve the goal? Respond with JSON:
-{
-  "goalAchieved": boolean,
-  "reasoning": "What changed (or didn't change)",
-  "visibleChanges": ["List of UI changes observed"]
-}
-Focus on:
-- Did expected elements appear/disappear?
-- Did page navigate or content change?
-- Visual indicators of success (new panels, forms, highlights)?
-Be strict: Only return true if you clearly see the expected change.
-```
-### 3. **Multi-Image LLM Interface**
-```typescript
-// NEW: LabeledImage interface
-export interface LabeledImage {
-  label: string;      // "Before", "After", etc.
-  dataUrl: string;    // Base64 data URL
-}
-// UPDATED: LLMRequest
-export interface LLMRequest {
-  imageUrl?: string;         // Backward compatible (single image)
-  images?: LabeledImage[];   // NEW - multi-image support
-}
-```
-### 4. **Provider Implementation** (scriptservice-llm-provider.ts)
-```typescript
-if (request.images && request.images.length > 0) {
-  for (const img of request.images) {
-    contentParts.push({ type: 'text', text: `\n[${img.label}]:` });
-    contentParts.push({ type: 'image_url', image_url: { url: img.dataUrl } });
-  }
-  // Sends: [BEFORE]: <image1>, [AFTER]: <image2>
-}
-```
-## When Verification Happens:
-✅ **Always**: After first coordinate action attempt
-❌ **Never**: After selector-based actions (have element state to check)
-⚠️ **Conditional**: Can add for other scenarios where goal verification is unclear
-## Cost Considerations:
-**Per verification call:**
-- 2 viewport screenshots (~50-100KB each)
-- Vision model (gpt-5-mini): ~$0.001 per call
-- Used only when coordinate mode activates (after 3 selector failures)
-**Typical scenario:**
-- Steps 1-10: Regular selectors → No verification cost
-- Step 5 gets stuck → Coordinate mode → 1 verification call → $0.001
-- Overall impact: Minimal, used sparingly
-## Example Flow:
-**Step 5: "Select Employee Information"**
-```
-Iteration 1: getByText('Employee Information') → Strict mode ❌
-Iteration 2: locator('#collapse-1').getByText('Employee Information') → Click succeeds ✅
-           BUT: Didn't navigate to Employee Information page (false positive)
-Iteration 3: Selector fails again
-Iteration 4: 🎯 Coordinate mode
-  → BEFORE: Homepage with sidebar
-  → Click at (19.3%, 22.9%)
-  → Wait 1s
-  → AFTER: Check screenshot
-  → LLM: "goalAchieved": true, "reasoning": "Employee Information page loaded with form"
-  → ✅ Mark complete, exit
-```
-## Backward Compatibility:
-✅ **Single image still works:**
-```typescript
-const request = {
-  imageUrl: 'data:image/png;base64,...'  // Old way
-};
-```
-✅ **Multi-image NEW:**
-```typescript
-const request = {
-  images: [
-    { label: 'BEFORE', dataUrl: '...' },
-    { label: 'AFTER', dataUrl: '...' }
-  ]
-};
-```
-## Files Modified:
-1. `runner-core/src/llm-provider.ts` - Added LabeledImage interface and images field
-2. `scriptservice/providers/scriptservice-llm-provider.ts` - Handle multiple images in OpenAI API
-3. `runner-core/src/orchestrator/orchestrator-agent.ts` - Added verifyGoalWithScreenshotComparison method
-4. Automatic trigger after coordinate actions
-## Next Steps:
-- ✅ Infrastructure ready
-- ⏳ Need to test with real scenario
-- 🔮 Future: Could expose as agent-callable tool if needed
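
The multi-image request handling described in the deleted plan document, together with its single-image fallback, can be sketched as a small pure function. The `LabeledImage` and `LLMRequest` shapes repeat what the document quotes (other `LLMRequest` fields are omitted). `ContentPart` and `buildImageParts` are hypothetical names, and giving `images` precedence over `imageUrl` is an assumption, since the document does not say which wins when both are set.

```typescript
// Interfaces as quoted in the plan document (other LLMRequest fields omitted).
interface LabeledImage {
  label: string;   // "Before", "After", etc.
  dataUrl: string; // Base64 data URL
}

interface LLMRequest {
  imageUrl?: string;       // single image, backward compatible
  images?: LabeledImage[]; // multi-image support
}

// Assumed part shape, modelled on the objects pushed in the provider excerpt.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

// Hypothetical helper: turns a request into an ordered list of content parts.
function buildImageParts(request: LLMRequest): ContentPart[] {
  const parts: ContentPart[] = [];
  if (request.images && request.images.length > 0) {
    // Each image is preceded by a text part carrying its label.
    for (const img of request.images) {
      parts.push({ type: 'text', text: `\n[${img.label}]:` });
      parts.push({ type: 'image_url', image_url: { url: img.dataUrl } });
    }
  } else if (request.imageUrl) {
    // Assumed fallback: the legacy single image is sent without a label.
    parts.push({ type: 'image_url', image_url: { url: request.imageUrl } });
  }
  return parts;
}
```

Interleaving a label part before each image is what lets the verification prompt refer to the screenshots as BEFORE and AFTER.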