npm - testchimp-runner-core - Versions diffs - 0.0.35 → 0.0.36 - Mend

testchimp-runner-core 0.0.35 → 0.0.36

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (71)

package/package.json +6 -1
package/plandocs/BEFORE_AFTER_VERIFICATION.md +0 -148
package/plandocs/COORDINATE_MODE_DIAGNOSIS.md +0 -144
package/plandocs/CREDIT_CALLBACK_ARCHITECTURE.md +0 -253
package/plandocs/HUMAN_LIKE_IMPROVEMENTS.md +0 -642
package/plandocs/IMPLEMENTATION_STATUS.md +0 -108
package/plandocs/INTEGRATION_COMPLETE.md +0 -322
package/plandocs/MULTI_AGENT_ARCHITECTURE_REVIEW.md +0 -844
package/plandocs/ORCHESTRATOR_MVP_SUMMARY.md +0 -539
package/plandocs/PHASE1_ABSTRACTION_COMPLETE.md +0 -241
package/plandocs/PHASE1_FINAL_STATUS.md +0 -210
package/plandocs/PHASE_1_COMPLETE.md +0 -165
package/plandocs/PHASE_1_SUMMARY.md +0 -184
package/plandocs/PLANNING_SESSION_SUMMARY.md +0 -372
package/plandocs/PROMPT_OPTIMIZATION_ANALYSIS.md +0 -120
package/plandocs/PROMPT_SANITY_CHECK.md +0 -120
package/plandocs/SCRIPT_CLEANUP_FEATURE.md +0 -201
package/plandocs/SCRIPT_GENERATION_ARCHITECTURE.md +0 -364
package/plandocs/SELECTOR_IMPROVEMENTS.md +0 -139
package/plandocs/SESSION_SUMMARY_v0.0.33.md +0 -151
package/plandocs/TROUBLESHOOTING_SESSION.md +0 -72
package/plandocs/VISION_DIAGNOSTICS_IMPROVEMENTS.md +0 -336
package/plandocs/VISUAL_AGENT_EVOLUTION_PLAN.md +0 -396
package/plandocs/WHATS_NEW_v0.0.33.md +0 -183
package/plandocs/exploratory-mode-support-v2.plan.md +0 -953
package/plandocs/exploratory-mode-support.plan.md +0 -928
package/plandocs/journey-id-tracking-addendum.md +0 -227
package/releasenotes/RELEASE_0.0.26.md +0 -165
package/releasenotes/RELEASE_0.0.27.md +0 -236
package/releasenotes/RELEASE_0.0.28.md +0 -286
package/src/auth-config.ts +0 -84
package/src/credit-usage-service.ts +0 -188
package/src/env-loader.ts +0 -103
package/src/execution-service.ts +0 -996
package/src/file-handler.ts +0 -104
package/src/index.ts +0 -432
package/src/llm-facade.ts +0 -821
package/src/llm-provider.ts +0 -53
package/src/model-constants.ts +0 -35
package/src/orchestrator/decision-parser.ts +0 -139
package/src/orchestrator/index.ts +0 -58
package/src/orchestrator/orchestrator-agent.ts +0 -1282
package/src/orchestrator/orchestrator-prompts.ts +0 -786
package/src/orchestrator/page-som-handler.ts +0 -1565
package/src/orchestrator/som-types.ts +0 -188
package/src/orchestrator/tool-registry.ts +0 -184
package/src/orchestrator/tools/check-page-ready.ts +0 -75
package/src/orchestrator/tools/extract-data.ts +0 -92
package/src/orchestrator/tools/index.ts +0 -15
package/src/orchestrator/tools/inspect-page.ts +0 -42
package/src/orchestrator/tools/recall-history.ts +0 -72
package/src/orchestrator/tools/refresh-som-markers.ts +0 -69
package/src/orchestrator/tools/take-screenshot.ts +0 -128
package/src/orchestrator/tools/verify-action-result.ts +0 -159
package/src/orchestrator/tools/view-previous-screenshot.ts +0 -103
package/src/orchestrator/types.ts +0 -291
package/src/playwright-mcp-service.ts +0 -224
package/src/progress-reporter.ts +0 -144
package/src/prompts.ts +0 -842
package/src/providers/backend-proxy-llm-provider.ts +0 -91
package/src/providers/local-llm-provider.ts +0 -38
package/src/scenario-service.ts +0 -252
package/src/scenario-worker-class.ts +0 -1110
package/src/script-utils.ts +0 -203
package/src/types.ts +0 -239
package/src/utils/browser-utils.ts +0 -348
package/src/utils/coordinate-converter.ts +0 -162
package/src/utils/page-info-retry.ts +0 -65
package/src/utils/page-info-utils.ts +0 -285
package/testchimp-runner-core-0.0.35.tgz +0 -0
package/tsconfig.json +0 -19

package/plandocs/IMPLEMENTATION_STATUS.md DELETED

@@ -1,108 +0,0 @@
-# Runner-Core Visual Agent Implementation Status
-## Phase 1: ✅ COMPLETE (v0.0.33)
-### Implemented Features:
-1. **Note to Future Self** - Tactical iteration memory
-2. **Percentage-Based Coordinates** - Last-resort fallback with 3-decimal precision
-3. **Two-Tier Auto-Escalation** - Code-controlled mode switching
-### Current Behavior (Phase 1):
-```
-Iteration 1-3: Normal Playwright selectors + note-to-self (3 attempts)
-    ↓ (after 3 failures)
-Iteration 4-5: Percentage coordinates (2 attempts max)
-    ↓ (if both coordinate attempts fail)
-Give up - mark as stuck
-Total: Maximum 5 iterations per step
-```
----
-## Phase 2: 📋 PLANNED (Not Started)
-### Will Add:
-1. **ElementDetector** - Detect interactive elements with z-index awareness
-2. **VisualMarkerInjector** - Number elements [1], [2], [3] on screenshot
-3. **SelectorResolver** - Translate index → native Playwright selector
-4. **IndexCommandTranslator** - Convert CLICK[3] → native Playwright command
-### Future Behavior (Phase 2):
-```
-Iteration 1: Playwright selector (1 attempt) → 70% success
-    ↓ (on first failure)
-Iteration 2-3: Index commands CLICK[3] (2 attempts) → 25% success
-    ↓ (after 3 total failures)
-Iteration 4-5: Percentage coordinates (2 attempts max) → 5% success
-    ↓ (if all fail)
-Give up - mark as stuck
-Total: Maximum 5 iterations per step (down from 8)
-Average: ~1.5 iterations per step (fast!)
-```
-### Key Design Principle for Phase 2:
-**During Execution:**
-- Agent clicks using `data-testchimp-el="[3]"` (reliable, we inject it)
-**In Generated Script:**
-- Translator outputs NATIVE selector: `getByRole('button', {name: 'Menu'})`
-- Script works standalone without data-testchimp-el
-**Why Two-Stage:**
-1. Agent needs reliability during exploration → use data attribute
-2. Generated script must be portable → use native selectors
-3. Best of both worlds: reliable execution + maintainable output
----
-## Optimizations vs Original Plan
-### Original Plan:
-- Tier 1: iterations 1-2
-- Tier 2: iterations 3-4
-- Tier 3: iterations 5+
-- Average: ~4 iterations per step
-### Optimized Plan (Current):
-- Tier 1: iteration 1 ONLY (fast path)
-- Tier 2: iterations 2-3 (reliable fallback)
-- Tier 3: iterations 4+ (absolute last resort)
-- **Target: ~1.5 average iterations per step**
-**Rationale:** Don't waste time! Simple tasks finish in 1 iteration, complex tasks escalate quickly to more reliable methods.
----
-## Testing Checklist
-### Phase 1 (Ready Now):
-- [ ] Run PeopleHR scenario - verify note-to-self helps
-- [ ] Test coordinate fallback on deliberately difficult case
-- [ ] Measure iteration reduction (expect 20-30%)
-- [ ] Verify timeout fixes for waitForLoadState
-### Phase 2 (When Implemented):
-- [ ] Test ElementDetector on modals/overlays
-- [ ] Verify z-index occlusion detection
-- [ ] Validate native selector generation (no data-testchimp-el in output)
-- [ ] Run generated scripts standalone - must work!
-- [ ] Measure tier distribution: 70/25/5
----
-## Current Version
-**Runner-Core:** v0.0.33
-**Status:** Built and ready to test
-**Phase 1:** ✅ Complete
-**Phase 2:** 📋 Planned but not started
-**Next Step:** Test Phase 1 with PeopleHR scenario to validate improvements before implementing Phase 2.
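
Note on the hunk above: the two escalation ladders in the deleted plan (Phase 1 and the planned Phase 2) can be read as a simple mapping from iteration number to interaction tier. The sketch below restates that schedule in TypeScript. It is not taken from the package source; `Tier`, `Phase`, `tierForIteration`, and `schedule` are hypothetical names, and only the iteration boundaries come from the plan text.

```typescript
// Hypothetical names throughout; only the iteration boundaries come from the plan.
type Tier = 'selector' | 'index' | 'coordinates' | 'stuck';
type Phase = 1 | 2;

// Map a 1-based iteration number to the tier the plan prescribes for it.
function tierForIteration(phase: Phase, iteration: number): Tier {
  if (!Number.isInteger(iteration) || iteration < 1) {
    throw new RangeError(`iteration must be a positive integer, got ${iteration}`);
  }
  if (iteration > 5) return 'stuck';             // both phases cap at 5 iterations per step
  if (iteration >= 4) return 'coordinates';      // iterations 4-5: percentage coordinates
  if (phase === 1) return 'selector';            // Phase 1: iterations 1-3 use Playwright selectors
  return iteration === 1 ? 'selector' : 'index'; // Phase 2: one selector attempt, then CLICK[n]
}

// List the tiers for one step, ending with the point where the plan gives up.
function schedule(phase: Phase): Tier[] {
  const tiers: Tier[] = [];
  for (let i = 1; ; i++) {
    const tier = tierForIteration(phase, i);
    tiers.push(tier);
    if (tier === 'stuck') return tiers;
  }
}
```

Under these assumptions, `schedule(1)` lists three selector attempts followed by two coordinate attempts, while `schedule(2)` lists one selector attempt, two index attempts, and two coordinate attempts; both end in `'stuck'` at the sixth position.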

package/plandocs/INTEGRATION_COMPLETE.md DELETED

@@ -1,322 +0,0 @@
-# Runner-Core 0.0.28 - Scriptservice Integration Complete
-## ✅ Implementation Summary
-Successfully extended runner-core to support both local clients (vs-ext, github-action) and server-side usage (scriptservice) through:
-1. **Lifecycle callbacks** for server-side DB writes and resource management
-2. **Existing browser support** to reuse caller's browser instead of creating new ones
-3. **Drop-in SmartTestRunnerCoreV2** wrapper for seamless migration
-## Changes Made
-### 1. Extended ProgressReporter with Lifecycle Callbacks
-**File:** `src/progress-reporter.ts`
-Added optional lifecycle callbacks (used by scriptservice, ignored by local clients):
-```typescript
-export interface StepInfo {
-  stepId?: string;
-  stepNumber: number;
-  description: string;
-  code?: string;
-}
-export interface ProgressReporter {
-  // Existing callbacks...
-  onStepProgress?(): Promise<void>;
-  onJobProgress?(): Promise<void>;
-  // NEW: Lifecycle callbacks
-  beforeStartTest?(page: any, browser: any, context: any): Promise<void>;
-  beforeStepStart?(step: StepInfo, page: any): Promise<void>;
-  afterEndTest?(status: 'passed' | 'failed', error?: string, page?: any): Promise<void>;
-}
-```
-**Exported StepInfo:** Added to index.ts exports for consumer usage.
-### 2. Integrated Lifecycle Callbacks
-**Files:**
-- `src/execution-service.ts` - RUN_EXACTLY and RUN_WITH_AI_REPAIR modes
-- `src/scenario-worker-class.ts` - Orchestrator mode
-**Integration Points:**
-- `beforeStartTest`: Called after browser initialization, before execution
-- `beforeStepStart`: Called before each step execution
-- `afterEndTest`: Called after all execution completes (success or failure)
-**Note:** All callbacks are optional - local clients don't provide them.
-### 3. Added Existing Browser Support
-**File:** `src/types.ts`
-Extended `ScriptExecutionRequest` to accept existing browser:
-```typescript
-export interface ScriptExecutionRequest {
-  // ... existing fields ...
-  // Optional: Provide existing browser/page/context (for server-side usage)
-  // If not provided, runner-core will create its own
-  existingBrowser?: any;
-  existingContext?: any;
-  existingPage?: any;
-}
-```
-**File:** `src/execution-service.ts`
-Updated `runExactly` and `runWithAIRepair` to:
-- Check if existing browser is provided
-- Use existing browser if available (don't create new one)
-- Don't close browser if we didn't create it (caller owns it)
-**Benefits:**
-- **Local clients:** Continue creating their own browser (no change)
-- **Scriptservice:** Reuses its existing browser (no double initialization)
-### 4. Created ScriptserviceLLMProvider
-**File:** `services/scriptservice/providers/scriptservice-llm-provider.ts` (NEW)
-Implements `LLMProvider` interface:
-- Wraps scriptservice's existing OpenAI client
-- Supports text and vision prompts
-- Uses scriptservice's OpenAI API key from ConfigService
-- No backend proxy - all calls are local
-- No auth needed (scriptservice is already authenticated)
-```typescript
-export class ScriptserviceLLMProvider implements LLMProvider {
-  async callLLM(request: LLMRequest): Promise<LLMResponse> {
-    // Uses scriptservice's OpenAI client
-    // Supports both text and vision (imageUrl)
-    // Returns answer and optional usage
-  }
-}
-```
-### 5. Created SmartTestRunnerCoreV2
-**File:** `services/scriptservice/smart-test-runner-core-v2.ts` (NEW)
-Drop-in replacement for SmartTestRunnerCore:
-**Same Interface:**
-```typescript
-constructor(config: RunnerConfig)
-async runExactly(script: string): Promise<RunExactlyResult>
-async runWithRepair(steps: TestStepWithId[]): Promise<RunWithRepairResult>
-```
-**Key Features:**
-- Uses `TestChimpService` (runner-core) internally
-- Maps scriptservice callbacks to ProgressReporter
-- Provides existing browser to runner-core (via `existingBrowser`, `existingContext`, `existingPage`)
-- Automatically gets runner-core improvements (better prompts, vision, semantic selectors)
-**Callback Mapping:**
-```typescript
-const progressReporter: ProgressReporter = {
-  onStepProgress: async (stepProgress) => {
-    // Calls scriptservice's onStepComplete
-    await config.callbacks.onStepComplete(...);
-  },
-  beforeStartTest: config.callbacks?.beforeStartTest,
-  beforeStepStart: config.callbacks?.beforeStepStart,
-  afterEndTest: config.callbacks?.afterEndTest
-};
-```
-**runWithRepair Implementation:**
-- Converts steps to Playwright script
-- Calls `runnerCore.executeScript` with mode `RUN_WITH_AI_REPAIR`
-- Provides existing browser/page/context
-- Maps results back to scriptservice format
-### 6. Updated Scriptservice Consumers
-**Files:**
-- `services/scriptservice/smart-test-execution-handler.ts`
-- `services/scriptservice/workers/test-based-explorer.ts`
-- `services/scriptservice/package.json` - Added `testchimp-runner-core: file:../../local/runner-core`
-**Migration Strategy:**
-```typescript
-// Default to V2, can toggle with environment variable
-const useV2 = process.env.USE_RUNNER_CORE_V2 !== 'false'; // Default: true
-const runner = useV2
-  ? new SmartTestRunnerCoreV2(runnerConfig)
-  : new SmartTestRunnerCore(runnerConfig);
-```
-**Rollback:**
-```bash
-export USE_RUNNER_CORE_V2=false
-```
-## Testing Instructions
-### 1. Install Local Runner-Core
-```bash
-cd /Users/nuwansam/IdeaProjects/AwareRepo/services/scriptservice
-# Manual install (if npm install doesn't work)
-./install-local-runner-core.sh
-# Or manually
-rm -rf node_modules/testchimp-runner-core
-mkdir -p node_modules/testchimp-runner-core
-cp -r ../../local/runner-core/dist node_modules/testchimp-runner-core/
-cp ../../local/runner-core/package.json node_modules/testchimp-runner-core/
-```
-### 2. Build Scriptservice
-```bash
-cd /Users/nuwansam/IdeaProjects/AwareRepo/services/scriptservice
-rm -rf dist
-npx tsc
-```
-### 3. Run Scriptservice
-```bash
-npm run start:staging
-# or
-npm run start:prod
-```
-### 4. Test Script Execution
-**Look for log message:**
-```
-using SmartTestRunnerCoreV2 (runner-core)
-```
-**Test JSON payload:**
-```json
-{
-  "playwrightConfig": "{}",
-  "targetUrl": "https://studio--cafetime-afg2v.us-central1.hosted.app/",
-  "scenario": "- Go to https://studio--cafetime-afg2v.us-central1.hosted.app/\n- Login with alice@example.com, TestPass123\n- Go to Messages tab\n- Send a message \"Hello\"\n- Verify conversation thread shows the sent message\n- Verify message input field is now empty",
-  "projectId": "test-project"
-}
-```
-### 5. Verify Behaviors
-- ✅ DB writes happen via callbacks
-- ✅ Screenshots upload to GCS correctly
-- ✅ LLM calls work (all local, no backend proxy)
-- ✅ Smart test execution completes
-- ✅ Explorer mode works
-- ✅ Error handling and reporting work
-- ✅ Lifecycle callbacks fire (beforeStartTest, beforeStepStart, afterEndTest)
-## Benefits
-### For Scriptservice:
-1. **Zero Code Duplication:** Removed ~600 lines of duplicated execution/repair logic
-2. **Auto-Updates:** Automatically gets runner-core improvements
-3. **Better Prompts:** Semantic selectors, improved vision, cleaner scripts
-4. **Consistent Behavior:** Same logic as vs-extension and github-action
-5. **No Breaking Changes:** Drop-in replacement with environment flag for safety
-### For Runner-Core:
-1. **Universal Library:** Works for both client-side and server-side
-2. **Flexible Architecture:** Callbacks and browser injection allow different execution patterns
-3. **Backward Compatible:** Existing consumers unaffected (all new features are optional)
-## Architecture
-```
-┌─────────────────────────────────────────────────────────────┐
-│                      SCRIPTSERVICE                          │
-│                                                             │
-│  ┌────────────────────────────────────────────────────┐   │
-│  │      SmartTestRunnerCoreV2 (Thin Wrapper)          │   │
-│  │  - Maps scriptservice callbacks to ProgressReporter│   │
-│  │  - Provides existing browser to runner-core        │   │
-│  │  - Converts steps ↔ script format                  │   │
-│  └───────────────┬────────────────────────────────────┘   │
-│                  │                                          │
-│                  ▼                                          │
-│  ┌────────────────────────────────────────────────────┐   │
-│  │     ScriptserviceLLMProvider (Local OpenAI)        │   │
-│  │  - Wraps scriptservice's OpenAI client             │   │
-│  │  - No backend proxy, all calls local               │   │
-│  └────────────────────────────────────────────────────┘   │
-│                                                             │
-└─────────────────────────────────────────────────────────────┘
-                               │
-                               ▼
-┌─────────────────────────────────────────────────────────────┐
-│                    RUNNER-CORE (v0.0.28)                    │
-│                                                             │
-│  ┌────────────────────────────────────────────────────┐   │
-│  │             ExecutionService                        │   │
-│  │  - Accepts existingBrowser/Page/Context (NEW)      │   │
-│  │  - Calls lifecycle callbacks (NEW)                 │   │
-│  │  - RUN_EXACTLY and RUN_WITH_AI_REPAIR modes        │   │
-│  └────────────────────────────────────────────────────┘   │
-│                                                             │
-│  ┌────────────────────────────────────────────────────┐   │
-│  │            ProgressReporter Interface               │   │
-│  │  - onStepProgress (existing)                       │   │
-│  │  - beforeStartTest (NEW)                           │   │
-│  │  - beforeStepStart (NEW)                           │   │
-│  │  - afterEndTest (NEW)                              │   │
-│  └────────────────────────────────────────────────────┘   │
-│                                                             │
-└─────────────────────────────────────────────────────────────┘
-```
-## Files Changed
-### Runner-Core:
-1. `src/progress-reporter.ts` - Added StepInfo and lifecycle callbacks
-2. `src/index.ts` - Export StepInfo
-3. `src/types.ts` - Added existingBrowser/Context/Page fields
-4. `src/execution-service.ts` - Support existing browser and lifecycle callbacks
-5. `src/scenario-worker-class.ts` - Call lifecycle callbacks in orchestrator mode
-6. `package.json` - Version 0.0.28
-### Scriptservice:
-7. `providers/scriptservice-llm-provider.ts` - NEW
-8. `smart-test-runner-core-v2.ts` - NEW
-9. `smart-test-execution-handler.ts` - Use V2 by default
-10. `workers/test-based-explorer.ts` - Use V2 by default
-11. `package.json` - Added testchimp-runner-core dependency
-12. `install-local-runner-core.sh` - NEW (helper script for local testing)
-## Next Steps
-### Before Publishing to npm:
-1. ✅ Test runExactly mode in scriptservice
-2. ✅ Test runWithRepair mode in scriptservice
-3. ✅ Verify DB writes and screenshots work
-4. ✅ Monitor logs for errors
-### After Validation:
-1. Publish runner-core 0.0.28 to npm
-2. Update scriptservice package.json to use `^0.0.28` instead of `file:...`
-3. Deploy to staging
-4. After 1-2 weeks, remove environment flag and delete old SmartTestRunnerCore
-## Breaking Changes
-None. All changes are backward compatible and opt-in.
-## Notes
-- Lifecycle callbacks are optional - local clients don't need them
-- Existing browser is optional - if not provided, runner-core creates its own
-- V2 defaults to enabled but can be disabled with `USE_RUNNER_CORE_V2=false`
-- Old SmartTestRunnerCore remains available as fallback
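
Note on the hunk above: the deleted document describes two rules that interact, namely that lifecycle callbacks are optional and that runner-core closes a browser only when it created it. The sketch below illustrates one way those rules could fit together. It is an assumption-laden illustration, not the package's implementation: `BrowserLike`, `LifecycleHooks`, `RunOptions`, `launch`, and `runWithLifecycle` are hypothetical stand-ins, the callback signatures are simplified, and running `afterEndTest` before closing the browser is an assumed ordering that the document does not state.

```typescript
// Minimal stand-ins for this sketch; runner-core's real types are not shown in the diff.
interface BrowserLike {
  close(): Promise<void>;
}

interface LifecycleHooks {
  beforeStartTest?(browser: BrowserLike): Promise<void>;
  afterEndTest?(status: 'passed' | 'failed', error?: string): Promise<void>;
}

interface RunOptions {
  existingBrowser?: BrowserLike;      // supplied by a server-side caller, if any
  launch: () => Promise<BrowserLike>; // hypothetical factory used when none is supplied
  hooks?: LifecycleHooks;             // every callback is optional
}

async function runWithLifecycle(
  options: RunOptions,
  body: (browser: BrowserLike) => Promise<void>,
): Promise<'passed' | 'failed'> {
  // The runner owns the browser only if the caller did not provide one.
  const ownsBrowser = options.existingBrowser === undefined;
  const browser = options.existingBrowser ?? (await options.launch());
  let status: 'passed' | 'failed' = 'passed';
  let error: string | undefined;
  try {
    await options.hooks?.beforeStartTest?.(browser);
    await body(browser);
  } catch (e) {
    status = 'failed';
    error = e instanceof Error ? e.message : String(e);
  } finally {
    try {
      // Assumed ordering: report the outcome while the browser is still open.
      await options.hooks?.afterEndTest?.(status, error);
    } finally {
      // Close only what this function created; a caller-supplied browser stays open.
      if (ownsBrowser) await browser.close();
    }
  }
  return status;
}
```

A caller that passes `existingBrowser` keeps responsibility for closing it, which matches the "caller owns it" rule in the document; a caller that omits it gets a browser that is launched and closed inside the call.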