npm - ai-or-die - Versions diffs - 0.1.22 → 0.1.24 - Mend

ai-or-die 0.1.22 → 0.1.24

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (25)

package/.github/workflows/ci.yml +29 -4
package/CLAUDE.md +14 -2
package/docs/adrs/0008-e2e-parallelization.md +65 -0
package/docs/adrs/0008-file-browser-architecture.md +82 -0
package/docs/agent-instructions/02-testing-and-validation.md +19 -7
package/docs/agent-instructions/03-tooling-and-pipelines.md +7 -8
package/docs/agent-instructions/04-handoff-protocol.md +63 -0
package/docs/agent-instructions/05-defensive-coding.md +170 -0
package/docs/agent-instructions/06-ci-first-testing.md +268 -0
package/docs/agent-instructions/07-docs-hygiene.md +124 -0
package/docs/agent-instructions/08-multi-agent-consultation.md +168 -0
package/docs/specs/client-app.md +106 -0
package/docs/specs/file-browser.md +557 -0
package/docs/specs/server.md +123 -0
package/e2e/playwright.config.js +7 -3
package/package.json +1 -1
package/src/public/app.js +50 -0
package/src/public/command-palette.js +36 -0
package/src/public/components/file-browser.css +592 -0
package/src/public/file-browser.js +1543 -0
package/src/public/file-editor.js +549 -0
package/src/public/icons.js +19 -0
package/src/public/index.html +10 -0
package/src/server.js +387 -15
package/src/utils/file-utils.js +219 -0

package/.github/workflows/ci.yml CHANGED

@@ -52,7 +52,7 @@ jobs:
             playwright-report/
           retention-days: 14
-  test-browser-functional:
+  test-browser-functional-core:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
@@ -65,13 +65,38 @@ jobs:
       - run: npm ci
       - name: Install Playwright browsers
         run: npx playwright install chromium --with-deps
-      - name: Run functional browser tests
-        run: npx playwright test --config e2e/playwright.config.js --project functional
+      - name: Run functional core tests
+        run: npx playwright test --config e2e/playwright.config.js --project functional-core
       - name: Upload Playwright report
         uses: actions/upload-artifact@v4
         if: ${{ !cancelled() }}
         with:
-          name: playwright-functional-${{ matrix.os }}
+          name: playwright-functional-core-${{ matrix.os }}
+          path: |
+            e2e/test-results/
+            playwright-report/
+          retention-days: 14
+  test-browser-functional-extended:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      matrix:
+        os: [ubuntu-latest, windows-latest]
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '22'
+      - run: npm ci
+      - name: Install Playwright browsers
+        run: npx playwright install chromium --with-deps
+      - name: Run functional extended tests
+        run: npx playwright test --config e2e/playwright.config.js --project functional-extended
+      - name: Upload Playwright report
+        uses: actions/upload-artifact@v4
+        if: ${{ !cancelled() }}
+        with:
+          name: playwright-functional-extended-${{ matrix.os }}
           path: |
             e2e/test-results/
             playwright-report/

package/CLAUDE.md CHANGED

@@ -19,17 +19,29 @@ Available agents: **Architect**, **Engineer**, **QA Reviewer**, **Troubleshooter
 ### Documentation-Driven Workflow
 Before starting any task, consult the relevant documentation:
-- `docs/agent-instructions/` -- Philosophy, research guidelines, testing standards, tooling conventions
+- `docs/agent-instructions/` -- Agent workflow guides:
+  - `00-philosophy.md` -- Core principles
+  - `01-research-and-web.md` -- Research guidelines
+  - `02-testing-and-validation.md` -- Testing standards
+  - `03-tooling-and-pipelines.md` -- Tooling conventions
+  - `04-handoff-protocol.md` -- How to leave the repo clean for the next agent
+  - `05-defensive-coding.md` -- Error prevention, cross-platform traps
+  - `06-ci-first-testing.md` -- CI-only testing, E2E debugging, performance budget
+  - `07-docs-hygiene.md` -- Keeping documentation in sync
+  - `08-multi-agent-consultation.md` -- When and how to consult expert subagents
 - `docs/adrs/` -- Architecture Decision Records (check before proposing new patterns)
 - `docs/specs/` -- Component specifications (read before implementing, update after changing behavior)
 - `docs/architecture/` -- System diagrams and component overviews
-- `docs/history/` -- Incident post-mortems and debugging notes
+- `docs/history/` -- Solved problems and debugging notes (check before debugging any issue)
 ### Mandatory Rules
 1. **Spec updates with code changes**: When code behavior changes, the corresponding spec in `docs/specs/` must be updated in the same commit or PR.
 2. **ADR compliance**: Never contradict an accepted ADR. To change direction, write a new ADR that supersedes the old one.
 3. **Cross-platform support**: All code must work on both Windows and Linux. Use `path.join()` for file paths, provide `.sh` and `.ps1` script variants, and test on both platforms in CI.
 4. **Test coverage**: Every feature and bug fix requires tests. No exceptions.
+5. **CI-only testing**: All testing happens on GitHub Actions runners. Never test locally. E2E tests are the only true validation. Push → draft PR → CI → iterate.
+6. **Document what you solve**: Every solved problem goes in `docs/history/`. LLMs don't carry memories — written docs are the only institutional memory.
+7. **Consult before committing**: For significant decisions, spawn expert subagents (architect, principal engineer, lead QA, PM, designer, user researcher) in parallel. See `docs/agent-instructions/08-multi-agent-consultation.md`.
 ## Common Commands

package/docs/adrs/0008-e2e-parallelization.md ADDED

@@ -0,0 +1,65 @@
+# ADR-0008: E2E Test Parallelization Strategy
+## Status
+**Accepted**
+## Date
+2026-02-07
+## Context
+The E2E test suite has grown to 16 spec files across 6 Playwright projects. The `functional` project — containing tests 02-07, 09-image-paste, and 09-background-notifications — runs approximately 30 tests sequentially with `workers: 1`. On GitHub Actions runners, this takes 7-15 minutes per platform, exceeding the 7-minute performance budget for CI feedback loops.
+Fast CI feedback is critical because all testing happens exclusively on GitHub runners (no local testing). The push → CI → fix → push cycle must be fast enough that agents can iterate efficiently.
+## Decision
+Split the functional test group into two sub-groups and enable parallel workers in CI:
+### Test Split
+- **`functional-core`**: Tests `02-terminal-io`, `03-clipboard`, `04-context-menu`, `05-tab-switching` (core terminal interaction features)
+- **`functional-extended`**: Tests `06-large-paste`, `07-vim-and-session`, `09-image-paste`, `09-background-notifications` (extended features and cross-cutting concerns)
+### Parallel Workers
+- Set `workers: process.env.CI ? 2 : 1` in `e2e/playwright.config.js`
+- CI runs 2 Playwright workers per job for parallel test execution
+- Local development retains 1 worker for debugging simplicity (though local testing is not the primary workflow)
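The worker setting and project split described above might look roughly like this in `e2e/playwright.config.js`. This is a minimal sketch, not the package's actual config: the `resolveWorkers` helper, the `.spec.js` file suffix, and the `testMatch` patterns are assumptions for illustration.

```javascript
// Hypothetical helper: 2 workers on CI, 1 locally (mirrors `process.env.CI ? 2 : 1`).
function resolveWorkers(env) {
  return env.CI ? 2 : 1;
}

// Assumed shape of the two functional projects; the testMatch patterns are illustrative.
const config = {
  workers: resolveWorkers(process.env),
  projects: [
    {
      name: 'functional-core',
      // 02-terminal-io, 03-clipboard, 04-context-menu, 05-tab-switching
      testMatch: /0[2-5]-.*\.spec\.js/,
    },
    {
      name: 'functional-extended',
      testMatch: /(06-large-paste|07-vim-and-session|09-image-paste|09-background-notifications)\.spec\.js/,
    },
  ],
};

module.exports = config;
```

Each CI job then selects one group with `--project functional-core` or `--project functional-extended`, as the workflow change in this diff does.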
+### CI Pipeline Changes
+- Replace single `test-browser-functional` job with two: `test-browser-functional-core` and `test-browser-functional-extended`
+- Each runs independently and in parallel with all other browser test jobs
+- Each uploads artifacts with distinct names for failure diagnosis
+### Why this works
+- Each test already creates its own server instance via `createServer()` with an ephemeral port (port 0)
+- Sessions are per-server, eliminating cross-test state contamination
+- Playwright provides browser context isolation between parallel tests
+- No shared filesystem resources detected in the test suite
+## Consequences
+### Positive
+- No CI job exceeds 7 minutes — faster feedback for the push-fix-push workflow
+- More granular job names in CI (functional-core vs functional-extended) aid debugging — agents can immediately see which category of tests failed
+- Parallel workers within jobs further reduce wall-clock time
+- Sets a pattern for future test group splits as the suite grows
+### Negative
+- More CI jobs to monitor (6 browser test job types instead of 5, plus unit tests and build-binary)
+- Artifact names become longer and more numerous
+- If test isolation assumptions prove wrong, parallel execution could introduce flakiness (mitigated by the existing ephemeral-port pattern)
+### Neutral
+- Existing test files require no code changes — only configuration and CI workflow updates
+- The `workers: 2` setting is conservative and can be increased if runners have sufficient resources
+## Notes
+- When any job approaches 6 minutes consistently, split it further
+- When the test suite exceeds 80 tests, re-evaluate the overall split strategy
+- Monitor for flaky tests that may indicate parallel execution issues

package/docs/adrs/0008-file-browser-architecture.md ADDED

@@ -0,0 +1,82 @@
+# ADR-0008: File Browser Architecture
+## Status
+**Accepted**
+## Date
+2026-02-07
+## Context
+Users access ai-or-die remotely over devtunnels and similar tunnel services. In this environment there is no way to view visual content (images, PDFs), edit files with syntax highlighting, or upload/download files without native desktop tools -- which are not available through the tunnel. The existing `/api/folders` endpoint only lists directories for the working-directory selector and does not expose file contents or metadata.
+The application needs a web-based file manager that supports browsing, previewing, editing, uploading, and downloading files directly within the terminal UI, without requiring users to install additional software or leave the browser.
+## Decision
+We introduce a file browser feature built on these architectural choices:
+### REST API (not WebSocket) for file operations
+File operations (list, read, write, upload, download) follow a request-response pattern. REST is a natural fit: clients send a request, wait for the response, and render the result. WebSocket would add unnecessary complexity for operations that do not require real-time streaming. The six new endpoints are:
+| Endpoint | Method | Purpose |
+|----------|--------|---------|
+| `/api/files` | GET | List directory contents (files + directories), paginated |
+| `/api/files/stat` | GET | File metadata (size, modified, MIME category, hash) |
+| `/api/files/content` | GET | Read text file content in a JSON envelope with hash |
+| `/api/files/content` | PUT | Save text file content with optimistic concurrency (hash) |
+| `/api/files/upload` | POST | Upload file as base64 JSON (10 MB limit) |
+| `/api/files/download` | GET | Stream file for download or inline preview |
+### Right-docked side panel (not modal)
+The file browser opens as a docked panel on the right side of the viewport. The terminal remains visible and interactive alongside it. A modal would block terminal access, which defeats the purpose of a file browser in a terminal application. The panel auto-switches to a full-screen overlay on mobile viewports or when the terminal would be squeezed below 80 columns.
+### Ace Editor from CDN (not bundled)
+Text editing uses the Ace Editor loaded from cdnjs, matching the existing pattern of loading xterm.js from unpkg CDN. This avoids adding npm dependencies for a frontend-only library and keeps the server-side `node_modules` minimal. Ace is lazy-loaded on first editor open, with a loading spinner and a 5-second timeout with fallback error.
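The lazy-load-with-timeout behaviour described above can be sketched with `Promise.race`. The names `withTimeout` and `loadAceOnce` are hypothetical, and `loadScript` stands in for whatever function injects the CDN script tag in the browser; the real `file-editor.js` may do this differently.

```javascript
// Hypothetical helper: reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms, message) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  // Whichever settles first wins; always clear the timer so it cannot fire later.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Cache the in-flight load so the editor script is requested at most once.
let acePromise = null;
function loadAceOnce(loadScript) {
  if (!acePromise) {
    acePromise = withTimeout(loadScript(), 5000, 'Ace Editor failed to load within 5 seconds')
      .catch((err) => {
        acePromise = null; // allow a retry on the next editor open
        throw err;
      });
  }
  return acePromise;
}
```

A caller would show the loading spinner before `loadAceOnce(...)` and switch to the fallback error if the returned promise rejects.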
+### Hash-based optimistic concurrency for file saves
+Every text file response includes an MD5 hash computed via streaming (`crypto.createHash` + `fs.createReadStream`). When saving, the client sends the original hash; the server recomputes and returns 409 Conflict if the file was modified externally. This prevents silent overwrites in multi-user or multi-tab scenarios.
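A minimal sketch of the hash check described above. `computeFileHash` is named in this ADR, but its signature here is assumed, and `saveIfUnchanged` is a hypothetical stand-in for the PUT handler's logic.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Stream the file through MD5 so large files are never fully buffered in memory.
function computeFileHash(filePath) {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash('md5');
    fs.createReadStream(filePath)
      .on('error', reject)
      .on('data', (chunk) => hash.update(chunk))
      .on('end', () => resolve(hash.digest('hex')));
  });
}

// Hypothetical save routine: write only if the file still matches the hash the client loaded.
async function saveIfUnchanged(filePath, content, expectedHash) {
  const currentHash = await computeFileHash(filePath);
  if (currentHash !== expectedHash) {
    return { status: 409, currentHash }; // modified since the client read it
  }
  await fs.promises.writeFile(filePath, content, 'utf8');
  return { status: 200, hash: await computeFileHash(filePath) };
}
```

The check and the write are separate steps, so a small window remains in which another writer could slip in between them. The scheme detects stale saves; it is not a lock.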
+### Extension-based MIME detection with binary heuristic
+File type detection uses a built-in extension-to-MIME map for known types, supplemented by a null-byte heuristic (reading the first 512 bytes) for unknown extensions. This avoids depending on system-level `file` commands or npm packages like `mime-types`.
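The two-stage detection described above could be sketched as follows. The extension map is an illustrative subset, and `looksBinary` and `detectType` are hypothetical names; the package's own `isBinaryFile` may differ in signature and details.

```javascript
const fs = require('fs');
const path = require('path');

// Illustrative subset; the package's actual extension map is not reproduced here.
const MIME_BY_EXTENSION = {
  '.txt': 'text/plain',
  '.json': 'application/json',
  '.png': 'image/png',
  '.pdf': 'application/pdf',
};

// A null byte in the sampled bytes is treated as a sign of binary content.
function looksBinary(buffer) {
  return buffer.subarray(0, 512).includes(0);
}

// Hypothetical detector: known extension first, null-byte sniff for everything else.
function detectType(filePath) {
  const known = MIME_BY_EXTENSION[path.extname(filePath).toLowerCase()];
  if (known) return known;

  const fd = fs.openSync(filePath, 'r');
  try {
    const sample = Buffer.alloc(512);
    const bytesRead = fs.readSync(fd, sample, 0, 512, 0);
    return looksBinary(sample.subarray(0, bytesRead)) ? 'application/octet-stream' : 'text/plain';
  } finally {
    fs.closeSync(fd);
  }
}
```

The heuristic is deliberately cheap: it reads at most 512 bytes, so a binary file whose first null byte appears later would be classified as text.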
+### Enhanced validatePath() with symlink resolution
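+The existing `validatePath()` function is extended to resolve symlinks via `fs.realpathSync()` before the `startsWith` check. The check therefore runs against the real target rather than the link, so a symlink inside the base directory cannot be used to reach files outside it. This narrows, but does not fully eliminate, TOCTOU (time-of-check-to-time-of-use) race conditions where a symlink could be swapped between validation and access.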
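A minimal sketch of a symlink-aware containment check. The `(baseDir, requestedPath)` signature is an assumption; the real `validatePath()` in `src/server.js` is not shown in this part of the diff and may differ.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical signature: returns the resolved real path, or throws if it escapes baseDir.
function validatePath(baseDir, requestedPath) {
  const realBase = fs.realpathSync(baseDir);
  // realpathSync follows symlinks, so a link pointing outside baseDir resolves to its target.
  const realTarget = fs.realpathSync(path.resolve(realBase, requestedPath));

  // Compare with a trailing separator so "/base-evil" does not pass for "/base".
  const prefix = realBase.endsWith(path.sep) ? realBase : realBase + path.sep;
  const inside = realTarget === realBase || realTarget.startsWith(prefix);
  if (!inside) {
    throw new Error(`Path '${requestedPath}' resolves outside the allowed directory '${realBase}'`);
  }
  return realTarget;
}
```

`fs.realpathSync()` throws if the path does not exist, so a handler that creates new files (such as an upload) would need to validate the parent directory instead.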
+### File utilities extracted to src/utils/file-utils.js
+File-related utility functions (`getFileInfo`, `computeFileHash`, `isBinaryFile`, `sanitizeFileName`) are extracted into a dedicated module rather than being added inline to `server.js` (already 1507 lines). This keeps the server file focused on routing and makes the utilities independently testable.
+### No delete or rename in initial release
+Destructive file operations are excluded from the MVP to limit the security surface area. Users can still delete or rename files through the terminal. These operations may be added in a follow-up phase.
+## Consequences
+### Positive
+- Visual file preview (images, PDFs, JSON, CSV) works over tunnels without native tools
+- Code editing with syntax highlighting and auto-save directly in the browser
+- Drag-drop, file picker, and clipboard paste upload for getting files onto the remote machine
+- Hash-based conflict detection prevents accidental data loss
+- Extracted file utilities are independently testable
+### Negative
+- No real-time file watching -- directory listings are point-in-time snapshots (would require WebSocket, deferred to Phase 2)
+- Ace Editor introduces a CDN dependency for the editing feature (editing is degraded if CDN is unreachable)
+- 10 MB upload limit may be insufficient for large assets (chunked upload deferred to Phase 2)
+### Neutral
+- `/api/folders` and `/api/files` coexist: `/api/folders` lists only directories for the working-directory selector, `/api/files` lists both files and directories for the file browser
+- The same `validatePath()` function secures both old and new endpoints
+- No new npm dependencies are introduced

package/docs/agent-instructions/02-testing-and-validation.md CHANGED

@@ -26,14 +26,26 @@ Write tests alongside implementation, not after. The workflow:
 - Use temp directories for file system tests (see `session-store.test.js` pattern)
 - Test cross-platform behavior: path construction, command resolution, shell detection
+## CI-Only Testing
+All testing happens on GitHub Actions runners. No local test runs. Ever.
+- Local environments are unreliable: missing native modules, stale state, platform differences
+- CI provides fresh, reproducible, cross-platform results every time
+- E2E tests are the only true validation — if they pass on CI, the feature works
+The workflow: write code → push to branch → open draft PR → CI runs → read results → fix → push again.
+See `docs/agent-instructions/06-ci-first-testing.md` for the complete CI workflow guide, job map, and debugging playbook.
 ## Self-Validation
 Before committing, every agent must:
-1. Run `npm test` — all tests pass
-2. Run `npm start` — server boots without errors
-3. Run `scripts/validate.sh` (Linux) or `scripts/validate.ps1` (Windows)
-4. Verify the change doesn't break existing functionality
+1. Push to branch and open a draft PR to trigger CI
+2. Verify all CI jobs pass on both ubuntu-latest and windows-latest
+3. Check `docs/history/` for known issues if any job fails
+4. Verify the change doesn't break existing functionality (CI confirms this)
 ## What to Test
@@ -50,9 +62,9 @@ Before committing, every agent must:
 - Auth middleware behavior
 ### For Client Changes
-- Manual browser testing (create session, select tool, verify output)
-- Check mobile responsiveness
-- Verify WebSocket reconnection
+- E2E tests via Playwright (verified on CI, never locally)
+- Mobile viewport tests via mobile-iphone and mobile-pixel Playwright projects
+- WebSocket reconnection covered by E2E functional tests
 ## When Tests Fail

package/docs/agent-instructions/03-tooling-and-pipelines.md CHANGED

@@ -14,14 +14,13 @@ If you perform a verification task twice, script it. All scripts live in the `sc
 ### GitHub Actions
-The CI pipeline (`.github/workflows/ci.yml`) runs on every push and PR:
-1. **Matrix**: Runs on both `ubuntu-latest` and `windows-latest`
-2. **Install**: `npm ci`
-3. **Lint**: ESLint check
-4. **Test**: `npm test` with coverage reporting
-5. **Audit**: `npm audit` for security vulnerabilities
-6. **Docs Check**: Verify docs/ structure exists
+The CI pipeline (`.github/workflows/ci.yml`) runs on every push and PR. It runs 8 job types in parallel across ubuntu-latest and windows-latest (16 total jobs):
+- **Unit tests**: `npm test` + `npm audit`
+- **Browser E2E tests**: 6 Playwright job types (golden-path, functional-core, functional-extended, mobile, visual-regression, new-features)
+- **Binary build**: SEA binary compilation + smoke tests
+See `06-ci-first-testing.md` for the full CI job map, artifact details, and debugging workflow. CI is the only authority on whether code works (see ADR-0008 for the parallelization strategy).
 ### Release Pipeline

package/docs/agent-instructions/04-handoff-protocol.md ADDED

@@ -0,0 +1,63 @@
+# Handoff Protocol
+## The Golden Rule
+Every session ends with a cleaner repo than it started. If you touched it, you documented it. If you broke it, you fixed it. If you couldn't finish, you left a trail.
+## Pre-Handoff Checklist
+Before ending any work session, verify:
+1. **All CI jobs pass.** Push to your branch and check GitHub Actions. Both `ubuntu-latest` and `windows-latest` must be green. Do not hand off a red build.
+2. **Documentation is updated.** Specs in `docs/specs/` match the current code. ADRs are written for any architectural decisions made during the session.
+3. **No orphaned work-in-progress.** No half-implemented features sitting uncommitted. Everything is either committed and pushed, or explicitly tracked in a GitHub issue.
+4. **Commit messages explain "why", not just "what".** A future agent reading the git log should understand the reasoning without opening the diff.
+5. **New patterns and conventions are documented.** If you introduced a new coding pattern, utility, or convention, write it down in the relevant spec or instruction doc.
+## Work-in-Progress Protocol
+When you cannot finish a task:
+- Create a GitHub issue with full context: what was attempted, where it stopped, what blockers exist, and what the next steps are.
+- Use `[WIP]` prefix in commit messages for incomplete work.
+- List which files are mid-change and what state they are in.
+- Reference relevant specs, ADRs, and CI run links.
+- Never leave broken tests on main. If your work breaks tests, either fix them or revert before ending.
+## Clean Commit Hygiene
+- Follow Conventional Commits: `feat:`, `fix:`, `docs:`, `test:`, `chore:`, `refactor:`.
+- One concern per commit. Do not mix a bug fix with a feature addition.
+- Reference GitHub issues in the message: `fix: resolve WebSocket race in image upload (#42)`.
+- Commit messages should be self-contained. Another agent reading the git log should understand what happened and why without reading the diff.
+## Session Context Dump
+What to leave behind for the next agent:
+- Updated specs in `docs/specs/` reflecting any behavior changes.
+- Research findings documented in the relevant ADR or spec.
+- Error patterns discovered during debugging added to `docs/history/`.
+- Decisions made and their rationale recorded in ADRs.
+- If you modified the CI pipeline, document what changed and why.
+## Log What You Solved
+When you encounter and solve a problem, document it in `docs/history/`. LLMs do not carry memories between sessions -- written docs are the only institutional memory. Every solved problem that is not documented is a problem that will be solved again.
+See `07-docs-hygiene.md` for the history entry format and full guidelines. Before debugging any issue, always check `docs/history/` first.
+## Anti-Patterns
+Do NOT do any of these:
+- Leave vague commit messages like "Made some changes" or "Updated stuff".
+- Leave uncommitted or unstaged work behind.
+- Leave broken tests and move on.
+- Make architectural decisions without writing an ADR.
+- Solve a problem without documenting the solution.
+- Skip spec updates when behavior changes.
+- Assume the next agent will "figure it out".
+- Delete or disable tests to make CI pass.
+- Commit secrets, API keys, tokens, or `.env` files. Check `git diff --staged` for sensitive data before every commit.
+- Expand scope beyond what was asked without explicit approval. If you discover adjacent issues, file them as separate GitHub issues.

package/docs/agent-instructions/05-defensive-coding.md ADDED

@@ -0,0 +1,170 @@
+# Defensive Coding
+## Validate at Boundaries
+Trust nothing that crosses a system boundary. Every REST endpoint, WebSocket handler, and bridge method should validate its inputs before processing.
+Where boundaries exist in this codebase:
+- REST API handlers in `src/server.js` -- validate request params, body, headers
+- WebSocket message handlers -- validate `type` field, required fields per message type
+- Bridge methods (`startSession`, `sendInput`, `resize`) -- validate sessionId exists, dimensions are positive integers
+- Client-to-server messages -- validate session ownership, check session is active
+Pattern:
+```javascript
+// Bad
+handleMessage(wsId, message) {
+  const session = this.sessions.get(message.sessionId);
+  session.bridge.sendInput(message.data); // crashes if session doesn't exist
+}
+// Good
+handleMessage(wsId, message) {
+  if (!message.sessionId) {
+    return this.sendError(wsId, 'Missing sessionId');
+  }
+  const session = this.sessions.get(message.sessionId);
+  if (!session) {
+    return this.sendError(wsId, `Session '${message.sessionId}' not found`);
+  }
+  if (!session.active) {
+    return this.sendError(wsId, `Session '${message.sessionId}' is not active`);
+  }
+  session.bridge.sendInput(message.data);
+}
+```
+## Error Messages Are UI
+Error messages are read by other agents trying to debug. Make them actionable.
+Every error message should answer three questions:
+1. What went wrong?
+2. What was expected?
+3. What should be done about it?
+```javascript
+// Bad
+throw new Error('Invalid');
+throw new Error('Not found');
+throw new Error('Failed');
+// Good
+throw new Error(`Session '${sessionId}' not found. Available sessions: [${[...sessions.keys()].join(', ')}]`);
+throw new Error(`Bridge '${toolId}' is not available. Run 'which ${command}' to verify installation. Searched paths: ${searchPaths.join(', ')}`);
+throw new Error(`WebSocket message missing required field 'type'. Received: ${JSON.stringify(message)}`);
+```
+## Cross-Platform Landmines
+This codebase runs on both Windows and Linux. Every line of code that touches the filesystem, spawns a process, or handles paths must account for both.
+### Paths
+- ALWAYS use `path.join()`, never string concatenation with `/` or `\\`
+- Use `os.homedir()`, never `process.env.HOME` (undefined on Windows)
+- File paths are case-insensitive on Windows, case-sensitive on Linux
+- Use `path.resolve()` to normalize paths before comparison
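The path rules above can be illustrated with a short sketch. Both helpers are hypothetical, and the `.example-app` directory name is made up for the example.

```javascript
const os = require('os');
const path = require('path');

// Build paths from segments; path.join picks the right separator for the platform.
function sessionFilePath(fileName) {
  return path.join(os.homedir(), '.example-app', fileName); // '.example-app' is a made-up directory
}

// Normalize before comparing; fold case only where paths are case-insensitive (Windows).
function samePath(a, b, platform = process.platform) {
  const normalize = (p) => {
    const resolved = path.resolve(p);
    return platform === 'win32' ? resolved.toLowerCase() : resolved;
  };
  return normalize(a) === normalize(b);
}
```

Passing `platform` as a parameter keeps the comparison testable on a single runner while still defaulting to `process.platform` in production code.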
+### Process Spawning
+- `where` on Windows, `which` on Linux -- check `process.platform`
+- Windows uses ConPTY, Linux uses standard PTY -- different buffering behavior
+- Executable extensions: `.exe`, `.cmd` on Windows, none on Linux
+- Shell: `cmd.exe` or `powershell.exe` on Windows, `bash` or `sh` on Linux
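A sketch of the platform check for command lookup described above. `lookupTool` and `findCommand` are hypothetical names, not functions from this package.

```javascript
const { execFileSync } = require('child_process');

// Pick the lookup tool for the platform: `where` on Windows, `which` elsewhere.
function lookupTool(platform = process.platform) {
  return platform === 'win32' ? 'where' : 'which';
}

// Hypothetical helper: returns the first resolved path for `command`, or null if not found.
function findCommand(command, platform = process.platform) {
  try {
    const output = execFileSync(lookupTool(platform), [command], {
      encoding: 'utf8',
      stdio: ['ignore', 'pipe', 'ignore'],
    });
    // `where` can print several matches, one per line; take the first.
    const first = output.split(/\r?\n/).find((line) => line.trim() !== '');
    return first ? first.trim() : null;
  } catch (err) {
    return null; // non-zero exit (or a missing lookup tool) means the command was not resolved
  }
}
```

Returning `null` here is a deliberate "not found" signal rather than a swallowed error; the caller is expected to turn it into an actionable message listing what was searched.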
+### Line Endings
+- Never match output with exact strings -- use `.includes()` or `.trim()`
+- Windows may inject `\r\n` where Linux gives `\n`
+- PTY output may contain ANSI escape sequences -- strip them before comparing
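One way to apply the line-ending rules above is to normalize PTY output before any comparison. This is a sketch with hypothetical helper names; the regular expression covers CSI sequences (colors, cursor movement) only, not every escape sequence a terminal can emit.

```javascript
// Matches CSI escape sequences such as colors and cursor movement (e.g. "\x1b[32m").
const ANSI_CSI = /\x1b\[[0-9;?]*[ -\/]*[@-~]/g;

// Hypothetical helper: strip CSI sequences, unify line endings, trim outer whitespace.
function normalizeOutput(raw) {
  return raw
    .replace(ANSI_CSI, '')
    .replace(/\r\n/g, '\n')
    .trim();
}

// Substring match on normalized output instead of exact string equality.
function outputContains(raw, expected) {
  return normalizeOutput(raw).includes(expected);
}
```

A test assertion would then read `outputContains(ptyOutput, 'hello')` rather than comparing the raw output to an exact string.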
+### The ConPTY Quirks
+- Writes larger than 4096 bytes can overflow the ConPTY buffer on Windows
+- Solution: chunked writes with delays (see `base-bridge.js` chunked write pattern)
+- ConPTY may echo input back -- don't assume output is only from the spawned process
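The chunked-write idea above might be sketched like this. It is not the code from `base-bridge.js`, which is not part of this diff: the `write` callback, the 10 ms delay, and the helper names are all assumptions.

```javascript
// Split a buffer into pieces of at most `size` bytes.
function splitIntoChunks(buffer, size = 4096) {
  const chunks = [];
  for (let offset = 0; offset < buffer.length; offset += size) {
    chunks.push(buffer.subarray(offset, offset + size));
  }
  return chunks;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical writer: `write` stands in for whatever sends bytes to the PTY.
async function writeChunked(write, data, { chunkSize = 4096, delayMs = 10 } = {}) {
  const chunks = splitIntoChunks(Buffer.from(data), chunkSize);
  for (let i = 0; i < chunks.length; i += 1) {
    write(chunks[i]);
    if (i < chunks.length - 1) await sleep(delayMs); // give the terminal time to drain
  }
  return chunks.length;
}
```

Splitting on byte boundaries can cut a multi-byte UTF-8 character in half, so this sketch assumes the receiving side accepts raw bytes. A string-based PTY API would need to split on character boundaries instead.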
+## Async Safety
+Node.js is async-first. On Node.js 15 and later, an unhandled promise rejection crashes the process by default.
+Rules:
+- Every `async` function must have try-catch at the top level
+- Every `.then()` chain must have a `.catch()`
+- Event handlers that call async code must wrap in try-catch
+- Use the spawn watchdog pattern from `base-bridge.js`: set a timer when spawning a process, kill it if no output arrives within 30 seconds
+```javascript
+// Bad -- unhandled rejection if startSession rejects; uncaught exception if JSON.parse throws
+ws.on('message', (data) => {
+  const msg = JSON.parse(data);
+  this.startSession(msg.sessionId);
+});
+// Good
+ws.on('message', (data) => {
+  try {
+    const msg = JSON.parse(data);
+    this.startSession(msg.sessionId).catch(err => {
+      console.error(`Failed to start session ${msg.sessionId}:`, err);
+      this.sendError(wsId, err.message);
+    });
+  } catch (err) {
+    console.error('Failed to parse WebSocket message:', err);
+  }
+});
+```
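The spawn watchdog rule listed above can be sketched as a small timer wrapper. This is not the implementation in `base-bridge.js`, which is not shown in this diff; `createSpawnWatchdog` and its methods are hypothetical names.

```javascript
// Hypothetical watchdog: calls `onTimeout` unless `notifyOutput` is called within `timeoutMs`.
function createSpawnWatchdog(timeoutMs, onTimeout) {
  let timer = setTimeout(() => {
    timer = null;
    onTimeout();
  }, timeoutMs);

  return {
    // Call on the first chunk of output; the process is alive, so stand down.
    notifyOutput() {
      if (timer) {
        clearTimeout(timer);
        timer = null;
      }
    },
    // True while the watchdog is still waiting for output.
    isArmed() {
      return timer !== null;
    },
  };
}
```

With a child process from `child_process.spawn`, usage would look like `const watchdog = createSpawnWatchdog(30000, () => child.kill());` followed by `child.stdout.once('data', () => watchdog.notifyOutput());`.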
+## Fail Fast, Fail Loud
+Silent failures are the worst kind. They create bugs that surface hours or sessions later, with no trail.
+- Assert preconditions at function entry -- don't wait until line 50 to discover the input was invalid
+- Log errors with full context before re-throwing: what function, what inputs, what state
+- Never `catch` and silently swallow: `catch (err) { /* ignore */ }` -- this is forbidden
+- If something "shouldn't happen," make it throw, not silently return null
+```javascript
+// Bad -- silent null propagation
+function getSession(id) {
+  return sessions.get(id); // returns undefined silently
+}
+// Good -- fail fast with context
+function getSession(id) {
+  const session = sessions.get(id);
+  if (!session) {
+    throw new Error(`getSession: no session with id '${id}'. Active sessions: ${sessions.size}`);
+  }
+  return session;
+}
+```
+## The "Fresh Machine" Test
+Before considering any code complete, ask yourself: "Would this work on a brand new GitHub Actions runner with nothing pre-installed except Node.js 22?"
+This means:
+- No reliance on globally installed tools (unless you check for them and give a clear error)
+- No hardcoded paths that only exist on your dev machine
+- No cached `node_modules` assumptions -- `npm ci` installs from scratch
+- No file system state left over from previous runs
+- No environment variables that aren't set in CI
+If the answer is "maybe," add a runtime check:
+```javascript
+const commandPath = await this.findCommandAsync();
+if (!commandPath) {
+  throw new Error(
+    `${this.toolName} CLI not found. Searched: ${this.searchPaths.join(', ')}. ` +
+    `Install ${this.toolName} or add it to PATH.`
+  );
+}
+```