npm - agentscreenshots - Versions diffs - 0.1.0 → 0.2.0 - Mend

agentscreenshots 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/AGENT-INSTRUCTIONS.md CHANGED

@@ -1,102 +1,110 @@
-# AgentScreenshots Agent Prompt
+# AgentScreenshots CLI agent instructions
-Copy this into your coding agent's project instructions, custom instructions, skill, or rules file.
+Use `agentshot` for rendered webpage screenshots and visual UI checks.
-```text
-You have access to the `agentshot` CLI for visual UI checks.
-Use it whenever you need to inspect a rendered webpage, verify frontend work, debug layout/CSS issues, check responsive behavior, or confirm that a visual change actually looks correct in the browser.
-Assume `agentshot` is already installed and configured. Do not install it, authenticate it, or change its configuration unless the user explicitly asks. If `agentshot` is not available, report that clearly and continue with the best available fallback.
+What this tool is good for:
+- Inspecting a page, component, section, or responsive viewport after frontend edits.
+- Verifying layout, spacing, overflow, wrapping, missing assets, and visual regressions.
+- Capturing localhost, preview, staging, or production pages.
+- Giving an AI coding agent a PNG artifact it can inspect before judging the UI.
 Default workflow:
+1. Identify the exact URL to inspect. For local dev, use the active localhost URL such as `http://127.0.0.1:5173`.
+2. Save screenshots into a predictable project folder, preferably `.agents/screenshots/`.
+3. Prefer selector, viewport, or vertical-slice captures over huge full-page screenshots when they answer the question.
+4. Inspect the saved PNG with image-reading/viewing capability before making visual claims.
+5. If the UI is wrong, make a focused fix, capture again, and inspect again.
-1. Identify the page URL to inspect.
-   - For local dev, use the active localhost URL, for example `http://127.0.0.1:5173`.
-   - For deployed work, use the preview/staging/production URL the user provided.
-2. Save screenshots into a predictable project folder:
-   - Prefer `.agents/screenshots/`.
-   - Use short descriptive filenames, for example `home.png`, `pricing-section.png`, `mobile-nav.png`.
-3. Capture the relevant view with `agentshot`.
-4. Inspect the saved PNG with your image-reading/viewing capability before judging the UI.
-5. If the UI is wrong, make a focused fix, capture again, and re-inspect.
-If `agentshot` itself fails, run:
-agentshot doctor
-Use the doctor output to identify whether the issue is missing auth, browser setup, API reachability, output permissions, or the target page itself.
-Use full-page capture when you need overall page context:
+Core commands:
+```bash
 agentshot "<url>" ".agents/screenshots/page.png" --scroll --wait 1000
-Use a fixed top slice when the issue is near the top of the page:
 agentshot "<url>" ".agents/screenshots/top.png" --height 1200 --wait 500
-Use a vertical slice when the issue is in a known scroll range:
 agentshot "<url>" ".agents/screenshots/slice.png" --from 1600 --to 2400 --wait 500
-Use selector capture for a specific section or component:
-agentshot "<url>" ".agents/screenshots/pricing.png" --selector "section:has-text('Pricing')" --padding 24 --wait 500
-Use `--nth` when multiple elements match:
+agentshot "<url>" ".agents/screenshots/section.png" --selector "section:has-text('Pricing')" --padding 24 --wait 500
 agentshot "<url>" ".agents/screenshots/card-2.png" --selector ".pricing-card" --nth 1 --padding 16 --wait 500
-Use mobile viewport checks for responsive layout:
 agentshot "<url>" ".agents/screenshots/mobile.png" --viewport 390x844 --scroll --wait 1000
-Use desktop viewport checks when layout width matters:
 agentshot "<url>" ".agents/screenshots/desktop.png" --viewport 1440x1000 --scroll --wait 1000
+agentshot "https://example.com" ".agents/screenshots/live-page.png" --scroll --wait 2500 --wait-until load
+agentshot "<url>" "pricing-section.png" --temp --selector "section:has-text('Pricing')" --padding 24 --wait 500
+agentshot "https://example.com" ".agents/screenshots/no-cookie-banner.png" --click-if-present "button:has-text('Reject all')" --wait 500
+agentshot "<url>" ".agents/screenshots/hover-menu.png" --hover ".nav-item" --selector ".nav-region" --padding 24 --wait 500
+agentshot "<url>" ".agents/screenshots/small.png" --viewport 1440x1000 --device-scale-factor 1 --wait 500
+```
-Selector notes:
+Capture options:
+- `--selector SELECTOR`: capture the first matching Playwright/CSS selector.
+- `--section SELECTOR`: alias for `--selector`.
+- `--nth INDEX`: choose another selector match, zero-based.
+- `--padding PX`: add padding around selector captures.
+- `--height PX`: capture from page top or `--from` to a fixed height.
+- `--from PX --to PX`: capture a vertical page slice.
+- `--viewport WIDTHxHEIGHT`: set viewport size.
+- `--device-scale-factor N`: capture higher-density pixels without changing CSS layout. Defaults to `2`; use `1` when smaller files matter more than visual fidelity.
+- `--scroll`: scroll first to trigger lazy-loaded or scroll-triggered content.
+- `--wait MS`: wait after navigation/scroll before capture.
+- `--wait-for CSS`: wait until a specific element exists before capture.
+- `--click SELECTOR`: click a required Playwright/CSS selector before capture. It searches the main page and child frames. Repeat the flag for multiple clicks.
+- `--click-if-present SELECTOR`: try to click a Playwright/CSS selector before capture, but continue if it is missing. It searches the main page and child frames. Repeat the flag for multiple possible overlays.
+- `--hover SELECTOR`: hover a required Playwright/CSS selector before capture. It searches the main page and child frames. Repeat the flag for multiple hovers.
+- `--hover-if-present SELECTOR`: try to hover a Playwright/CSS selector before capture, but continue if it is missing. It searches the main page and child frames. Repeat the flag for multiple optional hover targets.
+- `--wait-until load|domcontentloaded|networkidle`: choose load-state wait.
+- `--temp`: save in the OS temp directory. With a filename hint, for example `pricing.png --temp`, the output is named like `pricing-temp-df2d.png`; without a filename hint, the name is derived from the URL. It does not add timed deletion or cleanup tracking.
+- `--json`: print machine-readable capture metadata.
+- `--no-report`: skip usage reporting for tests/smoke checks.
+Output path best practices:
+- For project captures, pass an explicit output path so the capture lands exactly where the user or agent expects.
+- For project work, save screenshots inside the active repo, preferably under `.agents/screenshots/`.
+- Use task-specific subfolders for larger runs, such as `.agents/screenshots/qc/`, `.agents/screenshots/mobile/`, or `.agents/screenshots/2026-05-17-pricing/`.
+- Use descriptive filenames that include the page, viewport, section, or state, for example `pricing-mobile.png`, `hero-hover-menu.png`, or `checkout-modal-closed.png`.
+- Add `.agents/screenshots/` to `.gitignore` unless the screenshots are intentionally part of the repo.
+- For fast development captures that do not need versioned screenshot history, use `--temp` when available.
+- If the user provides a specific output path or saving instructions, use that path and those instructions exactly.
+Selector notes:
 - Plain CSS selectors work: `.hero`, `#pricing`, `[data-testid='nav']`.
 - Playwright text selectors work: `text=Pricing`.
 - Playwright text filters work: `section:has-text('Pricing')`.
 - XPath works when needed: `xpath=//section[.//h2[contains(., 'Pricing')]]`.
-When to use `--scroll`:
-- Use it for full-page screenshots.
-- Use it when images, animations, lazy-loaded sections, or scroll-triggered content may not render until the page is scrolled.
-- You usually do not need it for a small selector capture unless that section is lazy-loaded.
-When to use `--wait`:
-- Use `--wait 500` for normal UI pages.
-- Use `--wait 1000` or `--wait 2000` for pages with animations, remote images, or client-side data loading.
-- Use `--wait-for "<selector>"` if a specific element must exist before capture.
-Good commands:
-agentshot "http://127.0.0.1:5173" ".agents/screenshots/home.png" --scroll --wait 1000
-agentshot "http://127.0.0.1:5173/pricing" ".agents/screenshots/pricing.png" --selector "main" --padding 16 --wait 500
-agentshot "http://127.0.0.1:5173" ".agents/screenshots/mobile-home.png" --viewport 390x844 --scroll --wait 1000
+Live-site notes:
+- For production websites, prefer `--wait-until load --wait 1500` to `--wait-until networkidle` unless you know the page settles cleanly.
+- `networkidle` can time out on live sites with analytics, chat widgets, streaming requests, or long-running background fetches.
+- Use `--scroll --wait 1500` or longer when external images, lazy sections, or scroll-triggered animations need time to render.
+- Use `--click-if-present "button:has-text('Reject all')"`, `--click-if-present "button[aria-label='Close']"`, or a site-specific selector to dismiss cookie banners and fixed overlays before capture. Prefer reject/close actions when available.
+- Use `--click` or `--click-if-present` to open or close click-driven UI such as modals, drawers, accordions, details sections, tabs, dropdown menus, and collapsed content.
+- Use `--hover` or `--hover-if-present` for hover-driven UI such as tooltips, hover cards, menu flyouts, hover-revealed controls, and CSS `:hover` states.
+- Many live sites do not use semantic `<section>` tags. If selector capture is unreliable, use stable IDs, text selectors, or measured vertical slices with `--from` and `--to`.
+When to use full-page capture:
+- Use it for broad page reviews and final sanity checks.
+- Pair it with `--scroll --wait 1000` when images, animations, or lazy content may not render until scrolled.
+- Avoid repeated full-page screenshots when a section or slice would answer the visual question faster.
+When to use selector capture:
+- Use it after editing a specific component or section.
+- Use `--padding` so borders, shadows, focus rings, and nearby spacing are visible.
+- Use `--nth` when multiple elements match the selector.
+When to use mobile/desktop viewports:
+- Use mobile checks for nav, wrapping, overflow, CTA layout, form layout, and horizontal scrolling.
+- Use desktop checks when max-widths, columns, hero composition, or large-screen spacing matter.
+- Screenshots default to `--device-scale-factor 2` for sharper image understanding. Use `--device-scale-factor 1` for large batch or matrix runs, such as dozens or hundreds of screenshots, when reducing disk/CI artifact size or upload time matters more than visual fidelity.
+- Use `--viewport` to test different responsive widths. Browser/page zoom is not a first-class `agentshot` option; viewport changes are the supported way to inspect narrower or wider layouts.
+Troubleshooting:
+- If capture fails, run `agentshot doctor`.
+- If browser setup is missing, run `agentshot install-browser` only when setup changes are acceptable.
+- If a selector is missing, verify the URL and selector, then try `--wait-for "<selector>"`.
+- If content is blank or incomplete, try `--scroll`, `--wait 1500`, or `--wait-for "<selector>"`. Use `--wait-until networkidle` only when the target page is expected to become idle.
 Do not:
-- Do not paste base64 screenshots into chat.
 - Do not rely only on command success; inspect the PNG.
-- Do not take huge full-page screenshots when a selector or slice would answer the question faster.
-- Do not keep re-capturing without making a concrete fix or forming a specific visual hypothesis.
-- Do not send feedback unless the user asks or the issue is clearly about the `agentshot` tool itself.
-```
-## Short Version
-Use this when the agent instruction surface is small:
+- Do not paste base64 screenshots into chat.
+- Do not use full-page screenshots as the default for every small UI question.
+- Do not keep re-capturing without a concrete hypothesis or a code change.
+- Do not install, authenticate, or change `agentshot` configuration unless the user explicitly asks.
-```text
-Use `agentshot` for visual UI checks. Capture rendered pages to `.agents/screenshots/`, then inspect the saved PNG before judging the UI. For full pages run `agentshot "<url>" ".agents/screenshots/page.png" --scroll --wait 1000`. For specific sections run `agentshot "<url>" ".agents/screenshots/section.png" --selector "section:has-text('Pricing')" --padding 24 --wait 500`. For mobile run `agentshot "<url>" ".agents/screenshots/mobile.png" --viewport 390x844 --scroll --wait 1000`. Prefer selector or vertical slice captures when possible. If the tool fails, run `agentshot doctor`. Do not paste base64 images into chat.
-```
+Fallback:
+- Use `agent-browser` instead when the task requires complex clicking, filling forms, login/session state, console/network inspection, annotated element refs, or multi-step browser exploration. Use `agentshot --click-if-present` for simple one-click overlay dismissal before a screenshot.
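
The capture options added to AGENT-INSTRUCTIONS.md in 0.2.0 can be combined programmatically. The sketch below is not part of the package: `buildAgentshotArgs` and its option names (`nth`, `padding`, `clickIfPresent`, `waitMs`) are hypothetical. It only emits flags that the instructions list, and it assumes the flags may appear in any order after the URL and output path.

```javascript
// Hypothetical helper (not shipped with agentscreenshots): turns a plain
// options object into an argv array for the `agentshot` CLI.
// Only flags documented in AGENT-INSTRUCTIONS.md are emitted.
function buildAgentshotArgs(options) {
  const args = [options.url, options.output];

  if (options.selector) {
    args.push('--selector', options.selector);
    // --nth and --padding only make sense for selector captures.
    if (Number.isInteger(options.nth)) args.push('--nth', String(options.nth));
    if (Number.isInteger(options.padding)) args.push('--padding', String(options.padding));
  }
  if (options.viewport) args.push('--viewport', options.viewport);

  // The flag is documented as repeatable, so emit it once per selector.
  for (const selector of options.clickIfPresent ?? []) {
    args.push('--click-if-present', selector);
  }

  if (options.scroll) args.push('--scroll');
  if (Number.isInteger(options.waitMs)) args.push('--wait', String(options.waitMs));
  return args;
}

const pricingArgs = buildAgentshotArgs({
  url: 'http://127.0.0.1:5173/pricing',
  output: '.agents/screenshots/pricing-section.png',
  selector: "section:has-text('Pricing')",
  padding: 24,
  clickIfPresent: ["button:has-text('Reject all')"],
  waitMs: 500
});
console.log(JSON.stringify(pricingArgs, null, 2));
```

Keeping the arguments as an array lets a caller pass them to Node's `child_process.execFile('agentshot', pricingArgs)` without shell quoting problems around selectors that contain quotes or parentheses.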

package/README.md CHANGED

@@ -22,12 +22,20 @@ Install globally:
 npm install -g agentscreenshots
 ```
-The package installs the `agentshot` binary and downloads Playwright Chromium during `postinstall`. If browser installation is blocked in your environment, run this manually after install:
+The package installs the `agentshot` binary and checks for an existing local Chrome/Chromium browser during `postinstall`.
+- If a local browser is found, AgentScreenshots uses it and skips the Playwright Chromium download.
+- If no local browser is found, AgentScreenshots downloads Playwright Chromium automatically.
+If browser installation is blocked in your environment, install the package with the automatic browser step disabled:
 ```bash
-npx playwright install chromium
+AGENTSHOT_SKIP_BROWSER_INSTALL=1 npm install -g agentscreenshots
+agentshot install-browser
 ```
+Run `agentshot install-browser --force` if you want to download Playwright Chromium even when a system browser is already available.
 ## Authenticate
 Create a free or paid license in the AgentScreenshots dashboard, then save it locally:
@@ -59,6 +67,12 @@ Basic full-page capture:
 agentshot "http://127.0.0.1:5173" ".agents/screenshots/home.png"
 ```
+Fast development capture in the OS temp directory:
+```bash
+agentshot "http://127.0.0.1:5173" "home.png" --temp
+```
 Lazy-loaded page:
 ```bash
@@ -91,13 +105,31 @@ agentshot "http://127.0.0.1:5173" ".agents/screenshots/mobile.png" \
   --viewport 390x844 --scroll --wait 1000
 ```
+Dismiss a cookie banner or simple overlay before capture:
+```bash
+agentshot "https://example.com" ".agents/screenshots/home.png" \
+  --click-if-present "button:has-text('Reject all')" --wait 500
+```
+Reveal hover UI before capture:
+```bash
+agentshot "http://127.0.0.1:5173" ".agents/screenshots/menu-hover.png" \
+  --hover ".nav-item" --selector ".nav-region" --padding 24
+```
 ## Commands
 ```bash
 agentshot URL OUTPUT [options]
+agentshot URL --temp [options]
+agentshot URL NAME --temp [options]
 agentshot auth LICENSE_KEY [--api-url URL]
 agentshot status
 agentshot doctor
+agentshot install-browser
+agentshot instructions
 agentshot feedback "MESSAGE" [--kind feedback|bug|idea]
 agentshot logout
 agentshot help
@@ -114,22 +146,52 @@ Important capture flags:
 - `--height PX`: capture from `--from` or page top to a fixed height.
 - `--from PX --to PX`: capture a vertical page slice.
 - `--viewport WIDTHxHEIGHT`: set viewport size.
+- `--device-scale-factor N`: capture higher-density pixels without changing CSS layout. Defaults to `2`; use `1` for smaller files.
 - `--wait-for CSS`: wait for an element before capture.
+- `--click SELECTOR`: click a required Playwright/CSS selector before capture. Searches the main page and child frames. Repeatable.
+- `--click-if-present SELECTOR`: click a Playwright/CSS selector before capture if it appears. Searches the main page and child frames. Repeatable.
+- `--hover SELECTOR`: hover a required Playwright/CSS selector before capture. Searches the main page and child frames. Repeatable.
+- `--hover-if-present SELECTOR`: hover a Playwright/CSS selector before capture if it appears. Searches the main page and child frames. Repeatable.
 - `--wait-until STATE`: `load`, `domcontentloaded`, or `networkidle`.
+- `--temp`: save in the OS temp directory. With a filename hint, for example `pricing.png --temp`, the output is named like `pricing-temp-df2d.png`; without a filename hint, the name is derived from the URL. It does not add timed deletion or cleanup tracking.
 - `--json`: print machine-readable output.
 - `--no-report`: skip usage reporting.
 Selectors are Playwright locators, so plain CSS, `text=Pricing`, `section:has-text('Pricing')`, and `xpath=...` work.
+For live production websites, prefer `--wait-until load --wait 1500` over `--wait-until networkidle` unless you know the page settles cleanly. Analytics, chat widgets, and background requests can keep `networkidle` from completing.
+Screenshots default to `--device-scale-factor 2` for sharper image understanding. Use `--device-scale-factor 1` for large batch or matrix runs, such as dozens or hundreds of screenshots, when reducing disk/CI artifact size or upload time matters more than visual fidelity. Browser/page zoom is not currently a first-class CLI option; use `--viewport` for responsive width checks.
+For cookie banners and other one-click overlays, use `--click-if-present` before capture. Prefer reject or close actions when available:
+```bash
+agentshot "https://example.com" ".agents/screenshots/page.png" \
+  --click-if-present "button:has-text('Reject all')" \
+  --click-if-present "button[aria-label='Close']"
+```
+Use `--click` or `--click-if-present` for click-driven UI such as modals, drawers, accordions, tabs, dropdown menus, and collapsed content. Use `--hover` or `--hover-if-present` for hover-driven UI such as tooltips, hover cards, menu flyouts, hover-revealed controls, and CSS `:hover` states.
 ## Agent Workflow
 Tell your coding agent:
-1. Save captures to `.agents/screenshots/`.
+1. Save project captures to `.agents/screenshots/`, unless the user provides another path or asks for `--temp`.
 2. Use `agentshot` after meaningful UI changes.
 3. Prefer selector or slice captures over huge full-page captures when possible.
 4. Open and inspect the saved PNG before judging the UI.
+Output path best practices:
+- For project captures, pass an explicit output path so the capture lands exactly where the user or agent expects.
+- For project work, save screenshots inside the active repo, preferably under `.agents/screenshots/`.
+- Use task-specific subfolders for larger runs, such as `.agents/screenshots/qc/`, `.agents/screenshots/mobile/`, or `.agents/screenshots/2026-05-17-pricing/`.
+- Use descriptive filenames that include the page, viewport, section, or state, for example `pricing-mobile.png`, `hero-hover-menu.png`, or `checkout-modal-closed.png`.
+- Add `.agents/screenshots/` to `.gitignore` unless the screenshots are intentionally part of the repo.
+- For fast development captures that do not need versioned screenshot history, use `--temp` when available.
+- If the user provides a specific output path or saving instructions, use that path and those instructions exactly.
 The package includes [`AGENT-INSTRUCTIONS.md`](./AGENT-INSTRUCTIONS.md), a copy-paste prompt for Claude Code, Codex, Cursor, Windsurf, and OpenCode.
 ## Usage Reporting
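
The README changes above recommend `--device-scale-factor 1` for large batch or matrix runs and descriptive filenames that include the page and viewport. A minimal sketch of such a plan follows. `planViewportMatrix`, the `matrix` subfolder, and the page and viewport names are illustrative assumptions, not part of the package. The viewport sizes and flags come from the README.

```javascript
// Hypothetical planner (not part of agentscreenshots): one capture job per
// page/viewport pair, using scale factor 1 to keep batch artifacts small.
function planViewportMatrix(baseUrl, pages, viewports, folder = '.agents/screenshots/matrix') {
  const jobs = [];
  for (const page of pages) {
    for (const viewport of viewports) {
      // Filename encodes page and viewport, e.g. pricing-mobile.png.
      const output = `${folder}/${page.name}-${viewport.name}.png`;
      jobs.push({
        output,
        args: [
          `${baseUrl}${page.path}`,
          output,
          '--viewport', viewport.size,
          '--device-scale-factor', '1',
          '--scroll',
          '--wait', '1000'
        ]
      });
    }
  }
  return jobs;
}

const matrixJobs = planViewportMatrix(
  'http://127.0.0.1:5173',
  [{ name: 'home', path: '/' }, { name: 'pricing', path: '/pricing' }],
  [{ name: 'mobile', size: '390x844' }, { name: 'desktop', size: '1440x1000' }]
);
for (const job of matrixJobs) {
  console.log(['agentshot', ...job.args].join(' '));
}
```

Each job's `args` array can be handed to a process runner one at a time. The planner itself does no I/O, so the full list can be reviewed before any screenshot is taken.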

package/dist/browser.js ADDED

@@ -0,0 +1,225 @@
+import { access } from 'node:fs/promises';
+import { constants } from 'node:fs';
+import { createRequire } from 'node:module';
+import { delimiter, isAbsolute, join } from 'node:path';
+import { execFile, spawn } from 'node:child_process';
+import { promisify } from 'node:util';
+import { chromium } from 'playwright';
+const execFileAsync = promisify(execFile);
+const require = createRequire(import.meta.url);
+function hasPathSeparator(value) {
+    return value.includes('/') || value.includes('\\');
+}
+function truthyEnv(value) {
+    return value === '1' || value === 'true' || value === 'yes';
+}
+function getEnvBrowserCandidate() {
+    return process.env.AGENTSHOT_BROWSER_PATH || process.env.AGENTSHOT_CHROME_PATH || null;
+}
+async function canAccess(path) {
+    try {
+        await access(path, constants.F_OK);
+        return true;
+    }
+    catch {
+        return false;
+    }
+}
+async function resolveCommand(command) {
+    const resolver = process.platform === 'win32' ? 'where.exe' : 'which';
+    try {
+        const { stdout } = await execFileAsync(resolver, [command]);
+        return stdout
+            .split(/\r?\n/)
+            .map((line) => line.trim())
+            .find(Boolean) ?? null;
+    }
+    catch {
+        return null;
+    }
+}
+function getKnownBrowserPaths() {
+    if (process.platform === 'darwin') {
+        return [
+            '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
+            '/Applications/Chromium.app/Contents/MacOS/Chromium',
+            '/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge',
+            '/Applications/Brave Browser.app/Contents/MacOS/Brave Browser'
+        ];
+    }
+    if (process.platform === 'win32') {
+        const prefixes = [
+            process.env.LOCALAPPDATA,
+            process.env.PROGRAMFILES,
+            process.env['PROGRAMFILES(X86)']
+        ].filter((value) => Boolean(value));
+        return prefixes.flatMap((prefix) => [
+            join(prefix, 'Google/Chrome/Application/chrome.exe'),
+            join(prefix, 'Chromium/Application/chrome.exe'),
+            join(prefix, 'Microsoft/Edge/Application/msedge.exe'),
+            join(prefix, 'BraveSoftware/Brave-Browser/Application/brave.exe')
+        ]);
+    }
+    return [];
+}
+function getPathBrowserCommands() {
+    if (process.platform === 'win32') {
+        return ['chrome.exe', 'msedge.exe', 'brave.exe'];
+    }
+    return [
+        'google-chrome-stable',
+        'google-chrome',
+        'chrome',
+        'chromium',
+        'chromium-browser',
+        'microsoft-edge-stable',
+        'microsoft-edge',
+        'brave-browser',
+        'brave'
+    ];
+}
+async function resolveBrowserCandidate(candidate) {
+    if (isAbsolute(candidate) || hasPathSeparator(candidate)) {
+        return (await canAccess(candidate)) ? candidate : null;
+    }
+    return resolveCommand(candidate);
+}
+function labelForPath(path) {
+    const parts = path.split(/[\\/]/);
+    return parts.at(-1) || path;
+}
+export async function findSystemBrowser() {
+    const seen = new Set();
+    const envCandidate = getEnvBrowserCandidate();
+    if (envCandidate) {
+        const executablePath = await resolveBrowserCandidate(envCandidate);
+        if (executablePath) {
+            return {
+                executablePath,
+                label: 'configured browser',
+                source: 'env'
+            };
+        }
+    }
+    for (const command of getPathBrowserCommands()) {
+        const executablePath = await resolveBrowserCandidate(command);
+        if (executablePath && !seen.has(executablePath)) {
+            seen.add(executablePath);
+            return {
+                executablePath,
+                label: command,
+                source: 'path'
+            };
+        }
+    }
+    for (const path of getKnownBrowserPaths()) {
+        if (seen.has(path)) {
+            continue;
+        }
+        if (await canAccess(path)) {
+            return {
+                executablePath: path,
+                label: labelForPath(path),
+                source: 'known-path'
+            };
+        }
+    }
+    return null;
+}
+export function shouldSkipBrowserInstall() {
+    return truthyEnv(process.env.AGENTSHOT_SKIP_BROWSER_INSTALL);
+}
+export async function installPlaywrightChromium() {
+    const playwrightCliPath = require.resolve('playwright/cli');
+    await new Promise((resolve, reject) => {
+        const child = spawn(process.execPath, [playwrightCliPath, 'install', 'chromium'], {
+            stdio: 'inherit',
+            env: {
+                ...process.env,
+                PATH: process.env.PATH?.split(delimiter).join(delimiter)
+            }
+        });
+        child.on('error', reject);
+        child.on('exit', (code) => {
+            if (code === 0) {
+                resolve();
+            }
+            else {
+                reject(new Error(`playwright install chromium exited with code ${code ?? 'unknown'}.`));
+            }
+        });
+    });
+}
+export async function ensureBrowserInstalled(options = {}) {
+    if (options.respectSkipEnv && shouldSkipBrowserInstall()) {
+        return {
+            status: 'skipped',
+            reason: 'env_skip',
+            message: 'Skipped browser install because AGENTSHOT_SKIP_BROWSER_INSTALL is set. Run `agentshot install-browser` later if needed.'
+        };
+    }
+    if (!options.force) {
+        const systemBrowser = await findSystemBrowser();
+        if (systemBrowser) {
+            return {
+                status: 'skipped',
+                reason: 'system_browser_found',
+                executablePath: systemBrowser.executablePath,
+                message: `Found local browser (${systemBrowser.label}) at ${systemBrowser.executablePath}; skipping Playwright Chromium download.`
+            };
+        }
+    }
+    await installPlaywrightChromium();
+    return {
+        status: 'installed',
+        message: 'Playwright Chromium installed.'
+    };
+}
+function browserLaunchError(error) {
+    const original = error instanceof Error ? error.message : String(error);
+    return new Error(`Could not launch a local browser. Run \`agentshot doctor\` for details or \`agentshot install-browser\` to download Playwright Chromium.\n\nOriginal error: ${original}`);
+}
+export async function launchAgentshotBrowser(options) {
+    if (options.browser === 'chrome') {
+        try {
+            return {
+                browser: await chromium.launch({
+                    channel: 'chrome',
+                    headless: !options.headed
+                }),
+                source: 'Chrome channel'
+            };
+        }
+        catch (error) {
+            throw browserLaunchError(error);
+        }
+    }
+    const systemBrowser = await findSystemBrowser();
+    const errors = [];
+    if (systemBrowser) {
+        try {
+            return {
+                browser: await chromium.launch({
+                    executablePath: systemBrowser.executablePath,
+                    headless: !options.headed
+                }),
+                source: `${systemBrowser.label} at ${systemBrowser.executablePath}`
+            };
+        }
+        catch (error) {
+            errors.push(error instanceof Error ? error.message : String(error));
+        }
+    }
+    try {
+        return {
+            browser: await chromium.launch({
+                headless: !options.headed
+            }),
+            source: 'Playwright Chromium'
+        };
+    }
+    catch (error) {
+        errors.push(error instanceof Error ? error.message : String(error));
+        throw browserLaunchError(errors.join('\n\n'));
+    }
+}
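
The lookup order in `findSystemBrowser` above (environment override, then commands on `PATH`, then well-known install paths) can be illustrated with a simplified synchronous sketch. This is not the package's code: `pickBrowser`, `lookupOnPath`, and `exists` are hypothetical names, the filesystem and `PATH` checks are injected as plain functions, and the environment value is treated as a file path only, whereas the real function also resolves a bare command name via `PATH`.

```javascript
// Simplified, synchronous sketch of the three-stage lookup order.
// `lookupOnPath` and `exists` are injected stand-ins for `which`/`where.exe`
// and a filesystem access check, so the sketch never touches the machine.
function pickBrowser({ envPath, pathCommands, knownPaths, lookupOnPath, exists }) {
  // 1. Explicit override, e.g. from AGENTSHOT_BROWSER_PATH.
  //    A configured but missing path falls through to the later stages.
  if (envPath && exists(envPath)) {
    return { executablePath: envPath, source: 'env' };
  }
  // 2. First browser command that resolves on PATH.
  for (const command of pathCommands) {
    const resolved = lookupOnPath(command);
    if (resolved) {
      return { executablePath: resolved, source: 'path' };
    }
  }
  // 3. First well-known install location that exists.
  for (const candidate of knownPaths) {
    if (exists(candidate)) {
      return { executablePath: candidate, source: 'known-path' };
    }
  }
  return null;
}

const fakeDisk = new Set(['/Applications/Chromium.app/Contents/MacOS/Chromium']);
const picked = pickBrowser({
  envPath: '/opt/missing/chrome',
  pathCommands: ['google-chrome-stable', 'chromium'],
  knownPaths: [
    '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
    '/Applications/Chromium.app/Contents/MacOS/Chromium'
  ],
  lookupOnPath: () => null,
  exists: (path) => fakeDisk.has(path)
});
console.log(picked);
```

In this run the configured path is missing and nothing resolves on `PATH`, so the result comes from the known-path stage. That matches the behaviour in the diff, where an unresolved environment candidate does not abort the search.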

package/dist/capture.js CHANGED

@@ -1,8 +1,8 @@
 import { mkdir } from 'node:fs/promises';
 import { dirname, extname, resolve } from 'node:path';
-import { chromium } from 'playwright';
+import { launchAgentshotBrowser } from './browser.js';
 import { getFileSize, getTargetKind, reportCheck } from './reporting.js';
-const PACKAGE_VERSION = '0.1.0';
+const PACKAGE_VERSION = '0.2.0';
 function sleep(ms) {
     return new Promise((resolveSleep) => setTimeout(resolveSleep, ms));
 }
@@ -17,6 +17,45 @@ function getFormat(output) {
     const extension = extname(output).toLowerCase();
     return extension === '.jpg' || extension === '.jpeg' ? 'jpeg' : 'png';
 }
+function getPngDimensions(image) {
+    if (image.length < 24 || image.toString('ascii', 1, 4) !== 'PNG') {
+        return null;
+    }
+    return {
+        width: image.readUInt32BE(16),
+        height: image.readUInt32BE(20)
+    };
+}
+function getJpegDimensions(image) {
+    let offset = 2;
+    while (offset < image.length - 9) {
+        if (image[offset] !== 0xff) {
+            offset += 1;
+            continue;
+        }
+        const marker = image[offset + 1];
+        offset += 2;
+        if (marker === 0xd8 || marker === 0xd9 || (marker >= 0xd0 && marker <= 0xd7)) {
+            continue;
+        }
+        const segmentLength = image.readUInt16BE(offset);
+        const isStartOfFrame = (marker >= 0xc0 && marker <= 0xc3) ||
+            (marker >= 0xc5 && marker <= 0xc7) ||
+            (marker >= 0xc9 && marker <= 0xcb) ||
+            (marker >= 0xcd && marker <= 0xcf);
+        if (isStartOfFrame) {
+            return {
+                height: image.readUInt16BE(offset + 3),
+                width: image.readUInt16BE(offset + 5)
+            };
+        }
+        offset += segmentLength;
+    }
+    return null;
+}
+function getImageDimensions(image, format) {
+    return format === 'png' ? getPngDimensions(image) : getJpegDimensions(image);
+}
 async function getPageSize(page) {
     return page.evaluate(() => {
         const root = document.scrollingElement || document.documentElement;
@@ -49,6 +88,48 @@ async function triggerLazyLoadedContent(page) {
         await sleepInPage(250);
     });
 }
+async function findVisibleLocatorInFrames(page, selector, timeoutMs) {
+    const deadline = Date.now() + timeoutMs;
+    let lastError = null;
+    while (Date.now() <= deadline) {
+        for (const frame of page.frames()) {
+            const locator = frame.locator(selector).first();
+            try {
+                if (await locator.isVisible({ timeout: 100 })) {
+                    return locator;
+                }
+            }
+            catch (error) {
+                lastError = error;
+            }
+        }
+        await sleep(100);
+    }
+    if (lastError instanceof Error) {
+        throw lastError;
+    }
+    throw new Error(`Timed out waiting for visible selector before click: ${selector}`);
+}
+async function runPreCaptureActions(page, actions, timeoutMs) {
+    for (const action of actions) {
+        const timeout = action.optional ? Math.min(timeoutMs, 2500) : timeoutMs;
+        try {
+            const locator = await findVisibleLocatorInFrames(page, action.selector, timeout);
+            if (action.type === 'click') {
+                await locator.click({ timeout });
+            }
+            else {
+                await locator.hover({ timeout });
+            }
+            await sleep(250);
+        }
+        catch (error) {
+            if (!action.optional) {
+                throw error;
+            }
+        }
+    }
+}
 async function computeSelectorClip(page, selector, index, padding, timeoutMs) {
     const locator = page.locator(selector).nth(index);
     await locator.waitFor({ state: 'visible', timeout: timeoutMs });
@@ -83,12 +164,11 @@ function computeVerticalClip(options, pageWidth, pageHeight) {
     return clampClip({ x: 0, y, width: pageWidth, height }, pageWidth, pageHeight);
 }
 async function launchBrowser(options) {
-    const executablePath = options.browser === 'chrome' ? undefined : undefined;
-    return chromium.launch({
-        channel: options.browser === 'chrome' ? 'chrome' : undefined,
-        executablePath,
-        headless: !options.headed
+    const result = await launchAgentshotBrowser({
+        browser: options.browser,
+        headed: options.headed
     });
+    return result.browser;
 }
 export async function capture(options) {
     const startedAt = Date.now();
@@ -114,6 +194,7 @@ export async function capture(options) {
                 timeout: options.timeoutMs
             });
         }
+        await runPreCaptureActions(page, options.actions, options.timeoutMs);
         if (options.scroll) {
             await triggerLazyLoadedContent(page);
         }
@@ -124,11 +205,12 @@ export async function capture(options) {
         const format = getFormat(output);
         let screenshotWidth = pageSize.width;
         let screenshotHeight = pageSize.height;
+        let screenshotBuffer = null;
         let mode = 'full-page';
         if (options.selector) {
             const selectorClip = await computeSelectorClip(page, options.selector, options.selectorIndex, options.padding, options.timeoutMs);
             const clip = clampClip(selectorClip, pageSize.width, pageSize.height);
-            await page.screenshot({ path: output, fullPage: true, clip, type: format });
+            screenshotBuffer = await page.screenshot({ path: output, fullPage: true, clip, type: format });
             screenshotWidth = clip.width;
             screenshotHeight = clip.height;
             mode = 'selector';
@@ -136,17 +218,22 @@ export async function capture(options) {
         else {
             const verticalClip = computeVerticalClip(options, pageSize.width, pageSize.height);
             if (verticalClip) {
-                await page.screenshot({ path: output, fullPage: true, clip: verticalClip, type: format });
+                screenshotBuffer = await page.screenshot({ path: output, fullPage: true, clip: verticalClip, type: format });
                 screenshotWidth = verticalClip.width;
                 screenshotHeight = verticalClip.height;
                 mode = 'clip';
             }
             else {
-                await page.screenshot({ path: output, fullPage: options.fullPage, type: format });
+                screenshotBuffer = await page.screenshot({ path: output, fullPage: options.fullPage, type: format });
                 screenshotWidth = options.fullPage ? pageSize.width : options.width;
                 screenshotHeight = options.fullPage ? pageSize.height : options.viewportHeight;
             }
         }
+        const imageDimensions = screenshotBuffer ? getImageDimensions(screenshotBuffer, format) : null;
+        if (imageDimensions) {
+            screenshotWidth = imageDimensions.width;
+            screenshotHeight = imageDimensions.height;
+        }
         const durationMs = Date.now() - startedAt;
         const fileSizeBytes = await getFileSize(output);
         let reported = false;
@@ -175,6 +262,7 @@ export async function capture(options) {
         return {
             output,
             url: options.url,
+            temporary: options.temporary,
             width: screenshotWidth,
             height: screenshotHeight,
             fileSizeBytes,
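
Note on the `getImageDimensions` change above: the new code reports the width and height read from the screenshot bytes rather than from the CSS-pixel clip, presumably because the default `deviceScaleFactor` is now 2 and the encoded image can be larger than the clip rectangle. The implementation of `getImageDimensions` is not part of this diff excerpt. The following is only a minimal sketch of how PNG dimensions can be read from a buffer; `readPngDimensions` and `makePngHeader` are hypothetical names, not the package's API.

```javascript
// Hypothetical helpers for illustration; not the package's getImageDimensions.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

function readPngDimensions(buffer) {
    // A PNG begins with an 8-byte signature followed by the IHDR chunk:
    // 4-byte length, 4-byte type "IHDR", then width and height as big-endian uint32.
    if (buffer.length < 24 || !buffer.subarray(0, 8).equals(PNG_SIGNATURE)) {
        return null;
    }
    if (buffer.toString('ascii', 12, 16) !== 'IHDR') {
        return null;
    }
    return { width: buffer.readUInt32BE(16), height: buffer.readUInt32BE(20) };
}

// Builds only the first 24 bytes of a PNG, enough to exercise the reader.
function makePngHeader(width, height) {
    const header = Buffer.alloc(24);
    PNG_SIGNATURE.copy(header, 0);
    header.writeUInt32BE(13, 8); // IHDR data length is always 13 bytes
    header.write('IHDR', 12, 'ascii');
    header.writeUInt32BE(width, 16);
    header.writeUInt32BE(height, 20);
    return header;
}

// A 1280x900 CSS-pixel viewport at device scale factor 2 encodes as 2560x1800.
const dimensions = readPngDimensions(makePngHeader(1280 * 2, 900 * 2));
console.log(dimensions);
```

Returning `null` for anything that is not a recognizable PNG matches how the diff treats a missing result: it keeps the previously computed width and height.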

package/dist/doctor.js CHANGED

@@ -1,7 +1,7 @@
 import { mkdtemp, rm, writeFile } from 'node:fs/promises';
 import { tmpdir } from 'node:os';
 import { join } from 'node:path';
-import { chromium } from 'playwright';
+import { findSystemBrowser, launchAgentshotBrowser } from './browser.js';
 import { getConfigPath, readConfig, resolveApiUrl, resolveLicenseKey } from './config.js';
 import { validateLicense } from './reporting.js';
 function getOverallStatus(checks) {
@@ -32,14 +32,18 @@ async function checkWritableTempFile() {
     }
 }
 async function checkBrowser() {
-    const browser = await chromium.launch();
+    const launch = await launchAgentshotBrowser({
+        browser: 'chromium',
+        headed: false
+    });
+    const { browser } = launch;
     try {
         const page = await browser.newPage();
         await page.setContent('<!doctype html><title>agentshot doctor</title><p>ok</p>');
         return {
             name: 'browser',
             status: 'ok',
-            message: 'Playwright Chromium launches successfully.'
+            message: `Local browser launches successfully (${launch.source}).`
         };
     }
     finally {
@@ -93,6 +97,14 @@ export async function runDoctor(options) {
             : 'No license key configured. Run `agentshot auth ags_live_...` when ready.'
     });
     try {
+        const systemBrowser = await findSystemBrowser();
+        checks.push({
+            name: 'browser-detection',
+            status: systemBrowser ? 'ok' : 'warn',
+            message: systemBrowser
+                ? `Found local browser at ${systemBrowser.executablePath}.`
+                : 'No system Chrome/Chromium browser found. Playwright Chromium will be used if installed.'
+        });
         checks.push(await checkBrowser());
     }
     catch (error) {
@@ -101,7 +113,7 @@ export async function runDoctor(options) {
             status: 'fail',
             message: error instanceof Error
                 ? error.message
-                : 'Playwright Chromium could not launch.'
+                : 'No local browser could launch. Run `agentshot install-browser` to download Playwright Chromium.'
         });
     }
     try {

package/dist/index.js CHANGED

@@ -1,7 +1,10 @@
 #!/usr/bin/env node
+import { randomUUID } from 'node:crypto';
 import { readFile } from 'node:fs/promises';
+import { tmpdir } from 'node:os';
 import { fileURLToPath } from 'node:url';
-import { dirname, join } from 'node:path';
+import { basename, dirname, extname, join } from 'node:path';
+import { ensureBrowserInstalled } from './browser.js';
 import { capture } from './capture.js';
 import { clearLicenseKey, readConfig, writeConfig } from './config.js';
 import { runDoctor } from './doctor.js';
@@ -12,14 +15,20 @@ function printHelp() {
 Usage:
   agentshot URL OUTPUT [options]
+  agentshot URL --temp [options]
+  agentshot URL NAME --temp [options]
   agentshot auth LICENSE_KEY [--api-url URL] [--json]
   agentshot status [--api-url URL] [--license-key KEY]
   agentshot doctor [--api-url URL] [--license-key KEY]
+  agentshot install-browser [--force] [--json]
+  agentshot instructions
   agentshot feedback "MESSAGE" [--kind feedback|bug|idea]
   agentshot logout
 Examples:
   agentshot "http://127.0.0.1:5200/design" "./shots/design.png"
+  agentshot "http://127.0.0.1:5200/design" --temp
+  agentshot "http://127.0.0.1:5200/design" "hero.png" --temp
   agentshot "http://127.0.0.1:5200/design" "./shots/hero.png" --height 1200
   agentshot "http://127.0.0.1:5200/design" "./shots/cards.png" --from 1800 --to 2500
   agentshot "http://127.0.0.1:5200/design" "./shots/borders.png" --selector "section:has-text('Borders')" --padding 24
@@ -39,12 +48,19 @@ Capture options:
   --viewport-height PX      Browser viewport height (default: 900)
   --viewport WIDTHxHEIGHT   Set viewport width and height together
   --wait-for CSS            Wait for a selector before capture
+  --click SELECTOR          Click a required selector before capture; repeatable
+  --click-if-present SELECTOR
+                            Try to click a selector before capture; repeatable
+  --hover SELECTOR          Hover a required selector before capture; repeatable
+  --hover-if-present SELECTOR
+                            Try to hover a selector before capture; repeatable
   --wait-until STATE        load, domcontentloaded, or networkidle (default: load)
   --timeout MS              Navigation/action timeout (default: 30000)
-  --browser NAME            chromium or chrome (default: chromium)
+  --browser NAME            chromium auto-detects local Chrome/Chromium, or use chrome channel
   --headed                  Show the browser window
   --no-full-page            Capture only viewport when no selector/crop is used
-  --device-scale-factor N   Device scale factor (default: 1)
+  --device-scale-factor N   Device scale factor (default: 2)
+  --temp                    Save to OS temp dir; optional OUTPUT becomes filename hint
   --json                    Print machine-readable JSON
   --no-report               Do not report a visual check to AgentScreenshots
   --api-url URL             Override AgentScreenshots API URL
@@ -54,11 +70,13 @@ Environment:
   AGENTSHOT_API_URL
   AGENTSHOT_LICENSE_KEY
   AGENTSHOT_CONFIG
+  AGENTSHOT_BROWSER_PATH
+  AGENTSHOT_SKIP_BROWSER_INSTALL=1
 `);
 }
 function parsePositiveNumber(value, name) {
     const parsed = Number(value);
-    if (!Number.isFinite(parsed) || parsed < 0) {
+    if (!Number.isFinite(parsed) || parsed <= 0) {
         throw new Error(`${name} must be a positive number.`);
     }
     return parsed;
@@ -81,12 +99,14 @@ function getDefaultCaptureOptions(url, output) {
     return {
         url,
         output,
+        temporary: false,
         width: 1280,
         viewportHeight: 900,
-        deviceScaleFactor: 1,
+        deviceScaleFactor: 2,
         waitMs: 0,
         timeoutMs: 30_000,
         scroll: false,
+        actions: [],
         selector: null,
         selectorIndex: 0,
         padding: 0,
@@ -104,6 +124,40 @@ function getDefaultCaptureOptions(url, output) {
         licenseKey: null
     };
 }
+function slugifyFilename(value) {
+    return (value
+        .toLowerCase()
+        .replace(/[^a-z0-9]+/g, '-')
+        .replace(/^-+|-+$/g, '')
+        .slice(0, 64) || 'capture');
+}
+function getTempCaptureOutput(url, nameHint = null) {
+    const id = randomUUID().replace(/-/g, '').slice(0, 4);
+    if (nameHint) {
+        const name = basename(nameHint);
+        const extension = extname(name).toLowerCase();
+        const outputExtension = extension === '.jpg' || extension === '.jpeg' ? extension : '.png';
+        const stem = extension ? name.slice(0, -extension.length) : name;
+        return join(tmpdir(), 'agentshot', `${slugifyFilename(stem)}-temp-${id}${outputExtension}`);
+    }
+    let source = 'capture';
+    try {
+        const parsed = new URL(url);
+        if (parsed.protocol === 'data:') {
+            source = 'data-url';
+        }
+        else {
+            const host = parsed.hostname || parsed.protocol.replace(':', '') || 'capture';
+            const path = parsed.pathname === '/' ? '' : parsed.pathname;
+            source = `${host}${path}`;
+        }
+    }
+    catch {
+        source = url;
+    }
+    const slug = slugifyFilename(source);
+    return join(tmpdir(), 'agentshot', `${slug}-temp-${id}.png`);
+}
 function parseCommonAuthOptions(args, startIndex = 0) {
     let apiUrl = null;
     let licenseKey = null;
@@ -186,12 +240,35 @@ function parseLogout(args) {
     }
     return { name: 'logout', json };
 }
+function parseInstallBrowser(args) {
+    let force = false;
+    let json = false;
+    for (const arg of args) {
+        if (arg === '--force') {
+            force = true;
+        }
+        else if (arg === '--json') {
+            json = true;
+        }
+        else {
+            throw new Error(`Unknown option: ${arg}`);
+        }
+    }
+    return { name: 'install-browser', force, json };
+}
 function parseCapture(args) {
-    if (args.length < 2) {
-        throw new Error('Capture requires URL and OUTPUT.');
+    if (args.length < 1) {
+        throw new Error('Capture requires URL and OUTPUT, or URL with --temp.');
+    }
+    const [url, maybeOutput, ...maybeRest] = args;
+    const output = maybeOutput && !maybeOutput.startsWith('--') ? maybeOutput : null;
+    const rest = output ? maybeRest : args.slice(1);
+    const temporary = rest.includes('--temp');
+    if (!output && !temporary) {
+        throw new Error('Capture requires OUTPUT unless --temp is used.');
     }
-    const [url, output, ...rest] = args;
-    const options = getDefaultCaptureOptions(url, output);
+    const options = getDefaultCaptureOptions(url, temporary ? getTempCaptureOutput(url, output) : output);
+    options.temporary = temporary;
     for (let index = 0; index < rest.length; index += 1) {
         const arg = rest[index];
         switch (arg) {
@@ -250,6 +327,38 @@ function parseCapture(args) {
                 options.waitForSelector = readValue(rest, index, arg);
                 index += 1;
                 break;
+            case '--click':
+                options.actions.push({
+                    type: 'click',
+                    selector: readValue(rest, index, arg),
+                    optional: false
+                });
+                index += 1;
+                break;
+            case '--click-if-present':
+                options.actions.push({
+                    type: 'click',
+                    selector: readValue(rest, index, arg),
+                    optional: true
+                });
+                index += 1;
+                break;
+            case '--hover':
+                options.actions.push({
+                    type: 'hover',
+                    selector: readValue(rest, index, arg),
+                    optional: false
+                });
+                index += 1;
+                break;
+            case '--hover-if-present':
+                options.actions.push({
+                    type: 'hover',
+                    selector: readValue(rest, index, arg),
+                    optional: true
+                });
+                index += 1;
+                break;
             case '--wait-until': {
                 const value = readValue(rest, index, arg);
                 if (value !== 'load' && value !== 'domcontentloaded' && value !== 'networkidle') {
@@ -282,6 +391,8 @@ function parseCapture(args) {
                 options.deviceScaleFactor = parsePositiveNumber(readValue(rest, index, arg), arg);
                 index += 1;
                 break;
+            case '--temp':
+                break;
             case '--json':
                 options.json = true;
                 break;
@@ -310,6 +421,12 @@ function parseArgs(args) {
     if (first === '--version' || first === '-v' || first === 'version') {
         return { name: 'version' };
     }
+    if (first === 'instructions') {
+        if (args.length > 1) {
+            throw new Error('Usage: agentshot instructions');
+        }
+        return { name: 'instructions' };
+    }
     if (first === 'auth') {
         const key = args[1];
         if (!key || key.startsWith('--')) {
@@ -324,6 +441,9 @@ function parseArgs(args) {
     if (first === 'doctor') {
         return { name: 'doctor', ...parseCommonAuthOptions(args, 1) };
     }
+    if (first === 'install-browser') {
+        return parseInstallBrowser(args.slice(1));
+    }
     if (first === 'feedback') {
         return parseFeedback(args.slice(1));
     }
@@ -343,6 +463,11 @@ async function readPackageVersion() {
         return '0.1.0';
     }
 }
+async function readAgentInstructions() {
+    const currentFile = fileURLToPath(import.meta.url);
+    const instructionsPath = join(dirname(currentFile), '..', 'AGENT-INSTRUCTIONS.md');
+    return readFile(instructionsPath, 'utf8');
+}
 async function run() {
     const command = parseArgs(process.argv.slice(2));
     if (command.name === 'help') {
@@ -353,6 +478,10 @@ async function run() {
         console.log(await readPackageVersion());
         return;
     }
+    if (command.name === 'instructions') {
+        console.log((await readAgentInstructions()).trim());
+        return;
+    }
     if (command.name === 'auth') {
         const config = await readConfig();
         const apiUrl = command.apiUrl ?? config.apiUrl;
@@ -463,6 +592,19 @@ async function run() {
         }
         return;
     }
+    if (command.name === 'install-browser') {
+        const result = await ensureBrowserInstalled({ force: command.force });
+        if (command.json) {
+            console.log(JSON.stringify(result, null, 2));
+        }
+        else {
+            console.log(result.message);
+            if (result.status === 'skipped' && result.reason === 'system_browser_found') {
+                console.log('Use `agentshot install-browser --force` to download Playwright Chromium anyway.');
+            }
+        }
+        return;
+    }
     if (command.name === 'feedback') {
         const result = await sendFeedback({
             apiUrl: command.apiUrl,
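
Note on the `--temp` parsing above: the second positional argument is now optional, and with `--temp` it is treated as a filename hint rather than an output path. The sketch below restates the two pure pieces of that logic from the diff so their behavior can be checked in isolation. `splitCaptureArgs` is a hypothetical name for the inline logic in `parseCapture`, and the random id and temp directory handling are left out.

```javascript
// Same transformation as slugifyFilename in the diff: lowercase, collapse
// non-alphanumeric runs to "-", trim edge dashes, cap at 64 characters.
function slugifyFilename(value) {
    return (value
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '-')
        .replace(/^-+|-+$/g, '')
        .slice(0, 64) || 'capture');
}

// Hypothetical wrapper around the positional handling in parseCapture:
// a second argument counts as OUTPUT only if it does not look like an option.
function splitCaptureArgs(args) {
    const [url, maybeOutput, ...maybeRest] = args;
    const output = maybeOutput && !maybeOutput.startsWith('--') ? maybeOutput : null;
    const rest = output ? maybeRest : args.slice(1);
    return { url, output, temporary: rest.includes('--temp') };
}

console.log(slugifyFilename('127.0.0.1/design'));
// prints "127-0-0-1-design"
console.log(splitCaptureArgs(['http://127.0.0.1:5200/design', '--temp']));
// output is null, temporary is true
console.log(splitCaptureArgs(['http://127.0.0.1:5200/design', 'hero.png', '--temp']));
// output is "hero.png", temporary is true
```

In the diff, a call with neither OUTPUT nor `--temp` is rejected with an error, which corresponds to `output === null && temporary === false` here.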

package/dist/postinstall.js ADDED

@@ -0,0 +1,10 @@
+import { ensureBrowserInstalled } from './browser.js';
+async function run() {
+    const result = await ensureBrowserInstalled({ respectSkipEnv: true });
+    console.log(`[agentshot] ${result.message}`);
+}
+run().catch((error) => {
+    const message = error instanceof Error ? error.message : String(error);
+    console.warn(`[agentshot] Browser install did not complete: ${message}`);
+    console.warn('[agentshot] Install finished anyway. Run `agentshot install-browser` or `agentshot doctor` before your first capture.');
+});

package/package.json CHANGED

@@ -1,17 +1,9 @@
 {
 	"name": "agentscreenshots",
-	"version": "0.1.0",
+	"version": "0.2.0",
 	"description": "Local-first website screenshots for AI coding agents.",
 	"type": "module",
 	"homepage": "https://agentscreenshots.com",
-	"repository": {
-		"type": "git",
-		"url": "git+https://github.com/ArchMiha/agentscreenshots.git",
-		"directory": "cli"
-	},
-	"bugs": {
-		"url": "https://github.com/ArchMiha/agentscreenshots/issues"
-	},
 	"bin": {
 		"agentshot": "dist/index.js"
 	},
@@ -25,7 +17,7 @@
 		"build": "tsc -p tsconfig.json",
 		"check": "tsc -p tsconfig.json --noEmit",
 		"dev": "tsx src/index.ts",
-		"postinstall": "playwright install chromium",
+		"postinstall": "node dist/postinstall.js",
 		"prepublishOnly": "npm run check && npm run build"
 	},
 	"keywords": [