npm - libretto - Versions diffs - 0.2.7 → 0.3.1 - Mend

libretto 0.2.7 → 0.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

package/{skill → .agents/skills/libretto}/SKILL.md +20 -18
package/{skill → .agents/skills/libretto}/code-generation-rules.md +3 -3
package/{skill → .agents/skills/libretto}/integration-approach-selection.md +3 -3
package/LICENSE +21 -0
package/README.md +41 -125
package/dist/cli/cli.js +148 -59
package/dist/cli/commands/execution.js +45 -16
package/dist/cli/commands/init.js +120 -49
package/dist/cli/core/browser.js +4 -0
package/dist/cli/core/context.js +5 -6
package/dist/cli/index.js +0 -0
package/dist/cli/workers/run-integration-runtime.js +10 -3
package/dist/cli/workers/run-integration-worker.js +2 -1
package/dist/index.cjs +28 -0
package/dist/index.js +17 -0
package/package.json +28 -81
package/bin/libretto.mjs +0 -18
package/scripts/postinstall.mjs +0 -48

package/{skill → .agents/skills/libretto}/SKILL.md RENAMED

@@ -4,18 +4,19 @@ description: "Browser automation CLI for building integrations, with a network-f
 license: MIT
 metadata:
   author: saffron-health
-  version: "0.2.2"
+  version: "0.2.4"
 ---
 # Browser Integration with Libretto CLI
 Use the `npx libretto` CLI to automate web interactions, debug browser agent jobs, and prototype fixes.
-## CRITICAL: Session Access
+## Session Access
-Libretto sessions are **full-access by default**. You can use `exec` and `run` immediately after opening a session.
+Libretto sessions are **full-access by default** (no approval prompts). You can use `exec` immediately after opening a session. `run` starts its own workflow browser process and requires the target session to be available.
 **Rules:**
 - Always announce which session you opened and what page you are on.
 - Use `snapshot`, `network`, and `actions` first when debugging unknown page state.
 - Before any potentially mutating action (submit/save/delete, or non-idempotent API calls), describe what you are about to do and wait for explicit user confirmation.
@@ -27,14 +28,15 @@ If it's not obvious which element to click or what value to enter, **ask the use
 ## Commands
 ```bash
-npx libretto open <url> [--headless]   # Launch browser and navigate (headed by default)
+npx libretto open <url> [--headed|--headless]   # Launch browser and navigate (headed by default)
 npx libretto exec <code> [--visualize] # Execute Playwright TypeScript code (--visualize enables ghost cursor + highlight)
-npx libretto run <integrationFile> <integrationExport> # Execute integration actions
+npx libretto run <integrationFile> <integrationExport> [--params <json> | --params-file <path>] [--auth-profile <domain>] [--headed|--headless] # Execute integration actions
 npx libretto resume                    # Resume a paused workflow for the current session
 npx libretto snapshot --objective "<what to find>" [--context "<situational info>"]
 npx libretto save <url|domain>         # Save session (cookies, localStorage) to .libretto/profiles/
 npx libretto network                   # Show last 20 captured network requests
 npx libretto actions                   # Show last 20 captured user/agent actions
+npx libretto pages                     # List open pages in the session
 npx libretto close                     # Close the browser
 ```
@@ -65,13 +67,13 @@ Workflows pause by calling `await pause()` (imported from `"libretto"`). In prod
 ## Globals Available in `exec`
-`page`, `context`, `state`, `browser`, `networkLog({ last?, filter?, method? })`, `actionLog({ last?, filter?, action?, source? })`, `console`, `fetch`, `Buffer`, `URL`, `setTimeout`
+`page`, `context`, `state`, `browser`, `networkLog({ last?, filter?, method? })`, `actionLog({ last?, filter?, action?, source? })`, `console`, `fetch`, `Buffer`, `URL`, `setTimeout`, `setInterval`, `clearTimeout`, `clearInterval`
-The `state` object persists across `exec` calls within the same session — use it to carry values between commands.
+The `state` object is scoped to a single `exec` invocation and resets on the next call.
 ## CRITICAL: No try/catch in exec
-**Never use try/catch or .catch() in exec code.** Let errors throw so they surface as exec failures. When an exec fails, you get the full error message (e.g., "intercepts pointer events", "Timeout 30000ms exceeded") — use that to diagnose the problemand write a corrected exec.
+**Never use try/catch or .catch() in exec code.** Let errors throw so they surface as exec failures. When an exec fails, you get the full error message (e.g., "intercepts pointer events", "Timeout 30000ms exceeded") — use that to diagnose the problem and write a corrected exec.
 **Why:** A try/catch inside exec hides failures from you. A click that times out takes 30 seconds — if you retry it in a loop with try/catch, you'll silently burn minutes on the same broken selector with no way to recover. Without try/catch, the error comes back immediately and you can reason about what went wrong.
@@ -174,9 +176,9 @@ npx libretto exec --session browser-agent "await page.locator('.dropdown-trigger
 ## Snapshot — The Primary Observation Tool
-The `snapshot` command captures a PNG screenshot + HTML, sends both to a vision model (Gemini Flash), and returns an analysis with Playwright-ready selectors. `--objective` is required for analysis, and `--context` is optional (but recommended for better results). This is the single way to understand what's on the page — use it any time you need to inspect page structure, find elements, or debug what's happening.
+The `snapshot` command captures a PNG screenshot + HTML and (when `--objective` is provided) runs analysis through the configured AI runtime (`codex`, `claude`, or `gemini` via `npx libretto ai configure ...`). `--context` is optional (but recommended for better results). This is the single way to understand what's on the page — use it any time you need to inspect page structure, find elements, or debug what's happening.
-**Never use `page.screenshot()` via `exec` to understand the page.** Use the `snapshot` command instead — it captures the screenshot, HTML, and sends both to a vision model that returns actionable selectors. Raw screenshots give you an image with no analysis; `snapshot` gives you the answer.
+**Never use `page.screenshot()` via `exec` to understand the page.** Use the `snapshot` command instead — it captures the screenshot, HTML, and runs analysis with selectors. Raw screenshots give you an image with no analysis; `snapshot` gives you the answer.
 ### What to Put in `--objective`
@@ -224,7 +226,7 @@ When the snapshot doesn't give you enough detail — why an element is hidden, w
 ## Tips
-- **Never use `page.screenshot()` via `exec`.** Use `npx libretto snapshot` instead — it captures the viewport, sends the screenshot + HTML to a vision model, and returns actionable selectors. The `fullPage` option is especially dangerous — it scrolls the entire page to stitch a screenshot, which can crash JavaScript-heavy pages (especially EMR portals like eClinicalWorks).
+- **Never use `page.screenshot()` via `exec`.** Use `npx libretto snapshot` instead — it captures the viewport plus HTML and returns analyzed output with actionable selectors. The `fullPage` option is especially dangerous — it scrolls the entire page to stitch a screenshot, which can crash JavaScript-heavy pages (especially EMR portals like eClinicalWorks).
 - **Never run `exec` commands in parallel.** Always wait for one `exec` to finish before starting the next. Do not use `run_in_background` for `exec` calls. Running simultaneous `exec` calls opens multiple CDP connections to the same page, which corrupts the page state and kills the browser.
 - `open` requires an available session. If the session is already active, Libretto fails fast and asks you to close the existing session or use a different `--session`.
 - `run` also requires an available session, except for the specific case of a prior failed `run` in the same session; in that case Libretto releases the failed worker and allows rerun.
@@ -234,7 +236,7 @@ When the snapshot doesn't give you enough detail — why an element is hidden, w
 ## Network Logging
-Network requests are captured automatically when a browser is opened via `npx libretto open`. All non-static HTTP responses (excluding `.css`, `.js`, `.png`, `.jpg`, `.gif`, `.woff`, `.ico`, `.svg`, and `chrome-extension://` URLs) are logged to `.libretto/sessions/<session>/network.jsonl`.
+Network requests are captured automatically for Libretto-managed browser sessions (for example from `npx libretto open` and `npx libretto run`). Non-static HTTP responses are logged to `.libretto/sessions/<session>/network.jsonl`.
 ### CLI: `npx libretto network`
@@ -256,11 +258,11 @@ npx libretto exec "return await networkLog({ method: 'POST' })"
 Returns an array of objects with: `ts`, `method`, `url`, `status`, `contentType`, `postData` (POST/PUT/PATCH only, first 2000 chars), `size`, `durationMs`.
-**Note:** Network logging only works for sessions opened via `npx libretto open`. It does not capture requests for external sessions like `--session browser-agent`.
+**Note:** Network logging works for Libretto-managed sessions. It does not capture requests for external sessions like `--session browser-agent`.
 ## Action Logging
-Browser actions are captured automatically when a browser is opened via `npx libretto open`. Both user interactions (manual clicks, typing in the headed browser window) and agent actions (programmatic Playwright API calls via `exec`) are logged to `.libretto/sessions/<session>/actions.jsonl` with a `source` field of `'user'` or `'agent'` to distinguish the two.
+Browser actions are captured automatically for Libretto-managed browser sessions (for example from `npx libretto open` and `npx libretto run`). Both user interactions (manual clicks, typing in the headed browser window) and agent actions (programmatic Playwright API calls via `exec`) are logged to `.libretto/sessions/<session>/actions.jsonl` with a `source` field of `'user'` or `'agent'` to distinguish the two.
 ### CLI: `npx libretto actions`
@@ -284,7 +286,7 @@ npx libretto exec "return await actionLog({ action: 'click' })"
 Returns an array of objects with: `ts`, `action`, `source` (`'user'` | `'agent'`), `selector`, `value`, `url`, `duration`, `success`, `error`.
-**Note:** Action logging only works for sessions opened via `npx libretto open`. It does not capture actions for external sessions like `--session browser-agent`.
+**Note:** Action logging works for Libretto-managed sessions. It does not capture actions for external sessions like `--session browser-agent`.
 ## Workflow: Creating a New Integration
@@ -436,7 +438,7 @@ After completing interactive exploration, **always generate the TypeScript workf
 **STOP AND ASK BEFORE GENERATING CODE.** Once the interactive workflow is figured out, pause and ask:
 1. "Are there any existing files or patterns in the codebase you want me to reference?"
-2. "Do you want me to incorporate any of your manual browser interactions from the actions log (`npx libretto actions --source user`) into the generated code?"
+2. Check the action log for user interactions by running `npx libretto actions --source user`. If there are any recorded user interactions, ask: "I see you performed some manual interactions in the browser (clicks, form fills, etc.). Would you like me to incorporate any of those into the generated code?" — and briefly list what you found. If there are no user interactions, skip this question entirely.
 3. "Any other guidance for how the production code should be structured?"
 Wait for the user's response before proceeding. Then:
@@ -446,6 +448,6 @@ Wait for the user's response before proceeding. Then:
 ## Patient Safety Warning
-Browser automation jobs process real patient health information. The `npx libretto` CLI executes arbitrary code with full page access. **Never** execute code that submits forms, sends referrals, deletes data, or modifies patient records.
+Browser automation jobs process real patient health information. The `npx libretto` CLI executes arbitrary code with full page access. **Never execute mutating actions without explicit user confirmation first** (submits, sends, deletes, updates, or other side effects).
-See `apps/browser-agent/docs/interactive-debugging-workflow.md` for the complete debugging guide.
+For debugging steps, see the "Workflow: Interactive Debugging" section in this file.
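
Note on the network log format described in the hunks above: entries are written one JSON object per line to `.libretto/sessions/<session>/network.jsonl`, with the fields `ts`, `method`, `url`, `status`, `contentType`, `postData`, `size`, `durationMs`. The following is a minimal sketch of reading that file's contents, assuming those fields. The TypeScript types, the `NetworkEntry` name, and both helper functions are illustrative assumptions and not part of Libretto's API.

```typescript
// Entry shape taken from the field list in SKILL.md; the types are assumptions.
interface NetworkEntry {
  ts: string | number;
  method: string;
  url: string;
  status: number;
  contentType?: string;
  postData?: string; // documented as POST/PUT/PATCH only, first 2000 chars
  size?: number;
  durationMs?: number;
}

// Parse the text of a network.jsonl file: one JSON object per non-empty line.
function parseNetworkJsonl(text: string): NetworkEntry[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as NetworkEntry);
}

// Hypothetical helper: keep successful JSON responses, the usual candidates
// when converting a browser flow into direct network requests.
function findJsonApiCalls(entries: NetworkEntry[]): NetworkEntry[] {
  return entries.filter(
    (entry) =>
      entry.status >= 200 &&
      entry.status < 300 &&
      (entry.contentType ?? "").includes("json"),
  );
}

// Made-up sample data for illustration only.
const sample = [
  '{"ts":1,"method":"GET","url":"https://example.com/api/items","status":200,"contentType":"application/json","size":120,"durationMs":35}',
  '{"ts":2,"method":"POST","url":"https://example.com/login","status":302,"contentType":"text/html","postData":"user=a","size":0,"durationMs":80}',
].join("\n");

console.log(
  findJsonApiCalls(parseNetworkJsonl(sample)).map((e) => `${e.method} ${e.url}`),
);
```

In a real session the text would come from the log file on disk rather than an inline string; the filter criteria are a starting point and would likely need tuning per site.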

package/{skill → .agents/skills/libretto}/code-generation-rules.md RENAMED

@@ -151,7 +151,7 @@ Use `page.evaluate()` only for operations that have no Playwright locator equiva
 A quick test: if the evaluate body contains `querySelector`, `querySelectorAll`, `textContent`, `click()`, `getAttribute()`, or iterates DOM elements, it should be rewritten with Playwright locators.
-When `page.evaluate()` is used for the acceptable cases above, use a string expression to avoid DOM type errors:
+When `page.evaluate()` is used for the acceptable cases above, keep the logic self-contained and return JSON-serializable values:
 ```typescript
 const data = (await page.evaluate(`(() => {
@@ -160,7 +160,7 @@ const data = (await page.evaluate(`(() => {
 })()`)) as string;
 ```
-Do not use `/// <reference lib="dom" />` or add `"dom"` to the tsconfig lib — this project's tsconfig intentionally excludes DOM types.
+Do not rely on broad DOM querying inside `page.evaluate()` for production flows when Playwright locators can express the same interaction.
 ## Network Request Methods
@@ -220,4 +220,4 @@ for (const post of posts) {
 ## Type Checking
-The generated file must pass `npx tsc --noEmit` before it's considered done. If there are DOM type errors (`document`, `HTMLElement`, `getComputedStyle`), convert to locator APIs or string-expression `page.evaluate()`.
+The generated file must pass `npx tsc --noEmit` before it's considered done. If there are type errors around DOM access, prefer locator APIs first, then use focused `page.evaluate()` only for browser-native APIs.
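
Note on the revised `page.evaluate()` guidance above ("keep the logic self-contained and return JSON-serializable values"): one way to read it is sketched below. The example assumes reading `localStorage` counts as a browser-native operation with no locator equivalent. `EvaluatingPage`, `buildStorageReadExpression`, and `readLocalStorage` are hypothetical names; the interface is a structural stand-in for the string-expression form of Playwright's `page.evaluate` so the sketch compiles without DOM types or a Playwright import.

```typescript
// Structural stand-in for the part of a Playwright Page used here (assumption).
interface EvaluatingPage {
  evaluate(expression: string): Promise<unknown>;
}

// Build a self-contained expression string. The key is embedded with
// JSON.stringify so quotes and backslashes cannot break the expression.
function buildStorageReadExpression(key: string): string {
  return `(() => {
    const raw = window.localStorage.getItem(${JSON.stringify(key)});
    return raw === null ? null : String(raw);
  })()`;
}

// Return a plain string or null, both of which are JSON-serializable.
async function readLocalStorage(
  page: EvaluatingPage,
  key: string,
): Promise<string | null> {
  const value = await page.evaluate(buildStorageReadExpression(key));
  return typeof value === "string" ? value : null;
}
```

Because the expression references nothing from the Node side except the serialized key, it does not depend on closures that would be lost when the code runs in the browser context.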

package/{skill → .agents/skills/libretto}/integration-approach-selection.md RENAMED

@@ -138,9 +138,9 @@ Extract data directly from the rendered page using selectors and `page.evaluate(
 | Site Profile | Primary Strategy | Supplement With |
 |---|---|---|
 | No bot protection, fetch not patched | **A** (`page.evaluate(fetch)`) | Playwright for navigation/auth |
-| No bot protection, fetch IS patched | **B** (`page.onResponse`) | Playwright for navigation; DOM extraction as fallback |
-| Bot protection detected, fetch not patched | **B** (`page.onResponse`) | Playwright for all navigation; cautious use of `page.evaluate(fetch)` only if needed |
-| Bot protection detected, fetch IS patched | **B** (`page.onResponse`) | Playwright for all navigation; DOM extraction as fallback |
+| No bot protection, fetch IS patched | **B** (`page.on('response', ...)`) | Playwright for navigation; DOM extraction as fallback |
+| Bot protection detected, fetch not patched | **B** (`page.on('response', ...)`) | Playwright for all navigation; cautious use of `page.evaluate(fetch)` only if needed |
+| Bot protection detected, fetch IS patched | **B** (`page.on('response', ...)`) | Playwright for all navigation; DOM extraction as fallback |
 | Server-rendered content (no API calls) | **C** (DOM extraction) | Playwright for all interaction |
 ---
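
Note on the `page.onResponse` → `page.on('response', ...)` correction in the table above: Playwright exposes response capture as an event listener on the page, not as an `onResponse` method. A minimal sketch of strategy B under that API follows. `PageLike` and `ResponseLike` are structural stand-ins (assumptions) covering only the `on`, `url()`, `status()`, `headers()`, and `json()` members used here, so the block runs without Playwright installed; `collectJsonResponses` is a hypothetical helper name.

```typescript
// Subset of Playwright's Response used by this sketch (structural assumption).
interface ResponseLike {
  url(): string;
  status(): number;
  headers(): Record<string, string>;
  json(): Promise<unknown>;
}

// Subset of Playwright's Page used by this sketch (structural assumption).
interface PageLike {
  on(event: "response", listener: (response: ResponseLike) => void): unknown;
}

// Register a listener and collect successful JSON responses whose URL
// contains the given fragment. The returned array fills as responses arrive.
function collectJsonResponses(
  page: PageLike,
  urlFragment: string,
): ResponseLike[] {
  const matches: ResponseLike[] = [];
  page.on("response", (response) => {
    const contentType = response.headers()["content-type"] ?? "";
    const ok = response.status() >= 200 && response.status() < 300;
    if (ok && response.url().includes(urlFragment) && contentType.includes("json")) {
      matches.push(response);
    }
  });
  return matches;
}
```

The listener should be registered before the navigation or click that triggers the requests; bodies can then be read afterwards with `await match.json()` on each collected response.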

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Libretto contributors
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md CHANGED

@@ -1,156 +1,72 @@
 # Libretto
-AI-powered browser automation library and CLI built on Playwright.
+Libretto gives your coding agent superpowers for building, debugging, and maintaining browser RPA integrations.
+It is designed for engineering teams that automate workflows in web apps and want to move from brittle browser-only scripts to faster, more reliable network-first integrations.
 ## Installation
+Install Libretto in your project with your favorite package manager:
 ```bash
+npm install libretto playwright zod
+yarn add libretto playwright zod
+bun add libretto playwright zod
 pnpm add libretto playwright zod
-npx libretto init
 ```
-> **pnpm users:** if your workspace uses `onlyBuiltDependencies`, add both
-> `libretto` and `playwright` to allow their postinstall scripts to run
-> (libretto's postinstall copies skill files and installs Playwright Chromium):
->
-> ```jsonc
-> // package.json
-> {
->   "pnpm": {
->     "onlyBuiltDependencies": ["libretto", "playwright"]
->   }
-> }
-> ```
->
-> If the postinstall was skipped (e.g., `libretto` wasn't in the allowlist),
-> run `npx libretto init` manually after install to complete setup.
-## Quick Start
-### 1. Configure your LLM
-The easiest way is to use the built-in Vercel AI SDK adapter with any compatible provider:
-```typescript
-import { createLLMClientFromModel } from "libretto/llm";
-import { openai } from "@ai-sdk/openai";
-const llmClient = createLLMClientFromModel(openai("gpt-4o"));
+Then initialize Libretto:
+```bash
+npx libretto init
 ```
-Or use any other provider:
+## Usage
-```typescript
-import { createLLMClientFromModel } from "libretto";
-import { anthropic } from "@ai-sdk/anthropic";
+Libretto is usually used through prompts with the Libretto skill.
-const llmClient = createLLMClientFromModel(anthropic("claude-sonnet-4-20250514"));
-```
+### One-shot script generation
-You can also implement the `LLMClient` interface directly for full control:
+```text
+Use the Libretto skill. Go on LinkedIn and scrape the first 10 posts for content, who posted it, the number of reactions, the first 25 comments, and the first 25 reposts.
+```
-```typescript
-import type { LLMClient } from "libretto";
+### Interactive script building
-const llmClient: LLMClient = {
-  async generateObject({ prompt, schema, temperature }) {
-    // Call your LLM, return parsed + validated result
-  },
-  async generateObjectFromMessages({ messages, schema, temperature }) {
-    // Call your LLM with message history (may include images)
-  },
-};
+```text
+Use the Libretto skill. Let's interactively build a script to scrape scheduling info from the eClinicalWorks EHR.
 ```
-### 2. Write a workflow
+### Convert browser automation to network requests
-```typescript
-import { workflow } from "libretto";
-import { z } from "zod";
-export default workflow({
-  name: "extract-product",
-  schema: z.object({ url: z.string() }),
-  handler: async (ctx) => {
-    const page = ctx.page;
-    await page.goto(ctx.params.url);
+```text
+We have a browser script at ./integration.ts that automates going to Hacker News and getting the first 10 posts. Convert it to direct network scripts instead. Use the Libretto skill.
+```
-    const data = await ctx.extract({
-      instruction: "Extract the product name and price",
-      schema: z.object({ name: z.string(), price: z.number() }),
-    });
+### Fix broken integrations
-    return data;
-  },
-});
+```text
+We have a browser script at ./integration.ts that is supposed to go to Availity and perform an eligibility check for a patient. But I'm getting a broken selector error when I run it. Fix it. Use the Libretto skill.
 ```
-### 3. Run it
+You can also run workflows directly from the CLI:
 ```bash
-npx libretto run ./workflows/extract-product.ts extractProduct \
-  --params '{"url": "https://example.com/product"}'
+npx libretto help
+npx libretto run ./integration.ts main
 ```
-## CLI Commands
-```
-npx libretto init                  # Copy skills, install Playwright Chromium
-npx libretto open <url>            # Launch browser and open URL
-npx libretto run <file> <export>   # Run a workflow
-npx libretto ai configure <preset> # Configure AI runtime (codex, claude, gemini)
-npx libretto snapshot              # Capture page screenshot + HTML
-npx libretto exec <code>           # Execute Playwright code
-```
+## Authors
-Run `npx libretto help` for the full list.
-## Module Exports
-| Import                     | Contents                                                      |
-| -------------------------- | ------------------------------------------------------------- |
-| `libretto`                 | Everything                                                    |
-| `libretto/llm`             | `LLMClient` type, `createLLMClient`, `createLLMClientFromModel` |
-| `libretto/recovery`        | `attemptWithRecovery`, `executeRecoveryAgent`, `detectSubmissionError` |
-| `libretto/extract`         | `extractFromPage`                                             |
-| `libretto/network`         | `pageRequest`                                                 |
-| `libretto/download`        | `downloadViaClick`, `downloadAndSave`                         |
-| `libretto/logger`          | `Logger`, `defaultLogger`, sinks                              |
-| `libretto/debug`           | `debugPause`                                                  |
-| `libretto/config`          | `isDryRun`, `isDebugMode`, `shouldPauseBeforeMutation`        |
-| `libretto/instrumentation` | `instrumentPage`, `installInstrumentation`                    |
-| `libretto/visualization`   | Ghost cursor and highlight helpers                            |
-| `libretto/run`             | `launchBrowser`                                               |
-| `libretto/state`           | Session state serialization and parsing                       |
-## Using Recovery Helpers
-The recovery module (`libretto/recovery`) provides `detectSubmissionError` and
-`executeRecoveryAgent` for handling form submission errors. Both accept an
-`LLMClient` — create one with `createLLMClientFromModel` and pass it directly:
-```typescript
-import { detectSubmissionError, executeRecoveryAgent } from "libretto/recovery";
-import { createLLMClientFromModel } from "libretto/llm";
-import { openai } from "@ai-sdk/openai";
-const llmClient = createLLMClientFromModel(openai("gpt-4o"));
-// Detect if a submission produced an error
-const error = await detectSubmissionError(
-  page, submissionError, "eligibility check failed", llmClient, knownErrors, logger,
-);
-// Or run the full recovery agent to retry with corrections
-const result = await executeRecoveryAgent(
-  page, error, llmClient, recoveryOptions, logger,
-);
-```
+Maintained by the team at [Saffron Health](https://saffron.health).
-No need to write custom wrappers — `createLLMClientFromModel` bridges any
-Vercel AI SDK provider into the `LLMClient` interface that recovery helpers expect.
+## Development
-## Links
+For local development in this repository:
-- [GitHub](https://github.com/saffron-health/libretto)
-- [Issues](https://github.com/saffron-health/libretto/issues)
+```bash
+pnpm i
+pnpm build
+pnpm type-check
+pnpm test
+```