libretto 0.2.7 → 0.3.1

@@ -4,18 +4,19 @@ description: "Browser automation CLI for building integrations, with a network-f
4
4
  license: MIT
5
5
  metadata:
6
6
  author: saffron-health
7
- version: "0.2.2"
7
+ version: "0.2.4"
8
8
  ---
9
9
 
10
10
  # Browser Integration with Libretto CLI
11
11
 
12
12
  Use the `npx libretto` CLI to automate web interactions, debug browser agent jobs, and prototype fixes.
13
13
 
14
- ## CRITICAL: Session Access
14
+ ## Session Access
15
15
 
16
- Libretto sessions are **full-access by default**. You can use `exec` and `run` immediately after opening a session.
16
+ Libretto sessions are **full-access by default** (no approval prompts). You can use `exec` immediately after opening a session. `run` starts its own workflow browser process and requires the target session to be available.
17
17
 
18
18
  **Rules:**
19
+
19
20
  - Always announce which session you opened and what page you are on.
20
21
  - Use `snapshot`, `network`, and `actions` first when debugging unknown page state.
21
22
  - Before any potentially mutating action (submit/save/delete, or non-idempotent API calls), describe what you are about to do and wait for explicit user confirmation.
@@ -27,14 +28,15 @@ If it's not obvious which element to click or what value to enter, **ask the use
27
28
  ## Commands
28
29
 
29
30
  ```bash
30
- npx libretto open <url> [--headless] # Launch browser and navigate (headed by default)
31
+ npx libretto open <url> [--headed|--headless] # Launch browser and navigate (headed by default)
31
32
  npx libretto exec <code> [--visualize] # Execute Playwright TypeScript code (--visualize enables ghost cursor + highlight)
32
- npx libretto run <integrationFile> <integrationExport> # Execute integration actions
33
+ npx libretto run <integrationFile> <integrationExport> [--params <json> | --params-file <path>] [--auth-profile <domain>] [--headed|--headless] # Execute integration actions
33
34
  npx libretto resume # Resume a paused workflow for the current session
34
35
  npx libretto snapshot --objective "<what to find>" [--context "<situational info>"]
35
36
  npx libretto save <url|domain> # Save session (cookies, localStorage) to .libretto/profiles/
36
37
  npx libretto network # Show last 20 captured network requests
37
38
  npx libretto actions # Show last 20 captured user/agent actions
39
+ npx libretto pages # List open pages in the session
38
40
  npx libretto close # Close the browser
39
41
  ```
40
42
 
@@ -65,13 +67,13 @@ Workflows pause by calling `await pause()` (imported from `"libretto"`). In prod
65
67
 
66
68
  ## Globals Available in `exec`
67
69
 
68
- `page`, `context`, `state`, `browser`, `networkLog({ last?, filter?, method? })`, `actionLog({ last?, filter?, action?, source? })`, `console`, `fetch`, `Buffer`, `URL`, `setTimeout`
70
+ `page`, `context`, `state`, `browser`, `networkLog({ last?, filter?, method? })`, `actionLog({ last?, filter?, action?, source? })`, `console`, `fetch`, `Buffer`, `URL`, `setTimeout`, `setInterval`, `clearTimeout`, `clearInterval`
69
71
 
70
- The `state` object persists across `exec` calls within the same session; use it to carry values between commands.
72
+ The `state` object is scoped to a single `exec` invocation and resets on the next call.
71
73
 
72
74
  ## CRITICAL: No try/catch in exec
73
75
 
74
- **Never use try/catch or .catch() in exec code.** Let errors throw so they surface as exec failures. When an exec fails, you get the full error message (e.g., "intercepts pointer events", "Timeout 30000ms exceeded") — use that to diagnose the problemand write a corrected exec.
76
+ **Never use try/catch or .catch() in exec code.** Let errors throw so they surface as exec failures. When an exec fails, you get the full error message (e.g., "intercepts pointer events", "Timeout 30000ms exceeded") — use that to diagnose the problem and write a corrected exec.
75
77
 
76
78
  **Why:** A try/catch inside exec hides failures from you. A click that times out takes 30 seconds — if you retry it in a loop with try/catch, you'll silently burn minutes on the same broken selector with no way to recover. Without try/catch, the error comes back immediately and you can reason about what went wrong.
77
79
 
@@ -174,9 +176,9 @@ npx libretto exec --session browser-agent "await page.locator('.dropdown-trigger
174
176
 
175
177
  ## Snapshot — The Primary Observation Tool
176
178
 
177
- The `snapshot` command captures a PNG screenshot + HTML, sends both to a vision model (Gemini Flash), and returns an analysis with Playwright-ready selectors. `--objective` is required for analysis, and `--context` is optional (but recommended for better results). This is the single way to understand what's on the page — use it any time you need to inspect page structure, find elements, or debug what's happening.
179
+ The `snapshot` command captures a PNG screenshot + HTML and (when `--objective` is provided) runs analysis through the configured AI runtime (`codex`, `claude`, or `gemini` via `npx libretto ai configure ...`). `--context` is optional (but recommended for better results). This is the single way to understand what's on the page — use it any time you need to inspect page structure, find elements, or debug what's happening.
178
180
 
179
- **Never use `page.screenshot()` via `exec` to understand the page.** Use the `snapshot` command instead — it captures the screenshot, HTML, and sends both to a vision model that returns actionable selectors. Raw screenshots give you an image with no analysis; `snapshot` gives you the answer.
181
+ **Never use `page.screenshot()` via `exec` to understand the page.** Use the `snapshot` command instead — it captures the screenshot, HTML, and runs analysis with selectors. Raw screenshots give you an image with no analysis; `snapshot` gives you the answer.
180
182
 
181
183
  ### What to Put in `--objective`
182
184
 
@@ -224,7 +226,7 @@ When the snapshot doesn't give you enough detail — why an element is hidden, w
224
226
 
225
227
  ## Tips
226
228
 
227
- - **Never use `page.screenshot()` via `exec`.** Use `npx libretto snapshot` instead — it captures the viewport, sends the screenshot + HTML to a vision model, and returns actionable selectors. The `fullPage` option is especially dangerous — it scrolls the entire page to stitch a screenshot, which can crash JavaScript-heavy pages (especially EMR portals like eClinicalWorks).
229
+ - **Never use `page.screenshot()` via `exec`.** Use `npx libretto snapshot` instead — it captures the viewport plus HTML and returns analyzed output with actionable selectors. The `fullPage` option is especially dangerous — it scrolls the entire page to stitch a screenshot, which can crash JavaScript-heavy pages (especially EMR portals like eClinicalWorks).
228
230
  - **Never run `exec` commands in parallel.** Always wait for one `exec` to finish before starting the next. Do not use `run_in_background` for `exec` calls. Running simultaneous `exec` calls opens multiple CDP connections to the same page, which corrupts the page state and kills the browser.
229
231
  - `open` requires an available session. If the session is already active, Libretto fails fast and asks you to close the existing session or use a different `--session`.
230
232
  - `run` also requires an available session, except for the specific case of a prior failed `run` in the same session; in that case Libretto releases the failed worker and allows rerun.
@@ -234,7 +236,7 @@ When the snapshot doesn't give you enough detail — why an element is hidden, w
234
236
 
235
237
  ## Network Logging
236
238
 
237
- Network requests are captured automatically when a browser is opened via `npx libretto open`. All non-static HTTP responses (excluding `.css`, `.js`, `.png`, `.jpg`, `.gif`, `.woff`, `.ico`, `.svg`, and `chrome-extension://` URLs) are logged to `.libretto/sessions/<session>/network.jsonl`.
239
+ Network requests are captured automatically for Libretto-managed browser sessions (for example from `npx libretto open` and `npx libretto run`). Non-static HTTP responses are logged to `.libretto/sessions/<session>/network.jsonl`.
238
240
 
239
241
  ### CLI: `npx libretto network`
240
242
 
@@ -256,11 +258,11 @@ npx libretto exec "return await networkLog({ method: 'POST' })"
256
258
 
257
259
  Returns an array of objects with: `ts`, `method`, `url`, `status`, `contentType`, `postData` (POST/PUT/PATCH only, first 2000 chars), `size`, `durationMs`.
258
260
 
259
- **Note:** Network logging only works for sessions opened via `npx libretto open`. It does not capture requests for external sessions like `--session browser-agent`.
261
+ **Note:** Network logging works for Libretto-managed sessions. It does not capture requests for external sessions like `--session browser-agent`.
260
262
 
261
263
  ## Action Logging
262
264
 
263
- Browser actions are captured automatically when a browser is opened via `npx libretto open`. Both user interactions (manual clicks, typing in the headed browser window) and agent actions (programmatic Playwright API calls via `exec`) are logged to `.libretto/sessions/<session>/actions.jsonl` with a `source` field of `'user'` or `'agent'` to distinguish the two.
265
+ Browser actions are captured automatically for Libretto-managed browser sessions (for example from `npx libretto open` and `npx libretto run`). Both user interactions (manual clicks, typing in the headed browser window) and agent actions (programmatic Playwright API calls via `exec`) are logged to `.libretto/sessions/<session>/actions.jsonl` with a `source` field of `'user'` or `'agent'` to distinguish the two.
264
266
 
265
267
  ### CLI: `npx libretto actions`
266
268
 
@@ -284,7 +286,7 @@ npx libretto exec "return await actionLog({ action: 'click' })"
284
286
 
285
287
  Returns an array of objects with: `ts`, `action`, `source` (`'user'` | `'agent'`), `selector`, `value`, `url`, `duration`, `success`, `error`.
286
288
 
287
- **Note:** Action logging only works for sessions opened via `npx libretto open`. It does not capture actions for external sessions like `--session browser-agent`.
289
+ **Note:** Action logging works for Libretto-managed sessions. It does not capture actions for external sessions like `--session browser-agent`.
288
290
 
289
291
  ## Workflow: Creating a New Integration
290
292
 
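As an illustration of the action log described in this hunk, here is a minimal sketch of reading `actions.jsonl` content and keeping only manual interactions. The field names come from the list above; the field types, the `ActionEntry` interface, and the helper names `parseActionLog` and `userActions` are assumptions made for this example, not part of Libretto's API.

```typescript
// Hypothetical entry shape: field names follow the documented list,
// but the types are assumptions for illustration only.
interface ActionEntry {
  ts: string | number;
  action: string;
  source: "user" | "agent";
  selector?: string;
  value?: string;
  url?: string;
  duration?: number;
  success?: boolean;
  error?: string;
}

// Parse JSONL text (one JSON object per line), skipping blank lines.
function parseActionLog(jsonl: string): ActionEntry[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as ActionEntry);
}

// Keep only manual interactions, similar in spirit to
// `npx libretto actions --source user`.
function userActions(entries: ActionEntry[]): ActionEntry[] {
  return entries.filter((entry) => entry.source === "user");
}

// Made-up sample data standing in for the contents of actions.jsonl.
const sample = [
  '{"ts":1,"action":"click","source":"user","selector":"#save","success":true}',
  '{"ts":2,"action":"fill","source":"agent","selector":"#name","value":"x","success":true}',
].join("\n");

// Only the entry with source "user" remains after filtering.
console.log(userActions(parseActionLog(sample)).map((entry) => entry.action));
```

In practice the CLI command or the `actionLog({ source })` global covers this case; the sketch only shows what the filtering amounts to.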
@@ -436,7 +438,7 @@ After completing interactive exploration, **always generate the TypeScript workf
436
438
  **STOP AND ASK BEFORE GENERATING CODE.** Once the interactive workflow is figured out, pause and ask:
437
439
 
438
440
  1. "Are there any existing files or patterns in the codebase you want me to reference?"
439
- 2. "Do you want me to incorporate any of your manual browser interactions from the actions log (`npx libretto actions --source user`) into the generated code?"
441
+ 2. Check the action log for user interactions by running `npx libretto actions --source user`. If there are any recorded user interactions, ask: "I see you performed some manual interactions in the browser (clicks, form fills, etc.). Would you like me to incorporate any of those into the generated code?" — and briefly list what you found. If there are no user interactions, skip this question entirely.
440
442
  3. "Any other guidance for how the production code should be structured?"
441
443
 
442
444
  Wait for the user's response before proceeding. Then:
@@ -446,6 +448,6 @@ Wait for the user's response before proceeding. Then:
446
448
 
447
449
  ## Patient Safety Warning
448
450
 
449
- Browser automation jobs process real patient health information. The `npx libretto` CLI executes arbitrary code with full page access. **Never** execute code that submits forms, sends referrals, deletes data, or modifies patient records.
451
+ Browser automation jobs process real patient health information. The `npx libretto` CLI executes arbitrary code with full page access. **Never execute mutating actions without explicit user confirmation first** (submits, sends, deletes, updates, or other side effects).
450
452
 
451
- See `apps/browser-agent/docs/interactive-debugging-workflow.md` for the complete debugging guide.
453
+ For debugging steps, see the "Workflow: Interactive Debugging" section in this file.
@@ -151,7 +151,7 @@ Use `page.evaluate()` only for operations that have no Playwright locator equiva
151
151
 
152
152
  A quick test: if the evaluate body contains `querySelector`, `querySelectorAll`, `textContent`, `click()`, `getAttribute()`, or iterates DOM elements, it should be rewritten with Playwright locators.
153
153
 
154
- When `page.evaluate()` is used for the acceptable cases above, use a string expression to avoid DOM type errors:
154
+ When `page.evaluate()` is used for the acceptable cases above, keep the logic self-contained and return JSON-serializable values:
155
155
 
156
156
  ```typescript
157
157
  const data = (await page.evaluate(`(() => {
@@ -160,7 +160,7 @@ const data = (await page.evaluate(`(() => {
160
160
  })()`)) as string;
161
161
  ```
162
162
 
163
- Do not use `/// <reference lib="dom" />` or add `"dom"` to the tsconfig lib; this project's tsconfig intentionally excludes DOM types.
163
+ Do not rely on broad DOM querying inside `page.evaluate()` for production flows when Playwright locators can express the same interaction.
164
164
 
165
165
  ## Network Request Methods
166
166
 
@@ -220,4 +220,4 @@ for (const post of posts) {
220
220
 
221
221
  ## Type Checking
222
222
 
223
- The generated file must pass `npx tsc --noEmit` before it's considered done. If there are DOM type errors (`document`, `HTMLElement`, `getComputedStyle`), convert to locator APIs or string-expression `page.evaluate()`.
223
+ The generated file must pass `npx tsc --noEmit` before it's considered done. If there are type errors around DOM access, prefer locator APIs first, then use focused `page.evaluate()` only for browser-native APIs.
@@ -138,9 +138,9 @@ Extract data directly from the rendered page using selectors and `page.evaluate(
138
138
  | Site Profile | Primary Strategy | Supplement With |
139
139
  |---|---|---|
140
140
  | No bot protection, fetch not patched | **A** (`page.evaluate(fetch)`) | Playwright for navigation/auth |
141
- | No bot protection, fetch IS patched | **B** (`page.onResponse`) | Playwright for navigation; DOM extraction as fallback |
142
- | Bot protection detected, fetch not patched | **B** (`page.onResponse`) | Playwright for all navigation; cautious use of `page.evaluate(fetch)` only if needed |
143
- | Bot protection detected, fetch IS patched | **B** (`page.onResponse`) | Playwright for all navigation; DOM extraction as fallback |
141
+ | No bot protection, fetch IS patched | **B** (`page.on('response', ...)`) | Playwright for navigation; DOM extraction as fallback |
142
+ | Bot protection detected, fetch not patched | **B** (`page.on('response', ...)`) | Playwright for all navigation; cautious use of `page.evaluate(fetch)` only if needed |
143
+ | Bot protection detected, fetch IS patched | **B** (`page.on('response', ...)`) | Playwright for all navigation; DOM extraction as fallback |
144
144
  | Server-rendered content (no API calls) | **C** (DOM extraction) | Playwright for all interaction |
145
145
 
146
146
  ---
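The table rows above now name Strategy B as `page.on('response', ...)`, which is Playwright's event-listener form. A minimal sketch of that pattern follows. To stay runnable without a browser it uses small structural interfaces (`ResponseLike`, `PageLike`) modeled on the shape of Playwright's `Page` and `Response`; those interfaces, the helper `captureJsonResponses`, and the example URLs are assumptions for illustration, not Libretto or Playwright APIs.

```typescript
// Structural stand-ins modeled on Playwright's Response and Page (assumed shapes).
interface ResponseLike {
  url(): string;
  status(): number;
  json(): Promise<unknown>;
}

interface PageLike {
  on(event: "response", handler: (response: ResponseLike) => void): unknown;
}

interface Captured {
  url: string;
  status: number;
  body: unknown;
}

// Hypothetical helper: passively record JSON bodies of responses whose URL
// contains `urlPart`, instead of issuing new requests from the page.
function captureJsonResponses(page: PageLike, urlPart: string) {
  const captured: Captured[] = [];
  const pending: Promise<void>[] = [];
  page.on("response", (response) => {
    if (!response.url().includes(urlPart)) return;
    pending.push(
      response.json().then((body) => {
        captured.push({ url: response.url(), status: response.status(), body });
      }),
    );
  });
  // `settled` resolves once every matching body seen so far has been read.
  const settled = async (): Promise<void> => {
    await Promise.all(pending);
  };
  return { captured, settled };
}

// Demo with a fake page so the sketch runs on its own.
const handlers: Array<(response: ResponseLike) => void> = [];
const fakePage: PageLike = {
  on: (_event, handler) => {
    handlers.push(handler);
  },
};
const makeResponse = (url: string, body: unknown): ResponseLike => ({
  url: () => url,
  status: () => 200,
  json: async () => body,
});

const capture = captureJsonResponses(fakePage, "/api/");
for (const handler of handlers) {
  handler(makeResponse("https://example.test/api/posts", { count: 10 }));
  handler(makeResponse("https://example.test/static/app.js", {}));
}
capture.settled().then(() => {
  // Only the /api/ response is kept.
  console.log(capture.captured.map((entry) => entry.url));
});
```

Listening passively avoids calling a patched `fetch` from page context, which is the reason the table prefers this strategy for those site profiles.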
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Libretto contributors
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md CHANGED
@@ -1,156 +1,72 @@
1
1
  # Libretto
2
2
 
3
- AI-powered browser automation library and CLI built on Playwright.
3
+ Libretto gives your coding agent superpowers for building, debugging, and maintaining browser RPA integrations.
4
+
5
+ It is designed for engineering teams that automate workflows in web apps and want to move from brittle browser-only scripts to faster, more reliable network-first integrations.
4
6
 
5
7
  ## Installation
6
8
 
9
+ Install Libretto in your project with your favorite package manager:
10
+
7
11
  ```bash
12
+ npm install libretto playwright zod
13
+ yarn add libretto playwright zod
14
+ bun add libretto playwright zod
8
15
  pnpm add libretto playwright zod
9
- npx libretto init
10
16
  ```
11
17
 
12
- > **pnpm users:** if your workspace uses `onlyBuiltDependencies`, add both
13
- > `libretto` and `playwright` to allow their postinstall scripts to run
14
- > (libretto's postinstall copies skill files and installs Playwright Chromium):
15
- >
16
- > ```jsonc
17
- > // package.json
18
- > {
19
- > "pnpm": {
20
- > "onlyBuiltDependencies": ["libretto", "playwright"]
21
- > }
22
- > }
23
- > ```
24
- >
25
- > If the postinstall was skipped (e.g., `libretto` wasn't in the allowlist),
26
- > run `npx libretto init` manually after install to complete setup.
27
-
28
- ## Quick Start
29
-
30
- ### 1. Configure your LLM
31
-
32
- The easiest way is to use the built-in Vercel AI SDK adapter with any compatible provider:
33
-
34
- ```typescript
35
- import { createLLMClientFromModel } from "libretto/llm";
36
- import { openai } from "@ai-sdk/openai";
37
-
38
- const llmClient = createLLMClientFromModel(openai("gpt-4o"));
18
+ Then initialize Libretto:
19
+
20
+ ```bash
21
+ npx libretto init
39
22
  ```
40
23
 
41
- Or use any other provider:
24
+ ## Usage
42
25
 
43
- ```typescript
44
- import { createLLMClientFromModel } from "libretto";
45
- import { anthropic } from "@ai-sdk/anthropic";
26
+ Libretto is usually used through prompts with the Libretto skill.
46
27
 
47
- const llmClient = createLLMClientFromModel(anthropic("claude-sonnet-4-20250514"));
48
- ```
28
+ ### One-shot script generation
49
29
 
50
- You can also implement the `LLMClient` interface directly for full control:
30
+ ```text
31
+ Use the Libretto skill. Go on LinkedIn and scrape the first 10 posts for content, who posted it, the number of reactions, the first 25 comments, and the first 25 reposts.
32
+ ```
51
33
 
52
- ```typescript
53
- import type { LLMClient } from "libretto";
34
+ ### Interactive script building
54
35
 
55
- const llmClient: LLMClient = {
56
- async generateObject({ prompt, schema, temperature }) {
57
- // Call your LLM, return parsed + validated result
58
- },
59
- async generateObjectFromMessages({ messages, schema, temperature }) {
60
- // Call your LLM with message history (may include images)
61
- },
62
- };
36
+ ```text
37
+ Use the Libretto skill. Let's interactively build a script to scrape scheduling info from the eClinicalWorks EHR.
63
38
  ```
64
39
 
65
- ### 2. Write a workflow
40
+ ### Convert browser automation to network requests
66
41
 
67
- ```typescript
68
- import { workflow } from "libretto";
69
- import { z } from "zod";
70
-
71
- export default workflow({
72
- name: "extract-product",
73
- schema: z.object({ url: z.string() }),
74
- handler: async (ctx) => {
75
- const page = ctx.page;
76
- await page.goto(ctx.params.url);
42
+ ```text
43
+ We have a browser script at ./integration.ts that automates going to Hacker News and getting the first 10 posts. Convert it to direct network scripts instead. Use the Libretto skill.
44
+ ```
77
45
 
78
- const data = await ctx.extract({
79
- instruction: "Extract the product name and price",
80
- schema: z.object({ name: z.string(), price: z.number() }),
81
- });
46
+ ### Fix broken integrations
82
47
 
83
- return data;
84
- },
85
- });
48
+ ```text
49
+ We have a browser script at ./integration.ts that is supposed to go to Availity and perform an eligibility check for a patient. But I'm getting a broken selector error when I run it. Fix it. Use the Libretto skill.
86
50
  ```
87
51
 
88
- ### 3. Run it
52
+ You can also run workflows directly from the CLI:
89
53
 
90
54
  ```bash
91
- npx libretto run ./workflows/extract-product.ts extractProduct \
92
- --params '{"url": "https://example.com/product"}'
55
+ npx libretto help
56
+ npx libretto run ./integration.ts main
93
57
  ```
94
58
 
95
- ## CLI Commands
96
-
97
- ```
98
- npx libretto init # Copy skills, install Playwright Chromium
99
- npx libretto open <url> # Launch browser and open URL
100
- npx libretto run <file> <export> # Run a workflow
101
- npx libretto ai configure <preset> # Configure AI runtime (codex, claude, gemini)
102
- npx libretto snapshot # Capture page screenshot + HTML
103
- npx libretto exec <code> # Execute Playwright code
104
- ```
59
+ ## Authors
105
60
 
106
- Run `npx libretto help` for the full list.
107
-
108
- ## Module Exports
109
-
110
- | Import | Contents |
111
- | -------------------------- | ------------------------------------------------------------- |
112
- | `libretto` | Everything |
113
- | `libretto/llm` | `LLMClient` type, `createLLMClient`, `createLLMClientFromModel` |
114
- | `libretto/recovery` | `attemptWithRecovery`, `executeRecoveryAgent`, `detectSubmissionError` |
115
- | `libretto/extract` | `extractFromPage` |
116
- | `libretto/network` | `pageRequest` |
117
- | `libretto/download` | `downloadViaClick`, `downloadAndSave` |
118
- | `libretto/logger` | `Logger`, `defaultLogger`, sinks |
119
- | `libretto/debug` | `debugPause` |
120
- | `libretto/config` | `isDryRun`, `isDebugMode`, `shouldPauseBeforeMutation` |
121
- | `libretto/instrumentation` | `instrumentPage`, `installInstrumentation` |
122
- | `libretto/visualization` | Ghost cursor and highlight helpers |
123
- | `libretto/run` | `launchBrowser` |
124
- | `libretto/state` | Session state serialization and parsing |
125
-
126
- ## Using Recovery Helpers
127
-
128
- The recovery module (`libretto/recovery`) provides `detectSubmissionError` and
129
- `executeRecoveryAgent` for handling form submission errors. Both accept an
130
- `LLMClient` — create one with `createLLMClientFromModel` and pass it directly:
131
-
132
- ```typescript
133
- import { detectSubmissionError, executeRecoveryAgent } from "libretto/recovery";
134
- import { createLLMClientFromModel } from "libretto/llm";
135
- import { openai } from "@ai-sdk/openai";
136
-
137
- const llmClient = createLLMClientFromModel(openai("gpt-4o"));
138
-
139
- // Detect if a submission produced an error
140
- const error = await detectSubmissionError(
141
- page, submissionError, "eligibility check failed", llmClient, knownErrors, logger,
142
- );
143
-
144
- // Or run the full recovery agent to retry with corrections
145
- const result = await executeRecoveryAgent(
146
- page, error, llmClient, recoveryOptions, logger,
147
- );
148
- ```
61
+ Maintained by the team at [Saffron Health](https://saffron.health).
149
62
 
150
- No need to write custom wrappers — `createLLMClientFromModel` bridges any
151
- Vercel AI SDK provider into the `LLMClient` interface that recovery helpers expect.
63
+ ## Development
152
64
 
153
- ## Links
65
+ For local development in this repository:
154
66
 
155
- - [GitHub](https://github.com/saffron-health/libretto)
156
- - [Issues](https://github.com/saffron-health/libretto/issues)
67
+ ```bash
68
+ pnpm i
69
+ pnpm build
70
+ pnpm type-check
71
+ pnpm test
72
+ ```