@blazediff/agent 0.0.1 → 0.1.0
- package/README.md +65 -86
- package/SKILL.md +2 -3
- package/dist/cli.js +2281 -1912
- package/dist/index.d.mts +73 -43
- package/dist/index.d.ts +73 -43
- package/dist/index.js +1186 -791
- package/dist/index.mjs +1187 -794
- package/package.json +2 -1
package/README.md
CHANGED
|
@@ -1,98 +1,89 @@
|
|
|
1
1
|
# @blazediff/agent
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
<div align="center">
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
[](https://www.npmjs.com/package/@blazediff/agent)
|
|
6
|
+
[](https://www.npmjs.com/package/@blazediff/agent)
|
|
6
7
|
|
|
7
|
-
|
|
8
|
+
</div>
|
|
8
9
|
|
|
9
|
-
|
|
10
|
-
npm install -g @blazediff/agent
|
|
11
|
-
# or as a dev dep
|
|
12
|
-
npm install --save-dev @blazediff/agent
|
|
13
|
-
```
|
|
14
|
-
|
|
15
|
-
First run will prompt to install Chromium via the bundled Playwright. No sudo, no `npx playwright install --with-deps`.
|
|
16
|
-
|
|
17
|
-
## Quickstart
|
|
10
|
+
Agentic visual regression for BlazeDiff. Discovers routes, screenshots them with Playwright, diffs against committed baselines, and hands ambiguous diffs to a coding agent (Claude Code, Codex, Cursor) for judgment.
|
|
18
11
|
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
12
|
+
**Features:**
|
|
13
|
+
- Deterministic CLI — no embedded LLM, no API key required
|
|
14
|
+
- Source-walking route discovery for Next.js / Vite / Remix (BFS fallback)
|
|
15
|
+
- Heuristic verdict: `regression-likely | intentional-likely | noise-likely | ambiguous`
|
|
16
|
+
- LangGraph pipeline with per-entry subgraphs, suspendable via `interrupt()` and resumable from an on-disk checkpoint
|
|
17
|
+
- Region-tile handoff to host agents (10–100× smaller than full PNGs)
|
|
18
|
+
- Auto-masking via `data-blazediff-agent-mask` attribute
|
|
25
19
|
|
|
26
|
-
|
|
27
|
-
blazediff-agent run --judge host --json # pipelined: capture → diff → verdict → judge
|
|
28
|
-
# or
|
|
29
|
-
blazediff-agent check --judge host --json # single-pool, simpler
|
|
20
|
+
## Installation
|
|
30
21
|
|
|
31
|
-
|
|
32
|
-
|
|
22
|
+
```bash
|
|
23
|
+
npm install --save-dev @blazediff/agent
|
|
33
24
|
```
|
|
34
25
|
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
## Onboarding a coding agent
|
|
26
|
+
First run prompts to install bundled Chromium.
|
|
38
27
|
|
|
39
|
-
|
|
28
|
+
## Quickstart
|
|
40
29
|
|
|
41
|
-
```
|
|
42
|
-
blazediff-agent
|
|
43
|
-
blazediff-agent
|
|
44
|
-
|
|
45
|
-
blazediff-agent
|
|
30
|
+
```bash
|
|
31
|
+
blazediff-agent init # write .blazediff/config.json
|
|
32
|
+
blazediff-agent browsers install # ensure chromium
|
|
33
|
+
echo '[{"id":"home","url":"/"}]' \
|
|
34
|
+
| blazediff-agent capture --stdin --mode baseline
|
|
35
|
+
blazediff-agent check --judge host # CI
|
|
36
|
+
blazediff-agent rewrite home # accept intentional change
|
|
46
37
|
```
|
|
47
38
|
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
- **Claude Code** writes `<project>/.claude/skills/blazediff/SKILL.md`
|
|
51
|
-
- **Codex** writes `~/.codex/prompts/blazediff.md` (user-global; Codex CLI looks here for slash-command prompts)
|
|
52
|
-
- **Cursor** writes `<project>/.cursor/rules/blazediff.mdc` with the right frontmatter
|
|
53
|
-
|
|
54
|
-
Detection is project-local (looks for `.claude/` / `CLAUDE.md` / `AGENTS.md` for Claude Code, `AGENTS.md` / `.codex/` for Codex, `.cursor/` / `.cursorrules` for Cursor). Both Claude Code and Codex read `AGENTS.md`, so a project with only `AGENTS.md` will install for both. On a TTY with no detection, the command prompts.
|
|
39
|
+
Commit `.blazediff/` (config + manifest + baselines). All commands accept `--json`.
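The Quickstart pipes a JSON route list into `capture --stdin`. The sketch below shows one way such a list could be generated from a set of paths. Only the `{ id, url }` entry shape comes from the Quickstart example; the `RouteEntry` type, the `routeId` and `buildRouteList` helpers, and the id naming scheme are hypothetical and not part of the package.

```typescript
// Entry shape taken from the Quickstart example: [{"id":"home","url":"/"}].
// Any further fields the CLI may accept are not modelled here.
interface RouteEntry {
  id: string;
  url: string;
}

// Hypothetical naming scheme (an assumption, not a CLI convention):
// "/" becomes "home", "/docs/api" becomes "docs-api".
function routeId(path: string): string {
  const trimmed = path.replace(/^\/+|\/+$/g, "");
  return trimmed === "" ? "home" : trimmed.replace(/[^a-zA-Z0-9]+/g, "-");
}

// Build the JSON text for a single `capture --stdin` call.
function buildRouteList(paths: string[]): string {
  const seen: Record<string, boolean> = {};
  const entries: RouteEntry[] = [];
  for (const path of paths) {
    const id = routeId(path);
    if (seen[id]) continue; // skip paths that would produce a duplicate id
    seen[id] = true;
    entries.push({ id, url: path });
  }
  return JSON.stringify(entries);
}

console.log(buildRouteList(["/", "/pricing", "/docs/api"]));
```

The printed string can be piped into `blazediff-agent capture --stdin --mode baseline` in place of the `echo` line shown in the Quickstart.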
|
|
55
40
|
|
|
56
41
|
## Commands
|
|
57
42
|
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
##
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
43
|
+
<table>
|
|
44
|
+
<tr><th width="200">Command</th><th>Description</th></tr>
|
|
45
|
+
<tr><td><code>onboard</code></td><td>Install the playbook into Claude Code / Codex / Cursor</td></tr>
|
|
46
|
+
<tr><td><code>init</code></td><td>Detect framework, write <code>.blazediff/config.json</code></td></tr>
|
|
47
|
+
<tr><td><code>discover</code></td><td>BFS-crawl routes from <code>baseUrl</code></td></tr>
|
|
48
|
+
<tr><td><code>capture --stdin</code></td><td>Screenshot routes from stdin JSON, write baselines/actuals</td></tr>
|
|
49
|
+
<tr><td><code>check</code></td><td>Re-capture, diff against baseline, emit <code>CheckReport</code>. <code>--judge host</code> suspends on the first ambiguous entry; <code>--apply-judgments</code> resumes from <code>.blazediff/checkpoints/</code> once verdicts are written.</td></tr>
|
|
50
|
+
<tr><td><code>rewrite <id...></code></td><td>Re-baseline existing entries (also <code>--failed</code> / <code>--all</code>). Cleans stale <code>actual/</code>, <code>judgments/</code>, <code>summary.md</code>, <code>checkpoints/</code> for the rewritten ids.</td></tr>
|
|
51
|
+
<tr><td><code>diff <id></code></td><td>Re-diff one entry without re-screenshotting</td></tr>
|
|
52
|
+
<tr><td><code>manifest</code></td><td>Inspect / list manifest entries</td></tr>
|
|
53
|
+
<tr><td><code>serve-status</code></td><td>Start / stop / probe the dev server</td></tr>
|
|
54
|
+
<tr><td><code>browsers install</code></td><td>Install bundled Playwright Chromium</td></tr>
|
|
55
|
+
<tr><td><code>reset --yes</code></td><td>Wipe <code>.blazediff/</code></td></tr>
|
|
56
|
+
</table>
|
|
57
|
+
|
|
58
|
+
Pass `--cwd <abs-path>` to target a sub-package in a monorepo.
|
|
59
|
+
|
|
60
|
+
## Onboarding
|
|
61
|
+
|
|
62
|
+
```bash
|
|
63
|
+
blazediff-agent onboard # auto-detect harness
|
|
64
|
+
blazediff-agent onboard --harness codex # explicit
|
|
65
|
+
blazediff-agent onboard --harness all
|
|
66
|
+
```
|
|
82
67
|
|
|
83
|
-
|
|
68
|
+
Writes:
|
|
69
|
+
- Claude Code → `<project>/.claude/skills/blazediff/SKILL.md`
|
|
70
|
+
- Codex → `~/.codex/skills/blazediff/SKILL.md`
|
|
71
|
+
- Cursor → `<project>/.cursor/rules/blazediff.mdc`
|
|
84
72
|
|
|
85
|
-
|
|
73
|
+
## Masking
|
|
86
74
|
|
|
87
|
-
|
|
75
|
+
Mark non-deterministic content (carousels, clocks, randomized avatars) in source:
|
|
88
76
|
|
|
89
77
|
```tsx
|
|
90
78
|
<div data-blazediff-agent-mask>...</div>
|
|
91
|
-
// or with a reason inline:
|
|
92
79
|
<div data-blazediff-agent-mask="report-carousel">...</div>
|
|
93
80
|
```
|
|
94
81
|
|
|
95
|
-
For
|
|
82
|
+
For third-party embeds you can't annotate, use a per-entry `manifest.entries[].mask` CSS selector and re-capture.
|
|
83
|
+
|
|
84
|
+
## Judging
|
|
85
|
+
|
|
86
|
+
Every non-match routes through the configured judge. With `--judge host` the judge node `interrupt()`s the LangGraph pipeline, writes a `JudgmentRequest` (region tiles + locator thumbnail) to `.blazediff/judgments/<id>/`, and the suspended graph is checkpointed to `.blazediff/checkpoints/`. The host agent reads the tiles, writes `verdict.json`, and `check --apply-judgments` resumes the same graph with the verdicts — no re-capture, no re-diff.
|
|
96
87
|
|
|
97
88
|
## Configuration
|
|
98
89
|
|
|
@@ -107,30 +98,18 @@ For external embeds you can't annotate (third-party iframes, framework-owned ele
|
|
|
107
98
|
}
|
|
108
99
|
```
|
|
109
100
|
|
|
110
|
-
`.blazediff/manifest.json` is written by `capture`
|
|
101
|
+
`.blazediff/manifest.json` is written by `capture` — don't edit it directly.
|
|
111
102
|
|
|
112
103
|
## CI
|
|
113
104
|
|
|
114
|
-
Only `check`
|
|
105
|
+
Only `check` is allowed under `CI=1`. Exit codes:
|
|
115
106
|
|
|
116
|
-
- `0`
|
|
117
|
-
- `1`
|
|
107
|
+
- `0` — all passed
|
|
108
|
+
- `1` — regression, intentional, or pending judgment
|
|
118
109
|
- non-zero with structured error JSON on infra failures
|
|
119
110
|
|
|
120
|
-
## Files
|
|
121
|
-
|
|
122
|
-
- `src/cli.ts` - entry point
|
|
123
|
-
- `src/check.ts` / `src/graph/` - single-pool and LangGraph-pipelined runners
|
|
124
|
-
- `src/judge/` - pluggable judge (`host` / `none`), region-tile generator, verdict applier
|
|
125
|
-
- `src/browser/launch.ts` - Chromium serialization + mask overlay painter
|
|
126
|
-
- `src/discover/` - source-walking for Next.js / Vite / Remix + BFS fallback
|
|
127
|
-
- `src/diff/` - heuristic verdict pipeline
|
|
128
|
-
- `src/report/markdown.ts` - `summary.md` generator (5-column `id | baseline | actual | diff | verdict`)
|
|
129
|
-
- `ROADMAP.md` - phase tracking
|
|
130
|
-
- Playbook: `skill/blazediff/SKILL.md` (repo root)
|
|
131
|
-
|
|
132
111
|
## Links
|
|
133
112
|
|
|
134
113
|
- [GitHub](https://github.com/teimurjan/blazediff/tree/main/packages/agent)
|
|
135
|
-
- [
|
|
114
|
+
- [Documentation](https://blazediff.dev/docs)
|
|
136
115
|
- [Roadmap](./ROADMAP.md)
|
package/SKILL.md
CHANGED
|
@@ -13,7 +13,7 @@ Sibling files in this skill directory — read on demand:
|
|
|
13
13
|
|
|
14
14
|
## Be terse
|
|
15
15
|
- Pass `--json` on every `blazediff-agent` call; parse fields. Do not echo CLI output.
|
|
16
|
-
- `check
|
|
16
|
+
- `check --json` returns a **slim payload**: `{ summaryPath, createdAt, totalEntries, passed, failed, pendingJudgments, results }`. `results` lists non-pass entries only, each as `{ id, url, status, verdict?: { label, headline, action } }`. The full per-entry detail (regions, paths, rationale) lives in `<TARGET>/.blazediff/summary.md` and `<TARGET>/.blazediff/judgments/<id>/request.json`.
|
|
17
17
|
- Authoring uses ONE `capture --stdin` call piped a JSON list of routes — never a per-route loop.
|
|
18
18
|
- No `ls`, `cat`, `find` for paths the CLI already returns.
|
|
19
19
|
- One final summary line — for authoring: `N captured | M skipped (reasons) | K auth-gated`; for check: `P/T passed (F failed)` plus failure ids.
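The slim payload and the final summary line described in the bullets above can be sketched as TypeScript types plus a formatter. The field names come from the playbook text; the field types are assumptions (for example, `pendingJudgments` is left as `unknown` because its type is not stated), and `summaryLine` and the sample values are hypothetical.

```typescript
// Non-pass entry as listed in `results`; field types are assumed.
interface SlimResult {
  id: string;
  url: string;
  status: string;
  verdict?: { label: string; headline: string; action: string };
}

// Slim `check --json` payload; field names from the playbook, types assumed.
interface SlimCheckReport {
  summaryPath: string;
  createdAt: string;
  totalEntries: number;
  passed: number;
  failed: number;
  pendingJudgments: unknown; // type not stated in the playbook
  results: SlimResult[];
}

// Format the one-line check summary: `P/T passed (F failed)` plus failure ids.
function summaryLine(report: SlimCheckReport): string {
  const head = `${report.passed}/${report.totalEntries} passed (${report.failed} failed)`;
  const ids = report.results.map((r) => r.id);
  return ids.length === 0 ? head : `${head}: ${ids.join(", ")}`;
}

// Made-up sample values for illustration only.
const sample: SlimCheckReport = {
  summaryPath: ".blazediff/summary.md",
  createdAt: "2025-01-01T00:00:00Z",
  totalEntries: 5,
  passed: 4,
  failed: 1,
  pendingJudgments: 0,
  results: [{ id: "pricing", url: "/pricing", status: "failed" }],
};

console.log(summaryLine(sample)); // prints "4/5 passed (1 failed): pricing"
```

Typing `verdict` as optional mirrors the `verdict?` marker in the payload description, so entries without a verdict still parse.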
|
|
@@ -31,8 +31,7 @@ Sibling files in this skill directory — read on demand:
|
|
|
31
31
|
- Else → **authoring**.
|
|
32
32
|
|
|
33
33
|
## check
|
|
34
|
-
1. `blazediff-agent --cwd "$TARGET" check --judge host --json` (the CLI starts the dev server if `devServer` is configured; otherwise hits the configured baseUrl directly).
|
|
35
|
-
- Prefer `run` instead for large sites (≥10 routes): `blazediff-agent --cwd "$TARGET" run --judge host --json`. Same flags, same report shape — it just pipelines capture → diff → verdict → judge through a LangGraph state graph so per-entry stages overlap. Use `check` when you want the simpler, single-pool implementation; `run` when wall-time matters or you want LangSmith traces.
|
|
34
|
+
1. `blazediff-agent --cwd "$TARGET" check --judge host --json` (the CLI starts the dev server if `devServer` is configured; otherwise hits the configured baseUrl directly). Capture, diff, verdict, and judge run through a LangGraph state graph so per-entry stages overlap.
|
|
36
35
|
- **Cold Next.js / Vite servers** can take 5–30s to compile a route on first hit; under default 30s `page.goto` timeout the first route in a fresh dev session sometimes times out (`page.goto: Timeout 30000ms exceeded`). If that happens, **rerun the same `check` command** — the dev server is now warm and the next pass usually completes. Don't change `waitFor` or restart the dev server; the issue is one-time compilation, not a routing or wait-condition bug.
|
|
37
36
|
- **Codex sandbox / restricted-bash environments** may block Playwright's chromium launch (`browserType.launch: Target page, context or browser has been closed`). Rerun with the sandbox/escape escalation the host agent provides (in Codex: approve the command for "always run outside sandbox"). Not a blazediff bug.
|
|
38
37
|
2. Pass: report `P/T passed`. Stop.
|