aipeek 0.1.5 → 0.2.0
- package/README.md +70 -2
- package/dist/chunk-E7YXT6HW.js +3732 -0
- package/dist/chunk-IOK7MMPT.cjs +3732 -0
- package/dist/cli.cjs +33 -12
- package/dist/cli.js +29 -8
- package/dist/index.cjs +12 -2
- package/dist/index.d.cts +27 -3
- package/dist/index.d.ts +27 -3
- package/dist/index.js +13 -3
- package/dist/plugin.cjs +2 -2
- package/dist/plugin.js +1 -1
- package/package.json +3 -1
- package/dist/chunk-4PG5UWRS.js +0 -596
- package/dist/chunk-P73WZQ2R.cjs +0 -596
package/README.md
CHANGED
@@ -1,6 +1,10 @@
 # aipeek
 
-Gives AI a peek into your running browser app.
+Gives AI a peek into — and a hand on — your running browser app. Reads the UI tree (React fiber), semantic DOM, console, network, errors, and store state; drives the page (click/fill/press/wait/screenshot). All over plain-text HTTP on your Vite dev server — zero resident context cost, unlike a browser MCP whose tool schemas sit in the model's context whether used or not.
+
+**10× faster end-to-end.** What you feel is wall-clock from prompt to done — model thinking, round-trips, all of it. Screenshot agents (Playwright + vision) pay 2–5s of pixel-parsing *every step*; aipeek reads semantic text (instant) and batches a whole interaction into one round-trip with `/chain` — the model thinks once, not N times.
+
+It lives **inside** the open page (injected client + HMR channel), so it reads React/store internals a DOM-only driver can't, and acts on the current tab with no separate browser process. It does **not** open browsers, navigate, run headless, or fire real pointer events — it's the dev inner loop, not E2E. For that, use Playwright.
 
 ## Install
 
@@ -14,7 +18,7 @@ bun add aipeek
 
 ## Setup
 
-```
+```ts
 // vite.config.ts
 import { aipeekPlugin } from 'aipeek'
 
@@ -29,10 +33,74 @@ All endpoints are available on your Vite dev server:
 
 | Endpoint | Description |
 |----------|-------------|
+| `GET /__aipeek/screen` | **State-machine projection** — `{view, modal, focus, knobs}`. Start here. |
 | `GET /__aipeek` | Summary of all sections (UI, console, network, errors, state) |
 | `GET /__aipeek/{section}` | Detail for a section (`ui`, `console`, `network`, `errors`, `state`) |
 | `GET /__aipeek/{section}/{index}` | Detail for a specific item in a section |
 | `GET /__aipeek/{section}?full` | Full detail (no truncation) |
+| `GET /__aipeek/dom[?scope=Name\|?sel=css]` | Semantic DOM — UI as text (see below) |
+| `GET /__aipeek/{action}?...` | Drive the page (see Actions) |
+| `POST /__aipeek/chain` | Run a JSON array of actions in one round-trip (see Actions) |
+| `GET\|POST /__aipeek/eval` | Run arbitrary JS in the page (`?code=` or POST body); returns the result. Escape hatch for what typed endpoints can't do. |
+
+### Perception layers — UI as text, not pixels
+
+For a model, the UI's optimal representation is its **semantics**, not rendered pixels.
+A screenshot forces pixel→meaning re-derivation and costs hundreds of tokens; the same
+information is already textual in the DOM. aipeek exposes four layers, cheapest first:
+
+- **`/screen`** — state-machine projection. The whole UI collapsed to what a human reads
+off a washing-machine panel: `view` (which area), `modal` (is something covering it),
+`focus`, and `knobs` (the few *reachable* controls now — repeated rows fold to `source ×N`,
+and when a modal is open only its subtree counts). A handful of lines. *Start here.*
+- **`/ui`** — React component tree. Full structure. Deep-dive when `/screen` isn't enough.
+- **`/dom`** — semantic DOM: `tag·role·semantic-class·data-*·state` per element, with
+Tailwind/atomic noise stripped and each line tagged with its **source location**
+(`@File.tsx:line`, via `code-inspector` if present). This is what tells the model
+*what an element is, its live state, and where to edit it.*
+- **`/screenshot`** — pixels. Lossy DOM→PNG (`html-to-image`). Only for visual checks
+a human looks at; not the model's primary channel.
+
+**Scoped DOM — work top-down.** The full DOM is huge; a scoped view is accurate. Read
+`/ui` to find a component, then scope the DOM to it:
+
+```bash
+curl localhost:5195/__aipeek/ui # find <ChatInput>
+curl 'localhost:5195/__aipeek/dom?scope=ChatInput' # just that subtree (matches source path)
+curl 'localhost:5195/__aipeek/dom?sel=.chat-list' # or any CSS subtree
+```
+
+`scope=` matches against the `data-insp-path` source path, so directory structure acts
+as the component boundary. Each line's `@File.tsx:line` then tells you exactly where to edit.
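
Reviewer note: the `@File.tsx:line` tag described in the added README text could be pulled out of a `/dom` response line with a small helper. This is only a sketch. The README does not show the exact line layout, so the sample line and the `parseSourceTag` name are hypothetical, and the regex assumes the tag appears as `@<file>:<line>` somewhere in the line.

```typescript
// Hypothetical helper (not part of aipeek): extract the `@File.tsx:line`
// source tag from a single line of /dom output.
interface SourceTag {
  file: string;
  line: number;
}

function parseSourceTag(domLine: string): SourceTag | null {
  // Assumed shape: "@", a file name without whitespace or colons, ":", a decimal line number.
  const match = /@([^\s:@]+):(\d+)/.exec(domLine);
  if (match === null) return null;
  const [, file = "", line = ""] = match;
  return { file, line: Number(line) };
}

// Sample line invented for illustration; real /dom output may look different.
const sampleDomLine = "textarea·textbox·chat-input·focused @ChatInput.tsx:42";
console.log(parseSourceTag(sampleDomLine));
```

Returning `null` for untagged lines keeps the helper usable when `code-inspector` is absent and no source locations are emitted.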
+
+### Actions — drive the current tab
+
+| Endpoint | Params | Effect |
+|----------|--------|--------|
+| `/click` | `sel=` (CSS) or `text=` (visible text) | dispatch a real click |
+| `/fill` | `sel=`/`text=` + `value=` | set value on input/textarea/select; **contenteditable** via `execCommand` |
+| `/press` | `key=` (e.g. `Enter`, `Control+a`) | keydown/keyup on the focused element |
+| `/wait` | `text=`/`sel=`, `timeout=` (ms, default 5000) | poll until it appears; 504 on timeout |
+| `/screenshot` | `sel=`, `out=` | DOM→PNG into `.aipeek/`; skips cross-origin/broken images |
+| `POST /chain` | JSON array of `{type, sel?, text?, value?, key?, timeout?}` | run in sequence, settle between steps, stop on first failure |
+
+`click`/`fill`/`press` **settle the DOM and append the UI tree after** (`--- ui after ---`)
+to the response — no follow-up read needed. On a target miss, `/click` and `/fill` return the
+reachable clickable elements (clipped to the open modal's subtree) so you can re-target.
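
Reviewer note: since action responses are described as carrying the UI tree after a `--- ui after ---` marker, a caller might split the plain-text body at that marker. The sketch below assumes the marker appears literally and at most once. The `splitActionResponse` name and the sample body are hypothetical, and the actual response layout is not shown in this README.

```typescript
// Marker text quoted from the README; its exact placement in the body is an assumption.
const UI_AFTER_MARKER = "--- ui after ---";

interface ActionResponse {
  result: string;         // text before the marker (the action's own output)
  uiAfter: string | null; // UI tree after the action, or null if no marker was found
}

// Hypothetical helper (not part of aipeek).
function splitActionResponse(body: string): ActionResponse {
  const idx = body.indexOf(UI_AFTER_MARKER);
  if (idx === -1) return { result: body.trim(), uiAfter: null };
  return {
    result: body.slice(0, idx).trim(),
    uiAfter: body.slice(idx + UI_AFTER_MARKER.length).trim(),
  };
}

// Sample body invented for illustration only.
const sampleBody = "clicked button\n--- ui after ---\n<App>\n  <ChatInput>";
console.log(splitActionResponse(sampleBody));
```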
+
+A CSS `sel=` with non-ASCII or quotes/brackets must be URL-encoded, or the query parser
+mangles it: `curl -G .../click --data-urlencode 'sel=button[title="知识库"]'`.
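
Reviewer note: outside of `curl`, the same encoding can be done with the standard `encodeURIComponent` function. The sketch below builds an action URL from name/value pairs. The `actionUrl` helper is hypothetical, and the `localhost:5195` origin is taken from the README's own examples and will differ per project.

```typescript
// Assumed dev-server origin, copied from the README's curl examples.
const AIPEEK_BASE = "http://localhost:5195/__aipeek";

// Hypothetical helper (not part of aipeek): percent-encode every query
// parameter so brackets, quotes, and non-ASCII characters survive parsing.
function actionUrl(
  action: string,
  params: ReadonlyArray<readonly [string, string]>,
): string {
  const query = params
    .map(([name, value]) => `${encodeURIComponent(name)}=${encodeURIComponent(value)}`)
    .join("&");
  return query.length > 0
    ? `${AIPEEK_BASE}/${action}?${query}`
    : `${AIPEEK_BASE}/${action}`;
}

const clickUrl = actionUrl("click", [["sel", 'button[title="知识库"]']]);
console.log(clickUrl);
```

`encodeURIComponent` escapes `[`, `]`, `=`, and `"` and emits UTF-8 percent-escapes for the Chinese characters, which is the same effect `--data-urlencode` has in the `curl` form.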
+
+**Chain** packs a whole interaction into one round-trip:
+
+```bash
+curl -X POST localhost:5195/__aipeek/chain -d '[
+{"type":"click","sel":"button[title=\"知识库\"]"},
+{"type":"wait","text":"Done"},
+{"type":"fill","sel":"textarea","value":"hi"},
+{"type":"press","key":"Enter"}
+]'
+```
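
Reviewer note: the same chain could be assembled in TypeScript, where `JSON.stringify` takes care of the quote escaping that the shell form does by hand. This is a sketch under assumptions. The `ChainStep` type only models the four step types shown in the example, with fields taken from the `{type, sel?, text?, value?, key?, timeout?}` shape in the table. `buildChainRequest` is a hypothetical name, and the origin again comes from the README's examples.

```typescript
// Step shapes inferred from the README table and example; other step types may exist.
type ChainStep =
  | { type: "click"; sel?: string; text?: string }
  | { type: "fill"; sel?: string; text?: string; value: string }
  | { type: "press"; key: string }
  | { type: "wait"; sel?: string; text?: string; timeout?: number };

interface ChainRequest {
  url: string;
  method: "POST";
  body: string;
}

// Hypothetical helper (not part of aipeek): serialize the steps into a
// request description without sending anything.
function buildChainRequest(
  steps: ChainStep[],
  origin: string = "http://localhost:5195",
): ChainRequest {
  return {
    url: `${origin}/__aipeek/chain`,
    method: "POST",
    body: JSON.stringify(steps),
  };
}

const chainRequest = buildChainRequest([
  { type: "click", sel: 'button[title="知识库"]' },
  { type: "wait", text: "Done" },
  { type: "fill", sel: "textarea", value: "hi" },
  { type: "press", key: "Enter" },
]);
console.log(chainRequest.body);
```

In a runtime with `fetch`, the request could then be sent with `fetch(chainRequest.url, { method: chainRequest.method, body: chainRequest.body })`. The README does not specify the response format or the status returned when a step fails, so error handling is left out here.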
 
 ## CLI
 