@athenaflow/plugin-agent-web-interface 1.0.6

Files changed (35)
  1. package/.claude-plugin/plugin.json +19 -0
  2. package/.codex-plugin/plugin.json +16 -0
  3. package/.mcp.json +8 -0
  4. package/dist/1.0.4/.agents/plugins/marketplace.json +14 -0
  5. package/dist/1.0.4/claude/plugin/.claude-plugin/plugin.json +19 -0
  6. package/dist/1.0.4/claude/plugin/.mcp.json +8 -0
  7. package/dist/1.0.4/claude/plugin/package.json +9 -0
  8. package/dist/1.0.4/claude/plugin/skills/agent-web-interface-guide/SKILL.md +302 -0
  9. package/dist/1.0.4/claude/plugin/skills/agent-web-interface-guide/agents/claude.yaml +3 -0
  10. package/dist/1.0.4/claude/plugin/skills/agent-web-interface-guide/agents/openai.yaml +10 -0
  11. package/dist/1.0.4/codex/plugin/.codex-plugin/plugin.json +16 -0
  12. package/dist/1.0.4/codex/plugin/.mcp.json +8 -0
  13. package/dist/1.0.4/codex/plugin/package.json +9 -0
  14. package/dist/1.0.4/codex/plugin/skills/agent-web-interface-guide/SKILL.md +302 -0
  15. package/dist/1.0.4/codex/plugin/skills/agent-web-interface-guide/agents/claude.yaml +3 -0
  16. package/dist/1.0.4/codex/plugin/skills/agent-web-interface-guide/agents/openai.yaml +10 -0
  17. package/dist/1.0.4/release.json +18 -0
  18. package/dist/1.0.6/.agents/plugins/marketplace.json +14 -0
  19. package/dist/1.0.6/claude/plugin/.claude-plugin/plugin.json +19 -0
  20. package/dist/1.0.6/claude/plugin/.mcp.json +8 -0
  21. package/dist/1.0.6/claude/plugin/package.json +9 -0
  22. package/dist/1.0.6/claude/plugin/skills/agent-web-interface-guide/SKILL.md +303 -0
  23. package/dist/1.0.6/claude/plugin/skills/agent-web-interface-guide/agents/claude.yaml +3 -0
  24. package/dist/1.0.6/claude/plugin/skills/agent-web-interface-guide/agents/openai.yaml +10 -0
  25. package/dist/1.0.6/codex/plugin/.codex-plugin/plugin.json +16 -0
  26. package/dist/1.0.6/codex/plugin/.mcp.json +8 -0
  27. package/dist/1.0.6/codex/plugin/package.json +9 -0
  28. package/dist/1.0.6/codex/plugin/skills/agent-web-interface-guide/SKILL.md +303 -0
  29. package/dist/1.0.6/codex/plugin/skills/agent-web-interface-guide/agents/claude.yaml +3 -0
  30. package/dist/1.0.6/codex/plugin/skills/agent-web-interface-guide/agents/openai.yaml +10 -0
  31. package/dist/1.0.6/release.json +18 -0
  32. package/package.json +13 -0
  33. package/skills/agent-web-interface-guide/SKILL.md +303 -0
  34. package/skills/agent-web-interface-guide/agents/claude.yaml +3 -0
  35. package/skills/agent-web-interface-guide/agents/openai.yaml +10 -0
@@ -0,0 +1,302 @@
+ ---
+ name: agent-web-interface-guide
+ description: >
+   Use this skill to act on live web pages in a browser. It can open a page, click through flows,
+   type into fields, submit forms, add products to cart, review page state, and capture Playwright
+   selectors for important elements. Use it whenever the task includes a URL or page reference and
+   you need to check, verify, inspect, extract selectors from, or actively interact with that page.
+ ---
+
+ # Agent Web Interface Guide
+
+ Use this skill to open live web pages, carry out actions, move through multi-step flows, validate page state, and capture selectors for automation.
+
+ Common uses:
+ - Review a live page or multi-step flow
+ - Click through navigation, buttons, dialogs, and other actions
+ - Fill, submit, or inspect forms and validation states
+ - Add products to cart or complete other in-page actions
+ - Capture reliable Playwright selectors for key elements
+
+ ## Input
+
+ Parse the target URL and exploration goal from: $ARGUMENTS
+
+ ## Workflow
+
+ 1. **Navigate or recover the right page** — use `list_pages` and explicit `page_id` when session state may be ambiguous
+ 2. **Orient first** — read the current state, active region, and visible controls before acting
+ 3. **Choose the lightest useful tool**
+    - Use page state or `snapshot` output for quick orientation
+    - Use `find` with `label`, `kind`, and `region` to narrow targets
+    - Use `get_form` when the task is clearly form-driven
+    - Use `get_element` for a chosen target, offsets, or selector extraction
+ 4. **Act one step at a time** — click, type, select, scroll, or drag only as needed to advance the task
+ 5. **Reacquire state after meaningful changes** — after navigation, overlays, search expansion, dialog opening, or large DOM updates, refresh your understanding before reusing old `eid`s
+ 6. **Inspect forms or extract selectors only when relevant** — do this when the user asks for them or when they materially help complete the task
+ 7. **Report** what you did, what happened, and any selectors or form details that matter
+
+ ## Output Format
+
+ Always include:
+ 1. **What you accomplished** — the result, finding, or outcome
+ 2. **Steps taken** — pages visited, buttons clicked, forms filled
+ 3. **Observations** — notable page states, messages, and behaviors
+ 4. **Selectors** (when relevant) — Playwright-compatible selectors for key elements
+ 5. **Form details** (when relevant) — only include when they helped drive the task
+
+ ## Operating Heuristics
+
+ - Prefer `find` over manual scanning when snapshots are trimmed or the page is dense
+ - Filter `find` aggressively with `kind`, `label`, and `region` before broad exploration
+ - Expect search UIs to appear as buttons or comboboxes before they expose a text field
+ - Expect overlays, drawers, and dialogs to mutate the page in place without changing the URL
+ - Treat `eid`s as short-lived after large mutations; reacquire targets instead of assuming old ids still work
+ - Trust `get_form` as a helper, not as ground truth; busy pages may contain multiple unrelated forms
+ - Use `observations`, `baseline`, and `diff` to confirm whether an action actually changed the page
+ - Prefer sequential progress on gated flows; if a control is disabled, look for the prerequisite choice above it
+
+ ## State Snapshot Structure
+
+ Every navigation or action returns a `<state>` snapshot:
+
+ ```xml
+ <state step="N" title="Page Title" url="https://...">
+   <meta view="1521x752" scroll="0,0" layer="main" />
+   <baseline reason="first|navigation" />
+   <diff type="mutation" added="N" removed="N" />
+   <observations>...</observations>
+   <region name="main">...</region>
+ </state>
+ ```
+
+ ### Key Elements
+
+ | Element | Purpose |
+ |---------|---------|
+ | `<meta>` | Viewport size, scroll position, active layer |
+ | `<baseline reason="...">` | Fresh snapshot - `"first"` (initial load) or `"navigation"` (URL change) |
+ | `<diff type="mutation">` | Incremental update with `added`/`removed` counts |
+ | `<observations>` | What appeared/disappeared after the action |
+ | `<region>` | Semantic page areas with interactive elements |
+
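Reviewer sketch (not part of the package): one way a consumer might tell a full snapshot from an incremental one, using only the Python standard library. It assumes a real snapshot carries either `<baseline>` or `<diff>` rather than both, which the template above suggests but does not state. The sample values and the `summarize_state` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the <state> template in the skill file.
SNAPSHOT = """
<state step="3" title="Example" url="https://example.com/">
  <meta view="1521x752" scroll="0,0" layer="main" />
  <diff type="mutation" added="2" removed="1" />
  <observations></observations>
  <region name="main"></region>
</state>
"""


def summarize_state(xml_text):
    """Return a small summary of a <state> snapshot."""
    state = ET.fromstring(xml_text.strip())
    meta = state.find("meta")
    baseline = state.find("baseline")
    diff = state.find("diff")
    return {
        "step": state.get("step"),
        "url": state.get("url"),
        "layer": meta.get("layer") if meta is not None else None,
        # Assumption: a <baseline> element marks a full, fresh snapshot.
        "full": baseline is not None,
        "baseline_reason": baseline.get("reason") if baseline is not None else None,
        # A <diff> element only reports how many nodes were added or removed.
        "added": int(diff.get("added", "0")) if diff is not None else 0,
        "removed": int(diff.get("removed", "0")) if diff is not None else 0,
    }


summary = summarize_state(SNAPSHOT)
print(summary)
```

When `full` is false, the regions in the snapshot may be partial, so earlier state should be kept in mind or refreshed.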
83
+ ## Observations
84
+
85
+ After actions (click, type, select), watch for changes:
86
+
87
+ ```xml
88
+ <observations>
89
+ <appeared when="action">Your Bag is empty</appeared>
90
+ <appeared when="action" role="status"></appeared>
91
+ <disappeared when="action" role="status"></disappeared>
92
+ </observations>
93
+ ```
94
+
95
+ - `<appeared>`: New content visible after action
96
+ - `<disappeared>`: Content removed after action
97
+ - `role` attribute: Semantic role (status, alert, dialog)
98
+
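Reviewer sketch (not part of the package): a minimal way to turn an `<observations>` block into a list of changes so a caller can check whether an action had a visible effect. The `list_changes` helper is hypothetical. The sample is the block shown above.

```python
import xml.etree.ElementTree as ET

OBSERVATIONS = """
<observations>
  <appeared when="action">Your Bag is empty</appeared>
  <appeared when="action" role="status"></appeared>
  <disappeared when="action" role="status"></disappeared>
</observations>
"""


def list_changes(xml_text):
    """Return (change, role, text) tuples for each observed change."""
    root = ET.fromstring(xml_text.strip())
    changes = []
    for node in root:
        if node.tag not in ("appeared", "disappeared"):
            continue  # ignore anything this sketch does not know about
        changes.append((node.tag, node.get("role"), (node.text or "").strip()))
    return changes


changes = list_changes(OBSERVATIONS)
for change, role, text in changes:
    print(change, role, repr(text))
```

An empty list would suggest the action did not visibly change the page, which is a cue to re-check the target before retrying.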
99
+ ## Regions
100
+
101
+ Page content is organized into semantic regions:
102
+
103
+ ```xml
104
+ <region name="main">
105
+ <link id="..." href="...">Link text</link>
106
+ <btn id="...">Button text</btn>
107
+ <!-- trimmed 50 items. Use find with region=main to see all -->
108
+ </region>
109
+ <region name="nav" unchanged="true" count="90" />
110
+ ```
111
+
112
+ ### Region Types
113
+ - `main` - Primary content area
114
+ - `nav` - Navigation menus
115
+ - `header` - Page header
116
+ - `footer` - Page footer
117
+ - `form` - Form containers
118
+ - `aside` - Sidebars
119
+ - `search` - Search areas
120
+
121
+ ### Optimization Hints
122
+ - `unchanged="true" count="N"` - Region didn't change, shows element count
123
+ - `<!-- trimmed N items -->` - Use `find` with `region` filter to see all
124
+
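Reviewer sketch (not part of the package): a possible way to spot the two optimization hints programmatically. It assumes the trimmed marker always matches the wording `trimmed N items`, and it wraps the regions in a `<state>` root so the sample is well-formed. The element ids and the `region_hints` helper are hypothetical. Comment nodes are only kept when the parser is built with `insert_comments=True` (Python 3.8 or later).

```python
import re
import xml.etree.ElementTree as ET

SNAPSHOT = """
<state>
  <region name="main">
    <link id="a1" href="/docs">Docs</link>
    <btn id="b1">Search</btn>
    <!-- trimmed 50 items. Use find with region=main to see all -->
  </region>
  <region name="nav" unchanged="true" count="90" />
</state>
"""

TRIMMED = re.compile(r"trimmed (\d+) items")


def region_hints(xml_text):
    """Map region name -> unchanged flag, element count, and trimmed item count."""
    # The default parser drops comments, so ask the tree builder to keep them.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text.strip(), parser=parser)
    hints = {}
    for region in root.iter("region"):
        trimmed = 0
        for child in region:
            if child.tag is ET.Comment:
                found = TRIMMED.search(child.text or "")
                if found:
                    trimmed += int(found.group(1))
        count = region.get("count")
        hints[region.get("name")] = {
            "unchanged": region.get("unchanged") == "true",
            "count": int(count) if count is not None else None,
            "trimmed": trimmed,
        }
    return hints


hints = region_hints(SNAPSHOT)
print(hints)
```

A region with a non-zero `trimmed` value is a candidate for a follow-up `find` call restricted to that region.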
125
+ ## Element Types in Snapshots
126
+
127
+ | Tag | Element | Key Attributes |
128
+ |-----|---------|----------------|
129
+ | `<link>` | Hyperlink | `id`, `href` |
130
+ | `<btn>` | Button | `id`, `val`, `enabled` |
131
+ | `<rad>` | Radio button | `id`, `val`, `checked`, `focused` |
132
+ | `<sel>` | Dropdown/select | `id`, `expanded`, `focused` |
133
+ | `<elt>` | Input/generic | `id`, `type`, `val`, `focused`, `enabled`, `selected` |
134
+
135
+ ### Common Attributes
136
+
137
+ | Attribute | Meaning |
138
+ |-----------|---------|
139
+ | `id` | Element ID (eid) - use this to target the element |
140
+ | `enabled="false"` | Element is disabled (common in sequential forms) |
141
+ | `checked="true"` | Radio/checkbox is selected |
142
+ | `focused="true"` | Element has keyboard focus |
143
+ | `expanded="true"` | Dropdown is open |
144
+ | `selected="true"` | Option/tab is selected |
145
+ | `val` | Element value |
146
+
147
+ ## Progressive Enablement Pattern
148
+
149
+ Many sites use progressive enablement: later options stay disabled until earlier choices are made.
150
+
151
+ ```xml
152
+ <!-- Step 1: Model selection enabled -->
153
+ <rad id="model1" val="pro">iPhone 17 Pro</rad>
154
+ <rad id="color1" enabled="false" val="silver">Silver</rad> <!-- disabled -->
155
+
156
+ <!-- After selecting model, colors become enabled -->
157
+ <rad id="model1" checked="true" val="pro">iPhone 17 Pro</rad>
158
+ <rad id="color1" val="silver">Silver</rad> <!-- now enabled -->
159
+ ```
160
+
161
+ Common places this appears:
162
+ - Ecommerce product configuration
163
+ - Checkout and payment flows
164
+ - Onboarding wizards
165
+ - Settings pages with dependent options
166
+
167
+ **Strategy**: If you see `enabled="false"`, work upward to identify and complete the prerequisite step before continuing.
168
+
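Reviewer sketch (not part of the package): the "work upward" strategy expressed as a small heuristic. It assumes that document order in the snapshot reflects visual order, and that a prerequisite is an earlier control that is enabled but not yet checked. Both are guesses about typical pages, not guarantees. The `prerequisite_candidates` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Step 1 of the example above, wrapped in a region so it is well-formed XML.
REGION = """
<region name="main">
  <rad id="model1" val="pro">iPhone 17 Pro</rad>
  <rad id="color1" enabled="false" val="silver">Silver</rad>
</region>
"""


def prerequisite_candidates(xml_text, target_id):
    """Return ids of earlier enabled, unchecked controls, nearest first."""
    root = ET.fromstring(xml_text.strip())
    earlier = []
    for node in root.iter():
        if node.get("id") == target_id:
            if node.get("enabled") != "false":
                return []  # target is already usable, nothing to complete first
            return [
                n.get("id")
                for n in reversed(earlier)
                if n.get("enabled") != "false" and n.get("checked") != "true"
            ]
        if node.get("id"):
            earlier.append(node)
    raise KeyError(target_id)


candidates = prerequisite_candidates(REGION, "color1")
print(candidates)
```

After acting on a candidate, the snapshot should be refreshed before checking whether the target became enabled.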
169
+ ## find Response
170
+
171
+ ```xml
172
+ <result type="find" page_id="..." snapshot_id="..." count="N">
173
+ <match eid="abc123"
174
+ kind="button|link|radio|checkbox|textbox|combobox|heading|image"
175
+ label="Button text"
176
+ region="main|nav|header|footer"
177
+ selector="role=button[name=&quot;...&quot;]"
178
+ visible="true"
179
+ enabled="true"
180
+ href="..." />
181
+ </result>
182
+ ```
183
+
184
+ ### Filter Parameters
185
+ - `kind`: Element type filter
186
+ - `label`: Case-insensitive substring match
187
+ - `region`: Restrict to semantic area
188
+ - `limit`: Max results (default 10)
189
+ - `include_readable`: Include text content (default true)
190
+
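Reviewer sketch (not part of the package): a local model of how the `kind`, `label`, `region`, and `limit` filters combine, as described in the list above. It is not the server's implementation. The `Match` type, the `filter_matches` helper, and the sample data are hypothetical, and `include_readable` is left out because its effect on results is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Match:
    eid: str
    kind: str
    label: str
    region: str


def filter_matches(matches, kind=None, label=None, region=None, limit=10):
    """Apply the documented filter semantics to an in-memory list of matches."""
    selected = []
    for match in matches:
        if kind is not None and match.kind != kind:
            continue
        # `label` is described as a case-insensitive substring match.
        if label is not None and label.lower() not in match.label.lower():
            continue
        if region is not None and match.region != region:
            continue
        selected.append(match)
        if len(selected) >= limit:
            break
    return selected


SAMPLE = [
    Match("e1", "button", "Add to Bag", "main"),
    Match("e2", "link", "Bag", "nav"),
    Match("e3", "button", "Search", "header"),
]

print(filter_matches(SAMPLE, kind="button", label="bag"))
```

Combining `kind` with `region` first and adding `label` last tends to keep result sets small on dense pages.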
191
+ ## get_element Response
192
+
193
+ ```xml
194
+ <node eid="abc123" kind="link" region="main" group="tbody-28"
195
+ x="147.875" y="11.5" w="97.97" h="16.5"
196
+ display="inline" zone="top-left">
197
+ Element label text
198
+ <selector primary='role=link[name="..."]' />
199
+ <attrs href="..." />
200
+ </node>
201
+ ```
202
+
203
+ - `primary`: Best Playwright selector
204
+ - Position info: `x`, `y`, `w`, `h`, `zone`
205
+ - `group`: Logical grouping (for tables, lists)
206
+
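Reviewer sketch (not part of the package): pulling the primary selector and a center point out of a `get_element` response. It assumes `x` and `y` give the top-left corner of the element's box, which the skill file does not state explicitly. The label, `href`, and the `describe_node` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

NODE = """
<node eid="abc123" kind="link" region="main" group="tbody-28"
      x="147.875" y="11.5" w="97.97" h="16.5"
      display="inline" zone="top-left">
  Docs
  <selector primary='role=link[name="Docs"]' />
  <attrs href="/docs" />
</node>
"""


def describe_node(xml_text):
    """Return the eid, label, primary selector, and box center of a <node>."""
    node = ET.fromstring(xml_text.strip())
    x, y, w, h = (float(node.get(key)) for key in ("x", "y", "w", "h"))
    selector = node.find("selector")
    return {
        "eid": node.get("eid"),
        "label": (node.text or "").strip(),
        "selector": selector.get("primary") if selector is not None else None,
        # Assumption: (x, y) is the top-left corner, so the center is offset by half the size.
        "center": (x + w / 2, y + h / 2),
    }


info = describe_node(NODE)
print(info["selector"], info["center"])
```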
207
+ ## get_form Response
208
+
209
+ ```xml
210
+ <forms page="page-id">
211
+ <form id="form-xxx" intent="search|login|signup|checkout" completion="100%">
212
+ <input eid="748" purpose="search">Search Wikipedia</input>
213
+ <combobox eid="750" purpose="selection" filled="true">EN</combobox>
214
+ <button eid="820" type="submit" primary="true">Search</button>
215
+ <next eid="748" reason="Optional field" />
216
+ </form>
217
+ </forms>
218
+ ```
219
+
220
+ - `intent`: Form purpose (search, login, checkout, etc.)
221
+ - `completion`: Percentage filled
222
+ - `next`: Suggested next field to fill with reason
223
+
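Reviewer sketch (not part of the package): choosing a form by `intent` and reading its suggested next field and submit button. Selecting by intent reflects the earlier caution that busy pages may contain several unrelated forms. The `next_step` helper is hypothetical, and the sample uses a single concrete intent rather than the `a|b|c` placeholder.

```python
import xml.etree.ElementTree as ET

FORMS = """
<forms page="page-id">
  <form id="form-xxx" intent="search" completion="100%">
    <input eid="748" purpose="search">Search Wikipedia</input>
    <combobox eid="750" purpose="selection" filled="true">EN</combobox>
    <button eid="820" type="submit" primary="true">Search</button>
    <next eid="748" reason="Optional field" />
  </form>
</forms>
"""


def next_step(xml_text, intent):
    """Return next-field and submit details for the first form with this intent."""
    root = ET.fromstring(xml_text.strip())
    for form in root.iter("form"):
        if form.get("intent") != intent:
            continue
        suggestion = form.find("next")
        submit = form.find("button[@type='submit']")
        return {
            "form_id": form.get("id"),
            "completion": int(form.get("completion", "0%").rstrip("%")),
            "next_eid": suggestion.get("eid") if suggestion is not None else None,
            "next_reason": suggestion.get("reason") if suggestion is not None else None,
            "submit_eid": submit.get("eid") if submit is not None else None,
        }
    return None  # no form with that intent on this page


step = next_step(FORMS, "search")
print(step)
```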
224
+ ## list_pages Response
225
+
226
+ ```xml
227
+ <result type="list_pages" status="success">
228
+ <pages count="N">
229
+ <page page_id="page-xxx" url="https://..." title="Page Title" />
230
+ </pages>
231
+ </result>
232
+ ```
233
+
234
+ Use `page_id` to target specific browser tabs.
235
+
+ ## Session Recovery
+
+ The browser persists across conversation sessions — tabs from prior sessions remain open. On a new session, there is no "current" page; actions without `page_id` may target an arbitrary tab.
+
+ When encountering a "no page/session" error or resuming from a prior session:
+
+ 1. Call `list_pages` to see all open tabs with `page_id`, URL, and title
+ 2. Identify the target page by URL or title
+ 3. Pass `page_id` explicitly to all subsequent calls (`snapshot`, `find`, `click`, etc.)
+ 4. If the page is not found, navigate fresh — the tab may have been closed
+
+ **Caveats:**
+ - **Stale tab URLs**: `list_pages` shows the URL at open time. For SPAs, use `snapshot` with `page_id` to see actual current state.
+ - **Tab accumulation**: The browser accumulates tabs across sessions. Always use `page_id` to target the correct one.
+ - **Tab assumptions**: Do not assume that tabs from earlier turns are still open or still useful. Check `list_pages` instead of relying on memory of prior turns.
+
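Reviewer sketch (not part of the package): steps 1 to 4 of the recovery procedure reduced to a lookup over a `list_pages` response. The page ids, URLs, titles, and the `pick_page` helper are hypothetical. A `None` result corresponds to step 4, where the caller navigates fresh.

```python
import xml.etree.ElementTree as ET

LIST_PAGES = """
<result type="list_pages" status="success">
  <pages count="2">
    <page page_id="page-aaa" url="https://example.com/login" title="Log in" />
    <page page_id="page-bbb" url="https://example.com/cart" title="Cart" />
  </pages>
</result>
"""


def pick_page(xml_text, url_part=None, title_part=None):
    """Return the page_id of the first tab matching the URL or title fragment."""
    root = ET.fromstring(xml_text.strip())
    for page in root.iter("page"):
        if url_part and url_part in page.get("url", ""):
            return page.get("page_id")
        if title_part and title_part.lower() in page.get("title", "").lower():
            return page.get("page_id")
    return None  # not found: the tab may have been closed, so navigate fresh


page_id = pick_page(LIST_PAGES, url_part="/cart")
print(page_id)
```

Because listed URLs can be stale for single-page apps, a match by URL is best confirmed with a `snapshot` call that passes the chosen `page_id`.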
+ ## Error Responses
+
+ ```xml
+ <error>Field not found in any form: abc123</error>
+ ```
+
+ Common errors:
+ - Element ID not found (page may have changed)
+ - Element not visible/enabled
+ - Form field not in any form context
+ - No page/session (see Session Recovery above)
+
+ When this happens:
+ 1. Re-check the current page state
+ 2. Re-run `find` or `get_form` from the latest state
+ 3. Continue only with fresh `eid`s
+
+ ## Canvas Interactions
+
+ `<canvas>` elements render pixels, not DOM nodes — standard selectors don't work inside them. Use these tools for canvas-based UIs (drawing apps, games, visualizations):
+
+ - **`inspect_canvas`** — the key tool. Pass a canvas `eid` and it auto-detects the rendering library (Fabric.js, Konva, PixiJS, Phaser, Three.js, EaselJS, or raw canvas), queries the scene graph for objects with positions/sizes/labels, and returns an annotated screenshot with coordinate grid overlay and bounding boxes. Supports configurable `grid_spacing` (use 10px for precise handle targeting).
+ - **`click`** with `eid` + `x`/`y` — click at offset relative to canvas top-left (e.g., select a shape)
+ - **`drag`** with `eid` + source/target coordinates — drag within canvas (e.g., move objects, scale/rotate handles)
+ - **`screenshot`** with `eid` — capture just the canvas to visually verify state
+
+ **Workflow:** `find` → `get_element` (position) → `inspect_canvas` (discover objects) → `click`/`drag` (interact) → re-inspect to verify.
+
+ ## Best Practices
+
+ 1. **Use `find`** when snapshot shows `<!-- trimmed -->`
+ 2. **Track `<baseline>` vs `<diff>`** to know if you have full or partial state
+ 3. **Always pass `page_id`** when working across sessions or with multiple tabs
+ 4. **Reacquire targets after large mutations** instead of reusing stale `eid`s
+ 5. **Keep selector extraction optional** unless the task asks for it or automation handoff is part of the outcome
+
+ ## Example Usage
+
+ ```
+ Claude Code: /agent-web-interface-guide https://airbnb.com Walk through the search and booking flow for stays in Tokyo
+ Codex: $agent-web-interface-guide https://airbnb.com Walk through the search and booking flow for stays in Tokyo
+
+ Claude Code: /agent-web-interface-guide https://apple.com/store Configure an iPhone and add it to the bag, then summarize the steps
+ Codex: $agent-web-interface-guide https://apple.com/store Configure an iPhone and add it to the bag, then summarize the steps
+
+ Claude Code: /agent-web-interface-guide https://developer.mozilla.org Find the Fetch API docs and note how the search flow behaves
+ Codex: $agent-web-interface-guide https://developer.mozilla.org Find the Fetch API docs and note how the search flow behaves
+
+ Claude Code: /agent-web-interface-guide https://example.com/login Extract the login form selectors and field purposes
+ Codex: $agent-web-interface-guide https://example.com/login Extract the login form selectors and field purposes
+ ```
@@ -0,0 +1,3 @@
+ frontmatter:
+   argument-hint: "<url> <what to explore or do>"
+   user-invocable: true
@@ -0,0 +1,10 @@
+ interface:
+   display_name: "Act On Live Web Page"
+   short_description: "Open a live page, complete web actions efficiently, and inspect state when needed"
+   default_prompt: "Open this site, carry out the requested flow efficiently, and report the important observations, state changes, and any relevant selectors or form details."
+
+ dependencies:
+   tools:
+     - type: "mcp"
+       value: "agent-web-interface"
+       description: "Browser automation tools for carrying out live page actions and inspecting the resulting state"
@@ -0,0 +1,18 @@
+ {
+   "schemaVersion": 1,
+   "pluginRef": "agent-web-interface@athena-workflow-marketplace",
+   "pluginName": "agent-web-interface",
+   "marketplaceName": "athena-workflow-marketplace",
+   "version": "1.0.4",
+   "artifacts": {
+     "claude": {
+       "type": "directory",
+       "path": "./claude/plugin"
+     },
+     "codex": {
+       "type": "marketplace",
+       "marketplacePath": "./.agents/plugins/marketplace.json",
+       "pluginPath": "./codex/plugin"
+     }
+   }
+ }
@@ -0,0 +1,14 @@
+ {
+   "schemaVersion": 1,
+   "name": "athena-workflow-marketplace",
+   "plugins": [
+     {
+       "name": "agent-web-interface",
+       "version": "1.0.6",
+       "source": {
+         "source": "local",
+         "path": "./codex/plugin"
+       }
+     }
+   ]
+ }
@@ -0,0 +1,19 @@
+ {
+   "name": "agent-web-interface",
+   "description": "Open live web pages, click through real flows, fill forms, add items to cart, and inspect page state or selectors",
+   "version": "1.0.6",
+   "author": {
+     "name": "Athenaflow"
+   },
+   "repository": "https://github.com/lespaceman/agent-web-interface",
+   "license": "MIT",
+   "keywords": [
+     "browser",
+     "mcp",
+     "puppeteer",
+     "cdp",
+     "automation",
+     "semantic-snapshot"
+   ],
+   "category": "browser-automation"
+ }
@@ -0,0 +1,8 @@
+ {
+   "mcpServers": {
+     "browser": {
+       "command": "npx",
+       "args": ["-y", "agent-web-interface@latest"]
+     }
+   }
+ }
@@ -0,0 +1,9 @@
+ {
+   "name": "@athenaflow/plugin-agent-web-interface",
+   "version": "1.0.6",
+   "description": "Open live web pages, click through real flows, fill forms, add items to cart, and inspect page state or selectors",
+   "license": "MIT",
+   "publishConfig": {
+     "access": "public"
+   }
+ }