@captain_z/zsk-skills 1.6.1 → 1.7.0
- package/demo/SKILL.md +41 -21
- package/demo/harness.yaml +18 -0
- package/demo/references/automation.md +51 -15
- package/package.json +1 -1
package/demo/SKILL.md
CHANGED
|
@@ -15,6 +15,7 @@ actions:
|
|
|
15
15
|
- resume
|
|
16
16
|
- terminate
|
|
17
17
|
- complete
|
|
18
|
+
- run optimized
|
|
18
19
|
- run playwright
|
|
19
20
|
- run computer-use
|
|
20
21
|
- run hybrid
|
|
@@ -35,7 +36,19 @@ triggers:
|
|
|
35
36
|
|
|
36
37
|
# Demo
|
|
37
38
|
|
|
38
|
-
Development demo before formal testing. Build a module-grouped, audience-ready demo outline, run rehearsed Playwright demonstration steps, record demo-only issues, preserve evidence, and leave reusable Playwright scenarios when possible.
|
|
39
|
+
Development demo before formal testing. Build a module-grouped, audience-ready demo outline, generate Playwright cases from formal raw testing inputs through `test-plan.json`, run rehearsed Playwright demonstration steps, record demo-only issues, preserve evidence, and leave reusable Playwright scenarios when possible.
|
|
40
|
+
|
|
41
|
+
Default `/demo` execution follows the optimized SOP:
|
|
42
|
+
|
|
43
|
+
```text
|
|
44
|
+
tests.raw_cases / sources.testing, normally .raws/testing
|
|
45
|
+
-> Browser Use observation handoff for logged-in/current-page state and locator hints
|
|
46
|
+
-> zsk/agent-written test-plan.json in tests.derived_cases
|
|
47
|
+
-> Playwright generator/CLI/Skills-written .spec.ts in tests.automated.e2e
|
|
48
|
+
-> Playwright Test/UI/debug execution and evidence under the configured issue evidence path
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
Browser Use is observation-only in this path. It may describe URL, page title, visible targets, role/label hints, auth/session notes, and privacy concerns. It must not write repo artifacts, `test-plan.json`, or final `.spec.ts` files. Optimized `/demo` must not silently fall back to Playwright MCP, Computer Use, or the legacy hybrid bridge.
|
|
39
52
|
|
|
40
53
|
Demo should execute and refine Playwright scenarios that were planned earlier by spec/design/task. It should not be the first stage that invents UI scenarios unless upstream artifacts are missing, in which case the gap must be recorded as a resource blocker or learning proposal.
|
|
41
54
|
|
|
@@ -66,7 +79,7 @@ Every demo step must have complete source alignment:
|
|
|
66
79
|
|
|
67
80
|
Do not include orphan demo steps that cannot be traced to those sources. Do not omit required function points from the PRD/SRS unless the omission is called out as a known gap with owner and reason.
|
|
68
81
|
|
|
69
|
-
If a formal test case and
|
|
82
|
+
If a formal test case and Browser Use handoff are available, demo should generate or refresh a Playwright case before running it. Prefer `raw test case + Browser Use observation + test-plan.json + Playwright locator generation` over screenshot-driven generation. Use the legacy hybrid lane explicitly when MCP or Computer Use is required.
|
|
70
83
|
|
|
71
84
|
## Operating Constraints
|
|
72
85
|
|
|
@@ -74,6 +87,8 @@ If a formal test case and structured page information or Browser Use handoff are
|
|
|
74
87
|
- Do not invent flows when spec/design/task resources are missing; record the resource gap.
|
|
75
88
|
- Do not use Browser Use when Playwright can reproduce the flow with storageState, a persistent context, or CDP.
|
|
76
89
|
- Do not use Computer Use when deterministic Playwright, Browser Use, or app APIs are sufficient.
|
|
90
|
+
- Do not let Browser Use generate final Playwright specs; it supplies observation only.
|
|
91
|
+
- Do not use Playwright MCP or Computer Use in the default optimized SOP.
|
|
77
92
|
- Do not mark ready-for-testing with P0 blockers or untriaged core P1 issues.
|
|
78
93
|
- Every claim needs reusable scenario evidence, screenshot/trace evidence, or a documented manual run.
|
|
79
94
|
|
|
@@ -85,6 +100,8 @@ The skill is the conversational entrypoint. The harness/CLI is the execution aut
|
|
|
85
100
|
demo start -> zsk demo start
|
|
86
101
|
demo pause -> zsk demo pause
|
|
87
102
|
demo resume -> zsk demo resume
|
|
103
|
+
demo run -> zsk demo run --optimized
|
|
104
|
+
demo run optimized -> zsk demo run --optimized
|
|
88
105
|
demo run playwright -> zsk demo run --playwright
|
|
89
106
|
demo run computer-use -> zsk demo run --computer-use
|
|
90
107
|
demo run hybrid -> zsk demo run --hybrid
|
|
@@ -96,17 +113,22 @@ demo complete -> zsk demo complete
|
|
|
96
113
|
## Automation Priority
|
|
97
114
|
|
|
98
115
|
1. Prefer deterministic scripts or app/test APIs when they are sufficient.
|
|
99
|
-
2.
|
|
100
|
-
3.
|
|
101
|
-
4. Use Browser Use when an already logged-in browser, SSO session, extension, persistent profile, or human-like page goal identification is the relevant state source.
|
|
102
|
-
5. Use
|
|
103
|
-
6. Use
|
|
116
|
+
2. Default to optimized Playwright generation: raw testing inputs plus Browser Use observation become `test-plan.json`, then Playwright CLI/Skills generate and execute `.spec.ts`.
|
|
117
|
+
3. Prefer Playwright CLI/Test/UI mode for visible, repeatable demo performance, controllable stop/pause, screenshots, traces, reports, and scenario execution.
|
|
118
|
+
4. Use Browser Use when an already logged-in browser, SSO session, extension, persistent profile, or human-like page goal identification is the relevant state source; keep it observation-only.
|
|
119
|
+
5. Use the explicit legacy hybrid lane when understanding and execution should be split across Playwright MCP, Browser Use, or Computer Use.
|
|
120
|
+
6. Use Computer Use when the surface is visual/system-level rather than reliably scriptable web DOM.
|
|
104
121
|
|
|
105
|
-
Browser Use or Computer Use must record why scripts, Playwright CLI
|
|
122
|
+
Browser Use or Computer Use outside optimized mode must record why scripts, Playwright CLI, storageState, persistent context, or CDP were insufficient.
|
|
106
123
|
|
|
107
124
|
## Tool Bridge
|
|
108
125
|
|
|
109
|
-
The
|
|
126
|
+
The optimized lane exchanges structured artifacts:
|
|
127
|
+
|
|
128
|
+
- `test-plan.json`: source-aligned test intent, preconditions, Browser Use observation summary, locator hints, steps, assertions, auth handoff, risks, and generated spec target.
|
|
129
|
+
- `.spec.ts`: final executable Playwright test generated from `test-plan.json`, following Playwright locator and assertion best practices.
|
|
130
|
+
|
|
131
|
+
The legacy hybrid lane exchanges structured artifacts:
|
|
110
132
|
|
|
111
133
|
- `operation-plan.json`: page understanding from Playwright MCP, Browser Use, or Computer Use; next operation; target intent; candidate locator(s); auth/session handoff; confidence; and fallback note.
|
|
112
134
|
- `playwright-execution.json`: selector/action attempted, result, screenshot/trace/video paths, UI mode/report link, scenario update, and issue link.
|
|
@@ -130,9 +152,9 @@ Demo has two sub-phases:
|
|
|
130
152
|
|
|
131
153
|
The handoff loop is:
|
|
132
154
|
|
|
133
|
-
1. Identify: Browser Use
|
|
134
|
-
2. Persist state: Browser Use observations
|
|
135
|
-
3. Pre-write:
|
|
155
|
+
1. Identify: Browser Use identifies the human goal, current page state, likely control to click/type, and candidate locators when Playwright cannot know which visible element matches the user's intent. In legacy hybrid mode, Playwright MCP or Computer Use may also identify the next operation.
|
|
156
|
+
2. Persist state: zsk or the agent generator persists Browser Use observations into `test-plan.json` or a separate handoff note before spec generation, including URL, candidate locator text, storage/session hint, profile source, visible login state, and privacy note. Browser Use itself is not the repo writer.
|
|
157
|
+
3. Pre-write: zsk or the agent generator converts source evidence plus Browser Use observations into `test-plan.json` and then Playwright specs, preferring role-based locators and including any auth handoff that Playwright can reproduce.
|
|
136
158
|
4. Rehearse: Playwright runs the pre-written cases with `storageState`, persistent context, CDP, or fixture login when available, then records trace/UI/report evidence.
|
|
137
159
|
5. Perform: Demo Show uses the rehearsed Playwright cases; successful runs are promoted to reusable verify/regression scenarios.
|
|
138
160
|
|
|
@@ -140,17 +162,16 @@ Repeat the loop until the demo function point is passed, paused, or converted in
|
|
|
140
162
|
|
|
141
163
|
## Scenario Generation
|
|
142
164
|
|
|
143
|
-
The agent should synthesize Playwright cases before the external demo from
|
|
165
|
+
The agent should synthesize Playwright cases before the external demo from raw test cases, source evidence, and Browser Use state handoff:
|
|
144
166
|
|
|
145
167
|
1. Load SRS/spec/design rows, formal QA cases, existing automation/e2e cases, and relevant unit-test assertions/fixtures.
|
|
146
|
-
2. Read
|
|
147
|
-
3.
|
|
148
|
-
4. Generate a Playwright spec using role-first locators, explicit auth bootstrap, and assertions derived from
|
|
149
|
-
5.
|
|
150
|
-
6.
|
|
151
|
-
7. Use the rehearsed case for Demo Show; do not improvise new clicks during the external demo unless the prepared case is blocked and the blocker is recorded.
|
|
168
|
+
2. Read Browser Use observation when login/session/current-page state or human-intent locator mapping is needed; map auth to Playwright `storageState`, persistent context, CDP, or documented manual setup.
|
|
169
|
+
3. Generate `test-plan.json` under `tests.derived_cases` with source links, step intent, locator hints, assertions, auth handoff, risks, and generated spec target.
|
|
170
|
+
4. Generate a Playwright spec under `tests.automated.e2e` using role-first locators, explicit auth bootstrap, and assertions derived from source evidence.
|
|
171
|
+
5. Rehearse it with Playwright UI/trace/report evidence and preserve trace/screenshots/video under the configured issue evidence path.
|
|
172
|
+
6. Use the rehearsed case for Demo Show; do not improvise new clicks during the external demo unless the prepared case is blocked and the blocker is recorded.
|
|
152
173
|
|
|
153
|
-
|
|
174
|
+
Playwright MCP and Computer Use are legacy hybrid fallbacks only. Browser Use is reserved for existing authenticated browser/profile state and human-intent locator mapping, and remains observation-only in optimized mode.
|
|
154
175
|
|
|
155
176
|
CLI example:
|
|
156
177
|
|
|
@@ -158,7 +179,6 @@ CLI example:
|
|
|
158
179
|
zsk demo scenario generate \
|
|
159
180
|
-m checkout \
|
|
160
181
|
--test-case .raws/testing/checkout-happy-path.md \
|
|
161
|
-
--snapshot .raws/testing/checkout.aria.md \
|
|
162
182
|
--name "Checkout happy path"
|
|
163
183
|
```
|
|
164
184
|
|
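The command mapping in the SKILL.md hunks above (`demo run -> zsk demo run --optimized` and its neighbours) can be read as a simple lookup table. The following is a minimal TypeScript sketch, not the package's actual implementation: `DEMO_COMMANDS` and `resolveDemoCommand` are hypothetical names, and only the mappings visible in this diff are included.

```typescript
// Hypothetical lookup table; entries are copied from the mappings shown in the diff.
const DEMO_COMMANDS: Record<string, string> = {
  "demo start": "zsk demo start",
  "demo pause": "zsk demo pause",
  "demo resume": "zsk demo resume",
  "demo run": "zsk demo run --optimized", // line added in 1.7.0
  "demo run optimized": "zsk demo run --optimized", // line added in 1.7.0
  "demo run playwright": "zsk demo run --playwright",
  "demo run computer-use": "zsk demo run --computer-use",
  "demo run hybrid": "zsk demo run --hybrid",
  "demo complete": "zsk demo complete",
};

// Normalizes whitespace and case, then resolves the conversational command
// to its CLI form. Unknown commands raise instead of guessing a lane.
function resolveDemoCommand(input: string): string {
  const key = input.trim().replace(/\s+/g, " ").toLowerCase();
  const command = DEMO_COMMANDS[key];
  if (command === undefined) {
    throw new Error(`Unknown demo command: ${input}`);
  }
  return command;
}
```

Throwing on an unknown command, rather than defaulting to some lane, mirrors the rule in the diff that optimized `/demo` must not silently fall back to another mode.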
package/demo/harness.yaml
CHANGED
|
@@ -22,7 +22,25 @@ checks:
|
|
|
22
22
|
- demo-session
|
|
23
23
|
- issue-taxonomy
|
|
24
24
|
- scenario-preservation
|
|
25
|
+
- optimized-test-plan
|
|
25
26
|
- tool-bridge
|
|
27
|
+
optimized:
|
|
28
|
+
testPlan:
|
|
29
|
+
role: source-aligned-intermediate-contract
|
|
30
|
+
input: tests.raw_cases
|
|
31
|
+
fallbackInput: sources.testing
|
|
32
|
+
output: test-plan.json
|
|
33
|
+
playwrightSpec:
|
|
34
|
+
role: generated-executable-scenario
|
|
35
|
+
input: test-plan.json
|
|
36
|
+
output: "*.spec.ts"
|
|
37
|
+
browserUse:
|
|
38
|
+
role: observation-only
|
|
39
|
+
writesRepoArtifacts: false
|
|
40
|
+
forbidden:
|
|
41
|
+
- playwright_mcp
|
|
42
|
+
- computer_use
|
|
43
|
+
- operation-plan.json
|
|
26
44
|
bridge:
|
|
27
45
|
playwrightCli:
|
|
28
46
|
role: low-token-execute-screenshot-trace
|
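The `optimized` block added to harness.yaml declares Browser Use as observation-only and lists items that are forbidden in the optimized lane. A check against those declarations might look like the sketch below. This is an illustration under assumptions: `RunRecord`, `findOptimizedViolations`, and tool names such as `browser_use` are hypothetical, and the diff does not show how the harness actually enforces the `optimized-test-plan` check.

```typescript
// Values copied from the `optimized.forbidden` list in harness.yaml.
const FORBIDDEN_IN_OPTIMIZED: string[] = [
  "playwright_mcp",
  "computer_use",
  "operation-plan.json",
];

// Hypothetical record of what a single demo run touched.
interface RunRecord {
  toolsUsed: string[]; // e.g. "browser_use" (assumed naming)
  artifactsWritten: string[]; // e.g. "test-plan.json"
  browserUseWrites: string[]; // paths written by Browser Use itself
}

// Returns one message per violation of the optimized-lane declarations.
function findOptimizedViolations(run: RunRecord): string[] {
  const violations: string[] = [];
  const touched = run.toolsUsed.concat(run.artifactsWritten);
  for (const name of touched) {
    if (FORBIDDEN_IN_OPTIMIZED.indexOf(name) !== -1) {
      violations.push(`forbidden in optimized lane: ${name}`);
    }
  }
  // harness.yaml sets browserUse.writesRepoArtifacts to false,
  // so any write attributed to Browser Use counts as a violation.
  for (const path of run.browserUseWrites) {
    violations.push(`Browser Use wrote a repo artifact: ${path}`);
  }
  return violations;
}
```

An empty result means the run stayed inside the optimized lane as declared. A non-empty result corresponds to the "records a blocker or exits non-successfully" behaviour described in automation.md.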
package/demo/references/automation.md
CHANGED
|
@@ -1,13 +1,35 @@
|
|
|
1
1
|
# Demo Automation Reference
|
|
2
2
|
|
|
3
|
-
Use deterministic local scripts first when they fully cover the flow. For
|
|
3
|
+
Use deterministic local scripts first when they fully cover the flow. For test-case-driven web demos, `/demo` defaults to the optimized SOP:
|
|
4
|
+
|
|
5
|
+
```text
|
|
6
|
+
tests.raw_cases / sources.testing
|
|
7
|
+
-> Browser Use observation handoff
|
|
8
|
+
-> test-plan.json
|
|
9
|
+
-> Playwright .spec.ts
|
|
10
|
+
-> Playwright Test/UI/debug evidence
|
|
11
|
+
```
|
|
12
|
+
|
|
13
|
+
The optimized lane does not use Playwright MCP, Computer Use, or bridge artifacts. Use the legacy hybrid bridge only when optimized Playwright generation cannot represent the page or state.
|
|
4
14
|
|
|
5
15
|
- Playwright: perform the visible demo run, allow controlled pause/termination, record screenshots/traces/video/reports, preserve scenario cases, and support reproducible auth through fixtures, `storageState`, persistent contexts, or CDP.
|
|
6
16
|
- Playwright MCP: inspect structured accessibility snapshots and produce low-ambiguity operation plans.
|
|
7
|
-
- Browser Use:
|
|
17
|
+
- Browser Use: observe the human-intent target and existing or persistent logged-in browser profile when SSO, extensions, CAPTCHA-adjacent flows, or human browser state matters.
|
|
8
18
|
- Computer Use: understand visual/human-like or system-level context when DOM, ARIA, CDP, and Browser Use are insufficient.
|
|
9
19
|
|
|
10
|
-
Use Playwright-only for stable scenarios. Use Browser Use for stateful browser sessions. Use Computer Use-only as an explicit visual/system fallback.
|
|
20
|
+
Use optimized mode for raw-test-case to Playwright generation. Use Playwright-only for stable scenarios. Use Browser Use for observation of stateful browser sessions. Use Computer Use-only as an explicit visual/system fallback.
|
|
21
|
+
|
|
22
|
+
## Optimized SOP
|
|
23
|
+
|
|
24
|
+
In optimized mode:
|
|
25
|
+
|
|
26
|
+
1. Read formal test cases from `tests.raw_cases` or `sources.testing`; these usually point into `.raws/testing`.
|
|
27
|
+
2. Use Browser Use only to observe the logged-in/current page: URL, page title, visible controls, role/label hints, auth/session note, and privacy note.
|
|
28
|
+
3. Have zsk or the agent generator write `test-plan.json` under `tests.derived_cases`.
|
|
29
|
+
4. Generate final Playwright `.spec.ts` under `tests.automated.e2e`.
|
|
30
|
+
5. Execute or rehearse with Playwright CLI/Test/UI mode and store evidence under the configured issue evidence directory.
|
|
31
|
+
|
|
32
|
+
Browser Use must not write repo artifacts, `test-plan.json`, or final `.spec.ts`. If raw cases, auth state, or generated output are missing, optimized mode records a blocker or exits non-successfully; it must not silently fall back to Playwright MCP, Computer Use, or legacy bridge behavior.
|
|
11
33
|
|
|
12
34
|
## Playwright Surfaces
|
|
13
35
|
|
|
@@ -17,7 +39,7 @@ Use the Playwright tool that matches the demo job:
|
|
|
17
39
|
| --- | --- |
|
|
18
40
|
| Playwright Test | Preserve reusable scenario cases and rerun them for smoke, verify, and regression. |
|
|
19
41
|
| Playwright CLI/UI/Report | Visible demo performance, controlled stop/pause, token-efficient browser control, live session inspection, and replayable reports/traces. |
|
|
20
|
-
| Playwright MCP |
|
|
42
|
+
| Playwright MCP | Legacy hybrid-only structured accessibility snapshots for page understanding and operation planning. Not used by optimized mode. |
|
|
21
43
|
| Playwright Library | Deterministic browser scripts for screenshots, PDF, network interception, and custom evidence capture. |
|
|
22
44
|
|
|
23
45
|
## Browser State And Login
|
|
@@ -26,14 +48,14 @@ Do not use Computer Use just to keep a login session. Prefer this order:
|
|
|
26
48
|
|
|
27
49
|
1. Playwright fixture login or `storageState` for controlled test accounts.
|
|
28
50
|
2. Playwright persistent context or CDP connection when a dedicated browser profile is acceptable.
|
|
29
|
-
3. Browser Use when the user already has a logged-in browser/profile, SSO state, or extension-dependent session that should be preserved.
|
|
51
|
+
3. Browser Use observation when the user already has a logged-in browser/profile, SSO state, or extension-dependent session that should be preserved.
|
|
30
52
|
4. Computer Use only when the required state is visual/system-level or not reachable through browser automation.
|
|
31
53
|
|
|
32
54
|
Browser Use runs must record the browser/profile/session source and whether credentials or personal data were visible.
|
|
33
55
|
|
|
34
56
|
## Browser Use To Playwright Handoff
|
|
35
57
|
|
|
36
|
-
Browser Use should not be a throwaway visit. When it identifies what to click or which logged-in state matters,
|
|
58
|
+
Browser Use should not be a throwaway visit. When it identifies what to click or which logged-in state matters, preserve a structured handoff in the optimized `test-plan.json` or a separate agent note consumed by the zsk generator:
|
|
37
59
|
|
|
38
60
|
- URL and page title.
|
|
39
61
|
- Human goal and current page summary.
|
|
@@ -41,7 +63,7 @@ Browser Use should not be a throwaway visit. When it identifies what to click or
|
|
|
41
63
|
- Login/profile/session source and privacy note.
|
|
42
64
|
- Whether Playwright should use `storageState`, persistent context, CDP, or manual fixture setup.
|
|
43
65
|
|
|
44
|
-
Playwright then performs the visible demo run from that handoff. If the run succeeds, promote the path into a Playwright spec; if it fails, pause the demo with selector/auth/session diagnostics instead of losing the Browser Use observation.
|
|
66
|
+
Playwright then performs the visible demo run from that handoff. Browser Use itself must not write the final Playwright spec. If the run succeeds, promote the path into a Playwright spec; if it fails, pause the demo with selector/auth/session diagnostics instead of losing the Browser Use observation.
|
|
45
67
|
|
|
46
68
|
## Pre-write Demo Cases
|
|
47
69
|
|
|
@@ -50,7 +72,7 @@ Demo should not discover the path live in front of the audience. Prepare it firs
|
|
|
50
72
|
1. Collect source evidence: SRS, spec/design rows, formal QA cases, existing automation/e2e cases, and unit-test assertions/fixtures.
|
|
51
73
|
2. Write a flow-first demo outline: first show the starting state and primary goal, then the core happy path, then dependent function points, then required branch or edge scenarios, then the final state and evidence. Each row needs function/business point, scenario, source alignment, Playwright case, presenter words, visible result, and next step.
|
|
52
74
|
3. Browser Use captures login/profile state, page intent, visible targets, and candidate locators.
|
|
53
|
-
4.
|
|
75
|
+
4. zsk or the agent generator maps the source evidence to those targets, writes `test-plan.json`, and then writes Playwright pre-write specs under the configured scenario directory.
|
|
54
76
|
5. Playwright rehearses the specs with UI mode, trace, video, or HTML report enabled.
|
|
55
77
|
6. The external demo runs the rehearsed specs, so every step is visible, controllable, and stoppable.
|
|
56
78
|
7. Any live drift pauses the demo and creates diagnostics; it does not erase the Browser Use handoff.
|
|
@@ -91,7 +113,18 @@ Keep detailed resources, handoff notes, and evidence tables below the flow. They
|
|
|
91
113
|
|
|
92
114
|
## Tool-call Bridge
|
|
93
115
|
|
|
94
|
-
`
|
|
116
|
+
Optimized `test-plan.json` should contain:
|
|
117
|
+
|
|
118
|
+
- source raw case paths
|
|
119
|
+
- auth/storageState expectation
|
|
120
|
+
- Browser Use observation summary
|
|
121
|
+
- test data
|
|
122
|
+
- step intent/action/locator hints
|
|
123
|
+
- assertions
|
|
124
|
+
- generated spec target
|
|
125
|
+
- risks and blocker notes
|
|
126
|
+
|
|
127
|
+
Legacy hybrid `operation-plan.json` should contain:
|
|
95
128
|
|
|
96
129
|
- current page summary
|
|
97
130
|
- target function point
|
|
@@ -100,7 +133,7 @@ Keep detailed resources, handoff notes, and evidence tables below the flow. They
|
|
|
100
133
|
- risk/confidence
|
|
101
134
|
- fallback note
|
|
102
135
|
|
|
103
|
-
|
|
136
|
+
In optimized mode, prefer `raw test case + Browser Use observation + test-plan.json + getByRole` as the minimum stable loop. In legacy hybrid mode, `aria_snapshot + agent decision + getByRole` is still valid:
|
|
104
137
|
|
|
105
138
|
1. `aria_snapshot` gives the agent a compact semantic page tree.
|
|
106
139
|
2. Browser Use fills the gap when the semantic tree does not reveal which visible control matches the human intent.
|
|
@@ -114,10 +147,11 @@ Use screenshots only when semantic snapshots or authenticated Browser Use observ
|
|
|
114
147
|
When formal test cases exist, the agent should generate Playwright specs before the external demo:
|
|
115
148
|
|
|
116
149
|
1. Parse the test case steps and expected results.
|
|
117
|
-
2. Map each step to
|
|
118
|
-
3.
|
|
119
|
-
4.
|
|
120
|
-
5.
|
|
150
|
+
2. Map each step to Browser Use observation and locator hints when available.
|
|
151
|
+
3. Write `test-plan.json` as the intermediate contract.
|
|
152
|
+
4. Choose locators in this order: role, label, placeholder, test id, text.
|
|
153
|
+
5. Generate a spec with web-first assertions.
|
|
154
|
+
6. Mark tags: `demo`, `verify`, `regression` as appropriate.
|
|
121
155
|
|
|
122
156
|
This turns demo automation into reusable test assets instead of one-off clicking.
|
|
123
157
|
|
|
@@ -127,6 +161,8 @@ The generator should produce executable skeletons from structured input. Minimum
|
|
|
127
161
|
test case markdown + aria snapshot
|
|
128
162
|
```
|
|
129
163
|
|
|
164
|
+
For optimized mode, replace `aria snapshot` with a Browser Use observation handoff when available.
|
|
165
|
+
|
|
130
166
|
Minimum viable output:
|
|
131
167
|
|
|
132
168
|
```ts
|
|
@@ -147,7 +183,7 @@ await page.getByRole("button", { name: "Add" }).click()
|
|
|
147
183
|
|
|
148
184
|
When a demo step is stable and reusable:
|
|
149
185
|
|
|
150
|
-
1. Save or update a scenario under
|
|
186
|
+
1. Save or update a scenario under the project configured scenario directory.
|
|
151
187
|
2. Link the scenario from `docs/{module}/demo-report.md`.
|
|
152
188
|
3. Link screenshots and traces from `.issues/{module}/demo/_evidence/`.
|
|
153
189
|
4. Mark reuse targets: `smoke`, `verify`, `regression`.
|
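The automation.md changes above list what `test-plan.json` should contain and give a locator order (role, label, placeholder, test id, text), but the diff does not show a schema. The sketch below is one possible TypeScript reading of that description. All type and field names (`TestPlan`, `PlanStep`, `LocatorHint`, and so on) are assumptions for illustration. The emitted strings use Playwright's `getByRole`, `getByLabel`, `getByPlaceholder`, `getByTestId`, and `getByText` locator methods.

```typescript
// Hypothetical field names; only the list of contents comes from the diff.
type LocatorKind = "role" | "label" | "placeholder" | "testId" | "text";

interface LocatorHint {
  kind: LocatorKind;
  value: string; // role, label text, placeholder, test id, or visible text
  name?: string; // accessible name, used only with kind "role"
}

interface PlanStep {
  intent: string;
  action: "click" | "fill";
  locatorHints: LocatorHint[];
  input?: string; // text to type when action is "fill"
}

interface TestPlan {
  sourceRawCases: string[]; // source raw case paths
  auth: string; // auth/storageState expectation
  browserUseObservation: string; // observation summary
  testData: Record<string, string>;
  steps: PlanStep[];
  assertions: string[];
  generatedSpecTarget: string;
  risks: string[]; // risks and blocker notes
}

// Locator order stated in automation.md.
const LOCATOR_PRIORITY: LocatorKind[] = ["role", "label", "placeholder", "testId", "text"];

// Picks the highest-priority hint available for a step.
function pickLocator(hints: LocatorHint[]): LocatorHint | undefined {
  for (const kind of LOCATOR_PRIORITY) {
    const match = hints.filter((h) => h.kind === kind)[0];
    if (match) return match;
  }
  return undefined;
}

// Renders a hint as a Playwright locator expression (as source text).
function toLocatorExpression(hint: LocatorHint): string {
  const v = JSON.stringify(hint.value);
  switch (hint.kind) {
    case "role":
      return hint.name === undefined
        ? `page.getByRole(${v})`
        : `page.getByRole(${v}, { name: ${JSON.stringify(hint.name)} })`;
    case "label":
      return `page.getByLabel(${v})`;
    case "placeholder":
      return `page.getByPlaceholder(${v})`;
    case "testId":
      return `page.getByTestId(${v})`;
    case "text":
      return `page.getByText(${v})`;
  }
}

// Renders one plan step as one line of a generated spec.
function toSpecLine(step: PlanStep): string {
  const hint = pickLocator(step.locatorHints);
  if (!hint) throw new Error(`No locator hint for step: ${step.intent}`);
  const call = step.action === "fill" ? `fill(${JSON.stringify(step.input ?? "")})` : "click()";
  return `await ${toLocatorExpression(hint)}.${call}`;
}
```

Given a step with both a text hint and a role hint for an "Add" button, `toSpecLine` prefers the role hint, which matches the role-first style of the `getByRole` line visible in the hunk context above.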