autokap 1.0.6 → 1.0.8

Files changed (347)
  1. package/assets/chrome/ios-statusbar-comparison-reference.jpg +0 -0
  2. package/assets/chrome/ios-statusbar-dark-reference.jpg +0 -0
  3. package/assets/chrome/ios-statusbar-light-reference.jpg +0 -0
  4. package/assets/cursors/macos.svg +4 -0
  5. package/assets/cursors/windows.svg +15 -0
  6. package/assets/devices/ipad-pro-11-m4.json +52 -0
  7. package/assets/devices/iphone-16-pro.json +53 -0
  8. package/assets/devices/macbook-air-13.json +45 -0
  9. package/assets/frames/MacBook Air 13.svg +242 -0
  10. package/assets/frames/Status bar - iPhone.png +0 -0
  11. Menu bar- iPad.png +0 -0
  12. package/assets/frames/iPad Pro M4 11_.png +0 -0
  13. package/assets/frames/iPhone 16 Pro.png +0 -0
  14. package/assets/icons/Cellular Connection.svg +3 -0
  15. package/assets/icons/Union.svg +6 -0
  16. package/assets/icons/Wifi.svg +3 -0
  17. package/assets/icons/battery.svg +5 -0
  18. package/assets/icons/battery_charging.svg +8 -0
  19. package/assets/skill/OPCODE-REFERENCE.md +607 -0
  20. package/assets/skill/README.md +39 -0
  21. package/assets/skill/SKILL.md +453 -468
  22. package/assets/skill/STUDIO-SKILL.md +476 -0
  23. package/assets/skill/references/examples.md +104 -0
  24. package/assets/skill/references/interactive-demo.md +225 -0
  25. package/assets/skill/references/mock-data.md +178 -0
  26. package/dist/abort.d.ts +5 -0
  27. package/dist/abort.js +44 -0
  28. package/dist/action-verifier.d.ts +29 -0
  29. package/dist/action-verifier.js +133 -0
  30. package/dist/agent-action-recovery.d.ts +45 -0
  31. package/dist/agent-action-recovery.js +370 -0
  32. package/dist/agent-message-utils.d.ts +21 -0
  33. package/dist/agent-message-utils.js +77 -0
  34. package/dist/agent-url-utils.d.ts +30 -0
  35. package/dist/agent-url-utils.js +138 -0
  36. package/dist/agent.d.ts +226 -0
  37. package/dist/agent.js +6666 -0
  38. package/dist/ak-tree.d.ts +39 -0
  39. package/dist/ak-tree.js +368 -0
  40. package/dist/alt-text.d.ts +26 -0
  41. package/dist/alt-text.js +55 -0
  42. package/dist/auth-capture.d.ts +17 -0
  43. package/dist/auth-capture.js +164 -0
  44. package/dist/benchmark.d.ts +59 -0
  45. package/dist/benchmark.js +135 -0
  46. package/dist/billing-operation-logging.d.ts +38 -0
  47. package/dist/billing-operation-logging.js +248 -0
  48. package/dist/browser-bar.d.ts +48 -0
  49. package/dist/browser-bar.js +284 -0
  50. package/dist/browser-pool.d.ts +7 -0
  51. package/dist/browser-pool.js +15 -5
  52. package/dist/browser-utils.d.ts +31 -0
  53. package/dist/browser-utils.js +97 -0
  54. package/dist/browser.d.ts +76 -1
  55. package/dist/browser.js +1657 -39
  56. package/dist/capture-alt-text.d.ts +12 -0
  57. package/dist/capture-alt-text.js +52 -0
  58. package/dist/capture-encryption.d.ts +10 -0
  59. package/dist/capture-encryption.js +41 -0
  60. package/dist/capture-language-preflight.d.ts +41 -0
  61. package/dist/capture-language-preflight.js +300 -0
  62. package/dist/capture-llm-page-identity.d.ts +15 -0
  63. package/dist/capture-llm-page-identity.js +128 -0
  64. package/dist/capture-model-resolution.d.ts +9 -0
  65. package/dist/capture-model-resolution.js +21 -0
  66. package/dist/capture-page-identity.d.ts +7 -0
  67. package/dist/capture-page-identity.js +352 -0
  68. package/dist/capture-preset-credentials.d.ts +62 -0
  69. package/dist/capture-preset-credentials.js +184 -0
  70. package/dist/capture-request-plan.d.ts +58 -0
  71. package/dist/capture-request-plan.js +264 -0
  72. package/dist/capture-run-optimizer.d.ts +139 -0
  73. package/dist/capture-run-optimizer.js +863 -0
  74. package/dist/capture-selector-memory.d.ts +31 -0
  75. package/dist/capture-selector-memory.js +345 -0
  76. package/dist/capture-session-profile-encryption.d.ts +2 -0
  77. package/dist/capture-session-profile-encryption.js +22 -0
  78. package/dist/capture-step-timeout.d.ts +10 -0
  79. package/dist/capture-step-timeout.js +30 -0
  80. package/dist/capture-strategy.d.ts +36 -0
  81. package/dist/capture-strategy.js +95 -0
  82. package/dist/capture-studio-sync.d.ts +23 -0
  83. package/dist/capture-studio-sync.js +172 -0
  84. package/dist/capture-surface-contract.d.ts +36 -0
  85. package/dist/capture-surface-contract.js +299 -0
  86. package/dist/capture-transition-engine.d.ts +28 -0
  87. package/dist/capture-transition-engine.js +292 -0
  88. package/dist/capture-variant-state.d.ts +56 -0
  89. package/dist/capture-variant-state.js +182 -0
  90. package/dist/capture-verification.d.ts +35 -0
  91. package/dist/capture-verification.js +95 -0
  92. package/dist/capture-viewport-lock.d.ts +48 -0
  93. package/dist/capture-viewport-lock.js +74 -0
  94. package/dist/circuit-breaker.d.ts +42 -0
  95. package/dist/circuit-breaker.js +119 -0
  96. package/dist/cli-config.d.ts +8 -1
  97. package/dist/cli-config.js +62 -6
  98. package/dist/cli-contract.d.ts +15 -0
  99. package/dist/cli-contract.js +167 -0
  100. package/dist/cli-runner-local.d.ts +12 -0
  101. package/dist/cli-runner-local.js +102 -0
  102. package/dist/cli-runner.d.ts +34 -0
  103. package/dist/cli-runner.js +433 -0
  104. package/dist/cli-utils.d.ts +0 -1
  105. package/dist/cli-utils.js +2 -5
  106. package/dist/cli.js +1005 -252
  107. package/dist/clip-orchestrator.d.ts +148 -0
  108. package/dist/clip-orchestrator.js +957 -0
  109. package/dist/clip-postprocess.d.ts +42 -0
  110. package/dist/clip-postprocess.js +201 -0
  111. package/dist/cookie-dismiss.d.ts +2 -0
  112. package/dist/cookie-dismiss.js +48 -13
  113. package/dist/cost-logging.d.ts +35 -0
  114. package/dist/cost-logging.js +242 -0
  115. package/dist/cost-resolution-monitor.d.ts +16 -0
  116. package/dist/cost-resolution-monitor.js +34 -0
  117. package/dist/credential-templates.d.ts +5 -0
  118. package/dist/credential-templates.js +60 -0
  119. package/dist/cursor-overlay-script.d.ts +6 -0
  120. package/dist/cursor-overlay-script.js +169 -0
  121. package/dist/dom-css-purger.d.ts +65 -0
  122. package/dist/dom-css-purger.js +333 -0
  123. package/dist/dom-font-inliner.d.ts +45 -0
  124. package/dist/dom-font-inliner.js +148 -0
  125. package/dist/dom-patch-resolver.d.ts +52 -0
  126. package/dist/dom-patch-resolver.js +242 -0
  127. package/dist/dom-serializer.d.ts +82 -0
  128. package/dist/dom-serializer.js +378 -0
  129. package/dist/element-capture.d.ts +13 -0
  130. package/dist/element-capture.js +522 -0
  131. package/dist/env-validation.d.ts +5 -0
  132. package/dist/env-validation.js +29 -0
  133. package/dist/execution-schema.d.ts +4423 -0
  134. package/dist/execution-schema.js +507 -0
  135. package/dist/execution-types.d.ts +886 -0
  136. package/dist/execution-types.js +65 -0
  137. package/dist/fonts-loader.d.ts +14 -0
  138. package/dist/fonts-loader.js +55 -0
  139. package/dist/hybrid-navigator.d.ts +138 -0
  140. package/dist/hybrid-navigator.js +468 -0
  141. package/dist/index.d.ts +18 -0
  142. package/dist/index.js +17 -0
  143. package/dist/legacy/agent-action-recovery.d.ts +45 -0
  144. package/dist/legacy/agent-action-recovery.js +370 -0
  145. package/dist/legacy/agent-message-utils.d.ts +21 -0
  146. package/dist/legacy/agent-message-utils.js +77 -0
  147. package/dist/legacy/agent-url-utils.d.ts +30 -0
  148. package/dist/legacy/agent-url-utils.js +138 -0
  149. package/dist/legacy/agent.d.ts +226 -0
  150. package/dist/legacy/agent.js +6666 -0
  151. package/dist/legacy/clip-orchestrator.d.ts +148 -0
  152. package/dist/legacy/clip-orchestrator.js +957 -0
  153. package/dist/legacy/credential-templates.d.ts +5 -0
  154. package/dist/legacy/credential-templates.js +60 -0
  155. package/dist/legacy/hybrid-navigator.d.ts +138 -0
  156. package/dist/legacy/hybrid-navigator.js +468 -0
  157. package/dist/legacy/llm-usage.d.ts +17 -0
  158. package/dist/legacy/llm-usage.js +45 -0
  159. package/dist/legacy/prompt-cache.d.ts +10 -0
  160. package/dist/legacy/prompt-cache.js +24 -0
  161. package/dist/legacy/prompts.d.ts +175 -0
  162. package/dist/legacy/prompts.js +1038 -0
  163. package/dist/legacy/tools.d.ts +4 -0
  164. package/dist/legacy/tools.js +216 -0
  165. package/dist/legacy/video-agent.d.ts +143 -0
  166. package/dist/legacy/video-agent.js +4788 -0
  167. package/dist/legacy/video-observation.d.ts +36 -0
  168. package/dist/legacy/video-observation.js +192 -0
  169. package/dist/legacy/video-planner.d.ts +12 -0
  170. package/dist/legacy/video-planner.js +501 -0
  171. package/dist/legacy/video-prompts.d.ts +37 -0
  172. package/dist/legacy/video-prompts.js +569 -0
  173. package/dist/legacy/video-tools.d.ts +3 -0
  174. package/dist/legacy/video-tools.js +59 -0
  175. package/dist/legacy/video-variant-state.d.ts +29 -0
  176. package/dist/legacy/video-variant-state.js +80 -0
  177. package/dist/legacy/vision-model.d.ts +17 -0
  178. package/dist/legacy/vision-model.js +74 -0
  179. package/dist/llm-healer.d.ts +63 -0
  180. package/dist/llm-healer.js +166 -0
  181. package/dist/llm-provider.d.ts +29 -0
  182. package/dist/llm-provider.js +80 -0
  183. package/dist/llm-usage.d.ts +17 -0
  184. package/dist/llm-usage.js +45 -0
  185. package/dist/logger.d.ts +6 -2
  186. package/dist/logger.js +15 -1
  187. package/dist/mockup-html.d.ts +119 -0
  188. package/dist/mockup-html.js +263 -0
  189. package/dist/mockup.d.ts +187 -0
  190. package/dist/mockup.js +869 -0
  191. package/dist/mouse-animation.d.ts +46 -0
  192. package/dist/mouse-animation.js +114 -0
  193. package/dist/opcode-actions.d.ts +42 -0
  194. package/dist/opcode-actions.js +511 -0
  195. package/dist/opcode-runner.d.ts +51 -0
  196. package/dist/opcode-runner.js +770 -0
  197. package/dist/openrouter-client.d.ts +40 -0
  198. package/dist/openrouter-client.js +16 -0
  199. package/dist/overlay-engine.d.ts +24 -0
  200. package/dist/overlay-engine.js +176 -0
  201. package/dist/overlay-utils.d.ts +14 -0
  202. package/dist/overlay-utils.js +13 -0
  203. package/dist/postcondition.d.ts +16 -0
  204. package/dist/postcondition.js +269 -0
  205. package/dist/posthog.d.ts +4 -0
  206. package/dist/posthog.js +26 -0
  207. package/dist/program-patcher.d.ts +25 -0
  208. package/dist/program-patcher.js +44 -0
  209. package/dist/prompt-cache.d.ts +10 -0
  210. package/dist/prompt-cache.js +24 -0
  211. package/dist/prompts.d.ts +175 -0
  212. package/dist/prompts.js +1038 -0
  213. package/dist/provider-config.d.ts +12 -0
  214. package/dist/provider-config.js +15 -0
  215. package/dist/recovery-chain.d.ts +37 -0
  216. package/dist/recovery-chain.js +350 -0
  217. package/dist/remote-browser.d.ts +215 -0
  218. package/dist/remote-browser.js +360 -0
  219. package/dist/safari-browser-bar.d.ts +15 -0
  220. package/dist/safari-browser-bar.js +95 -0
  221. package/dist/safari-toolbar-asset.d.ts +15 -0
  222. package/dist/safari-toolbar-asset.js +12 -0
  223. package/dist/security.d.ts +21 -0
  224. package/dist/security.js +608 -0
  225. package/dist/selector-resolver.d.ts +34 -0
  226. package/dist/selector-resolver.js +181 -0
  227. package/dist/semantic-resolver.d.ts +35 -0
  228. package/dist/semantic-resolver.js +161 -0
  229. package/dist/server-capture-runtime.d.ts +125 -0
  230. package/dist/server-capture-runtime.js +585 -0
  231. package/dist/server-credit-usage.d.ts +12 -0
  232. package/dist/server-credit-usage.js +41 -0
  233. package/dist/server-posthog.d.ts +2 -0
  234. package/dist/server-posthog.js +16 -0
  235. package/dist/server-project-webhooks.d.ts +59 -0
  236. package/dist/server-project-webhooks.js +123 -0
  237. package/dist/server-screenshot-watermark.d.ts +7 -0
  238. package/dist/server-screenshot-watermark.js +60 -0
  239. package/dist/session-profile.d.ts +86 -0
  240. package/dist/session-profile.js +1536 -0
  241. package/dist/sf-pro-fonts.d.ts +4 -0
  242. package/dist/sf-pro-fonts.js +7 -0
  243. package/dist/sf-pro-symbols.d.ts +1 -0
  244. package/dist/sf-pro-symbols.js +55 -0
  245. package/dist/skill-packaging.d.ts +28 -0
  246. package/dist/skill-packaging.js +169 -0
  247. package/dist/smart-wait.d.ts +27 -0
  248. package/dist/smart-wait.js +81 -0
  249. package/dist/status-bar-l10n.d.ts +14 -0
  250. package/dist/status-bar-l10n.js +177 -0
  251. package/dist/status-bar-render.d.ts +20 -0
  252. package/dist/status-bar-render.js +410 -0
  253. package/dist/status-bar.d.ts +53 -0
  254. package/dist/status-bar.js +620 -0
  255. package/dist/svg-browser-bar.d.ts +33 -0
  256. package/dist/svg-browser-bar.js +206 -0
  257. package/dist/svg-status-bar.d.ts +36 -0
  258. package/dist/svg-status-bar.js +597 -0
  259. package/dist/svg-text.d.ts +61 -0
  260. package/dist/svg-text.js +118 -0
  261. package/dist/tools.d.ts +4 -0
  262. package/dist/tools.js +216 -0
  263. package/dist/types.d.ts +240 -5
  264. package/dist/types.js +23 -1
  265. package/dist/v2/action-verifier.d.ts +29 -0
  266. package/dist/v2/action-verifier.js +133 -0
  267. package/dist/v2/alt-text.d.ts +26 -0
  268. package/dist/v2/alt-text.js +55 -0
  269. package/dist/v2/benchmark.d.ts +59 -0
  270. package/dist/v2/benchmark.js +135 -0
  271. package/dist/v2/capture-strategy.d.ts +30 -0
  272. package/dist/v2/capture-strategy.js +67 -0
  273. package/dist/v2/capture-verification.d.ts +35 -0
  274. package/dist/v2/capture-verification.js +95 -0
  275. package/dist/v2/circuit-breaker.d.ts +42 -0
  276. package/dist/v2/circuit-breaker.js +119 -0
  277. package/dist/v2/cli-runner-local.d.ts +11 -0
  278. package/dist/v2/cli-runner-local.js +91 -0
  279. package/dist/v2/cli-runner.d.ts +34 -0
  280. package/dist/v2/cli-runner.js +300 -0
  281. package/dist/v2/compiler-prompts.d.ts +27 -0
  282. package/dist/v2/compiler-prompts.js +123 -0
  283. package/dist/v2/compiler.d.ts +37 -0
  284. package/dist/v2/compiler.js +147 -0
  285. package/dist/v2/explorer.d.ts +41 -0
  286. package/dist/v2/explorer.js +56 -0
  287. package/dist/v2/index.d.ts +37 -0
  288. package/dist/v2/index.js +31 -0
  289. package/dist/v2/llm-healer.d.ts +62 -0
  290. package/dist/v2/llm-healer.js +166 -0
  291. package/dist/v2/llm-provider.d.ts +29 -0
  292. package/dist/v2/llm-provider.js +80 -0
  293. package/dist/v2/opcode-runner.d.ts +47 -0
  294. package/dist/v2/opcode-runner.js +634 -0
  295. package/dist/v2/overlay-engine.d.ts +24 -0
  296. package/dist/v2/overlay-engine.js +150 -0
  297. package/dist/v2/postcondition.d.ts +16 -0
  298. package/dist/v2/postcondition.js +249 -0
  299. package/dist/v2/program-patcher.d.ts +25 -0
  300. package/dist/v2/program-patcher.js +44 -0
  301. package/dist/v2/recovery-chain.d.ts +30 -0
  302. package/dist/v2/recovery-chain.js +368 -0
  303. package/dist/v2/schema.d.ts +2580 -0
  304. package/dist/v2/schema.js +295 -0
  305. package/dist/v2/selector-resolver.d.ts +34 -0
  306. package/dist/v2/selector-resolver.js +181 -0
  307. package/dist/v2/semantic-resolver.d.ts +35 -0
  308. package/dist/v2/semantic-resolver.js +161 -0
  309. package/dist/v2/smart-wait.d.ts +27 -0
  310. package/dist/v2/smart-wait.js +81 -0
  311. package/dist/v2/types.d.ts +444 -0
  312. package/dist/v2/types.js +19 -0
  313. package/dist/v2/web-playwright-local.d.ts +69 -0
  314. package/dist/v2/web-playwright-local.js +392 -0
  315. package/dist/version.d.ts +1 -0
  316. package/dist/version.js +5 -0
  317. package/dist/video-agent.d.ts +143 -0
  318. package/dist/video-agent.js +4788 -0
  319. package/dist/video-observation.d.ts +36 -0
  320. package/dist/video-observation.js +192 -0
  321. package/dist/video-planner.d.ts +12 -0
  322. package/dist/video-planner.js +501 -0
  323. package/dist/video-prompts.d.ts +37 -0
  324. package/dist/video-prompts.js +554 -0
  325. package/dist/video-tools.d.ts +3 -0
  326. package/dist/video-tools.js +59 -0
  327. package/dist/video-variant-state.d.ts +29 -0
  328. package/dist/video-variant-state.js +80 -0
  329. package/dist/vision-model.d.ts +17 -0
  330. package/dist/vision-model.js +74 -0
  331. package/dist/web-playwright-local.d.ts +126 -0
  332. package/dist/web-playwright-local.js +819 -0
  333. package/dist/ws-auth.d.ts +20 -0
  334. package/dist/ws-auth.js +70 -0
  335. package/dist/ws-broadcast.d.ts +34 -0
  336. package/dist/ws-broadcast.js +85 -0
  337. package/dist/ws-connection-limits.d.ts +12 -0
  338. package/dist/ws-connection-limits.js +44 -0
  339. package/dist/ws-handler-utils.d.ts +32 -0
  340. package/dist/ws-handler-utils.js +139 -0
  341. package/dist/ws-handler.d.ts +10 -0
  342. package/dist/ws-handler.js +1793 -0
  343. package/dist/ws-metrics-server.d.ts +9 -0
  344. package/dist/ws-metrics-server.js +31 -0
  345. package/dist/ws-server.d.ts +9 -0
  346. package/dist/ws-server.js +92 -0
  347. package/package.json +142 -71
@@ -1,575 +1,560 @@
1
1
  ---
2
2
  name: autokap-preset
3
- description: Generate AutoKap preset configurations for automated screenshot, clip, and video capture of web applications
3
+ description: >
4
+ Generate AutoKap capture programs — deterministic opcode sequences for automated
5
+ screenshot, clip, and interactive demo capture of web apps. Use when: creating or updating presets,
6
+ adding data-ak attributes for capture automation, or debugging failed capture programs.
4
7
  metadata:
5
8
  author: AutoKap
6
- version: 1.0.0
9
+ version: 2.3.0
7
10
  ---
8
11
 
9
12
  # AutoKap Preset Creation Skill
10
13
 
11
- ## What is AutoKap
14
+ ## Mental Model
12
15
 
13
- AutoKap is a screenshot and video capture automation platform. It uses an AI agent inside a headless browser (Playwright) to navigate web applications and capture screenshots, clips (short animated GIFs/MP4s), or demo videos.
16
+ AutoKap is **not** a freeform runtime navigation agent. Your job is to inspect the user's codebase, add stable selectors where needed, and generate a **deterministic program** (a JSON sequence of browser opcodes) that the AutoKap CLI executes locally via Playwright.
14
17
 
15
- A **preset** defines what to capture: which pages, which viewports/devices, which languages and themes, and how to navigate to the right state. A single preset can generate dozens of assets, one per combination of (page x target x language x theme).
18
+ Normal navigation and interaction stay deterministic. Runtime AI is limited to narrow fallback tasks such as overlay dismissal fallback, capture verification, alt text generation, and the last-resort healer. Do **not** design flows that depend on an LLM improvising navigation at runtime.
16
19
 
17
- ## Your Task
20
+ This installed skill is the **source of truth** for the AutoKap contract: opcode schema, login rules, variant handling, persistence, and validation. The copied prompt from the AutoKap dashboard is only the **preset-specific brief** (project URL, variants, template goal, mock data guidance, etc.).
18
21
 
19
- You are being asked to create one or more AutoKap presets for the user's project. Your job is to:
22
+ ## When To Use This Skill
20
23
 
21
- 1. **Analyze the project**: look at the codebase (routes, pages, components, UI structure) to understand what the application looks like and how it's organized
22
- 2. **Generate a complete preset configuration** in JSON format that AutoKap can import directly
23
- 3. **Write precise navigation prompts**: the capture agent sees a screenshot + accessibility tree and follows your instructions to reach the right page state
24
+ - User wants to capture screenshots, clips, or interactive demos of their web app
25
+ - User asks to create or update an AutoKap preset
26
+ - User needs `data-ak` attributes added to UI elements for capture
27
+ - User is debugging a failed capture program
24
28
 
25
- You know this project intimately. Use that knowledge to write prompts that are specific to the actual UI — reference real navigation elements, real page routes, real UI components.
29
+ ## Before Generating Anything
26
30
 
27
- ### Variants (CRITICAL)
31
+ 1. **Inspect the codebase first** — Read the relevant routes, layouts, auth checks, theme/locale system, and the components you may need to tag. Do not start by guessing opcodes.
32
+ 2. **Decide whether auth is required** — If any target route is protected, the program must begin with the canonical login flow. Never assume the user is already logged in.
33
+ 3. **Understand how locale and theme really work** — Prefer storage-based `SET_LOCALE` / `SET_THEME` with `"$variant"` when the app supports it. If the app uses URL-based locale routing, reflect that in `NAVIGATE`.
34
+ 4. **Understand the UI states** — Know which conditions make modals, dropdowns, tabs, tables, empty states, or dashboards appear before you write selectors or opcodes.
35
+ 5. **Use planning mode if the assistant supports it** — Make a short plan before editing code or generating the program.
36
+ 6. **Ask clarifying questions when navigation is ambiguous** — If the codebase or prompt does not clearly reveal how to reach a state, ask instead of inventing UI details.
37
+ 7. **Do not break the one-shot flow** — When CLI access is available, persist the preset so the user can run `autokap run <preset-id>` immediately. Manual JSON output is a fallback, not the default.
28
38
 
29
- The prompt will include a "Variants" section specifying the exact `langs`, `themes`, and `targets` to use. **You MUST copy these values exactly into the preset config** — do not change, add, or remove any value. The user has already chosen their desired languages, themes, and device viewports. Your job is only to write the capture prompts (what to navigate to, what to capture).
39
+ ## Quick Workflow
30
40
 
31
- **Targets and mockupOptions**: The targets include `viewport`, `deviceFrame`, and `mockupOptions` (orientation, status bar, dock, safe areas, etc.) that the user has specifically configured. Copy each target object exactly as-is into the preset's `targets` array. Do not modify `mockupOptions`: if the user chose landscape orientation, keep landscape. If they configured a status bar, keep it.
41
+ 1. **Understand the capture goal**: What pages/states should be captured? Screenshot, clip, or interactive demo? Which viewports, locales, themes, and mockups matter?
42
+ 2. **Inspect the implementation** — Confirm routes, auth, theme, locale, and any dynamic UI state in the codebase.
43
+ 3. **Add `data-ak` attributes** — Tag every element the opcodes must interact with using stable selectors.
44
+ 4. **Choose media mode and variants** — Set `mediaMode` and define the exact viewport/locale/theme combinations.
45
+ 5. **Generate the `ExecutionProgram`** — Write deterministic opcodes and explicit postconditions.
46
+ 6. **Persist the preset** — Prefer `autokap preset create|update` so the program lives inside `config.program`.
47
+ 7. **Validate it** — Run the recommended checks and at least one real `autokap run <preset-id>` before considering the work done.
48
+ 8. **Integrate dev links** — Run `autokap preset info <preset-id>` and use the structured output to wire endpoint URLs into the codebase (`<img>`, `<video>`, `<iframe>`, or component props). This is part of the one-shot flow — do not stop after validation.
32
49
 
33
- If the variants include multiple languages, you must provide `langInstructions` based on how the app actually switches language (look at the i18n setup in the code).
34
- If the variants include both light and dark themes, you must provide `themeInstructions` based on the actual theme toggle mechanism in the code.
50
+ ## Bundled References
35
51
 
36
- ---
52
+ Load these only when the request actually needs them:
37
53
 
38
- ## PresetConfig Schema
54
+ - **Opcode parameters** — [OPCODE-REFERENCE.md](OPCODE-REFERENCE.md)
55
+ - **Interactive demos** — [references/interactive-demo.md](references/interactive-demo.md)
56
+ - **Mock data** — [references/mock-data.md](references/mock-data.md)
57
+ - **Complete examples** — [references/examples.md](references/examples.md)
39
58
 
40
- The output must conform to this TypeScript schema. All fields are documented below.
59
+ Keep the core `SKILL.md` for the non-negotiable contract. Reach for the
60
+ references only after you know which mode or advanced feature the user needs.
41
61
 
42
- ```typescript
43
- /** Top-level preset object for import */
44
- interface PresetImport {
45
- name: string; // Display name (e.g. "Homepage Hero", "Pricing Page")
46
- description: string; // What this preset captures and why
47
- config: PresetConfig;
48
- }
62
+ ## Adding `data-ak` Attributes
49
63
 
50
- /** The full preset configuration */
51
- type PresetConfig = {
52
- /** The application URL (root URL of the project) */
53
- url?: string;
54
-
55
- /**
56
- * Main capture prompt — natural language instructions for the agent.
57
- * Ignored when `pages` is defined (each page has its own prompt).
58
- */
59
- prompt: string;
60
-
61
- /** Device/viewport targets for capture */
62
- targets: CaptureTarget[];
63
-
64
- /** Output resolution multiplier. Default: 2 */
65
- outputScale?: number;
66
-
67
- /** Languages to capture. E.g. ["en", "fr", "de"] */
68
- langs: string[];
69
-
70
- /** Themes to capture */
71
- themes: Array<"light" | "dark">;
72
-
73
- /** Max agent iterations before giving up. Default: 60 */
74
- maxIterations: number;
75
-
76
- /**
77
- * How to switch language in the UI.
78
- * Must describe actual UI controls, not abstract instructions.
79
- * Example: "Click the language dropdown in the footer and select the target language"
80
- * If the app uses URL-based locale (e.g. /fr/pricing), say so.
81
- */
82
- langInstructions?: string;
83
-
84
- /**
85
- * How to switch theme in the UI.
86
- * Must describe actual UI controls.
87
- * Example: "Click the sun/moon icon in the top-right header to toggle dark mode"
88
- */
89
- themeInstructions?: string;
90
-
91
- /** General navigation instructions applied to ALL captures (prepended to each page prompt) */
92
- navigationInstructions?: string;
93
-
94
- /**
95
- * Login credentials. The agent handles login automatically before running page prompts.
96
- * Only email/password auth is supported. OAuth, SSO, MFA, magic links are NOT supported.
97
- */
98
- credentials?: {
99
- loginUrl?: string; // Login page URL
100
- email?: string;
101
- password?: string;
102
- };
64
+ For every element you interact with in the opcodes, add a `data-ak="descriptive-name"` attribute:
103
65
 
104
- /**
105
- * Named capture pages — each page runs its own agent session with its own prompt.
106
- * When defined, the top-level `prompt` field is ignored.
107
- * CRITICAL: Each page runs independently with NO memory of other pages.
108
- */
109
- pages?: CapturePage[];
110
-
111
- /**
112
- * Isolated UI elements to extract from captured pages.
113
- * Each element is cropped from the full page capture.
114
- */
115
- elements?: IsolatedElement[];
116
-
117
- /**
118
- * Network interception: replace API responses with LLM-generated mock data.
119
- * Useful for showing populated UIs without real data.
120
- */
121
- mockData?: MockDataConfig;
122
-
123
- /** Capture mode. Default: "screenshot" */
124
- captureMode?: "screenshot" | "clip" | "video";
125
-
126
- /** Clip definitions (when captureMode === "clip") */
127
- clips?: ClipDefinition[];
128
-
129
- /** Clip export options */
130
- clipOptions?: ClipOptions;
131
- };
132
-
133
- interface CaptureTarget {
134
- /** Unique identifier for this target */
135
- id: string;
136
- /** Display label */
137
- label: string;
138
- /** Browser viewport dimensions */
139
- viewport: { width: number; height: number };
140
- /**
141
- * Optional device frame for mockup rendering.
142
- * Available frames: "iphone-16-pro", "ipad-pro-11-m4", "ipad-air-m4",
143
- * "macbook-air-13", "macbook-pro-16"
144
- */
145
- deviceFrame?: string;
146
- /** Mockup rendering options */
147
- mockupOptions?: MockupOptions;
148
- }
66
+ ```jsx
67
+ // Before
68
+ <button onClick={handleLogin}>Sign in</button>
69
+ <input type="email" placeholder="Email" />
149
70
 
150
- interface MockupOptions {
151
- orientation?: "portrait" | "landscape";
152
- outputScale?: number;
153
- showStatusBar?: boolean;
154
- statusBar?: StatusBarConfig;
155
- showDock?: boolean; // Mac only
156
- dockScale?: number; // Mac only
157
- dockMode?: "integrated" | "only-app"; // Mac only
158
- }
71
+ // After
72
+ <button data-ak="login-btn" onClick={handleLogin}>Sign in</button>
73
+ <input data-ak="login-email" type="email" placeholder="Email" />
74
+ ```
159
75
 
160
- interface StatusBarConfig {
161
- time?: string;
162
- signalStrength?: number;
163
- wifiStrength?: number;
164
- batteryLevel?: number;
165
- batteryCharging?: boolean;
166
- colorScheme?: "light" | "dark";
167
- autoLocale?: boolean; // Auto-adapt status bar to capture language (default: true)
168
- }
76
+ **Rules:**
77
+ - Use kebab-case: `login-btn`, `pricing-table`, `theme-toggle`
78
+ - Add ONLY to elements that opcodes interact with
79
+ - For dynamic lists, add `data-ak` to the container AND item template:
80
+ ```jsx
81
+ <ul data-ak="project-list">
82
+ {projects.map(p => <li data-ak="project-item" key={p.id}>{p.name}</li>)}
83
+ </ul>
84
+ ```
85
+ Target: `[data-ak="project-list"] [data-ak="project-item"]:nth-child(2)`
86
+ - Zero runtime cost and no security implications
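The container-plus-item rule above can be sketched as a small selector builder. This is a minimal sketch, not AutoKap code: `akSelector` and `akListItemSelector` are hypothetical helper names, and the kebab-case regex is one assumed reading of the naming rule.

```typescript
// Hypothetical helpers for illustration only; they are not part of the AutoKap package.

// Assumed interpretation of "kebab-case": lowercase alphanumeric words joined by single hyphens.
const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Build an attribute selector for a single tagged element.
function akSelector(name: string): string {
  if (!KEBAB_CASE.test(name)) {
    throw new Error(`data-ak name must be kebab-case: "${name}"`);
  }
  return `[data-ak="${name}"]`;
}

// Build a selector for the n-th item (1-based, as in CSS :nth-child) inside a tagged list.
function akListItemSelector(list: string, item: string, position: number): string {
  if (!Number.isInteger(position) || position < 1) {
    throw new Error(`position must be a positive integer, got ${position}`);
  }
  return `${akSelector(list)} ${akSelector(item)}:nth-child(${position})`;
}

// Reproduces the target shown in the rule above.
const secondProject = akListItemSelector("project-list", "project-item", 2);
console.log(secondProject);
```

Note that `:nth-child` counts all sibling elements, so this form only matches reliably when the tagged items are the only children of the container.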
169
87
 
170
- interface CapturePage {
171
- /**
172
- * Unique slug identifier. MUST match: /^[a-z0-9_-]+$/
173
- * Use descriptive slugs: "pricing", "dashboard-overview", "settings-billing"
174
- * AVOID generic slugs: "main", "page1", "page2", "screen1"
175
- */
176
- id: string;
177
- /** Display name shown in the UI */
178
- name: string;
179
- /**
180
- * Navigation prompt for the agent to reach this page/state.
181
- * MUST be self-contained — the agent has NO memory of other pages.
182
- * Include: where to navigate, what to click, what state to reach.
183
- */
184
- prompt: string;
185
- /** Optional URL override if this page is at a different URL than the preset root */
186
- url?: string;
187
- }
88
+ ## ExecutionProgram Schema
188
89
 
189
- interface IsolatedElement {
190
- /** Element name (used in output filenames) */
191
- name: string;
192
- /**
193
- * Description of the DOM element to crop.
194
- * Must be visually identifiable: the agent matches against what it sees.
195
- * GOOD: "The pricing table with 3 plan columns"
196
- * GOOD: "The blue CTA button in the hero section"
197
- * BAD: "The main content" (too vague)
198
- */
199
- description: string;
200
- /** Source page id: when the preset has multiple pages, specify which page this element belongs to */
201
- sourcePageId?: string;
202
- /** Outscale config for padding around the element */
203
- outscale?: OutscaleConfig;
90
+ ```typescript
91
+ interface ExecutionProgram {
92
+ presetId: string; // Unique slug (e.g. "homepage-hero")
93
+ programVersion: number; // Always 1 for new programs
94
+ mediaMode: 'screenshot' | 'clip' | 'dom';
95
+ baseUrl: string; // Root URL of the application
96
+ variants: VariantSpec[]; // Viewport/locale/theme combinations
97
+ preconditions: {
98
+ // Auth bootstrap auto-adapts to whatever is configured on the preset.
99
+ // If the preset has stored auth cookies, the runtime injects them into
100
+ // the browser context. If it has stored credentials (email/password), they
101
+ // are substituted into {{email}}/{{password}}/{{loginUrl}} placeholders
102
+ // inside opcodes — the program performs the UI login itself.
103
+ // If neither is set, the run is anonymous. Both can coexist.
104
+ credentialsId?: string; // Optional metadata pointer
105
+ };
106
+ steps: ExecutionOpcode[];
107
+ artifactPlan: {
108
+ mediaMode: 'screenshot' | 'clip' | 'dom';
109
+ cursorTheme?: 'minimal' | 'macos' | 'windows'; // Clip only. Default: 'minimal'
110
+ format?: {
111
+ clipFormat?: 'gif' | 'mp4' | 'both'; // Default: 'gif'
112
+ screenshotFormat?: 'png' | 'jpeg'; // Default: 'png'
113
+ };
114
+ applyMockup?: boolean; // Apply device frame mockup
115
+ applyStatusBar?: boolean; // Add status bar to mockup
116
+ };
117
+ compileFingerprint: string; // e.g. "v1-homepage"
118
+ variantFingerprint?: string; // "langs:en,fr|themes:dark,light"
119
+ compiledAt: string; // ISO timestamp
120
+ compiledWith?: string; // "ai-assistant"
204
121
  }
205
122
 
206
- interface OutscaleConfig {
207
- /** Uniform padding on all 4 sides (pixels) */
208
- padding?: number;
209
- paddingTop?: number;
210
- paddingRight?: number;
211
- paddingBottom?: number;
212
- paddingLeft?: number;
213
- /** Percentage-based padding relative to element dimensions (0-100) */
214
- paddingPercent?: number;
215
- /** Background fill color. Default: transparent */
216
- backgroundColor?: string;
123
+ interface VariantSpec {
124
+ id: string; // e.g. "desktop-en-light"
125
+ viewport: { width: number; height: number };
126
+ deviceScaleFactor?: number;
127
+ locale?: string; // BCP-47 (e.g. "en", "fr")
128
+ theme?: 'light' | 'dark';
129
+ targetId?: string; // Stable preset target id (e.g. "desktop")
130
+ targetLabel?: string; // Human-readable label (e.g. "Desktop")
131
+ deviceFrame?: string; // e.g. "iphone-16-pro" for mockup
217
132
  }
133
+ ```
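To illustrate how `variants` and `variantFingerprint` might relate, here is a minimal sketch that expands targets, locales, and themes into `VariantSpec` entries. The `Target` type and the `expandVariants` and `buildVariantFingerprint` helpers are hypothetical and not part of AutoKap. The id format follows the `desktop-en-light` example, and the fingerprint format (including alphabetical sorting) is an assumption inferred from the `langs:en,fr|themes:dark,light` example.

```typescript
// Hypothetical helpers for illustration only; not an AutoKap API.
interface VariantSpec {
  id: string;
  viewport: { width: number; height: number };
  locale?: string;
  theme?: 'light' | 'dark';
  targetId?: string;
  targetLabel?: string;
}

// Assumed shape of a capture target (id, label, viewport).
interface Target {
  id: string;
  label: string;
  viewport: { width: number; height: number };
}

// Build one variant per target x locale x theme combination.
function expandVariants(
  targets: Target[],
  locales: string[],
  themes: Array<'light' | 'dark'>,
): VariantSpec[] {
  const variants: VariantSpec[] = [];
  for (const target of targets) {
    for (const locale of locales) {
      for (const theme of themes) {
        variants.push({
          id: `${target.id}-${locale}-${theme}`,
          viewport: target.viewport,
          locale,
          theme,
          targetId: target.id,
          targetLabel: target.label,
        });
      }
    }
  }
  return variants;
}

// Assumption: values are sorted so the fingerprint is order-independent.
function buildVariantFingerprint(locales: string[], themes: string[]): string {
  const langs = locales.slice().sort().join(',');
  const sortedThemes = themes.slice().sort().join(',');
  return `langs:${langs}|themes:${sortedThemes}`;
}
```

One desktop target with two locales and two themes yields four variants, which is why locale and theme switching is driven by the `"$variant"` placeholder rather than hardcoded values.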
218
134
 
219
- interface MockDataConfig {
220
- enabled: boolean;
221
- /**
222
- * Specific description of what data to generate.
223
- * GOOD: "5 customer invoices with amounts between $50-500, mix of paid/pending/overdue statuses"
224
- * BAD: "Some sample data"
225
- */
226
- description: string;
227
- /** Explicitly target specific API endpoints to mock */
228
- endpoints?: { url: string; method?: string }[];
229
- }
135
+ ## Opcode Quick Reference
136
+
137
+ 24 opcodes available. For full parameter documentation, see [OPCODE-REFERENCE.md](OPCODE-REFERENCE.md).
138
+
139
+ | Kind | Selector? | Key Params | Typical Postcondition | Notes |
140
+ |------|-----------|-----------|----------------------|-------|
141
+ | `NAVIGATE` | no | `url` | `route_matches` | Always first step |
142
+ | `DISMISS_OVERLAYS` | no | — | `overlay_dismissed` | Always after NAVIGATE |
143
+ | `CLICK` | yes | `button?` | `element_visible` / `route_matches` | Postcondition = what CHANGED |
144
+ | `TYPE` | yes | `text`, `clearFirst` | `any_change` | `{{email}}` / `{{password}}` for creds |
145
+ | `PRESS_KEY` | no | `key` | `any_change` | `"Enter"`, `"Escape"`, `"Tab"`, etc. |
146
+ | `WAIT_FOR` | yes* | `state` | `element_visible` | `"visible"` or `"attached"` (DOM only) |
147
+ | `SCROLL` | no | `direction`, `targetSelector?`, `amount?` | `element_visible` | Use `targetSelector` for precise scroll |
148
+ | `HOVER` | yes | — | `element_visible` | Capture immediately after for hover state |
149
+ | `SELECT_OPTION` | yes | `optionLabel` / `optionValue` / `optionIndex` | `text_contains` | **Native `<select>` only.** Custom dropdowns: use CLICK sequence |
150
+ | `CHECK` | yes | `checked` | `always` | Idempotent. Safer than CLICK for checkboxes |
151
+ | `DOUBLE_CLICK` | yes | — | `element_visible` / `any_change` | Inline editing, text selection |
152
+ | `SET_LOCALE` | no | `locale`, `method`, `storageHints?` | `always` | Use `"$variant"`. Prefer `method: "storage"` |
153
+ | `SET_THEME` | no | `theme`, `method`, `storageHints?` | `always` | Use `"$variant"`. Prefer `method: "storage"` |
154
+ | `ASSERT_ROUTE` | no | `urlPattern` | `route_matches` | Validation checkpoint |
155
+ | `ASSERT_SURFACE` | no | `selectors[]`, `matchAll` | `always` | Validation checkpoint |
156
+ | `CAPTURE_SCREENSHOT` | no | `captureId`, `captureName`, `elementSelector?` | `always` | `elementSelector` for element-level crop |
157
+ | `CAPTURE_DOM` | no | `stateName`, `selector?` | `always` | Interactive Demo state — see [Interactive Demo Workflow](#interactive-demo-workflow). `mediaMode: "dom"` only. Add `selector` to capture a focused subtree instead of the whole page. |
158
+ | `CAPTURE_FRAGMENT` | yes | `fragmentName`, `parentState`, `selector`, `variantName?`, `triggerSelector?`, `mountStrategy?` | `always` | Interactive Demo fragment (modal/popover/dropdown/local subtree) — see [Fragments](#fragments-and-local-interactions). Mounted on top of `parentState` by the player. Capture the same fragment under multiple `variantName`s to enable in-place swap (e.g. background colour change). |
159
+ | `BEGIN_CLIP` | no | `clipId`, `clipName` | `always` | Start recording |
160
+ | `END_CLIP` | no | `clipId`, `clipName` | `always` | Stop recording. Same `clipId` as BEGIN_CLIP |
161
+ | `CLONE_ELEMENT` | yes | `sourceSelector`, `containerSelector`, `count` | `always` | **Non-blocking.** Duplicate a template element N times |
162
+ | `INJECT_MOCK_DATA` | yes | `groupName` + clone fields and/or trigger fields | `always` | **Non-blocking.** Apply both clone (instant visual) AND trigger (survives re-renders) whenever possible |
163
+ | `REMOVE_ELEMENT` | yes | `selector` | `always` | **Non-blocking.** Remove empty-state placeholders / unwanted nodes |
164
+ | `SET_ATTRIBUTE` | yes | `attribute`, `value` | `always` | **Non-blocking.** Set attribute (e.g. `src` on an `<img>`) |
165
+
166
+ *WAIT_FOR requires `selector` or semantic `target`.
167
+
168
+ ## Postcondition Types
169
+
170
+ | Type | Fields | When to use |
171
+ |------|--------|-------------|
172
+ | `route_matches` | `pattern` | After NAVIGATE, CLICK that changes page |
173
+ | `element_visible` | `selector`, `waitMs?` | After CLICK/HOVER that opens modal/dropdown/tooltip |
174
+ | `element_absent` | `selector` | After CLICK that closes modal |
175
+ | `text_contains` | `selector`, `text` | After TYPE, SELECT_OPTION |
176
+ | `overlay_dismissed` | — | After DISMISS_OVERLAYS |
177
+ | `screenshot_stable` | `threshold?` (0-1), `waitMs?` | Before CAPTURE_SCREENSHOT when page has animations |
178
+ | `any_change` | — | After TYPE, PRESS_KEY, DOUBLE_CLICK (soft check) |
179
+ | `always` | — | CAPTURE_SCREENSHOT, SET_LOCALE, SET_THEME, CHECK, BEGIN/END_CLIP |
180
+
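The postcondition table can be turned into a small pre-flight check. The sketch below is an assumption-laden illustration: `REQUIRED_FIELDS` and `missingPostconditionFields` are hypothetical names, and only the non-optional fields from the table are treated as required.

```typescript
// Hypothetical lint helper; field names come from the table above,
// everything else is illustrative and not an AutoKap API.
const REQUIRED_FIELDS: { [type: string]: string[] } = {
  route_matches: ['pattern'],
  element_visible: ['selector'],
  element_absent: ['selector'],
  text_contains: ['selector', 'text'],
  overlay_dismissed: [],
  screenshot_stable: [],
  any_change: [],
  always: [],
};

// Returns the names of required fields that are missing or empty.
// An unknown postcondition type is reported as a problem with "type".
function missingPostconditionFields(
  pc: { type: string; [key: string]: unknown },
): string[] {
  if (!Object.prototype.hasOwnProperty.call(REQUIRED_FIELDS, pc.type)) {
    return ['type'];
  }
  return REQUIRED_FIELDS[pc.type].filter((field) => {
    const value = pc[field];
    return typeof value !== 'string' || value.length === 0;
  });
}
```

Running a check like this over every step before persisting a preset would catch a `text_contains` postcondition that has a `selector` but no `text`.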
181
+ ## What Presets Can and Cannot Do
182
+
183
+ ### Capabilities
184
+
185
+ - **Multi-viewport**: desktop, tablet, mobile in a single program
186
+ - **Multi-locale**: switch languages via SET_LOCALE with `"$variant"` placeholder
187
+ - **Multi-theme**: switch light/dark via SET_THEME with `"$variant"` placeholder
188
+ - **Authenticated flows**: login via TYPE with `{{email}}`/`{{password}}` credential placeholders
189
+ - **Element-level capture**: crop screenshot to a specific component with `elementSelector`
190
+ - **Clips**: record interactions as GIF, MP4, or both via BEGIN_CLIP/END_CLIP
191
+ - **Device mockups**: wrap screenshots in device frames (`applyMockup: true`)
192
+ - **Status bars**: add device status bar to mockups (`applyStatusBar: true`)
193
+ - **Hover states**: capture tooltips, dropdown menus via HOVER + CAPTURE_SCREENSHOT
194
+ - **Scroll-to-element**: scroll specific sections into view with SCROLL
195
+ - **Right-click menus**: context menus via CLICK with `button: "right"`
196
+ - **Native selects**: option selection via SELECT_OPTION
197
+ - **Checkboxes/toggles**: idempotent state control via CHECK
198
+
199
+ ### Limitations
200
+
201
+ - **No file uploads**: native OS file dialogs cannot be automated
202
+ - **No complex drag-and-drop**: simple clicks only, no drag gestures
203
+ - **No multi-tab workflows**: single page context per execution
204
+ - **No native OS dialogs**: print, save-as, permission prompts are inaccessible
205
+ - **No CAPTCHAs**: cannot solve CAPTCHAs or bot detection challenges
206
+ - **No real-time dynamic data**: data that changes between runs produces inconsistent captures
207
+ - **No cross-origin iframes**: cannot interact with content from different origins
208
+ - **No browser extensions**: extension UIs are not accessible via Playwright
209
+
210
+ ## Anti-Patterns
211
+
212
+ | Do NOT | Do instead |
213
+ |--------|-----------|
214
+ | Use SELECT_OPTION for custom dropdowns (Radix, Headless UI, MUI) | CLICK to open + WAIT_FOR options + CLICK the option |
215
+ | Use CLICK for checkboxes | CHECK (idempotent, sets state directly) |
216
+ | Omit postconditions | Every opcode needs a postcondition describing the expected result |
217
+ | Hardcode locale/theme values in SET_LOCALE/SET_THEME | Use `"$variant"` — runtime substitutes the current variant's value |
218
+ | Use `method: "ui_interaction"` for locale/theme | Use `method: "storage"` — instant, reliable, no UI dependency |
219
+ | Tell the user to save a local `program.json` file | Persist the preset via CLI; only output full JSON as a fallback if the CLI/server write fails |
220
+ | Guess CSS selectors | Add `data-ak` attributes to the code first |
221
+ | Skip WAIT_FOR after page transitions | Add WAIT_FOR with `waitMs: 10000` after login, navigation, modal open |
222
+ | Use NAVIGATE or SCROLL inside a clip to reach a clickable target | CLICK the link/button — cursor animates to it naturally |
223
+ | Skip `cursorTheme` in clip `artifactPlan` | Set `cursorTheme: "macos"` or `"windows"` for polished recordings |
224
+
225
+ ## Clip Workflow
226
+
227
+ For recording video clips of interactions:
228
+
229
+ 1. Set `mediaMode: "clip"` on the program
230
+ 2. Set `artifactPlan.format.clipFormat` to `"gif"`, `"mp4"`, or `"both"`
231
+ 3. Set `artifactPlan.cursorTheme` to `"minimal"`, `"macos"`, or `"windows"` (default: `"minimal"`)
232
+ 4. Put navigation and setup steps **before** BEGIN_CLIP (they won't be recorded)
233
+ 5. Place `BEGIN_CLIP` with a `clipId` and `clipName` to start recording
234
+ 6. All interaction steps between BEGIN_CLIP and END_CLIP are recorded
235
+ 7. Place `END_CLIP` with the **same `clipId`** to stop recording
236
+
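The pairing rules in the steps above can be checked mechanically. This is a minimal sketch, assuming steps are plain objects with `kind` and an optional `clipId`; `ClipStep` and `checkClipPairs` are hypothetical names, and the NAVIGATE warning reflects the guidance in this document rather than a documented runtime error.

```typescript
// Hypothetical validator; not part of the AutoKap CLI or runtime.
interface ClipStep {
  kind: string;
  clipId?: string;
}

// Returns human-readable problems with BEGIN_CLIP / END_CLIP usage.
function checkClipPairs(steps: ClipStep[]): string[] {
  const errors: string[] = [];
  let openClip: string | null = null;

  for (let i = 0; i < steps.length; i++) {
    const step = steps[i];
    if (step.kind === 'BEGIN_CLIP') {
      if (openClip !== null) {
        errors.push(`step ${i}: BEGIN_CLIP while clip "${openClip}" is still open`);
      }
      openClip = step.clipId === undefined ? '' : step.clipId;
    } else if (step.kind === 'END_CLIP') {
      if (openClip === null) {
        errors.push(`step ${i}: END_CLIP without a matching BEGIN_CLIP`);
      } else if (step.clipId !== openClip) {
        errors.push(`step ${i}: END_CLIP clipId does not match "${openClip}"`);
      }
      openClip = null;
    } else if (step.kind === 'NAVIGATE' && openClip !== null) {
      // Setup and navigation belong before BEGIN_CLIP.
      errors.push(`step ${i}: NAVIGATE inside clip "${openClip}"`);
    }
  }

  if (openClip !== null) {
    errors.push(`clip "${openClip}" is never closed`);
  }
  return errors;
}
```

An empty result means every BEGIN_CLIP has an END_CLIP with the same `clipId` and no NAVIGATE falls inside a recording.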
237
+ ### Cursor animation (automatic)
238
+
239
+ During clip recording, the runtime automatically animates a visible cursor with **cubic Bézier curves** for every CLICK, HOVER, TYPE, CHECK, DOUBLE_CLICK, and SCROLL opcode. The cursor moves smoothly (350–900 ms depending on distance), with randomized control points and micro-jitter to look human. You do **not** need to script cursor movement yourself — just emit the right interaction opcodes and the runtime handles the animation.
240
+
241
+ ### Designing human-like clip programs
242
+
243
+ Because the cursor is visible during recording, **how** you reach a target matters as much as reaching it:
244
+
245
+ | Do | Don't | Why |
246
+ |---|---|---|
247
+ | **CLICK** on a nav link / anchor to scroll to a section | SCROLL blindly then capture | The cursor animates to the link before clicking — the viewer sees intentional navigation |
248
+ | **CLICK** on an in-page link to change route | NAVIGATE to the new URL mid-clip | NAVIGATE is an instant browser jump with no cursor motion — it looks like a cut, not a flow |
249
+ | **HOVER** → pause → **CLICK** for important interactions | CLICK without HOVER | HOVER gives the viewer time to see where the cursor is heading and creates anticipation |
250
+ | Keep the clip short and focused (5–15 interactions) | Record an entire user journey in one clip | Long clips lose the viewer. Split into multiple clips if needed |
251
+ | Place WAIT_FOR after CLICK that triggers a route change | Immediately interact after navigation | Gives the page time to render and the viewer time to register the new state |
252
+ | Add a `screenshot_stable` WAIT_FOR before BEGIN_CLIP when DISMISS_OVERLAYS precedes it | Start recording immediately after DISMISS_OVERLAYS | Overlay dismiss animations (fade-out, slide) bleed into the first frames of the clip |
253
+
254
+ **Rule of thumb:** inside a clip, never use NAVIGATE or SCROLL to reach content that the user would normally reach by clicking a link or button. Use CLICK on that element instead — the cursor animation makes the transition visible and human-like.
255
+
256
+ SCROLL is still appropriate for:
257
+ - Scrolling down a long page to reveal below-the-fold content when there is no clickable anchor
258
+ - Scrolling inside a container or list
259
+
260
+ ### Clean start: wait for visual stability before recording
261
+
262
+ When `DISMISS_OVERLAYS` runs shortly before `BEGIN_CLIP`, overlay dismiss animations (fade-out, slide-away) can bleed into the first frames of the clip. Add a `screenshot_stable` WAIT_FOR between them so the page is visually settled before recording begins:
230
263
 
231
- interface ClipDefinition {
232
- /** Unique slug identifier */
233
- id: string;
234
- /** Display name */
235
- name: string;
236
- /**
237
- * What to record — the interaction the clip captures.
238
- * When navigationScript is also provided, this describes ONLY the recording phase.
239
- */
240
- script: string;
241
- /**
242
- * Optional: navigation instructions to reach the page BEFORE recording starts.
243
- * When provided, the navigation agent follows these to reach the page,
244
- * then the recording system executes `script`.
245
- */
246
- navigationScript?: string;
247
- /** Optional URL override */
248
- url?: string;
249
- /** Seconds to freeze the last frame before GIF loops. 0-10. Default: 0 */
250
- holdLastFrameSec?: number;
251
- }
264
+ ```json
265
+ { "kind": "WAIT_FOR", "description": "Wait for page to be visually stable", "selector": "[data-ak=\"main-content\"]", "state": "visible", "postcondition": { "type": "screenshot_stable", "threshold": 0.01, "waitMs": 3000 }, "recovery": { "retries": 0, "useSelectorMemory": false, "useAltInteraction": false, "allowReload": false, "allowHealer": false }, "timeoutMs": 5000, "maxFailures": 1 }
266
+ ```
267
+
268
+ Place this step **after** DISMISS_OVERLAYS + any content WAIT_FOR, and **immediately before** BEGIN_CLIP.
252
269
 
253
- interface ClipOptions {
254
- /** Output format. Default: "gif" */
255
- format?: "gif" | "mp4" | "both";
256
- /** Max duration in seconds. Default: 8 */
257
- maxDurationSec?: number;
258
- /** GIF framerate. Default: 15 */
259
- gifFps?: number;
260
- /** GIF max width in pixels. Default: 800 */
261
- gifMaxWidth?: number;
262
- /** Show cursor animation. Default: true */
263
- showCursor?: boolean;
264
- /** Whether to loop GIF. Default: true */
265
- loop?: boolean;
270
+ ```json
271
+ {
272
+ "steps": [
273
+ { "kind": "NAVIGATE", "url": "https://app.example.com", "..." : "..." },
274
+ { "kind": "DISMISS_OVERLAYS", "..." : "..." },
275
+ { "kind": "WAIT_FOR", "selector": "[data-ak=\"dashboard\"]", "state": "visible", "..." : "..." },
276
+ { "kind": "WAIT_FOR", "description": "Let overlays finish animating", "selector": "[data-ak=\"dashboard\"]", "state": "visible", "postcondition": { "type": "screenshot_stable", "threshold": 0.01, "waitMs": 3000 }, "..." : "..." },
277
+ { "kind": "BEGIN_CLIP", "clipId": "add-project", "clipName": "Add project", "postcondition": { "type": "always" }, "..." : "..." },
278
+ { "kind": "CLICK", "selector": "[data-ak=\"add-project-btn\"]", "..." : "..." },
279
+ { "kind": "TYPE", "selector": "[data-ak=\"project-name\"]", "text": "My Project", "clearFirst": true, "..." : "..." },
280
+ { "kind": "CLICK", "selector": "[data-ak=\"save-btn\"]", "..." : "..." },
281
+ { "kind": "END_CLIP", "clipId": "add-project", "clipName": "Add project", "postcondition": { "type": "always" }, "..." : "..." }
282
+ ]
266
283
  }
267
284
  ```
268
285
 
269
- ---
286
+ ## Interactive Demo Workflow
270
287
 
271
- ## Best Practices (Critical)
272
-
273
- ### Prompt Writing
274
- The capture agent's success rate depends heavily on prompt quality. Write prompts with **strong navigation hints**:
275
-
276
- - **Add location references**: "in the left sidebar", "in the top-right header", "via the navigation bar"
277
- - **Use visual anchors**: "the blue button labeled 'Save'", "the table with column headers Name/Status/Date"
278
- - **Specify end-state**: what should be visible when the screenshot is taken
279
- - **Keep it concise**: 1-3 sentences. The agent handles micro-steps on its own
280
-
281
- **Bad prompt**: "Go to settings"
282
- **Good prompt**: "Navigate to the Settings page via the gear icon in the left sidebar. Show the 'General' tab with all form fields visible."
283
-
284
- ### Named Pages
285
- - Each page runs as a **separate, independent agent session** with NO memory between pages
286
- - Page prompts must be **self-contained** — include all navigation steps from the root URL
287
- - If page B requires authentication, configure `credentials` in the preset — don't describe login steps in the prompt
288
- - Use **descriptive page IDs** as slugs: `pricing`, `dashboard-analytics`, `user-settings`
289
- - NEVER use generic IDs: `main`, `page1`, `page2`, `screen1`
290
-
291
- ### Isolated Elements
292
- - Descriptions must reference **visually identifiable** DOM landmarks
293
- - Good: "The hero section containing the main heading and CTA button"
294
- - Bad: "The main content area"
295
- - Always specify `sourcePageId` when the preset has multiple pages
296
-
297
- ### Theme & Language Instructions
298
- - Must describe **actual UI controls**, not abstract actions
299
- - Good: "Click the moon/sun toggle icon in the top-right corner of the header"
300
- - Bad: "Switch to dark mode"
301
- - If the app uses URL-based locales (e.g. `/fr/pricing`), mention it explicitly
302
-
303
- ### Mock Data
304
- - Descriptions must be **specific enough for deterministic generation**
305
- - Include: number of items, value ranges, status distributions, data types
306
- - Good: "10 user accounts with realistic names, emails, roles (3 admin, 5 member, 2 viewer), last active dates within the past 30 days"
307
- - Bad: "Some users"
308
-
309
- ### Credentials
310
- - Only email/password authentication is supported
311
- - OAuth, SSO, MFA, and magic links are NOT supported
312
- - When credentials are configured, the agent handles login automatically — do NOT repeat login steps in prompts
313
- - If a page requires auth but no credentials are configured, the capture will fail
314
-
315
- ### What the Agent CANNOT Do
316
- - Upload files or handle file inputs
317
- - Handle OAuth, SSO, MFA, magic link auth
318
- - Type into non-auth, non-search fields (no form submissions that create data)
319
- - Delete or modify persistent data
320
- - Navigate outside the target domain
321
- - Remember state between separate page runs
288
+ Interactive demos are advanced and should only be used when the user wants a
289
+ clickable DOM-based experience, not static screenshots or a clip.
322
290
 
323
- ---
291
+ Key rules:
324
292
 
325
- ## Common Targets
293
+ - center the capture around the feature loop, not the whole app
294
+ - use `CAPTURE_DOM` for base states
295
+ - use `CAPTURE_FRAGMENT` for local overlays and subtree swaps
296
+ - add authored markers such as `data-ak-interact`, `data-ak-fragment`,
297
+ `data-ak-model`, and `data-ak-template`
298
+ - prefer fragments / bindings / model-driven reconstruction before custom
299
+ interaction `code`
326
300
 
327
- Here are standard target configurations you can use:
301
+ Read the full reference before generating an interactive demo:
328
302
 
329
- ```json
330
- // Desktop (no device frame)
331
- { "id": "desktop-1440x900", "label": "Desktop 1440x900", "viewport": { "width": 1440, "height": 900 } }
303
+ - [references/interactive-demo.md](references/interactive-demo.md)
332
304
 
333
- // MacBook Air 13"
334
- { "id": "macbook-air-13", "label": "MacBook Air 13\"", "viewport": { "width": 1440, "height": 900 }, "deviceFrame": "macbook-air-13", "mockupOptions": { "outputScale": 2 } }
305
+ ## Recovery System Overview
335
306
 
336
- // iPhone 16 Pro (portrait)
337
- { "id": "iphone-16-pro", "label": "iPhone 16 Pro", "viewport": { "width": 402, "height": 778 }, "deviceFrame": "iphone-16-pro", "mockupOptions": { "orientation": "portrait", "outputScale": 2 } }
307
+ When an opcode fails, AutoKap tries 5 recovery strategies in order:
338
308
 
339
- // iPad Pro 11" (landscape)
340
- { "id": "ipad-pro-11", "label": "iPad Pro 11\" Landscape", "viewport": { "width": 1210, "height": 802 }, "deviceFrame": "ipad-pro-11-m4", "mockupOptions": { "orientation": "landscape", "outputScale": 2 } }
341
- ```
309
+ 1. **Deterministic retry** — Same opcode, fresh page observation. Exponential backoff (500ms, 1000ms...).
310
+ 2. **Selector memory** — Tries known-good selectors from previous successful runs stored in Supabase.
311
+ 3. **Alternative interaction** — Keyboard (Tab+Enter), JS dispatch, coordinate-based click.
312
+ 4. **Targeted reload** — Reloads the page and retries. **Loses UI state** (open modals, form data).
313
+ 5. **LLM Healer** — AI analyzes the page screenshot and AKTree, then rewrites the failing opcode. Max 3 invocations per run.
342
314
 
343
- ---
315
+ **Guidance:** Keep `allowReload: false` for most steps (it loses state). Set `allowReload: true` only for the first NAVIGATE or after full page transitions where UI state is expendable.
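
As a rough sketch of how a step's `recovery` block could map onto the five strategies, the code below filters the ordered list by the flags used in this document's JSON examples. The flag-to-strategy mapping, the strategy names, and the doubling backoff formula are assumptions (the source only gives 500ms and 1000ms); `enabledStrategies` and `retryDelayMs` are hypothetical helpers, not AutoKap functions.

```typescript
// Illustrative only; the real runtime logic is not shown in this document.
interface RecoveryConfig {
  retries: number;
  useSelectorMemory: boolean;
  useAltInteraction: boolean;
  allowReload: boolean;
  allowHealer: boolean;
}

// Strategies in the documented order, kept only when the flag allows them.
function enabledStrategies(r: RecoveryConfig): string[] {
  const ordered: Array<[string, boolean]> = [
    ['deterministic_retry', r.retries > 0],
    ['selector_memory', r.useSelectorMemory],
    ['alternative_interaction', r.useAltInteraction],
    ['targeted_reload', r.allowReload],
    ['llm_healer', r.allowHealer],
  ];
  return ordered.filter((entry) => entry[1]).map((entry) => entry[0]);
}

// Assumed backoff: 500ms for the first retry, doubling on each later retry.
function retryDelayMs(attempt: number): number {
  return 500 * Math.pow(2, attempt);
}
```

With `allowReload: false`, the reload strategy simply drops out of the list, which matches the guidance to keep UI state intact for most steps.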
344
316
 
345
- ## Output Format
317
+ ## SET_LOCALE / SET_THEME: Method Selection
346
318
 
347
- ### Option 1: JSON Output (copy-paste)
319
+ **Investigate the app's i18n/theme mechanism first** by reading configuration files:
348
320
 
349
- Output the preset as a JSON object. The user will paste this into AutoKap's import dialog.
321
+ | Framework | Opcode | Method | Storage | Key |
322
+ |-----------|--------|--------|---------|-----|
323
+ | next-intl | SET_LOCALE | `storage` | `cookie` | `NEXT_LOCALE` |
324
+ | i18next / react-intl | SET_LOCALE | `storage` | `localStorage` | `i18nextLng` |
325
+ | URL-based (e.g. `/fr/page`) | NAVIGATE | — | — | Use locale in URL path |
326
+ | Browser-level only | SET_LOCALE | `browser_context` | — | — |
327
+ | next-themes | SET_THEME | `storage` | `localStorage` | `theme` |
328
+ | CSS `prefers-color-scheme` | SET_THEME | `color_scheme` | — | — |
350
329
 
351
- ```json
352
- {
353
- "name": "Preset Name",
354
- "description": "What this preset captures",
355
- "config": {
356
- "prompt": "...",
357
- "targets": [...],
358
- "langs": ["en"],
359
- "themes": ["light"],
360
- "maxIterations": 60,
361
- "outputScale": 2
362
- }
363
- }
364
- ```
330
+ Place SET_LOCALE/SET_THEME **after NAVIGATE + DISMISS_OVERLAYS** but **before WAIT_FOR or CAPTURE_SCREENSHOT**. The runtime reloads the page after applying storage-based changes.
365
331
 
366
- For multiple presets, output a JSON array:
367
- ```json
368
- [
369
- { "name": "...", "description": "...", "config": { ... } },
370
- { "name": "...", "description": "...", "config": { ... } }
371
- ]
372
- ```
332
+ If the preset has only one locale and one theme, these opcodes are unnecessary.
373
333
 
374
- ### Option 2: Direct API Creation
334
+ ## Mock Data Injection
375
335
 
376
- If an API key and project ID are provided in the prompt, create the preset directly via the AutoKap API:
336
+ Use mock data only when the capture would otherwise look empty or broken.
377
337
 
378
- ```bash
379
- curl -X POST "https://app.autokap.com/api/v1/presets" \
380
- -H "Authorization: Bearer YOUR_API_KEY" \
381
- -H "Content-Type: application/json" \
382
- -d '{
383
- "project_id": "YOUR_PROJECT_ID",
384
- "name": "Preset Name",
385
- "description": "What this preset captures",
386
- "config": { ... }
387
- }'
388
- ```
338
+ Core rules:
389
339
 
390
- When both an API key and project ID are available, prefer the API method for a seamless experience. Fall back to JSON output if the API call fails.
340
+ - wire both delivery mechanisms whenever possible: clone + hidden trigger/input
341
+ - every variable UI element must become a slot
342
+ - add `data-ak` markers to the template, container, variable children, and
343
+ hidden receivers
344
+ - declare `mockDataInjection.groups` at the top level of the preset config
345
+ - place `INJECT_MOCK_DATA` after `WAIT_FOR` and before `CAPTURE_SCREENSHOT`
346
+ - keep the opcode non-blocking with `postcondition: { type: "always" }`
391
347
 
392
- ---
348
+ Read the full reference before adding mock data:
393
349
 
394
- ## Examples
350
+ - [references/mock-data.md](references/mock-data.md)
395
351
 
396
- ### Example 1: Simple Homepage Screenshot
352
+ ## Critical Rules
397
353
 
398
- ```json
399
- {
400
- "name": "Homepage Hero",
401
- "description": "Above-the-fold hero section of the landing page",
402
- "config": {
403
- "prompt": "Navigate to the homepage. Wait for the hero section to fully load including any animations and images. The page should show the main heading, subheading, and CTA button.",
404
- "targets": [
405
- { "id": "desktop-1440x900", "label": "Desktop 1440x900", "viewport": { "width": 1440, "height": 900 } }
406
- ],
407
- "langs": ["en"],
408
- "themes": ["light"],
409
- "maxIterations": 60,
410
- "outputScale": 2
411
- }
412
- }
413
- ```
354
+ 1. **Always start with `NAVIGATE` + `DISMISS_OVERLAYS`**
355
+ 2. **Every CLICK/TYPE uses a `data-ak` selector** — add the attribute to the code first
356
+ 3. **Never guess selectors** — if the element doesn't have a stable selector, add `data-ak` to it
357
+ 4. **CLICK postconditions describe the result**, not the action (what changed, not what was clicked)
358
+ 5. **Add `WAIT_FOR` after page transitions** (login, route change, modal open)
359
+ 6. **Set `waitMs: 10000`** on postconditions involving async transitions
360
+ 7. **Persist the preset via the CLI — the program lives INSIDE `config.program`**, not in a separate file. After generating the program JSON, write the full config to a temp file and run `autokap preset create` or `autokap preset update` so the saved preset contains `program: { ...the full ExecutionProgram... }` plus any `interactiveDemo` / `mockDataInjection` blocks. The user MUST be able to run `autokap run <preset-id>` afterwards with no `--program` flag and no extra files. **Never tell the user to save the JSON to a local file**, never suggest `--program <file>` as the normal run path. The CLI fetches the program from the server.
361
+ 8. **Use `captureId`/`clipId`** on CAPTURE_SCREENSHOT/BEGIN_CLIP for Studio and dev links to work. Always set `"devLinksEnabled": true` in the preset config when creating via API so endpoints are visible on the dashboard
362
+ 9. **Mock data opcodes are non-blocking** — they log a warning and continue if selectors miss; always pair them with `recovery: { retries: 0, ... }` and `postcondition: { type: "always" }`
363
+ 10. **Hardcode the login URL — never use `{{loginUrl}}`.** The login URL is just another navigation. You generate the entire program, so you know which path the app's login lives at — write it directly (`https://app.example.com/login`, `http://localhost:3000/login`, etc.). The `{{loginUrl}}` placeholder is **deprecated** and produces broken navigations when the user hasn't filled the optional `loginUrl` credential field. `{{email}}` and `{{password}}` placeholders for `TYPE` opcodes are still correct and required (those are sensitive secrets the user fills in).
364
+ 11. **Capturing an auth-protected route requires a login flow at the START of the program.** Do NOT assume the user is "already logged in". Before any opcode that hits a route behind authentication, emit the canonical login sequence (see "Login flow" below). Skipping this is the #1 cause of "preset runs but never reaches the dashboard" failures.
414
365
 
415
- ### Example 2: Multi-Page Preset with Themes
366
+ ## Login flow
367
+
368
+ If **any** of the captures or DOM states in the program lives on an auth-protected route, the program MUST start with this canonical login sequence — placed BEFORE the first capture/state-bearing NAVIGATE.
416
369
 
417
370
  ```json
418
- {
419
- "name": "Marketing Pages",
420
- "description": "Key marketing pages in light and dark mode",
421
- "config": {
422
- "targets": [
423
- { "id": "desktop-1440x900", "label": "Desktop 1440x900", "viewport": { "width": 1440, "height": 900 } }
424
- ],
425
- "langs": ["en"],
426
- "themes": ["light", "dark"],
427
- "themeInstructions": "Click the sun/moon toggle icon in the top-right corner of the header navigation bar to switch between light and dark mode.",
428
- "maxIterations": 60,
429
- "outputScale": 2,
430
- "prompt": "",
431
- "pages": [
432
- {
433
- "id": "homepage",
434
- "name": "Homepage",
435
- "prompt": "Navigate to the homepage at /. Wait for the hero section to load completely with all images. The page should display the main value proposition heading and the 'Get Started' CTA button."
436
- },
437
- {
438
- "id": "pricing",
439
- "name": "Pricing Page",
440
- "prompt": "Navigate to /pricing. Wait for the pricing cards to load. The page should show all plan tiers (Free, Pro, Enterprise) with their prices and feature lists visible. Select the 'Annual' billing toggle if available."
441
- },
442
- {
443
- "id": "features",
444
- "name": "Features Page",
445
- "prompt": "Navigate to /features. Wait for the feature grid to fully render. The page should show the feature cards with icons and descriptions."
446
- }
447
- ]
448
- }
449
- }
371
+ { "kind": "NAVIGATE", "description": "Open login page", "url": "https://app.example.com/login", "postcondition": { "type": "route_matches", "pattern": "/login" }, "recovery": { "retries": 2, "useSelectorMemory": true, "useAltInteraction": true, "allowReload": true, "allowHealer": true }, "timeoutMs": 20000, "maxFailures": 3 },
372
+ { "kind": "DISMISS_OVERLAYS", "description": "Dismiss any overlays on login", "postcondition": { "type": "overlay_dismissed" }, "recovery": { "retries": 2, "useSelectorMemory": true, "useAltInteraction": true, "allowReload": false, "allowHealer": true }, "timeoutMs": 5000, "maxFailures": 3 },
373
+ { "kind": "WAIT_FOR", "description": "Wait for the login email input", "selector": "[data-ak=\"login-email\"]", "state": "visible", "postcondition": { "type": "element_visible", "selector": "[data-ak=\"login-email\"]", "waitMs": 5000 }, "recovery": { "retries": 2, "useSelectorMemory": true, "useAltInteraction": true, "allowReload": false, "allowHealer": true }, "timeoutMs": 10000, "maxFailures": 3 },
374
+ { "kind": "TYPE", "description": "Enter login email", "selector": "[data-ak=\"login-email\"]", "text": "{{email}}", "clearFirst": true, "postcondition": { "type": "any_change" }, "recovery": { "retries": 2, "useSelectorMemory": true, "useAltInteraction": true, "allowReload": false, "allowHealer": true }, "timeoutMs": 10000, "maxFailures": 3 },
375
+ { "kind": "TYPE", "description": "Enter login password", "selector": "[data-ak=\"login-password\"]", "text": "{{password}}", "clearFirst": true, "postcondition": { "type": "any_change" }, "recovery": { "retries": 2, "useSelectorMemory": true, "useAltInteraction": true, "allowReload": false, "allowHealer": true }, "timeoutMs": 10000, "maxFailures": 3 },
376
+ { "kind": "CLICK", "description": "Submit the login form", "selector": "[data-ak=\"login-submit\"]", "postcondition": { "type": "route_matches", "pattern": "/dashboard**", "waitMs": 10000 }, "recovery": { "retries": 2, "useSelectorMemory": true, "useAltInteraction": true, "allowReload": false, "allowHealer": true }, "timeoutMs": 15000, "maxFailures": 3 }
450
377
  ```
451
378
 
452
- ### Example 3: Clip (Animated GIF) Preset
379
+ ### Decision rule: do I need a login flow?
453
380
 
454
- ```json
381
+ Walk through the routes you're about to capture:
382
+
383
+ | Route shape | Login flow needed? |
384
+ |---|---|
385
+ | `/`, `/pricing`, `/blog/*`, `/docs/*` (marketing / public) | **No** |
386
+ | `/dashboard`, `/projects/*`, `/settings`, `/admin/*` (anything behind a guard) | **Yes** |
387
+ | `/login`, `/signup`, `/forgot-password` (the auth pages themselves) | **No** (you're already there) |
388
+ | Mixed (some public, some private) | **Yes** if at least one is private — emit the login sequence first, then NAVIGATE to each route |
389
+
390
+ When in doubt: **inspect the route's component file** for auth checks (`useSession`, `redirect()`, middleware, layout-level auth wrapper, server-side `getServerSession`, etc.). If you find any, emit the login flow.
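
The table above can be sketched as a small route classifier. This is only an illustration: `PRIVATE_PREFIXES`, `isPrivateRoute`, and `needsLoginFlow` are hypothetical names, and the prefix list is copied from the example rows rather than from any real app. Inspecting the route's component for auth checks remains the more reliable signal.

```typescript
// Hypothetical prefixes, copied from the "behind a guard" row of the table.
// A real project would derive these from its own routing and auth setup.
const PRIVATE_PREFIXES: string[] = ["/dashboard", "/projects", "/settings", "/admin"];

// True when the route equals a private prefix or is nested under it.
function isPrivateRoute(route: string): boolean {
  return PRIVATE_PREFIXES.some(
    (prefix) => route === prefix || route.indexOf(prefix + "/") === 0,
  );
}

// A login sequence is needed as soon as at least one route is private.
function needsLoginFlow(routes: string[]): boolean {
  return routes.some(isPrivateRoute);
}

const publicOnly = needsLoginFlow(["/", "/pricing", "/blog/launch"]); // false
const mixedRoutes = needsLoginFlow(["/pricing", "/dashboard"]); // true
const authPages = needsLoginFlow(["/login", "/signup"]); // false
```

A mixed list returns `true`, which matches the last table row: emit the login sequence first, then NAVIGATE to each route.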
391
+
392
+ ### Anti-patterns
393
+
394
+ | Don't | Why |
395
+ |---|---|
396
+ | `{ "kind": "NAVIGATE", "url": "{{loginUrl}}" }` | Deprecated placeholder. The user often leaves `loginUrl` empty (it's optional), and the substitution produces an empty string that crashes Playwright. Hardcode the URL. |
397
+ | Skip the login flow because "the user runs locally and is already logged in" | Playwright launches a fresh browser context with no cookies. Sessions never carry over. |
398
+ | Capture a private route with no preceding login sequence | You'll capture the login screen (or a 401 page) instead of the actual dashboard. |
399
+ | Place login steps in the middle of the program after a public capture | The first NAVIGATE to a private route fails with a 401/redirect before login runs. Login goes FIRST. |
400
+ | Use `{{loginUrl}}` in any example or doc you write into chat | Reinforces the broken pattern. Always hardcode. |
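
One way to catch the first anti-pattern before persisting a program is a small check over the steps. The sketch below is an assumption-laden illustration: `ProgramStep` and `findLoginUrlPlaceholders` are hypothetical names, and only the `kind` and `url` fields are taken from the examples in this document.

```typescript
// Hypothetical step shape; only `kind` and `url` come from the text above.
interface ProgramStep {
  kind: string;
  url?: string;
}

// Returns the indices of NAVIGATE steps whose URL still contains the
// deprecated {{loginUrl}} placeholder.
function findLoginUrlPlaceholders(steps: ProgramStep[]): number[] {
  const offenders: number[] = [];
  steps.forEach((step, index) => {
    if (
      step.kind === "NAVIGATE" &&
      step.url !== undefined &&
      step.url.indexOf("{{loginUrl}}") !== -1
    ) {
      offenders.push(index);
    }
  });
  return offenders;
}

const flagged = findLoginUrlPlaceholders([
  { kind: "NAVIGATE", url: "{{loginUrl}}" },
  { kind: "NAVIGATE", url: "https://app.example.com/login" },
]);
// flagged is [0]: only the first step uses the placeholder
```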
401
+
402
+ ## How to persist the preset
403
+
404
+ The user expects a **one-shot flow**: copy prompt → paste → assistant does the work → user runs `autokap run <preset-id>`. **Never break this flow.**
405
+
406
+ Use the AutoKap CLI commands to create, update, and query presets. The CLI reads the stored CLI key from `~/.autokap/config.json` automatically — **do NOT read the key yourself or build raw HTTP requests as the normal path**.
407
+
408
+ ### Step 1: Write the config JSON to a temp file
409
+
410
+ Build the full config object (with `program`, `pages`, and optionally `interactiveDemo`/`mockDataInjection`) and write it to a temporary file:
411
+
412
+ ```bash
413
+ cat > /tmp/autokap-preset.json << 'EOF'
455
414
  {
456
- "name": "Dashboard Demo Clip",
457
- "description": "Short animated GIF showing the dashboard interaction",
458
- "config": {
459
- "captureMode": "clip",
460
- "targets": [
461
- { "id": "desktop-1440x900", "label": "Desktop 1440x900", "viewport": { "width": 1440, "height": 900 } }
462
- ],
463
- "langs": ["en"],
464
- "themes": ["light"],
465
- "maxIterations": 60,
466
- "outputScale": 2,
467
- "prompt": "",
468
- "clips": [
469
- {
470
- "id": "dashboard-filter",
471
- "name": "Dashboard Filter Demo",
472
- "navigationScript": "Navigate to /dashboard. Wait for the analytics charts to fully load and display data.",
473
- "script": "Click the 'Date Range' dropdown in the top-right of the dashboard. Select 'Last 30 days'. Wait for the charts to update with the new data range.",
474
- "holdLastFrameSec": 2
475
- }
476
- ],
477
- "clipOptions": {
478
- "format": "gif",
479
- "maxDurationSec": 8,
480
- "gifMaxWidth": 800,
481
- "showCursor": true,
482
- "loop": true
483
- }
415
+ "captureMode": "screenshot",
416
+ "devLinksEnabled": true,
417
+ "pages": [
418
+ { "id": "dashboard", "name": "Dashboard" },
419
+ { "id": "settings", "name": "Settings" }
420
+ ],
421
+ "program": {
422
+ "presetId": "my-preset",
423
+ "programVersion": 1,
424
+ "mediaMode": "screenshot",
425
+ "baseUrl": "https://app.example.com",
426
+ "variants": [ { "id": "desktop-en-light", "viewport": { "width": 1440, "height": 900 }, "locale": "en", "theme": "light", "targetId": "desktop", "targetLabel": "Desktop" } ],
427
+ "preconditions": {},
428
+ "steps": [ { "kind": "NAVIGATE", "..." : "..." } ],
429
+ "artifactPlan": { "mediaMode": "screenshot" },
430
+ "compileFingerprint": "v1-my-preset",
431
+ "compiledAt": "2026-04-11T00:00:00.000Z",
432
+ "compiledWith": "ai-assistant"
484
433
  }
485
434
  }
435
+ EOF
486
436
  ```
487
437
 
488
- ---
438
+ #### Config requirements
489
439
 
490
- ## Integration Endpoints API
440
+ - **`pages`**: one entry per `CAPTURE_SCREENSHOT` opcode (`{ "id": "<captureId>", "name": "<captureName>" }`). Drives endpoint (dev link) creation. **Screenshot presets only.**
441
+ - **`clips`** — one entry per `BEGIN_CLIP`/`END_CLIP` pair (`{ "id": "<clipId>", "name": "<clipName>" }`). Drives endpoint and Studio slot creation. **Clip presets only.** Do NOT put clip entries in `pages` — the server treats `pages` as screenshots.
442
+ - **`program`** — the full `ExecutionProgram` with `steps`, `variants`, `artifactPlan`, etc.
443
+ - **`captureMode`** — `"screenshot"`, `"clip"`, or `"interactive_demo"`.
444
+ - **`devLinksEnabled: true`** — enables endpoints on the dashboard.
491
445
 
492
- After creating presets and running captures, you can create **integration endpoints**: stable public URLs that serve your assets. Each endpoint maps to a specific capture (page, element, or clip) and supports dynamic variant selection via query parameters (`lang`, `theme`, `target`).
446
+ The server auto-syncs `langs`, `themes`, `targets`, and `viewports` from `program.variants`, so you do NOT need to set these manually.
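
The `pages` requirement can be sketched as a derivation from the program's steps. This is a hedged illustration only: the `captureId` and `captureName` field names are assumptions inferred from the placeholder in the bullet above, and `Step`, `PageEntry`, and `pagesFromSteps` are hypothetical names, not part of the AutoKap API.

```typescript
// Assumed step shape: the bullet above hints at captureId / captureName on
// CAPTURE_SCREENSHOT, but the exact opcode fields are not shown in this section.
interface Step {
  kind: string;
  captureId?: string;
  captureName?: string;
}

interface PageEntry {
  id: string;
  name: string;
}

// Builds one `pages` entry per CAPTURE_SCREENSHOT step, in program order.
function pagesFromSteps(steps: Step[]): PageEntry[] {
  const pages: PageEntry[] = [];
  for (const step of steps) {
    if (step.kind === "CAPTURE_SCREENSHOT" && step.captureId !== undefined) {
      // Fall back to the id when no display name was given.
      const name = step.captureName !== undefined ? step.captureName : step.captureId;
      pages.push({ id: step.captureId, name: name });
    }
  }
  return pages;
}

const examplePages = pagesFromSteps([
  { kind: "NAVIGATE" },
  { kind: "CAPTURE_SCREENSHOT", captureId: "dashboard", captureName: "Dashboard" },
  { kind: "CAPTURE_SCREENSHOT", captureId: "settings", captureName: "Settings" },
]);
// examplePages matches the `pages` array in the Step 1 example config
```

Deriving `pages` from the steps, rather than writing both by hand, keeps the two lists from drifting apart.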
493
447
 
494
- ### Create all endpoints for a preset
448
+ ### Step 2: Create the preset
495
449
 
496
450
  ```bash
497
- curl -X POST https://autokap.com/api/v1/endpoints \
498
- -H "Authorization: Bearer $API_KEY" \
499
- -H "Content-Type: application/json" \
500
- -d '{ "preset_id": "PRESET_ID" }'
451
+ npx autokap@latest preset create \
452
+ --project <project-id> \
453
+ --name "My preset" \
454
+ --config /tmp/autokap-preset.json
501
455
  ```
502
456
 
503
- This creates an endpoint for every page, isolated element, and clip defined in the preset. Existing endpoints are skipped (no duplicates).
457
+ The command outputs **only the preset ID** to stdout. Capture it for the next steps:
504
458
 
505
- Response:
506
- ```json
507
- {
508
- "created": [{ "id": "abc123...", "label": "Homepage", "preset_id": "...", "asset_type": "screenshot" }],
509
- "skipped": 0
510
- }
459
+ ```bash
460
+ PRESET_ID=$(npx autokap@latest preset create --project <project-id> --name "My preset" --config /tmp/autokap-preset.json)
511
461
  ```
512
462
 
513
- ### Create endpoints for a composition
463
+ ### Step 3: Run the preset
514
464
 
515
465
  ```bash
516
- curl -X POST https://autokap.com/api/v1/endpoints \
517
- -H "Authorization: Bearer $API_KEY" \
518
- -H "Content-Type: application/json" \
519
- -d '{ "composition_id": "COMPOSITION_ID" }'
466
+ npx autokap@latest run "$PRESET_ID"
520
467
  ```
521
468
 
522
- ### List endpoints
469
+ ### Step 4: Fetch dev link endpoints
523
470
 
524
471
  ```bash
525
- # By preset
526
- curl "https://autokap.com/api/v1/endpoints?preset_id=PRESET_ID" \
527
- -H "Authorization: Bearer $API_KEY"
472
+ npx autokap@latest endpoints list --preset "$PRESET_ID"
473
+ ```
528
474
 
529
- # By project
530
- curl "https://autokap.com/api/v1/endpoints?project_id=PROJECT_ID" \
531
- -H "Authorization: Bearer $API_KEY"
475
+ The command outputs JSON with endpoint IDs and public asset URLs:
476
+ ```json
477
+ [
478
+ { "id": "abc123...", "capture_name": "dashboard", "label": "Dashboard", "url": "https://app.example.com/api/v1/assets/abc123..." }
479
+ ]
532
480
  ```
533
481
 
534
- ### Update endpoint label
482
+ ### Step 5: Integrate dev links into the codebase
483
+
484
+ After the run succeeds, wire the endpoints into the user's code. **This is part of the one-shot flow** — do not stop at validation.
485
+
486
+ First, get the structured integration info:
535
487
 
536
488
  ```bash
537
- curl -X PATCH https://autokap.com/api/v1/endpoints/ENDPOINT_ID \
538
- -H "Authorization: Bearer $API_KEY" \
539
- -H "Content-Type: application/json" \
540
- -d '{ "label": "New Label" }'
489
+ npx autokap@latest preset info "$PRESET_ID"
541
490
  ```
542
491
 
543
- ### Delete an endpoint
492
+ This returns JSON with all endpoints, their URLs, asset types, available variants (langs, themes, targets with device slugs), and per-type URL parameters.
493
+
494
+ Then, use the output to embed the links in the appropriate places in the codebase:
495
+
496
+ - **Screenshots** → `<img src="{url}?lang=en&theme=light&format=webp" alt="..." loading="lazy" />`
497
+ - **Clips** → `<img src="{url}" alt="..." loading="lazy" />` (GIF, default) or `<video src="{url}?format=mp4" autoplay loop muted playsinline />`
498
+ - **Interactive demos** → `<iframe src="{embed_url}" style="width: 100%; border: none; aspect-ratio: 16/9;" allow="clipboard-write" loading="lazy"></iframe>`
499
+ - **Compositions** → `<img src="{url}?format=webp" alt="..." loading="lazy" />`
500
+
501
+ Where to embed depends on what the user asked for (README, landing page, docs, component props, etc.). **Ask the user if the target location is ambiguous.**
502
+
503
+ Append variant query params (`lang`, `theme`, `target`) based on context: for example, match the page locale or use the user's preferred theme. If uncertain, use the first variant from the `preset info` output as the default.
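
Appending the variant parameters can be sketched as a small URL helper. The parameter names (`lang`, `theme`, `target`, `format`) come from the text above; `VariantParams`, `withVariantParams`, and the `ENDPOINT_ID` URL are hypothetical placeholders for illustration.

```typescript
interface VariantParams {
  lang?: string;
  theme?: string;
  target?: string;
  format?: string;
}

// Appends only the parameters that are set, URL-encoding each value.
function withVariantParams(url: string, params: VariantParams): string {
  const keys: (keyof VariantParams)[] = ["lang", "theme", "target", "format"];
  const pairs: string[] = [];
  for (const key of keys) {
    const value = params[key];
    if (value !== undefined) {
      pairs.push(key + "=" + encodeURIComponent(value));
    }
  }
  if (pairs.length === 0) {
    return url;
  }
  // Reuse an existing query string if the URL already has one.
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return url + separator + pairs.join("&");
}

// Placeholder URL; real values come from the `preset info` output.
const screenshotSrc = withVariantParams(
  "https://app.example.com/api/v1/assets/ENDPOINT_ID",
  { lang: "en", theme: "light", format: "webp" },
);
// "https://app.example.com/api/v1/assets/ENDPOINT_ID?lang=en&theme=light&format=webp"
```

The result can be used directly as the `src` of the `<img>` tag shown in the screenshot bullet.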
504
+
505
+ ### Updating an existing preset
544
506
 
545
507
  ```bash
546
- curl -X DELETE https://autokap.com/api/v1/endpoints/ENDPOINT_ID \
547
- -H "Authorization: Bearer $API_KEY"
508
+ npx autokap@latest preset update <preset-id> --config /tmp/autokap-preset.json
548
509
  ```
549
510
 
550
- ### Export endpoints
511
+ Pass the optional `--name` and `--description` flags to update the preset's metadata.
551
512
 
552
- ```bash
553
- # JSON export
554
- curl "https://autokap.com/api/v1/endpoints/export?preset_id=PRESET_ID" \
555
- -H "Authorization: Bearer $API_KEY"
513
+ ### Deleting a preset
556
514
 
557
- # CSV export
558
- curl "https://autokap.com/api/v1/endpoints/export?preset_id=PRESET_ID&format=csv" \
559
- -H "Authorization: Bearer $API_KEY"
515
+ ```bash
516
+ npx autokap@latest preset delete <preset-id>
560
517
  ```
561
518
 
562
- ### Serve an asset
519
+ ### Anti-patterns (NEVER do these)
563
520
 
564
- Assets are served publicly without authentication:
521
+ | Don't | Why |
522
+ |---|---|
523
+ | Read `~/.autokap/config.json` and make raw `fetch`/`curl` calls | The CLI handles auth automatically. Raw HTTP calls are fragile and expose the key. |
524
+ | Tell the user to save the program to a local JSON and pass `--program` | Defeats the one-shot flow. The user expected the AI to handle persistence. |
525
+ | Create the preset without `config.program` | The CLI then has nothing to run; the dashboard shows "No capture program". |
526
+ | Create the preset without `config.pages` (for screenshot presets) | No dev link endpoints will be created. |
527
+ | Put clip entries in `config.pages` instead of `config.clips` | The server treats `pages` as screenshots. Clips won't appear in Studio or get endpoints. |
528
+ | Create a clip preset without `config.clips` | No clip endpoints or Studio slots will be created. |
529
+ | Output the program as a code block "for the user to copy" | Forces a manual step. Use the CLI instead. |
565
530
 
531
+ ### Fallback (when the CLI is genuinely unavailable)
532
+
533
+ If the CLI command fails, output the full config JSON as a code block and tell the user to import it via the dashboard. Make it clear this is a recovery path, not the normal one.
534
+
535
+ ## Running
536
+
537
+ ```bash
538
+ # Run a preset (the program lives in preset.config.program — no --program flag):
539
+ autokap run <preset-id>
540
+
541
+ # Watch the browser execute:
542
+ autokap run <preset-id> --headed
543
+
544
+ # Save the artifacts to a local directory in addition to uploading them:
545
+ autokap run <preset-id> --output ./screenshots
566
546
  ```
567
- https://autokap.com/api/v1/assets/ENDPOINT_ID?lang=en&theme=dark&target=desktop
568
- ```
569
547
 
570
- Query parameters: `lang`, `theme`, `target`, `w` (width), `quality`, `format` (png/webp).
548
+ ## Complete Examples
549
+
550
+ Use the examples reference for ready-made shapes of:
551
+
552
+ - anonymous screenshot presets
553
+ - authenticated screenshot presets
554
+ - clip presets
555
+ - interactive demo presets
556
+ - mock-data-enabled presets
571
557
 
572
- ### Required API scopes
558
+ Read:
573
559
 
574
- - `endpoints:read` — list, get, and export endpoints
575
- - `endpoints:write` — create, update, and delete endpoints
560
+ - [references/examples.md](references/examples.md)