cueframe 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +89 -0
- package/dist/analyze-JQBFZK57.js +2 -0
- package/dist/chunk-KEYFXLX4.js +12 -0
- package/dist/cueframe.js +34 -0
- package/package.json +54 -0
- package/skills/cueframe-cli/SKILL.md +200 -0
- package/skills/cueframe-compose-loop/SKILL.md +103 -0
- package/skills/cueframe-product-video/SKILL.md +275 -0
|
@@ -0,0 +1,275 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: cueframe-product-video
|
|
3
|
+
description: Use when turning a screen recording (with cursor/click events) into a polished, auto-zooming product video or demo reel via CueFrame — e.g. "make a product demo from this Browserbase capture", "auto-zoom this screen recording to where the user clicked", "turn this walkthrough into a social demo". This is a USE-CASE recipe over the CueFrame primitives in the `cueframe-cli` skill.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# CueFrame — Product Video recipe
|
|
7
|
+
|
|
8
|
+
A **use-case recipe**, not new framework surface. CueFrame's engine only knows
|
|
9
|
+
primitives (`upload` → `composition put` → `render`, the `reframe` viewport, the
|
|
10
|
+
overlay/effect primitives). This skill is the *judgment* on top: how to turn a
|
|
11
|
+
captured walkthrough into a product video that looks good. Drive the primitives
|
|
12
|
+
with the `cueframe-cli` skill; this tells you *what* to author.
|
|
13
|
+
|
|
14
|
+
## The pipeline
|
|
15
|
+
|
|
16
|
+
```
|
|
17
|
+
capture (video + cursor/click points) → auto-zoom (points → reframe)
|
|
18
|
+
→ compose (video track + per-clip reframe) → polish (overlay primitives)
|
|
19
|
+
→ render
|
|
20
|
+
```
|
|
21
|
+
|
|
22
|
+
## 1. Capture → timed focus points
|
|
23
|
+
|
|
24
|
+
Get a screen recording **plus** the cursor/click events with viewport
|
|
25
|
+
coordinates and timestamps (e.g. from a browser-automation capture). Normalize
|
|
26
|
+
every point to `{ t: seconds, x: 0..1, y: 0..1 }` against the **real captured
|
|
27
|
+
viewport** — these are the *focus points* the zoom follows. Measure the rendered
|
|
28
|
+
video's true duration (ffprobe) and map point times onto it.
|
|
29
|
+
|
|
30
|
+
## 2. Points → reframe (auto-zoom)
|
|
31
|
+
|
|
32
|
+
The framework owns the tuning. Canonical implementation:
|
|
33
|
+
`autoZoom(points, style)` in `@cueframe/composition` (timed focus points →
|
|
34
|
+
gapless `reframe` segments; never re-derive this with an LLM — it drifts). Pick a
|
|
35
|
+
**style**, never raw numbers:
|
|
36
|
+
|
|
37
|
+
| style | zoom (`<1` punches in) | feel |
|
|
38
|
+
|---|---|---|
|
|
39
|
+
| `subtle` | 0.80 | gentle, corporate |
|
|
40
|
+
| `standard` | 0.67 | default |
|
|
41
|
+
| `punchy` | 0.55 | snappy, social |
|
|
42
|
+
|
|
43
|
+
The tiling rule (what `autoZoom` does — use this until it's exposed as
|
|
44
|
+
`cueframe compose from-points`): sort points by `t`; group points within
|
|
45
|
+
`mergeGap`s; pad each group `prePad` before / `postPad` after; drop groups
|
|
46
|
+
shorter than `minZoom`; merge overlapping windows; emit `frame-center` zoom 1
|
|
47
|
+
in the gaps and a `point` punch-in (`focus:{mode:"point",x,y}`, `zoom` = the
|
|
48
|
+
style's) over each window. Result → `source.reframe.segments` on the clip.
|
|
49
|
+
|
|
50
|
+
## 3. Compose
|
|
51
|
+
|
|
52
|
+
One video track, the screen recording as a media clip carrying the reframe.
|
|
53
|
+
`cueframe composition put <projectId> -b @composition.json` (see `cueframe-cli`).
|
|
54
|
+
Choose `-a 16:9` for landscape product video, `9:16` for social.
|
|
55
|
+
|
|
56
|
+
## 4. Polish — layer overlay primitives (this is what makes it look "produced")
|
|
57
|
+
|
|
58
|
+
A bare auto-zoom reads as a raw screen-grab. Add overlay/effect clips referencing
|
|
59
|
+
registered primitives (`registry/params-required.json` lists each one's params;
|
|
60
|
+
`$brand:` tokens bind brand colors/fonts at render). Tasteful default set for a
|
|
61
|
+
product demo:
|
|
62
|
+
|
|
63
|
+
| Goal | primitiveId | kind |
|
|
64
|
+
|---|---|---|
|
|
65
|
+
| animated cursor tracing the click path | `cursor-flow` / `simulated-cursor` | overlay |
|
|
66
|
+
| title / context | `lower-third` or `heroText` | overlay |
|
|
67
|
+
| highlight a clicked element | `marker-highlight` / `pulsing-indicator` | overlay |
|
|
68
|
+
| device/browser frame | `device-mockup-zoom` / `browser-flow` | overlay |
|
|
69
|
+
| captions (if narrated) | the top-level `captions` field | — |
|
|
70
|
+
| finish/look | `color-grade` + `vignette` | effect |
|
|
71
|
+
|
|
72
|
+
A primitive is the `source` of a clip; the clip wraps it with `id`/`startTime`/`duration`:
|
|
73
|
+
- Overlay clip (on a `kind:"overlay"` track): `{ "id":"t1", "startTime":1, "duration":3, "source":{ "kind":"overlay", "primitiveId":"lower-third", "params":{…}, "zPlane":"front" } }`.
|
|
74
|
+
- Effect clip (on a `kind:"effect"` track): `{ "id":"e1", "startTime":0, "duration":9, "source":{ "kind":"effect", "primitiveId":"color-grade", "params":{…} } }`.
|
|
75
|
+
Simultaneous overlays go on **separate tracks** — clips on one track can't overlap in time.
|
|
76
|
+
**Don't over-decorate** — 2–4 primitives. Cursor + title + grade is usually enough.
|
|
77
|
+
|
|
78
|
+
## 5. Render
|
|
79
|
+
|
|
80
|
+
`cueframe render <projectId> -o out.mp4 --json`. Verify the output (ffprobe +
|
|
81
|
+
sample frames) before declaring done.
|
|
82
|
+
|
|
83
|
+
## Worked examples — composition JSON
|
|
84
|
+
|
|
85
|
+
These are **orientation, not the contract.** The live wire schema is
|
|
86
|
+
`cueframe schema composition`, and the way to know a draft is correct is
|
|
87
|
+
`cueframe composition validate -b @composition.json` — it runs the same
|
|
88
|
+
validator the server enforces, with no save and no render. Discover + validate
|
|
89
|
+
before you render; never spend a render to find a schema mistake.
|
|
90
|
+
|
|
91
|
+
Each block below is a complete `@composition.json` (the bare object `composition put`
|
|
92
|
+
accepts: top-level `v`, `format`, `tracks`, optional `captions`). All five validate
|
|
93
|
+
against the real wire schema + save invariants (swap the `<…MediaId>` placeholders for
|
|
94
|
+
real `cueframe upload` ids). Conventions that keep them valid:
|
|
95
|
+
|
|
96
|
+
- Tracks hold **`contents`** (not `clips`); each clip wraps a `source`.
|
|
97
|
+
- A clip's `source.kind` must match the track `kind`: `media` → `video`/`image`/`audio`
|
|
98
|
+
tracks, `overlay` → `overlay` tracks, `effect` → `effect` tracks.
|
|
99
|
+
- Clip/segment times are **seconds** (`startTime`, `duration`, `startSec`, `endSec`,
|
|
100
|
+
`trim`); caption word times are **milliseconds** (`startMs`/`endMs`).
|
|
101
|
+
- `region`/`fit` are **clip-level** (siblings of `source`); `reframe` lives on the media `source`.
|
|
102
|
+
- Clips on one track can't overlap — put simultaneous layers on **separate tracks**.
|
|
103
|
+
- `zoom < 1` punches in (range `[0.1, 1]`, default `1`). Durations are illustrative — match your source.
|
|
104
|
+
|
|
105
|
+
### EX1 · Auto-zoom screen demo (16:9)
|
|
106
|
+
One screen recording with `reframe` punch-ins (the `autoZoom` output: `frame-center` in the gaps, `point` over each click) + a cinematic grade.
|
|
107
|
+
|
|
108
|
+
```json
|
|
109
|
+
{
|
|
110
|
+
"v": 1,
|
|
111
|
+
"format": { "aspectRatio": "16:9", "fps": 30 },
|
|
112
|
+
"tracks": [
|
|
113
|
+
{
|
|
114
|
+
"id": "v-screen", "kind": "video", "contents": [
|
|
115
|
+
{
|
|
116
|
+
"id": "rec", "startTime": 0, "duration": 12,
|
|
117
|
+
"source": {
|
|
118
|
+
"kind": "media", "mediaId": "<mediaId>",
|
|
119
|
+
"reframe": { "segments": [
|
|
120
|
+
{ "startSec": 0, "endSec": 2, "focus": { "mode": "frame-center" }, "zoom": 1 },
|
|
121
|
+
{ "startSec": 2, "endSec": 5, "focus": { "mode": "point", "x": 0.28, "y": 0.42 }, "zoom": 0.6, "ease": { "in": 0.4, "out": 0.4 } },
|
|
122
|
+
{ "startSec": 5, "endSec": 7, "focus": { "mode": "frame-center" }, "zoom": 1, "ease": { "in": 0.4, "out": 0.4 } },
|
|
123
|
+
{ "startSec": 7, "endSec": 10, "focus": { "mode": "point", "x": 0.72, "y": 0.66 }, "zoom": 0.55, "ease": { "in": 0.4, "out": 0.4 } },
|
|
124
|
+
{ "startSec": 10, "endSec": 12, "focus": { "mode": "frame-center" }, "zoom": 1, "ease": { "in": 0.4, "out": 0.4 } }
|
|
125
|
+
] }
|
|
126
|
+
}
|
|
127
|
+
}
|
|
128
|
+
]
|
|
129
|
+
},
|
|
130
|
+
{
|
|
131
|
+
"id": "fx", "kind": "effect", "contents": [
|
|
132
|
+
{ "id": "grade", "startTime": 0, "duration": 12, "source": { "kind": "effect", "primitiveId": "color-grade", "params": { "preset": "cinematic" } } }
|
|
133
|
+
]
|
|
134
|
+
}
|
|
135
|
+
]
|
|
136
|
+
}
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
### EX2 · Social vertical with captions (9:16)
|
|
140
|
+
Reframe + a `lower-third` title, then a `cursor-flow` overlay (non-overlapping on one overlay track), transcript `captions` (ms timing), warm grade. `$brand:` refs bind brand colors/fonts at render.
|
|
141
|
+
|
|
142
|
+
```json
|
|
143
|
+
{
|
|
144
|
+
"v": 1,
|
|
145
|
+
"format": { "aspectRatio": "9:16", "fps": 30, "platform": "tiktok" },
|
|
146
|
+
"tracks": [
|
|
147
|
+
{
|
|
148
|
+
"id": "v", "kind": "video", "contents": [
|
|
149
|
+
{
|
|
150
|
+
"id": "rec", "startTime": 0, "duration": 10,
|
|
151
|
+
"source": {
|
|
152
|
+
"kind": "media", "mediaId": "<mediaId>",
|
|
153
|
+
"reframe": { "segments": [
|
|
154
|
+
{ "startSec": 0, "endSec": 4, "focus": { "mode": "point", "x": 0.3, "y": 0.4 }, "zoom": 0.7, "ease": { "in": 0.4, "out": 0.4 } },
|
|
155
|
+
{ "startSec": 4, "endSec": 10, "focus": { "mode": "point", "x": 0.62, "y": 0.55 }, "zoom": 0.62, "ease": { "in": 0.4, "out": 0.4 } }
|
|
156
|
+
] }
|
|
157
|
+
}
|
|
158
|
+
}
|
|
159
|
+
]
|
|
160
|
+
},
|
|
161
|
+
{
|
|
162
|
+
"id": "ov", "kind": "overlay", "contents": [
|
|
163
|
+
{
|
|
164
|
+
"id": "title", "startTime": 0, "duration": 3.5,
|
|
165
|
+
"source": {
|
|
166
|
+
"kind": "overlay", "primitiveId": "lower-third",
|
|
167
|
+
"params": {
|
|
168
|
+
"text": "AI video, one command", "subtitle": "cueframe", "variant": "modern",
|
|
169
|
+
"tokens": { "textColor": "$brand:colors.textOnMedia", "accentColor": "$brand:colors.accent", "fontFamily": "$brand:fonts.heading.family" }
|
|
170
|
+
}
|
|
171
|
+
}
|
|
172
|
+
},
|
|
173
|
+
{
|
|
174
|
+
"id": "cursor", "startTime": 3.5, "duration": 6,
|
|
175
|
+
"source": {
|
|
176
|
+
"kind": "overlay", "primitiveId": "cursor-flow",
|
|
177
|
+
"params": {
|
|
178
|
+
"waypoints": [ { "x": 200, "y": 180 }, { "x": 540, "y": 240, "click": true, "label": "Generate" }, { "x": 1040, "y": 520, "click": true, "label": "Publish" } ],
|
|
179
|
+
"cursorColor": "$brand:colors.text", "showTargets": true
|
|
180
|
+
}
|
|
181
|
+
}
|
|
182
|
+
}
|
|
183
|
+
]
|
|
184
|
+
},
|
|
185
|
+
{
|
|
186
|
+
"id": "fx", "kind": "effect", "contents": [
|
|
187
|
+
{ "id": "grade", "startTime": 0, "duration": 10, "source": { "kind": "effect", "primitiveId": "color-grade", "params": { "preset": "warm" } } }
|
|
188
|
+
]
|
|
189
|
+
}
|
|
190
|
+
],
|
|
191
|
+
"captions": {
|
|
192
|
+
"segments": [
|
|
193
|
+
{ "words": [
|
|
194
|
+
{ "text": "Watch", "startMs": 0, "endMs": 320 },
|
|
195
|
+
{ "text": "this", "startMs": 320, "endMs": 560, "emphasis": true },
|
|
196
|
+
{ "text": "render", "startMs": 560, "endMs": 980 },
|
|
197
|
+
{ "text": "in", "startMs": 980, "endMs": 1120 },
|
|
198
|
+
{ "text": "one", "startMs": 1120, "endMs": 1360, "emphasis": true },
|
|
199
|
+
{ "text": "command", "startMs": 1360, "endMs": 1900 }
|
|
200
|
+
] }
|
|
201
|
+
],
|
|
202
|
+
"style": { "fontFamily": "Inter", "fontSize": 6.5, "fontWeight": 900, "color": "#ffffff", "highlightColor": "#10B981", "position": "bottom", "textTransform": "uppercase" }
|
|
203
|
+
}
|
|
204
|
+
}
|
|
205
|
+
```
|
|
206
|
+
|
|
207
|
+
### EX3 · Picture-in-picture (16:9)
|
|
208
|
+
Full-frame screen recording + a webcam **inset** placed with `region`/`fit` on a second video track (composites on top by track order). Two uploads → two `mediaId`s.
|
|
209
|
+
|
|
210
|
+
```json
|
|
211
|
+
{
|
|
212
|
+
"v": 1,
|
|
213
|
+
"format": { "aspectRatio": "16:9", "fps": 30 },
|
|
214
|
+
"tracks": [
|
|
215
|
+
{ "id": "v-screen", "kind": "video", "contents": [
|
|
216
|
+
{ "id": "screen", "startTime": 0, "duration": 10, "source": { "kind": "media", "mediaId": "<screenMediaId>" } }
|
|
217
|
+
] },
|
|
218
|
+
{ "id": "v-cam", "kind": "video", "contents": [
|
|
219
|
+
{ "id": "cam", "startTime": 0, "duration": 10, "source": { "kind": "media", "mediaId": "<webcamMediaId>" }, "region": { "x": 0.68, "y": 0.04, "w": 0.3, "h": 0.3 }, "fit": "cover" }
|
|
220
|
+
] }
|
|
221
|
+
]
|
|
222
|
+
}
|
|
223
|
+
```
|
|
224
|
+
|
|
225
|
+
### EX4 · Split-screen (16:9)
|
|
226
|
+
Two sources side-by-side, each with a half-frame `region` on its own video track.
|
|
227
|
+
|
|
228
|
+
```json
|
|
229
|
+
{
|
|
230
|
+
"v": 1,
|
|
231
|
+
"format": { "aspectRatio": "16:9", "fps": 30 },
|
|
232
|
+
"tracks": [
|
|
233
|
+
{ "id": "v-left", "kind": "video", "contents": [
|
|
234
|
+
{ "id": "left", "startTime": 0, "duration": 8, "source": { "kind": "media", "mediaId": "<leftMediaId>" }, "region": { "x": 0, "y": 0, "w": 0.5, "h": 1 }, "fit": "cover" }
|
|
235
|
+
] },
|
|
236
|
+
{ "id": "v-right", "kind": "video", "contents": [
|
|
237
|
+
{ "id": "right", "startTime": 0, "duration": 8, "source": { "kind": "media", "mediaId": "<rightMediaId>" }, "region": { "x": 0.5, "y": 0, "w": 0.5, "h": 1 }, "fit": "cover" }
|
|
238
|
+
] }
|
|
239
|
+
]
|
|
240
|
+
}
|
|
241
|
+
```
|
|
242
|
+
|
|
243
|
+
### EX5 · Presenter intro — title behind the speaker (9:16)
|
|
244
|
+
Talking-head with an `active-speaker` reframe and a `heroText` title at `zPlane:"behind-subject"`. The speaker **detection** and the person **matte** are both resolved automatically at render (see the `cueframe-cli` skill's "Render resolves expensive intents" section) — author the intent, render, done.
|
|
245
|
+
|
|
246
|
+
```json
|
|
247
|
+
{
|
|
248
|
+
"v": 1,
|
|
249
|
+
"format": { "aspectRatio": "9:16", "fps": 30 },
|
|
250
|
+
"tracks": [
|
|
251
|
+
{ "id": "v", "kind": "video", "contents": [
|
|
252
|
+
{ "id": "cam", "startTime": 0, "duration": 6, "source": { "kind": "media", "mediaId": "<mediaId>", "reframe": { "segments": [ { "startSec": 0, "endSec": 6, "focus": { "mode": "active-speaker" }, "zoom": 0.8 } ] } } }
|
|
253
|
+
] },
|
|
254
|
+
{ "id": "ov", "kind": "overlay", "contents": [
|
|
255
|
+
{ "id": "hero", "startTime": 0.5, "duration": 5, "source": {
|
|
256
|
+
"kind": "overlay", "primitiveId": "heroText",
|
|
257
|
+
"params": { "text": "SHIP IN MINUTES", "fontSizePct": 14, "textColor": "$brand:colors.textOnMedia", "fontFamily": "$brand:fonts.heading.family" },
|
|
258
|
+
"zPlane": "behind-subject", "anchor": { "space": "scene", "x": 0.5, "y": 0.42 }
|
|
259
|
+
} }
|
|
260
|
+
] }
|
|
261
|
+
]
|
|
262
|
+
}
|
|
263
|
+
```
|
|
264
|
+
|
|
265
|
+
## Notes / honesty
|
|
266
|
+
|
|
267
|
+
- **Picture-in-picture / inset cards render** — per-clip output-space placement
|
|
268
|
+
(`region:{x,y,w,h}` in `[0,1]` + `fit`) is wired for PiP / inset / split-screen,
|
|
269
|
+
for video, image, and overlay clips. Set `region`/`fit` on the **clip** (siblings
|
|
270
|
+
of `source`, not inside it — or author via the `set_clip_region` agent tool); two
|
|
271
|
+
clips with different placement composite by track order instead of occluding the
|
|
272
|
+
background. See EX3 / EX4 below.
|
|
273
|
+
- This recipe lives in a skill on purpose: the engine stays use-case-free; the
|
|
274
|
+
product opinions (style, which primitives, layout) live here and ship via
|
|
275
|
+
`cueframe install`. Add sibling recipes (`cueframe-podcast-clip`, …) the same way.
|