videowright 0.1.0

Files changed (306)
  1. package/README.md +91 -0
  2. package/dist/cli/argv.d.ts +28 -0
  3. package/dist/cli/argv.d.ts.map +1 -0
  4. package/dist/cli/argv.js +115 -0
  5. package/dist/cli/argv.js.map +1 -0
  6. package/dist/cli/bin.d.ts +7 -0
  7. package/dist/cli/bin.d.ts.map +1 -0
  8. package/dist/cli/bin.js +10 -0
  9. package/dist/cli/bin.js.map +1 -0
  10. package/dist/cli/dev.d.ts +19 -0
  11. package/dist/cli/dev.d.ts.map +1 -0
  12. package/dist/cli/dev.js +104 -0
  13. package/dist/cli/dev.js.map +1 -0
  14. package/dist/cli/discover.d.ts +29 -0
  15. package/dist/cli/discover.d.ts.map +1 -0
  16. package/dist/cli/discover.js +104 -0
  17. package/dist/cli/discover.js.map +1 -0
  18. package/dist/cli/discover_project.d.ts +29 -0
  19. package/dist/cli/discover_project.d.ts.map +1 -0
  20. package/dist/cli/discover_project.js +108 -0
  21. package/dist/cli/discover_project.js.map +1 -0
  22. package/dist/cli/errors.d.ts +10 -0
  23. package/dist/cli/errors.d.ts.map +1 -0
  24. package/dist/cli/errors.js +13 -0
  25. package/dist/cli/errors.js.map +1 -0
  26. package/dist/cli/ffmpeg.d.ts +57 -0
  27. package/dist/cli/ffmpeg.d.ts.map +1 -0
  28. package/dist/cli/ffmpeg.js +122 -0
  29. package/dist/cli/ffmpeg.js.map +1 -0
  30. package/dist/cli/index.d.ts +7 -0
  31. package/dist/cli/index.d.ts.map +1 -0
  32. package/dist/cli/index.js +152 -0
  33. package/dist/cli/index.js.map +1 -0
  34. package/dist/cli/playwright_check.d.ts +44 -0
  35. package/dist/cli/playwright_check.d.ts.map +1 -0
  36. package/dist/cli/playwright_check.js +20 -0
  37. package/dist/cli/playwright_check.js.map +1 -0
  38. package/dist/cli/prompt.d.ts +13 -0
  39. package/dist/cli/prompt.d.ts.map +1 -0
  40. package/dist/cli/prompt.js +47 -0
  41. package/dist/cli/prompt.js.map +1 -0
  42. package/dist/cli/render.d.ts +60 -0
  43. package/dist/cli/render.d.ts.map +1 -0
  44. package/dist/cli/render.js +471 -0
  45. package/dist/cli/render.js.map +1 -0
  46. package/dist/cli/script_cmd.d.ts +26 -0
  47. package/dist/cli/script_cmd.d.ts.map +1 -0
  48. package/dist/cli/script_cmd.js +88 -0
  49. package/dist/cli/script_cmd.js.map +1 -0
  50. package/dist/cli/time_shim.d.ts +44 -0
  51. package/dist/cli/time_shim.d.ts.map +1 -0
  52. package/dist/cli/time_shim.js +390 -0
  53. package/dist/cli/time_shim.js.map +1 -0
  54. package/dist/cli/ts_loader.d.ts +28 -0
  55. package/dist/cli/ts_loader.d.ts.map +1 -0
  56. package/dist/cli/ts_loader.js +95 -0
  57. package/dist/cli/ts_loader.js.map +1 -0
  58. package/dist/cli/vite_helpers.d.ts +62 -0
  59. package/dist/cli/vite_helpers.d.ts.map +1 -0
  60. package/dist/cli/vite_helpers.js +273 -0
  61. package/dist/cli/vite_helpers.js.map +1 -0
  62. package/dist/index.d.ts +11 -0
  63. package/dist/index.d.ts.map +1 -0
  64. package/dist/index.js +14 -0
  65. package/dist/index.js.map +1 -0
  66. package/dist/player/hash_router.d.ts +23 -0
  67. package/dist/player/hash_router.d.ts.map +1 -0
  68. package/dist/player/hash_router.js +49 -0
  69. package/dist/player/hash_router.js.map +1 -0
  70. package/dist/player/hud.d.ts +33 -0
  71. package/dist/player/hud.d.ts.map +1 -0
  72. package/dist/player/hud.js +357 -0
  73. package/dist/player/hud.js.map +1 -0
  74. package/dist/player/index.d.ts +123 -0
  75. package/dist/player/index.d.ts.map +1 -0
  76. package/dist/player/index.js +848 -0
  77. package/dist/player/index.js.map +1 -0
  78. package/dist/player/input.d.ts +14 -0
  79. package/dist/player/input.d.ts.map +1 -0
  80. package/dist/player/input.js +90 -0
  81. package/dist/player/input.js.map +1 -0
  82. package/dist/player/slot.d.ts +22 -0
  83. package/dist/player/slot.d.ts.map +1 -0
  84. package/dist/player/slot.js +43 -0
  85. package/dist/player/slot.js.map +1 -0
  86. package/dist/player/transitions/cut.d.ts +7 -0
  87. package/dist/player/transitions/cut.d.ts.map +1 -0
  88. package/dist/player/transitions/cut.js +9 -0
  89. package/dist/player/transitions/cut.js.map +1 -0
  90. package/dist/player/transitions/fade.d.ts +7 -0
  91. package/dist/player/transitions/fade.d.ts.map +1 -0
  92. package/dist/player/transitions/fade.js +18 -0
  93. package/dist/player/transitions/fade.js.map +1 -0
  94. package/dist/player/transitions/index.d.ts +4 -0
  95. package/dist/player/transitions/index.d.ts.map +1 -0
  96. package/dist/player/transitions/index.js +4 -0
  97. package/dist/player/transitions/index.js.map +1 -0
  98. package/dist/player/transitions/slide.d.ts +6 -0
  99. package/dist/player/transitions/slide.d.ts.map +1 -0
  100. package/dist/player/transitions/slide.js +35 -0
  101. package/dist/player/transitions/slide.js.map +1 -0
  102. package/dist/script/index.d.ts +2 -0
  103. package/dist/script/index.d.ts.map +1 -0
  104. package/dist/script/index.js +2 -0
  105. package/dist/script/index.js.map +1 -0
  106. package/dist/script/script.d.ts +10 -0
  107. package/dist/script/script.d.ts.map +1 -0
  108. package/dist/script/script.js +41 -0
  109. package/dist/script/script.js.map +1 -0
  110. package/dist/segment/SegmentRunner.d.ts +52 -0
  111. package/dist/segment/SegmentRunner.d.ts.map +1 -0
  112. package/dist/segment/SegmentRunner.js +187 -0
  113. package/dist/segment/SegmentRunner.js.map +1 -0
  114. package/dist/segment/defineConfig.d.ts +6 -0
  115. package/dist/segment/defineConfig.d.ts.map +1 -0
  116. package/dist/segment/defineConfig.js +7 -0
  117. package/dist/segment/defineConfig.js.map +1 -0
  118. package/dist/segment/defineSegment.d.ts +7 -0
  119. package/dist/segment/defineSegment.d.ts.map +1 -0
  120. package/dist/segment/defineSegment.js +25 -0
  121. package/dist/segment/defineSegment.js.map +1 -0
  122. package/dist/segment/index.d.ts +5 -0
  123. package/dist/segment/index.d.ts.map +1 -0
  124. package/dist/segment/index.js +4 -0
  125. package/dist/segment/index.js.map +1 -0
  126. package/dist/timeline/index.d.ts +73 -0
  127. package/dist/timeline/index.d.ts.map +1 -0
  128. package/dist/timeline/index.js +142 -0
  129. package/dist/timeline/index.js.map +1 -0
  130. package/dist/timeline/loadAudioTrack.d.ts +18 -0
  131. package/dist/timeline/loadAudioTrack.d.ts.map +1 -0
  132. package/dist/timeline/loadAudioTrack.js +44 -0
  133. package/dist/timeline/loadAudioTrack.js.map +1 -0
  134. package/dist/timeline/loadVoiceover.d.ts +18 -0
  135. package/dist/timeline/loadVoiceover.d.ts.map +1 -0
  136. package/dist/timeline/loadVoiceover.js +38 -0
  137. package/dist/timeline/loadVoiceover.js.map +1 -0
  138. package/dist/timeline/resolveTiming.d.ts +28 -0
  139. package/dist/timeline/resolveTiming.d.ts.map +1 -0
  140. package/dist/timeline/resolveTiming.js +63 -0
  141. package/dist/timeline/resolveTiming.js.map +1 -0
  142. package/dist/timeline/validateTiming.d.ts +29 -0
  143. package/dist/timeline/validateTiming.d.ts.map +1 -0
  144. package/dist/timeline/validateTiming.js +62 -0
  145. package/dist/timeline/validateTiming.js.map +1 -0
  146. package/dist/types.d.ts +216 -0
  147. package/dist/types.d.ts.map +1 -0
  148. package/dist/types.js +6 -0
  149. package/dist/types.js.map +1 -0
  150. package/package.json +47 -0
  151. package/skill/SKILL.md +64 -0
  152. package/skill/assets/hello_world/PLAN.md +31 -0
  153. package/skill/assets/hello_world/README.md +27 -0
  154. package/skill/assets/hello_world/audio/audio_plan.md +14 -0
  155. package/skill/assets/hello_world/segments/hello_intro.ts +69 -0
  156. package/skill/assets/hello_world/segments/hello_outro.ts +71 -0
  157. package/skill/assets/hello_world/timeline.ts +15 -0
  158. package/skill/assets/hello_world/voiceover_script/script.md +10 -0
  159. package/skill/assets/install/package.json +10 -0
  160. package/skill/assets/install/tsconfig.json +23 -0
  161. package/skill/assets/styles/editorial-mono/STYLE.md +124 -0
  162. package/skill/assets/styles/editorial-mono/brand.md +85 -0
  163. package/skill/assets/styles/editorial-mono/reference/animations.jsx +752 -0
  164. package/skill/assets/styles/editorial-mono/reference/scenes.html +563 -0
  165. package/skill/assets/styles/editorial-mono/sample/bullet.ts +101 -0
  166. package/skill/assets/styles/editorial-mono/sample/content.ts +104 -0
  167. package/skill/assets/styles/editorial-mono/sample/cta.ts +113 -0
  168. package/skill/assets/styles/editorial-mono/sample/feature.ts +111 -0
  169. package/skill/assets/styles/editorial-mono/sample/grid.ts +97 -0
  170. package/skill/assets/styles/editorial-mono/sample/kinetic.ts +96 -0
  171. package/skill/assets/styles/editorial-mono/sample/section.ts +101 -0
  172. package/skill/assets/styles/editorial-mono/sample/stat.ts +128 -0
  173. package/skill/assets/styles/editorial-mono/sample/title.ts +97 -0
  174. package/skill/assets/styles/editorial-mono/sample/ui-showcase.ts +159 -0
  175. package/skill/assets/styles/editorial-mono/tokens.css +44 -0
  176. package/skill/assets/styles/iso-diagram/STYLE.md +109 -0
  177. package/skill/assets/styles/iso-diagram/brand.md +32 -0
  178. package/skill/assets/styles/iso-diagram/reference/animations.jsx +673 -0
  179. package/skill/assets/styles/iso-diagram/reference/scenes.html +427 -0
  180. package/skill/assets/styles/iso-diagram/sample/bullet.ts +144 -0
  181. package/skill/assets/styles/iso-diagram/sample/content.ts +192 -0
  182. package/skill/assets/styles/iso-diagram/sample/cta.ts +162 -0
  183. package/skill/assets/styles/iso-diagram/sample/feature.ts +205 -0
  184. package/skill/assets/styles/iso-diagram/sample/grid.ts +181 -0
  185. package/skill/assets/styles/iso-diagram/sample/kinetic.ts +102 -0
  186. package/skill/assets/styles/iso-diagram/sample/section.ts +149 -0
  187. package/skill/assets/styles/iso-diagram/sample/stat.ts +164 -0
  188. package/skill/assets/styles/iso-diagram/sample/title.ts +173 -0
  189. package/skill/assets/styles/iso-diagram/sample/ui-showcase.ts +162 -0
  190. package/skill/assets/styles/iso-diagram/tokens.css +40 -0
  191. package/skill/assets/styles/motion-engineering/STYLE.md +106 -0
  192. package/skill/assets/styles/motion-engineering/brand.md +29 -0
  193. package/skill/assets/styles/motion-engineering/reference/animations.jsx +673 -0
  194. package/skill/assets/styles/motion-engineering/reference/scenes.html +513 -0
  195. package/skill/assets/styles/motion-engineering/sample/bullet.ts +176 -0
  196. package/skill/assets/styles/motion-engineering/sample/content.ts +228 -0
  197. package/skill/assets/styles/motion-engineering/sample/cta.ts +209 -0
  198. package/skill/assets/styles/motion-engineering/sample/feature.ts +299 -0
  199. package/skill/assets/styles/motion-engineering/sample/grid.ts +190 -0
  200. package/skill/assets/styles/motion-engineering/sample/kinetic.ts +159 -0
  201. package/skill/assets/styles/motion-engineering/sample/section.ts +196 -0
  202. package/skill/assets/styles/motion-engineering/sample/stat.ts +230 -0
  203. package/skill/assets/styles/motion-engineering/sample/title.ts +219 -0
  204. package/skill/assets/styles/motion-engineering/sample/ui-showcase.ts +267 -0
  205. package/skill/assets/styles/motion-engineering/tokens.css +40 -0
  206. package/skill/assets/styles/neon-terminal/STYLE.md +105 -0
  207. package/skill/assets/styles/neon-terminal/brand.md +27 -0
  208. package/skill/assets/styles/neon-terminal/reference/animations.jsx +673 -0
  209. package/skill/assets/styles/neon-terminal/reference/scenes.html +387 -0
  210. package/skill/assets/styles/neon-terminal/sample/bullet.ts +113 -0
  211. package/skill/assets/styles/neon-terminal/sample/content.ts +117 -0
  212. package/skill/assets/styles/neon-terminal/sample/cta.ts +131 -0
  213. package/skill/assets/styles/neon-terminal/sample/feature.ts +112 -0
  214. package/skill/assets/styles/neon-terminal/sample/grid.ts +128 -0
  215. package/skill/assets/styles/neon-terminal/sample/kinetic.ts +105 -0
  216. package/skill/assets/styles/neon-terminal/sample/section.ts +96 -0
  217. package/skill/assets/styles/neon-terminal/sample/stat.ts +123 -0
  218. package/skill/assets/styles/neon-terminal/sample/title.ts +122 -0
  219. package/skill/assets/styles/neon-terminal/sample/ui-showcase.ts +127 -0
  220. package/skill/assets/styles/neon-terminal/tokens.css +39 -0
  221. package/skill/assets/styles/risograph/STYLE.md +110 -0
  222. package/skill/assets/styles/risograph/brand.md +26 -0
  223. package/skill/assets/styles/risograph/reference/animations.jsx +673 -0
  224. package/skill/assets/styles/risograph/reference/scenes.html +403 -0
  225. package/skill/assets/styles/risograph/sample/bullet.ts +124 -0
  226. package/skill/assets/styles/risograph/sample/content.ts +135 -0
  227. package/skill/assets/styles/risograph/sample/cta.ts +149 -0
  228. package/skill/assets/styles/risograph/sample/feature.ts +152 -0
  229. package/skill/assets/styles/risograph/sample/grid.ts +123 -0
  230. package/skill/assets/styles/risograph/sample/kinetic.ts +125 -0
  231. package/skill/assets/styles/risograph/sample/section.ts +130 -0
  232. package/skill/assets/styles/risograph/sample/stat.ts +145 -0
  233. package/skill/assets/styles/risograph/sample/title.ts +132 -0
  234. package/skill/assets/styles/risograph/sample/ui-showcase.ts +147 -0
  235. package/skill/assets/styles/risograph/tokens.css +39 -0
  236. package/skill/assets/styles/swiss-console/STYLE.md +107 -0
  237. package/skill/assets/styles/swiss-console/brand.md +37 -0
  238. package/skill/assets/styles/swiss-console/reference/animations.jsx +673 -0
  239. package/skill/assets/styles/swiss-console/reference/scenes.html +420 -0
  240. package/skill/assets/styles/swiss-console/sample/bullet.ts +122 -0
  241. package/skill/assets/styles/swiss-console/sample/content.ts +137 -0
  242. package/skill/assets/styles/swiss-console/sample/cta.ts +109 -0
  243. package/skill/assets/styles/swiss-console/sample/feature.ts +163 -0
  244. package/skill/assets/styles/swiss-console/sample/grid.ts +145 -0
  245. package/skill/assets/styles/swiss-console/sample/kinetic.ts +117 -0
  246. package/skill/assets/styles/swiss-console/sample/section.ts +127 -0
  247. package/skill/assets/styles/swiss-console/sample/stat.ts +148 -0
  248. package/skill/assets/styles/swiss-console/sample/title.ts +148 -0
  249. package/skill/assets/styles/swiss-console/sample/ui-showcase.ts +198 -0
  250. package/skill/assets/styles/swiss-console/tokens.css +39 -0
  251. package/skill/install/INSTALL.md +400 -0
  252. package/skill/references/audio/audio_plan.md +199 -0
  253. package/skill/references/audio/build.md +208 -0
  254. package/skill/references/audio/cue_template.md +219 -0
  255. package/skill/references/audio/ffmpeg_cookbook.md +267 -0
  256. package/skill/references/audio/music/music.md +171 -0
  257. package/skill/references/audio/music/providers/elevenlabs.md +170 -0
  258. package/skill/references/audio/music/providers/manual.md +140 -0
  259. package/skill/references/audio/music/providers/openverse.md +265 -0
  260. package/skill/references/audio/sfx/providers/elevenlabs.md +152 -0
  261. package/skill/references/audio/sfx/providers/manual.md +117 -0
  262. package/skill/references/audio/sfx/providers/openverse.md +243 -0
  263. package/skill/references/audio/sfx/sfx.md +149 -0
  264. package/skill/references/audio/styles.md +102 -0
  265. package/skill/references/audio/sync.md +237 -0
  266. package/skill/references/audio/voiceover/animation_sync.md +142 -0
  267. package/skill/references/audio/voiceover/provider_script.md +153 -0
  268. package/skill/references/audio/voiceover/providers/elevenlabs.md +288 -0
  269. package/skill/references/audio/voiceover/providers/manual.md +100 -0
  270. package/skill/references/audio/voiceover/script_writing.md +100 -0
  271. package/skill/references/audio/voiceover/style_intake.md +56 -0
  272. package/skill/references/audio/voiceover/sync_algorithm.md +167 -0
  273. package/skill/references/audio/voiceover.md +296 -0
  274. package/skill/references/audio.md +135 -0
  275. package/skill/references/authoring_segment.md +446 -0
  276. package/skill/references/create_or_edit_video.md +232 -0
  277. package/skill/references/dev_server.md +157 -0
  278. package/skill/references/export.md +145 -0
  279. package/skill/references/new_video.md +117 -0
  280. package/skill/references/project_structure.md +144 -0
  281. package/skill/references/setup.md +109 -0
  282. package/skill/references/setup_new_style.md +158 -0
  283. package/skill/references/styles.md +154 -0
  284. package/skill/references/testing.md +115 -0
  285. package/skill/references/types.md +240 -0
  286. package/src/cli/entry/components/copy_button.ts +42 -0
  287. package/src/cli/entry/components/download_modal.ts +204 -0
  288. package/src/cli/entry/components/empty_state.ts +55 -0
  289. package/src/cli/entry/components/hide_hud_tab.ts +37 -0
  290. package/src/cli/entry/components/icons.ts +31 -0
  291. package/src/cli/entry/components/top_bar.ts +69 -0
  292. package/src/cli/entry/components/video_card.ts +57 -0
  293. package/src/cli/entry/dev_frame.ts +189 -0
  294. package/src/cli/entry/entry_index.ts +16 -0
  295. package/src/cli/entry/entry_video.ts +24 -0
  296. package/src/cli/entry/index.html +12 -0
  297. package/src/cli/entry/parse_slug.ts +14 -0
  298. package/src/cli/entry/render.html +17 -0
  299. package/src/cli/entry/render_entry.ts +121 -0
  300. package/src/cli/entry/styles/base.css +45 -0
  301. package/src/cli/entry/styles/components.css +605 -0
  302. package/src/cli/entry/styles/tokens.css +44 -0
  303. package/src/cli/entry/video.html +22 -0
  304. package/src/cli/entry/views/homepage.ts +66 -0
  305. package/src/cli/entry/views/video_view.ts +286 -0
  306. package/src/cli/entry/virtual.d.ts +8 -0
@@ -0,0 +1,237 @@
1
+ # Sync Video to Audio Track
2
+
3
+ ## When this is loaded
4
+
5
+ An audio track has been built and approved, and you need to compute per-segment timing that syncs the video to the audio. This reference extends the VO-level sync algorithm in [voiceover/sync_algorithm.md](voiceover/sync_algorithm.md) to handle multi-source audio tracks (VO + SFX + music). For VO-only tracks, the "VO-only shortcut" section near the end of this file applies -- it delegates to the existing sync algorithm unchanged.
6
+
7
+ ## Overview
8
+
9
+ The sync stage reads the approved track's `plan_snapshot.md` and the source VO `timing.json` files to compute a `Timing` object. This `Timing` is written into the track's `track.ts` and drives segment advance timing during playback and render.
10
+
11
+ The procedure is an agent reasoning step, not a deterministic function. You read structured data, apply mapping rules, and produce timing values that align visual beats to audio events.
12
+
13
+ ## Inputs
14
+
15
+ 1. **Plan snapshot**: `audio/tracks/vN/plan_snapshot.md` from the active track. Contains all cues with their Source, Slice, Place at, and Notes fields.
16
+ 2. **Per-segment script**: from `PLAN.md` (the `## Script` section with subsections per segment id).
17
+ 3. **VO timing data**: `timing.json` from each VO source referenced by cues in the snapshot. Located at `audio/originals/voiceovers/<slug>/timing.json`.
18
+ 4. **Segment ids**: in timeline order, plus each segment's `advances` array length (tells you how many advances each segment needs).
19
+
20
+ ## Output
21
+
22
+ An updated `timing` field in `audio/tracks/vN/track.ts`:
23
+
24
+ ```ts
25
+ timing: {
26
+ perSegment: {
27
+ 'intro': [4.2],
28
+ 'feature-cards': [2.1, 5.8, 9.3, 12.0],
29
+ 'outro': [3.5],
30
+ },
31
+ },
32
+ ```
33
+
34
+ Each value array has the same length as the segment's `advances` array. Values are segment-relative seconds (time since the segment started playing, not since the audio started).
35
+
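The length rule above can be checked mechanically before anything is written to `track.ts`. The following is a minimal sketch, not the package's own validator: `TimingSketch` mirrors only the `perSegment` shape shown in the example, `checkTimingLengths` and `advanceCounts` are hypothetical names, and the check that values increase within a segment is an added assumption rather than something this reference states.

```ts
// Assumed shape: only `perSegment` is taken from the example above.
interface TimingSketch {
  perSegment: Record<string, number[]>;
}

// Hypothetical helper: returns human-readable problems, or an empty array.
// `advanceCounts` maps each segment id to the length of its `advances` array.
function checkTimingLengths(
  timing: TimingSketch,
  advanceCounts: Record<string, number>,
): string[] {
  const problems: string[] = [];
  for (const id of Object.keys(advanceCounts)) {
    const expected = advanceCounts[id];
    const values = timing.perSegment[id];
    if (values === undefined) {
      problems.push(`${id}: no timing entry`);
      continue;
    }
    if (values.length !== expected) {
      problems.push(`${id}: expected ${expected} advance(s), got ${values.length}`);
    }
    // Assumption: segment-relative advances should strictly increase.
    for (let i = 1; i < values.length; i++) {
      if (values[i] <= values[i - 1]) {
        problems.push(`${id}: advance [${i}] is not later than advance [${i - 1}]`);
      }
    }
  }
  return problems;
}
```

An empty result means every segment has exactly as many timing values as it has advances.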
36
+ ## The sync procedure
37
+
38
+ ### Step 1: Parse VO cues from the snapshot
39
+
40
+ Read the plan snapshot. For each cue whose `Source:` points at `audio/originals/voiceovers/...`, extract:
41
+
42
+ - **Source path**: which VO folder
43
+ - **Slice**: `full file` or `<start>-<end>` (seconds within the source)
44
+ - **Place at**: where the cue starts in the track (seconds)
45
+ - **Notes**: any inline text about what the VO says
46
+
47
+ ### Step 2: Load VO timing data
48
+
49
+ For each unique VO source referenced by a cue, load `timing.json` from that source folder:
50
+
51
+ ```json
52
+ {
53
+ "words": [
54
+ { "word": "Welcome", "start": 0.0, "end": 0.45 },
55
+ { "word": "to", "start": 0.47, "end": 0.55 },
56
+ ...
57
+ ]
58
+ }
59
+ ```
60
+
61
+ Timestamps in `timing.json` are seconds from the start of the VO audio file.
62
+
63
+ ### Step 3: Map VO word times to track times
64
+
65
+ For each VO cue with `Slice: a-b` (or `full file`, where a=0 and b=file length) and `Place at: p`:
66
+
67
+ > **Effective track time** of a word originally at source time `w`:
68
+ > `track_time = p + (w - a)`, valid only when `a <= w <= b`.
69
+
70
+ Words outside the slice range are excluded from this cue's mapping. If a word appears in multiple cues (overlapping slices from the same VO), it has multiple effective track times -- this is rare but valid.
71
+
72
+ Example:
73
+ - VO cue: `Slice: 1.2-5.6`, `Place at: 3.4`
74
+ - Word at source time 2.0s: `track_time = 3.4 + (2.0 - 1.2) = 4.2s`
75
+ - Word at source time 5.0s: `track_time = 3.4 + (5.0 - 1.2) = 7.2s`
76
+ - Word at source time 0.5s: excluded (outside slice range)
77
+
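The mapping in Step 3 can be sketched as a small function. This is an illustration under assumptions: `TimedWord` follows the `timing.json` example, while `VoCue`, `TrackWord`, and `mapWordsToTrack` are hypothetical names. The sketch treats a word's `start` as its source time `w`, and clamps the word's end to the slice end, which is a choice made here and not a rule from this reference.

```ts
// Word entry as shown in the timing.json example.
interface TimedWord {
  word: string;
  start: number;
  end: number;
}

// Hypothetical cue shape: Slice a-b and Place at p, all in seconds.
interface VoCue {
  sliceStart: number; // a
  sliceEnd: number; // b
  placeAt: number; // p
}

interface TrackWord {
  word: string;
  trackStart: number;
  trackEnd: number;
}

// track_time = p + (w - a), valid only when a <= w <= b.
function mapWordsToTrack(words: TimedWord[], cue: VoCue): TrackWord[] {
  const a = cue.sliceStart;
  const b = cue.sliceEnd;
  const p = cue.placeAt;
  return words
    .filter((w) => w.start >= a && w.start <= b) // words outside the slice are excluded
    .map((w) => ({
      word: w.word,
      trackStart: p + (w.start - a),
      trackEnd: p + (Math.min(w.end, b) - a), // assumption: clamp to the slice end
    }));
}
```

Calling it once per cue, including cues with overlapping slices of the same source, yields the "multiple effective track times" case described above.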
78
+ ### Step 4: Map script text to timing words
79
+
80
+ Walk through the per-segment script from PLAN.md. For each segment's script section, find the corresponding words in the timing data by text matching.
81
+
82
+ - Match is case-insensitive and ignores punctuation.
83
+ - Provider timing may include words from pause markers or SSML annotations that were in the provider script but not the PLAN script -- skip those.
84
+ - If the provider timing has significantly different text (indicating the TTS changed wording), flag this to the user.
85
+
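One way to picture the matching rules above is a forward-only walk over the timing words. This is a rough sketch, not the procedure the agent must follow: `normalizeWord`, `matchScript`, and the `lookahead` window are hypothetical, and the normalization only handles ASCII letters and digits.

```ts
// Word entry as shown in the timing.json example.
interface TimedWord {
  word: string;
  start: number;
  end: number;
}

// Case-insensitive, punctuation-free form of a word (ASCII-only assumption).
function normalizeWord(raw: string): string {
  return raw.toLowerCase().replace(/[^a-z0-9]+/g, "");
}

// For each script word, returns the index of the matching timing word,
// or null when no match is found within `lookahead` entries.
function matchScript(
  scriptText: string,
  timing: TimedWord[],
  lookahead = 5,
): (number | null)[] {
  const scriptWords = scriptText
    .split(/\s+/)
    .map(normalizeWord)
    .filter((w) => w.length > 0);
  const result: (number | null)[] = [];
  let cursor = 0;
  for (const target of scriptWords) {
    let found: number | null = null;
    const limit = Math.min(timing.length, cursor + lookahead);
    // Extra provider words (pause markers, SSML residue) are skipped here.
    for (let i = cursor; i < limit; i++) {
      if (normalizeWord(timing[i].word) === target) {
        found = i;
        break;
      }
    }
    result.push(found);
    if (found !== null) cursor = found + 1;
  }
  return result;
}
```

A high share of `null` entries is a reasonable signal that the provider text diverged from the PLAN script and should be flagged to the user.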
86
+ ### Step 5: Find segment boundaries
87
+
88
+ For each segment, determine the advance times:
89
+
90
+ - **Align to the next segment's VO onset, not the current segment's VO offset.** Each segment's final advance should land just *before* the next segment's first spoken word, so the voiceover starts right after the transition.
91
+ - Find the first word of the *next* segment's script section on the track timeline. Place the segment boundary 0.1-0.3s before that word's effective track start time.
92
+ - For the **last segment** (no next segment), find the last word in the segment's script section and add 0.3-0.5s after its end time.
93
+
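The boundary rule can be written down compactly. In this sketch, `finalAdvanceTrackTime` is a hypothetical helper, and the defaults `lead = 0.2` and `tail = 0.4` are simply midpoints of the 0.1-0.3s and 0.3-0.5s ranges given above; the right value within each range remains a judgment call.

```ts
// Returns the track-absolute time for a segment's final advance.
// `nextFirstWordStart` is the effective track start of the next segment's
// first spoken word, or null when this is the last segment.
function finalAdvanceTrackTime(
  lastWordEnd: number,
  nextFirstWordStart: number | null,
  lead = 0.2, // assumed default, within the 0.1-0.3s range
  tail = 0.4, // assumed default, within the 0.3-0.5s range
): number {
  if (nextFirstWordStart === null) {
    // Last segment: land shortly after the final word ends.
    return lastWordEnd + tail;
  }
  // Otherwise: land just before the next segment's VO onset.
  return nextFirstWordStart - lead;
}
```

The result is still track-absolute; it is converted to segment-relative time in Step 8.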
94
+ ### Step 6: Handle multi-advance segments
95
+
96
+ Segments with multiple advances have internal beats (`waitForNext()` calls). For these:
97
+
98
+ 1. **Count the advances.** The segment's `advances` array length tells you how many beats are needed.
99
+ 2. **Identify beat positions.** Look for natural break points in the segment's script:
100
+ - `[pause for animation]` markers in PLAN.md.
101
+ - Sentence boundaries that align with visual transitions.
102
+ - Content transition cues: "Next,...", "And now,...", "Finally,...".
103
+ 3. **Place internal advances** at the effective track time of the words at these break points, converted to segment-relative time.
104
+ 4. **Place the final advance** just before the next segment's first VO word (per Step 5).
105
+
106
+ ### Step 7: Incorporate SFX and music anchors
107
+
108
+ Read `Notes:` fields from non-VO cues (SFX, music) in the plan snapshot. Look for time-specific hints:
109
+
110
+ - "beat drops at 5.3s" -- a music moment that could align with a visual transition
111
+ - "whoosh at 12.0s" -- a SFX that should coincide with a visual event
112
+ - "typing starts at 8.2s" -- a SFX placement that the video should sync to
113
+
114
+ These anchors are secondary to VO timing. Use them to fine-tune advance placement when a visual beat should land on an audio event. If an anchor conflicts with VO-derived timing, VO timing wins.
115
+
116
+ ### Step 8: Convert to segment-relative times
117
+
118
+ All advance values in the `Timing` are segment-relative (seconds since the segment started), not track-absolute.
119
+
120
+ ```
121
+ segment_start = sum of all previous segments' final segment-relative advance times
122
+ (cumulative from the start of the track)
123
+ advance_time = absolute_track_time - segment_start
124
+ ```
125
+
126
+ **Worked example** (3 segments, 1 advance each):
127
+
128
+ | Segment | Track-absolute advance | segment_start | Segment-relative advance |
129
+ |---|---|---|---|
130
+ | intro | 4.2s | 0.0s (first segment) | 4.2 - 0.0 = **4.2s** |
131
+ | feature-cards | 16.2s | 4.2s (intro's final advance) | 16.2 - 4.2 = **12.0s** |
132
+ | outro | 19.7s | 16.2s (feature-cards' final advance) | 19.7 - 16.2 = **3.5s** |
133
+
134
+ For multi-advance segments, `segment_start` is still the cumulative total of all previous segments' **final** advance values. Internal advances within a segment are converted using the same `segment_start`.
135
+
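The conversion can be sketched as follows. `AbsoluteSegment`, `toSegmentRelative`, and `round3` are hypothetical names. The sketch relies on the fact that the cumulative total of previous final advances equals the previous segment's final track-absolute advance, and rounds to milliseconds to avoid floating-point noise such as `11.999999999999998`.

```ts
// Hypothetical input: track-absolute advance times per segment, in timeline order.
interface AbsoluteSegment {
  id: string;
  absoluteAdvances: number[];
}

// Round to 3 decimals (milliseconds).
function round3(value: number): number {
  return Math.round(value * 1000) / 1000;
}

function toSegmentRelative(
  segments: AbsoluteSegment[],
): Record<string, number[]> {
  const perSegment: Record<string, number[]> = {};
  let segmentStart = 0; // first segment starts at 0.0s
  for (const seg of segments) {
    // Internal and final advances all use the same segmentStart.
    perSegment[seg.id] = seg.absoluteAdvances.map((t) => round3(t - segmentStart));
    const last = seg.absoluteAdvances[seg.absoluteAdvances.length - 1];
    if (last !== undefined) {
      // The next segment starts where this one's final advance lands.
      segmentStart = last;
    }
  }
  return perSegment;
}
```

With the worked example's absolute times (4.2, 16.2, 19.7), the final advances come out as 4.2, 12.0, and 3.5, matching the table.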
136
+ ### Step 9: Apply the "audio always wins" rule
137
+
138
+ The video duration adapts to match the audio:
139
+
140
+ - If the audio for a segment is shorter than the segment's current advances suggest, compress the advances.
141
+ - If the audio is longer, stretch the advances.
142
+ - The last advance of the last segment should land at or near the end of the audio track.
143
+ - Never truncate audio. Never pad with silence.
144
+
145
+ ## Writing the timing
146
+
147
+ ### Update track.ts
148
+
149
+ Write the computed `timing` into the active track's `track.ts`:
150
+
151
+ ```ts
152
+ timing: {
153
+ perSegment: {
154
+ 'intro': [4.2],
155
+ 'feature-cards': [2.1, 5.8, 9.3, 12.0],
156
+ 'outro': [3.5],
157
+ },
158
+ },
159
+ ```
160
+
161
+ ### Present to the user
162
+
163
+ Show the timing with annotations:
164
+
165
+ ```
166
+ Proposed timing for audio track v3:
167
+
168
+ intro (1 advance):
169
+ [0] 4.2s -- "...set us apart." (end of intro narration)
170
+
171
+ feature-cards (4 advances):
172
+ [0] 2.1s -- "...across devices." (end of collaboration section)
173
+ [1] 5.8s -- "...in one view." (end of analytics section)
174
+ [2] 9.3s -- "...and more." (end of integrations section)
175
+ [3] 12.0s -- transition (next VO starts at ~12.2s)
176
+
177
+ outro (1 advance):
178
+ [0] 3.5s -- "...Thanks for watching." (end of video)
179
+
180
+ Total track duration: 19.7s
181
+ ```
182
+
183
+ For each advance, show:
184
+ - The segment-relative time in seconds.
185
+ - A snippet of the script text the advance lands on.
186
+ - What the advance does (internal beat vs. segment transition).
187
+
188
+ ### Iteration
189
+
190
+ The user may request adjustments:
191
+
192
+ - "Move the second beat in feature-cards 0.5s later" -- adjust that advance value.
193
+ - "The intro feels rushed" -- add buffer to the intro's advance.
194
+ - "Combine the first two beats" -- this changes the number of advances, which means a `waitForNext()` needs to be removed from segment code. Flag the code change.
195
+
196
+ After each adjustment, re-present the timing. When the user confirms, write it into `track.ts`.
197
+
198
+ ### Append sync log entry
199
+
200
+ After the timing is finalized, append to `audio/audio_plan.md`:
201
+
202
+ ```markdown
203
+ ## YYYY-MM-DD HH:MM -- Synced timing to vN
204
+ **Segments:** intro, feature-cards, outro
205
+ **Total duration:** 19.7s
206
+ ```
207
+
208
+ ### Animation sync pass
209
+
210
+ After writing timing, consider whether in-segment animations need adjustment. Load [voiceover/animation_sync.md](voiceover/animation_sync.md) if the audio track is being set as the default and the video has fully automated animations (that is, animations that are not gated on `waitForNext()`).
211
+
212
+ ## VO-only shortcut
213
+
214
+ For VO-only tracks (single VO cue, full file, placed at 0s), the sync procedure is identical to the existing VO sync algorithm in [voiceover/sync_algorithm.md](voiceover/sync_algorithm.md). The mapping simplifies:
215
+
216
+ - `a = 0`, `p = 0`, so `track_time = w` (source time equals track time).
217
+ - No SFX/music anchors to incorporate.
218
+ - The full existing sync algorithm applies unchanged.
219
+
220
+ ## Edge cases
221
+
222
+ | Situation | Behavior |
223
+ |---|---|
224
+ | Plan snapshot has no VO cues | No word-level timing data. Use SFX/music anchors if available. Otherwise, fall back to the segment's existing `advances` values. |
225
+ | VO timing.json is missing | Error. Direct the user to generate timing data (via TTS with timestamps or STT transcription). |
226
+ | VO slice out of range | Error before computing. "VO v3 is 4.0s; slice 1.2-5.6 exceeds source." |
227
+ | Multiple VO cues from the same source with overlapping slices | Valid. A word may have multiple effective track times. Use the most relevant one for the segment being synced. |
228
+ | Segment has no script (silent segment) | Use the segment's existing `advances` values. The segment passes through without VO-derived timing. |
229
+ | SFX anchor contradicts VO timing | VO timing wins. Note the conflict in the timing presentation. |
230
+ | Track has no timing.json references (SFX/music only, no VO) | Use SFX/music anchors plus segment `advances` as the base. This is unusual but valid. |
231
+
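The "VO slice out of range" row lends itself to a pre-check that runs before any mapping. This is a sketch only: `checkSlice` and its parameters are hypothetical, and the inverted-slice check is an extra assumption beyond what the table lists.

```ts
// Returns an error message, or null when the slice fits inside the source.
// `label` is a display name for the VO source, e.g. "VO v3".
function checkSlice(
  label: string,
  sourceDuration: number,
  sliceStart: number,
  sliceEnd: number,
): string | null {
  if (sliceStart >= sliceEnd) {
    // Assumption: an empty or inverted slice is also treated as an error.
    return `${label}: slice ${sliceStart}-${sliceEnd} is empty or inverted.`;
  }
  if (sliceStart < 0 || sliceEnd > sourceDuration) {
    return `${label} is ${sourceDuration.toFixed(1)}s; slice ${sliceStart}-${sliceEnd} exceeds source.`;
  }
  return null;
}
```

For a 4.0s source and slice 1.2-5.6, this produces the same message as the table row.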
232
+ ## Relationship to other files
233
+
234
+ - **[voiceover/sync_algorithm.md](voiceover/sync_algorithm.md)** -- the VO-level sync algorithm. Still the reference for parsing timing.json and matching words to script. This file extends it to audio-track-level sync.
235
+ - **[voiceover/animation_sync.md](voiceover/animation_sync.md)** -- the in-segment animation adjustment pass. Run after sync when the track is set as default.
236
+ - **[build.md](build.md)** -- the build workflow that precedes sync.
237
+ - **[audio_plan.md](audio_plan.md)** -- the plan format specification.
@@ -0,0 +1,142 @@
1
+ # Animation Sync
2
+
3
+ ## When this is loaded
4
+
5
+ A voiceover is being set as the default for a video, and you need to perform a one-time sync pass to adjust in-segment animations to align with audio beats.
6
+
7
+ ## What this is
8
+
9
+ When an audio track is set as the default (`default_audio_track` in `timeline.ts`), the agent performs a one-time manual pass over segment code to align fully automated animations with the audio track's timing. This is a planning-time agent action, not a runtime mechanism.
10
+
11
+ ## What to sync
12
+
13
+ Look for **fully automated animations** -- animations that run on a fixed clock inside a segment, not gated on user interaction (`waitForNext()`). Examples:
14
+
15
+ ### Animations to consider
16
+
17
+ - **`ctx.hold(ms)` calls** -- if a hold duration no longer matches the voiceover timing, adjust it.
18
+ - **CSS transitions with explicit durations** -- `transition: transform 2s ease` where the duration should match an audio beat.
19
+ - **CSS animation durations** -- `animation: fadeIn 1.5s forwards`.
20
+ - **GSAP timelines** -- `.to(el, { duration: 2, ... })` where the duration should align with narration.
21
+ - **Lottie playback speed** -- if a Lottie animation's duration needs to match a segment beat.
22
+ - **Element appearance delays** -- WAAPI `delay` values or CSS animation delays that stage element reveals.
23
+
24
+ ### Animations to leave alone
25
+
26
+ - **Transition animations** between segments (handled by the transition system, not segment code).
27
+ - **Infinite/looping animations** that run as backgrounds (e.g., a subtle pulsing gradient).
28
+ - **User-gated animations** that only trigger on `waitForNext()` -- these are already controlled by the `Timing`.
29
+
30
+ ## The sync procedure
31
+
32
+ ### Step 1: Read the voiceover's Timing
33
+
34
+ Get the `perSegment` timing from the voiceover object. Each segment's advance times tell you when visual beats should land.
35
+
36
+ ### Step 2: Walk each segment
37
+
38
+ For each segment in the timeline:
39
+
40
+ 1. Read the segment's code (`segments/<id>/index.ts`).
41
+ 2. Identify fully automated animations (see "What to sync" above).
42
+ 3. Compare the animation's current duration/timing with the voiceover's advance schedule for that segment.
43
+
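The comparison in item 3 can be sketched as a small helper. This is only an aid for the manual pass, not part of videowright: the `SegmentTiming` shape (an `advances` array in seconds), the `AnimationCandidate` record, and `suggestDurations` are all hypothetical names, and the real `Timing` object may be structured differently.

```typescript
// Hypothetical shape: the real Timing type in videowright may differ.
interface SegmentTiming {
  advances: number[]; // seconds from segment start at which each beat lands
}

// Hypothetical record describing one automated animation found in a segment.
interface AnimationCandidate {
  label: string;      // e.g. "ctx.hold" or ".title animation"
  startMs: number;    // when the animation starts, relative to segment start
  durationMs: number; // duration currently written in the segment code
}

interface SyncSuggestion {
  label: string;
  fromMs: number;
  toMs: number;
}

// For each candidate, find the beat closest to where the animation currently
// ends and suggest a duration that makes it end exactly on that beat.
function suggestDurations(
  timing: SegmentTiming,
  candidates: AnimationCandidate[],
  toleranceMs: number = 50,
): SyncSuggestion[] {
  const beatsMs = timing.advances.map((s) => Math.round(s * 1000));
  const suggestions: SyncSuggestion[] = [];
  for (const c of candidates) {
    const endMs = c.startMs + c.durationMs;
    // Only beats after the animation starts are usable end points.
    const reachable = beatsMs.filter((b) => b > c.startMs);
    if (reachable.length === 0) continue;
    const nearest = reachable.reduce((best, b) =>
      Math.abs(b - endMs) < Math.abs(best - endMs) ? b : best,
    );
    // Skip animations that already land within the tolerance.
    if (Math.abs(nearest - endMs) > toleranceMs) {
      suggestions.push({ label: c.label, fromMs: c.durationMs, toMs: nearest - c.startMs });
    }
  }
  return suggestions;
}
```

The output is a list of proposals; whether a given animation should really end on that beat is still a judgment call for the agent.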
44
+ ### Step 3: Adjust durations
45
+
46
+ Where an automated animation's duration should align with an audio beat, adjust it:
47
+
48
+ **Example: hold duration adjustment**
49
+
50
+ Before:
51
+ ```ts
52
+ async play(ctx) {
53
+ // Title animation
54
+ el.classList.add('visible');
55
+ await ctx.hold(3000); // original: 3 seconds
56
+ }
57
+ // advances: [3.0]
58
+ ```
59
+
60
+ After (voiceover timing says the intro narration ends at 4.2s):
61
+ ```ts
62
+ async play(ctx) {
63
+ // Title animation
64
+ el.classList.add('visible');
65
+ await ctx.hold(4200); // adjusted to match voiceover timing
66
+ }
67
+ // advances are now driven by the Timing object, not by a hard-coded advances array
68
+ ```
69
+
70
+ **Example: CSS animation adjustment**
71
+
72
+ Before:
73
+ ```ts
74
+ mount(el) {
75
+ el.innerHTML = `
76
+ <style>
77
+ .title { animation: slideIn 1.5s ease-out forwards; }
78
+ </style>
79
+ <h1 class="title">Hello</h1>
80
+ `;
81
+ }
82
+ ```
83
+
84
+ After (narration starts 0.5s in, so title should be visible by then):
85
+ ```ts
86
+ mount(el) {
87
+ el.innerHTML = `
88
+ <style>
89
+ .title { animation: slideIn 0.4s ease-out forwards; }
90
+ </style>
91
+ <h1 class="title">Hello</h1>
92
+ `;
93
+ }
94
+ ```
95
+
96
+ ### Step 4: Add timing comments
97
+
98
+ When adjusting a duration to match voiceover timing, add a brief comment noting why:
99
+
100
+ ```ts
101
+ await ctx.hold(4200); // synced to voiceover: intro narration ends at 4.2s
102
+ ```
103
+
104
+ This helps future editors understand the magic number.
105
+
106
+ ### Step 5: Render-safety review
107
+
108
+ After adjusting each segment, run the render-safety CR checklist from [create_or_edit_video.md](../../create_or_edit_video.md) (Step 2b) against the modified code. The sync pass may introduce or reveal non-idiomatic patterns -- for example, animations that depend on external network fetches or non-deterministic input. Fix any issues before moving to the next segment.
109
+
110
+ ## Tradeoffs
111
+
112
+ - **Sync is to the default audio track only.** If the user later switches to a different audio track via `--audio-track <other-id>`, advance timing updates automatically (driven by the new track's `Timing`), but in-segment animation durations remain tuned to the original default.
113
+ - **No runtime re-sync.** The sync is a one-time manual edit. There is no mechanism to dynamically adjust animation durations at playback time based on the active audio track.
114
+ - **Re-sync is available.** If the user changes the default audio track, they can ask the agent to re-run the animation sync pass. The agent reads the new `Timing` and adjusts durations again.
115
+
116
+ ## When to skip animation sync
117
+
118
+ - The video has no fully automated animations (all timing is `waitForNext`-driven).
119
+ - The user explicitly says they do not want animation adjustments.
120
+ - The audio track is not being set as the default (it is an alternate take that will be used via `--audio-track <id>` only).
121
+
122
+ In these cases, set `default_audio_track` in `timeline.ts` without modifying segment code.
123
+
124
+ ## Presenting changes to the user
125
+
126
+ After the sync pass, summarize the changes:
127
+
128
+ ```
129
+ Animation sync for default voiceover "v1":
130
+
131
+ intro:
132
+ - ctx.hold(3000) -> ctx.hold(4200) -- match intro narration end
133
+ - CSS .title animation: 1.5s -> 0.4s -- title visible before narration starts
134
+
135
+ feature-cards:
136
+ - No automated animations found (all waitForNext-driven)
137
+
138
+ outro:
139
+ - ctx.hold(2000) -> ctx.hold(3500) -- match outro narration end
140
+ ```
141
+
142
+ Let the user review before committing the changes.
@@ -0,0 +1,153 @@
1
+ # Provider Script
2
+
3
+ ## When this is loaded
4
+
5
+ You have a confirmed script in PLAN.md and need to transform it into a provider-specific format for TTS generation.
6
+
7
+ ## What `provider_script.md` is
8
+
9
+ The `provider_script.md` file is a transformation of the PLAN.md script with provider-specific annotations. It is the text that the user copies into the TTS portal (or the agent sends via API) to generate audio. It lives at:
10
+
11
+ ```
12
+ videos/<video>/audio/originals/voiceovers/<slug>/provider_script.md
13
+ ```
14
+
15
+ The file is markdown for human readability, but its content is the exact text to paste into the provider. Segment headings are stripped -- only the narration text and annotations are included.
16
+
17
+ ## Target: ElevenLabs v2
18
+
19
+ All provider scripts target **ElevenLabs v2** (`eleven_multilingual_v2`). This model is chosen because it supports exact pause timing via SSML `<break>` tags, which is critical for syncing audio to video animations.
20
+
21
+ **Do NOT use v3-style tags.** ElevenLabs v3 introduced inline emotion tags like `[excited]`, `[calm]`, `[whispering]`, etc. These tags are **silently ignored** by the v2 model. Never include them in a provider script.
22
+
23
+ ## Generating the provider script
24
+
25
+ ### Step 1: Read the confirmed script from PLAN.md
26
+
27
+ Extract the script sections in timeline order.
28
+
29
+ ### Step 2: Apply delivery style through writing
30
+
31
+ Since v2 does not have v3's emotion tag system, tone and emotion are conveyed through **how the text is written** -- punctuation, sentence structure, and word choice. The v2 model's prosody engine responds to natural language cues:
32
+
33
+ #### The v2 writing toolkit
34
+
35
+ | Technique | Effect on v2 delivery | Example |
36
+ |---|---|---|
37
+ | **Exclamation marks** | Increased energy, slight pitch rise | "This changes everything!" |
38
+ | **Question marks** | Rising intonation | "Ready to get started?" |
39
+ | **Ellipsis** (`...`) | Natural micro-pause (~0.3-0.5s), trailing-off feel | "And then... something unexpected." |
40
+ | **Em-dash** (`--`) | Brief breath pause (~0.2-0.4s), interruption feel | "The result -- stunning." |
41
+ | **ALL CAPS** | Slight emphasis on the word (use sparingly) | "This is EXACTLY what we needed." |
42
+ | **Short punchy sentences** | Energetic, urgent pacing | "It's fast. It's reliable. It just works." |
43
+ | **Long flowing sentences** | Calm, measured delivery | "Over the course of the next few minutes, we'll walk through each of the core features that make this possible." |
44
+ | **Commas and semicolons** | Micro-pacing within a sentence | "First, the dashboard; then, the analytics." |
45
+ | **Repeated punctuation** | Heightened emotion (use very sparingly) | "This is incredible!!" |
46
+ | **Parenthetical asides** | Softer, conspiratorial tone | "The best part (and this surprised us too) is the speed." |
47
+
48
+ #### What v2 cannot do
49
+
50
+ - **No explicit emotion control.** You cannot tag a section as `[excited]` or `[whispering]`. The closest approximation is through writing style (see toolkit above).
51
+ - **No voice speed control via script text.** Speaking rate is controlled by the speed slider in the portal or the `speed` parameter in the API, not by script content.
52
+ - **No pitch control.** v2 does not support pitch-shifting tags.
53
+
54
+ This is a deliberate trade-off: v2 gives us **deterministic pause timing** via `<break>` tags at the cost of explicit emotion tags. For voiceover-to-video sync, predictable timing is more valuable than fine-grained emotion control.
55
+
56
+ ### Step 3: Add pauses
57
+
58
+ Where the PLAN.md script has `[pause for animation]` markers, insert pause mechanisms appropriate to the desired duration:
59
+
60
+ #### Pause reference (v2)
61
+
62
+ | Pause type | Syntax | Duration | Use for |
63
+ |---|---|---|---|
64
+ | Natural micro-pause | `...` (ellipsis) | ~0.3-0.5s | Trailing off, rhetorical beat |
65
+ | Breath pause | `--` (em-dash) | ~0.2-0.4s | Parenthetical, interruption |
66
+ | Sentence break | Period + new sentence | ~0.5-0.8s | Normal sentence transition |
67
+ | Medium pause | `<break time="1.0s" />` | Exact (1.0s) | Animation beat between ideas |
68
+ | Long pause | `<break time="2.5s" />` | Exact (2.5s) | Major visual transition |
69
+ | Very long pause | `<break time="4.0s" />` | Exact (up to ~5s) | Extended animation sequence |
70
+
71
+ **The `<break>` SSML tag is the primary mechanism for precise pauses.** It is supported by v2 and produces exact, deterministic timing. Use it for any pause of 0.5 seconds or more where the video needs time for a visual transition.
72
+
73
+ Syntax: `<break time="X.Xs" />` where `X.X` is seconds (e.g., `"1.5s"`, `"0.8s"`, `"3.0s"`).
74
+
75
+ For pauses under 0.5 seconds, ellipses and em-dashes often sound more natural than a `<break>` tag. For pauses over 0.5 seconds, use `<break>`.
76
+
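The pause rules above can be sketched as two small helpers. `breakTag` and `pauseMarkup` are hypothetical names used for illustration; the 0.5-second cutoff and the one-decimal `X.Xs` format are taken from the text above.

```typescript
// Format a <break> tag in the X.Xs form described above,
// e.g. breakTag(1.5) gives <break time="1.5s" />.
function breakTag(seconds: number): string {
  if (!isFinite(seconds) || seconds <= 0) {
    throw new RangeError("pause must be a positive number of seconds, got " + seconds);
  }
  return `<break time="${seconds.toFixed(1)}s" />`;
}

// Pick a pause mechanism: punctuation for very short pauses,
// an exact <break> tag for everything from 0.5 seconds up.
function pauseMarkup(seconds: number): string {
  return seconds < 0.5 ? "..." : breakTag(seconds);
}
```

For example, `pauseMarkup(0.3)` yields an ellipsis, while `pauseMarkup(2.5)` yields a `<break>` tag with `time="2.5s"`.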
77
+ **Important: `<break>` cannot be the first element in the script.** ElevenLabs does not support a `<break>` tag at the very start of the text -- the TTS engine requires spoken text before the first break. If the video needs an initial silent pause before narration begins, start with a spoken word and place the first `<break>` after it, or handle the initial silence through video timing (e.g., a leading silent segment).
78
+
79
+ ### Step 4: Handle segment boundaries
80
+
81
+ The provider script is one continuous block of text (not divided by segment). Segment boundaries from PLAN.md become natural pause points in the provider script. Insert a `<break>` tag at each segment transition to give the audio natural breathing room:
82
+
83
+ ```
84
+ ...set us apart.
85
+
86
+ <break time="1.0s" />
87
+
88
+ First up: real-time collaboration.
89
+ ```
90
+
91
+ The duration of segment-boundary breaks depends on the video's pacing. A good default is 1.0-1.5 seconds.
92
+
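A minimal sketch of the joining step, assuming the per-segment narration has already been extracted from PLAN.md into an array of strings. `joinSegments` is a hypothetical helper name, and the 1.0-second default follows the guidance above.

```typescript
// Join per-segment narration into one continuous block, inserting a
// <break> tag at each segment boundary. Breaks are placed only between
// segments, so the result never starts with a <break> tag.
function joinSegments(narrations: string[], boundarySeconds: number = 1.0): string {
  const spoken = narrations
    .map((text) => text.trim())
    .filter((text) => text.length > 0); // skip silent segments
  const tag = `<break time="${boundarySeconds.toFixed(1)}s" />`;
  return spoken.join(`\n\n${tag}\n\n`);
}
```

Empty segments are dropped rather than turned into back-to-back breaks; if a silent segment needs its own pause, lengthen the neighbouring break by hand.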
93
+ ### Step 5: Pronunciation and special terms
94
+
95
+ v2 handles pronunciation through these mechanisms:
96
+
97
+ - **Spell out letters:** "A P I" (with spaces) for letter-by-letter pronunciation.
98
+ - **Phonetic hint:** For unusual names, spell them phonetically nearby: "Istio (is-tee-oh)" or just "is-tee-oh" if the correct name is not needed in audio.
99
+ - **URLs:** Write as speech: "acme dot com" instead of "acme.com".
100
+ - **Numbers:** "one hundred twenty three" instead of "123" if the TTS reads it oddly. Test first -- ElevenLabs v2 generally handles numbers well.
101
+
102
+ v2 does NOT support IPA phoneme tags or SSML `<phoneme>` elements. Use inline phonetic spelling as the workaround.
103
+
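The URL and letter-spelling conventions above are mechanical enough to sketch in code. `spokenUrl` and `spellOut` are hypothetical helper names; they cover only the simple cases listed here, and anything unusual still needs a manual phonetic hint.

```typescript
// Rewrite a URL the way it would be read aloud, e.g. "acme.com" -> "acme dot com".
function spokenUrl(url: string): string {
  return url
    .replace(/^[a-z]+:\/\//i, "") // drop the scheme (https://)
    .replace(/\/+$/, "")          // drop trailing slashes
    .replace(/\./g, " dot ")
    .replace(/\//g, " slash ");
}

// Space out an acronym for letter-by-letter pronunciation, e.g. "API" -> "A P I".
function spellOut(acronym: string): string {
  return acronym.split("").join(" ");
}
```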
104
+ ## Output format
105
+
106
+ Write the provider script as a single markdown file:
107
+
108
+ ```markdown
109
+ # Provider Script
110
+
111
+ > Provider: ElevenLabs v2 (eleven_multilingual_v2)
112
+ > Voice: [voice name from selection, e.g. "Asher"]
113
+ > Style notes: Conversational, warm
114
+
115
+ ---
116
+
117
+ Welcome to Acme Product. Today we'll walk through the three features that set us apart.
118
+
119
+ <break time="1.2s" />
120
+
121
+ First up: real-time collaboration! Your team can edit simultaneously, with changes syncing instantly across devices.
122
+
123
+ <break time="2.0s" />
124
+
125
+ Next, the analytics dashboard. Track engagement, conversion, and retention -- in one view.
126
+
127
+ <break time="2.0s" />
128
+
129
+ Finally, integrations. Connect with the tools you already use... Slack, GitHub, Jira, and more.
130
+
131
+ <break time="1.0s" />
132
+
133
+ Ready to get started? Visit acme dot com for a free trial. Thanks for watching.
134
+ ```
135
+
136
+ ### Key conventions in the output
137
+
138
+ - The header block (blockquote) is metadata for the user, not pasted into the portal or sent to the API.
139
+ - Everything below the `---` is the text to paste or send.
140
+ - `<break>` tags are inline where exact pauses should occur.
141
+ - Punctuation-driven pauses (ellipses, em-dashes) are inline for natural micro-pauses.
142
+ - URLs are written phonetically ("acme dot com") to avoid TTS mispronunciation.
143
+ - **No v3 emotion tags** (`[excited]`, `[calm]`, `[whispering]`, etc.) -- these are silently ignored by v2.
144
+
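When the agent sends the script via the API instead of the user pasting it, the same conventions apply. The sketch below assumes the file layout shown in the output format; `extractPasteText` and `findBracketTags` are hypothetical helper names, not part of videowright.

```typescript
// Return the paste-ready text: everything below the first "---" line.
// The header blockquote above the rule is metadata and is left out.
function extractPasteText(providerScript: string): string {
  const lines = providerScript.split(/\r?\n/);
  let ruleIndex = -1;
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].trim() === "---") {
      ruleIndex = i;
      break;
    }
  }
  if (ruleIndex === -1) {
    throw new Error('provider script has no "---" separator');
  }
  return lines.slice(ruleIndex + 1).join("\n").trim();
}

// List bracketed tags such as [excited] or a leftover [pause for animation]
// marker so they can be removed or converted before the text is sent.
function findBracketTags(text: string): string[] {
  return text.match(/\[[^\]\n]+\]/g) || [];
}
```

Running `findBracketTags` on the extracted text is a cheap final check: an empty result means no v3-style tags or unconverted PLAN.md markers remain.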
145
+ ## Presenting to the user
146
+
147
+ After generating the provider script:
148
+
149
+ 1. Show the full script with a brief explanation of the annotations used.
150
+ 2. Tell the user:
151
+ > Copy everything below the horizontal rule into the ElevenLabs portal, or let the agent send it via the API. See the provider reference for step-by-step instructions.
152
+ 3. Write the file to `videos/<video>/audio/originals/voiceovers/<slug>/provider_script.md`.
153
+ 4. Proceed to the provider walkthrough: [providers/elevenlabs.md](providers/elevenlabs.md).