@koda-sl/baker-cli 0.90.1 → 0.92.0
- package/README.md +37 -4
- package/canvas/video-overlay-composition/index.html +31 -5
- package/dist/{chunk-2E4H2GIJ.js → chunk-LMVDA3EZ.js} +59 -5
- package/dist/chunk-LMVDA3EZ.js.map +1 -0
- package/dist/cli.js +519 -108
- package/dist/cli.js.map +1 -1
- package/dist/engine/index.js +1 -1
- package/package.json +1 -1
- package/dist/chunk-2E4H2GIJ.js.map +0 -1
package/README.md
CHANGED
@@ -2361,6 +2361,37 @@ Permissions enforced server-side:
 
 ---
 
+### Missions (`baker missions`)
+
+A **Mission** groups an ordered set of actions (its "steps") under one goal, with a markdown **overview** (the human-readable plan). Use a mission whenever a request decomposes into 2+ ordered actions — audits, campaigns, multi-step plans. A single one-off capture stays a loose action.
+
+Mission write ops stage on the **same chat draft as actions** and apply atomically on the existing publish — there is no separate apply. `BAKER_CHAT_ID` must be set for every command except `list`/`get`.
+
+```bash
+# Open a mission (returns { ok, data: { missionTempId }, hints })
+baker missions create --title "Audit Google Ads" --overview "## Phase 1 — Pull data\n..."
+
+# Create the action steps in order, then attach each by 0-based --order.
+# Both --mission and --action accept a real id or a temp_* ref from the same draft.
+baker missions add-action --mission <id-or-missionTempId> --action <id-or-tempId> --order 0
+
+baker missions update <mission-id> --title "..." --overview "..." # real id only
+baker missions set-status <mission-id> --status accomplished # active | accomplished | aborted
+
+baker missions list [--include-aborted] # per-mission progress (done/total) + ordered steps
+baker missions get <mission-id>
+```
+
+Rules:
+
+- **The overview is forward planning** — write it as numbered Phases (goal / what it produces / what unlocks next), with **no calendar framing** (dates, deadlines, ETAs) and **no effort framing** (quick-win, MVP, t-shirt sizes).
+- **Publishing the chat = the user approving the plan.** The mission and its ordered steps apply atomically; the dashboard then shows the mission grouped, ticking off as steps complete.
+- **Mark `accomplished` only when the goal is genuinely met**, `aborted` if abandoned — never auto-conclude from step status.
+- Hard dependencies between steps still go through `baker actions link`; missions (ordering) and dependencies (blocking) are orthogonal.
+- All staged mission ops revert automatically if the chat is discarded.
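As a rough illustration of the attach step in the Missions section above, the sketch below builds one `baker missions add-action` argument list per step and numbers them with the 0-based `--order` the README describes. The helper name `buildAddActionArgs` and the sample refs (`temp_m1`, `temp_a1`, and so on) are hypothetical; only the subcommand and its three flags come from the README text.

```javascript
// Hypothetical helper, not part of baker-cli. It only assembles argv arrays
// from the flags documented above (--mission, --action, --order).
function buildAddActionArgs(missionRef, actionRefs) {
  if (!missionRef) throw new Error("missionRef is required");
  return actionRefs.map((actionRef, order) => [
    "missions", "add-action",
    "--mission", missionRef,
    "--action", actionRef,
    "--order", String(order), // 0-based position of the step
  ]);
}

// Made-up refs: the README says real ids and temp_* refs are both accepted.
const commands = buildAddActionArgs("temp_m1", ["temp_a1", "temp_a2", "act_123"]);
for (const argv of commands) {
  console.log(["baker", ...argv].join(" "));
}
```

Each array could be handed to a process spawner with `BAKER_CHAT_ID` set in the environment, since the README requires it for every mission write command.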
+
+---
+
 ### `baker schema [command]`
 
 Inspect command argument schemas for AI agent introspection.
@@ -3360,9 +3391,10 @@ Place and mix several audio clips onto one timeline — a music bed plus timed v
 
 | Name | Type | Required | Notes |
 |---|---|---|---|
-| `tracks` | array | yes | `[{slot, start_s, gain_db?}]` — `slot` matches a wired input; `start_s` is absolute seconds; `gain_db`
+| `tracks` | array | yes | `[{slot, start_s, gain_db?}]` — `slot` matches a wired input; `start_s` is absolute seconds; `gain_db` sets the static level (e.g. `-20` for a music bed) |
 | `total_ms` | number | no | pins the final length (pad/trim) |
 | `output_format` | enum | no | `mp3` (default) / `wav` / `m4a` |
+| `duck` | object | no | sidechain-duck one track under others: `{ track, against: [...], threshold?, ratio?, attack?, release? }` — the `track` (music) drops while any `against` track (voice) carries signal and recovers in the gaps |
 
 **Outputs** — `audio`. **Cost** — free (local). Requires `ffmpeg`.
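To make the `duck` parameter concrete, here is a small sketch of how a sidechain-duck stage can be written as ffmpeg filter chains. It follows the shape of the `applyDuck` function added to the bundled engine in this release (split each key track, mix the keys if there are several, then `sidechaincompress`), but `duckChains`, the label map, and the slot names are simplified stand-ins. The defaults are the values visible in the bundled code, not documented guarantees.

```javascript
// Sketch only. labels maps slot name -> current filter label, e.g. { music: "a0" }.
// Returns the extra filter chains plus the updated label map.
function duckChains(duck, labels) {
  const has = (slot) => Object.prototype.hasOwnProperty.call(labels, slot);
  const out = { ...labels };
  const chains = [];
  if (!has(duck.track)) return { chains, labels: out };
  // Unlike the bundled code, self-references are skipped here instead of
  // being rejected by a separate validation step.
  const keys = duck.against.filter((slot) => has(slot) && slot !== duck.track);
  if (keys.length === 0) return { chains, labels: out };

  // Each key track is split: one copy stays in the mix, one drives the compressor.
  const keyLabels = keys.map((slot) => {
    const base = out[slot];
    chains.push(`[${base}]asplit=2[${base}m][${base}k]`);
    out[slot] = `${base}m`;
    return `[${base}k]`;
  });

  let key = keyLabels[0];
  if (keyLabels.length > 1) {
    chains.push(`${keyLabels.join("")}amix=inputs=${keyLabels.length}:normalize=0[duckkey]`);
    key = "[duckkey]";
  }

  const threshold = duck.threshold ?? 0.03;
  const ratio = duck.ratio ?? 8;
  const attack = duck.attack ?? 5;
  const release = duck.release ?? 300;
  chains.push(
    `[${out[duck.track]}]${key}sidechaincompress=threshold=${threshold}:ratio=${ratio}:attack=${attack}:release=${release}[ducked]`
  );
  out[duck.track] = "ducked";
  return { chains, labels: out };
}

const result = duckChains({ track: "music", against: ["vo"] }, { music: "a0", vo: "a1" });
console.log(result.chains.join(";"));
// [a1]asplit=2[a1m][a1k];[a0][a1k]sidechaincompress=threshold=0.03:ratio=8:attack=5:release=300[ducked]
```

Splitting the key track matters because an ffmpeg filter label can be consumed only once: the voice has to feed both the compressor's sidechain input and the final mix.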
 
@@ -3619,11 +3651,11 @@ Turn a reference video into a **runnable, self-validated reproduction canvas** i
 1. **`video_deconstruct`** (`~google/gemini-pro-latest`, full mode) — reverse-engineers the video into a scene-by-scene blueprint + word-level transcript, written next to the canvas as **`prompt.json`**. Each scene's `start_frame_prompt`/`end_frame_prompt` are inlined into the frame nodes (see below); `prompt.json` then rides along as the shared **global style reference** (palette, cast cohesion) and as provenance.
 2. **recurring-element selection** (`~google/gemini-flash-latest`) — picks only the **recurring, identity-critical** elements (each `global.cast` person, a recurring animal, a showcased product, the brand logo) and the scene indices each appears in. One real reference image grounds each element across **every** frame it appears in, so the same actor stays consistent the whole video. This selection runs as a **second pass over a slimmed blueprint** (cast/branding + each scene's frame prompts only) — a long ad's full blueprint can exceed the engine's inline-prompt limit, so the heavy per-scene detail (dialogue, overlays, transcript) the selector never reads is dropped before the prompt.
 
-Before the deconstruct it runs a **local shot-cut pass** on the source file with **[PySceneDetect](https://www.scenedetect.com)** (`scenedetect` CLI, `detect-content` — the battle-tested HSV content detector, installed in the canvas sandbox) and passes the cut timestamps as `video_deconstruct`'s `shot_cuts`. The deconstruct snaps its scene boundaries onto those real cuts and **splits any scene that spans one**, so a scene's frames can never straddle a hard cut (the failure where a scene's start frame was the couch and its end frame the b-roll). Two knobs tuned for fast social ads: the content **threshold defaults to 18** (PySceneDetect's own default of 27 misses soft reframes) and the **minimum scene length is dropped to 0.25s** (its default ~0.6s merges away rapid montage flashes) — so super-fast cuts survive and become cheap still-holds downstream.
+Before the deconstruct it runs a **local shot-cut pass** on the source file with **[PySceneDetect](https://www.scenedetect.com)** (`scenedetect` CLI, `detect-content` — the battle-tested HSV content detector, installed in the canvas sandbox) and passes the cut timestamps as `video_deconstruct`'s `shot_cuts`. The deconstruct snaps its scene boundaries onto those real cuts and **splits any scene that spans one**, so a scene's frames can never straddle a hard cut (the failure where a scene's start frame was the couch and its end frame the b-roll). Two knobs tuned for fast social ads: the content **threshold defaults to 18** (PySceneDetect's own default of 27 misses soft reframes) and the **minimum scene length is dropped to 0.25s** (its default ~0.6s merges away rapid montage flashes) — so super-fast cuts survive and become cheap still-holds downstream. The threshold is **adaptive**: if the first pass looks like a continuous shot shredded into many close micro-cuts (a talking-head selfie's natural motion), it re-runs at PySceneDetect's own default of 27 and trusts that — real montage cuts survive it, motion artifacts don't. Pinning **`--shot-threshold N`** disables the re-check (lower = more cuts). The backend additionally coalesces residual same-shot slivers. If `scenedetect` is unavailable it warns loudly and degrades to LLM-only boundaries.
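The adaptive re-check described above can be pictured with the sketch below. The README does not say how a "shredded" first pass is recognized, so `looksShredded` and its two tuning values are pure assumptions for illustration. `detect` is a stand-in for a `scenedetect` run and is expected to return cut timestamps in seconds.

```javascript
// Hypothetical heuristic: a pass counts as shredded when most gaps between
// consecutive cuts are shorter than maxGapS. maxGapS and minShare are made up.
function looksShredded(cutTimesS, maxGapS = 1.0, minShare = 0.6) {
  if (cutTimesS.length < 4) return false;
  let shortGaps = 0;
  for (let i = 1; i < cutTimesS.length; i++) {
    if (cutTimesS[i] - cutTimesS[i - 1] < maxGapS) shortGaps++;
  }
  return shortGaps / (cutTimesS.length - 1) >= minShare;
}

// detect(threshold) stands in for the local shot-cut pass.
function adaptiveShotCuts(detect, pinnedThreshold) {
  if (pinnedThreshold !== undefined) {
    // A pinned threshold is used as-is, with no re-check.
    return { threshold: pinnedThreshold, cuts: detect(pinnedThreshold) };
  }
  const first = detect(18); // default tuned for fast social ads
  if (!looksShredded(first)) return { threshold: 18, cuts: first };
  return { threshold: 27, cuts: detect(27) }; // fall back to PySceneDetect's default
}

// Fake detector: a sensitive threshold shreds the shot, the default finds one cut.
const fakeDetect = (t) => (t <= 18 ? [0.4, 0.9, 1.3, 1.8, 2.2] : [1.8]);
console.log(adaptiveShotCuts(fakeDetect).threshold);     // 27
console.log(adaptiveShotCuts(fakeDetect, 12).threshold); // 12
```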
 
 A shot longer than the video model's per-clip ceiling (Seedance's 15s, passed as `video_deconstruct`'s `max_clip_s`) is split into equal **continuation sub-scenes** that share their splice boundary exactly — so a long shot is reproduced in **full** (no truncation) and joins seamlessly. Each sub-scene carries `continues_previous`.
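One way to picture the equal split is the sketch below. It is not the package's implementation: `splitLongShot` and the `start_s`/`end_s` field names are illustrative, and setting `continues_previous` to `true` only on the second and later parts is an assumption about how that flag is used.

```javascript
// Sketch: split one shot [startS, endS] into equal continuation sub-scenes,
// none longer than maxClipS (15 s is the ceiling quoted in the README).
function splitLongShot(startS, endS, maxClipS = 15) {
  const duration = endS - startS;
  if (!(duration > 0) || !(maxClipS > 0)) throw new Error("invalid shot or ceiling");
  const parts = Math.max(1, Math.ceil(duration / maxClipS));
  const scenes = [];
  for (let i = 0; i < parts; i++) {
    scenes.push({
      // Both boundaries come from the same expression, so the end of part i
      // is bit-for-bit the start of part i + 1.
      start_s: startS + (duration * i) / parts,
      end_s: startS + (duration * (i + 1)) / parts,
      continues_previous: i > 0, // assumed semantics
    });
  }
  return scenes;
}

// A 40 s shot needs three parts of about 13.33 s each.
console.log(splitLongShot(10, 50));
```

Equal parts avoid a short leftover clip at the end, which a fixed 15 s stride would produce for most durations.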
 
-It then scaffolds the full pipeline like an **editing timeline**: each clip gets a **static-ad-grade start AND end keyframe** (`image_generate`, each with its **own self-contained `params.prompt`** — edit a frame node to change only that frame; `prompt.json` wired as the **authoritative shared `target_blueprint`**, plus a per-element reference legend). Each keyframe is **fully recast** to the dropped `el_*` reference images
+It then scaffolds the full pipeline like an **editing timeline**: each clip gets a **static-ad-grade start AND end keyframe** (`image_generate`, each with its **own self-contained `params.prompt`** — edit a frame node to change only that frame; `prompt.json` wired as the **authoritative shared `target_blueprint`**, plus a per-element reference legend). Each keyframe is **fully recast** to the dropped `el_*` reference images. For a frame with **no person/animal** the original extracted frame is kept LAST as a pure composition anchor; for any frame **with a face it is dropped entirely** (it leaked the source person's identity — the hook used to render the reference woman, not our actor), so the recast `el_*`/actor-sheet is the sole identity reference. Both keyframes feed `video_generate` (`first_frame`+`last_frame`, so Seedance interpolates real in-shot motion; ultra-detailed motion brief; duration snapped to the nearest allowed clip length). Every keyframe grounds **only on its own extracted frame + `el_*` slots** — no reference to any other generated frame — so all images render **in parallel** (no cascade). Source-frame URLs are **deduped** (each ingested once). `--frames reuse` wires the real source frame straight in.
 
 **Composited scenes (split-screen / picture-in-picture / keyed presenter).** Real ads aren't always one full-frame shot — a frame can be **persistently divided** (b-roll on top, a presenter talking on the bottom) or **layer a presenter** over background footage (boxed in a corner, or green-screen keyed). The deconstruct now reports this per scene as `scene.composition` (`layout: split_screen | pip | keyed_overlay`, with one `region` per stream — each its own clean-plate frame + motion brief, the talking-head region flagged `is_presenter`). The scaffold reproduces a composited scene by building **one clip per region** (`s<i>_r0_*`, `s<i>_r1_*`, …) and compositing them with ffmpeg: a split-screen `vstack`/`hstack` (stack direction read from the region **panels**, so a top/bottom split always stacks vertically), or a picture-in-picture `overlay` of the presenter inset at its corner. A **keyed** presenter is first cut to transparency by `video_background_remove` (`s<i>_key`), then overlaid. The presenter region carries the native lip-synced voice; b-roll/render panels stay silent. To change a layout, edit `composition` in `prompt.json` and re-scaffold, or hand-edit the `s<i>_composite` ffmpeg args. Plain full-frame scenes (the default) are unaffected.
 
@@ -3664,7 +3696,6 @@ baker canvas run ./reference-ad.video.canvas.json
 | `--out <path>` | `<video-dir>/<name>.video.canvas.json` | Where to write the canvas (composition is copied alongside). |
 | `--frames <mode>` | `generate` | `generate` emits ONE recast keyframe per scene (the original frame is dropped so the dropped `el_*` assets drive identity); `reuse` wires the real extracted first+last frames straight into the clips (faithful, cheaper, no recast). |
 | `--ambient` | off | Give silent **b-roll** scenes native diegetic ambient (Seedance `generate_audio`), mixed deep under the music bed. Talking scenes already carry voice; check levels don't muddy the mix before keeping it. |
-| `--actor-sheets` | off | Lock a recast **person/animal that recurs across ≥2 scenes** to ONE multi-view turnaround (`image_reference_sheet`) that every frame grounds on — the strongest cross-scene identity lock. Costs extra credits per sheet; a fused sheet can over-polish, so eyeball it. |
 | `--max-scenes <n>` | all source scenes | **Cost lever that reduces fidelity** — caps the deconstruct, MERGING away every scene beyond the cap (fewer cuts, lost beats). Prints a warning when set; omit it to reproduce every scene. |
 | `--language <code>` | auto | Transcript/dialogue language hint (e.g. `fr`, `en`). |
 | `--focus <text>` | — | Known provenance/emphasis to ground the deconstruct. |
@@ -3675,6 +3706,8 @@ baker canvas run ./reference-ad.video.canvas.json
 
 Each scene is captured in a **shoot mode** — `ugc_selfie` (talking heads, the default look), `ugc_broll`, `studio_product` (pack shot), `lifestyle_cinematic`, or `screen_ui`. The scaffold derives one per scene (UGC by default; the cinematic and screen lanes are opt-in) and bakes its capture block into the frame and a camera default into the clip; override per scene with a `shoot_mode` field in `prompt.json`. Capture aesthetic + depth-of-field follow the mode (UGC stays flat; studio/lifestyle allow shallow DoF). Clips also carry **diegetic native audio** — the scene's own ambience described in the Seedance prompt, never music (the music bed is a separate, ducked track); set a scene's `ambient` field to steer it.
 
+**Automatic by default (no flags).** A recast **person/pet recurring across ≥2 scenes** is always locked to ONE multi-view turnaround (`image_reference_sheet`) every frame grounds on. An **app/website/chat screen** is never sent to the video model — the scaffold drops the scene to a clean talking-head and seeds a phone-mockup PIP stub to fill with a real `baker images screenshot` or brand HTML block (Seedance garbles UI and a split leaves a seam). The **music bed is instrumental** (the script is never fed to the music model — it would sing over the voice), enters only after the hook, and is **sidechain-ducked** under the voice. **Word-synced TikTok captions** are wired off the deconstruct transcript whenever the ad has speech. Seeded overlays are pushed **off the subject's face** (dead-center → bottom band).
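For hand-placed overlays, a quick check against the face band might look like the sketch below. The pixel limits are taken from the SAFE ZONES comment added to `video-overlay-composition/index.html` in this release (top band up to 360px, bottom band from 1400px, on the default 1080×1920 frame). The function name `safeBand` and the rule that the whole box must sit inside one band are my reading of that comment, not behavior confirmed by the package.

```javascript
// Band limits from the SAFE ZONES comment, for a 1080x1920 vertical frame.
const TOP_BAND_END = 360;
const BOTTOM_BAND_START = 1400;
const FRAME_HEIGHT = 1920;

// Returns "top", "bottom", or null when the box intrudes on the center (face) band.
function safeBand(topPx, heightPx) {
  const bottomPx = topPx + heightPx;
  if (topPx >= 0 && bottomPx <= TOP_BAND_END) return "top";
  if (topPx >= BOTTOM_BAND_START && bottomPx <= FRAME_HEIGHT) return "bottom";
  return null;
}

console.log(safeBand(70, 120));   // "top"
console.log(safeBand(900, 120));  // null, this sits over the face
console.log(safeBand(1500, 200)); // "bottom"
```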
+
 The two scaffold passes are billed (the full `video_deconstruct` is the heavy one); **running** the result then generates many image/video/audio assets and is not free. Defaults to vertical 1080×1920 overlays — copy + edit the composition for other aspect ratios. For on-brand overlay type, drop `brand-bold.otf`/`brand-regular.otf` into the copied `video-overlay-composition/` dir (wired via `@font-face`, with a system fallback). Richer transcription (punctuated words + paragraphs) is available via the deconstruct's `transcriber: "deepgram"` param when `DEEPGRAM_API_KEY` is set.
 
 #### `baker canvas scaffold-static-ad <image> [flags]`
package/canvas/video-overlay-composition/index.html
CHANGED
@@ -64,12 +64,33 @@
 }
 .ov.fe { font-size: 30px; font-weight: 600; opacity: 0.9; }
 
+/* SAFE ZONES — a 9:16 talking-head's FACE fills the vertical CENTER band. Keep
+   graphics in the TOP band (≤360px) or BOTTOM band (≥1400px); never park a caption
+   dead-center over the face. The scaffold already pushes center placements to the
+   bottom, but if you hand-place, respect the bands. */
+
+/* COLLISION-SAFE TOP BAR — wrap co-timed top items (logo + trust badge + …) in ONE
+   .top-bar so they pack side-by-side with a gap and never overlap. Example:
+     <div class="top-bar" data-start="0" data-dur="30">
+       <img class="brandmark" src="logo.svg"><span class="trust">★ 4,5/5</span>
+     </div> */
+.top-bar {
+  position: absolute; top: 70px; left: 56px; right: 56px;
+  display: flex; align-items: center; gap: 20px; flex-wrap: wrap;
+}
+
+/* STAGGER — give a group `data-stagger="0.12"` and its direct children reveal one
+   after another (e.g. coverage chips appearing as each is spoken), not as a clump. */
+.chips { display: flex; flex-wrap: wrap; gap: 14px; justify-content: center; max-width: 92%; }
+
 /* 9-grid position helpers (absolute). Tweak the insets or add your own. */
 .pos-top-left { top: 90px; left: 56px; text-align: left; }
 .pos-top-center { top: 90px; left: 50%; transform: translateX(-50%); }
 .pos-top-right { top: 90px; right: 56px; text-align: right; }
 .pos-mid-left,
 .pos-center-left { top: 50%; left: 56px; transform: translateY(-50%); text-align: left; }
+/* Center = the face. Kept only for non-talking-head plates; the scaffold remaps
+   seeded center overlays to the bottom band so they never cover the subject. */
 .pos-center,
 .pos-mid-center { top: 50%; left: 50%; transform: translate(-50%,-50%); }
 .pos-mid-right,
@@ -115,26 +136,31 @@
 const els = Array.from(document.querySelectorAll('#overlay-root [data-start]'));
 
 // Generic timeline: show each element at data-start, hide at start+data-dur,
-// with an optional canned entrance from data-anim.
+// with an optional canned entrance from data-anim. With data-stagger="<sec>" the
+// element's DIRECT CHILDREN reveal one-by-one (e.g. coverage chips landing on the
+// word each is spoken) instead of as a single clump. No styling decisions here —
 // the look lives entirely in the CSS/markup above.
 for (const el of els) {
   const at = parseFloat(el.getAttribute('data-start') || '0') || 0;
   const dur = parseFloat(el.getAttribute('data-dur') || '2.5') || 2.5;
   const anim = el.getAttribute('data-anim') || '';
+  const stagger = parseFloat(el.getAttribute('data-stagger') || '0') || 0;
   // Preserve any positioning transform the CSS set (translate(...)).
   const baseTransform = getComputedStyle(el).transform;
   const tx = baseTransform && baseTransform !== 'none' ? baseTransform : '';
 
+  // Stagger animates the children; otherwise the element itself enters.
+  const targets = stagger > 0 && el.children.length ? Array.from(el.children) : el;
   tl.set(el, { visibility: 'visible' }, at);
   if (anim === 'pop') {
-    tl.fromTo(
+    tl.fromTo(targets, { opacity: 0, scale: 0.7 }, { opacity: 1, scale: 1, duration: 0.3, ease: 'back.out(1.7)', stagger }, at);
   } else if (anim === 'slide_up') {
-    tl.fromTo(
+    tl.fromTo(targets, { opacity: 0, yPercent: 30 }, { opacity: 1, yPercent: 0, duration: 0.3, ease: 'power2.out', stagger }, at);
   } else if (anim === 'slide_down') {
-    tl.fromTo(
+    tl.fromTo(targets, { opacity: 0, yPercent: -30 }, { opacity: 1, yPercent: 0, duration: 0.3, ease: 'power2.out', stagger }, at);
   } else {
     // Default / any unrecognized data-anim value: a plain fade.
-    tl.fromTo(
+    tl.fromTo(targets, { opacity: 0 }, { opacity: 1, duration: 0.25, ease: 'power1.out', stagger }, at);
   }
   tl.to(el, { opacity: 0, duration: 0.2 }, Math.max(at + 0.2, at + dur));
   tl.set(el, { visibility: 'hidden' }, at + dur + 0.21);
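The timing effect of `data-stagger` in the overlay script above can be sketched without GSAP. With a numeric `stagger`, each successive target starts that many seconds after the previous one, so the entrance times follow directly from `data-start`. The function name `revealTimes` is hypothetical, and the sketch models entrance start times only, not easing or the shared fade-out.

```javascript
// Sketch of entrance start times. Mirrors the script's rule: children are
// staggered only when stagger > 0 and the element has children; otherwise
// the element itself enters once at startS.
function revealTimes(startS, childCount, staggerS) {
  const useChildren = staggerS > 0 && childCount > 0;
  const count = useChildren ? childCount : 1;
  return Array.from({ length: count }, (_, i) => startS + (useChildren ? i * staggerS : 0));
}

console.log(revealTimes(2, 3, 0.5)); // [ 2, 2.5, 3 ]
console.log(revealTimes(2, 3, 0));   // [ 2 ]
```

Because the script fades out the parent element at `data-start + data-dur`, a long chip list with a large stagger needs a `data-dur` that outlasts the last child's entrance.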
package/dist/{chunk-2E4H2GIJ.js → chunk-LMVDA3EZ.js}
CHANGED
@@ -3853,19 +3853,56 @@ var Track = z6.object({
   /** Optional level adjustment in dB (negative ducks, e.g. a music bed at -12). */
   gain_db: z6.number().optional()
 }).strict();
+var DuckSpec = z6.object({
+  track: z6.string().min(1),
+  against: z6.array(z6.string().min(1)).min(1),
+  threshold: z6.number().min(0).max(1).optional(),
+  ratio: z6.number().min(1).max(20).optional(),
+  attack: z6.number().min(0.01).optional(),
+  release: z6.number().min(0.01).optional()
+}).strict();
 var AudioTimelineParams = z6.object({
   tracks: z6.array(Track).min(1),
   /** Final track length in ms — pads short / trims long. Defaults to the natural mix length. */
   total_ms: z6.number().int().positive().optional(),
-  output_format: z6.enum(["mp3", "wav", "m4a"]).optional()
+  output_format: z6.enum(["mp3", "wav", "m4a"]).optional(),
+  /** Optional sidechain duck of one track under others (e.g. music under voice). */
+  duck: DuckSpec.optional()
 }).strict();
 var AudioTimelineInputs = z6.record(z6.string(), z6.unknown());
 var AudioTimelineOutputs = z6.object({ audio: z6.custom() }).strict();
+function applyDuck(params, labelFor, filterChains) {
+  const duck = params.duck;
+  if (!duck) return;
+  const trackIdx = params.tracks.findIndex((t) => t.slot === duck.track);
+  const keyIdxs = duck.against.map((s) => params.tracks.findIndex((t) => t.slot === s)).filter((i) => i >= 0);
+  if (trackIdx < 0 || keyIdxs.length === 0) return;
+  const keyLabels = [];
+  for (const ki of keyIdxs) {
+    const base = labelFor[ki];
+    filterChains.push(`[${base}]asplit=2[${base}m][${base}k]`);
+    labelFor[ki] = `${base}m`;
+    keyLabels.push(`[${base}k]`);
+  }
+  let keyOut = keyLabels[0];
+  if (keyLabels.length > 1) {
+    filterChains.push(`${keyLabels.join("")}amix=inputs=${keyLabels.length}:normalize=0[duckkey]`);
+    keyOut = "[duckkey]";
+  }
+  const th = duck.threshold ?? 0.03;
+  const ra = duck.ratio ?? 8;
+  const at = duck.attack ?? 5;
+  const re = duck.release ?? 300;
+  filterChains.push(
+    `[${labelFor[trackIdx]}]${keyOut}sidechaincompress=threshold=${th}:ratio=${ra}:attack=${at}:release=${re}[ducked]`
+  );
+  labelFor[trackIdx] = "ducked";
+}
 function buildAudioTimelineArgs(params) {
   const fmt = params.output_format ?? "mp3";
   const inputArgs = [];
   const filterChains = [];
-  const
+  const labelFor = [];
   params.tracks.forEach((track, i) => {
     inputArgs.push("-i", `{{in.${track.slot}}}`);
     const delayMs = Math.round(track.start_s * 1e3);
@@ -3873,8 +3910,10 @@ function buildAudioTimelineArgs(params) {
     if (track.gain_db !== void 0) steps.push(`volume=${track.gain_db}dB`);
     const label = `a${i}`;
     filterChains.push(`[${i}:a]${steps.join(",")}[${label}]`);
-
+    labelFor[i] = label;
   });
+  applyDuck(params, labelFor, filterChains);
+  const mixLabels = labelFor.map((l) => `[${l}]`);
   let graph = `${filterChains.join(";")};${mixLabels.join("")}amix=inputs=${params.tracks.length}:normalize=0`;
   if (params.total_ms !== void 0) {
     const totalS = params.total_ms / 1e3;
@@ -3885,7 +3924,7 @@ function buildAudioTimelineArgs(params) {
 }
 var audioTimelineNode = defineNode({
   id: "audio_timeline",
-  version: "1.
+  version: "1.1.0",
   category: "audio",
   location: "local",
   summary: "Place and mix several audio clips onto one timeline: each track starts at a given second (optionally level-adjusted in dB), then they're combined into a single track. Built for laying a music bed plus timed voiceover lines and sound effects under a video.",
@@ -3910,6 +3949,21 @@ var audioTimelineNode = defineNode({
         });
       }
     });
+    const duck = parsed.data.duck;
+    if (duck) {
+      const slots = new Set(parsed.data.tracks.map((t) => t.slot));
+      if (!slots.has(duck.track)) {
+        issues.push({ path: "params.duck.track", message: `duck.track "${duck.track}" is not one of the tracks` });
+      }
+      duck.against.forEach((s, k) => {
+        if (!slots.has(s)) {
+          issues.push({ path: `params.duck.against[${k}]`, message: `duck.against "${s}" is not one of the tracks` });
+        }
+        if (s === duck.track) {
+          issues.push({ path: `params.duck.against[${k}]`, message: "a track cannot duck against itself" });
+        }
+      });
+    }
     return issues;
   },
   async cacheKeyExtras() {
@@ -6085,4 +6139,4 @@ export {
   defaultRegistry,
   createEngineFromEnv
 };
-//# sourceMappingURL=chunk-
+//# sourceMappingURL=chunk-LMVDA3EZ.js.map