arca-marketing-video 2.5.0 → 2.7.0

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "arca-marketing-video",
3
- "version": "2.5.0",
3
+ "version": "2.7.0",
4
4
  "description": "Brand-driven short-form marketing content kit: four installable Claude Code skills (carousel generator, storyboard prompt, video prompt, shorts editor) plus a shared brand profile and assets. Run `npx arca-marketing-video` to install them into .claude/skills.",
5
5
  "keywords": [
6
6
  "skill",
@@ -700,7 +700,11 @@ The grid:
700
700
  edge-to-edge or with a thin uniform gutter, so every frame crops out cleanly on its own.
701
701
  - Frames in beat order (frame 1 = top-left). Each frame is a different moment (the 9 beats below).
702
702
  - Render the frames in the chosen VIDEO TYPE's look (UGC phone-shot by default — raw, imperfect,
703
- human per Phases 6 and 9; cinematic / animation / product film per the declared type).
703
+ human per Phases 6 and 9; cinematic / animation / product film per the declared type). For UGC, the
704
+ frames must read like stills grabbed from a phone video, NOT stock/AI photos — apply Phases 6 & 9
705
+ (real skin texture, non-model casting, mixed practical light, off-center handheld framing, lived-in
706
+ clutter); avoid even studio light, model-perfect symmetric faces, plastic skin, and staged centering.
707
+ (`video-prompt` reuses these frames as start frames, so the realism has to start here.)
704
708
 
705
709
  The frames must contain NO:
706
710
  - panel numbers, labels, titles, or letters
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: video-prompt
3
- description: Use when turning a storyboard into a finished vertical short-form video of any type (UGC, cinematic / movie-trailer, animation, product film, etc.). Handles both photographic storyboards (clean and use as start frames) and schematic / annotated plans (do NOT upscale 1:1 — rebuild patterned to them using Wyren clips for live footage + HyperFrames for on-screen graphics). Drives the Wyren MCP — picks image/video models and resolutions, optimizes the TikTok first 5 seconds, and keeps character faces consistent across shots. Triggers on "make the video from this storyboard", "generate the short", "render the clips". Part of the arca-marketing-video kit.
3
+ description: Use when turning a storyboard into a finished vertical short-form video of any type (UGC, cinematic / movie-trailer, animation, product film, etc.). Handles both photographic storyboards (clean and use as start frames) and schematic / annotated plans (do NOT upscale 1:1 — rebuild patterned to them with Wyren clips, with any on-screen UI/screens generated diegetically in the footage, never composited as text/UI overlays). Drives the Wyren MCP — picks image/video models and resolutions, forces phone-camera realism on the start frames (so they don't look like stock/AI photos), generates multi-angle coverage so the edit can cut fast like real UGC, optimizes the TikTok first 5 seconds, and keeps character faces consistent across shots. Triggers on "make the video from this storyboard", "generate the short", "render the clips". Part of the arca-marketing-video kit.
4
4
  ---
5
5
 
6
6
  # Video Prompt
@@ -29,9 +29,10 @@ Ask first. Only make smart, briefly stated assumptions for whatever is still mis
29
29
  Use the storyboard as the PLAN, not as footage to copy. First classify it (see STORYBOARD
30
30
  INTERPRETATION below): if the panels are photographic, clean them and use them as start frames; if
31
31
  they are schematic / annotated mockups (panel numbers, notes boxes, drawn phone bezels, mock UI), do
32
- NOT upscale or reproduce them 1:1 — rebuild the video patterned to them, using Wyren clips for the
33
- live footage and HyperFrames for the on-screen graphics. You may generate clips separately and merge
34
- them (Wyren or HyperFrames), or use video-model multishot — decide smartly.
32
+ NOT upscale or reproduce them 1:1 — rebuild the video patterned to them with Wyren clips, realizing any
33
+ on-screen UI / screens DIEGETICALLY inside the footage (never composited text/UI overlays; see DO NOT
34
+ COMPOSITE TEXT/UI GRAPHICS). You may generate clips separately and merge them, or use video-model
35
+ multishot — decide smartly.
35
36
 
36
37
  VIDEO TYPE SCOPE — read before applying the rules below.
37
38
  This skill makes ANY video type. The LOOK follows the chosen VIDEO TYPE; the TikTok retention rules
@@ -69,20 +70,33 @@ B) SCHEMATIC / ANNOTATED storyboard — each panel is a rough PLAN: a drawn phon
69
70
  - Generate the LIVE FOOTAGE with Wyren (people, faces, desk, reactions, real environment, camera
70
71
  move) using the recurring character profile for face consistency and a fresh start frame
71
72
  DESIGNED for that shot via image AI — not the schematic panel.
72
- - Build the ON-SCREEN GRAPHICS with HyperFrames and composite them over the Wyren footage: UI
73
- cards, data / deck mockups, split-desk labels, chips, checklists, route-map lines, captions,
74
- progress arcs, a REC indicator, the logo (anything that is a graphic, not live action).
75
- - Match each panel's layout and intent (what graphic sits where, what the person is doing), but
76
- realize it as real footage + clean motion graphics in the chosen VIDEO TYPE's look.
73
+ - Realize any on-screen UI / data / screens the panel shows (a deck, a dashboard, sticky notes,
74
+ a checklist, a route map, a "REC" indicator) as DIEGETIC parts of the scene — i.e. generate
75
+ them INSIDE the Wyren clip as the real laptop/phone screen, real paper, or real props, framed
76
+ so the text is small/partial/blurred and need not render perfectly. Do NOT composite them as
77
+ floating graphics on top.
78
+ - Match each panel's layout and intent (what's on screen, what the person is doing), realized as
79
+ real in-scene footage in the chosen VIDEO TYPE's look.
77
80
 
78
81
  WHO MAKES WHAT (schematic path):
79
- - Wyren videoAI → live-action clips: people, faces, reactions, hands, environment, props, camera move.
82
+ - Wyren videoAI → everything that appears in-frame: people, faces, reactions, hands, environment,
83
+ props, camera move, AND any on-screen UI / data / screens (generated diegetically in the clip).
80
84
  - Wyren imageAI → designed start frames and any photographic plates feeding the clips.
81
- - HyperFrames → all overlay graphics, captions, chips, UI / data mockups, transitions, logo, splash,
82
- composited on top of the clips (the same graphics layer the `shorts-editor` skill uses).
83
- Keep diegetic vs overlay clear: a screen the actor really looks at can be a graphic comped onto the
84
- device; floating captions / chips are HyperFrames overlays. Never bake storyboard annotations into the
85
- video. If you are unsure which form the storyboard is, ask the user before generating.
85
+ - HyperFrames → reserved for the EDIT stage (`shorts-editor`): spoken-word captions, brand splash /
86
+ end card, and timing of zooms / SFX. Nothing else.
87
+
88
+ DO NOT COMPOSITE TEXT/UI GRAPHICS (hard rule):
89
+ This skill (the video-generation stage) does NOT composite ANY overlay onto the footage. In particular,
90
+ never recreate the storyboard's mock UI / data as floating overlays — no data/deck cards, dashboards,
91
+ labels, callouts, checklists, route-map text, fake screens, or "REC"-style HUD text laid over the
92
+ video. It looks fake and instantly kills the UGC feel. Any text or screen the viewer sees here must be
93
+ DIEGETIC (filmed/generated as a real screen or prop inside the Wyren clip) or simply dropped — if a
94
+ panel's mock UI can't be made diegetic, simplify or omit it.
95
+ All overlays are decided later by the EDIT stage (`shorts-editor`): spoken-word captions, the brand
96
+ splash, and its own engagement chips/graphics live there, on the editor's terms — not generated or
97
+ composited in this stage.
98
+
99
+ If you are unsure which form the storyboard is, ask the user before generating.
86
100
 
87
101
  CORE OUTPUT
88
102
 
@@ -113,8 +127,31 @@ Step A — produce a start frame per shot (image model, `imageAI` node):
113
127
  first frame.
114
128
  - SCHEMATIC storyboard (path B): do NOT upscale the panel. Use image AI to DESIGN a new start frame
115
129
  for that shot from the character profile + the panel's brief (scene, action, framing), so the frame
116
- is real-looking footage, not a redraw of the mockup. The panel's mock UI becomes a HyperFrames
117
- overlay later, not part of this start frame.
130
+ is real-looking footage, not a redraw of the mockup. Any on-screen UI the panel shows is realized
131
+ diegetically inside the clip (a real screen/prop), never composited as a text/graphic overlay.
132
+
133
+ PHONE-CAMERA REALISM FOR START FRAMES (UGC types — non-negotiable)
134
+ The #1 failure mode is a start frame that looks like a polished STOCK PHOTO or AI render (even soft
135
+ studio light, model-perfect symmetric faces, plastic skin, centered staged composition, sterile clean
136
+ room). The video model inherits the start frame's look, so a stock-looking frame produces a stock-
137
+ looking clip. Fix it at the frame.
138
+ - Apply the SAME proven realism rules this skill already specifies for the footage — PHONE-SHOT VISUAL
139
+ STYLE, CHARACTER REALISM, FRAMING, LIGHTING, and DEPTH OF FIELD (all below) — to the START FRAME
140
+ too. The frame is a paused still from that same casual phone video, NOT a separate photo shoot. In
141
+ particular reuse: natural skin texture / pores / asymmetry, non-model casting, candid mid-action
142
+ expression (CHARACTER REALISM); mixed practical light, uneven exposure, blown-out window, mild
143
+ noise (LIGHTING); off-center imperfect handheld framing (FRAMING); normal phone depth of field, no
144
+ creamy bokeh (DEPTH OF FIELD); real lived-in clutter (ENVIRONMENT).
145
+ - Add explicit phone-camera tokens to the image prompt: "shot on an iPhone, candid vertical phone
146
+ snapshot, amateur, mild phone HDR, slight grain, imperfect framing."
147
+ - Always include a negative prompt: stock photo, AI render, 3D render, glossy, airbrushed, retouched,
148
+ model, supermodel, perfect skin, symmetrical, studio lighting, softbox, beauty lighting, creamy
149
+ bokeh, shallow depth of field, hyperdetailed, 8k, cinematic, magazine, advertisement, clean, sterile,
150
+ staged.
151
+ - Applies whether you upscale a photo (path A) or design a fresh frame (path B). If the source/look is
152
+ already stock-ish (the common failure), explicitly degrade it toward phone realism — grain, uneven
153
+ light, candid expression, off-center framing — BEFORE generating the clip. (NON-UGC types: follow
154
+ that type's craft instead — this phone-realism bar is for UGC.)
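
A minimal sketch of how the phone-camera tokens and the negative prompt listed above could be attached to a shot brief. The function name `build_start_frame_prompt` and the `prompt` / `negative_prompt` dict keys are hypothetical and are not taken from the Wyren MCP; only the token and negative-term strings come from the text above.

```python
# Hypothetical helper: the function name and dict keys are illustrative assumptions,
# not part of the Wyren MCP. Only the strings below come from the skill text.
PHONE_TOKENS = (
    "shot on an iPhone, candid vertical phone snapshot, amateur, "
    "mild phone HDR, slight grain, imperfect framing"
)

NEGATIVE_TERMS = [
    "stock photo", "AI render", "3D render", "glossy", "airbrushed", "retouched",
    "model", "supermodel", "perfect skin", "symmetrical", "studio lighting",
    "softbox", "beauty lighting", "creamy bokeh", "shallow depth of field",
    "hyperdetailed", "8k", "cinematic", "magazine", "advertisement", "clean",
    "sterile", "staged",
]


def build_start_frame_prompt(shot_brief: str, video_type: str = "ugc") -> dict:
    """Return an image prompt plus negative prompt for one start frame."""
    brief = shot_brief.strip().rstrip(".")
    if video_type.lower() != "ugc":
        # Non-UGC types follow their own craft, so the phone-realism bar is skipped.
        return {"prompt": brief, "negative_prompt": ""}
    return {
        "prompt": f"{brief}. {PHONE_TOKENS}",
        "negative_prompt": ", ".join(NEGATIVE_TERMS),
    }


if __name__ == "__main__":
    frame = build_start_frame_prompt("Woman at a cluttered desk glancing at her laptop")
    print(frame["prompt"])
    print(frame["negative_prompt"])
```

Keeping the tokens and negative terms in one place means path A (upscaled photo) and path B (freshly designed frame) would get the same realism bar.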
118
155
  - **Keep the face.** When upscaling/cleaning with the image model, instruct it to PRESERVE the
119
156
  existing person's identity — same face, hair, age, build, and wardrobe — and only improve quality
120
157
  (sharpen, denoise, fix artifacts). Do not let it redraw or beautify into a different face. Use an
@@ -153,12 +190,11 @@ Wyren execution flow (per the wyren skill's policy — load it before any `mcp__
153
190
  3. `validate_workflow` — resolve warnings with the user.
154
191
  4. Estimate cost with `get_pricing` (chain mode) / `estimate_product_cost`; get the user's OK to spend.
155
192
  5. `run_workflow` (`userConfirmed: true`), then poll `get_workflow_run_status` every 5s until terminal.
156
- 6. Pull the clips with `get_node_outputs`.
157
- 7. GRAPHICS PASS (HyperFrames): build the on-screen overlay graphics the storyboard calls for (UI/data
158
- cards, chips, captions, checklists, route maps, logo, transitions) and composite them over the Wyren
159
- clips. For a SCHEMATIC storyboard this pass is required (the mock UI in the panels lives here, not in
160
- the footage). Hand off to the `shorts-editor` skill, which owns this HyperFrames graphics + master step.
161
- 8. Merge the clips + graphics into the final cut.
193
+ 6. Pull the clips with `get_node_outputs`. Any on-screen UI / data / screens were generated DIEGETICALLY
194
+ inside the clips (see DO NOT COMPOSITE TEXT/UI GRAPHICS); there is no text-overlay pass here.
195
+ 7. EDIT & FINISH: hand the clips to the `shorts-editor` skill: fast-cut assembly, spoken-word CAPTIONS,
196
+ brand splash / end card, zoom/SFX timing, and master. That is the ONLY place HyperFrames is used, and
197
+ only for captions + splash + timing, never to composite text/UI graphics onto the footage.
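
The "poll every 5s until terminal" step in the flow above can be sketched as a generic loop. This is an illustration under assumptions: `get_status` stands in for whatever wraps `get_workflow_run_status`, and the terminal state names are guesses, not documented Wyren values.

```python
import time
from typing import Callable, FrozenSet

# Assumed terminal state names; check the real Wyren status values before relying on them.
ASSUMED_TERMINAL: FrozenSet[str] = frozenset({"succeeded", "failed", "cancelled"})


def poll_until_terminal(
    get_status: Callable[[], str],
    interval_s: float = 5.0,
    max_polls: int = 120,
    terminal: FrozenSet[str] = ASSUMED_TERMINAL,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status every interval_s seconds until it returns a terminal state."""
    for attempt in range(max_polls):
        status = get_status()
        if status in terminal:
            return status
        if attempt < max_polls - 1:
            # Wait only when another poll will follow.
            sleep(interval_s)
    raise TimeoutError(f"run not terminal after {max_polls} polls")


if __name__ == "__main__":
    # Stand-in status source so the sketch runs without any MCP connection.
    fake_statuses = iter(["running", "running", "succeeded"])
    print(poll_until_terminal(lambda: next(fake_statuses), interval_s=0.01))
```

The `max_polls` cap is an addition of this sketch so that a stuck run surfaces as an error instead of looping forever.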
162
198
 
163
199
  RECURRING CHARACTER CONSISTENCY (multishot / multi-clip)
164
200
 
@@ -343,6 +379,33 @@ Avoid:
343
379
  Every shot should answer:
344
380
  “Why would someone keep watching right now?”
345
381
 
382
+ FAST CUTS & MULTI-ANGLE COVERAGE (the UGC editing engine)
383
+
384
+ A single continuous clip per beat reads as a slow AI ad. Real engaging TikTok/Reels UGC is CUT FAST
385
+ and constantly SWITCHES ANGLE and shot size. This operationalizes the "new beat every 2–4s" rule and
386
+ the CAMERA RULES + TRANSITIONS sections below for GENERATION — and it overrides the one-shot-per-panel
387
+ default in SHOT-BY-SHOT EXECUTION: a storyboard panel is a BEAT, usually several shots, not one clip.
388
+ You cannot cut between angles you never generated.
389
+
390
+ Rules:
391
+ - Cut roughly every 1–2.5s. No talking shot holds longer than ~3s without a cut, angle change, or
392
+ reframe. The whole video should feel like many short shots, not a few long ones.
393
+ - Cover each beat from MULTIPLE angles / shot sizes, then cut between them. Generate 2–3 variations per
394
+ beat drawn from the FRAMING + CAMERA move set: wide/establishing → medium close-up → tight close-up →
395
+ over-the-shoulder → desk-level POV → reaction insert. Vary framing on EVERY cut (never two same-size
396
+ shots back to back).
397
+ - Use the video model's multishot where available (Kling V3 multiShot up to 6 shots) to get angle
398
+ changes inside one generation; otherwise generate several short clips per beat with different framings
399
+ and assemble them.
400
+ - Intercut B-roll / insert shots between talking shots — hands on the keyboard, the screen, the paper
401
+ packet, a face reaction, an object. These inserts let you cut on every sentence and hide jumps.
402
+ - Punctuate with the proven TRANSITIONS vocabulary (jump cuts, reaction cuts, quick whip-pans, match
403
+ cuts, camera repositioning) and snap zooms / handheld punch-ins — never slow dissolves or cinematic moves.
404
+ - Keep continuity locked across the angle changes (same person/face, wardrobe, set, lighting) via the
405
+ character profile — fast cuts must not become a different person.
406
+ - The actual cutting/assembly is finished in the editor (`shorts-editor` skill): generate enough angle
407
+ coverage here so that stage can cut fast. Hand it one long clip per beat and it cannot.
408
+
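
One way to check a planned shot list against the rules above before generating anything is sketched below. The `Shot` structure and `coverage_problems` function are hypothetical; the thresholds mirror the text (no hold over about 3s, at least 2 framings per beat, no two same-size shots back to back), and for simplicity the hold limit is applied to every shot rather than only to talking shots.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Shot:
    beat: int          # storyboard panel / beat this shot belongs to
    framing: str       # e.g. "wide", "medium close-up", "over-the-shoulder"
    duration_s: float  # planned on-screen time before the next cut


def coverage_problems(
    shots: list[Shot],
    max_hold_s: float = 3.0,
    min_angles_per_beat: int = 2,
) -> list[str]:
    """Return human-readable rule violations; an empty list means the plan passes."""
    problems: list[str] = []
    for i, shot in enumerate(shots):
        if shot.duration_s > max_hold_s:
            problems.append(f"shot {i}: held {shot.duration_s}s, over {max_hold_s}s")
        if i > 0 and shots[i - 1].framing == shot.framing:
            problems.append(f"shot {i}: same framing as previous shot ({shot.framing})")
    # Collect the distinct framings generated for each beat.
    framings_by_beat: dict[int, set[str]] = {}
    for shot in shots:
        framings_by_beat.setdefault(shot.beat, set()).add(shot.framing)
    for beat, framings in sorted(framings_by_beat.items()):
        if len(framings) < min_angles_per_beat:
            problems.append(
                f"beat {beat}: only {len(framings)} framing(s), "
                f"want at least {min_angles_per_beat}"
            )
    return problems


if __name__ == "__main__":
    plan = [Shot(1, "wide", 4.0), Shot(2, "wide", 1.0)]
    for problem in coverage_problems(plan):
        print(problem)
```

Running a check like this on the plan, rather than on rendered clips, would catch a one-clip-per-beat plan before any generation cost is incurred.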
346
409
  PHONE-SHOT VISUAL STYLE
347
410
 
348
411
  SCOPE: this section (and the CAMERA / FRAMING / LIGHTING / DEPTH OF FIELD / CHARACTER REALISM /
@@ -811,7 +874,8 @@ Avoid:
811
874
 
812
875
  SHOT-BY-SHOT EXECUTION
813
876
 
814
- For each storyboard panel:
877
+ Each storyboard panel is a BEAT — realize it as several short shots from different angles, not one
878
+ held clip (see FAST CUTS & MULTI-ANGLE COVERAGE). For each panel:
815
879
 
816
880
  - preserve the main action
817
881
  - preserve the emotional beat