npm - hyperframes - Versions diffs - 0.6.0-alpha.2 → 0.6.0-alpha.5 - Mend

hyperframes 0.6.0-alpha.2 → 0.6.0-alpha.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

package/dist/cli.js +1593 -556
package/dist/docs/rendering.md +2 -2
package/dist/hyperframe-runtime.js +53 -36
package/dist/hyperframe.manifest.json +1 -1
package/dist/hyperframe.runtime.iife.js +53 -36
package/dist/skills/hyperframes/SKILL.md +2 -3
package/dist/skills/hyperframes/patterns.md +73 -0
package/dist/skills/hyperframes/references/transcript-guide.md +1 -45
package/dist/skills/hyperframes-cli/SKILL.md +3 -31
package/dist/studio/assets/hyperframes-player-CEnWY28J.js +417 -0
package/dist/studio/assets/index-BTa7zV4Z.js +106 -0
package/dist/studio/assets/index-pZvEUcY0.css +1 -0
package/dist/studio/index.html +2 -2
package/dist/templates/_shared/CLAUDE.md +2 -1
package/package.json +1 -1
package/dist/skills/hyperframes/references/tts.md +0 -75
package/dist/studio/assets/hyperframes-player-Cd8vYWxP.js +0 -198
package/dist/studio/assets/index-UWFaHilT.css +0 -1
package/dist/studio/assets/index-cPJbxeAk.js +0 -107

package/dist/skills/hyperframes/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: hyperframes
-description: Create video compositions, animations, title cards, overlays, captions, voiceovers, audio-reactive visuals, and scene transitions in HyperFrames HTML. Use when asked to build any HTML-based video content, add captions or subtitles synced to audio, generate text-to-speech narration, create audio-reactive animation (beat sync, glow, pulse driven by music), add animated text highlighting (marker sweeps, hand-drawn circles, burst lines, scribble, sketchout), or add transitions between scenes (crossfades, wipes, reveals, shader transitions). Covers composition authoring, timing, media, and the full video production workflow. For CLI commands (init, lint, preview, render, transcribe, tts) see the hyperframes-cli skill.
+description: Create video compositions, animations, title cards, overlays, captions, voiceovers, audio-reactive visuals, and scene transitions in HyperFrames HTML. Use when asked to build any HTML-based video content, add captions or subtitles synced to audio, generate text-to-speech narration, create audio-reactive animation (beat sync, glow, pulse driven by music), add animated text highlighting (marker sweeps, hand-drawn circles, burst lines, scribble, sketchout), or add transitions between scenes (crossfades, wipes, reveals, shader transitions). Covers composition authoring, timing, media, and the full video production workflow. For dev-loop CLI commands (init, lint, inspect, preview, render) see the hyperframes-cli skill; for asset preprocessing commands (tts, transcribe, remove-background) see the hyperframes-media skill.
 ---
 # HyperFrames
@@ -467,7 +467,6 @@ Skip on small edits (fixing a color, adjusting one duration). Run on new composi
 ## References (loaded on demand)
 - **[references/captions.md](references/captions.md)** — Captions, subtitles, lyrics, karaoke synced to audio. Tone-adaptive style detection, per-word styling, text overflow prevention, caption exit guarantees, word grouping. Read when adding any text synced to audio timing.
-- **[references/tts.md](references/tts.md)** — Text-to-speech with Kokoro-82M. Voice selection, speed tuning, TTS+captions workflow. Read when generating narration or voiceover.
 - **[references/audio-reactive.md](references/audio-reactive.md)** — Audio-reactive animation: map frequency bands and amplitude to GSAP properties. Read when visuals should respond to music, voice, or sound.
 - **[references/css-patterns.md](references/css-patterns.md)** — CSS+GSAP marker highlighting: highlight, circle, burst, scribble, sketchout. Deterministic, fully seekable. Read when adding visual emphasis to text.
 - **[references/video-composition.md](references/video-composition.md)** — Video-medium rules: density, color presence, scale, frame composition, design.md as brand not layout. **Always read** — these override web instincts.
@@ -481,7 +480,7 @@ Skip on small edits (fixing a color, adjusting one duration). Run on new composi
 - **[house-style.md](house-style.md)** — Default motion, sizing, and color palettes when no design.md is specified.
 - **[patterns.md](patterns.md)** — PiP, title cards, slide show patterns.
 - **[data-in-motion.md](data-in-motion.md)** — Data, stats, and infographic patterns.
-- **[references/transcript-guide.md](references/transcript-guide.md)** — Transcription commands, whisper models, external APIs, troubleshooting.
+- **[references/transcript-guide.md](references/transcript-guide.md)** — Caption-side transcript handling: input formats, mandatory quality check, cleaning JS, OpenAI/Groq API fallback, "if no transcript exists" flow. (For the `transcribe` CLI invocation, model selection rules, and the `.en` gotcha, see the `hyperframes-media` skill.)
 - **[references/dynamic-techniques.md](references/dynamic-techniques.md)** — Dynamic caption animation techniques (karaoke, clip-path, slam, scatter, elastic, 3D).
 - **[references/transitions.md](references/transitions.md)** — Scene transitions: crossfades, wipes, reveals, shader transitions. Energy/mood selection, CSS vs WebGL guidance. **Always read for multi-scene compositions** — scenes without transitions feel like jump cuts.

package/dist/skills/hyperframes/patterns.md CHANGED

@@ -30,6 +30,79 @@ tl.to(
 tl.to("#pip-frame", { left: 40, duration: 0.6 }, 30);
 ```
+## Text Behind Subject (transparent webm overlay)
+Put a headline _behind_ a presenter so their silhouette occludes the text. Requires a transparent cutout produced by `npx hyperframes remove-background presenter.mp4 -o presenter.webm`.
+Three layers, plus one critical rule:
+```html
+<!-- z=1 base — full opaque mp4 (lobby + presenter), always visible -->
+<video
+  id="cf-base"
+  data-start="0"
+  data-duration="6"
+  data-media-start="0"
+  data-track-index="0"
+  src="presenter.mp4"
+  muted
+  playsinline
+></video>
+<!-- z=2 headline — visible the whole time -->
+<h1
+  id="cf-headline"
+  style="position:absolute;top:50%;left:50%;
+     transform:translate(-50%,-50%); z-index:2; font-size:220px; font-weight:900;
+     color:#fff; text-shadow:0 6px 32px rgba(0,0,0,.55); clip-path:inset(0 0 100% 0);"
+>
+  MAKE IT IN HYPERFRAMES
+</h1>
+<!-- z=3 cutout — same source, alpha around presenter, hidden until the cut -->
+<!-- WRAPPER has the opacity, NOT the video itself (see rule below). -->
+<div class="cutout-wrap" style="position:absolute;inset:0;z-index:3;opacity:0">
+  <video
+    id="cf-cutout"
+    data-start="0"
+    data-duration="6"
+    data-media-start="0"
+    data-track-index="1"
+    src="presenter.webm"
+    muted
+    playsinline
+  ></video>
+</div>
+```
+```js
+const tl = gsap.timeline({ paused: true });
+const CUT = 3.3;
+// Reveal headline early
+tl.to("#cf-headline", { clipPath: "inset(0 0 0% 0)", duration: 0.6, ease: "expo.out" }, 0.25);
+// At the cut, flip the cutout wrapper visible — the presenter's silhouette
+// punches through the headline.
+tl.set(".cutout-wrap", { opacity: 1 }, CUT);
+// Sentinel: extend timeline to the composition's full duration so the
+// renderer doesn't bail past the last meaningful tween.
+tl.set({}, {}, 6);
+window.__timelines["cover-flip"] = tl;
+```
+**Why a wrapper div, not opacity on the video itself?**
+The framework forces `opacity: 1` on any element with `data-start`/`data-duration` while it's "active" — that's how it manages clip lifecycles. A CSS `opacity: 0` on the video element is silently overwritten. Wrap the video in a div with no `data-*` attributes; the wrapper is owned by your CSS/GSAP.
+**Why both videos at `data-start="0"`?**
+So both decode in sync from t=0. Late-mounting the cutout (`data-start=3.3`) makes Chrome do a seek + decoder warm-up at mount, which can land a frame off the base mp4 — visible as a one-frame jitter at the cut.
+**Color match:** `remove-background` defaults to `--quality balanced` (crf 18) which keeps the cutout's RGB nearly identical to the source mp4 — minimal edge halo or color shift when overlaid. Use `--quality best` (crf 12) for hero shots; only drop to `--quality fast` (crf 30) when the cutout sits over a _different_ background and the size matters.
 ## Title Card with Fade
 ```html
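
Note on the hunk above: the added pattern says opacity set directly on an element carrying `data-start`/`data-duration` is silently overwritten while the clip is active, and that the opacity belongs on a wrapper without `data-*` attributes. A minimal sketch of a check for that mistake follows. The helper `findOverwrittenOpacity` and the `{ id, attrs }` record shape are hypothetical, invented here for illustration; nothing in the diff indicates the package ships such a check.

```javascript
// Hypothetical lint helper for illustration; not part of the hyperframes CLI.
// Each element is a plain record { id, attrs }, where attrs maps attribute
// names to string values (a stand-in for parsed composition HTML).
function findOverwrittenOpacity(elements) {
  // Matches an `opacity:` declaration, but not `fill-opacity:` and similar.
  const setsOpacity = /(^|;)\s*opacity\s*:/i;
  return elements
    .filter(({ attrs }) => {
      const isTimedClip = "data-start" in attrs || "data-duration" in attrs;
      return isTimedClip && setsOpacity.test(attrs.style ?? "");
    })
    .map(({ id }) => id);
}

const flagged = findOverwrittenOpacity([
  // Opacity directly on a timed clip: the pattern says this gets overwritten.
  { id: "cf-cutout-bad", attrs: { "data-start": "0", "data-duration": "6", style: "opacity:0" } },
  // Opacity on a wrapper with no data-* attributes: owned by your CSS/GSAP.
  { id: "cutout-wrap", attrs: { style: "position:absolute;inset:0;z-index:3;opacity:0" } },
  // Timed clip with no inline opacity: nothing to report.
  { id: "cf-base", attrs: { "data-start": "0", "data-duration": "6" } },
]);
console.log(flagged);
```

Only the first record is reported, because it is the only one that combines a timing attribute with an inline `opacity` declaration.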

package/dist/skills/hyperframes/references/transcript-guide.md CHANGED

@@ -1,24 +1,6 @@
 # Transcript Guide
-## How Transcripts Are Generated
-`hyperframes transcribe` handles both transcription and format conversion:
-```bash
-# Transcribe audio/video (uses whisper.cpp locally, no API key needed)
-npx hyperframes transcribe audio.mp3
-# Use a larger model for better accuracy
-npx hyperframes transcribe audio.mp3 --model medium.en
-# Filter to English only (skips non-English speech)
-npx hyperframes transcribe audio.mp3 --language en
-# Import an existing transcript from another tool
-npx hyperframes transcribe captions.srt
-npx hyperframes transcribe captions.vtt
-npx hyperframes transcribe openai-response.json
-```
+For the `transcribe` CLI invocation, the `.en`-translates-non-English rule, and whisper model selection, see the `hyperframes-media` skill. This file covers what to do with the resulting transcript when authoring captions: input formats, mandatory quality checks, cleaning code, external-API fallbacks.
 ## Supported Input Formats
@@ -34,32 +16,6 @@ The CLI auto-detects and normalizes these formats:
 **Word-level timestamps produce better captions.** SRT/VTT give phrase-level timing, which works but can't do per-word animation effects.
-## Whisper Model Guide
-The default model (`small.en`) balances accuracy and speed. For better results, use a larger model:
-| Model      | Size   | Speed    | Accuracy  | When to use                           |
-| ---------- | ------ | -------- | --------- | ------------------------------------- |
-| `tiny`     | 75 MB  | Fastest  | Low       | Quick previews, testing pipeline      |
-| `base`     | 142 MB | Fast     | Fair      | Short clips, clear audio              |
-| `small`    | 466 MB | Moderate | Good      | **Default** — good for most content   |
-| `medium`   | 1.5 GB | Slow     | Very good | Important content, noisy audio, music |
-| `large-v3` | 3.1 GB | Slowest  | Best      | Production quality                    |
-**Only add `.en` suffix when the user explicitly says the audio is English.** `.en` models are slightly more accurate for English but will TRANSLATE non-English audio instead of transcribing it.
-**Critical: `.en` models translate non-English audio into English** — they don't transcribe it. If the audio might not be English, always use a model without the `.en` suffix and pass `--language` to specify the source language. If you're unsure of the language, use `small` (not `small.en`) without `--language` — whisper will auto-detect.
-```bash
-# Spanish audio
-npx hyperframes transcribe audio.mp3 --model small --language es
-# Unknown language — let whisper auto-detect
-npx hyperframes transcribe audio.mp3 --model small
-```
-**Music and vocals over instrumentation**: `small.en` will misidentify lyrics — use `medium.en` as the minimum, or import lyrics manually. Even `medium.en` struggles with heavily produced tracks; for music videos, providing known lyrics as an SRT/VTT and importing with `hyperframes transcribe lyrics.srt` will always beat automated transcription.
 ## Transcript Quality Check (Mandatory)
 After every transcription, **read the transcript and check for quality issues before proceeding.** Bad transcripts produce nonsensical captions. Never skip this step.
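
Note on the hunk above: the Whisper guidance removed here (and now pointed at the `hyperframes-media` skill) boils down to three cases: use a `.en` model only when the audio is confirmed English, pass `--language` with a non-`.en` model when the language is known, and use plain `small` with no `--language` when it is unknown. A minimal sketch of that decision follows. The helper `transcribeArgs` and its option names are hypothetical; only the `transcribe` subcommand and the `--model` and `--language` flags appear in the diff.

```javascript
// Hypothetical helper for illustration; only `transcribe`, `--model`, and
// `--language` come from the quoted guide.
// Sizes the removed guide shows with a `.en` suffix (small.en, medium.en).
const EN_VARIANTS = new Set(["small", "medium"]);

function transcribeArgs(input, { size = "small", language, confirmedEnglish = false } = {}) {
  // `.en` models translate non-English audio, so require explicit confirmation.
  const useEnglishOnly = confirmedEnglish && EN_VARIANTS.has(size);
  const model = useEnglishOnly ? `${size}.en` : size;
  const args = ["npx", "hyperframes", "transcribe", input, "--model", model];
  // With a multilingual model, pass the source language when it is known;
  // otherwise omit the flag and let whisper auto-detect.
  if (!useEnglishOnly && language) args.push("--language", language);
  return args;
}

console.log(transcribeArgs("audio.mp3", { language: "es" }).join(" "));
console.log(transcribeArgs("audio.mp3").join(" "));
console.log(transcribeArgs("audio.mp3", { size: "medium", confirmedEnglish: true }).join(" "));
```

The first two calls reproduce the Spanish and unknown-language commands from the removed section. The third covers the confirmed-English case with a larger model.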

package/dist/skills/hyperframes-cli/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: hyperframes-cli
-description: HyperFrames CLI tool — hyperframes init, lint, inspect, preview, render, transcribe, tts, remove-background, doctor, browser, info, upgrade, compositions, docs, benchmark. Use when scaffolding a project, linting, validating, inspecting visual layout in compositions, previewing in the studio, rendering to video, transcribing audio, generating TTS, removing the background from an avatar video for transparent overlays, or troubleshooting the HyperFrames environment.
+description: HyperFrames CLI dev loop — `npx hyperframes` for scaffolding (init), validation (lint, inspect), preview, render, and environment troubleshooting (doctor, browser, info, upgrade). Use when running any of these commands or troubleshooting the HyperFrames build/render environment. For asset preprocessing commands (`tts`, `transcribe`, `remove-background`), invoke the `hyperframes-media` skill instead.
 ---
 # HyperFrames CLI
@@ -120,37 +120,9 @@ npx hyperframes render --docker                       # byte-identical
 **Parametrized renders:** the composition declares its variables on the `<html>` root with **`data-composition-variables`** — a JSON **array of declarations** (`{id, type, label, default}` per entry) that defines the schema. Scripts inside read the resolved values via `window.__hyperframes.getVariables()`. The CLI **`--variables '{"title":"Q4 Report"}'`** is a JSON **object keyed by id** that overrides those declared defaults for one render; missing keys fall through, so the same composition runs unchanged in dev preview and in production. (Sub-comp hosts can also override per-instance with **`data-variable-values`** — same object shape, scoped to one mount of the sub-composition. See the `hyperframes` skill for the full pattern.)
-## Transcription
+## Asset Preprocessing
-```bash
-npx hyperframes transcribe audio.mp3
-npx hyperframes transcribe video.mp4 --model medium.en --language en
-npx hyperframes transcribe subtitles.srt   # import existing
-npx hyperframes transcribe subtitles.vtt
-npx hyperframes transcribe openai-response.json
-```
-## Text-to-Speech
-```bash
-npx hyperframes tts "Text here" --voice af_nova --output narration.wav
-npx hyperframes tts script.txt --voice bf_emma
-npx hyperframes tts --list  # show all voices
-```
-## Background Removal (transparent video)
-Remove the background from a video or image so it can be used as a transparent overlay in a composition (e.g. an avatar floating on a background).
-```bash
-npx hyperframes remove-background avatar.mp4 -o transparent.webm  # default: VP9 alpha WebM
-npx hyperframes remove-background avatar.mp4 -o transparent.mov   # ProRes 4444 for editing
-npx hyperframes remove-background portrait.jpg -o cutout.png      # single-image cutout
-npx hyperframes remove-background avatar.mp4 -o transparent.webm --device cpu
-npx hyperframes remove-background --info                          # detected providers
-```
-Uses `u2net_human_seg` (MIT). First run downloads ~168 MB of weights to `~/.cache/hyperframes/background-removal/models/` and reuses them after. Drop the resulting `.webm` into a composition with `<video src="transparent.webm" autoplay muted loop>` — Chrome decodes the alpha natively.
+`npx hyperframes tts`, `transcribe`, and `remove-background` produce assets (narration audio, word-level transcripts, transparent video) that get dropped into a composition. Each downloads its own model on first run. For voice selection, whisper model rules (the `.en`-translates-non-English gotcha), output format choice (VP9 alpha WebM vs ProRes), and the TTS → transcribe → captions chain, invoke the `hyperframes-media` skill.
 ## Troubleshooting
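
Note on the hunk above: the unchanged "Parametrized renders" paragraph describes two shapes, an array of `{id, type, label, default}` declarations in `data-composition-variables` and a `--variables` object keyed by id whose entries override the declared defaults, with missing keys falling through. A minimal sketch of that merge follows. The function `resolveVariables` is hypothetical and is not the package's implementation; the `type` values are illustrative, and ignoring undeclared override keys is an assumption the diff does not confirm.

```javascript
// Hypothetical sketch of the override rule; not the hyperframes implementation.
function resolveVariables(declarations, overrides = {}) {
  const resolved = {};
  for (const { id, default: fallback } of declarations) {
    const hasOverride = Object.prototype.hasOwnProperty.call(overrides, id);
    // A supplied key wins; a missing key falls through to the declared default.
    resolved[id] = hasOverride ? overrides[id] : fallback;
  }
  // Assumption: override keys with no matching declaration are ignored.
  return resolved;
}

// Declarations as they might appear in data-composition-variables
// (the `type` values here are illustrative, not taken from the diff).
const declarations = [
  { id: "title", type: "string", label: "Title", default: "Untitled" },
  { id: "accent", type: "color", label: "Accent", default: "#ff0055" },
];

// Same object shape as the CLI example: --variables '{"title":"Q4 Report"}'
const overrides = JSON.parse('{"title":"Q4 Report"}');

const resolved = resolveVariables(declarations, overrides);
console.log(resolved);
```

Here `title` takes the override and `accent` keeps its declared default, which is the behavior the paragraph relies on for running one composition unchanged in preview and in production.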