@kadj-amoah/showrunner 1.1.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,64 @@
1
+ # Changelog
2
+
3
+ All notable changes to Showrunner are documented here. Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); the project tracks loose semver — minor bumps for new capability, patch for fixes.
4
+
5
+ ## [1.1.0] — 2026-05-23
6
+
7
+ First tagged release. Versioned `1.1.0` rather than `0.1.0` because the project has been tracked as "v1.1 — reliability hardening + provider-agnostic refactor" throughout development; this is the first cut where the pipeline works end-to-end against real Next.js targets and the LLM + TTS layers are swappable. No `1.0.0` was ever published.
8
+
9
+ ### Added
10
+
11
+ - **Provider-agnostic LLM layer** (`llm.default` + per-stage `llm.overrides`). Four providers:
12
+ - `anthropic` — Claude via the official SDK with structured outputs.
13
+ - `openai` — `gpt-4o`-style models via `response_format: json_schema` with `json_object` fallback.
14
+ - `agent_bridge` — spawn a headless CLI agent like `claude -p --output-format json`; **no API key on file required**. Supports `spawn` and `file_poll` modes.
15
+ - `custom` — dynamic-import an operator-supplied module conforming to the `LLMProvider` interface.
16
+ - **Provider-agnostic TTS layer** (`voiceover.provider`). Three providers:
17
+ - `elevenlabs` — the only one that returns per-character alignment.
18
+ - `openai` — `tts-1-hd` via the audio API.
19
+ - `custom` — dynamic-import for in-house TTS pipelines.
20
+ - **`alignment_strategy: required | best_effort`** — under `best_effort`, when a TTS provider doesn't return alignment, Showrunner switches automatically to per-segment synthesis (one TTS call per segment) so audio isn't sliced at mid-word boundaries. Under `required`, the run fails instead.
21
+ - **`showrunner doctor`** — preflight checks before any run. Validates config syntax, provider env vars, ffmpeg + ffprobe on PATH, Playwright chromium binary, target URL reachability, free disk on every output dir, free memory + computed ffmpeg thread cap, and lifecycle script executability. Wired into `run` implicitly (escape with `--skip-doctor`).
22
+ - **`showrunner record-actions`** with scroll capture. Drives a headed browser; captures click / input / change / keydown / submit / **scroll** events; coalesces them into a clean action stream. Scroll capture is 250 ms debounced with direction coalescing.
23
+ - **DOM preflight in the `script` stage** — scrapes the live target's actionable selector inventory (≈60–80 entries on a typical marketing site) before asking the LLM for a manifest. The LLM is constrained to inventory selectors only, with a one-shot remediation retry if it strays.
24
+ - **Resource preflight** in `record` and `mux` stages — checks free disk against estimated artifact sizes, computes an ffmpeg `-threads` cap from free RAM × resolution × quality preset (`SHOWRUNNER_FFMPEG_THREADS` env override available).
25
+ - **Structured ffmpeg errors** — categorized as `oom | no_space | codec_missing | permission | unknown`, with per-category remediation hints.
26
+ - **`--resolution` flag** on both `init` and `run`. Presets: `low` (854×480), `standard` (720p, the default), `high` (1080p), `extreme` (4K). On `run` it overrides both `recording.viewport` and `output.resolution` in one step so they stay matched and mux doesn't upscale.
27
+ - **`init` flags**: `--llm-provider`, `--tts-provider`, `--resolution`. Scaffolds a `demo.yaml` with the right provider blocks, an `.env.example` listing only the env vars that combination actually needs (deduplicated when LLM + TTS both use OpenAI), and a real `docs/PRD.md` stub with section guides.
28
+ - **Legacy config auto-migration** — pre-v1.1 `demo.yaml` files with flat `voiceover.{voice_id, model, ...}` fields and no `llm` block are normalized at load time. Existing configs keep working with zero edits.
29
+ - **`validate --strict`** — exit nonzero on missing provider env vars (otherwise warns).
30
+ - **Captions** — SRT + VTT emitted alongside the final MP4 when `output.captions.enabled` is set. Whole-segment cues are used as a fallback when per-character alignment isn't available.
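The `--resolution` entry above can be pictured as a single lookup that rewrites both settings together. This is a minimal sketch, not Showrunner's actual code: `PRESETS`, `ResolutionConfig`, and `applyResolution` are hypothetical names, the config shape is assumed, and the 3840x2160 size for `extreme` is an assumption (the changelog only says "4K").

```typescript
// Preset names and the low/standard/high sizes come from the changelog entry.
type ResolutionPreset = "low" | "standard" | "high" | "extreme";

interface Size {
  width: number;
  height: number;
}

const PRESETS: Record<ResolutionPreset, Size> = {
  low: { width: 854, height: 480 },
  standard: { width: 1280, height: 720 },
  high: { width: 1920, height: 1080 },
  extreme: { width: 3840, height: 2160 }, // assumption: "4K" read as UHD
};

// Hypothetical slice of demo.yaml touched by the flag.
interface ResolutionConfig {
  recording: { viewport: Size };
  output: { resolution: string };
}

// Returns a copy with viewport and output resolution set from one preset,
// so the recording and the mux target cannot drift apart.
function applyResolution(
  config: ResolutionConfig,
  preset: ResolutionPreset,
): ResolutionConfig {
  const size = PRESETS[preset];
  return {
    recording: { ...config.recording, viewport: { ...size } },
    output: { ...config.output, resolution: `${size.width}x${size.height}` },
  };
}
```

Deriving both values from one table is what keeps mux from upscaling: there is no code path that sets one without the other.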
31
+
32
+ ### Changed
33
+
34
+ - `engines.node` bumped to `>=20.6` (needed for `process.loadEnvFile`).
35
+ - `init` defaults: `script.vo_review_gate: false` (was true), `recording.viewport: 1280×720` (was 1920×1080), `output.resolution: 1280x720`. In each case the larger or stricter value is a one-line opt-in edit.
36
+ - `tsup` configured with `skipNodeModulesBundle: true` and `target: 'node20'`. Built `dist/cli.js` dropped from 1.75 MB to 247 KB and no longer crashes with `Dynamic require of "tty" is not supported` — `npm link` and `npm i -g` are viable now.
37
+ - README rewritten for v1.1: install path (with `npm link`), full prereqs (Node ≥ 20.6, ffmpeg, chromium), provider-choice tables, pipeline-per-stage table, daily-commands cookbook, troubleshooting section.
38
+
39
+ ### Fixed
40
+
41
+ - `understand --interactive` no longer silently exits after the second question on piped stdin. Replaced `readline/promises.question()` with a queued `'line'`-event reader that handles both TTY and pipe semantics.
42
+ - `understand --output <path>` now correctly takes precedence over `project.product_model` in the config (was silently overridden).
43
+ - Script-stage prompt now bans Tailwind arbitrary-value class names (`.max-w-[1280px]`) which aren't legal CSS and crashed Playwright with a `SyntaxError`. A post-validator catches violations and retries once with a remediation prompt before accepting the manifest.
44
+ - `looksUnstable()` heuristic now catches Next.js compiled-CSS classes like `__variable_3eb911` (leading underscores were previously slipping past).
45
+ - `mux` output write now falls back to a timestamped sibling MP4 when the canonical path is locked by a media player, with a warning, instead of failing the run.
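The two selector fixes above amount to pattern checks on each manifest selector. This is a hedged sketch of how such checks might look. The regexes and the names `hasTailwindArbitraryValue` and `looksUnstableSketch` are illustrative assumptions, not the package's real `looksUnstable()` implementation.

```typescript
// Unescaped Tailwind arbitrary-value class such as `.max-w-[1280px]`:
// a class token ending in "-" directly followed by "[...]".
const TAILWIND_ARBITRARY = /\.[\w-]*-\[[^\]]*\]/;

// Class tokens, including ones that start with underscores.
const CLASS_TOKEN = /\.-?[_a-zA-Z][\w-]*/g;

// Hash-like suffix: "_" plus 5+ hex chars containing at least one digit,
// e.g. the `_3eb911` in `__variable_3eb911`.
const HASH_SUFFIX = /_(?=[0-9a-f]*\d)[0-9a-f]{5,}$/i;

function hasTailwindArbitraryValue(selector: string): boolean {
  return TAILWIND_ARBITRARY.test(selector);
}

function looksUnstableSketch(selector: string): boolean {
  const classes = selector.match(CLASS_TOKEN) ?? [];
  return classes.some((cls) => HASH_SUFFIX.test(cls));
}
```

A properly escaped selector such as `.max-w-\[1280px\]` passes the first check, because the backslash separates the hyphen from the bracket.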
46
+
47
+ ### Known limitations (Tier 3 — coded but not exercised end-to-end)
48
+
49
+ - OpenAI LLM + OpenAI TTS provider paths are wired but were not run with a real API key in v1.1 validation. The `agent_bridge` + ElevenLabs combination is the validated default.
50
+ - `agent_bridge` file_poll mode is coded but only `spawn` mode was exercised.
51
+ - Custom provider modules (LLM + TTS) — dynamic-import loader is wired but no reference implementation has been swapped in.
52
+ - `instrument`, `capture-auth`, `trace`, `preview`, `rerun-segment`, `print-vo`, `approve-vo` commands are mostly thin wrappers but were not stress-tested in v1.1.
53
+ - Background music mix and title-card logo / custom-font rendering are coded but were not exercised.
54
+ - 1920×1080 mux remains RAM-sensitive on memory-constrained machines; the thread-cap heuristic mitigates this, but the safe default is `--resolution standard`.
55
+
56
+ ### Validation artifact
57
+
58
+ A real end-to-end run against the Credstone Atlas marketing site produced `output/ct-website-atlas_demo.mp4` (4.3 MB, H.264 + AAC, 1280×720, 48.2 s, with SRT + VTT) using:
59
+ - `agent_bridge` LLM (via local `claude` CLI, no Anthropic key on file)
60
+ - ElevenLabs TTS with `alignment_strategy: required`
61
+ - DOM preflight + selector validator producing inventory-only manifest selectors
62
+ - Full mux preflight + thread cap on a 5.9 GB RAM machine with 1.1 GB free at run time
63
+
64
+ [1.1.0]: https://github.com/kadj-amoah/showrunner/releases/tag/v1.1.0
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Kofi Adjei
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,145 @@
1
+ # Showrunner
2
+
3
+ Automated product demo recording & production tool.
4
+
5
+ Showrunner collapses the demo-video pipeline into a single, repeatable, automatable command. Point it at a running web product, give it a one-page brief, and it produces a finished, captioned MP4 — comprehension, script, recording, voiceover, and mux in one pass.
6
+
7
+ ## Status
8
+
9
+ v1.1 — usable end-to-end for short demos. The LLM and TTS layers are provider-agnostic so you can bring your own keys or wire in an in-house pipeline. See `prd_showrunner.md` for the full product spec.
10
+
11
+ ## What you need on your machine
12
+
13
+ - **Node ≥ 20.6** (`process.loadEnvFile` is required)
14
+ - **ffmpeg + ffprobe on PATH** (`apt install ffmpeg` on Debian/Ubuntu, `brew install ffmpeg` on macOS, gyan.dev build on Windows)
15
+ - **A Chromium binary** — Playwright installs one for you (`npx playwright install chromium`)
16
+ - **API keys for the providers you pick**, OR a headless CLI agent like `claude -p` (see _Provider choices_ below)
17
+
18
+ Showrunner is built to deploy on Linux; development works fine on WSL2 or macOS.
19
+
20
+ ## Install
21
+
22
+ Until the GitHub Packages release lands, install from source:
23
+
24
+ ```bash
25
+ git clone https://github.com/kadj-amoah/showrunner.git
26
+ cd showrunner
27
+ npm install
28
+ npm run build
29
+ npm link # makes `showrunner` available globally
30
+ npx playwright install chromium
31
+ ```
32
+
33
+ Verify with `showrunner --help`.
34
+
35
+ ## First demo in a few commands
36
+
37
+ ```bash
38
+ showrunner init --name my-demo --url http://localhost:3000
39
+ cd my-demo
40
+ cp .env.example .env # then paste in your provider keys
41
+ $EDITOR docs/PRD.md # replace the stub with your product brief
42
+ showrunner doctor -c demo.yaml # preflight: 11–12 PASS/FAIL rows
43
+ showrunner run -c demo.yaml # the whole pipeline
44
+ open output/demo_final.mp4
45
+ ```
46
+
47
+ If you'd rather not write a PRD upfront, swap the `$EDITOR` step for `showrunner understand -c demo.yaml --interactive` — it asks five questions and produces the product model on the spot.
48
+
49
+ ## Provider choices
50
+
51
+ The two generative stages (LLM for comprehension + script, TTS for voiceover) are pluggable. Pick them at scaffold time with `--llm-provider` and `--tts-provider`, or edit the `llm` and `voiceover.provider` blocks in `demo.yaml` later.
52
+
53
+ ### LLM (`llm.default.provider`)
54
+
55
+ | Provider | Needs | Notes |
56
+ |----------------|--------------------------------------------|-----------------------------------------------------------------------------|
57
+ | `anthropic` | `ANTHROPIC_API_KEY` | Default. Uses Claude with structured outputs. |
58
+ | `openai` | `OPENAI_API_KEY` | Uses `response_format: json_schema`, falls back to `json_object` if needed. |
59
+ | `agent_bridge` | A headless CLI agent on PATH (default `claude -p --output-format json`) | No API key on file — Showrunner spawns the agent per request. |
60
+ | `custom` | A dynamic-importable module | Implement the `LLMProvider` interface. |
61
+
62
+ Per-stage overrides are supported under `llm.overrides.{comprehension,script,instrument}` so you can, e.g., use `agent_bridge` for the heavy script generation and `anthropic` for the small `instrument` calls.
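The lookup implied by `llm.default` plus `llm.overrides` can be sketched as follows. The key names come from the README. The `LlmConfig` type, the `resolveLlm` helper, and the choice that an override replaces the default wholesale (rather than merging with it) are assumptions made for illustration.

```typescript
type LlmStage = "comprehension" | "script" | "instrument";

interface LlmProviderConfig {
  provider: "anthropic" | "openai" | "agent_bridge" | "custom";
}

interface LlmConfig {
  default: LlmProviderConfig;
  overrides?: Partial<Record<LlmStage, LlmProviderConfig>>;
}

// Use the stage's override when one exists, otherwise fall back to the default.
function resolveLlm(config: LlmConfig, stage: LlmStage): LlmProviderConfig {
  return config.overrides?.[stage] ?? config.default;
}
```

With `default: anthropic` and an override of `agent_bridge` for `script`, only the script stage spawns the CLI agent. `comprehension` and `instrument` keep using the default.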
63
+
64
+ ### TTS (`voiceover.provider.name`)
65
+
66
+ | Provider | Needs | Alignment? | Default `alignment_strategy` |
67
+ |--------------|------------------------|------------|------------------------------|
68
+ | `elevenlabs` | `ELEVENLABS_API_KEY` | ✅ | `required` |
69
+ | `openai` | `OPENAI_API_KEY` | ❌ | `best_effort` |
70
+ | `custom` | A dynamic-import module| Your call | `best_effort` |
71
+
72
+ Only ElevenLabs returns per-character alignment. With other providers, Showrunner takes the **best-effort path**: one synthesis call per segment (no slicing), captions collapse to whole-segment cues, and `at_word` action timing degrades to `at`. Set `voiceover.alignment_strategy: required` if you want it to fail loudly instead.
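The decision described above is small enough to write as one function. This is a sketch under assumptions: `SynthesisPlan`, its field names, and `planSynthesis` are hypothetical. Only the behavior (fail under `required`, degrade under `best_effort`) is taken from the paragraph.

```typescript
type AlignmentStrategy = "required" | "best_effort";

interface SynthesisPlan {
  synthesis: "whole_script_then_slice" | "per_segment";
  captions: "aligned" | "whole_segment";
  atWordTiming: boolean; // false means `at_word` actions degrade to `at`
}

function planSynthesis(
  providerReturnsAlignment: boolean,
  strategy: AlignmentStrategy,
): SynthesisPlan {
  if (providerReturnsAlignment) {
    return { synthesis: "whole_script_then_slice", captions: "aligned", atWordTiming: true };
  }
  if (strategy === "required") {
    // Fail loudly rather than silently degrade.
    throw new Error("TTS provider returned no alignment and alignment_strategy is 'required'");
  }
  // Best-effort path: one call per segment, so nothing is sliced mid-word.
  return { synthesis: "per_segment", captions: "whole_segment", atWordTiming: false };
}
```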
73
+
74
+ ## Pipeline
75
+
76
+ ```
77
+ comprehension → script → record + voiceover → mux → demo_final.mp4
78
+ ```
79
+
80
+ | Stage | What it does | Inputs | Outputs |
81
+ |-----------------|----------------------------------------------------------------------------------------------------|-----------------------------------------|---------------------------------------------------------|
82
+ | `comprehension` | Reads `docs/`, an optional codebase, or runs the interactive Q&A. Emits `product_model.json`. | `docs/PRD.md` (or `--interactive`) | `product_model.json` |
83
+ | `script` | Scrapes the live target's actionable DOM, then asks the LLM for a manifest using only those selectors. | `product_model.json` + live target URL | `scripts/manifest.json`, `vo_script.txt`, Playwright spec |
84
+ | `record` | Drives Playwright through the manifest, captures `master.webm` + a slice plan. | manifest, dev server | `segments/video/master.webm`, `slice_plan.json` |
85
+ | `voiceover`     | Synthesizes TTS for the whole script, slices it per segment, and writes alignment files (when supported). | manifest                                | `segments/audio/*.mp3`, `segments/alignment/*.json`     |
86
+ | `mux`           | Normalizes and slices the video, adds branding cards, background music, and captions, then outputs the MP4. | video + audio + alignment               | `output/demo_final.mp4` (+ `.srt`/`.vtt` if enabled)    |
87
+
88
+ Stages are independently runnable, checkpointed, and idempotent. Existing artifacts are never silently overwritten — use `--force <stage>[,<stage>]` to regenerate.
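One way to read the checkpointing rule: a stage runs only if its artifacts are missing or it was named in `--force`. The sketch below illustrates that reading. `parseForce` and `stagesToRun` are hypothetical helpers, and whether forcing a stage also invalidates downstream stages is not stated in the README, so this version does not cascade.

```typescript
const STAGES = ["comprehension", "script", "record", "voiceover", "mux"] as const;
type StageName = (typeof STAGES)[number];

// Parse a `--force voiceover,mux` style value, rejecting unknown stage names.
function parseForce(flag: string): StageName[] {
  const known: readonly string[] = STAGES;
  const names = flag
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  for (const name of names) {
    if (!known.includes(name)) throw new Error(`unknown stage: ${name}`);
  }
  return names as StageName[];
}

// A stage runs when it is forced or has no existing artifacts.
function stagesToRun(
  withArtifacts: readonly StageName[],
  forced: readonly StageName[],
): StageName[] {
  return STAGES.filter((s) => forced.includes(s) || !withArtifacts.includes(s));
}
```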
89
+
90
+ ## Daily commands
91
+
92
+ ```bash
93
+ # preflight — catches missing keys, ffmpeg, dev server down, free disk, RAM cap
94
+ showrunner doctor -c demo.yaml
95
+
96
+ # generate (or refresh) product_model.json from docs/PRD.md
97
+ showrunner understand -c demo.yaml
98
+ showrunner understand -c demo.yaml --interactive # 5-question fallback
99
+
100
+ # full pipeline (`--skip-doctor` to bypass the implicit preflight)
101
+ showrunner run -c demo.yaml
102
+
103
+ # re-run just one stage
104
+ showrunner run -c demo.yaml --stages script
105
+ showrunner run -c demo.yaml --force voiceover,mux # regen, don't reuse
106
+
107
+ # the LLM's selectors are wrong for one segment — demo it yourself
108
+ showrunner record-actions -c demo.yaml --segment fill-form
109
+
110
+ # preview the manifest in Playwright UI Mode (no recording)
111
+ showrunner preview -c demo.yaml
112
+
113
+ # inspect a failed take
114
+ showrunner trace -c demo.yaml --segment <id>
115
+ ```
116
+
117
+ ## When things go wrong
118
+
119
+ - **`x264 malloc failed` during mux** — out of RAM. The `doctor` row "free memory: … (ffmpeg thread cap: N)" shows what was budgeted. Drop resolution to 1280x720 (or 854x480 for draft), set `SHOWRUNNER_FFMPEG_THREADS=1`, or close Docker / browsers.
120
+ - **Recording fails because a selector doesn't resolve** — usually the LLM picked something fragile. The `script` stage's DOM preflight is supposed to prevent this; if it does happen, run `showrunner record-actions -c demo.yaml --segment <id>` and demonstrate the interaction yourself.
121
+ - **`vo_review_gate` halted the pipeline** — by design. Edit `scripts/vo_script.txt`, then `showrunner approve-vo -c demo.yaml`. (The init scaffold ships with this off — you have to opt in via `script.vo_review_gate: true`.)
122
+ - **Output file is locked by a media player** — close VLC/QuickTime/your-browser-tab. Showrunner falls back to writing a timestamped sibling MP4 with a warning, but the canonical path needs the lock released.
123
+ - **DOM preflight failed** — your dev server isn't on the URL in `demo.yaml`, or it's behind auth. Bring the server up, or configure `recording.auth` for session/form/setup-script flows.
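Several of the failures above surface as ffmpeg stderr text, which the changelog says is categorized as `oom | no_space | codec_missing | permission | unknown`. The following is a rough sketch of such a classifier. The category names come from the changelog, while the regex rules and the `categorizeFfmpegError` name are assumptions based on common ffmpeg, x264, and OS error strings.

```typescript
type FfmpegErrorCategory = "oom" | "no_space" | "codec_missing" | "permission" | "unknown";

interface Rule {
  category: FfmpegErrorCategory;
  pattern: RegExp;
}

// Assumed patterns; the first matching rule wins.
const RULES: readonly Rule[] = [
  { category: "oom", pattern: /malloc (of size \d+ )?failed|cannot allocate memory|out of memory/i },
  { category: "no_space", pattern: /no space left on device/i },
  { category: "codec_missing", pattern: /unknown (encoder|decoder)/i },
  { category: "permission", pattern: /permission denied/i },
];

function categorizeFfmpegError(stderr: string): FfmpegErrorCategory {
  for (const rule of RULES) {
    if (rule.pattern.test(stderr)) return rule.category;
  }
  return "unknown";
}
```

A caller could then map `oom` to the remediation listed in the first bullet: lower the resolution or set `SHOWRUNNER_FFMPEG_THREADS=1`.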
124
+
125
+ ## Configuration reference
126
+
127
+ `demo.yaml` is the single source of truth. Sections:
128
+
129
+ - `project` — name and optional `product_model` path
130
+ - `comprehension` — `mode` + `sources` (PRD, README, codebase, OpenAPI, etc.)
131
+ - `script` — `style`, `duration_target_seconds`, `highlight_features`, `vo_review_gate`
132
+ - `recording` — `target_url`, viewport, browser, cursor + segment timing knobs, auth, lifecycle scripts
133
+ - `voiceover` — `provider` (discriminated by `name`), `alignment_strategy`, output dirs, drift behavior, pause placement
134
+ - `llm` — `default` provider + per-stage `overrides` (comprehension, script, instrument)
135
+ - `output` — resolution, fps, branding cards, background music, captions, output path
136
+
137
+ Legacy pre-v1.1 configs (flat `voiceover.voice_id`, no `llm` block) are auto-migrated by the loader, so your existing demos keep working without edits.
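A load-time migration of this kind might look like the sketch below. It is illustrative only. `migrateLegacyConfig` is a hypothetical name, mapping the flat fields to an `elevenlabs` provider block is an assumption (suggested by `voice_id`), and defaulting the missing `llm` block to `anthropic` follows the "Default" note in the LLM table rather than any confirmed loader behavior.

```typescript
type Dict = Record<string, unknown>;

function isDict(value: unknown): value is Dict {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a normalized copy; configs already in the new shape pass through unchanged.
function migrateLegacyConfig(raw: Dict): Dict {
  const out: Dict = { ...raw };

  const voiceover = raw["voiceover"];
  if (isDict(voiceover) && !isDict(voiceover["provider"]) && "voice_id" in voiceover) {
    // Move the flat fields under a provider block discriminated by `name`.
    const { voice_id, model, ...rest } = voiceover;
    out["voiceover"] = { ...rest, provider: { name: "elevenlabs", voice_id, model } };
  }

  if (!isDict(raw["llm"])) {
    out["llm"] = { default: { provider: "anthropic" } }; // assumed default
  }
  return out;
}
```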
138
+
139
+ ## Deployment target
140
+
141
+ Linux. Local dev runs on WSL2/Docker on Windows or macOS hosts. A Docker image bundling Node 20, Playwright (Chromium), FFmpeg, and `xvfb` is on the roadmap.
142
+
143
+ ## License
144
+
145
+ MIT
package/dist/cli.d.ts ADDED
@@ -0,0 +1 @@
1
+ #!/usr/bin/env node