npm - @kadj-amoah/showrunner - Versions diffs - 1.1.0 - Mend

@kadj-amoah/showrunner 1.1.0

Files changed (10)

package/CHANGELOG.md ADDED

@@ -0,0 +1,64 @@
+# Changelog
+All notable changes to Showrunner are documented here. Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); the project tracks loose semver — minor bumps for new capability, patch for fixes.
+## [1.1.0] — 2026-05-23
+First tagged release. Versioned `1.1.0` rather than `0.1.0` because the project has been tracked as "v1.1 — reliability hardening + provider-agnostic refactor" throughout development; this is the first cut where the pipeline works end-to-end against real Next.js targets and the LLM + TTS layers are swappable. No `1.0.0` was ever published.
+### Added
+- **Provider-agnostic LLM layer** (`llm.default` + per-stage `llm.overrides`). Four providers:
+  - `anthropic` — Claude via the official SDK with structured outputs.
+  - `openai` — `gpt-4o`-style models via `response_format: json_schema` with `json_object` fallback.
+  - `agent_bridge` — spawn a headless CLI agent like `claude -p --output-format json`; **no API key on file required**. Supports `spawn` and `file_poll` modes.
+  - `custom` — dynamic-import an operator-supplied module conforming to the `LLMProvider` interface.
+- **Provider-agnostic TTS layer** (`voiceover.provider`). Three providers:
+  - `elevenlabs` — the only one that returns per-character alignment.
+  - `openai` — `tts-1-hd` via the audio API.
+  - `custom` — dynamic-import for in-house TTS pipelines.
+- **`alignment_strategy: required | best_effort`** — controls what happens when a TTS provider doesn't return alignment. `best_effort` switches automatically to per-segment synthesis (one TTS call per segment) so audio doesn't get sliced at mid-word boundaries; `required` fails the run instead.
+- **`showrunner doctor`** — preflight checks before any run. Validates config syntax, provider env vars, ffmpeg + ffprobe on PATH, Playwright chromium binary, target URL reachability, free disk on every output dir, free memory + computed ffmpeg thread cap, and lifecycle script executability. Wired into `run` implicitly (escape with `--skip-doctor`).
+- **`showrunner record-actions`** with scroll capture. Drives a headed browser; captures click / input / change / keydown / submit / **scroll** events; coalesces them into a clean action stream. Scroll capture is 250 ms debounced with direction coalescing.
+- **DOM preflight in the `script` stage** — scrapes the live target's actionable selector inventory (≈60–80 entries on a typical marketing site) before asking the LLM for a manifest. The LLM is constrained to inventory selectors only, with a one-shot remediation retry if it strays.
+- **Resource preflight** in `record` and `mux` stages — checks free disk against estimated artifact sizes, computes an ffmpeg `-threads` cap from free RAM × resolution × quality preset (`SHOWRUNNER_FFMPEG_THREADS` env override available).
+- **Structured ffmpeg errors** — categorized as `oom | no_space | codec_missing | permission | unknown`, with per-category remediation hints.
+- **`--resolution` flag** on both `init` and `run`. Presets: `low` (854×480), `standard` (720p, the default), `high` (1080p), `extreme` (4K). On `run` it overrides both `recording.viewport` and `output.resolution` in one step so they stay matched and mux doesn't upscale.
+- **`init` flags**: `--llm-provider`, `--tts-provider`, `--resolution`. Scaffolds a `demo.yaml` with the right provider blocks, an `.env.example` listing only the env vars that combination actually needs (deduplicated when LLM + TTS both use OpenAI), and a real `docs/PRD.md` stub with section guides.
+- **Legacy config auto-migration** — pre-v0.1 `demo.yaml` files with flat `voiceover.{voice_id, model, ...}` fields and no `llm` block are normalized at load time. Existing configs keep working with zero edits.
+- **`validate --strict`** — exit nonzero on missing provider env vars (otherwise warns).
+- **Captions** — SRT + VTT emitted alongside the final MP4 when `output.captions.enabled` is set. Whole-segment cues are used as a fallback when per-character alignment isn't available.
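The scroll-capture entry above ("250 ms debounced with direction coalescing") can be illustrated with a small TypeScript sketch. This is not the package's recorder code: the `RawScrollSample` and `CoalescedScroll` shapes and the function name are hypothetical, and only the 250 ms window comes from the changelog.

```typescript
// Hypothetical event shape; the real recorder's types are not shown in the changelog.
interface RawScrollSample {
  timeMs: number; // timestamp of the scroll event
  deltaY: number; // positive = down, negative = up
}

interface CoalescedScroll {
  startMs: number;
  endMs: number;
  direction: "up" | "down";
  totalDeltaY: number;
}

const SCROLL_DEBOUNCE_MS = 250; // window taken from the changelog entry

function coalesceScrollSamples(samples: RawScrollSample[]): CoalescedScroll[] {
  const out: CoalescedScroll[] = [];
  for (const s of samples) {
    if (s.deltaY === 0) continue; // no movement, nothing to record
    const direction = s.deltaY > 0 ? "down" : "up";
    const last = out[out.length - 1];
    // Merge into the previous action when the gap is inside the debounce
    // window and the scroll direction has not flipped.
    if (last && last.direction === direction && s.timeMs - last.endMs <= SCROLL_DEBOUNCE_MS) {
      last.endMs = s.timeMs;
      last.totalDeltaY += s.deltaY;
    } else {
      out.push({ startMs: s.timeMs, endMs: s.timeMs, direction, totalDeltaY: s.deltaY });
    }
  }
  return out;
}
```

Under these assumptions, a burst of same-direction scroll events collapses into one action, while a direction flip or a pause longer than the window starts a new one.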
+### Changed
+- `engines.node` bumped to `>=20.6` for `process.loadEnvFile`; note that Node only ships that API from 20.12.0 on the 20.x line, so older 20.x releases will not have it.
+- `init` defaults: `script.vo_review_gate: false` (was true), `recording.viewport: 1280×720` (was 1920×1080), `output.resolution: 1280x720`. Each can be switched back to the larger/stricter value with a one-line edit.
+- `tsup` configured with `skipNodeModulesBundle: true` and `target: 'node20'`. Built `dist/cli.js` dropped from 1.75 MB to 247 KB and no longer crashes with `Dynamic require of "tty" is not supported` — `npm link` and `npm i -g` are viable now.
+- README rewritten for v0.1 — install path (with `npm link`), full prereqs (Node ≥ 20.6, ffmpeg, chromium), provider-choice tables, pipeline-per-stage table, daily-commands cookbook, troubleshooting section.
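The new 1280×720 defaults line up with the `standard` preset of the `--resolution` flag. A minimal TypeScript sketch of how a preset could set `recording.viewport` and `output.resolution` together follows. The `SizeConfig` shape and the function name are hypothetical, and 3840×2160 for `extreme` is an assumption, since the changelog only says "4K".

```typescript
type ResolutionPreset = "low" | "standard" | "high" | "extreme";

interface FrameSize {
  width: number;
  height: number;
}

// low/standard/high follow the changelog; 3840x2160 for "extreme" is an
// assumption (the changelog only says "4K").
const PRESET_SIZES: Record<ResolutionPreset, FrameSize> = {
  low: { width: 854, height: 480 },
  standard: { width: 1280, height: 720 },
  high: { width: 1920, height: 1080 },
  extreme: { width: 3840, height: 2160 },
};

// Hypothetical slice of demo.yaml after parsing.
interface SizeConfig {
  recording: { viewport: FrameSize };
  output: { resolution: string }; // "WIDTHxHEIGHT"
}

// Returns a copy with both fields set from the same preset, so the recorded
// viewport and the muxed output cannot drift apart.
function applyResolutionPreset(config: SizeConfig, preset: ResolutionPreset = "standard"): SizeConfig {
  const size = PRESET_SIZES[preset];
  return {
    ...config,
    recording: { ...config.recording, viewport: { ...size } },
    output: { ...config.output, resolution: `${size.width}x${size.height}` },
  };
}
```

Deriving both values from one table is the point of the sketch: mux never has to upscale because the two sizes always match.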
+### Fixed
+- `understand --interactive` no longer silently exits after the second question on piped stdin. Replaced `readline/promises.question()` with a queued `'line'`-event reader that handles both TTY and pipe semantics.
+- `understand --output <path>` now correctly takes precedence over `project.product_model` in the config (was silently overridden).
+- Script-stage prompt now bans Tailwind arbitrary-value class names (`.max-w-[1280px]`) which aren't legal CSS and crashed Playwright with a `SyntaxError`. A post-validator catches violations and retries once with a remediation prompt before accepting the manifest.
+- `looksUnstable()` heuristic now catches Next.js compiled-CSS classes like `__variable_3eb911` (leading underscores were previously slipping past).
+- `mux` output write now falls back to a timestamped sibling MP4 when the canonical path is locked by a media player, with a warning, instead of failing the run.
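The two selector fixes above (Tailwind arbitrary-value classes and Next.js hashed classes) could be approximated by a post-validator along these lines. This is a sketch under assumptions, not the package's `looksUnstable()` or its validator: both regular expressions and the problem labels are hypothetical heuristics.

```typescript
// Heuristic: a class selector that ends in "-" and is followed directly by
// "[...]" is almost certainly a Tailwind arbitrary value, e.g. ".max-w-[1280px]".
const TAILWIND_ARBITRARY = /\.[\w-]*-\[[^\]]*\]/;

// Heuristic: a class whose last "_"-separated chunk is 5+ hex characters with
// at least one digit looks like a build-time hash, e.g. "__variable_3eb911".
const HASHED_SUFFIX = /_(?=[0-9a-f]*\d)[0-9a-f]{5,}$/;

function findSelectorProblems(selector: string): string[] {
  const problems: string[] = [];
  if (TAILWIND_ARBITRARY.test(selector)) {
    problems.push("tailwind_arbitrary_value");
  }
  // Pull out each ".class" token and test it for a hashed suffix.
  const classTokens = selector.match(/\.[A-Za-z_][\w-]*/g) ?? [];
  if (classTokens.some((token) => HASHED_SUFFIX.test(token))) {
    problems.push("unstable_hashed_class");
  }
  return problems;
}
```

A caller could retry the LLM once with a remediation prompt whenever the returned list is non-empty, which mirrors the retry-once behavior the changelog describes.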
+### Known limitations (Tier 3 — coded but not exercised end-to-end)
+- OpenAI LLM + OpenAI TTS provider paths are wired but were not run with a real API key in v0.1 validation. The agent_bridge + ElevenLabs combination is the validated default.
+- `agent_bridge` file_poll mode is coded but only `spawn` mode was exercised.
+- Custom provider modules (LLM + TTS) — dynamic-import loader is wired but no reference implementation has been swapped in.
+- `instrument`, `capture-auth`, `trace`, `preview`, `rerun-segment`, `print-vo`, `approve-vo` commands are mostly thin wrappers but were not stress-tested in v0.1.
+- Background music mix and title-card logo / custom-font rendering are coded but were not exercised.
+- 1920×1080 mux remains RAM-sensitive on tight boxes; the thread-cap heuristic mitigates this, but the safe default is `--resolution standard`.
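The changelog does not give the thread-cap formula, only its inputs (free RAM, resolution, quality preset) and the `SHOWRUNNER_FFMPEG_THREADS` override. The TypeScript sketch below shows one plausible shape for such a heuristic. Every constant and name in it is a hypothetical illustration, not Showrunner's actual budget.

```typescript
// Illustrative assumption only: memory one encoder thread might need per pixel.
const ASSUMED_BYTES_PER_PIXEL_PER_THREAD = 250;

interface ThreadCapInput {
  freeBytes: number;      // free RAM at run time
  width: number;          // output width in pixels
  height: number;         // output height in pixels
  qualityFactor: number;  // hypothetical multiplier, 1 = baseline preset
  cpuCount: number;       // logical cores available
  envOverride?: string;   // raw value of SHOWRUNNER_FFMPEG_THREADS, if set
}

function ffmpegThreadCap(input: ThreadCapInput): number {
  // A valid positive integer override wins over the heuristic.
  const override = Number.parseInt(input.envOverride ?? "", 10);
  if (Number.isInteger(override) && override >= 1) return override;

  const bytesPerThread =
    input.width * input.height * ASSUMED_BYTES_PER_PIXEL_PER_THREAD * input.qualityFactor;
  const threadsThatFit = Math.floor(input.freeBytes / bytesPerThread);
  // Never go below one thread and never exceed the core count.
  return Math.max(1, Math.min(threadsThatFit, input.cpuCount));
}
```

The resulting number would be passed to ffmpeg's `-threads` option. Larger frames shrink the cap under this model, which is consistent with the advice to stay on `--resolution standard` when memory is tight.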
+### Validation artifact
+A real end-to-end run against the Credstone Atlas marketing site produced `output/ct-website-atlas_demo.mp4` (4.3 MB, H.264 + AAC, 1280×720, 48.2 s, with SRT + VTT) using:
+- `agent_bridge` LLM (via local `claude` CLI, no Anthropic key on file)
+- ElevenLabs TTS with `alignment_strategy: required`
+- DOM preflight + selector validator producing inventory-only manifest selectors
+- Full mux preflight + thread cap on a 5.9 GB RAM machine with 1.1 GB free at run time
+[1.1.0]: https://github.com/kadj-amoah/showrunner/releases/tag/v1.1.0

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Kofi Adjei
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,145 @@
+# Showrunner
+Automated product demo recording & production tool.
+Showrunner collapses the demo-video pipeline into a single, repeatable, automatable command. Point it at a running web product, give it a one-page brief, and it produces a finished, captioned MP4 — comprehension, script, recording, voiceover, and mux in one pass.
+## Status
+v1.1 — usable end-to-end for short demos. The LLM and TTS layers are provider-agnostic so you can bring your own keys or wire in an in-house pipeline. See `prd_showrunner.md` for the full product spec.
+## What you need on your machine
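+- **Node ≥ 20.6** is the declared floor, but `process.loadEnvFile` (which Showrunner requires) only exists from Node 20.12.0 on the 20.x line, so prefer 20.12 or later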
+- **ffmpeg + ffprobe on PATH** (`apt install ffmpeg` on Debian/Ubuntu, `brew install ffmpeg` on macOS, gyan.dev build on Windows)
+- **A Chromium binary** — Playwright installs one for you (`npx playwright install chromium`)
+- **API keys for the providers you pick**, OR a headless CLI agent like `claude -p` (see _Provider choices_ below)
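The Node floor in the list above can be checked before anything else runs. The helper below is a minimal sketch, not part of Showrunner; its name and defaults are hypothetical, with the defaults set to the 20.6 floor the README declares.

```typescript
// Returns true when `version` (e.g. "20.6.0" or "v22.1.0") is at least
// minMajor.minMinor. Unparseable input is treated as not meeting the floor.
function meetsMinimumNode(version: string, minMajor = 20, minMinor = 6): boolean {
  const [major = 0, minor = 0] = version
    .replace(/^v/, "")
    .split(".")
    .map((part) => Number.parseInt(part, 10));
  if (Number.isNaN(major) || Number.isNaN(minor)) return false;
  return major > minMajor || (major === minMajor && minor >= minMinor);
}
```

In a real script the argument would typically be `process.versions.node`.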
+Showrunner is built to deploy on Linux but develops fine on WSL2 or macOS.
+## Install
+Until the GitHub Packages release lands, install from source:
+```bash
+git clone https://github.com/kadj-amoah/showrunner.git
+cd showrunner
+npm install
+npm run build
+npm link                    # makes `showrunner` available globally
+npx playwright install chromium
+```
+Verify with `showrunner --help`.
+## First demo in seven commands
+```bash
+showrunner init --name my-demo --url http://localhost:3000
+cd my-demo
+cp .env.example .env                       # then paste in your provider keys
+$EDITOR docs/PRD.md                        # replace the stub with your product brief
+showrunner doctor -c demo.yaml             # preflight: 11–12 PASS/FAIL rows
+showrunner run -c demo.yaml                # the whole pipeline
+open output/demo_final.mp4                 # macOS; use xdg-open on Linux
+```
+If you'd rather not write a PRD upfront, swap the `$EDITOR` step for `showrunner understand -c demo.yaml --interactive` — it asks five questions and produces the product model on the spot.
+## Provider choices
+The two generative stages (LLM for comprehension + script, TTS for voiceover) are pluggable. Pick them at scaffold time with `--llm-provider` and `--tts-provider`, or edit the `llm` and `voiceover.provider` blocks in `demo.yaml` later.
+### LLM (`llm.default.provider`)
+| Provider       | Needs                                      | Notes                                                                       |
+|----------------|--------------------------------------------|-----------------------------------------------------------------------------|
+| `anthropic`    | `ANTHROPIC_API_KEY`                        | Default. Uses Claude with structured outputs.                                |
+| `openai`       | `OPENAI_API_KEY`                           | Uses `response_format: json_schema`, falls back to `json_object` if needed.  |
+| `agent_bridge` | A headless CLI agent on PATH (default `claude -p --output-format json`) | No API key on file — Showrunner spawns the agent per request.                |
+| `custom`       | A dynamic-importable module                | Implement the `LLMProvider` interface.                                       |
+Per-stage overrides are supported under `llm.overrides.{comprehension,script,instrument}` so you can, e.g., use `agent_bridge` for the heavy script generation and `anthropic` for the small `instrument` calls.
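The override lookup described above reduces to "use the stage's entry if present, otherwise `llm.default`". A minimal TypeScript sketch, assuming a parsed `llm` section; only the `provider` field and the three stage names come from the README, and the type and function names are hypothetical.

```typescript
type LlmStageName = "comprehension" | "script" | "instrument";

// Only `provider` is confirmed by the README; real provider blocks
// likely carry more fields (model, command, module path, ...).
interface LlmProviderBlock {
  provider: "anthropic" | "openai" | "agent_bridge" | "custom";
}

interface LlmSection {
  default: LlmProviderBlock;
  overrides?: Partial<Record<LlmStageName, LlmProviderBlock>>;
}

// A stage-specific override wins; otherwise fall back to llm.default.
function resolveLlmFor(stage: LlmStageName, llm: LlmSection): LlmProviderBlock {
  return llm.overrides?.[stage] ?? llm.default;
}
```

With `default` set to `anthropic` and an override of `agent_bridge` for `script`, the `script` stage resolves to `agent_bridge` while `instrument` and `comprehension` stay on `anthropic`, matching the README's example.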
+### TTS (`voiceover.provider.name`)
+| Provider     | Needs                  | Alignment? | Default `alignment_strategy` |
+|--------------|------------------------|------------|------------------------------|
+| `elevenlabs` | `ELEVENLABS_API_KEY`   | ✅          | `required`                   |
+| `openai`     | `OPENAI_API_KEY`       | ❌          | `best_effort`                |
+| `custom`     | A dynamic-import module| Your call  | `best_effort`                |
+Only ElevenLabs returns per-character alignment. With other providers, Showrunner takes the **best-effort path**: one synthesis call per segment (no slicing), captions collapse to whole-segment cues, and `at_word` action timing degrades to `at`. Set `voiceover.alignment_strategy: required` if you want it to fail loudly instead.
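The decision in the paragraph above can be written as a small pure function. This is a sketch of the documented behavior, not the package's code; the plan labels and function name are hypothetical.

```typescript
type AlignmentStrategy = "required" | "best_effort";
type SynthesisPlan = "whole_script_then_slice" | "one_call_per_segment";

function planSynthesis(providerReturnsAlignment: boolean, strategy: AlignmentStrategy): SynthesisPlan {
  // With alignment data, one synthesis pass can be sliced at word boundaries.
  if (providerReturnsAlignment) return "whole_script_then_slice";
  // No alignment and the operator asked for it: fail loudly.
  if (strategy === "required") {
    throw new Error("alignment_strategy is 'required' but the TTS provider returns no alignment");
  }
  // Best-effort path: one TTS call per segment, so nothing needs slicing.
  return "one_call_per_segment";
}
```

On the best-effort path, callers would also need to emit whole-segment caption cues and treat `at_word` timings as `at`, as the README notes.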
+## Pipeline
+```
+comprehension → script → record + voiceover → mux → demo_final.mp4
+```
+| Stage           | What it does                                                                                       | Inputs                                  | Outputs                                                 |
+|-----------------|----------------------------------------------------------------------------------------------------|-----------------------------------------|---------------------------------------------------------|
+| `comprehension` | Reads `docs/`, an optional codebase, or runs the interactive Q&A. Emits `product_model.json`.       | `docs/PRD.md` (or `--interactive`)      | `product_model.json`                                    |
+| `script`        | Scrapes the live target's actionable DOM, then asks the LLM for a manifest using only those selectors. | `product_model.json` + live target URL  | `scripts/manifest.json`, `vo_script.txt`, Playwright spec |
+| `record`        | Drives Playwright through the manifest, captures `master.webm` + a slice plan.                     | manifest, dev server                    | `segments/video/master.webm`, `slice_plan.json`         |
+| `voiceover`     | Runs TTS for the whole script, slices the audio per segment, and writes alignment files (when supported). | manifest                                | `segments/audio/*.mp3`, `segments/alignment/*.json`     |
+| `mux`           | Normalizes and slices the video, adds branding cards, background music, and captions, then writes the MP4. | video + audio + alignment               | `output/demo_final.mp4` (+ `.srt`/`.vtt` if enabled)    |
+Stages are independently runnable, checkpointed, and idempotent. Existing artifacts are never silently overwritten — use `--force <stage>[,<stage>]` to regenerate.
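The checkpoint rule above (reuse existing artifacts unless a stage is forced) can be sketched as follows. The stage names come from the pipeline table; the function names and the `hasArtifact` map are hypothetical. The sketch does not cascade a forced stage to its downstream stages, since the README does not say whether Showrunner does.

```typescript
const PIPELINE_STAGES = ["comprehension", "script", "record", "voiceover", "mux"] as const;
type PipelineStage = (typeof PIPELINE_STAGES)[number];

// Parses the value of `--force <stage>[,<stage>]` and rejects unknown names.
function parseForceFlag(value: string): PipelineStage[] {
  const requested = value
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  const known = PIPELINE_STAGES as readonly string[];
  const unknown = requested.filter((s) => known.indexOf(s) === -1);
  if (unknown.length > 0) throw new Error(`unknown stage(s): ${unknown.join(", ")}`);
  return requested as PipelineStage[];
}

// A stage runs when it is forced or when its artifact is missing;
// existing artifacts are otherwise reused, never overwritten.
function stagesToRun(hasArtifact: Record<PipelineStage, boolean>, forced: PipelineStage[]): PipelineStage[] {
  return PIPELINE_STAGES.filter((stage) => forced.indexOf(stage) !== -1 || !hasArtifact[stage]);
}
```

For example, with every artifact present and `--force voiceover,mux`, only those two stages are selected, in pipeline order.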
+## Daily commands
+```bash
+# preflight — catches missing keys, ffmpeg, dev server down, free disk, RAM cap
+showrunner doctor -c demo.yaml
+# generate (or refresh) product_model.json from docs/PRD.md
+showrunner understand -c demo.yaml
+showrunner understand -c demo.yaml --interactive          # 5-question fallback
+# full pipeline (`--skip-doctor` to bypass the implicit preflight)
+showrunner run -c demo.yaml
+# re-run just one stage
+showrunner run -c demo.yaml --stages script
+showrunner run -c demo.yaml --force voiceover,mux         # regen, don't reuse
+# the LLM's selectors are wrong for one segment — demo it yourself
+showrunner record-actions -c demo.yaml --segment fill-form
+# preview the manifest in Playwright UI Mode (no recording)
+showrunner preview -c demo.yaml
+# inspect a failed take
+showrunner trace -c demo.yaml --segment <id>
+```
+## When things go wrong
+- **`x264 malloc failed` during mux** — out of RAM. The `doctor` row "free memory: … (ffmpeg thread cap: N)" shows what was budgeted. Drop resolution to 1280x720 (or 854x480 for draft), set `SHOWRUNNER_FFMPEG_THREADS=1`, or close Docker / browsers.
+- **Recording fails because a selector doesn't resolve** — usually the LLM picked something fragile. The `script` stage's DOM preflight is supposed to prevent this; if it does happen, run `showrunner record-actions -c demo.yaml --segment <id>` and demonstrate the interaction yourself.
+- **`vo_review_gate` halted the pipeline** — by design. Edit `scripts/vo_script.txt`, then `showrunner approve-vo -c demo.yaml`. (The init scaffold ships with this off — you have to opt in via `script.vo_review_gate: true`.)
+- **Output file is locked by a media player** — close VLC/QuickTime/your-browser-tab. Showrunner falls back to writing a timestamped sibling MP4 with a warning, but the canonical path needs the lock released.
+- **DOM preflight failed** — your dev server isn't on the URL in `demo.yaml`, or it's behind auth. Bring the server up, or configure `recording.auth` for session/form/setup-script flows.
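The changelog says ffmpeg failures are categorized as `oom | no_space | codec_missing | permission | unknown` with a hint per category. The TypeScript sketch below shows how stderr text could be mapped to those categories. The category names are from the changelog and the `malloc failed` pattern and its hint echo the first troubleshooting entry; the remaining patterns and hint texts are assumptions, not the package's actual rules.

```typescript
type FfmpegErrorCategory = "oom" | "no_space" | "codec_missing" | "permission" | "unknown";

// Illustrative guesses at common stderr text; checked in order, first match wins.
const FFMPEG_ERROR_PATTERNS: Array<[FfmpegErrorCategory, RegExp]> = [
  ["oom", /malloc failed|cannot allocate memory|out of memory/i],
  ["no_space", /no space left on device/i],
  ["codec_missing", /unknown encoder|encoder not found/i],
  ["permission", /permission denied/i],
];

// Hint wording is hypothetical except for the oom advice, which follows the README.
const FFMPEG_HINTS: Record<FfmpegErrorCategory, string> = {
  oom: "Drop the resolution, set SHOWRUNNER_FFMPEG_THREADS=1, or free up RAM.",
  no_space: "Free disk space in the output directories.",
  codec_missing: "Install an ffmpeg build that includes the requested encoder.",
  permission: "Check write permissions on the output path.",
  unknown: "Inspect the full ffmpeg stderr output.",
};

function classifyFfmpegError(stderr: string): { category: FfmpegErrorCategory; hint: string } {
  for (const [category, pattern] of FFMPEG_ERROR_PATTERNS) {
    if (pattern.test(stderr)) return { category, hint: FFMPEG_HINTS[category] };
  }
  return { category: "unknown", hint: FFMPEG_HINTS.unknown };
}
```

Keeping the patterns in an ordered table makes it easy to add a category without touching the matching loop.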
+## Configuration reference
+`demo.yaml` is the single source of truth. Sections:
+- `project` — name and optional `product_model` path
+- `comprehension` — `mode` + `sources` (PRD, README, codebase, OpenAPI, etc.)
+- `script` — `style`, `duration_target_seconds`, `highlight_features`, `vo_review_gate`
+- `recording` — `target_url`, viewport, browser, cursor + segment timing knobs, auth, lifecycle scripts
+- `voiceover` — `provider` (discriminated by `name`), `alignment_strategy`, output dirs, drift behavior, pause placement
+- `llm` — `default` provider + per-stage `overrides` (comprehension, script, instrument)
+- `output` — resolution, fps, branding cards, background music, captions, output path
+Legacy configs (flat `voiceover.voice_id`, no `llm` block) are auto-migrated by the loader, so your existing demos keep working without edits.
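A load-time migration of the kind described above might look like the following TypeScript sketch. It is not the package's loader. It assumes that legacy voiceover settings map to the `elevenlabs` provider and that a missing `llm` block falls back to `anthropic` (the README's stated default); the loose types and function name are hypothetical.

```typescript
type LooseSection = Record<string, unknown>;

interface LooseConfig {
  voiceover?: LooseSection;
  llm?: LooseSection;
  [key: string]: unknown;
}

function migrateLegacyConfig(raw: LooseConfig): LooseConfig {
  const migrated: LooseConfig = { ...raw };
  const vo = raw.voiceover;

  // Legacy shape: flat voice_id/model and no provider block.
  if (vo && vo["provider"] === undefined && typeof vo["voice_id"] === "string") {
    const rest: LooseSection = { ...vo };
    delete rest["voice_id"];
    delete rest["model"];
    // Assumption: legacy configs targeted ElevenLabs.
    migrated.voiceover = {
      ...rest,
      provider: { name: "elevenlabs", voice_id: vo["voice_id"], model: vo["model"] },
    };
  }

  // Assumption: a missing llm block falls back to the README's default provider.
  if (raw.llm === undefined) {
    migrated.llm = { default: { provider: "anthropic" } };
  }
  return migrated;
}
```

Configs that already use the `provider` and `llm` blocks pass through unchanged, and the input object is never mutated.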
+## Deployment target
+Linux. Local dev runs on WSL2/Docker on Windows or macOS hosts. A Docker image bundling Node 20, Playwright (Chromium), FFmpeg, and `xvfb` is on the roadmap.
+## License
+MIT

package/dist/cli.d.ts ADDED

@@ -0,0 +1 @@
+#!/usr/bin/env node