reelrecon 1.2.0

package/CLAUDE.md ADDED
@@ -0,0 +1,136 @@
1
+ # Claude Usage
2
+
3
+ Use this repository to:
4
+
5
+ - fetch the latest 10 videos from a public Instagram profile and transcribe them, or
6
+ - transcribe a single direct video URL, or
7
+ - transcribe a local uploaded audio file.
8
+
9
+ There is also a local web app for interactive use and progress tracking. The frontend is a Vite React app built with shadcn/ui components and served by the FastAPI backend after build.
10
+ There is also an MCP server so Claude or other MCP-compatible clients can operate the tool directly.
11
+
12
+ AI insights are generated with GroqCloud when `GROQ_API_KEY` is available. The app falls back to local heuristic insights if Groq is unavailable.
13
+
14
+ ## Install
15
+
16
+ Run:
17
+
18
+ ```bash
19
+ python3.11 -m venv .venv
20
+ .venv/bin/pip install -r requirements.txt
21
+ ```
22
+
23
+ Optional:
24
+
25
+ ```bash
26
+ cp .env.example .env.local
27
+ ```
28
+
29
+ Then set `GROQ_API_KEY` in `.env.local`.
30
+
31
+ Requirements:
32
+
33
+ - `ffmpeg` available on `PATH`
34
+ - network access enabled
35
+ - public Instagram profile URL
36
+
37
+ ## Preferred command
38
+
39
+ Use JSON mode so stdout is machine-readable:
40
+
41
+ ```bash
42
+ ./run_latest_reel_transcription.sh "https://www.instagram.com/<username>/" --json
43
+ ```
44
+
45
+ Optional:
46
+
47
+ ```bash
48
+ ./run_latest_reel_transcription.sh "https://www.instagram.com/reel/<id>/" --json --model small --language en
49
+ ```
50
+
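A minimal sketch of how a caller might assemble the argument list for the wrapper in JSON mode. The helper name `build_cli_args` is hypothetical; only the script path and the `--json`, `--model`, and `--language` flags come from the commands shown above.

```python
from typing import List, Optional


def build_cli_args(
    url: str,
    model: Optional[str] = None,
    language: Optional[str] = None,
) -> List[str]:
    """Build argv for the wrapper script, always requesting JSON output.

    Hypothetical helper: only the flags documented above are emitted.
    """
    args = ["./run_latest_reel_transcription.sh", url, "--json"]
    if model:
        args += ["--model", model]
    if language:
        args += ["--language", language]
    return args
```

The resulting list can be handed to `subprocess.run(args, capture_output=True, text=True)` so that the URL is never interpolated into a shell string.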
51
+ ## MCP
52
+
53
+ Preferred command for MCP clients:
54
+
55
+ ```bash
56
+ ./run_mcp_server.sh
57
+ ```
58
+
59
+ Or, without a local clone (Node 18+, Python 3.10+, ffmpeg required; first run provisions a Python env in `~/.reelrecon`):
60
+
61
+ ```bash
62
+ npx -y reelrecon
63
+ ```
64
+
65
+ This starts the server over stdio. The MCP surface exposes:
66
+
67
+ - `transcribe_input`
68
+ - `transcribe_local_audio`
69
+ - `list_recent_batches`
70
+ - `read_batch_manifest`
71
+ - `read_video_output`
72
+ - `check_health`
73
+
74
+ MCP tools never raise for expected failures: every tool returns `status: "ok"` or `status: "error"` with `error_type`, `error`, and usually a `hint`. Use `check_health` to diagnose setup problems (whisper/yt-dlp/ffmpeg availability, output directory writability, job activity). Use `include_transcript_text=false` or `max_transcript_chars` to keep tool responses small; full transcripts stay on disk and behind the transcript resources. In multi-video batches a failing video is recorded with `status: "error"` and counted in `failed_videos` instead of aborting the batch. Server limits (job timeout, concurrency, upload size) are tunable via `REELRECON_*` environment variables (legacy `IG_TRANSCRIBER_*` names still work) documented in the README.
75
+
76
+ Resources:
77
+
78
+ - `reelrecon://server`
79
+ - `reelrecon://recent-batches`
80
+ - `reelrecon://manifest/{source_group}/{source_label}`
81
+ - `reelrecon://transcript/{source_group}/{source_label}/{video_id}`
82
+
83
+ If an MCP client needs HTTP instead of stdio:
84
+
85
+ ```bash
86
+ ./run_mcp_server.sh --transport streamable-http --host 127.0.0.1 --port 8001
87
+ ```
88
+
89
+ Then connect the client to `http://127.0.0.1:8001/mcp`.
90
+
91
+ ## UI
92
+
93
+ Start the local app:
94
+
95
+ ```bash
96
+ ./run_ui.sh
97
+ ```
98
+
99
+ The launcher picks an open localhost port and opens the browser automatically. If needed, read the URL from terminal output.
100
+ It also builds the frontend before starting the server.
101
+
102
+ ## Success contract
103
+
104
+ On success, stdout is a single JSON object with:
105
+
106
+ - `status`
107
+ - `input_kind`
108
+ - `input_url`
109
+ - `canonical_url`
110
+ - `total_videos`
111
+ - `completed_videos`
112
+ - `videos`
113
+ - `ai_overview`
114
+ - `manifest_file`
115
+
116
+ Each item in `videos` includes transcript paths, metadata paths, detected language, and `ai_insights`.
117
+
118
+ ## Failure contract
119
+
120
+ On failure, the command exits non-zero.
121
+
122
+ If `--json` is used, stdout includes:
123
+
124
+ ```json
125
+ {"status":"error","error":"..."}
126
+ ```
127
+
128
+ Human-readable error details are also written to stderr.
129
+
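The two contracts above can be folded into one small parser on the caller's side. This is a sketch under stated assumptions: `parse_cli_result` is a hypothetical name, and treating unparseable stdout as an error is a client-side choice, not something the wrapper promises.

```python
import json
from typing import Any, Dict


def parse_cli_result(returncode: int, stdout: str) -> Dict[str, Any]:
    """Normalize a `--json` run into a single dict with a `status` key.

    Assumption: a non-zero exit code or `status == "error"` both mean failure.
    """
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        payload = None
    if not isinstance(payload, dict):
        # Client-side fallback: stdout did not contain a JSON object at all.
        return {"status": "error", "error": "stdout was not a JSON object"}
    if returncode != 0 or payload.get("status") == "error":
        return {"status": "error", "error": payload.get("error", "unknown error")}
    return payload
```

On success the returned dict is the payload itself, so keys such as `videos` and `manifest_file` can be read directly.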
130
+ ## Notes
131
+
132
+ - Public profiles only.
133
+ - Local audio uploads bypass Instagram entirely.
134
+ - Instagram may rate-limit anonymous requests.
135
+ - The wrapper prefers Python 3.11 when available to avoid `yt-dlp` Python 3.9 deprecation noise.
136
+ - The wrapper uses the repo-local `.venv` first when present.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 4nw3rprod
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,335 @@
1
+ <div align="center">
2
+
3
+ # 🎬 ReelRecon
4
+
5
+ ### Reel reconnaissance for AI agents.
6
+
7
+ **Transcribe and decode any public Instagram profile — hooks, CTAs, and script patterns — locally and for free.**
8
+
9
+ **Give Claude, ChatGPT, Gemini, Hermes, OpenClaw — or any MCP-capable agent — the power to watch Instagram for you.**
10
+
11
+ [![Python](https://img.shields.io/badge/python-3.11+-3776AB?logo=python&logoColor=white)](https://www.python.org/)
12
+ [![Whisper](https://img.shields.io/badge/transcription-OpenAI%20Whisper-74aa9c?logo=openai&logoColor=white)](https://github.com/openai/whisper)
13
+ [![MCP](https://img.shields.io/badge/protocol-MCP%20native-8A2BE2)](https://modelcontextprotocol.io/)
14
+ [![Agents](https://img.shields.io/badge/works%20with-Claude%20·%20ChatGPT%20·%20Gemini%20·%20Hermes%20·%20OpenClaw-blueviolet)](#-drop-it-into-your-agent-stack)
15
+ [![Price](https://img.shields.io/badge/price-free-success)](#)
16
+ [![Privacy](https://img.shields.io/badge/runs-locally-orange)](#)
17
+
18
+ *Your agent can already write scripts. Now it can study the competition first:*
19
+ *"Transcribe @competitor's latest 10 Reels and break down their hook formulas" — one tool call away.*
20
+
21
+ [🤖 Agent Setup](#-drop-it-into-your-agent-stack) · [🚀 Quick Start](#-quick-start) · [🔍 Use Cases](#-what-your-agent-can-do-with-it) · [🧰 Tool Reference](#-mcp-tool-reference) · [🖥️ Web UI](#️-the-dashboard-for-humans)
22
+
23
+ <img src="screen.png" alt="ReelRecon dashboard" width="850"/>
24
+
25
+ </div>
26
+
27
+ ---
28
+
29
+ ## 🎯 Why this exists
30
+
31
+ LLMs can't watch video. Agentic frameworks can browse, code, and write — but a Reel is a black box to them. **ReelRecon** closes that gap with a local, free, MCP-native pipeline:
32
+
33
+ 1. Your agent calls one tool with a **public Instagram profile URL**.
34
+ 2. The server grabs the **latest 10 videos**, extracts audio, and transcribes every word with **OpenAI Whisper** — locally, no per-minute API fees.
35
+ 3. The agent gets back **structured JSON**: full transcripts plus mined hooks, CTAs, sentiment, keyword clusters, title ideas, and a cross-video strategy overview.
36
+
37
+ Built agent-tough: structured errors instead of exceptions, progress notifications, job queueing with hard timeouts, context-window-friendly response trimming, and a `check_health` tool so your agent can self-diagnose a broken install instead of hallucinating around it.
38
+
39
+ ## 🤖 Drop it into your agent stack
40
+
41
+ The server speaks **stdio and streamable-HTTP MCP**, so anything MCP-capable can use it. No MCP? There's a JSON-mode CLI any framework can shell out to.
42
+
43
+ ### ⚡ One command, no clone: `npx`
44
+
45
+ With Node 18+, Python 3.10+ (3.11 recommended), and `ffmpeg` installed:
46
+
47
+ ```bash
48
+ npx -y reelrecon
49
+ ```
50
+
51
+ That starts the MCP server on stdio. The first run provisions a private Python environment in `~/.reelrecon` (Whisper + friends — a few minutes and a few GB, once); every start after that is instant. One-off CLI runs work too:
52
+
53
+ ```bash
54
+ npx -y reelrecon transcribe "https://www.instagram.com/<username>/" --json
55
+ ```
56
+
57
+ > Already have Python + deps? Set `REELRECON_PYTHON=/path/to/python` to skip provisioning and use your own environment.
58
+ >
59
+ > Package not on npm yet in your region/registry? Run it straight from GitHub — same launcher: `npx -y github:4nw3rprod/IG-Content-Transcriber`
60
+
61
+ | Agent / Framework | Integration |
62
+ |---|---|
63
+ | **Claude Code** (CLI) | `claude mcp add reelrecon -- npx -y reelrecon` |
64
+ | **Claude Desktop** | `mcpServers` entry in config |
65
+ | **ChatGPT / Codex CLI** | `mcp_servers` entry in `~/.codex/config.toml` |
66
+ | **Gemini CLI** | `mcpServers` entry in `~/.gemini/settings.json` |
67
+ | **Cursor / Windsurf / Cline** | Standard MCP server config (stdio) |
68
+ | **OpenClaw, Hermes & other open agent frameworks** | Point the framework's MCP client at `npx -y reelrecon` (stdio) or the HTTP endpoint |
69
+ | **LangChain / CrewAI / custom loops** | Use an MCP adapter, or shell out to the CLI with `--json` |
70
+
71
+ <details>
72
+ <summary><b>Claude Code</b></summary>
73
+
74
+ ```bash
75
+ claude mcp add reelrecon -- npx -y reelrecon
76
+ ```
77
+ </details>
78
+
79
+ <details>
80
+ <summary><b>Claude Desktop / Cursor / most MCP clients</b></summary>
81
+
82
+ ```json
83
+ {
84
+ "mcpServers": {
85
+ "reelrecon": {
86
+ "command": "npx",
87
+ "args": ["-y", "reelrecon"]
88
+ }
89
+ }
90
+ }
91
+ ```
92
+ </details>
93
+
94
+ <details>
95
+ <summary><b>ChatGPT — Codex CLI</b> (<code>~/.codex/config.toml</code>)</summary>
96
+
97
+ ```toml
98
+ [mcp_servers.reelrecon]
99
+ command = "npx"
100
+ args = ["-y", "reelrecon"]
101
+ ```
102
+ </details>
103
+
104
+ <details>
105
+ <summary><b>Gemini CLI</b> (<code>~/.gemini/settings.json</code>)</summary>
106
+
107
+ ```json
108
+ {
109
+ "mcpServers": {
110
+ "reelrecon": {
111
+ "command": "npx",
112
+ "args": ["-y", "reelrecon"]
113
+ }
114
+ }
115
+ }
116
+ ```
117
+ </details>
118
+
119
+ <details>
120
+ <summary><b>Running from a local clone instead of npx</b></summary>
121
+
122
+ Clone the repo, install the Python deps ([Quick Start](#-quick-start)), then point your MCP client at the launcher script:
123
+
124
+ ```json
125
+ {
126
+ "mcpServers": {
127
+ "reelrecon": {
128
+ "command": "/absolute/path/to/ReelRecon/run_mcp_server.sh"
129
+ }
130
+ }
131
+ }
132
+ ```
133
+ </details>
134
+
135
+ <details>
136
+ <summary><b>HTTP transport</b> (for frameworks that prefer a URL — OpenClaw, Hermes, remote setups)</summary>
137
+
138
+ ```bash
139
+ ./run_mcp_server.sh --transport streamable-http --host 127.0.0.1 --port 8001
140
+ ```
141
+
142
+ Then point the client at `http://127.0.0.1:8001/mcp`.
143
+ </details>
144
+
145
+ <details>
146
+ <summary><b>No MCP? Shell out to the JSON CLI</b> (LangChain, CrewAI, cron jobs, anything)</summary>
147
+
148
+ ```bash
149
+ ./run_latest_reel_transcription.sh "https://www.instagram.com/<username>/" --json
150
+ ```
151
+
152
+ stdout is a single JSON object on success; non-zero exit + `{"status":"error","error":"..."}` on failure. Trivially parseable from any language.
153
+ </details>
154
+
155
+ **Then just prompt your agent:**
156
+
157
+ > *"Use reelrecon to transcribe the latest Reels from @competitor. Compare their hooks against my last 5 scripts and tell me what patterns I'm missing."*
158
+
159
+ ## 🔍 What your agent can do with it
160
+
161
+ Point any LLM at the structured output and competitive content research becomes a conversation:
162
+
163
+ - **🪝 Hook mining** — the opening line of a competitor's last 10 videos, side by side. Your agent extracts the formula.
164
+ - **📣 CTA patterns** — every "follow / comment / link in bio / DM me" detected and counted per batch.
165
+ - **🧬 Script structure** — full transcripts expose pacing: hook → context → payoff → CTA. Steal the skeleton, not the words.
166
+ - **🔑 Topic clusters** — recurring keywords across recent videos = a creator's actual content pillars.
167
+ - **📈 Trend triangulation** — run 3–5 competitors and let the LLM diff what they're all suddenly talking about.
168
+ - **♻️ Repurposing engine** — each video ships with ready-made content angles and title suggestions for your own spin.
169
+ - **🕵️ Scheduled watching** — pair with your agent's cron/loop feature: "check these 3 profiles every morning and brief me."
170
+
171
+ > **Fair use, please:** public profiles only (private accounts are detected and refused), and it's built for research and inspiration — study patterns, don't plagiarize scripts. Instagram may rate-limit anonymous requests; be a good citizen.
172
+
173
+ ## ⚙️ How it works
174
+
175
+ ```mermaid
176
+ flowchart LR
177
+ A["🤖 Agent / LLM<br/>MCP tool call"] --> B["📱 Public IG profile<br/>latest 10 videos"]
178
+ A --> C["🔗 Single video URL"]
179
+ A --> D["🎙️ Local audio file"]
180
+ B --> E["⬇️ yt-dlp<br/>audio extraction"]
181
+ C --> E
182
+ D --> F
183
+ E --> F["📝 Whisper<br/>local transcription"]
184
+ F --> G["🧠 AI insights<br/>hooks · CTAs · keywords"]
185
+ G --> H["📦 Structured JSON<br/>back to the agent"]
186
+ ```
187
+
188
+ ## 🚀 Quick Start
189
+
190
+ **Fastest path (no clone):** `npx -y reelrecon` — see [agent setup](#-drop-it-into-your-agent-stack) above.
191
+
192
+ **Manual setup — requirements:** Python 3.11+, `ffmpeg` on your PATH, network access.
193
+
194
+ ```bash
195
+ git clone https://github.com/4nw3rprod/ReelRecon.git
196
+ cd ReelRecon
197
+ python3.11 -m venv .venv
198
+ .venv/bin/pip install -r requirements.txt
199
+ ```
200
+
201
+ Optional (Groq-powered insights instead of the built-in heuristics):
202
+
203
+ ```bash
204
+ cp .env.example .env.local # then set GROQ_API_KEY
205
+ ```
206
+
207
+ Connect your agent ([see configs above](#-drop-it-into-your-agent-stack)), or run it by hand:
208
+
209
+ ```bash
210
+ # A competitor's latest 10 videos
211
+ ./run_latest_reel_transcription.sh "https://www.instagram.com/nike/" --json
212
+
213
+ # A single Reel, with model + language hints
214
+ ./run_latest_reel_transcription.sh "https://www.instagram.com/reel/<id>/" --json --model small --language en
215
+ ```
216
+
217
+ ## 🧰 MCP tool reference
218
+
219
+ | Tool | What it does |
220
+ |---|---|
221
+ | `transcribe_input` | Profile URL → latest 10 videos, or any single video URL yt-dlp supports |
222
+ | `transcribe_local_audio` | Transcribe a local audio file + generate insights |
223
+ | `list_recent_batches` | Browse saved runs |
224
+ | `read_batch_manifest` | Load a full batch result |
225
+ | `read_video_output` | Load one video's transcript + metadata |
226
+ | `check_health` | Self-diagnose ffmpeg/Whisper/yt-dlp, disk, and job status |
227
+
228
+ Resources: `reelrecon://server` · `reelrecon://recent-batches` · `reelrecon://manifest/{group}/{label}` · `reelrecon://transcript/{group}/{label}/{video_id}`
229
+
230
+ **The contract your agent can rely on:**
231
+
232
+ - Tools **never raise** for expected failures — every call returns `status: "ok"` or a structured error: `error_type` (`invalid_input`, `not_found`, `pipeline_error`, `dependency_error`, `server_busy`, `timeout`, …), a message, and a `hint` the agent can act on.
233
+ - **Progress streams** as MCP notifications during long batches.
234
+ - **Context-window friendly:** `include_transcript_text=false` or `max_transcript_chars=N` trims responses; full transcripts always stay on disk and behind resources.
235
+ - **Partial success:** in a 10-video batch, one broken video is recorded (`failed_videos`) instead of sinking the other nine.
236
+ - Jobs are **queued with hard timeouts**; limits are env-tunable (below).
237
+
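Because every tool call returns a `status` and, on failure, an `error_type`, a client can branch on the result without exception handling. The sketch below is one possible policy, not the server's behavior: the function name `next_action`, the returned action labels, and the choice of which error types to retry are all assumptions.

```python
from typing import Any, Dict

# Assumption: these error types are transient and worth retrying later.
RETRYABLE_ERROR_TYPES = {"server_busy", "timeout"}


def next_action(result: Dict[str, Any]) -> str:
    """Map a tool result to a hypothetical client-side action label."""
    if result.get("status") == "ok":
        return "use_result"
    error_type = result.get("error_type")
    if error_type in RETRYABLE_ERROR_TYPES:
        return "retry_later"
    if error_type == "dependency_error":
        # A broken install is what the `check_health` tool is meant to diagnose.
        return "run_check_health"
    return "report_error"
```

Surfacing the `hint` field to the agent alongside the chosen action keeps the server's own guidance in the loop.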
238
+ ## 📦 What comes out
239
+
240
+ ```text
241
+ outputs/
242
+ └── instagram_profiles/
243
+ └── nike/
244
+ ├── manifest.json ← batch result + AI overview
245
+ └── <video_id>/
246
+ ├── audio.mp3
247
+ ├── transcript.txt ← the gold
248
+ └── metadata.json ← caption, timestamps, insights
249
+ ```
250
+
251
+ ```jsonc
252
+ {
253
+ "status": "ok",
254
+ "input_kind": "instagram_profile",
255
+ "total_videos": 10,
256
+ "completed_videos": 10,
257
+ "videos": [
258
+ {
259
+ "title": "You don't need motivation…",
260
+ "transcript_text": "...",
261
+ "ai_insights": {
262
+ "hook": "You don't need motivation, you need a system.",
263
+ "cta": "follow",
264
+ "sentiment": "positive",
265
+ "keywords": ["system", "habits", "training"],
266
+ "title_suggestions": ["..."],
267
+ "content_angles": ["..."]
268
+ }
269
+ }
270
+ ],
271
+ "ai_overview": {
272
+ "recurring_keywords": ["..."],
273
+ "top_hooks": ["..."],
274
+ "cta_patterns": [["follow", 6], ["link in bio", 3]]
275
+ },
276
+ "manifest_file": "outputs/instagram_profiles/nike/manifest.json"
277
+ }
278
+ ```
279
+
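A short sketch of reading the sample result above: it collects each video's hook and picks the most frequent CTA from `ai_overview.cta_patterns`. The helper name `summarize_batch` is hypothetical, and the code assumes the field shapes shown in the sample (in particular that `cta_patterns` is a list of `[label, count]` pairs).

```python
from typing import Any, Dict, List, Tuple


def summarize_batch(manifest: Dict[str, Any]) -> Tuple[List[str], str]:
    """Return (hooks in video order, most frequent CTA label or "")."""
    hooks = []
    for video in manifest.get("videos", []):
        # Failed videos may carry no insights, so guard against a missing dict.
        insights = video.get("ai_insights") or {}
        if insights.get("hook"):
            hooks.append(insights["hook"])
    patterns = manifest.get("ai_overview", {}).get("cta_patterns", [])
    top_cta = max(patterns, key=lambda pair: pair[1])[0] if patterns else ""
    return hooks, top_cta
```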
280
+ ## 🖥️ The Dashboard (for humans)
281
+
282
+ Agents get MCP; you get a live dashboard:
283
+
284
+ ```bash
285
+ ./run_ui.sh
286
+ ```
287
+
288
+ Builds the frontend, picks an open localhost port, opens your browser. Paste a profile/Reel URL **or upload audio** (`mp3`, `wav`, `m4a`, `aac`, `flac`, `ogg`, `webm`), pick the Whisper model, watch live progress through every pipeline stage, and browse transcript + insight history.
289
+
290
+ ## 🎛️ Tuning
291
+
292
+ All optional, via environment variables:
293
+
294
+ | Variable | Default | Purpose |
295
+ |---|---|---|
296
+ | `GROQ_API_KEY` | — | Enables GroqCloud AI insights (heuristic fallback otherwise) |
297
+ | `REELRECON_OUTPUT_DIR` | `<repo>/outputs` | Where results are written |
298
+ | `REELRECON_JOB_TIMEOUT_SECONDS` | `3600` | Hard per-job timeout (MCP) |
299
+ | `REELRECON_QUEUE_TIMEOUT_SECONDS` | `900` | Max wait for a job slot (MCP) |
300
+ | `REELRECON_MAX_CONCURRENT_JOBS` | `1` | Parallel transcription jobs (MCP) |
301
+ | `REELRECON_MAX_UPLOAD_BYTES` | 2 GiB | Max local audio file size (MCP) |
302
+ | `REELRECON_EXTRA_MODELS` | — | Comma-separated extra Whisper model names to allow |
303
+ | `REELRECON_HTTP_TIMEOUT_SECONDS` | `30` | Instagram/Groq/yt-dlp socket timeout |
304
+ | `REELRECON_FETCH_RETRIES` | `3` | Instagram profile fetch attempts (with backoff) |
305
+
306
+ > Legacy `IG_TRANSCRIBER_*` variable names are still honored, so existing setups keep working.
307
+
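One way a setting lookup with a legacy fallback could work is sketched below. This is not the project's actual implementation: `read_int_setting` is a hypothetical helper, and both the precedence (new name first) and the idea that legacy names share the same suffix are assumptions.

```python
import os
from typing import Mapping, Optional


def read_int_setting(
    suffix: str,
    default: int,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Look up REELRECON_<suffix>, then IG_TRANSCRIBER_<suffix>, else default."""
    env = os.environ if env is None else env
    for prefix in ("REELRECON_", "IG_TRANSCRIBER_"):
        raw = env.get(prefix + suffix)
        if raw:
            try:
                return int(raw)
            except ValueError:
                # Unparseable value: fall through to the next name or the default.
                continue
    return default
```

For example, `read_int_setting("JOB_TIMEOUT_SECONDS", 3600)` would return the documented default when neither variable is set.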
308
+ **Whisper model cheat sheet:** `tiny` = fastest, `base` = default sweet spot, `small`/`medium` = better accuracy, `large-v3` = best (needs RAM/time).
309
+
310
+ ## ✅ Tests
311
+
312
+ The MCP server and pipeline helpers ship with a lightweight suite (no Whisper/torch download needed):
313
+
314
+ ```bash
315
+ .venv/bin/pip install pytest
316
+ .venv/bin/python -m pytest tests/ -q
317
+ ```
318
+
319
+ ## 📝 Good to know
320
+
321
+ - **Public profiles only** — private accounts are detected and refused.
322
+ - Instagram may rate-limit anonymous requests; the tool retries with backoff, but if it's blocked, wait and rerun.
323
+ - Whisper models are cached after first load; already-transcribed videos are reused on reruns.
324
+ - Everything runs locally. The only network calls are to Instagram/video hosts, and (optionally) GroqCloud with your key.
325
+ - Agent-facing docs live in [`CLAUDE.md`](CLAUDE.md) — most MCP-aware coding agents pick it up automatically.
326
+
327
+ ---
328
+
329
+ <div align="center">
330
+
331
+ **Wiring this into your agent? ⭐ Star the repo — it's free and it helps others find it.**
332
+
333
+ *Built with Whisper, yt-dlp, FastAPI, React + shadcn/ui, and the Model Context Protocol.*
334
+
335
+ </div>
@@ -0,0 +1,182 @@
1
+ #!/usr/bin/env node
2
+ 'use strict';
3
+
4
+ /*
5
+ * ReelRecon npx launcher.
6
+ *
7
+ * Finds a suitable Python, provisions a private virtualenv under
8
+ * ~/.reelrecon on first run, then hands stdio over to the Python MCP
9
+ * server (or the transcribe CLI). Everything the launcher prints goes
10
+ * to stderr: when an MCP client spawns us, stdout belongs to the
11
+ * protocol and must stay clean.
12
+ *
13
+ * Environment:
14
+ * REELRECON_HOME where the venv lives (default: ~/.reelrecon)
15
+ * REELRECON_PYTHON bring-your-own interpreter with deps already
16
+ * installed; skips venv provisioning entirely
17
+ */
18
+
19
+ const { spawn, spawnSync } = require('node:child_process');
20
+ const crypto = require('node:crypto');
21
+ const fs = require('node:fs');
22
+ const os = require('node:os');
23
+ const path = require('node:path');
24
+
25
+ const MIN_PYTHON = [3, 10];
26
+ const PREFERRED_PYTHONS = ['python3.11', 'python3.12', 'python3.13', 'python3.10', 'python3', 'python'];
27
+
28
+ const packageRoot = path.resolve(__dirname, '..');
29
+ const requirementsFile = path.join(packageRoot, 'requirements.txt');
30
+ const isWindows = process.platform === 'win32';
31
+
32
+ function log(message) {
33
+ process.stderr.write(`[reelrecon] ${message}\n`);
34
+ }
35
+
36
+ function fail(message) {
37
+ log(`ERROR: ${message}`);
38
+ process.exit(1);
39
+ }
40
+
41
+ function pythonVersion(command) {
42
+ const result = spawnSync(command, ['-c', 'import sys; print("%d.%d" % sys.version_info[:2])'], {
43
+ encoding: 'utf-8',
44
+ stdio: ['ignore', 'pipe', 'ignore'],
45
+ });
46
+ if (result.status !== 0 || !result.stdout) {
47
+ return null;
48
+ }
49
+ const [major, minor] = result.stdout.trim().split('.').map(Number);
50
+ if (!Number.isInteger(major) || !Number.isInteger(minor)) {
51
+ return null;
52
+ }
53
+ return [major, minor];
54
+ }
55
+
56
+ function versionOk(version) {
57
+ if (!version) return false;
58
+ const [major, minor] = version;
59
+ return major > MIN_PYTHON[0] || (major === MIN_PYTHON[0] && minor >= MIN_PYTHON[1]);
60
+ }
61
+
62
+ function findSystemPython() {
63
+ for (const candidate of PREFERRED_PYTHONS) {
64
+ if (versionOk(pythonVersion(candidate))) {
65
+ return candidate;
66
+ }
67
+ }
68
+ return null;
69
+ }
70
+
71
+ function venvPythonPath(venvDir) {
72
+ return isWindows ? path.join(venvDir, 'Scripts', 'python.exe') : path.join(venvDir, 'bin', 'python');
73
+ }
74
+
75
+ function installMarker(home) {
76
+ return path.join(home, '.install-marker');
77
+ }
78
+
79
+ function desiredMarker(basePython) {
80
+ const requirements = fs.readFileSync(requirementsFile, 'utf-8');
81
+ const version = pythonVersion(basePython) || [];
82
+ return crypto.createHash('sha256').update(`${version.join('.')}\n${requirements}`).digest('hex');
83
+ }
84
+
85
+ function run(command, args, description) {
86
+ // stdout is routed to stderr (fd 2): pip and venv chatter must never
87
+ // reach our stdout, which belongs to the MCP stdio framing.
88
+ const result = spawnSync(command, args, { stdio: ['ignore', 2, 2] });
89
+ if (result.error) {
90
+ fail(`${description} failed to start: ${result.error.message}`);
91
+ }
92
+ if (result.status !== 0) {
93
+ fail(`${description} failed with exit code ${result.status}.`);
94
+ }
95
+ }
96
+
97
+ function ensureVenv() {
98
+ const home = process.env.REELRECON_HOME || path.join(os.homedir(), '.reelrecon');
99
+ const venvDir = path.join(home, 'venv');
100
+ const venvPython = venvPythonPath(venvDir);
101
+
102
+ const basePython = findSystemPython();
103
+ if (!basePython) {
104
+ fail(
105
+ `No suitable Python found. ReelRecon needs Python >= ${MIN_PYTHON.join('.')} (3.11 recommended). ` +
106
+ 'Install it, or point REELRECON_PYTHON at an interpreter that already has the dependencies.'
107
+ );
108
+ }
109
+
110
+ const marker = desiredMarker(basePython);
111
+ const markerFile = installMarker(home);
112
+ if (fs.existsSync(venvPython) && fs.existsSync(markerFile) && fs.readFileSync(markerFile, 'utf-8') === marker) {
113
+ return venvPython;
114
+ }
115
+
116
+ log(`Setting up the ReelRecon Python environment in ${venvDir}`);
117
+ log('First run downloads Whisper/torch and friends — this can take a few minutes and a few GB.');
118
+ fs.mkdirSync(home, { recursive: true });
119
+ run(basePython, ['-m', 'venv', '--clear', venvDir], 'Creating the virtualenv');
120
+ run(venvPython, ['-m', 'pip', 'install', '--upgrade', 'pip', '--quiet'], 'Upgrading pip');
121
+ run(venvPython, ['-m', 'pip', 'install', '-r', requirementsFile], 'Installing Python dependencies');
122
+ fs.writeFileSync(markerFile, marker);
123
+ log('Environment ready.');
124
+ return venvPython;
125
+ }
126
+
127
+ function resolvePython() {
128
+ const custom = process.env.REELRECON_PYTHON;
129
+ if (custom) {
130
+ if (!versionOk(pythonVersion(custom))) {
131
+ fail(`REELRECON_PYTHON (${custom}) is not a working Python >= ${MIN_PYTHON.join('.')}.`);
132
+ }
133
+ return custom;
134
+ }
135
+ return ensureVenv();
136
+ }
137
+
138
+ function warnIfNoFfmpeg() {
139
+ const probe = spawnSync('ffmpeg', ['-version'], { stdio: 'ignore' });
140
+ if (probe.error || probe.status !== 0) {
141
+ log('WARNING: ffmpeg was not found on PATH. Transcription will fail until it is installed.');
142
+ log(' Install it with e.g. `apt install ffmpeg` or `brew install ffmpeg`.');
143
+ }
144
+ }
145
+
146
+ function main() {
147
+ const args = process.argv.slice(2);
148
+
149
+ let script = path.join(packageRoot, 'mcp_server.py');
150
+ let scriptArgs = args;
151
+ if (args[0] === 'transcribe') {
152
+ script = path.join(packageRoot, 'transcribe_latest_reel.py');
153
+ scriptArgs = args.slice(1);
154
+ } else if (args[0] === '--version') {
155
+ const pkg = JSON.parse(fs.readFileSync(path.join(packageRoot, 'package.json'), 'utf-8'));
156
+ process.stdout.write(`${pkg.version}\n`);
157
+ return;
158
+ }
159
+
160
+ const python = resolvePython();
161
+ warnIfNoFfmpeg();
162
+
163
+ const child = spawn(python, [script, ...scriptArgs], {
164
+ stdio: 'inherit',
165
+ env: { ...process.env, PYTHONUNBUFFERED: '1' },
166
+ });
167
+
168
+ const forward = (signal) => {
169
+ if (!child.killed) {
170
+ child.kill(signal);
171
+ }
172
+ };
173
+ process.on('SIGINT', () => forward('SIGINT'));
174
+ process.on('SIGTERM', () => forward('SIGTERM'));
175
+
176
+ child.on('error', (error) => fail(`Failed to start Python: ${error.message}`));
177
+ child.on('exit', (code, signal) => {
178
+ process.exit(signal ? 1 : code ?? 0);
179
+ });
180
+ }
181
+
182
+ main();
@@ -0,0 +1,3 @@
1
+ from .pipeline import PipelineError, run_audio_file_transcription, run_transcription
2
+
3
+ __all__ = ["PipelineError", "run_transcription", "run_audio_file_transcription"]