reelrecon 1.2.0 → 1.2.1
- package/CLAUDE.md +1 -1
- package/README.md +37 -12
- package/ig_transcriber/pipeline.py +102 -16
- package/mcp_server.py +2 -1
- package/package.json +3 -3
package/CLAUDE.md
CHANGED
@@ -131,6 +131,6 @@ Human-readable error details are also written to stderr.
 
 - Public profiles only.
 - Local audio uploads bypass Instagram entirely.
-- Instagram may rate-limit anonymous requests.
+- Instagram may rate-limit anonymous requests. Public reels work anonymously; if Instagram login-walls a public link, set `REELRECON_COOKIES_FILE` (cookies.txt export) or `REELRECON_COOKIES_FROM_BROWSER=chrome` to use your own session.
 - The wrapper prefers Python 3.11 when available to avoid `yt-dlp` Python 3.9 deprecation noise.
 - The wrapper prefers the repo-local `.venv` first when present.
package/README.md
CHANGED
@@ -1,26 +1,26 @@
 <div align="center">
 
-
-
-### Reel reconnaissance for AI agents.
+<img src="https://raw.githubusercontent.com/4nw3rprod/ReelRecon/main/assets/hero.png" alt="ReelRecon - Reel reconnaissance for AI agents. Works with Claude, ChatGPT, Gemini, Hermes, OpenClaw, and any MCP-capable agent." width="820"/>
 
 **Transcribe and decode any public Instagram profile - hooks, CTAs, and script patterns - locally and for free.**
 
-
-
-[](https://www.python.org/)
+[](https://www.npmjs.com/package/reelrecon)
+[](https://www.python.org/)
 [](https://github.com/openai/whisper)
 [](https://modelcontextprotocol.io/)
-[](#)
+[](LICENSE)
 [](#)
 
+```bash
+npx -y reelrecon
+```
+
 *Your agent can already write scripts. Now it can study the competition first:*
 *"Transcribe @competitor's latest 10 Reels and break down their hook formulas" - one tool call away.*
 
 [🤖 Agent Setup](#-drop-it-into-your-agent-stack) · [Quick Start](#-quick-start) · [Use Cases](#-what-your-agent-can-do-with-it) · [🧰 Tool Reference](#-mcp-tool-reference) · [🖥️ Web UI](#️-the-dashboard-for-humans)
 
-<img src="screen.png" alt="ReelRecon dashboard" width="850"/>
+<img src="https://raw.githubusercontent.com/4nw3rprod/ReelRecon/main/screen.png" alt="ReelRecon dashboard" width="850"/>
 
 </div>
 
@@ -28,12 +28,14 @@
 
 ## 🎯 Why this exists
 
-
+Today, analyzing a competitor's Reels means either paying a per-minute transcription SaaS, or uploading videos one by one to a multimodal model and burning tokens while it watches. **ReelRecon is the third option: free, open source, and local** - an MCP-native pipeline built for the part that actually matters for content strategy, the spoken word:
 
 1. Your agent calls one tool with a **public Instagram profile URL**.
-2. The server grabs the **latest 10 videos**, extracts audio, and transcribes every word with **OpenAI Whisper** - locally, no per-minute
+2. The server grabs the **latest 10 videos**, extracts audio, and transcribes every word with **OpenAI Whisper** - locally. No subscriptions, no per-minute fees, no tokens spent on video frames.
 3. The agent gets back **structured JSON**: full transcripts plus mined hooks, CTAs, sentiment, keyword clusters, title ideas, and a cross-video strategy overview.
 
+ReelRecon doesn't analyze visuals - scripts, hooks, and CTAs live in the audio, and that's what it mines for patterns and content ideas. It pairs perfectly with video-capable models: triage all ten Reels here for free in minutes, then send only the one or two that matter to a multimodal model for full visual breakdown.
+
 Built agent-tough: structured errors instead of exceptions, progress notifications, job queueing with hard timeouts, context-window-friendly response trimming, and a `check_health` tool so your agent can self-diagnose a broken install instead of hallucinating around it.
 
 ## 🤖 Drop it into your agent stack
@@ -56,7 +58,27 @@ npx -y reelrecon transcribe "https://www.instagram.com/<username>/" --json
 
 > Already have Python + deps? Set `REELRECON_PYTHON=/path/to/python` to skip provisioning and use your own environment.
 >
-> Package not on npm yet in your region/registry? Run it straight from GitHub - same launcher: `npx -y github:4nw3rprod/
+> Package not on npm yet in your region/registry? Run it straight from GitHub - same launcher: `npx -y github:4nw3rprod/ReelRecon`
+
+### Upgrading
+
+- **npx users:** pin `reelrecon@latest` in your config (as in the snippets below) and every server start runs the newest published version. If plain `npx -y reelrecon` keeps serving you a stale cached copy, run `npx -y reelrecon@latest` once or clear the cache with `npm cache clean --force`.
+- **GitHub-direct:** `npx -y github:4nw3rprod/ReelRecon` always runs the latest `main` - no npm release needed.
+- **Local clone:** `git pull`. That's it - the private Python env in `~/.reelrecon` is reused automatically and only reinstalls when `requirements.txt` changes.
+
+If Instagram login-walled you on a public reel, upgrade to **v1.2.1+** and (optionally) hand the server your own session in the MCP config:
+
+```json
+{
+  "mcpServers": {
+    "reelrecon": {
+      "command": "npx",
+      "args": ["-y", "reelrecon@latest"],
+      "env": { "REELRECON_COOKIES_FILE": "/absolute/path/to/cookies.txt" }
+    }
+  }
+}
+```
 
 | Agent / Framework | Integration |
 |---|---|
@@ -300,6 +322,8 @@ All optional, via environment variables:
 | `REELRECON_MAX_CONCURRENT_JOBS` | `1` | Parallel transcription jobs (MCP) |
 | `REELRECON_MAX_UPLOAD_BYTES` | 2 GiB | Max local audio file size (MCP) |
 | `REELRECON_EXTRA_MODELS` | - | Comma-separated extra Whisper model names to allow |
+| `REELRECON_COOKIES_FILE` | - | Path to a `cookies.txt` export of your own Instagram session (fallback when Instagram login-walls anonymous access) |
+| `REELRECON_COOKIES_FROM_BROWSER` | - | Read your session straight from a browser, e.g. `chrome` or `firefox:ProfileName` |
 | `REELRECON_HTTP_TIMEOUT_SECONDS` | `30` | Instagram/Groq/yt-dlp socket timeout |
 | `REELRECON_FETCH_RETRIES` | `3` | Instagram profile fetch attempts (with backoff) |
 
@@ -320,6 +344,7 @@ The MCP server and pipeline helpers ship with a lightweight suite (no Whisper/to
 
 - **Public profiles only** - private accounts are detected and refused.
 - Instagram may rate-limit anonymous requests; the tool retries with backoff, but if it's blocked, wait and rerun.
+- **Hitting Instagram's login wall on a public reel?** Public videos normally work anonymously (the same access you get after dismissing the login popup in a browser). If Instagram keeps refusing, supply your own logged-in session: set `REELRECON_COOKIES_FILE` to a `cookies.txt` export, or `REELRECON_COOKIES_FROM_BROWSER=chrome`. Your session, your account, your responsibility - keep it to research-scale use.
 - Whisper models are cached after first load; already-transcribed videos are reused on reruns.
 - Everything runs locally. The only network calls are to Instagram/video hosts, and (optionally) GroqCloud with your key.
 - Agent-facing docs live in [`CLAUDE.md`](CLAUDE.md) - most MCP-aware coding agents pick it up automatically.
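The README's Upgrading section hands the server a session through the `env` block of the MCP config. As a minimal sketch of generating that config programmatically, the helper below builds the same shape for either variable. `build_mcp_config` is a hypothetical name, not part of ReelRecon. Preferring the cookies file over the browser spec is an assumption, chosen to mirror the precedence in the pipeline's yt-dlp options.

```python
import json
from typing import Dict, Optional


def build_mcp_config(
    cookies_file: Optional[str] = None,
    cookies_from_browser: Optional[str] = None,
) -> Dict[str, object]:
    """Return an MCP config dict shaped like the README snippet (hypothetical helper)."""
    env: Dict[str, str] = {}
    if cookies_file:
        # Assumption: a cookies file takes precedence over a browser spec.
        env["REELRECON_COOKIES_FILE"] = cookies_file
    elif cookies_from_browser:
        env["REELRECON_COOKIES_FROM_BROWSER"] = cookies_from_browser

    server: Dict[str, object] = {"command": "npx", "args": ["-y", "reelrecon@latest"]}
    if env:
        # Anonymous access needs no "env" block at all.
        server["env"] = env
    return {"mcpServers": {"reelrecon": server}}


if __name__ == "__main__":
    print(json.dumps(build_mcp_config(cookies_from_browser="chrome"), indent=2))
```

Leaving both arguments unset produces the plain anonymous configuration, which the README describes as the normal case for public reels.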
package/ig_transcriber/pipeline.py
CHANGED
@@ -13,6 +13,7 @@ from dataclasses import dataclass
 from datetime import datetime, timezone
 from functools import lru_cache
 from hashlib import sha1
+from http.cookiejar import MozillaCookieJar
 from pathlib import Path
 from typing import Any, Callable, Dict, Iterable, Optional
 from urllib.error import HTTPError, URLError
@@ -23,10 +24,18 @@ warnings.filterwarnings("ignore", message="urllib3 v2 only supports OpenSSL 1.1.
 warnings.filterwarnings("ignore", message="Support for Python version 3.9 has been deprecated.*")
 
 
-def
+def _env_str(name: str) -> Optional[str]:
     # REELRECON_* is the primary prefix; the legacy IG_TRANSCRIBER_* prefix
     # remains supported so existing setups keep working after the rename.
-
+    for key in (f"REELRECON_{name}", f"IG_TRANSCRIBER_{name}"):
+        value = os.environ.get(key)
+        if value and value.strip():
+            return value.strip()
+    return None
+
+
+def _env_int(name: str, default: int, *, minimum: int = 0) -> int:
+    raw = _env_str(name) or default
     try:
         return max(int(raw), minimum)
     except (TypeError, ValueError):
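The prefix fallback in this hunk can be tried in isolation. The sketch below restates `_env_str` and `_env_int` with an injectable `environ` mapping, which is an addition for testability and not in the diff. The body of the `except` branch is also an assumption, since the hunk ends before it.

```python
import os
from typing import Mapping, Optional


def env_str(name: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Look up REELRECON_<name> first, then the legacy IG_TRANSCRIBER_<name>."""
    for key in (f"REELRECON_{name}", f"IG_TRANSCRIBER_{name}"):
        value = environ.get(key)
        if value and value.strip():
            return value.strip()
    return None


def env_int(
    name: str,
    default: int,
    *,
    minimum: int = 0,
    environ: Mapping[str, str] = os.environ,
) -> int:
    raw = env_str(name, environ) or default
    try:
        return max(int(raw), minimum)
    except (TypeError, ValueError):
        # Assumption: the diff does not show this branch; falling back to
        # the default is a guess at the intended behavior.
        return max(default, minimum)
```

A blank or whitespace-only value under the primary prefix is treated as unset, so the lookup continues to the legacy prefix before giving up.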
@@ -269,21 +278,35 @@ def detect_input_kind(input_url: str) -> tuple[str, str]:
     return "video", input_url
 
 
+def _instagram_cookie_header() -> Optional[str]:
+    cookies_file, _ = cookie_settings()
+    if not cookies_file:
+        return None
+    jar = MozillaCookieJar()
+    try:
+        jar.load(cookies_file, ignore_discard=True, ignore_expires=True)
+    except Exception:
+        return None
+    pairs = [f"{cookie.name}={cookie.value}" for cookie in jar if "instagram.com" in (cookie.domain or "")]
+    return "; ".join(pairs) if pairs else None
+
+
 def fetch_profile(username: str, canonical_url: str) -> Dict[str, Any]:
     api_url = f"https://www.instagram.com/api/v1/users/web_profile_info/?username={username}"
+    cookie_header = _instagram_cookie_header()
 
     payload: Optional[Dict[str, Any]] = None
     last_error: Optional[PipelineError] = None
     for attempt in range(1, FETCH_RETRY_ATTEMPTS + 1):
-
-
-
-
-
-
-
-
-        )
+        headers = {
+            "User-Agent": "Mozilla/5.0",
+            "x-ig-app-id": INSTAGRAM_APP_ID,
+            "Referer": canonical_url,
+            "Accept": "application/json",
+        }
+        if cookie_header:
+            headers["Cookie"] = cookie_header
+        request = Request(api_url, headers=headers)
         try:
             with urlopen(request, timeout=DEFAULT_TIMEOUT_SECONDS) as response:
                 payload = json.load(response)
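To see what the new cookie path does without touching Instagram, the following sketch loads a Netscape-format `cookies.txt` with `MozillaCookieJar`, filters it to Instagram domains, and attaches the result to a `urllib` request. The sample file contents, the fake cookie values, and the function names are illustrative assumptions. The `x-ig-app-id` header is left out because the diff does not show the value of `INSTAGRAM_APP_ID`.

```python
import tempfile
from http.cookiejar import MozillaCookieJar
from pathlib import Path
from typing import Optional
from urllib.request import Request


def instagram_cookie_header(cookies_path: str) -> Optional[str]:
    """Build a Cookie header from the Instagram entries of a cookies.txt file."""
    jar = MozillaCookieJar()
    try:
        jar.load(cookies_path, ignore_discard=True, ignore_expires=True)
    except Exception:
        # Unreadable or malformed file: fall back to anonymous access.
        return None
    pairs = [f"{c.name}={c.value}" for c in jar if "instagram.com" in (c.domain or "")]
    return "; ".join(pairs) if pairs else None


def build_profile_request(api_url: str, referer: str, cookie_header: Optional[str]) -> Request:
    headers = {"User-Agent": "Mozilla/5.0", "Referer": referer, "Accept": "application/json"}
    if cookie_header:
        headers["Cookie"] = cookie_header
    return Request(api_url, headers=headers)


# Illustrative file: tab-separated fields are domain, subdomain flag, path,
# secure flag, expiry (epoch seconds), name, value. All values are fake.
SAMPLE = (
    "# Netscape HTTP Cookie File\n"
    ".instagram.com\tTRUE\t/\tTRUE\t4102444800\tsessionid\tFAKE-SESSION\n"
    ".example.com\tTRUE\t/\tFALSE\t4102444800\tother\tignored\n"
)


def demo() -> Optional[str]:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "cookies.txt"
        path.write_text(SAMPLE, encoding="utf-8")
        return instagram_cookie_header(str(path))
```

One design point worth noting: the domain filter is a substring test, so it would also accept a domain that merely contains `instagram.com`. A stricter suffix check may be preferable if the cookies file comes from an untrusted source.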
@@ -294,7 +317,9 @@ def fetch_profile(username: str, canonical_url: str) -> Dict[str, Any]:
                 raise PipelineError(f"Instagram profile not found: {canonical_url}") from exc
             if exc.code in {401, 403}:
                 raise PipelineError(
-                    "Instagram blocked the profile lookup.
+                    "Instagram blocked the anonymous profile lookup. Only public profiles are supported. "
+                    "If this profile is public and the block persists, supply your own logged-in session via "
+                    "REELRECON_COOKIES_FILE (a cookies.txt export) and retry."
                 ) from exc
             if exc.code == 429:
                 last_error = PipelineError(
@@ -405,8 +430,56 @@ def collect_instagram_profile_videos(canonical_url: str) -> list[VideoCandidate]
     ]
 
 
+def cookie_settings() -> tuple[Optional[str], Optional[str]]:
+    """Optional own-session cookies for when Instagram hard-walls anonymous access.
+
+    Public reels normally work anonymously (same as playing the video after
+    dismissing the login popup in a browser). When Instagram rate-limits or
+    login-walls anonymous requests, users can supply their own logged-in
+    session: REELRECON_COOKIES_FILE (a Netscape cookies.txt export) or
+    REELRECON_COOKIES_FROM_BROWSER (e.g. "chrome", "firefox:ProfileName").
+    """
+    cookies_file = _env_str("COOKIES_FILE")
+    if cookies_file:
+        path = Path(cookies_file).expanduser()
+        if not path.is_file():
+            raise PipelineError(
+                f"REELRECON_COOKIES_FILE points to a missing file: {path}. "
+                "Export a cookies.txt from your logged-in browser (e.g. the 'Get cookies.txt' extension) "
+                "or unset the variable to use anonymous access."
+            )
+        cookies_file = str(path)
+    return cookies_file, _env_str("COOKIES_FROM_BROWSER")
+
+
+_LOGIN_WALL_MARKERS = (
+    "login required",
+    "log in",
+    "login",
+    "rate-limit",
+    "rate limit",
+    "requested content is not available",
+    "checkpoint required",
+    "checkpoint_required",
+)
+
+
+def _download_error(exc: Exception, target_url: str, action: str) -> PipelineError:
+    text = str(exc).strip()
+    lowered = text.lower()
+    if any(marker in lowered for marker in _LOGIN_WALL_MARKERS):
+        return PipelineError(
+            f"Instagram refused anonymous access while {action} {target_url}: {text} - "
+            "this usually means the video is private/removed, or Instagram is rate-limiting anonymous requests. "
+            "If the link plays in a private browser window (after dismissing the login popup), wait a few minutes and retry, "
+            "or supply your own logged-in session: set REELRECON_COOKIES_FILE to a cookies.txt export, or "
+            "REELRECON_COOKIES_FROM_BROWSER=chrome (also: firefox, edge, safari, brave)."
+        )
+    return PipelineError(f"Failed while {action} {target_url}: {text}")
+
+
 def _yt_dlp_base_options() -> Dict[str, Any]:
-
+    options: Dict[str, Any] = {
         "quiet": True,
         "no_warnings": True,
         "noprogress": True,
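The classification step in `_download_error` is a case-insensitive substring scan over the exception text. A small standalone sketch of that check follows. The marker list is copied from the hunk, while the function name and the sample messages in the usage note are made up for illustration and are not real yt-dlp output.

```python
from typing import Tuple

# Copied from the _LOGIN_WALL_MARKERS tuple in the hunk above.
LOGIN_WALL_MARKERS: Tuple[str, ...] = (
    "login required",
    "log in",
    "login",
    "rate-limit",
    "rate limit",
    "requested content is not available",
    "checkpoint required",
    "checkpoint_required",
)


def looks_like_login_wall(message: str) -> bool:
    """Return True when the error text contains any login-wall marker."""
    lowered = message.strip().lower()
    return any(marker in lowered for marker in LOGIN_WALL_MARKERS)
```

For example, `looks_like_login_wall("ERROR: Login required")` is true and `looks_like_login_wall("HTTP Error 404: Not Found")` is false. Because `"login"` is itself a marker, any message containing that substring matches, which makes `"login required"` redundant and means a URL such as `/accounts/login/` in an unrelated error would also trigger the cookie hint.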
@@ -415,15 +488,26 @@ def _yt_dlp_base_options() -> Dict[str, Any]:
         "fragment_retries": 3,
         "extractor_retries": 2,
     }
+    cookies_file, cookies_browser = cookie_settings()
+    if cookies_file:
+        options["cookiefile"] = cookies_file
+    elif cookies_browser:
+        options["cookiesfrombrowser"] = tuple(
+            part.strip() for part in cookies_browser.split(":") if part.strip()
+        )
+    return options
 
 
 def _yt_dlp_extract_info(target_url: str) -> Dict[str, Any]:
     YoutubeDL = _import_yt_dlp()
+    options = _yt_dlp_base_options()
     try:
-        with YoutubeDL(
+        with YoutubeDL(options) as ydl:
             info = ydl.extract_info(target_url, download=False)
+    except PipelineError:
+        raise
     except Exception as exc:
-        raise
+        raise _download_error(exc, target_url, "inspecting") from exc
     if not isinstance(info, dict):
         raise PipelineError(f"Could not extract video information from URL: {target_url}")
     return info
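The way `REELRECON_COOKIES_FROM_BROWSER` becomes a yt-dlp option is easy to check on its own. This sketch reproduces the colon split and the file-over-browser precedence from the hunk. The function names are hypothetical, and nothing here imports or calls yt-dlp.

```python
from typing import Any, Dict, Optional, Tuple


def parse_browser_spec(spec: str) -> Tuple[str, ...]:
    """Split "browser:profile" into the tuple form used for cookiesfrombrowser."""
    return tuple(part.strip() for part in spec.split(":") if part.strip())


def cookie_options(cookies_file: Optional[str], cookies_browser: Optional[str]) -> Dict[str, Any]:
    """Return only the cookie-related yt-dlp options."""
    options: Dict[str, Any] = {}
    if cookies_file:
        # A cookies file wins; the browser spec is ignored when both are set.
        options["cookiefile"] = cookies_file
    elif cookies_browser:
        options["cookiesfrombrowser"] = parse_browser_spec(cookies_browser)
    return options
```

With this split, `"firefox:ProfileName"` becomes `("firefox", "ProfileName")`. One limitation of splitting on every colon is that a profile given as a Windows path containing a drive letter would be broken into extra parts.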
@@ -827,8 +911,10 @@ def download_audio(candidate: VideoCandidate, run_dir: Path) -> Path:
     try:
         with YoutubeDL(options) as ydl:
             info = ydl.extract_info(candidate.video_url, download=True)
+    except PipelineError:
+        raise
     except Exception as exc:
-        raise
+        raise _download_error(exc, candidate.video_url, "downloading audio for") from exc
     if not isinstance(info, dict) or not info.get("id"):
         raise PipelineError(f"yt-dlp did not return download metadata for {candidate.video_url}")
 
package/mcp_server.py
CHANGED
@@ -18,7 +18,7 @@ from mcp.server.fastmcp import Context, FastMCP
 from mcp.types import ToolAnnotations
 
 SERVER_NAME = "ReelRecon"
-SERVER_VERSION = "1.2.
+SERVER_VERSION = "1.2.1"
 
 logger = logging.getLogger("reelrecon.mcp")
 
@@ -936,6 +936,7 @@ def build_server(*, host: str, port: int, debug: bool) -> FastMCP:
             "output_root_writable": output_root_writable,
             "saved_batches": len(_recent_manifest_paths(MAX_LIST_LIMIT)),
             "groq_configured": bool(os.environ.get("GROQ_API_KEY")),
+            "instagram_cookies_configured": bool(_env("COOKIES_FILE") or _env("COOKIES_FROM_BROWSER")),
             "jobs": {
                 "active": _active_jobs,
                 "abandoned_after_timeout": _abandoned_jobs,
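The new `instagram_cookies_configured` flag gives an agent enough to decide what to try after a refusal. The sketch below is a hypothetical client-side helper, not part of ReelRecon. It assumes the health payload is a plain dict and keys off two phrases taken from the new error messages in `pipeline.py`.

```python
from typing import Any, Mapping, Optional

# Phrases taken from the error messages added in pipeline.py.
_REFUSAL_PHRASES = ("refused anonymous access", "blocked the anonymous profile lookup")


def next_step(health: Mapping[str, Any], error_text: str) -> Optional[str]:
    """Suggest a follow-up action after an Instagram refusal (hypothetical helper)."""
    lowered = error_text.lower()
    if not any(phrase in lowered for phrase in _REFUSAL_PHRASES):
        return None  # Not a login-wall style failure; nothing to suggest here.
    if health.get("instagram_cookies_configured"):
        return "Cookies are already configured; wait a few minutes and retry."
    return "Set REELRECON_COOKIES_FILE or REELRECON_COOKIES_FROM_BROWSER, then retry."
```

Because the flag is a boolean, the health payload reveals only whether a session is configured, not the cookie path or browser profile.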
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "reelrecon",
-  "version": "1.2.
+  "version": "1.2.1",
   "description": "Reel reconnaissance for AI agents - transcribe and decode public Instagram profiles via MCP. Whisper-powered, free, runs locally.",
   "bin": {
     "reelrecon": "bin/reelrecon.js"
@@ -20,9 +20,9 @@
   },
   "repository": {
     "type": "git",
-    "url": "git+https://github.com/4nw3rprod/
+    "url": "git+https://github.com/4nw3rprod/ReelRecon.git"
   },
-  "homepage": "https://github.com/4nw3rprod/
+  "homepage": "https://github.com/4nw3rprod/ReelRecon#readme",
   "keywords": [
     "mcp",
     "mcp-server",