reelrecon 1.2.0 → 1.2.1

package/CLAUDE.md CHANGED
@@ -131,6 +131,6 @@ Human-readable error details are also written to stderr.
 
 - Public profiles only.
 - Local audio uploads bypass Instagram entirely.
-- Instagram may rate-limit anonymous requests.
+- Instagram may rate-limit anonymous requests. Public reels work anonymously; if Instagram login-walls a public link, set `REELRECON_COOKIES_FILE` (cookies.txt export) or `REELRECON_COOKIES_FROM_BROWSER=chrome` to use your own session.
 - The wrapper prefers Python 3.11 when available to avoid `yt-dlp` Python 3.9 deprecation noise.
 - The wrapper prefers the repo-local `.venv` first when present.
package/README.md CHANGED
@@ -1,26 +1,26 @@
 <div align="center">
 
-# 🎬 ReelRecon
-
-### Reel reconnaissance for AI agents.
+<img src="https://raw.githubusercontent.com/4nw3rprod/ReelRecon/main/assets/hero.png" alt="ReelRecon — Reel reconnaissance for AI agents. Works with Claude, ChatGPT, Gemini, Hermes, OpenClaw, and any MCP-capable agent." width="820"/>
 
 **Transcribe and decode any public Instagram profile — hooks, CTAs, and script patterns — locally and for free.**
 
-**Give Claude, ChatGPT, Gemini, Hermes, OpenClaw — or any MCP-capable agent — the power to watch Instagram for you.**
-
-[![Python](https://img.shields.io/badge/python-3.11+-3776AB?logo=python&logoColor=white)](https://www.python.org/)
+[![npm](https://img.shields.io/npm/v/reelrecon?logo=npm&color=CB3837)](https://www.npmjs.com/package/reelrecon)
+[![Python](https://img.shields.io/badge/python-3.10+-3776AB?logo=python&logoColor=white)](https://www.python.org/)
 [![Whisper](https://img.shields.io/badge/transcription-OpenAI%20Whisper-74aa9c?logo=openai&logoColor=white)](https://github.com/openai/whisper)
 [![MCP](https://img.shields.io/badge/protocol-MCP%20native-8A2BE2)](https://modelcontextprotocol.io/)
-[![Agents](https://img.shields.io/badge/works%20with-Claude%20·%20ChatGPT%20·%20Gemini%20·%20Hermes%20·%20OpenClaw-blueviolet)](#-drop-it-into-your-agent-stack)
-[![Price](https://img.shields.io/badge/price-free-success)](#)
+[![License](https://img.shields.io/badge/license-MIT-success)](LICENSE)
 [![Privacy](https://img.shields.io/badge/runs-locally-orange)](#)
 
+```bash
+npx -y reelrecon
+```
+
 *Your agent can already write scripts. Now it can study the competition first:*
 *"Transcribe @competitor's latest 10 Reels and break down their hook formulas" — one tool call away.*
 
 [🤖 Agent Setup](#-drop-it-into-your-agent-stack) · [🚀 Quick Start](#-quick-start) · [🔍 Use Cases](#-what-your-agent-can-do-with-it) · [🧰 Tool Reference](#-mcp-tool-reference) · [🖥️ Web UI](#️-the-dashboard-for-humans)
 
-<img src="screen.png" alt="ReelRecon dashboard" width="850"/>
+<img src="https://raw.githubusercontent.com/4nw3rprod/ReelRecon/main/screen.png" alt="ReelRecon dashboard" width="850"/>
 
 </div>
 
@@ -28,12 +28,14 @@
 
 ## 🎯 Why this exists
 
-LLMs can't watch video. Agentic frameworks can browse, code, and write — but a Reel is a black box to them. **ReelRecon** closes that gap with a local, free, MCP-native pipeline:
+Today, analyzing a competitor's Reels means either paying a per-minute transcription SaaS, or uploading videos one by one to a multimodal model and burning tokens while it watches. **ReelRecon is the third option: free, open source, and local** — an MCP-native pipeline built for the part that actually matters for content strategy, the spoken word:
 
 1. Your agent calls one tool with a **public Instagram profile URL**.
-2. The server grabs the **latest 10 videos**, extracts audio, and transcribes every word with **OpenAI Whisper** — locally, no per-minute API fees.
+2. The server grabs the **latest 10 videos**, extracts audio, and transcribes every word with **OpenAI Whisper** — locally. No subscriptions, no per-minute fees, no tokens spent on video frames.
 3. The agent gets back **structured JSON**: full transcripts plus mined hooks, CTAs, sentiment, keyword clusters, title ideas, and a cross-video strategy overview.
 
+ReelRecon doesn't analyze visuals — scripts, hooks, and CTAs live in the audio, and that's what it mines for patterns and content ideas. It pairs perfectly with video-capable models: triage all ten Reels here for free in minutes, then send only the one or two that matter to a multimodal model for full visual breakdown.
+
 Built agent-tough: structured errors instead of exceptions, progress notifications, job queueing with hard timeouts, context-window-friendly response trimming, and a `check_health` tool so your agent can self-diagnose a broken install instead of hallucinating around it.
 
 ## 🤖 Drop it into your agent stack
@@ -56,7 +58,27 @@ npx -y reelrecon transcribe "https://www.instagram.com/<username>/" --json
 
 > Already have Python + deps? Set `REELRECON_PYTHON=/path/to/python` to skip provisioning and use your own environment.
 >
-> Package not on npm yet in your region/registry? Run it straight from GitHub — same launcher: `npx -y github:4nw3rprod/IG-Content-Transcriber`
+> Package not on npm yet in your region/registry? Run it straight from GitHub — same launcher: `npx -y github:4nw3rprod/ReelRecon`
+
+### 🔄 Upgrading
+
+- **npx users:** pin `reelrecon@latest` in your config (as in the snippets below) and every server start runs the newest published version. If plain `npx -y reelrecon` keeps serving you a stale cached copy, run `npx -y reelrecon@latest` once or clear the cache with `npm cache clean --force`.
+- **GitHub-direct:** `npx -y github:4nw3rprod/ReelRecon` always runs the latest `main` — no npm release needed.
+- **Local clone:** `git pull`. That's it — the private Python env in `~/.reelrecon` is reused automatically and only reinstalls when `requirements.txt` changes.
+
+If Instagram login-walled you on a public reel, upgrade to **v1.2.1+** and (optionally) hand the server your own session in the MCP config:
+
+```json
+{
+  "mcpServers": {
+    "reelrecon": {
+      "command": "npx",
+      "args": ["-y", "reelrecon@latest"],
+      "env": { "REELRECON_COOKIES_FILE": "/absolute/path/to/cookies.txt" }
+    }
+  }
+}
+```
 
 | Agent / Framework | Integration |
 |---|---|
@@ -300,6 +322,8 @@ All optional, via environment variables:
 | `REELRECON_MAX_CONCURRENT_JOBS` | `1` | Parallel transcription jobs (MCP) |
 | `REELRECON_MAX_UPLOAD_BYTES` | 2 GiB | Max local audio file size (MCP) |
 | `REELRECON_EXTRA_MODELS` | — | Comma-separated extra Whisper model names to allow |
+| `REELRECON_COOKIES_FILE` | — | Path to a `cookies.txt` export of your own Instagram session (fallback when Instagram login-walls anonymous access) |
+| `REELRECON_COOKIES_FROM_BROWSER` | — | Read your session straight from a browser, e.g. `chrome` or `firefox:ProfileName` |
 | `REELRECON_HTTP_TIMEOUT_SECONDS` | `30` | Instagram/Groq/yt-dlp socket timeout |
 | `REELRECON_FETCH_RETRIES` | `3` | Instagram profile fetch attempts (with backoff) |
 
@@ -320,6 +344,7 @@ The MCP server and pipeline helpers ship with a lightweight suite (no Whisper/to
 
 - **Public profiles only** — private accounts are detected and refused.
 - Instagram may rate-limit anonymous requests; the tool retries with backoff, but if it's blocked, wait and rerun.
+- **Hitting Instagram's login wall on a public reel?** Public videos normally work anonymously (the same access you get after dismissing the login popup in a browser). If Instagram keeps refusing, supply your own logged-in session: set `REELRECON_COOKIES_FILE` to a `cookies.txt` export, or `REELRECON_COOKIES_FROM_BROWSER=chrome`. Your session, your account, your responsibility — keep it to research-scale use.
 - Whisper models are cached after first load; already-transcribed videos are reused on reruns.
 - Everything runs locally. The only network calls are to Instagram/video hosts, and (optionally) GroqCloud with your key.
 - Agent-facing docs live in [`CLAUDE.md`](CLAUDE.md) — most MCP-aware coding agents pick it up automatically.
@@ -13,6 +13,7 @@ from dataclasses import dataclass
 from datetime import datetime, timezone
 from functools import lru_cache
 from hashlib import sha1
+from http.cookiejar import MozillaCookieJar
 from pathlib import Path
 from typing import Any, Callable, Dict, Iterable, Optional
 from urllib.error import HTTPError, URLError
@@ -23,10 +24,18 @@ warnings.filterwarnings("ignore", message="urllib3 v2 only supports OpenSSL 1.1.
 warnings.filterwarnings("ignore", message="Support for Python version 3.9 has been deprecated.*")
 
 
-def _env_int(name: str, default: int, *, minimum: int = 0) -> int:
+def _env_str(name: str) -> Optional[str]:
     # REELRECON_* is the primary prefix; the legacy IG_TRANSCRIBER_* prefix
     # remains supported so existing setups keep working after the rename.
-    raw = os.environ.get(f"REELRECON_{name}", os.environ.get(f"IG_TRANSCRIBER_{name}", default))
+    for key in (f"REELRECON_{name}", f"IG_TRANSCRIBER_{name}"):
+        value = os.environ.get(key)
+        if value and value.strip():
+            return value.strip()
+    return None
+
+
+def _env_int(name: str, default: int, *, minimum: int = 0) -> int:
+    raw = _env_str(name) or default
     try:
         return max(int(raw), minimum)
     except (TypeError, ValueError):
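
Reviewer note: the prefix fallback introduced in this hunk can be tried in isolation. The sketch below is a standalone approximation, not the package's code. `env_str` and `env_int` are hypothetical names that take an explicit mapping in place of `os.environ`, and the body of the `except` branch is an assumption, since the hunk ends before it.

```python
from typing import Mapping, Optional


def env_str(env: Mapping[str, str], name: str) -> Optional[str]:
    """Return the first non-blank value, preferring the REELRECON_ prefix."""
    for key in (f"REELRECON_{name}", f"IG_TRANSCRIBER_{name}"):
        value = env.get(key)
        if value and value.strip():
            return value.strip()
    return None


def env_int(env: Mapping[str, str], name: str, default: int, *, minimum: int = 0) -> int:
    """Parse an integer setting, clamped to `minimum`."""
    raw = env_str(env, name) or default
    try:
        return max(int(raw), minimum)
    except (TypeError, ValueError):
        # Assumed behavior: fall back to the default on unparsable input.
        return max(default, minimum)


# Three illustrative environments (values are made up for the example).
legacy_only = {"IG_TRANSCRIBER_FETCH_RETRIES": " 5 "}
both = {"REELRECON_FETCH_RETRIES": "2", "IG_TRANSCRIBER_FETCH_RETRIES": "5"}
blank_primary = {"REELRECON_FETCH_RETRIES": "   ", "IG_TRANSCRIBER_FETCH_RETRIES": "5"}

print(env_int(legacy_only, "FETCH_RETRIES", 3))
print(env_int(both, "FETCH_RETRIES", 3))
print(env_int(blank_primary, "FETCH_RETRIES", 3))
```

One behavioral difference from the removed one-liner is visible here: a primary variable that is set but blank no longer shadows the legacy variable, because blank values are now treated as unset.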
@@ -269,21 +278,35 @@ def detect_input_kind(input_url: str) -> tuple[str, str]:
     return "video", input_url
 
 
+def _instagram_cookie_header() -> Optional[str]:
+    cookies_file, _ = cookie_settings()
+    if not cookies_file:
+        return None
+    jar = MozillaCookieJar()
+    try:
+        jar.load(cookies_file, ignore_discard=True, ignore_expires=True)
+    except Exception:
+        return None
+    pairs = [f"{cookie.name}={cookie.value}" for cookie in jar if "instagram.com" in (cookie.domain or "")]
+    return "; ".join(pairs) if pairs else None
+
+
 def fetch_profile(username: str, canonical_url: str) -> Dict[str, Any]:
     api_url = f"https://www.instagram.com/api/v1/users/web_profile_info/?username={username}"
+    cookie_header = _instagram_cookie_header()
 
     payload: Optional[Dict[str, Any]] = None
     last_error: Optional[PipelineError] = None
     for attempt in range(1, FETCH_RETRY_ATTEMPTS + 1):
-        request = Request(
-            api_url,
-            headers={
-                "User-Agent": "Mozilla/5.0",
-                "x-ig-app-id": INSTAGRAM_APP_ID,
-                "Referer": canonical_url,
-                "Accept": "application/json",
-            },
-        )
+        headers = {
+            "User-Agent": "Mozilla/5.0",
+            "x-ig-app-id": INSTAGRAM_APP_ID,
+            "Referer": canonical_url,
+            "Accept": "application/json",
+        }
+        if cookie_header:
+            headers["Cookie"] = cookie_header
+        request = Request(api_url, headers=headers)
         try:
             with urlopen(request, timeout=DEFAULT_TIMEOUT_SECONDS) as response:
                 payload = json.load(response)
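
Reviewer note: the cookie handling above turns a Netscape-format `cookies.txt` into a single `Cookie` header and silently falls back to anonymous access when the file cannot be loaded. A minimal sketch of that idea using only the standard library follows. `cookie_header_from_file` is a hypothetical stand-in for `_instagram_cookie_header`, and the sample cookie names and values are invented for the example.

```python
import tempfile
from http.cookiejar import MozillaCookieJar
from pathlib import Path
from typing import Optional

# A tiny Netscape-format file: tab-separated fields, magic header on line 1.
SAMPLE_COOKIES = (
    "# Netscape HTTP Cookie File\n"
    ".instagram.com\tTRUE\t/\tTRUE\t2147483647\tsessionid\tabc123\n"
    ".instagram.com\tTRUE\t/\tTRUE\t2147483647\tcsrftoken\tdef456\n"
    "example.com\tFALSE\t/\tFALSE\t2147483647\tother\tignored\n"
)


def cookie_header_from_file(path: str) -> Optional[str]:
    """Build a Cookie header from instagram.com entries, or None on any failure."""
    jar = MozillaCookieJar()
    try:
        jar.load(path, ignore_discard=True, ignore_expires=True)
    except Exception:
        # Missing or malformed file: behave as if no cookies were configured.
        return None
    pairs = [f"{c.name}={c.value}" for c in jar if "instagram.com" in (c.domain or "")]
    return "; ".join(pairs) if pairs else None


with tempfile.TemporaryDirectory() as tmp:
    cookies_path = Path(tmp) / "cookies.txt"
    cookies_path.write_text(SAMPLE_COOKIES, encoding="utf-8")
    header = cookie_header_from_file(str(cookies_path))
    missing = cookie_header_from_file(str(Path(tmp) / "does-not-exist.txt"))

print(header)
print(missing)
```

Because load errors are swallowed, a malformed export degrades to an anonymous request rather than an error. In the diff, only a missing file is reported loudly, and that check lives in `cookie_settings()`.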
@@ -294,7 +317,9 @@ def fetch_profile(username: str, canonical_url: str) -> Dict[str, Any]:
                 raise PipelineError(f"Instagram profile not found: {canonical_url}") from exc
             if exc.code in {401, 403}:
                 raise PipelineError(
-                    "Instagram blocked the profile lookup. This pipeline currently supports public profiles only."
+                    "Instagram blocked the anonymous profile lookup. Only public profiles are supported. "
+                    "If this profile is public and the block persists, supply your own logged-in session via "
+                    "REELRECON_COOKIES_FILE (a cookies.txt export) and retry."
                 ) from exc
             if exc.code == 429:
                 last_error = PipelineError(
@@ -405,8 +430,56 @@ def collect_instagram_profile_videos(canonical_url: str) -> list[VideoCandidate]
     ]
 
 
+def cookie_settings() -> tuple[Optional[str], Optional[str]]:
+    """Optional own-session cookies for when Instagram hard-walls anonymous access.
+
+    Public reels normally work anonymously (same as playing the video after
+    dismissing the login popup in a browser). When Instagram rate-limits or
+    login-walls anonymous requests, users can supply their own logged-in
+    session: REELRECON_COOKIES_FILE (a Netscape cookies.txt export) or
+    REELRECON_COOKIES_FROM_BROWSER (e.g. "chrome", "firefox:ProfileName").
+    """
+    cookies_file = _env_str("COOKIES_FILE")
+    if cookies_file:
+        path = Path(cookies_file).expanduser()
+        if not path.is_file():
+            raise PipelineError(
+                f"REELRECON_COOKIES_FILE points to a missing file: {path}. "
+                "Export a cookies.txt from your logged-in browser (e.g. the 'Get cookies.txt' extension) "
+                "or unset the variable to use anonymous access."
+            )
+        cookies_file = str(path)
+    return cookies_file, _env_str("COOKIES_FROM_BROWSER")
+
+
+_LOGIN_WALL_MARKERS = (
+    "login required",
+    "log in",
+    "login",
+    "rate-limit",
+    "rate limit",
+    "requested content is not available",
+    "checkpoint required",
+    "checkpoint_required",
+)
+
+
+def _download_error(exc: Exception, target_url: str, action: str) -> PipelineError:
+    text = str(exc).strip()
+    lowered = text.lower()
+    if any(marker in lowered for marker in _LOGIN_WALL_MARKERS):
+        return PipelineError(
+            f"Instagram refused anonymous access while {action} {target_url}: {text} — "
+            "this usually means the video is private/removed, or Instagram is rate-limiting anonymous requests. "
+            "If the link plays in a private browser window (after dismissing the login popup), wait a few minutes and retry, "
+            "or supply your own logged-in session: set REELRECON_COOKIES_FILE to a cookies.txt export, or "
+            "REELRECON_COOKIES_FROM_BROWSER=chrome (also: firefox, edge, safari, brave)."
+        )
+    return PipelineError(f"Failed while {action} {target_url}: {text}")
+
+
 def _yt_dlp_base_options() -> Dict[str, Any]:
-    return {
+    options: Dict[str, Any] = {
         "quiet": True,
         "no_warnings": True,
         "noprogress": True,
@@ -415,15 +488,26 @@ def _yt_dlp_base_options() -> Dict[str, Any]:
         "fragment_retries": 3,
         "extractor_retries": 2,
     }
+    cookies_file, cookies_browser = cookie_settings()
+    if cookies_file:
+        options["cookiefile"] = cookies_file
+    elif cookies_browser:
+        options["cookiesfrombrowser"] = tuple(
+            part.strip() for part in cookies_browser.split(":") if part.strip()
+        )
+    return options
 
 
 def _yt_dlp_extract_info(target_url: str) -> Dict[str, Any]:
     YoutubeDL = _import_yt_dlp()
+    options = _yt_dlp_base_options()
     try:
-        with YoutubeDL(_yt_dlp_base_options()) as ydl:
+        with YoutubeDL(options) as ydl:
             info = ydl.extract_info(target_url, download=False)
+    except PipelineError:
+        raise
     except Exception as exc:
-        raise PipelineError(f"Failed to inspect video URL: {exc}") from exc
+        raise _download_error(exc, target_url, "inspecting") from exc
     if not isinstance(info, dict):
         raise PipelineError(f"Could not extract video information from URL: {target_url}")
     return info
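
Reviewer note: two small pieces of logic carry most of the new yt-dlp wiring, namely splitting the `REELRECON_COOKIES_FROM_BROWSER` value into the tuple stored under `cookiesfrombrowser`, and substring-matching error text against the login-wall markers. The sketch below reproduces both without importing yt-dlp. `browser_spec_to_tuple` and `looks_like_login_wall` are hypothetical names, and the sample error strings are illustrative, not captured yt-dlp output.

```python
from typing import Tuple

# Copied from the diff's _LOGIN_WALL_MARKERS for the purpose of this sketch.
LOGIN_WALL_MARKERS = (
    "login required",
    "log in",
    "login",
    "rate-limit",
    "rate limit",
    "requested content is not available",
    "checkpoint required",
    "checkpoint_required",
)


def browser_spec_to_tuple(spec: str) -> Tuple[str, ...]:
    """Split 'browser:profile' on every colon, dropping blank parts."""
    return tuple(part.strip() for part in spec.split(":") if part.strip())


def looks_like_login_wall(message: str) -> bool:
    """Case-insensitive substring match against the marker list."""
    lowered = message.strip().lower()
    return any(marker in lowered for marker in LOGIN_WALL_MARKERS)


print(browser_spec_to_tuple("chrome"))
print(browser_spec_to_tuple("firefox:ProfileName"))
print(looks_like_login_wall("ERROR: Login required to view this reel"))
print(looks_like_login_wall("HTTP Error 500: Internal Server Error"))
```

Two properties follow directly from the code and may be worth a second look in review. Splitting on every colon means a profile given as a Windows path (for example `firefox:C:\profiles\work`) would be broken into three parts. The bare `"login"` marker already covers `"login required"`, so the match is broad by design and may tag unrelated messages that merely mention a login.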
@@ -827,8 +911,10 @@ def download_audio(candidate: VideoCandidate, run_dir: Path) -> Path:
     try:
         with YoutubeDL(options) as ydl:
             info = ydl.extract_info(candidate.video_url, download=True)
+    except PipelineError:
+        raise
     except Exception as exc:
-        raise PipelineError(f"Failed to download audio for {candidate.video_url}: {exc}") from exc
+        raise _download_error(exc, candidate.video_url, "downloading audio for") from exc
     if not isinstance(info, dict) or not info.get("id"):
         raise PipelineError(f"yt-dlp did not return download metadata for {candidate.video_url}")
 
package/mcp_server.py CHANGED
@@ -18,7 +18,7 @@ from mcp.server.fastmcp import Context, FastMCP
 from mcp.types import ToolAnnotations
 
 SERVER_NAME = "ReelRecon"
-SERVER_VERSION = "1.2.0"
+SERVER_VERSION = "1.2.1"
 
 logger = logging.getLogger("reelrecon.mcp")
 
@@ -936,6 +936,7 @@ def build_server(*, host: str, port: int, debug: bool) -> FastMCP:
             "output_root_writable": output_root_writable,
             "saved_batches": len(_recent_manifest_paths(MAX_LIST_LIMIT)),
             "groq_configured": bool(os.environ.get("GROQ_API_KEY")),
+            "instagram_cookies_configured": bool(_env("COOKIES_FILE") or _env("COOKIES_FROM_BROWSER")),
             "jobs": {
                 "active": _active_jobs,
                 "abandoned_after_timeout": _abandoned_jobs,
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "reelrecon",
-  "version": "1.2.0",
+  "version": "1.2.1",
   "description": "Reel reconnaissance for AI agents — transcribe and decode public Instagram profiles via MCP. Whisper-powered, free, runs locally.",
   "bin": {
     "reelrecon": "bin/reelrecon.js"
@@ -20,9 +20,9 @@
   },
   "repository": {
     "type": "git",
-    "url": "git+https://github.com/4nw3rprod/IG-Content-Transcriber.git"
+    "url": "git+https://github.com/4nw3rprod/ReelRecon.git"
   },
-  "homepage": "https://github.com/4nw3rprod/IG-Content-Transcriber#readme",
+  "homepage": "https://github.com/4nw3rprod/ReelRecon#readme",
   "keywords": [
     "mcp",
     "mcp-server",