davinci-resolve-mcp 2.26.1 → 2.27.0

package/CHANGELOG.md CHANGED
@@ -2,6 +2,43 @@
2
2
 
3
3
  Release history for the DaVinci Resolve MCP Server. The latest release is summarized in the root README; older entries live here to keep the README focused.
4
4
 
5
+ ## What's New in v2.27.0
6
+
7
+ **Frame-sampling modes (issue #46)** — how many frames a clip gets for visual
8
+ analysis is now governed by a `sampling_mode`, decoupled from `depth` (which
9
+ still controls which layers run). A fixed frame count over-sampled short clips
10
+ and under-covered long ones; the demand-driven engine already scaled by
11
+ duration/content, but the caps layer was flat-truncating its output back to 8
12
+ frames — that flat cap was the real cause of long-clip under-coverage.
13
+
14
+ Four clearly-tiered modes, organized so token cost is predictable per tier:
15
+
16
+ - **Economy** (`fixed`) — flat N evenly-spaced, content-blind frames. Cheapest and
17
+ most predictable; good for proxies/triage.
18
+ - **Balanced** (`per_minute`) — `clamp(minutes × frames_per_minute, floor, ceiling)`
19
+ (defaults 4/min, 3–80). Cost is linear in footage length; content-blind.
20
+ - **Thorough** (`adaptive_capped`, recommended/default) — content-aware: samples
21
+ shot boundaries, representatives, and flash candidates, bounded to `[floor,
22
+ ceiling]`. Best coverage with a bounded cost.
23
+ - **Thorough (uncapped)** (`adaptive`) — content-aware with no per-clip ceiling
24
+ (up to the 512-frame hard cap). Use only when clips are short or few.
25
+
26
+ The first time you analyze without a saved default, the tool returns a
27
+ `confirmation_required` response with a `sampling_mode_prompt`; choosing a mode
28
+ saves it as your standing default (mirrors `timed_markers_default`). Pass
29
+ `sampling_mode` per call any time for a one-off that doesn't change the default.
30
+ Tunables (`frames_per_minute`, `frame_floor`, `frame_ceiling`) and the mode are
31
+ all exposed in the control panel (Preferences → Frame sampling mode) with a live
32
+ per-clip token-cost estimate; batch jobs honor the saved default.
33
+
34
+ Analysis-caps presets were retuned so `frames_per_clip` is now a *safety ceiling*
35
+ (minimal/standard/generous = 12/80/200), not the primary frame dial, and the
36
+ per-clip/job/day vision-token caps were raised so the default Thorough mode isn't
37
+ refused by the per-clip token cap. Cache reuse re-samples only when switching up
38
+ the thoroughness rank; a richer prior report still satisfies a cheaper mode. Adds
39
+ `tests/test_sampling_modes.py` (30 tests). Validated end-to-end on a synthetic
40
+ multi-shot clip with real ffmpeg frame extraction.
41
+
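The cache-reuse rule in the entry above ("re-samples only when switching up the thoroughness rank") can be read as a simple rank comparison. The sketch below is a minimal illustration, assuming the rank follows the tier order listed in the changelog (`fixed` < `per_minute` < `adaptive_capped` < `adaptive`). `THOROUGHNESS_RANK` and `needs_resample` are hypothetical names and do not appear in the diff.

```python
# Hypothetical rank table: assumes thoroughness follows the tier order in the changelog.
THOROUGHNESS_RANK = {
    "fixed": 0,            # Economy
    "per_minute": 1,       # Balanced
    "adaptive_capped": 2,  # Thorough
    "adaptive": 3,         # Thorough (uncapped)
}


def needs_resample(cached_mode: str, requested_mode: str) -> bool:
    """Return True only when the requested mode outranks the cached report's mode."""
    return THOROUGHNESS_RANK[requested_mode] > THOROUGHNESS_RANK[cached_mode]
```

Under this reading, a cached `adaptive` report would satisfy a later `per_minute` request, while a cached `fixed` report would not satisfy `adaptive_capped`.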
5
42
  ## What's New in v2.26.1
6
43
 
7
44
  **Python 3.13 / 3.14 support (issue #45)** — `npx davinci-resolve-mcp setup`
package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # DaVinci Resolve MCP Server
2
2
 
3
- [![Version](https://img.shields.io/badge/version-2.26.1-blue.svg)](https://github.com/samuelgursky/davinci-resolve-mcp/releases)
3
+ [![Version](https://img.shields.io/badge/version-2.27.0-blue.svg)](https://github.com/samuelgursky/davinci-resolve-mcp/releases)
4
4
  [![npm](https://img.shields.io/npm/v/davinci-resolve-mcp.svg?label=npm&color=CB3837)](https://www.npmjs.com/package/davinci-resolve-mcp)
5
5
  [![API Coverage](https://img.shields.io/badge/API%20Coverage-100%25-brightgreen.svg)](docs/reference/api-coverage.md)
6
6
  [![Tools](https://img.shields.io/badge/MCP%20Tools-32%20(329%20full)-blue.svg)](#server-modes)
package/docs/SKILL.md CHANGED
@@ -468,6 +468,16 @@ for existing reports. `quick` uses ffprobe metadata; `standard` adds ffmpeg
468
468
  read-through checks,
469
469
  cut-boundary analysis from full-stream scene detection, flash-frame candidates,
470
470
  motion/variance scoring, analysis keyframes, and sidecar reports.
471
+ `depth` controls which layers run; a separate `sampling_mode` controls how many
472
+ frames each clip gets for visual analysis (and thus token cost): `fixed`
473
+ (Economy, flat content-blind frames), `per_minute` (Balanced, frames scale with
474
+ duration), `adaptive_capped` (Thorough, content-aware bounded to
475
+ `[frame_floor, frame_ceiling]` — recommended/default), or `adaptive` (Thorough
476
+ uncapped). When no default is saved, the first analyze returns
477
+ `confirmation_required` with a `sampling_mode_prompt`; choosing a mode saves it
478
+ as the default. Pass `sampling_mode` per call for a one-off. The mode owns frame
479
+ count — `analysis_caps.frames_per_clip` is a safety ceiling above it, not the
480
+ primary dial.
471
481
  By default, planning checks the active project's analysis root and bounded
472
482
  related project-version roots for existing reports, then marks matching clips
473
483
  `skip_execution=true` when those reports already contain the requested
@@ -93,6 +93,33 @@ FFprobe is required. If missing:
93
93
  via host_chat_paths (finalized per clip with commit_vision, ~2-5 minutes per
94
94
  file plus host-chat read time)
95
95
 
96
+ `depth` controls *which layers run*. How many frames each clip gets for visual
97
+ analysis is a separate axis — the **sampling mode** — because a fixed frame count
98
+ over-samples short clips and under-covers long ones.
99
+
100
+ ### 3b. Frame-sampling mode
101
+
102
+ Pass `sampling_mode` on any analyze action, or set a standing default in the
103
+ control panel (Preferences → Frame sampling mode). The mode owns frame count and
104
+ thus token cost; `analysis_caps.frames_per_clip` is now a safety ceiling above it,
105
+ not the primary dial.
106
+
107
+ - **Economy** (`fixed`) — flat N frames (depth-derived, default 8) regardless of
108
+ clip length. Cheapest and most predictable; good for proxies/triage.
109
+ - **Balanced** (`per_minute`) — `frames = clamp(minutes × frames_per_minute, floor,
110
+ ceiling)` (defaults 4/min, 3–80). Cost is linear in footage length; content-blind.
111
+ - **Thorough** (`adaptive_capped`, **recommended**) — content-aware: samples shot
112
+ boundaries, representatives, and flash candidates, bounded to `[floor, ceiling]`
113
+ (3–80). Best coverage with a bounded cost.
114
+ - **Thorough (uncapped)** (`adaptive`) — content-aware with no per-clip ceiling
115
+ (up to the absolute 512-frame hard cap). Use only when clips are short or few.
116
+
117
+ Tunables (`frames_per_minute`, `frame_floor`, `frame_ceiling`) apply to Balanced
118
+ and Thorough. The first time you analyze without a saved default, the tool returns
119
+ a `confirmation_required` response with a `sampling_mode_prompt`; re-run with
120
+ `sampling_mode=<choice>` (which saves it as your default) or pick it in the panel.
121
+ Pass `sampling_mode` explicitly any time for a one-off that doesn't change the default.
122
+
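The Balanced formula above, `frames = clamp(minutes × frames_per_minute, floor, ceiling)`, can be worked through with a short sketch. This is an illustration only: `balanced_frame_count` is a hypothetical name, the defaults are the documented 4/min and 3–80, and rounding to the nearest whole frame is an assumption (the engine's exact rounding is not shown in this diff).

```python
def balanced_frame_count(
    duration_seconds: float,
    frames_per_minute: float = 4.0,
    frame_floor: int = 3,
    frame_ceiling: int = 80,
) -> int:
    """Content-blind frame count that scales linearly with clip duration."""
    minutes = duration_seconds / 60.0
    # Rounding to the nearest frame is an assumption made for this sketch.
    target = round(minutes * frames_per_minute)
    # Clamp the target into [frame_floor, frame_ceiling].
    return max(frame_floor, min(frame_ceiling, target))
```

With the defaults, a 15-second clip is lifted to the floor of 3 frames, a 10-minute clip gets 40, and a 30-minute clip is held at the ceiling of 80.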
96
123
  ---
97
124
 
98
125
  ## Analysis Commands
package/install.py CHANGED
@@ -35,7 +35,7 @@ from src.utils.update_check import (
35
35
 
36
36
  # ─── Version ──────────────────────────────────────────────────────────────────
37
37
 
38
- VERSION = "2.26.1"
38
+ VERSION = "2.27.0"
39
39
  # Only hard floor: mcp[cli] requires Python 3.10+. There is no upper bound —
40
40
  # Resolve's scripting bridge loads into newer interpreters on recent builds
41
41
  # (Python 3.14 verified against Resolve Studio 20.3.2). Older Resolve builds
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "davinci-resolve-mcp",
3
- "version": "2.26.1",
3
+ "version": "2.27.0",
4
4
  "description": "NPM bootstrapper for the DaVinci Resolve MCP Server.",
5
5
  "license": "MIT",
6
6
  "author": "Samuel Gursky <samgursky@gmail.com>",
@@ -4272,6 +4272,21 @@ HTML = r"""<!doctype html>
4272
4272
  </select>
4273
4273
  </label>
4274
4274
  <label>Default sample frames <input id="prefFrames" type="number" min="0" max="48" value="8"></label>
4275
+ <label>Frame sampling mode
4276
+ <select id="prefSamplingMode" onchange="updateSamplingModeHint()">
4277
+ <option value="ask">ask · choose on first analysis</option>
4278
+ <option value="fixed">Economy · flat frames, cheapest &amp; most predictable</option>
4279
+ <option value="per_minute">Balanced · frames scale with duration (linear cost)</option>
4280
+ <option value="adaptive_capped">Thorough · content-aware, bounded cost (recommended)</option>
4281
+ <option value="adaptive">Thorough (uncapped) · content-aware, up to 512 frames</option>
4282
+ </select>
4283
+ </label>
4284
+ <small class="pref-hint" id="samplingModeHint" style="display:block;margin:-4px 0 8px;opacity:0.75;"></small>
4285
+ <div class="pref-inline-row" style="display:flex;gap:10px;flex-wrap:wrap;">
4286
+ <label>Frames / minute <input id="prefSamplingRate" type="number" min="0.1" step="0.5" value="4" oninput="updateSamplingModeHint()"></label>
4287
+ <label>Frame floor <input id="prefSamplingFloor" type="number" min="1" value="3" oninput="updateSamplingModeHint()"></label>
4288
+ <label>Frame ceiling <input id="prefSamplingCeiling" type="number" min="1" value="80" oninput="updateSamplingModeHint()"></label>
4289
+ </div>
4275
4290
  <label>Persistence
4276
4291
  <select id="prefAnalysisPersistence">
4277
4292
  <option value="session_only">session only</option>
@@ -4755,6 +4770,10 @@ HTML = r"""<!doctype html>
4755
4770
  prefVisionDefault: 'Controls whether visual frame analysis is used by default when an operation supports it.',
4756
4771
  prefTranscriptionDefault: 'Sets the default answer for transcript generation on audio-bearing clips.',
4757
4772
  prefSlateDetectionDefault: 'Controls whether slate detection should run or ask before adding slate-informed context.',
4773
+ prefSamplingMode: 'Chooses how many frames each clip gets for visual analysis: Economy (flat), Balanced (scales with duration), or Thorough (content-aware, bounded). Drives both coverage and token cost.',
4774
+ prefSamplingRate: 'Frames sampled per minute in Balanced mode (also seeds Thorough on short clips).',
4775
+ prefSamplingFloor: 'Minimum frames per clip for duration/content-scaled modes.',
4776
+ prefSamplingCeiling: 'Maximum frames per clip for Balanced and Thorough modes (the Thorough per-clip cap).',
4758
4777
  prefAnalysisPersistence: 'Chooses whether analysis artifacts stay session-only or keep reusable reports and frames.',
4759
4778
  prefAnalysisSummaryStyle: 'Tunes the language of generated summaries for editorial, QC, producer, or full-detail review.',
4760
4779
  prefReportFormat: 'Chooses compact readable reports, full reports, or machine-readable output for downstream agents.',
@@ -9088,6 +9107,12 @@ HTML = r"""<!doctype html>
9088
9107
  setControlValue('prefSourceTrust', media.source_trust || 'auto');
9089
9108
  setControlValue('prefDepth', media.default_depth || 'standard');
9090
9109
  setControlValue('prefFrames', media.default_sample_frames ?? 8);
9110
+ // sampling_mode_default is null when unset → show "ask".
9111
+ setControlValue('prefSamplingMode', media.sampling_mode_default || 'ask');
9112
+ setControlValue('prefSamplingRate', media.sampling_frames_per_minute ?? 4);
9113
+ setControlValue('prefSamplingFloor', media.sampling_frame_floor ?? 3);
9114
+ setControlValue('prefSamplingCeiling', media.sampling_frame_ceiling ?? 80);
9115
+ updateSamplingModeHint();
9091
9116
  setControlValue('prefAnalysisPersistence', media.analysis_persistence);
9092
9117
  const legacySummaryMap = { assistant_editor: 'creative', producer: 'creative', qc: 'technical' };
9093
9118
  const summaryStyle = legacySummaryMap[media.analysis_summary_style] || media.analysis_summary_style || 'concise';
@@ -9182,6 +9207,38 @@ HTML = r"""<!doctype html>
9182
9207
  };
9183
9208
  }
9184
9209
 
9210
+ // Rough per-frame vision cost for the estimate (≈768px frame at typical
9211
+ // tokenization). The engine's pre-call refusal estimates more conservatively.
9212
+ const SAMPLING_TOKENS_PER_FRAME = 450;
9213
+ function _fmtTokens(frames) {
9214
+ const k = (frames * SAMPLING_TOKENS_PER_FRAME) / 1000;
9215
+ return k >= 1 ? `~${k.toFixed(k < 10 ? 1 : 0)}k tokens` : `~${Math.round(k * 1000)} tokens`;
9216
+ }
9217
+ function updateSamplingModeHint() {
9218
+ const hintEl = document.getElementById('samplingModeHint');
9219
+ if (!hintEl) return;
9220
+ const mode = ($('prefSamplingMode') || {}).value || 'ask';
9221
+ const rate = Number(($('prefSamplingRate') || {}).value) || 4;
9222
+ const floor = Number(($('prefSamplingFloor') || {}).value) || 3;
9223
+ const ceil = Number(($('prefSamplingCeiling') || {}).value) || 80;
9224
+ const fixed = Number(($('prefFrames') || {}).value) || 8;
9225
+ let msg = '';
9226
+ if (mode === 'ask') {
9227
+ msg = 'You will be asked to pick a mode the first time you analyze. Recommended: Thorough.';
9228
+ } else if (mode === 'fixed') {
9229
+ msg = `Economy — flat ${fixed} frames per clip regardless of length (${_fmtTokens(fixed)}/clip). Most predictable.`;
9230
+ } else if (mode === 'per_minute') {
9231
+ const oneMin = Math.max(floor, Math.min(ceil, Math.round(rate)));
9232
+ const tenMin = Math.max(floor, Math.min(ceil, Math.round(rate * 10)));
9233
+ msg = `Balanced — ${rate}/min, bounded ${floor}–${ceil}. ~${oneMin}f (${_fmtTokens(oneMin)}) for 1 min · ~${tenMin}f (${_fmtTokens(tenMin)}) for 10 min. Linear cost.`;
9234
+ } else if (mode === 'adaptive_capped') {
9235
+ msg = `Thorough — content-aware (shot boundaries + flashes), bounded ${floor}–${ceil} frames/clip (${_fmtTokens(floor)}–${_fmtTokens(ceil)}). Best coverage, bounded cost.`;
9236
+ } else if (mode === 'adaptive') {
9237
+ msg = `Thorough (uncapped) — content-aware, no per-clip ceiling (up to 512 frames, ${_fmtTokens(512)}). Use only for short/few clips.`;
9238
+ }
9239
+ hintEl.textContent = msg;
9240
+ }
9241
+
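For readers following the Python side of the package, the panel's token estimate can be restated as below. This is a Python port written for illustration, not code from the package: `format_token_estimate` is a hypothetical name, and the 450 tokens-per-frame figure is the rough constant from the JavaScript above. Decimal rounding may differ from JavaScript's `toFixed` in rare edge cases.

```python
SAMPLING_TOKENS_PER_FRAME = 450  # rough per-frame figure copied from the panel code


def format_token_estimate(frames: int) -> str:
    """Format an approximate vision-token cost for a given frame count."""
    tokens = frames * SAMPLING_TOKENS_PER_FRAME
    if tokens < 1000:
        # Below 1k tokens the panel shows the raw token count.
        return f"~{tokens} tokens"
    k = tokens / 1000
    # One decimal place below 10k, whole thousands from 10k upward.
    return f"~{k:.1f}k tokens" if k < 10 else f"~{k:.0f}k tokens"
```

At this rate the default Economy setting (8 frames) comes to roughly 3.6k tokens per clip, the default Thorough ceiling (80 frames) to roughly 36k, and the 512-frame hard cap to roughly 230k.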
9185
9242
  function setupPreferencePayload() {
9186
9243
  let markerColors = {};
9187
9244
  try {
@@ -9197,6 +9254,10 @@ HTML = r"""<!doctype html>
9197
9254
  source_trust: $('prefSourceTrust').value,
9198
9255
  default_depth: $('prefDepth').value,
9199
9256
  default_sample_frames: Number($('prefFrames').value || 8),
9257
+ sampling_mode_default: $('prefSamplingMode').value,
9258
+ sampling_frames_per_minute: Number($('prefSamplingRate').value || 4),
9259
+ sampling_frame_floor: Number($('prefSamplingFloor').value || 3),
9260
+ sampling_frame_ceiling: Number($('prefSamplingCeiling').value || 80),
9200
9261
  analysis_persistence: $('prefAnalysisPersistence').value,
9201
9262
  analysis_summary_style: $('prefAnalysisSummaryStyle').value,
9202
9263
  report_format: $('prefReportFormat').value,
@@ -13299,6 +13360,27 @@ class Handler(BaseHTTPRequestHandler):
13299
13360
  "cleanup_frames": True,
13300
13361
  "reuse_project_roots": self.state.related_project_roots(),
13301
13362
  }
13363
+ # Honor the saved frame-sampling mode (or an explicit per-job override)
13364
+ # so batch runs match the user's chosen coverage/cost. Falls back to the
13365
+ # recommended mode when the user hasn't set a default yet (batch jobs
13366
+ # shouldn't block on the first-run prompt).
13367
+ try:
13368
+ from src.server import (
13369
+ _media_analysis_effective_preferences as _ma_eff_prefs,
13370
+ )
13371
+ from src.utils import media_analysis as _ma_mod
13372
+ _ma_prefs = _ma_eff_prefs()
13373
+ params["sampling_mode"] = (
13374
+ body.get("sampling_mode")
13375
+ or _ma_prefs.get("sampling_mode_default")
13376
+ or _ma_mod.RECOMMENDED_SAMPLING_MODE
13377
+ )
13378
+ params["frames_per_minute"] = body.get("frames_per_minute") or _ma_prefs.get("sampling_frames_per_minute")
13379
+ params["frame_floor"] = body.get("frame_floor") or _ma_prefs.get("sampling_frame_floor")
13380
+ params["frame_ceiling"] = body.get("frame_ceiling") or _ma_prefs.get("sampling_frame_ceiling")
13381
+ except Exception:
13382
+ # Best-effort; the engine still applies its own defaults.
13383
+ pass
13302
13384
  with self.state.lock:
13303
13385
  created = create_batch_job_from_paths(
13304
13386
  project_name=self.state.project_name,
@@ -80,7 +80,7 @@ if not logging.getLogger().handlers:
80
80
  handlers=[logging.StreamHandler()],
81
81
  )
82
82
 
83
- VERSION = "2.26.1"
83
+ VERSION = "2.27.0"
84
84
  logger = logging.getLogger("davinci-resolve-mcp")
85
85
  logger.info(f"Starting DaVinci Resolve MCP Server v{VERSION}")
86
86
  logger.info(f"Detected platform: {get_platform()}")
package/src/server.py CHANGED
@@ -11,7 +11,7 @@ Usage:
11
11
  python src/server.py --full # Start the 329-tool granular server instead
12
12
  """
13
13
 
14
- VERSION = "2.26.1"
14
+ VERSION = "2.27.0"
15
15
 
16
16
  import base64
17
17
  import os
@@ -6312,6 +6312,18 @@ _MEDIA_ANALYSIS_DEFAULT_PREFS = {
6312
6312
  "source_trust": "auto",
6313
6313
  "default_depth": "standard",
6314
6314
  "default_sample_frames": 8,
6315
+ # Frame-sampling mode — how many frames a clip gets for visual analysis.
6316
+ # "ask" prompts the user to choose a standing default the first time they
6317
+ # analyze; the choice is then saved here. Canonical modes (see
6318
+ # media_analysis.SAMPLING_MODES): fixed (Economy), per_minute (Balanced),
6319
+ # adaptive_capped (Thorough, recommended), adaptive (Thorough uncapped).
6320
+ "sampling_mode_default": "ask",
6321
+ # Tunables shared by Balanced + Thorough modes. frames_per_minute drives the
6322
+ # Balanced target; frame_floor/frame_ceiling bound every duration/content
6323
+ # scaled mode (the ceiling is also the Thorough per-clip cap).
6324
+ "sampling_frames_per_minute": 4.0,
6325
+ "sampling_frame_floor": 3,
6326
+ "sampling_frame_ceiling": 80,
6315
6327
  "preferred_analysis_root": None,
6316
6328
  "preferred_generated_media_folder": None,
6317
6329
  "default_post_operation_page": "stay_put",
@@ -6580,6 +6592,29 @@ def _media_analysis_effective_preferences() -> Dict[str, Any]:
6580
6592
  except (TypeError, ValueError):
6581
6593
  sample_frames_int = 8
6582
6594
  effective["default_sample_frames"] = max(0, min(48, sample_frames_int))
6595
+ # sampling_mode_default normalizes to a canonical mode, or None when unset /
6596
+ # "ask" (None means "not yet chosen" → first-run prompt fires).
6597
+ effective["sampling_mode_default"] = _normalize_sampling_mode_default(effective.get("sampling_mode_default"))
6598
+
6599
+ def _pos_number(value: Any, fallback: float) -> float:
6600
+ try:
6601
+ f = float(value)
6602
+ except (TypeError, ValueError):
6603
+ return fallback
6604
+ return f if f > 0 else fallback
6605
+
6606
+ effective["sampling_frames_per_minute"] = _pos_number(
6607
+ effective.get("sampling_frames_per_minute"), _media_analysis_module.DEFAULT_FRAMES_PER_MINUTE
6608
+ )
6609
+ effective["sampling_frame_floor"] = int(_pos_number(
6610
+ effective.get("sampling_frame_floor"), _media_analysis_module.DEFAULT_FRAME_FLOOR
6611
+ ))
6612
+ ceiling = int(_pos_number(
6613
+ effective.get("sampling_frame_ceiling"), _media_analysis_module.DEFAULT_FRAME_CEILING
6614
+ ))
6615
+ if ceiling < effective["sampling_frame_floor"]:
6616
+ ceiling = effective["sampling_frame_floor"]
6617
+ effective["sampling_frame_ceiling"] = ceiling
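The effect of the normalization added in this hunk is easier to see on concrete inputs. The following standalone sketch mirrors the logic under stated assumptions: `positive_or` and `normalize_tunables` are hypothetical names, and the `DEFAULT_*` values are assumed to match the documented defaults (4/min, floor 3, ceiling 80), since the constants in `media_analysis` are not shown in this diff.

```python
from typing import Any, Dict

# Assumed to match the documented defaults; the real constants are not shown in the diff.
DEFAULT_FRAMES_PER_MINUTE = 4.0
DEFAULT_FRAME_FLOOR = 3
DEFAULT_FRAME_CEILING = 80


def positive_or(value: Any, fallback: float) -> float:
    """Return value as a float if it parses and is > 0, else the fallback."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return fallback
    return number if number > 0 else fallback


def normalize_tunables(prefs: Dict[str, Any]) -> Dict[str, Any]:
    """Apply defaults and keep the ceiling from dropping below the floor."""
    rate = positive_or(prefs.get("sampling_frames_per_minute"), DEFAULT_FRAMES_PER_MINUTE)
    floor = int(positive_or(prefs.get("sampling_frame_floor"), DEFAULT_FRAME_FLOOR))
    ceiling = int(positive_or(prefs.get("sampling_frame_ceiling"), DEFAULT_FRAME_CEILING))
    return {
        "sampling_frames_per_minute": rate,
        "sampling_frame_floor": floor,
        "sampling_frame_ceiling": max(ceiling, floor),
    }
```

For example, a stored floor of 10 with a ceiling of 5 yields a ceiling of 10, and unparseable or non-positive values fall back to the defaults.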
6583
6618
  effective["default_post_operation_page"] = _normalize_setup_choice(
6584
6619
  effective.get("default_post_operation_page"),
6585
6620
  ["stay_put", "media", "cut", "edit", "fusion", "color", "fairlight", "deliver"],
@@ -6743,6 +6778,148 @@ def _media_analysis_timed_marker_decision(p: Dict[str, Any]) -> Dict[str, Any]:
6743
6778
  }
6744
6779
 
6745
6780
 
6781
+ def _normalize_sampling_mode_default(value: Any) -> Optional[str]:
6782
+ """Resolve a stored sampling_mode_default to a canonical mode, or None.
6783
+
6784
+ None means "not chosen yet" — covers an unset value and the "ask" sentinel,
6785
+ both of which should trigger the first-run prompt.
6786
+ """
6787
+ if value is None:
6788
+ return None
6789
+ raw = str(value).strip().lower()
6790
+ if raw in {"", "ask", "prompt", "ask_me", "ask_user", "none", "unset", "default"}:
6791
+ return None
6792
+ return _media_analysis_module.normalize_sampling_mode(value, default=None)
6793
+
6794
+
6795
+ def _sampling_mode_choice_from_params(p: Dict[str, Any]) -> Optional[str]:
6796
+ """Read an explicit sampling-mode choice from analysis params.
6797
+
6798
+ Returns a canonical mode, the "ask" sentinel, or None when unspecified.
6799
+ """
6800
+ raw = _first_param(
6801
+ p,
6802
+ "sampling_mode",
6803
+ "samplingMode",
6804
+ "frame_sampling_mode",
6805
+ "frameSamplingMode",
6806
+ "analysis_mode",
6807
+ "analysisMode",
6808
+ )
6809
+ if raw is None and isinstance(p.get("sampling"), dict):
6810
+ raw = _first_param(p["sampling"], "mode", "sampling_mode", "samplingMode")
6811
+ if raw is None:
6812
+ return None
6813
+ if str(raw).strip().lower() in {"ask", "prompt", "ask_me", "ask_user"}:
6814
+ return "ask"
6815
+ return _media_analysis_module.normalize_sampling_mode(raw, default=None)
6816
+
6817
+
6818
+ def _media_analysis_sampling_mode_prompt() -> Dict[str, Any]:
6819
+ """First-run prompt offering the four sampling modes (Thorough recommended).
6820
+
6821
+ Each option saves the chosen mode as the standing default (save_sampling_default)
6822
+ so the user is only asked once. Pass `sampling_mode` alone, without the save
6823
+ flag, for a one-off run that doesn't change the default.
6824
+ """
6825
+ return {
6826
+ "question": (
6827
+ "How should frames be sampled for visual analysis? This sets your "
6828
+ "default for future runs (pass sampling_mode per-call for a one-off)."
6829
+ ),
6830
+ "default_behavior": "Recommended: Thorough — content-aware coverage with a bounded, predictable cost.",
6831
+ "options": [
6832
+ {
6833
+ "id": "fixed",
6834
+ "label": "Economy",
6835
+ "description": (
6836
+ "Flat ~8 frames per clip regardless of length. Cheapest and most "
6837
+ "predictable; good for proxies, triage, or known-short clips."
6838
+ ),
6839
+ "params": {"sampling_mode": "fixed", "save_sampling_default": True},
6840
+ },
6841
+ {
6842
+ "id": "per_minute",
6843
+ "label": "Balanced",
6844
+ "description": (
6845
+ "Frames scale with duration (~4/min, bounded 3–80). Cost is linear "
6846
+ "in footage length and easy to predict; content-blind."
6847
+ ),
6848
+ "params": {"sampling_mode": "per_minute", "save_sampling_default": True},
6849
+ },
6850
+ {
6851
+ "id": "adaptive_capped",
6852
+ "label": "Thorough (recommended)",
6853
+ "description": (
6854
+ "Content-aware: samples shot boundaries, representatives, and flash "
6855
+ "frames, bounded 3–80 per clip. Best coverage with a bounded cost."
6856
+ ),
6857
+ "params": {"sampling_mode": "adaptive_capped", "save_sampling_default": True},
6858
+ },
6859
+ {
6860
+ "id": "adaptive",
6861
+ "label": "Thorough (uncapped)",
6862
+ "description": (
6863
+ "Content-aware with no per-clip ceiling (up to 512 frames). Use only "
6864
+ "when clips are known to be short or few — cost can grow fast."
6865
+ ),
6866
+ "params": {"sampling_mode": "adaptive", "save_sampling_default": True},
6867
+ },
6868
+ ],
6869
+ "recommended": _media_analysis_module.RECOMMENDED_SAMPLING_MODE,
6870
+ }
6871
+
6872
+
6873
+ def _media_analysis_sampling_mode_decision(p: Dict[str, Any]) -> Dict[str, Any]:
6874
+ """Resolve the frame-sampling mode for an analysis run.
6875
+
6876
+ Resolution order:
6877
+ 1. Explicit `sampling_mode` param → one-off (persisted only if save flag set).
6878
+ 2. Saved `sampling_mode_default` preference → used silently.
6879
+ 3. Otherwise → prompt_required (first run); falls back to the recommended
6880
+ mode so previews/automation still work, but the entry point surfaces the
6881
+ prompt and blocks real execution.
6882
+ """
6883
+ choice = _sampling_mode_choice_from_params(p)
6884
+ save_default = _media_analysis_bool(
6885
+ _first_param(p, "save_sampling_default", "saveSamplingDefault", "set_sampling_default", "setSamplingDefault"),
6886
+ False,
6887
+ )
6888
+ preferences = _read_media_analysis_preferences()
6889
+ explicit_saved = "sampling_mode_default" in preferences
6890
+ saved_default = _media_analysis_effective_preferences().get("sampling_mode_default")
6891
+
6892
+ recommended = _media_analysis_module.RECOMMENDED_SAMPLING_MODE
6893
+ saved = None
6894
+
6895
+ if choice and choice != "ask":
6896
+ mode = choice
6897
+ source = "explicit"
6898
+ if save_default:
6899
+ saved = mode
6900
+ preferences["sampling_mode_default"] = mode
6901
+ preferences["sampling_mode_default_updated_at"] = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
6902
+ _write_media_analysis_preferences(preferences)
6903
+ source = "saved_default"
6904
+ elif choice == "ask":
6905
+ mode = recommended
6906
+ source = "prompt_required"
6907
+ elif saved_default:
6908
+ mode = saved_default
6909
+ source = "saved_default" if explicit_saved else "default"
6910
+ else:
6911
+ mode = recommended
6912
+ source = "prompt_required"
6913
+
6914
+ return {
6915
+ "mode": mode,
6916
+ "source": source,
6917
+ "prompt_required": source == "prompt_required",
6918
+ "saved_default": saved or saved_default,
6919
+ "preferences_path": _media_analysis_preferences_path(),
6920
+ }
6921
+
6922
+
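The resolution order documented in `_media_analysis_sampling_mode_decision` (explicit parameter, then saved default, then first-run prompt with the recommended mode as fallback) can be condensed into a pure function. This is a simplified sketch, not the package's implementation: it omits preference persistence and the `save_sampling_default` flag, `resolve_sampling_mode` is a hypothetical name, and the value of `RECOMMENDED_SAMPLING_MODE` is assumed from the "recommended" label on `adaptive_capped`.

```python
from typing import Any, Dict, Optional

# Assumed value, based on the "recommended" label attached to adaptive_capped.
RECOMMENDED_SAMPLING_MODE = "adaptive_capped"


def resolve_sampling_mode(explicit: Optional[str], saved_default: Optional[str]) -> Dict[str, Any]:
    """Pick a sampling mode: explicit choice > saved default > first-run prompt."""
    if explicit and explicit != "ask":
        mode, source = explicit, "explicit"
    elif explicit != "ask" and saved_default:
        mode, source = saved_default, "saved_default"
    else:
        # No usable choice (or an explicit "ask"): fall back to the recommended
        # mode and signal that the caller should surface the prompt.
        mode, source = RECOMMENDED_SAMPLING_MODE, "prompt_required"
    return {"mode": mode, "source": source, "prompt_required": source == "prompt_required"}
```

Note that an explicit `"ask"` requests the prompt even when a default is already saved, which matches the branch order in the function above.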
6746
6923
  def _media_analysis_sync_marker_suggestions(detection: Dict[str, Any]) -> List[Dict[str, Any]]:
6747
6924
  suggestions = []
6748
6925
  for file_result in detection.get("files") or []:
@@ -8086,6 +8263,23 @@ def _media_analysis_apply_setup_defaults(action: str, p: Dict[str, Any]) -> Dict
8086
8263
  )
8087
8264
  applied["transcription_default"] = transcription_default
8088
8265
 
8266
+ # Frame-sampling mode: resolve from explicit param > saved default > first-run
8267
+ # prompt (recommended fallback). Inject mode + tunables so the analysis engine
8268
+ # picks them up via _resolve_sampling_config; stash the decision so the entry
8269
+ # point can surface the first-run prompt.
8270
+ if action in {"plan", "analyze_file", "analyze_clip", "analyze_bin", "analyze_project", "analyze_timeline", "analyze_sequence", "start_batch_job"}:
8271
+ sampling_decision = _media_analysis_sampling_mode_decision(out)
8272
+ if not _has_any_param(out, "sampling_mode", "samplingMode", "frame_sampling_mode", "frameSamplingMode"):
8273
+ out["sampling_mode"] = sampling_decision["mode"]
8274
+ applied["sampling_mode"] = sampling_decision["mode"]
8275
+ if not _has_any_param(out, "frames_per_minute", "framesPerMinute"):
8276
+ out["frames_per_minute"] = prefs.get("sampling_frames_per_minute")
8277
+ if not _has_any_param(out, "frame_floor", "frameFloor"):
8278
+ out["frame_floor"] = prefs.get("sampling_frame_floor")
8279
+ if not _has_any_param(out, "frame_ceiling", "frameCeiling"):
8280
+ out["frame_ceiling"] = prefs.get("sampling_frame_ceiling")
8281
+ out["_sampling_mode_decision"] = sampling_decision
8282
+
8089
8283
  if action in {"plan", "analyze_file", "analyze_clip", "analyze_bin", "analyze_project", "analyze_timeline", "analyze_sequence", "start_batch_job", "publish_clip_metadata"}:
8090
8284
  if not _has_any_param(
8091
8285
  out,
@@ -9391,7 +9585,9 @@ def _setup_media_analysis_defaults() -> Dict[str, Any]:
9391
9585
  "source_trust": ["auto", "filename", "low", "medium", "high"],
9392
9586
  "default_depth": ["quick", "standard", "deep"],
9393
9587
  "default_post_operation_page": ["stay_put", "media", "cut", "edit", "fusion", "color", "fairlight", "deliver"],
9588
+ "sampling_mode_default": ["ask", "fixed", "per_minute", "adaptive_capped", "adaptive"],
9394
9589
  },
9590
+ "sampling_mode_labels": dict(_media_analysis_module.SAMPLING_MODE_LABELS),
9395
9591
  }
9396
9592
 
9397
9593
 
@@ -9512,6 +9708,24 @@ def _setup_set_media_analysis_defaults(media_defaults: Dict[str, Any], dry_run:
9512
9708
  "defaultsampleframes": "default_sample_frames",
9513
9709
  "sample_frames": "default_sample_frames",
9514
9710
  "sampleframes": "default_sample_frames",
9711
+ "sampling_mode_default": "sampling_mode_default",
9712
+ "samplingmodedefault": "sampling_mode_default",
9713
+ "sampling_mode": "sampling_mode_default",
9714
+ "samplingmode": "sampling_mode_default",
9715
+ "analysis_mode": "sampling_mode_default",
9716
+ "analysismode": "sampling_mode_default",
9717
+ "sampling_frames_per_minute": "sampling_frames_per_minute",
9718
+ "samplingframesperminute": "sampling_frames_per_minute",
9719
+ "frames_per_minute": "sampling_frames_per_minute",
9720
+ "framesperminute": "sampling_frames_per_minute",
9721
+ "sampling_frame_floor": "sampling_frame_floor",
9722
+ "samplingframefloor": "sampling_frame_floor",
9723
+ "frame_floor": "sampling_frame_floor",
9724
+ "framefloor": "sampling_frame_floor",
9725
+ "sampling_frame_ceiling": "sampling_frame_ceiling",
9726
+ "samplingframeceiling": "sampling_frame_ceiling",
9727
+ "frame_ceiling": "sampling_frame_ceiling",
9728
+ "frameceiling": "sampling_frame_ceiling",
9515
9729
  }
9516
9730
 
9517
9731
  requested: Dict[str, Any] = {}
@@ -9640,6 +9854,37 @@ def _setup_set_media_analysis_defaults(media_defaults: Dict[str, Any], dry_run:
9640
9854
  except (TypeError, ValueError):
9641
9855
  return _err("default_sample_frames must be an integer between 0 and 48.")
9642
9856
  set_or_clear(key, raw_value, max(0, min(48, frames_int)))
9857
+ elif key == "sampling_mode_default":
9858
+ # "ask" clears the saved default so the first-run prompt fires again;
9859
+ # otherwise normalize a canonical key or friendly label.
9860
+ if _setup_text_key(raw_value) in {"ask", "prompt", "askme", "askuser"}:
9861
+ next_preferences.pop("sampling_mode_default", None)
9862
+ next_preferences.pop("sampling_mode_default_updated_at", None)
9863
+ updates[key] = {"before": before.get(key), "after": None, "cleared": True}
9864
+ else:
9865
+ normalized = _media_analysis_module.normalize_sampling_mode(raw_value, default=None)
9866
+ if normalized is None:
9867
+ return _err(
9868
+ "Unsupported sampling_mode_default. Use ask, fixed/economy, "
9869
+ "per_minute/balanced, adaptive_capped/thorough, or adaptive."
9870
+ )
9871
+ set_or_clear(key, raw_value, normalized)
9872
+ elif key == "sampling_frames_per_minute":
9873
+ try:
9874
+ rate = float(raw_value)
9875
+ except (TypeError, ValueError):
9876
+ return _err("sampling_frames_per_minute must be a positive number.")
9877
+ if rate <= 0:
9878
+ return _err("sampling_frames_per_minute must be greater than 0.")
9879
+ set_or_clear(key, raw_value, rate)
9880
+ elif key in {"sampling_frame_floor", "sampling_frame_ceiling"}:
9881
+ try:
9882
+ n = int(raw_value) if not isinstance(raw_value, bool) else 0
9883
+ except (TypeError, ValueError):
9884
+ return _err(f"{key} must be a positive integer.")
9885
+ if n <= 0:
9886
+ return _err(f"{key} must be a positive integer.")
9887
+ set_or_clear(key, raw_value, n)
9643
9888
  elif key == "default_post_operation_page":
9644
9889
  normalized = _normalize_setup_choice(
9645
9890
  raw_value,
@@ -9797,6 +10042,10 @@ def _setup_clear_defaults(keys: Any, dry_run: bool) -> Dict[str, Any]:
9797
10042
  "metadata_writeback_default": "media_analysis.metadata_writeback_default",
9798
10043
  "ask_before_metadata_publish": "media_analysis.ask_before_metadata_publish",
9799
10044
  "dry_run_first_default": "media_analysis.dry_run_first_default",
10045
+ "sampling_mode_default": "media_analysis.sampling_mode_default",
10046
+ "sampling_frames_per_minute": "media_analysis.sampling_frames_per_minute",
10047
+ "sampling_frame_floor": "media_analysis.sampling_frame_floor",
10048
+ "sampling_frame_ceiling": "media_analysis.sampling_frame_ceiling",
9800
10049
  }
9801
10050
  media_payload: Dict[str, Any] = {}
9802
10051
  if clear_all or "media_analysis" in normalized_keys:
@@ -9899,6 +10148,14 @@ def setup(action: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any
9899
10148
  "media_analysis.metadata_writeback_default": {"values": [True, False], "storage": _media_analysis_preferences_path()},
9900
10149
  "media_analysis.ask_before_metadata_publish": {"values": [True, False], "storage": _media_analysis_preferences_path()},
9901
10150
  "media_analysis.dry_run_first_default": {"values": [True, False], "storage": _media_analysis_preferences_path()},
10151
+ "media_analysis.sampling_mode_default": {
10152
+ "description": "Frame-sampling mode for visual analysis. 'ask' prompts on first analysis to set a standing default. fixed=Economy (flat frames), per_minute=Balanced (duration-scaled), adaptive_capped=Thorough (content-aware, bounded — recommended), adaptive=Thorough uncapped.",
10153
+ "values": ["ask", "fixed", "per_minute", "adaptive_capped", "adaptive"],
10154
+ "storage": _media_analysis_preferences_path(),
10155
+ },
10156
+ "media_analysis.sampling_frames_per_minute": {"description": "Frames per minute for Balanced mode (also seeds Thorough on short clips).", "values": "number > 0 (default 4)", "storage": _media_analysis_preferences_path()},
10157
+ "media_analysis.sampling_frame_floor": {"description": "Minimum frames per clip for duration/content-scaled modes.", "values": "integer > 0 (default 3)", "storage": _media_analysis_preferences_path()},
10158
+ "media_analysis.sampling_frame_ceiling": {"description": "Maximum frames per clip for Balanced + Thorough modes (the Thorough per-clip cap).", "values": "integer > 0 (default 80)", "storage": _media_analysis_preferences_path()},
9902
10159
  "updates.mode": {
9903
10160
  "description": "Local MCP update policy.",
9904
10161
  "values": sorted(_SETUP_UPDATE_MODES),
@@ -13191,6 +13448,31 @@ async def media_analysis(action: str, params: Optional[Dict[str, Any]] = None, c
13191
13448
  """
13192
13449
  p = _media_analysis_apply_setup_defaults(action, dict(params or {}))
13193
13450
 
13451
+ # First-run frame-sampling prompt: if the user has never chosen a sampling
13452
+ # mode (and didn't pass one this call), ask before spending any vision
13453
+ # tokens. Re-running with sampling_mode=<choice> saves it as the default;
13454
+ # passing sampling_mode explicitly any time is a one-off that skips this.
13455
+ _sampling_decision = p.get("_sampling_mode_decision") if isinstance(p, dict) else None
13456
+ if (
13457
+ isinstance(_sampling_decision, dict)
13458
+ and _sampling_decision.get("prompt_required")
13459
+ and action in {"analyze_clip", "analyze_file", "analyze_bin", "analyze_project", "analyze_sequence", "analyze_timeline", "start_batch_job"}
13460
+ ):
13461
+ return {
13462
+ "success": True,
13463
+ "status": "confirmation_required",
13464
+ "confirmation_required": True,
13465
+ "sampling_mode_prompt": _media_analysis_sampling_mode_prompt(),
13466
+ "recommended_sampling_mode": _media_analysis_module.RECOMMENDED_SAMPLING_MODE,
13467
+ "message": (
13468
+ "Choose a frame-sampling mode for visual analysis. Re-run with "
13469
+ "sampling_mode set to one of fixed/per_minute/adaptive_capped/adaptive "
13470
+ "(the chosen value is saved as your default), or set it in the control "
13471
+ "panel under Analysis Modes. Pass sampling_mode per-call any time for a one-off."
13472
+ ),
13473
+ "preferences_path": _media_analysis_preferences_path(),
13474
+ }
13475
+
13194
13476
  # E2 — capture original action + scope before the dispatch may rewrite
13195
13477
  # `action` (e.g. analyze_clip → plan). Used by _e2_wrap below to attach
13196
13478
  # an `escalation` block when a (scope, action) pair fails repeatedly.
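A caller can react to the first-run prompt added in this hunk by re-issuing the call with `sampling_mode` set. The sketch below is a hypothetical client-side helper (`params_for_retry` is not part of the package); it relies only on the response keys shown in the diff and assumes the caller has already obtained the user's choice.

```python
from typing import Any, Dict, Optional


def params_for_retry(
    response: Dict[str, Any],
    params: Dict[str, Any],
    chosen_mode: Optional[str] = None,
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: params for a second call if a sampling mode was requested."""
    if not (response.get("confirmation_required") and "sampling_mode_prompt" in response):
        return None  # no sampling prompt in this response
    # Fall back to the server's recommendation when the user gave no explicit choice.
    mode = chosen_mode or response.get("recommended_sampling_mode")
    if not mode:
        return None
    return {**params, "sampling_mode": mode}


first_response = {
    "success": True,
    "status": "confirmation_required",
    "confirmation_required": True,
    "sampling_mode_prompt": {},
    "recommended_sampling_mode": "adaptive_capped",
}
print(params_for_retry(first_response, {}, chosen_mode="per_minute"))
```

Per the message text in the diff, the mode passed on the re-run is saved as the standing default, so a real client would normally surface the prompt to the user before retrying.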
@@ -18,17 +18,26 @@ deferred a "$ cost cap" axis. Token budgets are the unit of currency here.
18
18
 
19
19
  | Dimension | minimal | standard | generous | unlimited |
20
20
  |-----------|--------:|---------:|---------:|----------:|
21
- | response_chars | 5,000 | 25,000 | 100,000 | None |
22
- | vision_tokens_per_clip | 5,000 | 25,000 | 100,000 | None |
23
- | frames_per_clip | 4 | 8 | 24 | None |
24
- | vision_tokens_per_job | 50,000 | 250,000 | 1,000,000 | None |
25
- | vision_tokens_per_day | 100,000 | 500,000 | 2,000,000 | None |
26
- | wall_clock_seconds_per_call | 30 | 90 | 300 | None |
27
- | max_frame_dim_pixels | 512 | 768 | 1280 | None |
21
+ | response_chars | 5,000 | 25,000 | 100,000 | None |
22
+ | vision_tokens_per_clip | 16,000 | 100,000 | 250,000 | None |
23
+ | frames_per_clip | 12 | 80 | 200 | None |
24
+ | vision_tokens_per_job | 60,000 | 1,000,000 | 3,000,000 | None |
25
+ | vision_tokens_per_day | 150,000 | 2,000,000 | 6,000,000 | None |
26
+ | wall_clock_seconds_per_call | 30 | 90 | 300 | None |
27
+ | max_frame_dim_pixels | 512 | 768 | 1280 | None |
28
28
 
29
29
  `minimal` = preview/triage mode. `standard` = realistic per-project default.
30
30
  `generous` = high-fidelity analysis on a few specific clips. `unlimited` = all
31
31
  guards off; use only when you're certain about the input size.
32
+
33
+ NOTE on `frames_per_clip`: this is a *safety ceiling*, not the primary frame
34
+ dial. How many frames a clip actually gets is chosen by the `sampling_mode`
35
+ (Economy/Balanced/Thorough — see media_analysis.SAMPLING_MODES), which is
36
+ duration- and content-aware. `frames_per_clip` only clips the result if the mode
37
+ would exceed it. The standard ceiling (80) matches the default Thorough ceiling
38
+ so the mode is never silently truncated; lower it to hard-cap cost, raise it for
39
+ unusually long/cutty clips. (Before sampling modes existed this defaulted to 8
40
+ and *was* the frame dial — that flat cap is what made long clips under-covered.)
32
41
  """
33
42
 
34
43
  from __future__ import annotations
@@ -74,30 +83,30 @@ CAP_PRESETS: Dict[str, Caps] = {
74
83
  PRESET_MINIMAL: Caps(
75
84
  preset=PRESET_MINIMAL,
76
85
  response_chars=5_000,
77
- vision_tokens_per_clip=5_000,
78
- frames_per_clip=4,
79
- vision_tokens_per_job=50_000,
80
- vision_tokens_per_day=100_000,
86
+ vision_tokens_per_clip=16_000,
87
+ frames_per_clip=12,
88
+ vision_tokens_per_job=60_000,
89
+ vision_tokens_per_day=150_000,
81
90
  wall_clock_seconds_per_call=30,
82
91
  max_frame_dim_pixels=512,
83
92
  ),
84
93
  PRESET_STANDARD: Caps(
85
94
  preset=PRESET_STANDARD,
86
95
  response_chars=25_000,
87
- vision_tokens_per_clip=25_000,
88
- frames_per_clip=8,
89
- vision_tokens_per_job=250_000,
90
- vision_tokens_per_day=500_000,
96
+ vision_tokens_per_clip=100_000,
97
+ frames_per_clip=80,
98
+ vision_tokens_per_job=1_000_000,
99
+ vision_tokens_per_day=2_000_000,
91
100
  wall_clock_seconds_per_call=90,
92
101
  max_frame_dim_pixels=768,
93
102
  ),
94
103
  PRESET_GENEROUS: Caps(
95
104
  preset=PRESET_GENEROUS,
96
105
  response_chars=100_000,
97
- vision_tokens_per_clip=100_000,
98
- frames_per_clip=24,
99
- vision_tokens_per_job=1_000_000,
100
- vision_tokens_per_day=2_000_000,
106
+ vision_tokens_per_clip=250_000,
107
+ frames_per_clip=200,
108
+ vision_tokens_per_job=3_000_000,
109
+ vision_tokens_per_day=6_000_000,
101
110
  wall_clock_seconds_per_call=300,
102
111
  max_frame_dim_pixels=1280,
103
112
  ),
@@ -624,6 +624,103 @@ FRAME_CAPS = {
624
624
  }
625
625
  HARD_FRAME_CAP = 512
626
626
 
627
+ # ── Frame-sampling modes ─────────────────────────────────────────────────────
628
+ # How many frames a clip gets is governed by a `sampling_mode`. `depth` still
629
+ # governs *which* analysis layers run; the mode governs frame coverage + cost.
630
+ #
631
+ # fixed "Economy" — flat N frames (depth-derived / max_analysis_frames),
632
+ # independent of clip length. Most predictable cost.
633
+ # per_minute "Balanced" — N = clamp(minutes * frames_per_minute, floor, ceiling).
634
+ # Cost is linear in footage length; content-blind.
635
+ # adaptive_capped "Thorough" — content-aware (per-shot boundaries + flashes), bounded
636
+ # by [floor, frame_ceiling]. Best coverage, bounded cost.
637
+ # adaptive "Thorough (uncapped)" — content-aware, bounded only by the absolute HARD_FRAME_CAP.
638
+ # Use only when clips are known to be short/few.
639
+ #
640
+ # The math-layer default is `adaptive` so any caller that doesn't thread a
641
+ # sampling config keeps the legacy demand-driven behaviour. The *product*
642
+ # default (what new analysis runs use) is resolved at the preference layer in
643
+ # server.py and recommends "adaptive_capped" (Thorough).
644
+ SAMPLING_MODES = {"fixed", "per_minute", "adaptive", "adaptive_capped"}
645
+ DEFAULT_SAMPLING_MODE = "adaptive"
646
+ RECOMMENDED_SAMPLING_MODE = "adaptive_capped"
647
+ DEFAULT_FRAMES_PER_MINUTE = 4.0
648
+ DEFAULT_FRAME_FLOOR = 3
649
+ DEFAULT_FRAME_CEILING = 80
650
+
651
+ # Thoroughness ranking — used for cache reuse: a richer prior report satisfies a
652
+ # cheaper mode, but switching *up* forces a re-sample.
653
+ SAMPLING_MODE_RANK = {"fixed": 0, "per_minute": 1, "adaptive_capped": 2, "adaptive": 3}
654
+
655
+ # User-facing labels (prompt + control panel).
656
+ SAMPLING_MODE_LABELS = {
657
+ "fixed": "Economy",
658
+ "per_minute": "Balanced",
659
+ "adaptive_capped": "Thorough",
660
+ "adaptive": "Thorough (uncapped)",
661
+ }
662
+
663
+ _SAMPLING_MODE_ALIASES = {
664
+ "economy": "fixed", "fixed": "fixed", "flat": "fixed",
665
+ "balanced": "per_minute", "per_minute": "per_minute", "perminute": "per_minute",
666
+ "per-minute": "per_minute", "duration": "per_minute",
667
+ "thorough": "adaptive_capped", "adaptive_capped": "adaptive_capped",
668
+ "adaptive-capped": "adaptive_capped", "capped": "adaptive_capped",
669
+ "thorough_uncapped": "adaptive", "thorough (uncapped)": "adaptive",
670
+ "adaptive": "adaptive", "uncapped": "adaptive",
671
+ }
672
+
673
+
674
+ def normalize_sampling_mode(value: Any, default: Optional[str] = None) -> Optional[str]:
675
+ """Resolve a user-supplied mode string (label or key) to a canonical mode."""
676
+ raw = str(value or "").strip().lower()
677
+ return _SAMPLING_MODE_ALIASES.get(raw, default)
678
+
679
+
680
+ def _resolve_sampling_config(params: Optional[Dict[str, Any]]) -> Dict[str, Any]:
681
+ """Read sampling mode + tunables from analysis params, applying defaults."""
682
+ params = params or {}
683
+
684
+ def _first(*keys: str) -> Any:
685
+ for key in keys:
686
+ if key in params and params[key] is not None:
687
+ return params[key]
688
+ return None
689
+
690
+ mode = normalize_sampling_mode(
691
+ _first("sampling_mode", "samplingMode"), default=DEFAULT_SAMPLING_MODE
692
+ ) or DEFAULT_SAMPLING_MODE
693
+
694
+ def _pos_float(value: Any, fallback: float) -> float:
695
+ try:
696
+ f = float(value)
697
+ except (TypeError, ValueError):
698
+ return fallback
699
+ return f if f > 0 else fallback
700
+
701
+ rate = _pos_float(_first("frames_per_minute", "framesPerMinute"), DEFAULT_FRAMES_PER_MINUTE)
702
+ floor = int(_pos_float(_first("frame_floor", "frameFloor"), DEFAULT_FRAME_FLOOR))
703
+ ceiling = int(_pos_float(_first("frame_ceiling", "frameCeiling"), DEFAULT_FRAME_CEILING))
704
+ if ceiling < floor:
705
+ ceiling = floor
706
+ return {
707
+ "mode": mode,
708
+ "frames_per_minute": rate,
709
+ "frame_floor": floor,
710
+ "frame_ceiling": ceiling,
711
+ }
712
+
713
+
714
+ def _clamp_int(value: Any, low: int, high: int) -> int:
715
+ if high < low:
716
+ high = low
717
+ v = int(value)
718
+ if v < low:
719
+ return low
720
+ if v > high:
721
+ return high
722
+ return v
723
+
627
724
 
628
725
  def slugify(value: Any, fallback: str = "untitled") -> str:
629
726
  raw = str(value or "").strip().lower()
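The alias lookup introduced above accepts either a user-facing label or a canonical key. A minimal standalone sketch follows, using a subset of the alias table from the diff; `normalize_mode` and `ALIASES` are illustrative names, not the module's own.

```python
from typing import Any, Optional

# Subset of the alias table from the diff; labels and keys map to canonical modes.
ALIASES = {
    "economy": "fixed", "fixed": "fixed",
    "balanced": "per_minute", "per_minute": "per_minute", "per-minute": "per_minute",
    "thorough": "adaptive_capped", "adaptive_capped": "adaptive_capped",
    "thorough (uncapped)": "adaptive", "adaptive": "adaptive", "uncapped": "adaptive",
}


def normalize_mode(value: Any, default: Optional[str] = None) -> Optional[str]:
    """Trim and lowercase the input, then look it up; unknown values yield `default`."""
    raw = str(value or "").strip().lower()
    return ALIASES.get(raw, default)


print(normalize_mode(" Thorough "))       # adaptive_capped
print(normalize_mode("Balanced"))         # per_minute
print(normalize_mode(None, "adaptive"))   # adaptive
print(normalize_mode("turbo"))            # None
```

Because separators are not rewritten, hyphenated and spaced spellings only resolve when they appear as explicit alias keys.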
@@ -1441,13 +1538,14 @@ def analysis_request_signature(
1441
1538
  depth: str,
1442
1539
  options: Dict[str, Any],
1443
1540
  frame_count: int,
1541
+ sampling: Optional[Dict[str, Any]] = None,
1444
1542
  ) -> Dict[str, Any]:
1445
1543
  """Return the cache signature for a requested analysis profile."""
1446
1544
  transcription = options.get("transcription") or {}
1447
1545
  vision = options.get("vision") or {}
1448
1546
  marker_plan = options.get("marker_plan") or {}
1449
1547
  vision_prompt = vision.get("prompt") or DEFAULT_VISION_ANALYSIS_PROMPT
1450
- return {
1548
+ signature = {
1451
1549
  "analysis_version": ANALYSIS_VERSION,
1452
1550
  "depth": depth,
1453
1551
  "analysis_keyframe_budget": int(frame_count or 0),
@@ -1506,6 +1604,16 @@ def analysis_request_signature(
1506
1604
  },
1507
1605
  }),
1508
1606
  }
1607
+ # Recorded outside signature_hash so it doesn't bust pre-existing caches;
1608
+ # mode changes are reconciled by thoroughness rank in _report_cache_state.
1609
+ if sampling:
1610
+ signature["analysis_sampling"] = {
1611
+ "mode": sampling.get("mode"),
1612
+ "frames_per_minute": sampling.get("frames_per_minute"),
1613
+ "frame_floor": sampling.get("frame_floor"),
1614
+ "frame_ceiling": sampling.get("frame_ceiling"),
1615
+ }
1616
+ return signature
1509
1617
 
1510
1618
 
1511
1619
  def _timestamp_from_analyzed_at(value: Any) -> Optional[float]:
@@ -1659,6 +1767,7 @@ def build_plan(
1659
1767
  }
1660
1768
  gaps = _required_capability_gaps(depth, options, caps)
1661
1769
  frame_count = _bounded_frame_count(depth, params.get("max_analysis_frames"))
1770
+ sampling_config = _resolve_sampling_config(params)
1662
1771
  transcription_enabled = _coerce_bool((options.get("transcription") or {}).get("enabled"), default=DEFAULT_TRANSCRIPTION_ENABLED)
1663
1772
  notes = [
1664
1773
  "Plans describe analysis before execution.",
@@ -1721,11 +1830,12 @@ def build_plan(
1721
1830
  clip_plans = []
1722
1831
  for record in records:
1723
1832
  artifacts = _artifact_paths(root["project_root"], record, depth, options)
1724
- request_signature = analysis_request_signature(record, depth, options, frame_count)
1833
+ request_signature = analysis_request_signature(record, depth, options, frame_count, sampling=sampling_config)
1725
1834
  existing: Optional[Dict[str, Any]] = None
1726
1835
  clip_plan = {
1727
1836
  "record": record,
1728
1837
  "analysis_keyframe_budget": frame_count,
1838
+ "sampling": sampling_config,
1729
1839
  "analysis_signature": request_signature,
1730
1840
  "cache_status": "not_checked",
1731
1841
  "artifacts": artifacts,
@@ -1853,6 +1963,8 @@ def build_plan(
1853
1963
  "estimated_seconds": per_clip_seconds * len(records),
1854
1964
  "estimated_seconds_after_reuse": per_clip_seconds * max(0, len(records) - reusable_count),
1855
1965
  "analysis_keyframe_budget_per_clip": frame_count,
1966
+ "sampling": sampling_config,
1967
+ "sampling_mode": sampling_config.get("mode"),
1856
1968
  "reuse_existing": reuse_existing,
1857
1969
  "force_refresh": force_refresh,
1858
1970
  "reuse_policy": reuse_policy,
@@ -2414,24 +2526,19 @@ def _cut_boundary_analysis(
2414
2526
  }
2415
2527
 
2416
2528
 
2417
- def _compute_demand_driven_budget(
2418
- requested_budget: int,
2419
- cut_analysis: Optional[Dict[str, Any]],
2529
+ def _demand_frame_count(
2530
+ cut_analysis: Dict[str, Any],
2420
2531
  duration_seconds: Optional[float],
2421
2532
  ) -> int:
2422
- """Compute the effective frame-sampling budget driven by analysis demand.
2533
+ """Frames the content *demands* so vision can populate the per-shot schema.
2423
2534
 
2424
- Demand sources (so vision can populate the V2 per-shot schema):
2535
+ Demand sources:
2425
2536
  - Per shot: 1 representative (midpoint) + 2 boundary frames + duration-scaled extras
2426
2537
  (+1 for shots >5s, +1 for shots >15s, +1 per additional 15s beyond 30s)
2427
2538
  - Per flash_candidate: 1 mid-frame for vision adjudication (preserve all)
2428
2539
  - Per cut_point: a small buffer for cuts not covered by shot boundaries
2429
-
2430
- Safety ceiling: HARD_FRAME_CAP capped by duration so a 10s clip cannot request 500 frames.
2540
+ - Clip-level: first_usable, last_usable, midpoint
2431
2541
  """
2432
- if not isinstance(cut_analysis, dict):
2433
- return min(max(int(requested_budget or 0), 0), HARD_FRAME_CAP)
2434
-
2435
2542
  per_shot_demand = 0
2436
2543
  for shot in cut_analysis.get("shot_ranges") or []:
2437
2544
  if not isinstance(shot, dict):
@@ -2458,13 +2565,84 @@ def _compute_demand_driven_budget(
2458
2565
  # Clip-level frames (first_usable, last_usable, midpoint)
2459
2566
  clip_buffer = 4
2460
2567
 
2461
- demand = per_shot_demand + flash_count + cut_buffer + clip_buffer
2568
+ return per_shot_demand + flash_count + cut_buffer + clip_buffer
2569
+
2570
+
2571
+ def _compute_demand_driven_budget(
2572
+ requested_budget: int,
2573
+ cut_analysis: Optional[Dict[str, Any]],
2574
+ duration_seconds: Optional[float],
2575
+ sampling: Optional[Dict[str, Any]] = None,
2576
+ ) -> int:
2577
+ """Resolve the effective frame-sampling budget for the active sampling mode.
2578
+
2579
+ Modes (see SAMPLING_MODES):
2580
+ - fixed: flat `requested_budget`, duration-independent.
2581
+ - per_minute: clamp(minutes * frames_per_minute, floor, ceiling); content-blind.
2582
+ - adaptive_capped: content demand (see _demand_frame_count), clamped to [floor, ceiling].
2583
+ - adaptive: content demand, clamped only by a generous duration-scaled HARD_FRAME_CAP
2584
+ (legacy behaviour; the default when no sampling config is threaded).
2585
+
2586
+ `requested_budget` (depth-derived / max_analysis_frames) acts as a floor for the
2587
+ adaptive modes, so an explicit request is only undercut by the mode's own cap.
2588
+ """
2589
+ sampling = sampling or {}
2590
+ mode = normalize_sampling_mode(sampling.get("mode"), default=DEFAULT_SAMPLING_MODE) or DEFAULT_SAMPLING_MODE
2591
+ rate = sampling.get("frames_per_minute") or DEFAULT_FRAMES_PER_MINUTE
2592
+ floor = int(sampling.get("frame_floor") or DEFAULT_FRAME_FLOOR)
2593
+ ceiling = int(sampling.get("frame_ceiling") or DEFAULT_FRAME_CEILING)
2594
+ if ceiling < floor:
2595
+ ceiling = floor
2596
+ requested = max(int(requested_budget or 0), 0)
2597
+ minutes = max(0.0, float(duration_seconds or 0) / 60.0)
2598
+ per_minute_count = int(round(minutes * float(rate)))
2599
+
2600
+ if mode == "fixed":
2601
+ return min(requested, HARD_FRAME_CAP)
2602
+
2603
+ if mode == "per_minute":
2604
+ return _clamp_int(per_minute_count, floor, min(ceiling, HARD_FRAME_CAP))
2605
+
2606
+ # Adaptive modes need shot/cut analysis. Without it, fall back to a duration
2607
+ # estimate (adaptive_capped) or the legacy requested-only budget (adaptive).
2608
+ if not isinstance(cut_analysis, dict):
2609
+ if mode == "adaptive_capped":
2610
+ return _clamp_int(max(requested, per_minute_count), floor, min(ceiling, HARD_FRAME_CAP))
2611
+ return min(max(requested, 0), HARD_FRAME_CAP)
2612
+
2613
+ demand = _demand_frame_count(cut_analysis, duration_seconds)
2614
+ target = max(requested, demand, floor)
2462
2615
 
2463
- # Duration-scaled safety ceiling: clips can't request hundreds of frames per second.
2464
- # Floor at 64 so short clips still have headroom; ceiling at HARD_FRAME_CAP.
2465
- duration_cap = max(64, min(HARD_FRAME_CAP, int((duration_seconds or 0) * 2)))
2616
+ if mode == "adaptive_capped":
2617
+ return _clamp_int(target, floor, min(ceiling, HARD_FRAME_CAP))
2466
2618
 
2467
- return min(max(int(requested_budget or 0), demand), duration_cap)
2619
+ # adaptive (uncapped): only the absolute hard cap, scaled by duration so a
2620
+ # 10s clip cannot request 500 frames. Floor at 64 for short-clip headroom.
2621
+ duration_cap = max(64, min(HARD_FRAME_CAP, int(float(duration_seconds or 0) * 2)))
2622
+ return _clamp_int(target, floor, duration_cap)
2623
+
2624
+
2625
+ def _even_interval_samples(
2626
+ duration: float,
2627
+ count: int,
2628
+ frame_step: float,
2629
+ ) -> List[Dict[str, Any]]:
2630
+ """Content-blind evenly-spaced samples (Economy / Balanced modes).
2631
+
2632
+ Returns exactly `count` frames at the midpoints of `count` equal slices of
2633
+ [0, duration], so cost is a clean function of `count` and never inflated by
2634
+ shot/cut demand. Used when the user has chosen a predictable, content-blind mode.
2635
+ """
2636
+ if count <= 0 or duration <= 0:
2637
+ return []
2638
+ out: List[Dict[str, Any]] = []
2639
+ for i in range(count):
2640
+ t = duration * (i + 0.5) / count
2641
+ out.append({
2642
+ "time_seconds": _clamp_sample_time(float(t), duration),
2643
+ "selection_reason": "interval",
2644
+ })
2645
+ return out
2468
2646
 
2469
2647
 
2470
2648
  def _sample_times(
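The per-mode budget arithmetic and the even-interval sampler in this hunk can be checked with a small worked example. This is a simplified sketch: `frame_budget` takes the content demand as a plain integer (the real code derives it from cut analysis), the missing-cut-analysis fallback is omitted, and the sample times skip the `_clamp_sample_time` step. All function names here are illustrative.

```python
from typing import List

HARD_FRAME_CAP = 512  # absolute ceiling, as in the diff


def clamp_int(value: float, low: int, high: int) -> int:
    high = max(high, low)  # a ceiling below the floor collapses to the floor
    return max(low, min(int(value), high))


def frame_budget(mode: str, duration_s: float, requested: int, demand: int,
                 rate: float = 4.0, floor: int = 3, ceiling: int = 80) -> int:
    per_minute = int(round(max(0.0, duration_s) / 60.0 * rate))
    if mode == "fixed":
        return min(max(requested, 0), HARD_FRAME_CAP)
    if mode == "per_minute":
        return clamp_int(per_minute, floor, min(ceiling, HARD_FRAME_CAP))
    target = max(requested, demand, floor)
    if mode == "adaptive_capped":
        return clamp_int(target, floor, min(ceiling, HARD_FRAME_CAP))
    # adaptive (uncapped): duration-scaled cap that never drops below 64
    duration_cap = max(64, min(HARD_FRAME_CAP, int(duration_s * 2)))
    return clamp_int(target, floor, duration_cap)


def even_interval_times(duration_s: float, count: int) -> List[float]:
    """Midpoints of `count` equal slices of [0, duration_s]."""
    if count <= 0 or duration_s <= 0:
        return []
    return [duration_s * (i + 0.5) / count for i in range(count)]


# 30-minute clip, requested budget 8, hypothetical content demand of 300 frames.
for mode in ("fixed", "per_minute", "adaptive_capped", "adaptive"):
    print(mode, frame_budget(mode, 1800.0, requested=8, demand=300))
# fixed 8, per_minute 80, adaptive_capped 80, adaptive 300

print(even_interval_times(60.0, 4))  # [7.5, 22.5, 37.5, 52.5]
```

With the defaults, Balanced reaches its ceiling at 20 minutes of footage (20 × 4 = 80), while a 30-second clip rounds to 2 frames and is lifted to the floor of 3.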
@@ -2474,21 +2652,25 @@ def _sample_times(
2474
2652
  *,
2475
2653
  fps: Optional[float] = None,
2476
2654
  cut_analysis: Optional[Dict[str, Any]] = None,
2655
+ sampling: Optional[Dict[str, Any]] = None,
2477
2656
  ) -> List[Dict[str, Any]]:
2478
- """Two-pass frame allocation.
2657
+ """Frame allocation. Content-blind for Economy/Balanced; demand-driven otherwise.
2479
2658
 
2480
- Pass 1 (reservations, always allocated demand-driven, not budget-bounded):
2481
- - Per shot: shot_representative (midpoint), shot_start, shot_end boundaries,
2482
- duration-scaled progress samples (+1 for shots >5s, +1 for shots >15s,
2483
- +1 per 15s beyond 30s).
2484
- - Per flash_candidate: mid-frame for vision adjudication.
2659
+ Economy (fixed) / Balanced (per_minute): exactly `budget` evenly-spaced frames
2660
+ (see _even_interval_samples); cost is predictable and shot structure is ignored.
2485
2661
 
2486
- Pass 2 (priority fill, consumes remaining budget):
2487
- - cut_before/cut_after pairs (for cuts not covered by shot boundaries)
2488
- - first_usable, last_usable, scene_change, midpoint, interval fillers
2662
+ Thorough (adaptive / adaptive_capped) uses two-pass demand-driven allocation:
2663
+ Pass 1 (reservations, always allocated demand-driven, not budget-bounded):
2664
+ - Per shot: shot_representative (midpoint), shot_start, shot_end boundaries,
2665
+ duration-scaled progress samples (+1 for shots >5s, +1 for shots >15s,
2666
+ +1 per 15s beyond 30s).
2667
+ - Per flash_candidate: mid-frame for vision adjudication.
2668
+ Pass 2 (priority fill, consumes remaining budget):
2669
+ - cut_before/cut_after pairs (for cuts not covered by shot boundaries)
2670
+ - first_usable, last_usable, scene_change, midpoint, interval fillers
2489
2671
 
2490
- The caller passes `budget` as the soft target. Reservations always land
2491
- (demand-driven); priority fill is what `budget` constrains.
2672
+ The caller passes `budget` as the soft target. Reservations always land
2673
+ (demand-driven); priority fill is what `budget` constrains.
2492
2674
 
2493
2675
  Returns a time-sorted list of sample candidates.
2494
2676
  """
@@ -2498,6 +2680,12 @@ def _sample_times(
2498
2680
  cut_analysis = cut_analysis if isinstance(cut_analysis, dict) else {}
2499
2681
  frame_step = _frame_step_seconds(fps)
2500
2682
 
2683
+ # Content-blind modes: even-interval sampling of exactly `budget` frames so
2684
+ # cost stays predictable and is not inflated by per-shot reservations.
2685
+ mode = normalize_sampling_mode((sampling or {}).get("mode"), default=DEFAULT_SAMPLING_MODE) or DEFAULT_SAMPLING_MODE
2686
+ if mode in {"fixed", "per_minute"}:
2687
+ return _even_interval_samples(duration, budget, frame_step)
2688
+
2501
2689
  # ===================== Pass 1: Reservations =====================
2502
2690
  reserved: List[Dict[str, Any]] = []
2503
2691
 
@@ -2729,6 +2917,7 @@ def _motion_and_keyframes(
2729
2917
  fps: Optional[float] = None,
2730
2918
  cut_analysis: Optional[Dict[str, Any]] = None,
2731
2919
  write_frames: bool = True,
2920
+ sampling: Optional[Dict[str, Any]] = None,
2732
2921
  ) -> Dict[str, Any]:
2733
2922
  sampled = []
2734
2923
  previous_raw = None
@@ -2736,8 +2925,8 @@ def _motion_and_keyframes(
2736
2925
  if isinstance(cut_analysis, dict):
2737
2926
  required_boundary_frames += len(cut_analysis.get("cut_points") or []) * 2
2738
2927
  required_boundary_frames += len(cut_analysis.get("flash_frame_candidates") or [])
2739
- effective_budget = _compute_demand_driven_budget(budget, cut_analysis, duration)
2740
- times = _sample_times(duration, scene_items, effective_budget, fps=fps, cut_analysis=cut_analysis)
2928
+ effective_budget = _compute_demand_driven_budget(budget, cut_analysis, duration, sampling=sampling)
2929
+ times = _sample_times(duration, scene_items, effective_budget, fps=fps, cut_analysis=cut_analysis, sampling=sampling)
2741
2930
  frames_dir = artifacts.get("frames_dir")
2742
2931
  for index, sample in enumerate(times, 1):
2743
2932
  time_seconds = float(sample.get("time_seconds") or 0.0)
@@ -2971,6 +3160,15 @@ def _report_cache_state(
2971
3160
  if report_budget < request_budget:
2972
3161
  issues.append("analysis_keyframe_budget_lower_than_requested")
2973
3162
 
3163
+ # Sampling-mode reconciliation: a prior report sampled under a less-thorough
3164
+ # mode can't satisfy a request for a more-thorough one. The reverse (richer
3165
+ # report, cheaper request) is reused as a free upgrade.
3166
+ report_mode = (report_signature.get("analysis_sampling") or {}).get("mode")
3167
+ request_mode = (request_signature.get("analysis_sampling") or {}).get("mode")
3168
+ if request_mode and report_mode and request_mode != report_mode:
3169
+ if SAMPLING_MODE_RANK.get(request_mode, 0) > SAMPLING_MODE_RANK.get(report_mode, 0):
3170
+ issues.append("sampling_mode_increased")
3171
+
2974
3172
  report_layers = report_signature.get("layers") or {}
2975
3173
  request_layers = request_signature.get("layers") or {}
2976
3174
  report_vision = report_layers.get("vision") or {}
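The cache rule added in this hunk reduces to a rank comparison. A minimal sketch, assuming the rank table from the diff; `needs_resample` is an illustrative name for the condition that appends `sampling_mode_increased`.

```python
from typing import Optional

RANK = {"fixed": 0, "per_minute": 1, "adaptive_capped": 2, "adaptive": 3}


def needs_resample(report_mode: Optional[str], request_mode: Optional[str]) -> bool:
    """True only when the request is strictly more thorough than the cached report."""
    if not report_mode or not request_mode or report_mode == request_mode:
        return False
    return RANK.get(request_mode, 0) > RANK.get(report_mode, 0)


print(needs_resample("fixed", "adaptive_capped"))  # True: switching up forces a re-sample
print(needs_resample("adaptive", "fixed"))         # False: richer report is reused
print(needs_resample(None, "adaptive"))            # False: older reports record no mode
```

The last case reflects the comment in `analysis_request_signature`: reports written before sampling modes existed carry no `analysis_sampling` entry, so this check alone never invalidates them.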
@@ -4747,6 +4945,7 @@ async def execute_plan_async(
4747
4945
  fps=fps,
4748
4946
  cut_analysis=readthrough.get("cut_analysis"),
4749
4947
  write_frames=keep_frame_artifacts_for_vision or not _coerce_bool(params.get("cleanup_frames"), default=False),
4948
+ sampling=clip_plan.get("sampling"),
4750
4949
  )
4751
4950
  if artifacts.get("motion_json"):
4752
4951
  _write_json(artifacts["motion_json"], motion)