talks-reducer 0.1.0__tar.gz

@@ -0,0 +1,23 @@
+ MIT License
+
+ Copyright (c) 2019 carykh
+ Copyright (c) 2020 gegell
+ Copyright (c) 2025 Stanislav Popov
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
@@ -0,0 +1,102 @@
+ Metadata-Version: 2.4
+ Name: talks-reducer
+ Version: 0.1.0
+ Summary: CLI for speeding up long-form talks by removing silence
+ Author: Talks Reducer Maintainers
+ License: MIT License
+
+ Copyright (c) 2019 carykh
+ Copyright (c) 2020 gegell
+ Copyright (c) 2025 Stanislav Popov
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
+ Requires-Python: >=3.9
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: audiotsm>=0.1.2
+ Requires-Dist: scipy>=1.10.0
+ Requires-Dist: numpy<2.0.0,>=1.22.0
+ Requires-Dist: tqdm>=4.65.0
+ Dynamic: license-file
+
+ # Talks Reducer
+ Talks Reducer shortens long-form presentations by removing silent gaps and optionally re-encoding them to smaller files. The
+ project was renamed from **jumpcutter** to emphasize its focus on conference talks and lectures.
+
+ When CUDA-capable hardware is available, the pipeline leans on GPU encoders to keep export times low, but it still runs great on
+ CPUs.
+
+ ## Repository Structure
+ - `talks_reducer/` — Python package that exposes the CLI and reusable pipeline:
+   - `cli.py` parses arguments and dispatches to the pipeline.
+   - `pipeline.py` orchestrates FFmpeg, audio processing, and temporary assets.
+   - `audio.py` handles audio validation, volume analysis, and phase vocoder processing.
+   - `chunks.py` builds timing metadata and FFmpeg expressions for frame selection.
+   - `ffmpeg.py` discovers the FFmpeg binary, checks CUDA availability, and assembles command strings.
+ - `requirements.txt` — Python dependencies for local development.
+ - `default.nix` — reproducible environment definition for Nix users.
+ - `CONTRIBUTION.md` — development workflow, formatting expectations, and release checklist.
+ - `AGENTS.md` — maintainer tips and coding conventions for this repository.
+
+ ## Example
+ - 1h 37m, 571 MB — Original OBS video
+ - 1h 19m, 751 MB — Talks Reducer
+ - 1h 19m, 171 MB — Talks Reducer `--small`
+
+ The `--small` preset applies a 720p video scale and 128 kbps audio bitrate, making it useful for sharing talks over constrained
+ connections. Without `--small`, the script aims to preserve original quality while removing silence.
+
+ ## Highlights
+ - Builds on gegell's classic jumpcutter workflow with more efficient frame and audio processing
+ - Generates FFmpeg filter graphs instead of writing temporary frames to disk
+ - Streams audio transformations in memory to avoid slow intermediate files
+ - Accepts multiple inputs or directories of recordings in a single run
+ - Provides progress feedback via `tqdm`
+ - Automatically detects NVENC availability, so you no longer need to pass `--cuda`
+
+ ## Processing Pipeline
+ 1. Validate that each input file contains an audio stream using `ffprobe`.
+ 2. Extract audio and calculate loudness to identify silent regions.
+ 3. Stretch the non-silent segments with `audiotsm` to maintain speech clarity.
+ 4. Stitch the processed audio and video together with FFmpeg, using NVENC if the GPU encoders are detected.
+
+ ## Recent Updates
+ - **October 2025** — Project renamed to *Talks Reducer* across documentation and scripts.
+ - **October 2025** — Added `--small` preset with 720p/128 kbps defaults for bandwidth-friendly exports.
+ - **October 2025** — Removed the `--cuda` flag; CUDA/NVENC support is now auto-detected.
+ - **October 2025** — Improved `--small` encoder arguments to balance size and clarity.
+ - **October 2025** — CLI argument parsing fixes to prevent crashes on invalid combinations.
+ - **October 2025** — Added example output comparison to the README.
+
+ ## Quick Start
+ 1. Install FFmpeg and ensure it is on your `PATH`
+ 2. Install Talks Reducer with `pip install talks-reducer` (this exposes the `talks-reducer` command)
+ 3. Inspect available options with `talks-reducer --help`
+ 4. Process a recording using `talks-reducer /path/to/video`
+
+ ## Requirements
+ - Python 3.9+ with `numpy`, `scipy`, `audiotsm`, and `tqdm`
+ - FFmpeg with optional NVIDIA NVENC support for CUDA acceleration
+
+ ## Contributing
+ See `CONTRIBUTION.md` for development setup details and guidance on sharing improvements.
+
+ ## License
+ Talks Reducer is released under the MIT License. See `LICENSE` for the full text.
@@ -0,0 +1,64 @@
+ # Talks Reducer
+ Talks Reducer shortens long-form presentations by removing silent gaps and optionally re-encoding them to smaller files. The
+ project was renamed from **jumpcutter** to emphasize its focus on conference talks and lectures.
+
+ When CUDA-capable hardware is available, the pipeline leans on GPU encoders to keep export times low, but it still runs great on
+ CPUs.
+
+ ## Repository Structure
+ - `talks_reducer/` — Python package that exposes the CLI and reusable pipeline:
+   - `cli.py` parses arguments and dispatches to the pipeline.
+   - `pipeline.py` orchestrates FFmpeg, audio processing, and temporary assets.
+   - `audio.py` handles audio validation, volume analysis, and phase vocoder processing.
+   - `chunks.py` builds timing metadata and FFmpeg expressions for frame selection.
+   - `ffmpeg.py` discovers the FFmpeg binary, checks CUDA availability, and assembles command strings.
+ - `requirements.txt` — Python dependencies for local development.
+ - `default.nix` — reproducible environment definition for Nix users.
+ - `CONTRIBUTION.md` — development workflow, formatting expectations, and release checklist.
+ - `AGENTS.md` — maintainer tips and coding conventions for this repository.
+
+ ## Example
+ - 1h 37m, 571 MB — Original OBS video
+ - 1h 19m, 751 MB — Talks Reducer
+ - 1h 19m, 171 MB — Talks Reducer `--small`
+
+ The `--small` preset applies a 720p video scale and 128 kbps audio bitrate, making it useful for sharing talks over constrained
+ connections. Without `--small`, the script aims to preserve original quality while removing silence.
+
+ ## Highlights
+ - Builds on gegell's classic jumpcutter workflow with more efficient frame and audio processing
+ - Generates FFmpeg filter graphs instead of writing temporary frames to disk
+ - Streams audio transformations in memory to avoid slow intermediate files
+ - Accepts multiple inputs or directories of recordings in a single run
+ - Provides progress feedback via `tqdm`
+ - Automatically detects NVENC availability, so you no longer need to pass `--cuda`
+
+ ## Processing Pipeline
+ 1. Validate that each input file contains an audio stream using `ffprobe`.
+ 2. Extract audio and calculate loudness to identify silent regions.
+ 3. Stretch the non-silent segments with `audiotsm` to maintain speech clarity.
+ 4. Stitch the processed audio and video together with FFmpeg, using NVENC if the GPU encoders are detected.
+
+ ## Recent Updates
+ - **October 2025** — Project renamed to *Talks Reducer* across documentation and scripts.
+ - **October 2025** — Added `--small` preset with 720p/128 kbps defaults for bandwidth-friendly exports.
+ - **October 2025** — Removed the `--cuda` flag; CUDA/NVENC support is now auto-detected.
+ - **October 2025** — Improved `--small` encoder arguments to balance size and clarity.
+ - **October 2025** — CLI argument parsing fixes to prevent crashes on invalid combinations.
+ - **October 2025** — Added example output comparison to the README.
+
+ ## Quick Start
+ 1. Install FFmpeg and ensure it is on your `PATH`
+ 2. Install Talks Reducer with `pip install talks-reducer` (this exposes the `talks-reducer` command)
+ 3. Inspect available options with `talks-reducer --help`
+ 4. Process a recording using `talks-reducer /path/to/video`
+
+ ## Requirements
+ - Python 3.9+ with `numpy`, `scipy`, `audiotsm`, and `tqdm`
+ - FFmpeg with optional NVIDIA NVENC support for CUDA acceleration
+
+ ## Contributing
+ See `CONTRIBUTION.md` for development setup details and guidance on sharing improvements.
+
+ ## License
+ Talks Reducer is released under the MIT License. See `LICENSE` for the full text.
@@ -0,0 +1,33 @@
+ [build-system]
+ requires = ["setuptools>=64", "wheel"]
+ build-backend = "setuptools.build_meta"
+
+ [project]
+ name = "talks-reducer"
+ version = "0.1.0"
+ description = "CLI for speeding up long-form talks by removing silence"
+ readme = "README.md"
+ requires-python = ">=3.9"
+ license = { file = "LICENSE" }
+ authors = [
+     { name = "Talks Reducer Maintainers" }
+ ]
+ dependencies = [
+     "audiotsm>=0.1.2",
+     "scipy>=1.10.0",
+     "numpy>=1.22.0,<2.0.0",
+     "tqdm>=4.65.0",
+ ]
+
+ [project.scripts]
+ talks-reducer = "talks_reducer.cli:main"
+
+ [tool.black]
+ line-length = 88
+ target-version = ["py39"]
+
+ [tool.isort]
+ profile = "black"
+ line_length = 88
+ known_first_party = ["talks_reducer"]
+
@@ -0,0 +1,4 @@
+ [egg_info]
+ tag_build =
+ tag_date = 0
+
@@ -0,0 +1,7 @@
+ """talks_reducer exposes a CLI for speeding up videos with silent sections."""
+
+ from __future__ import annotations
+
+ from .cli import main
+
+ __all__ = ["main"]
@@ -0,0 +1,8 @@
+ """Module executed when running ``python -m talks_reducer``."""
+
+ from __future__ import annotations
+
+ from .cli import main
+
+ if __name__ == "__main__":
+     main()
@@ -0,0 +1,109 @@
+ """Audio processing helpers for the talks reducer pipeline."""
+
+ from __future__ import annotations
+
+ import math
+ import subprocess
+ from typing import List, Sequence, Tuple
+
+ import numpy as np
+ from audiotsm import phasevocoder
+ from audiotsm.io.array import ArrayReader, ArrayWriter
+
+
+ def get_max_volume(samples: np.ndarray) -> float:
+     """Return the maximum absolute volume in the provided sample array."""
+
+     return float(max(-np.min(samples), np.max(samples)))
+
+
+ def is_valid_input_file(filename: str) -> bool:
+     """Check whether ``ffprobe`` recognises the input file and finds an audio stream."""
+
+     command = [
+         "ffprobe", "-i", filename, "-hide_banner", "-loglevel", "error",
+         "-select_streams", "a", "-show_entries", "stream=codec_type",
+     ]
+     process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+     outs, errs = None, None
+     try:
+         outs, errs = process.communicate(timeout=1)
+     except subprocess.TimeoutExpired:
+         print("Timeout while checking the input file. Aborting. Command:")
+         print(command)
+         process.kill()
+         outs, errs = process.communicate()
+     finally:
+         return len(errs) == 0 and len(outs) > 0
+
+
+ def process_audio_chunks(
+     audio_data: np.ndarray,
+     chunks: Sequence[Sequence[int]],
+     samples_per_frame: float,
+     speeds: Sequence[float],
+     audio_fade_envelope_size: int,
+     max_audio_volume: float,
+     *,
+     batch_size: int = 10,
+ ) -> Tuple[np.ndarray, List[List[int]]]:
+     """Return processed audio and updated chunk timings for the provided chunk list."""
+
+     audio_buffers: List[np.ndarray] = []
+     output_pointer = 0
+     updated_chunks: List[List[int]] = [list(chunk) for chunk in chunks]
+     normaliser = max(max_audio_volume, 1e-9)
+
+     for batch_start in range(0, len(chunks), batch_size):
+         batch_chunks = chunks[batch_start : batch_start + batch_size]
+         batch_audio: List[np.ndarray] = []
+
+         for chunk in batch_chunks:
+             start = int(chunk[0] * samples_per_frame)
+             end = int(chunk[1] * samples_per_frame)
+             audio_chunk = audio_data[start:end]
+
+             if audio_chunk.size == 0:
+                 channels = audio_data.shape[1] if audio_data.ndim > 1 else 1
+                 batch_audio.append(np.zeros((0, channels)))
+                 continue
+
+             reader = ArrayReader(np.transpose(audio_chunk))
+             writer = ArrayWriter(reader.channels)
+             tsm = phasevocoder(reader.channels, speed=speeds[int(chunk[2])])
+             tsm.run(reader, writer)
+             altered_audio_data = np.transpose(writer.data)
+
+             if altered_audio_data.shape[0] < audio_fade_envelope_size:
+                 altered_audio_data[:] = 0
+             else:
+                 premask = np.arange(audio_fade_envelope_size) / audio_fade_envelope_size
+                 mask = np.repeat(
+                     premask[:, np.newaxis], altered_audio_data.shape[1], axis=1
+                 )
+                 altered_audio_data[:audio_fade_envelope_size] *= mask
+                 altered_audio_data[-audio_fade_envelope_size:] *= 1 - mask
+
+             batch_audio.append(altered_audio_data / normaliser)
+
+         for index, chunk in enumerate(batch_chunks):
+             altered_audio_data = batch_audio[index]
+             audio_buffers.append(altered_audio_data)
+
+             end_pointer = output_pointer + altered_audio_data.shape[0]
+             start_output_frame = int(math.ceil(output_pointer / samples_per_frame))
+             end_output_frame = int(math.ceil(end_pointer / samples_per_frame))
+
+             updated_chunks[batch_start + index] = list(chunk[:2]) + [
+                 start_output_frame,
+                 end_output_frame,
+             ]
+             output_pointer = end_pointer
+
+     if audio_buffers:
+         output_audio_data = np.concatenate(audio_buffers)
+     else:
+         channels = audio_data.shape[1] if audio_data.ndim > 1 else 1
+         output_audio_data = np.zeros((0, channels))
+
+     return output_audio_data, updated_chunks
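Review note: the fade step in `process_audio_chunks` is easier to follow on a tiny mono example. The sketch below is a pure-Python stand-in, not the package's code: `apply_fade` is a hypothetical helper name, and it works on a plain list instead of a NumPy array. It ramps the first `envelope_size` samples up from zero, ramps the last `envelope_size` samples down, and mutes chunks shorter than the envelope, as the NumPy version does.

```python
from typing import List


def apply_fade(samples: List[float], envelope_size: int) -> List[float]:
    """Fade a mono chunk in and out linearly (hypothetical helper for illustration)."""
    # Chunks shorter than the envelope are muted, mirroring `altered_audio_data[:] = 0`.
    if len(samples) < envelope_size:
        return [0.0] * len(samples)

    faded = list(samples)
    for i in range(envelope_size):
        ramp = i / envelope_size  # 0, 1/n, ..., (n-1)/n, like np.arange(n) / n
        faded[i] *= ramp  # fade in at the start of the chunk
        faded[len(faded) - envelope_size + i] *= 1 - ramp  # fade out at the end
    return faded


if __name__ == "__main__":
    print(apply_fade([1.0] * 8, 4))
    # [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]
```

The ramps presumably exist to avoid audible clicks where chunks played at different speeds are joined; the source does not state the motivation, so treat that as an assumption.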
@@ -0,0 +1,92 @@
+ """Chunk creation utilities used by the talks reducer pipeline."""
+
+ from __future__ import annotations
+
+ from typing import List, Sequence, Tuple
+
+ import numpy as np
+
+ from .audio import get_max_volume
+
+
+ def detect_loud_frames(
+     audio_data: np.ndarray,
+     audio_frame_count: int,
+     samples_per_frame: float,
+     max_audio_volume: float,
+     silent_threshold: float,
+ ) -> np.ndarray:
+     """Return a boolean array indicating which frames contain loud audio."""
+
+     normaliser = max(max_audio_volume, 1e-9)
+     has_loud_audio = np.zeros(audio_frame_count, dtype=bool)
+
+     for frame_index in range(audio_frame_count):
+         start = int(frame_index * samples_per_frame)
+         end = min(int((frame_index + 1) * samples_per_frame), audio_data.shape[0])
+         audio_chunk = audio_data[start:end]
+         chunk_max_volume = float(get_max_volume(audio_chunk)) / normaliser
+         if chunk_max_volume >= silent_threshold:
+             has_loud_audio[frame_index] = True
+
+     return has_loud_audio
+
+
+ def build_chunks(
+     has_loud_audio: np.ndarray, frame_spreadage: int
+ ) -> Tuple[List[List[int]], np.ndarray]:
+     """Return chunks describing which frame ranges should be retained."""
+
+     audio_frame_count = len(has_loud_audio)
+     chunks: List[List[int]] = [[0, 0, 0]]
+     should_include_frame = np.zeros(audio_frame_count, dtype=bool)
+
+     for frame_index in range(audio_frame_count):
+         start = int(max(0, frame_index - frame_spreadage))
+         end = int(min(audio_frame_count, frame_index + 1 + frame_spreadage))
+         should_include_frame[frame_index] = np.any(has_loud_audio[start:end])
+         if (
+             frame_index >= 1
+             and should_include_frame[frame_index]
+             != should_include_frame[frame_index - 1]
+         ):
+             chunks.append(
+                 [chunks[-1][1], frame_index, int(should_include_frame[frame_index - 1])]
+             )
+
+     chunks.append(
+         [
+             chunks[-1][1],
+             audio_frame_count,
+             int(should_include_frame[audio_frame_count - 1]),
+         ]
+     )
+     return chunks[1:], should_include_frame
+
+
+ def get_tree_expression(chunks: Sequence[Sequence[int]]) -> str:
+     """Return the FFmpeg expression needed to map chunk timing updates."""
+
+     return "{}/TB/FR".format(_get_tree_expression_rec(chunks))
+
+
+ def _get_tree_expression_rec(chunks: Sequence[Sequence[int]]) -> str:
+     if len(chunks) > 1:
+         split_index = int(len(chunks) / 2)
+         center = chunks[split_index]
+         return "if(lt(N,{}),{},{})".format(
+             center[0],
+             _get_tree_expression_rec(chunks[:split_index]),
+             _get_tree_expression_rec(chunks[split_index:]),
+         )
+     chunk = chunks[0]
+     local_speedup = (chunk[3] - chunk[2]) / (chunk[1] - chunk[0])
+     offset = -chunk[0] * local_speedup + chunk[2]
+     return "N*{}{:+}".format(local_speedup, offset)
+
+
+ __all__ = [
+     "detect_loud_frames",
+     "build_chunks",
+     "get_tree_expression",
+ ]
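Review note: the nested `if(lt(N,...),...,...)` string built by `_get_tree_expression_rec` is a binary search over chunks followed by a linear map within the matching chunk. The sketch below evaluates the same mapping directly in Python so the arithmetic can be checked by hand. `map_frame` is a hypothetical name introduced only for illustration, and the sample chunk values are made up. Each chunk is assumed to be `[source_start, source_end, output_start, output_end]`, the shape `process_audio_chunks` writes into `updated_chunks`.

```python
from typing import Sequence


def map_frame(n: int, chunks: Sequence[Sequence[int]]) -> float:
    """Map source frame n to its output frame position (illustrative helper)."""
    if len(chunks) > 1:
        # Same split rule as the expression builder: compare against the
        # first source frame of the middle chunk and recurse into one half.
        split_index = len(chunks) // 2
        if n < chunks[split_index][0]:
            return map_frame(n, chunks[:split_index])
        return map_frame(n, chunks[split_index:])

    source_start, source_end, output_start, output_end = chunks[0]
    local_speedup = (output_end - output_start) / (source_end - source_start)
    # Equivalent to N * local_speedup + offset with
    # offset = -source_start * local_speedup + output_start.
    return (n - source_start) * local_speedup + output_start


if __name__ == "__main__":
    # Made-up data: frames 0-100 kept at 1x, frames 100-200 squeezed into 25 frames.
    example_chunks = [[0, 100, 0, 100], [100, 200, 100, 125]]
    print(map_frame(50, example_chunks))   # 50.0
    print(map_frame(150, example_chunks))  # 112.5
```

Because the tree halves the chunk list at every level, the depth of the generated expression grows with the logarithm of the chunk count rather than linearly.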
@@ -0,0 +1,129 @@
+ """Command line interface for the talks reducer package."""
+
+ from __future__ import annotations
+
+ import argparse
+ import os
+ import time
+ from typing import Dict, List
+
+ from . import audio
+ from .pipeline import speed_up_video
+
+
+ def _build_parser() -> argparse.ArgumentParser:
+     """Create the argument parser used by the command line interface."""
+
+     parser = argparse.ArgumentParser(
+         description="Modifies a video file to play at different speeds when there is sound vs. silence.",
+     )
+     parser.add_argument(
+         "input_file",
+         type=str,
+         nargs="+",
+         help="The video file(s) you want modified. Can be one or more directories and / or single files.",
+     )
+     parser.add_argument(
+         "-o",
+         "--output_file",
+         type=str,
+         dest="output_file",
+         help="The output file. Only usable if a single file is given. If not included, it'll append _speedup (or _speedup_small with --small) to the name.",
+     )
+     parser.add_argument(
+         "--temp_folder",
+         type=str,
+         default="TEMP",
+         help="The file path of the temporary working folder.",
+     )
+     parser.add_argument(
+         "-t",
+         "--silent_threshold",
+         type=float,
+         dest="silent_threshold",
+         help="The volume amount that frames' audio needs to surpass to be considered sounded. Defaults to 0.03.",
+     )
+     parser.add_argument(
+         "-S",
+         "--sounded_speed",
+         type=float,
+         dest="sounded_speed",
+         help="The speed that sounded (spoken) frames should be played at. Defaults to 1.",
+     )
+     parser.add_argument(
+         "-s",
+         "--silent_speed",
+         type=float,
+         dest="silent_speed",
+         help="The speed that silent frames should be played at. Defaults to 4.",
+     )
+     parser.add_argument(
+         "-fm",
+         "--frame_margin",
+         type=float,
+         dest="frame_spreadage",
+         help="Some silent frames adjacent to sounded frames are included to provide context. Defaults to 2.",
+     )
+     parser.add_argument(
+         "-sr",
+         "--sample_rate",
+         type=float,
+         dest="sample_rate",
+         help="Sample rate of the input and output videos. Usually extracted automatically by FFmpeg.",
+     )
+     parser.add_argument(
+         "--small",
+         action="store_true",
+         help="Apply small file optimizations: resize video to 720p, audio to 128k bitrate, best compression (uses CUDA if available).",
+     )
+     return parser
+
+
+ def _gather_input_files(paths: List[str]) -> List[str]:
+     """Expand provided paths into a flat list of files that contain audio streams."""
+
+     files: List[str] = []
+     for input_path in paths:
+         if os.path.isfile(input_path) and audio.is_valid_input_file(input_path):
+             files.append(os.path.abspath(input_path))
+         elif os.path.isdir(input_path):
+             for file in os.listdir(input_path):
+                 candidate = os.path.join(input_path, file)
+                 if audio.is_valid_input_file(candidate):
+                     files.append(candidate)
+     return files
+
+
+ def main() -> None:
+     """Entry point for the command line interface."""
+
+     parser = _build_parser()
+     parsed_args = parser.parse_args()
+     start_time = time.time()
+
+     files = _gather_input_files(parsed_args.input_file)
+
+     args: Dict[str, object] = {
+         k: v for k, v in vars(parsed_args).items() if v is not None
+     }
+     del args["input_file"]
+
+     if len(files) > 1 and "output_file" in args:
+         del args["output_file"]
+
+     for index, file in enumerate(files):
+         print(f"Processing file {index + 1}/{len(files)} '{os.path.basename(file)}'")
+         local_options = dict(args)
+         local_options["input_file"] = file
+         local_options["small"] = bool(local_options.get("small", False))
+         speed_up_video(**local_options)
+
+     end_time = time.time()
+     total_time = end_time - start_time
+     hours, remainder = divmod(total_time, 3600)
+     minutes, seconds = divmod(remainder, 60)
+     print(f"\nTime: {int(hours)}h {int(minutes)}m {seconds:.2f}s")
+
+
+ if __name__ == "__main__":
+     main()
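Review note: `main()` drops every option whose parsed value is `None` before calling `speed_up_video(**local_options)`, so the defaults declared in the pipeline function's signature apply whenever a flag is omitted. The sketch below reproduces that pattern in isolation. `collect_overrides` and `speed_up` are hypothetical stand-ins, and only three of the real options are modelled.

```python
import argparse
from typing import Dict, List, Optional


def collect_overrides(argv: Optional[List[str]] = None) -> Dict[str, object]:
    """Parse a reduced option set and keep only values the user supplied."""
    parser = argparse.ArgumentParser()
    # No `default=` is given, so omitted options parse to None.
    parser.add_argument("-s", "--silent_speed", type=float, dest="silent_speed")
    parser.add_argument("-S", "--sounded_speed", type=float, dest="sounded_speed")
    parser.add_argument("--small", action="store_true")  # parses to False, never None
    parsed = parser.parse_args(argv)
    return {key: value for key, value in vars(parsed).items() if value is not None}


def speed_up(silent_speed: float = 4.0, sounded_speed: float = 1.0, small: bool = False) -> str:
    """Stand-in for the pipeline entry point; the defaults live in the signature."""
    return f"silent={silent_speed} sounded={sounded_speed} small={small}"


if __name__ == "__main__":
    overrides = collect_overrides(["-s", "6"])
    print(overrides)              # {'silent_speed': 6.0, 'small': False}
    print(speed_up(**overrides))  # silent=6.0 sounded=1.0 small=False
```

One consequence of this design is that the default values quoted in the help strings can drift from the signature defaults, since they are maintained in two places.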
@@ -0,0 +1,269 @@
+ """Utilities for discovering and invoking FFmpeg commands."""
+
+ from __future__ import annotations
+
+ import os
+ import re
+ import subprocess
+ import sys
+ from functools import partial
+ from typing import List, Optional, Tuple
+
+ from tqdm import tqdm as std_tqdm
+
+
+ def shutil_which(cmd: str) -> Optional[str]:
+     """Wrapper around :func:`shutil.which` for easier testing."""
+
+     from shutil import which as _which
+
+     return _which(cmd)
+
+
+ def find_ffmpeg() -> Optional[str]:
+     """Locate the FFmpeg executable in common installation locations."""
+
+     env_override = os.environ.get("TALKS_REDUCER_FFMPEG") or os.environ.get(
+         "FFMPEG_PATH"
+     )
+     if env_override and (os.path.isfile(env_override) or shutil_which(env_override)):
+         return (
+             os.path.abspath(env_override)
+             if os.path.isfile(env_override)
+             else env_override
+         )
+
+     common_paths = [
+         "C:\\ProgramData\\chocolatey\\bin\\ffmpeg.exe",
+         "C:\\Program Files\\ffmpeg\\bin\\ffmpeg.exe",
+         "C:\\ffmpeg\\bin\\ffmpeg.exe",
+         "ffmpeg",
+     ]
+
+     for path in common_paths:
+         if os.path.isfile(path) or shutil_which(path):
+             return os.path.abspath(path) if os.path.isfile(path) else path
+
+     return None
+
+
+ def _resolve_ffmpeg_path() -> str:
+     """Resolve the FFmpeg executable path or exit with a helpful message."""
+
+     ffmpeg_path = find_ffmpeg()
+     if not ffmpeg_path:
+         print(
+             "Error: FFmpeg not found. Please install FFmpeg and add it to your PATH or specify the full path.",
+             file=sys.stderr,
+         )
+         raise SystemExit(1)
+
+     print(f"Using FFmpeg at: {ffmpeg_path}")
+     return ffmpeg_path
+
+
+ FFMPEG_PATH = _resolve_ffmpeg_path()
+
+ tqdm = partial(
+     std_tqdm,
+     bar_format=(
+         "{desc:<20} {percentage:3.0f}%"
+         "|{bar:10}|"
+         " {n_fmt:>6}/{total_fmt:>6} [{elapsed:^5}<{remaining:^5}, {rate_fmt}{postfix}]"
+     ),
+ )
+
+
+ def check_cuda_available(ffmpeg_path: str = FFMPEG_PATH) -> bool:
+     """Return whether CUDA hardware encoders are available in the FFmpeg build."""
+
+     try:
+         result = subprocess.run(
+             [ffmpeg_path, "-encoders"], capture_output=True, text=True, timeout=5
+         )
+     except (
+         subprocess.TimeoutExpired,
+         subprocess.CalledProcessError,
+         FileNotFoundError,
+     ):
+         return False
+
+     if result.returncode != 0:
+         return False
+
+     encoder_list = result.stdout.lower()
+     return any(
+         encoder in encoder_list for encoder in ["h264_nvenc", "hevc_nvenc", "nvenc"]
+     )
+
+
+ def run_timed_ffmpeg_command(command: str, **kwargs) -> None:
+     """Execute an FFmpeg command while streaming progress information to ``tqdm``."""
+
+     import shlex
+
+     try:
+         args = shlex.split(command)
+     except Exception as exc:  # pragma: no cover - defensive logging
+         print(f"Error parsing command: {exc}", file=sys.stderr)
+         raise
+
+     try:
+         process = subprocess.Popen(
+             args,
+             stdout=subprocess.PIPE,
+             stderr=subprocess.PIPE,
+             universal_newlines=True,
+             bufsize=1,
+             errors="replace",
+         )
+     except Exception as exc:  # pragma: no cover - defensive logging
+         print(f"Error starting FFmpeg: {exc}", file=sys.stderr)
+         raise
+
+     with tqdm(**kwargs) as progress:
+         while True:
+             line = process.stderr.readline()
+             if not line and process.poll() is not None:
+                 break
+
+             if not line:
+                 continue
+
+             sys.stderr.write(line)
+             sys.stderr.flush()
+
+             match = re.search(r"frame=\s*(\d+)", line)
+             if match:
+                 try:
+                     new_frame = int(match.group(1))
+                     if progress.total < new_frame:
+                         progress.total = new_frame
+                     progress.update(new_frame - progress.n)
+                 except (ValueError, IndexError):
+                     pass
+
+         process.wait()
+
+         if process.returncode != 0:
+             error_output = process.stderr.read()
+             print(
+                 f"\nFFmpeg error (return code {process.returncode}):", file=sys.stderr
+             )
+             print(error_output, file=sys.stderr)
+             raise subprocess.CalledProcessError(process.returncode, args)
+
+         if progress.n < progress.total:
+             progress.update(progress.total - progress.n)
+
+
+ def build_extract_audio_command(
+     input_file: str,
+     output_wav: str,
+     sample_rate: int,
+     audio_bitrate: str,
+     hwaccel: Optional[List[str]] = None,
+     ffmpeg_path: str = FFMPEG_PATH,
+ ) -> str:
+     """Build the FFmpeg command used to extract audio into a temporary WAV file."""
+
+     hwaccel = hwaccel or []
+     command_parts: List[str] = [f'"{ffmpeg_path}"']
+     command_parts.extend(hwaccel)
+     command_parts.extend(
+         [
+             f'-i "{input_file}"',
+             f"-ab {audio_bitrate} -ac 2",
+             f"-ar {sample_rate}",
+             "-vn",
+             f'"{output_wav}"',
+             "-hide_banner -loglevel warning -stats",
+         ]
+     )
+     return " ".join(command_parts)
+
+
+ def build_video_commands(
+     input_file: str,
+     audio_file: str,
+     filter_script: str,
+     output_file: str,
+     *,
+     ffmpeg_path: str = FFMPEG_PATH,
+     cuda_available: bool,
+     small: bool,
+ ) -> Tuple[str, Optional[str], bool]:
+     """Create the FFmpeg command strings used to render the final video output."""
+
+     global_parts: List[str] = [f'"{ffmpeg_path}"', "-y"]
+     hwaccel_args: List[str] = []
+
+     if cuda_available and not small:
+         hwaccel_args = ["-hwaccel", "cuda", "-hwaccel_output_format", "cuda"]
+         global_parts.extend(hwaccel_args)
+     elif small and cuda_available:
+         pass
+
+     input_parts = [f'-i "{input_file}"', f'-i "{audio_file}"']
+
+     output_parts = [
+         "-map 0 -map -0:a -map 1:a",
+         f'-filter_script:v "{filter_script}"',
+     ]
+
+     video_encoder_args: List[str]
+     fallback_encoder_args: List[str] = []
+     use_cuda_encoder = False
+
+     if small:
+         if cuda_available:
+             use_cuda_encoder = True
+             video_encoder_args = [
+                 "-c:v h264_nvenc",
+                 "-preset p1",
+                 "-cq 28",
+                 "-tune",
+                 "ll",
+             ]
+             fallback_encoder_args = [
+                 "-c:v libx264",
+                 "-preset veryfast",
+                 "-crf 24",
+                 "-tune",
+                 "zerolatency",
+             ]
+         else:
+             video_encoder_args = [
+                 "-c:v libx264",
+                 "-preset veryfast",
+                 "-crf 24",
+                 "-tune",
+                 "zerolatency",
+             ]
+     else:
+         global_parts.append("-filter_complex_threads 1")
+         if cuda_available:
+             video_encoder_args = ["-c:v h264_nvenc"]
+             use_cuda_encoder = True
+         else:
+             video_encoder_args = ["-c:v copy"]
+
+     audio_parts = ["-c:a aac", f'"{output_file}"', "-loglevel info -stats -hide_banner"]
+
+     full_command_parts = (
+         global_parts + input_parts + output_parts + video_encoder_args + audio_parts
+     )
+     command_str = " ".join(full_command_parts)
+
+     fallback_command_str: Optional[str] = None
+     if fallback_encoder_args:
+         fallback_parts = (
+             global_parts
+             + input_parts
+             + output_parts
+             + fallback_encoder_args
+             + audio_parts
+         )
+         fallback_command_str = " ".join(fallback_parts)
+
+     return command_str, fallback_command_str, use_cuda_encoder
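Review note: `run_timed_ffmpeg_command` derives progress from the `frame=` counter in FFmpeg's `-stats` output on stderr. The sketch below isolates that parsing step so it can be exercised without launching FFmpeg. `latest_frame` is a hypothetical helper, and the sample lines are hand-written approximations of a stats line, not captured output.

```python
import re
from typing import Iterable, Optional

# Same pattern as the source: "frame=" followed by optional padding and digits.
FRAME_PATTERN = re.compile(r"frame=\s*(\d+)")


def latest_frame(lines: Iterable[str]) -> Optional[int]:
    """Return the most recent frame counter seen in the given lines, if any."""
    latest = None
    for line in lines:
        match = FRAME_PATTERN.search(line)
        if match:
            latest = int(match.group(1))
    return latest


if __name__ == "__main__":
    sample = [
        "Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'talk.mp4':",  # no counter on this line
        "frame=   60 fps= 30 q=28.0 size=     256kB time=00:00:02.00",
        "frame=  120 fps= 30 q=28.0 size=     512kB time=00:00:04.00",
    ]
    print(latest_frame(sample))  # 120
```

In the package, the difference between the new counter and `progress.n` is what gets passed to `progress.update`, so the bar always reflects the absolute frame count rather than an accumulated delta.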
@@ -0,0 +1,244 @@
1
+ """High-level pipeline orchestration for Talks Reducer."""
2
+
3
+ from __future__ import annotations
4
+
5
+ import math
6
+ import os
7
+ import re
8
+ import subprocess
9
+ from typing import Dict, Optional
10
+
11
+ import numpy as np
12
+ from scipy.io import wavfile
13
+
14
+ from . import audio as audio_utils
15
+ from . import chunks as chunk_utils
16
+ from .ffmpeg import (
17
+ FFMPEG_PATH,
18
+ build_extract_audio_command,
19
+ build_video_commands,
20
+ check_cuda_available,
21
+ run_timed_ffmpeg_command,
22
+ )
23
+
24
+
25
+ def _input_to_output_filename(filename: str, small: bool = False) -> str:
26
+ dot_index = filename.rfind(".")
27
+ suffix = "_speedup_small" if small else "_speedup"
28
+ return filename[:dot_index] + suffix + filename[dot_index:]
29
+
30
+
31
+ def _create_path(path: str) -> None:
32
+ try:
33
+ os.mkdir(path)
34
+ except OSError as exc: # pragma: no cover - defensive logging
35
+ raise AssertionError(
36
+ "Creation of the directory failed. (The TEMP folder may already exist. Delete or rename it, and try again.)"
37
+ ) from exc
38
+
39
+
40
+ def _delete_path(path: str) -> None:
41
+ import time
42
+ from shutil import rmtree
43
+
44
+ try:
45
+ rmtree(path, ignore_errors=False)
46
+ for i in range(5):
47
+ if not os.path.exists(path):
48
+ return
49
+ time.sleep(0.01 * i)
50
+ except OSError as exc: # pragma: no cover - defensive logging
51
+ print(f"Deletion of the directory {path} failed")
52
+ print(exc)
53
+
54
+
55
+ def _extract_video_metadata(input_file: str, frame_rate: float) -> Dict[str, float]:
56
+ command = [
57
+ "ffprobe", "-i", input_file, "-hide_banner", "-loglevel", "error", "-select_streams", "v",
58
+ "-show_entries", "format=duration:stream=avg_frame_rate",
59
+ ]
60
+ process = subprocess.Popen(
61
+ command,
62
+ stdout=subprocess.PIPE,
63
+ stderr=subprocess.PIPE,
64
+ bufsize=1,
65
+ universal_newlines=True,
66
+ )
67
+ stdout, _ = process.communicate()
68
+
69
+ match_frame_rate = re.search(r"frame_rate=([1-9]\d*)/([1-9]\d*)", str(stdout))
70
+ if match_frame_rate is not None:
71
+ frame_rate = float(match_frame_rate.group(1)) / float(match_frame_rate.group(2))
72
+
73
+ match_duration = re.search(r"duration=(\d+(?:\.\d+)?)", str(stdout))
74
+ original_duration = float(match_duration.group(1)) if match_duration else 0.0
75
+
76
+ return {"frame_rate": frame_rate, "duration": original_duration}
77
+
78
+
79
+ def _ensure_two_dimensional(audio_data: np.ndarray) -> np.ndarray:
80
+ if audio_data.ndim == 1:
81
+ return audio_data[:, np.newaxis]
82
+ return audio_data
83
+
84
+
85
+ def _prepare_output_audio(output_audio_data: np.ndarray) -> np.ndarray:
86
+ if output_audio_data.ndim == 2 and output_audio_data.shape[1] == 1:
87
+ return output_audio_data[:, 0]
88
+ return output_audio_data
89
+
90
+
91
+ def speed_up_video(
92
+ input_file: str,
93
+ output_file: Optional[str] = None,
94
+ frame_rate: float = 30,
95
+ sample_rate: int = 44100,
96
+ silent_threshold: float = 0.03,
97
+ silent_speed: float = 4.0,
98
+ sounded_speed: float = 1.0,
99
+ frame_spreadage: int = 2,
100
+ audio_fade_envelope_size: int = 400,
101
+ temp_folder: str = "TEMP",
102
+ small: bool = False,
103
+ ) -> None:
104
+ """Speed up a video by shortening silent sections while keeping sounded sections intact."""
105
+
106
+ if output_file is None:
107
+ output_file = _input_to_output_filename(input_file, small)
108
+
109
+ cuda_available = check_cuda_available()
110
+
111
+ if os.path.exists(temp_folder):
112
+ _delete_path(temp_folder)
113
+ _create_path(temp_folder)
114
+
115
+ metadata = _extract_video_metadata(input_file, frame_rate)
116
+ frame_rate = metadata["frame_rate"]
117
+ original_duration = metadata["duration"]
118
+
119
+ hwaccel = (
120
+ ["-hwaccel", "cuda", "-hwaccel_output_format", "cuda"] if cuda_available else []
121
+ )
122
+ audio_bitrate = "128k" if small else "160k"
123
+ audio_wav = os.path.join(temp_folder, "audio.wav")
124
+
125
+ extract_command = build_extract_audio_command(
126
+ input_file,
127
+ audio_wav,
128
+ sample_rate,
129
+ audio_bitrate,
130
+ hwaccel,
131
+ )
132
+
133
+ run_timed_ffmpeg_command(
134
+ extract_command,
135
+ total=int(original_duration * frame_rate),
136
+ unit="frames",
137
+ desc="Extracting audio:",
138
+ )
139
+
140
+ wav_sample_rate, audio_data = wavfile.read(audio_wav)
141
+ audio_data = _ensure_two_dimensional(audio_data)
142
+ audio_sample_count = audio_data.shape[0]
143
+ max_audio_volume = audio_utils.get_max_volume(audio_data)
144
+
145
+ print("\nProcessing Information:")
146
+ print(f"- Max Audio Volume: {max_audio_volume}")
147
+ print(f"- Processing on: {'GPU (CUDA)' if cuda_available else 'CPU'}")
148
+ if small:
149
+ print("- Small mode: 720p video, 128k audio, optimized compression")
150
+
151
+ samples_per_frame = wav_sample_rate / frame_rate
152
+ audio_frame_count = int(math.ceil(audio_sample_count / samples_per_frame))
153
+
154
+ has_loud_audio = chunk_utils.detect_loud_frames(
155
+ audio_data,
156
+ audio_frame_count,
157
+ samples_per_frame,
158
+ max_audio_volume,
159
+ silent_threshold,
160
+ )
161
+
162
+ chunks, _ = chunk_utils.build_chunks(has_loud_audio, frame_spreadage)
163
+
164
+ print(f"Generated {len(chunks)} chunks:")
165
+ for index, chunk in enumerate(chunks[:5]):
166
+ print(f" Chunk {index}: {chunk}")
167
+ if len(chunks) > 5:
168
+ print(f" ... and {len(chunks) - 5} more chunks")
169
+
170
+ new_speeds = [silent_speed, sounded_speed]
171
+ output_audio_data, updated_chunks = audio_utils.process_audio_chunks(
172
+ audio_data,
173
+ chunks,
174
+ samples_per_frame,
175
+ new_speeds,
176
+ audio_fade_envelope_size,
177
+ max_audio_volume,
178
+ )
179
+
180
+ audio_new_path = os.path.join(temp_folder, "audioNew.wav")
181
+ wavfile.write(audio_new_path, sample_rate, _prepare_output_audio(output_audio_data))
182
+
183
+ expression = chunk_utils.get_tree_expression(updated_chunks)
184
+ filter_graph_path = os.path.join(temp_folder, "filterGraph.txt")
185
+ with open(filter_graph_path, "w", encoding="utf-8") as filter_graph_file:
186
+ filter_parts = []
187
+ if small:
188
+ filter_parts.append("scale=-2:720")
189
+ filter_parts.append(f"fps=fps={frame_rate}")
190
+ filter_parts.append("setpts=" + expression.replace(",", "\\,"))
191
+ filter_graph_file.write(",".join(filter_parts))
192
+
193
+ command_str, fallback_command_str, use_cuda_encoder = build_video_commands(
194
+ input_file,
195
+ audio_new_path,
196
+ filter_graph_path,
197
+ output_file,
198
+ ffmpeg_path=FFMPEG_PATH,
199
+ cuda_available=cuda_available,
200
+ small=small,
201
+ )
202
+
203
+ output_dir = os.path.dirname(os.path.abspath(output_file))
204
+ if output_dir and not os.path.exists(output_dir):
205
+ print(f"Creating output directory: {output_dir}")
206
+ os.makedirs(output_dir, exist_ok=True)
207
+
208
+ print("\nExecuting FFmpeg command:")
209
+ print(command_str)
210
+
211
+ if not os.path.exists(audio_new_path):
212
+ print("ERROR: Audio file not found!")
213
+ _delete_path(temp_folder)
214
+ return
215
+
216
+ if not os.path.exists(filter_graph_path):
217
+ print("ERROR: Filter file not found!")
218
+ _delete_path(temp_folder)
219
+ return
220
+
221
+ try:
222
+ run_timed_ffmpeg_command(
223
+ command_str,
224
+ total=updated_chunks[-1][3],
225
+ unit="frames",
226
+ desc="Generating final:",
227
+ )
228
+ except subprocess.CalledProcessError as exc:
229
+ if fallback_command_str and use_cuda_encoder:
230
+ print("CUDA encoding failed, retrying with CPU encoder...")
231
+ run_timed_ffmpeg_command(
232
+ fallback_command_str,
233
+ total=updated_chunks[-1][3],
234
+ unit="frames",
235
+ desc="Generating final (fallback):",
236
+ )
237
+ else:
238
+ print(f"\nError running FFmpeg command: {exc}")
239
+ print(
240
+ "Please check if all input files exist and FFmpeg has proper permissions."
241
+ )
242
+ raise
243
+ finally:
244
+ _delete_path(temp_folder)
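The metadata step in `pipeline.py` runs `ffprobe` and pulls `avg_frame_rate` and `duration` out of its text output with regular expressions. The sketch below shows one way that parsing could be isolated into a pure function so it can be checked without running `ffprobe`. The helper name `parse_ffprobe_output` and the sample text are hypothetical and are not part of the package.

```python
import re
from typing import Dict


def parse_ffprobe_output(text: str, default_frame_rate: float = 30.0) -> Dict[str, float]:
    """Parse `avg_frame_rate=N/D` and `duration=S` entries (hypothetical helper)."""
    frame_rate = default_frame_rate
    match = re.search(r"avg_frame_rate=(\d+)/(\d+)", text)
    # Keep the default when ffprobe reports an unusable rate such as 0/0.
    if match and int(match.group(1)) > 0 and int(match.group(2)) > 0:
        frame_rate = int(match.group(1)) / int(match.group(2))

    duration_match = re.search(r"duration=(\d+(?:\.\d+)?)", text)
    duration = float(duration_match.group(1)) if duration_match else 0.0
    return {"frame_rate": frame_rate, "duration": duration}


# Illustrative text in the shape of ffprobe's default writer; not real output.
sample = (
    "[STREAM]\navg_frame_rate=30000/1001\n[/STREAM]\n"
    "[FORMAT]\nduration=5820.480000\n[/FORMAT]\n"
)
info = parse_ffprobe_output(sample)
```

Keeping the parsing separate from the subprocess call means edge cases such as `0/0` or `duration=N/A` can be exercised with plain strings.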
@@ -0,0 +1,102 @@
1
+ Metadata-Version: 2.4
2
+ Name: talks-reducer
3
+ Version: 0.1.0
4
+ Summary: CLI for speeding up long-form talks by removing silence
5
+ Author: Talks Reducer Maintainers
6
+ License: MIT License
7
+
8
+ Copyright (c) 2019 carykh
9
+ Copyright (c) 2020 gegell
10
+ Copyright (c) 2025 Stanislav Popov
11
+
12
+ Permission is hereby granted, free of charge, to any person obtaining a copy
13
+ of this software and associated documentation files (the "Software"), to deal
14
+ in the Software without restriction, including without limitation the rights
15
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
16
+ copies of the Software, and to permit persons to whom the Software is
17
+ furnished to do so, subject to the following conditions:
18
+
19
+ The above copyright notice and this permission notice shall be included in all
20
+ copies or substantial portions of the Software.
21
+
22
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
23
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
24
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
25
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
26
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
27
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
28
+ SOFTWARE.
29
+
30
+ Requires-Python: >=3.9
31
+ Description-Content-Type: text/markdown
32
+ License-File: LICENSE
33
+ Requires-Dist: audiotsm>=0.1.2
34
+ Requires-Dist: scipy>=1.10.0
35
+ Requires-Dist: numpy<2.0.0,>=1.22.0
36
+ Requires-Dist: tqdm>=4.65.0
37
+ Dynamic: license-file
38
+
39
+ # Talks Reducer
40
+ Talks Reducer shortens long-form presentations by removing silent gaps and optionally re-encoding them to smaller files. The
41
+ project was renamed from **jumpcutter** to emphasize its focus on conference talks and lectures.
42
+
43
+ When CUDA-capable hardware is available, the pipeline uses GPU encoders to keep export times low; otherwise it falls back to
44
+ CPU encoding.
45
+
46
+ ## Repository Structure
47
+ - `talks_reducer/` — Python package that exposes the CLI and reusable pipeline:
48
+ - `cli.py` parses arguments and dispatches to the pipeline.
49
+ - `pipeline.py` orchestrates FFmpeg, audio processing, and temporary assets.
50
+ - `audio.py` handles audio validation, volume analysis, and phase vocoder processing.
51
+ - `chunks.py` builds timing metadata and FFmpeg expressions for frame selection.
52
+ - `ffmpeg.py` discovers the FFmpeg binary, checks CUDA availability, and assembles command strings.
53
+ - `requirements.txt` — Python dependencies for local development.
54
+ - `default.nix` — reproducible environment definition for Nix users.
55
+ - `CONTRIBUTION.md` — development workflow, formatting expectations, and release checklist.
56
+ - `AGENTS.md` — maintainer tips and coding conventions for this repository.
57
+
58
+ ## Example
59
+ - 1h 37m, 571 MB — Original OBS video
60
+ - 1h 19m, 751 MB — Talks Reducer
61
+ - 1h 19m, 171 MB — Talks Reducer `--small`
62
+
63
+ The `--small` preset applies a 720p video scale and 128 kbps audio bitrate, making it useful for sharing talks over constrained
64
+ connections. Without `--small`, the script aims to preserve original quality while removing silence.
65
+
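As a rough illustration of what the `--small` preset means for the video filter chain, the sketch below assembles a filter string from a 720p scale, a fixed frame rate, and a `setpts` expression, mirroring the filter parts that `pipeline.py` writes to its filter graph file. The function name `build_filter_graph` and the sample expression are hypothetical.

```python
from typing import List


def build_filter_graph(expression: str, frame_rate: float, small: bool = False) -> str:
    """Join FFmpeg filter parts into one comma-separated chain (hypothetical helper)."""
    parts: List[str] = []
    if small:
        # Scale to 720 pixels high; -2 keeps the aspect ratio with an even width.
        parts.append("scale=-2:720")
    parts.append(f"fps=fps={frame_rate}")
    # Commas inside the expression must be escaped, because a bare comma
    # separates filters in the chain.
    escaped = expression.replace(",", "\\,")
    parts.append(f"setpts={escaped}")
    return ",".join(parts)


graph = build_filter_graph("if(lt(N,10),N/2,N)/30/TB", 30, small=True)
print(graph)
```

Escaping is done before the f-string is built, which keeps the code valid on Python versions older than 3.12, where a backslash is not allowed inside an f-string expression.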
66
+ ## Highlights
67
+ - Builds on gegell's classic jumpcutter workflow with more efficient frame and audio processing
68
+ - Generates FFmpeg filter graphs instead of writing temporary frames to disk
69
+ - Streams audio transformations in memory to avoid slow intermediate files
70
+ - Accepts multiple inputs or directories of recordings in a single run
71
+ - Provides progress feedback via `tqdm`
72
+ - Automatically detects NVENC availability, so you no longer need to pass `--cuda`
73
+
74
+ ## Processing Pipeline
75
+ 1. Validate that each input file contains an audio stream using `ffprobe`.
76
+ 2. Extract audio and calculate loudness to identify silent regions.
77
+ 3. Time-stretch the audio with `audiotsm`, speeding up silent segments more than sounded ones to maintain speech clarity.
78
+ 4. Stitch the processed audio and video together with FFmpeg, using NVENC if the GPU encoders are detected.
79
+
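Step 2 above can be pictured as a per-frame loudness test followed by padding around loud frames. The sketch below is a simplified illustration under stated assumptions: it treats a frame as loud when its peak amplitude, relative to the loudest sample, reaches the threshold, and it assumes floating-point audio. The function names `loud_frames_sketch` and `spread_sketch` are hypothetical, and the package's own `chunks.py` may work differently.

```python
import math

import numpy as np


def loud_frames_sketch(audio: np.ndarray, samples_per_frame: float, silent_threshold: float) -> np.ndarray:
    """Return a boolean array that is True where a frame's relative peak reaches the threshold."""
    max_volume = float(np.max(np.abs(audio))) or 1.0  # avoid dividing by zero on pure silence
    frame_count = int(math.ceil(audio.shape[0] / samples_per_frame))
    loud = np.zeros(frame_count, dtype=bool)
    for i in range(frame_count):
        start = int(i * samples_per_frame)
        end = min(int((i + 1) * samples_per_frame), audio.shape[0])
        peak = float(np.max(np.abs(audio[start:end]))) if end > start else 0.0
        loud[i] = peak / max_volume >= silent_threshold
    return loud


def spread_sketch(loud: np.ndarray, frame_spreadage: int) -> np.ndarray:
    """Also mark frames within `frame_spreadage` frames of a loud frame as loud."""
    out = np.zeros_like(loud)
    for i in range(len(loud)):
        lo = max(0, i - frame_spreadage)
        hi = min(len(loud), i + frame_spreadage + 1)
        out[i] = loud[lo:hi].any()
    return out


# Six frames of four samples each; only frame 3 contains sound.
audio = np.zeros(24)
audio[12:16] = 0.5
loud = loud_frames_sketch(audio, samples_per_frame=4, silent_threshold=0.03)
padded = spread_sketch(loud, frame_spreadage=1)
```

The padding step keeps a little context around each loud frame so that word beginnings and endings are not clipped when silent stretches are sped up.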
80
+ ## Recent Updates
81
+ - **October 2025** — Project renamed to *Talks Reducer* across documentation and scripts.
82
+ - **October 2025** — Added `--small` preset with 720p/128 kbps defaults for bandwidth-friendly exports.
83
+ - **October 2025** — Removed the `--cuda` flag; CUDA/NVENC support is now auto-detected.
84
+ - **October 2025** — Improved `--small` encoder arguments to balance size and clarity.
85
+ - **October 2025** — CLI argument parsing fixes to prevent crashes on invalid combinations.
86
+ - **October 2025** — Added example output comparison to the README.
87
+
88
+ ## Quick Start
89
+ 1. Install FFmpeg and ensure it is on your `PATH`
90
+ 2. Install Talks Reducer with `pip install talks-reducer` (this exposes the `talks-reducer` command)
91
+ 3. Inspect available options with `talks-reducer --help`
92
+ 4. Process a recording using `talks-reducer /path/to/video`
93
+
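Besides the CLI, the package ships `talks_reducer/pipeline.py`, whose `speed_up_video` function accepts `input_file` and `small` arguments. A minimal sketch of calling it from Python follows; the wrapper name `reduce_talk` is hypothetical, and the call assumes FFmpeg is installed and on your `PATH`.

```python
def reduce_talk(path: str, small: bool = False) -> None:
    """Call the pipeline directly instead of going through the CLI (hypothetical wrapper)."""
    # Imported lazily so this sketch can be defined without the package installed.
    from talks_reducer.pipeline import speed_up_video

    speed_up_video(input_file=path, small=small)


# Example (requires FFmpeg and `pip install talks-reducer`):
# reduce_talk("/path/to/video.mp4", small=True)
```

Note that `speed_up_video` uses a `TEMP` folder in the working directory by default and deletes any existing folder of that name, so run it from a directory where that is safe or pass a different `temp_folder`.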
94
+ ## Requirements
95
+ - Python 3.9 or newer with `numpy`, `scipy`, `audiotsm`, and `tqdm`
96
+ - FFmpeg with optional NVIDIA NVENC support for CUDA acceleration
97
+
98
+ ## Contributing
99
+ See `CONTRIBUTION.md` for development setup details and guidance on sharing improvements.
100
+
101
+ ## License
102
+ Talks Reducer is released under the MIT License. See `LICENSE` for the full text.
@@ -0,0 +1,16 @@
1
+ LICENSE
2
+ README.md
3
+ pyproject.toml
4
+ talks_reducer/__init__.py
5
+ talks_reducer/__main__.py
6
+ talks_reducer/audio.py
7
+ talks_reducer/chunks.py
8
+ talks_reducer/cli.py
9
+ talks_reducer/ffmpeg.py
10
+ talks_reducer/pipeline.py
11
+ talks_reducer.egg-info/PKG-INFO
12
+ talks_reducer.egg-info/SOURCES.txt
13
+ talks_reducer.egg-info/dependency_links.txt
14
+ talks_reducer.egg-info/entry_points.txt
15
+ talks_reducer.egg-info/requires.txt
16
+ talks_reducer.egg-info/top_level.txt
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ talks-reducer = talks_reducer.cli:main
@@ -0,0 +1,4 @@
1
+ audiotsm>=0.1.2
2
+ scipy>=1.10.0
3
+ numpy<2.0.0,>=1.22.0
4
+ tqdm>=4.65.0
@@ -0,0 +1 @@
1
+ talks_reducer