karaoke-gen 0.50.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of karaoke-gen might be problematic.

@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2024 Nomad Karaoke LLC
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
@@ -0,0 +1,140 @@
+ Metadata-Version: 2.3
+ Name: karaoke-gen
+ Version: 0.50.0
+ Summary: Generate karaoke videos with synchronized lyrics. Handles the entire process from downloading audio and lyrics to creating the final video with title screens.
+ License: MIT
+ Author: Andrew Beveridge
+ Author-email: andrew@beveridge.uk
+ Requires-Python: >=3.10,<3.13
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Requires-Dist: argparse (>=1.4.0)
+ Requires-Dist: attrs (>=24.2.0)
+ Requires-Dist: audio-separator[cpu] (>=0.21.0)
+ Requires-Dist: beautifulsoup4 (>=4)
+ Requires-Dist: cattrs (>=24.1.2)
+ Requires-Dist: fetch-lyrics-from-genius (>=0.1)
+ Requires-Dist: ffmpeg-python (>=0.2.0,<0.3.0)
+ Requires-Dist: google-api-python-client
+ Requires-Dist: google-auth
+ Requires-Dist: google-auth-httplib2
+ Requires-Dist: google-auth-oauthlib
+ Requires-Dist: kbputils (>=0.0.16,<0.0.17)
+ Requires-Dist: lyrics-converter (>=0.2.1)
+ Requires-Dist: lyrics-transcriber (>=0.34)
+ Requires-Dist: lyricsgenius (>=3)
+ Requires-Dist: numpy (>=1,<2)
+ Requires-Dist: pillow (>=10.1)
+ Requires-Dist: psutil (>=7.0.0,<8.0.0)
+ Requires-Dist: pyinstaller (>=6.3)
+ Requires-Dist: pyperclip
+ Requires-Dist: pytest-asyncio (>=0.23.5,<0.24.0)
+ Requires-Dist: requests (>=2)
+ Requires-Dist: thefuzz (>=0.22)
+ Requires-Dist: toml (>=0.10)
+ Requires-Dist: torch (<2.5)
+ Requires-Dist: yt-dlp
+ Project-URL: Documentation, https://github.com/karaokenerds/karaoke-gen/blob/main/README.md
+ Project-URL: Homepage, https://github.com/karaokenerds/karaoke-gen
+ Project-URL: Repository, https://github.com/karaokenerds/karaoke-gen
+ Description-Content-Type: text/markdown
+
+ # Karaoke Gen
+
+ Generate karaoke videos with synchronized lyrics. Handles the entire process from downloading audio and lyrics to creating the final video with title screens.
+
+ ## Overview
+
+ Karaoke Gen is a comprehensive tool for creating high-quality karaoke videos. It automates the entire workflow:
+
+ 1. **Download** audio and lyrics for a specified song
+ 2. **Separate** audio stems (vocals, instrumental)
+ 3. **Synchronize** lyrics with the audio
+ 4. **Generate** title and end screens
+ 5. **Combine** everything into a polished final video
+ 6. **Organize** and **share** the output files
+
+ ## Installation
+
+ ```bash
+ pip install karaoke-gen
+ ```
+
+ ## Quick Start
+
+ ```bash
+ # Generate a karaoke video from a YouTube URL
+ karaoke-gen "https://www.youtube.com/watch?v=dQw4w9WgXcQ" "Rick Astley" "Never Gonna Give You Up"
+
+ # Or let it search YouTube for you
+ karaoke-gen "Rick Astley" "Never Gonna Give You Up"
+ ```
+
+ ## Workflow Options
+
+ Karaoke Gen supports different workflow options to fit your needs:
+
+ ```bash
+ # Run only the preparation phase (download, separate stems, create title screens)
+ karaoke-gen --prep-only "Rick Astley" "Never Gonna Give You Up"
+
+ # Run only the finalisation phase (must be run in a directory prepared by the prep phase)
+ karaoke-gen --finalise-only
+
+ # Skip automatic lyrics transcription/synchronization (for manual syncing)
+ karaoke-gen --skip-transcription "Rick Astley" "Never Gonna Give You Up"
+
+ # Skip audio separation (if you already have instrumental)
+ karaoke-gen --skip-separation --existing-instrumental="path/to/instrumental.mp3" "Rick Astley" "Never Gonna Give You Up"
+ ```
+
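Note on the two-phase workflow described above: the sketch below shows one way the `--prep-only` and `--finalise-only` invocations might be scripted from Python. It is an illustration only, not part of the package. The helper names and the `track_dir` parameter are hypothetical; the README says finalisation must run in the directory the prep phase prepared, but does not say how that directory is named, so the caller has to supply it.

```python
import subprocess


def prep_command(artist, title):
    """Argument list for the preparation phase, using the flag shown in the README."""
    return ["karaoke-gen", "--prep-only", artist, title]


def finalise_command():
    """Argument list for the finalisation phase; it takes no artist or title."""
    return ["karaoke-gen", "--finalise-only"]


def run_in_two_phases(artist, title, track_dir):
    """Run prep, then finalise inside track_dir (hypothetical: the prepared directory)."""
    subprocess.run(prep_command(artist, title), check=True)
    # Finalisation has to run from within the directory prepared by the prep phase.
    subprocess.run(finalise_command(), cwd=track_dir, check=True)
```

Passing argument lists rather than a shell string avoids quoting problems with artist or title names that contain spaces or quotes.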
+ ## Advanced Features
+
+ ### Audio Processing
+
+ ```bash
+ # Specify custom audio separation models
+ karaoke-gen --clean_instrumental_model="model_name.ckpt" "Rick Astley" "Never Gonna Give You Up"
+ ```
+
+ ### Lyrics Handling
+
+ ```bash
+ # Use a local lyrics file instead of fetching from online
+ karaoke-gen --lyrics_file="path/to/lyrics.txt" "Rick Astley" "Never Gonna Give You Up"
+
+ # Adjust subtitle timing
+ karaoke-gen --subtitle_offset_ms=500 "Rick Astley" "Never Gonna Give You Up"
+ ```
+
+ ### Finalisation Options
+
+ ```bash
+ # Enable CDG ZIP generation
+ karaoke-gen --enable_cdg --style_params_json="path/to/style.json" "Rick Astley" "Never Gonna Give You Up"
+
+ # Enable TXT ZIP generation
+ karaoke-gen --enable_txt "Rick Astley" "Never Gonna Give You Up"
+
+ # Upload to YouTube
+ karaoke-gen --youtube_client_secrets_file="path/to/client_secret.json" --youtube_description_file="path/to/description.txt" "Rick Astley" "Never Gonna Give You Up"
+
+ # Organize files with brand code
+ karaoke-gen --brand_prefix="BRAND" --organised_dir="path/to/Tracks-Organized" "Rick Astley" "Never Gonna Give You Up"
+ ```
+
+ ## Full Command Reference
+
+ For a complete list of options:
+
+ ```bash
+ karaoke-gen --help
+ ```
+
+ ## License
+
+ MIT
+
@@ -0,0 +1,95 @@
+ # Karaoke Gen
+
+ Generate karaoke videos with synchronized lyrics. Handles the entire process from downloading audio and lyrics to creating the final video with title screens.
+
+ ## Overview
+
+ Karaoke Gen is a comprehensive tool for creating high-quality karaoke videos. It automates the entire workflow:
+
+ 1. **Download** audio and lyrics for a specified song
+ 2. **Separate** audio stems (vocals, instrumental)
+ 3. **Synchronize** lyrics with the audio
+ 4. **Generate** title and end screens
+ 5. **Combine** everything into a polished final video
+ 6. **Organize** and **share** the output files
+
+ ## Installation
+
+ ```bash
+ pip install karaoke-gen
+ ```
+
+ ## Quick Start
+
+ ```bash
+ # Generate a karaoke video from a YouTube URL
+ karaoke-gen "https://www.youtube.com/watch?v=dQw4w9WgXcQ" "Rick Astley" "Never Gonna Give You Up"
+
+ # Or let it search YouTube for you
+ karaoke-gen "Rick Astley" "Never Gonna Give You Up"
+ ```
+
+ ## Workflow Options
+
+ Karaoke Gen supports different workflow options to fit your needs:
+
+ ```bash
+ # Run only the preparation phase (download, separate stems, create title screens)
+ karaoke-gen --prep-only "Rick Astley" "Never Gonna Give You Up"
+
+ # Run only the finalisation phase (must be run in a directory prepared by the prep phase)
+ karaoke-gen --finalise-only
+
+ # Skip automatic lyrics transcription/synchronization (for manual syncing)
+ karaoke-gen --skip-transcription "Rick Astley" "Never Gonna Give You Up"
+
+ # Skip audio separation (if you already have instrumental)
+ karaoke-gen --skip-separation --existing-instrumental="path/to/instrumental.mp3" "Rick Astley" "Never Gonna Give You Up"
+ ```
+
+ ## Advanced Features
+
+ ### Audio Processing
+
+ ```bash
+ # Specify custom audio separation models
+ karaoke-gen --clean_instrumental_model="model_name.ckpt" "Rick Astley" "Never Gonna Give You Up"
+ ```
+
+ ### Lyrics Handling
+
+ ```bash
+ # Use a local lyrics file instead of fetching from online
+ karaoke-gen --lyrics_file="path/to/lyrics.txt" "Rick Astley" "Never Gonna Give You Up"
+
+ # Adjust subtitle timing
+ karaoke-gen --subtitle_offset_ms=500 "Rick Astley" "Never Gonna Give You Up"
+ ```
+
+ ### Finalisation Options
+
+ ```bash
+ # Enable CDG ZIP generation
+ karaoke-gen --enable_cdg --style_params_json="path/to/style.json" "Rick Astley" "Never Gonna Give You Up"
+
+ # Enable TXT ZIP generation
+ karaoke-gen --enable_txt "Rick Astley" "Never Gonna Give You Up"
+
+ # Upload to YouTube
+ karaoke-gen --youtube_client_secrets_file="path/to/client_secret.json" --youtube_description_file="path/to/description.txt" "Rick Astley" "Never Gonna Give You Up"
+
+ # Organize files with brand code
+ karaoke-gen --brand_prefix="BRAND" --organised_dir="path/to/Tracks-Organized" "Rick Astley" "Never Gonna Give You Up"
+ ```
+
+ ## Full Command Reference
+
+ For a complete list of options:
+
+ ```bash
+ karaoke-gen --help
+ ```
+
+ ## License
+
+ MIT
@@ -0,0 +1 @@
+ from .karaoke_prep import KaraokePrep
@@ -0,0 +1,396 @@
+ import os
+ import sys
+ import json
+ import logging
+ import glob
+ import shutil
+ import tempfile
+ import time
+ import fcntl
+ import errno
+ import psutil
+ from datetime import datetime
+ from pydub import AudioSegment
+
+
+ # Placeholder class or functions for audio processing
+ class AudioProcessor:
+     def __init__(
+         self,
+         logger,
+         log_level,
+         log_formatter,
+         model_file_dir,
+         lossless_output_format,
+         clean_instrumental_model,
+         backing_vocals_models,
+         other_stems_models,
+         ffmpeg_base_command,
+     ):
+         self.logger = logger
+         self.log_level = log_level
+         self.log_formatter = log_formatter
+         self.model_file_dir = model_file_dir
+         self.lossless_output_format = lossless_output_format
+         self.clean_instrumental_model = clean_instrumental_model
+         self.backing_vocals_models = backing_vocals_models
+         self.other_stems_models = other_stems_models
+         self.ffmpeg_base_command = ffmpeg_base_command  # Needed for combined instrumentals
+
+     def _file_exists(self, file_path):
+         """Check if a file exists and log the result."""
+         exists = os.path.isfile(file_path)
+         if exists:
+             self.logger.info(f"File already exists, skipping creation: {file_path}")
+         return exists
+
+     def separate_audio(self, audio_file, model_name, artist_title, track_output_dir, instrumental_path, vocals_path):
+         if audio_file is None or not os.path.isfile(audio_file):
+             raise Exception("Error: Invalid audio source provided.")
+
+         self.logger.debug(f"audio_file is valid file: {audio_file}")
+
+         self.logger.info(
+             f"instantiating Separator with model_file_dir: {self.model_file_dir}, model_filename: {model_name} output_format: {self.lossless_output_format}"
+         )
+
+         from audio_separator.separator import Separator
+
+         separator = Separator(
+             log_level=self.log_level,
+             log_formatter=self.log_formatter,
+             model_file_dir=self.model_file_dir,
+             output_format=self.lossless_output_format,
+         )
+
+         separator.load_model(model_filename=model_name)
+         output_files = separator.separate(audio_file)
+
+         self.logger.debug(f"Separator output files: {output_files}")
+
+         model_name_no_extension = os.path.splitext(model_name)[0]
+
+         for file in output_files:
+             if "(Vocals)" in file:
+                 self.logger.info(f"Renaming Vocals file {file} to {vocals_path}")
+                 os.rename(file, vocals_path)
+             elif "(Instrumental)" in file:
+                 self.logger.info(f"Renaming Instrumental file {file} to {instrumental_path}")
+                 os.rename(file, instrumental_path)
+             elif model_name in file:
+                 # Example filename 1: "Freddie Jackson - All I'll Ever Ask (feat. Najee) (Local)_(Piano)_htdemucs_6s.flac"
+                 # Example filename 2: "Freddie Jackson - All I'll Ever Ask (feat. Najee) (Local)_(Guitar)_htdemucs_6s.flac"
+                 # The stem name in these examples would be "Piano" or "Guitar"
+                 # Extract stem_name from the filename
+                 stem_name = file.split(f"_{model_name}")[0].split("_")[-1]
+                 stem_name = stem_name.strip("()")  # Remove parentheses if present
+
+                 other_stem_path = os.path.join(track_output_dir, f"{artist_title} ({stem_name} {model_name}).{self.lossless_output_format}")
+                 self.logger.info(f"Renaming other stem file {file} to {other_stem_path}")
+                 os.rename(file, other_stem_path)
+
+             elif model_name_no_extension in file:
+                 # Example filename 1: "Freddie Jackson - All I'll Ever Ask (feat. Najee) (Local)_(Piano)_htdemucs_6s.flac"
+                 # Example filename 2: "Freddie Jackson - All I'll Ever Ask (feat. Najee) (Local)_(Guitar)_htdemucs_6s.flac"
+                 # The stem name in these examples would be "Piano" or "Guitar"
+                 # Extract stem_name from the filename
+                 stem_name = file.split(f"_{model_name_no_extension}")[0].split("_")[-1]
+                 stem_name = stem_name.strip("()")  # Remove parentheses if present
+
+                 other_stem_path = os.path.join(track_output_dir, f"{artist_title} ({stem_name} {model_name}).{self.lossless_output_format}")
+                 self.logger.info(f"Renaming other stem file {file} to {other_stem_path}")
+                 os.rename(file, other_stem_path)
+
+         self.logger.info(f"Separation complete! Output file(s): {vocals_path} {instrumental_path}")
+
+     def process_audio_separation(self, audio_file, artist_title, track_output_dir):
+         from audio_separator.separator import Separator
+
+         self.logger.info(f"Starting audio separation process for {artist_title}")
+
+         # Define lock file path in system temp directory
+         lock_file_path = os.path.join(tempfile.gettempdir(), "audio_separator.lock")
+
+         # Try to acquire lock
+         while True:
+             try:
+                 # First check if there's a stale lock
+                 if os.path.exists(lock_file_path):
+                     try:
+                         with open(lock_file_path, "r") as f:
+                             lock_data = json.load(f)
+                             pid = lock_data.get("pid")
+                             start_time = datetime.fromisoformat(lock_data.get("start_time"))
+                             running_track = lock_data.get("track")
+
+                             # Check if process is still running
+                             if not psutil.pid_exists(pid):
+                                 self.logger.warning(f"Found stale lock from dead process {pid}, removing...")
+                                 os.remove(lock_file_path)
+                             else:
+                                 # Calculate runtime
+                                 runtime = datetime.now() - start_time
+                                 runtime_mins = runtime.total_seconds() / 60
+
+                                 # Get process command line
+                                 proc = psutil.Process(pid)
+                                 cmd = " ".join(proc.cmdline())
+
+                                 self.logger.info(
+                                     f"Waiting for other audio separation process to complete before starting separation for {artist_title}...\n"
+                                     f"Currently running process details:\n"
+                                     f" Track: {running_track}\n"
+                                     f" PID: {pid}\n"
+                                     f" Running time: {runtime_mins:.1f} minutes\n"
+                                     f" Command: {cmd}\n"
+                                     f"To force clear the lock and kill the process, run:\n"
+                                     f" kill {pid} && rm {lock_file_path}"
+                                 )
+                     except (json.JSONDecodeError, KeyError, ValueError) as e:
+                         self.logger.warning(f"Found invalid lock file, removing: {e}")
+                         os.remove(lock_file_path)
+
+                 # Try to acquire lock
+                 lock_file = open(lock_file_path, "w")
+                 fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
+
+                 # Write metadata to lock file
+                 lock_data = {
+                     "pid": os.getpid(),
+                     "start_time": datetime.now().isoformat(),
+                     "track": f"{artist_title}",
+                 }
+                 json.dump(lock_data, lock_file)
+                 lock_file.flush()
+                 break
+
+             except IOError as e:
+                 if e.errno != errno.EAGAIN:
+                     raise
+                 # Lock is held by another process
+                 time.sleep(30)  # Wait 30 seconds before trying again
+                 continue
+
+         try:
+             separator = Separator(
+                 log_level=self.log_level,
+                 log_formatter=self.log_formatter,
+                 model_file_dir=self.model_file_dir,
+                 output_format=self.lossless_output_format,
+             )
+
+             stems_dir = self._create_stems_directory(track_output_dir)
+             result = {"clean_instrumental": {}, "other_stems": {}, "backing_vocals": {}, "combined_instrumentals": {}}
+
+             if os.environ.get("KARAOKE_PREP_SKIP_AUDIO_SEPARATION"):
+                 return result
+
+             result["clean_instrumental"] = self._separate_clean_instrumental(
+                 separator, audio_file, artist_title, track_output_dir, stems_dir
+             )
+             result["other_stems"] = self._separate_other_stems(separator, audio_file, artist_title, stems_dir)
+             result["backing_vocals"] = self._separate_backing_vocals(
+                 separator, result["clean_instrumental"]["vocals"], artist_title, stems_dir
+             )
+             result["combined_instrumentals"] = self._generate_combined_instrumentals(
+                 result["clean_instrumental"]["instrumental"], result["backing_vocals"], artist_title, track_output_dir
+             )
+             self._normalize_audio_files(result, artist_title, track_output_dir)
+
+             # Create Audacity LOF file
+             lof_path = os.path.join(stems_dir, f"{artist_title} (Audacity).lof")
+             first_model = list(result["backing_vocals"].keys())[0]
+
+             files_to_include = [
+                 audio_file,  # Original audio
+                 result["clean_instrumental"]["instrumental"],  # Clean instrumental
+                 result["backing_vocals"][first_model]["backing_vocals"],  # Backing vocals
+                 result["combined_instrumentals"][first_model],  # Combined instrumental+BV
+             ]
+
+             # Convert to absolute paths
+             files_to_include = [os.path.abspath(f) for f in files_to_include]
+
+             with open(lof_path, "w") as lof:
+                 for file_path in files_to_include:
+                     lof.write(f'file "{file_path}"\n')
+
+             self.logger.info(f"Created Audacity LOF file: {lof_path}")
+             result["audacity_lof"] = lof_path
+
+             # Launch Audacity with multiple tracks
+             if sys.platform == "darwin":  # Check if we're on macOS
+                 if lof_path and os.path.exists(lof_path):
+                     self.logger.info(f"Launching Audacity with LOF file: {lof_path}")
+                     os.system(f'open -a Audacity "{lof_path}"')
+                 else:
+                     self.logger.debug("Audacity LOF file not available or not found")
+
+             self.logger.info("Audio separation, combination, and normalization process completed")
+             return result
+         finally:
+             # Release lock
+             fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
+             lock_file.close()
+             try:
+                 os.remove(lock_file_path)
+             except OSError:
+                 pass
+
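Review note on the lock handling in `process_audio_separation`: the hunk opens the lock file with mode `"w"` before calling `flock`, which appears to truncate the current holder's metadata even when the lock cannot be acquired. The sketch below shows one possible alternative, assuming a POSIX system where `fcntl.flock` is available. The function names are hypothetical and nothing here is taken from the package itself.

```python
import errno
import fcntl
import json
import os
from datetime import datetime


def try_acquire_lock(lock_path, track):
    """Return an open, locked file object, or None if another holder has the lock."""
    # "a+" creates the file if needed but does not truncate it, so a failed
    # attempt leaves the current holder's metadata intact.
    lock_file = open(lock_path, "a+")
    try:
        fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as e:
        lock_file.close()  # Do not leak the handle on a failed attempt
        if e.errno in (errno.EAGAIN, errno.EACCES):
            return None
        raise
    # We now hold the lock: replace any stale metadata with our own.
    lock_file.truncate(0)
    json.dump({"pid": os.getpid(), "start_time": datetime.now().isoformat(), "track": track}, lock_file)
    lock_file.flush()
    return lock_file


def release_lock(lock_file):
    """Unlock and close a file object returned by try_acquire_lock."""
    fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
    lock_file.close()
```

A caller could poll `try_acquire_lock` in a loop with a sleep between attempts, as the hunk does, and read the metadata file for diagnostics while waiting.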
+     def _create_stems_directory(self, track_output_dir):
+         stems_dir = os.path.join(track_output_dir, "stems")
+         os.makedirs(stems_dir, exist_ok=True)
+         self.logger.info(f"Created stems directory: {stems_dir}")
+         return stems_dir
+
+     def _separate_clean_instrumental(self, separator, audio_file, artist_title, track_output_dir, stems_dir):
+         self.logger.info(f"Separating using clean instrumental model: {self.clean_instrumental_model}")
+         instrumental_path = os.path.join(
+             track_output_dir, f"{artist_title} (Instrumental {self.clean_instrumental_model}).{self.lossless_output_format}"
+         )
+         vocals_path = os.path.join(stems_dir, f"{artist_title} (Vocals {self.clean_instrumental_model}).{self.lossless_output_format}")
+
+         result = {}
+         if not self._file_exists(instrumental_path) or not self._file_exists(vocals_path):
+             separator.load_model(model_filename=self.clean_instrumental_model)
+             clean_output_files = separator.separate(audio_file)
+
+             for file in clean_output_files:
+                 if "(Vocals)" in file and not self._file_exists(vocals_path):
+                     os.rename(file, vocals_path)
+                     result["vocals"] = vocals_path
+                 elif "(Instrumental)" in file and not self._file_exists(instrumental_path):
+                     os.rename(file, instrumental_path)
+                     result["instrumental"] = instrumental_path
+         else:
+             result["vocals"] = vocals_path
+             result["instrumental"] = instrumental_path
+
+         return result
+
+     def _separate_other_stems(self, separator, audio_file, artist_title, stems_dir):
+         self.logger.info(f"Separating using other stems models: {self.other_stems_models}")
+         result = {}
+         for model in self.other_stems_models:
+             self.logger.info(f"Processing with model: {model}")
+             result[model] = {}
+
+             # Check if any stem files for this model already exist
+             existing_stems = glob.glob(os.path.join(stems_dir, f"{artist_title} (*{model}).{self.lossless_output_format}"))
+
+             if existing_stems:
+                 self.logger.info(f"Found existing stem files for model {model}, skipping separation")
+                 for stem_file in existing_stems:
+                     stem_name = os.path.basename(stem_file).split("(")[1].split(")")[0].strip()
+                     result[model][stem_name] = stem_file
+             else:
+                 separator.load_model(model_filename=model)
+                 other_stems_output = separator.separate(audio_file)
+
+                 for file in other_stems_output:
+                     file_name = os.path.basename(file)
+                     stem_name = file_name[file_name.rfind("_(") + 2 : file_name.rfind(")_")]
+                     new_filename = f"{artist_title} ({stem_name} {model}).{self.lossless_output_format}"
+                     other_stem_path = os.path.join(stems_dir, new_filename)
+                     if not self._file_exists(other_stem_path):
+                         os.rename(file, other_stem_path)
+                     result[model][stem_name] = other_stem_path
+
+         return result
+
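Review note on the stem-name slicing in `_separate_other_stems`: the `rfind`-based slice works for names such as the `..._(Piano)_htdemucs_6s.flac` example quoted earlier in this file, but if either marker is missing `rfind` returns -1 and the slice silently yields an unrelated substring. A regular expression is one way to make the "no match" case explicit. This is a sketch with a hypothetical helper name, not the package's code.

```python
import os
import re

# Matches a parenthesised stem label delimited by underscores, e.g. "_(Piano)_".
_STEM_PATTERN = re.compile(r"_\(([^()]+)\)_")


def extract_stem_name(path):
    """Return the last "_(Stem)_" label in the file name, or None if there is none."""
    matches = _STEM_PATTERN.findall(os.path.basename(path))
    return matches[-1] if matches else None
```

Returning `None` lets the caller decide whether to skip the file or log a warning instead of renaming it to a malformed path.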
+     def _separate_backing_vocals(self, separator, vocals_path, artist_title, stems_dir):
+         self.logger.info(f"Separating clean vocals using backing vocals models: {self.backing_vocals_models}")
+         result = {}
+         for model in self.backing_vocals_models:
+             self.logger.info(f"Processing with model: {model}")
+             result[model] = {}
+             lead_vocals_path = os.path.join(stems_dir, f"{artist_title} (Lead Vocals {model}).{self.lossless_output_format}")
+             backing_vocals_path = os.path.join(stems_dir, f"{artist_title} (Backing Vocals {model}).{self.lossless_output_format}")
+
+             if not self._file_exists(lead_vocals_path) or not self._file_exists(backing_vocals_path):
+                 separator.load_model(model_filename=model)
+                 backing_vocals_output = separator.separate(vocals_path)
+
+                 for file in backing_vocals_output:
+                     if "(Vocals)" in file and not self._file_exists(lead_vocals_path):
+                         os.rename(file, lead_vocals_path)
+                         result[model]["lead_vocals"] = lead_vocals_path
+                     elif "(Instrumental)" in file and not self._file_exists(backing_vocals_path):
+                         os.rename(file, backing_vocals_path)
+                         result[model]["backing_vocals"] = backing_vocals_path
+             else:
+                 result[model]["lead_vocals"] = lead_vocals_path
+                 result[model]["backing_vocals"] = backing_vocals_path
+         return result
+
+     def _generate_combined_instrumentals(self, instrumental_path, backing_vocals_result, artist_title, track_output_dir):
+         self.logger.info("Generating combined instrumental tracks with backing vocals")
+         result = {}
+         for model, paths in backing_vocals_result.items():
+             backing_vocals_path = paths["backing_vocals"]
+             combined_path = os.path.join(track_output_dir, f"{artist_title} (Instrumental +BV {model}).{self.lossless_output_format}")
+
+             if not self._file_exists(combined_path):
+                 ffmpeg_command = (
+                     f'{self.ffmpeg_base_command} -i "{instrumental_path}" -i "{backing_vocals_path}" '
+                     f'-filter_complex "[0:a][1:a]amix=inputs=2:duration=longest:weights=1 1" '
+                     f'-c:a {self.lossless_output_format.lower()} "{combined_path}"'
+                 )
+
+                 self.logger.debug(f"Running command: {ffmpeg_command}")
+                 os.system(ffmpeg_command)
+
+             result[model] = combined_path
+         return result
+
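Review note on the ffmpeg call in `_generate_combined_instrumentals`: building a shell string for `os.system` breaks if a path contains a double quote, and the exit status is ignored. The sketch below shows the same `amix` invocation as an argument list for `subprocess.run`. It assumes `ffmpeg_base_command` is a shell-style string as in the hunk; the helper names, and the example base command in the usage note, are hypothetical.

```python
import shlex
import subprocess


def build_mix_command(ffmpeg_base_command, instrumental_path, backing_vocals_path, combined_path, codec):
    """Build an argument list that mixes two audio inputs with ffmpeg's amix filter."""
    return shlex.split(ffmpeg_base_command) + [
        "-i", instrumental_path,
        "-i", backing_vocals_path,
        "-filter_complex", "[0:a][1:a]amix=inputs=2:duration=longest:weights=1 1",
        "-c:a", codec,
        combined_path,
    ]


def mix_tracks(ffmpeg_base_command, instrumental_path, backing_vocals_path, combined_path, codec="flac"):
    """Run the mix; raises subprocess.CalledProcessError if ffmpeg exits non-zero."""
    command = build_mix_command(ffmpeg_base_command, instrumental_path, backing_vocals_path, combined_path, codec)
    subprocess.run(command, check=True)
```

For example, `build_mix_command("ffmpeg -hide_banner -y", "inst.flac", "bv.flac", "out.flac", "flac")` yields a list whose paths are passed to ffmpeg verbatim, with no shell quoting involved.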
+     def _normalize_audio_files(self, separation_result, artist_title, track_output_dir):
+         self.logger.info("Normalizing clean instrumental and combined instrumentals")
+
+         files_to_normalize = [
+             ("clean_instrumental", separation_result["clean_instrumental"]["instrumental"]),
+         ] + [("combined_instrumentals", path) for path in separation_result["combined_instrumentals"].values()]
+
+         for key, file_path in files_to_normalize:
+             if self._file_exists(file_path):
+                 try:
+                     self._normalize_audio(file_path, file_path)  # Normalize in-place
+
+                     # Verify the normalized file
+                     if os.path.getsize(file_path) > 0:
+                         self.logger.info(f"Successfully normalized: {file_path}")
+                     else:
+                         raise Exception("Normalized file is empty")
+
+                 except Exception as e:
+                     self.logger.error(f"Error during normalization of {file_path}: {e}")
+                     self.logger.warning(f"Normalization failed for {file_path}. Original file remains unchanged.")
+             else:
+                 self.logger.warning(f"File not found for normalization: {file_path}")
+
+         self.logger.info("Audio normalization process completed")
+
+     def _normalize_audio(self, input_path, output_path, target_level=0.0):
+         self.logger.info(f"Normalizing audio file: {input_path}")
+
+         # Load audio file
+         audio = AudioSegment.from_file(input_path, format=self.lossless_output_format.lower())
+
+         # Calculate the peak amplitude
+         peak_amplitude = float(audio.max_dBFS)
+
+         # Calculate the necessary gain
+         gain_db = target_level - peak_amplitude
+
+         # Apply gain
+         normalized_audio = audio.apply_gain(gain_db)
+
+         # Ensure the audio is not completely silent
+         if normalized_audio.rms == 0:
+             self.logger.warning(f"Normalized audio is silent for {input_path}. Using original audio.")
+             normalized_audio = audio
+
+         # Export normalized audio, overwriting the original file
+         normalized_audio.export(output_path, format=self.lossless_output_format.lower())
+
+         self.logger.info(f"Normalized audio saved, replacing: {output_path}")
+         self.logger.debug(f"Original peak: {peak_amplitude} dB, Applied gain: {gain_db} dB")
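
Review note on `_normalize_audio`: the method applies peak normalization, where the gain in dB is the target level minus the measured peak. The arithmetic can be illustrated without pydub on a plain list of float samples. This is a minimal sketch assuming samples in the range -1.0 to 1.0 with a full scale of 1.0; the function names are hypothetical and it does not reproduce pydub's implementation.

```python
import math


def peak_dbfs(samples, full_scale=1.0):
    """Peak level in dB relative to full scale; -inf for silence or empty input."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak / full_scale)


def normalize_peak(samples, target_level=0.0, full_scale=1.0):
    """Scale samples so that their peak sits at target_level dBFS."""
    peak_db = peak_dbfs(samples, full_scale)
    if math.isinf(peak_db):
        return list(samples)  # Silent input: no finite gain exists, return unchanged
    gain_db = target_level - peak_db  # Same formula as in the hunk above
    factor = 10 ** (gain_db / 20)  # Convert dB gain to a linear multiplier
    return [s * factor for s in samples]
```

With a peak of 0.5 (about -6.02 dBFS) and a 0 dBFS target, the gain is about +6.02 dB, which doubles every sample. Checking for silence before computing the gain avoids the infinite gain that a silent input would otherwise produce.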