videoconverter-worker 1.0.0 (tar.gz)

+++ PKG-INFO
@@ -0,0 +1,81 @@
+ Metadata-Version: 2.4
+ Name: videoconverter-worker
+ Version: 1.0.0
+ Summary: VideoConverter Python Worker: reads tasks from the queue directory and performs splitting, subtitle removal, and merging
+ License: MIT
+ Keywords: videoconverter,ffmpeg,worker,video
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.8
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Requires-Python: >=3.8
+ Description-Content-Type: text/plain
+ 
+ Python Worker Usage Guide
+ 
+ 0) Installation (pick one)
+ - Install from PyPI (recommended; works on Windows/Linux/macOS):
+     pip install videoconverter-worker
+   After installing, run: videoconverter [--data-dir DIR] [--path-replace OLD=NEW]
+ - Or install from local source: cd src/python && pip install .
+ 
+ 1) What it does
+ Mirrors the behavior of the Java BackendWorker: reads tasks from the queue directory (queue/<task_id>.json),
+ executes SPLIT / DESUBTITLE / MERGE / ONE_CLICK_COMPOSE, and updates task status and metadata.json.
+ Suited to running on a server, using its many cores and faster hardware for splitting, subtitle removal, and merging.
+ 
+ 2) Requirements
+ - Python 3.8+
+ - ffmpeg and ffprobe installed on the system (or available on PATH)
+ - Optional: point the FFMPEG_PATH and FFPROBE_PATH environment variables at the binaries
+ 
+ 3) Sharing one queue with Java on the same machine
+ - Just use the same data directory, e.g.: ~/.videoconverter/data
+ - Have the Java frontend create tasks first (writing queue/*.json), then run the Python worker locally to process them:
+     cd src/python
+     python worker.py
+ - Or point it at a specific data directory:
+     python worker.py --data-dir /path/to/.videoconverter/data
+ 
+ 4) Running on a server after copying over a folder prepared by the frontend
+ - Copy the entire queue "data directory" from your local machine to the server (e.g. /server/data);
+   it should contain the queue/ directory and the queue/*.json task files.
+ - If the paths inside the task JSON are local absolute paths (e.g. /Users/me/videos/a.mp4),
+   rewrite them on the server, or the files will not be found:
+     python worker.py --data-dir /server/data --path-replace "/Users/me/videos=/server/videos"
+ - Environment-variable form (convenient for scripts/systemd):
+     export VIDEOCONVERTER_DATA_DIR=/server/data
+     export VIDEOCONVERTER_PATH_REPLACE="/Users/me/videos=/server/videos"
+     python worker.py
+ - Tip: keep videos in a fixed directory on the server (e.g. /server/videos), use a single local
+   prefix for all paths in the copied queue JSON, and swap it for the server prefix with
+   --path-replace. A sample task file is shown below.
+ 
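+ For reference, a queue/<task_id>.json written by the worker's create_desubtitle_task looks
+ roughly like this sketch (values and paths are illustrative, not a verbatim file); --path-replace
+ rewrites the input_file/output_dir prefixes, plus config.inputPath/config.outputPath, when a task is claimed:
+ 
+     {
+       "task_id": "3f2a9c1e-...",
+       "task_type": "DESUBTITLE",
+       "parent_task_id": "",
+       "video_id": "a_1b2c3d4e",
+       "input_file": "/Users/me/videos/a_1b2c3d4e/original/chunk_000.mp4",
+       "output_dir": "/Users/me/videos",
+       "config": {},
+       "status": "PENDING",
+       "progress": 0.0,
+       "progress_text": "Waiting for subtitle removal...",
+       "created_time": 1700000000000,
+       "error_message": ""
+     }
+ 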
+ 5) Multi-process concurrency
+ The current worker is a single-process, single-threaded loop. To saturate multiple cores, start several processes on the same data-dir:
+ - Task claiming works by atomically creating queue/<task_id>.lock, so multiple processes never grab the same task.
+ - Example (4 worker processes):
+     for i in 1 2 3 4; do python worker.py --data-dir /server/data & done
+   Or run several worker instances under systemd/supervisor; a sketch of a template unit follows.
+ 
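+ A minimal systemd template unit for the "several instances" route; the unit name, user setup,
+ and paths here are assumptions for illustration, nothing like this ships with the package:
+ 
+     # /etc/systemd/system/videoconverter-worker@.service
+     [Unit]
+     Description=VideoConverter worker %i
+     After=network.target
+ 
+     [Service]
+     Environment=VIDEOCONVERTER_DATA_DIR=/server/data
+     ExecStart=/usr/bin/env videoconverter
+     Restart=on-failure
+ 
+     [Install]
+     WantedBy=multi-user.target
+ 
+ Then, e.g.: systemctl enable --now videoconverter-worker@{1..4}
+ 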
+ 6) Pause/resume
+ Queue pausing is controlled by the queue_paused key in queue_config.json ("true" means paused).
+ The Python worker re-reads this config periodically and takes no new tasks while it is "true".
+ It can be toggled from the Java frontend or by editing the file by hand, as shown below.
+ 
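+ The whole config file is currently just that one flag; note the value is the string
+ "true"/"false", not a JSON boolean:
+ 
+     {
+       "queue_paused": "false"
+     }
+ 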
+ 7) Conventions shared with Java
+ - The task and config formats are documented in queue_task_schema.txt at the project root.
+ - metadata.json matches the format the Java side produces; merging (MERGE) is driven by the
+   metadata. An abridged example follows.
+ 
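+ An abridged metadata.json as the Python splitter writes it (values are illustrative):
+ 
+     {
+       "videoId": "a_1b2c3d4e",
+       "originalPath": "/server/videos/a.mp4",
+       "chunkSize": 120.0,
+       "totalChunks": 2,
+       "chunks": [
+         {
+           "chunkId": "chunk_000",
+           "startTime": 0.0,
+           "endTime": 120.0,
+           "originalPath": "a_1b2c3d4e/original/chunk_000.mp4",
+           "processedPath": "",
+           "status": "pending",
+           "processedAt": "",
+           "errorMessage": ""
+         }
+       ],
+       "createdAt": "2024-01-01T00:00:00Z"
+     }
+ 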
+ 8) Packaging and deployment
+ - Build a zip (copy to the server, unzip, run):
+     cd src/python
+     ./build_deploy.sh
+   This produces videoconverter.zip; unzip it and run, inside that directory:
+     python3 worker.py [--data-dir DIR] [--path-replace OLD=NEW]
+ - Or install it as a command-line tool (works locally or on the server):
+     cd src/python
+     pip install .
+   Then run it directly: videoconverter [--data-dir DIR] [--path-replace OLD=NEW]
+++ README.txt
@@ -0,0 +1,64 @@
+ Python Worker Usage Guide
+ 
+ 0) Installation (pick one)
+ - Install from PyPI (recommended; works on Windows/Linux/macOS):
+     pip install videoconverter-worker
+   After installing, run: videoconverter [--data-dir DIR] [--path-replace OLD=NEW]
+ - Or install from local source: cd src/python && pip install .
+ 
+ 1) What it does
+ Mirrors the behavior of the Java BackendWorker: reads tasks from the queue directory (queue/<task_id>.json),
+ executes SPLIT / DESUBTITLE / MERGE / ONE_CLICK_COMPOSE, and updates task status and metadata.json.
+ Suited to running on a server, using its many cores and faster hardware for splitting, subtitle removal, and merging.
+ 
+ 2) Requirements
+ - Python 3.8+
+ - ffmpeg and ffprobe installed on the system (or available on PATH)
+ - Optional: point the FFMPEG_PATH and FFPROBE_PATH environment variables at the binaries
+ 
+ 3) Sharing one queue with Java on the same machine
+ - Just use the same data directory, e.g.: ~/.videoconverter/data
+ - Have the Java frontend create tasks first (writing queue/*.json), then run the Python worker locally to process them:
+     cd src/python
+     python worker.py
+ - Or point it at a specific data directory:
+     python worker.py --data-dir /path/to/.videoconverter/data
+ 
+ 4) Running on a server after copying over a folder prepared by the frontend
+ - Copy the entire queue "data directory" from your local machine to the server (e.g. /server/data);
+   it should contain the queue/ directory and the queue/*.json task files.
+ - If the paths inside the task JSON are local absolute paths (e.g. /Users/me/videos/a.mp4),
+   rewrite them on the server, or the files will not be found:
+     python worker.py --data-dir /server/data --path-replace "/Users/me/videos=/server/videos"
+ - Environment-variable form (convenient for scripts/systemd):
+     export VIDEOCONVERTER_DATA_DIR=/server/data
+     export VIDEOCONVERTER_PATH_REPLACE="/Users/me/videos=/server/videos"
+     python worker.py
+ - Tip: keep videos in a fixed directory on the server (e.g. /server/videos), use a single local
+   prefix for all paths in the copied queue JSON, and swap it for the server prefix with --path-replace.
+ 
+ 5) Multi-process concurrency
+ The current worker is a single-process, single-threaded loop. To saturate multiple cores, start several processes on the same data-dir:
+ - Task claiming works by atomically creating queue/<task_id>.lock, so multiple processes never grab the same task.
+ - Example (4 worker processes):
+     for i in 1 2 3 4; do python worker.py --data-dir /server/data & done
+   Or run several worker instances under systemd/supervisor.
+ 
+ 6) Pause/resume
+ Queue pausing is controlled by the queue_paused key in queue_config.json ("true" means paused).
+ The Python worker re-reads this config periodically and takes no new tasks while it is "true".
+ It can be toggled from the Java frontend or by editing the file by hand.
+ 
+ 7) Conventions shared with Java
+ - The task and config formats are documented in queue_task_schema.txt at the project root.
+ - metadata.json matches the format the Java side produces; merging (MERGE) is driven by the metadata.
+ 
+ 8) Packaging and deployment
+ - Build a zip (copy to the server, unzip, run):
+     cd src/python
+     ./build_deploy.sh
+   This produces videoconverter.zip; unzip it and run, inside that directory:
+     python3 worker.py [--data-dir DIR] [--path-replace OLD=NEW]
+ - Or install it as a command-line tool (works locally or on the server):
+     cd src/python
+     pip install .
+   Then run it directly: videoconverter [--data-dir DIR] [--path-replace OLD=NEW]
+++ ffmpeg_runner.py
@@ -0,0 +1,278 @@
+ # -*- coding: utf-8 -*-
+ """
+ FFmpeg splitting, subtitle removal, and merging, aligned with the logic of the Java
+ ChunkSplitService / FFmpegService / ChunkMergeService.
+ """
+ import datetime
+ import json
+ import logging
+ import math
+ import os
+ import subprocess
+ import tempfile
+ import uuid
+ from pathlib import Path
+ from typing import List, Optional, Tuple
+ 
+ logger = logging.getLogger(__name__)
+ 
+ 
+ def _find_ffmpeg() -> str:
+     return os.environ.get("FFMPEG_PATH", "ffmpeg")
+ 
+ 
+ def _find_ffprobe() -> str:
+     return os.environ.get("FFPROBE_PATH", "ffprobe")
+ 
+ 
+ def _format_time(seconds: float) -> str:
+     # Render seconds as M:SS(.ff) below one hour, else H:MM:SS(.ff); clamp negatives to "0".
+     if seconds < 0:
+         return "0"
+     if seconds < 3600:
+         m = int(seconds // 60)
+         s = seconds % 60
+         return f"{m}:{s:05.2f}" if s != int(s) else f"{m}:{int(s):02d}"
+     h = int(seconds // 3600)
+     m = int((seconds % 3600) // 60)
+     s = seconds % 60
+     return f"{h}:{m:02d}:{s:05.2f}" if s != int(s) else f"{h}:{m:02d}:{int(s):02d}"
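+ 
+ 
+ # Worked examples of _format_time (derived from the branches above):
+ #   _format_time(90)     -> "1:30"
+ #   _format_time(3725.5) -> "1:02:05.50"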
+ 
+ 
+ def get_duration(video_path: str) -> float:
+     cmd = [
+         _find_ffprobe(),
+         "-v", "error",
+         "-show_entries", "format=duration",
+         "-of", "default=noprint_wrappers=1:nokey=1",
+         video_path,
+     ]
+     r = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
+     if r.returncode != 0:
+         raise RuntimeError(f"ffprobe failed: {r.stderr or r.stdout}")
+     line = (r.stdout or "").strip()
+     if not line:
+         return 0.0
+     return float(line)
+ 
+ 
+ def split_chunk(video_path: str, start_time: float, duration: float, output_path: str) -> None:
+     # -ss before -i seeks on the input (fast, keyframe-aligned); -c copy avoids re-encoding.
+     cmd = [
+         _find_ffmpeg(),
+         "-ss", _format_time(start_time),
+         "-i", video_path,
+         "-t", _format_time(duration),
+         "-c", "copy",
+         "-avoid_negative_ts", "make_zero",
+         "-y", output_path,
+     ]
+     run_ffmpeg(cmd)
+ 
+ 
+ def run_ffmpeg(cmd: List[str], timeout: Optional[int] = None) -> None:
+     logger.debug("FFmpeg: %s", " ".join(cmd))
+     p = subprocess.run(
+         cmd,
+         capture_output=True,
+         text=True,
+         timeout=timeout or 86400,
+     )
+     if p.returncode != 0:
+         raise RuntimeError(f"FFmpeg exit code {p.returncode}: {p.stderr[:2000] if p.stderr else p.stdout}")
+ 
+ 
+ def build_video_filter(config: dict, original_width: int, original_height: int) -> str:
+     """Aligned with the Java buildFilterFor1080pInput / buildFilterFor4KInput."""
+     target_width = config.get("targetWidth", 1920)
+     target_height = config.get("targetHeight", 1080)
+     crop_bottom = config.get("cropBottom", 0)
+     if target_width <= 0 or target_height <= 0:
+         target_width, target_height = 1920, 1080
+     cropped_height = original_height - crop_bottom
+ 
+     if original_width == 1920 and original_height == 1080:
+         filters = []
+         if crop_bottom > 0:
+             filters.append(f"crop={original_width}:{cropped_height}:0:0")
+             current_h = cropped_height
+         else:
+             current_h = original_height
+         current_w = original_width
+         aspect = current_w / current_h if current_h else 16 / 9
+         if abs(aspect - 16 / 9) > 0.01:
+             final_w = int(current_h * 16 / 9)
+             crop_x = (current_w - final_w) // 2
+             filters.append(f"crop={final_w}:{current_h}:{crop_x}:0")
+             current_w = final_w
+         if current_w != target_width or current_h != target_height:
+             filters.append(f"scale={target_width}:{target_height}")
+         return ",".join(filters) if filters else "null"
+ 
+     # 4K/high-resolution inputs: crop bottom -> scale -> crop sides -> scale
+     scaled_w = target_width
+     scaled_h = int(cropped_height * scaled_w / original_width)
+     final_crop_w = int(scaled_h * 16 / 9)
+     crop_x = (scaled_w - final_crop_w) // 2
+     return (
+         f"crop={original_width}:{cropped_height}:0:0,"
+         f"scale={scaled_w}:{scaled_h},"
+         f"crop={final_crop_w}:{scaled_h}:{crop_x}:0,"
+         f"scale={target_width}:{target_height}"
+     )
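+ 
+ 
+ # Worked example of the 4K branch above: a 3840x2160 input with cropBottom=160 and the
+ # default 1920x1080 target yields
+ #   crop=3840:2000:0:0,scale=1920:1000,crop=1777:1000:71:0,scale=1920:1080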
+ 
+ 
+ def build_desubtitle_command(config: dict, input_path: str, output_path: str) -> List[str]:
+     """Build the subtitle-removal FFmpeg command (software libx264 encoding; portable across servers)."""
+     start_time = config.get("startTime", 0) or 0
+     end_time = config.get("endTime", 0) or 0
+     keep_audio = config.get("keepAudio", True)
+     audio_bitrate = config.get("audioBitrate", 192)
+     video_quality = config.get("videoQuality", 23)
+     original_width = config.get("originalWidth", 0) or 1920
+     original_height = config.get("originalHeight", 0) or 1080
+     force_keyframe = config.get("forceKeyframeAtStart", False)
+ 
+     cmd = [_find_ffmpeg()]
+     if start_time > 0:
+         cmd += ["-ss", _format_time(start_time)]
+     cmd += ["-i", input_path]
+     if end_time > 0 and start_time >= 0:
+         duration = end_time - start_time
+         if duration > 0:
+             cmd += ["-t", _format_time(duration)]
+     elif end_time > 0:
+         cmd += ["-t", _format_time(end_time)]
+ 
+     vf = build_video_filter(config, original_width, original_height)
+     if vf and vf != "null":
+         cmd += ["-vf", vf]
+     cmd += ["-c:v", "libx264", "-preset", "medium", "-crf", str(video_quality)]
+     if force_keyframe:
+         cmd += ["-force_key_frames", "expr:eq(n,0)"]
+     if keep_audio:
+         cmd += ["-c:a", "aac", "-b:a", f"{audio_bitrate}k", "-ac", "2"]
+     else:
+         cmd += ["-an"]
+     cmd += ["-y", output_path]
+     return cmd
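+ 
+ 
+ # Illustrative result: with an all-default config and a plain 1920x1080 input (so the
+ # filter chain collapses to "null" and is skipped), the builder above returns
+ #   ffmpeg -i IN -c:v libx264 -preset medium -crf 23 -c:a aac -b:a 192k -ac 2 -y OUT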
+ 
+ 
+ def run_desubtitle(config: dict, input_path: str, output_path: str, progress_callback=None) -> bool:
+     cmd = build_desubtitle_command(config, input_path, output_path)
+     run_ffmpeg(cmd)
+     return True
+ 
+ 
+ def merge_with_concat(filelist_path: str, output_path: str) -> None:
+     # concat demuxer: filelist_path holds "file '<path>'" lines; -safe 0 permits absolute paths.
+     cmd = [
+         _find_ffmpeg(),
+         "-f", "concat", "-safe", "0",
+         "-i", filelist_path,
+         "-c", "copy",
+         "-y", output_path,
+     ]
+     run_ffmpeg(cmd)
+ 
+ 
+ def trim_video(input_path: str, start_offset: float, duration: float, output_path: str) -> None:
+     cmd = [
+         _find_ffmpeg(),
+         "-i", input_path,
+         "-ss", _format_time(start_offset),
+         "-t", _format_time(duration),
+         "-c", "copy",
+         "-y", output_path,
+     ]
+     run_ffmpeg(cmd)
+ 
+ 
+ def split_video_to_chunks(
+     video_path: str,
+     output_dir: str,
+     chunk_size_sec: float,
+     range_start: float,
+     range_end: float,
+ ) -> Tuple[dict, str]:
+     """
+     Split the video into chunks under output_dir/<video_id>/ and return (metadata_dict, video_id).
+     """
+     path = Path(video_path)
+     if not path.exists():
+         raise FileNotFoundError(f"Video not found: {video_path}")
+     duration = get_duration(video_path)
+     if duration <= 0:
+         raise ValueError(f"Could not determine video duration: {video_path}")
+ 
+     start_sec = max(0, range_start)
+     end_sec = min(duration, range_end) if range_end > 0 else duration
+     if start_sec >= end_sec:
+         raise ValueError("Start time must be less than end time")
+     effective = end_sec - start_sec
+ 
+     video_id = path.stem + "_" + uuid.uuid4().hex[:8]
+     chunk_dir = Path(output_dir) / video_id
+     original_dir = chunk_dir / "original"
+     original_dir.mkdir(parents=True, exist_ok=True)
+ 
+     total_chunks = math.ceil(effective / chunk_size_sec)
+     chunks = []
+     for i in range(total_chunks):
+         ch_start = start_sec + i * chunk_size_sec
+         ch_end = min(start_sec + (i + 1) * chunk_size_sec, end_sec)
+         ch_duration = ch_end - ch_start
+         chunk_id = f"chunk_{i:03d}"
+         out_path = original_dir / f"{chunk_id}.mp4"
+         split_chunk(video_path, ch_start, ch_duration, str(out_path))
+         rel_path = f"{video_id}/original/{chunk_id}.mp4"
+         chunks.append({
+             "chunkId": chunk_id,
+             "startTime": ch_start,
+             "endTime": ch_end,
+             "originalPath": rel_path,
+             "processedPath": "",
+             "status": "pending",
+             "processedAt": "",
+             "errorMessage": "",
+         })
+         logger.info("Chunk written: %s (%.1f - %.1f s)", chunk_id, ch_start, ch_end)
+ 
+     metadata = {
+         "videoId": video_id,
+         "originalPath": video_path,
+         "chunkSize": chunk_size_sec,
+         "totalChunks": total_chunks,
+         "chunks": chunks,
+         "createdAt": datetime.datetime.utcnow().isoformat() + "Z",
+     }
+     meta_path = chunk_dir / "metadata.json"
+     with open(meta_path, "w", encoding="utf-8") as f:
+         json.dump(metadata, f, indent=2, ensure_ascii=False)
+     metadata["_metadataPath"] = str(meta_path)
+     return metadata, video_id
+ 
+ 
+ def merge_chunks(metadata: dict, start_time: float, end_time: float, output_path: str) -> bool:
+     """Merge processed chunks (sorted by startTime; concat, then an exact trim)."""
+     chunks = metadata.get("chunks") or []
+     processed = [c for c in chunks if c.get("status") == "processed" and c.get("processedPath")]
+     processed = [c for c in processed if Path(c["processedPath"]).exists()]
+     if not processed:
+         raise ValueError("No processed chunks available")
+     processed.sort(key=lambda c: c["startTime"])
+ 
+     with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
+         for c in processed:
+             f.write(f"file '{Path(c['processedPath']).resolve()}'\n")
+         list_path = f.name
+     try:
+         out_path = Path(output_path)
+         tmp_concat = out_path.parent / f"chunk_merge_{os.getpid()}.mp4"
+         tmp_trim = out_path.parent / f"chunk_trim_{os.getpid()}.mp4"
+         try:
+             merge_with_concat(list_path, str(tmp_concat))
+             first_start = processed[0]["startTime"]
+             trim_start = max(0, start_time - first_start)
+             exact_duration = end_time - start_time
+             trim_video(str(tmp_concat), trim_start, exact_duration, str(tmp_trim))
+             tmp_trim.replace(out_path)
+         finally:
+             tmp_concat.unlink(missing_ok=True)
+             tmp_trim.unlink(missing_ok=True)
+         return True
+     finally:
+         os.unlink(list_path)
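+ 
+ 
+ # The temporary concat list consumed by merge_with_concat has one line per processed
+ # chunk, e.g. (illustrative paths):
+ #   file '/server/data/a_1b2c3d4e/chunk_000_desub.mp4'
+ #   file '/server/data/a_1b2c3d4e/chunk_001_desub.mp4'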
+++ metadata.py
@@ -0,0 +1,75 @@
+ # -*- coding: utf-8 -*-
+ """
+ Read/write metadata.json and update it after subtitle removal (with a cross-process
+ file lock, matching the Java MarkerExportService).
+ """
+ import datetime
+ import json
+ import logging
+ from pathlib import Path
+ from typing import List, Dict, Any
+ 
+ logger = logging.getLogger(__name__)
+ 
+ 
+ def load_metadata(metadata_path: str) -> Dict[str, Any]:
+     with open(metadata_path, "r", encoding="utf-8") as f:
+         data = json.load(f)
+     return data
+ 
+ 
+ def save_metadata(metadata_path: str, data: dict) -> None:
+     # Write-then-rename keeps the file atomic for concurrent readers.
+     tmp = metadata_path + ".tmp"
+     with open(tmp, "w", encoding="utf-8") as f:
+         json.dump(data, f, indent=2, ensure_ascii=False)
+     Path(tmp).replace(metadata_path)
+ 
+ 
+ def get_processed_chunks(data: dict) -> List[Dict[str, Any]]:
+     chunks = data.get("chunks") or []
+     return [c for c in chunks if c.get("status") == "processed"]
+ 
+ 
+ def get_pending_chunks(data: dict) -> List[Dict[str, Any]]:
+     chunks = data.get("chunks") or []
+     return [c for c in chunks if c.get("status") in ("pending", "failed")]
+ 
+ 
+ def update_chunk_processed(metadata_path: str, chunk_id: str, processed_path: str) -> None:
+     """Mark the given chunk as processed and record its processedPath; a same-directory
+     check plus a file lock guard against cross-video mix-ups and lost concurrent updates."""
+     meta_path = Path(metadata_path)
+     processed_path_obj = Path(processed_path)
+     meta_dir = meta_path.parent.resolve()
+     processed_dir = processed_path_obj.parent.resolve()
+     if meta_dir != processed_dir:
+         logger.error("Cross-video guard: processedPath is not in the metadata directory, meta_dir=%s, processed_dir=%s, chunk_id=%s",
+                      meta_dir, processed_dir, chunk_id)
+         raise ValueError("processedPath must live in the same directory as metadata.json, to prevent mixing chunks across videos")
+ 
+     lock_path = Path(metadata_path + ".lock")
+     lock_path.parent.mkdir(parents=True, exist_ok=True)
+     try:
+         import fcntl
+         _use_flock = True
+     except ImportError:
+         _use_flock = False  # No fcntl on Windows: single-process use, or accept the concurrency risk
+ 
+     def _do_update():
+         data = load_metadata(metadata_path)
+         for chunk in data.get("chunks") or []:
+             if chunk.get("chunkId") == chunk_id:
+                 chunk["processedPath"] = processed_path
+                 chunk["status"] = "processed"
+                 chunk["processedAt"] = datetime.datetime.utcnow().isoformat() + "Z"
+                 save_metadata(metadata_path, data)
+                 logger.info("Marked chunk %s as processed in metadata: %s", chunk_id, processed_path)
+                 return
+         logger.warning("Chunk not found in metadata: %s", chunk_id)
+ 
+     if _use_flock:
+         with open(lock_path, "w") as lock_file:
+             fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)
+             try:
+                 _do_update()
+             finally:
+                 fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)
+     else:
+         _do_update()
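+ 
+ 
+ # Illustrative call (hypothetical paths): record chunk_000_desub.mp4, produced next to
+ # its metadata.json, and flip the chunk's status to "processed":
+ #   update_chunk_processed("/server/data/vid/metadata.json", "chunk_000",
+ #                          "/server/data/vid/chunk_000_desub.mp4")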
+++ pyproject.toml
@@ -0,0 +1,29 @@
+ [build-system]
+ requires = ["setuptools>=61.0"]
+ build-backend = "setuptools.build_meta"
+ 
+ [project]
+ name = "videoconverter-worker"
+ version = "1.0.0"
+ description = "VideoConverter Python Worker: reads tasks from the queue directory and performs splitting, subtitle removal, and merging"
+ readme = "README.txt"
+ requires-python = ">=3.8"
+ license = { text = "MIT" }
+ keywords = ["videoconverter", "ffmpeg", "worker", "video"]
+ classifiers = [
+     "License :: OSI Approved :: MIT License",
+     "Operating System :: OS Independent",
+     "Programming Language :: Python :: 3",
+     "Programming Language :: Python :: 3.8",
+     "Programming Language :: Python :: 3.9",
+     "Programming Language :: Python :: 3.10",
+     "Programming Language :: Python :: 3.11",
+     "Programming Language :: Python :: 3.12",
+ ]
+ 
+ [project.scripts]
+ videoconverter = "worker:main"
+ videoconverter-worker = "worker:main"
+ 
+ [tool.setuptools]
+ py-modules = ["worker", "task_queue", "metadata", "ffmpeg_runner"]
+++ setup.cfg
@@ -0,0 +1,4 @@
+ [egg_info]
+ tag_build =
+ tag_date = 0
+ 
+++ task_queue.py
@@ -0,0 +1,244 @@
+ # -*- coding: utf-8 -*-
+ """
+ Queue file I/O and task claiming, following the conventions in queue_task_schema.txt.
+ (Named task_queue to avoid clashing with the standard-library queue module.)
+ """
+ import json
+ import logging
+ import os
+ import time
+ import uuid
+ from pathlib import Path
+ from typing import Optional, List, Dict, Any
+ 
+ logger = logging.getLogger(__name__)
+ 
+ QUEUE_DIR_NAME = "queue"
+ CONFIG_FILE_NAME = "queue_config.json"
+ 
+ 
+ def _format_time(seconds: float) -> str:
+     if seconds < 0:
+         return "0"
+     if seconds < 3600:
+         m = int(seconds // 60)
+         s = seconds % 60
+         return f"{m}:{s:05.2f}" if s != int(s) else f"{m}:{int(s):02d}"
+     h = int(seconds // 3600)
+     m = int((seconds % 3600) // 60)
+     s = seconds % 60
+     return f"{h}:{m:02d}:{s:05.2f}" if s != int(s) else f"{h}:{m:02d}:{int(s):02d}"
+ 
+ 
+ class QueueStore:
+     def __init__(self, data_dir: str, path_replace: Optional[tuple] = None):
+         """
+         :param data_dir: data directory, e.g. ~/.videoconverter/data
+         :param path_replace: (old_prefix, new_prefix); rewrites path prefixes in tasks for local-to-server moves
+         """
+         self.data_dir = Path(data_dir).resolve()
+         self.queue_dir = self.data_dir / QUEUE_DIR_NAME
+         self.config_file = self.data_dir / CONFIG_FILE_NAME
+         self.path_replace = path_replace  # (old, new)
+         self._ensure_dirs()
+ 
+     def _ensure_dirs(self) -> None:
+         self.queue_dir.mkdir(parents=True, exist_ok=True)
+         if not self.config_file.exists():
+             self._write_config({"queue_paused": "true"})
+ 
+     def _apply_path(self, path: str) -> str:
+         if not path or not self.path_replace:
+             return path
+         old, new = self.path_replace
+         if path.startswith(old):
+             # Append the remainder verbatim; it keeps its leading separator, so the new
+             # prefix is never glued directly onto the next path component.
+             return new + path[len(old):]
+         return path
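+     # e.g. with path_replace = ("/Users/me/videos", "/server/videos"):
+     #   _apply_path("/Users/me/videos/a.mp4") -> "/server/videos/a.mp4"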
56
+
57
+ def _read_config(self) -> dict:
58
+ if not self.config_file.exists():
59
+ return {}
60
+ try:
61
+ with open(self.config_file, "r", encoding="utf-8") as f:
62
+ return json.load(f)
63
+ except Exception:
64
+ return {}
65
+
66
+ def _write_config(self, config: dict) -> None:
67
+ tmp = self.config_file.with_suffix(self.config_file.suffix + ".tmp")
68
+ with open(tmp, "w", encoding="utf-8") as f:
69
+ json.dump(config, f, indent=2, ensure_ascii=False)
70
+ tmp.replace(self.config_file)
71
+
72
+ def is_paused(self) -> bool:
73
+ cfg = self._read_config()
74
+ return cfg.get("queue_paused", "true") == "true"
75
+
76
+ def set_paused(self, paused: bool) -> None:
77
+ cfg = self._read_config()
78
+ cfg["queue_paused"] = "true" if paused else "false"
79
+ self._write_config(cfg)
80
+
81
+ def _task_file(self, task_id: str) -> Path:
82
+ return self.queue_dir / f"{task_id}.json"
83
+
84
+ def _lock_file(self, task_id: str) -> Path:
85
+ return self.queue_dir / f"{task_id}.lock"
86
+
87
+ def _log_file(self, task_id: str) -> Path:
88
+ return self.queue_dir / f"{task_id}.log"
89
+
90
+ def _list_task_files(self) -> List[Path]:
91
+ if not self.queue_dir.exists():
92
+ return []
93
+ return sorted(self.queue_dir.glob("*.json"), key=lambda p: p.name)
94
+
95
+ def acquire_pending_task(self) -> Optional[Dict[str, Any]]:
96
+ """原子抢占一个 PENDING 任务,返回任务 dict 或 None。"""
97
+ files = self._list_task_files()
98
+ pending_with_time = []
99
+ for p in files:
100
+ try:
101
+ with open(p, "r", encoding="utf-8") as f:
102
+ data = json.load(f)
103
+ if data.get("status") != "PENDING":
104
+ continue
105
+ pending_with_time.append((data.get("created_time", 0), p))
106
+ except Exception as e:
107
+ logger.warning("读取任务文件失败 %s: %s", p.name, e)
108
+ pending_with_time.sort(key=lambda x: x[0])
109
+
110
+ for _, path in pending_with_time:
111
+ task_id = path.stem
112
+ lock_path = self._lock_file(task_id)
113
+ try:
114
+ with open(lock_path, "x"):
115
+ pass
116
+ except FileExistsError:
117
+ continue
118
+ try:
119
+ with open(path, "r", encoding="utf-8") as f:
120
+ data = json.load(f)
121
+ if data.get("status") != "PENDING":
122
+ lock_path.unlink(missing_ok=True)
123
+ continue
124
+ data["status"] = "PROCESSING"
125
+ data["start_time"] = int(__import__("time").time() * 1000)
126
+ with open(path, "w", encoding="utf-8") as f:
127
+ json.dump(data, f, indent=2, ensure_ascii=False)
128
+ self._apply_paths_to_task(data)
129
+ logger.debug("抢占任务: %s", task_id)
130
+ return data
131
+ except Exception as e:
132
+ logger.exception("抢占任务失败 %s: %s", task_id, e)
133
+ raise
134
+ finally:
135
+ lock_path.unlink(missing_ok=True)
136
+ return None
+ 
+     def _apply_paths_to_task(self, task: dict) -> None:
+         for key in ("input_file", "output_dir"):
+             if key in task and task[key]:
+                 task[key] = self._apply_path(task[key])
+         cfg = task.get("config") or {}
+         for key in ("inputPath", "outputPath"):
+             if key in cfg and cfg[key]:
+                 cfg[key] = self._apply_path(cfg[key])
+         task["config"] = cfg
+ 
+     def update_progress(self, task_id: str, progress: float, progress_text: str) -> None:
+         self._update_task(task_id, {
+             "progress": max(0, min(100, progress)),
+             "progress_text": progress_text or "",
+         })
+ 
+     def complete_task(self, task_id: str) -> None:
+         self._update_task(task_id, {
+             "status": "COMPLETED",
+             "progress": 100.0,
+             "progress_text": "Completed",
+             "end_time": int(time.time() * 1000),
+         })
+ 
+     def fail_task(self, task_id: str, error_message: str) -> None:
+         self._update_task(task_id, {
+             "status": "FAILED",
+             "end_time": int(time.time() * 1000),
+             "error_message": error_message or "",
+         })
+ 
+     def _update_task(self, task_id: str, updates: dict) -> None:
+         path = self._task_file(task_id)
+         if not path.exists():
+             return
+         with open(path, "r", encoding="utf-8") as f:
+             data = json.load(f)
+         data.update(updates)
+         tmp = path.with_suffix(path.suffix + ".tmp")
+         with open(tmp, "w", encoding="utf-8") as f:
+             json.dump(data, f, indent=2, ensure_ascii=False)
+         tmp.replace(path)
+ 
+     def add_log(self, task_id: str, level: str, message: str) -> None:
+         log_path = self._log_file(task_id)
+         line = f"{int(time.time() * 1000)}\t{level}\t{message}\n"
+         with open(log_path, "a", encoding="utf-8") as f:
+             f.write(line)
+ 
+     def get_tasks_by_video_id(self, video_id: str) -> List[Dict[str, Any]]:
+         out = []
+         for p in self._list_task_files():
+             try:
+                 with open(p, "r", encoding="utf-8") as f:
+                     d = json.load(f)
+                 if d.get("video_id") == video_id:
+                     self._apply_paths_to_task(d)
+                     out.append(d)
+             except Exception:
+                 pass
+         out.sort(key=lambda x: x.get("created_time", 0))
+         return out
+ 
+     def create_desubtitle_task(self, input_file: str, output_dir: str, config: dict,
+                                parent_task_id: str, video_id: str) -> str:
+         task_id = str(uuid.uuid4())
+         task = {
+             "task_id": task_id,
+             "task_type": "DESUBTITLE",
+             "parent_task_id": parent_task_id or "",
+             "video_id": video_id or "",
+             "input_file": input_file,
+             "output_dir": output_dir,
+             "config": dict(config) if config else {},
+             "status": "PENDING",
+             "progress": 0.0,
+             "progress_text": "Waiting for subtitle removal...",
+             "created_time": int(time.time() * 1000),
+             "error_message": "",
+         }
+         path = self._task_file(task_id)
+         with open(path, "w", encoding="utf-8") as f:
+             json.dump(task, f, indent=2, ensure_ascii=False)
+         logger.info("Created subtitle-removal task: %s -> %s", input_file, task_id)
+         return task_id
+ 
+     def create_merge_task(self, video_id: str, output_dir: str, config: dict) -> str:
+         task_id = str(uuid.uuid4())
+         task = {
+             "task_id": task_id,
+             "task_type": "MERGE",
+             "parent_task_id": "",
+             "video_id": video_id,
+             "input_file": "",
+             "output_dir": output_dir,
+             "config": dict(config) if config else {},
+             "status": "PENDING",
+             "progress": 0.0,
+             "progress_text": "Waiting for merge...",
+             "created_time": int(time.time() * 1000),
+             "error_message": "",
+         }
+         path = self._task_file(task_id)
+         with open(path, "w", encoding="utf-8") as f:
+             json.dump(task, f, indent=2, ensure_ascii=False)
+         logger.info("Created merge task: videoId=%s -> %s", video_id, task_id)
+         return task_id
+++ videoconverter_worker.egg-info/PKG-INFO
@@ -0,0 +1,81 @@
+ Metadata-Version: 2.4
+ Name: videoconverter-worker
+ Version: 1.0.0
+ Summary: VideoConverter Python Worker: reads tasks from the queue directory and performs splitting, subtitle removal, and merging
+ License: MIT
+ Keywords: videoconverter,ffmpeg,worker,video
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.8
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Requires-Python: >=3.8
+ Description-Content-Type: text/plain
+ 
+ Python Worker Usage Guide
+ 
+ 0) Installation (pick one)
+ - Install from PyPI (recommended; works on Windows/Linux/macOS):
+     pip install videoconverter-worker
+   After installing, run: videoconverter [--data-dir DIR] [--path-replace OLD=NEW]
+ - Or install from local source: cd src/python && pip install .
+ 
+ 1) What it does
+ Mirrors the behavior of the Java BackendWorker: reads tasks from the queue directory (queue/<task_id>.json),
+ executes SPLIT / DESUBTITLE / MERGE / ONE_CLICK_COMPOSE, and updates task status and metadata.json.
+ Suited to running on a server, using its many cores and faster hardware for splitting, subtitle removal, and merging.
+ 
+ 2) Requirements
+ - Python 3.8+
+ - ffmpeg and ffprobe installed on the system (or available on PATH)
+ - Optional: point the FFMPEG_PATH and FFPROBE_PATH environment variables at the binaries
+ 
+ 3) Sharing one queue with Java on the same machine
+ - Just use the same data directory, e.g.: ~/.videoconverter/data
+ - Have the Java frontend create tasks first (writing queue/*.json), then run the Python worker locally to process them:
+     cd src/python
+     python worker.py
+ - Or point it at a specific data directory:
+     python worker.py --data-dir /path/to/.videoconverter/data
+ 
+ 4) Running on a server after copying over a folder prepared by the frontend
+ - Copy the entire queue "data directory" from your local machine to the server (e.g. /server/data);
+   it should contain the queue/ directory and the queue/*.json task files.
+ - If the paths inside the task JSON are local absolute paths (e.g. /Users/me/videos/a.mp4),
+   rewrite them on the server, or the files will not be found:
+     python worker.py --data-dir /server/data --path-replace "/Users/me/videos=/server/videos"
+ - Environment-variable form (convenient for scripts/systemd):
+     export VIDEOCONVERTER_DATA_DIR=/server/data
+     export VIDEOCONVERTER_PATH_REPLACE="/Users/me/videos=/server/videos"
+     python worker.py
+ - Tip: keep videos in a fixed directory on the server (e.g. /server/videos), use a single local
+   prefix for all paths in the copied queue JSON, and swap it for the server prefix with --path-replace.
+ 
+ 5) Multi-process concurrency
+ The current worker is a single-process, single-threaded loop. To saturate multiple cores, start several processes on the same data-dir:
+ - Task claiming works by atomically creating queue/<task_id>.lock, so multiple processes never grab the same task.
+ - Example (4 worker processes):
+     for i in 1 2 3 4; do python worker.py --data-dir /server/data & done
+   Or run several worker instances under systemd/supervisor.
+ 
+ 6) Pause/resume
+ Queue pausing is controlled by the queue_paused key in queue_config.json ("true" means paused).
+ The Python worker re-reads this config periodically and takes no new tasks while it is "true".
+ It can be toggled from the Java frontend or by editing the file by hand.
+ 
+ 7) Conventions shared with Java
+ - The task and config formats are documented in queue_task_schema.txt at the project root.
+ - metadata.json matches the format the Java side produces; merging (MERGE) is driven by the metadata.
+ 
+ 8) Packaging and deployment
+ - Build a zip (copy to the server, unzip, run):
+     cd src/python
+     ./build_deploy.sh
+   This produces videoconverter.zip; unzip it and run, inside that directory:
+     python3 worker.py [--data-dir DIR] [--path-replace OLD=NEW]
+ - Or install it as a command-line tool (works locally or on the server):
+     cd src/python
+     pip install .
+   Then run it directly: videoconverter [--data-dir DIR] [--path-replace OLD=NEW]
+++ videoconverter_worker.egg-info/SOURCES.txt
@@ -0,0 +1,11 @@
+ README.txt
+ ffmpeg_runner.py
+ metadata.py
+ pyproject.toml
+ task_queue.py
+ worker.py
+ videoconverter_worker.egg-info/PKG-INFO
+ videoconverter_worker.egg-info/SOURCES.txt
+ videoconverter_worker.egg-info/dependency_links.txt
+ videoconverter_worker.egg-info/entry_points.txt
+ videoconverter_worker.egg-info/top_level.txt
+++ videoconverter_worker.egg-info/entry_points.txt
@@ -0,0 +1,3 @@
+ [console_scripts]
+ videoconverter = worker:main
+ videoconverter-worker = worker:main
+++ videoconverter_worker.egg-info/top_level.txt
@@ -0,0 +1,4 @@
+ ffmpeg_runner
+ metadata
+ task_queue
+ worker
+++ worker.py
@@ -0,0 +1,266 @@
+ # -*- coding: utf-8 -*-
+ """
+ Python worker: reads tasks from the queue directory and performs splitting,
+ subtitle removal, and merging, mirroring the Java BackendWorker.
+ Usage:
+     python worker.py [--data-dir DIR] [--path-replace OLD=NEW] [--workers N]
+ Or set the environment variables VIDEOCONVERTER_DATA_DIR and VIDEOCONVERTER_PATH_REPLACE (OLD=NEW).
+ """
+ import argparse
+ import logging
+ import os
+ import sys
+ import time
+ from pathlib import Path
+ 
+ from task_queue import QueueStore
+ from metadata import load_metadata, get_processed_chunks, get_pending_chunks, update_chunk_processed
+ from ffmpeg_runner import (
+     split_video_to_chunks,
+     run_desubtitle,
+     merge_chunks,
+     get_duration,
+ )
+ 
+ logging.basicConfig(
+     level=logging.INFO,
+     format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
+     datefmt="%Y-%m-%d %H:%M:%S",
+ )
+ logger = logging.getLogger("worker")
+ 
+ 
+ def process_split_task(store: QueueStore, task: dict) -> None:
+     task_id = task["task_id"]
+     input_file = task["input_file"]
+     output_dir = task["output_dir"]
+     config = task.get("config") or {}
+     range_start = config.get("startTime", 0) or 0
+     range_end = config.get("endTime", 0) or 0
+ 
+     store.add_log(task_id, "INFO", f"Splitting video: {Path(input_file).name} (range: {range_start} - {range_end} s)")
+     metadata, video_id = split_video_to_chunks(input_file, output_dir, 120.0, range_start, range_end)
+     store.add_log(task_id, "INFO", f"Split complete: {metadata['totalChunks']} chunks, metadata: {metadata.get('_metadataPath', '')}")
+ 
+     store.complete_task(task_id)
+ 
+     # Create one subtitle-removal task per chunk
+     chunk_files = []
+     for ch in metadata.get("chunks") or []:
+         rel = ch.get("originalPath", "")
+         if rel:
+             chunk_path = Path(output_dir) / rel
+             chunk_files.append(str(chunk_path))
+ 
+     for chunk_file in chunk_files:
+         store.create_desubtitle_task(chunk_file, output_dir, config, task_id, video_id)
+     store.add_log(task_id, "INFO", f"Created {len(chunk_files)} subtitle-removal tasks")
+ 
+ 
+ def process_desubtitle_task(store: QueueStore, task: dict) -> None:
+     task_id = task["task_id"]
+     input_file = task["input_file"]
+     output_dir = task["output_dir"]
+     video_id = task.get("video_id") or ""
+     config = dict(task.get("config") or {})
+ 
+     input_path = Path(input_file)
+     if not input_path.exists():
+         store.fail_task(task_id, f"Input file not found: {input_file}")
+         return
+ 
+     store.add_log(task_id, "INFO", f"Removing subtitles: {input_path.name}")
+ 
+     output_dir_for_file = Path(output_dir) / video_id if video_id else Path(output_dir)
+     output_dir_for_file.mkdir(parents=True, exist_ok=True)
+     output_name = input_path.stem + "_desub.mp4"
+     output_file = output_dir_for_file / output_name
+ 
+     config["inputPath"] = input_file
+     config["outputPath"] = str(output_file)
+     if video_id:
+         # Chunks are already cut to their range; process the whole chunk and force a
+         # keyframe at frame 0 so the later concat-based merge stays clean.
+         config["startTime"] = 0
+         config["endTime"] = 0
+         config["forceKeyframeAtStart"] = True
+ 
+     try:
+         run_desubtitle(config, input_file, str(output_file))
+     except Exception as e:
+         store.fail_task(task_id, str(e))
+         store.add_log(task_id, "WARN", str(e))
+         return
+ 
+     store.complete_task(task_id)
+     store.add_log(task_id, "INFO", f"Subtitle removal complete: {output_name}")
+ 
+     if video_id:
+         metadata_path = Path(output_dir) / video_id / "metadata.json"
+         if metadata_path.exists():
+             chunk_id = input_path.stem
+             try:
+                 update_chunk_processed(str(metadata_path), chunk_id, str(output_file))
+                 check_and_create_merge_task(store, video_id, output_dir, config)
+             except Exception as e:
+                 logger.warning("Failed to update metadata or check for merge: videoId=%s, chunkId=%s, %s", video_id, chunk_id, e)
+ 
+ 
+ def check_and_create_merge_task(store: QueueStore, video_id: str, output_dir: str, config: dict) -> None:
+     metadata_path = Path(output_dir) / video_id / "metadata.json"
+     if not metadata_path.exists():
+         return
+     data = load_metadata(str(metadata_path))
+     chunks = data.get("chunks") or []
+     total = len(chunks)
+     processed = get_processed_chunks(data)
+     pending = get_pending_chunks(data)
+     existing = store.get_tasks_by_video_id(video_id)
+     has_merge = any(t.get("task_type") == "MERGE" for t in existing)
+ 
+     if total > 0 and len(processed) == total and not pending and not has_merge:
+         store.create_merge_task(video_id, output_dir, config)
+         logger.info("Auto-created merge task: videoId=%s, %d/%d chunks processed", video_id, len(processed), total)
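+ 
+ 
+ # Lifecycle note (from the handlers above): SPLIT and ONE_CLICK_COMPOSE fan out one
+ # DESUBTITLE task per chunk; the last DESUBTITLE to finish trips this check and
+ # enqueues a single MERGE task for the whole video.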
+ 
+ 
+ def process_merge_task(store: QueueStore, task: dict) -> None:
+     task_id = task["task_id"]
+     video_id = task.get("video_id")
+     output_dir = task["output_dir"]
+ 
+     store.add_log(task_id, "INFO", f"Merging video: videoId={video_id}")
+ 
+     metadata_path = Path(output_dir) / video_id / "metadata.json"
+     if not metadata_path.exists():
+         store.fail_task(task_id, f"Metadata file not found: {metadata_path}")
+         return
+ 
+     data = load_metadata(str(metadata_path))
+     processed = get_processed_chunks(data)
+     if not processed:
+         store.fail_task(task_id, "No processed chunks")
+         return
+ 
+     start_time = processed[0]["startTime"]
+     end_time = processed[-1]["endTime"]
+     output_file = Path(output_dir) / f"{video_id}_merged.mp4"
+ 
+     store.add_log(task_id, "INFO", f"Merging {len(processed)} chunks, time range: {start_time} - {end_time} s")
+ 
+     try:
+         merge_chunks(data, start_time, end_time, str(output_file))
+         store.complete_task(task_id)
+         store.add_log(task_id, "INFO", f"Merge complete: {output_file.name}")
+     except Exception as e:
+         store.fail_task(task_id, str(e))
+         store.add_log(task_id, "WARN", str(e))
+ 
+ 
+ def process_one_click_compose_task(store: QueueStore, task: dict) -> None:
+     task_id = task["task_id"]
+     input_file = task["input_file"]
+     output_dir = task["output_dir"]
+     config = task.get("config") or {}
+     range_start = config.get("startTime", 0) or 0
+     range_end = config.get("endTime", 0) or 0
+ 
+     store.add_log(task_id, "INFO", f"One-click compose started: {Path(input_file).name} (videoId={task.get('video_id', '')})")
+ 
+     metadata, folder_video_id = split_video_to_chunks(input_file, output_dir, 120.0, range_start, range_end)
+     store.add_log(task_id, "INFO", f"Split complete: {metadata['totalChunks']} chunks")
+ 
+     store.complete_task(task_id)
+ 
+     chunk_files = []
+     for ch in metadata.get("chunks") or []:
+         rel = ch.get("originalPath", "")
+         if rel:
+             chunk_files.append(str(Path(output_dir) / rel))
+ 
+     for cf in chunk_files:
+         store.create_desubtitle_task(cf, output_dir, config, task_id, folder_video_id)
+     store.add_log(task_id, "INFO", f"Created {len(chunk_files)} subtitle-removal tasks")
+ 
+ 
+ def work_loop(store: QueueStore) -> None:
+     while True:
+         try:
+             if store.is_paused():
+                 time.sleep(1)
+                 continue
+ 
+             task = store.acquire_pending_task()
+             if task is None:
+                 time.sleep(1)
+                 continue
+ 
+             task_id = task["task_id"]
+             task_type = task.get("task_type", "")
+             logger.info("Processing task: %s (type: %s)", task_id, task_type)
+ 
+             try:
+                 if task_type == "SPLIT":
+                     process_split_task(store, task)
+                 elif task_type == "DESUBTITLE":
+                     process_desubtitle_task(store, task)
+                 elif task_type == "MERGE":
+                     process_merge_task(store, task)
+                 elif task_type == "ONE_CLICK_COMPOSE":
+                     process_one_click_compose_task(store, task)
+                 else:
+                     store.fail_task(task_id, f"Unknown task type: {task_type}")
+             except Exception as e:
+                 logger.exception("Task failed: %s", task_id)
+                 store.fail_task(task_id, str(e))
+                 store.add_log(task_id, "WARN", str(e))
+ 
+         except KeyboardInterrupt:
+             logger.info("Interrupted, exiting")
+             break
+         except Exception as e:
+             logger.exception("Work loop error: %s", e)
+             time.sleep(5)
+ 
+ 
+ def main() -> int:
+     parser = argparse.ArgumentParser(description="VideoConverter Python Worker")
+     parser.add_argument(
+         "--data-dir",
+         default=None,
+         help="Data directory (default: $VIDEOCONVERTER_DATA_DIR or ~/.videoconverter/data)",
+     )
+     parser.add_argument(
+         "--path-replace",
+         default=None,
+         help="Path prefix replacement for tasks copied to a server: OLD=NEW, e.g. /Users/me/videos=/data/videos",
+     )
+     parser.add_argument(
+         "--workers",
+         type=int,
+         default=1,
+         help="Number of concurrent workers (default 1; for multi-process use, start extra worker processes externally)",
+     )
+     args = parser.parse_args()
+ 
+     data_dir = args.data_dir or os.environ.get(
+         "VIDEOCONVERTER_DATA_DIR",
+         str(Path.home() / ".videoconverter" / "data"),
+     )
+     # The --path-replace flag wins over the VIDEOCONVERTER_PATH_REPLACE environment variable.
+     path_replace = None
+     if args.path_replace:
+         if "=" in args.path_replace:
+             a, b = args.path_replace.split("=", 1)
+             path_replace = (a.strip(), b.strip())
+     env_replace = os.environ.get("VIDEOCONVERTER_PATH_REPLACE")
+     if env_replace and "=" in env_replace and path_replace is None:
+         a, b = env_replace.split("=", 1)
+         path_replace = (a.strip(), b.strip())
+ 
+     store = QueueStore(data_dir, path_replace=path_replace)
+     logger.info("Data directory: %s", data_dir)
+     logger.info("Queue directory: %s", store.queue_dir)
+     if path_replace:
+         logger.info("Path replacement: %s -> %s", path_replace[0], path_replace[1])
+ 
+     work_loop(store)
+     return 0
+ 
+ 
+ if __name__ == "__main__":
+     sys.exit(main())