remotion-claude-agent-demo 0.1.0
- package/README.md +160 -0
- package/apps/web/README.md +36 -0
- package/apps/web/env.example +20 -0
- package/apps/web/eslint.config.mjs +18 -0
- package/apps/web/next.config.ts +7 -0
- package/apps/web/package-lock.json +10348 -0
- package/apps/web/package.json +35 -0
- package/apps/web/postcss.config.mjs +7 -0
- package/apps/web/public/file.svg +1 -0
- package/apps/web/public/globe.svg +1 -0
- package/apps/web/public/next.svg +1 -0
- package/apps/web/public/vercel.svg +1 -0
- package/apps/web/public/window.svg +1 -0
- package/apps/web/src/app/.well-known/agent-card.json/route.ts +50 -0
- package/apps/web/src/app/background-tasks/[jobId]/cancel/route.ts +29 -0
- package/apps/web/src/app/events/stream/route.ts +58 -0
- package/apps/web/src/app/favicon.ico +0 -0
- package/apps/web/src/app/globals.css +174 -0
- package/apps/web/src/app/layout.tsx +34 -0
- package/apps/web/src/app/messages/answer/route.ts +57 -0
- package/apps/web/src/app/messages/stream/route.ts +381 -0
- package/apps/web/src/app/page.tsx +358 -0
- package/apps/web/src/app/tasks/[taskId]/cancel/route.ts +24 -0
- package/apps/web/src/app/tasks/[taskId]/route.ts +24 -0
- package/apps/web/src/app/tasks/route.ts +13 -0
- package/apps/web/src/components/chat/agent-blocks.tsx +111 -0
- package/apps/web/src/components/chat/ask-user-question-panel.tsx +172 -0
- package/apps/web/src/components/chat/session-sidebar.tsx +222 -0
- package/apps/web/src/components/chat/subagent-activity-sidebar.tsx +248 -0
- package/apps/web/src/components/chat/tool-blocks.tsx +550 -0
- package/apps/web/src/lib/a2a/activity-store.ts +150 -0
- package/apps/web/src/lib/a2a/client.ts +357 -0
- package/apps/web/src/lib/a2a/sse.ts +19 -0
- package/apps/web/src/lib/a2a/task-store.ts +111 -0
- package/apps/web/src/lib/a2a/types.ts +216 -0
- package/apps/web/src/lib/agent/answer-store.ts +109 -0
- package/apps/web/src/lib/agent/background-delivery.ts +343 -0
- package/apps/web/src/lib/agent/background-tool.ts +78 -0
- package/apps/web/src/lib/agent/background.ts +452 -0
- package/apps/web/src/lib/agent/chat.ts +543 -0
- package/apps/web/src/lib/agent/session-store.ts +26 -0
- package/apps/web/src/lib/chat/types.ts +44 -0
- package/apps/web/src/lib/env.ts +31 -0
- package/apps/web/src/lib/hooks/useA2AChat.ts +863 -0
- package/apps/web/src/lib/state/chat-atoms.ts +52 -0
- package/apps/web/src/lib/workspace.ts +9 -0
- package/apps/web/tsconfig.json +35 -0
- package/bin/remotion-agent.js +451 -0
- package/package.json +34 -0
- package/templates/.claude/CLAUDE.md +95 -0
- package/templates/.claude/README.md +129 -0
- package/templates/.claude/agents/composer-agent.md +188 -0
- package/templates/.claude/agents/crafter.md +181 -0
- package/templates/.claude/agents/creator.md +134 -0
- package/templates/.claude/agents/perceiver.md +92 -0
- package/templates/.claude/settings.json +36 -0
- package/templates/.claude/settings.local.json +39 -0
- package/templates/.claude/skills/agent-browser/SKILL.md +349 -0
- package/templates/.claude/skills/agent-browser/references/authentication.md +188 -0
- package/templates/.claude/skills/agent-browser/references/proxy-support.md +175 -0
- package/templates/.claude/skills/agent-browser/references/session-management.md +181 -0
- package/templates/.claude/skills/agent-browser/references/snapshot-refs.md +186 -0
- package/templates/.claude/skills/agent-browser/references/video-recording.md +162 -0
- package/templates/.claude/skills/agent-browser/templates/authenticated-session.sh +91 -0
- package/templates/.claude/skills/agent-browser/templates/capture-workflow.sh +68 -0
- package/templates/.claude/skills/agent-browser/templates/form-automation.sh +64 -0
- package/templates/.claude/skills/algorithmic-art/LICENSE.txt +202 -0
- package/templates/.claude/skills/algorithmic-art/SKILL.md +405 -0
- package/templates/.claude/skills/algorithmic-art/templates/generator_template.js +223 -0
- package/templates/.claude/skills/algorithmic-art/templates/viewer.html +599 -0
- package/templates/.claude/skills/asset-validator/SKILL.md +376 -0
- package/templates/.claude/skills/audio-video-sync/SKILL.md +219 -0
- package/templates/.claude/skills/bgm-manager/SKILL.md +334 -0
- package/templates/.claude/skills/remotion-best-practices/SKILL.md +45 -0
- package/templates/.claude/skills/remotion-best-practices/rules/3d.md +86 -0
- package/templates/.claude/skills/remotion-best-practices/rules/animations.md +29 -0
- package/templates/.claude/skills/remotion-best-practices/rules/assets/charts-bar-chart.tsx +173 -0
- package/templates/.claude/skills/remotion-best-practices/rules/assets/text-animations-typewriter.tsx +100 -0
- package/templates/.claude/skills/remotion-best-practices/rules/assets/text-animations-word-highlight.tsx +108 -0
- package/templates/.claude/skills/remotion-best-practices/rules/assets.md +78 -0
- package/templates/.claude/skills/remotion-best-practices/rules/audio.md +172 -0
- package/templates/.claude/skills/remotion-best-practices/rules/calculate-metadata.md +104 -0
- package/templates/.claude/skills/remotion-best-practices/rules/can-decode.md +75 -0
- package/templates/.claude/skills/remotion-best-practices/rules/charts.md +58 -0
- package/templates/.claude/skills/remotion-best-practices/rules/compositions.md +141 -0
- package/templates/.claude/skills/remotion-best-practices/rules/display-captions.md +126 -0
- package/templates/.claude/skills/remotion-best-practices/rules/extract-frames.md +229 -0
- package/templates/.claude/skills/remotion-best-practices/rules/fonts.md +152 -0
- package/templates/.claude/skills/remotion-best-practices/rules/get-audio-duration.md +58 -0
- package/templates/.claude/skills/remotion-best-practices/rules/get-video-dimensions.md +68 -0
- package/templates/.claude/skills/remotion-best-practices/rules/get-video-duration.md +58 -0
- package/templates/.claude/skills/remotion-best-practices/rules/gifs.md +138 -0
- package/templates/.claude/skills/remotion-best-practices/rules/images.md +130 -0
- package/templates/.claude/skills/remotion-best-practices/rules/import-srt-captions.md +67 -0
- package/templates/.claude/skills/remotion-best-practices/rules/lottie.md +68 -0
- package/templates/.claude/skills/remotion-best-practices/rules/maps.md +403 -0
- package/templates/.claude/skills/remotion-best-practices/rules/measuring-dom-nodes.md +35 -0
- package/templates/.claude/skills/remotion-best-practices/rules/measuring-text.md +143 -0
- package/templates/.claude/skills/remotion-best-practices/rules/parameters.md +98 -0
- package/templates/.claude/skills/remotion-best-practices/rules/sequencing.md +118 -0
- package/templates/.claude/skills/remotion-best-practices/rules/tailwind.md +11 -0
- package/templates/.claude/skills/remotion-best-practices/rules/text-animations.md +20 -0
- package/templates/.claude/skills/remotion-best-practices/rules/timing.md +179 -0
- package/templates/.claude/skills/remotion-best-practices/rules/transcribe-captions.md +19 -0
- package/templates/.claude/skills/remotion-best-practices/rules/transitions.md +122 -0
- package/templates/.claude/skills/remotion-best-practices/rules/trimming.md +53 -0
- package/templates/.claude/skills/remotion-best-practices/rules/videos.md +171 -0
- package/templates/.claude/skills/remotion-components/SKILL.md +453 -0
- package/templates/.claude/skills/render-config/SKILL.md +290 -0
- package/templates/.claude/skills/script-writer/SKILL.md +59 -0
- package/templates/.claude/skills/style-director/script-writer/SKILL.md +82 -0
- package/templates/.claude/skills/style-director/style-director/SKILL.md +287 -0
- package/templates/.claude/skills/style-director/style-director/references/audience-and-scenarios.md +43 -0
- package/templates/.claude/skills/style-director/style-director/references/interaction-innovation.md +26 -0
- package/templates/.claude/skills/style-director/style-director/references/motion-grammar.md +66 -0
- package/templates/.claude/skills/style-director/style-director/references/quality-checklist.md +29 -0
- package/templates/.claude/skills/style-director/style-director/references/scene-recipes.md +38 -0
- package/templates/.claude/skills/style-director/style-director/references/visual-style-system.md +148 -0
- package/templates/.claude/skills/subtitle-composer/SKILL.md +304 -0
- package/templates/.claude/skills/subtitle-processor/SKILL.md +308 -0
- package/templates/.claude/skills/timeline-generator/SKILL.md +253 -0
- package/templates/.claude/skills/video-preflight-check/SKILL.md +353 -0
- package/templates/.claude/skills/voice-synthesizer/SKILL.md +296 -0
- package/templates/.claude/skills/voice-synthesizer/scripts/synthesize_voice.py +315 -0
- package/templates/.claude/skills/voice-synthesizer/scripts/tts_cli.py +142 -0
- package/templates/.claude/skills/web-design-guidelines/SKILL.md +36 -0
- package/templates/.claude/skills/youtube-downloader/SKILL.md +99 -0
- package/templates/.claude/skills/youtube-downloader/scripts/download_video.py +145 -0
@@ -0,0 +1,315 @@
+#!/usr/bin/env python3
+"""
+CosyVoice speech synthesis core module.
+Calls the cosyvoice-v3-flash model through the Alibaba Cloud DashScope SDK.
+"""
+
+import os
+import json
+import time
+import subprocess
+from typing import List, Dict, Optional, Tuple
+from pathlib import Path
+
+# Default configuration
+DEFAULT_MODEL = "cosyvoice-v3-flash"
+DEFAULT_VOICE = "longanyang"
+
+
+def load_config_from_env() -> Optional[str]:
+    """
+    Load environment variables from a config file.
+    Looks for a .env file in the following order:
+    1. The current working directory
+    2. The directory containing this script
+    3. The project root (searching upward for a directory that contains .git or .claude)
+
+    Returns:
+        Path of the loaded .env file, or None if none was found
+    """
+    try:
+        from dotenv import load_dotenv
+    except ImportError:
+        return None
+
+    # 1. Try the current working directory
+    cwd_env = Path.cwd() / ".env"
+    if cwd_env.exists():
+        load_dotenv(cwd_env)
+        return str(cwd_env)
+
+    # 2. Try the script directory
+    script_dir = Path(__file__).parent
+    script_env = script_dir / ".env"
+    if script_env.exists():
+        load_dotenv(script_env)
+        return str(script_env)
+
+    # 3. Search upward for the project root
+    current = Path.cwd()
+    for parent in [current] + list(current.parents):
+        if (parent / ".git").exists() or (parent / ".claude").exists():
+            root_env = parent / ".env"
+            if root_env.exists():
+                load_dotenv(root_env)
+                return str(root_env)
+
+    return None
+
+
+def check_environment() -> Tuple[List[str], List[str]]:
+    """
+    Check the runtime environment.
+
+    Returns:
+        (errors, warnings): a list of errors and a list of warnings
+    """
+    errors = []
+    warnings = []
+
+    # 1. Check the dashscope SDK
+    try:
+        import dashscope  # noqa: F401
+    except ImportError:
+        errors.append("dashscope SDK is not installed. Run: pip install dashscope")
+
+    # 2. Check python-dotenv
+    try:
+        from dotenv import load_dotenv  # noqa: F401
+    except ImportError:
+        warnings.append("python-dotenv is not installed, so configuration cannot be loaded from a .env file. Run: pip install python-dotenv")
+
+    # 3. Check the API key
+    api_key = os.getenv("DASHSCOPE_API_KEY")
+    if not api_key:
+        load_config_from_env()
+        api_key = os.getenv("DASHSCOPE_API_KEY")
+
+    if not api_key:
+        errors.append("DASHSCOPE_API_KEY is not configured. Set the environment variable or create a .env file")
+
+    # 4. Check ffprobe (optional, used to read the audio duration)
+    try:
+        subprocess.run(["ffprobe", "-version"], capture_output=True, check=True)
+    except (FileNotFoundError, subprocess.CalledProcessError):
+        warnings.append("ffprobe is not installed, so the audio duration cannot be read (optional)")
+
+    return errors, warnings
+
+
+class CosyVoiceSynthesizer:
+    """CosyVoice speech synthesizer"""
+
+    def __init__(
+        self,
+        api_key: Optional[str] = None,
+        model: str = DEFAULT_MODEL,
+        voice: str = DEFAULT_VOICE,
+        audio_format: Optional[str] = None,
+        speech_rate: float = 1.0,
+        pitch_rate: float = 1.0,
+        volume: int = 50
+    ):
+        """
+        Initialize the speech synthesizer.
+
+        Args:
+            api_key: API key; read from the environment or a config file if not provided
+            model: model name, default cosyvoice-v3-flash
+            voice: voice ID, default longanyang
+            audio_format: audio format, default MP3_22050HZ_MONO_256KBPS
+            speech_rate: speech rate in [0.5, 2.0], default 1.0
+            pitch_rate: pitch in [0.5, 2.0], default 1.0
+            volume: volume in [0, 100], default 50
+        """
+        import dashscope
+        from dashscope.audio.tts_v2 import AudioFormat
+
+        # If no API key was provided, try loading one from a config file
+        if not api_key:
+            env_file = load_config_from_env()
+            if env_file:
+                print(f"Loaded configuration from: {env_file}")
+
+        self.api_key = api_key or os.getenv("DASHSCOPE_API_KEY")
+        if not self.api_key:
+            raise ValueError(
+                "API Key is required. Please:\n"
+                " 1. Set DASHSCOPE_API_KEY environment variable, or\n"
+                " 2. Create a .env file with DASHSCOPE_API_KEY=your-key, or\n"
+                " 3. Pass api_key parameter directly"
+            )
+
+        dashscope.api_key = self.api_key
+        self.model = model
+        self.voice = voice
+        self.audio_format = audio_format or AudioFormat.MP3_22050HZ_MONO_256KBPS
+        self.speech_rate = speech_rate
+        self.pitch_rate = pitch_rate
+        self.volume = volume
+
+    def synthesize(self, text: str, output_file: str) -> Dict:
+        """
+        Synthesize a single piece of text.
+
+        Args:
+            text: text to synthesize
+            output_file: output file path
+
+        Returns:
+            A dict containing the file path, request_id, first-package delay, and related information
+        """
+        from dashscope.audio.tts_v2 import SpeechSynthesizer
+
+        # Create the output directory
+        output_dir = os.path.dirname(output_file)
+        if output_dir:
+            os.makedirs(output_dir, exist_ok=True)
+
+        # Instantiate SpeechSynthesizer
+        synthesizer = SpeechSynthesizer(
+            model=self.model,
+            voice=self.voice,
+            format=self.audio_format,
+            speech_rate=self.speech_rate,
+            pitch_rate=self.pitch_rate,
+            volume=self.volume
+        )
+
+        # Run the synthesis
+        audio = synthesizer.call(text)
+
+        # Save the audio file
+        with open(output_file, 'wb') as f:
+            f.write(audio)
+
+        # Read the audio duration
+        duration = self._get_audio_duration(output_file)
+
+        return {
+            "file_path": output_file,
+            "duration": duration,
+            "request_id": synthesizer.get_last_request_id(),
+            "first_package_delay": synthesizer.get_first_package_delay()
+        }
+
+    def batch_synthesize(
+        self,
+        segments: List[Dict],
+        output_dir: str,
+        rate_limit_delay: float = 0.35
+    ) -> List[Dict]:
+        """
+        Synthesize a batch of segments.
+
+        Args:
+            segments: list of text segments, each containing an id and text
+            output_dir: output directory
+            rate_limit_delay: delay between requests in seconds, default 0.35 (about 3 RPS)
+
+        Returns:
+            List of synthesis results
+        """
+        os.makedirs(output_dir, exist_ok=True)
+        results = []
+
+        for index, seg in enumerate(segments):
+            segment_id = seg.get("id", f"seg_{len(results):03d}")
+            text = seg["text"]
+            output_file = os.path.join(output_dir, f"{segment_id}.mp3")
+
+            try:
+                result = self.synthesize(text, output_file)
+                results.append({
+                    "segment_id": segment_id,
+                    "status": "success",
+                    **result
+                })
+                print(f"[OK] {segment_id} -> {output_file}")
+            except Exception as e:
+                results.append({
+                    "segment_id": segment_id,
+                    "status": "failed",
+                    "error": str(e)
+                })
+                print(f"[FAIL] {segment_id} - {e}")
+
+            # Rate limiting: pause after every segment except the last (checked by index, so a segment equal to the last one still pauses)
+            if index < len(segments) - 1:
+                time.sleep(rate_limit_delay)
+
+        return results
+
+    def _get_audio_duration(self, audio_file: str) -> float:
+        """
+        Get the audio duration in seconds.
+
+        Args:
+            audio_file: audio file path
+
+        Returns:
+            Audio duration in seconds, or 0.0 if it cannot be determined
+        """
+        try:
+            result = subprocess.run(
+                ["ffprobe", "-v", "error", "-show_entries", "format=duration",
+                 "-of", "csv=p=0", audio_file],
+                capture_output=True,
+                text=True,
+                check=True
+            )
+            return float(result.stdout.strip())
+        except Exception:
+            return 0.0
+
+
+def main():
+    """Command-line entry point"""
+    import argparse
+
+    parser = argparse.ArgumentParser(description="CosyVoice speech synthesis")
+    parser.add_argument("--text", type=str, help="Text to synthesize")
+    parser.add_argument("--voice", type=str, default=DEFAULT_VOICE, help="Voice ID")
+    parser.add_argument("--output", type=str, default="output.mp3", help="Output file")
+    parser.add_argument("--batch", type=str, help="JSON file for batch synthesis")
+    parser.add_argument("--output-dir", type=str, default="./voices", help="Output directory for batch synthesis")
+    parser.add_argument("--check-env", action="store_true", help="Check the runtime environment")
+
+    args = parser.parse_args()
+
+    # Environment check
+    if args.check_env:
+        errors, warnings = check_environment()
+        if errors:
+            print("Errors:")
+            for e in errors:
+                print(f" - {e}")
+        if warnings:
+            print("Warnings:")
+            for w in warnings:
+                print(f" - {w}")
+        if not errors and not warnings:
+            print("Environment check passed")
+        return
+
+    synthesizer = CosyVoiceSynthesizer(voice=args.voice)
+
+    if args.batch:
+        with open(args.batch, 'r', encoding='utf-8') as f:
+            segments = json.load(f)
+        results = synthesizer.batch_synthesize(segments, args.output_dir)
+        success = len([r for r in results if r['status'] == 'success'])
+        print(f"\nBatch synthesis finished: {success}/{len(results)}")
+    elif args.text:
+        result = synthesizer.synthesize(args.text, args.output)
+        print("\nSynthesis finished:")
+        print(f" File: {result['file_path']}")
+        print(f" Duration: {result['duration']:.2f} s")
+        print(f" Request ID: {result['request_id']}")
+    else:
+        parser.print_help()
+
+
+if __name__ == "__main__":
+    main()
+
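The batch loop in `batch_synthesize` above pauses between requests and records one result per segment, whether it succeeded or failed. A minimal sketch of that pattern on its own, with the network call replaced by an injected function, might look like the following. `run_throttled` and its `synthesize` and `sleep` parameters are hypothetical names introduced for illustration; they are not part of the package.

```python
import time
from typing import Callable, Dict, List


def run_throttled(
    segments: List[Dict],
    synthesize: Callable[[str], Dict],
    delay: float = 0.35,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict]:
    """Call `synthesize` once per segment, pausing `delay` seconds between calls."""
    results: List[Dict] = []
    for index, seg in enumerate(segments):
        # Fall back to a positional id when the segment does not carry one.
        segment_id = seg.get("id", f"seg_{index:03d}")
        try:
            info = synthesize(seg["text"])
            results.append({"segment_id": segment_id, "status": "success", **info})
        except Exception as exc:
            # One failed segment is recorded and the batch continues.
            results.append({"segment_id": segment_id, "status": "failed", "error": str(exc)})
        # Pause after every segment except the last one. The decision is made
        # by position, so two segments with identical content are both throttled.
        if index < len(segments) - 1:
            sleep(delay)
    return results
```

Passing `sleep` as a parameter is only a convenience for testing: a caller can substitute a function that records the requested pauses instead of actually waiting.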
@@ -0,0 +1,142 @@
+#!/usr/bin/env python3
+"""
+CosyVoice TTS CLI tool.
+Provides a simplified command-line interface for speech synthesis.
+"""
+
+import sys
+import os
+
+# Add the current directory to the Python path
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+
+from synthesize_voice import CosyVoiceSynthesizer, check_environment, DEFAULT_VOICE
+import json
+import argparse
+
+
+def main():
+    """Command-line entry point"""
+    parser = argparse.ArgumentParser(
+        description="CosyVoice TTS CLI - Alibaba Cloud speech synthesis command-line tool",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+  # Synthesize a single text
+  %(prog)s --text "How is the weather today" --voice longanyang --output output.mp3
+
+  # Batch synthesis
+  %(prog)s --batch segments.json --output-dir ./voices/
+
+  # Check the environment
+  %(prog)s --check-env
+        """
+    )
+
+    # Input options
+    input_group = parser.add_mutually_exclusive_group()
+    input_group.add_argument("--text", type=str, help="Text to synthesize")
+    input_group.add_argument("--batch", type=str, help="Path to the JSON file for batch synthesis")
+
+    # Output options
+    parser.add_argument("--output", type=str, help="Output file path (used for single synthesis)")
+    parser.add_argument("--output-dir", type=str, default="./voices",
+                        help="Output directory (used for batch synthesis, default: ./voices)")
+
+    # Voice options
+    parser.add_argument("--voice", type=str, default=DEFAULT_VOICE,
+                        help=f"Voice ID (default: {DEFAULT_VOICE})")
+
+    # Advanced parameters
+    parser.add_argument("--speech-rate", type=float, default=1.0,
+                        help="Speech rate [0.5, 2.0] (default: 1.0)")
+    parser.add_argument("--pitch-rate", type=float, default=1.0,
+                        help="Pitch [0.5, 2.0] (default: 1.0)")
+    parser.add_argument("--volume", type=int, default=50,
+                        help="Volume [0, 100] (default: 50)")
+
+    # API Key
+    parser.add_argument("--api-key", type=str,
+                        help="API Key (read from the DASHSCOPE_API_KEY environment variable by default)")
+
+    # Environment check
+    parser.add_argument("--check-env", action="store_true",
+                        help="Check the runtime environment")
+
+    args = parser.parse_args()
+
+    # Environment check mode
+    if args.check_env:
+        errors, warnings = check_environment()
+        if errors:
+            print("Errors:")
+            for e in errors:
+                print(f" - {e}")
+        if warnings:
+            print("Warnings:")
+            for w in warnings:
+                print(f" - {w}")
+        if not errors and not warnings:
+            print("Environment check passed")
+        sys.exit(1 if errors else 0)
+
+    # Validate arguments
+    if not args.text and not args.batch:
+        parser.error("--text or --batch is required")
+
+    if args.text and not args.output:
+        parser.error("--text requires --output")
+
+    try:
+        # Initialize the synthesizer
+        synthesizer = CosyVoiceSynthesizer(
+            api_key=args.api_key,
+            voice=args.voice,
+            speech_rate=args.speech_rate,
+            pitch_rate=args.pitch_rate,
+            volume=args.volume
+        )
+
+        if args.batch:
+            # Batch synthesis
+            print(f"Batch synthesis: {args.batch} -> {args.output_dir}")
+            print(f"Voice: {args.voice}")
+
+            with open(args.batch, 'r', encoding='utf-8') as f:
+                segments = json.load(f)
+
+            print(f"{len(segments)} segments in total\n")
+
+            results = synthesizer.batch_synthesize(segments, args.output_dir)
+
+            success_count = len([r for r in results if r["status"] == "success"])
+            print(f"\nDone: {success_count}/{len(results)}")
+
+            # Save the results
+            results_file = os.path.join(args.output_dir, "synthesis_results.json")
+            with open(results_file, 'w', encoding='utf-8') as f:
+                json.dump(results, f, ensure_ascii=False, indent=2)
+
+        else:
+            # Single synthesis
+            print(f"Synthesizing: \"{args.text[:30]}...\" -> {args.output}")
+
+            result = synthesizer.synthesize(args.text, args.output)
+
+            print("\nDone:")
+            print(f" File: {result['file_path']}")
+            print(f" Duration: {result['duration']:.2f} s")
+
+    except FileNotFoundError as e:
+        print(f"Error: file not found - {e}", file=sys.stderr)
+        sys.exit(1)
+    except ValueError as e:
+        print(f"Error: {e}", file=sys.stderr)
+        sys.exit(1)
+    except Exception as e:
+        print(f"Error: speech synthesis failed - {e}", file=sys.stderr)
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()
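The `--batch` option above loads a JSON file and hands the parsed value directly to `batch_synthesize`, which reads `seg["text"]` and an optional `seg["id"]` from each element. The diff does not include a sample `segments.json`, so the shape below is inferred from those two lookups only. As a rough sketch, a loader that checks this shape before any request is sent could look like this; `load_segments` is a hypothetical helper, not something the package provides.

```python
import json
from typing import Any, Dict, List


def load_segments(raw_json: str) -> List[Dict[str, Any]]:
    """Parse batch JSON and check the shape the batch loop relies on."""
    data: Any = json.loads(raw_json)
    if not isinstance(data, list):
        raise ValueError("batch file must contain a JSON array")
    for index, seg in enumerate(data):
        if not isinstance(seg, dict):
            raise ValueError(f"segment {index} is not a JSON object")
        text = seg.get("text")
        # "text" is required; "id" is optional and gets a positional default later.
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"segment {index} has no usable 'text'")
    return data


# Assumed example content for a segments.json file.
example = json.dumps(
    [
        {"id": "intro", "text": "Welcome to the demo."},
        {"text": "This segment has no id."},
    ]
)
segments = load_segments(example)
```

Validating up front means a malformed file fails once with a clear message, rather than surfacing as a per-segment `KeyError` partway through a batch.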
@@ -0,0 +1,36 @@
+---
+name: web-design-guidelines
+description: Review UI code for Web Interface Guidelines compliance. Use when asked to "review my UI", "check accessibility", "audit design", "review UX", or "check my site against best practices".
+argument-hint: <file-or-pattern>
+---
+
+# Web Interface Guidelines
+
+Review files for compliance with Web Interface Guidelines.
+
+## How It Works
+
+1. Fetch the latest guidelines from the source URL below
+2. Read the specified files (or prompt user for files/pattern)
+3. Check against all rules in the fetched guidelines
+4. Output findings in the terse `file:line` format
+
+## Guidelines Source
+
+Fetch fresh guidelines before each review:
+
+```
+https://raw.githubusercontent.com/vercel-labs/web-interface-guidelines/main/command.md
+```
+
+Use WebFetch to retrieve the latest rules. The fetched content contains all the rules and output format instructions.
+
+## Usage
+
+When a user provides a file or pattern argument:
+1. Fetch guidelines from the source URL above
+2. Read the specified files
+3. Apply all rules from the fetched guidelines
+4. Output findings using the format specified in the guidelines
+
+If no files are specified, ask the user which files to review.
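The skill above relies on the agent's WebFetch tool to retrieve the guidelines before each review. Outside an agent, the same first step could be approximated with the Python standard library, as in the sketch below. `fetch_guidelines` and its `opener` parameter are hypothetical names introduced here; only the URL comes from the skill file.

```python
from typing import Any, Callable
from urllib.request import Request, urlopen

GUIDELINES_URL = (
    "https://raw.githubusercontent.com/vercel-labs/"
    "web-interface-guidelines/main/command.md"
)


def fetch_guidelines(
    url: str = GUIDELINES_URL,
    timeout: float = 10.0,
    opener: Callable[..., Any] = urlopen,
) -> str:
    """Download the guidelines document and return it as text."""
    request = Request(url, headers={"User-Agent": "guidelines-fetch-sketch"})
    # `opener` defaults to urllib's urlopen; it is a parameter so that the
    # function can be exercised without network access.
    with opener(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Fetching on every review, rather than caching a copy, matches the skill's instruction to use fresh guidelines each time.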
@@ -0,0 +1,99 @@
+---
+name: youtube-downloader
+description: Download YouTube videos with customizable quality and format options. Use this skill when the user asks to download, save, or grab YouTube videos. Supports various quality settings (best, 1080p, 720p, 480p, 360p), multiple formats (mp4, webm, mkv), and audio-only downloads as MP3.
+---
+
+# YouTube Video Downloader
+
+Download YouTube videos with full control over quality and format settings.
+
+## Quick Start
+
+The simplest way to download a video:
+
+```bash
+python scripts/download_video.py "https://www.youtube.com/watch?v=VIDEO_ID"
+```
+
+This downloads the video in best available quality as MP4 to `/mnt/user-data/outputs/`.
+
+## Options
+
+### Quality Settings
+
+Use `-q` or `--quality` to specify video quality:
+
+- `best` (default): Highest quality available
+- `1080p`: Full HD
+- `720p`: HD
+- `480p`: Standard definition
+- `360p`: Lower quality
+- `worst`: Lowest quality available
+
+Example:
+```bash
+python scripts/download_video.py "URL" -q 720p
+```
+
+### Format Options
+
+Use `-f` or `--format` to specify output format (video downloads only):
+
+- `mp4` (default): Most compatible
+- `webm`: Modern format
+- `mkv`: Matroska container
+
+Example:
+```bash
+python scripts/download_video.py "URL" -f webm
+```
+
+### Audio Only
+
+Use `-a` or `--audio-only` to download only audio as MP3:
+
+```bash
+python scripts/download_video.py "URL" -a
+```
+
+### Custom Output Directory
+
+Use `-o` or `--output` to specify a different output directory:
+
+```bash
+python scripts/download_video.py "URL" -o /path/to/directory
+```
+
+## Complete Examples
+
+1. Download video in 1080p as MP4:
+```bash
+python scripts/download_video.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -q 1080p
+```
+
+2. Download audio only as MP3:
+```bash
+python scripts/download_video.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -a
+```
+
+3. Download in 720p as WebM to custom directory:
+```bash
+python scripts/download_video.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -q 720p -f webm -o /custom/path
+```
+
+## How It Works
+
+The skill uses `yt-dlp`, a robust YouTube downloader. The script:
+- Installs `yt-dlp` automatically if it is not present
+- Fetches video information before downloading
+- Selects the best available streams matching your criteria
+- Merges video and audio streams when needed
+- Supports a wide range of YouTube video formats
+
+## Important Notes
+
+- Downloads are saved to `/mnt/user-data/outputs/` by default
+- Video filename is automatically generated from the video title
+- The script handles installation of yt-dlp automatically
+- Only single videos are downloaded (playlists are skipped by default)
+- Higher quality videos may take longer to download and use more disk space
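The contents of `scripts/download_video.py` do not appear in this part of the diff, so how it turns `-q` and `-a` into a `yt-dlp` request is not shown. As an illustration only, the quality settings listed in the skill could be mapped to `yt-dlp` format selector strings along these lines. `build_format_selector` is a hypothetical helper, and the selectors it returns are one plausible choice rather than the script's actual ones; the choice of container (`mp4`, `webm`, `mkv`) is deliberately left out.

```python
def build_format_selector(quality: str = "best", audio_only: bool = False) -> str:
    """Return a yt-dlp format selector for a quality setting such as '720p'."""
    if audio_only:
        # Prefer an audio-only stream; fall back to the best combined file.
        return "bestaudio/best"
    if quality == "best":
        return "bestvideo+bestaudio/best"
    if quality == "worst":
        return "worstvideo+worstaudio/worst"
    # Remaining settings look like "<height>p", for example "1080p".
    if not (quality.endswith("p") and quality[:-1].isdigit()):
        raise ValueError(f"unsupported quality: {quality!r}")
    height = int(quality[:-1])
    # Cap the video height, merge with the best audio stream, and fall back
    # to a single pre-merged file under the same cap.
    return f"bestvideo[height<={height}]+bestaudio/best[height<={height}]"
```

Using `height<=` rather than an exact match means a request for `720p` still succeeds on a video whose largest stream is smaller, which fits the skill's description of selecting the best available streams matching the criteria.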