cerevox 3.0.0-beta.2 → 3.0.0-beta.20

@@ -25,13 +25,13 @@
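25
25
  * Grouped scene images `generate-image-serials` (generates all storyboard images in one pass)
26
26
  * Voiceover `generate-scene-tts` (⚠️ always use the scene's script or dialog from the storyboard, verbatim, as the voiceover input text)
27
27
  * Video `generate-video`
28
- * Background music `generate-music`
28
+ * Background music `generate-music-or-mv`
29
29
  **Mode 2: character three-view (turnaround) generation**
30
30
  * Character three-view `generate-character-image` (generate a three-view sheet for each main character)
31
31
  * Storyboard images `generate-image` (generate each storyboard image in turn, using the character three-view as reference)
32
32
  * Voiceover `generate-scene-tts` (⚠️ always use the scene's script or dialog from the storyboard, verbatim, as the voiceover input text)
33
33
  * Video `generate-video`
34
- * Background music `generate-music`
34
+ * Background music `generate-music-or-mv`
35
35
  10. Technical spec → call `get-schema(type: draft_content)` to fetch the draft_content spec → create draft_content.json according to the spec
36
36
  11. Render → `compile-and-run` outputs the finished video and downloads it locally automatically
37
37
  12. Close the project → `project-close`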
@@ -55,7 +55,7 @@
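55
55
  - **Voiceover sync:** make sure the input text for `generate-scene-tts` exactly matches the script or dialog in the storyboard
56
56
  - **Duration rules:** video durations must be whole seconds; voiceover, sound effects, and similar audio may be precise to the millisecond; when a scene has a voiceover, the video duration defaults to ceil(voiceover duration) seconds
57
57
  - **Content consistency:** the storyboard script and the voiceover content must match exactly; if the text is changed while generating the voiceover, update the storyboard promptly
58
- - **Audio-visual harmony:** background music from `generate-music` must match the story's emotional tone and pacing
58
+ - **Audio-visual harmony:** background music from `generate-music-or-mv` must match the story's emotional tone and pacing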
59
59
 
60
60
  ## Pro tips
61
61
 
@@ -19,7 +19,7 @@ description: 制作通用视频时,可以根据用户需求,按照这个流
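19
19
  1) Voiceover `generate-scene-tts`
20
20
  2) Image `generate-image` + video `generate-video`
21
21
  or `generate-video-by-ref` to generate video from reference images
22
- 3) Background music `generate-music`
22
+ 3) Background music `generate-music-or-mv`
23
23
  6. Technical spec → call `get-schema(type: draft_content)` to fetch the draft_content spec → create draft_content.json according to the spec
24
24
  7. Render → `compile-and-run` outputs the finished video and downloads it locally automatically
25
25
  8. Close the project → `project-close`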
@@ -56,7 +56,7 @@ description: 制作通用视频时,可以根据用户需求,按照这个流
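56
56
  * Option 2 (the current scene in the storyboard sets video_type: references):
57
57
  1) `generate-scene-tts` generates the voiceover (this fixes the duration: durationMs in the returned data is the voiceover duration)
58
58
  2) `generate-video-by-ref` generates video from reference images
59
- 5. `generate-music` generates background music
59
+ 5. `generate-music-or-mv` generates background music
60
60
  6. Create `draft_content.json`:
61
61
  - ⚠️ Must contain the complete VideoProject structure
62
62
  - Unless the user explicitly declines, `draft_content.json` must include subtitles: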
@@ -12,7 +12,7 @@ description: 创作专业音乐MV,基于 Zerocut 自主完成音乐MV成片的
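12
12
  1. Make sure the project is open → `project-open`
13
13
  2. Gather material (optional) → use search tools to collect relevant material
14
14
  3. Music creation → devise the musical mood from the theme → write the lyrics to lyrics.txt
15
- 4. Music generation → call `generate-music` with lyrics.txt → obtain the song and captions
15
+ 4. Music generation → call `generate-music-or-mv` with lyrics.txt → obtain the song and captions
16
16
  5. Analyze the song → create timeline_analysis.json to get the captions timeline
17
17
  6. Design storyboard scenes → `get-schema(type: storyboard)` to fetch the storyboard spec → create the initial storyboard.json
18
18
  7. Shape the main characters → `generate-character-image` → generate reference images (three-view) for the main characters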
@@ -70,11 +70,6 @@ projects/<id>/
70
70
  └─ draft_content.json # technical spec
71
71
  ```
72
72
 
73
- ### materials asset naming conventions
74
-
75
- - Scene assets: `sc01_bg.png`, `sc01_motion.mp4`, `sc01_vo.mp3`
76
- - Shared assets: `main_bgm_60s.wav`
77
-
78
73
  ### output specification
79
74
 
80
75
  - Aspect: decide portrait or landscape in advance; portrait is 720x1280, landscape is 1280x720; absent special requirements, prefer portrait (720x1280)
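Reviewer note: the duration rule in the skill docs above (whole-second video, millisecond-precision voiceover, default of ceil(voiceover duration)) can be sketched as follows. This is a minimal sketch under assumptions: `videoDurationSeconds` is a hypothetical helper that does not appear in the package, and `durationMs` is the field the docs say `generate-scene-tts` returns.

```javascript
// Hypothetical helper (not part of cerevox): derive the whole-second video
// duration from a voiceover length given in milliseconds.
function videoDurationSeconds(durationMs) {
  if (!Number.isFinite(durationMs) || durationMs <= 0) {
    throw new RangeError('durationMs must be a positive number');
  }
  // Round up so the clip is never shorter than its voiceover.
  return Math.ceil(durationMs / 1000);
}

console.log(videoDurationSeconds(4200)); // 5
console.log(videoDurationSeconds(3000)); // 3
```

Rounding up rather than to the nearest second keeps the tail of the voiceover from being cut off.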
@@ -1 +1 @@
1
- {"version":3,"file":"zerocut.d.ts","sourceRoot":"","sources":["../../../src/mcp/servers/zerocut.ts"],"names":[],"mappings":";AAm/KA,wBAAsB,GAAG,kBAKxB"}
1
+ {"version":3,"file":"zerocut.d.ts","sourceRoot":"","sources":["../../../src/mcp/servers/zerocut.ts"],"names":[],"mappings":";AA0pLA,wBAAsB,GAAG,kBAKxB"}
@@ -731,7 +731,7 @@ server.registerTool('generate-character-image', {
731
731
  description: 'Generate a turnaround image or portrait for any character.',
732
732
  inputSchema: {
733
733
  type: zod_1.z
734
- .enum(['banana', 'banana-pro', 'seedream'])
734
+ .enum(['banana', 'banana-pro', 'seedream', 'seedream-pro'])
735
735
  .optional()
736
736
  .default('banana'),
737
737
  name: zod_1.z.string().describe('The name of the character.'),
@@ -761,7 +761,9 @@ server.registerTool('generate-character-image', {
761
761
  .boolean()
762
762
  .default(true)
763
763
  .describe('是否生成三视图。true: 生成4096x3072的三视图,false: 生成2304x4096的竖版人物正视图'),
764
- saveToFileName: zod_1.z.string().describe('The filename to save.'),
764
+ saveToFileName: zod_1.z
765
+ .string()
766
+ .describe('The filename to save. 应该是png文件'),
765
767
  },
766
768
  }, async ({ type, name, gender, age, appearance, clothing, personality, detail_features, style, saveToFileName, referenceImage, referenceImagePrompt, isTurnaround, }) => {
767
769
  try {
@@ -937,7 +939,7 @@ server.registerTool('generate-line-sketch', {
937
939
  prompt: zod_1.z.string().describe('The prompt to generate line sketch.'),
938
940
  saveToFileName: zod_1.z
939
941
  .string()
940
- .describe('The filename to save the generated line sketch.'),
942
+ .describe('The filename to save the generated line sketch. 应该是png文件'),
941
943
  },
942
944
  }, async ({ prompt, saveToFileName }) => {
943
945
  try {
@@ -1064,7 +1066,7 @@ server.registerTool('generate-image', {
1064
1066
  description: `生成图片`,
1065
1067
  inputSchema: {
1066
1068
  type: zod_1.z
1067
- .enum(['banana', 'banana-pro', 'seedream'])
1069
+ .enum(['banana', 'banana-pro', 'seedream', 'seedream-pro'])
1068
1070
  .optional()
1069
1071
  .default('seedream'),
1070
1072
  prompt: zod_1.z
@@ -1124,17 +1126,14 @@ server.registerTool('generate-image', {
1124
1126
  ])
1125
1127
  .default('720x1280')
1126
1128
  .describe('The size of the image.'),
1127
- saveToFileName: zod_1.z.string().describe('The filename to save.'),
1129
+ saveToFileName: zod_1.z
1130
+ .string()
1131
+ .describe('The filename to save. 应该是png文件'),
1128
1132
  watermark: zod_1.z
1129
1133
  .boolean()
1130
1134
  .optional()
1131
1135
  .default(false)
1132
1136
  .describe('Whether to add watermark to the image.'),
1133
- optimizePrompt: zod_1.z
1134
- .boolean()
1135
- .optional()
1136
- .default(false)
1137
- .describe('Whether to optimize the prompt.'),
1138
1137
  referenceImages: zod_1.z
1139
1138
  .array(zod_1.z.object({
1140
1139
  image: zod_1.z.string().describe('Local image file path'),
@@ -1163,7 +1162,7 @@ server.registerTool('generate-image', {
1163
1162
  \`\`\`
1164
1163
  `),
1165
1164
  },
1166
- }, async ({ type = 'seedream', prompt, sceneIndex, storyBoardFile = 'storyboard.json', skipConsistencyCheck = false, size = '720x1280', saveToFileName, watermark, referenceImages, optimizePrompt, }) => {
1165
+ }, async ({ type = 'seedream', prompt, sceneIndex, storyBoardFile = 'storyboard.json', skipConsistencyCheck = false, size = '720x1280', saveToFileName, watermark, referenceImages, }) => {
1167
1166
  try {
1168
1167
  // 验证session状态
1169
1168
  const currentSession = await validateSession('generate-image');
@@ -1257,53 +1256,71 @@ server.registerTool('generate-image', {
1257
1256
  // 检查并替换英文单引号包裹的中文内容为中文双引号
1258
1257
  // 这样才能让 seedream 生成更好的中文文字
1259
1258
  let processedPrompt = prompt.replace(/'([^']*[\u4e00-\u9fff][^']*)'/g, '“$1”');
1260
- if (optimizePrompt) {
1261
- try {
1262
- const ai = currentSession.ai;
1263
- const promptOptimizer = await (0, promises_1.readFile)((0, node_path_1.resolve)(__dirname, './prompts/image-prompt-optimizer.md'), 'utf8');
1264
- const completion = await ai.getCompletions({
1265
- model: 'Doubao-Seed-1.6-flash',
1266
- messages: [
1267
- {
1268
- role: 'system',
1269
- content: promptOptimizer,
1259
+ try {
1260
+ const ai = currentSession.ai;
1261
+ const promptOptimizer = await (0, promises_1.readFile)((0, node_path_1.resolve)(__dirname, './prompts/image-prompt-optimizer.md'), 'utf8');
1262
+ const schema = {
1263
+ name: 'optimize_image_prompt',
1264
+ schema: {
1265
+ type: 'object',
1266
+ properties: {
1267
+ prompt_optimized: {
1268
+ type: 'string',
1269
+ description: '优化后的提示词',
1270
1270
  },
1271
- {
1272
- role: 'user',
1273
- content: processedPrompt.trim(),
1271
+ metaphor_modifiers: {
1272
+ type: 'array',
1273
+ description: '从 prompt_optimized 中抽取的所有比喻修饰词(字符串数组)',
1274
+ items: {
1275
+ type: 'string',
1276
+ description: '比喻性修饰词,例如 “如羽毛般轻盈”、“像晨雾一样柔和”',
1277
+ },
1274
1278
  },
1275
- ],
1276
- });
1277
- let optimizedPrompt = completion.choices[0]?.message?.content.trim();
1278
- if (optimizedPrompt) {
1279
- if (optimizedPrompt.startsWith('```json')) {
1280
- // 提取 JSON 代码块中的内容
1281
- const jsonMatch = optimizedPrompt.match(/```json\s*([\s\S]*?)\s*```/);
1282
- if (jsonMatch && jsonMatch[1]) {
1283
- optimizedPrompt = jsonMatch[1];
1284
- }
1285
- }
1286
- if (optimizedPrompt.startsWith('{')) {
1287
- try {
1288
- const { prompt_optimized, metaphor_modifiers } = JSON.parse(optimizedPrompt);
1289
- processedPrompt = `${prompt_optimized}`;
1290
- if (metaphor_modifiers?.length) {
1291
- processedPrompt += `\n\n注意:下面这些是形象比喻,并不是输出内容。\n${metaphor_modifiers}`;
1292
- }
1293
- }
1294
- catch (ex) {
1295
- processedPrompt = optimizedPrompt;
1296
- }
1297
- }
1298
- else {
1299
- processedPrompt = optimizedPrompt;
1279
+ },
1280
+ required: ['prompt_optimized', 'metaphor_modifiers'],
1281
+ },
1282
+ };
1283
+ const completion = await ai.getCompletions({
1284
+ model: 'Doubao-Seed-1.6',
1285
+ messages: [
1286
+ {
1287
+ role: 'system',
1288
+ content: promptOptimizer,
1289
+ },
1290
+ {
1291
+ role: 'user',
1292
+ content: `## 用户指令
1293
+
1294
+ ${processedPrompt.trim()}
1295
+
1296
+ ## 参考图
1297
+
1298
+ ${referenceImages?.map((ref, index) => `图${index + 1}:${ref.image}`).join('\n') || '无'}`,
1299
+ },
1300
+ ],
1301
+ response_format: {
1302
+ type: 'json_schema',
1303
+ json_schema: schema,
1304
+ },
1305
+ });
1306
+ const optimizedPrompt = completion.choices[0]?.message?.content.trim();
1307
+ if (optimizedPrompt) {
1308
+ try {
1309
+ const { prompt_optimized, metaphor_modifiers } = JSON.parse(optimizedPrompt);
1310
+ processedPrompt = `${prompt_optimized}`;
1311
+ if (metaphor_modifiers?.length) {
1312
+ processedPrompt += `\n\n注意:下面这些是形象比喻,并不是输出内容。\n${metaphor_modifiers}`;
1300
1313
  }
1301
1314
  }
1302
- }
1303
- catch (error) {
1304
- console.error('Failed to optimize prompt:', error);
1315
+ catch (ex) {
1316
+ console.error('Failed to parse optimized prompt:', ex);
1317
+ processedPrompt = optimizedPrompt;
1318
+ }
1305
1319
  }
1306
1320
  }
1321
+ catch (error) {
1322
+ console.error('Failed to optimize prompt:', error);
1323
+ }
1307
1324
  console.log(`Generating image with prompt: ${processedPrompt.substring(0, 100)}...`);
1308
1325
  // 处理参考图片
1309
1326
  let imageBase64Array;
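Reviewer note: in beta.20 the `generate-image` prompt optimizer always runs and requests structured output through `response_format`, so the fenced-JSON extraction from beta.2 is gone. The parse-and-fallback step that remains can be sketched in isolation as below. `applyOptimizedPrompt` is a hypothetical name used only for illustration; the field names and the appended note string are taken from the hunk above.

```javascript
// Hypothetical helper mirroring the parse-and-fallback logic in the hunk above.
function applyOptimizedPrompt(rawContent, originalPrompt) {
  const text = typeof rawContent === 'string' ? rawContent.trim() : '';
  if (!text) return originalPrompt; // empty completion: keep the original prompt
  try {
    const { prompt_optimized, metaphor_modifiers } = JSON.parse(text);
    let result = `${prompt_optimized}`;
    if (metaphor_modifiers?.length) {
      // The note says: "the following are figurative metaphors, not output content".
      // Interpolating an array joins its items with commas, as in the diff.
      result += `\n\n注意:下面这些是形象比喻,并不是输出内容。\n${metaphor_modifiers}`;
    }
    return result;
  } catch (ex) {
    return text; // not valid JSON: fall back to the raw model text
  }
}

console.log(applyOptimizedPrompt('{"prompt_optimized":"a cat","metaphor_modifiers":[]}', 'cat'));
```

As in the released code, a JSON object that lacks `prompt_optimized` would yield the literal string `undefined`; a stricter variant could check the field before using it.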
@@ -1432,12 +1449,14 @@ server.registerTool('edit-image', {
1432
1449
  inputSchema: {
1433
1450
  prompt: zod_1.z.string().describe('要编辑图片的中文提示词'),
1434
1451
  type: zod_1.z
1435
- .enum(['banana-pro', 'banana', 'seedream'])
1452
+ .enum(['banana-pro', 'banana', 'seedream', 'seedream-pro'])
1436
1453
  .optional()
1437
1454
  .default('seedream')
1438
1455
  .describe('The type of image model to use.'),
1439
1456
  sourceImageFileName: zod_1.z.string().describe('The source image file name.'),
1440
- saveToFileName: zod_1.z.string().describe('The filename to save.'),
1457
+ saveToFileName: zod_1.z
1458
+ .string()
1459
+ .describe('The filename to save. 应该是png文件'),
1441
1460
  size: zod_1.z
1442
1461
  .enum([
1443
1462
  '1024x1024',
@@ -1592,6 +1611,8 @@ server.registerTool('generate-video', {
1592
1611
  'hailuo-fast',
1593
1612
  'vidu',
1594
1613
  'vidu-pro',
1614
+ 'vidu-uc',
1615
+ 'vidu-uc-pro',
1595
1616
  'kling',
1596
1617
  'kling-pro',
1597
1618
  'pixv',
@@ -1601,7 +1622,9 @@ server.registerTool('generate-video', {
1601
1622
  ])
1602
1623
  .default('lite')
1603
1624
  .describe('除非用户明确提出使用其他模型,否则一律用lite模型;zero 系列模型适合创作8-23秒带故事情节的短片'),
1604
- saveToFileName: zod_1.z.string().describe('The filename to save.'),
1625
+ saveToFileName: zod_1.z
1626
+ .string()
1627
+ .describe('The filename to save. 应该是mp4文件'),
1605
1628
  start_frame: zod_1.z
1606
1629
  .string()
1607
1630
  .optional()
@@ -2008,7 +2031,7 @@ server.registerTool('generate-video', {
2008
2031
  console.warn('Failed to send progress update:', progressError);
2009
2032
  }
2010
2033
  },
2011
- waitForFinish: true,
2034
+ waitForFinish: type !== 'zero',
2012
2035
  });
2013
2036
  if (!res) {
2014
2037
  throw new Error('Failed to generate video: no response from AI service');
@@ -2056,7 +2079,7 @@ server.registerTool('generate-video', {
2056
2079
  type: 'text',
2057
2080
  text: JSON.stringify({
2058
2081
  success: true,
2059
- message: '该视频生成任务正在运行中,它是异步任务,且执行时间较长,你应立即调用工具 wait-for-task-finish 来等待任务结束,如该工具调用超时,你应立即再次重新调用直到任务结束。',
2082
+ message: '该视频生成任务正在运行中,它是异步任务,且执行时间较长,你应立即调用工具 wait-for-task-finish 来等待任务结束,如 wait-for-task-finish 工具调用超时,你应立即再次重新调用直到任务结束。',
2060
2083
  taskUrl: res.taskUrl,
2061
2084
  }),
2062
2085
  },
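Reviewer note: the response message above tells the caller to invoke `wait-for-task-finish` immediately and to call it again whenever that call times out. A client-side loop following that instruction might look like the sketch below. Everything here is assumed for illustration: `callTool` stands in for whatever MCP client call the host uses, and signalling a timeout through `err.code === 'TIMEOUT'` is an invented convention, not something the package defines.

```javascript
// Hypothetical client-side retry loop; not part of cerevox.
async function waitUntilFinished(callTool, taskUrl, saveToFileName, maxAttempts = 10) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Arguments match the wait-for-task-finish input schema shown in this diff.
      return await callTool('wait-for-task-finish', { taskUrl, saveToFileName });
    } catch (err) {
      // Retry only on the assumed timeout signal; surface every other error.
      if (err?.code !== 'TIMEOUT') throw err;
    }
  }
  throw new Error(`Task did not finish after ${maxAttempts} attempts`);
}
```

Capping the number of attempts is a design choice of this sketch; the message itself asks for retries until the task ends.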
@@ -2094,7 +2117,7 @@ server.registerTool('wait-for-task-finish', {
2094
2117
  .describe('The taskUrl of the video task to wait for.'),
2095
2118
  saveToFileName: zod_1.z
2096
2119
  .string()
2097
- .describe('The file name to save the video to.'),
2120
+ .describe('The file name to save the video to. 应该是mp4文件'),
2098
2121
  },
2099
2122
  }, async ({ taskUrl, saveToFileName }, context) => {
2100
2123
  try {
@@ -2169,7 +2192,7 @@ server.registerTool('generate-sound-effect', {
2169
2192
  .describe('The duration of the sound which will be generated in seconds. Must be at least 0.5 and at most 30. If set to None we will guess the optimal duration using the prompt. Defaults to None.'),
2170
2193
  saveToFileName: zod_1.z
2171
2194
  .string()
2172
- .describe('The filename to save. The mime type is audio/mpeg (mp3).'),
2195
+ .describe('The filename to save. 应该是mp3文件'),
2173
2196
  },
2174
2197
  }, async ({ prompt_in_english, loop, saveToFileName, duration_seconds }) => {
2175
2198
  try {
@@ -2216,14 +2239,32 @@ server.registerTool('generate-sound-effect', {
2216
2239
  return createErrorResponse(error, 'generate-sound-effect');
2217
2240
  }
2218
2241
  });
2219
- server.registerTool('generate-music', {
2220
- title: 'Generate Music',
2221
- description: 'Generate the music. Include background music or song.',
2242
+ server.registerTool('generate-music-or-mv', {
2243
+ title: '创作音乐(Music)或音乐视频(Music Video)',
2244
+ description: '生成音乐,包括MV(music video)、BGM 歌曲',
2222
2245
  inputSchema: {
2223
2246
  prompt: zod_1.z.string().describe('The prompt to generate.'),
2247
+ singerPhoto: zod_1.z
2248
+ .string()
2249
+ .optional()
2250
+ .describe('The singer photo to use. 只有type为music_video的时候才生效,也可以不传,模型会自动生成'),
2251
+ mvOrientation: zod_1.z
2252
+ .enum(['portrait', 'landscape'])
2253
+ .optional()
2254
+ .describe('The orientation of the music video. Defaults to portrait.')
2255
+ .default('portrait'),
2256
+ mvOriginalSong: zod_1.z
2257
+ .string()
2258
+ .optional()
2259
+ .describe('用于生成mv的音乐. 只有type为music_video的时候才生效,也可以不传,模型会自动创作'),
2260
+ mvGenSubtitles: zod_1.z
2261
+ .boolean()
2262
+ .optional()
2263
+ .default(false)
2264
+ .describe('是否生成mv的字幕. 默认为false,只有type为music_video的时候才生效'),
2224
2265
  type: zod_1.z
2225
- .enum(['bgm', 'song'])
2226
- .describe('The type of music. Defaults to background music.')
2266
+ .enum(['bgm', 'song', 'music_video'])
2267
+ .describe('The type of music. Defaults to BGM. ⚠️ 如果 type 是 music_video,会直接生成音频和视频,**不需要**额外专门生成歌曲')
2227
2268
  .default('bgm'),
2228
2269
  model: zod_1.z
2229
2270
  .enum(['doubao', 'minimax'])
@@ -2239,9 +2280,11 @@ server.registerTool('generate-music', {
2239
2280
  .boolean()
2240
2281
  .default(false)
2241
2282
  .describe('Whether to skip copyright check.'),
2242
- saveToFileName: zod_1.z.string().describe('The filename to save.'),
2283
+ saveToFileName: zod_1.z
2284
+ .string()
2285
+ .describe('The filename to save. 如果type是music video,应该是mp4文件,否则应该是mp3文件'),
2243
2286
  },
2244
- }, async ({ prompt, type, model, duration, skipCopyCheck, saveToFileName }, context) => {
2287
+ }, async ({ prompt, singerPhoto, mvOrientation, mvOriginalSong, mvGenSubtitles, type, model, duration, skipCopyCheck, saveToFileName, }, context) => {
2245
2288
  try {
2246
2289
  // 验证session状态
2247
2290
  const currentSession = await validateSession('generate-music');
@@ -2252,24 +2295,54 @@ server.registerTool('generate-music', {
2252
2295
  if (type === 'bgm' && duration > 120) {
2253
2296
  throw new Error('BGM duration must be at most 120 seconds.');
2254
2297
  }
2255
- const finalPrompt = `${prompt.trim()} ${type === 'bgm' ? `纯音乐无歌词,时长${duration}秒` : `时长${duration}秒,使用${model}模型`}`;
2256
- const res = await ai.generateMusic({
2257
- prompt: finalPrompt,
2258
- skipCopyCheck,
2259
- onProgress: async (metaData) => {
2260
- try {
2261
- await sendProgress(context, metaData.Result?.Progress ?? ++progress, metaData.Result?.Progress ? 100 : undefined, JSON.stringify(metaData));
2262
- }
2263
- catch (progressError) {
2264
- console.warn('Failed to send progress update:', progressError);
2265
- }
2266
- },
2267
- });
2298
+ let res;
2299
+ if (type === 'music_video') {
2300
+ const singer_photo = singerPhoto
2301
+ ? await getMaterialUri(currentSession, singerPhoto)
2302
+ : undefined;
2303
+ const original_song = mvOriginalSong
2304
+ ? await getMaterialUri(currentSession, mvOriginalSong)
2305
+ : undefined;
2306
+ res = await ai.generateZeroCutMusicVideo({
2307
+ // prompt: `${prompt.trim()} 音乐时长${duration}秒`,
2308
+ prompt,
2309
+ singerPhoto: singer_photo,
2310
+ orientation: mvOrientation,
2311
+ genSubtitles: mvGenSubtitles,
2312
+ originalSong: original_song,
2313
+ duration,
2314
+ resolution: '720p',
2315
+ onProgress: async (metaData) => {
2316
+ try {
2317
+ await sendProgress(context, metaData.Result?.Progress ?? ++progress, metaData.Result?.Progress ? 100 : undefined, JSON.stringify(metaData));
2318
+ }
2319
+ catch (progressError) {
2320
+ console.warn('Failed to send progress update:', progressError);
2321
+ }
2322
+ },
2323
+ waitForFinish: false,
2324
+ });
2325
+ }
2326
+ else {
2327
+ const finalPrompt = `${prompt.trim()} ${type === 'bgm' ? `纯音乐无歌词,时长${duration}秒` : `时长${duration}秒,使用${model}模型`}`;
2328
+ res = await ai.generateMusic({
2329
+ prompt: finalPrompt,
2330
+ skipCopyCheck,
2331
+ onProgress: async (metaData) => {
2332
+ try {
2333
+ await sendProgress(context, metaData.Result?.Progress ?? ++progress, metaData.Result?.Progress ? 100 : undefined, JSON.stringify(metaData));
2334
+ }
2335
+ catch (progressError) {
2336
+ console.warn('Failed to send progress update:', progressError);
2337
+ }
2338
+ },
2339
+ });
2340
+ }
2268
2341
  if (!res) {
2269
- throw new Error('Failed to generate BGM: no response from AI service');
2342
+ throw new Error('Failed to generate Music: no response from AI service');
2270
2343
  }
2271
2344
  if (res.url) {
2272
- console.log('BGM generated successfully, saving to materials...');
2345
+ console.log('Music generated successfully, saving to materials...');
2273
2346
  const uri = await saveMaterial(currentSession, res.url, validatedFileName);
2274
2347
  const { url, duration: bgmDuration, captions, ...opts } = res;
2275
2348
  // 保存captions到本地
@@ -2307,15 +2380,29 @@ server.registerTool('generate-music', {
2307
2380
  ],
2308
2381
  };
2309
2382
  }
2383
+ else if (res.taskUrl) {
2384
+ return {
2385
+ content: [
2386
+ {
2387
+ type: 'text',
2388
+ text: JSON.stringify({
2389
+ success: true,
2390
+ message: '该视频生成任务正在运行中,它是异步任务,且执行时间较长,你应立即调用工具 wait-for-task-finish 来等待任务结束,如 wait-for-task-finish 工具调用超时,你应立即再次重新调用直到任务结束。',
2391
+ taskUrl: res.taskUrl,
2392
+ }),
2393
+ },
2394
+ ],
2395
+ };
2396
+ }
2310
2397
  else {
2311
- console.warn('BGM generation completed but no URL returned');
2398
+ console.warn('Music generation completed but no URL returned');
2312
2399
  return {
2313
2400
  content: [
2314
2401
  {
2315
2402
  type: 'text',
2316
2403
  text: JSON.stringify({
2317
2404
  success: false,
2318
- error: 'No BGM URL returned from AI service',
2405
+ error: 'No Music URL returned from AI service',
2319
2406
  response: res,
2320
2407
  timestamp: new Date().toISOString(),
2321
2408
  }),
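Reviewer note: the renamed `generate-music-or-mv` tool now has two outcomes worth distinguishing. `bgm` and `song` return a finished `url`, while `music_video` is started with `waitForFinish: false` and returns a `taskUrl`. The `saveToFileName` description also asks for an mp4 name for music videos and an mp3 name otherwise. The sketch below restates both rules; `expectedExtension` and `classifyResult` are hypothetical helpers, not functions in the package.

```javascript
// Hypothetical helpers (not part of cerevox) restating the rules in this hunk.
function expectedExtension(type) {
  // Per the saveToFileName description: mp4 for music_video, mp3 otherwise.
  return type === 'music_video' ? '.mp4' : '.mp3';
}

function classifyResult(res) {
  if (!res) return 'no-response'; // the tool throws in this case
  if (res.url) return 'finished'; // material is saved immediately
  if (res.taskUrl) return 'pending'; // caller must use wait-for-task-finish
  return 'missing-url'; // reported with success: false
}

console.log(expectedExtension('music_video'), classifyResult({ taskUrl: 'task' }));
```

The order of the checks follows the `if / else if / else` chain in the diff, so a response carrying both fields is treated as finished.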
@@ -2325,7 +2412,7 @@ server.registerTool('generate-music', {
2325
2412
  }
2326
2413
  }
2327
2414
  catch (error) {
2328
- return createErrorResponse(error, 'generate-bgm');
2415
+ return createErrorResponse(error, 'generate-music');
2329
2416
  }
2330
2417
  });
2331
2418
  server.registerTool('generate-scene-tts', {
@@ -2352,7 +2439,9 @@ server.registerTool('generate-scene-tts', {
2352
2439
  .string()
2353
2440
  .optional()
2354
2441
  .describe('跳过校验的理由,如果skipConsistencyCheck设为true,必须要传这个参数'),
2355
- saveToFileName: zod_1.z.string().describe('The filename to save.'),
2442
+ saveToFileName: zod_1.z
2443
+ .string()
2444
+ .describe('The filename to save. 应该是mp3文件'),
2356
2445
  speed: zod_1.z
2357
2446
  .number()
2358
2447
  .min(0.5)
@@ -2923,7 +3012,7 @@ server.registerTool('voice-design', {
2923
3012
  previewText: zod_1.z.string().describe('The preview text to design the voice.'),
2924
3013
  saveToFileName: zod_1.z
2925
3014
  .string()
2926
- .describe('The file name to save the designed voice.'),
3015
+ .describe('The file name to save the designed voice. 应该是mp3文件'),
2927
3016
  },
2928
3017
  }, async ({ prompt, previewText, saveToFileName }) => {
2929
3018
  try {
@@ -3399,7 +3488,7 @@ server.registerTool('audio-video-sync', {
3399
3488
  .describe('The reference photo face for lip sync.'),
3400
3489
  saveToFileName: zod_1.z
3401
3490
  .string()
3402
- .describe('The filename to save the audio-video-synced video.'),
3491
+ .describe('The filename to save the audio-video-synced video. 应该是mp4文件'),
3403
3492
  },
3404
3493
  }, async ({ lipSync, lipSyncType, lipSyncPadAudio, videoFileName, audioFileName, audioInMs, refPhotoFileName, saveToFileName, }, context) => {
3405
3494
  try {
@@ -3577,6 +3666,7 @@ server.registerTool('generate-video-by-ref', {
3577
3666
  'veo3.1',
3578
3667
  'veo3.1-pro',
3579
3668
  'vidu',
3669
+ 'vidu-uc',
3580
3670
  'pixv',
3581
3671
  ])
3582
3672
  .default('lite')
@@ -3588,7 +3678,7 @@ server.registerTool('generate-video-by-ref', {
3588
3678
  .describe('Whether to mute the video (effective for sora2 and veo3.1).'),
3589
3679
  saveToFileName: zod_1.z
3590
3680
  .string()
3591
- .describe('The filename to save the generated video.'),
3681
+ .describe('The filename to save the generated video. 应该是mp4文件'),
3592
3682
  sceneIndex: zod_1.z
3593
3683
  .number()
3594
3684
  .min(1)
@@ -3794,7 +3884,54 @@ server.registerTool('generate-video-by-ref', {
3794
3884
  if (promptPrefix) {
3795
3885
  promptPrefix += '\n';
3796
3886
  }
3797
- const finalPrompt = `${promptPrefix}${prompt}`;
3887
+ let finalPrompt = `${promptPrefix}${prompt}`;
3888
+ if (type === 'pixv') {
3889
+ const completion = await ai.getCompletions({
3890
+ model: 'Doubao-Seed-1.6',
3891
+ messages: [
3892
+ {
3893
+ role: 'system',
3894
+ content: `你根据主体信息,优化用户指令,使描述中的内容正确引用主体名称。
3895
+
3896
+ 具体方式为,将用户指令中引用主体信息中主体名称的部分,用 “@主体名” 的形式替代,注意它和前后内容之间也需要用**空格**分隔。
3897
+
3898
+ ## 例子
3899
+
3900
+ ### 输入:
3901
+
3902
+ 主体信息
3903
+ [
3904
+ {"type": "subject", "fileName": "dog.png", "ref_name": "狗"},
3905
+ {"type": "background", "fileName": "room.png", "ref_name": "房间"}
3906
+ ]
3907
+
3908
+ 用户指令
3909
+ 一只狗在房间里玩耍
3910
+
3911
+ ### 输出:
3912
+ 一只 @狗 在 @房间 里玩耍
3913
+
3914
+ ---
3915
+
3916
+ ## 要求与约束
3917
+
3918
+ 只输出替换主体名后的用户指令,不要输出其他任何额外内容
3919
+ `,
3920
+ },
3921
+ {
3922
+ role: 'user',
3923
+ content: `## 主体信息
3924
+
3925
+ ${JSON.stringify(referenceImages)}
3926
+
3927
+ ## 用户指令
3928
+
3929
+ ${prompt.trim()}`,
3930
+ },
3931
+ ],
3932
+ });
3933
+ finalPrompt = completion.choices[0]?.message?.content.trim();
3934
+ }
3798
3935
  // 调用 referencesToVideo 函数
3799
3936
  const result = await currentSession.ai.referencesToVideo({
3800
3937
  prompt: finalPrompt,
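Reviewer note: for `type === 'pixv'`, the hunk above overwrites `finalPrompt` with the completion text. The expression `completion.choices[0]?.message?.content.trim()` evaluates to `undefined` when the response has no choices or no message, and throws if `content` itself is missing, so `referencesToVideo` could receive an empty prompt. A defensive variant might look like the sketch below; `completionTextOr` is a hypothetical helper and not part of the package.

```javascript
// Hypothetical helper (not part of cerevox): read the first completion's text,
// falling back to a known-good value when the response is empty or malformed.
function completionTextOr(completion, fallback) {
  const text = completion?.choices?.[0]?.message?.content;
  return typeof text === 'string' && text.trim() ? text.trim() : fallback;
}

const rewritten = completionTextOr({ choices: [] }, 'a dog playing in a room');
console.log(rewritten); // a dog playing in a room
```

Passing the previously built prompt as the fallback would also preserve any prefix already prepended to it.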
@@ -3823,7 +3960,7 @@ server.registerTool('generate-video-by-ref', {
3823
3960
  type: 'text',
3824
3961
  text: JSON.stringify({
3825
3962
  success: true,
3826
- message: '该视频生成任务正在运行中,它是异步任务,且执行时间较长,你应立即调用工具 wait-for-task-finish 来等待任务结束,如该工具调用超时,你应立即再次重新调用直到任务结束。',
3963
+ message: '该视频生成任务正在运行中,它是异步任务,且执行时间较长,你应立即调用工具 wait-for-task-finish 来等待任务结束,如 wait-for-task-finish 工具调用超时,你应立即再次重新调用直到任务结束。',
3827
3964
  taskUrl: result.taskUrl,
3828
3965
  }),
3829
3966
  },
@@ -3898,7 +4035,7 @@ server.registerTool('extend-video-duration', {
3898
4035
  .describe('Optional end frame image file name in materials directory to guide the video extension.'),
3899
4036
  saveToFileName: zod_1.z
3900
4037
  .string()
3901
- .describe('The filename to save the extended video.'),
4038
+ .describe('The filename to save the extended video. 应该是mp4文件'),
3902
4039
  },
3903
4040
  }, async ({ videoFileName, duration, resolution, prompt, type = 'turbo', endFrame, saveToFileName, }, context) => {
3904
4041
  try {
@@ -3994,11 +4131,11 @@ server.registerTool('use-template', {
3994
4131
  .describe('Optional materials to use in the template.'),
3995
4132
  saveToFileName: zod_1.z
3996
4133
  .string()
3997
- .describe('The filename to save the generated material.'),
4134
+ .describe('The filename to save the generated material. 根据用户具体需求,应该是mp4或png文件'),
3998
4135
  },
3999
4136
  }, async ({ user_request, saveToFileName, materials }) => {
4000
4137
  try {
4001
- const currentSession = await validateSession('generate-video-by-template');
4138
+ const currentSession = await validateSession('use-template');
4002
4139
  const ai = currentSession.ai;
4003
4140
  const data = await ai.listTemplates('all');
4004
4141
  const templates = data.map(item => ({
@@ -4013,7 +4150,7 @@ server.registerTool('use-template', {
4013
4150
  messages: [
4014
4151
  {
4015
4152
  role: 'system',
4016
- content: `你根据用户需求,从以下模板中选择一个匹配的模板,返回模板ID:\n\n${JSON.stringify(templates)}\n\n**约束**:只输出模板ID,不需要其他解释,如果没有匹配的模版,输出"无匹配模版"`,
4153
+ content: `你根据用户需求,分析需求与模板描述(description)和触发器(trigger)的匹配程度,从以下模板中选择一个匹配的模板,返回模板ID:\n\n${JSON.stringify(templates)}\n\n**约束**:只输出模板ID,不需要其他解释,如果没有匹配的模版,输出"无匹配模版"`,
4017
4154
  },
4018
4155
  {
4019
4156
  role: 'user',