bailian-cli 1.3.3 → 1.4.1
- package/README.md +7 -4
- package/README.zh.md +7 -4
- package/dist/bailian.mjs +198 -206
- package/package.json +2 -2
package/README.md
CHANGED
@@ -27,9 +27,12 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 - **Text chat** — Qwen3.7-max: major gains in agentic coding, frontend coding, and vibe coding
 - **Multimodal (Omni)** — Full omni-modal support across text + image + audio + video
 - **Image generation & editing** — Qwen-Image 2.0: pro text rendering, photorealism, strong semantic adherence, multi-image composition
-- **Video generation & editing** —
+- **Video generation & editing** — happyhorse-1.1 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
 - **Speech synthesis & recognition** — CosyVoice streaming TTS, voice cloning from 5–20s samples; FunAudio-ASR covers 30 languages including 7 Chinese dialects and 20+ Mandarin accents
 - **Image & video understanding** — Qwen-VL: long-form video analysis, chart/document parsing, visual reasoning, multilingual OCR
+
+> **Note:** The features below are currently available only to China site (aliyun.com) account holders and are not yet supported for international / global site accounts.
+
 - **Knowledge base & memory** — Multimodal RAG retrieval and cross-session memory for personalized, coherent dialogue
 - **App calls** — Invoke agents and workflows already published on Aliyun Model Studio
 - **MCP integration** — Orchestrate Bailian MCP servers: list services, inspect tools, and invoke any tool directly from the terminal
@@ -51,7 +54,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from a single natural-language sentence, with **zero manual editing**. This showcase demonstrates how an AI Agent can compose a multi-step creative pipeline by orchestrating three primitives:

 - **[Qwen Code](https://github.com/QwenLM/qwen-code)** — the agentic coding model that interprets the user's intent and drives the workflow
-- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.
+- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.1**, Aliyun Model Studio's text-/image-/reference-to-video generation model
 - **[spark-video Skill](https://github.com/JohnKeating1997/spark-video)** — handles scene decomposition, storyboarding, shot continuity, and final stitching

 ### The single prompt
@@ -64,7 +67,7 @@ A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from

 1. **Qwen Code** parses the request, plans the narrative beats, and decides which tools to call.
 2. The **spark-video Skill** breaks the story into shots, writes per-shot prompts, and enforces visual continuity (characters, lighting, palette, lens language).
-3. **`bl video generate`** dispatches each shot to **HappyHorse 1.
+3. **`bl video generate`** dispatches each shot to **HappyHorse 1.1** in parallel.
 4. The skill stitches all clips back together into a single 16:9 / ~2-min deliverable.

 No timeline scrubbing. No frame-by-frame editing. Just one sentence → one video.
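Note on step 3 in the hunk above: the README says `bl video generate` dispatches each shot in parallel, but the diff does not show how the agent does this. Below is a minimal Python sketch of the fan-out pattern only. It assumes nothing about the CLI's flags or output: `dispatch_shots` and `fake_generate` are hypothetical names, and `fake_generate` stands in for whatever actually wraps `bl video generate`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def dispatch_shots(
    prompts: Sequence[str],
    generate_shot: Callable[[str], str],
    max_workers: int = 4,
) -> list[str]:
    """Run generate_shot on every prompt concurrently, keeping shot order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, not completion order,
        # so the clips come back in storyboard order.
        return list(pool.map(generate_shot, prompts))


def fake_generate(prompt: str) -> str:
    # Hypothetical stand-in: a real version would invoke the CLI and
    # return the path or URL of the generated clip.
    return f"clip for: {prompt}"


clips = dispatch_shots(["shot 1", "shot 2", "shot 3"], fake_generate)
print(clips)
```

A thread pool fits this kind of work if each call mostly waits on a remote generation job. Because the results keep input order, the stitching step can consume them directly.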
@@ -172,7 +175,7 @@ export BAILIAN_WORKSPACE_ID=ws-...
 bl config show

 # Set defaults
-bl config set --key
+bl config set --key base_url --value https://dashscope-us.aliyuncs.com
 bl config set --key default_text_model --value qwen-turbo
 bl config set --key timeout --value 600

package/README.zh.md
CHANGED
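@@ -27,9 +27,12 @@ _Built for AI Agents; every command can be called as a structured tool._
 - **Text chat** — Qwen3.7-max: significantly stronger agentic coding, frontend programming, and vibe coding
 - **Omni-modal chat** — Full omni-modal support for text + image + audio + video
 - **Image generation & editing** — Qwen-Image 2.0: professional text rendering, realistic texture, strong semantic adherence, multi-image composition
-- **Video generation & editing** —
+- **Video generation & editing** — happyhorse-1.1 series, supporting text-to-video / image-to-video / reference-to-video (up to 9 reference images) / natural-language video editing
 - **Speech synthesis & recognition** — CosyVoice real-time streaming synthesis, voice cloning from a 5-20s sample; FunAudio-ASR covers 30 languages, including the seven major Chinese dialects and 20+ accented varieties of Mandarin
 - **Image & video understanding** — Qwen-VL: long-video analysis, complex chart and document recognition, visual reasoning, multilingual OCR
+
+> **Note:** The features below are currently available only to China site (aliyun.com) accounts; international / global site accounts are not yet supported.
+
 - **Knowledge base & memory** — Multimodal RAG retrieval + cross-session memory for a personalized, coherent conversation experience
 - **App calls** — Invoke agent and workflow apps published on the Aliyun Bailian platform
 - **MCP integration** — Orchestrate Bailian MCP services in one place: list services, inspect tools, and invoke any tool directly from the terminal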
@@ -51,7 +54,7 @@ _Built for AI Agents; every command can be called as a structured tool._
 A complete **2-minute, 16:9 cinematic short film**, generated end-to-end from a single natural-language sentence, with **zero manual editing throughout**. This example shows how an AI Agent can orchestrate three basic capabilities into a multi-step creative pipeline:

 - **[Qwen Code](https://github.com/QwenLM/qwen-code)** — the agentic coding model that parses user intent and drives the whole workflow
-- **[Aliyun Bailian CLI](https://github.com/modelstudioai/cli/)** — invokes **HappyHorse 1.
+- **[Aliyun Bailian CLI](https://github.com/modelstudioai/cli/)** — invokes **HappyHorse 1.1**, Bailian's text-/image-/reference-to-video model
 - **[spark-video Skill](https://github.com/JohnKeating1997/spark-video)** — handles scene breakdown, storyboard design, shot continuity, and final stitching

 ### The only prompt
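@@ -62,7 +65,7 @@ _Built for AI Agents; every command can be called as a structured tool._

 1. **Qwen Code** parses the request, plans the narrative pacing, and decides which tools to call.
 2. The **spark-video Skill** breaks the story into shots, writes a prompt for each shot, and ensures visual continuity (characters, lighting, color tone, lens language).
-3. **`bl video generate`** dispatches each shot in parallel to **HappyHorse 1.
+3. **`bl video generate`** dispatches each shot in parallel to **HappyHorse 1.1**.
 4. The Skill stitches all clips into the final 16:9 / roughly 2-minute film.

 No timeline dragging, no frame-by-frame editing. One sentence → one short film.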
@@ -167,7 +170,7 @@ export BAILIAN_WORKSPACE_ID=ws-...
 bl config show

 # Set defaults
-bl config set --key
+bl config set --key base_url --value https://dashscope-us.aliyuncs.com
 bl config set --key default_text_model --value qwen-turbo
 bl config set --key timeout --value 600
