@optima-chat/optima-agent 0.8.95 → 0.8.97
- package/.claude/skills/gen/SKILL.md +1 -1
- package/.claude/skills/video-gen/SKILL.md +449 -0
- package/.claude/skills/video-gen/templates/lifestyle-scene.md +18 -0
- package/.claude/skills/video-gen/templates/pdp-360-showcase.md +17 -0
- package/.claude/skills/video-gen/templates/pdp-feature-highlight.md +18 -0
- package/.claude/skills/video-gen/templates/tiktok-before-after.md +17 -0
- package/.claude/skills/video-gen/templates/tiktok-product-reveal.md +17 -0
- package/.claude/skills/video-gen/templates/tiktok-unboxing.md +18 -0
- package/package.json +1 -1
- package/.claude/skills/video-clone/SKILL.md +0 -199
- package/.claude/skills/video-clone/assets/phase-state-template.json +0 -11
- package/.claude/skills/video-clone/references/ffmpeg-commands.md +0 -42
- package/.claude/skills/video-clone/references/gate-enforcement.md +0 -144
- package/.claude/skills/video-clone/references/kling-api.md +0 -85
- package/.claude/skills/video-clone/references/prompt-template.md +0 -71
- package/.claude/skills/video-clone/references/url-parsing.md +0 -32
- package/.claude/skills/video-clone/references/workflow-system.md +0 -92
- package/.claude/skills/video-clone/scripts/_confirm.py +0 -96
- package/.claude/skills/video-clone/scripts/_confirm_test.py +0 -125
- package/.claude/skills/video-clone/scripts/_gate.py +0 -162
- package/.claude/skills/video-clone/scripts/_gate_e2e_test.py +0 -226
- package/.claude/skills/video-clone/scripts/_gate_test.py +0 -148
- package/.claude/skills/video-clone/scripts/_project.py +0 -56
- package/.claude/skills/video-clone/scripts/analyze_source.py +0 -113
- package/.claude/skills/video-clone/scripts/analyze_source_test.py +0 -52
- package/.claude/skills/video-clone/scripts/assemble.py +0 -106
- package/.claude/skills/video-clone/scripts/confirm.py +0 -12
- package/.claude/skills/video-clone/scripts/edit_first_frame.py +0 -66
- package/.claude/skills/video-clone/scripts/extract_frames.py +0 -108
- package/.claude/skills/video-clone/scripts/gen_video.py +0 -59
- package/.claude/skills/video-clone/scripts/init_project.py +0 -103
- package/.claude/skills/video-clone/scripts/init_project_test.py +0 -106
- package/.claude/skills/video-clone/scripts/kling_generate.py +0 -262
- package/.claude/skills/video-clone/scripts/kling_generate_test.py +0 -191
- package/.claude/skills/video-clone/scripts/preflight.py +0 -102
- package/.claude/skills/video-clone/scripts/preview.py +0 -208
- package/.claude/skills/video-clone/scripts/preview_test.py +0 -169
- package/.claude/skills/video-clone/scripts/save_workflow.py +0 -129
- package/.claude/skills/video-clone/scripts/save_workflow_test.py +0 -106
- package/.claude/skills/video-clone/scripts/status.py +0 -202
- package/.claude/skills/video-clone/scripts/status_test.py +0 -174
|
@@ -1,199 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: video-clone
|
|
3
|
-
description: "Use when user wants to clone/replicate a reference video with product swap, or generate a new video from product images + text descriptions. Trigger scenarios: video replication (复刻/翻拍/仿拍/做同款/视频换产品/product swap/爆款复刻/video replication), the user pastes a video link plus product images and asks for a matching video, or the user provides images/text descriptions and asks to generate a video directly (生成视频/图生视频/做一个视频). Pipeline is a script-based state machine under scripts/; the generation scripts block until the user confirms the preview bundle. Requires `gen` CLI, video generation API via PiAPI, ffmpeg."
|
|
4
|
-
---
|
|
5
|
-
|
|
6
|
-
# Video Clone
|
|
7
|
-
|
|
8
|
-
Replicate a source video or generate a brand-new one, via product swap or a text description.
|
|
9
|
-
|
|
10
|
-
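> **PR #65 upgrade note**: this version uses a single-gate model. The old three gates
|
|
11
|
-
> (`plan_confirmed` / `prompt_confirmed` / `frame_confirmed`) are deprecated.
|
|
12
|
-
> Finish old projects with the old scripts; all new projects follow this version's flow.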
|
|
13
|
-
|
|
14
|
-
## Prerequisites
|
|
15
|
-
|
|
16
|
-
- **`gen` CLI**: `gen image` (first-frame editing), `gen video` (I2V, used when no audio is needed)
|
|
17
|
-
- **Video generation API via PiAPI**: used when audio or lip sync is needed. Requires the `PIAPI_KEY` environment variable
|
|
18
|
-
- **ffmpeg / ffprobe**: scene detection, frame extraction, post-processing
|
|
19
|
-
- **Python ≥ 3.10**: runtime for all scripts
|
|
20
|
-
- **freeimage.host**: public URL hosting for the uploaded first frame (see [kling-api.md](references/kling-api.md))
|
|
21
|
-
|
|
22
|
-
First step after the skill is triggered: run `python scripts/preflight.py` to verify the dependencies above.
|
|
23
|
-
|
|
24
|
-
## Script pipeline at a glance
|
|
25
|
-
|
|
26
|
-
Single-gate model: all prep scripts run freely, and there is exactly one GATE, after the preview.
|
|
27
|
-
|
|
28
|
-
```
|
|
29
|
-
preflight.py
|
|
30
|
-
↓
|
|
31
|
-
init_project.py (Phase 0: create project directory + state file)
|
|
32
|
-
↓
|
|
33
|
-
analyze_source.py (Phase 1, replication only)
|
|
34
|
-
extract_frames.py (Phase 1, replication only)
|
|
35
|
-
↓
|
|
36
|
-
Claude analyzes the frame grid and writes prompt.md
|
|
37
|
-
edit_first_frame.py (Phase 2, replication only)
|
|
38
|
-
write cost.json (optional, cost estimate)
|
|
39
|
-
↓
|
|
40
|
-
preview.py (collects all prep artifacts → preview_vN.md)
|
|
41
|
-
↓
|
|
42
|
-
[show the preview to the user, wait for confirmation]
|
|
43
|
-
↓
|
|
44
|
-
confirm.py --quote "<user's exact words>" ← the only GATE
|
|
45
|
-
↓
|
|
46
|
-
kling_generate.py (Phase 3, with audio)
|
|
47
|
-
gen_video.py (Phase 3, no audio)
|
|
48
|
-
↓
|
|
49
|
-
assemble.py (Phase 4: normalize/concatenate)
|
|
50
|
-
↓
|
|
51
|
-
save_workflow.py (Phase 5: optional, capture the workflow when results are good)
|
|
52
|
-
```
|
|
53
|
-
|
|
54
|
-
`kling_generate.py`, `gen_video.py`, and `save_workflow.py` call
|
|
55
|
-
`require_gate("preview_confirmed")` at startup; without confirmation they exit 1 and print
|
|
56
|
-
`[HARD-GATE BLOCKED]`. See [gate-enforcement.md](references/gate-enforcement.md).
|
|
57
|
-
|
|
58
|
-
<HARD-GATE>
|
|
59
|
-
**The gate is enforced mechanically by the scripts, not by textual convention.** Any way of bypassing the gate leaves evidence:
|
|
60
|
-
1. Hand-editing `.state/phase.json`: the `history` field will show a gap
|
|
61
|
-
2. Calling `confirm.py` with a fabricated `--quote`: the `user_quote` field keeps a record that can be traced afterwards
|
|
62
|
-
|
|
63
|
-
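**The USER GATE cannot be waived unilaterally by the user.** Even if the user says "don't ask questions / I trust you / use your own judgment",
|
|
64
|
-
the preview bundle must still be shown and confirmation awaited. The user may confirm by saying "OK, go ahead", but cannot
|
|
65
|
-
waive the whole GATE mechanism in advance.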
|
|
66
|
-
|
|
67
|
-
Every Phase artifact must use a versioned name (the scripts handle this automatically via `_project.next_version()`),
|
|
68
|
-
and must never overwrite an existing file.
|
|
69
|
-
</HARD-GATE>
|
|
70
|
-
|
|
71
|
-
## Rationalization Counter
|
|
72
|
-
|
|
73
|
-
Below are the most common rationalizations and their rebuttals. When you catch yourself thinking any of these, you must STOP.
|
|
74
|
-
|
|
75
|
-
| What Claude is thinking | Reality |
|
|
76
|
-
|---|---|
|
|
77
|
-
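| "I'm only doing Phase 0 analysis; downloading the video doesn't count as execution" | Downloading = a tool call. Analysis is based on the metadata the user provided, not on downloaded files |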
|
|
78
|
-
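| "The user said 'urgent' / 'start right now', so I can skip confirmation" | Urgency is no reason to skip the GATE. The more urgent the task, the more a quick alignment matters to avoid rework |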
|
|
79
|
-
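| "The user said 'don't ask questions, use your own judgment'" | That covers asking questions; it does **not** cover showing the preview. After deciding for yourself you still produce the preview |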
|
|
80
|
-
| "The information is already complete; the preview is just a formality" | The preview is the user's last chance to catch hidden errors |
|
|
81
|
-
| "I'll run kling_generate directly, just as a test" | The script exits 1, so the test shows nothing |
|
|
82
|
-
| "I'll set the gate myself with `python -c 'import _gate; _gate.set_gate(...)'`" | The `history` field exposes that the normal `confirm.py` path was not taken |
|
|
83
|
-
| "Prep scripts have no gate, so I can run them however I like" | Prep scripts indeed have no gate, but the **preview → confirm** step is still mandatory |
|
|
84
|
-
|
|
85
|
-
## Phase 0: Understand the request + match a workflow
|
|
86
|
-
|
|
87
|
-
**Before doing anything, work out what the user actually wants, then check whether a proven approach already exists.**
|
|
88
|
-
|
|
89
|
-
Quickly determine the task type: source video + product images → video replication; no source video → pure video generation.
|
|
90
|
-
|
|
91
|
-
**Do lightweight metadata identification only; make no tool calls.** Do not download the video, call ffprobe, or extract frames.
|
|
92
|
-
|
|
93
|
-
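Ask questions as needed (not all at once): what the product is, the swap target, audio requirements (with audio ~$1.50/10s, without audio ~$0.02/10s; Optima credits are deducted by server-side middleware), and the expected duration.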
|
|
94
|
-
|
|
95
|
-
First check `gen-output/video-clone/workflows/README.md`: full match → reuse; partial match → adapt;
|
|
96
|
-
no match → use the generic pipeline. See [workflow-system.md](references/workflow-system.md).
|
|
97
|
-
|
|
98
|
-
## Phase 1-2: Prep (runs autonomously, no gate)
|
|
99
|
-
|
|
100
|
-
None of the prep-phase scripts are gated; they can run autonomously:
|
|
101
|
-
|
|
102
|
-
```bash
|
|
103
|
-
python scripts/init_project.py --name <slug> --task-type video_clone
|
|
104
|
-
python scripts/analyze_source.py --project <slug> --source <path>
|
|
105
|
-
python scripts/extract_frames.py --project <slug> --source <path>
|
|
106
|
-
# Claude analyzes the frame grid and writes prompt.md
|
|
107
|
-
python scripts/edit_first_frame.py --project <slug> --image <frame> --product <img>
|
|
108
|
-
# write cost.json (optional)
|
|
109
|
-
python scripts/preview.py --project <slug>
|
|
110
|
-
```
|
|
111
|
-
|
|
112
|
-
`preview.py` checks that all prep artifacts are present and writes `preview_vN.md` (six sections).
|
|
113
|
-
|
|
114
|
-
**[USER GATE] Show the preview to the user, wait for confirmation, then run:**
|
|
115
|
-
|
|
116
|
-
```bash
|
|
117
|
-
python scripts/confirm.py --project <slug> --quote "<user's exact words>"
|
|
118
|
-
```
|
|
119
|
-
|
|
120
|
-
## Phase 3-5: Generation + post-processing + workflow capture
|
|
121
|
-
|
|
122
|
-
```bash
|
|
123
|
-
# Phase 3: video generation (requires preview_confirmed)
|
|
124
|
-
python scripts/kling_generate.py --project <slug> --frame frames/frame_vN.png
|
|
125
|
-
# or the no-audio variant:
|
|
126
|
-
python scripts/gen_video.py --project <slug> --frame frames/frame_vN.png
|
|
127
|
-
|
|
128
|
-
# Phase 4: post-processing
|
|
129
|
-
python scripts/assemble.py --project <slug> --single videos/video_vN.mp4
|
|
130
|
-
|
|
131
|
-
# Phase 5: workflow capture (optional)
|
|
132
|
-
python scripts/save_workflow.py --project <slug> \
|
|
133
|
-
--name <workflow-slug> --scene "<applicable scenario>" \
|
|
134
|
-
--rating <1-5> --strategy "<key strategy>"
|
|
135
|
-
```
|
|
136
|
-
|
|
137
|
-
## Iron rules
|
|
138
|
-
|
|
139
|
-
1. Understand before acting: finish all prep scripts before producing the preview
|
|
140
|
-
2. Check the workflows before reinventing the wheel: do not waste existing experience
|
|
141
|
-
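3. First-frame quality decides everything: automatically pick the frame where the product is sharpest and the hands look most natural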
|
|
142
|
-
4. Prompt quality = video quality: show it to the user in the preview; the user may edit it
|
|
143
|
-
5. Never overwrite files: the scripts increment v{N} automatically via next_version()
|
|
144
|
-
6. Do not edit .state/phase.json yourself: taking the normal confirm.py path is the better option
|
|
145
|
-
|
|
146
|
-
## Project directory
|
|
147
|
-
|
|
148
|
-
```
|
|
149
|
-
gen-output/video-clone/
|
|
150
|
-
├── workflows/ ← workflow knowledge base (indexed by README.md)
|
|
151
|
-
└── {project}/
|
|
152
|
-
├── .state/phase.json ← gate state machine
|
|
153
|
-
├── source/ ← original source material + analysis_vN.json
|
|
154
|
-
├── frames/ ← extract_vN/ + frame_vN.png
|
|
155
|
-
├── videos/ ← video_vN.mp4 + final_vN.mp4
|
|
156
|
-
├── prompt.md cost.json preview_vN.md log.md
|
|
157
|
-
```
|
|
158
|
-
|
|
159
|
-
New task → new directory; changed prompt or new seed → same directory, v{N}+1.
|
|
160
|
-
Resuming across sessions: first run `python scripts/status.py --project <slug>` to see the current state and the next step.
|
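The versioned naming rule above (never overwrite, always move to the next v{N}) can be sketched as follows. This is a minimal sketch and not the real `_project.next_version()`: the signature, which takes a list of existing file names plus a hypothetical `stem` and `suffix`, is an assumption made for illustration.

```python
import re
from typing import Iterable


def next_version(existing_names: Iterable[str], stem: str, suffix: str) -> str:
    """Return the next free '<stem>_vN<suffix>' name (hypothetical signature)."""
    pattern = re.compile(rf"^{re.escape(stem)}_v(\d+){re.escape(suffix)}$")
    # Collect the version numbers already used for this stem and suffix.
    versions = [int(m.group(1)) for name in existing_names if (m := pattern.match(name))]
    # Start at v1 when nothing exists yet; otherwise go one past the highest.
    return f"{stem}_v{max(versions, default=0) + 1}{suffix}"
```

For example, calling it with the names in a project's `videos/` directory and `stem="video"`, `suffix=".mp4"` yields the name for the next generated clip without touching earlier ones.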
|
161
|
-
|
|
162
|
-
## Division of labor among tools
|
|
163
|
-
|
|
164
|
-
| Tool | Script | Responsibility |
|
|
165
|
-
|---|---|---|
|
|
166
|
-
| Claude | - | Understanding the request + frame-by-frame analysis → Chinese 6-section prompt |
|
|
167
|
-
| gen image | `edit_first_frame.py` | First-frame editing (dual-image mode) |
|
|
168
|
-
| gen video | `gen_video.py` | I2V (no audio needed) |
|
|
169
|
-
| Video generation API | `kling_generate.py` | Audio / lip sync. See [kling-api.md](references/kling-api.md) |
|
|
170
|
-
| ffmpeg | `analyze_source.py` / `extract_frames.py` / `assemble.py` | Frame extraction, scene detection, post-processing |
|
|
171
|
-
|
|
172
|
-
**Do not mention specific model names in replies** (e.g. "Kling 3.0", "Wan 2.6").
|
|
173
|
-
Say only "Generating video..." or "Generation task submitted".
|
|
174
|
-
|
|
175
|
-
## First-frame editing strategy
|
|
176
|
-
|
|
177
|
-
**Single segment**: `edit_first_frame.py` uses two images (`-i <source frame> -i <product image>`), keeping the scene and replacing only the product.
|
|
178
|
-
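**Multi-segment**: each segment uses only `-i <product image>`; the prompt describes the full scene, and the product description is repeated across segments.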
|
|
179
|
-
|
|
180
|
-
## Anti-Pattern
|
|
181
|
-
|
|
182
|
-
| You are thinking... | What to do instead |
|
|
183
|
-
|---|---|
|
|
184
|
-
| The user provided material, so just start | **preflight first, then finish prep, produce the preview, and run confirm.py** |
|
|
185
|
-
| Going generic without checking the workflow library | Check `workflows/README.md` first |
|
|
186
|
-
| Just take t=1s as the first frame | Automatically pick the frame where the product is sharpest |
|
|
187
|
-
| Not showing the prompt to the user | **It must be shown in the preview; the user may edit it** |
|
|
188
|
-
| Skipping the preview and confirming directly | preview.py verifies that all prep artifacts are present |
|
|
189
|
-
| Telling the user the specific model name | Say only "Generating video..." and do not reveal the underlying model |
|
|
190
|
-
| Getting a good result without saving the workflow | **Good result + new scenario = must be captured** |
|
|
191
|
-
| Overwriting previously generated files | The scripts increment v{N} automatically |
|
|
192
|
-
| Treating `--quote "ok"` as a real confirmation | A fabricated quote leaves a trace in history and can be traced afterwards |
|
|
193
|
-
|
|
194
|
-
## Known limitations
|
|
195
|
-
|
|
196
|
-
- A single segment is at most 10s; multiple segments must be concatenated
|
|
197
|
-
- Motion is invented by the model; the source video's action sequence is not reproduced
|
|
198
|
-
- The PiAPI CDN is unreliable; `kling_generate.py` has 3 built-in retries
|
|
199
|
-
- Platform parsing (TikTok/Douyin/Instagram/Xiaohongshu) still requires running the `scout` command manually and is not part of the script pipeline (see [url-parsing.md](references/url-parsing.md))
|
|
@@ -1,42 +0,0 @@
|
|
|
1
|
-
# FFmpeg — where the commands live
|
|
2
|
-
|
|
3
|
-
The ffmpeg commands you'd have inlined here are owned by scripts now. Read
|
|
4
|
-
the script source if you need to see the exact flags.
|
|
5
|
-
|
|
6
|
-
| What | Script |
|
|
7
|
-
|---|---|
|
|
8
|
-
| ffprobe + scene detection (`select='gt(scene,0.3)'`) | `scripts/analyze_source.py` |
|
|
9
|
-
| equidistant frame extraction + tile grid | `scripts/extract_frames.py` |
|
|
10
|
-
| single-segment normalize (`-r 30 -crf 18 -c:a aac -b:a 192k`) | `scripts/assemble.py --single` |
|
|
11
|
-
| multi-segment normalize + concat (scale=720:1280 + concat demuxer) | `scripts/assemble.py --multi` |
|
|
12
|
-
|
|
13
|
-
## Single vs multi-segment heuristic
|
|
14
|
-
|
|
15
|
-
`analyze_source.py` classifies automatically:
|
|
16
|
-
|
|
17
|
-
- scene cuts ≤1 (after filtering out cuts <0.5s apart) → `classification: single`
|
|
18
|
-
- scene cuts ≥2 → `classification: multi`
|
|
19
|
-
|
|
20
|
-
If the auto-classification is wrong, override the plan manually — don't
|
|
21
|
-
edit the script.
|
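The single vs multi heuristic above can be sketched like this. It is a minimal sketch, not the code in `analyze_source.py`: the function names are hypothetical, and the reading of "filtering out cuts <0.5s apart" as "drop a cut that follows the previously kept cut by less than 0.5 s" is an assumption.

```python
from typing import List


def filter_close_cuts(cut_times: List[float], min_gap: float = 0.5) -> List[float]:
    """Keep a cut only if it is at least min_gap seconds after the last kept cut."""
    kept: List[float] = []
    for t in sorted(cut_times):
        if not kept or t - kept[-1] >= min_gap:
            kept.append(t)
    return kept


def classify(cut_times: List[float], min_gap: float = 0.5) -> str:
    """Return 'single' for at most one remaining scene cut, otherwise 'multi'."""
    return "single" if len(filter_close_cuts(cut_times, min_gap)) <= 1 else "multi"
```

Under this reading, two detections 0.2 s apart count as one cut, so a clip with a single flickering transition still classifies as `single`.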
|
22
|
-
|
|
23
|
-
## Things the scripts deliberately don't do
|
|
24
|
-
|
|
25
|
-
- **Audio replacement from source** — Kling 3.0 handles lip sync itself;
|
|
26
|
-
don't splice the source audio back in or you'll get misaligned mouths.
|
|
27
|
-
- **Resolution auto-detection for multi-segment** — `assemble.py --multi`
|
|
28
|
-
hardcodes `scale=720:1280`. If your source is different, pass pre-scaled
|
|
29
|
-
clips or extend the script.
|
|
30
|
-
- **Re-encoding `final_v{N}.mp4` after assembly** — if final looks wrong,
|
|
31
|
-
regenerate the constituent videos, not the final.
|
|
32
|
-
|
|
33
|
-
## Raw command reference (only when debugging outside the scripts)
|
|
34
|
-
|
|
35
|
-
```bash
|
|
36
|
-
# probe metadata
|
|
37
|
-
ffprobe -v quiet -print_format json -show_format -show_streams video.mp4
|
|
38
|
-
|
|
39
|
-
# concat list format (scripts generate this automatically)
|
|
40
|
-
file 'clip1_norm.mp4'
|
|
41
|
-
file 'clip2_norm.mp4'
|
|
42
|
-
```
|
|
@@ -1,144 +0,0 @@
|
|
|
1
|
-
# Gate Enforcement
|
|
2
|
-
|
|
3
|
-
This skill enforces its HARD-GATE mechanically, not textually. Every
|
|
4
|
-
executor script that costs money or generates output calls `require_gate()`
|
|
5
|
-
at startup and exits with code 1 if the gate isn't set. You cannot
|
|
6
|
-
rationalize past a CLI that won't produce output.
|
|
7
|
-
|
|
8
|
-
## State location
|
|
9
|
-
|
|
10
|
-
Each project has its own state file:
|
|
11
|
-
|
|
12
|
-
```
|
|
13
|
-
gen-output/video-clone/<project>/.state/phase.json
|
|
14
|
-
```
|
|
15
|
-
|
|
16
|
-
## Schema (single-gate model — PR #65)
|
|
17
|
-
|
|
18
|
-
```json
|
|
19
|
-
{
|
|
20
|
-
"schema_version": 1,
|
|
21
|
-
"project": "handheld-phone-swap",
|
|
22
|
-
"task_type": "video_clone",
|
|
23
|
-
"created_at": "2026-04-11T16:55:00Z",
|
|
24
|
-
"current_phase": 0,
|
|
25
|
-
"gates": {
|
|
26
|
-
"preview_confirmed": {"status": false, "confirmed_at": null, "user_quote": null}
|
|
27
|
-
},
|
|
28
|
-
"history": []
|
|
29
|
-
}
|
|
30
|
-
```
|
|
31
|
-
|
|
32
|
-
There is **one gate**: `preview_confirmed`. It is set by `confirm.py`
|
|
33
|
-
after the user reviews the complete prep preview (analysis + grid +
|
|
34
|
-
prompt + edited frame). The prep scripts (analyze_source, extract_frames,
|
|
35
|
-
edit_first_frame) run freely before confirmation — only the generation
|
|
36
|
-
scripts are gated.
|
|
37
|
-
|
|
38
|
-
Gates are never un-set. Each set operation appends an entry to `history`
|
|
39
|
-
with timestamp + user quote + caller script name, giving you an audit
|
|
40
|
-
trail to answer "did the user actually confirm this?"
|
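The gate check described above can be sketched as follows. This is a minimal sketch that assumes only the `phase.json` schema shown earlier; the names `gate_is_set` and `check_gate` and the exact message wording are illustrative and are not taken from the real `_gate.py`.

```python
import json
from pathlib import Path


def gate_is_set(state: dict, gate: str = "preview_confirmed") -> bool:
    """Return True only if the gate entry exists and its status is exactly True."""
    gates = state.get("gates")
    if not isinstance(gates, dict):
        return False
    entry = gates.get(gate)
    return isinstance(entry, dict) and entry.get("status") is True


def check_gate(state_path: str, gate: str = "preview_confirmed") -> int:
    """Return a process exit code: 0 if the gate is set, 1 otherwise."""
    path = Path(state_path)
    try:
        state = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or non-JSON state file (e.g. a deleted .state/).
        print(f"[HARD-GATE BLOCKED] cannot read {path}")
        return 1
    if not isinstance(state, dict) or not gate_is_set(state, gate):
        print(f"[HARD-GATE BLOCKED] {gate} is not set")
        return 1
    return 0
```

A gated script would call something like `sys.exit(check_gate(...))` before doing any paid work, so an old three-gate state file, which has no `preview_confirmed` entry, fails the same way as an unconfirmed one.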
|
41
|
-
|
|
42
|
-
## Which script needs which gate
|
|
43
|
-
|
|
44
|
-
| Script | Required gate |
|
|
45
|
-
|---|---|
|
|
46
|
-
| `analyze_source.py` | (none — prep runs freely) |
|
|
47
|
-
| `extract_frames.py` | (none — prep runs freely) |
|
|
48
|
-
| `edit_first_frame.py` | (none — prep runs freely) |
|
|
49
|
-
| `preview.py` | (none — collects artifacts, no cost) |
|
|
50
|
-
| `kling_generate.py` | **preview_confirmed** |
|
|
51
|
-
| `gen_video.py` | **preview_confirmed** |
|
|
52
|
-
| `save_workflow.py` | **preview_confirmed** |
|
|
53
|
-
| `assemble.py` | (none — post-processes already-generated videos) |
|
|
54
|
-
| `init_project.py` | (none — must run before any gate exists) |
|
|
55
|
-
| `confirm.py` | (none — sets the gate) |
|
|
56
|
-
| `preflight.py` | (none — environment check) |
|
|
57
|
-
| `status.py` | (none — read-only) |
|
|
58
|
-
|
|
59
|
-
## How to set the gate
|
|
60
|
-
|
|
61
|
-
```bash
|
|
62
|
-
python scripts/confirm.py --project <name> --quote "<user's actual words>"
|
|
63
|
-
```
|
|
64
|
-
|
|
65
|
-
- **`--quote` is required** and must be non-empty.
|
|
66
|
-
- **Quote heuristic**: if the quote contains negation markers (`不`, `改`,
|
|
67
|
-
`no`, `not`, `change`, `modify`, `wrong`, …), the script refuses to set
|
|
68
|
-
the gate unless you also pass `--force`. This prevents Claude from
|
|
69
|
-
using a correction ("不需要音频") as a confirmation.
|
|
70
|
-
- **Use `--force` sparingly**: only when the user said something like
|
|
71
|
-
"no audio, otherwise good" — a genuine confirmation that contains a
|
|
72
|
-
negation word.
|
|
73
|
-
- The gate records `caller` (the script name) in history. Fabricated
|
|
74
|
-
quotes are detectable in post-mortem review.
|
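The quote heuristic above might look roughly like this. It is a sketch under stated assumptions: the marker list contains only the markers named in the text (the real list is longer and not shown), and matching English markers as whole words rather than substrings is a choice made here, not something the source confirms.

```python
import re

# Markers named in the text; the real script's full list is not shown there.
CJK_MARKERS = ("不", "改")
WORD_MARKERS = ("no", "not", "change", "modify", "wrong")


def looks_like_correction(quote: str) -> bool:
    """Return True if the quote contains any negation or correction marker."""
    if any(marker in quote for marker in CJK_MARKERS):
        return True
    # Whole-word match so that e.g. "know" does not trigger on "no" (assumption).
    words = set(re.findall(r"[a-z]+", quote.lower()))
    return any(marker in words for marker in WORD_MARKERS)


def may_set_gate(quote: str, force: bool = False) -> bool:
    """Apply the rules: non-empty quote, and no markers unless force is given."""
    if not quote.strip():
        return False
    return force or not looks_like_correction(quote)
```

With these rules, "no audio, otherwise good" is refused by default and accepted only when `force=True`, which mirrors the `--force` guidance above.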
|
75
|
-
|
|
76
|
-
## How a blocked script looks
|
|
77
|
-
|
|
78
|
-
```
|
|
79
|
-
[HARD-GATE BLOCKED] kling_generate.py needs preview_confirmed=True
|
|
80
|
-
Current state: preview_confirmed=False
|
|
81
|
-
Project: /abs/path/to/gen-output/video-clone/handheld-phone-swap
|
|
82
|
-
|
|
83
|
-
To proceed:
|
|
84
|
-
1. Show the preview bundle to the user and wait for their confirmation.
|
|
85
|
-
2. Run: python scripts/confirm.py --project <name> --quote "<user's actual words>"
|
|
86
|
-
3. Retry this command.
|
|
87
|
-
|
|
88
|
-
Claude: do NOT rationalize past this. The gate exists because text
|
|
89
|
-
instructions alone did not stop prior bypass attempts. Go get the real
|
|
90
|
-
user confirmation.
|
|
91
|
-
```
|
|
92
|
-
|
|
93
|
-
Exit code is always 1. Downstream pipes will break; you can't accidentally
|
|
94
|
-
feed "locked" output into the next step.
|
|
95
|
-
|
|
96
|
-
## Old 3-gate schema (pre-PR #65)
|
|
97
|
-
|
|
98
|
-
If you have a project created before the single-gate refactor, the state
|
|
99
|
-
file will have `plan_confirmed`, `prompt_confirmed`, `frame_confirmed`
|
|
100
|
-
instead of `preview_confirmed`. Any gated script will exit 1 with:
|
|
101
|
-
|
|
102
|
-
```
|
|
103
|
-
[HARD-GATE BLOCKED] expected gate 'preview_confirmed' but it is not
|
|
104
|
-
present. This is most likely an old 3-gate schema project.
|
|
105
|
-
Either: complete this project with PR #65 scripts, or start a new project.
|
|
106
|
-
```
|
|
107
|
-
|
|
108
|
-
Do NOT manually edit old phase.json files to add `preview_confirmed`.
|
|
109
|
-
Start a new project with `init_project.py` and re-run prep.
|
|
110
|
-
|
|
111
|
-
## Common bypass attempts (and why the gate still wins)
|
|
112
|
-
|
|
113
|
-
| Attempt | Outcome |
|
|
114
|
-
|---|---|
|
|
115
|
-
| "analyze_source is just analysis, no cost" | Runs freely — no gate on prep scripts. Correct behavior. |
|
|
116
|
-
| "I'll edit phase.json to set the gate" | Possible, but leaves a gap in `history`. Audit trail shows no `confirm.py` call. |
|
|
117
|
-
| `confirm.py --quote 'ok'` without asking user | Sets gate with `user_quote: "ok"`. No user types that in isolation — obvious in post-mortem. |
|
|
118
|
-
| "I'll skip preview.py and confirm directly" | `confirm.py` doesn't require preview_v*.md — but `preview.py` must have run first for artifacts to be present for the user to review. |
|
|
119
|
-
| "I'll symlink phase.json to /dev/null" | Exit 1 on read. |
|
|
120
|
-
| "I'll delete .state/" | Exit 1 — "phase.json not found". |
|
|
121
|
-
|
|
122
|
-
The cheapest path is always to actually get the user's confirmation.
|
|
123
|
-
|
|
124
|
-
## When to read history
|
|
125
|
-
|
|
126
|
-
The `history` array is append-only. Useful cases:
|
|
127
|
-
|
|
128
|
-
- **Resuming a multi-session project** — read history to know what gate
|
|
129
|
-
is set and what the user said. Use `status.py` for a human-readable view.
|
|
130
|
-
- **Debugging bad output** — if a video is wrong, history shows the exact
|
|
131
|
-
quote the user gave when confirming the preview.
|
|
132
|
-
- **Verifying gate authenticity** — `caller` field shows which script set
|
|
133
|
-
the gate. `confirm.py` is the only legitimate caller.
|
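A post-mortem check over `history` could be sketched as below. This is illustrative only: the source names the `caller` field but does not show a full history entry, so the assumption that `caller` holds the literal string `confirm.py` is mine, as are both function names.

```python
from typing import List

LEGITIMATE_CALLER = "confirm.py"  # assumed literal value of the `caller` field


def suspicious_entries(history: List[dict]) -> List[dict]:
    """Return history entries that were not written by confirm.py."""
    return [entry for entry in history if entry.get("caller") != LEGITIMATE_CALLER]


def has_history_gap(state: dict, gate: str = "preview_confirmed") -> bool:
    """Return True if the gate is set but history records no set operation at all."""
    entry = (state.get("gates") or {}).get(gate) or {}
    return entry.get("status") is True and not state.get("history")
```

A hand-edited `phase.json` typically shows up as a gap (gate set, empty history), while a direct `set_gate` call shows up as an entry with an unexpected caller.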
|
134
|
-
|
|
135
|
-
## Why this exists
|
|
136
|
-
|
|
137
|
-
Previous versions had HARD-GATE rules in SKILL.md text with a
|
|
138
|
-
Rationalization Counter table. They did not stop Claude from running
|
|
139
|
-
ffprobe / downloading videos / calling `gen image` before the user
|
|
140
|
-
confirmed. The failure mode was always the same: Claude decided "my action
|
|
141
|
-
doesn't count as execution" and rationalized past the text rule.
|
|
142
|
-
|
|
143
|
-
Mechanical gates end the argument. The script either runs or it doesn't,
|
|
144
|
-
and the decision doesn't depend on how Claude interprets the rules.
|
|
@@ -1,85 +0,0 @@
|
|
|
1
|
-
# Video generation with audio — what the script handles + what you need to know
|
|
2
|
-
|
|
3
|
-
The `kling_generate.py` script calls the **Optima generation backend**, which
|
|
4
|
-
routes to Kling 3.0 internally. The script itself knows nothing about the
|
|
5
|
-
upstream provider — only the backend does.
|
|
6
|
-
|
|
7
|
-
```bash
|
|
8
|
-
python scripts/kling_generate.py --project <name> --frame <confirmed-frame.png>
|
|
9
|
-
# options: --duration 5|10 --aspect-ratio 9:16 --mode std|pro
|
|
10
|
-
# --cfg-scale 0.5 --no-audio
|
|
11
|
-
```
|
|
12
|
-
|
|
13
|
-
## When to use `kling_generate.py` vs `gen_video.py`
|
|
14
|
-
|
|
15
|
-
| Need audio / lip sync? | Use |
|
|
16
|
-
|---|---|
|
|
17
|
-
| Yes | `kling_generate.py` ($0.15/s ≈ $1.50 per 10s equivalent) |
|
|
18
|
-
| No | `gen_video.py` (~$0.02 per 10s equivalent) |
|
|
19
|
-
|
|
20
|
-
Both scripts go through the same generation backend and the same billing
|
|
21
|
-
middleware — the difference is server-side provider selection.
|
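The two price points in the table translate into a rough estimate like the one below. This is a back-of-the-envelope sketch: the per-second rate for the no-audio path is derived here from "~$0.02 per 10s", the function name is hypothetical, and actual charges are decided server-side in credits.

```python
AUDIO_USD_PER_SECOND = 0.15          # from the table: $0.15/s, about $1.50 per 10 s
NO_AUDIO_USD_PER_SECOND = 0.02 / 10  # derived from the table: about $0.02 per 10 s


def estimate_cost_usd(duration_s: float, with_audio: bool) -> float:
    """Rough USD estimate for one clip; real billing happens on the backend."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    rate = AUDIO_USD_PER_SECOND if with_audio else NO_AUDIO_USD_PER_SECOND
    return round(rate * duration_s, 4)
```

At these rates a 10 s clip with audio costs about 75 times as much as one without, which is why the audio question is worth asking early.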
|
22
|
-
|
|
23
|
-
## Auth + API URL discovery
|
|
24
|
-
|
|
25
|
-
`kling_generate.py` mirrors the `@optima-chat/gen-cli` auth convention, so
|
|
26
|
-
any user who has run `optima login` is automatically ready — no extra env
|
|
27
|
-
setup needed.
|
|
28
|
-
|
|
29
|
-
Token resolution (first match wins):
|
|
30
|
-
|
|
31
|
-
1. `OPTIMA_TOKEN` env var
|
|
32
|
-
2. `~/.optima/token.json` (`access_token` field)
|
|
33
|
-
|
|
34
|
-
Generation API URL resolution (first match wins):
|
|
35
|
-
|
|
36
|
-
1. `GENERATION_API_URL` env var
|
|
37
|
-
2. `~/.optima/token.json` `env` field → `ci` / `stage` / `prod` mapping
|
|
38
|
-
3. default: `https://gen-api.optima.onl` (prod)
|
|
39
|
-
|
|
40
|
-
If neither token source resolves, the script exits 1 with a clear hint
|
|
41
|
-
("Run `optima login` or set OPTIMA_TOKEN"). **No fallback to direct upstream
|
|
42
|
-
calls** — that would bypass billing.
|
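The two "first match wins" lookups above can be sketched as follows. This is a minimal sketch, not the code in `kling_generate.py`: the function names are hypothetical, and because the text does not give the `ci` / `stage` URLs, the env-to-URL mapping is passed in by the caller rather than hardcoded.

```python
import json
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_API_URL = "https://gen-api.optima.onl"  # prod default named in the text


def _read_token_file(token_file: str) -> dict:
    """Load the token file as a dict, or return {} if missing or malformed."""
    try:
        data = json.loads(Path(token_file).expanduser().read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def resolve_token(environ: Mapping[str, str], token_file: str) -> Optional[str]:
    """OPTIMA_TOKEN env var first, then the token file's access_token field."""
    return environ.get("OPTIMA_TOKEN") or _read_token_file(token_file).get("access_token") or None


def resolve_api_url(environ: Mapping[str, str], token_file: str,
                    env_urls: Mapping[str, str]) -> str:
    """GENERATION_API_URL first, then the token file's env field, then the default."""
    explicit = environ.get("GENERATION_API_URL")
    if explicit:
        return explicit
    env_name = _read_token_file(token_file).get("env")
    if isinstance(env_name, str) and env_name in env_urls:
        return env_urls[env_name]
    return DEFAULT_API_URL
```

A caller would pass `os.environ` and `~/.optima/token.json`, and exit 1 with the login hint when `resolve_token` returns `None`.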
|
43
|
-
|
|
44
|
-
Run `scripts/preflight.py` to verify the auth state and other deps.
|
|
45
|
-
|
|
46
|
-
The `requests` Python package is also required.
|
|
47
|
-
|
|
48
|
-
## Billing
|
|
49
|
-
|
|
50
|
-
Billing is entirely server-side. When `kling_generate.py` calls
|
|
51
|
-
`POST /api/video/generate`, the backend's billing middleware pre-deducts
|
|
52
|
-
credits from the user's Optima wallet based on `metadata.duration`. If the
|
|
53
|
-
upstream task later fails, the server refunds the credits automatically
|
|
54
|
-
(see `billingClient.refund()` in the generation service worker).
|
|
55
|
-
|
|
56
|
-
This means the skill never sees raw USD pricing — it just sends a request
|
|
57
|
-
and waits. The user's wallet is the source of truth for how much was spent.
|
|
58
|
-
|
|
59
|
-
## Error handling
|
|
60
|
-
|
|
61
|
-
`_submit()` translates the common billing errors into readable messages:
|
|
62
|
-
|
|
63
|
-
- HTTP 402 `INSUFFICIENT_CREDITS` — user does not have enough credits
|
|
64
|
-
- HTTP 403 `PLAN_RESTRICTED` — user's plan does not include video generation
|
|
65
|
-
|
|
66
|
-
Other HTTP errors propagate via `raise_for_status()`. Polling uses the
|
|
67
|
-
backend's unified `GET /api/task/{id}` endpoint.
|
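The error translation described above could be as small as a lookup table. This sketch is illustrative: only the two status and code pairs come from the text, while the message wording, the function name, and the idea that the error code arrives as a separate string are assumptions.

```python
from typing import Optional

# (HTTP status, backend error code) -> readable message; wording is illustrative.
BILLING_ERRORS = {
    (402, "INSUFFICIENT_CREDITS"): "Not enough credits to generate this video.",
    (403, "PLAN_RESTRICTED"): "The current plan does not include video generation.",
}


def describe_billing_error(status_code: int, error_code: Optional[str]) -> Optional[str]:
    """Return a readable message for a known billing error, else None."""
    return BILLING_ERRORS.get((status_code, error_code))
```

Returning `None` for anything unrecognized leaves those responses to the generic HTTP error path.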
|
68
|
-
|
|
69
|
-
## Non-obvious traps (now handled server-side)
|
|
70
|
-
|
|
71
|
-
The `cfg_scale`-must-be-float / lowercase-status / 3.0-vs-2.6 response
|
|
72
|
-
differences / freeimage-vs-catbox quirks are all handled by the server's
|
|
73
|
-
PiAPI adapter (`optima-gen: packages/generation/src/adapters/piapi-video.ts`).
|
|
74
|
-
The skill no longer has to know about them.
|
|
75
|
-
|
|
76
|
-
## What changed vs the old direct-PiAPI version
|
|
77
|
-
|
|
78
|
-
Before: script uploaded to freeimage.host, submitted to PiAPI, polled PiAPI,
|
|
79
|
-
downloaded from Kling CDN. No billing. Shared PiAPI key hardcoded in the
|
|
80
|
-
skill.
|
|
81
|
-
|
|
82
|
-
After: script sends one POST + polls one endpoint on the Optima backend.
|
|
83
|
-
Billing is automatic. No upstream API keys in the skill. Provider switches
|
|
84
|
-
(if the Kling API ever changes) happen server-side without touching any
|
|
85
|
-
client code.
|
|
@@ -1,71 +0,0 @@
|
|
|
1
|
-
# Motion prompt template and conventions
|
|
2
|
-
|
|
3
|
-
## Chinese 6-section template
|
|
4
|
-
|
|
5
|
-
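For replication, Claude Opus generates it by analyzing the source frames; for pure generation, it is written from the user's description.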
|
|
6
|
-
It is written to `prompt.md` and printed for the user; Phase 3 reads it from `prompt.md`.
|
|
7
|
-
|
|
8
|
-
```
|
|
9
|
-
### 视觉风格 (visual style)
|
|
10
|
-
[camera/device feel + image texture + color scheme + lighting + atmosphere]
|
|
11
|
-
|
|
12
|
-
### 场景叙述 (scene narrative)
|
|
13
|
-
[time and place + person's appearance + product description (repeat color/features) + background environment]
|
|
14
|
-
|
|
15
|
-
### 摄影技术 (cinematography)
|
|
16
|
-
[shot size + camera movement + focal length + depth of field + lighting] Mood: [...]
|
|
17
|
-
|
|
18
|
-
### 动作清单 (action list)
|
|
19
|
-
- [chronological order, precise down to which hand]
|
|
20
|
-
- [product interaction, avoid complex hand manipulation]
|
|
21
|
-
|
|
22
|
-
### 对话 (dialogue)
|
|
23
|
-
- [language and style]
|
|
24
|
-
|
|
25
|
-
### 背景声音 (background sound)
|
|
26
|
-
- [ambient sound + voices + no background music]
|
|
27
|
-
```
|
|
28
|
-
|
|
29
|
-
## Anti-AI style (woven into the cinematography section)
|
|
30
|
-
|
|
31
|
-
Handheld shooting / slight shake / natural light / no filter / documentary feel (手持拍摄/轻微晃动/自然光/无滤镜/纪实感)
|
|
32
|
-
|
|
33
|
-
## Banned words
|
|
34
|
-
|
|
35
|
-
梦幻 (dreamy) / 空灵 (ethereal) / 电影感 (cinematic) / 慢动作 (slow motion) / 丝滑 (silky smooth) / 优雅 (elegant)
|
|
36
|
-
|
|
37
|
-
## Length control
|
|
38
|
-
|
|
39
|
-
1200-2000 characters; anything over 2500 must be trimmed.
|
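The banned-word list and the length limits above lend themselves to a small lint step before a prompt goes into the preview. This is a hypothetical helper, not part of the original scripts; the labels it returns, and the treatment of the 2000 to 2500 range as "above target but allowed", are assumptions.

```python
# Banned words exactly as listed in the banned-words section.
BANNED_WORDS = ("梦幻", "空灵", "电影感", "慢动作", "丝滑", "优雅")


def lint_prompt(text: str) -> dict:
    """Report character count, a length label, and any banned words found."""
    n = len(text)
    if n > 2500:
        length = "must_trim"      # over the hard limit
    elif n > 2000:
        length = "above_target"   # allowed, but longer than the 1200-2000 target
    elif n < 1200:
        length = "below_target"
    else:
        length = "ok"
    banned = [word for word in BANNED_WORDS if word in text]
    return {"chars": n, "length": length, "banned": banned}
```

Python's `len` counts Unicode code points, so each Chinese character counts as one, which matches a character-based limit.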
|
40
|
-
|
|
41
|
-
## prompt.md workflow
|
|
42
|
-
|
|
43
|
-
1. Claude analyzes → writes prompt.md → prints it for the user
|
|
44
|
-
2. User says "OK" → use it as-is; user says "change it" → the user edits it or tells Claude what to change
|
|
45
|
-
3. Phase 3 reads prompt.md to generate the video
|
|
46
|
-
4. On a rerun: edit prompt.md → the old version is recorded in log.md
|
|
47
|
-
|
|
48
|
-
## Example
|
|
49
|
-
|
|
50
|
-
```markdown
|
|
51
|
-
# fishing-scale — Prompt
|
|
52
|
-
|
|
53
|
-
### 视觉风格
|
|
54
|
-
竖屏手持vlog,自然饱和色彩,明亮日光,无滤镜,纪实感。
|
|
55
|
-
|
|
56
|
-
### 场景叙述
|
|
57
|
-
阳光白天,戴眼镜、深蓝头巾、黑色运动上衣的女子跪坐沙滩...
|
|
58
|
-
|
|
59
|
-
### 摄影技术
|
|
60
|
-
中景,手持拍摄,轻微晃动,自然光,浅景深。情绪:轻松日常
|
|
61
|
-
|
|
62
|
-
### 动作清单
|
|
63
|
-
- 左手托住电子秤底部,右手食指轻触屏幕
|
|
64
|
-
- 产品保持静止,人物微笑看向镜头
|
|
65
|
-
|
|
66
|
-
### 对话
|
|
67
|
-
- 英语,日常对话风格
|
|
68
|
-
|
|
69
|
-
### 背景声音
|
|
70
|
-
- 海浪声、风声、远处人声,无背景音乐
|
|
71
|
-
```
|
|
@@ -1,32 +0,0 @@
|
|
|
1
|
-
# URL / Source Download — decision table
|
|
2
|
-
|
|
3
|
-
**Do not use WebFetch** for source videos (anti-scraping, auth walls).
|
|
4
|
-
Choose the right tool based on the URL shape and use it manually — there
|
|
5
|
-
is no script wrapper because the right approach varies by platform.
|
|
6
|
-
|
|
7
|
-
| URL shape | Command |
|
|
8
|
-
|---|---|
|
|
9
|
-
| `tiktok.com/@user/video/<id>` | `scout tiktok video-detail <id>` → grab video URL → `wget` |
|
|
10
|
-
| `vm.tiktok.com/<short>` | `curl -sI <url>` → `Location:` header → extract id → see TikTok row |
|
|
11
|
-
| `douyin.com/video/<id>` | `scout douyin video-download <id>` → `wget` |
|
|
12
|
-
| `v.douyin.com/<short>` | `scout douyin video-by-url "<url>"` |
|
|
13
|
-
| Instagram Reels (`instagram.com/reel/...`) | `scout instagram download-reel "<url>"` |
|
|
14
|
-
| Xiaohongshu video (`xiaohongshu.com/explore/<id>`) | `scout xhs note-detail <id>` → grab video link → `wget` |
|
|
15
|
-
| Local file path | Pass directly to `--video` |
|
|
16
|
-
|
|
17
|
-
After download, save to `gen-output/video-clone/<project>/source/` and
|
|
18
|
-
then feed the local path to `analyze_source.py --video <path>`.
|
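Picking the right row of the decision table can be sketched as a small classifier. This is an illustrative helper only: the source says there is deliberately no script wrapper for downloads, so the function name and the returned labels are hypothetical, and the URL shapes are limited to the ones in the table.

```python
import re
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Map a source string to one of the table's rows (labels are hypothetical)."""
    parsed = urlparse(source)
    if parsed.scheme not in ("http", "https"):
        return "local_file"  # anything that is not a web URL is treated as a path
    host = (parsed.hostname or "").lower().removeprefix("www.")
    path = parsed.path
    if host == "vm.tiktok.com":
        return "tiktok_short"
    if host == "tiktok.com" and re.match(r"^/@[^/]+/video/\d+", path):
        return "tiktok_video"
    if host == "v.douyin.com":
        return "douyin_short"
    if host == "douyin.com" and re.match(r"^/video/\d+", path):
        return "douyin_video"
    if host == "instagram.com" and path.startswith("/reel/"):
        return "instagram_reel"
    if host == "xiaohongshu.com" and path.startswith("/explore/"):
        return "xiaohongshu_note"
    return "unknown_url"
```

The label only tells you which `scout` command from the table to run by hand; the download itself stays manual.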
|
19
|
-
|
|
20
|
-
## Why the script pipeline starts after download
|
|
21
|
-
|
|
22
|
-
Downloading is the *only* step that remains manual because platform APIs
|
|
23
|
-
change faster than the script would. Everything downstream of
|
|
24
|
-
`--video <local-file>` is automated by the Python scripts.
|
|
25
|
-
|
|
26
|
-
## Sanity checks before running analyze_source.py
|
|
27
|
-
|
|
28
|
-
1. File size > 0
|
|
29
|
-
2. `ffprobe -v error <file>` returns no errors
|
|
30
|
-
3. Duration makes sense for the source (`ffprobe -show_entries format=duration`)
|
|
31
|
-
|
|
32
|
-
If any of these fail, re-download with a different tool before proceeding.
|
|
@@ -1,92 +0,0 @@
|
|
|
1
|
-
# Workflow knowledge base system
|
|
2
|
-
|
|
3
|
-
## What is a workflow
|
|
4
|
-
|
|
5
|
-
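A workflow is a summary of the lessons from one successful video production. It records "which approach works best" for a specific scenario, so that similar tasks do not have to start from scratch.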
|
|
6
|
-
|
|
7
|
-
## Directory structure
|
|
8
|
-
|
|
9
|
-
```
|
|
10
|
-
gen-output/video-clone/workflows/
|
|
11
|
-
├── README.md ← index, one line per workflow
|
|
12
|
-
├── handheld-product-swap.md ← handheld vlog product swap
|
|
13
|
-
├── multi-scene-product-demo.md ← multi-scene product showcase
|
|
14
|
-
└── lifestyle-pure-gen.md ← lifestyle pure generation
|
|
15
|
-
```
|
|
16
|
-
|
|
17
|
-
## README.md format
|
|
18
|
-
|
|
19
|
-
An index file for quick lookup. One workflow per line, in this format:
|
|
20
|
-
|
|
21
|
-
```markdown
|
|
22
|
-
# Video Clone Workflows
|
|
23
|
-
|
|
24
|
-
| Workflow | Applicable scenario | Result | Key strategy |
|
|
25
|
-
|---|---|---|---|
|
|
26
|
-
| [handheld-product-swap](handheld-product-swap.md) | handheld vlog + single-item swap | ⭐⭐⭐⭐⭐ | dual-image first frame, frame picked at t=15s, simple hand movements |
|
|
27
|
-
| [multi-scene-product-demo](multi-scene-product-demo.md) | multi-scene product showcase (>2 segments) | ⭐⭐⭐⭐ | separate first frame per segment, product description consistent across segments |
|
|
28
|
-
```
|
|
29
|
-
|
|
30
|
-
## Workflow file format
|
|
31
|
-
|
|
32
|
-
Each `.md` file contains:
|
|
33
|
-
|
|
34
|
-
```markdown
|
|
35
|
-
# {workflow-name}
|
|
36
|
-
|
|
37
|
-
## Applicable scenarios
|
|
38
|
-
- What kinds of video this workflow suits
|
|
39
|
-
- Key characteristics (single/multi-segment, with or without people, product type, etc.)
|
|
40
|
-
|
|
41
|
-
## Strategy
|
|
42
|
-
|
|
43
|
-
### First frame
|
|
44
|
-
- Frame selection strategy (which timestamp works best, and why)
|
|
45
|
-
- gen image parameters (single/dual image, prompt keywords)
|
|
46
|
-
- Pitfalls (what does not work)
|
|
47
|
-
|
|
48
|
-
### Prompt
|
|
49
|
-
- Prompt style and emphasis (which sections need the most detail)
|
|
50
|
-
- Proven prompt fragments (directly reusable)
|
|
51
|
-
- Banned or ineffective ways of describing things
|
|
52
|
-
|
|
53
|
-
### Video generation
|
|
54
|
-
- Tool choice and parameters
|
|
55
|
-
- cfg_scale / duration / mode settings
|
|
56
|
-
|
|
57
|
-
### Post-processing
|
|
58
|
-
- Special ffmpeg parameters
|
|
59
|
-
|
|
60
|
-
## Success cases
|
|
61
|
-
- Project name + brief result (link to the project directory)
|
|
62
|
-
|
|
63
|
-
## Pitfalls
|
|
64
|
-
- Approaches that were tried and failed, to avoid repeating them
|
|
65
|
-
```
|
|
66
|
-
|
|
67
|
-
## When to create a new workflow
|
|
68
|
-
|
|
69
|
-
When all of the following conditions are met:
|
|
70
|
-
|
|
71
|
-
1. **User is satisfied**: the final video was approved by the user
|
|
72
|
-
2. **New scenario**: no existing workflow fully covers it
|
|
73
|
-
3. **Reusable lessons**: not pure luck; there is a clear strategy to distill
|
|
74
|
-
|
|
75
|
-
## When to update an existing workflow
|
|
76
|
-
|
|
77
|
-
- A better parameter or strategy was found for the same kind of task
|
|
78
|
-
- A new pitfall was hit that is worth recording to avoid hitting it again
|
|
79
|
-
- A tool update means the old strategy needs adjusting
|
|
80
|
-
|
|
81
|
-
## Matching logic
|
|
82
|
-
|
|
83
|
-
After Phase 0 reads README.md, match along these dimensions:
|
|
84
|
-
|
|
85
|
-
1. **Video type**: handheld vlog / product showcase / talking head / pure showcase
|
|
86
|
-
2. **Segment structure**: single segment / multi-segment
|
|
87
|
-
3. **Product interaction**: handheld / placed on a table / worn / no people
|
|
88
|
-
4. **Audio needs**: lip sync / BGM only / no audio
|
|
89
|
-
|
|
90
|
-
Full match → reuse the strategy directly.
|
|
91
|
-
Partial match → adapt from the closest workflow.
|
|
92
|
-
No match → use the generic pipeline.
|
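The four-dimension matching above can be sketched as a simple scoring pass. This is a sketch under stated assumptions: in the source the matching is a judgment made while reading README.md, not a script, so the dimension keys, the function names, and the rule that "partial" means at least one matching dimension are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical keys for the four dimensions listed above.
DIMENSIONS = ("video_type", "segment_structure", "product_interaction", "audio_needs")


def score(task: dict, workflow: dict) -> int:
    """Count the dimensions on which the task and the workflow agree."""
    return sum(1 for d in DIMENSIONS if d in task and task[d] == workflow.get(d))


def best_match(task: dict, workflows: Dict[str, dict]) -> Tuple[Optional[str], str]:
    """Return (workflow name, 'full' | 'partial') or (None, 'none')."""
    best_name, best_score = None, 0
    for name, workflow in workflows.items():
        s = score(task, workflow)
        if s > best_score:
            best_name, best_score = name, s
    if best_score == len(DIMENSIONS):
        return best_name, "full"
    if best_score > 0:
        return best_name, "partial"
    return None, "none"
```

A `full` result means the stored strategy can be reused directly, `partial` means starting from the closest workflow and adapting it, and `none` falls back to the generic pipeline.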