@kolbo/kolbo-code-linux-arm64-musl 2.0.0 → 2.0.2
- package/bin/kolbo +0 -0
- package/package.json +1 -1
- package/skills/color-grading/SKILL.md +152 -0
- package/skills/ffmpeg-patterns/SKILL.md +240 -0
- package/skills/image-prompting-guide/SKILL.md +143 -0
- package/skills/kolbo/SKILL.md +263 -19
- package/skills/music-prompting/SKILL.md +146 -0
- package/skills/production-review/SKILL.md +152 -0
- package/skills/short-form-video/SKILL.md +168 -0
- package/skills/sound-design/SKILL.md +154 -0
- package/skills/storytelling/SKILL.md +139 -0
- package/skills/subtitle-production/SKILL.md +244 -0
- package/skills/subtitle-production/reference/burn_to_video.py +222 -0
- package/skills/subtitle-production/reference/export_srts.py +127 -0
- package/skills/subtitle-production/reference/gen_srt.py +42 -0
- package/skills/typography-video/SKILL.md +182 -0
- package/skills/typography-video/reference/KineticTitleScene.tsx +345 -0
- package/skills/video-editing/SKILL.md +128 -0
- package/skills/video-production/SKILL.md +7 -8
- package/skills/video-prompting-guide/SKILL.md +268 -0
@@ -0,0 +1,244 @@
---
name: subtitle-production
description: >
  Subtitle and caption production: timing strategies, cue length by format (vertical vs horizontal),
  ASS/SRT styling, word-level timing, RTL support for Hebrew/Arabic, burn-in with FFmpeg,
  readability rules. Use when generating, styling, or burning in subtitles.
  Keywords: subtitle, caption, SRT, ASS, VTT, timing, burn-in, word-level, karaoke, RTL,
  Hebrew, Arabic, font size, cue, readability
---

# Subtitle & Caption Production

## Output Formats

| Format | Extension | Use Case |
|--------|-----------|----------|
| SRT | `.srt` | Universal: FFmpeg, players, YouTube upload |
| VTT | `.vtt` | Web-native: HTML5 video, browser playback |
| ASS | `.ass` | Advanced styling, RTL support, per-word positioning |

## Cue Length by Format

### Vertical Short-Form (TikTok, Reels, Shorts)
- **Max 3-4 words per cue**: narrow screen, text must be large
- **Max 20 characters per line**
- Subtitles are **mandatory** (an often-cited figure is that 85% of viewers watch muted)

### Horizontal Standard (YouTube, web)
- **Max 6-8 words per cue**: wider screen
- **Max 42 characters per line** (broadcast standard)

### General Rules
- Average viewer reads ~15 characters/second
- Minimum display time: 0.5 seconds per cue
- Maximum display time: 5 seconds per cue

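The rules above can be checked mechanically. The following is a minimal sketch, assuming the ~15 characters/second reading speed and the 0.5 s / 5 s display limits stated in this section; `min_cue_duration` and `check_cue` are hypothetical helper names, not part of the bundled reference scripts.

```python
READING_SPEED_CPS = 15.0  # ~15 characters/second, from the general rules
MIN_DISPLAY_S = 0.5
MAX_DISPLAY_S = 5.0

# Per-format limits from this section: (max words per cue, max characters per line)
LIMITS = {
    "vertical": (4, 20),
    "horizontal": (8, 42),
}


def min_cue_duration(text: str) -> float:
    """Seconds needed to read `text`, clamped to the min/max display time."""
    needed = len(text) / READING_SPEED_CPS
    return max(MIN_DISPLAY_S, min(MAX_DISPLAY_S, needed))


def check_cue(text: str, fmt: str) -> list[str]:
    """Return a list of problems for one cue (empty if it passes)."""
    max_words, max_chars = LIMITS[fmt]
    problems = []
    if len(text.split()) > max_words:
        problems.append(f"more than {max_words} words")
    if any(len(line) > max_chars for line in text.splitlines()):
        problems.append(f"line longer than {max_chars} characters")
    return problems
```

A cue that fails `check_cue` is a candidate for splitting; a cue shorter than `min_cue_duration` is a candidate for extending, as long as it does not run into the next cue.
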
## Styling for Burn-in

### Vertical Video (1080x1920)
```
font: Arial (or Heebo Bold for Hebrew)
font_size: 18
bold: true
primary_color: &H00FFFFFF (white, ASS format)
outline_color: &H00000000 (black)
outline_width: 3 (thick for readability)
shadow: 2
margin_v: 50
alignment: 2 (bottom center)
```

### Horizontal Video (1920x1080)
```
font: Arial
font_size: 22
bold: true
primary_color: &H00FFFFFF
outline_color: &H00000000
outline_width: 2
shadow: 1
margin_v: 40
alignment: 2
```

### Common Mistakes
- **Wrong color format:** `&HFFFFFF` breaks positioning. Always use full 8-char `&H00FFFFFF`
- **Font too large on vertical:** `font_size: 28` fills center of 9:16. Use 18 max
- **Too many words per cue on vertical:** 5+ words creates multi-line blocks covering the face
- **MarginV too large:** Values over 200 push text off-screen. Stay under 100

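The presets and the common mistakes above can be encoded as data plus a small validator. This is a sketch under assumptions: the preset values are copied from the two style blocks, the keys use ASS style field names as they appear in the `force_style` examples in this document, and `validate_style` / `force_style` are hypothetical helpers rather than anything shipped with the skill.

```python
import re

# Presets copied from the style blocks above, keyed by ASS style field name.
PRESETS = {
    "vertical": {
        "FontName": "Arial", "FontSize": 18, "Bold": 1,
        "PrimaryColour": "&H00FFFFFF", "OutlineColour": "&H00000000",
        "Outline": 3, "Shadow": 2, "MarginV": 50, "Alignment": 2,
    },
    "horizontal": {
        "FontName": "Arial", "FontSize": 22, "Bold": 1,
        "PrimaryColour": "&H00FFFFFF", "OutlineColour": "&H00000000",
        "Outline": 2, "Shadow": 1, "MarginV": 40, "Alignment": 2,
    },
}

_ASS_COLOUR = re.compile(r"^&H[0-9A-Fa-f]{8}$")


def validate_style(style: dict, fmt: str) -> list[str]:
    """Check a style dict against the common mistakes listed above."""
    problems = []
    for key in ("PrimaryColour", "OutlineColour"):
        if not _ASS_COLOUR.match(str(style.get(key, ""))):
            problems.append(f"{key} must use 8 hex digits, e.g. &H00FFFFFF")
    if fmt == "vertical" and style.get("FontSize", 0) > 18:
        problems.append("FontSize above 18 on vertical video")
    if style.get("MarginV", 0) >= 100:
        problems.append("MarginV should stay under 100")
    return problems


def force_style(style: dict) -> str:
    """Serialize a style dict as the value of the force_style option."""
    return ",".join(f"{key}={value}" for key, value in style.items())
```

Validating before serializing keeps a bad colour or margin from reaching the burn-in step, where the only symptom is a wrong-looking render.
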
## Timing Best Practices

- Cue start must match word onset (not before the speaker starts)
- Cue end should extend ~200ms past the last word for comfortable reading
- Never let a cue linger into the next speaker's turn
- Don't split a thought across two cues if it fits in one

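The ~200 ms tail and the "never linger into the next turn" rule interact, so it helps to apply them together. A minimal sketch, assuming cues are sorted by start time and that the next cue's start marks where the tail must stop; the `Cue` class and `pad_cue_ends` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

TAIL_PAD_S = 0.2  # ~200 ms of extra reading time after the last word


@dataclass
class Cue:
    start: float  # seconds, onset of the first word
    end: float    # seconds, end of the last word
    text: str


def pad_cue_ends(cues: list[Cue], pad: float = TAIL_PAD_S) -> list[Cue]:
    """Extend each cue end by `pad`, but never past the start of the next cue."""
    padded = []
    for i, cue in enumerate(cues):
        new_end = cue.end + pad
        if i + 1 < len(cues):
            # Clamp so the tail cannot overlap the following cue.
            new_end = min(new_end, cues[i + 1].start)
        padded.append(Cue(cue.start, new_end, cue.text))
    return padded
```

Cue starts are left untouched, which keeps them aligned with word onset.
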
## FFmpeg Burn-in Commands

### Simple SRT
```bash
ffmpeg -i input.mp4 -vf "subtitles=subs.srt:force_style='FontSize=22,Bold=1,PrimaryColour=&H00FFFFFF,OutlineColour=&H00000000,Outline=2'" -c:v libx264 -crf 18 -c:a copy output.mp4
```

### ASS with Custom Styling
```bash
ffmpeg -i input.mp4 -vf "ass=styled_subs.ass" -c:v libx264 -crf 18 -c:a copy output.mp4
```

### Windows Path Escaping
```bash
# Escape colons in subtitle filter paths on Windows
ffmpeg -i input.mp4 -vf "subtitles=C\\:/Users/path/subs.srt" output.mp4
```

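When the command is launched from a script, passing an argument list avoids one layer of shell quoting. The sketch below mirrors the "Simple SRT" command above; `build_burn_cmd` is a hypothetical helper, and it assumes the subtitle path is a plain relative path that needs no filter-level escaping (no colons, quotes, or backslashes).

```python
def build_burn_cmd(video: str, srt: str, output: str, style: str = "") -> list[str]:
    """Return the ffmpeg argument list for burning `srt` into `video`.

    `style` is an optional force_style value such as "FontSize=22,Bold=1".
    """
    vf = f"subtitles={srt}"
    if style:
        vf += f":force_style='{style}'"
    return [
        "ffmpeg", "-i", video,
        "-vf", vf,
        "-c:v", "libx264", "-crf", "18",
        "-c:a", "copy",
        output,
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`. Absolute Windows paths still need the colon escaping shown above, which this sketch does not attempt.
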
## RTL (Hebrew/Arabic): Proven Patterns

RTL subtitles are tricky. These patterns are battle-tested in Kolbo's video production pipeline.

**Reference implementations (bundled in `./reference/`):**
- `reference/burn_to_video.py`: Full burn pipeline with RTL progress bar (`geq` filter), chapter compositing, NVENC encoding
- `reference/export_srts.py`: SRT generation with chapter divider offset accounting
- `reference/gen_srt.py`: SRT cues grouped from word-level transcript JSON (8-word grouping, 1.5s gap detection)

### Option 1: SRT with Simple Burn-in (easiest, works for most cases)

Plain SRT files work for Hebrew/Arabic if you use the right font and let FFmpeg's libass handle bidi:
```bash
ffmpeg -i input.mp4 -vf "subtitles=subs.srt:force_style='FontName=Heebo,FontSize=22,Bold=1,Encoding=177,PrimaryColour=&H00FFFFFF,OutlineColour=&H00000000,Outline=2'" -c:v libx264 -crf 18 -c:a copy output.mp4
```
- **Font**: Heebo Bold for Hebrew, Cairo Bold for Arabic
- **Encoding=177** (Hebrew) or **Encoding=178** (Arabic) in ASS style

### Option 2: ASS with Per-Word Positioning (for karaoke/highlighting)

When you need per-word color highlighting with RTL text, you MUST use separate ASS Dialogue lines per word:

- Each word gets its own `Dialogue` line with explicit `\pos(x,y)`
- Use PIL to measure word widths: apply `~0.74` scale factor (PIL→libass calibration)
- Use `Alignment=7` (top-left anchor) so `\pos` sets exact top-left of each word
- Two named ASS styles (e.g., White + Yellow) for highlight vs inactive, with NO inline `\c` tags

**CRITICAL:** Any inline ASS tag (`\c`, `\K`, `\1c`) between RTL words **breaks Unicode bidi in libass**: words render LTR instead of RTL. Always use separate Dialogue lines per word.

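Option 2 can be sketched as a generator of `Dialogue` lines. This is an illustration under several assumptions, not the pipeline's actual code: word widths are measured elsewhere (for example with PIL and the ~0.74 factor) and passed in, the styles `White` and `Yellow` are defined in the ASS header with `Alignment=7`, each highlight step lasts until the next word starts, and `ass_time` / `rtl_word_dialogues` are hypothetical names.

```python
def ass_time(seconds: float) -> str:
    """Format seconds as an ASS timestamp: H:MM:SS.cc (centiseconds)."""
    cs = int(round(seconds * 100))
    h, rem = divmod(cs, 360_000)
    m, rem = divmod(rem, 6_000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"


def rtl_word_dialogues(words, widths, right_x, y, gap=12,
                       active_style="Yellow", inactive_style="White"):
    """Build one Dialogue line per word for every highlight step.

    words   : dicts with 'text', 'start', 'end' (seconds), in spoken order
    widths  : rendered width of each word, already scaled to script pixels
    right_x : x coordinate of the right edge of the line (RTL starts here)
    """
    # Lay words out right-to-left; \pos is the top-left corner (Alignment=7).
    xs, cursor = [], right_x
    for width in widths:
        cursor -= width
        xs.append(cursor)
        cursor -= gap

    lines = []
    for active, spoken in enumerate(words):
        start = ass_time(spoken["start"])
        # Hold the step until the next word begins so the line never blinks off.
        if active + 1 < len(words):
            end = ass_time(words[active + 1]["start"])
        else:
            end = ass_time(spoken["end"])
        for i, word in enumerate(words):
            style = active_style if i == active else inactive_style
            lines.append(
                f"Dialogue: 0,{start},{end},{style},,0,0,0,,"
                f"{{\\pos({xs[i]},{y})}}{word['text']}"
            )
    return lines
```

The only override tag is the leading `\pos`, and the highlight comes from switching the named style, so no tag ever sits between two RTL words.
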
### Option 3: Remotion Captions (best for karaoke, full RTL control)

Remotion gives you full CSS control over RTL text. Proven pattern from Kolbo's video pipeline:

```tsx
import React from "react";
import { interpolate, useCurrentFrame } from "remotion";

// The component name and prop types are illustrative; adapt them to your composition.
type Word = { text: string; startFrame: number; endFrame: number };

export const Captions: React.FC<{
  words: Word[];
  language: string;
  accentColor: string;
}> = ({ words, language, accentColor }) => {
  const frame = useCurrentFrame();

  // Detect language and set direction
  const isHebrew = language === "he" || language === "iw";
  const fontFamily = isHebrew ? "'Heebo', sans-serif" : "'Poppins', sans-serif";

  // Root container
  return (
    <div style={{
      direction: isHebrew ? "rtl" : "ltr",
      fontFamily,
      textTransform: isHebrew ? "none" : "uppercase",
      letterSpacing: isHebrew ? 0 : -2,
    }}>
      {words.map((word, i) => {
        const progress = interpolate(frame, [word.startFrame, word.endFrame], [0, 1], {
          extrapolateLeft: "clamp", extrapolateRight: "clamp"
        });
        return (
          <span key={i} style={{
            color: progress > 0 ? accentColor : "#ffffff",
            transition: "none", // No CSS transitions in Remotion!
          }}>
            {word.text}{" "}
          </span>
        );
      })}
    </div>
  );
};
```

**RTL-specific gotchas in Remotion (proven fixes):**
- Flip `paddingLeft` ↔ `paddingRight` when Hebrew
- Flip `transformOrigin`: `"top left"` → `"top right"` for Hebrew
- Gradient directions: `270deg` (RTL) vs `90deg` (LTR)
- Position logic: for Hebrew, "left" position actually means right side of screen
- `letterSpacing: 0` for Hebrew (negative kerning looks wrong with Hebrew fonts)
- `textTransform: "none"` for Hebrew (uppercase has no meaning in Hebrew)

### RTL Progress Bar (FFmpeg)

Animated progress bar that fills right-to-left for Hebrew, using `geq` filter:

```python
duration = 5.0  # seconds
is_rtl = True   # Hebrew/Arabic; set to False for LTR languages

if is_rtl:
    # Hebrew (RTL): bar fills RIGHT → LEFT
    bar_cond = f"gt(X,W*(1-T/{duration}))"
else:
    # English (LTR): bar fills LEFT → RIGHT
    bar_cond = f"lt(X,W*T/{duration})"

# Apply as geq filter on bottom 4px strip (performant: 7680 px/frame at 1920 wide, not ~2M)
bar_geq = (
    f"geq="
    f"r='if({bar_cond},59,r(X,Y))':"  # #3b82f6 blue
    f"g='if({bar_cond},130,g(X,Y))':"
    f"b='if({bar_cond},246,b(X,Y))'"
)
```
Uses capital `T` for timestamp in `geq`, which avoids conflict with drawbox's `t=fill`.

### Language Detection

```python
_lang_map = {"heb": "he", "eng": "en", "iw": "he", "ara": "ar", "rus": "ru"}
language_code = _lang_map.get(raw_lang, raw_lang)
is_rtl = language_code in ("he", "ar", "fa", "ur")
```

## Word-Level Timing (Karaoke / Motion Graphics)

For word-by-word highlighting:
1. `transcribe_audio` via Kolbo MCP → get `word_by_word_srt_url` (ElevenLabs Scribe word-level timestamps)
2. Each word has precise start/end timing
3. Group words into display cues (8+ words or >1.5s gap triggers new line)
4. **For Remotion**: use word timings directly as props; CSS `direction: rtl` handles Hebrew ordering automatically
5. **For FFmpeg**: use ASS with per-word Dialogue lines (see Option 2 above)

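To feed word timings to Remotion as props, the word-by-word SRT has to become frame numbers. A minimal sketch, assuming the downloaded file holds one word per cue in standard SRT layout and that the composition runs at 30 fps; `srt_words_to_frames` is a hypothetical helper, and the output keys follow the `text` / `startFrame` / `endFrame` fields used in the Remotion pattern in this document.

```python
import re

_TS = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _seconds(ts: str) -> float:
    """Convert an SRT timestamp (HH:MM:SS,mmm) to seconds."""
    h, m, s, ms = (int(g) for g in _TS.fullmatch(ts.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def srt_words_to_frames(srt_text: str, fps: int = 30) -> list[dict]:
    """Parse a one-word-per-cue SRT into [{text, startFrame, endFrame}, ...]."""
    words = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip malformed blocks
        start, end = (_seconds(t) for t in lines[1].split("-->"))
        words.append({
            "text": " ".join(lines[2:]),
            "startFrame": round(start * fps),
            "endFrame": round(end * fps),
        })
    return words
```

The resulting list can be serialized to JSON and passed as the `words` prop.
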
## Quality Checklist

- [ ] Every spoken word appears in a subtitle cue
- [ ] No cue exceeds the character limit for target format
- [ ] Subtitles in bottom 20% of frame, never covering the face
- [ ] Text readable on mobile at native resolution
- [ ] Timing matches speech: no early or late cues
- [ ] Cues don't overlap each other
- [ ] Outline/shadow provides sufficient contrast against all backgrounds

---

## Kolbo MCP Integration

| Task | Kolbo MCP Tool | Notes |
|------|---------------|-------|
| Transcribe → SRT | `transcribe_audio` | Returns `srt_url` (grouped) + `word_by_word_srt_url` |
| Word-level captions | `transcribe_audio` | `word_by_word_srt_url` for karaoke/Remotion |
| Burn-in to video | FFmpeg | Use SRT from transcription |
| Visual analysis | `chat_send_message` + Gemini | Analyze where speaker's face is for caption placement |

**Subtitle production workflow:**
1. `transcribe_audio` → get `srt_url` and `word_by_word_srt_url`
2. Download the SRT file
3. Adjust styling for target format (vertical vs horizontal)
4. Burn in with FFmpeg using the commands above
5. For Remotion: use `word_by_word_srt_url` with CaptionOverlay component

**For Remotion captions (preferred over FFmpeg burn-in):**
- Load the word-by-word SRT
- Use Remotion's CaptionOverlay for animated word highlighting
- See `remotion-best-practices` skill for details

---

## Local / Free Option

> **IMPORTANT:** Always use Kolbo's `transcribe_audio` by default; it returns both grouped SRT and word-by-word SRT with no setup. FFmpeg burn-in is safe to use directly. Only suggest local transcription if the user explicitly asks for offline/free. Confirm before installing.

**Offline transcription:** `faster-whisper` runs on CPU, no GPU needed (`pip install faster-whisper`). Supports word-level timestamps for subtitle generation.
@@ -0,0 +1,222 @@
"""
Full burn pipeline:
  For each chapter:
    1. Render SectionDivider card (4s) + mux with SFX
    2. Cut raw footage segment
    3. Render ChapterBanner still (PNG, single frame)
    4. Composite banner + animated progress bar onto footage
  Concatenate everything → final burned MP4
"""
import json, os, sys, subprocess

_root = r"G:\Projects\Master Agent"
for _p in [os.path.join(_root, 'core'), os.path.join(_root, 'agents', 'content-creation')]:
    if _p not in sys.path:
        sys.path.insert(0, _p)

import config
sys.path.insert(0, os.path.join(_root, 'agents', 'content-creation', 'modules'))
from remotion_render import render as remotion_render, render_still

# ── Config ───────────────────────────────────────────────────────────────────
SOURCE_VIDEO = r"C:\Users\Zohar\Downloads\מכללת ספיר H.264.mp4"
CHAPTERS_JSON = r"G:\Projects\Master Agent\ytp_jobs\sapir_test\chapters.json"
SFX_FILE = r"G:\Projects\Master Agent\ytp_jobs\sapir_test\sfx\v1_cinematic_eq.mp3"
WORK_DIR = r"G:\Projects\Master Agent\ytp_jobs\sapir_test\burn"
FINAL_OUTPUT = r"G:\Projects\Youtube Editings\renders\sapir_edited_final.mp4"
VIDEO_DURATION = 1041.1
FPS = 30
FFMPEG = "ffmpeg"
NVENC = True  # -bf 0 -rc-lookahead 0 eliminates encoder delay

os.makedirs(WORK_DIR, exist_ok=True)

# ── Load chapters ─────────────────────────────────────────────────────────────
with open(CHAPTERS_JSON, encoding='utf-8') as f:
    chapters = json.load(f)

# Each chapter ends where the next one starts; the last one ends with the video.
for i, ch in enumerate(chapters):
    ch['end_time'] = chapters[i + 1]['start_time'] if i + 1 < len(chapters) else VIDEO_DURATION
    ch['duration'] = ch['end_time'] - ch['start_time']

# ── Helpers ───────────────────────────────────────────────────────────────────
def _venc(cq=19):
    """Return video encoder args: NVENC (GPU, no delay) or libx264 fallback."""
    if NVENC:
        # -bf 0 -rc-lookahead 0: zero encoder delay → no A/V drift with -c:a copy
        return ["-c:v", "h264_nvenc", "-preset", "p4", "-cq", str(cq),
                "-bf", "0", "-rc-lookahead", "0"]
    return ["-c:v", "libx264", "-preset", "fast", "-crf", str(cq)]


def run(cmd, desc=""):
    print(f"[ffmpeg] {desc}", flush=True)
    r = subprocess.run(cmd, capture_output=True)
    if r.returncode != 0:
        raise RuntimeError(f"FAILED {desc}:\n{r.stderr.decode('utf-8','replace')[-600:]}")


def cut_footage(start, end, output):
    if os.path.exists(output):
        print(f"[skip] {os.path.basename(output)} exists")
        return
    duration = end - start
    # Dual-seek: fast input seek to 5s before target, then frame-accurate output seek
    # This gives exact A/V sync without decoding the full file from the beginning.
    pre = min(5.0, start)
    run([FFMPEG, "-y",
         "-ss", str(start - pre), "-i", SOURCE_VIDEO,
         "-ss", str(pre), "-t", str(duration),
         *_venc(),
         "-c:a", "copy",
         "-movflags", "+faststart", output],
        f"cut footage {start:.1f}s + {duration:.1f}s")


def mux_sfx(video, sfx, output, video_duration=4.0):
    if os.path.exists(output):
        print(f"[skip] {os.path.basename(output)} exists")
        return
    # Resample SFX to 48kHz to match source video, pad, then mux
    run([FFMPEG, "-y",
         "-i", video, "-i", sfx,
         "-filter_complex", f"[1:a]aresample=48000,apad=pad_dur={video_duration}[a]",
         "-map", "0:v", "-map", "[a]",
         *_venc(cq=12), "-c:a", "aac", "-b:a", "192k", "-ar", "48000",
         "-t", str(video_duration), output],
        f"mux SFX into {os.path.basename(video)}")


def composite_with_banner(footage, banner_png, output, duration_sec, is_hebrew=True):
    """Composite static banner PNG + animated progress bar.
    Uses crop+geq+overlay on just the bottom 4 rows: fast (7680 px/frame at 1920 wide, not ~2M).
    geq uses capital T for timestamp, avoiding conflict with drawbox's t=fill.
    """
    if os.path.exists(output):
        print(f"[skip] {os.path.basename(output)} exists")
        return
    d = float(duration_sec)
    # geq on a 4px strip: T=timestamp(secs), W=strip width, X=pixel x-coord
    # Hebrew RTL: fill right side first → X > W*(1 - T/D)
    # LTR: fill left side first → X < W*T/D
    if is_hebrew:
        bar_cond = f"gt(X,W*(1-T/{d}))"
    else:
        bar_cond = f"lt(X,W*T/{d})"

    bar_geq = (
        f"geq="
        f"r='if({bar_cond},59,r(X,Y))':"
        f"g='if({bar_cond},130,g(X,Y))':"
        f"b='if({bar_cond},246,b(X,Y))'"
    )

    # overlay=0:0 composites banner PNG onto footage
    # split → crop bottom 4px → geq colors the bar → overlay back at bottom
    fc = (
        f"[0:v][1:v]overlay=0:0:format=auto,format=yuv420p[base];"
        f"[base]split[main][bot_src];"
        f"[bot_src]crop=iw:4:0:ih-4[strip];"
        f"[strip]{bar_geq}[bar];"
        f"[main][bar]overlay=0:H-4[v]"
    )

    run([FFMPEG, "-y",
         "-i", footage,
         "-i", banner_png,
         "-filter_complex", fc,
         "-map", "[v]", "-map", "0:a",
         *_venc(),
         "-c:a", "copy",
         "-movflags", "+faststart", output],
        f"composite+bar {os.path.basename(output)}")


# ── Main loop ─────────────────────────────────────────────────────────────────
parts = []
total = len(chapters)

for ch in chapters:
    n = ch['chapter_number']
    dur = ch['duration']
    dur_frames = int(round(dur * FPS))

    print(f"\n{'='*60}")
    title_safe = ch['title'].encode('ascii','replace').decode('ascii')
    print(f"Chapter {n}/{total}: {title_safe} ({dur:.1f}s = {dur_frames} frames)")
    print('='*60)

    # ── 1. SectionDivider render ──────────────────────────────────────────
    divider_raw = os.path.join(WORK_DIR, f"ch{n:02d}_divider_raw.mp4")
    divider_sfx = os.path.join(WORK_DIR, f"ch{n:02d}_divider.mp4")

    if not os.path.exists(divider_raw):
        print(f"[render] SectionDivider ch{n}...")
        remotion_render(
            composition_id="SectionDivider-16x9",
            props={
                "chapterNumber": n,
                "title": ch['title'],
                "subtitle": ch.get('subtitle', ''),
                "language": "he",
                "durationInFrames": 120,
                "fps": FPS,
            },
            output_path=divider_raw,
            job_dir=WORK_DIR,
            alpha=False,
            concurrency=16,
        )
    else:
        print(f"[skip] divider ch{n} exists")

    mux_sfx(divider_raw, SFX_FILE, divider_sfx, video_duration=4.0)
    parts.append(divider_sfx)

    # ── 2. Cut raw footage ────────────────────────────────────────────────
    raw_clip = os.path.join(WORK_DIR, f"ch{n:02d}_raw.mp4")
    cut_footage(ch['start_time'], ch['end_time'], raw_clip)

    # ── 3. ChapterBanner still PNG render (single frame, fast) ───────────
    banner_png = os.path.join(WORK_DIR, f"ch{n:02d}_banner.png")

    if not os.path.exists(banner_png):
        print(f"[still] ChapterBanner ch{n}...")
        render_still(
            composition_id="ChapterBanner-16x9",
            props={
                "chapterNumber": n,
                "title": ch['title'],
                "language": "he",
            },
            output_path=banner_png,
            job_dir=WORK_DIR,
        )
    else:
        print(f"[skip] banner ch{n} exists")

    # ── 4. Composite banner + progress bar onto footage ───────────────────
    composited = os.path.join(WORK_DIR, f"ch{n:02d}_composited.mp4")
    composite_with_banner(raw_clip, banner_png, composited, dur, is_hebrew=True)
    parts.append(composited)

# ── Concatenate all parts ─────────────────────────────────────────────────────
print(f"\n{'='*60}")
print(f"Concatenating {len(parts)} clips...")

concat_list = os.path.join(WORK_DIR, "concat.txt")
with open(concat_list, 'w', encoding='utf-8') as f:
    for p in parts:
        f.write(f"file '{p}'\n")

run([FFMPEG, "-y",
     "-f", "concat", "-safe", "0",
     "-i", concat_list,
     *_venc(),
     "-c:a", "copy",
     "-movflags", "+faststart", FINAL_OUTPUT],
    f"final concat -> {FINAL_OUTPUT}")

size_mb = os.path.getsize(FINAL_OUTPUT) / 1024 / 1024
print(f"\nDone! Final output: {FINAL_OUTPUT}")
print(f"Size: {size_mb:.0f} MB")
@@ -0,0 +1,127 @@
"""
Export SRT subtitle files for each video and each individual chapter.
- Full video SRT: placed next to the edited MP4
- Per-chapter SRTs: placed in the job's chapters_dir, timestamps relative to the
  chapter clip (which starts with the 4s divider card)
"""
import json, os

def fmt_srt_time(seconds):
    """Format seconds as SRT timestamp: HH:MM:SS,mmm"""
    # Work in whole milliseconds so a value such as 59.9996 cannot render as "60,000".
    total_ms = int(round(seconds * 1000))
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def sentences_to_srt(sentences, offset=0.0):
    """Convert sentence list to SRT text, subtracting offset from all timestamps."""
    blocks = []
    for i, s in enumerate(sentences, 1):
        start = max(0.0, s['start'] - offset)
        end = max(0.0, s['end'] - offset)
        blocks.append(f"{i}\n{fmt_srt_time(start)} --> {fmt_srt_time(end)}\n{s['text']}\n")
    return '\n'.join(blocks)


def get_sentences_in_range(sentences, start_time, end_time):
    """Get all sentences that overlap with the given time range."""
    result = []
    for s in sentences:
        # Include sentence if it overlaps with the range
        if s['end'] > start_time and s['start'] < end_time:
            result.append(s)
    return result


JOBS = [
    {
        "name": "lior_course_01",
        "transcript": r"G:\Projects\Master Agent\ytp_jobs\lior_course_01\transcript.json",
        "chapters_json": r"G:\Projects\Master Agent\ytp_jobs\lior_course_01\chapters.json",
        "full_srt_path": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\1 - היכרות עם Kolbo.AI - edited.srt",
        "chapters_dir": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\1 - היכרות עם Kolbo.AI",
        "video_duration": 540.2,
    },
    {
        "name": "lior_course_02",
        "transcript": r"G:\Projects\Master Agent\ytp_jobs\lior_course_02\transcript.json",
        "chapters_json": r"G:\Projects\Master Agent\ytp_jobs\lior_course_02\chapters.json",
        "full_srt_path": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\2 - הסבר על פרוייקטים - edited.srt",
        "chapters_dir": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\2 - הסבר על פרוייקטים",
        "video_duration": 1643.0,
    },
    {
        "name": "lior_course_03",
        "transcript": r"G:\Projects\Master Agent\ytp_jobs\lior_course_03\transcript.json",
        "chapters_json": r"G:\Projects\Master Agent\ytp_jobs\lior_course_03\chapters.json",
        "full_srt_path": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\3 - כלי הצאט - edited.srt",
        "chapters_dir": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\3 - כלי הצאט",
        "video_duration": 2789.4,
    },
]

for job in JOBS:
    print(f"\n{'='*60}")
    print(f"SRT export: {job['name']}")
    print('='*60)

    with open(job['transcript'], encoding='utf-8') as f:
        transcript = json.load(f)
    sentences = transcript['sentences']

    with open(job['chapters_json'], encoding='utf-8') as f:
        chapters = json.load(f)

    # Compute end times
    for i, ch in enumerate(chapters):
        ch['end_time'] = chapters[i + 1]['start_time'] if i + 1 < len(chapters) else job['video_duration']

    # ── Full video SRT ────────────────────────────────────────────────────
    # Note: the burned video has 4s divider cards inserted before each chapter.
    # We need to offset all timestamps to account for the accumulated divider time.
    full_blocks = []
    block_num = 1
    accumulated_divider_time = 0.0
    DIVIDER_DURATION = 4.0

    for ch in chapters:
        ch_start = ch['start_time']
        ch_end = ch['end_time']
        ch_sentences = get_sentences_in_range(sentences, ch_start, ch_end)

        # Each chapter is preceded by a 4s divider
        accumulated_divider_time += DIVIDER_DURATION

        for s in ch_sentences:
            start = s['start'] + accumulated_divider_time
            end = s['end'] + accumulated_divider_time
            full_blocks.append(
                f"{block_num}\n{fmt_srt_time(start)} --> {fmt_srt_time(end)}\n{s['text']}\n"
            )
            block_num += 1

    with open(job['full_srt_path'], 'w', encoding='utf-8') as f:
        f.write('\n'.join(full_blocks))
    print(f" Full SRT: {block_num - 1} blocks")

    # ── Per-chapter SRTs ──────────────────────────────────────────────────
    for ch in chapters:
        n = ch['chapter_number']
        title = ch['title']
        safe_title = title.replace('/', '-').replace('\\', '-').replace(':', '-').replace('"', '').replace('?', '').replace('*', '').replace('<', '').replace('>', '').replace('|', '')

        ch_start = ch['start_time']
        ch_end = ch['end_time']
        ch_sentences = get_sentences_in_range(sentences, ch_start, ch_end)

        # Offset = chapter start time (zero the timestamps to chapter-local time)
        # Add 4s for the divider card at the beginning of each chapter clip
        srt_text = sentences_to_srt(ch_sentences, offset=ch_start - DIVIDER_DURATION)

        srt_path = os.path.join(job['chapters_dir'], f"{n:02d} - {safe_title}.srt")
        with open(srt_path, 'w', encoding='utf-8') as f:
            f.write(srt_text)

        print(f" Ch {n:02d}: {len(ch_sentences)} blocks")

print("\nAll SRTs exported!")
@@ -0,0 +1,42 @@
"""Generate SRT from transcript.json by grouping word-level timestamps into cues."""
import json

# ── SRT generation ────────────────────────────────────────────────────────

with open('ytp_jobs/sapir_test/transcript.json', encoding='utf-8') as f:
    data = json.load(f)

words = data['words']

def fmt_time(s):
    # Work in whole milliseconds so a value such as 59.9996 cannot render as "60,000".
    total_ms = int(round(s * 1000))
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    sec, ms = divmod(rem, 1000)
    return f'{h:02d}:{m:02d}:{sec:02d},{ms:03d}'

# Start a new cue after 8 words or when the pause before the next word exceeds 1.5s.
lines, current, cur_start = [], [], None
for i, w in enumerate(words):
    if not current:
        cur_start = w['start']
    current.append(w['text'])
    # The sentinel gap of 999 makes the last word always close the open cue.
    gap = words[i + 1]['start'] - w['end'] if i + 1 < len(words) else 999
    if len(current) >= 8 or gap > 1.5:
        lines.append({'start': cur_start, 'end': w['end'], 'text': ' '.join(current)})
        current, cur_start = [], None

if current:
    lines.append({'start': cur_start, 'end': words[-1]['end'], 'text': ' '.join(current)})

srt_blocks = []
for i, cue in enumerate(lines, 1):
    srt_blocks.append(f"{i}\n{fmt_time(cue['start'])} --> {fmt_time(cue['end'])}\n{cue['text']}\n")

srt_text = '\n'.join(srt_blocks)

with open('ytp_jobs/sapir_test/sapir.srt', 'w', encoding='utf-8') as f:
    f.write(srt_text)

print(f"[srt] Written {len(lines)} subtitle blocks -> ytp_jobs/sapir_test/sapir.srt")
print(f"[srt] First 3 blocks:")
for cue in lines[:3]:
    print(f" [{fmt_time(cue['start'])} --> {fmt_time(cue['end'])}] {cue['text'][:70]}")