@kolbo/kolbo-code-linux-arm64-musl 2.0.0 → 2.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,244 @@
1
+ ---
2
+ name: subtitle-production
3
+ description: >
4
+ Subtitle and caption production: timing strategies, cue length by format (vertical vs horizontal),
5
+ ASS/SRT styling, word-level timing, RTL support for Hebrew/Arabic, burn-in with FFmpeg,
6
+ readability rules. Use when generating, styling, or burning in subtitles.
7
+ Keywords: subtitle, caption, SRT, ASS, VTT, timing, burn-in, word-level, karaoke, RTL,
8
+ Hebrew, Arabic, font size, cue, readability
9
+ ---
10
+
11
+ # Subtitle & Caption Production
12
+
13
+ ## Output Formats
14
+
15
+ | Format | Extension | Use Case |
16
+ |--------|-----------|----------|
17
+ | SRT | `.srt` | Universal — FFmpeg, players, YouTube upload |
18
+ | VTT | `.vtt` | Web-native — HTML5 video, browser playback |
19
+ | ASS | `.ass` | Advanced styling, RTL support, per-word positioning |
20
+
21
+ ## Cue Length by Format
22
+
23
+ ### Vertical Short-Form (TikTok, Reels, Shorts)
24
+ - **Max 3-4 words per cue** — narrow screen, text must be large
25
+ - **Max 20 characters per line**
26
+ - Subtitles are **mandatory** (85% watch muted)
27
+
28
+ ### Horizontal Standard (YouTube, web)
29
+ - **Max 6-8 words per cue** — wider screen
30
+ - **Max 42 characters per line** (broadcast standard)
31
+
32
+ ### General Rules
33
+ - Average viewer reads ~15 characters/second
34
+ - Minimum display time: 0.5 seconds per cue
35
+ - Maximum display time: 5 seconds per cue
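As a sketch of how these limits combine (helper names are illustrative, not part of the bundled scripts): greedy word grouping against the per-format limits, plus display time derived from character count and clamped to the 0.5–5 s window:

```python
def cue_duration(text, cps=15, min_s=0.5, max_s=5.0):
    # ~15 characters/second reading speed, clamped to the 0.5-5s display window
    return max(min_s, min(max_s, len(text) / cps))

def group_words(words, max_words=4, max_chars=20):
    # Vertical defaults (3-4 words, 20 chars/line); pass max_words=8, max_chars=42
    # for horizontal layouts
    cues, cur = [], []
    for w in words:
        if cur and (len(cur) >= max_words or len(' '.join(cur + [w])) > max_chars):
            cues.append(' '.join(cur))
            cur = []
        cur.append(w)
    if cur:
        cues.append(' '.join(cur))
    return cues
```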
36
+
37
+ ## Styling for Burn-in
38
+
39
+ ### Vertical Video (1080x1920)
40
+ ```
41
+ font: Arial (or Heebo Bold for Hebrew)
42
+ font_size: 18
43
+ bold: true
44
+ primary_color: &H00FFFFFF (white, ASS format)
45
+ outline_color: &H00000000 (black)
46
+ outline_width: 3 (thick for readability)
47
+ shadow: 2
48
+ margin_v: 50
49
+ alignment: 2 (bottom center)
50
+ ```
51
+
52
+ ### Horizontal Video (1920x1080)
53
+ ```
54
+ font: Arial
55
+ font_size: 22
56
+ bold: true
57
+ primary_color: &H00FFFFFF
58
+ outline_color: &H00000000
59
+ outline_width: 2
60
+ shadow: 1
61
+ margin_v: 40
62
+ alignment: 2
63
+ ```
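If you emit the ASS header programmatically, the presets above map onto a single V4+ `Style:` line. A sketch under the assumption that the standard V4+ field order applies (`ass_style` is a hypothetical helper, not a bundled function):

```python
def ass_style(name="Default", font="Arial", size=22, outline=2, shadow=1, margin_v=40):
    # V4+ field order: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour,
    # OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY,
    # Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR,
    # MarginV, Encoding. Colors are the full 8-digit &HAABBGGRR form.
    return (f"Style: {name},{font},{size},&H00FFFFFF,&H00FFFFFF,&H00000000,&H00000000,"
            f"-1,0,0,0,100,100,0,0,1,{outline},{shadow},2,10,10,{margin_v},1")
```

Pass `size=18, outline=3, shadow=2, margin_v=50` to reproduce the vertical preset.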
64
+
65
+ ### Common Mistakes
66
+ - **Wrong color format:** `&HFFFFFF` breaks positioning. Always use the full 8-digit `&H00FFFFFF` (ASS colors are `&HAABBGGRR`)
67
+ - **Font too large on vertical:** `font_size: 28` fills center of 9:16. Use 18 max
68
+ - **Too many words per cue on vertical:** 5+ words creates multi-line blocks covering the face
69
+ - **MarginV too large:** Values over 200 push text off-screen. Stay under 100
70
+
71
+ ## Timing Best Practices
72
+
73
+ - Cue start must match word onset (not before the speaker starts)
74
+ - Cue end should extend ~200ms past the last word for comfortable reading
75
+ - Never let a cue linger into the next speaker's turn
76
+ - Don't split a thought across two cues if it fits in one
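The tail-padding and no-overlap rules can be applied in one pass; a sketch with a hypothetical `pad_cue_ends` (cues assumed sorted by start time):

```python
def pad_cue_ends(cues, pad=0.2):
    # Extend each cue ~200ms past its last word, but never into the next cue
    padded = []
    for i, c in enumerate(cues):
        next_start = cues[i + 1]['start'] if i + 1 < len(cues) else float('inf')
        padded.append({**c, 'end': min(c['end'] + pad, next_start)})
    return padded
```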
77
+
78
+ ## FFmpeg Burn-in Commands
79
+
80
+ ### Simple SRT
81
+ ```bash
82
+ ffmpeg -i input.mp4 -vf "subtitles=subs.srt:force_style='FontSize=22,Bold=1,PrimaryColour=&H00FFFFFF,OutlineColour=&H00000000,Outline=2'" -c:v libx264 -crf 18 -c:a copy output.mp4
83
+ ```
84
+
85
+ ### ASS with Custom Styling
86
+ ```bash
87
+ ffmpeg -i input.mp4 -vf "ass=styled_subs.ass" -c:v libx264 -crf 18 -c:a copy output.mp4
88
+ ```
89
+
90
+ ### Windows Path Escaping
91
+ ```bash
92
+ # Escape colons in subtitle filter paths on Windows
93
+ ffmpeg -i input.mp4 -vf "subtitles=C\\:/Users/path/subs.srt" output.mp4
94
+ ```
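The same escaping can be done in code when building the filter string; a sketch (hypothetical helper) that normalizes a Windows path to forward slashes and escapes the drive colon, matching the command above:

```python
def subtitles_filter(path):
    # FFmpeg filter options treat ':' as a separator, so the drive colon must be
    # escaped; forward slashes avoid a second round of backslash escaping
    escaped = path.replace('\\', '/').replace(':', r'\:')
    return f"subtitles={escaped}"
```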
95
+
96
+ ## RTL (Hebrew/Arabic) — Proven Patterns
97
+
98
+ RTL subtitles are tricky. These patterns are battle-tested in Kolbo's video production pipeline.
99
+
100
+ **Reference implementations (bundled in `./reference/`):**
101
+ - `reference/burn_to_video.py` — Full burn pipeline with RTL progress bar (`geq` filter), chapter compositing, NVENC encoding
102
+ - `reference/export_srts.py` — SRT generation with chapter divider offset accounting
103
+ - `reference/gen_srt.py` — Word-level SRT from transcript JSON (8-word grouping, 1.5s gap detection)
104
+
105
+ ### Option 1: SRT with Simple Burn-in (easiest, works for most cases)
106
+
107
+ Plain SRT files work for Hebrew/Arabic if you use the right font and let FFmpeg's libass handle bidi:
108
+ ```bash
109
+ ffmpeg -i input.mp4 -vf "subtitles=subs.srt:force_style='FontName=Heebo,FontSize=22,Bold=1,Encoding=177,PrimaryColour=&H00FFFFFF,OutlineColour=&H00000000,Outline=2'" -c:v libx264 -crf 18 -c:a copy output.mp4
110
+ ```
111
+ - **Font**: Heebo Bold for Hebrew, Cairo Bold for Arabic
112
+ - **Encoding=177** (Hebrew) or **Encoding=178** (Arabic) in ASS style
113
+
114
+ ### Option 2: ASS with Per-Word Positioning (for karaoke/highlighting)
115
+
116
+ When you need per-word color highlighting with RTL text, you MUST use separate ASS Dialogue lines per word:
117
+
118
+ - Each word gets its own `Dialogue` line with explicit `\pos(x,y)`
119
+ - Use PIL to measure word widths: apply `~0.74` scale factor (PIL→libass calibration)
120
+ - Use `Alignment=7` (top-left anchor) so `\pos` sets exact top-left of each word
121
+ - Two named ASS styles (e.g., White + Yellow) for highlight vs inactive — NO inline `\c` tags
122
+
123
+ **CRITICAL:** Any inline ASS tag (`\c`, `\K`, `\1c`) between RTL words **breaks Unicode bidi in libass** — words render LTR instead of RTL. Always use separate Dialogue lines per word.
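The per-word scheme can be sketched in a few lines (hypothetical helpers; word widths assumed pre-measured with PIL and already scaled by ~0.74; `White`/`Yellow` styles assumed defined with `Alignment=7`):

```python
def ass_time(s):
    # ASS timestamp: H:MM:SS.cc (centiseconds)
    h, m = int(s // 3600), int(s % 3600 // 60)
    return f"{h}:{m:02d}:{s % 60:05.2f}"

def rtl_word_dialogues(words, y=960, right_edge=1000, gap=12):
    # words: [(text, start_s, end_s, px_width)] — one Dialogue line per word,
    # placed right-to-left from the right edge; no inline tags between words
    lines, x = [], right_edge
    for text, start, end, width in words:
        x -= width  # RTL: each word sits left of the previous one
        lines.append(f"Dialogue: 0,{ass_time(start)},{ass_time(end)},White,,0,0,0,,"
                     f"{{\\pos({x:.0f},{y})}}{text}")
        x -= gap
    return lines
```

Highlighting a word is then just emitting its Dialogue line with the `Yellow` style name instead of `White`.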
124
+
125
+ ### Option 3: Remotion Captions (best for karaoke, full RTL control)
126
+
127
+ Remotion gives you full CSS control over RTL text. Proven pattern from Kolbo's video pipeline:
128
+
129
+ ```tsx
+ // Detect language and set direction
+ const isHebrew = language === "he" || language === "iw";
+ const fontFamily = isHebrew ? "'Heebo', sans-serif" : "'Poppins', sans-serif";
+ 
+ // Root container
+ <div style={{
+   direction: isHebrew ? "rtl" : "ltr",
+   fontFamily,
+   textTransform: isHebrew ? "none" : "uppercase",
+   letterSpacing: isHebrew ? 0 : -2,
+ }}>
+   {words.map((word, i) => {
+     const progress = interpolate(frame, [word.startFrame, word.endFrame], [0, 1], {
+       extrapolateLeft: "clamp", extrapolateRight: "clamp"
+     });
+     return (
+       <span key={i} style={{
+         color: progress > 0 ? accentColor : "#ffffff",
+         transition: "none", // No CSS transitions in Remotion!
+       }}>
+         {word.text}{" "}
+       </span>
+     );
+   })}
+ </div>
+ ```
156
+
157
+ **RTL-specific gotchas in Remotion (proven fixes):**
158
+ - Flip `paddingLeft` ↔ `paddingRight` when Hebrew
159
+ - Flip `transformOrigin`: `"top left"` → `"top right"` for Hebrew
160
+ - Gradient directions: `270deg` (RTL) vs `90deg` (LTR)
161
+ - Position logic: for Hebrew, "left" position actually means right side of screen
162
+ - `letterSpacing: 0` for Hebrew (negative kerning looks wrong with Hebrew fonts)
163
+ - `textTransform: "none"` for Hebrew (uppercase has no meaning in Hebrew)
164
+
165
+ ### RTL Progress Bar (FFmpeg)
166
+
167
+ Animated progress bar that fills right-to-left for Hebrew, using `geq` filter:
168
+
169
+ ```python
+ duration = 5.0  # seconds
+ 
+ # Hebrew (RTL): bar fills RIGHT → LEFT
+ bar_cond = f"gt(X,W*(1-T/{duration}))"
+ 
+ # English (LTR): bar fills LEFT → RIGHT
+ bar_cond = f"lt(X,W*T/{duration})"
+ 
+ # Apply as geq filter on bottom 4px strip (performant: 5760px/frame not 2M)
+ bar_geq = (
+     f"geq="
+     f"r='if({bar_cond},59,r(X,Y))':"  # #3b82f6 blue
+     f"g='if({bar_cond},130,g(X,Y))':"
+     f"b='if({bar_cond},246,b(X,Y))'"
+ )
+ ```
186
+ Uses capital `T` for timestamp in `geq` — avoids conflict with drawbox's `t=fill`.
187
+
188
+ ### Language Detection
189
+
190
+ ```python
191
+ _lang_map = {"heb": "he", "eng": "en", "iw": "he", "ara": "ar", "rus": "ru"}
192
+ language_code = _lang_map.get(raw_lang, raw_lang)
193
+ is_rtl = language_code in ("he", "ar", "fa", "ur")
194
+ ```
195
+
196
+ ## Word-Level Timing (Karaoke / Motion Graphics)
197
+
198
+ For word-by-word highlighting:
199
+ 1. `transcribe_audio` via Kolbo MCP → get `word_by_word_srt_url` (ElevenLabs Scribe word-level timestamps)
200
+ 2. Each word has precise start/end timing
201
+ 3. Group words into display cues (8+ words or >1.5s gap triggers new line)
202
+ 4. **For Remotion**: use word timings directly as props — CSS `direction: rtl` handles Hebrew ordering automatically
203
+ 5. **For FFmpeg**: use ASS with per-word Dialogue lines (see Option 2 above)
204
+
205
+ ## Quality Checklist
206
+
207
+ - [ ] Every spoken word appears in a subtitle cue
208
+ - [ ] No cue exceeds the character limit for target format
209
+ - [ ] Subtitles in bottom 20% of frame — never covering the face
210
+ - [ ] Text readable on mobile at native resolution
211
+ - [ ] Timing matches speech — no early or late cues
212
+ - [ ] Cues don't overlap each other
213
+ - [ ] Outline/shadow provides sufficient contrast against all backgrounds
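Several of these checks are mechanical; a minimal sketch (hypothetical `lint_cues`, cues assumed sorted by start time):

```python
def lint_cues(cues, max_chars=42, min_s=0.5, max_s=5.0):
    # cues: [{'start','end','text'}] — returns human-readable rule violations
    issues = []
    for i, c in enumerate(cues):
        dur = c['end'] - c['start']
        if dur < min_s or dur > max_s:
            issues.append(f"cue {i}: duration {dur:.2f}s outside [{min_s}, {max_s}]")
        if any(len(line) > max_chars for line in c['text'].splitlines()):
            issues.append(f"cue {i}: line over {max_chars} chars")
        if i + 1 < len(cues) and c['end'] > cues[i + 1]['start']:
            issues.append(f"cue {i}: overlaps cue {i + 1}")
    return issues
```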
214
+
215
+ ---
216
+
217
+ ## Kolbo MCP Integration
218
+
219
+ | Task | Kolbo MCP Tool | Notes |
220
+ |------|---------------|-------|
221
+ | Transcribe → SRT | `transcribe_audio` | Returns `srt_url` (grouped) + `word_by_word_srt_url` |
222
+ | Word-level captions | `transcribe_audio` | `word_by_word_srt_url` for karaoke/Remotion |
223
+ | Burn-in to video | FFmpeg | Use SRT from transcription |
224
+ | Visual analysis | `chat_send_message` + Gemini | Analyze where speaker's face is for caption placement |
225
+
226
+ **Subtitle production workflow:**
227
+ 1. `transcribe_audio` → get `srt_url` and `word_by_word_srt_url`
228
+ 2. Download the SRT file
229
+ 3. Adjust styling for target format (vertical vs horizontal)
230
+ 4. Burn in with FFmpeg using the commands above
231
+ 5. For Remotion: use `word_by_word_srt_url` with CaptionOverlay component
232
+
233
+ **For Remotion captions (preferred over FFmpeg burn-in):**
234
+ - Load the word-by-word SRT
235
+ - Use Remotion's CaptionOverlay for animated word highlighting
236
+ - See `remotion-best-practices` skill for details
237
+
238
+ ---
239
+
240
+ ## Local / Free Option
241
+
242
+ > **IMPORTANT:** Always use Kolbo's `transcribe_audio` by default — it returns both grouped SRT and word-by-word SRT with no setup. FFmpeg burn-in is safe to use directly. Only suggest local transcription if the user explicitly asks for offline/free. Confirm before installing.
243
+
244
+ **Offline transcription:** `faster-whisper` runs on CPU, no GPU needed (`pip install faster-whisper`). Supports word-level timestamps for subtitle generation.
reference/burn_to_video.py
@@ -0,0 +1,222 @@
+ """
+ Full burn pipeline:
+ For each chapter:
+   1. Render SectionDivider card (4s) + mux with SFX
+   2. Cut raw footage segment
+   3. Render ChapterProgress banner (ProRes alpha, exact chapter duration)
+   4. Composite banner onto footage
+ Concatenate everything → final burned MP4
+ """
+ import json, os, sys, subprocess
+ 
+ _root = r"G:\Projects\Master Agent"
+ for _p in [os.path.join(_root, 'core'), os.path.join(_root, 'agents', 'content-creation')]:
+     if _p not in sys.path:
+         sys.path.insert(0, _p)
+ 
+ import config
+ sys.path.insert(0, os.path.join(_root, 'agents', 'content-creation', 'modules'))
+ from remotion_render import render as remotion_render, render_still
+ 
+ # ── Config ───────────────────────────────────────────────────────────────────
+ SOURCE_VIDEO = r"C:\Users\Zohar\Downloads\מכללת ספיר H.264.mp4"
+ CHAPTERS_JSON = r"G:\Projects\Master Agent\ytp_jobs\sapir_test\chapters.json"
+ SFX_FILE = r"G:\Projects\Master Agent\ytp_jobs\sapir_test\sfx\v1_cinematic_eq.mp3"
+ WORK_DIR = r"G:\Projects\Master Agent\ytp_jobs\sapir_test\burn"
+ FINAL_OUTPUT = r"G:\Projects\Youtube Editings\renders\sapir_edited_final.mp4"
+ VIDEO_DURATION = 1041.1
+ FPS = 30
+ FFMPEG = "ffmpeg"
+ NVENC = True  # -bf 0 -rc-lookahead 0 eliminates encoder delay
+ 
+ os.makedirs(WORK_DIR, exist_ok=True)
+ 
+ # ── Load chapters ─────────────────────────────────────────────────────────────
+ with open(CHAPTERS_JSON, encoding='utf-8') as f:
+     chapters = json.load(f)
+ 
+ for i, ch in enumerate(chapters):
+     ch['end_time'] = chapters[i + 1]['start_time'] if i + 1 < len(chapters) else VIDEO_DURATION
+     ch['duration'] = ch['end_time'] - ch['start_time']
+ 
+ # ── Helpers ───────────────────────────────────────────────────────────────────
+ def _venc(cq=19):
+     """Return video encoder args — NVENC (GPU, no delay) or libx264 fallback."""
+     if NVENC:
+         # -bf 0 -rc-lookahead 0: zero encoder delay → no A/V drift with -c:a copy
+         return ["-c:v", "h264_nvenc", "-preset", "p4", "-cq", str(cq),
+                 "-bf", "0", "-rc-lookahead", "0"]
+     return ["-c:v", "libx264", "-preset", "fast", "-crf", str(cq)]
+ 
+ 
+ def run(cmd, desc=""):
+     print(f"[ffmpeg] {desc}", flush=True)
+     r = subprocess.run(cmd, capture_output=True)
+     if r.returncode != 0:
+         raise RuntimeError(f"FAILED {desc}:\n{r.stderr.decode('utf-8', 'replace')[-600:]}")
+ 
+ 
+ def cut_footage(start, end, output):
+     if os.path.exists(output):
+         print(f"[skip] {os.path.basename(output)} exists")
+         return
+     duration = end - start
+     # Dual-seek: fast input seek to 5s before target, then frame-accurate output seek.
+     # This gives exact A/V sync without decoding the full file from the beginning.
+     pre = min(5.0, start)
+     run([FFMPEG, "-y",
+          "-ss", str(start - pre), "-i", SOURCE_VIDEO,
+          "-ss", str(pre), "-t", str(duration),
+          *_venc(),
+          "-c:a", "copy",
+          "-movflags", "+faststart", output],
+         f"cut footage {start:.1f}s + {duration:.1f}s")
+ 
+ 
+ def mux_sfx(video, sfx, output, video_duration=4.0):
+     if os.path.exists(output):
+         print(f"[skip] {os.path.basename(output)} exists")
+         return
+     # Resample SFX to 48kHz to match source video, pad, then mux
+     run([FFMPEG, "-y",
+          "-i", video, "-i", sfx,
+          "-filter_complex", f"[1:a]aresample=48000,apad=pad_dur={video_duration}[a]",
+          "-map", "0:v", "-map", "[a]",
+          *_venc(cq=12), "-c:a", "aac", "-b:a", "192k", "-ar", "48000",
+          "-t", str(video_duration), output],
+         f"mux SFX into {os.path.basename(video)}")
+ 
+ 
+ def composite_with_banner(footage, banner_png, output, duration_sec, is_hebrew=True):
+     """Composite static banner PNG + animated progress bar.
+     Uses crop+geq+overlay on just the bottom 4 rows — fast (5760 px/frame not 2M).
+     geq uses capital T for timestamp, avoiding conflict with drawbox's t=fill.
+     """
+     if os.path.exists(output):
+         print(f"[skip] {os.path.basename(output)} exists")
+         return
+     d = float(duration_sec)
+     # geq on a 4px strip: T=timestamp(secs), W=strip width, X=pixel x-coord
+     # Hebrew RTL: fill right side first → X > W*(1 - T/D)
+     # LTR: fill left side first → X < W*T/D
+     if is_hebrew:
+         bar_cond = f"gt(X,W*(1-T/{d}))"
+     else:
+         bar_cond = f"lt(X,W*T/{d})"
+ 
+     bar_geq = (
+         f"geq="
+         f"r='if({bar_cond},59,r(X,Y))':"
+         f"g='if({bar_cond},130,g(X,Y))':"
+         f"b='if({bar_cond},246,b(X,Y))'"
+     )
+ 
+     # overlay=0:0 composites banner PNG onto footage
+     # split → crop bottom 4px → geq colors the bar → overlay back at bottom
+     fc = (
+         f"[0:v][1:v]overlay=0:0:format=auto,format=yuv420p[base];"
+         f"[base]split[main][bot_src];"
+         f"[bot_src]crop=iw:4:0:ih-4[strip];"
+         f"[strip]{bar_geq}[bar];"
+         f"[main][bar]overlay=0:H-4[v]"
+     )
+ 
+     run([FFMPEG, "-y",
+          "-i", footage,
+          "-i", banner_png,
+          "-filter_complex", fc,
+          "-map", "[v]", "-map", "0:a",
+          *_venc(),
+          "-c:a", "copy",
+          "-movflags", "+faststart", output],
+         f"composite+bar {os.path.basename(output)}")
+ 
+ 
+ # ── Main loop ─────────────────────────────────────────────────────────────────
+ parts = []
+ total = len(chapters)
+ 
+ for ch in chapters:
+     n = ch['chapter_number']
+     dur = ch['duration']
+     dur_frames = int(round(dur * FPS))
+ 
+     print(f"\n{'='*60}")
+     title_safe = ch['title'].encode('ascii', 'replace').decode('ascii')
+     print(f"Chapter {n}/{total}: {title_safe} ({dur:.1f}s = {dur_frames} frames)")
+     print('='*60)
+ 
+     # ── 1. SectionDivider render ──────────────────────────────────────────
+     divider_raw = os.path.join(WORK_DIR, f"ch{n:02d}_divider_raw.mp4")
+     divider_sfx = os.path.join(WORK_DIR, f"ch{n:02d}_divider.mp4")
+ 
+     if not os.path.exists(divider_raw):
+         print(f"[render] SectionDivider ch{n}...")
+         remotion_render(
+             composition_id="SectionDivider-16x9",
+             props={
+                 "chapterNumber": n,
+                 "title": ch['title'],
+                 "subtitle": ch.get('subtitle', ''),
+                 "language": "he",
+                 "durationInFrames": 120,
+                 "fps": FPS,
+             },
+             output_path=divider_raw,
+             job_dir=WORK_DIR,
+             alpha=False,
+             concurrency=16,
+         )
+     else:
+         print(f"[skip] divider ch{n} exists")
+ 
+     mux_sfx(divider_raw, SFX_FILE, divider_sfx, video_duration=4.0)
+     parts.append(divider_sfx)
+ 
+     # ── 2. Cut raw footage ────────────────────────────────────────────────
+     raw_clip = os.path.join(WORK_DIR, f"ch{n:02d}_raw.mp4")
+     cut_footage(ch['start_time'], ch['end_time'], raw_clip)
+ 
+     # ── 3. ChapterBanner still PNG render (single frame, fast) ───────────
+     banner_png = os.path.join(WORK_DIR, f"ch{n:02d}_banner.png")
+ 
+     if not os.path.exists(banner_png):
+         print(f"[still] ChapterBanner ch{n}...")
+         render_still(
+             composition_id="ChapterBanner-16x9",
+             props={
+                 "chapterNumber": n,
+                 "title": ch['title'],
+                 "language": "he",
+             },
+             output_path=banner_png,
+             job_dir=WORK_DIR,
+         )
+     else:
+         print(f"[skip] banner ch{n} exists")
+ 
+     # ── 4. Composite banner + progress bar onto footage ───────────────────
+     composited = os.path.join(WORK_DIR, f"ch{n:02d}_composited.mp4")
+     composite_with_banner(raw_clip, banner_png, composited, dur, is_hebrew=True)
+     parts.append(composited)
+ 
+ # ── Concatenate all parts ─────────────────────────────────────────────────────
+ print(f"\n{'='*60}")
+ print(f"Concatenating {len(parts)} clips...")
+ 
+ concat_list = os.path.join(WORK_DIR, "concat.txt")
+ with open(concat_list, 'w', encoding='utf-8') as f:
+     for p in parts:
+         f.write(f"file '{p}'\n")
+ 
+ run([FFMPEG, "-y",
+      "-f", "concat", "-safe", "0",
+      "-i", concat_list,
+      *_venc(),
+      "-c:a", "copy",
+      "-movflags", "+faststart", FINAL_OUTPUT],
+     f"final concat -> {FINAL_OUTPUT}")
+ 
+ size_mb = os.path.getsize(FINAL_OUTPUT) / 1024 / 1024
+ print(f"\nDone! Final output: {FINAL_OUTPUT}")
+ print(f"Size: {size_mb:.0f} MB")
reference/export_srts.py
@@ -0,0 +1,127 @@
+ """
+ Export SRT subtitle files for each video and each individual chapter.
+ - Full video SRT: placed next to the edited MP4
+ - Per-chapter SRTs: placed in each chapter's subfolder, timestamps zeroed to chapter start
+ """
+ import json, os
+ 
+ def fmt_srt_time(seconds):
+     """Format seconds as SRT timestamp: HH:MM:SS,mmm"""
+     h = int(seconds // 3600)
+     m = int((seconds % 3600) // 60)
+     s = seconds % 60
+     return f"{h:02d}:{m:02d}:{s:06.3f}".replace('.', ',')
+ 
+ 
+ def sentences_to_srt(sentences, offset=0.0):
+     """Convert sentence list to SRT text, subtracting offset from all timestamps."""
+     blocks = []
+     for i, s in enumerate(sentences, 1):
+         start = max(0.0, s['start'] - offset)
+         end = max(0.0, s['end'] - offset)
+         blocks.append(f"{i}\n{fmt_srt_time(start)} --> {fmt_srt_time(end)}\n{s['text']}\n")
+     return '\n'.join(blocks)
+ 
+ 
+ def get_sentences_in_range(sentences, start_time, end_time):
+     """Get all sentences that overlap with the given time range."""
+     result = []
+     for s in sentences:
+         # Include sentence if it overlaps with the range
+         if s['end'] > start_time and s['start'] < end_time:
+             result.append(s)
+     return result
+ 
+ 
+ JOBS = [
+     {
+         "name": "lior_course_01",
+         "transcript": r"G:\Projects\Master Agent\ytp_jobs\lior_course_01\transcript.json",
+         "chapters_json": r"G:\Projects\Master Agent\ytp_jobs\lior_course_01\chapters.json",
+         "full_srt_path": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\1 - היכרות עם Kolbo.AI - edited.srt",
+         "chapters_dir": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\1 - היכרות עם Kolbo.AI",
+         "video_duration": 540.2,
+     },
+     {
+         "name": "lior_course_02",
+         "transcript": r"G:\Projects\Master Agent\ytp_jobs\lior_course_02\transcript.json",
+         "chapters_json": r"G:\Projects\Master Agent\ytp_jobs\lior_course_02\chapters.json",
+         "full_srt_path": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\2 - הסבר על פרוייקטים - edited.srt",
+         "chapters_dir": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\2 - הסבר על פרוייקטים",
+         "video_duration": 1643.0,
+     },
+     {
+         "name": "lior_course_03",
+         "transcript": r"G:\Projects\Master Agent\ytp_jobs\lior_course_03\transcript.json",
+         "chapters_json": r"G:\Projects\Master Agent\ytp_jobs\lior_course_03\chapters.json",
+         "full_srt_path": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\3 - כלי הצאט - edited.srt",
+         "chapters_dir": r"G:\Projects\Kolbo.AI\Courses\Lior\Claude\3 - כלי הצאט",
+         "video_duration": 2789.4,
+     },
+ ]
+ 
+ for job in JOBS:
+     print(f"\n{'='*60}")
+     print(f"SRT export: {job['name']}")
+     print('='*60)
+ 
+     with open(job['transcript'], encoding='utf-8') as f:
+         transcript = json.load(f)
+     sentences = transcript['sentences']
+ 
+     with open(job['chapters_json'], encoding='utf-8') as f:
+         chapters = json.load(f)
+ 
+     # Compute end times
+     for i, ch in enumerate(chapters):
+         ch['end_time'] = chapters[i + 1]['start_time'] if i + 1 < len(chapters) else job['video_duration']
+ 
+     # ── Full video SRT ────────────────────────────────────────────────────
+     # Note: the burned video has 4s divider cards inserted before each chapter.
+     # We need to offset all timestamps to account for the accumulated divider time.
+     full_blocks = []
+     block_num = 1
+     accumulated_divider_time = 0.0
+     DIVIDER_DURATION = 4.0
+ 
+     for ch in chapters:
+         ch_start = ch['start_time']
+         ch_end = ch['end_time']
+         ch_sentences = get_sentences_in_range(sentences, ch_start, ch_end)
+ 
+         # Each chapter is preceded by a 4s divider
+         accumulated_divider_time += DIVIDER_DURATION
+ 
+         for s in ch_sentences:
+             start = s['start'] + accumulated_divider_time
+             end = s['end'] + accumulated_divider_time
+             full_blocks.append(
+                 f"{block_num}\n{fmt_srt_time(start)} --> {fmt_srt_time(end)}\n{s['text']}\n"
+             )
+             block_num += 1
+ 
+     with open(job['full_srt_path'], 'w', encoding='utf-8') as f:
+         f.write('\n'.join(full_blocks))
+     print(f"  Full SRT: {block_num - 1} blocks")
+ 
+     # ── Per-chapter SRTs ──────────────────────────────────────────────────
+     for ch in chapters:
+         n = ch['chapter_number']
+         title = ch['title']
+         # Map path separators/colon to '-', strip other forbidden filename chars
+         safe_title = title.translate(str.maketrans('/\\:', '---', '"?*<>|'))
+ 
+         ch_start = ch['start_time']
+         ch_end = ch['end_time']
+         ch_sentences = get_sentences_in_range(sentences, ch_start, ch_end)
+ 
+         # Offset = chapter start time (zero the timestamps to chapter-local time)
+         # Add 4s for the divider card at the beginning of each chapter clip
+         srt_text = sentences_to_srt(ch_sentences, offset=ch_start - DIVIDER_DURATION)
+ 
+         srt_path = os.path.join(job['chapters_dir'], f"{n:02d} - {safe_title}.srt")
+         with open(srt_path, 'w', encoding='utf-8') as f:
+             f.write(srt_text)
+ 
+         print(f"  Ch {n:02d}: {len(ch_sentences)} blocks")
+ 
+ print("\nAll SRTs exported!")
reference/gen_srt.py
@@ -0,0 +1,42 @@
+ """Generate SRT from transcript.json and analyze chapters via Claude."""
+ import json, os, sys
+ 
+ # ── SRT generation ────────────────────────────────────────────────────────
+ 
+ with open('ytp_jobs/sapir_test/transcript.json', encoding='utf-8') as f:
+     data = json.load(f)
+ 
+ words = data['words']
+ 
+ def fmt_time(s):
+     h = int(s // 3600)
+     m = int((s % 3600) // 60)
+     sec = s % 60
+     return f'{h:02d}:{m:02d}:{sec:06.3f}'.replace('.', ',')
+ 
+ # Group words into cues: new cue after 8 words or a >1.5s silence gap
+ lines, current, cur_start = [], [], None
+ for i, w in enumerate(words):
+     if not current:
+         cur_start = w['start']
+     current.append(w['text'])
+     gap = words[i + 1]['start'] - w['end'] if i + 1 < len(words) else 999
+     if len(current) >= 8 or gap > 1.5:
+         lines.append({'start': cur_start, 'end': w['end'], 'text': ' '.join(current)})
+         current, cur_start = [], None
+ 
+ if current:
+     lines.append({'start': cur_start, 'end': words[-1]['end'], 'text': ' '.join(current)})
+ 
+ srt_blocks = []
+ for i, l in enumerate(lines, 1):
+     srt_blocks.append(f"{i}\n{fmt_time(l['start'])} --> {fmt_time(l['end'])}\n{l['text']}\n")
+ 
+ srt_text = '\n'.join(srt_blocks)
+ 
+ with open('ytp_jobs/sapir_test/sapir.srt', 'w', encoding='utf-8') as f:
+     f.write(srt_text)
+ 
+ print(f"[srt] Written {len(lines)} subtitle blocks -> ytp_jobs/sapir_test/sapir.srt")
+ print("[srt] First 3 blocks:")
+ for l in lines[:3]:
+     print(f"  [{fmt_time(l['start'])} --> {fmt_time(l['end'])}] {l['text'][:70]}")