@thunderkiller/video-clipper 1.1.0
- package/.env.example +130 -0
- package/.github/workflows/ci.yml +42 -0
- package/.github/workflows/release.yml +72 -0
- package/.husky/pre-commit +3 -0
- package/.prettierignore +6 -0
- package/.prettierrc +7 -0
- package/.releaserc.json +21 -0
- package/AGENTS.md +122 -0
- package/CHANGELOG.md +45 -0
- package/README.md +410 -0
- package/dist/cli.js +187 -0
- package/dist/config/env.js +14 -0
- package/dist/config/index.js +1 -0
- package/dist/index.js +35 -0
- package/dist/pipeline/runner.js +132 -0
- package/dist/pipeline/stages/audioProcessor.js +75 -0
- package/dist/pipeline/stages/clipExporter.js +44 -0
- package/dist/pipeline/stages/segmentAnalyzer.js +46 -0
- package/dist/pipeline/stages/segmentSelector.js +23 -0
- package/dist/pipeline/stages/videoResolver.js +34 -0
- package/dist/services/audioAnalyzers/base.js +13 -0
- package/dist/services/audioAnalyzers/factory.js +56 -0
- package/dist/services/audioAnalyzers/gemini.js +109 -0
- package/dist/services/audioAnalyzers/index.js +5 -0
- package/dist/services/audioAnalyzers/whisper.js +62 -0
- package/dist/services/audioAnalyzers/yamnet.js +40 -0
- package/dist/services/audioDownloader/index.js +81 -0
- package/dist/services/chunkBuilder/index.js +71 -0
- package/dist/services/clipGenerator/index.js +156 -0
- package/dist/services/clipRefiner/index.js +103 -0
- package/dist/services/eventDetector/index.js +54 -0
- package/dist/services/llmAnalyzer/LLMAnalyzer.js +63 -0
- package/dist/services/llmAnalyzer/index.js +173 -0
- package/dist/services/metadataExtractor/index.js +66 -0
- package/dist/services/segmentRanker/index.js +40 -0
- package/dist/services/signalMerger/index.js +36 -0
- package/dist/services/transcriptAnalyzers/base.js +13 -0
- package/dist/services/transcriptAnalyzers/factory.js +51 -0
- package/dist/services/transcriptAnalyzers/gemini.js +19 -0
- package/dist/services/transcriptAnalyzers/index.js +5 -0
- package/dist/services/transcriptAnalyzers/whisper.js +55 -0
- package/dist/services/transcriptAnalyzers/ytdlp.js +16 -0
- package/dist/services/transcriptDetector/index.js +102 -0
- package/dist/services/transcriptFetcher/index.js +124 -0
- package/dist/services/urlParser/index.js +46 -0
- package/dist/services/videoDownloader/index.js +212 -0
- package/dist/types/audio.js +15 -0
- package/dist/types/cli.js +1 -0
- package/dist/types/config.js +150 -0
- package/dist/types/index.js +5 -0
- package/dist/types/pipeline.js +9 -0
- package/dist/types/segment.js +36 -0
- package/dist/types/transcript.js +16 -0
- package/dist/types/video.js +14 -0
- package/dist/utils/cache.js +143 -0
- package/dist/utils/chunker.js +51 -0
- package/dist/utils/dumper.js +36 -0
- package/dist/utils/format.js +10 -0
- package/dist/utils/logger.js +16 -0
- package/dist/utils/modelFactory.js +60 -0
- package/dist/utils/redactConfig.js +20 -0
- package/dist/utils/sliceAudio.js +26 -0
- package/docs/free-models.md +78 -0
- package/docs/plan.md +442 -0
- package/docs/refactorPhases.md +105 -0
- package/docs/yt-downloader.md +440 -0
- package/package.json +65 -0
- package/requirements.txt +5 -0
- package/scripts/detect_events.py +81 -0
- package/scripts/detect_events_whisper.py +101 -0
- package/scripts/transcribe_whisper.py +70 -0
- package/src/cli.ts +186 -0
- package/src/config/env.ts +18 -0
- package/src/config/index.ts +2 -0
- package/src/index.ts +46 -0
- package/src/pipeline/runner.ts +155 -0
- package/src/pipeline/stages/audioProcessor.ts +129 -0
- package/src/pipeline/stages/clipExporter.ts +80 -0
- package/src/pipeline/stages/segmentAnalyzer.ts +72 -0
- package/src/pipeline/stages/segmentSelector.ts +39 -0
- package/src/pipeline/stages/videoResolver.ts +47 -0
- package/src/services/audioAnalyzers/base.ts +32 -0
- package/src/services/audioAnalyzers/factory.ts +71 -0
- package/src/services/audioAnalyzers/gemini.ts +137 -0
- package/src/services/audioAnalyzers/index.ts +6 -0
- package/src/services/audioAnalyzers/whisper.ts +80 -0
- package/src/services/audioAnalyzers/yamnet.ts +54 -0
- package/src/services/audioDownloader/index.ts +102 -0
- package/src/services/chunkBuilder/index.ts +86 -0
- package/src/services/clipGenerator/index.ts +210 -0
- package/src/services/clipRefiner/index.ts +141 -0
- package/src/services/eventDetector/index.ts +68 -0
- package/src/services/llmAnalyzer/LLMAnalyzer.ts +114 -0
- package/src/services/llmAnalyzer/index.ts +231 -0
- package/src/services/metadataExtractor/index.ts +83 -0
- package/src/services/segmentRanker/index.ts +88 -0
- package/src/services/signalMerger/index.ts +53 -0
- package/src/services/transcriptAnalyzers/base.ts +26 -0
- package/src/services/transcriptAnalyzers/factory.ts +67 -0
- package/src/services/transcriptAnalyzers/gemini.ts +24 -0
- package/src/services/transcriptAnalyzers/index.ts +6 -0
- package/src/services/transcriptAnalyzers/whisper.ts +68 -0
- package/src/services/transcriptAnalyzers/ytdlp.ts +19 -0
- package/src/services/transcriptDetector/index.ts +128 -0
- package/src/services/transcriptFetcher/index.ts +151 -0
- package/src/services/urlParser/index.ts +53 -0
- package/src/services/videoDownloader/index.ts +282 -0
- package/src/types/audio.ts +19 -0
- package/src/types/cli.ts +22 -0
- package/src/types/config.ts +174 -0
- package/src/types/index.ts +26 -0
- package/src/types/pipeline.ts +93 -0
- package/src/types/segment.ts +43 -0
- package/src/types/transcript.ts +22 -0
- package/src/types/video.ts +18 -0
- package/src/utils/cache.ts +223 -0
- package/src/utils/chunker.ts +60 -0
- package/src/utils/dumper.ts +41 -0
- package/src/utils/format.ts +10 -0
- package/src/utils/logger.ts +17 -0
- package/src/utils/modelFactory.ts +71 -0
- package/src/utils/redactConfig.ts +23 -0
- package/src/utils/sliceAudio.ts +35 -0
- package/test-trigger.txt +1 -0
- package/tests/analyzerFactory.test.ts +146 -0
- package/tests/audioEventDetector.test.ts +69 -0
- package/tests/cache.test.ts +203 -0
- package/tests/chunkBuilder.test.ts +146 -0
- package/tests/chunker.test.ts +95 -0
- package/tests/eventDetector.test.ts +103 -0
- package/tests/llmAnalyzer.test.ts +283 -0
- package/tests/segmentRanker.test.ts +133 -0
- package/tests/setup.ts +48 -0
- package/tests/signalMerger.test.ts +197 -0
- package/tests/transcriptDetector.test.ts +150 -0
- package/tests/transcriptFetcher.test.ts +179 -0
- package/tests/urlParser.test.ts +70 -0
- package/tsconfig.json +16 -0
- package/tsconfig.test.json +8 -0
- package/vitest.config.ts +8 -0
package/.env.example
ADDED
@@ -0,0 +1,130 @@
+# ── Provider selection ────────────────────────────────────────────────────────
+# Allowed values: openai | anthropic | google | xai | mistral | groq | zai | openrouter | custom
+# Default: openai
+LLM_PROVIDER=openai
+
+# ── API keys — set only the key for your chosen provider ─────────────────────
+OPENAI_API_KEY=your_key_here
+# ANTHROPIC_API_KEY=your_key_here
+# GOOGLE_GENERATIVE_AI_API_KEY=your_key_here
+# XAI_API_KEY=your_key_here
+# MISTRAL_API_KEY=your_key_here
+# GROQ_API_KEY=your_key_here
+# ZAI_API_KEY=your_key_here
+# OPENROUTER_API_KEY=your_key_here
+
+# ── Custom OpenAI-compatible provider ─────────────────────────────────────────
+# Set LLM_PROVIDER=custom and supply the base URL + API key for any endpoint
+# that speaks the OpenAI Chat Completions API (e.g. LM Studio, vLLM, Ollama,
+# LocalAI, Together AI, Fireworks AI, etc.).
+# Both vars are required when LLM_PROVIDER=custom.
+# CUSTOM_OPENAI_BASE_URL=http://localhost:11434/v1
+# CUSTOM_OPENAI_API_KEY=your_key_here
+
+# ── Model name — must match the chosen provider's model IDs ──────────────────
+# openai: gpt-4o, gpt-4o-mini, gpt-4-turbo, ...
+# anthropic: claude-sonnet-4-5, claude-opus-4, claude-haiku-3-5, ...
+# google: gemini-2.0-flash, gemini-1.5-pro, ...
+# xai: grok-beta, grok-2, ...
+# mistral: mistral-large-latest, mistral-small-latest, ...
+# groq: llama-3.3-70b-versatile, llama-3.1-8b-instant, ...
+# zai: glm-5, glm-4.7, glm-4.6, glm-5-turbo, ...
+# openrouter: meta-llama/llama-3.3-70b-instruct:free, google/gemma-3-27b-it:free, ...
+# custom: depends on your endpoint (e.g. llama3.2, mistral, phi4, ...)
+LLM_MODEL=gpt-4o
+
+# ── Tunable parameters (all have defaults — only set if you want to override) ─
+# SCORE_THRESHOLD=7 # min score 1–10 to keep a segment
+# TOP_N_SEGMENTS=10 # max segments returned
+# CHUNK_LENGTH_SEC=120 # LLM chunk window size in seconds
+# CHUNK_OVERLAP_SEC=20 # overlap between consecutive chunks in seconds
+# MICRO_BLOCK_SEC=15 # micro-block grouping window in seconds
+# LLM_MAX_RETRIES=3 # max retries on rate-limit errors
+# DOWNLOAD_DIR=downloads/ # directory for yt-dlp downloads
+# OUTPUT_DIR=outputs/ # directory for clips and dumps
+# CACHE_DIR=outputs/cache # directory for transcript and LLM result caching
+# LLM_CONCURRENCY=3 # max parallel LLM calls
+# CLIP_CONCURRENCY=1 # max parallel clip generation operations
+# DOWNLOAD_SECTIONS_MODE=all # yt-dlp download mode: all (full video) or segments (individual clips)
+# FFMPEG_PRESET=fast # ffmpeg encoding preset: ultrafast, superfast, veryfast, fast (default), medium, slow, slower
+# TIMESTAMP_OFFSET_SECONDS=0 # Adjust all clip timestamps (positive = later, negative = earlier) if transcript is misaligned with video
+
+# ── Custom system prompt ────────────────────────────────────────────────────────
+# Override the default LLM system prompt with a custom one. Useful for adapting
+# the analysis to specific content types (e.g., comedy, tech, education).
+# LLM_SYSTEM_PROMPT=You are an expert video editor specializing in viral clips.
+
+# ── LLM evaluation limits ─────────────────────────────────────────────────────
+# Cap the number of transcript chunks sent to the LLM. Useful for testing or
+# controlling API costs. Unset (default) means all chunks are evaluated.
+# MAX_CHUNKS=5
+
+# ── Output dumping ────────────────────────────────────────────────────────────
+# When true (default), writes two files after each run:
+# OUTPUT_DIR/transcript/{videoId}.json — raw normalized transcript lines
+# OUTPUT_DIR/analysis/{videoId}.json — full pipeline result (metadata + ranked segments)
+# Set to false to disable.
+# DUMP_OUTPUTS=true
+
+# ── Audio event detection ───────────────────────────────────────────────────────
+# Enable/disable audio event detection. When false, transcript-only mode is used.
+# Default: true
+# AUDIO_DETECTION_ENABLED=true
+
+# ── Transcript provider ─────────────────────────────────────────────────────────
+# Comma-separated ordered fallback chain. First provider to succeed wins.
+# ytdlp — yt-dlp auto-generated VTT subtitles (default, no audio required)
+# whisper — local openai-whisper (Python), requires audio download + pip install openai-whisper
+# gemini — Gemini multimodal transcription (not yet implemented)
+# Default: ytdlp
+# TRANSCRIPT_PROVIDER=ytdlp
+
+# Audio provider:
+# gemini — Gemini Flash, semantic understanding, requires GOOGLE_GENERATIVE_AI_API_KEY
+# yamnet — local YAMNet (Python), class-ID based, requires: pip install tensorflow-hub soundfile numpy
+# whisper — local openai-whisper (Python), speech-to-text + keyword match, requires: pip install openai-whisper
+# both — deprecated alias for "gemini,whisper"
+# Default: gemini,whisper
+# AUDIO_PROVIDER=gemini,whisper
+
+# Whisper model size to use when AUDIO_PROVIDER=whisper or AUDIO_PROVIDER=both.
+# Larger models are more accurate but slower. medium is the recommended default.
+# Options: tiny | base | small | medium | large-v3
+# Default: medium
+# AUDIO_WHISPER_MODEL=medium
+
+# Confidence threshold (0-1). Events below this value are discarded.
+# For Whisper: 1.0 = exact phrase match, 0.8 = partial keyword match.
+# For YAMNet: raw class score from the model.
+# Default: 0.30
+# AUDIO_CONFIDENCE_THRESHOLD=0.30
+
+# Pre-roll and post-roll for audio-only clip candidates (seconds before/after event)
+# Default: 5s pre-roll, 15s post-roll
+# AUDIO_CLIP_PRE_ROLL=5
+# AUDIO_CLIP_POST_ROLL=15
+
+# Time window (seconds) within which audio events boost LLM segment scores
+# Default: 10s
+# AUDIO_LLM_BOOST_WINDOW=10
+
+# Score boost applied to LLM segments when audio event is detected nearby
+# Default: 2
+# AUDIO_LLM_SCORE_BOOST=2
+
+# ── Game profile ────────────────────────────────────────────────────────────────
+# Game-specific event detection and LLM keywords:
+# valorant: gunshot, gunfire_burst, explosion + ace, clutch, defuse, spike
+# fps: gunshot, gunfire_burst, explosion + kill, streak, headshot
+# boss_fight: explosion, crowd_cheering + phase, dead, finally
+# general: crowd_cheering, applause + insane, crazy, let's go
+# Default: general
+# GAME_PROFILE=general
+
+# ── Audio extra instructions ─────────────────────────────────────────────────
+# Optional extra instructions appended to the Gemini audio detection prompt,
+# after the game-profile preamble and before the JSON format block.
+# Use this to focus detection on specific moments, ignore certain sounds, or
+# add game-specific context not covered by the built-in profiles.
+# Example:
+# AUDIO_EXTRA_INSTRUCTIONS=Also flag moments where the caster's voice rises sharply. Ignore background music.
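The `AUDIO_LLM_BOOST_WINDOW`, `AUDIO_LLM_SCORE_BOOST`, and `AUDIO_CONFIDENCE_THRESHOLD` comments in `.env.example` describe boosting an LLM segment's score when a sufficiently confident audio event is detected nearby. The TypeScript sketch below illustrates one way that rule could work. The `Segment` and `AudioEvent` shapes, the `boostSegments` name, the reading of "nearby" as within the window of either segment edge, and the cap at 10 are all assumptions made for illustration; they are not taken from the package's source.

```typescript
// Hypothetical shapes for illustration; the package's real types may differ.
interface Segment {
  start: number; // seconds
  end: number; // seconds
  score: number; // 1-10, the scale described for SCORE_THRESHOLD
}

interface AudioEvent {
  time: number; // seconds
  confidence: number; // 0-1
}

// Defaults mirror the values documented in .env.example.
const BOOST_WINDOW_SEC = 10; // AUDIO_LLM_BOOST_WINDOW
const SCORE_BOOST = 2; // AUDIO_LLM_SCORE_BOOST
const CONFIDENCE_THRESHOLD = 0.3; // AUDIO_CONFIDENCE_THRESHOLD

function boostSegments(
  segments: Segment[],
  events: AudioEvent[],
  windowSec: number = BOOST_WINDOW_SEC,
  boost: number = SCORE_BOOST,
  minConfidence: number = CONFIDENCE_THRESHOLD,
): Segment[] {
  // Events below the confidence threshold are discarded.
  const kept = events.filter((e) => e.confidence >= minConfidence);
  return segments.map((seg) => {
    // Assumption: "nearby" means inside the segment or within windowSec of either edge.
    const near = kept.some(
      (e) => e.time >= seg.start - windowSec && e.time <= seg.end + windowSec,
    );
    // Assumption: boosted scores are capped at the top of the 1-10 scale.
    return near ? { ...seg, score: Math.min(10, seg.score + boost) } : seg;
  });
}

const boosted = boostSegments(
  [
    { start: 100, end: 130, score: 6 },
    { start: 300, end: 330, score: 9 },
    { start: 600, end: 630, score: 5 },
  ],
  [
    { time: 95, confidence: 0.9 }, // 5 s before the first segment
    { time: 335, confidence: 0.8 }, // 5 s after the second segment
    { time: 605, confidence: 0.1 }, // below the threshold, so ignored
  ],
);
console.log(boosted.map((s) => s.score)); // [ 8, 10, 5 ]
```

The first segment gains the full boost, the second is capped at 10, and the third is unchanged because its only nearby event falls below the confidence threshold.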
package/.github/workflows/ci.yml
ADDED
@@ -0,0 +1,42 @@
+name: CI
+
+on:
+  push:
+    branches:
+      - main
+      - master
+  pull_request:
+    branches:
+      - main
+      - master
+jobs:
+  ci:
+    name: Type-check, Format, Test & Build
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup Node.js (LTS)
+        uses: actions/setup-node@v4
+        with:
+          node-version: '24'
+
+      - name: Setup pnpm
+        uses: pnpm/action-setup@v4
+
+      - name: Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Type-check
+        run: pnpm type-check
+
+      - name: Format check
+        run: pnpm format:check
+
+      - name: Test
+        run: pnpm test
+
+      - name: Build
+        run: pnpm build
package/.github/workflows/release.yml
ADDED
@@ -0,0 +1,72 @@
+name: Release
+
+on:
+  push:
+    branches:
+      - master
+
+jobs:
+  release:
+    name: Semantic Release
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+      id-token: write
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+          persist-credentials: false
+
+      - name: Setup pnpm
+        uses: pnpm/action-setup@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '24'
+          registry-url: 'https://registry.npmjs.org'
+
+      - name: Setup npm auth
+        run: echo "//registry.npmjs.org/:_authToken=\${{ secrets.NPM_TOKEN }}" > ~/.npmrc
+
+      - name: Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Build
+        run: pnpm build
+
+      - name: Configure Git
+        run: |
+          git config user.name "github-actions[bot]"
+          git config user.email "github-actions[bot]@users.noreply.github.com"
+          git remote set-url origin https://x-access-token:${{ secrets.PUSH_TOKEN }}@github.com/${{ github.repository }}.git
+
+      - name: Stage 1 GitHub Release (versioning, tags, releases)
+        id: github_release
+        run: npx semantic-release
+        env:
+          GITHUB_TOKEN: ${{ secrets.PUSH_TOKEN }}
+
+      - name: Verify GitHub Release
+        if: success()
+        run: |
+          echo "Checking for new git tags..."
+          CURRENT_TAG=$(git describe --tags --abbrev=0 2>/dev/null || echo "none")
+          echo "Current tag: $CURRENT_TAG"
+
+          if [ "$CURRENT_TAG" != "none" ]; then
+            echo "✅ New tag created: $CURRENT_TAG"
+            echo "Proceeding to npm publish..."
+          else
+            echo "❌ No new tag created - skipping npm publish"
+            exit 1
+          fi
+
+      - name: Stage 2 Publish to NPM
+        if: success()
+        run: npm publish
+        env:
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
package/.prettierignore
ADDED
package/.prettierrc
ADDED
package/.releaserc.json
ADDED
@@ -0,0 +1,21 @@
+{
+  "branches": ["master"],
+  "plugins": [
+    "@semantic-release/commit-analyzer",
+    "@semantic-release/release-notes-generator",
+    [
+      "@semantic-release/changelog",
+      {
+        "changelogFile": "CHANGELOG.md"
+      }
+    ],
+    [
+      "@semantic-release/git",
+      {
+        "assets": ["package.json", "CHANGELOG.md"],
+        "message": "chore(release): ${nextRelease.version} [skip ci]"
+      }
+    ],
+    "@semantic-release/github"
+  ]
+}
package/AGENTS.md
ADDED
@@ -0,0 +1,122 @@
+# Agent Instructions
+
+## Project
+
+TypeScript CLI that analyzes YouTube transcripts with an LLM to find interesting moments and optionally cut video clips.
+
+See `docs/plan.md` for the full architecture.
+
+## Stack
+
+- TypeScript (Node.js 18+)
+- Vercel AI SDK (`ai`, `@ai-sdk/openai`, `@ai-sdk/anthropic`, `@ai-sdk/google`, `@ai-sdk/xai`, `@ai-sdk/mistral`, `@ai-sdk/groq`) with `generateObject` + `zod` for structured LLM output
+- Multi-provider support: OpenAI, Anthropic, Google, XAI, Mistral, Groq, Zai, OpenRouter
+- `youtube-transcript` for transcript fetching
+- `yt-dlp` + `execa` for video download
+- `fluent-ffmpeg` for clip cutting
+- `zod` for config validation at startup
+- `p-limit` for concurrency control
+
+## Project Structure
+
+```
+src/
+  config/ # zod-validated env config — import this, never read process.env directly
+    index.ts
+    env.ts
+  services/ # core pipeline modules
+    urlParser/
+    metadataExtractor/
+    transcriptFetcher/
+    chunkBuilder/
+    llmAnalyzer/
+    segmentRanker/
+    clipRefiner/
+    videoDownloader/
+    clipGenerator/
+  types/ # shared TypeScript types
+    index.ts
+    config.ts
+    segment.ts
+    transcript.ts
+    video.ts
+    youtube-transcript.d.ts
+  utils/ # utility functions
+    cache.ts # transcript + chunk LLM result caching
+    dumper.ts # transcript/analysis JSON dumps
+    redactConfig.ts # config formatting for logs (redacts API keys)
+    format.ts # timestamp formatting utilities
+    logger.ts # logging utilities
+    modelFactory.ts # LLM provider factory
+  index.ts # CLI entrypoint
+tests/ # all unit tests (mirrors module names)
+  urlParser.test.ts
+  chunkBuilder.test.ts
+  segmentRanker.test.ts
+downloads/ # yt-dlp output (gitignored)
+outputs/ # ffmpeg clip output, caches, dumps (gitignored)
+  cache/ # transcript + LLM result cache
+  transcript/ # transcript dumps
+  analysis/ # analysis dumps
+docs/
+  plan.md
+  free-models.md
+```
+
+## Code Rules
+
+- All code in TypeScript — no plain `.js` files
+- Every function must have explicit input/output types; avoid `any`
+- Use `zod` for all external data validation (LLM output, env vars, API responses)
+- Never read `process.env` directly — always import from `src/config.ts`
+- Never hardcode API keys, model names, thresholds, or directory paths — all come from config
+- Use `async/await` — no raw `.then()` chains
+- Use `Promise.allSettled` for parallel LLM calls so one failure doesn't abort the rest
+- Handle errors explicitly — no silent catches. Log a warning with the chunk index and reason on skip.
+
+## LLM Usage
+
+- Use `generateObject` (not `generateText`) for all LLM calls that return structured data
+- Define a `zod` schema for every LLM response before writing the prompt
+- Keep prompts in the same file as the function that uses them
+- Do not retry on malformed JSON — `generateObject` handles structured output natively
+- On any LLM call failure, catch and log, then continue — never crash the pipeline
+
+## Module Conventions
+
+Each module in `src/modules/` should:
+
+- Export a single main function named after the module (e.g. `fetchTranscript`, `buildChunks`)
+- Accept typed inputs and return typed outputs (no `any`)
+- Not import from other modules except `config.ts` and `types.ts` unless there is a clear dependency
+
+## Transcript Notes
+
+- `youtube-transcript` returns `offset` in **milliseconds** — normalize to seconds immediately after fetching
+- Micro-blocks group raw lines into ~15s windows before chunking
+- LLM chunks are 120s windows with 20s overlap — built from micro-blocks, not raw lines
+
+## Naming
+
+- Files: `camelCase.ts`
+- Functions: `camelCase`
+- Types/interfaces: `PascalCase`
+- Constants: `UPPER_SNAKE_CASE`
+- Zod schemas: `PascalCase` + `Schema` suffix (e.g. `SegmentSchema`)
+
+## Testing
+
+- Write unit tests for pure functions (URL parser, chunker, ranker, deduplicator)
+- Do not unit test functions that call external services (LLM, yt-dlp, ffmpeg) — integration test those separately
+- Test files live in `tests/` at the project root, mirroring the module name (e.g. `tests/urlParser.test.ts`)
+
+## Git
+
+- Commit messages: lowercase, imperative
+- One logical change per commit
+- Never commit `.env`, `downloads/`, or `outputs/`
+- It should be pointwise `-`
+- Shouldn't be more than 500 characters all together
+- it should be categorized as feat: , fix:, docs:, refactor etc...
+- also it should include specifics like feat(<feature>)
+- so final structure is <change ex: feat, docs, fix>: <short desc> followed by <full desc> pointwise
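The "Transcript Notes" section of `AGENTS.md` describes three steps: convert `offset` from milliseconds to seconds, group raw lines into roughly 15 s micro-blocks, and build 120 s LLM chunks with 20 s overlap from those micro-blocks. The TypeScript sketch below shows one possible reading of those steps. The `RawLine`, `Block`, and `Chunk` shapes, the `normalize` and `buildMicroBlocks` names, and the choice to align windows to 0 are assumptions for illustration only; the package's own `chunkBuilder` may work differently.

```typescript
// Hypothetical shapes for illustration; names are not taken from the package.
interface RawLine {
  text: string;
  offset: number; // milliseconds, as AGENTS.md notes for youtube-transcript
}

interface Block {
  start: number; // seconds
  text: string;
}

interface Chunk {
  start: number; // seconds
  end: number; // seconds
  text: string;
}

// Normalize millisecond offsets to seconds immediately after fetching.
function normalize(lines: RawLine[]): Block[] {
  return lines.map((l) => ({ start: l.offset / 1000, text: l.text }));
}

// Group lines into fixed windows of blockSec seconds (assumption: windows align to 0).
function buildMicroBlocks(lines: Block[], blockSec: number = 15): Block[] {
  const byWindow = new Map<number, string[]>();
  for (const line of lines) {
    const key = Math.floor(line.start / blockSec) * blockSec;
    const bucket = byWindow.get(key) ?? [];
    bucket.push(line.text);
    byWindow.set(key, bucket);
  }
  return Array.from(byWindow.entries())
    .sort((a, b) => a[0] - b[0])
    .map(([start, texts]) => ({ start, text: texts.join(" ") }));
}

// Slide a chunkSec window forward by (chunkSec - overlapSec) over the micro-blocks.
function buildChunks(
  blocks: Block[],
  chunkSec: number = 120,
  overlapSec: number = 20,
): Chunk[] {
  if (overlapSec >= chunkSec) {
    throw new Error("overlap must be smaller than chunk length");
  }
  if (blocks.length === 0) return [];
  const step = chunkSec - overlapSec;
  const lastStart = blocks[blocks.length - 1].start;
  const chunks: Chunk[] = [];
  for (let start = 0; ; start += step) {
    const end = start + chunkSec;
    const inside = blocks.filter((b) => b.start >= start && b.start < end);
    if (inside.length > 0) {
      chunks.push({ start, end, text: inside.map((b) => b.text).join(" ") });
    }
    // Stop once the current window already covers the final micro-block.
    if (end > lastStart) break;
  }
  return chunks;
}

// Thirty lines, one every 10 s, with offsets in milliseconds.
const raw: RawLine[] = Array.from({ length: 30 }, (_, i) => ({
  text: `line ${i}`,
  offset: i * 10000,
}));
const chunks = buildChunks(buildMicroBlocks(normalize(raw)));
console.log(chunks.map((c) => [c.start, c.end])); // [ [ 0, 120 ], [ 100, 220 ], [ 200, 320 ] ]
```

Because each window advances by 100 s while spanning 120 s, the micro-block starting at 105 s appears at the end of the first chunk and again at the start of the second, which is the overlap the notes describe.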
package/CHANGELOG.md
ADDED
@@ -0,0 +1,45 @@
+# 1.0.0 (2026-03-18)
+
+### Bug Fixes
+
+- **audio:** align audio chunk windows with transcript LLM chunks ([f81981f](https://github.com/AmreetKumarkhuntia/video-clipper/commit/f81981fdca8c8b4b13682112551541b1df75a138))
+- **audio:** fix Gemini markdown fence parse error and add per-chunk caching ([3cc237f](https://github.com/AmreetKumarkhuntia/video-clipper/commit/3cc237f851c56236fb3f2795f5378b02cf34bde3))
+- **audio:** fix Gemini MM.SS timestamp ambiguity and make model configurable ([2139c5f](https://github.com/AmreetKumarkhuntia/video-clipper/commit/2139c5fa2030679bb460b402108f0e64482c0d0b))
+- **audio:** use python3 or python whichever available ([6501bf6](https://github.com/AmreetKumarkhuntia/video-clipper/commit/6501bf6ba99ba1364d285b7b921a22ecdfd23182))
+- github workflows ([18a9536](https://github.com/AmreetKumarkhuntia/video-clipper/commit/18a953619ed17de71d3c9bd0a86e1b42a10aea37))
+- **logging:** improve chunk analysis logs and gitignore outputs dir ([cc007de](https://github.com/AmreetKumarkhuntia/video-clipper/commit/cc007de0b1f453064a907689f9f30f6da2bebded))
+- **release:** add explicit npm auth setup step ([21c9287](https://github.com/AmreetKumarkhuntia/video-clipper/commit/21c928748fe90f278e67e34ae57553e5c279faf2))
+- **release:** add explicit npm auth setup step ([7e423fa](https://github.com/AmreetKumarkhuntia/video-clipper/commit/7e423fa76779d1dd6fe0d5bd0bac1cf0d54ffc8a))
+- **release:** remove github release plugin, keep npm only ([b37870b](https://github.com/AmreetKumarkhuntia/video-clipper/commit/b37870bfe997ad9d16ec1e0c22d52a07c99b4aaf))
+- **release:** restore github releases and fix git push permissions ([73e6fea](https://github.com/AmreetKumarkhuntia/video-clipper/commit/73e6fea88040844d00cebe79953f50f5ac96bc1e))
+- **release:** use PUSH_TOKEN instead of GITHUB_TOKEN ([d3414ad](https://github.com/AmreetKumarkhuntia/video-clipper/commit/d3414ad84944d48bc5dfcee7562b00c358bca27e))
+- **tests:** add global vitest setup to mock config module ([ecffd3b](https://github.com/AmreetKumarkhuntia/video-clipper/commit/ecffd3b940b4a2a3e5d0bd29911f2e70e22bdfe8))
+- yaml correction ([33c7854](https://github.com/AmreetKumarkhuntia/video-clipper/commit/33c7854015e1266b3fc01b6206da4ec946f94307))
+
+### Features
+
+- **analysis:** add per-chunk evaluations to analysis dump ([c562878](https://github.com/AmreetKumarkhuntia/video-clipper/commit/c562878971c65b8739b373cf2cac235d5c6697c0))
+- **audio:** add audio event result caching + requirements.txt ([5fd41d5](https://github.com/AmreetKumarkhuntia/video-clipper/commit/5fd41d5895e50060a93e9ef20621c85db52e80c4))
+- **audio:** add openai-whisper as local audio event detector ([12aa5be](https://github.com/AmreetKumarkhuntia/video-clipper/commit/12aa5be9a5521efef07e29f20ea93dd1a54aeda6))
+- **audio:** apply --max-parallel to audio event detection ([4996fc7](https://github.com/AmreetKumarkhuntia/video-clipper/commit/4996fc744f5000b791ecdd6677c95c24d5cf8e59))
+- **cache:** add chunk/transcript caching and format timestamps as HH:MM:SS ([4ea921d](https://github.com/AmreetKumarkhuntia/video-clipper/commit/4ea921d409c2b8cda2a1a5d59eb8bec14ca2c919))
+- **cache:** write chunk cache immediately after LLM analysis ([2ef3454](https://github.com/AmreetKumarkhuntia/video-clipper/commit/2ef3454a57c61abd5e0031639d6b79a726c7f413))
+- **ci:** add semantic-release for automated versioning ([0b7b6b6](https://github.com/AmreetKumarkhuntia/video-clipper/commit/0b7b6b65f685f486f3b51d21a78ac63108b87695))
+- **clip-generator:** add CLIP_CONCURRENCY config to prevent memory spikes ([be2c8af](https://github.com/AmreetKumarkhuntia/video-clipper/commit/be2c8af7763c5967b9b88d0b4406ac1a2c50af71))
+- **clip:** fix audio/video sync and add flexible download options ([c8ebe1a](https://github.com/AmreetKumarkhuntia/video-clipper/commit/c8ebe1a7f2218dcb881d10bd421e7b77b84009b6))
+- **llm:** add multi-provider support via modelFactory ([d72cb81](https://github.com/AmreetKumarkhuntia/video-clipper/commit/d72cb81d8adfe6366b60ef8ab74bbd7c38f67fe5))
+- **llm:** support custom system prompt via LLM_SYSTEM_PROMPT env var ([8b4525d](https://github.com/AmreetKumarkhuntia/video-clipper/commit/8b4525d9fffe2806c2232035b9a8553e56f37270))
+- **npm:** convert to scoped package @thunderkiller/video-clipper ([c8bf930](https://github.com/AmreetKumarkhuntia/video-clipper/commit/c8bf930c789cd9ceee3d8143fc5ff75674609f9a))
+- **output:** add transcript and analysis dump feature ([15db23c](https://github.com/AmreetKumarkhuntia/video-clipper/commit/15db23c77244b9db89604be463bccb97f6161987))
+- **phase4:** add video downloader and clip generator modules ([5c1cbde](https://github.com/AmreetKumarkhuntia/video-clipper/commit/5c1cbdeabe84a65013f7d5a8f34b3c2c336097e0))
+- **phase5:** add CLI flags, error handling, and progress logging ([36edf0a](https://github.com/AmreetKumarkhuntia/video-clipper/commit/36edf0a93da5389c7d5dee555fb6e3c7c7e47ee1))
+- **pipeline:** implement phase 2 core pipeline modules ([957b068](https://github.com/AmreetKumarkhuntia/video-clipper/commit/957b068a36dc29552c1a950ac74a8e337f097764))
+- **pipeline:** implement phase 3 LLM analysis and metadata modules ([8a088c9](https://github.com/AmreetKumarkhuntia/video-clipper/commit/8a088c99498824b35c9e348dd1ef8bb01eba7659))
+- **provider:** add custom OpenAI-compatible provider support ([5dc0c4e](https://github.com/AmreetKumarkhuntia/video-clipper/commit/5dc0c4e5fedddc97ba7907d5af89ba2fa46f2bc3))
+- **provider:** add openrouter support and free models doc ([e41d4a6](https://github.com/AmreetKumarkhuntia/video-clipper/commit/e41d4a6780dc5d72e83aafd1a1b4b8a46f2dd284))
+- **refiner:** add concurrency control and caching to segment refinement ([0738599](https://github.com/AmreetKumarkhuntia/video-clipper/commit/073859948700da0d59b806e29f03e9b7956f704d))
+- **release:** split into two sequential stages - GitHub then npm ([9f6d110](https://github.com/AmreetKumarkhuntia/video-clipper/commit/9f6d1108abd80cacf695d3872f4800fb84884bdd))
+- **scaffold:** bootstrap project structure and core infrastructure ([b187b3f](https://github.com/AmreetKumarkhuntia/video-clipper/commit/b187b3f43082ae88525f9444e84297b975167720))
+- **tooling:** add prettier, husky pre-commit hooks, and github actions ci ([fdbdc9a](https://github.com/AmreetKumarkhuntia/video-clipper/commit/fdbdc9a838e083d2777e6c6fb0f700b28fb1c324))
+- **version:** updated version ([69164f5](https://github.com/AmreetKumarkhuntia/video-clipper/commit/69164f59acca2e3236b4ab9a6e85441f89df01b4))
+- **yt-dlp:** add cookie support and improve error logging ([ae4afd4](https://github.com/AmreetKumarkhuntia/video-clipper/commit/ae4afd476b4176645bd69f40a62f22fd775356eb))