task-summary-extractor 9.4.0 → 9.5.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.env.example +1 -1
- package/ARCHITECTURE.md +6 -0
- package/QUICK_START.md +4 -2
- package/README.md +18 -0
- package/package.json +1 -1
- package/src/phases/discover.js +1 -0
- package/src/phases/init.js +63 -1
- package/src/phases/process-media.js +32 -5
- package/src/pipeline.js +4 -3
- package/src/services/video.js +116 -25
- package/src/utils/cli.js +12 -2
package/.env.example
CHANGED

@@ -9,7 +9,7 @@ FIREBASE_MEASUREMENT_ID=G-XXXXXXXXXX
 
 # ======================== GEMINI AI ========================
 GEMINI_API_KEY=your_gemini_api_key
-GEMINI_MODEL=gemini-2.5-flash
+GEMINI_MODEL=gemini-2.5-flash-lite
 
 # ======================== VIDEO PROCESSING ========================
 # Speed multiplier (default: 1.6)
package/ARCHITECTURE.md
CHANGED

@@ -525,6 +525,7 @@ Directories skipped during recursive discovery: `node_modules`, `.git`, `compres
 | Stage | Skip Condition |
 |-------|----------------|
 | **Compression** | `compressed/{video}/segment_*.mp4` exist on disk |
+| **No-compress split** | `--no-compress` flag: raw keyframe split via `ffmpeg -c copy` (no re-encoding) |
 | **Firebase upload** | File already exists at `calls/{name}/segments/{video}/` (bypassed by `--force-upload`) |
 | **Storage URL → Gemini** | Firebase download URL available (bypassed by `--no-storage-url`) |
 | **Gemini analysis** | Run file exists in `gemini_runs/` AND user chooses not to re-analyze |

@@ -577,6 +578,11 @@ JSONL structured format includes phase spans with timing metrics for observabili
 | Sharpening | `unsharp=3:3:0.3` | Preserve text clarity |
 | x264 params | `aq-mode=3:deblock=-1,-1:psy-rd=1.0,0.0` | Text readability |
 | Audio | AAC, 64–128k, original sample rate | Clear speech |
+| Speed | 1.6× default (`--speed` flag, env `VIDEO_SPEED`) | Reduce tokens per segment |
+| Segment Duration | 280s default, compress mode only (`--segment-time` flag) | Context budget per segment |
+| No-Compress Mode | Off by default (`--no-compress` flag) | Stream-copy split at 1200s (20 min), no re-encoding |
+
+> **Google Gemini constraints:** ~300 tokens/sec (default res), ~100 tok/sec (low res). File API: 2 GB/file (free), 20 GB (paid). Max ~1 hour at default res per 1M context window.
 
 ---
 
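The compression-settings table and the Gemini constraints note above imply a simple per-segment token estimate. A minimal sketch of that arithmetic, assuming the ~300 tokens/sec figure from the note (actual tokenization varies with resolution and frame sampling):

```javascript
// Rough per-segment token estimate implied by the table above.
// Assumes ~300 tokens/sec at default resolution; speeding playback up
// shortens the effective video, so it costs proportionally fewer tokens.
function estimateSegmentTokens(segmentTimeSec, speed) {
  const TOKENS_PER_SEC = 300;
  const effectiveVideoSec = segmentTimeSec / speed; // sped-up video is shorter
  return Math.round(effectiveVideoSec * TOKENS_PER_SEC);
}

// Defaults: 280s segments at 1.6x → 175s of effective video ≈ 52,500 tokens
console.log(estimateSegmentTokens(280, 1.6)); // 52500
```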
package/QUICK_START.md
CHANGED

@@ -150,8 +150,8 @@ taskex --name "Your Name" --skip-upload "my-meeting"
 ### What happens
 
 The pipeline will:
-1. **Compress** the video (~30s)
-2. **Segment** it into ≤5 min chunks
+1. **Compress** the video (~30s) — or **split raw** with `--no-compress`
+2. **Segment** it into ≤5 min chunks (configurable with `--segment-time` in compress mode)
 3. **Upload** segments to Firebase Storage (if configured)
 4. **Analyze** each segment with Gemini AI — uses Firebase Storage URL directly when available (skips separate Gemini upload)
 5. **Quality check** — retry weak segments automatically (reuses file reference — no re-upload)

@@ -162,6 +162,8 @@ The pipeline will:
 
 > **Tip:** Use `--force-upload` to re-upload files that already exist in Storage. Use `--no-storage-url` to bypass Storage URL optimization and force Gemini File API uploads.
 
+> **Tip:** Use `--no-compress` to skip re-encoding (auto-splits at 20 min). Use `--speed 2.0` to speed up compressed playback (saves tokens), or `--segment-time 600` for longer compressed segments.
+
 This takes **~2-5 minutes** depending on video length.
 
 ---
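The two tips above trade segment count against per-segment size. A small sketch of how `--speed` and `--segment-time` change the number of segments for a 50-minute recording, assuming segments are cut from the sped-up output (as the compress-mode pipeline does):

```javascript
// How --speed and --segment-time interact for a 50-minute recording
// (compress mode only; this sketches the arithmetic, not the package API).
function segmentCount(videoSec, speed, segTimeSec) {
  const effectiveSec = videoSec / speed; // playback is sped up before splitting
  return Math.ceil(effectiveSec / segTimeSec);
}

const videoSec = 50 * 60;
console.log(segmentCount(videoSec, 1.6, 280)); // defaults → 7 segments
console.log(segmentCount(videoSec, 1.6, 600)); // --segment-time 600 → 4 segments
console.log(segmentCount(videoSec, 2.0, 280)); // --speed 2.0 → 6 segments
```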
package/README.md
CHANGED

@@ -241,6 +241,23 @@ Skip parts of the pipeline you don't need:
 | `--skip-compression` | Video compression | You already compressed/segmented the video |
 | `--skip-gemini` | AI analysis entirely | You just want to compress & upload |
 
+### Video Processing Flags
+
+Control how video is processed before AI analysis:
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--no-compress` | off | Skip re-encoding — pass raw video to Gemini (auto-splits at 20 min) |
+| `--speed <n>` | `1.6` | Playback speed multiplier (compress mode only) |
+| `--segment-time <n>` | `280` | Segment duration in seconds, compress mode only (30–3600) |
+
+**Duration constraints** (per [Google Gemini docs](https://ai.google.dev/gemini-api/docs/vision#video)):
+- Default resolution: ~300 tokens/sec → max ~55 min/segment (recommended: ≤20 min)
+- File API limit: 2 GB/file (free) / 20 GB (paid)
+- Supported formats: mp4, mpeg, mov, avi, x-flv, mpg, webm, wmv, 3gpp
+
+> **Tip:** Use `--no-compress` for large, high-quality recordings that you want to analyze at original quality. Raw video is auto-split at 20-minute intervals via `ffmpeg -c copy` (stream-copy). `--speed` and `--segment-time` only apply to compression mode.
+
 ### Tuning Flags
 
 **You probably don't need these.** The defaults work well. These are for power users:

@@ -282,6 +299,7 @@ OUTPUT --format <md|html|json|pdf|docx|all> --min-confidence <high|medium|l
 --no-html
 UPLOAD --force-upload --no-storage-url
 SKIP --skip-compression --skip-gemini
+VIDEO --no-compress --speed <n> --segment-time <n>
 DYNAMIC --request <text>
 PROGRESS --repo <path>
 TUNING --thinking-budget --compilation-thinking-budget --parallel
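The "max ~55 min/segment" figure in the README's constraints list follows from the token rate. A sketch of the arithmetic only; real limits also depend on prompt and output size:

```javascript
// Where "max ~55 min/segment" comes from: a 1M-token context window
// filled at ~300 tokens/sec (default resolution, per the Gemini docs).
const TOKENS_PER_SEC = 300;
const CONTEXT_TOKENS = 1_000_000;

const maxSegmentSec = Math.floor(CONTEXT_TOKENS / TOKENS_PER_SEC); // 3333s
console.log(Math.floor(maxSegmentSec / 60)); // 55 (minutes)
```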
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "task-summary-extractor",
-  "version": "9.4.0",
+  "version": "9.5.0",
   "description": "AI-powered meeting analysis & document generation CLI — video + document processing, deep dive docs, dynamic mode, interactive CLI with model selection, confidence scoring, learning loop, git progress tracking",
   "main": "process_and_upload.js",
   "bin": {
package/src/phases/discover.js
CHANGED

@@ -80,6 +80,7 @@ async function phaseDiscover(ctx) {
   if (opts.skipUpload) activeFlags.push('skip-upload');
   if (opts.forceUpload) activeFlags.push('force-upload');
   if (opts.noStorageUrl) activeFlags.push('no-storage-url');
+  if (opts.noCompress) activeFlags.push('no-compress');
   if (opts.skipCompression) activeFlags.push('skip-compression');
   if (opts.skipGemini) activeFlags.push('skip-gemini');
   if (opts.resume) activeFlags.push('resume');
package/src/phases/init.js
CHANGED

@@ -50,7 +50,7 @@ async function phaseInit() {
     skipUpload: !!flags['skip-upload'],
     forceUpload: !!flags['force-upload'],
     noStorageUrl: !!flags['no-storage-url'],
-    skipCompression: !!flags['skip-compression'],
+    skipCompression: !!flags['skip-compression'], // DEPRECATED — use --no-compress
     skipGemini: !!flags['skip-gemini'],
     resume: !!flags.resume,
     reanalyze: !!flags.reanalyze,

@@ -66,6 +66,10 @@ async function phaseInit() {
     disableLearning: !!flags['no-learning'],
     disableDiff: !!flags['no-diff'],
     noHtml: !!flags['no-html'],
+    // Video processing flags
+    noCompress: !!flags['no-compress'],
+    speed: flags.speed ? parseFloat(flags.speed) : null,
+    segmentTime: flags['segment-time'] ? parseInt(flags['segment-time'], 10) : null,
     deepDive: !!flags['deep-dive'],
     deepSummary: !!flags['deep-summary'],
     deepSummaryExclude: typeof flags['exclude-docs'] === 'string'

@@ -123,6 +127,55 @@ async function phaseInit() {
     }
   }
 
+  // --- Validate video processing flags ---
+  if (opts.noCompress) {
+    // --no-compress: raw passthrough — speed and segment-time are not user-configurable
+    if (opts.speed !== null) {
+      console.log(c.warn(' ⚠ --speed is ignored with --no-compress (raw video is not re-encoded)'));
+      opts.speed = null;
+    }
+    if (opts.segmentTime !== null) {
+      console.log(c.warn(' ⚠ --segment-time is ignored with --no-compress (auto: 1200s / 20 min per segment)'));
+      opts.segmentTime = null;
+    }
+    if (opts.skipCompression) {
+      console.log(c.warn(' ⚠ --skip-compression is redundant with --no-compress — ignoring'));
+      opts.skipCompression = false;
+    }
+  } else {
+    if (opts.speed !== null) {
+      if (Number.isNaN(opts.speed) || opts.speed < 0.1 || opts.speed > 10) {
+        throw new Error(`Invalid --speed "${flags.speed}". Must be between 0.1 and 10.`);
+      }
+    }
+    if (opts.segmentTime !== null) {
+      if (Number.isNaN(opts.segmentTime) || opts.segmentTime < 30 || opts.segmentTime > 3600) {
+        throw new Error(`Invalid --segment-time "${flags['segment-time']}". Must be between 30 and 3600 seconds.`);
+      }
+      // Duration-aware validation (Google Gemini: ~300 tokens/sec at default resolution)
+      const TOKENS_PER_SEC = 300;
+      const CONTEXT_LIMIT = 1_048_576;
+      const SAFE_VIDEO_BUDGET = CONTEXT_LIMIT * 0.6; // 60% for video, rest for prompt+docs+output
+      const effectiveSpeed = opts.speed || 1.0;
+      const effectiveVideoSec = opts.segmentTime / effectiveSpeed;
+      const estimatedTokens = Math.round(effectiveVideoSec * TOKENS_PER_SEC);
+
+      if (estimatedTokens > CONTEXT_LIMIT) {
+        throw new Error(
+          `--segment-time ${opts.segmentTime}s exceeds Gemini context window! ` +
+          `Estimated ${(estimatedTokens / 1000).toFixed(0)}K tokens/segment (limit: 1,048K). ` +
+          `Reduce to ≤${Math.floor((CONTEXT_LIMIT / TOKENS_PER_SEC) * effectiveSpeed)}s.`
+        );
+      }
+      if (estimatedTokens > SAFE_VIDEO_BUDGET) {
+        console.log(c.warn(
+          ` ⚠ --segment-time ${opts.segmentTime}s is very large (~${(estimatedTokens / 1000).toFixed(0)}K tokens/segment). ` +
+          `Recommended: ≤${Math.floor((SAFE_VIDEO_BUDGET / TOKENS_PER_SEC) * effectiveSpeed)}s to leave room for prompt & output.`
+        ));
+      }
+    }
+  }
+
   // --- Validate min-confidence level ---
   if (opts.minConfidence) {
     const { validateConfidenceLevel } = require('../utils/confidence-filter');

@@ -318,6 +371,15 @@ function _printRunSummary(opts, modelId, models, targetDir) {
     console.log(` ${c.dim('Disabled:')} ${disabled.join(c.dim(' · '))}`);
   }
 
+  // Video processing settings
+  const { SPEED, SEG_TIME } = require('../config');
+  const effectiveSpeed = opts.noCompress ? 1.0 : (opts.speed || SPEED);
+  const effectiveSegTime = opts.noCompress ? 1200 : (opts.segmentTime || SEG_TIME);
+  const videoMode = opts.noCompress
+    ? c.cyan('raw (stream-copy, auto-split at 20 min)')
+    : c.green(`compress × ${effectiveSpeed}x | ${effectiveSegTime}s segments`);
+  console.log(` ${c.dim('Video:')} ${videoMode}`);
+
   if (opts.runMode) {
     console.log(` ${c.dim('Run mode:')} ${c.bold(opts.runMode)}`);
   }
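The validation block above reserves roughly 60% of the context window for video and scales the cap by playback speed (faster playback means less effective video per second of segment). A standalone sketch of that rule, using the same constants as the diff:

```javascript
// Recommended --segment-time cap implied by the validation in init.js:
// (segTime / speed) seconds of effective video at ~300 tokens/sec must fit
// within ~60% of the 1,048,576-token context window.
const TOKENS_PER_SEC = 300;
const CONTEXT_LIMIT = 1_048_576;
const SAFE_VIDEO_BUDGET = CONTEXT_LIMIT * 0.6;

function maxRecommendedSegTime(speed) {
  return Math.floor((SAFE_VIDEO_BUDGET / TOKENS_PER_SEC) * speed);
}

console.log(maxRecommendedSegTime(1.0)); // 2097 (seconds)
console.log(maxRecommendedSegTime(1.6)); // 3355 (seconds)
```

Note the hard flag range (30–3600s) already keeps most values inside this budget; the warning only fires for very long segments at low speeds.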
package/src/phases/process-media.js
CHANGED

@@ -10,7 +10,7 @@ const { AUDIO_EXTS, SPEED } = config;
 // --- Services ---
 const { uploadToStorage, storageExists } = require('../services/firebase');
 const { processWithGemini, cleanupGeminiFiles } = require('../services/gemini');
-const { compressAndSegment, compressAndSegmentAudio, probeFormat, verifySegment } = require('../services/video');
+const { compressAndSegment, compressAndSegmentAudio, splitOnly, probeFormat, verifySegment } = require('../services/video');
 
 // --- Utils ---
 const { fmtDuration, fmtBytes } = require('../utils/format');

@@ -60,6 +60,12 @@ async function phaseProcessVideo(ctx, videoPath, videoIndex) {
     ? fs.readdirSync(segmentDir).filter(f => f.startsWith('segment_') && (f.endsWith('.mp4') || f.endsWith('.m4a'))).sort()
     : [];
 
+  // Build video processing options from CLI flags
+  // --no-compress uses hardcoded 1200s (splitOnly default); --segment-time only for compress mode
+  const videoOpts = {};
+  if (!opts.noCompress && opts.segmentTime) videoOpts.segTime = opts.segmentTime;
+  if (!opts.noCompress && opts.speed) videoOpts.speed = opts.speed;
+
   if (opts.skipCompression || opts.dryRun) {
     if (existingSegments.length > 0) {
       segments = existingSegments.map(f => path.join(segmentDir, f));

@@ -70,18 +76,23 @@ async function phaseProcessVideo(ctx, videoPath, videoIndex) {
         console.log(` ${c.dim(`[DRY-RUN] Would compress "${path.basename(videoPath)}" into segments`)}`);
         return { fileResult: null, segmentAnalyses: [] };
       }
-      segments = compressAndSegment(videoPath, segmentDir);
+      segments = compressAndSegment(videoPath, segmentDir, videoOpts);
       log.step(`Compressed → ${segments.length} segment(s)`);
     }
   } else if (existingSegments.length > 0) {
     segments = existingSegments.map(f => path.join(segmentDir, f));
     log.step(`SKIP compression — ${segments.length} segment(s) already on disk`);
     console.log(` ${c.success(`Skipped compression \u2014 ${c.highlight(segments.length)} segment(s) already exist`)}`);
+  } else if (opts.noCompress) {
+    // --no-compress: split raw video at keyframes, no re-encoding
+    segments = splitOnly(videoPath, segmentDir, videoOpts);
+    log.step(`Split (raw) → ${segments.length} segment(s)`);
+    console.log(` \u2192 ${c.highlight(segments.length)} raw segment(s) created`);
   } else {
     if (isAudio) {
-      segments = compressAndSegmentAudio(videoPath, segmentDir);
+      segments = compressAndSegmentAudio(videoPath, segmentDir, videoOpts);
     } else {
-      segments = compressAndSegment(videoPath, segmentDir);
+      segments = compressAndSegment(videoPath, segmentDir, videoOpts);
     }
     log.step(`Compressed → ${segments.length} segment(s)`);
     console.log(` \u2192 ${c.highlight(segments.length)} segment(s) created`);

@@ -90,6 +101,20 @@ async function phaseProcessVideo(ctx, videoPath, videoIndex) {
   progress.markCompressed(baseName, segments.length);
   const origSize = fs.statSync(videoPath).size;
   log.step(`original=${(origSize / 1048576).toFixed(2)}MB (${fmtBytes(origSize)}) | ${segments.length} segment(s)`);
+
+  // Duration-aware warnings for raw segments
+  if (opts.noCompress && segments.length > 0) {
+    const totalSegSize = segments.reduce((s, p) => s + fs.statSync(p).size, 0);
+    const avgSegMB = totalSegSize / segments.length / 1048576;
+    if (avgSegMB > 500) {
+      console.warn(` ${c.warn(`Avg segment ~${avgSegMB.toFixed(0)} MB — large raw segments take longer to upload.`)}`);
+      console.warn(` ${c.dim(' Tip: remove --no-compress to re-encode into smaller segments.')}`);
+    }
+    // All raw segments must use Gemini File API (>20 MB external URL limit)
+    if (avgSegMB > 20) {
+      console.log(` ${c.dim('Raw segments >20 MB — will use Gemini File API upload (not storage URLs).')}`);
+    }
+  }
   console.log('');
 
   const fileResult = {

@@ -178,10 +203,12 @@ async function phaseProcessVideo(ctx, videoPath, videoIndex) {
   }
 
   // Calculate cumulative time offsets for VTT time-slicing
+  // When --no-compress is active, segments play at real time (speed = 1.0)
+  const effectiveSpeed = opts.noCompress ? 1.0 : (opts.speed || SPEED);
   let cumulativeTimeSec = 0;
   for (const meta of segmentMeta) {
     meta.startTimeSec = cumulativeTimeSec;
-    meta.endTimeSec = cumulativeTimeSec + (meta.durSec || 0) * SPEED;
+    meta.endTimeSec = cumulativeTimeSec + (meta.durSec || 0) * effectiveSpeed;
     cumulativeTimeSec = meta.endTimeSec;
   }
 
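The 20 MB check above reflects how raw segments get routed to Gemini: larger files go through the File API rather than storage URLs. A minimal sketch of that routing decision (the threshold constant and route names here are illustrative, not the package's internal API):

```javascript
// Route a segment by size, mirroring the warning logic in process-media.js:
// raw segments above ~20 MB are uploaded via the Gemini File API instead of
// being referenced by storage URL.
function uploadRoute(segmentBytes) {
  const URL_LIMIT_MB = 20; // illustrative threshold, as noted in the diff above
  const mb = segmentBytes / 1048576;
  return mb > URL_LIMIT_MB ? 'gemini-file-api' : 'storage-url';
}

console.log(uploadRoute(900 * 1048576)); // 'gemini-file-api'
console.log(uploadRoute(5 * 1048576));   // 'storage-url'
```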
package/src/pipeline.js
CHANGED

@@ -136,9 +136,10 @@ async function run() {
   userName: fullCtx.userName,
   inputMode: ctx.inputMode,
   settings: {
-    speed: SPEED,
-    segmentTimeSec: SEG_TIME,
-
+    speed: fullCtx.opts.noCompress ? 1.0 : (fullCtx.opts.speed || SPEED),
+    segmentTimeSec: fullCtx.opts.noCompress ? 1200 : (fullCtx.opts.segmentTime || SEG_TIME),
+    noCompress: !!fullCtx.opts.noCompress,
+    ...(fullCtx.opts.noCompress ? {} : { preset: PRESET }),
     geminiModel: config.GEMINI_MODEL,
     thinkingBudget: fullCtx.opts.thinkingBudget,
   },
package/src/services/video.js
CHANGED

@@ -13,7 +13,7 @@ const { execSync, spawnSync } = require('child_process');
 const fs = require('fs');
 const path = require('path');
 const { SPEED, SEG_TIME, PRESET } = require('../config');
-const { fmtDuration } = require('../utils/format');
+const { fmtDuration, fmtBytes } = require('../utils/format');
 const { c } = require('../utils/colors');
 
 // ======================== BINARY DETECTION ========================

@@ -103,17 +103,19 @@ function verifySegment(segPath) {
 
 /**
  * Build the common ffmpeg encoding args (video + audio filters/codecs).
+ * @param {string} inputFile
+ * @param {{ speed?: number }} [overrides]
  * Returns { encodingArgs, effectiveDuration }.
  */
-function buildEncodingArgs(inputFile) {
+function buildEncodingArgs(inputFile, { speed = SPEED } = {}) {
   const width = parseInt(probe(inputFile, 'v:0', 'width') || '0');
   const channels = parseInt(probe(inputFile, 'a:0', 'channels') || '1');
   const sampleRate = probe(inputFile, 'a:0', 'sample_rate') || '16000';
   const duration = probeFormat(inputFile, 'duration');
   const durationSec = duration ? parseFloat(duration) : null;
-  const effectiveDuration = durationSec ? durationSec / SPEED : null;
+  const effectiveDuration = durationSec ? durationSec / speed : null;
 
-  let vf = `setpts=PTS/${SPEED}`;
+  let vf = `setpts=PTS/${speed}`;
   let crf = 24;
   let tune = ['-tune', 'stillimage'];
   let profile = ['-profile:v', 'main'];

@@ -122,21 +124,21 @@ function buildEncodingArgs(inputFile) {
 
   if (width > 1920) {
     // 4K+ → scale to 1080p
-    vf = `scale=1920:1080,unsharp=3:3:0.3,setpts=PTS/${SPEED}`;
+    vf = `scale=1920:1080,unsharp=3:3:0.3,setpts=PTS/${speed}`;
     crf = 20;
     tune = [];
     profile = ['-profile:v', 'high'];
     audioBr = '128k';
   } else if (width > 0) {
     // Meeting / screenshare
-    vf = `unsharp=3:3:0.3,setpts=PTS/${SPEED}`;
+    vf = `unsharp=3:3:0.3,setpts=PTS/${speed}`;
   }
 
   if (channels === 2) audioBr = '128k';
 
   const encodingArgs = [
     '-vf', vf,
-    '-af', `atempo=${SPEED}`,
+    '-af', `atempo=${speed}`,
     '-c:v', 'libx264', '-crf', String(crf), '-preset', PRESET,
     ...tune,
     '-x264-params', x264p,

@@ -146,7 +148,7 @@ function buildEncodingArgs(inputFile) {
     '-movflags', '+faststart',
   ];
 
-  return { encodingArgs, effectiveDuration, width, crf, audioBr, duration };
+  return { encodingArgs, effectiveDuration, width, crf, audioBr, duration, speed };
 }
 
 /**

@@ -155,27 +157,28 @@ function buildEncodingArgs(inputFile) {
 * - Long videos → segment muxer for splitting.
 * - Post-compression validation: verifies each output has a valid moov atom.
 *   Corrupt segments are re-encoded individually with the regular MP4 muxer.
+* @param {{ segTime?: number, speed?: number }} [opts]
 * Returns sorted array of segment file paths.
 */
-function compressAndSegment(inputFile, outputDir) {
-  const { encodingArgs, effectiveDuration, width, crf, audioBr, duration } = buildEncodingArgs(inputFile);
+function compressAndSegment(inputFile, outputDir, { segTime = SEG_TIME, speed = SPEED } = {}) {
+  const { encodingArgs, effectiveDuration, width, crf, audioBr, duration } = buildEncodingArgs(inputFile, { speed });
 
   fs.mkdirSync(outputDir, { recursive: true });
 
   console.log(` Resolution : ${width > 0 ? width + 'p' : 'unknown'}`);
-  console.log(` Duration : ${duration ? fmtDuration(parseFloat(duration)) : 'unknown'}${effectiveDuration ? ` (${fmtDuration(effectiveDuration)} at ${SPEED}x)` : ''}`);
-  console.log(` CRF ${crf} | ${audioBr} audio | ${SPEED}x speed`);
+  console.log(` Duration : ${duration ? fmtDuration(parseFloat(duration)) : 'unknown'}${effectiveDuration ? ` (${fmtDuration(effectiveDuration)} at ${speed}x)` : ''}`);
+  console.log(` CRF ${crf} | ${audioBr} audio | ${speed}x speed`);
 
   // Decide: single output vs segmented
-  const needsSegmentation = effectiveDuration === null || effectiveDuration > SEG_TIME;
+  const needsSegmentation = effectiveDuration === null || effectiveDuration > segTime;
 
   if (needsSegmentation) {
-    console.log(` Compressing (segmented, ${SEG_TIME}s chunks)...`);
+    console.log(` Compressing (segmented, ${segTime}s chunks)...`);
     const args = [
       '-y', '-err_detect', 'ignore_err', '-fflags', '+genpts+discardcorrupt',
       '-i', inputFile,
       ...encodingArgs,
-      '-f', 'segment', '-segment_time', String(SEG_TIME), '-reset_timestamps', '1',
+      '-f', 'segment', '-segment_time', String(segTime), '-reset_timestamps', '1',
       '-map', '0:v:0', '-map', '0:a:0',
       path.join(outputDir, 'segment_%02d.mp4'),
     ];

@@ -248,7 +251,7 @@ function compressAndSegment(inputFile, outputDir) {
     const rsArgs = [
       '-y', '-i', fallbackPath,
       '-c', 'copy',
-      '-f', 'segment', '-segment_time', String(SEG_TIME), '-reset_timestamps', '1',
+      '-f', 'segment', '-segment_time', String(segTime), '-reset_timestamps', '1',
       '-movflags', '+faststart',
       path.join(reSegDir, 'segment_%02d.mp4'),
     ];

@@ -302,34 +305,34 @@ function compressAndSegment(inputFile, outputDir) {
 *
 * Returns sorted array of segment file paths.
 */
-function compressAndSegmentAudio(inputFile, outputDir) {
+function compressAndSegmentAudio(inputFile, outputDir, { segTime = SEG_TIME, speed = SPEED } = {}) {
   fs.mkdirSync(outputDir, { recursive: true });
 
   const duration = probeFormat(inputFile, 'duration');
   const durationSec = duration ? parseFloat(duration) : null;
-  const effectiveDuration = durationSec ? durationSec / SPEED : null;
+  const effectiveDuration = durationSec ? durationSec / speed : null;
   const channels = parseInt(probe(inputFile, 'a:0', 'channels') || '1', 10);
   const sampleRate = probe(inputFile, 'a:0', 'sample_rate') || '16000';
   const audioBr = channels >= 2 ? '128k' : '64k';
 
-  console.log(` Duration : ${duration ? fmtDuration(parseFloat(duration)) : 'unknown'}${effectiveDuration ? ` (${fmtDuration(effectiveDuration)} at ${SPEED}x)` : ''}`);
-  console.log(` Audio-only mode | ${SPEED}x speed | ${audioBr} bitrate`);
+  console.log(` Duration : ${duration ? fmtDuration(parseFloat(duration)) : 'unknown'}${effectiveDuration ? ` (${fmtDuration(effectiveDuration)} at ${speed}x)` : ''}`);
+  console.log(` Audio-only mode | ${speed}x speed | ${audioBr} bitrate`);
 
   const encodingArgs = [
-    '-af', `atempo=${SPEED}`,
+    '-af', `atempo=${speed}`,
     '-c:a', 'aac', '-b:a', audioBr, '-ar', sampleRate, '-ac', String(channels),
     '-vn', // no video
     '-movflags', '+faststart',
   ];
 
-  const needsSegmentation = effectiveDuration === null || effectiveDuration > SEG_TIME;
+  const needsSegmentation = effectiveDuration === null || effectiveDuration > segTime;
 
   if (needsSegmentation) {
-    console.log(` Compressing (segmented, ${SEG_TIME}s chunks)...`);
+    console.log(` Compressing (segmented, ${segTime}s chunks)...`);
     const args = [
       '-y', '-i', inputFile,
       ...encodingArgs,
-      '-f', 'segment', '-segment_time', String(SEG_TIME), '-reset_timestamps', '1',
+      '-f', 'segment', '-segment_time', String(segTime), '-reset_timestamps', '1',
       path.join(outputDir, 'segment_%02d.m4a'),
     ];
     const result = spawnSync(getFFmpeg(), args, { stdio: 'inherit' });

@@ -383,7 +386,7 @@ function compressAndSegmentAudio(inputFile, outputDir) {
     const rsArgs = [
       '-y', '-i', fallbackPath,
       '-c', 'copy', '-vn',
-      '-f', 'segment', '-segment_time', String(SEG_TIME), '-reset_timestamps', '1',
+      '-f', 'segment', '-segment_time', String(segTime), '-reset_timestamps', '1',
       path.join(reSegDir, 'segment_%02d.m4a'),
     ];
     spawnSync(getFFmpeg(), rsArgs, { stdio: 'inherit' });

@@ -408,12 +411,100 @@ function compressAndSegmentAudio(inputFile, outputDir) {
   return segments;
 }
 
+/**
+ * Split a media file into segments WITHOUT re-encoding (stream copy).
+ * No compression, no speed-up — just fast keyframe-aligned splitting.
+ * For use with --no-compress: passes raw video to Gemini via File API.
+ *
+ * @param {string} inputFile - Path to input media file
+ * @param {string} outputDir - Directory for output segments
+ * @param {{ segTime?: number }} opts - Options (segTime defaults to 1200s for raw mode)
+ * @returns {string[]} Sorted array of segment file paths
+ */
+function splitOnly(inputFile, outputDir, { segTime = 1200 } = {}) {
+  fs.mkdirSync(outputDir, { recursive: true });
+
+  const duration = probeFormat(inputFile, 'duration');
+  const durationSec = duration ? parseFloat(duration) : null;
+  const ext = path.extname(inputFile).toLowerCase();
+  const isAudio = ['.mp3', '.wav', '.m4a', '.ogg', '.flac', '.aac', '.wma'].includes(ext);
+  const outExt = isAudio ? '.m4a' : '.mp4';
+  const width = isAudio ? 0 : parseInt(probe(inputFile, 'v:0', 'width') || '0');
+
+  console.log(` Mode : ${c.cyan('raw split')} (no re-encoding, no speed-up)`);
+  if (!isAudio) console.log(` Resolution : ${width > 0 ? width + 'p' : 'unknown'}`);
+  console.log(` Duration : ${duration ? fmtDuration(durationSec) : 'unknown'}`);
+  console.log(` File size: ${fmtBytes(fs.statSync(inputFile).size)}`);
+
+  const needsSegmentation = durationSec === null || durationSec > segTime;
+
+  if (needsSegmentation) {
+    console.log(` Splitting at keyframes (~${segTime}s chunks)...`);
+    const args = [
+      '-y', '-err_detect', 'ignore_err', '-fflags', '+genpts+discardcorrupt',
+      '-i', inputFile,
+      '-c', 'copy',
+      '-f', 'segment', '-segment_time', String(segTime), '-reset_timestamps', '1',
+      ...(isAudio ? ['-vn'] : ['-map', '0:v:0', '-map', '0:a:0']),
+      '-movflags', '+faststart',
+      path.join(outputDir, `segment_%02d${outExt}`),
+    ];
+    const result = spawnSync(getFFmpeg(), args, { stdio: 'inherit' });
+    if (result.status !== 0) {
+      console.warn(` ${c.warn(`ffmpeg exited with code ${result.status} (output may still be usable)`)}`);
+    }
+  } else {
+    console.log(` Single segment (duration ${fmtDuration(durationSec)} ≤ ${segTime}s) — copying...`);
+    const outPath = path.join(outputDir, `segment_00${outExt}`);
+    const args = [
+      '-y', '-err_detect', 'ignore_err', '-fflags', '+genpts+discardcorrupt',
+      '-i', inputFile,
+      '-c', 'copy',
+      ...(isAudio ? ['-vn'] : ['-map', '0:v:0', '-map', '0:a:0']),
+      '-movflags', '+faststart',
+      outPath,
+    ];
+    const result = spawnSync(getFFmpeg(), args, { stdio: 'inherit' });
+    if (result.status !== 0) {
+      console.warn(` ${c.warn(`ffmpeg exited with code ${result.status}`)}`);
+    }
+  }
+
+  // Collect segments
+  const segments = fs.readdirSync(outputDir)
+    .filter(f => f.startsWith('segment_') && (f.endsWith('.mp4') || f.endsWith('.m4a')))
+    .sort()
+    .map(f => path.join(outputDir, f));
+
+  // Validate
+  const corrupt = segments.filter(s => !verifySegment(s));
+  if (corrupt.length > 0) {
+    console.warn(` ${c.warn(`${corrupt.length} segment(s) may be corrupt (no moov atom):`)}`);
+    corrupt.forEach(s => console.warn(` ${c.error(path.basename(s))}`));
+    console.warn(` ${c.dim('Stream-copy splits at keyframes — some containers may need re-mux.')}`);
+    console.warn(` ${c.dim('Remove --no-compress to re-encode instead.')}`);
+  }
+
+  // Duration validation: warn if any segment exceeds 1 hour (Gemini sweet spot)
+  for (const seg of segments) {
+    const dur = probeFormat(seg, 'duration');
+    if (dur && parseFloat(dur) > 3600) {
+      console.warn(` ${c.warn(`${path.basename(seg)} is ${fmtDuration(parseFloat(dur))} — very long segments use more Gemini tokens.`)}`);
+      console.warn(` ${c.dim(' Consider removing --no-compress to re-encode into shorter segments.')}`);
+      break; // warn once
+    }
+  }
+
+  return segments;
+}
+
 module.exports = {
   findBin,
   probe,
   probeFormat,
   compressAndSegment,
   compressAndSegmentAudio,
+  splitOnly,
   verifySegment,
   getFFmpeg,
   getFFprobe,
package/src/utils/cli.js
CHANGED

@@ -33,7 +33,7 @@ function parseArgs(argv) {
 const BOOLEAN_FLAGS = new Set([
   'help', 'h', 'version', 'v',
   'skip-upload', 'force-upload', 'no-storage-url',
-  'skip-compression', 'skip-gemini',
+  'skip-compression', 'skip-gemini', 'no-compress',
   'resume', 'reanalyze', 'dry-run',
   'dynamic', 'deep-dive', 'deep-summary', 'update-progress',
   'no-focused-pass', 'no-learning', 'no-diff',

@@ -381,12 +381,22 @@ ${f('--format <type>', 'Output: md, html, json, pdf, docx, all — comma-separat
 ${f('--min-confidence <level>', 'Filter: high, medium, low (default: all)')}
 ${f('--output <dir>', 'Custom output directory for results')}
 ${f('--skip-upload', 'Skip Firebase Storage uploads')}
-${f('--skip-compression', 'Use existing segments (
+${f('--skip-compression', 'Use existing segments from previous run (deprecated: auto-detected)')}
 ${f('--skip-gemini', 'Skip AI analysis')}
 ${f('--resume', 'Resume from last checkpoint')}
 ${f('--reanalyze', 'Force re-analysis of all segments')}
 ${f('--dry-run', 'Preview without executing')}
 
+${h('VIDEO PROCESSING')}
+${f('--no-compress', 'Skip re-encoding — pass raw video to Gemini (fast, no quality loss)')}
+${f2('Auto-splits at 20 min (1200s) if needed. --speed and --segment-time are ignored.')}
+${f2('Gemini File API: up to 2 GB/file, ~300 tok/sec at default resolution.')}
+${f('--speed <n>', 'Playback speed multiplier for compression mode (default: 1.6)')}
+${f('--segment-time <n>', 'Segment duration in seconds for compression mode (default: 280)')}
+${f2('Duration constraints (per Google Gemini docs):')}
+${f2(' • Default res: ~300 tok/sec → max ~55 min/segment (safe: ≤20 min)')}
+${f2(' • File API limit: 2 GB (free) / 20 GB (paid) per file')}
+
 ${h('TUNING')}
 ${f('--parallel <n>', 'Max parallel uploads (default: 3)')}
 ${f('--parallel-analysis <n>', 'Concurrent analysis batches (default: 2)')}