task-summary-extractor 8.1.0

# Architecture & Technical Deep Dive

> Internal reference for the pipeline's architecture, processing flows, and design decisions.
> For setup instructions, see [README.md](README.md) · [Quick Start](QUICK_START.md)
> For module map and roadmap, see [EXPLORATION.md](EXPLORATION.md)

---

## Table of Contents

- [Architecture \& Technical Deep Dive](#architecture--technical-deep-dive)
  - [Table of Contents](#table-of-contents)
  - [System Architecture](#system-architecture)
    - [Phase Descriptions](#phase-descriptions)
  - [Pipeline Phases](#pipeline-phases)
  - [Per-Segment Processing](#per-segment-processing)
    - [File Resolution Strategies](#file-resolution-strategies)
    - [Quality Gate Decision Table](#quality-gate-decision-table)
  - [Smart Change Detection](#smart-change-detection)
    - [Correlation Strategies](#correlation-strategies)
    - [Assessment Thresholds](#assessment-thresholds)
  - [Extraction Schema](#extraction-schema)
    - [Categories](#categories)
    - [Personalized Task Section](#personalized-task-section)
    - [Confidence Scoring](#confidence-scoring)
  - [JSON Parser — 5-Strategy Extraction](#json-parser--5-strategy-extraction)
  - [Quality Gate — 4-Dimension Scoring](#quality-gate--4-dimension-scoring)
  - [Learning Loop — Self-Improving Budgets](#learning-loop--self-improving-budgets)
  - [Cross-Segment Continuity](#cross-segment-continuity)
  - [Diff Engine — Cross-Run Intelligence](#diff-engine--cross-run-intelligence)
  - [Deep Dive Mode](#deep-dive-mode)
  - [Dynamic Mode](#dynamic-mode)
    - [Dynamic Mode Categories](#dynamic-mode-categories)
  - [Document Context Processing](#document-context-processing)
  - [Skip Logic / Caching](#skip-logic--caching)
  - [Logging](#logging)
  - [Tech Stack](#tech-stack)
  - [Video Encoding Parameters](#video-encoding-parameters)
  - [Gemini Run Record Format](#gemini-run-record-format)
  - [See Also](#see-also)

---

## System Architecture

```mermaid
flowchart TB
  subgraph Entry["Entry Point"]
    EP["taskex (bin/taskex.js)\nor process_and_upload.js"]
  end

  subgraph Pipeline["pipeline.js — Multi-Mode Orchestrator"]
    direction TB
    P1["Phase 1: Init + Interactive Selection"]
    P2["Phase 2: Discover"]
    P3["Phase 3: Services"]
    P4["Phase 4: Process Videos"]
    P5["Phase 5: Compile"]
    P6["Phase 6: Output"]
    P7["Phase 7: Health Dashboard"]
    P8["Phase 8: Summary"]
    P9["Phase 9: Deep Dive (optional)"]

    P1 --> P2 --> P3 --> P4 --> P5 --> P6 --> P7 --> P8 --> P9
  end

  subgraph AltModes["Alternative Modes"]
    UP["--update-progress"]
    DYN["--dynamic"]
  end

  subgraph Services["Services"]
    GEM["gemini.js"]
    FB["firebase.js"]
    VID["video.js"]
    GIT["git.js"]
  end

  subgraph Utils["Utilities — 19 modules"]
    QG["quality-gate"]
    FR["focused-reanalysis"]
    LL["learning-loop"]
    DE["diff-engine"]
    CD["change-detector"]
    PU["progress-updater"]
    CM["context-manager"]
    JP["json-parser"]
    AB["adaptive-budget"]
    HD["health-dashboard"]
    DD["deep-dive"]
    DM["dynamic-mode"]
    OT["+ 7 more"]
  end

  subgraph Renderers["Renderers"]
    MD["markdown.js"]
  end

  EP --> Pipeline
  P1 -.->|"--update-progress"| UP
  P1 -.->|"--dynamic"| DYN
  Pipeline --> Services
  Pipeline --> Utils
  Pipeline --> Renderers
  UP --> GIT
  UP --> CD
  UP --> PU
  UP --> GEM
  DYN --> DM
  DYN --> GEM
```

### Phase Descriptions

| Phase | Name | What Happens |
|-------|------|-------------|
| 1 | **Init** | CLI parsing, interactive folder selection (if no arg), config validation, logger setup, load learning insights, route to dynamic/progress mode |
| 2 | **Discover** | Find videos, discover documents, resolve user name, check resume state |
| 3 | **Services** | Firebase auth, Gemini init, prepare document parts |
| 4 | **Process** | Compress → Upload → Analyze → Quality Gate → Retry → Focused Pass |
| 5 | **Compile** | Cross-segment compilation, diff engine comparison |
| 6 | **Output** | Write JSON, render Markdown, upload to Firebase |
| 7 | **Health** | Quality metrics dashboard, cost breakdown |
| 8 | **Summary** | Save learning history, print run summary |
| 9 | **Deep Dive** | (optional, `--deep-dive`) Topic discovery + explanatory document generation |

---

## Pipeline Phases

```mermaid
flowchart LR
  subgraph P1["Phase 1: Init"]
    CLI["Parse CLI args"]
    CFG["Validate config"]
    LOG["Init logger"]
    LRN["Load learning history"]
  end

  subgraph P2["Phase 2: Discover"]
    VID["Find videos"]
    DOC["Find documents"]
    USR["Resolve user name"]
  end

  subgraph P3["Phase 3: Services"]
    FB["Firebase auth"]
    AI["Gemini init"]
    DPR["Prepare docs"]
  end

  subgraph P4["Phase 4: Process"]
    CMP["Compress"]
    UPL["Upload"]
    ANZ["Analyze"]
    QG["Quality Gate"]
    RTY["Retry"]
    FOC["Focused Pass"]
  end

  subgraph P5["Phase 5: Compile"]
    CFL["Final Compilation"]
    DIF["Diff Engine"]
  end

  subgraph P6["Phase 6: Output"]
    JSON["results.json"]
    MDR["results.md"]
    FBU["Firebase upload"]
  end

  subgraph P7["Phase 7: Health"]
    HD["Health Dashboard"]
  end

  subgraph P8["Phase 8: Summary"]
    SAV["Save learning history"]
    SUM["Print summary"]
  end

  P1 --> P2 --> P3 --> P4 --> P5 --> P6 --> P7 --> P8
```

---

## Per-Segment Processing

Each video segment goes through this flow (Phase 4 detail):

```mermaid
flowchart TB
  START(["Segment N"]) --> COMPRESS["ffmpeg compress\nH.264 CRF 24, 1.5x speed"]
  COMPRESS --> VERIFY["Verify segment integrity"]
  VERIFY --> UPLOAD_FB["Upload to Firebase Storage\n→ download URL"]

  UPLOAD_FB --> RESOLVE{"File Resolution\n3-Strategy Hierarchy"}

  RESOLVE -->|"Strategy A\nRetry/Focused pass"| REUSE["Reuse existing\nGemini File API URI"]
  RESOLVE -->|"Strategy B\nFirebase URL available"| EXTURL["Use Firebase download URL\nas Gemini External URL\n(skip File API upload)\n(disabled by --no-storage-url)"]
  RESOLVE -->|"Strategy C\nFallback"| UPLOAD_GEM["Upload to Gemini File API"]
  UPLOAD_GEM --> WAIT["Poll until ACTIVE"]

  REUSE & EXTURL & WAIT --> ANALYZE["Gemini AI Analysis\nVideo + Docs + Prompt + Context"]
  ANALYZE --> PARSE["JSON Parser\n5-strategy extraction"]
  PARSE --> QUALITY{"Quality Gate\nScore 0-100"}

  QUALITY -->|"Score < 45"| RETRY["Auto-Retry\nwith corrective hints\n(reuses URI — Strategy A)"]
  RETRY --> ANALYZE
  QUALITY -->|"Score 45-59\nweak areas"| FOCUS["Focused Re-Analysis\ntargeted second pass\n(reuses URI — Strategy A)"]
  FOCUS --> MERGE["Merge focused results"]
  QUALITY -->|"Score >= 60"| CLEANUP
  MERGE --> CLEANUP["Cleanup: delete\nGemini File API uploads"]
  CLEANUP --> NEXT(["Next Segment"])

  NEXT --> CTX["Inject into cross-segment context"]
```

### File Resolution Strategies

The pipeline uses a 3-strategy hierarchy to avoid redundant uploads:

| Strategy | When Used | What Happens | Trade-off |
|----------|-----------|-------------|---------|
| **A: Reuse URI** | Retry or focused re-analysis pass | Uses the Gemini File API URI or External URL from the first analysis | Zero upload — instant |
| **B: Storage URL** | Firebase upload succeeded, segment available via HTTPS | Uses the Firebase Storage download URL directly as a Gemini External URL | Skips Gemini File API upload + polling entirely |
| **C: File API Upload** | Fallback (no Firebase, `--skip-upload`, `--no-storage-url`, etc.) | Uploads to Gemini File API, polls until ACTIVE | Full upload + processing wait, but always works |

After all passes complete, any Gemini File API uploads are cleaned up (fire-and-forget delete). When Strategy B was used, no cleanup is needed since no Gemini file was created.

> **Upload control flags:** Use `--force-upload` to re-upload segments/documents even if they already exist in Firebase Storage. Use `--no-storage-url` to disable Strategy B and force Gemini File API uploads (Strategy C).
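
The cascade can be sketched as a small helper (a hypothetical illustration; the property and option names are assumptions, not the pipeline's actual internal API):

```javascript
// Sketch of the 3-strategy resolution cascade. `segment` mirrors the state
// the pipeline tracks per video segment; names are illustrative.
function resolveVideoSource(segment, opts = {}) {
  // Strategy A: retry / focused passes reuse the URI from the first analysis
  if (segment.existingUri) {
    return { strategy: 'A', uri: segment.existingUri, needsUpload: false };
  }
  // Strategy B: the Firebase download URL doubles as a Gemini External URL
  if (segment.firebaseUrl && !opts.noStorageUrl) {
    return { strategy: 'B', uri: segment.firebaseUrl, needsUpload: false };
  }
  // Strategy C: fall back to a Gemini File API upload (then poll until ACTIVE)
  return { strategy: 'C', uri: null, needsUpload: true };
}
```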

### Quality Gate Decision Table

| Score | Action |
|-------|--------|
| < 45 | Auto-retry with corrective hints |
| 45–59 with ≥2 weak dimensions | Focused re-analysis on weak areas |
| ≥ 60 | Pass |
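
The table reduces to one routing function. A sketch (the table leaves the 45–59 case with fewer than two weak dimensions unspecified, so this sketch passes it through):

```javascript
// Sketch of the quality-gate decision table above.
function decideAction(score, weakDimensions = 0) {
  if (score < 45) return 'retry';                          // auto-retry with corrective hints
  if (score < 60 && weakDimensions >= 2) return 'focused'; // targeted re-analysis
  return 'pass';                                           // >= 60, or 45-59 without weak areas
}
```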

---

## Smart Change Detection

The `--update-progress` mode tracks which extracted items have been addressed:

```mermaid
flowchart TB
  START(["--update-progress"]) --> INIT["Auto-init git repo\nif call folder has no repo"]
  INIT --> LOAD["Load latest\ncompilation.json"]
  LOAD --> GIT["Git: commits, changed files,\nworking tree, diff summary"]
  LOAD --> DOCS["Doc changes:\nfile mtime comparison"]

  GIT --> ITEMS["Extract trackable items\nfrom analysis"]
  DOCS --> ITEMS

  ITEMS --> CORR["Correlation Engine"]

  CORR --> S1["File Path Match\nscore: +0.4"]
  CORR --> S2["Ticket ID in Commit\nscore: +0.5"]
  CORR --> S3["Keyword Overlap\nscore: +0.3"]
  CORR --> S4["Commit-File Overlap\nscore: +0.15"]

  S1 & S2 & S3 & S4 --> LOCAL["Local Assessment"]

  LOCAL --> AI{"Gemini AI\navailable?"}
  AI -->|Yes| SMART["AI Smart Layer\nReviews all evidence\nAssigns final status"]
  AI -->|No| OUTPUT

  SMART --> OUTPUT(["Output"])
  OUTPUT --> PJ["progress.json"]
  OUTPUT --> PM["progress.md"]
  OUTPUT --> FBU["Firebase upload"]
```

### Correlation Strategies

| Strategy | Score Contribution | How It Works |
|----------|--------------------|-------------|
| **File Path Match** | +0.4 | Git changed files match file paths mentioned in analysis items |
| **Ticket ID in Commit** | +0.5 | Commit messages contain ticket IDs from extracted items |
| **Keyword Overlap** | +0.3 | Keywords from item descriptions appear in commit messages or file names |
| **Commit-File Overlap** | +0.15 | Files touched in commits overlap with files referenced across items |

### Assessment Thresholds

| Correlation Score | Status Assigned |
|-------------------|----------------|
| ≥ 0.6 | **DONE** ✅ |
| ≥ 0.25 | **IN_PROGRESS** 🔄 |
| < 0.25 | **NOT_STARTED** ⏳ |
| *(AI override)* | **SUPERSEDED** 🔀 |
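
A minimal sketch of the scoring and thresholding (names are illustrative; evidence detection itself is the hard part and is elided, and SUPERSEDED is only ever assigned by the AI smart layer):

```javascript
// Sketch: sum the contributions of the strategies that fired, then map the
// total to a status using the thresholds above.
const WEIGHTS = { filePath: 0.4, ticketId: 0.5, keywords: 0.3, commitFiles: 0.15 };

function correlationScore(evidence) {
  return Object.keys(WEIGHTS)
    .filter((k) => evidence[k])
    .reduce((sum, k) => sum + WEIGHTS[k], 0);
}

function assessStatus(score) {
  if (score >= 0.6) return 'DONE';
  if (score >= 0.25) return 'IN_PROGRESS';
  return 'NOT_STARTED'; // SUPERSEDED comes only from the AI override
}
```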

---

## Extraction Schema

The AI extracts 6 structured categories from each meeting. The categories are content-adaptive — the AI populates whichever fields are relevant to the actual discussion.

### Categories

| Category | Key Fields | Adapts To |
|----------|-----------|----------|
| **Tickets / Items** | `ticket_id`, `title`, `status`, `assignee`, `reviewer`, `video_segments` with timestamps, `speaker_comments`, `details` with priority, confidence | Sprint items, requirements, interview topics, incident items |
| **Change Requests** | `WHERE` (target: file, system, process, scope), `WHAT` (specific change), `HOW` (approach), `WHY` (justification), `dependencies`, `blocked_by`, confidence | Code changes, requirement changes, process changes, scope adjustments |
| **References** | `name`, `type`, `role`, cross-refs to tickets & CRs, `context_doc_match` | Files, documents, URLs, tools, systems, resources mentioned |
| **Action Items** | `description`, `assigned_to`, `status`, `deadline`, `dependencies`, related tickets & CRs, confidence | Any follow-up work discussed |
| **Blockers** | `description`, `severity`, `owner`, `status`, `proposed_resolution`, confidence | Technical blockers, approval gates, resource constraints |
| **Scope Changes** | `type` (added/removed/deferred), `original` vs `new` scope, `decided_by`, `impact`, confidence | Feature scope, project scope, contract scope, training scope |

### Personalized Task Section

Every analysis includes a `your_tasks` section scoped to the `--name` user:

| Field | Description |
|-------|-------------|
| `owned_tickets` | Items assigned to you |
| `tasks_todo` | Action items with priority |
| `waiting_on_others` | Items blocked on other people |
| `decisions_needed` | Things you need to decide |
| `completed_in_call` | Items resolved during the meeting |
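
An illustrative `your_tasks` payload with the fields above (all values invented; the real value shapes may differ):

```json
{
  "your_tasks": {
    "owned_tickets": ["PROJ-142"],
    "tasks_todo": [{ "task": "Update the migration script", "priority": "high" }],
    "waiting_on_others": [{ "item": "API key approval", "who": "DevOps" }],
    "decisions_needed": ["Choose a retry strategy for webhook failures"],
    "completed_in_call": ["Reviewed the staging config"]
  }
}
```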

### Confidence Scoring

Every extracted item carries a confidence rating:

| Level | Criteria | Example |
|-------|----------|---------|
| **HIGH** | Explicitly stated + corroborated | "Mentioned with ticket ID in VTT and task docs" |
| **MEDIUM** | Partially stated or single-source | "Discussed verbally, no written reference" |
| **LOW** | Inferred from context | "Implied from related discussion" |

---

## JSON Parser — 5-Strategy Extraction

Gemini output is unpredictable. The parser handles it with cascading strategies:

```mermaid
flowchart TB
  RAW(["Raw AI Response"]) --> S1["Strategy 1\nStrip markdown fences"]
  S1 -->|fail| S2["Strategy 2\nBrace-depth matching"]
  S2 -->|fail| S3["Strategy 3\nRegex fence extraction"]
  S3 -->|fail| S4["Strategy 4\nTruncation repair"]
  S4 -->|fail| S5["Strategy 5\nDoubled-closer fix"]

  S1 -->|success| OK
  S2 -->|success| OK
  S3 -->|success| OK
  S4 -->|success| OK
  S5 -->|success| OK

  S1 & S2 & S3 & S4 & S5 -->|"each retries with"| SAN["Escape Sanitizer\nFixes invalid backslash-d backslash-s backslash-w"]
  SAN -->|success| OK(["Parsed JSON"])
  SAN -->|"still fails"| MAL["Malformation Fixer\nDoubled braces, trailing commas"]
  MAL -->|success| OK
  MAL -->|fail| NULL(["null — parse failed"])
```

Each strategy is tried in order. If a strategy fails, it falls through to the next. After each strategy, a sanitizer pass is attempted. This achieves >99% parse success on real Gemini output.
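
The cascade reduces to a loop over strategies, each retried once after sanitizing. A simplified sketch with two toy strategies (the real parser's five strategies and its malformation fixer are more involved):

```javascript
// Simplified sketch of the cascading extraction. Each strategy produces a
// candidate JSON string; every candidate also gets one retry after an
// escape-sanitizer pass, mirroring the flow above.
function parseWithStrategies(raw, strategies, sanitize) {
  for (const extract of strategies) {
    const candidate = extract(raw);
    for (const text of [candidate, sanitize(candidate)]) {
      if (text == null) continue;
      try {
        return JSON.parse(text);
      } catch {
        // fall through to the sanitized candidate / next strategy
      }
    }
  }
  return null; // every strategy failed
}

// Toy strategy 1: strip markdown code fences.
const stripFences = (s) => s.replace(/```(?:json)?/g, '').trim();
// Toy strategy 2: slice from the first "{" to the last "}".
const braceSlice = (s) => {
  const start = s.indexOf('{');
  const end = s.lastIndexOf('}');
  return start >= 0 && end > start ? s.slice(start, end + 1) : null;
};
// Sanitizer: escape bare \d \s \w, which are invalid JSON string escapes.
const fixEscapes = (s) => (s == null ? null : s.replace(/\\([dsw])/g, '\\\\$1'));
```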

---

## Quality Gate — 4-Dimension Scoring

| Dimension | Weight | What It Measures |
|-----------|--------|------------------|
| **Density** | 30% | Items extracted per minute of video |
| **Structure** | 25% | Required fields present (IDs, assignees, statuses) |
| **Confidence** | 25% | Confidence field coverage + calibration (not all HIGH) |
| **Cross-References** | 20% | Tickets linked to CRs, files referenced, action items connected |

The weighted sum yields a score of 0–100. Low scores trigger automatic retry or focused re-analysis.
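
As a sketch (the per-dimension 0–100 scale and the key names are assumptions; the weights mirror the table):

```javascript
// Sketch of the 4-dimension weighted sum. Each dimension is assumed to be
// pre-scored on a 0-100 scale before weighting.
const QG_WEIGHTS = { density: 0.30, structure: 0.25, confidence: 0.25, crossRefs: 0.20 };

function qualityScore(dims) {
  const total = Object.entries(QG_WEIGHTS).reduce(
    (sum, [dim, weight]) => sum + weight * (dims[dim] ?? 0),
    0
  );
  return Math.round(total); // overall score, 0-100
}
```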

---

## Learning Loop — Self-Improving Budgets

```mermaid
flowchart LR
  HIST["history.json\nup to 50 runs"] --> ANALYZE["analyzeHistory()"]
  ANALYZE --> TREND["Quality trend:\nimproving / declining / stable"]
  ANALYZE --> ADJ["Budget adjustment"]
  ANALYZE --> REC["Recommendations"]

  ADJ --> PIPE["Applied to next pipeline run"]
  PIPE --> SAVE["After run: save metrics"]
  SAVE --> HIST
```

| Condition | Adjustment |
|-----------|-----------|
| Avg quality < 45 | +4096 thinking tokens |
| Avg quality > 80 | -2048 thinking tokens (save cost) |
| Quality stable | No change |
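
The adjustment rule as a sketch (the starting budget and the floor at zero are illustrative assumptions; units are thinking tokens):

```javascript
// Sketch of the learning-loop budget rule from the table above.
function adjustThinkingBudget(budget, avgQuality) {
  if (avgQuality < 45) return budget + 4096;              // low quality: think harder
  if (avgQuality > 80) return Math.max(0, budget - 2048); // high quality: save cost
  return budget;                                          // stable: no change
}
```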

---

## Cross-Segment Continuity

```mermaid
flowchart LR
  S0(["Segment 0"]) --> CTX1["Context:\ntickets, CRs, names,\nfile refs, your_tasks"]
  CTX1 --> S1(["Segment 1"])
  S1 --> CTX2["Accumulated context\nfrom segments 0+1"]
  CTX2 --> S2(["Segment 2"])
  S2 --> CTX3["...continues"]
```

Each segment receives the full accumulated context from all prior segments. This ensures:
- Topic IDs mentioned in segment 0 are recognized in segment 3
- CR numbering is consistent across the entire recording
- Speaker names are resolved once and carried forward
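
A sketch of the accumulation step (field names are illustrative; the real context-manager tracks more than these three lists):

```javascript
// Sketch of cross-segment context accumulation: each segment's extracted
// entities are merged (deduplicated) into the context handed to the next one.
function accumulateContext(context, segmentResult) {
  const merge = (a = [], b = []) => [...new Set([...a, ...b])];
  return {
    tickets: merge(context.tickets, segmentResult.tickets),
    changeRequests: merge(context.changeRequests, segmentResult.changeRequests),
    speakers: merge(context.speakers, segmentResult.speakers),
  };
}
```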

---

## Diff Engine — Cross-Run Intelligence

When a previous run exists, the diff engine compares:

| Category | Detection |
|----------|-----------|
| **New items** | Present in current, absent in previous |
| **Resolved items** | Present in previous, absent in current |
| **Changed items** | Same ID but different status, assignee, or description |
| **Stable items** | Unchanged across runs |
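
A sketch of the comparison, assuming items carry stable IDs (field names are illustrative):

```javascript
// Sketch of the cross-run diff: items are keyed by ID and compared field-wise.
function diffRuns(previous, current) {
  const prev = new Map(previous.map((i) => [i.id, i]));
  const curr = new Map(current.map((i) => [i.id, i]));
  const changed = (a, b) =>
    a.status !== b.status || a.assignee !== b.assignee || a.description !== b.description;
  return {
    added: current.filter((i) => !prev.has(i.id)),
    resolved: previous.filter((i) => !curr.has(i.id)),
    changed: current.filter((i) => prev.has(i.id) && changed(prev.get(i.id), i)),
    stable: current.filter((i) => prev.has(i.id) && !changed(prev.get(i.id), i)),
  };
}
```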

---

## Deep Dive Mode

The `--deep-dive` flag triggers an additional phase after the main video analysis pipeline:

```mermaid
flowchart TB
  START(["Compiled Analysis"]) --> DISC["Phase 1: Topic Discovery\nAI identifies 3-10 explainable topics"]
  DISC --> PLAN["Topics with categories:\nconcept, decision, process, system,\nrequirement, guide, context, action-plan"]
  PLAN --> GEN["Phase 2: Parallel Document Generation\n2-3 concurrent writers"]
  GEN --> WRITE["Phase 3: Write Output"]
  WRITE --> INDEX["INDEX.md — grouped by category"]
  WRITE --> DOCS["dd-01-topic.md, dd-02-topic.md, ..."]
  WRITE --> META["deep-dive.json — metadata + token usage"]
```

Deep dive runs AFTER the standard 8-phase pipeline completes, using the compiled analysis as input. Each topic document is self-contained (200-800 words) and written for someone who wasn't on the call.
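
The "2-3 concurrent writers" pattern can be sketched with a simple worker pool (a generic bounded-parallelism pattern, not the actual deep-dive implementation; names are illustrative):

```javascript
// Sketch: generate all topic documents with at most `limit` in flight.
// Each worker repeatedly claims the next unprocessed topic index.
async function generateAll(topics, generate, limit = 3) {
  const results = new Array(topics.length);
  let next = 0;
  async function worker() {
    while (next < topics.length) {
      const i = next++; // claim an index (safe: single-threaded between awaits)
      results[i] = await generate(topics[i]);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, topics.length) }, worker));
  return results; // in topic order, regardless of completion order
}
```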

---

## Dynamic Mode

The `--dynamic` flag routes to an entirely separate pipeline that works without video:

```mermaid
flowchart TB
  START(["--dynamic"]) --> REQ["Get User Request\n--request flag or interactive prompt"]
  REQ --> DOCS["Discover & Load Documents\nRecursive folder scan"]
  DOCS --> AI["Initialize Gemini AI"]
  AI --> PLAN["Phase 1: Plan Topics\nAI plans 3-15 documents"]
  PLAN --> GEN["Phase 2: Generate Documents\nParallel batch generation"]
  GEN --> WRITE["Write Output"]
  WRITE --> INDEX["INDEX.md — document set index"]
  WRITE --> FILES["dm-01-overview.md, dm-02-guide.md, ..."]
  WRITE --> META["dynamic-run.json — metadata"]
```

### Dynamic Mode Categories

| Category | Purpose | When Used |
|----------|---------|-----------|
| **overview** | High-level summaries, introductions | Always first document |
| **guide** | Step-by-step instructions, tutorials | How-to requests |
| **analysis** | Comparisons, evaluations, assessments | Analysis/research requests |
| **plan** | Roadmaps, timelines, strategies | Planning requests |
| **reference** | Specifications, API docs, schemas | Documentation requests |
| **concept** | Explanations, definitions, theory | Learning/teaching requests |
| **decision** | Decision records, trade-off evaluations | Architecture decisions |
| **checklist** | Verification lists, audit documents | Process/compliance requests |
| **template** | Reusable patterns, scaffolds | Template requests |
| **report** | Status reports, findings summaries | Reporting requests |

Dynamic mode accepts any request — the AI adapts document categories and count to match what's needed:

```bash
# Migration planning → plan + guide + checklist + risk analysis
taskex --dynamic --request "Plan migration from MySQL to PostgreSQL"

# Learning → concept + guide + reference (progressive complexity)
taskex --dynamic --request "Create React hooks tutorial"

# Architecture → overview + system docs + decision records
taskex --dynamic --request "Document this microservices architecture"
```

---

## Document Context Processing

| Extension | Method | Description |
|-----------|--------|-------------|
| `.vtt` `.srt` `.txt` `.md` `.csv` | Inline text | Read and passed directly as text parts |
| `.pdf` | Gemini File API | Uploaded as binary, Gemini processes natively |
| `.docx` `.doc` | Firebase only | Uploaded for archival, not processable by Gemini |

Directories skipped during recursive discovery: `node_modules`, `.git`, `compressed`, `logs`, `gemini_runs`, `runs`
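
A sketch of the routing (the extension sets are taken from the table; the helper name is hypothetical):

```javascript
// Sketch of the document routing table above.
const INLINE_TEXT = new Set(['.vtt', '.srt', '.txt', '.md', '.csv']);

function routeDocument(fileName) {
  const dot = fileName.lastIndexOf('.');
  const ext = dot >= 0 ? fileName.slice(dot).toLowerCase() : '';
  if (INLINE_TEXT.has(ext)) return 'inline-text';                // passed as text parts
  if (ext === '.pdf') return 'gemini-file-api';                  // uploaded as binary
  if (ext === '.docx' || ext === '.doc') return 'firebase-only'; // archival only
  return 'skip';
}
```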

---

## Skip Logic / Caching

| Stage | Skip Condition |
|-------|----------------|
| **Compression** | `compressed/{video}/segment_*.mp4` exist on disk |
| **Firebase upload** | File already exists at `calls/{name}/segments/{video}/` (bypassed by `--force-upload`) |
| **Storage URL → Gemini** | Firebase download URL available (bypassed by `--no-storage-url`) |
| **Gemini analysis** | Run file exists in `gemini_runs/` AND user chooses not to re-analyze |

---

## Logging

Every run creates three log files in `logs/`:

| File | Contents |
|------|----------|
| **Detailed** (`_detailed.log`) | All console output, debug info, response previews, timestamps |
| **Minimal** (`_minimal.log`) | Steps, info, warnings, errors + timestamps (no debug) |
| **Structured** (`_structured.jsonl`) | Every event as a JSON object with level, timestamp, context, phase |

Log levels: `STEP` (milestones) · `INFO` (verbose) · `WARN` (non-fatal) · `ERR` (failures) · `DBG` (debug data)

The JSONL structured format includes phase spans with timing metrics for observability.
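
An illustrative (invented) event from the structured log, assuming one JSON object per line; the level, timestamp, phase, and context keys are the fields named above, the rest is made up:

```json
{"level":"STEP","timestamp":"2026-02-23T17:40:02.511Z","phase":"process","message":"Segment 2/5 analyzed","context":{"segment":2,"qualityScore":71}}
```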

---

## Tech Stack

| Component | Package | Purpose |
|-----------|---------|---------|
| **Node.js** | ≥ 18.0.0 | Runtime (v24 tested) |
| **Gemini AI** | `@google/genai@^1.42.0` | Video analysis, File API, 1M context window |
| **Firebase** | `firebase@^12.9.0` | Anonymous auth + Cloud Storage uploads |
| **dotenv** | `dotenv@^17.3.1` | Environment variable loading |
| **ffmpeg** | System binary | H.264 video compression + segmentation |
| **Git** | System binary | Change detection for progress tracking |

**Codebase: 31 files · ~10,300 lines** · npm package: `task-summary-extractor` · CLI: `taskex`

---

## Video Encoding Parameters

| Parameter | Value | Purpose |
|-----------|-------|---------|
| Codec | H.264 (libx264) | Universal compatibility |
| CRF | 24 (screenshare) / 20 (4K) | Quality-size balance |
| Tune | `stillimage` | Optimized for screen content |
| Sharpening | `unsharp=3:3:0.3` | Preserve text clarity |
| x264 params | `aq-mode=3:deblock=-1,-1:psy-rd=1.0,0.0` | Text readability |
| Audio | AAC, 64–128k, original sample rate | Clear speech |
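
A sketch of how these parameters (plus the 1.5x speed-up from the per-segment flow) might be assembled into an ffmpeg argument list. The flag names are standard ffmpeg/libx264 options, but the exact invocation in video.js may differ; the 96k audio bitrate is one point in the documented 64–128k range:

```javascript
// Sketch: build the ffmpeg args from the table above.
function buildFfmpegArgs(input, output, { crf = 24, speed = 1.5 } = {}) {
  return [
    '-i', input,
    // speed-up plus a light sharpen to keep on-screen text readable
    '-vf', `setpts=PTS/${speed},unsharp=3:3:0.3`,
    '-filter:a', `atempo=${speed}`, // keep audio in sync with the sped-up video
    '-c:v', 'libx264',
    '-crf', String(crf),
    '-tune', 'stillimage',
    '-x264-params', 'aq-mode=3:deblock=-1,-1:psy-rd=1.0,0.0',
    '-c:a', 'aac', '-b:a', '96k',
    output,
  ];
}
```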

---

## Gemini Run Record Format

Each segment analysis is saved as a timestamped JSON file:

```json
{
  "run": {
    "model": "gemini-2.5-flash",
    "displayName": "my-meeting_Recording_seg00",
    "userName": "Jane Smith",
    "timestamp": "2026-02-23T17:39:50.123Z",
    "durationMs": 45230
  },
  "input": {
    "videoFile": {
      "mimeType": "video/mp4",
      "fileUri": "...",
      "geminiFileName": "files/abc123",
      "usedExternalUrl": false
    },
    "contextDocuments": [{ "fileName": ".tasks/requirements.md" }],
    "previousSegmentCount": 0
  },
  "output": {
    "raw": "{ ... full AI response ... }",
    "parsed": { "tickets": [], "change_requests": [] },
    "parseSuccess": true
  }
}
```

When `usedExternalUrl` is `true`, the `fileUri` contains the Firebase Storage download URL and `geminiFileName` is `null` (no File API upload was made).

---

## See Also

| Doc | What's In It |
|-----|-------------|
| 📖 [README.md](README.md) | Setup, CLI flags, configuration, features |
| 📖 [QUICK_START.md](QUICK_START.md) | Step-by-step first-time walkthrough |
| 🔭 [EXPLORATION.md](EXPLORATION.md) | Module map, line counts, future roadmap |