task-summary-extractor 9.3.1 → 9.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/ARCHITECTURE.md CHANGED
@@ -1,8 +1,7 @@
  # Architecture & Technical Deep Dive
 
  > Internal reference for the pipeline's architecture, processing flows, and design decisions.
- > For setup instructions, see [README.md](README.md) · [Quick Start](QUICK_START.md)
- > For module map and roadmap, see [EXPLORATION.md](EXPLORATION.md)
+ > For setup instructions, see [README.md](README.md) · [Quick Start](QUICK_START.md)
 
  ---
 
@@ -126,6 +125,7 @@ flowchart TB
  | 1 | **Init** | CLI parsing, interactive folder selection (if no arg), config validation, logger setup, load learning insights, route to dynamic/progress mode |
  | 2 | **Discover** | Find videos/audio, discover documents, resolve user name, check resume state |
  | 3 | **Services** | Firebase auth, Gemini init, prepare document parts |
+ | 3.5 | **Deep Summary** | (optional) Pre-summarize context docs with Gemini — 60-80% token savings |
  | 4 | **Process** | Compress → Upload → Analyze → Quality Gate → Retry → Focused Pass |
  | 5 | **Compile** | Cross-segment compilation, diff engine comparison |
  | 6 | **Output** | Write JSON, render Markdown + HTML, upload to Firebase |
@@ -199,7 +199,7 @@ Each video segment goes through this flow (Phase 4 detail):
 
  ```mermaid
  flowchart TB
- START(["Segment N"]) --> COMPRESS["ffmpeg compress\nH.264 CRF 24, 1.5x speed"]
+ START(["Segment N"]) --> COMPRESS["ffmpeg compress\nH.264 CRF 24, 1.6x speed"]
  COMPRESS --> VERIFY["Verify segment integrity"]
  VERIFY --> UPLOAD_FB["Upload to Firebase Storage\n→ download URL"]
 
@@ -563,7 +563,7 @@ JSONL structured format includes phase spans with timing metrics for observabili
  | **ffmpeg** | System binary | H.264 video compression + segmentation |
  | **Git** | System binary | Change detection for progress tracking |
 
- **Codebase: ~45 files · ~13,000+ lines** · npm package: `task-summary-extractor` · CLI: `taskex`
+ **Codebase: ~48 files · ~13,600+ lines** · npm package: `task-summary-extractor` · CLI: `taskex`
 
  ---
 
@@ -634,8 +634,8 @@ The project includes a comprehensive test suite using [vitest](https://vitest.de
 
  | Metric | Value |
  |--------|-------|
- | Test files | 13 |
- | Total tests | 285 |
+ | Test files | 15 |
+ | Total tests | 331 |
  | Framework | vitest v4.x |
  | Coverage | `@vitest/coverage-v8` |
 
@@ -662,45 +662,45 @@ npm run test:coverage # Coverage report
  |-----|-------------|
  | 📖 [README.md](README.md) | Setup, CLI flags, configuration, features |
  | 📖 [QUICK_START.md](QUICK_START.md) | Step-by-step first-time walkthrough |
- | 🔭 [EXPLORATION.md](EXPLORATION.md) | Module map, line counts, future roadmap |
 
  ---
 
- ## JSON Schema Validation
-
- All AI output is validated against JSON Schema definitions in `src/schemas/`:
-
- | Schema | File | Purpose |
- |--------|------|---------|
- | Segment analysis | `analysis-segment.schema.json` | Validates each segment's extracted data |
- | Compiled analysis | `analysis-compiled.schema.json` | Validates the final cross-segment compilation |
+ ## Deep Summary
 
- Validation is performed by `src/utils/schema-validator.js` using [ajv](https://ajv.js.org/). Validation errors are reported as warnings with contextual hints for the retry/focused-pass cycle — they do not hard-fail the pipeline but are injected as corrective hints when the quality gate triggers a retry.
-
- ---
+ The `--deep-summary` flag (or interactive prompt when many docs are detected) pre-summarizes context documents before segment analysis:
 
- ## Test Suite
+ ```mermaid
+ flowchart TB
+ START(["Context Docs"]) --> PARTITION["Partition: summarize vs. keep full"]
+ PARTITION --> SKIP["Skip tiny docs (<500 chars)"]
+ PARTITION --> EXCL["Excluded docs → keep full fidelity"]
+ PARTITION --> TO_SUM["Docs to summarize"]
+ TO_SUM --> TRUNC["Truncate oversized docs (>900K chars)"]
+ TRUNC --> BATCH["Group into batches\n(≤600K chars each)"]
+ BATCH --> AI["Gemini summarization\n(per batch)"]
+ AI --> REPLACE["Replace full content\nwith condensed summaries"]
+ REPLACE --> OUT(["Token-efficient\ncontext docs"])
+ ```
 
- The project includes a comprehensive test suite using [vitest](https://vitest.dev/):
+ | Constant | Value | Purpose |
+ |----------|-------|---------|
+ | `BATCH_MAX_CHARS` | 600,000 | Max input chars per summarization batch |
+ | `MAX_DOC_CHARS` | 900,000 | Hard cap per-document before truncation |
+ | `SUMMARY_MAX_OUTPUT` | 16,384 | Max output tokens per summarization call |
+ | `MIN_SUMMARIZE_LENGTH` | 500 | Docs below this skip summarization |
 
- | Metric | Value |
- |--------|-------|
- | Test files | 13 |
- | Total tests | 285 |
- | Framework | vitest v4.x |
- | Coverage | `@vitest/coverage-v8` |
+ Typical savings: 60-80% reduction in per-segment context tokens. The user can exclude specific docs from summarization via `--exclude-docs` or the interactive picker.
 
- **Test categories:**
+ ---
 
- | Directory | What's Tested |
- |-----------|---------------|
- | `tests/utils/` | Utility modules: adaptive-budget, cli, confidence-filter, context-manager, diff-engine, format, json-parser, progress-bar, quality-gate, retry, schema-validator |
- | `tests/renderers/` | Renderer modules: html, markdown |
+ ## Context Window Safety
 
- **Commands:**
+ Safeguards to prevent context window overflow:
 
- ```bash
- npm test # Run all tests
- npm run test:watch # Watch mode
- npm run test:coverage # Coverage report
- ```
+ | Safeguard | Where | What It Does |
+ |-----------|-------|-------------|
+ | **P0/P1 hard cap** | `context-manager.js` | Critical docs can't exceed 2× the token budget |
+ | **VTT fallback cap** | `context-manager.js` | Full VTT fallback capped at 500K chars |
+ | **Doc truncation** | `deep-summary.js` | Oversized docs truncated to 900K chars before summarization |
+ | **Compilation pre-flight** | `gemini.js` | Estimates tokens before compilation; trims middle segments if >80% of context |
+ | **RESOURCE_EXHAUSTED recovery** | `gemini.js` | On quota/context errors: waits 30s, sheds docs, retries with reduced input |
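
The compilation pre-flight check in the table above can be sketched in a few lines. This is an illustration only: the helper names (`estimateTokens`, `needsTrim`) and the ~0.3 tokens-per-character ratio are assumptions inferred from the diff below, not the package's actual exports.

```javascript
// Rough chars→tokens heuristic, as assumed elsewhere in this diff (~0.3 tok/char).
function estimateTokens(text) {
  return Math.ceil(text.length * 0.3);
}

// Pre-flight check: trim input when the estimate exceeds 80% of the context window.
// The 1M-token default mirrors the Gemini context size cited in the diff.
function needsTrim(promptText, contextWindow = 1000000) {
  const safeLimit = Math.floor(contextWindow * 0.80);
  return estimateTokens(promptText) > safeLimit;
}
```

With a 1M-token window, any prompt over roughly 2.67M characters would trigger trimming under this heuristic.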
package/QUICK_START.md CHANGED
@@ -223,6 +223,7 @@ my-project/runs/{timestamp}/
  | **Force Gemini File API** | `taskex --no-storage-url "my-meeting"` |
  | **Preview without running** | `taskex --dry-run "my-meeting"` |
  | **Deep dive docs** | `taskex --deep-dive "my-meeting"` |
+ | **Pre-summarize docs** | `taskex --deep-summary "my-meeting"` |
  | **Generate docs (no video)** | `taskex --dynamic "my-project"` |
  | **Track progress via git** | `taskex --update-progress --repo "C:\project" "my-meeting"` |
  | **Debug mode** | `taskex --log-level debug "my-meeting"` |
@@ -272,4 +273,3 @@ Your recordings, `.env`, logs — everything local is `.gitignore`d and safe.
  |------|-------|
  | Full feature list, all CLI flags, configuration | [README.md](README.md) |
  | How the pipeline works internally | [ARCHITECTURE.md](ARCHITECTURE.md) |
- | Module map, line counts, roadmap | [EXPLORATION.md](EXPLORATION.md) |
package/README.md CHANGED
@@ -1,13 +1,13 @@
  # Task Summary Extractor
 
- > **v9.0.0** — AI-powered content analysis CLI — meetings, recordings, documents, or any mix. Install globally, run anywhere.
+ > **v9.4.0** — AI-powered content analysis CLI — meetings, recordings, documents, or any mix. Install globally, run anywhere.
 
  <p align="center">
  <img src="https://img.shields.io/badge/node-%3E%3D18.0.0-green" alt="Node.js" />
  <img src="https://img.shields.io/badge/gemini-2.5--flash-blue" alt="Gemini" />
  <img src="https://img.shields.io/badge/firebase-11.x-orange" alt="Firebase" />
- <img src="https://img.shields.io/badge/version-9.0.0-brightgreen" alt="Version" />
- <img src="https://img.shields.io/badge/tests-285%20passing-brightgreen" alt="Tests" />
+ <img src="https://img.shields.io/badge/version-9.4.0-brightgreen" alt="Version" />
+ <img src="https://img.shields.io/badge/tests-331%20passing-brightgreen" alt="Tests" />
  <img src="https://img.shields.io/badge/npm-task--summary--extractor-red" alt="npm" />
  </p>
 
@@ -62,6 +62,20 @@ taskex --update-progress --repo "C:\my-project" "my-meeting"
 
  > **v7.2.3**: If the call folder isn't a git repo, the tool auto-initializes one for baseline tracking.
 
+ ### ⚡ Deep Summary (`--deep-summary`)
+
+ Pre-summarize context documents to reduce per-segment token usage by 60-80%:
+
+ ```bash
+ taskex --deep-summary --name "Jane" "my-meeting"
+ ```
+
+ Exclude specific docs from summarization (keep at full fidelity):
+
+ ```bash
+ taskex --deep-summary --exclude-docs "code-map.md,sprint.md" "my-meeting"
+ ```
+
  > See all modes explained with diagrams → [ARCHITECTURE.md](ARCHITECTURE.md#pipeline-phases)
 
  ---
@@ -172,6 +186,8 @@ These are the ones you'll actually use:
  | `--format <type>` | Output format: `md`, `html`, `json`, `pdf`, `docx`, `all` (default: `md`) | `--format html` |
  | `--min-confidence <level>` | Filter items by confidence: `high`, `medium`, `low` | `--min-confidence high` |
  | `--no-html` | Suppress HTML report generation | `--no-html` |
+ | `--deep-summary` | Pre-summarize context docs (60-80% token savings) | `--deep-summary` |
+ | `--exclude-docs <list>` | Docs to keep full during deep-summary (comma-separated) | `--exclude-docs "code-map.md"` |
 
  **Typical usage:**
 
@@ -198,6 +214,7 @@ Choose what the tool does. Only use one at a time:
  | *(none)* | **Content analysis** | `results.md` + `results.html` — structured task document |
  | `--dynamic` | **Doc generation** | `INDEX.md` + 3–15 topic documents |
  | `--deep-dive` | **Topic explainers** | `INDEX.md` + per-topic deep-dive docs |
+ | `--deep-summary` | **Token-efficient analysis** | Same as content analysis, but context docs pre-summarized (60-80% savings) |
  | `--update-progress` | **Progress check** | `progress.md` — item status via git |
 
  **Dynamic mode** also uses:
@@ -259,7 +276,7 @@ taskex [flags] [folder]
 
  CONFIG --gemini-key --firebase-key --firebase-project
  --firebase-bucket --firebase-domain
- MODES --dynamic --deep-dive --update-progress
+ MODES --dynamic --deep-dive --deep-summary --update-progress
  CORE --name --model --skip-upload --resume --reanalyze --dry-run
  OUTPUT --format <md|html|json|pdf|docx|all> --min-confidence <high|medium|low>
  --no-html
@@ -394,7 +411,7 @@ GEMINI_API_KEY=your-key-here
 
  # Optional — uncomment to customize
  # GEMINI_MODEL=gemini-2.5-flash
- # VIDEO_SPEED=1.5
+ # VIDEO_SPEED=1.6
  # THINKING_BUDGET=24576
  # LOG_LEVEL=info
 
@@ -413,7 +430,7 @@ GEMINI_API_KEY=your-key-here
 
  | Feature | Description |
  |---------|-------------|
- | **Video/Audio Compression** | H.264 CRF 24, text-optimized sharpening, configurable speed |
+ | **Video/Audio Compression** | H.264 CRF 24, text-optimized sharpening, 1.6× speed |
  | **Smart Segmentation** | ≤5 min chunks with boundary-aware splitting |
  | **Cross-Segment Continuity** | Ticket IDs, names, and context carry forward |
  | **Document Discovery** | Auto-finds docs in all subfolders |
@@ -434,6 +451,8 @@ GEMINI_API_KEY=your-key-here
  | **HTML Report** | Self-contained HTML report with collapsible sections, filtering, dark mode |
  | **JSON Schema Validation** | Validates AI output against JSON Schema (segment + compiled) |
  | **Confidence Filter** | `--min-confidence` flag to exclude low-confidence items from output |
+ | **Deep Summary** | `--deep-summary` pre-summarizes context docs, 60-80% token savings per segment |
+ | **Context Window Safety** | Auto-truncation, pre-flight token checks, RESOURCE_EXHAUSTED recovery |
  | **Multi-Format Output** | `--format` flag: Markdown, HTML, JSON, PDF, DOCX, or all formats at once |
  | **Interactive CLI** | Run with no args → guided experience |
  | **Resume / Checkpoint** | `--resume` continues interrupted runs |
@@ -507,6 +526,7 @@ task-summary-extractor/
  │ │ ├── git.js Git CLI wrapper
  │ │ └── doc-parser.js Document text extraction (DOCX, XLSX, PPTX, etc.)
  │ ├── modes/ AI-heavy pipeline phase modules
+ │ │ ├── deep-summary.js Pre-summarize context docs (deep-summary feature)
  │ │ ├── deep-dive.js Topic discovery & deep-dive doc generation
  │ │ ├── dynamic-mode.js Dynamic document planning & generation
  │ │ ├── focused-reanalysis.js Targeted reanalysis of weak segments
@@ -528,17 +548,14 @@ task-summary-extractor/
  │ ├── schema-validator.js JSON Schema validation (ajv)
  │ └── ... (15 more utility modules)
 
- ├── tests/ Test suite — 285 tests across 13 files (vitest)
+ ├── tests/ Test suite — 331 tests across 15 files (vitest)
  │ ├── utils/ Utility module tests
  │ └── renderers/ Renderer tests
 
  ├── QUICK_START.md Step-by-step setup guide
- ├── ARCHITECTURE.md Technical deep dive
- └── EXPLORATION.md Roadmap & future features
+ └── ARCHITECTURE.md Technical deep dive
  ```
 
- > Full module map with line counts → [EXPLORATION.md](EXPLORATION.md#full-module-map)
-
  ---
 
  ## npm Scripts
@@ -551,7 +568,7 @@ task-summary-extractor/
  | `npm run check` | Validate environment |
  | `npm start` | Run the pipeline |
  | `npm run help` | Show CLI help |
- | `npm test` | Run test suite (285 tests) |
+ | `npm test` | Run test suite (331 tests) |
  | `npm run test:watch` | Run tests in watch mode |
  | `npm run test:coverage` | Run tests with coverage report |
 
@@ -561,6 +578,9 @@ task-summary-extractor/
 
  | Version | Highlights |
  |---------|-----------|
+ | **v9.4.0** | **Context window safety** — pre-flight token checks, auto-truncation for oversized docs/VTTs, RESOURCE_EXHAUSTED recovery with automatic doc shedding, chunked compilation for large segment sets, P0/P1 hard cap (2× budget) prevents context overflow, improved deep-summary prompt quality |
+ | **v9.3.1** | **Audit & polish** — VIDEO_SPEED 1.5→1.6, `--exclude-docs` flag for non-interactive deep-summary exclusion, friendlier Gemini error messages, dead code removal, DRY RUN_PRESETS |
+ | **v9.3.0** | **Deep summary** — `--deep-summary` pre-summarizes context documents (60-80% token savings), interactive doc picker, `--exclude-docs` for CLI automation, batch processing |
  | **v9.0.0** | **CLI UX upgrade** — colors & progress bar, HTML reports, PDF & DOCX output (via puppeteer and docx npm package), JSON Schema validation, confidence filter (`--min-confidence`), pipeline decomposition (`src/phases/` — 9 modules), test suite (285 tests via vitest), multi-format output (`--format`: md/html/json/pdf/docx/all), doc-parser service, shared renderer utilities |
  | **v8.3.0** | **Universal content analysis** — prompt v4.0.0 supports video, audio, documents, and mixed content; input type auto-detection; timestamps conditional on content type; gemini.js bridge text generalized; all markdown docs updated |
  | **v8.2.0** | **Architecture cleanup** — `src/modes/` for AI pipeline phases, `retry.js` self-contained defaults, dead code removal, export trimming, `process_and_upload.js` slim shim, `progress.js` → `checkpoint.js`, merged `prompt.js` into `cli.js` |
@@ -587,7 +607,6 @@ task-summary-extractor/
  |-----|-------------|-------------|
  | 📖 **[QUICK_START.md](QUICK_START.md)** | Full setup walkthrough, examples, troubleshooting | First time using the tool |
  | 🏗️ **[ARCHITECTURE.md](ARCHITECTURE.md)** | Pipeline phases, algorithms, Mermaid diagrams | Understanding how it works |
- | 🔭 **[EXPLORATION.md](EXPLORATION.md)** | Module map, line counts, future roadmap | Contributing or extending |
 
  ---
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "task-summary-extractor",
- "version": "9.3.1",
+ "version": "9.4.0",
  "description": "AI-powered meeting analysis & document generation CLI — video + document processing, deep dive docs, dynamic mode, interactive CLI with model selection, confidence scoring, learning loop, git progress tracking",
  "main": "process_and_upload.js",
  "bin": {
package/src/modes/deep-summary.js CHANGED
@@ -36,6 +36,13 @@ const BATCH_MAX_CHARS = 600000;
  /** Minimum content length (chars) to bother summarizing — below this, keep full */
  const MIN_SUMMARIZE_LENGTH = 500;
 
+ /**
+ * Hard cap per-document chars before sending to Gemini.
+ * Gemini context = 1M tokens; prompt overhead ~50K tokens; at 0.3 tok/char
+ * 900K chars ≈ 270K tokens — safe with prompt + thinking overhead.
+ */
+ const MAX_DOC_CHARS = 900000;
+
  // ======================== BATCH BUILDER ========================
 
  /**
@@ -51,8 +58,22 @@ function buildBatches(docs, maxChars = BATCH_MAX_CHARS) {
  let currentBatch = [];
  let currentChars = 0;
 
- for (const doc of docs) {
- const docChars = doc.content ? doc.content.length : 0;
+ for (let doc of docs) {
+ let docChars = doc.content ? doc.content.length : 0;
+
+ // Truncate extremely large docs to avoid exceeding the context window.
+ // Any single doc beyond MAX_DOC_CHARS is capped (tail is dropped) and a
+ // warning is prepended so the summariser knows the content is incomplete.
+ if (docChars > MAX_DOC_CHARS) {
+ const truncated = doc.content.substring(0, MAX_DOC_CHARS);
+ doc = {
+ ...doc,
+ content: `[TRUNCATED — original ${(docChars / 1024).toFixed(0)} KB exceeded the ${(MAX_DOC_CHARS / 1024).toFixed(0)} KB limit; only the first ${(MAX_DOC_CHARS / 1024).toFixed(0)} KB is included]\n\n${truncated}`,
+ _truncatedFrom: docChars,
+ };
+ docChars = doc.content.length;
+ console.warn(` ${c.warn(`${doc.fileName} truncated from ${(doc._truncatedFrom / 1024).toFixed(0)} KB to ${(MAX_DOC_CHARS / 1024).toFixed(0)} KB for deep summary`)}`);
+ }
 
  // If this single doc exceeds the batch limit, it gets its own batch
  if (docChars > maxChars) {
@@ -120,23 +141,35 @@ async function summarizeBatch(ai, docs, opts = {}) {
 
  const promptText = `You are a precision document summarizer for a meeting analysis pipeline.
 
- Your job: read ALL documents below and produce a CONDENSED version of each that preserves:
- - Every ticket ID, task ID, CR number, or reference number (verbatim)
- - All assignees, reviewers, and responsible parties
- - All statuses (open, closed, in_progress, blocked, etc.)
- - All action items and their owners
- - All blockers, dependencies, and deadlines
- - Key decisions and their rationale
- - File paths and code references
- - Numerical data (percentages, counts, dates, versions)
-
- What to remove:
+ Your job: read ALL documents below and produce a CONDENSED version of each that preserves every piece of actionable information.
+
+ WHAT TO PRESERVE (in order of importance):
+ 1. IDENTIFIERS — Every ticket ID, task ID, CR number, PR number, JIRA key, GitHub issue, reference number, version number. Copy these VERBATIM — do not paraphrase or abbreviate IDs.
+ 2. PEOPLE — All assignees, reviewers, approvers, requesters, and responsible parties. Use full names exactly as they appear.
+ 3. STATUSES & STATES — All statuses (open, closed, in_progress, blocked, deferred, etc.) and state markers (✅, ⬜, ⏸️, 🔲). Preserve the exact status vocabulary used in the document.
+ 4. ACTION ITEMS — Every action item, commitment, and deliverable with its owner, deadline, and dependency chain.
+ 5. BLOCKERS & DEPENDENCIES — What is blocked, by whom, what it blocks downstream.
+ 6. DECISIONS & RATIONALE — Key decisions and WHY they were made (not just what).
+ 7. CROSS-REFERENCES — When Document A references something from Document B, preserve that linkage. If ticket X is mentioned in a code-map entry, keep both the ticket ID and the code-map path.
+ 8. TECHNICAL SPECIFICS — File paths, code references, API endpoints, database tables, configuration keys, environment names (dev/staging/prod).
+ 9. NUMERICAL DATA — Percentages, counts, dates, deadlines, version numbers, sizes.
+ 10. CHECKLISTS & PROGRESS — Preserve checklist items with their completion status markers. Include progress ratios (e.g., "35/74 done, 6 blocked").
+
+ WHAT TO REMOVE:
  - Verbose explanations of well-known concepts
- - Redundant phrasing and filler text
- - Formatting-only content (decorative headers, dividers)
- - Boilerplate/template text that adds no information
+ - Redundant phrasing, filler text, throat-clearing sentences
+ - Formatting-only content (decorative headers, horizontal rules, empty sections)
+ - Boilerplate/template text that adds no project-specific information
+ - Repeated definitions or glossary entries that don't change across documents
  ${focusSection}
 
+ QUALITY REQUIREMENTS:
+ - Aim for 70-80% size reduction while preserving ALL actionable information.
+ - Every ID, every name, every status MUST survive the summarization.
+ - If two documents reference the same entity (ticket, file, person), ensure the summary preserves enough context in BOTH summaries for downstream consumers to make the connection.
+ - When a document contains a table, preserve the table structure (header + key rows). Omit empty or low-value rows.
+ - When a document has nested structure (subsections, indented lists), preserve the hierarchy — use indentation or numbering.
+
  OUTPUT FORMAT:
  Return valid JSON with this structure:
  {
@@ -151,9 +184,6 @@ Return valid JSON with this structure:
  }
  }
 
- Aim for 70-80% size reduction while preserving ALL actionable information.
- Every ID, every name, every status must survive the summarization.
-
  DOCUMENTS TO SUMMARIZE (${docEntries.length} documents):
 
  ${docEntries.join('\n\n')}`;
@@ -162,7 +192,7 @@ ${docEntries.join('\n\n')}`;
  model: config.GEMINI_MODEL,
  contents: [{ role: 'user', parts: [{ text: promptText }] }],
  config: {
- systemInstruction: 'You are a lossless information compressor. Preserve every ID, name, status, assignment, and actionable detail. Output valid JSON only.',
+ systemInstruction: 'You are a lossless information compressor specialized in engineering and business documents. Preserve every ID, name, status, assignment, dependency, file path, decision rationale, and actionable detail. Maintain cross-document references (when doc A mentions entity from doc B, keep both sides). Output valid JSON only.',
  maxOutputTokens: SUMMARY_MAX_OUTPUT,
  temperature: 0,
  thinkingConfig: { thinkingBudget },
@@ -372,4 +402,5 @@ module.exports = {
  SUMMARY_MAX_OUTPUT,
  BATCH_MAX_CHARS,
  MIN_SUMMARIZE_LENGTH,
+ MAX_DOC_CHARS,
  };
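
The greedy batching rule that `buildBatches` applies in the hunk above can be sketched as follows. This is an illustrative re-implementation under stated assumptions (the real module also truncates oversized docs and logs console warnings); `packBatches` is an invented name, not a package export.

```javascript
// Greedy char-budget batching: fill a batch until adding the next doc would
// exceed maxChars, then start a new one. A single doc larger than maxChars
// gets a batch of its own, mirroring the rule shown in the diff.
function packBatches(docs, maxChars = 600000) {
  const batches = [];
  let current = [];
  let chars = 0;
  for (const doc of docs) {
    const len = doc.content ? doc.content.length : 0;
    if (len > maxChars) {
      // Oversized doc: flush the current batch, then isolate this doc
      if (current.length) { batches.push(current); current = []; chars = 0; }
      batches.push([doc]);
      continue;
    }
    if (chars + len > maxChars && current.length) {
      batches.push(current);
      current = [];
      chars = 0;
    }
    current.push(doc);
    chars += len;
  }
  if (current.length) batches.push(current);
  return batches;
}
```

For example, docs of 400K, 300K, 700K, and 100K chars against a 600K budget pack into four batches, with the 700K doc isolated in its own batch.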
package/src/pipeline.js CHANGED
@@ -46,7 +46,7 @@ const phaseDeepDive = require('./phases/deep-dive');
  // --- Utils (for run orchestration + alt modes) ---
  const { c } = require('./utils/colors');
  const { findDocsRecursive } = require('./utils/fs');
- const { promptUserText, selectDocsToExclude } = require('./utils/cli');
+ const { promptUser, promptUserText, selectDocsToExclude } = require('./utils/cli');
  const { createProgressBar } = require('./utils/progress-bar');
  const { buildHealthReport, printHealthDashboard } = require('./utils/health-dashboard');
  const { saveHistory, buildHistoryEntry } = require('./utils/learning-loop');
@@ -96,6 +96,23 @@ async function run() {
  bar.tick('Services ready');
 
  // Phase 3.5 (optional): Deep Summary — pre-summarize context docs
+ // If user didn't pass --deep-summary but has many context docs, offer it interactively
+ if (!fullCtx.opts.deepSummary && process.stdin.isTTY && fullCtx.ai && fullCtx.contextDocs.length >= 3) {
+ const inlineDocs = fullCtx.contextDocs.filter(d => d.type === 'inlineText' && d.content);
+ const totalChars = inlineDocs.reduce((sum, d) => sum + d.content.length, 0);
+ const totalTokensEstimate = Math.ceil(totalChars * 0.3);
+ // Only offer when context is large enough to benefit (>100K tokens)
+ if (totalTokensEstimate > 100000) {
+ console.log('');
+ console.log(` ${c.cyan('You have')} ${c.highlight(inlineDocs.length)} ${c.cyan('context docs')} (~${c.highlight((totalTokensEstimate / 1000).toFixed(0) + 'K')} ${c.cyan('tokens)')}`);
+ console.log(` ${c.dim('Deep summary can reduce per-segment context by 60-80%, saving time and cost.')}`);
+ const wantDeepSummary = await promptUser(` ${c.cyan('Enable deep summary?')} [y/N] `);
+ if (wantDeepSummary) {
+ fullCtx.opts.deepSummary = true;
+ }
+ }
+ }
+
  if (fullCtx.opts.deepSummary && fullCtx.ai && fullCtx.contextDocs.length > 0) {
  // Interactive picker: let user choose docs to keep at full fidelity
  if (process.stdin.isTTY && fullCtx.opts.deepSummaryExclude.length === 0) {
package/src/gemini.js CHANGED
@@ -459,16 +459,53 @@ async function processWithGemini(ai, filePath, displayName, contextDocs = [], pr
  throw reuploadErr;
  }
  } else {
- // Log request diagnostics for other errors to aid debugging
- const partSummary = contentParts.map((p, i) => {
- if (p.fileData) return ` [${i}] fileData: ${p.fileData.mimeType} ${(p.fileData.fileUri || '').substring(0, 120)}`;
- if (p.text) return ` [${i}] text: ${p.text.length} chars → ${p.text.substring(0, 80).replace(/\n/g, ' ')}...`;
- return ` [${i}] unknown part`;
- });
- console.error(` ${c.error('Request diagnostics:')}`);
- console.error(` Model: ${config.GEMINI_MODEL} | Parts: ${contentParts.length} | maxOutput: 65536`);
- partSummary.forEach(s => console.error(` ${s}`));
- throw apiErr;
+ // Handle RESOURCE_EXHAUSTED specifically — shed lower-priority docs and retry
+ if (errMsg.includes('RESOURCE_EXHAUSTED') || errMsg.includes('429') || errMsg.includes('quota')) {
+ console.warn(` ${c.warn('Context window or quota exceeded — shedding docs and retrying after 30s...')}`);
+ await new Promise(r => setTimeout(r, 30000));
+ // Rebuild with half the doc budget
+ const reducedBudget = Math.floor(docBudget * 0.5);
+ const { selected: reducedDocs } = selectDocsByBudget(contextDocs, reducedBudget, { segmentIndex });
+ const reducedParts = [contentParts[0]]; // keep video
+ for (const doc of reducedDocs) {
+ if (doc.type === 'inlineText') {
+ let content = doc.content;
+ const isVtt = doc.fileName.toLowerCase().endsWith('.vtt') || doc.fileName.toLowerCase().endsWith('.srt');
+ if (isVtt && segmentStartSec != null && segmentEndSec != null) {
+ content = sliceVttForSegment(content, segmentStartSec, segmentEndSec);
+ }
+ reducedParts.push({ text: `=== Document: ${doc.fileName} ===\n${content}` });
+ } else if (doc.type === 'fileData') {
+ reducedParts.push({ fileData: { mimeType: doc.mimeType, fileUri: doc.fileUri } });
+ }
+ }
+ // Re-add prompt/context parts (last 3-5 parts are prompt, focus, etc.)
+ const nonDocParts = contentParts.slice(1 + selectedDocs.length);
+ reducedParts.push(...nonDocParts);
+ requestPayload.contents[0].parts = reducedParts;
+ console.log(` Reduced to ${reducedDocs.length} docs (budget: ${(reducedBudget / 1000).toFixed(0)}K tokens)`);
+ try {
+ response = await withRetry(
+ () => ai.models.generateContent(requestPayload),
+ { label: `Gemini segment analysis — reduced docs (${displayName})`, maxRetries: 1, baseDelay: 5000 }
+ );
+ console.log(` ${c.success('Reduced-context retry succeeded')}`);
+ } catch (reduceErr) {
+ console.error(` ${c.error(`Reduced-context retry also failed: ${reduceErr.message}`)}`);
+ throw reduceErr;
+ }
+ } else {
+ // Log request diagnostics for other errors to aid debugging
+ const partSummary = contentParts.map((p, i) => {
+ if (p.fileData) return ` [${i}] fileData: ${p.fileData.mimeType} → ${(p.fileData.fileUri || '').substring(0, 120)}`;
+ if (p.text) return ` [${i}] text: ${p.text.length} chars → ${p.text.substring(0, 80).replace(/\n/g, ' ')}...`;
+ return ` [${i}] unknown part`;
+ });
+ console.error(` ${c.error('Request diagnostics:')}`);
+ console.error(` Model: ${config.GEMINI_MODEL} | Parts: ${contentParts.length} | maxOutput: 65536`);
+ partSummary.forEach(s => console.error(` ${s}`));
+ throw apiErr;
+ }
  }
  }
  const durationMs = Date.now() - t0;
@@ -628,6 +665,60 @@ ${segmentDumps}`;
 
   const contentParts = [{ text: compilationPrompt }];
 
+  // ------- Pre-flight context window check -------
+  const estimatedInputTokens = estimateTokens(compilationPrompt);
+  const safeLimit = Math.floor(config.GEMINI_CONTEXT_WINDOW * 0.80); // 80% of context window
+  if (estimatedInputTokens > safeLimit) {
+    console.warn(`  ${c.warn(`Compilation input (~${(estimatedInputTokens / 1000).toFixed(0)}K tokens) exceeds 80% of context window (${(safeLimit / 1000).toFixed(0)}K). Trimming older segment detail...`)}`);
+    // Re-build segment dumps with aggressive compression: keep only first & last 2 segments
+    // at full detail, compress the middle ones to IDs + statuses only.
+    const trimmedDumps = allSegmentAnalyses.map((analysis, idx) => {
+      const clean = { ...analysis };
+      delete clean._geminiMeta;
+      delete clean.seg;
+      delete clean.conversation_transcript;
+      const isEdge = idx < 2 || idx >= allSegmentAnalyses.length - 2;
+      if (!isEdge) {
+        // Aggressive compression for middle segments
+        if (clean.tickets) {
+          clean.tickets = clean.tickets.map(t => ({
+            ticket_id: t.ticket_id, status: t.status, title: t.title,
+            assignee: t.assignee, source_segment: t.source_segment,
+          }));
+        }
+        if (clean.change_requests) {
+          clean.change_requests = clean.change_requests.map(cr => ({
+            id: cr.id, status: cr.status, title: cr.title,
+            assigned_to: cr.assigned_to, source_segment: cr.source_segment,
+          }));
+        }
+        if (clean.action_items) {
+          clean.action_items = clean.action_items.map(ai => ({
+            id: ai.id, description: ai.description, assigned_to: ai.assigned_to,
+            status: ai.status, source_segment: ai.source_segment,
+          }));
+        }
+        delete clean.file_references;
+        clean.summary = (clean.summary || '').substring(0, 200);
+      } else {
+        if (clean.tickets) {
+          clean.tickets = clean.tickets.map(t => {
+            const tc = { ...t };
+            if (tc.comments && tc.comments.length > 5) {
+              tc.comments = tc.comments.slice(0, 5);
+              tc.comments.push({ note: `...${t.comments.length - 5} more comments omitted` });
+            }
+            return tc;
+          });
+        }
+      }
+      return `=== SEGMENT ${idx + 1} OF ${allSegmentAnalyses.length} ===\n${JSON.stringify(clean, null, 2)}`;
+    }).join('\n\n');
+    contentParts[0] = { text: compilationPrompt.replace(segmentDumps, trimmedDumps) };
+    const newEstimate = estimateTokens(contentParts[0].text);
+    console.log(`  Trimmed compilation input to ~${(newEstimate / 1000).toFixed(0)}K tokens`);
+  }
+
   const requestPayload = {
     model: config.GEMINI_MODEL,
     contents: [{ role: 'user', parts: contentParts }],
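The pre-flight check above rests on two small pieces of logic: a token estimate and the edge-segment rule that decides which segments keep full detail. A minimal sketch, assuming a chars/4 heuristic for `estimateTokens` and a 1M-token context window (both are assumptions; the package's actual implementation and configured window may differ):

```javascript
// Assumed ~4 chars/token heuristic; the package's real estimateTokens may differ.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// 80% safety margin, as in the pre-flight check (1M-token window assumed).
const GEMINI_CONTEXT_WINDOW = 1000000;
const safeLimit = Math.floor(GEMINI_CONTEXT_WINDOW * 0.80);
const needsTrim = estimateTokens('x'.repeat(4000000)) > safeLimit;

// Edge rule from the trimming pass: the first two and last two segments
// keep full detail; middle segments are compressed to IDs + statuses.
const isEdge = (idx, total) => idx < 2 || idx >= total - 2;
const edges = Array.from({ length: 7 }, (_, i) => isEdge(i, 7));
```

With seven segments, only segments 3 through 5 get the aggressive compression, and the trim runs only once the estimate crosses the 80% line.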
@@ -640,10 +731,44 @@ ${segmentDumps}`;
 
   const t0 = Date.now();
   console.log(`  Compiling with ${config.GEMINI_MODEL}...`);
-  const response = await withRetry(
-    () => ai.models.generateContent(requestPayload),
-    { label: 'Gemini final compilation', maxRetries: 2, baseDelay: 5000 }
-  );
+  let response;
+  try {
+    response = await withRetry(
+      () => ai.models.generateContent(requestPayload),
+      { label: 'Gemini final compilation', maxRetries: 2, baseDelay: 5000 }
+    );
+  } catch (compileErr) {
+    const errMsg = compileErr.message || '';
+    if (errMsg.includes('RESOURCE_EXHAUSTED') || errMsg.includes('429') || errMsg.includes('quota')) {
+      console.warn(`  ${c.warn('Context window or quota exceeded during compilation — waiting 30s and retrying with reduced input...')}`);
+      await new Promise(r => setTimeout(r, 30000));
+      // Halve the compilation prompt by keeping only edge segments
+      const miniDumps = allSegmentAnalyses.map((analysis, idx) => {
+        const clean = {
+          tickets: (analysis.tickets || []).map(t => ({ ticket_id: t.ticket_id, status: t.status, title: t.title, assignee: t.assignee })),
+          change_requests: (analysis.change_requests || []).map(cr => ({ id: cr.id, status: cr.status, title: cr.title })),
+          action_items: (analysis.action_items || []).map(ai => ({ id: ai.id, description: ai.description, assigned_to: ai.assigned_to, status: ai.status })),
+          blockers: (analysis.blockers || []).map(b => ({ id: b.id, description: b.description, status: b.status })),
+          scope_changes: analysis.scope_changes || [],
+          your_tasks: analysis.your_tasks || {},
+          summary: (analysis.summary || '').substring(0, 300),
+        };
+        return `=== SEGMENT ${idx + 1} OF ${allSegmentAnalyses.length} ===\n${JSON.stringify(clean, null, 2)}`;
+      }).join('\n\n');
+      requestPayload.contents[0].parts = [{ text: compilationPrompt.replace(/SEGMENT ANALYSES:\n[\s\S]*$/, `SEGMENT ANALYSES:\n${miniDumps}`) }];
+      try {
+        response = await withRetry(
+          () => ai.models.generateContent(requestPayload),
+          { label: 'Gemini compilation (reduced)', maxRetries: 1, baseDelay: 5000 }
+        );
+        console.log(`  ${c.success('Reduced compilation succeeded')}`);
+      } catch (reduceErr) {
+        console.error(`  ${c.error(`Reduced compilation also failed: ${reduceErr.message}`)}`);
+        throw reduceErr;
+      }
+    } else {
+      throw compileErr;
+    }
+  }
   const durationMs = Date.now() - t0;
   const rawText = response.text;
 
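The catch block above routes only quota/exhaustion failures into the reduced-input retry; everything else rethrows immediately. The substring checks can be exercised in isolation (the `isQuotaError` helper name is hypothetical; the released code inlines the condition):

```javascript
// Hypothetical helper wrapping the inlined condition from the catch block above.
function isQuotaError(err) {
  const msg = (err && err.message) || '';
  return msg.includes('RESOURCE_EXHAUSTED') || msg.includes('429') || msg.includes('quota');
}

const retryable = isQuotaError(new Error('429 Too Many Requests'));
const fatal = isQuotaError(new Error('INVALID_ARGUMENT: bad schema'));
```

Schema or auth errors therefore fail fast instead of burning the 30-second wait and a second compilation attempt.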
@@ -29,6 +29,14 @@ function estimateDocTokens(doc) {
   return 500;
 }
 
+/**
+ * Hard character limit for VTT fallback.
+ * When VTT parsing fails (0 cues), the full VTT is returned.
+ * Cap it so a huge transcript can't blow the context window.
+ * 500K chars ≈ 150K tokens — leaves plenty of room for docs + prompt.
+ */
+const VTT_FALLBACK_MAX_CHARS = 500000;
+
 // ════════════════════════════════════════════════════════════
 // Priority Classification
 // ════════════════════════════════════════════════════════════
@@ -100,12 +108,16 @@ function selectDocsByBudget(allDocs, tokenBudget, opts = {}) {
   const excluded = [];
   let usedTokens = 0;
 
+  // Hard cap: even P0/P1 docs may not exceed 2× the budget.
+  // This prevents a handful of huge critical docs from blowing the context window.
+  const hardCap = tokenBudget * 2;
+
   for (const item of classified) {
     if (usedTokens + item.tokens <= tokenBudget) {
       selected.push(item.doc);
       usedTokens += item.tokens;
-    } else if (item.priority <= PRIORITY.HIGH) {
-      // P0 and P1 are always included even if over budget
+    } else if (item.priority <= PRIORITY.HIGH && usedTokens + item.tokens <= hardCap) {
+      // P0 and P1 are always included even if over budget, up to the hard cap
       selected.push(item.doc);
       usedTokens += item.tokens;
     } else {
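The behavior change above is easiest to see with concrete numbers. A self-contained sketch of the selection loop under the new 2× hard cap (the `PRIORITY` values and doc shape are assumptions for illustration, not the package's actual definitions):

```javascript
// Assumed priority levels: lower number = more important.
const PRIORITY = { CRITICAL: 0, HIGH: 1, MEDIUM: 2 };

function selectByBudget(classified, tokenBudget) {
  const hardCap = tokenBudget * 2; // even P0/P1 may not exceed 2x the budget
  const selected = [];
  const excluded = [];
  let usedTokens = 0;
  for (const item of classified) {
    if (usedTokens + item.tokens <= tokenBudget) {
      selected.push(item);
      usedTokens += item.tokens;
    } else if (item.priority <= PRIORITY.HIGH && usedTokens + item.tokens <= hardCap) {
      // Over budget, but critical/high priority and still under the hard cap
      selected.push(item);
      usedTokens += item.tokens;
    } else {
      excluded.push(item);
    }
  }
  return { selected, excluded, usedTokens };
}

const docs = [
  { name: 'spec',   priority: PRIORITY.CRITICAL, tokens: 90 },
  { name: 'design', priority: PRIORITY.HIGH,     tokens: 80 },  // over budget, under cap
  { name: 'huge',   priority: PRIORITY.CRITICAL, tokens: 100 }, // would breach the 2x cap
];
const result = selectByBudget(docs, 100);
```

On a 100-token budget, `spec` and `design` land at 170 tokens used. Before this change, `huge` would have been force-included as a P0 doc and pushed the input to 270 tokens; the 2× cap now excludes it.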
@@ -171,14 +183,28 @@ function parseVttCues(vttContent) {
  */
 function sliceVttForSegment(vttContent, segStartSec, segEndSec, overlapSec = 30) {
   const cues = parseVttCues(vttContent);
-  if (cues.length === 0) return vttContent; // fallback: return full VTT
+  if (cues.length === 0) {
+    // Fallback: return full VTT but cap size to avoid context window overflow
+    if (vttContent.length > VTT_FALLBACK_MAX_CHARS) {
+      return vttContent.substring(0, VTT_FALLBACK_MAX_CHARS) +
+        `\n\n[TRUNCATED — original VTT was ${(vttContent.length / 1024).toFixed(0)} KB; capped at ${(VTT_FALLBACK_MAX_CHARS / 1024).toFixed(0)} KB]`;
+    }
+    return vttContent;
+  }
 
   const rangeStart = Math.max(0, segStartSec - overlapSec);
   const rangeEnd = segEndSec + overlapSec;
 
   const filtered = cues.filter(c => c.endSec >= rangeStart && c.startSec <= rangeEnd);
 
-  if (filtered.length === 0) return vttContent; // fallback
+  if (filtered.length === 0) {
+    // Fallback with cap
+    if (vttContent.length > VTT_FALLBACK_MAX_CHARS) {
+      return vttContent.substring(0, VTT_FALLBACK_MAX_CHARS) +
+        `\n\n[TRUNCATED — original VTT was ${(vttContent.length / 1024).toFixed(0)} KB; capped at ${(VTT_FALLBACK_MAX_CHARS / 1024).toFixed(0)} KB]`;
+    }
+    return vttContent;
+  }
 
   const header = `WEBVTT\n\n[Segment transcript: ${formatHMS(segStartSec)} — ${formatHMS(segEndSec)}]\n[Showing cues from ${formatHMS(rangeStart)} to ${formatHMS(rangeEnd)} with ${overlapSec}s overlap]\n`;
 
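Both fallback branches above apply the same cap-and-mark truncation. Extracted as a standalone sketch (`capVtt` is a hypothetical name; the released code inlines this logic in each branch):

```javascript
const VTT_FALLBACK_MAX_CHARS = 500000; // value from the diff above

// Hypothetical extraction of the inlined fallback logic:
// return the VTT as-is when small, otherwise cut at the cap
// and append a human-readable truncation marker.
function capVtt(vttContent) {
  if (vttContent.length <= VTT_FALLBACK_MAX_CHARS) return vttContent;
  return vttContent.substring(0, VTT_FALLBACK_MAX_CHARS) +
    `\n\n[TRUNCATED — original VTT was ${(vttContent.length / 1024).toFixed(0)} KB; capped at ${(VTT_FALLBACK_MAX_CHARS / 1024).toFixed(0)} KB]`;
}

const huge = 'a'.repeat(600000);
const capped = capVtt(huge);
```

A 600,000-character transcript comes back capped at 500,000 characters plus the marker, so even a pathological VTT cannot dominate the model input.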
@@ -492,4 +518,5 @@ module.exports = {
   buildProgressiveContext,
   buildSegmentFocus,
   detectBoundaryContext,
+  VTT_FALLBACK_MAX_CHARS,
 };