@steipete/summarize 0.8.2 → 0.10.0
- package/CHANGELOG.md +114 -1
- package/LICENSE +1 -1
- package/README.md +309 -182
- package/dist/cli.js +1 -1
- package/dist/esm/cache.js +72 -4
- package/dist/esm/cache.js.map +1 -1
- package/dist/esm/config.js +197 -1
- package/dist/esm/config.js.map +1 -1
- package/dist/esm/content/asset.js +75 -2
- package/dist/esm/content/asset.js.map +1 -1
- package/dist/esm/daemon/agent.js +547 -0
- package/dist/esm/daemon/agent.js.map +1 -0
- package/dist/esm/daemon/chat.js +97 -0
- package/dist/esm/daemon/chat.js.map +1 -0
- package/dist/esm/daemon/cli.js +105 -10
- package/dist/esm/daemon/cli.js.map +1 -1
- package/dist/esm/daemon/env-snapshot.js +3 -0
- package/dist/esm/daemon/env-snapshot.js.map +1 -1
- package/dist/esm/daemon/flow-context.js +53 -28
- package/dist/esm/daemon/flow-context.js.map +1 -1
- package/dist/esm/daemon/launchd.js +27 -0
- package/dist/esm/daemon/launchd.js.map +1 -1
- package/dist/esm/daemon/process-registry.js +206 -0
- package/dist/esm/daemon/process-registry.js.map +1 -0
- package/dist/esm/daemon/schtasks.js +64 -0
- package/dist/esm/daemon/schtasks.js.map +1 -1
- package/dist/esm/daemon/server.js +1034 -52
- package/dist/esm/daemon/server.js.map +1 -1
- package/dist/esm/daemon/summarize.js +66 -18
- package/dist/esm/daemon/summarize.js.map +1 -1
- package/dist/esm/daemon/systemd.js +61 -0
- package/dist/esm/daemon/systemd.js.map +1 -1
- package/dist/esm/flags.js +24 -0
- package/dist/esm/flags.js.map +1 -1
- package/dist/esm/llm/attachments.js +2 -0
- package/dist/esm/llm/attachments.js.map +1 -0
- package/dist/esm/llm/errors.js +6 -0
- package/dist/esm/llm/errors.js.map +1 -0
- package/dist/esm/llm/generate-text.js +206 -356
- package/dist/esm/llm/generate-text.js.map +1 -1
- package/dist/esm/llm/html-to-markdown.js +1 -2
- package/dist/esm/llm/html-to-markdown.js.map +1 -1
- package/dist/esm/llm/prompt.js.map +1 -1
- package/dist/esm/llm/providers/anthropic.js +126 -0
- package/dist/esm/llm/providers/anthropic.js.map +1 -0
- package/dist/esm/llm/providers/google.js +78 -0
- package/dist/esm/llm/providers/google.js.map +1 -0
- package/dist/esm/llm/providers/models.js +111 -0
- package/dist/esm/llm/providers/models.js.map +1 -0
- package/dist/esm/llm/providers/openai.js +150 -0
- package/dist/esm/llm/providers/openai.js.map +1 -0
- package/dist/esm/llm/providers/shared.js +48 -0
- package/dist/esm/llm/providers/shared.js.map +1 -0
- package/dist/esm/llm/providers/types.js +2 -0
- package/dist/esm/llm/providers/types.js.map +1 -0
- package/dist/esm/llm/transcript-to-markdown.js +1 -2
- package/dist/esm/llm/transcript-to-markdown.js.map +1 -1
- package/dist/esm/llm/types.js +2 -0
- package/dist/esm/llm/types.js.map +1 -0
- package/dist/esm/llm/usage.js +69 -0
- package/dist/esm/llm/usage.js.map +1 -0
- package/dist/esm/logging/daemon.js +124 -0
- package/dist/esm/logging/daemon.js.map +1 -0
- package/dist/esm/logging/ring-file.js +66 -0
- package/dist/esm/logging/ring-file.js.map +1 -0
- package/dist/esm/media-cache.js +251 -0
- package/dist/esm/media-cache.js.map +1 -0
- package/dist/esm/model-auto.js +103 -5
- package/dist/esm/model-auto.js.map +1 -1
- package/dist/esm/processes.js +2 -0
- package/dist/esm/processes.js.map +1 -0
- package/dist/esm/refresh-free.js +3 -3
- package/dist/esm/refresh-free.js.map +1 -1
- package/dist/esm/run/attachments.js +8 -4
- package/dist/esm/run/attachments.js.map +1 -1
- package/dist/esm/run/bird.js +118 -5
- package/dist/esm/run/bird.js.map +1 -1
- package/dist/esm/run/cache-state.js +3 -2
- package/dist/esm/run/cache-state.js.map +1 -1
- package/dist/esm/run/cli-preflight.js +19 -1
- package/dist/esm/run/cli-preflight.js.map +1 -1
- package/dist/esm/run/constants.js +0 -7
- package/dist/esm/run/constants.js.map +1 -1
- package/dist/esm/run/finish-line.js +58 -11
- package/dist/esm/run/finish-line.js.map +1 -1
- package/dist/esm/run/flows/asset/extract.js +70 -0
- package/dist/esm/run/flows/asset/extract.js.map +1 -0
- package/dist/esm/run/flows/asset/input.js +209 -25
- package/dist/esm/run/flows/asset/input.js.map +1 -1
- package/dist/esm/run/flows/asset/media-policy.js +3 -0
- package/dist/esm/run/flows/asset/media-policy.js.map +1 -0
- package/dist/esm/run/flows/asset/media.js +224 -0
- package/dist/esm/run/flows/asset/media.js.map +1 -0
- package/dist/esm/run/flows/asset/output.js +98 -0
- package/dist/esm/run/flows/asset/output.js.map +1 -0
- package/dist/esm/run/flows/asset/preprocess.js +92 -16
- package/dist/esm/run/flows/asset/preprocess.js.map +1 -1
- package/dist/esm/run/flows/asset/summary.js +165 -11
- package/dist/esm/run/flows/asset/summary.js.map +1 -1
- package/dist/esm/run/flows/url/extract.js +6 -6
- package/dist/esm/run/flows/url/extract.js.map +1 -1
- package/dist/esm/run/flows/url/flow.js +338 -36
- package/dist/esm/run/flows/url/flow.js.map +1 -1
- package/dist/esm/run/flows/url/markdown.js +6 -1
- package/dist/esm/run/flows/url/markdown.js.map +1 -1
- package/dist/esm/run/flows/url/slides-output.js +485 -0
- package/dist/esm/run/flows/url/slides-output.js.map +1 -0
- package/dist/esm/run/flows/url/slides-text.js +628 -0
- package/dist/esm/run/flows/url/slides-text.js.map +1 -0
- package/dist/esm/run/flows/url/summary.js +358 -83
- package/dist/esm/run/flows/url/summary.js.map +1 -1
- package/dist/esm/run/help.js +94 -5
- package/dist/esm/run/help.js.map +1 -1
- package/dist/esm/run/logging.js +12 -4
- package/dist/esm/run/logging.js.map +1 -1
- package/dist/esm/run/media-cache-state.js +33 -0
- package/dist/esm/run/media-cache-state.js.map +1 -0
- package/dist/esm/run/progress.js +19 -1
- package/dist/esm/run/progress.js.map +1 -1
- package/dist/esm/run/run-context.js +19 -0
- package/dist/esm/run/run-context.js.map +1 -0
- package/dist/esm/run/run-output.js +1 -1
- package/dist/esm/run/run-output.js.map +1 -1
- package/dist/esm/run/run-settings.js +182 -0
- package/dist/esm/run/run-settings.js.map +1 -0
- package/dist/esm/run/runner.js +225 -32
- package/dist/esm/run/runner.js.map +1 -1
- package/dist/esm/run/slides-cli.js +225 -0
- package/dist/esm/run/slides-cli.js.map +1 -0
- package/dist/esm/run/slides-render.js +163 -0
- package/dist/esm/run/slides-render.js.map +1 -0
- package/dist/esm/run/stream-output.js +63 -0
- package/dist/esm/run/stream-output.js.map +1 -0
- package/dist/esm/run/streaming.js +16 -43
- package/dist/esm/run/streaming.js.map +1 -1
- package/dist/esm/run/summary-engine.js +59 -41
- package/dist/esm/run/summary-engine.js.map +1 -1
- package/dist/esm/run/transcriber-cli.js +148 -0
- package/dist/esm/run/transcriber-cli.js.map +1 -0
- package/dist/esm/shared/sse-events.js +26 -0
- package/dist/esm/shared/sse-events.js.map +1 -0
- package/dist/esm/shared/streaming-merge.js +44 -0
- package/dist/esm/shared/streaming-merge.js.map +1 -0
- package/dist/esm/slides/extract.js +1942 -0
- package/dist/esm/slides/extract.js.map +1 -0
- package/dist/esm/slides/index.js +4 -0
- package/dist/esm/slides/index.js.map +1 -0
- package/dist/esm/slides/settings.js +73 -0
- package/dist/esm/slides/settings.js.map +1 -0
- package/dist/esm/slides/store.js +111 -0
- package/dist/esm/slides/store.js.map +1 -0
- package/dist/esm/slides/types.js +2 -0
- package/dist/esm/slides/types.js.map +1 -0
- package/dist/esm/tty/osc-progress.js +21 -1
- package/dist/esm/tty/osc-progress.js.map +1 -1
- package/dist/esm/tty/progress/fetch-html.js +8 -4
- package/dist/esm/tty/progress/fetch-html.js.map +1 -1
- package/dist/esm/tty/progress/transcript.js +82 -31
- package/dist/esm/tty/progress/transcript.js.map +1 -1
- package/dist/esm/tty/spinner.js +2 -2
- package/dist/esm/tty/spinner.js.map +1 -1
- package/dist/esm/tty/theme.js +189 -0
- package/dist/esm/tty/theme.js.map +1 -0
- package/dist/esm/tty/website-progress.js +17 -13
- package/dist/esm/tty/website-progress.js.map +1 -1
- package/dist/esm/version.js +1 -1
- package/dist/esm/version.js.map +1 -1
- package/dist/types/cache.d.ts +14 -2
- package/dist/types/config.d.ts +34 -0
- package/dist/types/daemon/agent.d.ts +25 -0
- package/dist/types/daemon/chat.d.ts +27 -0
- package/dist/types/daemon/env-snapshot.d.ts +1 -1
- package/dist/types/daemon/flow-context.d.ts +24 -3
- package/dist/types/daemon/launchd.d.ts +4 -0
- package/dist/types/daemon/process-registry.d.ts +73 -0
- package/dist/types/daemon/schtasks.d.ts +4 -0
- package/dist/types/daemon/server.d.ts +7 -1
- package/dist/types/daemon/summarize.d.ts +47 -5
- package/dist/types/daemon/systemd.d.ts +4 -0
- package/dist/types/flags.d.ts +1 -0
- package/dist/types/llm/attachments.d.ts +6 -0
- package/dist/types/llm/errors.d.ts +1 -0
- package/dist/types/llm/generate-text.d.ts +29 -13
- package/dist/types/llm/prompt.d.ts +7 -2
- package/dist/types/llm/providers/anthropic.d.ts +30 -0
- package/dist/types/llm/providers/google.d.ts +29 -0
- package/dist/types/llm/providers/models.d.ts +27 -0
- package/dist/types/llm/providers/openai.d.ts +38 -0
- package/dist/types/llm/providers/shared.d.ts +14 -0
- package/dist/types/llm/providers/types.d.ts +6 -0
- package/dist/types/llm/types.d.ts +5 -0
- package/dist/types/llm/usage.d.ts +5 -0
- package/dist/types/logging/daemon.d.ts +26 -0
- package/dist/types/logging/ring-file.d.ts +10 -0
- package/dist/types/media-cache.d.ts +22 -0
- package/dist/types/model-auto.d.ts +1 -0
- package/dist/types/processes.d.ts +1 -0
- package/dist/types/run/attachments.d.ts +9 -6
- package/dist/types/run/bird.d.ts +7 -0
- package/dist/types/run/constants.d.ts +0 -2
- package/dist/types/run/finish-line.d.ts +59 -1
- package/dist/types/run/flows/asset/extract.d.ts +18 -0
- package/dist/types/run/flows/asset/input.d.ts +12 -2
- package/dist/types/run/flows/asset/media-policy.d.ts +2 -0
- package/dist/types/run/flows/asset/media.d.ts +21 -0
- package/dist/types/run/flows/asset/output.d.ts +42 -0
- package/dist/types/run/flows/asset/preprocess.d.ts +22 -2
- package/dist/types/run/flows/asset/summary.d.ts +6 -0
- package/dist/types/run/flows/url/extract.d.ts +2 -1
- package/dist/types/run/flows/url/slides-output.d.ts +66 -0
- package/dist/types/run/flows/url/slides-text.d.ts +87 -0
- package/dist/types/run/flows/url/summary.d.ts +11 -3
- package/dist/types/run/flows/url/types.d.ts +29 -2
- package/dist/types/run/help.d.ts +3 -0
- package/dist/types/run/logging.d.ts +3 -2
- package/dist/types/run/media-cache-state.d.ts +7 -0
- package/dist/types/run/progress.d.ts +2 -1
- package/dist/types/run/run-context.d.ts +44 -0
- package/dist/types/run/run-settings.d.ts +62 -0
- package/dist/types/run/slides-cli.d.ts +9 -0
- package/dist/types/run/slides-render.d.ts +30 -0
- package/dist/types/run/stream-output.d.ts +12 -0
- package/dist/types/run/streaming.d.ts +10 -4
- package/dist/types/run/summary-engine.d.ts +15 -3
- package/dist/types/run/summary-llm.d.ts +2 -2
- package/dist/types/run/transcriber-cli.d.ts +8 -0
- package/dist/types/shared/sse-events.d.ts +64 -0
- package/dist/types/shared/streaming-merge.d.ts +4 -0
- package/dist/types/slides/extract.d.ts +42 -0
- package/dist/types/slides/index.d.ts +5 -0
- package/dist/types/slides/settings.d.ts +20 -0
- package/dist/types/slides/store.d.ts +15 -0
- package/dist/types/slides/types.d.ts +40 -0
- package/dist/types/tty/osc-progress.d.ts +2 -2
- package/dist/types/tty/progress/fetch-html.d.ts +3 -1
- package/dist/types/tty/progress/transcript.d.ts +3 -1
- package/dist/types/tty/spinner.d.ts +3 -1
- package/dist/types/tty/theme.d.ts +44 -0
- package/dist/types/tty/website-progress.d.ts +3 -1
- package/dist/types/version.d.ts +1 -1
- package/docs/README.md +13 -8
- package/docs/_config.yml +26 -0
- package/docs/_layouts/default.html +60 -0
- package/docs/agent.md +333 -0
- package/docs/assets/site.css +748 -0
- package/docs/assets/site.js +72 -0
- package/docs/assets/summarize-cli.png +0 -0
- package/docs/assets/summarize-extension.png +0 -0
- package/docs/assets/youtube-slides.png +0 -0
- package/docs/cache.md +29 -3
- package/docs/chrome-extension.md +85 -7
- package/docs/config.md +74 -2
- package/docs/extract-only.md +10 -2
- package/docs/index.html +205 -0
- package/docs/index.md +25 -0
- package/docs/language.md +1 -1
- package/docs/llm.md +17 -1
- package/docs/manual-tests.md +2 -0
- package/docs/media.md +37 -0
- package/docs/model-auto.md +2 -1
- package/docs/nvidia-onnx-transcription.md +55 -0
- package/docs/openai.md +5 -0
- package/docs/releasing.md +26 -0
- package/docs/site/assets/site.css +399 -228
- package/docs/site/assets/summarize-cli.png +0 -0
- package/docs/site/assets/summarize-extension.png +0 -0
- package/docs/site/docs/chrome-extension.html +89 -0
- package/docs/site/docs/config.html +1 -0
- package/docs/site/docs/extract-only.html +1 -0
- package/docs/site/docs/firecrawl.html +1 -0
- package/docs/site/docs/index.html +5 -0
- package/docs/site/docs/llm.html +1 -0
- package/docs/site/docs/openai.html +1 -0
- package/docs/site/docs/website.html +1 -0
- package/docs/site/docs/youtube.html +1 -0
- package/docs/site/index.html +148 -84
- package/docs/slides.md +74 -0
- package/docs/timestamps.md +103 -0
- package/docs/website.md +13 -0
- package/docs/youtube.md +16 -0
- package/package.json +22 -18
- package/dist/esm/daemon/request-settings.js +0 -91
- package/dist/esm/daemon/request-settings.js.map +0 -1
- package/dist/types/daemon/request-settings.d.ts +0 -27
package/README.md
CHANGED
@@ -1,17 +1,93 @@
-# Summarize
+# Summarize — Chrome Side Panel + CLI
 
-Fast
+Fast summaries from URLs, files, and media. Works in the terminal, a Chrome Side Panel and Firefox Sidebar.
 
-
-- YouTube links (best-effort transcripts; can fall back to audio transcription)
-- Podcasts (Apple Podcasts / Spotify / RSS; prefers published transcripts when available; otherwise transcribes full episodes)
-- Any audio/video (local files or direct media URLs; transcribe via Whisper, then summarize)
-- Remote files (PDFs/images/audio/video via URL — downloaded and forwarded to the model)
-- Local files (PDFs/images/audio/video/text — forwarded or inlined; support depends on provider/model)
+**0.10.0 preview (unreleased):** this README reflects the upcoming release.
 
-
+## 0.10.0 preview highlights (most interesting first)
 
-
+- Chrome Side Panel **chat** (streaming agent + history) inside the sidebar.
+- **YouTube slides**: screenshots + OCR + transcript cards, timestamped seek, OCR/Transcript toggle.
+- Media-aware summaries: auto‑detect video/audio vs page content.
+- Streaming Markdown + metrics + cache‑aware status.
+- CLI supports URLs, files, podcasts, YouTube, audio/video, PDFs.
+
+## Feature overview
+
+- URLs, files, and media: web pages, PDFs, images, audio/video, YouTube, podcasts, RSS.
+- Slide extraction for video sources (YouTube/direct media) with OCR + timestamped cards.
+- Transcript-first media flow: published transcripts when available, Whisper fallback when not.
+- Streaming output with Markdown rendering, metrics, and cache-aware status.
+- Local, paid, and free models: OpenAI‑compatible local endpoints, paid providers, plus an OpenRouter free preset.
+- Output modes: Markdown/text, JSON diagnostics, extract-only, metrics, timing, and cost estimates.
+- Smart default: if content is shorter than the requested length, we return it as-is (use `--force-summary` to override).
+
+## Get the extension (recommended)
+
+
+
+One‑click summarizer for the current tab. Chrome Side Panel + Firefox Sidebar + local daemon for streaming Markdown.
+
+**Chrome Web Store:** [Summarize Side Panel](https://chromewebstore.google.com/detail/summarize/cejgnmmhbbpdmjnfppjdfkocebngehfg)
+
+YouTube slide screenshots (from the browser):
+
+
+
+### Beginner quickstart (extension)
+
+1) Install the CLI (choose one):
+- **npm** (cross‑platform): `npm i -g @steipete/summarize`
+- **Homebrew** (macOS arm64): `brew install steipete/tap/summarize`
+2) Install the extension (Chrome Web Store link above) and open the Side Panel.
+3) The panel shows a token + install command. Run it in Terminal:
+- `summarize daemon install --token <TOKEN>`
+
+Why a daemon/service?
+- The extension can’t run heavy extraction inside the browser. It talks to a local background service on `127.0.0.1` for fast streaming and media tools (yt‑dlp, ffmpeg, OCR, transcription).
+- The service autostarts (launchd/systemd/Scheduled Task) so the Side Panel is always ready.
+
+If you only want the **CLI**, you can skip the daemon install entirely.
+
+Notes:
+
+- Summarization only runs when the Side Panel is open.
+- Auto mode summarizes on navigation (incl. SPAs); otherwise use the button.
+- Daemon is localhost-only and requires a shared token.
+- Autostart: macOS (launchd), Linux (systemd user), Windows (Scheduled Task).
+- Tip: configure `free` via `summarize refresh-free` (needs `OPENROUTER_API_KEY`). Add `--set-default` to set model=`free`.
+
+More:
+
+- Step-by-step install: [apps/chrome-extension/README.md](apps/chrome-extension/README.md)
+- Architecture + troubleshooting: [docs/chrome-extension.md](docs/chrome-extension.md)
+- Firefox compatibility notes: [apps/chrome-extension/docs/firefox.md](apps/chrome-extension/docs/firefox.md)
+
+### Slides (extension)
+
+- Select **Video + Slides** in the Summarize picker.
+- Slides render at the top; expand to full‑width cards with timestamps.
+- Click a slide to seek the video; toggle **Transcript/OCR** when OCR is significant.
+- Requirements: `yt-dlp` + `ffmpeg` for extraction; `tesseract` for OCR. Missing tools show an in‑panel notice.
+
+### Advanced (unpacked / dev)
+
+1) Build + load the extension (unpacked):
+- Chrome: `pnpm -C apps/chrome-extension build`
+- `chrome://extensions` → Developer mode → Load unpacked
+- Pick: `apps/chrome-extension/.output/chrome-mv3`
+- Firefox: `pnpm -C apps/chrome-extension build:firefox`
+- `about:debugging#/runtime/this-firefox` → Load Temporary Add-on
+- Pick: `apps/chrome-extension/.output/firefox-mv3/manifest.json`
+2) Open Side Panel/Sidebar → copy token.
+3) Install daemon in dev mode:
+- `pnpm summarize daemon install --token <TOKEN> --dev`
+
+## CLI
+
+
+
+### Install
 
 Requires Node 22+.
 
@@ -21,7 +97,7 @@ Requires Node 22+.
 npx -y @steipete/summarize "https://example.com"
 ```
 
-- npm (global
+- npm (global):
 
 ```bash
 npm i -g @steipete/summarize
@@ -45,112 +121,95 @@ brew install steipete/tap/summarize
 
 Apple Silicon only (arm64).
 
-
+### CLI vs extension
+
+- **CLI only:** just install via npm/Homebrew and run `summarize ...` (no daemon needed).
+- **Chrome/Firefox extension:** install the CLI **and** run `summarize daemon install --token <TOKEN>` so the Side Panel can stream results and use local tools.
+
+### Quickstart
 
 ```bash
 summarize "https://example.com"
 ```
 
-
-
-Want a one-click “always-on” summarizer in Chrome (real Side Panel, not injected UI)?
-
-This is a **Chrome extension** + a tiny local **daemon** (autostart service) that streams Markdown summaries for the **currently visible tab** into the Side Panel.
-
-Docs + setup: `https://summarize.sh`
-
-Quickstart (local daemon):
-
-1) Install summarize (choose one):
-- `npm i -g @steipete/summarize`
-- `brew install steipete/tap/summarize` (macOS arm64)
-2) Build + load the extension (unpacked):
-- `pnpm -C apps/chrome-extension build`
-- Chrome → `chrome://extensions` → Developer mode → “Load unpacked”
-- Pick: `apps/chrome-extension/.output/chrome-mv3`
-3) Open the Side Panel → it shows a token + install command.
-4) Run the install command in Terminal:
-- Installed binary: `summarize daemon install --token <TOKEN>`
-- Repo/dev checkout: `pnpm summarize daemon install --token <TOKEN> --dev`
-5) Verify / debug:
-- `summarize daemon status`
-- `summarize daemon restart`
-
-Notes:
-
-- Summarization only runs when the Side Panel is open.
-- “Auto” mode summarizes on navigation (incl. SPAs); otherwise use the button.
-- The daemon is localhost-only and requires a shared token.
-- Daemon autostart: macOS (launchd), Linux (systemd user), Windows (Scheduled Task).
-- Tip: configure `free` via `summarize refresh-free` (requires `OPENROUTER_API_KEY`). Add `--set-default` to also set model=`free`, then set Model to `free` in extension settings.
+### Inputs
 
-
-- Extension package/dev notes: `apps/chrome-extension/README.md`
-
-Troubleshooting:
-
-- **“Receiving end does not exist”**: Chrome didn’t inject the content script yet.
-- Extension details → “Site access” → set to “On all sites” (or allow this domain)
-- Reload the tab once.
-- **“Failed to fetch” / daemon unreachable**:
-- Run `summarize daemon status`
-- Check logs: `~/.summarize/logs/daemon.err.log`
-
-Input can be a URL or a local file path:
+URLs or local paths:
 
 ```bash
-
-
+summarize "/path/to/file.pdf" --model google/gemini-3-flash-preview
+summarize "https://example.com/report.pdf" --model google/gemini-3-flash-preview
+summarize "/path/to/audio.mp3"
+summarize "/path/to/video.mp4"
 ```
 
-
+YouTube (supports `youtube.com` and `youtu.be`):
 
 ```bash
-
+summarize "https://youtu.be/dQw4w9WgXcQ" --youtube auto
 ```
 
-
+Podcast RSS (transcribes latest enclosure):
 
 ```bash
-
+summarize "https://feeds.npr.org/500005/podcast.xml"
 ```
 
-
+Apple Podcasts episode page:
 
 ```bash
-
+summarize "https://podcasts.apple.com/us/podcast/2424-jelly-roll/id360084272?i=1000740717432"
 ```
 
-
+Spotify episode page (best-effort; may fail for exclusives):
 
 ```bash
-
+summarize "https://open.spotify.com/episode/5auotqWAXhhKyb9ymCuBJY"
 ```
 
-
+### Output length
+
+`--length` controls how much output we ask for (guideline), not a hard cap.
 
 ```bash
-
+summarize "https://example.com" --length long
+summarize "https://example.com" --length 20k
 ```
 
-
-
-
-
--
--
--
+- Presets: `short|medium|long|xl|xxl`
+- Character targets: `1500`, `20k`, `20000`
+- Optional hard cap: `--max-output-tokens <count>` (e.g. `2000`, `2k`)
+- Provider/model APIs still enforce their own maximum output limits.
+- If omitted, no max token parameter is sent (provider default).
+- Prefer `--length` unless you need a hard cap.
+- Short content: when extracted content is shorter than the requested length, the CLI returns the content as-is.
+- Override with `--force-summary` to always run the LLM.
+- Minimums: `--length` numeric values must be >= 50 chars; `--max-output-tokens` must be >= 16.
+- Preset targets (source of truth: `packages/core/src/prompts/summary-lengths.ts`):
+- short: target ~900 chars (range 600-1,200)
+- medium: target ~1,800 chars (range 1,200-2,500)
+- long: target ~4,200 chars (range 2,500-6,000)
+- xl: target ~9,000 chars (range 6,000-14,000)
+- xxl: target ~17,000 chars (range 14,000-22,000)
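Editor's sketch (not part of the package diff): the `--length` values listed above can be normalised to a character target roughly as follows. This is an illustration only, not the CLI's own parser. `length_to_chars` is a hypothetical helper, the reading of `20k` as 20,000 characters is an assumption based on the examples above, and presets are mapped to the approximate targets from the list.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not the CLI's parser: turn a --length value
# (preset name, plain number, or "<n>k") into a character target.
length_to_chars() {
  local value="$1" chars digits
  case "$value" in
    # Preset targets copied from the list above.
    short)  chars=900 ;;
    medium) chars=1800 ;;
    long)   chars=4200 ;;
    xl)     chars=9000 ;;
    xxl)    chars=17000 ;;
    *[kK])
      # Assumption: a trailing "k" multiplies by 1000 (20k -> 20000).
      digits="${value%[kK]}"
      [[ "$digits" =~ ^[0-9]+$ ]] || { echo "invalid length: $value" >&2; return 1; }
      chars=$(( 10#$digits * 1000 ))
      ;;
    *)
      [[ "$value" =~ ^[0-9]+$ ]] || { echo "invalid length: $value" >&2; return 1; }
      chars=$(( 10#$value ))
      ;;
  esac
  # Documented minimum for numeric values: at least 50 characters.
  if (( chars < 50 )); then
    echo "length must be >= 50 chars: $value" >&2
    return 1
  fi
  echo "$chars"
}
```

With this sketch, `length_to_chars 20k` prints `20000`, while `length_to_chars 10` fails because it is below the documented 50-character minimum.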
+
+### What file types work?
+
+Best effort and provider-dependent. These usually work well:
+
+- `text/*` and common structured text (`.txt`, `.md`, `.json`, `.yaml`, `.xml`, ...)
+- Text-like files are inlined into the prompt for better provider compatibility.
+- PDFs: `application/pdf` (provider support varies; Google is the most reliable here)
 - Images: `image/jpeg`, `image/png`, `image/webp`, `image/gif`
-- Audio/Video: `audio/*`, `video/*` (when supported by the model)
+- Audio/Video: `audio/*`, `video/*` (local audio/video files MP3/WAV/M4A/OGG/FLAC/MP4/MOV/WEBM automatically transcribed, when supported by the model)
 
 Notes:
 
-- If a provider rejects a media type, the CLI fails fast with a friendly message
-- xAI models
+- If a provider rejects a media type, the CLI fails fast with a friendly message.
+- xAI models do not support attaching generic files (like PDFs) via the AI SDK; use Google/OpenAI/Anthropic for those.
 
-
+### Model ids
 
-Use
+Use gateway-style ids: `<provider>/<model>`.
 
 Examples:
 
@@ -161,98 +220,54 @@ Examples:
 - `zai/glm-4.7`
 - `openrouter/openai/gpt-5-mini` (force OpenRouter)
 
-Note: some models/providers
+Note: some models/providers do not support streaming or certain file media types. When that happens, the CLI prints a friendly error (or auto-disables streaming for that model when supported by the provider).
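Editor's sketch (not part of the package diff): a gateway-style id of the form `<provider>/<model>` can be split at the first slash, which is consistent with the `openrouter/openai/gpt-5-mini` example where the provider is `openrouter`. `split_model_id` is a hypothetical helper written for illustration, and how the CLI itself parses ids is not shown in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split "<provider>/<model>" at the FIRST slash,
# so any further slashes stay inside the model part.
split_model_id() {
  local id="$1" provider model
  case "$id" in
    */*) ;;
    *) echo "expected <provider>/<model>, got: $id" >&2; return 1 ;;
  esac
  provider="${id%%/*}"   # text before the first slash
  model="${id#*/}"       # text after the first slash
  if [[ -z "$provider" || -z "$model" ]]; then
    echo "expected <provider>/<model>, got: $id" >&2
    return 1
  fi
  echo "$provider $model"
}
```

Under these assumptions, `split_model_id openrouter/openai/gpt-5-mini` prints `openrouter openai/gpt-5-mini`. Values such as `auto` or a config-defined model name are not of this form, so the sketch rejects them.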
 
-
-
-`--length` controls *how much output we ask for* (guideline), not a hard truncation.
-
-```bash
-npx -y @steipete/summarize "https://example.com" --length long
-npx -y @steipete/summarize "https://example.com" --length 20k
-```
-
-- Presets: `short|medium|long|xl|xxl`
-- Character targets: `1500`, `20k`, `20000`
-- Optional hard cap: `--max-output-tokens <count>` (e.g. `2000`, `2k`)
-- Provider/model APIs still enforce their own maximum output limits.
-- If omitted, no max token parameter is sent (provider default).
-- Prefer `--length` unless you need a hard cap (some providers count “reasoning” into the cap).
-- Minimums: `--length` numeric values must be ≥ 50 chars; `--max-output-tokens` must be ≥ 16.
-
-## Limits
+### Limits
 
 - Text inputs over 10 MB are rejected before tokenization.
-- Text prompts are preflighted against the model
+- Text prompts are preflighted against the model input limit (LiteLLM catalog), using a GPT tokenizer.
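Editor's sketch (not part of the package diff): a script that feeds text files to the CLI could check the size limit up front. `within_text_limit` is a hypothetical helper. Treating "10 MB" as 10 × 1024 × 1024 bytes is an assumption, since the README does not say whether the limit is decimal or binary.

```bash
#!/usr/bin/env bash
# Hypothetical pre-check: succeed only if the file is within the
# documented 10 MB text limit (read here as 10 MiB, an assumption).
within_text_limit() {
  local file="$1"
  local limit=$(( 10 * 1024 * 1024 ))
  local size
  # wc -c reports the byte count; fail with status 2 if unreadable.
  size=$(wc -c < "$file") || return 2
  size=$(( size ))   # normalise padding some wc builds add
  if (( size > limit )); then
    echo "rejected: $file is $size bytes (limit $limit)" >&2
    return 1
  fi
  return 0
}
```

A caller might run `within_text_limit notes.txt && summarize notes.txt`, where `notes.txt` is a placeholder file name.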
 
-
+### Common flags
 
 ```bash
-
+summarize <input> [flags]
 ```
 
 Use `summarize --help` or `summarize help` for the full help text.
 
 - `--model <provider/model>`: which model to use (defaults to `auto`)
 - `--model auto`: automatic model selection + fallback (default)
-- `--model <name>`: use a config-defined model (see
+- `--model <name>`: use a config-defined model (see Configuration)
 - `--timeout <duration>`: `30s`, `2m`, `5000ms` (default `2m`)
 - `--retries <count>`: LLM retry attempts on timeout (default `1`)
 - `--length short|medium|long|xl|xxl|s|m|l|<chars>`
-- `--language, --lang <language>`: output language (`auto` = match source
-- `--max-output-tokens <count>`: hard cap for LLM output tokens
-- `--cli [provider]`: use a CLI provider (
+- `--language, --lang <language>`: output language (`auto` = match source)
+- `--max-output-tokens <count>`: hard cap for LLM output tokens
+- `--cli [provider]`: use a CLI provider (`--model cli/<provider>`). If omitted, uses auto selection with CLI enabled.
 - `--stream auto|on|off`: stream LLM output (`auto` = TTY only; disabled in `--json` mode)
-- `--plain`:
+- `--plain`: keep raw output (no ANSI/OSC Markdown rendering)
 - `--no-color`: disable ANSI colors
+- `--theme <name>`: CLI theme (`aurora`, `ember`, `moss`, `mono`)
 - `--format md|text`: website/file content format (default `text`)
-- `--markdown-mode off|auto|llm|readability`: Markdown
-- `--preprocess off|auto|always`: controls `uvx markitdown` usage (default `auto
+- `--markdown-mode off|auto|llm|readability`: HTML -> Markdown mode (default `readability`)
+- `--preprocess off|auto|always`: controls `uvx markitdown` usage (default `auto`)
 - Install `uvx`: `brew install uv` (or https://astral.sh/uv/)
-- `--extract`: print extracted content and exit (
+- `--extract`: print extracted content and exit (URLs only)
 - Deprecated alias: `--extract-only`
+- `--slides`: extract slides for YouTube/direct video URLs and render them inline in the summary narrative (auto-renders inline in supported terminals)
+- `--slides-ocr`: run OCR on extracted slides (requires `tesseract`)
+- `--slides-dir <dir>`: base output dir for slide images (default `./slides`)
+- `--slides-scene-threshold <value>`: scene detection threshold (0.1-1.0)
+- `--slides-max <count>`: maximum slides to extract (default `6`)
+- `--slides-min-duration <seconds>`: minimum seconds between slides
 - `--json`: machine-readable output with diagnostics, prompt, `metrics`, and optional summary
 - `--verbose`: debug/diagnostics on stderr
-- `--metrics off|on|detailed`: metrics output (default `on
+- `--metrics off|on|detailed`: metrics output (default `on`)
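Editor's sketch (not part of the package diff): the `--timeout` examples in the list above (`30s`, `2m`, `5000ms`) suggest a number followed by a unit. The helper below converts such values to milliseconds. `duration_to_ms` is hypothetical, and accepting only the `ms`, `s`, and `m` suffixes is an assumption drawn from those three examples, so the real CLI may accept other forms.

```bash
#!/usr/bin/env bash
# Hypothetical helper: convert "<n>ms", "<n>s" or "<n>m" to milliseconds.
duration_to_ms() {
  local value="$1" n unit
  if [[ "$value" =~ ^([0-9]+)(ms|s|m)$ ]]; then
    n="${BASH_REMATCH[1]}"
    unit="${BASH_REMATCH[2]}"
    case "$unit" in
      ms) echo $(( 10#$n )) ;;          # already milliseconds
      s)  echo $(( 10#$n * 1000 )) ;;   # seconds
      m)  echo $(( 10#$n * 60000 )) ;;  # minutes
    esac
  else
    echo "invalid duration: $value" >&2
    return 1
  fi
}
```

Here `duration_to_ms 2m` prints `120000`, matching the documented default of `2m`.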
 
-
-
-Run: `summarize <url>`
-
-- Apple Podcasts
-- Spotify
-- Amazon Music / Audible podcast pages
-- Podbean
-- Podchaser
-- RSS feeds (Podcasting 2.0 transcripts when available)
-- Embedded YouTube podcast pages (e.g. JREPodcast)
-
-Transcription: prefers local `whisper.cpp` when installed; otherwise uses OpenAI Whisper or FAL when keys are set.
-
-## Translation paths
-
-`--language/--lang` controls the *output language* of the summary (and other LLM-generated text). Default is `auto` (match source language).
-
-When the input is audio/video, the CLI needs a transcript first. The transcript comes from one of these paths:
-
-1. **Existing transcript (preferred)**
-- YouTube: uses `youtubei` / `captionTracks` when available.
-- Podcasts: uses Podcasting 2.0 RSS `<podcast:transcript>` (JSON/VTT) when the feed publishes it.
-2. **Whisper transcription (fallback)**
-- YouTube: falls back to `yt-dlp` (audio download) + Whisper transcription when configured; Apify is a last-last resort (requires `APIFY_API_TOKEN`).
-- Prefers local `whisper.cpp` when installed + model available.
-- Otherwise uses cloud Whisper (OpenAI `OPENAI_API_KEY`) or FAL (`FAL_KEY`) depending on configuration.
-
-For “any video/audio file” (local path or direct media URL), use `--video-mode transcript` to force “transcribe → summarize”:
-
-```bash
-summarize /path/to/file.mp4 --video-mode transcript --lang en
-```
-
-## Auto model ordering
+### Auto model ordering
 
 `--model auto` builds candidate attempts from built-in rules (or your `model.rules` overrides).
-CLI tools are
+CLI tools are not used in auto mode unless you enable them via `cli.enabled` in config.
 Why: CLI adds ~4s latency per attempt and higher variance.
 Shortcut: `--cli` (with no provider) uses auto selection with CLI enabled.
 
@@ -276,28 +291,31 @@ Disable CLI attempts:
|
|
|
276
291
|
}
|
|
277
292
|
```
|
|
278
293
|
|
|
279
|
-
Note: when `cli.enabled` is set, it
|
|
294
|
+
Note: when `cli.enabled` is set, it is also an allowlist for explicit `--cli` / `--model cli/...`.
|
|
280
295
|
|
|
281
|
-
|
|
296
|
+
### Website extraction (Firecrawl + Markdown)
|
|
282
297
|
|
|
283
|
-
Non-YouTube URLs go through a
|
|
298
|
+
Non-YouTube URLs go through a fetch -> extract pipeline. When direct fetch/extraction is blocked or too thin,
|
|
299
|
+
`--firecrawl auto` can fall back to Firecrawl (if configured).
|
|
284
300
|
|
|
285
301
|
- `--firecrawl off|auto|always` (default `auto`)
|
|
286
302
|
- `--extract --format md|text` (default `text`; if `--format` is omitted, `--extract` defaults to `md` for non-YouTube URLs)
|
|
287
|
-
- `--markdown-mode off|auto|llm|readability` (default `readability
|
|
303
|
+
- `--markdown-mode off|auto|llm|readability` (default `readability`)
|
|
288
304
|
- `auto`: use an LLM converter when configured; may fall back to `uvx markitdown`
|
|
289
305
|
- `llm`: force LLM conversion (requires a configured model key)
|
|
290
306
|
- `off`: disable LLM conversion (still may return Firecrawl Markdown when configured)
|
|
291
307
|
- Plain-text mode: use `--format text`.
|
|
292
308
|
|
|
293
|
-
|
|
309
|
+
### YouTube transcripts
|
|
294
310
|
|
|
295
|
-
`--youtube auto` tries best-effort web transcript endpoints first. When captions
|
|
311
|
+
`--youtube auto` tries best-effort web transcript endpoints first. When captions are not available, it falls back to:
|
|
296
312
|
|
|
297
|
-
1.
|
|
298
|
-
2.
|
|
313
|
+
1. Apify (if `APIFY_API_TOKEN` is set): uses a scraping actor (`faVsWy9VTSNVIhWpR`)
|
|
314
|
+
2. yt-dlp + Whisper (if `yt-dlp` is available): downloads audio, then transcribes with local `whisper.cpp` when installed
|
|
315
|
+
(preferred), otherwise falls back to OpenAI (`OPENAI_API_KEY`) or FAL (`FAL_KEY`)
|
|
299
316
|
|
|
300
317
|
Environment variables for yt-dlp mode:
|
|
318
|
+
|
|
301
319
|
- `YT_DLP_PATH` - optional path to yt-dlp binary (otherwise `yt-dlp` is resolved via `PATH`)
|
|
302
320
|
- `SUMMARIZE_WHISPER_CPP_MODEL_PATH` - optional override for the local `whisper.cpp` model file
|
|
303
321
|
- `SUMMARIZE_WHISPER_CPP_BINARY` - optional override for the local binary (default: `whisper-cli`)
|
|
@@ -307,17 +325,82 @@ Environment variables for yt-dlp mode:
|
|
|
307
325
|
|
|
308
326
|
Apify costs money but tends to be more reliable when captions exist.
|
|
309
327
|
|
|
328
|
+
### Slide extraction (YouTube + direct video URLs)
|
|
329
|
+
|
|
330
|
+
Extract slide screenshots (scene detection via `ffmpeg`) and optional OCR:
|
|
331
|
+
|
|
332
|
+
```bash
|
|
333
|
+
summarize "https://www.youtube.com/watch?v=..." --slides
|
|
334
|
+
summarize "https://www.youtube.com/watch?v=..." --slides --slides-ocr
|
|
335
|
+
```
|
|
336
|
+
|
|
337
|
+
Outputs are written under `./slides/<sourceId>/` (or `--slides-dir`). OCR results are included in JSON output
|
|
338
|
+
(`--json`) and stored in `slides.json` inside the slide directory. When scene detection is too sparse, the
|
|
339
|
+
extractor also samples at a fixed interval to improve coverage.
|
|
340
|
+
When using `--slides`, supported terminals (kitty/iTerm/Konsole) render inline thumbnails automatically inside the
|
|
341
|
+
summary narrative (the model inserts `[slide:N]` markers). Timestamp links are clickable when the terminal supports
|
|
342
|
+
OSC-8 (YouTube/Vimeo/Loom/Dropbox). If inline images are unsupported, Summarize prints a note with the on-disk
|
|
343
|
+
slide directory.
|
|
344
|
+
|
|
345
|
+
Use `--slides --extract` to print the full timed transcript and insert slide images inline at matching timestamps.
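Since slide extraction depends on `ffmpeg` and `--slides-ocr` requires `tesseract`, a quick check before a long run can save time. The helper below is a minimal sketch and not part of `summarize`; the function name `missing_tools` is hypothetical, and it only tests whether each tool is on `PATH`.

```shell
# Illustrative helper, not part of summarize: prints each named tool that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Scene detection uses ffmpeg; --slides-ocr additionally requires tesseract.
missing="$(missing_tools ffmpeg tesseract)"
if [ -n "$missing" ]; then
  echo "missing for --slides --slides-ocr: $missing" >&2
fi
```

If nothing is reported, both tools were found; this says nothing about whether their versions are suitable.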
|
|
346
|
+
|
|
310
347
|
Format the extracted transcript as Markdown (headings + paragraphs) via an LLM:
|
|
311
348
|
|
|
312
349
|
```bash
|
|
313
350
|
summarize "https://www.youtube.com/watch?v=..." --extract --format md --markdown-mode llm
|
|
314
351
|
```
|
|
315
352
|
|
|
316
|
-
|
|
353
|
+
### Media transcription (Whisper)
|
|
354
|
+
|
|
355
|
+
Local audio/video files are transcribed first, then summarized. `--video-mode transcript` forces
|
|
356
|
+
direct media URLs (and embedded media) through Whisper first. Prefers local `whisper.cpp` when available; otherwise requires
|
|
357
|
+
`OPENAI_API_KEY` or `FAL_KEY`.
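The fallback described above can be approximated with a small preflight check. This is a sketch under stated assumptions, not the tool's own logic: the function name `pick_transcriber` is hypothetical, it only checks that the local binary is on `PATH` (it does not verify that a model file is available), and the order between OpenAI and FAL is assumed because the README does not state it.

```shell
# Illustrative preflight check, not part of summarize itself.
# Prints the first transcription backend that appears to be usable.
pick_transcriber() {
  # Local binary name; whisper-cli is the documented default for
  # SUMMARIZE_WHISPER_CPP_BINARY.
  local bin="${SUMMARIZE_WHISPER_CPP_BINARY:-whisper-cli}"
  if command -v "$bin" >/dev/null 2>&1; then
    echo "whisper.cpp"
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    # Assumption: OpenAI is tried before FAL.
    echo "openai"
  elif [ -n "${FAL_KEY:-}" ]; then
    echo "fal"
  else
    echo "none"
    return 1
  fi
}
```

Calling `pick_transcriber` before a long media run shows whether any backend is configured; a result of `none` (exit status 1) means neither the local binary nor a cloud key was found.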
|
|
358
|
+
|
|
359
|
+
### Local ONNX transcription (Parakeet/Canary)
|
|
360
|
+
|
|
361
|
+
Summarize can use NVIDIA Parakeet/Canary ONNX models via a local CLI you provide. Auto selection (default) prefers ONNX when configured.
|
|
362
|
+
|
|
363
|
+
- Setup helper: `summarize transcriber setup`
|
|
364
|
+
- Install `sherpa-onnx` from upstream binaries/build (Homebrew may not have a formula)
|
|
365
|
+
- Auto selection: set `SUMMARIZE_ONNX_PARAKEET_CMD` or `SUMMARIZE_ONNX_CANARY_CMD` (no flag needed)
|
|
366
|
+
- Force a model: `--transcriber parakeet|canary|whisper|auto`
|
|
367
|
+
- Docs: `docs/nvidia-onnx-transcription.md`
|
|
368
|
+
|
|
369
|
+
### Verified podcast services (2025-12-25)
|
|
370
|
+
|
|
371
|
+
Run: `summarize <url>`
|
|
372
|
+
|
|
373
|
+
- Apple Podcasts
|
|
374
|
+
- Spotify
|
|
375
|
+
- Amazon Music / Audible podcast pages
|
|
376
|
+
- Podbean
|
|
377
|
+
- Podchaser
|
|
378
|
+
- RSS feeds (Podcasting 2.0 transcripts when available)
|
|
379
|
+
- Embedded YouTube podcast pages (e.g. JREPodcast)
|
|
380
|
+
|
|
381
|
+
Transcription: prefers local `whisper.cpp` when installed; otherwise uses OpenAI Whisper or FAL when keys are set.
|
|
382
|
+
|
|
383
|
+
### Translation paths
|
|
384
|
+
|
|
385
|
+
`--language/--lang` controls the output language of the summary (and other LLM-generated text). Default is `auto`.
|
|
386
|
+
|
|
387
|
+
When the input is audio/video, the CLI needs a transcript first. The transcript comes from one of these paths:
|
|
388
|
+
|
|
389
|
+
1. Existing transcript (preferred)
|
|
390
|
+
- YouTube: uses `youtubei` / `captionTracks` when available.
|
|
391
|
+
- Podcasts: uses Podcasting 2.0 RSS `<podcast:transcript>` (JSON/VTT) when the feed publishes it.
|
|
392
|
+
2. Whisper transcription (fallback)
|
|
393
|
+
- YouTube: falls back to yt-dlp (audio download) + Whisper transcription when configured; Apify is a last resort.
|
|
394
|
+
- Prefers local `whisper.cpp` when installed + model available.
|
|
395
|
+
- Otherwise uses cloud Whisper (OpenAI `OPENAI_API_KEY`) or FAL (`FAL_KEY`).
|
|
317
396
|
|
|
318
|
-
|
|
397
|
+
For direct media URLs, use `--video-mode transcript` to force transcribe -> summarize:
|
|
319
398
|
|
|
320
|
-
|
|
399
|
+
```bash
|
|
400
|
+
summarize https://example.com/file.mp4 --video-mode transcript --lang en
|
|
401
|
+
```
|
|
402
|
+
|
|
403
|
+
### Configuration
|
|
321
404
|
|
|
322
405
|
Single config location:
|
|
323
406
|
|
|
@@ -327,7 +410,8 @@ Supported keys today:
|
|
|
327
410
|
|
|
328
411
|
```json
|
|
329
412
|
{
|
|
330
|
-
"model": { "id": "openai/gpt-5-mini" }
|
|
413
|
+
"model": { "id": "openai/gpt-5-mini" },
|
|
414
|
+
"ui": { "theme": "ember" }
|
|
331
415
|
}
|
|
332
416
|
```
|
|
333
417
|
|
|
@@ -341,14 +425,28 @@ Shorthand (equivalent):
|
|
|
341
425
|
|
|
342
426
|
Also supported:
|
|
343
427
|
|
|
344
|
-
- `model: { "mode": "auto" }` (automatic model selection + fallback; see
|
|
428
|
+
- `model: { "mode": "auto" }` (automatic model selection + fallback; see [docs/model-auto.md](docs/model-auto.md))
|
|
345
429
|
- `model.rules` (customize candidates / ordering)
|
|
346
430
|
- `models` (define presets selectable via `--model <preset>`)
|
|
431
|
+
- `cache.media` (media download cache: TTL 7 days, 2048 MB cap by default; `--no-media-cache` disables)
|
|
347
432
|
- `media.videoMode: "auto"|"transcript"|"understand"`
|
|
433
|
+
- `slides.enabled` / `slides.max` / `slides.ocr` / `slides.dir` (defaults for `--slides`)
|
|
434
|
+
- `ui.theme: "aurora"|"ember"|"moss"|"mono"`
|
|
348
435
|
- `openai.useChatCompletions: true` (force OpenAI-compatible chat completions)
|
|
349
436
|
|
|
350
|
-
Note: the config is parsed leniently (JSON5), but
|
|
351
|
-
|
|
437
|
+
Note: the config is parsed leniently (JSON5), but comments are not allowed. Unknown keys are ignored.
|
|
438
|
+
|
|
439
|
+
Media cache defaults:
|
|
440
|
+
|
|
441
|
+
```json
|
|
442
|
+
{
|
|
443
|
+
"cache": {
|
|
444
|
+
"media": { "enabled": true, "ttlDays": 7, "maxMb": 2048, "verify": "size" }
|
|
445
|
+
}
|
|
446
|
+
}
|
|
447
|
+
```
|
|
448
|
+
|
|
449
|
+
Note: `--no-cache` bypasses summary caching only (LLM output). Extract/transcript caches still apply. Use `--no-media-cache` to also bypass the media download cache.
|
|
352
450
|
|
|
353
451
|
Precedence:
|
|
354
452
|
|
|
@@ -357,7 +455,14 @@ Precedence:
|
|
|
357
455
|
3) `~/.summarize/config.json`
|
|
358
456
|
4) default (`auto`)
|
|
359
457
|
|
|
360
|
-
|
|
458
|
+
Theme precedence:
|
|
459
|
+
|
|
460
|
+
1) `--theme`
|
|
461
|
+
2) `SUMMARIZE_THEME`
|
|
462
|
+
3) `~/.summarize/config.json` (`ui.theme`)
|
|
463
|
+
4) default (`aurora`)
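The theme precedence listed above can be pictured as a first-non-empty lookup. The function below is a sketch for illustration only and does not reflect how `summarize` is implemented; `resolve_theme` and its two positional arguments (the `--theme` value and the `ui.theme` value from the config file) are hypothetical.

```shell
# Illustrative only: mirrors the documented precedence, not summarize's own code.
# $1 = value passed to --theme (may be empty), $2 = ui.theme from config (may be empty).
resolve_theme() {
  local flag_theme="${1:-}" config_theme="${2:-}"
  if [ -n "$flag_theme" ]; then
    echo "$flag_theme"                 # 1) --theme
  elif [ -n "${SUMMARIZE_THEME:-}" ]; then
    echo "$SUMMARIZE_THEME"            # 2) SUMMARIZE_THEME
  elif [ -n "$config_theme" ]; then
    echo "$config_theme"               # 3) ui.theme in config.json
  else
    echo "aurora"                      # 4) default
  fi
}
```

For example, `SUMMARIZE_THEME=moss resolve_theme "" ember` prints `moss`, because the environment variable outranks the config value when no flag is given.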
|
|
464
|
+
|
|
465
|
+
### Environment variables
|
|
361
466
|
|
|
362
467
|
Set the key matching your chosen `--model`:
|
|
363
468
|
|
|
@@ -365,43 +470,44 @@ Set the key matching your chosen `--model`:
|
|
|
365
470
|
- `ANTHROPIC_API_KEY` (for `anthropic/...`)
|
|
366
471
|
- `XAI_API_KEY` (for `xai/...`)
|
|
367
472
|
- `Z_AI_API_KEY` (for `zai/...`; supports `ZAI_API_KEY` alias)
|
|
368
|
-
- `GEMINI_API_KEY` (for `google/...`)
|
|
473
|
+
- `GEMINI_API_KEY` (for `google/...`)
|
|
369
474
|
- also accepts `GOOGLE_GENERATIVE_AI_API_KEY` and `GOOGLE_API_KEY` as aliases
|
|
370
475
|
|
|
371
476
|
OpenAI-compatible chat completions toggle:
|
|
372
477
|
|
|
373
478
|
- `OPENAI_USE_CHAT_COMPLETIONS=1` (or set `openai.useChatCompletions` in config)
|
|
374
479
|
|
|
480
|
+
UI theme:
|
|
481
|
+
|
|
482
|
+
- `SUMMARIZE_THEME=aurora|ember|moss|mono`
|
|
483
|
+
- `SUMMARIZE_TRUECOLOR=1` (force 24-bit ANSI)
|
|
484
|
+
- `SUMMARIZE_NO_TRUECOLOR=1` (disable 24-bit ANSI)
|
|
485
|
+
|
|
375
486
|
OpenRouter (OpenAI-compatible):
|
|
376
487
|
|
|
377
488
|
- Set `OPENROUTER_API_KEY=...`
|
|
378
|
-
- Prefer forcing OpenRouter per model id: `--model openrouter/<author>/<slug>`
|
|
379
|
-
- Built-in preset: `--model free` (uses a default set of OpenRouter `:free` models)
|
|
489
|
+
- Prefer forcing OpenRouter per model id: `--model openrouter/<author>/<slug>`
|
|
490
|
+
- Built-in preset: `--model free` (uses a default set of OpenRouter `:free` models)
|
|
380
491
|
|
|
381
492
|
### `summarize refresh-free`
|
|
382
493
|
|
|
383
494
|
Quick start: make free the default (keep `auto` available)
|
|
384
495
|
|
|
385
496
|
```bash
|
|
386
|
-
# writes ~/.summarize/config.json (models.free) and sets model="free"
|
|
387
497
|
summarize refresh-free --set-default
|
|
388
|
-
|
|
389
|
-
# now this defaults to free models
|
|
390
498
|
summarize "https://example.com"
|
|
391
|
-
|
|
392
|
-
# whenever you want best quality instead
|
|
393
499
|
summarize "https://example.com" --model auto
|
|
394
500
|
```
|
|
395
501
|
|
|
396
|
-
Regenerates the `free` preset (
|
|
502
|
+
Regenerates the `free` preset (`models.free` in `~/.summarize/config.json`) by:
|
|
397
503
|
|
|
398
504
|
- Fetching OpenRouter `/models`, filtering `:free`
|
|
399
|
-
- Skipping models that look very small (<27B by default) based on the model id/name
|
|
505
|
+
- Skipping models that look very small (<27B by default) based on the model id/name
|
|
400
506
|
- Testing which ones return non-empty text (concurrency 4, timeout 10s)
|
|
401
|
-
- Picking a mix of
|
|
402
|
-
- Refining timings
|
|
507
|
+
- Picking a mix of smart-ish (bigger `context_length` / output cap) and fast models
|
|
508
|
+
- Refining timings and writing the sorted list back
|
|
403
509
|
|
|
404
|
-
If `--model free` stops working
|
|
510
|
+
If `--model free` stops working, run:
|
|
405
511
|
|
|
406
512
|
```bash
|
|
407
513
|
summarize refresh-free
|
|
@@ -410,7 +516,7 @@ summarize refresh-free
|
|
|
410
516
|
Flags:
|
|
411
517
|
|
|
412
518
|
- `--runs 2` (default): extra timing runs per selected model (total runs = 1 + runs)
|
|
413
|
-
- `--smart 3` (default): how many
|
|
519
|
+
- `--smart 3` (default): how many smart-first picks (rest filled by fastest)
|
|
414
520
|
- `--min-params 27b` (default): ignore models with inferred size smaller than N billion parameters
|
|
415
521
|
- `--max-age-days 180` (default): ignore models older than N days (set 0 to disable)
|
|
416
522
|
- `--set-default`: also sets `"model": "free"` in `~/.summarize/config.json`
|
|
@@ -422,7 +528,7 @@ OPENROUTER_API_KEY=sk-or-... summarize "https://example.com" --model openrouter/
|
|
|
422
528
|
```
|
|
423
529
|
|
|
424
530
|
If your OpenRouter account enforces an allowed-provider list, make sure at least one provider
|
|
425
|
-
is allowed for the selected model.
|
|
531
|
+
is allowed for the selected model. When routing fails, `summarize` prints the exact providers to allow.
|
|
426
532
|
|
|
427
533
|
Legacy: `OPENAI_BASE_URL=https://openrouter.ai/api/v1` (and either `OPENAI_API_KEY` or `OPENROUTER_API_KEY`) also works.
|
|
428
534
|
|
|
@@ -438,14 +544,14 @@ Optional services:
|
|
|
438
544
|
- `FAL_KEY` (FAL AI API key for audio transcription via Whisper)
|
|
439
545
|
- `APIFY_API_TOKEN` (YouTube transcript fallback)
|
|
440
546
|
|
|
441
|
-
|
|
547
|
+
### Model limits
|
|
442
548
|
|
|
443
549
|
The CLI uses the LiteLLM model catalog for model limits (like max output tokens):
|
|
444
550
|
|
|
445
551
|
- Downloaded from: `https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json`
|
|
446
552
|
- Cached at: `~/.summarize/cache/`
|
|
447
553
|
|
|
448
|
-
|
|
554
|
+
### Library usage (optional)
|
|
449
555
|
|
|
450
556
|
Recommended (minimal deps):
|
|
451
557
|
|
|
@@ -457,9 +563,30 @@ Compatibility (pulls in CLI deps):
|
|
|
457
563
|
- `@steipete/summarize/content`
|
|
458
564
|
- `@steipete/summarize/prompts`
|
|
459
565
|
|
|
460
|
-
|
|
566
|
+
### Development
|
|
461
567
|
|
|
462
568
|
```bash
|
|
463
569
|
pnpm install
|
|
464
570
|
pnpm check
|
|
465
571
|
```
|
|
572
|
+
|
|
573
|
+
## More
|
|
574
|
+
|
|
575
|
+
- Docs index: [docs/README.md](docs/README.md)
|
|
576
|
+
- CLI providers and config: [docs/cli.md](docs/cli.md)
|
|
577
|
+
- Auto model rules: [docs/model-auto.md](docs/model-auto.md)
|
|
578
|
+
- Website extraction: [docs/website.md](docs/website.md)
|
|
579
|
+
- YouTube handling: [docs/youtube.md](docs/youtube.md)
|
|
580
|
+
- Media pipeline: [docs/media.md](docs/media.md)
|
|
581
|
+
- Config schema and precedence: [docs/config.md](docs/config.md)
|
|
582
|
+
|
|
583
|
+
## Troubleshooting
|
|
584
|
+
|
|
585
|
+
- "Receiving end does not exist": Chrome did not inject the content script yet.
|
|
586
|
+
- Extension details -> Site access -> On all sites (or allow this domain)
|
|
587
|
+
- Reload the tab once.
|
|
588
|
+
- "Failed to fetch" / daemon unreachable:
|
|
589
|
+
- `summarize daemon status`
|
|
590
|
+
- Logs: `~/.summarize/logs/daemon.err.log`
|
|
591
|
+
|
|
592
|
+
License: MIT
|