@steipete/summarize 0.8.2 → 0.10.0
- package/CHANGELOG.md +114 -1
- package/LICENSE +1 -1
- package/README.md +309 -182
- package/dist/cli.js +1 -1
- package/dist/esm/cache.js +72 -4
- package/dist/esm/cache.js.map +1 -1
- package/dist/esm/config.js +197 -1
- package/dist/esm/config.js.map +1 -1
- package/dist/esm/content/asset.js +75 -2
- package/dist/esm/content/asset.js.map +1 -1
- package/dist/esm/daemon/agent.js +547 -0
- package/dist/esm/daemon/agent.js.map +1 -0
- package/dist/esm/daemon/chat.js +97 -0
- package/dist/esm/daemon/chat.js.map +1 -0
- package/dist/esm/daemon/cli.js +105 -10
- package/dist/esm/daemon/cli.js.map +1 -1
- package/dist/esm/daemon/env-snapshot.js +3 -0
- package/dist/esm/daemon/env-snapshot.js.map +1 -1
- package/dist/esm/daemon/flow-context.js +53 -28
- package/dist/esm/daemon/flow-context.js.map +1 -1
- package/dist/esm/daemon/launchd.js +27 -0
- package/dist/esm/daemon/launchd.js.map +1 -1
- package/dist/esm/daemon/process-registry.js +206 -0
- package/dist/esm/daemon/process-registry.js.map +1 -0
- package/dist/esm/daemon/schtasks.js +64 -0
- package/dist/esm/daemon/schtasks.js.map +1 -1
- package/dist/esm/daemon/server.js +1034 -52
- package/dist/esm/daemon/server.js.map +1 -1
- package/dist/esm/daemon/summarize.js +66 -18
- package/dist/esm/daemon/summarize.js.map +1 -1
- package/dist/esm/daemon/systemd.js +61 -0
- package/dist/esm/daemon/systemd.js.map +1 -1
- package/dist/esm/flags.js +24 -0
- package/dist/esm/flags.js.map +1 -1
- package/dist/esm/llm/attachments.js +2 -0
- package/dist/esm/llm/attachments.js.map +1 -0
- package/dist/esm/llm/errors.js +6 -0
- package/dist/esm/llm/errors.js.map +1 -0
- package/dist/esm/llm/generate-text.js +206 -356
- package/dist/esm/llm/generate-text.js.map +1 -1
- package/dist/esm/llm/html-to-markdown.js +1 -2
- package/dist/esm/llm/html-to-markdown.js.map +1 -1
- package/dist/esm/llm/prompt.js.map +1 -1
- package/dist/esm/llm/providers/anthropic.js +126 -0
- package/dist/esm/llm/providers/anthropic.js.map +1 -0
- package/dist/esm/llm/providers/google.js +78 -0
- package/dist/esm/llm/providers/google.js.map +1 -0
- package/dist/esm/llm/providers/models.js +111 -0
- package/dist/esm/llm/providers/models.js.map +1 -0
- package/dist/esm/llm/providers/openai.js +150 -0
- package/dist/esm/llm/providers/openai.js.map +1 -0
- package/dist/esm/llm/providers/shared.js +48 -0
- package/dist/esm/llm/providers/shared.js.map +1 -0
- package/dist/esm/llm/providers/types.js +2 -0
- package/dist/esm/llm/providers/types.js.map +1 -0
- package/dist/esm/llm/transcript-to-markdown.js +1 -2
- package/dist/esm/llm/transcript-to-markdown.js.map +1 -1
- package/dist/esm/llm/types.js +2 -0
- package/dist/esm/llm/types.js.map +1 -0
- package/dist/esm/llm/usage.js +69 -0
- package/dist/esm/llm/usage.js.map +1 -0
- package/dist/esm/logging/daemon.js +124 -0
- package/dist/esm/logging/daemon.js.map +1 -0
- package/dist/esm/logging/ring-file.js +66 -0
- package/dist/esm/logging/ring-file.js.map +1 -0
- package/dist/esm/media-cache.js +251 -0
- package/dist/esm/media-cache.js.map +1 -0
- package/dist/esm/model-auto.js +103 -5
- package/dist/esm/model-auto.js.map +1 -1
- package/dist/esm/processes.js +2 -0
- package/dist/esm/processes.js.map +1 -0
- package/dist/esm/refresh-free.js +3 -3
- package/dist/esm/refresh-free.js.map +1 -1
- package/dist/esm/run/attachments.js +8 -4
- package/dist/esm/run/attachments.js.map +1 -1
- package/dist/esm/run/bird.js +118 -5
- package/dist/esm/run/bird.js.map +1 -1
- package/dist/esm/run/cache-state.js +3 -2
- package/dist/esm/run/cache-state.js.map +1 -1
- package/dist/esm/run/cli-preflight.js +19 -1
- package/dist/esm/run/cli-preflight.js.map +1 -1
- package/dist/esm/run/constants.js +0 -7
- package/dist/esm/run/constants.js.map +1 -1
- package/dist/esm/run/finish-line.js +58 -11
- package/dist/esm/run/finish-line.js.map +1 -1
- package/dist/esm/run/flows/asset/extract.js +70 -0
- package/dist/esm/run/flows/asset/extract.js.map +1 -0
- package/dist/esm/run/flows/asset/input.js +209 -25
- package/dist/esm/run/flows/asset/input.js.map +1 -1
- package/dist/esm/run/flows/asset/media-policy.js +3 -0
- package/dist/esm/run/flows/asset/media-policy.js.map +1 -0
- package/dist/esm/run/flows/asset/media.js +224 -0
- package/dist/esm/run/flows/asset/media.js.map +1 -0
- package/dist/esm/run/flows/asset/output.js +98 -0
- package/dist/esm/run/flows/asset/output.js.map +1 -0
- package/dist/esm/run/flows/asset/preprocess.js +92 -16
- package/dist/esm/run/flows/asset/preprocess.js.map +1 -1
- package/dist/esm/run/flows/asset/summary.js +165 -11
- package/dist/esm/run/flows/asset/summary.js.map +1 -1
- package/dist/esm/run/flows/url/extract.js +6 -6
- package/dist/esm/run/flows/url/extract.js.map +1 -1
- package/dist/esm/run/flows/url/flow.js +338 -36
- package/dist/esm/run/flows/url/flow.js.map +1 -1
- package/dist/esm/run/flows/url/markdown.js +6 -1
- package/dist/esm/run/flows/url/markdown.js.map +1 -1
- package/dist/esm/run/flows/url/slides-output.js +485 -0
- package/dist/esm/run/flows/url/slides-output.js.map +1 -0
- package/dist/esm/run/flows/url/slides-text.js +628 -0
- package/dist/esm/run/flows/url/slides-text.js.map +1 -0
- package/dist/esm/run/flows/url/summary.js +358 -83
- package/dist/esm/run/flows/url/summary.js.map +1 -1
- package/dist/esm/run/help.js +94 -5
- package/dist/esm/run/help.js.map +1 -1
- package/dist/esm/run/logging.js +12 -4
- package/dist/esm/run/logging.js.map +1 -1
- package/dist/esm/run/media-cache-state.js +33 -0
- package/dist/esm/run/media-cache-state.js.map +1 -0
- package/dist/esm/run/progress.js +19 -1
- package/dist/esm/run/progress.js.map +1 -1
- package/dist/esm/run/run-context.js +19 -0
- package/dist/esm/run/run-context.js.map +1 -0
- package/dist/esm/run/run-output.js +1 -1
- package/dist/esm/run/run-output.js.map +1 -1
- package/dist/esm/run/run-settings.js +182 -0
- package/dist/esm/run/run-settings.js.map +1 -0
- package/dist/esm/run/runner.js +225 -32
- package/dist/esm/run/runner.js.map +1 -1
- package/dist/esm/run/slides-cli.js +225 -0
- package/dist/esm/run/slides-cli.js.map +1 -0
- package/dist/esm/run/slides-render.js +163 -0
- package/dist/esm/run/slides-render.js.map +1 -0
- package/dist/esm/run/stream-output.js +63 -0
- package/dist/esm/run/stream-output.js.map +1 -0
- package/dist/esm/run/streaming.js +16 -43
- package/dist/esm/run/streaming.js.map +1 -1
- package/dist/esm/run/summary-engine.js +59 -41
- package/dist/esm/run/summary-engine.js.map +1 -1
- package/dist/esm/run/transcriber-cli.js +148 -0
- package/dist/esm/run/transcriber-cli.js.map +1 -0
- package/dist/esm/shared/sse-events.js +26 -0
- package/dist/esm/shared/sse-events.js.map +1 -0
- package/dist/esm/shared/streaming-merge.js +44 -0
- package/dist/esm/shared/streaming-merge.js.map +1 -0
- package/dist/esm/slides/extract.js +1942 -0
- package/dist/esm/slides/extract.js.map +1 -0
- package/dist/esm/slides/index.js +4 -0
- package/dist/esm/slides/index.js.map +1 -0
- package/dist/esm/slides/settings.js +73 -0
- package/dist/esm/slides/settings.js.map +1 -0
- package/dist/esm/slides/store.js +111 -0
- package/dist/esm/slides/store.js.map +1 -0
- package/dist/esm/slides/types.js +2 -0
- package/dist/esm/slides/types.js.map +1 -0
- package/dist/esm/tty/osc-progress.js +21 -1
- package/dist/esm/tty/osc-progress.js.map +1 -1
- package/dist/esm/tty/progress/fetch-html.js +8 -4
- package/dist/esm/tty/progress/fetch-html.js.map +1 -1
- package/dist/esm/tty/progress/transcript.js +82 -31
- package/dist/esm/tty/progress/transcript.js.map +1 -1
- package/dist/esm/tty/spinner.js +2 -2
- package/dist/esm/tty/spinner.js.map +1 -1
- package/dist/esm/tty/theme.js +189 -0
- package/dist/esm/tty/theme.js.map +1 -0
- package/dist/esm/tty/website-progress.js +17 -13
- package/dist/esm/tty/website-progress.js.map +1 -1
- package/dist/esm/version.js +1 -1
- package/dist/esm/version.js.map +1 -1
- package/dist/types/cache.d.ts +14 -2
- package/dist/types/config.d.ts +34 -0
- package/dist/types/daemon/agent.d.ts +25 -0
- package/dist/types/daemon/chat.d.ts +27 -0
- package/dist/types/daemon/env-snapshot.d.ts +1 -1
- package/dist/types/daemon/flow-context.d.ts +24 -3
- package/dist/types/daemon/launchd.d.ts +4 -0
- package/dist/types/daemon/process-registry.d.ts +73 -0
- package/dist/types/daemon/schtasks.d.ts +4 -0
- package/dist/types/daemon/server.d.ts +7 -1
- package/dist/types/daemon/summarize.d.ts +47 -5
- package/dist/types/daemon/systemd.d.ts +4 -0
- package/dist/types/flags.d.ts +1 -0
- package/dist/types/llm/attachments.d.ts +6 -0
- package/dist/types/llm/errors.d.ts +1 -0
- package/dist/types/llm/generate-text.d.ts +29 -13
- package/dist/types/llm/prompt.d.ts +7 -2
- package/dist/types/llm/providers/anthropic.d.ts +30 -0
- package/dist/types/llm/providers/google.d.ts +29 -0
- package/dist/types/llm/providers/models.d.ts +27 -0
- package/dist/types/llm/providers/openai.d.ts +38 -0
- package/dist/types/llm/providers/shared.d.ts +14 -0
- package/dist/types/llm/providers/types.d.ts +6 -0
- package/dist/types/llm/types.d.ts +5 -0
- package/dist/types/llm/usage.d.ts +5 -0
- package/dist/types/logging/daemon.d.ts +26 -0
- package/dist/types/logging/ring-file.d.ts +10 -0
- package/dist/types/media-cache.d.ts +22 -0
- package/dist/types/model-auto.d.ts +1 -0
- package/dist/types/processes.d.ts +1 -0
- package/dist/types/run/attachments.d.ts +9 -6
- package/dist/types/run/bird.d.ts +7 -0
- package/dist/types/run/constants.d.ts +0 -2
- package/dist/types/run/finish-line.d.ts +59 -1
- package/dist/types/run/flows/asset/extract.d.ts +18 -0
- package/dist/types/run/flows/asset/input.d.ts +12 -2
- package/dist/types/run/flows/asset/media-policy.d.ts +2 -0
- package/dist/types/run/flows/asset/media.d.ts +21 -0
- package/dist/types/run/flows/asset/output.d.ts +42 -0
- package/dist/types/run/flows/asset/preprocess.d.ts +22 -2
- package/dist/types/run/flows/asset/summary.d.ts +6 -0
- package/dist/types/run/flows/url/extract.d.ts +2 -1
- package/dist/types/run/flows/url/slides-output.d.ts +66 -0
- package/dist/types/run/flows/url/slides-text.d.ts +87 -0
- package/dist/types/run/flows/url/summary.d.ts +11 -3
- package/dist/types/run/flows/url/types.d.ts +29 -2
- package/dist/types/run/help.d.ts +3 -0
- package/dist/types/run/logging.d.ts +3 -2
- package/dist/types/run/media-cache-state.d.ts +7 -0
- package/dist/types/run/progress.d.ts +2 -1
- package/dist/types/run/run-context.d.ts +44 -0
- package/dist/types/run/run-settings.d.ts +62 -0
- package/dist/types/run/slides-cli.d.ts +9 -0
- package/dist/types/run/slides-render.d.ts +30 -0
- package/dist/types/run/stream-output.d.ts +12 -0
- package/dist/types/run/streaming.d.ts +10 -4
- package/dist/types/run/summary-engine.d.ts +15 -3
- package/dist/types/run/summary-llm.d.ts +2 -2
- package/dist/types/run/transcriber-cli.d.ts +8 -0
- package/dist/types/shared/sse-events.d.ts +64 -0
- package/dist/types/shared/streaming-merge.d.ts +4 -0
- package/dist/types/slides/extract.d.ts +42 -0
- package/dist/types/slides/index.d.ts +5 -0
- package/dist/types/slides/settings.d.ts +20 -0
- package/dist/types/slides/store.d.ts +15 -0
- package/dist/types/slides/types.d.ts +40 -0
- package/dist/types/tty/osc-progress.d.ts +2 -2
- package/dist/types/tty/progress/fetch-html.d.ts +3 -1
- package/dist/types/tty/progress/transcript.d.ts +3 -1
- package/dist/types/tty/spinner.d.ts +3 -1
- package/dist/types/tty/theme.d.ts +44 -0
- package/dist/types/tty/website-progress.d.ts +3 -1
- package/dist/types/version.d.ts +1 -1
- package/docs/README.md +13 -8
- package/docs/_config.yml +26 -0
- package/docs/_layouts/default.html +60 -0
- package/docs/agent.md +333 -0
- package/docs/assets/site.css +748 -0
- package/docs/assets/site.js +72 -0
- package/docs/assets/summarize-cli.png +0 -0
- package/docs/assets/summarize-extension.png +0 -0
- package/docs/assets/youtube-slides.png +0 -0
- package/docs/cache.md +29 -3
- package/docs/chrome-extension.md +85 -7
- package/docs/config.md +74 -2
- package/docs/extract-only.md +10 -2
- package/docs/index.html +205 -0
- package/docs/index.md +25 -0
- package/docs/language.md +1 -1
- package/docs/llm.md +17 -1
- package/docs/manual-tests.md +2 -0
- package/docs/media.md +37 -0
- package/docs/model-auto.md +2 -1
- package/docs/nvidia-onnx-transcription.md +55 -0
- package/docs/openai.md +5 -0
- package/docs/releasing.md +26 -0
- package/docs/site/assets/site.css +399 -228
- package/docs/site/assets/summarize-cli.png +0 -0
- package/docs/site/assets/summarize-extension.png +0 -0
- package/docs/site/docs/chrome-extension.html +89 -0
- package/docs/site/docs/config.html +1 -0
- package/docs/site/docs/extract-only.html +1 -0
- package/docs/site/docs/firecrawl.html +1 -0
- package/docs/site/docs/index.html +5 -0
- package/docs/site/docs/llm.html +1 -0
- package/docs/site/docs/openai.html +1 -0
- package/docs/site/docs/website.html +1 -0
- package/docs/site/docs/youtube.html +1 -0
- package/docs/site/index.html +148 -84
- package/docs/slides.md +74 -0
- package/docs/timestamps.md +103 -0
- package/docs/website.md +13 -0
- package/docs/youtube.md +16 -0
- package/package.json +22 -18
- package/dist/esm/daemon/request-settings.js +0 -91
- package/dist/esm/daemon/request-settings.js.map +0 -1
- package/dist/types/daemon/request-settings.d.ts +0 -27
package/docs/language.md
CHANGED
@@ -6,7 +6,7 @@ read_when:
 
 # Output language
 
-By default, `summarize` writes the summary in the **same language as the source content** (`--language auto`).
+By default, `summarize` writes the summary in the **same language as the source content** (`--language auto`). If language detection is uncertain, it falls back to English.
 
 This affects the language of the generated summary text (not extraction/transcription).
 
package/docs/llm.md
CHANGED
@@ -57,7 +57,7 @@ installed, auto mode can use local CLI models when `cli.enabled` is set (see `do
 - Prompts are wrapped in `<instructions>`, `<context>`, `<content>` tags.
 - When `--length` is numeric, we add `Output is X characters.` When `--language` is explicitly set, we add `Output should be <language>.`
 - `--no-cache`
-- Bypass cache reads and writes (
+- Bypass summary cache reads and writes only (LLM output). Extract/transcript caches still apply.
 - `--cache-stats`
 - Print cache stats and exit.
 - `--clear-cache`
@@ -69,6 +69,8 @@ installed, auto mode can use local CLI models when `cli.enabled` is set (see `do
 - Minimum numeric value: 50 chars.
 - Default: `long`.
 - Output format is Markdown; use short paragraphs and only add bullets when they improve scanability.
+- `--force-summary`
+- Always run the LLM even when extracted content is shorter than the requested length.
 - `--max-output-tokens <count>`
 - Hard cap for output tokens (optional).
 - If omitted, no max token parameter is sent (provider default).
@@ -78,6 +80,14 @@ installed, auto mode can use local CLI models when `cli.enabled` is set (see `do
 - LLM retry attempts on timeout (default: 1).
 - `--json` (includes prompt + summary in one JSON object)
 
+## Prompt rules
+
+- Video and podcast summaries omit sponsor/ads/promotional segments; do not include them in the summary.
+- Do not mention or acknowledge sponsors/ads, and do not say you skipped or ignored anything.
+- If a standout line is present, include 1-2 short exact excerpts formatted as Markdown italics with single asterisks. Do not use quotation marks of any kind (straight or curly). If a title or excerpt would normally use quotes, remove them and optionally italicize the text instead. Apostrophes in contractions are OK. Never include ad/sponsor/boilerplate excerpts and do not mention them. Avoid sponsor/ad/promo language, brand names like Squarespace, or CTA phrases like discount code.
+- Final check: remove sponsor/ad references or mentions of skipping/ignoring content. Remove any quotation marks. Ensure standout excerpts are italicized; otherwise omit them.
+- Hard rules: never mention sponsor/ads; never output quotation marks of any kind (straight or curly), even for titles.
+
 ## Z.AI
 
 Use `--model zai/<model>` (e.g. `zai/glm-4.7`). Defaults to Z.AI’s base URL and uses chat completions.
@@ -86,3 +96,9 @@ Use `--model zai/<model>` (e.g. `zai/glm-4.7`). Defaults to Z.AI’s base URL an
 
 - Text prompts are checked against the model’s max input tokens (LiteLLM catalog) using a GPT tokenizer.
 - Text files over 10 MB are rejected before tokenization.
+
+## PDF attachments
+
+- For PDF inputs, `--preprocess auto` will send the PDF directly to Anthropic/OpenAI/Gemini when a fixed model supports documents; otherwise we fall back to markitdown.
+- `--preprocess always` forces markitdown (no direct attachments).
+- Streaming is disabled for document attachments.
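Reviewer note: the PDF attachment rules added to `docs/llm.md` read like a small decision table. The sketch below restates them as a function, purely as an illustration. The names `route_pdf`, `PdfRoute`, and `supports_documents` are hypothetical and do not come from the package. It also assumes that streaming stays available whenever the PDF is converted to text by markitdown, which the doc does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdfRoute:
    """Outcome of the routing decision (hypothetical type, illustration only)."""
    via: str          # "attachment" or "markitdown"
    streaming: bool   # whether a streaming LLM call may be used


def route_pdf(preprocess: str, supports_documents: bool) -> PdfRoute:
    """Restate the documented PDF rules for --preprocess auto/always.

    supports_documents is an assumed flag meaning "the fixed model accepts
    document attachments"; other --preprocess values are out of scope here.
    """
    if preprocess == "always":
        # `--preprocess always` forces markitdown (no direct attachments).
        return PdfRoute(via="markitdown", streaming=True)
    if preprocess == "auto":
        if supports_documents:
            # Direct attachment; streaming is disabled for document attachments.
            return PdfRoute(via="attachment", streaming=False)
        # No document support: fall back to markitdown.
        return PdfRoute(via="markitdown", streaming=True)
    raise ValueError(f"mode not covered by this sketch: {preprocess!r}")
```

For example, `route_pdf("auto", True)` yields an attachment route with streaming off, while `route_pdf("always", True)` still goes through markitdown.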
package/docs/manual-tests.md
CHANGED
@@ -45,6 +45,8 @@ Tip: use `--verbose` to see model attempts + the chosen model.
 
 - YouTube:
 - `summarize https://www.youtube.com/watch?v=dQw4w9WgXcQ --max-output-tokens 200`
+- YouTube summary w/ timestamps (expect `[mm:ss]` in output):
+- `summarize --timestamps --youtube web --length short https://www.youtube.com/watch?v=I845O57ZSy4`
 - Local video understanding (requires Gemini video-capable model; otherwise expect an error or transcript-only behavior depending on input):
 - `summarize ./path/to/video.mp4 --max-output-tokens 200`
 
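Reviewer note: the new manual test expects `[mm:ss]` markers in the output. A check along these lines could make that expectation mechanical. This is a minimal sketch, assuming the markers are literal square brackets around minutes and seconds; the helper name `has_timestamps` is hypothetical and the pattern does not cover other shapes such as `[h:mm:ss]`.

```python
import re

# Matches markers such as [01:23] or [7:05]; an assumed reading of "[mm:ss]".
TIMESTAMP = re.compile(r"\[\d{1,2}:\d{2}\]")


def has_timestamps(summary: str) -> bool:
    """Return True when the summary text contains at least one [mm:ss] marker."""
    return TIMESTAMP.search(summary) is not None
```

The function takes the captured stdout of the `summarize --timestamps ...` run as a plain string, so it can be used without invoking the CLI from the check itself.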
package/docs/media.md
ADDED
@@ -0,0 +1,37 @@
+---
+summary: "Embedded media detection + transcript-first pipeline."
+read_when:
+- "When changing media detection, embedded captions, or video-mode behavior."
+---
+
+# Media detection + transcript-first
+
+## Detection (HTML)
+
+- Embedded video/audio: `<video>` / `<audio>` tags, `og:video` / `og:audio`, iframe embeds (YouTube/Vimeo/Twitch/Wistia, Spotify/SoundCloud/Podcasts).
+- Captions: `<track kind="captions|subtitles" src=...>`.
+
+## Transcript resolution order
+
+1) Embedded captions (VTT/JSON) when available.
+2) yt-dlp download + Whisper transcription (prefers local whisper.cpp; OpenAI/FAL fallback).
+
+## CLI behavior
+
+- `--video-mode transcript` prefers transcript-first media handling even when a page has text.
+- Direct media URLs (mp4/webm/m4a/etc) skip HTML and transcribe.
+- Local audio/video files are routed through the same transcript-first pipeline.
+- YouTube still uses the YouTube transcript pipeline (captions → yt-dlp fallback).
+- X/Twitter status URLs with detected video auto-switch to transcript-first (yt-dlp), even in auto mode.
+- Local media files are capped at 2 GB; remote media URLs are best-effort via yt-dlp (no explicit size limit).
+
+## Chrome extension behavior
+
+- When media is detected on a page, the Summarize button gains a dropdown caret (Page/Video or Page/Audio).
+- Selecting Video/Audio forces URL mode + transcript-first extraction for that run only.
+- Selection is not stored.
+
+## Known limits
+
+- No auth/cookie handling for embedded media; login-gated assets will fail.
+- Captions are best-effort; if captions are missing or unreadable, we fall back to transcription.
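Reviewer note: the detection signals and the transcript resolution order in the new `docs/media.md` can be illustrated with a short standard-library sketch. It is not the package's implementation. The class `MediaDetector` and the functions `detect_media` and `transcript_plan` are hypothetical names, the step labels are made up for the example, and iframe embed detection is left out to keep the sketch short.

```python
from html.parser import HTMLParser


class MediaDetector(HTMLParser):
    """Collect a subset of the signals listed under "Detection (HTML)"."""

    def __init__(self) -> None:
        super().__init__()
        self.has_video = False
        self.has_audio = False
        self.caption_tracks = []  # src values of caption/subtitle tracks

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "video":
            self.has_video = True
        elif tag == "audio":
            self.has_audio = True
        elif tag == "meta":
            # og:video / og:audio meta properties
            prop = (a.get("property") or "").lower()
            if prop.startswith("og:video"):
                self.has_video = True
            elif prop.startswith("og:audio"):
                self.has_audio = True
        elif tag == "track":
            kind = (a.get("kind") or "").lower()
            if kind in ("captions", "subtitles") and a.get("src"):
                self.caption_tracks.append(a["src"])


def detect_media(html: str) -> MediaDetector:
    """Parse an HTML string and return the populated detector."""
    detector = MediaDetector()
    detector.feed(html)
    detector.close()
    return detector


def transcript_plan(detector: MediaDetector) -> list:
    """Order the transcript sources: captions first, then transcription."""
    plan = []
    if detector.caption_tracks:
        plan.append("embedded-captions")
    # Transcription stays in the plan as the fallback when captions are unreadable.
    plan.append("yt-dlp+whisper")
    return plan
```

Keeping the transcription step in the plan even when captions exist mirrors the "Known limits" note that captions are best-effort.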
package/docs/model-auto.md
CHANGED
@@ -35,7 +35,8 @@ Behavior:
 
 - Uses the order you provide in `model.rules[].candidates[]` (or `bands[].candidates[]`).
 - Filters out candidates that can’t fit the prompt (max input tokens, LiteLLM catalog).
-- For a native candidate, auto mode may add an OpenRouter fallback attempt right after it (when `OPENROUTER_API_KEY` is set
+- For a native candidate, auto mode may add an OpenRouter fallback attempt right after it (when `OPENROUTER_API_KEY` is set, video understanding isn’t required, and a matching OpenRouter model id is found). If no unique OpenRouter id matches, the fallback is skipped.
+- To force native-only attempts, unset `OPENROUTER_API_KEY` (or pass an explicit native model id like `xai/...` and keep the key unset).
 
 Notes:
 
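Reviewer note: the revised fallback rule in `docs/model-auto.md` has three conditions plus a uniqueness requirement. The sketch below shows one way to read it. It is an assumption-heavy illustration: `expand_attempts` is a hypothetical name, the model ids are placeholders, and the matching rule (compare the part after the provider prefix) is a guess, since the doc does not say how a "matching OpenRouter model id" is found.

```python
def expand_attempts(candidates, openrouter_ids, *, has_openrouter_key, needs_video):
    """Return the attempt order, inserting OpenRouter fallbacks where allowed.

    candidates: ordered native model ids such as "provider/model".
    openrouter_ids: ids from an OpenRouter catalog such as "vendor/model".
    """
    attempts = []
    for candidate in candidates:
        attempts.append(candidate)
        if candidate.startswith("openrouter/"):
            continue  # already an OpenRouter attempt, nothing to add
        if not has_openrouter_key or needs_video:
            continue  # documented conditions for skipping the fallback
        # ASSUMED matching rule: same model name after the provider prefix.
        name = candidate.split("/", 1)[-1]
        matches = [m for m in openrouter_ids if m.split("/", 1)[-1] == name]
        if len(matches) == 1:
            # Only a unique match yields a fallback right after the native try.
            attempts.append("openrouter/" + matches[0])
    return attempts
```

With an empty key (`has_openrouter_key=False`) the function returns the candidates unchanged, which corresponds to the native-only behavior the doc describes for an unset `OPENROUTER_API_KEY`.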
package/docs/nvidia-onnx-transcription.md
ADDED
@@ -0,0 +1,55 @@
+---
+summary: "Local Parakeet/Canary ONNX transcription via external CLI."
+read_when:
+- "When configuring or changing local ONNX transcription (parakeet/canary)."
+---
+
+# NVIDIA Parakeet/Canary ONNX transcription
+
+Summarize can now run local transcription through NVIDIA's Parakeet-TDT 0.6B-v3 or Canary 1B-v2 ONNX exports by shelling out to a user-provided CLI. Auto selection prefers ONNX when configured; you can still force Whisper or a specific ONNX model.
+
+## How to enable
+
+1) Install a CLI capable of running the ONNX models (e.g. `sherpa-onnx` or a custom wrapper). Homebrew may not have a formula; use upstream binaries or build from source if needed. The CLI must emit the transcribed text on stdout and accept a single WAV input path. **Summarize now downloads the Hugging Face model files automatically on first use** into the cache (see below), so your command template can reference the provided paths.
+2) Set one (or both) command templates:
+
+- Recommended (no shell): provide a JSON array (command + args):
+- `SUMMARIZE_ONNX_PARAKEET_CMD='["sherpa-onnx", "...", "--tokens", "{vocab}", "--offline-ctc-model", "{model}", "--input-wav", "{input}"]'`
+- `SUMMARIZE_ONNX_CANARY_CMD='["my-canary-wrapper", "{model_dir}", "{input}"]'`
+- Shell string (advanced): `SUMMARIZE_ONNX_PARAKEET_CMD="sherpa-onnx ... --tokens {vocab} --offline-ctc-model {model} --input-wav {input}"`
+
+Notes:
+
+- If you use the shell string form, **do not quote placeholders** (Summarize shell-escapes substituted paths so spaces work and injection risk is reduced).
+
+Placeholders:
+
+- `{input}` — audio path (added to the end if not present)
+- `{model}` — downloaded `model.onnx` path
+- `{vocab}` — downloaded `vocab.txt` path
+- `{model_dir}` — parent directory containing the downloaded files
+
+3) Pick the ONNX model via CLI or env:
+
+- Auto (default): leave `SUMMARIZE_TRANSCRIBER` unset or set `SUMMARIZE_TRANSCRIBER=auto`
+- CLI: `--transcriber parakeet` or `--transcriber canary`
+- Env: `SUMMARIZE_TRANSCRIBER=parakeet` (or `canary`)
+
+For the Chrome extension, you can pick a permanent default under **Settings → Model → Advanced Overrides → Transcriber**. The selection is sent with every request. Make sure the daemon environment still has your ONNX CLI commands configured (env vars above) so the override can take effect. Alternatively, export the env vars before running `summarize daemon install --token <TOKEN>` so the daemon inherits your ONNX command templates and default transcriber.
+
+### Cache + download details
+
+- Artifacts are stored under `${SUMMARIZE_ONNX_CACHE_DIR || $XDG_CACHE_HOME || ~/.cache}/summarize/onnx/<model>/`.
+- Set `SUMMARIZE_ONNX_MODEL_BASE_URL` to point at a mirror (defaults to the Hugging Face repo for the chosen model).
+- The first run downloads `model.onnx` and `vocab.txt`; subsequent runs reuse cached files.
+
+## Behavior
+
+- Input audio is transcoded to 16kHz mono WAV via `ffmpeg` when available; otherwise the original file is passed to the CLI.
+- Onnx errors (missing command, non-zero exit, empty output) fall back to the existing Whisper flow with a note recorded in the transcript metadata.
+- Progress UI shows "ONNX (Parakeet/Canary)" while the external transcriber runs.
+
+## Notes
+
+- The ONNX inference binary itself is **not** bundled; users must install or provide it separately.
+- This flow remains CPU-only and compatible with existing transcript providers.
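Reviewer note: the command-template and cache-path rules in the new ONNX doc are easier to follow with a concrete rendering. The sketch below is an illustration under stated assumptions, not the package's code. The function names `build_command` and `onnx_cache_dir` are hypothetical, POSIX quoting via `shlex.quote` stands in for whatever escaping the package applies, and the JSON-array form is detected simply by a leading `[`.

```python
import json
import os
import re
import shlex
from pathlib import Path

# The four placeholders named in the doc.
_PLACEHOLDER = re.compile(r"\{(input|model|vocab|model_dir)\}")


def _fill(text, values, quote):
    """Substitute placeholders in one pass; shell-quote values if requested."""
    def repl(match):
        value = values[match.group(1)]
        return shlex.quote(value) if quote else value
    return _PLACEHOLDER.sub(repl, text)


def build_command(template, values):
    """Render a command template.

    JSON array form returns an argv list (no shell, so no quoting).
    Shell string form returns one string with quoted substitutions.
    In both forms the input path is appended when {input} is absent.
    """
    stripped = template.strip()
    if stripped.startswith("["):
        parts = json.loads(stripped)
        argv = [_fill(part, values, quote=False) for part in parts]
        if not any("{input}" in part for part in parts):
            argv.append(values["input"])
        return argv
    command = _fill(stripped, values, quote=True)
    if "{input}" not in stripped:
        command += " " + shlex.quote(values["input"])
    return command


def onnx_cache_dir(model, env=os.environ):
    """Resolve the cache directory following the documented precedence."""
    base = (
        env.get("SUMMARIZE_ONNX_CACHE_DIR")
        or env.get("XDG_CACHE_HOME")
        or str(Path.home() / ".cache")
    )
    return Path(base) / "summarize" / "onnx" / model
```

The JSON-array form avoids quoting entirely because each element becomes one argv entry, which is likely why the doc recommends it over the shell string form.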
package/docs/openai.md
CHANGED
@@ -23,3 +23,8 @@ For the full model/provider matrix, see `docs/llm.md`.
 - `--max-output-tokens <count>`
 - Hard cap for output tokens (optional).
 - `--json` (includes prompt + summary in one JSON object)
+
+## PDF inputs
+
+- When a PDF is provided and `--preprocess auto` is used, summarize sends the PDF as a file input via the OpenAI Responses API.
+- Document streaming is disabled for file inputs; non-streaming calls are used instead.
package/docs/releasing.md
ADDED
@@ -0,0 +1,26 @@
+---
+summary: "Release checklist + Homebrew tap update."
+---
+
+# Releasing
+
+## Goals
+- Ship npm packages (core first, then CLI).
+- Tag + GitHub release.
+- Update Homebrew tap so `brew install steipete/tap/summarize` matches latest tag.
+
+## Checklist
+1. `scripts/release.sh all` (gates → build → verify → publish → smoke → tag → tap).
+2. Create GitHub release for the new tag (match version, attach notes/assets as needed).
+3. If you didn’t run `tap` in the script, update the Homebrew tap formula for `summarize`:
+- Bump version to the new tag.
+- Update tarball URL + SHA256 for the new release.
+4. Verify Homebrew install reflects the new version:
+- `brew install steipete/tap/summarize`
+- `summarize --version` matches tag.
+- Run a feature added in the release (e.g. `summarize daemon install` for v0.8.2).
+5. If anything fails, fix and re-cut the release (no partials).
+
+## Common failure
+- NPM/GitHub release updated, tap not updated → users stuck on old version.
+Fix: always do step 3–4 before announcing.