recmp3-cli 1.0.0

@@ -0,0 +1,25 @@
1
+ # Commercial License
2
+
3
+ `recmp3-cli` is dual-licensed:
4
+
5
+ 1. **Open Source (AGPL-3.0)** — Free for personal use, open source projects, and non-commercial use. See [LICENSE](LICENSE).
6
+
7
+ 2. **Commercial License** — Required if you:
8
+ - Integrate `recmp3-cli` into a proprietary product or service without open-sourcing your modifications
9
+ - Distribute it as part of a closed-source product
10
+ - Use it in a commercial SaaS, enterprise product, or internal tooling at a for-profit entity without complying with AGPL-3.0
11
+
12
+ ## How to obtain a commercial license
13
+
14
+ Email: **eduardoa.borjas@gmail.com**
15
+
16
+ Include:
17
+ - Company name and size
18
+ - Intended use case
19
+ - Expected volume (users / deployments)
20
+
21
+ Response within 48 business hours.
22
+
23
+ ---
24
+
25
+ The AGPL-3.0 license means: if you run a modified version of this software as a service (SaaS), you must release your modifications under the same license. If that model doesn't work for your organization, a commercial license removes this requirement.
package/README.md ADDED
@@ -0,0 +1,255 @@
1
+ # recmp3-cli
2
+
3
+ [![CI](https://github.com/aedneth/recmp3-cli/actions/workflows/ci.yml/badge.svg)](https://github.com/aedneth/recmp3-cli/actions/workflows/ci.yml)
4
+ [![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
5
+ [![version](https://img.shields.io/badge/version-1.0.0-blue)](https://github.com/aedneth/recmp3-cli/releases)
6
+
7
+ Record audio from any terminal, transcribe with Groq Whisper, get developer-ready output.
8
+ A first-class tool for **both humans and terminal AI agents** — every interactive flow has a
9
+ fully non-interactive, JSON-emitting equivalent, plus a built-in MCP server.
10
+
11
+ ```bash
12
+ recmp3 record --name "my standup"
13
+ recmp3 transcribe standup.wav --template claude-code | pbcopy
14
+ ```
15
+
16
+ ## What it does
17
+
18
+ - **Records** audio with pause/resume using an Ink TUI (runs in your current terminal — no popup windows)
19
+ - **Transcribes** via Groq `whisper-large-v3-turbo` (or OpenAI Whisper)
20
+ - **Formats** output with 7 developer templates: `claude-code`, `prd`, `bug`, `meeting-notes`, `todo`, `commit-message`, `raw`
21
+ - **Cross-platform:** Linux (PulseAudio/PipeWire), macOS (AVFoundation), Windows (DirectShow)
22
+ - **Agent-native:** global `--json` envelopes, `--yes`, deterministic exit codes, stdin/stdout piping, a discoverable `manifest`, and an MCP server — see [Agent & scripting use](#agent--scripting-use)
23
+ - **Local option:** `--provider local-whisper` transcribes on-device via whisper.cpp (no upload)
24
+
25
+ ## Requirements
26
+
27
+ - Node.js ≥ 20
28
+ - ffmpeg ≥ 4.4 (`sudo apt install ffmpeg` / `brew install ffmpeg`)
29
+ - A Groq API key (free tier works) — get one at console.groq.com
30
+
31
+ ## Installation
32
+
33
+ ```bash
34
+ npm install -g recmp3-cli
35
+ ```
36
+
37
+ Or build from source:
38
+
39
+ ```bash
40
+ git clone https://github.com/aedneth/recmp3-cli
41
+ cd recmp3-cli
42
+ npm install && npm run build && npm link
43
+ ```
44
+
45
+ Then set your API key:
46
+
47
+ ```bash
48
+ echo 'export GROQ_API_KEY=gsk_...' >> ~/.bashrc
49
+ source ~/.bashrc
50
+ ```
51
+
52
+ Verify the install:
53
+
54
+ ```bash
55
+ recmp3 doctor
56
+ ```
57
+
58
+ ## Commands
59
+
60
+ ### `recmp3 record`
61
+
62
+ ```
63
+ recmp3 record [options]
64
+
65
+ Options:
66
+ -n, --name <name> Output filename stem (e.g. "my-idea")
67
+ -o, --out <dir> Output directory
68
+ -t, --transcribe Transcribe immediately after recording
69
+ --mp3 Save as MP3 instead of WAV
70
+ --provider <name> Override provider (groq, openai, local-whisper)
71
+ --lang <code> Force language code (e.g. es, en)
72
+ --source <id> Audio source id, or "auto" for the best physical mic
73
+ --duration <seconds> Headless: record N seconds then stop (no TUI)
74
+ --no-tui Force headless capture (record until Ctrl+C)
75
+ --no-copy / --no-print Don't copy / don't print the transcript
76
+ -y, --yes Skip upload consent prompt
77
+ ```
78
+
79
+ > **Tip:** `recmp3 record --source auto` selects your real microphone automatically,
80
+ > skipping the generic `default` device and system-audio `.monitor` sources.
81
+ > Run `recmp3 sources` to see every device and which one is `(recommended)`.
82
+
83
+ Controls while recording:
84
+ - `p` or Space — pause / resume
85
+ - `s` or Enter — save and finish
86
+ - `c` or Escape — cancel (discard recording)
87
+ - Ctrl+C — cancel
88
+
89
+ ### `recmp3 transcribe <file>`
90
+
91
+ Transcribe an existing audio file. Outputs transcript text to stdout (pipeable).
92
+
93
+ ```bash
94
+ recmp3 transcribe meeting.wav --template prd > meeting-prd.md
95
+ ```
96
+
97
+ ### `recmp3 prompt <file>`
98
+
99
+ Apply a developer template to a transcript or text file. No network call — purely deterministic formatting.
100
+
101
+ ```bash
102
+ recmp3 prompt transcript.txt --template claude-code
103
+ recmp3 prompt transcript.txt --list-templates
104
+ ```
105
+
106
+ **Available templates:** `raw`, `claude-code`, `prd`, `bug`, `meeting-notes`, `todo`, `commit-message`
107
+
108
+ ### `recmp3 sources`
109
+
110
+ List available audio input devices for the current platform.
111
+
112
+ ```bash
113
+ recmp3 sources
114
+ recmp3 sources --json
115
+ ```
116
+
117
+ ### `recmp3 doctor`
118
+
119
+ Run 8 system checks: Node version, platform support, ffmpeg version, audio backend, config file, API key, provider connectivity, and recordings directory.
120
+
121
+ ### `recmp3 config`
122
+
123
+ ```bash
124
+ recmp3 config init # Setup (interactive, or flag-driven: --provider/--lang/--outdir/--key)
125
+ recmp3 config show # Display current config (API key redacted)
126
+ recmp3 config path # Print config file path
127
+ recmp3 config set <k> <v> # Set a config key
128
+ recmp3 config set-key groq --key gsk_... # Store an API key in the OS keychain
129
+ ```
130
+
131
+ ## Agent & scripting use
132
+
133
+ Every command is usable by AI agents (Claude Code, Codex, Gemini CLI, …) and shell scripts
134
+ with no TTY and no prompts. See [`docs/AGENTS.md`](docs/AGENTS.md) for the full reference.
135
+
136
+ ```bash
137
+ # Stable JSON envelope on stdout; chatter on stderr
138
+ recmp3 transcribe meeting.wav --json --yes | jq -r .data.text
139
+
140
+ # Compose via pipes: transcribe → template
141
+ recmp3 transcribe meeting.wav --json --yes | jq -r .data.text | recmp3 prompt - --template prd
142
+
143
+ # Headless recording (no Ink TUI)
144
+ recmp3 record --duration 5 --json --yes
145
+
146
+ # Discover the command/tool surface
147
+ recmp3 manifest --json
148
+ ```
149
+
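A script that consumes the JSON envelope from Node rather than `jq` might look like the sketch below. Only the `data.text` path comes from the examples above; the helper name `extractTranscript` and the sample envelope are hypothetical, and the real envelope may carry more fields.

```javascript
// Hypothetical helper: pull the transcript out of a `recmp3 ... --json` envelope.
// Only the `data.text` path is taken from the README; nothing else is assumed.
function extractTranscript(stdout) {
  let envelope;
  try {
    envelope = JSON.parse(stdout);
  } catch {
    throw new Error("stdout was not valid JSON; was --json passed?");
  }
  const text = envelope?.data?.text;
  if (typeof text !== "string") {
    throw new Error("envelope has no data.text string");
  }
  return text;
}

// Usage with a hand-written sample envelope (not real recmp3 output):
const sampleEnvelope = JSON.stringify({ data: { text: "ship the fix today" } });
console.log(extractTranscript(sampleEnvelope)); // prints "ship the fix today"
```

Failing loudly on a missing `data.text` keeps a pipeline from silently passing an empty transcript downstream.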
150
+ **Exit codes:** `0` success · `1` unknown · `2` config · `3` audio/ffmpeg · `4` transcription ·
151
+ `5` network · `6` local-whisper · `7` input · `130` user abort.
152
+
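A wrapper script could branch on these exit codes roughly as follows. The code-to-label table is copied from the list above; the function name `classifyExit` and the rule that only network failures are worth retrying are assumptions made for illustration.

```javascript
// Exit codes and labels as listed in the README.
const EXIT_LABELS = {
  0: "success",
  1: "unknown",
  2: "config",
  3: "audio/ffmpeg",
  4: "transcription",
  5: "network",
  6: "local-whisper",
  7: "input",
  130: "user abort",
};

// Hypothetical policy: treat only network failures (5) as transient.
function classifyExit(code) {
  const label = EXIT_LABELS[code] ?? "unrecognized";
  return { code, label, ok: code === 0, retryable: code === 5 };
}

console.log(classifyExit(5).label);     // prints "network"
console.log(classifyExit(5).retryable); // prints true
console.log(classifyExit(130).label);   // prints "user abort"
```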
153
+ ### MCP server
154
+
155
+ `recmp3` ships a Model Context Protocol server over stdio. Register it with any MCP client:
156
+
157
+ ```jsonc
158
+ { "mcpServers": { "recmp3": { "command": "recmp3", "args": ["mcp"] } } }
159
+ ```
160
+
161
+ Tools: `recmp3_transcribe`, `recmp3_prompt`, `recmp3_sources`, `recmp3_doctor`,
162
+ `recmp3_config_show`, `recmp3_record`, `recmp3_manifest`.
163
+
164
+ ### Local, no-upload transcription
165
+
166
+ ```bash
167
+ export RECMP3_WHISPER_BIN=/usr/local/bin/whisper-cli # whisper.cpp binary
168
+ export RECMP3_WHISPER_MODEL=/models/ggml-base.en.bin
169
+ recmp3 transcribe clip.wav --provider local-whisper --json
170
+ ```
171
+
172
+ ## Use Cases
173
+
174
+ ### Transcribe Instagram reels, TikToks, or any system audio (`recwatch`)
175
+
176
+ Capture the audio your speakers are playing — useful for transcribing reels, TikToks, YouTube clips, podcasts, or any video in your browser without re-recording with the mic. Linux + PulseAudio/PipeWire only; the source name below is the monitor of your default output sink (find yours via `recmp3 sources` or `pactl list short sources`).
177
+
178
+ ```bash
179
+ alias recwatch='RECMP3_SOURCE="alsa_output.platform-avs_hdaudio.0.stereo-fallback.monitor" recmp3 record --transcribe -y'
180
+ ```
181
+
182
+ 1. Add the alias above to `~/.bashrc` and `source ~/.bashrc`.
183
+ 2. Open the Instagram reel / TikTok / video in your browser.
184
+ 3. Run `recwatch` — it starts recording your system audio output.
185
+ 4. Watch the video; the audio plays through your speakers and is captured.
186
+ 5. Press `s` to stop → auto-transcribes via Groq Whisper → transcript prints to terminal and is copied to your clipboard.
187
+
188
+ ## Configuration
189
+
190
+ Config file location:
191
+ - Linux: `~/.config/recmp3/config.json`
192
+ - macOS: `~/Library/Preferences/recmp3/config.json`
193
+ - Windows: `%APPDATA%\recmp3\config.json`
194
+
195
+ Environment variables override config file values:
196
+
197
+ | Variable | Effect |
198
+ |---|---|
199
+ | `GROQ_API_KEY` | Groq API key |
200
+ | `OPENAI_API_KEY` | OpenAI API key |
201
+ | `RECMP3_PROVIDER` | `groq`, `openai`, or `local-whisper` |
202
+ | `RECMP3_MODEL` | Override transcription model |
203
+ | `RECMP3_SOURCE` | Default audio source |
204
+ | `RECMP3_FFMPEG_PATH` | Path to ffmpeg binary |
205
+ | `RECMP3_OUTDIR` | Default recordings output directory |
206
+ | `RECMP3_LANG` | Default language hint (e.g. `es`, `en`) |
207
+ | `RECMP3_WHISPER_BIN` | Path to a whisper.cpp binary (for `local-whisper`) |
208
+ | `RECMP3_WHISPER_MODEL` | Path to a GGML model file (for `local-whisper`) |
209
+ | `RECMP3_JSON` | `1` to always emit JSON envelopes |
210
+ | `RECMP3_YES` | `1` to skip all prompts |
211
+ | `RECMP3_QUIET` | `1` to suppress stderr chatter |
212
+ | `RECMP3_SKIP_CONSENT` | `1` to skip upload consent prompt |
213
+
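The precedence described above (environment variable over config file) can be sketched as below. `resolveSetting` and `flagEnabled` are hypothetical names, and treating an empty variable as unset is an assumption; the package's actual resolution logic is not shown in this diff.

```javascript
// Hypothetical resolver: environment variable first, then config file value, then a default.
// An empty environment variable is treated as unset (an assumption).
function resolveSetting(envName, configValue, fallback, env = process.env) {
  const fromEnv = env[envName];
  if (fromEnv !== undefined && fromEnv !== "") return fromEnv;
  return configValue ?? fallback;
}

// Flags such as RECMP3_JSON are documented as "1" to enable.
function flagEnabled(envName, env = process.env) {
  return env[envName] === "1";
}

const demoEnv = { RECMP3_LANG: "es", RECMP3_JSON: "1" };
console.log(resolveSetting("RECMP3_LANG", "en", "auto", demoEnv));      // prints "es"
console.log(resolveSetting("RECMP3_OUTDIR", "/tmp/rec", ".", demoEnv)); // prints "/tmp/rec"
console.log(flagEnabled("RECMP3_JSON", demoEnv));                       // prints true
```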
214
+ ## Providers
215
+
216
+ | Provider | Default model | Max file size | Upload |
217
+ |---|---|---|---|
218
+ | Groq | `whisper-large-v3-turbo` | 25 MB | yes |
219
+ | OpenAI | `whisper-1` | 25 MB | yes |
220
+ | local-whisper | whisper.cpp GGML model | unlimited | **no (on-device)** |
221
+
222
+ Audio is captured as 16 kHz mono WAV (~1 MB/min), so the 25 MB limit covers ~25 minutes per recording. Longer recordings are chunked automatically.
223
+
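The size estimate can be reproduced with a little arithmetic. The README does not state the sample width, so `bytesPerSample` is left as a parameter here: one byte per sample gives roughly the ~1 MB/min figure, while 16-bit PCM (two bytes) would be about twice that. Function names are illustrative, and the WAV header is ignored.

```javascript
// Bytes per minute of uncompressed PCM audio (WAV header overhead ignored).
function pcmBytesPerMinute(sampleRateHz, channels, bytesPerSample) {
  return sampleRateHz * channels * bytesPerSample * 60;
}

// Minutes of audio that fit under a size limit given in megabytes (10^6 bytes).
function minutesUnderLimit(limitMB, sampleRateHz, channels, bytesPerSample) {
  return (limitMB * 1e6) / pcmBytesPerMinute(sampleRateHz, channels, bytesPerSample);
}

console.log(pcmBytesPerMinute(16000, 1, 1));                // prints 960000
console.log(pcmBytesPerMinute(16000, 1, 2));                // prints 1920000
console.log(minutesUnderLimit(25, 16000, 1, 1).toFixed(1)); // prints "26.0"
console.log(minutesUnderLimit(25, 16000, 1, 2).toFixed(1)); // prints "13.0"
```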
224
+ ## Roadmap
225
+
226
+ | Version | Theme | Status |
227
+ |---|---|---|
228
+ | v0.1.0 | Core TUI recorder + Groq/OpenAI transcription | ✅ shipped |
229
+ | v0.2.0 | Agent-native: `--json`, MCP server, local Whisper, keychain | ✅ shipped |
230
+ | v1.0.0 | Source auto-detection, graceful MCP shutdown, stable CLI + exit-code contract | ✅ shipped |
231
+ | v1.1.0 | Streaming transcription, real-time waveform display | planned |
232
+ | v1.2.0 | Multi-segment smart chunking, speaker diarization | planned |
233
+ | v2.0.0 | Plugin SDK for custom providers and templates | planned |
234
+
235
+ ## Development
236
+
237
+ ```bash
238
+ npm run dev # Run with tsx (no build step)
239
+ npm run build # Build to dist/
240
+ npm run typecheck # TypeScript check
241
+ npm run lint # Biome lint
242
+ npm test # Run test suite (vitest)
243
+ npm run test:watch # Watch mode
244
+ ```
245
+
246
+ See [CONTRIBUTING.md](CONTRIBUTING.md) for the full contribution guide.
247
+
248
+ ## License
249
+
250
+ `recmp3-cli` is dual-licensed:
251
+
252
+ - **[AGPL-3.0](LICENSE)** — free for personal use and open source projects
253
+ - **[Commercial license](LICENSE-COMMERCIAL.md)** — required for proprietary/commercial use
254
+
255
+ Contact **eduardoa.borjas@gmail.com** to purchase a commercial license.
@@ -0,0 +1,46 @@
1
+ #!/usr/bin/env node
2
+ import { execFile } from 'child_process';
3
+ import { promisify } from 'util';
4
+
5
+ var execFileAsync = promisify(execFile);
6
+ var _ffmpegPath = null;
7
+ async function findFfmpeg() {
8
+ if (_ffmpegPath) return _ffmpegPath;
9
+ if (process.env.RECMP3_FFMPEG_PATH) {
10
+ _ffmpegPath = process.env.RECMP3_FFMPEG_PATH;
11
+ return _ffmpegPath;
12
+ }
13
+ try {
14
+ const which = process.platform === "win32" ? "where" : "which";
15
+ const { stdout } = await execFileAsync(which, ["ffmpeg"]);
16
+ _ffmpegPath = stdout.trim().split(/\r?\n/)[0];
17
+ return _ffmpegPath;
18
+ } catch {
19
+ throw new Error("ffmpeg not found. Install it (e.g. sudo apt install ffmpeg or brew install ffmpeg) or set RECMP3_FFMPEG_PATH.");
20
+ }
21
+ }
22
+ async function checkFfmpegVersion() {
23
+ try {
24
+ const ffmpeg = await findFfmpeg();
25
+ const { stdout, stderr } = await execFileAsync(ffmpeg, ["-version"]);
26
+ const output = stdout + stderr;
27
+ const match = output.match(/ffmpeg version (\S+)/);
28
+ const version = match?.[1] ?? "unknown";
29
+ const [major, minor] = version.split(".").map(Number);
30
+ const meets = major > 4 || (major === 4 && minor >= 4);
31
+ return { version, meets };
32
+ } catch {
33
+ return { version: "not found", meets: false };
34
+ }
35
+ }
36
+ async function supportsInputFormat(format) {
37
+ try {
38
+ const ffmpeg = await findFfmpeg();
39
+ const { stdout } = await execFileAsync(ffmpeg, ["-formats"]);
40
+ return stdout.includes(format);
41
+ } catch {
42
+ return false;
43
+ }
44
+ }
45
+
46
+ export { checkFfmpegVersion, findFfmpeg, supportsInputFormat };
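`checkFfmpegVersion` above splits the version token on `.` and converts each part with `Number`, so a prefixed tag such as `n6.0` yields `NaN` and is reported as not meeting the minimum. A more tolerant parse, sketched here as a hypothetical `parseFfmpegVersion` helper (not part of the package), extracts the first `major.minor` pair with a regular expression and otherwise keeps the same 4.4 threshold.

```javascript
// Hypothetical, more tolerant variant of the version check in the chunk above.
// Accepts an optional non-digit prefix (e.g. "n6.0") before the major.minor pair.
function parseFfmpegVersion(output) {
  const match = output.match(/ffmpeg version ([^\d\s]*(\d+)\.(\d+)\S*)/);
  if (!match) return { version: "unknown", meets: false };
  const major = Number(match[2]);
  const minor = Number(match[3]);
  const meets = major > 4 || (major === 4 && minor >= 4);
  return { version: match[1], meets };
}

console.log(parseFfmpegVersion("ffmpeg version 6.1.1-3ubuntu5 Copyright (c) 2000-2023"));
// prints { version: '6.1.1-3ubuntu5', meets: true }
console.log(parseFfmpegVersion("ffmpeg version n4.3.2 Copyright"));
// prints { version: 'n4.3.2', meets: false }
```

Version strings without a dotted number, such as git snapshot builds, still come back as `unknown` with `meets: false`, so the caller decides whether to warn or proceed.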
@@ -0,0 +1,40 @@
1
+ #!/usr/bin/env node
2
+ // src/log.ts
3
+ var debugEnabled = false;
4
+ var verboseEnabled = false;
5
+ function initLogger(opts) {
6
+ debugEnabled = opts.debug ?? process.env.RECMP3_DEBUG === "1";
7
+ verboseEnabled = opts.verbose ?? debugEnabled;
8
+ }
9
+ var log = {
10
+ debug(msg, ...args) {
11
+ if (debugEnabled) {
12
+ process.stderr.write(
13
+ `[debug] ${msg} ${args.length ? JSON.stringify(args) : ""}
14
+ `
15
+ );
16
+ }
17
+ },
18
+ info(msg, ...args) {
19
+ if (verboseEnabled) {
20
+ process.stderr.write(
21
+ `[info] ${msg} ${args.length ? JSON.stringify(args) : ""}
22
+ `
23
+ );
24
+ }
25
+ },
26
+ warn(msg) {
27
+ process.stderr.write(`[warn] ${msg}
28
+ `);
29
+ },
30
+ error(msg) {
31
+ process.stderr.write(`[error] ${msg}
32
+ `);
33
+ }
34
+ };
35
+ function redactKey(key) {
36
+ if (!key || key.length < 8) return "***";
37
+ return `${key.slice(0, 3)}***${key.slice(-4)}`;
38
+ }
39
+
40
+ export { initLogger, log, redactKey };
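To see what `redactKey` emits, the function is copied below so the snippet runs on its own; the key values are made up.

```javascript
// Copied from the chunk above so the example is self-contained.
function redactKey(key) {
  if (!key || key.length < 8) return "***";
  return `${key.slice(0, 3)}***${key.slice(-4)}`;
}

console.log(redactKey("gsk_example_key_1234")); // prints "gsk***1234"
console.log(redactKey("short"));                // prints "***"
console.log(redactKey(undefined));              // prints "***"
```

Note that a key of exactly eight characters still shows seven of them (three leading, four trailing), so the masking is only meaningful for keys well above that length.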
@@ -0,0 +1,51 @@
1
+ #!/usr/bin/env node
2
+ import { log } from './chunk-DDXRBIWU.js';
3
+
4
+ // src/secrets/keychain.ts
5
+ var SERVICE = "recmp3";
6
+ var _keytar;
7
+ var _warned = false;
8
+ async function getKeytar() {
9
+ if (_keytar !== void 0) return _keytar;
10
+ try {
11
+ const mod = await import('keytar');
12
+ _keytar = mod.default ?? mod;
13
+ } catch {
14
+ _keytar = null;
15
+ if (!_warned) {
16
+ _warned = true;
17
+ log.info(
18
+ "OS keychain (keytar) unavailable \u2014 falling back to environment variables."
19
+ );
20
+ }
21
+ }
22
+ return _keytar;
23
+ }
24
+ async function keychainAvailable() {
25
+ return await getKeytar() !== null;
26
+ }
27
+ async function getSecret(account) {
28
+ const kt = await getKeytar();
29
+ if (!kt) return void 0;
30
+ try {
31
+ return await kt.getPassword(SERVICE, account) ?? void 0;
32
+ } catch (err) {
33
+ log.info(
34
+ `keychain read failed for ${account}: ${err instanceof Error ? err.message : String(err)}`
35
+ );
36
+ return void 0;
37
+ }
38
+ }
39
+ async function setSecret(account, value) {
40
+ const kt = await getKeytar();
41
+ if (!kt) return false;
42
+ await kt.setPassword(SERVICE, account, value);
43
+ return true;
44
+ }
45
+ async function deleteSecret(account) {
46
+ const kt = await getKeytar();
47
+ if (!kt) return false;
48
+ return kt.deletePassword(SERVICE, account);
49
+ }
50
+
51
+ export { deleteSecret, getSecret, keychainAvailable, setSecret };
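One way a caller might combine `getSecret` with the environment-variable fallback mentioned in the log message is sketched below. The lookup order (keychain first, then environment), the use of the provider name as the keychain account, and the `resolveApiKey` helper itself are all assumptions; this diff does not show how the package wires them together.

```javascript
// Environment variable names as documented in the README.
const ENV_VARS = { groq: "GROQ_API_KEY", openai: "OPENAI_API_KEY" };

// Hypothetical lookup: keychain first, then the provider's environment variable.
// `getSecret` is injected so the sketch runs without keytar installed.
async function resolveApiKey(provider, { getSecret, env = process.env }) {
  const fromKeychain = await getSecret(provider); // undefined when keytar is unavailable
  if (fromKeychain) return { key: fromKeychain, source: "keychain" };
  const fromEnv = env[ENV_VARS[provider]];
  if (fromEnv) return { key: fromEnv, source: "env" };
  return { key: undefined, source: "none" };
}

// Stub standing in for getSecret() when no keychain entry exists.
resolveApiKey("groq", {
  getSecret: async () => undefined,
  env: { GROQ_API_KEY: "gsk_demo" },
}).then((result) => console.log(result.source)); // prints "env"
```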
@@ -0,0 +1,63 @@
1
+ #!/usr/bin/env node
2
+ // src/errors.ts
3
+ var ExitCode = {
4
+ SUCCESS: 0,
5
+ UNKNOWN: 1,
6
+ CONFIG: 2,
7
+ AUDIO: 3,
8
+ TRANSCRIPTION: 4,
9
+ NETWORK: 5,
10
+ LOCAL_WHISPER: 6,
11
+ INPUT: 7,
12
+ USER_ABORT: 130
13
+ };
14
+ var RecmpError = class extends Error {
15
+ constructor(code, message, exitCode = ExitCode.UNKNOWN) {
16
+ super(message);
17
+ this.code = code;
18
+ this.exitCode = exitCode;
19
+ this.name = "RecmpError";
20
+ }
21
+ code;
22
+ exitCode;
23
+ };
24
+ var ConfigError = class extends RecmpError {
25
+ constructor(message) {
26
+ super("CONFIG_ERROR", message, ExitCode.CONFIG);
27
+ this.name = "ConfigError";
28
+ }
29
+ };
30
+ var AudioCaptureError = class extends RecmpError {
31
+ constructor(message) {
32
+ super("AUDIO_CAPTURE_ERROR", message, ExitCode.AUDIO);
33
+ this.name = "AudioCaptureError";
34
+ }
35
+ };
36
+ var TranscriptionError = class extends RecmpError {
37
+ constructor(message, statusCode) {
38
+ super("TRANSCRIPTION_ERROR", message, ExitCode.TRANSCRIPTION);
39
+ this.statusCode = statusCode;
40
+ this.name = "TranscriptionError";
41
+ }
42
+ statusCode;
43
+ };
44
+ var NetworkError = class extends RecmpError {
45
+ constructor(message) {
46
+ super("NETWORK_ERROR", message, ExitCode.NETWORK);
47
+ this.name = "NetworkError";
48
+ }
49
+ };
50
+ var LocalWhisperError = class extends RecmpError {
51
+ constructor(message) {
52
+ super("LOCAL_WHISPER_ERROR", message, ExitCode.LOCAL_WHISPER);
53
+ this.name = "LocalWhisperError";
54
+ }
55
+ };
56
+ var InputError = class extends RecmpError {
57
+ constructor(message) {
58
+ super("INPUT_ERROR", message, ExitCode.INPUT);
59
+ this.name = "InputError";
60
+ }
61
+ };
62
+
63
+ export { AudioCaptureError, ConfigError, ExitCode, InputError, LocalWhisperError, NetworkError, RecmpError, TranscriptionError };
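A top-level handler could turn these error classes into a process exit code along the lines below. The `RecmpError` class here is a minimal stand-in mirroring the one above, and `toExit` is a hypothetical helper; the CLI's real entry point is not part of this chunk.

```javascript
// Minimal stand-in for the RecmpError class above, so the example is self-contained.
class RecmpError extends Error {
  constructor(code, message, exitCode = 1) {
    super(message);
    this.code = code;
    this.exitCode = exitCode;
    this.name = "RecmpError";
  }
}

// Hypothetical handler: map any thrown value to an exit code and a stderr line.
function toExit(err) {
  if (err instanceof RecmpError) {
    return { exitCode: err.exitCode, line: `[error] ${err.code}: ${err.message}` };
  }
  const message = err instanceof Error ? err.message : String(err);
  return { exitCode: 1, line: `[error] ${message}` };
}

console.log(toExit(new RecmpError("INPUT_ERROR", "file not found", 7)).exitCode); // prints 7
console.log(toExit(new TypeError("boom")).exitCode);                              // prints 1
```

Anything that is not a `RecmpError` falls through to exit code 1, matching the `UNKNOWN` entry in `ExitCode`.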
@@ -0,0 +1,127 @@
1
+ #!/usr/bin/env node
2
+ import { log } from './chunk-DDXRBIWU.js';
3
+ import { LocalWhisperError } from './chunk-NUWDWBJQ.js';
4
+ import { execFile } from 'child_process';
5
+ import { existsSync } from 'fs';
6
+ import { mkdtemp, rm, readFile } from 'fs/promises';
7
+ import { tmpdir } from 'os';
8
+ import { basename, join } from 'path';
9
+ import { promisify } from 'util';
10
+
11
+ var execFileAsync = promisify(execFile);
12
+ var CANDIDATE_BINS = ["whisper-cli", "whisper", "main"];
13
+ async function findWhisperBin(configPath) {
14
+ const explicit = configPath ?? process.env.RECMP3_WHISPER_BIN;
15
+ if (explicit) {
16
+ if (!existsSync(explicit)) {
17
+ throw new LocalWhisperError(`whisper binary not found at: ${explicit}`);
18
+ }
19
+ return explicit;
20
+ }
21
+ const which = process.platform === "win32" ? "where" : "which";
22
+ for (const name of CANDIDATE_BINS) {
23
+ try {
24
+ const { stdout } = await execFileAsync(which, [name]);
25
+ const found = stdout.trim().split(/\r?\n/)[0];
26
+ if (found) return found;
27
+ } catch {
28
+ }
29
+ }
30
+ throw new LocalWhisperError(
31
+ 'whisper.cpp binary not found. Install whisper.cpp and ensure "whisper-cli" is on PATH, or set RECMP3_WHISPER_BIN / config provider.local.binPath.'
32
+ );
33
+ }
34
+ function findWhisperModel(configPath) {
35
+ const model = configPath ?? process.env.RECMP3_WHISPER_MODEL;
36
+ if (!model) {
37
+ throw new LocalWhisperError(
38
+ "No whisper model configured. Set RECMP3_WHISPER_MODEL / config provider.local.modelPath to a .bin model file (e.g. ggml-base.en.bin)."
39
+ );
40
+ }
41
+ if (!existsSync(model)) {
42
+ throw new LocalWhisperError(`whisper model not found at: ${model}`);
43
+ }
44
+ return model;
45
+ }
46
+
47
+ // src/transcription/local-whisper.ts
48
+ var execFileAsync2 = promisify(execFile);
49
+ var LocalWhisperProvider = class {
50
+ constructor(config = {}) {
51
+ this.config = config;
52
+ }
53
+ config;
54
+ name = "local-whisper";
55
+ maxFileSizeBytes = Number.POSITIVE_INFINITY;
56
+ supportedFormats = ["wav", "mp3", "flac", "ogg"];
57
+ async transcribe(input) {
58
+ const { audioPath, language, signal } = input;
59
+ const t0 = Date.now();
60
+ const bin = await findWhisperBin(this.config.binPath);
61
+ const model = findWhisperModel(this.config.modelPath);
62
+ const lang = language ?? this.config.language;
63
+ log.info(`Transcribing locally with whisper.cpp: ${basename(audioPath)}`);
64
+ const workDir = await mkdtemp(join(tmpdir(), "recmp3-whisper-"));
65
+ const outPrefix = join(workDir, "out");
66
+ const args = ["-m", model, "-f", audioPath, "-oj", "-of", outPrefix];
67
+ if (lang) args.push("-l", lang);
68
+ try {
69
+ await execFileAsync2(bin, args, { signal, maxBuffer: 64 * 1024 * 1024 });
70
+ } catch (err) {
71
+ await rm(workDir, { recursive: true, force: true }).catch(() => {
72
+ });
73
+ throw new LocalWhisperError(
74
+ `whisper.cpp failed: ${err instanceof Error ? err.message : String(err)}`
75
+ );
76
+ }
77
+ const jsonPath = `${outPrefix}.json`;
78
+ if (!existsSync(jsonPath)) {
79
+ await rm(workDir, { recursive: true, force: true }).catch(() => {
80
+ });
81
+ throw new LocalWhisperError("whisper.cpp produced no JSON output.");
82
+ }
83
+ let parsed;
84
+ try {
85
+ parsed = JSON.parse(await readFile(jsonPath, "utf-8"));
86
+ } catch (err) {
87
+ throw new LocalWhisperError(
88
+ `Failed to parse whisper.cpp output: ${err instanceof Error ? err.message : String(err)}`
89
+ );
90
+ } finally {
91
+ await rm(workDir, { recursive: true, force: true }).catch(() => {
92
+ });
93
+ }
94
+ const segments = (parsed.transcription ?? []).map((s) => ({
95
+ startSec: (s.offsets?.from ?? 0) / 1e3,
96
+ endSec: (s.offsets?.to ?? 0) / 1e3,
97
+ text: (s.text ?? "").trim()
98
+ }));
99
+ const text = segments.map((s) => s.text).join(" ").replace(/\s+/g, " ").trim();
100
+ const latencyMs = Date.now() - t0;
101
+ return {
102
+ text,
103
+ language: parsed.result?.language ?? lang,
104
+ durationSec: segments.length ? segments[segments.length - 1].endSec : void 0,
105
+ segments,
106
+ raw: parsed,
107
+ provider: "local-whisper",
108
+ model: basename(model),
109
+ latencyMs
110
+ };
111
+ }
112
+ async ping() {
113
+ const t0 = Date.now();
114
+ try {
115
+ await findWhisperBin(this.config.binPath);
116
+ findWhisperModel(this.config.modelPath);
117
+ return { ok: true, latencyMs: Date.now() - t0 };
118
+ } catch (err) {
119
+ return {
120
+ ok: false,
121
+ error: err instanceof Error ? err.message : String(err)
122
+ };
123
+ }
124
+ }
125
+ };
126
+
127
+ export { LocalWhisperProvider };
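The segment mapping inside `transcribe` can be exercised in isolation. The sketch below lifts that logic into a hypothetical `toSegments` function and feeds it a hand-written object shaped the way the code above reads it (`transcription[].offsets.from/to` in milliseconds); the sample is not real whisper.cpp output.

```javascript
// Same mapping as in LocalWhisperProvider.transcribe, extracted for illustration.
function toSegments(parsed) {
  return (parsed.transcription ?? []).map((s) => ({
    startSec: (s.offsets?.from ?? 0) / 1000, // offsets are read as milliseconds
    endSec: (s.offsets?.to ?? 0) / 1000,
    text: (s.text ?? "").trim(),
  }));
}

// Hand-written sample in the shape the provider expects.
const whisperSample = {
  transcription: [
    { offsets: { from: 0, to: 2500 }, text: " hello there" },
    { offsets: { from: 2500, to: 4000 }, text: " general   update" },
  ],
};

const demoSegments = toSegments(whisperSample);
const demoText = demoSegments.map((s) => s.text).join(" ").replace(/\s+/g, " ").trim();

console.log(demoText);                // prints "hello there general update"
console.log(demoSegments[0].endSec);  // prints 2.5
console.log(toSegments({}).length);   // prints 0
```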