whisper-coreml 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
@@ -0,0 +1,22 @@
+ MIT License
+
+ Copyright (c) 2026 Sebastian Werner
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
package/README.md ADDED
@@ -0,0 +1,290 @@
+ # whisper-coreml
+
+ <p align="center">
+ <img src="logo.svg" alt="whisper-coreml" width="128" height="128">
+ </p>
+
+ <p align="center">
+ <strong>OpenAI Whisper ASR for Node.js with CoreML/ANE acceleration on Apple Silicon</strong>
+ </p>
+
+ <p align="center">
+ <a href="https://github.com/sebastian-software/whisper-coreml/actions/workflows/ci.yml"><img src="https://github.com/sebastian-software/whisper-coreml/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
+ <a href="https://www.npmjs.com/package/whisper-coreml"><img src="https://img.shields.io/npm/v/whisper-coreml.svg" alt="npm version"></a>
+ <a href="https://www.npmjs.com/package/whisper-coreml"><img src="https://img.shields.io/npm/dm/whisper-coreml.svg" alt="npm downloads"></a>
+ <br>
+ <a href="https://www.typescriptlang.org/"><img src="https://img.shields.io/badge/TypeScript-5.x-blue.svg" alt="TypeScript"></a>
+ <a href="https://nodejs.org/"><img src="https://img.shields.io/badge/Node.js-20+-green.svg" alt="Node.js"></a>
+ <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT"></a>
+ </p>
+
+ Powered by [whisper.cpp](https://github.com/ggerganov/whisper.cpp) running on Apple's Neural Engine
+ via CoreML.
+
+ ## Why whisper-coreml?
+
+ When you need **higher transcription quality** than
+ [parakeet-coreml](https://github.com/sebastian-software/parakeet-coreml) can provide, Whisper's
+ large-v3-turbo model delivers. It offers:
+
+ - **Support for 99 languages** vs Parakeet's 25 European languages
+ - **Better accuracy** on challenging audio (accents, background noise)
+ - **Translation capability** (any language → English)
+ - **Word-level confidence scores**
+
+ ### When to Use Which
+
+ | Use Case                            | Recommended                                                              |
+ | ----------------------------------- | ------------------------------------------------------------------------ |
+ | Fast transcription, major languages | [parakeet-coreml](https://github.com/sebastian-software/parakeet-coreml) |
+ | Maximum accuracy, any language      | **whisper-coreml**                                                       |
+ | Translation to English              | **whisper-coreml**                                                       |
+ | Edge cases (accents, noise)         | **whisper-coreml**                                                       |
+
+ ## Features
+
+ - šŸŽÆ **99 Languages** – Full Whisper multilingual support
+ - šŸš€ **14x real-time** – Transcribe 1 hour of audio in ~4.5 minutes (M1 Ultra, measured)
+ - šŸŽ **Neural Engine Acceleration** – Runs on Apple's dedicated ML silicon via CoreML
+ - šŸ”’ **Fully Offline** – All processing happens locally
+ - šŸ“¦ **Zero Runtime Dependencies** – No Python, no subprocess
+ - šŸ“ **Timestamps** – Segment-level timing for subtitles
+ - šŸ”„ **Translation** – Translate any language to English
+ - ā¬‡ļø **Easy Setup** – Single CLI command to download the model
+
+ ## Performance
+
+ The CoreML encoder runs on Apple's Neural Engine for accelerated inference:
+
+ **Measured: M1 Ultra**
+
+ ```
+ 5 minutes of audio → 22.5 seconds
+ Speed: 14x real-time
+ 1 hour of audio in ~4.5 minutes
+ ```
+
+ Run your own benchmark:
+
+ ```bash
+ git clone https://github.com/sebastian-software/whisper-coreml
+ cd whisper-coreml && npm install && npm run benchmark
+ ```
+
+ ### Comparison with parakeet-coreml
+
+ | Metric           | whisper-coreml | parakeet-coreml |
+ | ---------------- | -------------- | --------------- |
+ | Speed (M1 Ultra) | 14x real-time  | 40x real-time   |
+ | Languages        | 99             | 25 European     |
+ | Translation      | āœ… Yes         | āŒ No           |
+ | Accuracy (WER)   | Lower (better) | Higher          |
+ | Model Size       | ~3 GB          | ~1.5 GB         |
+
+ **When to choose whisper-coreml:** Maximum accuracy, rare languages, translation, challenging audio.
+
+ **When to choose parakeet-coreml:** Maximum speed, major languages only.
+
+ ## Requirements
+
+ - macOS 14.0+ (Sonoma or later)
+ - Apple Silicon (M1, M2, M3, M4 – any variant)
+ - Node.js 20+
+
+ ## Installation
+
+ ```bash
+ npm install whisper-coreml
+ ```
+
+ ### Download the Model
+
+ ```bash
+ npx whisper-coreml download
+ ```
+
+ This downloads the **large-v3-turbo** model (~1.5 GB) – the only model we support, as it offers the
+ best speed/quality ratio.
+
+ ## Quick Start
+
+ ```typescript
+ import { WhisperAsrEngine, getModelPath } from "whisper-coreml"
+
+ const engine = new WhisperAsrEngine({
+   modelPath: getModelPath()
+ })
+
+ await engine.initialize()
+
+ // Transcribe audio (16kHz, mono, Float32Array)
+ const result = await engine.transcribe(audioSamples, 16000)
+
+ console.log(result.text)
+ // "Hello, this is a test transcription."
+
+ console.log(`Language: ${result.language}`)
+ console.log(`Processed in ${result.durationMs}ms`)
+
+ // Segments include timestamps
+ for (const seg of result.segments) {
+   console.log(`[${seg.startMs}ms - ${seg.endMs}ms] ${seg.text}`)
+ }
+
+ engine.cleanup()
+ ```
+
+ ## Audio Format
+
+ | Property    | Requirement                                 |
+ | ----------- | ------------------------------------------- |
+ | Sample Rate | **16,000 Hz** (16 kHz)                      |
+ | Channels    | **Mono** (single channel)                   |
+ | Format      | **Float32Array** with values in [-1.0, 1.0] |
+ | Duration    | **Any length** (auto-chunked internally)    |
+
+ ### Converting Audio Files
+
+ Example with ffmpeg (`-ar 16000` = 16 kHz, `-ac 1` = mono, `-f f32le` = raw 32-bit float PCM):
+
+ ```bash
+ ffmpeg -i input.mp3 -ar 16000 -ac 1 -f f32le output.pcm
+ ```
+
+ Then load the raw PCM file:
+
+ ```typescript
+ import { readFileSync } from "fs"
+
+ const buffer = readFileSync("output.pcm")
+ const samples = new Float32Array(buffer.buffer, buffer.byteOffset, buffer.length / 4)
+ ```
+
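ffmpeg's `f32le` output matches the required format directly, but many sources (WAV files, Web Audio captures) deliver 16-bit integer PCM, often stereo. A minimal sketch of that conversion, assuming interleaved 16-bit stereo input; the helper name is ours, not a package export:

```typescript
// Convert interleaved 16-bit stereo PCM to the mono Float32Array
// (values in [-1.0, 1.0]) that the engine expects.
export function int16StereoToFloat32Mono(pcm: Int16Array): Float32Array {
  const frames = Math.floor(pcm.length / 2)
  const mono = new Float32Array(frames)
  for (let i = 0; i < frames; i++) {
    // Scale each channel to [-1.0, 1.0), then average the stereo pair.
    const left = pcm[2 * i] / 32768
    const right = pcm[2 * i + 1] / 32768
    mono[i] = (left + right) / 2
  }
  return mono
}
```

Note this only changes the sample format, not the rate – resample to 16 kHz first (e.g. ffmpeg's `-ar 16000`).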
+ ## CLI Commands
+
+ ```bash
+ # Download the model (~1.5 GB)
+ npx whisper-coreml download
+
+ # Check status
+ npx whisper-coreml status
+
+ # Run benchmark (requires cloned repo)
+ npx whisper-coreml benchmark
+
+ # Get model directory path
+ npx whisper-coreml path
+ ```
+
+ ## API Reference
+
+ ### `WhisperAsrEngine`
+
+ The main class for speech recognition.
+
+ ```typescript
+ new WhisperAsrEngine(options: WhisperAsrOptions)
+ ```
+
+ #### Options
+
+ | Option      | Type      | Default  | Description                         |
+ | ----------- | --------- | -------- | ----------------------------------- |
+ | `modelPath` | `string`  | required | Path to ggml model file             |
+ | `language`  | `string`  | `"auto"` | Language code or `"auto"` to detect |
+ | `translate` | `boolean` | `false`  | Translate to English                |
+ | `threads`   | `number`  | `0`      | CPU threads (0 = auto)              |
+
+ #### Methods
+
+ | Method                      | Description                    |
+ | --------------------------- | ------------------------------ |
+ | `initialize()`              | Load model (async)             |
+ | `transcribe(samples, rate)` | Transcribe audio               |
+ | `isReady()`                 | Check if engine is initialized |
+ | `cleanup()`                 | Release native resources       |
+ | `getVersion()`              | Get version information        |
+
+ ### `TranscriptionResult`
+
+ ```typescript
+ interface TranscriptionResult {
+   text: string // Full transcription
+   language: string // Detected language (ISO code)
+   durationMs: number // Processing time in milliseconds
+   segments: TranscriptionSegment[]
+ }
+
+ interface TranscriptionSegment {
+   startMs: number // Segment start in milliseconds
+   endMs: number // Segment end in milliseconds
+   text: string // Transcription for this segment
+   confidence: number // Confidence score (0-1)
+ }
+ ```
+
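Segment timestamps map directly onto subtitle formats. As an illustration (the `toSrt` helper below is ours, not a package export), `TranscriptionSegment`-shaped objects can be rendered as SRT like this:

```typescript
interface TranscriptionSegment {
  startMs: number
  endMs: number
  text: string
  confidence: number
}

// Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm).
function srtTime(ms: number): string {
  const pad = (n: number, width = 2) => String(n).padStart(width, "0")
  const hours = Math.floor(ms / 3_600_000)
  const minutes = Math.floor(ms / 60_000) % 60
  const seconds = Math.floor(ms / 1000) % 60
  return `${pad(hours)}:${pad(minutes)}:${pad(seconds)},${pad(ms % 1000, 3)}`
}

// Render segments as an SRT subtitle document.
export function toSrt(segments: TranscriptionSegment[]): string {
  return segments
    .map((seg, i) => `${i + 1}\n${srtTime(seg.startMs)} --> ${srtTime(seg.endMs)}\n${seg.text.trim()}\n`)
    .join("\n")
}
```

Feeding `result.segments` from `transcribe()` into such a helper yields ready-to-use subtitle files.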
+ ### Helper Functions
+
+ | Function               | Description                            |
+ | ---------------------- | -------------------------------------- |
+ | `isAvailable()`        | Check if running on supported platform |
+ | `getDefaultModelDir()` | Get default model cache path           |
+ | `getModelPath()`       | Get path to the model file             |
+ | `isModelDownloaded()`  | Check if model is downloaded           |
+ | `downloadModel()`      | Download the model                     |
+
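The compiled source in this package resolves the model to `~/.cache/whisper-coreml/models/ggml-large-v3-turbo.bin` by default. For illustration only, a sketch that mirrors that layout (in real code, prefer the package's own `getModelPath()`):

```typescript
import { homedir } from "node:os"
import { join } from "node:path"

// Mirrors the default cache layout used by getDefaultModelDir()/getModelPath():
// ~/.cache/whisper-coreml/models/ggml-large-v3-turbo.bin
function defaultModelPath(): string {
  return join(homedir(), ".cache", "whisper-coreml", "models", "ggml-large-v3-turbo.bin")
}
```

A typical first run would check `isModelDownloaded()` and call `downloadModel()` before constructing the engine.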
+ ## Translation
+
+ Translate any language to English:
+
+ ```typescript
+ const engine = new WhisperAsrEngine({
+   modelPath: getModelPath(),
+   language: "de", // German input
+   translate: true // Output in English
+ })
+ ```
+
+ ## Architecture
+
+ ```
+ ┌─────────────────────────────────┐
+ │ Your Node.js App                │
+ ├─────────────────────────────────┤
+ │ whisper-coreml API              │  TypeScript
+ ├─────────────────────────────────┤
+ │ Native Addon                    │  N-API + C++
+ │ (whisper_engine)                │
+ ├─────────────────────────────────┤
+ │ whisper.cpp                     │  C++
+ ├─────────────────────────────────┤
+ │ CoreML                          │  Apple Framework
+ ├─────────────────────────────────┤
+ │ Apple Neural Engine             │  Dedicated ML Silicon
+ └─────────────────────────────────┘
+ ```
+
+ ## Use Cases
+
+ - **Maximum accuracy** – When Parakeet's quality isn't sufficient
+ - **Rare languages** – Languages not supported by Parakeet
+ - **Translation** – Convert foreign speech to English text
+ - **Accented speech** – Whisper handles accents better
+ - **Noisy audio** – More robust to background noise
+
+ ## Contributing
+
+ Contributions are welcome! Please read our [Contributing Guide](CONTRIBUTING.md) for details.
+
+ ## License
+
+ MIT – see [LICENSE](LICENSE) for details.
+
+ ## Credits
+
+ - [whisper.cpp](https://github.com/ggerganov/whisper.cpp) by Georgi Gerganov
+ - [OpenAI Whisper](https://github.com/openai/whisper) by OpenAI
+
+ ---
+
+ Copyright © 2026 [Sebastian Software GmbH](https://sebastian-software.de), Mainz, Germany
Binary file
@@ -0,0 +1,216 @@
+ var __require = /* @__PURE__ */ ((x) => typeof require !== "undefined" ? require : typeof Proxy !== "undefined" ? new Proxy(x, {
+   get: (a, b) => (typeof require !== "undefined" ? require : a)[b]
+ }) : x)(function(x) {
+   if (typeof require !== "undefined") return require.apply(this, arguments);
+   throw Error('Dynamic require of "' + x + '" is not supported');
+ });
+
+ // src/download.ts
+ import { existsSync, mkdirSync, writeFileSync, rmSync } from "fs";
+ import { homedir } from "os";
+ import { join, dirname } from "path";
+ var WHISPER_MODEL = {
+   name: "large-v3-turbo",
+   size: "1.5 GB",
+   languages: "99 languages",
+   url: "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin"
+ };
+ function getDefaultModelDir() {
+   return join(homedir(), ".cache", "whisper-coreml", "models");
+ }
+ function getModelPath(modelDir) {
+   const dir = modelDir ?? getDefaultModelDir();
+   return join(dir, `ggml-${WHISPER_MODEL.name}.bin`);
+ }
+ function isModelDownloaded(modelDir) {
+   const modelPath = getModelPath(modelDir);
+   return existsSync(modelPath);
+ }
+ async function downloadModel(options = {}) {
+   const modelDir = options.modelDir ?? getDefaultModelDir();
+   const modelPath = getModelPath(modelDir);
+   if (!options.force && existsSync(modelPath)) {
+     return modelPath;
+   }
+   if (existsSync(modelPath)) {
+     rmSync(modelPath);
+   }
+   mkdirSync(dirname(modelPath), { recursive: true });
+   console.log(`Downloading Whisper ${WHISPER_MODEL.name} (${WHISPER_MODEL.size})...`);
+   console.log(`Source: ${WHISPER_MODEL.url}`);
+   console.log(`Target: ${modelPath}`);
+   const response = await fetch(WHISPER_MODEL.url);
+   if (!response.ok) {
+     throw new Error(`Failed to download model: ${response.statusText}`);
+   }
+   const contentLength = response.headers.get("content-length");
+   const totalBytes = contentLength ? parseInt(contentLength, 10) : 0;
+   const reader = response.body?.getReader();
+   if (!reader) {
+     throw new Error("Failed to get response body reader");
+   }
+   const chunks = [];
+   let downloadedBytes = 0;
+   while (true) {
+     const result = await reader.read();
+     if (result.done) {
+       break;
+     }
+     const chunk = result.value;
+     chunks.push(chunk);
+     downloadedBytes += chunk.length;
+     const percent = totalBytes > 0 ? Math.round(downloadedBytes / totalBytes * 100) : 0;
+     if (options.onProgress) {
+       options.onProgress({
+         downloadedBytes,
+         totalBytes,
+         percent
+       });
+     }
+     process.stdout.write(
+       `\rProgress: ${String(percent)}% (${formatBytes(downloadedBytes)}/${formatBytes(totalBytes)})`
+     );
+   }
+   const buffer = Buffer.concat(chunks);
+   writeFileSync(modelPath, buffer);
+   console.log("\n\u2713 Model downloaded successfully!");
+   return modelPath;
+ }
+ function formatBytes(bytes) {
+   if (bytes < 1024) {
+     return `${String(bytes)} B`;
+   }
+   if (bytes < 1024 * 1024) {
+     return `${(bytes / 1024).toFixed(1)} KB`;
+   }
+   if (bytes < 1024 * 1024 * 1024) {
+     return `${(bytes / 1024 / 1024).toFixed(1)} MB`;
+   }
+   return `${(bytes / 1024 / 1024 / 1024).toFixed(2)} GB`;
+ }
+
+ // src/index.ts
+ var bindingsModule = __require("bindings");
+ function loadAddon() {
+   if (process.platform !== "darwin") {
+     throw new Error("whisper-coreml is only supported on macOS");
+   }
+   try {
+     return bindingsModule("whisper_asr");
+   } catch (error) {
+     const message = error instanceof Error ? error.message : String(error);
+     throw new Error(`Failed to load Whisper ASR native addon: ${message}`);
+   }
+ }
+ var addon = null;
+ var loadError = null;
+ function getAddon() {
+   if (!addon) {
+     try {
+       addon = loadAddon();
+     } catch (error) {
+       loadError = error instanceof Error ? error : new Error(String(error));
+       throw error;
+     }
+   }
+   return addon;
+ }
+ function isAvailable() {
+   return process.platform === "darwin" && process.arch === "arm64";
+ }
+ function getLoadError() {
+   return loadError;
+ }
+ var WhisperAsrEngine = class {
+   options;
+   initialized = false;
+   constructor(options) {
+     this.options = options;
+   }
+   /* v8 ignore start - native addon calls, tested via E2E */
+   /**
+    * Initialize the Whisper engine
+    * This loads the model into memory - may take a few seconds.
+    */
+   initialize() {
+     if (this.initialized) {
+       return Promise.resolve();
+     }
+     const nativeAddon = getAddon();
+     const success = nativeAddon.initialize({
+       modelPath: this.options.modelPath,
+       language: this.options.language ?? "auto",
+       translate: this.options.translate ?? false,
+       threads: this.options.threads ?? 0
+     });
+     if (!success) {
+       return Promise.reject(new Error("Failed to initialize Whisper engine"));
+     }
+     this.initialized = true;
+     return Promise.resolve();
+   }
+   /**
+    * Check if the engine is ready for transcription
+    */
+   isReady() {
+     if (!this.initialized) {
+       return false;
+     }
+     try {
+       return getAddon().isInitialized();
+     } catch {
+       return false;
+     }
+   }
+   /**
+    * Transcribe audio samples
+    *
+    * @param samples - Float32Array of audio samples (mono, 16kHz)
+    * @param sampleRate - Sample rate in Hz (default: 16000)
+    * @returns Transcription result with text and segments
+    */
+   transcribe(samples, sampleRate = 16e3) {
+     if (!this.initialized) {
+       return Promise.reject(new Error("Whisper engine not initialized. Call initialize() first."));
+     }
+     const result = getAddon().transcribe(samples, sampleRate);
+     return Promise.resolve({
+       text: result.text,
+       language: result.language,
+       durationMs: result.durationMs,
+       segments: result.segments
+     });
+   }
+   /**
+    * Clean up resources and unload the model
+    */
+   cleanup() {
+     if (this.initialized) {
+       try {
+         getAddon().cleanup();
+       } catch {
+       }
+       this.initialized = false;
+     }
+   }
+   /**
+    * Get version information
+    */
+   getVersion() {
+     return getAddon().getVersion();
+   }
+   /* v8 ignore stop */
+ };
+
+ export {
+   WHISPER_MODEL,
+   getDefaultModelDir,
+   getModelPath,
+   isModelDownloaded,
+   downloadModel,
+   formatBytes,
+   isAvailable,
+   getLoadError,
+   WhisperAsrEngine
+ };
+ //# sourceMappingURL=chunk-MOQMN4DX.js.map
@@ -0,0 +1 @@
+ {"version":3,"sources":["../src/download.ts","../src/index.ts"],"sourcesContent":["/**\n * Model download functionality for whisper-coreml\n *\n * Note: We only support large-v3-turbo as it's the only Whisper model\n * that offers better quality than Parakeet while maintaining reasonable speed.\n */\n\nimport { existsSync, mkdirSync, writeFileSync, rmSync } from \"node:fs\"\nimport { homedir } from \"node:os\"\nimport { join, dirname } from \"node:path\"\n\n/**\n * Whisper large-v3-turbo model info\n * This is the only model we support as it offers the best speed/quality ratio\n * and is the main reason to choose Whisper over Parakeet.\n */\nexport const WHISPER_MODEL = {\n name: \"large-v3-turbo\",\n size: \"1.5 GB\",\n languages: \"99 languages\",\n url: \"https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin\"\n} as const\n\n/**\n * Default model directory in user's cache\n */\nexport function getDefaultModelDir(): string {\n return join(homedir(), \".cache\", \"whisper-coreml\", \"models\")\n}\n\n/**\n * Get the path to the model\n */\nexport function getModelPath(modelDir?: string): string {\n const dir = modelDir ?? 
getDefaultModelDir()\n return join(dir, `ggml-${WHISPER_MODEL.name}.bin`)\n}\n\n/**\n * Check if the model is downloaded\n */\nexport function isModelDownloaded(modelDir?: string): boolean {\n const modelPath = getModelPath(modelDir)\n return existsSync(modelPath)\n}\n\ninterface DownloadProgress {\n downloadedBytes: number\n totalBytes: number\n percent: number\n}\n\nexport interface DownloadOptions {\n /** Target directory for model (default: ~/.cache/whisper-coreml/models) */\n modelDir?: string\n\n /** Progress callback */\n onProgress?: (progress: DownloadProgress) => void\n\n /** Force re-download even if model exists */\n force?: boolean\n}\n\n/* v8 ignore start - network I/O */\n\n/**\n * Download the Whisper large-v3-turbo model from Hugging Face\n */\nexport async function downloadModel(options: DownloadOptions = {}): Promise<string> {\n const modelDir = options.modelDir ?? getDefaultModelDir()\n const modelPath = getModelPath(modelDir)\n\n if (!options.force && existsSync(modelPath)) {\n return modelPath\n }\n\n // Clean up partial downloads\n if (existsSync(modelPath)) {\n rmSync(modelPath)\n }\n\n mkdirSync(dirname(modelPath), { recursive: true })\n\n console.log(`Downloading Whisper ${WHISPER_MODEL.name} (${WHISPER_MODEL.size})...`)\n console.log(`Source: ${WHISPER_MODEL.url}`)\n console.log(`Target: ${modelPath}`)\n\n const response = await fetch(WHISPER_MODEL.url)\n if (!response.ok) {\n throw new Error(`Failed to download model: ${response.statusText}`)\n }\n\n const contentLength = response.headers.get(\"content-length\")\n const totalBytes = contentLength ? 
parseInt(contentLength, 10) : 0\n\n const reader = response.body?.getReader()\n if (!reader) {\n throw new Error(\"Failed to get response body reader\")\n }\n\n const chunks: Uint8Array[] = []\n let downloadedBytes = 0\n\n // eslint-disable-next-line @typescript-eslint/no-unnecessary-condition\n while (true) {\n const result = await reader.read()\n if (result.done) {\n break\n }\n\n const chunk = result.value as Uint8Array\n chunks.push(chunk)\n downloadedBytes += chunk.length\n\n const percent = totalBytes > 0 ? Math.round((downloadedBytes / totalBytes) * 100) : 0\n\n if (options.onProgress) {\n options.onProgress({\n downloadedBytes,\n totalBytes,\n percent\n })\n }\n\n // Progress indicator\n process.stdout.write(\n `\\rProgress: ${String(percent)}% (${formatBytes(downloadedBytes)}/${formatBytes(totalBytes)})`\n )\n }\n\n // Combine chunks and write to file\n const buffer = Buffer.concat(chunks)\n writeFileSync(modelPath, buffer)\n\n console.log(\"\\nāœ“ Model downloaded successfully!\")\n return modelPath\n}\n\n/* v8 ignore stop */\n\n/**\n * Format bytes to human readable string\n * @internal Exported for testing\n */\nexport function formatBytes(bytes: number): string {\n if (bytes < 1024) {\n return `${String(bytes)} B`\n }\n if (bytes < 1024 * 1024) {\n return `${(bytes / 1024).toFixed(1)} KB`\n }\n if (bytes < 1024 * 1024 * 1024) {\n return `${(bytes / 1024 / 1024).toFixed(1)} MB`\n }\n return `${(bytes / 1024 / 1024 / 1024).toFixed(2)} GB`\n}\n","/**\n * whisper-coreml\n *\n * OpenAI Whisper ASR for Node.js with CoreML/ANE acceleration on Apple Silicon.\n * Based on whisper.cpp with Apple Neural Engine support.\n *\n * Uses the large-v3-turbo model exclusively, as it offers the best speed/quality\n * ratio and is the main reason to choose Whisper over Parakeet.\n */\n\n// Dynamic require for loading native addon (works in both ESM and CJS)\n// eslint-disable-next-line @typescript-eslint/no-require-imports\nconst bindingsModule = require(\"bindings\") as 
(name: string) => unknown\n\n/**\n * Native addon interface\n */\ninterface NativeAddon {\n initialize(options: {\n modelPath: string\n language?: string\n translate?: boolean\n threads?: number\n }): boolean\n isInitialized(): boolean\n transcribe(samples: Float32Array, sampleRate: number): NativeTranscriptionResult\n cleanup(): void\n getVersion(): { addon: string; whisper: string; coreml: string }\n}\n\ninterface NativeTranscriptionResult {\n text: string\n language: string\n durationMs: number\n segments: {\n startMs: number\n endMs: number\n text: string\n confidence: number\n }[]\n}\n\n/* v8 ignore start - platform checks and native addon loading */\n\n/**\n * Load the native addon\n */\nfunction loadAddon(): NativeAddon {\n if (process.platform !== \"darwin\") {\n throw new Error(\"whisper-coreml is only supported on macOS\")\n }\n\n try {\n return bindingsModule(\"whisper_asr\") as NativeAddon\n } catch (error) {\n const message = error instanceof Error ? error.message : String(error)\n throw new Error(`Failed to load Whisper ASR native addon: ${message}`)\n }\n}\n\n/* v8 ignore stop */\n\nlet addon: NativeAddon | null = null\nlet loadError: Error | null = null\n\nfunction getAddon(): NativeAddon {\n if (!addon) {\n try {\n addon = loadAddon()\n } catch (error) {\n loadError = error instanceof Error ? 
error : new Error(String(error))\n throw error\n }\n }\n return addon\n}\n\n/**\n * Check if Whisper ASR is available on this platform\n */\nexport function isAvailable(): boolean {\n return process.platform === \"darwin\" && process.arch === \"arm64\"\n}\n\n/**\n * Get the load error if the addon failed to load\n */\nexport function getLoadError(): Error | null {\n return loadError\n}\n\n/**\n * Transcription segment with timestamps\n */\nexport interface TranscriptionSegment {\n /** Start time in milliseconds */\n startMs: number\n /** End time in milliseconds */\n endMs: number\n /** Transcribed text for this segment */\n text: string\n /** Confidence score (0-1) */\n confidence: number\n}\n\n/**\n * Transcription result\n */\nexport interface TranscriptionResult {\n /** Full transcribed text */\n text: string\n /** Detected or specified language (ISO code) */\n language: string\n /** Processing time in milliseconds */\n durationMs: number\n /** Individual segments with timestamps */\n segments: TranscriptionSegment[]\n}\n\n/**\n * Whisper ASR engine options\n */\nexport interface WhisperAsrOptions {\n /** Path to the Whisper model file (ggml format) */\n modelPath: string\n /** Language code (e.g., \"en\", \"de\", \"fr\") or \"auto\" for auto-detection */\n language?: string\n /** Translate to English (default: false) */\n translate?: boolean\n /** Number of threads (0 = auto) */\n threads?: number\n}\n\n/**\n * Whisper ASR Engine with CoreML acceleration\n *\n * Uses the large-v3-turbo model for best speed/quality balance.\n *\n * @example\n * ```typescript\n * import { WhisperAsrEngine, getModelPath } from \"whisper-coreml\"\n *\n * const engine = new WhisperAsrEngine({\n * modelPath: getModelPath()\n * })\n *\n * await engine.initialize()\n * const result = await engine.transcribe(audioSamples, 16000)\n * console.log(result.text)\n * ```\n */\nexport class WhisperAsrEngine {\n private options: WhisperAsrOptions\n private initialized = false\n\n 
constructor(options: WhisperAsrOptions) {\n this.options = options\n }\n\n /* v8 ignore start - native addon calls, tested via E2E */\n\n /**\n * Initialize the Whisper engine\n * This loads the model into memory - may take a few seconds.\n */\n initialize(): Promise<void> {\n if (this.initialized) {\n return Promise.resolve()\n }\n\n const nativeAddon = getAddon()\n const success = nativeAddon.initialize({\n modelPath: this.options.modelPath,\n language: this.options.language ?? \"auto\",\n translate: this.options.translate ?? false,\n threads: this.options.threads ?? 0\n })\n\n if (!success) {\n return Promise.reject(new Error(\"Failed to initialize Whisper engine\"))\n }\n\n this.initialized = true\n return Promise.resolve()\n }\n\n /**\n * Check if the engine is ready for transcription\n */\n isReady(): boolean {\n if (!this.initialized) {\n return false\n }\n try {\n return getAddon().isInitialized()\n } catch {\n return false\n }\n }\n\n /**\n * Transcribe audio samples\n *\n * @param samples - Float32Array of audio samples (mono, 16kHz)\n * @param sampleRate - Sample rate in Hz (default: 16000)\n * @returns Transcription result with text and segments\n */\n transcribe(samples: Float32Array, sampleRate = 16000): Promise<TranscriptionResult> {\n if (!this.initialized) {\n return Promise.reject(new Error(\"Whisper engine not initialized. 
Call initialize() first.\"))\n }\n\n const result = getAddon().transcribe(samples, sampleRate)\n\n return Promise.resolve({\n text: result.text,\n language: result.language,\n durationMs: result.durationMs,\n segments: result.segments\n })\n }\n\n /**\n * Clean up resources and unload the model\n */\n cleanup(): void {\n if (this.initialized) {\n try {\n getAddon().cleanup()\n } catch {\n // Ignore cleanup errors\n }\n this.initialized = false\n }\n }\n\n /**\n * Get version information\n */\n getVersion(): { addon: string; whisper: string; coreml: string } {\n return getAddon().getVersion()\n }\n\n /* v8 ignore stop */\n}\n\n// Re-export download utilities\nexport {\n downloadModel,\n formatBytes,\n getDefaultModelDir,\n getModelPath,\n isModelDownloaded,\n WHISPER_MODEL,\n type DownloadOptions\n} from \"./download.js\"\n"],"mappings":";;;;;;;;AAOA,SAAS,YAAY,WAAW,eAAe,cAAc;AAC7D,SAAS,eAAe;AACxB,SAAS,MAAM,eAAe;AAOvB,IAAM,gBAAgB;AAAA,EAC3B,MAAM;AAAA,EACN,MAAM;AAAA,EACN,WAAW;AAAA,EACX,KAAK;AACP;AAKO,SAAS,qBAA6B;AAC3C,SAAO,KAAK,QAAQ,GAAG,UAAU,kBAAkB,QAAQ;AAC7D;AAKO,SAAS,aAAa,UAA2B;AACtD,QAAM,MAAM,YAAY,mBAAmB;AAC3C,SAAO,KAAK,KAAK,QAAQ,cAAc,IAAI,MAAM;AACnD;AAKO,SAAS,kBAAkB,UAA4B;AAC5D,QAAM,YAAY,aAAa,QAAQ;AACvC,SAAO,WAAW,SAAS;AAC7B;AAwBA,eAAsB,cAAc,UAA2B,CAAC,GAAoB;AAClF,QAAM,WAAW,QAAQ,YAAY,mBAAmB;AACxD,QAAM,YAAY,aAAa,QAAQ;AAEvC,MAAI,CAAC,QAAQ,SAAS,WAAW,SAAS,GAAG;AAC3C,WAAO;AAAA,EACT;AAGA,MAAI,WAAW,SAAS,GAAG;AACzB,WAAO,SAAS;AAAA,EAClB;AAEA,YAAU,QAAQ,SAAS,GAAG,EAAE,WAAW,KAAK,CAAC;AAEjD,UAAQ,IAAI,uBAAuB,cAAc,IAAI,KAAK,cAAc,IAAI,MAAM;AAClF,UAAQ,IAAI,WAAW,cAAc,GAAG,EAAE;AAC1C,UAAQ,IAAI,WAAW,SAAS,EAAE;AAElC,QAAM,WAAW,MAAM,MAAM,cAAc,GAAG;AAC9C,MAAI,CAAC,SAAS,IAAI;AAChB,UAAM,IAAI,MAAM,6BAA6B,SAAS,UAAU,EAAE;AAAA,EACpE;AAEA,QAAM,gBAAgB,SAAS,QAAQ,IAAI,gBAAgB;AAC3D,QAAM,aAAa,gBAAgB,SAAS,eAAe,EAAE,IAAI;AAEjE,QAAM,SAAS,SAAS,MAAM,UAAU;AACxC,MAAI,CAAC,QAAQ;AACX,UAAM,IAAI,MAAM,oCAAoC;AAAA,EACtD;AAEA,QAAM,SAAuB,CAAC;AAC9B,MAAI,kBAAkB;AAGtB,SAAO,MAAM;AACX,UAAM,SAAS,MAAM,OAAO,KAAK;AACjC,QAA
I,OAAO,MAAM;AACf;AAAA,IACF;AAEA,UAAM,QAAQ,OAAO;AACrB,WAAO,KAAK,KAAK;AACjB,uBAAmB,MAAM;AAEzB,UAAM,UAAU,aAAa,IAAI,KAAK,MAAO,kBAAkB,aAAc,GAAG,IAAI;AAEpF,QAAI,QAAQ,YAAY;AACtB,cAAQ,WAAW;AAAA,QACjB;AAAA,QACA;AAAA,QACA;AAAA,MACF,CAAC;AAAA,IACH;AAGA,YAAQ,OAAO;AAAA,MACb,eAAe,OAAO,OAAO,CAAC,MAAM,YAAY,eAAe,CAAC,IAAI,YAAY,UAAU,CAAC;AAAA,IAC7F;AAAA,EACF;AAGA,QAAM,SAAS,OAAO,OAAO,MAAM;AACnC,gBAAc,WAAW,MAAM;AAE/B,UAAQ,IAAI,yCAAoC;AAChD,SAAO;AACT;AAQO,SAAS,YAAY,OAAuB;AACjD,MAAI,QAAQ,MAAM;AAChB,WAAO,GAAG,OAAO,KAAK,CAAC;AAAA,EACzB;AACA,MAAI,QAAQ,OAAO,MAAM;AACvB,WAAO,IAAI,QAAQ,MAAM,QAAQ,CAAC,CAAC;AAAA,EACrC;AACA,MAAI,QAAQ,OAAO,OAAO,MAAM;AAC9B,WAAO,IAAI,QAAQ,OAAO,MAAM,QAAQ,CAAC,CAAC;AAAA,EAC5C;AACA,SAAO,IAAI,QAAQ,OAAO,OAAO,MAAM,QAAQ,CAAC,CAAC;AACnD;;;AC/IA,IAAM,iBAAiB,UAAQ,UAAU;AAmCzC,SAAS,YAAyB;AAChC,MAAI,QAAQ,aAAa,UAAU;AACjC,UAAM,IAAI,MAAM,2CAA2C;AAAA,EAC7D;AAEA,MAAI;AACF,WAAO,eAAe,aAAa;AAAA,EACrC,SAAS,OAAO;AACd,UAAM,UAAU,iBAAiB,QAAQ,MAAM,UAAU,OAAO,KAAK;AACrE,UAAM,IAAI,MAAM,4CAA4C,OAAO,EAAE;AAAA,EACvE;AACF;AAIA,IAAI,QAA4B;AAChC,IAAI,YAA0B;AAE9B,SAAS,WAAwB;AAC/B,MAAI,CAAC,OAAO;AACV,QAAI;AACF,cAAQ,UAAU;AAAA,IACpB,SAAS,OAAO;AACd,kBAAY,iBAAiB,QAAQ,QAAQ,IAAI,MAAM,OAAO,KAAK,CAAC;AACpE,YAAM;AAAA,IACR;AAAA,EACF;AACA,SAAO;AACT;AAKO,SAAS,cAAuB;AACrC,SAAO,QAAQ,aAAa,YAAY,QAAQ,SAAS;AAC3D;AAKO,SAAS,eAA6B;AAC3C,SAAO;AACT;AA8DO,IAAM,mBAAN,MAAuB;AAAA,EACpB;AAAA,EACA,cAAc;AAAA,EAEtB,YAAY,SAA4B;AACtC,SAAK,UAAU;AAAA,EACjB;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EAQA,aAA4B;AAC1B,QAAI,KAAK,aAAa;AACpB,aAAO,QAAQ,QAAQ;AAAA,IACzB;AAEA,UAAM,cAAc,SAAS;AAC7B,UAAM,UAAU,YAAY,WAAW;AAAA,MACrC,WAAW,KAAK,QAAQ;AAAA,MACxB,UAAU,KAAK,QAAQ,YAAY;AAAA,MACnC,WAAW,KAAK,QAAQ,aAAa;AAAA,MACrC,SAAS,KAAK,QAAQ,WAAW;AAAA,IACnC,CAAC;AAED,QAAI,CAAC,SAAS;AACZ,aAAO,QAAQ,OAAO,IAAI,MAAM,qCAAqC,CAAC;AAAA,IACxE;AAEA,SAAK,cAAc;AACnB,WAAO,QAAQ,QAAQ;AAAA,EACzB;AAAA;AAAA;AAAA;AAAA,EAKA,UAAmB;AACjB,QAAI,CAAC,KAAK,aAAa;AACrB,aAAO;AAAA,IACT;AACA,QAAI;AACF,aAAO,SAAS,EAAE,cAAc;AAAA,IAClC,QAAQ;AACN,aAAO;AAAA,IACT;AAAA,EACF;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA
,EASA,WAAW,SAAuB,aAAa,MAAqC;AAClF,QAAI,CAAC,KAAK,aAAa;AACrB,aAAO,QAAQ,OAAO,IAAI,MAAM,0DAA0D,CAAC;AAAA,IAC7F;AAEA,UAAM,SAAS,SAAS,EAAE,WAAW,SAAS,UAAU;AAExD,WAAO,QAAQ,QAAQ;AAAA,MACrB,MAAM,OAAO;AAAA,MACb,UAAU,OAAO;AAAA,MACjB,YAAY,OAAO;AAAA,MACnB,UAAU,OAAO;AAAA,IACnB,CAAC;AAAA,EACH;AAAA;AAAA;AAAA;AAAA,EAKA,UAAgB;AACd,QAAI,KAAK,aAAa;AACpB,UAAI;AACF,iBAAS,EAAE,QAAQ;AAAA,MACrB,QAAQ;AAAA,MAER;AACA,WAAK,cAAc;AAAA,IACrB;AAAA,EACF;AAAA;AAAA;AAAA;AAAA,EAKA,aAAiE;AAC/D,WAAO,SAAS,EAAE,WAAW;AAAA,EAC/B;AAAA;AAGF;","names":[]}