whisper-cpp-node 0.2.9 → 0.2.12
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +387 -353
- package/dist/index.d.ts +20 -2
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +21 -0
- package/dist/index.js.map +1 -1
- package/dist/types.d.ts +19 -1
- package/dist/types.d.ts.map +1 -1
- package/package.json +3 -3
package/README.md
CHANGED
@@ -1,353 +1,387 @@
# whisper-cpp-node

Node.js bindings for [whisper.cpp](https://github.com/ggerganov/whisper.cpp) - fast speech-to-text with GPU acceleration.

## Features

- **Fast**: Native whisper.cpp performance with GPU acceleration
- **Cross-platform**: macOS (Metal), Windows (Vulkan)
- **Core ML**: Optional Apple Neural Engine support for 3x+ speedup (macOS)
- **OpenVINO**: Optional Intel CPU/GPU encoder acceleration (Windows/Linux)
- **Streaming VAD**: Built-in Silero voice activity detection
- **TypeScript**: Full type definitions included
- **GPU Discovery**: Enumerate available GPU devices for multi-GPU selection
- **Self-contained**: No external dependencies, just install and use

## Requirements

**macOS:**
- macOS 13.3+ (Ventura or later)
- Apple Silicon (M1/M2/M3/M4)
- Node.js 18+

**Windows:**
- Windows 10/11 (x64)
- Node.js 18+
- Vulkan-capable GPU (optional, for GPU acceleration)

## Installation

```bash
npm install whisper-cpp-node
# or
pnpm add whisper-cpp-node
```

The platform-specific binary is automatically installed:
- macOS ARM64: `@whisper-cpp-node/darwin-arm64`
- Windows x64: `@whisper-cpp-node/win32-x64`

## Quick Start

### File-based transcription

```typescript
import {
  createWhisperContext,
  transcribeAsync,
} from "whisper-cpp-node";

// Create a context with your model
const ctx = createWhisperContext({
  model: "./models/ggml-base.en.bin",
  use_gpu: true,
});

// Transcribe audio file
const result = await transcribeAsync(ctx, {
  fname_inp: "./audio.wav",
  language: "en",
});

// Result: { segments: [["00:00:00,000", "00:00:02,500", " Hello world"], ...] }
for (const [start, end, text] of result.segments) {
  console.log(`[${start} --> ${end}]${text}`);
}

// Clean up
ctx.free();
```
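Because each segment tuple already carries timestamps in SRT's `HH:MM:SS,mmm` format, turning a result into subtitle text is a small mapping step. A minimal sketch; the `segmentsToSrt` helper below is illustrative, not part of the package:

```typescript
type TranscriptSegment = [string, string, string]; // [start, end, text] as found in result.segments

// Build SRT subtitle text from transcription segments.
// Cue numbers are 1-based; timestamps pass through unchanged.
function segmentsToSrt(segments: TranscriptSegment[]): string {
  return segments
    .map(([start, end, text], i) => `${i + 1}\n${start} --> ${end}\n${text.trim()}\n`)
    .join("\n");
}
```

Write the returned string to a `.srt` file next to your audio to get subtitles most players accept.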

### Buffer-based transcription

```typescript
import {
  createWhisperContext,
  transcribeAsync,
} from "whisper-cpp-node";

const ctx = createWhisperContext({
  model: "./models/ggml-base.en.bin",
  use_gpu: true,
});

// Pass raw PCM audio (16kHz, mono, float32)
const pcmData = new Float32Array(/* your audio samples */);
const result = await transcribeAsync(ctx, {
  pcmf32: pcmData,
  language: "en",
});

for (const [start, end, text] of result.segments) {
  console.log(`[${start} --> ${end}]${text}`);
}

ctx.free();
```
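Microphone and decoder libraries usually hand you 16-bit integer PCM rather than float32. Assuming the audio is already 16 kHz mono, only the sample format needs converting; this helper is an illustration, not part of the package API (resampling and channel mixing are up to you):

```typescript
// Convert 16-bit signed PCM samples (e.g. a WAV file's data chunk)
// into the Float32Array format transcribeAsync expects (-1.0 to 1.0).
function int16ToFloat32(pcm16: Int16Array): Float32Array {
  const out = new Float32Array(pcm16.length);
  for (let i = 0; i < pcm16.length; i++) {
    out[i] = pcm16[i] / 32768; // scale to [-1.0, 1.0)
  }
  return out;
}
```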

### Streaming transcription

Get real-time output as audio is processed. The `on_new_segment` callback fires for each segment as it's generated, while the final callback still receives all segments at completion (backward compatible):

```typescript
import { createWhisperContext, transcribe } from "whisper-cpp-node";

const ctx = createWhisperContext({
  model: "./models/ggml-base.en.bin",
});

transcribe(ctx, {
  fname_inp: "./long-audio.wav",
  language: "en",

  // Called for each segment as it's generated
  on_new_segment: (segment) => {
    console.log(`[${segment.start}]${segment.text}`);
  },
}, (err, result) => {
  // Final callback still receives ALL segments at completion
  console.log(`Done! ${result.segments.length} segments`);
  ctx.free();
});
```
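Since `on_new_segment` reports a `segment_index`, you can maintain your own ordered transcript while streaming and re-render it between callbacks. A small sketch (the collector is illustrative, not a package export); you would pass its `onNewSegment` method as the `on_new_segment` option:

```typescript
// Shape of the streaming callback argument (see the API section below).
interface StreamingSegment {
  start: string;
  end: string;
  text: string;
  segment_index: number;
  is_partial: boolean;
}

// Stores segments by index so the live transcript stays ordered
// regardless of when you read it back.
function makeSegmentCollector() {
  const lines: string[] = [];
  return {
    onNewSegment(seg: StreamingSegment): void {
      lines[seg.segment_index] = `[${seg.start} --> ${seg.end}]${seg.text}`;
    },
    transcript(): string {
      return lines.join("\n");
    },
  };
}
```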

## API

### `createWhisperContext(options)`

Create a persistent context for transcription.

```typescript
interface WhisperContextOptions {
  model: string;                // Path to GGML model file (required)
  use_gpu?: boolean;            // Enable GPU acceleration (default: true)
                                // Uses Metal on macOS, Vulkan on Windows
  use_coreml?: boolean;         // Enable Core ML on macOS (default: false)
  use_openvino?: boolean;       // Enable OpenVINO encoder on Intel (default: false)
  openvino_device?: string;     // OpenVINO device: 'CPU', 'GPU', 'NPU' (default: 'CPU')
  openvino_model_path?: string; // Path to OpenVINO encoder model (auto-derived)
  openvino_cache_dir?: string;  // Cache dir for compiled OpenVINO models
  flash_attn?: boolean;         // Enable Flash Attention (default: false)
  gpu_device?: number;          // GPU device index (default: 0, see getGpuDevices())
  dtw?: string;                 // DTW preset for word timestamps
  no_prints?: boolean;          // Suppress log output (default: false)
}
```

### `transcribeAsync(context, options)`

Transcribe audio (Promise-based). Accepts either a file path or PCM buffer.

```typescript
// File input
interface TranscribeOptionsFile {
  fname_inp: string; // Path to audio file
  // ... common options
}

// Buffer input
interface TranscribeOptionsBuffer {
  pcmf32: Float32Array; // Raw PCM (16kHz, mono, float32, -1.0 to 1.0)
  // ... common options
}

// Common options (partial list - see types.ts for full options)
interface TranscribeOptionsBase {
  // Language
  language?: string;         // Language code ('en', 'zh', 'auto')
  translate?: boolean;       // Translate to English
  detect_language?: boolean; // Auto-detect language

  // Threading
  n_threads?: number;    // CPU threads (default: 4)
  n_processors?: number; // Parallel processors

  // Audio processing
  offset_ms?: number;   // Start offset in ms
  duration_ms?: number; // Duration to process (0 = all)

  // Output control
  no_timestamps?: boolean;    // Disable timestamps
  max_len?: number;           // Max segment length (chars)
  max_tokens?: number;        // Max tokens per segment
  split_on_word?: boolean;    // Split on word boundaries
  token_timestamps?: boolean; // Include token-level timestamps

  // Sampling
  temperature?: number; // Sampling temperature (0.0 = greedy)
  beam_size?: number;   // Beam search size (-1 = greedy)
  best_of?: number;     // Best-of-N sampling

  // Thresholds
  entropy_thold?: number;   // Entropy threshold
  logprob_thold?: number;   // Log probability threshold
  no_speech_thold?: number; // No-speech probability threshold

  // Context
  prompt?: string;      // Initial prompt text
  no_context?: boolean; // Don't use previous context

  // VAD preprocessing
  vad?: boolean;          // Enable VAD preprocessing
  vad_model?: string;     // Path to VAD model
  vad_threshold?: number; // VAD threshold (0.0-1.0)
  vad_min_speech_duration_ms?: number;
  vad_min_silence_duration_ms?: number;
  vad_speech_pad_ms?: number;

  // Callbacks
  progress_callback?: (progress: number) => void;
  on_new_segment?: (segment: StreamingSegment) => void; // Streaming callback
}

// Streaming segment (passed to on_new_segment callback)
interface StreamingSegment {
  start: string;             // Start timestamp "HH:MM:SS,mmm"
  end: string;               // End timestamp
  text: string;              // Transcribed text
  segment_index: number;     // 0-based index
  is_partial: boolean;       // Reserved for future use
  tokens?: StreamingToken[]; // Only if token_timestamps enabled
}

// Result
interface TranscribeResult {
  segments: TranscriptSegment[];
}

// Segment is a tuple: [start, end, text]
type TranscriptSegment = [string, string, string];
// Example: ["00:00:00,000", "00:00:02,500", " Hello world"]
```
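Because segment boundaries are `"HH:MM:SS,mmm"` strings, any arithmetic on them (durations, seeking, merging) needs a parse step first. A minimal sketch; `timestampToMs` is illustrative, not a package export:

```typescript
// Parse a "HH:MM:SS,mmm" timestamp (as used in TranscriptSegment and
// StreamingSegment) into milliseconds.
function timestampToMs(ts: string): number {
  const m = /^(\d{2}):(\d{2}):(\d{2}),(\d{3})$/.exec(ts);
  if (!m) throw new Error(`bad timestamp: ${ts}`);
  const [, h, min, s, ms] = m;
  return ((Number(h) * 60 + Number(min)) * 60 + Number(s)) * 1000 + Number(ms);
}
```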

### `getGpuDevices()`

Enumerate available GPU backend devices. Returns an array of GPU/IGPU devices. Never throws — returns an empty array if no GPUs are available.

```typescript
import { getGpuDevices, createWhisperContext } from "whisper-cpp-node";

const gpus = getGpuDevices();
for (const gpu of gpus) {
  console.log(`[${gpu.index}] ${gpu.description} (${gpu.type}, ${(gpu.memory_total / 1e9).toFixed(1)} GB)`);
}
// Example output:
// [0] NVIDIA GeForce RTX 4050 Laptop GPU (gpu, 6.0 GB)
// [1] AMD Radeon 740M (igpu, 8.0 GB)

// Use a specific GPU for transcription:
const ctx = createWhisperContext({
  model: "./models/ggml-base.en.bin",
  gpu_device: gpus[0].index,
});
```

```typescript
interface GpuDevice {
  index: number;        // GPU-relative index (matches gpu_device option)
  name: string;         // Backend device name (e.g., "Vulkan0")
  description: string;  // Human-readable name (e.g., "NVIDIA GeForce RTX 4050")
  type: "gpu" | "igpu"; // Discrete or integrated GPU
  memory_free: number;  // Free memory in bytes
  memory_total: number; // Total memory in bytes
}
```
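One way to pick a device automatically is to prefer discrete GPUs and then the largest free memory. This is a sketch over the `GpuDevice` shape above (`pickGpu` is not part of the package); when it returns `undefined`, fall back to `use_gpu: false`:

```typescript
interface GpuDevice {
  index: number;
  name: string;
  description: string;
  type: "gpu" | "igpu";
  memory_free: number;
  memory_total: number;
}

// Prefer discrete GPUs over integrated; break ties by free memory.
// Pass the result of getGpuDevices(); returns undefined for an empty list.
function pickGpu(devices: GpuDevice[]): GpuDevice | undefined {
  return [...devices].sort((a, b) => {
    if (a.type !== b.type) return a.type === "gpu" ? -1 : 1;
    return b.memory_free - a.memory_free;
  })[0];
}
```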

### `createVadContext(options)`

Create a voice activity detection context for streaming audio.

```typescript
interface VadContextOptions {
  model: string;       // Path to Silero VAD model
  threshold?: number;  // Speech threshold (default: 0.5)
  n_threads?: number;  // Number of threads (default: 1)
  no_prints?: boolean; // Suppress log output
}

interface VadContext {
  getWindowSamples(): number;             // Returns 512 (32ms at 16kHz)
  getSampleRate(): number;                // Returns 16000
  process(samples: Float32Array): number; // Returns probability 0.0-1.0
  reset(): void;                          // Reset LSTM state
  free(): void;                           // Release resources
}
```

#### VAD Example

```typescript
import { createVadContext } from "whisper-cpp-node";

const vad = createVadContext({
  model: "./models/ggml-silero-v6.2.0.bin",
  threshold: 0.5,
});

const windowSize = vad.getWindowSamples(); // 512 samples

// Process audio in 32ms chunks
function processAudioChunk(samples: Float32Array) {
  const probability = vad.process(samples);
  if (probability >= 0.5) {
    console.log("Speech detected!", probability);
  }
}

// Reset when starting new audio stream
vad.reset();

// Clean up when done
vad.free();
```
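Real capture callbacks rarely deliver exact multiples of the 512-sample window, so buffering the remainder between calls is needed before calling `vad.process()`. A sketch (the `makeWindower` helper is illustrative, not part of the package); create one with `vad.getWindowSamples()` and call `vad.process` inside the callback:

```typescript
// Re-chunk arbitrarily sized audio chunks into fixed-size windows,
// carrying any leftover samples over to the next call.
function makeWindower(windowSamples: number, onWindow: (w: Float32Array) => void) {
  let pending = new Float32Array(0);
  return (chunk: Float32Array): void => {
    const merged = new Float32Array(pending.length + chunk.length);
    merged.set(pending);
    merged.set(chunk, pending.length);
    let offset = 0;
    while (merged.length - offset >= windowSamples) {
      onWindow(merged.subarray(offset, offset + windowSamples));
      offset += windowSamples;
    }
    pending = merged.slice(offset); // remainder waits for the next chunk
  };
}
```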

## Core ML Acceleration (macOS)

For 3x+ faster encoding on Apple Silicon:

1. Generate a Core ML model:
```bash
pip install ane_transformers openai-whisper coremltools
./models/generate-coreml-model.sh base.en
```

2. Place it next to your GGML model:
```
models/ggml-base.en.bin
models/ggml-base.en-encoder.mlmodelc/
```

3. Enable Core ML:
```typescript
const ctx = createWhisperContext({
  model: "./models/ggml-base.en.bin",
  use_coreml: true,
});
```

## OpenVINO Acceleration (Intel)

For faster encoder inference on Intel CPUs and GPUs (requires build with OpenVINO support):

1. Install OpenVINO and convert the model:
```bash
pip install openvino openvino-dev
python models/convert-whisper-to-openvino.py --model base.en
```

2. The OpenVINO model files are placed next to your GGML model:
```
models/ggml-base.en.bin
models/ggml-base.en-encoder-openvino.xml
models/ggml-base.en-encoder-openvino.bin
```

3. Enable OpenVINO:
```typescript
const ctx = createWhisperContext({
  model: "./models/ggml-base.en.bin",
  use_openvino: true,
  openvino_device: "CPU", // or "GPU" for Intel iGPU
  openvino_cache_dir: "./openvino_cache", // optional, speeds up init
});
```

**Note:** OpenVINO support requires the addon to be built with `-DADDON_OPENVINO=ON`.

## Models

Download models from [Hugging Face](https://huggingface.co/ggerganov/whisper.cpp):

```bash
# Base English model (~150MB)
curl -L -o models/ggml-base.en.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin

# Large v3 Turbo quantized (~500MB)
curl -L -o models/ggml-large-v3-turbo-q4_0.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo-q4_0.bin

# Silero VAD model (for streaming VAD)
curl -L -o models/ggml-silero-v6.2.0.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-silero-v6.2.0.bin
```

## License

MIT
package/dist/index.d.ts
CHANGED
@@ -1,5 +1,5 @@
-import type { WhisperContext, WhisperContextOptions, VadContext, VadContextOptions, TranscribeOptions, TranscribeResult } from "./types";
-export type { WhisperContextOptions, VadContextOptions, TranscribeOptions, TranscribeOptionsBase, TranscribeOptionsFile, TranscribeOptionsBuffer, TranscribeResult, TranscriptSegment, StreamingSegment, StreamingToken, WhisperContext, VadContext, WhisperContextConstructor, VadContextConstructor, } from "./types";
+import type { WhisperContext, WhisperContextOptions, VadContext, VadContextOptions, TranscribeOptions, TranscribeResult, GpuDevice } from "./types";
+export type { WhisperContextOptions, VadContextOptions, TranscribeOptions, TranscribeOptionsBase, TranscribeOptionsFile, TranscribeOptionsBuffer, TranscribeResult, TranscriptSegment, StreamingSegment, StreamingToken, WhisperContext, VadContext, WhisperContextConstructor, VadContextConstructor, GpuDevice, } from "./types";
 export declare const WhisperContextClass: import("./types").WhisperContextConstructor;
 export declare const VadContextClass: import("./types").VadContextConstructor;
 export declare const transcribe: (context: WhisperContext, options: TranscribeOptions, callback: import("./types").TranscribeCallback) => void;
@@ -42,6 +42,23 @@ export declare function createWhisperContext(options: WhisperContextOptions): Wh
  * ```
  */
 export declare function createVadContext(options: VadContextOptions): VadContext;
+/**
+ * Enumerate available GPU backend devices.
+ * Returns an array of GPU/IGPU devices with their properties.
+ * The `index` field matches what `gpu_device` option expects in WhisperContextOptions.
+ * Never throws — returns an empty array if no GPUs are available or on any error.
+ *
+ * @example
+ * ```typescript
+ * const gpus = getGpuDevices();
+ * for (const gpu of gpus) {
+ *   console.log(`[${gpu.index}] ${gpu.description} (${gpu.type}, ${(gpu.memory_total / 1e9).toFixed(1)} GB)`);
+ * }
+ * // Use a specific GPU:
+ * const ctx = createWhisperContext({ model: '...', gpu_device: gpus[1].index });
+ * ```
+ */
+export declare function getGpuDevices(): GpuDevice[];
 declare const _default: {
     WhisperContext: import("./types").WhisperContextConstructor;
     VadContext: import("./types").VadContextConstructor;
@@ -49,6 +66,7 @@ declare const _default: {
     transcribeAsync: (context: WhisperContext, options: TranscribeOptions) => Promise<TranscribeResult>;
     createWhisperContext: typeof createWhisperContext;
     createVadContext: typeof createVadContext;
+    getGpuDevices: typeof getGpuDevices;
 };
 export default _default;
 //# sourceMappingURL=index.d.ts.map
package/dist/index.d.ts.map
CHANGED
@@ -1 +1 @@
{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAEA,OAAO,KAAK,EAEV,cAAc,EACd,qBAAqB,EACrB,UAAU,EACV,iBAAiB,EACjB,iBAAiB,EACjB,gBAAgB,
{"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":"AAEA,OAAO,KAAK,EAEV,cAAc,EACd,qBAAqB,EACrB,UAAU,EACV,iBAAiB,EACjB,iBAAiB,EACjB,gBAAgB,EAChB,SAAS,EACV,MAAM,SAAS,CAAC;AAGjB,YAAY,EACV,qBAAqB,EACrB,iBAAiB,EACjB,iBAAiB,EACjB,qBAAqB,EACrB,qBAAqB,EACrB,uBAAuB,EACvB,gBAAgB,EAChB,iBAAiB,EACjB,gBAAgB,EAChB,cAAc,EACd,cAAc,EACd,UAAU,EACV,yBAAyB,EACzB,qBAAqB,EACrB,SAAS,GACV,MAAM,SAAS,CAAC;AAMjB,eAAO,MAAM,mBAAmB,6CAAuB,CAAC;AACxD,eAAO,MAAM,eAAe,yCAAmB,CAAC;AAGhD,eAAO,MAAM,UAAU,+GAAmB,CAAC;AAG3C,eAAO,MAAM,eAAe,EAAkC,CAC5D,OAAO,EAAE,cAAc,EACvB,OAAO,EAAE,iBAAiB,KACvB,OAAO,CAAC,gBAAgB,CAAC,CAAC;AAE/B;;;;;;;;;;;;;;;;;;;GAmBG;AACH,wBAAgB,oBAAoB,CAClC,OAAO,EAAE,qBAAqB,GAC7B,cAAc,CAEhB;AAED;;;;;;;;;;;;;;;GAeG;AACH,wBAAgB,gBAAgB,CAAC,OAAO,EAAE,iBAAiB,GAAG,UAAU,CAEvE;AAED;;;;;;;;;;;;;;;GAeG;AACH,wBAAgB,aAAa,IAAI,SAAS,EAAE,CAE3C;;;;;+BApEU,cAAc,WACd,iBAAiB,KACvB,OAAO,CAAC,gBAAgB,CAAC;;;;;AAqE9B,wBAQE"}
package/dist/index.js
CHANGED
@@ -3,6 +3,7 @@ Object.defineProperty(exports, "__esModule", { value: true });
 exports.transcribeAsync = exports.transcribe = exports.VadContextClass = exports.WhisperContextClass = void 0;
 exports.createWhisperContext = createWhisperContext;
 exports.createVadContext = createVadContext;
+exports.getGpuDevices = getGpuDevices;
 const util_1 = require("util");
 const loader_1 = require("./loader");
 // Load native addon
@@ -56,6 +57,25 @@ function createWhisperContext(options) {
 function createVadContext(options) {
     return new addon.VadContext(options);
 }
+/**
+ * Enumerate available GPU backend devices.
+ * Returns an array of GPU/IGPU devices with their properties.
+ * The `index` field matches what `gpu_device` option expects in WhisperContextOptions.
+ * Never throws — returns an empty array if no GPUs are available or on any error.
+ *
+ * @example
+ * ```typescript
+ * const gpus = getGpuDevices();
+ * for (const gpu of gpus) {
+ *   console.log(`[${gpu.index}] ${gpu.description} (${gpu.type}, ${(gpu.memory_total / 1e9).toFixed(1)} GB)`);
+ * }
+ * // Use a specific GPU:
+ * const ctx = createWhisperContext({ model: '...', gpu_device: gpus[1].index });
+ * ```
+ */
+function getGpuDevices() {
+    return addon.getGpuDevices();
+}
 // Default export with all functionality
 exports.default = {
     WhisperContext: addon.WhisperContext,
@@ -64,5 +84,6 @@ exports.default = {
     transcribeAsync: exports.transcribeAsync,
     createWhisperContext,
     createVadContext,
+    getGpuDevices,
 };
 //# sourceMappingURL=index.js.map
package/dist/index.js.map
CHANGED
@@ -1 +1 @@
{"version":3,"file":"index.js","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";;;
{"version":3,"file":"index.js","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";;;AAoEA,oDAIC;AAkBD,4CAEC;AAkBD,sCAEC;AAhHD,+BAAiC;AACjC,qCAA2C;AA+B3C,oBAAoB;AACpB,MAAM,KAAK,GAAiB,IAAA,wBAAe,GAAE,CAAC;AAE9C,oEAAoE;AACvD,QAAA,mBAAmB,GAAG,KAAK,CAAC,cAAc,CAAC;AAC3C,QAAA,eAAe,GAAG,KAAK,CAAC,UAAU,CAAC;AAEhD,qCAAqC;AACxB,QAAA,UAAU,GAAG,KAAK,CAAC,UAAU,CAAC;AAE3C,sCAAsC;AACzB,QAAA,eAAe,GAAG,IAAA,gBAAS,EAAC,KAAK,CAAC,UAAU,CAG3B,CAAC;AAE/B;;;;;;;;;;;;;;;;;;;GAmBG;AACH,SAAgB,oBAAoB,CAClC,OAA8B;IAE9B,OAAO,IAAI,KAAK,CAAC,cAAc,CAAC,OAAO,CAAC,CAAC;AAC3C,CAAC;AAED;;;;;;;;;;;;;;;GAeG;AACH,SAAgB,gBAAgB,CAAC,OAA0B;IACzD,OAAO,IAAI,KAAK,CAAC,UAAU,CAAC,OAAO,CAAC,CAAC;AACvC,CAAC;AAED;;;;;;;;;;;;;;;GAeG;AACH,SAAgB,aAAa;IAC3B,OAAO,KAAK,CAAC,aAAa,EAAE,CAAC;AAC/B,CAAC;AAED,wCAAwC;AACxC,kBAAe;IACb,cAAc,EAAE,KAAK,CAAC,cAAc;IACpC,UAAU,EAAE,KAAK,CAAC,UAAU;IAC5B,UAAU,EAAE,KAAK,CAAC,UAAU;IAC5B,eAAe,EAAf,uBAAe;IACf,oBAAoB;IACpB,gBAAgB;IAChB,aAAa;CACd,CAAC"}
package/dist/types.d.ts
CHANGED
@@ -8,7 +8,7 @@ export interface WhisperContextOptions {
     use_gpu?: boolean;
     /** Enable Flash Attention (default: false) */
     flash_attn?: boolean;
-    /** GPU device index (default: 0) */
+    /** GPU device index to use (default: 0). Use getGpuDevices() to list available devices. */
     gpu_device?: number;
     /** Enable Core ML acceleration on macOS (default: false) */
     use_coreml?: boolean;
@@ -235,6 +235,23 @@ export interface TranscribeResult {
     /** Detected language (when detect_language is true) */
     language?: string;
 }
+/**
+ * GPU device information returned by getGpuDevices()
+ */
+export interface GpuDevice {
+    /** GPU-relative index (0, 1, 2...) — matches what gpu_device option expects */
+    index: number;
+    /** Backend device name (e.g., "Vulkan0") */
+    name: string;
+    /** Human-readable device description (e.g., "NVIDIA GeForce RTX 4050 Laptop GPU") */
+    description: string;
+    /** Device type: "gpu" for discrete, "igpu" for integrated */
+    type: "gpu" | "igpu";
+    /** Free device memory in bytes */
+    memory_free: number;
+    /** Total device memory in bytes */
+    memory_total: number;
+}
 /**
  * Options for creating a VadContext
  */
@@ -298,5 +315,6 @@ export interface WhisperAddon {
     VadContext: VadContextConstructor;
     transcribe: (context: WhisperContext, options: TranscribeOptions, callback: TranscribeCallback) => void;
     whisper: Record<string, unknown>;
+    getGpuDevices: () => GpuDevice[];
 }
 //# sourceMappingURL=types.d.ts.map
package/dist/types.d.ts.map
CHANGED
@@ -1 +1 @@
{"version":3,"file":"types.d.ts","sourceRoot":"","sources":["../src/types.ts"],"names":[],"mappings":"AAAA;;GAEG;AACH,MAAM,WAAW,qBAAqB;IACpC,kCAAkC;IAClC,KAAK,EAAE,MAAM,CAAC;IACd,8CAA8C;IAC9C,OAAO,CAAC,EAAE,OAAO,CAAC;IAClB,8CAA8C;IAC9C,UAAU,CAAC,EAAE,OAAO,CAAC;IACrB,
{"version":3,"file":"types.d.ts","sourceRoot":"","sources":["../src/types.ts"],"names":[],"mappings":"AAAA;;GAEG;AACH,MAAM,WAAW,qBAAqB;IACpC,kCAAkC;IAClC,KAAK,EAAE,MAAM,CAAC;IACd,8CAA8C;IAC9C,OAAO,CAAC,EAAE,OAAO,CAAC;IAClB,8CAA8C;IAC9C,UAAU,CAAC,EAAE,OAAO,CAAC;IACrB,2FAA2F;IAC3F,UAAU,CAAC,EAAE,MAAM,CAAC;IACpB,4DAA4D;IAC5D,UAAU,CAAC,EAAE,OAAO,CAAC;IACrB;;;;;OAKG;IACH,YAAY,CAAC,EAAE,OAAO,CAAC;IACvB;;;OAGG;IACH,mBAAmB,CAAC,EAAE,MAAM,CAAC;IAC7B;;;OAGG;IACH,eAAe,CAAC,EAAE,MAAM,CAAC;IACzB;;;OAGG;IACH,kBAAkB,CAAC,EAAE,MAAM,CAAC;IAC5B;;;;;;;;;;;;;;OAcG;IACH,GAAG,CAAC,EAAE,MAAM,CAAC;IACb;;;OAGG;IACH,cAAc,CAAC,EAAE,MAAM,CAAC;IACxB,uDAAuD;IACvD,SAAS,CAAC,EAAE,OAAO,CAAC;CACrB;AAED;;GAEG;AACH,MAAM,WAAW,qBAAqB;IAEpC,+CAA+C;IAC/C,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,2BAA2B;IAC3B,SAAS,CAAC,EAAE,OAAO,CAAC;IACpB,oCAAoC;IACpC,eAAe,CAAC,EAAE,OAAO,CAAC;IAG1B,+BAA+B;IAC/B,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,mDAAmD;IACnD,YAAY,CAAC,EAAE,MAAM,CAAC;IAGtB,mCAAmC;IACnC,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,oDAAoD;IACpD,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,yBAAyB;IACzB,SAAS,CAAC,EAAE,MAAM,CAAC;IAGnB,mCAAmC;IACnC,aAAa,CAAC,EAAE,OAAO,CAAC;IACxB,0BAA0B;IAC1B,cAAc,CAAC,EAAE,OAAO,CAAC;IACzB,0DAA0D;IAC1D,OAAO,CAAC,EAAE,MAAM,CAAC;IACjB,gDAAgD;IAChD,UAAU,CAAC,EAAE,MAAM,CAAC;IACpB,0CAA0C;IAC1C,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,wCAAwC;IACxC,aAAa,CAAC,EAAE,OAAO,CAAC;IACxB,qCAAqC;IACrC,gBAAgB,CAAC,EAAE,OAAO,CAAC;IAC3B,+BAA+B;IAC/B,UAAU,CAAC,EAAE,MAAM,CAAC;IACpB,oDAAoD;IACpD,aAAa,CAAC,EAAE,OAAO,CAAC;IAGxB,8CAA8C;IAC9C,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,yCAAyC;IACzC,eAAe,CAAC,EAAE,MAAM,CAAC;IACzB,oCAAoC;IACpC,OAAO,CAAC,EAAE,MAAM,CAAC;IACjB,8CAA8C;IAC9C,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,mCAAmC;IACnC,WAAW,CAAC,EAAE,OAAO,CAAC;IAGtB,qCAAqC;IACrC,aAAa,CAAC,EAAE,MAAM,CAAC;IACvB,gCAAgC;IAChC,aAAa,CAAC,EAAE,MAAM,CAAC;IACvB,sCAAsC;IACtC,eAAe,CAAC,EAAE,MAAM,CAAC;IAGzB,sCAAsC;IACtC,MAAM,CAAC,EAAE,MAAM,CAAC;IAChB,iCAAiC;IACjC,UAAU,CAAC,EAAE,OAAO,CAAC;IACrB,6BAA6B;IAC7B,cAAc,CAAC,EAAE,OAAO,CAAC;IACzB,iCAAiC;IACjC,YAAY,CAAC,EAAE,OAAO,CAAC;IAGvB,iCAAiC;IAC
jC,OAAO,CAAC,EAAE,OAAO,CAAC;IAClB,oDAAoD;IACpD,WAAW,CAAC,EAAE,OAAO,CAAC;IAGtB,2BAA2B;IAC3B,aAAa,CAAC,EAAE,OAAO,CAAC;IACxB,qBAAqB;IACrB,cAAc,CAAC,EAAE,OAAO,CAAC;IACzB,4BAA4B;IAC5B,cAAc,CAAC,EAAE,OAAO,CAAC;IACzB,uBAAuB;IACvB,gBAAgB,CAAC,EAAE,OAAO,CAAC;IAG3B,+BAA+B;IAC/B,GAAG,CAAC,EAAE,OAAO,CAAC;IACd,wBAAwB;IACxB,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,+CAA+C;IAC/C,aAAa,CAAC,EAAE,MAAM,CAAC;IACvB,8CAA8C;IAC9C,0BAA0B,CAAC,EAAE,MAAM,CAAC;IACpC,+CAA+C;IAC/C,2BAA2B,CAAC,EAAE,MAAM,CAAC;IACrC,yCAAyC;IACzC,yBAAyB,CAAC,EAAE,MAAM,CAAC;IACnC,qCAAqC;IACrC,iBAAiB,CAAC,EAAE,MAAM,CAAC;IAC3B,gCAAgC;IAChC,mBAAmB,CAAC,EAAE,MAAM,CAAC;IAG7B,mDAAmD;IACnD,iBAAiB,CAAC,EAAE,CAAC,QAAQ,EAAE,MAAM,KAAK,IAAI,CAAC;IAE/C;;;;OAIG;IACH,cAAc,CAAC,EAAE,CAAC,OAAO,EAAE,gBAAgB,KAAK,IAAI,CAAC;CACtD;AAED;;GAEG;AACH,MAAM,WAAW,qBAAsB,SAAQ,qBAAqB;IAClE,6BAA6B;IAC7B,SAAS,EAAE,MAAM,CAAC;IAClB,MAAM,CAAC,EAAE,KAAK,CAAC;CAChB;AAED;;GAEG;AACH,MAAM,WAAW,uBAAwB,SAAQ,qBAAqB;IACpE,uEAAuE;IACvE,MAAM,EAAE,YAAY,CAAC;IACrB,SAAS,CAAC,EAAE,KAAK,CAAC;CACnB;AAED;;GAEG;AACH,MAAM,MAAM,iBAAiB,GAAG,qBAAqB,GAAG,uBAAuB,CAAC;AAEhF;;GAEG;AACH,MAAM,WAAW,iBAAiB;IAChC,0CAA0C;IAC1C,KAAK,EAAE,MAAM,CAAC;IACd,wCAAwC;IACxC,GAAG,EAAE,MAAM,CAAC;IACZ,uBAAuB;IACvB,IAAI,EAAE,MAAM,CAAC;IACb,uEAAuE;IACvE,MAAM,CAAC,EAAE,cAAc,EAAE,CAAC;CAC3B;AAED;;GAEG;AACH,MAAM,WAAW,cAAc;IAC7B,iBAAiB;IACjB,IAAI,EAAE,MAAM,CAAC;IACb,qCAAqC;IACrC,WAAW,EAAE,MAAM,CAAC;IACpB,+DAA+D;IAC/D,EAAE,EAAE,MAAM,CAAC;IACX,6DAA6D;IAC7D,EAAE,EAAE,MAAM,CAAC;IACX;;;;OAIG;IACH,KAAK,EAAE,MAAM,CAAC;CACf;AAED;;GAEG;AACH,MAAM,WAAW,gBAAgB;IAC/B,0CAA0C;IAC1C,KAAK,EAAE,MAAM,CAAC;IACd,wCAAwC;IACxC,GAAG,EAAE,MAAM,CAAC;IACZ,wCAAwC;IACxC,IAAI,EAAE,MAAM,CAAC;IACb,8BAA8B;IAC9B,aAAa,EAAE,MAAM,CAAC;IACtB,yDAAyD;IACzD,UAAU,EAAE,OAAO,CAAC;IACpB,iEAAiE;IACjE,MAAM,CAAC,EAAE,cAAc,EAAE,CAAC;CAC3B;AAED;;GAEG;AACH,MAAM,WAAW,gBAAgB;IAC/B,mCAAmC;IACnC,QAAQ,EAAE,iBAAiB,EAAE,CAAC;IAC9B,uDAAuD;IACvD,QAAQ,CAAC,EAAE,MAAM,CAAC;CACnB;AAED;;GAEG;AACH,MAAM,WAAW,SAAS;IACxB,+EAA+E;IAC/E,KAAK,EAAE,MAAM,CAAC;IACd,4CAA4C;IAC5C,IAAI,EAAE
,MAAM,CAAC;IACb,qFAAqF;IACrF,WAAW,EAAE,MAAM,CAAC;IACpB,6DAA6D;IAC7D,IAAI,EAAE,KAAK,GAAG,MAAM,CAAC;IACrB,kCAAkC;IAClC,WAAW,EAAE,MAAM,CAAC;IACpB,mCAAmC;IACnC,YAAY,EAAE,MAAM,CAAC;CACtB;AAED;;GAEG;AACH,MAAM,WAAW,iBAAiB;IAChC,wCAAwC;IACxC,KAAK,EAAE,MAAM,CAAC;IACd,gDAAgD;IAChD,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,qCAAqC;IACrC,SAAS,CAAC,EAAE,MAAM,CAAC;IACnB,oCAAoC;IACpC,SAAS,CAAC,EAAE,OAAO,CAAC;CACrB;AAED;;GAEG;AACH,MAAM,WAAW,cAAc;IAC7B,yCAAyC;IACzC,aAAa,IAAI,MAAM,CAAC;IACxB,qCAAqC;IACrC,cAAc,IAAI,OAAO,CAAC;IAC1B,6CAA6C;IAC7C,IAAI,IAAI,IAAI,CAAC;CACd;AAED;;GAEG;AACH,MAAM,WAAW,yBAAyB;IACxC,KAAK,OAAO,EAAE,qBAAqB,GAAG,cAAc,CAAC;CACtD;AAED;;GAEG;AACH,MAAM,WAAW,UAAU;IACzB,8CAA8C;IAC9C,gBAAgB,IAAI,MAAM,CAAC;IAC3B,8CAA8C;IAC9C,aAAa,IAAI,MAAM,CAAC;IACxB,iEAAiE;IACjE,OAAO,CAAC,OAAO,EAAE,YAAY,GAAG,MAAM,CAAC;IACvC,oCAAoC;IACpC,KAAK,IAAI,IAAI,CAAC;IACd,6CAA6C;IAC7C,IAAI,IAAI,IAAI,CAAC;CACd;AAED;;GAEG;AACH,MAAM,WAAW,qBAAqB;IACpC,KAAK,OAAO,EAAE,iBAAiB,GAAG,UAAU,CAAC;CAC9C;AAED;;GAEG;AACH,MAAM,MAAM,kBAAkB,GAAG,CAC/B,KAAK,EAAE,KAAK,GAAG,IAAI,EACnB,MAAM,CAAC,EAAE,gBAAgB,KACtB,IAAI,CAAC;AAEV;;GAEG;AACH,MAAM,WAAW,YAAY;IAC3B,cAAc,EAAE,yBAAyB,CAAC;IAC1C,UAAU,EAAE,qBAAqB,CAAC;IAClC,UAAU,EAAE,CACV,OAAO,EAAE,cAAc,EACvB,OAAO,EAAE,iBAAiB,EAC1B,QAAQ,EAAE,kBAAkB,KACzB,IAAI,CAAC;IACV,OAAO,EAAE,MAAM,CAAC,MAAM,EAAE,OAAO,CAAC,CAAC;IACjC,aAAa,EAAE,MAAM,SAAS,EAAE,CAAC;CAClC"}
package/package.json
CHANGED

@@ -1,6 +1,6 @@
  {
    "name": "whisper-cpp-node",
-   "version": "0.2.9",
+   "version": "0.2.12",
    "description": "Node.js bindings for whisper.cpp - fast speech-to-text with GPU acceleration",
    "license": "MIT",
    "repository": {
@@ -21,9 +21,9 @@
      "dist"
    ],
    "optionalDependencies": {
+     "@whisper-cpp-node/darwin-arm64": "0.2.12",
      "@whisper-cpp-node/win32-ia32": "0.2.7",
-     "@whisper-cpp-node/
-     "@whisper-cpp-node/win32-x64": "0.2.9"
+     "@whisper-cpp-node/win32-x64": "0.2.11"
    },
    "devDependencies": {
      "@types/node": "^20.0.0",