audio 2.0.0-1 → 2.1.0
- package/README.md +660 -44
- package/audio.d.ts +244 -0
- package/audio.js +54 -0
- package/bin/cli.js +982 -0
- package/cache.js +89 -0
- package/core.js +630 -0
- package/dist/audio.all.js +105450 -0
- package/dist/audio.js +3571 -0
- package/dist/audio.min.js +14 -0
- package/fn/cepstrum.js +55 -0
- package/fn/clip.js +11 -0
- package/fn/crop.js +19 -0
- package/fn/fade.js +47 -0
- package/fn/filter.js +67 -0
- package/fn/gain.js +16 -0
- package/fn/insert.js +31 -0
- package/fn/loudness.js +102 -0
- package/fn/mix.js +21 -0
- package/fn/normalize.js +80 -0
- package/fn/pad.js +19 -0
- package/fn/pan.js +35 -0
- package/fn/play.js +123 -0
- package/fn/remix.js +25 -0
- package/fn/remove.js +25 -0
- package/fn/repeat.js +35 -0
- package/fn/reverse.js +25 -0
- package/fn/save.js +55 -0
- package/fn/silence.js +34 -0
- package/fn/spectrum.js +104 -0
- package/fn/speed.js +23 -0
- package/fn/split.js +9 -0
- package/fn/stat.js +77 -0
- package/fn/transform.js +6 -0
- package/fn/trim.js +68 -0
- package/fn/write.js +15 -0
- package/{LICENSE → license.md} +21 -21
- package/package.json +60 -19
- package/plan.js +456 -0
- package/stats.js +255 -0
- package/index.js +0 -107
package/README.md
CHANGED
|
@@ -1,44 +1,660 @@
|
|
|
1
|
+
# <img src="logo.svg" width="20" height="20" alt="audio"> audio [test](https://github.com/audiojs/audio/actions/workflows/test.yml) [npm](https://npmjs.org/package/audio)
|
|
2
|
+
|
|
3
|
+
_Audio in JavaScript_
|
|
4
|
+
|
|
5
|
+
```js
|
|
6
|
+
audio('raw-take.wav')
|
|
7
|
+
.trim(-30)
|
|
8
|
+
.normalize('podcast')
|
|
9
|
+
.fade(0.3, 0.5)
|
|
10
|
+
.save('clean.mp3')
|
|
11
|
+
```
|
|
12
|
+
|
|
13
|
+
<!-- <img src="preview.svg?v=1" alt="Audiojs demo" width="540"> -->
|
|
14
|
+
|
|
15
|
+
* **Any Format** — fast wasm codecs, no ffmpeg.
|
|
16
|
+
* **Streaming** — playback during decode.
|
|
17
|
+
* **Immutable** — safe edits, infinite undo/redo.
|
|
18
|
+
* **Page Cache** — open 10 GB+ files.
|
|
19
|
+
* **Analysis** — loudness, spectrum, and more.
|
|
20
|
+
* **Modular** – pluggable ops, tree-shakable.
|
|
21
|
+
* **CLI** — playback, unix pipes, tab completion.
|
|
22
|
+
* **Isomorphic** — node / browser.
|
|
23
|
+
* **Audio-first** – dB, Hz, LUFS, not bytes and indices.
|
|
24
|
+
|
|
25
|
+
<!--
|
|
26
|
+
* [Architecture](docs/architecture.md) – stream-first design, pages & blocks, non-destructive editing, plan compilation
|
|
27
|
+
* [Plugins](docs/plugins.md) – custom ops, stats, descriptors (process, plan, resolve, call), persistent ctx
|
|
28
|
+
-->
|
|
29
|
+
|
|
30
|
+
<div align="center">
|
|
31
|
+
|
|
32
|
+
#### [Quick Start](#quick-start) [Recipes](#recipes) [API](#api) [CLI](#cli) [Plugins](docs/plugins.md) [Architecture](docs/architecture.md) [FAQ](#faq) [Ecosystem](#ecosystem)
|
|
33
|
+
|
|
34
|
+
</div>
|
|
35
|
+
|
|
36
|
+
## Quick Start
|
|
37
|
+
|
|
38
|
+
### Node
|
|
39
|
+
|
|
40
|
+
`npm i audio`
|
|
41
|
+
|
|
42
|
+
```js
|
|
43
|
+
import audio from 'audio'
|
|
44
|
+
let a = audio('voice.mp3')
|
|
45
|
+
a.trim().normalize('podcast').fade(0.3, 0.5)
|
|
46
|
+
await a.save('clean.mp3')
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
### Browser
|
|
50
|
+
|
|
51
|
+
```html
|
|
52
|
+
<script type="module">
|
|
53
|
+
import audio from './dist/audio.min.js'
|
|
54
|
+
let a = audio('./song.mp3')
|
|
55
|
+
a.trim().normalize().fade(0.5, 2)
|
|
56
|
+
a.clip({ at: 60, duration: 30 }).play() // play the chorus
|
|
57
|
+
</script>
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
Codecs load on demand via `import()` — map them with an import map or your bundler.
|
|
61
|
+
<details>
|
|
62
|
+
<summary><strong>Import map example</strong></summary>
|
|
63
|
+
|
|
64
|
+
|
|
65
|
+
```html
|
|
66
|
+
<script type="importmap">
|
|
67
|
+
{
|
|
68
|
+
"imports": {
|
|
69
|
+
"@audio/decode-wav": "https://esm.sh/@audio/decode-wav",
|
|
70
|
+
"@audio/decode-aac": "https://esm.sh/@audio/decode-aac",
|
|
71
|
+
"@audio/decode-aiff": "https://esm.sh/@audio/decode-aiff",
|
|
72
|
+
"@audio/decode-caf": "https://esm.sh/@audio/decode-caf",
|
|
73
|
+
"@audio/decode-webm": "https://esm.sh/@audio/decode-webm",
|
|
74
|
+
"@audio/decode-amr": "https://esm.sh/@audio/decode-amr",
|
|
75
|
+
"@audio/decode-wma": "https://esm.sh/@audio/decode-wma",
|
|
76
|
+
"mpg123-decoder": "https://esm.sh/mpg123-decoder",
|
|
77
|
+
"@wasm-audio-decoders/flac": "https://esm.sh/@wasm-audio-decoders/flac",
|
|
78
|
+
"ogg-opus-decoder": "https://esm.sh/ogg-opus-decoder",
|
|
79
|
+
"@wasm-audio-decoders/ogg-vorbis": "https://esm.sh/@wasm-audio-decoders/ogg-vorbis",
|
|
80
|
+
"qoa-format": "https://esm.sh/qoa-format",
|
|
81
|
+
"@audio/encode-wav": "https://esm.sh/@audio/encode-wav",
|
|
82
|
+
"@audio/encode-mp3": "https://esm.sh/@audio/encode-mp3",
|
|
83
|
+
"@audio/encode-flac": "https://esm.sh/@audio/encode-flac",
|
|
84
|
+
"@audio/encode-opus": "https://esm.sh/@audio/encode-opus",
|
|
85
|
+
"@audio/encode-ogg": "https://esm.sh/@audio/encode-ogg",
|
|
86
|
+
"@audio/encode-aiff": "https://esm.sh/@audio/encode-aiff"
|
|
87
|
+
}
|
|
88
|
+
}
|
|
89
|
+
</script>
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
</details>
|
|
93
|
+
|
|
94
|
+
### CLI
|
|
95
|
+
|
|
96
|
+
```sh
|
|
97
|
+
npm i -g audio
|
|
98
|
+
audio voice.wav trim normalize podcast fade 0.3s -0.5s -o clean.mp3
|
|
99
|
+
```
|
|
100
|
+
|
|
101
|
+
## Recipes
|
|
102
|
+
|
|
103
|
+
### Clean up a recording
|
|
104
|
+
|
|
105
|
+
```js
|
|
106
|
+
let a = audio('raw-take.wav')
|
|
107
|
+
a.trim(-30).normalize('podcast').fade(0.3, 0.5)
|
|
108
|
+
await a.save('clean.wav')
|
|
109
|
+
```
|
|
110
|
+
|
|
111
|
+
### Podcast montage
|
|
112
|
+
|
|
113
|
+
```js
|
|
114
|
+
let intro = audio('intro.mp3')
|
|
115
|
+
let body = audio('interview.wav')
|
|
116
|
+
let outro = audio('outro.mp3')
|
|
117
|
+
|
|
118
|
+
body.trim().normalize('podcast')
|
|
119
|
+
let ep = audio([intro, body, outro])
|
|
120
|
+
ep.fade(0.5, 2)
|
|
121
|
+
await ep.save('episode.mp3')
|
|
122
|
+
```
|
|
123
|
+
|
|
124
|
+
### Render a waveform
|
|
125
|
+
|
|
126
|
+
```js
|
|
127
|
+
let a = audio('track.mp3')
|
|
128
|
+
let [mins, peaks] = await a.stat(['min', 'max'], { bins: canvas.width })
|
|
129
|
+
for (let i = 0; i < peaks.length; i++)
|
|
130
|
+
ctx.fillRect(i, h/2 - peaks[i] * h/2, 1, (peaks[i] - mins[i]) * h/2)
|
|
131
|
+
```
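The `min`/`max` envelope in this recipe can be pictured as a per-bin reduction over the samples. Below is a minimal sketch in plain JavaScript, assuming a single-channel `Float32Array`; `envelope` is a hypothetical helper for illustration and not the package's implementation.

```js
// Hypothetical helper: reduce one channel to `bins` min/max pairs.
function envelope(samples, bins) {
  const mins = new Float32Array(bins)
  const maxs = new Float32Array(bins)
  for (let b = 0; b < bins; b++) {
    // Each bin covers a contiguous, roughly equal slice of the samples.
    const from = Math.floor(b * samples.length / bins)
    const to = Math.max(from + 1, Math.floor((b + 1) * samples.length / bins))
    let lo = Infinity, hi = -Infinity
    for (let i = from; i < to && i < samples.length; i++) {
      if (samples[i] < lo) lo = samples[i]
      if (samples[i] > hi) hi = samples[i]
    }
    // Empty bins (no samples) fall back to zero.
    mins[b] = lo === Infinity ? 0 : lo
    maxs[b] = hi === -Infinity ? 0 : hi
  }
  return [mins, maxs]
}

const [mins, maxs] = envelope(Float32Array.from([0, 0.5, -0.5, 1, -1, 0.25, 0, -0.25]), 4)
```

With one bin per canvas pixel, each column of the waveform is drawn from `maxs[i]` down to `mins[i]`.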
|
|
132
|
+
|
|
133
|
+
### Render as it decodes
|
|
134
|
+
|
|
135
|
+
```js
|
|
136
|
+
let a = audio('long.flac')
|
|
137
|
+
a.on('data', ({ delta }) => appendBars(delta.max[0], delta.min[0]))
|
|
138
|
+
await a
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
### Voiceover on music
|
|
142
|
+
|
|
143
|
+
```js
|
|
144
|
+
let music = audio('bg.mp3')
|
|
145
|
+
let voice = audio('narration.wav')
|
|
146
|
+
music.gain(-12).mix(voice, { at: 2 })
|
|
147
|
+
await music.save('mixed.wav')
|
|
148
|
+
```
|
|
149
|
+
|
|
150
|
+
### Split a long file
|
|
151
|
+
|
|
152
|
+
```js
|
|
153
|
+
let a = audio('audiobook.mp3')
|
|
154
|
+
let [ch1, ch2, ch3] = a.split(1800, 3600)
|
|
155
|
+
for (let [i, ch] of [ch1, ch2, ch3].entries())
|
|
156
|
+
await ch.save(`chapter-${i + 1}.mp3`)
|
|
157
|
+
```
|
|
158
|
+
|
|
159
|
+
### Record from mic
|
|
160
|
+
|
|
161
|
+
```js
|
|
162
|
+
let a = audio()
|
|
163
|
+
a.record()
|
|
164
|
+
await new Promise(r => setTimeout(r, 5000))
|
|
165
|
+
a.stop()
|
|
166
|
+
a.trim().normalize()
|
|
167
|
+
await a.save('recording.wav')
|
|
168
|
+
```
|
|
169
|
+
|
|
170
|
+
### Extract features for ML
|
|
171
|
+
|
|
172
|
+
```js
|
|
173
|
+
let a = audio('speech.wav')
|
|
174
|
+
let mfcc = await a.stat('cepstrum', { bins: 13 })
|
|
175
|
+
let spec = await a.stat('spectrum', { bins: 128 })
|
|
176
|
+
let [loud, rms] = await a.stat(['loudness', 'rms'])
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
### Generate a tone
|
|
180
|
+
|
|
181
|
+
```js
|
|
182
|
+
let a = audio.from(t => Math.sin(440 * Math.PI * 2 * t), { duration: 2 })
|
|
183
|
+
await a.save('440hz.wav')
|
|
184
|
+
```
|
|
185
|
+
|
|
186
|
+
### Custom op
|
|
187
|
+
|
|
188
|
+
```js
|
|
189
|
+
audio.op('crush', (chs, ctx) => {
|
|
190
|
+
let steps = 2 ** (ctx.args[0] ?? 8)
|
|
191
|
+
return chs.map(ch => ch.map(s => Math.round(s * steps) / steps))
|
|
192
|
+
})
|
|
193
|
+
|
|
194
|
+
a.crush(4)
|
|
195
|
+
```
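To see what the quantization in `crush` does to individual samples, the same formula can be run outside the library. This is a standalone sketch on a plain `Float32Array`; the function name and default are taken from the recipe, everything else is illustrative.

```js
// Quantize samples to steps of 1 / 2**bits (same formula as the op above).
function crush(samples, bits = 8) {
  const steps = 2 ** bits
  return samples.map(s => Math.round(s * steps) / steps)
}

const crushed = crush(Float32Array.from([0.3, 0.9, -0.6]), 2)
// crushed → Float32Array [0.25, 1, -0.5]
```

Fewer bits means coarser steps, which is what produces the audible distortion.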
|
|
196
|
+
|
|
197
|
+
### Serialize and restore
|
|
198
|
+
|
|
199
|
+
```js
|
|
200
|
+
let json = JSON.stringify(a) // { source, edits, ... }
|
|
201
|
+
let b = audio(JSON.parse(json)) // re-decode + replay edits
|
|
202
|
+
```
|
|
203
|
+
|
|
204
|
+
### Remove a section
|
|
205
|
+
|
|
206
|
+
```js
|
|
207
|
+
let a = audio('interview.wav')
|
|
208
|
+
a.remove({ at: 120, duration: 15 }) // cut 2:00–2:15
|
|
209
|
+
a.fade(0.1, { at: 120 }) // smooth the splice
|
|
210
|
+
await a.save('edited.wav')
|
|
211
|
+
```
|
|
212
|
+
|
|
213
|
+
### Ringtone from any song
|
|
214
|
+
|
|
215
|
+
```js
|
|
216
|
+
let a = audio('song.mp3')
|
|
217
|
+
a.crop({ at: 45, duration: 30 }).fade(0.5, 2).normalize()
|
|
218
|
+
await a.save('ringtone.mp3')
|
|
219
|
+
```
|
|
220
|
+
|
|
221
|
+
### Detect clipping
|
|
222
|
+
|
|
223
|
+
```js
|
|
224
|
+
let a = audio('master.wav')
|
|
225
|
+
let clips = await a.stat('clipping')
|
|
226
|
+
if (clips.length) console.warn(`${clips.length} clipped blocks`)
|
|
227
|
+
```
|
|
228
|
+
|
|
229
|
+
### Stream to network
|
|
230
|
+
|
|
231
|
+
```js
|
|
232
|
+
let a = audio('2hour-mix.flac')
|
|
233
|
+
a.highpass(40).normalize('broadcast')
|
|
234
|
+
for await (let chunk of a) socket.send(chunk[0].buffer)
|
|
235
|
+
```
|
|
236
|
+
|
|
237
|
+
### Glitch: stutter + reverse
|
|
238
|
+
|
|
239
|
+
```js
|
|
240
|
+
let a = audio('beat.wav')
|
|
241
|
+
let v = a.clip({ at: 1, duration: 0.25 })
|
|
242
|
+
let glitch = audio([v, v, v, v])
|
|
243
|
+
glitch.reverse({ at: 0.25, duration: 0.25 })
|
|
244
|
+
await glitch.save('glitch.wav')
|
|
245
|
+
```
|
|
246
|
+
|
|
247
|
+
### Tremolo / sidechain
|
|
248
|
+
|
|
249
|
+
```js
|
|
250
|
+
let a = audio('pad.wav')
|
|
251
|
+
a.gain(t => -12 * (0.5 + 0.5 * Math.cos(t * Math.PI * 4))) // 2Hz tremolo in dB
|
|
252
|
+
await a.save('tremolo.wav')
|
|
253
|
+
```
|
|
254
|
+
|
|
255
|
+
### Sonify data
|
|
256
|
+
|
|
257
|
+
```js
|
|
258
|
+
let prices = [100, 102, 98, 105, 110, 95, 88, 92, 101, 107]
|
|
259
|
+
let a = audio.from(t => {
|
|
260
|
+
let freq = 200 + (prices[Math.min(Math.floor(t / 0.2), prices.length - 1)] - 80) * 10
|
|
261
|
+
return Math.sin(freq * Math.PI * 2 * t) * 0.5
|
|
262
|
+
}, { duration: prices.length * 0.2 })
|
|
263
|
+
await a.save('sonification.wav')
|
|
264
|
+
```
|
|
265
|
+
|
|
266
|
+
|
|
267
|
+
## API
|
|
268
|
+
|
|
269
|
+
### Create
|
|
270
|
+
|
|
271
|
+
* **`audio(source, opts?)`** – decode from file, URL, or bytes. Returns instantly — decodes in background.
|
|
272
|
+
* **`audio.from(source, opts?)`** – wrap existing PCM, AudioBuffer, silence, or function. Sync, no I/O.
|
|
273
|
+
|
|
274
|
+
```js
|
|
275
|
+
let a = audio('voice.mp3') // file path
|
|
276
|
+
let b = audio('https://cdn.ex/track.mp3') // URL
|
|
277
|
+
let c = audio(inputEl.files[0]) // Blob, File, Response, ArrayBuffer
|
|
278
|
+
let d = audio() // empty, ready for .push() or .record()
|
|
279
|
+
let e = audio([intro, body, outro]) // concat (virtual, no copy)
|
|
280
|
+
// opts: { sampleRate, channels, storage: 'memory' | 'persistent' | 'auto' }
|
|
281
|
+
|
|
282
|
+
await a // await for decode — if you need .duration, full stats etc
|
|
283
|
+
|
|
284
|
+
let a = audio.from([left, right]) // Float32Array[] channels
|
|
285
|
+
let b = audio.from(3, { channels: 2 }) // 3s silence
|
|
286
|
+
let c = audio.from(t => Math.sin(440*TAU*t), { duration: 2 }) // generator
|
|
287
|
+
let d = audio.from(audioBuffer) // Web Audio AudioBuffer
|
|
288
|
+
let e = audio.from(int16arr, { format: 'int16' }) // typed array + format
|
|
289
|
+
```
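The generator form `audio.from(t => ..., { duration })` can be thought of as sampling a function of time at a fixed rate. The sketch below shows that idea in plain JavaScript; `render` and the 44100 Hz default are assumptions for illustration, and `TAU` is defined here as 2π since the snippets above use it without a definition.

```js
const TAU = Math.PI * 2

// Hypothetical helper: sample fn(t) at a fixed rate into one channel.
function render(fn, { duration, sampleRate = 44100 }) {
  const out = new Float32Array(Math.round(duration * sampleRate))
  for (let i = 0; i < out.length; i++) out[i] = fn(i / sampleRate) // t in seconds
  return out
}

const tone = render(t => Math.sin(440 * TAU * t), { duration: 2 }) // 2 s of 440 Hz
```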
|
|
290
|
+
|
|
291
|
+
|
|
292
|
+
### Properties
|
|
293
|
+
|
|
294
|
+
```js
|
|
295
|
+
// format
|
|
296
|
+
a.duration // total seconds (reflects edits)
|
|
297
|
+
a.channels // channel count
|
|
298
|
+
a.sampleRate // sample rate
|
|
299
|
+
a.length // total samples per channel
|
|
300
|
+
|
|
301
|
+
// playback
|
|
302
|
+
a.currentTime // position in seconds (smooth interpolation during playback)
|
|
303
|
+
a.playing // true during playback
|
|
304
|
+
a.paused // true when paused
|
|
305
|
+
a.volume = 0.5 // 0..1 linear (settable)
|
|
306
|
+
a.muted = true // mute gate (independent of volume)
|
|
307
|
+
a.loop = true // on/off (settable)
|
|
308
|
+
a.ended // true when playback ended naturally (not via stop)
|
|
309
|
+
a.seeking // true during a seek operation
|
|
310
|
+
a.played // promise, resolves when playback starts
|
|
311
|
+
a.recording // true during mic recording
|
|
312
|
+
|
|
313
|
+
// state
|
|
314
|
+
a.ready // promise, resolves when fully decoded
|
|
315
|
+
a.source // original source reference
|
|
316
|
+
a.pages // Float32Array page store
|
|
317
|
+
a.stats // per-block stats (peak, rms, etc.)
|
|
318
|
+
a.edits // edit list (non-destructive ops)
|
|
319
|
+
a.version // increments on each edit
|
|
320
|
+
```
|
|
321
|
+
|
|
322
|
+
### Structure
|
|
323
|
+
|
|
324
|
+
Non-destructive time/channel rearrangement. All support `{at, duration, channel}`.
|
|
325
|
+
|
|
326
|
+
* **`.trim(threshold?)`** – strip leading/trailing silence (dB, default auto).
|
|
327
|
+
* **`.crop({at, duration})`** – keep range, discard rest.
|
|
328
|
+
* **`.remove({at, duration})`** – cut range, close gap.
|
|
329
|
+
* **`.insert(source, {at})`** – insert audio or silence (number of seconds) at position.
|
|
330
|
+
* **`.clip({at, duration})`** – zero-copy range reference.
|
|
331
|
+
* **`.split(...offsets)`** – zero-copy split at timestamps.
|
|
332
|
+
* **`.pad(before, after?)`** – silence at edges (seconds).
|
|
333
|
+
* **`.repeat(n)`** – repeat n times.
|
|
334
|
+
* **`.reverse({at?, duration?})`** – reverse audio or range.
|
|
335
|
+
* **`.speed(rate)`** – playback speed (affects pitch and duration).
|
|
336
|
+
* **`.remix(channels)`** – channel count: number or array map (`[1, 0]` swaps L/R).
|
|
337
|
+
|
|
338
|
+
```js
|
|
339
|
+
a.trim(-30) // strip silence below -30dB
|
|
340
|
+
a.remove({ at: '2m', duration: 15 }) // cut 2:00–2:15, close gap
|
|
341
|
+
a.insert(intro, { at: 0 }) // prepend; .insert(3) appends 3s silence
|
|
342
|
+
let [pt1, pt2] = a.split('30m') // zero-copy views
|
|
343
|
+
let hook = a.clip({ at: 60, duration: 30 }) // zero-copy excerpt
|
|
344
|
+
a.remix([0, 0]) // L→both; .remix(1) for mono
|
|
345
|
+
```
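The array form of `.remix()` reads as "output channel `i` takes its data from input channel `map[i]`". A minimal sketch of that mapping on raw channel arrays follows; `remix` here is a hypothetical standalone helper, and the numeric form (`.remix(1)` for mono) is not covered because its downmix rule is not stated above.

```js
// Hypothetical helper: output channel i is a copy of input channel map[i].
function remix(channels, map) {
  return map.map(src => Float32Array.from(channels[src]))
}

const left = Float32Array.from([1, 1])
const right = Float32Array.from([-1, -1])
const swapped = remix([left, right], [1, 0]) // [right, left]
const dual = remix([left, right], [0, 0])    // left on both outputs
```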
|
|
346
|
+
|
|
347
|
+
### Process
|
|
348
|
+
|
|
349
|
+
Amplitude, mixing, normalization. All support `{at, duration, channel}` ranges.
|
|
350
|
+
|
|
351
|
+
* **`.gain(dB, opts?)`** – volume. Number, range, or `t => dB` function. `{ unit: 'linear' }` for multiplier.
|
|
352
|
+
* **`.fade(in, out?, curve?)`** – fade in/out. Curves: `'linear'` `'exp'` `'log'` `'cos'`.
|
|
353
|
+
* **`.normalize(target?)`** – remove DC offset, clamp, and normalize loudness.
|
|
354
|
+
* `'podcast'` – -16 LUFS, -1 dBTP.
|
|
355
|
+
* `'streaming'` – -14 LUFS.
|
|
356
|
+
* `'broadcast'` – -23 LUFS.
|
|
357
|
+
* `-3` – custom dB target (peak mode).
|
|
358
|
+
* no arg – peak 0dBFS.
|
|
359
|
+
* `{ mode: 'rms' }` – RMS normalization. Also `'peak'`, `'lufs'`.
|
|
360
|
+
* `{ ceiling: -1 }` – true peak limiter in dB.
|
|
361
|
+
* `{ dc: false }` – skip DC removal.
|
|
362
|
+
* **`.mix(source, opts?)`** – overlay another audio (additive).
|
|
363
|
+
* **`.pan(value, opts?)`** – stereo balance (−1 left, 0 center, 1 right). Accepts function.
|
|
364
|
+
* **`.write(data, {at?})`** – overwrite samples with raw PCM.
|
|
365
|
+
* **`.transform(fn)`** – inline processor: `(chs, ctx) => chs`. Not serialized.
|
|
366
|
+
|
|
367
|
+
```js
|
|
368
|
+
a.gain(-3) // reduce 3dB
|
|
369
|
+
a.gain(6, { at: 10, duration: 5 }) // boost range
|
|
370
|
+
a.gain(t => -12 * Math.cos(t * TAU)) // automate over time
|
|
371
|
+
a.fade(0.5, -2, 'exp') // 0.5s in, 2s exp fade-out
|
|
372
|
+
a.normalize('podcast') // -16 LUFS; also 'streaming', 'broadcast'
|
|
373
|
+
a.mix(voice, { at: 2 }) // overlay at 2s
|
|
374
|
+
a.pan(-0.3, { at: 10, duration: 5 }) // pan left for range
|
|
375
|
+
```
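Since gain and normalization targets are given in dB, it may help to see the conversion to linear multipliers spelled out. The following is a sketch under simple assumptions: one channel, constant gain, peak mode only. LUFS targets such as `'podcast'` need a loudness measurement and are not modeled; the helper names are illustrative.

```js
// Convert decibels to a linear amplitude multiplier and back.
const dbToLinear = db => 10 ** (db / 20)
const linearToDb = x => 20 * Math.log10(x)

// Apply a constant gain in dB to one channel, returning a new array.
function gain(samples, db) {
  const k = dbToLinear(db)
  return samples.map(s => s * k)
}

// Peak-normalize so the loudest sample lands on targetDb (dBFS).
function normalizePeak(samples, targetDb = 0) {
  let peak = 0
  for (const s of samples) peak = Math.max(peak, Math.abs(s))
  if (peak === 0) return samples.slice() // silence: nothing to scale
  const k = dbToLinear(targetDb) / peak
  return samples.map(s => s * k)
}

const quiet = Float32Array.from([0.25, -0.5, 0.125])
const loud = normalizePeak(quiet)  // peak becomes 1.0 (0 dBFS)
const halved = gain(loud, -6.0206) // about half amplitude
```

A change of about 6 dB doubles or halves amplitude, which is why `-12` on a music bed leaves it at roughly a quarter of its original level.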
|
|
376
|
+
|
|
377
|
+
### Filter
|
|
378
|
+
|
|
379
|
+
Biquad filters, chainable. All support `{at, duration}` ranges.
|
|
380
|
+
|
|
381
|
+
* **`.highpass(freq)`**, **`.lowpass(freq)`** – pass filter.
|
|
382
|
+
* **`.bandpass(freq, Q?)`**, **`.notch(freq, Q?)`** – band-pass / notch.
|
|
383
|
+
* **`.lowshelf(freq, dB)`**, **`.highshelf(freq, dB)`** – shelf EQ.
|
|
384
|
+
* **`.eq(freq, gain, Q?)`** – parametric EQ.
|
|
385
|
+
* **`.filter(type, ...params)`** – generic dispatch.
|
|
386
|
+
|
|
387
|
+
```js
|
|
388
|
+
a.highpass(80).lowshelf(200, -3) // rumble + mud
|
|
389
|
+
a.eq(3000, 2, 1.5).highshelf(8000, 3) // presence + air
|
|
390
|
+
a.notch(50) // remove hum
|
|
391
|
+
a.filter(customFn, { cutoff: 2000 }) // custom filter function
|
|
392
|
+
```
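For readers unfamiliar with biquads, here is a minimal highpass sketch. It uses the widely published Audio EQ Cookbook (RBJ) coefficient formulas and a direct form I loop; whether the package uses these exact formulas or this structure is an assumption, and the function names are illustrative.

```js
// Highpass biquad coefficients (RBJ Audio EQ Cookbook form), normalized by a0.
function highpassCoefs(freq, sampleRate, Q = Math.SQRT1_2) {
  const w0 = 2 * Math.PI * freq / sampleRate
  const cos = Math.cos(w0)
  const alpha = Math.sin(w0) / (2 * Q)
  const a0 = 1 + alpha
  return {
    b0: (1 + cos) / 2 / a0,
    b1: -(1 + cos) / a0,
    b2: (1 + cos) / 2 / a0,
    a1: -2 * cos / a0,
    a2: (1 - alpha) / a0,
  }
}

// Direct form I: y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]
function biquad(samples, { b0, b1, b2, a1, a2 }) {
  const out = new Float32Array(samples.length)
  let x1 = 0, x2 = 0, y1 = 0, y2 = 0
  for (let i = 0; i < samples.length; i++) {
    const x = samples[i]
    const y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
    x2 = x1; x1 = x
    y2 = y1; y1 = y
    out[i] = y
  }
  return out
}

// A constant (DC) input decays toward zero through an 80 Hz highpass.
const coefs = highpassCoefs(80, 44100)
const filtered = biquad(new Float32Array(44100).fill(1), coefs)
```

Chaining filters, as in `a.highpass(80).lowshelf(200, -3)`, corresponds to running one such section after another with different coefficients.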
|
|
393
|
+
|
|
394
|
+
### I/O
|
|
395
|
+
|
|
396
|
+
Read PCM, encode, stream, push. Format inferred from extension.
|
|
397
|
+
|
|
398
|
+
* **`await .read(opts?)`** – rendered PCM. `{ format, channel }` to convert.
|
|
399
|
+
* **`await .save(path, opts?)`** – encode + write. `{ at, duration }` for sub-range.
|
|
400
|
+
* **`await .encode(format?, opts?)`** – encode to `Uint8Array`.
|
|
401
|
+
* **`for await (let block of a)`** – async-iterable over blocks.
|
|
402
|
+
* **`.clone()`** – deep copy, independent edits, shared pages.
|
|
403
|
+
* **`.push(data, format?)`** – feed PCM into pushable instance. `.stop()` to finalize.
|
|
404
|
+
|
|
405
|
+
```js
|
|
406
|
+
let pcm = await a.read() // Float32Array[]
|
|
407
|
+
let raw = await a.read({ format: 'int16', channel: 0 })
|
|
408
|
+
await a.save('out.mp3') // format from extension
|
|
409
|
+
let bytes = await a.encode('flac') // Uint8Array
|
|
410
|
+
for await (let block of a) send(block) // stream blocks
|
|
411
|
+
let b = a.clone() // independent copy, shared pages
|
|
412
|
+
|
|
413
|
+
let src = audio() // pushable source
|
|
414
|
+
src.push(buf, 'int16') // feed PCM
|
|
415
|
+
src.stop() // finalize
|
|
416
|
+
```
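The `'int16'` format mentioned for `.read()` and `.push()` implies a conversion between float samples in the range -1..1 and 16-bit integers. One common convention is sketched below (scale by 32768 and clamp); the package's exact scaling and rounding are not stated above, so treat this as an assumption.

```js
// Float [-1, 1] → int16, scaling by 32768 and clamping to the int16 range.
function floatToInt16(samples) {
  const out = new Int16Array(samples.length)
  for (let i = 0; i < samples.length; i++) {
    const v = Math.round(samples[i] * 32768)
    out[i] = Math.max(-32768, Math.min(32767, v))
  }
  return out
}

// int16 → float, the inverse scaling.
function int16ToFloat(samples) {
  const out = new Float32Array(samples.length)
  for (let i = 0; i < samples.length; i++) out[i] = samples[i] / 32768
  return out
}

const ints = floatToInt16(Float32Array.from([0, 0.5, -1, 1]))
// ints → Int16Array [0, 16384, -32768, 32767]
const back = int16ToFloat(ints)
```

Positive full scale clamps to 32767, so a round trip through int16 is slightly lossy at `+1.0`.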
|
|
417
|
+
|
|
418
|
+
### Playback / Recording
|
|
419
|
+
|
|
420
|
+
Live playback with dB volume, seeking, looping. Mic recording via `audio-mic`.
|
|
421
|
+
|
|
422
|
+
* **`.play(opts?)`** – start playback. `{ at, duration, volume, loop }`. `.played` promise resolves when output starts.
|
|
423
|
+
* **`.pause()`**, **`.resume()`**, **`.seek(t)`**, **`.stop()`** – playback control.
|
|
424
|
+
* **`.record(opts?)`** – mic recording. `{ deviceId, sampleRate, channels }`.
|
|
425
|
+
|
|
426
|
+
```js
|
|
427
|
+
a.play({ at: 30, duration: 10 }) // play 30s–40s
|
|
428
|
+
await a.played // wait for output to start
|
|
429
|
+
a.volume = 0.5; a.loop = true // live adjustments
|
|
430
|
+
a.muted = true // mute without changing volume
|
|
431
|
+
a.pause(); a.seek(60); a.resume() // jump to 1:00
|
|
432
|
+
a.stop() // end playback or recording
|
|
433
|
+
|
|
434
|
+
let mic = audio()
|
|
435
|
+
mic.record({ sampleRate: 16000, channels: 1 })
|
|
436
|
+
mic.stop()
|
|
437
|
+
```
|
|
438
|
+
|
|
439
|
+
|
|
440
|
+
### Analysis
|
|
441
|
+
|
|
442
|
+
`await .stat(name, opts?)` — without `bins` returns scalar, with `bins` returns `Float32Array`. Array of names returns array of results. Sub-ranges via `{at, duration}`, per-channel via `{channel}`.
|
|
443
|
+
|
|
444
|
+
* **`'db'`** – peak amplitude in dBFS.
|
|
445
|
+
* **`'rms'`** – RMS amplitude (linear).
|
|
446
|
+
* **`'loudness'`** – integrated LUFS (ITU-R BS.1770).
|
|
447
|
+
* **`'dc'`** – DC offset.
|
|
448
|
+
* **`'clipping'`** – clipped samples (scalar: timestamps, binned: counts).
|
|
449
|
+
* **`'silence'`** – silent ranges as `{at, duration}`.
|
|
450
|
+
* **`'max'`**, **`'min'`** – peak envelope (use together for waveform rendering).
|
|
451
|
+
* **`'spectrum'`** – mel-frequency spectrum in dB (A-weighted).
|
|
452
|
+
* **`'cepstrum'`** – MFCCs.
|
|
453
|
+
|
|
454
|
+
```js
|
|
455
|
+
let loud = await a.stat('loudness') // LUFS
|
|
456
|
+
let [db, clips] = await a.stat(['db', 'clipping']) // multiple at once
|
|
457
|
+
let spec = await a.stat('spectrum', { bins: 128 }) // frequency bins
|
|
458
|
+
let peaks = await a.stat('max', { bins: 800 }) // waveform data
|
|
459
|
+
await a.stat('rms', { channel: 0 }) // left only → number
|
|
460
|
+
await a.stat('rms', { channel: [0, 1] }) // per-channel → [n, n]
|
|
461
|
+
let gaps = await a.stat('silence', { threshold: -40 }) // [{at, duration}, ...]
|
|
462
|
+
```
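The simpler stats have short textbook definitions, sketched here for a single channel. These are standard formulas rather than the package's code; `'loudness'` is left out because integrated LUFS requires the K-weighting and gating defined in ITU-R BS.1770.

```js
// Peak level in dBFS: 0 dBFS is full scale (|sample| = 1).
function peakDb(samples) {
  let peak = 0
  for (const s of samples) peak = Math.max(peak, Math.abs(s))
  return 20 * Math.log10(peak) // -Infinity for pure silence
}

// Root-mean-square amplitude (linear).
function rms(samples) {
  let sum = 0
  for (const s of samples) sum += s * s
  return samples.length ? Math.sqrt(sum / samples.length) : 0
}

// DC offset: the mean sample value.
function dc(samples) {
  let sum = 0
  for (const s of samples) sum += s
  return samples.length ? sum / samples.length : 0
}

const x = Float32Array.from([0.5, -0.5, 0.5, -0.5])
// peakDb(x) is about -6.02, rms(x) is 0.5, dc(x) is 0
```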
|
|
463
|
+
|
|
464
|
+
|
|
465
|
+
### Utility
|
|
466
|
+
|
|
467
|
+
Events, lifecycle, undo/redo, serialization.
|
|
468
|
+
|
|
469
|
+
* **`.on(event, fn)`** / **`.off(event?, fn?)`** – subscribe / unsubscribe.
|
|
470
|
+
* `'data'` – pages decoded/pushed. Payload: `{ delta, offset, sampleRate, channels }`.
|
|
471
|
+
* `'change'` – any edit or undo.
|
|
472
|
+
* `'metadata'` – stream header decoded. Payload: `{ sampleRate, channels }`.
|
|
473
|
+
* `'timeupdate'` – playback position. Payload: `currentTime`.
|
|
474
|
+
* `'play'` – playback started or resumed.
|
|
475
|
+
* `'pause'` – playback paused.
|
|
476
|
+
* `'volumechange'` – volume or muted changed.
|
|
477
|
+
* `'ended'` – playback finished (not on loop).
|
|
478
|
+
* `'progress'` – during save/encode. Payload: `{ offset, total }` in seconds.
|
|
479
|
+
* **`.dispose()`** – release resources. Supports `using` for auto-dispose.
|
|
480
|
+
* **`.undo(n?)`** – undo last edit(s). Returns edit for redo via `.run()`.
|
|
481
|
+
* **`.run(...edits)`** – apply edit objects `{ type, args, at?, duration? }`. Batch or replay.
|
|
482
|
+
|
|
483
|
+
```js
|
|
484
|
+
a.on('data', ({ delta }) => draw(delta)) // decode progress
|
|
485
|
+
a.on('timeupdate', t => ui.update(t)) // playback position
|
|
486
|
+
|
|
487
|
+
a.undo() // undo last edit
|
|
488
|
+
b.run(...a.edits) // replay onto another file
|
|
489
|
+
let json = JSON.stringify(a); audio(JSON.parse(json))  // serialize / restore
|
|
490
|
+
```
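The relationship between `.edits`, `.undo()`, `.run()`, and `.version` can be illustrated with a toy edit list that never touches its source. This is a hypothetical sketch: the `Take` class, the two ops, and the versioning rule are invented for illustration and are much simpler than the real thing.

```js
// Toy ops keyed by edit type; each returns a new array.
const ops = {
  gain: (samples, db) => samples.map(s => s * 10 ** (db / 20)),
  reverse: samples => Float32Array.from(samples).reverse(),
}

class Take {
  constructor(source) {
    this.source = source // never modified
    this.edits = []      // [{ type, args }]
    this.version = 0     // bumps on every edit and every undo
  }
  run(...edits) {
    this.edits.push(...edits)
    this.version += edits.length
    return this
  }
  undo(n = 1) {
    const removed = this.edits.splice(Math.max(0, this.edits.length - n), n)
    if (removed.length) this.version++
    return removed // can be passed back to run() as a redo
  }
  // Replay the edit list over a copy of the source.
  read() {
    return this.edits.reduce(
      (samples, { type, args = [] }) => ops[type](samples, ...args),
      Float32Array.from(this.source)
    )
  }
}

const take = new Take(Float32Array.from([1, 0.5, 0.25]))
take.run({ type: 'reverse' }, { type: 'gain', args: [-20] })
const edited = take.read()   // reversed, then scaled by 0.1
const undone = take.undo()   // removes the gain edit
const restored = take.read() // reversed only
```

Because edits are plain objects, replaying them onto another source or serializing them to JSON follows naturally from this shape.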
|
|
491
|
+
|
|
492
|
+
### Plugins
|
|
493
|
+
|
|
494
|
+
Extend with custom ops and stats. See [Plugin Tutorial](docs/plugins.md).
|
|
495
|
+
|
|
496
|
+
* **`audio.op(name, fn)`** – register op. Shorthand for `{ process: fn }`. Full descriptor: `{ process, plan, resolve, call }`.
|
|
497
|
+
* **`audio.op(name)`** – query descriptor. **`audio.op()`** – all ops.
|
|
498
|
+
* **`audio.stat(name, descriptor)`** – register stat. Shorthand `(chs, ctx) => [...]` or `{ block, reduce, query }`.
|
|
499
|
+
|
|
500
|
+
```js
|
|
501
|
+
// op: process function receives (channels[], ctx) per 1024-sample block
|
|
502
|
+
audio.op('crush', (chs, ctx) => {
|
|
503
|
+
let steps = 2 ** (ctx.args[0] ?? 8)
|
|
504
|
+
return chs.map(ch => ch.map(s => Math.round(s * steps) / steps))
|
|
505
|
+
})
|
|
506
|
+
|
|
507
|
+
// stat: block function collects per-block, reduce enables scalar queries
|
|
508
|
+
audio.stat('peak', {
|
|
509
|
+
block: (chs) => chs.map(ch => { let m = 0; for (let s of ch) m = Math.max(m, Math.abs(s)); return m }),
|
|
510
|
+
reduce: (src, from, to) => { let m = 0; for (let i = from; i < to; i++) m = Math.max(m, src[i]); return m },
|
|
511
|
+
})
|
|
512
|
+
|
|
513
|
+
a.crush(4) // chainable like built-in ops
|
|
514
|
+
a.stat('peak') // → scalar from reduce
|
|
515
|
+
a.stat('peak', { bins: 100 }) // → binned array
|
|
516
|
+
```
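One way to picture how a `block` + `reduce` stat could serve both scalar and binned queries is sketched below. The 1024-sample block size comes from the comment above; the `blockPeaks` and `query` helpers and the dispatch between scalar and binned results are assumptions, not the package's internals.

```js
const BLOCK = 1024 // samples per block

// Per-block stat: absolute peak of each block of one channel.
function blockPeaks(samples) {
  const n = Math.ceil(samples.length / BLOCK)
  const peaks = new Float32Array(n)
  for (let b = 0; b < n; b++) {
    const end = Math.min(samples.length, (b + 1) * BLOCK)
    for (let i = b * BLOCK; i < end; i++) peaks[b] = Math.max(peaks[b], Math.abs(samples[i]))
  }
  return peaks
}

// Same shape as the `reduce` above: fold block values in [from, to).
const reduce = (src, from, to) => {
  let m = 0
  for (let i = from; i < to; i++) m = Math.max(m, src[i])
  return m
}

// Scalar query reduces everything; binned query reduces one slice per bin.
function query(peaks, bins) {
  if (!bins) return reduce(peaks, 0, peaks.length)
  return Float32Array.from({ length: bins }, (_, b) => {
    const from = Math.floor(b * peaks.length / bins)
    const to = Math.max(from + 1, Math.floor((b + 1) * peaks.length / bins))
    return reduce(peaks, from, Math.min(to, peaks.length))
  })
}

const signal = new Float32Array(4096)
signal[100] = 0.5 // falls in block 0
signal[3000] = -1 // falls in block 2
const peaks = blockPeaks(signal)
const overall = query(peaks)   // 1
const halves = query(peaks, 2) // Float32Array [0.5, 1]
```

Keeping only per-block values is what lets queries over long files avoid rescanning every sample.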
|
|
517
|
+
|
|
518
|
+
## CLI
|
|
519
|
+
|
|
520
|
+
|
|
521
|
+
```sh
|
|
522
|
+
npx audio [file] [ops...] [-o output] [options]
|
|
523
|
+
|
|
524
|
+
# ops
|
|
525
|
+
eq mix pad pan crop
|
|
526
|
+
fade gain stat trim notch
|
|
527
|
+
remix speed split insert remove
|
|
528
|
+
repeat bandpass highpass lowpass reverse
|
|
529
|
+
lowshelf highshelf normalize
|
|
530
|
+
```
|
|
531
|
+
|
|
532
|
+
|
|
533
|
+
`-o` output · `-p` play · `-f` force · `--format` · `--verbose` · `+` concat
|
|
534
|
+
|
|
535
|
+
### Playback
|
|
536
|
+
|
|
537
|
+
|
|
538
|
+
<img src="player.gif" alt="Audiojs demo" width="624">
|
|
539
|
+
|
|
540
|
+
<!-- ```sh
|
|
541
|
+
npx audio kirtan.mp3
|
|
542
|
+
▶ 0:06:37 ━━━━━━━━────────────────────────────────────────── -0:36:30 ▁▂▃▄▅__
|
|
543
|
+
▂▅▇▇██▇▆▇▇▇██▆▇▇▇▆▆▅▅▆▅▆▆▅▅▆▅▅▅▃▂▂▂▂▁_____________
|
|
544
|
+
50 500 1k 2k 5k 10k 20k
|
|
545
|
+
|
|
546
|
+
48k 2ch 43:07 -0.8dBFS -30.8LUFS
|
|
547
|
+
``` -->
|
|
548
|
+
|
|
549
|
+
<kbd>␣</kbd> pause · <kbd>←</kbd>/<kbd>→</kbd> seek ±10s · <kbd>⇧←</kbd>/<kbd>⇧→</kbd> seek ±60s · <kbd>↑</kbd>/<kbd>↓</kbd> volume ±3dB · <kbd>l</kbd> loop · <kbd>q</kbd> quit
|
|
550
|
+
|
|
551
|
+
|
|
552
|
+
|
|
553
|
+
### Edit
|
|
554
|
+
|
|
555
|
+
```sh
|
|
556
|
+
# clean up
|
|
557
|
+
npx audio raw-take.wav trim -30db normalize podcast fade 0.3s -0.5s -o clean.wav
|
|
558
|
+
|
|
559
|
+
# ranges
|
|
560
|
+
npx audio in.wav gain -3db 1s..10s -o out.wav
|
|
561
|
+
|
|
562
|
+
# filter chain
|
|
563
|
+
npx audio in.mp3 highpass 80hz lowshelf 200hz -3db -o out.wav
|
|
564
|
+
|
|
565
|
+
# join
|
|
566
|
+
npx audio intro.mp3 + content.wav + outro.mp3 trim normalize fade 0.5s -2s -o ep.mp3
|
|
567
|
+
|
|
568
|
+
# voiceover
|
|
569
|
+
npx audio bg.mp3 gain -12db mix narration.wav 2s -o mixed.wav
|
|
570
|
+
|
|
571
|
+
# split
|
|
572
|
+
npx audio audiobook.mp3 split 30m 60m -o 'chapter-{i}.mp3'
|
|
573
|
+
```
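The CLI examples above use unit-suffixed values such as `-3db`, `80hz`, `0.3s`, `30m`, and ranges like `1s..10s`. The sketch below shows one way such tokens could be parsed; it is a hypothetical parser, not the one in `bin/cli.js`, it covers only the four suffixes that appear in these examples, and reading `a..b` as start and end times is an assumption.

```js
// Hypothetical unit table: suffix → [canonical unit, multiplier].
const UNITS = { db: ['dB', 1], hz: ['Hz', 1], s: ['s', 1], m: ['s', 60] }

// "-3db" → { value: -3, unit: 'dB' }, "30m" → { value: 1800, unit: 's' }
function parseValue(token) {
  const match = /^(-?\d+(?:\.\d+)?)([a-z]*)$/i.exec(token)
  if (!match) return null // not a number, e.g. an op name or preset
  const [, num, suffix] = match
  if (!suffix) return { value: Number(num), unit: null }
  const unit = UNITS[suffix.toLowerCase()]
  return unit ? { value: Number(num) * unit[1], unit: unit[0] } : null
}

// "1s..10s" → { at, duration } in seconds (assumes start..end semantics).
function parseRange(token) {
  const [from, to] = token.split('..').map(parseValue)
  if (!from || !to) return null
  return { at: from.value, duration: to.value - from.value }
}
```

Under these assumptions `split 30m 60m` on the command line lines up with `a.split(1800, 3600)` in the JavaScript recipe.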
|
|
574
|
+
|
|
575
|
+
### Analysis
|
|
576
|
+
|
|
577
|
+
```sh
|
|
578
|
+
# all default stats (db, rms, loudness, clipping, dc)
|
|
579
|
+
npx audio speech.wav stat
|
|
580
|
+
|
|
581
|
+
# specific stats
|
|
582
|
+
npx audio speech.wav stat loudness rms
|
|
583
|
+
|
|
584
|
+
# spectrum / cepstrum with bin count
|
|
585
|
+
npx audio speech.wav stat spectrum 128
|
|
586
|
+
npx audio speech.wav stat cepstrum 13
|
|
587
|
+
|
|
588
|
+
# stat after transforms
|
|
589
|
+
npx audio speech.wav gain -3db stat db
|
|
590
|
+
```
|
|
591
|
+
|
|
592
|
+
### Batch
|
|
593
|
+
|
|
594
|
+
```sh
|
|
595
|
+
npx audio '*.wav' trim normalize podcast -o '{name}.clean.{ext}'
|
|
596
|
+
npx audio '*.wav' gain -3db -o '{name}.out.{ext}'
|
|
597
|
+
```
|
|
598
|
+
|
|
599
|
+
### Stdin/stdout
|
|
600
|
+
|
|
601
|
+
```sh
|
|
602
|
+
cat in.wav | audio gain -3db > out.wav
|
|
603
|
+
curl -s https://example.com/speech.mp3 | npx audio normalize -o clean.wav
|
|
604
|
+
ffmpeg -i video.mp4 -f wav - | npx audio trim normalize podcast > voice.wav
|
|
605
|
+
```
|
|
606
|
+
|
|
607
|
+
### Tab completion
|
|
608
|
+
|
|
609
|
+
```sh
|
|
610
|
+
eval "$(audio --completions zsh)" # add to ~/.zshrc
|
|
611
|
+
eval "$(audio --completions bash)" # add to ~/.bashrc
|
|
612
|
+
audio --completions fish | source # fish
|
|
613
|
+
```
|
|
614
|
+
|
|
615
|
+
|
|
616
|
+
|
|
617
|
+
|
|
618
|
+
## FAQ
|
|
619
|
+
|
|
620
|
+
<dl>
|
|
621
|
+
<dt>How is this different from Web Audio API?</dt>
|
|
622
|
+
<dd>Web Audio API is a real-time audio graph for playback and synthesis. This library is for working with audio files: decoding, editing, analyzing, and saving them. For Web Audio API in Node, see <a href="https://github.com/audiojs/web-audio-api">web-audio-api</a>.</dd>
|
|
623
|
+
|
|
624
|
+
<dt>What formats are supported?</dt>
|
|
625
|
+
<dd>Decode: WAV, MP3, FLAC, OGG Vorbis, Opus, AAC, AIFF, CAF, WebM, AMR, WMA, QOA via <a href="https://github.com/audiojs/audio-decode">audio-decode</a>. Encode: WAV, MP3, FLAC, Opus, OGG, AIFF via <a href="https://github.com/audiojs/audio-encode">audio-encode</a>. Codecs are WASM-based, lazy-loaded on first use.</dd>
|
|
626
|
+
|
|
627
|
+
<dt>Does it need ffmpeg or native addons?</dt>
|
|
628
|
+
<dd>No, pure JS + WASM. For CLI, you can install globally: <code>npm i -g audio</code>.</dd>
|
|
629
|
+
|
|
630
|
+
<dt>How big is the bundle?</dt>
|
|
631
|
+
<dd>~20K gzipped core. Codecs load on demand via <code>import()</code>, so unused formats aren't fetched.</dd>
|
|
632
|
+
|
|
633
|
+
<dt>How does it handle large files?</dt>
|
|
634
|
+
<dd>Audio is stored in fixed-size pages. In the browser, cold pages can evict to OPFS when memory exceeds budget. Stats stay resident (~7 MB for 2h stereo).</dd>
|
|
635
|
+
|
|
636
|
+
<dt>Are edits destructive?</dt>
|
|
637
|
+
<dd>No. <code>a.gain(-3).trim()</code> pushes entries to an edit list — source pages aren't touched. Edits replay on <code>read()</code> / <code>save()</code> / <code>for await</code>.</dd>
|
|
638
|
+
|
|
639
|
+
<dt>Can I use it in the browser?</dt>
|
|
640
|
+
<dd>Yes, same API. See <a href="#browser">Browser</a> for bundle options and import maps.</dd>
|
|
641
|
+
|
|
642
|
+
<dt>Does it need the full file before I can work with it?</dt>
|
|
643
|
+
<dd>No, playback and edits work during decode. The <code>'data'</code> event fires as pages arrive.</dd>
|
|
644
|
+
|
|
645
|
+
<dt>TypeScript?</dt>
|
|
646
|
+
<dd>Yes, ships with <code>audio.d.ts</code>.</dd>
|
|
647
|
+
</dl>
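
The "fixed-size pages" answer above can be made concrete with a small sketch: a channel stored as a list of equal-sized `Float32Array` pages, where a range read touches only the pages it overlaps. The class name, the default page size, and the API are hypothetical, and eviction of cold pages to OPFS is not modeled.

```js
// Hypothetical page store: one channel split into fixed-size Float32Array pages.
class PagedChannel {
  constructor(pageSize = 65536) {
    this.pageSize = pageSize
    this.pages = []
    this.length = 0 // total samples stored
  }
  // Append a Float32Array, filling the last page before allocating a new one.
  push(samples) {
    for (let i = 0; i < samples.length;) {
      const offset = this.length % this.pageSize
      if (offset === 0) this.pages.push(new Float32Array(this.pageSize))
      const page = this.pages[this.pages.length - 1]
      const n = Math.min(this.pageSize - offset, samples.length - i)
      page.set(samples.subarray(i, i + n), offset)
      this.length += n
      i += n
    }
  }
  // Copy out [from, to), touching only the pages that overlap the range.
  read(from, to) {
    const end = Math.min(to, this.length)
    const out = new Float32Array(Math.max(0, end - from))
    for (let pos = from; pos < end;) {
      const page = this.pages[Math.floor(pos / this.pageSize)]
      const offset = pos % this.pageSize
      const n = Math.min(this.pageSize - offset, end - pos)
      out.set(page.subarray(offset, offset + n), pos - from)
      pos += n
    }
    return out
  }
}

const ch = new PagedChannel(4)
ch.push(Float32Array.from([1, 2, 3, 4, 5, 6]))
const middle = ch.read(2, 6) // spans two pages → Float32Array [3, 4, 5, 6]
```

Because each page is an independent buffer, a page that has not been read recently could be written out and dropped from memory without affecting its neighbors.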
|
|
648
|
+
|
|
649
|
+
|
|
650
|
+
## Ecosystem
|
|
651
|
+
|
|
652
|
+
* [audio-decode](https://github.com/audiojs/audio-decode) – codec decoding (13+ formats)
|
|
653
|
+
* [audio-encode](https://github.com/audiojs/audio-encode) – codec encoding
|
|
654
|
+
* [audio-filter](https://github.com/audiojs/audio-filter) – filters (weighting, EQ, auditory)
|
|
655
|
+
* [audio-speaker](https://github.com/audiojs/audio-speaker) – audio output
|
|
656
|
+
* [audio-mic](https://github.com/audiojs/audio-mic) – audio input
|
|
657
|
+
* [audio-type](https://github.com/nickolanack/audio-type) – format detection
|
|
658
|
+
* [pcm-convert](https://github.com/nickolanack/pcm-convert) – PCM format conversion
|
|
659
|
+
|
|
660
|
+
<p align="center"><a href="./license.md">MIT</a> · <a href="https://github.com/krishnized/license">ॐ</a></p>
|