audio 2.0.0-0 → 2.0.0

package/README.md CHANGED
@@ -1,52 +1,647 @@
1
- # Audio [![Travis][travis-icon]][travis] [![Gitter][gitter-icon]][gitter]
2
- > Audio in JavaScript.
3
-
4
- An object that enables you to store, read, and write [PCM audio][pcm] data more easily. You can use [utilities][npm-audiojs] for any type of audio manipulation, such as compression or conversion to and from different audio formats. This object works as the building block for audio in JavaScript, and [Audio.js][audiojs] is a suite of common audio utilities using it in streams.
5
-
6
- ```javascript
7
- var test = new Audio({
8
- sampleRate: 44100,
9
- bitDepth: 16,
10
- source: new Buffer(/* ... */),
11
- // more options in docs...
12
- });
13
-
14
- // Read left channel on block 2:
15
- var left = test.read(2, 1);
16
-
17
- // Read right channel on block 3
18
- var right = test.read(3, 2);
19
- ```
20
-
21
- See [the "docs" folder](/docs) for more information on using `Audio`.
22
-
23
- ## Installation
24
- ```shell
25
- $ npm install --save audio
26
- ```
27
- For use in the browser use [Browserify][browserify].
28
-
29
- ## Also See
30
- - [Audio.js][audiojs]: A suite of utilities based around this object.
31
- - [`audio-out`][audio-out]: Output an Audio object to the speaker
32
-
33
- ## Credits
34
- | ![jamen][avatar] |
35
- |:---:|
36
- | [Jamen Marzonie][github] |
37
-
38
- ## License
39
- [MIT](LICENSE) © Jamen Marzonie
40
-
41
- [avatar]: https://avatars.githubusercontent.com/u/6251703?v=3&s=125
42
- [github]: https://github.com/jamen
43
- [travis]: https://travis-ci.org/audiojs/audio
44
- [travis-icon]: https://img.shields.io/travis/audiojs/audio.svg
45
- [gitter]: https://gitter.im/audiojs/audio
46
- [gitter-icon]: https://img.shields.io/gitter/room/audiojs/audio.svg
47
- [browserify]: http://npmjs.com/browserify
48
- [npm-audiojs]: https://www.npmjs.com/browse/keyword/audiojs
49
- [audiojs]: https://github.com/audiojs
50
- [audio-out]: https://github.com/audiojs/out
51
- [pcm]: https://en.wikipedia.org/wiki/Pulse-code_modulation
52
- [node-speaker]: https://github.com/tootallnate/node-speaker
1
+ # <img src="logo.svg" width="20" height="20" alt="audio"> audio [![test](https://github.com/audiojs/audio/actions/workflows/test.yml/badge.svg)](https://github.com/audiojs/audio/actions/workflows/test.yml) [![npm](https://img.shields.io/npm/v/audio?color=white)](https://npmjs.org/package/audio)
2
+
3
+ Audio in JavaScript: load, edit, play, analyze, save, batch-process.
4
+
5
+ ```js
6
+ audio('raw-take.wav')
7
+ .trim(-30)
8
+ .normalize('podcast')
9
+ .fade(0.3, 0.5)
10
+ .save('clean.mp3')
11
+ ```
12
+
13
+ <!-- <img src="preview.svg?v=1" alt="Audiojs demo" width="540"> -->
14
+
15
+ * **Universal Format Support** — fast WASM codecs, no ffmpeg.
16
+ * **Streaming** – instant playback without waiting for the full decode.
17
+ * **Immutable** – instant edits, safe undo/redo, serialization.
18
+ * **Virtual page cache** – open 10 GB+ files, no 2 GB RAM ceiling.
19
+ * **Analysis** — peak, RMS, LUFS, spectrum, clip detection, feature extraction.
20
+ * **Modular** – pluggable ops/stats, autodiscovery, tree-shakable.
21
+ * **CLI** – built-in player, Unix pipelines, globs, tab completion.
22
+ * **Isomorphic** — cross-platform: node, browser, electron, deno, bun.
23
+ * **Audio-first** – talk dB, Hz, LUFS, not samples, indices or byte arrays.
24
+
25
+ <!--
26
+ * [Architecture](docs/architecture.md) – stream-first design, pages & blocks, non-destructive editing, plan compilation
27
+ * [Plugins](docs/plugins.md) custom ops, stats, descriptors (process, plan, resolve, call), persistent ctx
28
+ -->
29
+
30
+ #### [Quick Start](#quick-start) ǀ [Recipes](#recipes) ǀ [API](#api) ǀ [CLI](#cli) ǀ [Plugins](docs/plugins.md) ǀ [Architecture](docs/architecture.md) ǀ [FAQ](#faq) ǀ [Ecosystem](#ecosystem)
31
+
32
+
33
+ ## Quick Start
34
+
35
+ ### Node
36
+
37
+ `npm i audio`
38
+
39
+ ```js
40
+ import audio from 'audio'
41
+ let a = audio('voice.mp3')
42
+ a.trim().normalize('podcast').fade(0.3, 0.5)
43
+ await a.save('clean.mp3')
44
+ ```
45
+
46
+ ### Browser
47
+
48
+ ```html
49
+ <script type="module">
50
+ import audio from './dist/audio.min.js'
51
+ let a = audio('./song.mp3')
52
+ a.trim().normalize().fade(0.5, 2)
53
+ a.clip({ at: 60, duration: 30 }).play() // play the chorus
54
+ </script>
55
+ ```
56
+
57
+ Codecs load on demand via `import()` — map them with an import map or your bundler.
58
+ <details>
59
+ <summary><strong>Import map example</strong></summary>
60
+
61
+
62
+ ```html
63
+ <script type="importmap">
64
+ {
65
+ "imports": {
66
+ "@audio/decode-wav": "https://esm.sh/@audio/decode-wav",
67
+ "@audio/decode-aac": "https://esm.sh/@audio/decode-aac",
68
+ "@audio/decode-aiff": "https://esm.sh/@audio/decode-aiff",
69
+ "@audio/decode-caf": "https://esm.sh/@audio/decode-caf",
70
+ "@audio/decode-webm": "https://esm.sh/@audio/decode-webm",
71
+ "@audio/decode-amr": "https://esm.sh/@audio/decode-amr",
72
+ "@audio/decode-wma": "https://esm.sh/@audio/decode-wma",
73
+ "mpg123-decoder": "https://esm.sh/mpg123-decoder",
74
+ "@wasm-audio-decoders/flac": "https://esm.sh/@wasm-audio-decoders/flac",
75
+ "ogg-opus-decoder": "https://esm.sh/ogg-opus-decoder",
76
+ "@wasm-audio-decoders/ogg-vorbis": "https://esm.sh/@wasm-audio-decoders/ogg-vorbis",
77
+ "qoa-format": "https://esm.sh/qoa-format",
78
+ "@audio/encode-wav": "https://esm.sh/@audio/encode-wav",
79
+ "@audio/encode-mp3": "https://esm.sh/@audio/encode-mp3",
80
+ "@audio/encode-flac": "https://esm.sh/@audio/encode-flac",
81
+ "@audio/encode-opus": "https://esm.sh/@audio/encode-opus",
82
+ "@audio/encode-ogg": "https://esm.sh/@audio/encode-ogg",
83
+ "@audio/encode-aiff": "https://esm.sh/@audio/encode-aiff"
84
+ }
85
+ }
86
+ </script>
87
+ ```
88
+
89
+ </details>
90
+
91
+ ### CLI
92
+
93
+ ```sh
94
+ npm i -g audio
95
+ audio voice.wav trim normalize podcast fade 0.3s -0.5s -o clean.mp3
96
+ ```
97
+
98
+ ## Recipes
99
+
100
+ ### Clean up a recording
101
+
102
+ ```js
103
+ let a = audio('raw-take.wav')
104
+ a.trim(-30).normalize('podcast').fade(0.3, 0.5)
105
+ await a.save('clean.wav')
106
+ ```
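For intuition, threshold-based trimming can be sketched on a raw `Float32Array`. `trimSilence` is a hypothetical helper, not part of the package, and it tests single samples; a real implementation would more likely measure level over windows or blocks.

```javascript
// Hypothetical helper (assumption): strip leading/trailing samples
// whose absolute level is below a dBFS threshold.
function trimSilence(samples, thresholdDb = -30) {
  const threshold = 10 ** (thresholdDb / 20) // dBFS -> linear amplitude
  let start = 0
  let end = samples.length
  while (start < end && Math.abs(samples[start]) < threshold) start++
  while (end > start && Math.abs(samples[end - 1]) < threshold) end--
  return samples.subarray(start, end) // zero-copy view into the same buffer
}

const take = Float32Array.from([0, 0.001, 0.5, -0.4, 0.2, 0.002, 0])
const trimmed = trimSilence(take, -30)
console.log(trimmed.length)
```

Returning a `subarray` view keeps the operation cheap, since no samples are copied.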
107
+
108
+ ### Podcast montage
109
+
110
+ ```js
111
+ let intro = audio('intro.mp3')
112
+ let body = audio('interview.wav')
113
+ let outro = audio('outro.mp3')
114
+
115
+ body.trim().normalize('podcast')
116
+ let ep = audio([intro, body, outro])
117
+ ep.fade(0.5, 2)
118
+ await ep.save('episode.mp3')
119
+ ```
120
+
121
+ ### Render a waveform
122
+
123
+ ```js
124
+ let a = audio('track.mp3')
125
+ let [mins, peaks] = await a.stat(['min', 'max'], { bins: canvas.width })
126
+ let ctx = canvas.getContext('2d'), h = canvas.height // assumes an existing <canvas> element
+ for (let i = 0; i < peaks.length; i++)
127
+ ctx.fillRect(i, h/2 - peaks[i] * h/2, 1, (peaks[i] - mins[i]) * h/2)
128
+ ```
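What a binned `min` / `max` pair represents can be sketched without the library. `minMaxBins` below is a hypothetical stand-in that reduces one channel to per-bin extremes; the package's own binning strategy may differ.

```javascript
// Hypothetical helper (assumption): reduce a channel to `bins` min/max pairs,
// one pair per horizontal pixel of a waveform.
function minMaxBins(samples, bins) {
  const mins = new Float32Array(bins)
  const maxs = new Float32Array(bins)
  for (let b = 0; b < bins; b++) {
    const from = Math.floor((b * samples.length) / bins)
    const to = Math.max(from + 1, Math.floor(((b + 1) * samples.length) / bins))
    let lo = Infinity
    let hi = -Infinity
    for (let i = from; i < to && i < samples.length; i++) {
      if (samples[i] < lo) lo = samples[i]
      if (samples[i] > hi) hi = samples[i]
    }
    mins[b] = lo === Infinity ? 0 : lo // empty bin -> silence
    maxs[b] = hi === -Infinity ? 0 : hi
  }
  return [mins, maxs]
}

const [binMins, binPeaks] = minMaxBins(
  Float32Array.from([0, 0.5, -0.5, 0.25, -1, 1, 0, 0]),
  4
)
console.log(binMins, binPeaks)
```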
129
+
130
+ ### Render as it decodes
131
+
132
+ ```js
133
+ let a = audio('long.flac')
134
+ a.on('data', ({ delta }) => appendBars(delta.max[0], delta.min[0]))
135
+ await a
136
+ ```
137
+
138
+ ### Voiceover on music
139
+
140
+ ```js
141
+ let music = audio('bg.mp3')
142
+ let voice = audio('narration.wav')
143
+ music.gain(-12).mix(voice, { at: 2 })
144
+ await music.save('mixed.wav')
145
+ ```
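Under the hood, an additive mix at an offset amounts to summing samples from a start index. The sketch below assumes mono `Float32Array` inputs and a hypothetical `mixAt` helper; it is not the package's implementation.

```javascript
// Hypothetical helper (assumption): add `overlay` onto `base`,
// starting at `atSeconds`, without mutating either input.
function mixAt(base, overlay, atSeconds, sampleRate) {
  const out = Float32Array.from(base)
  const offset = Math.round(atSeconds * sampleRate)
  for (let i = 0; i < overlay.length && offset + i < out.length; i++) {
    out[offset + i] += overlay[i]
  }
  return out
}

const sampleRate = 4 // tiny rate, just to keep the example readable
const duck = 10 ** (-12 / 20) // -12 dB as a linear multiplier
const music = new Float32Array(12).fill(0.8).map(s => s * duck)
const voice = Float32Array.from([0.5, 0.5])
const mixed = mixAt(music, voice, 2, sampleRate) // voice starts at 2 s
console.log(mixed)
```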
146
+
147
+ ### Split a long file
148
+
149
+ ```js
150
+ let a = audio('audiobook.mp3')
151
+ let [ch1, ch2, ch3] = a.split(1800, 3600)
152
+ for (let [i, ch] of [ch1, ch2, ch3].entries())
153
+ await ch.save(`chapter-${i + 1}.mp3`)
154
+ ```
155
+
156
+ ### Record from mic
157
+
158
+ ```js
159
+ let a = audio()
160
+ a.record()
161
+ await new Promise(r => setTimeout(r, 5000))
162
+ a.stop()
163
+ a.trim().normalize()
164
+ await a.save('recording.wav')
165
+ ```
166
+
167
+ ### Extract features for ML
168
+
169
+ ```js
170
+ let a = audio('speech.wav')
171
+ let mfcc = await a.stat('cepstrum', { bins: 13 })
172
+ let spec = await a.stat('spectrum', { bins: 128 })
173
+ let [loud, rms] = await a.stat(['loudness', 'rms'])
174
+ ```
175
+
176
+ ### Generate a tone
177
+
178
+ ```js
179
+ let a = audio.from(t => Math.sin(440 * Math.PI * 2 * t), { duration: 2 })
180
+ await a.save('440hz.wav')
181
+ ```
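A time-function generator like the one above can be pictured as sampling `fn(t)` at `t = i / sampleRate`. The `render` helper and its 44100 Hz default are assumptions made for illustration.

```javascript
// Hypothetical helper (assumption): sample a function of time into PCM.
function render(fn, { duration, sampleRate = 44100 }) {
  const samples = new Float32Array(Math.round(duration * sampleRate))
  for (let i = 0; i < samples.length; i++) samples[i] = fn(i / sampleRate)
  return samples
}

const tone = render(t => Math.sin(440 * Math.PI * 2 * t), { duration: 2 })
console.log(tone.length)
```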
182
+
183
+ ### Custom op
184
+
185
+ ```js
186
+ audio.op('crush', (chs, ctx) => {
187
+ let steps = 2 ** (ctx.args[0] ?? 8)
188
+ return chs.map(ch => ch.map(s => Math.round(s * steps) / steps))
189
+ })
190
+
191
+ a.crush(4)
192
+ ```
193
+
194
+ ### Serialize and restore
195
+
196
+ ```js
197
+ let json = JSON.stringify(a) // { source, edits, ... }
198
+ let b = audio(JSON.parse(json)) // re-decode + replay edits
199
+ ```
200
+
201
+ ### Remove a section
202
+
203
+ ```js
204
+ let a = audio('interview.wav')
205
+ a.remove({ at: 120, duration: 15 }) // cut 2:00–2:15
206
+ a.fade(0.1, { at: 120 }) // smooth the splice
207
+ await a.save('edited.wav')
208
+ ```
209
+
210
+ ### Ringtone from any song
211
+
212
+ ```js
213
+ let a = audio('song.mp3')
214
+ a.crop({ at: 45, duration: 30 }).fade(0.5, 2).normalize()
215
+ await a.save('ringtone.mp3')
216
+ ```
217
+
218
+ ### Detect clipping
219
+
220
+ ```js
221
+ let a = audio('master.wav')
222
+ let clips = await a.stat('clipping')
223
+ if (clips.length) console.warn(`${clips.length} clipped blocks`)
224
+ ```
225
+
226
+ ### Stream to network
227
+
228
+ ```js
229
+ let a = audio('2hour-mix.flac')
230
+ a.highpass(40).normalize('broadcast')
231
+ for await (let chunk of a) socket.send(chunk[0].buffer)
232
+ ```
233
+
234
+ ### Glitch: stutter + reverse
235
+
236
+ ```js
237
+ let a = audio('beat.wav')
238
+ let v = a.clip({ at: 1, duration: 0.25 })
239
+ let glitch = audio([v, v, v, v])
240
+ glitch.reverse({ at: 0.25, duration: 0.25 })
241
+ await glitch.save('glitch.wav')
242
+ ```
243
+
244
+ ### Tremolo / sidechain
245
+
246
+ ```js
247
+ let a = audio('pad.wav')
248
+ a.gain(t => -12 * (0.5 + 0.5 * Math.cos(t * Math.PI * 4))) // 2Hz tremolo in dB
249
+ await a.save('tremolo.wav')
250
+ ```
251
+
252
+ ### Sonify data
253
+
254
+ ```js
255
+ let prices = [100, 102, 98, 105, 110, 95, 88, 92, 101, 107]
256
+ let a = audio.from(t => {
257
+ let freq = 200 + (prices[Math.min(Math.floor(t / 0.2), prices.length - 1)] - 80) * 10
258
+ return Math.sin(freq * Math.PI * 2 * t) * 0.5
259
+ }, { duration: prices.length * 0.2 })
260
+ await a.save('sonification.wav')
261
+ ```
262
+
263
+
264
+ ## API
265
+
266
+ ### Create
267
+
268
+ * **`audio(source, opts?)`** – decode from file, URL, or bytes. Returns instantly — decodes in background.
269
+ * **`audio.from(source, opts?)`** – wrap existing PCM, AudioBuffer, silence, or function. Sync, no I/O.
270
+
271
+ ```js
272
+ let a = audio('voice.mp3') // file path
273
+ let b = audio('https://cdn.ex/track.mp3') // URL
274
+ let c = audio(inputEl.files[0]) // Blob, File, Response, ArrayBuffer
275
+ let d = audio() // empty, ready for .push() or .record()
276
+ let e = audio([intro, body, outro]) // concat (virtual, no copy)
277
+ // opts: { sampleRate, channels, storage: 'memory' | 'persistent' | 'auto' }
278
+
279
+ await a // wait for the full decode if you need .duration, full stats, etc.
280
+
281
+ let a = audio.from([left, right]) // Float32Array[] channels
282
+ let b = audio.from(3, { channels: 2 }) // 3s silence
283
+ let c = audio.from(t => Math.sin(440 * 2 * Math.PI * t), { duration: 2 }) // generator
284
+ let d = audio.from(audioBuffer) // Web Audio AudioBuffer
285
+ let e = audio.from(int16arr, { format: 'int16' }) // typed array + format
286
+ ```
287
+
288
+
289
+ ### Properties
290
+
291
+ ```js
292
+ // format
293
+ a.duration // total seconds (reflects edits)
294
+ a.channels // channel count
295
+ a.sampleRate // sample rate
296
+ a.length // total samples per channel
297
+
298
+ // playback
299
+ a.currentTime // position in seconds
300
+ a.playing // true during playback
301
+ a.paused // true when paused
302
+ a.volume = -3 // dB (settable)
303
+ a.loop = true // on/off (settable)
304
+ a.recording // true during mic recording
305
+
306
+ // state
307
+ a.ready // promise, resolves when fully decoded
308
+ a.source // original source reference
309
+ a.pages // Float32Array page store
310
+ a.stats // per-block stats (peak, rms, etc.)
311
+ a.edits // edit list (non-destructive ops)
312
+ a.version // increments on each edit
313
+ ```
314
+
315
+ ### Structure
316
+
317
+ Non-destructive time/channel rearrangement. All support `{at, duration, channel}`.
318
+
319
+ * **`.trim(threshold?)`** – strip leading/trailing silence (dB, default auto).
320
+ * **`.crop({at, duration})`** – keep range, discard rest.
321
+ * **`.remove({at, duration})`** – cut range, close gap.
322
+ * **`.insert(source, {at})`** – insert audio or silence (number of seconds) at position.
323
+ * **`.clip({at, duration})`** – zero-copy range reference.
324
+ * **`.split(...offsets)`** – zero-copy split at timestamps.
325
+ * **`.pad(before, after?)`** – silence at edges (seconds).
326
+ * **`.repeat(n)`** – repeat n times.
327
+ * **`.reverse({at?, duration?})`** – reverse audio or range.
328
+ * **`.speed(rate)`** – playback speed (affects pitch and duration).
329
+ * **`.remix(channels)`** – channel count: number or array map (`[1, 0]` swaps L/R).
330
+
331
+ ```js
332
+ a.trim(-30) // strip silence below -30dB
333
+ a.remove({ at: '2m', duration: 15 }) // cut 2:00–2:15, close gap
334
+ a.insert(intro, { at: 0 }) // prepend; .insert(3) appends 3s silence
335
+ let [pt1, pt2] = a.split('30m') // zero-copy views
336
+ let hook = a.clip({ at: 60, duration: 30 }) // zero-copy excerpt
337
+ a.remix([0, 0]) // L→both; .remix(1) for mono
338
+ ```
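The array form of a channel map reads as "output channel `i` takes input channel `map[i]`". A minimal sketch on plain channel arrays follows. The mono branch averages channels, which is an assumption here, and `remix` is a standalone function, not the package method.

```javascript
// Hypothetical standalone version (assumption) of a channel remix.
function remix(channels, map) {
  if (Array.isArray(map)) return map.map(i => channels[i]) // shares the arrays, no copy
  if (map === 1) {
    // Mono downmix by averaging; the library may weight channels differently.
    const mono = new Float32Array(channels[0].length)
    for (const ch of channels) {
      for (let i = 0; i < mono.length; i++) mono[i] += ch[i] / channels.length
    }
    return [mono]
  }
  throw new Error('this sketch only handles an index map or a mono downmix')
}

const left = Float32Array.from([1, 0.5])
const right = Float32Array.from([0, 0.5])
const swapped = remix([left, right], [1, 0]) // swap L/R
const dual = remix([left, right], [0, 0])    // L to both
const mono = remix([left, right], 1)         // average to mono
console.log(swapped, dual, mono)
```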
339
+
340
+ ### Process
341
+
342
+ Amplitude, mixing, normalization. All support `{at, duration, channel}` ranges.
343
+
344
+ * **`.gain(dB, opts?)`** – volume. Number, range, or `t => dB` function. `{ unit: 'linear' }` for multiplier.
345
+ * **`.fade(in, out?, curve?)`** – fade in/out. Curves: `'linear'` `'exp'` `'log'` `'cos'`.
346
+ * **`.normalize(target?)`** – remove DC offset, clamp, and normalize loudness.
347
+ * `'podcast'` – -16 LUFS, -1 dBTP.
348
+ * `'streaming'` – -14 LUFS.
349
+ * `'broadcast'` – -23 LUFS.
350
+ * `-3` – custom dB target (peak mode).
351
+ * no arg – peak 0dBFS.
352
+ * `{ mode: 'rms' }` – RMS normalization. Also `'peak'`, `'lufs'`.
353
+ * `{ ceiling: -1 }` – true peak limiter in dB.
354
+ * `{ dc: false }` – skip DC removal.
355
+ * **`.mix(source, opts?)`** – overlay another audio (additive).
356
+ * **`.pan(value, opts?)`** – stereo balance (−1 left, 0 center, 1 right). Accepts function.
357
+ * **`.write(data, {at?})`** – overwrite samples with raw PCM.
358
+ * **`.transform(fn)`** – inline processor: `(chs, ctx) => chs`. Not serialized.
359
+
360
+ ```js
361
+ a.gain(-3) // reduce 3dB
362
+ a.gain(6, { at: 10, duration: 5 }) // boost range
363
+ a.gain(t => -12 * Math.cos(t * 2 * Math.PI)) // automate over time
364
+ a.fade(0.5, -2, 'exp') // 0.5s in, 2s exp fade-out
365
+ a.normalize('podcast') // -16 LUFS; also 'streaming', 'broadcast'
366
+ a.mix(voice, { at: 2 }) // overlay at 2s
367
+ a.pan(-0.3, { at: 10, duration: 5 }) // pan left for range
368
+ ```
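Since gains and targets are given in dB, it can help to see the dB-to-linear conversion and a plain peak normalization spelled out. This sketch covers only the peak mode idea; it does no DC removal, no limiting and no LUFS measurement, and `dbToGain` / `normalizePeak` are hypothetical names.

```javascript
// dB <-> linear amplitude: gain = 10^(dB / 20)
const dbToGain = db => 10 ** (db / 20)

// Hypothetical helper (assumption): scale so the loudest sample hits targetDb.
function normalizePeak(samples, targetDb = 0) {
  let peak = 0
  for (const s of samples) peak = Math.max(peak, Math.abs(s))
  if (peak === 0) return Float32Array.from(samples) // silence stays silence
  const scale = dbToGain(targetDb) / peak
  return samples.map(s => s * scale) // new array, input untouched
}

const quiet = Float32Array.from([0.1, -0.25, 0.2])
const loud = normalizePeak(quiet, -3) // peak lands at -3 dBFS
console.log(loud)
```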
369
+
370
+ ### Filter
371
+
372
+ Biquad filters, chainable. All support `{at, duration}` ranges.
373
+
374
+ * **`.highpass(freq)`**, **`.lowpass(freq)`** – high-pass / low-pass filter.
375
+ * **`.bandpass(freq, Q?)`**, **`.notch(freq, Q?)`** – band-pass / notch.
376
+ * **`.lowshelf(freq, dB)`**, **`.highshelf(freq, dB)`** – shelf EQ.
377
+ * **`.eq(freq, gain, Q?)`** – parametric EQ.
378
+ * **`.filter(type, ...params)`** – generic dispatch.
379
+
380
+ ```js
381
+ a.highpass(80).lowshelf(200, -3) // rumble + mud
382
+ a.eq(3000, 2, 1.5).highshelf(8000, 3) // presence + air
383
+ a.notch(50) // remove hum
384
+ a.filter(customFn, { cutoff: 2000 }) // custom filter function
385
+ ```
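For readers new to biquads, here is a minimal second-order high-pass on one channel. It uses the widely circulated Audio EQ Cookbook coefficient formulas with a direct form I loop; whether the package computes its coefficients the same way is an assumption, and both function names are hypothetical.

```javascript
// High-pass coefficients (Audio EQ Cookbook form), normalized by a0.
function highpassCoeffs(freq, sampleRate, Q = Math.SQRT1_2) {
  const w0 = (2 * Math.PI * freq) / sampleRate
  const cos = Math.cos(w0)
  const alpha = Math.sin(w0) / (2 * Q)
  const a0 = 1 + alpha
  return {
    b0: (1 + cos) / 2 / a0,
    b1: -(1 + cos) / a0,
    b2: (1 + cos) / 2 / a0,
    a1: (-2 * cos) / a0,
    a2: (1 - alpha) / a0,
  }
}

// Direct form I: y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]
function biquad(samples, { b0, b1, b2, a1, a2 }) {
  const out = new Float32Array(samples.length)
  let x1 = 0, x2 = 0, y1 = 0, y2 = 0
  for (let i = 0; i < samples.length; i++) {
    const x = samples[i]
    const y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
    out[i] = y
    x2 = x1; x1 = x
    y2 = y1; y1 = y
  }
  return out
}

// A constant (DC) input should be rejected by a high-pass filter.
const dc = new Float32Array(4410).fill(1)
const filtered = biquad(dc, highpassCoeffs(1000, 44100))
console.log(filtered[0], filtered[filtered.length - 1])
```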
386
+
387
+ ### I/O
388
+
389
+ Read PCM, encode, stream, push. Format inferred from extension.
390
+
391
+ * **`await .read(opts?)`** – rendered PCM. `{ format, channel }` to convert.
392
+ * **`await .save(path, opts?)`** – encode + write. `{ at, duration }` for sub-range.
393
+ * **`await .encode(format?, opts?)`** – encode to `Uint8Array`.
394
+ * **`for await (let block of a)`** – async-iterable over blocks.
395
+ * **`.clone()`** – deep copy, independent edits, shared pages.
396
+ * **`.push(data, format?)`** – feed PCM into pushable instance. `.stop()` to finalize.
397
+
398
+ ```js
399
+ let pcm = await a.read() // Float32Array[]
400
+ let raw = await a.read({ format: 'int16', channel: 0 })
401
+ await a.save('out.mp3') // format from extension
402
+ let bytes = await a.encode('flac') // Uint8Array
403
+ for await (let block of a) send(block) // stream blocks
404
+ let b = a.clone() // independent copy, shared pages
405
+
406
+ let src = audio() // pushable source
407
+ src.push(buf, 'int16') // feed PCM
408
+ src.stop() // finalize
409
+ ```
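Converting between float and `int16` PCM usually comes down to clamping and scaling. The sketch below shows one common convention (negative values scale by 32768, positive by 32767); the exact scaling the package uses is an assumption, and the helper names are hypothetical.

```javascript
// Hypothetical helpers (assumption): float [-1, 1] <-> 16-bit signed PCM.
function floatToInt16(samples) {
  const out = new Int16Array(samples.length)
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])) // clamp out-of-range input
    out[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff)
  }
  return out
}

function int16ToFloat(samples) {
  const out = new Float32Array(samples.length)
  for (let i = 0; i < samples.length; i++) out[i] = samples[i] / 0x8000
  return out
}

const ints = floatToInt16(Float32Array.from([1, -1, 0, 2, -2]))
const floats = int16ToFloat(ints)
console.log(ints, floats)
```

The round trip is approximate for positive values, because 32767 / 32768 is slightly below 1.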
410
+
411
+ ### Playback / Recording
412
+
413
+ Live playback with dB volume, seeking, looping. Mic recording via `audio-mic`.
414
+
415
+ * **`.play(opts?)`** – start playback. `{ at, duration, volume, loop }`.
416
+ * **`.pause()`**, **`.resume()`**, **`.seek(t)`**, **`.stop()`** – playback control.
417
+ * **`.record(opts?)`** – mic recording. `{ deviceId, sampleRate, channels }`.
418
+
419
+ ```js
420
+ a.play({ at: 30, duration: 10 }) // play 30s–40s
421
+ a.volume = -6; a.loop = true // live adjustments
422
+ a.pause(); a.seek(60); a.resume() // jump to 1:00
423
+ a.stop() // end playback or recording
424
+
425
+ let mic = audio()
426
+ mic.record({ sampleRate: 16000, channels: 1 })
427
+ mic.stop()
428
+ ```
429
+
430
+
431
+ ### Analysis
432
+
433
+ `await .stat(name, opts?)` — without `bins` returns scalar, with `bins` returns `Float32Array`. Array of names returns array of results. Sub-ranges via `{at, duration}`, per-channel via `{channel}`.
434
+
435
+ * **`'db'`** – peak amplitude in dBFS.
436
+ * **`'rms'`** – RMS amplitude (linear).
437
+ * **`'loudness'`** – integrated LUFS (ITU-R BS.1770).
438
+ * **`'dc'`** – DC offset.
439
+ * **`'clipping'`** – clipped samples (scalar: timestamps, binned: counts).
440
+ * **`'silence'`** – silent ranges as `{at, duration}`.
441
+ * **`'max'`**, **`'min'`** – peak envelope (use together for waveform rendering).
442
+ * **`'spectrum'`** – mel-frequency spectrum in dB (A-weighted).
443
+ * **`'cepstrum'`** – MFCCs.
444
+
445
+ ```js
446
+ let loud = await a.stat('loudness') // LUFS
447
+ let [db, clips] = await a.stat(['db', 'clipping']) // multiple at once
448
+ let spec = await a.stat('spectrum', { bins: 128 }) // frequency bins
449
+ let peaks = await a.stat('max', { bins: 800 }) // waveform data
450
+ await a.stat('rms', { channel: 0 }) // left only → number
451
+ await a.stat('rms', { channel: [0, 1] }) // per-channel → [n, n]
452
+ let gaps = await a.stat('silence', { threshold: -40 }) // [{at, duration}, ...]
453
+ ```
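The simpler stats are short formulas. The sketch below computes RMS, peak in dBFS and DC offset directly on one channel, as an illustration of the definitions only; the package derives its stats from per-block data, and these function names are hypothetical.

```javascript
// RMS amplitude (linear): sqrt(mean(x^2))
function rms(samples) {
  let sum = 0
  for (const s of samples) sum += s * s
  return samples.length ? Math.sqrt(sum / samples.length) : 0
}

// Peak level in dBFS: 20 * log10(max |x|); -Infinity for digital silence
function peakDb(samples) {
  let peak = 0
  for (const s of samples) peak = Math.max(peak, Math.abs(s))
  return 20 * Math.log10(peak)
}

// DC offset: mean of the samples
function dcOffset(samples) {
  let sum = 0
  for (const s of samples) sum += s
  return samples.length ? sum / samples.length : 0
}

const square = Float32Array.from([0.5, -0.5, 0.5, -0.5])
console.log(rms(square), peakDb(square), dcOffset(square))
```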
454
+
455
+
456
+ ### Utility
457
+
458
+ Events, lifecycle, undo/redo, serialization.
459
+
460
+ * **`.on(event, fn)`** / **`.off(event?, fn?)`** – subscribe / unsubscribe.
461
+ * `'data'` – pages decoded/pushed. Payload: `{ delta, offset, sampleRate, channels }`.
462
+ * `'change'` – any edit or undo.
463
+ * `'metadata'` – stream header decoded. Payload: `{ sampleRate, channels }`.
464
+ * `'timeupdate'` – playback position. Payload: `currentTime`.
465
+ * `'ended'` – playback finished (not on loop).
466
+ * `'progress'` – during save/encode. Payload: `{ offset, total }` in seconds.
467
+ * **`.dispose()`** – release resources. Supports `using` for auto-dispose.
468
+ * **`.undo(n?)`** – undo last edit(s). Returns edit for redo via `.run()`.
469
+ * **`.run(...edits)`** – apply edit objects `{ type, args, at?, duration? }`. Batch or replay.
470
+
471
+ ```js
472
+ a.on('data', ({ delta }) => draw(delta)) // decode progress
473
+ a.on('timeupdate', t => ui.update(t)) // playback position
474
+
475
+ a.undo() // undo last edit
476
+ b.run(...a.edits) // replay onto another file
477
+ let json = JSON.stringify(a); audio(JSON.parse(json)) // serialize / restore
478
+ ```
479
+
480
+ ### Plugins
481
+
482
+ Extend with custom ops and stats. See [Plugin Tutorial](docs/plugins.md).
483
+
484
+ * **`audio.op(name, fn)`** – register op. Shorthand for `{ process: fn }`. Full descriptor: `{ process, plan, resolve, call }`.
485
+ * **`audio.op(name)`** – query descriptor. **`audio.op()`** – all ops.
486
+ * **`audio.stat(name, descriptor)`** – register stat. Shorthand `(chs, ctx) => [...]` or `{ block, reduce, query }`.
487
+
488
+ ```js
489
+ // op: process function receives (channels[], ctx) per 1024-sample block
490
+ audio.op('crush', (chs, ctx) => {
491
+ let steps = 2 ** (ctx.args[0] ?? 8)
492
+ return chs.map(ch => ch.map(s => Math.round(s * steps) / steps))
493
+ })
494
+
495
+ // stat: block function collects per-block, reduce enables scalar queries
496
+ audio.stat('peak', {
497
+ block: (chs) => chs.map(ch => { let m = 0; for (let s of ch) m = Math.max(m, Math.abs(s)); return m }),
498
+ reduce: (src, from, to) => { let m = 0; for (let i = from; i < to; i++) m = Math.max(m, src[i]); return m },
499
+ })
500
+
501
+ a.crush(4) // chainable like built-in ops
502
+ a.stat('peak') // → scalar from reduce
503
+ a.stat('peak', { bins: 100 }) // → binned array
504
+ ```
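The `block` / `reduce` split can be illustrated without the library: stats are collected once per block, then scalar and binned queries only touch the small per-block array. The 1024-sample block size comes from the comment above; the helper names and the binning rule are assumptions.

```javascript
const BLOCK = 1024 // block size mentioned above

// "block" step: one peak value per 1024-sample block.
function collectPeaks(samples) {
  const peaks = []
  for (let from = 0; from < samples.length; from += BLOCK) {
    let m = 0
    for (const s of samples.subarray(from, from + BLOCK)) m = Math.max(m, Math.abs(s))
    peaks.push(m)
  }
  return peaks
}

// "reduce" step: combine per-block values over a block range.
function reducePeak(peaks, from = 0, to = peaks.length) {
  let m = 0
  for (let i = from; i < to; i++) m = Math.max(m, peaks[i])
  return m
}

// Binned query (assumption): reduce each slice of blocks into one bin.
function binPeaks(peaks, bins) {
  const out = new Float32Array(bins)
  for (let b = 0; b < bins; b++) {
    const from = Math.floor((b * peaks.length) / bins)
    const to = Math.max(from + 1, Math.floor(((b + 1) * peaks.length) / bins))
    out[b] = reducePeak(peaks, from, Math.min(to, peaks.length))
  }
  return out
}

const pcm = new Float32Array(4096)
pcm[100] = 0.25   // falls in block 0
pcm[3000] = -0.75 // falls in block 2
const blockPeaks = collectPeaks(pcm)
console.log(blockPeaks, reducePeak(blockPeaks), binPeaks(blockPeaks, 2))
```

Because queries run over per-block values rather than raw samples, a range or binned lookup stays cheap even for long files.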
505
+
506
+ ## CLI
507
+
508
+
509
+ ```sh
510
+ npx audio [file] [ops...] [-o output] [options]
511
+
512
+ # ops
513
+ eq mix pad pan crop
514
+ fade gain stat trim notch
515
+ remix speed split insert remove
516
+ repeat bandpass highpass lowpass reverse
517
+ lowshelf highshelf normalize
518
+ ```
519
+
520
+
521
+ `-o` output · `-p` play · `-f` force · `--format` · `--verbose` · `+` concat
522
+
523
+ ### Playback
524
+
525
+
526
+ <img src="player.gif" alt="Audiojs demo" width="624">
527
+
528
+ <!-- ```sh
529
+ npx audio kirtan.mp3
530
+ ▶ 0:06:37 ━━━━━━━━────────────────────────────────────────── -0:36:30 ▁▂▃▄▅__
531
+ ▂▅▇▇██▇▆▇▇▇██▆▇▇▇▆▆▅▅▆▅▆▆▅▅▆▅▅▅▃▂▂▂▂▁_____________
532
+ 50 500 1k 2k 5k 10k 20k
533
+
534
+ 48k 2ch 43:07 -0.8dBFS -30.8LUFS
535
+ ``` -->
536
+
537
+ <kbd>␣</kbd> pause · <kbd>←</kbd>/<kbd>→</kbd> seek ±10s · <kbd>⇧←</kbd>/<kbd>⇧→</kbd> seek ±60s · <kbd>↑</kbd>/<kbd>↓</kbd> volume ±3dB · <kbd>l</kbd> loop · <kbd>q</kbd> quit
538
+
539
+
540
+
541
+ ### Edit
542
+
543
+ ```sh
544
+ # clean up
545
+ npx audio raw-take.wav trim -30db normalize podcast fade 0.3s -0.5s -o clean.wav
546
+
547
+ # ranges
548
+ npx audio in.wav gain -3db 1s..10s -o out.wav
549
+
550
+ # filter chain
551
+ npx audio in.mp3 highpass 80hz lowshelf 200hz -3db -o out.wav
552
+
553
+ # join
554
+ npx audio intro.mp3 + content.wav + outro.mp3 trim normalize fade 0.5s -2s -o ep.mp3
555
+
556
+ # voiceover
557
+ npx audio bg.mp3 gain -12db mix narration.wav 2s -o mixed.wav
558
+
559
+ # split
560
+ npx audio audiobook.mp3 split 30m 60m -o 'chapter-{i}.mp3'
561
+ ```
562
+
563
+ ### Analysis
564
+
565
+ ```sh
566
+ # all default stats (db, rms, loudness, clipping, dc)
567
+ npx audio speech.wav stat
568
+
569
+ # specific stats
570
+ npx audio speech.wav stat loudness rms
571
+
572
+ # spectrum / cepstrum with bin count
573
+ npx audio speech.wav stat spectrum 128
574
+ npx audio speech.wav stat cepstrum 13
575
+
576
+ # stat after transforms
577
+ npx audio speech.wav gain -3db stat db
578
+ ```
579
+
580
+ ### Batch
581
+
582
+ ```sh
583
+ npx audio '*.wav' trim normalize podcast -o '{name}.clean.{ext}'
584
+ npx audio '*.wav' gain -3db -o '{name}.out.{ext}'
585
+ ```
586
+
587
+ ### Stdin/stdout
588
+
589
+ ```sh
590
+ cat in.wav | audio gain -3db > out.wav
591
+ curl -s https://example.com/speech.mp3 | npx audio normalize -o clean.wav
592
+ ffmpeg -i video.mp4 -f wav - | npx audio trim normalize podcast > voice.wav
593
+ ```
594
+
595
+ ### Tab completion
596
+
597
+ ```sh
598
+ eval "$(audio --completions zsh)" # add to ~/.zshrc
599
+ eval "$(audio --completions bash)" # add to ~/.bashrc
600
+ audio --completions fish | source # fish
601
+ ```
602
+
603
+
604
+
605
+
606
+ ## FAQ
607
+
608
+ <dl>
609
+ <dt>How is this different from Web Audio API?</dt>
610
+ <dd>Web Audio API is a real-time graph for playback and synthesis. This is for loading, editing, analyzing, and saving audio files. They work well together. For Web Audio API in Node, see <a href="https://github.com/audiojs/web-audio-api">web-audio-api</a>.</dd>
611
+
612
+ <dt>What formats are supported?</dt>
613
+ <dd>Decode: WAV, MP3, FLAC, OGG Vorbis, Opus, AAC, AIFF, CAF, WebM, AMR, WMA, QOA via <a href="https://github.com/audiojs/audio-decode">audio-decode</a>. Encode: WAV, MP3, FLAC, Opus, OGG, AIFF via <a href="https://github.com/audiojs/audio-encode">audio-encode</a>. Codecs are WASM-based, lazy-loaded on first use.</dd>
614
+
615
+ <dt>Does it need ffmpeg or native addons?</dt>
616
+ <dd>No, pure JS + WASM. For CLI, you can install globally: <code>npm i -g audio</code>.</dd>
617
+
618
+ <dt>How big is the bundle?</dt>
619
+ <dd>~20K gzipped core. Codecs load on demand via <code>import()</code>, so unused formats aren't fetched.</dd>
620
+
621
+ <dt>How does it handle large files?</dt>
622
+ <dd>Audio is stored in fixed-size pages. In the browser, cold pages can be evicted to OPFS when memory exceeds the budget. Stats stay resident (~7 MB for 2h stereo).</dd>
623
+
624
+ <dt>Are edits destructive?</dt>
625
+ <dd>No. <code>a.gain(-3).trim()</code> pushes entries to an edit list — source pages aren't touched. Edits replay on <code>read()</code> / <code>save()</code> / <code>for await</code>.</dd>
626
+
627
+ <dt>Can I use it in the browser?</dt>
628
+ <dd>Yes, same API. See <a href="#browser">Browser</a> for bundle options and import maps.</dd>
629
+
630
+ <dt>Does it need the full file before I can work with it?</dt>
631
+ <dd>No, playback and edits work during decode. The <code>'data'</code> event fires as pages arrive.</dd>
632
+
633
+ <dt>TypeScript?</dt>
634
+ <dd>Yes, ships with <code>audio.d.ts</code>.</dd>
635
+ </dl>
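The non-destructive model described in the FAQ can be pictured as a list of edit records that is replayed over untouched source samples on every read. The toy `EditList` class below is a hypothetical illustration of that idea, loosely mirroring the `{ type, args }` edit shape; it is not the package's architecture.

```javascript
// Toy op table (assumption): each op maps samples to new samples.
const OPS = {
  gain: (s, db) => s.map(x => x * 10 ** (db / 20)),
  reverse: s => Float32Array.from(s).reverse(),
}

class EditList {
  constructor(source) {
    this.source = source // never mutated
    this.edits = []      // plain objects, so JSON-serializable
    this.version = 0     // bumps on every edit or undo
  }
  push(edit) {
    this.edits.push(edit)
    this.version++
    return this // chainable
  }
  gain(db) { return this.push({ type: 'gain', args: [db] }) }
  reverse() { return this.push({ type: 'reverse', args: [] }) }
  undo() {
    if (!this.edits.length) return undefined
    this.version++
    return this.edits.pop() // returned so it can be re-applied later
  }
  read() {
    // Replay all edits over a copy of the source.
    let out = Float32Array.from(this.source)
    for (const { type, args } of this.edits) out = OPS[type](out, ...args)
    return out
  }
}

const source = Float32Array.from([0.1, 0.2, 0.4])
const clip = new EditList(source).gain(-6).reverse()
const rendered = clip.read()   // gain, then reverse
const undone = clip.undo()     // drops the reverse edit
const afterUndo = clip.read()  // gain only
console.log(rendered, undone, afterUndo)
```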
636
+
637
+
638
+ ## Ecosystem
639
+
640
+ * [audio-decode](https://github.com/audiojs/audio-decode) – codec decoding (13+ formats)
641
+ * [encode-audio](https://github.com/audiojs/audio-encode) – codec encoding
642
+ * [audio-filter](https://github.com/audiojs/audio-filter) – filters (weighting, EQ, auditory)
643
+ * [audio-speaker](https://github.com/audiojs/audio-speaker) – audio output (Node)
644
+ * [audio-type](https://github.com/nickolanack/audio-type) – format detection
645
+ * [pcm-convert](https://github.com/nickolanack/pcm-convert) – PCM format conversion
646
+
647
+ <p align="center"><a href="./license.md">MIT</a> · <a href="https://github.com/krishnized/license">ॐ</a></p>