npm - @drakulavich/parakeet-cli - Versions diffs - 0.5.3 → 0.6.0 - Mend

@drakulavich/parakeet-cli 0.5.3 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4) hide show

package/README.md +40 -124
package/package.json +1 -1
package/src/models.ts +5 -0
package/src/__tests__/audio.test.ts +0 -28

package/README.md CHANGED Viewed

@@ -1,172 +1,88 @@
-# parakeet-cli
+# 🦜 parakeet-cli
 [![npm version](https://img.shields.io/npm/v/@drakulavich/parakeet-cli)](https://www.npmjs.com/package/@drakulavich/parakeet-cli)
 [![CI](https://github.com/drakulavich/parakeet-cli/actions/workflows/ci.yml/badge.svg)](https://github.com/drakulavich/parakeet-cli/actions/workflows/ci.yml)
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
 [![Bun](https://img.shields.io/badge/runtime-Bun-f9f1e1?logo=bun)](https://bun.sh)
+[![Test Report](https://img.shields.io/badge/Test_Report-Allure-orange)](https://drakulavich.github.io/parakeet-cli/reports/allure)
-Fast multilingual speech-to-text CLI powered by NVIDIA Parakeet models. Zero Python. CoreML on Apple Silicon, ONNX on CPU.
+Fast local speech-to-text. 25 languages. ~18x faster than Whisper on Apple Silicon.
-## Features
+- **CoreML on Apple Silicon** — ~155x real-time via [FluidAudio](https://github.com/FluidInference/FluidAudio)
+- **ONNX on CPU** — cross-platform fallback, 3x faster than Whisper
+- **Any audio format** — ffmpeg handles OGG, MP3, WAV, FLAC, M4A
+- **Zero Python** — Bun + TypeScript, native Swift binary for CoreML
-- **25 languages** — automatic language detection, no prompting needed
-- **~155x real-time on Apple Silicon** — CoreML backend via [FluidAudio](https://github.com/FluidInference/FluidAudio) (1 min audio in ~0.4s)
-- **3x faster than Whisper** on CPU with ONNX fallback (see [benchmark](#benchmark))
-- **Zero Python** — pure TypeScript/Bun, native Swift binary for CoreML
-- **Smart install** — `parakeet install` auto-detects platform: CoreML on macOS arm64, ONNX elsewhere
-- **Any audio format** — ffmpeg handles OGG, MP3, WAV, FLAC, M4A, etc.
-## Install
-Using Bun (recommended):
+## Quick Start
 ```bash
 bun install -g @drakulavich/parakeet-cli
+parakeet install          # CoreML on macOS arm64, ONNX elsewhere
+parakeet audio.ogg        # → transcript to stdout
 ```
-Using npm (requires Bun runtime installed):
-```bash
-npm install -g @drakulavich/parakeet-cli
-```
-Or clone and link locally:
-```bash
-git clone https://github.com/drakulavich/parakeet-cli.git
-cd parakeet-cli
-bun install
-bun link
-```
-> **Note:** Bun is required as the runtime — the CLI uses Bun-native APIs and TypeScript execution. You can use either `bun` or `npm` as the package manager to install it, but Bun must be available in PATH to run the `parakeet` command.
 ## Usage
 ```bash
-# Download backend (required before first use)
-# On macOS Apple Silicon: downloads CoreML binary
-# On Linux/other: downloads ONNX models (~3GB)
-parakeet install
+parakeet install                 # auto-detect backend
+parakeet install --coreml        # force CoreML (macOS arm64)
+parakeet install --onnx          # force ONNX (~3GB)
+parakeet audio.ogg               # transcribe (language auto-detected)
+parakeet --version
+```
-# Force a specific backend
-parakeet install --coreml    # CoreML (macOS arm64 only)
-parakeet install --onnx      # ONNX (any platform)
+Stdout: transcript. Stderr: errors. Pipe-friendly.
-# Transcribe any audio file (language auto-detected)
-parakeet audio.ogg
+## Requirements
-# Force re-download
-parakeet install --no-cache
+- [Bun](https://bun.sh) >= 1.3
+- [ffmpeg](https://ffmpeg.org) in PATH (ONNX backend only)
+- ~3GB disk (ONNX models)
-# Show version
-parakeet --version
-```
+## Supported Languages
-Output goes to stdout, errors to stderr. Designed for piping and scripting.
+:bulgaria: Bulgarian, :croatia: Croatian, :czech_republic: Czech, :denmark: Danish, :netherlands: Dutch, :gb: English, :estonia: Estonian, :finland: Finnish, :fr: French, :de: German, :greece: Greek, :hungary: Hungarian, :it: Italian, :latvia: Latvian, :lithuania: Lithuanian, :malta: Maltese, :poland: Polish, :portugal: Portuguese, :romania: Romanian, :ru: Russian, :slovakia: Slovak, :slovenia: Slovenian, :es: Spanish, :sweden: Swedish, :ukraine: Ukrainian
 ## Benchmark
-10 Telegram voice messages (Russian, 3-10s each) on MacBook Pro M3 Pro:
+MacBook Pro M3 Pro — 10 Russian voice messages:
 | | faster-whisper (CPU) | Parakeet (CoreML) |
 |---|---|---|
-| **Total time** | 35.3s | 1.9s |
-| **Speedup** | | **~18x faster** |
-Models: faster-whisper medium (int8) vs Parakeet TDT 0.6B v3 (CoreML, Apple Neural Engine).
-See [BENCHMARK.md](BENCHMARK.md) for full results with transcripts. Updated automatically on each release.
-## Supported Languages
+| **Total** | 35.3s | 1.9s |
+| **Speed** | | **~18x faster** |
-Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Ukrainian.
+Full results with transcripts: [BENCHMARK.md](BENCHMARK.md)
 ## How It Works
-### CoreML backend (macOS Apple Silicon)
 ```
 parakeet audio.ogg
-  |
-  +-- parakeet-coreml (Swift binary via FluidAudio)
-  |   +-- CoreML inference on Apple Neural Engine
-  |   +-- ~155x real-time on M4 Pro
-  |
-  stdout: transcript
-```
-Uses [FluidAudio](https://github.com/FluidInference/FluidAudio) with the [CoreML model](https://huggingface.co/FluidInference/parakeet-tdt-0.6b-v3-coreml). CoreML model files are downloaded by FluidAudio on first transcription.
-### ONNX backend (cross-platform fallback)
+  ├── CoreML installed? → parakeet-coreml subprocess → stdout
+  └── ONNX installed?   → ffmpeg → mel → encoder → decoder → stdout
 ```
-parakeet audio.ogg
-  |
-  +-- ffmpeg: any format -> 16kHz mono float32
-  +-- nemo128.onnx: waveform -> 128-dim log-mel spectrogram
-  +-- per-utterance normalization (mean=0, std=1)
-  +-- encoder-model.onnx: mel features -> encoder output
-  +-- TDT greedy decoder: encoder output -> token IDs + durations
-  +-- vocab.txt: token IDs -> text
-  |
-  stdout: transcript
-```
-Uses [NVIDIA Parakeet TDT 0.6B v3](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3) exported to ONNX by [istupakov](https://huggingface.co/istupakov/parakeet-tdt-0.6b-v3-onnx). Run `parakeet install --onnx` to download models from HuggingFace (~3GB).
-## Requirements
-- [Bun](https://bun.sh) >= 1.3 (runtime)
-- [ffmpeg](https://ffmpeg.org) installed and in PATH
-- ~3GB disk space for model cache
-- npm or Bun can be used as the package manager
-### macOS (Apple Silicon)
-Works natively on M1/M2/M3/M4 with CoreML acceleration. Install dependencies with Homebrew:
-```bash
-brew install ffmpeg
-curl -fsSL https://bun.sh/install | bash
-bun install -g @drakulavich/parakeet-cli    # or: npm install -g @drakulavich/parakeet-cli
-parakeet install                             # downloads CoreML binary
-```
-### Linux
-```bash
-apt install ffmpeg   # or yum, pacman, etc.
-curl -fsSL https://bun.sh/install | bash
-bun install -g @drakulavich/parakeet-cli    # or: npm install -g @drakulavich/parakeet-cli
-parakeet install                             # downloads ONNX models (~3GB)
-```
+- **CoreML**: Swift binary wraps [FluidAudio](https://github.com/FluidInference/FluidAudio) + [CoreML model](https://huggingface.co/FluidInference/parakeet-tdt-0.6b-v3-coreml)
+- **ONNX**: [NVIDIA Parakeet TDT 0.6B v3](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3) via [onnxruntime-node](https://www.npmjs.com/package/onnxruntime-node)
 ## OpenClaw Integration
-To use parakeet as the voice transcription engine in [OpenClaw](https://docs.openclaw.ai), update `~/.openclaw/openclaw.json`:
+Drop-in replacement for OpenClaw voice processing — no API keys, runs locally.
 ```json
-"tools": {
-  "media": {
-    "audio": {
-      "enabled": true,
-      "models": [
-        {
-          "type": "cli",
-          "command": "parakeet",
-          "args": ["{{MediaPath}}"],
-          "timeoutSeconds": 120
-        }
-      ],
-      "echoTranscript": false
+{
+  "tools": {
+    "media": {
+      "audio": {
+        "enabled": true,
+        "models": [{"type": "cli", "command": "parakeet", "args": ["{{MediaPath}}"], "timeoutSeconds": 120}],
+        "echoTranscript": false
+      }
     }
   }
 }
 ```
-Then restart the gateway: `openclaw gateway restart`
 ## License
 MIT

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@drakulavich/parakeet-cli",
-  "version": "0.5.3",
+  "version": "0.6.0",
   "description": "Fast multilingual speech-to-text CLI powered by NVIDIA Parakeet ONNX models",
   "type": "module",
   "bin": {

package/src/models.ts CHANGED Viewed

@@ -1,6 +1,7 @@
 import { join, dirname } from "path";
 import { homedir } from "os";
 import { existsSync, mkdirSync, chmodSync } from "fs";
+import { isCoreMLInstalled } from "./coreml";
 export const HF_REPO = "istupakov/parakeet-tdt-0.6b-v3-onnx";
@@ -21,6 +22,10 @@ export function isModelCached(dir?: string): boolean {
   return MODEL_FILES.every((f) => existsSync(join(d, f)));
 }
+export function isModelInstalled(modelDir?: string): boolean {
+  return isCoreMLInstalled() || isModelCached(modelDir);
+}
 export function installHintError(headline: string): Error {
   const lines = [
     headline,

package/src/__tests__/audio.test.ts DELETED Viewed

@@ -1,28 +0,0 @@
-import { describe, test, expect } from "bun:test";
-import { convertToFloat32PCM } from "../audio";
-import { spawnSync } from "child_process";
-const hasFfmpeg = spawnSync("which", ["ffmpeg"]).status === 0;
-describe.skipIf(!hasFfmpeg)("audio", () => {
-  test("converts WAV to 16kHz mono Float32Array", async () => {
-    const buffer = await convertToFloat32PCM("fixtures/silence.wav");
-    expect(buffer).toBeInstanceOf(Float32Array);
-    // 1 second at 16kHz = 16000 samples
-    expect(buffer.length).toBeGreaterThan(15000);
-    expect(buffer.length).toBeLessThan(17000);
-  });
-  test("throws on missing file", async () => {
-    expect(convertToFloat32PCM("nonexistent.wav")).rejects.toThrow(
-      "file not found"
-    );
-  });
-  test("throws on corrupt file", async () => {
-    await Bun.write("fixtures/corrupt.bin", "not audio data");
-    expect(convertToFloat32PCM("fixtures/corrupt.bin")).rejects.toThrow(
-      "failed to convert audio"
-    );
-  });
-});