react-native-sherpa-onnx 0.1.0 → 0.2.0

This diff compares publicly released package versions as they appear in their respective public registries. It is provided for informational purposes only.
Files changed (62)
  1. package/README.md +236 -402
  2. package/android/src/main/cpp/CMakeLists.txt +129 -121
  3. package/android/src/main/cpp/jni/sherpa-onnx-common.h +18 -0
  4. package/android/src/main/cpp/jni/sherpa-onnx-model-detect.cpp +541 -0
  5. package/android/src/main/cpp/jni/sherpa-onnx-model-detect.h +89 -0
  6. package/android/src/main/cpp/jni/sherpa-onnx-stt-jni.cpp +336 -0
  7. package/android/src/main/cpp/jni/sherpa-onnx-stt-wrapper.cpp +222 -0
  8. package/{ios/sherpa-onnx-wrapper.h → android/src/main/cpp/jni/sherpa-onnx-stt-wrapper.h} +23 -12
  9. package/android/src/main/cpp/jni/sherpa-onnx-tts-jni.cpp +823 -0
  10. package/android/src/main/cpp/jni/sherpa-onnx-tts-wrapper.cpp +387 -0
  11. package/android/src/main/cpp/jni/sherpa-onnx-tts-wrapper.h +147 -0
  12. package/android/src/main/java/com/sherpaonnx/SherpaOnnxCoreHelper.kt +236 -0
  13. package/android/src/main/java/com/sherpaonnx/SherpaOnnxModule.kt +483 -316
  14. package/android/src/main/java/com/sherpaonnx/SherpaOnnxSttHelper.kt +109 -0
  15. package/android/src/main/java/com/sherpaonnx/SherpaOnnxTtsHelper.kt +668 -0
  16. package/ios/SherpaOnnx+STT.mm +118 -0
  17. package/ios/SherpaOnnx+TTS.mm +712 -0
  18. package/ios/SherpaOnnx.h +6 -5
  19. package/ios/SherpaOnnx.mm +311 -293
  20. package/ios/sherpa-onnx-common.h +18 -0
  21. package/ios/sherpa-onnx-model-detect.h +89 -0
  22. package/ios/sherpa-onnx-model-detect.mm +441 -0
  23. package/ios/sherpa-onnx-stt-wrapper.h +48 -0
  24. package/ios/sherpa-onnx-stt-wrapper.mm +201 -0
  25. package/ios/sherpa-onnx-tts-wrapper.h +85 -0
  26. package/ios/sherpa-onnx-tts-wrapper.mm +345 -0
  27. package/lib/module/NativeSherpaOnnx.js.map +1 -1
  28. package/lib/module/index.js +11 -13
  29. package/lib/module/index.js.map +1 -1
  30. package/lib/module/stt/index.js +46 -43
  31. package/lib/module/stt/index.js.map +1 -1
  32. package/lib/module/tts/index.js +324 -36
  33. package/lib/module/tts/index.js.map +1 -1
  34. package/lib/module/tts/types.js +4 -0
  35. package/lib/module/tts/types.js.map +1 -0
  36. package/lib/module/utils.js +66 -34
  37. package/lib/module/utils.js.map +1 -1
  38. package/lib/typescript/src/NativeSherpaOnnx.d.ts +169 -3
  39. package/lib/typescript/src/NativeSherpaOnnx.d.ts.map +1 -1
  40. package/lib/typescript/src/index.d.ts +1 -3
  41. package/lib/typescript/src/index.d.ts.map +1 -1
  42. package/lib/typescript/src/stt/index.d.ts +14 -5
  43. package/lib/typescript/src/stt/index.d.ts.map +1 -1
  44. package/lib/typescript/src/tts/index.d.ts +227 -23
  45. package/lib/typescript/src/tts/index.d.ts.map +1 -1
  46. package/lib/typescript/src/tts/types.d.ts +188 -0
  47. package/lib/typescript/src/tts/types.d.ts.map +1 -0
  48. package/lib/typescript/src/utils.d.ts +32 -0
  49. package/lib/typescript/src/utils.d.ts.map +1 -1
  50. package/package.json +222 -221
  51. package/scripts/setup-assets.js +323 -323
  52. package/scripts/switch-registry.js +8 -8
  53. package/src/NativeSherpaOnnx.ts +251 -44
  54. package/src/index.tsx +27 -30
  55. package/src/stt/index.ts +89 -83
  56. package/src/tts/index.ts +458 -67
  57. package/src/tts/types.ts +218 -0
  58. package/src/utils.ts +131 -97
  59. package/android/src/main/cpp/jni/sherpa-onnx-jni.cpp +0 -129
  60. package/android/src/main/cpp/jni/sherpa-onnx-wrapper.cpp +0 -649
  61. package/android/src/main/cpp/jni/sherpa-onnx-wrapper.h +0 -56
  62. package/ios/sherpa-onnx-wrapper.mm +0 -432
package/README.md CHANGED
@@ -1,402 +1,236 @@
1
- # react-native-sherpa-onnx
2
-
3
- React Native SDK for sherpa-onnx - providing offline speech processing capabilities
4
-
5
- [![npm version](https://img.shields.io/npm/v/react-native-sherpa-onnx.svg)](https://www.npmjs.com/package/react-native-sherpa-onnx)
6
- [![npm downloads](https://img.shields.io/npm/dm/react-native-sherpa-onnx.svg)](https://www.npmjs.com/package/react-native-sherpa-onnx)
7
- [![npm license](https://img.shields.io/npm/l/react-native-sherpa-onnx.svg)](https://www.npmjs.com/package/react-native-sherpa-onnx)
8
- [![Android](https://img.shields.io/badge/Android-Supported-green)](https://www.android.com/)
9
- [![iOS](https://img.shields.io/badge/iOS-Supported-blue)](https://www.apple.com/ios/)
10
-
11
- A React Native TurboModule that provides offline speech processing capabilities using [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx). The SDK aims to support all functionalities that sherpa-onnx offers, including offline speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD (Voice Activity Detection).
12
-
13
- ## Feature Support
14
-
15
- | Feature | Status |
16
- |---------|--------|
17
- | Offline Speech-to-Text | ✅ Supported |
18
- | Text-to-Speech | Not yet supported |
19
- | Speaker Diarization | ❌ Not yet supported |
20
- | Speech Enhancement | ❌ Not yet supported |
21
- | Source Separation | ❌ Not yet supported |
22
- | VAD (Voice Activity Detection) | ❌ Not yet supported |
23
-
24
- ## Platform Support Status
25
-
26
- | Platform | Status | Notes |
27
- |----------|--------|-------|
28
- | **Android** | ✅ **Production Ready** | Fully tested, CI/CD automated, multiple models supported |
29
- | **iOS** | 🟡 **Beta / Experimental** | XCFramework + Podspec ready<br/>✅ GitHub Actions builds pass<br/>❌ **No local Xcode testing** *(Windows-only dev)* |
30
-
31
- ### 🔧 **iOS Contributors WANTED!** 🙌
32
-
33
- **Full iOS support is a priority!** Help bring sherpa-onnx to iOS devices.
34
-
35
- **What's ready:**
36
- - ✅ XCFramework integration
37
- - ✅ Podspec configuration
38
- - ✅ GitHub Actions CI (macOS runner)
39
- - ✅ TypeScript bindings
40
-
41
- **What's needed:**
42
- - **Local Xcode testing** (Simulator + Device)
43
- - **iOS example app** (beyond CI)
44
- - **TurboModule iOS testing**
45
- - **Edge case testing**
46
-
47
- ## Supported Model Types
48
-
49
- | Model Type | `modelType` Value | Description | Download Links |
50
- | ------------------------ | ----------------- | ---------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------ |
51
- | **Zipformer/Transducer** | `'transducer'` | Requires `encoder.onnx`, `decoder.onnx`, `joiner.onnx`, and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html) |
52
- | **Paraformer** | `'paraformer'` | Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html) |
53
- | **NeMo CTC** | `'nemo_ctc'` | Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/index.html) |
54
- | **Whisper** | `'whisper'` | Requires `encoder.onnx`, `decoder.onnx`, and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html) |
55
- | **WeNet CTC** | `'wenet_ctc'` | Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/wenet/index.html) |
56
- | **SenseVoice** | `'sense_voice'` | Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/sense-voice/index.html) |
57
- | **FunASR Nano** | `'funasr_nano'` | Requires `encoder_adaptor.onnx`, `llm.onnx`, `embedding.onnx`, and `tokenizer` directory | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/funasr-nano/index.html) |
58
-
59
- ## Features
60
-
61
- - ✅ **Offline Speech-to-Text** - No internet connection required for speech recognition
62
- - ✅ **Multiple Model Types** - Supports Zipformer/Transducer, Paraformer, NeMo CTC, Whisper, WeNet CTC, SenseVoice, and FunASR Nano models
63
- - ✅ **Model Quantization** - Automatic detection and preference for quantized (int8) models
64
- - **Flexible Model Loading** - Asset models, file system models, or auto-detection
65
- - **Android Support** - Fully supported on Android
66
- - **iOS Support** - Fully supported on iOS (requires sherpa-onnx XCFramework)
67
- - **TypeScript Support** - Full TypeScript definitions included
68
- - 🚧 **Additional Features Coming Soon** - Text-to-Speech, Speaker Diarization, Speech Enhancement, Source Separation, and VAD support are planned for future releases
69
-
70
- ## Installation
71
-
72
- ```sh
73
- npm install react-native-sherpa-onnx
74
- ```
75
-
76
- If your project uses Yarn (v3+) or Plug'n'Play, configure Yarn to use the Node Modules linker to avoid postinstall issues:
77
-
78
- ```yaml
79
- # .yarnrc.yml
80
- nodeLinker: node-modules
81
- ```
82
-
83
- Alternatively, set the environment variable during install:
84
-
85
- ```sh
86
- YARN_NODE_LINKER=node-modules yarn install
87
- ```
88
-
89
- ### Android
90
-
91
- No additional setup required. The library automatically handles native dependencies via Gradle.
92
-
93
- ### iOS
94
-
95
- The sherpa-onnx XCFramework is **not included in the repository or npm package** due to its size (~80MB), but **no manual action is required**! The framework is automatically downloaded during `pod install`.
96
-
97
- #### Quick Setup
98
-
99
- ```sh
100
- cd example
101
- bundle install
102
- bundle exec pod install --project-directory=ios
103
- ```
104
-
105
- That's it! The `Podfile` automatically:
106
- 1. Copies required header files from the git submodule
107
- 2. Downloads the latest XCFramework from [GitHub Releases](https://github.com/XDcobra/react-native-sherpa-onnx/releases?q=framework)
108
- 3. Verifies everything is in place before building
109
-
110
- #### For Advanced Users: Building the Framework Locally
111
-
112
- If you want to build the XCFramework yourself instead of using the prebuilt release:
113
-
114
- ```sh
115
- # Clone sherpa-onnx repository
116
- git clone https://github.com/k2-fsa/sherpa-onnx.git
117
- cd sherpa-onnx
118
- git checkout v1.12.23
119
-
120
- # Build the iOS XCFramework (requires macOS, Xcode, CMake, and ONNX Runtime)
121
- ./build-ios.sh
122
-
123
- # Copy to your project
124
- cp -r build-ios/sherpa_onnx.xcframework /path/to/react-native-sherpa-onnx/ios/Frameworks/
125
- ```
126
-
127
- Then run `pod install` as usual.
128
-
129
- **Note:** The iOS implementation uses the same C++ wrapper as Android, ensuring consistent behavior across platforms.
130
-
131
- ## Quick Start
132
-
133
- ```typescript
134
- import { resolveModelPath } from 'react-native-sherpa-onnx';
135
- import {
136
- initializeSTT,
137
- transcribeFile,
138
- unloadSTT,
139
- } from 'react-native-sherpa-onnx/stt';
140
-
141
- // Initialize with a model
142
- const modelPath = await resolveModelPath({
143
- type: 'asset',
144
- path: 'models/sherpa-onnx-model',
145
- });
146
-
147
- await initializeSTT({
148
- modelPath: modelPath,
149
- preferInt8: true, // Optional: prefer quantized models
150
- });
151
-
152
- // Transcribe an audio file
153
- const transcription = await transcribeFile('path/to/audio.wav');
154
- console.log('Transcription:', transcription);
155
-
156
- // Release resources when done
157
- await unloadSTT();
158
- ```
159
-
160
- ## Usage
161
-
162
- ### Initialization
163
-
164
- ```typescript
165
- import {
166
- initializeSherpaOnnx,
167
- assetModelPath,
168
- autoModelPath,
169
- } from 'react-native-sherpa-onnx';
170
-
171
- // Option 1: Asset model (bundled in app)
172
- await initializeSherpaOnnx({
173
- modelPath: assetModelPath('models/sherpa-onnx-model'),
174
- preferInt8: true, // Prefer quantized models
175
- });
176
-
177
- // Option 2: Auto-detect (tries asset, then file system)
178
- await initializeSherpaOnnx({
179
- modelPath: autoModelPath('models/sherpa-onnx-model'),
180
- });
181
-
182
- // Option 3: Simple string (backward compatible)
183
- await initializeSherpaOnnx('models/sherpa-onnx-model');
184
- ```
185
-
186
- ### Transcription (Speech-to-Text)
187
-
188
- ```typescript
189
- import { transcribeFile } from 'react-native-sherpa-onnx/stt';
190
-
191
- // Transcribe a WAV file (16kHz, mono, 16-bit PCM)
192
- const result = await transcribeFile('path/to/audio.wav');
193
- console.log('Transcription:', result);
194
- ```
195
-
196
- ### Model Quantization
197
-
198
- Control whether to prefer quantized (int8) or regular models:
199
-
200
- ```typescript
201
- import { initializeSTT } from 'react-native-sherpa-onnx/stt';
202
- import { resolveModelPath } from 'react-native-sherpa-onnx';
203
-
204
- const modelPath = await resolveModelPath({
205
- type: 'asset',
206
- path: 'models/my-model',
207
- });
208
-
209
- // Default: try int8 first, then regular
210
- await initializeSTT({ modelPath });
211
-
212
- // Explicitly prefer int8 models (smaller, faster)
213
- await initializeSTT({
214
- modelPath,
215
- preferInt8: true,
216
- });
217
-
218
- // Explicitly prefer regular models (higher accuracy)
219
- await initializeSTT({
220
- modelPath,
221
- preferInt8: false,
222
- });
223
- ```
224
-
225
- ### Explicit Model Type
226
-
227
- For robustness, you can explicitly specify the model type to avoid auto-detection issues:
228
-
229
- ```typescript
230
- import { initializeSTT } from 'react-native-sherpa-onnx/stt';
231
- import { resolveModelPath } from 'react-native-sherpa-onnx';
232
-
233
- const modelPath = await resolveModelPath({
234
- type: 'asset',
235
- path: 'models/sherpa-onnx-nemo-parakeet-tdt-ctc-en',
236
- });
237
-
238
- // Explicitly specify model type
239
- await initializeSTT({
240
- modelPath,
241
- modelType: 'nemo_ctc', // 'transducer', 'paraformer', 'nemo_ctc', 'whisper', 'wenet_ctc', 'sense_voice', 'funasr_nano'
242
- });
243
-
244
- // Auto-detection (default behavior)
245
- await initializeSTT({
246
- modelPath,
247
- // modelType defaults to 'auto'
248
- });
249
- ```
250
-
251
- ### Cleanup (Speech-to-Text)
252
-
253
- ```typescript
254
- import { unloadSTT } from 'react-native-sherpa-onnx/stt';
255
-
256
- // Release resources when done
257
- await unloadSTT();
258
- ```
259
-
260
- ## Model Setup
261
-
262
- The library does **not** bundle models. You must provide your own models. See [MODEL_SETUP.md](./MODEL_SETUP.md) for detailed setup instructions.
263
-
264
- ### Model File Requirements
265
-
266
- - **Zipformer/Transducer**: Requires `encoder.onnx`, `decoder.onnx`, `joiner.onnx`, and `tokens.txt`
267
- - **Paraformer**: Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt`
268
- - **NeMo CTC**: Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt`
269
- - **Whisper**: Requires `encoder.onnx`, `decoder.onnx`, and `tokens.txt`
270
- - **WeNet CTC**: Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt`
271
- - **SenseVoice**: Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt`
272
-
273
- ### Model Files
274
-
275
- Place models in:
276
-
277
- - **Android**: `android/app/src/main/assets/models/`
278
- - **iOS**: Add to Xcode project as folder reference
279
-
280
- ## API Reference
281
-
282
- ### Speech-to-Text (STT) Module
283
-
284
- Import from `react-native-sherpa-onnx/stt`:
285
-
286
- #### `initializeSTT(options)`
287
-
288
- Initialize the speech-to-text engine with a model.
289
-
290
- **Parameters:**
291
-
292
- - `options.modelPath`: Absolute path to the model directory
293
- - `options.preferInt8` (optional): Prefer quantized models (`true`), regular models (`false`), or auto-detect (`undefined`, default)
294
- - `options.modelType` (optional): Explicit model type (`'transducer'`, `'paraformer'`, `'nemo_ctc'`, `'whisper'`, `'wenet_ctc'`, `'sense_voice'`, `'funasr_nano'`), or auto-detect (`'auto'`, default)
295
-
296
- **Returns:** `Promise<void>`
297
-
298
- #### `transcribeFile(filePath)`
299
-
300
- Transcribe an audio file.
301
-
302
- **Parameters:**
303
-
304
- - `filePath`: Path to WAV file (16kHz, mono, 16-bit PCM)
305
-
306
- **Returns:** `Promise<string>` - Transcribed text
307
-
308
- #### `unloadSTT()`
309
-
310
- Release resources and unload the speech-to-text model.
311
-
312
- **Returns:** `Promise<void>`
313
-
314
- ### Utility Functions
315
-
316
- Import from `react-native-sherpa-onnx`:
317
-
318
- #### `resolveModelPath(config)`
319
-
320
- Resolve a model path configuration to an absolute path.
321
-
322
- **Parameters:**
323
-
324
- - `config.type`: Path type (`'asset'`, `'file'`, or `'auto'`)
325
- - `config.path`: Path to resolve (relative for assets, absolute for files)
326
-
327
- **Returns:** `Promise<string>` - Absolute path to model directory
328
-
329
- #### `testSherpaInit()`
330
-
331
- Test that the sherpa-onnx native module is properly loaded.
332
-
333
- **Returns:** `Promise<string>` - Test message confirming module is loaded
334
-
335
- ## Requirements
336
-
337
- - React Native >= 0.70
338
- - Android API 24+ (Android 7.0+)
339
- - iOS 13.0+ (requires sherpa-onnx XCFramework - see iOS Setup below)
340
-
341
- ## Example Apps
342
-
343
- We provide example applications to help you get started with `react-native-sherpa-onnx`:
344
-
345
- ### Example App (Audio to Text)
346
-
347
- The example app included in this repository demonstrates basic audio-to-text transcription capabilities. It includes:
348
-
349
- - Multiple model type support (Zipformer, Paraformer, NeMo CTC, Whisper, WeNet CTC, SenseVoice, FunASR Nano)
350
- - Model selection and configuration
351
- - Audio file transcription
352
- - Test audio files for different languages
353
-
354
- **Getting started:**
355
-
356
- ```sh
357
- cd example
358
- yarn install
359
- yarn android # or yarn ios
360
- ```
361
-
362
- <div align="center">
363
- <img src="./docs/images/example_home_screen.png" alt="Model selection home screen" width="30%" />
364
- <img src="./docs/images/example_english.png" alt="Transcribe english audio" width="30%" />
365
- <img src="./docs/images/example_multilanguage.png" alt="Transcribe english and chinese audio" width="30%" />
366
- </div>
367
-
368
- ### Video to Text Comparison App
369
-
370
- A comprehensive comparison app that demonstrates video-to-text transcription using `react-native-sherpa-onnx` alongside other speech-to-text solutions:
371
-
372
- **Repository:** [mobile-videototext-comparison](https://github.com/XDcobra/mobile-videototext-comparison)
373
-
374
- **Features:**
375
-
376
- - Video to audio conversion (using native APIs)
377
- - Audio to text transcription
378
- - Video to text (video --> WAV --> text)
379
- - Comparison between different STT providers
380
- - Performance benchmarking
381
-
382
- This app showcases how to integrate `react-native-sherpa-onnx` into a real-world application that processes video files and converts them to text.
383
-
384
- <div align="center">
385
- <img src="./docs/images/vtt_model_overview.png" alt="Video-to-Text Model Overview" width="30%" />
386
- <img src="./docs/images/vtt_result_file_picker.png" alt="Video-to-Text file picker" width="30%" />
387
- <img src="./docs/images/vtt_result_test_audio.png" alt="Video-to-Text test audio" width="30%" />
388
- </div>
389
-
390
- ## Contributing
391
-
392
- - [Development workflow](CONTRIBUTING.md#development-workflow)
393
- - [Sending a pull request](CONTRIBUTING.md#sending-a-pull-request)
394
- - [Code of conduct](CODE_OF_CONDUCT.md)
395
-
396
- ## License
397
-
398
- MIT
399
-
400
- ---
401
-
402
- Made with [create-react-native-library](https://github.com/callstack/react-native-builder-bob)
1
+ # react-native-sherpa-onnx
2
+
3
+ React Native SDK for sherpa-onnx - providing offline speech processing capabilities
4
+
5
+ [![npm version](https://img.shields.io/npm/v/react-native-sherpa-onnx.svg)](https://www.npmjs.com/package/react-native-sherpa-onnx)
6
+ [![npm downloads](https://img.shields.io/npm/dm/react-native-sherpa-onnx.svg)](https://www.npmjs.com/package/react-native-sherpa-onnx)
7
+ [![npm license](https://img.shields.io/npm/l/react-native-sherpa-onnx.svg)](https://www.npmjs.com/package/react-native-sherpa-onnx)
8
+ [![Android](https://img.shields.io/badge/Android-Supported-green)](https://www.android.com/)
9
+ [![iOS](https://img.shields.io/badge/iOS-Supported-blue)](https://www.apple.com/ios/)
10
+
11
+ A React Native TurboModule that provides offline speech processing capabilities using [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx). The SDK aims to support all functionalities that sherpa-onnx offers, including offline speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD (Voice Activity Detection).
12
+
13
+ ## Feature Support
14
+
15
+ | Feature | Status |
16
+ |---------|--------|
17
+ | Offline Speech-to-Text | ✅ **Supported** |
18
+ | Text-to-Speech | ✅ **Supported** |
19
+ | Speaker Diarization | ❌ Not yet supported |
20
+ | Speech Enhancement | ❌ Not yet supported |
21
+ | Source Separation | ❌ Not yet supported |
22
+ | VAD (Voice Activity Detection) | ❌ Not yet supported |
23
+
24
+ ## Platform Support Status
25
+
26
+ | Platform | Status | Notes |
27
+ |----------|--------|-------|
28
+ | **Android** | ✅ **Production Ready** | Fully tested, CI/CD automated, multiple models supported |
29
+ | **iOS** | 🟡 **Beta / Experimental** | XCFramework + Podspec ready<br/>✅ GitHub Actions builds pass<br/>❌ **No local Xcode testing** *(Windows-only dev)* |
30
+
31
+ ### 🔧 **iOS Contributors WANTED!**
32
+
33
+ **Full iOS support is a priority!** Help bring sherpa-onnx to iOS devices.
34
+
35
+ **What's ready:**
36
+ - ✅ XCFramework integration
37
+ - ✅ Podspec configuration
38
+ - ✅ GitHub Actions CI (macOS runner)
39
+ - ✅ TypeScript bindings
40
+
41
+ **What's needed:**
42
+ - **Local Xcode testing** (Simulator + Device)
43
+ - **iOS example app** (beyond CI)
44
+ - **TurboModule iOS testing**
45
+ - **Edge case testing**
46
+
47
+ ## Supported Model Types
48
+
49
+ ### Speech-to-Text (STT) Models
50
+
51
+ | Model Type | `modelType` Value | Description | Download Links |
52
+ | ------------------------ | ----------------- | ---------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------ |
53
+ | **Zipformer/Transducer** | `'transducer'` | Requires `encoder.onnx`, `decoder.onnx`, `joiner.onnx`, and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html) |
54
+ | **Paraformer** | `'paraformer'` | Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html) |
55
+ | **NeMo CTC** | `'nemo_ctc'` | Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/index.html) |
56
+ | **Whisper** | `'whisper'` | Requires `encoder.onnx`, `decoder.onnx`, and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html) |
57
+ | **WeNet CTC** | `'wenet_ctc'` | Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/wenet/index.html) |
58
+ | **SenseVoice** | `'sense_voice'` | Requires `model.onnx` (or `model.int8.onnx`) and `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/sense-voice/index.html) |
59
+ | **FunASR Nano** | `'funasr_nano'` | Requires `encoder_adaptor.onnx`, `llm.onnx`, `embedding.onnx`, and `tokenizer` directory | [Download](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/funasr-nano/index.html) |
60
+
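
The `modelType: 'auto'` behavior implied by the table above can be illustrated with a file-layout heuristic. This is a hypothetical sketch (the function and type names are ours, not the library's actual implementation): it matches the most specific layouts first and falls back when file names are ambiguous.

```typescript
// Hypothetical sketch of file-based STT model detection (not the library's
// actual code). Input: file names found in the model directory.
type SttModelType =
  | 'transducer'
  | 'funasr_nano'
  | 'whisper'
  | 'paraformer'
  | 'unknown';

function detectSttModelType(files: string[]): SttModelType {
  const has = (name: string) => files.includes(name);
  const hasSingleModel = has('model.onnx') || has('model.int8.onnx');

  // Check the most specific layouts first: a transducer has a joiner,
  // FunASR Nano has an LLM component, Whisper has encoder + decoder only.
  if (has('encoder.onnx') && has('decoder.onnx') && has('joiner.onnx')) {
    return 'transducer';
  }
  if (has('encoder_adaptor.onnx') && has('llm.onnx') && has('embedding.onnx')) {
    return 'funasr_nano';
  }
  if (has('encoder.onnx') && has('decoder.onnx')) {
    return 'whisper';
  }
  // Paraformer, NeMo CTC, WeNet CTC, and SenseVoice all ship a single
  // model.onnx (or model.int8.onnx) plus tokens.txt, so file names alone
  // cannot tell them apart; pass an explicit modelType in that case.
  if (hasSingleModel && has('tokens.txt')) {
    return 'paraformer';
  }
  return 'unknown';
}
```

Because the four single-model layouts are indistinguishable by file names, passing an explicit `modelType` to `initializeSTT` remains the robust choice.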
61
+ ### Text-to-Speech (TTS) Models
62
+
63
+ | Model Type | `modelType` Value | Description | Download Links |
64
+ | ---------------- | ----------------- | ---------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------- |
65
+ | **VITS** | `'vits'` | Fast, high-quality TTS. Includes Piper, Coqui, MeloTTS, MMS variants. Requires `model.onnx`, `tokens.txt` | [Download](https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models) |
66
+ | **Matcha** | `'matcha'` | High-quality acoustic model + vocoder. Requires `acoustic_model.onnx`, `vocoder.onnx`, `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html) |
67
+ | **Kokoro** | `'kokoro'` | Multi-speaker, multi-language. Requires `model.onnx`, `voices.bin`, `tokens.txt`, `espeak-ng-data/` | [Download](https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models) |
68
+ | **KittenTTS** | `'kitten'` | Lightweight, multi-speaker. Requires `model.onnx`, `voices.bin`, `tokens.txt`, `espeak-ng-data/` | [Download](https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models) |
69
+ | **Zipvoice** | `'zipvoice'` | Voice cloning capable. Requires `encoder.onnx`, `decoder.onnx`, `vocoder.onnx`, `tokens.txt` | [Download](https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/zipvoice.html) |
70
+
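
The required-files column above can be turned into a small pre-flight check before initializing a TTS model. A minimal sketch, with the map and function names ours rather than the library's API; `espeak-ng-data/` is a directory and is represented here by its bare name:

```typescript
// Hypothetical pre-flight check (not the library's actual API): report which
// required files are missing from a TTS model directory listing.
const TTS_REQUIRED_FILES: Record<string, string[]> = {
  vits: ['model.onnx', 'tokens.txt'],
  matcha: ['acoustic_model.onnx', 'vocoder.onnx', 'tokens.txt'],
  kokoro: ['model.onnx', 'voices.bin', 'tokens.txt', 'espeak-ng-data'],
  kitten: ['model.onnx', 'voices.bin', 'tokens.txt', 'espeak-ng-data'],
  zipvoice: ['encoder.onnx', 'decoder.onnx', 'vocoder.onnx', 'tokens.txt'],
};

function missingTtsFiles(modelType: string, entries: string[]): string[] {
  const required = TTS_REQUIRED_FILES[modelType];
  if (!required) {
    throw new Error(`Unknown TTS modelType: ${modelType}`);
  }
  // Keep the declared order so error messages read like the table above.
  return required.filter((name) => !entries.includes(name));
}
```

Running this before initialization turns a vague native-side load failure into an actionable "missing `voices.bin`" message.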
71
+ ## Features
72
+
73
+ - ✅ **Offline Speech-to-Text** - No internet connection required for speech recognition
+ - ✅ **Offline Text-to-Speech** - On-device speech synthesis with VITS, Matcha, Kokoro, KittenTTS, and Zipvoice models
74
+ - ✅ **Multiple Model Types** - Supports Zipformer/Transducer, Paraformer, NeMo CTC, Whisper, WeNet CTC, SenseVoice, and FunASR Nano models
75
+ - ✅ **Model Quantization** - Automatic detection and preference for quantized (int8) models
76
+ - ✅ **Flexible Model Loading** - Asset models, file system models, or auto-detection
77
+ - ✅ **Android Support** - Fully supported on Android
78
+ - ✅ **iOS Support** - Fully supported on iOS (requires sherpa-onnx XCFramework)
79
+ - ✅ **TypeScript Support** - Full TypeScript definitions included
80
+ - 🚧 **Additional Features Coming Soon** - Speaker Diarization, Speech Enhancement, Source Separation, and VAD support are planned for future releases
81
+
82
+ ## Installation
83
+
84
+ ```sh
85
+ npm install react-native-sherpa-onnx
86
+ ```
87
+
88
+ If your project uses Yarn (v3+) or Plug'n'Play, configure Yarn to use the Node Modules linker to avoid postinstall issues:
89
+
90
+ ```yaml
91
+ # .yarnrc.yml
92
+ nodeLinker: node-modules
93
+ ```
94
+
95
+ Alternatively, set the environment variable during install:
96
+
97
+ ```sh
98
+ YARN_NODE_LINKER=node-modules yarn install
99
+ ```
100
+
101
+ ### Android
102
+
103
+ No additional setup required. The library automatically handles native dependencies via Gradle.
104
+
105
+ ### iOS
106
+
107
+ The sherpa-onnx XCFramework is **not included in the repository or npm package** due to its size (~80MB), but **no manual action is required**! The framework is automatically downloaded during `pod install`.
108
+
109
+ #### Quick Setup
110
+
111
+ ```sh
112
+ cd example
113
+ bundle install
114
+ bundle exec pod install --project-directory=ios
115
+ ```
116
+
117
+ That's it! The `Podfile` automatically:
118
+ 1. Copies required header files from the git submodule
119
+ 2. Downloads the latest XCFramework from [GitHub Releases](https://github.com/XDcobra/react-native-sherpa-onnx/releases?q=framework)
120
+ 3. Verifies everything is in place before building
121
+
122
+ #### For Advanced Users: Building the Framework Locally
123
+
124
+ If you want to build the XCFramework yourself instead of using the prebuilt release:
125
+
126
+ ```sh
127
+ # Clone sherpa-onnx repository
128
+ git clone https://github.com/k2-fsa/sherpa-onnx.git
129
+ cd sherpa-onnx
130
+ git checkout v1.12.23
131
+
132
+ # Build the iOS XCFramework (requires macOS, Xcode, CMake, and ONNX Runtime)
133
+ ./build-ios.sh
134
+
135
+ # Copy to your project
136
+ cp -r build-ios/sherpa_onnx.xcframework /path/to/react-native-sherpa-onnx/ios/Frameworks/
137
+ ```
138
+
139
+ Then run `pod install` as usual.
140
+
141
+ **Note:** The iOS implementation uses the same C++ wrapper as Android, ensuring consistent behavior across platforms.
142
+
143
+ ## Documentation
144
+
145
+ - [Speech-to-Text (STT)](./docs/stt.md)
146
+ - [Text-to-Speech (TTS)](./docs/tts.md)
147
+ - [Voice Activity Detection (VAD)](./docs/vad.md)
148
+ - [Speaker Diarization](./docs/diarization.md)
149
+ - [Speech Enhancement](./docs/enhancement.md)
150
+ - [Source Separation](./docs/separation.md)
151
+ - [General STT Model Setup](./docs/STT_MODEL_SETUP.md)
152
+ - [General TTS Model Setup](./docs/TTS_MODEL_SETUP.md)
153
+
154
+ ### Example Model READMEs
155
+
156
+ - [kokoro (US) README](./example/android/app/src/main/assets/models/kokoro-us/README.md)
157
+ - [kokoro (ZH) README](./example/android/app/src/main/assets/models/kokoro-zh/README.md)
158
+ - [funasr-nano README](./example/android/app/src/main/assets/models/sherpa-onnx-funasr-nano-int8/README.md)
159
+ - [kitten-nano README](./example/android/app/src/main/assets/models/sherpa-onnx-kitten-nano-en-v0_1-fp16/README.md)
160
+ - [matcha README](./example/android/app/src/main/assets/models/sherpa-onnx-matcha-icefall-en_US-ljspeech/README.md)
161
+ - [nemo-ctc README](./example/android/app/src/main/assets/models/sherpa-onnx-nemo-parakeet-tdt-ctc-en/README.md)
162
+ - [paraformer README](./example/android/app/src/main/assets/models/sherpa-onnx-paraformer-zh-small/README.md)
163
+ - [sense-voice README](./example/android/app/src/main/assets/models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8/README.md)
164
+ - [vits README](./example/android/app/src/main/assets/models/sherpa-onnx-vits-piper-en_US-libritts_r-medium/README.md)
165
+ - [wenet-ctc README](./example/android/app/src/main/assets/models/sherpa-onnx-wenetspeech-ctc-zh-en-cantonese/README.md)
166
+ - [whisper-tiny README](./example/android/app/src/main/assets/models/sherpa-onnx-whisper-tiny-en/README.md)
167
+ - [zipformer README](./example/android/app/src/main/assets/models/sherpa-onnx-zipformer-small-en/README.md)
168
+
169
+ ## Requirements
170
+
171
+ - React Native >= 0.70
172
+ - Android API 24+ (Android 7.0+)
173
+ - iOS 13.0+ (requires sherpa-onnx XCFramework - see the iOS setup instructions above)
174
+
175
+ ## Example Apps
176
+
177
+ We provide example applications to help you get started with `react-native-sherpa-onnx`:
178
+
179
+ ### Example App (Audio to Text)
180
+
181
+ The example app included in this repository demonstrates basic audio-to-text transcription capabilities. It includes:
182
+
183
+ - Multiple model type support (Zipformer, Paraformer, NeMo CTC, Whisper, WeNet CTC, SenseVoice, FunASR Nano)
184
+ - Model selection and configuration
185
+ - Audio file transcription
186
+ - Test audio files for different languages
187
+
188
+ **Getting started:**
189
+
190
+ ```sh
191
+ cd example
192
+ yarn install
193
+ yarn android # or yarn ios
194
+ ```
195
+
196
+ <div align="center">
197
+ <img src="./docs/images/example_home_screen.png" alt="Model selection home screen" width="30%" />
198
+ <img src="./docs/images/example_english.png" alt="Transcribe english audio" width="30%" />
199
+ <img src="./docs/images/example_multilanguage.png" alt="Transcribe english and chinese audio" width="30%" />
200
+ </div>
201
+
202
+ ### Video to Text Comparison App
203
+
204
+ A comprehensive comparison app that demonstrates video-to-text transcription using `react-native-sherpa-onnx` alongside other speech-to-text solutions:
205
+
206
+ **Repository:** [mobile-videototext-comparison](https://github.com/XDcobra/mobile-videototext-comparison)
207
+
208
+ **Features:**
209
+
210
+ - Video to audio conversion (using native APIs)
211
+ - Audio to text transcription
212
+ - Video to text (video --> WAV --> text)
213
+ - Comparison between different STT providers
214
+ - Performance benchmarking
215
+
216
+ This app showcases how to integrate `react-native-sherpa-onnx` into a real-world application that processes video files and converts them to text.
217
+
218
+ <div align="center">
219
+ <img src="./docs/images/vtt_model_overview.png" alt="Video-to-Text Model Overview" width="30%" />
220
+ <img src="./docs/images/vtt_result_file_picker.png" alt="Video-to-Text file picker" width="30%" />
221
+ <img src="./docs/images/vtt_result_test_audio.png" alt="Video-to-Text test audio" width="30%" />
222
+ </div>
223
+
224
+ ## Contributing
225
+
226
+ - [Development workflow](CONTRIBUTING.md#development-workflow)
227
+ - [Sending a pull request](CONTRIBUTING.md#sending-a-pull-request)
228
+ - [Code of conduct](CODE_OF_CONDUCT.md)
229
+
230
+ ## License
231
+
232
+ MIT
233
+
234
+ ---
235
+
236
+ Made with [create-react-native-library](https://github.com/callstack/react-native-builder-bob)