cactus-react-native 0.1.2 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (77)
  1. package/README.md +874 -146
  2. package/android/src/main/CMakeLists.txt +1 -1
  3. package/android/src/main/java/com/cactus/Cactus.java +0 -134
  4. package/android/src/main/java/com/cactus/LlamaContext.java +0 -22
  5. package/android/src/main/jni.cpp +1 -53
  6. package/android/src/main/jniLibs/arm64-v8a/libcactus.so +0 -0
  7. package/android/src/main/jniLibs/arm64-v8a/libcactus_v8.so +0 -0
  8. package/android/src/main/jniLibs/arm64-v8a/libcactus_v8_2.so +0 -0
  9. package/android/src/main/jniLibs/arm64-v8a/libcactus_v8_2_dotprod.so +0 -0
  10. package/android/src/main/jniLibs/arm64-v8a/libcactus_v8_2_dotprod_i8mm.so +0 -0
  11. package/android/src/main/jniLibs/arm64-v8a/libcactus_v8_2_i8mm.so +0 -0
  12. package/android/src/newarch/java/com/cactus/CactusModule.java +0 -20
  13. package/android/src/oldarch/java/com/cactus/CactusModule.java +0 -20
  14. package/ios/CMakeLists.txt +6 -6
  15. package/ios/Cactus.mm +0 -80
  16. package/ios/CactusContext.h +0 -6
  17. package/ios/CactusContext.mm +0 -27
  18. package/ios/cactus.xcframework/ios-arm64/cactus.framework/Headers/cactus.h +0 -6
  19. package/ios/cactus.xcframework/ios-arm64/cactus.framework/cactus +0 -0
  20. package/ios/cactus.xcframework/ios-arm64_x86_64-simulator/cactus.framework/Headers/cactus.h +0 -6
  21. package/ios/cactus.xcframework/ios-arm64_x86_64-simulator/cactus.framework/cactus +0 -0
  22. package/ios/cactus.xcframework/tvos-arm64/cactus.framework/Headers/cactus.h +0 -6
  23. package/ios/cactus.xcframework/tvos-arm64/cactus.framework/cactus +0 -0
  24. package/ios/cactus.xcframework/tvos-arm64_x86_64-simulator/cactus.framework/Headers/cactus.h +0 -6
  25. package/ios/cactus.xcframework/tvos-arm64_x86_64-simulator/cactus.framework/cactus +0 -0
  26. package/lib/commonjs/NativeCactus.js +0 -1
  27. package/lib/commonjs/NativeCactus.js.map +1 -1
  28. package/lib/commonjs/index.js +55 -37
  29. package/lib/commonjs/index.js.map +1 -1
  30. package/lib/commonjs/lm.js +72 -0
  31. package/lib/commonjs/lm.js.map +1 -0
  32. package/lib/commonjs/telemetry.js +97 -0
  33. package/lib/commonjs/telemetry.js.map +1 -0
  34. package/lib/commonjs/tools.js +21 -60
  35. package/lib/commonjs/tools.js.map +1 -1
  36. package/lib/commonjs/tts.js +32 -0
  37. package/lib/commonjs/tts.js.map +1 -0
  38. package/lib/commonjs/vlm.js +83 -0
  39. package/lib/commonjs/vlm.js.map +1 -0
  40. package/lib/module/NativeCactus.js +0 -2
  41. package/lib/module/NativeCactus.js.map +1 -1
  42. package/lib/module/index.js +38 -38
  43. package/lib/module/index.js.map +1 -1
  44. package/lib/module/lm.js +67 -0
  45. package/lib/module/lm.js.map +1 -0
  46. package/lib/module/telemetry.js +92 -0
  47. package/lib/module/telemetry.js.map +1 -0
  48. package/lib/module/tools.js +21 -58
  49. package/lib/module/tools.js.map +1 -1
  50. package/lib/module/tts.js +27 -0
  51. package/lib/module/tts.js.map +1 -0
  52. package/lib/module/vlm.js +78 -0
  53. package/lib/module/vlm.js.map +1 -0
  54. package/lib/typescript/NativeCactus.d.ts +0 -10
  55. package/lib/typescript/NativeCactus.d.ts.map +1 -1
  56. package/lib/typescript/index.d.ts +6 -18
  57. package/lib/typescript/index.d.ts.map +1 -1
  58. package/lib/typescript/lm.d.ts +17 -0
  59. package/lib/typescript/lm.d.ts.map +1 -0
  60. package/lib/typescript/telemetry.d.ts +21 -0
  61. package/lib/typescript/telemetry.d.ts.map +1 -0
  62. package/lib/typescript/tools.d.ts +0 -3
  63. package/lib/typescript/tools.d.ts.map +1 -1
  64. package/lib/typescript/tts.d.ts +10 -0
  65. package/lib/typescript/tts.d.ts.map +1 -0
  66. package/lib/typescript/vlm.d.ts +22 -0
  67. package/lib/typescript/vlm.d.ts.map +1 -0
  68. package/package.json +2 -1
  69. package/src/NativeCactus.ts +0 -22
  70. package/src/index.ts +75 -78
  71. package/src/lm.ts +89 -0
  72. package/src/telemetry.ts +123 -0
  73. package/src/tools.ts +17 -58
  74. package/src/tts.ts +45 -0
  75. package/src/vlm.ts +112 -0
  76. package/android/src/main/jniLibs/x86_64/libcactus.so +0 -0
  77. package/android/src/main/jniLibs/x86_64/libcactus_x86_64.so +0 -0
package/README.md CHANGED
@@ -1,232 +1,960 @@
- # Cactus for React Native
+ # Cactus React Native
 
- A lightweight, high-performance framework for running AI models on mobile devices with React Native.
+ A powerful React Native library for running Large Language Models (LLMs) and Vision Language Models (VLMs) directly on mobile devices, with full support for chat completions, multimodal inputs, embeddings, text-to-speech and advanced features.
 
  ## Installation
 
  ```bash
- # Using npm
- npm install react-native-fs
- npm install cactus-react-native
+ npm install cactus-react-native react-native-fs
+ # or
+ yarn add cactus-react-native react-native-fs
+ ```
 
- # Using yarn
- yarn add react-native-fs
- yarn add cactus-react-native
+ **Additional Setup:**
+ - For iOS: `cd ios && npx pod-install` or `yarn pod-install`
+ - For Android: Ensure your `minSdkVersion` is 24 or higher
 
- # For iOS, install pods if not on Expo
- npx pod-install
- ```
+ > **Important**: `react-native-fs` is required for file system access to download and manage model files locally.
 
- ## Basic Usage
+ ## Quick Start
 
- ### Initialize a Model
+ ### Basic Text Completion
 
  ```typescript
- import { initLlama, LlamaContext } from 'cactus-react-native';
+ import { CactusLM } from 'cactus-react-native';
 
- // Initialize the model
- const context = await initLlama({
-   model: 'models/llama-2-7b-chat.gguf', // Path to your model
-   n_ctx: 2048, // Context size
-   n_batch: 512, // Batch size for prompt processing
-   n_threads: 4 // Number of threads to use
+ // Initialize a language model
+ const { lm, error } = await CactusLM.init({
+   model: '/path/to/your/model.gguf',
+   n_ctx: 2048,
+   n_threads: 4,
  });
+ if (error) throw error; // handle error gracefully
+ 
+ // Generate text
+ const messages = [{ role: 'user', content: 'Hello, how are you?' }];
+ const params = { n_predict: 100, temperature: 0.7 };
+ 
+ const result = await lm.completion(messages, params);
+ console.log(result.text);
+ ```
+ 
+ ### Complete Chat App Example
+ 
+ ```typescript
+ import React, { useState, useEffect } from 'react';
+ import { View, Text, TextInput, TouchableOpacity } from 'react-native';
+ import { CactusLM } from 'cactus-react-native';
+ import RNFS from 'react-native-fs';
+ 
+ interface Message {
+   role: 'user' | 'assistant';
+   content: string;
+ }
+ 
+ export default function ChatApp() {
+   const [lm, setLM] = useState<CactusLM | null>(null);
+   const [messages, setMessages] = useState<Message[]>([]);
+   const [input, setInput] = useState('');
+   const [loading, setLoading] = useState(true);
+ 
+   useEffect(() => {
+     initializeModel();
+   }, []);
+ 
+   async function initializeModel() {
+     try {
+       // Download model (example URL)
+       const modelUrl = 'https://huggingface.co/Cactus-Compute/Qwen3-600m-Instruct-GGUF/resolve/main/Qwen3-0.6B-Q8_0.gguf';
+       const modelPath = `${RNFS.DocumentDirectoryPath}/model.gguf`;
+ 
+       // Download if not exists
+       if (!(await RNFS.exists(modelPath))) {
+         await RNFS.downloadFile({
+           fromUrl: modelUrl,
+           toFile: modelPath,
+         }).promise;
+       }
+ 
+       // Initialize language model
+       const { lm, error } = await CactusLM.init({
+         model: modelPath,
+         n_ctx: 2048,
+         n_threads: 4,
+         n_gpu_layers: 99, // Use GPU acceleration
+       });
+       if (error) throw error; // handle error gracefully
+ 
+       setLM(lm);
+       setLoading(false);
+     } catch (error) {
+       console.error('Failed to initialize model:', error);
+     }
+   }
+ 
+   async function sendMessage() {
+     if (!lm || !input.trim()) return;
+ 
+     const userMessage: Message = { role: 'user', content: input };
+     const newMessages = [...messages, userMessage];
+     setMessages(newMessages);
+     setInput('');
+ 
+     try {
+       const params = {
+         n_predict: 256,
+         temperature: 0.7,
+         stop: ['</s>', '<|end|>'],
+       };
+ 
+       const result = await lm.completion(newMessages, params);
+ 
+       const assistantMessage: Message = {
+         role: 'assistant',
+         content: result.text
+       };
+       setMessages([...newMessages, assistantMessage]);
+     } catch (error) {
+       console.error('Completion failed:', error);
+     }
+   }
+ 
+   if (loading) {
+     return (
+       <View style={{ flex: 1, justifyContent: 'center', alignItems: 'center' }}>
+         <Text>Loading model...</Text>
+       </View>
+     );
+   }
+ 
+   return (
+     <View style={{ flex: 1, padding: 16 }}>
+       {/* Messages */}
+       <View style={{ flex: 1 }}>
+         {messages.map((msg, index) => (
+           <Text key={index} style={{
+             backgroundColor: msg.role === 'user' ? '#007AFF' : '#f0f0f0',
+             color: msg.role === 'user' ? 'white' : 'black',
+             padding: 8,
+             margin: 4,
+             borderRadius: 8,
+           }}>
+             {msg.content}
+           </Text>
+         ))}
+       </View>
+ 
+       {/* Input */}
+       <View style={{ flexDirection: 'row' }}>
+         <TextInput
+           style={{ flex: 1, borderWidth: 1, padding: 8, borderRadius: 4 }}
+           value={input}
+           onChangeText={setInput}
+           placeholder="Type a message..."
+         />
+         <TouchableOpacity
+           onPress={sendMessage}
+           style={{ backgroundColor: '#007AFF', padding: 8, borderRadius: 4, marginLeft: 8 }}
+         >
+           <Text style={{ color: 'white' }}>Send</Text>
+         </TouchableOpacity>
+       </View>
+     </View>
+   );
+ }
+ ```
+ 
+ ## File Path Requirements
+ 
+ **Critical**: Cactus requires **absolute local file paths**, not Metro bundler URLs or asset references.
+ 
+ ### ❌ Won't Work
+ ```typescript
+ // Metro bundler URLs
+ 'http://localhost:8081/assets/model.gguf'
+ 
+ // React Native asset requires
+ require('./assets/model.gguf')
+ 
+ // Relative paths
+ './models/model.gguf'
+ ```
+ 
+ ### ✅ Will Work
+ ```typescript
+ import RNFS from 'react-native-fs';
+ 
+ // Absolute paths in app directories
+ const modelPath = `${RNFS.DocumentDirectoryPath}/model.gguf`;
+ const imagePath = `${RNFS.DocumentDirectoryPath}/image.jpg`;
+ 
+ // Downloaded/copied files
+ const downloadModel = async () => {
+   const modelUrl = 'https://example.com/model.gguf';
+   const localPath = `${RNFS.DocumentDirectoryPath}/model.gguf`;
+ 
+   await RNFS.downloadFile({
+     fromUrl: modelUrl,
+     toFile: localPath,
+   }).promise;
+ 
+   return localPath; // Use this path with Cactus
+ };
+ ```
+ 
+ ### Image Assets
+ For images, you need to copy them to local storage first:
+ 
+ ```typescript
+ // Copy bundled asset to local storage
+ const copyAssetToLocal = async (assetName: string): Promise<string> => {
+   const assetPath = `${RNFS.MainBundlePath}/${assetName}`;
+   const localPath = `${RNFS.DocumentDirectoryPath}/${assetName}`;
+ 
+   if (!(await RNFS.exists(localPath))) {
+     await RNFS.copyFile(assetPath, localPath);
+   }
+ 
+   return localPath;
+ };
+ 
+ // Usage
+ const imagePath = await copyAssetToLocal('demo.jpg');
+ const params = { images: [imagePath], n_predict: 200 };
+ const result = await vlm.completion(messages, params);
  ```
 
- ### Text Completion
+ ### External Images
+ Download external images to local storage:
 
  ```typescript
- // Generate text completion
- const result = await context.completion({
-   prompt: "Explain quantum computing in simple terms",
+ const downloadImage = async (imageUrl: string): Promise<string> => {
+   const localPath = `${RNFS.DocumentDirectoryPath}/temp_image.jpg`;
+ 
+   await RNFS.downloadFile({
+     fromUrl: imageUrl,
+     toFile: localPath,
+   }).promise;
+ 
+   return localPath;
+ };
+ ```
+ 
+ ## Core APIs
+ 
+ ### CactusLM (Language Model)
+ 
+ For text-only language models:
+ 
+ ```typescript
+ import { CactusLM } from 'cactus-react-native';
+ 
+ // Initialize
+ const lm = await CactusLM.init({
+   model: '/path/to/model.gguf',
+   n_ctx: 4096, // Context window size
+   n_batch: 512, // Batch size for processing
+   n_threads: 4, // Number of threads
+   n_gpu_layers: 99, // GPU layers (0 = CPU only)
+ });
+ 
+ // Text completion
+ const messages = [
+   { role: 'system', content: 'You are a helpful assistant.' },
+   { role: 'user', content: 'What is the capital of France?' },
+ ];
+ 
+ const params = {
+   n_predict: 200,
    temperature: 0.7,
-   top_k: 40,
-   top_p: 0.95,
-   n_predict: 512
- }, (token) => {
-   // Process each token as it's generated
-   console.log(token.token);
+   top_p: 0.9,
+   stop: ['</s>', '\n\n'],
+ };
+ 
+ const result = await lm.completion(messages, params);
+ 
+ // Embeddings
+ const embeddingResult = await lm.embedding('Your text here');
+ console.log('Embedding vector:', embeddingResult.embedding);
+ 
+ // Cleanup
+ await lm.rewind(); // Clear conversation
+ await lm.release(); // Release resources
+ ```
+ 
+ ### CactusVLM (Vision Language Model)
+ 
+ For multimodal models that can process both text and images:
+ 
+ ```typescript
+ import { CactusVLM } from 'cactus-react-native';
+ 
+ // Initialize with multimodal projector
+ const vlm = await CactusVLM.init({
+   model: '/path/to/vision-model.gguf',
+   mmproj: '/path/to/mmproj.gguf',
+   n_ctx: 2048,
+   n_threads: 4,
+   n_gpu_layers: 99, // GPU for main model, CPU for projector
  });
 
- // Clean up when done
- await context.release();
+ // Image + text completion
+ const messages = [{ role: 'user', content: 'What do you see in this image?' }];
+ const params = {
+   images: ['/path/to/image.jpg'],
+   n_predict: 200,
+   temperature: 0.3,
+ };
+ 
+ const result = await vlm.completion(messages, params);
+ 
+ // Text-only completion (same interface)
+ const textMessages = [{ role: 'user', content: 'Tell me a joke' }];
+ const textParams = { n_predict: 100 };
+ const textResult = await vlm.completion(textMessages, textParams);
+ 
+ // Cleanup
+ await vlm.rewind();
+ await vlm.release();
+ ```
+ 
+ ### CactusTTS (Text-to-Speech)
+ 
+ For text-to-speech generation:
+ 
+ ```typescript
+ import { CactusTTS } from 'cactus-react-native';
+ 
+ // Initialize with vocoder
+ const tts = await CactusTTS.init({
+   model: '/path/to/tts-model.gguf',
+   vocoder: '/path/to/vocoder.gguf',
+   n_ctx: 1024,
+   n_threads: 4,
+ });
+ 
+ // Generate speech
+ const text = 'Hello, this is a test of text-to-speech functionality.';
+ const params = {
+   voice_id: 0,
+   temperature: 0.7,
+   speed: 1.0,
+ };
+ 
+ const audioResult = await tts.generateSpeech(text, params);
+ console.log('Audio data:', audioResult.audio_data);
+ 
+ // Advanced token-based generation
+ const tokens = await tts.getGuideTokens('Your text here');
+ const audio = await tts.decodeTokens(tokens);
+ 
+ // Cleanup
+ await tts.release();
  ```
 
- ### Chat Completion
+ ## Text Completion
+ 
+ ### Basic Completion
 
  ```typescript
- // Chat messages following OpenAI format
+ const lm = await CactusLM.init({
+   model: '/path/to/model.gguf',
+   n_ctx: 2048,
+ });
+ 
  const messages = [
-   { role: "system", content: "You are a helpful assistant." },
-   { role: "user", content: "What is machine learning?" }
+   { role: 'user', content: 'Write a short poem about coding' }
  ];
 
- // Generate chat completion
- const result = await context.completion({
-   messages: messages,
-   temperature: 0.7,
-   top_k: 40,
-   top_p: 0.95,
-   n_predict: 512
- }, (token) => {
-   // Process each token
-   console.log(token.token);
+ const params = {
+   n_predict: 200,
+   temperature: 0.8,
+   top_p: 0.9,
+   stop: ['</s>', '\n\n'],
+ };
+ 
+ const result = await lm.completion(messages, params);
+ 
+ console.log(result.text);
+ console.log(`Tokens: ${result.tokens_predicted}`);
+ console.log(`Speed: ${result.timings.predicted_per_second.toFixed(2)} tokens/sec`);
+ ```
+ 
+ ### Streaming Completion
+ 
+ ```typescript
+ const result = await lm.completion(messages, params, (token) => {
+   // Called for each generated token
+   console.log('Token:', token.token);
+   updateUI(token.token);
  });
  ```
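In a component, the token callback pairs naturally with React state. A minimal sketch of that wiring, assuming an initialized `CactusLM` and a message history as in the chat example above (the component itself is illustrative, not part of the library):

```typescript
import React, { useState } from 'react';
import { Text } from 'react-native';
import type { CactusLM } from 'cactus-react-native';

// Sketch: append each streamed token to state so the answer renders incrementally.
// `lm` and `messages` are assumed to come from the surrounding app.
function StreamingAnswer({ lm, messages }: { lm: CactusLM; messages: any[] }) {
  const [output, setOutput] = useState('');

  const run = async () => {
    setOutput('');
    await lm.completion(messages, { n_predict: 256, temperature: 0.7 }, (token) => {
      setOutput(prev => prev + token.token); // functional update keeps tokens in order
    });
  };

  return <Text onPress={run}>{output || 'Tap to generate'}</Text>;
}
```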
 
- ## Advanced Features
+ ### Advanced Parameters
 
- ### JSON Mode with Schema Validation
+ ```typescript
+ const params = {
+   // Generation control
+   n_predict: 256, // Max tokens to generate
+   temperature: 0.7, // Randomness (0.0 - 2.0)
+   top_p: 0.9, // Nucleus sampling
+   top_k: 40, // Top-k sampling
+   min_p: 0.05, // Minimum probability
+ 
+   // Repetition control
+   penalty_repeat: 1.1, // Repetition penalty
+   penalty_freq: 0.0, // Frequency penalty
+   penalty_present: 0.0, // Presence penalty
+ 
+   // Stop conditions
+   stop: ['</s>', '<|end|>', '\n\n'],
+   ignore_eos: false,
+ 
+   // Sampling methods
+   mirostat: 0, // Mirostat sampling (0=disabled)
+   mirostat_tau: 5.0, // Target entropy
+   mirostat_eta: 0.1, // Learning rate
+ 
+   // Advanced
+   seed: -1, // Random seed (-1 = random)
+   n_probs: 0, // Return token probabilities
+ };
+ ```
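One use of `seed` together with near-greedy settings is reproducible output, for instance in snapshot tests. A sketch assuming the parameter semantics listed above; exact determinism guarantees are not documented here and are worth verifying on the target device:

```typescript
// Sketch: near-deterministic decoding built from the parameters documented above.
// Determinism across runs is an assumption; verify it for your model and device.
const reproducibleParams = {
  seed: 42,       // fixed seed instead of -1 (random)
  temperature: 0, // minimal randomness
  top_k: 1,       // always take the single most likely token
  n_predict: 64,
  stop: ['</s>'],
};

const first = await lm.completion(messages, reproducibleParams);
const second = await lm.completion(messages, reproducibleParams);
console.log(first.text === second.text); // expected: true for identical inputs
```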
+ 
+ ## Multimodal (Vision)
+ 
+ ### Setup Vision Model
 
  ```typescript
- // Define a JSON schema
- const schema = {
-   type: "object",
-   properties: {
-     name: { type: "string" },
-     age: { type: "number" },
-     hobbies: {
-       type: "array",
-       items: { type: "string" }
-     }
-   },
-   required: ["name", "age"]
+ import { CactusVLM } from 'cactus-react-native';
+ 
+ const vlm = await CactusVLM.init({
+   model: '/path/to/vision-model.gguf',
+   mmproj: '/path/to/mmproj.gguf', // Multimodal projector
+   n_ctx: 4096,
+ });
+ ```
+ 
+ ### Image Analysis
+ 
+ ```typescript
+ // Analyze single image
+ const messages = [{ role: 'user', content: 'Describe this image in detail' }];
+ const params = {
+   images: ['/path/to/image.jpg'],
+   n_predict: 200,
+   temperature: 0.3,
  };
 
- // Generate JSON-structured output
- const result = await context.completion({
-   prompt: "Generate a profile for a fictional person",
-   response_format: {
-     type: "json_schema",
-     json_schema: {
-       schema: schema,
-       strict: true
-     }
-   },
-   temperature: 0.7,
-   n_predict: 512
+ const result = await vlm.completion(messages, params);
+ console.log(result.text);
+ ```
+ 
+ ### Multi-Image Analysis
+ 
+ ```typescript
+ const imagePaths = [
+   '/path/to/image1.jpg',
+   '/path/to/image2.jpg',
+   '/path/to/image3.jpg'
+ ];
+ 
+ const messages = [{ role: 'user', content: 'Compare these images and explain the differences' }];
+ const params = {
+   images: imagePaths,
+   n_predict: 300,
+   temperature: 0.4,
+ };
+ 
+ const result = await vlm.completion(messages, params);
+ ```
+ 
+ ### Conversation with Images
+ 
+ ```typescript
+ const conversation = [
+   { role: 'user', content: 'What do you see in this image?' }
+ ];
+ 
+ const params = {
+   images: ['/path/to/image.jpg'],
+   n_predict: 256,
+   temperature: 0.3,
+ };
+ 
+ const result = await vlm.completion(conversation, params);
+ ```
+ 
+ ## Embeddings
+ 
+ ### Text Embeddings
+ 
+ ```typescript
+ // Enable embeddings during initialization
+ const lm = await CactusLM.init({
+   model: '/path/to/embedding-model.gguf',
+   embedding: true, // Enable embedding mode
+   n_ctx: 512, // Smaller context for embeddings
  });
 
- // The result will be valid JSON according to the schema
- const jsonData = JSON.parse(result.text);
+ // Generate embeddings
+ const text = 'Your text here';
+ const result = await lm.embedding(text);
+ console.log('Embedding vector:', result.embedding);
+ console.log('Dimensions:', result.embedding.length);
+ ```
+ 
+ ### Batch Embeddings
+ 
+ ```typescript
+ const texts = [
+   'The quick brown fox',
+   'Machine learning is fascinating',
+   'React Native development'
+ ];
+ 
+ const embeddings = await Promise.all(
+   texts.map(text => lm.embedding(text))
+ );
+ 
+ // Calculate similarity
+ function cosineSimilarity(a: number[], b: number[]): number {
+   const dotProduct = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
+   const magnitudeA = Math.sqrt(a.reduce((sum, ai) => sum + ai * ai, 0));
+   const magnitudeB = Math.sqrt(b.reduce((sum, bi) => sum + bi * bi, 0));
+   return dotProduct / (magnitudeA * magnitudeB);
+ }
+ 
+ const similarity = cosineSimilarity(
+   embeddings[0].embedding,
+   embeddings[1].embedding
+ );
  ```
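The same two building blocks, `lm.embedding` and `cosineSimilarity`, extend to a small semantic search. A sketch reusing the snippet above; `rankBySimilarity` is an illustrative helper, not a library API:

```typescript
// Sketch: rank candidate texts against a query by cosine similarity.
// Reuses cosineSimilarity() from the snippet above.
async function rankBySimilarity(query: string, docs: string[]) {
  const queryVec = (await lm.embedding(query)).embedding;
  const scored = [];
  for (const doc of docs) {
    // Embed sequentially to avoid concurrent calls into one context
    const vec = (await lm.embedding(doc)).embedding;
    scored.push({ doc, score: cosineSimilarity(queryVec, vec) });
  }
  return scored.sort((a, b) => b.score - a.score); // highest similarity first
}

const ranked = await rankBySimilarity('cross-platform mobile UI', texts);
console.log(ranked[0].doc);
```

Embedding one text at a time avoids overlapping requests to a single context, which the error-handling section below notes can fail with "Context is busy".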
 
- ### Working with Embeddings
+ ## Text-to-Speech (TTS)
+ 
+ Cactus supports text-to-speech through vocoder models, allowing you to generate speech from text.
+ 
+ ### Setup TTS Model
 
  ```typescript
- // Generate embeddings for text
- const embedding = await context.embedding("This is a sample text", {
-   pooling_type: "mean" // Options: "none", "mean", "cls", "last", "rank"
+ import { CactusTTS } from 'cactus-react-native';
+ 
+ const tts = await CactusTTS.init({
+   model: '/path/to/text-model.gguf',
+   vocoder: '/path/to/vocoder-model.gguf',
+   n_ctx: 2048,
  });
+ ```
+ 
+ ### Basic Text-to-Speech
+ 
+ ```typescript
+ const text = 'Hello, this is a test of text-to-speech functionality.';
+ const params = {
+   voice_id: 0, // Speaker voice ID
+   temperature: 0.7, // Speech variation
+   speed: 1.0, // Speech speed
+ };
+ 
+ const result = await tts.generateSpeech(text, params);
+ 
+ console.log('Audio data:', result.audio_data);
+ console.log('Sample rate:', result.sample_rate);
+ console.log('Audio format:', result.format);
+ ```
+ 
+ ### Advanced TTS with Token Control
+ 
+ ```typescript
+ // Get guide tokens for precise control
+ const tokensResult = await tts.getGuideTokens(
+   'This text will be converted to speech tokens.'
+ );
+ 
+ console.log('Guide tokens:', tokensResult.tokens);
+ console.log('Token count:', tokensResult.tokens.length);
 
- console.log(`Embedding dimensions: ${embedding.embedding.length}`);
- // Use the embedding for similarity comparison, clustering, etc.
+ // Decode tokens to audio
+ const audioResult = await tts.decodeTokens(tokensResult.tokens);
+ 
+ console.log('Decoded audio:', audioResult.audio_data);
+ console.log('Duration:', audioResult.duration_seconds);
  ```
 
+ ### Complete TTS Example
+ 
+ ```typescript
+ import React, { useState, useEffect } from 'react';
+ import { View, Text, TextInput, TouchableOpacity, Alert } from 'react-native';
+ import { Audio } from 'expo-av';
+ import RNFS from 'react-native-fs';
+ import { CactusTTS } from 'cactus-react-native';
+ 
+ export default function TTSDemo() {
+   const [tts, setTTS] = useState<CactusTTS | null>(null);
+   const [text, setText] = useState('Hello, this is a test of speech synthesis.');
+   const [isGenerating, setIsGenerating] = useState(false);
+   const [sound, setSound] = useState<Audio.Sound | null>(null);
+ 
+   useEffect(() => {
+     initializeTTS();
+     return () => {
+       if (sound) {
+         sound.unloadAsync();
+       }
+     };
+   }, []);
+ 
+   async function initializeTTS() {
+     try {
+       // Download and initialize models
+       const modelPath = await downloadModel();
+       const vocoderPath = await downloadVocoder();
+ 
+       const cactusTTS = await CactusTTS.init({
+         model: modelPath,
+         vocoder: vocoderPath,
+         n_ctx: 1024,
+         n_threads: 4,
+       });
+ 
+       setTTS(cactusTTS);
+     } catch (error) {
+       console.error('Failed to initialize TTS:', error);
+       Alert.alert('Error', 'Failed to initialize TTS');
+     }
+   }
+ 
+   async function generateSpeech() {
+     if (!tts || !text.trim()) return;
+ 
+     setIsGenerating(true);
+     try {
+       const params = {
+         voice_id: 0,
+         temperature: 0.7,
+         speed: 1.0,
+       };
+ 
+       const result = await tts.generateSpeech(text, params);
+ 
+       // Save audio to file
+       const audioPath = `${RNFS.DocumentDirectoryPath}/speech.wav`;
+       await RNFS.writeFile(audioPath, result.audio_data, 'base64');
+ 
+       // Play audio
+       const { sound: audioSound } = await Audio.Sound.createAsync({
+         uri: `file://${audioPath}`,
+       });
+ 
+       setSound(audioSound);
+       await audioSound.playAsync();
+ 
+       console.log(`Generated speech: ${result.duration_seconds}s`);
+     } catch (error) {
+       console.error('Speech generation failed:', error);
+       Alert.alert('Error', 'Failed to generate speech');
+     } finally {
+       setIsGenerating(false);
+     }
+   }
+ 
+   // Helper functions for downloading models would go here...
+ 
+   return (
+     <View style={{ flex: 1, padding: 16 }}>
+       <Text style={{ fontSize: 18, marginBottom: 16 }}>
+         Text-to-Speech Demo
+       </Text>
+ 
+       <TextInput
+         style={{
+           borderWidth: 1,
+           borderColor: '#ddd',
+           borderRadius: 8,
+           padding: 12,
+           marginBottom: 16,
+           minHeight: 100,
+         }}
+         value={text}
+         onChangeText={setText}
+         placeholder="Enter text to convert to speech..."
+         multiline
+       />
+ 
+       <TouchableOpacity
+         onPress={generateSpeech}
+         disabled={isGenerating || !tts}
+         style={{
+           backgroundColor: isGenerating ? '#ccc' : '#007AFF',
+           padding: 16,
+           borderRadius: 8,
+           alignItems: 'center',
+         }}
+       >
+         <Text style={{ color: 'white', fontSize: 16, fontWeight: 'bold' }}>
+           {isGenerating ? 'Generating...' : 'Generate Speech'}
+         </Text>
+       </TouchableOpacity>
+     </View>
+   );
+ }
+ ```
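The example above leaves `downloadModel()` and `downloadVocoder()` unimplemented. A minimal sketch of those helpers, following the `RNFS.downloadFile` pattern from the File Path Requirements section; the URLs are placeholders, not real model links:

```typescript
import RNFS from 'react-native-fs';

// Sketch of the download helpers referenced above. URLs are placeholders.
async function downloadIfMissing(url: string, filename: string): Promise<string> {
  const localPath = `${RNFS.DocumentDirectoryPath}/${filename}`;
  if (!(await RNFS.exists(localPath))) {
    await RNFS.downloadFile({ fromUrl: url, toFile: localPath }).promise;
  }
  return localPath;
}

const downloadModel = () =>
  downloadIfMissing('https://example.com/tts-model.gguf', 'tts-model.gguf');
const downloadVocoder = () =>
  downloadIfMissing('https://example.com/vocoder.gguf', 'vocoder.gguf');
```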
+ 
+ ## Advanced Features
+ 
  ### Session Management
 
+ For the low-level API, you can still access session management:
+ 
  ```typescript
- // Save the current session state
- const tokenCount = await context.saveSession("session.bin", { tokenSize: 1024 });
- console.log(`Saved session with ${tokenCount} tokens`);
+ import { initLlama } from 'cactus-react-native';
+ 
+ const context = await initLlama({ model: '/path/to/model.gguf' });
 
- // Load a saved session
- const loadResult = await context.loadSession("session.bin");
- console.log(`Loaded session: ${loadResult.success}`);
+ // Save session
+ const tokensKept = await context.saveSession('/path/to/session.bin', {
+   tokenSize: 1024 // Number of tokens to keep
+ });
+ 
+ // Load session
+ const sessionInfo = await context.loadSession('/path/to/session.bin');
+ console.log(`Loaded ${sessionInfo.tokens_loaded} tokens`);
  ```
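Sessions are useful for persisting a conversation's processed prompt state across app launches so it does not have to be re-evaluated. A sketch built only from the calls above; `context` is the `initLlama` result, and the restore behavior is an assumption worth verifying:

```typescript
import { AppState } from 'react-native';
import RNFS from 'react-native-fs';

// Sketch: save the session when backgrounding, restore it on the next launch.
const sessionPath = `${RNFS.DocumentDirectoryPath}/chat-session.bin`;

async function restoreSessionIfPresent() {
  if (await RNFS.exists(sessionPath)) {
    const info = await context.loadSession(sessionPath);
    console.log(`Restored ${info.tokens_loaded} tokens`);
  }
}

AppState.addEventListener('change', async (state) => {
  if (state === 'background') {
    await context.saveSession(sessionPath, { tokenSize: 1024 });
  }
});
```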
 
- ### Working with LoRA Adapters
+ ### LoRA Adapters
 
  ```typescript
- // Apply LoRA adapters to the model
+ const context = await initLlama({ model: '/path/to/model.gguf' });
+ 
+ // Apply LoRA adapters
  await context.applyLoraAdapters([
-   { path: "models/lora_adapter.bin", scaled: 0.8 }
+   { path: '/path/to/lora1.gguf', scaled: 1.0 },
+   { path: '/path/to/lora2.gguf', scaled: 0.8 }
  ]);
 
- // Get currently loaded adapters
- const loadedAdapters = await context.getLoadedLoraAdapters();
+ // Get loaded adapters
+ const adapters = await context.getLoadedLoraAdapters();
+ console.log('Loaded adapters:', adapters);
 
- // Remove all LoRA adapters
+ // Remove adapters
  await context.removeLoraAdapters();
  ```
 
- ### Model Benchmarking
+ ### Structured Output (JSON)
 
  ```typescript
- // Benchmark the model performance
- const benchResult = await context.bench(
-   32, // pp: prompt processing tests
-   32, // tg: token generation tests
-   512, // pl: prompt length
-   5 // nr: number of runs
- );
+ const messages = [
+   { role: 'user', content: 'Extract information about this person: John Doe, 30 years old, software engineer from San Francisco' }
+ ];
 
- console.log(`Average token generation speed: ${benchResult.tgAvg} tokens/sec`);
- console.log(`Model size: ${benchResult.modelSize} bytes`);
+ const params = {
+   response_format: {
+     type: 'json_object',
+     schema: {
+       type: 'object',
+       properties: {
+         name: { type: 'string' },
+         age: { type: 'number' },
+         profession: { type: 'string' },
+         location: { type: 'string' }
+       },
+       required: ['name', 'age']
+     }
+   }
+ };
+ 
+ const result = await lm.completion(messages, params);
+ const person = JSON.parse(result.text);
+ console.log(person.name); // "John Doe"
  ```
 
- ### Native Logging
+ ### Performance Monitoring
 
  ```typescript
- import { addNativeLogListener, toggleNativeLog } from 'cactus-react-native';
+ const result = await lm.completion(messages, { n_predict: 100 });
+ 
+ console.log('Performance metrics:');
+ console.log(`Prompt tokens: ${result.timings.prompt_n}`);
+ console.log(`Generated tokens: ${result.timings.predicted_n}`);
+ console.log(`Prompt speed: ${result.timings.prompt_per_second.toFixed(2)} tokens/sec`);
+ console.log(`Generation speed: ${result.timings.predicted_per_second.toFixed(2)} tokens/sec`);
+ console.log(`Total time: ${(result.timings.prompt_ms + result.timings.predicted_ms).toFixed(0)}ms`);
+ ```
 
- // Enable native logging
- await toggleNativeLog(true);
+ ## Best Practices
 
- // Add a listener for native logs
- const logListener = addNativeLogListener((level, text) => {
-   console.log(`[${level}] ${text}`);
- });
+ ### Model Management
+ 
+ ```typescript
+ class ModelManager {
+   private models = new Map<string, CactusLM | CactusVLM | CactusTTS>();
 
- // Remove the listener when no longer needed
- logListener.remove();
+   async loadLM(name: string, modelPath: string): Promise<CactusLM> {
+     if (this.models.has(name)) {
+       return this.models.get(name)! as CactusLM;
+     }
+ 
+     const lm = await CactusLM.init({ model: modelPath });
+     this.models.set(name, lm);
+     return lm;
+   }
+ 
+   async loadVLM(name: string, modelPath: string, mmprojPath: string): Promise<CactusVLM> {
+     if (this.models.has(name)) {
+       return this.models.get(name)! as CactusVLM;
+     }
+ 
+     const vlm = await CactusVLM.init({ model: modelPath, mmproj: mmprojPath });
+     this.models.set(name, vlm);
+     return vlm;
+   }
+ 
+   async unloadModel(name: string): Promise<void> {
+     const model = this.models.get(name);
+     if (model) {
+       await model.release();
+       this.models.delete(name);
+     }
+   }
+ 
+   async unloadAll(): Promise<void> {
+     await Promise.all(
+       Array.from(this.models.values()).map(model => model.release())
+     );
+     this.models.clear();
+   }
+ }
  ```
 
- ## Error Handling
+ ### Error Handling
 
  ```typescript
- try {
-   const context = await initLlama({
-     model: 'models/non-existent-model.gguf',
-     n_ctx: 2048,
-     n_threads: 4
-   });
- } catch (error) {
-   console.error('Failed to initialize model:', error);
+ async function safeCompletion(lm: CactusLM, messages: any[]) {
+   try {
+     const result = await lm.completion(messages, {
+       n_predict: 256,
+       temperature: 0.7,
+     });
+     return { success: true, data: result };
+   } catch (error) {
+     if (error.message.includes('Context is busy')) {
+       // Handle concurrent requests
+       await new Promise(resolve => setTimeout(resolve, 100));
+       return safeCompletion(lm, messages);
+     } else if (error.message.includes('Context not found')) {
+       // Handle context cleanup
+       throw new Error('Model context was released');
+     } else {
+       // Handle other errors
+       console.error('Completion failed:', error);
+       return { success: false, error: error.message };
+     }
+   }
  }
  ```
 
- ## Best Practices
+ ### Memory Management
 
- 1. **Model Management**
-    - Store models in the app's document directory
-    - Consider model size when targeting specific devices
-    - Smaller models like SmolLM (135M) work well on most devices
+ ```typescript
+ // Monitor memory usage
+ const checkMemory = () => {
+   if (Platform.OS === 'android') {
+     // Android-specific memory monitoring
+     console.log('Memory warning - consider releasing unused models');
+   }
+ };
 
- 2. **Performance Optimization**
-    - Adjust `n_threads` based on the device's capabilities
-    - Use a smaller `n_ctx` for memory-constrained devices
-    - Consider INT8 or INT4 quantized models for better performance
+ // Release models when app goes to background
+ import { AppState } from 'react-native';
 
- 3. **Battery Efficiency**
-    - Release the model context when not in use
-    - Process inference in smaller batches
-    - Consider background processing for long generations
+ AppState.addEventListener('change', (nextAppState) => {
+   if (nextAppState === 'background') {
+     // Release non-essential models
+     modelManager.unloadAll();
+   }
+ });
+ ```
+ 
+ ## API Reference
+ 
+ ### High-Level APIs
+ 
+ - `CactusLM.init(params: ContextParams): Promise<CactusLM>` - Initialize language model
+ - `CactusVLM.init(params: VLMContextParams): Promise<CactusVLM>` - Initialize vision language model
+ - `CactusTTS.init(params: TTSContextParams): Promise<CactusTTS>` - Initialize text-to-speech model
+ 
+ ### CactusLM Methods
+ 
+ - `completion(messages: CactusOAICompatibleMessage[], params: CompletionParams, callback?: (token: TokenData) => void): Promise<NativeCompletionResult>`
+ - `embedding(text: string, params?: EmbeddingParams): Promise<NativeEmbeddingResult>`
+ - `rewind(): Promise<void>` - Clear conversation history
+ - `release(): Promise<void>` - Release resources
+ 
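Since `rewind` clears conversation state without unloading weights, starting a new chat this way avoids reloading the model. A brief sketch:

```typescript
// Sketch: start a fresh conversation while keeping the model loaded.
await lm.completion([{ role: 'user', content: 'Hi, my name is Ada.' }], { n_predict: 64 });

await lm.rewind(); // clears history; the loaded model stays in memory

// The next completion starts from a clean conversation state.
await lm.completion([{ role: 'user', content: 'Explain GGUF in one sentence.' }], { n_predict: 64 });
```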
+ ### CactusVLM Methods
+ 
+ - `completion(messages: CactusOAICompatibleMessage[], params: VLMCompletionParams, callback?: (token: TokenData) => void): Promise<NativeCompletionResult>`
+ - `rewind(): Promise<void>` - Clear conversation history
+ - `release(): Promise<void>` - Release resources
+ 
+ ### CactusTTS Methods
+ 
+ - `generateSpeech(text: string, params: TTSSpeechParams): Promise<NativeAudioCompletionResult>`
+ - `getGuideTokens(text: string): Promise<NativeAudioTokensResult>`
+ - `decodeTokens(tokens: number[]): Promise<NativeAudioDecodeResult>`
+ - `release(): Promise<void>` - Release resources
+ 
+ ### Low-Level Functions (Advanced)
+ 
+ For advanced use cases, the original low-level API is still available:
+ 
+ - `initLlama(params: ContextParams): Promise<LlamaContext>` - Initialize a model context
+ - `releaseAllLlama(): Promise<void>` - Release all contexts
+ - `setContextLimit(limit: number): Promise<void>` - Set maximum contexts
+ - `toggleNativeLog(enabled: boolean): Promise<void>` - Enable/disable native logging
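A short sketch combining these, using only the signatures listed above:

```typescript
import {
  initLlama,
  releaseAllLlama,
  setContextLimit,
  toggleNativeLog,
} from 'cactus-react-native';

// Sketch: debug-friendly low-level setup built from the functions listed above.
await toggleNativeLog(true); // surface native-layer logs while debugging
await setContextLimit(1);    // cap the number of simultaneously live contexts

const context = await initLlama({ model: '/path/to/model.gguf', n_ctx: 1024 });
// ... use the context ...
await releaseAllLlama();     // tear down every context at once
```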
 
- 4. **Memory Management**
-    - Always call `context.release()` when done with a model
-    - Use `releaseAllLlama()` when switching between multiple models
+ ## Troubleshooting
 
- ## Example App
+ ### Common Issues
 
- For a complete working example, check out the [React Native example app](https://github.com/cactus-compute/cactus/tree/main/examples/react-example) in the repository.
+ **Model Loading Fails**
+ ```typescript
+ // Check file exists and is accessible
+ if (!(await RNFS.exists(modelPath))) {
+   throw new Error('Model file not found');
+ }
+ 
+ // Check file size
+ const stats = await RNFS.stat(modelPath);
+ console.log('Model size:', stats.size);
+ ```
+ 
+ **Out of Memory**
+ ```typescript
+ // Reduce context size
+ const lm = await CactusLM.init({
+   model: '/path/to/model.gguf',
+   n_ctx: 1024, // Reduce from 4096
+   n_batch: 128, // Reduce batch size
+ });
+ ```
+ 
+ **GPU Issues**
+ ```typescript
+ // Disable GPU if having issues
+ const lm = await CactusLM.init({
+   model: '/path/to/model.gguf',
+   n_gpu_layers: 0, // Use CPU only
+ });
+ ```
 
- This example demonstrates:
- - Loading and initializing models
- - Building a chat interface
- - Streaming responses
- - Proper resource management
+ ### Performance Tips
 
- ## License
+ 1. **Use appropriate context sizes** - Larger contexts use more memory
+ 2. **Optimize batch sizes** - Balance between speed and memory
+ 3. **Cache models** - Don't reload models unnecessarily
+ 4. **Use GPU acceleration** - When available and stable
+ 5. **Monitor memory usage** - Release models when not needed
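Folded into one configuration for a memory-constrained device, these tips might look like the following; the numbers are illustrative starting points, not benchmarks:

```typescript
// Sketch: conservative defaults for older devices. Numbers are illustrative.
const lowMemoryConfig = {
  model: '/path/to/model.gguf',
  n_ctx: 1024,     // tip 1: smaller context window
  n_batch: 128,    // tip 2: smaller batches trade speed for headroom
  n_threads: 4,
  n_gpu_layers: 0, // tip 4: enable GPU only after verifying stability
};

const { lm, error } = await CactusLM.init(lowMemoryConfig);
if (error) throw error;
```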
 
- This project is licensed under the Apache 2.0 License.
+ This documentation covers the essential usage patterns for cactus-react-native. For more examples, check the [example apps](../examples/) in the repository.