cactus-react-native 0.1.4 → 0.2.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (68)
  1. package/README.md +550 -721
  2. package/android/src/main/java/com/cactus/Cactus.java +41 -0
  3. package/android/src/main/java/com/cactus/LlamaContext.java +19 -0
  4. package/android/src/main/jni.cpp +36 -11
  5. package/android/src/main/jniLibs/arm64-v8a/libcactus.so +0 -0
  6. package/android/src/main/jniLibs/arm64-v8a/libcactus_v8.so +0 -0
  7. package/android/src/main/jniLibs/arm64-v8a/libcactus_v8_2.so +0 -0
  8. package/android/src/main/jniLibs/arm64-v8a/libcactus_v8_2_dotprod.so +0 -0
  9. package/android/src/main/jniLibs/arm64-v8a/libcactus_v8_2_dotprod_i8mm.so +0 -0
  10. package/android/src/main/jniLibs/arm64-v8a/libcactus_v8_2_i8mm.so +0 -0
  11. package/android/src/main/jniLibs/x86_64/libcactus.so +0 -0
  12. package/android/src/main/jniLibs/x86_64/libcactus_x86_64.so +0 -0
  13. package/android/src/newarch/java/com/cactus/CactusModule.java +5 -0
  14. package/android/src/oldarch/java/com/cactus/CactusModule.java +5 -0
  15. package/ios/Cactus.mm +14 -0
  16. package/ios/CactusContext.h +1 -0
  17. package/ios/CactusContext.mm +18 -0
  18. package/ios/cactus.xcframework/ios-arm64_x86_64-simulator/cactus.framework/cactus +0 -0
  19. package/ios/cactus.xcframework/tvos-arm64_x86_64-simulator/cactus.framework/cactus +0 -0
  20. package/lib/commonjs/NativeCactus.js.map +1 -1
  21. package/lib/commonjs/index.js +92 -6
  22. package/lib/commonjs/index.js.map +1 -1
  23. package/lib/commonjs/lm.js +64 -21
  24. package/lib/commonjs/lm.js.map +1 -1
  25. package/lib/commonjs/projectId.js +8 -0
  26. package/lib/commonjs/projectId.js.map +1 -0
  27. package/lib/commonjs/remote.js +153 -0
  28. package/lib/commonjs/remote.js.map +1 -0
  29. package/lib/commonjs/telemetry.js +11 -5
  30. package/lib/commonjs/telemetry.js.map +1 -1
  31. package/lib/commonjs/vlm.js +90 -23
  32. package/lib/commonjs/vlm.js.map +1 -1
  33. package/lib/module/NativeCactus.js.map +1 -1
  34. package/lib/module/index.js +48 -5
  35. package/lib/module/index.js.map +1 -1
  36. package/lib/module/lm.js +63 -21
  37. package/lib/module/lm.js.map +1 -1
  38. package/lib/module/projectId.js +4 -0
  39. package/lib/module/projectId.js.map +1 -0
  40. package/lib/module/remote.js +144 -0
  41. package/lib/module/remote.js.map +1 -0
  42. package/lib/module/telemetry.js +11 -5
  43. package/lib/module/telemetry.js.map +1 -1
  44. package/lib/module/vlm.js +90 -23
  45. package/lib/module/vlm.js.map +1 -1
  46. package/lib/typescript/NativeCactus.d.ts +7 -0
  47. package/lib/typescript/NativeCactus.d.ts.map +1 -1
  48. package/lib/typescript/index.d.ts +3 -1
  49. package/lib/typescript/index.d.ts.map +1 -1
  50. package/lib/typescript/lm.d.ts +4 -3
  51. package/lib/typescript/lm.d.ts.map +1 -1
  52. package/lib/typescript/projectId.d.ts +2 -0
  53. package/lib/typescript/projectId.d.ts.map +1 -0
  54. package/lib/typescript/remote.d.ts +7 -0
  55. package/lib/typescript/remote.d.ts.map +1 -0
  56. package/lib/typescript/telemetry.d.ts +7 -3
  57. package/lib/typescript/telemetry.d.ts.map +1 -1
  58. package/lib/typescript/vlm.d.ts +4 -2
  59. package/lib/typescript/vlm.d.ts.map +1 -1
  60. package/package.json +4 -4
  61. package/scripts/postInstall.js +33 -0
  62. package/src/NativeCactus.ts +7 -0
  63. package/src/index.ts +58 -5
  64. package/src/lm.ts +66 -28
  65. package/src/projectId.ts +1 -0
  66. package/src/remote.ts +175 -0
  67. package/src/telemetry.ts +27 -12
  68. package/src/vlm.ts +104 -25
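
The headline change in 0.2.x is a remote-inference layer (the new `src/remote.ts` and `src/projectId.ts`, plus the expanded telemetry), surfaced in the high-level APIs as an optional `cactusToken` init argument and a `mode` option on completions and embeddings. The sketch below is assembled from the new README content in the diff that follows; the token string and model path are placeholders, and `undefined` stands in for the optional progress callback:

```typescript
import { CactusLM } from 'cactus-react-native';

async function demoCloudFallback() {
  // Third init argument (cactusToken) enables the new cloud fallback;
  // 'your_cactus_token' is a placeholder value.
  const { lm, error } = await CactusLM.init(
    { model: '/absolute/path/to/model.gguf', n_ctx: 2048, embedding: true },
    undefined,            // optional progress callback
    'your_cactus_token',  // placeholder token
  );
  if (error) throw error;

  // 'localfirst' runs on-device and falls back to the remote service on failure.
  const result = await lm.embedding('hello world', undefined, 'localfirst');
  console.log(result.embedding.length);

  await lm.release();
}
```

The same token/mode pair also appears on `CactusVLM` (see the Cloud Fallback section in the README diff below).
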
package/README.md CHANGED
@@ -1,49 +1,49 @@
  # Cactus React Native

- A powerful React Native library for running Large Language Models (LLMs) and Vision Language Models (VLMs) directly on mobile devices, with full support for chat completions, multimodal inputs, embeddings, text-to-speech and advanced features.
+ Run LLMs, VLMs, and TTS models directly on mobile devices.

  ## Installation

- ```bash
- npm install cactus-react-native react-native-fs
- # or
- yarn add cactus-react-native react-native-fs
+ ```json
+ {
+   "dependencies": {
+     "cactus-react-native": "^0.2.1",
+     "react-native-fs": "^2.20.0"
+   }
+ }
  ```

- **Additional Setup:**
- - For iOS: `cd ios && npx pod-install` or `yarn pod-install`
- - For Android: Ensure your `minSdkVersion` is 24 or higher
-
- > **Important**: `react-native-fs` is required for file system access to download and manage model files locally.
+ **Setup:**
+ - iOS: `cd ios && npx pod-install`
+ - Android: Ensure `minSdkVersion` 24+

  ## Quick Start

- ### Basic Text Completion
-
  ```typescript
  import { CactusLM } from 'cactus-react-native';
+ import RNFS from 'react-native-fs';
+
+ const modelPath = `${RNFS.DocumentDirectoryPath}/model.gguf`;

- // Initialize a language model
  const { lm, error } = await CactusLM.init({
-   model: '/path/to/your/model.gguf',
+   model: modelPath,
    n_ctx: 2048,
    n_threads: 4,
  });
- if (error) throw error; // handle error gracefully

- // Generate text
- const messages = [{ role: 'user', content: 'Hello, how are you?' }];
- const params = { n_predict: 100, temperature: 0.7 };
+ if (error) throw error;

- const result = await lm.completion(messages, params);
+ const messages = [{ role: 'user', content: 'Hello!' }];
+ const result = await lm.completion(messages, { n_predict: 100 });
  console.log(result.text);
+ lm.release();
  ```

- ### Complete Chat App Example
+ ## Streaming Chat

  ```typescript
  import React, { useState, useEffect } from 'react';
- import { View, Text, TextInput, TouchableOpacity } from 'react-native';
+ import { View, Text, TextInput, TouchableOpacity, ScrollView, ActivityIndicator } from 'react-native';
  import { CactusLM } from 'cactus-react-native';
  import RNFS from 'react-native-fs';

@@ -52,111 +52,170 @@ interface Message {
    content: string;
  }

- export default function ChatApp() {
+ export default function ChatScreen() {
    const [lm, setLM] = useState<CactusLM | null>(null);
    const [messages, setMessages] = useState<Message[]>([]);
    const [input, setInput] = useState('');
-   const [loading, setLoading] = useState(true);
+   const [isLoading, setIsLoading] = useState(true);
+   const [isGenerating, setIsGenerating] = useState(false);

    useEffect(() => {
      initializeModel();
+     return () => {
+       lm?.release();
+     };
    }, []);

-   async function initializeModel() {
+   const initializeModel = async () => {
      try {
-       // Download model (example URL)
        const modelUrl = 'https://huggingface.co/Cactus-Compute/Qwen3-600m-Instruct-GGUF/resolve/main/Qwen3-0.6B-Q8_0.gguf';
-       const modelPath = `${RNFS.DocumentDirectoryPath}/model.gguf`;
-
-       // Download if not exists
-       if (!(await RNFS.exists(modelPath))) {
-         await RNFS.downloadFile({
-           fromUrl: modelUrl,
-           toFile: modelPath,
-         }).promise;
-       }
+       const modelPath = await downloadModel(modelUrl, 'qwen-600m.gguf');

-       // Initialize language model
-       const { lm, error } = await CactusLM.init({
+       const { lm: model, error } = await CactusLM.init({
          model: modelPath,
          n_ctx: 2048,
          n_threads: 4,
-         n_gpu_layers: 99, // Use GPU acceleration
+         n_gpu_layers: 99,
        });
-       if (error) throw error; // handle error gracefully

-       setLM(lm);
-       setLoading(false);
+       if (error) throw error;
+       setLM(model);
      } catch (error) {
        console.error('Failed to initialize model:', error);
+     } finally {
+       setIsLoading(false);
      }
-   }
+   };
+
+   const downloadModel = async (url: string, filename: string): Promise<string> => {
+     const path = `${RNFS.DocumentDirectoryPath}/${filename}`;
+
+     if (await RNFS.exists(path)) return path;
+
+     console.log('Downloading model...');
+     await RNFS.downloadFile({
+       fromUrl: url,
+       toFile: path,
+       progress: (res) => {
+         const progress = res.bytesWritten / res.contentLength;
+         console.log(`Download progress: ${(progress * 100).toFixed(1)}%`);
+       },
+     }).promise;
+
+     return path;
+   };

-   async function sendMessage() {
-     if (!lm || !input.trim()) return;
+   const sendMessage = async () => {
+     if (!lm || !input.trim() || isGenerating) return;

-     const userMessage: Message = { role: 'user', content: input };
+     const userMessage: Message = { role: 'user', content: input.trim() };
      const newMessages = [...messages, userMessage];
-     setMessages(newMessages);
+     setMessages([...newMessages, { role: 'assistant', content: '' }]);
      setInput('');
+     setIsGenerating(true);

      try {
-       const params = {
-         n_predict: 256,
+       let response = '';
+       await lm.completion(newMessages, {
+         n_predict: 200,
          temperature: 0.7,
          stop: ['</s>', '<|end|>'],
-       };
-
-       const result = await lm.completion(newMessages, params);
-
-       const assistantMessage: Message = {
-         role: 'assistant',
-         content: result.text
-       };
-       setMessages([...newMessages, assistantMessage]);
+       }, (token) => {
+         response += token.token;
+         setMessages(prev => [
+           ...prev.slice(0, -1),
+           { role: 'assistant', content: response }
+         ]);
+       });
      } catch (error) {
-       console.error('Completion failed:', error);
+       console.error('Generation failed:', error);
+       setMessages(prev => [
+         ...prev.slice(0, -1),
+         { role: 'assistant', content: 'Error generating response' }
+       ]);
+     } finally {
+       setIsGenerating(false);
      }
-   }
+   };

-   if (loading) {
+   if (isLoading) {
      return (
        <View style={{ flex: 1, justifyContent: 'center', alignItems: 'center' }}>
-         <Text>Loading model...</Text>
+         <ActivityIndicator size="large" />
+         <Text style={{ marginTop: 16 }}>Loading model...</Text>
        </View>
      );
    }

    return (
-     <View style={{ flex: 1, padding: 16 }}>
-       {/* Messages */}
-       <View style={{ flex: 1 }}>
+     <View style={{ flex: 1, backgroundColor: '#f5f5f5' }}>
+       <ScrollView style={{ flex: 1, padding: 16 }}>
          {messages.map((msg, index) => (
-           <Text key={index} style={{
-             backgroundColor: msg.role === 'user' ? '#007AFF' : '#f0f0f0',
-             color: msg.role === 'user' ? 'white' : 'black',
-             padding: 8,
-             margin: 4,
-             borderRadius: 8,
-           }}>
-             {msg.content}
-           </Text>
+           <View
+             key={index}
+             style={{
+               backgroundColor: msg.role === 'user' ? '#007AFF' : '#ffffff',
+               padding: 12,
+               marginVertical: 4,
+               borderRadius: 12,
+               alignSelf: msg.role === 'user' ? 'flex-end' : 'flex-start',
+               maxWidth: '80%',
+               shadowColor: '#000',
+               shadowOffset: { width: 0, height: 1 },
+               shadowOpacity: 0.2,
+               shadowRadius: 2,
+               elevation: 2,
+             }}
+           >
+             <Text style={{
+               color: msg.role === 'user' ? '#ffffff' : '#000000',
+               fontSize: 16,
+             }}>
+               {msg.content}
+             </Text>
+           </View>
          ))}
-       </View>
-
-       {/* Input */}
-       <View style={{ flexDirection: 'row' }}>
+       </ScrollView>
+
+       <View style={{
+         flexDirection: 'row',
+         padding: 16,
+         backgroundColor: '#ffffff',
+         borderTopWidth: 1,
+         borderTopColor: '#e0e0e0',
+       }}>
          <TextInput
-           style={{ flex: 1, borderWidth: 1, padding: 8, borderRadius: 4 }}
+           style={{
+             flex: 1,
+             borderWidth: 1,
+             borderColor: '#e0e0e0',
+             borderRadius: 20,
+             paddingHorizontal: 16,
+             paddingVertical: 10,
+             fontSize: 16,
+             backgroundColor: '#f8f8f8',
+           }}
            value={input}
            onChangeText={setInput}
            placeholder="Type a message..."
+           multiline
+           onSubmitEditing={sendMessage}
          />
-         <TouchableOpacity
+         <TouchableOpacity
            onPress={sendMessage}
-           style={{ backgroundColor: '#007AFF', padding: 8, borderRadius: 4, marginLeft: 8 }}
+           disabled={isGenerating || !input.trim()}
+           style={{
+             backgroundColor: isGenerating ? '#cccccc' : '#007AFF',
+             borderRadius: 20,
+             paddingHorizontal: 16,
+             paddingVertical: 10,
+             marginLeft: 8,
+             justifyContent: 'center',
+           }}
          >
-           <Text style={{ color: 'white' }}>Send</Text>
+           <Text style={{ color: '#ffffff', fontWeight: 'bold' }}>
+             {isGenerating ? '...' : 'Send'}
+           </Text>
          </TouchableOpacity>
        </View>
      </View>
@@ -164,797 +223,567 @@ export default function ChatApp() {
  }
  ```

- ## File Path Requirements
-
- **Critical**: Cactus requires **absolute local file paths**, not Metro bundler URLs or asset references.
-
- ### ❌ Won't Work
- ```typescript
- // Metro bundler URLs
- 'http://localhost:8081/assets/model.gguf'
-
- // React Native asset requires
- require('./assets/model.gguf')
-
- // Relative paths
- './models/model.gguf'
- ```
-
- ### ✅ Will Work
- ```typescript
- import RNFS from 'react-native-fs';
-
- // Absolute paths in app directories
- const modelPath = `${RNFS.DocumentDirectoryPath}/model.gguf`;
- const imagePath = `${RNFS.DocumentDirectoryPath}/image.jpg`;
-
- // Downloaded/copied files
- const downloadModel = async () => {
-   const modelUrl = 'https://example.com/model.gguf';
-   const localPath = `${RNFS.DocumentDirectoryPath}/model.gguf`;
-
-   await RNFS.downloadFile({
-     fromUrl: modelUrl,
-     toFile: localPath,
-   }).promise;
-
-   return localPath; // Use this path with Cactus
- };
- ```
-
- ### Image Assets
- For images, you need to copy them to local storage first:
-
- ```typescript
- // Copy bundled asset to local storage
- const copyAssetToLocal = async (assetName: string): Promise<string> => {
-   const assetPath = `${RNFS.MainBundlePath}/${assetName}`;
-   const localPath = `${RNFS.DocumentDirectoryPath}/${assetName}`;
-
-   if (!(await RNFS.exists(localPath))) {
-     await RNFS.copyFile(assetPath, localPath);
-   }
-
-   return localPath;
- };
-
- // Usage
- const imagePath = await copyAssetToLocal('demo.jpg');
- const params = { images: [imagePath], n_predict: 200 };
- const result = await vlm.completion(messages, params);
- ```
-
- ### External Images
- Download external images to local storage:
-
- ```typescript
- const downloadImage = async (imageUrl: string): Promise<string> => {
-   const localPath = `${RNFS.DocumentDirectoryPath}/temp_image.jpg`;
-
-   await RNFS.downloadFile({
-     fromUrl: imageUrl,
-     toFile: localPath,
-   }).promise;
-
-   return localPath;
- };
- ```
-
  ## Core APIs

- ### CactusLM (Language Model)
-
- For text-only language models:
+ ### CactusLM

  ```typescript
  import { CactusLM } from 'cactus-react-native';

- // Initialize
- const lm = await CactusLM.init({
+ const { lm, error } = await CactusLM.init({
    model: '/path/to/model.gguf',
-   n_ctx: 4096, // Context window size
-   n_batch: 512, // Batch size for processing
-   n_threads: 4, // Number of threads
-   n_gpu_layers: 99, // GPU layers (0 = CPU only)
+   n_ctx: 2048,
+   n_threads: 4,
+   n_gpu_layers: 99,
+   embedding: true,
  });

- // Text completion
- const messages = [
-   { role: 'system', content: 'You are a helpful assistant.' },
-   { role: 'user', content: 'What is the capital of France?' },
- ];
-
- const params = {
+ const messages = [{ role: 'user', content: 'What is AI?' }];
+ const result = await lm.completion(messages, {
    n_predict: 200,
    temperature: 0.7,
-   top_p: 0.9,
-   stop: ['</s>', '\n\n'],
- };
-
- const result = await lm.completion(messages, params);
-
- // Embeddings
- const embeddingResult = await lm.embedding('Your text here');
- console.log('Embedding vector:', embeddingResult.embedding);
+   stop: ['</s>'],
+ });

- // Cleanup
- await lm.rewind(); // Clear conversation
- await lm.release(); // Release resources
+ const embedding = await lm.embedding('Your text here');
+ await lm.rewind();
+ await lm.release();
  ```

- ### CactusVLM (Vision Language Model)
-
- For multimodal models that can process both text and images:
+ ### CactusVLM

  ```typescript
  import { CactusVLM } from 'cactus-react-native';

- // Initialize with multimodal projector
- const vlm = await CactusVLM.init({
+ const { vlm, error } = await CactusVLM.init({
    model: '/path/to/vision-model.gguf',
    mmproj: '/path/to/mmproj.gguf',
    n_ctx: 2048,
-   n_threads: 4,
-   n_gpu_layers: 99, // GPU for main model, CPU for projector
  });

- // Image + text completion
- const messages = [{ role: 'user', content: 'What do you see in this image?' }];
- const params = {
+ const messages = [{ role: 'user', content: 'Describe this image' }];
+ const result = await vlm.completion(messages, {
    images: ['/path/to/image.jpg'],
    n_predict: 200,
    temperature: 0.3,
- };
-
- const result = await vlm.completion(messages, params);
-
- // Text-only completion (same interface)
- const textMessages = [{ role: 'user', content: 'Tell me a joke' }];
- const textParams = { n_predict: 100 };
- const textResult = await vlm.completion(textMessages, textParams);
+ });

- // Cleanup
- await vlm.rewind();
  await vlm.release();
  ```

- ### CactusTTS (Text-to-Speech)
-
- For text-to-speech generation:
+ ### CactusTTS

  ```typescript
- import { CactusTTS } from 'cactus-react-native';
+ import { CactusTTS, initLlama } from 'cactus-react-native';

- // Initialize with vocoder
- const tts = await CactusTTS.init({
+ const context = await initLlama({
    model: '/path/to/tts-model.gguf',
-   vocoder: '/path/to/vocoder.gguf',
    n_ctx: 1024,
-   n_threads: 4,
  });

- // Generate speech
- const text = 'Hello, this is a test of text-to-speech functionality.';
- const params = {
-   voice_id: 0,
-   temperature: 0.7,
-   speed: 1.0,
- };
-
- const audioResult = await tts.generateSpeech(text, params);
- console.log('Audio data:', audioResult.audio_data);
+ const tts = await CactusTTS.init(context, '/path/to/vocoder.gguf');

- // Advanced token-based generation
- const tokens = await tts.getGuideTokens('Your text here');
- const audio = await tts.decodeTokens(tokens);
+ const audio = await tts.generate(
+   'Hello, this is text-to-speech',
+   '{"speaker_id": 0}'
+ );

- // Cleanup
  await tts.release();
  ```

- ## Text Completion
-
- ### Basic Completion
-
- ```typescript
- const lm = await CactusLM.init({
-   model: '/path/to/model.gguf',
-   n_ctx: 2048,
- });
-
- const messages = [
-   { role: 'user', content: 'Write a short poem about coding' }
- ];
-
- const params = {
-   n_predict: 200,
-   temperature: 0.8,
-   top_p: 0.9,
-   stop: ['</s>', '\n\n'],
- };
-
- const result = await lm.completion(messages, params);
-
- console.log(result.text);
- console.log(`Tokens: ${result.tokens_predicted}`);
- console.log(`Speed: ${result.timings.predicted_per_second.toFixed(2)} tokens/sec`);
- ```
-
- ### Streaming Completion
-
- ```typescript
- const result = await lm.completion(messages, params, (token) => {
-   // Called for each generated token
-   console.log('Token:', token.token);
-   updateUI(token.token);
- });
- ```
+ ## Advanced Usage

- ### Advanced Parameters
+ ### Model Manager

  ```typescript
- const params = {
-   // Generation control
-   n_predict: 256, // Max tokens to generate
-   temperature: 0.7, // Randomness (0.0 - 2.0)
-   top_p: 0.9, // Nucleus sampling
-   top_k: 40, // Top-k sampling
-   min_p: 0.05, // Minimum probability
+ class ModelManager {
+   private models = new Map<string, CactusLM | CactusVLM>();

-   // Repetition control
-   penalty_repeat: 1.1, // Repetition penalty
-   penalty_freq: 0.0, // Frequency penalty
-   penalty_present: 0.0, // Presence penalty
+   async loadLM(name: string, modelPath: string): Promise<CactusLM> {
+     if (this.models.has(name)) {
+       return this.models.get(name) as CactusLM;
+     }
+
+     const { lm, error } = await CactusLM.init({
+       model: modelPath,
+       n_ctx: 2048,
+     });
+
+     if (error) throw error;
+     this.models.set(name, lm);
+     return lm;
+   }

-   // Stop conditions
-   stop: ['</s>', '<|end|>', '\n\n'],
-   ignore_eos: false,
+   async loadVLM(name: string, modelPath: string, mmprojPath: string): Promise<CactusVLM> {
+     if (this.models.has(name)) {
+       return this.models.get(name) as CactusVLM;
+     }
+
+     const { vlm, error } = await CactusVLM.init({
+       model: modelPath,
+       mmproj: mmprojPath,
+     });
+
+     if (error) throw error;
+     this.models.set(name, vlm);
+     return vlm;
+   }

-   // Sampling methods
-   mirostat: 0, // Mirostat sampling (0=disabled)
-   mirostat_tau: 5.0, // Target entropy
-   mirostat_eta: 0.1, // Learning rate
+   async releaseModel(name: string): Promise<void> {
+     const model = this.models.get(name);
+     if (model) {
+       await model.release();
+       this.models.delete(name);
+     }
+   }

-   // Advanced
-   seed: -1, // Random seed (-1 = random)
-   n_probs: 0, // Return token probabilities
- };
- ```
-
- ## Multimodal (Vision)
-
- ### Setup Vision Model
-
- ```typescript
- import { CactusVLM } from 'cactus-react-native';
-
- const vlm = await CactusVLM.init({
-   model: '/path/to/vision-model.gguf',
-   mmproj: '/path/to/mmproj.gguf', // Multimodal projector
-   n_ctx: 4096,
- });
- ```
-
- ### Image Analysis
-
- ```typescript
- // Analyze single image
- const messages = [{ role: 'user', content: 'Describe this image in detail' }];
- const params = {
-   images: ['/path/to/image.jpg'],
-   n_predict: 200,
-   temperature: 0.3,
- };
-
- const result = await vlm.completion(messages, params);
- console.log(result.text);
- ```
-
- ### Multi-Image Analysis
-
- ```typescript
- const imagePaths = [
-   '/path/to/image1.jpg',
-   '/path/to/image2.jpg',
-   '/path/to/image3.jpg'
- ];
-
- const messages = [{ role: 'user', content: 'Compare these images and explain the differences' }];
- const params = {
-   images: imagePaths,
-   n_predict: 300,
-   temperature: 0.4,
- };
-
- const result = await vlm.completion(messages, params);
- ```
-
- ### Conversation with Images
-
- ```typescript
- const conversation = [
-   { role: 'user', content: 'What do you see in this image?' }
- ];
-
- const params = {
-   images: ['/path/to/image.jpg'],
-   n_predict: 256,
-   temperature: 0.3,
- };
-
- const result = await vlm.completion(conversation, params);
- ```
-
- ## Embeddings
-
- ### Text Embeddings
-
- ```typescript
- // Enable embeddings during initialization
- const lm = await CactusLM.init({
-   model: '/path/to/embedding-model.gguf',
-   embedding: true, // Enable embedding mode
-   n_ctx: 512, // Smaller context for embeddings
- });
-
- // Generate embeddings
- const text = 'Your text here';
- const result = await lm.embedding(text);
- console.log('Embedding vector:', result.embedding);
- console.log('Dimensions:', result.embedding.length);
- ```
-
- ### Batch Embeddings
-
- ```typescript
- const texts = [
-   'The quick brown fox',
-   'Machine learning is fascinating',
-   'React Native development'
- ];
-
- const embeddings = await Promise.all(
-   texts.map(text => lm.embedding(text))
- );
-
- // Calculate similarity
- function cosineSimilarity(a: number[], b: number[]): number {
-   const dotProduct = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
-   const magnitudeA = Math.sqrt(a.reduce((sum, ai) => sum + ai * ai, 0));
-   const magnitudeB = Math.sqrt(b.reduce((sum, bi) => sum + bi * bi, 0));
-   return dotProduct / (magnitudeA * magnitudeB);
+   async releaseAll(): Promise<void> {
+     await Promise.all(
+       Array.from(this.models.values()).map(model => model.release())
+     );
+     this.models.clear();
+   }
  }

- const similarity = cosineSimilarity(
-   embeddings[0].embedding,
-   embeddings[1].embedding
- );
+ const modelManager = new ModelManager();
  ```

- ## Text-to-Speech (TTS)
-
- Cactus supports text-to-speech through vocoder models, allowing you to generate speech from text.
-
- ### Setup TTS Model
+ ### File Management Hook

  ```typescript
- import { CactusTTS } from 'cactus-react-native';
-
- const tts = await CactusTTS.init({
-   model: '/path/to/text-model.gguf',
-   vocoder: '/path/to/vocoder-model.gguf',
-   n_ctx: 2048,
- });
- ```
+ import { useState, useCallback } from 'react';
+ import RNFS from 'react-native-fs';

- ### Basic Text-to-Speech
+ interface DownloadProgress {
+   progress: number;
+   isDownloading: boolean;
+   error: string | null;
+ }

- ```typescript
- const text = 'Hello, this is a test of text-to-speech functionality.';
- const params = {
-   voice_id: 0, // Speaker voice ID
-   temperature: 0.7, // Speech variation
-   speed: 1.0, // Speech speed
+ export const useModelDownload = () => {
+   const [downloads, setDownloads] = useState<Map<string, DownloadProgress>>(new Map());
+
+   const downloadModel = useCallback(async (url: string, filename: string): Promise<string> => {
+     const path = `${RNFS.DocumentDirectoryPath}/${filename}`;
+
+     if (await RNFS.exists(path)) {
+       const stats = await RNFS.stat(path);
+       if (stats.size > 0) return path;
+     }
+
+     setDownloads(prev => new Map(prev.set(filename, {
+       progress: 0,
+       isDownloading: true,
+       error: null,
+     })));
+
+     try {
+       await RNFS.downloadFile({
+         fromUrl: url,
+         toFile: path,
+         progress: (res) => {
+           const progress = res.bytesWritten / res.contentLength;
+           setDownloads(prev => new Map(prev.set(filename, {
+             progress,
+             isDownloading: true,
+             error: null,
+           })));
+         },
+       }).promise;
+
+       setDownloads(prev => new Map(prev.set(filename, {
+         progress: 1,
+         isDownloading: false,
+         error: null,
+       })));
+
+       return path;
+     } catch (error) {
+       setDownloads(prev => new Map(prev.set(filename, {
+         progress: 0,
+         isDownloading: false,
+         error: error.message,
+       })));
+       throw error;
+     }
+   }, []);
+
+   return { downloadModel, downloads };
  };
-
- const result = await tts.generateSpeech(text, params);
-
- console.log('Audio data:', result.audio_data);
- console.log('Sample rate:', result.sample_rate);
- console.log('Audio format:', result.format);
- ```
-
- ### Advanced TTS with Token Control
-
- ```typescript
- // Get guide tokens for precise control
- const tokensResult = await tts.getGuideTokens(
-   'This text will be converted to speech tokens.'
- );
-
- console.log('Guide tokens:', tokensResult.tokens);
- console.log('Token count:', tokensResult.tokens.length);
-
- // Decode tokens to audio
- const audioResult = await tts.decodeTokens(tokensResult.tokens);
-
- console.log('Decoded audio:', audioResult.audio_data);
- console.log('Duration:', audioResult.duration_seconds);
  ```

- ### Complete TTS Example
+ ### Vision Chat Component

  ```typescript
  import React, { useState, useEffect } from 'react';
- import { View, Text, TextInput, TouchableOpacity, Alert } from 'react-native';
- import { Audio } from 'expo-av';
+ import { View, Text, TouchableOpacity, Image, Alert } from 'react-native';
+ import { launchImageLibrary } from 'react-native-image-picker';
+ import { CactusVLM } from 'cactus-react-native';
  import RNFS from 'react-native-fs';
- import { CactusTTS } from 'cactus-react-native';

- export default function TTSDemo() {
-   const [tts, setTTS] = useState<CactusTTS | null>(null);
-   const [text, setText] = useState('Hello, this is a test of speech synthesis.');
-   const [isGenerating, setIsGenerating] = useState(false);
-   const [sound, setSound] = useState<Audio.Sound | null>(null);
+ export default function VisionChat() {
+   const [vlm, setVLM] = useState<CactusVLM | null>(null);
+   const [imagePath, setImagePath] = useState<string | null>(null);
+   const [response, setResponse] = useState('');
+   const [isLoading, setIsLoading] = useState(true);
+   const [isAnalyzing, setIsAnalyzing] = useState(false);

    useEffect(() => {
-     initializeTTS();
+     initializeVLM();
      return () => {
-       if (sound) {
-         sound.unloadAsync();
-       }
+       vlm?.release();
      };
    }, []);

-   async function initializeTTS() {
+   const initializeVLM = async () => {
      try {
-       // Download and initialize models
-       const modelPath = await downloadModel();
-       const vocoderPath = await downloadVocoder();
+       const modelUrl = 'https://huggingface.co/Cactus-Compute/SmolVLM2-500m-Instruct-GGUF/resolve/main/SmolVLM2-500M-Video-Instruct-Q8_0.gguf';
+       const mmprojUrl = 'https://huggingface.co/Cactus-Compute/SmolVLM2-500m-Instruct-GGUF/resolve/main/mmproj-SmolVLM2-500M-Video-Instruct-Q8_0.gguf';
+
+       const [modelPath, mmprojPath] = await Promise.all([
+         downloadFile(modelUrl, 'smolvlm-model.gguf'),
+         downloadFile(mmprojUrl, 'smolvlm-mmproj.gguf'),
+       ]);

-       const cactusTTS = await CactusTTS.init({
+       const { vlm: model, error } = await CactusVLM.init({
          model: modelPath,
-         vocoder: vocoderPath,
-         n_ctx: 1024,
-         n_threads: 4,
+         mmproj: mmprojPath,
+         n_ctx: 2048,
        });

-       setTTS(cactusTTS);
+       if (error) throw error;
+       setVLM(model);
      } catch (error) {
-       console.error('Failed to initialize TTS:', error);
-       Alert.alert('Error', 'Failed to initialize TTS');
+       console.error('Failed to initialize VLM:', error);
+       Alert.alert('Error', 'Failed to initialize vision model');
+     } finally {
+       setIsLoading(false);
      }
-   }
+   };
+
+   const downloadFile = async (url: string, filename: string): Promise<string> => {
+     const path = `${RNFS.DocumentDirectoryPath}/${filename}`;
+
+     if (await RNFS.exists(path)) return path;
+
+     await RNFS.downloadFile({ fromUrl: url, toFile: path }).promise;
+     return path;
+   };
+
+   const pickImage = () => {
+     launchImageLibrary(
+       {
+         mediaType: 'photo',
+         quality: 0.8,
+         includeBase64: false,
+       },
+       (response) => {
+         if (response.assets && response.assets[0]) {
+           setImagePath(response.assets[0].uri!);
+           setResponse('');
+         }
+       }
+     );
+   };

-   async function generateSpeech() {
-     if (!tts || !text.trim()) return;
+   const analyzeImage = async () => {
+     if (!vlm || !imagePath) return;

-     setIsGenerating(true);
+     setIsAnalyzing(true);
      try {
-       const params = {
-         voice_id: 0,
-         temperature: 0.7,
-         speed: 1.0,
-       };
-
-       const result = await tts.generateSpeech(text, params);
-
-       // Save audio to file
-       const audioPath = `${RNFS.DocumentDirectoryPath}/speech.wav`;
-       await RNFS.writeFile(audioPath, result.audio_data, 'base64');
-
-       // Play audio
-       const { sound: audioSound } = await Audio.Sound.createAsync({
-         uri: `file://${audioPath}`,
-       });
+       const messages = [{ role: 'user', content: 'Describe this image in detail' }];

-       setSound(audioSound);
-       await audioSound.playAsync();
+       let analysisResponse = '';
+       const result = await vlm.completion(messages, {
+         images: [imagePath],
+         n_predict: 300,
+         temperature: 0.3,
+       }, (token) => {
+         analysisResponse += token.token;
+         setResponse(analysisResponse);
+       });

-       console.log(`Generated speech: ${result.duration_seconds}s`);
+       setResponse(analysisResponse || result.text);
      } catch (error) {
-       console.error('Speech generation failed:', error);
-       Alert.alert('Error', 'Failed to generate speech');
+       console.error('Analysis failed:', error);
+       Alert.alert('Error', 'Failed to analyze image');
      } finally {
-       setIsGenerating(false);
+       setIsAnalyzing(false);
      }
-   }
+   };

-   // Helper functions for downloading models would go here...
+   if (isLoading) {
+     return (
+       <View style={{ flex: 1, justifyContent: 'center', alignItems: 'center' }}>
+         <Text>Loading vision model...</Text>
+       </View>
+     );
+   }

    return (
      <View style={{ flex: 1, padding: 16 }}>
-       <Text style={{ fontSize: 18, marginBottom: 16 }}>
-         Text-to-Speech Demo
+       <Text style={{ fontSize: 24, fontWeight: 'bold', marginBottom: 20 }}>
+         Vision Chat
        </Text>

-       <TextInput
-         style={{
-           borderWidth: 1,
-           borderColor: '#ddd',
-           borderRadius: 8,
-           padding: 12,
-           marginBottom: 16,
-           minHeight: 100,
-         }}
-         value={text}
-         onChangeText={setText}
-         placeholder="Enter text to convert to speech..."
-         multiline
-       />
+       {imagePath && (
+         <Image
+           source={{ uri: imagePath }}
+           style={{
+             width: '100%',
+             height: 200,
+             borderRadius: 8,
+             marginBottom: 16,
+           }}
+           resizeMode="contain"
+         />
+       )}

-       <TouchableOpacity
-         onPress={generateSpeech}
-         disabled={isGenerating || !tts}
-         style={{
-           backgroundColor: isGenerating ? '#ccc' : '#007AFF',
-           padding: 16,
-           borderRadius: 8,
-           alignItems: 'center',
-         }}
-       >
-         <Text style={{ color: 'white', fontSize: 16, fontWeight: 'bold' }}>
-           {isGenerating ? 'Generating...' : 'Generate Speech'}
+       <View style={{ flexDirection: 'row', marginBottom: 16 }}>
+         <TouchableOpacity
+           onPress={pickImage}
+           style={{
+             backgroundColor: '#007AFF',
+             padding: 12,
+             borderRadius: 8,
+             marginRight: 8,
+             flex: 1,
+           }}
+         >
+           <Text style={{ color: 'white', textAlign: 'center', fontWeight: 'bold' }}>
+             Pick Image
+           </Text>
+         </TouchableOpacity>
+
+         <TouchableOpacity
+           onPress={analyzeImage}
+           disabled={!imagePath || isAnalyzing}
+           style={{
+             backgroundColor: !imagePath || isAnalyzing ? '#cccccc' : '#34C759',
+             padding: 12,
+             borderRadius: 8,
+             flex: 1,
+           }}
+         >
+           <Text style={{ color: 'white', textAlign: 'center', fontWeight: 'bold' }}>
+             {isAnalyzing ? 'Analyzing...' : 'Analyze'}
+           </Text>
+         </TouchableOpacity>
+       </View>
+
+       <View style={{
+         flex: 1,
+         backgroundColor: '#f8f8f8',
+         borderRadius: 8,
+         padding: 16,
+       }}>
+         <Text style={{ fontSize: 16, lineHeight: 24 }}>
+           {response || 'Select an image and tap Analyze to get started'}
          </Text>
-       </TouchableOpacity>
+       </View>
      </View>
    );
  }
  ```

- ## Advanced Features
-
- ### Session Management
-
- For the low-level API, you can still access session management:
+ ### Cloud Fallback

  ```typescript
- import { initLlama } from 'cactus-react-native';
+ const { lm } = await CactusLM.init({
+   model: '/path/to/model.gguf',
+   n_ctx: 2048,
+ }, undefined, 'your_cactus_token');

- const context = await initLlama({ model: '/path/to/model.gguf' });
+ // Try local first, fallback to cloud if local fails
+ const embedding = await lm.embedding('text', undefined, 'localfirst');

- // Save session
- const tokensKept = await context.saveSession('/path/to/session.bin', {
-   tokenSize: 1024 // Number of tokens to keep
- });
+ // Vision models also support cloud fallback
+ const { vlm } = await CactusVLM.init({
+   model: '/path/to/model.gguf',
+   mmproj: '/path/to/mmproj.gguf',
+ }, undefined, 'your_cactus_token');

- // Load session
- const sessionInfo = await context.loadSession('/path/to/session.bin');
- console.log(`Loaded ${sessionInfo.tokens_loaded} tokens`);
+ const result = await vlm.completion(messages, {
+   images: ['/path/to/image.jpg'],
+   mode: 'localfirst',
+ });
  ```

- ### LoRA Adapters
+ ### Embeddings & Similarity

  ```typescript
- const context = await initLlama({ model: '/path/to/model.gguf' });
+ const { lm } = await CactusLM.init({
+   model: '/path/to/model.gguf',
+   embedding: true,
+ });

- // Apply LoRA adapters
- await context.applyLoraAdapters([
-   { path: '/path/to/lora1.gguf', scaled: 1.0 },
-   { path: '/path/to/lora2.gguf', scaled: 0.8 }
- ]);
+ const embedding1 = await lm.embedding('machine learning');
+ const embedding2 = await lm.embedding('artificial intelligence');

- // Get loaded adapters
- const adapters = await context.getLoadedLoraAdapters();
- console.log('Loaded adapters:', adapters);
+ function cosineSimilarity(a: number[], b: number[]): number {
+   const dotProduct = a.reduce((sum, ai, i) => sum + ai * b[i], 0);
+   const magnitudeA = Math.sqrt(a.reduce((sum, ai) => sum + ai * ai, 0));
+   const magnitudeB = Math.sqrt(b.reduce((sum, bi) => sum + bi * bi, 0));
+   return dotProduct / (magnitudeA * magnitudeB);
+ }

- // Remove adapters
- await context.removeLoraAdapters();
+ const similarity = cosineSimilarity(embedding1.embedding, embedding2.embedding);
+ console.log('Similarity:', similarity);
  ```

- ### Structured Output (JSON)
-
- ```typescript
- const messages = [
-   { role: 'user', content: 'Extract information about this person: John Doe, 30 years old, software engineer from San Francisco' }
- ];
-
- const params = {
-   response_format: {
-     type: 'json_object',
-     schema: {
-       type: 'object',
-       properties: {
-         name: { type: 'string' },
-         age: { type: 'number' },
-         profession: { type: 'string' },
-         location: { type: 'string' }
-       },
-       required: ['name', 'age']
-     }
-   }
- };
-
- const result = await lm.completion(messages, params);
- const person = JSON.parse(result.text);
- console.log(person.name); // "John Doe"
- ```
+ ## Error Handling & Performance

- ### Performance Monitoring
+ ### Production Error Handling

  ```typescript
- const result = await lm.completion(messages, { n_predict: 100 });
-
- console.log('Performance metrics:');
- console.log(`Prompt tokens: ${result.timings.prompt_n}`);
- console.log(`Generated tokens: ${result.timings.predicted_n}`);
- console.log(`Prompt speed: ${result.timings.prompt_per_second.toFixed(2)} tokens/sec`);
- console.log(`Generation speed: ${result.timings.predicted_per_second.toFixed(2)} tokens/sec`);
- console.log(`Total time: ${(result.timings.prompt_ms + result.timings.predicted_ms).toFixed(0)}ms`);
- ```
-
- ## Best Practices
-
- ### Model Management
-
- ```typescript
- class ModelManager {
-   private models = new Map<string, CactusLM | CactusVLM | CactusTTS>();
-
-   async loadLM(name: string, modelPath: string): Promise<CactusLM> {
-     if (this.models.has(name)) {
-       return this.models.get(name)! as CactusLM;
-     }
-
-     const lm = await CactusLM.init({ model: modelPath });
-     this.models.set(name, lm);
-     return lm;
-   }
-
-   async loadVLM(name: string, modelPath: string, mmprojPath: string): Promise<CactusVLM> {
-     if (this.models.has(name)) {
-       return this.models.get(name)! as CactusVLM;
+ async function safeModelInit(modelPath: string): Promise<CactusLM> {
+   const configs = [
+     { model: modelPath, n_ctx: 4096, n_gpu_layers: 99 },
+     { model: modelPath, n_ctx: 2048, n_gpu_layers: 99 },
+     { model: modelPath, n_ctx: 2048, n_gpu_layers: 0 },
+     { model: modelPath, n_ctx: 1024, n_gpu_layers: 0 },
+   ];
+
+   for (const config of configs) {
+     try {
+       const { lm, error } = await CactusLM.init(config);
+       if (error) throw error;
+       return lm;
+     } catch (error) {
+       console.warn('Config failed:', config, error.message);
+       if (configs.indexOf(config) === configs.length - 1) {
+         throw new Error(`All configurations failed. Last error: ${error.message}`);
+       }
      }
-
-     const vlm = await CactusVLM.init({ model: modelPath, mmproj: mmprojPath });
-     this.models.set(name, vlm);
-     return vlm;
    }
+
+   throw new Error('Model initialization failed');
+ }

-   async unloadModel(name: string): Promise<void> {
-     const model = this.models.get(name);
-     if (model) {
-       await model.release();
-       this.models.delete(name);
+ async function safeCompletion(lm: CactusLM, messages: any[], retries = 3): Promise<any> {
+   for (let i = 0; i < retries; i++) {
+     try {
+       return await lm.completion(messages, { n_predict: 200 });
+     } catch (error) {
+       if (error.message.includes('Context is busy') && i < retries - 1) {
+         await new Promise(resolve => setTimeout(resolve, 1000));
+         continue;
+       }
+       throw error;
      }
    }
-
-   async unloadAll(): Promise<void> {
-     await Promise.all(
-       Array.from(this.models.values()).map(model => model.release())
-     );
-     this.models.clear();
-   }
  }
  ```

- ### Error Handling
+ ### Memory Management

  ```typescript
- async function safeCompletion(lm: CactusLM, messages: any[]) {
-   try {
-     const result = await lm.completion(messages, {
-       n_predict: 256,
-       temperature: 0.7,
-     });
-     return { success: true, data: result };
-   } catch (error) {
-     if (error.message.includes('Context is busy')) {
-       // Handle concurrent requests
-       await new Promise(resolve => setTimeout(resolve, 100));
-       return safeCompletion(lm, messages);
-     } else if (error.message.includes('Context not found')) {
-       // Handle context cleanup
-       throw new Error('Model context was released');
-     } else {
-       // Handle other errors
-       console.error('Completion failed:', error);
-       return { success: false, error: error.message };
+ import { AppState, AppStateStatus } from 'react-native';
+
+ class AppModelManager {
+   private modelManager = new ModelManager();
+
+   constructor() {
+     AppState.addEventListener('change', this.handleAppStateChange);
+   }
+
+   private handleAppStateChange = (nextAppState: AppStateStatus) => {
+     if (nextAppState === 'background') {
+       // Release non-essential models when app goes to background
+       this.modelManager.releaseAll();
+     }
+   };
+
+   async getModel(name: string, modelPath: string): Promise<CactusLM> {
+     try {
+       return await this.modelManager.loadLM(name, modelPath);
+     } catch (error) {
+       // Handle low memory by releasing other models
+       await this.modelManager.releaseAll();
+       return await this.modelManager.loadLM(name, modelPath);
      }
    }
  }
  ```

- ### Memory Management
+ ### Performance Optimization

  ```typescript
- // Monitor memory usage
- const checkMemory = () => {
-   if (Platform.OS === 'android') {
-     // Android-specific memory monitoring
-     console.log('Memory warning - consider releasing unused models');
-   }
+ // Optimize for device capabilities
+ const getOptimalConfig = () => {
+   const { OS } = Platform;
+   const isHighEndDevice = true; // Implement device detection logic
+
+   return {
+     n_ctx: isHighEndDevice ? 4096 : 2048,
+     n_gpu_layers: OS === 'ios' ? 99 : 0, // iOS generally has better GPU support
+     n_threads: isHighEndDevice ? 6 : 4,
+     n_batch: isHighEndDevice ? 512 : 256,
+   };
  };

- // Release models when app goes to background
- import { AppState } from 'react-native';
-
- AppState.addEventListener('change', (nextAppState) => {
-   if (nextAppState === 'background') {
-     // Release non-essential models
-     modelManager.unloadAll();
-   }
+ const config = getOptimalConfig();
+ const { lm } = await CactusLM.init({
+   model: modelPath,
+   ...config,
  });
  ```

  ## API Reference

- ### High-Level APIs
-
- - `CactusLM.init(params: ContextParams): Promise<CactusLM>` - Initialize language model
- - `CactusVLM.init(params: VLMContextParams): Promise<CactusVLM>` - Initialize vision language model
- - `CactusTTS.init(params: TTSContextParams): Promise<CactusTTS>` - Initialize text-to-speech model
-
- ### CactusLM Methods
-
- - `completion(messages: CactusOAICompatibleMessage[], params: CompletionParams, callback?: (token: TokenData) => void): Promise<NativeCompletionResult>`
- - `embedding(text: string, params?: EmbeddingParams): Promise<NativeEmbeddingResult>`
- - `rewind(): Promise<void>` - Clear conversation history
- - `release(): Promise<void>` - Release resources
-
- ### CactusVLM Methods
-
- - `completion(messages: CactusOAICompatibleMessage[], params: VLMCompletionParams, callback?: (token: TokenData) => void): Promise<NativeCompletionResult>`
- - `rewind(): Promise<void>` - Clear conversation history
- - `release(): Promise<void>` - Release resources
+ ### CactusLM

- ### CactusTTS Methods
+ **init(params, onProgress?, cactusToken?)**
+ - `model: string` - Path to GGUF model file
+ - `n_ctx?: number` - Context size (default: 2048)
+ - `n_threads?: number` - CPU threads (default: 4)
+ - `n_gpu_layers?: number` - GPU layers (default: 99)
+ - `embedding?: boolean` - Enable embeddings (default: false)
+ - `n_batch?: number` - Batch size (default: 512)

- - `generateSpeech(text: string, params: TTSSpeechParams): Promise<NativeAudioCompletionResult>`
- - `getGuideTokens(text: string): Promise<NativeAudioTokensResult>`
- - `decodeTokens(tokens: number[]): Promise<NativeAudioDecodeResult>`
- - `release(): Promise<void>` - Release resources
+ **completion(messages, params?, callback?)**
+ - `messages: Array<{role: string, content: string}>` - Chat messages
+ - `n_predict?: number` - Max tokens (default: -1)
+ - `temperature?: number` - Randomness 0.0-2.0 (default: 0.8)
+ - `top_p?: number` - Nucleus sampling (default: 0.95)
+ - `top_k?: number` - Top-k sampling (default: 40)
+ - `stop?: string[]` - Stop sequences
+ - `callback?: (token) => void` - Streaming callback

- ### Low-Level Functions (Advanced)
+ **embedding(text, params?, mode?)**
+ - `text: string` - Text to embed
+ - `mode?: string` - 'local' | 'localfirst' | 'remotefirst' | 'remote'

- For advanced use cases, the original low-level API is still available:
+ ### CactusVLM

- - `initLlama(params: ContextParams): Promise<LlamaContext>` - Initialize a model context
- - `releaseAllLlama(): Promise<void>` - Release all contexts
- - `setContextLimit(limit: number): Promise<void>` - Set maximum contexts
- - `toggleNativeLog(enabled: boolean): Promise<void>` - Enable/disable native logging
+ **init(params, onProgress?, cactusToken?)**
+ - All CactusLM params plus:
+ - `mmproj: string` - Path to multimodal projector

- ## Troubleshooting
+ **completion(messages, params?, callback?)**
+ - All CactusLM completion params plus:
+ - `images?: string[]` - Array of image paths
+ - `mode?: string` - Cloud fallback mode

- ### Common Issues
+ ### Types

- **Model Loading Fails**
  ```typescript
- // Check file exists and is accessible
- if (!(await RNFS.exists(modelPath))) {
-   throw new Error('Model file not found');
+ interface CactusOAICompatibleMessage {
+   role: 'system' | 'user' | 'assistant';
+   content: string;
  }

- // Check file size
- const stats = await RNFS.stat(modelPath);
- console.log('Model size:', stats.size);
- ```
-
- **Out of Memory**
- ```typescript
- // Reduce context size
- const lm = await CactusLM.init({
-   model: '/path/to/model.gguf',
-   n_ctx: 1024, // Reduce from 4096
-   n_batch: 128, // Reduce batch size
- });
- ```
+ interface NativeCompletionResult {
+   text: string;
+   tokens_predicted: number;
+   tokens_evaluated: number;
+   timings: {
+     predicted_per_second: number;
+     prompt_per_second: number;
+   };
+ }

- **GPU Issues**
- ```typescript
- // Disable GPU if having issues
- const lm = await CactusLM.init({
-   model: '/path/to/model.gguf',
-   n_gpu_layers: 0, // Use CPU only
- });
+ interface NativeEmbeddingResult {
+   embedding: number[];
+ }
  ```
-
- ### Performance Tips
-
- 1. **Use appropriate context sizes** - Larger contexts use more memory
- 2. **Optimize batch sizes** - Balance between speed and memory
- 3. **Cache models** - Don't reload models unnecessarily
- 4. **Use GPU acceleration** - When available and stable
- 5. **Monitor memory usage** - Release models when not needed
-
- This documentation covers the essential usage patterns for cactus-react-native. For more examples, check the [example apps](../examples/) in the repository.
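
For orientation after this long diff: the new README's Quick Start, streaming callback, and API reference compose as below. This is a minimal sketch under the same assumptions as the README examples (the model file has already been downloaded to the app's documents directory); it is not an additional API beyond what the diff documents:

```typescript
import { CactusLM } from 'cactus-react-native';
import RNFS from 'react-native-fs';

async function streamOnce() {
  const { lm, error } = await CactusLM.init({
    model: `${RNFS.DocumentDirectoryPath}/model.gguf`, // assumes the file exists
    n_ctx: 2048,
    n_threads: 4,
  });
  if (error) throw error;

  let streamed = '';
  // Third completion argument is the per-token streaming callback
  // from the new API reference above.
  const result = await lm.completion(
    [{ role: 'user', content: 'Hello!' }],
    { n_predict: 100, temperature: 0.7, stop: ['</s>'] },
    (token) => { streamed += token.token; },
  );

  console.log(streamed || result.text);
  await lm.release();
}
```
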