@runanywhere/llamacpp 0.17.6 → 0.18.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -10,6 +10,8 @@ LlamaCPP backend for the RunAnywhere React Native SDK. Provides on-device LLM te
 
 - **Text Generation** — Generate text responses from prompts
 - **Streaming** — Real-time token-by-token output
+- **Tool Calling** — Let models invoke registered tools during generation
+- **Structured Output** — Generate type-safe JSON responses
 - **GGUF Support** — Run any GGUF-format model (Llama, Mistral, Qwen, SmolLM, etc.)
 - **Metal GPU Acceleration** — 3-5x faster inference on Apple Silicon (iOS)
 - **CPU Inference** — Works on all devices without GPU requirements
@@ -20,7 +22,7 @@ LlamaCPP backend for the RunAnywhere React Native SDK. Provides on-device LLM te
 ## Requirements
 
 - `@runanywhere/core` (peer dependency)
-- React Native 0.71+
+- React Native 0.74+
 - iOS 15.1+ / Android API 24+
 
 ---
@@ -246,6 +248,51 @@ const result = await streamResult.result;
 console.log('\nSpeed:', result.performanceMetrics.tokensPerSecond, 'tok/s');
 ```
 
+#### Tool Calling
+
+Register tools and let the LLM call them during generation. Tool-call parsing and prompt formatting are handled entirely in C++ for consistency across platforms.
+
+```typescript
+import { RunAnywhere } from '@runanywhere/core';
+import { LlamaCPP } from '@runanywhere/llamacpp';
+
+// Register a tool
+RunAnywhere.registerTool(
+  {
+    name: 'calculate',
+    description: 'Perform a math calculation',
+    parameters: [
+      { name: 'expression', type: 'string', description: 'Math expression', required: true },
+    ],
+  },
+  async (args) => {
+    const result = eval(args.expression as string); // simplified example; avoid eval in production
+    return { result };
+  }
+);
+
+// Generate with tools
+const result = await RunAnywhere.generateWithTools(
+  'What is 42 * 17?',
+  {
+    autoExecute: true,
+    maxToolCalls: 3,
+    temperature: 0.7,
+    format: 'default', // 'default' for most models, 'lfm2' for Liquid AI models
+  }
+);
+console.log(result.text); // "42 * 17 = 714"
+```
+
+**Supported tool calling formats:**
+
+| Format | Tag Pattern | Models |
+|--------|-------------|--------|
+| `default` | `<tool_call>{"tool":"name","arguments":{}}</tool_call>` | Llama, Qwen, Mistral, SmolLM, most GGUF models |
+| `lfm2` | `<\|tool_call_start\|>[func(arg="val")]<\|tool_call_end\|>` | Liquid AI LFM2-Tool models |
+
+---
+
 #### Model Management
 
 ```typescript
@@ -270,27 +317,38 @@ Any GGUF-format model works with this backend. Recommended models:
 
 ### Small Models (< 1GB RAM)
 
-| Model | Size | Memory | Description |
-|-------|------|--------|-------------|
-| SmolLM2 360M Q8_0 | ~400MB | 500MB | Fast, lightweight |
-| Qwen 2.5 0.5B Q6_K | ~500MB | 600MB | Multilingual |
-| LFM2 350M Q4_K_M | ~200MB | 250MB | Ultra-compact |
+| Model | Size | Memory | Tool Calling | Description |
+|-------|------|--------|:------------:|-------------|
+| SmolLM2 360M Q8_0 | ~400MB | 500MB | - | Fast, lightweight |
+| Qwen 2.5 0.5B Q6_K | ~500MB | 600MB | Yes | Multilingual |
+| LFM2 350M Q4_K_M | ~200MB | 250MB | Yes (lfm2) | Ultra-compact, Liquid AI |
 
 ### Medium Models (1-3GB RAM)
 
-| Model | Size | Memory | Description |
-|-------|------|--------|-------------|
-| Phi-3 Mini Q4_K_M | ~2GB | 2.5GB | Microsoft |
-| Gemma 2B Q4_K_M | ~1.5GB | 2GB | Google |
-| TinyLlama 1.1B Q4_K_M | ~700MB | 1GB | Fast chat |
+| Model | Size | Memory | Tool Calling | Description |
+|-------|------|--------|:------------:|-------------|
+| Phi-3 Mini Q4_K_M | ~2GB | 2.5GB | - | Microsoft |
+| Gemma 2B Q4_K_M | ~1.5GB | 2GB | - | Google |
+| LFM2 1.2B Q4_K_M | ~800MB | 1GB | Yes (lfm2) | Liquid AI tool-calling |
+| Qwen 2.5 1.5B Instruct Q4_K_M | ~1GB | 1.5GB | Yes | Alibaba, multilingual |
+| TinyLlama 1.1B Q4_K_M | ~700MB | 1GB | - | Fast chat |
 
 ### Large Models (4GB+ RAM)
 
-| Model | Size | Memory | Description |
-|-------|------|--------|-------------|
-| Llama 2 7B Q4_K_M | ~4GB | 5GB | Meta |
-| Mistral 7B Q4_K_M | ~4GB | 5GB | Mistral AI |
-| Llama 3.2 3B Q4_K_M | ~2GB | 3GB | Meta latest |
+| Model | Size | Memory | Tool Calling | Description |
+|-------|------|--------|:------------:|-------------|
+| Llama 3.2 3B Instruct Q4_K_M | ~2GB | 3GB | Yes | Meta latest |
+| Mistral 7B Instruct Q4_K_M | ~4GB | 5GB | Yes | Mistral AI |
+| Qwen 2.5 7B Instruct Q4_K_M | ~4GB | 5GB | Yes | Alibaba |
+| Llama 2 7B Chat Q4_K_M | ~4GB | 5GB | - | Meta |
+
+### Tool Calling Model Selection Guide
+
+- **Best for tool calling (small):** LFM2-350M-Tool (use `format: 'lfm2'`) or Qwen 2.5 0.5B
+- **Best for tool calling (medium):** LFM2-1.2B-Tool or Qwen 2.5 1.5B Instruct
+- **Best for tool calling (large):** Mistral 7B Instruct or Qwen 2.5 7B Instruct
+- **Instruct-tuned models** generally perform better at following tool calling instructions
+- Use `format: 'lfm2'` only with Liquid AI LFM2-Tool models; all others use `format: 'default'`
 
 ---
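The `default` tag pattern documented in the README hunk above can be illustrated with a small standalone parser. This is a hypothetical sketch that only mirrors the documented tag format; the package's real parsing happens in its C++ layer, and `ParsedToolCall`/`parseDefaultToolCalls` are illustrative names, not SDK API.

```typescript
// Hypothetical sketch: extract `default`-format tool calls of the shape
// <tool_call>{"tool":"name","arguments":{}}</tool_call> from model output.
// Not the package's actual implementation (which lives in C++).
interface ParsedToolCall {
  tool: string;
  arguments: Record<string, unknown>;
}

function parseDefaultToolCalls(modelOutput: string): ParsedToolCall[] {
  const calls: ParsedToolCall[] = [];
  const tagPattern = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(modelOutput)) !== null) {
    try {
      calls.push(JSON.parse(match[1]) as ParsedToolCall);
    } catch {
      // Skip malformed JSON between the tags.
    }
  }
  return calls;
}

const sample =
  'Let me compute that. <tool_call>{"tool":"calculate","arguments":{"expression":"42 * 17"}}</tool_call>';
console.log(parseDefaultToolCalls(sample)[0].tool); // "calculate"
```

The `lfm2` format uses a different, function-call-like syntax and would need its own parser; the table above shows its tag pattern.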
 
package/android/CMakeLists.txt CHANGED
@@ -19,8 +19,12 @@ set(JNILIB_DIR ${CMAKE_SOURCE_DIR}/src/main/jniLibs/${ANDROID_ABI})
 # Downloaded via Gradle downloadNativeLibs task
 # =============================================================================
 if(NOT EXISTS "${JNILIB_DIR}/librac_backend_llamacpp.so")
-  message(FATAL_ERROR "[RunAnywhereLlama] RABackendLlamaCPP not found at ${JNILIB_DIR}/librac_backend_llamacpp.so\n"
-    "Run: ./gradlew :runanywhere_llamacpp:downloadNativeLibs")
+  message(WARNING "[RunAnywhereLlama] RABackendLlamaCPP not found for ${ANDROID_ABI} at ${JNILIB_DIR}/librac_backend_llamacpp.so\n"
+    "This ABI will not be functional. To fix, run: ./gradlew :runanywhere_llamacpp:downloadNativeLibs\n"
+    "Or set reactNativeArchitectures=arm64-v8a in gradle.properties to skip this ABI.")
+  file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/stub.cpp" "// Stub for missing ABI ${ANDROID_ABI}")
+  add_library(${PACKAGE_NAME} SHARED "${CMAKE_CURRENT_BINARY_DIR}/stub.cpp")
+  return()
 endif()
 
 add_library(rac_backend_llamacpp SHARED IMPORTED)
package/android/build.gradle CHANGED
@@ -38,9 +38,13 @@ def getExtOrDefault(name) {
   return rootProject.ext.has(name) ? rootProject.ext.get(name) : project.properties['RunAnywhereLlama_' + name]
 }
 
-// Only arm64-v8a is supported
+// Supported ABIs - arm64-v8a for physical devices, x86_64 for emulators
+// Can be overridden via gradle.properties: reactNativeArchitectures=arm64-v8a
 def reactNativeArchitectures() {
-  return ["arm64-v8a"]
+  def value = rootProject.hasProperty("reactNativeArchitectures")
+      ? rootProject.property("reactNativeArchitectures")
+      : null
+  return value ? value.split(",").collect { it.trim() } : ["arm64-v8a", "x86_64"]
 }
 
 apply plugin: 'com.android.library'
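The Groovy helper above resolves the ABI list from an optional comma-separated `reactNativeArchitectures` property, falling back to building both `arm64-v8a` and `x86_64`. A minimal TypeScript sketch of the same resolution rule (the function and constant names are illustrative; the real logic is the Groovy shown above):

```typescript
// Mirrors the Gradle fallback logic: an explicit comma-separated
// property wins; otherwise build for the device + emulator ABI pair.
const DEFAULT_ABIS = ["arm64-v8a", "x86_64"];

function resolveAbis(property: string | null): string[] {
  if (!property || property.trim() === "") {
    return DEFAULT_ABIS;
  }
  return property
    .split(",")
    .map((abi) => abi.trim())
    .filter((abi) => abi.length > 0);
}

console.log(resolveAbis(null)); // ["arm64-v8a", "x86_64"]
console.log(resolveAbis("arm64-v8a")); // ["arm64-v8a"]
```

In a consuming project this corresponds to adding `reactNativeArchitectures=arm64-v8a` to `gradle.properties` to build only the device ABI.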
@@ -109,7 +113,7 @@ android {
     targetSdkVersion getExtOrIntegerDefault('targetSdkVersion')
 
     ndk {
-      abiFilters 'arm64-v8a'
+      abiFilters(*reactNativeArchitectures())
     }
 
     externalNativeBuild {
@@ -118,7 +122,7 @@ android {
       arguments "-DANDROID_STL=c++_shared",
         // Fix NitroModules prefab path - use app's build directory
        "-DREACT_NATIVE_NITRO_BUILD_DIR=${rootProject.buildDir}"
-      abiFilters 'arm64-v8a'
+      abiFilters(*reactNativeArchitectures())
     }
   }
 }
@@ -132,7 +136,12 @@ android {
   packagingOptions {
     excludes = [
       "META-INF",
-      "META-INF/**"
+      "META-INF/**",
+      // Exclude librac_commons.so from this module's packaging.
+      // The core package (@runanywhere/core) is the single authoritative source
+      // for librac_commons.so. If this module also packages it, Gradle's native
+      // lib merge may pick a stale version, causing UnsatisfiedLinkError crashes.
+      "**/librac_commons.so"
     ]
     pickFirsts = [
       "**/libc++_shared.so",
@@ -202,11 +211,64 @@ task downloadNativeLibs {
     return
   }
 
-  // Check if libs are already bundled (npm install case)
-  def bundledLibsDir = file("${jniLibsDir}/arm64-v8a")
-  def bundledLibs = bundledLibsDir.exists() ? bundledLibsDir.listFiles()?.findAll { it.name.endsWith(".so") } : []
-  if (bundledLibs?.size() > 0) {
-    logger.lifecycle("[RunAnywhereLlama] Using bundled native libraries from npm package (${bundledLibs.size()} .so files)")
+  // Check if libs are already bundled for ALL requested ABIs (npm install case)
+  def requestedAbis = reactNativeArchitectures()
+  def allAbisBundled = requestedAbis.every { abi ->
+    def abiDir = file("${jniLibsDir}/${abi}")
+    def libs = abiDir.exists() ? abiDir.listFiles()?.findAll { it.name.endsWith(".so") } : []
+    return libs?.size() > 0
+  }
+  if (allAbisBundled) {
+    logger.lifecycle("[RunAnywhereLlama] ✅ Using bundled native libraries from npm package for ABIs: ${requestedAbis.join(', ')}")
+    return
+  }
+  // Check if at least arm64-v8a is bundled (partial bundle - need to download missing ABIs)
+  def arm64Dir = file("${jniLibsDir}/arm64-v8a")
+  def arm64Bundled = arm64Dir.exists() && arm64Dir.listFiles()?.any { it.name.endsWith(".so") }
+  if (arm64Bundled) {
+    def missingAbis = requestedAbis.findAll { abi ->
+      def abiDir = file("${jniLibsDir}/${abi}")
+      def libs = abiDir.exists() ? abiDir.listFiles()?.findAll { it.name.endsWith(".so") } : []
+      return !libs || libs.size() == 0
+    }
+    if (missingAbis.size() > 0) {
+      logger.lifecycle("[RunAnywhereLlama] ⚠️ Bundled libs found for arm64-v8a but missing for: ${missingAbis.join(', ')}")
+      logger.lifecycle("[RunAnywhereLlama] Attempting to download missing ABIs from GitHub releases...")
+      try {
+        def llamacppUrl = "https://github.com/${githubOrg}/${coreRepo}/releases/download/core-v${coreVersion}/RABackendLlamaCPP-android-v${coreVersion}.zip"
+        def tempZip = file("${downloadedLibsDir}/RABackendLlamaCPP-supplement.zip")
+        downloadedLibsDir.mkdirs()
+        new URL(llamacppUrl).withInputStream { input ->
+          tempZip.withOutputStream { output -> output << input }
+        }
+        copy {
+          from zipTree(tempZip)
+          into jniLibsDir
+          exclude "**/libc++_shared.so"
+          // Exclude librac_commons.so - the core package (@runanywhere/core) is the
+          // authoritative source. Including it here risks a stale version winning the
+          // Gradle native lib merge, causing UnsatisfiedLinkError crashes at runtime.
+          exclude "**/librac_commons.so"
+          eachFile { fileCopyDetails ->
+            def pathString = fileCopyDetails.relativePath.pathString
+            def match = pathString =~ /.*\/(arm64-v8a|armeabi-v7a|x86|x86_64)\/(.+\.so)$/
+            if (match) {
+              def abi = match[0][1]
+              def filename = match[0][2]
+              fileCopyDetails.relativePath = new RelativePath(true, abi, filename)
+            } else if (!pathString.endsWith(".so")) {
+              fileCopyDetails.exclude()
+            }
+          }
+          includeEmptyDirs = false
+        }
+        tempZip.delete()
+        logger.lifecycle("[RunAnywhereLlama] ✅ Downloaded missing ABIs successfully")
+      } catch (Exception e) {
+        logger.warn("[RunAnywhereLlama] ⚠️ Could not download missing ABIs: ${e.message}")
+        logger.warn("[RunAnywhereLlama] Building with available ABIs only (arm64-v8a)")
+      }
+    }
     return
   }
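The `eachFile` block above flattens zip entries like `<prefix>/llamacpp/<abi>/<name>.so` down to `<abi>/<name>.so` under `jniLibs`. A TypeScript sketch of that remapping rule, reusing the same regex (the function name and sample paths are illustrative, and unlike the Gradle version this sketch simply returns null for anything that does not sit under a recognized ABI directory):

```typescript
// Sketch of the Gradle eachFile remapping: a zip entry under a known
// ABI directory is flattened to "<abi>/<filename>"; other entries are
// reported as null here (Gradle excludes non-.so entries and leaves
// non-matching .so paths untouched).
const ABI_SO_PATTERN = /.*\/(arm64-v8a|armeabi-v7a|x86|x86_64)\/(.+\.so)$/;

function remapEntry(pathString: string): { abi: string; filename: string } | null {
  const match = ABI_SO_PATTERN.exec(pathString);
  if (!match) {
    return null;
  }
  return { abi: match[1], filename: match[2] };
}

const entry = remapEntry(
  "RABackendLlamaCPP-android-v0.18.0/llamacpp/x86_64/librac_backend_llamacpp.so"
);
console.log(entry); // { abi: "x86_64", filename: "librac_backend_llamacpp.so" }
console.log(remapEntry("RABackendLlamaCPP-android-v0.18.0/README.txt")); // null
```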
 
@@ -249,6 +311,10 @@ task downloadNativeLibs {
   // IMPORTANT: Exclude libc++_shared.so - React Native provides its own
   // Using a different version causes ABI compatibility issues
   exclude "**/libc++_shared.so"
+  // Exclude librac_commons.so - the core package (@runanywhere/core) is the
+  // authoritative source. Including it here risks a stale version winning the
+  // Gradle native lib merge, causing UnsatisfiedLinkError crashes at runtime.
+  exclude "**/librac_commons.so"
   eachFile { fileCopyDetails ->
     def pathString = fileCopyDetails.relativePath.pathString
     // Handle RABackendLlamaCPP-android-vX.Y.Z/llamacpp/ABI/*.so structure
@@ -8,32 +8,32 @@
       <key>BinaryPath</key>
       <string>RABackendLLAMACPP.framework/RABackendLLAMACPP</string>
       <key>LibraryIdentifier</key>
-      <string>ios-arm64</string>
+      <string>ios-arm64_x86_64-simulator</string>
       <key>LibraryPath</key>
       <string>RABackendLLAMACPP.framework</string>
       <key>SupportedArchitectures</key>
       <array>
         <string>arm64</string>
+        <string>x86_64</string>
       </array>
       <key>SupportedPlatform</key>
       <string>ios</string>
+      <key>SupportedPlatformVariant</key>
+      <string>simulator</string>
     </dict>
     <dict>
       <key>BinaryPath</key>
       <string>RABackendLLAMACPP.framework/RABackendLLAMACPP</string>
       <key>LibraryIdentifier</key>
-      <string>ios-arm64_x86_64-simulator</string>
+      <string>ios-arm64</string>
       <key>LibraryPath</key>
       <string>RABackendLLAMACPP.framework</string>
       <key>SupportedArchitectures</key>
       <array>
         <string>arm64</string>
-        <string>x86_64</string>
       </array>
       <key>SupportedPlatform</key>
       <string>ios</string>
-      <key>SupportedPlatformVariant</key>
-      <string>simulator</string>
     </dict>
   </array>
   <key>CFBundlePackageType</key>
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@runanywhere/llamacpp",
-  "version": "0.17.6",
+  "version": "0.18.0",
   "description": "LlamaCpp backend for RunAnywhere React Native SDK - GGUF model support for on-device LLM",
   "main": "src/index.ts",
   "types": "src/index.ts",
@@ -18,7 +18,10 @@
     "src",
     "cpp",
     "ios",
-    "android",
+    "!ios/build",
+    "android/src",
+    "android/build.gradle",
+    "android/CMakeLists.txt",
     "nitrogen",
     "nitro.json",
     "react-native.config.js",