llama_cpp 0.15.1 → 0.15.2

@@ -242,6 +242,9 @@ extern "C" {
         // proportion of the model (layers or rows) to offload to each GPU, size: llama_max_devices()
         const float * tensor_split;
 
+        // comma separated list of RPC servers to use for offloading
+        const char * rpc_servers;
+
         // Called with a progress value between 0.0 and 1.0. Pass NULL to disable.
         // If the provided progress_callback returns true, model loading continues.
         // If it returns false, model loading is immediately aborted.
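
A minimal sketch of how the new field might be used from the C API. It assumes the existing llama.cpp loading functions (llama_model_default_params, llama_load_model_from_file, llama_free_model); the model path and server addresses are hypothetical placeholders.

#include "llama.h"

int main(void) {
    // Start from the library defaults, then point the loader at remote
    // RPC backends via the new comma separated field.
    struct llama_model_params params = llama_model_default_params();
    params.rpc_servers = "192.168.1.10:50052,192.168.1.11:50052"; // example addresses

    struct llama_model * model =
        llama_load_model_from_file("model.gguf", params); // placeholder path
    if (model == NULL) {
        return 1;
    }

    llama_free_model(model);
    return 0;
}

Leaving rpc_servers as NULL (the default) should keep the previous behavior, with offloading limited to local devices.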