llama_cpp 0.3.1 → 0.3.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 7a1f299e21bfe5b12d517a4254657cbc5bf9af6d0571285e2a5aff67b9175646
- data.tar.gz: 62dd6e0d4f0b052a912d87b52cd0cff5bb873ab12378413a3ee0af5671331ef6
+ metadata.gz: cf337091019bb773e47cf206ff2ff30ed0bef963094494e6493455cad7c59840
+ data.tar.gz: fdbae8e08a6b87d49c5658d5c1857f20bf8efdf5a5371906630dccf4eb0f1159
  SHA512:
- metadata.gz: b12dc73914e5c7ecdd951fd57b70e01aae1926a2adc88030b5f5310f99c789e129cf552811363ec99525b37b9ca167a708cb756057b94f5cf4dd2a0100b06b6e
- data.tar.gz: d1d79696b08f89894de02a02fac91f0783c432efa641b21ee59f6987946b045681a60113392db6c85fe97bd0e1fc9860235faa358fb805bb0de21eb85926edd5
+ metadata.gz: f0fee68294960c5ab9f56ebfe7256a00f9330e55f4954f2b016e07cbc023570298fa8f8b578f3e187fe9183b869769085311931122f93a033c6c21158b4e9485
+ data.tar.gz: 7eec8c98ae9ec1a56fa4bdb4e83a2dc2bdea407fc037af8d1b8f09a30c0d1246333d410707f4d66f3f473bf73574757cf12e56a86a0cb47074501f63f65f0c02
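The checksum values above can be reproduced locally. A minimal Ruby sketch, assuming the released `.gem` archive has been fetched (e.g. `gem fetch llama_cpp --version 0.3.3`) and unpacked with `tar -xf llama_cpp-0.3.3.gem`, which yields `metadata.gz` and `data.tar.gz`:

```ruby
require 'digest'

# Print the digests of the files listed in checksums.yaml.
%w[metadata.gz data.tar.gz].each do |name|
  puts "SHA256(#{name}) = #{Digest::SHA256.file(name).hexdigest}"
  puts "SHA512(#{name}) = #{Digest::SHA512.file(name).hexdigest}"
end
```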
data/CHANGELOG.md CHANGED
@@ -1,3 +1,44 @@
+ ## [[0.3.3](https://github.com/yoshoku/llama_cpp.rb/compare/v0.3.2...v0.3.3)] - 2023-07-15
+
+ - Bump bundled llama.cpp from master-481f793 to master-32c5411.
+ - Add MPI config option:
+ ```
+ $ gem install llama_cpp -- --with-mpi
+ ```
+ - Add `backend_free` module function to `LLaMACpp`. This method should be called once at the end of the program when the MPI option is enabled.
+ - Add `sample_classifier_free_guidance` method to `Context`.
+
+ **Breaking Changes**
+ - Rename `init_backend` method to `backend_init`. This method is called internally at `require 'llama_cpp'`.
+
+ ## [[0.3.2](https://github.com/yoshoku/llama_cpp.rb/compare/v0.3.1...v0.3.2)] - 2023-07-08
+
+ - Bump bundled llama.cpp from master-b8c8dda to master-481f793.
+ - Add `Timings` class and `timings` method to `Context`:
+ ```ruby
+ require 'llama_cpp'
+
+ # ...
+
+ context = LLaMACpp::Context.new(model: model)
+ timings = context.timings
+
+ puts timings.class
+ # => LLaMACpp::Timings
+ puts timings.t_load_ms
+ # => 79.61
+ ```
+ - Expose sampling options as arguments of the `generate` module function:
+ ```ruby
+ require 'llama_cpp'
+
+ # ...
+
+ LLaMACpp.generate(context, 'Hello, world.', top_k: 30, top_p: 0.8, temperature: 0.9)
+ ```
+ - Add `ModelQuantizaParams` class; this class had not been published previously because the author forgot to write rb_define_class.
+ - Minor updates to example scripts, configuration files, and documentation.
+
  ## [[0.3.1](https://github.com/yoshoku/llama_cpp.rb/compare/v0.3.0...v0.3.1)] - 2023-07-02
 
  - Bump bundled llama.cpp from master-9d23589 to master-b8c8dda.
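Taken together, the 0.3.2 and 0.3.3 entries imply the following usage pattern. A minimal sketch, assuming the gem was installed with the MPI option (`gem install llama_cpp -- --with-mpi`) and that `path/to/model.bin` is a placeholder model path:

```ruby
require 'llama_cpp' # backend_init (formerly init_backend) is called internally here

params = LLaMACpp::ContextParams.new
model = LLaMACpp::Model.new(model_path: 'path/to/model.bin', params: params)
context = LLaMACpp::Context.new(model: model)

# Sampling options are exposed as keyword arguments of the generate module function.
puts LLaMACpp.generate(context, 'Hello, world.', top_k: 30, top_p: 0.8, temperature: 0.9)

# Timings collected for this context.
puts context.timings.t_load_ms

# Call once at the end of the program when the MPI option is enabled.
LLaMACpp.backend_free
```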
data/README.md CHANGED
@@ -68,6 +68,15 @@ User:
 
  ![llama_cpp_chat_example](https://github.com/yoshoku/llama_cpp.rb/assets/5562409/374ae3d8-63a6-498f-ae6e-5552b464bdda)
 
+ Japanese chat is also possible using the [Vicuna model on Hugging Face](https://huggingface.co/CRD716/ggml-vicuna-1.1-quantized).
+
+ ```sh
+ $ wget https://huggingface.co/CRD716/ggml-vicuna-1.1-quantized/resolve/main/ggml-vicuna-7b-1.1-q4_0.bin
+ $ ruby chat.rb --model ggml-vicuna-7b-1.1-q4_0.bin --file prompt_jp.txt
+ ```
+
+ ![llama_cpp rb-jpchat](https://github.com/yoshoku/llama_cpp.rb/assets/5562409/526ff18c-2bb2-4b06-8933-f72960024033)
+
  ## Contributing
 
  Bug reports and pull requests are welcome on GitHub at https://github.com/yoshoku/llama_cpp.rb.
data/examples/chat.rb CHANGED
@@ -33,7 +33,7 @@ class Chat < Thor # rubocop:disable Metrics/ClassLength, Style/Documentation
  option :n_gpu_layers, type: :numeric, desc: 'number of layers on GPU', default: 0
  def main # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/MethodLength, Metrics/PerceivedComplexity
  params = LLaMACpp::ContextParams.new
- params.seed = options[:seed]
+ params.seed = options[:seed] if options[:seed] != -1
  params.n_gpu_layers = options[:n_gpu_layers]
  model = LLaMACpp::Model.new(model_path: options[:model], params: params)
  context = LLaMACpp::Context.new(model: model)
data/examples/embedding.rb CHANGED
@@ -18,7 +18,7 @@ class Embedding < Thor # rubocop:disable Style/Documentation
  option :n_gpu_layers, type: :numeric, desc: 'number of layers on GPU', default: 0
  def main # rubocop:disable Metrics/AbcSize, Metrics/MethodLength
  params = LLaMACpp::ContextParams.new
- params.seed = options[:seed]
+ params.seed = options[:seed] if options[:seed] != -1
  params.n_gpu_layers = options[:n_gpu_layers]
  params.embedding = true
  model = LLaMACpp::Model.new(model_path: options[:model], params: params)
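The guarded seed assignment added to both example scripts only overrides the context seed when a value other than the sentinel `-1` is given, so the library's own default seed is otherwise left in place. A minimal sketch of the same pattern; the `ARGV` handling is a stand-in for the scripts' Thor `:seed` option:

```ruby
require 'llama_cpp'

seed = Integer(ARGV.fetch(0, '-1')) # stand-in for the Thor :seed option

params = LLaMACpp::ContextParams.new
# Leave the library default untouched unless an explicit seed was given.
params.seed = seed if seed != -1
```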
data/examples/prompt_jp.txt ADDED
@@ -0,0 +1,8 @@
+ UserがTaroという名前のアシスタントと対話するダイアログのトランスクリプト。
+ Taroは親切で、親切で、正直で、文章を書くのが上手で、ユーザーのリクエストに即座に正確に答えることを怠りません。
+
+ User: こんにちには、Taro。
+ Taro: こんにちは、今日はどのような要件ですか?
+ User: 日本で最大の都市について教えてください。
+ Taro: はい、日本で最大の都市は東京です。日本の首都でもあります。
+ User:
data/ext/llama_cpp/extconf.rb CHANGED
@@ -7,8 +7,9 @@ abort 'libstdc++ is not found.' unless have_library('stdc++')
 
  $srcs = %w[ggml.c llama.cpp llama_cpp.cpp]
  $srcs << 'ggml-opencl.cpp' if with_config('clblast')
- $CFLAGS << ' -w'
- $CXXFLAGS << ' -std=c++11'
+ $srcs << 'ggml-mpi.c' if with_config('mpi')
+ $CFLAGS << ' -w -DNDEBUG'
+ $CXXFLAGS << ' -std=c++11 -DNDEBUG'
  $INCFLAGS << ' -I$(srcdir)/src'
  $VPATH << '$(srcdir)/src'
 
@@ -76,6 +77,14 @@ if with_config('clblast')
  end
  end
 
+ if with_config('mpi')
+ abort 'libmpi is not found.' unless have_library('mpi')
+ abort 'mpi.h is not found.' unless have_header('mpi.h')
+
+ $CFLAGS << ' -DGGML_USE_MPI -Wno-cast-qual'
+ $CXXFLAGS << ' -DGGML_USE_MPI -Wno-cast-qual'
+ end
+
  UNAME_M = RbConfig::CONFIG['build_cpu'] || RbConfig::CONFIG['host_cpu'] || RbConfig::CONFIG['target_cpu']
 
  # rubocop:disable Layout/LineLength
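The build-time switches used in extconf.rb above are standard mkmf `with_config` flags, passed after `--` at install time (as shown in the 0.3.3 changelog entry). A minimal self-contained sketch of the same gating pattern; the extension name `example` is a placeholder:

```ruby
require 'mkmf'

# A flag such as `gem install llama_cpp -- --with-mpi` is visible here via with_config.
if with_config('mpi')
  # Fail early when the MPI toolchain is missing, mirroring the checks above.
  abort 'libmpi is not found.' unless have_library('mpi')
  abort 'mpi.h is not found.' unless have_header('mpi.h')

  $CFLAGS << ' -DGGML_USE_MPI -Wno-cast-qual'
  $CXXFLAGS << ' -DGGML_USE_MPI -Wno-cast-qual'
end

create_makefile('example')
```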