PyPI - llama-cpp-python - Versions diffs - 0.2.14__tar.gz → 0.2.16__tar.gz - Mend

llama-cpp-python 0.2.14__tar.gz → 0.2.16__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (528)

llama_cpp_python-0.2.16/.git/FETCH_HEAD ADDED Viewed

	@@ -0,0 +1 @@
1	+ b7e60b66f47950e385980a1329af9dfb14da6906 'b7e60b66f47950e385980a1329af9dfb14da6906' of https://github.com/abetlen/llama-cpp-python

llama_cpp_python-0.2.16/.git/HEAD ADDED Viewed

	@@ -0,0 +1 @@
1	+ b7e60b66f47950e385980a1329af9dfb14da6906

{llama_cpp_python-0.2.14 → llama_cpp_python-0.2.16}/.git/config RENAMED Viewed

@@ -9,7 +9,7 @@
 [gc]
 	auto = 0
 [http "https://github.com/"]
-	extraheader = AUTHORIZATION: basic eC1hY2Nlc3MtdG9rZW46Z2hzX0JoYkh4YlZ0cmp6anh3ajNkZ3pBOEFZREFmTVZQWjFsOFZnMQ==
+	extraheader = AUTHORIZATION: basic eC1hY2Nlc3MtdG9rZW46Z2hzXzJBc3h5aUVKQkZ1Q3M5bjVaWU1ZTk9za3hoSzh0VDFMeVdwRw==
 [submodule "vendor/llama.cpp"]
 	active = true
 	url = https://github.com/ggerganov/llama.cpp.git

llama_cpp_python-0.2.16/.git/index ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/logs/HEAD ADDED Viewed

	@@ -0,0 +1 @@
1	+ 0000000000000000000000000000000000000000 b7e60b66f47950e385980a1329af9dfb14da6906 runner <runner@fv-az711-229.kxtiaivj4gxuxgxjt4etq45iac.phxx.internal.cloudapp.net> 1699615337 +0000 checkout: moving from master to refs/tags/v0.2.16

llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/HEAD ADDED Viewed

	@@ -0,0 +1 @@
1	+ a75fa576abba9d37f463580c379e4bbf1e1ad03c

{llama_cpp_python-0.2.14 → llama_cpp_python-0.2.16}/.git/modules/vendor/llama.cpp/config RENAMED Viewed

@@ -13,7 +13,7 @@
 [gc]
 	auto = 0
 [http "https://github.com/"]
-	extraheader = AUTHORIZATION: basic eC1hY2Nlc3MtdG9rZW46Z2hzX0JoYkh4YlZ0cmp6anh3ajNkZ3pBOEFZREFmTVZQWjFsOFZnMQ==
+	extraheader = AUTHORIZATION: basic eC1hY2Nlc3MtdG9rZW46Z2hzXzJBc3h5aUVKQkZ1Q3M5bjVaWU1ZTk9za3hoSzh0VDFMeVdwRw==
 [url "https://github.com/"]
 	insteadOf = git@github.com:
 	insteadOf = org-6826477@github.com:

llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/index ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/logs/HEAD ADDED Viewed

	@@ -0,0 +1,2 @@
1	+ 0000000000000000000000000000000000000000 a75fa576abba9d37f463580c379e4bbf1e1ad03c runner <runner@fv-az711-229.kxtiaivj4gxuxgxjt4etq45iac.phxx.internal.cloudapp.net> 1699615338 +0000 clone: from https://github.com/ggerganov/llama.cpp.git
2	+ a75fa576abba9d37f463580c379e4bbf1e1ad03c a75fa576abba9d37f463580c379e4bbf1e1ad03c runner <runner@fv-az711-229.kxtiaivj4gxuxgxjt4etq45iac.phxx.internal.cloudapp.net> 1699615338 +0000 checkout: moving from master to a75fa576abba9d37f463580c379e4bbf1e1ad03c

llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/logs/refs/heads/master ADDED Viewed

	@@ -0,0 +1 @@
1	+ 0000000000000000000000000000000000000000 a75fa576abba9d37f463580c379e4bbf1e1ad03c runner <runner@fv-az711-229.kxtiaivj4gxuxgxjt4etq45iac.phxx.internal.cloudapp.net> 1699615338 +0000 clone: from https://github.com/ggerganov/llama.cpp.git

llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/logs/refs/remotes/origin/HEAD ADDED Viewed

	@@ -0,0 +1 @@
1	+ 0000000000000000000000000000000000000000 a75fa576abba9d37f463580c379e4bbf1e1ad03c runner <runner@fv-az711-229.kxtiaivj4gxuxgxjt4etq45iac.phxx.internal.cloudapp.net> 1699615338 +0000 clone: from https://github.com/ggerganov/llama.cpp.git

llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/objects/pack/pack-e9e88c6e4829004ba3844e3ec02cda2d16322828.idx ADDED Viewed

Binary file

llama_cpp_python-0.2.14/.git/modules/vendor/llama.cpp/objects/pack/pack-b61192bd8cad228f74cabbb6f8e9c7e3dbc55ee9.pack → llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/objects/pack/pack-e9e88c6e4829004ba3844e3ec02cda2d16322828.pack RENAMED Viewed

Binary file

llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/objects/pack/pack-e9e88c6e4829004ba3844e3ec02cda2d16322828.rev ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/packed-refs ADDED Viewed

	@@ -0,0 +1,2 @@
1	+ # pack-refs with: peeled fully-peeled sorted
2	+ a75fa576abba9d37f463580c379e4bbf1e1ad03c refs/remotes/origin/master

llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/refs/heads/master ADDED Viewed

	@@ -0,0 +1 @@
1	+ a75fa576abba9d37f463580c379e4bbf1e1ad03c

llama_cpp_python-0.2.16/.git/modules/vendor/llama.cpp/shallow ADDED Viewed

	@@ -0,0 +1 @@
1	+ a75fa576abba9d37f463580c379e4bbf1e1ad03c

llama_cpp_python-0.2.16/.git/objects/23/c7e86cace58018b34f1dae1b548df9981eebf9 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/25/26bcbf5a89773bf179fd631c782274635da9e1 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/2a/6aed81cf0cc6d59972fe184a57666f281dbe8f ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/2e/18b47a0261b4e81255fc71811a7c2405e4e19f ADDED Viewed

Binary file

llama_cpp_python-0.2.14/.git/objects/b7/ea27646d138e37efaad41d5a659d3da6537b6f → llama_cpp_python-0.2.16/.git/objects/36/90f40c28d3d9821712c70f68a25f5671bfcaa8 RENAMED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/45/a1513dde96b5d7f0e3b3a49fc3d7bcda8f7c6f ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/4d/c32b015468696f721ddb37a53d09cf5f9c7612 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/54/3365d8d631f36da2f57381801edabbc3ca4769 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/5b/51e98ce432974ff031367f8937babe755e3d73 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/61/027ef99725c50b0891fdbf0bf263a33abe648f ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/6c/3a6e594fab3a61940f00840cb717f53ea1e8b7 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/72/f6a1211b53960672f7af628800bc86a7c5c547 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/7b/01670640a150525c7671a7a3c1ae652a2d7b3d ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/81/d58f627258591fc76e28e8378d0f9c3d49c9e5 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/8d/063708d0b17c59a8637d2d35ec39e7e27b8171 ADDED Viewed

Binary file

llama_cpp_python-0.2.14/.git/objects/18/41560fc0a62ec24c46e99ddace261786ce07b0 → llama_cpp_python-0.2.16/.git/objects/8e/841233c07f9d6be8b4bf1e25231789a84781c0 RENAMED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/a0/b7d5b55cf67870c3efc3e5c42b96196d1f707c ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/a2/4e55042fd63aeb7e9873fff7474cc9141f4474 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/b7/e60b66f47950e385980a1329af9dfb14da6906 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/e0/b98f7ec76339ad83913015531541a7de9d8e1e ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/e2/1e0bd82d6cacf620ea2f2dd7e8e7e2ee34b42a ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/e6/f024107b7e75246ba7a7b083b2aafaada82697 ADDED Viewed

Binary file

llama_cpp_python-0.2.14/.git/objects/b3/164f85806ec28003f217bd108671c4143298d7 → llama_cpp_python-0.2.16/.git/objects/f1/76c95ddb207e422703d8a73dd0d12a984a838f RENAMED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/f1/b8e9d154231932c4b7b9b59611626764e68632 ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/objects/f7/2b9b39ef1e5d433ac15638f1090b96c582eb5b ADDED Viewed

Binary file

llama_cpp_python-0.2.16/.git/refs/tags/v0.2.16 ADDED Viewed

	@@ -0,0 +1 @@
1	+ b7e60b66f47950e385980a1329af9dfb14da6906

llama_cpp_python-0.2.16/.git/shallow ADDED Viewed

	@@ -0,0 +1 @@
1	+ b7e60b66f47950e385980a1329af9dfb14da6906

{llama_cpp_python-0.2.14 → llama_cpp_python-0.2.16}/.github/workflows/build-and-release.yaml RENAMED Viewed

@@ -33,6 +33,9 @@ jobs:
       - name: Build wheels
         run: python -m cibuildwheel --output-dir wheelhouse
+        env:
+          # disable repair
+          CIBW_REPAIR_WHEEL_COMMAND: ""
       - uses: actions/upload-artifact@v3
         with:

{llama_cpp_python-0.2.14 → llama_cpp_python-0.2.16}/CHANGELOG.md RENAMED Viewed

@@ -7,9 +7,30 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [0.2.16]
+- Update llama.cpp to ggerganov/llama.cpp@a75fa576abba9d37f463580c379e4bbf1e1ad03c
+- Add `set_seed` to `Llama` class by @abetlen in fd41ed3a908761d286102a019a34c2938a15118d
+- Fix server doc arguments by @kjunggithub in #892
+- Fix response_format handler in llava chat handler by @abetlen in b62c44983921197ed10a7d29dc4ba920e9979380
+- Fix default max_tokens, chat completion is now unlimited (to context length) and completion is 16 tokens to match OpenAI defaults by @abetlen in e7962d2c733cbbeec5a37392c81f64185a9a39e8
+- Fix json_schema_to_gbnf helper so that it takes a json schema string as input instead by @abetlen in faeae181b1e868643c0dc28fcf039f077baf0829
+- Add support for $ref and $def in json_schema_to_gbnf to handle more complex function schemas by @abetlen in 770df344369c0630df1be14be9f9e301e7c56d24
+- Update functionary chat handler for new OpenAI API by @abetlen in 1b376c62b775b401653facf25a519d116aafe99a
+- Fix add default stop sequence to chatml chat format by @abetlen in b84d76a844149216d511cfd8cdb9827148a1853c
+- Fix sampling bug when logits_all=False by @abetlen in 6f0b0b1b840af846938ed74d0e8170a91c40e617
+## [0.2.15]
+- Update llama.cpp to ggerganov/llama.cpp@0a7c980b6f94a049cb804573df2d8092a34df8e4
+- Add support for Llava1.5 multimodal models by @damian0815 and @abetlen in #821
+- Update OpenAI API compatibility to match dev day update by @abetlen in #821
+- Add seed parameter to completion and chat_completion functions of Llama class by @abetlen in 86aeb9f3a14808575d2bb0076e6acb4a30907e6a
+- Add JSON mode support to constrain chat completion to JSON objects by @abetlen in b30b9c338bf9af316d497ea501d39f5c246900db
 ## [0.2.14]
-- Update llama.cpp to f0b30ef7dc1360922ccbea0a8cd3918ecf15eaa7
+- Update llama.cpp to ggerganov/llama.cpp@f0b30ef7dc1360922ccbea0a8cd3918ecf15eaa7
 - Add support for Huggingface Autotokenizer Chat Formats by @bioshazard and @abetlen in #790 and bbffdaebaa7bb04b543dbf683a07276087251f86
 - Fix llama-2 chat format by @earonesty in #869
 - Add support for functionary chat format by @abetlen in #784
@@ -17,7 +38,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [0.2.13]
-- Update llama.cpp to 51b2fc11f7f605fff49725a4540e9a6ef7b51b70
+- Update llama.cpp to ggerganov/llama.cpp@51b2fc11f7f605fff49725a4540e9a6ef7b51b70
 - Fix name 'open' is not defined exception when deleting model by @abetlen in 011b95d7f34cbfc528af75a892757bd9a20838ab
 - Fix tokenization of special characters by @antoine-lizee in #850
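Several of the 0.2.15 and 0.2.16 entries above concern OpenAI-style request fields: `seed`, JSON mode, and the new `max_tokens` defaults. As a minimal sketch, the helper below assembles a chat completion request body that uses them. It assumes the server accepts the same field names as the OpenAI API (`seed`, `response_format`, `max_tokens`), which this diff does not show directly. `build_chat_request` is a hypothetical name, not part of `llama-cpp-python`, and nothing is sent over the network.

```python
import json
from typing import Any, Dict, List, Optional


def build_chat_request(
    messages: List[Dict[str, Any]],
    seed: Optional[int] = None,
    json_mode: bool = False,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble an OpenAI-style chat completion body (hypothetical helper)."""
    body: Dict[str, Any] = {"messages": messages}
    if seed is not None:
        # Assumed to map to the seed parameter added in 0.2.15.
        body["seed"] = seed
    if json_mode:
        # OpenAI-style JSON mode; assumed to match the JSON mode entry in 0.2.15.
        body["response_format"] = {"type": "json_object"}
    if max_tokens is not None:
        # When omitted, the 0.2.16 changelog says chat completions are limited
        # only by the context length.
        body["max_tokens"] = max_tokens
    return body


request_body = build_chat_request(
    [{"role": "user", "content": "List three colors as a JSON object."}],
    seed=1234,
    json_mode=True,
    max_tokens=128,
)
print(json.dumps(request_body, indent=2))
```

Leaving `max_tokens` unset keeps the key out of the body entirely, so the server-side default applies.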

{llama_cpp_python-0.2.14 → llama_cpp_python-0.2.16}/CMakeLists.txt RENAMED Viewed

@@ -41,4 +41,23 @@ if (LLAMA_BUILD)
         FILES $<TARGET_RUNTIME_DLLS:llama>
         DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
     )
+    add_subdirectory(vendor/llama.cpp/examples/llava)
+    set_target_properties(llava_shared PROPERTIES OUTPUT_NAME "llava")
+    install(
+        TARGETS llava_shared
+        LIBRARY DESTINATION ${SKBUILD_PLATLIB_DIR}/llama_cpp
+        RUNTIME DESTINATION ${SKBUILD_PLATLIB_DIR}/llama_cpp
+        ARCHIVE DESTINATION ${SKBUILD_PLATLIB_DIR}/llama_cpp
+        FRAMEWORK DESTINATION ${SKBUILD_PLATLIB_DIR}/llama_cpp
+        RESOURCE DESTINATION ${SKBUILD_PLATLIB_DIR}/llama_cpp
+    )
+    # Temporary fix for https://github.com/scikit-build/scikit-build-core/issues/374
+    install(
+        TARGETS llava_shared
+        LIBRARY DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
+        RUNTIME DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
+        ARCHIVE DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
+        FRAMEWORK DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
+        RESOURCE DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/llama_cpp
+    )
 endif()

{llama_cpp_python-0.2.14 → llama_cpp_python-0.2.16}/PKG-INFO RENAMED Viewed

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: llama_cpp_python
-Version: 0.2.14
+Version: 0.2.16
 Summary: Python bindings for the llama.cpp library
 Author-Email: Andrei Betlen <abetlen@gmail.com>
 License: MIT

llama_cpp_python-0.2.16/docs/server.md ADDED Viewed

@@ -0,0 +1,121 @@
+# OpenAI Compatible Server
+`llama-cpp-python` offers an OpenAI API compatible web server.
+This web server can be used to serve local models and easily connect them to existing clients.
+## Setup
+### Installation
+The server can be installed by running the following command:
+```bash
+pip install llama-cpp-python[server]
+```
+### Running the server
+The server can then be started by running the following command:
+```bash
+python3 -m llama_cpp.server --model <model_path>
+```
+### Server options
+For a full list of options, run:
+```bash
+python3 -m llama_cpp.server --help
+```
+NOTE: All server options are also available as environment variables. For example, `--model` can be set by setting the `MODEL` environment variable.
+## Guides
+### Code Completion
+`llama-cpp-python` supports code completion via GitHub Copilot.
+*NOTE*: Without GPU acceleration this is unlikely to be fast enough to be usable.
+You'll first need to download one of the available code completion models in GGUF format:
+- [replit-code-v1_5-GGUF](https://huggingface.co/abetlen/replit-code-v1_5-3b-GGUF)
+Then you'll need to run the OpenAI compatible web server with a substantially increased context size for GitHub Copilot requests:
+```bash
+python3 -m llama_cpp.server --model <model_path> --n_ctx 16192
+```
+Then just update your settings in `.vscode/settings.json` to point to your code completion server:
+```json
+{
+    // ...
+    "github.copilot.advanced": {
+        "debug.testOverrideProxyUrl": "http://<host>:<port>",
+        "debug.overrideProxyUrl": "http://<host>:<port>"
+    }
+    // ...
+}
+```
+### Function Calling
+`llama-cpp-python` supports structured function calling based on a JSON schema.
+You'll first need to download one of the available function calling models in GGUF format:
+- [functionary-7b-v1](https://huggingface.co/abetlen/functionary-7b-v1-GGUF)
+Then when you run the server you'll also need to specify the `functionary` chat_format:
+```bash
+python3 -m llama_cpp.server --model <model_path> --chat_format functionary
+```
+### Multimodal Models
+`llama-cpp-python` supports the llava1.5 family of multi-modal models which allow the language model to
+read information from both text and images.
+You'll first need to download one of the available multi-modal models in GGUF format:
+- [llava-v1.5-7b](https://huggingface.co/mys/ggml_llava-v1.5-7b)
+- [llava-v1.5-13b](https://huggingface.co/mys/ggml_llava-v1.5-13b)
+- [bakllava-1-7b](https://huggingface.co/mys/ggml_bakllava-1)
+Then when you run the server you'll also need to specify the path to the clip model used for image embedding and the `llava-1-5` chat_format:
+```bash
+python3 -m llama_cpp.server --model <model_path> --clip_model_path <clip_model_path> --chat_format llava-1-5
+```
+Then you can just use the OpenAI API as normal:
+```python3
+from openai import OpenAI
+client = OpenAI(base_url="http://<host>:<port>/v1", api_key="sk-xxx")
+response = client.chat.completions.create(
+    model="gpt-4-vision-preview",
+    messages=[
+        {
+            "role": "user",
+            "content": [
+                {
+                    "type": "image_url",
+                    "image_url": {
+                        "url": "<image_url>"
+                    },
+                },
+                {"type": "text", "text": "What does the image say"},
+            ],
+        }
+    ],
+)
+print(response)
+```
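The request in the added `docs/server.md` passes the image through an `<image_url>` placeholder. As a minimal sketch, the helpers below build the same message structure and encode local image bytes inline as a base64 `data:` URI. It assumes the server accepts `data:` URIs in the `url` field the way the OpenAI API does, which this diff does not confirm for `llama_cpp.server`. `image_to_data_uri` and `build_image_message` are hypothetical names, not part of `llama-cpp-python`.

```python
import base64
from typing import Any, Dict


def image_to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_image_message(image_url: str, question: str) -> Dict[str, Any]:
    """Build a user message with one image part and one text part,
    mirroring the structure used in docs/server.md."""
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": question},
        ],
    }


# Placeholder bytes stand in for a real image file's contents.
data_uri = image_to_data_uri(b"abc", "image/png")
message = build_image_message(data_uri, "What does the image say")
print(message["content"][0]["image_url"]["url"])
```

The resulting `message` can be used as an element of `messages` in the `client.chat.completions.create(...)` call shown in the docs.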

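The launch commands in the added `docs/server.md` differ only in their flags. The sketch below assembles them from the flags that appear in that file (`--model`, `--n_ctx`, `--chat_format`, `--clip_model_path`) and from the `MODEL` environment variable mentioned in its note. `build_server_command` and `build_server_env` are hypothetical helpers, the `.gguf` paths are placeholders, and nothing here starts the server.

```python
import os
from typing import Dict, List, Optional


def build_server_command(
    model_path: str,
    n_ctx: Optional[int] = None,
    chat_format: Optional[str] = None,
    clip_model_path: Optional[str] = None,
) -> List[str]:
    """Return the argv list for `python3 -m llama_cpp.server` (not executed)."""
    cmd = ["python3", "-m", "llama_cpp.server", "--model", model_path]
    if n_ctx is not None:
        cmd += ["--n_ctx", str(n_ctx)]
    if clip_model_path is not None:
        cmd += ["--clip_model_path", clip_model_path]
    if chat_format is not None:
        cmd += ["--chat_format", chat_format]
    return cmd


def build_server_env(
    model_path: str, base_env: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with MODEL set, as an alternative
    to passing --model on the command line."""
    env = dict(os.environ if base_env is None else base_env)
    env["MODEL"] = model_path
    return env


# Placeholder paths; substitute real model files.
copilot_cmd = build_server_command("model.gguf", n_ctx=16192)
llava_cmd = build_server_command(
    "model.gguf", chat_format="llava-1-5", clip_model_path="clip.gguf"
)
print(" ".join(copilot_cmd))
print(" ".join(llava_cmd))
```

Either list can be handed to `subprocess.Popen`, optionally with `env=build_server_env(...)` when the model is supplied through the environment.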