PyPI - gpustack-runner - Versions diffs - 0.1.24.post3__tar.gz → 0.1.25__tar.gz - Mend

gpustack-runner 0.1.24.post3__tar.gz → 0.1.25__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (150)

{gpustack_runner-0.1.24.post3 → gpustack_runner-0.1.25}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: gpustack-runner
-Version: 0.1.24.post3
+Version: 0.1.25
 Summary: GPUStack Runner is library for registering runnable accelerated backends and services in GPUStack.
 Project-URL: Homepage, https://github.com/gpustack/runner
 Project-URL: Bug Tracker, https://github.com/gpustack/gpustack/issues
@@ -52,17 +52,17 @@ The following table lists the supported accelerated backends and their correspon
     vllm-ascend [#3316](https://github.com/vllm-project/vllm-ascend/issues/3316)
     and [#2795](https://github.com/vllm-project/vllm-ascend/issues/2795).
-| CANN Version <br/> (Variant) | MindIE    | vLLM                                                               | SGLang                 |
-|------------------------------|-----------|--------------------------------------------------------------------|------------------------|
-| 8.5 (A3/910C)                | `2.3.0`   | `0.14.1`, `0.13.0`                                                 | `0.5.8`                |
-| 8.5 (910B)                   | `2.3.0`   | `0.14.1`, `0.13.0`                                                 | `0.5.8`                |
-| 8.5 (310P)                   | `2.3.0`   | `0.14.1`                                                           |                        |
-| 8.3 (A3/910C)                | `2.2.rc1` | `0.12.0`, `0.11.0`                                                 | `0.5.7`, `0.5.6.post2` |
-| 8.3 (910B)                   | `2.2.rc1` | `0.12.0`, `0.11.0`                                                 | `0.5.7`, `0.5.6.post2` |
-| 8.3 (310P)                   | `2.2.rc1` |                                                                    |                        |
-| 8.2 (A3/910C)                | `2.1.rc2` | `0.10.2`, ~~`0.10.1.1`~~                                           | `0.5.2`, `0.5.1.post3` |
-| 8.2 (910B)                   | `2.1.rc2` | `0.10.2`, ~~`0.10.1.1`~~, <br/>`0.10.0`, `0.9.2`, <br/>~~`0.9.1`~~ | `0.5.2`, `0.5.1.post3` |
-| 8.2 (310P)                   | `2.1.rc2` | `0.10.0`, `0.9.2`                                                  |                        |
+| CANN Version <br/> (Variant) | MindIE    | vLLM                              | SGLang                 |
+|------------------------------|-----------|-----------------------------------|------------------------|
+| 8.5 (A3/910C)                | `2.3.0`   | `0.14.1`, `0.13.0`                | `0.5.8`                |
+| 8.5 (910B)                   | `2.3.0`   | `0.14.1`, `0.13.0`                | `0.5.8`                |
+| 8.5 (310P)                   | `2.3.0`   | `0.14.1`                          |                        |
+| 8.3 (A3/910C)                | `2.2.rc1` | `0.12.0`, `0.11.0`                | `0.5.7`, `0.5.6.post2` |
+| 8.3 (910B)                   | `2.2.rc1` | `0.12.0`, `0.11.0`                | `0.5.7`, `0.5.6.post2` |
+| 8.3 (310P)                   | `2.2.rc1` |                                   |                        |
+| 8.2 (A3/910C)                | `2.1.rc2` | `0.10.2`                          | `0.5.2`, `0.5.1.post3` |
+| 8.2 (910B)                   | `2.1.rc2` | `0.10.2`, `0.10.0`,  <br/>`0.9.2` | `0.5.2`, `0.5.1.post3` |
+| 8.2 (310P)                   | `2.1.rc2` | `0.10.0`, `0.9.2`                 |                        |
 ### Iluvatar CoreX
@@ -80,11 +80,11 @@ The following table lists the supported accelerated backends and their correspon
 > - CUDA 12.6/12.4 supports Compute Capabilities:
     `7.5 8.0+PTX 8.9 9.0+PTX`.
-| CUDA Version <br/> (Variant) | vLLM                                                           | SGLang                                                                      | VoxBox   |
-|------------------------------|----------------------------------------------------------------|-----------------------------------------------------------------------------|----------|
-| 12.9                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`                | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`                                        |          |
-| 12.8                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`, <br/>`0.10.2` | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`, `0.5.5.post3`, <br/>~~`0.5.4.post3`~~ | `0.0.21` |
-| 12.6                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`, <br/>`0.10.2` |                                                                             | `0.0.21` |
+| CUDA Version <br/> (Variant) | vLLM                                                                 | SGLang                                              | VoxBox   |
+|------------------------------|----------------------------------------------------------------------|-----------------------------------------------------|----------|
+| 12.9                         | `0.15.0`, `0.14.1`, <br/>`0.13.0`, `0.12.0`, <br/>`0.11.2`           | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`                |          |
+| 12.8                         | `0.15.0`, `0.14.1`, <br/>`0.13.0`, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`, `0.5.5.post3` | `0.0.21` |
+| 12.6                         | `0.15.0`, `0.14.1`, <br/>`0.13.0`, `0.12.0`, <br/>`0.11.2`, `0.10.2` |                                                     | `0.0.21` |
 ### Hygon DTK
@@ -128,10 +128,10 @@ The following table lists the supported accelerated backends and their correspon
 > - ROCm 6.4 SGLang supports `gfx942` only.
 > - ROCm 7.0 SGLang supports `gfx950` only.
-| ROCm Version <br/> (Variant) | vLLM                                            | SGLang                                     |
-|------------------------------|-------------------------------------------------|--------------------------------------------|
-| 7.0                          | **`0.13.0`**, `0.12.0`, <br/>`0.11.2`           | `0.5.7`, `0.5.6.post2`                     |
-| 6.4                          | **`0.13.0`**, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.7`, `0.5.6.post2`, <br/>`0.5.5.post3` |
+| ROCm Version <br/> (Variant) | vLLM                                                                 | SGLang                                              |
+|------------------------------|----------------------------------------------------------------------|-----------------------------------------------------|
+| 7.0                          | `0.15.0`, `0.14.1`, <br/>`0.13.0`, `0.12.0`, <br/>`0.11.2`           | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`                |
+| 6.4                          | `0.15.0`, `0.14.1`, <br/>`0.13.0`, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`, `0.5.5.post3` |
 ## Directory Structure

{gpustack_runner-0.1.24.post3 → gpustack_runner-0.1.25}/README.md RENAMED

@@ -32,17 +32,17 @@ The following table lists the supported accelerated backends and their correspon
     vllm-ascend [#3316](https://github.com/vllm-project/vllm-ascend/issues/3316)
     and [#2795](https://github.com/vllm-project/vllm-ascend/issues/2795).
-| CANN Version <br/> (Variant) | MindIE    | vLLM                                                               | SGLang                 |
-|------------------------------|-----------|--------------------------------------------------------------------|------------------------|
-| 8.5 (A3/910C)                | `2.3.0`   | `0.14.1`, `0.13.0`                                                 | `0.5.8`                |
-| 8.5 (910B)                   | `2.3.0`   | `0.14.1`, `0.13.0`                                                 | `0.5.8`                |
-| 8.5 (310P)                   | `2.3.0`   | `0.14.1`                                                           |                        |
-| 8.3 (A3/910C)                | `2.2.rc1` | `0.12.0`, `0.11.0`                                                 | `0.5.7`, `0.5.6.post2` |
-| 8.3 (910B)                   | `2.2.rc1` | `0.12.0`, `0.11.0`                                                 | `0.5.7`, `0.5.6.post2` |
-| 8.3 (310P)                   | `2.2.rc1` |                                                                    |                        |
-| 8.2 (A3/910C)                | `2.1.rc2` | `0.10.2`, ~~`0.10.1.1`~~                                           | `0.5.2`, `0.5.1.post3` |
-| 8.2 (910B)                   | `2.1.rc2` | `0.10.2`, ~~`0.10.1.1`~~, <br/>`0.10.0`, `0.9.2`, <br/>~~`0.9.1`~~ | `0.5.2`, `0.5.1.post3` |
-| 8.2 (310P)                   | `2.1.rc2` | `0.10.0`, `0.9.2`                                                  |                        |
+| CANN Version <br/> (Variant) | MindIE    | vLLM                              | SGLang                 |
+|------------------------------|-----------|-----------------------------------|------------------------|
+| 8.5 (A3/910C)                | `2.3.0`   | `0.14.1`, `0.13.0`                | `0.5.8`                |
+| 8.5 (910B)                   | `2.3.0`   | `0.14.1`, `0.13.0`                | `0.5.8`                |
+| 8.5 (310P)                   | `2.3.0`   | `0.14.1`                          |                        |
+| 8.3 (A3/910C)                | `2.2.rc1` | `0.12.0`, `0.11.0`                | `0.5.7`, `0.5.6.post2` |
+| 8.3 (910B)                   | `2.2.rc1` | `0.12.0`, `0.11.0`                | `0.5.7`, `0.5.6.post2` |
+| 8.3 (310P)                   | `2.2.rc1` |                                   |                        |
+| 8.2 (A3/910C)                | `2.1.rc2` | `0.10.2`                          | `0.5.2`, `0.5.1.post3` |
+| 8.2 (910B)                   | `2.1.rc2` | `0.10.2`, `0.10.0`,  <br/>`0.9.2` | `0.5.2`, `0.5.1.post3` |
+| 8.2 (310P)                   | `2.1.rc2` | `0.10.0`, `0.9.2`                 |                        |
 ### Iluvatar CoreX
@@ -60,11 +60,11 @@ The following table lists the supported accelerated backends and their correspon
 > - CUDA 12.6/12.4 supports Compute Capabilities:
     `7.5 8.0+PTX 8.9 9.0+PTX`.
-| CUDA Version <br/> (Variant) | vLLM                                                           | SGLang                                                                      | VoxBox   |
-|------------------------------|----------------------------------------------------------------|-----------------------------------------------------------------------------|----------|
-| 12.9                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`                | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`                                        |          |
-| 12.8                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`, <br/>`0.10.2` | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`, `0.5.5.post3`, <br/>~~`0.5.4.post3`~~ | `0.0.21` |
-| 12.6                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`, <br/>`0.10.2` |                                                                             | `0.0.21` |
+| CUDA Version <br/> (Variant) | vLLM                                                                 | SGLang                                              | VoxBox   |
+|------------------------------|----------------------------------------------------------------------|-----------------------------------------------------|----------|
+| 12.9                         | `0.15.0`, `0.14.1`, <br/>`0.13.0`, `0.12.0`, <br/>`0.11.2`           | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`                |          |
+| 12.8                         | `0.15.0`, `0.14.1`, <br/>`0.13.0`, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`, `0.5.5.post3` | `0.0.21` |
+| 12.6                         | `0.15.0`, `0.14.1`, <br/>`0.13.0`, `0.12.0`, <br/>`0.11.2`, `0.10.2` |                                                     | `0.0.21` |
 ### Hygon DTK
@@ -108,10 +108,10 @@ The following table lists the supported accelerated backends and their correspon
 > - ROCm 6.4 SGLang supports `gfx942` only.
 > - ROCm 7.0 SGLang supports `gfx950` only.
-| ROCm Version <br/> (Variant) | vLLM                                            | SGLang                                     |
-|------------------------------|-------------------------------------------------|--------------------------------------------|
-| 7.0                          | **`0.13.0`**, `0.12.0`, <br/>`0.11.2`           | `0.5.7`, `0.5.6.post2`                     |
-| 6.4                          | **`0.13.0`**, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.7`, `0.5.6.post2`, <br/>`0.5.5.post3` |
+| ROCm Version <br/> (Variant) | vLLM                                                                 | SGLang                                              |
+|------------------------------|----------------------------------------------------------------------|-----------------------------------------------------|
+| 7.0                          | `0.15.0`, `0.14.1`, <br/>`0.13.0`, `0.12.0`, <br/>`0.11.2`           | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`                |
+| 6.4                          | `0.15.0`, `0.14.1`, <br/>`0.13.0`, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`, `0.5.5.post3` |
 ## Directory Structure

{gpustack_runner-0.1.24.post3 → gpustack_runner-0.1.25}/gpustack_runner/_version.py RENAMED

@@ -27,8 +27,8 @@ version_tuple: VERSION_TUPLE
 __commit_id__: COMMIT_ID
 commit_id: COMMIT_ID
-__version__ = version = '0.1.24.post3'
-__version_tuple__ = version_tuple = (0, 1, 24, 'post3')
+__version__ = version = '0.1.25'
+__version_tuple__ = version_tuple = (0, 1, 25)
 try:
     from ._version_appendix import git_commit
     __commit_id__ = commit_id = git_commit

gpustack_runner-0.1.25/gpustack_runner/_version_appendix.py ADDED

@@ -0,0 +1 @@
+git_commit = "b005327"

{gpustack_runner-0.1.24.post3 → gpustack_runner-0.1.25}/gpustack_runner/runner.py.json RENAMED

@@ -868,6 +868,28 @@
     "docker_image": "gpustack/runner:cuda12.9-sglang0.5.6.post2",
     "deprecated": false
   },
+  {
+    "backend": "cuda",
+    "backend_version": "12.9",
+    "original_backend_version": "12.9.1",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.15.0",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:cuda12.9-vllm0.15.0",
+    "deprecated": false
+  },
+  {
+    "backend": "cuda",
+    "backend_version": "12.9",
+    "original_backend_version": "12.9.1",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.15.0",
+    "platform": "linux/arm64",
+    "docker_image": "gpustack/runner:cuda12.9-vllm0.15.0",
+    "deprecated": false
+  },
   {
     "backend": "cuda",
     "backend_version": "12.9",
@@ -1077,6 +1099,28 @@
     "docker_image": "gpustack/runner:cuda12.8-sglang0.5.4.post3",
     "deprecated": true
   },
+  {
+    "backend": "cuda",
+    "backend_version": "12.8",
+    "original_backend_version": "12.8.1",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.15.0",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:cuda12.8-vllm0.15.0",
+    "deprecated": false
+  },
+  {
+    "backend": "cuda",
+    "backend_version": "12.8",
+    "original_backend_version": "12.8.1",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.15.0",
+    "platform": "linux/arm64",
+    "docker_image": "gpustack/runner:cuda12.8-vllm0.15.0",
+    "deprecated": false
+  },
   {
     "backend": "cuda",
     "backend_version": "12.8",
@@ -1297,6 +1341,28 @@
     "docker_image": "gpustack/runner:cuda12.8-voxbox0.0.20",
     "deprecated": true
   },
+  {
+    "backend": "cuda",
+    "backend_version": "12.6",
+    "original_backend_version": "12.6.3",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.15.0",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:cuda12.6-vllm0.15.0",
+    "deprecated": false
+  },
+  {
+    "backend": "cuda",
+    "backend_version": "12.6",
+    "original_backend_version": "12.6.3",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.15.0",
+    "platform": "linux/arm64",
+    "docker_image": "gpustack/runner:cuda12.6-vllm0.15.0",
+    "deprecated": false
+  },
   {
     "backend": "cuda",
     "backend_version": "12.6",
@@ -1748,6 +1814,17 @@
     "docker_image": "gpustack/runner:musa4.1-vllm0.9.2",
     "deprecated": false
   },
+  {
+    "backend": "rocm",
+    "backend_version": "7.0",
+    "original_backend_version": "7.0.2",
+    "backend_variant": "",
+    "service": "sglang",
+    "service_version": "0.5.8",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:rocm7.0-sglang0.5.8",
+    "deprecated": false
+  },
   {
     "backend": "rocm",
     "backend_version": "7.0",
@@ -1770,6 +1847,28 @@
     "docker_image": "gpustack/runner:rocm7.0-sglang0.5.6.post2",
     "deprecated": false
   },
+  {
+    "backend": "rocm",
+    "backend_version": "7.0",
+    "original_backend_version": "7.0.2",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.15.0",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:rocm7.0-vllm0.15.0",
+    "deprecated": false
+  },
+  {
+    "backend": "rocm",
+    "backend_version": "7.0",
+    "original_backend_version": "7.0.2",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.14.1",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:rocm7.0-vllm0.14.1",
+    "deprecated": false
+  },
   {
     "backend": "rocm",
     "backend_version": "7.0",
@@ -1814,6 +1913,17 @@
     "docker_image": "gpustack/runner:rocm7.0-vllm0.11.0",
     "deprecated": true
   },
+  {
+    "backend": "rocm",
+    "backend_version": "6.4",
+    "original_backend_version": "6.4.4",
+    "backend_variant": "",
+    "service": "sglang",
+    "service_version": "0.5.8",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:rocm6.4-sglang0.5.8",
+    "deprecated": false
+  },
   {
     "backend": "rocm",
     "backend_version": "6.4",
@@ -1847,6 +1957,28 @@
     "docker_image": "gpustack/runner:rocm6.4-sglang0.5.5.post3",
     "deprecated": false
   },
+  {
+    "backend": "rocm",
+    "backend_version": "6.4",
+    "original_backend_version": "6.4.4",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.15.0",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:rocm6.4-vllm0.15.0",
+    "deprecated": false
+  },
+  {
+    "backend": "rocm",
+    "backend_version": "6.4",
+    "original_backend_version": "6.4.4",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.14.1",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:rocm6.4-vllm0.14.1",
+    "deprecated": false
+  },
   {
     "backend": "rocm",
     "backend_version": "6.4",

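Note on `runner.py.json`: every added entry uses the same flat schema (`backend`, `backend_version`, `original_backend_version`, `backend_variant`, `service`, `service_version`, `platform`, `docker_image`, `deprecated`). The sketch below is one way a consumer might filter such entries for usable images. It assumes the file is a JSON array of these objects; the function name `select_images` and the inline sample are illustrative and are not part of the package.

```python
import json

# Sample based on entries visible in this diff. The real file is assumed to be
# a JSON array of objects with the same keys. For the deprecated rocm7.0-vllm0.11.0
# entry, only "docker_image" and "deprecated" appear in the diff context;
# its other fields are assumed from the image tag.
SAMPLE = json.loads("""
[
  {"backend": "cuda", "backend_version": "12.9", "original_backend_version": "12.9.1",
   "backend_variant": "", "service": "vllm", "service_version": "0.15.0",
   "platform": "linux/amd64", "docker_image": "gpustack/runner:cuda12.9-vllm0.15.0",
   "deprecated": false},
  {"backend": "cuda", "backend_version": "12.9", "original_backend_version": "12.9.1",
   "backend_variant": "", "service": "vllm", "service_version": "0.15.0",
   "platform": "linux/arm64", "docker_image": "gpustack/runner:cuda12.9-vllm0.15.0",
   "deprecated": false},
  {"backend": "rocm", "backend_version": "7.0", "original_backend_version": "7.0.2",
   "backend_variant": "", "service": "vllm", "service_version": "0.15.0",
   "platform": "linux/amd64", "docker_image": "gpustack/runner:rocm7.0-vllm0.15.0",
   "deprecated": false},
  {"backend": "rocm", "backend_version": "7.0", "original_backend_version": "7.0.2",
   "backend_variant": "", "service": "vllm", "service_version": "0.11.0",
   "platform": "linux/amd64", "docker_image": "gpustack/runner:rocm7.0-vllm0.11.0",
   "deprecated": true}
]
""")


def select_images(entries, backend, service, platform, include_deprecated=False):
    """Return matching docker_image tags in file order, without duplicates."""
    images = []
    for entry in entries:
        if entry["backend"] != backend or entry["service"] != service:
            continue
        if entry["platform"] != platform:
            continue
        if entry["deprecated"] and not include_deprecated:
            continue
        if entry["docker_image"] not in images:
            images.append(entry["docker_image"])
    return images


if __name__ == "__main__":
    print(select_images(SAMPLE, "cuda", "vllm", "linux/arm64"))
```

The CUDA entries appear once per platform with the same `docker_image` tag, which suggests multi-arch images; the ROCm entries in this diff are `linux/amd64` only.
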
gpustack_runner-0.1.25/pack/.post_operation/20260203_cuda_several_patches/cuda/Dockerfile ADDED

@@ -0,0 +1,77 @@
+ARG CMAKE_MAX_JOBS
+ARG CUDA_VERSION=12.8
+ARG VLLM_VERSION=0.14.1
+ARG SGLANG_VERSION=0.5.8
+FROM gpustack/runner:cuda${CUDA_VERSION}-vllm${VLLM_VERSION} AS vllm
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Update CuDNN and NCCL packages
+RUN <<EOF
+    # Update CuDNN and NCCL packages
+    IFS="." read -r CUDA_MAJOR CUDA_MINOR CUDA_PATCH <<< "${VLLM_TORCH_CUDA_VERSION}"
+    # Install
+    cat <<EOT >/tmp/requirements.txt
+nvidia-cudnn-cu${CUDA_MAJOR}>=9.16.0.29
+nvidia-cudnn-frontend>=1.17.0
+nvidia-nccl-cu${CUDA_MAJOR}>=2.28.3
+EOT
+    uv pip install \
+        -r /tmp/requirements.txt
+    # Review
+    uv pip tree
+    # Cleanup
+    rm -rf /var/tmp/* \
+        && rm -rf /tmp/*
+EOF
+## Entrypoint
+WORKDIR /
+ENTRYPOINT [ "tini", "--" ]
+FROM gpustack/runner:cuda${CUDA_VERSION}-sglang${SGLANG_VERSION} AS sglang
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Update CuDNN and NCCL packages
+RUN <<EOF
+    # Update CuDNN and NCCL packages
+    IFS="." read -r CUDA_MAJOR CUDA_MINOR CUDA_PATCH <<< "${VLLM_TORCH_CUDA_VERSION}"
+    # Install
+    cat <<EOT >/tmp/requirements.txt
+nvidia-cudnn-cu${CUDA_MAJOR}>=9.16.0.29
+nvidia-cudnn-frontend>=1.17.0
+nvidia-nccl-cu${CUDA_MAJOR}>=2.28.3
+EOT
+    uv pip install \
+        -r /tmp/requirements.txt
+    # Review
+    uv pip tree
+    # Cleanup
+    rm -rf /var/tmp/* \
+        && rm -rf /tmp/*
+EOF
+## Entrypoint
+WORKDIR /
+ENTRYPOINT [ "tini", "--" ]
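
Note on the `RUN` heredoc above: both stages split `VLLM_TORCH_CUDA_VERSION` on `.` and use only the major component to pick the `cu<major>` package names. The Python sketch below mirrors that string handling so the resulting requirements are easy to inspect. The function name `render_requirements` and the example value `12.9` are assumptions for illustration; the actual value comes from the base image environment, which is not shown in this diff.

```python
def render_requirements(torch_cuda_version: str) -> str:
    """Build the requirements text the heredoc writes to /tmp/requirements.txt.

    Only the major component of the CUDA version is used, as in the Dockerfile.
    """
    major = torch_cuda_version.split(".")[0]
    if not major:
        # Deviation from the shell: with an empty or unset variable the heredoc
        # would silently emit "nvidia-cudnn-cu>=..."; this sketch fails loudly.
        raise ValueError("CUDA version is empty")
    lines = [
        f"nvidia-cudnn-cu{major}>=9.16.0.29",
        "nvidia-cudnn-frontend>=1.17.0",
        f"nvidia-nccl-cu{major}>=2.28.3",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "12.9" is an assumed example value.
    print(render_requirements("12.9"), end="")
```

The `sglang` stage reads the same `VLLM_TORCH_CUDA_VERSION` variable as the `vllm` stage. Whether the SGLang base image defines that variable cannot be determined from this diff, so it may be worth checking when reviewing the build.
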

gpustack_runner-0.1.25/pack/.post_operation/20260203_cuda_several_patches/matrix.yaml ADDED

@@ -0,0 +1,22 @@
+rules:
+  #
+  # NVIDIA CUDA
+  #
+  ## Packed NVIDIA CUDA 12.9.
+  ##
+  - backend: "cuda"
+    services:
+      - "vllm"
+    args:
+      - "CUDA_VERSION=12.9"
+      - "VLLM_VERSION=0.15.0"
+  - backend: "cuda"
+    services:
+      - "vllm"
+      - "sglang"
+    args:
+      - "CUDA_VERSION=12.9"
+      - "VLLM_VERSION=0.14.1"
+      - "SGLANG_VERSION=0.5.8"
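
Note on `matrix.yaml`: each rule pairs a backend with a list of services and a list of `KEY=VALUE` build arguments, and the service names match the stage names (`vllm`, `sglang`) in the accompanying Dockerfile. The sketch below shows one plausible reading of how such rules could expand into build jobs. The expansion logic, the `expand` function, the job dictionary layout, and the default platform list are all assumptions; the actual GPUStack Runner matrix expansion is not part of this diff.

```python
from itertools import product

# The two rules from the matrix.yaml above, written as Python literals.
RULES = [
    {
        "backend": "cuda",
        "services": ["vllm"],
        "args": ["CUDA_VERSION=12.9", "VLLM_VERSION=0.15.0"],
    },
    {
        "backend": "cuda",
        "services": ["vllm", "sglang"],
        "args": ["CUDA_VERSION=12.9", "VLLM_VERSION=0.14.1", "SGLANG_VERSION=0.5.8"],
    },
]

# Assumption: rules without a "platforms" key build for both architectures.
DEFAULT_PLATFORMS = ["linux/amd64", "linux/arm64"]


def expand(rules, default_platforms=DEFAULT_PLATFORMS):
    """Expand each rule into one job per (service, platform) pair."""
    jobs = []
    for rule in rules:
        # "KEY=VALUE" strings become a dict of Docker build arguments.
        build_args = dict(arg.split("=", 1) for arg in rule["args"])
        platforms = rule.get("platforms", default_platforms)
        for service, platform in product(rule["services"], platforms):
            jobs.append(
                {
                    "backend": rule["backend"],
                    "target": service,  # assumed to select the Dockerfile stage
                    "platform": platform,
                    "build_args": build_args,
                }
            )
    return jobs


if __name__ == "__main__":
    for job in expand(RULES):
        print(job["backend"], job["target"], job["platform"], job["build_args"])
```

Under these assumptions the first rule yields two jobs and the second yields four, one per service and platform.
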

gpustack_runner-0.1.25/pack/.post_operation/20260203_sglang_disable_cudnn_check/cuda/Dockerfile ADDED

@@ -0,0 +1,17 @@
+ARG CMAKE_MAX_JOBS
+ARG CUDA_VERSION=12.8
+ARG SGLANG_VERSION=0.5.8
+FROM gpustack/runner:cuda${CUDA_VERSION}-sglang${SGLANG_VERSION} AS sglang
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Entrypoint
+ENV SGLANG_DISABLE_CUDNN_CHECK=1
+WORKDIR /
+ENTRYPOINT [ "tini", "--" ]

gpustack_runner-0.1.25/pack/.post_operation/20260203_sglang_disable_cudnn_check/matrix.yaml ADDED

@@ -0,0 +1,56 @@
+rules:
+  #
+  # NVIDIA CUDA
+  #
+  ## Packed NVIDIA CUDA 12.9.
+  ##
+  - backend: "cuda"
+    services:
+      - "sglang"
+    args:
+      - "CUDA_VERSION=12.9"
+      - "VLLM_VERSION=0.14.1"
+      - "SGLANG_VERSION=0.5.8"
+  #
+  # AMD ROCm
+  #
+  ## Packed ROCm 7.0.
+  ##
+  - backend: "rocm"
+    services:
+      - "sglang"
+    platforms:
+      - "linux/amd64"
+    args:
+      - "ROCM_VERSION=7.0"
+      - "SGLANG_VERSION=0.5.8"
+  - backend: "rocm"
+    services:
+      - "sglang"
+    platforms:
+      - "linux/amd64"
+    args:
+      - "ROCM_VERSION=7.0"
+      - "SGLANG_VERSION=0.5.7"
+  ## Packed ROCm 6.4.
+  ##
+  - backend: "rocm"
+    services:
+      - "sglang"
+    platforms:
+      - "linux/amd64"
+    args:
+      - "ROCM_VERSION=6.4"
+      - "SGLANG_VERSION=0.5.8"
+  - backend: "rocm"
+    services:
+      - "sglang"
+    platforms:
+      - "linux/amd64"
+    args:
+      - "ROCM_VERSION=6.4"
+      - "SGLANG_VERSION=0.5.7"

gpustack_runner-0.1.25/pack/.post_operation/20260203_sglang_disable_cudnn_check/rocm/Dockerfile ADDED

@@ -0,0 +1,17 @@
+ARG CMAKE_MAX_JOBS
+ARG ROCM_VERSION=7.0
+ARG SGLANG_VERSION=0.5.8
+FROM gpustack/runner:rocm${ROCM_VERSION}-sglang${SGLANG_VERSION} AS sglang
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Entrypoint
+ENV SGLANG_DISABLE_CUDNN_CHECK=1
+WORKDIR /
+ENTRYPOINT [ "tini", "--" ]

{gpustack_runner-0.1.24.post3 → gpustack_runner-0.1.25}/pack/.post_operation/README.md RENAMED

@@ -35,3 +35,5 @@ We leverage the matrix expansion feature of GPUStack Runner to achieve this, and
 - [ ] 2026-01-05: Install `vllm-omni` packages for vLLM 0.12.0 of CUDA/ROCm/CANN released images.
 - [x] 2026-01-29: Apply DP deployment patches to vLLM 0.13.0 for CUDA/ROCm released images.
 - [x] 2026-01-29: Reinstall SGLang Kernel for SGLang 0.5.7 of CANN released images.
+- [x] 2026-02-03: Apply several patches to vLLM 0.14.1/0.15.0 and SGLang 0.5.8 for CUDA 12.9 released images.
+- [x] 2026-02-03: Patch SGLang 0.5.8/0.5.7 of CUDA/ROCm released images to disable CuDNN version check.

{gpustack_runner-0.1.24.post3 → gpustack_runner-0.1.25}/pack/cann/Dockerfile RENAMED

@@ -59,7 +59,7 @@ ARG VLLM_VERSION=0.14.1
 ARG VLLM_ASCEND_VERSION=0.14.0rc1
 ARG VLLM_TORCH_VERSION=2.9.0
 ARG VLLM_MOONCAKE_VERSION=0.3.7.post2
-ARG VLLM_OMNI_COMMIT=e8aa32b
+ARG VLLM_OMNI_COMMIT=de2cac9
 ARG SGLANG_BASE_IMAGE=gpustack/runner:cann${CANN_VERSION}-${CANN_ARCHS}-python${PYTHON_VERSION}
 ARG SGLANG_VERSION=0.5.8
 ARG SGLANG_TORCH_VERSION=2.8.0
@@ -865,6 +865,15 @@ RUN --mount=type=bind,from=vllm-build-omni,source=/,target=/omni,rw <<EOF
     uv pip install --no-build-isolation \
         /omni/workspace/*.whl
+    # Dependencies
+    uv pip uninstall onnxruntime || true
+    cat <<EOT >/tmp/requirements.txt
+onnxruntime-cann
+sox
+EOT
+    uv pip install \
+        -r /tmp/requirements.txt
     # Cleanup
     rm -rf /var/tmp/* \
         && rm -rf /tmp/*
@@ -956,7 +965,11 @@ RUN --mount=type=bind,target=/workspace,rw <<EOF
     tree -hs /workspace/patches
     pushd $(pip show vllm | grep Location: | cut -d" " -f 2) \
-        && patch -p1 < /workspace/patches/vllm_*.patch
+        && patch -p1 < /workspace/patches/vllm/*.patch
+    if pip show vllm_omni > /dev/null 2>&1; then \
+        pushd $(pip show vllm_omni | grep Location: | cut -d" " -f 2) \
+            && patch -p1 < /workspace/patches/vllm_omni/*.patch; \
+    fi
 EOF
 ## Entrypoint

gpustack_runner-0.1.25/pack/cann/patches/vllm_omni/001_wrong_patch.patch ADDED

@@ -0,0 +1,13 @@
+diff --git a/vllm_omni/patch.py b/vllm_omni/patch.py
+index 687ff51..6b67924 100644
+--- a/vllm_omni/patch.py
++++ b/vllm_omni/patch.py
+@@ -19,6 +19,8 @@ for module_name, module in sys.modules.items():
+     # only do patch on module of vllm, pass others
+     if "vllm" not in module_name:
+         continue
++    if "--omni" not in sys.argv:
++        continue
+     if hasattr(module, "EngineCoreOutput") and module.EngineCoreOutput == _OriginalEngineCoreOutput:
+         module.EngineCoreOutput = OmniEngineCoreOutput
+     if hasattr(module, "EngineCoreOutputs") and module.EngineCoreOutputs == _OriginalEngineCoreOutputs:
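
Note on `001_wrong_patch.patch`: the two added lines make the monkey-patching loop in `vllm_omni/patch.py` skip every module unless `--omni` is present on the command line. The sketch below isolates that guard as a pure function so its behaviour can be checked without importing vLLM. The helper name `should_patch` is hypothetical and does not exist in `vllm_omni`.

```python
def should_patch(module_name: str, argv: list[str]) -> bool:
    """Mirror the two `continue` guards in the patched loop."""
    # Existing guard: only modules whose name contains "vllm" are considered.
    if "vllm" not in module_name:
        return False
    # Guard added by the patch: require the literal "--omni" argument.
    if "--omni" not in argv:
        return False
    return True


if __name__ == "__main__":
    print(should_patch("vllm.v1.engine", ["vllm", "serve", "--omni"]))
    print(should_patch("vllm.v1.engine", ["vllm", "serve"]))
```

Because the check is list membership on `sys.argv`, it matches only an argument that is exactly `--omni`; a spelling such as `--omni=true` would not satisfy it.
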
