PyPI - gpustack-runner - Versions diffs - 0.1.24.post2__tar.gz → 0.1.24.post3__tar.gz - Mend

Files changed (142)

{gpustack_runner-0.1.24.post2 → gpustack_runner-0.1.24.post3}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: gpustack-runner
-Version: 0.1.24.post2
+Version: 0.1.24.post3
 Summary: GPUStack Runner is library for registering runnable accelerated backends and services in GPUStack.
 Project-URL: Homepage, https://github.com/gpustack/runner
 Project-URL: Bug Tracker, https://github.com/gpustack/gpustack/issues
@@ -52,17 +52,17 @@ The following table lists the supported accelerated backends and their correspon
     vllm-ascend [#3316](https://github.com/vllm-project/vllm-ascend/issues/3316)
     and [#2795](https://github.com/vllm-project/vllm-ascend/issues/2795).
-| CANN Version <br/> (Variant) | MindIE    | vLLM                                                       | SGLang                 |
-|------------------------------|-----------|------------------------------------------------------------|------------------------|
-| 8.5 (A3/910C)                | `2.3.0`   | `0.14.1`, `0.13.0`                                         | `0.5.8`                |
-| 8.5 (910B)                   | `2.3.0`   | `0.14.1`, `0.13.0`                                         | `0.5.8`                |
-| 8.5 (310P)                   | `2.3.0`   | `0.14.1`                                                   |                        |
-| 8.3 (A3/910C)                | `2.2.rc1` | `0.12.0`, `0.11.0`                                         | `0.5.7`, `0.5.6.post2` |
-| 8.3 (910B)                   | `2.2.rc1` | `0.12.0`, `0.11.0`                                         | `0.5.7`, `0.5.6.post2` |
-| 8.3 (310P)                   | `2.2.rc1` |                                                            |                        |
-| 8.2 (A3/910C)                | `2.1.rc2` | `0.10.2`, `0.10.1.1`                                       | `0.5.2`, `0.5.1.post3` |
-| 8.2 (910B)                   | `2.1.rc2` | `0.10.2`, `0.10.1.1`, <br/>`0.10.0`, `0.9.2`, <br/>`0.9.1` | `0.5.2`, `0.5.1.post3` |
-| 8.2 (310P)                   | `2.1.rc2` | `0.10.0`, `0.9.2`                                          |                        |
+| CANN Version <br/> (Variant) | MindIE    | vLLM                                                               | SGLang                 |
+|------------------------------|-----------|--------------------------------------------------------------------|------------------------|
+| 8.5 (A3/910C)                | `2.3.0`   | `0.14.1`, `0.13.0`                                                 | `0.5.8`                |
+| 8.5 (910B)                   | `2.3.0`   | `0.14.1`, `0.13.0`                                                 | `0.5.8`                |
+| 8.5 (310P)                   | `2.3.0`   | `0.14.1`                                                           |                        |
+| 8.3 (A3/910C)                | `2.2.rc1` | `0.12.0`, `0.11.0`                                                 | `0.5.7`, `0.5.6.post2` |
+| 8.3 (910B)                   | `2.2.rc1` | `0.12.0`, `0.11.0`                                                 | `0.5.7`, `0.5.6.post2` |
+| 8.3 (310P)                   | `2.2.rc1` |                                                                    |                        |
+| 8.2 (A3/910C)                | `2.1.rc2` | `0.10.2`, ~~`0.10.1.1`~~                                           | `0.5.2`, `0.5.1.post3` |
+| 8.2 (910B)                   | `2.1.rc2` | `0.10.2`, ~~`0.10.1.1`~~, <br/>`0.10.0`, `0.9.2`, <br/>~~`0.9.1`~~ | `0.5.2`, `0.5.1.post3` |
+| 8.2 (310P)                   | `2.1.rc2` | `0.10.0`, `0.9.2`                                                  |                        |
 ### Iluvatar CoreX
@@ -80,11 +80,11 @@ The following table lists the supported accelerated backends and their correspon
 > - CUDA 12.6/12.4 supports Compute Capabilities:
     `7.5 8.0+PTX 8.9 9.0+PTX`.
-| CUDA Version <br/> (Variant) | vLLM                                        | SGLang                                                    | VoxBox   |
-|------------------------------|---------------------------------------------|-----------------------------------------------------------|----------|
-| 12.9                         | `0.13.0`, `0.12.0`, <br/>`0.11.2`           | `0.5.7`, `0.5.6.post2`                                    |          |
-| 12.8                         | `0.13.0`, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.7`, `0.5.6.post2`, <br/>`0.5.5.post3`, `0.5.4.post3` | `0.0.21` |
-| 12.6                         | `0.13.0`, `0.12.0`, <br/>`0.11.2`,`0.10.2`  |                                                           | `0.0.21` |
+| CUDA Version <br/> (Variant) | vLLM                                                           | SGLang                                                                      | VoxBox   |
+|------------------------------|----------------------------------------------------------------|-----------------------------------------------------------------------------|----------|
+| 12.9                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`                | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`                                        |          |
+| 12.8                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`, <br/>`0.10.2` | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`, `0.5.5.post3`, <br/>~~`0.5.4.post3`~~ | `0.0.21` |
+| 12.6                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`, <br/>`0.10.2` |                                                                             | `0.0.21` |
 ### Hygon DTK
@@ -128,10 +128,10 @@ The following table lists the supported accelerated backends and their correspon
 > - ROCm 6.4 SGLang supports `gfx942` only.
 > - ROCm 7.0 SGLang supports `gfx950` only.
-| ROCm Version <br/> (Variant) | vLLM                                        | SGLang                                     |
-|------------------------------|---------------------------------------------|--------------------------------------------|
-| 7.0                          | `0.13.0`, `0.12.0`, <br/>`0.11.2`           | `0.5.7`, `0.5.6.post2`                     |
-| 6.4                          | `0.13.0`, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.7`, `0.5.6.post2`, <br/>`0.5.5.post3` |
+| ROCm Version <br/> (Variant) | vLLM                                            | SGLang                                     |
+|------------------------------|-------------------------------------------------|--------------------------------------------|
+| 7.0                          | **`0.13.0`**, `0.12.0`, <br/>`0.11.2`           | `0.5.7`, `0.5.6.post2`                     |
+| 6.4                          | **`0.13.0`**, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.7`, `0.5.6.post2`, <br/>`0.5.5.post3` |
 ## Directory Structure

{gpustack_runner-0.1.24.post2 → gpustack_runner-0.1.24.post3}/README.md RENAMED

@@ -32,17 +32,17 @@ The following table lists the supported accelerated backends and their correspon
     vllm-ascend [#3316](https://github.com/vllm-project/vllm-ascend/issues/3316)
     and [#2795](https://github.com/vllm-project/vllm-ascend/issues/2795).
-| CANN Version <br/> (Variant) | MindIE    | vLLM                                                       | SGLang                 |
-|------------------------------|-----------|------------------------------------------------------------|------------------------|
-| 8.5 (A3/910C)                | `2.3.0`   | `0.14.1`, `0.13.0`                                         | `0.5.8`                |
-| 8.5 (910B)                   | `2.3.0`   | `0.14.1`, `0.13.0`                                         | `0.5.8`                |
-| 8.5 (310P)                   | `2.3.0`   | `0.14.1`                                                   |                        |
-| 8.3 (A3/910C)                | `2.2.rc1` | `0.12.0`, `0.11.0`                                         | `0.5.7`, `0.5.6.post2` |
-| 8.3 (910B)                   | `2.2.rc1` | `0.12.0`, `0.11.0`                                         | `0.5.7`, `0.5.6.post2` |
-| 8.3 (310P)                   | `2.2.rc1` |                                                            |                        |
-| 8.2 (A3/910C)                | `2.1.rc2` | `0.10.2`, `0.10.1.1`                                       | `0.5.2`, `0.5.1.post3` |
-| 8.2 (910B)                   | `2.1.rc2` | `0.10.2`, `0.10.1.1`, <br/>`0.10.0`, `0.9.2`, <br/>`0.9.1` | `0.5.2`, `0.5.1.post3` |
-| 8.2 (310P)                   | `2.1.rc2` | `0.10.0`, `0.9.2`                                          |                        |
+| CANN Version <br/> (Variant) | MindIE    | vLLM                                                               | SGLang                 |
+|------------------------------|-----------|--------------------------------------------------------------------|------------------------|
+| 8.5 (A3/910C)                | `2.3.0`   | `0.14.1`, `0.13.0`                                                 | `0.5.8`                |
+| 8.5 (910B)                   | `2.3.0`   | `0.14.1`, `0.13.0`                                                 | `0.5.8`                |
+| 8.5 (310P)                   | `2.3.0`   | `0.14.1`                                                           |                        |
+| 8.3 (A3/910C)                | `2.2.rc1` | `0.12.0`, `0.11.0`                                                 | `0.5.7`, `0.5.6.post2` |
+| 8.3 (910B)                   | `2.2.rc1` | `0.12.0`, `0.11.0`                                                 | `0.5.7`, `0.5.6.post2` |
+| 8.3 (310P)                   | `2.2.rc1` |                                                                    |                        |
+| 8.2 (A3/910C)                | `2.1.rc2` | `0.10.2`, ~~`0.10.1.1`~~                                           | `0.5.2`, `0.5.1.post3` |
+| 8.2 (910B)                   | `2.1.rc2` | `0.10.2`, ~~`0.10.1.1`~~, <br/>`0.10.0`, `0.9.2`, <br/>~~`0.9.1`~~ | `0.5.2`, `0.5.1.post3` |
+| 8.2 (310P)                   | `2.1.rc2` | `0.10.0`, `0.9.2`                                                  |                        |
 ### Iluvatar CoreX
@@ -60,11 +60,11 @@ The following table lists the supported accelerated backends and their correspon
 > - CUDA 12.6/12.4 supports Compute Capabilities:
     `7.5 8.0+PTX 8.9 9.0+PTX`.
-| CUDA Version <br/> (Variant) | vLLM                                        | SGLang                                                    | VoxBox   |
-|------------------------------|---------------------------------------------|-----------------------------------------------------------|----------|
-| 12.9                         | `0.13.0`, `0.12.0`, <br/>`0.11.2`           | `0.5.7`, `0.5.6.post2`                                    |          |
-| 12.8                         | `0.13.0`, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.7`, `0.5.6.post2`, <br/>`0.5.5.post3`, `0.5.4.post3` | `0.0.21` |
-| 12.6                         | `0.13.0`, `0.12.0`, <br/>`0.11.2`,`0.10.2`  |                                                           | `0.0.21` |
+| CUDA Version <br/> (Variant) | vLLM                                                           | SGLang                                                                      | VoxBox   |
+|------------------------------|----------------------------------------------------------------|-----------------------------------------------------------------------------|----------|
+| 12.9                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`                | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`                                        |          |
+| 12.8                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`, <br/>`0.10.2` | `0.5.8`, `0.5.7`, <br/>`0.5.6.post2`, `0.5.5.post3`, <br/>~~`0.5.4.post3`~~ | `0.0.21` |
+| 12.6                         | `0.14.1`, **`0.13.0`**, <br/>`0.12.0`, `0.11.2`, <br/>`0.10.2` |                                                                             | `0.0.21` |
 ### Hygon DTK
@@ -108,10 +108,10 @@ The following table lists the supported accelerated backends and their correspon
 > - ROCm 6.4 SGLang supports `gfx942` only.
 > - ROCm 7.0 SGLang supports `gfx950` only.
-| ROCm Version <br/> (Variant) | vLLM                                        | SGLang                                     |
-|------------------------------|---------------------------------------------|--------------------------------------------|
-| 7.0                          | `0.13.0`, `0.12.0`, <br/>`0.11.2`           | `0.5.7`, `0.5.6.post2`                     |
-| 6.4                          | `0.13.0`, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.7`, `0.5.6.post2`, <br/>`0.5.5.post3` |
+| ROCm Version <br/> (Variant) | vLLM                                            | SGLang                                     |
+|------------------------------|-------------------------------------------------|--------------------------------------------|
+| 7.0                          | **`0.13.0`**, `0.12.0`, <br/>`0.11.2`           | `0.5.7`, `0.5.6.post2`                     |
+| 6.4                          | **`0.13.0`**, `0.12.0`, <br/>`0.11.2`, `0.10.2` | `0.5.7`, `0.5.6.post2`, <br/>`0.5.5.post3` |
 ## Directory Structure

{gpustack_runner-0.1.24.post2 → gpustack_runner-0.1.24.post3}/gpustack_runner/_version.py RENAMED

@@ -27,8 +27,8 @@ version_tuple: VERSION_TUPLE
 __commit_id__: COMMIT_ID
 commit_id: COMMIT_ID
-__version__ = version = '0.1.24.post2'
-__version_tuple__ = version_tuple = (0, 1, 24, 'post2')
+__version__ = version = '0.1.24.post3'
+__version_tuple__ = version_tuple = (0, 1, 24, 'post3')
 try:
     from ._version_appendix import git_commit
     __commit_id__ = commit_id = git_commit

gpustack_runner-0.1.24.post3/gpustack_runner/_version_appendix.py ADDED

@@ -0,0 +1 @@
+git_commit = "dc41ed2"

{gpustack_runner-0.1.24.post2 → gpustack_runner-0.1.24.post3}/gpustack_runner/runner.py.json RENAMED

@@ -261,7 +261,7 @@
     "service_version": "0.10.1.1",
     "platform": "linux/amd64",
     "docker_image": "gpustack/runner:cann8.2-a3-vllm0.10.1.1",
-    "deprecated": false
+    "deprecated": true
   },
   {
     "backend": "cann",
@@ -272,7 +272,7 @@
     "service_version": "0.10.1.1",
     "platform": "linux/arm64",
     "docker_image": "gpustack/runner:cann8.2-a3-vllm0.10.1.1",
-    "deprecated": false
+    "deprecated": true
   },
   {
     "backend": "cann",
@@ -558,7 +558,7 @@
     "service_version": "0.10.1.1",
     "platform": "linux/amd64",
     "docker_image": "gpustack/runner:cann8.2-910b-vllm0.10.1.1",
-    "deprecated": false
+    "deprecated": true
   },
   {
     "backend": "cann",
@@ -569,7 +569,7 @@
     "service_version": "0.10.1.1",
     "platform": "linux/arm64",
     "docker_image": "gpustack/runner:cann8.2-910b-vllm0.10.1.1",
-    "deprecated": false
+    "deprecated": true
   },
   {
     "backend": "cann",
@@ -624,7 +624,7 @@
     "service_version": "0.9.1",
     "platform": "linux/amd64",
     "docker_image": "gpustack/runner:cann8.2-910b-vllm0.9.1",
-    "deprecated": false
+    "deprecated": true
   },
   {
     "backend": "cann",
@@ -635,7 +635,7 @@
     "service_version": "0.9.1",
     "platform": "linux/arm64",
     "docker_image": "gpustack/runner:cann8.2-910b-vllm0.9.1",
-    "deprecated": false
+    "deprecated": true
   },
   {
     "backend": "cann",
@@ -802,6 +802,28 @@
     "docker_image": "gpustack/runner:corex4.2-vllm0.8.3",
     "deprecated": false
   },
+  {
+    "backend": "cuda",
+    "backend_version": "12.9",
+    "original_backend_version": "12.9.1",
+    "backend_variant": "",
+    "service": "sglang",
+    "service_version": "0.5.8",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:cuda12.9-sglang0.5.8",
+    "deprecated": false
+  },
+  {
+    "backend": "cuda",
+    "backend_version": "12.9",
+    "original_backend_version": "12.9.1",
+    "backend_variant": "",
+    "service": "sglang",
+    "service_version": "0.5.8",
+    "platform": "linux/arm64",
+    "docker_image": "gpustack/runner:cuda12.9-sglang0.5.8",
+    "deprecated": false
+  },
   {
     "backend": "cuda",
     "backend_version": "12.9",
@@ -846,6 +868,28 @@
     "docker_image": "gpustack/runner:cuda12.9-sglang0.5.6.post2",
     "deprecated": false
   },
+  {
+    "backend": "cuda",
+    "backend_version": "12.9",
+    "original_backend_version": "12.9.1",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.14.1",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:cuda12.9-vllm0.14.1",
+    "deprecated": false
+  },
+  {
+    "backend": "cuda",
+    "backend_version": "12.9",
+    "original_backend_version": "12.9.1",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.14.1",
+    "platform": "linux/arm64",
+    "docker_image": "gpustack/runner:cuda12.9-vllm0.14.1",
+    "deprecated": false
+  },
   {
     "backend": "cuda",
     "backend_version": "12.9",
@@ -912,6 +956,28 @@
     "docker_image": "gpustack/runner:cuda12.9-vllm0.11.2",
     "deprecated": false
   },
+  {
+    "backend": "cuda",
+    "backend_version": "12.8",
+    "original_backend_version": "12.8.1",
+    "backend_variant": "",
+    "service": "sglang",
+    "service_version": "0.5.8",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:cuda12.8-sglang0.5.8",
+    "deprecated": false
+  },
+  {
+    "backend": "cuda",
+    "backend_version": "12.8",
+    "original_backend_version": "12.8.1",
+    "backend_variant": "",
+    "service": "sglang",
+    "service_version": "0.5.8",
+    "platform": "linux/arm64",
+    "docker_image": "gpustack/runner:cuda12.8-sglang0.5.8",
+    "deprecated": false
+  },
   {
     "backend": "cuda",
     "backend_version": "12.8",
@@ -998,7 +1064,7 @@
     "service_version": "0.5.4.post3",
     "platform": "linux/amd64",
     "docker_image": "gpustack/runner:cuda12.8-sglang0.5.4.post3",
-    "deprecated": false
+    "deprecated": true
   },
   {
     "backend": "cuda",
@@ -1009,6 +1075,28 @@
     "service_version": "0.5.4.post3",
     "platform": "linux/arm64",
     "docker_image": "gpustack/runner:cuda12.8-sglang0.5.4.post3",
+    "deprecated": true
+  },
+  {
+    "backend": "cuda",
+    "backend_version": "12.8",
+    "original_backend_version": "12.8.1",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.14.1",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:cuda12.8-vllm0.14.1",
+    "deprecated": false
+  },
+  {
+    "backend": "cuda",
+    "backend_version": "12.8",
+    "original_backend_version": "12.8.1",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.14.1",
+    "platform": "linux/arm64",
+    "docker_image": "gpustack/runner:cuda12.8-vllm0.14.1",
     "deprecated": false
   },
   {
@@ -1209,6 +1297,28 @@
     "docker_image": "gpustack/runner:cuda12.8-voxbox0.0.20",
     "deprecated": true
   },
+  {
+    "backend": "cuda",
+    "backend_version": "12.6",
+    "original_backend_version": "12.6.3",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.14.1",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:cuda12.6-vllm0.14.1",
+    "deprecated": false
+  },
+  {
+    "backend": "cuda",
+    "backend_version": "12.6",
+    "original_backend_version": "12.6.3",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.14.1",
+    "platform": "linux/arm64",
+    "docker_image": "gpustack/runner:cuda12.6-vllm0.14.1",
+    "deprecated": false
+  },
   {
     "backend": "cuda",
     "backend_version": "12.6",

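The `deprecated` flags flipped in the hunks above can be consumed with a simple filter. The following is a minimal sketch, assuming the file is a JSON array of objects with the keys visible in this diff. The function name `active_images` and the two-entry sample are hypothetical and are not part of the gpustack-runner API.

```python
import json

# Two sample entries modelled on the hunks above (a subset, not the full file).
# The "backend" fields of the first entry are inferred from its image tag.
SAMPLE = json.loads("""
[
  {"backend": "cuda", "backend_version": "12.8", "service": "sglang",
   "service_version": "0.5.4.post3", "platform": "linux/amd64",
   "docker_image": "gpustack/runner:cuda12.8-sglang0.5.4.post3",
   "deprecated": true},
  {"backend": "cuda", "backend_version": "12.8", "service": "vllm",
   "service_version": "0.14.1", "platform": "linux/amd64",
   "docker_image": "gpustack/runner:cuda12.8-vllm0.14.1",
   "deprecated": false}
]
""")


def active_images(entries, backend, service):
    """Return the sorted, de-duplicated images for a backend/service pair,
    skipping every entry marked as deprecated."""
    return sorted({
        e["docker_image"]
        for e in entries
        if e["backend"] == backend
        and e["service"] == service
        and not e["deprecated"]
    })


print(active_images(SAMPLE, "cuda", "vllm"))
```

With this sample, the deprecated SGLang `0.5.4.post3` image is filtered out, while the newly added vLLM `0.14.1` image is returned.
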
gpustack_runner-0.1.24.post3/pack/.post_operation/20260129_sglang_reinstall_kernel/cann/Dockerfile ADDED

@@ -0,0 +1,74 @@
+ARG CMAKE_MAX_JOBS
+ARG CANN_VERSION=8.3
+ARG CANN_ARCHS=910b
+ARG SGLANG_VERSION=0.12.0
+ARG SGLANG_KERNEL_VERSION=20251206
+FROM gpustack/runner:cann${CANN_VERSION}-${CANN_ARCHS}-sglang${SGLANG_VERSION} AS sglang
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Reinstall SGLang Kernel
+ARG CMAKE_MAX_JOBS
+ARG SGLANG_VERSION
+ARG SGLANG_KERNEL_VERSION
+ENV SGLANG_VERSION=${SGLANG_VERSION} \
+    SGLANG_KERNEL_VERSION=${SGLANG_KERNEL_VERSION}
+RUN <<EOF
+    # SGLang
+    CMAKE_MAX_JOBS="${CMAKE_MAX_JOBS}"
+    if [[ -z "${CMAKE_MAX_JOBS}" ]]; then
+        CMAKE_MAX_JOBS="$(( $(nproc) / 2 ))"
+    fi
+    if (( $(echo "${CMAKE_MAX_JOBS} > 8" | bc -l) )); then
+        CMAKE_MAX_JOBS="8"
+    fi
+    export MAX_JOBS="${CMAKE_MAX_JOBS}"
+    export COMPILE_CUSTOM_KERNELS=1
+    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${CANN_HOME}/ascend-toolkit/latest/$(uname -i)-linux/devlib"
+    export LD_LIBRARY_PATH="${CANN_HOME}/ascend-toolkit/latest/runtime/lib64/stub:${LD_LIBRARY_PATH}"
+    source ${CANN_HOME}/ascend-toolkit/set_env.sh
+    echo "Building SGLang with the following environment variables:"
+    env
+    # Install Dependencies
+    cat <<EOT >/tmp/requirements.txt
+attrs==25.4.0
+decorator==5.2.1
+psutil==7.1.3
+pyyaml==6.0.3
+triton-ascend==3.2.0
+EOT
+    uv pip install \
+        -r /tmp/requirements.txt
+    # Build and Install SGLang Kernel
+    git -C /tmp clone --recursive --shallow-submodules \
+        --depth 1 --branch ${SGLANG_KERNEL_VERSION} --single-branch \
+        https://github.com/sgl-project/sgl-kernel-npu.git sgl-kernel-npu
+    unset ASCEND_HOME_PATH
+    pushd /tmp/sgl-kernel-npu \
+        && ./build.sh \
+        && tree -hs /tmp/sgl-kernel-npu/output \
+        && uv pip install /tmp/sgl-kernel-npu/output/deep_ep*.whl /tmp/sgl-kernel-npu/output/sgl_kernel_npu*.whl
+    # Postprocess SGLang Kernel (DeepEP)
+    cd "$(pip show deep-ep | awk '/^Location:/ {print $2}')" && ln -sf deep_ep/deep_ep_cpp*.so
+    # Cleanup
+    rm -rf /var/tmp/* \
+        && rm -rf /tmp/* \
+        && ccache --clear --clean
+EOF
+## Entrypoint
+WORKDIR /
+ENTRYPOINT [ "tini", "--" ]
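
The job-count logic at the top of the `RUN` block above (default to half the CPU count, then cap at 8) can be restated as a small function. This is a sketch of that arithmetic only, assuming integer inputs; the name `resolve_max_jobs` and its parameters are hypothetical and do not appear in the Dockerfile.

```python
import os
from typing import Optional


def resolve_max_jobs(cmake_max_jobs: Optional[str],
                     cpu_count: Optional[int] = None,
                     cap: int = 8) -> int:
    """Mirror the shell logic: use CMAKE_MAX_JOBS if set, otherwise half the
    CPU count (integer division), and never exceed the cap."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if cmake_max_jobs:
        jobs = int(cmake_max_jobs)
    else:
        # Same as $(( $(nproc) / 2 )); note this yields 0 on a 1-CPU host.
        jobs = cpu_count // 2
    return min(jobs, cap)


print(resolve_max_jobs("", cpu_count=32))   # capped at 8
print(resolve_max_jobs("4", cpu_count=32))  # explicit value is kept
```

The sketch deliberately keeps the shell script's behavior on a single-CPU host, where the integer division produces 0 rather than 1.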

gpustack_runner-0.1.24.post3/pack/.post_operation/20260129_sglang_reinstall_kernel/matrix.yaml ADDED

@@ -0,0 +1,28 @@
+rules:
+  #
+  # Ascend CANN
+  #
+  ## Packed Ascend CANN 8.3, using CANN Kernel for A3.
+  ##
+  - backend: "cann"
+    services:
+      - "sglang"
+    platforms:
+      - "linux/arm64"
+    args:
+      - "CANN_VERSION=8.3"
+      - "CANN_ARCHS=a3"
+      - "SGLANG_VERSION=0.5.7"
+  ## Packed Ascend CANN 8.3, using CANN Kernel for 910B.
+  ##
+  - backend: "cann"
+    services:
+      - "sglang"
+    platforms:
+      - "linux/arm64"
+    args:
+      - "CANN_VERSION=8.3"
+      - "CANN_ARCHS=910b"
+      - "SGLANG_VERSION=0.5.7"

gpustack_runner-0.1.24.post3/pack/.post_operation/20260129_vllm_patch_dp/cuda/Dockerfile ADDED

@@ -0,0 +1,25 @@
+ARG CMAKE_MAX_JOBS
+ARG CUDA_VERSION=12.8
+ARG VLLM_VERSION=0.13.0
+FROM gpustack/runner:cuda${CUDA_VERSION}-vllm${VLLM_VERSION} AS vllm
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Patch
+RUN --mount=type=bind,target=/workspace,rw <<EOF
+    # Patch
+    tree -hs /workspace/patches
+    pushd $(pip show vllm | grep Location: | cut -d" " -f 2) \
+        && patch -p1 < /workspace/patches/vllm_*.patch
+EOF
+## Entrypoint
+WORKDIR /
+ENTRYPOINT [ "tini", "--" ]

gpustack_runner-0.1.24.post3/pack/.post_operation/20260129_vllm_patch_dp/cuda/patches/vllm_001_wrong_dp_ray.patch ADDED

@@ -0,0 +1,41 @@
+diff --git a/vllm/utils/network_utils.py b/vllm/utils/network_utils.py
+index 7d01533cb..311ed44df 100644
+--- a/vllm/utils/network_utils.py
++++ b/vllm/utils/network_utils.py
+@@ -147,6 +147,9 @@ def get_open_zmq_inproc_path() -> str:
+     return f"inproc://{uuid4()}"
++_next_port: int | None = None
++
++
+ def get_open_port() -> int:
+     """
+     Get an open port for the vLLM process to listen on.
+@@ -163,7 +166,7 @@ def get_open_port() -> int:
+             candidate_port = _get_open_port()
+             if candidate_port not in reserved_port_range:
+                 return candidate_port
+-    return _get_open_port()
++    return _get_open_port(_next_port)
+ def get_open_ports_list(count: int = 5) -> list[int]:
+@@ -174,13 +177,15 @@ def get_open_ports_list(count: int = 5) -> list[int]:
+     return list(ports)
+-def _get_open_port() -> int:
+-    port = envs.VLLM_PORT
++def _get_open_port(start: int | None = None) -> int:
++    port = start or envs.VLLM_PORT
+     if port is not None:
++        global _next_port
+         while True:
+             try:
+                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
+                     s.bind(("", port))
++                    _next_port = port + 1
+                     return port
+             except OSError:
+                 port += 1  # Increment port number if already in use
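
The effect of the `_next_port` cursor introduced by this patch can be modelled in isolation. The sketch below is a simplified stand-in, not vLLM code: the class `PortCursor`, its `base_port` attribute (playing the role of `envs.VLLM_PORT`), and the injectable `probe` callable are all hypothetical names added for illustration. It also assumes the motivation is that the probe socket is released before the port is returned, so a search that always restarts at the base port could hand out the same port twice.

```python
import socket
from typing import Callable, Optional


def can_bind(port: int) -> bool:
    """Probe a port by binding to it and releasing it again immediately."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind(("", port))
        return True
    except OSError:
        return False


class PortCursor:
    """Hypothetical stand-in for the module-level `_next_port` in the patch."""

    def __init__(self, base_port: int,
                 probe: Callable[[int], bool] = can_bind):
        self.base_port = base_port              # role of envs.VLLM_PORT
        self.next_port: Optional[int] = None    # role of _next_port
        self.probe = probe

    def get_open_port(self) -> int:
        # Resume after the last port handed out, else start at the base port.
        port = self.next_port or self.base_port
        while not self.probe(port):
            port += 1  # skip ports that are already in use
            if port > 65535:
                raise RuntimeError("no free port found")
        self.next_port = port + 1
        return port


# Deterministic demo with a fake probe: port 5001 is treated as busy.
busy = {5001}
cursor = PortCursor(5000, probe=lambda p: p not in busy)
print([cursor.get_open_port() for _ in range(3)])  # [5000, 5002, 5003]
```

Because the cursor advances after every successful probe, consecutive calls never return the same port, even though nothing holds the earlier ports open.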

gpustack_runner-0.1.24.post3/pack/.post_operation/20260129_vllm_patch_dp/matrix.yaml ADDED

@@ -0,0 +1,55 @@
+rules:
+  #
+  # NVIDIA CUDA
+  #
+  ## Packed NVIDIA CUDA 12.9.
+  ##
+  - backend: "cuda"
+    services:
+      - "vllm"
+    args:
+      - "CUDA_VERSION=12.9"
+      - "VLLM_VERSION=0.13.0"
+  ## Packed NVIDIA CUDA 12.8.
+  ##
+  - backend: "cuda"
+    services:
+      - "vllm"
+    args:
+      - "CUDA_VERSION=12.8"
+      - "VLLM_VERSION=0.13.0"
+  ## Packed NVIDIA CUDA 12.6.
+  ##
+  - backend: "cuda"
+    services:
+      - "vllm"
+    args:
+      - "CUDA_VERSION=12.6"
+      - "VLLM_VERSION=0.13.0"
+  #
+  # AMD ROCm
+  #
+  ## Packed AMD ROCm 7.0.
+  ##
+  - backend: "rocm"
+    services:
+      - "vllm"
+    platforms:
+      - "linux/amd64"
+    args:
+      - "ROCM_VERSION=7.0"
+      - "VLLM_VERSION=0.13.0"
+  ## Packed AMD ROCm 6.4.
+  ##
+  - backend: "rocm"
+    services:
+      - "vllm"
+    platforms:
+      - "linux/amd64"
+    args:
+      - "ROCM_VERSION=6.4"
+      - "VLLM_VERSION=0.13.0"

gpustack_runner-0.1.24.post3/pack/.post_operation/20260129_vllm_patch_dp/rocm/Dockerfile ADDED

@@ -0,0 +1,25 @@
+ARG CMAKE_MAX_JOBS
+ARG ROCM_VERSION=6.4
+ARG VLLM_VERSION=0.13.0
+FROM gpustack/runner:rocm${ROCM_VERSION}-vllm${VLLM_VERSION} AS vllm
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Patch
+RUN --mount=type=bind,target=/workspace,rw <<EOF
+    # Patch
+    tree -hs /workspace/patches
+    pushd $(pip show vllm | grep Location: | cut -d" " -f 2) \
+        && patch -p1 < /workspace/patches/vllm_*.patch
+EOF
+## Entrypoint
+WORKDIR /
+ENTRYPOINT [ "tini", "--" ]
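
Each rule in the `20260129_vllm_patch_dp/matrix.yaml` file above supplies build args that the CUDA and ROCm Dockerfiles substitute into their `FROM` line. The sketch below shows that substitution for the five rules, assuming each rule maps to exactly one base image. The `RULES` list is a hand-copied subset of the YAML, and `TEMPLATES` and `base_image` are hypothetical helpers rather than part of the GPUStack Runner tooling.

```python
# Rules copied by hand from matrix.yaml (only the fields used here).
RULES = [
    {"backend": "cuda", "args": ["CUDA_VERSION=12.9", "VLLM_VERSION=0.13.0"]},
    {"backend": "cuda", "args": ["CUDA_VERSION=12.8", "VLLM_VERSION=0.13.0"]},
    {"backend": "cuda", "args": ["CUDA_VERSION=12.6", "VLLM_VERSION=0.13.0"]},
    {"backend": "rocm", "args": ["ROCM_VERSION=7.0", "VLLM_VERSION=0.13.0"]},
    {"backend": "rocm", "args": ["ROCM_VERSION=6.4", "VLLM_VERSION=0.13.0"]},
]

# Python-format equivalents of the FROM lines in the two Dockerfiles.
TEMPLATES = {
    "cuda": "gpustack/runner:cuda{CUDA_VERSION}-vllm{VLLM_VERSION}",
    "rocm": "gpustack/runner:rocm{ROCM_VERSION}-vllm{VLLM_VERSION}",
}


def base_image(rule):
    """Resolve the base image a rule would build on top of."""
    args = dict(arg.split("=", 1) for arg in rule["args"])
    return TEMPLATES[rule["backend"]].format(**args)


for rule in RULES:
    print(base_image(rule))
```

The resulting tags correspond to the vLLM `0.13.0` entries shown in bold in the CUDA and ROCm support tables earlier in this diff.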

gpustack_runner-0.1.24.post3/pack/.post_operation/20260129_vllm_patch_dp/rocm/patches/vllm_001_wrong_dp_ray.patch ADDED

@@ -0,0 +1,41 @@
+diff --git a/vllm/utils/network_utils.py b/vllm/utils/network_utils.py
+index 7d01533cb..311ed44df 100644
+--- a/vllm/utils/network_utils.py
++++ b/vllm/utils/network_utils.py
+@@ -147,6 +147,9 @@ def get_open_zmq_inproc_path() -> str:
+     return f"inproc://{uuid4()}"
++_next_port: int | None = None
++
++
+ def get_open_port() -> int:
+     """
+     Get an open port for the vLLM process to listen on.
+@@ -163,7 +166,7 @@ def get_open_port() -> int:
+             candidate_port = _get_open_port()
+             if candidate_port not in reserved_port_range:
+                 return candidate_port
+-    return _get_open_port()
++    return _get_open_port(_next_port)
+ def get_open_ports_list(count: int = 5) -> list[int]:
+@@ -174,13 +177,15 @@ def get_open_ports_list(count: int = 5) -> list[int]:
+     return list(ports)
+-def _get_open_port() -> int:
+-    port = envs.VLLM_PORT
++def _get_open_port(start: int | None = None) -> int:
++    port = start or envs.VLLM_PORT
+     if port is not None:
++        global _next_port
+         while True:
+             try:
+                 with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
+                     s.bind(("", port))
++                    _next_port = port + 1
+                     return port
+             except OSError:
+                 port += 1  # Increment port number if already in use

{gpustack_runner-0.1.24.post2 → gpustack_runner-0.1.24.post3}/pack/.post_operation/README.md RENAMED

@@ -33,3 +33,5 @@ We leverage the matrix expansion feature of GPUStack Runner to achieve this, and
 - [x] 2025-12-19: Install `petit-kernel` package for vLLM 0.12.0/0.11.2 and SGLang 0.5.6.post2/0.5.5.post3 of ROcm released images.
 - [x] 2025-12-24: Apply ATB config patches to MindIE 2.2.rc1 for CANN released images.
 - [ ] 2026-01-05: Install `vllm-omni` packages for vLLM 0.12.0 of CUDA/ROCm/CANN released images.
+- [x] 2026-01-29: Apply DP deployment patches to vLLM 0.13.0 for CUDA/ROCm released images.
+- [x] 2026-01-29: Reinstall SGLang Kernel for SGLang 0.5.7 of CANN released images.
