gpustack-runner (PyPI): 0.1.23.post4.tar.gz → 0.1.23.post5.tar.gz


Files changed (124)

{gpustack_runner-0.1.23.post4 → gpustack_runner-0.1.23.post5}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: gpustack-runner
-Version: 0.1.23.post4
+Version: 0.1.23.post5
 Summary: GPUStack Runner is library for registering runnable accelerated backends and services in GPUStack.
 Project-URL: Homepage, https://github.com/gpustack/runner
 Project-URL: Bug Tracker, https://github.com/gpustack/gpustack/issues
@@ -97,9 +97,9 @@ The following table lists the supported accelerated backends and their correspon
 ### Hygon DTK
-| DTK Version <br/> (Variant) | vLLM             |
-|-----------------------------|------------------|
-| 25.04                       | `0.9.2`, `0.8.5` |
+| DTK Version <br/> (Variant) | vLLM                       |
+|-----------------------------|----------------------------|
+| 25.04                       | `0.11.0`, `0.9.2`, `0.8.5` |
 ### MetaX MACA
@@ -108,6 +108,13 @@ The following table lists the supported accelerated backends and their correspon
 | 3.2                          | `0.10.2` |
 | 3.0                          | `0.9.1`  |
+### MThreads MUSA
+| MUSA Version <br/> (Variant) | vLLM    | SGLang  |
+|------------------------------|---------|---------|
+| 4.3.2                        |         | `0.5.2` |
+| 4.1.0                        | `0.9.2` |         |
 ### AMD ROCm
 > [!CAUTION]
@@ -171,6 +178,7 @@ ARG PYTHON_VERSION=...                                 # REQUIRED
 ARG CMAKE_MAX_JOBS=...                                 # REQUIRED
 ARG {OTHERS}                                           # OPTIONAL
 ARG {BACKEND}_VERSION=...                              # REQUIRED
+ARG {BACKEND}_VERSION_EXTRA=...                        # OPTIONAL
 ARG {BACKEND}_ARCHS=...                                # REQUIRED
 ARG {BACKEND}_{OTHERS}=...                             # OPTIONAL
 ARG {SERVICE}_BASE_IMAGE=...                           # REQUIRED

{gpustack_runner-0.1.23.post4 → gpustack_runner-0.1.23.post5}/README.md RENAMED

@@ -77,9 +77,9 @@ The following table lists the supported accelerated backends and their correspon
 ### Hygon DTK
-| DTK Version <br/> (Variant) | vLLM             |
-|-----------------------------|------------------|
-| 25.04                       | `0.9.2`, `0.8.5` |
+| DTK Version <br/> (Variant) | vLLM                       |
+|-----------------------------|----------------------------|
+| 25.04                       | `0.11.0`, `0.9.2`, `0.8.5` |
 ### MetaX MACA
@@ -88,6 +88,13 @@ The following table lists the supported accelerated backends and their correspon
 | 3.2                          | `0.10.2` |
 | 3.0                          | `0.9.1`  |
+### MThreads MUSA
+| MUSA Version <br/> (Variant) | vLLM    | SGLang  |
+|------------------------------|---------|---------|
+| 4.3.2                        |         | `0.5.2` |
+| 4.1.0                        | `0.9.2` |         |
 ### AMD ROCm
 > [!CAUTION]
@@ -151,6 +158,7 @@ ARG PYTHON_VERSION=...                                 # REQUIRED
 ARG CMAKE_MAX_JOBS=...                                 # REQUIRED
 ARG {OTHERS}                                           # OPTIONAL
 ARG {BACKEND}_VERSION=...                              # REQUIRED
+ARG {BACKEND}_VERSION_EXTRA=...                        # OPTIONAL
 ARG {BACKEND}_ARCHS=...                                # REQUIRED
 ARG {BACKEND}_{OTHERS}=...                             # OPTIONAL
 ARG {SERVICE}_BASE_IMAGE=...                           # REQUIRED

{gpustack_runner-0.1.23.post4 → gpustack_runner-0.1.23.post5}/gpustack_runner/_version.py RENAMED

@@ -27,8 +27,8 @@ version_tuple: VERSION_TUPLE
 __commit_id__: COMMIT_ID
 commit_id: COMMIT_ID
-__version__ = version = '0.1.23.post4'
-__version_tuple__ = version_tuple = (0, 1, 23, 'post4')
+__version__ = version = '0.1.23.post5'
+__version_tuple__ = version_tuple = (0, 1, 23, 'post5')
 try:
     from ._version_appendix import git_commit
     __commit_id__ = commit_id = git_commit

gpustack_runner-0.1.23.post5/gpustack_runner/_version_appendix.py ADDED

@@ -0,0 +1 @@
+git_commit = "d297d69"

{gpustack_runner-0.1.23.post4 → gpustack_runner-0.1.23.post5}/gpustack_runner/cmds/images.py RENAMED

@@ -444,14 +444,14 @@ class SaveImagesSubCommand(SubCommand):
                 command = [
                     "skopeo",
-                    "--override-os",
-                    override_os,
-                    "--override-arch",
-                    override_arch,
                     "copy",
                     "--src-tls-verify=false",
                     "--retry-times",
                     str(self.max_retries),
+                    "--override-os",
+                    override_os,
+                    "--override-arch",
+                    override_arch,
                 ]
                 if self.source_username and self.source_password:
                     command.extend(
@@ -771,6 +771,10 @@ class CopyImagesSubCommand(SubCommand):
                 print(f"❌ Error syncing image '{img_name}'")
                 failures.append((img_name, img_err))
+            override_os, override_arch = None, None
+            if self.platform:
+                override_os, override_arch = self.platform.split("/", maxsplit=1)
             # Submit tasks
             for img in images:
                 command = [
@@ -778,10 +782,20 @@ class CopyImagesSubCommand(SubCommand):
                     "copy",
                     "--src-tls-verify=false",
                     "--dest-tls-verify=false",
-                    "--all",
                     "--retry-times",
                     str(self.max_retries),
                 ]
+                if override_os and override_arch:
+                    command.extend(
+                        [
+                            "--override-os",
+                            override_os,
+                            "--override-arch",
+                            override_arch,
+                        ],
+                    )
+                else:
+                    command.append("--all")
                 if self.source_username and self.source_password:
                     command.extend(
                         [
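
The `CopyImagesSubCommand` hunks above change which flags are passed to `skopeo copy`: when a platform is given, the command pins one OS/architecture pair, and otherwise it falls back to `--all`. A minimal sketch of that selection follows. It assumes the hidden first element of the list is `"skopeo"` (as in `SaveImagesSubCommand`), and `build_copy_command` is a hypothetical helper name that is not part of the package.

```python
from typing import List, Optional


def build_copy_command(platform: Optional[str], max_retries: int) -> List[str]:
    """Hypothetical helper mirroring the flag selection in the hunks above."""
    override_os, override_arch = None, None
    if platform:
        # "linux/arm64" -> ("linux", "arm64"); a value without "/" raises ValueError.
        override_os, override_arch = platform.split("/", maxsplit=1)

    command = [
        "skopeo",  # assumed first element; not visible in the hunk
        "copy",
        "--src-tls-verify=false",
        "--dest-tls-verify=false",
        "--retry-times",
        str(max_retries),
    ]
    if override_os and override_arch:
        # A single platform was requested: pin OS and architecture.
        command.extend(
            ["--override-os", override_os, "--override-arch", override_arch],
        )
    else:
        # No platform requested: keep the previous behaviour and pass --all.
        command.append("--all")
    return command


if __name__ == "__main__":
    print(build_copy_command("linux/arm64", 3))
    print(build_copy_command(None, 3))
```

The two branches are mutually exclusive, so `--all` and the override flags never appear in the same command.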

{gpustack_runner-0.1.23.post4 → gpustack_runner-0.1.23.post5}/gpustack_runner/runner.py RENAMED

@@ -13,7 +13,7 @@ from dataclasses_json import dataclass_json
 from . import envs
 _RE_DOCKER_IMAGE = re.compile(
-    r"(?:(?P<prefix>[\w\\.\-]+(?:/[\w\\.\-]+)*)/)?runner:(?P<backend>(Host|cann|corex|cuda|dtk|maca|rocm))(?P<backend_version>[XY\d\\.]+)(?:-(?P<backend_variant>\w+))?-(?P<service>(vllm|voxbox|mindie|sglang))(?P<service_version>[\w\\.]+)(?:-(?P<suffix>\w+))?",
+    r"(?:(?P<prefix>[\w\\.\-]+(?:/[\w\\.\-]+)*)/)?runner:(?P<backend>(Host|cann|corex|cuda|dtk|maca|musa|rocm))(?P<backend_version>[XY\d\\.]+)(?:-(?P<backend_variant>\w+))?-(?P<service>(vllm|voxbox|mindie|sglang))(?P<service_version>[\w\\.]+)(?:-(?P<suffix>\w+))?",
 )
 """
 Regex for Docker image parsing,

{gpustack_runner-0.1.23.post4 → gpustack_runner-0.1.23.post5}/gpustack_runner/runner.py.json RENAMED

@@ -1363,6 +1363,17 @@
     "docker_image": "gpustack/runner:cuda12.4-voxbox0.0.20",
     "deprecated": true
   },
+  {
+    "backend": "dtk",
+    "backend_version": "25.04",
+    "original_backend_version": "25.04.2",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.11.0",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:dtk25.04-vllm0.11.0",
+    "deprecated": false
+  },
   {
     "backend": "dtk",
     "backend_version": "25.04",
@@ -1407,6 +1418,28 @@
     "docker_image": "gpustack/runner:maca3.0-vllm0.9.1",
     "deprecated": false
   },
+  {
+    "backend": "musa",
+    "backend_version": "4.3",
+    "original_backend_version": "4.3.2",
+    "backend_variant": "",
+    "service": "sglang",
+    "service_version": "0.5.7",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:musa4.3-sglang0.5.7",
+    "deprecated": false
+  },
+  {
+    "backend": "musa",
+    "backend_version": "4.1",
+    "original_backend_version": "4.1.0",
+    "backend_variant": "",
+    "service": "vllm",
+    "service_version": "0.9.2",
+    "platform": "linux/amd64",
+    "docker_image": "gpustack/runner:musa4.1-vllm0.9.2",
+    "deprecated": false
+  },
   {
     "backend": "rocm",
     "backend_version": "7.0",

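In the entries added above, `docker_image` appears to be derivable from the other fields. The sketch below reconstructs the tag and checks it against the three new entries. The naming rule is inferred from these entries and from the regex in `runner.py`, not from the package's own generator; `image_tag` and the `namespace` parameter are hypothetical.

```python
from typing import Dict

# Fields copied from the three entries added in this release.
NEW_ENTRIES = [
    {"backend": "dtk", "backend_version": "25.04", "backend_variant": "",
     "service": "vllm", "service_version": "0.11.0",
     "docker_image": "gpustack/runner:dtk25.04-vllm0.11.0"},
    {"backend": "musa", "backend_version": "4.3", "backend_variant": "",
     "service": "sglang", "service_version": "0.5.7",
     "docker_image": "gpustack/runner:musa4.3-sglang0.5.7"},
    {"backend": "musa", "backend_version": "4.1", "backend_variant": "",
     "service": "vllm", "service_version": "0.9.2",
     "docker_image": "gpustack/runner:musa4.1-vllm0.9.2"},
]


def image_tag(entry: Dict[str, str], namespace: str = "gpustack") -> str:
    """Hypothetical helper: rebuild the image tag from an entry's fields."""
    # An empty variant contributes nothing; a non-empty one is assumed to be
    # inserted as "-<variant>" between backend version and service.
    variant = f"-{entry['backend_variant']}" if entry["backend_variant"] else ""
    return (
        f"{namespace}/runner:"
        f"{entry['backend']}{entry['backend_version']}{variant}"
        f"-{entry['service']}{entry['service_version']}"
    )


if __name__ == "__main__":
    for entry in NEW_ENTRIES:
        print(image_tag(entry), image_tag(entry) == entry["docker_image"])
```

Note that the tag uses the shortened `backend_version` (for example `4.3`), while `original_backend_version` keeps the full release (`4.3.2`).
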
gpustack_runner-0.1.23.post5/pack/.post_operation/20260105_vllm_install_omni/cann/Dockerfile ADDED

@@ -0,0 +1,81 @@
+ARG CMAKE_MAX_JOBS
+ARG CANN_VERSION=8.3
+ARG CANN_ARCHS=910b
+ARG VLLM_VERSION=0.12.0
+ARG VLLM_OMNI_COMMIT=75cdf1c
+FROM gpustack/runner:cann${CANN_VERSION}-${CANN_ARCHS}-vllm${VLLM_VERSION} AS vllm-build-omni
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Build Omni
+ARG CMAKE_MAX_JOBS
+ARG VLLM_OMNI_COMMIT
+ENV VLLM_OMNI_COMMIT=${VLLM_OMNI_COMMIT}
+RUN <<EOF
+    # Omni
+    CMAKE_MAX_JOBS="${CMAKE_MAX_JOBS}"
+    if [[ -z "${CMAKE_MAX_JOBS}" ]]; then
+        CMAKE_MAX_JOBS="$(( $(nproc) / 2 ))"
+    fi
+    if (( $(echo "${CMAKE_MAX_JOBS} > 4" | bc -l) )); then
+        CMAKE_MAX_JOBS="4"
+    fi
+    export MAX_JOBS="${CMAKE_MAX_JOBS}"
+    export COMPILE_CUSTOM_KERNELS=1
+    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${CANN_HOME}/ascend-toolkit/latest/$(uname -i)-linux/devlib"
+    export VLLM_TARGET_DEVICE="empty"
+    echo "Building vLLM Omni with the following environment variables:"
+    env
+    # Build
+    git -C /tmp clone --recursive --shallow-submodules \
+        https://github.com/vllm-project/vllm-omni vllm_omni \
+        && pushd /tmp/vllm_omni \
+        && git checkout ${VLLM_OMNI_COMMIT} \
+        && git submodule update --init --recursive
+    pushd /tmp/vllm_omni \
+        && python -v -m build --no-isolation --wheel \
+        && tree -hs /tmp/vllm_omni/dist \
+        && mv /tmp/vllm_omni/dist /workspace
+    # Cleanup
+    rm -rf /var/tmp/* \
+        && rm -rf /tmp/*
+EOF
+FROM gpustack/runner:cann${CANN_VERSION}-${CANN_ARCHS}-vllm${VLLM_VERSION} AS vllm
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Install Omni
+RUN --mount=type=bind,from=vllm-build-omni,source=/,target=/omni,rw <<EOF
+    # Omni
+    # Install
+    uv pip install --no-build-isolation \
+        /omni/workspace/*.whl
+    # Review
+    uv pip tree
+    # Cleanup
+    rm -rf /var/tmp/* \
+        && rm -rf /tmp/*
+EOF
+## Entrypoint
+WORKDIR /
+ENTRYPOINT [ "tini", "--" ]
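
All three new Dockerfiles resolve the build parallelism the same way: use `CMAKE_MAX_JOBS` if set, otherwise half the CPU count, and cap the result at 4. The arithmetic can be sketched as below. `resolve_max_jobs` is a hypothetical name, and the sketch assumes an integer-valued input, whereas the shell version compares with `bc -l`.

```python
from typing import Optional


def resolve_max_jobs(cmake_max_jobs: Optional[str], cpu_count: int, cap: int = 4) -> int:
    """Hypothetical helper mirroring the MAX_JOBS logic in the RUN step above."""
    if cmake_max_jobs:
        jobs = int(cmake_max_jobs)
    else:
        # Same as $(( $(nproc) / 2 )): integer division, rounds down.
        jobs = cpu_count // 2
    # Values above the cap are clamped; lower values pass through unchanged.
    return min(jobs, cap)


if __name__ == "__main__":
    for value, cpus in [("", 16), ("2", 16), (None, 6), ("", 1)]:
        print(repr(value), cpus, "->", resolve_max_jobs(value, cpus))
```

With a single CPU and no explicit value, the integer division yields 0, and the script exports that value as `MAX_JOBS` without a lower bound.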

gpustack_runner-0.1.23.post5/pack/.post_operation/20260105_vllm_install_omni/cuda/Dockerfile ADDED

@@ -0,0 +1,93 @@
+ARG CMAKE_MAX_JOBS
+ARG CUDA_VERSION=12.8
+ARG VLLM_VERSION=0.12.0
+ARG VLLM_OMNI_COMMIT=75cdf1c
+FROM gpustack/runner:cuda${CUDA_VERSION}-vllm${VLLM_VERSION} AS vllm-build-omni
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Build Omni
+ARG CMAKE_MAX_JOBS
+ARG VLLM_OMNI_COMMIT
+ENV VLLM_OMNI_COMMIT=${VLLM_OMNI_COMMIT}
+RUN <<EOF
+    # Omni
+    IFS="." read -r CUDA_MAJOR CUDA_MINOR CUDA_PATCH <<< "${VLLM_TORCH_CUDA_VERSION}"
+    CMAKE_MAX_JOBS="${CMAKE_MAX_JOBS}"
+    if [[ -z "${CMAKE_MAX_JOBS}" ]]; then
+        CMAKE_MAX_JOBS="$(( $(nproc) / 2 ))"
+    fi
+    if (( $(echo "${CMAKE_MAX_JOBS} > 4" | bc -l) )); then
+        CMAKE_MAX_JOBS="4"
+    fi
+    VL_CUDA_ARCHS="${CUDA_ARCHS}"
+    if [[ -z "${VL_CUDA_ARCHS}" ]]; then
+        if (( $(echo "${CUDA_MAJOR}.${CUDA_MINOR} < 12.9" | bc -l) )); then
+            VL_CUDA_ARCHS="7.5 8.0+PTX 8.9 9.0 10.0+PTX 12.0+PTX"
+        else
+            VL_CUDA_ARCHS="7.5 8.0+PTX 8.9 9.0 10.0 10.3 12.0 12.1+PTX"
+        fi
+    fi
+    export MAX_JOBS="${CMAKE_MAX_JOBS}"
+    export TORCH_CUDA_ARCH_LIST="${VL_CUDA_ARCHS}"
+    export NVCC_THREADS=1
+    echo "Building vLLM Omni with the following environment variables:"
+    env
+    # Build
+    git -C /tmp clone --recursive --shallow-submodules \
+        https://github.com/vllm-project/vllm-omni vllm_omni \
+        && pushd /tmp/vllm_omni \
+        && git checkout ${VLLM_OMNI_COMMIT} \
+        && git submodule update --init --recursive
+    pushd /tmp/vllm_omni \
+        && python -v -m build --no-isolation --wheel \
+        && tree -hs /tmp/vllm_omni/dist \
+        && mv /tmp/vllm_omni/dist /workspace
+    # Cleanup
+    rm -rf /var/tmp/* \
+        && rm -rf /tmp/*
+EOF
+FROM gpustack/runner:cuda${CUDA_VERSION}-vllm${VLLM_VERSION} AS vllm
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Install Omni
+RUN --mount=type=bind,from=vllm-build-omni,source=/,target=/omni,rw <<EOF
+    # Omni
+    # Install
+    uv pip install --no-build-isolation \
+        /omni/workspace/*.whl
+    # Review
+    uv pip tree
+    # Cleanup
+    rm -rf /var/tmp/* \
+        && rm -rf /tmp/*
+EOF
+## Entrypoint
+WORKDIR /
+ENTRYPOINT [ "tini", "--" ]
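
The CUDA build step above picks a default `TORCH_CUDA_ARCH_LIST` from the CUDA version when `CUDA_ARCHS` is empty. A minimal sketch of that selection follows. `default_cuda_archs` is a hypothetical name, and the sketch compares `(major, minor)` tuples rather than the decimal `bc` comparison used in the Dockerfile, so the two would differ for a two-digit minor version such as `12.10`.

```python
def default_cuda_archs(torch_cuda_version: str, cuda_archs: str = "") -> str:
    """Hypothetical helper mirroring the VL_CUDA_ARCHS selection above."""
    if cuda_archs:
        # An explicit CUDA_ARCHS value always wins.
        return cuda_archs
    # "12.8.1" -> (12, 8); an empty version string raises ValueError here.
    major, minor = (int(part) for part in torch_cuda_version.split(".")[:2])
    if (major, minor) < (12, 9):
        return "7.5 8.0+PTX 8.9 9.0 10.0+PTX 12.0+PTX"
    return "7.5 8.0+PTX 8.9 9.0 10.0 10.3 12.0 12.1+PTX"


if __name__ == "__main__":
    for version in ("12.6", "12.8", "12.9"):
        print(version, "->", default_cuda_archs(version))
```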

gpustack_runner-0.1.23.post5/pack/.post_operation/20260105_vllm_install_omni/matrix.yaml ADDED

@@ -0,0 +1,78 @@
+rules:
+  #
+  # NVIDIA CUDA
+  #
+  ## Packed NVIDIA CUDA 12.9.
+  ##
+  - backend: "cuda"
+    services:
+      - "vllm"
+    args:
+      - "CUDA_VERSION=12.9"
+      - "VLLM_VERSION=0.12.0"
+  ## Packed NVIDIA CUDA 12.8.
+  ##
+  - backend: "cuda"
+    services:
+      - "vllm"
+    args:
+      - "CUDA_VERSION=12.8"
+      - "VLLM_VERSION=0.12.0"
+  ## Packed NVIDIA CUDA 12.6.
+  ##
+  - backend: "cuda"
+    services:
+      - "vllm"
+    args:
+      - "CUDA_VERSION=12.6"
+      - "VLLM_VERSION=0.12.0"
+  #
+  # AMD ROCm
+  #
+  ## Packed AMD ROCm 7.0.
+  ##
+  - backend: "rocm"
+    services:
+      - "vllm"
+    platforms:
+      - "linux/amd64"
+    args:
+      - "ROCM_VERSION=7.0"
+      - "VLLM_VERSION=0.12.0"
+  ## Packed AMD ROCm 6.4.
+  ##
+  - backend: "rocm"
+    services:
+      - "vllm"
+    platforms:
+      - "linux/amd64"
+    args:
+      - "ROCM_VERSION=6.4"
+      - "VLLM_VERSION=0.12.0"
+  #
+  # Ascend CANN
+  #
+  ## Packed Ascend CANN 8.3, using CANN Kernel for A3.
+  ##
+  - backend: "cann"
+    services:
+      - "vllm"
+    args:
+      - "CANN_VERSION=8.3"
+      - "CANN_ARCHS=a3"
+      - "VLLM_VERSION=0.12.0"
+  ## Packed Ascend CANN 8.3, using CANN Kernel for 910B.
+  ##
+  - backend: "cann"
+    services:
+      - "vllm"
+    args:
+      - "CANN_VERSION=8.3"
+      - "CANN_ARCHS=910b"
+      - "VLLM_VERSION=0.12.0"
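
Each rule's `args` line up with the `ARG` names used in the `FROM` lines of the matching Dockerfile, so the base image for a rule can be worked out by substitution. The sketch below does only that. It does not reproduce GPUStack Runner's matrix expansion, which this diff does not show; `FROM_TEMPLATES` and `base_image` are hypothetical names.

```python
from typing import Dict, List

# FROM lines of the three new Dockerfiles, rewritten as str.format templates.
FROM_TEMPLATES: Dict[str, str] = {
    "cann": "gpustack/runner:cann{CANN_VERSION}-{CANN_ARCHS}-vllm{VLLM_VERSION}",
    "cuda": "gpustack/runner:cuda{CUDA_VERSION}-vllm{VLLM_VERSION}",
    "rocm": "gpustack/runner:rocm{ROCM_VERSION}-vllm{VLLM_VERSION}",
}


def base_image(backend: str, args: List[str]) -> str:
    """Hypothetical helper: substitute a rule's KEY=VALUE args into the FROM template."""
    values = dict(arg.split("=", 1) for arg in args)
    return FROM_TEMPLATES[backend].format(**values)


if __name__ == "__main__":
    # Rules copied from matrix.yaml above.
    print(base_image("cuda", ["CUDA_VERSION=12.9", "VLLM_VERSION=0.12.0"]))
    print(base_image("rocm", ["ROCM_VERSION=7.0", "VLLM_VERSION=0.12.0"]))
    print(base_image("cann", ["CANN_VERSION=8.3", "CANN_ARCHS=a3", "VLLM_VERSION=0.12.0"]))
```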

gpustack_runner-0.1.23.post5/pack/.post_operation/20260105_vllm_install_omni/rocm/Dockerfile ADDED

@@ -0,0 +1,98 @@
+ARG CMAKE_MAX_JOBS
+ARG ROCM_VERSION=6.4
+ARG VLLM_VERSION=0.12.0
+ARG VLLM_OMNI_COMMIT=75cdf1c
+FROM gpustack/runner:rocm${ROCM_VERSION}-vllm${VLLM_VERSION} AS vllm-build-omni
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Build Omni
+ARG CMAKE_MAX_JOBS
+ARG VLLM_OMNI_COMMIT
+ENV VLLM_OMNI_COMMIT=${VLLM_OMNI_COMMIT}
+RUN <<EOF
+    # Omni
+    IFS="." read -r ROCM_MAJOR ROCM_MINOR ROCM_PATCH <<< "${VLLM_TORCH_ROCM_VERSION}"
+    IFS="." read -r VL_MAJOR VL_MINOR VL_PATCH <<< "${VLLM_VERSION}"
+    CMAKE_MAX_JOBS="${CMAKE_MAX_JOBS}"
+    if [[ -z "${CMAKE_MAX_JOBS}" ]]; then
+        CMAKE_MAX_JOBS="$(( $(nproc) / 2 ))"
+    fi
+    if (( $(echo "${CMAKE_MAX_JOBS} > 4" | bc -l) )); then
+        CMAKE_MAX_JOBS="4"
+    fi
+    VL_ROCM_ARCHS="${ROCM_ARCHS}"
+    if [[ -z "${VL_ROCM_ARCHS}" ]]; then
+        if (( $(echo "${ROCM_MAJOR}.${ROCM_MINOR} < 7.0" | bc -l) )); then
+            VL_ROCM_ARCHS="gfx908;gfx90a;gfx942;gfx1030;gfx1100"
+            if (( $(echo "${VL_MAJOR}.${VL_MINOR} == 0.13" | bc -l) )); then
+                # TODO(thxCode): Temporarily remove gfx1030 for vLLM ROCm build due to build error in ROCm 6.4.4.
+                # #15 134.9 /tmp/vllm/build/temp.linux-x86_64-cpython-312/csrc/sampler.hip:564:63: error: local memory (66032) exceeds limit (65536) in 'void vllm::topKPerRowDecode<1024, true, false, true>(float const*, int const*, int*, int, int, int, int, float*, int, int const*)'
+                # ##15 134.9   564 | static __global__ __launch_bounds__(kNumThreadsPerBlock) void topKPerRowDecode(
+                # ##15 134.9       |                                                               ^
+                # ##15 134.9 16 warnings and 1 error generated when compiling for gfx1030.
+                VL_ROCM_ARCHS="gfx908;gfx90a;gfx942"
+            fi
+        else
+            VL_ROCM_ARCHS="gfx908;gfx90a;gfx942;gfx950;gfx1030;gfx1100;gfx1101;gfx1200;gfx1201;gfx1150;gfx1151"
+        fi
+    fi
+    export MAX_JOBS="${CMAKE_MAX_JOBS}"
+    export COMPILE_CUSTOM_KERNELS=1
+    export PYTORCH_ROCM_ARCH="${VL_ROCM_ARCHS}"
+    echo "Building vLLM Omni with the following environment variables:"
+    env
+    # Build
+    git -C /tmp clone --recursive --shallow-submodules \
+        https://github.com/vllm-project/vllm-omni vllm_omni \
+        && pushd /tmp/vllm_omni \
+        && git checkout ${VLLM_OMNI_COMMIT} \
+        && git submodule update --init --recursive
+    pushd /tmp/vllm_omni \
+        && python -v -m build --no-isolation --wheel \
+        && tree -hs /tmp/vllm_omni/dist \
+        && mv /tmp/vllm_omni/dist /workspace
+    # Cleanup
+    rm -rf /var/tmp/* \
+        && rm -rf /tmp/*
+EOF
+FROM gpustack/runner:rocm${ROCM_VERSION}-vllm${VLLM_VERSION} AS vllm
+SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
+ARG TARGETPLATFORM
+ARG TARGETOS
+ARG TARGETARCH
+## Install Omni
+RUN --mount=type=bind,from=vllm-build-omni,source=/,target=/omni,rw <<EOF
+    # Omni
+    # Install
+    uv pip install --no-build-isolation \
+        /omni/workspace/*.whl
+    # Review
+    uv pip tree
+    # Cleanup
+    rm -rf /var/tmp/* \
+        && rm -rf /tmp/*
+EOF
+## Entrypoint
+WORKDIR /
+ENTRYPOINT [ "tini", "--" ]
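
The ROCm build step above chooses `PYTORCH_ROCM_ARCH` from two versions: the ROCm release and, for ROCm below 7.0, the vLLM minor version. A minimal sketch of the branches follows. `default_rocm_archs` is a hypothetical name, and tuple comparison stands in for the `bc` decimal comparison in the Dockerfile.

```python
from typing import Tuple


def _major_minor(version: str) -> Tuple[int, int]:
    """Return the first two numeric components, e.g. "6.4.4" -> (6, 4)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return major, minor


def default_rocm_archs(torch_rocm_version: str, vllm_version: str, rocm_archs: str = "") -> str:
    """Hypothetical helper mirroring the VL_ROCM_ARCHS selection above."""
    if rocm_archs:
        # An explicit ROCM_ARCHS value always wins.
        return rocm_archs
    if _major_minor(torch_rocm_version) < (7, 0):
        if _major_minor(vllm_version) == (0, 13):
            # Reduced list used by the Dockerfile for vLLM 0.13 on ROCm < 7.0.
            return "gfx908;gfx90a;gfx942"
        return "gfx908;gfx90a;gfx942;gfx1030;gfx1100"
    return "gfx908;gfx90a;gfx942;gfx950;gfx1030;gfx1100;gfx1101;gfx1200;gfx1201;gfx1150;gfx1151"


if __name__ == "__main__":
    print(default_rocm_archs("6.4", "0.12.0"))
    print(default_rocm_archs("6.4", "0.13.0"))
    print(default_rocm_archs("7.0", "0.12.0"))
```

The TODO comment in the Dockerfile mentions removing only `gfx1030`, but the reduced list also omits `gfx1100`; the sketch follows the code rather than the comment. With the default `VLLM_VERSION=0.12.0` used by this post operation, the reduced list is not selected.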

{gpustack_runner-0.1.23.post4 → gpustack_runner-0.1.23.post5}/pack/.post_operation/README.md RENAMED

@@ -32,3 +32,4 @@ We leverage the matrix expansion feature of GPUStack Runner to achieve this, and
 - [x] 2025-12-19: Install `vLLM[audio]` packages for vLLM 0.12.0/0.11.2 of CUDA/ROCm released images.
 - [x] 2025-12-19: Install `petit-kernel` package for vLLM 0.12.0/0.11.2 and SGLang 0.5.6.post2/0.5.5.post3 of ROcm released images.
 - [x] 2025-12-24: Apply ATB config patches to MindIE 2.2.rc1 for CANN released images.
+- [ ] 2026-01-05: Install `vllm-omni` packages for vLLM 0.12.0 of CUDA/ROCm/CANN released images.
