vec-inf 0.6.0.tar.gz → 0.7.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (98)
  1. {vec_inf-0.6.0 → vec_inf-0.7.0}/.github/ISSUE_TEMPLATE/bug_report.md +1 -1
  2. vec_inf-0.7.0/.github/ISSUE_TEMPLATE/model-request.md +14 -0
  3. {vec_inf-0.6.0 → vec_inf-0.7.0}/.github/workflows/code_checks.yml +1 -1
  4. {vec_inf-0.6.0 → vec_inf-0.7.0}/.github/workflows/docker.yml +4 -4
  5. {vec_inf-0.6.0 → vec_inf-0.7.0}/.github/workflows/docs.yml +3 -3
  6. {vec_inf-0.6.0 → vec_inf-0.7.0}/.github/workflows/publish.yml +1 -1
  7. {vec_inf-0.6.0 → vec_inf-0.7.0}/.github/workflows/unit_tests.yml +6 -2
  8. {vec_inf-0.6.0 → vec_inf-0.7.0}/.gitignore +4 -0
  9. {vec_inf-0.6.0 → vec_inf-0.7.0}/.pre-commit-config.yaml +3 -3
  10. {vec_inf-0.6.0 → vec_inf-0.7.0}/Dockerfile +17 -8
  11. vec_inf-0.7.0/MODEL_TRACKING.md +324 -0
  12. {vec_inf-0.6.0 → vec_inf-0.7.0}/PKG-INFO +25 -67
  13. {vec_inf-0.6.0 → vec_inf-0.7.0}/README.md +22 -64
  14. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/api.md +9 -0
  15. vec_inf-0.7.0/docs/index.md +20 -0
  16. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/user_guide.md +110 -37
  17. {vec_inf-0.6.0 → vec_inf-0.7.0}/examples/README.md +1 -0
  18. vec_inf-0.7.0/examples/slurm_dependency/README.md +33 -0
  19. vec_inf-0.7.0/examples/slurm_dependency/downstream_job.sbatch +18 -0
  20. vec_inf-0.7.0/examples/slurm_dependency/run_downstream.py +26 -0
  21. vec_inf-0.7.0/examples/slurm_dependency/run_workflow.sh +14 -0
  22. {vec_inf-0.6.0 → vec_inf-0.7.0}/pyproject.toml +3 -3
  23. vec_inf-0.7.0/tests/test_imports.py +33 -0
  24. vec_inf-0.7.0/tests/vec_inf/cli/test_cli.py +406 -0
  25. vec_inf-0.7.0/tests/vec_inf/cli/test_helper.py +521 -0
  26. vec_inf-0.7.0/tests/vec_inf/client/test_api.py +520 -0
  27. vec_inf-0.7.0/tests/vec_inf/client/test_helper.py +997 -0
  28. vec_inf-0.7.0/tests/vec_inf/client/test_slurm_script_generator.py +498 -0
  29. {vec_inf-0.6.0 → vec_inf-0.7.0}/tests/vec_inf/client/test_utils.py +162 -26
  30. vec_inf-0.7.0/tests/vec_inf/client/test_vars.env +2 -0
  31. vec_inf-0.7.0/uv.lock +5260 -0
  32. vec_inf-0.7.0/vec_inf/README.md +23 -0
  33. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/cli/_cli.py +212 -30
  34. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/cli/_helper.py +95 -14
  35. vec_inf-0.7.0/vec_inf/client/_client_vars.py +80 -0
  36. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/client/_helper.py +386 -53
  37. vec_inf-0.7.0/vec_inf/client/_slurm_script_generator.py +346 -0
  38. vec_inf-0.7.0/vec_inf/client/_slurm_templates.py +248 -0
  39. vec_inf-0.7.0/vec_inf/client/_slurm_vars.py +82 -0
  40. vec_inf-0.7.0/vec_inf/client/_utils.py +406 -0
  41. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/client/api.py +96 -25
  42. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/client/config.py +46 -15
  43. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/client/models.py +51 -2
  44. vec_inf-0.7.0/vec_inf/config/README.md +6 -0
  45. vec_inf-0.7.0/vec_inf/config/environment.yaml +31 -0
  46. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/config/models.yaml +102 -281
  47. {vec_inf-0.6.0 → vec_inf-0.7.0}/venv.sh +14 -13
  48. vec_inf-0.6.0/docs/index.md +0 -13
  49. vec_inf-0.6.0/tests/test_imports.py +0 -32
  50. vec_inf-0.6.0/tests/vec_inf/cli/test_cli.py +0 -533
  51. vec_inf-0.6.0/tests/vec_inf/client/test_api.py +0 -130
  52. vec_inf-0.6.0/uv.lock +0 -4701
  53. vec_inf-0.6.0/vec_inf/README.md +0 -9
  54. vec_inf-0.6.0/vec_inf/client/_client_vars.py +0 -213
  55. vec_inf-0.6.0/vec_inf/client/_slurm_script_generator.py +0 -179
  56. vec_inf-0.6.0/vec_inf/client/_utils.py +0 -287
  57. vec_inf-0.6.0/vec_inf/client/slurm_vars.py +0 -49
  58. vec_inf-0.6.0/vec_inf/config/README.md +0 -245
  59. {vec_inf-0.6.0 → vec_inf-0.7.0}/.github/ISSUE_TEMPLATE/config.yml +0 -0
  60. {vec_inf-0.6.0 → vec_inf-0.7.0}/.github/ISSUE_TEMPLATE/feature_request.md +0 -0
  61. {vec_inf-0.6.0 → vec_inf-0.7.0}/.github/dependabot.yml +0 -0
  62. {vec_inf-0.6.0 → vec_inf-0.7.0}/.github/pull_request_template.md +0 -0
  63. {vec_inf-0.6.0 → vec_inf-0.7.0}/.python-version +0 -0
  64. {vec_inf-0.6.0 → vec_inf-0.7.0}/LICENSE +0 -0
  65. {vec_inf-0.6.0 → vec_inf-0.7.0}/codecov.yml +0 -0
  66. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/Makefile +0 -0
  67. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/assets/favicon-48x48.svg +0 -0
  68. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/assets/favicon.ico +0 -0
  69. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/assets/vector-logo.svg +0 -0
  70. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/contributing.md +0 -0
  71. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/make.bat +0 -0
  72. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/overrides/partials/copyright.html +0 -0
  73. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/overrides/partials/logo.html +0 -0
  74. {vec_inf-0.6.0 → vec_inf-0.7.0}/docs/stylesheets/extra.css +0 -0
  75. {vec_inf-0.6.0 → vec_inf-0.7.0}/examples/api/basic_usage.py +0 -0
  76. {vec_inf-0.6.0 → vec_inf-0.7.0}/examples/inference/llm/chat_completions.py +0 -0
  77. {vec_inf-0.6.0 → vec_inf-0.7.0}/examples/inference/llm/completions.py +0 -0
  78. {vec_inf-0.6.0 → vec_inf-0.7.0}/examples/inference/llm/completions.sh +0 -0
  79. {vec_inf-0.6.0 → vec_inf-0.7.0}/examples/inference/text_embedding/embeddings.py +0 -0
  80. {vec_inf-0.6.0 → vec_inf-0.7.0}/examples/inference/vlm/vision_completions.py +0 -0
  81. {vec_inf-0.6.0 → vec_inf-0.7.0}/examples/logits/logits.py +0 -0
  82. {vec_inf-0.6.0 → vec_inf-0.7.0}/mkdocs.yml +0 -0
  83. {vec_inf-0.6.0 → vec_inf-0.7.0}/profile/avg_throughput.py +0 -0
  84. {vec_inf-0.6.0 → vec_inf-0.7.0}/profile/gen.py +0 -0
  85. {vec_inf-0.6.0 → vec_inf-0.7.0}/tests/__init__.py +0 -0
  86. {vec_inf-0.6.0 → vec_inf-0.7.0}/tests/vec_inf/__init__.py +0 -0
  87. {vec_inf-0.6.0 → vec_inf-0.7.0}/tests/vec_inf/cli/__init__.py +0 -0
  88. {vec_inf-0.6.0 → vec_inf-0.7.0}/tests/vec_inf/cli/test_utils.py +0 -0
  89. {vec_inf-0.6.0 → vec_inf-0.7.0}/tests/vec_inf/client/__init__.py +0 -0
  90. {vec_inf-0.6.0 → vec_inf-0.7.0}/tests/vec_inf/client/test_examples.py +0 -0
  91. {vec_inf-0.6.0 → vec_inf-0.7.0}/tests/vec_inf/client/test_models.py +0 -0
  92. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/__init__.py +0 -0
  93. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/cli/__init__.py +0 -0
  94. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/cli/_utils.py +0 -0
  95. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/cli/_vars.py +0 -0
  96. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/client/__init__.py +0 -0
  97. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/client/_exceptions.py +0 -0
  98. {vec_inf-0.6.0 → vec_inf-0.7.0}/vec_inf/find_port.sh +0 -0

{vec_inf-0.6.0 → vec_inf-0.7.0}/.github/ISSUE_TEMPLATE/bug_report.md
@@ -3,7 +3,7 @@ name: Bug report
 about: Create a report to help us improve
 title: ''
 labels: ''
-assignees: ''
+assignees: XkunW

 ---


vec_inf-0.7.0/.github/ISSUE_TEMPLATE/model-request.md
@@ -0,0 +1,14 @@
+---
+name: Model request
+about: Request for new model weights or model config
+title: New model request for [MODEL_NAME]
+labels: new model
+assignees: XkunW
+
+---
+
+### Request Type
+Model weights | Model config | Both
+
+### Model Name
+Name of the model requested

{vec_inf-0.6.0 → vec_inf-0.7.0}/.github/workflows/code_checks.yml
@@ -28,7 +28,7 @@ jobs:
   run-code-check:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4.2.2
+      - uses: actions/checkout@v5.0.0
       - name: Install uv
         uses: astral-sh/setup-uv@v6
         with:

{vec_inf-0.6.0 → vec_inf-0.7.0}/.github/workflows/docker.yml
@@ -24,7 +24,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Checkout repository
-        uses: actions/checkout@v4.2.2
+        uses: actions/checkout@v5.0.0

       - name: Extract vLLM version
         id: vllm-version

@@ -33,19 +33,19 @@ jobs:
           echo "version=$VERSION" >> $GITHUB_OUTPUT

       - name: Log in to Docker Hub
-        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772
+        uses: docker/login-action@184bdaa0721073962dff0199f1fb9940f07167d1
         with:
           username: ${{ secrets.DOCKER_USERNAME }}
           password: ${{ secrets.DOCKER_PASSWORD }}

       - name: Extract metadata (tags, labels) for Docker
         id: meta
-        uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804
+        uses: docker/metadata-action@c1e51972afc2121e065aed6d45c65596fe445f3f
         with:
           images: vectorinstitute/vector-inference

       - name: Build and push Docker image
-        uses: docker/build-push-action@14487ce63c7a62a4a324b0bfb37086795e31c6c1
+        uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83
         with:
           context: .
           file: ./Dockerfile

{vec_inf-0.6.0 → vec_inf-0.7.0}/.github/workflows/docs.yml
@@ -51,7 +51,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Checkout code
-        uses: actions/checkout@v4.2.2
+        uses: actions/checkout@v5.0.0
         with:
           fetch-depth: 0 # Fetch all history for proper versioning


@@ -88,7 +88,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - name: Checkout code
-        uses: actions/checkout@v4.2.2
+        uses: actions/checkout@v5.0.0
         with:
           fetch-depth: 0 # Fetch all history for proper versioning


@@ -112,7 +112,7 @@ jobs:
           git config user.email 41898282+github-actions[bot]@users.noreply.github.com

       - name: Download artifact
-        uses: actions/download-artifact@v4
+        uses: actions/download-artifact@v5
         with:
           name: docs-site
           path: site

{vec_inf-0.6.0 → vec_inf-0.7.0}/.github/workflows/publish.yml
@@ -13,7 +13,7 @@ jobs:
           sudo apt-get update
           sudo apt-get install libcurl4-openssl-dev libssl-dev

-      - uses: actions/checkout@v4.2.2
+      - uses: actions/checkout@v5.0.0

       - name: Install uv
         uses: astral-sh/setup-uv@v6

{vec_inf-0.6.0 → vec_inf-0.7.0}/.github/workflows/unit_tests.yml
@@ -43,7 +43,7 @@ jobs:
       matrix:
         python-version: ["3.10", "3.11", "3.12"]
     steps:
-      - uses: actions/checkout@v4.2.2
+      - uses: actions/checkout@v5.0.0

       - name: Install uv
         uses: astral-sh/setup-uv@v6

@@ -71,8 +71,12 @@ jobs:
         run: |
           uv run pytest tests/test_imports.py

+      - name: Import Codecov GPG public key
+        run: |
+          gpg --keyserver keyserver.ubuntu.com --recv-keys 806BB28AED779869
+
       - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v5.4.2
+        uses: codecov/codecov-action@v5.5.0
         with:
           token: ${{ secrets.CODECOV_TOKEN }}
           file: ./coverage.xml

{vec_inf-0.6.0 → vec_inf-0.7.0}/.gitignore
@@ -152,3 +152,7 @@ collect_env.py

 # build files
 dist/
+
+# type stubs
+stubs/
+mypy.ini

{vec_inf-0.6.0 → vec_inf-0.7.0}/.pre-commit-config.yaml
@@ -1,6 +1,6 @@
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v5.0.0 # Use the ref you want to point at
+    rev: v6.0.0 # Use the ref you want to point at
     hooks:
       - id: trailing-whitespace
       - id: check-ast

@@ -17,7 +17,7 @@ repos:
       - id: check-toml

   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: 'v0.11.8'
+    rev: 'v0.12.10'
     hooks:
       - id: ruff
         args: [--fix, --exit-non-zero-on-fix]

@@ -26,7 +26,7 @@ repos:
         types_or: [python, jupyter]

   - repo: https://github.com/pre-commit/mirrors-mypy
-    rev: v1.15.0
+    rev: v1.17.1
     hooks:
       - id: mypy
         entry: python3 -m mypy --config-file pyproject.toml

{vec_inf-0.6.0 → vec_inf-0.7.0}/Dockerfile
@@ -1,4 +1,4 @@
-FROM nvidia/cuda:12.4.1-devel-ubuntu20.04
+FROM nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04

 # Non-interactive apt-get commands
 ARG DEBIAN_FRONTEND=noninteractive
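
The base image jumps from CUDA 12.4.1 on Ubuntu 20.04 to 12.8.1 with cuDNN on Ubuntu 24.04, and the PyTorch wheel index moves to cu128 in the third Dockerfile hunk below. A minimal sketch (not part of the diff, assuming PyTorch is installed in the image) to confirm the two stay in sync:

```python
# Hedged sketch: confirm the installed PyTorch build targets the same CUDA
# major/minor as the base image (12.8). Assumes a CUDA-enabled PyTorch install.
import torch

print("PyTorch:", torch.__version__)
print("Built against CUDA:", torch.version.cuda)  # expect "12.8" for cu128 wheels
```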

@@ -6,8 +6,8 @@ ARG DEBIAN_FRONTEND=noninteractive
 # No GPUs visible during build
 ARG CUDA_VISIBLE_DEVICES=none

-# Specify CUDA architectures -> 7.5: RTX 6000 & T4, 8.0: A100, 8.6+PTX
-ARG TORCH_CUDA_ARCH_LIST="7.5;8.0;8.6+PTX"
+# Specify CUDA architectures -> 7.5: Quadro RTX 6000 & T4, 8.0: A100, 8.6: A40, 8.9: L40S, 9.0: H100
+ARG TORCH_CUDA_ARCH_LIST="7.5;8.0;8.6;8.9;9.0+PTX"

 # Set the Python version
 ARG PYTHON_VERSION=3.10.12
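
The expanded `TORCH_CUDA_ARCH_LIST` pins the compute capabilities the image's CUDA kernels are compiled for. As a quick sanity check (a sketch, not part of this diff), a short PyTorch snippet can report what the visible GPUs actually support, so a mismatch with the arch list is easy to spot:

```python
# Hedged sketch: report each visible GPU's compute capability so it can be
# checked against the TORCH_CUDA_ARCH_LIST entries (7.5, 8.0, 8.6, 8.9, 9.0).
# Assumes a CUDA-enabled PyTorch build; during the image build itself,
# CUDA_VISIBLE_DEVICES=none hides all GPUs, so this is a runtime check.
import torch

if torch.cuda.is_available():
    for idx in range(torch.cuda.device_count()):
        major, minor = torch.cuda.get_device_capability(idx)
        name = torch.cuda.get_device_name(idx)
        print(f"GPU {idx}: {name} -> compute capability {major}.{minor}")
else:
    print("No CUDA devices visible")
```

An A100 reports 8.0 and an H100 reports 9.0, matching the entries the updated comment attributes to them.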

@@ -35,20 +35,29 @@ RUN wget https://bootstrap.pypa.io/get-pip.py && \
     rm get-pip.py && \
     python3.10 -m pip install --upgrade pip setuptools wheel uv

+# Install Infiniband/RDMA support
+RUN apt-get update && apt-get install -y \
+    libibverbs1 libibverbs-dev ibverbs-utils \
+    librdmacm1 librdmacm-dev rdmacm-utils \
+    && rm -rf /var/lib/apt/lists/*
+
+# Set up RDMA environment (these will persist in the final container)
+ENV LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
+ENV UCX_NET_DEVICES=all
+ENV NCCL_IB_DISABLE=0
+
 # Set up project
 WORKDIR /vec-inf
 COPY . /vec-inf

 # Install project dependencies with build requirements
-RUN PIP_INDEX_URL="https://download.pytorch.org/whl/cu121" uv pip install --system -e .[dev]
-# Install FlashAttention
-RUN python3.10 -m pip install flash-attn --no-build-isolation
-# Install FlashInfer
-RUN python3.10 -m pip install flashinfer-python -i https://flashinfer.ai/whl/cu124/torch2.6/
+RUN PIP_INDEX_URL="https://download.pytorch.org/whl/cu128" uv pip install --system -e .[dev]

 # Final configuration
 RUN mkdir -p /vec-inf/nccl && \
     mv /root/.config/vllm/nccl/cu12/libnccl.so.2.18.1 /vec-inf/nccl/libnccl.so.2.18.1
+ENV VLLM_NCCL_SO_PATH=/vec-inf/nccl/libnccl.so.2.18.1
+ENV NCCL_DEBUG=INFO

 # Set the default command to start an interactive shell
 CMD ["bash"]
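
The new `ENV` lines bake the RDMA/NCCL settings into the final image. A minimal runtime check (again a sketch, not part of the diff) can confirm the variables survived into the container and that PyTorch sees an NCCL backend; it assumes only that PyTorch is installed, as the Dockerfile arranges:

```python
# Hedged sketch: verify the RDMA/NCCL environment baked into the image above.
import os

import torch
import torch.distributed as dist

print("NCCL backend available:", dist.is_nccl_available())
if torch.cuda.is_available():
    print("NCCL version:", torch.cuda.nccl.version())

# Set via ENV in the Dockerfile; should survive into any container started
# from the image.
for var in ("VLLM_NCCL_SO_PATH", "NCCL_IB_DISABLE", "NCCL_DEBUG", "UCX_NET_DEVICES"):
    print(f"{var}={os.environ.get(var, '<unset>')}")
```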

vec_inf-0.7.0/MODEL_TRACKING.md
@@ -0,0 +1,324 @@
+# Model Weights Tracking
+
+This document tracks all model weights available in the `/model-weights` directory on the Killarney cluster and indicates which ones have existing configurations in the cached model config (`/model-weights/vec-inf-shared/models.yaml`). By default, `vec-inf` uses the cached model config. To request new model weights to be downloaded or a model configuration to be added, please open a "Model request" issue.
+
+**NOTE**: The [`models.yaml`](./vec_inf/config/models.yaml) file in the package is not always up to date with the latest cached model config on the Killarney cluster: new model configs land in the cached config first, and `models.yaml` is updated to match it when a new version of the package is released.
+
+## Legend
+- ✅ **Configured**: Model has a complete configuration in `models.yaml`
+- ❌ **Not Configured**: Model exists in `/model-weights` but lacks configuration
+
+---
+
+## Text Generation Models (LLM)
+
+### Cohere for AI: Command R
+| Model | Configuration |
+|:------|:-------------|
+| `c4ai-command-r-plus-08-2024` | ✅ |
+| `c4ai-command-r-08-2024` | ✅ |
+
+### Code Llama
+| Model | Configuration |
+|:------|:-------------|
+| `CodeLlama-7b-hf` | ✅ |
+| `CodeLlama-7b-Instruct-hf` | ✅ |
+| `CodeLlama-13b-hf` | ✅ |
+| `CodeLlama-13b-Instruct-hf` | ✅ |
+| `CodeLlama-34b-hf` | ✅ |
+| `CodeLlama-34b-Instruct-hf` | ✅ |
+| `CodeLlama-70b-hf` | ✅ |
+| `CodeLlama-70b-Instruct-hf` | ✅ |
+| `CodeLlama-7b-Python-hf` | ❌ |
+| `CodeLlama-13b-Python-hf` | ❌ |
+| `CodeLlama-70b-Python-hf` | ❌ |
+
+### Google: Gemma
+| Model | Configuration |
+|:------|:-------------|
+| `gemma-2b` | ❌ |
+| `gemma-2b-it` | ❌ |
+| `gemma-7b` | ❌ |
+| `gemma-7b-it` | ❌ |
+| `gemma-2-9b` | ✅ |
+| `gemma-2-9b-it` | ✅ |
+| `gemma-2-27b` | ✅ |
+| `gemma-2-27b-it` | ✅ |
+| `gemma-3-1b-it` | ❌ |
+| `gemma-3-4b-it` | ❌ |
+| `gemma-3-12b-it` | ❌ |
+| `gemma-3-27b-it` | ❌ |
+
+### Meta: Llama 2
+| Model | Configuration |
+|:------|:-------------|
+| `Llama-2-7b-hf` | ✅ |
+| `Llama-2-7b-chat-hf` | ✅ |
+| `Llama-2-13b-hf` | ✅ |
+| `Llama-2-13b-chat-hf` | ✅ |
+| `Llama-2-70b-hf` | ✅ |
+| `Llama-2-70b-chat-hf` | ✅ |
+
+### Meta: Llama 3
+| Model | Configuration |
+|:------|:-------------|
+| `Meta-Llama-3-8B` | ✅ |
+| `Meta-Llama-3-8B-Instruct` | ✅ |
+| `Meta-Llama-3-70B` | ✅ |
+| `Meta-Llama-3-70B-Instruct` | ✅ |
+
+### Meta: Llama 3.1
+| Model | Configuration |
+|:------|:-------------|
+| `Meta-Llama-3.1-8B` | ✅ |
+| `Meta-Llama-3.1-8B-Instruct` | ✅ |
+| `Meta-Llama-3.1-70B` | ✅ |
+| `Meta-Llama-3.1-70B-Instruct` | ✅ |
+| `Meta-Llama-3.1-405B-Instruct` | ✅ |
+
+### Meta: Llama 3.2
+| Model | Configuration |
+|:------|:-------------|
+| `Llama-3.2-1B` | ✅ |
+| `Llama-3.2-1B-Instruct` | ✅ |
+| `Llama-3.2-3B` | ✅ |
+| `Llama-3.2-3B-Instruct` | ✅ |
+
+### Meta: Llama 3.3
+| Model | Configuration |
+|:------|:-------------|
+| `Llama-3.3-70B-Instruct` | ✅ |
+
+### Meta: Llama 4
+| Model | Configuration |
+|:------|:-------------|
+| `Llama-4-Scout-17B-16E-Instruct` | ❌ |
+
+### Mistral AI: Mistral
+| Model | Configuration |
+|:------|:-------------|
+| `Mistral-7B-v0.3` | ✅ |
+| `Mistral-7B-Instruct-v0.1` | ✅ |
+| `Mistral-7B-Instruct-v0.2` | ✅ |
+| `Mistral-7B-Instruct-v0.3` | ✅ |
+| `Mistral-Large-Instruct-2407` | ✅ |
+| `Mistral-Large-Instruct-2411` | ✅ |
+
+### Mistral AI: Mixtral
+| Model | Configuration |
+|:------|:-------------|
+| `Mixtral-8x7B-Instruct-v0.1` | ✅ |
+| `Mixtral-8x22B-v0.1` | ✅ |
+| `Mixtral-8x22B-Instruct-v0.1` | ✅ |
+
+### Microsoft: Phi
+| Model | Configuration |
+|:------|:-------------|
+| `Phi-3-medium-128k-instruct` | ✅ |
+| `phi-4` | ❌ |
+
+### Nvidia: Llama-3.1-Nemotron
+| Model | Configuration |
+|:------|:-------------|
+| `Llama-3.1-Nemotron-70B-Instruct-HF` | ✅ |
+
+### Qwen: Qwen2.5
+| Model | Configuration |
+|:------|:-------------|
+| `Qwen2.5-0.5B-Instruct` | ✅ |
+| `Qwen2.5-1.5B-Instruct` | ✅ |
+| `Qwen2.5-3B-Instruct` | ✅ |
+| `Qwen2.5-7B-Instruct` | ✅ |
+| `Qwen2.5-14B-Instruct` | ✅ |
+| `Qwen2.5-32B-Instruct` | ✅ |
+| `Qwen2.5-72B-Instruct` | ✅ |
+
+### Qwen: Qwen2.5-Math
+| Model | Configuration |
+|:------|:-------------|
+| `Qwen2.5-Math-1.5B-Instruct` | ✅ |
+| `Qwen2.5-Math-7B-Instruct` | ✅ |
+| `Qwen2.5-Math-72B-Instruct` | ✅ |
+
+### Qwen: Qwen2.5-Coder
+| Model | Configuration |
+|:------|:-------------|
+| `Qwen2.5-Coder-7B-Instruct` | ✅ |
+
+### Qwen: QwQ
+| Model | Configuration |
+|:------|:-------------|
+| `QwQ-32B` | ✅ |
+
+### Qwen: Qwen2
+| Model | Configuration |
+|:------|:-------------|
+| `Qwen2-1.5B-Instruct` | ❌ |
+| `Qwen2-7B-Instruct` | ❌ |
+| `Qwen2-Math-1.5B-Instruct` | ❌ |
+| `Qwen2-Math-7B-Instruct` | ❌ |
+| `Qwen2-Math-72B` | ❌ |
+| `Qwen2-Math-72B-Instruct` | ❌ |
+| `Qwen2-VL-7B-Instruct` | ❌ |
+
+### Qwen: Qwen3
+| Model | Configuration |
+|:------|:-------------|
+| `Qwen3-14B` | ✅ |
+| `Qwen3-8B` | ❌ |
+| `Qwen3-32B` | ❌ |
+| `Qwen3-235B-A22B` | ❌ |
+| `Qwen3-Embedding-8B` | ❌ |
+
+### DeepSeek: DeepSeek-R1
+| Model | Configuration |
+|:------|:-------------|
+| `DeepSeek-R1-Distill-Llama-8B` | ✅ |
+| `DeepSeek-R1-Distill-Llama-70B` | ✅ |
+| `DeepSeek-R1-Distill-Qwen-1.5B` | ✅ |
+| `DeepSeek-R1-Distill-Qwen-7B` | ✅ |
+| `DeepSeek-R1-Distill-Qwen-14B` | ✅ |
+| `DeepSeek-R1-Distill-Qwen-32B` | ✅ |
+
+### DeepSeek: Other Models
+| Model | Configuration |
+|:------|:-------------|
+| `DeepSeek-Coder-V2-Lite-Instruct` | ❌ |
+| `deepseek-math-7b-instruct` | ❌ |
+
+### Other LLM Models
+| Model | Configuration |
+|:------|:-------------|
+| `AI21-Jamba-1.5-Mini` | ❌ |
+| `aya-expanse-32b` | ✅ (as Aya-Expanse-32B) |
+| `gpt2-large` | ❌ |
+| `gpt2-xl` | ❌ |
+| `gpt-oss-120b` | ❌ |
+| `instructblip-vicuna-7b` | ❌ |
+| `internlm2-math-plus-7b` | ❌ |
+| `Janus-Pro-7B` | ❌ |
+| `Kimi-K2-Instruct` | ❌ |
+| `Ministral-8B-Instruct-2410` | ❌ |
+| `Molmo-7B-D-0924` | ✅ |
+| `OLMo-1B-hf` | ❌ |
+| `OLMo-7B-hf` | ❌ |
+| `OLMo-7B-SFT` | ❌ |
+| `pythia` | ❌ |
+| `Qwen1.5-72B-Chat` | ❌ |
+| `ReasonFlux-PRM-7B` | ❌ |
+| `t5-large-lm-adapt` | ❌ |
+| `t5-xl-lm-adapt` | ❌ |
+| `mt5-xl-lm-adapt` | ❌ |
+
+---
+
+## Vision Language Models (VLM)
+
+### LLaVa
+| Model | Configuration |
+|:------|:-------------|
+| `llava-1.5-7b-hf` | ✅ |
+| `llava-1.5-13b-hf` | ✅ |
+| `llava-v1.6-mistral-7b-hf` | ✅ |
+| `llava-v1.6-34b-hf` | ✅ |
+| `llava-med-v1.5-mistral-7b` | ❌ |
+
+### Microsoft: Phi 3 Vision
+| Model | Configuration |
+|:------|:-------------|
+| `Phi-3-vision-128k-instruct` | ✅ |
+| `Phi-3.5-vision-instruct` | ✅ |
+
+### Meta: Llama 3.2 Vision
+| Model | Configuration |
+|:------|:-------------|
+| `Llama-3.2-11B-Vision` | ✅ |
+| `Llama-3.2-11B-Vision-Instruct` | ✅ |
+| `Llama-3.2-90B-Vision` | ✅ |
+| `Llama-3.2-90B-Vision-Instruct` | ✅ |
+
+### Mistral: Pixtral
+| Model | Configuration |
+|:------|:-------------|
+| `Pixtral-12B-2409` | ✅ |
+
+### OpenGVLab: InternVL2.5
+| Model | Configuration |
+|:------|:-------------|
+| `InternVL2_5-8B` | ✅ |
+| `InternVL2_5-26B` | ✅ |
+| `InternVL2_5-38B` | ✅ |
+
+### THUDM: GLM-4
+| Model | Configuration |
+|:------|:-------------|
+| `glm-4v-9b` | ✅ |
+
+### DeepSeek: DeepSeek-VL2
+| Model | Configuration |
+|:------|:-------------|
+| `deepseek-vl2` | ✅ |
+| `deepseek-vl2-small` | ✅ |
+
+### Other VLM Models
+| Model | Configuration |
+|:------|:-------------|
+| `MiniCPM-Llama3-V-2_5` | ❌ |
+
+---
+
+## Text Embedding Models
+
+### Liang Wang: e5
+| Model | Configuration |
+|:------|:-------------|
+| `e5-mistral-7b-instruct` | ✅ |
+
+### BAAI: bge
+| Model | Configuration |
+|:------|:-------------|
+| `bge-base-en-v1.5` | ✅ |
+| `bge-m3` | ❌ |
+| `bge-multilingual-gemma2` | ❌ |
+
+### Sentence Transformers: MiniLM
+| Model | Configuration |
+|:------|:-------------|
+| `all-MiniLM-L6-v2` | ✅ |
+
+### Other Embedding Models
+| Model | Configuration |
+|:------|:-------------|
+| `data2vec` | ❌ |
+| `gte-modernbert-base` | ❌ |
+| `gte-Qwen2-7B-instruct` | ❌ |
+| `m2-bert-80M-32k-retrieval` | ❌ |
+| `m2-bert-80M-8k-retrieval` | ❌ |
+
+---
+
+## Reward Modeling Models
+
+### Qwen: Qwen2.5-Math
+| Model | Configuration |
+|:------|:-------------|
+| `Qwen2.5-Math-RM-72B` | ✅ |
+| `Qwen2.5-Math-PRM-7B` | ✅ |
+
+---
+
+## Multimodal Models
+
+### CLIP
+| Model | Configuration |
+|:------|:-------------|
+| `clip-vit-base-patch16` | ❌ |
+| `clip-vit-large-patch14-336` | ❌ |
+
+### Stable Diffusion
+| Model | Configuration |
+|:------|:-------------|
+| `sd-v1-4-full-ema` | ❌ |
+| `stable-diffusion-v1-4` | ❌ |
+
+---
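
Since the ✅/❌ status in the tables above simply reflects whether a model name has an entry in the cached `models.yaml`, the same check can be scripted. The sketch below is illustrative only: it assumes the cached config at the path quoted in the file's introduction is readable and uses a top-level `models` mapping keyed by model name, which may differ between releases.

```python
# Hedged sketch: report whether model names have entries in the cached model
# config. Requires PyYAML; the path and the top-level "models" mapping are
# assumptions based on the note at the top of MODEL_TRACKING.md.
from pathlib import Path

import yaml

CACHED_CONFIG = Path("/model-weights/vec-inf-shared/models.yaml")

configured: set = set()
if CACHED_CONFIG.exists():
    data = yaml.safe_load(CACHED_CONFIG.read_text())
    configured = set((data or {}).get("models", {}) or {})

for name in ["Meta-Llama-3.1-8B-Instruct", "gemma-3-27b-it", "phi-4"]:
    status = "configured" if name in configured else "not configured"
    print(f"{name}: {status}")
```

Given the tables above, the expected output marks the Llama model as configured and the other two as not configured.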