vec-inf 0.4.0.post1 (tar.gz) → 0.4.1 (tar.gz)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (60)
  1. vec_inf-0.4.1/.github/ISSUE_TEMPLATE/bug_report.md +26 -0
  2. vec_inf-0.4.1/.github/ISSUE_TEMPLATE/config.yml +1 -0
  3. vec_inf-0.4.1/.github/ISSUE_TEMPLATE/feature_request.md +20 -0
  4. vec_inf-0.4.1/.github/dependabot.yml +11 -0
  5. vec_inf-0.4.1/.github/pull_request_template.md +8 -0
  6. vec_inf-0.4.1/.github/workflows/code_checks.yml +51 -0
  7. vec_inf-0.4.1/.github/workflows/docs_build.yml +44 -0
  8. vec_inf-0.4.1/.github/workflows/docs_deploy.yml +59 -0
  9. vec_inf-0.4.1/.github/workflows/publish.yml +27 -0
  10. vec_inf-0.4.1/.github/workflows/unit_tests.yml +71 -0
  11. vec_inf-0.4.1/.gitignore +154 -0
  12. vec_inf-0.4.1/.pre-commit-config.yaml +61 -0
  13. vec_inf-0.4.1/.python-version +1 -0
  14. vec_inf-0.4.1/Dockerfile +79 -0
  15. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/PKG-INFO +24 -23
  16. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/README.md +9 -1
  17. vec_inf-0.4.1/codecov.yml +19 -0
  18. vec_inf-0.4.1/docs/Makefile +24 -0
  19. vec_inf-0.4.1/docs/make.bat +35 -0
  20. vec_inf-0.4.1/docs/source/_static/custom.js +6 -0
  21. vec_inf-0.4.1/docs/source/_static/logos/vector_logo.png +0 -0
  22. vec_inf-0.4.1/docs/source/_static/require.min.js +1 -0
  23. vec_inf-0.4.1/docs/source/_templates/base.html +120 -0
  24. vec_inf-0.4.1/docs/source/_templates/custom-class-template.rst +34 -0
  25. vec_inf-0.4.1/docs/source/_templates/custom-module-template.rst +66 -0
  26. vec_inf-0.4.1/docs/source/_templates/page.html +219 -0
  27. vec_inf-0.4.1/docs/source/conf.py +113 -0
  28. vec_inf-0.4.1/docs/source/index.md +24 -0
  29. vec_inf-0.4.1/docs/source/user_guide.md +123 -0
  30. vec_inf-0.4.1/examples/README.md +9 -0
  31. vec_inf-0.4.1/examples/inference/llm/chat_completions.py +21 -0
  32. vec_inf-0.4.1/examples/inference/llm/completions.py +16 -0
  33. vec_inf-0.4.1/examples/inference/llm/completions.sh +13 -0
  34. vec_inf-0.4.1/examples/inference/text_embedding/embeddings.py +22 -0
  35. vec_inf-0.4.1/examples/inference/vlm/vision_completions.py +29 -0
  36. vec_inf-0.4.1/examples/logits/logits.py +16 -0
  37. vec_inf-0.4.1/profile/avg_throughput.py +57 -0
  38. vec_inf-0.4.1/profile/gen.py +98 -0
  39. vec_inf-0.4.1/pyproject.toml +146 -0
  40. vec_inf-0.4.1/tests/__init__.py +1 -0
  41. vec_inf-0.4.1/tests/vec_inf/__init__.py +1 -0
  42. vec_inf-0.4.1/tests/vec_inf/cli/__init__.py +1 -0
  43. vec_inf-0.4.1/tests/vec_inf/cli/test_utils.py +209 -0
  44. vec_inf-0.4.1/uv.lock +3336 -0
  45. vec_inf-0.4.1/vec_inf/__init__.py +1 -0
  46. vec_inf-0.4.1/vec_inf/cli/__init__.py +1 -0
  47. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/vec_inf/cli/_cli.py +134 -81
  48. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/vec_inf/cli/_utils.py +21 -37
  49. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/vec_inf/launch_server.sh +20 -1
  50. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/vec_inf/models/README.md +24 -0
  51. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/vec_inf/models/models.csv +12 -0
  52. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/vec_inf/multinode_vllm.slurm +3 -1
  53. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/vec_inf/vllm.slurm +3 -1
  54. vec_inf-0.4.1/venv.sh +29 -0
  55. vec_inf-0.4.0.post1/pyproject.toml +0 -39
  56. vec_inf-0.4.0.post1/vec_inf/__init__.py +0 -0
  57. vec_inf-0.4.0.post1/vec_inf/cli/__init__.py +0 -0
  58. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/LICENSE +0 -0
  59. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/vec_inf/README.md +0 -0
  60. {vec_inf-0.4.0.post1 → vec_inf-0.4.1}/vec_inf/find_port.sh +0 -0
@@ -0,0 +1,26 @@
+ ---
+ name: Bug report
+ about: Create a report to help us improve
+ title: ''
+ labels: ''
+ assignees: ''
+
+ ---
+
+ ### Describe the bug
+ A clear and concise description of what the bug is.
+
+ ### To Reproduce
+ Code snippet or clear steps to reproduce behaviour.
+
+ ### Expected behavior
+ A clear and concise description of what you expected to happen.
+
+ ### Screenshots
+ If applicable, add screenshots to help explain your problem.
+
+ ### Version
+ - Version info such as v0.1.5
+
+ ### Additional context
+ Add any other context about the problem here.
@@ -0,0 +1 @@
+ blank_issues_enabled: false
@@ -0,0 +1,20 @@
+ ---
+ name: Feature request
+ about: Suggest an idea for this project
+ title: ''
+ labels: ''
+ assignees: ''
+
+ ---
+
+ ### Is your feature request related to a problem? Please describe.
+ A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+
+ ### Describe the solution you'd like
+ A clear and concise description of what you want to happen.
+
+ ### Describe alternatives you've considered
+ A clear and concise description of any alternative solutions or features you've considered.
+
+ ### Additional context
+ Add any other context or screenshots about the feature request here.
@@ -0,0 +1,11 @@
+ # To get started with Dependabot version updates, you'll need to specify which
+ # package ecosystems to update and where the package manifests are located.
+ # Please see the documentation for all configuration options:
+ # https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
+
+ version: 2
+ updates:
+   - package-ecosystem: "github-actions" # See documentation for possible values
+     directory: "/" # Location of package manifests
+     schedule:
+       interval: "weekly"
@@ -0,0 +1,8 @@
+ # PR Type
+ [Feature | Fix | Documentation | Other() ]
+
+ # Short Description
+ ...
+
+ # Tests Added
+ ...
@@ -0,0 +1,51 @@
+ name: code checks
+
+ on:
+   push:
+     branches:
+       - main
+       - develop
+     paths:
+       - .pre-commit-config.yaml
+       - .github/workflows/code_checks.yml
+       - '**.py'
+       - uv.lock
+       - pyproject.toml
+       - '**.ipynb'
+   pull_request:
+     branches:
+       - main
+       - develop
+     paths:
+       - .pre-commit-config.yaml
+       - .github/workflows/code_checks.yml
+       - '**.py'
+       - uv.lock
+       - pyproject.toml
+       - '**.ipynb'
+
+ jobs:
+   run-code-check:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4.2.2
+       - name: Install uv
+         uses: astral-sh/setup-uv@v5.2.2
+         with:
+           # Install a specific version of uv.
+           version: "0.5.21"
+           enable-cache: true
+       - name: "Set up Python"
+         uses: actions/setup-python@v5.4.0
+         with:
+           python-version-file: ".python-version"
+       - name: Install the project
+         run: uv sync --dev
+       - name: Install dependencies and check code
+         run: |
+           source .venv/bin/activate
+           pre-commit run --all-files
+       - name: pip-audit (gh-action-pip-audit)
+         uses: pypa/gh-action-pip-audit@v1.0.8
+         with:
+           virtual-environment: .venv/
@@ -0,0 +1,44 @@
+ name: docs (build)
+ permissions:
+   contents: read
+   pull-requests: write
+
+ on:
+   pull_request:
+     branches:
+       - main
+       - develop
+     paths:
+       - .pre-commit-config.yaml
+       - .github/workflows/docs_build.yml
+       - '**.py'
+       - '**.ipynb'
+       - '**.js'
+       - '**.html'
+       - uv.lock
+       - pyproject.toml
+       - '**.rst'
+       - '**.md'
+
+ jobs:
+   build:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4.2.2
+
+       - name: Install uv
+         uses: astral-sh/setup-uv@4db96194c378173c656ce18a155ffc14a9fc4355
+         with:
+           version: "0.5.21"
+           enable-cache: true
+
+       - name: "Set up Python"
+         uses: actions/setup-python@42375524e23c412d93fb67b49958b491fce71c38
+         with:
+           python-version-file: ".python-version"
+
+       - name: Install the project
+         run: uv sync --all-extras --all-groups
+
+       - name: Build docs
+         run: cd docs && rm -rf source/reference/api/_autosummary && uv run make html
@@ -0,0 +1,59 @@
+ name: docs
+ permissions:
+   contents: read
+   pull-requests: write
+
+ on:
+   push:
+     branches:
+       - main
+     paths:
+       - .pre-commit-config.yaml
+       - .github/workflows/code_checks.yml
+       - .github/workflows/docs_build.yml
+       - .github/workflows/docs_deploy.yml
+       - .github/workflows/integration_tests.yml
+       - '**.py'
+       - '**.ipynb'
+       - '**.html'
+       - '**.js'
+       - uv.lock
+       - pyproject.toml
+       - '**.rst'
+       - '**.md'
+
+ jobs:
+   deploy:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4.2.2
+         with:
+           submodules: 'true'
+
+       - name: Install uv
+         uses: astral-sh/setup-uv@4db96194c378173c656ce18a155ffc14a9fc4355
+         with:
+           # Install a specific version of uv.
+           version: "0.5.21"
+           enable-cache: true
+
+       - name: "Set up Python"
+         uses: actions/setup-python@42375524e23c412d93fb67b49958b491fce71c38
+         with:
+           python-version-file: ".python-version"
+
+       - name: Install the project
+         run: uv sync --all-extras --all-groups
+
+       - name: Build docs
+         run: |
+           cd docs
+           rm -rf source/reference/api/_autosummary
+           uv run make html
+           touch build/html/.nojekyll
+
+       - name: Deploy to Github pages
+         uses: JamesIves/github-pages-deploy-action@15de0f09300eea763baee31dff6c6184995c5f6a
+         with:
+           branch: github_pages
+           folder: docs/build/html
@@ -0,0 +1,27 @@
+ name: publish package
+
+ on:
+   release:
+     types: [published]
+
+ jobs:
+   deploy:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Install apt dependencies
+         run: |
+           sudo apt-get update
+           sudo apt-get install libcurl4-openssl-dev libssl-dev
+       - uses: actions/checkout@v4.1.1
+       - name: Install poetry
+         run: python3 -m pip install --upgrade pip && python3 -m pip install poetry
+       - uses: actions/setup-python@v5.0.0
+         with:
+           python-version: '3.10'
+       - name: Build package
+         run: poetry build
+       - name: Publish package
+         uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
+         with:
+           user: __token__
+           password: ${{ secrets.PYPI_API_TOKEN }}
@@ -0,0 +1,71 @@
+ name: unit tests
+
+ on:
+   push:
+     branches:
+       - main
+       - develop
+     paths:
+       - .pre-commit-config.yaml
+       - .github/workflows/code_checks.yml
+       - .github/workflows/docs_build.yml
+       - .github/workflows/docs_deploy.yml
+       - .github/workflows/unit_tests.yml
+       - .github/workflows/integration_tests.yml
+       - '**.py'
+       - '**.ipynb'
+       - uv.lock
+       - pyproject.toml
+       - '**.rst'
+       - '**.md'
+   pull_request:
+     branches:
+       - main
+       - develop
+     paths:
+       - .pre-commit-config.yaml
+       - .github/workflows/code_checks.yml
+       - .github/workflows/docs_build.yml
+       - .github/workflows/docs_deploy.yml
+       - .github/workflows/unit_tests.yml
+       - .github/workflows/integration_tests.yml
+       - '**.py'
+       - '**.ipynb'
+       - uv.lock
+       - pyproject.toml
+       - '**.rst'
+       - '**.md'
+
+ jobs:
+   unit-tests:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4.2.2
+
+       - name: Install uv
+         uses: astral-sh/setup-uv@v5.2.2
+         with:
+           # Install a specific version of uv.
+           version: "0.5.21"
+           enable-cache: true
+
+       - name: "Set up Python"
+         uses: actions/setup-python@v5.4.0
+         with:
+           python-version-file: ".python-version"
+
+       - name: Install the project
+         run: uv sync --all-extras --dev
+
+       - name: Install dependencies and check code
+         run: |
+           uv run pytest -m "not integration_test" --cov vec_inf --cov-report=xml tests
+
+       # Uncomment this once this repo is configured on Codecov
+       - name: Upload coverage to Codecov
+         uses: codecov/codecov-action@v5.3.1
+         with:
+           token: ${{ secrets.CODECOV_TOKEN }}
+           slug: VectorInstitute/vec-inf
+           fail_ci_if_error: true
+           verbose: true
@@ -0,0 +1,154 @@
+ # Byte-compiled / optimized / DLL files
+ __pycache__/
+ *.py[cod]
+ *$py.class
+
+ # C extensions
+ *.so
+
+ # Distribution / packaging
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ pip-wheel-metadata/
+ share/python-wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ MANIFEST
+
+ # PyInstaller
+ # Usually these files are written by a python script from a template
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
+ *.manifest
+ *.spec
+
+ # Installer logs
+ pip-log.txt
+ pip-delete-this-directory.txt
+
+ # Unit test / coverage reports
+ htmlcov/
+ .tox/
+ .nox/
+ .coverage
+ .coverage.*
+ .cache
+ nosetests.xml
+ coverage.xml
+ *.cover
+ *.py,cover
+ .hypothesis/
+ .pytest_cache/
+
+ # Translations
+ *.mo
+ *.pot
+
+ # Django stuff:
+ *.log
+ local_settings.py
+ db.sqlite3
+ db.sqlite3-journal
+
+ # Flask stuff:
+ instance/
+ .webassets-cache
+
+ # Scrapy stuff:
+ .scrapy
+
+ # Sphinx documentation
+ docs/_build/
+
+ # PyBuilder
+ target/
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # IPython
+ profile_default/
+ ipython_config.py
+
+ # pipenv
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
+ # install all needed dependencies.
+ #Pipfile.lock
+
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow
+ __pypackages__/
+
+ # Celery stuff
+ celerybeat-schedule
+ celerybeat.pid
+
+ # SageMath parsed files
+ *.sage.py
+
+ # Environments
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ env.bak/
+ venv.bak/
+
+ # Spyder project settings
+ .spyderproject
+ .spyproject
+
+ # Rope project settings
+ .ropeproject
+
+ # mkdocs documentation
+ /site
+
+ # mypy
+ .mypy_cache/
+ .dmypy.json
+ dmypy.json
+
+ # Pyre type checker
+ .pyre/
+
+ # pycharm
+ .idea/
+
+ # VS Code
+ .vscode/
+
+ # MacOS
+ .DS_Store
+
+ # Slurm logs
+ *.out
+ *.err
+
+ # Server url files
+ *_url
+
+ logs/
+
+ local/
+ slurm/
+ scripts/
+
+ # vLLM bug reporting files
+ collect_env.py
+
+ # build files
+ dist/
@@ -0,0 +1,61 @@
+ repos:
+   - repo: https://github.com/pre-commit/pre-commit-hooks
+     rev: v5.0.0 # Use the ref you want to point at
+     hooks:
+       - id: trailing-whitespace
+       - id: check-ast
+       - id: check-builtin-literals
+       - id: check-docstring-first
+       - id: check-executables-have-shebangs
+       - id: debug-statements
+       - id: end-of-file-fixer
+       - id: mixed-line-ending
+         args: [--fix=lf]
+       - id: requirements-txt-fixer
+       - id: check-yaml
+       - id: check-toml
+
+   - repo: https://github.com/astral-sh/ruff-pre-commit
+     rev: 'v0.9.6'
+     hooks:
+       - id: ruff
+         args: [--fix, --exit-non-zero-on-fix]
+         types_or: [python, jupyter]
+       - id: ruff-format
+         types_or: [python, jupyter]
+
+   - repo: https://github.com/pre-commit/mirrors-mypy
+     rev: v1.15.0
+     hooks:
+       - id: mypy
+         entry: python3 -m mypy --config-file pyproject.toml
+         language: system
+         types: [python]
+         exclude: "tests"
+
+   - repo: https://github.com/nbQA-dev/nbQA
+     rev: 1.9.1
+     hooks:
+       - id: nbqa-ruff
+         args: [--fix, --exit-non-zero-on-fix]
+
+   - repo: local
+     hooks:
+       - id: pytest
+         name: pytest
+         entry: python3 -m pytest -m "not integration_test"
+         language: system
+         pass_filenames: false
+         always_run: true
+
+ ci:
+   autofix_commit_msg: |
+     [pre-commit.ci] Add auto fixes from pre-commit.com hooks
+
+     for more information, see https://pre-commit.ci
+   autofix_prs: true
+   autoupdate_branch: ''
+   autoupdate_commit_msg: '[pre-commit.ci] pre-commit autoupdate'
+   autoupdate_schedule: weekly
+   skip: [pytest,mypy]
+   submodules: false
@@ -0,0 +1 @@
+ 3.10
@@ -0,0 +1,79 @@
+ FROM nvidia/cuda:12.3.1-devel-ubuntu20.04
+
+ # Non-interactive apt-get commands
+ ARG DEBIAN_FRONTEND=noninteractive
+
+ # No GPUs visible during build
+ ARG CUDA_VISIBLE_DEVICES=none
+
+ # Specify CUDA architectures -> 7.5: RTX 6000 & T4, 8.0: A100, 8.6+PTX
+ ARG TORCH_CUDA_ARCH_LIST="7.5;8.0;8.6+PTX"
+
+ # Set the Python version
+ ARG PYTHON_VERSION=3.10.12
+
+ # Install dependencies for building Python
+ RUN apt-get update && apt-get install -y \
+ wget \
+ build-essential \
+ libssl-dev \
+ zlib1g-dev \
+ libbz2-dev \
+ libreadline-dev \
+ libsqlite3-dev \
+ libffi-dev \
+ libncursesw5-dev \
+ xz-utils \
+ tk-dev \
+ libxml2-dev \
+ libxmlsec1-dev \
+ liblzma-dev \
+ git \
+ vim \
+ && rm -rf /var/lib/apt/lists/*
+
+ # Download and install Python from precompiled binaries
+ RUN wget https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz && \
+ tar -xzf Python-$PYTHON_VERSION.tgz && \
+ cd Python-$PYTHON_VERSION && \
+ ./configure --enable-optimizations && \
+ make -j$(nproc) && \
+ make altinstall && \
+ cd .. && \
+ rm -rf Python-$PYTHON_VERSION.tgz Python-$PYTHON_VERSION
+
+ # Download and install pip using get-pip.py
+ RUN wget https://bootstrap.pypa.io/get-pip.py && \
+ python3.10 get-pip.py && \
+ rm get-pip.py
+
+ # Ensure pip for Python 3.10 is used
+ RUN python3.10 -m pip install --upgrade pip setuptools wheel
+
+ # Install Poetry using Python 3.10
+ RUN python3.10 -m pip install poetry
+
+ # Don't create venv
+ RUN poetry config virtualenvs.create false
+
+ # Set working directory
+ WORKDIR /vec-inf
+
+ # Copy current directory
+ COPY . /vec-inf
+
+ # Update Poetry lock file if necessary
+ RUN poetry lock
+
+ # Install vec-inf
+ RUN poetry install --extras "dev"
+
+ # Install Flash Attention 2 backend
+ RUN python3.10 -m pip install flash-attn --no-build-isolation
+
+ # Move nccl to accessible location
+ RUN mkdir -p /vec-inf/nccl
+ RUN mv /root/.config/vllm/nccl/cu12/libnccl.so.2.18.1 /vec-inf/nccl/libnccl.so.2.18.1;
+
+ # Set the default command to start an interactive shell
+ CMD ["bash"]
@@ -1,30 +1,32 @@
- Metadata-Version: 2.1
+ Metadata-Version: 2.4
  Name: vec-inf
- Version: 0.4.0.post1
+ Version: 0.4.1
  Summary: Efficient LLM inference on Slurm clusters using vLLM.
- License: MIT
- Author: Marshall Wang
- Author-email: marshall.wang@vectorinstitute.ai
- Requires-Python: >=3.10,<4.0
- Classifier: License :: OSI Approved :: MIT License
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.10
- Classifier: Programming Language :: Python :: 3.11
- Classifier: Programming Language :: Python :: 3.12
- Classifier: Programming Language :: Python :: 3.13
+ Author-email: Marshall Wang <marshall.wang@vectorinstitute.ai>
+ License-Expression: MIT
+ License-File: LICENSE
+ Requires-Python: <3.11,>=3.10
+ Requires-Dist: click>=8.1.0
+ Requires-Dist: numpy>=1.24.0
+ Requires-Dist: polars>=1.15.0
+ Requires-Dist: requests>=2.31.0
+ Requires-Dist: rich>=13.7.0
  Provides-Extra: dev
- Requires-Dist: click (>=8.1.0,<9.0.0)
- Requires-Dist: cupy-cuda12x (==12.1.0) ; extra == "dev"
- Requires-Dist: numpy (>=1.24.0,<2.0.0)
- Requires-Dist: polars (>=1.15.0,<2.0.0)
- Requires-Dist: ray (>=2.9.3,<3.0.0) ; extra == "dev"
- Requires-Dist: requests (>=2.31.0,<3.0.0)
- Requires-Dist: rich (>=13.7.0,<14.0.0)
- Requires-Dist: vllm (>=0.6.0,<0.7.0) ; extra == "dev"
- Requires-Dist: vllm-nccl-cu12 (>=2.18,<2.19) ; extra == "dev"
+ Requires-Dist: cupy-cuda12x==12.1.0; extra == 'dev'
+ Requires-Dist: ray>=2.40.0; extra == 'dev'
+ Requires-Dist: vllm-nccl-cu12<2.19,>=2.18; extra == 'dev'
+ Requires-Dist: vllm>=0.7.2; extra == 'dev'
  Description-Content-Type: text/markdown
 
  # Vector Inference: Easy inference on Slurm clusters
+
+ ----------------------------------------------------
+
+ [![code checks](https://github.com/VectorInstitute/vector-inference/actions/workflows/code_checks.yml/badge.svg)](https://github.com/VectorInstitute/vector-inference/actions/workflows/code_checks.yml)
+ [![docs](https://github.com/VectorInstitute/vector-inference/actions/workflows/docs_build.yml/badge.svg)](https://github.com/VectorInstitute/vector-inference/actions/workflows/docs_build.yml)
+ [![codecov](https://codecov.io/github/VectorInstitute/vector-inference/graph/badge.svg?token=83MYFZ3UPA)](https://codecov.io/github/VectorInstitute/vector-inference)
+ ![GitHub License](https://img.shields.io/github/license/VectorInstitute/vector-inference)
+
  This repository provides an easy-to-use solution to run inference servers on [Slurm](https://slurm.schedmd.com/overview.html)-managed computing clusters using [vLLM](https://docs.vllm.ai/en/latest/). **All scripts in this repository runs natively on the Vector Institute cluster environment**. To adapt to other environments, update [`launch_server.sh`](vec_inf/launch_server.sh), [`vllm.slurm`](vec_inf/vllm.slurm), [`multinode_vllm.slurm`](vec_inf/multinode_vllm.slurm) and [`models.csv`](vec_inf/models/models.csv) accordingly.
 
  ## Installation
@@ -42,7 +44,7 @@ vec-inf launch Meta-Llama-3.1-8B-Instruct
  ```
  You should see an output like the following:
 
- <img width="700" alt="launch_img" src="https://github.com/user-attachments/assets/ab658552-18b2-47e0-bf70-e539c3b898d5">
+ <img width="600" alt="launch_img" src="https://github.com/user-attachments/assets/ab658552-18b2-47e0-bf70-e539c3b898d5">
 
  The model would be launched using the [default parameters](vec_inf/models/models.csv), you can override these values by providing additional parameters, use `--help` to see the full list. You can also launch your own customized model as long as the model architecture is [supported by vLLM](https://docs.vllm.ai/en/stable/models/supported_models.html), and make sure to follow the instructions below:
  * Your model weights directory naming convention should follow `$MODEL_FAMILY-$MODEL_VARIANT`.
@@ -117,4 +119,3 @@ If you want to run inference from your local device, you can open a SSH tunnel t
  ssh -L 8081:172.17.8.29:8081 username@v.vectorinstitute.ai -N
  ```
  Where the last number in the URL is the GPU number (gpu029 in this case). The example provided above is for the vector cluster, change the variables accordingly for your environment
-
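
The SSH tunnel step in the README text above can be sketched in Python. This is a minimal illustration, not part of the package: `tunnel_command` and `gpu_node_name` are hypothetical helpers, the `http://<ip>:<port>/v1` URL shape is an assumption, and the zero-padded `gpuNNN` node naming is inferred only from the single `gpu029` example.

```python
from urllib.parse import urlsplit


def _host_and_port(server_url: str) -> tuple[str, int]:
    """Split a server URL into (host, port), rejecting URLs that lack either."""
    parts = urlsplit(server_url)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"expected a URL with host and port, got {server_url!r}")
    return parts.hostname, parts.port


def tunnel_command(server_url: str, user: str, login_host: str) -> str:
    """Build the `ssh -L` command that forwards the server port to localhost."""
    host, port = _host_and_port(server_url)
    # Same local and remote port, mirroring the README example.
    return f"ssh -L {port}:{host}:{port} {user}@{login_host} -N"


def gpu_node_name(server_url: str) -> str:
    """Map the last octet of the server IP to a node name (assumed gpuNNN scheme)."""
    host, _ = _host_and_port(server_url)
    last_octet = int(host.rsplit(".", 1)[-1])
    return f"gpu{last_octet:03d}"


if __name__ == "__main__":
    url = "http://172.17.8.29:8081/v1"  # assumed URL shape for a launched server
    print(tunnel_command(url, "username", "v.vectorinstitute.ai"))
    print(gpu_node_name(url))
```

On a cluster other than Vector's, the login host and the IP-to-node mapping would need to be replaced with that environment's values.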