compressed-tensors 0.10.2a20250616__tar.gz → 0.10.2a20250620__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.github/workflows/report.yml +1 -1
- {compressed_tensors-0.10.2a20250616/src/compressed_tensors.egg-info → compressed_tensors-0.10.2a20250620}/PKG-INFO +1 -1
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/utils/offload.py +42 -11
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/version.py +1 -1
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620/src/compressed_tensors.egg-info}/PKG-INFO +1 -1
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_utils/test_offload.py +19 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.github/.gitkeep +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.github/actions/test/action.yml +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.github/scripts/step-status +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.github/workflows/build-test.yml +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.github/workflows/build.yml +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.github/workflows/test-check.yaml +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.github/workflows/test.yml +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.github/workflows/trigger-all.yml +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.github/workflows/upload.yml +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/.gitignore +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/LICENSE +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/Makefile +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/README.md +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/examples/bit_packing/ex_quantize_and_pack.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/examples/bit_packing/int4_config.json +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/examples/bitmask_compression.ipynb +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/examples/llama_1.1b/ex_config_quantization.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/examples/llama_1.1b/ex_llmcompressor_quantization.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/examples/llama_1.1b/example_quant_config.json +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/examples/llama_1.1b/example_quant_recipe.yaml +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/examples/quantize_and_pack_int4.ipynb +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/pyproject.toml +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/setup.cfg +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/setup.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/README.md +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/base.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/base.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/helpers.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/model_compressors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/model_compressors/model_compressor.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/quantized_compressors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/quantized_compressors/base.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/quantized_compressors/naive_quantized.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/quantized_compressors/nvfp4_quantized.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/quantized_compressors/pack_quantized.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/sparse_compressors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/sparse_compressors/base.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/sparse_compressors/dense.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/sparse_compressors/sparse_24_bitmask.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/sparse_compressors/sparse_bitmask.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/sparse_quantized_compressors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/compressors/sparse_quantized_compressors/marlin_24.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/config/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/config/base.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/config/dense.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/config/sparse_24_bitmask.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/config/sparse_bitmask.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/linear/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/linear/compressed_linear.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/lifecycle/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/lifecycle/apply.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/lifecycle/compressed.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/lifecycle/forward.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/lifecycle/helpers.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/lifecycle/initialize.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/quant_args.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/quant_config.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/quant_scheme.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/utils/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/quantization/utils/helpers.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/registry/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/registry/registry.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/factory/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/factory/base.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/factory/hadamard.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/factory/matrix_multiply.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/factory/random_hadamard.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/transform_args.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/transform_config.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/transform_scheme.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/utils/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/utils/hadamard.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/utils/hadamards.safetensors +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/transform/utils/utils.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/utils/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/utils/helpers.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/utils/permutations_24.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/utils/permute.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/utils/safetensors_load.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors/utils/semi_structured_conversions.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors.egg-info/SOURCES.txt +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors.egg-info/dependency_links.txt +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors.egg-info/requires.txt +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/src/compressed_tensors.egg-info/top_level.txt +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/conftest.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/model_compressors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/model_compressors/test_model_compressor.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/quantized_compressors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/quantized_compressors/test_fp8_quant.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/quantized_compressors/test_int_quant.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/quantized_compressors/test_nvfp4_quant.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/quantized_compressors/test_pack_quant.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/sparse_compressors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/sparse_compressors/test_bitmask.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/sparse_compressors/test_sparse_24_bitmask.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/sparse_quantized_compressors/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_compressors/sparse_quantized_compressors/test_marlin_24.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_configs/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_configs/test_base.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_examples/test_bitmask_compression_ipynb.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_linear/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_linear/test_compressed_linear.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/lifecycle/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/lifecycle/conftest.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/lifecycle/test_apply.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/lifecycle/test_dynamic_lifecycle.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/lifecycle/test_enabled.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/lifecycle/test_forward.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/lifecycle/test_helpers.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/lifecycle/test_initialize.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/lifecycle/test_lifecycle.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/test_configs/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/test_configs/test_bit_depths.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/test_configs/test_strategies.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/test_quant_args.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/test_quant_config.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/test_quant_scheme.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_quantization/test_utils/test_helpers.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_registry.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_transform/factory/test_correctness.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_transform/factory/test_memory.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_transform/test_transform_args.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_transform/test_transform_config.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_transform/test_transform_scheme.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_transform/utils/test_hadamard.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_utils/__init__.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_utils/test_helpers.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/test_utils/test_safetensors_load.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/tests/testing_utils.py +0 -0
- {compressed_tensors-0.10.2a20250616 → compressed_tensors-0.10.2a20250620}/utils/copyright.py +0 -0
@@ -120,7 +120,7 @@ jobs:
         shell: bash

       - name: report to reportportal
-        uses: neuralmagic/nm-actions/actions/reportportal_submit_execution_results@v1.
+        uses: neuralmagic/nm-actions/actions/reportportal_submit_execution_results@v1.22.0
         with:
           droute_username: ${{ secrets.DROUTE_USERNAME }}
           droute_password: ${{ secrets.DROUTE_PASSWORD }}
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: compressed-tensors
-Version: 0.10.
+Version: 0.10.2a20250620
 Summary: Library for utilization of compressed safetensors of neural network models
 Home-page: https://github.com/neuralmagic/compressed-tensors
 Author: Neuralmagic, Inc.
@@ -85,6 +85,7 @@ __all__ = [
     "delete_offload_module",
     "offloaded_dispatch",
     "disable_offloading",
+    "remove_dispatch",
 ]


@@ -170,22 +171,22 @@ def update_parameter_data(

 def get_execution_device(module: torch.nn.Module) -> torch.device:
     """
-    Get the device which inputs should be moved to before module execution
+    Get the device which inputs should be moved to before module execution.
+    Assume that modules execute in the same order as returned by `model.modules()`

     :param module: module to check, may be offloaded
     :return: onload device of module
     """
-
-
+    for submodule in module.modules():
+        if has_offloaded_params(submodule):
+            return submodule._hf_hook.execution_device

-
-
-
-            f"Unable able to infer execution device of {module}, falling back to CPU"
-        )
-        return torch.device("cpu")
+        param = next(submodule.parameters(recurse=False), None)
+        if param is not None:
+            return param.device

-
+    warnings.warn(f"Unable to get execution device of {module}, falling back to CPU")
+    return torch.device("cpu")


 def register_offload_parameter(
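The lookup order introduced in the hunk above can be illustrated without PyTorch. The following is a minimal sketch, not the library's code: `FakeModule` and `sketch_get_execution_device` are hypothetical stand-ins, devices are plain strings, and a non-empty `hook_execution_device` plays the role of `has_offloaded_params(submodule)`.

```python
import warnings
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class FakeModule:
    """Hypothetical stand-in for torch.nn.Module, for illustration only."""

    own_param_device: Optional[str] = None  # device of a directly owned parameter
    hook_execution_device: Optional[str] = None  # set when the module is offloaded
    children: List["FakeModule"] = field(default_factory=list)

    def modules(self) -> Iterator["FakeModule"]:
        # Pre-order traversal: the module itself first, then each child subtree.
        yield self
        for child in self.children:
            yield from child.modules()


def sketch_get_execution_device(module: FakeModule) -> str:
    """Return the device of the first submodule that reveals one."""
    for submodule in module.modules():
        # An offloaded submodule reports the execution device of its hook.
        if submodule.hook_execution_device is not None:
            return submodule.hook_execution_device
        # Otherwise use a parameter owned directly by this submodule, if any.
        if submodule.own_param_device is not None:
            return submodule.own_param_device
    warnings.warn("Unable to get execution device, falling back to CPU")
    return "cpu"


# Mirrors the shape of the new test: `a` holds CPU weights, `b` holds CUDA weights.
a = FakeModule(own_param_device="cpu")
b = FakeModule(own_param_device="cuda:0")
model = FakeModule(children=[a, b])
device_before = sketch_get_execution_device(model)

# Simulate offloading `a` with an execution device of cuda:0.
a.hook_execution_device = "cuda:0"
device_after = sketch_get_execution_device(model)
```

Because the container owns no parameters, the first submodule in traversal order decides the result, which is why the docstring states the assumption that modules execute in `model.modules()` order.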
@@ -514,6 +515,9 @@ def offloaded_dispatch(
     if offload_device == "disk":
         raise NotImplementedError("Disk offloading is not currently supported")

+    # remove any existing hooks
+    remove_dispatch(module)
+
     # create weights map
     state_dict = module.state_dict()
     state_dict = {key: val.to(offload_device) for key, val in state_dict.items()}
@@ -535,6 +539,33 @@ def offloaded_dispatch(
         weights_map=weights_map,
         tied_params_map=tied_params_map,
     )
+
+    # when saving a model, `PretrainedModel.save_pretrained` will only
+    # onload weights if the following requirements are met
+    # if (
+    #     hasattr(self, "hf_device_map")
+    #     and len(set(self.hf_device_map.values())) > 1
+    #     and ("cpu" in self.hf_device_map.values()
+    #          or "disk" in self.hf_device_map.values())
+    # ):
+    # because this function always offloads, disregard actual devices and
+    # always use `cpu` and `cuda:0` to guarantee this condition passes
+    setattr(module, "hf_device_map", {"fake_offload": "cpu", "fake_exec": "cuda:0"})
+
+    return module
+
+
+def remove_dispatch(module: torch.nn.Module) -> torch.nn.Module:
+    """
+    Remove any existing dispatches from module
+
+    :param module: module which may be dispatched with hf hooks
+    :return: module without dispatch
+    """
+    remove_hook_from_module(module, recurse=True)
+    if hasattr(module, "hf_device_map"):
+        delattr(module, "hf_device_map")
+
     return module


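The comment in the hunk above quotes a condition and then sets a placeholder `hf_device_map` so that the condition holds. The sketch below evaluates that quoted condition in plain Python. The helper name `would_onload_on_save` is hypothetical, the condition is copied from the comment rather than from the `transformers` source, and a `None` argument stands in for a model without the `hf_device_map` attribute.

```python
from typing import Dict, Optional


def would_onload_on_save(hf_device_map: Optional[Dict[str, str]]) -> bool:
    """Evaluate the condition quoted in the diff comment (illustrative only)."""
    if hf_device_map is None:  # models the hasattr(self, "hf_device_map") check
        return False
    devices = hf_device_map.values()
    return len(set(devices)) > 1 and ("cpu" in devices or "disk" in devices)


# The placeholder map that offloaded_dispatch assigns in this release.
fake_map = {"fake_offload": "cpu", "fake_exec": "cuda:0"}

passes_with_fake_map = would_onload_on_save(fake_map)
passes_single_device = would_onload_on_save({"": "cuda:0"})
passes_without_map = would_onload_on_save(None)
```

Two distinct devices, one of them `cpu`, are the least the quoted condition needs, which fits the placeholder map having exactly two entries.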
@@ -563,7 +594,7 @@ def disable_offloading():
         # update any parameters which may have changed
         for module, (hook, offload) in onloaded_modules.items():
             hook.offload = offload
-            for name, param in module.named_parameters():
+            for name, param in module.named_parameters(recurse=False):
                 update_offload_parameter(module, name, param.data)
             hook.post_forward(module, None)

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: compressed-tensors
-Version: 0.10.
+Version: 0.10.2a20250620
 Summary: Library for utilization of compressed safetensors of neural network models
 Home-page: https://github.com/neuralmagic/compressed-tensors
 Author: Neuralmagic, Inc.
@@ -102,6 +102,25 @@ def test_get_execution_device():
     assert get_execution_device(module) == torch.device("cuda:0")


+@requires_gpu
+@requires_accelerate()
+def test_get_execution_device_model():
+    class Model(torch.nn.Module):
+        def __init__(self):
+            super().__init__()
+            self.a = torch.nn.Linear(1, 2)
+            self.b = torch.nn.Linear(2, 2, device="cuda:0")
+
+        def forward(self, x):
+            return self.b(self.a(x).to("cuda:0"))
+
+    model = Model()
+    assert get_execution_device(model) == torch.device("cpu")
+
+    offloaded_dispatch(model.a, torch.device("cuda:0"))
+    assert get_execution_device(model) == torch.device("cuda:0")
+
+
 @requires_accelerate()
 def test_register_offload_parameter():
     from accelerate import init_empty_weights