PyPI - onnxslim - Versions diffs - 0.1.46__tar.gz → 0.1.77__tar.gz - Mend

onnxslim 0.1.46.tar.gz → 0.1.77.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (80)

onnxslim-0.1.77/PKG-INFO ADDED

@@ -0,0 +1,146 @@
+Metadata-Version: 2.4
+Name: onnxslim
+Version: 0.1.77
+Summary: OnnxSlim: A Toolkit to Help Optimize Onnx Model
+Home-page: https://github.com/inisis/OnnxSlim
+Author: inisis
+Author-email: desmond.yao@buaa.edu.cn
+License: MIT
+Project-URL: Bug Tracker, https://github.com/inisis/OnnxSlim/issues
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Intended Audience :: Developers
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Requires-Python: >=3.6
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: onnx
+Requires-Dist: sympy>=1.13.1
+Requires-Dist: packaging
+Requires-Dist: colorama
+Requires-Dist: ml_dtypes
+Dynamic: author
+Dynamic: author-email
+Dynamic: classifier
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: license
+Dynamic: license-file
+Dynamic: project-url
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+# OnnxSlim
+<p align="center">
+    <a href="https://pypi.org/project/onnxslim">
+        <img src="https://img.shields.io/pypi/v/onnxslim?color=blue" />
+    </a>
+    <a href="https://pypi.org/project/onnxslim">
+        <img src="https://static.pepy.tech/badge/onnxslim/week" />
+    </a>
+    <a href="https://pypi.org/project/onnxslim">
+        <img src="https://static.pepy.tech/badge/onnxslim/month" />
+    </a>
+    <a href="https://pypi.org/project/onnxslim">
+        <img src="https://static.pepy.tech/badge/onnxslim" />
+    </a>
+    <a href="https://github.com/inisis/onnxslim/actions/workflows/ci.yaml">
+        <img src="https://github.com/inisis/onnxslim/actions/workflows/ci.yml/badge.svg" />
+    </a>
+    <a href="https://codecov.io/gh/inisis/onnxslim" >
+        <img src="https://codecov.io/gh/inisis/onnxslim/branch/main/graph/badge.svg?token=C69ZH6802N"/>
+    </a>
+    <a href="https://muhammadrizwanmunawar.medium.com/boost-onnx-load-speed-by-10-15-with-onnxslims-python-package-d401eb8c2e69">
+        <img src="https://img.shields.io/badge/Blog-OnnxSlim?style=flat&label=OnnxSlim" />
+    </a>
+    <a href="https://deepwiki.com/inisis/OnnxSlim"><img src="https://img.shields.io/badge/DeepWiki-inisis%2FOnnxSlim-blue.svg?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACwAAAAyCAYAAAAnWDnqAAAAAXNSR0IArs4c6QAAA05JREFUaEPtmUtyEzEQhtWTQyQLHNak2AB7ZnyXZMEjXMGeK/AIi+QuHrMnbChYY7MIh8g01fJoopFb0uhhEqqcbWTp06/uv1saEDv4O3n3dV60RfP947Mm9/SQc0ICFQgzfc4CYZoTPAswgSJCCUJUnAAoRHOAUOcATwbmVLWdGoH//PB8mnKqScAhsD0kYP3j/Yt5LPQe2KvcXmGvRHcDnpxfL2zOYJ1mFwrryWTz0advv1Ut4CJgf5uhDuDj5eUcAUoahrdY/56ebRWeraTjMt/00Sh3UDtjgHtQNHwcRGOC98BJEAEymycmYcWwOprTgcB6VZ5JK5TAJ+fXGLBm3FDAmn6oPPjR4rKCAoJCal2eAiQp2x0vxTPB3ALO2CRkwmDy5WohzBDwSEFKRwPbknEggCPB/imwrycgxX2NzoMCHhPkDwqYMr9tRcP5qNrMZHkVnOjRMWwLCcr8ohBVb1OMjxLwGCvjTikrsBOiA6fNyCrm8V1rP93iVPpwaE+gO0SsWmPiXB+jikdf6SizrT5qKasx5j8ABbHpFTx+vFXp9EnYQmLx02h1QTTrl6eDqxLnGjporxl3NL3agEvXdT0WmEost648sQOYAeJS9Q7bfUVoMGnjo4AZdUMQku50McDcMWcBPvr0SzbTAFDfvJqwLzgxwATnCgnp4wDl6Aa+Ax283gghmj+vj7feE2KBBRMW3FzOpLOADl0Isb5587h/U4gGvkt5v60Z1VLG8BhYjbzRwyQZemwAd6cCR5/XFWLYZRIMpX39AR0tjaGGiGzLVyhse5C9RKC6ai42ppWPKiBagOvaYk8lO7DajerabOZP46Lby5wKjw1HCRx7p9sVMOWGzb/vA1hwiWc6jm3MvQDTogQkiqIhJV0nBQBTU+3okKCFDy9WwferkHjtxib7t3xIUQtHxnIwtx4mpg26/HfwVNVDb4oI9RHmx5WGelRVlrtiw43zboCLaxv46AZeB3IlTkwouebTr1y2NjSpHz68WNFjHvupy3q8TFn3Hos2IAk4Ju5dCo8B3wP7VPr/FGaKiG+T+v+TQqIrOqMTL1VdWV1DdmcbO8KXBz6esmYWYKPwDL5b5FA1a0hwapHiom0r/cKaoqr+27/XcrS5UwSMbQAAAABJRU5ErkJggg==" alt="DeepWiki"></a>
+</p>
+OnnxSlim can help you slim your onnx model, with less operators, but same accuracy, better inference speed.
+- 🚀 2025/05/17: OnnxSlim is merged into [optimum](https://github.com/huggingface/optimum) 🤗🤗🤗
+- 🚀 2025/04/30: Rank 1st in the [AICAS 2025 LLM inference optimization challenge](https://tianchi.aliyun.com/competition/entrance/532289/customize588)
+- 🚀 2025/01/28: Achieved 1M downloads
+- 🚀 2024/06/23: OnnxSlim is merged into [transformers.js](https://github.com/huggingface/transformers.js) 🤗🤗🤗
+- 🚀 2024/06/02: OnnxSlim is merged into [ultralytics](https://github.com/ultralytics/ultralytics) ❤️❤️❤️
+- 🚀 2024/04/30: Rank 1st in the [AICAS 2024 LLM inference optimization challenge](https://tianchi.aliyun.com/competition/entrance/532170/customize440) held by Arm and T-head
+- 🚀 2024/01/25: OnnxSlim is merged to [mnn-llm](https://github.com/wangzhaode/mnn-llm), performance increased by 5%
+# Benchmark
+![Image](https://github.com/user-attachments/assets/fefc79f1-5d8d-486b-935a-a088846b3900)
+# Installation
+## Using Prebuilt
+```bash
+pip install onnxslim
+```
+## Install From Source
+```bash
+pip install git+https://github.com/inisis/OnnxSlim@main
+```
+## Install From Local
+```bash
+git clone https://github.com/inisis/OnnxSlim && cd OnnxSlim/
+pip install .
+```
+# How to use
+## Bash
+```bash
+onnxslim your_onnx_model slimmed_onnx_model
+```
+<div align=left><img src="https://raw.githubusercontent.com/inisis/onnxslim/main/images/onnxslim.gif"></div>
+## Inscript
+```inscript
+import onnx
+import onnxslim
+model = onnx.load("model.onnx")
+slimmed_model = onnxslim.slim(model)
+onnx.save(slimmed_model, "slimmed_model.onnx")
+```
+For more usage, see onnxslim -h or refer to our [examples](./examples)
+# Projects using OnnxSlim
+- <img src="https://avatars.githubusercontent.com/u/131524?s=48&v=4" width="22" height="22"/>[Mozilla/smart_autofill](https://github.com/mozilla/smart_autofill)
+- <img src="https://avatars.githubusercontent.com/u/1961952?s=48&v=4" width="22" height="22"/>[alibaba/MNN](https://github.com/alibaba/MNN)
+- <img src="https://avatars.githubusercontent.com/u/23534030?s=48&v=4" width="22" height="22"/>[PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
+- <img src="https://avatars.githubusercontent.com/u/25720743?s=48&v=4" width="22" height="22"/>[huggingface/transformers.js](https://github.com/huggingface/transformers.js)
+- <img src="https://avatars.githubusercontent.com/u/25720743?s=48&v=4" width="22" height="22"/>[huggingface/optimum](https://github.com/huggingface/optimum)
+- <img src="https://avatars.githubusercontent.com/u/86091366?s=48&v=4" width="22" height="22"/>[THU-MIG/yolov10](https://github.com/THU-MIG/yolov10)
+- <img src="https://avatars.githubusercontent.com/u/26833451?s=48&v=4" width="22" height="22"/>[ultralytics/ultralytics](https://github.com/ultralytics/ultralytics)
+- <img src="https://avatars.githubusercontent.com/u/109945100?s=48&v=4" width="22" height="22"/>[ModelScope/FunASR](https://github.com/modelscope/FunASR)
+- <img src="https://avatars.githubusercontent.com/u/1961952?s=48&v=4" width="22" height="22"/>[alibaba/MNN-LLM](https://github.com/wangzhaode/mnn-llm)
+- <img src="https://avatars.githubusercontent.com/u/126587470?s=48&v=4" width="22" height="22"/>[deepghs/imgutils](https://github.com/deepghs/imgutils)
+- <img src="https://avatars.githubusercontent.com/u/48153283?s=48&v=4" width="22" height="22"/>[sunsmarterjie/yolov12](https://github.com/sunsmarterjie/yolov12)
+- <img src="https://avatars.githubusercontent.com/u/147458884?s=48&v=4" width="22" height="22"/>[nndeploy/nndeploy](https://github.com/nndeploy/nndeploy)
+- <img src="https://avatars.githubusercontent.com/u/111754012?s=48&v=4" width="22" height="22"/>[CVCUDA/CV-CUDA](https://github.com/CVCUDA/CV-CUDA)
+# References
+> - [onnx-graphsurgeon](https://github.com/NVIDIA/TensorRT/tree/main/tools/onnx-graphsurgeon)
+> - [Polygraphy](https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy/polygraphy)
+> - [onnx-simplifier](https://github.com/daquexian/onnx-simplifier)
+> - [tabulate](https://github.com/astanin/python-tabulate)
+> - [onnxruntime](https://github.com/microsoft/onnxruntime)
+# Contact
+Discord: https://discord.gg/nRw2Fd3VUS QQ Group: `873569894`

onnxslim-0.1.77/README.md ADDED

@@ -0,0 +1,112 @@
+# OnnxSlim
+<p align="center">
+    <a href="https://pypi.org/project/onnxslim">
+        <img src="https://img.shields.io/pypi/v/onnxslim?color=blue" />
+    </a>
+    <a href="https://pypi.org/project/onnxslim">
+        <img src="https://static.pepy.tech/badge/onnxslim/week" />
+    </a>
+    <a href="https://pypi.org/project/onnxslim">
+        <img src="https://static.pepy.tech/badge/onnxslim/month" />
+    </a>
+    <a href="https://pypi.org/project/onnxslim">
+        <img src="https://static.pepy.tech/badge/onnxslim" />
+    </a>
+    <a href="https://github.com/inisis/onnxslim/actions/workflows/ci.yaml">
+        <img src="https://github.com/inisis/onnxslim/actions/workflows/ci.yml/badge.svg" />
+    </a>
+    <a href="https://codecov.io/gh/inisis/onnxslim" >
+        <img src="https://codecov.io/gh/inisis/onnxslim/branch/main/graph/badge.svg?token=C69ZH6802N"/>
+    </a>
+    <a href="https://muhammadrizwanmunawar.medium.com/boost-onnx-load-speed-by-10-15-with-onnxslims-python-package-d401eb8c2e69">
+        <img src="https://img.shields.io/badge/Blog-OnnxSlim?style=flat&label=OnnxSlim" />
+    </a>
+    <a href="https://deepwiki.com/inisis/OnnxSlim"><img src="https://img.shields.io/badge/DeepWiki-inisis%2FOnnxSlim-blue.svg?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACwAAAAyCAYAAAAnWDnqAAAAAXNSR0IArs4c6QAAA05JREFUaEPtmUtyEzEQhtWTQyQLHNak2AB7ZnyXZMEjXMGeK/AIi+QuHrMnbChYY7MIh8g01fJoopFb0uhhEqqcbWTp06/uv1saEDv4O3n3dV60RfP947Mm9/SQc0ICFQgzfc4CYZoTPAswgSJCCUJUnAAoRHOAUOcATwbmVLWdGoH//PB8mnKqScAhsD0kYP3j/Yt5LPQe2KvcXmGvRHcDnpxfL2zOYJ1mFwrryWTz0advv1Ut4CJgf5uhDuDj5eUcAUoahrdY/56ebRWeraTjMt/00Sh3UDtjgHtQNHwcRGOC98BJEAEymycmYcWwOprTgcB6VZ5JK5TAJ+fXGLBm3FDAmn6oPPjR4rKCAoJCal2eAiQp2x0vxTPB3ALO2CRkwmDy5WohzBDwSEFKRwPbknEggCPB/imwrycgxX2NzoMCHhPkDwqYMr9tRcP5qNrMZHkVnOjRMWwLCcr8ohBVb1OMjxLwGCvjTikrsBOiA6fNyCrm8V1rP93iVPpwaE+gO0SsWmPiXB+jikdf6SizrT5qKasx5j8ABbHpFTx+vFXp9EnYQmLx02h1QTTrl6eDqxLnGjporxl3NL3agEvXdT0WmEost648sQOYAeJS9Q7bfUVoMGnjo4AZdUMQku50McDcMWcBPvr0SzbTAFDfvJqwLzgxwATnCgnp4wDl6Aa+Ax283gghmj+vj7feE2KBBRMW3FzOpLOADl0Isb5587h/U4gGvkt5v60Z1VLG8BhYjbzRwyQZemwAd6cCR5/XFWLYZRIMpX39AR0tjaGGiGzLVyhse5C9RKC6ai42ppWPKiBagOvaYk8lO7DajerabOZP46Lby5wKjw1HCRx7p9sVMOWGzb/vA1hwiWc6jm3MvQDTogQkiqIhJV0nBQBTU+3okKCFDy9WwferkHjtxib7t3xIUQtHxnIwtx4mpg26/HfwVNVDb4oI9RHmx5WGelRVlrtiw43zboCLaxv46AZeB3IlTkwouebTr1y2NjSpHz68WNFjHvupy3q8TFn3Hos2IAk4Ju5dCo8B3wP7VPr/FGaKiG+T+v+TQqIrOqMTL1VdWV1DdmcbO8KXBz6esmYWYKPwDL5b5FA1a0hwapHiom0r/cKaoqr+27/XcrS5UwSMbQAAAABJRU5ErkJggg==" alt="DeepWiki"></a>
+</p>
+OnnxSlim can help you slim your onnx model, with less operators, but same accuracy, better inference speed.
+- 🚀 2025/05/17: OnnxSlim is merged into [optimum](https://github.com/huggingface/optimum) 🤗🤗🤗
+- 🚀 2025/04/30: Rank 1st in the [AICAS 2025 LLM inference optimization challenge](https://tianchi.aliyun.com/competition/entrance/532289/customize588)
+- 🚀 2025/01/28: Achieved 1M downloads
+- 🚀 2024/06/23: OnnxSlim is merged into [transformers.js](https://github.com/huggingface/transformers.js) 🤗🤗🤗
+- 🚀 2024/06/02: OnnxSlim is merged into [ultralytics](https://github.com/ultralytics/ultralytics) ❤️❤️❤️
+- 🚀 2024/04/30: Rank 1st in the [AICAS 2024 LLM inference optimization challenge](https://tianchi.aliyun.com/competition/entrance/532170/customize440) held by Arm and T-head
+- 🚀 2024/01/25: OnnxSlim is merged to [mnn-llm](https://github.com/wangzhaode/mnn-llm), performance increased by 5%
+# Benchmark
+![Image](https://github.com/user-attachments/assets/fefc79f1-5d8d-486b-935a-a088846b3900)
+# Installation
+## Using Prebuilt
+```bash
+pip install onnxslim
+```
+## Install From Source
+```bash
+pip install git+https://github.com/inisis/OnnxSlim@main
+```
+## Install From Local
+```bash
+git clone https://github.com/inisis/OnnxSlim && cd OnnxSlim/
+pip install .
+```
+# How to use
+## Bash
+```bash
+onnxslim your_onnx_model slimmed_onnx_model
+```
+<div align=left><img src="https://raw.githubusercontent.com/inisis/onnxslim/main/images/onnxslim.gif"></div>
+## Inscript
+```inscript
+import onnx
+import onnxslim
+model = onnx.load("model.onnx")
+slimmed_model = onnxslim.slim(model)
+onnx.save(slimmed_model, "slimmed_model.onnx")
+```
+For more usage, see onnxslim -h or refer to our [examples](./examples)
+# Projects using OnnxSlim
+- <img src="https://avatars.githubusercontent.com/u/131524?s=48&v=4" width="22" height="22"/>[Mozilla/smart_autofill](https://github.com/mozilla/smart_autofill)
+- <img src="https://avatars.githubusercontent.com/u/1961952?s=48&v=4" width="22" height="22"/>[alibaba/MNN](https://github.com/alibaba/MNN)
+- <img src="https://avatars.githubusercontent.com/u/23534030?s=48&v=4" width="22" height="22"/>[PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
+- <img src="https://avatars.githubusercontent.com/u/25720743?s=48&v=4" width="22" height="22"/>[huggingface/transformers.js](https://github.com/huggingface/transformers.js)
+- <img src="https://avatars.githubusercontent.com/u/25720743?s=48&v=4" width="22" height="22"/>[huggingface/optimum](https://github.com/huggingface/optimum)
+- <img src="https://avatars.githubusercontent.com/u/86091366?s=48&v=4" width="22" height="22"/>[THU-MIG/yolov10](https://github.com/THU-MIG/yolov10)
+- <img src="https://avatars.githubusercontent.com/u/26833451?s=48&v=4" width="22" height="22"/>[ultralytics/ultralytics](https://github.com/ultralytics/ultralytics)
+- <img src="https://avatars.githubusercontent.com/u/109945100?s=48&v=4" width="22" height="22"/>[ModelScope/FunASR](https://github.com/modelscope/FunASR)
+- <img src="https://avatars.githubusercontent.com/u/1961952?s=48&v=4" width="22" height="22"/>[alibaba/MNN-LLM](https://github.com/wangzhaode/mnn-llm)
+- <img src="https://avatars.githubusercontent.com/u/126587470?s=48&v=4" width="22" height="22"/>[deepghs/imgutils](https://github.com/deepghs/imgutils)
+- <img src="https://avatars.githubusercontent.com/u/48153283?s=48&v=4" width="22" height="22"/>[sunsmarterjie/yolov12](https://github.com/sunsmarterjie/yolov12)
+- <img src="https://avatars.githubusercontent.com/u/147458884?s=48&v=4" width="22" height="22"/>[nndeploy/nndeploy](https://github.com/nndeploy/nndeploy)
+- <img src="https://avatars.githubusercontent.com/u/111754012?s=48&v=4" width="22" height="22"/>[CVCUDA/CV-CUDA](https://github.com/CVCUDA/CV-CUDA)
+# References
+> - [onnx-graphsurgeon](https://github.com/NVIDIA/TensorRT/tree/main/tools/onnx-graphsurgeon)
+> - [Polygraphy](https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy/polygraphy)
+> - [onnx-simplifier](https://github.com/daquexian/onnx-simplifier)
+> - [tabulate](https://github.com/astanin/python-tabulate)
+> - [onnxruntime](https://github.com/microsoft/onnxruntime)
+# Contact
+Discord: https://discord.gg/nRw2Fd3VUS QQ Group: `873569894`

onnxslim-0.1.77/VERSION ADDED

@@ -0,0 +1 @@
+0.1.77

{onnxslim-0.1.46 → onnxslim-0.1.77}/onnxslim/__init__.py RENAMED

@@ -3,7 +3,6 @@ import warnings
 from onnxslim.cli import slim
 from onnxslim.core.pattern.registry import (
-    DEFAULT_FUSION_PATTERNS,
     register_fusion_pattern,
 )
 from onnxslim.version import __version__

{onnxslim-0.1.46 → onnxslim-0.1.77}/onnxslim/argparser.py RENAMED

@@ -2,10 +2,28 @@ import argparse
 import dataclasses
 from argparse import ArgumentDefaultsHelpFormatter, ArgumentParser
 from dataclasses import dataclass, field
-from typing import List, Optional, Type, Union, get_args, get_origin
-import onnxslim
+from typing import List, Optional, Type, Union, get_args, get_origin, TypedDict, Dict, Literal
+from .core.optimization import OptimizationSettings
+from .core.pattern.registry import DEFAULT_FUSION_PATTERNS
+from .version import __version__
+class OnnxSlimKwargs(TypedDict, total=False):
+    model_check: bool
+    input_shapes: Dict[str, List[int]]
+    inputs: List[str]
+    outputs: List[str]
+    no_shape_infer: bool
+    skip_optimizations: List[str]
+    dtype: Literal["float16", "float32", "uint8", "int8"]
+    skip_fusion_patterns: List[str]
+    size_threshold: int
+    inspect: bool
+    dump_to_disk: bool
+    save_as_external_data: bool
+    model_check_inputs: Optional[List[str]]
+    verbose: bool
 def _get_inner_type(arg_type):
     if get_origin(arg_type) is Union:
@@ -38,14 +56,24 @@ class OptimizationArguments:
     """
     no_shape_infer: bool = field(default=False, metadata={"help": "whether to disable shape_infer, default false."})
-    no_constant_folding: bool = field(
-        default=False, metadata={"help": "whether to disable constant_folding, default false."}
+    skip_optimizations: Optional[List[str]] = field(
+        default=None,
+        metadata={
+            "help": "whether to skip some optimizations",
+            "choices": list(OptimizationSettings.keys()),
+        },
     )
     skip_fusion_patterns: Optional[List[str]] = field(
         default=None,
         metadata={
             "help": "whether to skip the fusion of some patterns",
-            "choices": list(onnxslim.DEFAULT_FUSION_PATTERNS.keys()),
+            "choices": list(DEFAULT_FUSION_PATTERNS.keys()),
+        },
+    )
+    size_threshold: int = field(
+        default=None,
+        metadata={
+            "help": "size threshold in bytes, size larger than this value will not be folded, default None, which means fold all constants",
         },
     )
@@ -163,7 +191,7 @@ class OnnxSlimArgumentParser(ArgumentParser):
         # Add positional arguments separately for ModelArguments
         self.parser.add_argument("input_model", help="input onnx model")
         self.parser.add_argument("output_model", nargs="?", default=None, help="output onnx model")
-        self.parser.add_argument("-v", "--version", action="version", version=onnxslim.__version__)
+        self.parser.add_argument("-v", "--version", action="version", version=__version__)
     def parse_args_into_dataclasses(self):
         # Pre-parse arguments to check for `--inspect`
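
Note on `OnnxSlimKwargs`: the TypedDict added in this file is declared with `total=False`, so every key is optional. The sketch below illustrates what that means at runtime. It uses a trimmed-down stand-in class; the name `SlimKwargs` and the reduced field list are assumptions made for illustration and are not part of onnxslim.

```python
from typing import Dict, List, Literal, TypedDict


# Trimmed-down stand-in for OnnxSlimKwargs; the name SlimKwargs is illustrative only.
class SlimKwargs(TypedDict, total=False):
    input_shapes: Dict[str, List[int]]
    skip_optimizations: List[str]
    dtype: Literal["float16", "float32", "uint8", "int8"]
    size_threshold: int


# total=False makes every key optional, so a partial dict is a valid SlimKwargs.
options: SlimKwargs = {"skip_optimizations": ["constant_folding"], "size_threshold": 1024}

# At runtime a TypedDict instance is a plain dict; key checks are done by static type checkers.
print(type(options) is dict)
print(sorted(SlimKwargs.__optional_keys__))
```

Because nothing is validated at runtime, a misspelled key would only be caught by a type checker such as mypy or pyright, not by the interpreter.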

{onnxslim-0.1.46 → onnxslim-0.1.77}/onnxslim/cli/_main.py RENAMED

@@ -1,14 +1,17 @@
-from typing import List, Union
+from __future__ import annotations
 import onnx
+from onnxslim.argparser import OnnxSlimKwargs
-def slim(model: Union[str, onnx.ModelProto, List[Union[str, onnx.ModelProto]]], *args, **kwargs):
+def slim(model: str | onnx.ModelProto | list[str | onnx.ModelProto], *args, **kwargs: OnnxSlimKwargs):
     import os
     import time
     from pathlib import Path
     from onnxslim.core import (
+        OptimizationSettings,
         convert_data_format,
         freeze,
         input_modification,
@@ -18,6 +21,7 @@ def slim(model: Union[str, onnx.ModelProto, List[Union[str, onnx.ModelProto]]],
         shape_infer,
     )
     from onnxslim.utils import (
+        TensorInfo,
         check_onnx,
         check_point,
         check_result,
@@ -27,6 +31,7 @@ def slim(model: Union[str, onnx.ModelProto, List[Union[str, onnx.ModelProto]]],
         print_model_info_as_table,
         save,
         summarize_model,
+        update_outputs_dims,
     )
     output_model = args[0] if len(args) > 0 else kwargs.get("output_model", None)
@@ -35,9 +40,11 @@ def slim(model: Union[str, onnx.ModelProto, List[Union[str, onnx.ModelProto]]],
     inputs = kwargs.get("inputs", None)
     outputs = kwargs.get("outputs", None)
     no_shape_infer = kwargs.get("no_shape_infer", False)
-    no_constant_folding = kwargs.get("no_constant_folding", False)
+    skip_optimizations = kwargs.get("skip_optimizations", None)
     dtype = kwargs.get("dtype", None)
     skip_fusion_patterns = kwargs.get("skip_fusion_patterns", None)
+    size_threshold = kwargs.get("size_threshold", None)
+    size_threshold = int(size_threshold) if size_threshold else None
     kwargs.get("inspect", False)
     dump_to_disk = kwargs.get("dump_to_disk", False)
     save_as_external_data = kwargs.get("save_as_external_data", False)
@@ -92,14 +99,17 @@ def slim(model: Union[str, onnx.ModelProto, List[Union[str, onnx.ModelProto]]],
     if model_check:
         input_data_dict, raw_onnx_output, model = check_onnx(model, model_check_inputs)
+    output_info = {TensorInfo(o).name: TensorInfo(o).shape for o in model.graph.output}
     if not no_shape_infer:
         model = shape_infer(model)
-    if not no_constant_folding:
+    OptimizationSettings.reset(skip_optimizations)
+    if OptimizationSettings.enabled():
         graph_check_point = check_point(model)
         while MAX_ITER > 0:
             logger.debug(f"iter: {MAX_ITER}")
-            model = optimize(model, skip_fusion_patterns)
+            model = optimize(model, skip_fusion_patterns, size_threshold)
             if not no_shape_infer:
                 model = shape_infer(model)
             graph = check_point(model)
@@ -114,6 +124,8 @@ def slim(model: Union[str, onnx.ModelProto, List[Union[str, onnx.ModelProto]]],
     if dtype:
         model = convert_data_format(model, dtype)
+    model = update_outputs_dims(model, output_dims=output_info)
     if model_check:
         slimmed_onnx_output, model = onnxruntime_inference(model, input_data_dict)
         if not check_result(raw_onnx_output, slimmed_onnx_output):
@@ -151,10 +163,11 @@ def main():
     if not checker_args.inspect and checker_args.dump_to_disk:
         argument_parser.error("dump_to_disk can only be used with --inspect")
-    if not optimization_args.no_shape_infer or optimization_args.no_constant_folding:
-        from onnxslim.utils import check_onnx_compatibility
+    if not optimization_args.no_shape_infer:
+        from onnxslim.utils import check_onnx_compatibility, is_onnxruntime_available
-        check_onnx_compatibility()
+        if is_onnxruntime_available():
+            check_onnx_compatibility()
     slim(
         model_args.input_model,
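
Note on the `slim()` keyword changes: `no_constant_folding` is no longer read, and `skip_optimizations` plus `size_threshold` are new. The sketch below shows one way a caller might adapt old-style keyword arguments. Both helper names (`migrate_slim_kwargs`, `normalize_size_threshold`) are hypothetical and do not exist in onnxslim. As far as the visible hunks show, the old flag guarded the whole optimization loop, whereas skipping `"constant_folding"` only disables that one pass, so the mapping is the closest match rather than an exact equivalent.

```python
from typing import Any, Dict


def migrate_slim_kwargs(old_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: rewrite 0.1.46-style kwargs for the 0.1.77 keyword names."""
    new_kwargs = dict(old_kwargs)
    # no_constant_folding is gone; the closest replacement is an entry in skip_optimizations.
    if new_kwargs.pop("no_constant_folding", False):
        skipped = list(new_kwargs.get("skip_optimizations") or [])
        if "constant_folding" not in skipped:
            skipped.append("constant_folding")
        new_kwargs["skip_optimizations"] = skipped
    return new_kwargs


def normalize_size_threshold(value):
    """Same expression as in the diff: falsy values (None, 0, "") become None."""
    return int(value) if value else None


print(migrate_slim_kwargs({"no_constant_folding": True, "dtype": "float16"}))
print(normalize_size_threshold("1024"), normalize_size_threshold(0))
```

One consequence of the normalization expression is that an integer threshold of `0` is treated the same as no threshold at all.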

{onnxslim-0.1.46 → onnxslim-0.1.77}/onnxslim/core/__init__.py RENAMED

@@ -1,15 +1,18 @@
+from __future__ import annotations
 import logging
 import os
 import tempfile
+from typing import Optional
 import numpy as np
 import onnx
 from onnx import checker
 import onnxslim.third_party.onnx_graphsurgeon as gs
-from onnxslim.core.optimization import optimize_model
+from onnxslim.core.optimization import OptimizationSettings, optimize_model
 from onnxslim.third_party.onnx_graphsurgeon.exporters.onnx_exporter import dtype_to_onnx
-from onnxslim.third_party.onnx_graphsurgeon.ir.tensor import Constant, Variable
+from onnxslim.third_party.onnx_graphsurgeon.ir.tensor import Constant
 from onnxslim.third_party.symbolic_shape_infer import SymbolicShapeInference
 from onnxslim.utils import save
@@ -18,6 +21,7 @@ logger = logging.getLogger("onnxslim")
 DEBUG = bool(os.getenv("ONNXSLIM_DEBUG"))
 AUTO_MERGE = True if os.getenv("ONNXSLIM_AUTO_MERGE") is None else bool(int(os.getenv("ONNXSLIM_AUTO_MERGE")))
+FORCE_ONNXRUNTIME_SHAPE_INFERENCE = bool(os.getenv("ONNXSLIM_FORCE_ONNXRUNTIME_SHAPE_INFERENCE"))
 def input_shape_modification(model: onnx.ModelProto, input_shapes: str) -> onnx.ModelProto:
@@ -122,6 +126,9 @@ def input_modification(model: onnx.ModelProto, inputs: str) -> onnx.ModelProto:
 def shape_infer(model: onnx.ModelProto):
     """Infer tensor shapes in an ONNX model using symbolic and static shape inference techniques."""
     logger.debug("Start shape inference.")
+    if FORCE_ONNXRUNTIME_SHAPE_INFERENCE:
+        logger.debug("force onnxruntime shape infer.")
+        return SymbolicShapeInference.infer_shapes(model, auto_merge=AUTO_MERGE)
     try:
         logger.debug("try onnxruntime shape infer.")
         model = SymbolicShapeInference.infer_shapes(model, auto_merge=AUTO_MERGE)
@@ -142,14 +149,15 @@ def shape_infer(model: onnx.ModelProto):
     return model
-def optimize(model: onnx.ModelProto, skip_fusion_patterns: str = None):
+def optimize(model: onnx.ModelProto, skip_fusion_patterns: str | None = None, size_threshold: int | None = None):
     """Optimize the given ONNX model with options to skip specific fusion patterns and return the optimized model."""
     logger.debug("Start converting model to gs.")
     graph = gs.import_onnx(model).toposort()
     logger.debug("Finish converting model to gs.")
-    logger.debug("Start constant folding.")
-    graph.fold_constants().cleanup().toposort()
-    logger.debug("Finish constant folding.")
+    if OptimizationSettings.constant_folding:
+        logger.debug("Start constant folding.")
+        graph.fold_constants(size_threshold=size_threshold).cleanup().toposort()
+        logger.debug("Finish constant folding.")
     logger.debug("Start optimize model.")
     model = optimize_model(graph, skip_fusion_patterns)
     logger.debug("Finish optimize model.")
@@ -170,11 +178,11 @@ def convert_data_format(model: onnx.ModelProto, dtype: str) -> onnx.ModelProto:
         for node in graph.nodes:
             if node.op == "Cast":
-                inp_dtype = [input.dtype for input in node.inputs][0]
+                inp_dtype = next(input.dtype for input in node.inputs)
                 if inp_dtype in [np.float16, np.float32]:
-                    node.replace_all_uses_with(node.inputs[0])
+                    node.erase()
                 else:
-                    outp_dtype = [output.dtype for output in node.outputs][0]
+                    outp_dtype = next(output.dtype for output in node.outputs)
                     if outp_dtype == np.float16:
                         node.attrs["to"] = dtype_to_onnx(np.float32)
                         node.outputs[0].dtype = np.float32
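
Note on the new `ONNXSLIM_FORCE_ONNXRUNTIME_SHAPE_INFERENCE` flag: it is parsed with `bool(os.getenv(...))`, while `ONNXSLIM_AUTO_MERGE` goes through `bool(int(...))`. The sketch below contrasts the two expressions. The function names are hypothetical, and a plain dict stands in for `os.environ` so the behaviour can be checked in isolation.

```python
from typing import Mapping


def force_flag(env: Mapping[str, str]) -> bool:
    # Same expression shape as FORCE_ONNXRUNTIME_SHAPE_INFERENCE: truthiness of the raw string.
    return bool(env.get("ONNXSLIM_FORCE_ONNXRUNTIME_SHAPE_INFERENCE"))


def auto_merge_flag(env: Mapping[str, str]) -> bool:
    # Same expression shape as AUTO_MERGE: True when unset, otherwise parsed as an integer.
    raw = env.get("ONNXSLIM_AUTO_MERGE")
    return True if raw is None else bool(int(raw))


# Any non-empty string enables the force flag, including "0".
print(force_flag({"ONNXSLIM_FORCE_ONNXRUNTIME_SHAPE_INFERENCE": "0"}))
# The integer-parsed flag treats "0" as disabled.
print(auto_merge_flag({"ONNXSLIM_AUTO_MERGE": "0"}))
```

In practice this suggests leaving the force variable unset to disable it, since setting it to `0` still counts as enabled under this parsing.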

{onnxslim-0.1.46 → onnxslim-0.1.77}/onnxslim/core/optimization/__init__.py RENAMED

@@ -1,6 +1,8 @@
+from __future__ import annotations
 import logging
 from collections import Counter
-from typing import List, Union
+from typing import List, Optional, Union
 import onnx
@@ -15,19 +17,62 @@ from .subexpression_elimination import subexpression_elimination
 from .weight_tying import tie_weights
-def optimize_model(model: Union[onnx.ModelProto, gs.Graph], skip_fusion_patterns: str = None) -> onnx.ModelProto:
+class OptimizationSettings:
+    constant_folding = True
+    graph_fusion = True
+    dead_node_elimination = True
+    subexpression_elimination = True
+    weight_tying = True
+    @classmethod
+    def keys(cls):
+        return [
+            "constant_folding",
+            "graph_fusion",
+            "dead_node_elimination",
+            "subexpression_elimination",
+            "weight_tying",
+        ]
+    @classmethod
+    def reset(cls, skip_optimizations: list[str] | None = None):
+        for key in cls.keys():
+            if skip_optimizations and key in skip_optimizations:
+                setattr(cls, key, False)
+            else:
+                setattr(cls, key, True)
+    @classmethod
+    def stats(cls):
+        return {key: getattr(cls, key) for key in cls.keys()}
+    @classmethod
+    def enabled(cls):
+        return any([getattr(cls, key) for key in cls.keys()])
+def optimize_model(model: onnx.ModelProto | gs.Graph, skip_fusion_patterns: str | None = None) -> onnx.ModelProto:
     """Optimize and transform the given ONNX model using various fusion patterns and graph rewriting techniques."""
     graph = model if isinstance(model, gs.Graph) else gs.import_onnx(model)
-    fusion_patterns = get_fusion_patterns(skip_fusion_patterns)
-    fusion_pairs = find_matches(graph, fusion_patterns)
-    for match in fusion_pairs.values():
-        graph.replace_custom_layer(**match)
-    graph.cleanup(remove_unused_graph_inputs=True).toposort()
-    dead_node_elimination(graph)
-    graph.cleanup(remove_unused_graph_inputs=True).toposort()
-    subexpression_elimination(graph)
-    graph.cleanup(remove_unused_graph_inputs=True).toposort()
-    tie_weights(graph)
+    if OptimizationSettings.graph_fusion:
+        logger.debug("Start graph_fusion.")
+        fusion_patterns = get_fusion_patterns(skip_fusion_patterns)
+        graph_fusion(graph, fusion_patterns)
+        logger.debug("Finish graph_fusion.")
+    if OptimizationSettings.dead_node_elimination:
+        logger.debug("Start dead_node_elimination.")
+        dead_node_elimination(graph)
+        graph.cleanup(remove_unused_graph_inputs=True).toposort()
+        logger.debug("Finish dead_node_elimination.")
+    if OptimizationSettings.subexpression_elimination:
+        logger.debug("Start subexpression_elimination.")
+        subexpression_elimination(graph)
+        graph.cleanup(remove_unused_graph_inputs=True).toposort()
+        logger.debug("Finish subexpression_elimination.")
+    if OptimizationSettings.weight_tying:
+        logger.debug("Start weight_tying.")
+        tie_weights(graph)
+        logger.debug("Finish weight_tying.")
     model = gs.export_onnx(graph)
     return model
@@ -38,9 +83,9 @@ def replace_custom_layer(
     self,
     op: str,
     inputs,
-    outputs: List[str],
+    outputs: list[str],
     name: str,
-    attrs: dict = None,
+    attrs: dict | None = None,
     domain: str = "ai.onnx.contrib",
 ):
     """Replace a custom layer in the computational graph with specified parameters and domain."""
@@ -54,9 +99,21 @@ def replace_custom_layer(
     )
+def graph_fusion(graph: Graph, fusion_patterns: dict, is_subgraph=False):
+    for subgraph in graph.subgraphs():
+        graph_fusion(subgraph, fusion_patterns, is_subgraph=True)
+    fusion_pairs = find_matches(graph, fusion_patterns)
+    for match in fusion_pairs.values():
+        graph.replace_custom_layer(**match)
+    graph.cleanup(remove_unused_graph_inputs=True if not is_subgraph else False).toposort()
 def find_matches(graph: Graph, fusion_patterns: dict):
     """Find matching patterns in the graph based on provided fusion patterns."""
     match_map = {}
     counter = Counter()
     for node in reversed(graph.nodes):
         if node.name not in match_map:
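
Note on `OptimizationSettings`: the toggles are class attributes, so the state is shared process-wide, and `slim()` calls `reset()` on each invocation. The sketch below reproduces the class from the hunk above so it runs without onnxslim installed. The only deliberate change is the use of `Optional[List[str]]` instead of `list[str] | None`, an assumption made to keep the snippet independent of `from __future__ import annotations`.

```python
from typing import List, Optional


class OptimizationSettings:
    """Standalone copy of the class added in this diff, for illustration."""

    constant_folding = True
    graph_fusion = True
    dead_node_elimination = True
    subexpression_elimination = True
    weight_tying = True

    @classmethod
    def keys(cls):
        return [
            "constant_folding",
            "graph_fusion",
            "dead_node_elimination",
            "subexpression_elimination",
            "weight_tying",
        ]

    @classmethod
    def reset(cls, skip_optimizations: Optional[List[str]] = None):
        # Every call rewrites all five toggles, so earlier skips do not carry over.
        for key in cls.keys():
            if skip_optimizations and key in skip_optimizations:
                setattr(cls, key, False)
            else:
                setattr(cls, key, True)

    @classmethod
    def stats(cls):
        return {key: getattr(cls, key) for key in cls.keys()}

    @classmethod
    def enabled(cls):
        return any(getattr(cls, key) for key in cls.keys())


# Skip two passes; the remaining three stay enabled.
OptimizationSettings.reset(["constant_folding", "weight_tying"])
print(OptimizationSettings.stats())

# Skipping every pass makes enabled() False, which bypasses the optimization loop in slim().
OptimizationSettings.reset(OptimizationSettings.keys())
print(OptimizationSettings.enabled())

# Calling reset() with no argument turns everything back on.
OptimizationSettings.reset()
print(OptimizationSettings.enabled())
```

Since `reset()` always rewrites every key, passing `None` or an empty list restores the defaults rather than preserving a previous configuration.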
