PyPI - neural-compressor - Versions diffs - 2.3.2__tar.gz → 2.4__tar.gz - Mend

neural-compressor 2.3.2__tar.gz → 2.4__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (540)

{neural_compressor-2.3.2 → neural_compressor-2.4}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: neural_compressor
-Version: 2.3.2
+Version: 2.4
 Summary: Repository of Intel® Neural Compressor
 Home-page: https://github.com/intel/neural-compressor
 Author: Intel AIA Team
@@ -11,9 +11,10 @@ Classifier: Intended Audience :: Science/Research
 Classifier: Programming Language :: Python :: 3
 Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
 Classifier: License :: OSI Approved :: Apache Software License
-Requires-Python: >=3.6.0
+Requires-Python: >=3.7.0
 Description-Content-Type: text/markdown
 License-File: LICENSE
+License-File: third-party-programs.txt
 Requires-Dist: deprecated>=1.2.13
 Requires-Dist: numpy
 Requires-Dist: opencv-python-headless
@@ -28,6 +29,12 @@ Requires-Dist: pyyaml
 Requires-Dist: requests
 Requires-Dist: schema
 Requires-Dist: scikit-learn
+Provides-Extra: pt
+Requires-Dist: neural_compressor_3x_pt==2.4; extra == "pt"
+Provides-Extra: tf
+Requires-Dist: neural_compressor_3x_tf==2.4; extra == "tf"
+Provides-Extra: ort
+Requires-Dist: neural_compressor_3x_ort==2.4; extra == "ort"
 <div align="center">
@@ -35,8 +42,8 @@ Intel® Neural Compressor
 ===========================
 <h3> An open-source Python library supporting popular model compression techniques on all mainstream deep learning frameworks (TensorFlow, PyTorch, ONNX Runtime, and MXNet)</h3>
-[![python](https://img.shields.io/badge/python-3.7%2B-blue)](https://github.com/intel/neural-compressor)
-[![version](https://img.shields.io/badge/release-2.3.2-green)](https://github.com/intel/neural-compressor/releases)
+[![python](https://img.shields.io/badge/python-3.8%2B-blue)](https://github.com/intel/neural-compressor)
+[![version](https://img.shields.io/badge/release-2.4-green)](https://github.com/intel/neural-compressor/releases)
 [![license](https://img.shields.io/badge/license-Apache%202-blue)](https://github.com/intel/neural-compressor/blob/master/LICENSE)
 [![coverage](https://img.shields.io/badge/coverage-85%25-green)](https://github.com/intel/neural-compressor)
 [![Downloads](https://static.pepy.tech/personalized-badge/neural-compressor?period=total&units=international_system&left_color=grey&right_color=green&left_text=downloads)](https://pepy.tech/project/neural-compressor)
@@ -52,9 +59,9 @@ In particular, the tool provides the key features, typical examples, and open co
 * Support a wide range of Intel hardware such as [Intel Xeon Scalable Processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon/scalable.html), [Intel Xeon CPU Max Series](https://www.intel.com/content/www/us/en/products/details/processors/xeon/max-series.html), [Intel Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/flex-series.html), and [Intel Data Center GPU Max Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/max-series.html) with extensive testing; support AMD CPU, ARM CPU, and NVidia GPU through ONNX Runtime with limited testing
-* Validate popular LLMs such as LLama2, [LLama](examples/onnxrt/nlp/huggingface_model/text_generation/llama/quantization/ptq_static), [MPT](https://github.com/intel/intel-extension-for-transformers/blob/main/examples/huggingface/pytorch/text-generation/quantization/README.md), [Falcon](https://github.com/intel/intel-extension-for-transformers/blob/main/examples/huggingface/pytorch/language-modeling/quantization/README.md), [GPT-J](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/fx), [Bloom](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/ipex/smooth_quant), [OPT](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/ipex/smooth_quant), and more than 10,000 broad models such as [Stable Diffusion](/examples/pytorch/nlp/huggingface_models/text-to-image/quantization), [BERT-Large](/examples/pytorch/nlp/huggingface_models/text-classification/quantization/ptq_static/fx), and [ResNet50](/examples/pytorch/image_recognition/torchvision_models/quantization/ptq/cpu/fx) from popular model hubs such as [Hugging Face](https://huggingface.co/), [Torch Vision](https://pytorch.org/vision/stable/index.html), and [ONNX Model Zoo](https://github.com/onnx/models#models), by leveraging zero-code optimization solution [Neural Coder](/neural_coder#what-do-we-offer) and automatic [accuracy-driven](/docs/source/design.md#workflow) quantization strategies
+* Validate popular LLMs such as [LLama2](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [Falcon](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [GPT-J](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [Bloom](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [OPT](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), and more than 10,000 broad models such as [Stable Diffusion](/examples/pytorch/nlp/huggingface_models/text-to-image/quantization), [BERT-Large](/examples/pytorch/nlp/huggingface_models/text-classification/quantization/ptq_static/fx), and [ResNet50](/examples/pytorch/image_recognition/torchvision_models/quantization/ptq/cpu/fx) from popular model hubs such as [Hugging Face](https://huggingface.co/), [Torch Vision](https://pytorch.org/vision/stable/index.html), and [ONNX Model Zoo](https://github.com/onnx/models#models), by leveraging zero-code optimization solution [Neural Coder](/neural_coder#what-do-we-offer) and automatic [accuracy-driven](/docs/source/design.md#workflow) quantization strategies
-* Collaborate with cloud marketplace such as [Google Cloud Platform](https://console.cloud.google.com/marketplace/product/bitnami-launchpad/inc-tensorflow-intel?project=verdant-sensor-286207), [Amazon Web Services](https://aws.amazon.com/marketplace/pp/prodview-yjyh2xmggbmga#pdp-support), and [Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/bitnami.inc-tensorflow-intel), software platforms such as [Alibaba Cloud](https://www.intel.com/content/www/us/en/developer/articles/technical/quantize-ai-by-oneapi-analytics-on-alibaba-cloud.html), [Tencent TACO](https://new.qq.com/rain/a/20221202A00B9S00) and [Microsoft Olive](https://github.com/microsoft/Olive), and open AI ecosystem such as [Hugging Face](https://huggingface.co/blog/intel), [PyTorch](https://pytorch.org/tutorials/recipes/intel_neural_compressor_for_pytorch.html), [ONNX](https://github.com/onnx/models#models), [ONNX Runtime](https://github.com/microsoft/onnxruntime), and [Lightning AI](https://github.com/Lightning-AI/lightning/blob/master/docs/source-pytorch/advanced/post_training_quantization.rst)
+* Collaborate with cloud marketplaces such as [Google Cloud Platform](https://console.cloud.google.com/marketplace/product/bitnami-launchpad/inc-tensorflow-intel?project=verdant-sensor-286207), [Amazon Web Services](https://aws.amazon.com/marketplace/pp/prodview-yjyh2xmggbmga#pdp-support), and [Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/bitnami.inc-tensorflow-intel), software platforms such as [Alibaba Cloud](https://www.intel.com/content/www/us/en/developer/articles/technical/quantize-ai-by-oneapi-analytics-on-alibaba-cloud.html), [Tencent TACO](https://new.qq.com/rain/a/20221202A00B9S00) and [Microsoft Olive](https://github.com/microsoft/Olive), and open AI ecosystem such as [Hugging Face](https://huggingface.co/blog/intel), [PyTorch](https://pytorch.org/tutorials/recipes/intel_neural_compressor_for_pytorch.html), [ONNX](https://github.com/onnx/models#models), [ONNX Runtime](https://github.com/microsoft/onnxruntime), and [Lightning AI](https://github.com/Lightning-AI/lightning/blob/master/docs/source-pytorch/advanced/post_training_quantization.rst)
 ## Installation
@@ -62,7 +69,8 @@ In particular, the tool provides the key features, typical examples, and open co
 ```Shell
 pip install neural-compressor
 ```
-> More installation methods can be found at [Installation Guide](./docs/source/installation_guide.md). Please check out our [FAQ](./docs/source/faq.md) for more details.
+> **Note**:
+> More installation methods can be found at [Installation Guide](https://github.com/intel/neural-compressor/blob/master/docs/source/installation_guide.md). Please check out our [FAQ](https://github.com/intel/neural-compressor/blob/master/docs/source/faq.md) for more details.
 ## Getting Started
 ### Quantization with Python API
@@ -152,7 +160,8 @@ q_model = fit(
       </tr>
       <tr>
           <td colspan="4" align="center"><a href="./docs/source/quantization_weight_only.md">Weight-Only Quantization (INT8/INT4/FP4/NF4) </td>
-          <td colspan="4" align="center"><a href="https://github.com/intel/neural-compressor/blob/fp8_adaptor/docs/source/fp8.md">FP8 Quantization </td>
+          <td colspan="2" align="center"><a href="https://github.com/intel/neural-compressor/blob/fp8_adaptor/docs/source/fp8.md">FP8 Quantization </td>
+          <td colspan="2" align="center"><a href="./docs/source/quantization_layer_wise.md">Layer-Wise Quantization </td>
       </tr>
   </tbody>
   <thead>
@@ -168,18 +177,19 @@ q_model = fit(
   </tbody>
 </table>
-> More documentations can be found at [User Guide](./docs/source/user_guide.md).
+> **Note**:
+> More documentations can be found at [User Guide](https://github.com/intel/neural-compressor/blob/master/docs/source/user_guide.md).
 ## Selected Publications/Events
+* Blog by Intel: [Effective Weight-Only Quantization for Large Language Models with Intel® Neural Compressor](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Effective-Weight-Only-Quantization-for-Large-Language-Models/post/1529552) (Oct 2023)
 * EMNLP'2023 (Under Review): [TEQ: Trainable Equivalent Transformation for Quantization of LLMs](https://openreview.net/forum?id=iaI8xEINAf&referrer=%5BAuthor%20Console%5D) (Sep 2023)
 * arXiv: [Efficient Post-training Quantization with FP8 Formats](https://arxiv.org/abs/2309.14592) (Sep 2023)
 * arXiv: [Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs](https://arxiv.org/abs/2309.05516) (Sep 2023)
-* Post on Social Media: [ONNXCommunityMeetup2023: INT8 Quantization for Large Language Models with Intel Neural Compressor](https://www.youtube.com/watch?v=luYBWA1Q5pQ)  (July 2023)
-* Blog by Intel: [Accelerate Llama 2 with Intel AI Hardware and Software Optimizations](https://www.intel.com/content/www/us/en/developer/articles/news/llama2.html) (July 2023)
 * NeurIPS'2022: [Fast Distilbert on CPUs](https://arxiv.org/abs/2211.07715) (Oct 2022)
 * NeurIPS'2022: [QuaLA-MiniLM: a Quantized Length Adaptive MiniLM](https://arxiv.org/abs/2210.17114) (Oct 2022)
-> View [Full Publication List](./docs/source/publication_list.md).
+> **Note**:
+> View [Full Publication List](https://github.com/intel/neural-compressor/blob/master/docs/source/publication_list.md).
 ## Additional Content
@@ -189,6 +199,7 @@ q_model = fit(
 * [Security Policy](SECURITY.md)
 ## Communication
-- [GitHub Issues](https://github.com/intel/neural-compressor/issues): mainly for bugs report, new feature request, question asking, etc.
+- [GitHub Issues](https://github.com/intel/neural-compressor/issues): mainly for bug reports, new feature requests, question asking, etc.
 - [Email](mailto:inc.maintainers@intel.com): welcome to raise any interesting research ideas on model compression techniques by email for collaborations.
+- [Discord Channel](https://discord.com/invite/Wxk3J3ZJkU): join the discord channel for more flexible technical discussion.
 - [WeChat group](/docs/source/imgs/wechat_group.jpg): scan the QA code to join the technical discussion.
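
The `Provides-Extra` / `Requires-Dist` lines added to PKG-INFO above declare three optional extras (`pt`, `tf`, `ort`), each pinned to a companion package at version 2.4. The following is a minimal sketch of how those metadata lines can be read with the Python standard library; the `PKG_INFO` string is an abbreviated copy of the headers shown in the hunk, and the grouping logic is an illustration, not how pip resolves extras internally.

```python
from email import message_from_string

# Abbreviated copy of the 2.4 metadata headers shown in the diff above.
PKG_INFO = """\
Metadata-Version: 2.1
Name: neural_compressor
Version: 2.4
Requires-Python: >=3.7.0
Requires-Dist: numpy
Provides-Extra: pt
Requires-Dist: neural_compressor_3x_pt==2.4; extra == "pt"
Provides-Extra: tf
Requires-Dist: neural_compressor_3x_tf==2.4; extra == "tf"
Provides-Extra: ort
Requires-Dist: neural_compressor_3x_ort==2.4; extra == "ort"
"""

# PKG-INFO uses RFC 822 style headers, so the email parser can read it.
meta = message_from_string(PKG_INFO)

# Group requirements that carry an `extra == "..."` marker by extra name.
extras = {}
for req in meta.get_all("Requires-Dist", []):
    if ";" not in req:
        continue  # unconditional dependency such as numpy
    spec, marker = (part.strip() for part in req.split(";", 1))
    if marker.startswith("extra =="):
        name = marker.split("==", 1)[1].strip().strip("\"'")
        extras.setdefault(name, []).append(spec)

print(meta["Version"], extras)
```

With metadata of this shape, requesting an extra at install time (for example the `pt` extra) should additionally pull in the pinned `neural_compressor_3x_pt==2.4` requirement.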

{neural_compressor-2.3.2 → neural_compressor-2.4}/README.md RENAMED

@@ -4,8 +4,8 @@ Intel® Neural Compressor
 ===========================
 <h3> An open-source Python library supporting popular model compression techniques on all mainstream deep learning frameworks (TensorFlow, PyTorch, ONNX Runtime, and MXNet)</h3>
-[![python](https://img.shields.io/badge/python-3.7%2B-blue)](https://github.com/intel/neural-compressor)
-[![version](https://img.shields.io/badge/release-2.3.2-green)](https://github.com/intel/neural-compressor/releases)
+[![python](https://img.shields.io/badge/python-3.8%2B-blue)](https://github.com/intel/neural-compressor)
+[![version](https://img.shields.io/badge/release-2.4-green)](https://github.com/intel/neural-compressor/releases)
 [![license](https://img.shields.io/badge/license-Apache%202-blue)](https://github.com/intel/neural-compressor/blob/master/LICENSE)
 [![coverage](https://img.shields.io/badge/coverage-85%25-green)](https://github.com/intel/neural-compressor)
 [![Downloads](https://static.pepy.tech/personalized-badge/neural-compressor?period=total&units=international_system&left_color=grey&right_color=green&left_text=downloads)](https://pepy.tech/project/neural-compressor)
@@ -21,9 +21,9 @@ In particular, the tool provides the key features, typical examples, and open co
 * Support a wide range of Intel hardware such as [Intel Xeon Scalable Processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon/scalable.html), [Intel Xeon CPU Max Series](https://www.intel.com/content/www/us/en/products/details/processors/xeon/max-series.html), [Intel Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/flex-series.html), and [Intel Data Center GPU Max Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/max-series.html) with extensive testing; support AMD CPU, ARM CPU, and NVidia GPU through ONNX Runtime with limited testing
-* Validate popular LLMs such as LLama2, [LLama](examples/onnxrt/nlp/huggingface_model/text_generation/llama/quantization/ptq_static), [MPT](https://github.com/intel/intel-extension-for-transformers/blob/main/examples/huggingface/pytorch/text-generation/quantization/README.md), [Falcon](https://github.com/intel/intel-extension-for-transformers/blob/main/examples/huggingface/pytorch/language-modeling/quantization/README.md), [GPT-J](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/fx), [Bloom](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/ipex/smooth_quant), [OPT](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/ipex/smooth_quant), and more than 10,000 broad models such as [Stable Diffusion](/examples/pytorch/nlp/huggingface_models/text-to-image/quantization), [BERT-Large](/examples/pytorch/nlp/huggingface_models/text-classification/quantization/ptq_static/fx), and [ResNet50](/examples/pytorch/image_recognition/torchvision_models/quantization/ptq/cpu/fx) from popular model hubs such as [Hugging Face](https://huggingface.co/), [Torch Vision](https://pytorch.org/vision/stable/index.html), and [ONNX Model Zoo](https://github.com/onnx/models#models), by leveraging zero-code optimization solution [Neural Coder](/neural_coder#what-do-we-offer) and automatic [accuracy-driven](/docs/source/design.md#workflow) quantization strategies
+* Validate popular LLMs such as [LLama2](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [Falcon](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [GPT-J](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [Bloom](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [OPT](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), and more than 10,000 broad models such as [Stable Diffusion](/examples/pytorch/nlp/huggingface_models/text-to-image/quantization), [BERT-Large](/examples/pytorch/nlp/huggingface_models/text-classification/quantization/ptq_static/fx), and [ResNet50](/examples/pytorch/image_recognition/torchvision_models/quantization/ptq/cpu/fx) from popular model hubs such as [Hugging Face](https://huggingface.co/), [Torch Vision](https://pytorch.org/vision/stable/index.html), and [ONNX Model Zoo](https://github.com/onnx/models#models), by leveraging zero-code optimization solution [Neural Coder](/neural_coder#what-do-we-offer) and automatic [accuracy-driven](/docs/source/design.md#workflow) quantization strategies
-* Collaborate with cloud marketplace such as [Google Cloud Platform](https://console.cloud.google.com/marketplace/product/bitnami-launchpad/inc-tensorflow-intel?project=verdant-sensor-286207), [Amazon Web Services](https://aws.amazon.com/marketplace/pp/prodview-yjyh2xmggbmga#pdp-support), and [Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/bitnami.inc-tensorflow-intel), software platforms such as [Alibaba Cloud](https://www.intel.com/content/www/us/en/developer/articles/technical/quantize-ai-by-oneapi-analytics-on-alibaba-cloud.html), [Tencent TACO](https://new.qq.com/rain/a/20221202A00B9S00) and [Microsoft Olive](https://github.com/microsoft/Olive), and open AI ecosystem such as [Hugging Face](https://huggingface.co/blog/intel), [PyTorch](https://pytorch.org/tutorials/recipes/intel_neural_compressor_for_pytorch.html), [ONNX](https://github.com/onnx/models#models), [ONNX Runtime](https://github.com/microsoft/onnxruntime), and [Lightning AI](https://github.com/Lightning-AI/lightning/blob/master/docs/source-pytorch/advanced/post_training_quantization.rst)
+* Collaborate with cloud marketplaces such as [Google Cloud Platform](https://console.cloud.google.com/marketplace/product/bitnami-launchpad/inc-tensorflow-intel?project=verdant-sensor-286207), [Amazon Web Services](https://aws.amazon.com/marketplace/pp/prodview-yjyh2xmggbmga#pdp-support), and [Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/bitnami.inc-tensorflow-intel), software platforms such as [Alibaba Cloud](https://www.intel.com/content/www/us/en/developer/articles/technical/quantize-ai-by-oneapi-analytics-on-alibaba-cloud.html), [Tencent TACO](https://new.qq.com/rain/a/20221202A00B9S00) and [Microsoft Olive](https://github.com/microsoft/Olive), and open AI ecosystem such as [Hugging Face](https://huggingface.co/blog/intel), [PyTorch](https://pytorch.org/tutorials/recipes/intel_neural_compressor_for_pytorch.html), [ONNX](https://github.com/onnx/models#models), [ONNX Runtime](https://github.com/microsoft/onnxruntime), and [Lightning AI](https://github.com/Lightning-AI/lightning/blob/master/docs/source-pytorch/advanced/post_training_quantization.rst)
 ## Installation
@@ -31,7 +31,8 @@ In particular, the tool provides the key features, typical examples, and open co
 ```Shell
 pip install neural-compressor
 ```
-> More installation methods can be found at [Installation Guide](./docs/source/installation_guide.md). Please check out our [FAQ](./docs/source/faq.md) for more details.
+> **Note**:
+> More installation methods can be found at [Installation Guide](https://github.com/intel/neural-compressor/blob/master/docs/source/installation_guide.md). Please check out our [FAQ](https://github.com/intel/neural-compressor/blob/master/docs/source/faq.md) for more details.
 ## Getting Started
 ### Quantization with Python API
@@ -121,7 +122,8 @@ q_model = fit(
       </tr>
       <tr>
           <td colspan="4" align="center"><a href="./docs/source/quantization_weight_only.md">Weight-Only Quantization (INT8/INT4/FP4/NF4) </td>
-          <td colspan="4" align="center"><a href="https://github.com/intel/neural-compressor/blob/fp8_adaptor/docs/source/fp8.md">FP8 Quantization </td>
+          <td colspan="2" align="center"><a href="https://github.com/intel/neural-compressor/blob/fp8_adaptor/docs/source/fp8.md">FP8 Quantization </td>
+          <td colspan="2" align="center"><a href="./docs/source/quantization_layer_wise.md">Layer-Wise Quantization </td>
       </tr>
   </tbody>
   <thead>
@@ -137,18 +139,19 @@ q_model = fit(
   </tbody>
 </table>
-> More documentations can be found at [User Guide](./docs/source/user_guide.md).
+> **Note**:
+> More documentations can be found at [User Guide](https://github.com/intel/neural-compressor/blob/master/docs/source/user_guide.md).
 ## Selected Publications/Events
+* Blog by Intel: [Effective Weight-Only Quantization for Large Language Models with Intel® Neural Compressor](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Effective-Weight-Only-Quantization-for-Large-Language-Models/post/1529552) (Oct 2023)
 * EMNLP'2023 (Under Review): [TEQ: Trainable Equivalent Transformation for Quantization of LLMs](https://openreview.net/forum?id=iaI8xEINAf&referrer=%5BAuthor%20Console%5D) (Sep 2023)
 * arXiv: [Efficient Post-training Quantization with FP8 Formats](https://arxiv.org/abs/2309.14592) (Sep 2023)
 * arXiv: [Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs](https://arxiv.org/abs/2309.05516) (Sep 2023)
-* Post on Social Media: [ONNXCommunityMeetup2023: INT8 Quantization for Large Language Models with Intel Neural Compressor](https://www.youtube.com/watch?v=luYBWA1Q5pQ)  (July 2023)
-* Blog by Intel: [Accelerate Llama 2 with Intel AI Hardware and Software Optimizations](https://www.intel.com/content/www/us/en/developer/articles/news/llama2.html) (July 2023)
 * NeurIPS'2022: [Fast Distilbert on CPUs](https://arxiv.org/abs/2211.07715) (Oct 2022)
 * NeurIPS'2022: [QuaLA-MiniLM: a Quantized Length Adaptive MiniLM](https://arxiv.org/abs/2210.17114) (Oct 2022)
-> View [Full Publication List](./docs/source/publication_list.md).
+> **Note**:
+> View [Full Publication List](https://github.com/intel/neural-compressor/blob/master/docs/source/publication_list.md).
 ## Additional Content
@@ -158,6 +161,7 @@ q_model = fit(
 * [Security Policy](SECURITY.md)
 ## Communication
-- [GitHub Issues](https://github.com/intel/neural-compressor/issues): mainly for bugs report, new feature request, question asking, etc.
+- [GitHub Issues](https://github.com/intel/neural-compressor/issues): mainly for bug reports, new feature requests, question asking, etc.
 - [Email](mailto:inc.maintainers@intel.com): welcome to raise any interesting research ideas on model compression techniques by email for collaborations.
+- [Discord Channel](https://discord.com/invite/Wxk3J3ZJkU): join the discord channel for more flexible technical discussion.
 - [WeChat group](/docs/source/imgs/wechat_group.jpg): scan the QA code to join the technical discussion.

neural_compressor-2.4/neural_coder/backends/pytorch_inc_static_quant_ipex_xpu.yaml ADDED

@@ -0,0 +1,34 @@
+# Copyright (c) 2023 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+transformation:
+  location:
+    - ["insert_below_dataloader_definition_line", "insert_below_model_definition_line"]
+  content:
+    - |-
+      [+] from neural_compressor.config import PostTrainingQuantConfig
+      [+] from neural_compressor.quantization import fit
+      [+] MODEL_NAME = MODEL_NAME.to("xpu")
+      [+] conf = PostTrainingQuantConfig(backend='ipex', quant_level=1, device="xpu")
+      [+] MODEL_NAME = fit(model=MODEL_NAME, conf=conf, calib_dataloader=DATALOADER_NAME)
+      [+] MODEL_NAME.save("./quantized_model")
+      [+] MODEL_NAME.eval()
+  order:
+    - below:
+      above:
+        - pytorch_jit_script
+        - pytorch_jit_script_ofi
+        - pytorch_jit_trace
+        - pytorch_jit_trace_ofi
+        - pytorch_channels_last
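
The `content` block of the new backend file above is a template: each `[+]` line is a statement to insert, and `MODEL_NAME` / `DATALOADER_NAME` are placeholders for the variable names found in the user's script. The sketch below shows what the rendered lines could look like for a script whose model is `model` and whose dataloader is `val_loader`. `render_template` is a hypothetical helper written for this illustration; it is not Neural Coder's own substitution code, and the two variable names are assumptions.

```python
from typing import List

# Template lines copied from the `content` block of the YAML file above.
TEMPLATE = """\
[+] from neural_compressor.config import PostTrainingQuantConfig
[+] from neural_compressor.quantization import fit
[+] MODEL_NAME = MODEL_NAME.to("xpu")
[+] conf = PostTrainingQuantConfig(backend='ipex', quant_level=1, device="xpu")
[+] MODEL_NAME = fit(model=MODEL_NAME, conf=conf, calib_dataloader=DATALOADER_NAME)
[+] MODEL_NAME.save("./quantized_model")
[+] MODEL_NAME.eval()
"""


def render_template(template: str, model_name: str, dataloader_name: str) -> List[str]:
    """Hypothetical helper: drop the `[+] ` marker and fill in the placeholders."""
    rendered = []
    for raw in template.splitlines():
        line = raw.strip()
        if not line.startswith("[+] "):
            continue  # ignore anything that is not an insertion line
        line = line[len("[+] "):]
        line = line.replace("MODEL_NAME", model_name)
        line = line.replace("DATALOADER_NAME", dataloader_name)
        rendered.append(line)
    return rendered


rendered = render_template(TEMPLATE, model_name="model", dataloader_name="val_loader")
print("\n".join(rendered))
```

The rendered statements move the model to the `xpu` device, run post-training quantization with the IPEX backend, save the result, and switch the quantized model to evaluation mode, in that order.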

{neural_compressor-2.3.2 → neural_compressor-2.4}/neural_coder/graphers/function.py RENAMED

@@ -56,7 +56,7 @@ def register_func_wrap_pair():
             if is_in_function and line_idx == func_end_line_idx:
                 is_in_function = False
-            # handle function's defnition line, to initiate a function
+            # handle function's definition line, to initiate a function
             if not is_in_function and "def " in line:  # only deal with outermost def
                 function_name = line[line.find("def") + 4 : line.find("(")]

{neural_compressor-2.3.2 → neural_compressor-2.4}/neural_coder/interface.py RENAMED

@@ -118,6 +118,7 @@ def enable(
         "pytorch_inc_dynamic_quant",
         "pytorch_inc_static_quant_fx",
         "pytorch_inc_static_quant_ipex",
+        "pytorch_inc_static_quant_ipex_xpu",
         "pytorch_inc_bf16",
         "pytorch_inc_huggingface_optimum_static",
         "pytorch_inc_huggingface_optimum_dynamic",
@@ -210,6 +211,7 @@ def enable(
         or "pytorch_jit_trace_ofi" in features
         or "pytorch_inc_static_quant_fx" in features
         or "pytorch_inc_static_quant_ipex" in features
+        or "pytorch_inc_static_quant_ipex_xpu" in features
     ):
         features = ["pytorch_reclaim_inputs"] + features
@@ -312,6 +314,7 @@ def enable(
                 "pytorch_inc_dynamic_quant",
                 "pytorch_inc_static_quant_fx",
                 "pytorch_inc_static_quant_ipex",
+                "pytorch_inc_static_quant_ipex_xpu",
                 "pytorch_inc_huggingface_optimum_static",
                 "pytorch_inc_huggingface_optimum_dynamic",
                 "onnx_inc_static_quant_qlinear",
@@ -839,6 +842,7 @@ def superbench(
                 ["pytorch_inc_dynamic_quant"],
                 ["pytorch_inc_static_quant_fx"],
                 ["pytorch_inc_static_quant_ipex"],
+                ["pytorch_inc_static_quant_ipex_xpu"],
                 ["pytorch_inc_bf16"],
             ]
             standalones_pool = []
@@ -857,12 +861,14 @@ def superbench(
                 "pytorch_ipex_bf16",
                 "pytorch_inc_static_quant_fx",
                 "pytorch_inc_static_quant_ipex",
+                "pytorch_inc_static_quant_ipex_xpu",
                 "pytorch_inc_dynamic_quant",
                 "pytorch_ipex_int8_static_quant",
                 "pytorch_ipex_int8_dynamic_quant",
             ]
             # features that can be standalone (either use alone or use with "backend"):
             standalones_pool = [
+                "pytorch_ipex_xpu",
                 "pytorch_mixed_precision_cpu",
                 "pytorch_channels_last",
             ]
@@ -906,6 +912,8 @@ def superbench(
                     continue
                 if "pytorch_inc_static_quant_ipex" in features and "pytorch_mixed_precision_cpu" in features:
                     continue
+                if "pytorch_inc_static_quant_ipex_xpu" in features and "pytorch_mixed_precision_cpu" in features:
+                    continue
                 if "pytorch_inc_dynamic_quant" in features and "pytorch_mixed_precision_cpu" in features:
                     continue
@@ -960,6 +968,8 @@ def superbench(
                         features_display = "Intel INT8 (Static)"
                     elif features == ["pytorch_inc_static_quant_ipex"]:
                         features_display = "Intel INT8 (IPEX)"
+                    elif features == ["pytorch_inc_static_quant_ipex_xpu"]:
+                        features_display = "Intel INT8 (IPEX XPU)"
                     elif features == ["pytorch_inc_bf16"]:
                         features_display = "Intel BF16"
                     elif features == []:
@@ -1047,6 +1057,8 @@ def superbench(
                 best_optimization_display = "Intel INT8 (Static)"
             elif list_optimization_set_top3[0] == ["pytorch_inc_static_quant_ipex"]:
                 best_optimization_display = "Intel INT8 (IPEX)"
+            elif list_optimization_set_top3[0] == ["pytorch_inc_static_quant_ipex_xpu"]:
+                best_optimization_display = "Intel INT8 (IPEX XPU)"
             elif list_optimization_set_top3[0] == ["pytorch_inc_bf16"]:
                 best_optimization_display = "Intel BF16"
             elif list_optimization_set_top3[0] == []:
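
The interface.py hunks above register the new `pytorch_inc_static_quant_ipex_xpu` feature in three places: the rule that prepends `pytorch_reclaim_inputs`, the `superbench` rule that skips combinations with `pytorch_mixed_precision_cpu`, and the display-name lookup. The sketch below restates those three rules in table form. It uses only the feature names and labels visible in the hunks (the real conditions list more entries), and the function and constant names are hypothetical.

```python
from typing import List

# Features visible in the hunk that trigger input reclaiming; the real
# condition in enable() contains further entries not shown in the diff.
NEEDS_RECLAIM_INPUTS = (
    "pytorch_jit_trace_ofi",
    "pytorch_inc_static_quant_fx",
    "pytorch_inc_static_quant_ipex",
    "pytorch_inc_static_quant_ipex_xpu",
)

# Quantization features that superbench does not combine with CPU mixed precision.
EXCLUDED_WITH_MIXED_PRECISION = (
    "pytorch_inc_static_quant_ipex",
    "pytorch_inc_static_quant_ipex_xpu",
    "pytorch_inc_dynamic_quant",
)

# Display labels copied from the superbench hunks.
DISPLAY_NAMES = {
    ("pytorch_inc_static_quant_ipex",): "Intel INT8 (IPEX)",
    ("pytorch_inc_static_quant_ipex_xpu",): "Intel INT8 (IPEX XPU)",
    ("pytorch_inc_bf16",): "Intel BF16",
}


def with_reclaim_inputs(features: List[str]) -> List[str]:
    """Prepend pytorch_reclaim_inputs when any listed feature is requested."""
    if any(name in features for name in NEEDS_RECLAIM_INPUTS):
        return ["pytorch_reclaim_inputs"] + features
    return features


def is_skipped_combination(features: List[str]) -> bool:
    """True for feature sets that the benchmark loop skips with `continue`."""
    return "pytorch_mixed_precision_cpu" in features and any(
        name in features for name in EXCLUDED_WITH_MIXED_PRECISION
    )


print(with_reclaim_inputs(["pytorch_inc_static_quant_ipex_xpu"]))
print(DISPLAY_NAMES[("pytorch_inc_static_quant_ipex_xpu",)])
```

A table-driven form like this keeps each new backend to one added entry per rule, whereas the if/elif chains in the diff need one added branch per call site.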

{neural_compressor-2.3.2 → neural_compressor-2.4}/neural_coder/launcher.py RENAMED

@@ -57,6 +57,8 @@ class Launcher:
                     args.opt = "pytorch_inc_static_quant_fx"
                 if args.approach == "static_ipex":
                     args.opt = "pytorch_inc_static_quant_ipex"
+                if args.approach == "static_ipex_xpu":
+                    args.opt = "pytorch_inc_static_quant_ipex_xpu"
                 if args.approach == "dynamic":
                     args.opt = "pytorch_inc_dynamic_quant"
                 if args.approach == "auto":
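
The launcher.py hunk above maps a command-line `approach` value to a Neural Coder feature name through a chain of `if` statements. As a rough sketch, the branches whose condition and target are both visible in the hunk can be written as a lookup table. `APPROACH_TO_FEATURE` and `resolve_opt` are hypothetical names, and the `auto` branch is omitted because its target is not shown in the diff.

```python
from typing import Optional

# Only the approach/feature pairs fully visible in the hunk above.
APPROACH_TO_FEATURE = {
    "static_ipex": "pytorch_inc_static_quant_ipex",
    "static_ipex_xpu": "pytorch_inc_static_quant_ipex_xpu",
    "dynamic": "pytorch_inc_dynamic_quant",
}


def resolve_opt(approach: str, default: Optional[str] = None) -> Optional[str]:
    """Return the feature name for an approach, or `default` if it is unknown."""
    return APPROACH_TO_FEATURE.get(approach, default)


print(resolve_opt("static_ipex_xpu"))
```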

{neural_compressor-2.3.2 → neural_compressor-2.4}/neural_coder/utils/numa_launcher.py RENAMED

@@ -34,7 +34,7 @@ logger = logging.getLogger(__name__)
 class CPUinfo:
-    """Get CPU inforamation, such as cores list and NUMA information."""
+    """Get CPU information, such as cores list and NUMA information."""
     def __init__(self):
         self.cpuinfo = []
@@ -460,7 +460,7 @@ class MultiInstanceLauncher(Launcher):
 class DistributedTrainingLauncher(Launcher):
-    r"""Launcher for distributed traning with MPI launcher."""
+    r"""Launcher for distributed training with MPI launcher."""
     def get_mpi_pin_domain(self, nproc_per_node, ccl_worker_count, total_cores):
         """I_MPI_PIN_DOMAIN specify the cores used for every MPI process.

{neural_compressor-2.3.2 → neural_compressor-2.4}/neural_compressor/adaptor/keras.py RENAMED

@@ -867,7 +867,7 @@ class KerasAdaptor(Adaptor):
                     for i in range(len(list_len_dataloader) - 1):
                         if list_len_dataloader[i] != list_len_dataloader[i + 1]:
                             raise AttributeError(
-                                "The traning dataloader's iteration is"
+                                "The training dataloader's iteration is"
                                 "different between processes, please reset dataloader's batch_size."
                             )
