mct-nightly 2.0.0.20240522.420__py3-none-any.whl → 2.0.0.20240523.418__py3-none-any.whl

This diff compares the contents of two publicly released versions of the package, as published to a supported registry. It is provided for informational purposes only and reflects the changes between the two package versions as they appear in their public registry.
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: mct-nightly
-Version: 2.0.0.20240522.420
+Version: 2.0.0.20240523.418
 Summary: A Model Compression Toolkit for neural networks
 Home-page: UNKNOWN
 License: UNKNOWN
@@ -33,7 +33,7 @@ This project provides researchers, developers, and engineers tools for optimizin
 
 Specifically, this project aims to apply quantization to compress neural networks.
 
-<img src="docsrc/images/mct_block_diagram.svg" width="10000">
+<img src="https://github.com/sony/model_optimization/raw/main/docsrc/images/mct_block_diagram.svg" width="10000">
 
 MCT is developed by researchers and engineers working at Sony Semiconductor Israel.
 
@@ -41,12 +41,12 @@ MCT is developed by researchers and engineers working at Sony Semiconductor Isra
 
 ## Table of Contents
 
-- [Getting Started](#getting-started)
-- [Supported features](#supported-features)
-- [Results](#results)
-- [Troubleshooting](#trouble-shooting)
-- [Contributions](#contributions)
-- [License](#license)
+- [Getting Started](https://github.com/sony/model_optimization?tab=readme-ov-file#getting-started)
+- [Supported features](https://github.com/sony/model_optimization?tab=readme-ov-file#supported-features)
+- [Results](https://github.com/sony/model_optimization?tab=readme-ov-file#results)
+- [Troubleshooting](https://github.com/sony/model_optimization?tab=readme-ov-file#trouble-shooting)
+- [Contributions](https://github.com/sony/model_optimization?tab=readme-ov-file#contributions)
+- [License](https://github.com/sony/model_optimization?tab=readme-ov-file#license)
 
 
 ## Getting Started
@@ -60,17 +60,17 @@ To install the latest stable release of MCT, run the following command:
 pip install model-compression-toolkit
 ```
 
-For installing the nightly version or installing from source, refer to the [installation guide](INSTALLATION.md).
+For installing the nightly version or installing from source, refer to the [installation guide](https://github.com/sony/model_optimization/blob/main/INSTALLATION.md).
 
 
 ### Quick start & tutorials
 
 Explore the Model Compression Toolkit (MCT) through our tutorials,
-covering compression techniques for Keras and PyTorch models. Access interactive [notebooks](tutorials/README.md)
+covering compression techniques for Keras and PyTorch models. Access interactive [notebooks](https://github.com/sony/model_optimization/blob/main/tutorials/README.md)
 for hands-on learning. For example:
-* [Keras MobileNetV2 post training quantization](tutorials/notebooks/imx500_notebooks/keras/example_keras_mobilenetv2_for_imx500.ipynb)
-* [Post training quantization with PyTorch](tutorials/notebooks/mct_features_notebooks/pytorch/example_pytorch_ptq_mnist.ipynb)
-* [Data Generation for ResNet18 with PyTorch](tutorials/notebooks/mct_features_notebooks/pytorch/example_pytorch_data_generation.ipynb).
+* [Keras MobileNetV2 post training quantization](https://github.com/sony/model_optimization/blob/main/tutorials/notebooks/imx500_notebooks/keras/example_keras_mobilenetv2_for_imx500.ipynb)
+* [Post training quantization with PyTorch](https://github.com/sony/model_optimization/blob/main/tutorials/notebooks/mct_features_notebooks/pytorch/example_pytorch_ptq_mnist.ipynb)
+* [Data Generation for ResNet18 with PyTorch](https://github.com/sony/model_optimization/blob/main/tutorials/notebooks/mct_features_notebooks/pytorch/example_pytorch_data_generation.ipynb).
 
 
 ### Supported Versions
@@ -94,15 +94,15 @@ Currently, MCT is being tested on various Python, Pytorch and TensorFlow version
 ## Supported Features
 MCT offers a range of powerful features to optimize neural network models for efficient deployment. These supported features include:
 
-### Data Generation [*](#experimental-features)
+### Data Generation [*](https://github.com/sony/model_optimization?tab=readme-ov-file#experimental-features)
 MCT provides tools for generating synthetic images based on the statistics stored in a model's batch normalization layers. These generated images are valuable for various compression tasks where image data is required, such as quantization and pruning.
-You can customize data generation configurations to suit your specific needs. [Go to the Data Generation page.](model_compression_toolkit/data_generation/README.md)
+You can customize data generation configurations to suit your specific needs. [Go to the Data Generation page.](https://github.com/sony/model_optimization/blob/main/model_compression_toolkit/data_generation/README.md)
 
 ### Quantization
 MCT supports different quantization methods:
 * Post-training quantization (PTQ): [Keras API](https://sony.github.io/model_optimization/docs/api/api_docs/methods/keras_post_training_quantization.html), [PyTorch API](https://sony.github.io/model_optimization/docs/api/api_docs/methods/pytorch_post_training_quantization.html)
 * Gradient-based post-training quantization (GPTQ): [Keras API](https://sony.github.io/model_optimization/docs/api/api_docs/methods/keras_gradient_post_training_quantization.html), [PyTorch API](https://sony.github.io/model_optimization/docs/api/api_docs/methods/pytorch_gradient_post_training_quantization.html)
-* Quantization-aware training (QAT) [*](#experimental-features)
+* Quantization-aware training (QAT) [*](https://github.com/sony/model_optimization?tab=readme-ov-file#experimental-features)
 
 
 | Quantization Method | Complexity | Computational Cost |
@@ -124,20 +124,20 @@ Main features:
 * <ins>Advanced quantization algorithms:</ins> To prevent a performance degradation some algorithms are applied such as:
   * <ins>Shift negative correction:</ins> Symmetric activation quantization can hurt the model's performance when some layers output both negative and positive activations, but their range is asymmetric. For more details please visit [1].
   * <ins>Outliers filtering:</ins> Computing z-score for activation statistics to detect and remove outliers.
-  * <ins>Clustering:</ins> Using non-uniform quantization grid to quantize the weights and activations to match their distributions.[*](#experimental-features)
+  * <ins>Clustering:</ins> Using non-uniform quantization grid to quantize the weights and activations to match their distributions.[*](https://github.com/sony/model_optimization?tab=readme-ov-file#experimental-features)
 * <ins>Mixed-precision search:</ins> Assigning quantization bit-width per layer (for weights/activations), based on the layer's sensitivity to different bit-widths.
 * <ins>Visualization:</ins> You can use TensorBoard to observe useful information for troubleshooting the quantized model's performance (for example, the model in different phases of the quantization, collected statistics, similarity between layers of the float and quantized model and bit-width configuration for mixed-precision quantization). For more details, please read the [visualization documentation](https://sony.github.io/model_optimization/docs/guidelines/visualization.html).
-* <ins>Target Platform Capabilities:</ins> The Target Platform Capabilities (TPC) describes the target platform (an edge device with dedicated hardware). For more details, please read the [TPC README](model_compression_toolkit/target_platform_capabilities/README.md).
+* <ins>Target Platform Capabilities:</ins> The Target Platform Capabilities (TPC) describes the target platform (an edge device with dedicated hardware). For more details, please read the [TPC README](https://github.com/sony/model_optimization/blob/main/model_compression_toolkit/target_platform_capabilities/README.md).
 
 ### Enhanced Post-Training Quantization (EPTQ)
 As part of the GPTQ we provide an advanced optimization algorithm called EPTQ.
 
 The specifications of the algorithm are detailed in the paper: _"**EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian**"_ [4].
 
-More details on the how to use EPTQ via MCT can be found in the [EPTQ guidelines](model_compression_toolkit/gptq/README.md).
+More details on the how to use EPTQ via MCT can be found in the [EPTQ guidelines](https://github.com/sony/model_optimization/blob/main/model_compression_toolkit/gptq/README.md).
 
 
-### Structured Pruning [*](#experimental-features)
+### Structured Pruning [*](https://github.com/sony/model_optimization?tab=readme-ov-file#experimental-features)
 MCT introduces a structured and hardware-aware model pruning.
 This pruning technique is designed to compress models for specific hardware architectures,
 taking into account the target platform's Single Instruction, Multiple Data (SIMD) capabilities.
@@ -159,7 +159,7 @@ For more details, we highly recommend visiting our project website where experim
 Graph of [MobileNetV2](https://keras.io/api/applications/mobilenet/) accuracy on ImageNet vs average bit-width of weights, using
 single-precision quantization, mixed-precision quantization, and mixed-precision quantization with GPTQ.
 
-<img src="docsrc/images/mbv2_accuracy_graph.png">
+<img src="https://github.com/sony/model_optimization/raw/main/docsrc/images/mbv2_accuracy_graph.png">
 
 For more results, please see [1]
 
@@ -195,11 +195,11 @@ Check out the [FAQ](https://github.com/sony/model_optimization/tree/main/FAQ.md)
 ## Contributions
 MCT aims at keeping a more up-to-date fork and welcomes contributions from anyone.
 
-*You will find more information about contributions in the [Contribution guide](CONTRIBUTING.md).
+*You will find more information about contributions in the [Contribution guide](https://github.com/sony/model_optimization/blob/main/CONTRIBUTING.md).
 
 
 ## License
-[Apache License 2.0](LICENSE.md).
+[Apache License 2.0](https://github.com/sony/model_optimization/blob/main/LICENSE.md).
 
 ## References
 
@@ -1,4 +1,4 @@
-model_compression_toolkit/__init__.py,sha256=v8YoSPNYRn91dEf88cagXX095viQMwtZN9gYLQtPGDk,1573
+model_compression_toolkit/__init__.py,sha256=VlsXDtXIZi9zj6JZlqapLpIk6ac53EqfYOnxf_yFqtY,1573
 model_compression_toolkit/constants.py,sha256=b63Jk_bC7VXEX3Qn9TZ3wUvrNKD8Mkz8zIuayoyF5eU,3828
 model_compression_toolkit/defaultdict.py,sha256=LSc-sbZYXENMCw3U9F4GiXuv67IKpdn0Qm7Fr11jy-4,2277
 model_compression_toolkit/logger.py,sha256=3DByV41XHRR3kLTJNbpaMmikL8icd9e1N-nkQAY9oDk,4567
@@ -10,7 +10,7 @@ model_compression_toolkit/core/quantization_prep_runner.py,sha256=0ga95vh_ZXO79r
 model_compression_toolkit/core/runner.py,sha256=yref5I8eUo2A4hAmc4bOQOj6lUZRDQjLQR_5lJCjXiQ,12696
 model_compression_toolkit/core/common/__init__.py,sha256=Wh127PbXcETZX_d1PQqZ71ETK3J9XO5A-HpadGUbj6o,1447
 model_compression_toolkit/core/common/base_substitutions.py,sha256=xDFSmVVs_iFSZfajytI0cuQaNRNcwHX3uqOoHgVUvxQ,1666
-model_compression_toolkit/core/common/framework_implementation.py,sha256=pOT9ZmRFL9FY92uUtigrO3sbWGiyVDhHAM1fbA4b5yo,20752
+model_compression_toolkit/core/common/framework_implementation.py,sha256=8b6M1GcUR9bDgoxwqyNP8C6KSU9OTQ5hIk20Y74eLPo,20896
 model_compression_toolkit/core/common/framework_info.py,sha256=1ZMMGS9ip-kSflqkartyNRt9aQ5ub1WepuTRcTy-YSQ,6337
 model_compression_toolkit/core/common/memory_computation.py,sha256=ixoSpV5ZYZGyzhre3kQcvR2sNA8KBsPZ3lgbkDnw9Cs,1205
 model_compression_toolkit/core/common/model_builder_mode.py,sha256=jll9-59OPaE3ug7Y9-lLyV99_FoNHxkGZMgcm0Vkpss,1324
@@ -31,7 +31,7 @@ model_compression_toolkit/core/common/fusion/__init__.py,sha256=Rf1RcYmelmdZmBV5
 model_compression_toolkit/core/common/fusion/layer_fusing.py,sha256=lOubqpc18TslhXZijWUJQAa1c3jIB2S-M-5HK78wJPQ,5548
 model_compression_toolkit/core/common/graph/__init__.py,sha256=Xr-Lt_qXMdrCnnOaUS_OJP_3iTTGfPCLf8_vSrQgCs0,773
 model_compression_toolkit/core/common/graph/base_graph.py,sha256=lmIw0srKiwCvz7KWqfwKTxyQHDy3s6rWMIXzFAa1UMo,38326
-model_compression_toolkit/core/common/graph/base_node.py,sha256=WGkSxjvbRLQBfFT_yrSQRnlmUpwtqkUfpVwrhLgMw5k,29338
+model_compression_toolkit/core/common/graph/base_node.py,sha256=exvUkLDChl6YaoaQRHgSrettsgOsd18bfq01tPxXr-4,29722
 model_compression_toolkit/core/common/graph/edge.py,sha256=buoSEUZwilWBK3WeBKpJ-GeDaUA1SDdOHxDpxU_bGpk,3784
 model_compression_toolkit/core/common/graph/functional_node.py,sha256=71_4TrCdqR_r0mtgxmAyqI05iP5YoQQGeSmDgynuzTw,3902
 model_compression_toolkit/core/common/graph/graph_matchers.py,sha256=CrDoHYq4iPaflgJWmoJ1K4ziLrRogJvFTVWg8P0UcDU,4744
@@ -64,7 +64,7 @@ model_compression_toolkit/core/common/mixed_precision/distance_weighting.py,sha2
 model_compression_toolkit/core/common/mixed_precision/mixed_precision_quantization_config.py,sha256=DP5tcxPtiVbSWAeoFbEp7iTwpxDBU1g7V5w7ehDG6jI,4573
 model_compression_toolkit/core/common/mixed_precision/mixed_precision_search_facade.py,sha256=JmHopRNpHjxnoyeqXRVO0t-DdqEOm-jOZI06w5aAl9k,7550
 model_compression_toolkit/core/common/mixed_precision/mixed_precision_search_manager.py,sha256=TTTux4YiOnQqt-2h7Y38959XaDwNZc0eufLMx_yws5U,37578
-model_compression_toolkit/core/common/mixed_precision/sensitivity_evaluation.py,sha256=oPKsZj8O5ysQpzvO-ZTP6JoG_-GsUa6r2D7F6hGoDFM,28519
+model_compression_toolkit/core/common/mixed_precision/sensitivity_evaluation.py,sha256=DKaxU9MD97J0yYJOCkhtQUrJLD_xrp0TK7mtcZEp1oA,28940
 model_compression_toolkit/core/common/mixed_precision/set_layer_to_bitwidth.py,sha256=P8QtKgFXtt5b2RoubzI5OGlCfbEfZsAirjyrkFzK26A,2846
 model_compression_toolkit/core/common/mixed_precision/solution_refinement_procedure.py,sha256=KifDMbm7qkSfvSl6pcZzQ82naIXzeKL6aT-VsvWZYyc,7901
 model_compression_toolkit/core/common/mixed_precision/resource_utilization_tools/__init__.py,sha256=Rf1RcYmelmdZmBV5qOKvKWF575ofc06JFQSq83Jz99A,696
@@ -150,7 +150,7 @@ model_compression_toolkit/core/keras/__init__.py,sha256=mjbqLD-KcG3eNeCYpu1GBS7V
 model_compression_toolkit/core/keras/constants.py,sha256=Uv3c0UdW55pIVQNW_1HQlgl-dHXREkltOLyzp8G1mTQ,3163
 model_compression_toolkit/core/keras/custom_layer_validation.py,sha256=f-b14wuiIgitBe7d0MmofYhDCTO3IhwJgwrh-Hq_t_U,1192
 model_compression_toolkit/core/keras/default_framework_info.py,sha256=HcHplb7IcnOTyK2p6uhp3OVG4-RV3RDo9C_4evaIzkQ,4981
-model_compression_toolkit/core/keras/keras_implementation.py,sha256=CijrPTyh28Up9-_YYrGNxaflLMAK5CzbXMraAGnX6l4,29716
+model_compression_toolkit/core/keras/keras_implementation.py,sha256=bRH39d4lW7Ngm8xi7v9JQd9gNfGlB_lb-bolbzTYUcc,29881
 model_compression_toolkit/core/keras/keras_model_validation.py,sha256=1wNV2clFdC9BzIELRLSO2uKf0xqjLqlkTJudwtCeaJk,1722
 model_compression_toolkit/core/keras/keras_node_prior_info.py,sha256=HUmzEXDQ8LGX7uOYSRiLZ2TNbYxLX9J9IeAa6QYlifg,3927
 model_compression_toolkit/core/keras/resource_utilization_data_facade.py,sha256=Xmk2ZL5CaYdb7iG62HdtZ1F64vap7ffnrsuR3e3G5hc,4851
@@ -213,7 +213,7 @@ model_compression_toolkit/core/pytorch/__init__.py,sha256=Rf1RcYmelmdZmBV5qOKvKW
 model_compression_toolkit/core/pytorch/constants.py,sha256=NI-J7REuxn06oEIHsmJ4GqtNC3TbV8xlkJjt5Ar-c4U,2626
 model_compression_toolkit/core/pytorch/default_framework_info.py,sha256=r1XyzUFvrjGcJHQM5ETLsMZIG2yHCr9HMjqf0ti9inw,4175
 model_compression_toolkit/core/pytorch/pytorch_device_config.py,sha256=S25cuw10AW3SEN_fRAGRcG_I3wdvvQx1ehSJzPnn-UI,4404
-model_compression_toolkit/core/pytorch/pytorch_implementation.py,sha256=sEtlxpWdt0rzuTN3R0bNCC_l75Xy7rIBMUWY7LuhYKI,27351
+model_compression_toolkit/core/pytorch/pytorch_implementation.py,sha256=Qe0GCbXsq8hqheMwZaZGl5caWK59RY4ldL5aJWcCmQ8,27516
 model_compression_toolkit/core/pytorch/pytorch_node_prior_info.py,sha256=2LDQ7qupglHQ7o1Am7LWdfYVacfQnl-aW2N6l9det1w,3264
 model_compression_toolkit/core/pytorch/resource_utilization_data_facade.py,sha256=E6ifk1HdO60k4IRH2EFBzAYWtwUlrGqJoQ66nknpHoQ,4983
 model_compression_toolkit/core/pytorch/utils.py,sha256=OT_mrNEJqPgWLdtQuivKMQVjtJY49cmoIVvbRhANl1w,3004
@@ -222,7 +222,7 @@ model_compression_toolkit/core/pytorch/back2framework/factory_model_builder.py,s
 model_compression_toolkit/core/pytorch/back2framework/float_model_builder.py,sha256=tLrlUyYhxVKVjkad1ZAtbRra0HedB3iVfIkZ_dYnQ-4,3419
 model_compression_toolkit/core/pytorch/back2framework/instance_builder.py,sha256=BBHBfTqeWm7L3iDyPBpk0jxvj-rBg1QWI23imkjfIl0,1467
 model_compression_toolkit/core/pytorch/back2framework/mixed_precision_model_builder.py,sha256=D7lU1r9Uq_7fdNuKk2BMF8ho5GrsY-8gyGN6yYoHaVg,15060
-model_compression_toolkit/core/pytorch/back2framework/pytorch_model_builder.py,sha256=5fPlI4BttvQB-gm0iKcWNXZMGjXlcwfXEsxxW0TilTQ,18301
+model_compression_toolkit/core/pytorch/back2framework/pytorch_model_builder.py,sha256=iswwKSTVGJKkYDBiVzs5L0sw2zYax11UfInbelkgU1k,18258
 model_compression_toolkit/core/pytorch/back2framework/quantized_model_builder.py,sha256=qZNNOlNTTV4ZKPG3q5GDXkIVTPUEr8dvxAS_YiMORmg,3456
 model_compression_toolkit/core/pytorch/back2framework/quantization_wrapper/__init__.py,sha256=cco4TmeIDIh32nj9ZZXVkws4dd9F2UDrmjKzTN8G0V0,697
 model_compression_toolkit/core/pytorch/back2framework/quantization_wrapper/quantized_layer_wrapper.py,sha256=q2JDw10NKng50ee2i9faGzWZ-IydnR2aOMGSn9RoZmc,5773
@@ -483,8 +483,8 @@ model_compression_toolkit/trainable_infrastructure/keras/quantize_wrapper.py,sha
 model_compression_toolkit/trainable_infrastructure/keras/quantizer_utils.py,sha256=MVwXNymmFRB2NXIBx4e2mdJ1RfoHxRPYRgjb1MQP5kY,1797
 model_compression_toolkit/trainable_infrastructure/pytorch/__init__.py,sha256=huHoBUcKNB6BnY6YaUCcFvdyBtBI172ZoUD8ZYeNc6o,696
 model_compression_toolkit/trainable_infrastructure/pytorch/base_pytorch_quantizer.py,sha256=MxylaVFPgN7zBiRBy6WV610EA4scLgRJFbMucKvvNDU,2896
-mct_nightly-2.0.0.20240522.420.dist-info/LICENSE.md,sha256=aYSSIb-5AFPeITTvXm1UAoe0uYBiMmSS8flvXaaFUks,10174
-mct_nightly-2.0.0.20240522.420.dist-info/METADATA,sha256=mVKhgUzRrPWgjkKKh7j45yHheRKd4MwcZtp8-hHALBM,18477
-mct_nightly-2.0.0.20240522.420.dist-info/WHEEL,sha256=GJ7t_kWBFywbagK5eo9IoUwLW6oyOeTKmQ-9iHFVNxQ,92
-mct_nightly-2.0.0.20240522.420.dist-info/top_level.txt,sha256=gsYA8juk0Z-ZmQRKULkb3JLGdOdz8jW_cMRjisn9ga4,26
-mct_nightly-2.0.0.20240522.420.dist-info/RECORD,,
+mct_nightly-2.0.0.20240523.418.dist-info/LICENSE.md,sha256=aYSSIb-5AFPeITTvXm1UAoe0uYBiMmSS8flvXaaFUks,10174
+mct_nightly-2.0.0.20240523.418.dist-info/METADATA,sha256=36VhoCVFIkO7ER4bNkD_NSFWrAfHHcCy-N7xPFQaqb8,19721
+mct_nightly-2.0.0.20240523.418.dist-info/WHEEL,sha256=GJ7t_kWBFywbagK5eo9IoUwLW6oyOeTKmQ-9iHFVNxQ,92
+mct_nightly-2.0.0.20240523.418.dist-info/top_level.txt,sha256=gsYA8juk0Z-ZmQRKULkb3JLGdOdz8jW_cMRjisn9ga4,26
+mct_nightly-2.0.0.20240523.418.dist-info/RECORD,,
@@ -27,4 +27,4 @@ from model_compression_toolkit import data_generation
 from model_compression_toolkit import pruning
 from model_compression_toolkit.trainable_infrastructure.keras.load_model import keras_load_quantized_model
 
-__version__ = "2.0.0.20240522.000420"
+__version__ = "2.0.0.20240523.000418"
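Note that the in-package version string zero-pads the build number (000418) relative to the wheel tag (418). To check which nightly build is installed:

```python
import model_compression_toolkit as mct

# Prints the build string set in __init__.py above, e.g. "2.0.0.20240523.000418".
print(mct.__version__)
```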
@@ -348,13 +348,14 @@ class FrameworkImplementation(ABC):
         raise NotImplemented(f'{self.__class__.__name__} have to implement the '
                              f'framework\'s count_node_for_mixed_precision_interest_points method.') # pragma: no cover
 
-    def get_node_distance_fn(self, layer_class: type,
+    def get_mp_node_distance_fn(self, layer_class: type,
                              framework_attrs: Dict[str, Any],
                              compute_distance_fn: Callable = None,
-                             axis: int = None) -> Callable:
+                             axis: int = None,
+                             norm_mse: bool = False) -> Callable:
         """
         A mapping between layers' types and a distance function for computing the distance between
-        two tensors (for loss computation purposes). Returns a specific function if node of specific types is
+        two tensors in mixed precision (for loss computation purposes). Returns a specific function if node of specific types is
         given, or a default (normalized MSE) function otherwise.
 
         Args:
@@ -362,12 +363,13 @@ class FrameworkImplementation(ABC):
             framework_attrs: Framework attributes the layer had which the graph node holds.
             compute_distance_fn: An optional distance function to use globally for all nodes.
             axis: The axis on which the operation is preformed (if specified).
+            norm_mse: whether to normalize mse distance function.
 
         Returns: A distance function between two tensors.
         """
 
         raise NotImplemented(f'{self.__class__.__name__} have to implement the '
-                             f'framework\'s get_node_distance_fn method.') # pragma: no cover
+                             f'framework\'s get_mp_node_distance_fn method.') # pragma: no cover
 
 
     @abstractmethod
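This hunk is from model_compression_toolkit/core/common/framework_implementation.py (see the RECORD change above): the abstract get_node_distance_fn becomes get_mp_node_distance_fn and gains a norm_mse flag. As a toy sketch of the dispatch contract the framework implementations further below satisfy, using hypothetical compute_mse/compute_cs stand-ins rather than MCT's actual helpers:

```python
from functools import partial
from typing import Any, Callable, Dict

# Hypothetical stand-ins for MCT's distance helpers (illustration only).
def compute_mse(x, y, norm: bool = False) -> float: ...
def compute_cs(x, y) -> float: ...

def get_mp_node_distance_fn(layer_class: type,
                            framework_attrs: Dict[str, Any],
                            compute_distance_fn: Callable = None,
                            axis: int = None,
                            norm_mse: bool = False) -> Callable:
    # A globally supplied distance function overrides per-type dispatch.
    if compute_distance_fn is not None:
        return compute_distance_fn
    # Per-type special cases would go here (e.g. cosine similarity for Dense);
    # everything else falls back to (optionally normalized) MSE, mirroring
    # the `partial(compute_mse, norm=norm_mse)` returns in the hunks below.
    return partial(compute_mse, norm=norm_mse)
```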
@@ -238,8 +238,12 @@ class BaseNode:
         """
         for pos, weight in sorted((pos, weight) for pos, weight in self.weights.items()
                                   if isinstance(pos, int)):
-            assert pos <= len(input_tensors), 'Positional weight index mismatch'
-            input_tensors.insert(pos, weight)
+            if pos > len(input_tensors):
+                Logger.critical("The positional weight index cannot exceed the number of input tensors to the node.") # pragma: no cover
+            # Insert only positional weights that are not subject to quantization. If the positional weight is
+            # subject to quantization, the quantization wrapper inserts the positional weight into the node.
+            if not self.is_weights_quantization_enabled(pos):
+                input_tensors.insert(pos, weight)
 
         return input_tensors
 
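This hunk is from base_node.py. As a toy illustration of the new rule with hypothetical values (plain strings, not MCT objects): a positional weight is spliced into the node's input list only when its index is in range and the quantization wrapper does not already handle it:

```python
# Toy illustration of the guarded insert; all data below is hypothetical.
input_tensors = ["act_0", "act_1"]      # activation inputs to the node
positional_weights = {1: "w_pos_1"}     # position -> constant weight
quantized_positions = set()             # e.g. {1} if the wrapper injects it

for pos, weight in sorted(positional_weights.items()):
    if pos > len(input_tensors):
        raise ValueError("Positional weight index exceeds the input count.")
    if pos not in quantized_positions:  # quantized weights are inserted by the wrapper
        input_tensors.insert(pos, weight)

print(input_tensors)  # ['act_0', 'w_pos_1', 'act_1']
```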
@@ -89,10 +89,13 @@ class SensitivityEvaluation:
             fw_impl.count_node_for_mixed_precision_interest_points,
             quant_config.num_interest_points_factor)
 
-        self.ips_distance_fns, self.ips_axis = self._init_metric_points_lists(self.interest_points)
+        # We use normalized MSE when not running hessian-based. For Hessian-based normalized MSE is not needed
+        # beacause hessian weights already do normalization.
+        use_normalized_mse = self.quant_config.use_hessian_based_scores is False
+        self.ips_distance_fns, self.ips_axis = self._init_metric_points_lists(self.interest_points, use_normalized_mse)
 
         self.output_points = get_output_nodes_for_metric(graph)
-        self.out_ps_distance_fns, self.out_ps_axis = self._init_metric_points_lists(self.output_points)
+        self.out_ps_distance_fns, self.out_ps_axis = self._init_metric_points_lists(self.output_points, use_normalized_mse)
 
         # Setting lists with relative position of the interest points
         # and output points in the list of all mp model activation tensors
@@ -128,7 +131,7 @@ class SensitivityEvaluation:
             self.interest_points_hessians = self._compute_hessian_based_scores()
             self.quant_config.distance_weighting_method = lambda d: self.interest_points_hessians
 
-    def _init_metric_points_lists(self, points: List[BaseNode]) -> Tuple[List[Callable], List[int]]:
+    def _init_metric_points_lists(self, points: List[BaseNode], norm_mse: bool = False) -> Tuple[List[Callable], List[int]]:
         """
         Initiates required lists for future use when computing the sensitivity metric.
         Each point on which the metric is computed uses a dedicated distance function based on its type.
@@ -136,6 +139,7 @@ class SensitivityEvaluation:
 
         Args:
             points: The set of nodes in the graph for which we need to initiate the lists.
+            norm_mse: whether to normalize mse distance function.
 
         Returns: A lists with distance functions and an axis list for each node.
 
@@ -144,11 +148,12 @@ class SensitivityEvaluation:
         axis_list = []
         for n in points:
             axis = n.framework_attr.get(AXIS) if not isinstance(n, FunctionalNode) else n.op_call_kwargs.get(AXIS)
-            distance_fn = self.fw_impl.get_node_distance_fn(
+            distance_fn = self.fw_impl.get_mp_node_distance_fn(
                 layer_class=n.layer_class,
                 framework_attrs=n.framework_attr,
                 compute_distance_fn=self.quant_config.compute_distance_fn,
-                axis=axis)
+                axis=axis,
+                norm_mse=norm_mse)
             distance_fns_list.append(distance_fn)
             # Axis is needed only for KL Divergence calculation, otherwise we use per-tensor computation
             axis_list.append(axis if distance_fn==compute_kl_divergence else None)
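Per the comments introduced in the sensitivity_evaluation.py hunks above, Hessian-based scoring already normalizes the per-point distances, so normalized MSE is requested only on the non-Hessian path. As a sketch of what such a normalization typically looks like (MCT's compute_mse(norm=True) may differ in detail):

```python
import numpy as np

def normalized_mse(float_tensor: np.ndarray, quant_tensor: np.ndarray,
                   eps: float = 1e-8) -> float:
    # MSE scaled by the reference tensor's mean power, so layers with large
    # activations do not dominate the sensitivity metric. Sketch only; not
    # necessarily the normalization MCT applies internally.
    mse = np.mean((float_tensor - quant_tensor) ** 2)
    return float(mse / (np.mean(float_tensor ** 2) + eps))
```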
@@ -421,13 +421,14 @@ class KerasImplementation(FrameworkImplementation):
 
         return False
 
-    def get_node_distance_fn(self, layer_class: type,
+    def get_mp_node_distance_fn(self, layer_class: type,
                              framework_attrs: Dict[str, Any],
                              compute_distance_fn: Callable = None,
-                             axis: int = None) -> Callable:
+                             axis: int = None,
+                             norm_mse: bool = False) -> Callable:
         """
         A mapping between layers' types and a distance function for computing the distance between
-        two tensors (for loss computation purposes). Returns a specific function if node of specific types is
+        two tensors in mixed precision (for loss computation purposes). Returns a specific function if node of specific types is
         given, or a default (normalized MSE) function otherwise.
 
         Args:
@@ -435,6 +436,7 @@ class KerasImplementation(FrameworkImplementation):
             framework_attrs: Framework attributes the layer had which the graph node holds.
             compute_distance_fn: An optional distance function to use globally for all nodes.
             axis: The axis on which the operation is preformed (if specified).
+            norm_mse: whether to normalize mse distance function.
 
         Returns: A distance function between two tensors.
         """
@@ -456,7 +458,7 @@ class KerasImplementation(FrameworkImplementation):
                 return compute_cs
         elif layer_class == Dense:
             return compute_cs
-        return compute_mse
+        return partial(compute_mse, norm=norm_mse)
 
     def get_trace_hessian_calculator(self,
                                      graph: Graph,
@@ -67,8 +67,7 @@ def _build_input_tensors_list(node: BaseNode,
             _input_tensors = node_to_output_tensors_dict[ie.source_node]
             input_tensors.append(_input_tensors)
     input_tensors = [tensor for tensor_list in input_tensors for tensor in tensor_list] # flat list of lists
-    if not is_op_quantize_wrapper:
-        input_tensors = node.insert_positional_weights_to_input_list(input_tensors)
+    input_tensors = node.insert_positional_weights_to_input_list(input_tensors)
     # convert inputs from positional weights (numpy arrays) to tensors. Must handle each element in the
     # list separately, because in FX the tensors are FX objects and fail to_torch_tensor
     input_tensors = [to_torch_tensor(t, numpy_type=t.dtype) if isinstance(t, np.ndarray) else t
@@ -403,13 +403,14 @@ class PytorchImplementation(FrameworkImplementation):
                 return True
         return False
 
-    def get_node_distance_fn(self, layer_class: type,
+    def get_mp_node_distance_fn(self, layer_class: type,
                              framework_attrs: Dict[str, Any],
                              compute_distance_fn: Callable = None,
-                             axis: int = None) -> Callable:
+                             axis: int = None,
+                             norm_mse: bool = False) -> Callable:
         """
         A mapping between layers' types and a distance function for computing the distance between
-        two tensors (for loss computation purposes). Returns a specific function if node of specific types is
+        two tensors in mixed precision (for loss computation purposes). Returns a specific function if node of specific types is
         given, or a default (normalized MSE) function otherwise.
 
         Args:
@@ -417,6 +418,7 @@ class PytorchImplementation(FrameworkImplementation):
             framework_attrs: Framework attributes the layer had which the graph node holds.
             compute_distance_fn: An optional distance function to use globally for all nodes.
             axis: The axis on which the operation is preformed (if specified).
+            norm_mse: whether to normalize mse distance function.
 
         Returns: A distance function between two tensors.
         """
@@ -430,7 +432,7 @@ class PytorchImplementation(FrameworkImplementation):
             return compute_cs
         elif layer_class == Linear:
             return compute_cs
-        return compute_mse
+        return partial(compute_mse, norm=norm_mse)
 
     def is_output_node_compatible_for_hessian_score_computation(self,
                                                                 node: BaseNode) -> bool: