mct-nightly 1.11.0.20240307.post318-py3-none-any.whl → 1.11.0.20240309.post349-py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {mct_nightly-1.11.0.20240307.post318.dist-info → mct_nightly-1.11.0.20240309.post349.dist-info}/METADATA +13 -13
- {mct_nightly-1.11.0.20240307.post318.dist-info → mct_nightly-1.11.0.20240309.post349.dist-info}/RECORD +20 -20
- model_compression_toolkit/__init__.py +0 -19
- model_compression_toolkit/core/common/network_editors/actions.py +1 -1
- model_compression_toolkit/core/common/quantization/quantization_config.py +1 -1
- model_compression_toolkit/core/keras/graph_substitutions/substitutions/separableconv_decomposition.py +4 -2
- model_compression_toolkit/data_generation/__init__.py +1 -1
- model_compression_toolkit/data_generation/keras/keras_data_generation.py +11 -6
- model_compression_toolkit/data_generation/pytorch/pytorch_data_generation.py +6 -0
- model_compression_toolkit/gptq/__init__.py +1 -1
- model_compression_toolkit/gptq/common/gptq_config.py +1 -4
- model_compression_toolkit/gptq/pytorch/quantization_facade.py +5 -1
- model_compression_toolkit/pruning/keras/pruning_facade.py +5 -1
- model_compression_toolkit/pruning/pytorch/pruning_facade.py +5 -1
- model_compression_toolkit/qat/__init__.py +2 -2
- model_compression_toolkit/qat/keras/quantization_facade.py +25 -15
- model_compression_toolkit/qat/pytorch/quantization_facade.py +24 -15
- {mct_nightly-1.11.0.20240307.post318.dist-info → mct_nightly-1.11.0.20240309.post349.dist-info}/LICENSE.md +0 -0
- {mct_nightly-1.11.0.20240307.post318.dist-info → mct_nightly-1.11.0.20240309.post349.dist-info}/WHEEL +0 -0
- {mct_nightly-1.11.0.20240307.post318.dist-info → mct_nightly-1.11.0.20240309.post349.dist-info}/top_level.txt +0 -0
{mct_nightly-1.11.0.20240307.post318.dist-info → mct_nightly-1.11.0.20240309.post349.dist-info}/METADATA

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: mct-nightly
-Version: 1.11.0.20240307.post318
+Version: 1.11.0.20240309.post349
 Summary: A Model Compression Toolkit for neural networks
 Home-page: UNKNOWN
 License: UNKNOWN
@@ -95,7 +95,7 @@ Currently, MCT is being tested on various Python, Pytorch and TensorFlow version
 ## Supported Features
 MCT offers a range of powerful features to optimize neural network models for efficient deployment. These supported features include:
 
-### Data Generation
+### Data Generation [*](#experimental-features)
 MCT provides tools for generating synthetic images based on the statistics stored in a model's batch normalization layers. These generated images are valuable for various compression tasks where image data is required, such as quantization and pruning.
 You can customize data generation configurations to suit your specific needs. [Go to the Data Generation page.](model_compression_toolkit/data_generation/README.md)
 
@@ -103,7 +103,7 @@ You can customize data generation configurations to suit your specific needs. [G
 MCT supports different quantization methods:
 * Post-training quantization (PTQ): [Keras API](https://sony.github.io/model_optimization/docs/api/experimental_api_docs/methods/keras_post_training_quantization_experimental.html#ug-keras-post-training-quantization-experimental), [PyTorch API](https://sony.github.io/model_optimization/docs/api/experimental_api_docs/methods/pytorch_post_training_quantization_experimental.html#ug-pytorch-post-training-quantization-experimental)
 * Gradient-based post-training quantization (GPTQ): [Keras API](https://sony.github.io/model_optimization/docs/api/experimental_api_docs/methods/keras_gradient_post_training_quantization_experimental.html#ug-keras-gradient-post-training-quantization-experimental), [PyTorch API](https://sony.github.io/model_optimization/docs/api/experimental_api_docs/methods/pytorch_gradient_post_training_quantization_experimental.html#ug-pytorch-gradient-post-training-quantization-experimental)
-* Quantization-aware training (QAT)[*](#experimental-features)
+* Quantization-aware training (QAT) [*](#experimental-features)
 
 
 | Quantization Method | Complexity | Computational Cost |
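As context for the PTQ entry above, a minimal usage sketch of the experimental Keras facade named in the link (`keras_post_training_quantization_experimental`); the random batch is a stand-in for real calibration images:

```python
import numpy as np
import model_compression_toolkit as mct
from tensorflow.keras.applications import MobileNetV2

model = MobileNetV2()

# Toy representative dataset generator; real calibration images go here.
def repr_datagen():
    yield [np.random.random((1, 224, 224, 3))]

# Quantize with the default 8-bit configuration.
quantized_model, quantization_info = mct.ptq.keras_post_training_quantization_experimental(
    model, repr_datagen)
```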
@@ -138,6 +138,15 @@ The specifications of the algorithm are detailed in the paper: _"**EPTQ: Enhance
 More details on the how to use EPTQ via MCT can be found in the [EPTQ guidelines](model_compression_toolkit/gptq/README.md).
 
 
+### Structured Pruning [*](#experimental-features)
+MCT introduces a structured and hardware-aware model pruning.
+This pruning technique is designed to compress models for specific hardware architectures,
+taking into account the target platform's Single Instruction, Multiple Data (SIMD) capabilities.
+By pruning groups of channels (SIMD groups), our approach not only reduces model size
+and complexity, but ensures that better utilization of channels is in line with the SIMD architecture
+for a target KPI of weights memory footprint.
+[Keras API](https://sony.github.io/model_optimization/docs/api/experimental_api_docs/methods/keras_pruning_experimental.html)
+[Pytorch API](https://github.com/sony/model_optimization/blob/main/model_compression_toolkit/pruning/pytorch/pruning_facade.py#L43)
 
 #### Experimental features
 
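A minimal sketch of the pruning flow this section describes, assuming the `keras_pruning_experimental` facade linked above and the `mct.core.KPI` target used in the pruning facade docstrings later in this diff:

```python
import numpy as np
import model_compression_toolkit as mct
from tensorflow.keras.applications import MobileNetV2

model = MobileNetV2()

# Target KPI: keep 50% of the dense model's float32 weights memory
# (4 bytes per parameter), as in the facade docstring example.
dense_nparams = sum([l.count_params() for l in model.layers])
target_kpi = mct.core.KPI(weights_memory=dense_nparams * 4 * 0.5)

def repr_datagen():
    yield [np.random.random((1, 224, 224, 3))]

pruned_model, pruning_info = mct.pruning.keras_pruning_experimental(
    model=model, target_kpi=target_kpi, representative_data_gen=repr_datagen)
```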
@@ -167,17 +176,8 @@ In the following table we present the ImageNet validation results for these mode
 
 For more results, please refer to [quick start](https://github.com/sony/model_optimization/tree/main/tutorials/quick_start).
 
-### Structured Pruning
-MCT introduces a structured and hardware-aware model pruning.
-This pruning technique is designed to compress models for specific hardware architectures,
-taking into account the target platform's Single Instruction, Multiple Data (SIMD) capabilities.
-By pruning groups of channels (SIMD groups), our approach not only reduces model size
-and complexity, but ensures that better utilization of channels is in line with the SIMD architecture
-for a target KPI of weights memory footprint.
-[Keras API](https://sony.github.io/model_optimization/docs/api/experimental_api_docs/methods/keras_pruning_experimental.html)
-[Pytorch API](https://github.com/sony/model_optimization/blob/main/model_compression_toolkit/pruning/pytorch/pruning_facade.py#L43)
 
-#### Results
+#### Pruning Results
 
 Results for applying pruning to reduce the parameters of the following models by 50%:
 
{mct_nightly-1.11.0.20240307.post318.dist-info → mct_nightly-1.11.0.20240309.post349.dist-info}/RECORD

@@ -1,4 +1,4 @@
-model_compression_toolkit/__init__.py,sha256=
+model_compression_toolkit/__init__.py,sha256=zi2UNvd56OblMMFUXze7YbWL58bU35_FLiHUzc48eOQ,1558
 model_compression_toolkit/constants.py,sha256=_OW_bUeQmf08Bb4oVZ0KfUt-rcCeNOmdBv3aP7NF5fM,3631
 model_compression_toolkit/defaultdict.py,sha256=LSc-sbZYXENMCw3U9F4GiXuv67IKpdn0Qm7Fr11jy-4,2277
 model_compression_toolkit/logger.py,sha256=b9DVktZ-LymFcRxv2aL_sdiE6S2sSrFGWltx6dgEuUY,4863
@@ -77,7 +77,7 @@ model_compression_toolkit/core/common/mixed_precision/kpi_tools/kpi_methods.py,s
 model_compression_toolkit/core/common/mixed_precision/search_methods/__init__.py,sha256=sw7LOPN1bM82o3SkMaklyH0jw-TLGK0-fl2Wq73rffI,697
 model_compression_toolkit/core/common/mixed_precision/search_methods/linear_programming.py,sha256=tiyIAFa4BdB8LLrEv-TpSbg99oHJHuiBIPGng_U-51U,15563
 model_compression_toolkit/core/common/network_editors/__init__.py,sha256=vZmu55bYqiaOQs3AjfwWDXHmuKZcLHt-wm7uR5fPEqg,1307
-model_compression_toolkit/core/common/network_editors/actions.py,sha256
+model_compression_toolkit/core/common/network_editors/actions.py,sha256=EdP_pn7e-bAuJCoGXY0CLZbB6-amF9L99N8MBRyQIDA,19574
 model_compression_toolkit/core/common/network_editors/edit_network.py,sha256=dfgawi-nB0ocAJ0xcGn9E-Zv203oUnQLuMiXpX8vTgA,1748
 model_compression_toolkit/core/common/network_editors/node_filters.py,sha256=uML5o4o80q4GaEG_D4ZXeb7ocsfuPeV0DRCWJK3HYXY,3149
 model_compression_toolkit/core/common/pruning/__init__.py,sha256=DGJybkDQtKMSMFoZ-nZ3ZifA8uJ6G_D20wHhKHNlmU0,699
@@ -103,7 +103,7 @@ model_compression_toolkit/core/common/quantization/core_config.py,sha256=IkD4Jl9
 model_compression_toolkit/core/common/quantization/debug_config.py,sha256=HtkMmneN-EmAzgZK4Vp4M8Sqm5QKdrvNyyZMpaVqYzY,1482
 model_compression_toolkit/core/common/quantization/filter_nodes_candidates.py,sha256=fwF4VILaX-u3ZaFd81xjbJuhg8Ef-JX_KfMXW0TPV-I,7136
 model_compression_toolkit/core/common/quantization/node_quantization_config.py,sha256=HWBBF--cbzsiMx3BG2kQ3JHkfalVnGO3N-rAXMwNqp4,26707
-model_compression_toolkit/core/common/quantization/quantization_config.py,sha256=
+model_compression_toolkit/core/common/quantization/quantization_config.py,sha256=hQMKm55EXS1oV-Upt6IQtsYhpuhMvYeWRJhh6lhv_Ko,6699
 model_compression_toolkit/core/common/quantization/quantization_fn_selection.py,sha256=f2Qa2majjO-gIN3lxqsA8icKJ9FMP-sKbw3lI6XNgBg,2137
 model_compression_toolkit/core/common/quantization/quantization_params_fn_selection.py,sha256=mrgVzZszWjxnjT8zm77UVLWKTOwd2thGBo6WNqAS4X8,3867
 model_compression_toolkit/core/common/quantization/quantize_graph_weights.py,sha256=xnM9O9LshYw3dprqfsnK9mw7ipOEAkI85o20auyfswg,2626
@@ -177,7 +177,7 @@ model_compression_toolkit/core/keras/graph_substitutions/substitutions/relu_boun
 model_compression_toolkit/core/keras/graph_substitutions/substitutions/remove_relu_upper_bound.py,sha256=cJQTDzTDQKAJ7EQ20tfsmReGA_OoTIN793MwVe1Ok8g,2387
 model_compression_toolkit/core/keras/graph_substitutions/substitutions/residual_collapsing.py,sha256=6PnPIC5ax7uTzcoslW7ropIu7vVmo70AD4QYcYnQV20,3176
 model_compression_toolkit/core/keras/graph_substitutions/substitutions/scale_equalization.py,sha256=ryes9y1ie-vjBGso2TeO4EXxVk69Ew3iSAhshPz1Ou4,5542
-model_compression_toolkit/core/keras/graph_substitutions/substitutions/separableconv_decomposition.py,sha256=
+model_compression_toolkit/core/keras/graph_substitutions/substitutions/separableconv_decomposition.py,sha256=TEaHlIbXj_ZjIdT5TmAICD3WLD3u_7g0fLWQcNzTJuM,7941
 model_compression_toolkit/core/keras/graph_substitutions/substitutions/shift_negative_activation.py,sha256=6vEakr0jWrccU7dfubRCiNg6TFe6whte_pbTiXMJIvc,11045
 model_compression_toolkit/core/keras/graph_substitutions/substitutions/softmax_shift.py,sha256=Qk5seDALj_th9dHJehY7ynZjvFjVfCv_mJ1enA5hX0c,1623
 model_compression_toolkit/core/keras/graph_substitutions/substitutions/virtual_activation_weights_composition.py,sha256=wH9ocMLL725-uUPU-zCxdd8NwT5nyd0ZShmI7iuTwF8,1462
@@ -263,7 +263,7 @@ model_compression_toolkit/core/pytorch/reader/node_holders.py,sha256=TaolORuwBZE
 model_compression_toolkit/core/pytorch/reader/reader.py,sha256=Co3-AHZCEOw5w-jtgf9oAKsgtjQoG0MeeSeBVnQ0xOA,5801
 model_compression_toolkit/core/pytorch/statistics_correction/__init__.py,sha256=Rf1RcYmelmdZmBV5qOKvKWF575ofc06JFQSq83Jz99A,696
 model_compression_toolkit/core/pytorch/statistics_correction/apply_second_moment_correction.py,sha256=VgU24J3jf7QComHH7jonOXSkg6mO4TOch3uFkOthZvM,3261
-model_compression_toolkit/data_generation/__init__.py,sha256=
+model_compression_toolkit/data_generation/__init__.py,sha256=zp3nQ7NhDncuGdHBwCXkRJh6JnGoTYhZZlAOrDE8omc,1138
 model_compression_toolkit/data_generation/common/__init__.py,sha256=huHoBUcKNB6BnY6YaUCcFvdyBtBI172ZoUD8ZYeNc6o,696
 model_compression_toolkit/data_generation/common/constants.py,sha256=21e3ZX9WVYojexG2acTgklrBk8ZO9DjJnKpP4KHZC44,1018
 model_compression_toolkit/data_generation/common/data_generation.py,sha256=PnKkWCBf4yla0E4LhvOqT8htWiGW4F98bygExQnpwqI,6397
@@ -275,7 +275,7 @@ model_compression_toolkit/data_generation/common/optimization_utils.py,sha256=8w
 model_compression_toolkit/data_generation/keras/__init__.py,sha256=lNJ29DYxaLUPDstRDA1PGI5r9Fulq_hvrZMlhst1Z5g,697
 model_compression_toolkit/data_generation/keras/constants.py,sha256=uy3eU24ykygIrjIvwOMj3j5euBeN2PwWiEFPOkJJ7ss,1088
 model_compression_toolkit/data_generation/keras/image_pipeline.py,sha256=_Qezq67huKmmNsxdFBBrTY-VaGR-paFzDH80dDuRnug,7623
-model_compression_toolkit/data_generation/keras/keras_data_generation.py,sha256=
+model_compression_toolkit/data_generation/keras/keras_data_generation.py,sha256=MYFdMPqGxy9tRaTIstJMkcYOk0tMXirke5fxdIJvBjU,19720
 model_compression_toolkit/data_generation/keras/model_info_exctractors.py,sha256=b3BaOGiMAlCCzPICww722l2H_RucoHgpGUK6xYe8xTA,8552
 model_compression_toolkit/data_generation/keras/optimization_utils.py,sha256=uQAJpJPpnLDTTLDQGyTS0ZYp2T38TTZLOOElcJPBKHA,21146
 model_compression_toolkit/data_generation/keras/optimization_functions/__init__.py,sha256=huHoBUcKNB6BnY6YaUCcFvdyBtBI172ZoUD8ZYeNc6o,696
@@ -289,7 +289,7 @@ model_compression_toolkit/data_generation/pytorch/constants.py,sha256=QWyreMImcf
 model_compression_toolkit/data_generation/pytorch/image_pipeline.py,sha256=6g7OpOuO3cU4TIuelaRjBKpCPgiMbe1a3iy9bZtdZUo,6617
 model_compression_toolkit/data_generation/pytorch/model_info_exctractors.py,sha256=wxtaQad4aP8D0SgA8qEPORZM3qBD22G6zO1gjwTNIVU,9632
 model_compression_toolkit/data_generation/pytorch/optimization_utils.py,sha256=AjYsO-lm06JOUMoKkS6VbyF4O_l_ffWXrgamqJm1ofE,19085
-model_compression_toolkit/data_generation/pytorch/pytorch_data_generation.py,sha256=
+model_compression_toolkit/data_generation/pytorch/pytorch_data_generation.py,sha256=BCJ6PVncBBm6sa4IWCYvC-U0-XPs7LV-deao0lq_D20,19192
 model_compression_toolkit/data_generation/pytorch/optimization_functions/__init__.py,sha256=huHoBUcKNB6BnY6YaUCcFvdyBtBI172ZoUD8ZYeNc6o,696
 model_compression_toolkit/data_generation/pytorch/optimization_functions/batchnorm_alignment_functions.py,sha256=dMc4zz9XfYfAT4Cxns57VgvGZWPAMfaGlWLFyCyl8TA,1968
 model_compression_toolkit/data_generation/pytorch/optimization_functions/bn_layer_weighting_functions.py,sha256=i3ePEI8xDE3xZEtmzT5lCkLn9wpObUi_OgqnVDf7nj8,2597
@@ -328,10 +328,10 @@ model_compression_toolkit/exporter/model_wrapper/pytorch/validate_layer.py,sha25
 model_compression_toolkit/exporter/model_wrapper/pytorch/builder/__init__.py,sha256=cco4TmeIDIh32nj9ZZXVkws4dd9F2UDrmjKzTN8G0V0,697
 model_compression_toolkit/exporter/model_wrapper/pytorch/builder/fully_quantized_model_builder.py,sha256=GNwX3gy_wrRaaiQQdH0jUbVvG1jFnZYPUriatIIkd44,4246
 model_compression_toolkit/exporter/model_wrapper/pytorch/builder/node_to_quantizer.py,sha256=h_NoqfryqoQ_4Djkoe4SRUwwmqDtdry0tvJ2o_bNxEw,9342
-model_compression_toolkit/gptq/__init__.py,sha256=
+model_compression_toolkit/gptq/__init__.py,sha256=YKg-tMj9D4Yd0xW9VRD5EN1J5JrmlRbNEF2fOSgodqA,1228
 model_compression_toolkit/gptq/runner.py,sha256=MIg-oBtR1nbHkexySdCJD_XfjRoHSknLotmGBMuD5qM,5924
 model_compression_toolkit/gptq/common/__init__.py,sha256=cco4TmeIDIh32nj9ZZXVkws4dd9F2UDrmjKzTN8G0V0,697
-model_compression_toolkit/gptq/common/gptq_config.py,sha256=
+model_compression_toolkit/gptq/common/gptq_config.py,sha256=U33sLIPB0pI4h_zhr4X_S9K0cEJWTbWFxkj8z9IGlxg,5268
 model_compression_toolkit/gptq/common/gptq_constants.py,sha256=QSm6laLkIV0LYmU0BLtmKp3Fi3SqDfbncFQWOGA1cGU,611
 model_compression_toolkit/gptq/common/gptq_framework_implementation.py,sha256=n3mSf4J92kFjekzyGyrJULylI-8Jf5OVWJ5AFoVnEx0,1266
 model_compression_toolkit/gptq/common/gptq_graph.py,sha256=8qmty-2MzV6USRoHgShCA13HqxDI3PDGJaFKCQPFo5E,3026
@@ -358,7 +358,7 @@ model_compression_toolkit/gptq/pytorch/gptq_loss.py,sha256=kDuWw-6zh17wZpYWh4Xa9
 model_compression_toolkit/gptq/pytorch/gptq_pytorch_implementation.py,sha256=tECPTavxn8EEwgLaP2zvxdJH6Vg9jC0YOIMJ7857Sdc,1268
 model_compression_toolkit/gptq/pytorch/gptq_training.py,sha256=9zQC42RfAj4ak-XOzF8xEXS3IkHKhKlOClIfaUA0bGI,15396
 model_compression_toolkit/gptq/pytorch/graph_info.py,sha256=-0GDC2cr-XXS7cTFTnDflJivGN7VaPnzVPsxCE-vZNU,3955
-model_compression_toolkit/gptq/pytorch/quantization_facade.py,sha256=
+model_compression_toolkit/gptq/pytorch/quantization_facade.py,sha256=ER5VPSkZZjqYj7PJ-3B5RX33YjHz3tJ4Er9SF6M-93c,12369
 model_compression_toolkit/gptq/pytorch/quantizer/__init__.py,sha256=ZHNHo1yzye44m9_ht4UUZfTpK01RiVR3Tr74-vtnOGI,968
 model_compression_toolkit/gptq/pytorch/quantizer/base_pytorch_gptq_quantizer.py,sha256=Zb-P0yRyZHHBlDvUBdRwxDpdduEJyJp6OT9pfKFF5ks,4171
 model_compression_toolkit/gptq/pytorch/quantizer/quant_utils.py,sha256=OocYYRqvl7rZ37QT0hTzfJnWGiNCPskg7cziTlR7TRk,3893
@@ -372,20 +372,20 @@ model_compression_toolkit/gptq/pytorch/quantizer/ste_rounding/__init__.py,sha256
 model_compression_toolkit/gptq/pytorch/quantizer/ste_rounding/symmetric_ste.py,sha256=6uxq_w62jn8DDOt9T7VtA6jZ8jTAPcbTufKFOYpVUm4,8768
 model_compression_toolkit/pruning/__init__.py,sha256=lQMZS8G0pvR1LVi53nnJHNXgLNTan_MWMdwsVxhjrow,1106
 model_compression_toolkit/pruning/keras/__init__.py,sha256=3Lkr37Exk9u8811hw8hVqkGcbTQGcLjd3LLuLC3fa_E,698
-model_compression_toolkit/pruning/keras/pruning_facade.py,sha256=
+model_compression_toolkit/pruning/keras/pruning_facade.py,sha256=B2mkCh3_AKc1O3IBOdo03PuIyjAoK3IBmgBdmIfUkDI,8296
 model_compression_toolkit/pruning/pytorch/__init__.py,sha256=pKAdbTCFM_2BrZXUtTIw0ouKotrWwUDF_hP3rPwCM2k,696
-model_compression_toolkit/pruning/pytorch/pruning_facade.py,sha256=
+model_compression_toolkit/pruning/pytorch/pruning_facade.py,sha256=ZLmMhwAEnbXNRwMwgoGEGNmHpZx_KWYu7yi5K3aICWI,9184
 model_compression_toolkit/ptq/__init__.py,sha256=Z_hkmTh7aLFei1DJKV0oNVUbrv_Q_0CTw-qD85Xf8UM,904
 model_compression_toolkit/ptq/runner.py,sha256=_c1dSjlPPpsx59Vbg1buhG9bZq__OORz1VlPkwjJzoc,2552
 model_compression_toolkit/ptq/keras/__init__.py,sha256=cco4TmeIDIh32nj9ZZXVkws4dd9F2UDrmjKzTN8G0V0,697
 model_compression_toolkit/ptq/keras/quantization_facade.py,sha256=ergUI8RDA2h4_SHU05x2pYJatt-U-fZUrShdHJDLo_o,8844
 model_compression_toolkit/ptq/pytorch/__init__.py,sha256=cco4TmeIDIh32nj9ZZXVkws4dd9F2UDrmjKzTN8G0V0,697
 model_compression_toolkit/ptq/pytorch/quantization_facade.py,sha256=WKzokgg_gGcEHipVH26shneiAiTdSa7d_UUQKoS8ALY,7438
-model_compression_toolkit/qat/__init__.py,sha256=
+model_compression_toolkit/qat/__init__.py,sha256=kj2qsZh_Ca7PncsHKcaL5EVT2H8g4hYtvaQ3KFxOkwE,1143
 model_compression_toolkit/qat/common/__init__.py,sha256=6tLZ4R4pYP6QVztLVQC_jik2nES3l4uhML0qUxZrezk,829
 model_compression_toolkit/qat/common/qat_config.py,sha256=zoq0Vb74vCY7WlWD8JH_KPrHDoUHSvMc3gcO53u7L2U,3394
 model_compression_toolkit/qat/keras/__init__.py,sha256=cco4TmeIDIh32nj9ZZXVkws4dd9F2UDrmjKzTN8G0V0,697
-model_compression_toolkit/qat/keras/quantization_facade.py,sha256=
+model_compression_toolkit/qat/keras/quantization_facade.py,sha256=nqP_QtZ8gv4XIla8Se54LzOFmcQ2GFWsJQc1JlUHJbA,17167
 model_compression_toolkit/qat/keras/quantizer/__init__.py,sha256=zmYyCa25_KLCSUCGUDRslh3RCIjcRMxc_oXa54Aui-4,996
 model_compression_toolkit/qat/keras/quantizer/base_keras_qat_quantizer.py,sha256=gPuIgQb8OafvC3SuA8jNsGoy8S8eTsDCEKuh36WDNss,2104
 model_compression_toolkit/qat/keras/quantizer/quant_utils.py,sha256=cBULOgWUodcBO1lHevZggdTevuDYI6tQceV86U2x6DA,2543
@@ -397,7 +397,7 @@ model_compression_toolkit/qat/keras/quantizer/ste_rounding/__init__.py,sha256=cc
 model_compression_toolkit/qat/keras/quantizer/ste_rounding/symmetric_ste.py,sha256=I4KlaGv17k71IyjuSG9M0OlXlD5P0pfvKa6oCyRQ5FE,13517
 model_compression_toolkit/qat/keras/quantizer/ste_rounding/uniform_ste.py,sha256=EED6LfqhX_OhDRJ9e4GwbpgNC9vq7hoXyJS2VPvG2qc,10789
 model_compression_toolkit/qat/pytorch/__init__.py,sha256=cco4TmeIDIh32nj9ZZXVkws4dd9F2UDrmjKzTN8G0V0,697
-model_compression_toolkit/qat/pytorch/quantization_facade.py,sha256=
+model_compression_toolkit/qat/pytorch/quantization_facade.py,sha256=V0nncfC6nUN2IEbhKSDcCj3dwZPCx7_4DZt78aJrzis,13637
 model_compression_toolkit/qat/pytorch/quantizer/__init__.py,sha256=xYa4C8pr9cG1f3mQQcBXO_u3IdJN-zl7leZxuXDs86w,1003
 model_compression_toolkit/qat/pytorch/quantizer/base_pytorch_qat_quantizer.py,sha256=FnhuFCuQoSf78FM1z1UZgXXd3k-mKSM7i9dYOuJUmeA,2213
 model_compression_toolkit/qat/pytorch/quantizer/quantization_builder.py,sha256=e8Yfqbc552iAiP4Zxbd2ht1A3moRFGnV_KRGDm9Gw_g,5709
@@ -472,8 +472,8 @@ model_compression_toolkit/trainable_infrastructure/keras/quantize_wrapper.py,sha
 model_compression_toolkit/trainable_infrastructure/keras/quantizer_utils.py,sha256=MVwXNymmFRB2NXIBx4e2mdJ1RfoHxRPYRgjb1MQP5kY,1797
 model_compression_toolkit/trainable_infrastructure/pytorch/__init__.py,sha256=huHoBUcKNB6BnY6YaUCcFvdyBtBI172ZoUD8ZYeNc6o,696
 model_compression_toolkit/trainable_infrastructure/pytorch/base_pytorch_quantizer.py,sha256=SbvRlIdE32PEBsINt1bhSqvrKL_zbM9V-aeSkOn-sw4,3083
-mct_nightly-1.11.0.
-mct_nightly-1.11.0.
-mct_nightly-1.11.0.
-mct_nightly-1.11.0.
-mct_nightly-1.11.0.
+mct_nightly-1.11.0.20240309.post349.dist-info/LICENSE.md,sha256=aYSSIb-5AFPeITTvXm1UAoe0uYBiMmSS8flvXaaFUks,10174
+mct_nightly-1.11.0.20240309.post349.dist-info/METADATA,sha256=1kIrDv8zrJWGCYQMghZ-YxJtJQHOSWRTOC3w_D1oM8A,17444
+mct_nightly-1.11.0.20240309.post349.dist-info/WHEEL,sha256=oiQVh_5PnQM0E3gPdiz09WCNmwiHDMaGer_elqB3coM,92
+mct_nightly-1.11.0.20240309.post349.dist-info/top_level.txt,sha256=gsYA8juk0Z-ZmQRKULkb3JLGdOdz8jW_cMRjisn9ga4,26
+mct_nightly-1.11.0.20240309.post349.dist-info/RECORD,,
model_compression_toolkit/__init__.py

@@ -27,23 +27,4 @@ from model_compression_toolkit import data_generation
 from model_compression_toolkit import pruning
 from model_compression_toolkit.trainable_infrastructure.keras.load_model import keras_load_quantized_model
 
-
-# Old API (will not be accessible in future releases)
-from model_compression_toolkit.core.common import network_editors as network_editor
-from model_compression_toolkit.core.common.quantization import quantization_config
-from model_compression_toolkit.core.common.mixed_precision import mixed_precision_quantization_config
-from model_compression_toolkit.core.common.quantization.debug_config import DebugConfig
-from model_compression_toolkit.core.common.quantization.quantization_config import QuantizationConfig, QuantizationErrorMethod, DEFAULTCONFIG
-from model_compression_toolkit.core.common.mixed_precision.kpi_tools.kpi import KPI
-from model_compression_toolkit.core.common.mixed_precision.mixed_precision_quantization_config import MixedPrecisionQuantizationConfig
-from model_compression_toolkit.logger import set_log_folder
-from model_compression_toolkit.core.common.data_loader import FolderImageLoader
-from model_compression_toolkit.core.common.framework_info import FrameworkInfo, ChannelAxis
-from model_compression_toolkit.core.keras.kpi_data_facade import keras_kpi_data
-from model_compression_toolkit.core.pytorch.kpi_data_facade import pytorch_kpi_data
-from model_compression_toolkit.gptq.common.gptq_config import GradientPTQConfig
-from model_compression_toolkit.gptq.common.gptq_config import RoundingType
-from model_compression_toolkit.gptq.keras.quantization_facade import get_keras_gptq_config
-from model_compression_toolkit.gptq.pytorch.quantization_facade import get_pytorch_gptq_config
-
 __version__ = "1.11.0"
model_compression_toolkit/core/common/network_editors/actions.py

@@ -43,7 +43,7 @@ class EditRule(_EditRule):
 >>> import model_compression_toolkit as mct
 >>> from model_compression_toolkit.core.keras.constants import KERNEL
 >>> from tensorflow.keras.layers import Conv2D
->>> er_list = [mct.network_editor.EditRule(filter=mct.network_editor.NodeTypeFilter(Conv2D), action=mct.network_editor.ChangeCandidatesWeightsQuantConfigAttr(attr_name=KERNEL, weights_n_bits=9))]
+>>> er_list = [mct.core.network_editor.EditRule(filter=mct.core.network_editor.NodeTypeFilter(Conv2D), action=mct.core.network_editor.ChangeCandidatesWeightsQuantConfigAttr(attr_name=KERNEL, weights_n_bits=9))]
 
 Then the rules list can be passed to :func:`~model_compression_toolkit.keras_post_training_quantization`
 to modify the network during the quantization process.
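The docstring now builds the rules under the `mct.core` namespace; a hedged sketch of the full wiring, assuming the rules reach the quantization flow through `DebugConfig`'s `network_editor` field inside a `CoreConfig` (plumbing not shown in this hunk):

```python
import model_compression_toolkit as mct
from model_compression_toolkit.core.keras.constants import KERNEL
from tensorflow.keras.layers import Conv2D

# Rule: quantize every Conv2D kernel with 9 bits instead of the default.
er_list = [mct.core.network_editor.EditRule(
    filter=mct.core.network_editor.NodeTypeFilter(Conv2D),
    action=mct.core.network_editor.ChangeCandidatesWeightsQuantConfigAttr(
        attr_name=KERNEL, weights_n_bits=9))]

# Assumed wiring: hand the rules to the quantization flow via DebugConfig.
core_config = mct.core.CoreConfig(
    debug_config=mct.core.DebugConfig(network_editor=er_list))
```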
model_compression_toolkit/core/common/quantization/quantization_config.py

@@ -82,7 +82,7 @@ class QuantizationConfig:
 block_collapsing (bool): Whether to collapse block one to another in the input network
 shift_negative_ratio (float): Value for the ratio between the minimal negative value of a non-linearity output to its activation threshold, which above it - shifting negative activation should occur if enabled.
 shift_negative_threshold_recalculation (bool): Whether or not to recompute the threshold after shifting negative activation.
-shift_negative_params_search (bool): Whether to search for optimal shift and threshold in shift negative activation
+shift_negative_params_search (bool): Whether to search for optimal shift and threshold in shift negative activation.
 
 Examples:
 One may create a quantization configuration to quantize a model according to.
model_compression_toolkit/core/keras/graph_substitutions/substitutions/separableconv_decomposition.py

@@ -75,8 +75,10 @@ class SeparableConvDecomposition(common.BaseSubstitution):
 pw_bias = separable_node.get_weights_by_keys(BIAS)
 
 dw_weights_dict = {DEPTHWISE_KERNEL: dw_kernel}
-pw_weights_dict = {KERNEL: pw_kernel
-
+pw_weights_dict = {KERNEL: pw_kernel}
+
+if pw_bias is not None:
+    pw_weights_dict[BIAS] = pw_bias
 
 # Split separable node attributes into relevant attributes for each of the new nodes.
 # List of dw attributes that should take from separable as they are.
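For context on what this substitution does: a `SeparableConv2D` is numerically a `DepthwiseConv2D` followed by a 1x1 `Conv2D`, with the bias belonging to the pointwise node only when it exists (the condition this hunk adds). A standalone Keras sketch verifying that equivalence, independent of MCT's internal node types:

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(1, 8, 8, 3).astype(np.float32)

# The layer the substitution decomposes.
sep = tf.keras.layers.SeparableConv2D(4, 3, use_bias=True)
y_sep = sep(x)

# Its two-node replacement: depthwise conv, then 1x1 pointwise conv.
dw = tf.keras.layers.DepthwiseConv2D(3, use_bias=False)
pw = tf.keras.layers.Conv2D(4, 1, use_bias=True)
_ = pw(dw(x))  # build the layers so their weights exist

dw.set_weights([sep.depthwise_kernel.numpy()])
# The bias goes to the pointwise node (and only if the layer has one).
pw.set_weights([sep.pointwise_kernel.numpy(), sep.bias.numpy()])

np.testing.assert_allclose(y_sep.numpy(), pw(dw(x)).numpy(), atol=1e-5)
```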
model_compression_toolkit/data_generation/__init__.py

@@ -16,7 +16,7 @@ from model_compression_toolkit.constants import FOUND_TORCH, FOUND_TF
 
 if FOUND_TF:
     from model_compression_toolkit.data_generation.keras.keras_data_generation import (
-
+        keras_data_generation_experimental, get_keras_data_generation_config)
 
 if FOUND_TORCH:
     from model_compression_toolkit.data_generation.pytorch.pytorch_data_generation import (
@@ -49,7 +49,7 @@ if FOUND_TF:
         scheduler_step_function_dict
 
     # Function to create a DataGenerationConfig object with the specified configuration parameters for Tensorflow
-    def
+    def get_keras_data_generation_config(
             n_iter: int = DEFAULT_N_ITER,
             optimizer: Optimizer = Adam,
             data_gen_batch_size: int = DEFAULT_DATA_GEN_BS,
@@ -115,13 +115,13 @@ if FOUND_TF:
             output_loss_multiplier=output_loss_multiplier)
 
 
-    def
+    def keras_data_generation_experimental(
             model: tf.keras.Model,
             n_images: int,
             output_image_size: Tuple,
             data_generation_config: DataGenerationConfig) -> tf.Tensor:
         """
-        Function to perform data generation using the provided model and data generation configuration.
+        Function to perform data generation using the provided Keras model and data generation configuration.
 
         Args:
             model (Model): Keras model to generate data for.
@@ -132,6 +132,11 @@ if FOUND_TF:
         Returns:
             List[tf.Tensor]: Finalized list containing generated images.
         """
+        Logger.warning(f"keras_data_generation_experimental is experimental "
+                       f"and is subject to future changes."
+                       f"If you encounter an issue, please open an issue in our GitHub "
+                       f"project https://github.com/sony/model_optimization")
+
         # Get Data Generation functions and classes
         image_pipeline, normalization, bn_layer_weighting_fn, bn_alignment_loss_fn, output_loss_fn, \
             init_dataset = get_data_generation_classes(data_generation_config=data_generation_config,
@@ -323,11 +328,11 @@ if FOUND_TF:
 
 
 else:
-    def
-    Logger.critical('Installing tensorflow is mandatory when using
+    def get_keras_data_generation_config(*args, **kwargs):
+        Logger.critical('Installing tensorflow is mandatory when using get_keras_data_generation_config. '
                         'Could not find Tensorflow package.') # pragma: no cover
 
 
-    def
+    def keras_data_generation_experimental(*args, **kwargs):
         Logger.critical('Installing tensorflow is mandatory when using pytorch_data_generation_experimental. '
                         'Could not find Tensorflow package.') # pragma: no cover
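With the facade names restored above, a minimal usage sketch for the Keras data generation API; argument values are illustrative and `output_image_size` is assumed to match the model input:

```python
import model_compression_toolkit as mct
from tensorflow.keras.applications import MobileNetV2

model = MobileNetV2()

# Build a data generation configuration (defaults come from the signature above).
data_gen_config = mct.data_generation.get_keras_data_generation_config(n_iter=500)

# Generate synthetic images from the model's batch-norm statistics.
images = mct.data_generation.keras_data_generation_experimental(
    model=model,
    n_images=32,
    output_image_size=(224, 224),
    data_generation_config=data_gen_config)
```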
model_compression_toolkit/data_generation/pytorch/pytorch_data_generation.py

@@ -143,6 +143,12 @@ if FOUND_TORCH:
         Returns:
             List[Tensor]: Finalized list containing generated images.
         """
+
+        Logger.warning(f"pytorch_data_generation_experimental is experimental "
+                       f"and is subject to future changes."
+                       f"If you encounter an issue, please open an issue in our GitHub "
+                       f"project https://github.com/sony/model_optimization")
+
         # get a static graph representation of the model using torch.fx
         fx_model = symbolic_trace(model)
 
model_compression_toolkit/gptq/__init__.py

@@ -13,7 +13,7 @@
 # limitations under the License.
 # ==============================================================================
 
-from model_compression_toolkit.gptq.common.gptq_config import GradientPTQConfig, RoundingType,
+from model_compression_toolkit.gptq.common.gptq_config import GradientPTQConfig, RoundingType, GPTQHessianScoresConfig
 from model_compression_toolkit.gptq.keras.quantization_facade import keras_gradient_post_training_quantization
 from model_compression_toolkit.gptq.keras.quantization_facade import get_keras_gptq_config
 from model_compression_toolkit.gptq.pytorch.quantization_facade import pytorch_gradient_post_training_quantization
model_compression_toolkit/gptq/common/gptq_config.py

@@ -36,8 +36,7 @@ class GPTQHessianScoresConfig:
                  hessians_num_samples: int = 16,
                  norm_scores: bool = True,
                  log_norm: bool = True,
-                 scale_log_norm: bool = False
-                 hessians_n_iter: int = 50): #TODO: remove
+                 scale_log_norm: bool = False):
 
         """
         Initialize a GPTQHessianWeightsConfig.
@@ -47,14 +46,12 @@ class GPTQHessianScoresConfig:
             norm_scores (bool): Whether to normalize the returned scores of the weighted loss function (to get values between 0 and 1).
             log_norm (bool): Whether to use log normalization for the GPTQ Hessian-based scores.
             scale_log_norm (bool): Whether to scale the final vector of the Hessian-based scores.
-            hessians_n_iter (int): Number of random iterations to run Hessian approximation for GPTQ Hessian-based scores.
         """
 
         self.hessians_num_samples = hessians_num_samples
         self.norm_scores = norm_scores
         self.log_norm = log_norm
         self.scale_log_norm = scale_log_norm
-        self.hessians_n_iter = hessians_n_iter
 
 
 class GradientPTQConfig:
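With `hessians_n_iter` gone from both the signature and the stored attributes, constructing the config now looks like this sketch (the values shown are the defaults from the signature above; passing `hessians_n_iter` would now raise a `TypeError`):

```python
from model_compression_toolkit.gptq import GPTQHessianScoresConfig

# hessians_n_iter was removed from the constructor in this release.
hessian_cfg = GPTQHessianScoresConfig(hessians_num_samples=16,
                                      norm_scores=True,
                                      log_norm=True,
                                      scale_log_norm=False)
```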
model_compression_toolkit/gptq/pytorch/quantization_facade.py

@@ -129,6 +129,10 @@ if FOUND_TORCH:
 
         Examples:
 
+        Import Model Compression Toolkit:
+
+        >>> import model_compression_toolkit as mct
+
         Import a Pytorch module:
 
         >>> from torchvision import models
@@ -149,7 +153,7 @@ if FOUND_TORCH:
 
         Pass the module, the representative dataset generator and the configuration (optional) to get a quantized module
 
-        >>> quantized_module, quantization_info = mct.gptq.
+        >>> quantized_module, quantization_info = mct.gptq.pytorch_gradient_post_training_quantization(module, repr_datagen, core_config=config, gptq_config=gptq_conf)
 
         """
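Putting the docstring pieces together, a self-contained GPTQ sketch; `get_pytorch_gptq_config` is exported by `mct.gptq` (see the gptq/__init__.py hunk above), and the `n_epochs` value and random batch are illustrative assumptions:

```python
import numpy as np
import model_compression_toolkit as mct
from torchvision import models

module = models.mobilenet_v2()

def repr_datagen():
    yield [np.random.random((1, 3, 224, 224))]

# One fine-tuning epoch keeps the sketch fast; real runs use more.
gptq_conf = mct.gptq.get_pytorch_gptq_config(n_epochs=1)

quantized_module, quantization_info = mct.gptq.pytorch_gradient_post_training_quantization(
    module, repr_datagen, gptq_config=gptq_conf)
```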
model_compression_toolkit/pruning/keras/pruning_facade.py

@@ -86,7 +86,7 @@ if FOUND_TF:
         are represented in float32 data type (thus, each parameter is represented using 4 bytes):
 
         >>> dense_nparams = sum([l.count_params() for l in model.layers])
-        >>> target_kpi = mct.KPI(weights_memory=dense_nparams * 4 * 0.5)
+        >>> target_kpi = mct.core.KPI(weights_memory=dense_nparams * 4 * 0.5)
 
         Optionally, define a pruning configuration. num_score_approximations can be passed
         to configure the number of importance scores that will be calculated for each channel.
@@ -101,6 +101,10 @@ if FOUND_TF:
 
 
         """
+        Logger.warning(f"keras_pruning_experimental is experimental and is subject to future changes."
+                       f"If you encounter an issue, please open an issue in our GitHub "
+                       f"project https://github.com/sony/model_optimization")
+
         # Instantiate the Keras framework implementation.
         fw_impl = PruningKerasImplementation()
 
model_compression_toolkit/pruning/pytorch/pruning_facade.py

@@ -93,7 +93,7 @@ if FOUND_TORCH:
         are represented in float32 data type (thus, each parameter is represented using 4 bytes):
 
         >>> dense_nparams = sum(p.numel() for p in model.state_dict().values())
-        >>> target_kpi = mct.KPI(weights_memory=dense_nparams * 4 * 0.5)
+        >>> target_kpi = mct.core.KPI(weights_memory=dense_nparams * 4 * 0.5)
 
         Optionally, define a pruning configuration. num_score_approximations can be passed
         to configure the number of importance scores that will be calculated for each channel.
@@ -108,6 +108,10 @@ if FOUND_TORCH:
 
 
         """
+        Logger.warning(f"pytorch_pruning_experimental is experimental and is subject to future changes."
+                       f"If you encounter an issue, please open an issue in our GitHub "
+                       f"project https://github.com/sony/model_optimization")
+
         # Instantiate the Pytorch framework implementation.
         fw_impl = PruningPytorchImplementation()
 
model_compression_toolkit/qat/__init__.py

@@ -14,5 +14,5 @@
 # ==============================================================================
 from model_compression_toolkit.qat.common.qat_config import QATConfig, TrainingMethod
 
-from model_compression_toolkit.qat.keras.quantization_facade import
-from model_compression_toolkit.qat.pytorch.quantization_facade import
+from model_compression_toolkit.qat.keras.quantization_facade import keras_quantization_aware_training_init_experimental, keras_quantization_aware_training_finalize_experimental
+from model_compression_toolkit.qat.pytorch.quantization_facade import pytorch_quantization_aware_training_init_experimental, pytorch_quantization_aware_training_finalize_experimental
model_compression_toolkit/qat/keras/quantization_facade.py

@@ -85,13 +85,13 @@ if FOUND_TF:
         return layer
 
 
-    def
-
-
-
-
-
-
+    def keras_quantization_aware_training_init_experimental(in_model: Model,
+                                                            representative_data_gen: Callable,
+                                                            target_kpi: KPI = None,
+                                                            core_config: CoreConfig = CoreConfig(),
+                                                            qat_config: QATConfig = QATConfig(),
+                                                            fw_info: FrameworkInfo = DEFAULT_KERAS_INFO,
+                                                            target_platform_capabilities: TargetPlatformCapabilities = DEFAULT_KERAS_TPC):
         """
         Prepare a trained Keras model for quantization aware training. First the model quantization is optimized
         with post-training quantization, then the model layers are wrapped with QuantizeWrappers. The model is
@@ -161,7 +161,7 @@ if FOUND_TF:
         Pass the model, the representative dataset generator, the configuration and the target KPI to get a
         quantized model:
 
-        >>> quantized_model, quantization_info, custom_objects = mct.qat.
+        >>> quantized_model, quantization_info, custom_objects = mct.qat.keras_quantization_aware_training_init_experimental(model, repr_datagen, kpi, core_config=config)
 
         Use the quantized model for fine-tuning. For loading the model from file, use the custom_objects dictionary:
 
@@ -170,6 +170,11 @@ if FOUND_TF:
         For more configuration options, please take a look at our `API documentation <https://sony.github.io/model_optimization/api/api_docs/modules/mixed_precision_quantization_config.html>`_.
 
         """
+
+        Logger.warning(f"keras_quantization_aware_training_init_experimental is experimental and is subject to future changes."
+                       f"If you encounter an issue, please open an issue in our GitHub "
+                       f"project https://github.com/sony/model_optimization")
+
         KerasModelValidation(model=in_model,
                              fw_info=fw_info).validate()
 
@@ -207,7 +212,7 @@ if FOUND_TF:
         return qat_model, user_info, {}
 
 
-    def
+    def keras_quantization_aware_training_finalize_experimental(in_model: Model) -> Model:
         """
         Convert a model fine-tuned by the user (Trainable quantizers) to a model with Inferable quantizers.
 
@@ -252,14 +257,19 @@ if FOUND_TF:
         Pass the model, the representative dataset generator, the configuration and the target KPI to get a
         quantized model:
 
-        >>> quantized_model, quantization_info, custom_objects = mct.qat.
+        >>> quantized_model, quantization_info, custom_objects = mct.qat.keras_quantization_aware_training_init_experimental(model, repr_datagen, kpi, core_config=config)
 
         Use the quantized model for fine-tuning. For loading the model from file, use the custom_objects dictionary:
 
         >>> quantized_model = tf.keras.models.load_model(model_file, custom_objects=custom_objects)
-        >>> quantized_model = mct.qat.
+        >>> quantized_model = mct.qat.keras_quantization_aware_training_finalize_experimental(quantized_model)
 
         """
+        Logger.warning(
+            f"keras_quantization_aware_training_finalize_experimental is experimental and is subject to future changes."
+            f"If you encounter an issue, please open an issue in our GitHub "
+            f"project https://github.com/sony/model_optimization")
+
         def _export(layer):
             if isinstance(layer, KerasTrainableQuantizationWrapper):
                 layer = layer.convert_to_inferable_quantizers()
@@ -282,13 +292,13 @@ if FOUND_TF:
 else:
     # If tensorflow is not installed,
     # we raise an exception when trying to use these functions.
-    def
+    def keras_quantization_aware_training_init_experimental(*args, **kwargs):
         Logger.critical('Installing tensorflow is mandatory '
-                        'when using
+                        'when using keras_quantization_aware_training_init_experimental. '
                         'Could not find Tensorflow package.') # pragma: no cover
 
 
-    def
+    def keras_quantization_aware_training_finalize_experimental(*args, **kwargs):
         Logger.critical('Installing tensorflow is mandatory '
-                        'when using
+                        'when using keras_quantization_aware_training_finalize_experimental. '
                         'Could not find Tensorflow package.') # pragma: no cover
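The restored names make the Keras QAT round trip explicit; a hedged end-to-end sketch combining the docstring snippets, with a random dataset and a placeholder fine-tuning step:

```python
import numpy as np
import model_compression_toolkit as mct
from tensorflow.keras.applications import MobileNetV2

model = MobileNetV2()

def repr_datagen():
    yield [np.random.random((1, 224, 224, 3))]

# Wrap the model with trainable quantizers.
qat_model, quantization_info, custom_objects = \
    mct.qat.keras_quantization_aware_training_init_experimental(model, repr_datagen)

# ... fine-tune qat_model with the usual compile()/fit() loop ...

# Swap trainable quantizers for inferable ones before deployment.
deploy_model = mct.qat.keras_quantization_aware_training_finalize_experimental(qat_model)
```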
model_compression_toolkit/qat/pytorch/quantization_facade.py

@@ -73,13 +73,13 @@ if FOUND_TORCH:
         return module
 
 
-    def
-
-
-
-
-
-
+    def pytorch_quantization_aware_training_init_experimental(in_model: Module,
+                                                              representative_data_gen: Callable,
+                                                              target_kpi: KPI = None,
+                                                              core_config: CoreConfig = CoreConfig(),
+                                                              qat_config: QATConfig = QATConfig(),
+                                                              fw_info: FrameworkInfo = DEFAULT_PYTORCH_INFO,
+                                                              target_platform_capabilities: TargetPlatformCapabilities = DEFAULT_PYTORCH_TPC):
         """
         Prepare a trained Pytorch model for quantization aware training. First the model quantization is optimized
         with post-training quantization, then the model layers are wrapped with QuantizeWrappers. The model is
@@ -136,11 +136,15 @@ if FOUND_TORCH:
         Pass the model, the representative dataset generator, the configuration and the target KPI to get a
         quantized model. Now the model contains quantizer wrappers for fine tunning the weights:
 
-        >>> quantized_model, quantization_info =
+        >>> quantized_model, quantization_info = mct.qat.pytorch_quantization_aware_training_init_experimental(model, repr_datagen, core_config=config)
 
         For more configuration options, please take a look at our `API documentation <https://sony.github.io/model_optimization/api/api_docs/modules/mixed_precision_quantization_config.html>`_.
 
         """
+        Logger.warning(
+            f"pytorch_quantization_aware_training_init_experimental is experimental and is subject to future changes."
+            f"If you encounter an issue, please open an issue in our GitHub "
+            f"project https://github.com/sony/model_optimization")
 
         if core_config.mixed_precision_enable:
             if not isinstance(core_config.mixed_precision_config, MixedPrecisionQuantizationConfig):
@@ -180,7 +184,7 @@ if FOUND_TORCH:
         return qat_model, user_info
 
 
-    def
+    def pytorch_quantization_aware_training_finalize_experimental(in_model: Module):
         """
         Convert a model fine-tuned by the user to a network with QuantizeWrappers containing
         InferableQuantizers, that quantizes both the layers weights and outputs
@@ -214,13 +218,18 @@ if FOUND_TORCH:
         Pass the model, the representative dataset generator, the configuration and the target KPI to get a
         quantized model:
 
-        >>> quantized_model, quantization_info =
+        >>> quantized_model, quantization_info = mct.qat.pytorch_quantization_aware_training_init_experimental(model, repr_datagen, core_config=config)
 
         Use the quantized model for fine-tuning. Finally, remove the quantizer wrappers and keep a quantize model ready for inference.
 
-        >>> quantized_model = mct.
+        >>> quantized_model = mct.qat.pytorch_quantization_aware_training_finalize_experimental(quantized_model)
 
         """
+        Logger.warning(
+            f"pytorch_quantization_aware_training_finalize_experimental is experimental and is subject to future changes."
+            f"If you encounter an issue, please open an issue in our GitHub "
+            f"project https://github.com/sony/model_optimization")
+
         for _, layer in in_model.named_children():
             if isinstance(layer, (PytorchQuantizationWrapper, PytorchActivationQuantizationHolder)):
                 layer.convert_to_inferable_quantizers()
@@ -231,13 +240,13 @@ if FOUND_TORCH:
 else:
     # If torch is not installed,
     # we raise an exception when trying to use these functions.
-    def
+    def pytorch_quantization_aware_training_init_experimental(*args, **kwargs):
         Logger.critical('Installing Pytorch is mandatory '
-                        'when using
+                        'when using pytorch_quantization_aware_training_init_experimental. '
                         'Could not find the torch package.') # pragma: no cover
 
 
-    def
+    def pytorch_quantization_aware_training_finalize_experimental(*args, **kwargs):
         Logger.critical('Installing Pytorch is mandatory '
-                        'when using
+                        'when using pytorch_quantization_aware_training_finalize_experimental. '
                         'Could not find the torch package.') # pragma: no cover
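And the matching PyTorch QAT round trip, sketched under the same assumptions (random data in place of a real representative dataset and training loop):

```python
import numpy as np
import model_compression_toolkit as mct
from torchvision import models

model = models.mobilenet_v2()

def repr_datagen():
    yield [np.random.random((1, 3, 224, 224))]

# Wrap the model with trainable quantizer wrappers.
qat_model, quantization_info = \
    mct.qat.pytorch_quantization_aware_training_init_experimental(model, repr_datagen)

# ... fine-tune qat_model with a standard PyTorch training loop ...

# Convert the wrappers to inferable quantizers for deployment.
deploy_model = mct.qat.pytorch_quantization_aware_training_finalize_experimental(qat_model)
```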
{mct_nightly-1.11.0.20240307.post318.dist-info → mct_nightly-1.11.0.20240309.post349.dist-info}/LICENSE.md: file without changes
{mct_nightly-1.11.0.20240307.post318.dist-info → mct_nightly-1.11.0.20240309.post349.dist-info}/WHEEL: file without changes
{mct_nightly-1.11.0.20240307.post318.dist-info → mct_nightly-1.11.0.20240309.post349.dist-info}/top_level.txt: file without changes