cache-dit 0.2.37__py3-none-any.whl → 0.3.0__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
This version of cache-dit might be problematic. Click here for more details.
- cache_dit/_version.py +2 -2
- {cache_dit-0.2.37.dist-info → cache_dit-0.3.0.dist-info}/METADATA +24 -18
- {cache_dit-0.2.37.dist-info → cache_dit-0.3.0.dist-info}/RECORD +7 -7
- {cache_dit-0.2.37.dist-info → cache_dit-0.3.0.dist-info}/WHEEL +0 -0
- {cache_dit-0.2.37.dist-info → cache_dit-0.3.0.dist-info}/entry_points.txt +0 -0
- {cache_dit-0.2.37.dist-info → cache_dit-0.3.0.dist-info}/licenses/LICENSE +0 -0
- {cache_dit-0.2.37.dist-info → cache_dit-0.3.0.dist-info}/top_level.txt +0 -0
cache_dit/_version.py
CHANGED
|
@@ -28,7 +28,7 @@ version_tuple: VERSION_TUPLE
|
|
|
28
28
|
commit_id: COMMIT_ID
|
|
29
29
|
__commit_id__: COMMIT_ID
|
|
30
30
|
|
|
31
|
-
__version__ = version = '0.
|
|
32
|
-
__version_tuple__ = version_tuple = (0,
|
|
31
|
+
__version__ = version = '0.3.0'
|
|
32
|
+
__version_tuple__ = version_tuple = (0, 3, 0)
|
|
33
33
|
|
|
34
34
|
__commit_id__ = commit_id = None
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: cache_dit
|
|
3
|
-
Version: 0.
|
|
3
|
+
Version: 0.3.0
|
|
4
4
|
Summary: 🤗 A Unified and Training-free Cache Acceleration Toolbox for Diffusion Transformers
|
|
5
5
|
Author: DefTruth, vipshop.com, etc.
|
|
6
6
|
Maintainer: DefTruth, vipshop.com, etc
|
|
@@ -58,7 +58,7 @@ Dynamic: requires-python
|
|
|
58
58
|
<img src=https://img.shields.io/badge/PyPI-pass-brightgreen.svg >
|
|
59
59
|
<img src=https://static.pepy.tech/badge/cache-dit >
|
|
60
60
|
<img src=https://img.shields.io/badge/Python-3.10|3.11|3.12-9cf.svg >
|
|
61
|
-
<img src=https://img.shields.io/badge/Release-v0.
|
|
61
|
+
<img src=https://img.shields.io/badge/Release-v0.3-brightgreen.svg >
|
|
62
62
|
</div>
|
|
63
63
|
<p align="center">
|
|
64
64
|
<b><a href="#unified">📚Unified Cache APIs</a></b> | <a href="#forward-pattern-matching">📚Forward Pattern Matching</a> | <a href="#automatic-block-adapter">📚Automatic Block Adapter</a><br>
|
|
@@ -286,24 +286,32 @@ Comparisons between different FnBn compute block configurations show that **more
|
|
|
286
286
|
| Config | Clip Score(↑) | ImageReward(↑) | PSNR(↑) | TFLOPs(↓) | SpeedUp(↑) |
|
|
287
287
|
| --- | --- | --- | --- | --- | --- |
|
|
288
288
|
| [**FLUX.1**-dev]: 50 steps | 32.9217 | 1.0412 | INF | 3726.87 | 1.00x |
|
|
289
|
-
| F8B0_W8MC0_R0.08 | 33.0070 | 1.0333 | 35.2008 | 2162.19 | 1.72x |
|
|
290
289
|
| F8B0_W4MC0_R0.08 | 32.9871 | 1.0370 | 33.8317 | 2064.81 | 1.80x |
|
|
291
|
-
| F4B0_W4MC2_R0.12 | 32.9718 | 1.0301 | 31.9394 | 1678.98 | 2.22x |
|
|
292
|
-
| F8B0_W8MC3_R0.12 | 32.9613 | 1.0270 | 34.2834 | 1977.69 | 1.88x |
|
|
293
290
|
| F8B0_W4MC2_R0.12 | 32.9535 | 1.0185 | 32.7346 | 1935.73 | 1.93x |
|
|
294
|
-
| F8B0_W8MC2_R0.12 | 32.9302 | 1.0227 | 34.7449 | 2072.18 | 1.80x |
|
|
295
291
|
| F8B0_W4MC3_R0.12 | 32.9234 | 1.0085 | 32.5385 | 1816.58 | 2.05x |
|
|
296
|
-
| F8B0_W8MC4_R0.12 | 32.9041 | 1.0140 | 33.9466 | 1897.61 | 1.96x |
|
|
297
292
|
| F4B0_W4MC3_R0.12 | 32.8981 | 1.0130 | 31.8031 | 1507.83 | 2.47x |
|
|
298
|
-
| F4B0_W4MC0_R0.08 | 32.8544 | 1.0065 | 32.3555 | 1654.72 | 2.25x |
|
|
299
|
-
| F8B0_W4MC4_R0.12 | 32.8443 | 1.0102 | 32.4231 | 1753.48 | 2.13x |
|
|
300
293
|
| F4B0_W4MC4_R0.12 | 32.8384 | 1.0065 | 31.5292 | 1400.08 | 2.66x |
|
|
301
|
-
| F1B0_W4MC4_R0.12 | 32.8291 | 1.0181 | 32.9462 | 1401.61 | 2.66x |
|
|
302
|
-
| F1B0_W4MC3_R0.12 | 32.8236 | 1.0166 | 33.0037 | 1457.62 | 2.56x |
|
|
303
|
-
| F1B0_W4MC10_R1.0 | 32.3183 | 0.8796 | 29.6757 | 651.90 | 5.72x |
|
|
304
294
|
|
|
305
295
|
The comparison between **cache-dit: DBCache** and algorithms such as Δ-DiT, Chipmunk, FORA, DuCa, TaylorSeer and FoCa is as follows. Now, in the comparison with a speedup ratio less than **3x**, cache-dit achieved the best accuracy. Please check [📚How to Reproduce?](./bench/) for more details.
|
|
306
296
|
|
|
297
|
+
| Method | TFLOPs(↓) | SpeedUp(↑) | ImageReward(↑) | Clip Score(↑) |
|
|
298
|
+
| --- | --- | --- | --- | --- |
|
|
299
|
+
| [**FLUX.1**-dev]: 50 steps | 3726.87 | 1.00× | 0.9898 | 32.404 |
|
|
300
|
+
| [**FLUX.1**-dev]: 60% steps | 2231.70 | 1.67× | 0.9663 | 32.312 |
|
|
301
|
+
| Δ-DiT(N=2) | 2480.01 | 1.50× | 0.9444 | 32.273 |
|
|
302
|
+
| Δ-DiT(N=3) | 1686.76 | 2.21× | 0.8721 | 32.102 |
|
|
303
|
+
| [**FLUX.1**-dev]: 34% steps | 1264.63 | 3.13× | 0.9453 | 32.114 |
|
|
304
|
+
| Chipmunk | 1505.87 | 2.47× | 0.9936 | 32.776 |
|
|
305
|
+
| FORA (N=3) | 1320.07 | 2.82× | 0.9776 | 32.266 |
|
|
306
|
+
| **[DBCache(F=4,B=0,W=4,MC=4)](https://github.com/vipshop/cache-dit)** | **1400.08** | **2.66×** | **1.0065** | **32.838** |
|
|
307
|
+
| DuCa(N=5) | 978.76 | 3.80× | 0.9955 | 32.241 |
|
|
308
|
+
| TaylorSeer(N=4,O=2) | 1042.27 | 3.57× | 0.9857 | 32.413 |
|
|
309
|
+
| **[DBCache+TaylorSeer(F=1,B=0,O=1)](https://github.com/vipshop/cache-dit)** | **1153.05** | **3.23×** | **1.0221** | **32.819** |
|
|
310
|
+
| **[FoCa(N=5) arxiv.2508.16211](https://arxiv.org/pdf/2508.16211)** | **893.54** | **4.16×** | **1.0029** | **32.948** |
|
|
311
|
+
|
|
312
|
+
<details>
|
|
313
|
+
<summary> Show all comparison </summary>
|
|
314
|
+
|
|
307
315
|
| Method | TFLOPs(↓) | SpeedUp(↑) | ImageReward(↑) | Clip Score(↑) |
|
|
308
316
|
| --- | --- | --- | --- | --- |
|
|
309
317
|
| [**FLUX.1**-dev]: 50 steps | 3726.87 | 1.00× | 0.9898 | 32.404 |
|
|
@@ -336,22 +344,20 @@ The comparison between **cache-dit: DBCache** and algorithms such as Δ-DiT, Chi
|
|
|
336
344
|
|
|
337
345
|
NOTE: Except for DBCache, other performance data are referenced from the paper [FoCa, arxiv.2508.16211](https://arxiv.org/pdf/2508.16211).
|
|
338
346
|
|
|
347
|
+
</details>
|
|
348
|
+
|
|
339
349
|
### 📚Text2Image Distillation DrawBench: Qwen-Image-Lightning
|
|
340
350
|
|
|
341
351
|
Surprisingly, cache-dit: DBCache still works in the extremely few-step distill model. For example, **Qwen-Image-Lightning w/ 4 steps**, with the F16B16 configuration, the PSNR is 34.8163, the Clip Score is 35.6109, and the ImageReward is 1.2614. It maintained a relatively high precision.
|
|
342
352
|
|
|
343
353
|
| Config | PSNR(↑) | Clip Score(↑) | ImageReward(↑) | TFLOPs(↓) | SpeedUp(↑) |
|
|
344
354
|
|----------------------------|-----------|------------|--------------|----------|------------|
|
|
345
|
-
| [**Lightning**]: 4 steps
|
|
355
|
+
| [**Lightning**]: 4 steps | INF | 35.5797 | 1.2630 | 274.33 | 1.00x |
|
|
346
356
|
| F24B24_W2MC1_R0.8 | 36.3242 | 35.6224 | 1.2630 | 264.74 | 1.04x |
|
|
347
357
|
| F16B16_W2MC1_R0.8 | 34.8163 | 35.6109 | 1.2614 | 244.25 | 1.12x |
|
|
348
358
|
| F12B12_W2MC1_R0.8 | 33.8953 | 35.6535 | 1.2549 | 234.63 | 1.17x |
|
|
349
359
|
| F8B8_W2MC1_R0.8 | 33.1374 | 35.7284 | 1.2517 | 224.29 | 1.22x |
|
|
350
|
-
|
|
|
351
|
-
| F32B0_W2MC1_R0.8 | 29.6490 | 35.7684 | 1.2302 | 261.05 | 1.05x |
|
|
352
|
-
| F24B0_W2MC1_R0.8 | 29.6081 | 35.8599 | 1.1874 | 245.54 | 1.12x |
|
|
353
|
-
| F16B0_W2MC1_R0.8 | 29.4844 | 36.0810 | 1.1586 | 227.06 | 1.21x |
|
|
354
|
-
|
|
360
|
+
| F1B0_W2MC1_R0.8 | 31.8317 | 35.6651 | 1.2397 | 206.90 | 1.33x |
|
|
355
361
|
|
|
356
362
|
## 🎉Unified Cache APIs
|
|
357
363
|
|
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
cache_dit/__init__.py,sha256=hzaexC1VQ0TxiWY6TJ1lTm-04e65WOTNHOfYryu1vFA,1284
|
|
2
|
-
cache_dit/_version.py,sha256=
|
|
2
|
+
cache_dit/_version.py,sha256=5zTqm8rgXsWYBpB2M3Zw_K1D-aV8wP7NsBLrmMKkrAQ,704
|
|
3
3
|
cache_dit/logger.py,sha256=0zsu42hN-3-rgGC_C29ms1IvVpV4_b4_SwJCKSenxBE,4304
|
|
4
4
|
cache_dit/utils.py,sha256=nuHHr6NB286qE9u6klLNfhAVRMOGipihOhM8LRqznmU,10775
|
|
5
5
|
cache_dit/cache_factory/.gitignore,sha256=5Cb-qT9wsTUoMJ7vACDF7ZcLpAXhi5v-xdcWSRit988,23
|
|
@@ -43,9 +43,9 @@ cache_dit/metrics/metrics.py,sha256=7UV-H2NRbhfr6dvrXEzU97Zy-BSQ5zEfm9CKtaK4ldg,
|
|
|
43
43
|
cache_dit/quantize/__init__.py,sha256=kWYoMAyZgBXu9BJlZjTQ0dRffW9GqeeY9_iTkXrb70A,59
|
|
44
44
|
cache_dit/quantize/quantize_ao.py,sha256=Fx1KW4l3gdEkdrcAYtPoDW7WKBJWrs3glOHiEwW_TgE,6160
|
|
45
45
|
cache_dit/quantize/quantize_interface.py,sha256=2s_R7xPSKuJeFpEGeLwRxnq_CqJcBG3a3lzyW5wh-UM,1241
|
|
46
|
-
cache_dit-0.
|
|
47
|
-
cache_dit-0.
|
|
48
|
-
cache_dit-0.
|
|
49
|
-
cache_dit-0.
|
|
50
|
-
cache_dit-0.
|
|
51
|
-
cache_dit-0.
|
|
46
|
+
cache_dit-0.3.0.dist-info/licenses/LICENSE,sha256=Dqb07Ik2dV41s9nIdMUbiRWEfDqo7-dQeRiY7kPO8PE,3769
|
|
47
|
+
cache_dit-0.3.0.dist-info/METADATA,sha256=NW9YEZ1Dt3y0_o89jS3iO9o9-Y83Yo0qTz2iDOGF4j0,45943
|
|
48
|
+
cache_dit-0.3.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
|
|
49
|
+
cache_dit-0.3.0.dist-info/entry_points.txt,sha256=FX2gysXaZx6NeK1iCLMcIdP8Q4_qikkIHtEmi3oWn8o,65
|
|
50
|
+
cache_dit-0.3.0.dist-info/top_level.txt,sha256=ZJDydonLEhujzz0FOkVbO-BqfzO9d_VqRHmZU-3MOZo,10
|
|
51
|
+
cache_dit-0.3.0.dist-info/RECORD,,
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|