cache-dit 0.1.1__py3-none-any.whl → 0.1.2__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release: this version of cache-dit might be problematic.
- cache_dit/_version.py +2 -2
- {cache_dit-0.1.1.dist-info → cache_dit-0.1.2.dist-info}/METADATA +28 -24
- {cache_dit-0.1.1.dist-info → cache_dit-0.1.2.dist-info}/RECORD +6 -6
- {cache_dit-0.1.1.dist-info → cache_dit-0.1.2.dist-info}/WHEEL +0 -0
- {cache_dit-0.1.1.dist-info → cache_dit-0.1.2.dist-info}/licenses/LICENSE +0 -0
- {cache_dit-0.1.1.dist-info → cache_dit-0.1.2.dist-info}/top_level.txt +0 -0
cache_dit/_version.py  CHANGED

{cache_dit-0.1.1.dist-info → cache_dit-0.1.2.dist-info}/METADATA  CHANGED

@@ -1,11 +1,11 @@
 Metadata-Version: 2.4
 Name: cache_dit
-Version: 0.1.
-Summary:
+Version: 0.1.2
+Summary: 🤗 CacheDiT: A Training-free and Easy-to-use Cache Acceleration Toolbox for Diffusion Transformers
 Author: DefTruth, vipshop.com, etc.
 Maintainer: DefTruth, vipshop.com, etc
-Project-URL: Repository, https://github.com/vipshop/
-Project-URL: Homepage, https://github.com/vipshop/
+Project-URL: Repository, https://github.com/vipshop/cache-dit.git
+Project-URL: Homepage, https://github.com/vipshop/cache-dit.git
 Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 License-File: LICENSE
@@ -35,18 +35,18 @@ Dynamic: requires-python
 
 <div align="center">
 <p align="center">
-<h3
+<h3>🤗 CacheDiT: A Training-free and Easy-to-use Cache Acceleration <br>Toolbox for Diffusion Transformers</h3>
 </p>
-
+<img src=https://github.com/vipshop/cache-dit/raw/dev/assets/cache-dit.png >
 <div align='center'>
 <img src=https://img.shields.io/badge/Language-Python-brightgreen.svg >
 <img src=https://img.shields.io/badge/PRs-welcome-9cf.svg >
 <img src=https://img.shields.io/badge/PyPI-pass-brightgreen.svg >
 <img src=https://img.shields.io/badge/Python-3.10|3.11|3.12-9cf.svg >
-<img src=https://img.shields.io/badge/Release-v0.1.
+<img src=https://img.shields.io/badge/Release-v0.1.2-brightgreen.svg >
 </div>
 <p align="center">
-DeepCache
+DeepCache is for UNet not DiT. Most DiT cache speedups are complex and not training-free. CacheDiT provides <br>a series of training-free, UNet-style cache accelerators for DiT: DBCache, DBPrune, FBCache, etc.
 </p>
 </div>
 
@@ -69,7 +69,7 @@ Dynamic: requires-python
 |Baseline(L20x1)|F1B0 (0.08)|F1B0 (0.20)|F8B8 (0.15)|F12B12 (0.20)|F16B16 (0.20)|
 |:---:|:---:|:---:|:---:|:---:|:---:|
 |24.85s|15.59s|8.58s|15.41s|15.11s|17.74s|
-|<img src=https://github.com/vipshop/
+|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/NONE_R0.08_S0.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F1B0S1_R0.08_S11.png width=105px> | <img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F1B0S1_R0.2_S19.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F8B8S1_R0.15_S15.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F12B12S4_R0.2_S16.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F16B16S4_R0.2_S13.png width=105px>|
 |**Baseline(L20x1)**|**F1B0 (0.08)**|**F8B8 (0.12)**|**F8B12 (0.20)**|**F8B16 (0.20)**|**F8B20 (0.20)**|
 |27.85s|6.04s|5.88s|5.77s|6.01s|6.20s|
 |<img src=https://github.com/user-attachments/assets/70ea57f4-d8f2-415b-8a96-d8315974a5e6 width=105px>|<img src=https://github.com/user-attachments/assets/fc0e1a67-19cc-44aa-bf50-04696e7978a0 width=105px> |<img src=https://github.com/user-attachments/assets/d1434896-628c-436b-95ad-43c085a8629e width=105px>|<img src=https://github.com/user-attachments/assets/aaa42cd2-57de-4c4e-8bfb-913018a8251d width=105px>|<img src=https://github.com/user-attachments/assets/dc0ba2a4-ef7c-436d-8a39-67055deab92f width=105px>|<img src=https://github.com/user-attachments/assets/aede466f-61ed-4256-8df0-fecf8020c5ca width=105px>|
@@ -93,7 +93,7 @@ These case studies demonstrate that even with relatively high thresholds (such a
 |Baseline(L20x1)|Pruned(24%)|Pruned(35%)|Pruned(38%)|Pruned(45%)|Pruned(60%)|
 |:---:|:---:|:---:|:---:|:---:|:---:|
 |24.85s|19.43s|16.82s|15.95s|14.24s|10.66s|
-|<img src=https://github.com/vipshop/
+|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/NONE_R0.08_S0.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.03_P24.0_T19.43s.png width=105px> | <img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.04_P34.6_T16.82s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.05_P38.3_T15.95s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.06_P45.2_T14.24s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.2_P59.5_T10.66s.png width=105px>|
 
 <div align="center">
 <p align="center">
@@ -103,13 +103,17 @@ These case studies demonstrate that even with relatively high thresholds (such a
 
 Moreover, both DBCache and DBPrune are **plug-and-play** solutions that works hand-in-hand with [ParaAttention](https://github.com/chengzeyi/ParaAttention). Users can easily tap into its **Context Parallelism** features for distributed inference.
 
+<p align="center">
+♥️ Please consider to leave a ⭐️ Star to support us ~ ♥️
+</p>
+
 ## ©️Citations
 
 ```BibTeX
-@misc{
-title={
-url={https://github.com/vipshop/
-note={Open-source software available at https://github.com/vipshop/
+@misc{CacheDiT@2025,
+title={CacheDiT: A Training-free and Easy-to-use cache acceleration Toolbox for Diffusion Transformers},
+url={https://github.com/vipshop/cache-dit.git},
+note={Open-source software available at https://github.com/vipshop/cache-dit.git},
 author={vipshop.com},
 year={2025}
 }
@@ -119,7 +123,7 @@ Moreover, both DBCache and DBPrune are **plug-and-play** solutions that works ha
 
 <div id="reference"></div>
 
-
+The **CacheDiT** codebase was adapted from FBCache's implementation at the [ParaAttention](https://github.com/chengzeyi/ParaAttention/tree/main/src/para_attn/first_block_cache). We would like to express our sincere gratitude for this excellent work!
 
 ## 📖Contents
 
@@ -140,7 +144,7 @@ Moreover, both DBCache and DBPrune are **plug-and-play** solutions that works ha
 
 <div id="installation"></div>
 
-You can install the stable release of `
+You can install the stable release of `cache-dit` from PyPI:
 
 ```bash
 pip3 install cache-dit
@@ -148,7 +152,7 @@ pip3 install cache-dit
 Or you can install the latest develop version from GitHub:
 
 ```bash
-pip3 install git+https://github.com/vipshop/
+pip3 install git+https://github.com/vipshop/cache-dit.git
 ```
 
 ## ⚡️DBCache: Dual Block Cache
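The `FnBm (t)` labels in the benchmark tables above appear to pair a block configuration (first n / last m transformer blocks computed) with a residual diff threshold t. A minimal, purely illustrative sketch of such a threshold-based reuse decision (all names here are hypothetical, not cache-dit's actual API):

```python
# Hypothetical sketch only: illustrates a residual-diff cache decision
# of the kind the "(0.08)" / "(0.20)" thresholds above suggest.
# Function and variable names are invented for illustration.

def relative_l1_diff(prev, cur):
    """Mean |cur - prev| normalized by mean |prev|."""
    num = sum(abs(c - p) for c, p in zip(cur, prev)) / len(prev)
    den = sum(abs(p) for p in prev) / len(prev)
    return num / max(den, 1e-8)

def should_reuse_cache(prev_residual, cur_residual, threshold=0.08):
    """Reuse cached block outputs when the early-block residual has
    drifted less than `threshold` since the last computed timestep."""
    return relative_l1_diff(prev_residual, cur_residual) < threshold

prev = [1.0, 2.0, 3.0]
near = [1.01, 2.01, 3.01]   # small drift -> reuse cached outputs
far  = [2.0, 4.0, 6.0]      # large drift -> recompute the blocks
print(should_reuse_cache(prev, near))  # True
print(should_reuse_cache(prev, far))   # False
```

Under this reading, a larger threshold trades accuracy for speed, which matches the latency spread across the table columns.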
@@ -270,13 +274,13 @@ apply_cache_on_pipe(pipe, **cache_options)
 |Baseline(L20x1)|Pruned(24%)|Pruned(35%)|Pruned(38%)|Pruned(45%)|Pruned(60%)|
 |:---:|:---:|:---:|:---:|:---:|:---:|
 |24.85s|19.43s|16.82s|15.95s|14.24s|10.66s|
-|<img src=https://github.com/vipshop/
+|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/NONE_R0.08_S0.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.03_P24.0_T19.43s.png width=105px> | <img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.04_P34.6_T16.82s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.05_P38.3_T15.95s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.06_P45.2_T14.24s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.2_P59.5_T10.66s.png width=105px>|
 
 ## 🎉Context Parallelism
 
 <div id="context-parallelism"></div>
 
-
+**CacheDiT** are **plug-and-play** solutions that works hand-in-hand with [ParaAttention](https://github.com/chengzeyi/ParaAttention). Users can **easily tap into** its **Context Parallelism** features for distributed inference. Firstly, install `para-attn` from PyPI:
 
 ```bash
 pip3 install para-attn # or install `para-attn` from sources.
@@ -312,7 +316,7 @@ apply_cache_on_pipe(
 
 <div id="compile"></div>
 
-**
+**CacheDiT** are designed to work compatibly with `torch.compile`. For example:
 
 ```python
 apply_cache_on_pipe(
@@ -321,7 +325,7 @@ apply_cache_on_pipe(
 # Compile the Transformer module
 pipe.transformer = torch.compile(pipe.transformer)
 ```
-However, users intending to use
+However, users intending to use **CacheDiT** for DiT with **dynamic input shapes** should consider increasing the **recompile** **limit** of `torch._dynamo` to achieve better performance.
 
 ```python
 torch._dynamo.config.recompile_limit = 96 # default is 8
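The recompile-limit caveat above can be pictured with a toy model (pure Python, not torch internals): each unseen input shape triggers a fresh "compile", and once the limit is exceeded calls fall back to eager execution, which is the behavior the README warns about.

```python
# Illustrative-only model of a per-shape recompile limit; this is NOT
# how torch._dynamo is implemented, just a sketch of the mechanism.

def make_compiled(fn, recompile_limit=8):
    cache = {}  # one "compiled" entry per distinct input shape

    def wrapper(xs):
        shape = len(xs)
        if shape not in cache:
            if len(cache) >= recompile_limit:
                return fn(xs), "eager"   # limit hit: eager fallback
            cache[shape] = fn            # pretend to compile for this shape
        return cache[shape](xs), "compiled"

    return wrapper

step = make_compiled(lambda xs: sum(xs), recompile_limit=2)
print(step([1, 2]))        # (3, 'compiled')   first shape
print(step([1, 2, 3]))     # (6, 'compiled')   second shape
print(step([1, 2, 3, 4]))  # (10, 'eager')     third shape exceeds the limit
```

With many dynamic shapes, a small limit means most steps run eager, so raising `recompile_limit` (as the diff's snippet does) keeps more shapes on the compiled path.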
@@ -333,9 +337,9 @@ Otherwise, the recompile_limit error may be triggered, causing the module to fal
 
 <div id="supported"></div>
 
-- [🚀FLUX.1](https://github.com/vipshop/
-- [🚀CogVideoX](https://github.com/vipshop/
-- [🚀Mochi](https://github.com/vipshop/
+- [🚀FLUX.1](https://github.com/vipshop/cache-dit/raw/main/src/cache_dit/cache_factory/dual_block_cache/diffusers_adapters)
+- [🚀CogVideoX](https://github.com/vipshop/cache-dit/raw/main/src/cache_dit/cache_factory/dual_block_cache/diffusers_adapters)
+- [🚀Mochi](https://github.com/vipshop/cache-dit/raw/main/src/cache_dit/cache_factory/dual_block_cache/diffusers_adapters)
 
 ## 👋Contribute
 <div id="contribute"></div>
{cache_dit-0.1.1.dist-info → cache_dit-0.1.2.dist-info}/RECORD  CHANGED

@@ -1,5 +1,5 @@
 cache_dit/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
-cache_dit/_version.py,sha256=
+cache_dit/_version.py,sha256=bSmADqydH8nBu-J4lG8UVuR7hnU_zcwhnSav2oQ0W0A,511
 cache_dit/logger.py,sha256=dKfNe_RRk9HJwfgHGeRR1f0LbskJpKdGmISCbL9roQs,3443
 cache_dit/primitives.py,sha256=A2iG9YLot3gOsZSPp-_gyjqjLgJvWQRx8aitD4JQ23Y,3877
 cache_dit/cache_factory/__init__.py,sha256=5RNuhWakvvqrOV4vkqrEBA7d-V1LwcNSsjtW14mkqK8,5255

@@ -24,8 +24,8 @@ cache_dit/cache_factory/first_block_cache/diffusers_adapters/cogvideox.py,sha256
 cache_dit/cache_factory/first_block_cache/diffusers_adapters/flux.py,sha256=Dcd4OzABCtyQCZNX2KNnUTdVoO1E1ApM7P8gcVYzcK0,2733
 cache_dit/cache_factory/first_block_cache/diffusers_adapters/mochi.py,sha256=lQTClo52OwPbNEE4jiBZQhfC7hbtYqnYIABp_vbm_dk,2363
 cache_dit/cache_factory/first_block_cache/diffusers_adapters/wan.py,sha256=IVH-lroOzvYb4XKLk9MOw54EtijBtuzVaKcVGz0KlBA,2656
-cache_dit-0.1.
-cache_dit-0.1.
-cache_dit-0.1.
-cache_dit-0.1.
-cache_dit-0.1.
+cache_dit-0.1.2.dist-info/licenses/LICENSE,sha256=Dqb07Ik2dV41s9nIdMUbiRWEfDqo7-dQeRiY7kPO8PE,3769
+cache_dit-0.1.2.dist-info/METADATA,sha256=0XA0RjWrjEgUHzlWRiIephOWaubyKBnLjk0YmL8PZv8,16711
+cache_dit-0.1.2.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+cache_dit-0.1.2.dist-info/top_level.txt,sha256=ZJDydonLEhujzz0FOkVbO-BqfzO9d_VqRHmZU-3MOZo,10
+cache_dit-0.1.2.dist-info/RECORD,,

{cache_dit-0.1.1.dist-info → cache_dit-0.1.2.dist-info}/WHEEL  (file without changes)
{cache_dit-0.1.1.dist-info → cache_dit-0.1.2.dist-info}/licenses/LICENSE  (file without changes)
{cache_dit-0.1.1.dist-info → cache_dit-0.1.2.dist-info}/top_level.txt  (file without changes)
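The RECORD rows above follow the wheel RECORD format: each row is `path,sha256=<urlsafe base64 digest with padding stripped>,<size in bytes>`. A small sketch that reproduces the empty-file entry for `cache_dit/__init__.py`:

```python
# Reconstruct a wheel RECORD row: urlsafe-base64 sha256 digest with
# '=' padding stripped, followed by the file size in bytes.
import base64
import hashlib

def record_line(path: str, data: bytes) -> str:
    digest = hashlib.sha256(data).digest()
    b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={b64},{len(data)}"

# An empty __init__.py hashes to the well-known empty-input sha256,
# matching the first RECORD row in the diff above.
print(record_line("cache_dit/__init__.py", b""))
# cache_dit/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
```

This is why the truncated `cache_dit/_version.py,sha256=` row changing between releases is expected: any edit to the file (here, the version bump) yields a new digest and size.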