cache_dit-0.1.7-py3-none-any.whl → cache_dit-0.1.8-py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

cache_dit/_version.py CHANGED
@@ -17,5 +17,5 @@ __version__: str
  __version_tuple__: VERSION_TUPLE
  version_tuple: VERSION_TUPLE
 
- __version__ = version = '0.1.7'
- __version_tuple__ = version_tuple = (0, 1, 7)
+ __version__ = version = '0.1.8'
+ __version_tuple__ = version_tuple = (0, 1, 8)
cache_dit/cache_factory/dynamic_block_prune/prune_context.py CHANGED
@@ -628,7 +628,7 @@ class DBPrunedTransformerBlocks(torch.nn.Module):
  return sorted(non_prune_blocks_ids)
 
  # @torch.compile(dynamic=True)
- # mark this function as compile with dynamic=True will
+ # mark this function as compile with dynamic=True will
  # cause precision degradate, so, we choose to disable it
  # now, until we find a better solution or fixed the bug.
  @torch.compiler.disable
@@ -668,7 +668,7 @@ class DBPrunedTransformerBlocks(torch.nn.Module):
  )
 
  # @torch.compile(dynamic=True)
- # mark this function as compile with dynamic=True will
+ # mark this function as compile with dynamic=True will
  # cause precision degradate, so, we choose to disable it
  # now, until we find a better solution or fixed the bug.
  @torch.compiler.disable
cache_dit-0.1.7.dist-info/METADATA → cache_dit-0.1.8.dist-info/METADATA CHANGED
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: cache_dit
- Version: 0.1.7
+ Version: 0.1.8
  Summary: 🤗 CacheDiT: A Training-free and Easy-to-use Cache Acceleration Toolbox for Diffusion Transformers
  Author: DefTruth, vipshop.com, etc.
  Maintainer: DefTruth, vipshop.com, etc
@@ -35,7 +35,7 @@ Dynamic: requires-python
 
  <div align="center">
  <p align="center">
- <h3>🤗 CacheDiT: A Training-free and Easy-to-use Cache Acceleration <br>Toolbox for Diffusion Transformers</h3>
+ <h2>🤗 CacheDiT: A Training-free and Easy-to-use Cache Acceleration <br>Toolbox for Diffusion Transformers</h2>
  </p>
  <img src=https://github.com/vipshop/cache-dit/raw/main/assets/cache-dit.png >
  <div align='center'>
@@ -44,13 +44,32 @@ Dynamic: requires-python
  <img src=https://img.shields.io/badge/PyPI-pass-brightgreen.svg >
  <img src=https://static.pepy.tech/badge/cache-dit >
  <img src=https://img.shields.io/badge/Python-3.10|3.11|3.12-9cf.svg >
- <img src=https://img.shields.io/badge/Release-v0.1.7-brightgreen.svg >
+ <img src=https://img.shields.io/badge/Release-v0.1.8-brightgreen.svg >
  </div>
  <p align="center">
  DeepCache is for UNet not DiT. Most DiT cache speedups are complex and not training-free. CacheDiT <br>offers a set of training-free cache accelerators for DiT: 🔥DBCache, DBPrune, FBCache, etc🔥
  </p>
+ <p align="center">
+ <h3> 🔥Supported Models🔥</h2>
+ <a href=https://github.com/vipshop/cache-dit/raw/main/examples> <b>🚀FLUX.1</b>: ✔️DBCache, ✔️DBPrune, ✔️FBCache🔥</a> <br>
+ <a href=https://github.com/vipshop/cache-dit/raw/main/examples> <b>🚀CogVideoX</b>: ✔️DBCache, ✔️DBPrune, ✔️FBCache🔥</a> <br>
+ <a href=https://github.com/vipshop/cache-dit/raw/main/examples> <b>🚀Mochi</b>: ✔️DBCache, ✔️DBPrune, ✔️FBCache🔥</a> <br>
+ <a href=https://github.com/vipshop/cache-dit/raw/main/examples> <b>🚀Wan2.1</b>: 🔜DBCache, 🔜DBPrune, ✔️FBCache🔥</a> <br> <br>
+ <b>♥️ Please consider to leave a ⭐️ Star to support us ~ ♥️</b>
+ </p>
  </div>
 
+
+ <!--
+ ## 🎉Supported Models
+ <div id="supported"></div>
+ - [🚀FLUX.1](https://github.com/vipshop/cache-dit/raw/main/examples): *✔️DBCache, ✔️DBPrune, ✔️FBCache*
+ - [🚀CogVideoX](https://github.com/vipshop/cache-dit/raw/main/examples): *✔️DBCache, ✔️DBPrune, ✔️FBCache*
+ - [🚀Mochi](https://github.com/vipshop/cache-dit/raw/main/examples): *✔️DBCache, ✔️DBPrune, ✔️FBCache*
+ - [🚀Wan2.1**](https://github.com/vipshop/cache-dit/raw/main/examples): *🔜DBCache, 🔜DBPrune, ✔️FBCache*
+ -->
+
+
  ## 🤗 Introduction
 
  <div align="center">
@@ -102,11 +121,20 @@ These case studies demonstrate that even with relatively high thresholds (such a
  </p>
  </div>
 
- Moreover, **CacheDiT** are **plug-and-play** solutions that works hand-in-hand with [ParaAttention](https://github.com/chengzeyi/ParaAttention). Users can easily tap into its **Context Parallelism** features for distributed inference.
+ **CacheDiT** are **plug-and-play** solutions that works hand-in-hand with [ParaAttention](https://github.com/chengzeyi/ParaAttention). Users can easily tap into its **Context Parallelism** features for distributed inference. Moreover, **CacheDiT** are designed to work compatibly with `torch.compile`. You can easily use CacheDiT with torch.compile to further achieve a better performance.
+
+ <div align="center">
+ <p align="center">
+ DBPrune + <b>torch.compile + context parallelism</b> <br>Steps: 28, "A cat holding a sign that says hello world with complex background"
+ </p>
+ </div>
 
- <p align="center">
- ♥️ Please consider to leave a ⭐️ Star to support us ~ ♥️
- </p>
+ |Baseline|Pruned(24%)|Pruned(35%)|Pruned(38%)|Pruned(45%)|Pruned(60%)|
+ |:---:|:---:|:---:|:---:|:---:|:---:|
+ |+L20x1:24.85s|19.43s|16.82s|15.95s|14.24s|10.66s|
+ |+compile:20.43s|16.25s|14.12s|13.41s|12s|8.86s|
+ |+L20x4:7.75s|6.62s|6.03s|5.81s|5.24s|3.93s|
+ |<img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_NONE_R0.08_S0_T20.43s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_DBPRUNE_F1B0_R0.03_P24.0_T16.25s.png width=105px> | <img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_DBPRUNE_F1B0_R0.04_P34.6_T14.12s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_DBPRUNE_F1B0_R0.045_P38.2_T13.41s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_DBPRUNE_F1B0_R0.055_P45.1_T12.00s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_DBPRUNE_F1B0_R0.2_P59.5_T8.86s.png width=105px>|
 
  ## ©️Citations
 
@@ -136,11 +164,9 @@ The **CacheDiT** codebase was adapted from FBCache's implementation at the [Para
  - [⚡️Dynamic Block Prune](#dbprune)
  - [🎉Context Parallelism](#context-parallelism)
  - [🔥Torch Compile](#compile)
- - [🎉Supported Models](#supported)
  - [👋Contribute](#contribute)
  - [©️License](#license)
 
-
  ## ⚙️Installation
 
  <div id="installation"></div>
@@ -370,6 +396,7 @@ Then, run the python test script with `torchrun`:
  ```bash
  torchrun --nproc_per_node=4 parallel_cache.py
  ```
+ <!--
 
  <div align="center">
  <p align="center">
@@ -377,17 +404,18 @@ torchrun --nproc_per_node=4 parallel_cache.py
  </p>
  </div>
 
- |Baseline(L20x1)|Pruned(24%)|Pruned(35%)|Pruned(38%)|Pruned(45%)|Pruned(60%)|
+ |Baseline|Pruned(24%)|Pruned(35%)|Pruned(38%)|Pruned(45%)|Pruned(60%)|
  |:---:|:---:|:---:|:---:|:---:|:---:|
- |24.85s|19.43s|16.82s|15.95s|14.24s|10.66s|
- |8.54s (L20x4)|7.20s (L20x4)|6.61s (L20x4)|6.09s (L20x4)|5.54s (L20x4)|4.22s (L20x4)|
+ |+L20x1:24.85s|19.43s|16.82s|15.95s|14.24s|10.66s|
+ |+L20x4:8.54s|7.20s|6.61s|6.09s|5.54s|4.22s|
  |<img src=https://github.com/vipshop/cache-dit/raw/main/assets/NONE_R0.08_S0.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.03_P24.0_T19.43s.png width=105px> | <img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.04_P34.6_T16.82s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.05_P38.3_T15.95s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.06_P45.2_T14.24s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBPRUNE_F1B0_R0.2_P59.5_T10.66s.png width=105px>|
+ -->
 
  ## 🔥Torch Compile
 
  <div id="compile"></div>
 
- **CacheDiT** are designed to work compatibly with `torch.compile`. For example:
+ **CacheDiT** are designed to work compatibly with `torch.compile`. You can easily use CacheDiT with torch.compile to further achieve a better performance. For example:
 
  ```python
  apply_cache_on_pipe(
@@ -396,21 +424,27 @@ apply_cache_on_pipe(
  # Compile the Transformer module
  pipe.transformer = torch.compile(pipe.transformer)
  ```
- However, users intending to use **CacheDiT** for DiT with **dynamic input shapes** should consider increasing the **recompile** **limit** of `torch._dynamo` to achieve better performance.
-
+ However, users intending to use **CacheDiT** for DiT with **dynamic input shapes** should consider increasing the **recompile** **limit** of `torch._dynamo`. Otherwise, the recompile_limit error may be triggered, causing the module to fall back to eager mode.
  ```python
  torch._dynamo.config.recompile_limit = 96 # default is 8
  torch._dynamo.config.accumulated_recompile_limit = 2048 # default is 256
  ```
- Otherwise, the recompile_limit error may be triggered, causing the module to fall back to eager mode.
 
- ## 🎉Supported Models
+ <!--
 
- <div id="supported"></div>
+ <div align="center">
+ <p align="center">
+ DBPrune + <b>torch.compile</b>, Steps: 28, "A cat holding a sign that says hello world with complex background"
+ </p>
+ </div>
 
- - [🚀FLUX.1](https://github.com/vipshop/cache-dit/raw/main/src/cache_dit/cache_factory/dual_block_cache/diffusers_adapters)
- - [🚀CogVideoX](https://github.com/vipshop/cache-dit/raw/main/src/cache_dit/cache_factory/dual_block_cache/diffusers_adapters)
- - [🚀Mochi](https://github.com/vipshop/cache-dit/raw/main/src/cache_dit/cache_factory/dual_block_cache/diffusers_adapters)
+ |Baseline|Pruned(24%)|Pruned(35%)|Pruned(38%)|Pruned(45%)|Pruned(60%)|
+ |:---:|:---:|:---:|:---:|:---:|:---:|
+ |+L20x1:24.8s|19.4s|16.8s|15.9s|14.2s|10.6s|
+ |+compile:20.4s|16.5s|14.1s|13.4s|12s|8.8s|
+ |+L20x4:7.7s|6.6s|6.0s|5.8s|5.2s|3.9s|
+ |<img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_NONE_R0.08_S0_T20.43s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_DBPRUNE_F1B0_R0.03_P24.0_T16.25s.png width=105px> | <img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_DBPRUNE_F1B0_R0.04_P34.6_T14.12s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_DBPRUNE_F1B0_R0.045_P38.2_T13.41s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_DBPRUNE_F1B0_R0.055_P45.1_T12.00s.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/U0_C1_DBPRUNE_F1B0_R0.2_P59.5_T8.86s.png width=105px>|
+ -->
 
  ## 👋Contribute
  <div id="contribute"></div>
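
The DBPrune timing table that 0.1.8 adds to the README lists wall-clock seconds per configuration. As a reading aid, the sketch below converts those timings into speedups relative to the unpruned `+L20x1` baseline. The row labels and numbers are copied from the table. Reading `+L20x1` as a single L20 GPU, `+compile` as the same setup with `torch.compile`, and `+L20x4` as four GPUs with context parallelism is an assumption drawn from the table caption, and the helper name `speedups` is hypothetical.

```python
# Timings in seconds, copied from the README table added in 0.1.8.
# Columns: unpruned baseline, then DBPrune at roughly 24/35/38/45/60% pruned.
PRUNE_LEVELS = ["baseline", "24%", "35%", "38%", "45%", "60%"]
TIMINGS = {
    "+L20x1": [24.85, 19.43, 16.82, 15.95, 14.24, 10.66],
    "+compile": [20.43, 16.25, 14.12, 13.41, 12.00, 8.86],
    "+L20x4": [7.75, 6.62, 6.03, 5.81, 5.24, 3.93],
}


def speedups(timings, reference):
    """Return {row: [reference / t, ...]}, each ratio rounded to two decimals."""
    return {
        row: [round(reference / t, 2) for t in times]
        for row, times in timings.items()
    }


if __name__ == "__main__":
    # Reference point: single configuration, no pruning (assumed meaning of +L20x1).
    reference = TIMINGS["+L20x1"][0]
    for row, values in speedups(TIMINGS, reference).items():
        cells = ", ".join(
            f"{level}: {value:.2f}x" for level, value in zip(PRUNE_LEVELS, values)
        )
        print(f"{row:>9} | {cells}")
```

Each ratio only restates the published timings; it says nothing about output quality at a given prune ratio.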
cache_dit-0.1.7.dist-info/RECORD → cache_dit-0.1.8.dist-info/RECORD CHANGED
@@ -1,5 +1,5 @@
  cache_dit/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
- cache_dit/_version.py,sha256=W_EoL8cAL4KhujvbYWEpb9NqRLbbrH0T024lJvRRWHI,511
+ cache_dit/_version.py,sha256=AjUi5zEL_BoWoXMXR1FnWc3mD6FHX7snDXjDHVLoens,511
  cache_dit/logger.py,sha256=dKfNe_RRk9HJwfgHGeRR1f0LbskJpKdGmISCbL9roQs,3443
  cache_dit/primitives.py,sha256=A2iG9YLot3gOsZSPp-_gyjqjLgJvWQRx8aitD4JQ23Y,3877
  cache_dit/cache_factory/__init__.py,sha256=5RNuhWakvvqrOV4vkqrEBA7d-V1LwcNSsjtW14mkqK8,5255
@@ -12,7 +12,7 @@ cache_dit/cache_factory/dual_block_cache/diffusers_adapters/cogvideox.py,sha256=
  cache_dit/cache_factory/dual_block_cache/diffusers_adapters/flux.py,sha256=UbE6nIF-EtA92QxIZVMzIssdZKQSPAVX1hchF9R8drU,2754
  cache_dit/cache_factory/dual_block_cache/diffusers_adapters/mochi.py,sha256=qxMu1L3ycT8F-uxpGsmFQBY_BH1vDiGIOXgS_Qbb7dM,2391
  cache_dit/cache_factory/dynamic_block_prune/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
- cache_dit/cache_factory/dynamic_block_prune/prune_context.py,sha256=foUGCBtpCbfLWw6pxJguyxOfcp_YrizfDEKawCt_UKI,35028
+ cache_dit/cache_factory/dynamic_block_prune/prune_context.py,sha256=YRDwZ_16yjThpgVgDv6YaIB4QCE9nEkE-MOru0jOd50,35026
  cache_dit/cache_factory/dynamic_block_prune/diffusers_adapters/__init__.py,sha256=8IjJjZOs5XRzsj7Ni2MXpR2Z1PUyRSONIhmfAn1G0eM,1667
  cache_dit/cache_factory/dynamic_block_prune/diffusers_adapters/cogvideox.py,sha256=ORJpdkXkgziDUo-rpebC6pUemgYaDCoeu0cwwLz175U,2407
  cache_dit/cache_factory/dynamic_block_prune/diffusers_adapters/flux.py,sha256=KbEkLSsHtS6xwLWNh3jlOlXRyGRdrI2pWV1zyQxMTj4,2757
@@ -24,8 +24,8 @@ cache_dit/cache_factory/first_block_cache/diffusers_adapters/cogvideox.py,sha256
  cache_dit/cache_factory/first_block_cache/diffusers_adapters/flux.py,sha256=Dcd4OzABCtyQCZNX2KNnUTdVoO1E1ApM7P8gcVYzcK0,2733
  cache_dit/cache_factory/first_block_cache/diffusers_adapters/mochi.py,sha256=lQTClo52OwPbNEE4jiBZQhfC7hbtYqnYIABp_vbm_dk,2363
  cache_dit/cache_factory/first_block_cache/diffusers_adapters/wan.py,sha256=IVH-lroOzvYb4XKLk9MOw54EtijBtuzVaKcVGz0KlBA,2656
- cache_dit-0.1.7.dist-info/licenses/LICENSE,sha256=Dqb07Ik2dV41s9nIdMUbiRWEfDqo7-dQeRiY7kPO8PE,3769
- cache_dit-0.1.7.dist-info/METADATA,sha256=1laHsnvDQmPDn5f8FXzSdbFRyrKZ2UljISfmxCWw8J0,20822
- cache_dit-0.1.7.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- cache_dit-0.1.7.dist-info/top_level.txt,sha256=ZJDydonLEhujzz0FOkVbO-BqfzO9d_VqRHmZU-3MOZo,10
- cache_dit-0.1.7.dist-info/RECORD,,
+ cache_dit-0.1.8.dist-info/licenses/LICENSE,sha256=Dqb07Ik2dV41s9nIdMUbiRWEfDqo7-dQeRiY7kPO8PE,3769
+ cache_dit-0.1.8.dist-info/METADATA,sha256=sAYGKro4VfeE_SHrZA8X0BcHfw9y3YY_Qcj9ONkbemE,23952
+ cache_dit-0.1.8.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ cache_dit-0.1.8.dist-info/top_level.txt,sha256=ZJDydonLEhujzz0FOkVbO-BqfzO9d_VqRHmZU-3MOZo,10
+ cache_dit-0.1.8.dist-info/RECORD,,
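
Each RECORD line in this diff has the form `path,sha256=<digest>,<size>`. In the wheel format, the digest is the URL-safe Base64 encoding of the file's SHA-256 hash with the trailing `=` padding removed, and the size is in bytes. A minimal sketch for recomputing an entry from file contents follows. The helper names `record_digest` and `record_line` are hypothetical and belong neither to cache-dit nor to any packaging tool.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """URL-safe Base64 SHA-256 digest without '=' padding, as used in RECORD."""
    raw = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def record_line(path: str, data: bytes) -> str:
    """Build one RECORD entry: path, hash, and size in bytes."""
    return f"{path},sha256={record_digest(data)},{len(data)}"


if __name__ == "__main__":
    # An empty file reproduces the zero-byte cache_dit/__init__.py entry.
    print(record_line("cache_dit/__init__.py", b""))
```

Applied to the files unpacked from either wheel, the same function should reproduce the corresponding entries, which gives a quick way to confirm that only `_version.py`, `prune_context.py`, and the `dist-info` files differ between the two releases.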