cache-dit 0.2.19__py3-none-any.whl → 0.2.20__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of cache-dit might be problematic.

cache_dit/_version.py CHANGED
@@ -28,7 +28,7 @@ version_tuple: VERSION_TUPLE
  commit_id: COMMIT_ID
  __commit_id__: COMMIT_ID
  
- __version__ = version = '0.2.19'
- __version_tuple__ = version_tuple = (0, 2, 19)
+ __version__ = version = '0.2.20'
+ __version_tuple__ = version_tuple = (0, 2, 20)
  
  __commit_id__ = commit_id = None
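The only change in `cache_dit/_version.py` is the version bump. A quick way to confirm the installed version after upgrading, using only the Python standard library (no cache-dit API is assumed here):

```python
# Verify the installed cache_dit version matches the new wheel.
from importlib.metadata import PackageNotFoundError, version

try:
    print(version("cache_dit"))  # expected: 0.2.20 after upgrading
except PackageNotFoundError:
    print("cache_dit is not installed")
```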
{cache_dit-0.2.19.dist-info → cache_dit-0.2.20.dist-info}/METADATA RENAMED
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: cache_dit
- Version: 0.2.19
+ Version: 0.2.20
  Summary: 🤗 CacheDiT: An Unified and Training-free Cache Acceleration Toolbox for Diffusion Transformers
  Author: DefTruth, vipshop.com, etc.
  Maintainer: DefTruth, vipshop.com, etc
@@ -40,10 +40,12 @@ Dynamic: requires-dist
  Dynamic: requires-python
  
  <div align="center">
+ <img src=https://github.com/vipshop/cache-dit/raw/main/assets/cache-dit-logo.png height="120">
+ 
  <p align="center">
- <h2>🤗 CacheDiT: An Unified and Training-free Cache Acceleration <br>Toolbox for Diffusion Transformers</h2>
+ An <b>Unified</b> and Training-free <b>Cache Acceleration</b> Toolbox for <b>Diffusion Transformers</b> <br>
+ ♥️ <b>Cache Acceleration</b> with <b>One-line</b> Code ~ ♥️
  </p>
- <img src=https://github.com/vipshop/cache-dit/raw/main/assets/cache-dit-v1.png >
  <div align='center'>
  <img src=https://img.shields.io/badge/Language-Python-brightgreen.svg >
  <img src=https://img.shields.io/badge/PRs-welcome-9cf.svg >
@@ -52,25 +54,36 @@ Dynamic: requires-python
  <img src=https://img.shields.io/badge/Python-3.10|3.11|3.12-9cf.svg >
  <img src=https://img.shields.io/badge/Release-v0.2-brightgreen.svg >
  </div>
+ <p align="center">
  🔥<b><a href="#unified">Unified Cache APIs</a> | <a href="#dbcache">DBCache</a> | <a href="#taylorseer">Hybrid TaylorSeer</a> | <a href="#cfg">Hybrid Cache CFG</a></b>🔥
- </div>
- 
- <div align="center">
+ </p>
  <p align="center">
- ♥️ Cache <b>Acceleration</b> with <b>One-line</b> Code ~ ♥️
+ 🎉Now, <b>Diffusers's</b> Pipelines <b>Coverage Ratio: 60%~70%</b>🎉
  </p>
- </div>
+ </div>
+ 
+ <!--
+ <img src=https://github.com/vipshop/cache-dit/raw/main/assets/cache-dit-v1.png >
+ <img src=https://github.com/vipshop/cache-dit/raw/main/assets/dbcache-v1.png height="320px">
+ <img src=https://github.com/vipshop/cache-dit/raw/main/assets/dbcache-v1.png>
+ <img src=https://github.com/vipshop/cache-dit/raw/main/assets/patterns.png>
+ -->
  
  ## 🔥News
  
  - [2025-08-19] 🔥[**Qwen-Image-Edit**](https://github.com/QwenLM/Qwen-Image) **2x⚡️** speedup! Check example [run_qwen_image_edit.py](./examples/run_qwen_image_edit.py).
- - [2025-08-18] 🎉Early **[Unified Cache APIs](#unified)** released! Check [Qwen-Image w/ UAPI](./examples/run_qwen_image_uapi.py) as an example.
  - [2025-08-12] 🎉First caching mechanism in [QwenLM/Qwen-Image](https://github.com/QwenLM/Qwen-Image) with **[cache-dit](https://github.com/vipshop/cache-dit)**, check the [PR](https://github.com/QwenLM/Qwen-Image/pull/61).
  - [2025-08-11] 🔥[**Qwen-Image**](https://github.com/QwenLM/Qwen-Image) **1.8x⚡️** speedup! Please refer [run_qwen_image.py](./examples/run_qwen_image.py) as an example.
+ 
+ <details>
+ <summary> Previous News </summary>
+ 
  - [2025-08-10] 🔥[**FLUX.1-Kontext-dev**](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev) is supported! Please refer [run_flux_kontext.py](./examples/run_flux_kontext.py) as an example.
  - [2025-07-18] 🎉First caching mechanism in [🤗huggingface/flux-fast](https://github.com/huggingface/flux-fast) with **[cache-dit](https://github.com/vipshop/cache-dit)**, check the [PR](https://github.com/huggingface/flux-fast/pull/13).
  - [2025-07-13] **[🤗flux-faster](https://github.com/xlite-dev/flux-faster)** is released! **3.3x** speedup for FLUX.1 on NVIDIA L20 with **[cache-dit](https://github.com/vipshop/cache-dit)**.
  
+ </details>
+ 
  ## 📖Contents
  
  <div id="contents"></div>
@@ -81,7 +94,7 @@ Dynamic: requires-python
  - [⚡️Dual Block Cache](#dbcache)
  - [🔥Hybrid TaylorSeer](#taylorseer)
  - [⚡️Hybrid Cache CFG](#cfg)
- - [🔥Torch Compile](#compile)
+ - [⚙️Torch Compile](#compile)
  - [🛠Metrics CLI](#metrics)
  
  ## ⚙️Installation
@@ -110,11 +123,15 @@ Currently, **cache-dit** library supports almost **Any** Diffusion Transformers
  - [🚀FLUX.1-dev](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀FLUX.1-Fill-dev](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀FLUX.1-Kontext-dev](https://github.com/vipshop/cache-dit/raw/main/examples)
- - [🚀mochi-1-preview](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀CogVideoX](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀CogVideoX1.5](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀Wan2.1-T2V](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀Wan2.1-FLF2V](https://github.com/vipshop/cache-dit/raw/main/examples)
+ 
+ <details>
+ <summary> More Pipelines </summary>
+ 
+ - [🚀mochi-1-preview](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀HunyuanVideo](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀LTXVideo](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀Allegro](https://github.com/vipshop/cache-dit/raw/main/examples)
@@ -124,22 +141,18 @@ Currently, **cache-dit** library supports almost **Any** Diffusion Transformers
  - [🚀EasyAnimate](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀SkyReelsV2](https://github.com/vipshop/cache-dit/raw/main/examples)
  - [🚀SD3](https://github.com/vipshop/cache-dit/raw/main/examples)
+ 
+ </details>
  
  ## 🎉Unified Cache APIs
  
  <div id="unified"></div>
  
- Currently, for any **Diffusion** models with **Transformer Blocks** that match the specific **Input/Output patterns**, we can use the **Unified Cache APIs** from **cache-dit**, namely, the `cache_dit.enable_cache(...)` API. The supported patterns are listed as follows:
+ Currently, for any **Diffusion** models with **Transformer Blocks** that match the specific **Input/Output patterns**, we can use the **Unified Cache APIs** from **cache-dit**, namely, the `cache_dit.enable_cache(...)` API. The **Unified Cache APIs** are currently in the experimental phase; please stay tuned for updates. The supported patterns are listed as follows:
  
- ```python
- (IN: hidden_states, encoder_hidden_states, ...) -> (OUT: hidden_states, encoder_hidden_states)
- (IN: hidden_states, encoder_hidden_states, ...) -> (OUT: encoder_hidden_states, hidden_states)
- (IN: hidden_states, encoder_hidden_states, ...) -> (OUT: hidden_states)
- (IN: hidden_states, ...) -> (OUT: hidden_states) # TODO, DiT, Lumina2, etc.
- ```
- 
- After the `cache_dit.enable_cache(...)` API is called, you just need to call the pipe as normal. The `pipe` param can be **any** Diffusion Pipeline. Please refer to [Qwen-Image](./examples/run_qwen_image_uapi.py) as an example. The **Unified Cache APIs** are currently in the experimental phase; please stay tuned for updates.
+ ![](https://github.com/vipshop/cache-dit/raw/main/assets/patterns.png)
  
+ After the `cache_dit.enable_cache(...)` API is called, you just need to call the pipe as normal. The `pipe` param can be **any** Diffusion Pipeline. Please refer to [Qwen-Image](./examples/run_qwen_image_uapi.py) as an example.
  ```python
  import cache_dit
  from diffusers import DiffusionPipeline
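The README's example snippet is cut off at this hunk boundary. A minimal sketch of the full flow it describes, assuming a Diffusers pipeline; the checkpoint ID and prompt below are illustrative, not taken from the package:

```python
import torch
from diffusers import DiffusionPipeline

import cache_dit

# Illustrative checkpoint (assumption): any Diffusers pipeline whose
# transformer blocks match a supported input/output pattern should work.
pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image",
    torch_dtype=torch.bfloat16,
).to("cuda")

# One-line cache acceleration via the unified API.
cache_dit.enable_cache(pipe)

# Call the pipeline as normal after enabling the cache.
image = pipe("A cat holding a sign that says hello world").images[0]
image.save("qwen_image_cached.png")
```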
@@ -180,47 +193,13 @@ After finishing each inference of `pipe(...)`, you can call the `cache_dit.summa
  
  <div id="dbcache"></div>
  
- ![](https://github.com/vipshop/cache-dit/raw/main/assets/dbcache-v1.png)
- 
- 
- **DBCache**: **Dual Block Caching** for Diffusion Transformers. We have enhanced `FBCache` into a more general and customizable cache algorithm, namely `DBCache`, enabling it to achieve fully `UNet-style` cache acceleration for DiT models. Different configurations of compute blocks (**F8B12**, etc.) can be customized in DBCache. Moreover, it can be entirely **training**-**free**. DBCache can strike a perfect **balance** between performance and precision!
- 
- <div align="center">
- <p align="center">
- DBCache, <b> L20x1 </b>, Steps: 28, "A cat holding a sign that says hello world with complex background"
- </p>
- </div>
- 
- |Baseline(L20x1)|F1B0 (0.08)|F1B0 (0.20)|F8B8 (0.15)|F12B12 (0.20)|F16B16 (0.20)|
- |:---:|:---:|:---:|:---:|:---:|:---:|
- |24.85s|15.59s|8.58s|15.41s|15.11s|17.74s|
- |<img src=https://github.com/vipshop/cache-dit/raw/main/assets/NONE_R0.08_S0.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F1B0S1_R0.08_S11.png width=105px> | <img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F1B0S1_R0.2_S19.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F8B8S1_R0.15_S15.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F12B12S4_R0.2_S16.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F16B16S4_R0.2_S13.png width=105px>|
- |**Baseline(L20x1)**|**F1B0 (0.08)**|**F8B8 (0.12)**|**F8B12 (0.12)**|**F8B16 (0.20)**|**F8B20 (0.20)**|
- |27.85s|6.04s|5.88s|5.77s|6.01s|6.20s|
- |<img src=https://github.com/vipshop/cache-dit/raw/main/assets/TEXTURE_NONE_R0.08.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/TEXTURE_DBCACHE_F1B0_R0.08.png width=105px> |<img src=https://github.com/vipshop/cache-dit/raw/main/assets/TEXTURE_DBCACHE_F8B8_R0.12.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/TEXTURE_DBCACHE_F8B12_R0.12.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/TEXTURE_DBCACHE_F8B16_R0.2.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/TEXTURE_DBCACHE_F8B20_R0.2.png width=105px>|
- 
- <div align="center">
- <p align="center">
- DBCache, <b> L20x4 </b>, Steps: 20, case to show the texture recovery ability of DBCache
- </p>
- </div>
- 
- These case studies demonstrate that even with relatively high thresholds (such as 0.12, 0.15, 0.2, etc.) under the DBCache **F12B12** or **F8B16** configuration, the detailed texture of the kitten's fur, colored cloth, and the clarity of text can still be preserved. This suggests that users can leverage DBCache to effectively balance performance and precision in their workflows!
- 
+ ![](https://github.com/vipshop/cache-dit/raw/main/assets/dbcache-fnbn-v1.png)
  
- **DBCache** provides configurable parameters for custom optimization, enabling a balanced trade-off between performance and precision:
+ **DBCache**: **Dual Block Caching** for Diffusion Transformers. Different configurations of compute blocks (**F8B12**, etc.) can be customized in DBCache, enabling a balanced trade-off between performance and precision. Moreover, it can be entirely **training**-**free**. Please check [DBCache.md](./docs/DBCache.md) docs for more design details.
  
  - **Fn**: Specifies that DBCache uses the **first n** Transformer blocks to fit the information at time step t, enabling the calculation of a more stable L1 diff and delivering more accurate information to subsequent blocks.
  - **Bn**: Further fuses approximate information in the **last n** Transformer blocks to enhance prediction accuracy. These blocks act as an auto-scaler for approximate hidden states that use residual cache.
  
- ![](https://github.com/vipshop/cache-dit/raw/main/assets/dbcache-fnbn-v1.png)
- 
- - **warmup_steps**: (default: 0) DBCache does not apply the caching strategy when the number of running steps is less than or equal to this value, ensuring the model sufficiently learns basic features during warmup.
- - **max_cached_steps**: (default: -1) DBCache disables the caching strategy when the previous cached steps exceed this value to prevent precision degradation.
- - **residual_diff_threshold**: The value of residual diff threshold, a higher value leads to faster performance at the cost of lower precision.
- 
- For a good balance between performance and precision, DBCache is configured by default with **F8B0**, 8 warmup steps, and unlimited cached steps.
- 
  ```python
  import cache_dit
  from diffusers import FluxPipeline
@@ -230,7 +209,8 @@ pipe = FluxPipeline.from_pretrained(
      torch_dtype=torch.bfloat16,
  ).to("cuda")
  
- # Default options, F8B0, good balance between performance and precision
+ # Default options, F8B0, 8 warmup steps, and unlimited cached
+ # steps for good balance between performance and precision
  cache_options = cache_dit.default_options()
  
  # Custom options, F8B8, higher precision
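The custom-options example is likewise truncated at this hunk boundary. A hedged sketch of what a custom DBCache configuration can look like; the option keys below follow the parameter names documented in the README text removed by this release (`warmup_steps`, `max_cached_steps`, `residual_diff_threshold`) and may not match the exact schema `enable_cache` accepts in 0.2.20:

```python
import torch
from diffusers import FluxPipeline

import cache_dit

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",  # the README's FLUX.1-dev example
    torch_dtype=torch.bfloat16,
).to("cuda")

# Start from the defaults (F8B0, 8 warmup steps, unlimited cached steps)
# and override selected knobs. Key names are assumptions taken from the
# removed README text, not a confirmed 0.2.20 schema.
cache_options = {
    **cache_dit.default_options(),
    "warmup_steps": 8,                # no caching while steps <= 8
    "max_cached_steps": -1,           # -1 disables the cap
    "residual_diff_threshold": 0.12,  # higher = faster, less precise
}

cache_dit.enable_cache(pipe, **cache_options)
```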
@@ -259,6 +239,17 @@ cache_options = {
  }
  ```
  
+ <div align="center">
+ <p align="center">
+ DBCache, <b> L20x1 </b>, Steps: 28, "A cat holding a sign that says hello world with complex background"
+ </p>
+ </div>
+ 
+ |Baseline(L20x1)|F1B0 (0.08)|F1B0 (0.20)|F8B8 (0.15)|F12B12 (0.20)|F16B16 (0.20)|
+ |:---:|:---:|:---:|:---:|:---:|:---:|
+ |24.85s|15.59s|8.58s|15.41s|15.11s|17.74s|
+ |<img src=https://github.com/vipshop/cache-dit/raw/main/assets/NONE_R0.08_S0.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F1B0S1_R0.08_S11.png width=105px> | <img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F1B0S1_R0.2_S19.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F8B8S1_R0.15_S15.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F12B12S4_R0.2_S16.png width=105px>|<img src=https://github.com/vipshop/cache-dit/raw/main/assets/DBCACHE_F16B16S4_R0.2_S13.png width=105px>|
+ 
  ## 🔥Hybrid TaylorSeer
  
  <div id="taylorseer"></div>
@@ -326,16 +317,15 @@ cache_options = {
  }
  ```
  
- ## 🔥Torch Compile
+ ## ⚙️Torch Compile
  
  <div id="compile"></div>
  
  By the way, **cache-dit** is designed to work compatibly with **torch.compile.** You can easily use cache-dit with torch.compile to further achieve a better performance. For example:
  
  ```python
- cache_dit.enable_cache(
-     pipe, **cache_dit.default_options()
- )
+ cache_dit.enable_cache(pipe)
+ 
  # Compile the Transformer module
  pipe.transformer = torch.compile(pipe.transformer)
  ```
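Put end to end, the compile path the README describes looks like this; a sketch assuming the FLUX.1-dev checkpoint from the README's examples:

```python
import torch
from diffusers import FluxPipeline

import cache_dit

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",  # illustrative checkpoint
    torch_dtype=torch.bfloat16,
).to("cuda")

# Follow the README's ordering: enable the cache first, then compile
# only the transformer module.
cache_dit.enable_cache(pipe)
pipe.transformer = torch.compile(pipe.transformer)

image = pipe("A cat holding a sign that says hello world").images[0]
```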
{cache_dit-0.2.19.dist-info → cache_dit-0.2.20.dist-info}/RECORD RENAMED
@@ -1,5 +1,5 @@
  cache_dit/__init__.py,sha256=TvZI861ipGnYaOEHJA0Og-ksRUGNCld-PGy_NgjcKZE,641
- cache_dit/_version.py,sha256=32XF9c5EeiOUdyiWeKcwkXTWZQBgtvbmKx8wZoDEW0o,706
+ cache_dit/_version.py,sha256=4N3ayuoZZJYPEGMvrxu7tnGigRTxbAdCyp5a8y7c6aw,706
  cache_dit/logger.py,sha256=0zsu42hN-3-rgGC_C29ms1IvVpV4_b4_SwJCKSenxBE,4304
  cache_dit/primitives.py,sha256=A2iG9YLot3gOsZSPp-_gyjqjLgJvWQRx8aitD4JQ23Y,3877
  cache_dit/utils.py,sha256=yybhUTGPfeCoIVZzpoefZ2ypvH8de-10UhPls81ceG4,4800
@@ -22,9 +22,9 @@ cache_dit/metrics/fid.py,sha256=9Ivtazl6mW0Bon2VXa-Ia5Xj2ewxRD3V1Qkd69zYM3Y,1706
  cache_dit/metrics/inception.py,sha256=pBVe2X6ylLPIXTG4-GWDM9DWnCviMJbJ45R3ulhktR0,12759
  cache_dit/metrics/lpips.py,sha256=I2qCNi6qJh5TRsaIsdxO0WoRX1DN7U_H3zS0oCSahYM,1032
  cache_dit/metrics/metrics.py,sha256=8jvM1sF-nDxUuwCRy44QEoo4dYVLCQVh1QyAMs4eaQY,27840
- cache_dit-0.2.19.dist-info/licenses/LICENSE,sha256=Dqb07Ik2dV41s9nIdMUbiRWEfDqo7-dQeRiY7kPO8PE,3769
- cache_dit-0.2.19.dist-info/METADATA,sha256=cCnv_b_F06xdttqdHhnbmPDpF_xRgz-O03tYfvzGGrI,20910
- cache_dit-0.2.19.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- cache_dit-0.2.19.dist-info/entry_points.txt,sha256=FX2gysXaZx6NeK1iCLMcIdP8Q4_qikkIHtEmi3oWn8o,65
- cache_dit-0.2.19.dist-info/top_level.txt,sha256=ZJDydonLEhujzz0FOkVbO-BqfzO9d_VqRHmZU-3MOZo,10
- cache_dit-0.2.19.dist-info/RECORD,,
+ cache_dit-0.2.20.dist-info/licenses/LICENSE,sha256=Dqb07Ik2dV41s9nIdMUbiRWEfDqo7-dQeRiY7kPO8PE,3769
+ cache_dit-0.2.20.dist-info/METADATA,sha256=rq7U5fZNeRMn6CvKRyuh1JcovMSwbDWGt1v6LzfaBUE,18752
+ cache_dit-0.2.20.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ cache_dit-0.2.20.dist-info/entry_points.txt,sha256=FX2gysXaZx6NeK1iCLMcIdP8Q4_qikkIHtEmi3oWn8o,65
+ cache_dit-0.2.20.dist-info/top_level.txt,sha256=ZJDydonLEhujzz0FOkVbO-BqfzO9d_VqRHmZU-3MOZo,10
+ cache_dit-0.2.20.dist-info/RECORD,,