
diffsynth 2.0.10.tar.gz → 2.0.12.tar.gz


Files changed (187)

{diffsynth-2.0.10 → diffsynth-2.0.12}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: diffsynth
-Version: 2.0.10
+Version: 2.0.12
 Summary: Enjoy the magic of Diffusion models!
 Author: ModelScope Team
 License: Apache-2.0
@@ -33,6 +33,14 @@ Requires-Dist: torch==2.7.1+cpu; extra == "npu"
 Requires-Dist: torch-npu==2.7.1; extra == "npu"
 Requires-Dist: torchvision==0.22.1+cpu; extra == "npu"
 Provides-Extra: audio
+Requires-Dist: av; extra == "audio"
 Requires-Dist: torchaudio; extra == "audio"
 Requires-Dist: torchcodec; extra == "audio"
+Requires-Dist: librosa; extra == "audio"
+Provides-Extra: all
+Requires-Dist: av; extra == "all"
+Requires-Dist: torchaudio; extra == "all"
+Requires-Dist: torchcodec; extra == "all"
+Requires-Dist: librosa; extra == "all"
+Requires-Dist: streamlit; extra == "all"
 Dynamic: license-file
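
The PKG-INFO hunk above extends the `audio` extra and introduces an `all` extra. As a rough illustration of what changed, the sketch below groups the `Requires-Dist` lines visible in that hunk by extra. The `group_by_extra` helper and the regular expression are illustrative assumptions, not part of diffsynth, and only the lines shown in the hunk are included.

```python
import re
from collections import defaultdict

# Requires-Dist lines copied from the 2.0.12 side of the PKG-INFO hunk.
PKG_INFO_LINES = [
    'Requires-Dist: av; extra == "audio"',
    'Requires-Dist: torchaudio; extra == "audio"',
    'Requires-Dist: torchcodec; extra == "audio"',
    'Requires-Dist: librosa; extra == "audio"',
    'Requires-Dist: av; extra == "all"',
    'Requires-Dist: torchaudio; extra == "all"',
    'Requires-Dist: torchcodec; extra == "all"',
    'Requires-Dist: librosa; extra == "all"',
    'Requires-Dist: streamlit; extra == "all"',
]

# Illustrative pattern: handles only the simple `name; extra == "x"` form above.
PATTERN = re.compile(r'^Requires-Dist:\s*([^;]+?)\s*;\s*extra\s*==\s*"([^"]+)"$')


def group_by_extra(lines):
    """Hypothetical helper: map each extra name to its list of requirements."""
    extras = defaultdict(list)
    for line in lines:
        match = PATTERN.match(line)
        if match:
            requirement, extra = match.groups()
            extras[extra].append(requirement)
    return dict(extras)


extras = group_by_extra(PKG_INFO_LINES)
# Requirements that `all` adds on top of `audio`.
only_in_all = sorted(set(extras["all"]) - set(extras["audio"]))
print(only_in_all)
```

With standard pip extras syntax, these groups would be selected as `pip install "diffsynth[audio]"` or `pip install "diffsynth[all]"`.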

{diffsynth-2.0.10 → diffsynth-2.0.12}/README.md RENAMED

@@ -34,6 +34,19 @@ We believe that a well-developed open-source code framework can lower the thresh
 > Currently, the development personnel of this project are limited, with most of the work handled by [Artiprocher](https://github.com/Artiprocher) and [mi804](https://github.com/mi804). Therefore, the progress of new feature development will be relatively slow, and the speed of responding to and resolving issues is limited. We apologize for this and ask developers to understand.
+- **May 18, 2026** Added **CPU Offload Training** support. By moving model weights layer-by-layer between CPU and GPU, it significantly reduces GPU VRAM usage during training, enabling LoRA training of large models even on consumer-grade GPUs, compatible with all models. Simply add `--enable_model_cpu_offload` to your training command to enable (currently supports single-GPU training only). For details, see the [documentation](/docs/en/Training/Offload_Training.md).
+- **May 14, 2026** HiDream-O1-Image open-sourced, welcome a new member to the image model family! Support includes text-to-image generation, image editing, low VRAM inference, and training capabilities. For details, please refer to the [documentation](/docs/en/Model_Details/HiDream-O1-Image.md) and [example code](/examples/hidream_o1_image/).
+- **April 28, 2026** 🔥 We are excited to announce the release of **Diffusion Templates**, a plugin framework designed for Diffusion models that significantly lowers the barrier to training controllable generative models. Let's explore this cutting-edge technology together!
+    * Open-source code: [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio)
+    * Technical report: [arXiv](https://arxiv.org/abs/2604.24351)
+    * Project homepage: [GitHub](https://modelscope.github.io/diffusion-templates-web/)
+    * Documentation: [English Version](https://diffsynth-studio-doc.readthedocs.io/en/latest/Diffusion_Templates/Introducing_Diffusion_Templates.html) | [Chinese Version](https://diffsynth-studio-doc.readthedocs.io/zh-cn/latest/Diffusion_Templates/Introducing_Diffusion_Templates.html)
+    * Online demo: [ModelScope](https://modelscope.cn/studios/DiffSynth-Studio/Diffusion-Templates)
+    * Model collections: [ModelScope](https://modelscope.cn/collections/DiffSynth-Studio/KleinBase4B-Templates) | [ModelScope International](https://modelscope.ai/collections/DiffSynth-Studio/KleinBase4B-Templates) | [HuggingFace](https://huggingface.co/collections/DiffSynth-Studio/kleinbase4b-templates)
+    * Datasets: [ModelScope](https://modelscope.cn/collections/DiffSynth-Studio/ImagePulseV2) | [ModelScope International](https://modelscope.ai/collections/DiffSynth-Studio/ImagePulseV2) | [HuggingFace](https://huggingface.co/collections/DiffSynth-Studio/imagepulsev2)
 - **April 27, 2026** We support ACE-Step-1.5! Support includes text-to-music generation, low VRAM inference, and LoRA training capabilities. For details, please refer to the [documentation](/docs/en/Model_Details/ACE-Step.md) and [example code](/examples/ace_step/).
 - **April 27, 2026**: We have reinstated support for the Stable Diffusion v1.5 and SDXL models, providing academic research support exclusively for these two model types.
@@ -96,7 +109,7 @@ We believe that a well-developed open-source code framework can lower the thresh
 - **August 20, 2025** We open-sourced the [DiffSynth-Studio/Qwen-Image-Edit-Lowres-Fix](https://www.modelscope.cn/models/DiffSynth-Studio/Qwen-Image-Edit-Lowres-Fix) model, improving the editing effect of Qwen-Image-Edit on low-resolution image inputs. Please refer to [our sample code](./examples/qwen_image/model_inference/Qwen-Image-Edit-Lowres-Fix.py)
-- **August 19, 2025** 🔥 Qwen-Image-Edit open-sourced, welcome a new member to the image editing model family!
+- **August 19, 2025** Qwen-Image-Edit open-sourced, welcome a new member to the image editing model family!
 - **August 18, 2025** We trained and open-sourced the Qwen-Image inpainting ControlNet model [DiffSynth-Studio/Qwen-Image-Blockwise-ControlNet-Inpaint](https://www.modelscope.cn/models/DiffSynth-Studio/Qwen-Image-Blockwise-ControlNet-Inpaint). The model structure adopts a lightweight design. Please refer to [our sample code](./examples/qwen_image/model_inference/Qwen-Image-Blockwise-ControlNet-Inpaint.py).
@@ -112,7 +125,7 @@ We believe that a well-developed open-source code framework can lower the thresh
 - **August 5, 2025** We open-sourced the distilled acceleration model [DiffSynth-Studio/Qwen-Image-Distill-Full](https://www.modelscope.cn/models/DiffSynth-Studio/Qwen-Image-Distill-Full) for Qwen-Image, achieving approximately 5x acceleration.
-- **August 4, 2025** 🔥 Qwen-Image open-sourced, welcome a new member to the image generation model family!
+- **August 4, 2025** Qwen-Image open-sourced, welcome a new member to the image generation model family!
 - **August 1, 2025** [FLUX.1-Krea-dev](https://www.modelscope.cn/models/black-forest-labs/FLUX.1-Krea-dev) open-sourced, a text-to-image model focused on aesthetic photography. We provided comprehensive support in a timely manner, including low VRAM layer-by-layer offload, LoRA training, and full training. For more details, please refer to [./examples/flux/](./examples/flux/).
@@ -479,6 +492,17 @@ Example code for FLUX.2 is available at: [/examples/flux2/](/examples/flux2/)
 |[black-forest-labs/FLUX.2-klein-9B](https://www.modelscope.cn/models/black-forest-labs/FLUX.2-klein-9B)|[code](/examples/flux2/model_inference/FLUX.2-klein-9B.py)|[code](/examples/flux2/model_inference_low_vram/FLUX.2-klein-9B.py)|[code](/examples/flux2/model_training/full/FLUX.2-klein-9B.sh)|[code](/examples/flux2/model_training/validate_full/FLUX.2-klein-9B.py)|[code](/examples/flux2/model_training/lora/FLUX.2-klein-9B.sh)|[code](/examples/flux2/model_training/validate_lora/FLUX.2-klein-9B.py)|
 |[black-forest-labs/FLUX.2-klein-base-4B](https://www.modelscope.cn/models/black-forest-labs/FLUX.2-klein-base-4B)|[code](/examples/flux2/model_inference/FLUX.2-klein-base-4B.py)|[code](/examples/flux2/model_inference_low_vram/FLUX.2-klein-base-4B.py)|[code](/examples/flux2/model_training/full/FLUX.2-klein-base-4B.sh)|[code](/examples/flux2/model_training/validate_full/FLUX.2-klein-base-4B.py)|[code](/examples/flux2/model_training/lora/FLUX.2-klein-base-4B.sh)|[code](/examples/flux2/model_training/validate_lora/FLUX.2-klein-base-4B.py)|
 |[black-forest-labs/FLUX.2-klein-base-9B](https://www.modelscope.cn/models/black-forest-labs/FLUX.2-klein-base-9B)|[code](/examples/flux2/model_inference/FLUX.2-klein-base-9B.py)|[code](/examples/flux2/model_inference_low_vram/FLUX.2-klein-base-9B.py)|[code](/examples/flux2/model_training/full/FLUX.2-klein-base-9B.sh)|[code](/examples/flux2/model_training/validate_full/FLUX.2-klein-base-9B.py)|[code](/examples/flux2/model_training/lora/FLUX.2-klein-base-9B.sh)|[code](/examples/flux2/model_training/validate_lora/FLUX.2-klein-base-9B.py)|
+|[DiffSynth-Studio/Template-KleinBase4B-Aesthetic](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Aesthetic)|[code](/examples/flux2/model_inference/Template-KleinBase4B-Aesthetic.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-Aesthetic.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-Aesthetic.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-Aesthetic.py)|-|-|
+|[DiffSynth-Studio/Template-KleinBase4B-Brightness](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Brightness)|[code](/examples/flux2/model_inference/Template-KleinBase4B-Brightness.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-Brightness.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-Brightness.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-Brightness.py)|-|-|
+|[DiffSynth-Studio/Template-KleinBase4B-Age](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Age)|[code](/examples/flux2/model_inference/Template-KleinBase4B-Age.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-Age.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-Age.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-Age.py)|-|-|
+|[DiffSynth-Studio/Template-KleinBase4B-ControlNet](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-ControlNet)|[code](/examples/flux2/model_inference/Template-KleinBase4B-ControlNet.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-ControlNet.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-ControlNet.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-ControlNet.py)|-|-|
+|[DiffSynth-Studio/Template-KleinBase4B-Edit](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Edit)|[code](/examples/flux2/model_inference/Template-KleinBase4B-Edit.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-Edit.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-Edit.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-Edit.py)|-|-|
+|[DiffSynth-Studio/Template-KleinBase4B-Inpaint](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Inpaint)|[code](/examples/flux2/model_inference/Template-KleinBase4B-Inpaint.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-Inpaint.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-Inpaint.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-Inpaint.py)|-|-|
+|[DiffSynth-Studio/Template-KleinBase4B-PandaMeme](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-PandaMeme)|[code](/examples/flux2/model_inference/Template-KleinBase4B-PandaMeme.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-PandaMeme.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-PandaMeme.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-PandaMeme.py)|-|-|
+|[DiffSynth-Studio/Template-KleinBase4B-Sharpness](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Sharpness)|[code](/examples/flux2/model_inference/Template-KleinBase4B-Sharpness.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-Sharpness.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-Sharpness.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-Sharpness.py)|-|-|
+|[DiffSynth-Studio/Template-KleinBase4B-SoftRGB](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-SoftRGB)|[code](/examples/flux2/model_inference/Template-KleinBase4B-SoftRGB.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-SoftRGB.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-SoftRGB.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-SoftRGB.py)|-|-|
+|[DiffSynth-Studio/Template-KleinBase4B-Upscaler](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Upscaler)|[code](/examples/flux2/model_inference/Template-KleinBase4B-Upscaler.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-Upscaler.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-Upscaler.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-Upscaler.py)|-|-|
+|[DiffSynth-Studio/Template-KleinBase4B-ContentRef](https://www.modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-ContentRef)|[code](/examples/flux2/model_inference/Template-KleinBase4B-ContentRef.py)|[code](/examples/flux2/model_inference_low_vram/Template-KleinBase4B-ContentRef.py)|[code](/examples/flux2/model_training/full/Template-KleinBase4B-ContentRef.sh)|[code](/examples/flux2/model_training/validate_full/Template-KleinBase4B-ContentRef.py)|-|-|
 </details>
@@ -864,6 +888,68 @@ Example code for JoyAI-Image is available at: [/examples/joyai_image/](/examples
 </details>
+#### HiDream-O1-Image: [/docs/en/Model_Details/HiDream-O1-Image.md](/docs/en/Model_Details/HiDream-O1-Image.md)
+<details>
+<summary>Quick Start</summary>
+Running the following code will quickly load the [HiDream-ai/HiDream-O1-Image](https://modelscope.cn/HiDream-ai/HiDream-O1-Image) model and perform inference. VRAM management is enabled, and the framework will automatically control the loading of model parameters based on available VRAM. The model can run with a minimum of 3GB VRAM.
+```python
+from diffsynth.pipelines.hidream_o1_image import HiDreamO1ImagePipeline
+from diffsynth.core.loader.config import ModelConfig
+import torch
+vram_config = {
+    "offload_dtype": torch.bfloat16,
+    "offload_device": "cpu",
+    "onload_dtype": torch.bfloat16,
+    "onload_device": "cpu",
+    "preparing_dtype": torch.bfloat16,
+    "preparing_device": "cuda",
+    "computation_dtype": torch.bfloat16,
+    "computation_device": "cuda",
+}
+pipe = HiDreamO1ImagePipeline.from_pretrained(
+    torch_dtype=torch.bfloat16,
+    device="cuda",
+    model_configs=[
+        ModelConfig(model_id="HiDream-ai/HiDream-O1-Image", origin_file_pattern="model-*.safetensors", **vram_config),
+    ],
+    processor_config=ModelConfig(model_id="HiDream-ai/HiDream-O1-Image", origin_file_pattern="./"),
+    vram_limit=torch.cuda.mem_get_info("cuda")[1] / (1024 ** 3) - 0.5,
+)
+image = pipe(
+    prompt="medium shot, eye-level, front view. A woman is seated in an ornate bedroom, illuminated by candlelight, with a calm and composed expression. The subject is a young woman with fair skin, light brown hair styled in an updo with loose tendrils framing her face, and blue eyes. She wears a cream-colored satin robe with delicate floral embroidery and lace trim along the neckline. Her ears are adorned with pearl drop earrings. She is seated on a bed with a dark, intricately carved wooden headboard. To her left, a wooden nightstand holds three lit white candles and a candelabra with multiple lit candles in the background. The bed is covered with patterned pillows and a dark, textured blanket. The walls are paneled with dark wood and feature a large, ornate tapestry with muted earth tones. The lighting creates soft highlights on her face and robe, with warm shadows cast across the room.",
+    negative_prompt=" ",
+    cfg_scale=4.0,
+    height=2048,
+    width=2048,
+    seed=42,
+    num_inference_steps=50,
+)
+image.save("image.jpg")
+```
+</details>
+<details>
+<summary>Examples</summary>
+Example code for HiDream-O1-Image is available at: [/examples/hidream_o1_image/](/examples/hidream_o1_image/)
+| Model ID | Inference | Low VRAM Inference | Full Training | Full Training Validation | LoRA Training | LoRA Training Validation |
+|-|-|-|-|-|-|-|
+|[HiDream-ai/HiDream-O1-Image](https://modelscope.cn/HiDream-ai/HiDream-O1-Image)|[code](/examples/hidream_o1_image/model_inference/HiDream-O1-Image.py)|[code](/examples/hidream_o1_image/model_inference_low_vram/HiDream-O1-Image.py)|[code](/examples/hidream_o1_image/model_training/full/HiDream-O1-Image.sh)|[code](/examples/hidream_o1_image/model_training/validate_full/HiDream-O1-Image.py)|[code](/examples/hidream_o1_image/model_training/lora/HiDream-O1-Image.sh)|[code](/examples/hidream_o1_image/model_training/validate_lora/HiDream-O1-Image.py)|
+|[HiDream-ai/HiDream-O1-Image-Dev](https://modelscope.cn/HiDream-ai/HiDream-O1-Image-Dev)|[code](/examples/hidream_o1_image/model_inference/HiDream-O1-Image-Dev.py)|[code](/examples/hidream_o1_image/model_inference_low_vram/HiDream-O1-Image-Dev.py)|[code](/examples/hidream_o1_image/model_training/full/HiDream-O1-Image-Dev.sh)|[code](/examples/hidream_o1_image/model_training/validate_full/HiDream-O1-Image-Dev.py)|[code](/examples/hidream_o1_image/model_training/lora/HiDream-O1-Image-Dev.sh)|[code](/examples/hidream_o1_image/model_training/validate_lora/HiDream-O1-Image-Dev.py)|
+</details>
 ### Video Synthesis
 https://github.com/user-attachments/assets/1d66ae74-3b02-40a9-acc3-ea95fc039314
@@ -1138,8 +1224,8 @@ Example code for Wan is available at: [/examples/wanvideo/](/examples/wanvideo/)
 |[PAI/Wan2.2-Fun-A14B-Control-Camera](https://modelscope.cn/models/PAI/Wan2.2-Fun-A14B-Control-Camera)|`control_camera_video`, `input_image`|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference/Wan2.2-Fun-A14B-Control-Camera.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference_low_vram/Wan2.2-Fun-A14B-Control-Camera.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/full/Wan2.2-Fun-A14B-Control-Camera.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/validate_full/Wan2.2-Fun-A14B-Control-Camera.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/lora/Wan2.2-Fun-A14B-Control-Camera.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/validate_lora/Wan2.2-Fun-A14B-Control-Camera.py)|
 |[openmoss/MOVA-360p](https://modelscope.cn/models/openmoss/MOVA-360p)|`input_image`|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_inference/MOVA-360p-I2AV.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_inference_low_vram/MOVA-360p-I2AV.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_training/full/MOVA-360P-I2AV.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_training/validate_full/MOVA-360p-I2AV.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_training/lora/MOVA-360P-I2AV.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_training/validate_lora/MOVA-360p-I2AV.py)|
 |[openmoss/MOVA-720p](https://modelscope.cn/models/openmoss/MOVA-720p)|`input_image`|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_inference/MOVA-720p-I2AV.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_inference_low_vram/MOVA-720p-I2AV.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_training/full/MOVA-720P-I2AV.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_training/validate_full/MOVA-720p-I2AV.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_training/lora/MOVA-720P-I2AV.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/mova/model_training/validate_lora/MOVA-720p-I2AV.py)|
-|[Wan-AI/WanToDance-14B (global model)](https://modelscope.cn/models/Wan-AI/WanToDance-14B)|`wantodance_music_path`, `wantodance_reference_image`, `wantodance_fps`, `wantodance_keyframes`, `wantodance_keyframes_mask`|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference/WanToDance-14B-global.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference_low_vram/WanToDance-14B-global.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/full/WanToDance-14B-global.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/validate_full/WanToDance-14B-global.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/lora/WanToDance-14B-global.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/validate_lora/WanToDance-14B-global.py)|
-|[Wan-AI/WanToDance-14B (local model)](https://modelscope.cn/models/Wan-AI/WanToDance-14B)|`wantodance_music_path`, `wantodance_reference_image`, `wantodance_fps`, `wantodance_keyframes`, `wantodance_keyframes_mask`|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference/WanToDance-14B-local.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference_low_vram/WanToDance-14B-local.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/full/WanToDance-14B-local.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/validate_full/WanToDance-14B-local.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/lora/WanToDance-14B-local.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/validate_lora/WanToDance-14B-local.py)|
+|[Wan-AI/Wan2.2-Dancer-14B (global model)](https://modelscope.cn/models/Wan-AI/Wan2.2-Dancer-14B)|`wantodance_music_path`, `wantodance_reference_image`, `wantodance_fps`, `wantodance_keyframes`, `wantodance_keyframes_mask`|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference/Wan2.2-Dancer-14B-global.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference_low_vram/Wan2.2-Dancer-14B-global.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/full/Wan2.2-Dancer-14B-global.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/validate_full/Wan2.2-Dancer-14B-global.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/lora/Wan2.2-Dancer-14B-global.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/validate_lora/Wan2.2-Dancer-14B-global.py)|
+|[Wan-AI/Wan2.2-Dancer-14B (local model)](https://modelscope.cn/models/Wan-AI/Wan2.2-Dancer-14B)|`wantodance_music_path`, `wantodance_reference_image`, `wantodance_fps`, `wantodance_keyframes`, `wantodance_keyframes_mask`|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference/Wan2.2-Dancer-14B-local.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference_low_vram/Wan2.2-Dancer-14B-local.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/full/Wan2.2-Dancer-14B-local.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/validate_full/Wan2.2-Dancer-14B-local.py)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/lora/Wan2.2-Dancer-14B-local.sh)|[code](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_training/validate_lora/Wan2.2-Dancer-14B-local.py)|
 </details>

{diffsynth-2.0.10 → diffsynth-2.0.12}/diffsynth/configs/model_configs.py RENAMED

@@ -309,7 +309,7 @@ wan_series = [
         "state_dict_converter": "diffsynth.utils.state_dict_converters.wans2v_audio_encoder.WanS2VAudioEncoderStateDictConverter",
     },
     {
-        # Example: ModelConfig(model_id="Wan-AI/WanToDance-14B", origin_file_pattern="global_model.safetensors")
+        # Example: ModelConfig(model_id="Wan-AI/Wan2.2-Dancer-14B", origin_file_pattern="global_model.safetensors")
         "model_hash": "eb18873fc0ba77b541eb7b62dbcd2059",
         "model_name": "wan_video_dit",
         "model_class": "diffsynth.models.wan_video_dit.WanModel",
@@ -833,20 +833,6 @@ ltx2_series = [
         "extra_kwargs": {"decoder_version": "ltx-2.3"},
         "state_dict_converter": "diffsynth.utils.state_dict_converters.ltx2_video_vae.LTX2VideoDecoderStateDictConverter",
     },
-    {
-        # Example: ModelConfig(model_id="DiffSynth-Studio/LTX-2.3-Repackage", origin_file_pattern="audio_vocoder.safetensors")
-        "model_hash": "7d7823dde8f1ea0b50fb07ac329dd4cb",
-        "model_name": "ltx2_audio_vae_decoder",
-        "model_class": "diffsynth.models.ltx2_audio_vae.LTX2AudioDecoder",
-        "state_dict_converter": "diffsynth.utils.state_dict_converters.ltx2_audio_vae.LTX2AudioDecoderStateDictConverter",
-    },
-    {
-        # Example: ModelConfig(model_id="DiffSynth-Studio/LTX-2.3-Repackage", origin_file_pattern="audio_vae_encoder.safetensors")
-        "model_hash": "29338f3b95e7e312a3460a482e4f4554",
-        "model_name": "ltx2_audio_vae_encoder",
-        "model_class": "diffsynth.models.ltx2_audio_vae.LTX2AudioEncoder",
-        "state_dict_converter": "diffsynth.utils.state_dict_converters.ltx2_audio_vae.LTX2AudioEncoderStateDictConverter",
-    },
     {
         # Example: ModelConfig(model_id="DiffSynth-Studio/LTX-2.3-Repackage", origin_file_pattern="audio_vocoder.safetensors")
         "model_hash": "cd436c99e69ec5c80f050f0944f02a15",
@@ -1040,7 +1026,16 @@ ace_step_series = [
     },
 ]
+hidream_o1_image_series = [
+    {
+        # Example: ModelConfig(model_id="HiDream-ai/HiDream-O1-Image", origin_file_pattern="model-*.safetensors")
+        "model_hash": "58a7c1073d79556bfc61e05e6061b771",
+        "model_name": "hidream_o1_image_dit",
+        "model_class": "diffsynth.models.hidream_o1_image_dit.HiDreamO1ImageModel",
+    },
+]
 MODEL_CONFIGS = (
     stable_diffusion_xl_series + stable_diffusion_series + qwen_image_series + wan_series + flux_series + flux2_series + ernie_image_series
-    + z_image_series + ltx2_series + anima_series + mova_series + joyai_image_series + ace_step_series
+    + z_image_series + ltx2_series + anima_series + mova_series + joyai_image_series + ace_step_series + hidream_o1_image_series
 )
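
`MODEL_CONFIGS` is a flat list of dictionaries keyed by `model_hash`, so registering HiDream-O1-Image amounts to appending one more entry. The sketch below shows how such a list could be searched by hash. The `find_configs` helper is hypothetical, the two entries carry only the fields visible in the hunks above, and how diffsynth computes or consumes `model_hash` is not shown in this diff.

```python
# Two entries copied from the hunks above, limited to the fields visible there.
MODEL_CONFIGS = [
    {
        "model_hash": "eb18873fc0ba77b541eb7b62dbcd2059",
        "model_name": "wan_video_dit",
        "model_class": "diffsynth.models.wan_video_dit.WanModel",
    },
    {
        "model_hash": "58a7c1073d79556bfc61e05e6061b771",
        "model_name": "hidream_o1_image_dit",
        "model_class": "diffsynth.models.hidream_o1_image_dit.HiDreamO1ImageModel",
    },
]


def find_configs(model_hash, configs=MODEL_CONFIGS):
    """Hypothetical helper: return every entry whose model_hash matches."""
    return [config for config in configs if config["model_hash"] == model_hash]


matches = find_configs("58a7c1073d79556bfc61e05e6061b771")
print([config["model_name"] for config in matches])

# An unknown hash simply yields no entries.
print(find_configs("0" * 32))
```

Returning a list rather than a single entry is a deliberate choice in this sketch: nothing in the hunk guarantees that hashes are unique across series.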

{diffsynth-2.0.10 → diffsynth-2.0.12}/diffsynth/configs/vram_management_module_maps.py RENAMED

@@ -327,7 +327,7 @@ VRAM_MANAGEMENT_MODULE_MAPS = {
     "diffsynth.models.ace_step_tokenizer.AceStepTokenizer": {
         "torch.nn.Linear": "diffsynth.core.vram.layers.AutoWrappedLinear",
         "torch.nn.Embedding": "diffsynth.core.vram.layers.AutoWrappedModule",
-        "vector_quantize_pytorch.ResidualFSQ": "diffsynth.core.vram.layers.AutoWrappedModule",
+        "diffsynth.models.ace_step_residual_fsq.ResidualFSQ": "diffsynth.core.vram.layers.AutoWrappedModule",
         "transformers.models.qwen3.modeling_qwen3.Qwen3RMSNorm": "diffsynth.core.vram.layers.AutoWrappedModule",
         "transformers.models.qwen3.modeling_qwen3.Qwen3MLP": "diffsynth.core.vram.layers.AutoWrappedModule",
         "transformers.models.qwen3.modeling_qwen3.Qwen3RotaryEmbedding": "diffsynth.core.vram.layers.AutoWrappedModule",
@@ -372,6 +372,14 @@ VRAM_MANAGEMENT_MODULE_MAPS = {
         "diffsynth.models.stable_diffusion_text_encoder.CLIPAttention": "diffsynth.core.vram.layers.AutoWrappedModule",
         "diffsynth.models.stable_diffusion_xl_text_encoder.CLIPTextModelWithProjection": "diffsynth.core.vram.layers.AutoWrappedModule",
     },
+    "diffsynth.models.hidream_o1_image_dit.HiDreamO1ImageModel": {
+        "torch.nn.Linear": "diffsynth.core.vram.layers.AutoWrappedLinear",
+        "torch.nn.Embedding": "diffsynth.core.vram.layers.AutoWrappedModule",
+        "torch.nn.Conv3d": "diffsynth.core.vram.layers.AutoWrappedModule",
+        "torch.nn.LayerNorm": "diffsynth.core.vram.layers.AutoWrappedModule",
+        "diffsynth.models.hidream_o1_image_dit.Qwen3VLTextRMSNorm": "diffsynth.core.vram.layers.AutoWrappedModule",
+        "diffsynth.models.hidream_o1_image_dit.Qwen3VLVisionModel": "diffsynth.core.vram.layers.AutoWrappedModule",
+    },
 }
 def QwenImageTextEncoder_Module_Map_Updater():
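
The module maps above pair dotted class paths (for example `torch.nn.Linear`) with dotted wrapper paths. A common way to turn such strings into classes is `importlib`. The sketch below shows that idea with standard-library classes standing in for the torch and diffsynth ones. The `resolve_dotted_path` and `resolve_map` helpers are assumptions for illustration, and this diff does not show how diffsynth itself resolves these strings.

```python
import collections
import importlib


def resolve_dotted_path(path):
    """Hypothetical helper: import 'package.module.Name' and return Name."""
    module_name, _, attr_name = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def resolve_map(module_map):
    """Resolve both sides of a {source_class_path: wrapper_class_path} map."""
    return {
        resolve_dotted_path(source): resolve_dotted_path(wrapper)
        for source, wrapper in module_map.items()
    }


# Stand-in map: stdlib classes replace the torch/diffsynth paths from the hunk.
STAND_IN_MAP = {
    "collections.OrderedDict": "collections.UserDict",
    "collections.Counter": "collections.UserDict",
}

resolved = resolve_map(STAND_IN_MAP)
for source_class, wrapper_class in resolved.items():
    print(source_class.__name__, "->", wrapper_class.__name__)
```

Keeping the map as strings means the configuration file can be imported without importing every model module up front.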

{diffsynth-2.0.10 → diffsynth-2.0.12}/diffsynth/core/__init__.py RENAMED

@@ -4,3 +4,4 @@ from .gradient import *
 from .loader import *
 from .vram import *
 from .device import *
+from .offload_training import *

{diffsynth-2.0.10 → diffsynth-2.0.12}/diffsynth/core/attention/attention.py RENAMED

@@ -63,10 +63,10 @@ def rearrange_out(out: torch.Tensor, out_pattern="b n s d", required_out_pattern
     return out
-def torch_sdpa(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, q_pattern="b n s d", k_pattern="b n s d", v_pattern="b n s d", out_pattern="b n s d", dims=None, attn_mask=None, scale=None):
+def torch_sdpa(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, q_pattern="b n s d", k_pattern="b n s d", v_pattern="b n s d", out_pattern="b n s d", dims=None, attn_mask=None, scale=None, is_causal=False):
     required_in_pattern, required_out_pattern= "b n s d", "b n s d"
     q, k, v = rearrange_qkv(q, k, v, q_pattern, k_pattern, v_pattern, required_in_pattern, dims)
-    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask, scale=scale)
+    out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask, scale=scale, is_causal=is_causal)
     out = rearrange_out(out, out_pattern, required_out_pattern, dims)
     return out
@@ -81,10 +81,10 @@ def flash_attention_3(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, q_patte
     return out
-def flash_attention_2(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, q_pattern="b n s d", k_pattern="b n s d", v_pattern="b n s d", out_pattern="b n s d", dims=None, scale=None):
+def flash_attention_2(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, q_pattern="b n s d", k_pattern="b n s d", v_pattern="b n s d", out_pattern="b n s d", dims=None, scale=None, is_causal=False):
     required_in_pattern, required_out_pattern= "b s n d", "b s n d"
     q, k, v = rearrange_qkv(q, k, v, q_pattern, k_pattern, v_pattern, required_in_pattern, dims)
-    out = flash_attn.flash_attn_func(q, k, v, softmax_scale=scale)
+    out = flash_attn.flash_attn_func(q, k, v, softmax_scale=scale, causal=is_causal)
     out = rearrange_out(out, out_pattern, required_out_pattern, dims)
     return out
@@ -105,17 +105,17 @@ def xformers_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, q_patt
     return out
-def attention_forward(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, q_pattern="b n s d", k_pattern="b n s d", v_pattern="b n s d", out_pattern="b n s d", dims=None, attn_mask=None, scale=None, compatibility_mode=False):
+def attention_forward(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, q_pattern="b n s d", k_pattern="b n s d", v_pattern="b n s d", out_pattern="b n s d", dims=None, attn_mask=None, scale=None, is_causal=False, compatibility_mode=False):
     if compatibility_mode or (attn_mask is not None):
-        return torch_sdpa(q, k, v, q_pattern, k_pattern, v_pattern, out_pattern, dims, attn_mask=attn_mask, scale=scale)
+        return torch_sdpa(q, k, v, q_pattern, k_pattern, v_pattern, out_pattern, dims, attn_mask=attn_mask, scale=scale, is_causal=is_causal)
     else:
         if ATTENTION_IMPLEMENTATION == "flash_attention_3":
             return flash_attention_3(q, k, v, q_pattern, k_pattern, v_pattern, out_pattern, dims, scale=scale)
         elif ATTENTION_IMPLEMENTATION == "flash_attention_2":
-            return flash_attention_2(q, k, v, q_pattern, k_pattern, v_pattern, out_pattern, dims, scale=scale)
+            return flash_attention_2(q, k, v, q_pattern, k_pattern, v_pattern, out_pattern, dims, scale=scale, is_causal=is_causal)
         elif ATTENTION_IMPLEMENTATION == "sage_attention":
             return sage_attention(q, k, v, q_pattern, k_pattern, v_pattern, out_pattern, dims, scale=scale)
         elif ATTENTION_IMPLEMENTATION == "xformers":
             return xformers_attention(q, k, v, q_pattern, k_pattern, v_pattern, out_pattern, dims, scale=scale)
         else:
-            return torch_sdpa(q, k, v, q_pattern, k_pattern, v_pattern, out_pattern, dims, scale=scale)
+            return torch_sdpa(q, k, v, q_pattern, k_pattern, v_pattern, out_pattern, dims, scale=scale, is_causal=is_causal)
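
The hunks above thread a new `is_causal` flag through `attention_forward`, forwarding it only in the `torch_sdpa` and `flash_attention_2` branches. What follows is a minimal pure-Python sketch of what such a flag usually means, not the package's implementation: each query position attends only to key positions at or before it, which is the same as applying a lower-triangular mask. The names `attention` and `causal_mask` are hypothetical. Rejecting an explicit mask combined with `is_causal` is an assumption, modeled on how the PyTorch documentation describes `scaled_dot_product_attention`; check it against your installed version.

```python
import math

def causal_mask(seq_len):
    # mask[i][j] is True when query i may attend to key j (j <= i).
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def attention(q, k, v, mask=None, is_causal=False, scale=None):
    """Single-head attention over lists of vectors (seq_len x dim).

    Hypothetical helper for illustration only; assumes len(q) == len(k)
    whenever is_causal is set.
    """
    if is_causal and mask is not None:
        # Assumption: treat the combination as an error.
        raise ValueError("pass either mask or is_causal, not both")
    dim = len(q[0])
    scale = 1.0 / math.sqrt(dim) if scale is None else scale
    if is_causal:
        mask = causal_mask(len(q))
    out = []
    for i, q_i in enumerate(q):
        scores = [scale * sum(a * b for a, b in zip(q_i, k_j)) for k_j in k]
        if mask is not None:
            scores = [s if mask[i][j] else -math.inf for j, s in enumerate(scores)]
        top = max(scores)  # subtract the row maximum for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v_j[c] for w, v_j in zip(weights, v)) / total
            for c in range(len(v[0]))
        ])
    return out

q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

causal_out = attention(q, k, v, is_causal=True)
masked_out = attention(q, k, v, mask=causal_mask(3))
full_out = attention(q, k, v)
```

Here `causal_out` and `masked_out` are identical, and the first output row equals `v[0]` because the first query can see only the first key.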

{diffsynth-2.0.10 → diffsynth-2.0.12}/diffsynth/core/data/operators.py RENAMED

@@ -2,8 +2,6 @@ import math, warnings
 import torch, torchvision, imageio, os
 import imageio.v3 as iio
 from PIL import Image
-import torchaudio
-from diffsynth.utils.data.audio import read_audio
 class DataProcessingPipeline:
@@ -249,9 +247,11 @@ class ToAbsolutePath(DataProcessingOperator):
 class LoadAudio(DataProcessingOperator):
     def __init__(self, sr=16000):
         self.sr = sr
-    def __call__(self, data: str):
         import librosa
-        input_audio, sample_rate = librosa.load(data, sr=self.sr)
+        self.audio_loader = librosa.load
+    def __call__(self, data: str):
+        input_audio, sample_rate = self.audio_loader(data, sr=self.sr)
         return input_audio
@@ -259,13 +259,15 @@ class LoadAudioWithTorchaudio(DataProcessingOperator, FrameSamplerByRateMixin):
     def __init__(self, num_frames=121, time_division_factor=8, time_division_remainder=1, frame_rate=24, fix_frame_rate=True):
         FrameSamplerByRateMixin.__init__(self, num_frames, time_division_factor, time_division_remainder, frame_rate, fix_frame_rate)
+        import torchaudio
+        self.audio_loader = torchaudio.load
     def __call__(self, data: str):
         try:
             reader = self.get_reader(data)
             num_frames = self.get_num_frames(reader)
             duration = num_frames / self.frame_rate
-            waveform, sample_rate = torchaudio.load(data)
+            waveform, sample_rate = self.audio_loader(data)
             target_samples = int(duration * sample_rate)
             current_samples = waveform.shape[-1]
             if current_samples > target_samples:
@@ -285,10 +287,12 @@ class LoadPureAudioWithTorchaudio(DataProcessingOperator):
         self.target_sample_rate = target_sample_rate
         self.target_duration = target_duration
         self.resample = True if target_sample_rate is not None else False
+        from diffsynth.utils.data.audio import read_audio
+        self.audio_loader = read_audio
     def __call__(self, data: str):
         try:
-            waveform, sample_rate = read_audio(data, resample=self.resample, resample_rate=self.target_sample_rate)
+            waveform, sample_rate = self.audio_loader(data, resample=self.resample, resample_rate=self.target_sample_rate)
             if self.target_duration is not None:
                 target_samples = int(self.target_duration * sample_rate)
                 current_samples = waveform.shape[-1]
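
The `operators.py` hunks above move the `torchaudio`, `librosa` and `read_audio` imports from module level into each operator's `__init__` and store the loader as `self.audio_loader`, so importing the module no longer requires the audio extras. Below is a hedged sketch of the same deferred-import pattern using only the standard library. `LazyBackendOperator` is a hypothetical name, `json.loads` stands in for an optional audio backend, and `importlib` replaces the plain `import` statements the diff actually uses.

```python
import importlib

class LazyBackendOperator:
    """Hypothetical operator that resolves its backend when constructed,
    not when the defining module is imported."""

    def __init__(self, module_name: str, func_name: str):
        try:
            module = importlib.import_module(module_name)
        except ImportError as exc:
            # Fail at construction time with a message naming the missing package.
            raise ImportError(
                f"{module_name!r} is required for this operator; "
                "install the matching optional dependency"
            ) from exc
        # Keep a reference to the loader so __call__ does no import work.
        self.loader = getattr(module, func_name)

    def __call__(self, data):
        return self.loader(data)

# Stand-in backend: json.loads plays the role of an audio loading function.
parse = LazyBackendOperator("json", "loads")
result = parse('{"sr": 16000}')
```

With this layout a missing backend only affects code that actually constructs the operator.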

diffsynth-2.0.12/diffsynth/core/offload_training/__init__.py ADDED

@@ -0,0 +1 @@
+from .manager import OffloadTrainingManager

diffsynth-2.0.12/diffsynth/core/offload_training/manager.py ADDED

@@ -0,0 +1,177 @@
+"""
+Layer offloading for training — hook-based CPU offload.
+Hook lifecycle per module:
+No checkpointing:
+  forward_pre(load→GPU) → forward() → forward_hook(offload)
+  backward_pre(load→GPU) → backward() → backward_hook(offload)
+With checkpointing (use_reentrant=False):
+  First forward:
+    forward_pre(load→GPU) → forward() → forward_hook(offload, mark in_recompute)
+  Recomputing forward (during backward):
+    forward_pre(load→GPU) → forward() → forward_hook(in_recompute=True → keep GPU)
+  Backward:
+    backward_pre(load→GPU) → backward() → backward_hook(offload)
+"""
+import torch
+import torch.nn as nn
+import warnings
+from .offloader import StaticParamOffloader, TrainableParamOffloader, AlwaysOnGPUParamOffloader, BufferOffloader
+from .memory_buffer import PinnedArenaPool, BaseBufferPool
+warnings.filterwarnings("ignore", message="Full backward hook is firing when gradients are computed with respect to module outputs")
+def has_parameters(module: nn.Module) -> bool:
+    return len(list(module.parameters())) > 0
+def count_parameters(module: nn.Module) -> int:
+    return sum(p.numel() for p in module.parameters())
+def is_leaf_module(module: nn.Module) -> bool:
+    return len(list(module.children())) == 0
+class UnitWiseParamManager:
+    def __init__(self, model: nn.Module, target_device: torch.device, enable_optimizer_cpu_offload: bool = False, params: list = None, buffers: list = None, memory_buffer: BaseBufferPool = None):
+        self.model = model
+        self.target_device = target_device
+        self.param_offloaders = {}
+        for param in (model.parameters() if params is None else params):
+            if not param.requires_grad:
+                self.param_offloaders[id(param)] = StaticParamOffloader(param, target_device, memory_buffer=memory_buffer)
+            else:
+                if enable_optimizer_cpu_offload:
+                    self.param_offloaders[id(param)] = TrainableParamOffloader(param, target_device)
+                else:
+                    self.param_offloaders[id(param)] = AlwaysOnGPUParamOffloader(param, target_device)
+        if buffers is not None and len(buffers) > 0:
+            for mod, buf_name, buf in buffers:
+                self.param_offloaders[id(buf)] = BufferOffloader(mod, buf_name, buf, target_device, memory_buffer=memory_buffer)
+    def move_gradients_to_cpu(self):
+        for offloader in self.param_offloaders.values():
+            offloader.offload_grad()
+    def onload_module(self, module: nn.Module):
+        for param in module.parameters(recurse=False):
+            if id(param) in self.param_offloaders:
+                self.param_offloaders[id(param)].onload()
+        for name, buf in module.named_buffers(recurse=False):
+            if id(buf) in self.param_offloaders:
+                self.param_offloaders[id(buf)].onload()
+    def offload_module(self, module: nn.Module):
+        for param in module.parameters(recurse=False):
+            if id(param) in self.param_offloaders:
+                self.param_offloaders[id(param)].offload()
+        for name, buf in module.named_buffers(recurse=False):
+            if id(buf) in self.param_offloaders:
+                self.param_offloaders[id(buf)].offload()
+class UnitWiseHookManager:
+    def __init__(self, model: nn.Module, target_device: torch.device, enable_optimizer_cpu_offload: bool = False,
+                 params: list = None, buffers: list = None, memory_buffer: BaseBufferPool = None):
+        self.param_manager = UnitWiseParamManager(model, target_device, enable_optimizer_cpu_offload, params=params, buffers=buffers, memory_buffer=memory_buffer)
+        self._in_recompute: set = set()
+        self._register_hooks(model)
+    def _register_hooks(self, module: nn.Module):
+        def forward_pre_hook(mod, args):
+            self.param_manager.onload_module(mod)
+        def forward_hook(mod, args, output):
+            if mod in self._in_recompute:
+                return
+            self._in_recompute.add(mod)
+            self.param_manager.offload_module(mod)
+        def backward_pre_hook(mod, grad_output):
+            self.param_manager.onload_module(mod)
+        def backward_hook(mod, grad_input, grad_output):
+            self.param_manager.offload_module(mod)
+        module.register_forward_pre_hook(forward_pre_hook)
+        module.register_forward_hook(forward_hook)
+        module.register_full_backward_pre_hook(backward_pre_hook)
+        if is_leaf_module(module):
+            module.register_full_backward_hook(backward_hook)
+        else:
+            # Parent module backward_hook fires before child backward completes.
+            # Register on leaf children instead.
+            sub_modules = [m for m in module.modules() if is_leaf_module(m) and has_parameters(m)]
+            for sub_mod in sub_modules:
+                sub_mod.register_full_backward_hook(backward_hook)
+    def after_backward(self):
+        self._in_recompute.clear()
+        self.param_manager.move_gradients_to_cpu()
+    @property
+    def managed_param_ids(self):
+        return set(self.param_manager.param_offloaders.keys())
+class OffloadTrainingManager:
+    def __init__(self, model: nn.Module, target_device: torch.device, enable_optimizer_cpu_offload: bool = False, cpu_offload_split_threshold: int = None):
+        self.model = model
+        self.target_device = target_device
+        self.enable_optimizer_cpu_offload = enable_optimizer_cpu_offload
+        cpu_offload_split_threshold = cpu_offload_split_threshold * 1024 * 1024 if cpu_offload_split_threshold is not None else None
+        self._register_units(model, target_device, enable_optimizer_cpu_offload, cpu_offload_split_threshold)
+    def _register_units(self, model: nn.Module, target_device: torch.device, enable_optimizer_cpu_offload: bool, cpu_offload_split_threshold: int = None):
+        self.memory_buffer = PinnedArenaPool.from_model(model)
+        units = self._find_units_recursive(model, cpu_offload_split_threshold)
+        self.units = [UnitWiseHookManager(u, target_device, enable_optimizer_cpu_offload, memory_buffer=self.memory_buffer) for u in units]
+        managed_param_ids = set().union(*[unit.managed_param_ids for unit in self.units])
+        orphan_params, orphan_buffers = self._find_orphan_params_and_buffers(model, managed_param_ids)
+        for orphan_module in set(orphan_params.keys()) | set(orphan_buffers.keys()):
+            params = orphan_params.get(orphan_module, [])
+            buffers = orphan_buffers.get(orphan_module, [])
+            self.units.append(UnitWiseHookManager(orphan_module, target_device, enable_optimizer_cpu_offload, params=params, buffers=buffers, memory_buffer=self.memory_buffer))
+    def _find_orphan_params_and_buffers(self, model: nn.Module, managed_param_ids: set):
+        orphan_params_by_module = {}
+        for _, mod in model.named_modules():
+            for param in mod.parameters(recurse=False):
+                if id(param) not in managed_param_ids:
+                    orphan_params_by_module.setdefault(mod, []).append(param)
+        # Collect orphan buffers grouped by owner module
+        orphan_buffers_by_module = {}
+        for _, mod in model.named_modules():
+            for name, buf in mod.named_buffers(recurse=False):
+                orphan_buffers_by_module.setdefault(mod, []).append((mod, name, buf))
+        return orphan_params_by_module, orphan_buffers_by_module
+    def _find_units_recursive(self, module: nn.Module, cpu_offload_split_threshold: int = None) -> list:
+        if cpu_offload_split_threshold is None:
+            return [m for m in module.modules() if is_leaf_module(m) and has_parameters(m)]
+        if self._should_force_recurse(module, cpu_offload_split_threshold):
+            units = []
+            for child in module.children():
+                units.extend(self._find_units_recursive(child, cpu_offload_split_threshold))
+            return units
+        return [module]
+    def _should_force_recurse(self, module: nn.Module, cpu_offload_split_threshold: int = None) -> bool:
+        if is_leaf_module(module):
+            return False
+        if (
+            count_parameters(module) > cpu_offload_split_threshold
+            or ('forward' not in type(module).__dict__)
+            or (hasattr(module, 'encode') and hasattr(module, 'decode'))
+        ):
+            return True
+        return False
+    # run after backward() and before optimizer.step()
+    def after_backward(self):
+        for unit in self.units:
+            unit.after_backward()
+        torch.cuda.synchronize()
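
The `_find_units_recursive` and `_should_force_recurse` methods above decide which modules become offload units. A module is kept whole when its parameter count is at or below the threshold, and is otherwise split into its children. The following is a simplified sketch of that splitting rule on a toy tree, not the package's code. `Node` and `find_units` are hypothetical stand-ins for `nn.Module` and the method in the diff, and the extra recursion triggers (a missing `forward` override, or the presence of both `encode` and `decode`) are omitted. In the diff, the threshold is multiplied by `1024 * 1024` before being compared with `numel()` totals; the sketch uses raw counts.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """Hypothetical stand-in for nn.Module: a name, its own parameter
    count, and child nodes."""
    name: str
    own_params: int = 0
    children: List["Node"] = field(default_factory=list)

    def total_params(self) -> int:
        return self.own_params + sum(c.total_params() for c in self.children)

    def walk(self):
        # Depth-first traversal, self first (similar in spirit to modules()).
        yield self
        for child in self.children:
            yield from child.walk()

def find_units(node: Node, threshold: Optional[int] = None) -> List[Node]:
    if threshold is None:
        # No threshold: every leaf that holds parameters is its own unit.
        return [n for n in node.walk() if not n.children and n.total_params() > 0]
    if node.children and node.total_params() > threshold:
        # Too large to move as one piece: split into children.
        units = []
        for child in node.children:
            units.extend(find_units(child, threshold))
        return units
    return [node]

block_a = Node("block_a", children=[Node("a.linear1", 40), Node("a.linear2", 40)])
block_b = Node("block_b", children=[Node("b.linear1", 300), Node("b.norm", 10)])
model = Node("model", children=[block_a, block_b])

coarse = [u.name for u in find_units(model, threshold=100)]
fine = [u.name for u in find_units(model)]
```

With a threshold of 100, `block_a` (80 parameters) stays whole while `block_b` (310 parameters) is split into its leaves. Parameters held directly by a module that gets split are not covered by any child unit; the diff handles those separately in `_find_orphan_params_and_buffers`.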
