diffsynth-engine 0.1.0__tar.gz → 0.2.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (218) hide show
  1. diffsynth_engine-0.2.0/.github/workflows/python-publish.yml +41 -0
  2. diffsynth_engine-0.2.0/.gitignore +9 -0
  3. diffsynth_engine-0.2.0/.pre-commit-config.yaml +11 -0
  4. diffsynth_engine-0.2.0/PKG-INFO +34 -0
  5. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/README.md +14 -14
  6. diffsynth_engine-0.2.0/assets/dingtalk.png +0 -0
  7. diffsynth_engine-0.2.0/assets/showcase.jpeg +0 -0
  8. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/__init__.py +3 -0
  9. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/flow_match/recifited_flow.py +16 -14
  10. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/ddim.py +0 -3
  11. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/sgm_uniform.py +0 -3
  12. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/flow_match/flow_match_euler.py +1 -1
  13. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/components/vae.json +254 -0
  14. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/flux/flux_dit.json +105 -0
  15. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/flux/flux_text_encoder.json +20 -0
  16. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/flux/flux_vae.json +250 -0
  17. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/sd/sd_text_encoder.json +220 -0
  18. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/sd/sd_unet.json +397 -0
  19. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/sd3/sd3_dit.json +908 -0
  20. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/sd3/sd3_text_encoder.json +756 -0
  21. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/sdxl/sdxl_text_encoder.json +455 -0
  22. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/sdxl/sdxl_unet.json +1056 -0
  23. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/wan/dit/1.3b-t2v.json +13 -0
  24. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/wan/dit/14b-i2v.json +13 -0
  25. diffsynth_engine-0.2.0/diffsynth_engine/conf/models/wan/dit/14b-t2v.json +13 -0
  26. diffsynth_engine-0.2.0/diffsynth_engine/models/__init__.py +7 -0
  27. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/base.py +19 -10
  28. diffsynth_engine-0.2.0/diffsynth_engine/models/basic/attention.py +217 -0
  29. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/basic/unet_helper.py +2 -2
  30. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/components/vae.py +0 -1
  31. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/flux/flux_dit.py +53 -81
  32. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/flux/flux_text_encoder.py +1 -3
  33. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/flux/flux_vae.py +1 -1
  34. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/wan/wan_dit.py +145 -79
  35. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/wan/wan_image_encoder.py +2 -3
  36. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/wan/wan_text_encoder.py +46 -13
  37. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/pipelines/__init__.py +2 -1
  38. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/pipelines/base.py +40 -3
  39. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/pipelines/flux_image.py +12 -48
  40. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/pipelines/sd_image.py +6 -40
  41. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/pipelines/sdxl_image.py +8 -43
  42. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/pipelines/wan_video.py +106 -63
  43. diffsynth_engine-0.2.0/diffsynth_engine/tokenizers/__init__.py +6 -0
  44. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/tokenizers/wan.py +17 -22
  45. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/download.py +1 -5
  46. diffsynth_engine-0.2.0/diffsynth_engine/utils/flag.py +46 -0
  47. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/loader.py +4 -1
  48. diffsynth_engine-0.2.0/diffsynth_engine/utils/parallel.py +390 -0
  49. diffsynth_engine-0.2.0/diffsynth_engine.egg-info/PKG-INFO +34 -0
  50. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine.egg-info/SOURCES.txt +95 -2
  51. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine.egg-info/requires.txt +2 -3
  52. diffsynth_engine-0.2.0/docs/tutorial.md +1 -0
  53. diffsynth_engine-0.2.0/docs/tutorial_zh.md +207 -0
  54. diffsynth_engine-0.2.0/examples/flux_lora.py +11 -0
  55. diffsynth_engine-0.2.0/examples/flux_text_to_image.py +8 -0
  56. diffsynth_engine-0.2.0/examples/sdxl_text_to_image.py +14 -0
  57. diffsynth_engine-0.2.0/examples/wan_lora.py +33 -0
  58. diffsynth_engine-0.2.0/examples/wan_text_to_video.py +28 -0
  59. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/pyproject.toml +11 -8
  60. diffsynth_engine-0.2.0/tests/__init__.py +0 -0
  61. diffsynth_engine-0.2.0/tests/common/__init__.py +0 -0
  62. diffsynth_engine-0.2.0/tests/common/test_case.py +123 -0
  63. diffsynth_engine-0.2.0/tests/common/utils.py +29 -0
  64. diffsynth_engine-0.2.0/tests/data/expect/algorithm/beta_20steps.safetensors +0 -0
  65. diffsynth_engine-0.2.0/tests/data/expect/algorithm/ddim_20steps.safetensors +0 -0
  66. diffsynth_engine-0.2.0/tests/data/expect/algorithm/euler_i10.safetensors +0 -0
  67. diffsynth_engine-0.2.0/tests/data/expect/algorithm/exponential_20steps.safetensors +0 -0
  68. diffsynth_engine-0.2.0/tests/data/expect/algorithm/flow_match_euler_i10.safetensors +0 -0
  69. diffsynth_engine-0.2.0/tests/data/expect/algorithm/karras_20steps.safetensors +0 -0
  70. diffsynth_engine-0.2.0/tests/data/expect/algorithm/output.safetensors +0 -0
  71. diffsynth_engine-0.2.0/tests/data/expect/algorithm/recifited_flow_20steps_flux.safetensors +0 -0
  72. diffsynth_engine-0.2.0/tests/data/expect/algorithm/scaled_linear_20steps.safetensors +0 -0
  73. diffsynth_engine-0.2.0/tests/data/expect/algorithm/sgm_uniform_20steps.safetensors +0 -0
  74. diffsynth_engine-0.2.0/tests/data/expect/flux/flux_dit.safetensors +0 -0
  75. diffsynth_engine-0.2.0/tests/data/expect/flux/flux_inpainting.png +0 -0
  76. diffsynth_engine-0.2.0/tests/data/expect/flux/flux_lora.png +0 -0
  77. diffsynth_engine-0.2.0/tests/data/expect/flux/flux_text_encoder_1.safetensors +0 -0
  78. diffsynth_engine-0.2.0/tests/data/expect/flux/flux_text_encoder_2.safetensors +0 -0
  79. diffsynth_engine-0.2.0/tests/data/expect/flux/flux_txt2img.png +0 -0
  80. diffsynth_engine-0.2.0/tests/data/expect/flux/flux_vae.safetensors +0 -0
  81. diffsynth_engine-0.2.0/tests/data/expect/sd/sd_inpainting.png +0 -0
  82. diffsynth_engine-0.2.0/tests/data/expect/sd/sd_lora.png +0 -0
  83. diffsynth_engine-0.2.0/tests/data/expect/sd/sd_text_encoder.safetensors +0 -0
  84. diffsynth_engine-0.2.0/tests/data/expect/sd/sd_txt2img.png +0 -0
  85. diffsynth_engine-0.2.0/tests/data/expect/sd/sd_unet.safetensors +0 -0
  86. diffsynth_engine-0.2.0/tests/data/expect/sd/sd_vae.safetensors +0 -0
  87. diffsynth_engine-0.2.0/tests/data/expect/sdxl/sdxl_inpainting.png +0 -0
  88. diffsynth_engine-0.2.0/tests/data/expect/sdxl/sdxl_lora.png +0 -0
  89. diffsynth_engine-0.2.0/tests/data/expect/sdxl/sdxl_text_encoder_1.safetensors +0 -0
  90. diffsynth_engine-0.2.0/tests/data/expect/sdxl/sdxl_text_encoder_2.safetensors +0 -0
  91. diffsynth_engine-0.2.0/tests/data/expect/sdxl/sdxl_txt2img.png +0 -0
  92. diffsynth_engine-0.2.0/tests/data/expect/sdxl/sdxl_unet.safetensors +0 -0
  93. diffsynth_engine-0.2.0/tests/data/expect/sdxl/sdxl_vae.safetensors +0 -0
  94. diffsynth_engine-0.2.0/tests/data/expect/wan/wan_vae.safetensors +0 -0
  95. diffsynth_engine-0.2.0/tests/data/input/astronaut_320_320.mp4 +0 -0
  96. diffsynth_engine-0.2.0/tests/data/input/mask_image.png +0 -0
  97. diffsynth_engine-0.2.0/tests/data/input/test_image.png +0 -0
  98. diffsynth_engine-0.2.0/tests/data/input/wukong_1024_1024.png +0 -0
  99. diffsynth_engine-0.2.0/tests/data/input/wukong_480_480.png +0 -0
  100. diffsynth_engine-0.2.0/tests/test_algorithm/__init__.py +0 -0
  101. diffsynth_engine-0.2.0/tests/test_algorithm/test_sampler.py +42 -0
  102. diffsynth_engine-0.2.0/tests/test_algorithm/test_scheduler.py +77 -0
  103. diffsynth_engine-0.2.0/tests/test_models/__init__.py +0 -0
  104. diffsynth_engine-0.2.0/tests/test_models/flux/__init__.py +0 -0
  105. diffsynth_engine-0.2.0/tests/test_models/flux/test_flux_dit.py +208 -0
  106. diffsynth_engine-0.2.0/tests/test_models/flux/test_flux_text_encoder.py +115 -0
  107. diffsynth_engine-0.2.0/tests/test_models/flux/test_flux_vae.py +345 -0
  108. diffsynth_engine-0.2.0/tests/test_models/sd/__init__.py +0 -0
  109. diffsynth_engine-0.2.0/tests/test_models/sd/test_sd_text_encoder.py +73 -0
  110. diffsynth_engine-0.2.0/tests/test_models/sd/test_sd_unet.py +22 -0
  111. diffsynth_engine-0.2.0/tests/test_models/sd/test_sd_vae.py +354 -0
  112. diffsynth_engine-0.2.0/tests/test_models/sdxl/__init__.py +0 -0
  113. diffsynth_engine-0.2.0/tests/test_models/sdxl/test_sdxl_text_encoder.py +163 -0
  114. diffsynth_engine-0.2.0/tests/test_models/sdxl/test_sdxl_unet.py +21 -0
  115. diffsynth_engine-0.2.0/tests/test_models/sdxl/test_sdxl_vae.py +352 -0
  116. diffsynth_engine-0.2.0/tests/test_models/wan/test_wan_vae.py +35 -0
  117. diffsynth_engine-0.2.0/tests/test_pipelines/__init__.py +0 -0
  118. diffsynth_engine-0.2.0/tests/test_pipelines/test_flux_image.py +81 -0
  119. diffsynth_engine-0.2.0/tests/test_pipelines/test_sd_image.py +55 -0
  120. diffsynth_engine-0.2.0/tests/test_pipelines/test_sdxl_image.py +59 -0
  121. diffsynth_engine-0.2.0/tests/test_pipelines/test_wan_video.py +24 -0
  122. diffsynth_engine-0.2.0/tests/test_pipelines/test_wan_video_gguf.py +24 -0
  123. diffsynth_engine-0.2.0/tests/test_pipelines/test_wan_video_tp.py +25 -0
  124. diffsynth_engine-0.2.0/tests/test_tokenizers/__init__.py +0 -0
  125. diffsynth_engine-0.2.0/tests/test_tokenizers/test_clip.py +135 -0
  126. diffsynth_engine-0.2.0/tests/test_tokenizers/test_t5.py +138 -0
  127. diffsynth_engine-0.1.0/PKG-INFO +0 -213
  128. diffsynth_engine-0.1.0/diffsynth_engine/models/basic/attention.py +0 -137
  129. diffsynth_engine-0.1.0/diffsynth_engine/models/wan/attention.py +0 -200
  130. diffsynth_engine-0.1.0/diffsynth_engine/tokenizers/__init__.py +0 -4
  131. diffsynth_engine-0.1.0/diffsynth_engine/utils/parallel.py +0 -191
  132. diffsynth_engine-0.1.0/diffsynth_engine.egg-info/PKG-INFO +0 -213
  133. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/LICENSE +0 -0
  134. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/__init__.py +0 -0
  135. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/__init__.py +0 -0
  136. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/base_scheduler.py +0 -0
  137. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/flow_match/__init__.py +0 -0
  138. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/flow_match/flow_beta.py +0 -0
  139. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/flow_match/flow_ddim.py +0 -0
  140. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/__init__.py +0 -0
  141. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/beta.py +0 -0
  142. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/exponential.py +0 -0
  143. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/karras.py +0 -0
  144. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/linear.py +0 -0
  145. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/__init__.py +0 -0
  146. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/flow_match/__init__.py +0 -0
  147. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/stable_diffusion/__init__.py +0 -0
  148. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/stable_diffusion/brownian_tree.py +0 -0
  149. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/stable_diffusion/ddpm.py +0 -0
  150. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/stable_diffusion/deis.py +0 -0
  151. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/stable_diffusion/dpmpp_2m.py +0 -0
  152. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/stable_diffusion/dpmpp_2m_sde.py +0 -0
  153. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/stable_diffusion/dpmpp_3m_sde.py +0 -0
  154. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/stable_diffusion/epsilon.py +0 -0
  155. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/stable_diffusion/euler.py +0 -0
  156. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/algorithm/sampler/stable_diffusion/euler_ancestral.py +0 -0
  157. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/merges.txt +0 -0
  158. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/special_tokens_map.json +0 -0
  159. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/tokenizer_config.json +0 -0
  160. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/vocab.json +0 -0
  161. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/special_tokens_map.json +0 -0
  162. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/spiece.model +0 -0
  163. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/tokenizer.json +0 -0
  164. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/tokenizer_config.json +0 -0
  165. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/merges.txt +0 -0
  166. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/special_tokens_map.json +0 -0
  167. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/tokenizer_config.json +0 -0
  168. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/vocab.json +0 -0
  169. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/merges.txt +0 -0
  170. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/special_tokens_map.json +0 -0
  171. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/tokenizer_config.json +0 -0
  172. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/vocab.json +0 -0
  173. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/special_tokens_map.json +0 -0
  174. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/spiece.model +0 -0
  175. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/tokenizer.json +0 -0
  176. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/tokenizer_config.json +0 -0
  177. {diffsynth_engine-0.1.0/diffsynth_engine/models → diffsynth_engine-0.2.0/diffsynth_engine/kernels}/__init__.py +0 -0
  178. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/basic/__init__.py +0 -0
  179. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/basic/lora.py +0 -0
  180. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/basic/relative_position_emb.py +0 -0
  181. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/basic/timestep.py +0 -0
  182. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/basic/transformer_helper.py +0 -0
  183. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/components/__init__.py +0 -0
  184. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/components/clip.py +0 -0
  185. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/components/t5.py +0 -0
  186. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/flux/__init__.py +0 -0
  187. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sd/__init__.py +0 -0
  188. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sd/sd_text_encoder.py +0 -0
  189. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sd/sd_unet.py +0 -0
  190. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sd/sd_vae.py +0 -0
  191. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sd3/__init__.py +0 -0
  192. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sd3/sd3_dit.py +0 -0
  193. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sd3/sd3_text_encoder.py +0 -0
  194. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sd3/sd3_vae.py +0 -0
  195. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sdxl/__init__.py +0 -0
  196. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sdxl/sdxl_text_encoder.py +0 -0
  197. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sdxl/sdxl_unet.py +0 -0
  198. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/sdxl/sdxl_vae.py +0 -0
  199. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/utils.py +0 -0
  200. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/wan/__init__.py +0 -0
  201. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/models/wan/wan_vae.py +0 -0
  202. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/tokenizers/base.py +0 -0
  203. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/tokenizers/clip.py +0 -0
  204. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/tokenizers/t5.py +0 -0
  205. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/__init__.py +0 -0
  206. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/constants.py +0 -0
  207. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/env.py +0 -0
  208. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/fp8_linear.py +0 -0
  209. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/gguf.py +0 -0
  210. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/lock.py +0 -0
  211. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/logging.py +0 -0
  212. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/offload.py +0 -0
  213. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/prompt.py +0 -0
  214. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine/utils/video.py +0 -0
  215. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine.egg-info/dependency_links.txt +0 -0
  216. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/diffsynth_engine.egg-info/top_level.txt +0 -0
  217. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/setup.cfg +0 -0
  218. {diffsynth_engine-0.1.0 → diffsynth_engine-0.2.0}/setup.py +0 -0
@@ -0,0 +1,41 @@
1
+ name: release
2
+
3
+ on:
4
+ push:
5
+ tags:
6
+ - 'v**'
7
+
8
+ workflow_dispatch:
9
+ inputs:
10
+ branch:
11
+ required: true
12
+ default: 'main'
13
+
14
+ permissions:
15
+ contents: read
16
+
17
+ concurrency:
18
+ group: ${{ github.workflow }}-${{ github.ref }}
19
+ cancel-in-progress: true
20
+
21
+ jobs:
22
+ build-and-publish:
23
+ runs-on: ubuntu-latest
24
+
25
+ steps:
26
+ - uses: actions/checkout@v4
27
+
28
+ - uses: actions/setup-python@v5
29
+ with:
30
+ python-version: "3.10"
31
+
32
+ - name: Install build
33
+ run: pip install build
34
+
35
+ - name: Build dist
36
+ run: python -m build
37
+
38
+ - name: Publish to PyPI
39
+ run: |
40
+ pip install twine
41
+ twine upload dist/* --skip-existing -p ${{ secrets.PYPI_API_TOKEN }}
@@ -0,0 +1,9 @@
1
+ *.pyc
2
+ .idea/
3
+ .vscode/
4
+ __pycache__/
5
+ tmp/
6
+ build/
7
+ dist/
8
+ *.egg-info/
9
+ .DS_Store/
@@ -0,0 +1,11 @@
1
+ repos:
2
+ - repo: https://github.com/astral-sh/ruff-pre-commit
3
+ # Ruff version.
4
+ rev: v0.11.5
5
+ hooks:
6
+ # Run the linter.
7
+ - id: ruff
8
+ types_or: [ python, pyi ]
9
+ # Run the formatter.
10
+ - id: ruff-format
11
+ types_or: [ python, pyi ]
@@ -0,0 +1,34 @@
1
+ Metadata-Version: 2.4
2
+ Name: diffsynth_engine
3
+ Version: 0.2.0
4
+ Author: MuseAI x ModelScope
5
+ Classifier: Programming Language :: Python :: 3
6
+ Classifier: Operating System :: OS Independent
7
+ Requires-Python: >=3.10
8
+ License-File: LICENSE
9
+ Requires-Dist: torch>=2.4.1
10
+ Requires-Dist: torchvision
11
+ Requires-Dist: xformers; sys_platform == "linux"
12
+ Requires-Dist: safetensors
13
+ Requires-Dist: gguf
14
+ Requires-Dist: einops
15
+ Requires-Dist: ftfy
16
+ Requires-Dist: regex
17
+ Requires-Dist: sentencepiece
18
+ Requires-Dist: tokenizers
19
+ Requires-Dist: modelscope
20
+ Requires-Dist: flufl.lock
21
+ Requires-Dist: scipy
22
+ Requires-Dist: torchsde
23
+ Requires-Dist: pillow
24
+ Requires-Dist: imageio[ffmpeg]
25
+ Requires-Dist: yunchang
26
+ Provides-Extra: dev
27
+ Requires-Dist: diffusers==0.31.0; extra == "dev"
28
+ Requires-Dist: transformers==4.45.2; extra == "dev"
29
+ Requires-Dist: build; extra == "dev"
30
+ Requires-Dist: ruff; extra == "dev"
31
+ Requires-Dist: scikit-image; extra == "dev"
32
+ Requires-Dist: pytest; extra == "dev"
33
+ Requires-Dist: pre-commit; extra == "dev"
34
+ Dynamic: license-file
@@ -6,20 +6,20 @@
6
6
  [![GitHub pull-requests](https://img.shields.io/github/issues-pr/modelscope/DiffSynth-Engine.svg)](https://GitHub.com/modelscope/DiffSynth-Engine/pull/)
7
7
  [![GitHub latest commit](https://badgen.net/github/last-commit/modelscope/DiffSynth-Engine)](https://GitHub.com/modelscope/DiffSynth-Engine/commit/)
8
8
 
9
- Diffsynth Engine is a high-performance diffusion inference engine designed for developers.
9
+ DiffSynth-Engine is a high-performance engine geared towards building efficient inference pipelines for diffusion models.
10
10
 
11
11
  **Key Features:**
12
12
 
13
- - **Clean and Readable Code:** Fully re-implements the Diffusion sampler and scheduler without relying on third-party libraries like k-diffusion, ldm, or sgm.
13
+ - **Thoughtfully-Designed Implementation:** We carefully re-implemented key components in Diffusion pipelines, such as sampler and scheduler, without introducing external dependencies on libraries like k-diffusion, ldm, or sgm.
14
14
 
15
- - **Extensive Model Support:** Compatible with multiple formats (e.g., CivitAI format) of base models and LoRA models , catering to diverse use cases.
15
+ - **Extensive Model Support:** Compatible with popular formats (e.g., CivitAI) of base models and LoRA models, catering to diverse use cases.
16
16
 
17
- - **Flexible Memory Management:** Supports various levels of model quantization (e.g., FP8, INT8)
18
- and offload strategies, enabling users to run large models (e.g., Flux.1 Dev) on limited GPU memory.
17
+ - **Versatile Resource Management:** Comprehensive support for various model quantization (e.g., FP8, INT8)
18
+ and offloading strategies, enabling loading of larger diffusion models (e.g., Flux.1 Dev) on limited hardware budget of GPU memory.
19
19
 
20
- - **High-Performance Inference:** Optimizes the inference pipeline to achieve fast generation across various hardware environments.
20
+ - **Optimized Performance:** Carefully-crafted inference pipeline to achieve fast generation across various hardware environments.
21
21
 
22
- - **Platform Compatibility:** Supports Windows, macOS (Apple Silicon), and Linux, ensuring a smooth experience across different operating systems.
22
+ - **Cross-Platform Support:** Runnable on Windows, macOS (Apple Silicon), and Linux, ensuring a smooth experience across different operating systems.
23
23
 
24
24
  ## Quick Start
25
25
  ### Requirements
@@ -29,13 +29,13 @@ and offload strategies, enabling users to run large models (e.g., Flux.1 Dev) on
29
29
 
30
30
  ### Installation
31
31
 
32
- Install for PyPI (stable version)
33
- ```python
32
+ Install released version (from PyPI):
33
+ ```shell
34
34
  pip3 install diffsynth-engine
35
35
  ```
36
36
 
37
- Install for source (preview version)
38
- ```python
37
+ Install from source:
38
+ ```shell
39
39
  git clone https://github.com/modelscope/diffsynth-engine.git && cd diffsynth-engine
40
40
  pip3 install -e .
41
41
  ```
@@ -71,10 +71,10 @@ For more details, please refer to our tutorials ([English](./docs/tutorial.md),
71
71
 
72
72
  ## Contact
73
73
 
74
- If you have any questions or feedback, please scan the QR code or send email to muse@alibaba-inc.com.
74
+ If you have any questions or feedback, please scan the QR code below, or send email to muse@alibaba-inc.com.
75
75
 
76
76
  <div style="display: flex; justify-content: space-between;">
77
- <img src="assets/dingtalk.png" alt="dingtalk" style="zoom: 60%;" />
77
+ <img src="assets/dingtalk.png" alt="dingtalk" width="400" />
78
78
  </div>
79
79
 
80
80
  ## License
@@ -82,7 +82,7 @@ This project is licensed under the Apache License 2.0. See the LICENSE file for
82
82
 
83
83
  ## Citation
84
84
 
85
- If you use this codebase, or otherwise found our work valuable, please cite:
85
+ If you use this codebase, or otherwise found our work helpful, please cite:
86
86
 
87
87
  ```bibtex
88
88
  @misc{diffsynth-engine2025,
@@ -10,6 +10,7 @@ from .pipelines import (
10
10
  )
11
11
  from .utils.download import fetch_model, fetch_modelscope_model, fetch_civitai_model
12
12
  from .utils.video import load_video, save_video
13
+
13
14
  __all__ = [
14
15
  "FluxImagePipeline",
15
16
  "SDXLImagePipeline",
@@ -22,4 +23,6 @@ __all__ = [
22
23
  "fetch_model",
23
24
  "fetch_modelscope_model",
24
25
  "fetch_civitai_model",
26
+ "load_video",
27
+ "save_video",
25
28
  ]
@@ -5,18 +5,19 @@ from diffsynth_engine.algorithm.noise_scheduler.base_scheduler import append_zer
5
5
 
6
6
 
7
7
  class RecifitedFlowScheduler(BaseScheduler):
8
- def __init__(self,
9
- shift=1.0,
10
- sigma_min=0.001,
8
+ def __init__(
9
+ self,
10
+ shift=1.0,
11
+ sigma_min=0.001,
11
12
  sigma_max=1.0,
12
- num_train_timesteps=1000,
13
+ num_train_timesteps=1000,
13
14
  use_dynamic_shifting=False,
14
15
  ):
15
16
  self.shift = shift
16
17
  self.sigma_min = sigma_min
17
18
  self.sigma_max = sigma_max
18
- self.num_train_timesteps = num_train_timesteps
19
- self.use_dynamic_shifting = use_dynamic_shifting
19
+ self.num_train_timesteps = num_train_timesteps
20
+ self.use_dynamic_shifting = use_dynamic_shifting
20
21
 
21
22
  def _sigma_to_t(self, sigma):
22
23
  return sigma * self.num_train_timesteps
@@ -30,19 +31,20 @@ class RecifitedFlowScheduler(BaseScheduler):
30
31
  def _shift_sigma(self, sigma: torch.Tensor, shift: float):
31
32
  return shift * sigma / (1 + (shift - 1) * sigma)
32
33
 
33
- def schedule(self,
34
- num_inference_steps: int,
35
- mu: float | None = None,
36
- sigma_min: float | None = None,
37
- sigma_max: float | None = None
34
+ def schedule(
35
+ self,
36
+ num_inference_steps: int,
37
+ mu: float | None = None,
38
+ sigma_min: float | None = None,
39
+ sigma_max: float | None = None,
38
40
  ):
39
41
  sigma_min = self.sigma_min if sigma_min is None else sigma_min
40
- sigma_max = self.sigma_max if sigma_max is None else sigma_max
42
+ sigma_max = self.sigma_max if sigma_max is None else sigma_max
41
43
  sigmas = torch.linspace(sigma_max, sigma_min, num_inference_steps)
42
44
  if self.use_dynamic_shifting:
43
- sigmas = self._time_shift(mu, 1.0, sigmas) # FLUX
45
+ sigmas = self._time_shift(mu, 1.0, sigmas) # FLUX
44
46
  else:
45
47
  sigmas = self._shift_sigma(sigmas, self.shift)
46
48
  timesteps = sigmas * self.num_train_timesteps
47
49
  sigmas = append_zero(sigmas)
48
- return sigmas, timesteps
50
+ return sigmas, timesteps
@@ -1,7 +1,4 @@
1
1
  import torch
2
- from .linear import ScaledLinearScheduler
3
- from ..base_scheduler import append_zero
4
- import numpy as np
5
2
 
6
3
  from diffsynth_engine.algorithm.noise_scheduler.stable_diffusion.linear import ScaledLinearScheduler
7
4
  from diffsynth_engine.algorithm.noise_scheduler.base_scheduler import append_zero
@@ -1,7 +1,4 @@
1
1
  import torch
2
- from .linear import ScaledLinearScheduler
3
- from ..base_scheduler import append_zero
4
- import numpy as np
5
2
 
6
3
  from diffsynth_engine.algorithm.noise_scheduler.stable_diffusion.linear import ScaledLinearScheduler
7
4
  from diffsynth_engine.algorithm.noise_scheduler.base_scheduler import append_zero
@@ -2,7 +2,7 @@ import torch
2
2
 
3
3
 
4
4
  class FlowMatchEulerSampler:
5
- def initialize(self, init_latents, timesteps, sigmas, mask=None):
5
+ def initialize(self, init_latents, timesteps, sigmas, mask=None):
6
6
  self.init_latents = init_latents
7
7
  self.timesteps = timesteps
8
8
  self.sigmas = sigmas
@@ -0,0 +1,254 @@
1
+ {
2
+ "civitai": {
3
+ "rename_dict": {
4
+ "first_stage_model.decoder.conv_in.bias": "decoder.conv_in.bias",
5
+ "first_stage_model.decoder.conv_in.weight": "decoder.conv_in.weight",
6
+ "first_stage_model.decoder.conv_out.bias": "decoder.conv_out.bias",
7
+ "first_stage_model.decoder.conv_out.weight": "decoder.conv_out.weight",
8
+ "first_stage_model.decoder.mid.attn_1.k.bias": "decoder.blocks.1.transformer_blocks.0.to_k.bias",
9
+ "first_stage_model.decoder.mid.attn_1.k.weight": "decoder.blocks.1.transformer_blocks.0.to_k.weight",
10
+ "first_stage_model.decoder.mid.attn_1.norm.bias": "decoder.blocks.1.norm.bias",
11
+ "first_stage_model.decoder.mid.attn_1.norm.weight": "decoder.blocks.1.norm.weight",
12
+ "first_stage_model.decoder.mid.attn_1.proj_out.bias": "decoder.blocks.1.transformer_blocks.0.to_out.bias",
13
+ "first_stage_model.decoder.mid.attn_1.proj_out.weight": "decoder.blocks.1.transformer_blocks.0.to_out.weight",
14
+ "first_stage_model.decoder.mid.attn_1.q.bias": "decoder.blocks.1.transformer_blocks.0.to_q.bias",
15
+ "first_stage_model.decoder.mid.attn_1.q.weight": "decoder.blocks.1.transformer_blocks.0.to_q.weight",
16
+ "first_stage_model.decoder.mid.attn_1.v.bias": "decoder.blocks.1.transformer_blocks.0.to_v.bias",
17
+ "first_stage_model.decoder.mid.attn_1.v.weight": "decoder.blocks.1.transformer_blocks.0.to_v.weight",
18
+ "first_stage_model.decoder.mid.block_1.conv1.bias": "decoder.blocks.0.conv1.bias",
19
+ "first_stage_model.decoder.mid.block_1.conv1.weight": "decoder.blocks.0.conv1.weight",
20
+ "first_stage_model.decoder.mid.block_1.conv2.bias": "decoder.blocks.0.conv2.bias",
21
+ "first_stage_model.decoder.mid.block_1.conv2.weight": "decoder.blocks.0.conv2.weight",
22
+ "first_stage_model.decoder.mid.block_1.norm1.bias": "decoder.blocks.0.norm1.bias",
23
+ "first_stage_model.decoder.mid.block_1.norm1.weight": "decoder.blocks.0.norm1.weight",
24
+ "first_stage_model.decoder.mid.block_1.norm2.bias": "decoder.blocks.0.norm2.bias",
25
+ "first_stage_model.decoder.mid.block_1.norm2.weight": "decoder.blocks.0.norm2.weight",
26
+ "first_stage_model.decoder.mid.block_2.conv1.bias": "decoder.blocks.2.conv1.bias",
27
+ "first_stage_model.decoder.mid.block_2.conv1.weight": "decoder.blocks.2.conv1.weight",
28
+ "first_stage_model.decoder.mid.block_2.conv2.bias": "decoder.blocks.2.conv2.bias",
29
+ "first_stage_model.decoder.mid.block_2.conv2.weight": "decoder.blocks.2.conv2.weight",
30
+ "first_stage_model.decoder.mid.block_2.norm1.bias": "decoder.blocks.2.norm1.bias",
31
+ "first_stage_model.decoder.mid.block_2.norm1.weight": "decoder.blocks.2.norm1.weight",
32
+ "first_stage_model.decoder.mid.block_2.norm2.bias": "decoder.blocks.2.norm2.bias",
33
+ "first_stage_model.decoder.mid.block_2.norm2.weight": "decoder.blocks.2.norm2.weight",
34
+ "first_stage_model.decoder.norm_out.bias": "decoder.conv_norm_out.bias",
35
+ "first_stage_model.decoder.norm_out.weight": "decoder.conv_norm_out.weight",
36
+ "first_stage_model.decoder.up.0.block.0.conv1.bias": "decoder.blocks.15.conv1.bias",
37
+ "first_stage_model.decoder.up.0.block.0.conv1.weight": "decoder.blocks.15.conv1.weight",
38
+ "first_stage_model.decoder.up.0.block.0.conv2.bias": "decoder.blocks.15.conv2.bias",
39
+ "first_stage_model.decoder.up.0.block.0.conv2.weight": "decoder.blocks.15.conv2.weight",
40
+ "first_stage_model.decoder.up.0.block.0.nin_shortcut.bias": "decoder.blocks.15.conv_shortcut.bias",
41
+ "first_stage_model.decoder.up.0.block.0.nin_shortcut.weight": "decoder.blocks.15.conv_shortcut.weight",
42
+ "first_stage_model.decoder.up.0.block.0.norm1.bias": "decoder.blocks.15.norm1.bias",
43
+ "first_stage_model.decoder.up.0.block.0.norm1.weight": "decoder.blocks.15.norm1.weight",
44
+ "first_stage_model.decoder.up.0.block.0.norm2.bias": "decoder.blocks.15.norm2.bias",
45
+ "first_stage_model.decoder.up.0.block.0.norm2.weight": "decoder.blocks.15.norm2.weight",
46
+ "first_stage_model.decoder.up.0.block.1.conv1.bias": "decoder.blocks.16.conv1.bias",
47
+ "first_stage_model.decoder.up.0.block.1.conv1.weight": "decoder.blocks.16.conv1.weight",
48
+ "first_stage_model.decoder.up.0.block.1.conv2.bias": "decoder.blocks.16.conv2.bias",
49
+ "first_stage_model.decoder.up.0.block.1.conv2.weight": "decoder.blocks.16.conv2.weight",
50
+ "first_stage_model.decoder.up.0.block.1.norm1.bias": "decoder.blocks.16.norm1.bias",
51
+ "first_stage_model.decoder.up.0.block.1.norm1.weight": "decoder.blocks.16.norm1.weight",
52
+ "first_stage_model.decoder.up.0.block.1.norm2.bias": "decoder.blocks.16.norm2.bias",
53
+ "first_stage_model.decoder.up.0.block.1.norm2.weight": "decoder.blocks.16.norm2.weight",
54
+ "first_stage_model.decoder.up.0.block.2.conv1.bias": "decoder.blocks.17.conv1.bias",
55
+ "first_stage_model.decoder.up.0.block.2.conv1.weight": "decoder.blocks.17.conv1.weight",
56
+ "first_stage_model.decoder.up.0.block.2.conv2.bias": "decoder.blocks.17.conv2.bias",
57
+ "first_stage_model.decoder.up.0.block.2.conv2.weight": "decoder.blocks.17.conv2.weight",
58
+ "first_stage_model.decoder.up.0.block.2.norm1.bias": "decoder.blocks.17.norm1.bias",
59
+ "first_stage_model.decoder.up.0.block.2.norm1.weight": "decoder.blocks.17.norm1.weight",
60
+ "first_stage_model.decoder.up.0.block.2.norm2.bias": "decoder.blocks.17.norm2.bias",
61
+ "first_stage_model.decoder.up.0.block.2.norm2.weight": "decoder.blocks.17.norm2.weight",
62
+ "first_stage_model.decoder.up.1.block.0.conv1.bias": "decoder.blocks.11.conv1.bias",
63
+ "first_stage_model.decoder.up.1.block.0.conv1.weight": "decoder.blocks.11.conv1.weight",
64
+ "first_stage_model.decoder.up.1.block.0.conv2.bias": "decoder.blocks.11.conv2.bias",
65
+ "first_stage_model.decoder.up.1.block.0.conv2.weight": "decoder.blocks.11.conv2.weight",
66
+ "first_stage_model.decoder.up.1.block.0.nin_shortcut.bias": "decoder.blocks.11.conv_shortcut.bias",
67
+ "first_stage_model.decoder.up.1.block.0.nin_shortcut.weight": "decoder.blocks.11.conv_shortcut.weight",
68
+ "first_stage_model.decoder.up.1.block.0.norm1.bias": "decoder.blocks.11.norm1.bias",
69
+ "first_stage_model.decoder.up.1.block.0.norm1.weight": "decoder.blocks.11.norm1.weight",
70
+ "first_stage_model.decoder.up.1.block.0.norm2.bias": "decoder.blocks.11.norm2.bias",
71
+ "first_stage_model.decoder.up.1.block.0.norm2.weight": "decoder.blocks.11.norm2.weight",
72
+ "first_stage_model.decoder.up.1.block.1.conv1.bias": "decoder.blocks.12.conv1.bias",
73
+ "first_stage_model.decoder.up.1.block.1.conv1.weight": "decoder.blocks.12.conv1.weight",
74
+ "first_stage_model.decoder.up.1.block.1.conv2.bias": "decoder.blocks.12.conv2.bias",
75
+ "first_stage_model.decoder.up.1.block.1.conv2.weight": "decoder.blocks.12.conv2.weight",
76
+ "first_stage_model.decoder.up.1.block.1.norm1.bias": "decoder.blocks.12.norm1.bias",
77
+ "first_stage_model.decoder.up.1.block.1.norm1.weight": "decoder.blocks.12.norm1.weight",
78
+ "first_stage_model.decoder.up.1.block.1.norm2.bias": "decoder.blocks.12.norm2.bias",
79
+ "first_stage_model.decoder.up.1.block.1.norm2.weight": "decoder.blocks.12.norm2.weight",
80
+ "first_stage_model.decoder.up.1.block.2.conv1.bias": "decoder.blocks.13.conv1.bias",
81
+ "first_stage_model.decoder.up.1.block.2.conv1.weight": "decoder.blocks.13.conv1.weight",
82
+ "first_stage_model.decoder.up.1.block.2.conv2.bias": "decoder.blocks.13.conv2.bias",
83
+ "first_stage_model.decoder.up.1.block.2.conv2.weight": "decoder.blocks.13.conv2.weight",
84
+ "first_stage_model.decoder.up.1.block.2.norm1.bias": "decoder.blocks.13.norm1.bias",
85
+ "first_stage_model.decoder.up.1.block.2.norm1.weight": "decoder.blocks.13.norm1.weight",
86
+ "first_stage_model.decoder.up.1.block.2.norm2.bias": "decoder.blocks.13.norm2.bias",
87
+ "first_stage_model.decoder.up.1.block.2.norm2.weight": "decoder.blocks.13.norm2.weight",
88
+ "first_stage_model.decoder.up.1.upsample.conv.bias": "decoder.blocks.14.conv.bias",
89
+ "first_stage_model.decoder.up.1.upsample.conv.weight": "decoder.blocks.14.conv.weight",
90
+ "first_stage_model.decoder.up.2.block.0.conv1.bias": "decoder.blocks.7.conv1.bias",
91
+ "first_stage_model.decoder.up.2.block.0.conv1.weight": "decoder.blocks.7.conv1.weight",
92
+ "first_stage_model.decoder.up.2.block.0.conv2.bias": "decoder.blocks.7.conv2.bias",
93
+ "first_stage_model.decoder.up.2.block.0.conv2.weight": "decoder.blocks.7.conv2.weight",
94
+ "first_stage_model.decoder.up.2.block.0.norm1.bias": "decoder.blocks.7.norm1.bias",
95
+ "first_stage_model.decoder.up.2.block.0.norm1.weight": "decoder.blocks.7.norm1.weight",
96
+ "first_stage_model.decoder.up.2.block.0.norm2.bias": "decoder.blocks.7.norm2.bias",
97
+ "first_stage_model.decoder.up.2.block.0.norm2.weight": "decoder.blocks.7.norm2.weight",
98
+ "first_stage_model.decoder.up.2.block.1.conv1.bias": "decoder.blocks.8.conv1.bias",
99
+ "first_stage_model.decoder.up.2.block.1.conv1.weight": "decoder.blocks.8.conv1.weight",
100
+ "first_stage_model.decoder.up.2.block.1.conv2.bias": "decoder.blocks.8.conv2.bias",
101
+ "first_stage_model.decoder.up.2.block.1.conv2.weight": "decoder.blocks.8.conv2.weight",
102
+ "first_stage_model.decoder.up.2.block.1.norm1.bias": "decoder.blocks.8.norm1.bias",
103
+ "first_stage_model.decoder.up.2.block.1.norm1.weight": "decoder.blocks.8.norm1.weight",
104
+ "first_stage_model.decoder.up.2.block.1.norm2.bias": "decoder.blocks.8.norm2.bias",
105
+ "first_stage_model.decoder.up.2.block.1.norm2.weight": "decoder.blocks.8.norm2.weight",
106
+ "first_stage_model.decoder.up.2.block.2.conv1.bias": "decoder.blocks.9.conv1.bias",
107
+ "first_stage_model.decoder.up.2.block.2.conv1.weight": "decoder.blocks.9.conv1.weight",
108
+ "first_stage_model.decoder.up.2.block.2.conv2.bias": "decoder.blocks.9.conv2.bias",
109
+ "first_stage_model.decoder.up.2.block.2.conv2.weight": "decoder.blocks.9.conv2.weight",
110
+ "first_stage_model.decoder.up.2.block.2.norm1.bias": "decoder.blocks.9.norm1.bias",
111
+ "first_stage_model.decoder.up.2.block.2.norm1.weight": "decoder.blocks.9.norm1.weight",
112
+ "first_stage_model.decoder.up.2.block.2.norm2.bias": "decoder.blocks.9.norm2.bias",
113
+ "first_stage_model.decoder.up.2.block.2.norm2.weight": "decoder.blocks.9.norm2.weight",
114
+ "first_stage_model.decoder.up.2.upsample.conv.bias": "decoder.blocks.10.conv.bias",
115
+ "first_stage_model.decoder.up.2.upsample.conv.weight": "decoder.blocks.10.conv.weight",
116
+ "first_stage_model.decoder.up.3.block.0.conv1.bias": "decoder.blocks.3.conv1.bias",
117
+ "first_stage_model.decoder.up.3.block.0.conv1.weight": "decoder.blocks.3.conv1.weight",
118
+ "first_stage_model.decoder.up.3.block.0.conv2.bias": "decoder.blocks.3.conv2.bias",
119
+ "first_stage_model.decoder.up.3.block.0.conv2.weight": "decoder.blocks.3.conv2.weight",
120
+ "first_stage_model.decoder.up.3.block.0.norm1.bias": "decoder.blocks.3.norm1.bias",
121
+ "first_stage_model.decoder.up.3.block.0.norm1.weight": "decoder.blocks.3.norm1.weight",
122
+ "first_stage_model.decoder.up.3.block.0.norm2.bias": "decoder.blocks.3.norm2.bias",
123
+ "first_stage_model.decoder.up.3.block.0.norm2.weight": "decoder.blocks.3.norm2.weight",
124
+ "first_stage_model.decoder.up.3.block.1.conv1.bias": "decoder.blocks.4.conv1.bias",
125
+ "first_stage_model.decoder.up.3.block.1.conv1.weight": "decoder.blocks.4.conv1.weight",
126
+ "first_stage_model.decoder.up.3.block.1.conv2.bias": "decoder.blocks.4.conv2.bias",
127
+ "first_stage_model.decoder.up.3.block.1.conv2.weight": "decoder.blocks.4.conv2.weight",
128
+ "first_stage_model.decoder.up.3.block.1.norm1.bias": "decoder.blocks.4.norm1.bias",
129
+ "first_stage_model.decoder.up.3.block.1.norm1.weight": "decoder.blocks.4.norm1.weight",
130
+ "first_stage_model.decoder.up.3.block.1.norm2.bias": "decoder.blocks.4.norm2.bias",
131
+ "first_stage_model.decoder.up.3.block.1.norm2.weight": "decoder.blocks.4.norm2.weight",
132
+ "first_stage_model.decoder.up.3.block.2.conv1.bias": "decoder.blocks.5.conv1.bias",
133
+ "first_stage_model.decoder.up.3.block.2.conv1.weight": "decoder.blocks.5.conv1.weight",
134
+ "first_stage_model.decoder.up.3.block.2.conv2.bias": "decoder.blocks.5.conv2.bias",
135
+ "first_stage_model.decoder.up.3.block.2.conv2.weight": "decoder.blocks.5.conv2.weight",
136
+ "first_stage_model.decoder.up.3.block.2.norm1.bias": "decoder.blocks.5.norm1.bias",
137
+ "first_stage_model.decoder.up.3.block.2.norm1.weight": "decoder.blocks.5.norm1.weight",
138
+ "first_stage_model.decoder.up.3.block.2.norm2.bias": "decoder.blocks.5.norm2.bias",
139
+ "first_stage_model.decoder.up.3.block.2.norm2.weight": "decoder.blocks.5.norm2.weight",
140
+ "first_stage_model.decoder.up.3.upsample.conv.bias": "decoder.blocks.6.conv.bias",
141
+ "first_stage_model.decoder.up.3.upsample.conv.weight": "decoder.blocks.6.conv.weight",
142
+ "first_stage_model.post_quant_conv.bias": "decoder.post_quant_conv.bias",
143
+ "first_stage_model.post_quant_conv.weight": "decoder.post_quant_conv.weight",
144
+ "first_stage_model.encoder.conv_in.bias": "encoder.conv_in.bias",
145
+ "first_stage_model.encoder.conv_in.weight": "encoder.conv_in.weight",
146
+ "first_stage_model.encoder.conv_out.bias": "encoder.conv_out.bias",
147
+ "first_stage_model.encoder.conv_out.weight": "encoder.conv_out.weight",
148
+ "first_stage_model.encoder.down.0.block.0.conv1.bias": "encoder.blocks.0.conv1.bias",
149
+ "first_stage_model.encoder.down.0.block.0.conv1.weight": "encoder.blocks.0.conv1.weight",
150
+ "first_stage_model.encoder.down.0.block.0.conv2.bias": "encoder.blocks.0.conv2.bias",
151
+ "first_stage_model.encoder.down.0.block.0.conv2.weight": "encoder.blocks.0.conv2.weight",
152
+ "first_stage_model.encoder.down.0.block.0.norm1.bias": "encoder.blocks.0.norm1.bias",
153
+ "first_stage_model.encoder.down.0.block.0.norm1.weight": "encoder.blocks.0.norm1.weight",
154
+ "first_stage_model.encoder.down.0.block.0.norm2.bias": "encoder.blocks.0.norm2.bias",
155
+ "first_stage_model.encoder.down.0.block.0.norm2.weight": "encoder.blocks.0.norm2.weight",
156
+ "first_stage_model.encoder.down.0.block.1.conv1.bias": "encoder.blocks.1.conv1.bias",
157
+ "first_stage_model.encoder.down.0.block.1.conv1.weight": "encoder.blocks.1.conv1.weight",
158
+ "first_stage_model.encoder.down.0.block.1.conv2.bias": "encoder.blocks.1.conv2.bias",
159
+ "first_stage_model.encoder.down.0.block.1.conv2.weight": "encoder.blocks.1.conv2.weight",
160
+ "first_stage_model.encoder.down.0.block.1.norm1.bias": "encoder.blocks.1.norm1.bias",
161
+ "first_stage_model.encoder.down.0.block.1.norm1.weight": "encoder.blocks.1.norm1.weight",
162
+ "first_stage_model.encoder.down.0.block.1.norm2.bias": "encoder.blocks.1.norm2.bias",
163
+ "first_stage_model.encoder.down.0.block.1.norm2.weight": "encoder.blocks.1.norm2.weight",
164
+ "first_stage_model.encoder.down.0.downsample.conv.bias": "encoder.blocks.2.conv.bias",
165
+ "first_stage_model.encoder.down.0.downsample.conv.weight": "encoder.blocks.2.conv.weight",
166
+ "first_stage_model.encoder.down.1.block.0.conv1.bias": "encoder.blocks.3.conv1.bias",
167
+ "first_stage_model.encoder.down.1.block.0.conv1.weight": "encoder.blocks.3.conv1.weight",
168
+ "first_stage_model.encoder.down.1.block.0.conv2.bias": "encoder.blocks.3.conv2.bias",
169
+ "first_stage_model.encoder.down.1.block.0.conv2.weight": "encoder.blocks.3.conv2.weight",
170
+ "first_stage_model.encoder.down.1.block.0.nin_shortcut.bias": "encoder.blocks.3.conv_shortcut.bias",
171
+ "first_stage_model.encoder.down.1.block.0.nin_shortcut.weight": "encoder.blocks.3.conv_shortcut.weight",
172
+ "first_stage_model.encoder.down.1.block.0.norm1.bias": "encoder.blocks.3.norm1.bias",
173
+ "first_stage_model.encoder.down.1.block.0.norm1.weight": "encoder.blocks.3.norm1.weight",
174
+ "first_stage_model.encoder.down.1.block.0.norm2.bias": "encoder.blocks.3.norm2.bias",
175
+ "first_stage_model.encoder.down.1.block.0.norm2.weight": "encoder.blocks.3.norm2.weight",
176
+ "first_stage_model.encoder.down.1.block.1.conv1.bias": "encoder.blocks.4.conv1.bias",
177
+ "first_stage_model.encoder.down.1.block.1.conv1.weight": "encoder.blocks.4.conv1.weight",
178
+ "first_stage_model.encoder.down.1.block.1.conv2.bias": "encoder.blocks.4.conv2.bias",
179
+ "first_stage_model.encoder.down.1.block.1.conv2.weight": "encoder.blocks.4.conv2.weight",
180
+ "first_stage_model.encoder.down.1.block.1.norm1.bias": "encoder.blocks.4.norm1.bias",
181
+ "first_stage_model.encoder.down.1.block.1.norm1.weight": "encoder.blocks.4.norm1.weight",
182
+ "first_stage_model.encoder.down.1.block.1.norm2.bias": "encoder.blocks.4.norm2.bias",
183
+ "first_stage_model.encoder.down.1.block.1.norm2.weight": "encoder.blocks.4.norm2.weight",
184
+ "first_stage_model.encoder.down.1.downsample.conv.bias": "encoder.blocks.5.conv.bias",
185
+ "first_stage_model.encoder.down.1.downsample.conv.weight": "encoder.blocks.5.conv.weight",
186
+ "first_stage_model.encoder.down.2.block.0.conv1.bias": "encoder.blocks.6.conv1.bias",
187
+ "first_stage_model.encoder.down.2.block.0.conv1.weight": "encoder.blocks.6.conv1.weight",
188
+ "first_stage_model.encoder.down.2.block.0.conv2.bias": "encoder.blocks.6.conv2.bias",
189
+ "first_stage_model.encoder.down.2.block.0.conv2.weight": "encoder.blocks.6.conv2.weight",
190
+ "first_stage_model.encoder.down.2.block.0.nin_shortcut.bias": "encoder.blocks.6.conv_shortcut.bias",
191
+ "first_stage_model.encoder.down.2.block.0.nin_shortcut.weight": "encoder.blocks.6.conv_shortcut.weight",
192
+ "first_stage_model.encoder.down.2.block.0.norm1.bias": "encoder.blocks.6.norm1.bias",
193
+ "first_stage_model.encoder.down.2.block.0.norm1.weight": "encoder.blocks.6.norm1.weight",
194
+ "first_stage_model.encoder.down.2.block.0.norm2.bias": "encoder.blocks.6.norm2.bias",
195
+ "first_stage_model.encoder.down.2.block.0.norm2.weight": "encoder.blocks.6.norm2.weight",
196
+ "first_stage_model.encoder.down.2.block.1.conv1.bias": "encoder.blocks.7.conv1.bias",
197
+ "first_stage_model.encoder.down.2.block.1.conv1.weight": "encoder.blocks.7.conv1.weight",
198
+ "first_stage_model.encoder.down.2.block.1.conv2.bias": "encoder.blocks.7.conv2.bias",
199
+ "first_stage_model.encoder.down.2.block.1.conv2.weight": "encoder.blocks.7.conv2.weight",
200
+ "first_stage_model.encoder.down.2.block.1.norm1.bias": "encoder.blocks.7.norm1.bias",
201
+ "first_stage_model.encoder.down.2.block.1.norm1.weight": "encoder.blocks.7.norm1.weight",
202
+ "first_stage_model.encoder.down.2.block.1.norm2.bias": "encoder.blocks.7.norm2.bias",
203
+ "first_stage_model.encoder.down.2.block.1.norm2.weight": "encoder.blocks.7.norm2.weight",
204
+ "first_stage_model.encoder.down.2.downsample.conv.bias": "encoder.blocks.8.conv.bias",
205
+ "first_stage_model.encoder.down.2.downsample.conv.weight": "encoder.blocks.8.conv.weight",
206
+ "first_stage_model.encoder.down.3.block.0.conv1.bias": "encoder.blocks.9.conv1.bias",
207
+ "first_stage_model.encoder.down.3.block.0.conv1.weight": "encoder.blocks.9.conv1.weight",
208
+ "first_stage_model.encoder.down.3.block.0.conv2.bias": "encoder.blocks.9.conv2.bias",
209
+ "first_stage_model.encoder.down.3.block.0.conv2.weight": "encoder.blocks.9.conv2.weight",
210
+ "first_stage_model.encoder.down.3.block.0.norm1.bias": "encoder.blocks.9.norm1.bias",
211
+ "first_stage_model.encoder.down.3.block.0.norm1.weight": "encoder.blocks.9.norm1.weight",
212
+ "first_stage_model.encoder.down.3.block.0.norm2.bias": "encoder.blocks.9.norm2.bias",
213
+ "first_stage_model.encoder.down.3.block.0.norm2.weight": "encoder.blocks.9.norm2.weight",
214
+ "first_stage_model.encoder.down.3.block.1.conv1.bias": "encoder.blocks.10.conv1.bias",
215
+ "first_stage_model.encoder.down.3.block.1.conv1.weight": "encoder.blocks.10.conv1.weight",
216
+ "first_stage_model.encoder.down.3.block.1.conv2.bias": "encoder.blocks.10.conv2.bias",
217
+ "first_stage_model.encoder.down.3.block.1.conv2.weight": "encoder.blocks.10.conv2.weight",
218
+ "first_stage_model.encoder.down.3.block.1.norm1.bias": "encoder.blocks.10.norm1.bias",
219
+ "first_stage_model.encoder.down.3.block.1.norm1.weight": "encoder.blocks.10.norm1.weight",
220
+ "first_stage_model.encoder.down.3.block.1.norm2.bias": "encoder.blocks.10.norm2.bias",
221
+ "first_stage_model.encoder.down.3.block.1.norm2.weight": "encoder.blocks.10.norm2.weight",
222
+ "first_stage_model.encoder.mid.attn_1.k.bias": "encoder.blocks.12.transformer_blocks.0.to_k.bias",
223
+ "first_stage_model.encoder.mid.attn_1.k.weight": "encoder.blocks.12.transformer_blocks.0.to_k.weight",
224
+ "first_stage_model.encoder.mid.attn_1.norm.bias": "encoder.blocks.12.norm.bias",
225
+ "first_stage_model.encoder.mid.attn_1.norm.weight": "encoder.blocks.12.norm.weight",
226
+ "first_stage_model.encoder.mid.attn_1.proj_out.bias": "encoder.blocks.12.transformer_blocks.0.to_out.bias",
227
+ "first_stage_model.encoder.mid.attn_1.proj_out.weight": "encoder.blocks.12.transformer_blocks.0.to_out.weight",
228
+ "first_stage_model.encoder.mid.attn_1.q.bias": "encoder.blocks.12.transformer_blocks.0.to_q.bias",
229
+ "first_stage_model.encoder.mid.attn_1.q.weight": "encoder.blocks.12.transformer_blocks.0.to_q.weight",
230
+ "first_stage_model.encoder.mid.attn_1.v.bias": "encoder.blocks.12.transformer_blocks.0.to_v.bias",
231
+ "first_stage_model.encoder.mid.attn_1.v.weight": "encoder.blocks.12.transformer_blocks.0.to_v.weight",
232
+ "first_stage_model.encoder.mid.block_1.conv1.bias": "encoder.blocks.11.conv1.bias",
233
+ "first_stage_model.encoder.mid.block_1.conv1.weight": "encoder.blocks.11.conv1.weight",
234
+ "first_stage_model.encoder.mid.block_1.conv2.bias": "encoder.blocks.11.conv2.bias",
235
+ "first_stage_model.encoder.mid.block_1.conv2.weight": "encoder.blocks.11.conv2.weight",
236
+ "first_stage_model.encoder.mid.block_1.norm1.bias": "encoder.blocks.11.norm1.bias",
237
+ "first_stage_model.encoder.mid.block_1.norm1.weight": "encoder.blocks.11.norm1.weight",
238
+ "first_stage_model.encoder.mid.block_1.norm2.bias": "encoder.blocks.11.norm2.bias",
239
+ "first_stage_model.encoder.mid.block_1.norm2.weight": "encoder.blocks.11.norm2.weight",
240
+ "first_stage_model.encoder.mid.block_2.conv1.bias": "encoder.blocks.13.conv1.bias",
241
+ "first_stage_model.encoder.mid.block_2.conv1.weight": "encoder.blocks.13.conv1.weight",
242
+ "first_stage_model.encoder.mid.block_2.conv2.bias": "encoder.blocks.13.conv2.bias",
243
+ "first_stage_model.encoder.mid.block_2.conv2.weight": "encoder.blocks.13.conv2.weight",
244
+ "first_stage_model.encoder.mid.block_2.norm1.bias": "encoder.blocks.13.norm1.bias",
245
+ "first_stage_model.encoder.mid.block_2.norm1.weight": "encoder.blocks.13.norm1.weight",
246
+ "first_stage_model.encoder.mid.block_2.norm2.bias": "encoder.blocks.13.norm2.bias",
247
+ "first_stage_model.encoder.mid.block_2.norm2.weight": "encoder.blocks.13.norm2.weight",
248
+ "first_stage_model.encoder.norm_out.bias": "encoder.conv_norm_out.bias",
249
+ "first_stage_model.encoder.norm_out.weight": "encoder.conv_norm_out.weight",
250
+ "first_stage_model.quant_conv.bias": "encoder.quant_conv.bias",
251
+ "first_stage_model.quant_conv.weight": "encoder.quant_conv.weight"
252
+ }
253
+ }
254
+ }