PyPI - diffsynth - Versions diffs - 2.0.3__tar.gz → 2.0.4__tar.gz - Mend

diffsynth 2.0.3__tar.gz → 2.0.4__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (412)

{diffsynth-2.0.3/diffsynth.egg-info → diffsynth-2.0.4}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: diffsynth
-Version: 2.0.3
+Version: 2.0.4
 Summary: Enjoy the magic of Diffusion models!
 Author: ModelScope Team
 License: Apache-2.0

{diffsynth-2.0.3 → diffsynth-2.0.4}/README.md RENAMED

@@ -33,6 +33,8 @@ We believe that a well-developed open-source code framework can lower the thresh
 > Currently, the development personnel of this project are limited, with most of the work handled by [Artiprocher](https://github.com/Artiprocher). Therefore, the progress of new feature development will be relatively slow, and the speed of responding to and resolving issues is limited. We apologize for this and ask developers to understand.
+- **January 27, 2026**: [Z-Image](https://modelscope.cn/models/Tongyi-MAI/Z-Image) is released, and our [Z-Image-i2L](https://www.modelscope.cn/models/DiffSynth-Studio/Z-Image-i2L) model is released concurrently. You can use it in [ModelScope Studios](https://modelscope.cn/studios/DiffSynth-Studio/Z-Image-i2L). For details, see the [documentation](/docs/zh/Model_Details/Z-Image.md).
 - **January 19, 2026**: Added support for [FLUX.2-klein-4B](https://modelscope.cn/models/black-forest-labs/FLUX.2-klein-4B) and [FLUX.2-klein-9B](https://modelscope.cn/models/black-forest-labs/FLUX.2-klein-9B) models, including training and inference capabilities. [Documentation](/docs/en/Model_Details/FLUX2.md) and [example code](/examples/flux2/) are now available.
 - **January 12, 2026**: We trained and open-sourced a text-guided image layer separation model ([Model Link](https://modelscope.cn/models/DiffSynth-Studio/Qwen-Image-Layered-Control)). Given an input image and a textual description, the model isolates the image layer corresponding to the described content. For more details, please refer to our blog post ([Chinese version](https://modelscope.cn/learn/4938), [English version](https://huggingface.co/blog/kelseye/qwen-image-layered-control)).
@@ -269,9 +271,14 @@ image.save("image.jpg")
 Example code for Z-Image is available at: [/examples/z_image/](/examples/z_image/)
-| Model ID | Inference | Low-VRAM Inference | Full Training | Full Training Validation | LoRA Training | LoRA Training Validation |
+|Model ID|Inference|Low VRAM Inference|Full Training|Validation After Full Training|LoRA Training|Validation After LoRA Training|
 |-|-|-|-|-|-|-|
+|[Tongyi-MAI/Z-Image](https://www.modelscope.cn/models/Tongyi-MAI/Z-Image)|[code](/examples/z_image/model_inference/Z-Image.py)|[code](/examples/z_image/model_inference_low_vram/Z-Image.py)|[code](/examples/z_image/model_training/full/Z-Image.sh)|[code](/examples/z_image/model_training/validate_full/Z-Image.py)|[code](/examples/z_image/model_training/lora/Z-Image.sh)|[code](/examples/z_image/model_training/validate_lora/Z-Image.py)|
+|[DiffSynth-Studio/Z-Image-i2L](https://www.modelscope.cn/models/DiffSynth-Studio/Z-Image-i2L)|[code](/examples/z_image/model_inference/Z-Image-i2L.py)|[code](/examples/z_image/model_inference_low_vram/Z-Image-i2L.py)|-|-|-|-|
 |[Tongyi-MAI/Z-Image-Turbo](https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo)|[code](/examples/z_image/model_inference/Z-Image-Turbo.py)|[code](/examples/z_image/model_inference_low_vram/Z-Image-Turbo.py)|[code](/examples/z_image/model_training/full/Z-Image-Turbo.sh)|[code](/examples/z_image/model_training/validate_full/Z-Image-Turbo.py)|[code](/examples/z_image/model_training/lora/Z-Image-Turbo.sh)|[code](/examples/z_image/model_training/validate_lora/Z-Image-Turbo.py)|
+|[PAI/Z-Image-Turbo-Fun-Controlnet-Union-2.1](https://www.modelscope.cn/models/PAI/Z-Image-Turbo-Fun-Controlnet-Union-2.1)|[code](/examples/z_image/model_inference/Z-Image-Turbo-Fun-Controlnet-Union-2.1.py)|[code](/examples/z_image/model_inference_low_vram/Z-Image-Turbo-Fun-Controlnet-Union-2.1.py)|[code](/examples/z_image/model_training/full/Z-Image-Turbo-Fun-Controlnet-Union-2.1.sh)|[code](/examples/z_image/model_training/validate_full/Z-Image-Turbo-Fun-Controlnet-Union-2.1.py)|[code](/examples/z_image/model_training/lora/Z-Image-Turbo-Fun-Controlnet-Union-2.1.sh)|[code](/examples/z_image/model_training/validate_lora/Z-Image-Turbo-Fun-Controlnet-Union-2.1.py)|
+|[PAI/Z-Image-Turbo-Fun-Controlnet-Union-2.1-8steps](https://www.modelscope.cn/models/PAI/Z-Image-Turbo-Fun-Controlnet-Union-2.1)|[code](/examples/z_image/model_inference/Z-Image-Turbo-Fun-Controlnet-Union-2.1-8steps.py)|[code](/examples/z_image/model_inference_low_vram/Z-Image-Turbo-Fun-Controlnet-Union-2.1-8steps.py)|[code](/examples/z_image/model_training/full/Z-Image-Turbo-Fun-Controlnet-Union-2.1-8steps.sh)|[code](/examples/z_image/model_training/validate_full/Z-Image-Turbo-Fun-Controlnet-Union-2.1-8steps.py)|[code](/examples/z_image/model_training/lora/Z-Image-Turbo-Fun-Controlnet-Union-2.1-8steps.sh)|[code](/examples/z_image/model_training/validate_lora/Z-Image-Turbo-Fun-Controlnet-Union-2.1-8steps.py)|
+|[PAI/Z-Image-Turbo-Fun-Controlnet-Tile-2.1-8steps](https://www.modelscope.cn/models/PAI/Z-Image-Turbo-Fun-Controlnet-Union-2.1)|[code](/examples/z_image/model_inference/Z-Image-Turbo-Fun-Controlnet-Tile-2.1-8steps.py)|[code](/examples/z_image/model_inference_low_vram/Z-Image-Turbo-Fun-Controlnet-Tile-2.1-8steps.py)|[code](/examples/z_image/model_training/full/Z-Image-Turbo-Fun-Controlnet-Tile-2.1-8steps.sh)|[code](/examples/z_image/model_training/validate_full/Z-Image-Turbo-Fun-Controlnet-Tile-2.1-8steps.py)|[code](/examples/z_image/model_training/lora/Z-Image-Turbo-Fun-Controlnet-Tile-2.1-8steps.sh)|[code](/examples/z_image/model_training/validate_lora/Z-Image-Turbo-Fun-Controlnet-Tile-2.1-8steps.py)|
 </details>

{diffsynth-2.0.3 → diffsynth-2.0.4}/diffsynth/core/loader/config.py RENAMED

@@ -1,5 +1,5 @@
 import torch, glob, os
-from typing import Optional, Union
+from typing import Optional, Union, Dict
 from dataclasses import dataclass
 from modelscope import snapshot_download
 from huggingface_hub import snapshot_download as hf_snapshot_download
@@ -23,6 +23,7 @@ class ModelConfig:
     computation_device: Optional[Union[str, torch.device]] = None
     computation_dtype: Optional[torch.dtype] = None
     clear_parameters: bool = False
+    state_dict: Dict[str, torch.Tensor] = None
     def check_input(self):
         if self.path is None and self.model_id is None:
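
The new `state_dict` field defaults to `None`, so existing `ModelConfig(...)` calls keep working and only callers that already hold weights in memory need to set it. A minimal sketch of that pattern follows, assuming a hypothetical stand-in class `ModelConfigSketch` (not the real `ModelConfig`, which has more fields) and plain Python lists in place of `torch.Tensor` values. Whether `path` is still required when `state_dict` is supplied is not visible in this hunk.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ModelConfigSketch:
    # Hypothetical stand-in for diffsynth's ModelConfig; only two fields are modeled.
    path: Optional[str] = None
    # In diffsynth the values would be torch.Tensor; Any keeps this sketch torch-free.
    state_dict: Optional[Dict[str, Any]] = None


# Existing call sites are unaffected: the field is simply None.
default_cfg = ModelConfigSketch(path="model.safetensors")

# A caller with weights already in memory can hand them over directly.
preloaded_cfg = ModelConfigSketch(
    path="model.safetensors",
    state_dict={"weight": [0.0, 1.0]},
)
```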

{diffsynth-2.0.3 → diffsynth-2.0.4}/diffsynth/core/loader/file.py RENAMED

@@ -2,16 +2,25 @@ from safetensors import safe_open
 import torch, hashlib
-def load_state_dict(file_path, torch_dtype=None, device="cpu"):
+def load_state_dict(file_path, torch_dtype=None, device="cpu", pin_memory=False, verbose=0):
     if isinstance(file_path, list):
         state_dict = {}
         for file_path_ in file_path:
-            state_dict.update(load_state_dict(file_path_, torch_dtype, device))
-        return state_dict
-    if file_path.endswith(".safetensors"):
-        return load_state_dict_from_safetensors(file_path, torch_dtype=torch_dtype, device=device)
+            state_dict.update(load_state_dict(file_path_, torch_dtype, device, pin_memory=pin_memory, verbose=verbose))
     else:
-        return load_state_dict_from_bin(file_path, torch_dtype=torch_dtype, device=device)
+        if verbose >= 1:
+            print(f"Loading file [started]: {file_path}")
+        if file_path.endswith(".safetensors"):
+            state_dict = load_state_dict_from_safetensors(file_path, torch_dtype=torch_dtype, device=device)
+        else:
+            state_dict = load_state_dict_from_bin(file_path, torch_dtype=torch_dtype, device=device)
+        # If load state dict in CPU memory, `pin_memory=True` will make `model.to("cuda")` faster.
+        if pin_memory:
+            for i in state_dict:
+                state_dict[i] = state_dict[i].pin_memory()
+        if verbose >= 1:
+            print(f"Loading file [done]: {file_path}")
+    return state_dict
 def load_state_dict_from_safetensors(file_path, torch_dtype=None, device="cpu"):
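
The restructured `load_state_dict` above recurses over a list of files, merges the results with `dict.update`, and returns once at the end. The control flow can be sketched in pure Python as below. This is an illustration under stated assumptions, not the library code: `FAKE_FILES`, `load_one`, and the `log` parameter are hypothetical, and the `pin_memory` step is left out because it needs torch.

```python
# Hypothetical stand-in for files on disk: path -> {parameter name: value}.
FAKE_FILES = {
    "a.safetensors": {"w1": 1, "w2": 2},
    "b.safetensors": {"w2": 20, "w3": 30},
}


def load_one(file_path):
    # Stand-in for load_state_dict_from_safetensors / load_state_dict_from_bin.
    return dict(FAKE_FILES[file_path])


def load_state_dict_sketch(file_path, verbose=0, log=print):
    if isinstance(file_path, list):
        # A list of shards: load each one and merge into a single dict.
        state_dict = {}
        for file_path_ in file_path:
            state_dict.update(load_state_dict_sketch(file_path_, verbose=verbose, log=log))
    else:
        # A single file: progress messages are emitted here, once per file.
        if verbose >= 1:
            log(f"Loading file [started]: {file_path}")
        state_dict = load_one(file_path)
        if verbose >= 1:
            log(f"Loading file [done]: {file_path}")
    return state_dict


messages = []
merged = load_state_dict_sketch(
    ["a.safetensors", "b.safetensors"], verbose=1, log=messages.append
)
```

In this sketch a key that appears in several files keeps the value from the last file in the list, and the list branch itself logs nothing, so two files produce four messages.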

{diffsynth-2.0.3 → diffsynth-2.0.4}/diffsynth/core/loader/model.py RENAMED

@@ -5,7 +5,7 @@ from .file import load_state_dict
 import torch
-def load_model(model_class, path, config=None, torch_dtype=torch.bfloat16, device="cpu", state_dict_converter=None, use_disk_map=False, module_map=None, vram_config=None, vram_limit=None):
+def load_model(model_class, path, config=None, torch_dtype=torch.bfloat16, device="cpu", state_dict_converter=None, use_disk_map=False, module_map=None, vram_config=None, vram_limit=None, state_dict=None):
     config = {} if config is None else config
     # Why do we use `skip_model_initialization`?
     # It skips the random initialization of model parameters,
@@ -20,7 +20,7 @@ def load_model(model_class, path, config=None, torch_dtype=torch.bfloat16, devic
         dtypes = [vram_config["offload_dtype"], vram_config["onload_dtype"], vram_config["preparing_dtype"], vram_config["computation_dtype"]]
         dtype = [d for d in dtypes if d != "disk"][0]
         if vram_config["offload_device"] != "disk":
-            state_dict = DiskMap(path, device, torch_dtype=dtype)
+            if state_dict is None: state_dict = DiskMap(path, device, torch_dtype=dtype)
             if state_dict_converter is not None:
                 state_dict = state_dict_converter(state_dict)
             else:
@@ -35,7 +35,9 @@ def load_model(model_class, path, config=None, torch_dtype=torch.bfloat16, devic
         # Sometimes a model file contains multiple models,
         # and DiskMap can load only the parameters of a single model,
         # avoiding the need to load all parameters in the file.
-        if use_disk_map:
+        if state_dict is not None:
+            pass
+        elif use_disk_map:
             state_dict = DiskMap(path, device, torch_dtype=torch_dtype)
         else:
             state_dict = load_state_dict(path, torch_dtype, device)
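
The last hunk turns the two-way choice into a three-way one: a state dict supplied by the caller wins, then `DiskMap`, then `load_state_dict`. A small sketch of that precedence is below. The function `pick_state_dict_source` and its string labels are hypothetical and exist only to make the branch order testable.

```python
def pick_state_dict_source(state_dict=None, use_disk_map=False):
    """Return a label for where parameters would come from in the branch above."""
    if state_dict is not None:
        # Caller-provided weights are used as-is; nothing is read from `path`.
        return "caller-provided"
    elif use_disk_map:
        return "DiskMap"
    else:
        return "load_state_dict"


sources = [
    pick_state_dict_source(),
    pick_state_dict_source(use_disk_map=True),
    pick_state_dict_source(state_dict={"w": 1}, use_disk_map=True),
    pick_state_dict_source(state_dict={}),
]
```

Because the check is `is not None`, an empty dict also counts as caller-provided in this sketch, which mirrors the condition in the diff.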

{diffsynth-2.0.3 → diffsynth-2.0.4}/diffsynth/diffusion/base_pipeline.py RENAMED

@@ -296,6 +296,7 @@ class BasePipeline(torch.nn.Module):
                 vram_config=vram_config,
                 vram_limit=vram_limit,
                 clear_parameters=model_config.clear_parameters,
+                state_dict=model_config.state_dict,
             )
         return model_pool

{diffsynth-2.0.3 → diffsynth-2.0.4}/diffsynth/models/model_loader.py RENAMED

@@ -29,7 +29,7 @@ class ModelPool:
             module_map = None
         return module_map
-    def load_model_file(self, config, path, vram_config, vram_limit=None):
+    def load_model_file(self, config, path, vram_config, vram_limit=None, state_dict=None):
         model_class = self.import_model_class(config["model_class"])
         model_config = config.get("extra_kwargs", {})
         if "state_dict_converter" in config:
@@ -43,6 +43,7 @@ class ModelPool:
             state_dict_converter,
             use_disk_map=True,
             vram_config=vram_config, module_map=module_map, vram_limit=vram_limit,
+            state_dict=state_dict,
         )
         return model
@@ -59,7 +60,7 @@ class ModelPool:
         }
         return vram_config
-    def auto_load_model(self, path, vram_config=None, vram_limit=None, clear_parameters=False):
+    def auto_load_model(self, path, vram_config=None, vram_limit=None, clear_parameters=False, state_dict=None):
         print(f"Loading models from: {json.dumps(path, indent=4)}")
         if vram_config is None:
             vram_config = self.default_vram_config()
@@ -67,7 +68,7 @@ class ModelPool:
         loaded = False
         for config in MODEL_CONFIGS:
             if config["model_hash"] == model_hash:
-                model = self.load_model_file(config, path, vram_config, vram_limit=vram_limit)
+                model = self.load_model_file(config, path, vram_config, vram_limit=vram_limit, state_dict=state_dict)
                 if clear_parameters: self.clear_parameters(model)
                 self.model.append(model)
                 model_name = config["model_name"]

{diffsynth-2.0.3 → diffsynth-2.0.4}/diffsynth/pipelines/flux2_image.py RENAMED

@@ -1,4 +1,4 @@
-import torch, math
+import torch, math, torchvision
 from PIL import Image
 from typing import Union
 from tqdm import tqdm
@@ -477,10 +477,21 @@ class Flux2Unit_EditImageEmbedder(PipelineUnit):
         width = round(width / 32) * 32
         height = round(height / 32) * 32
         return width, height
+    def crop_and_resize(self, image, target_height, target_width):
+        width, height = image.size
+        scale = max(target_width / width, target_height / height)
+        image = torchvision.transforms.functional.resize(
+            image,
+            (round(height*scale), round(width*scale)),
+            interpolation=torchvision.transforms.InterpolationMode.BILINEAR
+        )
+        image = torchvision.transforms.functional.center_crop(image, (target_height, target_width))
+        return image
     def edit_image_auto_resize(self, edit_image):
         calculated_width, calculated_height = self.calculate_dimensions(1024 * 1024, edit_image.size[0] / edit_image.size[1])
-        return edit_image.resize((calculated_width, calculated_height))
+        return self.crop_and_resize(edit_image, calculated_height, calculated_width)
     def process_image_ids(self, image_latents, scale=10):
         t_coords = [scale + scale * t for t in torch.arange(0, len(image_latents))]
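
The new `crop_and_resize` scales the image until it covers the target in both dimensions and then center-crops, whereas the removed `edit_image.resize(...)` call stretched the image to the target size directly. The geometry can be worked through without torchvision, as sketched below. The helper `crop_and_resize_geometry` is hypothetical, and the crop offsets use integer halving as an assumed approximation of a centered crop, which may differ by one pixel from what torchvision computes.

```python
def crop_and_resize_geometry(width, height, target_height, target_width):
    """Return the intermediate size and crop box implied by crop_and_resize."""
    # Use the larger ratio so the resized image covers the target on both axes.
    scale = max(target_width / width, target_height / height)
    resized_h, resized_w = round(height * scale), round(width * scale)
    # Assumed centering rule for illustration (integer halving of the excess).
    top = (resized_h - target_height) // 2
    left = (resized_w - target_width) // 2
    crop_box = (left, top, left + target_width, top + target_height)
    return (resized_w, resized_h), crop_box


# A 1000x500 (width x height) image fitted to a 512x512 target.
resized_size, crop_box = crop_and_resize_geometry(1000, 500, 512, 512)
```

For this input the scale is 1.024, the intermediate image is 1024x512, and 256 pixels are cropped from each side. In `edit_image_auto_resize` the target is derived from the image's own aspect ratio, so the cropped margin there should usually be much smaller than in this deliberately extreme example.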

{diffsynth-2.0.3 → diffsynth-2.0.4/diffsynth.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: diffsynth
-Version: 2.0.3
+Version: 2.0.4
 Summary: Enjoy the magic of Diffusion models!
 Author: ModelScope Team
 License: Apache-2.0

diffsynth-2.0.4/diffsynth.egg-info/SOURCES.txt ADDED

@@ -0,0 +1,119 @@
+LICENSE
+README.md
+pyproject.toml
+diffsynth/__init__.py
+diffsynth.egg-info/PKG-INFO
+diffsynth.egg-info/SOURCES.txt
+diffsynth.egg-info/dependency_links.txt
+diffsynth.egg-info/requires.txt
+diffsynth.egg-info/top_level.txt
+diffsynth/configs/__init__.py
+diffsynth/configs/model_configs.py
+diffsynth/configs/vram_management_module_maps.py
+diffsynth/core/__init__.py
+diffsynth/core/attention/__init__.py
+diffsynth/core/attention/attention.py
+diffsynth/core/data/__init__.py
+diffsynth/core/data/operators.py
+diffsynth/core/data/unified_dataset.py
+diffsynth/core/device/__init__.py
+diffsynth/core/device/npu_compatible_device.py
+diffsynth/core/gradient/__init__.py
+diffsynth/core/gradient/gradient_checkpoint.py
+diffsynth/core/loader/__init__.py
+diffsynth/core/loader/config.py
+diffsynth/core/loader/file.py
+diffsynth/core/loader/model.py
+diffsynth/core/vram/__init__.py
+diffsynth/core/vram/disk_map.py
+diffsynth/core/vram/initialization.py
+diffsynth/core/vram/layers.py
+diffsynth/diffusion/__init__.py
+diffsynth/diffusion/base_pipeline.py
+diffsynth/diffusion/flow_match.py
+diffsynth/diffusion/logger.py
+diffsynth/diffusion/loss.py
+diffsynth/diffusion/parsers.py
+diffsynth/diffusion/runner.py
+diffsynth/diffusion/training_module.py
+diffsynth/models/dinov3_image_encoder.py
+diffsynth/models/flux2_dit.py
+diffsynth/models/flux2_text_encoder.py
+diffsynth/models/flux2_vae.py
+diffsynth/models/flux_controlnet.py
+diffsynth/models/flux_dit.py
+diffsynth/models/flux_infiniteyou.py
+diffsynth/models/flux_ipadapter.py
+diffsynth/models/flux_lora_encoder.py
+diffsynth/models/flux_lora_patcher.py
+diffsynth/models/flux_text_encoder_clip.py
+diffsynth/models/flux_text_encoder_t5.py
+diffsynth/models/flux_vae.py
+diffsynth/models/flux_value_control.py
+diffsynth/models/general_modules.py
+diffsynth/models/longcat_video_dit.py
+diffsynth/models/model_loader.py
+diffsynth/models/nexus_gen.py
+diffsynth/models/nexus_gen_ar_model.py
+diffsynth/models/nexus_gen_projector.py
+diffsynth/models/qwen_image_controlnet.py
+diffsynth/models/qwen_image_dit.py
+diffsynth/models/qwen_image_image2lora.py
+diffsynth/models/qwen_image_text_encoder.py
+diffsynth/models/qwen_image_vae.py
+diffsynth/models/sd_text_encoder.py
+diffsynth/models/siglip2_image_encoder.py
+diffsynth/models/step1x_connector.py
+diffsynth/models/step1x_text_encoder.py
+diffsynth/models/wan_video_animate_adapter.py
+diffsynth/models/wan_video_camera_controller.py
+diffsynth/models/wan_video_dit.py
+diffsynth/models/wan_video_dit_s2v.py
+diffsynth/models/wan_video_image_encoder.py
+diffsynth/models/wan_video_mot.py
+diffsynth/models/wan_video_motion_controller.py
+diffsynth/models/wan_video_text_encoder.py
+diffsynth/models/wan_video_vace.py
+diffsynth/models/wan_video_vae.py
+diffsynth/models/wav2vec.py
+diffsynth/models/z_image_controlnet.py
+diffsynth/models/z_image_dit.py
+diffsynth/models/z_image_image2lora.py
+diffsynth/models/z_image_text_encoder.py
+diffsynth/pipelines/flux2_image.py
+diffsynth/pipelines/flux_image.py
+diffsynth/pipelines/qwen_image.py
+diffsynth/pipelines/wan_video.py
+diffsynth/pipelines/z_image.py
+diffsynth/utils/controlnet/__init__.py
+diffsynth/utils/controlnet/annotator.py
+diffsynth/utils/controlnet/controlnet_input.py
+diffsynth/utils/data/__init__.py
+diffsynth/utils/lora/__init__.py
+diffsynth/utils/lora/flux.py
+diffsynth/utils/lora/general.py
+diffsynth/utils/lora/merge.py
+diffsynth/utils/lora/reset_rank.py
+diffsynth/utils/state_dict_converters/__init__.py
+diffsynth/utils/state_dict_converters/flux2_text_encoder.py
+diffsynth/utils/state_dict_converters/flux_controlnet.py
+diffsynth/utils/state_dict_converters/flux_dit.py
+diffsynth/utils/state_dict_converters/flux_infiniteyou.py
+diffsynth/utils/state_dict_converters/flux_ipadapter.py
+diffsynth/utils/state_dict_converters/flux_text_encoder_clip.py
+diffsynth/utils/state_dict_converters/flux_text_encoder_t5.py
+diffsynth/utils/state_dict_converters/flux_vae.py
+diffsynth/utils/state_dict_converters/nexus_gen.py
+diffsynth/utils/state_dict_converters/nexus_gen_projector.py
+diffsynth/utils/state_dict_converters/qwen_image_text_encoder.py
+diffsynth/utils/state_dict_converters/step1x_connector.py
+diffsynth/utils/state_dict_converters/wan_video_animate_adapter.py
+diffsynth/utils/state_dict_converters/wan_video_dit.py
+diffsynth/utils/state_dict_converters/wan_video_image_encoder.py
+diffsynth/utils/state_dict_converters/wan_video_mot.py
+diffsynth/utils/state_dict_converters/wan_video_vace.py
+diffsynth/utils/state_dict_converters/wan_video_vae.py
+diffsynth/utils/state_dict_converters/wans2v_audio_encoder.py
+diffsynth/utils/state_dict_converters/z_image_text_encoder.py
+diffsynth/utils/xfuser/__init__.py
+diffsynth/utils/xfuser/xdit_context_parallel.py

{diffsynth-2.0.3 → diffsynth-2.0.4}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "diffsynth"
-version = "2.0.3"
+version = "2.0.4"
 description = "Enjoy the magic of Diffusion models!"
 authors = [{name = "ModelScope Team"}]
 license = {text = "Apache-2.0"}

diffsynth-2.0.3/data/style/move.py DELETED

@@ -1,13 +0,0 @@
-from shutil import copy
-import os
-for i, style_id in enumerate([1, 2, 4, 5, 7, 8, 9]):
-    os.makedirs(f"/mnt/nas1/duanzhongjie.dzj/dev6_zimagebase/Z-Image-Omni-Base-i2L/assets/style/{i}", exist_ok=True)
-    for file_name in os.listdir(f"data/style/{style_id}"):
-        copy(f"data/style/{style_id}/{file_name}", f"/mnt/nas1/duanzhongjie.dzj/dev6_zimagebase/Z-Image-Omni-Base-i2L/assets/style/{i}/{file_name}")
-    image_id = 0
-    for file_name in sorted(os.listdir(f"data/style_out/1")):
-        if file_name.startswith(f"image_lora_{style_id}_"):
-            copy(f"data/style_out/1/{file_name}", f"/mnt/nas1/duanzhongjie.dzj/dev6_zimagebase/Z-Image-Omni-Base-i2L/assets/style/{i}/image_{image_id}.jpg")
-            image_id += 1

diffsynth-2.0.3/data/style/test.py DELETED

@@ -1,57 +0,0 @@
-from diffsynth.pipelines.z_image import (
-    ZImagePipeline, ModelConfig,
-    ZImageUnit_Image2LoRAEncode, ZImageUnit_Image2LoRADecode
-)
-from modelscope import snapshot_download
-from safetensors.torch import save_file
-import torch, os
-from PIL import Image
-# Use `vram_config` to enable LoRA hot-loading
-vram_config = {
-    "offload_dtype": torch.bfloat16,
-    "offload_device": "cuda",
-    "onload_dtype": torch.bfloat16,
-    "onload_device": "cuda",
-    "preparing_dtype": torch.bfloat16,
-    "preparing_device": "cuda",
-    "computation_dtype": torch.bfloat16,
-    "computation_device": "cuda",
-}
-# Load models
-pipe = ZImagePipeline.from_pretrained(
-    torch_dtype=torch.bfloat16,
-    device="cuda",
-    model_configs=[
-        ModelConfig(model_id="Tongyi-MAI/Z-Image-Omni-Base", origin_file_pattern="transformer/*.safetensors", **vram_config),
-        ModelConfig(model_id="Tongyi-MAI/Z-Image-Omni-Base", origin_file_pattern="siglip/model.safetensors"),
-        ModelConfig(model_id="Tongyi-MAI/Z-Image-Turbo", origin_file_pattern="text_encoder/*.safetensors"),
-        ModelConfig(model_id="Tongyi-MAI/Z-Image-Turbo", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
-        ModelConfig(model_id="DiffSynth-Studio/General-Image-Encoders", origin_file_pattern="SigLIP2-G384/model.safetensors"),
-        ModelConfig(model_id="DiffSynth-Studio/General-Image-Encoders", origin_file_pattern="DINOv3-7B/model.safetensors"),
-        ModelConfig("/mnt/nas1/duanzhongjie.dzj/dev3_zi2L/DiffSynth-Studio/models/train/ema_v30_0.9_0108.safetensors"),
-    ],
-    tokenizer_config=ModelConfig(model_id="Tongyi-MAI/Z-Image-Turbo", origin_file_pattern="tokenizer/"),
-)
-from diffsynth.core.data.operators import ImageCropAndResize
-processor_highres = ImageCropAndResize(height=1024, width=1024)
-for style_id in range(3, 12):
-    images = [Image.open(f"/mnt/nas1/duanzhongjie.dzj/dev3_zi2L/DiffSynth-Studio/data/style/{style_id}/{i}") for i in os.listdir(f"/mnt/nas1/duanzhongjie.dzj/dev3_zi2L/DiffSynth-Studio/data/style/{style_id}")]
-    os.makedirs(f"data/style/{style_id}", exist_ok=True)
-    for image_id, image in enumerate(images):
-        image = processor_highres(image)
-        image.save(f"data/style/{style_id}/{image_id}.jpg")
-    images = [Image.open(f"data/style/{style_id}/{i}.jpg") for i in range(len(images))]
-    with torch.no_grad():
-        embs = ZImageUnit_Image2LoRAEncode().process(pipe, image2lora_images=images)
-        lora = ZImageUnit_Image2LoRADecode().process(pipe, **embs)["lora"]
-    prompts = ["a cat", "a dog", "a girl"]
-    for prompt_id, prompt in enumerate(prompts):
-        negative_prompt = "泛黄，发绿，模糊，低分辨率，低质量图像，扭曲的肢体，诡异的外观，丑陋，AI感，噪点，网格感，JPEG压缩条纹，异常的肢体，水印，乱码，意义不明的字符"
-        image = pipe(prompt=prompt, negative_prompt=negative_prompt, seed=0, cfg_scale=7, num_inference_steps=50, positive_only_lora=lora, sigma_shift=8)
-        image.save(f"data/style_out/1/image_lora_{style_id}_{prompt_id}.jpg")
