diffsynth-engine 0.4.1.dev1__tar.gz → 0.4.1.post2.dev1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (188)
  1. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/PKG-INFO +1 -1
  2. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/configs/pipeline.py +11 -0
  3. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/qwen_image/qwen_image_dit.py +4 -0
  4. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/pipelines/base.py +41 -18
  5. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/pipelines/qwen_image.py +36 -10
  6. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/pipelines/wan_video.py +7 -0
  7. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tokenizers/qwen2.py +2 -2
  8. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/offload.py +23 -0
  9. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/parallel.py +6 -4
  10. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine.egg-info/PKG-INFO +1 -1
  11. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/docs/tutorial.md +66 -21
  12. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/docs/tutorial_zh.md +71 -25
  13. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/.gitignore +0 -0
  14. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/.pre-commit-config.yaml +0 -0
  15. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/LICENSE +0 -0
  16. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/MANIFEST.in +0 -0
  17. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/README.md +0 -0
  18. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/assets/dingtalk.png +0 -0
  19. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/assets/showcase.jpeg +0 -0
  20. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/__init__.py +0 -0
  21. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/__init__.py +0 -0
  22. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/__init__.py +0 -0
  23. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/base_scheduler.py +0 -0
  24. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/flow_match/__init__.py +0 -0
  25. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/flow_match/flow_beta.py +0 -0
  26. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/flow_match/flow_ddim.py +0 -0
  27. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/flow_match/recifited_flow.py +0 -0
  28. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/__init__.py +0 -0
  29. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/beta.py +0 -0
  30. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/ddim.py +0 -0
  31. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/exponential.py +0 -0
  32. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/karras.py +0 -0
  33. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/linear.py +0 -0
  34. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/sgm_uniform.py +0 -0
  35. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/__init__.py +0 -0
  36. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/flow_match/__init__.py +0 -0
  37. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/flow_match/flow_match_euler.py +0 -0
  38. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/stable_diffusion/__init__.py +0 -0
  39. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/stable_diffusion/brownian_tree.py +0 -0
  40. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/stable_diffusion/ddpm.py +0 -0
  41. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/stable_diffusion/deis.py +0 -0
  42. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/stable_diffusion/dpmpp_2m.py +0 -0
  43. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/stable_diffusion/dpmpp_2m_sde.py +0 -0
  44. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/stable_diffusion/dpmpp_3m_sde.py +0 -0
  45. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/stable_diffusion/epsilon.py +0 -0
  46. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/stable_diffusion/euler.py +0 -0
  47. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/algorithm/sampler/stable_diffusion/euler_ancestral.py +0 -0
  48. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/components/vae.json +0 -0
  49. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/flux/flux_dit.json +0 -0
  50. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/flux/flux_text_encoder.json +0 -0
  51. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/flux/flux_vae.json +0 -0
  52. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/qwen_image/qwen2_5_vl_config.json +0 -0
  53. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/qwen_image/qwen2_5_vl_vision_config.json +0 -0
  54. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/qwen_image/qwen_image_vae.json +0 -0
  55. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/qwen_image/qwen_image_vae_keymap.json +0 -0
  56. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/sd/sd_text_encoder.json +0 -0
  57. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/sd/sd_unet.json +0 -0
  58. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/sd3/sd3_dit.json +0 -0
  59. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/sd3/sd3_text_encoder.json +0 -0
  60. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/sdxl/sdxl_text_encoder.json +0 -0
  61. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/sdxl/sdxl_unet.json +0 -0
  62. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/wan/dit/wan2.1-flf2v-14b.json +0 -0
  63. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/wan/dit/wan2.1-i2v-14b.json +0 -0
  64. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/wan/dit/wan2.1-t2v-1.3b.json +0 -0
  65. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/wan/dit/wan2.1-t2v-14b.json +0 -0
  66. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/wan/dit/wan2.2-i2v-a14b.json +0 -0
  67. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/wan/dit/wan2.2-t2v-a14b.json +0 -0
  68. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/wan/dit/wan2.2-ti2v-5b.json +0 -0
  69. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/wan/vae/wan-vae-keymap.json +0 -0
  70. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/wan/vae/wan2.1-vae.json +0 -0
  71. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/models/wan/vae/wan2.2-vae.json +0 -0
  72. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/merges.txt +0 -0
  73. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/special_tokens_map.json +0 -0
  74. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/tokenizer_config.json +0 -0
  75. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/vocab.json +0 -0
  76. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/special_tokens_map.json +0 -0
  77. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/spiece.model +0 -0
  78. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/tokenizer.json +0 -0
  79. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/tokenizer_config.json +0 -0
  80. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/added_tokens.json +0 -0
  81. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/merges.txt +0 -0
  82. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/special_tokens_map.json +0 -0
  83. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/tokenizer.json +0 -0
  84. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/tokenizer_config.json +0 -0
  85. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/vocab.json +0 -0
  86. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/merges.txt +0 -0
  87. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/special_tokens_map.json +0 -0
  88. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/tokenizer_config.json +0 -0
  89. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/vocab.json +0 -0
  90. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/merges.txt +0 -0
  91. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/special_tokens_map.json +0 -0
  92. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/tokenizer_config.json +0 -0
  93. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/vocab.json +0 -0
  94. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/special_tokens_map.json +0 -0
  95. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/spiece.model +0 -0
  96. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/tokenizer.json +0 -0
  97. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/tokenizer_config.json +0 -0
  98. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/configs/__init__.py +0 -0
  99. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/configs/controlnet.py +0 -0
  100. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/kernels/__init__.py +0 -0
  101. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/__init__.py +0 -0
  102. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/base.py +0 -0
  103. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/basic/__init__.py +0 -0
  104. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/basic/attention.py +0 -0
  105. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/basic/lora.py +0 -0
  106. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/basic/relative_position_emb.py +0 -0
  107. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/basic/timestep.py +0 -0
  108. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/basic/transformer_helper.py +0 -0
  109. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/basic/unet_helper.py +0 -0
  110. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/flux/__init__.py +0 -0
  111. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/flux/flux_controlnet.py +0 -0
  112. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/flux/flux_dit.py +0 -0
  113. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/flux/flux_dit_fbcache.py +0 -0
  114. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/flux/flux_ipadapter.py +0 -0
  115. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/flux/flux_redux.py +0 -0
  116. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/flux/flux_text_encoder.py +0 -0
  117. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/flux/flux_vae.py +0 -0
  118. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/qwen_image/__init__.py +0 -0
  119. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/qwen_image/qwen2_5_vl.py +0 -0
  120. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/qwen_image/qwen_image_dit_fbcache.py +0 -0
  121. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/qwen_image/qwen_image_vae.py +0 -0
  122. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sd/__init__.py +0 -0
  123. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sd/sd_controlnet.py +0 -0
  124. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sd/sd_text_encoder.py +0 -0
  125. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sd/sd_unet.py +0 -0
  126. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sd/sd_vae.py +0 -0
  127. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sd3/__init__.py +0 -0
  128. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sd3/sd3_dit.py +0 -0
  129. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sd3/sd3_text_encoder.py +0 -0
  130. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sd3/sd3_vae.py +0 -0
  131. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sdxl/__init__.py +0 -0
  132. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sdxl/sdxl_controlnet.py +0 -0
  133. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sdxl/sdxl_text_encoder.py +0 -0
  134. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sdxl/sdxl_unet.py +0 -0
  135. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/sdxl/sdxl_vae.py +0 -0
  136. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/text_encoder/__init__.py +0 -0
  137. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/text_encoder/clip.py +0 -0
  138. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/text_encoder/siglip.py +0 -0
  139. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/text_encoder/t5.py +0 -0
  140. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/utils.py +0 -0
  141. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/vae/__init__.py +0 -0
  142. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/vae/vae.py +0 -0
  143. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/wan/__init__.py +0 -0
  144. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/wan/wan_dit.py +0 -0
  145. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/wan/wan_image_encoder.py +0 -0
  146. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/wan/wan_text_encoder.py +0 -0
  147. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/models/wan/wan_vae.py +0 -0
  148. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/pipelines/__init__.py +0 -0
  149. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/pipelines/flux_image.py +0 -0
  150. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/pipelines/sd_image.py +0 -0
  151. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/pipelines/sdxl_image.py +0 -0
  152. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/pipelines/utils.py +0 -0
  153. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/processor/__init__.py +0 -0
  154. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/processor/canny_processor.py +0 -0
  155. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/processor/depth_processor.py +0 -0
  156. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tokenizers/__init__.py +0 -0
  157. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tokenizers/base.py +0 -0
  158. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tokenizers/clip.py +0 -0
  159. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tokenizers/t5.py +0 -0
  160. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tokenizers/wan.py +0 -0
  161. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tools/__init__.py +0 -0
  162. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tools/flux_inpainting_tool.py +0 -0
  163. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tools/flux_outpainting_tool.py +0 -0
  164. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tools/flux_reference_tool.py +0 -0
  165. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/tools/flux_replace_tool.py +0 -0
  166. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/__init__.py +0 -0
  167. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/cache.py +0 -0
  168. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/constants.py +0 -0
  169. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/download.py +0 -0
  170. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/env.py +0 -0
  171. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/flag.py +0 -0
  172. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/fp8_linear.py +0 -0
  173. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/gguf.py +0 -0
  174. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/image.py +0 -0
  175. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/loader.py +0 -0
  176. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/lock.py +0 -0
  177. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/logging.py +0 -0
  178. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/onnx.py +0 -0
  179. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/platform.py +0 -0
  180. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/prompt.py +0 -0
  181. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine/utils/video.py +0 -0
  182. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine.egg-info/SOURCES.txt +0 -0
  183. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine.egg-info/dependency_links.txt +0 -0
  184. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine.egg-info/requires.txt +0 -0
  185. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/diffsynth_engine.egg-info/top_level.txt +0 -0
  186. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/pyproject.toml +0 -0
  187. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/setup.cfg +0 -0
  188. {diffsynth_engine-0.4.1.dev1 → diffsynth_engine-0.4.1.post2.dev1}/setup.py +0 -0
PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: diffsynth_engine
-Version: 0.4.1.dev1
+Version: 0.4.1.post2.dev1
 Author: MuseAI x ModelScope
 Classifier: Programming Language :: Python :: 3
 Classifier: Operating System :: OS Independent
diffsynth_engine/configs/pipeline.py
@@ -16,6 +16,7 @@ class BaseConfig:
     vae_tile_stride: int | Tuple[int, int] = 256
     device: str = "cuda"
     offload_mode: Optional[str] = None
+    offload_to_disk: bool = False


 @dataclass
@@ -62,11 +63,13 @@ class SDPipelineConfig(BaseConfig):
         model_path: str | os.PathLike | List[str | os.PathLike],
         device: str = "cuda",
         offload_mode: Optional[str] = None,
+        offload_to_disk: bool = False,
     ) -> "SDPipelineConfig":
         return cls(
             model_path=model_path,
             device=device,
             offload_mode=offload_mode,
+            offload_to_disk=offload_to_disk,
         )


@@ -87,11 +90,13 @@ class SDXLPipelineConfig(BaseConfig):
         model_path: str | os.PathLike | List[str | os.PathLike],
         device: str = "cuda",
         offload_mode: Optional[str] = None,
+        offload_to_disk: bool = False,
     ) -> "SDXLPipelineConfig":
         return cls(
             model_path=model_path,
             device=device,
             offload_mode=offload_mode,
+            offload_to_disk=offload_to_disk,
         )


@@ -116,6 +121,7 @@ class FluxPipelineConfig(AttentionConfig, OptimizationConfig, ParallelConfig, Ba
         device: str = "cuda",
         parallelism: int = 1,
         offload_mode: Optional[str] = None,
+        offload_to_disk: bool = False,
     ) -> "FluxPipelineConfig":
         return cls(
             model_path=model_path,
@@ -123,6 +129,7 @@ class FluxPipelineConfig(AttentionConfig, OptimizationConfig, ParallelConfig, Ba
             parallelism=parallelism,
             use_fsdp=True,
             offload_mode=offload_mode,
+            offload_to_disk=offload_to_disk,
         )

     def __post_init__(self):
@@ -160,6 +167,7 @@ class WanPipelineConfig(AttentionConfig, OptimizationConfig, ParallelConfig, Bas
         device: str = "cuda",
         parallelism: int = 1,
         offload_mode: Optional[str] = None,
+        offload_to_disk: bool = False,
     ) -> "WanPipelineConfig":
         return cls(
             model_path=model_path,
@@ -169,6 +177,7 @@ class WanPipelineConfig(AttentionConfig, OptimizationConfig, ParallelConfig, Bas
             use_cfg_parallel=True,
             use_fsdp=True,
             offload_mode=offload_mode,
+            offload_to_disk=offload_to_disk,
         )

     def __post_init__(self):
@@ -196,6 +205,7 @@ class QwenImagePipelineConfig(AttentionConfig, OptimizationConfig, ParallelConfi
         device: str = "cuda",
         parallelism: int = 1,
         offload_mode: Optional[str] = None,
+        offload_to_disk: bool = False,
     ) -> "QwenImagePipelineConfig":
         return cls(
             model_path=model_path,
@@ -206,6 +216,7 @@ class QwenImagePipelineConfig(AttentionConfig, OptimizationConfig, ParallelConfi
             use_cfg_parallel=True,
             use_fsdp=True,
             offload_mode=offload_mode,
+            offload_to_disk=offload_to_disk,
         )

     def __post_init__(self):
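The net effect of these hunks is uniform: every pipeline config's factory classmethod gains an `offload_to_disk` parameter and forwards it into the shared `BaseConfig` dataclass field. A minimal stdlib-only sketch of that pattern (the factory name `basic_config` and the exact field sets are assumptions, since the hunks don't show the `def` lines):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseConfig:
    # Fields mirroring the BaseConfig hunk above
    device: str = "cuda"
    offload_mode: Optional[str] = None
    offload_to_disk: bool = False  # new in 0.4.1.post2.dev1


@dataclass
class WanPipelineConfig(BaseConfig):
    model_path: str = ""
    parallelism: int = 1
    use_fsdp: bool = False

    @classmethod
    def basic_config(cls, model_path, device="cuda", parallelism=1,
                     offload_mode=None, offload_to_disk=False):
        # The factory simply threads the new flag through to the dataclass,
        # exactly as the "+" lines in each hunk do.
        return cls(model_path=model_path, device=device,
                   parallelism=parallelism, use_fsdp=True,
                   offload_mode=offload_mode, offload_to_disk=offload_to_disk)


cfg = WanPipelineConfig.basic_config(
    "Wan2.2-T2V-A14B", offload_mode="cpu_offload", offload_to_disk=True
)
```

Because the flag defaults to `False` at both the field and the factory level, existing callers are unaffected.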
diffsynth_engine/models/qwen_image/qwen_image_dit.py
@@ -315,6 +315,7 @@ class QwenImageTransformerBlock(nn.Module):

 class QwenImageDiT(PreTrainedModel):
     converter = QwenImageDiTStateDictConverter()
+    _supports_parallelization = True

     def __init__(
         self,
@@ -423,3 +424,6 @@ class QwenImageDiT(PreTrainedModel):
         model.load_state_dict(state_dict, assign=True)
         model.to(device=device, dtype=dtype, non_blocking=True)
         return model
+
+    def get_fsdp_modules(self):
+        return ["transformer_blocks"]
diffsynth_engine/pipelines/base.py
@@ -6,7 +6,7 @@ from typing import Dict, List, Tuple
 from PIL import Image

 from diffsynth_engine.configs import BaseConfig, BaseStateDicts
-from diffsynth_engine.utils.offload import enable_sequential_cpu_offload
+from diffsynth_engine.utils.offload import enable_sequential_cpu_offload, offload_model_to_dict, restore_model_from_dict
 from diffsynth_engine.utils.fp8_linear import enable_fp8_autocast
 from diffsynth_engine.utils.gguf import load_gguf_checkpoint
 from diffsynth_engine.utils import logging
@@ -40,6 +40,8 @@ class BasePipeline:
         self.dtype = dtype
         self.offload_mode = None
         self.model_names = []
+        self._offload_param_dict = {}
+        self.offload_to_disk = False

     @classmethod
     def from_pretrained(cls, model_path_or_config: str | BaseConfig) -> "BasePipeline":
@@ -227,32 +229,44 @@ class BasePipeline:
         model.eval()
         return self

-    def enable_cpu_offload(self, offload_mode: str):
-        valid_offload_mode = ("cpu_offload", "sequential_cpu_offload")
+    def enable_cpu_offload(self, offload_mode: str | None, offload_to_disk:bool = False):
+        valid_offload_mode = ("cpu_offload", "sequential_cpu_offload", "disable", None)
         if offload_mode not in valid_offload_mode:
             raise ValueError(f"offload_mode must be one of {valid_offload_mode}, but got {offload_mode}")
         if self.device == "cpu" or self.device == "mps":
             logger.warning("must set an non cpu device for pipeline before calling enable_cpu_offload")
             return
-        if offload_mode == "cpu_offload":
+        if offload_mode is None or offload_mode == "disable":
+            self._disable_offload()
+        elif offload_mode == "cpu_offload":
             self._enable_model_cpu_offload()
         elif offload_mode == "sequential_cpu_offload":
             self._enable_sequential_cpu_offload()
+        self.offload_to_disk = offload_to_disk

-    def _enable_model_cpu_offload(self):
+
+    def _enable_model_cpu_offload(self):
         for model_name in self.model_names:
             model = getattr(self, model_name)
             if model is not None:
-                model.to("cpu")
+                self._offload_param_dict[model_name] = offload_model_to_dict(model)
         self.offload_mode = "cpu_offload"

     def _enable_sequential_cpu_offload(self):
         for model_name in self.model_names:
             model = getattr(self, model_name)
             if model is not None:
-                model.to("cpu")
                 enable_sequential_cpu_offload(model, self.device)
         self.offload_mode = "sequential_cpu_offload"
+
+    def _disable_offload(self):
+        self.offload_mode = None
+        self._offload_param_dict = {}
+        for model_name in self.model_names:
+            model = getattr(self, model_name)
+            if model is not None:
+                model.to(self.device)
+

     def enable_fp8_autocast(
         self, model_names: List[str], compute_dtype: torch.dtype = torch.bfloat16, use_fp8_linear: bool = False
@@ -260,6 +274,7 @@ class BasePipeline:
         for model_name in model_names:
             model = getattr(self, model_name)
             if model is not None:
+                model.to(device=self.device, dtype=torch.float8_e4m3fn)
                 enable_fp8_autocast(model, compute_dtype, use_fp8_linear)
         self.fp8_autocast_enabled = True
@@ -277,23 +292,31 @@ class BasePipeline:
         for model_name in self.model_names:
             if model_name not in load_model_names:
                 model = getattr(self, model_name)
-                if (
-                    model is not None
-                    and (p := next(model.parameters(), None)) is not None
-                    and p.device != torch.device("cpu")
-                ):
-                    model.to("cpu")
+                if model is not None and (p := next(model.parameters(), None)) is not None and p.device.type != "cpu":
+                    restore_model_from_dict(model, self._offload_param_dict[model_name])
         # load the needed models to device
         for model_name in load_model_names:
             model = getattr(self, model_name)
-            if (
-                model is not None
-                and (p := next(model.parameters(), None)) is not None
-                and p.device != torch.device(self.device)
-            ):
+            if model is None:
+                raise ValueError(f"model {model_name} is not loaded, maybe this model has been destroyed by model_lifecycle_finish function with offload_to_disk=True")
+            if model is not None and (p := next(model.parameters(), None)) is not None and p.device.type != self.device:
                 model.to(self.device)
         # fresh the cuda cache
         empty_cache()

+    def model_lifecycle_finish(self, model_names: List[str] | None = None):
+        if not self.offload_to_disk or self.offload_mode is None:
+            return
+        for model_name in model_names:
+            model = getattr(self, model_name)
+            del model
+            if model_name in self._offload_param_dict:
+                del self._offload_param_dict[model_name]
+            setattr(self, model_name, None)
+            print(f"model {model_name} has been deleted from memory")
+            logger.info(f"model {model_name} has been deleted from memory")
+        empty_cache()
+
+
     def compile(self):
         raise NotImplementedError(f"{self.__class__.__name__} does not support compile")
@@ -41,19 +41,32 @@ class QwenImageLoRAConverter(LoRAStateDictConverter):
         dit_dict = {}
         for key, param in lora_state_dict.items():
             origin_key = key
-            if "lora_A.default.weight" not in key:
+            lora_a_suffix = None
+            if "lora_A.default.weight" in key:
+                lora_a_suffix = "lora_A.default.weight"
+            elif "lora_A.weight" in key:
+                lora_a_suffix = "lora_A.weight"
+
+            if lora_a_suffix is None:
                 continue
+
             lora_args = {}
             lora_args["down"] = param
-            lora_args["up"] = lora_state_dict[origin_key.replace("lora_A.default.weight", "lora_B.default.weight")]
+
+            lora_b_suffix = lora_a_suffix.replace("lora_A", "lora_B")
+            lora_args["up"] = lora_state_dict[origin_key.replace(lora_a_suffix, lora_b_suffix)]
+
             lora_args["rank"] = lora_args["up"].shape[1]
-            alpha_key = origin_key.replace("lora_A.default.weight", "alpha").replace("lora_up.default.weight", "alpha")
+            alpha_key = origin_key.replace("lora_up", "lora_A").replace(lora_a_suffix, "alpha")
+
             if alpha_key in lora_state_dict:
                 alpha = lora_state_dict[alpha_key]
             else:
                 alpha = lora_args["rank"]
             lora_args["alpha"] = alpha
-            key = key.replace(".lora_A.default.weight", "")
+
+            key = key.replace(f".{lora_a_suffix}", "")
+
             if key.startswith("transformer") and "attn.to_out.0" in key:
                 key = key.replace("attn.to_out.0", "attn.to_out")
             dit_dict[key] = lora_args
@@ -82,10 +95,8 @@ class QwenImagePipeline(BasePipeline):
             dtype=config.model_dtype,
         )
         self.config = config
-        self.tokenizer_max_length = 1024
         self.prompt_template_encode = "<|im_start|>system\nDescribe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>\n<|im_start|>user\n{}<|im_end|>\n<|im_start|>assistant\n"
         self.prompt_template_encode_start_idx = 34
-        self.default_sample_size = 128
         # sampler
         self.noise_scheduler = RecifitedFlowScheduler(shift=3.0, use_dynamic_shifting=True)
         self.sampler = FlowMatchEulerSampler()
@@ -197,7 +208,19 @@ class QwenImagePipeline(BasePipeline):
         pipe.eval()
 
         if config.offload_mode is not None:
-            pipe.enable_cpu_offload(config.offload_mode)
+            pipe.enable_cpu_offload(config.offload_mode, config.offload_to_disk)
+
+        if config.model_dtype == torch.float8_e4m3fn:
+            pipe.dtype = torch.bfloat16  # compute dtype
+            pipe.enable_fp8_autocast(
+                model_names=["dit"], compute_dtype=pipe.dtype, use_fp8_linear=config.use_fp8_linear
+            )
+
+        if config.encoder_dtype == torch.float8_e4m3fn:
+            pipe.dtype = torch.bfloat16  # compute dtype
+            pipe.enable_fp8_autocast(
+                model_names=["encoder"], compute_dtype=pipe.dtype, use_fp8_linear=config.use_fp8_linear
+            )
 
         if config.parallelism > 1:
             pipe = ParallelWrapper(
@@ -262,7 +285,7 @@ class QwenImagePipeline(BasePipeline):
         template = self.prompt_template_encode
         drop_idx = self.prompt_template_encode_start_idx
         texts = [template.format(txt) for txt in prompt]
-        outputs = self.tokenizer(texts, max_length=min(max_sequence_length, self.tokenizer_max_length) + drop_idx)
+        outputs = self.tokenizer(texts, max_length=max_sequence_length + drop_idx)
         input_ids, attention_mask = outputs["input_ids"].to(self.device), outputs["attention_mask"].to(self.device)
         outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
         hidden_states = outputs["hidden_states"]
@@ -377,11 +400,12 @@ class QwenImagePipeline(BasePipeline):
         self.sampler.initialize(init_latents=init_latents, timesteps=timesteps, sigmas=sigmas)
 
         self.load_models_to_device(["encoder"])
-        prompt_embeds, prompt_embeds_mask = self.encode_prompt(prompt, 1, 512)
+        prompt_embeds, prompt_embeds_mask = self.encode_prompt(prompt, 1, 4096)
         if cfg_scale > 1.0 and negative_prompt != "":
-            negative_prompt_embeds, negative_prompt_embeds_mask = self.encode_prompt(negative_prompt, 1, 512)
+            negative_prompt_embeds, negative_prompt_embeds_mask = self.encode_prompt(negative_prompt, 1, 4096)
         else:
             negative_prompt_embeds, negative_prompt_embeds_mask = None, None
+        self.model_lifecycle_finish(["encoder"])
 
         hide_progress = dist.is_initialized() and dist.get_rank() != 0
         for i, timestep in enumerate(tqdm(timesteps, disable=hide_progress)):
@@ -401,6 +425,7 @@ class QwenImagePipeline(BasePipeline):
             # UI
             if progress_callback is not None:
                 progress_callback(i, len(timesteps), "DENOISING")
+        self.model_lifecycle_finish(["dit"])
         # Decode image
         self.load_models_to_device(["vae"])
         latents = rearrange(latents, "B C H W -> B C 1 H W")
@@ -412,5 +437,6 @@ class QwenImagePipeline(BasePipeline):
         )
         image = self.vae_output_to_image(vae_output)
         # Offload all models
+        self.model_lifecycle_finish(["vae"])
         self.load_models_to_device([])
         return image
@@ -584,4 +584,11 @@ class WanVideoPipeline(BasePipeline):
             use_fsdp=config.use_fsdp,
             device="cuda",
         )
+        if config.use_torch_compile:
+            pipe.compile()
         return pipe
+
+    def compile(self):
+        self.dit.compile()
+        if self.dit2 is not None:
+            self.dit2.compile()
@@ -197,8 +197,8 @@ class Qwen2TokenizerFast(BaseTokenizer):
         encoded.fill_(self.pad_token_id)
         attention_mask = torch.zeros(len(texts), max_length, dtype=torch.long)
         for i, ids in enumerate(batch_ids):
-            if len(ids) > self.model_max_length:
-                ids = ids[: self.model_max_length]
+            if len(ids) > max_length:
+                ids = ids[:max_length]
             ids[-1] = self.eos_token_id
             if padding_side == "right":
                 encoded[i, : len(ids)] = torch.tensor(ids)
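The fix above makes truncation honor the caller-supplied `max_length` rather than the tokenizer's global `model_max_length`. A minimal pure-Python sketch of the same pad/truncate/EOS logic (the `PAD_ID`/`EOS_ID` values and the `pad_and_truncate` helper are made up for illustration, not the library's API):

```python
# Hypothetical ids chosen for illustration only
PAD_ID, EOS_ID = 2, 1

def pad_and_truncate(batch_ids, max_length):
    # Pre-fill with padding, then copy each (possibly truncated) sequence in
    encoded = [[PAD_ID] * max_length for _ in batch_ids]
    attention_mask = [[0] * max_length for _ in batch_ids]
    for i, ids in enumerate(batch_ids):
        ids = list(ids)
        if len(ids) > max_length:
            # Truncate to the per-call max_length, not a global tokenizer limit
            ids = ids[:max_length]
        # Mirror the source: the last kept token always becomes EOS
        ids[-1] = EOS_ID
        encoded[i][: len(ids)] = ids
        attention_mask[i][: len(ids)] = [1] * len(ids)
    return encoded, attention_mask

rows, mask = pad_and_truncate([[5, 6, 7, 8, 9]], max_length=3)
```

With the old behavior, a `max_length` smaller than `model_max_length` would silently be ignored and the padded tensor and ids could disagree in length.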
@@ -1,8 +1,10 @@
 import torch
 import torch.nn as nn
+from typing import Dict
 
 
 def enable_sequential_cpu_offload(module: nn.Module, device: str = "cuda"):
+    module = module.to("cpu")
     if len(list(module.children())) == 0:
         if len(list(module.parameters())) > 0 or len(list(module.buffers())) > 0:
             # leaf module with parameters or buffers
@@ -50,3 +52,24 @@ def add_cpu_offload_hook(module: nn.Module, device: str = "cuda", recurse: bool
     module.register_forward_pre_hook(_forward_pre_hook)
     module.register_forward_hook(_forward_hook)
     setattr(module, "_cpu_offload_enabled", True)
+
+
+def offload_model_to_dict(module: nn.Module) -> Dict[str, torch.Tensor]:
+    module = module.to("cpu")
+    offload_param_dict = {}
+    for name, param in module.named_parameters(recurse=True):
+        param.data = param.data.pin_memory()
+        offload_param_dict[name] = param.data
+    for name, buffer in module.named_buffers(recurse=True):
+        buffer.data = buffer.data.pin_memory()
+        offload_param_dict[name] = buffer.data
+    return offload_param_dict
+
+
+def restore_model_from_dict(module: nn.Module, offload_param_dict: Dict[str, torch.Tensor]):
+    for name, param in module.named_parameters(recurse=True):
+        if name in offload_param_dict:
+            param.data = offload_param_dict[name]
+    for name, buffer in module.named_buffers(recurse=True):
+        if name in offload_param_dict:
+            buffer.data = offload_param_dict[name]
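The new helpers work by swapping `Tensor.data` pointers: the CPU copies are stashed by name on offload and re-attached on restore without reallocating the module. A simplified sketch of that pattern (the `stash_to_cpu`/`restore` names here are stand-ins, not the library functions; `.pin_memory()` is omitted because it requires a CUDA-enabled build):

```python
import torch
import torch.nn as nn

def stash_to_cpu(module: nn.Module):
    # Move weights to CPU and remember each tensor by parameter name.
    # The real helper additionally pins the CPU memory so later host-to-device
    # copies can be asynchronous.
    module.to("cpu")
    return {name: p.data for name, p in module.named_parameters()}

def restore(module: nn.Module, stash):
    # Re-attach the stashed CPU tensors in place of whatever the module holds
    for name, p in module.named_parameters():
        if name in stash:
            p.data = stash[name]

m = nn.Linear(4, 4)
stash = stash_to_cpu(m)
m.weight.data = torch.zeros(4, 4)  # simulate the weights being clobbered
restore(m, stash)
```

Because only the `data` attribute is swapped, optimizer references, hooks, and the module graph itself remain untouched.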
@@ -304,12 +304,14 @@ def _worker_loop(
             if rank == 0:
                 queue_out.put(res)
             dist.barrier()
-    except Exception as e:
+    except Exception:
         import traceback
 
-        traceback.print_exc()
-        logger.error(f"Error in worker loop (rank {rank}): {e}")
-        queue_out.put(e)  # any exception caught in the worker will be raised to the main process
+        msg = traceback.format_exc()
+        err = RuntimeError(msg)
+        logger.error(f"Error in worker loop (rank {rank}): {msg}")
+        if rank == 0:
+            queue_out.put(err)  # any exception caught in the worker will be raised to the main process
     finally:
         del module
         torch.cuda.synchronize()
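The change above wraps the formatted traceback in a plain `RuntimeError` (strings always pickle cleanly across process boundaries, unlike arbitrary exception objects) and lets only rank 0 report, so the main process receives exactly one error. A single-process sketch of the pattern, using `queue.Queue` in place of the inter-process queue:

```python
import queue
import traceback

def worker(task, rank, queue_out):
    try:
        queue_out.put(task())
    except Exception:
        # format_exc() captures the full worker-side traceback as a string,
        # so no unpicklable exception state crosses the queue
        err = RuntimeError(traceback.format_exc())
        if rank == 0:  # only one rank reports to avoid duplicate errors
            queue_out.put(err)

q = queue.Queue()
worker(lambda: 1 / 0, rank=0, queue_out=q)
result = q.get()  # the main process re-raises whatever RuntimeError it receives
```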
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: diffsynth_engine
-Version: 0.4.1.dev1
+Version: 0.4.1.post2.dev1
 Author: MuseAI x ModelScope
 Classifier: Programming Language :: Python :: 3
 Classifier: Operating System :: OS Independent
@@ -88,6 +88,51 @@ We will continuously update DiffSynth-Engine to support more models. (Wan2.2 LoR
 
 After the model is downloaded, load the model with the corresponding pipeline and perform inference.
 
+### Image Generation (Qwen-Image)
+
+The following code calls `QwenImagePipeline` to load the [Qwen-Image](https://www.modelscope.cn/models/Qwen/Qwen-Image) model and generate an image. Recommended resolutions are 928×1664, 1104×1472, 1328×1328, 1472×1104, and 1664×928, with a suggested `cfg_scale` of 4. If no `negative_prompt` is provided, it defaults to a single space character (not an empty string). For multi-GPU inference, only CFG parallelism (`parallelism=2`) is currently supported; further optimizations are underway.
+
+```python
+from diffsynth_engine import fetch_model, QwenImagePipeline, QwenImagePipelineConfig
+
+config = QwenImagePipelineConfig.basic_config(
+    model_path=fetch_model("MusePublic/Qwen-image", revision="v1", path="transformer/*.safetensors"),
+    encoder_path=fetch_model("MusePublic/Qwen-image", revision="v1", path="text_encoder/*.safetensors"),
+    vae_path=fetch_model("MusePublic/Qwen-image", revision="v1", path="vae/*.safetensors"),
+    parallelism=2,
+)
+pipe = QwenImagePipeline.from_pretrained(config)
+
+prompt = """
+一副典雅庄重的对联悬挂于厅堂之中,房间是个安静古典的中式布置,桌子上放着一些青花瓷,对联上左书“思涌如泉万类灵感皆可触”,右书“智启于问千机代码自天成”,横批“AI脑洞力”,字体飘逸灵动,兼具传统笔意与未来感。中间挂着一幅中国风的画作,内容是岳阳楼,云雾缭绕间似有数据流光隐现,古今交融,意境深远。
+"""
+negative_prompt = " "
+image = pipe(
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    cfg_scale=4.0,
+    width=1104,
+    height=1472,
+    num_inference_steps=30,
+    seed=42,
+)
+image.save("image.png")
+```
+
+Please note that if some necessary modules, like text encoders, are missing from a model repository, the pipeline will automatically download the required files.
+
+### Detailed Parameters (Qwen-Image)
+
+In the image generation pipeline `pipe`, we can use the following parameters for fine-grained control:
+
+* `prompt`: The prompt, used to describe the content of the generated image. It supports multiple languages (Chinese, English, Japanese, etc.), e.g., “一只猫” (Chinese), "a cat" (English), or "庭を走る猫" (Japanese).
+* `negative_prompt`: The negative prompt, used to describe content you do not want in the image, e.g., "ugly". It defaults to a single space character (not an empty string).
+* `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
+* `height`: Image height.
+* `width`: Image width.
+* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
+* `seed`: The random seed. A fixed seed ensures reproducible results.
+
 ### Image Generation
 
 The following code calls `FluxImagePipeline` to load the [MajicFlus](https://www.modelscope.cn/models/MAILAND/majicflus_v1/summary?version=v1.0) model and generate an image. To load other types of models, replace `FluxImagePipeline` and `FluxPipelineConfig` in the code with the corresponding pipeline and config.
@@ -109,16 +154,16 @@ Please note that if some necessary modules, like text encoders, are missing from
 
 In the image generation pipeline `pipe`, we can use the following parameters for fine-grained control:
 
-* `prompt`: The prompt, used to describe the content of the generated image, e.g., "a cat".
-* `negative_prompt`: The negative prompt, used to describe content you do not want in the image, e.g., "ugly".
-* `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
-* `clip_skip`: The number of layers to skip in the [CLIP](https://arxiv.org/abs/2103.00020) text encoder. The more layers skipped, the lower the text-image correlation, but this can lead to interesting variations in the generated content.
-* `input_image`: Input image, used for image-to-image generation.
-* `denoising_strength`: The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input image is preserved.
-* `height`: Image height.
-* `width`: Image width.
-* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
-* `seed`: The random seed. A fixed seed ensures reproducible results.
+* `prompt`: The prompt, used to describe the content of the generated image, e.g., "a cat".
+* `negative_prompt`: The negative prompt, used to describe content you do not want in the image, e.g., "ugly".
+* `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
+* `clip_skip`: The number of layers to skip in the [CLIP](https://arxiv.org/abs/2103.00020) text encoder. The more layers skipped, the lower the text-image correlation, but this can lead to interesting variations in the generated content.
+* `input_image`: Input image, used for image-to-image generation.
+* `denoising_strength`: The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input image is preserved.
+* `height`: Image height.
+* `width`: Image width.
+* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
+* `seed`: The random seed. A fixed seed ensures reproducible results.
 
 #### Loading LoRA
@@ -177,17 +222,17 @@ save_video(video, "video.mp4")
 
 In the video generation pipeline `pipe`, we can use the following parameters for fine-grained control:
 
-* `prompt`: The prompt, used to describe the content of the generated video, e.g., "a cat".
-* `negative_prompt`: The negative prompt, used to describe content you do not want in the video, e.g., "ugly".
-* `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the video but reduces the diversity of the generated content.
-* `input_image`: Input image, only effective in image-to-video models, such as [Wan-AI/Wan2.1-I2V-14B-720P](https://modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P).
-* `input_video`: Input video, used for video-to-video generation.
-* `denoising_strength`: The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input video is preserved.
-* `height`: Video frame height.
-* `width`: Video frame width.
-* `num_frames`: Number of video frames.
-* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher video quality.
-* `seed`: The random seed. A fixed seed ensures reproducible results.
+* `prompt`: The prompt, used to describe the content of the generated video, e.g., "a cat".
+* `negative_prompt`: The negative prompt, used to describe content you do not want in the video, e.g., "ugly".
+* `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the video but reduces the diversity of the generated content.
+* `input_image`: Input image, only effective in image-to-video models, such as [Wan-AI/Wan2.1-I2V-14B-720P](https://modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P).
+* `input_video`: Input video, used for video-to-video generation.
+* `denoising_strength`: The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input video is preserved.
+* `height`: Video frame height.
+* `width`: Video frame width.
+* `num_frames`: Number of video frames.
+* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher video quality.
+* `seed`: The random seed. A fixed seed ensures reproducible results.
 
 #### Loading LoRA