diffsynth-engine 0.4.2.dev3__tar.gz → 0.4.2.dev5__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (188)
  1. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/PKG-INFO +1 -1
  2. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/pipelines/qwen_image.py +1 -3
  3. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine.egg-info/PKG-INFO +1 -1
  4. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/docs/tutorial.md +67 -21
  5. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/docs/tutorial_zh.md +64 -19
  6. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/.gitignore +0 -0
  7. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/.pre-commit-config.yaml +0 -0
  8. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/LICENSE +0 -0
  9. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/MANIFEST.in +0 -0
  10. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/README.md +0 -0
  11. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/assets/dingtalk.png +0 -0
  12. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/assets/showcase.jpeg +0 -0
  13. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/__init__.py +0 -0
  14. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/__init__.py +0 -0
  15. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/__init__.py +0 -0
  16. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/base_scheduler.py +0 -0
  17. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/flow_match/__init__.py +0 -0
  18. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/flow_match/flow_beta.py +0 -0
  19. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/flow_match/flow_ddim.py +0 -0
  20. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/flow_match/recifited_flow.py +0 -0
  21. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/__init__.py +0 -0
  22. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/beta.py +0 -0
  23. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/ddim.py +0 -0
  24. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/exponential.py +0 -0
  25. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/karras.py +0 -0
  26. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/linear.py +0 -0
  27. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/noise_scheduler/stable_diffusion/sgm_uniform.py +0 -0
  28. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/__init__.py +0 -0
  29. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/flow_match/__init__.py +0 -0
  30. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/flow_match/flow_match_euler.py +0 -0
  31. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/stable_diffusion/__init__.py +0 -0
  32. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/stable_diffusion/brownian_tree.py +0 -0
  33. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/stable_diffusion/ddpm.py +0 -0
  34. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/stable_diffusion/deis.py +0 -0
  35. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/stable_diffusion/dpmpp_2m.py +0 -0
  36. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/stable_diffusion/dpmpp_2m_sde.py +0 -0
  37. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/stable_diffusion/dpmpp_3m_sde.py +0 -0
  38. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/stable_diffusion/epsilon.py +0 -0
  39. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/stable_diffusion/euler.py +0 -0
  40. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/algorithm/sampler/stable_diffusion/euler_ancestral.py +0 -0
  41. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/components/vae.json +0 -0
  42. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/flux/flux_dit.json +0 -0
  43. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/flux/flux_text_encoder.json +0 -0
  44. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/flux/flux_vae.json +0 -0
  45. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/qwen_image/qwen2_5_vl_config.json +0 -0
  46. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/qwen_image/qwen2_5_vl_vision_config.json +0 -0
  47. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/qwen_image/qwen_image_vae.json +0 -0
  48. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/qwen_image/qwen_image_vae_keymap.json +0 -0
  49. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/sd/sd_text_encoder.json +0 -0
  50. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/sd/sd_unet.json +0 -0
  51. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/sd3/sd3_dit.json +0 -0
  52. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/sd3/sd3_text_encoder.json +0 -0
  53. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/sdxl/sdxl_text_encoder.json +0 -0
  54. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/sdxl/sdxl_unet.json +0 -0
  55. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/wan/dit/wan2.1-flf2v-14b.json +0 -0
  56. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/wan/dit/wan2.1-i2v-14b.json +0 -0
  57. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/wan/dit/wan2.1-t2v-1.3b.json +0 -0
  58. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/wan/dit/wan2.1-t2v-14b.json +0 -0
  59. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/wan/dit/wan2.2-i2v-a14b.json +0 -0
  60. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/wan/dit/wan2.2-t2v-a14b.json +0 -0
  61. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/wan/dit/wan2.2-ti2v-5b.json +0 -0
  62. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/wan/vae/wan-vae-keymap.json +0 -0
  63. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/wan/vae/wan2.1-vae.json +0 -0
  64. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/models/wan/vae/wan2.2-vae.json +0 -0
  65. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/merges.txt +0 -0
  66. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/special_tokens_map.json +0 -0
  67. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/tokenizer_config.json +0 -0
  68. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/flux/tokenizer_1/vocab.json +0 -0
  69. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/special_tokens_map.json +0 -0
  70. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/spiece.model +0 -0
  71. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/tokenizer.json +0 -0
  72. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/flux/tokenizer_2/tokenizer_config.json +0 -0
  73. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/added_tokens.json +0 -0
  74. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/merges.txt +0 -0
  75. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/special_tokens_map.json +0 -0
  76. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/tokenizer.json +0 -0
  77. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/tokenizer_config.json +0 -0
  78. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/qwen_image/tokenizer/vocab.json +0 -0
  79. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/merges.txt +0 -0
  80. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/special_tokens_map.json +0 -0
  81. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/tokenizer_config.json +0 -0
  82. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer/vocab.json +0 -0
  83. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/merges.txt +0 -0
  84. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/special_tokens_map.json +0 -0
  85. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/tokenizer_config.json +0 -0
  86. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/sdxl/tokenizer_2/vocab.json +0 -0
  87. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/special_tokens_map.json +0 -0
  88. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/spiece.model +0 -0
  89. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/tokenizer.json +0 -0
  90. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/conf/tokenizers/wan/umt5-xxl/tokenizer_config.json +0 -0
  91. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/configs/__init__.py +0 -0
  92. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/configs/controlnet.py +0 -0
  93. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/configs/pipeline.py +0 -0
  94. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/kernels/__init__.py +0 -0
  95. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/__init__.py +0 -0
  96. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/base.py +0 -0
  97. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/basic/__init__.py +0 -0
  98. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/basic/attention.py +0 -0
  99. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/basic/lora.py +0 -0
  100. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/basic/relative_position_emb.py +0 -0
  101. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/basic/timestep.py +0 -0
  102. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/basic/transformer_helper.py +0 -0
  103. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/basic/unet_helper.py +0 -0
  104. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/flux/__init__.py +0 -0
  105. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/flux/flux_controlnet.py +0 -0
  106. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/flux/flux_dit.py +0 -0
  107. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/flux/flux_dit_fbcache.py +0 -0
  108. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/flux/flux_ipadapter.py +0 -0
  109. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/flux/flux_redux.py +0 -0
  110. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/flux/flux_text_encoder.py +0 -0
  111. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/flux/flux_vae.py +0 -0
  112. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/qwen_image/__init__.py +0 -0
  113. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/qwen_image/qwen2_5_vl.py +0 -0
  114. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/qwen_image/qwen_image_dit.py +0 -0
  115. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/qwen_image/qwen_image_dit_fbcache.py +0 -0
  116. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/qwen_image/qwen_image_vae.py +0 -0
  117. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sd/__init__.py +0 -0
  118. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sd/sd_controlnet.py +0 -0
  119. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sd/sd_text_encoder.py +0 -0
  120. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sd/sd_unet.py +0 -0
  121. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sd/sd_vae.py +0 -0
  122. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sd3/__init__.py +0 -0
  123. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sd3/sd3_dit.py +0 -0
  124. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sd3/sd3_text_encoder.py +0 -0
  125. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sd3/sd3_vae.py +0 -0
  126. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sdxl/__init__.py +0 -0
  127. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sdxl/sdxl_controlnet.py +0 -0
  128. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sdxl/sdxl_text_encoder.py +0 -0
  129. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sdxl/sdxl_unet.py +0 -0
  130. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/sdxl/sdxl_vae.py +0 -0
  131. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/text_encoder/__init__.py +0 -0
  132. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/text_encoder/clip.py +0 -0
  133. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/text_encoder/siglip.py +0 -0
  134. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/text_encoder/t5.py +0 -0
  135. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/utils.py +0 -0
  136. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/vae/__init__.py +0 -0
  137. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/vae/vae.py +0 -0
  138. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/wan/__init__.py +0 -0
  139. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/wan/wan_dit.py +0 -0
  140. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/wan/wan_image_encoder.py +0 -0
  141. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/wan/wan_text_encoder.py +0 -0
  142. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/models/wan/wan_vae.py +0 -0
  143. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/pipelines/__init__.py +0 -0
  144. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/pipelines/base.py +0 -0
  145. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/pipelines/flux_image.py +0 -0
  146. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/pipelines/sd_image.py +0 -0
  147. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/pipelines/sdxl_image.py +0 -0
  148. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/pipelines/utils.py +0 -0
  149. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/pipelines/wan_video.py +0 -0
  150. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/processor/__init__.py +0 -0
  151. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/processor/canny_processor.py +0 -0
  152. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/processor/depth_processor.py +0 -0
  153. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tokenizers/__init__.py +0 -0
  154. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tokenizers/base.py +0 -0
  155. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tokenizers/clip.py +0 -0
  156. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tokenizers/qwen2.py +0 -0
  157. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tokenizers/t5.py +0 -0
  158. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tokenizers/wan.py +0 -0
  159. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tools/__init__.py +0 -0
  160. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tools/flux_inpainting_tool.py +0 -0
  161. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tools/flux_outpainting_tool.py +0 -0
  162. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tools/flux_reference_tool.py +0 -0
  163. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/tools/flux_replace_tool.py +0 -0
  164. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/__init__.py +0 -0
  165. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/cache.py +0 -0
  166. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/constants.py +0 -0
  167. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/download.py +0 -0
  168. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/env.py +0 -0
  169. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/flag.py +0 -0
  170. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/fp8_linear.py +0 -0
  171. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/gguf.py +0 -0
  172. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/image.py +0 -0
  173. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/loader.py +0 -0
  174. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/lock.py +0 -0
  175. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/logging.py +0 -0
  176. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/offload.py +0 -0
  177. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/onnx.py +0 -0
  178. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/parallel.py +0 -0
  179. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/platform.py +0 -0
  180. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/prompt.py +0 -0
  181. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine/utils/video.py +0 -0
  182. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine.egg-info/SOURCES.txt +0 -0
  183. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine.egg-info/dependency_links.txt +0 -0
  184. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine.egg-info/requires.txt +0 -0
  185. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/diffsynth_engine.egg-info/top_level.txt +0 -0
  186. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/pyproject.toml +0 -0
  187. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/setup.cfg +0 -0
  188. {diffsynth_engine-0.4.2.dev3 → diffsynth_engine-0.4.2.dev5}/setup.py +0 -0
--- diffsynth_engine-0.4.2.dev3/PKG-INFO
+++ diffsynth_engine-0.4.2.dev5/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: diffsynth_engine
-Version: 0.4.2.dev3
+Version: 0.4.2.dev5
 Author: MuseAI x ModelScope
 Classifier: Programming Language :: Python :: 3
 Classifier: Operating System :: OS Independent
--- diffsynth_engine-0.4.2.dev3/diffsynth_engine/pipelines/qwen_image.py
+++ diffsynth_engine-0.4.2.dev5/diffsynth_engine/pipelines/qwen_image.py
@@ -82,10 +82,8 @@ class QwenImagePipeline(BasePipeline):
             dtype=config.model_dtype,
         )
         self.config = config
-        self.tokenizer_max_length = 1024
         self.prompt_template_encode = "<|im_start|>system\nDescribe the image by detailing the color, shape, size, texture, quantity, text, spatial relationships of the objects and background:<|im_end|>\n<|im_start|>user\n{}<|im_end|>\n<|im_start|>assistant\n"
         self.prompt_template_encode_start_idx = 34
-        self.default_sample_size = 128
         # sampler
         self.noise_scheduler = RecifitedFlowScheduler(shift=3.0, use_dynamic_shifting=True)
         self.sampler = FlowMatchEulerSampler()
@@ -262,7 +260,7 @@ class QwenImagePipeline(BasePipeline):
         template = self.prompt_template_encode
         drop_idx = self.prompt_template_encode_start_idx
         texts = [template.format(txt) for txt in prompt]
-        outputs = self.tokenizer(texts, max_length=min(max_sequence_length, self.tokenizer_max_length) + drop_idx)
+        outputs = self.tokenizer(texts, max_length=max_sequence_length + drop_idx)
         input_ids, attention_mask = outputs["input_ids"].to(self.device), outputs["attention_mask"].to(self.device)
         outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
         hidden_states = outputs["hidden_states"]
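
Note on the `max_length` change above: a small standalone sketch can show what the removed cap did. It assumes only the two values visible in this diff (`tokenizer_max_length = 1024` and `prompt_template_encode_start_idx = 34`). The function names `max_length_dev3` and `max_length_dev5` are hypothetical and are not part of diffsynth-engine.

```python
# Values taken from the qwen_image.py hunks in this diff.
TOKENIZER_MAX_LENGTH = 1024  # attribute removed in 0.4.2.dev5
DROP_IDX = 34                # prompt_template_encode_start_idx


def max_length_dev3(max_sequence_length: int) -> int:
    """max_length handed to the tokenizer before the change (capped at 1024)."""
    return min(max_sequence_length, TOKENIZER_MAX_LENGTH) + DROP_IDX


def max_length_dev5(max_sequence_length: int) -> int:
    """max_length handed to the tokenizer after the change (no cap)."""
    return max_sequence_length + DROP_IDX


if __name__ == "__main__":
    # Hypothetical requested lengths, chosen to straddle the old cap.
    for requested in (512, 1024, 2048):
        print(requested, max_length_dev3(requested), max_length_dev5(requested))
```

For requests up to 1024 the two versions agree. Above that, dev5 no longer clamps the value, so the caller-supplied `max_sequence_length` becomes the only limit.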
--- diffsynth_engine-0.4.2.dev3/diffsynth_engine.egg-info/PKG-INFO
+++ diffsynth_engine-0.4.2.dev5/diffsynth_engine.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: diffsynth_engine
-Version: 0.4.2.dev3
+Version: 0.4.2.dev5
 Author: MuseAI x ModelScope
 Classifier: Programming Language :: Python :: 3
 Classifier: Operating System :: OS Independent
--- diffsynth_engine-0.4.2.dev3/docs/tutorial.md
+++ diffsynth_engine-0.4.2.dev5/docs/tutorial.md
@@ -88,6 +88,52 @@ We will continuously update DiffSynth-Engine to support more models. (Wan2.2 LoR
 
 After the model is downloaded, load the model with the corresponding pipeline and perform inference.
 
+
+### Image Generation (Qwen-Image)
+
+The following code calls `QwenImagePipeline` to load the [Qwen-Image](https://www.modelscope.cn/models/Qwen/Qwen-Image) model and generate an image. Recommended resolutions are 928×1664, 1104×1472, 1328×1328, 1472×1104, and 1664×928, with a suggested cfg_scale of 4. If no negative_prompt is provided, it defaults to a single space character (not an empty string). For multi-GPU parallelism, currently only cfg parallelism is supported (parallelism=2), with other optimization efforts underway.
+
+```python
+from diffsynth_engine import fetch_model, QwenImagePipeline, QwenImagePipelineConfig
+
+config = QwenImagePipelineConfig.basic_config(
+    model_path=fetch_model("MusePublic/Qwen-image", revision="v1", path="transformer/*.safetensors"),
+    encoder_path=fetch_model("MusePublic/Qwen-image", revision="v1", path="text_encoder/*.safetensors"),
+    vae_path=fetch_model("MusePublic/Qwen-image", revision="v1", path="vae/*.safetensors"),
+    parallelism=2,
+)
+pipe = QwenImagePipeline.from_pretrained(config)
+
+prompt = """
+一副典雅庄重的对联悬挂于厅堂之中,房间是个安静古典的中式布置,桌子上放着一些青花瓷,对联上左书“思涌如泉万类灵感皆可触”,右书“智启于问千机代码自天成”,横批“AI脑洞力”,字体飘逸灵动,兼具传统笔意与未来感。中间挂着一幅中国风的画作,内容是岳阳楼,云雾缭绕间似有数据流光隐现,古今交融,意境深远。
+"""
+negative_prompt = " "
+image = pipe(
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    cfg_scale=4.0,
+    width=1104,
+    height=1472,
+    num_inference_steps=30,
+    seed=42,
+)
+image.save("image.png")
+```
+
+Please note that if some necessary modules, like text encoders, are missing from a model repository, the pipeline will automatically download the required files.
+
+#### Detailed Parameters (Qwen-Image)
+
+In the image generation pipeline `pipe`, we can use the following parameters for fine-grained control:
+
+* `prompt`: The prompt, used to describe the content of the generated image. It supports multiple languages (Chinese, English, Japanese, etc.), e.g., “一只猫” (Chinese), "a cat" (English), or "庭を走る猫" (Japanese).
+* `negative_prompt`: The negative prompt, used to describe content you do not want in the image, e.g., "ugly". It defaults to a single space character (not an empty string).
+* `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
+* `height`: Image height.
+* `width`: Image width.
+* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
+* `seed`: The random seed. A fixed seed ensures reproducible results.
+
 ### Image Generation
 
 The following code calls `FluxImagePipeline` to load the [MajicFlus](https://www.modelscope.cn/models/MAILAND/majicflus_v1/summary?version=v1.0) model and generate an image. To load other types of models, replace `FluxImagePipeline` and `FluxPipelineConfig` in the code with the corresponding pipeline and config.
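
Note on `cfg_scale`: as a rough illustration of the parameter documented in this hunk, the sketch below shows the commonly used Classifier-Free Guidance blend of a conditional and an unconditional prediction. It is a generic formulation in plain Python, not code taken from diffsynth-engine. The name `cfg_combine` and the toy values are assumptions for illustration only. Because the two predictions are independent of each other, they can in principle be computed on separate devices, which is presumably what cfg parallelism (`parallelism=2`) takes advantage of.

```python
from typing import List


def cfg_combine(cond: List[float], uncond: List[float], cfg_scale: float) -> List[float]:
    """Blend conditional and unconditional predictions.

    Computes uncond + cfg_scale * (cond - uncond) element-wise.
    cfg_scale = 1 returns the conditional prediction unchanged; larger
    values move the result further from the unconditional
    (negative-prompt) prediction.
    """
    if len(cond) != len(uncond):
        raise ValueError("predictions must have the same length")
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]


if __name__ == "__main__":
    # Toy two-element "predictions"; real ones are latent tensors.
    cond = [1.0, 2.0]
    uncond = [0.5, 1.0]
    print(cfg_combine(cond, uncond, 4.0))
```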
@@ -109,16 +155,16 @@ Please note that if some necessary modules, like text encoders, are missing from
 
 In the image generation pipeline `pipe`, we can use the following parameters for fine-grained control:
 
-* `prompt`: The prompt, used to describe the content of the generated image, e.g., "a cat".
-* `negative_prompt`: The negative prompt, used to describe content you do not want in the image, e.g., "ugly".
-* `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
-* `clip_skip`: The number of layers to skip in the [CLIP](https://arxiv.org/abs/2103.00020) text encoder. The more layers skipped, the lower the text-image correlation, but this can lead to interesting variations in the generated content.
-* `input_image`: Input image, used for image-to-image generation.
-* `denoising_strength`: The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input image is preserved.
-* `height`: Image height.
-* `width`: Image width.
-* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
-* `seed`: The random seed. A fixed seed ensures reproducible results.
+* `prompt`: The prompt, used to describe the content of the generated image, e.g., "a cat".
+* `negative_prompt`: The negative prompt, used to describe content you do not want in the image, e.g., "ugly".
+* `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.
+* `clip_skip`: The number of layers to skip in the [CLIP](https://arxiv.org/abs/2103.00020) text encoder. The more layers skipped, the lower the text-image correlation, but this can lead to interesting variations in the generated content.
+* `input_image`: Input image, used for image-to-image generation.
+* `denoising_strength`: The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input image is preserved.
+* `height`: Image height.
+* `width`: Image width.
+* `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
+* `seed`: The random seed. A fixed seed ensures reproducible results.
 
 #### Loading LoRA
 
@@ -177,17 +223,17 @@ save_video(video, "video.mp4")
177
223
 
178
224
  In the video generation pipeline `pipe`, we can use the following parameters for fine-grained control:
179
225
 
180
- * `prompt`: The prompt, used to describe the content of the generated video, e.g., "a cat".
181
- * `negative_prompt`: The negative prompt, used to describe content you do not want in the video, e.g., "ugly".
182
- * `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the video but reduces the diversity of the generated content.
183
- * `input_image`: Input image, only effective in image-to-video models, such as [Wan-AI/Wan2.1-I2V-14B-720P](https://modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P).
184
- * `input_video`: Input video, used for video-to-video generation.
185
- * `denoising_strength`: The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input video is preserved.
186
- * `height`: Video frame height.
187
- * `width`: Video frame width.
188
- * `num_frames`: Number of video frames.
189
- * `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher video quality.
190
- * `seed`: The random seed. A fixed seed ensures reproducible results.
226
+ * `prompt`: The prompt, used to describe the content of the generated video, e.g., "a cat".
227
+ * `negative_prompt`: The negative prompt, used to describe content you do not want in the video, e.g., "ugly".
228
+ * `cfg_scale`: The guidance scale for [Classifier-Free Guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the video but reduces the diversity of the generated content.
229
+ * `input_image`: Input image, only effective in image-to-video models, such as [Wan-AI/Wan2.1-I2V-14B-720P](https://modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P).
230
+ * `input_video`: Input video, used for video-to-video generation.
231
+ * `denoising_strength`: The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input video is preserved.
232
+ * `height`: Video frame height.
233
+ * `width`: Video frame width.
234
+ * `num_frames`: Number of video frames.
235
+ * `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher video quality.
236
+ * `seed`: The random seed. A fixed seed ensures reproducible results.
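As a rough illustration of `denoising_strength`, the sketch below uses a convention that is common in image-to-image and video-to-video samplers: the strength scales how many of the scheduled steps are actually run. The helper `steps_to_run` is hypothetical, and whether the engine uses exactly this rule is an assumption.

```python
def steps_to_run(num_inference_steps: int, denoising_strength: float) -> int:
    """Return how many denoising steps run for a given strength (assumed convention)."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    # Strength 1.0 runs the full schedule; lower values start from a
    # partially noised input and therefore skip the earliest steps.
    return min(num_inference_steps, int(num_inference_steps * denoising_strength))


for strength in (1.0, 0.5, 0.0):
    print(strength, steps_to_run(30, strength))
```

Under this convention, a strength of 0.5 with 30 steps runs only the last 15 steps, so more of the input survives.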
191
237
 
192
238
  #### Loading LoRA
193
239
 
@@ -88,6 +88,51 @@ Diffusion models come in a wide variety of architectures, each with its corresponding pipeline
88
88
 
89
89
  After the model is downloaded, we can select the pipeline that matches the model type, load the model, and run inference.
90
90
 
91
+ ### Image Generation (Qwen-Image)
92
+
93
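+ The following code calls `QwenImagePipeline` to load the [Qwen-Image](https://www.modelscope.cn/models/Qwen/Qwen-Image) model and generate an image. The recommended resolutions are 928×1664, 1104×1472, 1328×1328, 1472×1104, and 1664×928, and the recommended `cfg_scale` is 4. If no `negative_prompt` is given, it defaults to a single space rather than an empty string. For multi-GPU parallelism, only CFG parallelism (`parallelism=2`) is currently supported; other optimizations are in progress.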
94
+
95
+ ```python
96
+ from diffsynth_engine import fetch_model, QwenImagePipeline, QwenImagePipelineConfig
97
+
98
+ config = QwenImagePipelineConfig.basic_config(
99
+ model_path=fetch_model("MusePublic/Qwen-image", revision="v1", path="transformer/*.safetensors"),
100
+ encoder_path=fetch_model("MusePublic/Qwen-image", revision="v1", path="text_encoder/*.safetensors"),
101
+ vae_path=fetch_model("MusePublic/Qwen-image", revision="v1", path="vae/*.safetensors"),
102
+ parallelism=2,
103
+ )
104
+ pipe = QwenImagePipeline.from_pretrained(config)
105
+
106
+ prompt = """
107
+ 一副典雅庄重的对联悬挂于厅堂之中,房间是个安静古典的中式布置,桌子上放着一些青花瓷,对联上左书“思涌如泉万类灵感皆可触”,右书“智启于问千机代码自天成”,横批“AI脑洞力”,字体飘逸灵动,兼具传统笔意与未来感。中间挂着一幅中国风的画作,内容是岳阳楼,云雾缭绕间似有数据流光隐现,古今交融,意境深远。
108
+ """
109
+ negative_prompt = " "
110
+ image = pipe(
111
+ prompt=prompt,
112
+ negative_prompt=negative_prompt,
113
+ cfg_scale=4.0,
114
+ width=1104,
115
+ height=1472,
116
+ num_inference_steps=30,
117
+ seed=42,
118
+ )
119
+ image.save("image.png")
120
+ ```
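The recommendations in the text above (a fixed set of resolutions, and a single space as the fallback negative prompt) can be sketched as two small helpers. Both function names are hypothetical, and reading each listed size as (width, height) is an assumption; the text does not state the order.

```python
from typing import Optional, Tuple

# Sizes listed in the tutorial text, read here as (width, height) by assumption.
RECOMMENDED_SIZES = [(928, 1664), (1104, 1472), (1328, 1328), (1472, 1104), (1664, 928)]


def closest_recommended_size(width: int, height: int) -> Tuple[int, int]:
    """Pick the recommended size whose aspect ratio is nearest to the request."""
    target = width / height
    return min(RECOMMENDED_SIZES, key=lambda wh: abs(wh[0] / wh[1] - target))


def normalize_negative_prompt(negative_prompt: Optional[str]) -> str:
    """Fall back to a single space when no negative prompt is given."""
    return negative_prompt if negative_prompt else " "


print(closest_recommended_size(1920, 1080))
print(repr(normalize_negative_prompt(None)))
```

The results could then be passed as `width`, `height`, and `negative_prompt` to the `pipe(...)` call shown above.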
121
+
122
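+ Please note that some model repositories lack required modules such as the text encoder; our code automatically downloads the missing model files.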
123
+
124
+ #### Detailed Parameters (Qwen-Image)
125
+
126
+ In the image generation pipeline `pipe`, we can use the following parameters for fine-grained control:
127
+
128
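+ * `prompt`: The prompt, used to describe the content of the generated image. Multiple languages are supported (Chinese, English, Japanese, etc.), e.g., “一只猫” / "a cat" / "庭を走る猫".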
129
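+ * `negative_prompt`: The negative prompt, used to describe content you do not want in the image, e.g., "ugly". It defaults to a single space, `" "`, rather than an empty string.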
130
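+ * `cfg_scale`: The guidance scale for [Classifier-free guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content. The recommended value is 4.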
131
+ * `height`: Image height.
132
+ * `width`: Image width.
133
+ * `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
134
+ * `seed`: The random seed. A fixed seed ensures reproducible results.
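The reproducibility that `seed` provides can be illustrated with a seeded random generator from the Python standard library. This is only an analogy under stated assumptions: the helper `initial_noise` is hypothetical, and the engine presumably draws its starting noise with its own tensor library rather than `random`.

```python
import random
from typing import List


def initial_noise(seed: int, size: int) -> List[float]:
    """Draw `size` Gaussian samples from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator, does not touch global state
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


a = initial_noise(42, 4)
b = initial_noise(42, 4)
c = initial_noise(43, 4)

print(a == b)  # same seed, same starting noise
print(a == c)  # a different seed gives different noise
```

Because sampling starts from this noise, fixing the seed (together with all other parameters) fixes the output, while changing only the seed gives a new variation of the same prompt.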
135
+
91
136
  ### Image Generation
92
137
 
93
138
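  The following code calls `FluxImagePipeline` to load the [麦橘超然 (majicflus_v1)](https://www.modelscope.cn/models/MAILAND/majicflus_v1/summary?version=v1.0) model and generate an image. To load a model with a different architecture, replace `FluxImagePipeline` and `FluxPipelineConfig` in the code with the corresponding pipeline module and config.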
@@ -110,15 +155,15 @@ image.save("image.png")
110
155
  In the image generation pipeline `pipe`, we can use the following parameters for fine-grained control:
111
156
 
112
157
  * `prompt`: The prompt, used to describe the content of the generated image, e.g., "a cat".
113
- * `negative_prompt`:The negative prompt, used to describe content you do not want in the image, e.g., "ugly".
114
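- * `cfg_scale`:The guidance scale for [Classifier-free guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.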
115
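- * `clip_skip`:The number of layers to skip in the [CLIP](https://arxiv.org/abs/2103.00020) text encoder. The more layers skipped, the lower the text-image correlation, but this can lead to interesting variations in the generated content.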
116
- * `input_image`:Input image, used for image-to-image generation.
117
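- * `denoising_strength`:The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input image is preserved.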
118
- * `height`:Image height.
119
- * `width`:Image width.
120
- * `num_inference_steps`:The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
121
- * `seed`:The random seed. A fixed seed ensures reproducible results.
158
+ * `negative_prompt`: The negative prompt, used to describe content you do not want in the image, e.g., "ugly".
159
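+ * `cfg_scale`: The guidance scale for [Classifier-free guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the image but reduces the diversity of the generated content.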
160
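+ * `clip_skip`: The number of layers to skip in the [CLIP](https://arxiv.org/abs/2103.00020) text encoder. The more layers skipped, the lower the text-image correlation, but this can lead to interesting variations in the generated content.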
161
+ * `input_image`: Input image, used for image-to-image generation.
162
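+ * `denoising_strength`: The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input image is preserved.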
163
+ * `height`: Image height.
164
+ * `width`: Image width.
165
+ * `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher image quality.
166
+ * `seed`: The random seed. A fixed seed ensures reproducible results.
122
167
 
123
168
  #### Loading LoRA
124
169
 
@@ -175,16 +220,16 @@ save_video(video, "video.mp4")
175
220
  In the video generation pipeline `pipe`, we can use the following parameters for fine-grained control:
176
221
 
177
222
  * `prompt`: The prompt, used to describe the content of the generated video, e.g., "a cat".
178
- * `negative_prompt`:The negative prompt, used to describe content you do not want in the video, e.g., "ugly".
179
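- * `cfg_scale`:The guidance scale for [Classifier-free guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the video but reduces the diversity of the generated content.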
180
- * `input_image`:Input image, only effective in image-to-video models, such as [Wan-AI/Wan2.1-I2V-14B-720P](https://modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P).
181
- * `input_video`:Input video, used for video-to-video generation.
182
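- * `denoising_strength`:The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input video is preserved.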
183
- * `height`:Video frame height.
184
- * `width`:Video frame width.
185
- * `num_frames`:Number of video frames.
186
- * `num_inference_steps`:The number of inference steps. Generally, more steps lead to longer computation time but higher video quality.
187
- * `seed`:The random seed. A fixed seed ensures reproducible results.
223
+ * `negative_prompt`: The negative prompt, used to describe content you do not want in the video, e.g., "ugly".
224
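+ * `cfg_scale`: The guidance scale for [Classifier-free guidance](https://arxiv.org/abs/2207.12598). A larger value usually results in stronger correlation between the text and the video but reduces the diversity of the generated content.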
225
+ * `input_image`: Input image, only effective in image-to-video models, such as [Wan-AI/Wan2.1-I2V-14B-720P](https://modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P).
226
+ * `input_video`: Input video, used for video-to-video generation.
227
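+ * `denoising_strength`: The denoising strength. When set to 1, a full generation process is performed. When set to a value between 0 and 1, some information from the input video is preserved.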
228
+ * `height`: Video frame height.
229
+ * `width`: Video frame width.
230
+ * `num_frames`: Number of video frames.
231
+ * `num_inference_steps`: The number of inference steps. Generally, more steps lead to longer computation time but higher video quality.
232
+ * `seed`: The random seed. A fixed seed ensures reproducible results.
188
233
 
189
234
  #### Loading LoRA
190
235