telefuser-0.1.0.post3.tar.gz
- telefuser-0.1.0.post3/.claude/skills/add-new-pipeline/SKILL.md +635 -0
- telefuser-0.1.0.post3/.claude/skills/framework-analysis/SKILL.md +120 -0
- telefuser-0.1.0.post3/.claude/skills/optimize-pipeline/SKILL.md +668 -0
- telefuser-0.1.0.post3/.claude/skills/profile-pipeline/SKILL.md +273 -0
- telefuser-0.1.0.post3/.github/ISSUE_TEMPLATE/bug_report.yml +73 -0
- telefuser-0.1.0.post3/.github/ISSUE_TEMPLATE/config.yml +11 -0
- telefuser-0.1.0.post3/.github/ISSUE_TEMPLATE/feature_request.yml +43 -0
- telefuser-0.1.0.post3/.github/ISSUE_TEMPLATE/roadmap.yml +125 -0
- telefuser-0.1.0.post3/.github/PULL_REQUEST_TEMPLATE.md +59 -0
- telefuser-0.1.0.post3/.github/workflows/docs.yml +66 -0
- telefuser-0.1.0.post3/.github/workflows/lint.yml +35 -0
- telefuser-0.1.0.post3/.github/workflows/release-pypi.yml +66 -0
- telefuser-0.1.0.post3/.github/workflows/release-testpypi.yml +66 -0
- telefuser-0.1.0.post3/.github/workflows/test.yml +108 -0
- telefuser-0.1.0.post3/.gitignore +153 -0
- telefuser-0.1.0.post3/.pre-commit-config.yaml +21 -0
- telefuser-0.1.0.post3/AGENTS.md +227 -0
- telefuser-0.1.0.post3/CLAUDE.md +227 -0
- telefuser-0.1.0.post3/CODE_OF_CONDUCT.md +132 -0
- telefuser-0.1.0.post3/CONTRIBUTING.md +387 -0
- telefuser-0.1.0.post3/GEMINI.md +227 -0
- telefuser-0.1.0.post3/LICENSE +201 -0
- telefuser-0.1.0.post3/PKG-INFO +379 -0
- telefuser-0.1.0.post3/README.md +287 -0
- telefuser-0.1.0.post3/README_zh.md +287 -0
- telefuser-0.1.0.post3/assets/telefuser_logo.png +0 -0
- telefuser-0.1.0.post3/benchmarks/feature_cache/README.md +188 -0
- telefuser-0.1.0.post3/benchmarks/feature_cache/analyze_residual_taylor_error.py +434 -0
- telefuser-0.1.0.post3/benchmarks/feature_cache/wan21_1_3b_ada_taylor_cache.py +748 -0
- telefuser-0.1.0.post3/benchmarks/kernel/bench_rmsnorm.py +130 -0
- telefuser-0.1.0.post3/benchmarks/kernel/bench_rotary.py +108 -0
- telefuser-0.1.0.post3/benchmarks/kernel/bench_scale_shift.py +93 -0
- telefuser-0.1.0.post3/benchmarks/kernel/run_benchmarks.sh +113 -0
- telefuser-0.1.0.post3/docker/.gitkeep +0 -0
- telefuser-0.1.0.post3/docs/en/adding_new_example.md +593 -0
- telefuser-0.1.0.post3/docs/en/adding_new_model.md +783 -0
- telefuser-0.1.0.post3/docs/en/adding_new_stage.md +585 -0
- telefuser-0.1.0.post3/docs/en/attention.md +736 -0
- telefuser-0.1.0.post3/docs/en/configuration.md +357 -0
- telefuser-0.1.0.post3/docs/en/feature_cache.md +310 -0
- telefuser-0.1.0.post3/docs/en/hash_config_management.md +519 -0
- telefuser-0.1.0.post3/docs/en/index.md +66 -0
- telefuser-0.1.0.post3/docs/en/latent_cache.md +424 -0
- telefuser-0.1.0.post3/docs/en/logging.md +512 -0
- telefuser-0.1.0.post3/docs/en/metrics.md +520 -0
- telefuser-0.1.0.post3/docs/en/model_loading.md +237 -0
- telefuser-0.1.0.post3/docs/en/offload.md +514 -0
- telefuser-0.1.0.post3/docs/en/ops.md +557 -0
- telefuser-0.1.0.post3/docs/en/parallel.md +614 -0
- telefuser-0.1.0.post3/docs/en/profiler.md +533 -0
- telefuser-0.1.0.post3/docs/en/release_pypi.md +78 -0
- telefuser-0.1.0.post3/docs/en/service.md +1166 -0
- telefuser-0.1.0.post3/docs/en/service_metadata.md +171 -0
- telefuser-0.1.0.post3/docs/en/stream_server.md +879 -0
- telefuser-0.1.0.post3/docs/en/testing.md +500 -0
- telefuser-0.1.0.post3/docs/en/torch_compile_compatibility.md +287 -0
- telefuser-0.1.0.post3/docs/requirements.txt +12 -0
- telefuser-0.1.0.post3/docs/stylesheets/extra.css +37 -0
- telefuser-0.1.0.post3/docs/zh/adding_new_example.md +593 -0
- telefuser-0.1.0.post3/docs/zh/adding_new_model.md +782 -0
- telefuser-0.1.0.post3/docs/zh/adding_new_stage.md +585 -0
- telefuser-0.1.0.post3/docs/zh/adf_scoring_guide.md +479 -0
- telefuser-0.1.0.post3/docs/zh/attention.md +736 -0
- telefuser-0.1.0.post3/docs/zh/configuration.md +357 -0
- telefuser-0.1.0.post3/docs/zh/feature_cache.md +310 -0
- telefuser-0.1.0.post3/docs/zh/hash_config_management.md +519 -0
- telefuser-0.1.0.post3/docs/zh/index.md +66 -0
- telefuser-0.1.0.post3/docs/zh/latent_cache.md +377 -0
- telefuser-0.1.0.post3/docs/zh/logging.md +562 -0
- telefuser-0.1.0.post3/docs/zh/metrics.md +520 -0
- telefuser-0.1.0.post3/docs/zh/model_loading.md +237 -0
- telefuser-0.1.0.post3/docs/zh/offload.md +514 -0
- telefuser-0.1.0.post3/docs/zh/ops.md +557 -0
- telefuser-0.1.0.post3/docs/zh/parallel.md +614 -0
- telefuser-0.1.0.post3/docs/zh/profiler.md +481 -0
- telefuser-0.1.0.post3/docs/zh/release_pypi.md +74 -0
- telefuser-0.1.0.post3/docs/zh/service.md +1164 -0
- telefuser-0.1.0.post3/docs/zh/service_metadata.md +169 -0
- telefuser-0.1.0.post3/docs/zh/stream_server.md +889 -0
- telefuser-0.1.0.post3/docs/zh/testing.md +500 -0
- telefuser-0.1.0.post3/docs/zh/torch_compile_compatibility.md +286 -0
- telefuser-0.1.0.post3/examples/README.md +208 -0
- telefuser-0.1.0.post3/examples/data/1.png +0 -0
- telefuser-0.1.0.post3/examples/data/1.wav +0 -0
- telefuser-0.1.0.post3/examples/data/101235-video-720_0.png +0 -0
- telefuser-0.1.0.post3/examples/data/dag.mp4 +0 -0
- telefuser-0.1.0.post3/examples/data/edit2511input.png +0 -0
- telefuser-0.1.0.post3/examples/example_config.yaml +230 -0
- telefuser-0.1.0.post3/examples/flashvsr/README.md +96 -0
- telefuser-0.1.0.post3/examples/flashvsr/flashvsr_stream.py +230 -0
- telefuser-0.1.0.post3/examples/flux2_klein/README.md +109 -0
- telefuser-0.1.0.post3/examples/flux2_klein/flux2_klein_text_to_image_h100.py +211 -0
- telefuser-0.1.0.post3/examples/flux2_klein/flux2_klein_text_to_image_official.py +145 -0
- telefuser-0.1.0.post3/examples/hunyuan_video/README.md +158 -0
- telefuser-0.1.0.post3/examples/hunyuan_video/hunyuan_video_i2v.py +201 -0
- telefuser-0.1.0.post3/examples/hunyuan_video/hunyuan_video_i2v_cache_calibrate.py +244 -0
- telefuser-0.1.0.post3/examples/hunyuan_video/hunyuan_video_t2v.py +277 -0
- telefuser-0.1.0.post3/examples/hunyuan_video/hunyuan_video_t2v_cache_calibrate.py +235 -0
- telefuser-0.1.0.post3/examples/lingbot/stream_lingbot_world_fast.py +83 -0
- telefuser-0.1.0.post3/examples/liveact/liveact_s2v_h100.py +176 -0
- telefuser-0.1.0.post3/examples/longcat_video/README.md +239 -0
- telefuser-0.1.0.post3/examples/longcat_video/longcat_image_to_video.py +207 -0
- telefuser-0.1.0.post3/examples/longcat_video/longcat_text_to_video.py +203 -0
- telefuser-0.1.0.post3/examples/longcat_video/longcat_text_to_video_refine.py +269 -0
- telefuser-0.1.0.post3/examples/longcat_video/longcat_video_continue.py +321 -0
- telefuser-0.1.0.post3/examples/longcat_video/longcat_video_unify.py +404 -0
- telefuser-0.1.0.post3/examples/ltx_video/README.md +69 -0
- telefuser-0.1.0.post3/examples/ltx_video/ltx23_22b_image_to_video_two_stage_h100.py +233 -0
- telefuser-0.1.0.post3/examples/qwen_image/README.md +191 -0
- telefuser-0.1.0.post3/examples/qwen_image/qwen_image_cache_calibrate.py +225 -0
- telefuser-0.1.0.post3/examples/qwen_image/qwen_image_edit_plus_cache_calibrate.py +227 -0
- telefuser-0.1.0.post3/examples/qwen_image/qwen_image_edit_plus_h100.py +112 -0
- telefuser-0.1.0.post3/examples/qwen_image/qwen_image_edit_plus_official.py +31 -0
- telefuser-0.1.0.post3/examples/qwen_image/qwen_image_t2i_h100.py +138 -0
- telefuser-0.1.0.post3/examples/qwen_image/qwen_image_t2i_lightning_fp8_h100.py +108 -0
- telefuser-0.1.0.post3/examples/qwen_image/qwen_image_t2i_lora_h100.py +115 -0
- telefuser-0.1.0.post3/examples/qwen_image/qwen_image_t2i_official.py +52 -0
- telefuser-0.1.0.post3/examples/run_examples.py +2069 -0
- telefuser-0.1.0.post3/examples/stream_server/stream_arrow_overlay.py +177 -0
- telefuser-0.1.0.post3/examples/stream_server/stream_video_replay.py +144 -0
- telefuser-0.1.0.post3/examples/stream_server/webrtc_arrow_overlay_demo.py +257 -0
- telefuser-0.1.0.post3/examples/stream_server/webrtc_bidirectional_demo.py +1261 -0
- telefuser-0.1.0.post3/examples/stream_server/webrtc_client_demo.py +221 -0
- telefuser-0.1.0.post3/examples/wan_video/README.md +577 -0
- telefuser-0.1.0.post3/examples/wan_video/async_wan22_14b_image_to_video_distill_h100.py +261 -0
- telefuser-0.1.0.post3/examples/wan_video/wan21_14b_image_to_video_h100.py +220 -0
- telefuser-0.1.0.post3/examples/wan_video/wan21_14b_image_to_video_lora_h100.py +213 -0
- telefuser-0.1.0.post3/examples/wan_video/wan21_14b_text_to_video_h100.py +192 -0
- telefuser-0.1.0.post3/examples/wan_video/wan21_1_3b_text_to_video_ada_taylor_cache.py +299 -0
- telefuser-0.1.0.post3/examples/wan_video/wan21_1_3b_text_to_video_cache_calibrate.py +233 -0
- telefuser-0.1.0.post3/examples/wan_video/wan21_1_3b_text_to_video_h100.py +222 -0
- telefuser-0.1.0.post3/examples/wan_video/wan21_1_3b_text_to_video_hf.py +202 -0
- telefuser-0.1.0.post3/examples/wan_video/wan21_1_3b_text_to_video_radial.py +300 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_14b_first_last_frame_to_video_h100.py +236 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_14b_image_to_video_cache_calibrate.py +270 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_14b_image_to_video_distill_fp8_h100.py +221 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_14b_image_to_video_distill_h100.py +272 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_14b_image_to_video_h100.py +203 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_14b_image_to_video_h100_ray.py +219 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_14b_image_to_video_lora_h100.py +220 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_14b_image_to_video_mix_h100.py +203 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_14b_text_to_video_h100.py +243 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_14b_text_to_video_service.py +282 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_i2v_5b.py +180 -0
- telefuser-0.1.0.post3/examples/wan_video/wan22_t2v_5b.py +172 -0
- telefuser-0.1.0.post3/examples/z_image/README.md +67 -0
- telefuser-0.1.0.post3/examples/z_image/z_image_turbo_official.py +43 -0
- telefuser-0.1.0.post3/examples/z_image/z_image_turbo_t2i_h100.py +103 -0
- telefuser-0.1.0.post3/mkdocs.yml +216 -0
- telefuser-0.1.0.post3/pyproject.toml +249 -0
- telefuser-0.1.0.post3/scripts/build_latent_dataset.py +396 -0
- telefuser-0.1.0.post3/scripts/build_telefuser_dist.sh +34 -0
- telefuser-0.1.0.post3/scripts/publish_telefuser_pypi.sh +15 -0
- telefuser-0.1.0.post3/scripts/run_ci_tests.sh +73 -0
- telefuser-0.1.0.post3/setup.cfg +7 -0
- telefuser-0.1.0.post3/telefuser/__init__.py +6 -0
- telefuser-0.1.0.post3/telefuser/_logo.py +12 -0
- telefuser-0.1.0.post3/telefuser/_version.py +24 -0
- telefuser-0.1.0.post3/telefuser/cache/__init__.py +5 -0
- telefuser-0.1.0.post3/telefuser/cache/kv_cache.py +438 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/__init__.py +27 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/cache_types.py +40 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/config.py +83 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/connection.py +197 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/encoders.py +398 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/encoding/__init__.py +0 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/encoding/interfaces.py +27 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/latent_cache.py +213 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/log_monitor.py +77 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/metadata.py +268 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/src/__init__.py +0 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/src/models/__init__.py +0 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/src/models/qwen3_vl_embedding.py +346 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/src/models/qwen3_vl_reranker.py +437 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/state/__init__.py +0 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/state/interfaces.py +67 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/storage/__init__.py +11 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/storage/fluxon.py +24 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/storage/interfaces.py +25 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/storage/local_file.py +112 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/storage/memory.py +24 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/strategies.py +819 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/vector_store/__init__.py +5 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/vector_store/faiss.py +298 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/vector_store/interfaces.py +42 -0
- telefuser-0.1.0.post3/telefuser/cache_mem/vector_store/qdrant.py +46 -0
- telefuser-0.1.0.post3/telefuser/client/__init__.py +34 -0
- telefuser-0.1.0.post3/telefuser/client/openai/__init__.py +34 -0
- telefuser-0.1.0.post3/telefuser/client/openai/client.py +146 -0
- telefuser-0.1.0.post3/telefuser/client/openai/images.py +221 -0
- telefuser-0.1.0.post3/telefuser/client/openai/videos.py +307 -0
- telefuser-0.1.0.post3/telefuser/client/tf_client.py +1016 -0
- telefuser-0.1.0.post3/telefuser/core/__init__.py +37 -0
- telefuser-0.1.0.post3/telefuser/core/base_model.py +262 -0
- telefuser-0.1.0.post3/telefuser/core/base_pipeline.py +421 -0
- telefuser-0.1.0.post3/telefuser/core/base_stage.py +169 -0
- telefuser-0.1.0.post3/telefuser/core/config.py +409 -0
- telefuser-0.1.0.post3/telefuser/core/config_serializer.py +54 -0
- telefuser-0.1.0.post3/telefuser/core/model_registry.py +108 -0
- telefuser-0.1.0.post3/telefuser/core/module_manager.py +412 -0
- telefuser-0.1.0.post3/telefuser/distributed/__init__.py +95 -0
- telefuser-0.1.0.post3/telefuser/distributed/device_mesh.py +347 -0
- telefuser-0.1.0.post3/telefuser/distributed/fsdp.py +143 -0
- telefuser-0.1.0.post3/telefuser/distributed/parallel_shard.py +250 -0
- telefuser-0.1.0.post3/telefuser/distributed/pp_comm.py +306 -0
- telefuser-0.1.0.post3/telefuser/distributed/ring.py +357 -0
- telefuser-0.1.0.post3/telefuser/distributed/tp_parallelize.py +63 -0
- telefuser-0.1.0.post3/telefuser/distributed/ulysses_comm.py +250 -0
- telefuser-0.1.0.post3/telefuser/entrypoints/__init__.py +3 -0
- telefuser-0.1.0.post3/telefuser/entrypoints/cli/main.py +257 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/__init__.py +47 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/__init__.py +28 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/ada_taylor_cache.py +656 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/HunyuanVideo15-I2V-480P.json +111 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/HunyuanVideo15-T2V-480P.json +111 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/Qwen-Image-2512.json +111 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/Wan2_1-FL2V-14B-720P.json +89 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/Wan2_1-I2V-14B-480P.json +89 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/Wan2_1-I2V-14B-720P.json +89 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/Wan2_1-T2V-14B.json +109 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/Wan2_1-T2V-1_3B.json +109 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/Wan2_2-FL2V-A14B.json +89 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/Wan2_2-I2V-A14B-Camera.json +109 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/Wan2_2-I2V-A14B.json +89 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/ada_taylor_cache/params/Wan2_2-T2V-A14B.json +89 -0
- telefuser-0.1.0.post3/telefuser/feature_cache/base.py +150 -0
- telefuser-0.1.0.post3/telefuser/kernel/__init__.py +55 -0
- telefuser-0.1.0.post3/telefuser/kernel/triton/__init__.py +43 -0
- telefuser-0.1.0.post3/telefuser/kernel/triton/merge_attn_states.py +115 -0
- telefuser-0.1.0.post3/telefuser/kernel/triton/norm.py +816 -0
- telefuser-0.1.0.post3/telefuser/kernel/triton/quant.py +147 -0
- telefuser-0.1.0.post3/telefuser/kernel/triton/quant_per_block.py +154 -0
- telefuser-0.1.0.post3/telefuser/kernel/triton/rotary.py +162 -0
- telefuser-0.1.0.post3/telefuser/kernel/triton/scale_shift.py +1064 -0
- telefuser-0.1.0.post3/telefuser/kernel/triton/sparse_int8_attn.py +280 -0
- telefuser-0.1.0.post3/telefuser/metrics/__init__.py +101 -0
- telefuser-0.1.0.post3/telefuser/metrics/collector.py +442 -0
- telefuser-0.1.0.post3/telefuser/metrics/config.py +111 -0
- telefuser-0.1.0.post3/telefuser/metrics/exporters.py +113 -0
- telefuser-0.1.0.post3/telefuser/metrics/registry.py +485 -0
- telefuser-0.1.0.post3/telefuser/metrics/service_metrics.py +436 -0
- telefuser-0.1.0.post3/telefuser/metrics/stage_metrics.py +207 -0
- telefuser-0.1.0.post3/telefuser/models/TCDecoder.py +352 -0
- telefuser-0.1.0.post3/telefuser/models/__init__.py +24 -0
- telefuser-0.1.0.post3/telefuser/models/flashvsr_dit.py +608 -0
- telefuser-0.1.0.post3/telefuser/models/flux2_dit.py +1126 -0
- telefuser-0.1.0.post3/telefuser/models/hunyuan_video_byt5.py +433 -0
- telefuser-0.1.0.post3/telefuser/models/hunyuan_video_dit.py +2124 -0
- telefuser-0.1.0.post3/telefuser/models/hunyuan_video_image_encoder.py +222 -0
- telefuser-0.1.0.post3/telefuser/models/hunyuan_video_text_encoder.py +461 -0
- telefuser-0.1.0.post3/telefuser/models/hunyuan_video_upsampler.py +320 -0
- telefuser-0.1.0.post3/telefuser/models/hunyuan_video_vae.py +850 -0
- telefuser-0.1.0.post3/telefuser/models/lingbot_world_fast_dit.py +573 -0
- telefuser-0.1.0.post3/telefuser/models/liveact_dit.py +1213 -0
- telefuser-0.1.0.post3/telefuser/models/longcat_video_dit.py +1214 -0
- telefuser-0.1.0.post3/telefuser/models/ltx_audio_vae.py +1183 -0
- telefuser-0.1.0.post3/telefuser/models/ltx_dit.py +2202 -0
- telefuser-0.1.0.post3/telefuser/models/ltx_gemma_text_encoder.py +1004 -0
- telefuser-0.1.0.post3/telefuser/models/ltx_upsampler.py +416 -0
- telefuser-0.1.0.post3/telefuser/models/ltx_video_vae.py +2668 -0
- telefuser-0.1.0.post3/telefuser/models/qwen_image_dit.py +780 -0
- telefuser-0.1.0.post3/telefuser/models/qwen_image_text_encoder.py +196 -0
- telefuser-0.1.0.post3/telefuser/models/qwen_image_vae.py +643 -0
- telefuser-0.1.0.post3/telefuser/models/realesrgan.py +356 -0
- telefuser-0.1.0.post3/telefuser/models/rift_hdv3.py +353 -0
- telefuser-0.1.0.post3/telefuser/models/t5_tokenizer.py +96 -0
- telefuser-0.1.0.post3/telefuser/models/video_projector.py +457 -0
- telefuser-0.1.0.post3/telefuser/models/wan22_video_vae.py +1548 -0
- telefuser-0.1.0.post3/telefuser/models/wan_video_dit.py +1586 -0
- telefuser-0.1.0.post3/telefuser/models/wan_video_image_encoder.py +534 -0
- telefuser-0.1.0.post3/telefuser/models/wan_video_text_encoder.py +317 -0
- telefuser-0.1.0.post3/telefuser/models/wan_video_vae.py +1519 -0
- telefuser-0.1.0.post3/telefuser/models/wav2vec2.py +154 -0
- telefuser-0.1.0.post3/telefuser/models/xlm_roberta.py +157 -0
- telefuser-0.1.0.post3/telefuser/models/z_image_dit.py +695 -0
- telefuser-0.1.0.post3/telefuser/models/z_image_text_encoder.py +81 -0
- telefuser-0.1.0.post3/telefuser/offload/__init__.py +26 -0
- telefuser-0.1.0.post3/telefuser/offload/async_offload.py +417 -0
- telefuser-0.1.0.post3/telefuser/offload/model_offload.py +35 -0
- telefuser-0.1.0.post3/telefuser/offload/sequential_offload.py +318 -0
- telefuser-0.1.0.post3/telefuser/ops/__init__.py +33 -0
- telefuser-0.1.0.post3/telefuser/ops/activations.py +187 -0
- telefuser-0.1.0.post3/telefuser/ops/attention/__init__.py +29 -0
- telefuser-0.1.0.post3/telefuser/ops/attention/attention_impl.py +529 -0
- telefuser-0.1.0.post3/telefuser/ops/attention/backends.py +209 -0
- telefuser-0.1.0.post3/telefuser/ops/attention/bsa.py +250 -0
- telefuser-0.1.0.post3/telefuser/ops/attention/local_sparse_attn.py +547 -0
- telefuser-0.1.0.post3/telefuser/ops/attention/sparse_patterns.py +622 -0
- telefuser-0.1.0.post3/telefuser/ops/attention/sparse_sage.py +80 -0
- telefuser-0.1.0.post3/telefuser/ops/base.py +145 -0
- telefuser-0.1.0.post3/telefuser/ops/custom_op.py +121 -0
- telefuser-0.1.0.post3/telefuser/ops/ffn.py +69 -0
- telefuser-0.1.0.post3/telefuser/ops/fp8_gemm.py +348 -0
- telefuser-0.1.0.post3/telefuser/ops/normalization.py +274 -0
- telefuser-0.1.0.post3/telefuser/ops/quantized_linear.py +164 -0
- telefuser-0.1.0.post3/telefuser/ops/rotary.py +138 -0
- telefuser-0.1.0.post3/telefuser/orchestrator/__init__.py +22 -0
- telefuser-0.1.0.post3/telefuser/orchestrator/artifact_save_stage.py +119 -0
- telefuser-0.1.0.post3/telefuser/orchestrator/pipeline_orchestrator.py +358 -0
- telefuser-0.1.0.post3/telefuser/orchestrator/stage_wrapper.py +276 -0
- telefuser-0.1.0.post3/telefuser/pipelines/__init__.py +9 -0
- telefuser-0.1.0.post3/telefuser/pipelines/common/realesrgan_upscale.py +92 -0
- telefuser-0.1.0.post3/telefuser/pipelines/common/rift_vfi.py +54 -0
- telefuser-0.1.0.post3/telefuser/pipelines/flashvsr/__init__.py +4 -0
- telefuser-0.1.0.post3/telefuser/pipelines/flashvsr/dit_denoising.py +312 -0
- telefuser-0.1.0.post3/telefuser/pipelines/flashvsr/flashvsr_stream.py +197 -0
- telefuser-0.1.0.post3/telefuser/pipelines/flashvsr/vae.py +57 -0
- telefuser-0.1.0.post3/telefuser/pipelines/flux2_klein/__init__.py +5 -0
- telefuser-0.1.0.post3/telefuser/pipelines/flux2_klein/dit_denoising.py +329 -0
- telefuser-0.1.0.post3/telefuser/pipelines/flux2_klein/pipeline.py +427 -0
- telefuser-0.1.0.post3/telefuser/pipelines/flux2_klein/text_encoding.py +201 -0
- telefuser-0.1.0.post3/telefuser/pipelines/flux2_klein/vae.py +215 -0
- telefuser-0.1.0.post3/telefuser/pipelines/hunyuan_video_1_5/__init__.py +55 -0
- telefuser-0.1.0.post3/telefuser/pipelines/hunyuan_video_1_5/dit_denoising.py +270 -0
- telefuser-0.1.0.post3/telefuser/pipelines/hunyuan_video_1_5/image_encoding.py +87 -0
- telefuser-0.1.0.post3/telefuser/pipelines/hunyuan_video_1_5/pipeline.py +324 -0
- telefuser-0.1.0.post3/telefuser/pipelines/hunyuan_video_1_5/sr_dit_denoising.py +363 -0
- telefuser-0.1.0.post3/telefuser/pipelines/hunyuan_video_1_5/text_encoding.py +291 -0
- telefuser-0.1.0.post3/telefuser/pipelines/hunyuan_video_1_5/upsampler.py +95 -0
- telefuser-0.1.0.post3/telefuser/pipelines/hunyuan_video_1_5/vae.py +133 -0
- telefuser-0.1.0.post3/telefuser/pipelines/lingbot_world_fast/__init__.py +31 -0
- telefuser-0.1.0.post3/telefuser/pipelines/lingbot_world_fast/control.py +208 -0
- telefuser-0.1.0.post3/telefuser/pipelines/lingbot_world_fast/denoising.py +85 -0
- telefuser-0.1.0.post3/telefuser/pipelines/lingbot_world_fast/pipeline.py +592 -0
- telefuser-0.1.0.post3/telefuser/pipelines/lingbot_world_fast/service.py +483 -0
- telefuser-0.1.0.post3/telefuser/pipelines/lingbot_world_fast/session.py +76 -0
- telefuser-0.1.0.post3/telefuser/pipelines/liveact/__init__.py +16 -0
- telefuser-0.1.0.post3/telefuser/pipelines/liveact/audio_encoding.py +365 -0
- telefuser-0.1.0.post3/telefuser/pipelines/liveact/denoising.py +306 -0
- telefuser-0.1.0.post3/telefuser/pipelines/liveact/pipeline.py +337 -0
- telefuser-0.1.0.post3/telefuser/pipelines/longcat_video/__init__.py +12 -0
- telefuser-0.1.0.post3/telefuser/pipelines/longcat_video/dit_denoising.py +297 -0
- telefuser-0.1.0.post3/telefuser/pipelines/longcat_video/longcat_video.py +542 -0
- telefuser-0.1.0.post3/telefuser/pipelines/longcat_video/refine_denoise.py +235 -0
- telefuser-0.1.0.post3/telefuser/pipelines/longcat_video/text_encoding.py +118 -0
- telefuser-0.1.0.post3/telefuser/pipelines/ltx_video/__init__.py +1 -0
- telefuser-0.1.0.post3/telefuser/pipelines/ltx_video/dit_denoising.py +1010 -0
- telefuser-0.1.0.post3/telefuser/pipelines/ltx_video/gemma_text_encoding.py +165 -0
- telefuser-0.1.0.post3/telefuser/pipelines/ltx_video/ltx23_video.py +518 -0
- telefuser-0.1.0.post3/telefuser/pipelines/ltx_video/upsampler.py +29 -0
- telefuser-0.1.0.post3/telefuser/pipelines/ltx_video/vae.py +195 -0
- telefuser-0.1.0.post3/telefuser/pipelines/qwen_image/__init__.py +11 -0
- telefuser-0.1.0.post3/telefuser/pipelines/qwen_image/dit_denoising.py +228 -0
- telefuser-0.1.0.post3/telefuser/pipelines/qwen_image/qwen_image.py +301 -0
- telefuser-0.1.0.post3/telefuser/pipelines/qwen_image/qwen_image_edit.py +209 -0
- telefuser-0.1.0.post3/telefuser/pipelines/qwen_image/text_encoding.py +223 -0
- telefuser-0.1.0.post3/telefuser/pipelines/qwen_image/vae.py +91 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/__init__.py +6 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/async_wan22_video.py +467 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/clip_encoding.py +54 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/latent_data_utils.py +53 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/moe_dit_denoising.py +409 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/single_dit_denoising.py +262 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/text_encoding.py +58 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/ti2v_denoising.py +396 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/vae.py +237 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/wan21_video.py +353 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/wan22_ti2v.py +372 -0
- telefuser-0.1.0.post3/telefuser/pipelines/wan_video/wan22_video.py +318 -0
- telefuser-0.1.0.post3/telefuser/pipelines/z_image/__init__.py +8 -0
- telefuser-0.1.0.post3/telefuser/pipelines/z_image/dit_denoising.py +281 -0
- telefuser-0.1.0.post3/telefuser/pipelines/z_image/text_encoding.py +117 -0
- telefuser-0.1.0.post3/telefuser/pipelines/z_image/vae.py +49 -0
- telefuser-0.1.0.post3/telefuser/pipelines/z_image/z_image.py +139 -0
- telefuser-0.1.0.post3/telefuser/platforms/__init__.py +80 -0
- telefuser-0.1.0.post3/telefuser/platforms/cpu.py +30 -0
- telefuser-0.1.0.post3/telefuser/platforms/cuda.py +86 -0
- telefuser-0.1.0.post3/telefuser/platforms/interface.py +99 -0
- telefuser-0.1.0.post3/telefuser/platforms/npu.py +78 -0
- telefuser-0.1.0.post3/telefuser/platforms/rocm.py +72 -0
- telefuser-0.1.0.post3/telefuser/schedulers/__init__.py +13 -0
- telefuser-0.1.0.post3/telefuser/schedulers/flow_match.py +377 -0
- telefuser-0.1.0.post3/telefuser/schedulers/flow_match_discrete.py +325 -0
- telefuser-0.1.0.post3/telefuser/schedulers/lcm.py +81 -0
- telefuser-0.1.0.post3/telefuser/schedulers/unipc.py +697 -0
- telefuser-0.1.0.post3/telefuser/service/__init__.py +38 -0
- telefuser-0.1.0.post3/telefuser/service/api/__init__.py +41 -0
- telefuser-0.1.0.post3/telefuser/service/api/api_server.py +327 -0
- telefuser-0.1.0.post3/telefuser/service/api/middleware.py +334 -0
- telefuser-0.1.0.post3/telefuser/service/api/openai/__init__.py +58 -0
- telefuser-0.1.0.post3/telefuser/service/api/openai/adapter.py +423 -0
- telefuser-0.1.0.post3/telefuser/service/api/openai/image_routes.py +417 -0
- telefuser-0.1.0.post3/telefuser/service/api/openai/protocol.py +287 -0
- telefuser-0.1.0.post3/telefuser/service/api/openai/video_routes.py +416 -0
- telefuser-0.1.0.post3/telefuser/service/api/routers/__init__.py +20 -0
- telefuser-0.1.0.post3/telefuser/service/api/routers/files.py +65 -0
- telefuser-0.1.0.post3/telefuser/service/api/routers/service.py +143 -0
- telefuser-0.1.0.post3/telefuser/service/api/routers/stream.py +88 -0
- telefuser-0.1.0.post3/telefuser/service/api/routers/tasks.py +432 -0
- telefuser-0.1.0.post3/telefuser/service/api/routers/webrtc.py +164 -0
- telefuser-0.1.0.post3/telefuser/service/api/schema.py +80 -0
- telefuser-0.1.0.post3/telefuser/service/api/stream_schema.py +76 -0
- telefuser-0.1.0.post3/telefuser/service/api/task_contract_runtime.py +148 -0
- telefuser-0.1.0.post3/telefuser/service/api/utils.py +139 -0
- telefuser-0.1.0.post3/telefuser/service/cache/__init__.py +4 -0
- telefuser-0.1.0.post3/telefuser/service/cache/cache_factory.py +176 -0
- telefuser-0.1.0.post3/telefuser/service/cache/cache_service.py +389 -0
- telefuser-0.1.0.post3/telefuser/service/core/__init__.py +30 -0
- telefuser-0.1.0.post3/telefuser/service/core/config.py +249 -0
- telefuser-0.1.0.post3/telefuser/service/core/container.py +264 -0
- telefuser-0.1.0.post3/telefuser/service/core/contract_templates.py +147 -0
- telefuser-0.1.0.post3/telefuser/service/core/file_service.py +269 -0
- telefuser-0.1.0.post3/telefuser/service/core/pipeline_contract.py +339 -0
- telefuser-0.1.0.post3/telefuser/service/core/pipeline_loader.py +94 -0
- telefuser-0.1.0.post3/telefuser/service/core/pipeline_pool.py +280 -0
- telefuser-0.1.0.post3/telefuser/service/core/pipeline_runner.py +205 -0
- telefuser-0.1.0.post3/telefuser/service/core/pipeline_service.py +311 -0
- telefuser-0.1.0.post3/telefuser/service/core/replica_worker.py +298 -0
- telefuser-0.1.0.post3/telefuser/service/core/stream_pipeline_service.py +261 -0
- telefuser-0.1.0.post3/telefuser/service/core/task_manager.py +416 -0
- telefuser-0.1.0.post3/telefuser/service/core/task_processor.py +162 -0
- telefuser-0.1.0.post3/telefuser/service/core/task_service.py +156 -0
- telefuser-0.1.0.post3/telefuser/service/main.py +99 -0
- telefuser-0.1.0.post3/telefuser/service/media/__init__.py +17 -0
- telefuser-0.1.0.post3/telefuser/service/media/media_base.py +298 -0
- telefuser-0.1.0.post3/telefuser/service/security/__init__.py +32 -0
- telefuser-0.1.0.post3/telefuser/service/security/security_validator.py +797 -0
- telefuser-0.1.0.post3/telefuser/service/webrtc/__init__.py +32 -0
- telefuser-0.1.0.post3/telefuser/service/webrtc/chunk_router.py +111 -0
- telefuser-0.1.0.post3/telefuser/service/webrtc/session_manager.py +322 -0
- telefuser-0.1.0.post3/telefuser/service/webrtc/track.py +307 -0
- telefuser-0.1.0.post3/telefuser/service_types.py +83 -0
- telefuser-0.1.0.post3/telefuser/utils/__init__.py +25 -0
- telefuser-0.1.0.post3/telefuser/utils/audio.py +51 -0
- telefuser-0.1.0.post3/telefuser/utils/func.py +31 -0
- telefuser-0.1.0.post3/telefuser/utils/hf_model_analyzer.py +382 -0
- telefuser-0.1.0.post3/telefuser/utils/hf_model_utils.py +209 -0
- telefuser-0.1.0.post3/telefuser/utils/hf_utils.py +256 -0
- telefuser-0.1.0.post3/telefuser/utils/logging.py +749 -0
- telefuser-0.1.0.post3/telefuser/utils/lora_loader.py +295 -0
- telefuser-0.1.0.post3/telefuser/utils/lora_network.py +212 -0
- telefuser-0.1.0.post3/telefuser/utils/memory_snapshot.py +423 -0
- telefuser-0.1.0.post3/telefuser/utils/model_weight.py +163 -0
- telefuser-0.1.0.post3/telefuser/utils/profiler.py +1079 -0
- telefuser-0.1.0.post3/telefuser/utils/stage_bench_harness.py +740 -0
- telefuser-0.1.0.post3/telefuser/utils/system.py +228 -0
- telefuser-0.1.0.post3/telefuser/utils/torch_compile.py +83 -0
- telefuser-0.1.0.post3/telefuser/utils/utils.py +49 -0
- telefuser-0.1.0.post3/telefuser/utils/video.py +464 -0
- telefuser-0.1.0.post3/telefuser/worker/__init__.py +18 -0
- telefuser-0.1.0.post3/telefuser/worker/native_worker.py +125 -0
- telefuser-0.1.0.post3/telefuser/worker/parallel_worker.py +292 -0
- telefuser-0.1.0.post3/telefuser/worker/ray_worker.py +107 -0
- telefuser-0.1.0.post3/telefuser.egg-info/PKG-INFO +379 -0
- telefuser-0.1.0.post3/telefuser.egg-info/SOURCES.txt +768 -0
- telefuser-0.1.0.post3/telefuser.egg-info/dependency_links.txt +1 -0
- telefuser-0.1.0.post3/telefuser.egg-info/entry_points.txt +2 -0
- telefuser-0.1.0.post3/telefuser.egg-info/requires.txt +73 -0
- telefuser-0.1.0.post3/telefuser.egg-info/scm_file_list.json +763 -0
- telefuser-0.1.0.post3/telefuser.egg-info/scm_version.json +8 -0
- telefuser-0.1.0.post3/telefuser.egg-info/top_level.txt +1 -0
- telefuser-0.1.0.post3/tests/__init__.py +0 -0
- telefuser-0.1.0.post3/tests/conftest.py +162 -0
- telefuser-0.1.0.post3/tests/integration/conftest.py +161 -0
- telefuser-0.1.0.post3/tests/integration/test_pp_forward_consistency.py +228 -0
- telefuser-0.1.0.post3/tests/integration/test_service_api.py +380 -0
- telefuser-0.1.0.post3/tests/integration/test_stream_api.py +188 -0
- telefuser-0.1.0.post3/tests/integration/test_webrtc_api.py +180 -0
- telefuser-0.1.0.post3/tests/server/README.md +188 -0
- telefuser-0.1.0.post3/tests/server/__init__.py +7 -0
- telefuser-0.1.0.post3/tests/server/client/__init__.py +0 -0
- telefuser-0.1.0.post3/tests/server/client/test_client.py +241 -0
- telefuser-0.1.0.post3/tests/server/conftest.py +114 -0
- telefuser-0.1.0.post3/tests/server/fixtures/__init__.py +0 -0
- telefuser-0.1.0.post3/tests/server/pipeline/__init__.py +0 -0
- telefuser-0.1.0.post3/tests/server/pipeline/fake_t2v_pipeline.py +310 -0
- telefuser-0.1.0.post3/tests/server/run_integration_test.py +112 -0
- telefuser-0.1.0.post3/tests/server/run_test_server.py +99 -0
- telefuser-0.1.0.post3/tests/server/test_middleware.py +345 -0
- telefuser-0.1.0.post3/tests/server/test_openai_api.py +438 -0
- telefuser-0.1.0.post3/tests/server/test_openai_client.py +409 -0
- telefuser-0.1.0.post3/tests/server/test_server.py +306 -0
- telefuser-0.1.0.post3/tests/unit/__init__.py +0 -0
- telefuser-0.1.0.post3/tests/unit/cache_mem/__init__.py +1 -0
- telefuser-0.1.0.post3/tests/unit/cache_mem/test_concurrency.py +355 -0
- telefuser-0.1.0.post3/tests/unit/cache_mem/test_metadata.py +150 -0
- telefuser-0.1.0.post3/tests/unit/cache_mem/test_storage.py +105 -0
- telefuser-0.1.0.post3/tests/unit/cache_mem/test_types_and_config.py +96 -0
- telefuser-0.1.0.post3/tests/unit/core/test_base_pipeline.py +225 -0
- telefuser-0.1.0.post3/tests/unit/core/test_config.py +161 -0
- telefuser-0.1.0.post3/tests/unit/core/test_config_serializer.py +103 -0
- telefuser-0.1.0.post3/tests/unit/distributed/__init__.py +0 -0
- telefuser-0.1.0.post3/tests/unit/distributed/test_device_mesh.py +270 -0
- telefuser-0.1.0.post3/tests/unit/distributed/test_parallel_shard.py +361 -0
- telefuser-0.1.0.post3/tests/unit/distributed/test_pp_comm.py +265 -0
- telefuser-0.1.0.post3/tests/unit/distributed/test_ulysses_comm.py +164 -0
- telefuser-0.1.0.post3/tests/unit/feature_cache/__init__.py +0 -0
- telefuser-0.1.0.post3/tests/unit/feature_cache/test_feature_cache.py +644 -0
- telefuser-0.1.0.post3/tests/unit/kernel/__init__.py +1 -0
- telefuser-0.1.0.post3/tests/unit/kernel/test_rmsnorm.py +325 -0
- telefuser-0.1.0.post3/tests/unit/kernel/test_rotary.py +214 -0
- telefuser-0.1.0.post3/tests/unit/kernel/test_scale_shift.py +590 -0
- telefuser-0.1.0.post3/tests/unit/models/__init__.py +1 -0
- telefuser-0.1.0.post3/tests/unit/offload/__init__.py +1 -0
- telefuser-0.1.0.post3/tests/unit/offload/test_async_offload.py +387 -0
- telefuser-0.1.0.post3/tests/unit/offload/test_sequential_offload.py +250 -0
- telefuser-0.1.0.post3/tests/unit/openai/__init__.py +3 -0
- telefuser-0.1.0.post3/tests/unit/openai/test_adapter.py +156 -0
- telefuser-0.1.0.post3/tests/unit/openai/test_client.py +111 -0
- telefuser-0.1.0.post3/tests/unit/openai/test_image_routes.py +154 -0
- telefuser-0.1.0.post3/tests/unit/openai/test_integration_server.py +102 -0
- telefuser-0.1.0.post3/tests/unit/openai/test_protocol.py +150 -0
- telefuser-0.1.0.post3/tests/unit/openai/test_video_routes.py +151 -0
- telefuser-0.1.0.post3/tests/unit/ops/test_activations.py +179 -0
- telefuser-0.1.0.post3/tests/unit/ops/test_long_context_attention.py +463 -0
- telefuser-0.1.0.post3/tests/unit/ops/test_normalization.py +183 -0
- telefuser-0.1.0.post3/tests/unit/ops/test_parallel_shard_attention.py +680 -0
- telefuser-0.1.0.post3/tests/unit/pipelines/__init__.py +0 -0
- telefuser-0.1.0.post3/tests/unit/pipelines/lingbot_world_fast/test_control_alignment.py +62 -0
- telefuser-0.1.0.post3/tests/unit/pipelines/wan_video/__init__.py +0 -0
- telefuser-0.1.0.post3/tests/unit/pipelines/wan_video/test_latent_data_utils.py +57 -0
- telefuser-0.1.0.post3/tests/unit/quantize/__init__.py +1 -0
- telefuser-0.1.0.post3/tests/unit/quantize/test_quantized_linear.py +139 -0
- telefuser-0.1.0.post3/tests/unit/schedulers/test_flow_match.py +338 -0
- telefuser-0.1.0.post3/tests/unit/service/__init__.py +1 -0
- telefuser-0.1.0.post3/tests/unit/service/test_config.py +23 -0
- telefuser-0.1.0.post3/tests/unit/service/test_pipeline_contract.py +201 -0
- telefuser-0.1.0.post3/tests/unit/service/test_pipeline_pool.py +352 -0
- telefuser-0.1.0.post3/tests/unit/service/test_schema.py +197 -0
- telefuser-0.1.0.post3/tests/unit/service/test_task_contract_runtime.py +45 -0
- telefuser-0.1.0.post3/tests/unit/service/test_task_routes.py +116 -0
- telefuser-0.1.0.post3/tests/unit/service/test_task_runtime.py +142 -0
- telefuser-0.1.0.post3/tests/unit/test_metrics.py +499 -0
- telefuser-0.1.0.post3/tests/unit/utils/test_profiler_flags.py +78 -0
- telefuser-0.1.0.post3/tests/unit/utils/test_utils.py +198 -0
- telefuser-0.1.0.post3/tests/unit/worker/test_parallel_worker.py +378 -0
- telefuser-0.1.0.post3/tf-kernel/.clang-format +15 -0
- telefuser-0.1.0.post3/tf-kernel/.github/ISSUE_TEMPLATE/bug_report.yml +73 -0
- telefuser-0.1.0.post3/tf-kernel/.github/ISSUE_TEMPLATE/feature_request.yml +43 -0
- telefuser-0.1.0.post3/tf-kernel/.github/PULL_REQUEST_TEMPLATE.md +58 -0
- telefuser-0.1.0.post3/tf-kernel/.github/workflows/docs.yml +80 -0
- telefuser-0.1.0.post3/tf-kernel/.github/workflows/lint.yml +65 -0
- telefuser-0.1.0.post3/tf-kernel/.gitignore +258 -0
- telefuser-0.1.0.post3/tf-kernel/.pre-commit-config.yaml +47 -0
- telefuser-0.1.0.post3/tf-kernel/AGENTS.md +126 -0
- telefuser-0.1.0.post3/tf-kernel/CLAUDE.md +126 -0
- telefuser-0.1.0.post3/tf-kernel/CMakeLists.txt +496 -0
- telefuser-0.1.0.post3/tf-kernel/CODE_OF_CONDUCT.md +132 -0
- telefuser-0.1.0.post3/tf-kernel/CONTRIBUTING.md +247 -0
- telefuser-0.1.0.post3/tf-kernel/Dockerfile +169 -0
- telefuser-0.1.0.post3/tf-kernel/GEMINI.md +126 -0
- telefuser-0.1.0.post3/tf-kernel/LICENSE +201 -0
- telefuser-0.1.0.post3/tf-kernel/Makefile +129 -0
- telefuser-0.1.0.post3/tf-kernel/README.md +206 -0
- telefuser-0.1.0.post3/tf-kernel/README_zh.md +206 -0
- telefuser-0.1.0.post3/tf-kernel/THIRDPARTYNOTICES.txt +488 -0
- telefuser-0.1.0.post3/tf-kernel/analyze_whl_kernel_sizes.py +221 -0
- telefuser-0.1.0.post3/tf-kernel/benchmark/bench_activation.py +326 -0
- telefuser-0.1.0.post3/tf-kernel/benchmark/bench_fp4_gemm.py +399 -0
- telefuser-0.1.0.post3/tf-kernel/benchmark/bench_fp8_gemm.py +333 -0
- telefuser-0.1.0.post3/tf-kernel/benchmark/bench_int8_gemm.py +295 -0
- telefuser-0.1.0.post3/tf-kernel/benchmark/bench_nvfp4_scaled_gemm.py +308 -0
- telefuser-0.1.0.post3/tf-kernel/benchmark/bench_per_tensor_quant_fp8.py +225 -0
- telefuser-0.1.0.post3/tf-kernel/benchmark/bench_rmsnorm.py +489 -0
- telefuser-0.1.0.post3/tf-kernel/benchmark/bench_sageattn2.py +352 -0
- telefuser-0.1.0.post3/tf-kernel/build.sh +71 -0
- telefuser-0.1.0.post3/tf-kernel/cmake/detect_gpu.cmake +107 -0
- telefuser-0.1.0.post3/tf-kernel/cmake/utils.cmake +19 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/block_sparse_attn_api.h +82 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/flash_api.cpp +886 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/alibi.h +63 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/block_info.h +56 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash.h +191 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_blockmask.h +523 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim128_bf16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim128_bf16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim128_fp16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim128_fp16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim32_bf16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim32_bf16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim32_fp16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim32_fp16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim64_bf16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim64_bf16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim64_fp16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_block_hdim64_fp16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_kernel.h +2132 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_bwd_launch_template.h +259 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim128_bf16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim128_bf16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim128_fp16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim128_fp16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim32_bf16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim32_bf16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim32_fp16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim32_fp16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim64_bf16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim64_bf16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim64_fp16_causal_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_block_hdim64_fp16_sm80.cu +15 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_kernel.h +1477 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/flash_fwd_launch_template.h +138 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/hardware_info.h +39 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/kernel_traits.h +357 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/kernel_traits_sm90.h +147 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/namespace_config.h +67 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/philox.cuh +155 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/softmax.h +363 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/static_switch.h +51 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/block_sparse_attn/src/utils.h +482 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/common_extension.cc +410 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/common.hpp +21 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/detail/collective/mixed_input_utils.hpp +482 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/epilogue/epilogue_per_row_per_col_scale.h +309 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/gemm/collective/builders/sm90_gmma_builder_mixed_input.inl +278 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/gemm/collective/collective_builder_mixed_input.hpp +52 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/gemm/collective/collective_mma_array_mixed_input.hpp +53 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/gemm/collective/sm90_mma_array_tma_gmma_rs_warpspecialized_mixed_input_.hpp +1538 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/gemm/cutlass_gemm_caller.cuh +62 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/gemm/dispatch_policy.hpp +38 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/gemm/fp8_blockwise_gemm_sm90_dispatch.cuh +197 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/gemm/gemm_universal_base_compat.h +356 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/cutlass_extensions/gemm/gemm_with_epilogue_visitor.h +492 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/elementwise/activation.cu +170 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/elementwise/cast.cu +170 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/elementwise/concat_mla.cu +217 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/elementwise/copy.cu +58 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/elementwise/fused_add_rms_norm_kernel.cu +59 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/elementwise/pos_enc.cu +208 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/elementwise/pos_enc.cuh +467 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/elementwise/rope.cu +168 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/elementwise/topk.cu +545 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/elementwise/utils.cuh +72 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/expert_specialization/es_fp8_blockwise.cu +167 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/expert_specialization/es_fp8_blockwise_functor.cuh +268 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/expert_specialization/es_fp8_blockwise_launcher.cuh +284 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/expert_specialization/es_fp8_blockwise_traits.cuh +174 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/expert_specialization/es_sm100_mxfp8_blockscaled.cu +40 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/expert_specialization/es_sm100_mxfp8_blockscaled_functor.cuh +134 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/expert_specialization/es_sm100_mxfp8_blockscaled_group_quant.cu +39 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/expert_specialization/es_sm100_mxfp8_blockscaled_group_quant.cuh +407 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/expert_specialization/es_sm100_mxfp8_blockscaled_launcher.cuh +214 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/expert_specialization/es_sm100_mxfp8_blockscaled_traits.cuh +131 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/bmm_fp8.cu +75 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/fp8_blockwise_gemm_kernel.cu +465 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/fp8_gemm_kernel.cu +1532 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/int8_gemm_kernel.cu +747 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/math.hpp +28 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/nvfp4_expert_quant.cu +728 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/nvfp4_quant.cuh +182 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/nvfp4_quant_entry.cu +77 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/nvfp4_quant_kernels.cu +242 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/nvfp4_scaled_mm_entry.cu +64 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/nvfp4_scaled_mm_kernels.cu +687 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/per_tensor_quant_fp8.cu +123 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/per_token_group_quant_8bit.cu +217 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/per_token_group_quant_8bit_v2.cu +514 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/gemm/per_token_quant_fp8.cu +314 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/memory/store.cu +147 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/memory/weak_ref_tensor.cpp +35 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/cp_async.cuh +154 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/dispatch_utils.h +111 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/fused/fused.cu +1094 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/fused/fused.h +70 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/math.cuh +157 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/mma.cuh +665 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/numeric_conversion.cuh +151 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/permuted_smem.cuh +195 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/attn_cuda_sm80.h +75 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/attn_cuda_sm89.h +169 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/attn_cuda_sm90.h +49 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/attn_utils.cuh +972 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/qk_int_sv_f16_cuda_sm80.cu +1563 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/qk_int_sv_f8_cuda_sm89.cuh +841 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/qk_int_sv_f8_cuda_sm90.cu +967 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/sm89_qk_int8_sv_f8_accum_f16_attn_inst_buf.cu +201 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/sm89_qk_int8_sv_f8_accum_f16_fuse_v_scale_attn_inst_buf.cu +207 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/sm89_qk_int8_sv_f8_accum_f32_attn.cu +201 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/sm89_qk_int8_sv_f8_accum_f32_attn_inst_buf.cu +200 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/sm89_qk_int8_sv_f8_accum_f32_fuse_v_scale_attn.cu +207 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/sm89_qk_int8_sv_f8_accum_f32_fuse_v_scale_attn_inst_buf.cu +207 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/qattn/sm89_qk_int8_sv_f8_accum_f32_fuse_v_scale_fuse_v_mean_attn.cu +213 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/reduction_utils.cuh +182 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/utils.cuh +40 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn2/wgmma.cuh +653 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/__init__.py +16 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/api.h +28 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/api.py +152 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/api.cu +353 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/block_info.h +69 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/blockscaled_layout.h +141 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/cute_extension.h +371 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/epilogue_tma_ws.h +209 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/kernel_traits.h +195 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/kernel_ws.h +210 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/launch.h +103 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/mainloop_tma_ws.h +919 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/named_barrier.h +117 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/params.h +179 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/softmax_fused.h +167 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/static_switch.h +83 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/tile_scheduler.h +270 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/blackwell/utils.h +408 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/quantization/cuda_utils.h +49 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/quantization/fp4_quantization_4d.cu +658 -0
- telefuser-0.1.0.post3/tf-kernel/csrc/sageattn3/setup.py +187 -0
- telefuser-0.1.0.post3/tf-kernel/docs/Makefile +27 -0
- telefuser-0.1.0.post3/tf-kernel/docs/source/api/attention.rst +30 -0
- telefuser-0.1.0.post3/tf-kernel/docs/source/api/elementwise.rst +37 -0
- telefuser-0.1.0.post3/tf-kernel/docs/source/api/gemm.rst +46 -0
- telefuser-0.1.0.post3/tf-kernel/docs/source/api/index.rst +25 -0
- telefuser-0.1.0.post3/tf-kernel/docs/source/conf.py +77 -0
- telefuser-0.1.0.post3/tf-kernel/docs/source/contributing.rst +50 -0
- telefuser-0.1.0.post3/tf-kernel/docs/source/development.rst +147 -0
- telefuser-0.1.0.post3/tf-kernel/docs/source/index.rst +38 -0
- telefuser-0.1.0.post3/tf-kernel/docs/source/installation.rst +96 -0
- telefuser-0.1.0.post3/tf-kernel/docs/source/quickstart.rst +91 -0
- telefuser-0.1.0.post3/tf-kernel/include/pytorch_extension_utils_rocm.h +20 -0
- telefuser-0.1.0.post3/tf-kernel/include/scalar_type.hpp +331 -0
- telefuser-0.1.0.post3/tf-kernel/include/tf_kernel_ops.h +240 -0
- telefuser-0.1.0.post3/tf-kernel/include/tf_kernel_torch_shim.h +122 -0
- telefuser-0.1.0.post3/tf-kernel/include/utils.h +469 -0
- telefuser-0.1.0.post3/tf-kernel/kernel-runner-setup.sh +150 -0
- telefuser-0.1.0.post3/tf-kernel/pyproject.toml +114 -0
- telefuser-0.1.0.post3/tf-kernel/pytest.ini +16 -0
- telefuser-0.1.0.post3/tf-kernel/rename_wheels.sh +91 -0
- telefuser-0.1.0.post3/tf-kernel/scripts/check_wheel_symbols.py +272 -0
- telefuser-0.1.0.post3/tf-kernel/tests/conftest.py +14 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_activation.py +40 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_block_sparse_attn.py +628 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_block_sparse_attn_simple.py +156 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_bmm_fp8.py +44 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_copy.py +16 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_fp4_gemm.py +163 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_fp4_quantize.py +268 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_fp8_blockwise_gemm.py +92 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_fp8_gemm.py +50 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_int8_gemm.py +49 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_norm.py +77 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_sageattn2.py +295 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_sageattn3.py +209 -0
- telefuser-0.1.0.post3/tf-kernel/tests/test_torch_defaults_reset.py +16 -0
- telefuser-0.1.0.post3/tf-kernel/tests/utils.py +16 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/__init__.py +71 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/block_sparse_attn.py +438 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/elementwise.py +350 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/gemm.py +415 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/load_utils.py +260 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/memory.py +26 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/sageattn2.py +1693 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/sageattn3.py +316 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/sampling.py +544 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/scalar_type.py +352 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/test_utils.py +125 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/testing/__init__.py +0 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/testing/rotary_embedding.py +244 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/triton/__init__.py +0 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/triton/attn_qk_int8_block_varlen.py +244 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/triton/attn_qk_int8_per_block.py +320 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/triton/attn_qk_int8_per_block_causal.py +290 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/triton/attn_qk_int8_per_block_causal_varlen.py +276 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/triton/quant_per_block.py +166 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/triton/quant_per_block_varlen.py +158 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/triton/quant_per_thread.py +369 -0
- telefuser-0.1.0.post3/tf-kernel/tf_kernel/utils.py +45 -0
- telefuser-0.1.0.post3/tools/convert/README.md +382 -0
- telefuser-0.1.0.post3/tools/convert/converter.py +937 -0
- telefuser-0.1.0.post3/tools/convert/lora_extractor.py +504 -0
- telefuser-0.1.0.post3/tools/convert/lora_loader.py +451 -0
- telefuser-0.1.0.post3/tools/convert/quant.py +145 -0
- telefuser-0.1.0.post3/tools/convert/register.py +47 -0
- telefuser-0.1.0.post3/tools/deploy/README.md +15 -0
- telefuser-0.1.0.post3/tools/deploy/docker_monitor.py +945 -0
- telefuser-0.1.0.post3/tools/deploy/multi_device_communication_test.py +429 -0
- telefuser-0.1.0.post3/tools/deploy/show_stat.py +993 -0
- telefuser-0.1.0.post3/tools/viewer/README.md +198 -0
- telefuser-0.1.0.post3/tools/viewer/weight_viewer.py +512 -0
- telefuser-0.1.0.post3/webui/__init__.py +0 -0
- telefuser-0.1.0.post3/webui/stream_app.py +191 -0
@@ -0,0 +1,635 @@
---
name: add-new-pipeline
description: Guide for integrating external project pipelines into TeleFuser. Six-phase workflow with interactive checkpoints.
---

# Add New Pipeline Integration Guide

## Trigger Conditions

- User requests to integrate a new model/pipeline from an external project
- User mentions "integrate xxx into telefuser"

---

## Workflow Overview

```
Phase 1    →  Phase 2.1  →  Phase 2.2  →  Phase 3    →  Phase 4    →  Phase 5
Analyze       Pipeline      Stages        Models        Cleanup       Review
   ↓             ↓             ↓             ↓             ↓             ↓
Checkpoint    Checkpoint    Checkpoint    Checkpoint    Checkpoint    Done
```

**Each phase ends with an AskUserQuestion checkpoint - wait for approval before proceeding.**

---

## Phase 1: Analyze Original Pipeline

### Goals
1. Understand model architecture by reading source code
2. Document pipeline logic and inference flow
3. Create analysis reports

### Key Tasks

1. **Read pipeline entry point** - Trace `__call__` method execution flow
2. **Read model definitions** - Go deep to actual class implementations (DiT, VAE, Text Encoder)
3. **Create analysis reports** in `examples/<model_name>/analysis/`:
   - `PIPELINE_LOGIC.md` - Entry point, execution steps, key functions
   - `MODEL_DEFINITION.md` - Architecture, configuration, class hierarchy
   - `INFERENCE_LOGIC.md` - Forward flow, data transformations

### Progress Tracking

Create `examples/<model_name>/PROGRESS.md`:

```markdown
# [Model Name] Integration Progress

## Overview
- **Model**: [Name]
- **Type**: [T2V/I2V/T2I/SR]
- **Started**: [Date]

## Phase Status
| Phase | Status | Notes |
|-------|--------|-------|
| 1. Analyze | 🔄 In Progress | |
| 2.1 Pipeline | ⏳ Pending | |
| 2.2 Stages | ⏳ Pending | |
| 3. Integrate | ⏳ Pending | |
| 4. Cleanup | ⏳ Pending | |
| 5. Review | ⏳ Pending | |

## Key Findings
- Architecture patterns: ...
- Special handling required: ...
- Implementation challenges: ...
```
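
The tracking file and the three analysis report stubs can be scaffolded with a few lines of Python. This is a minimal sketch, not part of TeleFuser: the `scaffold_integration` helper, its arguments, and the stub contents are assumptions made for illustration. Only the file names, directory layout, and table rows come from the guide.

```python
from datetime import date
from pathlib import Path

# Phase names and report files as listed in the guide.
PHASES = ["1. Analyze", "2.1 Pipeline", "2.2 Stages", "3. Integrate", "4. Cleanup", "5. Review"]
REPORTS = ["PIPELINE_LOGIC.md", "MODEL_DEFINITION.md", "INFERENCE_LOGIC.md"]


def scaffold_integration(root: Path, model_name: str, model_type: str) -> Path:
    """Create examples/<model_name>/ with PROGRESS.md and report stubs (hypothetical helper)."""
    base = root / "examples" / model_name
    analysis = base / "analysis"
    analysis.mkdir(parents=True, exist_ok=True)

    # One stub per report; existing reports are left untouched.
    for report in REPORTS:
        path = analysis / report
        if not path.exists():
            path.write_text(f"# {Path(report).stem}\n", encoding="utf-8")

    progress = base / "PROGRESS.md"
    if progress.exists():
        return progress  # never overwrite notes that were already written

    # Phase 1 starts in progress; every later phase is pending.
    rows = [
        f"| {phase} | {'🔄 In Progress' if i == 0 else '⏳ Pending'} | |"
        for i, phase in enumerate(PHASES)
    ]
    lines = [
        f"# {model_name} Integration Progress",
        "",
        "## Overview",
        f"- **Model**: {model_name}",
        f"- **Type**: {model_type}",
        f"- **Started**: {date.today().isoformat()}",
        "",
        "## Phase Status",
        "| Phase | Status | Notes |",
        "|-------|--------|-------|",
        *rows,
        "",
        "## Key Findings",
        "- Architecture patterns: ...",
        "- Special handling required: ...",
        "- Implementation challenges: ...",
        "",
    ]
    progress.write_text("\n".join(lines), encoding="utf-8")
    return progress
```

Calling `scaffold_integration(Path("."), "my_model", "T2V")` from the repository root would create `examples/my_model/PROGRESS.md` and the `analysis/` stubs; `my_model` is a placeholder name.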

### Model Source Rules

| Component | Integration Method |
|-----------|-------------------|
| **DiT/Transformer** | Source-level (`telefuser/models/<model>_dit.py`, inherit `BaseModel`) |
| VAE | `module_manager.load_from_huggingface()` |
| Text Encoder | `module_manager.load_from_huggingface()` |
| Scheduler | Use existing or HuggingFace |

### 🛑 Phase 1 Checkpoint

After completion:
1. Show analysis report summaries
2. Highlight critical findings (unique patterns, challenges)
3. **AskUserQuestion**: "Phase 1 complete. Ready for Phase 2.1?"

---

## Phase 2.1: Minimal Pipeline Integration (Faithful Copy)

### Goals
1. Create a Pipeline class that is a **faithful copy** of the original pipeline logic
2. Initialize models externally using ModuleManager
3. Verify that the pipeline can be instantiated and run

### ⚠️ CRITICAL: Faithful Copy Requirements for Pipeline

**The same rules that apply to model integration also apply to pipeline code:**

| Prohibited | Example |
|------------|---------|
| ❌ Modify any logic | Change computation order |
| ❌ Add/remove operations | Add preprocessing steps |
| ❌ Change parameter names | Rename `num_frames` to `frame_num` |
| ❌ "Optimize" code | Refactor loops, merge functions |

| Allowed | Description |
|---------|-------------|
| ✅ Change inheritance | Inherit `BasePipeline` |
| ✅ Add type annotations | Parameter and return types |
| ✅ Adjust imports | Use TeleFuser imports |
| ✅ Use ModuleManager | `self.dit = module_manager.fetch_module("dit")` |

### Files to Create

```
telefuser/pipelines/<model_name>/
├── __init__.py
└── pipeline.py          # Pipeline class - faithful copy of original
```

### Pipeline Template

```python
# telefuser/pipelines/<model_name>/pipeline.py
import torch

from telefuser.core.base_pipeline import BasePipeline
from telefuser.core.module_manager import ModuleManager


class MyModelPipeline(BasePipeline):
    """Pipeline for MyModel - faithful copy from original project."""

    def __init__(self, device="cuda", torch_dtype=torch.bfloat16):
        super().__init__(device=device, torch_dtype=torch_dtype)
        # Division factors for resolution
        self.height_division_factor = 16
        self.width_division_factor = 16

    # MyModelConfig is the pipeline's config class (definition not shown in this
    # template); the annotation is quoted so the name is not resolved at import time.
    def init(self, module_manager: ModuleManager, config: "MyModelConfig"):
        """Initialize pipeline with external modules."""
        self._model_info = module_manager.get_model_info()
        self.config = config

        # Fetch modules from ModuleManager (added externally)
        self.dit = module_manager.fetch_module("dit")
        self.vae = module_manager.fetch_module("vae")
        self.text_encoder = module_manager.fetch_module("text_encoder")

    def __call__(self, prompt: str, ...):
        """Forward pass - FAITHFUL COPY of original pipeline logic.

        DO NOT modify:
        - Computation order
        - Parameter names
        - Math formulas
        - Control flow
        """
        # Copy original __call__ logic exactly
        ...
```
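
The template stores `height_division_factor` and `width_division_factor` but does not show how they are used. A common convention in diffusion pipelines is to snap the requested resolution to a multiple of such factors before running the model. The sketch below illustrates that convention under the assumption that values are rounded up. `round_up_to_factor` and `snap_resolution` are hypothetical helper names, not TeleFuser APIs, and the actual `BasePipeline` behaviour may differ.

```python
from typing import Tuple


def round_up_to_factor(value: int, factor: int) -> int:
    """Return the smallest multiple of `factor` that is >= `value`."""
    if value <= 0 or factor <= 0:
        raise ValueError("value and factor must be positive")
    return ((value + factor - 1) // factor) * factor


def snap_resolution(
    height: int,
    width: int,
    height_division_factor: int = 16,
    width_division_factor: int = 16,
) -> Tuple[int, int]:
    """Snap (height, width) up to multiples of the division factors (assumed policy)."""
    return (
        round_up_to_factor(height, height_division_factor),
        round_up_to_factor(width, width_division_factor),
    )


print(snap_resolution(481, 830))  # (496, 832)
```

If the original project rounds down or rejects non-conforming sizes instead, the faithful-copy rule means that behaviour should be kept as-is rather than replaced with this one.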

### Example File Template

Create `examples/<model_name>/<model>_<task>_<hardware>.py`:

```python
"""Example for MyModel pipeline integration.

This example shows how to:
1. Initialize models externally
2. Add them to ModuleManager
3. Create and run the pipeline
"""

import click
import torch
from diffusers import AutoencoderKL
from transformers import T5EncoderModel

from telefuser.core.module_manager import ModuleManager
from telefuser.pipelines.<model_name> import MyModelPipeline, MyModelConfig

PPL_CONFIG = dict(
    name="<model>_<task>_<hardware>",
    num_inference_steps=50,
    cfg_scale=4.0,
)


def get_pipeline(
    model_root: str,
    device: str = "cuda",
    torch_dtype: torch.dtype = torch.bfloat16,
):
    """Initialize pipeline with external model loading."""

    # 1. Create ModuleManager
    mm = ModuleManager(torch_dtype=torch_dtype, device="cpu")

    # 2. Load models EXTERNALLY (not inside pipeline)
    # DiT - source-level model
    dit_path = f"{model_root}/dit.safetensors"
    mm.load_model(dit_path, name="dit", torch_dtype=torch_dtype)

    # VAE - HuggingFace loading
    vae_path = f"{model_root}/vae"
    mm.load_from_huggingface(
        vae_path,
        module_source="diffusers",
        module_class=AutoencoderKL,
        module_name="vae",
    )

    # Text Encoder
    text_encoder_path = f"{model_root}/text_encoder"
    mm.load_from_huggingface(
        text_encoder_path,
        module_source="transformers",
        module_class=T5EncoderModel,
        module_name="text_encoder",
    )

    # 3. Create pipeline
    pipeline = MyModelPipeline(device=device, torch_dtype=torch_dtype)

    # 4. Initialize pipeline with external modules
    config = MyModelConfig()
    pipeline.init(mm, config)

    return pipeline


def run(pipeline, prompt: str, ...):
    """Run inference."""
    with torch.inference_mode():
        output = pipeline(prompt=prompt, ...)
    return output


@click.command()
@click.option("--model_root", required=True)
@click.option("--prompt", required=True)
def main(model_root, prompt):
    pipeline = get_pipeline(model_root)
    output = run(pipeline, prompt)
    print(f"Generated: {output}")


if __name__ == "__main__":
    main()
```
|
|
248
|
+
|
|
249
|
+
### Verification

After creating the pipeline, compare it against the original:

```bash
# Compare original pipeline with integrated version
diff -u <original_pipeline.py> telefuser/pipelines/<model_name>/pipeline.py

# Ensure only allowed differences:
# - import statements
# - class inheritance (BasePipeline)
# - type annotations
# - ModuleManager.fetch_module() calls
```

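The manual diff review above can be partly automated. The following is a minimal sketch, not part of telefuser: `ALLOWED_PATTERNS` and `unexpected_changes` are hypothetical names, and the regular expressions are rough heuristics for three of the allowed difference types. Type-annotation changes are not recognized and will be flagged, so they still need a manual look.

```python
import difflib
import re

# Hypothetical allow-list: changed lines matching any of these are tolerated.
ALLOWED_PATTERNS = [
    re.compile(r"^\s*(import|from)\s+\S+"),  # import statements
    re.compile(r"^\s*class\s+\w+\(.*\):"),   # class inheritance
    re.compile(r"fetch_module\("),           # ModuleManager.fetch_module() calls
]


def unexpected_changes(original: str, integrated: str) -> list[str]:
    """Return added/removed lines that match none of the allowed patterns."""
    diff = difflib.unified_diff(
        original.splitlines(), integrated.splitlines(), lineterm="", n=0
    )
    flagged = []
    for line in diff:
        # Skip the ---/+++ file headers and the @@ hunk markers.
        if line.startswith(("---", "+++", "@@")):
            continue
        body = line[1:]
        if not body.strip():
            continue  # ignore blank-line changes
        if not any(pattern.search(body) for pattern in ALLOWED_PATTERNS):
            flagged.append(line)
    return flagged
```

An empty result suggests that only allowed differences are present; any returned line should be inspected by hand before the checkpoint.
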
### 🛑 Phase 2.1 Checkpoint

After completion:
1. Show `pipeline.py` and highlight that it is a faithful copy
2. Show the example file with external model initialization
3. **Run diff comparison**
4. **AskUserQuestion**: "Phase 2.1 complete. Pipeline is faithful copy. Ready for Phase 2.2?"

---

## Phase 2.2: Split Pipeline into Stages (Faithful Copy)

### Goals
1. Split pipeline `__call__` into separate Stage classes
2. Each stage inherits `BaseStage`
3. **NO logic modification** - just code organization

### ⚠️ CRITICAL: Stage Splitting Rules

**Stages are for code organization, NOT refactoring:**

| Prohibited | Reason |
|------------|--------|
| ❌ Change computation order | Breaks correctness |
| ❌ Add/remove operations | Breaks correctness |
| ❌ Modify stage interfaces | Breaks data flow |
| ❌ "Optimize" within stages | Introduces bugs |

| Allowed | Description |
|---------|-------------|
| ✅ Group related operations | e.g., all text encoding in one stage |
| ✅ Use BaseStage decorators | `@with_model_offload`, `@torch.inference_mode` |
| ✅ Pass data between stages | Via method parameters |

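To make the "organization, not refactoring" rule concrete, here is a toy sketch in plain Python. It does not use `BaseStage` or any telefuser API; `original_call`, the `Toy*Stage` classes, and `staged_call` are hypothetical stand-ins. Each statement of the monolithic function moves unchanged into a stage, and the call order stays the same, so the output is identical.

```python
def original_call(prompt: str) -> list[float]:
    """Toy stand-in for a monolithic __call__ (three steps in fixed order)."""
    embedding = [float(ord(ch)) for ch in prompt]   # step 1: "text encoding"
    latents = [value * 0.5 for value in embedding]  # step 2: "denoising"
    return [value + 1.0 for value in latents]       # step 3: "decoding"


class ToyTextEncodingStage:
    def process(self, prompt: str) -> list[float]:
        # Same statement as step 1, moved verbatim.
        return [float(ord(ch)) for ch in prompt]


class ToyDenoisingStage:
    def process(self, embedding: list[float]) -> list[float]:
        # Same statement as step 2, moved verbatim.
        return [value * 0.5 for value in embedding]


class ToyDecodingStage:
    def process(self, latents: list[float]) -> list[float]:
        # Same statement as step 3, moved verbatim.
        return [value + 1.0 for value in latents]


def staged_call(prompt: str) -> list[float]:
    """Staged version: data is passed between stages via method parameters."""
    embedding = ToyTextEncodingStage().process(prompt)
    latents = ToyDenoisingStage().process(embedding)
    return ToyDecodingStage().process(latents)
```

Comparing `staged_call(p)` with `original_call(p)` for a few inputs is the toy equivalent of the "Output matches original" check.
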
### File Structure

```
telefuser/pipelines/<model_name>/
├── __init__.py
├── pipeline.py          # Pipeline class (updated to use stages)
├── text_encoding.py     # Text encoding stage
├── vae.py               # VAE encode/decode stage
├── denoising.py         # DiT denoising stage
└── <other>_stage.py     # Other stages as needed
```

### Stage Template

```python
# telefuser/pipelines/<model_name>/text_encoding.py
import torch

from telefuser.core.base_stage import BaseStage, with_model_offload
from telefuser.core.module_manager import ModuleManager

class TextEncodingStage(BaseStage):
    """Text encoding stage - faithful copy from original pipeline."""

    def __init__(self, name: str, module_manager: ModuleManager, config):
        super().__init__(name, config)
        self.text_encoder = module_manager.fetch_module("text_encoder")
        self.model_names = ["text_encoder"]

    @with_model_offload(["text_encoder"])
    @torch.inference_mode()
    def process(self, prompts: list[str]) -> torch.Tensor:
        """Encode text prompts.

        FAITHFUL COPY of original encoding logic - DO NOT MODIFY.
        """
        # Copy exact logic from original pipeline
        text_embeddings = self.text_encoder(prompts)
        return text_embeddings
```

### Updated Pipeline Using Stages

```python
# telefuser/pipelines/<model_name>/pipeline.py
class MyModelPipeline(BasePipeline):

    def init(self, module_manager: ModuleManager, config: MyModelConfig):
        self._model_info = module_manager.get_model_info()
        self.config = config

        # Create stages
        self.text_encoding_stage = TextEncodingStage(
            "text_encoding", module_manager, config.text_encoding_config
        )
        self.vae_stage = VAEStage(
            "vae", module_manager, config.vae_config
        )
        self.denoising_stage = DenoisingStage(
            "denoising", module_manager, config.dit_config
        )

    def __call__(self, prompt: str, ...):
        """Forward pass using stages - SAME LOGIC as original."""
        # Stage 1: Text encoding
        text_embeddings = self.text_encoding_stage.process([prompt])

        # Stage 2: VAE encoding
        latents = self.vae_stage.encode(image)

        # Stage 3: Denoising
        latents = self.denoising_stage.process(latents, text_embeddings, ...)

        # Stage 4: VAE decoding
        output = self.vae_stage.decode(latents)

        return output
```

### Verification Checklist

After splitting into stages:

```markdown
| Check Item | Pass? |
|------------|-------|
| Computation order identical | □ |
| All operations preserved | □ |
| Parameter passing correct | □ |
| Output matches original | □ |
```

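The "Output matches original" item is easiest to check numerically. The sketch below is a standard-library illustration under stated assumptions: `flatten` and `max_abs_diff` are hypothetical helper names, and outputs are assumed to be convertible to nested lists of numbers (for tensors, for example via `.tolist()`). With real tensors, `torch.allclose` is the more usual tool.

```python
from collections.abc import Iterable


def flatten(values):
    """Yield every number in an arbitrarily nested list/tuple as a float."""
    for value in values:
        if isinstance(value, Iterable):
            yield from flatten(value)
        else:
            yield float(value)


def max_abs_diff(reference, candidate) -> float:
    """Largest element-wise absolute difference between two nested outputs."""
    ref = list(flatten(reference))
    cand = list(flatten(candidate))
    if len(ref) != len(cand):
        raise ValueError(f"element count differs: {len(ref)} vs {len(cand)}")
    return max((abs(r - c) for r, c in zip(ref, cand)), default=0.0)
```

For a faithful copy run with the same seed, dtype, and device, the difference should be zero or within a small tolerance chosen with the user; a larger value points to a changed operation or order.
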
### 🛑 Phase 2.2 Checkpoint

After completion:
1. Show all stage files
2. Show updated `pipeline.py`
3. **Run verification checklist**
4. **AskUserQuestion**: "Phase 2.2 complete. Stages are faithful copies. Ready for Phase 3?"

---

## Phase 3: Integrate Internal Models

### Goals
1. Implement the DiT model at source level (inherit `BaseModel`)
2. Implement `state_dict_converter()` for loading pretrained weights
3. Verify that the model loads correctly

### ⚠️ CRITICAL: Faithful Copy Requirements

**When integrating model code from external projects, strictly follow these rules:**

#### 1. Prohibited Modifications

| Prohibited Operation | Example | Reason |
|---------------------|---------|--------|
| ❌ Add/remove tensor operations | `x.flatten(2)`, `x.view()` | Changes data flow |
| ❌ Change parameter passing | `e0[0]` instead of `e0` | Changes parameter meaning |
| ❌ Modify math formulas | `x = x + y` → `x = x + y * e` | Incorrect computation logic |
| ❌ Merge/split functions | Combine multiple attention | Changes semantics |
| ❌ "Optimize" code structure | Refactor, simplify | Introduces bugs |
| ❌ Modify logic branches | Change conditionals | Inconsistent behavior |

#### 2. Allowed Modifications

| Allowed Operation | Description |
|-------------------|-------------|
| ✅ Change inheritance | `ModelMixin` → `BaseModel` |
| ✅ Add type annotations | `def forward(x)` → `def forward(x: torch.Tensor)` |
| ✅ Adjust import paths | Relative → Absolute imports |
| ✅ Remove external dependencies | Like `diffusers`, `transformers` mixin classes |
| ✅ Add `state_dict_converter()` | Weight loading adaptation |
| ✅ Add docstrings | English comments |

#### 3. Verification Steps

**After copying each class, you MUST run:**

```bash
# Compare original and integrated file differences
# (keep only added/removed lines, drop the ---/+++ file headers)
diff -u <original_file> <integrated_file> | grep "^[-+]" | grep -v "^[-+][-+][-+]"

# Ensure only these types of differences:
# - import statements
# - class inheritance declaration
# - type annotations
# - docstrings
```

**Difference Verification Checklist:**

```markdown
| Check Item | Pass? |
|------------|-------|
| forward() logic identical | □ |
| Parameter names and order identical | □ |
| Tensor operation calls identical | □ |
| Math formulas identical | □ |
| Conditional branches identical | □ |
```

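As a complement to the textual diff, the "forward() logic identical" item can be checked structurally. The following sketch uses Python's standard `ast` module; `_StripCosmetics`, `normalized_function`, and `same_logic` are hypothetical helper names, not telefuser APIs. It strips the two cosmetic differences that are allowed inside a function (type annotations and docstrings) and then compares the remaining syntax trees.

```python
import ast


class _StripCosmetics(ast.NodeTransformer):
    """Remove differences that Phase 3 allows: annotations and docstrings."""

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        self.generic_visit(node)
        node.returns = None  # drop the return annotation
        # Drop a leading docstring expression, if present.
        if (
            node.body
            and isinstance(node.body[0], ast.Expr)
            and isinstance(node.body[0].value, ast.Constant)
            and isinstance(node.body[0].value.value, str)
        ):
            node.body = node.body[1:] or [ast.Pass()]
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.annotation = None  # drop parameter annotations
        return node


def normalized_function(source: str, name: str) -> str:
    """Return a canonical dump of the first function called `name`."""
    tree = _StripCosmetics().visit(ast.parse(source))
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            return ast.dump(node)
    raise KeyError(name)


def same_logic(original: str, integrated: str, name: str = "forward") -> bool:
    """True if both sources define `name` with identical logic."""
    return normalized_function(original, name) == normalized_function(integrated, name)
```

A `False` result means that something other than annotations or docstrings changed inside the function, for example an added `flatten` call. The check does not cover import or inheritance changes, which live outside the function body.
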
#### 4. Model Configuration Parameters

**⚠️ CRITICAL: Never guess default parameters - always verify with the user!**

When the original model uses `from_pretrained()` (diffusers `ModelMixin`) or reads from `config.json`, the model parameters are determined by the checkpoint's config file, NOT by hardcoded defaults.

| Wrong Approach | Correct Approach |
|----------------|------------------|
| ❌ Copy default values from original code | ✅ Ask the user for the actual config from the checkpoint |
| ❌ Use generic defaults (dim=2048, num_heads=16) | ✅ Get dim, ffn_dim, num_heads, num_layers from config.json |
| ❌ Assume parameters work with weights | ✅ Verify parameters match weight tensor shapes |

**Example: LiveAct config.json shows:**
```json
{
  "dim": 5120,
  "ffn_dim": 13824,
  "num_heads": 40,
  "num_layers": 40,
  "in_dim": 36
}
```

With the wrong defaults (dim=2048, num_heads=16), weight loading fails with shape-mismatch errors.

**Required Action:**
When implementing the model `__init__`, use **AskUserQuestion** to request:
1. The `config.json` content from the checkpoint directory
2. Or the key parameters: `dim`, `ffn_dim`, `num_heads`, `num_layers`, `in_dim`, etc.
3. Then update the default parameter values to match the actual checkpoint config

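Once the user has supplied `config.json`, the "parameters match weight tensor shapes" check can be sketched as follows. This is an illustration under assumptions: the checkpoint key names in `expected_shapes` are hypothetical (real checkpoints name their tensors differently), and `actual_shapes` is assumed to be a plain mapping from key to shape that was collected beforehand.

```python
import json


def expected_shapes(config: dict) -> dict[str, tuple[int, ...]]:
    """Shapes implied by the config, for two HYPOTHETICAL feed-forward keys."""
    dim, ffn_dim = config["dim"], config["ffn_dim"]
    return {
        # A linear layer mapping dim -> ffn_dim stores its weight as (out, in).
        "blocks.0.ffn.0.weight": (ffn_dim, dim),
        "blocks.0.ffn.2.weight": (dim, ffn_dim),
    }


def config_problems(config: dict, actual_shapes: dict) -> list[str]:
    """List mismatches between config-implied and checkpoint shapes."""
    problems = []
    if config["dim"] % config["num_heads"] != 0:
        problems.append("dim is not divisible by num_heads")
    for key, shape in expected_shapes(config).items():
        actual = actual_shapes.get(key)
        if actual is not None and tuple(actual) != shape:
            problems.append(
                f"{key}: config implies {shape}, checkpoint has {tuple(actual)}"
            )
    return problems


def problems_from_json(config_text: str, actual_shapes: dict) -> list[str]:
    """Convenience wrapper that parses the raw config.json text first."""
    return config_problems(json.loads(config_text), actual_shapes)
```

An empty list means that the checked shapes agree with the config; with the generic defaults (dim=2048, num_heads=16) against LiveAct-sized weights, both hypothetical keys would be reported.
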
#### 5. Correct vs Incorrect Example

```python
# ❌ WRONG - Unnecessary "optimization"
def forward(self, x, grid_sizes, freqs):
    q = causal_rope_apply(x.flatten(2), grid_sizes, freqs)  # Wrongly added flatten
    q = q.view(B, -1, self.num_heads, self.head_dim)

# ✅ CORRECT - Faithful copy
def forward(self, x, grid_sizes, freqs):
    q = causal_rope_apply(x, grid_sizes, freqs)  # Identical to original
    q = q.transpose(1, 2)
```

### DiT Model Implementation

```python
# telefuser/models/<model>_dit.py
from torch import nn

from telefuser.core.base_model import BaseModel

class MyModelDiT(BaseModel):
    def __init__(self, config: MyModelDiTConfig):
        super().__init__()
        # Directly copy from original - DO NOT modify
        self.x_embedder = nn.Linear(...)
        self.transformer_blocks = nn.ModuleList([...])

    def forward(self, hidden_states, timestep, encoder_hidden_states, ...):
        # Directly copy from original - DO NOT modify logic
        ...

    def get_fsdp_module_names(self) -> list[str]:
        return ["TransformerBlock", "SingleTransformerBlock"]

    @staticmethod
    def state_dict_converter():
        return MyModelDiTStateDictConverter()


class MyModelDiTStateDictConverter:
    def from_diffusers(self, state_dict: dict) -> dict:
        return state_dict  # or key remapping

    def from_official(self, state_dict: dict) -> dict:
        # Key remapping from official/BFL format
        ...
```

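The "key remapping" that a converter such as `from_official()` performs can be sketched with plain dictionaries. This is a minimal illustration, not the converter telefuser ships: `remap_keys` is a hypothetical helper, and the prefix pair in the usage note is an example only.

```python
def remap_keys(state_dict: dict, prefix_map: dict[str, str]) -> dict:
    """Rename checkpoint keys by replacing the first matching prefix.

    Values (the tensors) are passed through untouched; only names change.
    """
    remapped = {}
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        if new_key in remapped:
            # Two source keys collapsing into one name would silently drop weights.
            raise ValueError(f"duplicate key after remapping: {new_key}")
        remapped[new_key] = value
    return remapped
```

For instance, `remap_keys(sd, {"img_in.": "x_embedder."})` would rename `img_in.weight` to `x_embedder.weight` and leave all other keys as they are. Because the tensors are never modified, the remapping itself stays within the faithful-copy rules.
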
### Loading in Pipeline

```python
# DiT - source-level
transformer = Flux2DiT.from_pretrained(transformer_path, torch_dtype)
mm.add_module(transformer, "transformer")

# VAE/TextEncoder - HuggingFace loading
mm.load_from_huggingface(vae_path, module_source="diffusers",
                         module_class=AutoencoderKLFlux2, module_name="vae")
mm.load_from_huggingface(text_encoder_path, module_source="transformers",
                         module_class=Qwen3ForCausalLM, module_name="text_encoder")
```

### 🛑 Phase 3 Checkpoint

After completion:
1. Show `<model>_dit.py` implementation
2. Show `state_dict_converter`
3. **Run diff comparison and show results**
4. **AskUserQuestion**: "Phase 3 complete. Ready for Phase 4?"

---

## Phase 4: Code Cleanup

### Remove
- `gradient_checkpointing` attributes
- `self.training` conditionals
- Duplicate definitions (RMSNorm, swish, etc.)
- Unused code

### Standardize
- Consistent `from_pretrained` parameter names
- Encoders return a dataclass (not a dict)
- Shared utilities in a single location

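The "dataclass (not a dict)" convention can be illustrated with a small standard-library sketch. `TextEncoderOutput`, its field names, and `to_output` are hypothetical; the actual encoder output types in telefuser may differ.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class TextEncoderOutput:
    """Hypothetical typed container for an encoder's results."""

    embeddings: Any                       # e.g. a tensor of prompt embeddings
    attention_mask: Optional[Any] = None  # optional mask, if the encoder has one


def to_output(legacy: dict) -> TextEncoderOutput:
    """Convert a legacy dict result into the dataclass.

    Unknown keys raise TypeError here instead of being carried along silently.
    """
    return TextEncoderOutput(**legacy)
```

With a dataclass, a misspelled field name fails immediately as an `AttributeError`, and the available fields are visible to type checkers and editors, which a plain dict does not offer.
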
### 🛑 Phase 4 Checkpoint

After completion:
1. Run `pre-commit run --all-files`
2. Show cleanup summary
3. Update `PROGRESS.md` status
4. **AskUserQuestion**: "Phase 4 complete. Ready for Phase 5?"

---

## Phase 5: Review & Compare

### Goals
1. Compare the pipeline logic with the original
2. Verify edge-case handling
3. Ensure the numerical output matches

### Comparison Checklist

| Aspect | Check |
|--------|-------|
| Pipeline flow | Steps match original? |
| Edge cases | CFG=1, batch>1, custom sizes? |
| Model config | Parameters match? |
| Numerical | Output matches original? |

Create `examples/<model_name>/COMPARISON_REPORT.md` with the findings.

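The edge cases in the checklist combine quickly, so it can help to enumerate them explicitly before running both pipelines. The sketch below only builds the test grid; the concrete values and the name `edge_case_grid` are assumptions for illustration and should be replaced with settings the model actually supports.

```python
from itertools import product

# Hypothetical edge-case values; adjust to the model under test.
CFG_SCALES = [1.0, 4.0]           # 1.0 is commonly the "no guidance" path
BATCH_SIZES = [1, 2]              # batch > 1 exercises broadcasting
SIZES = [(512, 512), (768, 432)]  # (height, width), including a non-square size


def edge_case_grid() -> list[dict]:
    """Return one settings dict per combination of the values above."""
    return [
        {"cfg_scale": cfg, "batch_size": batch, "height": height, "width": width}
        for cfg, batch, (height, width) in product(CFG_SCALES, BATCH_SIZES, SIZES)
    ]
```

Each entry can then be passed to both the original and the integrated pipeline, with one row per entry recorded in `COMPARISON_REPORT.md`.
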
### 🛑 Phase 5 Checkpoint

After completion:
1. Show comparison summary
2. Highlight any mismatches
3. If there are issues, provide fix suggestions
4. **AskUserQuestion**: "Integration complete. What next?"

---

## Skip Handling

When the user says "skip X":
- Skip the work, NOT the checkpoint
- If execution is skipped, still analyze from the code
- Still use AskUserQuestion for approval

---

## Context Management

Phases 2.2 and 3 are critical and require precision. If the context is nearly exhausted after the earlier phases, recommend starting a fresh session.

---

## Related Documentation

| Topic | Document |
|-------|----------|
| Model Implementation | `docs/en/adding_new_model.md` |
| Stage Implementation | `docs/en/adding_new_stage.md` |
| Attention Config | `docs/en/attention.md` |
| Parallel Inference | `docs/en/parallel.md` |
| Optimization | Use `/optimize-pipeline` skill after integration |