ejkernel 0.0.79__tar.gz → 0.0.80__tar.gz
This diff shows the changes between publicly available package versions as released to one of the supported registries. It is provided for informational purposes only and reflects the packages as they appear in their respective public registries.
- ejkernel-0.0.80/PKG-INFO +687 -0
- ejkernel-0.0.80/csrc/flash_attention/CMakeLists.txt +78 -0
- ejkernel-0.0.80/csrc/flash_attention/src/flash_fwd_launch_template.h +449 -0
- ejkernel-0.0.80/csrc/quantized_matmul/CMakeLists.txt +79 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/code_gen.py +392 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_cuda_impl.h +3519 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits1_bf16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits1_f16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits1_f32.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_bf16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_f16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_f32.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_bf16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_f16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_f32.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_bf16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_f16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_f32.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_bf16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_f16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_f32.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_bf16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_f16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_f32.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_bf16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_f16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_f32.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_bf16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_f16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_f32.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_dispatch.h +603 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_mxfp4.cu +22 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_mxfp8.cu +22 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_nf4_bf16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_nf4_f16.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_nf4_f32.cu +52 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_nvfp4.cu +22 -0
- ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_nvfp8.cu +22 -0
- ejkernel-0.0.80/ejkernel/__init__.py +87 -0
- ejkernel-0.0.80/ejkernel/benchmarks.py +1073 -0
- ejkernel-0.0.80/ejkernel/build_cudalib.py +30 -0
- ejkernel-0.0.80/ejkernel/callib/__init__.py +217 -0
- ejkernel-0.0.80/ejkernel/callib/_cute_call.py +537 -0
- ejkernel-0.0.80/ejkernel/callib/_cute_ffi.py +530 -0
- ejkernel-0.0.80/ejkernel/callib/_ejit.py +700 -0
- ejkernel-0.0.80/ejkernel/callib/_pallas_call.py +275 -0
- ejkernel-0.0.80/ejkernel/callib/_tilelang_call.py +907 -0
- ejkernel-0.0.80/ejkernel/callib/_tilelang_ffi.py +560 -0
- ejkernel-0.0.80/ejkernel/callib/_triton_call.py +1638 -0
- ejkernel-0.0.80/ejkernel/callib/_utils.py +273 -0
- ejkernel-0.0.80/ejkernel/errors.py +45 -0
- ejkernel-0.0.80/ejkernel/kernels/__init__.py +152 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/__init__.py +108 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/blocksparse_attention/__init__.py +46 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/blocksparse_attention/_build.py +202 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/blocksparse_attention/_cuda_impl.py +284 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/blocksparse_attention/_interface.py +982 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/flash_attention/__init__.py +35 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/flash_attention/_build.py +262 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/flash_attention/_cuda_impl.py +478 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/flash_attention/_interface.py +867 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/__init__.py +39 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/_build.py +274 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl.py +340 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl_bwd.py +103 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl_fwd.py +104 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/_interface.py +290 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/ragged_page_attention_v3/__init__.py +43 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/ragged_page_attention_v3/_build.py +211 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/ragged_page_attention_v3/_cuda_impl.py +178 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/ragged_page_attention_v3/_interface.py +153 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/unified_attention/__init__.py +49 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/unified_attention/_build.py +204 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/unified_attention/_cuda_impl.py +174 -0
- ejkernel-0.0.80/ejkernel/kernels/_cuda/unified_attention/_interface.py +170 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/__init__.py +32 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/chunked_prefill_paged_decode/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/chunked_prefill_paged_decode/_cute_impl_fwd.py +532 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/chunked_prefill_paged_decode/_interface.py +140 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/flash_attention/__init__.py +27 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/flash_attention/_cute_impl.py +1545 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/flash_attention/_interface.py +598 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/quantized_matmul/__init__.py +27 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/quantized_matmul/_cute_impl.py +3098 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/quantized_matmul/_cute_impl_bwd.py +119 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/quantized_matmul/_cute_impl_fwd.py +121 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/quantized_matmul/_interface.py +309 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/ragged_page_attention_v3/__init__.py +20 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/ragged_page_attention_v3/_cute_impl_fwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/ragged_page_attention_v3/_interface.py +22 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/unified_attention/__init__.py +27 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/unified_attention/_cute_impl.py +168 -0
- ejkernel-0.0.80/ejkernel/kernels/_cute/unified_attention/_interface.py +141 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/__init__.py +46 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/__init__.py +36 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/__init__.py +34 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/_interface.py +123 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/_pallas_impl_fwd.py +510 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/__init__.py +35 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_interface.py +101 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_pallas_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_pallas_impl_fwd.py +125 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/__init__.py +97 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/all_gather_matmul/__init__.py +29 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/all_gather_matmul/_interface.py +293 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/all_gather_matmul/_pallas_impl.py +725 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/blocksparse_attention/__init__.py +131 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_info.py +1068 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_interface.py +156 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_kernel.py +2869 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_masks.py +692 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/deepseek_attn/__init__.py +27 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/deepseek_attn/_interface.py +159 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/deepseek_attn/_pallas_impl_bwd.py +97 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/deepseek_attn/_pallas_impl_fwd.py +259 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_attention/__init__.py +36 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_attention/_interface.py +385 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_attention/_pallas_impl_bwd.py +912 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_attention/_pallas_impl_fwd.py +665 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_attention/_utils.py +609 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_mla/__init__.py +33 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_mla/_interface.py +205 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_mla/_pallas_impl_bwd.py +996 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_mla/_pallas_impl_fwd.py +626 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_mla/_utils.py +71 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_cross_entropy/__init__.py +25 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_cross_entropy/_interface.py +90 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_cross_entropy/_pallas_impl_bwd.py +202 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_cross_entropy/_pallas_impl_fwd.py +675 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_kl_divergence/__init__.py +25 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_kl_divergence/_interface.py +84 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_kl_divergence/_pallas_impl_bwd.py +222 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_kl_divergence/_pallas_impl_fwd.py +822 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_interface.py +142 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_pallas_impl_bwd.py +758 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_pallas_impl_fwd.py +941 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmul/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmul/_interface.py +298 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmul/_pallas_impl.py +996 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmul/_utils.py +191 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/_interface.py +319 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/_pallas_impl.py +627 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/__init__.py +26 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/_interface.py +444 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/_pallas_impl.py +1418 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/_interface.py +143 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/_pallas_impl_fwd.py +1335 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/_interface.py +177 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/_pallas_impl_fwd.py +1533 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/page_attention/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/page_attention/_interface.py +349 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/page_attention/_pallas_impl_fwd.py +516 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/prefill_page_attention/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/prefill_page_attention/_interface.py +230 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/prefill_page_attention/_pallas_impl_fwd.py +402 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/quantized_matmul/__init__.py +23 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/quantized_matmul/_interface.py +734 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_bwd.py +470 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_core.py +1042 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_fwd.py +332 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/_interface.py +138 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/_pallas_impl_fwd.py +322 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/__init__.py +25 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/_interface.py +229 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/_pallas_impl_fwd.py +140 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_interface.py +338 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_pallas_impl_fwd.py +809 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_utils.py +561 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_interface.py +205 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_pallas_impl_fwd.py +1785 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_pallas_impl_fwd_h64.py +1610 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_utils.py +4792 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/_interface.py +226 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/_pallas_impl.py +780 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ring_attention/__init__.py +48 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ring_attention/_interface.py +125 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ring_attention/_pallas_impl_bwd.py +919 -0
- ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ring_attention/_pallas_impl_fwd.py +477 -0
- ejkernel-0.0.80/ejkernel/kernels/_registry.py +720 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/__init__.py +138 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/_dense_matmul.py +200 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/_gate_impl.py +408 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/_gate_kernel.py +520 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/_grouped_matmul_impl.py +252 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/_grouped_matmul_kernel.py +232 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/all_gather_matmul/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/all_gather_matmul/_interface.py +102 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/attention/__init__.py +26 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/attention/_interface.py +750 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/attention/_kernel.py +505 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/blocksparse_attention/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/blocksparse_attention/_interface.py +123 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/chunked_prefill_paged_decode/__init__.py +29 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/chunked_prefill_paged_decode/_interface.py +288 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/chunked_prefill_paged_decode/_kernel.py +305 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/decode_attention/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/decode_attention/_impl.py +202 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/decode_attention/_interface.py +215 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/decode_attention/_kernel.py +340 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/decode_attention/_split_kernel.py +224 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/deepseek_attn/__init__.py +35 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/deepseek_attn/_impl.py +630 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/deepseek_attn/_interface.py +112 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/deepseek_attn/_kernel.py +471 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_attention/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_attention/_impl.py +1462 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_attention/_interface.py +153 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_attention/_kernel.py +1294 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_mla/__init__.py +26 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_mla/_interface.py +161 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_cross_entropy/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_cross_entropy/_impl.py +658 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_cross_entropy/_interface.py +124 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_cross_entropy/_kernel.py +835 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_kl_divergence/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_kl_divergence/_impl.py +911 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_kl_divergence/_interface.py +85 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_kl_divergence/_kernel.py +1244 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/gated_delta_rule/__init__.py +23 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/gated_delta_rule/_impl.py +372 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/gated_delta_rule/_interface.py +131 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/gated_delta_rule/_kernel.py +537 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/gla/__init__.py +27 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/gla/_interface.py +119 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmul/__init__.py +27 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmul/_interface.py +209 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv2/__init__.py +15 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv2/_interface.py +25 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv3/__init__.py +28 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv3/_impl.py +795 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv3/_interface.py +138 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv3/_kernel.py +769 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/kernel_delta_attention/__init__.py +33 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/kernel_delta_attention/_interface.py +214 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/lightning_attn/__init__.py +27 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/lightning_attn/_interface.py +130 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/mamba1/__init__.py +28 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/mamba1/_interface.py +25 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/mamba2/__init__.py +15 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/mamba2/_interface.py +25 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/mean_pooling/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/mean_pooling/_impl.py +263 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/mean_pooling/_interface.py +102 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/mean_pooling/_kernel.py +351 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/multi_latent_ragged_page_attention/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/multi_latent_ragged_page_attention/_interface.py +475 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/multi_latent_ragged_page_attention/_kernel.py +385 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/multi_latent_ragged_page_attention_v2/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/multi_latent_ragged_page_attention_v2/_interface.py +131 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/native_sparse_attention/__init__.py +29 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/native_sparse_attention/_impl.py +530 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/native_sparse_attention/_interface.py +187 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/native_sparse_attention/_kernel.py +659 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/page_attention/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/page_attention/_interface.py +276 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/page_attention/_kernel.py +417 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/prefill_page_attention/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/prefill_page_attention/_interface.py +243 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/prefill_page_attention/_kernel.py +300 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/quantized_matmul/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/quantized_matmul/_impl.py +2180 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/quantized_matmul/_interface.py +210 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/quantized_matmul/_kernel.py +2091 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_decode_attention/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_decode_attention/_interface.py +354 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_decode_attention/_kernel.py +412 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_gated_delta_rule/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_gated_delta_rule/_impl.py +455 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_gated_delta_rule/_interface.py +109 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_gated_delta_rule/_kernel.py +1118 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2/_interface.py +297 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2/_kernel.py +312 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2_turboquant/__init__.py +26 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2_turboquant/_interface.py +342 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2_turboquant/_kernel.py +355 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3/__init__.py +25 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3/_interface.py +326 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3/_kernel.py +562 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3_turboquant/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3_turboquant/_interface.py +325 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3_turboquant/_kernel.py +300 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/recurrent/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/recurrent/_impl.py +980 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/recurrent/_interface.py +127 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/recurrent/_kernel.py +883 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/reduce_scatter_matmul/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/reduce_scatter_matmul/_interface.py +100 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ring_attention/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ring_attention/_interface.py +110 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv4/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv4/_impl.py +351 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv4/_interface.py +79 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv4/_kernel.py +519 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv6/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv6/_impl.py +485 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv6/_interface.py +98 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv6/_kernel.py +966 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7/_impl.py +540 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7/_interface.py +191 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7/_kernel.py +994 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7_mul/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7_mul/_interface.py +25 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/scaled_dot_product_attention/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/scaled_dot_product_attention/_interface.py +88 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ssm1/__init__.py +6 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ssm1/_interface.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ssm2/__init__.py +6 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/ssm2/_interface.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v1/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v1/_impl.py +341 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v1/_interface.py +143 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v1/_kernel.py +630 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v2/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v2/_impl.py +300 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v2/_interface.py +138 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v2/_kernel.py +531 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/unified_attention/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/unified_attention/_interface.py +301 -0
- ejkernel-0.0.80/ejkernel/kernels/_tilelang/unified_attention/_kernel.py +301 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/__init__.py +90 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/__init__.py +33 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/_interface.py +469 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/_mask.py +571 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/_triton_impl_bwd.py +1469 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/_triton_impl_fwd.py +753 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/_utilities.py +606 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/chunked_prefill_paged_decode/__init__.py +31 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/chunked_prefill_paged_decode/_interface.py +113 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/chunked_prefill_paged_decode/_triton_impl_fwd.py +244 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/decode_attention/__init__.py +23 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/decode_attention/_interface.py +96 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/decode_attention/_triton_impl_fwd.py +496 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/flash_attention/__init__.py +36 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/flash_attention/_interface.py +473 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/flash_attention/_triton_impl_bwd.py +1628 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/flash_attention/_triton_impl_fwd.py +1074 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/flash_attention/_utilities.py +472 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/flash_mla/__init__.py +25 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/flash_mla/_interface.py +152 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/flash_mla/_triton_impl_bwd.py +40 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/flash_mla/_triton_impl_fwd.py +218 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/flash_mla/_utilities.py +39 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/gla/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/gla/_interface.py +106 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/gla/_triton_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/gla/_triton_impl_fwd.py +127 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/lightning_attn/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/lightning_attn/_interface.py +97 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/lightning_attn/_triton_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/lightning_attn/_triton_impl_fwd.py +146 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/mean_pooling/__init__.py +42 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/mean_pooling/_interface.py +207 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/mean_pooling/_triton_impl_bwd.py +241 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/mean_pooling/_triton_impl_fwd.py +250 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/__init__.py +37 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/_compression.py +1012 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/_interface.py +469 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/_triton_impl_bwd.py +586 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/_triton_impl_fwd.py +627 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/_utilities.py +174 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/page_attention/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/page_attention/_interface.py +381 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/page_attention/_triton_impl_fwd.py +308 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/__init__.py +46 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/_interface.py +481 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/_triton_impl.py +2905 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_bwd.py +125 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_fwd.py +330 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_gemv.py +754 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_decode_attention/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_decode_attention/_interface.py +173 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_decode_attention/_triton_impl_fwd.py +394 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v2/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v2/_interface.py +228 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v2/_triton_impl_fwd.py +796 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v3/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v3/_interface.py +205 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v3/_triton_impl_fwd.py +661 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/recurrent/__init__.py +42 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/recurrent/_interface.py +339 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/recurrent/_triton_impl_bwd.py +419 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/recurrent/_triton_impl_fwd.py +282 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ring_attention/__init__.py +33 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ring_attention/_interface.py +164 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ring_attention/_triton_impl_bwd.py +862 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/ring_attention/_triton_impl_fwd.py +225 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv4/__init__.py +32 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv4/_interface.py +226 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv4/_triton_impl_bwd.py +232 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv4/_triton_impl_fwd.py +281 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv6/__init__.py +33 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv6/_interface.py +561 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv6/_triton_impl_fwd.py +221 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv7/__init__.py +34 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv7/_interface.py +684 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv7/_triton_impl_fwd.py +238 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/unified_attention/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/unified_attention/_interface.py +161 -0
- ejkernel-0.0.80/ejkernel/kernels/_triton/unified_attention/_triton_impl_fwd.py +1024 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/__init__.py +165 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/all_gather_matmul/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/all_gather_matmul/_interface.py +95 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/all_gather_matmul/_xla_impl_bwd.py +50 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/all_gather_matmul/_xla_impl_fwd.py +236 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/attention/__init__.py +31 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/attention/_interface.py +149 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/attention/_xla_impl_bwd.py +23 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/attention/_xla_impl_fwd.py +331 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/blocksparse_attention/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/blocksparse_attention/_interface.py +150 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/blocksparse_attention/_xla_impl_bwd.py +23 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/blocksparse_attention/_xla_impl_fwd.py +475 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/chunked_prefill_paged_decode/__init__.py +38 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/chunked_prefill_paged_decode/_interface.py +113 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/chunked_prefill_paged_decode/_xla_impl_fwd.py +237 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/decode_attention/__init__.py +34 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/decode_attention/_interface.py +87 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/decode_attention/_xla_impl_fwd.py +167 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/deepseek_attn/__init__.py +28 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/deepseek_attn/_interface.py +239 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/deepseek_attn/_xla_impl_bwd.py +127 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/deepseek_attn/_xla_impl_fwd.py +211 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/flash_attention/__init__.py +40 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/flash_attention/_interface.py +670 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/flash_attention/_xla_impl_bwd.py +150 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/flash_attention/_xla_impl_fwd.py +742 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/flash_mla/__init__.py +23 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/flash_mla/_xla_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/flash_mla/_xla_impl_fwd.py +480 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/_interface.py +68 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/_xla_impl_bwd.py +124 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/_xla_impl_chunked.py +396 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/_xla_impl_fwd.py +227 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/_xla_impl_linear.py +201 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/fused_kl_divergence/__init__.py +19 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/fused_kl_divergence/_interface.py +62 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/fused_kl_divergence/_xla_impl_bwd.py +110 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/fused_kl_divergence/_xla_impl_fwd.py +201 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/gated_delta_rule/__init__.py +36 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/gated_delta_rule/_interface.py +176 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/gated_delta_rule/_xla_impl_bwd.py +305 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/gated_delta_rule/_xla_impl_fwd.py +814 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/gla/__init__.py +34 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/gla/_interface.py +122 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/gla/_xla_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/gla/_xla_impl_fwd.py +119 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmul/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmul/_interface.py +112 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmul/_xla_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmul/_xla_impl_fwd.py +122 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmulv3/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmulv3/_interface.py +583 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/kernel_delta_attention/__init__.py +43 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/kernel_delta_attention/_interface.py +239 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/kernel_delta_attention/_xla_impl_fwd.py +467 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/lightning_attn/__init__.py +34 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/lightning_attn/_interface.py +107 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/lightning_attn/_xla_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/lightning_attn/_xla_impl_fwd.py +141 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/mean_pooling/__init__.py +29 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/mean_pooling/_interface.py +82 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/mean_pooling/_xla_impl_bwd.py +104 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/mean_pooling/_xla_impl_fwd.py +191 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/__init__.py +43 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/_interface.py +145 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/_xla_impl_fwd.py +488 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/__init__.py +28 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/_interface.py +125 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/_xla_impl_fwd.py +117 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/native_sparse_attention/__init__.py +48 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/native_sparse_attention/_interface.py +623 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/native_sparse_attention/_xla_impl_bwd.py +241 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/native_sparse_attention/_xla_impl_fwd.py +171 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/page_attention/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/page_attention/_interface.py +166 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/page_attention/_xla_impl_fwd.py +161 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/prefill_page_attention/__init__.py +77 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/prefill_page_attention/_impl.py +172 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/prefill_page_attention/_interface.py +97 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/quantized_matmul/__init__.py +33 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/quantized_matmul/_interface.py +139 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/quantized_matmul/_xla_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/quantized_matmul/_xla_impl_fwd.py +751 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_decode_attention/__init__.py +29 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_decode_attention/_interface.py +109 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_decode_attention/_xla_impl_fwd.py +600 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_gated_delta_rule/__init__.py +25 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_gated_delta_rule/_interface.py +117 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_gated_delta_rule/_xla_impl_fwd.py +503 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v2/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v2/_interface.py +131 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v2/_xla_impl_fwd.py +318 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v2_turboquant/_interface.py +158 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v2_turboquant/_xla_impl_fwd.py +407 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3/__init__.py +34 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3/_interface.py +195 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3/_xla_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3/_xla_impl_fwd.py +694 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3_turboquant/_interface.py +142 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3_turboquant/_xla_impl_fwd.py +561 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/recurrent/__init__.py +44 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/recurrent/_interface.py +781 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/recurrent/_xla_impl_bwd.py +254 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/recurrent/_xla_impl_fwd.py +361 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/reduce_scatter_matmul/__init__.py +28 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/reduce_scatter_matmul/_interface.py +86 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/reduce_scatter_matmul/_xla_impl_bwd.py +66 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/reduce_scatter_matmul/_xla_impl_fwd.py +158 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ring_attention/__init__.py +45 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ring_attention/_interface.py +393 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ring_attention/_utils.py +250 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ring_attention/_xla_impl_bwd.py +505 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/ring_attention/_xla_impl_fwd.py +571 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv4/__init__.py +45 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv4/_interface.py +94 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv4/_xla_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv4/_xla_impl_fwd.py +219 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv6/__init__.py +43 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv6/_interface.py +107 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv6/_xla_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv6/_xla_impl_fwd.py +412 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv7/__init__.py +42 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv7/_interface.py +195 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv7/_xla_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv7/_xla_impl_fwd.py +533 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/scaled_dot_product_attention/__init__.py +24 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/scaled_dot_product_attention/_interface.py +105 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/scaled_dot_product_attention/_xla_impl_bwd.py +21 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/scaled_dot_product_attention/_xla_impl_fwd.py +101 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v1/__init__.py +35 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v1/_interface.py +346 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v1/_xla_impl_bwd.py +192 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v1/_xla_impl_fwd.py +215 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v2/__init__.py +31 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v2/_interface.py +381 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v2/_xla_impl_bwd.py +195 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v2/_xla_impl_fwd.py +209 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/unified_attention/__init__.py +30 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/unified_attention/_interface.py +197 -0
- ejkernel-0.0.80/ejkernel/kernels/_xla/unified_attention/_xla_impl_fwd.py +334 -0
- ejkernel-0.0.80/ejkernel/loggings.py +606 -0
- ejkernel-0.0.80/ejkernel/modules/__init__.py +338 -0
- ejkernel-0.0.80/ejkernel/modules/base.py +279 -0
- ejkernel-0.0.80/ejkernel/modules/operations/__init__.py +338 -0
- ejkernel-0.0.80/ejkernel/modules/operations/all_gather_matmul.py +500 -0
- ejkernel-0.0.80/ejkernel/modules/operations/attention.py +490 -0
- ejkernel-0.0.80/ejkernel/modules/operations/blocksparse_attention.py +1053 -0
- ejkernel-0.0.80/ejkernel/modules/operations/chunked_prefill_paged_decode.py +561 -0
- ejkernel-0.0.80/ejkernel/modules/operations/configs.py +1158 -0
- ejkernel-0.0.80/ejkernel/modules/operations/decode_attention.py +489 -0
- ejkernel-0.0.80/ejkernel/modules/operations/deepseek_attn.py +371 -0
- ejkernel-0.0.80/ejkernel/modules/operations/flash_attention.py +958 -0
- ejkernel-0.0.80/ejkernel/modules/operations/fused_cross_entropy.py +912 -0
- ejkernel-0.0.80/ejkernel/modules/operations/fused_kl_divergence.py +633 -0
- ejkernel-0.0.80/ejkernel/modules/operations/gated_delta_rule.py +544 -0
- ejkernel-0.0.80/ejkernel/modules/operations/gated_linear_attention.py +455 -0
- ejkernel-0.0.80/ejkernel/modules/operations/grouped_matmul.py +669 -0
- ejkernel-0.0.80/ejkernel/modules/operations/kernel_delta_attention.py +388 -0
- ejkernel-0.0.80/ejkernel/modules/operations/lightning_attention.py +490 -0
- ejkernel-0.0.80/ejkernel/modules/operations/multi_head_latent_attention.py +425 -0
- ejkernel-0.0.80/ejkernel/modules/operations/multi_latent_ragged_page_attention.py +506 -0
- ejkernel-0.0.80/ejkernel/modules/operations/multi_latent_ragged_page_attention_v2.py +460 -0
- ejkernel-0.0.80/ejkernel/modules/operations/native_sparse_attention.py +495 -0
- ejkernel-0.0.80/ejkernel/modules/operations/page_attention.py +539 -0
- ejkernel-0.0.80/ejkernel/modules/operations/pooling.py +397 -0
- ejkernel-0.0.80/ejkernel/modules/operations/prefill_page_attention.py +439 -0
- ejkernel-0.0.80/ejkernel/modules/operations/quantized_matmul.py +2203 -0
- ejkernel-0.0.80/ejkernel/modules/operations/ragged_decode_attention.py +681 -0
- ejkernel-0.0.80/ejkernel/modules/operations/ragged_gated_delta_rule.py +429 -0
- ejkernel-0.0.80/ejkernel/modules/operations/ragged_page_attention_v2.py +783 -0
- ejkernel-0.0.80/ejkernel/modules/operations/ragged_page_attention_v2_turboquant.py +653 -0
- ejkernel-0.0.80/ejkernel/modules/operations/ragged_page_attention_v3.py +1344 -0
- ejkernel-0.0.80/ejkernel/modules/operations/ragged_page_attention_v3_turboquant.py +692 -0
- ejkernel-0.0.80/ejkernel/modules/operations/recurrent.py +503 -0
- ejkernel-0.0.80/ejkernel/modules/operations/reduce_scatter_matmul.py +501 -0
- ejkernel-0.0.80/ejkernel/modules/operations/ring_attention.py +635 -0
- ejkernel-0.0.80/ejkernel/modules/operations/rwkv4.py +334 -0
- ejkernel-0.0.80/ejkernel/modules/operations/rwkv6.py +347 -0
- ejkernel-0.0.80/ejkernel/modules/operations/rwkv7.py +733 -0
- ejkernel-0.0.80/ejkernel/modules/operations/scaled_dot_product_attention.py +615 -0
- ejkernel-0.0.80/ejkernel/modules/operations/state_space_v1.py +388 -0
- ejkernel-0.0.80/ejkernel/modules/operations/state_space_v2.py +445 -0
- ejkernel-0.0.80/ejkernel/modules/operations/unified_attention.py +669 -0
- ejkernel-0.0.80/ejkernel/ops/__init__.py +152 -0
- ejkernel-0.0.80/ejkernel/ops/config/__init__.py +62 -0
- ejkernel-0.0.80/ejkernel/ops/config/cache.py +199 -0
- ejkernel-0.0.80/ejkernel/ops/config/persistent.py +248 -0
- ejkernel-0.0.80/ejkernel/ops/config/selection.py +832 -0
- ejkernel-0.0.80/ejkernel/ops/core/__init__.py +58 -0
- ejkernel-0.0.80/ejkernel/ops/core/kernel.py +818 -0
- ejkernel-0.0.80/ejkernel/ops/core/types.py +50 -0
- ejkernel-0.0.80/ejkernel/ops/execution/__init__.py +106 -0
- ejkernel-0.0.80/ejkernel/ops/execution/batch.py +191 -0
- ejkernel-0.0.80/ejkernel/ops/execution/executor.py +744 -0
- ejkernel-0.0.80/ejkernel/ops/execution/offline.py +163 -0
- ejkernel-0.0.80/ejkernel/ops/execution/profiler.py +506 -0
- ejkernel-0.0.80/ejkernel/ops/execution/tuning.py +1600 -0
- ejkernel-0.0.80/ejkernel/ops/registry.py +93 -0
- ejkernel-0.0.80/ejkernel/ops/utils/__init__.py +77 -0
- ejkernel-0.0.80/ejkernel/ops/utils/datacarrier.py +176 -0
- ejkernel-0.0.80/ejkernel/ops/utils/fingerprint.py +353 -0
- ejkernel-0.0.80/ejkernel/ops/utils/meta.py +166 -0
- ejkernel-0.0.80/ejkernel/ops/utils/serialize.py +98 -0
- ejkernel-0.0.80/ejkernel/quantization/__init__.py +86 -0
- ejkernel-0.0.80/ejkernel/quantization/_quants/__init__.py +37 -0
- ejkernel-0.0.80/ejkernel/quantization/_quants/quantizations.py +1407 -0
- ejkernel-0.0.80/ejkernel/quantization/_utils/__init__.py +90 -0
- ejkernel-0.0.80/ejkernel/quantization/_utils/bitpack.py +250 -0
- ejkernel-0.0.80/ejkernel/quantization/_utils/fp_tables.py +302 -0
- ejkernel-0.0.80/ejkernel/quantization/_utils/grouping.py +220 -0
- ejkernel-0.0.80/ejkernel/quantization/_utils/qparams.py +539 -0
- ejkernel-0.0.80/ejkernel/quantization/quantized_array.py +599 -0
- ejkernel-0.0.80/ejkernel/quantization/runtime.py +164 -0
- ejkernel-0.0.80/ejkernel/quantization/turboquant/codebook.py +213 -0
- ejkernel-0.0.80/ejkernel/quantization/turboquant/matrices.py +79 -0
- ejkernel-0.0.80/ejkernel/quantization/turboquant/ops.py +219 -0
- ejkernel-0.0.80/ejkernel/quantization/turboquant/packing.py +114 -0
- ejkernel-0.0.80/ejkernel/types/__init__.py +57 -0
- ejkernel-0.0.80/ejkernel/types/mask.py +3313 -0
- ejkernel-0.0.80/ejkernel/utils.py +1210 -0
- ejkernel-0.0.80/ejkernel/xla_utils/__init__.py +122 -0
- ejkernel-0.0.80/ejkernel/xla_utils/cumsum.py +568 -0
- ejkernel-0.0.80/ejkernel/xla_utils/shardings.py +270 -0
- ejkernel-0.0.80/ejkernel/xla_utils/utils.py +376 -0
- ejkernel-0.0.80/pyproject.toml +149 -0
- ejkernel-0.0.79/PKG-INFO +0 -678
- ejkernel-0.0.79/csrc/flash_attention/CMakeLists.txt +0 -75
- ejkernel-0.0.79/csrc/flash_attention/src/flash_fwd_launch_template.h +0 -447
- ejkernel-0.0.79/csrc/quantized_matmul/CMakeLists.txt +0 -62
- ejkernel-0.0.79/csrc/quantized_matmul/src/code_gen.py +0 -378
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_cuda_impl.h +0 -2337
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_bf16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_f16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_f32.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_bf16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_f16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_f32.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_bf16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_f16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_f32.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_bf16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_f16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_f32.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_bf16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_f16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_f32.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_bf16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_f16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_f32.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_bf16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_f16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_f32.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_dispatch.h +0 -18405
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_mxfp4.cu +0 -22
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_mxfp8.cu +0 -22
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_nf4_bf16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_nf4_f16.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_nf4_f32.cu +0 -4092
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_nvfp4.cu +0 -22
- ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_nvfp8.cu +0 -22
- ejkernel-0.0.79/ejkernel/__init__.py +0 -65
- ejkernel-0.0.79/ejkernel/benchmarks.py +0 -1028
- ejkernel-0.0.79/ejkernel/build_cudalib.py +0 -30
- ejkernel-0.0.79/ejkernel/callib/__init__.py +0 -160
- ejkernel-0.0.79/ejkernel/callib/_cute_call.py +0 -537
- ejkernel-0.0.79/ejkernel/callib/_cute_ffi.py +0 -496
- ejkernel-0.0.79/ejkernel/callib/_ejit.py +0 -666
- ejkernel-0.0.79/ejkernel/callib/_pallas_call.py +0 -275
- ejkernel-0.0.79/ejkernel/callib/_triton_call.py +0 -1554
- ejkernel-0.0.79/ejkernel/callib/_utils.py +0 -273
- ejkernel-0.0.79/ejkernel/errors.py +0 -45
- ejkernel-0.0.79/ejkernel/kernels/__init__.py +0 -119
- ejkernel-0.0.79/ejkernel/kernels/_cuda/__init__.py +0 -108
- ejkernel-0.0.79/ejkernel/kernels/_cuda/blocksparse_attention/__init__.py +0 -46
- ejkernel-0.0.79/ejkernel/kernels/_cuda/blocksparse_attention/_build.py +0 -197
- ejkernel-0.0.79/ejkernel/kernels/_cuda/blocksparse_attention/_cuda_impl.py +0 -275
- ejkernel-0.0.79/ejkernel/kernels/_cuda/blocksparse_attention/_interface.py +0 -813
- ejkernel-0.0.79/ejkernel/kernels/_cuda/flash_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_cuda/flash_attention/_build.py +0 -232
- ejkernel-0.0.79/ejkernel/kernels/_cuda/flash_attention/_cuda_impl.py +0 -478
- ejkernel-0.0.79/ejkernel/kernels/_cuda/flash_attention/_interface.py +0 -783
- ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/__init__.py +0 -39
- ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/_build.py +0 -205
- ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl.py +0 -303
- ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl_bwd.py +0 -105
- ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl_fwd.py +0 -98
- ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/_interface.py +0 -277
- ejkernel-0.0.79/ejkernel/kernels/_cuda/ragged_page_attention_v3/__init__.py +0 -43
- ejkernel-0.0.79/ejkernel/kernels/_cuda/ragged_page_attention_v3/_build.py +0 -206
- ejkernel-0.0.79/ejkernel/kernels/_cuda/ragged_page_attention_v3/_cuda_impl.py +0 -178
- ejkernel-0.0.79/ejkernel/kernels/_cuda/ragged_page_attention_v3/_interface.py +0 -153
- ejkernel-0.0.79/ejkernel/kernels/_cuda/unified_attention/__init__.py +0 -49
- ejkernel-0.0.79/ejkernel/kernels/_cuda/unified_attention/_build.py +0 -199
- ejkernel-0.0.79/ejkernel/kernels/_cuda/unified_attention/_cuda_impl.py +0 -185
- ejkernel-0.0.79/ejkernel/kernels/_cuda/unified_attention/_interface.py +0 -167
- ejkernel-0.0.79/ejkernel/kernels/_cute/__init__.py +0 -32
- ejkernel-0.0.79/ejkernel/kernels/_cute/chunked_prefill_paged_decode/__init__.py +0 -24
- ejkernel-0.0.79/ejkernel/kernels/_cute/chunked_prefill_paged_decode/_cute_impl_fwd.py +0 -498
- ejkernel-0.0.79/ejkernel/kernels/_cute/chunked_prefill_paged_decode/_interface.py +0 -105
- ejkernel-0.0.79/ejkernel/kernels/_cute/flash_attention/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_cute/flash_attention/_cute_impl.py +0 -1443
- ejkernel-0.0.79/ejkernel/kernels/_cute/flash_attention/_interface.py +0 -516
- ejkernel-0.0.79/ejkernel/kernels/_cute/quantized_matmul/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_cute/quantized_matmul/_cute_impl.py +0 -3317
- ejkernel-0.0.79/ejkernel/kernels/_cute/quantized_matmul/_cute_impl_bwd.py +0 -119
- ejkernel-0.0.79/ejkernel/kernels/_cute/quantized_matmul/_cute_impl_fwd.py +0 -121
- ejkernel-0.0.79/ejkernel/kernels/_cute/quantized_matmul/_interface.py +0 -309
- ejkernel-0.0.79/ejkernel/kernels/_cute/ragged_page_attention_v3/__init__.py +0 -15
- ejkernel-0.0.79/ejkernel/kernels/_cute/ragged_page_attention_v3/_cute_impl_fwd.py +0 -15
- ejkernel-0.0.79/ejkernel/kernels/_cute/ragged_page_attention_v3/_interface.py +0 -15
- ejkernel-0.0.79/ejkernel/kernels/_cute/unified_attention/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_cute/unified_attention/_cute_impl.py +0 -163
- ejkernel-0.0.79/ejkernel/kernels/_cute/unified_attention/_interface.py +0 -122
- ejkernel-0.0.79/ejkernel/kernels/_pallas/__init__.py +0 -38
- ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/_interface.py +0 -123
- ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/_pallas_impl_fwd.py +0 -411
- ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/__init__.py +0 -25
- ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_interface.py +0 -122
- ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_pallas_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_pallas_impl_fwd.py +0 -124
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/__init__.py +0 -80
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/all_gather_matmul/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/all_gather_matmul/_interface.py +0 -181
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/all_gather_matmul/_pallas_impl.py +0 -636
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/blocksparse_attention/__init__.py +0 -131
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_info.py +0 -1066
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_interface.py +0 -110
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_kernel.py +0 -2869
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_masks.py +0 -640
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/deepseek_attn/__init__.py +0 -27
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/deepseek_attn/_interface.py +0 -159
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/deepseek_attn/_pallas_impl_bwd.py +0 -97
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/deepseek_attn/_pallas_impl_fwd.py +0 -259
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_attention/__init__.py +0 -36
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_attention/_interface.py +0 -385
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_attention/_pallas_impl_bwd.py +0 -887
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_attention/_pallas_impl_fwd.py +0 -664
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_attention/_utils.py +0 -590
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_mla/__init__.py +0 -33
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_mla/_interface.py +0 -212
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_mla/_pallas_impl_bwd.py +0 -858
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_mla/_pallas_impl_fwd.py +0 -586
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_mla/_utils.py +0 -71
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_interface.py +0 -132
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_pallas_impl_bwd.py +0 -528
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_pallas_impl_fwd.py +0 -640
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmul/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmul/_interface.py +0 -298
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmul/_pallas_impl.py +0 -996
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmul/_utils.py +0 -191
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/_interface.py +0 -311
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/_pallas_impl.py +0 -627
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/_interface.py +0 -296
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/_pallas_impl.py +0 -995
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/_interface.py +0 -143
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/_pallas_impl_fwd.py +0 -1329
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/_interface.py +0 -177
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/_pallas_impl_fwd.py +0 -1407
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/page_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/page_attention/_interface.py +0 -352
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/page_attention/_pallas_impl_fwd.py +0 -427
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/prefill_page_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/prefill_page_attention/_interface.py +0 -227
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/prefill_page_attention/_pallas_impl_fwd.py +0 -438
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/quantized_matmul/__init__.py +0 -23
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/quantized_matmul/_interface.py +0 -741
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_bwd.py +0 -473
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_core.py +0 -1014
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_fwd.py +0 -332
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/_interface.py +0 -125
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/_pallas_impl_fwd.py +0 -335
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/__init__.py +0 -25
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/_interface.py +0 -145
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/_pallas_impl_fwd.py +0 -140
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_interface.py +0 -339
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_pallas_impl_fwd.py +0 -813
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_utils.py +0 -561
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_interface.py +0 -205
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_pallas_impl_fwd.py +0 -1789
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_pallas_impl_fwd_h64.py +0 -1611
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_utils.py +0 -4792
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/_interface.py +0 -121
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/_pallas_impl.py +0 -782
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ring_attention/__init__.py +0 -48
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ring_attention/_interface.py +0 -102
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ring_attention/_pallas_impl_bwd.py +0 -921
- ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ring_attention/_pallas_impl_fwd.py +0 -458
- ejkernel-0.0.79/ejkernel/kernels/_registry.py +0 -720
- ejkernel-0.0.79/ejkernel/kernels/_triton/__init__.py +0 -79
- ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/__init__.py +0 -33
- ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/_interface.py +0 -462
- ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/_mask.py +0 -585
- ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/_triton_impl_bwd.py +0 -1475
- ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/_triton_impl_fwd.py +0 -753
- ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/_utilities.py +0 -606
- ejkernel-0.0.79/ejkernel/kernels/_triton/chunked_prefill_paged_decode/__init__.py +0 -31
- ejkernel-0.0.79/ejkernel/kernels/_triton/chunked_prefill_paged_decode/_interface.py +0 -113
- ejkernel-0.0.79/ejkernel/kernels/_triton/chunked_prefill_paged_decode/_triton_impl_fwd.py +0 -244
- ejkernel-0.0.79/ejkernel/kernels/_triton/decode_attention/__init__.py +0 -23
- ejkernel-0.0.79/ejkernel/kernels/_triton/decode_attention/_interface.py +0 -96
- ejkernel-0.0.79/ejkernel/kernels/_triton/decode_attention/_triton_impl_fwd.py +0 -456
- ejkernel-0.0.79/ejkernel/kernels/_triton/flash_attention/__init__.py +0 -36
- ejkernel-0.0.79/ejkernel/kernels/_triton/flash_attention/_interface.py +0 -446
- ejkernel-0.0.79/ejkernel/kernels/_triton/flash_attention/_triton_impl_bwd.py +0 -1743
- ejkernel-0.0.79/ejkernel/kernels/_triton/flash_attention/_triton_impl_fwd.py +0 -1050
- ejkernel-0.0.79/ejkernel/kernels/_triton/flash_attention/_utilities.py +0 -472
- ejkernel-0.0.79/ejkernel/kernels/_triton/flash_mla/__init__.py +0 -25
- ejkernel-0.0.79/ejkernel/kernels/_triton/flash_mla/_interface.py +0 -155
- ejkernel-0.0.79/ejkernel/kernels/_triton/flash_mla/_triton_impl_bwd.py +0 -40
- ejkernel-0.0.79/ejkernel/kernels/_triton/flash_mla/_triton_impl_fwd.py +0 -218
- ejkernel-0.0.79/ejkernel/kernels/_triton/flash_mla/_utilities.py +0 -39
- ejkernel-0.0.79/ejkernel/kernels/_triton/gla/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_triton/gla/_interface.py +0 -78
- ejkernel-0.0.79/ejkernel/kernels/_triton/gla/_triton_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_triton/gla/_triton_impl_fwd.py +0 -131
- ejkernel-0.0.79/ejkernel/kernels/_triton/lightning_attn/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_triton/lightning_attn/_interface.py +0 -81
- ejkernel-0.0.79/ejkernel/kernels/_triton/lightning_attn/_triton_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_triton/lightning_attn/_triton_impl_fwd.py +0 -138
- ejkernel-0.0.79/ejkernel/kernels/_triton/mean_pooling/__init__.py +0 -42
- ejkernel-0.0.79/ejkernel/kernels/_triton/mean_pooling/_interface.py +0 -164
- ejkernel-0.0.79/ejkernel/kernels/_triton/mean_pooling/_triton_impl_bwd.py +0 -161
- ejkernel-0.0.79/ejkernel/kernels/_triton/mean_pooling/_triton_impl_fwd.py +0 -157
- ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/__init__.py +0 -37
- ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/_compression.py +0 -753
- ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/_interface.py +0 -404
- ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/_triton_impl_bwd.py +0 -441
- ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/_triton_impl_fwd.py +0 -473
- ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/_utilities.py +0 -123
- ejkernel-0.0.79/ejkernel/kernels/_triton/page_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_triton/page_attention/_interface.py +0 -379
- ejkernel-0.0.79/ejkernel/kernels/_triton/page_attention/_triton_impl_fwd.py +0 -352
- ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/_interface.py +0 -369
- ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/_triton_impl.py +0 -2945
- ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_bwd.py +0 -114
- ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_fwd.py +0 -268
- ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_gemv.py +0 -572
- ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_decode_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_decode_attention/_interface.py +0 -173
- ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_decode_attention/_triton_impl_fwd.py +0 -392
- ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v2/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v2/_interface.py +0 -228
- ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v2/_triton_impl_fwd.py +0 -796
- ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v3/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v3/_interface.py +0 -205
- ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v3/_triton_impl_fwd.py +0 -652
- ejkernel-0.0.79/ejkernel/kernels/_triton/recurrent/__init__.py +0 -42
- ejkernel-0.0.79/ejkernel/kernels/_triton/recurrent/_interface.py +0 -291
- ejkernel-0.0.79/ejkernel/kernels/_triton/recurrent/_triton_impl_bwd.py +0 -411
- ejkernel-0.0.79/ejkernel/kernels/_triton/recurrent/_triton_impl_fwd.py +0 -272
- ejkernel-0.0.79/ejkernel/kernels/_triton/ring_attention/__init__.py +0 -33
- ejkernel-0.0.79/ejkernel/kernels/_triton/ring_attention/_interface.py +0 -154
- ejkernel-0.0.79/ejkernel/kernels/_triton/ring_attention/_triton_impl_bwd.py +0 -780
- ejkernel-0.0.79/ejkernel/kernels/_triton/ring_attention/_triton_impl_fwd.py +0 -226
- ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv4/__init__.py +0 -32
- ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv4/_interface.py +0 -221
- ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv4/_triton_impl_bwd.py +0 -205
- ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv4/_triton_impl_fwd.py +0 -258
- ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv6/__init__.py +0 -33
- ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv6/_interface.py +0 -498
- ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv6/_triton_impl_fwd.py +0 -221
- ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv7/__init__.py +0 -34
- ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv7/_interface.py +0 -585
- ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv7/_triton_impl_fwd.py +0 -237
- ejkernel-0.0.79/ejkernel/kernels/_triton/unified_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_triton/unified_attention/_interface.py +0 -159
- ejkernel-0.0.79/ejkernel/kernels/_triton/unified_attention/_triton_impl_fwd.py +0 -1029
- ejkernel-0.0.79/ejkernel/kernels/_xla/__init__.py +0 -147
- ejkernel-0.0.79/ejkernel/kernels/_xla/all_gather_matmul/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_xla/all_gather_matmul/_interface.py +0 -55
- ejkernel-0.0.79/ejkernel/kernels/_xla/all_gather_matmul/_xla_impl_bwd.py +0 -50
- ejkernel-0.0.79/ejkernel/kernels/_xla/all_gather_matmul/_xla_impl_fwd.py +0 -136
- ejkernel-0.0.79/ejkernel/kernels/_xla/attention/__init__.py +0 -31
- ejkernel-0.0.79/ejkernel/kernels/_xla/attention/_interface.py +0 -134
- ejkernel-0.0.79/ejkernel/kernels/_xla/attention/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/attention/_xla_impl_fwd.py +0 -332
- ejkernel-0.0.79/ejkernel/kernels/_xla/blocksparse_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_xla/blocksparse_attention/_interface.py +0 -144
- ejkernel-0.0.79/ejkernel/kernels/_xla/blocksparse_attention/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/blocksparse_attention/_xla_impl_fwd.py +0 -478
- ejkernel-0.0.79/ejkernel/kernels/_xla/chunked_prefill_paged_decode/__init__.py +0 -36
- ejkernel-0.0.79/ejkernel/kernels/_xla/chunked_prefill_paged_decode/_interface.py +0 -113
- ejkernel-0.0.79/ejkernel/kernels/_xla/chunked_prefill_paged_decode/_xla_impl_fwd.py +0 -237
- ejkernel-0.0.79/ejkernel/kernels/_xla/decode_attention/__init__.py +0 -34
- ejkernel-0.0.79/ejkernel/kernels/_xla/decode_attention/_interface.py +0 -87
- ejkernel-0.0.79/ejkernel/kernels/_xla/decode_attention/_xla_impl_fwd.py +0 -167
- ejkernel-0.0.79/ejkernel/kernels/_xla/deepseek_attn/__init__.py +0 -28
- ejkernel-0.0.79/ejkernel/kernels/_xla/deepseek_attn/_interface.py +0 -239
- ejkernel-0.0.79/ejkernel/kernels/_xla/deepseek_attn/_xla_impl_bwd.py +0 -117
- ejkernel-0.0.79/ejkernel/kernels/_xla/deepseek_attn/_xla_impl_fwd.py +0 -186
- ejkernel-0.0.79/ejkernel/kernels/_xla/flash_attention/__init__.py +0 -40
- ejkernel-0.0.79/ejkernel/kernels/_xla/flash_attention/_interface.py +0 -668
- ejkernel-0.0.79/ejkernel/kernels/_xla/flash_attention/_xla_impl_bwd.py +0 -150
- ejkernel-0.0.79/ejkernel/kernels/_xla/flash_attention/_xla_impl_fwd.py +0 -742
- ejkernel-0.0.79/ejkernel/kernels/_xla/flash_mla/__init__.py +0 -23
- ejkernel-0.0.79/ejkernel/kernels/_xla/flash_mla/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/flash_mla/_xla_impl_fwd.py +0 -482
- ejkernel-0.0.79/ejkernel/kernels/_xla/gated_delta_rule/__init__.py +0 -35
- ejkernel-0.0.79/ejkernel/kernels/_xla/gated_delta_rule/_interface.py +0 -158
- ejkernel-0.0.79/ejkernel/kernels/_xla/gated_delta_rule/_xla_impl_bwd.py +0 -307
- ejkernel-0.0.79/ejkernel/kernels/_xla/gated_delta_rule/_xla_impl_fwd.py +0 -634
- ejkernel-0.0.79/ejkernel/kernels/_xla/gla/__init__.py +0 -34
- ejkernel-0.0.79/ejkernel/kernels/_xla/gla/_interface.py +0 -96
- ejkernel-0.0.79/ejkernel/kernels/_xla/gla/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/gla/_xla_impl_fwd.py +0 -124
- ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmul/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmul/_interface.py +0 -120
- ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmul/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmul/_xla_impl_fwd.py +0 -143
- ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmulv3/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmulv3/_interface.py +0 -356
- ejkernel-0.0.79/ejkernel/kernels/_xla/kernel_delta_attention/__init__.py +0 -43
- ejkernel-0.0.79/ejkernel/kernels/_xla/kernel_delta_attention/_interface.py +0 -223
- ejkernel-0.0.79/ejkernel/kernels/_xla/kernel_delta_attention/_xla_impl_fwd.py +0 -467
- ejkernel-0.0.79/ejkernel/kernels/_xla/lightning_attn/__init__.py +0 -34
- ejkernel-0.0.79/ejkernel/kernels/_xla/lightning_attn/_interface.py +0 -100
- ejkernel-0.0.79/ejkernel/kernels/_xla/lightning_attn/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/lightning_attn/_xla_impl_fwd.py +0 -130
- ejkernel-0.0.79/ejkernel/kernels/_xla/mean_pooling/__init__.py +0 -29
- ejkernel-0.0.79/ejkernel/kernels/_xla/mean_pooling/_interface.py +0 -66
- ejkernel-0.0.79/ejkernel/kernels/_xla/mean_pooling/_xla_impl_bwd.py +0 -56
- ejkernel-0.0.79/ejkernel/kernels/_xla/mean_pooling/_xla_impl_fwd.py +0 -172
- ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/_interface.py +0 -90
- ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/_xla_impl_fwd.py +0 -489
- ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/_interface.py +0 -120
- ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/_xla_impl_fwd.py +0 -117
- ejkernel-0.0.79/ejkernel/kernels/_xla/native_sparse_attention/__init__.py +0 -48
- ejkernel-0.0.79/ejkernel/kernels/_xla/native_sparse_attention/_interface.py +0 -577
- ejkernel-0.0.79/ejkernel/kernels/_xla/native_sparse_attention/_xla_impl_bwd.py +0 -241
- ejkernel-0.0.79/ejkernel/kernels/_xla/native_sparse_attention/_xla_impl_fwd.py +0 -172
- ejkernel-0.0.79/ejkernel/kernels/_xla/page_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_xla/page_attention/_interface.py +0 -158
- ejkernel-0.0.79/ejkernel/kernels/_xla/page_attention/_xla_impl_fwd.py +0 -161
- ejkernel-0.0.79/ejkernel/kernels/_xla/prefill_page_attention/__init__.py +0 -77
- ejkernel-0.0.79/ejkernel/kernels/_xla/prefill_page_attention/_impl.py +0 -169
- ejkernel-0.0.79/ejkernel/kernels/_xla/prefill_page_attention/_interface.py +0 -102
- ejkernel-0.0.79/ejkernel/kernels/_xla/quantized_matmul/__init__.py +0 -33
- ejkernel-0.0.79/ejkernel/kernels/_xla/quantized_matmul/_interface.py +0 -129
- ejkernel-0.0.79/ejkernel/kernels/_xla/quantized_matmul/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/quantized_matmul/_xla_impl_fwd.py +0 -744
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_decode_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_decode_attention/_interface.py +0 -125
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_decode_attention/_xla_impl_fwd.py +0 -536
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_gated_delta_rule/__init__.py +0 -25
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_gated_delta_rule/_interface.py +0 -117
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_gated_delta_rule/_xla_impl_fwd.py +0 -503
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v2/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v2/_interface.py +0 -115
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v2/_xla_impl_fwd.py +0 -315
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v2_turboquant/_interface.py +0 -124
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v2_turboquant/_xla_impl_fwd.py +0 -422
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3/_interface.py +0 -195
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3/_xla_impl_fwd.py +0 -678
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3_turboquant/_interface.py +0 -144
- ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3_turboquant/_xla_impl_fwd.py +0 -580
- ejkernel-0.0.79/ejkernel/kernels/_xla/recurrent/__init__.py +0 -42
- ejkernel-0.0.79/ejkernel/kernels/_xla/recurrent/_interface.py +0 -639
- ejkernel-0.0.79/ejkernel/kernels/_xla/recurrent/_xla_impl_bwd.py +0 -246
- ejkernel-0.0.79/ejkernel/kernels/_xla/recurrent/_xla_impl_fwd.py +0 -347
- ejkernel-0.0.79/ejkernel/kernels/_xla/reduce_scatter_matmul/__init__.py +0 -19
- ejkernel-0.0.79/ejkernel/kernels/_xla/reduce_scatter_matmul/_interface.py +0 -55
- ejkernel-0.0.79/ejkernel/kernels/_xla/reduce_scatter_matmul/_xla_impl_bwd.py +0 -40
- ejkernel-0.0.79/ejkernel/kernels/_xla/reduce_scatter_matmul/_xla_impl_fwd.py +0 -112
- ejkernel-0.0.79/ejkernel/kernels/_xla/ring_attention/__init__.py +0 -45
- ejkernel-0.0.79/ejkernel/kernels/_xla/ring_attention/_interface.py +0 -376
- ejkernel-0.0.79/ejkernel/kernels/_xla/ring_attention/_utils.py +0 -199
- ejkernel-0.0.79/ejkernel/kernels/_xla/ring_attention/_xla_impl_bwd.py +0 -471
- ejkernel-0.0.79/ejkernel/kernels/_xla/ring_attention/_xla_impl_fwd.py +0 -528
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv4/__init__.py +0 -41
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv4/_interface.py +0 -88
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv4/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv4/_xla_impl_fwd.py +0 -219
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv6/__init__.py +0 -41
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv6/_interface.py +0 -103
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv6/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv6/_xla_impl_fwd.py +0 -422
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv7/__init__.py +0 -42
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv7/_interface.py +0 -178
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv7/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv7/_xla_impl_fwd.py +0 -536
- ejkernel-0.0.79/ejkernel/kernels/_xla/scaled_dot_product_attention/__init__.py +0 -24
- ejkernel-0.0.79/ejkernel/kernels/_xla/scaled_dot_product_attention/_interface.py +0 -97
- ejkernel-0.0.79/ejkernel/kernels/_xla/scaled_dot_product_attention/_xla_impl_bwd.py +0 -21
- ejkernel-0.0.79/ejkernel/kernels/_xla/scaled_dot_product_attention/_xla_impl_fwd.py +0 -100
- ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v1/__init__.py +0 -35
- ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v1/_interface.py +0 -339
- ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v1/_xla_impl_bwd.py +0 -216
- ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v1/_xla_impl_fwd.py +0 -234
- ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v2/__init__.py +0 -31
- ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v2/_interface.py +0 -372
- ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v2/_xla_impl_bwd.py +0 -232
- ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v2/_xla_impl_fwd.py +0 -231
- ejkernel-0.0.79/ejkernel/kernels/_xla/unified_attention/__init__.py +0 -30
- ejkernel-0.0.79/ejkernel/kernels/_xla/unified_attention/_interface.py +0 -195
- ejkernel-0.0.79/ejkernel/kernels/_xla/unified_attention/_xla_impl_fwd.py +0 -332
- ejkernel-0.0.79/ejkernel/loggings.py +0 -588
- ejkernel-0.0.79/ejkernel/modules/__init__.py +0 -311
- ejkernel-0.0.79/ejkernel/modules/base.py +0 -271
- ejkernel-0.0.79/ejkernel/modules/operations/__init__.py +0 -299
- ejkernel-0.0.79/ejkernel/modules/operations/all_gather_matmul.py +0 -371
- ejkernel-0.0.79/ejkernel/modules/operations/attention.py +0 -337
- ejkernel-0.0.79/ejkernel/modules/operations/blocksparse_attention.py +0 -1013
- ejkernel-0.0.79/ejkernel/modules/operations/chunked_prefill_paged_decode.py +0 -469
- ejkernel-0.0.79/ejkernel/modules/operations/configs.py +0 -1027
- ejkernel-0.0.79/ejkernel/modules/operations/decode_attention.py +0 -393
- ejkernel-0.0.79/ejkernel/modules/operations/deepseek_attn.py +0 -294
- ejkernel-0.0.79/ejkernel/modules/operations/flash_attention.py +0 -941
- ejkernel-0.0.79/ejkernel/modules/operations/gated_delta_rule.py +0 -481
- ejkernel-0.0.79/ejkernel/modules/operations/gated_linear_attention.py +0 -368
- ejkernel-0.0.79/ejkernel/modules/operations/grouped_matmul.py +0 -527
- ejkernel-0.0.79/ejkernel/modules/operations/kernel_delta_attention.py +0 -360
- ejkernel-0.0.79/ejkernel/modules/operations/lightning_attention.py +0 -383
- ejkernel-0.0.79/ejkernel/modules/operations/multi_head_latent_attention.py +0 -388
- ejkernel-0.0.79/ejkernel/modules/operations/multi_latent_ragged_page_attention.py +0 -448
- ejkernel-0.0.79/ejkernel/modules/operations/multi_latent_ragged_page_attention_v2.py +0 -375
- ejkernel-0.0.79/ejkernel/modules/operations/native_sparse_attention.py +0 -460
- ejkernel-0.0.79/ejkernel/modules/operations/page_attention.py +0 -451
- ejkernel-0.0.79/ejkernel/modules/operations/pooling.py +0 -293
- ejkernel-0.0.79/ejkernel/modules/operations/prefill_page_attention.py +0 -359
- ejkernel-0.0.79/ejkernel/modules/operations/quantized_matmul.py +0 -1821
- ejkernel-0.0.79/ejkernel/modules/operations/ragged_decode_attention.py +0 -616
- ejkernel-0.0.79/ejkernel/modules/operations/ragged_gated_delta_rule.py +0 -357
- ejkernel-0.0.79/ejkernel/modules/operations/ragged_page_attention_v2.py +0 -763
- ejkernel-0.0.79/ejkernel/modules/operations/ragged_page_attention_v2_turboquant.py +0 -568
- ejkernel-0.0.79/ejkernel/modules/operations/ragged_page_attention_v3.py +0 -1424
- ejkernel-0.0.79/ejkernel/modules/operations/ragged_page_attention_v3_turboquant.py +0 -615
- ejkernel-0.0.79/ejkernel/modules/operations/recurrent.py +0 -388
- ejkernel-0.0.79/ejkernel/modules/operations/reduce_scatter_matmul.py +0 -326
- ejkernel-0.0.79/ejkernel/modules/operations/ring_attention.py +0 -591
- ejkernel-0.0.79/ejkernel/modules/operations/rwkv4.py +0 -283
- ejkernel-0.0.79/ejkernel/modules/operations/rwkv6.py +0 -329
- ejkernel-0.0.79/ejkernel/modules/operations/rwkv7.py +0 -604
- ejkernel-0.0.79/ejkernel/modules/operations/scaled_dot_product_attention.py +0 -456
- ejkernel-0.0.79/ejkernel/modules/operations/state_space_v1.py +0 -341
- ejkernel-0.0.79/ejkernel/modules/operations/state_space_v2.py +0 -379
- ejkernel-0.0.79/ejkernel/modules/operations/unified_attention.py +0 -559
- ejkernel-0.0.79/ejkernel/ops/__init__.py +0 -152
- ejkernel-0.0.79/ejkernel/ops/config/__init__.py +0 -55
- ejkernel-0.0.79/ejkernel/ops/config/cache.py +0 -187
- ejkernel-0.0.79/ejkernel/ops/config/persistent.py +0 -233
- ejkernel-0.0.79/ejkernel/ops/config/selection.py +0 -804
- ejkernel-0.0.79/ejkernel/ops/core/__init__.py +0 -58
- ejkernel-0.0.79/ejkernel/ops/core/kernel.py +0 -759
- ejkernel-0.0.79/ejkernel/ops/core/types.py +0 -50
- ejkernel-0.0.79/ejkernel/ops/execution/__init__.py +0 -87
- ejkernel-0.0.79/ejkernel/ops/execution/batch.py +0 -191
- ejkernel-0.0.79/ejkernel/ops/execution/executor.py +0 -711
- ejkernel-0.0.79/ejkernel/ops/execution/offline.py +0 -144
- ejkernel-0.0.79/ejkernel/ops/execution/profiler.py +0 -506
- ejkernel-0.0.79/ejkernel/ops/execution/tuning.py +0 -1538
- ejkernel-0.0.79/ejkernel/ops/registry.py +0 -93
- ejkernel-0.0.79/ejkernel/ops/utils/__init__.py +0 -77
- ejkernel-0.0.79/ejkernel/ops/utils/datacarrier.py +0 -167
- ejkernel-0.0.79/ejkernel/ops/utils/fingerprint.py +0 -344
- ejkernel-0.0.79/ejkernel/ops/utils/meta.py +0 -160
- ejkernel-0.0.79/ejkernel/ops/utils/serialize.py +0 -98
- ejkernel-0.0.79/ejkernel/quantization/__init__.py +0 -86
- ejkernel-0.0.79/ejkernel/quantization/_quants/__init__.py +0 -37
- ejkernel-0.0.79/ejkernel/quantization/_quants/quantizations.py +0 -1151
- ejkernel-0.0.79/ejkernel/quantization/_utils/__init__.py +0 -90
- ejkernel-0.0.79/ejkernel/quantization/_utils/bitpack.py +0 -251
- ejkernel-0.0.79/ejkernel/quantization/_utils/fp_tables.py +0 -255
- ejkernel-0.0.79/ejkernel/quantization/_utils/grouping.py +0 -184
- ejkernel-0.0.79/ejkernel/quantization/_utils/qparams.py +0 -529
- ejkernel-0.0.79/ejkernel/quantization/quantized_array.py +0 -374
- ejkernel-0.0.79/ejkernel/quantization/runtime.py +0 -82
- ejkernel-0.0.79/ejkernel/quantization/turboquant/codebook.py +0 -190
- ejkernel-0.0.79/ejkernel/quantization/turboquant/matrices.py +0 -81
- ejkernel-0.0.79/ejkernel/quantization/turboquant/ops.py +0 -229
- ejkernel-0.0.79/ejkernel/quantization/turboquant/packing.py +0 -117
- ejkernel-0.0.79/ejkernel/types/__init__.py +0 -57
- ejkernel-0.0.79/ejkernel/types/mask.py +0 -3366
- ejkernel-0.0.79/ejkernel/utils.py +0 -1137
- ejkernel-0.0.79/ejkernel/xla_utils/__init__.py +0 -122
- ejkernel-0.0.79/ejkernel/xla_utils/cumsum.py +0 -568
- ejkernel-0.0.79/ejkernel/xla_utils/shardings.py +0 -270
- ejkernel-0.0.79/ejkernel/xla_utils/utils.py +0 -376
- ejkernel-0.0.79/pyproject.toml +0 -135
- {ejkernel-0.0.79 → ejkernel-0.0.80}/README.md +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/CMakeLists.txt +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_attention.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_attention_ffi.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_attention_kernel.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_attention_launch_template.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/code_gen.py +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/code_gen.py +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/block.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/code_gen.py +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/copy_sm90_bulk_reduce.hpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/cuda_check.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/epilogue_bwd.hpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/epilogue_fwd.hpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_api.cpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_api_stable.cpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_bwd_kernel_sm80.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_bwd_kernel_sm90.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_bwd_launch_template.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_bwd_postprocess_kernel.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_bwd_preprocess_kernel.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_combine.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_combine_kernel.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_combine_launch_template.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_kernel_sm80.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_kernel_sm90.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_launch_template.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_prepare_scheduler.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/heuristics.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcapall_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcapall_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcapall_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcapall_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcapall_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcapall_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcapall_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcapall_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcapall_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcapall_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_packgqa_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_packgqa_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_packgqa_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_packgqa_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcapall_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_paged_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_paged_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_paged_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_paged_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_softcap_packgqa_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_split_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_split_softcap_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/mainloop_bwd_sm80.hpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/mainloop_bwd_sm90_tma_gmma_ws.hpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/mainloop_fwd_sm80.hpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/mainloop_fwd_sm90_tma_gmma_ws.hpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/mask.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/named_barrier.hpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/pack_gqa.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/paged_kv.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/rotary.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/seqlen.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/sm90_pipeline_no_cluster.hpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/softmax.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/static_switch.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/tile_scheduler.hpp +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/tile_size.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/utils.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/include/c10/cuda/CUDAException.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/include/ejkernel_flash_attention.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/include/ejkernel_flash_attention_cutlass.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/alibi.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/aten_shim.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/block_info.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/code_gen.py +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/dropout.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_attention_ffi.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_kernel.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_launch_template.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_preprocess_kernel.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_kernel.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_causal_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_causal_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_causal_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_causal_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_causal_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/hardware_info.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/kernel_traits.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/mask.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/namespace_config.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/philox.cuh +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/philox_unpack.cuh +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/rotary.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/softmax.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/static_switch.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/utils.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/quantized_matmul/src/qmm_cuda.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/quantized_matmul/src/qmm_dequant_kernels.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/CMakeLists.txt +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/code_gen.py +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_ffi.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp32_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp32_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp32_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp32_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp32_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp32_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp32_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp32_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp32_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp32_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp32_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp32_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp32_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp32_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp32_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp32_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp32_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp32_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp32_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp32_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp32_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp32_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp32_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp32_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp32_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp32_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp32_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp32_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp32_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp32_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_kernel.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_launch_template.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/CMakeLists.txt +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/code_gen.py +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_cuda.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_bf16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_bf16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_bf16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_bf16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_bf16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_fp16_sm100.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_fp16_sm110.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_fp16_sm120.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_fp16_sm80.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_fp16_sm90.cu +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_kernel.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_launch_template.h +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/ejkernel/kernels/_pallas/tpu/gated_delta_rule/__init__.py +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/ejkernel/kernels/_xla/flash_mla/_interface.py +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/ejkernel/kernels/_xla/ragged_page_attention_v2_turboquant/__init__.py +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/ejkernel/kernels/_xla/ragged_page_attention_v3_turboquant/__init__.py +0 -0
- {ejkernel-0.0.79 → ejkernel-0.0.80}/ejkernel/quantization/turboquant/__init__.py +0 -0
ejkernel-0.0.80/PKG-INFO
ADDED
|
@@ -0,0 +1,687 @@
|
|
|
1
|
+
Metadata-Version: 2.3
|
|
2
|
+
Name: ejkernel
|
|
3
|
+
Version: 0.0.80
|
|
4
|
+
Summary: Accelerate, Optimize performance with streamlined training and serving options with JAX.
|
|
5
|
+
Keywords: Deep Learning,Machine Learning,JAX,CUDA,XLA,Triton,Pallas
|
|
6
|
+
Author: Erfan Zare Chavoshi
|
|
7
|
+
Author-email: Erfan Zare Chavoshi <Erfanzare810@gmail.com>
|
|
8
|
+
License: Apache-2.0
|
|
9
|
+
Classifier: Development Status :: 3 - Alpha
|
|
10
|
+
Classifier: Intended Audience :: Developers
|
|
11
|
+
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
|
|
12
|
+
Classifier: License :: OSI Approved :: Apache Software License
|
|
13
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
14
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
15
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
16
|
+
Requires-Dist: beartype>=0.22.2
|
|
17
|
+
Requires-Dist: chex>=0.1.91
|
|
18
|
+
Requires-Dist: einops>=0.8.1
|
|
19
|
+
Requires-Dist: jax~=0.10.0
|
|
20
|
+
Requires-Dist: jaxlib~=0.10.0
|
|
21
|
+
Requires-Dist: jaxtyping>=0.3.2
|
|
22
|
+
Requires-Dist: pydantic>=2.11.10
|
|
23
|
+
Requires-Dist: tqdm>=4.67.1
|
|
24
|
+
Requires-Dist: jax[cuda13]~=0.10.0 ; extra == 'cuda'
|
|
25
|
+
Requires-Dist: jax-cuda13-plugin[with-cuda]~=0.10.0 ; extra == 'cuda'
|
|
26
|
+
Requires-Dist: jax-cuda13-pjrt~=0.10.0 ; extra == 'cuda'
|
|
27
|
+
Requires-Dist: triton==3.6.0 ; extra == 'cuda'
|
|
28
|
+
Requires-Dist: nvidia-cutlass-dsl[cu13]==4.4.0.dev1 ; extra == 'cuda'
|
|
29
|
+
Requires-Dist: nvidia-cutlass-dsl-libs-base==4.4.0.dev1 ; extra == 'cuda'
|
|
30
|
+
Requires-Dist: nvidia-cutlass-dsl-libs-cu13==4.4.0.dev1 ; extra == 'cuda'
|
|
31
|
+
Requires-Dist: jax-tvm-ffi==0.1.2 ; extra == 'cuda'
|
|
32
|
+
Requires-Dist: apache-tvm-ffi==0.1.8.post2 ; extra == 'cuda'
|
|
33
|
+
Requires-Dist: eformer ; extra == 'dev'
|
|
34
|
+
Requires-Dist: xprof>=2.20.6 ; extra == 'profile'
|
|
35
|
+
Requires-Dist: tb-nightly>=2.21.0a20250820 ; extra == 'profile'
|
|
36
|
+
Requires-Dist: xprof-nightly>=2.21.6a20250820 ; extra == 'profile'
|
|
37
|
+
Requires-Dist: tilelang==0.1.9 ; extra == 'tilelang'
|
|
38
|
+
Requires-Dist: apache-tvm-ffi==0.1.8.post2 ; extra == 'tilelang'
|
|
39
|
+
Requires-Dist: nvidia-cudnn-cu12==9.12.0.46 ; extra == 'tilelang'
|
|
40
|
+
Requires-Dist: jax[tpu]~=0.10.0 ; extra == 'tpu'
|
|
41
|
+
Requires-Dist: triton==3.6.0 ; extra == 'triton'
|
|
42
|
+
Requires-Python: >=3.11, <3.14
|
|
43
|
+
Project-URL: Documentation, https://ejkernel.readthedocs.io/en/latest/
|
|
44
|
+
Project-URL: Homepage, https://github.com/erfanzar/ejkernel
|
|
45
|
+
Project-URL: Repository, https://github.com/erfanzar/ejkernel
|
|
46
|
+
Provides-Extra: cuda
|
|
47
|
+
Provides-Extra: dev
|
|
48
|
+
Provides-Extra: profile
|
|
49
|
+
Provides-Extra: tilelang
|
|
50
|
+
Provides-Extra: tpu
|
|
51
|
+
Provides-Extra: triton
|
|
52
|
+
Description-Content-Type: text/markdown
|
|
53
|
+
|
|
54
|
+
# ejKernel: High-Performance JAX Kernels for Deep Learning
|
|
55
|
+
|
|
56
|
+
> _"The best optimization is the one you don't have to think about."_
|
|
57
|
+
|
|
58
|
+
[License](https://opensource.org/licenses/Apache-2.0)
|
|
59
|
+
[Python](https://www.python.org/downloads/)
|
|
60
|
+
[JAX](https://github.com/google/jax)
|
|
61
|
+
[Documentation](https://ejkernel.readthedocs.io/en/latest/)
|
|
62
|
+
|
|
63
|
+
ejKernel is a production-grade kernel library for JAX that provides highly optimized implementations of deep learning operations with automatic multi-backend support. The library features a sophisticated configuration management system with autotuning, comprehensive type safety, and seamless execution across GPUs, TPUs, and CPUs.
|
|
64
|
+
|
|
65
|
+
> [!NOTE]
|
|
66
|
+
> ejKernel contains **no AI-generated code**. All kernels, modules, and core logic are manually designed and implemented by human developers.
|
|
67
|
+
> AI tooling (Opus 4.5) is used **exclusively for documentation**, which may therefore contain minor inaccuracies. There is no “vibe coding” or automated code generation anywhere in the codebase.
|
|
68
|
+
|
|
69
|
+
## Table of Contents
|
|
70
|
+
|
|
71
|
+
- [Key Features](#key-features)
|
|
72
|
+
- [Installation](#installation)
|
|
73
|
+
- [Quick Start](#quick-start)
|
|
74
|
+
- [Architecture Overview](#architecture-overview)
|
|
75
|
+
- [Supported Operations](#supported-operations)
|
|
76
|
+
- [Advanced Usage](#advanced-usage)
|
|
77
|
+
- [Development](#development)
|
|
78
|
+
- [Testing](#testing)
|
|
79
|
+
- [Contributing](#contributing)
|
|
80
|
+
- [Citation](#citation)
|
|
81
|
+
- [License](#license)
|
|
82
|
+
|
|
83
|
+
## Key Features
|
|
84
|
+
|
|
85
|
+
### Intelligent Kernel Management
|
|
86
|
+
|
|
87
|
+
- **7-Tier Configuration System**: Override → Overlay → Memory Cache → Persistent Cache → Autotune → Heuristics → Error
|
|
88
|
+
- **Automatic Platform Detection**: Seamlessly selects optimal implementation based on hardware
|
|
89
|
+
- **Priority-Based Registry**: Multi-backend support with intelligent fallback mechanisms
|
|
90
|
+
- **Device Fingerprinting**: Hardware-specific configuration caching for optimal performance
|
|
91
|
+
|
|
92
|
+
### State-of-the-Art Operations
|
|
93
|
+
|
|
94
|
+
- **30+ Deep Learning Operations**: Flash Attention v2, Flash MLA, Ring Attention, Page Attention, Block Sparse, GLA, Lightning, Gated Delta Rule, Quantized MatMul, State Space Models (Mamba), RWKV (v4/v6/v7), and more
|
|
95
|
+
- **Memory Efficiency**: Custom VJP implementations with O(N) memory complexity for attention
|
|
96
|
+
- **Distributed Support**: Full shard_map integration for model and data parallelism
|
|
97
|
+
- **Mixed Precision**: Comprehensive dtype support with automatic gradient conversion
|
|
98
|
+
|
|
99
|
+
### Production-Ready Infrastructure
|
|
100
|
+
|
|
101
|
+
- **Type Safety**: Full jaxtyping annotations with runtime validation via beartype
|
|
102
|
+
- **Comprehensive Testing**: Cross-backend validation, performance benchmarks, integration tests
|
|
103
|
+
- **Atomic Persistence**: Thread-safe configuration storage with automatic optimization
|
|
104
|
+
- **Profiling Integration**: Built-in support for JAX profiling and performance monitoring
|
|
105
|
+
|
|
106
|
+
## Installation
|
|
107
|
+
|
|
108
|
+
### Basic Installation
|
|
109
|
+
|
|
110
|
+
```bash
|
|
111
|
+
pip install ejkernel
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
### Platform-Specific Installation
|
|
115
|
+
|
|
116
|
+
```bash
|
|
117
|
+
# GPU Support (CUDA)
|
|
118
|
+
pip install ejkernel[cuda]
|
|
119
|
+
|
|
120
|
+
# TPU Support
|
|
121
|
+
pip install ejkernel[tpu]
|
|
122
|
+
|
|
123
|
+
# Development Installation
|
|
124
|
+
git clone https://github.com/erfanzar/ejkernel.git
|
|
125
|
+
cd ejkernel
|
|
126
|
+
pip install -e ".[dev]"
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
### Dependencies
|
|
130
|
+
|
|
131
|
+
- Python 3.11-3.13
|
|
132
|
+
- JAX ~= 0.10.0
|
|
133
|
+
- Triton == 3.6.0 (for GPU)
|
|
134
|
+
- nvidia-cutlass-dsl == 4.4.0.dev1 (optional, for CuTe DSL kernels)
|
|
135
|
+
- jax-tvm-ffi == 0.1.2 (optional, for CuTe TVM-FFI primitive path)
|
|
136
|
+
- jaxtyping >= 0.3.2
|
|
137
|
+
- beartype >= 0.22.2
|
|
138
|
+
- pydantic >= 2.11.10
|
|
139
|
+
|
|
140
|
+
## Quick Start
|
|
141
|
+
|
|
142
|
+
### Simple API with Automatic Optimization
|
|
143
|
+
|
|
144
|
+
```python
|
|
145
|
+
import jax.numpy as jnp
|
|
146
|
+
from ejkernel.modules import flash_attention
|
|
147
|
+
|
|
148
|
+
# Basic usage - automatic configuration selection
|
|
149
|
+
output = flash_attention(
|
|
150
|
+
query, key, value,
|
|
151
|
+
causal=True,
|
|
152
|
+
dropout_prob=0.1
|
|
153
|
+
)
|
|
154
|
+
|
|
155
|
+
# With advanced features
|
|
156
|
+
output = flash_attention(
|
|
157
|
+
query, key, value,
|
|
158
|
+
causal=True,
|
|
159
|
+
sliding_window=128, # Local attention window
|
|
160
|
+
logits_soft_cap=30.0, # Gemma-2 style soft capping
|
|
161
|
+
attention_mask=mask, # Custom attention pattern
|
|
162
|
+
)
|
|
163
|
+
```
|
|
164
|
+
|
|
165
|
+
### Custom Configuration
|
|
166
|
+
|
|
167
|
+
```python
|
|
168
|
+
from ejkernel.modules import FlashAttentionConfig, flash_attention
|
|
169
|
+
from ejkernel.ops.utils.datacarrier import FwdParams, BwdParams
|
|
170
|
+
|
|
171
|
+
# Create optimized configuration
|
|
172
|
+
config = FlashAttentionConfig(
|
|
173
|
+
fwd_params=FwdParams(
|
|
174
|
+
q_blocksize=256,
|
|
175
|
+
kv_blocksize=256,
|
|
176
|
+
num_warps=8,
|
|
177
|
+
num_stages=2
|
|
178
|
+
),
|
|
179
|
+
bwd_params=BwdParams(
|
|
180
|
+
q_blocksize=128,
|
|
181
|
+
kv_blocksize=128,
|
|
182
|
+
num_warps=4
|
|
183
|
+
),
|
|
184
|
+
platform="triton", # Force specific backend
|
|
185
|
+
backend="gpu"
|
|
186
|
+
)
|
|
187
|
+
|
|
188
|
+
output = flash_attention(query, key, value, cfg=config)
|
|
189
|
+
```
|
|
190
|
+
|
|
191
|
+
### Direct Kernel Registry Access
|
|
192
|
+
|
|
193
|
+
```python
|
|
194
|
+
from ejkernel import kernel_registry, Platform, Backend
|
|
195
|
+
|
|
196
|
+
# Get specific implementation
|
|
197
|
+
kernel = kernel_registry.get(
|
|
198
|
+
algorithm="flash_attention",
|
|
199
|
+
platform=Platform.TRITON,
|
|
200
|
+
backend=Backend.GPU
|
|
201
|
+
)
|
|
202
|
+
|
|
203
|
+
# Direct execution
|
|
204
|
+
output = kernel(query, key, value, causal=True)
|
|
205
|
+
```
|
|
206
|
+
|
|
207
|
+
### Distributed Execution
|
|
208
|
+
|
|
209
|
+
```python
|
|
210
|
+
import jax
|
|
211
|
+
from jax.sharding import Mesh, PartitionSpec as P
|
|
212
|
+
from ejkernel.modules import flash_attention
|
|
213
|
+
|
|
214
|
+
# Setup mesh for distributed execution
|
|
215
|
+
devices = jax.devices()
|
|
216
|
+
mesh = jax.make_mesh((len(devices), 1), ("data", "model"))  # 2-D grid: all devices on "data", one on "model"
|
|
217
|
+
|
|
218
|
+
# Run distributed attention
|
|
219
|
+
output = flash_attention(
|
|
220
|
+
query, key, value,
|
|
221
|
+
causal=True,
|
|
222
|
+
mesh=mesh,
|
|
223
|
+
in_specs=(P("data", None), P("data", None), P("data", None)),
|
|
224
|
+
out_specs=P("data", None)
|
|
225
|
+
)
|
|
226
|
+
```
|
|
227
|
+
|
|
228
|
+
## Architecture Overview
|
|
229
|
+
|
|
230
|
+
### System Design
|
|
231
|
+
|
|
232
|
+
ejKernel employs a sophisticated layered architecture that separates concerns while maintaining high performance:
|
|
233
|
+
|
|
234
|
+
```md
|
|
235
|
+
┌─────────────────────────────────────────────────────┐
|
|
236
|
+
│ Public API (modules/) │
|
|
237
|
+
│ Simple functions with sensible defaults │
|
|
238
|
+
├─────────────────────────────────────────────────────┤
|
|
239
|
+
│ Operations Layer (ops/) │
|
|
240
|
+
│ Configuration management, autotuning, caching │
|
|
241
|
+
├─────────────────────────────────────────────────────┤
|
|
242
|
+
│ Kernel Registry (kernels/) │
|
|
243
|
+
│ Platform routing, signature validation │
|
|
244
|
+
├─────────────────────────────────────────────────────┤
|
|
245
|
+
│ Backend Implementations (kernels/_*) │
|
|
246
|
+
│ Triton, CuTe, Pallas, XLA, CUDA kernels │
|
|
247
|
+
└─────────────────────────────────────────────────────┘
|
|
248
|
+
```

### Core Components

#### Kernel Registry

The registry provides automatic platform-specific kernel selection:

```python
@kernel_registry.register("my_operation", Platform.TRITON, Backend.GPU, priority=100)
def my_operation_gpu(x, y):
    # GPU-optimized implementation
    pass

@kernel_registry.register("my_operation", Platform.XLA, Backend.ANY, priority=50)
def my_operation_fallback(x, y):
    # Universal fallback
    pass

# Automatic selection based on available hardware
impl = kernel_registry.get("my_operation")
```
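
To make the selection rule concrete, here is a minimal, self-contained sketch of a priority-based registry. It is not ejKernel's implementation: `ToyRegistry`, its string platform/backend keys, and the `available_backends` argument are hypothetical names used only for illustration, and the rule "highest priority among usable entries wins" is an assumption based on the `priority` argument above.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyRegistry:
    """Hypothetical priority-based registry (illustration only, not ejKernel's)."""

    # op name -> list of (priority, platform, backend, implementation)
    _entries: dict = field(default_factory=dict)

    def register(self, name: str, platform: str, backend: str, priority: int = 0):
        def decorator(fn: Callable) -> Callable:
            self._entries.setdefault(name, []).append((priority, platform, backend, fn))
            return fn

        return decorator

    def get(self, name: str, available_backends: set) -> Callable:
        # Keep entries whose backend is usable here; "any" always matches.
        usable = [
            entry
            for entry in self._entries.get(name, [])
            if entry[2] == "any" or entry[2] in available_backends
        ]
        if not usable:
            raise LookupError(f"no implementation of {name!r} for {sorted(available_backends)}")
        # The highest priority wins.
        return max(usable, key=lambda entry: entry[0])[3]


registry = ToyRegistry()


@registry.register("scale", platform="triton", backend="gpu", priority=100)
def scale_gpu(x):
    return ("gpu", 2 * x)


@registry.register("scale", platform="xla", backend="any", priority=50)
def scale_fallback(x):
    return ("fallback", 2 * x)


on_gpu = registry.get("scale", available_backends={"gpu"})(3)
on_cpu = registry.get("scale", available_backends={"cpu"})(3)
```

With a GPU available the higher-priority entry is chosen; without one, only the `"any"` fallback is usable, so it is returned instead.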

#### Configuration Management

Multi-tier configuration system with intelligent fallback:

```python
class ConfigSelectorChain:
    """
    Selection hierarchy:
    1. Override - Explicit user configuration
    2. Overlay - Temporary context overrides
    3. Memory Cache - In-memory lookup
    4. Persistent Cache - Disk-based storage
    5. Autotune - Performance benchmarking
    6. Heuristics - Intelligent defaults
    7. Error - Clear failure message
    """
```
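
The fallback behaviour of such a chain can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the actual `ConfigSelectorChain`: `make_chain`, the tier names, and the `block_q` setting are hypothetical, and only three of the seven tiers are modelled.

```python
from typing import Callable, List, Optional, Tuple

# Each selector returns a config dict, or None to defer to the next tier.
Selector = Callable[[str], Optional[dict]]


def make_chain(selectors: List[Tuple[str, Selector]]) -> Callable[[str], Tuple[str, dict]]:
    def select(op_key: str) -> Tuple[str, dict]:
        for tier_name, selector in selectors:
            cfg = selector(op_key)
            if cfg is not None:
                return tier_name, cfg
        # Final tier: fail with a clear message.
        raise RuntimeError(f"no configuration found for {op_key!r}")

    return select


overrides: dict = {}  # explicit user configuration (empty in this demo)
memory_cache = {"flash_attention": {"block_q": 128}}


def heuristics(op_key: str) -> Optional[dict]:
    # Hypothetical default; a real heuristic would inspect shapes and hardware.
    return {"block_q": 64}


chain = make_chain(
    [
        ("override", overrides.get),
        ("memory_cache", memory_cache.get),
        ("heuristics", heuristics),
    ]
)

cached = chain("flash_attention")
fallback = chain("page_attention")
```

The first tier that returns a configuration ends the search, so cheap lookups (overrides, caches) are consulted before expensive or approximate ones (autotuning, heuristics).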

#### Custom VJP System

All performance-critical kernels implement memory-efficient gradients:

```python
import jax

@jax.custom_vjp
def kernel_with_custom_grad(inputs):
    return forward(inputs)

def kernel_fwd(inputs):
    # Return the output plus only the residuals the backward pass needs
    output, residuals = forward_with_residuals(inputs)
    return output, residuals

def kernel_bwd(residuals, grad_output):
    # Must return a tuple with one cotangent per primal input
    return efficient_backward(residuals, grad_output)

kernel_with_custom_grad.defvjp(kernel_fwd, kernel_bwd)
```
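
As a runnable instance of the same pattern, the sketch below defines a row-wise softmax whose forward pass saves only its output as the residual, and whose backward pass derives the gradient from that output alone. It is an illustrative example, not one of ejKernel's kernels; the function names and test values are made up, and the result is checked against `jax.nn.softmax`.

```python
import jax
import jax.numpy as jnp


@jax.custom_vjp
def softmax(x):
    shifted = x - jnp.max(x, axis=-1, keepdims=True)
    e = jnp.exp(shifted)
    return e / jnp.sum(e, axis=-1, keepdims=True)


def softmax_fwd(x):
    y = softmax(x)
    # Save only the output; the intermediates (shifted, e) are not kept.
    return y, y


def softmax_bwd(y, g):
    # dL/dx = y * (g - sum(g * y)) along the softmax axis.
    dx = y * (g - jnp.sum(g * y, axis=-1, keepdims=True))
    return (dx,)  # one cotangent per primal input


softmax.defvjp(softmax_fwd, softmax_bwd)

x = jnp.array([[1.0, 2.0, 3.0], [0.5, -1.0, 4.0]])
w = jnp.array([[0.3, -0.2, 1.0], [2.0, 0.1, -0.5]])

# Compare the hand-written gradient with JAX's built-in softmax gradient.
custom_grad = jax.grad(lambda t: jnp.sum(softmax(t) * w))(x)
reference_grad = jax.grad(lambda t: jnp.sum(jax.nn.softmax(t, axis=-1) * w))(x)
max_abs_diff = float(jnp.max(jnp.abs(custom_grad - reference_grad)))
```

The design choice is the residual: by keeping `y` rather than every intermediate array, the backward pass trades a small amount of recomputation for lower memory use.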

## Supported Operations

### Attention Mechanisms

| Algorithm                        | Description                      | Memory | Key Features                                                            |
| -------------------------------- | -------------------------------- | ------ | ----------------------------------------------------------------------- |
| **Flash Attention v2**           | Memory-efficient exact attention | O(N)   | Causal masking, dropout, sliding windows, soft capping                  |
| **Ring Attention**               | Distributed sequence parallelism | O(N/P) | Ultra-long sequences, communication overlap, XLA single-device fallback |
| **Page Attention**               | KV-cache optimized inference     | O(N)   | Block-wise memory, continuous batching                                  |
| **Block Sparse Attention**       | Configurable sparse patterns     | O(N√N) | Local+global, custom patterns                                           |
| **GLA**                          | Gated Linear Attention           | O(N)   | Linear complexity, gated updates                                        |
| **Lightning Attention**          | Layer-dependent decay            | O(N)   | Exponential moving average                                              |
| **MLA**                          | Multi-head Latent Attention      | O(N)   | Compressed KV representation                                            |
| **Ragged Page Attention v2**     | Variable-length paged attention  | O(N)   | Ragged sequences with page caching                                      |
| **Ragged Page Attention v3**     | Enhanced ragged page attention   | O(N)   | Attention sinks support, improved handling                              |
| **Ragged Decode Attention**      | Variable-length decoding         | O(N)   | Efficient batched inference                                             |
| **Gated Delta Rule (GDR)**       | Gated delta-rule recurrence      | O(N)   | Chunked + recurrent + single-step, custom VJP, Qwen3Next                |
| **Ragged GDR**                   | Packed continuous-batching GDR   | O(N)   | Variable-length sequences, Pallas TPU decode (3.6x speedup)             |
| **Kernel Delta Attention**       | Delta-rule linear attention      | O(N)   | Linear complexity, delta updates, decay control                         |
| **Unified Attention**            | vLLM-style paged attention       | O(N)   | Segmented 3D decode kernel                                              |
| **Prefill Page Attention**       | Page attention prefill phase     | O(N)   | Separate prefill handling                                               |
| **Decode Attention**             | Single-token decode attention    | O(N)   | Optimized single-step decoding                                          |
| **Chunked Prefill Paged Decode** | Combined prefill + decode        | O(N)   | Chunked prefill with paged KV cache decode                              |
| **Flash MLA**                    | Multi-head Latent Attention      | O(N)   | Low-rank KV compression, memory-efficient inference                     |
| **Scaled Dot-Product Attention** | Standard attention               | O(N²)  | Basic reference implementation                                          |
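
The gap between the O(N²) reference row and the O(N) Flash Attention row comes from never materializing the full score matrix. A common way to do that is the online-softmax recurrence, sketched below in plain Python for a single query. This is a teaching sketch under simplifying assumptions (no 1/√d scaling, no masking, lists instead of arrays); it does not show how ejKernel's kernels are written, and the function names are hypothetical.

```python
import math


def attention_reference(q, keys, values):
    """Standard attention for one query: materializes all N scores."""
    scores = [sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


def attention_online(q, keys, values, block_size=2):
    """Same result, visiting keys block by block with a running max and sum."""
    dim = len(values[0])
    running_max = -math.inf
    running_sum = 0.0
    acc = [0.0] * dim
    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [sum(a * b for a, b in zip(q, k)) for k in k_blk]
        new_max = max(running_max, max(scores))
        # Rescale what has been accumulated so far to the new maximum.
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, v_blk):
            w = math.exp(s - new_max)
            running_sum += w
            acc = [a + w * vd for a, vd in zip(acc, v)]
        running_max = new_max
    return [a / running_sum for a in acc]


q = [0.5, -1.0]
keys = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [-1.0, 0.5], [0.3, 0.3]]
values = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [0.5, 0.5], [2.0, 2.0]]

ref = attention_reference(q, keys, values)
online = attention_online(q, keys, values, block_size=2)
max_diff = max(abs(a - b) for a, b in zip(ref, online))
```

Only one block of scores is alive at a time, yet the output matches the reference up to floating-point rounding, which is why the blockwise variant is still exact attention.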

### Recurrent Linear Attention (RWKV)

| Operation      | Description                           | Key Features                                       |
| -------------- | ------------------------------------- | -------------------------------------------------- |
| **RWKV-4**     | Time-mix recurrence                   | Numerically stable (α,β,ε) state, O(N) memory      |
| **RWKV-6**     | Multi-head linear attention           | Variable-length packing, reverse mode, O(N) memory |
| **RWKV-7**     | DPLR (Diagonal + Low-Rank) recurrence | (a,b) parameterization, state-space inspired       |
| **RWKV-7 Mul** | Multiplicative RWKV-7 variant         | (kk,a) reparameterization for optimized kernels    |

### Other Operations

| Operation             | Description                                                 | Use Case                  |
| --------------------- | ----------------------------------------------------------- | ------------------------- |
| **Grouped MatMul**    | Efficient batched matrix operations                         | Expert models, MoE        |
| **Grouped MatMul v2** | Enhanced with shard_map support                             | Distributed expert models |
| **Mean Pooling**      | Variable-length sequence aggregation                        | Sentence embeddings       |
| **Recurrent**         | Optimized RNN/LSTM/GRU operations                           | Sequential modeling       |
| **Native Sparse**     | Block-sparse matrix computations                            | Sparse attention patterns |
| **Quantized MatMul**  | Multi-mode quantized matmul (affine, NF4, MXFP4/8, NVFP4/8) | Low-bit inference         |

### State Space Models

| Operation          | Description      | Key Features                                                               |
| ------------------ | ---------------- | -------------------------------------------------------------------------- |
| **State Space v1** | Mamba1-style SSM | 2D A matrix, separate dt_proj, custom VJP for memory efficiency            |
| **State Space v2** | Mamba2-style SSM | Per-head scalar A, n_groups for parameter grouping, optional gated RMSNorm |

### Platform Support Matrix

| Operation                    | Triton (GPU) | CuTe (GPU) | CUDA (GPU) | Pallas (TPU) | XLA (Universal) |
| ---------------------------- | ------------ | ---------- | ---------- | ------------ | --------------- |
| Flash Attention v2           | ✅           | ✅         | ✅         | ✅           | ✅              |
| Flash MLA                    | ✅           | -          | -          | -            | ✅              |
| Ring Attention               | ✅           | -          | -          | ✅           | ✅              |
| Page Attention               | ✅           | -          | -          | ✅           | ✅              |
| Block Sparse Attention       | ✅           | -          | ✅         | ✅           | ✅              |
| Decode Attention             | ✅           | -          | -          | -            | ✅              |
| Chunked Prefill Paged Decode | ✅           | ✅         | -          | -            | ✅              |
| Ragged Page Attention v2     | ✅           | -          | -          | ✅           | ✅              |
| Ragged Page Attention v3     | ✅           | -          | ✅         | ✅           | ✅              |
| Ragged Decode Attention      | ✅           | -          | -          | ✅           | ✅              |
| GLA                          | ✅           | -          | -          | -            | ✅              |
| Lightning Attention          | ✅           | -          | -          | -            | ✅              |
| Recurrent                    | ✅           | -          | -          | -            | ✅              |
| Mean Pooling                 | ✅           | -          | -          | -            | ✅              |
| Grouped MatMul               | -            | -          | -          | ✅           | ✅              |
| Grouped MatMul v2            | -            | -          | -          | ✅           | -               |
| Native Sparse Attention      | ✅           | -          | -          | -            | ✅              |
| Quantized MatMul             | ✅           | ✅         | ✅         | ✅           | ✅              |
| Gated Delta Rule             | -            | -          | -          | ✅           | ✅              |
| Ragged Gated Delta Rule      | -            | -          | -          | ✅           | ✅              |
| Kernel Delta Attention       | -            | -          | -          | -            | ✅              |
| Unified Attention            | ✅           | ✅         | ✅         | -            | ✅              |
| Prefill Page Attention       | -            | -          | -          | ✅           | ✅              |
| Scaled Dot-Product Attention | -            | -          | -          | -            | ✅              |
| State Space v1               | -            | -          | -          | -            | ✅              |
| State Space v2               | -            | -          | -          | -            | ✅              |
| RWKV-4                       | ✅           | -          | -          | -            | ✅              |
| RWKV-6                       | ✅           | -          | -          | -            | ✅              |
| RWKV-7                       | ✅           | -          | -          | -            | ✅              |
| RWKV-7 Mul                   | ✅           | -          | -          | -            | ✅              |

✅ = Production ready | - = Not available

- The CuTe backend uses the TVM-FFI primitive path with fused kernels.
- Quantized MatMul on TPU uses hybrid dispatch (packed Pallas / predecode / XLA fallback).
- Distributed matmul ops (`all_gather_matmul`, `reduce_scatter_matmul`) intentionally do not perform runtime fallback between distributed backends; choose `platform`/`cfg.platform` explicitly.

## Advanced Usage

### Page Attention for KV-Cache Inference

```python
from ejkernel.modules import page_attention, PageAttentionConfig

# Configure paged attention for inference
config = PageAttentionConfig(
    platform="auto",
    backend="gpu"
)

output = page_attention(
    query=q,
    key_cache=k_cache,
    value_cache=v_cache,
    block_table=block_table,
    cache_seqlens=cache_seqlens,
    cfg=config
)
```
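
For readers new to paged KV caches, the sketch below shows what a block table does conceptually: it maps a sequence's logical token positions to slots in physical cache blocks. The layout (a list of fixed-size blocks) and the helper `gather_kv` are assumptions made for illustration; ejKernel's actual tensor layouts may differ.

```python
def gather_kv(cache, block_table_row, seq_len, block_size):
    """Collect the first `seq_len` cached entries of one sequence.

    cache:           list of physical blocks, each a list of `block_size` entries
    block_table_row: physical block ids for this sequence, in logical order
    """
    out = []
    for pos in range(seq_len):
        # Split the logical position into (which block, which slot inside it).
        logical_block, offset = divmod(pos, block_size)
        physical_block = block_table_row[logical_block]
        out.append(cache[physical_block][offset])
    return out


block_size = 4
# Four physical blocks; entries are labeled "b<block>s<slot>" for readability.
cache = [[f"b{b}s{s}" for s in range(block_size)] for b in range(4)]

# This sequence owns physical blocks 2 and 0 (in that order) and holds 6 tokens.
tokens = gather_kv(cache, block_table_row=[2, 0], seq_len=6, block_size=block_size)
```

Here `tokens` lists the four slots of physical block 2 followed by the first two slots of block 0, so a sequence's cache does not need to be contiguous in memory.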

### Ragged Page Attention for Variable-Length Batches

```python
from ejkernel.modules import ragged_page_attention_v3, RaggedPageAttentionv3Config

# For variable-length sequences with attention sinks
config = RaggedPageAttentionv3Config(
    platform="pallas",
    backend="tpu"
)

output = ragged_page_attention_v3(
    query=q,
    key_pages=k_pages,
    value_pages=v_pages,
    lengths=seq_lengths,
    page_indices=page_indices,
    cfg=config
)
```

### Performance Optimization

```python
# Force autotuning for optimal configuration
import os
os.environ["EJKERNEL_AUTOTUNE_POLICY"] = "autotune"
os.environ["EJKERNEL_LOG_AUTOTUNE"] = "1"

# Enable profiling
os.environ["EJKERNEL_OPS_STAMP"] = "json"  # Detailed metadata
os.environ["EJKERNEL_OPS_RECORD"] = "1"    # Record invocations
```

### Custom Kernel Development

```python
from dataclasses import dataclass

from jax import Array  # array type used in the Kernel[...] annotation
from ejkernel.ops.core import Kernel
from ejkernel.modules.operations.configs import BaseOperationConfig

@dataclass
class MyConfig(BaseOperationConfig):
    param1: int = 128
    param2: float = 0.1

class MyKernel(Kernel[MyConfig, Array]):
    def __init__(self):
        super().__init__(op_id="my_kernel")

    def run(self, x, cfg: MyConfig):
        # kernel_registry is the registry shown under "Kernel Registry"
        impl = kernel_registry.get("my_kernel", cfg.platform)
        return impl(x, param1=cfg.param1, param2=cfg.param2)

    def heuristic_cfg(self, inv):
        # Return default configuration
        return MyConfig(param1=256)

    def candidate_cfgs(self, inv):
        # Return autotuning candidates
        return [MyConfig(param1=p) for p in [64, 128, 256]]
```
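
One plausible way an autotuner could consume the list returned by `candidate_cfgs` is to measure each candidate and keep the cheapest. The sketch below illustrates that idea only; `ToyConfig`, `time_once`, `pick_fastest`, and `fake_measure` are hypothetical names, and ejKernel's real autotuner may warm up, repeat, and cache measurements differently.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyConfig:
    """Hypothetical stand-in for an operation config."""

    param1: int = 128


def time_once(fn, cfg):
    """Wall-clock one call of fn(cfg); real autotuners warm up and repeat."""
    start = time.perf_counter()
    fn(cfg)
    return time.perf_counter() - start


def pick_fastest(fn, candidates, measure=time_once):
    """Return the candidate with the lowest measured cost."""
    return min(candidates, key=lambda cfg: measure(fn, cfg))


candidates = [ToyConfig(param1=p) for p in (64, 128, 256)]


def fake_measure(fn, cfg):
    # Deterministic cost model for the demo: pretend 128 is the sweet spot.
    return abs(cfg.param1 - 128)


best = pick_fastest(lambda cfg: None, candidates, measure=fake_measure)
```

Passing the measurement function as an argument keeps the selection logic testable without depending on noisy wall-clock timings.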

### Integration with Flax Models

```python
import flax.linen as nn
from ejkernel.modules import flash_attention

class TransformerBlock(nn.Module):
    num_heads: int = 8
    head_dim: int = 64

    @nn.compact
    def __call__(self, x, mask=None):
        # Project to Q, K, V
        q = nn.Dense(self.num_heads * self.head_dim)(x)
        k = nn.Dense(self.num_heads * self.head_dim)(x)
        v = nn.Dense(self.num_heads * self.head_dim)(x)

        # Reshape for attention
        shape = (x.shape[0], x.shape[1], self.num_heads, self.head_dim)
        q, k, v = map(lambda t: t.reshape(shape), (q, k, v))

        # Apply ejKernel Flash Attention
        attn_output = flash_attention(
            q, k, v,
            causal=True,
            attention_mask=mask
        )

        # Merge heads (works even if x.shape[-1] != num_heads * head_dim), then project
        batch, seq_len = x.shape[:2]
        return nn.Dense(x.shape[-1])(attn_output.reshape(batch, seq_len, -1))
```

## Development

### Setting Up Development Environment

```bash
# Clone repository
git clone https://github.com/erfanzar/ejkernel.git
cd ejkernel

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install in development mode
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install
```

### Code Style

The project uses:

- **black** for code formatting (line length: 121)
- **ruff** for linting
- **mypy/pyright** for type checking
- **pre-commit** for automated checks

### Adding New Kernels

1. **Implement the kernel** in the appropriate backend directory:

   ```python
   # ejkernel/kernels/_triton/my_kernel/_interface.py
   @kernel_registry.register("my_kernel", Platform.TRITON, Backend.GPU)
   def my_kernel_triton(x, config):
       # Implementation
       pass
   ```

2. **Create module wrapper**:

   ```python
   # ejkernel/modules/operations/my_kernel.py
   class MyKernel(Kernel[MyKernelConfig, Array]):
       # Module implementation
       pass
   ```

3. **Add tests**:

   ```python
   # test/kernels/_triton/test_my_kernel.py
   import unittest

   class TestMyKernel(unittest.TestCase):
       # Test implementation
       pass
   ```

4. **Update documentation**

## Testing

### Running Tests

```bash
# Run all tests
pytest test/

# Platform-specific tests
pytest test/kernels/_xla/      # XLA implementations
pytest test/kernels/_triton/   # Triton implementations
pytest test/kernels/_pallas/   # Pallas implementations

# Specific test patterns
pytest -k "flash_attention"
pytest --verbose --exitfirst   # stop at the first failure

# Module operations tests
pytest test/modules/operations
```

### Test Categories

- **Unit Tests**: Individual component testing
- **Integration Tests**: End-to-end workflows
- **Comparison Tests**: Cross-backend consistency
- **Performance Tests**: Regression detection
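
A comparison test typically runs an optimized implementation and a simple reference on the same inputs and checks that the results agree within a tolerance. The sketch below shows the shape of such a test using plain Python stand-ins; `mean_pool_reference` and `mean_pool_fast` are hypothetical functions, not ejKernel kernels, and the tolerance is an arbitrary choice.

```python
import math
import unittest


def mean_pool_reference(rows, lengths):
    """Straightforward per-row mean over the first `length` entries."""
    return [sum(row[:n]) / n for row, n in zip(rows, lengths)]


def mean_pool_fast(rows, lengths):
    """Stand-in for an optimized backend: accumulates in an explicit loop."""
    out = []
    for row, n in zip(rows, lengths):
        total = 0.0
        for value in row[:n]:
            total += value
        out.append(total / n)
    return out


class TestMeanPoolConsistency(unittest.TestCase):
    def test_matches_reference(self):
        # Padded rows with their true lengths.
        rows = [[1.0, 2.0, 3.0, 0.0], [4.0, 6.0, 0.0, 0.0]]
        lengths = [3, 2]
        expected = mean_pool_reference(rows, lengths)
        actual = mean_pool_fast(rows, lengths)
        self.assertEqual(len(actual), len(expected))
        for a, e in zip(actual, expected):
            self.assertTrue(math.isclose(a, e, rel_tol=1e-6))


# Run the test case programmatically so the snippet works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMeanPoolConsistency)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real suite the same structure applies with an XLA implementation as the reference and a Triton or Pallas kernel under test, usually with a looser tolerance for low-precision dtypes.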

## Benchmarking

Run benchmarks to compare performance across backends:

```bash
# General attention benchmarks
python benchmarks/benchmark_attention.py

# Flash attention benchmarks
python benchmarks/benchmark_flash_attention.py

# Ragged page attention benchmarks
python benchmarks/benchmark_ragged_page_attention_v3.py
```

## Contributing

We welcome contributions!

### Priority Areas

- TPU/Pallas implementations for existing algorithms
- CUDA native kernels for maximum performance
- New attention mechanisms from recent papers
- Performance optimizations and kernel fusion
- Documentation and examples

### Contribution Process

1. Fork the repository
1. Create a feature branch
1. Implement your changes with tests
1. Ensure all tests pass
1. Submit a pull request

## Documentation

Comprehensive documentation is available at [ejkernel.readthedocs.io](https://ejkernel.readthedocs.io/en/latest/):

- **[API Reference](https://ejkernel.readthedocs.io/en/latest/api/)**: Complete API documentation
- **[Tutorials](https://ejkernel.readthedocs.io/en/latest/tutorials/)**: Step-by-step guides
- **[Architecture](https://ejkernel.readthedocs.io/en/latest/architecture/)**: Design documentation
- **[Benchmarks](https://ejkernel.readthedocs.io/en/latest/benchmarks/)**: Performance analysis

## Citation

If you use ejKernel in your research, please cite:

```bibtex
@software{ejkernel2025,
  author = {Erfan Zare Chavoshi},
  title = {ejKernel: High-Performance JAX Kernels for Deep Learning},
  year = {2025},
  url = {https://github.com/erfanzar/ejkernel},
  note = {Production-grade kernel library with multi-backend support}
}
```

## License

ejKernel is licensed under the Apache License 2.0. See [LICENSE](LICENSE) for details.

## Acknowledgments

ejKernel builds upon excellent work from:

- [JAX](https://github.com/google/jax) - Composable transformations of Python+NumPy programs
- [Triton](https://github.com/openai/triton) - GPU kernel programming language
- [Pallas](https://github.com/google/jax/tree/main/jax/experimental/pallas) - JAX kernel language
- [Flash Attention](https://github.com/Dao-AILab/flash-attention) - Memory-efficient attention
- [EasyDeL](https://github.com/erfanzar/EasyDeL) - Parent framework for JAX deep learning

## Community

- **GitHub Issues**: [Bug reports and feature requests](https://github.com/erfanzar/ejkernel/issues)
- **Discussions**: [Community forum](https://github.com/erfanzar/ejkernel/discussions)
- **Email**: <Erfanzare810@gmail.com>

---

**ejKernel** - Production-grade kernels for JAX deep learning