ejkernel 0.0.79__tar.gz → 0.0.80__tar.gz

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
Files changed (3162)
  1. ejkernel-0.0.80/PKG-INFO +687 -0
  2. ejkernel-0.0.80/csrc/flash_attention/CMakeLists.txt +78 -0
  3. ejkernel-0.0.80/csrc/flash_attention/src/flash_fwd_launch_template.h +449 -0
  4. ejkernel-0.0.80/csrc/quantized_matmul/CMakeLists.txt +79 -0
  5. ejkernel-0.0.80/csrc/quantized_matmul/src/code_gen.py +392 -0
  6. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_cuda_impl.h +3519 -0
  7. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits1_bf16.cu +52 -0
  8. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits1_f16.cu +52 -0
  9. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits1_f32.cu +52 -0
  10. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_bf16.cu +52 -0
  11. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_f16.cu +52 -0
  12. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_f32.cu +52 -0
  13. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_bf16.cu +52 -0
  14. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_f16.cu +52 -0
  15. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_f32.cu +52 -0
  16. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_bf16.cu +52 -0
  17. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_f16.cu +52 -0
  18. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_f32.cu +52 -0
  19. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_bf16.cu +52 -0
  20. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_f16.cu +52 -0
  21. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_f32.cu +52 -0
  22. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_bf16.cu +52 -0
  23. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_f16.cu +52 -0
  24. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_f32.cu +52 -0
  25. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_bf16.cu +52 -0
  26. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_f16.cu +52 -0
  27. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_f32.cu +52 -0
  28. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_bf16.cu +52 -0
  29. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_f16.cu +52 -0
  30. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_f32.cu +52 -0
  31. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_dispatch.h +603 -0
  32. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_mxfp4.cu +22 -0
  33. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_mxfp8.cu +22 -0
  34. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_nf4_bf16.cu +52 -0
  35. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_nf4_f16.cu +52 -0
  36. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_nf4_f32.cu +52 -0
  37. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_nvfp4.cu +22 -0
  38. ejkernel-0.0.80/csrc/quantized_matmul/src/qmm_dequant_nvfp8.cu +22 -0
  39. ejkernel-0.0.80/ejkernel/__init__.py +87 -0
  40. ejkernel-0.0.80/ejkernel/benchmarks.py +1073 -0
  41. ejkernel-0.0.80/ejkernel/build_cudalib.py +30 -0
  42. ejkernel-0.0.80/ejkernel/callib/__init__.py +217 -0
  43. ejkernel-0.0.80/ejkernel/callib/_cute_call.py +537 -0
  44. ejkernel-0.0.80/ejkernel/callib/_cute_ffi.py +530 -0
  45. ejkernel-0.0.80/ejkernel/callib/_ejit.py +700 -0
  46. ejkernel-0.0.80/ejkernel/callib/_pallas_call.py +275 -0
  47. ejkernel-0.0.80/ejkernel/callib/_tilelang_call.py +907 -0
  48. ejkernel-0.0.80/ejkernel/callib/_tilelang_ffi.py +560 -0
  49. ejkernel-0.0.80/ejkernel/callib/_triton_call.py +1638 -0
  50. ejkernel-0.0.80/ejkernel/callib/_utils.py +273 -0
  51. ejkernel-0.0.80/ejkernel/errors.py +45 -0
  52. ejkernel-0.0.80/ejkernel/kernels/__init__.py +152 -0
  53. ejkernel-0.0.80/ejkernel/kernels/_cuda/__init__.py +108 -0
  54. ejkernel-0.0.80/ejkernel/kernels/_cuda/blocksparse_attention/__init__.py +46 -0
  55. ejkernel-0.0.80/ejkernel/kernels/_cuda/blocksparse_attention/_build.py +202 -0
  56. ejkernel-0.0.80/ejkernel/kernels/_cuda/blocksparse_attention/_cuda_impl.py +284 -0
  57. ejkernel-0.0.80/ejkernel/kernels/_cuda/blocksparse_attention/_interface.py +982 -0
  58. ejkernel-0.0.80/ejkernel/kernels/_cuda/flash_attention/__init__.py +35 -0
  59. ejkernel-0.0.80/ejkernel/kernels/_cuda/flash_attention/_build.py +262 -0
  60. ejkernel-0.0.80/ejkernel/kernels/_cuda/flash_attention/_cuda_impl.py +478 -0
  61. ejkernel-0.0.80/ejkernel/kernels/_cuda/flash_attention/_interface.py +867 -0
  62. ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/__init__.py +39 -0
  63. ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/_build.py +274 -0
  64. ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl.py +340 -0
  65. ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl_bwd.py +103 -0
  66. ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl_fwd.py +104 -0
  67. ejkernel-0.0.80/ejkernel/kernels/_cuda/quantized_matmul/_interface.py +290 -0
  68. ejkernel-0.0.80/ejkernel/kernels/_cuda/ragged_page_attention_v3/__init__.py +43 -0
  69. ejkernel-0.0.80/ejkernel/kernels/_cuda/ragged_page_attention_v3/_build.py +211 -0
  70. ejkernel-0.0.80/ejkernel/kernels/_cuda/ragged_page_attention_v3/_cuda_impl.py +178 -0
  71. ejkernel-0.0.80/ejkernel/kernels/_cuda/ragged_page_attention_v3/_interface.py +153 -0
  72. ejkernel-0.0.80/ejkernel/kernels/_cuda/unified_attention/__init__.py +49 -0
  73. ejkernel-0.0.80/ejkernel/kernels/_cuda/unified_attention/_build.py +204 -0
  74. ejkernel-0.0.80/ejkernel/kernels/_cuda/unified_attention/_cuda_impl.py +174 -0
  75. ejkernel-0.0.80/ejkernel/kernels/_cuda/unified_attention/_interface.py +170 -0
  76. ejkernel-0.0.80/ejkernel/kernels/_cute/__init__.py +32 -0
  77. ejkernel-0.0.80/ejkernel/kernels/_cute/chunked_prefill_paged_decode/__init__.py +24 -0
  78. ejkernel-0.0.80/ejkernel/kernels/_cute/chunked_prefill_paged_decode/_cute_impl_fwd.py +532 -0
  79. ejkernel-0.0.80/ejkernel/kernels/_cute/chunked_prefill_paged_decode/_interface.py +140 -0
  80. ejkernel-0.0.80/ejkernel/kernels/_cute/flash_attention/__init__.py +27 -0
  81. ejkernel-0.0.80/ejkernel/kernels/_cute/flash_attention/_cute_impl.py +1545 -0
  82. ejkernel-0.0.80/ejkernel/kernels/_cute/flash_attention/_interface.py +598 -0
  83. ejkernel-0.0.80/ejkernel/kernels/_cute/quantized_matmul/__init__.py +27 -0
  84. ejkernel-0.0.80/ejkernel/kernels/_cute/quantized_matmul/_cute_impl.py +3098 -0
  85. ejkernel-0.0.80/ejkernel/kernels/_cute/quantized_matmul/_cute_impl_bwd.py +119 -0
  86. ejkernel-0.0.80/ejkernel/kernels/_cute/quantized_matmul/_cute_impl_fwd.py +121 -0
  87. ejkernel-0.0.80/ejkernel/kernels/_cute/quantized_matmul/_interface.py +309 -0
  88. ejkernel-0.0.80/ejkernel/kernels/_cute/ragged_page_attention_v3/__init__.py +20 -0
  89. ejkernel-0.0.80/ejkernel/kernels/_cute/ragged_page_attention_v3/_cute_impl_fwd.py +21 -0
  90. ejkernel-0.0.80/ejkernel/kernels/_cute/ragged_page_attention_v3/_interface.py +22 -0
  91. ejkernel-0.0.80/ejkernel/kernels/_cute/unified_attention/__init__.py +27 -0
  92. ejkernel-0.0.80/ejkernel/kernels/_cute/unified_attention/_cute_impl.py +168 -0
  93. ejkernel-0.0.80/ejkernel/kernels/_cute/unified_attention/_interface.py +141 -0
  94. ejkernel-0.0.80/ejkernel/kernels/_pallas/__init__.py +46 -0
  95. ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/__init__.py +36 -0
  96. ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/__init__.py +34 -0
  97. ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/_interface.py +123 -0
  98. ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/_pallas_impl_fwd.py +510 -0
  99. ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/__init__.py +35 -0
  100. ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_interface.py +101 -0
  101. ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_pallas_impl_bwd.py +21 -0
  102. ejkernel-0.0.80/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_pallas_impl_fwd.py +125 -0
  103. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/__init__.py +97 -0
  104. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/all_gather_matmul/__init__.py +29 -0
  105. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/all_gather_matmul/_interface.py +293 -0
  106. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/all_gather_matmul/_pallas_impl.py +725 -0
  107. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/blocksparse_attention/__init__.py +131 -0
  108. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_info.py +1068 -0
  109. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_interface.py +156 -0
  110. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_kernel.py +2869 -0
  111. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_masks.py +692 -0
  112. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/deepseek_attn/__init__.py +27 -0
  113. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/deepseek_attn/_interface.py +159 -0
  114. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/deepseek_attn/_pallas_impl_bwd.py +97 -0
  115. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/deepseek_attn/_pallas_impl_fwd.py +259 -0
  116. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_attention/__init__.py +36 -0
  117. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_attention/_interface.py +385 -0
  118. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_attention/_pallas_impl_bwd.py +912 -0
  119. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_attention/_pallas_impl_fwd.py +665 -0
  120. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_attention/_utils.py +609 -0
  121. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_mla/__init__.py +33 -0
  122. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_mla/_interface.py +205 -0
  123. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_mla/_pallas_impl_bwd.py +996 -0
  124. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_mla/_pallas_impl_fwd.py +626 -0
  125. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/flash_mla/_utils.py +71 -0
  126. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_cross_entropy/__init__.py +25 -0
  127. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_cross_entropy/_interface.py +90 -0
  128. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_cross_entropy/_pallas_impl_bwd.py +202 -0
  129. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_cross_entropy/_pallas_impl_fwd.py +675 -0
  130. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_kl_divergence/__init__.py +25 -0
  131. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_kl_divergence/_interface.py +84 -0
  132. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_kl_divergence/_pallas_impl_bwd.py +222 -0
  133. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/fused_kl_divergence/_pallas_impl_fwd.py +822 -0
  134. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_interface.py +142 -0
  135. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_pallas_impl_bwd.py +758 -0
  136. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_pallas_impl_fwd.py +941 -0
  137. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmul/__init__.py +30 -0
  138. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmul/_interface.py +298 -0
  139. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmul/_pallas_impl.py +996 -0
  140. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmul/_utils.py +191 -0
  141. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/__init__.py +30 -0
  142. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/_interface.py +319 -0
  143. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/_pallas_impl.py +627 -0
  144. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/__init__.py +26 -0
  145. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/_interface.py +444 -0
  146. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/_pallas_impl.py +1418 -0
  147. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/__init__.py +19 -0
  148. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/_interface.py +143 -0
  149. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/_pallas_impl_fwd.py +1335 -0
  150. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/__init__.py +19 -0
  151. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/_interface.py +177 -0
  152. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/_pallas_impl_fwd.py +1533 -0
  153. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/page_attention/__init__.py +30 -0
  154. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/page_attention/_interface.py +349 -0
  155. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/page_attention/_pallas_impl_fwd.py +516 -0
  156. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/prefill_page_attention/__init__.py +30 -0
  157. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/prefill_page_attention/_interface.py +230 -0
  158. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/prefill_page_attention/_pallas_impl_fwd.py +402 -0
  159. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/quantized_matmul/__init__.py +23 -0
  160. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/quantized_matmul/_interface.py +734 -0
  161. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_bwd.py +470 -0
  162. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_core.py +1042 -0
  163. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_fwd.py +332 -0
  164. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/__init__.py +30 -0
  165. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/_interface.py +138 -0
  166. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/_pallas_impl_fwd.py +322 -0
  167. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/__init__.py +25 -0
  168. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/_interface.py +229 -0
  169. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/_pallas_impl_fwd.py +140 -0
  170. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/__init__.py +30 -0
  171. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_interface.py +338 -0
  172. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_pallas_impl_fwd.py +809 -0
  173. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_utils.py +561 -0
  174. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/__init__.py +30 -0
  175. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_interface.py +205 -0
  176. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_pallas_impl_fwd.py +1785 -0
  177. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_pallas_impl_fwd_h64.py +1610 -0
  178. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_utils.py +4792 -0
  179. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/__init__.py +19 -0
  180. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/_interface.py +226 -0
  181. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/_pallas_impl.py +780 -0
  182. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ring_attention/__init__.py +48 -0
  183. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ring_attention/_interface.py +125 -0
  184. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ring_attention/_pallas_impl_bwd.py +919 -0
  185. ejkernel-0.0.80/ejkernel/kernels/_pallas/tpu/ring_attention/_pallas_impl_fwd.py +477 -0
  186. ejkernel-0.0.80/ejkernel/kernels/_registry.py +720 -0
  187. ejkernel-0.0.80/ejkernel/kernels/_tilelang/__init__.py +138 -0
  188. ejkernel-0.0.80/ejkernel/kernels/_tilelang/_dense_matmul.py +200 -0
  189. ejkernel-0.0.80/ejkernel/kernels/_tilelang/_gate_impl.py +408 -0
  190. ejkernel-0.0.80/ejkernel/kernels/_tilelang/_gate_kernel.py +520 -0
  191. ejkernel-0.0.80/ejkernel/kernels/_tilelang/_grouped_matmul_impl.py +252 -0
  192. ejkernel-0.0.80/ejkernel/kernels/_tilelang/_grouped_matmul_kernel.py +232 -0
  193. ejkernel-0.0.80/ejkernel/kernels/_tilelang/all_gather_matmul/__init__.py +24 -0
  194. ejkernel-0.0.80/ejkernel/kernels/_tilelang/all_gather_matmul/_interface.py +102 -0
  195. ejkernel-0.0.80/ejkernel/kernels/_tilelang/attention/__init__.py +26 -0
  196. ejkernel-0.0.80/ejkernel/kernels/_tilelang/attention/_interface.py +750 -0
  197. ejkernel-0.0.80/ejkernel/kernels/_tilelang/attention/_kernel.py +505 -0
  198. ejkernel-0.0.80/ejkernel/kernels/_tilelang/blocksparse_attention/__init__.py +19 -0
  199. ejkernel-0.0.80/ejkernel/kernels/_tilelang/blocksparse_attention/_interface.py +123 -0
  200. ejkernel-0.0.80/ejkernel/kernels/_tilelang/chunked_prefill_paged_decode/__init__.py +29 -0
  201. ejkernel-0.0.80/ejkernel/kernels/_tilelang/chunked_prefill_paged_decode/_interface.py +288 -0
  202. ejkernel-0.0.80/ejkernel/kernels/_tilelang/chunked_prefill_paged_decode/_kernel.py +305 -0
  203. ejkernel-0.0.80/ejkernel/kernels/_tilelang/decode_attention/__init__.py +24 -0
  204. ejkernel-0.0.80/ejkernel/kernels/_tilelang/decode_attention/_impl.py +202 -0
  205. ejkernel-0.0.80/ejkernel/kernels/_tilelang/decode_attention/_interface.py +215 -0
  206. ejkernel-0.0.80/ejkernel/kernels/_tilelang/decode_attention/_kernel.py +340 -0
  207. ejkernel-0.0.80/ejkernel/kernels/_tilelang/decode_attention/_split_kernel.py +224 -0
  208. ejkernel-0.0.80/ejkernel/kernels/_tilelang/deepseek_attn/__init__.py +35 -0
  209. ejkernel-0.0.80/ejkernel/kernels/_tilelang/deepseek_attn/_impl.py +630 -0
  210. ejkernel-0.0.80/ejkernel/kernels/_tilelang/deepseek_attn/_interface.py +112 -0
  211. ejkernel-0.0.80/ejkernel/kernels/_tilelang/deepseek_attn/_kernel.py +471 -0
  212. ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_attention/__init__.py +19 -0
  213. ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_attention/_impl.py +1462 -0
  214. ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_attention/_interface.py +153 -0
  215. ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_attention/_kernel.py +1294 -0
  216. ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_mla/__init__.py +26 -0
  217. ejkernel-0.0.80/ejkernel/kernels/_tilelang/flash_mla/_interface.py +161 -0
  218. ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_cross_entropy/__init__.py +19 -0
  219. ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_cross_entropy/_impl.py +658 -0
  220. ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_cross_entropy/_interface.py +124 -0
  221. ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_cross_entropy/_kernel.py +835 -0
  222. ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_kl_divergence/__init__.py +19 -0
  223. ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_kl_divergence/_impl.py +911 -0
  224. ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_kl_divergence/_interface.py +85 -0
  225. ejkernel-0.0.80/ejkernel/kernels/_tilelang/fused_kl_divergence/_kernel.py +1244 -0
  226. ejkernel-0.0.80/ejkernel/kernels/_tilelang/gated_delta_rule/__init__.py +23 -0
  227. ejkernel-0.0.80/ejkernel/kernels/_tilelang/gated_delta_rule/_impl.py +372 -0
  228. ejkernel-0.0.80/ejkernel/kernels/_tilelang/gated_delta_rule/_interface.py +131 -0
  229. ejkernel-0.0.80/ejkernel/kernels/_tilelang/gated_delta_rule/_kernel.py +537 -0
  230. ejkernel-0.0.80/ejkernel/kernels/_tilelang/gla/__init__.py +27 -0
  231. ejkernel-0.0.80/ejkernel/kernels/_tilelang/gla/_interface.py +119 -0
  232. ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmul/__init__.py +27 -0
  233. ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmul/_interface.py +209 -0
  234. ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv2/__init__.py +15 -0
  235. ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv2/_interface.py +25 -0
  236. ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv3/__init__.py +28 -0
  237. ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv3/_impl.py +795 -0
  238. ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv3/_interface.py +138 -0
  239. ejkernel-0.0.80/ejkernel/kernels/_tilelang/grouped_matmulv3/_kernel.py +769 -0
  240. ejkernel-0.0.80/ejkernel/kernels/_tilelang/kernel_delta_attention/__init__.py +33 -0
  241. ejkernel-0.0.80/ejkernel/kernels/_tilelang/kernel_delta_attention/_interface.py +214 -0
  242. ejkernel-0.0.80/ejkernel/kernels/_tilelang/lightning_attn/__init__.py +27 -0
  243. ejkernel-0.0.80/ejkernel/kernels/_tilelang/lightning_attn/_interface.py +130 -0
  244. ejkernel-0.0.80/ejkernel/kernels/_tilelang/mamba1/__init__.py +28 -0
  245. ejkernel-0.0.80/ejkernel/kernels/_tilelang/mamba1/_interface.py +25 -0
  246. ejkernel-0.0.80/ejkernel/kernels/_tilelang/mamba2/__init__.py +15 -0
  247. ejkernel-0.0.80/ejkernel/kernels/_tilelang/mamba2/_interface.py +25 -0
  248. ejkernel-0.0.80/ejkernel/kernels/_tilelang/mean_pooling/__init__.py +19 -0
  249. ejkernel-0.0.80/ejkernel/kernels/_tilelang/mean_pooling/_impl.py +263 -0
  250. ejkernel-0.0.80/ejkernel/kernels/_tilelang/mean_pooling/_interface.py +102 -0
  251. ejkernel-0.0.80/ejkernel/kernels/_tilelang/mean_pooling/_kernel.py +351 -0
  252. ejkernel-0.0.80/ejkernel/kernels/_tilelang/multi_latent_ragged_page_attention/__init__.py +24 -0
  253. ejkernel-0.0.80/ejkernel/kernels/_tilelang/multi_latent_ragged_page_attention/_interface.py +475 -0
  254. ejkernel-0.0.80/ejkernel/kernels/_tilelang/multi_latent_ragged_page_attention/_kernel.py +385 -0
  255. ejkernel-0.0.80/ejkernel/kernels/_tilelang/multi_latent_ragged_page_attention_v2/__init__.py +24 -0
  256. ejkernel-0.0.80/ejkernel/kernels/_tilelang/multi_latent_ragged_page_attention_v2/_interface.py +131 -0
  257. ejkernel-0.0.80/ejkernel/kernels/_tilelang/native_sparse_attention/__init__.py +29 -0
  258. ejkernel-0.0.80/ejkernel/kernels/_tilelang/native_sparse_attention/_impl.py +530 -0
  259. ejkernel-0.0.80/ejkernel/kernels/_tilelang/native_sparse_attention/_interface.py +187 -0
  260. ejkernel-0.0.80/ejkernel/kernels/_tilelang/native_sparse_attention/_kernel.py +659 -0
  261. ejkernel-0.0.80/ejkernel/kernels/_tilelang/page_attention/__init__.py +24 -0
  262. ejkernel-0.0.80/ejkernel/kernels/_tilelang/page_attention/_interface.py +276 -0
  263. ejkernel-0.0.80/ejkernel/kernels/_tilelang/page_attention/_kernel.py +417 -0
  264. ejkernel-0.0.80/ejkernel/kernels/_tilelang/prefill_page_attention/__init__.py +24 -0
  265. ejkernel-0.0.80/ejkernel/kernels/_tilelang/prefill_page_attention/_interface.py +243 -0
  266. ejkernel-0.0.80/ejkernel/kernels/_tilelang/prefill_page_attention/_kernel.py +300 -0
  267. ejkernel-0.0.80/ejkernel/kernels/_tilelang/quantized_matmul/__init__.py +19 -0
  268. ejkernel-0.0.80/ejkernel/kernels/_tilelang/quantized_matmul/_impl.py +2180 -0
  269. ejkernel-0.0.80/ejkernel/kernels/_tilelang/quantized_matmul/_interface.py +210 -0
  270. ejkernel-0.0.80/ejkernel/kernels/_tilelang/quantized_matmul/_kernel.py +2091 -0
  271. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_decode_attention/__init__.py +24 -0
  272. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_decode_attention/_interface.py +354 -0
  273. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_decode_attention/_kernel.py +412 -0
  274. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_gated_delta_rule/__init__.py +24 -0
  275. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_gated_delta_rule/_impl.py +455 -0
  276. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_gated_delta_rule/_interface.py +109 -0
  277. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_gated_delta_rule/_kernel.py +1118 -0
  278. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2/__init__.py +24 -0
  279. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2/_interface.py +297 -0
  280. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2/_kernel.py +312 -0
  281. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2_turboquant/__init__.py +26 -0
  282. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2_turboquant/_interface.py +342 -0
  283. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v2_turboquant/_kernel.py +355 -0
  284. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3/__init__.py +25 -0
  285. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3/_interface.py +326 -0
  286. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3/_kernel.py +562 -0
  287. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3_turboquant/__init__.py +24 -0
  288. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3_turboquant/_interface.py +325 -0
  289. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ragged_page_attention_v3_turboquant/_kernel.py +300 -0
  290. ejkernel-0.0.80/ejkernel/kernels/_tilelang/recurrent/__init__.py +19 -0
  291. ejkernel-0.0.80/ejkernel/kernels/_tilelang/recurrent/_impl.py +980 -0
  292. ejkernel-0.0.80/ejkernel/kernels/_tilelang/recurrent/_interface.py +127 -0
  293. ejkernel-0.0.80/ejkernel/kernels/_tilelang/recurrent/_kernel.py +883 -0
  294. ejkernel-0.0.80/ejkernel/kernels/_tilelang/reduce_scatter_matmul/__init__.py +19 -0
  295. ejkernel-0.0.80/ejkernel/kernels/_tilelang/reduce_scatter_matmul/_interface.py +100 -0
  296. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ring_attention/__init__.py +24 -0
  297. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ring_attention/_interface.py +110 -0
  298. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv4/__init__.py +19 -0
  299. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv4/_impl.py +351 -0
  300. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv4/_interface.py +79 -0
  301. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv4/_kernel.py +519 -0
  302. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv6/__init__.py +19 -0
  303. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv6/_impl.py +485 -0
  304. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv6/_interface.py +98 -0
  305. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv6/_kernel.py +966 -0
  306. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7/__init__.py +19 -0
  307. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7/_impl.py +540 -0
  308. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7/_interface.py +191 -0
  309. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7/_kernel.py +994 -0
  310. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7_mul/__init__.py +19 -0
  311. ejkernel-0.0.80/ejkernel/kernels/_tilelang/rwkv7_mul/_interface.py +25 -0
  312. ejkernel-0.0.80/ejkernel/kernels/_tilelang/scaled_dot_product_attention/__init__.py +19 -0
  313. ejkernel-0.0.80/ejkernel/kernels/_tilelang/scaled_dot_product_attention/_interface.py +88 -0
  314. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ssm1/__init__.py +6 -0
  315. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ssm1/_interface.py +19 -0
  316. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ssm2/__init__.py +6 -0
  317. ejkernel-0.0.80/ejkernel/kernels/_tilelang/ssm2/_interface.py +19 -0
  318. ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v1/__init__.py +19 -0
  319. ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v1/_impl.py +341 -0
  320. ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v1/_interface.py +143 -0
  321. ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v1/_kernel.py +630 -0
  322. ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v2/__init__.py +19 -0
  323. ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v2/_impl.py +300 -0
  324. ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v2/_interface.py +138 -0
  325. ejkernel-0.0.80/ejkernel/kernels/_tilelang/state_space_v2/_kernel.py +531 -0
  326. ejkernel-0.0.80/ejkernel/kernels/_tilelang/unified_attention/__init__.py +19 -0
  327. ejkernel-0.0.80/ejkernel/kernels/_tilelang/unified_attention/_interface.py +301 -0
  328. ejkernel-0.0.80/ejkernel/kernels/_tilelang/unified_attention/_kernel.py +301 -0
  329. ejkernel-0.0.80/ejkernel/kernels/_triton/__init__.py +90 -0
  330. ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/__init__.py +33 -0
  331. ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/_interface.py +469 -0
  332. ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/_mask.py +571 -0
  333. ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/_triton_impl_bwd.py +1469 -0
  334. ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/_triton_impl_fwd.py +753 -0
  335. ejkernel-0.0.80/ejkernel/kernels/_triton/blocksparse_attention/_utilities.py +606 -0
  336. ejkernel-0.0.80/ejkernel/kernels/_triton/chunked_prefill_paged_decode/__init__.py +31 -0
  337. ejkernel-0.0.80/ejkernel/kernels/_triton/chunked_prefill_paged_decode/_interface.py +113 -0
  338. ejkernel-0.0.80/ejkernel/kernels/_triton/chunked_prefill_paged_decode/_triton_impl_fwd.py +244 -0
  339. ejkernel-0.0.80/ejkernel/kernels/_triton/decode_attention/__init__.py +23 -0
  340. ejkernel-0.0.80/ejkernel/kernels/_triton/decode_attention/_interface.py +96 -0
  341. ejkernel-0.0.80/ejkernel/kernels/_triton/decode_attention/_triton_impl_fwd.py +496 -0
  342. ejkernel-0.0.80/ejkernel/kernels/_triton/flash_attention/__init__.py +36 -0
  343. ejkernel-0.0.80/ejkernel/kernels/_triton/flash_attention/_interface.py +473 -0
  344. ejkernel-0.0.80/ejkernel/kernels/_triton/flash_attention/_triton_impl_bwd.py +1628 -0
  345. ejkernel-0.0.80/ejkernel/kernels/_triton/flash_attention/_triton_impl_fwd.py +1074 -0
  346. ejkernel-0.0.80/ejkernel/kernels/_triton/flash_attention/_utilities.py +472 -0
  347. ejkernel-0.0.80/ejkernel/kernels/_triton/flash_mla/__init__.py +25 -0
  348. ejkernel-0.0.80/ejkernel/kernels/_triton/flash_mla/_interface.py +152 -0
  349. ejkernel-0.0.80/ejkernel/kernels/_triton/flash_mla/_triton_impl_bwd.py +40 -0
  350. ejkernel-0.0.80/ejkernel/kernels/_triton/flash_mla/_triton_impl_fwd.py +218 -0
  351. ejkernel-0.0.80/ejkernel/kernels/_triton/flash_mla/_utilities.py +39 -0
  352. ejkernel-0.0.80/ejkernel/kernels/_triton/gla/__init__.py +30 -0
  353. ejkernel-0.0.80/ejkernel/kernels/_triton/gla/_interface.py +106 -0
  354. ejkernel-0.0.80/ejkernel/kernels/_triton/gla/_triton_impl_bwd.py +21 -0
  355. ejkernel-0.0.80/ejkernel/kernels/_triton/gla/_triton_impl_fwd.py +127 -0
  356. ejkernel-0.0.80/ejkernel/kernels/_triton/lightning_attn/__init__.py +30 -0
  357. ejkernel-0.0.80/ejkernel/kernels/_triton/lightning_attn/_interface.py +97 -0
  358. ejkernel-0.0.80/ejkernel/kernels/_triton/lightning_attn/_triton_impl_bwd.py +21 -0
  359. ejkernel-0.0.80/ejkernel/kernels/_triton/lightning_attn/_triton_impl_fwd.py +146 -0
  360. ejkernel-0.0.80/ejkernel/kernels/_triton/mean_pooling/__init__.py +42 -0
  361. ejkernel-0.0.80/ejkernel/kernels/_triton/mean_pooling/_interface.py +207 -0
  362. ejkernel-0.0.80/ejkernel/kernels/_triton/mean_pooling/_triton_impl_bwd.py +241 -0
  363. ejkernel-0.0.80/ejkernel/kernels/_triton/mean_pooling/_triton_impl_fwd.py +250 -0
  364. ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/__init__.py +37 -0
  365. ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/_compression.py +1012 -0
  366. ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/_interface.py +469 -0
  367. ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/_triton_impl_bwd.py +586 -0
  368. ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/_triton_impl_fwd.py +627 -0
  369. ejkernel-0.0.80/ejkernel/kernels/_triton/native_sparse_attention/_utilities.py +174 -0
  370. ejkernel-0.0.80/ejkernel/kernels/_triton/page_attention/__init__.py +30 -0
  371. ejkernel-0.0.80/ejkernel/kernels/_triton/page_attention/_interface.py +381 -0
  372. ejkernel-0.0.80/ejkernel/kernels/_triton/page_attention/_triton_impl_fwd.py +308 -0
  373. ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/__init__.py +46 -0
  374. ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/_interface.py +481 -0
  375. ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/_triton_impl.py +2905 -0
  376. ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_bwd.py +125 -0
  377. ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_fwd.py +330 -0
  378. ejkernel-0.0.80/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_gemv.py +754 -0
  379. ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_decode_attention/__init__.py +30 -0
  380. ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_decode_attention/_interface.py +173 -0
  381. ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_decode_attention/_triton_impl_fwd.py +394 -0
  382. ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v2/__init__.py +30 -0
  383. ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v2/_interface.py +228 -0
  384. ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v2/_triton_impl_fwd.py +796 -0
  385. ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v3/__init__.py +30 -0
  386. ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v3/_interface.py +205 -0
  387. ejkernel-0.0.80/ejkernel/kernels/_triton/ragged_page_attention_v3/_triton_impl_fwd.py +661 -0
  388. ejkernel-0.0.80/ejkernel/kernels/_triton/recurrent/__init__.py +42 -0
  389. ejkernel-0.0.80/ejkernel/kernels/_triton/recurrent/_interface.py +339 -0
  390. ejkernel-0.0.80/ejkernel/kernels/_triton/recurrent/_triton_impl_bwd.py +419 -0
  391. ejkernel-0.0.80/ejkernel/kernels/_triton/recurrent/_triton_impl_fwd.py +282 -0
  392. ejkernel-0.0.80/ejkernel/kernels/_triton/ring_attention/__init__.py +33 -0
  393. ejkernel-0.0.80/ejkernel/kernels/_triton/ring_attention/_interface.py +164 -0
  394. ejkernel-0.0.80/ejkernel/kernels/_triton/ring_attention/_triton_impl_bwd.py +862 -0
  395. ejkernel-0.0.80/ejkernel/kernels/_triton/ring_attention/_triton_impl_fwd.py +225 -0
  396. ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv4/__init__.py +32 -0
  397. ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv4/_interface.py +226 -0
  398. ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv4/_triton_impl_bwd.py +232 -0
  399. ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv4/_triton_impl_fwd.py +281 -0
  400. ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv6/__init__.py +33 -0
  401. ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv6/_interface.py +561 -0
  402. ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv6/_triton_impl_fwd.py +221 -0
  403. ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv7/__init__.py +34 -0
  404. ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv7/_interface.py +684 -0
  405. ejkernel-0.0.80/ejkernel/kernels/_triton/rwkv7/_triton_impl_fwd.py +238 -0
  406. ejkernel-0.0.80/ejkernel/kernels/_triton/unified_attention/__init__.py +30 -0
  407. ejkernel-0.0.80/ejkernel/kernels/_triton/unified_attention/_interface.py +161 -0
  408. ejkernel-0.0.80/ejkernel/kernels/_triton/unified_attention/_triton_impl_fwd.py +1024 -0
  409. ejkernel-0.0.80/ejkernel/kernels/_xla/__init__.py +165 -0
  410. ejkernel-0.0.80/ejkernel/kernels/_xla/all_gather_matmul/__init__.py +24 -0
  411. ejkernel-0.0.80/ejkernel/kernels/_xla/all_gather_matmul/_interface.py +95 -0
  412. ejkernel-0.0.80/ejkernel/kernels/_xla/all_gather_matmul/_xla_impl_bwd.py +50 -0
  413. ejkernel-0.0.80/ejkernel/kernels/_xla/all_gather_matmul/_xla_impl_fwd.py +236 -0
  414. ejkernel-0.0.80/ejkernel/kernels/_xla/attention/__init__.py +31 -0
  415. ejkernel-0.0.80/ejkernel/kernels/_xla/attention/_interface.py +149 -0
  416. ejkernel-0.0.80/ejkernel/kernels/_xla/attention/_xla_impl_bwd.py +23 -0
  417. ejkernel-0.0.80/ejkernel/kernels/_xla/attention/_xla_impl_fwd.py +331 -0
  418. ejkernel-0.0.80/ejkernel/kernels/_xla/blocksparse_attention/__init__.py +30 -0
  419. ejkernel-0.0.80/ejkernel/kernels/_xla/blocksparse_attention/_interface.py +150 -0
  420. ejkernel-0.0.80/ejkernel/kernels/_xla/blocksparse_attention/_xla_impl_bwd.py +23 -0
  421. ejkernel-0.0.80/ejkernel/kernels/_xla/blocksparse_attention/_xla_impl_fwd.py +475 -0
  422. ejkernel-0.0.80/ejkernel/kernels/_xla/chunked_prefill_paged_decode/__init__.py +38 -0
  423. ejkernel-0.0.80/ejkernel/kernels/_xla/chunked_prefill_paged_decode/_interface.py +113 -0
  424. ejkernel-0.0.80/ejkernel/kernels/_xla/chunked_prefill_paged_decode/_xla_impl_fwd.py +237 -0
  425. ejkernel-0.0.80/ejkernel/kernels/_xla/decode_attention/__init__.py +34 -0
  426. ejkernel-0.0.80/ejkernel/kernels/_xla/decode_attention/_interface.py +87 -0
  427. ejkernel-0.0.80/ejkernel/kernels/_xla/decode_attention/_xla_impl_fwd.py +167 -0
  428. ejkernel-0.0.80/ejkernel/kernels/_xla/deepseek_attn/__init__.py +28 -0
  429. ejkernel-0.0.80/ejkernel/kernels/_xla/deepseek_attn/_interface.py +239 -0
  430. ejkernel-0.0.80/ejkernel/kernels/_xla/deepseek_attn/_xla_impl_bwd.py +127 -0
  431. ejkernel-0.0.80/ejkernel/kernels/_xla/deepseek_attn/_xla_impl_fwd.py +211 -0
  432. ejkernel-0.0.80/ejkernel/kernels/_xla/flash_attention/__init__.py +40 -0
  433. ejkernel-0.0.80/ejkernel/kernels/_xla/flash_attention/_interface.py +670 -0
  434. ejkernel-0.0.80/ejkernel/kernels/_xla/flash_attention/_xla_impl_bwd.py +150 -0
  435. ejkernel-0.0.80/ejkernel/kernels/_xla/flash_attention/_xla_impl_fwd.py +742 -0
  436. ejkernel-0.0.80/ejkernel/kernels/_xla/flash_mla/__init__.py +23 -0
  437. ejkernel-0.0.80/ejkernel/kernels/_xla/flash_mla/_xla_impl_bwd.py +21 -0
  438. ejkernel-0.0.80/ejkernel/kernels/_xla/flash_mla/_xla_impl_fwd.py +480 -0
  439. ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/__init__.py +19 -0
  440. ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/_interface.py +68 -0
  441. ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/_xla_impl_bwd.py +124 -0
  442. ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/_xla_impl_chunked.py +396 -0
  443. ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/_xla_impl_fwd.py +227 -0
  444. ejkernel-0.0.80/ejkernel/kernels/_xla/fused_cross_entropy/_xla_impl_linear.py +201 -0
  445. ejkernel-0.0.80/ejkernel/kernels/_xla/fused_kl_divergence/__init__.py +19 -0
  446. ejkernel-0.0.80/ejkernel/kernels/_xla/fused_kl_divergence/_interface.py +62 -0
  447. ejkernel-0.0.80/ejkernel/kernels/_xla/fused_kl_divergence/_xla_impl_bwd.py +110 -0
  448. ejkernel-0.0.80/ejkernel/kernels/_xla/fused_kl_divergence/_xla_impl_fwd.py +201 -0
  449. ejkernel-0.0.80/ejkernel/kernels/_xla/gated_delta_rule/__init__.py +36 -0
  450. ejkernel-0.0.80/ejkernel/kernels/_xla/gated_delta_rule/_interface.py +176 -0
  451. ejkernel-0.0.80/ejkernel/kernels/_xla/gated_delta_rule/_xla_impl_bwd.py +305 -0
  452. ejkernel-0.0.80/ejkernel/kernels/_xla/gated_delta_rule/_xla_impl_fwd.py +814 -0
  453. ejkernel-0.0.80/ejkernel/kernels/_xla/gla/__init__.py +34 -0
  454. ejkernel-0.0.80/ejkernel/kernels/_xla/gla/_interface.py +122 -0
  455. ejkernel-0.0.80/ejkernel/kernels/_xla/gla/_xla_impl_bwd.py +21 -0
  456. ejkernel-0.0.80/ejkernel/kernels/_xla/gla/_xla_impl_fwd.py +119 -0
  457. ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmul/__init__.py +30 -0
  458. ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmul/_interface.py +112 -0
  459. ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmul/_xla_impl_bwd.py +21 -0
  460. ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmul/_xla_impl_fwd.py +122 -0
  461. ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmulv3/__init__.py +30 -0
  462. ejkernel-0.0.80/ejkernel/kernels/_xla/grouped_matmulv3/_interface.py +583 -0
  463. ejkernel-0.0.80/ejkernel/kernels/_xla/kernel_delta_attention/__init__.py +43 -0
  464. ejkernel-0.0.80/ejkernel/kernels/_xla/kernel_delta_attention/_interface.py +239 -0
  465. ejkernel-0.0.80/ejkernel/kernels/_xla/kernel_delta_attention/_xla_impl_fwd.py +467 -0
  466. ejkernel-0.0.80/ejkernel/kernels/_xla/lightning_attn/__init__.py +34 -0
  467. ejkernel-0.0.80/ejkernel/kernels/_xla/lightning_attn/_interface.py +107 -0
  468. ejkernel-0.0.80/ejkernel/kernels/_xla/lightning_attn/_xla_impl_bwd.py +21 -0
  469. ejkernel-0.0.80/ejkernel/kernels/_xla/lightning_attn/_xla_impl_fwd.py +141 -0
  470. ejkernel-0.0.80/ejkernel/kernels/_xla/mean_pooling/__init__.py +29 -0
  471. ejkernel-0.0.80/ejkernel/kernels/_xla/mean_pooling/_interface.py +82 -0
  472. ejkernel-0.0.80/ejkernel/kernels/_xla/mean_pooling/_xla_impl_bwd.py +104 -0
  473. ejkernel-0.0.80/ejkernel/kernels/_xla/mean_pooling/_xla_impl_fwd.py +191 -0
  474. ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/__init__.py +43 -0
  475. ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/_interface.py +145 -0
  476. ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/_xla_impl_fwd.py +488 -0
  477. ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/__init__.py +28 -0
  478. ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/_interface.py +125 -0
  479. ejkernel-0.0.80/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/_xla_impl_fwd.py +117 -0
  480. ejkernel-0.0.80/ejkernel/kernels/_xla/native_sparse_attention/__init__.py +48 -0
  481. ejkernel-0.0.80/ejkernel/kernels/_xla/native_sparse_attention/_interface.py +623 -0
  482. ejkernel-0.0.80/ejkernel/kernels/_xla/native_sparse_attention/_xla_impl_bwd.py +241 -0
  483. ejkernel-0.0.80/ejkernel/kernels/_xla/native_sparse_attention/_xla_impl_fwd.py +171 -0
  484. ejkernel-0.0.80/ejkernel/kernels/_xla/page_attention/__init__.py +30 -0
  485. ejkernel-0.0.80/ejkernel/kernels/_xla/page_attention/_interface.py +166 -0
  486. ejkernel-0.0.80/ejkernel/kernels/_xla/page_attention/_xla_impl_fwd.py +161 -0
  487. ejkernel-0.0.80/ejkernel/kernels/_xla/prefill_page_attention/__init__.py +77 -0
  488. ejkernel-0.0.80/ejkernel/kernels/_xla/prefill_page_attention/_impl.py +172 -0
  489. ejkernel-0.0.80/ejkernel/kernels/_xla/prefill_page_attention/_interface.py +97 -0
  490. ejkernel-0.0.80/ejkernel/kernels/_xla/quantized_matmul/__init__.py +33 -0
  491. ejkernel-0.0.80/ejkernel/kernels/_xla/quantized_matmul/_interface.py +139 -0
  492. ejkernel-0.0.80/ejkernel/kernels/_xla/quantized_matmul/_xla_impl_bwd.py +21 -0
  493. ejkernel-0.0.80/ejkernel/kernels/_xla/quantized_matmul/_xla_impl_fwd.py +751 -0
  494. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_decode_attention/__init__.py +29 -0
  495. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_decode_attention/_interface.py +109 -0
  496. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_decode_attention/_xla_impl_fwd.py +600 -0
  497. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_gated_delta_rule/__init__.py +25 -0
  498. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_gated_delta_rule/_interface.py +117 -0
  499. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_gated_delta_rule/_xla_impl_fwd.py +503 -0
  500. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v2/__init__.py +30 -0
  501. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v2/_interface.py +131 -0
  502. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v2/_xla_impl_fwd.py +318 -0
  503. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v2_turboquant/_interface.py +158 -0
  504. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v2_turboquant/_xla_impl_fwd.py +407 -0
  505. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3/__init__.py +34 -0
  506. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3/_interface.py +195 -0
  507. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3/_xla_impl_bwd.py +21 -0
  508. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3/_xla_impl_fwd.py +694 -0
  509. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3_turboquant/_interface.py +142 -0
  510. ejkernel-0.0.80/ejkernel/kernels/_xla/ragged_page_attention_v3_turboquant/_xla_impl_fwd.py +561 -0
  511. ejkernel-0.0.80/ejkernel/kernels/_xla/recurrent/__init__.py +44 -0
  512. ejkernel-0.0.80/ejkernel/kernels/_xla/recurrent/_interface.py +781 -0
  513. ejkernel-0.0.80/ejkernel/kernels/_xla/recurrent/_xla_impl_bwd.py +254 -0
  514. ejkernel-0.0.80/ejkernel/kernels/_xla/recurrent/_xla_impl_fwd.py +361 -0
  515. ejkernel-0.0.80/ejkernel/kernels/_xla/reduce_scatter_matmul/__init__.py +28 -0
  516. ejkernel-0.0.80/ejkernel/kernels/_xla/reduce_scatter_matmul/_interface.py +86 -0
  517. ejkernel-0.0.80/ejkernel/kernels/_xla/reduce_scatter_matmul/_xla_impl_bwd.py +66 -0
  518. ejkernel-0.0.80/ejkernel/kernels/_xla/reduce_scatter_matmul/_xla_impl_fwd.py +158 -0
  519. ejkernel-0.0.80/ejkernel/kernels/_xla/ring_attention/__init__.py +45 -0
  520. ejkernel-0.0.80/ejkernel/kernels/_xla/ring_attention/_interface.py +393 -0
  521. ejkernel-0.0.80/ejkernel/kernels/_xla/ring_attention/_utils.py +250 -0
  522. ejkernel-0.0.80/ejkernel/kernels/_xla/ring_attention/_xla_impl_bwd.py +505 -0
  523. ejkernel-0.0.80/ejkernel/kernels/_xla/ring_attention/_xla_impl_fwd.py +571 -0
  524. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv4/__init__.py +45 -0
  525. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv4/_interface.py +94 -0
  526. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv4/_xla_impl_bwd.py +21 -0
  527. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv4/_xla_impl_fwd.py +219 -0
  528. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv6/__init__.py +43 -0
  529. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv6/_interface.py +107 -0
  530. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv6/_xla_impl_bwd.py +21 -0
  531. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv6/_xla_impl_fwd.py +412 -0
  532. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv7/__init__.py +42 -0
  533. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv7/_interface.py +195 -0
  534. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv7/_xla_impl_bwd.py +21 -0
  535. ejkernel-0.0.80/ejkernel/kernels/_xla/rwkv7/_xla_impl_fwd.py +533 -0
  536. ejkernel-0.0.80/ejkernel/kernels/_xla/scaled_dot_product_attention/__init__.py +24 -0
  537. ejkernel-0.0.80/ejkernel/kernels/_xla/scaled_dot_product_attention/_interface.py +105 -0
  538. ejkernel-0.0.80/ejkernel/kernels/_xla/scaled_dot_product_attention/_xla_impl_bwd.py +21 -0
  539. ejkernel-0.0.80/ejkernel/kernels/_xla/scaled_dot_product_attention/_xla_impl_fwd.py +101 -0
  540. ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v1/__init__.py +35 -0
  541. ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v1/_interface.py +346 -0
  542. ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v1/_xla_impl_bwd.py +192 -0
  543. ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v1/_xla_impl_fwd.py +215 -0
  544. ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v2/__init__.py +31 -0
  545. ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v2/_interface.py +381 -0
  546. ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v2/_xla_impl_bwd.py +195 -0
  547. ejkernel-0.0.80/ejkernel/kernels/_xla/state_space_v2/_xla_impl_fwd.py +209 -0
  548. ejkernel-0.0.80/ejkernel/kernels/_xla/unified_attention/__init__.py +30 -0
  549. ejkernel-0.0.80/ejkernel/kernels/_xla/unified_attention/_interface.py +197 -0
  550. ejkernel-0.0.80/ejkernel/kernels/_xla/unified_attention/_xla_impl_fwd.py +334 -0
  551. ejkernel-0.0.80/ejkernel/loggings.py +606 -0
  552. ejkernel-0.0.80/ejkernel/modules/__init__.py +338 -0
  553. ejkernel-0.0.80/ejkernel/modules/base.py +279 -0
  554. ejkernel-0.0.80/ejkernel/modules/operations/__init__.py +338 -0
  555. ejkernel-0.0.80/ejkernel/modules/operations/all_gather_matmul.py +500 -0
  556. ejkernel-0.0.80/ejkernel/modules/operations/attention.py +490 -0
  557. ejkernel-0.0.80/ejkernel/modules/operations/blocksparse_attention.py +1053 -0
  558. ejkernel-0.0.80/ejkernel/modules/operations/chunked_prefill_paged_decode.py +561 -0
  559. ejkernel-0.0.80/ejkernel/modules/operations/configs.py +1158 -0
  560. ejkernel-0.0.80/ejkernel/modules/operations/decode_attention.py +489 -0
  561. ejkernel-0.0.80/ejkernel/modules/operations/deepseek_attn.py +371 -0
  562. ejkernel-0.0.80/ejkernel/modules/operations/flash_attention.py +958 -0
  563. ejkernel-0.0.80/ejkernel/modules/operations/fused_cross_entropy.py +912 -0
  564. ejkernel-0.0.80/ejkernel/modules/operations/fused_kl_divergence.py +633 -0
  565. ejkernel-0.0.80/ejkernel/modules/operations/gated_delta_rule.py +544 -0
  566. ejkernel-0.0.80/ejkernel/modules/operations/gated_linear_attention.py +455 -0
  567. ejkernel-0.0.80/ejkernel/modules/operations/grouped_matmul.py +669 -0
  568. ejkernel-0.0.80/ejkernel/modules/operations/kernel_delta_attention.py +388 -0
  569. ejkernel-0.0.80/ejkernel/modules/operations/lightning_attention.py +490 -0
  570. ejkernel-0.0.80/ejkernel/modules/operations/multi_head_latent_attention.py +425 -0
  571. ejkernel-0.0.80/ejkernel/modules/operations/multi_latent_ragged_page_attention.py +506 -0
  572. ejkernel-0.0.80/ejkernel/modules/operations/multi_latent_ragged_page_attention_v2.py +460 -0
  573. ejkernel-0.0.80/ejkernel/modules/operations/native_sparse_attention.py +495 -0
  574. ejkernel-0.0.80/ejkernel/modules/operations/page_attention.py +539 -0
  575. ejkernel-0.0.80/ejkernel/modules/operations/pooling.py +397 -0
  576. ejkernel-0.0.80/ejkernel/modules/operations/prefill_page_attention.py +439 -0
  577. ejkernel-0.0.80/ejkernel/modules/operations/quantized_matmul.py +2203 -0
  578. ejkernel-0.0.80/ejkernel/modules/operations/ragged_decode_attention.py +681 -0
  579. ejkernel-0.0.80/ejkernel/modules/operations/ragged_gated_delta_rule.py +429 -0
  580. ejkernel-0.0.80/ejkernel/modules/operations/ragged_page_attention_v2.py +783 -0
  581. ejkernel-0.0.80/ejkernel/modules/operations/ragged_page_attention_v2_turboquant.py +653 -0
  582. ejkernel-0.0.80/ejkernel/modules/operations/ragged_page_attention_v3.py +1344 -0
  583. ejkernel-0.0.80/ejkernel/modules/operations/ragged_page_attention_v3_turboquant.py +692 -0
  584. ejkernel-0.0.80/ejkernel/modules/operations/recurrent.py +503 -0
  585. ejkernel-0.0.80/ejkernel/modules/operations/reduce_scatter_matmul.py +501 -0
  586. ejkernel-0.0.80/ejkernel/modules/operations/ring_attention.py +635 -0
  587. ejkernel-0.0.80/ejkernel/modules/operations/rwkv4.py +334 -0
  588. ejkernel-0.0.80/ejkernel/modules/operations/rwkv6.py +347 -0
  589. ejkernel-0.0.80/ejkernel/modules/operations/rwkv7.py +733 -0
  590. ejkernel-0.0.80/ejkernel/modules/operations/scaled_dot_product_attention.py +615 -0
  591. ejkernel-0.0.80/ejkernel/modules/operations/state_space_v1.py +388 -0
  592. ejkernel-0.0.80/ejkernel/modules/operations/state_space_v2.py +445 -0
  593. ejkernel-0.0.80/ejkernel/modules/operations/unified_attention.py +669 -0
  594. ejkernel-0.0.80/ejkernel/ops/__init__.py +152 -0
  595. ejkernel-0.0.80/ejkernel/ops/config/__init__.py +62 -0
  596. ejkernel-0.0.80/ejkernel/ops/config/cache.py +199 -0
  597. ejkernel-0.0.80/ejkernel/ops/config/persistent.py +248 -0
  598. ejkernel-0.0.80/ejkernel/ops/config/selection.py +832 -0
  599. ejkernel-0.0.80/ejkernel/ops/core/__init__.py +58 -0
  600. ejkernel-0.0.80/ejkernel/ops/core/kernel.py +818 -0
  601. ejkernel-0.0.80/ejkernel/ops/core/types.py +50 -0
  602. ejkernel-0.0.80/ejkernel/ops/execution/__init__.py +106 -0
  603. ejkernel-0.0.80/ejkernel/ops/execution/batch.py +191 -0
  604. ejkernel-0.0.80/ejkernel/ops/execution/executor.py +744 -0
  605. ejkernel-0.0.80/ejkernel/ops/execution/offline.py +163 -0
  606. ejkernel-0.0.80/ejkernel/ops/execution/profiler.py +506 -0
  607. ejkernel-0.0.80/ejkernel/ops/execution/tuning.py +1600 -0
  608. ejkernel-0.0.80/ejkernel/ops/registry.py +93 -0
  609. ejkernel-0.0.80/ejkernel/ops/utils/__init__.py +77 -0
  610. ejkernel-0.0.80/ejkernel/ops/utils/datacarrier.py +176 -0
  611. ejkernel-0.0.80/ejkernel/ops/utils/fingerprint.py +353 -0
  612. ejkernel-0.0.80/ejkernel/ops/utils/meta.py +166 -0
  613. ejkernel-0.0.80/ejkernel/ops/utils/serialize.py +98 -0
  614. ejkernel-0.0.80/ejkernel/quantization/__init__.py +86 -0
  615. ejkernel-0.0.80/ejkernel/quantization/_quants/__init__.py +37 -0
  616. ejkernel-0.0.80/ejkernel/quantization/_quants/quantizations.py +1407 -0
  617. ejkernel-0.0.80/ejkernel/quantization/_utils/__init__.py +90 -0
  618. ejkernel-0.0.80/ejkernel/quantization/_utils/bitpack.py +250 -0
  619. ejkernel-0.0.80/ejkernel/quantization/_utils/fp_tables.py +302 -0
  620. ejkernel-0.0.80/ejkernel/quantization/_utils/grouping.py +220 -0
  621. ejkernel-0.0.80/ejkernel/quantization/_utils/qparams.py +539 -0
  622. ejkernel-0.0.80/ejkernel/quantization/quantized_array.py +599 -0
  623. ejkernel-0.0.80/ejkernel/quantization/runtime.py +164 -0
  624. ejkernel-0.0.80/ejkernel/quantization/turboquant/codebook.py +213 -0
  625. ejkernel-0.0.80/ejkernel/quantization/turboquant/matrices.py +79 -0
  626. ejkernel-0.0.80/ejkernel/quantization/turboquant/ops.py +219 -0
  627. ejkernel-0.0.80/ejkernel/quantization/turboquant/packing.py +114 -0
  628. ejkernel-0.0.80/ejkernel/types/__init__.py +57 -0
  629. ejkernel-0.0.80/ejkernel/types/mask.py +3313 -0
  630. ejkernel-0.0.80/ejkernel/utils.py +1210 -0
  631. ejkernel-0.0.80/ejkernel/xla_utils/__init__.py +122 -0
  632. ejkernel-0.0.80/ejkernel/xla_utils/cumsum.py +568 -0
  633. ejkernel-0.0.80/ejkernel/xla_utils/shardings.py +270 -0
  634. ejkernel-0.0.80/ejkernel/xla_utils/utils.py +376 -0
  635. ejkernel-0.0.80/pyproject.toml +149 -0
  636. ejkernel-0.0.79/PKG-INFO +0 -678
  637. ejkernel-0.0.79/csrc/flash_attention/CMakeLists.txt +0 -75
  638. ejkernel-0.0.79/csrc/flash_attention/src/flash_fwd_launch_template.h +0 -447
  639. ejkernel-0.0.79/csrc/quantized_matmul/CMakeLists.txt +0 -62
  640. ejkernel-0.0.79/csrc/quantized_matmul/src/code_gen.py +0 -378
  641. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_cuda_impl.h +0 -2337
  642. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_bf16.cu +0 -4092
  643. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_f16.cu +0 -4092
  644. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits2_f32.cu +0 -4092
  645. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_bf16.cu +0 -4092
  646. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_f16.cu +0 -4092
  647. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits3_f32.cu +0 -4092
  648. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_bf16.cu +0 -4092
  649. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_f16.cu +0 -4092
  650. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits4_f32.cu +0 -4092
  651. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_bf16.cu +0 -4092
  652. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_f16.cu +0 -4092
  653. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits5_f32.cu +0 -4092
  654. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_bf16.cu +0 -4092
  655. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_f16.cu +0 -4092
  656. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits6_f32.cu +0 -4092
  657. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_bf16.cu +0 -4092
  658. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_f16.cu +0 -4092
  659. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits7_f32.cu +0 -4092
  660. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_bf16.cu +0 -4092
  661. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_f16.cu +0 -4092
  662. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_affine_bits8_f32.cu +0 -4092
  663. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_dispatch.h +0 -18405
  664. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_mxfp4.cu +0 -22
  665. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_mxfp8.cu +0 -22
  666. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_nf4_bf16.cu +0 -4092
  667. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_nf4_f16.cu +0 -4092
  668. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_nf4_f32.cu +0 -4092
  669. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_nvfp4.cu +0 -22
  670. ejkernel-0.0.79/csrc/quantized_matmul/src/qmm_dequant_nvfp8.cu +0 -22
  671. ejkernel-0.0.79/ejkernel/__init__.py +0 -65
  672. ejkernel-0.0.79/ejkernel/benchmarks.py +0 -1028
  673. ejkernel-0.0.79/ejkernel/build_cudalib.py +0 -30
  674. ejkernel-0.0.79/ejkernel/callib/__init__.py +0 -160
  675. ejkernel-0.0.79/ejkernel/callib/_cute_call.py +0 -537
  676. ejkernel-0.0.79/ejkernel/callib/_cute_ffi.py +0 -496
  677. ejkernel-0.0.79/ejkernel/callib/_ejit.py +0 -666
  678. ejkernel-0.0.79/ejkernel/callib/_pallas_call.py +0 -275
  679. ejkernel-0.0.79/ejkernel/callib/_triton_call.py +0 -1554
  680. ejkernel-0.0.79/ejkernel/callib/_utils.py +0 -273
  681. ejkernel-0.0.79/ejkernel/errors.py +0 -45
  682. ejkernel-0.0.79/ejkernel/kernels/__init__.py +0 -119
  683. ejkernel-0.0.79/ejkernel/kernels/_cuda/__init__.py +0 -108
  684. ejkernel-0.0.79/ejkernel/kernels/_cuda/blocksparse_attention/__init__.py +0 -46
  685. ejkernel-0.0.79/ejkernel/kernels/_cuda/blocksparse_attention/_build.py +0 -197
  686. ejkernel-0.0.79/ejkernel/kernels/_cuda/blocksparse_attention/_cuda_impl.py +0 -275
  687. ejkernel-0.0.79/ejkernel/kernels/_cuda/blocksparse_attention/_interface.py +0 -813
  688. ejkernel-0.0.79/ejkernel/kernels/_cuda/flash_attention/__init__.py +0 -30
  689. ejkernel-0.0.79/ejkernel/kernels/_cuda/flash_attention/_build.py +0 -232
  690. ejkernel-0.0.79/ejkernel/kernels/_cuda/flash_attention/_cuda_impl.py +0 -478
  691. ejkernel-0.0.79/ejkernel/kernels/_cuda/flash_attention/_interface.py +0 -783
  692. ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/__init__.py +0 -39
  693. ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/_build.py +0 -205
  694. ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl.py +0 -303
  695. ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl_bwd.py +0 -105
  696. ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/_cuda_impl_fwd.py +0 -98
  697. ejkernel-0.0.79/ejkernel/kernels/_cuda/quantized_matmul/_interface.py +0 -277
  698. ejkernel-0.0.79/ejkernel/kernels/_cuda/ragged_page_attention_v3/__init__.py +0 -43
  699. ejkernel-0.0.79/ejkernel/kernels/_cuda/ragged_page_attention_v3/_build.py +0 -206
  700. ejkernel-0.0.79/ejkernel/kernels/_cuda/ragged_page_attention_v3/_cuda_impl.py +0 -178
  701. ejkernel-0.0.79/ejkernel/kernels/_cuda/ragged_page_attention_v3/_interface.py +0 -153
  702. ejkernel-0.0.79/ejkernel/kernels/_cuda/unified_attention/__init__.py +0 -49
  703. ejkernel-0.0.79/ejkernel/kernels/_cuda/unified_attention/_build.py +0 -199
  704. ejkernel-0.0.79/ejkernel/kernels/_cuda/unified_attention/_cuda_impl.py +0 -185
  705. ejkernel-0.0.79/ejkernel/kernels/_cuda/unified_attention/_interface.py +0 -167
  706. ejkernel-0.0.79/ejkernel/kernels/_cute/__init__.py +0 -32
  707. ejkernel-0.0.79/ejkernel/kernels/_cute/chunked_prefill_paged_decode/__init__.py +0 -24
  708. ejkernel-0.0.79/ejkernel/kernels/_cute/chunked_prefill_paged_decode/_cute_impl_fwd.py +0 -498
  709. ejkernel-0.0.79/ejkernel/kernels/_cute/chunked_prefill_paged_decode/_interface.py +0 -105
  710. ejkernel-0.0.79/ejkernel/kernels/_cute/flash_attention/__init__.py +0 -19
  711. ejkernel-0.0.79/ejkernel/kernels/_cute/flash_attention/_cute_impl.py +0 -1443
  712. ejkernel-0.0.79/ejkernel/kernels/_cute/flash_attention/_interface.py +0 -516
  713. ejkernel-0.0.79/ejkernel/kernels/_cute/quantized_matmul/__init__.py +0 -19
  714. ejkernel-0.0.79/ejkernel/kernels/_cute/quantized_matmul/_cute_impl.py +0 -3317
  715. ejkernel-0.0.79/ejkernel/kernels/_cute/quantized_matmul/_cute_impl_bwd.py +0 -119
  716. ejkernel-0.0.79/ejkernel/kernels/_cute/quantized_matmul/_cute_impl_fwd.py +0 -121
  717. ejkernel-0.0.79/ejkernel/kernels/_cute/quantized_matmul/_interface.py +0 -309
  718. ejkernel-0.0.79/ejkernel/kernels/_cute/ragged_page_attention_v3/__init__.py +0 -15
  719. ejkernel-0.0.79/ejkernel/kernels/_cute/ragged_page_attention_v3/_cute_impl_fwd.py +0 -15
  720. ejkernel-0.0.79/ejkernel/kernels/_cute/ragged_page_attention_v3/_interface.py +0 -15
  721. ejkernel-0.0.79/ejkernel/kernels/_cute/unified_attention/__init__.py +0 -19
  722. ejkernel-0.0.79/ejkernel/kernels/_cute/unified_attention/_cute_impl.py +0 -163
  723. ejkernel-0.0.79/ejkernel/kernels/_cute/unified_attention/_interface.py +0 -122
  724. ejkernel-0.0.79/ejkernel/kernels/_pallas/__init__.py +0 -38
  725. ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/__init__.py +0 -30
  726. ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/__init__.py +0 -30
  727. ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/_interface.py +0 -123
  728. ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/ragged_decode_attention/_pallas_impl_fwd.py +0 -411
  729. ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/__init__.py +0 -25
  730. ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_interface.py +0 -122
  731. ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_pallas_impl_bwd.py +0 -21
  732. ejkernel-0.0.79/ejkernel/kernels/_pallas/gpu/scaled_dot_product_attention/_pallas_impl_fwd.py +0 -124
  733. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/__init__.py +0 -80
  734. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/all_gather_matmul/__init__.py +0 -19
  735. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/all_gather_matmul/_interface.py +0 -181
  736. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/all_gather_matmul/_pallas_impl.py +0 -636
  737. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/blocksparse_attention/__init__.py +0 -131
  738. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_info.py +0 -1066
  739. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_interface.py +0 -110
  740. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_kernel.py +0 -2869
  741. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/blocksparse_attention/_masks.py +0 -640
  742. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/deepseek_attn/__init__.py +0 -27
  743. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/deepseek_attn/_interface.py +0 -159
  744. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/deepseek_attn/_pallas_impl_bwd.py +0 -97
  745. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/deepseek_attn/_pallas_impl_fwd.py +0 -259
  746. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_attention/__init__.py +0 -36
  747. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_attention/_interface.py +0 -385
  748. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_attention/_pallas_impl_bwd.py +0 -887
  749. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_attention/_pallas_impl_fwd.py +0 -664
  750. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_attention/_utils.py +0 -590
  751. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_mla/__init__.py +0 -33
  752. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_mla/_interface.py +0 -212
  753. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_mla/_pallas_impl_bwd.py +0 -858
  754. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_mla/_pallas_impl_fwd.py +0 -586
  755. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/flash_mla/_utils.py +0 -71
  756. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_interface.py +0 -132
  757. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_pallas_impl_bwd.py +0 -528
  758. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/gated_delta_rule/_pallas_impl_fwd.py +0 -640
  759. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmul/__init__.py +0 -30
  760. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmul/_interface.py +0 -298
  761. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmul/_pallas_impl.py +0 -996
  762. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmul/_utils.py +0 -191
  763. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/__init__.py +0 -30
  764. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/_interface.py +0 -311
  765. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv2/_pallas_impl.py +0 -627
  766. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/__init__.py +0 -19
  767. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/_interface.py +0 -296
  768. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/grouped_matmulv3/_pallas_impl.py +0 -995
  769. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/__init__.py +0 -19
  770. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/_interface.py +0 -143
  771. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention/_pallas_impl_fwd.py +0 -1329
  772. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/__init__.py +0 -19
  773. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/_interface.py +0 -177
  774. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/multi_latent_ragged_page_attention_v2/_pallas_impl_fwd.py +0 -1407
  775. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/page_attention/__init__.py +0 -30
  776. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/page_attention/_interface.py +0 -352
  777. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/page_attention/_pallas_impl_fwd.py +0 -427
  778. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/prefill_page_attention/__init__.py +0 -30
  779. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/prefill_page_attention/_interface.py +0 -227
  780. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/prefill_page_attention/_pallas_impl_fwd.py +0 -438
  781. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/quantized_matmul/__init__.py +0 -23
  782. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/quantized_matmul/_interface.py +0 -741
  783. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_bwd.py +0 -473
  784. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_core.py +0 -1014
  785. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/quantized_matmul/_pallas_impl_fwd.py +0 -332
  786. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/__init__.py +0 -30
  787. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/_interface.py +0 -125
  788. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_decode_attention/_pallas_impl_fwd.py +0 -335
  789. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/__init__.py +0 -25
  790. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/_interface.py +0 -145
  791. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_gated_delta_rule/_pallas_impl_fwd.py +0 -140
  792. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/__init__.py +0 -30
  793. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_interface.py +0 -339
  794. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_pallas_impl_fwd.py +0 -813
  795. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v2/_utils.py +0 -561
  796. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/__init__.py +0 -30
  797. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_interface.py +0 -205
  798. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_pallas_impl_fwd.py +0 -1789
  799. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_pallas_impl_fwd_h64.py +0 -1611
  800. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ragged_page_attention_v3/_utils.py +0 -4792
  801. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/__init__.py +0 -19
  802. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/_interface.py +0 -121
  803. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/reduce_scatter_matmul/_pallas_impl.py +0 -782
  804. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ring_attention/__init__.py +0 -48
  805. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ring_attention/_interface.py +0 -102
  806. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ring_attention/_pallas_impl_bwd.py +0 -921
  807. ejkernel-0.0.79/ejkernel/kernels/_pallas/tpu/ring_attention/_pallas_impl_fwd.py +0 -458
  808. ejkernel-0.0.79/ejkernel/kernels/_registry.py +0 -720
  809. ejkernel-0.0.79/ejkernel/kernels/_triton/__init__.py +0 -79
  810. ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/__init__.py +0 -33
  811. ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/_interface.py +0 -462
  812. ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/_mask.py +0 -585
  813. ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/_triton_impl_bwd.py +0 -1475
  814. ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/_triton_impl_fwd.py +0 -753
  815. ejkernel-0.0.79/ejkernel/kernels/_triton/blocksparse_attention/_utilities.py +0 -606
  816. ejkernel-0.0.79/ejkernel/kernels/_triton/chunked_prefill_paged_decode/__init__.py +0 -31
  817. ejkernel-0.0.79/ejkernel/kernels/_triton/chunked_prefill_paged_decode/_interface.py +0 -113
  818. ejkernel-0.0.79/ejkernel/kernels/_triton/chunked_prefill_paged_decode/_triton_impl_fwd.py +0 -244
  819. ejkernel-0.0.79/ejkernel/kernels/_triton/decode_attention/__init__.py +0 -23
  820. ejkernel-0.0.79/ejkernel/kernels/_triton/decode_attention/_interface.py +0 -96
  821. ejkernel-0.0.79/ejkernel/kernels/_triton/decode_attention/_triton_impl_fwd.py +0 -456
  822. ejkernel-0.0.79/ejkernel/kernels/_triton/flash_attention/__init__.py +0 -36
  823. ejkernel-0.0.79/ejkernel/kernels/_triton/flash_attention/_interface.py +0 -446
  824. ejkernel-0.0.79/ejkernel/kernels/_triton/flash_attention/_triton_impl_bwd.py +0 -1743
  825. ejkernel-0.0.79/ejkernel/kernels/_triton/flash_attention/_triton_impl_fwd.py +0 -1050
  826. ejkernel-0.0.79/ejkernel/kernels/_triton/flash_attention/_utilities.py +0 -472
  827. ejkernel-0.0.79/ejkernel/kernels/_triton/flash_mla/__init__.py +0 -25
  828. ejkernel-0.0.79/ejkernel/kernels/_triton/flash_mla/_interface.py +0 -155
  829. ejkernel-0.0.79/ejkernel/kernels/_triton/flash_mla/_triton_impl_bwd.py +0 -40
  830. ejkernel-0.0.79/ejkernel/kernels/_triton/flash_mla/_triton_impl_fwd.py +0 -218
  831. ejkernel-0.0.79/ejkernel/kernels/_triton/flash_mla/_utilities.py +0 -39
  832. ejkernel-0.0.79/ejkernel/kernels/_triton/gla/__init__.py +0 -30
  833. ejkernel-0.0.79/ejkernel/kernels/_triton/gla/_interface.py +0 -78
  834. ejkernel-0.0.79/ejkernel/kernels/_triton/gla/_triton_impl_bwd.py +0 -21
  835. ejkernel-0.0.79/ejkernel/kernels/_triton/gla/_triton_impl_fwd.py +0 -131
  836. ejkernel-0.0.79/ejkernel/kernels/_triton/lightning_attn/__init__.py +0 -30
  837. ejkernel-0.0.79/ejkernel/kernels/_triton/lightning_attn/_interface.py +0 -81
  838. ejkernel-0.0.79/ejkernel/kernels/_triton/lightning_attn/_triton_impl_bwd.py +0 -21
  839. ejkernel-0.0.79/ejkernel/kernels/_triton/lightning_attn/_triton_impl_fwd.py +0 -138
  840. ejkernel-0.0.79/ejkernel/kernels/_triton/mean_pooling/__init__.py +0 -42
  841. ejkernel-0.0.79/ejkernel/kernels/_triton/mean_pooling/_interface.py +0 -164
  842. ejkernel-0.0.79/ejkernel/kernels/_triton/mean_pooling/_triton_impl_bwd.py +0 -161
  843. ejkernel-0.0.79/ejkernel/kernels/_triton/mean_pooling/_triton_impl_fwd.py +0 -157
  844. ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/__init__.py +0 -37
  845. ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/_compression.py +0 -753
  846. ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/_interface.py +0 -404
  847. ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/_triton_impl_bwd.py +0 -441
  848. ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/_triton_impl_fwd.py +0 -473
  849. ejkernel-0.0.79/ejkernel/kernels/_triton/native_sparse_attention/_utilities.py +0 -123
  850. ejkernel-0.0.79/ejkernel/kernels/_triton/page_attention/__init__.py +0 -30
  851. ejkernel-0.0.79/ejkernel/kernels/_triton/page_attention/_interface.py +0 -379
  852. ejkernel-0.0.79/ejkernel/kernels/_triton/page_attention/_triton_impl_fwd.py +0 -352
  853. ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/__init__.py +0 -19
  854. ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/_interface.py +0 -369
  855. ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/_triton_impl.py +0 -2945
  856. ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_bwd.py +0 -114
  857. ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_fwd.py +0 -268
  858. ejkernel-0.0.79/ejkernel/kernels/_triton/quantized_matmul/_triton_impl_gemv.py +0 -572
  859. ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_decode_attention/__init__.py +0 -30
  860. ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_decode_attention/_interface.py +0 -173
  861. ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_decode_attention/_triton_impl_fwd.py +0 -392
  862. ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v2/__init__.py +0 -30
  863. ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v2/_interface.py +0 -228
  864. ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v2/_triton_impl_fwd.py +0 -796
  865. ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v3/__init__.py +0 -30
  866. ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v3/_interface.py +0 -205
  867. ejkernel-0.0.79/ejkernel/kernels/_triton/ragged_page_attention_v3/_triton_impl_fwd.py +0 -652
  868. ejkernel-0.0.79/ejkernel/kernels/_triton/recurrent/__init__.py +0 -42
  869. ejkernel-0.0.79/ejkernel/kernels/_triton/recurrent/_interface.py +0 -291
  870. ejkernel-0.0.79/ejkernel/kernels/_triton/recurrent/_triton_impl_bwd.py +0 -411
  871. ejkernel-0.0.79/ejkernel/kernels/_triton/recurrent/_triton_impl_fwd.py +0 -272
  872. ejkernel-0.0.79/ejkernel/kernels/_triton/ring_attention/__init__.py +0 -33
  873. ejkernel-0.0.79/ejkernel/kernels/_triton/ring_attention/_interface.py +0 -154
  874. ejkernel-0.0.79/ejkernel/kernels/_triton/ring_attention/_triton_impl_bwd.py +0 -780
  875. ejkernel-0.0.79/ejkernel/kernels/_triton/ring_attention/_triton_impl_fwd.py +0 -226
  876. ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv4/__init__.py +0 -32
  877. ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv4/_interface.py +0 -221
  878. ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv4/_triton_impl_bwd.py +0 -205
  879. ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv4/_triton_impl_fwd.py +0 -258
  880. ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv6/__init__.py +0 -33
  881. ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv6/_interface.py +0 -498
  882. ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv6/_triton_impl_fwd.py +0 -221
  883. ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv7/__init__.py +0 -34
  884. ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv7/_interface.py +0 -585
  885. ejkernel-0.0.79/ejkernel/kernels/_triton/rwkv7/_triton_impl_fwd.py +0 -237
  886. ejkernel-0.0.79/ejkernel/kernels/_triton/unified_attention/__init__.py +0 -30
  887. ejkernel-0.0.79/ejkernel/kernels/_triton/unified_attention/_interface.py +0 -159
  888. ejkernel-0.0.79/ejkernel/kernels/_triton/unified_attention/_triton_impl_fwd.py +0 -1029
  889. ejkernel-0.0.79/ejkernel/kernels/_xla/__init__.py +0 -147
  890. ejkernel-0.0.79/ejkernel/kernels/_xla/all_gather_matmul/__init__.py +0 -19
  891. ejkernel-0.0.79/ejkernel/kernels/_xla/all_gather_matmul/_interface.py +0 -55
  892. ejkernel-0.0.79/ejkernel/kernels/_xla/all_gather_matmul/_xla_impl_bwd.py +0 -50
  893. ejkernel-0.0.79/ejkernel/kernels/_xla/all_gather_matmul/_xla_impl_fwd.py +0 -136
  894. ejkernel-0.0.79/ejkernel/kernels/_xla/attention/__init__.py +0 -31
  895. ejkernel-0.0.79/ejkernel/kernels/_xla/attention/_interface.py +0 -134
  896. ejkernel-0.0.79/ejkernel/kernels/_xla/attention/_xla_impl_bwd.py +0 -21
  897. ejkernel-0.0.79/ejkernel/kernels/_xla/attention/_xla_impl_fwd.py +0 -332
  898. ejkernel-0.0.79/ejkernel/kernels/_xla/blocksparse_attention/__init__.py +0 -30
  899. ejkernel-0.0.79/ejkernel/kernels/_xla/blocksparse_attention/_interface.py +0 -144
  900. ejkernel-0.0.79/ejkernel/kernels/_xla/blocksparse_attention/_xla_impl_bwd.py +0 -21
  901. ejkernel-0.0.79/ejkernel/kernels/_xla/blocksparse_attention/_xla_impl_fwd.py +0 -478
  902. ejkernel-0.0.79/ejkernel/kernels/_xla/chunked_prefill_paged_decode/__init__.py +0 -36
  903. ejkernel-0.0.79/ejkernel/kernels/_xla/chunked_prefill_paged_decode/_interface.py +0 -113
  904. ejkernel-0.0.79/ejkernel/kernels/_xla/chunked_prefill_paged_decode/_xla_impl_fwd.py +0 -237
  905. ejkernel-0.0.79/ejkernel/kernels/_xla/decode_attention/__init__.py +0 -34
  906. ejkernel-0.0.79/ejkernel/kernels/_xla/decode_attention/_interface.py +0 -87
  907. ejkernel-0.0.79/ejkernel/kernels/_xla/decode_attention/_xla_impl_fwd.py +0 -167
  908. ejkernel-0.0.79/ejkernel/kernels/_xla/deepseek_attn/__init__.py +0 -28
  909. ejkernel-0.0.79/ejkernel/kernels/_xla/deepseek_attn/_interface.py +0 -239
  910. ejkernel-0.0.79/ejkernel/kernels/_xla/deepseek_attn/_xla_impl_bwd.py +0 -117
  911. ejkernel-0.0.79/ejkernel/kernels/_xla/deepseek_attn/_xla_impl_fwd.py +0 -186
  912. ejkernel-0.0.79/ejkernel/kernels/_xla/flash_attention/__init__.py +0 -40
  913. ejkernel-0.0.79/ejkernel/kernels/_xla/flash_attention/_interface.py +0 -668
  914. ejkernel-0.0.79/ejkernel/kernels/_xla/flash_attention/_xla_impl_bwd.py +0 -150
  915. ejkernel-0.0.79/ejkernel/kernels/_xla/flash_attention/_xla_impl_fwd.py +0 -742
  916. ejkernel-0.0.79/ejkernel/kernels/_xla/flash_mla/__init__.py +0 -23
  917. ejkernel-0.0.79/ejkernel/kernels/_xla/flash_mla/_xla_impl_bwd.py +0 -21
  918. ejkernel-0.0.79/ejkernel/kernels/_xla/flash_mla/_xla_impl_fwd.py +0 -482
  919. ejkernel-0.0.79/ejkernel/kernels/_xla/gated_delta_rule/__init__.py +0 -35
  920. ejkernel-0.0.79/ejkernel/kernels/_xla/gated_delta_rule/_interface.py +0 -158
  921. ejkernel-0.0.79/ejkernel/kernels/_xla/gated_delta_rule/_xla_impl_bwd.py +0 -307
  922. ejkernel-0.0.79/ejkernel/kernels/_xla/gated_delta_rule/_xla_impl_fwd.py +0 -634
  923. ejkernel-0.0.79/ejkernel/kernels/_xla/gla/__init__.py +0 -34
  924. ejkernel-0.0.79/ejkernel/kernels/_xla/gla/_interface.py +0 -96
  925. ejkernel-0.0.79/ejkernel/kernels/_xla/gla/_xla_impl_bwd.py +0 -21
  926. ejkernel-0.0.79/ejkernel/kernels/_xla/gla/_xla_impl_fwd.py +0 -124
  927. ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmul/__init__.py +0 -30
  928. ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmul/_interface.py +0 -120
  929. ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmul/_xla_impl_bwd.py +0 -21
  930. ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmul/_xla_impl_fwd.py +0 -143
  931. ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmulv3/__init__.py +0 -19
  932. ejkernel-0.0.79/ejkernel/kernels/_xla/grouped_matmulv3/_interface.py +0 -356
  933. ejkernel-0.0.79/ejkernel/kernels/_xla/kernel_delta_attention/__init__.py +0 -43
  934. ejkernel-0.0.79/ejkernel/kernels/_xla/kernel_delta_attention/_interface.py +0 -223
  935. ejkernel-0.0.79/ejkernel/kernels/_xla/kernel_delta_attention/_xla_impl_fwd.py +0 -467
  936. ejkernel-0.0.79/ejkernel/kernels/_xla/lightning_attn/__init__.py +0 -34
  937. ejkernel-0.0.79/ejkernel/kernels/_xla/lightning_attn/_interface.py +0 -100
  938. ejkernel-0.0.79/ejkernel/kernels/_xla/lightning_attn/_xla_impl_bwd.py +0 -21
  939. ejkernel-0.0.79/ejkernel/kernels/_xla/lightning_attn/_xla_impl_fwd.py +0 -130
  940. ejkernel-0.0.79/ejkernel/kernels/_xla/mean_pooling/__init__.py +0 -29
  941. ejkernel-0.0.79/ejkernel/kernels/_xla/mean_pooling/_interface.py +0 -66
  942. ejkernel-0.0.79/ejkernel/kernels/_xla/mean_pooling/_xla_impl_bwd.py +0 -56
  943. ejkernel-0.0.79/ejkernel/kernels/_xla/mean_pooling/_xla_impl_fwd.py +0 -172
  944. ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/__init__.py +0 -19
  945. ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/_interface.py +0 -90
  946. ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention/_xla_impl_fwd.py +0 -489
  947. ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/__init__.py +0 -19
  948. ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/_interface.py +0 -120
  949. ejkernel-0.0.79/ejkernel/kernels/_xla/multi_latent_ragged_page_attention_v2/_xla_impl_fwd.py +0 -117
  950. ejkernel-0.0.79/ejkernel/kernels/_xla/native_sparse_attention/__init__.py +0 -48
  951. ejkernel-0.0.79/ejkernel/kernels/_xla/native_sparse_attention/_interface.py +0 -577
  952. ejkernel-0.0.79/ejkernel/kernels/_xla/native_sparse_attention/_xla_impl_bwd.py +0 -241
  953. ejkernel-0.0.79/ejkernel/kernels/_xla/native_sparse_attention/_xla_impl_fwd.py +0 -172
  954. ejkernel-0.0.79/ejkernel/kernels/_xla/page_attention/__init__.py +0 -30
  955. ejkernel-0.0.79/ejkernel/kernels/_xla/page_attention/_interface.py +0 -158
  956. ejkernel-0.0.79/ejkernel/kernels/_xla/page_attention/_xla_impl_fwd.py +0 -161
  957. ejkernel-0.0.79/ejkernel/kernels/_xla/prefill_page_attention/__init__.py +0 -77
  958. ejkernel-0.0.79/ejkernel/kernels/_xla/prefill_page_attention/_impl.py +0 -169
  959. ejkernel-0.0.79/ejkernel/kernels/_xla/prefill_page_attention/_interface.py +0 -102
  960. ejkernel-0.0.79/ejkernel/kernels/_xla/quantized_matmul/__init__.py +0 -33
  961. ejkernel-0.0.79/ejkernel/kernels/_xla/quantized_matmul/_interface.py +0 -129
  962. ejkernel-0.0.79/ejkernel/kernels/_xla/quantized_matmul/_xla_impl_bwd.py +0 -21
  963. ejkernel-0.0.79/ejkernel/kernels/_xla/quantized_matmul/_xla_impl_fwd.py +0 -744
  964. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_decode_attention/__init__.py +0 -30
  965. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_decode_attention/_interface.py +0 -125
  966. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_decode_attention/_xla_impl_fwd.py +0 -536
  967. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_gated_delta_rule/__init__.py +0 -25
  968. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_gated_delta_rule/_interface.py +0 -117
  969. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_gated_delta_rule/_xla_impl_fwd.py +0 -503
  970. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v2/__init__.py +0 -30
  971. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v2/_interface.py +0 -115
  972. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v2/_xla_impl_fwd.py +0 -315
  973. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v2_turboquant/_interface.py +0 -124
  974. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v2_turboquant/_xla_impl_fwd.py +0 -422
  975. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3/__init__.py +0 -30
  976. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3/_interface.py +0 -195
  977. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3/_xla_impl_bwd.py +0 -21
  978. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3/_xla_impl_fwd.py +0 -678
  979. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3_turboquant/_interface.py +0 -144
  980. ejkernel-0.0.79/ejkernel/kernels/_xla/ragged_page_attention_v3_turboquant/_xla_impl_fwd.py +0 -580
  981. ejkernel-0.0.79/ejkernel/kernels/_xla/recurrent/__init__.py +0 -42
  982. ejkernel-0.0.79/ejkernel/kernels/_xla/recurrent/_interface.py +0 -639
  983. ejkernel-0.0.79/ejkernel/kernels/_xla/recurrent/_xla_impl_bwd.py +0 -246
  984. ejkernel-0.0.79/ejkernel/kernels/_xla/recurrent/_xla_impl_fwd.py +0 -347
  985. ejkernel-0.0.79/ejkernel/kernels/_xla/reduce_scatter_matmul/__init__.py +0 -19
  986. ejkernel-0.0.79/ejkernel/kernels/_xla/reduce_scatter_matmul/_interface.py +0 -55
  987. ejkernel-0.0.79/ejkernel/kernels/_xla/reduce_scatter_matmul/_xla_impl_bwd.py +0 -40
  988. ejkernel-0.0.79/ejkernel/kernels/_xla/reduce_scatter_matmul/_xla_impl_fwd.py +0 -112
  989. ejkernel-0.0.79/ejkernel/kernels/_xla/ring_attention/__init__.py +0 -45
  990. ejkernel-0.0.79/ejkernel/kernels/_xla/ring_attention/_interface.py +0 -376
  991. ejkernel-0.0.79/ejkernel/kernels/_xla/ring_attention/_utils.py +0 -199
  992. ejkernel-0.0.79/ejkernel/kernels/_xla/ring_attention/_xla_impl_bwd.py +0 -471
  993. ejkernel-0.0.79/ejkernel/kernels/_xla/ring_attention/_xla_impl_fwd.py +0 -528
  994. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv4/__init__.py +0 -41
  995. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv4/_interface.py +0 -88
  996. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv4/_xla_impl_bwd.py +0 -21
  997. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv4/_xla_impl_fwd.py +0 -219
  998. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv6/__init__.py +0 -41
  999. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv6/_interface.py +0 -103
  1000. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv6/_xla_impl_bwd.py +0 -21
  1001. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv6/_xla_impl_fwd.py +0 -422
  1002. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv7/__init__.py +0 -42
  1003. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv7/_interface.py +0 -178
  1004. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv7/_xla_impl_bwd.py +0 -21
  1005. ejkernel-0.0.79/ejkernel/kernels/_xla/rwkv7/_xla_impl_fwd.py +0 -536
  1006. ejkernel-0.0.79/ejkernel/kernels/_xla/scaled_dot_product_attention/__init__.py +0 -24
  1007. ejkernel-0.0.79/ejkernel/kernels/_xla/scaled_dot_product_attention/_interface.py +0 -97
  1008. ejkernel-0.0.79/ejkernel/kernels/_xla/scaled_dot_product_attention/_xla_impl_bwd.py +0 -21
  1009. ejkernel-0.0.79/ejkernel/kernels/_xla/scaled_dot_product_attention/_xla_impl_fwd.py +0 -100
  1010. ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v1/__init__.py +0 -35
  1011. ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v1/_interface.py +0 -339
  1012. ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v1/_xla_impl_bwd.py +0 -216
  1013. ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v1/_xla_impl_fwd.py +0 -234
  1014. ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v2/__init__.py +0 -31
  1015. ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v2/_interface.py +0 -372
  1016. ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v2/_xla_impl_bwd.py +0 -232
  1017. ejkernel-0.0.79/ejkernel/kernels/_xla/state_space_v2/_xla_impl_fwd.py +0 -231
  1018. ejkernel-0.0.79/ejkernel/kernels/_xla/unified_attention/__init__.py +0 -30
  1019. ejkernel-0.0.79/ejkernel/kernels/_xla/unified_attention/_interface.py +0 -195
  1020. ejkernel-0.0.79/ejkernel/kernels/_xla/unified_attention/_xla_impl_fwd.py +0 -332
  1021. ejkernel-0.0.79/ejkernel/loggings.py +0 -588
  1022. ejkernel-0.0.79/ejkernel/modules/__init__.py +0 -311
  1023. ejkernel-0.0.79/ejkernel/modules/base.py +0 -271
  1024. ejkernel-0.0.79/ejkernel/modules/operations/__init__.py +0 -299
  1025. ejkernel-0.0.79/ejkernel/modules/operations/all_gather_matmul.py +0 -371
  1026. ejkernel-0.0.79/ejkernel/modules/operations/attention.py +0 -337
  1027. ejkernel-0.0.79/ejkernel/modules/operations/blocksparse_attention.py +0 -1013
  1028. ejkernel-0.0.79/ejkernel/modules/operations/chunked_prefill_paged_decode.py +0 -469
  1029. ejkernel-0.0.79/ejkernel/modules/operations/configs.py +0 -1027
  1030. ejkernel-0.0.79/ejkernel/modules/operations/decode_attention.py +0 -393
  1031. ejkernel-0.0.79/ejkernel/modules/operations/deepseek_attn.py +0 -294
  1032. ejkernel-0.0.79/ejkernel/modules/operations/flash_attention.py +0 -941
  1033. ejkernel-0.0.79/ejkernel/modules/operations/gated_delta_rule.py +0 -481
  1034. ejkernel-0.0.79/ejkernel/modules/operations/gated_linear_attention.py +0 -368
  1035. ejkernel-0.0.79/ejkernel/modules/operations/grouped_matmul.py +0 -527
  1036. ejkernel-0.0.79/ejkernel/modules/operations/kernel_delta_attention.py +0 -360
  1037. ejkernel-0.0.79/ejkernel/modules/operations/lightning_attention.py +0 -383
  1038. ejkernel-0.0.79/ejkernel/modules/operations/multi_head_latent_attention.py +0 -388
  1039. ejkernel-0.0.79/ejkernel/modules/operations/multi_latent_ragged_page_attention.py +0 -448
  1040. ejkernel-0.0.79/ejkernel/modules/operations/multi_latent_ragged_page_attention_v2.py +0 -375
  1041. ejkernel-0.0.79/ejkernel/modules/operations/native_sparse_attention.py +0 -460
  1042. ejkernel-0.0.79/ejkernel/modules/operations/page_attention.py +0 -451
  1043. ejkernel-0.0.79/ejkernel/modules/operations/pooling.py +0 -293
  1044. ejkernel-0.0.79/ejkernel/modules/operations/prefill_page_attention.py +0 -359
  1045. ejkernel-0.0.79/ejkernel/modules/operations/quantized_matmul.py +0 -1821
  1046. ejkernel-0.0.79/ejkernel/modules/operations/ragged_decode_attention.py +0 -616
  1047. ejkernel-0.0.79/ejkernel/modules/operations/ragged_gated_delta_rule.py +0 -357
  1048. ejkernel-0.0.79/ejkernel/modules/operations/ragged_page_attention_v2.py +0 -763
  1049. ejkernel-0.0.79/ejkernel/modules/operations/ragged_page_attention_v2_turboquant.py +0 -568
  1050. ejkernel-0.0.79/ejkernel/modules/operations/ragged_page_attention_v3.py +0 -1424
  1051. ejkernel-0.0.79/ejkernel/modules/operations/ragged_page_attention_v3_turboquant.py +0 -615
  1052. ejkernel-0.0.79/ejkernel/modules/operations/recurrent.py +0 -388
  1053. ejkernel-0.0.79/ejkernel/modules/operations/reduce_scatter_matmul.py +0 -326
  1054. ejkernel-0.0.79/ejkernel/modules/operations/ring_attention.py +0 -591
  1055. ejkernel-0.0.79/ejkernel/modules/operations/rwkv4.py +0 -283
  1056. ejkernel-0.0.79/ejkernel/modules/operations/rwkv6.py +0 -329
  1057. ejkernel-0.0.79/ejkernel/modules/operations/rwkv7.py +0 -604
  1058. ejkernel-0.0.79/ejkernel/modules/operations/scaled_dot_product_attention.py +0 -456
  1059. ejkernel-0.0.79/ejkernel/modules/operations/state_space_v1.py +0 -341
  1060. ejkernel-0.0.79/ejkernel/modules/operations/state_space_v2.py +0 -379
  1061. ejkernel-0.0.79/ejkernel/modules/operations/unified_attention.py +0 -559
  1062. ejkernel-0.0.79/ejkernel/ops/__init__.py +0 -152
  1063. ejkernel-0.0.79/ejkernel/ops/config/__init__.py +0 -55
  1064. ejkernel-0.0.79/ejkernel/ops/config/cache.py +0 -187
  1065. ejkernel-0.0.79/ejkernel/ops/config/persistent.py +0 -233
  1066. ejkernel-0.0.79/ejkernel/ops/config/selection.py +0 -804
  1067. ejkernel-0.0.79/ejkernel/ops/core/__init__.py +0 -58
  1068. ejkernel-0.0.79/ejkernel/ops/core/kernel.py +0 -759
  1069. ejkernel-0.0.79/ejkernel/ops/core/types.py +0 -50
  1070. ejkernel-0.0.79/ejkernel/ops/execution/__init__.py +0 -87
  1071. ejkernel-0.0.79/ejkernel/ops/execution/batch.py +0 -191
  1072. ejkernel-0.0.79/ejkernel/ops/execution/executor.py +0 -711
  1073. ejkernel-0.0.79/ejkernel/ops/execution/offline.py +0 -144
  1074. ejkernel-0.0.79/ejkernel/ops/execution/profiler.py +0 -506
  1075. ejkernel-0.0.79/ejkernel/ops/execution/tuning.py +0 -1538
  1076. ejkernel-0.0.79/ejkernel/ops/registry.py +0 -93
  1077. ejkernel-0.0.79/ejkernel/ops/utils/__init__.py +0 -77
  1078. ejkernel-0.0.79/ejkernel/ops/utils/datacarrier.py +0 -167
  1079. ejkernel-0.0.79/ejkernel/ops/utils/fingerprint.py +0 -344
  1080. ejkernel-0.0.79/ejkernel/ops/utils/meta.py +0 -160
  1081. ejkernel-0.0.79/ejkernel/ops/utils/serialize.py +0 -98
  1082. ejkernel-0.0.79/ejkernel/quantization/__init__.py +0 -86
  1083. ejkernel-0.0.79/ejkernel/quantization/_quants/__init__.py +0 -37
  1084. ejkernel-0.0.79/ejkernel/quantization/_quants/quantizations.py +0 -1151
  1085. ejkernel-0.0.79/ejkernel/quantization/_utils/__init__.py +0 -90
  1086. ejkernel-0.0.79/ejkernel/quantization/_utils/bitpack.py +0 -251
  1087. ejkernel-0.0.79/ejkernel/quantization/_utils/fp_tables.py +0 -255
  1088. ejkernel-0.0.79/ejkernel/quantization/_utils/grouping.py +0 -184
  1089. ejkernel-0.0.79/ejkernel/quantization/_utils/qparams.py +0 -529
  1090. ejkernel-0.0.79/ejkernel/quantization/quantized_array.py +0 -374
  1091. ejkernel-0.0.79/ejkernel/quantization/runtime.py +0 -82
  1092. ejkernel-0.0.79/ejkernel/quantization/turboquant/codebook.py +0 -190
  1093. ejkernel-0.0.79/ejkernel/quantization/turboquant/matrices.py +0 -81
  1094. ejkernel-0.0.79/ejkernel/quantization/turboquant/ops.py +0 -229
  1095. ejkernel-0.0.79/ejkernel/quantization/turboquant/packing.py +0 -117
  1096. ejkernel-0.0.79/ejkernel/types/__init__.py +0 -57
  1097. ejkernel-0.0.79/ejkernel/types/mask.py +0 -3366
  1098. ejkernel-0.0.79/ejkernel/utils.py +0 -1137
  1099. ejkernel-0.0.79/ejkernel/xla_utils/__init__.py +0 -122
  1100. ejkernel-0.0.79/ejkernel/xla_utils/cumsum.py +0 -568
  1101. ejkernel-0.0.79/ejkernel/xla_utils/shardings.py +0 -270
  1102. ejkernel-0.0.79/ejkernel/xla_utils/utils.py +0 -376
  1103. ejkernel-0.0.79/pyproject.toml +0 -135
  1104. {ejkernel-0.0.79 → ejkernel-0.0.80}/README.md +0 -0
  1105. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/CMakeLists.txt +0 -0
  1106. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_attention.h +0 -0
  1107. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_attention_ffi.cu +0 -0
  1108. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_attention_kernel.h +0 -0
  1109. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_attention_launch_template.h +0 -0
  1110. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_bf16_sm100.cu +0 -0
  1111. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_bf16_sm110.cu +0 -0
  1112. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_bf16_sm120.cu +0 -0
  1113. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_bf16_sm80.cu +0 -0
  1114. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_bf16_sm90.cu +0 -0
  1115. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_fp16_sm100.cu +0 -0
  1116. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_fp16_sm110.cu +0 -0
  1117. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_fp16_sm120.cu +0 -0
  1118. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_fp16_sm80.cu +0 -0
  1119. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim128_vhdim128_fp16_sm90.cu +0 -0
  1120. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_bf16_sm100.cu +0 -0
  1121. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_bf16_sm110.cu +0 -0
  1122. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_bf16_sm120.cu +0 -0
  1123. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_bf16_sm80.cu +0 -0
  1124. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_bf16_sm90.cu +0 -0
  1125. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_fp16_sm100.cu +0 -0
  1126. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_fp16_sm110.cu +0 -0
  1127. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_fp16_sm120.cu +0 -0
  1128. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_fp16_sm80.cu +0 -0
  1129. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim192_vhdim192_fp16_sm90.cu +0 -0
  1130. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_bf16_sm100.cu +0 -0
  1131. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_bf16_sm110.cu +0 -0
  1132. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_bf16_sm120.cu +0 -0
  1133. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_bf16_sm80.cu +0 -0
  1134. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_bf16_sm90.cu +0 -0
  1135. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_fp16_sm100.cu +0 -0
  1136. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_fp16_sm110.cu +0 -0
  1137. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_fp16_sm120.cu +0 -0
  1138. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_fp16_sm80.cu +0 -0
  1139. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim256_vhdim256_fp16_sm90.cu +0 -0
  1140. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_bf16_sm100.cu +0 -0
  1141. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_bf16_sm110.cu +0 -0
  1142. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_bf16_sm120.cu +0 -0
  1143. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_bf16_sm80.cu +0 -0
  1144. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_bf16_sm90.cu +0 -0
  1145. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_fp16_sm100.cu +0 -0
  1146. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_fp16_sm110.cu +0 -0
  1147. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_fp16_sm120.cu +0 -0
  1148. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_fp16_sm80.cu +0 -0
  1149. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim32_vhdim32_fp16_sm90.cu +0 -0
  1150. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_bf16_sm100.cu +0 -0
  1151. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_bf16_sm110.cu +0 -0
  1152. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_bf16_sm120.cu +0 -0
  1153. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_bf16_sm80.cu +0 -0
  1154. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_bf16_sm90.cu +0 -0
  1155. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_fp16_sm100.cu +0 -0
  1156. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_fp16_sm110.cu +0 -0
  1157. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_fp16_sm120.cu +0 -0
  1158. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_fp16_sm80.cu +0 -0
  1159. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim64_vhdim64_fp16_sm90.cu +0 -0
  1160. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_bf16_sm100.cu +0 -0
  1161. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_bf16_sm110.cu +0 -0
  1162. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_bf16_sm120.cu +0 -0
  1163. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_bf16_sm80.cu +0 -0
  1164. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_bf16_sm90.cu +0 -0
  1165. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_fp16_sm100.cu +0 -0
  1166. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_fp16_sm110.cu +0 -0
  1167. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_fp16_sm120.cu +0 -0
  1168. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_fp16_sm80.cu +0 -0
  1169. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/blocksparse_fwd_hdim96_vhdim96_fp16_sm90.cu +0 -0
  1170. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/blocksparse_attention/src/code_gen.py +0 -0
  1171. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/code_gen.py +0 -0
  1172. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/block.h +0 -0
  1173. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/code_gen.py +0 -0
  1174. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/copy_sm90_bulk_reduce.hpp +0 -0
  1175. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/cuda_check.h +0 -0
  1176. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/epilogue_bwd.hpp +0 -0
  1177. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/epilogue_fwd.hpp +0 -0
  1178. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash.h +0 -0
  1179. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_api.cpp +0 -0
  1180. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_api_stable.cpp +0 -0
  1181. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_bwd_kernel_sm80.h +0 -0
  1182. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_bwd_kernel_sm90.h +0 -0
  1183. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_bwd_launch_template.h +0 -0
  1184. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_bwd_postprocess_kernel.h +0 -0
  1185. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_bwd_preprocess_kernel.h +0 -0
  1186. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_combine.cu +0 -0
  1187. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_combine_kernel.h +0 -0
  1188. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_combine_launch_template.h +0 -0
  1189. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_kernel_sm80.h +0 -0
  1190. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_kernel_sm90.h +0 -0
  1191. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_fwd_launch_template.h +0 -0
  1192. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/flash_prepare_scheduler.cu +0 -0
  1193. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/heuristics.h +0 -0
  1194. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_sm100.cu +0 -0
  1195. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_sm110.cu +0 -0
  1196. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_sm120.cu +0 -0
  1197. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_sm80.cu +0 -0
  1198. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_sm90.cu +0 -0
  1199. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcap_sm100.cu +0 -0
  1200. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcap_sm110.cu +0 -0
  1201. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcap_sm120.cu +0 -0
  1202. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcap_sm80.cu +0 -0
  1203. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcap_sm90.cu +0 -0
  1204. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_bf16_softcapall_sm90.cu +0 -0
  1205. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_sm100.cu +0 -0
  1206. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_sm110.cu +0 -0
  1207. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_sm120.cu +0 -0
  1208. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_sm80.cu +0 -0
  1209. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_sm90.cu +0 -0
  1210. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcap_sm100.cu +0 -0
  1211. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcap_sm110.cu +0 -0
  1212. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcap_sm120.cu +0 -0
  1213. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcap_sm80.cu +0 -0
  1214. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcap_sm90.cu +0 -0
  1215. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim128_fp16_softcapall_sm90.cu +0 -0
  1216. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_sm100.cu +0 -0
  1217. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_sm110.cu +0 -0
  1218. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_sm120.cu +0 -0
  1219. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_sm80.cu +0 -0
  1220. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_sm90.cu +0 -0
  1221. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcap_sm100.cu +0 -0
  1222. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcap_sm110.cu +0 -0
  1223. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcap_sm120.cu +0 -0
  1224. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcap_sm80.cu +0 -0
  1225. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcap_sm90.cu +0 -0
  1226. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_bf16_softcapall_sm90.cu +0 -0
  1227. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_sm100.cu +0 -0
  1228. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_sm110.cu +0 -0
  1229. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_sm120.cu +0 -0
  1230. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_sm80.cu +0 -0
  1231. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_sm90.cu +0 -0
  1232. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcap_sm100.cu +0 -0
  1233. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcap_sm110.cu +0 -0
  1234. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcap_sm120.cu +0 -0
  1235. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcap_sm80.cu +0 -0
  1236. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcap_sm90.cu +0 -0
  1237. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim192_fp16_softcapall_sm90.cu +0 -0
  1238. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_sm100.cu +0 -0
  1239. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_sm110.cu +0 -0
  1240. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_sm120.cu +0 -0
  1241. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_sm80.cu +0 -0
  1242. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_sm90.cu +0 -0
  1243. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcap_sm100.cu +0 -0
  1244. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcap_sm110.cu +0 -0
  1245. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcap_sm120.cu +0 -0
  1246. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcap_sm80.cu +0 -0
  1247. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcap_sm90.cu +0 -0
  1248. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_bf16_softcapall_sm90.cu +0 -0
  1249. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_sm100.cu +0 -0
  1250. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_sm110.cu +0 -0
  1251. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_sm120.cu +0 -0
  1252. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_sm80.cu +0 -0
  1253. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_sm90.cu +0 -0
  1254. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcap_sm100.cu +0 -0
  1255. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcap_sm110.cu +0 -0
  1256. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcap_sm120.cu +0 -0
  1257. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcap_sm80.cu +0 -0
  1258. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcap_sm90.cu +0 -0
  1259. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim256_fp16_softcapall_sm90.cu +0 -0
  1260. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_sm100.cu +0 -0
  1261. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_sm110.cu +0 -0
  1262. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_sm120.cu +0 -0
  1263. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_sm80.cu +0 -0
  1264. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_sm90.cu +0 -0
  1265. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcap_sm100.cu +0 -0
  1266. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcap_sm110.cu +0 -0
  1267. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcap_sm120.cu +0 -0
  1268. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcap_sm80.cu +0 -0
  1269. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcap_sm90.cu +0 -0
  1270. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_bf16_softcapall_sm90.cu +0 -0
  1271. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_sm100.cu +0 -0
  1272. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_sm110.cu +0 -0
  1273. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_sm120.cu +0 -0
  1274. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_sm80.cu +0 -0
  1275. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_sm90.cu +0 -0
  1276. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcap_sm100.cu +0 -0
  1277. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcap_sm110.cu +0 -0
  1278. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcap_sm120.cu +0 -0
  1279. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcap_sm80.cu +0 -0
  1280. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcap_sm90.cu +0 -0
  1281. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim64_fp16_softcapall_sm90.cu +0 -0
  1282. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_sm100.cu +0 -0
  1283. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_sm110.cu +0 -0
  1284. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_sm120.cu +0 -0
  1285. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_sm80.cu +0 -0
  1286. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_sm90.cu +0 -0
  1287. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcap_sm100.cu +0 -0
  1288. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcap_sm110.cu +0 -0
  1289. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcap_sm120.cu +0 -0
  1290. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcap_sm80.cu +0 -0
  1291. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcap_sm90.cu +0 -0
  1292. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_bf16_softcapall_sm90.cu +0 -0
  1293. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_sm100.cu +0 -0
  1294. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_sm110.cu +0 -0
  1295. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_sm120.cu +0 -0
  1296. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_sm80.cu +0 -0
  1297. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_sm90.cu +0 -0
  1298. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcap_sm100.cu +0 -0
  1299. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcap_sm110.cu +0 -0
  1300. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcap_sm120.cu +0 -0
  1301. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcap_sm80.cu +0 -0
  1302. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcap_sm90.cu +0 -0
  1303. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_bwd_hdim96_fp16_softcapall_sm90.cu +0 -0
  1304. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_packgqa_sm100.cu +0 -0
  1305. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_packgqa_sm110.cu +0 -0
  1306. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_packgqa_sm120.cu +0 -0
  1307. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_packgqa_sm80.cu +0 -0
  1308. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_packgqa_sm90.cu +0 -0
  1309. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_packgqa_sm100.cu +0 -0
  1310. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_packgqa_sm110.cu +0 -0
  1311. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_packgqa_sm120.cu +0 -0
  1312. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_packgqa_sm80.cu +0 -0
  1313. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_packgqa_sm90.cu +0 -0
  1314. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_sm100.cu +0 -0
  1315. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_sm110.cu +0 -0
  1316. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_sm120.cu +0 -0
  1317. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_sm80.cu +0 -0
  1318. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_sm90.cu +0 -0
  1319. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_packgqa_sm100.cu +0 -0
  1320. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_packgqa_sm110.cu +0 -0
  1321. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_packgqa_sm120.cu +0 -0
  1322. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_packgqa_sm80.cu +0 -0
  1323. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_packgqa_sm90.cu +0 -0
  1324. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_sm100.cu +0 -0
  1325. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_sm110.cu +0 -0
  1326. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_sm120.cu +0 -0
  1327. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_sm80.cu +0 -0
  1328. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcap_sm90.cu +0 -0
  1329. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_softcapall_sm80.cu +0 -0
  1330. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_packgqa_sm100.cu +0 -0
  1331. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_packgqa_sm110.cu +0 -0
  1332. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_packgqa_sm120.cu +0 -0
  1333. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_packgqa_sm80.cu +0 -0
  1334. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_packgqa_sm90.cu +0 -0
  1335. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_sm100.cu +0 -0
  1336. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_sm110.cu +0 -0
  1337. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_sm120.cu +0 -0
  1338. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_sm80.cu +0 -0
  1339. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_sm90.cu +0 -0
  1340. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_packgqa_sm100.cu +0 -0
  1341. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_packgqa_sm110.cu +0 -0
  1342. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_packgqa_sm120.cu +0 -0
  1343. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_packgqa_sm80.cu +0 -0
  1344. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_packgqa_sm90.cu +0 -0
  1345. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_sm100.cu +0 -0
  1346. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_sm110.cu +0 -0
  1347. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_sm120.cu +0 -0
  1348. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_sm80.cu +0 -0
  1349. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcap_sm90.cu +0 -0
  1350. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_paged_split_softcapall_sm80.cu +0 -0
  1351. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_sm100.cu +0 -0
  1352. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_sm110.cu +0 -0
  1353. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_sm120.cu +0 -0
  1354. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_sm80.cu +0 -0
  1355. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_sm90.cu +0 -0
  1356. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_packgqa_sm100.cu +0 -0
  1357. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_packgqa_sm110.cu +0 -0
  1358. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_packgqa_sm120.cu +0 -0
  1359. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_packgqa_sm80.cu +0 -0
  1360. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_packgqa_sm90.cu +0 -0
  1361. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_sm100.cu +0 -0
  1362. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_sm110.cu +0 -0
  1363. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_sm120.cu +0 -0
  1364. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_sm80.cu +0 -0
  1365. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcap_sm90.cu +0 -0
  1366. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_softcapall_sm80.cu +0 -0
  1367. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_packgqa_sm100.cu +0 -0
  1368. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_packgqa_sm110.cu +0 -0
  1369. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_packgqa_sm120.cu +0 -0
  1370. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_packgqa_sm80.cu +0 -0
  1371. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_packgqa_sm90.cu +0 -0
  1372. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_sm100.cu +0 -0
  1373. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_sm110.cu +0 -0
  1374. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_sm120.cu +0 -0
  1375. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_sm80.cu +0 -0
  1376. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_sm90.cu +0 -0
  1377. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_packgqa_sm100.cu +0 -0
  1378. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_packgqa_sm110.cu +0 -0
  1379. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_packgqa_sm120.cu +0 -0
  1380. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_packgqa_sm80.cu +0 -0
  1381. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_packgqa_sm90.cu +0 -0
  1382. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_sm100.cu +0 -0
  1383. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_sm110.cu +0 -0
  1384. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_sm120.cu +0 -0
  1385. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_sm80.cu +0 -0
  1386. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcap_sm90.cu +0 -0
  1387. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_bf16_split_softcapall_sm80.cu +0 -0
  1388. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_packgqa_sm100.cu +0 -0
  1389. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_packgqa_sm110.cu +0 -0
  1390. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_packgqa_sm120.cu +0 -0
  1391. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_packgqa_sm90.cu +0 -0
  1392. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_packgqa_sm100.cu +0 -0
  1393. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_packgqa_sm110.cu +0 -0
  1394. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_packgqa_sm120.cu +0 -0
  1395. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_packgqa_sm90.cu +0 -0
  1396. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_sm100.cu +0 -0
  1397. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_sm110.cu +0 -0
  1398. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_sm120.cu +0 -0
  1399. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_sm90.cu +0 -0
  1400. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_packgqa_sm100.cu +0 -0
  1401. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_packgqa_sm110.cu +0 -0
  1402. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_packgqa_sm120.cu +0 -0
  1403. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_packgqa_sm90.cu +0 -0
  1404. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_sm100.cu +0 -0
  1405. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_sm110.cu +0 -0
  1406. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_sm120.cu +0 -0
  1407. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_softcap_sm90.cu +0 -0
  1408. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_packgqa_sm100.cu +0 -0
  1409. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_packgqa_sm110.cu +0 -0
  1410. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_packgqa_sm120.cu +0 -0
  1411. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_packgqa_sm90.cu +0 -0
  1412. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_sm100.cu +0 -0
  1413. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_sm110.cu +0 -0
  1414. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_sm120.cu +0 -0
  1415. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_sm90.cu +0 -0
  1416. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_packgqa_sm100.cu +0 -0
  1417. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_packgqa_sm110.cu +0 -0
  1418. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_packgqa_sm120.cu +0 -0
  1419. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_packgqa_sm90.cu +0 -0
  1420. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_sm100.cu +0 -0
  1421. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_sm110.cu +0 -0
  1422. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_sm120.cu +0 -0
  1423. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_paged_split_softcap_sm90.cu +0 -0
  1424. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_sm100.cu +0 -0
  1425. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_sm110.cu +0 -0
  1426. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_sm120.cu +0 -0
  1427. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_sm90.cu +0 -0
  1428. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_packgqa_sm100.cu +0 -0
  1429. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_packgqa_sm110.cu +0 -0
  1430. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_packgqa_sm120.cu +0 -0
  1431. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_packgqa_sm90.cu +0 -0
  1432. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_sm100.cu +0 -0
  1433. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_sm110.cu +0 -0
  1434. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_sm120.cu +0 -0
  1435. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_softcap_sm90.cu +0 -0
  1436. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_packgqa_sm100.cu +0 -0
  1437. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_packgqa_sm110.cu +0 -0
  1438. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_packgqa_sm120.cu +0 -0
  1439. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_packgqa_sm90.cu +0 -0
  1440. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_sm100.cu +0 -0
  1441. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_sm110.cu +0 -0
  1442. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_sm120.cu +0 -0
  1443. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_sm90.cu +0 -0
  1444. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_packgqa_sm100.cu +0 -0
  1445. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_packgqa_sm110.cu +0 -0
  1446. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_packgqa_sm120.cu +0 -0
  1447. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_packgqa_sm90.cu +0 -0
  1448. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_sm100.cu +0 -0
  1449. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_sm110.cu +0 -0
  1450. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_sm120.cu +0 -0
  1451. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_e4m3_split_softcap_sm90.cu +0 -0
  1452. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_packgqa_sm100.cu +0 -0
  1453. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_packgqa_sm110.cu +0 -0
  1454. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_packgqa_sm120.cu +0 -0
  1455. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_packgqa_sm80.cu +0 -0
  1456. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_packgqa_sm90.cu +0 -0
  1457. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_packgqa_sm100.cu +0 -0
  1458. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_packgqa_sm110.cu +0 -0
  1459. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_packgqa_sm120.cu +0 -0
  1460. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_packgqa_sm80.cu +0 -0
  1461. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_packgqa_sm90.cu +0 -0
  1462. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_sm100.cu +0 -0
  1463. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_sm110.cu +0 -0
  1464. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_sm120.cu +0 -0
  1465. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_sm80.cu +0 -0
  1466. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_sm90.cu +0 -0
  1467. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_packgqa_sm100.cu +0 -0
  1468. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_packgqa_sm110.cu +0 -0
  1469. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_packgqa_sm120.cu +0 -0
  1470. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_packgqa_sm80.cu +0 -0
  1471. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_packgqa_sm90.cu +0 -0
  1472. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_sm100.cu +0 -0
  1473. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_sm110.cu +0 -0
  1474. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_sm120.cu +0 -0
  1475. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_sm80.cu +0 -0
  1476. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcap_sm90.cu +0 -0
  1477. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_softcapall_sm80.cu +0 -0
  1478. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_packgqa_sm100.cu +0 -0
  1479. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_packgqa_sm110.cu +0 -0
  1480. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_packgqa_sm120.cu +0 -0
  1481. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_packgqa_sm80.cu +0 -0
  1482. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_packgqa_sm90.cu +0 -0
  1483. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_sm100.cu +0 -0
  1484. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_sm110.cu +0 -0
  1485. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_sm120.cu +0 -0
  1486. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_sm80.cu +0 -0
  1487. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_sm90.cu +0 -0
  1488. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_packgqa_sm100.cu +0 -0
  1489. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_packgqa_sm110.cu +0 -0
  1490. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_packgqa_sm120.cu +0 -0
  1491. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_packgqa_sm80.cu +0 -0
  1492. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_packgqa_sm90.cu +0 -0
  1493. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_sm100.cu +0 -0
  1494. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_sm110.cu +0 -0
  1495. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_sm120.cu +0 -0
  1496. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_sm80.cu +0 -0
  1497. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcap_sm90.cu +0 -0
  1498. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_paged_split_softcapall_sm80.cu +0 -0
  1499. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_sm100.cu +0 -0
  1500. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_sm110.cu +0 -0
  1501. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_sm120.cu +0 -0
  1502. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_sm80.cu +0 -0
  1503. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_sm90.cu +0 -0
  1504. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_packgqa_sm100.cu +0 -0
  1505. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_packgqa_sm110.cu +0 -0
  1506. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_packgqa_sm120.cu +0 -0
  1507. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_packgqa_sm80.cu +0 -0
  1508. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_packgqa_sm90.cu +0 -0
  1509. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_sm100.cu +0 -0
  1510. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_sm110.cu +0 -0
  1511. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_sm120.cu +0 -0
  1512. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_sm80.cu +0 -0
  1513. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcap_sm90.cu +0 -0
  1514. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_softcapall_sm80.cu +0 -0
  1515. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_packgqa_sm100.cu +0 -0
  1516. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_packgqa_sm110.cu +0 -0
  1517. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_packgqa_sm120.cu +0 -0
  1518. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_packgqa_sm80.cu +0 -0
  1519. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_packgqa_sm90.cu +0 -0
  1520. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_sm100.cu +0 -0
  1521. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_sm110.cu +0 -0
  1522. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_sm120.cu +0 -0
  1523. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_sm80.cu +0 -0
  1524. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_sm90.cu +0 -0
  1525. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_packgqa_sm100.cu +0 -0
  1526. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_packgqa_sm110.cu +0 -0
  1527. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_packgqa_sm120.cu +0 -0
  1528. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_packgqa_sm80.cu +0 -0
  1529. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_packgqa_sm90.cu +0 -0
  1530. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_sm100.cu +0 -0
  1531. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_sm110.cu +0 -0
  1532. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_sm120.cu +0 -0
  1533. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_sm80.cu +0 -0
  1534. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcap_sm90.cu +0 -0
  1535. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim128_fp16_split_softcapall_sm80.cu +0 -0
  1536. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_packgqa_sm90.cu +0 -0
  1537. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_paged_sm90.cu +0 -0
  1538. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_paged_softcap_sm90.cu +0 -0
  1539. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_paged_split_sm90.cu +0 -0
  1540. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_paged_split_softcap_sm90.cu +0 -0
  1541. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_sm90.cu +0 -0
  1542. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_softcap_packgqa_sm90.cu +0 -0
  1543. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_softcap_sm90.cu +0 -0
  1544. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_split_sm90.cu +0 -0
  1545. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_bf16_split_softcap_sm90.cu +0 -0
  1546. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_packgqa_sm90.cu +0 -0
  1547. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_paged_sm90.cu +0 -0
  1548. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_paged_softcap_sm90.cu +0 -0
  1549. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_paged_split_sm90.cu +0 -0
  1550. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_paged_split_softcap_sm90.cu +0 -0
  1551. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_sm90.cu +0 -0
  1552. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_softcap_packgqa_sm90.cu +0 -0
  1553. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_softcap_sm90.cu +0 -0
  1554. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_split_sm90.cu +0 -0
  1555. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_e4m3_split_softcap_sm90.cu +0 -0
  1556. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_packgqa_sm90.cu +0 -0
  1557. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_paged_sm90.cu +0 -0
  1558. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_paged_softcap_sm90.cu +0 -0
  1559. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_paged_split_sm90.cu +0 -0
  1560. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_paged_split_softcap_sm90.cu +0 -0
  1561. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_sm90.cu +0 -0
  1562. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_softcap_packgqa_sm90.cu +0 -0
  1563. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_softcap_sm90.cu +0 -0
  1564. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_split_sm90.cu +0 -0
  1565. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_128_fp16_split_softcap_sm90.cu +0 -0
  1566. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_packgqa_sm100.cu +0 -0
  1567. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_packgqa_sm110.cu +0 -0
  1568. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_packgqa_sm120.cu +0 -0
  1569. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_packgqa_sm80.cu +0 -0
  1570. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_packgqa_sm90.cu +0 -0
  1571. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_packgqa_sm100.cu +0 -0
  1572. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_packgqa_sm110.cu +0 -0
  1573. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_packgqa_sm120.cu +0 -0
  1574. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_packgqa_sm80.cu +0 -0
  1575. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_packgqa_sm90.cu +0 -0
  1576. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_sm100.cu +0 -0
  1577. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_sm110.cu +0 -0
  1578. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_sm120.cu +0 -0
  1579. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_sm80.cu +0 -0
  1580. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_sm90.cu +0 -0
  1581. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_packgqa_sm100.cu +0 -0
  1582. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_packgqa_sm110.cu +0 -0
  1583. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_packgqa_sm120.cu +0 -0
  1584. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_packgqa_sm80.cu +0 -0
  1585. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_packgqa_sm90.cu +0 -0
  1586. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_sm100.cu +0 -0
  1587. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_sm110.cu +0 -0
  1588. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_sm120.cu +0 -0
  1589. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_sm80.cu +0 -0
  1590. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcap_sm90.cu +0 -0
  1591. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_softcapall_sm80.cu +0 -0
  1592. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_packgqa_sm100.cu +0 -0
  1593. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_packgqa_sm110.cu +0 -0
  1594. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_packgqa_sm120.cu +0 -0
  1595. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_packgqa_sm80.cu +0 -0
  1596. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_packgqa_sm90.cu +0 -0
  1597. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_sm100.cu +0 -0
  1598. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_sm110.cu +0 -0
  1599. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_sm120.cu +0 -0
  1600. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_sm80.cu +0 -0
  1601. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_sm90.cu +0 -0
  1602. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_packgqa_sm100.cu +0 -0
  1603. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_packgqa_sm110.cu +0 -0
  1604. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_packgqa_sm120.cu +0 -0
  1605. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_packgqa_sm80.cu +0 -0
  1606. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_packgqa_sm90.cu +0 -0
  1607. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_sm100.cu +0 -0
  1608. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_sm110.cu +0 -0
  1609. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_sm120.cu +0 -0
  1610. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_sm80.cu +0 -0
  1611. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcap_sm90.cu +0 -0
  1612. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_paged_split_softcapall_sm80.cu +0 -0
  1613. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_sm100.cu +0 -0
  1614. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_sm110.cu +0 -0
  1615. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_sm120.cu +0 -0
  1616. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_sm80.cu +0 -0
  1617. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_sm90.cu +0 -0
  1618. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_packgqa_sm100.cu +0 -0
  1619. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_packgqa_sm110.cu +0 -0
  1620. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_packgqa_sm120.cu +0 -0
  1621. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_packgqa_sm80.cu +0 -0
  1622. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_packgqa_sm90.cu +0 -0
  1623. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_sm100.cu +0 -0
  1624. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_sm110.cu +0 -0
  1625. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_sm120.cu +0 -0
  1626. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_sm80.cu +0 -0
  1627. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcap_sm90.cu +0 -0
  1628. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_softcapall_sm80.cu +0 -0
  1629. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_packgqa_sm100.cu +0 -0
  1630. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_packgqa_sm110.cu +0 -0
  1631. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_packgqa_sm120.cu +0 -0
  1632. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_packgqa_sm80.cu +0 -0
  1633. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_packgqa_sm90.cu +0 -0
  1634. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_sm100.cu +0 -0
  1635. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_sm110.cu +0 -0
  1636. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_sm120.cu +0 -0
  1637. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_sm80.cu +0 -0
  1638. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_sm90.cu +0 -0
  1639. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_packgqa_sm100.cu +0 -0
  1640. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_packgqa_sm110.cu +0 -0
  1641. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_packgqa_sm120.cu +0 -0
  1642. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_packgqa_sm80.cu +0 -0
  1643. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_packgqa_sm90.cu +0 -0
  1644. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_sm100.cu +0 -0
  1645. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_sm110.cu +0 -0
  1646. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_sm120.cu +0 -0
  1647. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_sm80.cu +0 -0
  1648. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcap_sm90.cu +0 -0
  1649. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_bf16_split_softcapall_sm80.cu +0 -0
  1650. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_packgqa_sm100.cu +0 -0
  1651. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_packgqa_sm110.cu +0 -0
  1652. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_packgqa_sm120.cu +0 -0
  1653. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_packgqa_sm90.cu +0 -0
  1654. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_packgqa_sm100.cu +0 -0
  1655. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_packgqa_sm110.cu +0 -0
  1656. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_packgqa_sm120.cu +0 -0
  1657. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_packgqa_sm90.cu +0 -0
  1658. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_sm100.cu +0 -0
  1659. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_sm110.cu +0 -0
  1660. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_sm120.cu +0 -0
  1661. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_sm90.cu +0 -0
  1662. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_packgqa_sm100.cu +0 -0
  1663. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_packgqa_sm110.cu +0 -0
  1664. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_packgqa_sm120.cu +0 -0
  1665. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_packgqa_sm90.cu +0 -0
  1666. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_sm100.cu +0 -0
  1667. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_sm110.cu +0 -0
  1668. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_sm120.cu +0 -0
  1669. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_softcap_sm90.cu +0 -0
  1670. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_packgqa_sm100.cu +0 -0
  1671. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_packgqa_sm110.cu +0 -0
  1672. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_packgqa_sm120.cu +0 -0
  1673. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_packgqa_sm90.cu +0 -0
  1674. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_sm100.cu +0 -0
  1675. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_sm110.cu +0 -0
  1676. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_sm120.cu +0 -0
  1677. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_sm90.cu +0 -0
  1678. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_packgqa_sm100.cu +0 -0
  1679. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_packgqa_sm110.cu +0 -0
  1680. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_packgqa_sm120.cu +0 -0
  1681. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_packgqa_sm90.cu +0 -0
  1682. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_sm100.cu +0 -0
  1683. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_sm110.cu +0 -0
  1684. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_sm120.cu +0 -0
  1685. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_paged_split_softcap_sm90.cu +0 -0
  1686. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_sm100.cu +0 -0
  1687. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_sm110.cu +0 -0
  1688. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_sm120.cu +0 -0
  1689. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_sm90.cu +0 -0
  1690. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_packgqa_sm100.cu +0 -0
  1691. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_packgqa_sm110.cu +0 -0
  1692. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_packgqa_sm120.cu +0 -0
  1693. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_packgqa_sm90.cu +0 -0
  1694. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_sm100.cu +0 -0
  1695. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_sm110.cu +0 -0
  1696. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_sm120.cu +0 -0
  1697. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_softcap_sm90.cu +0 -0
  1698. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_packgqa_sm100.cu +0 -0
  1699. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_packgqa_sm110.cu +0 -0
  1700. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_packgqa_sm120.cu +0 -0
  1701. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_packgqa_sm90.cu +0 -0
  1702. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_sm100.cu +0 -0
  1703. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_sm110.cu +0 -0
  1704. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_sm120.cu +0 -0
  1705. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_sm90.cu +0 -0
  1706. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_packgqa_sm100.cu +0 -0
  1707. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_packgqa_sm110.cu +0 -0
  1708. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_packgqa_sm120.cu +0 -0
  1709. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_packgqa_sm90.cu +0 -0
  1710. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_sm100.cu +0 -0
  1711. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_sm110.cu +0 -0
  1712. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_sm120.cu +0 -0
  1713. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_e4m3_split_softcap_sm90.cu +0 -0
  1714. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_packgqa_sm100.cu +0 -0
  1715. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_packgqa_sm110.cu +0 -0
  1716. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_packgqa_sm120.cu +0 -0
  1717. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_packgqa_sm80.cu +0 -0
  1718. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_packgqa_sm90.cu +0 -0
  1719. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_packgqa_sm100.cu +0 -0
  1720. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_packgqa_sm110.cu +0 -0
  1721. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_packgqa_sm120.cu +0 -0
  1722. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_packgqa_sm80.cu +0 -0
  1723. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_packgqa_sm90.cu +0 -0
  1724. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_sm100.cu +0 -0
  1725. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_sm110.cu +0 -0
  1726. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_sm120.cu +0 -0
  1727. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_sm80.cu +0 -0
  1728. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_sm90.cu +0 -0
  1729. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_packgqa_sm100.cu +0 -0
  1730. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_packgqa_sm110.cu +0 -0
  1731. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_packgqa_sm120.cu +0 -0
  1732. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_packgqa_sm80.cu +0 -0
  1733. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_packgqa_sm90.cu +0 -0
  1734. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_sm100.cu +0 -0
  1735. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_sm110.cu +0 -0
  1736. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_sm120.cu +0 -0
  1737. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_sm80.cu +0 -0
  1738. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcap_sm90.cu +0 -0
  1739. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_softcapall_sm80.cu +0 -0
  1740. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_packgqa_sm100.cu +0 -0
  1741. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_packgqa_sm110.cu +0 -0
  1742. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_packgqa_sm120.cu +0 -0
  1743. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_packgqa_sm80.cu +0 -0
  1744. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_packgqa_sm90.cu +0 -0
  1745. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_sm100.cu +0 -0
  1746. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_sm110.cu +0 -0
  1747. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_sm120.cu +0 -0
  1748. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_sm80.cu +0 -0
  1749. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_sm90.cu +0 -0
  1750. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_packgqa_sm100.cu +0 -0
  1751. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_packgqa_sm110.cu +0 -0
  1752. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_packgqa_sm120.cu +0 -0
  1753. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_packgqa_sm80.cu +0 -0
  1754. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_packgqa_sm90.cu +0 -0
  1755. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_sm100.cu +0 -0
  1756. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_sm110.cu +0 -0
  1757. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_sm120.cu +0 -0
  1758. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_sm80.cu +0 -0
  1759. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcap_sm90.cu +0 -0
  1760. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_paged_split_softcapall_sm80.cu +0 -0
  1761. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_sm100.cu +0 -0
  1762. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_sm110.cu +0 -0
  1763. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_sm120.cu +0 -0
  1764. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_sm80.cu +0 -0
  1765. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_sm90.cu +0 -0
  1766. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_packgqa_sm100.cu +0 -0
  1767. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_packgqa_sm110.cu +0 -0
  1768. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_packgqa_sm120.cu +0 -0
  1769. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_packgqa_sm80.cu +0 -0
  1770. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_packgqa_sm90.cu +0 -0
  1771. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_sm100.cu +0 -0
  1772. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_sm110.cu +0 -0
  1773. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_sm120.cu +0 -0
  1774. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_sm80.cu +0 -0
  1775. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcap_sm90.cu +0 -0
  1776. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_softcapall_sm80.cu +0 -0
  1777. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_packgqa_sm100.cu +0 -0
  1778. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_packgqa_sm110.cu +0 -0
  1779. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_packgqa_sm120.cu +0 -0
  1780. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_packgqa_sm80.cu +0 -0
  1781. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_packgqa_sm90.cu +0 -0
  1782. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_sm100.cu +0 -0
  1783. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_sm110.cu +0 -0
  1784. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_sm120.cu +0 -0
  1785. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_sm80.cu +0 -0
  1786. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_sm90.cu +0 -0
  1787. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_packgqa_sm100.cu +0 -0
  1788. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_packgqa_sm110.cu +0 -0
  1789. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_packgqa_sm120.cu +0 -0
  1790. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_packgqa_sm80.cu +0 -0
  1791. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_packgqa_sm90.cu +0 -0
  1792. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_sm100.cu +0 -0
  1793. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_sm110.cu +0 -0
  1794. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_sm120.cu +0 -0
  1795. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_sm80.cu +0 -0
  1796. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcap_sm90.cu +0 -0
  1797. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim192_fp16_split_softcapall_sm80.cu +0 -0
  1798. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_packgqa_sm100.cu +0 -0
  1799. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_packgqa_sm110.cu +0 -0
  1800. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_packgqa_sm120.cu +0 -0
  1801. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_packgqa_sm80.cu +0 -0
  1802. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_packgqa_sm90.cu +0 -0
  1803. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_packgqa_sm100.cu +0 -0
  1804. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_packgqa_sm110.cu +0 -0
  1805. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_packgqa_sm120.cu +0 -0
  1806. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_packgqa_sm80.cu +0 -0
  1807. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_packgqa_sm90.cu +0 -0
  1808. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_sm100.cu +0 -0
  1809. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_sm110.cu +0 -0
  1810. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_sm120.cu +0 -0
  1811. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_sm80.cu +0 -0
  1812. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_sm90.cu +0 -0
  1813. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_packgqa_sm100.cu +0 -0
  1814. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_packgqa_sm110.cu +0 -0
  1815. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_packgqa_sm120.cu +0 -0
  1816. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_packgqa_sm80.cu +0 -0
  1817. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_packgqa_sm90.cu +0 -0
  1818. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_sm100.cu +0 -0
  1819. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_sm110.cu +0 -0
  1820. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_sm120.cu +0 -0
  1821. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_sm80.cu +0 -0
  1822. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcap_sm90.cu +0 -0
  1823. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_softcapall_sm80.cu +0 -0
  1824. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_packgqa_sm100.cu +0 -0
  1825. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_packgqa_sm110.cu +0 -0
  1826. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_packgqa_sm120.cu +0 -0
  1827. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_packgqa_sm80.cu +0 -0
  1828. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_packgqa_sm90.cu +0 -0
  1829. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_sm100.cu +0 -0
  1830. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_sm110.cu +0 -0
  1831. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_sm120.cu +0 -0
  1832. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_sm80.cu +0 -0
  1833. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_sm90.cu +0 -0
  1834. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_packgqa_sm100.cu +0 -0
  1835. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_packgqa_sm110.cu +0 -0
  1836. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_packgqa_sm120.cu +0 -0
  1837. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_packgqa_sm80.cu +0 -0
  1838. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_packgqa_sm90.cu +0 -0
  1839. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_sm100.cu +0 -0
  1840. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_sm110.cu +0 -0
  1841. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_sm120.cu +0 -0
  1842. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_sm80.cu +0 -0
  1843. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcap_sm90.cu +0 -0
  1844. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_paged_split_softcapall_sm80.cu +0 -0
  1845. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_sm100.cu +0 -0
  1846. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_sm110.cu +0 -0
  1847. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_sm120.cu +0 -0
  1848. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_sm80.cu +0 -0
  1849. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_sm90.cu +0 -0
  1850. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_packgqa_sm100.cu +0 -0
  1851. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_packgqa_sm110.cu +0 -0
  1852. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_packgqa_sm120.cu +0 -0
  1853. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_packgqa_sm80.cu +0 -0
  1854. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_packgqa_sm90.cu +0 -0
  1855. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_sm100.cu +0 -0
  1856. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_sm110.cu +0 -0
  1857. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_sm120.cu +0 -0
  1858. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_sm80.cu +0 -0
  1859. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcap_sm90.cu +0 -0
  1860. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_softcapall_sm80.cu +0 -0
  1861. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_packgqa_sm100.cu +0 -0
  1862. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_packgqa_sm110.cu +0 -0
  1863. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_packgqa_sm120.cu +0 -0
  1864. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_packgqa_sm80.cu +0 -0
  1865. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_packgqa_sm90.cu +0 -0
  1866. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_sm100.cu +0 -0
  1867. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_sm110.cu +0 -0
  1868. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_sm120.cu +0 -0
  1869. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_sm80.cu +0 -0
  1870. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_sm90.cu +0 -0
  1871. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_packgqa_sm100.cu +0 -0
  1872. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_packgqa_sm110.cu +0 -0
  1873. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_packgqa_sm120.cu +0 -0
  1874. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_packgqa_sm80.cu +0 -0
  1875. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_packgqa_sm90.cu +0 -0
  1876. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_sm100.cu +0 -0
  1877. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_sm110.cu +0 -0
  1878. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_sm120.cu +0 -0
  1879. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_sm80.cu +0 -0
  1880. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcap_sm90.cu +0 -0
  1881. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_bf16_split_softcapall_sm80.cu +0 -0
  1882. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_packgqa_sm100.cu +0 -0
  1883. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_packgqa_sm110.cu +0 -0
  1884. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_packgqa_sm120.cu +0 -0
  1885. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_packgqa_sm90.cu +0 -0
  1886. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_packgqa_sm100.cu +0 -0
  1887. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_packgqa_sm110.cu +0 -0
  1888. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_packgqa_sm120.cu +0 -0
  1889. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_packgqa_sm90.cu +0 -0
  1890. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_sm100.cu +0 -0
  1891. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_sm110.cu +0 -0
  1892. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_sm120.cu +0 -0
  1893. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_sm90.cu +0 -0
  1894. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_packgqa_sm100.cu +0 -0
  1895. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_packgqa_sm110.cu +0 -0
  1896. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_packgqa_sm120.cu +0 -0
  1897. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_packgqa_sm90.cu +0 -0
  1898. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_sm100.cu +0 -0
  1899. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_sm110.cu +0 -0
  1900. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_sm120.cu +0 -0
  1901. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_softcap_sm90.cu +0 -0
  1902. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_packgqa_sm100.cu +0 -0
  1903. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_packgqa_sm110.cu +0 -0
  1904. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_packgqa_sm120.cu +0 -0
  1905. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_packgqa_sm90.cu +0 -0
  1906. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_sm100.cu +0 -0
  1907. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_sm110.cu +0 -0
  1908. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_sm120.cu +0 -0
  1909. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_sm90.cu +0 -0
  1910. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_packgqa_sm100.cu +0 -0
  1911. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_packgqa_sm110.cu +0 -0
  1912. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_packgqa_sm120.cu +0 -0
  1913. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_packgqa_sm90.cu +0 -0
  1914. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_sm100.cu +0 -0
  1915. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_sm110.cu +0 -0
  1916. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_sm120.cu +0 -0
  1917. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_paged_split_softcap_sm90.cu +0 -0
  1918. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_sm100.cu +0 -0
  1919. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_sm110.cu +0 -0
  1920. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_sm120.cu +0 -0
  1921. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_sm90.cu +0 -0
  1922. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_packgqa_sm100.cu +0 -0
  1923. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_packgqa_sm110.cu +0 -0
  1924. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_packgqa_sm120.cu +0 -0
  1925. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_packgqa_sm90.cu +0 -0
  1926. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_sm100.cu +0 -0
  1927. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_sm110.cu +0 -0
  1928. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_sm120.cu +0 -0
  1929. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_softcap_sm90.cu +0 -0
  1930. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_packgqa_sm100.cu +0 -0
  1931. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_packgqa_sm110.cu +0 -0
  1932. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_packgqa_sm120.cu +0 -0
  1933. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_packgqa_sm90.cu +0 -0
  1934. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_sm100.cu +0 -0
  1935. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_sm110.cu +0 -0
  1936. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_sm120.cu +0 -0
  1937. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_sm90.cu +0 -0
  1938. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_packgqa_sm100.cu +0 -0
  1939. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_packgqa_sm110.cu +0 -0
  1940. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_packgqa_sm120.cu +0 -0
  1941. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_packgqa_sm90.cu +0 -0
  1942. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_sm100.cu +0 -0
  1943. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_sm110.cu +0 -0
  1944. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_sm120.cu +0 -0
  1945. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_e4m3_split_softcap_sm90.cu +0 -0
  1946. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_packgqa_sm100.cu +0 -0
  1947. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_packgqa_sm110.cu +0 -0
  1948. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_packgqa_sm120.cu +0 -0
  1949. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_packgqa_sm80.cu +0 -0
  1950. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_packgqa_sm90.cu +0 -0
  1951. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_packgqa_sm100.cu +0 -0
  1952. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_packgqa_sm110.cu +0 -0
  1953. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_packgqa_sm120.cu +0 -0
  1954. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_packgqa_sm80.cu +0 -0
  1955. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_packgqa_sm90.cu +0 -0
  1956. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_sm100.cu +0 -0
  1957. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_sm110.cu +0 -0
  1958. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_sm120.cu +0 -0
  1959. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_sm80.cu +0 -0
  1960. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_sm90.cu +0 -0
  1961. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_packgqa_sm100.cu +0 -0
  1962. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_packgqa_sm110.cu +0 -0
  1963. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_packgqa_sm120.cu +0 -0
  1964. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_packgqa_sm80.cu +0 -0
  1965. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_packgqa_sm90.cu +0 -0
  1966. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_sm100.cu +0 -0
  1967. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_sm110.cu +0 -0
  1968. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_sm120.cu +0 -0
  1969. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_sm80.cu +0 -0
  1970. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcap_sm90.cu +0 -0
  1971. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_softcapall_sm80.cu +0 -0
  1972. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_packgqa_sm100.cu +0 -0
  1973. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_packgqa_sm110.cu +0 -0
  1974. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_packgqa_sm120.cu +0 -0
  1975. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_packgqa_sm80.cu +0 -0
  1976. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_packgqa_sm90.cu +0 -0
  1977. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_sm100.cu +0 -0
  1978. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_sm110.cu +0 -0
  1979. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_sm120.cu +0 -0
  1980. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_sm80.cu +0 -0
  1981. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_sm90.cu +0 -0
  1982. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_packgqa_sm100.cu +0 -0
  1983. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_packgqa_sm110.cu +0 -0
  1984. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_packgqa_sm120.cu +0 -0
  1985. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_packgqa_sm80.cu +0 -0
  1986. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_packgqa_sm90.cu +0 -0
  1987. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_sm100.cu +0 -0
  1988. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_sm110.cu +0 -0
  1989. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_sm120.cu +0 -0
  1990. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_sm80.cu +0 -0
  1991. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcap_sm90.cu +0 -0
  1992. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_paged_split_softcapall_sm80.cu +0 -0
  1993. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_sm100.cu +0 -0
  1994. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_sm110.cu +0 -0
  1995. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_sm120.cu +0 -0
  1996. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_sm80.cu +0 -0
  1997. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_sm90.cu +0 -0
  1998. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_packgqa_sm100.cu +0 -0
  1999. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_packgqa_sm110.cu +0 -0
  2000. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_packgqa_sm120.cu +0 -0
  2001. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_packgqa_sm80.cu +0 -0
  2002. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_packgqa_sm90.cu +0 -0
  2003. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_sm100.cu +0 -0
  2004. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_sm110.cu +0 -0
  2005. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_sm120.cu +0 -0
  2006. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_sm80.cu +0 -0
  2007. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcap_sm90.cu +0 -0
  2008. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_softcapall_sm80.cu +0 -0
  2009. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_packgqa_sm100.cu +0 -0
  2010. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_packgqa_sm110.cu +0 -0
  2011. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_packgqa_sm120.cu +0 -0
  2012. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_packgqa_sm80.cu +0 -0
  2013. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_packgqa_sm90.cu +0 -0
  2014. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_sm100.cu +0 -0
  2015. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_sm110.cu +0 -0
  2016. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_sm120.cu +0 -0
  2017. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_sm80.cu +0 -0
  2018. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_sm90.cu +0 -0
  2019. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_packgqa_sm100.cu +0 -0
  2020. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_packgqa_sm110.cu +0 -0
  2021. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_packgqa_sm120.cu +0 -0
  2022. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_packgqa_sm80.cu +0 -0
  2023. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_packgqa_sm90.cu +0 -0
  2024. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_sm100.cu +0 -0
  2025. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_sm110.cu +0 -0
  2026. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_sm120.cu +0 -0
  2027. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_sm80.cu +0 -0
  2028. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcap_sm90.cu +0 -0
  2029. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim256_fp16_split_softcapall_sm80.cu +0 -0
  2030. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_packgqa_sm90.cu +0 -0
  2031. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_paged_sm90.cu +0 -0
  2032. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_paged_softcap_sm90.cu +0 -0
  2033. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_paged_split_sm90.cu +0 -0
  2034. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_paged_split_softcap_sm90.cu +0 -0
  2035. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_sm90.cu +0 -0
  2036. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_softcap_packgqa_sm90.cu +0 -0
  2037. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_softcap_sm90.cu +0 -0
  2038. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_split_sm90.cu +0 -0
  2039. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_bf16_split_softcap_sm90.cu +0 -0
  2040. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_packgqa_sm90.cu +0 -0
  2041. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_paged_sm90.cu +0 -0
  2042. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_paged_softcap_sm90.cu +0 -0
  2043. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_paged_split_sm90.cu +0 -0
  2044. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_paged_split_softcap_sm90.cu +0 -0
  2045. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_sm90.cu +0 -0
  2046. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_softcap_packgqa_sm90.cu +0 -0
  2047. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_softcap_sm90.cu +0 -0
  2048. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_split_sm90.cu +0 -0
  2049. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_256_fp16_split_softcap_sm90.cu +0 -0
  2050. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_packgqa_sm90.cu +0 -0
  2051. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_paged_sm90.cu +0 -0
  2052. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_paged_softcap_sm90.cu +0 -0
  2053. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_paged_split_sm90.cu +0 -0
  2054. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_paged_split_softcap_sm90.cu +0 -0
  2055. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_sm90.cu +0 -0
  2056. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_softcap_packgqa_sm90.cu +0 -0
  2057. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_softcap_sm90.cu +0 -0
  2058. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_split_sm90.cu +0 -0
  2059. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_bf16_split_softcap_sm90.cu +0 -0
  2060. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_packgqa_sm90.cu +0 -0
  2061. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_paged_sm90.cu +0 -0
  2062. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_paged_softcap_sm90.cu +0 -0
  2063. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_paged_split_sm90.cu +0 -0
  2064. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_paged_split_softcap_sm90.cu +0 -0
  2065. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_sm90.cu +0 -0
  2066. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_softcap_packgqa_sm90.cu +0 -0
  2067. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_softcap_sm90.cu +0 -0
  2068. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_split_sm90.cu +0 -0
  2069. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_512_fp16_split_softcap_sm90.cu +0 -0
  2070. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_packgqa_sm100.cu +0 -0
  2071. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_packgqa_sm110.cu +0 -0
  2072. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_packgqa_sm120.cu +0 -0
  2073. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_packgqa_sm80.cu +0 -0
  2074. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_packgqa_sm90.cu +0 -0
  2075. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_packgqa_sm100.cu +0 -0
  2076. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_packgqa_sm110.cu +0 -0
  2077. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_packgqa_sm120.cu +0 -0
  2078. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_packgqa_sm80.cu +0 -0
  2079. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_packgqa_sm90.cu +0 -0
  2080. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_sm100.cu +0 -0
  2081. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_sm110.cu +0 -0
  2082. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_sm120.cu +0 -0
  2083. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_sm80.cu +0 -0
  2084. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_sm90.cu +0 -0
  2085. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_packgqa_sm100.cu +0 -0
  2086. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_packgqa_sm110.cu +0 -0
  2087. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_packgqa_sm120.cu +0 -0
  2088. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_packgqa_sm80.cu +0 -0
  2089. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_packgqa_sm90.cu +0 -0
  2090. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_sm100.cu +0 -0
  2091. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_sm110.cu +0 -0
  2092. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_sm120.cu +0 -0
  2093. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_sm80.cu +0 -0
  2094. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcap_sm90.cu +0 -0
  2095. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_softcapall_sm80.cu +0 -0
  2096. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_packgqa_sm100.cu +0 -0
  2097. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_packgqa_sm110.cu +0 -0
  2098. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_packgqa_sm120.cu +0 -0
  2099. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_packgqa_sm80.cu +0 -0
  2100. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_packgqa_sm90.cu +0 -0
  2101. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_sm100.cu +0 -0
  2102. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_sm110.cu +0 -0
  2103. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_sm120.cu +0 -0
  2104. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_sm80.cu +0 -0
  2105. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_sm90.cu +0 -0
  2106. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_packgqa_sm100.cu +0 -0
  2107. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_packgqa_sm110.cu +0 -0
  2108. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_packgqa_sm120.cu +0 -0
  2109. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_packgqa_sm80.cu +0 -0
  2110. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_packgqa_sm90.cu +0 -0
  2111. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_sm100.cu +0 -0
  2112. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_sm110.cu +0 -0
  2113. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_sm120.cu +0 -0
  2114. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_sm80.cu +0 -0
  2115. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcap_sm90.cu +0 -0
  2116. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_paged_split_softcapall_sm80.cu +0 -0
  2117. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_sm100.cu +0 -0
  2118. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_sm110.cu +0 -0
  2119. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_sm120.cu +0 -0
  2120. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_sm80.cu +0 -0
  2121. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_sm90.cu +0 -0
  2122. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_packgqa_sm100.cu +0 -0
  2123. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_packgqa_sm110.cu +0 -0
  2124. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_packgqa_sm120.cu +0 -0
  2125. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_packgqa_sm80.cu +0 -0
  2126. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_packgqa_sm90.cu +0 -0
  2127. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_sm100.cu +0 -0
  2128. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_sm110.cu +0 -0
  2129. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_sm120.cu +0 -0
  2130. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_sm80.cu +0 -0
  2131. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcap_sm90.cu +0 -0
  2132. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_softcapall_sm80.cu +0 -0
  2133. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_packgqa_sm100.cu +0 -0
  2134. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_packgqa_sm110.cu +0 -0
  2135. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_packgqa_sm120.cu +0 -0
  2136. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_packgqa_sm80.cu +0 -0
  2137. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_packgqa_sm90.cu +0 -0
  2138. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_sm100.cu +0 -0
  2139. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_sm110.cu +0 -0
  2140. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_sm120.cu +0 -0
  2141. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_sm80.cu +0 -0
  2142. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_sm90.cu +0 -0
  2143. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_packgqa_sm100.cu +0 -0
  2144. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_packgqa_sm110.cu +0 -0
  2145. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_packgqa_sm120.cu +0 -0
  2146. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_packgqa_sm80.cu +0 -0
  2147. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_packgqa_sm90.cu +0 -0
  2148. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_sm100.cu +0 -0
  2149. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_sm110.cu +0 -0
  2150. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_sm120.cu +0 -0
  2151. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_sm80.cu +0 -0
  2152. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcap_sm90.cu +0 -0
  2153. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_bf16_split_softcapall_sm80.cu +0 -0
  2154. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_packgqa_sm100.cu +0 -0
  2155. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_packgqa_sm110.cu +0 -0
  2156. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_packgqa_sm120.cu +0 -0
  2157. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_packgqa_sm90.cu +0 -0
  2158. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_packgqa_sm100.cu +0 -0
  2159. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_packgqa_sm110.cu +0 -0
  2160. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_packgqa_sm120.cu +0 -0
  2161. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_packgqa_sm90.cu +0 -0
  2162. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_sm100.cu +0 -0
  2163. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_sm110.cu +0 -0
  2164. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_sm120.cu +0 -0
  2165. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_sm90.cu +0 -0
  2166. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_packgqa_sm100.cu +0 -0
  2167. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_packgqa_sm110.cu +0 -0
  2168. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_packgqa_sm120.cu +0 -0
  2169. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_packgqa_sm90.cu +0 -0
  2170. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_sm100.cu +0 -0
  2171. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_sm110.cu +0 -0
  2172. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_sm120.cu +0 -0
  2173. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_softcap_sm90.cu +0 -0
  2174. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_packgqa_sm100.cu +0 -0
  2175. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_packgqa_sm110.cu +0 -0
  2176. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_packgqa_sm120.cu +0 -0
  2177. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_packgqa_sm90.cu +0 -0
  2178. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_sm100.cu +0 -0
  2179. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_sm110.cu +0 -0
  2180. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_sm120.cu +0 -0
  2181. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_sm90.cu +0 -0
  2182. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_packgqa_sm100.cu +0 -0
  2183. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_packgqa_sm110.cu +0 -0
  2184. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_packgqa_sm120.cu +0 -0
  2185. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_packgqa_sm90.cu +0 -0
  2186. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_sm100.cu +0 -0
  2187. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_sm110.cu +0 -0
  2188. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_sm120.cu +0 -0
  2189. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_paged_split_softcap_sm90.cu +0 -0
  2190. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_sm100.cu +0 -0
  2191. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_sm110.cu +0 -0
  2192. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_sm120.cu +0 -0
  2193. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_sm90.cu +0 -0
  2194. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_packgqa_sm100.cu +0 -0
  2195. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_packgqa_sm110.cu +0 -0
  2196. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_packgqa_sm120.cu +0 -0
  2197. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_packgqa_sm90.cu +0 -0
  2198. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_sm100.cu +0 -0
  2199. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_sm110.cu +0 -0
  2200. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_sm120.cu +0 -0
  2201. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_softcap_sm90.cu +0 -0
  2202. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_packgqa_sm100.cu +0 -0
  2203. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_packgqa_sm110.cu +0 -0
  2204. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_packgqa_sm120.cu +0 -0
  2205. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_packgqa_sm90.cu +0 -0
  2206. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_sm100.cu +0 -0
  2207. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_sm110.cu +0 -0
  2208. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_sm120.cu +0 -0
  2209. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_sm90.cu +0 -0
  2210. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_packgqa_sm100.cu +0 -0
  2211. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_packgqa_sm110.cu +0 -0
  2212. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_packgqa_sm120.cu +0 -0
  2213. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_packgqa_sm90.cu +0 -0
  2214. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_sm100.cu +0 -0
  2215. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_sm110.cu +0 -0
  2216. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_sm120.cu +0 -0
  2217. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_e4m3_split_softcap_sm90.cu +0 -0
  2218. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_packgqa_sm100.cu +0 -0
  2219. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_packgqa_sm110.cu +0 -0
  2220. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_packgqa_sm120.cu +0 -0
  2221. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_packgqa_sm80.cu +0 -0
  2222. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_packgqa_sm90.cu +0 -0
  2223. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_packgqa_sm100.cu +0 -0
  2224. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_packgqa_sm110.cu +0 -0
  2225. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_packgqa_sm120.cu +0 -0
  2226. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_packgqa_sm80.cu +0 -0
  2227. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_packgqa_sm90.cu +0 -0
  2228. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_sm100.cu +0 -0
  2229. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_sm110.cu +0 -0
  2230. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_sm120.cu +0 -0
  2231. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_sm80.cu +0 -0
  2232. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_sm90.cu +0 -0
  2233. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_packgqa_sm100.cu +0 -0
  2234. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_packgqa_sm110.cu +0 -0
  2235. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_packgqa_sm120.cu +0 -0
  2236. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_packgqa_sm80.cu +0 -0
  2237. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_packgqa_sm90.cu +0 -0
  2238. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_sm100.cu +0 -0
  2239. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_sm110.cu +0 -0
  2240. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_sm120.cu +0 -0
  2241. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_sm80.cu +0 -0
  2242. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcap_sm90.cu +0 -0
  2243. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_softcapall_sm80.cu +0 -0
  2244. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_packgqa_sm100.cu +0 -0
  2245. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_packgqa_sm110.cu +0 -0
  2246. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_packgqa_sm120.cu +0 -0
  2247. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_packgqa_sm80.cu +0 -0
  2248. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_packgqa_sm90.cu +0 -0
  2249. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_sm100.cu +0 -0
  2250. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_sm110.cu +0 -0
  2251. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_sm120.cu +0 -0
  2252. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_sm80.cu +0 -0
  2253. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_sm90.cu +0 -0
  2254. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_packgqa_sm100.cu +0 -0
  2255. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_packgqa_sm110.cu +0 -0
  2256. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_packgqa_sm120.cu +0 -0
  2257. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_packgqa_sm80.cu +0 -0
  2258. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_packgqa_sm90.cu +0 -0
  2259. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_sm100.cu +0 -0
  2260. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_sm110.cu +0 -0
  2261. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_sm120.cu +0 -0
  2262. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_sm80.cu +0 -0
  2263. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcap_sm90.cu +0 -0
  2264. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_paged_split_softcapall_sm80.cu +0 -0
  2265. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_sm100.cu +0 -0
  2266. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_sm110.cu +0 -0
  2267. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_sm120.cu +0 -0
  2268. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_sm80.cu +0 -0
  2269. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_sm90.cu +0 -0
  2270. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_packgqa_sm100.cu +0 -0
  2271. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_packgqa_sm110.cu +0 -0
  2272. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_packgqa_sm120.cu +0 -0
  2273. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_packgqa_sm80.cu +0 -0
  2274. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_packgqa_sm90.cu +0 -0
  2275. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_sm100.cu +0 -0
  2276. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_sm110.cu +0 -0
  2277. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_sm120.cu +0 -0
  2278. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_sm80.cu +0 -0
  2279. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcap_sm90.cu +0 -0
  2280. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_softcapall_sm80.cu +0 -0
  2281. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_packgqa_sm100.cu +0 -0
  2282. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_packgqa_sm110.cu +0 -0
  2283. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_packgqa_sm120.cu +0 -0
  2284. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_packgqa_sm80.cu +0 -0
  2285. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_packgqa_sm90.cu +0 -0
  2286. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_sm100.cu +0 -0
  2287. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_sm110.cu +0 -0
  2288. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_sm120.cu +0 -0
  2289. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_sm80.cu +0 -0
  2290. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_sm90.cu +0 -0
  2291. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_packgqa_sm100.cu +0 -0
  2292. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_packgqa_sm110.cu +0 -0
  2293. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_packgqa_sm120.cu +0 -0
  2294. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_packgqa_sm80.cu +0 -0
  2295. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_packgqa_sm90.cu +0 -0
  2296. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_sm100.cu +0 -0
  2297. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_sm110.cu +0 -0
  2298. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_sm120.cu +0 -0
  2299. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_sm80.cu +0 -0
  2300. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcap_sm90.cu +0 -0
  2301. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim64_fp16_split_softcapall_sm80.cu +0 -0
  2302. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_packgqa_sm100.cu +0 -0
  2303. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_packgqa_sm110.cu +0 -0
  2304. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_packgqa_sm120.cu +0 -0
  2305. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_packgqa_sm80.cu +0 -0
  2306. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_packgqa_sm90.cu +0 -0
  2307. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_packgqa_sm100.cu +0 -0
  2308. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_packgqa_sm110.cu +0 -0
  2309. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_packgqa_sm120.cu +0 -0
  2310. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_packgqa_sm80.cu +0 -0
  2311. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_packgqa_sm90.cu +0 -0
  2312. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_sm100.cu +0 -0
  2313. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_sm110.cu +0 -0
  2314. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_sm120.cu +0 -0
  2315. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_sm80.cu +0 -0
  2316. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_sm90.cu +0 -0
  2317. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_packgqa_sm100.cu +0 -0
  2318. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_packgqa_sm110.cu +0 -0
  2319. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_packgqa_sm120.cu +0 -0
  2320. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_packgqa_sm80.cu +0 -0
  2321. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_packgqa_sm90.cu +0 -0
  2322. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_sm100.cu +0 -0
  2323. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_sm110.cu +0 -0
  2324. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_sm120.cu +0 -0
  2325. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_sm80.cu +0 -0
  2326. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcap_sm90.cu +0 -0
  2327. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_softcapall_sm80.cu +0 -0
  2328. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_packgqa_sm100.cu +0 -0
  2329. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_packgqa_sm110.cu +0 -0
  2330. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_packgqa_sm120.cu +0 -0
  2331. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_packgqa_sm80.cu +0 -0
  2332. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_packgqa_sm90.cu +0 -0
  2333. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_sm100.cu +0 -0
  2334. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_sm110.cu +0 -0
  2335. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_sm120.cu +0 -0
  2336. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_sm80.cu +0 -0
  2337. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_sm90.cu +0 -0
  2338. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_packgqa_sm100.cu +0 -0
  2339. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_packgqa_sm110.cu +0 -0
  2340. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_packgqa_sm120.cu +0 -0
  2341. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_packgqa_sm80.cu +0 -0
  2342. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_packgqa_sm90.cu +0 -0
  2343. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_sm100.cu +0 -0
  2344. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_sm110.cu +0 -0
  2345. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_sm120.cu +0 -0
  2346. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_sm80.cu +0 -0
  2347. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcap_sm90.cu +0 -0
  2348. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_paged_split_softcapall_sm80.cu +0 -0
  2349. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_sm100.cu +0 -0
  2350. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_sm110.cu +0 -0
  2351. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_sm120.cu +0 -0
  2352. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_sm80.cu +0 -0
  2353. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_sm90.cu +0 -0
  2354. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_packgqa_sm100.cu +0 -0
  2355. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_packgqa_sm110.cu +0 -0
  2356. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_packgqa_sm120.cu +0 -0
  2357. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_packgqa_sm80.cu +0 -0
  2358. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_packgqa_sm90.cu +0 -0
  2359. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_sm100.cu +0 -0
  2360. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_sm110.cu +0 -0
  2361. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_sm120.cu +0 -0
  2362. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_sm80.cu +0 -0
  2363. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcap_sm90.cu +0 -0
  2364. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_softcapall_sm80.cu +0 -0
  2365. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_packgqa_sm100.cu +0 -0
  2366. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_packgqa_sm110.cu +0 -0
  2367. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_packgqa_sm120.cu +0 -0
  2368. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_packgqa_sm80.cu +0 -0
  2369. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_packgqa_sm90.cu +0 -0
  2370. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_sm100.cu +0 -0
  2371. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_sm110.cu +0 -0
  2372. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_sm120.cu +0 -0
  2373. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_sm80.cu +0 -0
  2374. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_sm90.cu +0 -0
  2375. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_packgqa_sm100.cu +0 -0
  2376. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_packgqa_sm110.cu +0 -0
  2377. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_packgqa_sm120.cu +0 -0
  2378. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_packgqa_sm80.cu +0 -0
  2379. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_packgqa_sm90.cu +0 -0
  2380. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_sm100.cu +0 -0
  2381. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_sm110.cu +0 -0
  2382. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_sm120.cu +0 -0
  2383. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_sm80.cu +0 -0
  2384. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcap_sm90.cu +0 -0
  2385. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_bf16_split_softcapall_sm80.cu +0 -0
  2386. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_packgqa_sm100.cu +0 -0
  2387. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_packgqa_sm110.cu +0 -0
  2388. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_packgqa_sm120.cu +0 -0
  2389. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_packgqa_sm90.cu +0 -0
  2390. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_packgqa_sm100.cu +0 -0
  2391. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_packgqa_sm110.cu +0 -0
  2392. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_packgqa_sm120.cu +0 -0
  2393. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_packgqa_sm90.cu +0 -0
  2394. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_sm100.cu +0 -0
  2395. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_sm110.cu +0 -0
  2396. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_sm120.cu +0 -0
  2397. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_sm90.cu +0 -0
  2398. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_packgqa_sm100.cu +0 -0
  2399. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_packgqa_sm110.cu +0 -0
  2400. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_packgqa_sm120.cu +0 -0
  2401. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_packgqa_sm90.cu +0 -0
  2402. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_sm100.cu +0 -0
  2403. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_sm110.cu +0 -0
  2404. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_sm120.cu +0 -0
  2405. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_softcap_sm90.cu +0 -0
  2406. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_packgqa_sm100.cu +0 -0
  2407. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_packgqa_sm110.cu +0 -0
  2408. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_packgqa_sm120.cu +0 -0
  2409. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_packgqa_sm90.cu +0 -0
  2410. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_sm100.cu +0 -0
  2411. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_sm110.cu +0 -0
  2412. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_sm120.cu +0 -0
  2413. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_sm90.cu +0 -0
  2414. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_packgqa_sm100.cu +0 -0
  2415. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_packgqa_sm110.cu +0 -0
  2416. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_packgqa_sm120.cu +0 -0
  2417. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_packgqa_sm90.cu +0 -0
  2418. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_sm100.cu +0 -0
  2419. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_sm110.cu +0 -0
  2420. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_sm120.cu +0 -0
  2421. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_paged_split_softcap_sm90.cu +0 -0
  2422. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_sm100.cu +0 -0
  2423. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_sm110.cu +0 -0
  2424. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_sm120.cu +0 -0
  2425. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_sm90.cu +0 -0
  2426. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_packgqa_sm100.cu +0 -0
  2427. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_packgqa_sm110.cu +0 -0
  2428. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_packgqa_sm120.cu +0 -0
  2429. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_packgqa_sm90.cu +0 -0
  2430. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_sm100.cu +0 -0
  2431. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_sm110.cu +0 -0
  2432. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_sm120.cu +0 -0
  2433. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_softcap_sm90.cu +0 -0
  2434. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_packgqa_sm100.cu +0 -0
  2435. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_packgqa_sm110.cu +0 -0
  2436. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_packgqa_sm120.cu +0 -0
  2437. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_packgqa_sm90.cu +0 -0
  2438. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_sm100.cu +0 -0
  2439. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_sm110.cu +0 -0
  2440. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_sm120.cu +0 -0
  2441. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_sm90.cu +0 -0
  2442. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_packgqa_sm100.cu +0 -0
  2443. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_packgqa_sm110.cu +0 -0
  2444. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_packgqa_sm120.cu +0 -0
  2445. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_packgqa_sm90.cu +0 -0
  2446. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_sm100.cu +0 -0
  2447. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_sm110.cu +0 -0
  2448. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_sm120.cu +0 -0
  2449. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_e4m3_split_softcap_sm90.cu +0 -0
  2450. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_packgqa_sm100.cu +0 -0
  2451. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_packgqa_sm110.cu +0 -0
  2452. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_packgqa_sm120.cu +0 -0
  2453. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_packgqa_sm80.cu +0 -0
  2454. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_packgqa_sm90.cu +0 -0
  2455. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_packgqa_sm100.cu +0 -0
  2456. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_packgqa_sm110.cu +0 -0
  2457. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_packgqa_sm120.cu +0 -0
  2458. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_packgqa_sm80.cu +0 -0
  2459. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_packgqa_sm90.cu +0 -0
  2460. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_sm100.cu +0 -0
  2461. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_sm110.cu +0 -0
  2462. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_sm120.cu +0 -0
  2463. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_sm80.cu +0 -0
  2464. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_sm90.cu +0 -0
  2465. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_packgqa_sm100.cu +0 -0
  2466. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_packgqa_sm110.cu +0 -0
  2467. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_packgqa_sm120.cu +0 -0
  2468. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_packgqa_sm80.cu +0 -0
  2469. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_packgqa_sm90.cu +0 -0
  2470. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_sm100.cu +0 -0
  2471. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_sm110.cu +0 -0
  2472. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_sm120.cu +0 -0
  2473. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_sm80.cu +0 -0
  2474. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcap_sm90.cu +0 -0
  2475. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_softcapall_sm80.cu +0 -0
  2476. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_packgqa_sm100.cu +0 -0
  2477. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_packgqa_sm110.cu +0 -0
  2478. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_packgqa_sm120.cu +0 -0
  2479. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_packgqa_sm80.cu +0 -0
  2480. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_packgqa_sm90.cu +0 -0
  2481. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_sm100.cu +0 -0
  2482. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_sm110.cu +0 -0
  2483. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_sm120.cu +0 -0
  2484. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_sm80.cu +0 -0
  2485. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_sm90.cu +0 -0
  2486. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_packgqa_sm100.cu +0 -0
  2487. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_packgqa_sm110.cu +0 -0
  2488. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_packgqa_sm120.cu +0 -0
  2489. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_packgqa_sm80.cu +0 -0
  2490. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_packgqa_sm90.cu +0 -0
  2491. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_sm100.cu +0 -0
  2492. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_sm110.cu +0 -0
  2493. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_sm120.cu +0 -0
  2494. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_sm80.cu +0 -0
  2495. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcap_sm90.cu +0 -0
  2496. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_paged_split_softcapall_sm80.cu +0 -0
  2497. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_sm100.cu +0 -0
  2498. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_sm110.cu +0 -0
  2499. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_sm120.cu +0 -0
  2500. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_sm80.cu +0 -0
  2501. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_sm90.cu +0 -0
  2502. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_packgqa_sm100.cu +0 -0
  2503. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_packgqa_sm110.cu +0 -0
  2504. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_packgqa_sm120.cu +0 -0
  2505. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_packgqa_sm80.cu +0 -0
  2506. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_packgqa_sm90.cu +0 -0
  2507. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_sm100.cu +0 -0
  2508. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_sm110.cu +0 -0
  2509. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_sm120.cu +0 -0
  2510. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_sm80.cu +0 -0
  2511. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcap_sm90.cu +0 -0
  2512. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_softcapall_sm80.cu +0 -0
  2513. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_packgqa_sm100.cu +0 -0
  2514. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_packgqa_sm110.cu +0 -0
  2515. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_packgqa_sm120.cu +0 -0
  2516. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_packgqa_sm80.cu +0 -0
  2517. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_packgqa_sm90.cu +0 -0
  2518. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_sm100.cu +0 -0
  2519. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_sm110.cu +0 -0
  2520. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_sm120.cu +0 -0
  2521. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_sm80.cu +0 -0
  2522. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_sm90.cu +0 -0
  2523. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_packgqa_sm100.cu +0 -0
  2524. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_packgqa_sm110.cu +0 -0
  2525. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_packgqa_sm120.cu +0 -0
  2526. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_packgqa_sm80.cu +0 -0
  2527. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_packgqa_sm90.cu +0 -0
  2528. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_sm100.cu +0 -0
  2529. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_sm110.cu +0 -0
  2530. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_sm120.cu +0 -0
  2531. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_sm80.cu +0 -0
  2532. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcap_sm90.cu +0 -0
  2533. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdim96_fp16_split_softcapall_sm80.cu +0 -0
  2534. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_packgqa_sm90.cu +0 -0
  2535. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_paged_sm90.cu +0 -0
  2536. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_paged_softcap_sm90.cu +0 -0
  2537. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_paged_split_sm90.cu +0 -0
  2538. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_paged_split_softcap_sm90.cu +0 -0
  2539. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_sm90.cu +0 -0
  2540. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_softcap_packgqa_sm90.cu +0 -0
  2541. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_softcap_sm90.cu +0 -0
  2542. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_split_sm90.cu +0 -0
  2543. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_bf16_split_softcap_sm90.cu +0 -0
  2544. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_packgqa_sm90.cu +0 -0
  2545. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_paged_sm90.cu +0 -0
  2546. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_paged_softcap_sm90.cu +0 -0
  2547. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_paged_split_sm90.cu +0 -0
  2548. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_paged_split_softcap_sm90.cu +0 -0
  2549. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_sm90.cu +0 -0
  2550. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_softcap_packgqa_sm90.cu +0 -0
  2551. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_softcap_sm90.cu +0 -0
  2552. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_split_sm90.cu +0 -0
  2553. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_e4m3_split_softcap_sm90.cu +0 -0
  2554. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_packgqa_sm90.cu +0 -0
  2555. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_paged_sm90.cu +0 -0
  2556. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_paged_softcap_sm90.cu +0 -0
  2557. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_paged_split_sm90.cu +0 -0
  2558. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_paged_split_softcap_sm90.cu +0 -0
  2559. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_sm90.cu +0 -0
  2560. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_softcap_packgqa_sm90.cu +0 -0
  2561. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_softcap_sm90.cu +0 -0
  2562. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_split_sm90.cu +0 -0
  2563. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimall_fp16_split_softcap_sm90.cu +0 -0
  2564. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_packgqa_sm90.cu +0 -0
  2565. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_paged_sm90.cu +0 -0
  2566. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_paged_softcap_sm90.cu +0 -0
  2567. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_paged_split_sm90.cu +0 -0
  2568. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_paged_split_softcap_sm90.cu +0 -0
  2569. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_sm90.cu +0 -0
  2570. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_softcap_packgqa_sm90.cu +0 -0
  2571. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_softcap_sm90.cu +0 -0
  2572. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_split_sm90.cu +0 -0
  2573. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_bf16_split_softcap_sm90.cu +0 -0
  2574. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_packgqa_sm90.cu +0 -0
  2575. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_paged_sm90.cu +0 -0
  2576. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_paged_softcap_sm90.cu +0 -0
  2577. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_paged_split_sm90.cu +0 -0
  2578. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_paged_split_softcap_sm90.cu +0 -0
  2579. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_sm90.cu +0 -0
  2580. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_softcap_packgqa_sm90.cu +0 -0
  2581. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_softcap_sm90.cu +0 -0
  2582. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_split_sm90.cu +0 -0
  2583. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_e4m3_split_softcap_sm90.cu +0 -0
  2584. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_packgqa_sm90.cu +0 -0
  2585. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_paged_sm90.cu +0 -0
  2586. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_paged_softcap_sm90.cu +0 -0
  2587. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_paged_split_sm90.cu +0 -0
  2588. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_paged_split_softcap_sm90.cu +0 -0
  2589. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_sm90.cu +0 -0
  2590. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_softcap_packgqa_sm90.cu +0 -0
  2591. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_softcap_sm90.cu +0 -0
  2592. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_split_sm90.cu +0 -0
  2593. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/instantiations/flash_fwd_hdimdiff_fp16_split_softcap_sm90.cu +0 -0
  2594. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/mainloop_bwd_sm80.hpp +0 -0
  2595. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/mainloop_bwd_sm90_tma_gmma_ws.hpp +0 -0
  2596. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/mainloop_fwd_sm80.hpp +0 -0
  2597. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/mainloop_fwd_sm90_tma_gmma_ws.hpp +0 -0
  2598. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/mask.h +0 -0
  2599. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/named_barrier.hpp +0 -0
  2600. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/pack_gqa.h +0 -0
  2601. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/paged_kv.h +0 -0
  2602. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/rotary.h +0 -0
  2603. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/seqlen.h +0 -0
  2604. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/sm90_pipeline_no_cluster.hpp +0 -0
  2605. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/softmax.h +0 -0
  2606. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/static_switch.h +0 -0
  2607. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/tile_scheduler.hpp +0 -0
  2608. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/tile_size.h +0 -0
  2609. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/hopper/utils.h +0 -0
  2610. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/include/c10/cuda/CUDAException.h +0 -0
  2611. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/include/ejkernel_flash_attention.h +0 -0
  2612. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/include/ejkernel_flash_attention_cutlass.h +0 -0
  2613. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/alibi.h +0 -0
  2614. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/aten_shim.h +0 -0
  2615. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/block_info.h +0 -0
  2616. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/code_gen.py +0 -0
  2617. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/dropout.h +0 -0
  2618. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash.h +0 -0
  2619. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_attention_ffi.cu +0 -0
  2620. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_causal_sm100.cu +0 -0
  2621. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_causal_sm110.cu +0 -0
  2622. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_causal_sm120.cu +0 -0
  2623. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_causal_sm80.cu +0 -0
  2624. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_causal_sm90.cu +0 -0
  2625. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_sm100.cu +0 -0
  2626. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_sm110.cu +0 -0
  2627. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_sm120.cu +0 -0
  2628. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_sm80.cu +0 -0
  2629. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_bf16_sm90.cu +0 -0
  2630. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_causal_sm100.cu +0 -0
  2631. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_causal_sm110.cu +0 -0
  2632. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_causal_sm120.cu +0 -0
  2633. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_causal_sm80.cu +0 -0
  2634. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_causal_sm90.cu +0 -0
  2635. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_sm100.cu +0 -0
  2636. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_sm110.cu +0 -0
  2637. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_sm120.cu +0 -0
  2638. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_sm80.cu +0 -0
  2639. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim128_fp16_sm90.cu +0 -0
  2640. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_causal_sm100.cu +0 -0
  2641. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_causal_sm110.cu +0 -0
  2642. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_causal_sm120.cu +0 -0
  2643. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_causal_sm80.cu +0 -0
  2644. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_causal_sm90.cu +0 -0
  2645. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_sm100.cu +0 -0
  2646. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_sm110.cu +0 -0
  2647. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_sm120.cu +0 -0
  2648. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_sm80.cu +0 -0
  2649. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_bf16_sm90.cu +0 -0
  2650. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_causal_sm100.cu +0 -0
  2651. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_causal_sm110.cu +0 -0
  2652. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_causal_sm120.cu +0 -0
  2653. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_causal_sm80.cu +0 -0
  2654. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_causal_sm90.cu +0 -0
  2655. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_sm100.cu +0 -0
  2656. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_sm110.cu +0 -0
  2657. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_sm120.cu +0 -0
  2658. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_sm80.cu +0 -0
  2659. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim192_fp16_sm90.cu +0 -0
  2660. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_causal_sm100.cu +0 -0
  2661. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_causal_sm110.cu +0 -0
  2662. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_causal_sm120.cu +0 -0
  2663. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_causal_sm80.cu +0 -0
  2664. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_causal_sm90.cu +0 -0
  2665. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_sm100.cu +0 -0
  2666. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_sm110.cu +0 -0
  2667. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_sm120.cu +0 -0
  2668. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_sm80.cu +0 -0
  2669. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_bf16_sm90.cu +0 -0
  2670. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_causal_sm100.cu +0 -0
  2671. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_causal_sm110.cu +0 -0
  2672. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_causal_sm120.cu +0 -0
  2673. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_causal_sm80.cu +0 -0
  2674. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_causal_sm90.cu +0 -0
  2675. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_sm100.cu +0 -0
  2676. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_sm110.cu +0 -0
  2677. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_sm120.cu +0 -0
  2678. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_sm80.cu +0 -0
  2679. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim256_fp16_sm90.cu +0 -0
  2680. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_causal_sm100.cu +0 -0
  2681. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_causal_sm110.cu +0 -0
  2682. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_causal_sm120.cu +0 -0
  2683. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_causal_sm80.cu +0 -0
  2684. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_causal_sm90.cu +0 -0
  2685. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_sm100.cu +0 -0
  2686. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_sm110.cu +0 -0
  2687. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_sm120.cu +0 -0
  2688. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_sm80.cu +0 -0
  2689. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_bf16_sm90.cu +0 -0
  2690. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_causal_sm100.cu +0 -0
  2691. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_causal_sm110.cu +0 -0
  2692. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_causal_sm120.cu +0 -0
  2693. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_causal_sm80.cu +0 -0
  2694. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_causal_sm90.cu +0 -0
  2695. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_sm100.cu +0 -0
  2696. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_sm110.cu +0 -0
  2697. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_sm120.cu +0 -0
  2698. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_sm80.cu +0 -0
  2699. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim32_fp16_sm90.cu +0 -0
  2700. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_causal_sm100.cu +0 -0
  2701. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_causal_sm110.cu +0 -0
  2702. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_causal_sm120.cu +0 -0
  2703. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_causal_sm80.cu +0 -0
  2704. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_causal_sm90.cu +0 -0
  2705. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_sm100.cu +0 -0
  2706. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_sm110.cu +0 -0
  2707. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_sm120.cu +0 -0
  2708. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_sm80.cu +0 -0
  2709. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_bf16_sm90.cu +0 -0
  2710. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_causal_sm100.cu +0 -0
  2711. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_causal_sm110.cu +0 -0
  2712. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_causal_sm120.cu +0 -0
  2713. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_causal_sm80.cu +0 -0
  2714. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_causal_sm90.cu +0 -0
  2715. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_sm100.cu +0 -0
  2716. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_sm110.cu +0 -0
  2717. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_sm120.cu +0 -0
  2718. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_sm80.cu +0 -0
  2719. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim64_fp16_sm90.cu +0 -0
  2720. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_causal_sm100.cu +0 -0
  2721. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_causal_sm110.cu +0 -0
  2722. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_causal_sm120.cu +0 -0
  2723. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_causal_sm80.cu +0 -0
  2724. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_causal_sm90.cu +0 -0
  2725. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_sm100.cu +0 -0
  2726. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_sm110.cu +0 -0
  2727. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_sm120.cu +0 -0
  2728. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_sm80.cu +0 -0
  2729. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_bf16_sm90.cu +0 -0
  2730. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_causal_sm100.cu +0 -0
  2731. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_causal_sm110.cu +0 -0
  2732. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_causal_sm120.cu +0 -0
  2733. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_causal_sm80.cu +0 -0
  2734. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_causal_sm90.cu +0 -0
  2735. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_sm100.cu +0 -0
  2736. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_sm110.cu +0 -0
  2737. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_sm120.cu +0 -0
  2738. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_sm80.cu +0 -0
  2739. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_hdim96_fp16_sm90.cu +0 -0
  2740. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_kernel.h +0 -0
  2741. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_launch_template.h +0 -0
  2742. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_bwd_preprocess_kernel.h +0 -0
  2743. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_causal_sm100.cu +0 -0
  2744. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_causal_sm110.cu +0 -0
  2745. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_causal_sm120.cu +0 -0
  2746. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_causal_sm80.cu +0 -0
  2747. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_causal_sm90.cu +0 -0
  2748. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_sm100.cu +0 -0
  2749. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_sm110.cu +0 -0
  2750. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_sm120.cu +0 -0
  2751. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_sm80.cu +0 -0
  2752. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_bf16_sm90.cu +0 -0
  2753. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_causal_sm100.cu +0 -0
  2754. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_causal_sm110.cu +0 -0
  2755. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_causal_sm120.cu +0 -0
  2756. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_causal_sm80.cu +0 -0
  2757. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_causal_sm90.cu +0 -0
  2758. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_sm100.cu +0 -0
  2759. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_sm110.cu +0 -0
  2760. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_sm120.cu +0 -0
  2761. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_sm80.cu +0 -0
  2762. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim128_fp16_sm90.cu +0 -0
  2763. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_causal_sm100.cu +0 -0
  2764. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_causal_sm110.cu +0 -0
  2765. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_causal_sm120.cu +0 -0
  2766. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_causal_sm80.cu +0 -0
  2767. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_causal_sm90.cu +0 -0
  2768. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_sm100.cu +0 -0
  2769. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_sm110.cu +0 -0
  2770. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_sm120.cu +0 -0
  2771. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_sm80.cu +0 -0
  2772. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_bf16_sm90.cu +0 -0
  2773. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_causal_sm100.cu +0 -0
  2774. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_causal_sm110.cu +0 -0
  2775. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_causal_sm120.cu +0 -0
  2776. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_causal_sm80.cu +0 -0
  2777. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_causal_sm90.cu +0 -0
  2778. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_sm100.cu +0 -0
  2779. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_sm110.cu +0 -0
  2780. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_sm120.cu +0 -0
  2781. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_sm80.cu +0 -0
  2782. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim192_fp16_sm90.cu +0 -0
  2783. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_causal_sm100.cu +0 -0
  2784. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_causal_sm110.cu +0 -0
  2785. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_causal_sm120.cu +0 -0
  2786. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_causal_sm80.cu +0 -0
  2787. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_causal_sm90.cu +0 -0
  2788. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_sm100.cu +0 -0
  2789. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_sm110.cu +0 -0
  2790. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_sm120.cu +0 -0
  2791. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_sm80.cu +0 -0
  2792. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_bf16_sm90.cu +0 -0
  2793. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_causal_sm100.cu +0 -0
  2794. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_causal_sm110.cu +0 -0
  2795. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_causal_sm120.cu +0 -0
  2796. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_causal_sm80.cu +0 -0
  2797. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_causal_sm90.cu +0 -0
  2798. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_sm100.cu +0 -0
  2799. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_sm110.cu +0 -0
  2800. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_sm120.cu +0 -0
  2801. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_sm80.cu +0 -0
  2802. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim256_fp16_sm90.cu +0 -0
  2803. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_causal_sm100.cu +0 -0
  2804. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_causal_sm110.cu +0 -0
  2805. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_causal_sm120.cu +0 -0
  2806. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_causal_sm80.cu +0 -0
  2807. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_causal_sm90.cu +0 -0
  2808. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_sm100.cu +0 -0
  2809. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_sm110.cu +0 -0
  2810. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_sm120.cu +0 -0
  2811. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_sm80.cu +0 -0
  2812. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_bf16_sm90.cu +0 -0
  2813. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_causal_sm100.cu +0 -0
  2814. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_causal_sm110.cu +0 -0
  2815. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_causal_sm120.cu +0 -0
  2816. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_causal_sm80.cu +0 -0
  2817. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_causal_sm90.cu +0 -0
  2818. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_sm100.cu +0 -0
  2819. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_sm110.cu +0 -0
  2820. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_sm120.cu +0 -0
  2821. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_sm80.cu +0 -0
  2822. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim32_fp16_sm90.cu +0 -0
  2823. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_causal_sm100.cu +0 -0
  2824. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_causal_sm110.cu +0 -0
  2825. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_causal_sm120.cu +0 -0
  2826. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_causal_sm80.cu +0 -0
  2827. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_causal_sm90.cu +0 -0
  2828. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_sm100.cu +0 -0
  2829. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_sm110.cu +0 -0
  2830. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_sm120.cu +0 -0
  2831. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_sm80.cu +0 -0
  2832. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_bf16_sm90.cu +0 -0
  2833. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_causal_sm100.cu +0 -0
  2834. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_causal_sm110.cu +0 -0
  2835. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_causal_sm120.cu +0 -0
  2836. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_causal_sm80.cu +0 -0
  2837. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_causal_sm90.cu +0 -0
  2838. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_sm100.cu +0 -0
  2839. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_sm110.cu +0 -0
  2840. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_sm120.cu +0 -0
  2841. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_sm80.cu +0 -0
  2842. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim64_fp16_sm90.cu +0 -0
  2843. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_causal_sm100.cu +0 -0
  2844. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_causal_sm110.cu +0 -0
  2845. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_causal_sm120.cu +0 -0
  2846. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_causal_sm80.cu +0 -0
  2847. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_causal_sm90.cu +0 -0
  2848. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_sm100.cu +0 -0
  2849. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_sm110.cu +0 -0
  2850. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_sm120.cu +0 -0
  2851. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_sm80.cu +0 -0
  2852. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_bf16_sm90.cu +0 -0
  2853. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_causal_sm100.cu +0 -0
  2854. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_causal_sm110.cu +0 -0
  2855. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_causal_sm120.cu +0 -0
  2856. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_causal_sm80.cu +0 -0
  2857. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_causal_sm90.cu +0 -0
  2858. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_sm100.cu +0 -0
  2859. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_sm110.cu +0 -0
  2860. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_sm120.cu +0 -0
  2861. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_sm80.cu +0 -0
  2862. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_hdim96_fp16_sm90.cu +0 -0
  2863. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_kernel.h +0 -0
  2864. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_causal_sm100.cu +0 -0
  2865. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_causal_sm110.cu +0 -0
  2866. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_causal_sm120.cu +0 -0
  2867. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_causal_sm80.cu +0 -0
  2868. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_causal_sm90.cu +0 -0
  2869. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_sm100.cu +0 -0
  2870. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_sm110.cu +0 -0
  2871. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_sm120.cu +0 -0
  2872. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_sm80.cu +0 -0
  2873. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_bf16_sm90.cu +0 -0
  2874. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_causal_sm100.cu +0 -0
  2875. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_causal_sm110.cu +0 -0
  2876. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_causal_sm120.cu +0 -0
  2877. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_causal_sm80.cu +0 -0
  2878. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_causal_sm90.cu +0 -0
  2879. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_sm100.cu +0 -0
  2880. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_sm110.cu +0 -0
  2881. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_sm120.cu +0 -0
  2882. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_sm80.cu +0 -0
  2883. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim128_fp16_sm90.cu +0 -0
  2884. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_causal_sm100.cu +0 -0
  2885. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_causal_sm110.cu +0 -0
  2886. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_causal_sm120.cu +0 -0
  2887. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_causal_sm80.cu +0 -0
  2888. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_causal_sm90.cu +0 -0
  2889. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_sm100.cu +0 -0
  2890. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_sm110.cu +0 -0
  2891. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_sm120.cu +0 -0
  2892. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_sm80.cu +0 -0
  2893. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_bf16_sm90.cu +0 -0
  2894. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_causal_sm100.cu +0 -0
  2895. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_causal_sm110.cu +0 -0
  2896. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_causal_sm120.cu +0 -0
  2897. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_causal_sm80.cu +0 -0
  2898. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_causal_sm90.cu +0 -0
  2899. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_sm100.cu +0 -0
  2900. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_sm110.cu +0 -0
  2901. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_sm120.cu +0 -0
  2902. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_sm80.cu +0 -0
  2903. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim192_fp16_sm90.cu +0 -0
  2904. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_causal_sm100.cu +0 -0
  2905. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_causal_sm110.cu +0 -0
  2906. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_causal_sm120.cu +0 -0
  2907. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_causal_sm80.cu +0 -0
  2908. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_causal_sm90.cu +0 -0
  2909. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_sm100.cu +0 -0
  2910. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_sm110.cu +0 -0
  2911. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_sm120.cu +0 -0
  2912. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_sm80.cu +0 -0
  2913. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_bf16_sm90.cu +0 -0
  2914. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_causal_sm100.cu +0 -0
  2915. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_causal_sm110.cu +0 -0
  2916. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_causal_sm120.cu +0 -0
  2917. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_causal_sm80.cu +0 -0
  2918. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_causal_sm90.cu +0 -0
  2919. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_sm100.cu +0 -0
  2920. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_sm110.cu +0 -0
  2921. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_sm120.cu +0 -0
  2922. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_sm80.cu +0 -0
  2923. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim256_fp16_sm90.cu +0 -0
  2924. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_causal_sm100.cu +0 -0
  2925. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_causal_sm110.cu +0 -0
  2926. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_causal_sm120.cu +0 -0
  2927. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_causal_sm80.cu +0 -0
  2928. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_causal_sm90.cu +0 -0
  2929. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_sm100.cu +0 -0
  2930. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_sm110.cu +0 -0
  2931. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_sm120.cu +0 -0
  2932. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_sm80.cu +0 -0
  2933. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_bf16_sm90.cu +0 -0
  2934. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_causal_sm100.cu +0 -0
  2935. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_causal_sm110.cu +0 -0
  2936. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_causal_sm120.cu +0 -0
  2937. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_causal_sm80.cu +0 -0
  2938. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_causal_sm90.cu +0 -0
  2939. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_sm100.cu +0 -0
  2940. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_sm110.cu +0 -0
  2941. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_sm120.cu +0 -0
  2942. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_sm80.cu +0 -0
  2943. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim32_fp16_sm90.cu +0 -0
  2944. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_causal_sm100.cu +0 -0
  2945. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_causal_sm110.cu +0 -0
  2946. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_causal_sm120.cu +0 -0
  2947. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_causal_sm80.cu +0 -0
  2948. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_causal_sm90.cu +0 -0
  2949. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_sm100.cu +0 -0
  2950. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_sm110.cu +0 -0
  2951. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_sm120.cu +0 -0
  2952. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_sm80.cu +0 -0
  2953. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_bf16_sm90.cu +0 -0
  2954. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_causal_sm100.cu +0 -0
  2955. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_causal_sm110.cu +0 -0
  2956. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_causal_sm120.cu +0 -0
  2957. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_causal_sm80.cu +0 -0
  2958. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_causal_sm90.cu +0 -0
  2959. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_sm100.cu +0 -0
  2960. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_sm110.cu +0 -0
  2961. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_sm120.cu +0 -0
  2962. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_sm80.cu +0 -0
  2963. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim64_fp16_sm90.cu +0 -0
  2964. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_causal_sm100.cu +0 -0
  2965. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_causal_sm110.cu +0 -0
  2966. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_causal_sm120.cu +0 -0
  2967. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_causal_sm80.cu +0 -0
  2968. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_causal_sm90.cu +0 -0
  2969. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_sm100.cu +0 -0
  2970. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_sm110.cu +0 -0
  2971. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_sm120.cu +0 -0
  2972. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_sm80.cu +0 -0
  2973. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_bf16_sm90.cu +0 -0
  2974. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_causal_sm100.cu +0 -0
  2975. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_causal_sm110.cu +0 -0
  2976. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_causal_sm120.cu +0 -0
  2977. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_causal_sm80.cu +0 -0
  2978. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_causal_sm90.cu +0 -0
  2979. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_sm100.cu +0 -0
  2980. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_sm110.cu +0 -0
  2981. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_sm120.cu +0 -0
  2982. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_sm80.cu +0 -0
  2983. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/flash_fwd_split_hdim96_fp16_sm90.cu +0 -0
  2984. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/hardware_info.h +0 -0
  2985. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/kernel_traits.h +0 -0
  2986. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/mask.h +0 -0
  2987. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/namespace_config.h +0 -0
  2988. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/philox.cuh +0 -0
  2989. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/philox_unpack.cuh +0 -0
  2990. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/rotary.h +0 -0
  2991. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/softmax.h +0 -0
  2992. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/static_switch.h +0 -0
  2993. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/flash_attention/src/utils.h +0 -0
  2994. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/quantized_matmul/src/qmm_cuda.cu +0 -0
  2995. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/quantized_matmul/src/qmm_dequant_kernels.h +0 -0
  2996. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/CMakeLists.txt +0 -0
  2997. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/code_gen.py +0 -0
  2998. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3.h +0 -0
  2999. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_ffi.cu +0 -0
  3000. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_bf16_sm100.cu +0 -0
  3001. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_bf16_sm110.cu +0 -0
  3002. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_bf16_sm120.cu +0 -0
  3003. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_bf16_sm80.cu +0 -0
  3004. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_bf16_sm90.cu +0 -0
  3005. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp16_sm100.cu +0 -0
  3006. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp16_sm110.cu +0 -0
  3007. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp16_sm120.cu +0 -0
  3008. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp16_sm80.cu +0 -0
  3009. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp16_sm90.cu +0 -0
  3010. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp32_sm100.cu +0 -0
  3011. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp32_sm110.cu +0 -0
  3012. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp32_sm120.cu +0 -0
  3013. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp32_sm80.cu +0 -0
  3014. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim128_fp32_sm90.cu +0 -0
  3015. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_bf16_sm100.cu +0 -0
  3016. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_bf16_sm110.cu +0 -0
  3017. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_bf16_sm120.cu +0 -0
  3018. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_bf16_sm80.cu +0 -0
  3019. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_bf16_sm90.cu +0 -0
  3020. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp16_sm100.cu +0 -0
  3021. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp16_sm110.cu +0 -0
  3022. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp16_sm120.cu +0 -0
  3023. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp16_sm80.cu +0 -0
  3024. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp16_sm90.cu +0 -0
  3025. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp32_sm100.cu +0 -0
  3026. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp32_sm110.cu +0 -0
  3027. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp32_sm120.cu +0 -0
  3028. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp32_sm80.cu +0 -0
  3029. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim192_fp32_sm90.cu +0 -0
  3030. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_bf16_sm100.cu +0 -0
  3031. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_bf16_sm110.cu +0 -0
  3032. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_bf16_sm120.cu +0 -0
  3033. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_bf16_sm80.cu +0 -0
  3034. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_bf16_sm90.cu +0 -0
  3035. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp16_sm100.cu +0 -0
  3036. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp16_sm110.cu +0 -0
  3037. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp16_sm120.cu +0 -0
  3038. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp16_sm80.cu +0 -0
  3039. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp16_sm90.cu +0 -0
  3040. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp32_sm100.cu +0 -0
  3041. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp32_sm110.cu +0 -0
  3042. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp32_sm120.cu +0 -0
  3043. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp32_sm80.cu +0 -0
  3044. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim256_fp32_sm90.cu +0 -0
  3045. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_bf16_sm100.cu +0 -0
  3046. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_bf16_sm110.cu +0 -0
  3047. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_bf16_sm120.cu +0 -0
  3048. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_bf16_sm80.cu +0 -0
  3049. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_bf16_sm90.cu +0 -0
  3050. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp16_sm100.cu +0 -0
  3051. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp16_sm110.cu +0 -0
  3052. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp16_sm120.cu +0 -0
  3053. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp16_sm80.cu +0 -0
  3054. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp16_sm90.cu +0 -0
  3055. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp32_sm100.cu +0 -0
  3056. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp32_sm110.cu +0 -0
  3057. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp32_sm120.cu +0 -0
  3058. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp32_sm80.cu +0 -0
  3059. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim32_fp32_sm90.cu +0 -0
  3060. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_bf16_sm100.cu +0 -0
  3061. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_bf16_sm110.cu +0 -0
  3062. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_bf16_sm120.cu +0 -0
  3063. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_bf16_sm80.cu +0 -0
  3064. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_bf16_sm90.cu +0 -0
  3065. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp16_sm100.cu +0 -0
  3066. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp16_sm110.cu +0 -0
  3067. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp16_sm120.cu +0 -0
  3068. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp16_sm80.cu +0 -0
  3069. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp16_sm90.cu +0 -0
  3070. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp32_sm100.cu +0 -0
  3071. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp32_sm110.cu +0 -0
  3072. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp32_sm120.cu +0 -0
  3073. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp32_sm80.cu +0 -0
  3074. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim64_fp32_sm90.cu +0 -0
  3075. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_bf16_sm100.cu +0 -0
  3076. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_bf16_sm110.cu +0 -0
  3077. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_bf16_sm120.cu +0 -0
  3078. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_bf16_sm80.cu +0 -0
  3079. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_bf16_sm90.cu +0 -0
  3080. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp16_sm100.cu +0 -0
  3081. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp16_sm110.cu +0 -0
  3082. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp16_sm120.cu +0 -0
  3083. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp16_sm80.cu +0 -0
  3084. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp16_sm90.cu +0 -0
  3085. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp32_sm100.cu +0 -0
  3086. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp32_sm110.cu +0 -0
  3087. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp32_sm120.cu +0 -0
  3088. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp32_sm80.cu +0 -0
  3089. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_fwd_hdim96_fp32_sm90.cu +0 -0
  3090. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_kernel.h +0 -0
  3091. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/ragged_page_attention_v3/src/rpa_v3_launch_template.h +0 -0
  3092. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/CMakeLists.txt +0 -0
  3093. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/code_gen.py +0 -0
  3094. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua.h +0 -0
  3095. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_cuda.cu +0 -0
  3096. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_bf16_sm100.cu +0 -0
  3097. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_bf16_sm110.cu +0 -0
  3098. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_bf16_sm120.cu +0 -0
  3099. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_bf16_sm80.cu +0 -0
  3100. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_bf16_sm90.cu +0 -0
  3101. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_fp16_sm100.cu +0 -0
  3102. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_fp16_sm110.cu +0 -0
  3103. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_fp16_sm120.cu +0 -0
  3104. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_fp16_sm80.cu +0 -0
  3105. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim128_fp16_sm90.cu +0 -0
  3106. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_bf16_sm100.cu +0 -0
  3107. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_bf16_sm110.cu +0 -0
  3108. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_bf16_sm120.cu +0 -0
  3109. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_bf16_sm80.cu +0 -0
  3110. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_bf16_sm90.cu +0 -0
  3111. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_fp16_sm100.cu +0 -0
  3112. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_fp16_sm110.cu +0 -0
  3113. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_fp16_sm120.cu +0 -0
  3114. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_fp16_sm80.cu +0 -0
  3115. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim192_fp16_sm90.cu +0 -0
  3116. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_bf16_sm100.cu +0 -0
  3117. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_bf16_sm110.cu +0 -0
  3118. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_bf16_sm120.cu +0 -0
  3119. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_bf16_sm80.cu +0 -0
  3120. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_bf16_sm90.cu +0 -0
  3121. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_fp16_sm100.cu +0 -0
  3122. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_fp16_sm110.cu +0 -0
  3123. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_fp16_sm120.cu +0 -0
  3124. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_fp16_sm80.cu +0 -0
  3125. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim256_fp16_sm90.cu +0 -0
  3126. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_bf16_sm100.cu +0 -0
  3127. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_bf16_sm110.cu +0 -0
  3128. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_bf16_sm120.cu +0 -0
  3129. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_bf16_sm80.cu +0 -0
  3130. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_bf16_sm90.cu +0 -0
  3131. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_fp16_sm100.cu +0 -0
  3132. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_fp16_sm110.cu +0 -0
  3133. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_fp16_sm120.cu +0 -0
  3134. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_fp16_sm80.cu +0 -0
  3135. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim32_fp16_sm90.cu +0 -0
  3136. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_bf16_sm100.cu +0 -0
  3137. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_bf16_sm110.cu +0 -0
  3138. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_bf16_sm120.cu +0 -0
  3139. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_bf16_sm80.cu +0 -0
  3140. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_bf16_sm90.cu +0 -0
  3141. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_fp16_sm100.cu +0 -0
  3142. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_fp16_sm110.cu +0 -0
  3143. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_fp16_sm120.cu +0 -0
  3144. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_fp16_sm80.cu +0 -0
  3145. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim64_fp16_sm90.cu +0 -0
  3146. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_bf16_sm100.cu +0 -0
  3147. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_bf16_sm110.cu +0 -0
  3148. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_bf16_sm120.cu +0 -0
  3149. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_bf16_sm80.cu +0 -0
  3150. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_bf16_sm90.cu +0 -0
  3151. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_fp16_sm100.cu +0 -0
  3152. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_fp16_sm110.cu +0 -0
  3153. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_fp16_sm120.cu +0 -0
  3154. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_fp16_sm80.cu +0 -0
  3155. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_fwd_hdim96_fp16_sm90.cu +0 -0
  3156. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_kernel.h +0 -0
  3157. {ejkernel-0.0.79 → ejkernel-0.0.80}/csrc/unified_attention/src/ua_launch_template.h +0 -0
  3158. {ejkernel-0.0.79 → ejkernel-0.0.80}/ejkernel/kernels/_pallas/tpu/gated_delta_rule/__init__.py +0 -0
  3159. {ejkernel-0.0.79 → ejkernel-0.0.80}/ejkernel/kernels/_xla/flash_mla/_interface.py +0 -0
  3160. {ejkernel-0.0.79 → ejkernel-0.0.80}/ejkernel/kernels/_xla/ragged_page_attention_v2_turboquant/__init__.py +0 -0
  3161. {ejkernel-0.0.79 → ejkernel-0.0.80}/ejkernel/kernels/_xla/ragged_page_attention_v3_turboquant/__init__.py +0 -0
  3162. {ejkernel-0.0.79 → ejkernel-0.0.80}/ejkernel/quantization/turboquant/__init__.py +0 -0
@@ -0,0 +1,687 @@
1
+ Metadata-Version: 2.3
2
+ Name: ejkernel
3
+ Version: 0.0.80
4
+ Summary: Accelerate, Optimize performance with streamlined training and serving options with JAX.
5
+ Keywords: Deep Learning,Machine Learning,JAX,CUDA,XLA,Triton,Pallas
6
+ Author: Erfan Zare Chavoshi
7
+ Author-email: Erfan Zare Chavoshi <Erfanzare810@gmail.com>
8
+ License: Apache-2.0
9
+ Classifier: Development Status :: 3 - Alpha
10
+ Classifier: Intended Audience :: Developers
11
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
12
+ Classifier: License :: OSI Approved :: Apache Software License
13
+ Classifier: Programming Language :: Python :: 3.11
14
+ Classifier: Programming Language :: Python :: 3.12
15
+ Classifier: Programming Language :: Python :: 3.13
16
+ Requires-Dist: beartype>=0.22.2
17
+ Requires-Dist: chex>=0.1.91
18
+ Requires-Dist: einops>=0.8.1
19
+ Requires-Dist: jax~=0.10.0
20
+ Requires-Dist: jaxlib~=0.10.0
21
+ Requires-Dist: jaxtyping>=0.3.2
22
+ Requires-Dist: pydantic>=2.11.10
23
+ Requires-Dist: tqdm>=4.67.1
24
+ Requires-Dist: jax[cuda13]~=0.10.0 ; extra == 'cuda'
25
+ Requires-Dist: jax-cuda13-plugin[with-cuda]~=0.10.0 ; extra == 'cuda'
26
+ Requires-Dist: jax-cuda13-pjrt~=0.10.0 ; extra == 'cuda'
27
+ Requires-Dist: triton==3.6.0 ; extra == 'cuda'
28
+ Requires-Dist: nvidia-cutlass-dsl[cu13]==4.4.0.dev1 ; extra == 'cuda'
29
+ Requires-Dist: nvidia-cutlass-dsl-libs-base==4.4.0.dev1 ; extra == 'cuda'
30
+ Requires-Dist: nvidia-cutlass-dsl-libs-cu13==4.4.0.dev1 ; extra == 'cuda'
31
+ Requires-Dist: jax-tvm-ffi==0.1.2 ; extra == 'cuda'
32
+ Requires-Dist: apache-tvm-ffi==0.1.8.post2 ; extra == 'cuda'
33
+ Requires-Dist: eformer ; extra == 'dev'
34
+ Requires-Dist: xprof>=2.20.6 ; extra == 'profile'
35
+ Requires-Dist: tb-nightly>=2.21.0a20250820 ; extra == 'profile'
36
+ Requires-Dist: xprof-nightly>=2.21.6a20250820 ; extra == 'profile'
37
+ Requires-Dist: tilelang==0.1.9 ; extra == 'tilelang'
38
+ Requires-Dist: apache-tvm-ffi==0.1.8.post2 ; extra == 'tilelang'
39
+ Requires-Dist: nvidia-cudnn-cu12==9.12.0.46 ; extra == 'tilelang'
40
+ Requires-Dist: jax[tpu]~=0.10.0 ; extra == 'tpu'
41
+ Requires-Dist: triton==3.6.0 ; extra == 'triton'
42
+ Requires-Python: >=3.11, <3.14
43
+ Project-URL: Documentation, https://ejkernel.readthedocs.io/en/latest/
44
+ Project-URL: Homepage, https://github.com/erfanzar/ejkernel
45
+ Project-URL: Repository, https://github.com/erfanzar/ejkernel
46
+ Provides-Extra: cuda
47
+ Provides-Extra: dev
48
+ Provides-Extra: profile
49
+ Provides-Extra: tilelang
50
+ Provides-Extra: tpu
51
+ Provides-Extra: triton
52
+ Description-Content-Type: text/markdown
53
+
54
+ # ejKernel: High-Performance JAX Kernels for Deep Learning
55
+
56
+ > _"The best optimization is the one you don't have to think about."_
57
+
58
+ [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
59
+ [![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
60
+ [![JAX](https://img.shields.io/badge/JAX-0.9.0+-orange.svg)](https://github.com/google/jax)
61
+ [![Documentation](https://img.shields.io/badge/docs-readthedocs-green.svg)](https://ejkernel.readthedocs.io/en/latest/)
62
+
63
+ ejKernel is a production-grade kernel library for JAX that provides highly optimized implementations of deep learning operations with automatic multi-backend support. The library features a sophisticated configuration management system with autotuning, comprehensive type safety, and seamless execution across GPUs, TPUs, and CPUs.
64
+
65
+ > [!NOTE]
66
+ > ejKernel contains **no AI-generated code**. All kernels, modules, and core logic are manually designed and implemented by human developers.
67
+ > AI tooling (Opus 4.5) is used **exclusively for documentation**, which may therefore contain minor inaccuracies. There is no “vibe coding” or automated code generation anywhere in the codebase.
68
+
69
+ ## Table of Contents
70
+
71
+ - [Key Features](#key-features)
72
+ - [Installation](#installation)
73
+ - [Quick Start](#quick-start)
74
+ - [Architecture Overview](#architecture-overview)
75
+ - [Supported Operations](#supported-operations)
76
+ - [Advanced Usage](#advanced-usage)
77
+ - [Development](#development)
78
+ - [Testing](#testing)
79
+ - [Contributing](#contributing)
80
+ - [Citation](#citation)
81
+ - [License](#license)
82
+
83
+ ## Key Features
84
+
85
+ ### Intelligent Kernel Management
86
+
87
+ - **7-Tier Configuration System**: Override → Overlay → Memory Cache → Persistent Cache → Autotune → Heuristics → Error
88
+ - **Automatic Platform Detection**: Seamlessly selects optimal implementation based on hardware
89
+ - **Priority-Based Registry**: Multi-backend support with intelligent fallback mechanisms
90
+ - **Device Fingerprinting**: Hardware-specific configuration caching for optimal performance
91
+
92
+ ### State-of-the-Art Operations
93
+
94
+ - **30+ Deep Learning Operations**: Flash Attention v2, Flash MLA, Ring Attention, Page Attention, Block Sparse, GLA, Lightning, Gated Delta Rule, Quantized MatMul, State Space Models (Mamba), RWKV (v4/v6/v7), and more
95
+ - **Memory Efficiency**: Custom VJP implementations with O(N) memory complexity for attention
96
+ - **Distributed Support**: Full shard_map integration for model and data parallelism
97
+ - **Mixed Precision**: Comprehensive dtype support with automatic gradient conversion
98
+
99
+ ### Production-Ready Infrastructure
100
+
101
+ - **Type Safety**: Full jaxtyping annotations with runtime validation via beartype
102
+ - **Comprehensive Testing**: Cross-backend validation, performance benchmarks, integration tests
103
+ - **Atomic Persistence**: Thread-safe configuration storage with automatic optimization
104
+ - **Profiling Integration**: Built-in support for JAX profiling and performance monitoring
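The "Atomic Persistence" bullet above can be illustrated with a minimal sketch of the usual write-then-rename pattern. The function names `save_config_atomically` and `load_config`, the JSON format, and the in-process lock are all assumptions made for illustration; the README does not describe how ejKernel actually stores its configuration cache.

```python
import json
import os
import tempfile
import threading

_LOCK = threading.Lock()  # serializes writers inside one process (hypothetical)


def save_config_atomically(path: str, config: dict) -> None:
    """Write `config` as JSON so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    with _LOCK:
        # Write to a temporary file in the same directory, then rename it
        # over the target; os.replace swaps the file in a single step.
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as handle:
                json.dump(config, handle)
                handle.flush()
                os.fsync(handle.fileno())
            os.replace(tmp_path, path)
        except BaseException:
            # Clean up the temporary file if anything failed before the rename.
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
            raise


def load_config(path: str) -> dict:
    """Read a configuration previously written by save_config_atomically."""
    with open(path) as handle:
        return json.load(handle)
```

Keeping the temporary file in the target directory matters because the rename is only atomic within one filesystem.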
105
+
106
+ ## Installation
107
+
108
+ ### Basic Installation
109
+
110
+ ```bash
111
+ pip install ejkernel
112
+ ```
113
+
114
+ ### Platform-Specific Installation
115
+
116
+ ```bash
117
+ # GPU Support (CUDA)
118
+ pip install ejkernel[cuda]
119
+
120
+ # TPU Support
121
+ pip install ejkernel[tpu]
122
+
123
+ # Development Installation
124
+ git clone https://github.com/erfanzar/ejkernel.git
125
+ cd ejkernel
126
+ pip install -e ".[dev]"
127
+ ```
128
+
129
+ ### Dependencies
130
+
131
+ - Python 3.11-3.13
132
+ - JAX ~= 0.10.0
133
+ - Triton == 3.6.0 (for GPU)
134
+ - nvidia-cutlass-dsl == 4.4.0.dev1 (optional, for CuTe DSL kernels)
135
+ - jax-tvm-ffi == 0.1.2 (optional, for CuTe TVM-FFI primitive path)
136
+ - jaxtyping >= 0.3.2
137
+ - beartype >= 0.22.2
138
+ - pydantic >= 2.11.10
139
+
140
+ ## Quick Start
141
+
142
+ ### Simple API with Automatic Optimization
143
+
144
+ ```python
145
+ import jax.numpy as jnp
146
+ from ejkernel.modules import flash_attention
147
+
148
+ # Basic usage - automatic configuration selection
149
+ output = flash_attention(
150
+ query, key, value,
151
+ causal=True,
152
+ dropout_prob=0.1
153
+ )
154
+
155
+ # With advanced features
156
+ output = flash_attention(
157
+ query, key, value,
158
+ causal=True,
159
+ sliding_window=128, # Local attention window
160
+ logits_soft_cap=30.0, # Gemma-2 style soft capping
161
+ attention_mask=mask, # Custom attention pattern
162
+ )
163
+ ```
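To make the `sliding_window` and `logits_soft_cap` arguments above concrete, here is a small pure-Python sketch of what such options typically compute. The tanh form is the standard soft-capping formula; the exact window convention (whether the current position counts toward the window) is an assumption, since the README does not specify it, and both helper names are hypothetical.

```python
import math


def soft_cap(logit: float, cap: float) -> float:
    """Tanh soft-capping: smoothly bounds a logit to the range [-cap, cap]."""
    return cap * math.tanh(logit / cap)


def causal_sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query position i may attend to key position j.

    Assumed convention: causal (j <= i) and at most `window` keys back,
    counting the current position (i - j < window).
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Small logits pass through almost unchanged; large ones saturate near the cap.
small = soft_cap(1.0, 30.0)
large = soft_cap(1000.0, 30.0)
mask = causal_sliding_window_mask(seq_len=4, window=2)
```

With `window=2`, the last query position sees only itself and its immediate predecessor.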
164
+
165
+ ### Custom Configuration
166
+
167
+ ```python
168
+ from ejkernel.modules import FlashAttentionConfig
169
+ from ejkernel.ops.utils.datacarrier import FwdParams, BwdParams
170
+
171
+ # Create optimized configuration
172
+ config = FlashAttentionConfig(
173
+ fwd_params=FwdParams(
174
+ q_blocksize=256,
175
+ kv_blocksize=256,
176
+ num_warps=8,
177
+ num_stages=2
178
+ ),
179
+ bwd_params=BwdParams(
180
+ q_blocksize=128,
181
+ kv_blocksize=128,
182
+ num_warps=4
183
+ ),
184
+ platform="triton", # Force specific backend
185
+ backend="gpu"
186
+ )
187
+
188
+ output = flash_attention(query, key, value, cfg=config)
189
+ ```
190
+
191
+ ### Direct Kernel Registry Access
192
+
193
+ ```python
194
+ from ejkernel import kernel_registry, Platform, Backend
195
+
196
+ # Get specific implementation
197
+ kernel = kernel_registry.get(
198
+ algorithm="flash_attention",
199
+ platform=Platform.TRITON,
200
+ backend=Backend.GPU
201
+ )
202
+
203
+ # Direct execution
204
+ output = kernel(query, key, value, causal=True)
205
+ ```
206
+
207
+ ### Distributed Execution
208
+
209
+ ```python
210
+ import jax
211
+ from jax.sharding import Mesh, PartitionSpec as P
212
+ from ejkernel.modules import flash_attention
213
+
214
+ # Set up a 2D (data, model) mesh: Mesh needs one device-array axis per axis name
215
+ devices = [[d] for d in jax.devices()]  # device grid of shape (num_devices, 1)
216
+ mesh = Mesh(devices, axis_names=("data", "model"))
217
+
218
+ # Run distributed attention
219
+ output = flash_attention(
220
+ query, key, value,
221
+ causal=True,
222
+ mesh=mesh,
223
+ in_specs=(P("data", None), P("data", None), P("data", None)),
224
+ out_specs=P("data", None)
225
+ )
226
+ ```
227
+
228
+ ## Architecture Overview
229
+
230
+ ### System Design
231
+
232
+ ejKernel employs a sophisticated layered architecture that separates concerns while maintaining high performance:
233
+
234
+ ```md
235
+ ┌─────────────────────────────────────────────────────┐
236
+ │ Public API (modules/) │
237
+ │ Simple functions with sensible defaults │
238
+ ├─────────────────────────────────────────────────────┤
239
+ │ Operations Layer (ops/) │
240
+ │ Configuration management, autotuning, caching │
241
+ ├─────────────────────────────────────────────────────┤
242
+ │ Kernel Registry (kernels/) │
243
+ │ Platform routing, signature validation │
244
+ ├─────────────────────────────────────────────────────┤
245
+ │ Backend Implementations (kernels/_*) │
246
+ │ Triton, CuTe, Pallas, XLA, CUDA kernels │
247
+ └─────────────────────────────────────────────────────┘
248
+ ```
249
+
250
+ ### Core Components
251
+
252
+ #### Kernel Registry
253
+
254
+ The registry provides automatic platform-specific kernel selection:
255
+
256
+ ```python
257
+ @kernel_registry.register("my_operation", Platform.TRITON, Backend.GPU, priority=100)
258
+ def my_operation_gpu(x, y):
259
+ # GPU-optimized implementation
260
+ pass
261
+
262
+ @kernel_registry.register("my_operation", Platform.XLA, Backend.ANY, priority=50)
263
+ def my_operation_fallback(x, y):
264
+ # Universal fallback
265
+ pass
266
+
267
+ # Automatic selection based on available hardware
268
+ impl = kernel_registry.get("my_operation")
269
+ ```
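The priority-based selection shown above can be sketched with a toy registry. `TinyKernelRegistry` is a hypothetical stand-in written for illustration, not ejKernel's implementation; it uses plain strings where the README uses `Platform` and `Backend` enums, and it assumes the caller passes in the backend that is currently available.

```python
from typing import Callable, Optional


class TinyKernelRegistry:
    """Hypothetical stand-in for a priority-based kernel registry."""

    def __init__(self) -> None:
        # algorithm name -> list of (priority, platform, backend, function)
        self._entries: dict[str, list[tuple[int, str, str, Callable]]] = {}

    def register(self, algorithm: str, platform: str, backend: str, priority: int = 0):
        def decorator(fn: Callable) -> Callable:
            self._entries.setdefault(algorithm, []).append((priority, platform, backend, fn))
            return fn

        return decorator

    def get(self, algorithm: str, available_backend: str, platform: Optional[str] = None) -> Callable:
        # Keep entries that run on the available backend (or on any backend)
        # and, if requested, match an explicitly chosen platform.
        candidates = [
            entry
            for entry in self._entries.get(algorithm, [])
            if entry[2] in (available_backend, "any")
            and (platform is None or entry[1] == platform)
        ]
        if not candidates:
            raise LookupError(f"no kernel registered for {algorithm!r} on {available_backend!r}")
        # The highest priority wins; lower-priority entries act as fallbacks.
        return max(candidates, key=lambda entry: entry[0])[3]


registry = TinyKernelRegistry()


@registry.register("my_operation", "triton", "gpu", priority=100)
def my_operation_gpu(x):
    return ("triton", x)


@registry.register("my_operation", "xla", "any", priority=50)
def my_operation_fallback(x):
    return ("xla", x)
```

On a GPU host the Triton entry is chosen because of its higher priority, while a CPU host falls through to the XLA entry.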
270
+
271
+ #### Configuration Management
272
+
273
+ Multi-tier configuration system with intelligent fallback:
274
+
275
+ ```python
276
+ class ConfigSelectorChain:
277
+ """
278
+ Selection hierarchy:
279
+ 1. Override - Explicit user configuration
280
+ 2. Overlay - Temporary context overrides
281
+ 3. Memory Cache - In-memory lookup
282
+ 4. Persistent Cache - Disk-based storage
283
+ 5. Autotune - Performance benchmarking
284
+ 6. Heuristics - Intelligent defaults
285
+ 7. Error - Clear failure message
286
+ """
287
+ ```
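The selection hierarchy in the docstring above amounts to trying each tier in order and stopping at the first one that yields a configuration. The sketch below shows that control flow with only three of the seven tiers; `select_config`, the string cache keys, and the dictionary-shaped configs are assumptions for illustration, not ejKernel's `ConfigSelectorChain` API.

```python
from typing import Callable, Optional

Selector = Callable[[str], Optional[dict]]


def select_config(key: str, tiers: list[tuple[str, Selector]]) -> tuple[str, dict]:
    """Return (tier name, config) from the first tier that knows `key`."""
    for name, selector in tiers:
        config = selector(key)
        if config is not None:
            return name, config
    # Final tier of the hierarchy: fail with a clear message.
    raise RuntimeError(f"no configuration found for {key!r}")


overrides: dict[str, dict] = {}  # explicit user configuration (empty here)
memory_cache = {"flash_attention:gpu": {"q_blocksize": 256}}


def heuristics(key: str) -> Optional[dict]:
    """Fallback defaults used when no earlier tier has an answer."""
    return {"q_blocksize": 128}


tiers = [
    ("override", overrides.get),
    ("memory_cache", memory_cache.get),
    ("heuristics", heuristics),
]

cached = select_config("flash_attention:gpu", tiers)
defaulted = select_config("page_attention:tpu", tiers)
```

Ordering the tiers from most explicit to most generic means a user override always beats a cached or heuristic value.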
288
+
289
+ #### Custom VJP System
290
+
291
+ All performance-critical kernels implement memory-efficient gradients:
292
+
293
+ ```python
294
+ @jax.custom_vjp
295
+ def kernel_with_custom_grad(inputs):
296
+ return forward(inputs)
297
+
298
+ def kernel_fwd(inputs):
299
+ output, residuals = forward_with_residuals(inputs)
300
+ return output, residuals
301
+
302
+ def kernel_bwd(residuals, grad_output):
303
+ return efficient_backward(residuals, grad_output)
304
+
305
+ kernel_with_custom_grad.defvjp(kernel_fwd, kernel_bwd)
306
+ ```
307
+
308
+ ## Supported Operations
309
+
310
+ ### Attention Mechanisms
311
+
312
+ | Algorithm | Description | Memory | Key Features |
313
+ | -------------------------------- | -------------------------------- | ------ | ----------------------------------------------------------------------- |
314
+ | **Flash Attention v2** | Memory-efficient exact attention | O(N) | Causal masking, dropout, sliding windows, soft capping |
315
+ | **Ring Attention** | Distributed sequence parallelism | O(N/P) | Ultra-long sequences, communication overlap, XLA single-device fallback |
316
+ | **Page Attention** | KV-cache optimized inference | O(N) | Block-wise memory, continuous batching |
317
+ | **Block Sparse Attention** | Configurable sparse patterns | O(N√N) | Local+global, custom patterns |
318
+ | **GLA** | Gated Linear Attention | O(N) | Linear complexity, gated updates |
319
+ | **Lightning Attention** | Layer-dependent decay | O(N) | Exponential moving average |
320
+ | **MLA** | Multi-head Latent Attention | O(N) | Compressed KV representation |
321
+ | **Ragged Page Attention v2** | Variable-length paged attention | O(N) | Ragged sequences with page caching |
322
+ | **Ragged Page Attention v3** | Enhanced ragged page attention | O(N) | Attention sinks support, improved handling |
323
+ | **Ragged Decode Attention** | Variable-length decoding | O(N) | Efficient batched inference |
324
+ | **Gated Delta Rule (GDR)** | Gated delta-rule recurrence | O(N) | Chunked + recurrent + single-step, custom VJP, Qwen3Next |
325
+ | **Ragged GDR** | Packed continuous-batching GDR | O(N) | Variable-length sequences, Pallas TPU decode (3.6x speedup) |
326
+ | **Kernel Delta Attention** | Delta-rule linear attention | O(N) | Linear complexity, delta updates, decay control |
327
+ | **Unified Attention** | vLLM-style paged attention | O(N) | Segmented 3D decode kernel |
328
+ | **Prefill Page Attention** | Page attention prefill phase | O(N) | Separate prefill handling |
329
+ | **Decode Attention** | Single-token decode attention | O(N) | Optimized single-step decoding |
330
+ | **Chunked Prefill Paged Decode** | Combined prefill + decode | O(N) | Chunked prefill with paged KV cache decode |
331
+ | **Flash MLA** | Multi-head Latent Attention | O(N) | Low-rank KV compression, memory-efficient inference |
332
+ | **Scaled Dot-Product Attention** | Standard attention | O(N²) | Basic reference implementation |
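The O(N) versus O(N²) memory column above rests on the online-softmax idea: attention weights can be accumulated block by block instead of being materialized all at once. The following pure-Python sketch demonstrates this for a single query row with scalar values; it illustrates the general technique under those simplifications and is not the ejKernel kernel.

```python
import math


def naive_attention_row(scores: list[float], values: list[float]) -> float:
    """Reference: materialize every softmax weight, then take the weighted sum."""
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def streaming_attention_row(scores: list[float], values: list[float], block_size: int = 2) -> float:
    """Same result, but only running statistics are kept between blocks."""
    running_max = -math.inf
    denom = 0.0  # running softmax denominator
    acc = 0.0    # running weighted sum of values
    for start in range(0, len(scores), block_size):
        block_scores = scores[start:start + block_size]
        block_values = values[start:start + block_size]
        new_max = max(running_max, max(block_scores))
        # Rescale what was accumulated under the old maximum.
        # On the first block this factor is exp(-inf) == 0.0.
        scale = math.exp(running_max - new_max)
        denom *= scale
        acc *= scale
        for s, v in zip(block_scores, block_values):
            w = math.exp(s - new_max)
            denom += w
            acc += w * v
        running_max = new_max
    return acc / denom


scores = [0.5, 2.0, -1.0, 3.5, 0.0]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
reference = naive_attention_row(scores, values)
streamed = streaming_attention_row(scores, values, block_size=2)
```

Both functions agree up to floating-point rounding, which is why blockwise kernels can compute exact attention without storing the full score matrix.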
333
+
334
+ ### Recurrent Linear Attention (RWKV)
335
+
336
+ | Operation | Description | Key Features |
337
+ | -------------- | ------------------------------------- | -------------------------------------------------- |
338
+ | **RWKV-4** | Time-mix recurrence | Numerically stable (α,β,ε) state, O(N) memory |
339
+ | **RWKV-6** | Multi-head linear attention | Variable-length packing, reverse mode, O(N) memory |
340
+ | **RWKV-7** | DPLR (Diagonal + Low-Rank) recurrence | (a,b) parameterization, state-space inspired |
341
+ | **RWKV-7 Mul** | Multiplicative RWKV-7 variant | (kk,a) reparameterization for optimized kernels |
342
+
343
+ ### Other Operations
344
+
345
+ | Operation | Description | Use Case |
346
+ | --------------------- | ----------------------------------------------------------- | ------------------------- |
347
+ | **Grouped MatMul** | Efficient batched matrix operations | Expert models, MoE |
348
+ | **Grouped MatMul v2** | Enhanced with shard_map support | Distributed expert models |
349
+ | **Mean Pooling** | Variable-length sequence aggregation | Sentence embeddings |
350
+ | **Recurrent** | Optimized RNN/LSTM/GRU operations | Sequential modeling |
351
+ | **Native Sparse** | Block-sparse matrix computations | Sparse attention patterns |
352
+ | **Quantized MatMul** | Multi-mode quantized matmul (affine, NF4, MXFP4/8, NVFP4/8) | Low-bit inference |
353
+
354
+ ### State Space Models
355
+
356
+ | Operation | Description | Key Features |
357
+ | ------------------ | ---------------- | -------------------------------------------------------------------------- |
358
+ | **State Space v1** | Mamba1-style SSM | 2D A matrix, separate dt_proj, custom VJP for memory efficiency |
359
+ | **State Space v2** | Mamba2-style SSM | Per-head scalar A, n_groups for parameter grouping, optional gated RMSNorm |
360
+
361
+ ### Platform Support Matrix
362
+
363
+ | Operation | Triton (GPU) | CUTE (GPU) | CUDA (GPU) | Pallas (TPU) | XLA (Universal) |
364
+ | ---------------------------- | ------------ | ---------- | ---------- | ------------ | --------------- |
365
+ | Flash Attention v2 | ✅ | ✅ | ✅ | ✅ | ✅ |
366
+ | Flash MLA | ✅ | - | - | - | ✅ |
367
+ | Ring Attention | ✅ | - | - | ✅ | ✅ |
368
+ | Page Attention | ✅ | - | - | ✅ | ✅ |
369
+ | Block Sparse Attention | ✅ | - | ✅ | ✅ | ✅ |
370
+ | Decode Attention | ✅ | - | - | - | ✅ |
371
+ | Chunked Prefill Paged Decode | ✅ | ✅ | - | - | ✅ |
372
+ | Ragged Page Attention v2 | ✅ | - | - | ✅ | ✅ |
373
+ | Ragged Page Attention v3 | ✅ | - | ✅ | ✅ | ✅ |
374
+ | Ragged Decode Attention | ✅ | - | - | ✅ | ✅ |
375
+ | GLA | ✅ | - | - | - | ✅ |
376
+ | Lightning Attention | ✅ | - | - | - | ✅ |
377
+ | Recurrent | ✅ | - | - | - | ✅ |
378
+ | Mean Pooling | ✅ | - | - | - | ✅ |
379
+ | Grouped MatMul | - | - | - | ✅ | ✅ |
380
+ | Grouped MatMul v2 | - | - | - | ✅ | - |
381
+ | Native Sparse Attention | ✅ | - | - | - | ✅ |
382
+ | Quantized MatMul | ✅ | ✅ | ✅ | ✅ | ✅ |
383
+ | Gated Delta Rule | - | - | - | ✅ | ✅ |
384
+ | Ragged Gated Delta Rule | - | - | - | ✅ | ✅ |
385
+ | Kernel Delta Attention | - | - | - | - | ✅ |
386
+ | Unified Attention | ✅ | ✅ | ✅ | - | ✅ |
387
+ | Prefill Page Attention | - | - | - | ✅ | ✅ |
388
+ | Scaled Dot-Product Attention | - | - | - | - | ✅ |
389
+ | State Space v1 | - | - | - | - | ✅ |
390
+ | State Space v2 | - | - | - | - | ✅ |
391
+ | RWKV-4 | ✅ | - | - | - | ✅ |
392
+ | RWKV-6 | ✅ | - | - | - | ✅ |
393
+ | RWKV-7 | ✅ | - | - | - | ✅ |
394
+ | RWKV-7 Mul | ✅ | - | - | - | ✅ |
395
+
396
+ ✅ = Production ready | - = Not available
397
+
398
+ \* CuTe backend uses TVM-FFI primitive path with fused kernels. \* Quantized MatMul on TPU uses hybrid dispatch (packed Pallas / predecode / XLA fallback). \* Distributed matmul ops (`all_gather_matmul`, `reduce_scatter_matmul`) intentionally do not perform runtime fallback between distributed backends; choose `platform`/`cfg.platform` explicitly.
399
+
400
+ ## Advanced Usage
401
+
402
+ ### Page Attention for KV-Cache Inference
403
+
404
+ ```python
405
+ from ejkernel.modules import page_attention, PageAttentionConfig
406
+
407
+ # Configure paged attention for inference
408
+ config = PageAttentionConfig(
409
+ platform="auto",
410
+ backend="gpu"
411
+ )
412
+
413
+ output = page_attention(
414
+ query=q,
415
+ key_cache=k_cache,
416
+ value_cache=v_cache,
417
+ block_table=block_table,
418
+ cache_seqlens=cache_seqlens,
419
+ cfg=config
420
+ )
421
+ ```
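The `block_table` argument above suggests the usual paged KV-cache indirection, where a per-sequence table maps logical pages to physical pages in a shared cache. The sketch below shows that lookup with plain lists; the layout (one table row per sequence, fixed `page_size`) and the helper names are assumptions, since the README does not document the tensor shapes `page_attention` expects.

```python
def locate_token(block_table_row: list[int], token_index: int, page_size: int) -> tuple[int, int]:
    """Map a logical token position to (physical page id, offset within page)."""
    logical_page, offset = divmod(token_index, page_size)
    return block_table_row[logical_page], offset


def gather_keys(key_cache: list[list[str]], block_table_row: list[int], seq_len: int, page_size: int) -> list[str]:
    """Collect the first `seq_len` keys of one sequence from a paged cache.

    key_cache[page][offset] holds one key entry (strings here, for readability).
    """
    keys = []
    for token_index in range(seq_len):
        page, offset = locate_token(block_table_row, token_index, page_size)
        keys.append(key_cache[page][offset])
    return keys


# Three physical pages of two slots each; this sequence uses pages 2 then 0.
key_cache = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]
block_table_row = [2, 0]
keys = gather_keys(key_cache, block_table_row, seq_len=3, page_size=2)
```

Because sequences only reference pages through the table, their cache pages do not need to be contiguous in memory.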
422
+
423
+ ### Ragged Page Attention for Variable-Length Batches
424
+
425
+ ```python
426
+ from ejkernel.modules import ragged_page_attention_v3, RaggedPageAttentionv3Config
427
+
428
+ # For variable-length sequences with attention sinks
429
+ config = RaggedPageAttentionv3Config(
430
+ platform="pallas",
431
+ backend="tpu"
432
+ )
433
+
434
+ output = ragged_page_attention_v3(
435
+ query=q,
436
+ key_pages=k_pages,
437
+ value_pages=v_pages,
438
+ lengths=seq_lengths,
439
+ page_indices=page_indices,
440
+ cfg=config
441
+ )
442
+ ```
443
+
444
+ ### Performance Optimization
445
+
446
+ ```python
447
+ # Force autotuning for optimal configuration
448
+ import os
449
+ os.environ["EJKERNEL_AUTOTUNE_POLICY"] = "autotune"
450
+ os.environ["EJKERNEL_LOG_AUTOTUNE"] = "1"
451
+
452
+ # Enable profiling
453
+ os.environ["EJKERNEL_OPS_STAMP"] = "json" # Detailed metadata
454
+ os.environ["EJKERNEL_OPS_RECORD"] = "1" # Record invocations
455
+ ```
456
+
457
+ ### Custom Kernel Development
458
+
459
+ ```python
460
+ from ejkernel.ops.core import Kernel
461
+ from ejkernel.modules.operations.configs import BaseOperationConfig
462
+ from dataclasses import dataclass
463
+
464
+ @dataclass
465
+ class MyConfig(BaseOperationConfig):
466
+ param1: int = 128
467
+ param2: float = 0.1
468
+
469
+ class MyKernel(Kernel[MyConfig, Array]):
470
+ def __init__(self):
471
+ super().__init__(op_id="my_kernel")
472
+
473
+ def run(self, x, cfg: MyConfig):
474
+ impl = kernel_registry.get("my_kernel", cfg.platform)
475
+ return impl(x, param1=cfg.param1, param2=cfg.param2)
476
+
477
+ def heuristic_cfg(self, inv):
478
+ # Return default configuration
479
+ return MyConfig(param1=256)
480
+
481
+ def candidate_cfgs(self, inv):
482
+ # Return autotuning candidates
483
+ return [MyConfig(param1=p) for p in [64, 128, 256]]
484
+ ```
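The `candidate_cfgs` hook above returns configurations for autotuning; a minimal sketch of how such candidates could be compared is to time each one and keep the fastest. `pick_fastest` and its injectable `timer` are hypothetical and only illustrate the idea; ejKernel's actual autotuner is not described in this README.

```python
import time
from typing import Callable, Iterable, TypeVar

C = TypeVar("C")


def pick_fastest(
    candidates: Iterable[C],
    run: Callable[[C], object],
    repeats: int = 3,
    timer: Callable[[], float] = time.perf_counter,
) -> C:
    """Return the candidate with the lowest average time per call."""
    best_cfg, best_time = None, float("inf")
    for cfg in candidates:
        run(cfg)  # warm-up call, excluded from timing (e.g. compilation)
        start = timer()
        for _ in range(repeats):
            run(cfg)
        elapsed = (timer() - start) / repeats
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    if best_cfg is None:
        raise ValueError("no candidates supplied")
    return best_cfg


# Deterministic demonstration with a fake clock instead of real timing.
clock = {"now": 0.0}
cost_per_call = {64: 3.0, 128: 1.0, 256: 2.0}


def fake_run(param1: int) -> None:
    clock["now"] += cost_per_call[param1]


best = pick_fastest([64, 128, 256], fake_run, timer=lambda: clock["now"])
```

When timing real JAX computations, `run` should block on the result (for example with `block_until_ready()`), because JAX dispatches work asynchronously.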
485
+
486
+ ### Integration with Flax Models
487
+
488
+ ```python
489
+ import flax.linen as nn
490
+ from ejkernel.modules import flash_attention
491
+
492
+ class TransformerBlock(nn.Module):
493
+ num_heads: int = 8
494
+ head_dim: int = 64
495
+
496
+ @nn.compact
497
+ def __call__(self, x, mask=None):
498
+ # Project to Q, K, V
499
+ q = nn.Dense(self.num_heads * self.head_dim)(x)
500
+ k = nn.Dense(self.num_heads * self.head_dim)(x)
501
+ v = nn.Dense(self.num_heads * self.head_dim)(x)
502
+
503
+ # Reshape for attention
504
+ shape = (x.shape[0], x.shape[1], self.num_heads, self.head_dim)
505
+ q, k, v = map(lambda t: t.reshape(shape), (q, k, v))
506
+
507
+ # Apply ejKernel Flash Attention
508
+ attn_output = flash_attention(
509
+ q, k, v,
510
+ causal=True,
511
+ attention_mask=mask
512
+ )
513
+
514
+ # Project output
515
+ return nn.Dense(x.shape[-1])(attn_output.reshape(x.shape))
516
+ ```
517
+
518
+ ## Development
519
+
520
+ ### Setting Up Development Environment
521
+
522
+ ```bash
523
+ # Clone repository
524
+ git clone https://github.com/erfanzar/ejkernel.git
525
+ cd ejkernel
526
+
527
+ # Create virtual environment
528
+ python -m venv .venv
529
+ source .venv/bin/activate # On Windows: .venv\Scripts\activate
530
+
531
+ # Install in development mode
532
+ pip install -e ".[dev]"
533
+
534
+ # Install pre-commit hooks
535
+ pre-commit install
536
+ ```
537
+
538
+ ### Code Style
539
+
540
+ The project uses:
541
+
542
+ - **black** for code formatting (line length: 121)
543
+ - **ruff** for linting
544
+ - **mypy/pyright** for type checking
545
+ - **pre-commit** for automated checks
546
+
547
+ ### Adding New Kernels
548
+
549
+ 1. **Implement the kernel** in appropriate backend directory:
550
+
551
+ ```python
552
+ # ejkernel/kernels/_triton/my_kernel/_interface.py
553
+ @kernel_registry.register("my_kernel", Platform.TRITON, Backend.GPU)
554
+ def my_kernel_triton(x, config):
555
+ # Implementation
556
+ pass
557
+ ```
558
+
559
+ 1. **Create module wrapper**:
560
+
561
+ ```python
562
+ # ejkernel/modules/operations/my_kernel.py
563
+ class MyKernel(Kernel[MyKernelConfig, Array]):
564
+ # Module implementation
565
+ pass
566
+ ```
567
+
568
+ 1. **Add tests**:
569
+
570
+ ```python
571
+ # test/kernels/_triton/test_my_kernel.py
572
+ class TestMyKernel(unittest.TestCase):
573
+ # Test implementation
574
+ pass
575
+ ```
576
+
577
+ 1. **Update documentation**
578
+
579
+ ## Testing
580
+
581
+ ### Running Tests
582
+
583
+ ```bash
584
+ # Run all tests
585
+ pytest test/
586
+
587
+ # Platform-specific tests
588
+ pytest test/kernels/_xla/ # XLA implementations
589
+ pytest test/kernels/_triton/ # Triton implementations
590
+ pytest test/kernels/_pallas/ # Pallas implementations
591
+
592
+ # Specific test patterns
593
+ pytest -k "flash_attention"
594
+ pytest --verbose --exitfirst
595
+
596
+ # Module operations tests
597
+ pytest test/modules/operations
598
+ ```
599
+
600
+ ### Test Categories
601
+
602
+ - **Unit Tests**: Individual component testing
603
+ - **Integration Tests**: End-to-end workflows
604
+ - **Comparison Tests**: Cross-backend consistency
605
+ - **Performance Tests**: Regression detection
606
+
607
+ ## Benchmarking
608
+
609
+ Run benchmarks to compare performance across backends:
610
+
611
+ ```bash
612
+ # General attention benchmarks
613
+ python benchmarks/benchmark_attention.py
614
+
615
+ # Flash attention benchmarks
616
+ python benchmarks/benchmark_flash_attention.py
617
+
618
+ # Ragged page attention benchmarks
619
+ python benchmarks/benchmark_ragged_page_attention_v3.py
620
+ ```
621
+
622
+ ## Contributing
623
+
624
+ We welcome contributions!
625
+
626
+ ### Priority Areas
627
+
628
+ - TPU/Pallas implementations for existing algorithms
629
+ - CUDA native kernels for maximum performance
630
+ - New attention mechanisms from recent papers
631
+ - Performance optimizations and kernel fusion
632
+ - Documentation and examples
633
+
634
+ ### Contribution Process
635
+
636
+ 1. Fork the repository
637
+ 1. Create a feature branch
638
+ 1. Implement your changes with tests
639
+ 1. Ensure all tests pass
640
+ 1. Submit a pull request
641
+
642
+ ## Documentation
643
+
644
+ Comprehensive documentation is available at [ejkernel.readthedocs.io](https://ejkernel.readthedocs.io/en/latest/):
645
+
646
+ - **[API Reference](https://ejkernel.readthedocs.io/en/latest/api/)**: Complete API documentation
647
+ - **[Tutorials](https://ejkernel.readthedocs.io/en/latest/tutorials/)**: Step-by-step guides
648
+ - **[Architecture](https://ejkernel.readthedocs.io/en/latest/architecture/)**: Design documentation
649
+ - **[Benchmarks](https://ejkernel.readthedocs.io/en/latest/benchmarks/)**: Performance analysis
650
+
651
+ ## Citation
652
+
653
+ If you use ejKernel in your research, please cite:
654
+
655
+ ```bibtex
656
+ @software{ejkernel2025,
657
+ author = {Erfan Zare Chavoshi},
658
+ title = {ejKernel: High-Performance JAX Kernels for Deep Learning},
659
+ year = {2025},
660
+ url = {https://github.com/erfanzar/ejkernel},
661
+ note = {Production-grade kernel library with multi-backend support}
662
+ }
663
+ ```
664
+
665
+ ## License
666
+
667
+ ejKernel is licensed under the Apache License 2.0. See [LICENSE](LICENSE) for details.
668
+
669
+ ## Acknowledgments
670
+
671
+ ejKernel builds upon excellent work from:
672
+
673
+ - [JAX](https://github.com/google/jax) - Composable transformations of Python+NumPy programs
674
+ - [Triton](https://github.com/openai/triton) - GPU kernel programming language
675
+ - [Pallas](https://github.com/google/jax/tree/main/jax/experimental/pallas) - JAX kernel language
676
+ - [Flash Attention](https://github.com/Dao-AILab/flash-attention) - Memory-efficient attention
677
+ - [EasyDeL](https://github.com/erfanzar/EasyDeL) - Parent framework for JAX deep learning
678
+
679
+ ## Community
680
+
681
+ - **GitHub Issues**: [Bug reports and feature requests](https://github.com/erfanzar/ejkernel/issues)
682
+ - **Discussions**: [Community forum](https://github.com/erfanzar/ejkernel/discussions)
683
+ - **Email**: <Erfanzare810@gmail.com>
684
+
685
+ ---
686
+
687
+ **ejKernel** - Production-grade kernels for JAX deep learning