ob-metaflow-stubs 6.0.10.6__py2.py3-none-any.whl → 6.0.10.8__py2.py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
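For reference, a quick way to confirm which version of the stubs package is installed in a local environment (a minimal sketch; it assumes the distribution is installed under the name ob-metaflow-stubs) is:

import importlib.metadata  # standard library

# Prints the installed version of the stubs distribution, e.g. 6.0.10.8
print(importlib.metadata.version("ob-metaflow-stubs"))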

Potentially problematic release.


This version of ob-metaflow-stubs might be problematic.

Files changed (262)
  1. metaflow-stubs/__init__.pyi +977 -977
  2. metaflow-stubs/cards.pyi +3 -2
  3. metaflow-stubs/cli.pyi +2 -2
  4. metaflow-stubs/cli_components/__init__.pyi +2 -2
  5. metaflow-stubs/cli_components/utils.pyi +2 -2
  6. metaflow-stubs/client/__init__.pyi +2 -2
  7. metaflow-stubs/client/core.pyi +5 -5
  8. metaflow-stubs/client/filecache.pyi +3 -3
  9. metaflow-stubs/events.pyi +2 -2
  10. metaflow-stubs/exception.pyi +2 -2
  11. metaflow-stubs/flowspec.pyi +5 -5
  12. metaflow-stubs/generated_for.txt +1 -1
  13. metaflow-stubs/includefile.pyi +4 -4
  14. metaflow-stubs/meta_files.pyi +2 -2
  15. metaflow-stubs/metadata_provider/__init__.pyi +2 -2
  16. metaflow-stubs/metadata_provider/heartbeat.pyi +2 -2
  17. metaflow-stubs/metadata_provider/metadata.pyi +3 -3
  18. metaflow-stubs/metadata_provider/util.pyi +2 -2
  19. metaflow-stubs/metaflow_config.pyi +2 -2
  20. metaflow-stubs/metaflow_current.pyi +86 -86
  21. metaflow-stubs/metaflow_git.pyi +2 -2
  22. metaflow-stubs/mf_extensions/__init__.pyi +2 -2
  23. metaflow-stubs/mf_extensions/obcheckpoint/__init__.pyi +2 -2
  24. metaflow-stubs/mf_extensions/obcheckpoint/plugins/__init__.pyi +2 -2
  25. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/__init__.pyi +2 -2
  26. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/card_utils/__init__.pyi +2 -2
  27. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/card_utils/async_cards.pyi +2 -2
  28. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/card_utils/deco_injection_mixin.pyi +2 -2
  29. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/card_utils/extra_components.pyi +3 -3
  30. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/__init__.pyi +2 -2
  31. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/cards/__init__.pyi +2 -2
  32. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/cards/checkpoint_lister.pyi +4 -4
  33. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/cards/lineage_card.pyi +2 -2
  34. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/checkpoint_storage.pyi +5 -5
  35. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/constructors.pyi +2 -2
  36. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/core.pyi +3 -3
  37. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/decorator.pyi +4 -4
  38. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/exceptions.pyi +2 -2
  39. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/final_api.pyi +2 -2
  40. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/checkpoints/lineage.pyi +2 -2
  41. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/datastore/__init__.pyi +2 -2
  42. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/datastore/context.pyi +3 -3
  43. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/datastore/core.pyi +3 -3
  44. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/datastore/decorator.pyi +2 -2
  45. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/datastore/exceptions.pyi +2 -2
  46. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/datastore/task_utils.pyi +3 -3
  47. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/datastore/utils.pyi +2 -2
  48. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/datastructures.pyi +3 -3
  49. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/exceptions.pyi +2 -2
  50. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/hf_hub/__init__.pyi +2 -2
  51. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/hf_hub/decorator.pyi +2 -2
  52. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/modeling_utils/__init__.pyi +2 -2
  53. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/modeling_utils/core.pyi +3 -3
  54. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/modeling_utils/exceptions.pyi +2 -2
  55. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/modeling_utils/model_storage.pyi +5 -5
  56. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/utils/__init__.pyi +2 -2
  57. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/utils/flowspec_utils.pyi +2 -2
  58. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/utils/general.pyi +2 -2
  59. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/utils/identity_utils.pyi +2 -2
  60. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/utils/serialization_handler/__init__.pyi +2 -2
  61. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/utils/serialization_handler/base.pyi +2 -2
  62. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/utils/serialization_handler/tar.pyi +3 -3
  63. metaflow-stubs/mf_extensions/obcheckpoint/plugins/machine_learning_utilities/utils/tar_utils.pyi +3 -3
  64. metaflow-stubs/mf_extensions/outerbounds/__init__.pyi +2 -2
  65. metaflow-stubs/mf_extensions/outerbounds/plugins/__init__.pyi +2 -2
  66. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/__init__.pyi +2 -2
  67. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/__init__.pyi +2 -2
  68. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/_state_machine.pyi +2 -2
  69. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/_vendor/__init__.pyi +2 -2
  70. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/_vendor/spinner/__init__.pyi +2 -2
  71. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/_vendor/spinner/spinners.pyi +2 -2
  72. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/app_cli.pyi +2 -2
  73. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/app_config.pyi +3 -3
  74. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/capsule.pyi +4 -4
  75. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/click_importer.pyi +2 -2
  76. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/code_package/__init__.pyi +2 -2
  77. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/code_package/code_packager.pyi +3 -3
  78. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/config/__init__.pyi +2 -2
  79. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/config/cli_generator.pyi +2 -2
  80. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/config/config_utils.pyi +4 -4
  81. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/config/schema_export.pyi +2 -2
  82. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/config/typed_configs.pyi +4 -4
  83. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/config/unified_config.pyi +4 -4
  84. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/dependencies.pyi +3 -3
  85. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/deployer.pyi +6 -6
  86. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/experimental/__init__.pyi +2 -2
  87. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/perimeters.pyi +2 -2
  88. metaflow-stubs/mf_extensions/outerbounds/plugins/apps/core/utils.pyi +4 -4
  89. metaflow-stubs/mf_extensions/outerbounds/plugins/aws/__init__.pyi +2 -2
  90. metaflow-stubs/mf_extensions/outerbounds/plugins/aws/assume_role_decorator.pyi +3 -3
  91. metaflow-stubs/mf_extensions/outerbounds/plugins/card_utilities/__init__.pyi +2 -2
  92. metaflow-stubs/mf_extensions/outerbounds/plugins/card_utilities/async_cards.pyi +2 -2
  93. metaflow-stubs/mf_extensions/outerbounds/plugins/card_utilities/injector.pyi +2 -2
  94. metaflow-stubs/mf_extensions/outerbounds/plugins/checkpoint_datastores/__init__.pyi +2 -2
  95. metaflow-stubs/mf_extensions/outerbounds/plugins/checkpoint_datastores/coreweave.pyi +3 -3
  96. metaflow-stubs/mf_extensions/outerbounds/plugins/checkpoint_datastores/nebius.pyi +3 -3
  97. metaflow-stubs/mf_extensions/outerbounds/plugins/fast_bakery/__init__.pyi +2 -2
  98. metaflow-stubs/mf_extensions/outerbounds/plugins/fast_bakery/baker.pyi +4 -4
  99. metaflow-stubs/mf_extensions/outerbounds/plugins/fast_bakery/docker_environment.pyi +3 -3
  100. metaflow-stubs/mf_extensions/outerbounds/plugins/fast_bakery/fast_bakery.pyi +2 -2
  101. metaflow-stubs/mf_extensions/outerbounds/plugins/kubernetes/__init__.pyi +2 -2
  102. metaflow-stubs/mf_extensions/outerbounds/plugins/kubernetes/pod_killer.pyi +2 -2
  103. metaflow-stubs/mf_extensions/outerbounds/plugins/ollama/__init__.pyi +2 -2
  104. metaflow-stubs/mf_extensions/outerbounds/plugins/ollama/constants.pyi +2 -2
  105. metaflow-stubs/mf_extensions/outerbounds/plugins/ollama/exceptions.pyi +2 -2
  106. metaflow-stubs/mf_extensions/outerbounds/plugins/ollama/ollama.pyi +2 -2
  107. metaflow-stubs/mf_extensions/outerbounds/plugins/ollama/status_card.pyi +2 -2
  108. metaflow-stubs/mf_extensions/outerbounds/plugins/snowflake/__init__.pyi +2 -2
  109. metaflow-stubs/mf_extensions/outerbounds/plugins/snowflake/snowflake.pyi +2 -2
  110. metaflow-stubs/mf_extensions/outerbounds/profilers/__init__.pyi +2 -2
  111. metaflow-stubs/mf_extensions/outerbounds/profilers/gpu.pyi +2 -2
  112. metaflow-stubs/mf_extensions/outerbounds/remote_config.pyi +3 -3
  113. metaflow-stubs/mf_extensions/outerbounds/toplevel/__init__.pyi +2 -2
  114. metaflow-stubs/mf_extensions/outerbounds/toplevel/global_aliases_for_metaflow_package.pyi +2 -2
  115. metaflow-stubs/mf_extensions/outerbounds/toplevel/s3_proxy.pyi +2 -2
  116. metaflow-stubs/multicore_utils.pyi +2 -2
  117. metaflow-stubs/ob_internal.pyi +2 -2
  118. metaflow-stubs/packaging_sys/__init__.pyi +7 -7
  119. metaflow-stubs/packaging_sys/backend.pyi +5 -5
  120. metaflow-stubs/packaging_sys/distribution_support.pyi +5 -5
  121. metaflow-stubs/packaging_sys/tar_backend.pyi +5 -5
  122. metaflow-stubs/packaging_sys/utils.pyi +2 -2
  123. metaflow-stubs/packaging_sys/v1.pyi +4 -4
  124. metaflow-stubs/parameters.pyi +4 -4
  125. metaflow-stubs/plugins/__init__.pyi +13 -13
  126. metaflow-stubs/plugins/airflow/__init__.pyi +2 -2
  127. metaflow-stubs/plugins/airflow/airflow_utils.pyi +2 -2
  128. metaflow-stubs/plugins/airflow/exception.pyi +2 -2
  129. metaflow-stubs/plugins/airflow/sensors/__init__.pyi +2 -2
  130. metaflow-stubs/plugins/airflow/sensors/base_sensor.pyi +2 -2
  131. metaflow-stubs/plugins/airflow/sensors/external_task_sensor.pyi +2 -2
  132. metaflow-stubs/plugins/airflow/sensors/s3_sensor.pyi +2 -2
  133. metaflow-stubs/plugins/argo/__init__.pyi +2 -2
  134. metaflow-stubs/plugins/argo/argo_client.pyi +2 -2
  135. metaflow-stubs/plugins/argo/argo_events.pyi +2 -2
  136. metaflow-stubs/plugins/argo/argo_workflows.pyi +3 -3
  137. metaflow-stubs/plugins/argo/argo_workflows_decorator.pyi +3 -3
  138. metaflow-stubs/plugins/argo/argo_workflows_deployer.pyi +4 -4
  139. metaflow-stubs/plugins/argo/argo_workflows_deployer_objects.pyi +4 -4
  140. metaflow-stubs/plugins/argo/exit_hooks.pyi +3 -3
  141. metaflow-stubs/plugins/aws/__init__.pyi +2 -2
  142. metaflow-stubs/plugins/aws/aws_client.pyi +2 -2
  143. metaflow-stubs/plugins/aws/aws_utils.pyi +2 -2
  144. metaflow-stubs/plugins/aws/batch/__init__.pyi +2 -2
  145. metaflow-stubs/plugins/aws/batch/batch.pyi +2 -2
  146. metaflow-stubs/plugins/aws/batch/batch_client.pyi +2 -2
  147. metaflow-stubs/plugins/aws/batch/batch_decorator.pyi +2 -2
  148. metaflow-stubs/plugins/aws/secrets_manager/__init__.pyi +2 -2
  149. metaflow-stubs/plugins/aws/secrets_manager/aws_secrets_manager_secrets_provider.pyi +4 -4
  150. metaflow-stubs/plugins/aws/step_functions/__init__.pyi +2 -2
  151. metaflow-stubs/plugins/aws/step_functions/event_bridge_client.pyi +2 -2
  152. metaflow-stubs/plugins/aws/step_functions/schedule_decorator.pyi +2 -2
  153. metaflow-stubs/plugins/aws/step_functions/step_functions.pyi +2 -2
  154. metaflow-stubs/plugins/aws/step_functions/step_functions_client.pyi +2 -2
  155. metaflow-stubs/plugins/aws/step_functions/step_functions_deployer.pyi +4 -4
  156. metaflow-stubs/plugins/aws/step_functions/step_functions_deployer_objects.pyi +4 -4
  157. metaflow-stubs/plugins/azure/__init__.pyi +2 -2
  158. metaflow-stubs/plugins/azure/azure_credential.pyi +2 -2
  159. metaflow-stubs/plugins/azure/azure_exceptions.pyi +2 -2
  160. metaflow-stubs/plugins/azure/azure_secret_manager_secrets_provider.pyi +4 -4
  161. metaflow-stubs/plugins/azure/azure_utils.pyi +2 -2
  162. metaflow-stubs/plugins/azure/blob_service_client_factory.pyi +2 -2
  163. metaflow-stubs/plugins/azure/includefile_support.pyi +2 -2
  164. metaflow-stubs/plugins/cards/__init__.pyi +2 -2
  165. metaflow-stubs/plugins/cards/card_client.pyi +2 -2
  166. metaflow-stubs/plugins/cards/card_creator.pyi +2 -2
  167. metaflow-stubs/plugins/cards/card_datastore.pyi +2 -2
  168. metaflow-stubs/plugins/cards/card_decorator.pyi +3 -3
  169. metaflow-stubs/plugins/cards/card_modules/__init__.pyi +2 -2
  170. metaflow-stubs/plugins/cards/card_modules/basic.pyi +2 -2
  171. metaflow-stubs/plugins/cards/card_modules/card.pyi +2 -2
  172. metaflow-stubs/plugins/cards/card_modules/components.pyi +84 -4
  173. metaflow-stubs/plugins/cards/card_modules/convert_to_native_type.pyi +2 -2
  174. metaflow-stubs/plugins/cards/card_modules/renderer_tools.pyi +2 -2
  175. metaflow-stubs/plugins/cards/card_modules/test_cards.pyi +2 -2
  176. metaflow-stubs/plugins/cards/card_resolver.pyi +2 -2
  177. metaflow-stubs/plugins/cards/component_serializer.pyi +2 -2
  178. metaflow-stubs/plugins/cards/exception.pyi +2 -2
  179. metaflow-stubs/plugins/catch_decorator.pyi +2 -2
  180. metaflow-stubs/plugins/datatools/__init__.pyi +2 -2
  181. metaflow-stubs/plugins/datatools/local.pyi +2 -2
  182. metaflow-stubs/plugins/datatools/s3/__init__.pyi +2 -2
  183. metaflow-stubs/plugins/datatools/s3/s3.pyi +4 -4
  184. metaflow-stubs/plugins/datatools/s3/s3tail.pyi +2 -2
  185. metaflow-stubs/plugins/datatools/s3/s3util.pyi +2 -2
  186. metaflow-stubs/plugins/debug_logger.pyi +2 -2
  187. metaflow-stubs/plugins/debug_monitor.pyi +2 -2
  188. metaflow-stubs/plugins/environment_decorator.pyi +2 -2
  189. metaflow-stubs/plugins/events_decorator.pyi +45 -4
  190. metaflow-stubs/plugins/exit_hook/__init__.pyi +2 -2
  191. metaflow-stubs/plugins/exit_hook/exit_hook_decorator.pyi +2 -2
  192. metaflow-stubs/plugins/frameworks/__init__.pyi +2 -2
  193. metaflow-stubs/plugins/frameworks/pytorch.pyi +2 -2
  194. metaflow-stubs/plugins/gcp/__init__.pyi +2 -2
  195. metaflow-stubs/plugins/gcp/gcp_secret_manager_secrets_provider.pyi +4 -4
  196. metaflow-stubs/plugins/gcp/gs_exceptions.pyi +2 -2
  197. metaflow-stubs/plugins/gcp/gs_storage_client_factory.pyi +2 -2
  198. metaflow-stubs/plugins/gcp/gs_utils.pyi +2 -2
  199. metaflow-stubs/plugins/gcp/includefile_support.pyi +2 -2
  200. metaflow-stubs/plugins/kubernetes/__init__.pyi +2 -2
  201. metaflow-stubs/plugins/kubernetes/kube_utils.pyi +3 -3
  202. metaflow-stubs/plugins/kubernetes/kubernetes.pyi +2 -2
  203. metaflow-stubs/plugins/kubernetes/kubernetes_client.pyi +2 -2
  204. metaflow-stubs/plugins/kubernetes/kubernetes_decorator.pyi +2 -2
  205. metaflow-stubs/plugins/kubernetes/kubernetes_jobsets.pyi +2 -2
  206. metaflow-stubs/plugins/kubernetes/spot_monitor_sidecar.pyi +2 -2
  207. metaflow-stubs/plugins/ollama/__init__.pyi +3 -3
  208. metaflow-stubs/plugins/optuna/__init__.pyi +2 -2
  209. metaflow-stubs/plugins/parallel_decorator.pyi +2 -2
  210. metaflow-stubs/plugins/perimeters.pyi +2 -2
  211. metaflow-stubs/plugins/project_decorator.pyi +2 -2
  212. metaflow-stubs/plugins/pypi/__init__.pyi +2 -2
  213. metaflow-stubs/plugins/pypi/conda_decorator.pyi +2 -2
  214. metaflow-stubs/plugins/pypi/conda_environment.pyi +4 -4
  215. metaflow-stubs/plugins/pypi/parsers.pyi +2 -2
  216. metaflow-stubs/plugins/pypi/pypi_decorator.pyi +2 -2
  217. metaflow-stubs/plugins/pypi/pypi_environment.pyi +2 -2
  218. metaflow-stubs/plugins/pypi/utils.pyi +2 -2
  219. metaflow-stubs/plugins/resources_decorator.pyi +2 -2
  220. metaflow-stubs/plugins/retry_decorator.pyi +2 -2
  221. metaflow-stubs/plugins/secrets/__init__.pyi +3 -3
  222. metaflow-stubs/plugins/secrets/inline_secrets_provider.pyi +3 -3
  223. metaflow-stubs/plugins/secrets/secrets_decorator.pyi +2 -2
  224. metaflow-stubs/plugins/secrets/secrets_func.pyi +2 -2
  225. metaflow-stubs/plugins/secrets/secrets_spec.pyi +2 -2
  226. metaflow-stubs/plugins/secrets/utils.pyi +2 -2
  227. metaflow-stubs/plugins/snowflake/__init__.pyi +2 -2
  228. metaflow-stubs/plugins/storage_executor.pyi +2 -2
  229. metaflow-stubs/plugins/test_unbounded_foreach_decorator.pyi +2 -2
  230. metaflow-stubs/plugins/timeout_decorator.pyi +2 -2
  231. metaflow-stubs/plugins/torchtune/__init__.pyi +2 -2
  232. metaflow-stubs/plugins/uv/__init__.pyi +2 -2
  233. metaflow-stubs/plugins/uv/uv_environment.pyi +3 -3
  234. metaflow-stubs/profilers/__init__.pyi +2 -2
  235. metaflow-stubs/pylint_wrapper.pyi +2 -2
  236. metaflow-stubs/runner/__init__.pyi +2 -2
  237. metaflow-stubs/runner/deployer.pyi +34 -34
  238. metaflow-stubs/runner/deployer_impl.pyi +3 -3
  239. metaflow-stubs/runner/metaflow_runner.pyi +5 -5
  240. metaflow-stubs/runner/nbdeploy.pyi +2 -2
  241. metaflow-stubs/runner/nbrun.pyi +2 -2
  242. metaflow-stubs/runner/subprocess_manager.pyi +2 -2
  243. metaflow-stubs/runner/utils.pyi +2 -2
  244. metaflow-stubs/system/__init__.pyi +2 -2
  245. metaflow-stubs/system/system_logger.pyi +3 -3
  246. metaflow-stubs/system/system_monitor.pyi +2 -2
  247. metaflow-stubs/tagging_util.pyi +2 -2
  248. metaflow-stubs/tuple_util.pyi +2 -2
  249. metaflow-stubs/user_configs/__init__.pyi +2 -2
  250. metaflow-stubs/user_configs/config_options.pyi +3 -3
  251. metaflow-stubs/user_configs/config_parameters.pyi +6 -6
  252. metaflow-stubs/user_decorators/__init__.pyi +2 -2
  253. metaflow-stubs/user_decorators/common.pyi +2 -2
  254. metaflow-stubs/user_decorators/mutable_flow.pyi +4 -4
  255. metaflow-stubs/user_decorators/mutable_step.pyi +4 -4
  256. metaflow-stubs/user_decorators/user_flow_decorator.pyi +5 -5
  257. metaflow-stubs/user_decorators/user_step_decorator.pyi +4 -4
  258. {ob_metaflow_stubs-6.0.10.6.dist-info → ob_metaflow_stubs-6.0.10.8.dist-info}/METADATA +1 -1
  259. ob_metaflow_stubs-6.0.10.8.dist-info/RECORD +262 -0
  260. ob_metaflow_stubs-6.0.10.6.dist-info/RECORD +0 -262
  261. {ob_metaflow_stubs-6.0.10.6.dist-info → ob_metaflow_stubs-6.0.10.8.dist-info}/WHEEL +0 -0
  262. {ob_metaflow_stubs-6.0.10.6.dist-info → ob_metaflow_stubs-6.0.10.8.dist-info}/top_level.txt +0 -0
@@ -1,15 +1,15 @@
  ######################################################################################################
  # Auto-generated Metaflow stub file #
- # MF version: 2.18.5.1+obcheckpoint(0.2.6);ob(v1) #
- # Generated on 2025-09-19T18:02:17.702045 #
+ # MF version: 2.18.7.5+obcheckpoint(0.2.6);ob(v1) #
+ # Generated on 2025-09-19T21:56:58.875223 #
  ######################################################################################################

  from __future__ import annotations

  import typing
  if typing.TYPE_CHECKING:
- import datetime
  import typing
+ import datetime
  FlowSpecDerived = typing.TypeVar("FlowSpecDerived", bound="FlowSpec", contravariant=False, covariant=False)
  StepFlag = typing.NewType("StepFlag", bool)

@@ -39,9 +39,9 @@ from .user_decorators.user_step_decorator import UserStepDecorator as UserStepDe
  from .user_decorators.user_step_decorator import StepMutator as StepMutator
  from .user_decorators.user_step_decorator import user_step_decorator as user_step_decorator
  from .user_decorators.user_flow_decorator import FlowMutator as FlowMutator
- from . import cards as cards
- from . import tuple_util as tuple_util
  from . import metaflow_git as metaflow_git
+ from . import tuple_util as tuple_util
+ from . import cards as cards
  from . import events as events
  from . import runner as runner
  from . import plugins as plugins
@@ -167,654 +167,233 @@ def step(f: typing.Union[typing.Callable[[FlowSpecDerived], None], typing.Callab
  """
  ...

- def nvct(*, gpu: int, gpu_type: str) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
+ @typing.overload
+ def app_deploy(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
  """
- Specifies that this step should execute on DGX cloud.
-
-
- Parameters
- ----------
- gpu : int
- Number of GPUs to use.
- gpu_type : str
- Type of Nvidia GPU to use.
+ Decorator prototype for all step decorators. This function gets specialized
+ and imported for all decorators types by _import_plugin_decorators().
  """
  ...

- def ollama(*, models: list, backend: str, force_pull: bool, cache_update_policy: str, force_cache_update: bool, debug: bool, circuit_breaker_config: dict, timeout_config: dict) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
+ @typing.overload
+ def app_deploy(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
+ ...
+
+ def app_deploy(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
  """
- This decorator is used to run Ollama APIs as Metaflow task sidecars.
+ Decorator prototype for all step decorators. This function gets specialized
+ and imported for all decorators types by _import_plugin_decorators().
+ """
+ ...
+
+ @typing.overload
+ def timeout(*, seconds: int = 0, minutes: int = 0, hours: int = 0) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
+ """
+ Specifies a timeout for your step.

- User code call
- --------------
- @ollama(
- models=[...],
- ...
- )
+ This decorator is useful if this step may hang indefinitely.

- Valid backend options
- ---------------------
- - 'local': Run as a separate process on the local task machine.
- - (TODO) 'managed': Outerbounds hosts and selects compute provider.
- - (TODO) 'remote': Spin up separate instance to serve Ollama models.
+ This can be used in conjunction with the `@retry` decorator as well as the `@catch` decorator.
+ A timeout is considered to be an exception thrown by the step. It will cause the step to be
+ retried if needed and the exception will be caught by the `@catch` decorator, if present.

- Valid model options
- -------------------
- Any model here https://ollama.com/search, e.g. 'llama3.2', 'llama3.3'
+ Note that all the values specified in parameters are added together so if you specify
+ 60 seconds and 1 hour, the decorator will have an effective timeout of 1 hour and 1 minute.


  Parameters
  ----------
- models: list[str]
- List of Ollama containers running models in sidecars.
- backend: str
- Determines where and how to run the Ollama process.
- force_pull: bool
- Whether to run `ollama pull` no matter what, or first check the remote cache in Metaflow datastore for this model key.
- cache_update_policy: str
- Cache update policy: "auto", "force", or "never".
- force_cache_update: bool
- Simple override for "force" cache update policy.
- debug: bool
- Whether to turn on verbose debugging logs.
- circuit_breaker_config: dict
- Configuration for circuit breaker protection. Keys: failure_threshold, recovery_timeout, reset_timeout.
- timeout_config: dict
- Configuration for various operation timeouts. Keys: pull, stop, health_check, install, server_startup.
+ seconds : int, default 0
+ Number of seconds to wait prior to timing out.
+ minutes : int, default 0
+ Number of minutes to wait prior to timing out.
+ hours : int, default 0
+ Number of hours to wait prior to timing out.
  """
  ...

  @typing.overload
- def test_append_card(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
- """
- A simple decorator that demonstrates using CardDecoratorInjector
- to inject a card and render simple markdown content.
- """
+ def timeout(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
  ...

  @typing.overload
- def test_append_card(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
- ...
-
- def test_append_card(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
- """
- A simple decorator that demonstrates using CardDecoratorInjector
- to inject a card and render simple markdown content.
- """
+ def timeout(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
  ...

- def s3_proxy(*, integration_name: typing.Optional[str] = None, write_mode: typing.Optional[str] = None, debug: typing.Optional[bool] = None) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
+ def timeout(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, seconds: int = 0, minutes: int = 0, hours: int = 0):
  """
- S3 Proxy decorator for routing S3 requests through a local proxy service.
+ Specifies a timeout for your step.
+
+ This decorator is useful if this step may hang indefinitely.
+
+ This can be used in conjunction with the `@retry` decorator as well as the `@catch` decorator.
+ A timeout is considered to be an exception thrown by the step. It will cause the step to be
+ retried if needed and the exception will be caught by the `@catch` decorator, if present.
+
+ Note that all the values specified in parameters are added together so if you specify
+ 60 seconds and 1 hour, the decorator will have an effective timeout of 1 hour and 1 minute.


  Parameters
  ----------
- integration_name : str, optional
- Name of the S3 proxy integration. If not specified, will use the only
- available S3 proxy integration in the namespace (fails if multiple exist).
- write_mode : str, optional
- The desired behavior during write operations to target (origin) S3 bucket.
- allowed options are:
- "origin-and-cache" -> write to both the target S3 bucket and local object
- storage
- "origin" -> only write to the target S3 bucket
- "cache" -> only write to the object storage service used for caching
- debug : bool, optional
- Enable debug logging for proxy operations.
+ seconds : int, default 0
+ Number of seconds to wait prior to timing out.
+ minutes : int, default 0
+ Number of minutes to wait prior to timing out.
+ hours : int, default 0
+ Number of hours to wait prior to timing out.
  """
  ...

- @typing.overload
- def coreweave_s3_proxy(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
+ def nvct(*, gpu: int, gpu_type: str) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
  """
- CoreWeave-specific S3 Proxy decorator for routing S3 requests through a local proxy service.
- It exists to make it easier for users to know that this decorator should only be used with
- a Neo Cloud like CoreWeave.
+ Specifies that this step should execute on DGX cloud.
+
+
+ Parameters
+ ----------
+ gpu : int
+ Number of GPUs to use.
+ gpu_type : str
+ Type of Nvidia GPU to use.
  """
  ...

  @typing.overload
- def coreweave_s3_proxy(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
- ...
-
- def coreweave_s3_proxy(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
+ def conda(*, packages: typing.Dict[str, str] = {}, libraries: typing.Dict[str, str] = {}, python: typing.Optional[str] = None, disabled: bool = False) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
  """
- CoreWeave-specific S3 Proxy decorator for routing S3 requests through a local proxy service.
- It exists to make it easier for users to know that this decorator should only be used with
- a Neo Cloud like CoreWeave.
+ Specifies the Conda environment for the step.
+
+ Information in this decorator will augment any
+ attributes set in the `@conda_base` flow-level decorator. Hence,
+ you can use `@conda_base` to set packages required by all
+ steps and use `@conda` to specify step-specific overrides.
+
+
+ Parameters
+ ----------
+ packages : Dict[str, str], default {}
+ Packages to use for this step. The key is the name of the package
+ and the value is the version to use.
+ libraries : Dict[str, str], default {}
+ Supported for backward compatibility. When used with packages, packages will take precedence.
+ python : str, optional, default None
+ Version of Python to use, e.g. '3.7.4'. A default value of None implies
+ that the version used will correspond to the version of the Python interpreter used to start the run.
+ disabled : bool, default False
+ If set to True, disables @conda.
  """
  ...

  @typing.overload
- def fast_bakery_internal(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
- """
- Internal decorator to support Fast bakery
- """
+ def conda(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
  ...

  @typing.overload
- def fast_bakery_internal(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
+ def conda(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
  ...

- def fast_bakery_internal(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
+ def conda(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, packages: typing.Dict[str, str] = {}, libraries: typing.Dict[str, str] = {}, python: typing.Optional[str] = None, disabled: bool = False):
  """
- Internal decorator to support Fast bakery
+ Specifies the Conda environment for the step.
+
+ Information in this decorator will augment any
+ attributes set in the `@conda_base` flow-level decorator. Hence,
+ you can use `@conda_base` to set packages required by all
+ steps and use `@conda` to specify step-specific overrides.
+
+
+ Parameters
+ ----------
+ packages : Dict[str, str], default {}
+ Packages to use for this step. The key is the name of the package
+ and the value is the version to use.
+ libraries : Dict[str, str], default {}
+ Supported for backward compatibility. When used with packages, packages will take precedence.
+ python : str, optional, default None
+ Version of Python to use, e.g. '3.7.4'. A default value of None implies
+ that the version used will correspond to the version of the Python interpreter used to start the run.
+ disabled : bool, default False
+ If set to True, disables @conda.
  """
  ...

  @typing.overload
- def retry(*, times: int = 3, minutes_between_retries: int = 2) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
+ def resources(*, cpu: int = 1, gpu: typing.Optional[int] = None, disk: typing.Optional[int] = None, memory: int = 4096, shared_memory: typing.Optional[int] = None) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
  """
- Specifies the number of times the task corresponding
- to a step needs to be retried.
+ Specifies the resources needed when executing this step.

- This decorator is useful for handling transient errors, such as networking issues.
- If your task contains operations that can't be retried safely, e.g. database updates,
- it is advisable to annotate it with `@retry(times=0)`.
+ Use `@resources` to specify the resource requirements
+ independently of the specific compute layer (`@batch`, `@kubernetes`).

- This can be used in conjunction with the `@catch` decorator. The `@catch`
- decorator will execute a no-op task after all retries have been exhausted,
- ensuring that the flow execution can continue.
+ You can choose the compute layer on the command line by executing e.g.
+ ```
+ python myflow.py run --with batch
+ ```
+ or
+ ```
+ python myflow.py run --with kubernetes
+ ```
+ which executes the flow on the desired system using the
+ requirements specified in `@resources`.


  Parameters
  ----------
- times : int, default 3
- Number of times to retry this task.
- minutes_between_retries : int, default 2
- Number of minutes between retries.
+ cpu : int, default 1
+ Number of CPUs required for this step.
+ gpu : int, optional, default None
+ Number of GPUs required for this step.
+ disk : int, optional, default None
+ Disk size (in MB) required for this step. Only applies on Kubernetes.
+ memory : int, default 4096
+ Memory size (in MB) required for this step.
+ shared_memory : int, optional, default None
+ The value for the size (in MiB) of the /dev/shm volume for this step.
+ This parameter maps to the `--shm-size` option in Docker.
  """
  ...

  @typing.overload
- def retry(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
+ def resources(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
  ...

  @typing.overload
- def retry(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
+ def resources(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
  ...

- def retry(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, times: int = 3, minutes_between_retries: int = 2):
+ def resources(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, cpu: int = 1, gpu: typing.Optional[int] = None, disk: typing.Optional[int] = None, memory: int = 4096, shared_memory: typing.Optional[int] = None):
  """
- Specifies the number of times the task corresponding
- to a step needs to be retried.
-
- This decorator is useful for handling transient errors, such as networking issues.
- If your task contains operations that can't be retried safely, e.g. database updates,
- it is advisable to annotate it with `@retry(times=0)`.
-
- This can be used in conjunction with the `@catch` decorator. The `@catch`
- decorator will execute a no-op task after all retries have been exhausted,
- ensuring that the flow execution can continue.
+ Specifies the resources needed when executing this step.

+ Use `@resources` to specify the resource requirements
+ independently of the specific compute layer (`@batch`, `@kubernetes`).

- Parameters
- ----------
- times : int, default 3
- Number of times to retry this task.
- minutes_between_retries : int, default 2
- Number of minutes between retries.
- """
- ...
-
- def huggingface_hub(*, temp_dir_root: typing.Optional[str] = None, cache_scope: typing.Optional[str] = None, load: typing.Union[typing.List[str], typing.List[typing.Tuple[typing.Dict, str]], typing.List[typing.Tuple[str, str]], typing.List[typing.Dict], None]) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
- """
- Decorator that helps cache, version, and store models/datasets from the Hugging Face Hub.
-
- > Examples
-
- **Usage: creating references to models from the Hugging Face Hub that may be loaded in downstream steps**
- ```python
- @huggingface_hub
- @step
- def pull_model_from_huggingface(self):
- # `current.huggingface_hub.snapshot_download` downloads the model from the Hugging Face Hub
- # and saves it in the backend storage based on the model's `repo_id`. If there exists a model
- # with the same `repo_id` in the backend storage, it will not download the model again. The return
- # value of the function is a reference to the model in the backend storage.
- # This reference can be used to load the model in the subsequent steps via `@model(load=["llama_model"])`
-
- self.model_id = "mistralai/Mistral-7B-Instruct-v0.1"
- self.llama_model = current.huggingface_hub.snapshot_download(
- repo_id=self.model_id,
- allow_patterns=["*.safetensors", "*.json", "tokenizer.*"],
- )
- self.next(self.train)
- ```
-
- **Usage: explicitly loading models at runtime from the Hugging Face Hub or from cache (from Metaflow's datastore)**
- ```python
- @huggingface_hub
- @step
- def run_training(self):
- # Temporary directory (auto-cleaned on exit)
- with current.huggingface_hub.load(
- repo_id="google-bert/bert-base-uncased",
- allow_patterns=["*.bin"],
- ) as local_path:
- # Use files under local_path
- train_model(local_path)
- ...
-
- ```
-
- **Usage: loading models directly from the Hugging Face Hub or from cache (from Metaflow's datastore)**
- ```python
- @huggingface_hub(load=["mistralai/Mistral-7B-Instruct-v0.1"])
- @step
- def pull_model_from_huggingface(self):
- path_to_model = current.huggingface_hub.loaded["mistralai/Mistral-7B-Instruct-v0.1"]
- ```
-
- ```python
- @huggingface_hub(load=[("mistralai/Mistral-7B-Instruct-v0.1", "/my-directory"), ("myorg/mistral-lora", "/my-lora-directory")])
- @step
- def finetune_model(self):
- path_to_model = current.huggingface_hub.loaded["mistralai/Mistral-7B-Instruct-v0.1"]
- # path_to_model will be /my-directory
- ```
-
- ```python
- # Takes all the arguments passed to `snapshot_download`
- # except for `local_dir`
- @huggingface_hub(load=[
- {
- "repo_id": "mistralai/Mistral-7B-Instruct-v0.1",
- },
- {
- "repo_id": "myorg/mistral-lora",
- "repo_type": "model",
- },
- ])
- @step
- def finetune_model(self):
- path_to_model = current.huggingface_hub.loaded["mistralai/Mistral-7B-Instruct-v0.1"]
- # path_to_model will be /my-directory
- ```
-
-
- Parameters
- ----------
- temp_dir_root : str, optional
- The root directory that will hold the temporary directory where objects will be downloaded.
-
- cache_scope : str, optional
- The scope of the cache. Can be `checkpoint` / `flow` / `global`.
-
- - `checkpoint` (default): All repos are stored like objects saved by `@checkpoint`.
- i.e., the cached path is derived from the namespace, flow, step, and Metaflow foreach iteration.
- Any repo downloaded under this scope will only be retrieved from the cache when the step runs under the same namespace in the same flow (at the same foreach index).
-
- - `flow`: All repos are cached under the flow, regardless of namespace.
- i.e., the cached path is derived solely from the flow name.
- When to use this mode:
- - Multiple users are executing the same flow and want shared access to the repos cached by the decorator.
- - Multiple versions of a flow are deployed, all needing access to the same repos cached by the decorator.
-
- - `global`: All repos are cached under a globally static path.
- i.e., the base path of the cache is static and all repos are stored under it.
- When to use this mode:
- - All repos from the Hugging Face Hub need to be shared by users across all flow executions.
-
- Each caching scope comes with its own trade-offs:
- - `checkpoint`:
- - Has explicit control over when caches are populated (controlled by the same flow that has the `@huggingface_hub` decorator) but ends up hitting the Hugging Face Hub more often if there are many users/namespaces/steps.
- - Since objects are written on a `namespace/flow/step` basis, the blast radius of a bad checkpoint is limited to a particular flow in a namespace.
- - `flow`:
- - Has less control over when caches are populated (can be written by any execution instance of a flow from any namespace) but results in more cache hits.
- - The blast radius of a bad checkpoint is limited to all runs of a particular flow.
- - It doesn't promote cache reuse across flows.
- - `global`:
- - Has no control over when caches are populated (can be written by any flow execution) but has the highest cache hit rate.
- - It promotes cache reuse across flows.
- - The blast radius of a bad checkpoint spans every flow that could be using a particular repo.
-
- load: Union[List[str], List[Tuple[Dict, str]], List[Tuple[str, str]], List[Dict], None]
- The list of repos (models/datasets) to load.
-
- Loaded repos can be accessed via `current.huggingface_hub.loaded`. If load is set, then the following happens:
-
- - If repo (model/dataset) is not found in the datastore:
- - Downloads the repo from Hugging Face Hub to a temporary directory (or uses specified path) for local access
- - Stores it in Metaflow's datastore (s3/gcs/azure etc.) with a unique name based on repo_type/repo_id
- - All HF models loaded for a `@step` will be cached separately under flow/step/namespace.
-
- - If repo is found in the datastore:
- - Loads it directly from datastore to local path (can be temporary directory or specified path)
- """
- ...
487
-
488
- @typing.overload
489
- def resources(*, cpu: int = 1, gpu: typing.Optional[int] = None, disk: typing.Optional[int] = None, memory: int = 4096, shared_memory: typing.Optional[int] = None) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
490
- """
491
- Specifies the resources needed when executing this step.
492
-
493
- Use `@resources` to specify the resource requirements
494
- independently of the specific compute layer (`@batch`, `@kubernetes`).
495
-
496
- You can choose the compute layer on the command line by executing e.g.
497
- ```
498
- python myflow.py run --with batch
499
- ```
500
- or
501
- ```
502
- python myflow.py run --with kubernetes
503
- ```
504
- which executes the flow on the desired system using the
505
- requirements specified in `@resources`.
506
-
507
-
508
- Parameters
509
- ----------
510
- cpu : int, default 1
511
- Number of CPUs required for this step.
512
- gpu : int, optional, default None
513
- Number of GPUs required for this step.
514
- disk : int, optional, default None
515
- Disk size (in MB) required for this step. Only applies on Kubernetes.
516
- memory : int, default 4096
517
- Memory size (in MB) required for this step.
518
- shared_memory : int, optional, default None
519
- The value for the size (in MiB) of the /dev/shm volume for this step.
520
- This parameter maps to the `--shm-size` option in Docker.
521
- """
522
- ...
523
-
524
- @typing.overload
525
- def resources(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
526
- ...
527
-
528
- @typing.overload
529
- def resources(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
530
- ...
531
-
532
- def resources(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, cpu: int = 1, gpu: typing.Optional[int] = None, disk: typing.Optional[int] = None, memory: int = 4096, shared_memory: typing.Optional[int] = None):
533
- """
534
- Specifies the resources needed when executing this step.
535
-
536
- Use `@resources` to specify the resource requirements
537
- independently of the specific compute layer (`@batch`, `@kubernetes`).
538
-
539
- You can choose the compute layer on the command line by executing e.g.
540
- ```
541
- python myflow.py run --with batch
542
- ```
543
- or
544
- ```
545
- python myflow.py run --with kubernetes
546
- ```
547
- which executes the flow on the desired system using the
548
- requirements specified in `@resources`.
549
-
550
-
551
- Parameters
552
- ----------
553
- cpu : int, default 1
554
- Number of CPUs required for this step.
555
- gpu : int, optional, default None
556
- Number of GPUs required for this step.
557
- disk : int, optional, default None
558
- Disk size (in MB) required for this step. Only applies on Kubernetes.
559
- memory : int, default 4096
560
- Memory size (in MB) required for this step.
561
- shared_memory : int, optional, default None
562
- The value for the size (in MiB) of the /dev/shm volume for this step.
563
- This parameter maps to the `--shm-size` option in Docker.
564
- """
565
- ...
566
-
567
- def nvidia(*, gpu: int, gpu_type: str, queue_timeout: int) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
568
- """
569
- Specifies that this step should execute on DGX cloud.
570
-
571
-
572
- Parameters
573
- ----------
574
- gpu : int
575
- Number of GPUs to use.
576
- gpu_type : str
577
- Type of Nvidia GPU to use.
578
- queue_timeout : int
579
- Time to keep the job in NVCF's queue.
580
- """
581
- ...
582
-
583
- @typing.overload
584
- def timeout(*, seconds: int = 0, minutes: int = 0, hours: int = 0) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
585
- """
586
- Specifies a timeout for your step.
587
-
588
- This decorator is useful if this step may hang indefinitely.
589
-
590
- This can be used in conjunction with the `@retry` decorator as well as the `@catch` decorator.
591
- A timeout is considered to be an exception thrown by the step. It will cause the step to be
592
- retried if needed and the exception will be caught by the `@catch` decorator, if present.
593
-
594
- Note that all the values specified in parameters are added together so if you specify
595
- 60 seconds and 1 hour, the decorator will have an effective timeout of 1 hour and 1 minute.
596
-
597
-
598
- Parameters
599
- ----------
600
- seconds : int, default 0
601
- Number of seconds to wait prior to timing out.
602
- minutes : int, default 0
603
- Number of minutes to wait prior to timing out.
604
- hours : int, default 0
605
- Number of hours to wait prior to timing out.
606
- """
607
- ...
608
-
609
- @typing.overload
610
- def timeout(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
611
- ...
612
-
613
- @typing.overload
614
- def timeout(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
615
- ...
616
-
617
- def timeout(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, seconds: int = 0, minutes: int = 0, hours: int = 0):
618
- """
619
- Specifies a timeout for your step.
620
-
621
- This decorator is useful if this step may hang indefinitely.
622
-
623
- This can be used in conjunction with the `@retry` decorator as well as the `@catch` decorator.
624
- A timeout is considered to be an exception thrown by the step. It will cause the step to be
625
- retried if needed and the exception will be caught by the `@catch` decorator, if present.
626
-
627
- Note that all the values specified in parameters are added together so if you specify
628
- 60 seconds and 1 hour, the decorator will have an effective timeout of 1 hour and 1 minute.
629
-
630
-
631
- Parameters
632
- ----------
633
- seconds : int, default 0
634
- Number of seconds to wait prior to timing out.
635
- minutes : int, default 0
636
- Number of minutes to wait prior to timing out.
637
- hours : int, default 0
638
- Number of hours to wait prior to timing out.
639
- """
640
- ...
641
-
642
- @typing.overload
643
- def secrets(*, sources: typing.List[typing.Union[str, typing.Dict[str, typing.Any]]] = [], role: typing.Optional[str] = None) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
644
- """
645
- Specifies secrets to be retrieved and injected as environment variables prior to
646
- the execution of a step.
647
-
648
-
649
- Parameters
650
- ----------
651
- sources : List[Union[str, Dict[str, Any]]], default: []
652
- List of secret specs, defining how the secrets are to be retrieved
653
- role : str, optional, default: None
654
- Role to use for fetching secrets
655
- """
656
- ...
657
-
658
- @typing.overload
659
- def secrets(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
660
- ...
661
-
662
- @typing.overload
663
- def secrets(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
664
- ...
665
-
666
- def secrets(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, sources: typing.List[typing.Union[str, typing.Dict[str, typing.Any]]] = [], role: typing.Optional[str] = None):
667
- """
668
- Specifies secrets to be retrieved and injected as environment variables prior to
669
- the execution of a step.
670
-
671
-
672
- Parameters
673
- ----------
674
- sources : List[Union[str, Dict[str, Any]]], default: []
675
- List of secret specs, defining how the secrets are to be retrieved
676
- role : str, optional, default: None
677
- Role to use for fetching secrets
678
- """
679
- ...
680
-
681
- @typing.overload
682
- def nebius_s3_proxy(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
683
- """
684
- Nebius-specific S3 Proxy decorator for routing S3 requests through a local proxy service.
685
- It exists to make it easier for users to know that this decorator should only be used with
686
- a Neo Cloud like Nebius.
687
- """
688
- ...
689
-
690
- @typing.overload
691
- def nebius_s3_proxy(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
692
- ...
693
-
694
- def nebius_s3_proxy(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
695
- """
696
- Nebius-specific S3 Proxy decorator for routing S3 requests through a local proxy service.
697
- It exists to make it easier for users to know that this decorator should only be used with
698
- a Neo Cloud like Nebius.
699
- """
700
- ...
701
-
702
- @typing.overload
703
- def parallel(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
704
- """
705
- Decorator prototype for all step decorators. This function gets specialized
706
- and imported for all decorators types by _import_plugin_decorators().
707
- """
708
- ...
709
-
710
- @typing.overload
711
- def parallel(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
712
- ...
713
-
714
- def parallel(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
715
- """
716
- Decorator prototype for all step decorators. This function gets specialized
717
- and imported for all decorator types by _import_plugin_decorators().
718
- """
719
- ...
720
-
721
- @typing.overload
722
- def card(*, type: str = 'default', id: typing.Optional[str] = None, options: typing.Dict[str, typing.Any] = {}, timeout: int = 45) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
723
- """
724
- Creates a human-readable report, a Metaflow Card, after this step completes.
725
-
726
- Note that you may add multiple `@card` decorators in a step with different parameters.
727
-
728
-
729
- Parameters
730
- ----------
731
- type : str, default 'default'
732
- Card type.
733
- id : str, optional, default None
734
- If multiple cards are present, use this id to identify this card.
735
- options : Dict[str, Any], default {}
736
- Options passed to the card. The contents depend on the card type.
737
- timeout : int, default 45
738
- Interrupt reporting if it takes more than this many seconds.
739
- """
740
- ...
741
-
742
- @typing.overload
743
- def card(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
744
- ...
745
-
746
- @typing.overload
747
- def card(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
748
- ...
749
-
750
- def card(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, type: str = 'default', id: typing.Optional[str] = None, options: typing.Dict[str, typing.Any] = {}, timeout: int = 45):
751
- """
752
- Creates a human-readable report, a Metaflow Card, after this step completes.
753
-
754
- Note that you may add multiple `@card` decorators in a step with different parameters.
755
-
756
-
757
- Parameters
758
- ----------
759
- type : str, default 'default'
760
- Card type.
761
- id : str, optional, default None
762
- If multiple cards are present, use this id to identify this card.
763
- options : Dict[str, Any], default {}
764
- Options passed to the card. The contents depend on the card type.
765
- timeout : int, default 45
766
- Interrupt reporting if it takes more than this many seconds.
767
- """
768
- ...
769
-
770
- @typing.overload
771
- def catch(*, var: typing.Optional[str] = None, print_exception: bool = True) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
772
- """
773
- Specifies that the step will succeed under all circumstances.
774
-
775
- The decorator will create an optional artifact, specified by `var`, which
776
- contains the exception raised. You can use it to detect the presence
777
- of errors, indicating that all happy-path artifacts produced by the step
778
- are missing.
779
-
780
-
781
- Parameters
782
- ----------
783
- var : str, optional, default None
784
- Name of the artifact in which to store the caught exception.
785
- If not specified, the exception is not stored.
786
- print_exception : bool, default True
787
- Determines whether or not the exception is printed to
788
- stdout when caught.
789
- """
790
- ...
791
-
792
- @typing.overload
793
- def catch(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
794
- ...
795
-
796
- @typing.overload
797
- def catch(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
798
- ...
799
-
800
- def catch(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, var: typing.Optional[str] = None, print_exception: bool = True):
801
- """
802
- Specifies that the step will succeed under all circumstances.
803
-
804
- The decorator will create an optional artifact, specified by `var`, which
805
- contains the exception raised. You can use it to detect the presence
806
- of errors, indicating that all happy-path artifacts produced by the step
807
- are missing.
372
+ You can choose the compute layer on the command line by executing e.g.
373
+ ```
374
+ python myflow.py run --with batch
375
+ ```
376
+ or
377
+ ```
378
+ python myflow.py run --with kubernetes
379
+ ```
380
+ which executes the flow on the desired system using the
381
+ requirements specified in `@resources`.
808
382
 
809
383
 
810
384
  Parameters
811
385
  ----------
812
- var : str, optional, default None
813
- Name of the artifact in which to store the caught exception.
814
- If not specified, the exception is not stored.
815
- print_exception : bool, default True
816
- Determines whether or not the exception is printed to
817
- stdout when caught.
386
+ cpu : int, default 1
387
+ Number of CPUs required for this step.
388
+ gpu : int, optional, default None
389
+ Number of GPUs required for this step.
390
+ disk : int, optional, default None
391
+ Disk size (in MB) required for this step. Only applies on Kubernetes.
392
+ memory : int, default 4096
393
+ Memory size (in MB) required for this step.
394
+ shared_memory : int, optional, default None
395
+ The value for the size (in MiB) of the /dev/shm volume for this step.
396
+ This parameter maps to the `--shm-size` option in Docker.
818
397
  """
819
398
  ...
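To make the `run --with batch` / `run --with kubernetes` usage above concrete, here is a minimal, hypothetical sketch of a step annotated with the `@resources` decorator referenced in that docstring; the CPU and memory numbers are illustrative only:

```python
from metaflow import FlowSpec, step, resources

class ResourcesDemoFlow(FlowSpec):

    # These requirements are honored by whichever compute layer the run is
    # launched with, e.g. `python resources_demo.py run --with kubernetes`.
    @resources(cpu=2, memory=8192)
    @step
    def start(self):
        self.values = [i * i for i in range(10)]
        self.next(self.end)

    @step
    def end(self):
        print(len(self.values), "values computed")

if __name__ == "__main__":
    ResourcesDemoFlow()
```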
820
399
 
@@ -918,37 +497,260 @@ def model(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], ty
918
497
  self.next(self.end)
919
498
  ```
920
499
 
921
- - Loading models
500
+ - Loading models
501
+ ```python
502
+ @step
503
+ def train(self):
504
+ # current.model.load returns the path to the model loaded
505
+ checkpoint_path = current.model.load(
506
+ self.checkpoint_key,
507
+ )
508
+ model_path = current.model.load(
509
+ self.model,
510
+ )
511
+ self.next(self.test)
512
+ ```
513
+
514
+
515
+ Parameters
516
+ ----------
517
+ load : Union[List[str],str,List[Tuple[str,Union[str,None]]]], default: None
518
+ Artifact name/s referencing the models/checkpoints to load. Artifact names refer to the names of the instance variables set to `self`.
519
+ The artifact names given to `load` can be reference objects or reference `key` strings from objects created by `current.checkpoint` / `current.model` / `current.huggingface_hub`.
520
+ If a list of tuples is provided, the first element is the artifact name and the second element is the path the artifact needs to be unpacked on
521
+ the local filesystem. If the second element is None, the artifact will be unpacked in the current working directory.
522
+ If a string is provided, then the artifact corresponding to that name will be loaded in the current working directory.
523
+
524
+ temp_dir_root : str, default: None
525
+ The root directory under which `current.model.loaded` will store loaded models
526
+ """
527
+ ...
528
+
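A hedged sketch of the `@model(load=...)` pattern described above, following the docstring's own examples: the repo id and artifact name are placeholders, and it assumes `current.model.loaded` exposes the unpacked path keyed by the artifact name, as the `temp_dir_root` description suggests:

```python
from metaflow import FlowSpec, step, model, huggingface_hub, current

class ModelLoadDemoFlow(FlowSpec):

    @huggingface_hub
    @step
    def start(self):
        # Store a reference to a (hypothetical) Hub snapshot as an artifact.
        self.bert_ref = current.huggingface_hub.snapshot_download(
            repo_id="google-bert/bert-base-uncased",
        )
        self.next(self.train)

    # `load` names the artifact holding the reference; the files are
    # unpacked locally before the step body runs.
    @model(load=["bert_ref"])
    @step
    def train(self):
        local_path = current.model.loaded["bert_ref"]
        print("model files available under", local_path)
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    ModelLoadDemoFlow()
```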
529
+ @typing.overload
530
+ def catch(*, var: typing.Optional[str] = None, print_exception: bool = True) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
531
+ """
532
+ Specifies that the step will succeed under all circumstances.
533
+
534
+ The decorator will create an optional artifact, specified by `var`, which
535
+ contains the exception raised. You can use it to detect the presence
536
+ of errors, indicating that all happy-path artifacts produced by the step
537
+ are missing.
538
+
539
+
540
+ Parameters
541
+ ----------
542
+ var : str, optional, default None
543
+ Name of the artifact in which to store the caught exception.
544
+ If not specified, the exception is not stored.
545
+ print_exception : bool, default True
546
+ Determines whether or not the exception is printed to
547
+ stdout when caught.
548
+ """
549
+ ...
550
+
551
+ @typing.overload
552
+ def catch(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
553
+ ...
554
+
555
+ @typing.overload
556
+ def catch(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
557
+ ...
558
+
559
+ def catch(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, var: typing.Optional[str] = None, print_exception: bool = True):
560
+ """
561
+ Specifies that the step will success under all circumstances.
562
+
563
+ The decorator will create an optional artifact, specified by `var`, which
564
+ contains the exception raised. You can use it to detect the presence
565
+ of errors, indicating that all happy-path artifacts produced by the step
566
+ are missing.
567
+
568
+
569
+ Parameters
570
+ ----------
571
+ var : str, optional, default None
572
+ Name of the artifact in which to store the caught exception.
573
+ If not specified, the exception is not stored.
574
+ print_exception : bool, default True
575
+ Determines whether or not the exception is printed to
576
+ stdout when caught.
577
+ """
578
+ ...
579
+
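A minimal illustrative sketch of the `@catch` behavior documented above; the flow, step, and artifact names are hypothetical:

```python
from metaflow import FlowSpec, step, catch

class CatchDemoFlow(FlowSpec):

    # The division below always fails; @catch stores the exception in
    # self.divide_error and lets the flow proceed instead of crashing.
    @catch(var="divide_error")
    @step
    def start(self):
        self.result = 1 / 0
        self.next(self.end)

    @step
    def end(self):
        err = getattr(self, "divide_error", None)
        if err is not None:
            print("start step failed with:", err)
        else:
            print("result:", self.result)

if __name__ == "__main__":
    CatchDemoFlow()
```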
580
+ @typing.overload
581
+ def coreweave_s3_proxy(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
582
+ """
583
+ CoreWeave-specific S3 Proxy decorator for routing S3 requests through a local proxy service.
584
+ It exists to make it easier for users to know that this decorator should only be used with
585
+ a Neo Cloud like CoreWeave.
586
+ """
587
+ ...
588
+
589
+ @typing.overload
590
+ def coreweave_s3_proxy(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
591
+ ...
592
+
593
+ def coreweave_s3_proxy(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
594
+ """
595
+ CoreWeave-specific S3 Proxy decorator for routing S3 requests through a local proxy service.
596
+ It exists to make it easier for users to know that this decorator should only be used with
597
+ a Neo Cloud like CoreWeave.
598
+ """
599
+ ...
600
+
601
+ @typing.overload
602
+ def parallel(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
603
+ """
604
+ Decorator prototype for all step decorators. This function gets specialized
605
+ and imported for all decorator types by _import_plugin_decorators().
606
+ """
607
+ ...
608
+
609
+ @typing.overload
610
+ def parallel(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
611
+ ...
612
+
613
+ def parallel(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
614
+ """
615
+ Decorator prototype for all step decorators. This function gets specialized
616
+ and imported for all decorator types by _import_plugin_decorators().
617
+ """
618
+ ...
619
+
620
+ def s3_proxy(*, integration_name: typing.Optional[str] = None, write_mode: typing.Optional[str] = None, debug: typing.Optional[bool] = None) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
621
+ """
622
+ S3 Proxy decorator for routing S3 requests through a local proxy service.
623
+
624
+
625
+ Parameters
626
+ ----------
627
+ integration_name : str, optional
628
+ Name of the S3 proxy integration. If not specified, will use the only
629
+ available S3 proxy integration in the namespace (fails if multiple exist).
630
+ write_mode : str, optional
631
+ The desired behavior during write operations to the target (origin) S3 bucket.
632
+ Allowed options are:
633
+ "origin-and-cache" -> write to both the target S3 bucket and local object
634
+ storage
635
+ "origin" -> only write to the target S3 bucket
636
+ "cache" -> only write to the object storage service used for caching
637
+ debug : bool, optional
638
+ Enable debug logging for proxy operations.
639
+ """
640
+ ...
641
+
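A hedged sketch of how `@s3_proxy` might be applied; the integration name and bucket are placeholders, and routing the step's S3 traffic through the proxy is assumed to happen transparently, per the description above:

```python
from metaflow import FlowSpec, step, s3_proxy, S3

class S3ProxyDemoFlow(FlowSpec):

    # Route S3 traffic for this step through the named proxy integration,
    # writing both to the origin bucket and to the local cache.
    @s3_proxy(integration_name="my-s3-proxy", write_mode="origin-and-cache")
    @step
    def start(self):
        with S3(s3root="s3://my-origin-bucket/demo") as s3:
            s3.put("hello.txt", "hello from behind the proxy")
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    S3ProxyDemoFlow()
```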
642
+ @typing.overload
643
+ def environment(*, vars: typing.Dict[str, str] = {}) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
644
+ """
645
+ Specifies environment variables to be set prior to the execution of a step.
646
+
647
+
648
+ Parameters
649
+ ----------
650
+ vars : Dict[str, str], default {}
651
+ Dictionary of environment variables to set.
652
+ """
653
+ ...
654
+
655
+ @typing.overload
656
+ def environment(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
657
+ ...
658
+
659
+ @typing.overload
660
+ def environment(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
661
+ ...
662
+
663
+ def environment(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, vars: typing.Dict[str, str] = {}):
664
+ """
665
+ Specifies environment variables to be set prior to the execution of a step.
666
+
667
+
668
+ Parameters
669
+ ----------
670
+ vars : Dict[str, str], default {}
671
+ Dictionary of environment variables to set.
672
+ """
673
+ ...
674
+
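A small illustrative sketch of `@environment`; the variable names and values are placeholders:

```python
import os
from metaflow import FlowSpec, step, environment

class EnvDemoFlow(FlowSpec):

    # The variables are set in the task's environment before the step body runs.
    @environment(vars={"TOKENIZERS_PARALLELISM": "false", "MY_APP_MODE": "batch"})
    @step
    def start(self):
        print("mode:", os.environ["MY_APP_MODE"])
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    EnvDemoFlow()
```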
675
+ @typing.overload
676
+ def checkpoint(*, load_policy: str = 'fresh', temp_dir_root: str = None) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
677
+ """
678
+ Enables checkpointing for a step.
679
+
680
+ > Examples
681
+
682
+ - Saving Checkpoints
683
+
684
+ ```python
685
+ @checkpoint
686
+ @step
687
+ def train(self):
688
+ model = create_model(self.parameters, checkpoint_path = None)
689
+ for i in range(self.epochs):
690
+ # some training logic
691
+ loss = model.train(self.dataset)
692
+ if i % 10 == 0:
693
+ model.save(
694
+ current.checkpoint.directory,
695
+ )
696
+ # saves the contents of the `current.checkpoint.directory` as a checkpoint
697
+ # and returns a reference dictionary to the checkpoint saved in the datastore
698
+ self.latest_checkpoint = current.checkpoint.save(
699
+ name="epoch_checkpoint",
700
+ metadata={
701
+ "epoch": i,
702
+ "loss": loss,
703
+ }
704
+ )
705
+ ```
706
+
707
+ - Using Loaded Checkpoints
708
+
922
709
  ```python
710
+ @retry(times=3)
711
+ @checkpoint
923
712
  @step
924
713
  def train(self):
925
- # current.model.load returns the path to the model loaded
926
- checkpoint_path = current.model.load(
927
- self.checkpoint_key,
928
- )
929
- model_path = current.model.load(
930
- self.model,
931
- )
932
- self.next(self.test)
714
+ # Assume that the task has restarted and the previous attempt of the task
715
+ # saved a checkpoint
716
+ checkpoint_path = None
717
+ if current.checkpoint.is_loaded: # Check if a checkpoint is loaded
718
+ print("Loaded checkpoint from the previous attempt")
719
+ checkpoint_path = current.checkpoint.directory
720
+
721
+ model = create_model(self.parameters, checkpoint_path = checkpoint_path)
722
+ for i in range(self.epochs):
723
+ ...
933
724
  ```
934
725
 
935
726
 
936
727
  Parameters
937
728
  ----------
938
- load : Union[List[str],str,List[Tuple[str,Union[str,None]]]], default: None
939
- Artifact name/s referencing the models/checkpoints to load. Artifact names refer to the names of the instance variables set to `self`.
940
- The artifact names given to `load` can be reference objects or reference `key` strings from objects created by `current.checkpoint` / `current.model` / `current.huggingface_hub`.
941
- If a list of tuples is provided, the first element is the artifact name and the second element is the path the artifact needs to be unpacked on
942
- the local filesystem. If the second element is None, the artifact will be unpacked in the current working directory.
943
- If a string is provided, then the artifact corresponding to that name will be loaded in the current working directory.
729
+ load_policy : str, default: "fresh"
730
+ The policy for loading the checkpoint. The following policies are supported:
731
+ - "eager": Loads the latest available checkpoint within the namespace.
732
+ With this mode, the latest checkpoint written by any previous task (can be even a different run) of the step
733
+ will be loaded at the start of the task.
734
+ - "none": Do not load any checkpoint
735
+ - "fresh": Loads the latest checkpoint created within the running Task.
736
+ This mode helps loading checkpoints across various retry attempts of the same task.
737
+ With this mode, no checkpoint will be loaded at the start of a task but any checkpoints
738
+ created within the task will be loaded when the task retries execution after a failure.
944
739
 
945
740
  temp_dir_root : str, default: None
946
- The root directory under which `current.model.loaded` will store loaded models
741
+ The root directory under which `current.checkpoint.directory` will be created.
947
742
  """
948
743
  ...
949
744
 
950
745
  @typing.overload
951
- def checkpoint(*, load_policy: str = 'fresh', temp_dir_root: str = None) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
746
+ def checkpoint(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
747
+ ...
748
+
749
+ @typing.overload
750
+ def checkpoint(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
751
+ ...
752
+
753
+ def checkpoint(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, load_policy: str = 'fresh', temp_dir_root: str = None):
952
754
  """
953
755
  Enables checkpointing for a step.
954
756
 
@@ -1001,174 +803,349 @@ def checkpoint(*, load_policy: str = 'fresh', temp_dir_root: str = None) -> typi
1001
803
 
1002
804
  Parameters
1003
805
  ----------
1004
- load_policy : str, default: "fresh"
1005
- The policy for loading the checkpoint. The following policies are supported:
1006
- - "eager": Loads the latest available checkpoint within the namespace.
1007
- With this mode, the latest checkpoint written by any previous task (can be even a different run) of the step
1008
- will be loaded at the start of the task.
1009
- - "none": Do not load any checkpoint
1010
- - "fresh": Loads the latest checkpoint created within the running Task.
1011
- This mode helps loading checkpoints across various retry attempts of the same task.
1012
- With this mode, no checkpoint will be loaded at the start of a task but any checkpoints
1013
- created within the task will be loaded when the task retries execution after a failure.
1014
-
1015
- temp_dir_root : str, default: None
1016
- The root directory under which `current.checkpoint.directory` will be created.
1017
- """
1018
- ...
1019
-
1020
- @typing.overload
1021
- def checkpoint(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
1022
- ...
1023
-
1024
- @typing.overload
1025
- def checkpoint(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
1026
- ...
1027
-
1028
- def checkpoint(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, load_policy: str = 'fresh', temp_dir_root: str = None):
1029
- """
1030
- Enables checkpointing for a step.
806
+ load_policy : str, default: "fresh"
807
+ The policy for loading the checkpoint. The following policies are supported:
808
+ - "eager": Loads the latest available checkpoint within the namespace.
809
+ With this mode, the latest checkpoint written by any previous task (can be even a different run) of the step
810
+ will be loaded at the start of the task.
811
+ - "none": Do not load any checkpoint
812
+ - "fresh": Loads the latest checkpoint created within the running Task.
813
+ This mode helps loading checkpoints across various retry attempts of the same task.
814
+ With this mode, no checkpoint will be loaded at the start of a task but any checkpoints
815
+ created within the task will be loaded when the task retries execution after a failure.
816
+
817
+ temp_dir_root : str, default: None
818
+ The root directory under which `current.checkpoint.directory` will be created.
819
+ """
820
+ ...
821
+
822
+ @typing.overload
823
+ def secrets(*, sources: typing.List[typing.Union[str, typing.Dict[str, typing.Any]]] = [], role: typing.Optional[str] = None) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
824
+ """
825
+ Specifies secrets to be retrieved and injected as environment variables prior to
826
+ the execution of a step.
827
+
828
+
829
+ Parameters
830
+ ----------
831
+ sources : List[Union[str, Dict[str, Any]]], default: []
832
+ List of secret specs, defining how the secrets are to be retrieved
833
+ role : str, optional, default: None
834
+ Role to use for fetching secrets
835
+ """
836
+ ...
837
+
838
+ @typing.overload
839
+ def secrets(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
840
+ ...
841
+
842
+ @typing.overload
843
+ def secrets(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
844
+ ...
845
+
846
+ def secrets(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, sources: typing.List[typing.Union[str, typing.Dict[str, typing.Any]]] = [], role: typing.Optional[str] = None):
847
+ """
848
+ Specifies secrets to be retrieved and injected as environment variables prior to
849
+ the execution of a step.
850
+
851
+
852
+ Parameters
853
+ ----------
854
+ sources : List[Union[str, Dict[str, Any]]], default: []
855
+ List of secret specs, defining how the secrets are to be retrieved
856
+ role : str, optional, default: None
857
+ Role to use for fetching secrets
858
+ """
859
+ ...
860
+
861
+ @typing.overload
862
+ def retry(*, times: int = 3, minutes_between_retries: int = 2) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
863
+ """
864
+ Specifies the number of times the task corresponding
865
+ to a step needs to be retried.
866
+
867
+ This decorator is useful for handling transient errors, such as networking issues.
868
+ If your task contains operations that can't be retried safely, e.g. database updates,
869
+ it is advisable to annotate it with `@retry(times=0)`.
870
+
871
+ This can be used in conjunction with the `@catch` decorator. The `@catch`
872
+ decorator will execute a no-op task after all retries have been exhausted,
873
+ ensuring that the flow execution can continue.
874
+
875
+
876
+ Parameters
877
+ ----------
878
+ times : int, default 3
879
+ Number of times to retry this task.
880
+ minutes_between_retries : int, default 2
881
+ Number of minutes between retries.
882
+ """
883
+ ...
884
+
885
+ @typing.overload
886
+ def retry(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
887
+ ...
888
+
889
+ @typing.overload
890
+ def retry(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
891
+ ...
892
+
893
+ def retry(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, times: int = 3, minutes_between_retries: int = 2):
894
+ """
895
+ Specifies the number of times the task corresponding
896
+ to a step needs to be retried.
897
+
898
+ This decorator is useful for handling transient errors, such as networking issues.
899
+ If your task contains operations that can't be retried safely, e.g. database updates,
900
+ it is advisable to annotate it with `@retry(times=0)`.
901
+
902
+ This can be used in conjunction with the `@catch` decorator. The `@catch`
903
+ decorator will execute a no-op task after all retries have been exhausted,
904
+ ensuring that the flow execution can continue.
905
+
906
+
907
+ Parameters
908
+ ----------
909
+ times : int, default 3
910
+ Number of times to retry this task.
911
+ minutes_between_retries : int, default 2
912
+ Number of minutes between retries.
913
+ """
914
+ ...
915
+
916
+ def huggingface_hub(*, temp_dir_root: typing.Optional[str] = None, cache_scope: typing.Optional[str] = None, load: typing.Union[typing.List[str], typing.List[typing.Tuple[typing.Dict, str]], typing.List[typing.Tuple[str, str]], typing.List[typing.Dict], None]) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
917
+ """
918
+ Decorator that helps cache, version, and store models/datasets from the Hugging Face Hub.
919
+
920
+ > Examples
921
+
922
+ **Usage: creating references to models from the Hugging Face Hub that may be loaded in downstream steps**
923
+ ```python
924
+ @huggingface_hub
925
+ @step
926
+ def pull_model_from_huggingface(self):
927
+ # `current.huggingface_hub.snapshot_download` downloads the model from the Hugging Face Hub
928
+ # and saves it in the backend storage based on the model's `repo_id`. If there exists a model
929
+ # with the same `repo_id` in the backend storage, it will not download the model again. The return
930
+ # value of the function is a reference to the model in the backend storage.
931
+ # This reference can be used to load the model in the subsequent steps via `@model(load=["llama_model"])`
932
+
933
+ self.model_id = "mistralai/Mistral-7B-Instruct-v0.1"
934
+ self.llama_model = current.huggingface_hub.snapshot_download(
935
+ repo_id=self.model_id,
936
+ allow_patterns=["*.safetensors", "*.json", "tokenizer.*"],
937
+ )
938
+ self.next(self.train)
939
+ ```
940
+
941
+ **Usage: explicitly loading models at runtime from the Hugging Face Hub or from cache (from Metaflow's datastore)**
942
+ ```python
943
+ @huggingface_hub
944
+ @step
945
+ def run_training(self):
946
+ # Temporary directory (auto-cleaned on exit)
947
+ with current.huggingface_hub.load(
948
+ repo_id="google-bert/bert-base-uncased",
949
+ allow_patterns=["*.bin"],
950
+ ) as local_path:
951
+ # Use files under local_path
952
+ train_model(local_path)
953
+ ...
954
+
955
+ ```
956
+
957
+ **Usage: loading models directly from the Hugging Face Hub or from cache (from Metaflow's datastore)**
958
+ ```python
959
+ @huggingface_hub(load=["mistralai/Mistral-7B-Instruct-v0.1"])
960
+ @step
961
+ def pull_model_from_huggingface(self):
962
+ path_to_model = current.huggingface_hub.loaded["mistralai/Mistral-7B-Instruct-v0.1"]
963
+ ```
964
+
965
+ ```python
966
+ @huggingface_hub(load=[("mistralai/Mistral-7B-Instruct-v0.1", "/my-directory"), ("myorg/mistral-lora", "/my-lora-directory")])
967
+ @step
968
+ def finetune_model(self):
969
+ path_to_model = current.huggingface_hub.loaded["mistralai/Mistral-7B-Instruct-v0.1"]
970
+ # path_to_model will be /my-directory
971
+ ```
972
+
973
+ ```python
974
+ # Takes all the arguments passed to `snapshot_download`
975
+ # except for `local_dir`
976
+ @huggingface_hub(load=[
977
+ {
978
+ "repo_id": "mistralai/Mistral-7B-Instruct-v0.1",
979
+ },
980
+ {
981
+ "repo_id": "myorg/mistral-lora",
982
+ "repo_type": "model",
983
+ },
984
+ ])
985
+ @step
986
+ def finetune_model(self):
987
+ path_to_model = current.huggingface_hub.loaded["mistralai/Mistral-7B-Instruct-v0.1"]
988
+ # path_to_model points to the locally downloaded snapshot
989
+ ```
990
+
991
+
992
+ Parameters
993
+ ----------
994
+ temp_dir_root : str, optional
995
+ The root directory that will hold the temporary directory where objects will be downloaded.
1031
996
 
1032
- > Examples
997
+ cache_scope : str, optional
998
+ The scope of the cache. Can be `checkpoint` / `flow` / `global`.
1033
999
 
1034
- - Saving Checkpoints
1000
+ - `checkpoint` (default): All repos are stored like objects saved by `@checkpoint`.
1001
+ i.e., the cached path is derived from the namespace, flow, step, and Metaflow foreach iteration.
1002
+ Any repo downloaded under this scope will only be retrieved from the cache when the step runs under the same namespace in the same flow (at the same foreach index).
1035
1003
 
1036
- ```python
1037
- @checkpoint
1038
- @step
1039
- def train(self):
1040
- model = create_model(self.parameters, checkpoint_path = None)
1041
- for i in range(self.epochs):
1042
- # some training logic
1043
- loss = model.train(self.dataset)
1044
- if i % 10 == 0:
1045
- model.save(
1046
- current.checkpoint.directory,
1047
- )
1048
- # saves the contents of the `current.checkpoint.directory` as a checkpoint
1049
- # and returns a reference dictionary to the checkpoint saved in the datastore
1050
- self.latest_checkpoint = current.checkpoint.save(
1051
- name="epoch_checkpoint",
1052
- metadata={
1053
- "epoch": i,
1054
- "loss": loss,
1055
- }
1056
- )
1057
- ```
1004
+ - `flow`: All repos are cached under the flow, regardless of namespace.
1005
+ i.e., the cached path is derived solely from the flow name.
1006
+ When to use this mode:
1007
+ - Multiple users are executing the same flow and want shared access to the repos cached by the decorator.
1008
+ - Multiple versions of a flow are deployed, all needing access to the same repos cached by the decorator.
1058
1009
 
1059
- - Using Loaded Checkpoints
1010
+ - `global`: All repos are cached under a globally static path.
1011
+ i.e., the base path of the cache is static and all repos are stored under it.
1012
+ When to use this mode:
1013
+ - All repos from the Hugging Face Hub need to be shared by users across all flow executions.
1060
1014
 
1061
- ```python
1062
- @retry(times=3)
1063
- @checkpoint
1064
- @step
1065
- def train(self):
1066
- # Assume that the task has restarted and the previous attempt of the task
1067
- # saved a checkpoint
1068
- checkpoint_path = None
1069
- if current.checkpoint.is_loaded: # Check if a checkpoint is loaded
1070
- print("Loaded checkpoint from the previous attempt")
1071
- checkpoint_path = current.checkpoint.directory
1015
+ Each caching scope comes with its own trade-offs:
1016
+ - `checkpoint`:
1017
+ - Has explicit control over when caches are populated (controlled by the same flow that has the `@huggingface_hub` decorator) but ends up hitting the Hugging Face Hub more often if there are many users/namespaces/steps.
1018
+ - Since objects are written on a `namespace/flow/step` basis, the blast radius of a bad checkpoint is limited to a particular flow in a namespace.
1019
+ - `flow`:
1020
+ - Has less control over when caches are populated (can be written by any execution instance of a flow from any namespace) but results in more cache hits.
1021
+ - The blast radius of a bad checkpoint is limited to all runs of a particular flow.
1022
+ - It doesn't promote cache reuse across flows.
1023
+ - `global`:
1024
+ - Has no control over when caches are populated (can be written by any flow execution) but has the highest cache hit rate.
1025
+ - It promotes cache reuse across flows.
1026
+ - The blast radius of a bad checkpoint spans every flow that could be using a particular repo.
1072
1027
 
1073
- model = create_model(self.parameters, checkpoint_path = checkpoint_path)
1074
- for i in range(self.epochs):
1075
- ...
1076
- ```
1028
+ load: Union[List[str], List[Tuple[Dict, str]], List[Tuple[str, str]], List[Dict], None]
1029
+ The list of repos (models/datasets) to load.
1077
1030
 
1031
+ Loaded repos can be accessed via `current.huggingface_hub.loaded`. If load is set, then the following happens:
1078
1032
 
1079
- Parameters
1080
- ----------
1081
- load_policy : str, default: "fresh"
1082
- The policy for loading the checkpoint. The following policies are supported:
1083
- - "eager": Loads the latest available checkpoint within the namespace.
1084
- With this mode, the latest checkpoint written by any previous task (can be even a different run) of the step
1085
- will be loaded at the start of the task.
1086
- - "none": Do not load any checkpoint
1087
- - "fresh": Loads the latest checkpoint created within the running Task.
1088
- This mode helps loading checkpoints across various retry attempts of the same task.
1089
- With this mode, no checkpoint will be loaded at the start of a task but any checkpoints
1090
- created within the task will be loaded when the task retries execution after a failure.
1033
+ - If repo (model/dataset) is not found in the datastore:
1034
+ - Downloads the repo from Hugging Face Hub to a temporary directory (or uses specified path) for local access
1035
+ - Stores it in Metaflow's datastore (s3/gcs/azure etc.) with a unique name based on repo_type/repo_id
1036
+ - All HF models loaded for a `@step` will be cached separately under flow/step/namespace.
1091
1037
 
1092
- temp_dir_root : str, default: None
1093
- The root directory under which `current.checkpoint.directory` will be created.
1038
+ - If repo is found in the datastore:
1039
+ - Loads it directly from datastore to local path (can be temporary directory or specified path)
1094
1040
  """
1095
1041
  ...
1096
1042
 
1097
- @typing.overload
1098
- def conda(*, packages: typing.Dict[str, str] = {}, libraries: typing.Dict[str, str] = {}, python: typing.Optional[str] = None, disabled: bool = False) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
1043
+ def ollama(*, models: list, backend: str, force_pull: bool, cache_update_policy: str, force_cache_update: bool, debug: bool, circuit_breaker_config: dict, timeout_config: dict) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
1099
1044
  """
1100
- Specifies the Conda environment for the step.
1045
+ This decorator is used to run Ollama APIs as Metaflow task sidecars.
1101
1046
 
1102
- Information in this decorator will augment any
1103
- attributes set in the `@conda_base` flow-level decorator. Hence,
1104
- you can use `@conda_base` to set packages required by all
1105
- steps and use `@conda` to specify step-specific overrides.
1047
+ User code call
1048
+ --------------
1049
+ @ollama(
1050
+ models=[...],
1051
+ ...
1052
+ )
1053
+
1054
+ Valid backend options
1055
+ ---------------------
1056
+ - 'local': Run as a separate process on the local task machine.
1057
+ - (TODO) 'managed': Outerbounds hosts and selects compute provider.
1058
+ - (TODO) 'remote': Spin up separate instance to serve Ollama models.
1059
+
1060
+ Valid model options
1061
+ -------------------
1062
+ Any model here https://ollama.com/search, e.g. 'llama3.2', 'llama3.3'
1106
1063
 
1107
1064
 
1108
1065
  Parameters
1109
1066
  ----------
1110
- packages : Dict[str, str], default {}
1111
- Packages to use for this step. The key is the name of the package
1112
- and the value is the version to use.
1113
- libraries : Dict[str, str], default {}
1114
- Supported for backward compatibility. When used with packages, packages will take precedence.
1115
- python : str, optional, default None
1116
- Version of Python to use, e.g. '3.7.4'. A default value of None implies
1117
- that the version used will correspond to the version of the Python interpreter used to start the run.
1118
- disabled : bool, default False
1119
- If set to True, disables @conda.
1067
+ models: list[str]
1068
+ List of Ollama containers running models in sidecars.
1069
+ backend: str
1070
+ Determines where and how to run the Ollama process.
1071
+ force_pull: bool
1072
+ Whether to run `ollama pull` no matter what, or first check the remote cache in Metaflow datastore for this model key.
1073
+ cache_update_policy: str
1074
+ Cache update policy: "auto", "force", or "never".
1075
+ force_cache_update: bool
1076
+ Simple override for "force" cache update policy.
1077
+ debug: bool
1078
+ Whether to turn on verbose debugging logs.
1079
+ circuit_breaker_config: dict
1080
+ Configuration for circuit breaker protection. Keys: failure_threshold, recovery_timeout, reset_timeout.
1081
+ timeout_config: dict
1082
+ Configuration for various operation timeouts. Keys: pull, stop, health_check, install, server_startup.
1120
1083
  """
1121
1084
  ...
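A hedged sketch of applying `@ollama` with the options documented above; the model name is one example from ollama.com, and the unspecified options are assumed to fall back to their documented defaults:

```python
from metaflow import FlowSpec, step, ollama

class OllamaDemoFlow(FlowSpec):

    # Run an Ollama sidecar on the task machine and make sure 'llama3.2'
    # is pulled (or restored from the datastore cache) before the step runs.
    @ollama(models=["llama3.2"], backend="local")
    @step
    def start(self):
        # Talk to the sidecar however your code normally talks to Ollama,
        # e.g. an Ollama client pointed at the local server.
        self.prompt = "Summarize Metaflow in one sentence."
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    OllamaDemoFlow()
```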
1122
1085
 
1123
1086
  @typing.overload
1124
- def conda(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
1087
+ def fast_bakery_internal(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
1088
+ """
1089
+ Internal decorator to support Fast bakery
1090
+ """
1125
1091
  ...
1126
1092
 
1127
1093
  @typing.overload
1128
- def conda(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
1094
+ def fast_bakery_internal(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
1129
1095
  ...
1130
1096
 
1131
- def conda(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, packages: typing.Dict[str, str] = {}, libraries: typing.Dict[str, str] = {}, python: typing.Optional[str] = None, disabled: bool = False):
1097
+ def fast_bakery_internal(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
1132
1098
  """
1133
- Specifies the Conda environment for the step.
1099
+ Internal decorator to support Fast bakery
1100
+ """
1101
+ ...
1102
+
1103
+ @typing.overload
1104
+ def card(*, type: str = 'default', id: typing.Optional[str] = None, options: typing.Dict[str, typing.Any] = {}, timeout: int = 45) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
1105
+ """
1106
+ Creates a human-readable report, a Metaflow Card, after this step completes.
1134
1107
 
1135
- Information in this decorator will augment any
1136
- attributes set in the `@conda_base` flow-level decorator. Hence,
1137
- you can use `@conda_base` to set packages required by all
1138
- steps and use `@conda` to specify step-specific overrides.
1108
+ Note that you may add multiple `@card` decorators in a step with different parameters.
1139
1109
 
1140
1110
 
1141
1111
  Parameters
1142
1112
  ----------
1143
- packages : Dict[str, str], default {}
1144
- Packages to use for this step. The key is the name of the package
1145
- and the value is the version to use.
1146
- libraries : Dict[str, str], default {}
1147
- Supported for backward compatibility. When used with packages, packages will take precedence.
1148
- python : str, optional, default None
1149
- Version of Python to use, e.g. '3.7.4'. A default value of None implies
1150
- that the version used will correspond to the version of the Python interpreter used to start the run.
1151
- disabled : bool, default False
1152
- If set to True, disables @conda.
1113
+ type : str, default 'default'
1114
+ Card type.
1115
+ id : str, optional, default None
1116
+ If multiple cards are present, use this id to identify this card.
1117
+ options : Dict[str, Any], default {}
1118
+ Options passed to the card. The contents depend on the card type.
1119
+ timeout : int, default 45
1120
+ Interrupt reporting if it takes more than this many seconds.
1153
1121
  """
1154
1122
  ...
1155
1123
 
1156
1124
  @typing.overload
1157
- def app_deploy(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
1158
- """
1159
- Decorator prototype for all step decorators. This function gets specialized
1160
- and imported for all decorator types by _import_plugin_decorators().
1161
- """
1125
+ def card(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
1162
1126
  ...
1163
1127
 
1164
1128
  @typing.overload
1165
- def app_deploy(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
1129
+ def card(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
1166
1130
  ...
1167
1131
 
1168
- def app_deploy(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
1132
+ def card(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, type: str = 'default', id: typing.Optional[str] = None, options: typing.Dict[str, typing.Any] = {}, timeout: int = 45):
1169
1133
  """
1170
- Decorator prototype for all step decorators. This function gets specialized
1171
- and imported for all decorator types by _import_plugin_decorators().
1134
+ Creates a human-readable report, a Metaflow Card, after this step completes.
1135
+
1136
+ Note that you may add multiple `@card` decorators in a step with different parameters.
1137
+
1138
+
1139
+ Parameters
1140
+ ----------
1141
+ type : str, default 'default'
1142
+ Card type.
1143
+ id : str, optional, default None
1144
+ If multiple cards are present, use this id to identify this card.
1145
+ options : Dict[str, Any], default {}
1146
+ Options passed to the card. The contents depend on the card type.
1147
+ timeout : int, default 45
1148
+ Interrupt reporting if it takes more than this many seconds.
1172
1149
  """
1173
1150
  ...
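A short illustrative sketch of the `@card` parameters above, using a blank card addressed by its `id`; the flow name and metric are hypothetical:

```python
from metaflow import FlowSpec, step, card, current
from metaflow.cards import Markdown

class CardDemoFlow(FlowSpec):

    # Attach a 'blank' card identified as "summary" and fill it at runtime.
    @card(type="blank", id="summary", timeout=60)
    @step
    def start(self):
        self.accuracy = 0.93
        current.card["summary"].append(
            Markdown(f"## Run summary\n* accuracy: {self.accuracy}")
        )
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    CardDemoFlow()
```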
1174
1151
 
@@ -1251,13 +1228,69 @@ def kubernetes(*, cpu: int = 1, memory: int = 4096, disk: int = 10240, image: ty
1251
1228
  qos: str, default: Burstable
1252
1229
  Quality of Service class to assign to the pod. Supported values are: Guaranteed, Burstable, BestEffort
1253
1230
 
1254
- security_context: Dict[str, Any], optional, default None
1255
- Container security context. Applies to the task container. Allows the following keys:
1256
- - privileged: bool, optional, default None
1257
- - allow_privilege_escalation: bool, optional, default None
1258
- - run_as_user: int, optional, default None
1259
- - run_as_group: int, optional, default None
1260
- - run_as_non_root: bool, optional, default None
1231
+ security_context: Dict[str, Any], optional, default None
1232
+ Container security context. Applies to the task container. Allows the following keys:
1233
+ - privileged: bool, optional, default None
1234
+ - allow_privilege_escalation: bool, optional, default None
1235
+ - run_as_user: int, optional, default None
1236
+ - run_as_group: int, optional, default None
1237
+ - run_as_non_root: bool, optional, default None
1238
+ """
1239
+ ...
1240
+
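A hedged sketch of `@kubernetes` using the `security_context` keys listed above; the resource numbers and uid are illustrative only:

```python
from metaflow import FlowSpec, step, kubernetes

class K8sDemoFlow(FlowSpec):

    # Resource requests plus a non-root security context for the task container.
    @kubernetes(
        cpu=2,
        memory=8192,
        security_context={"run_as_non_root": True, "run_as_user": 1000},
    )
    @step
    def start(self):
        self.x = 42
        self.next(self.end)

    @step
    def end(self):
        print(self.x)

if __name__ == "__main__":
    K8sDemoFlow()
```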
1241
+ @typing.overload
1242
+ def test_append_card(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
1243
+ """
1244
+ A simple decorator that demonstrates using CardDecoratorInjector
1245
+ to inject a card and render simple markdown content.
1246
+ """
1247
+ ...
1248
+
1249
+ @typing.overload
1250
+ def test_append_card(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
1251
+ ...
1252
+
1253
+ def test_append_card(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
1254
+ """
1255
+ A simple decorator that demonstrates using CardDecoratorInjector
1256
+ to inject a card and render simple markdown content.
1257
+ """
1258
+ ...
1259
+
1260
+ def nvidia(*, gpu: int, gpu_type: str, queue_timeout: int) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
1261
+ """
1262
+ Specifies that this step should execute on DGX cloud.
1263
+
1264
+
1265
+ Parameters
1266
+ ----------
1267
+ gpu : int
1268
+ Number of GPUs to use.
1269
+ gpu_type : str
1270
+ Type of Nvidia GPU to use.
1271
+ queue_timeout : int
1272
+ Time to keep the job in NVCF's queue.
1273
+ """
1274
+ ...
1275
+
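An illustrative sketch of `@nvidia` with the parameters documented above; the GPU type string and the one-hour queue timeout are placeholders:

```python
from metaflow import FlowSpec, step, nvidia

class NvidiaDemoFlow(FlowSpec):

    # Request a single GPU on DGX Cloud for this step.
    @nvidia(gpu=1, gpu_type="H100", queue_timeout=3600)
    @step
    def start(self):
        self.ready = True
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    NvidiaDemoFlow()
```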
1276
+ @typing.overload
1277
+ def nebius_s3_proxy(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
1278
+ """
1279
+ Nebius-specific S3 Proxy decorator for routing S3 requests through a local proxy service.
1280
+ It exists to make it easier for users to know that this decorator should only be used with
1281
+ a Neo Cloud like Nebius.
1282
+ """
1283
+ ...
1284
+
1285
+ @typing.overload
1286
+ def nebius_s3_proxy(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
1287
+ ...
1288
+
1289
+ def nebius_s3_proxy(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None):
1290
+ """
1291
+ Nebius-specific S3 Proxy decorator for routing S3 requests through a local proxy service.
1292
+ It exists to make it easier for users to know that this decorator should only be used with
1293
+ a Neo Cloud like Nebius.
1261
1294
  """
1262
1295
  ...
1263
1296
 
@@ -1312,39 +1345,6 @@ def pypi(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typ
1312
1345
  """
1313
1346
  ...
1314
1347
 
1315
- @typing.overload
1316
- def environment(*, vars: typing.Dict[str, str] = {}) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
1317
- """
1318
- Specifies environment variables to be set prior to the execution of a step.
1319
-
1320
-
1321
- Parameters
1322
- ----------
1323
- vars : Dict[str, str], default {}
1324
- Dictionary of environment variables to set.
1325
- """
1326
- ...
1327
-
1328
- @typing.overload
1329
- def environment(f: typing.Callable[[FlowSpecDerived, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, StepFlag], None]:
1330
- ...
1331
-
1332
- @typing.overload
1333
- def environment(f: typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]) -> typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]:
1334
- ...
1335
-
1336
- def environment(f: typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None], None] = None, *, vars: typing.Dict[str, str] = {}):
1337
- """
1338
- Specifies environment variables to be set prior to the execution of a step.
1339
-
1340
-
1341
- Parameters
1342
- ----------
1343
- vars : Dict[str, str], default {}
1344
- Dictionary of environment variables to set.
1345
- """
1346
- ...
1347
-
1348
1348
  def vllm(*, model: str, backend: str, openai_api_server: bool, debug: bool, card_refresh_interval: int, max_retries: int, retry_alert_frequency: int, engine_args: dict) -> typing.Callable[[typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]], typing.Union[typing.Callable[[FlowSpecDerived, StepFlag], None], typing.Callable[[FlowSpecDerived, typing.Any, StepFlag], None]]]:
1349
1349
  """
1350
1350
  This decorator is used to run vllm APIs as Metaflow task sidecars.
@@ -1395,38 +1395,46 @@ def vllm(*, model: str, backend: str, openai_api_server: bool, debug: bool, card
1395
1395
  """
1396
1396
  ...
1397
1397
 
1398
- def project(*, name: str, branch: typing.Optional[str] = None, production: bool = False) -> typing.Callable[[typing.Type[FlowSpecDerived]], typing.Type[FlowSpecDerived]]:
1398
+ def airflow_s3_key_sensor(*, timeout: int, poke_interval: int, mode: str, exponential_backoff: bool, pool: str, soft_fail: bool, name: str, description: str, bucket_key: typing.Union[str, typing.List[str]], bucket_name: str, wildcard_match: bool, aws_conn_id: str, verify: bool) -> typing.Callable[[typing.Type[FlowSpecDerived]], typing.Type[FlowSpecDerived]]:
1399
1399
  """
1400
- Specifies what flows belong to the same project.
1401
-
1402
- A project-specific namespace is created for all flows that
1403
- use the same `@project(name)`.
1400
+ The `@airflow_s3_key_sensor` decorator attaches an Airflow [S3KeySensor](https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/_api/airflow/providers/amazon/aws/sensors/s3/index.html#airflow.providers.amazon.aws.sensors.s3.S3KeySensor)
1401
+ before the start step of the flow. This decorator only works when a flow is scheduled on Airflow
1402
+ and is compiled using `airflow create`. More than one `@airflow_s3_key_sensor` can be
1403
+ added as flow decorators. Adding more than one decorator will ensure that the `start` step
1404
+ starts only after all sensors finish.
1404
1405
 
1405
1406
 
1406
1407
  Parameters
1407
1408
  ----------
1409
+ timeout : int
1410
+ Time, in seconds before the task times out and fails. (Default: 3600)
1411
+ poke_interval : int
1412
+ Time in seconds that the job should wait in between each try. (Default: 60)
1413
+ mode : str
1414
+ How the sensor operates. Options are: { poke | reschedule }. (Default: "poke")
1415
+ exponential_backoff : bool
1416
+ Allow progressively longer waits between pokes by using an exponential backoff algorithm. (Default: True)
1417
+ pool : str
1418
+ The slot pool this task should run in;
1419
+ slot pools are a way to limit concurrency for certain tasks. (Default: None)
1420
+ soft_fail : bool
1421
+ Set to true to mark the task as SKIPPED on failure. (Default: False)
1408
1422
  name : str
1409
- Project name. Make sure that the name is unique amongst all
1410
- projects that use the same production scheduler. The name may
1411
- contain only lowercase alphanumeric characters and underscores.
1412
-
1413
- branch : Optional[str], default None
1414
- The branch to use. If not specified, the branch is set to
1415
- `user.<username>` unless `production` is set to `True`. This can
1416
- also be set on the command line using `--branch` as a top-level option.
1417
- It is an error to specify `branch` in the decorator and on the command line.
1418
-
1419
- production : bool, default False
1420
- Whether or not the branch is the production branch. This can also be set on the
1421
- command line using `--production` as a top-level option. It is an error to specify
1422
- `production` in the decorator and on the command line.
1423
- The project branch name will be:
1424
- - if `branch` is specified:
1425
- - if `production` is True: `prod.<branch>`
1426
- - if `production` is False: `test.<branch>`
1427
- - if `branch` is not specified:
1428
- - if `production` is True: `prod`
1429
- - if `production` is False: `user.<username>`
1423
+ Name of the sensor on Airflow
1424
+ description : str
1425
+ Description of sensor in the Airflow UI
1426
+ bucket_key : Union[str, List[str]]
1427
+ The key(s) being waited on. Supports full s3:// style url or relative path from root level.
1428
+ When it's specified as a full s3:// url, please leave `bucket_name` as None
1429
+ bucket_name : str
1430
+ Name of the S3 bucket. Only needed when bucket_key is not provided as a full s3:// url.
1431
+ When specified, all the keys passed to bucket_key refer to this bucket. (Default: None)
1432
+ wildcard_match : bool
1433
+ Whether the bucket_key should be interpreted as a Unix wildcard pattern. (Default: False)
1434
+ aws_conn_id : str
1435
+ A reference to the S3 connection on Airflow. (Default: None)
1436
+ verify : bool
1437
+ Whether or not to verify SSL certificates for S3 connection. (Default: None)
1430
1438
  """
1431
1439
  ...
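A hedged sketch of the flow-level `@airflow_s3_key_sensor` described above; the bucket, key, and sensor name are placeholders, `bucket_name` is omitted because a full s3:// URL is used, and the remaining options are assumed to take their documented defaults. It only takes effect when the flow is compiled with `airflow create`:

```python
from metaflow import FlowSpec, step, airflow_s3_key_sensor

# Block the start step until a (hypothetical) daily export lands in S3.
@airflow_s3_key_sensor(
    name="wait_for_daily_drop",
    description="Block start until the upstream export lands",
    bucket_key="s3://my-upstream-bucket/exports/daily.parquet",
    timeout=3600,
    poke_interval=120,
    mode="poke",
    soft_fail=False,
)
class DailyIngestFlow(FlowSpec):

    @step
    def start(self):
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    DailyIngestFlow()
```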
 
@@ -1481,117 +1489,44 @@ def schedule(f: typing.Optional[typing.Type[FlowSpecDerived]] = None, *, hourly:
  """
  ...
 
- def with_artifact_store(f: typing.Optional[typing.Type[FlowSpecDerived]] = None):
+ @typing.overload
+ def pypi_base(*, packages: typing.Dict[str, str] = {}, python: typing.Optional[str] = None) -> typing.Callable[[typing.Type[FlowSpecDerived]], typing.Type[FlowSpecDerived]]:
  """
- Allows setting external datastores to save data for the
- `@checkpoint`/`@model`/`@huggingface_hub` decorators.
-
- This decorator is useful when users wish to save data to a different datastore
- than what is configured in Metaflow. This can be for variety of reasons:
-
- 1. Data security: The objects needs to be stored in a bucket (object storage) that is not accessible by other flows.
- 2. Data Locality: The location where the task is executing is not located in the same region as the datastore.
- - Example: Metaflow datastore lives in US East, but the task is executing in Finland datacenters.
- 3. Data Lifecycle Policies: The objects need to be archived / managed separately from the Metaflow managed objects.
- - Example: Flow is training very large models that need to be stored separately and will be deleted more aggressively than the Metaflow managed objects.
-
- Usage:
- ----------
-
- - Using a custom IAM role to access the datastore.
-
- ```python
- @with_artifact_store(
- type="s3",
- config=lambda: {
- "root": "s3://my-bucket-foo/path/to/root",
- "role_arn": ROLE,
- },
- )
- class MyFlow(FlowSpec):
-
- @checkpoint
- @step
- def start(self):
- with open("my_file.txt", "w") as f:
- f.write("Hello, World!")
- self.external_bucket_checkpoint = current.checkpoint.save("my_file.txt")
- self.next(self.end)
-
- ```
-
- - Using credentials to access the s3-compatible datastore.
-
- ```python
- @with_artifact_store(
- type="s3",
- config=lambda: {
- "root": "s3://my-bucket-foo/path/to/root",
- "client_params": {
- "aws_access_key_id": os.environ.get("MY_CUSTOM_ACCESS_KEY"),
- "aws_secret_access_key": os.environ.get("MY_CUSTOM_SECRET_KEY"),
- },
- },
- )
- class MyFlow(FlowSpec):
-
- @checkpoint
- @step
- def start(self):
- with open("my_file.txt", "w") as f:
- f.write("Hello, World!")
- self.external_bucket_checkpoint = current.checkpoint.save("my_file.txt")
- self.next(self.end)
-
- ```
-
- - Accessing objects stored in external datastores after task execution.
+ Specifies the PyPI packages for all steps of the flow.
 
- ```python
- run = Run("CheckpointsTestsFlow/8992")
- with artifact_store_from(run=run, config={
- "client_params": {
- "aws_access_key_id": os.environ.get("MY_CUSTOM_ACCESS_KEY"),
- "aws_secret_access_key": os.environ.get("MY_CUSTOM_SECRET_KEY"),
- },
- }):
- with Checkpoint() as cp:
- latest = cp.list(
- task=run["start"].task
- )[0]
- print(latest)
- cp.load(
- latest,
- "test-checkpoints"
- )
+ Use `@pypi_base` to set common packages required by all
+ steps and use `@pypi` to specify step-specific overrides.
 
- task = Task("TorchTuneFlow/8484/train/53673")
- with artifact_store_from(run=run, config={
- "client_params": {
- "aws_access_key_id": os.environ.get("MY_CUSTOM_ACCESS_KEY"),
- "aws_secret_access_key": os.environ.get("MY_CUSTOM_SECRET_KEY"),
- },
- }):
- load_model(
- task.data.model_ref,
- "test-models"
- )
- ```
- Parameters:
+ Parameters
  ----------
+ packages : Dict[str, str], default: {}
+ Packages to use for this flow. The key is the name of the package
+ and the value is the version to use.
+ python : str, optional, default: None
+ Version of Python to use, e.g. '3.7.4'. A default value of None implies
+ that the version used will correspond to the version of the Python interpreter used to start the run.
+ """
+ ...
+
+ @typing.overload
+ def pypi_base(f: typing.Type[FlowSpecDerived]) -> typing.Type[FlowSpecDerived]:
+ ...
+
+ def pypi_base(f: typing.Optional[typing.Type[FlowSpecDerived]] = None, *, packages: typing.Dict[str, str] = {}, python: typing.Optional[str] = None):
+ """
+ Specifies the PyPI packages for all steps of the flow.
 
- type: str
- The type of the datastore. Can be one of 's3', 'gcs', 'azure' or any other supported metaflow Datastore.
+ Use `@pypi_base` to set common packages required by all
+ steps and use `@pypi` to specify step-specific overrides.
 
- config: dict or Callable
- Dictionary of configuration options for the datastore. The following keys are required:
- - root: The root path in the datastore where the data will be saved. (needs to be in the format expected by the datastore)
- - example: 's3://bucket-name/path/to/root'
- - example: 'gs://bucket-name/path/to/root'
- - example: 'https://myblockacc.blob.core.windows.net/metaflow/'
- - role_arn (optional): AWS IAM role to access s3 bucket (only when `type` is 's3')
- - session_vars (optional): AWS session variables to access s3 bucket (only when `type` is 's3')
- - client_params (optional): AWS client parameters to access s3 bucket (only when `type` is 's3')
+ Parameters
+ ----------
+ packages : Dict[str, str], default: {}
+ Packages to use for this flow. The key is the name of the package
+ and the value is the version to use.
+ python : str, optional, default: None
+ Version of Python to use, e.g. '3.7.4'. A default value of None implies
+ that the version used will correspond to the version of the Python interpreter used to start the run.
  """
  ...
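As a quick illustration of the `@pypi_base` overloads added above, a minimal sketch follows. The flow name and package pins are hypothetical, not recommendations.

```python
from metaflow import FlowSpec, pypi_base, step

# Hypothetical flow: pin shared dependencies once at the flow level;
# individual steps could still override them with @pypi.
@pypi_base(packages={"pandas": "2.2.2"}, python="3.11.5")
class PinnedDepsFlow(FlowSpec):

    @step
    def start(self):
        import pandas as pd  # resolved inside the @pypi_base environment
        self.n_rows = len(pd.DataFrame({"x": [1, 2, 3]}))
        self.next(self.end)

    @step
    def end(self):
        print("rows:", self.n_rows)


if __name__ == "__main__":
    PinnedDepsFlow()
```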
 
@@ -1696,13 +1631,10 @@ def trigger_on_finish(f: typing.Optional[typing.Type[FlowSpecDerived]] = None, *
  """
  ...
 
- def airflow_s3_key_sensor(*, timeout: int, poke_interval: int, mode: str, exponential_backoff: bool, pool: str, soft_fail: bool, name: str, description: str, bucket_key: typing.Union[str, typing.List[str]], bucket_name: str, wildcard_match: bool, aws_conn_id: str, verify: bool) -> typing.Callable[[typing.Type[FlowSpecDerived]], typing.Type[FlowSpecDerived]]:
+ def airflow_external_task_sensor(*, timeout: int, poke_interval: int, mode: str, exponential_backoff: bool, pool: str, soft_fail: bool, name: str, description: str, external_dag_id: str, external_task_ids: typing.List[str], allowed_states: typing.List[str], failed_states: typing.List[str], execution_delta: "datetime.timedelta", check_existence: bool) -> typing.Callable[[typing.Type[FlowSpecDerived]], typing.Type[FlowSpecDerived]]:
  """
- The `@airflow_s3_key_sensor` decorator attaches a Airflow [S3KeySensor](https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/_api/airflow/providers/amazon/aws/sensors/s3/index.html#airflow.providers.amazon.aws.sensors.s3.S3KeySensor)
- before the start step of the flow. This decorator only works when a flow is scheduled on Airflow
- and is compiled using `airflow create`. More than one `@airflow_s3_key_sensor` can be
- added as a flow decorators. Adding more than one decorator will ensure that `start` step
- starts only after all sensors finish.
+ The `@airflow_external_task_sensor` decorator attaches a Airflow [ExternalTaskSensor](https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/sensors/external_task/index.html#airflow.sensors.external_task.ExternalTaskSensor) before the start step of the flow.
+ This decorator only works when a flow is scheduled on Airflow and is compiled using `airflow create`. More than one `@airflow_external_task_sensor` can be added as a flow decorators. Adding more than one decorator will ensure that `start` step starts only after all sensors finish.
 
 
  Parameters
@@ -1724,102 +1656,170 @@ def airflow_s3_key_sensor(*, timeout: int, poke_interval: int, mode: str, expone
  Name of the sensor on Airflow
  description : str
  Description of sensor in the Airflow UI
- bucket_key : Union[str, List[str]]
- The key(s) being waited on. Supports full s3:// style url or relative path from root level.
- When it's specified as a full s3:// url, please leave `bucket_name` as None
- bucket_name : str
- Name of the S3 bucket. Only needed when bucket_key is not provided as a full s3:// url.
- When specified, all the keys passed to bucket_key refers to this bucket. (Default:None)
- wildcard_match : bool
- whether the bucket_key should be interpreted as a Unix wildcard pattern. (Default: False)
- aws_conn_id : str
- a reference to the s3 connection on Airflow. (Default: None)
- verify : bool
- Whether or not to verify SSL certificates for S3 connection. (Default: None)
+ external_dag_id : str
+ The dag_id that contains the task you want to wait for.
+ external_task_ids : List[str]
+ The list of task_ids that you want to wait for.
+ If None (default value) the sensor waits for the DAG. (Default: None)
+ allowed_states : List[str]
+ Iterable of allowed states, (Default: ['success'])
+ failed_states : List[str]
+ Iterable of failed or dis-allowed states. (Default: None)
+ execution_delta : datetime.timedelta
+ time difference with the previous execution to look at,
+ the default is the same logical date as the current task or DAG. (Default: None)
+ check_existence: bool
+ Set to True to check if the external task exists or check if
+ the DAG to wait for exists. (Default: True)
  """
  ...
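A minimal sketch of how the external-task sensor described above might be attached to a flow. The DAG id, task id, and time delta are hypothetical, and the remaining sensor options are assumed to keep the defaults listed in the docstring.

```python
from datetime import timedelta

from metaflow import FlowSpec, step, airflow_external_task_sensor

# Hypothetical flow: wait for the `publish_table` task of an upstream
# Airflow DAG named `nightly_etl` that runs one hour before this flow.
@airflow_external_task_sensor(
    external_dag_id="nightly_etl",
    external_task_ids=["publish_table"],
    execution_delta=timedelta(hours=1),
)
class DownstreamReportFlow(FlowSpec):

    @step
    def start(self):
        # Reached only after the upstream task reports success
        # (when the flow is deployed via `airflow create`).
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    DownstreamReportFlow()
```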
 
- @typing.overload
- def pypi_base(*, packages: typing.Dict[str, str] = {}, python: typing.Optional[str] = None) -> typing.Callable[[typing.Type[FlowSpecDerived]], typing.Type[FlowSpecDerived]]:
+ def with_artifact_store(f: typing.Optional[typing.Type[FlowSpecDerived]] = None):
  """
- Specifies the PyPI packages for all steps of the flow.
+ Allows setting external datastores to save data for the
+ `@checkpoint`/`@model`/`@huggingface_hub` decorators.
 
- Use `@pypi_base` to set common packages required by all
- steps and use `@pypi` to specify step-specific overrides.
+ This decorator is useful when users wish to save data to a different datastore
+ than what is configured in Metaflow. This can be for variety of reasons:
 
- Parameters
+ 1. Data security: The objects needs to be stored in a bucket (object storage) that is not accessible by other flows.
+ 2. Data Locality: The location where the task is executing is not located in the same region as the datastore.
+ - Example: Metaflow datastore lives in US East, but the task is executing in Finland datacenters.
+ 3. Data Lifecycle Policies: The objects need to be archived / managed separately from the Metaflow managed objects.
+ - Example: Flow is training very large models that need to be stored separately and will be deleted more aggressively than the Metaflow managed objects.
+
+ Usage:
  ----------
- packages : Dict[str, str], default: {}
- Packages to use for this flow. The key is the name of the package
- and the value is the version to use.
- python : str, optional, default: None
- Version of Python to use, e.g. '3.7.4'. A default value of None implies
- that the version used will correspond to the version of the Python interpreter used to start the run.
- """
- ...
-
- @typing.overload
- def pypi_base(f: typing.Type[FlowSpecDerived]) -> typing.Type[FlowSpecDerived]:
- ...
-
- def pypi_base(f: typing.Optional[typing.Type[FlowSpecDerived]] = None, *, packages: typing.Dict[str, str] = {}, python: typing.Optional[str] = None):
- """
- Specifies the PyPI packages for all steps of the flow.
 
- Use `@pypi_base` to set common packages required by all
- steps and use `@pypi` to specify step-specific overrides.
+ - Using a custom IAM role to access the datastore.
 
- Parameters
+ ```python
+ @with_artifact_store(
+ type="s3",
+ config=lambda: {
+ "root": "s3://my-bucket-foo/path/to/root",
+ "role_arn": ROLE,
+ },
+ )
+ class MyFlow(FlowSpec):
+
+ @checkpoint
+ @step
+ def start(self):
+ with open("my_file.txt", "w") as f:
+ f.write("Hello, World!")
+ self.external_bucket_checkpoint = current.checkpoint.save("my_file.txt")
+ self.next(self.end)
+
+ ```
+
+ - Using credentials to access the s3-compatible datastore.
+
+ ```python
+ @with_artifact_store(
+ type="s3",
+ config=lambda: {
+ "root": "s3://my-bucket-foo/path/to/root",
+ "client_params": {
+ "aws_access_key_id": os.environ.get("MY_CUSTOM_ACCESS_KEY"),
+ "aws_secret_access_key": os.environ.get("MY_CUSTOM_SECRET_KEY"),
+ },
+ },
+ )
+ class MyFlow(FlowSpec):
+
+ @checkpoint
+ @step
+ def start(self):
+ with open("my_file.txt", "w") as f:
+ f.write("Hello, World!")
+ self.external_bucket_checkpoint = current.checkpoint.save("my_file.txt")
+ self.next(self.end)
+
+ ```
+
+ - Accessing objects stored in external datastores after task execution.
+
+ ```python
+ run = Run("CheckpointsTestsFlow/8992")
+ with artifact_store_from(run=run, config={
+ "client_params": {
+ "aws_access_key_id": os.environ.get("MY_CUSTOM_ACCESS_KEY"),
+ "aws_secret_access_key": os.environ.get("MY_CUSTOM_SECRET_KEY"),
+ },
+ }):
+ with Checkpoint() as cp:
+ latest = cp.list(
+ task=run["start"].task
+ )[0]
+ print(latest)
+ cp.load(
+ latest,
+ "test-checkpoints"
+ )
+
+ task = Task("TorchTuneFlow/8484/train/53673")
+ with artifact_store_from(run=run, config={
+ "client_params": {
+ "aws_access_key_id": os.environ.get("MY_CUSTOM_ACCESS_KEY"),
+ "aws_secret_access_key": os.environ.get("MY_CUSTOM_SECRET_KEY"),
+ },
+ }):
+ load_model(
+ task.data.model_ref,
+ "test-models"
+ )
+ ```
+ Parameters:
  ----------
- packages : Dict[str, str], default: {}
- Packages to use for this flow. The key is the name of the package
- and the value is the version to use.
- python : str, optional, default: None
- Version of Python to use, e.g. '3.7.4'. A default value of None implies
- that the version used will correspond to the version of the Python interpreter used to start the run.
+
+ type: str
+ The type of the datastore. Can be one of 's3', 'gcs', 'azure' or any other supported metaflow Datastore.
+
+ config: dict or Callable
+ Dictionary of configuration options for the datastore. The following keys are required:
+ - root: The root path in the datastore where the data will be saved. (needs to be in the format expected by the datastore)
+ - example: 's3://bucket-name/path/to/root'
+ - example: 'gs://bucket-name/path/to/root'
+ - example: 'https://myblockacc.blob.core.windows.net/metaflow/'
+ - role_arn (optional): AWS IAM role to access s3 bucket (only when `type` is 's3')
+ - session_vars (optional): AWS session variables to access s3 bucket (only when `type` is 's3')
+ - client_params (optional): AWS client parameters to access s3 bucket (only when `type` is 's3')
  """
  ...
 
- def airflow_external_task_sensor(*, timeout: int, poke_interval: int, mode: str, exponential_backoff: bool, pool: str, soft_fail: bool, name: str, description: str, external_dag_id: str, external_task_ids: typing.List[str], allowed_states: typing.List[str], failed_states: typing.List[str], execution_delta: "datetime.timedelta", check_existence: bool) -> typing.Callable[[typing.Type[FlowSpecDerived]], typing.Type[FlowSpecDerived]]:
+ def project(*, name: str, branch: typing.Optional[str] = None, production: bool = False) -> typing.Callable[[typing.Type[FlowSpecDerived]], typing.Type[FlowSpecDerived]]:
  """
- The `@airflow_external_task_sensor` decorator attaches a Airflow [ExternalTaskSensor](https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/sensors/external_task/index.html#airflow.sensors.external_task.ExternalTaskSensor) before the start step of the flow.
- This decorator only works when a flow is scheduled on Airflow and is compiled using `airflow create`. More than one `@airflow_external_task_sensor` can be added as a flow decorators. Adding more than one decorator will ensure that `start` step starts only after all sensors finish.
+ Specifies what flows belong to the same project.
+
+ A project-specific namespace is created for all flows that
+ use the same `@project(name)`.
 
 
  Parameters
  ----------
- timeout : int
- Time, in seconds before the task times out and fails. (Default: 3600)
- poke_interval : int
- Time in seconds that the job should wait in between each try. (Default: 60)
- mode : str
- How the sensor operates. Options are: { poke | reschedule }. (Default: "poke")
- exponential_backoff : bool
- allow progressive longer waits between pokes by using exponential backoff algorithm. (Default: True)
- pool : str
- the slot pool this task should run in,
- slot pools are a way to limit concurrency for certain tasks. (Default:None)
- soft_fail : bool
- Set to true to mark the task as SKIPPED on failure. (Default: False)
  name : str
- Name of the sensor on Airflow
- description : str
- Description of sensor in the Airflow UI
- external_dag_id : str
- The dag_id that contains the task you want to wait for.
- external_task_ids : List[str]
- The list of task_ids that you want to wait for.
- If None (default value) the sensor waits for the DAG. (Default: None)
- allowed_states : List[str]
- Iterable of allowed states, (Default: ['success'])
- failed_states : List[str]
- Iterable of failed or dis-allowed states. (Default: None)
- execution_delta : datetime.timedelta
- time difference with the previous execution to look at,
- the default is the same logical date as the current task or DAG. (Default: None)
- check_existence: bool
- Set to True to check if the external task exists or check if
- the DAG to wait for exists. (Default: True)
+ Project name. Make sure that the name is unique amongst all
+ projects that use the same production scheduler. The name may
+ contain only lowercase alphanumeric characters and underscores.
+
+ branch : Optional[str], default None
+ The branch to use. If not specified, the branch is set to
+ `user.<username>` unless `production` is set to `True`. This can
+ also be set on the command line using `--branch` as a top-level option.
+ It is an error to specify `branch` in the decorator and on the command line.
+
+ production : bool, default False
+ Whether or not the branch is the production branch. This can also be set on the
+ command line using `--production` as a top-level option. It is an error to specify
+ `production` in the decorator and on the command line.
+ The project branch name will be:
+ - if `branch` is specified:
+ - if `production` is True: `prod.<branch>`
+ - if `production` is False: `test.<branch>`
+ - if `branch` is not specified:
+ - if `production` is True: `prod`
+ - if `production` is False: `user.<username>`
  """
  ...
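To ground the branch-naming rules above, a minimal sketch follows; the project and flow names are hypothetical, and the effective branch is determined by the top-level `--branch`/`--production` options described in the docstring.

```python
from metaflow import FlowSpec, project, step

# Hypothetical flow: all deployments share the `fraud_model` project namespace.
# Following the rules above, the effective branch would be, e.g.:
#   python flow.py run                    -> user.<username>
#   python flow.py --branch staging run   -> test.staging
#   python flow.py --production run       -> prod
@project(name="fraud_model")
class TrainingFlow(FlowSpec):

    @step
    def start(self):
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    TrainingFlow()
```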