flowcept 0.8.12__tar.gz → 0.9.2__tar.gz

Files changed (209)
  1. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/run_examples.sh +1 -0
  2. {flowcept-0.8.12 → flowcept-0.9.2}/.gitignore +1 -0
  3. {flowcept-0.8.12 → flowcept-0.9.2}/Makefile +20 -0
  4. flowcept-0.9.2/PKG-INFO +589 -0
  5. flowcept-0.9.2/README.md +461 -0
  6. flowcept-0.9.2/docs/api-reference.rst +61 -0
  7. flowcept-0.9.2/docs/architecture.rst +79 -0
  8. flowcept-0.9.2/docs/cli-reference.rst +36 -0
  9. {flowcept-0.8.12 → flowcept-0.9.2}/docs/conf.py +3 -2
  10. {flowcept-0.8.12 → flowcept-0.9.2}/docs/contributing.rst +4 -0
  11. flowcept-0.9.2/docs/img/architecture-diagram.png +0 -0
  12. flowcept-0.9.2/docs/img/flowcept-logo-dark.png +0 -0
  13. flowcept-0.9.2/docs/img/flowcept-logo.png +0 -0
  14. flowcept-0.9.2/docs/index.rst +28 -0
  15. flowcept-0.9.2/docs/prov_capture.rst +458 -0
  16. flowcept-0.9.2/docs/prov_storage.rst +108 -0
  17. flowcept-0.9.2/docs/quick_start.rst +111 -0
  18. flowcept-0.9.2/docs/setup.rst +228 -0
  19. flowcept-0.9.2/docs/telemetry_capture.rst +293 -0
  20. flowcept-0.9.2/examples/consumers/simple_publisher.py +25 -0
  21. {flowcept-0.8.12 → flowcept-0.9.2}/pyproject.toml +1 -1
  22. {flowcept-0.8.12 → flowcept-0.9.2}/resources/sample_settings.yaml +2 -2
  23. {flowcept-0.8.12 → flowcept-0.9.2}/resources/simple_redis_consumer.py +3 -0
  24. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/__init__.py +5 -0
  25. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/prompts/general_prompts.py +1 -1
  26. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/cli.py +41 -42
  27. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/autoflush_buffer.py +5 -0
  28. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/docdb_dao/lmdb_dao.py +4 -1
  29. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/mq_dao/mq_dao_base.py +1 -0
  30. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/flowcept_dataclasses/task_object.py +86 -6
  31. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/flowcept_dataclasses/workflow_object.py +41 -1
  32. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/configs.py +1 -1
  33. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowcept_api/flowcept_controller.py +9 -1
  34. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/consumers/base_consumer.py +5 -0
  35. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/consumers/consumer_utils.py +11 -3
  36. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/telemetry_capture.py +2 -1
  37. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/instrumentation/task_capture.py +16 -16
  38. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/version.py +1 -1
  39. {flowcept-0.8.12 → flowcept-0.9.2}/tests/instrumentation_tests/flowcept_explicit_tasks.py +19 -0
  40. {flowcept-0.8.12 → flowcept-0.9.2}/tests/instrumentation_tests/flowcept_task_decorator_test.py +0 -1
  41. flowcept-0.8.12/PKG-INFO +0 -439
  42. flowcept-0.8.12/README.md +0 -312
  43. flowcept-0.8.12/docs/api-reference.rst +0 -10
  44. flowcept-0.8.12/docs/getstarted.rst +0 -105
  45. flowcept-0.8.12/docs/index.rst +0 -18
  46. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/checks.yml +0 -0
  47. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/create-release-n-publish.yml +0 -0
  48. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/run-llm-tests.yml +0 -0
  49. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/run-tests-all-dbs.yml +0 -0
  50. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/run-tests-in-container.yml +0 -0
  51. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/run-tests-kafka.yml +0 -0
  52. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/run-tests-py313.yml +0 -0
  53. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/run-tests-simple.yml +0 -0
  54. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/run-tests.yml +0 -0
  55. {flowcept-0.8.12 → flowcept-0.9.2}/.github/workflows/version_bumper.py +0 -0
  56. {flowcept-0.8.12 → flowcept-0.9.2}/.readthedocs.yaml +0 -0
  57. {flowcept-0.8.12 → flowcept-0.9.2}/CONTRIBUTING.md +0 -0
  58. {flowcept-0.8.12 → flowcept-0.9.2}/LICENSE +0 -0
  59. {flowcept-0.8.12 → flowcept-0.9.2}/deployment/Dockerfile +0 -0
  60. {flowcept-0.8.12 → flowcept-0.9.2}/deployment/compose-grafana.yml +0 -0
  61. {flowcept-0.8.12 → flowcept-0.9.2}/deployment/compose-kafka.yml +0 -0
  62. {flowcept-0.8.12 → flowcept-0.9.2}/deployment/compose-mofka.yml +0 -0
  63. {flowcept-0.8.12 → flowcept-0.9.2}/deployment/compose-mongo.yml +0 -0
  64. {flowcept-0.8.12 → flowcept-0.9.2}/deployment/compose.yml +0 -0
  65. {flowcept-0.8.12 → flowcept-0.9.2}/docs/schemas.rst +0 -0
  66. {flowcept-0.8.12 → flowcept-0.9.2}/docs/task_schema.rst +0 -0
  67. {flowcept-0.8.12 → flowcept-0.9.2}/docs/workflow_schema.rst +0 -0
  68. {flowcept-0.8.12 → flowcept-0.9.2}/examples/agents/a2a/README.md +0 -0
  69. {flowcept-0.8.12 → flowcept-0.9.2}/examples/agents/a2a/agent1.py +0 -0
  70. {flowcept-0.8.12 → flowcept-0.9.2}/examples/agents/a2a/agent2.py +0 -0
  71. {flowcept-0.8.12 → flowcept-0.9.2}/examples/agents/aec_agent_context_manager.py +0 -0
  72. {flowcept-0.8.12 → flowcept-0.9.2}/examples/agents/aec_agent_mock.py +0 -0
  73. {flowcept-0.8.12 → flowcept-0.9.2}/examples/agents/aec_prompts.py +0 -0
  74. {flowcept-0.8.12 → flowcept-0.9.2}/examples/agents/opt_driver_mock.py +0 -0
  75. {flowcept-0.8.12 → flowcept-0.9.2}/examples/consumers/ping_pong_example.py +0 -0
  76. {flowcept-0.8.12 → flowcept-0.9.2}/examples/consumers/simple_consumer.py +0 -0
  77. {flowcept-0.8.12 → flowcept-0.9.2}/examples/dask_example.py +0 -0
  78. {flowcept-0.8.12 → flowcept-0.9.2}/examples/distributed_consumer_example.py +0 -0
  79. {flowcept-0.8.12 → flowcept-0.9.2}/examples/instrumented_loop_example.py +0 -0
  80. {flowcept-0.8.12 → flowcept-0.9.2}/examples/instrumented_simple_example.py +0 -0
  81. {flowcept-0.8.12 → flowcept-0.9.2}/examples/llm_complex/README.md +0 -0
  82. {flowcept-0.8.12 → flowcept-0.9.2}/examples/llm_complex/custom_provenance_id_mapping.yaml +0 -0
  83. {flowcept-0.8.12 → flowcept-0.9.2}/examples/llm_complex/llm_dataprep.py +0 -0
  84. {flowcept-0.8.12 → flowcept-0.9.2}/examples/llm_complex/llm_main_example.py +0 -0
  85. {flowcept-0.8.12 → flowcept-0.9.2}/examples/llm_complex/llm_model.py +0 -0
  86. {flowcept-0.8.12 → flowcept-0.9.2}/examples/llm_complex/llm_test_runner.py +0 -0
  87. {flowcept-0.8.12 → flowcept-0.9.2}/examples/mlflow_example.py +0 -0
  88. {flowcept-0.8.12 → flowcept-0.9.2}/examples/single_layer_perceptron_example.py +0 -0
  89. {flowcept-0.8.12 → flowcept-0.9.2}/examples/start_here.py +0 -0
  90. {flowcept-0.8.12 → flowcept-0.9.2}/examples/tensorboard_example.py +0 -0
  91. {flowcept-0.8.12 → flowcept-0.9.2}/examples/unmanaged/main.py +0 -0
  92. {flowcept-0.8.12 → flowcept-0.9.2}/examples/unmanaged/simple_task.py +0 -0
  93. {flowcept-0.8.12 → flowcept-0.9.2}/notebooks/analytics.ipynb +0 -0
  94. {flowcept-0.8.12 → flowcept-0.9.2}/notebooks/dask.ipynb +0 -0
  95. {flowcept-0.8.12 → flowcept-0.9.2}/notebooks/dask_from_CLI.ipynb +0 -0
  96. {flowcept-0.8.12 → flowcept-0.9.2}/notebooks/mlflow.ipynb +0 -0
  97. {flowcept-0.8.12 → flowcept-0.9.2}/notebooks/reset_dask_nb_exec_counts.py +0 -0
  98. {flowcept-0.8.12 → flowcept-0.9.2}/notebooks/tensorboard.ipynb +0 -0
  99. {flowcept-0.8.12 → flowcept-0.9.2}/resources/mofka/bedrock_setup.sh +0 -0
  100. {flowcept-0.8.12 → flowcept-0.9.2}/resources/mofka/consumer.py +0 -0
  101. {flowcept-0.8.12 → flowcept-0.9.2}/resources/mofka/mofka-requirements.yaml +0 -0
  102. {flowcept-0.8.12 → flowcept-0.9.2}/resources/mofka/mofka_config.json +0 -0
  103. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/__init__.py +0 -0
  104. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/agent_client.py +0 -0
  105. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/agents_utils.py +0 -0
  106. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/dynamic_schema_tracker.py +0 -0
  107. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/flowcept_agent.py +0 -0
  108. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/flowcept_ctx_manager.py +0 -0
  109. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/gui/__init__.py +0 -0
  110. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/gui/agent_gui.py +0 -0
  111. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/gui/gui_utils.py +0 -0
  112. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/llms/__init__.py +0 -0
  113. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/llms/claude_gcp.py +0 -0
  114. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/llms/gemini25.py +0 -0
  115. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/prompts/__init__.py +0 -0
  116. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/prompts/in_memory_query_prompts.py +0 -0
  117. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/tools/__init__.py +0 -0
  118. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/tools/general_tools.py +0 -0
  119. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/tools/in_memory_queries/__init__.py +0 -0
  120. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/tools/in_memory_queries/in_memory_queries_tools.py +0 -0
  121. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/agents/tools/in_memory_queries/pandas_agent_utils.py +0 -0
  122. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/analytics/__init__.py +0 -0
  123. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/analytics/analytics_utils.py +0 -0
  124. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/analytics/data_augmentation.py +0 -0
  125. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/analytics/plot.py +0 -0
  126. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/__init__.py +0 -0
  127. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/__init__.py +0 -0
  128. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/docdb_dao/__init__.py +0 -0
  129. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/docdb_dao/docdb_dao_base.py +0 -0
  130. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/docdb_dao/mongodb_dao.py +0 -0
  131. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/keyvalue_dao.py +0 -0
  132. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/mq_dao/__init__.py +0 -0
  133. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/mq_dao/mq_dao_kafka.py +0 -0
  134. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/mq_dao/mq_dao_mofka.py +0 -0
  135. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/mq_dao/mq_dao_redis.py +0 -0
  136. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/daos/redis_conn.py +0 -0
  137. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/flowcept_dataclasses/__init__.py +0 -0
  138. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/flowcept_dataclasses/base_settings_dataclasses.py +0 -0
  139. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/flowcept_dataclasses/telemetry.py +0 -0
  140. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/flowcept_logger.py +0 -0
  141. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/query_utils.py +0 -0
  142. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/settings_factory.py +0 -0
  143. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/task_data_preprocess.py +0 -0
  144. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/utils.py +0 -0
  145. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/commons/vocabulary.py +0 -0
  146. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowcept_api/__init__.py +0 -0
  147. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowcept_api/db_api.py +0 -0
  148. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowcept_api/task_query_api.py +0 -0
  149. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowcept_webserver/__init__.py +0 -0
  150. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowcept_webserver/app.py +0 -0
  151. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowcept_webserver/resources/__init__.py +0 -0
  152. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowcept_webserver/resources/query_rsrc.py +0 -0
  153. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowcept_webserver/resources/task_messages_rsrc.py +0 -0
  154. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/__init__.py +0 -0
  155. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/__init__.py +0 -0
  156. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/base_interceptor.py +0 -0
  157. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/brokers/__init__.py +0 -0
  158. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/brokers/mqtt_interceptor.py +0 -0
  159. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/dask/__init__.py +0 -0
  160. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/dask/dask_dataclasses.py +0 -0
  161. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/dask/dask_interceptor.py +0 -0
  162. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/dask/dask_plugins.py +0 -0
  163. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/instrumentation_interceptor.py +0 -0
  164. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/interceptor_state_manager.py +0 -0
  165. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/mlflow/__init__.py +0 -0
  166. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/mlflow/interception_event_handler.py +0 -0
  167. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/mlflow/mlflow_dao.py +0 -0
  168. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/mlflow/mlflow_dataclasses.py +0 -0
  169. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/mlflow/mlflow_interceptor.py +0 -0
  170. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/tensorboard/__init__.py +0 -0
  171. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/tensorboard/tensorboard_dataclasses.py +0 -0
  172. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/adapters/tensorboard/tensorboard_interceptor.py +0 -0
  173. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/consumers/__init__.py +0 -0
  174. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/consumers/agent/__init__.py +0 -0
  175. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/consumers/agent/base_agent_context_manager.py +0 -0
  176. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/flowceptor/consumers/document_inserter.py +0 -0
  177. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/instrumentation/__init__.py +0 -0
  178. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/instrumentation/flowcept_agent_task.py +0 -0
  179. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/instrumentation/flowcept_decorator.py +0 -0
  180. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/instrumentation/flowcept_loop.py +0 -0
  181. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/instrumentation/flowcept_task.py +0 -0
  182. {flowcept-0.8.12 → flowcept-0.9.2}/src/flowcept/instrumentation/flowcept_torch.py +0 -0
  183. {flowcept-0.8.12 → flowcept-0.9.2}/tests/__init__.py +0 -0
  184. {flowcept-0.8.12 → flowcept-0.9.2}/tests/adapters/__init__.py +0 -0
  185. {flowcept-0.8.12 → flowcept-0.9.2}/tests/adapters/dask_test_utils.py +0 -0
  186. {flowcept-0.8.12 → flowcept-0.9.2}/tests/adapters/test_broker.py +0 -0
  187. {flowcept-0.8.12 → flowcept-0.9.2}/tests/adapters/test_dask.py +0 -0
  188. {flowcept-0.8.12 → flowcept-0.9.2}/tests/adapters/test_dask_with_context_mgmt.py +0 -0
  189. {flowcept-0.8.12 → flowcept-0.9.2}/tests/adapters/test_file_observer.py +0 -0
  190. {flowcept-0.8.12 → flowcept-0.9.2}/tests/adapters/test_mlflow.py +0 -0
  191. {flowcept-0.8.12 → flowcept-0.9.2}/tests/adapters/test_tensorboard.py +0 -0
  192. {flowcept-0.8.12 → flowcept-0.9.2}/tests/api/__init__.py +0 -0
  193. {flowcept-0.8.12 → flowcept-0.9.2}/tests/api/db_api_test.py +0 -0
  194. {flowcept-0.8.12 → flowcept-0.9.2}/tests/api/flowcept_api_test.py +0 -0
  195. {flowcept-0.8.12 → flowcept-0.9.2}/tests/api/sample_data.json +0 -0
  196. {flowcept-0.8.12 → flowcept-0.9.2}/tests/api/sample_data_with_telemetry_and_rai.json +0 -0
  197. {flowcept-0.8.12 → flowcept-0.9.2}/tests/api/task_query_api_test.py +0 -0
  198. {flowcept-0.8.12 → flowcept-0.9.2}/tests/doc_db_inserter/__init__.py +0 -0
  199. {flowcept-0.8.12 → flowcept-0.9.2}/tests/doc_db_inserter/doc_db_inserter_test.py +0 -0
  200. {flowcept-0.8.12 → flowcept-0.9.2}/tests/instrumentation_tests/__init__.py +0 -0
  201. {flowcept-0.8.12 → flowcept-0.9.2}/tests/instrumentation_tests/flowcept_loop_test.py +0 -0
  202. {flowcept-0.8.12 → flowcept-0.9.2}/tests/instrumentation_tests/ml_tests/__init__.py +0 -0
  203. {flowcept-0.8.12 → flowcept-0.9.2}/tests/instrumentation_tests/ml_tests/dl_trainer.py +0 -0
  204. {flowcept-0.8.12 → flowcept-0.9.2}/tests/instrumentation_tests/ml_tests/ml_decorator_dask_test.py +0 -0
  205. {flowcept-0.8.12 → flowcept-0.9.2}/tests/instrumentation_tests/ml_tests/ml_decorator_test.py +0 -0
  206. {flowcept-0.8.12 → flowcept-0.9.2}/tests/misc_tests/__init__.py +0 -0
  207. {flowcept-0.8.12 → flowcept-0.9.2}/tests/misc_tests/log_test.py +0 -0
  208. {flowcept-0.8.12 → flowcept-0.9.2}/tests/misc_tests/singleton_test.py +0 -0
  209. {flowcept-0.8.12 → flowcept-0.9.2}/tests/misc_tests/telemetry_test.py +0 -0
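The listing above encodes each file's status in two places: the root prefix (a `{old → new}` pair for files present in both releases, a single version directory for files present in only one) and the `+N -M` line counts. The following is a minimal sketch of how such entries could be classified. The `classify` helper and its status labels are illustrative assumptions, not part of Flowcept or of the registry's tooling.

```python
import re

# One listing entry: optional "N. " numbering, a root (either a
# "{old → new}" pair or a single version directory), the file path,
# and the "+added -removed" line counts.
ENTRY = re.compile(
    r"^\s*(?:\d+\.\s+)?"
    r"(?:\{(?P<old>[^ }]+) → (?P<new>[^ }]+)\}|(?P<only>[^/{}\s]+))/"
    r"(?P<path>\S+) \+(?P<added>\d+) -(?P<removed>\d+)\s*$"
)


def classify(entry, old_root="flowcept-0.8.12", new_root="flowcept-0.9.2"):
    """Return (path, status) for one entry of the file listing."""
    match = ENTRY.match(entry)
    if match is None:
        raise ValueError(f"unrecognized entry: {entry!r}")
    path = match["path"]
    only = match["only"]
    if only is not None:
        # The file exists under a single root, so it was added or removed.
        if only == new_root:
            return path, "added"
        if only == old_root:
            return path, "removed"
        raise ValueError(f"unknown root: {only!r}")
    # The file exists in both releases; the counts tell whether it changed.
    changed = int(match["added"]) > 0 or int(match["removed"]) > 0
    status = "modified" if changed else "unchanged"
    return path, status


print(classify("3. {flowcept-0.8.12 → flowcept-0.9.2}/Makefile +20 -0"))
print(classify("11. flowcept-0.9.2/docs/img/architecture-diagram.png +0 -0"))
```

Note that binary files such as the added images report `+0 -0`, so the root prefix, not the line counts, is what marks them as added.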
@@ -35,6 +35,7 @@ run_test() {
  
  pip install . > /dev/null 2>&1
  pip install .[extras] > /dev/null 2>&1
+ pip install .[lmdb]
  pip list
  
  if [[ "$with_mongo" == "true" ]]; then
@@ -32,3 +32,4 @@ core.*
  *.csv
  
  uv.lock
+ examples/flowcept_messages.jsonl
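The newly ignored `flowcept_messages.jsonl` shares its name with the provenance file that the 0.9.2 README's quickstart writes. The README reads it back with `Flowcept.read_messages_file()`. As a rough illustration of the file format only, the sketch below reads a JSON Lines file with the standard library, assuming one JSON object per line. The `read_jsonl` helper is hypothetical and is not Flowcept's API, and the demo records are trimmed from the README's sample output.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Return one dict per non-empty line of a JSON Lines file."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Demo records, trimmed from the sample output shown in the README.
sample = [
    {"activity_id": "sum_one", "used": {"i1": 3}, "generated": {"o1": 4}},
    {"activity_id": "mult_two", "used": {"o1": 4}, "generated": {"o2": 8}},
]

# Write the records to a throwaway file, then read them back.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "flowcept_messages.jsonl"
    demo_path.write_text(
        "\n".join(json.dumps(record) for record in sample) + "\n",
        encoding="utf-8",
    )
    messages = read_jsonl(demo_path)

print([message["activity_id"] for message in messages])
```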
@@ -8,6 +8,10 @@ help:
  	@printf "\033[32mservices-stop\033[0m stop the running Docker services\n"
  	@printf "\033[32mservices-mongo\033[0m run services with MongoDB using Docker\n"
  	@printf "\033[32mservices-stop-mongo\033[0m stop MongoDB services and remove attached volumes\n"
+ 	@printf "\033[32mservices-kafka\033[0m run services with Kafka using Docker\n"
+ 	@printf "\033[32mservices-stop-kafka\033[0m stop Kafka services and remove attached volumes\n"
+ 	@printf "\033[32mservices-mofka\033[0m run services with Mofka using Docker\n"
+ 	@printf "\033[32mservices-stop-mofka\033[0m stop Mofka services and remove attached volumes\n"
  	@printf "\033[32mtests\033[0m run unit tests with pytest\n"
  	@printf "\033[32mtests-in-container\033[0m run unit tests with pytest inside Flowcept's container\n"
  	@printf "\033[32mtests-in-container-mongo\033[0m run unit tests inside container with MongoDB\n"
@@ -90,6 +94,22 @@ install_dev_agent: # Run this to fix python env problems in the MCP studio env
  	mcp install src/flowcept/flowceptor/adapters/agents/flowcept_agent.py
  
  
+ # Run services with Kafka using Docker
+ services-kafka:
+ 	docker compose --file deployment/compose-kafka.yml up --detach
+
+ # Stop Kafka services and remove attached volumes
+ services-stop-kafka:
+ 	docker compose --file deployment/compose-kafka.yml down --volumes
+
+ # Run services with Mofka using Docker
+ services-mofka:
+ 	docker compose --file deployment/compose-mofka.yml up --detach
+
+ # Stop Mofka services and remove attached volumes
+ services-stop-mofka:
+ 	docker compose --file deployment/compose-mofka.yml down --volumes
+
  # Run unit tests using pytest
  .PHONY: tests
  tests:
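The four new Makefile targets differ only in the compose file they point at and in whether they bring services up or down. The sketch below rebuilds the same command lines in Python to make that pattern explicit. The `compose_command` helper and the `COMPOSE_FILES` table are illustrative assumptions. Only the `docker compose` arguments are taken from the Makefile hunk above, and nothing is executed here.

```python
import shlex

# Compose files referenced by the new Makefile targets.
COMPOSE_FILES = {
    "kafka": "deployment/compose-kafka.yml",
    "mofka": "deployment/compose-mofka.yml",
}


def compose_command(service, stop=False):
    """Build the argv that `services-<service>` or `services-stop-<service>` runs."""
    base = ["docker", "compose", "--file", COMPOSE_FILES[service]]
    # Stopping also removes attached volumes, matching the Makefile comments.
    action = ["down", "--volumes"] if stop else ["up", "--detach"]
    return base + action


for service in COMPOSE_FILES:
    print(shlex.join(compose_command(service)))
    print(shlex.join(compose_command(service, stop=True)))
```

Passing one of these lists to `subprocess.run` would have the same effect as the corresponding `make` target, assuming Docker is installed and the command is run from the repository root.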
@@ -0,0 +1,589 @@
+ Metadata-Version: 2.4
+ Name: flowcept
+ Version: 0.9.2
+ Summary: Capture and query workflow provenance data using data observability
+ Author: Oak Ridge National Laboratory
+ License-Expression: MIT
+ License-File: LICENSE
+ Keywords: agentic-ai,agentic-workflows,ai,big-data,dask,data-analytics,data-integration,databases,lineage,llm,machine-learning,ml,mlflow,model-management,parallel-processing,provenance,reproducibility,responsible-ai,scientific-workflows,tensorboard,workflows
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Requires-Python: >=3.10
+ Requires-Dist: msgpack
+ Requires-Dist: numpy
+ Requires-Dist: omegaconf
+ Requires-Dist: orjson
+ Provides-Extra: all
+ Requires-Dist: alembic; extra == 'all'
+ Requires-Dist: confluent-kafka<=2.8.0; extra == 'all'
+ Requires-Dist: cryptography; extra == 'all'
+ Requires-Dist: dask[distributed]<=2024.10.0; extra == 'all'
+ Requires-Dist: flask-restful; extra == 'all'
+ Requires-Dist: furo; extra == 'all'
+ Requires-Dist: gitpython; extra == 'all'
+ Requires-Dist: google-genai; extra == 'all'
+ Requires-Dist: jupyterlab; extra == 'all'
+ Requires-Dist: langchain-community; extra == 'all'
+ Requires-Dist: lmdb; extra == 'all'
+ Requires-Dist: mcp[cli]; extra == 'all'
+ Requires-Dist: mlflow-skinny; extra == 'all'
+ Requires-Dist: nbmake; extra == 'all'
+ Requires-Dist: paho-mqtt; extra == 'all'
+ Requires-Dist: pandas; extra == 'all'
+ Requires-Dist: pika; extra == 'all'
+ Requires-Dist: plotly; extra == 'all'
+ Requires-Dist: psutil>=6.1.1; extra == 'all'
+ Requires-Dist: py-cpuinfo; extra == 'all'
+ Requires-Dist: pyarrow; extra == 'all'
+ Requires-Dist: pymongo; extra == 'all'
+ Requires-Dist: pytest; extra == 'all'
+ Requires-Dist: pyyaml; extra == 'all'
+ Requires-Dist: redis; extra == 'all'
+ Requires-Dist: requests; extra == 'all'
+ Requires-Dist: ruff; extra == 'all'
+ Requires-Dist: scipy; extra == 'all'
+ Requires-Dist: seaborn; extra == 'all'
+ Requires-Dist: sphinx; extra == 'all'
+ Requires-Dist: sqlalchemy; extra == 'all'
+ Requires-Dist: streamlit; extra == 'all'
+ Requires-Dist: tbparse; extra == 'all'
+ Requires-Dist: tensorboard; extra == 'all'
+ Requires-Dist: tensorflow; extra == 'all'
+ Requires-Dist: tomli; extra == 'all'
+ Requires-Dist: watchdog; extra == 'all'
+ Provides-Extra: analytics
+ Requires-Dist: plotly; extra == 'analytics'
+ Requires-Dist: scipy; extra == 'analytics'
+ Requires-Dist: seaborn; extra == 'analytics'
+ Provides-Extra: dask
+ Requires-Dist: dask[distributed]<=2024.10.0; extra == 'dask'
+ Requires-Dist: tomli; extra == 'dask'
+ Provides-Extra: dev
+ Requires-Dist: furo; extra == 'dev'
+ Requires-Dist: jupyterlab; extra == 'dev'
+ Requires-Dist: nbmake; extra == 'dev'
+ Requires-Dist: pika; extra == 'dev'
+ Requires-Dist: pytest; extra == 'dev'
+ Requires-Dist: pyyaml; extra == 'dev'
+ Requires-Dist: ruff; extra == 'dev'
+ Requires-Dist: sphinx; extra == 'dev'
+ Provides-Extra: docs
+ Requires-Dist: furo; extra == 'docs'
+ Requires-Dist: sphinx; extra == 'docs'
+ Provides-Extra: extras
+ Requires-Dist: flask-restful; extra == 'extras'
+ Requires-Dist: gitpython; extra == 'extras'
+ Requires-Dist: pandas; extra == 'extras'
+ Requires-Dist: psutil>=6.1.1; extra == 'extras'
+ Requires-Dist: py-cpuinfo; extra == 'extras'
+ Requires-Dist: pyarrow; extra == 'extras'
+ Requires-Dist: pymongo; extra == 'extras'
+ Requires-Dist: redis; extra == 'extras'
+ Requires-Dist: requests; extra == 'extras'
+ Provides-Extra: kafka
+ Requires-Dist: confluent-kafka<=2.8.0; extra == 'kafka'
+ Provides-Extra: llm-agent
+ Requires-Dist: langchain-community; extra == 'llm-agent'
+ Requires-Dist: mcp[cli]; extra == 'llm-agent'
+ Requires-Dist: streamlit; extra == 'llm-agent'
+ Provides-Extra: llm-google
+ Requires-Dist: google-genai; extra == 'llm-google'
+ Requires-Dist: langchain-community; extra == 'llm-google'
+ Requires-Dist: mcp[cli]; extra == 'llm-google'
+ Requires-Dist: streamlit; extra == 'llm-google'
+ Provides-Extra: lmdb
+ Requires-Dist: lmdb; extra == 'lmdb'
+ Provides-Extra: ml-dev
+ Requires-Dist: datasets==2.17.0; extra == 'ml-dev'
+ Requires-Dist: nltk; extra == 'ml-dev'
+ Requires-Dist: numpy<2.0; extra == 'ml-dev'
+ Requires-Dist: sacremoses; extra == 'ml-dev'
+ Requires-Dist: torch==2.2.2; extra == 'ml-dev'
+ Requires-Dist: torchtext==0.17.2; extra == 'ml-dev'
+ Requires-Dist: torchvision==0.17.2; extra == 'ml-dev'
+ Provides-Extra: mlflow
+ Requires-Dist: alembic; extra == 'mlflow'
+ Requires-Dist: cryptography; extra == 'mlflow'
+ Requires-Dist: mlflow-skinny; extra == 'mlflow'
+ Requires-Dist: sqlalchemy; extra == 'mlflow'
+ Requires-Dist: watchdog; extra == 'mlflow'
+ Provides-Extra: mongo
+ Requires-Dist: pyarrow; extra == 'mongo'
+ Requires-Dist: pymongo; extra == 'mongo'
+ Provides-Extra: mqtt
+ Requires-Dist: paho-mqtt; extra == 'mqtt'
+ Provides-Extra: nvidia
+ Requires-Dist: nvidia-ml-py; extra == 'nvidia'
+ Provides-Extra: redis
+ Requires-Dist: redis; extra == 'redis'
+ Provides-Extra: telemetry
+ Requires-Dist: psutil>=6.1.1; extra == 'telemetry'
+ Requires-Dist: py-cpuinfo; extra == 'telemetry'
+ Provides-Extra: tensorboard
+ Requires-Dist: tbparse; extra == 'tensorboard'
+ Requires-Dist: tensorboard; extra == 'tensorboard'
+ Requires-Dist: tensorflow; extra == 'tensorboard'
+ Description-Content-Type: text/markdown
+
+ <p align="center">
+ <picture>
+ <img src="./docs/img/flowcept-logo.png" alt="Flowcept Logo" width="200"/>
+ </picture>
+ </p>
+ <h3 align="center">Lightweight Distributed Workflow Provenance</h3>
+
+
+
+ ---
+
+ Flowcept captures and queries workflow provenance at runtime with minimal code changes and low overhead. It unifies data from diverse tools and workflows across the Edge–Cloud–HPC continuum and provides ML-aware capture, MCP agents provenance, telemetry, extensible adapters, and flexible storage.
+
+ ---
+
+ [![Documentation](https://img.shields.io/badge/docs-readthedocs.io-green.svg)](https://flowcept.readthedocs.io/)
+ [![Build](https://github.com/ORNL/flowcept/actions/workflows/create-release-n-publish.yml/badge.svg)](https://github.com/ORNL/flowcept/actions/workflows/create-release-n-publish.yml)
+ [![PyPI](https://badge.fury.io/py/flowcept.svg)](https://pypi.org/project/flowcept)
+ [![Tests](https://github.com/ORNL/flowcept/actions/workflows/run-tests.yml/badge.svg)](https://github.com/ORNL/flowcept/actions/workflows/run-tests.yml)
+ [![Code Formatting](https://github.com/ORNL/flowcept/actions/workflows/checks.yml/badge.svg?branch=dev)](https://github.com/ORNL/flowcept/actions/workflows/checks.yml)
+ [![License: MIT](https://img.shields.io/github/license/ORNL/flowcept)](LICENSE)
+
+ ---
+
+ # Quickstart
+
+ The easiest way to capture provenance from plain Python functions, with no external services needed:
+
+ 1) Install and initialize settings
+
+ ```shell
+ # Make sure you activate your Python environment (e.g., conda, venv) first
+ pip install flowcept
+ flowcept --init-settings
+ ```
+ This generates a minimal settings file in `~/.flowcept/settings.yaml`.
+
+ 2) Run the minimal example
+
+ Save the following script as `quickstart.py` and run `python quickstart.py`.
+
+ ```python
+ """
+ A minimal example of Flowcept's instrumentation using @decorators.
+ This example needs no DB, broker, or external service.
+ """
+ import json
+
+ from flowcept import Flowcept, flowcept_task
+ from flowcept.instrumentation.flowcept_decorator import flowcept
+
+
+ @flowcept_task(output_names="o1")
+ def sum_one(i1):
+     return i1 + 1
+
+
+ @flowcept_task(output_names="o2")
+ def mult_two(o1):
+     return o1 * 2
+
+
+ @flowcept
+ def main():
+     n = 3
+     o1 = sum_one(n)
+     o2 = mult_two(o1)
+     print("Final output", o2)
+
+
+ if __name__ == "__main__":
+     main()
+
+     prov_messages = Flowcept.read_messages_file()
+     assert len(prov_messages) == 2
+     print(json.dumps(prov_messages, indent=2))
+ ```
+
+ This creates a provenance file in `flowcept_messages.jsonl`. In it, you will see two provenance messages, each related to an executed function.
+
+ ```json
+ [
+   {
+     "activity_id": "sum_one",
+     "workflow_id": "fe546706-ef46-4482-8f70-3af664a7131b",
+     "campaign_id": "76088532-3bef-4343-831e-d8a5d9156174",
+     "used": {
+       "i1": 3
+     },
+     "started_at": 1757171258.637908,
+     "hostname": "my_laptop",
+     "task_id": "1757171258.637908",
+     "status": "FINISHED",
+     "ended_at": 1757171258.6379142,
+     "generated": {
+       "o1": 4
+     },
+     "type": "task"
+   },
+   {
+     "activity_id": "mult_two",
+     "workflow_id": "fe546706-ef46-4482-8f70-3af664a7131b",
+     "campaign_id": "76088532-3bef-4343-831e-d8a5d9156174",
+     "used": {
+       "o1": 4
+     },
+     "started_at": 1757171258.637933,
+     "hostname": "my_laptop",
+     "task_id": "1757171258.637933",
+     "status": "FINISHED",
+     "ended_at": 1757171258.6379352,
+     "generated": {
+       "o2": 8
+     },
+     "type": "task"
+   }
+ ]
+ ```
+
+
+ For online querying using databases, MCP agents and Grafana, telemetry, adapters (MLflow, Dask, TensorBoard), PyTorch and MCP instrumentation, HPC optimization or federated runs,
+ and more, see the [Jupyter Notebooks](notebooks), the [Examples directory](examples) and the [complete documentation](https://flowcept.readthedocs.io/).
+
+ ## Table of Contents
+
+ - [Overview](#overview)
+ - [Features](#features)
+ - [Installation](#installation)
+ - [Setup and the Settings File](#setup)
+ - [Running with Containers](#running-with-containers)
+ - [Examples](#examples)
+ - [Data Persistence](#data-persistence)
+ - [Performance Tuning](#performance-tuning-for-performance-evaluation)
+ - [AMD GPU Setup](#install-amd-gpu-lib)
+ - [Further Documentation](#documentation)
+
+ ## Overview
+
+ Flowcept captures and queries workflow provenance at runtime with minimal code changes and low data capture overhead,
+ unifying data from diverse tools and workflows.
+
+ Designed for scenarios involving critical data from multiple, federated workflows in the Edge-Cloud-HPC continuum, Flowcept supports end-to-end monitoring, analysis, querying, and enhanced support for Machine Learning (ML) and for agentic workflows.
+
+ ## Features
+
+ - Automatic workflow provenance capture with minimal intrusion
+ - Adapters for MLflow, Dask, TensorBoard; easy to add more
+ - Optional explicit instrumentation via decorators
+ - ML-aware capture, from workflow to epoch and layer granularity
+ - Agentic workflows: MCP agents-aware provenance capture
+ - Low overhead, suitable for HPC and highly distributed setups
+ - Telemetry capture for CPU, GPU, memory, linked to dataflow
+ - Pluggable MQ and storage backends (Redis, Kafka, MongoDB, LMDB)
+ - [W3C PROV](https://www.w3.org/TR/prov-overview/) adherence
+
+ Explore [Jupyter Notebooks](notebooks) and [Examples](examples) for usage.
+
+ ## Installation
+
+ Flowcept can be installed in multiple ways, depending on your needs.
+
+ ### 1. Default Installation
+ To install Flowcept with its basic dependencies from [PyPI](https://pypi.org/project/flowcept/), run:
+
+ ```shell
+ pip install flowcept
+ ```
+
+ This installs the minimal Flowcept package, **not** including MongoDB, Redis, MCP, or any adapter-specific dependencies.
+
+ ### 2. Installing Specific Adapters and Additional Dependencies
+
+ Flowcept integrates with several tools and services, but you should **only install what you actually need**.
+ Good practice is to cherry-pick the extras relevant to your workflow instead of installing them all.
+
+ ```shell
+ pip install flowcept[mongo] # MongoDB support
+ pip install flowcept[mlflow] # MLflow adapter
+ pip install flowcept[dask] # Dask adapter
+ pip install flowcept[tensorboard] # TensorBoard adapter
+ pip install flowcept[kafka] # Kafka message queue
+ pip install flowcept[nvidia] # NVIDIA GPU runtime capture
+ pip install flowcept[telemetry] # CPU/GPU/memory telemetry capture
+ pip install flowcept[lmdb] # LMDB lightweight database
+ pip install flowcept[mqtt] # MQTT support
+ pip install flowcept[llm_agent] # MCP agent, LangChain, Streamlit integration: needed either for MCP capture or for the Flowcept Agent.
+ pip install flowcept[llm_google] # Google GenAI + Flowcept agent support
+ pip install flowcept[analytics] # Extra analytics (seaborn, plotly, scipy)
+ pip install flowcept[dev] # Developer dependencies (docs, tests, lint, etc.)
+ ```
+
+ ### 3. Installing with Common Runtime Bundle
+
+ ```shell
+ pip install flowcept[extras]
+ ```
+
+ The `extras` group is a convenience shortcut that bundles the most common runtime dependencies.
+ It is intended for users who want a fairly complete, but not maximal, Flowcept environment.
+
+ You might choose `flowcept[extras]` if:
+
+ - You want Flowcept to run out-of-the-box with Redis, telemetry, and MongoDB.
332
+ - You prefer not to install each extra one by one.
333
+
334
+ ⚠️ If you only need one of these features, install it individually instead of `extras`.
335
+
336
+ ### 4. Install All Optional Dependencies at Once
337
+
338
+ Flowcept provides a combined `all` extra, but installing everything into a single environment is not recommended for users.
339
+ Many of these dependencies are unrelated and should not be mixed in the same runtime. This option is only intended for Flowcept developers who need to test across all adapters and integrations.
340
+
341
+ ```shell
342
+ pip install flowcept[all]
343
+ ```
344
+
345
+ ### 5. Installing from Source
346
+ To install Flowcept from the source repository:
347
+
348
+ ```shell
349
+ git clone https://github.com/ORNL/flowcept.git
350
+ cd flowcept
351
+ pip install .
352
+ ```
353
+
354
+ You can then install specific optional dependencies the same way:
355
+
356
+ ```shell
357
+ pip install .[optional_dependency_name]
358
+ ```
359
+
360
+ This follows the same pattern as step 2, allowing for a customized installation from source.
361
+
362
+ ## Setup
363
+
364
+ The [Quickstart](#quickstart) example works with just `pip install flowcept`; no extra setup is required.
365
+
366
+ For online queries or distributed capture, Flowcept relies on two optional components:
367
+
368
+ - **Message Queue (MQ)** — message broker / pub-sub / data stream
369
+ - **Database (DB)** — persistent storage for historical queries
370
+
371
+ ---
372
+
373
+ #### Message Queue (MQ)
374
+
375
+ - Required for anything beyond Quickstart
376
+ - Flowcept publishes provenance data to the MQ during workflow runs
377
+ - Developers can subscribe with custom consumers (see [this example](examples/consumers/simple_consumer.py)).
378
+ - You can monitor or print messages in motion using `flowcept --stream-messages --print`.
379
+
380
+ Supported MQs:
381
+ - [Redis](https://redis.io) → **default**, lightweight, works on Linux, macOS, Windows, and HPC (tested on [Frontier](link) and [Summit](link))
382
+ - [Kafka](https://kafka.apache.org) → for distributed environments or if Kafka is already in your stack
383
+ - [Mofka](https://mofka.readthedocs.io) → optimized for HPC runs
384
+
385
+ ---
386
+
387
+ #### Database (DB)
388
+
389
+ - **Optional**, but required for:
390
+ - Persisting provenance beyond MQ memory/disk buffers
391
+ - Running complex analytical queries on historical data
392
+
393
+ Supported DBs:
394
+ - [MongoDB](https://www.mongodb.com) → default, efficient bulk writes + rich query support
395
+ - [LMDB](https://lmdb.readthedocs.io) → lightweight, no external service, basic query capabilities
396
+
397
+ ---
398
+
399
+ ### Notes
400
+
401
+ - Without a DB:
402
+ - Provenance remains in the MQ only (persistence not guaranteed)
403
+ - Complex historical queries are unavailable
404
+ - Flowcept’s architecture is modular: other MQs and DBs (graph, relational, etc.) can be added in the future
405
+ - Deployment examples for MQ and DB are provided in the [deployment](deployment) directory
406
+
407
+
408
+ ### Downloading and Starting External Services (MQ or DB)
409
+
410
+ Flowcept uses external services for message queues (MQ) and databases (DB). You can start them with Docker Compose, plain containers, or directly on your host.
411
+
412
+ ---
413
+
414
+ #### Using Docker Compose (recommended)
415
+
416
+ We provide a [Makefile](deployment/Makefile) with shortcuts:
417
+
418
+ 1. **Redis only (no DB)**: `make services` (LMDB can be used in this setup as a lightweight DB)
419
+ 2. **Redis + MongoDB**: `make services-mongo`
420
+ 3. **Kafka + MongoDB**: `make services-kafka`
421
+ 4. **Mofka only (no DB)**: `make services-mofka`
422
+
423
+ To customize, edit the YAML files in [deployment](deployment/) and run `docker compose -f deployment/<compose-file>.yml up -d`.
424
+
425
+ ---
426
+
427
+ #### Using Docker (without Compose)
428
+
429
+ See the [deployment/](deployment/) compose files for expected images and configurations. You can adapt them to your environment and use standard `docker pull / run / exec` commands.
430
+
431
+ ---
432
+
433
+ #### Running on the Host (no containers)
434
+
435
+ 1. Install binaries for the service you need:
436
+ - **macOS** users can install with [Homebrew](https://brew.sh).
437
+ Example for Redis:
438
+ ```bash
439
+ brew install redis
440
+ brew services start redis
441
+ ```
442
+
443
+ - On Linux, use your distro package manager (e.g. `apt`, `dnf`, `yum`)
444
+ - If you are non-root (typically the case when deploying these services locally on an HPC system), look for prebuilt binaries for your OS and hardware architecture, download them into a directory where you have read and write permission, and run them from there.
445
+ - On Windows, use [WSL](https://learn.microsoft.com/en-us/windows/wsl/install) to run a Linux distro.
446
+
447
+ 2. Start services normally (`redis-server`, `mongod`, `kafka-server-start.sh`, etc.).
448
+
449
+ ## Flowcept Settings File
450
+
451
+ Flowcept uses a settings file for configuration.
452
+
453
+ - To create a minimal settings file (**recommended**), run: `flowcept --init-settings` → creates `~/.flowcept/settings.yaml`
454
+
455
+ - To create a full settings file with all options, run: `flowcept --init-settings --full` → creates `~/.flowcept/settings.yaml`
456
+
457
+ ---
458
+
459
+ #### What You Can Configure
460
+
461
+ - Message queue and database routes, ports, and paths
462
+ - MCP agent ports and LLM API keys
463
+ - Buffer sizes and flush settings
464
+ - Telemetry capture settings
465
+ - Instrumentation and PyTorch details
466
+ - Log levels
467
+ - Data observability adapters
468
+ - And more (see [example file](resources/sample_settings.yaml))
469
+
470
+ ---
471
+
472
+ #### Custom Settings File
473
+
474
+ Flowcept looks for its settings in the following order:
475
+
476
+ 1. `~/.flowcept/settings.yaml` — created by running `flowcept --init-settings`
477
+ 2. Environment variable `FLOWCEPT_SETTINGS_PATH` — if set, Flowcept uses the settings file this variable points to
478
+ 3. [Default sample file](resources/sample_settings.yaml) — used if neither of the above is found
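
The lookup order listed above can be illustrated with a minimal sketch. This is not Flowcept's loader: the function `resolve_settings_path` and its parameters are hypothetical names chosen for illustration, and the code simply follows the order as it is written in this README. The real implementation may differ.

```python
from pathlib import Path
from typing import Mapping


def resolve_settings_path(env: Mapping[str, str], home: Path, sample_file: Path) -> Path:
    """Return a settings path following the order listed in this README.

    Hypothetical helper for illustration only; not part of Flowcept.
    """
    # 1. The file created by `flowcept --init-settings`.
    user_file = home / ".flowcept" / "settings.yaml"
    if user_file.is_file():
        return user_file
    # 2. The path named by FLOWCEPT_SETTINGS_PATH, if the variable is set.
    env_path = env.get("FLOWCEPT_SETTINGS_PATH")
    if env_path:
        return Path(env_path)
    # 3. Otherwise fall back to the sample file.
    return sample_file
```

In a real environment you would call it as `resolve_settings_path(os.environ, Path.home(), Path("resources/sample_settings.yaml"))`.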
479
+
480
+ ## Examples
481
+
482
+ ### Adapters and Notebooks
483
+
484
+ See the [Jupyter Notebooks](notebooks) and [Examples directory](examples) for usage examples.
485
+
486
+ ## Summary: Observability, Instrumentation, MQs, DBs, and Querying
487
+
488
+ | Category | Supported Options |
489
+ |------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
490
+ | **Data Observability Adapters** | [MLflow](https://github.com/ORNL/flowcept/blob/main/examples/mlflow_example.py), [Dask](https://github.com/ORNL/flowcept/blob/main/examples/dask_example.py), [TensorBoard](https://github.com/ORNL/flowcept/blob/main/examples/tensorboard_example.py) |
491
+ | **Instrumentation and Decorators** | - [@flowcept](https://github.com/ORNL/flowcept/blob/main/examples/start_here.py): encapsulate a function (e.g., a main function) as a workflow. <br> - [@flowcept_task](https://github.com/ORNL/flowcept/blob/main/examples/instrumented_simple_example.py): encapsulate a function as a task. <br> - `@telemetry_flowcept_task`: same as `@flowcept_task`, but optimized for telemetry capture. <br> - `@lightweight_flowcept_task`: same as `@flowcept_task`, but very lightweight, optimized for HPC workloads <br> - [Loop](https://github.com/ORNL/flowcept/blob/main/examples/instrumented_loop_example.py) <br> - [PyTorch Model](https://github.com/ORNL/flowcept/blob/main/examples/llm_complex/llm_model.py) <br> - [MCP Agent](https://github.com/ORNL/flowcept/blob/main/examples/agents/aec_agent_mock.py) |
492
+ | **Context Manager** | `with Flowcept():` <br/> &nbsp;&nbsp;&nbsp;`# Workflow code` <br/><br/>Similar to the `@flowcept` decorator, but more flexible for instrumenting code blocks that aren’t encapsulated in a single function and for workflows with scattered code across multiple files. |
493
+ | **Custom Task Creation** | `FlowceptTask(activity_id=<id>, used=<inputs>, generated=<outputs>, ...)` <br/><br/>Use for fully customizable task instrumentation. Publishes directly to the MQ either via context management (`with FlowceptTask(...)`) or by calling `send()`. Requires a prior `Flowcept().start()` call or an enclosing `with Flowcept()` context. See [example](examples/consumers/ping_pong_example.py). |
494
+ | **Message Queues (MQ)** | - **Disabled** (offline mode: provenance events stay in an in-memory buffer, not accessible to external processes) <br> - [Redis](https://redis.io) → default, lightweight, easy to run anywhere <br> - [Kafka](https://kafka.apache.org) → for distributed, production setups <br> - [Mofka](https://mofka.readthedocs.io) → optimized for HPC runs <br><br> _Setup example:_ [docker compose](https://github.com/ORNL/flowcept/blob/main/deployment/compose.yml) |
495
+ | **Databases** | - **Disabled** → Flowcept runs in ephemeral mode (data only in MQ, no persistence) <br> - **[MongoDB](https://www.mongodb.com)** → default, rich queries and efficient bulk writes <br> - **[LMDB](https://lmdb.readthedocs.io)** → lightweight, file-based, no external service, basic query support |
496
+ | **Querying and Monitoring** | - **[Grafana](deployment/compose-grafana.yml)** → dashboarding via MongoDB connector <br> - **MCP Flowcept Agent** → LLM-based querying of provenance data |
497
+ | **Custom Consumer** | You can implement your own consumer to monitor or query the provenance stream in real time. Useful for custom analytics, monitoring, debugging, or to persist the data in a different data model (e.g., graph). See [example](examples/consumers/simple_consumer.py). |
498
+
499
+
500
+ ## Performance Tuning for Performance Evaluation
501
+
502
+ In the `settings.yaml` file, many variables may impact interception efficiency.
503
+ Please be mindful of the following parameters:
504
+
505
+ * `mq`
506
+ - `buffer_size` and `insertion_buffer_time_secs`: `buffer_size: 1` is very bad for performance, but it gives the MQ the most up-to-date information possible.
507
+
508
+ * `log`
509
+ - Set both stream and files to disable.
510
+
511
+ * `telemetry_capture`
512
+ The more things you enable, the more overhead you'll get. For GPU, you can turn on/off specific metrics.
513
+
514
+ * `instrumentation`
515
+ This will configure whether every single granular step in the model training process will be captured. Disable very granular model inspection and try to use more lightweight methods. There are commented instructions in the settings.yaml sample file.
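
To see why `buffer_size` and `insertion_buffer_time_secs` trade freshness for throughput, here is a minimal sketch of a buffer that flushes when it is full or when its oldest item is too old. It is a generic illustration, not Flowcept's buffer: the class `SizeOrTimeBuffer` and its parameters are hypothetical, and for simplicity the age is only checked when a new item arrives (no background thread).

```python
import time
from typing import Any, Callable, List


class SizeOrTimeBuffer:
    """Collect items and flush them in batches (hypothetical, for illustration)."""

    def __init__(
        self,
        flush: Callable[[List[Any]], None],
        max_size: int,
        max_age_secs: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._max_size = max_size
        self._max_age_secs = max_age_secs
        self._clock = clock
        self._items: List[Any] = []
        self._first_at = 0.0

    def append(self, item: Any) -> None:
        if not self._items:
            # Age is measured from the oldest item in the current batch.
            self._first_at = self._clock()
        self._items.append(item)
        full = len(self._items) >= self._max_size
        stale = self._clock() - self._first_at >= self._max_age_secs
        if full or stale:
            self.flush()

    def flush(self) -> None:
        # Hand the current batch to the sink and start a new one.
        if self._items:
            batch, self._items = self._items, []
            self._flush(batch)
```

With `max_size=1`, every `append` triggers one call to the sink, which is the freshest but most expensive setting. Larger sizes amortize the per-message cost at the price of delay.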
516
+
517
+ Other things to consider:
518
+
519
+ ```yaml
520
+ project:
521
+   replace_non_json_serializable: false # Assumes that all captured data are JSON serializable
522
+   db_flush_mode: offline # Disables runtime analysis in the database
523
+ mq:
524
+   chunk_size: -1 # Disables chunking of the messages sent to the MQ. Use this only if the main memory of the compute nodes is large enough.
525
+ ```
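
As a rough illustration of what disabling chunking means, the sketch below splits a list of messages into fixed-size chunks and treats a non-positive `chunk_size` as "send everything in one piece". The function `iter_chunks` is a hypothetical helper written for this README, not Flowcept code, and Flowcept's actual chunking logic may differ.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_chunks(messages: Sequence[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield messages in chunks; a non-positive chunk_size disables chunking."""
    if chunk_size <= 0:
        # No chunking: the whole batch is held in memory and sent at once.
        if messages:
            yield list(messages)
        return
    for start in range(0, len(messages), chunk_size):
        yield list(messages[start:start + chunk_size])
```

Without chunking, the full batch must fit in memory at once, which is why the setting is only advisable on nodes with enough main memory.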
526
+
527
+ Other adapter-specific variables may also affect overhead. For instance, in Dask, timestamp creation by workers adds interception overhead. As the software evolves, new variables that affect overhead may appear before they are documented in this README. If you are running extensive performance evaluation experiments with Flowcept, please reach out to us (e.g., create an issue in the repository) for hints on how to reduce its overhead.
528
+
529
+ ## Install AMD GPU Lib
530
+
531
+ This section is only important if you want to enable GPU runtime data capture and the GPU is from AMD. NVIDIA GPUs don't need this step.
532
+
533
+ For AMD GPUs, we rely on the official AMD ROCm library to capture GPU data.
534
+
535
+ Unfortunately, this library is not available as a PyPI or conda package, so you must install it manually. See the instructions at https://rocm.docs.amd.com/projects/amdsmi/en/latest/
536
+
537
+ Here is a summary:
538
+
539
+ 1. Install the AMD drivers on the machine (check if they are available already under `/opt/rocm-*`).
540
+ 2. Suppose it is `/opt/rocm-6.2.0`. Then, make sure it has a `share/amd_smi` subdirectory containing a `pyproject.toml` or `setup.py` file.
541
+ 3. Copy the amd_smi to your home directory: `cp -r /opt/rocm-6.2.0/share/amd_smi ~`
542
+ 4. `cd ~/amd_smi`
543
+ 5. In your Python environment, run `pip install .`
544
+
545
+ The current code is compatible with `amdsmi==24.7.1+0012a68`,
546
+ which was installed from Frontier's `/opt/rocm-6.3.1/share/amd_smi`.
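
Steps 1 and 2 of the summary above can be checked with a short script before copying anything. The sketch below looks for `rocm-*` directories that contain a `share/amd_smi` subdirectory with a `pyproject.toml` or `setup.py`. The function name `find_amd_smi_sources` is hypothetical, and the `/opt` default is only the location mentioned above.

```python
from pathlib import Path
from typing import List


def find_amd_smi_sources(rocm_parent: Path = Path("/opt")) -> List[Path]:
    """Return installable amd_smi directories under rocm_parent/rocm-*."""
    found: List[Path] = []
    for rocm_dir in sorted(rocm_parent.glob("rocm-*")):
        candidate = rocm_dir / "share" / "amd_smi"
        # pip needs a build file in the directory to install from it.
        has_build_file = (
            (candidate / "pyproject.toml").is_file()
            or (candidate / "setup.py").is_file()
        )
        if candidate.is_dir() and has_build_file:
            found.append(candidate)
    return found
```

Each returned path is a candidate for step 3 (copying it to your home directory). An empty list suggests that the ROCm installation is missing or laid out differently on your system.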
547
+
548
+ ## Torch Dependencies
549
+
550
+ Some unit tests use `torch==2.2.2`, `torchtext==0.17.2`, and `torchvision==0.17.2`. These packages are only needed to run those tests and are installed if you run `pip install flowcept[ml_dev]` or `pip install flowcept[all]`. If you want to use Flowcept with Torch, please adapt the torch dependencies to match your project's dependencies.
551
+
552
+ ## Documentation
553
+
554
+ Full documentation is available on [Read the Docs](https://flowcept.readthedocs.io/).
555
+
556
+ ## Cite us
557
+
558
+ If you used Flowcept in your research, consider citing our paper.
559
+
560
+ ```
561
+ Towards Lightweight Data Integration using Multi-workflow Provenance and Data Observability
562
+ R. Souza, T. Skluzacek, S. Wilkinson, M. Ziatdinov, and R. da Silva
563
+ 19th IEEE International Conference on e-Science, 2023.
564
+ ```
565
+
566
+ **Bibtex:**
567
+
568
+ ```latex
569
+ @inproceedings{souza2023towards,
570
+ author = {Souza, Renan and Skluzacek, Tyler J and Wilkinson, Sean R and Ziatdinov, Maxim and da Silva, Rafael Ferreira},
571
+ booktitle = {IEEE International Conference on e-Science},
572
+ doi = {10.1109/e-Science58273.2023.10254822},
573
+ link = {https://doi.org/10.1109/e-Science58273.2023.10254822},
574
+ pdf = {https://arxiv.org/pdf/2308.09004.pdf},
575
+ title = {Towards Lightweight Data Integration using Multi-workflow Provenance and Data Observability},
576
+ year = {2023}
577
+ }
578
+ ```
579
+
580
+ ## Disclaimer & Get in Touch
581
+
582
+ Refer to [Contributing](CONTRIBUTING.md) for adding new adapters or contributing to the codebase.
583
+
584
+ Please note that this is research software. We encourage you to give it a try and use it with your own stack.
585
+ We are continuously improving the documentation and adding more examples and notebooks. If you are interested in working with Flowcept in your own scientific project, we can give you a jump start if you reach out to us. Feel free to [create an issue](https://github.com/ORNL/flowcept/issues/new), [create a new discussion thread](https://github.com/ORNL/flowcept/discussions/new/choose) or drop us an email (we trust you'll find a way to reach out to us :wink:).
586
+
587
+ ## Acknowledgement
588
+
589
+ This research uses resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.