flowcept 0.8.8.tar.gz → 0.8.10.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (158)

{flowcept-0.8.8 → flowcept-0.8.10}/.github/workflows/checks.yml RENAMED

@@ -1,4 +1,4 @@
-name: Linter, formatter, and docs checks
+name: Code and doc checks
 on: pull_request

{flowcept-0.8.8 → flowcept-0.8.10}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: flowcept
-Version: 0.8.8
+Version: 0.8.10
 Summary: Capture and query workflow provenance data using data observability
 Project-URL: GitHub, https://github.com/ORNL/flowcept
 Author: Oak Ridge National Laboratory
@@ -88,6 +88,7 @@ Requires-Dist: tensorboard; extra == 'tensorboard'
 Requires-Dist: tensorflow; extra == 'tensorboard'
 Description-Content-Type: text/markdown
+[![Documentation](https://img.shields.io/badge/docs-readthedocs.io-green.svg)](https://flowcept.readthedocs.io/)
 [![Build](https://github.com/ORNL/flowcept/actions/workflows/create-release-n-publish.yml/badge.svg)](https://github.com/ORNL/flowcept/actions/workflows/create-release-n-publish.yml)
 [![PyPI](https://badge.fury.io/py/flowcept.svg)](https://pypi.org/project/flowcept)
 [![Tests](https://github.com/ORNL/flowcept/actions/workflows/run-tests.yml/badge.svg)](https://github.com/ORNL/flowcept/actions/workflows/run-tests.yml)
@@ -107,10 +108,13 @@ Description-Content-Type: text/markdown
 - [Data Persistence](#data-persistence)
 - [Performance Tuning](#performance-tuning-for-performance-evaluation)
 - [AMD GPU Setup](#install-amd-gpu-lib)
+- [Further Documentation](#documentation)
 ## Overview
-Flowcept is a runtime data integration system that captures and queries workflow provenance with minimal or no code changes. It unifies data across diverse workflows and tools, enabling integrated analysis and insights, especially in federated environments. Designed for scenarios involving critical data from multiple workflows, Flowcept seamlessly integrates data at runtime, providing a unified view for end-to-end monitoring and analysis, and enhanced support for Machine Learning (ML) workflows.
+Flowcept is a runtime data integration system that captures and queries workflow provenance with minimal or no code changes. It unifies data from diverse workflows and tools, enabling integrated analysis and insights, especially in federated environments.
+Designed for scenarios involving critical data from multiple workflows, Flowcept supports end-to-end monitoring, analysis, querying, and enhanced support for Machine Learning (ML) workflows.
 ## Features
@@ -133,8 +137,9 @@ Notes:
   - TensorBoard
 - Python scripts can be easily instrumented via `@decorators` using `@flowcept_task` (for generic Python method) or `@torch_task` (for methods that encapsulate PyTorch model manipulation, such as training or evaluation).
 - Currently supported MQ systems:
-  - Kafka
-  - Redis
+  - [Kafka](https://kafka.apache.org)
+  - [Redis](https://redis.io)
+  - [Mofka](https://mofka.readthedocs.io)
 - Currently supported database systems:
   - MongoDB
   - Lightning Memory-Mapped Database (lightweight file-only database system)
@@ -179,7 +184,7 @@ If you want to install all optional dependencies, use:
 pip install flowcept[all]
 ```
-This is a convenient way to ensure all adapters are available, but it may install dependencies you don't need.
+This is useful mostly for Flowcept developers. Please avoid installing like this if you can, as it may install several dependencies you will never use.
 ### 4. Installing from Source
 To install Flowcept from the source repository:
@@ -359,6 +364,10 @@ Which was installed using Frontier's /opt/rocm-6.3.1/share/amd_smi
 Some unit tests utilize `torch==2.2.2`, `torchtext=0.17.2`, and `torchvision==0.17.2`. They are only really needed to run some tests and will be installed if you run `pip install flowcept[ml_dev]` or `pip install flowcept[all]`. If you want to use Flowcept with Torch, please adapt torch dependencies according to your project's dependencies.
+## Documentation
+Full documentation is available on [Read the Docs](https://flowcept.readthedocs.io/).
 ## Cite us
 If you used Flowcept in your research, consider citing our paper.

{flowcept-0.8.8 → flowcept-0.8.10}/README.md RENAMED

@@ -1,3 +1,4 @@
+[![Documentation](https://img.shields.io/badge/docs-readthedocs.io-green.svg)](https://flowcept.readthedocs.io/)
 [![Build](https://github.com/ORNL/flowcept/actions/workflows/create-release-n-publish.yml/badge.svg)](https://github.com/ORNL/flowcept/actions/workflows/create-release-n-publish.yml)
 [![PyPI](https://badge.fury.io/py/flowcept.svg)](https://pypi.org/project/flowcept)
 [![Tests](https://github.com/ORNL/flowcept/actions/workflows/run-tests.yml/badge.svg)](https://github.com/ORNL/flowcept/actions/workflows/run-tests.yml)
@@ -17,10 +18,13 @@
 - [Data Persistence](#data-persistence)
 - [Performance Tuning](#performance-tuning-for-performance-evaluation)
 - [AMD GPU Setup](#install-amd-gpu-lib)
+- [Further Documentation](#documentation)
 ## Overview
-Flowcept is a runtime data integration system that captures and queries workflow provenance with minimal or no code changes. It unifies data across diverse workflows and tools, enabling integrated analysis and insights, especially in federated environments. Designed for scenarios involving critical data from multiple workflows, Flowcept seamlessly integrates data at runtime, providing a unified view for end-to-end monitoring and analysis, and enhanced support for Machine Learning (ML) workflows.
+Flowcept is a runtime data integration system that captures and queries workflow provenance with minimal or no code changes. It unifies data from diverse workflows and tools, enabling integrated analysis and insights, especially in federated environments.
+Designed for scenarios involving critical data from multiple workflows, Flowcept supports end-to-end monitoring, analysis, querying, and enhanced support for Machine Learning (ML) workflows.
 ## Features
@@ -43,8 +47,9 @@ Notes:
   - TensorBoard
 - Python scripts can be easily instrumented via `@decorators` using `@flowcept_task` (for generic Python method) or `@torch_task` (for methods that encapsulate PyTorch model manipulation, such as training or evaluation).
 - Currently supported MQ systems:
-  - Kafka
-  - Redis
+  - [Kafka](https://kafka.apache.org)
+  - [Redis](https://redis.io)
+  - [Mofka](https://mofka.readthedocs.io)
 - Currently supported database systems:
   - MongoDB
   - Lightning Memory-Mapped Database (lightweight file-only database system)
@@ -89,7 +94,7 @@ If you want to install all optional dependencies, use:
 pip install flowcept[all]
 ```
-This is a convenient way to ensure all adapters are available, but it may install dependencies you don't need.
+This is useful mostly for Flowcept developers. Please avoid installing like this if you can, as it may install several dependencies you will never use.
 ### 4. Installing from Source
 To install Flowcept from the source repository:
@@ -269,6 +274,10 @@ Which was installed using Frontier's /opt/rocm-6.3.1/share/amd_smi
 Some unit tests utilize `torch==2.2.2`, `torchtext=0.17.2`, and `torchvision==0.17.2`. They are only really needed to run some tests and will be installed if you run `pip install flowcept[ml_dev]` or `pip install flowcept[all]`. If you want to use Flowcept with Torch, please adapt torch dependencies according to your project's dependencies.
+## Documentation
+Full documentation is available on [Read the Docs](https://flowcept.readthedocs.io/).
 ## Cite us
 If you used Flowcept in your research, consider citing our paper.

{flowcept-0.8.8 → flowcept-0.8.10}/docs/getstarted.rst RENAMED

@@ -40,7 +40,8 @@ Customizing Settings
 Flowcept allows extensive configuration via a YAML file. To use a custom configuration, set the environment variable
 ``FLOWCEPT_SETTINGS_PATH`` to point to the absolute path of your settings file. A sample file is provided at For more options, see the `sample_settings.yaml <https://github.com/ORNL/flowcept/blob/main/resources/sample_settings.yaml>`_.
- **Key Settings to Adjust**
+Key Settings to Adjust
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 - **Service Connections:** Set host, port, and credentials for MQ (`mq:`), key-value DB (`kv_db:`), and optionally MongoDB (`mongodb:`).

flowcept-0.8.10/docs/index.rst ADDED

@@ -0,0 +1,18 @@
+Flowcept
+========
+GitHub Repository: https://github.com/ornl/flowcept
+Flowcept is a runtime data integration system that captures and queries workflow provenance with minimal or no code changes. It unifies data from diverse workflows and tools, enabling integrated analysis and insights, especially in federated environments.
+Designed for scenarios involving critical data from multiple workflows, Flowcept supports end-to-end monitoring, analysis, querying, and enhanced support for Machine Learning (ML) workflows.
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents:
+   getstarted
+   schemas
+   contributing
+   api-reference

{flowcept-0.8.8 → flowcept-0.8.10}/pyproject.toml RENAMED

@@ -110,3 +110,6 @@ packages = ["src/flowcept"]
 [tool.hatch.build.targets.wheel.force-include]
 "resources/sample_settings.yaml" = "resources/sample_settings.yaml"
+[project.scripts]
+flowcept = "flowcept.cli:main"
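
The `[project.scripts]` entry above asks the installer to generate a `flowcept` console command that imports `flowcept.cli` and calls its `main` function. As a rough illustration of how a `module:attr` reference of this kind is resolved, here is a minimal sketch. The helper name `resolve_entry_point` is hypothetical, and the standard-library target `json:dumps` stands in for `flowcept.cli:main` so that the snippet runs without Flowcept installed.

```python
import importlib


def resolve_entry_point(spec: str):
    """Resolve a 'package.module:attr' reference to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    # Walk dotted attribute paths such as "Class.method"; skip empty parts.
    for part in filter(None, attr_path.split(".")):
        obj = getattr(obj, part)
    return obj


# Stand-in target; with Flowcept installed the spec would be "flowcept.cli:main".
func = resolve_entry_point("json:dumps")
print(func({"ok": True}))
```

Installers generate a small wrapper script that performs an equivalent import and then calls the resolved function.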

{flowcept-0.8.8 → flowcept-0.8.10}/resources/sample_settings.yaml RENAMED

@@ -1,4 +1,4 @@
-flowcept_version: 0.8.8 # Version of the Flowcept package. This setting file is compatible with this version.
+flowcept_version: 0.8.10 # Version of the Flowcept package. This setting file is compatible with this version.
 project:
   debug: true # Toggle debug mode. This will add a property `debug: true` to all saved data, making it easier to retrieve/delete them later.
@@ -25,7 +25,6 @@ telemetry_capture: # This toggles each individual type of telemetry capture. GPU
 instrumentation:
   enabled: true # This toggles data capture for instrumentation.
-  singleton: true # Use a single instrumentation instance per process. Defaults to true
   torch:
     what: parent_and_children # Scope of instrumentation: "parent_only" -- will capture only at the main model level, "parent_and_children" -- will capture the inner layers, or ~ (disable).
     children_mode: telemetry_and_tensor_inspection   # What to capture if parent_and_children is chosen in the scope. Possible values: "tensor_inspection" (i.e., tensor metadata), "telemetry", "telemetry_and_tensor_inspection"
@@ -49,7 +48,7 @@ mq:
   timing: false
   chunk_size: -1  # use 0 or -1 to disable this. Or simply omit this from the config file.
-kv_db:
+kv_db:  # You can optionally use KV == MQ if MQ is Redis. Otherwise, these will be the Redis instance credentials.
   host: localhost
   port: 6379
   # uri: use Redis connection uri here
@@ -59,9 +58,9 @@ web_server:
   port: 5000
 sys_metadata:
-  environment_id: "laptop"
+  environment_id: "laptop"   # We use this to keep track of the environment used to run an experiment. Typical values include the cluster name, but it can be anything that you think will help identify your experimentation environment.
-extra_metadata:
+extra_metadata: # We use this to store any extra metadata you want to keep track of during an experiment.
   place_holder: ""
 analytics:
@@ -70,13 +69,11 @@ analytics:
     generated.accuracy: maximum_first
 db_buffer:
-  adaptive_buffer_size: true
-  insertion_buffer_time_secs: 5
-  max_buffer_size: 50
-  min_buffer_size: 10
-  remove_empty_fields: false
-  stop_max_trials: 240
-  stop_trials_sleep: 0.01
+  insertion_buffer_time_secs: 5   # Time interval (in seconds) to buffer incoming records before flushing to the database
+  buffer_size: 50    # Maximum number of records to hold in the buffer before forcing a flush
+  remove_empty_fields: false    # If true, fields with null/empty values will be removed before insertion
+  stop_max_trials: 240    # Maximum number of trials before giving up when waiting for a fully safe stop (i.e., all records have been inserted as expected).
+  stop_trials_sleep: 0.01   # Sleep duration (in seconds) between trials when waiting for a fully safe stop.
 databases:
@@ -89,7 +86,7 @@ databases:
     host: localhost
     port: 27017
     db: flowcept
-    create_collection_index: true
+    create_collection_index: true  # Whether flowcept should create collection indices if they haven't been created yet. This is done only at the Flowcept start up.
 adapters:
   # For each key below, you can have multiple instances. Like mlflow1, mlflow2; zambeze1, zambeze2. Use an empty dict, {}, if you won't use any adapter.

flowcept-0.8.10/src/flowcept/cli.py ADDED

@@ -0,0 +1,260 @@
+"""
+Flowcept CLI.
+How to add a new command:
+--------------------------
+1. Write a function with type-annotated arguments and a NumPy-style docstring.
+2. Add it to one of the groups in `COMMAND_GROUPS`.
+3. It will automatically become available as `flowcept --<function-name>` (underscores become hyphens).
+Supports:
+- `flowcept --command`
+- `flowcept --command --arg=value`
+- `flowcept -h` or `flowcept` for full help
+- `flowcept --help --command` for command-specific help
+"""
+import argparse
+import os
+import sys
+import json
+import textwrap
+import inspect
+from functools import wraps
+from typing import List
+from flowcept import Flowcept, configs
+def no_docstring(func):
+    """Decorator to silence linter for missing docstrings."""
+    @wraps(func)
+    def wrapper(*args, **kwargs):
+        return func(*args, **kwargs)
+    return wrapper
+def show_config():
+    """
+    Show Flowcept configuration.
+    """
+    config_data = {
+        "session_settings_path": configs.SETTINGS_PATH,
+        "env_FLOWCEPT_SETTINGS_PATH": os.environ.get("FLOWCEPT_SETTINGS_PATH", None),
+    }
+    print(f"This is the settings path in this session: {configs.SETTINGS_PATH}")
+    print(
+        f"This is your FLOWCEPT_SETTINGS_PATH environment variable value: "
+        f"{config_data['env_FLOWCEPT_SETTINGS_PATH']}"
+    )
+def start_consumption_services(bundle_exec_id: str = None, check_safe_stops: bool = False, consumers: List[str] = None):
+    """
+    Start services that consume data from a queue or other source.
+    Parameters
+    ----------
+    bundle_exec_id : str, optional
+        The ID of the bundle execution to associate with the consumers.
+    check_safe_stops : bool, optional
+        Whether to check for safe stopping conditions before starting.
+    consumers : list of str, optional
+        List of consumer IDs to start. If not provided, all consumers will be started.
+    """
+    print("Starting consumption services...")
+    print(f"  bundle_exec_id: {bundle_exec_id}")
+    print(f"  check_safe_stops: {check_safe_stops}")
+    print(f"  consumers: {consumers or []}")
+    Flowcept.start_consumption_services(
+        bundle_exec_id=bundle_exec_id,
+        check_safe_stops=check_safe_stops,
+        consumers=consumers,
+    )
+def stop_consumption_services():
+    """
+    Stop the document inserter.
+    """
+    print("Not implemented yet.")
+def start_services(with_mongo: bool = False):
+    """
+    Start Flowcept services (optionally including MongoDB).
+    Parameters
+    ----------
+    with_mongo : bool, optional
+        Whether to also start MongoDB.
+    """
+    print(f"Starting services{' with Mongo' if with_mongo else ''}")
+    print("Not implemented yet.")
+def stop_services():
+    """
+    Stop Flowcept services.
+    """
+    print("Not implemented yet.")
+def workflow_count(workflow_id: str):
+    """
+    Count number of documents in the DB.
+    Parameters
+    ----------
+    workflow_id : str
+        The ID of the workflow to count tasks for.
+    """
+    result = {
+        "workflow_id": workflow_id,
+        "tasks": len(Flowcept.db.query({"workflow_id": workflow_id})),
+        "workflows": len(Flowcept.db.query({"workflow_id": workflow_id}, collection="workflows")),
+        "objects": len(Flowcept.db.query({"workflow_id": workflow_id}, collection="objects")),
+    }
+    print(json.dumps(result, indent=2))
+def query(query_str: str):
+    """
+    Query the Document DB.
+    Parameters
+    ----------
+    query_str : str
+        A JSON string representing the Mongo query.
+    """
+    query = json.loads(query_str)
+    print(Flowcept.db.query(query))
+COMMAND_GROUPS = [
+    ("Basic Commands", [show_config, start_services, stop_services]),
+    ("Consumption Commands", [start_consumption_services, stop_consumption_services]),
+    ("Database Commands", [workflow_count, query]),
+]
+COMMANDS = set(f for _, fs in COMMAND_GROUPS for f in fs)
+def _parse_numpy_doc(docstring: str):
+    parsed = {}
+    lines = docstring.splitlines() if docstring else []
+    in_params = False
+    for line in lines:
+        line = line.strip()
+        if line.lower().startswith("parameters"):
+            in_params = True
+            continue
+        if in_params:
+            if " : " in line:
+                name, typeinfo = line.split(" : ", 1)
+                parsed[name.strip()] = {"type": typeinfo.strip(), "desc": ""}
+            elif parsed:
+                last = list(parsed)[-1]
+                parsed[last]["desc"] += " " + line
+    return parsed
+@no_docstring
+def main():  # noqa: D103
+    parser = argparse.ArgumentParser(
+        description="Flowcept CLI", formatter_class=argparse.RawTextHelpFormatter, add_help=False
+    )
+    for func in COMMANDS:
+        doc = func.__doc__ or ""
+        func_name = func.__name__
+        flag = f"--{func_name.replace('_', '-')}"
+        short_help = doc.strip().splitlines()[0] if doc else ""
+        parser.add_argument(flag, action="store_true", help=short_help)
+        for pname, param in inspect.signature(func).parameters.items():
+            arg_name = f"--{pname.replace('_', '-')}"
+            params_doc = _parse_numpy_doc(doc).get(pname, {})
+            help_text = f"{params_doc.get('type', '')} - {params_doc.get('desc', '').strip()}"
+            if isinstance(param.annotation, bool):
+                parser.add_argument(arg_name, action="store_true", help=help_text)
+            elif param.annotation == List[str]:
+                parser.add_argument(arg_name, type=lambda s: s.split(","), help=help_text)
+            else:
+                parser.add_argument(arg_name, type=str, help=help_text)
+    # Handle --help --command
+    help_flag = "--help" in sys.argv
+    command_flags = {f"--{f.__name__.replace('_', '-')}" for f in COMMANDS}
+    matched_command_flag = next((arg for arg in sys.argv if arg in command_flags), None)
+    if help_flag and matched_command_flag:
+        command_func = next(f for f in COMMANDS if f"--{f.__name__.replace('_', '-')}" == matched_command_flag)
+        doc = command_func.__doc__ or ""
+        sig = inspect.signature(command_func)
+        print(f"\nHelp for `flowcept {matched_command_flag}`:\n")
+        print(textwrap.indent(doc.strip(), "  "))
+        print("\n  Arguments:")
+        params = _parse_numpy_doc(doc)
+        for pname, p in sig.parameters.items():
+            meta = params.get(pname, {})
+            opt = p.default != inspect.Parameter.empty
+            print(
+                f"    --{pname:<18} {meta.get('type', 'str')}, "
+                f"{'optional' if opt else 'required'} - {meta.get('desc', '').strip()}"
+            )
+        print()
+        sys.exit(0)
+    if len(sys.argv) == 1 or help_flag:
+        print("\nFlowcept CLI\n")
+        for group, funcs in COMMAND_GROUPS:
+            print(f"{group}:\n")
+            for func in funcs:
+                name = func.__name__
+                flag = f"--{name.replace('_', '-')}"
+                doc = func.__doc__ or ""
+                summary = doc.strip().splitlines()[0] if doc else ""
+                sig = inspect.signature(func)
+                print(f"  flowcept {flag}", end="")
+                for pname, p in sig.parameters.items():
+                    is_opt = p.default != inspect.Parameter.empty
+                    print(f" [--{pname.replace('_', '-')}] " if is_opt else f" --{pname.replace('_', '-')}", end="")
+                print(f"\n      {summary}")
+                params = _parse_numpy_doc(doc)
+                if params:
+                    print("      Arguments:")
+                    for argname, meta in params.items():
+                        opt = sig.parameters[argname].default != inspect.Parameter.empty
+                        print(
+                            f"          --"
+                            f"{argname:<18} {meta['type']}, "
+                            f"{'optional' if opt else 'required'} - {meta['desc'].strip()}"
+                        )
+                print()
+        print("Run `flowcept --<command>` to invoke a command.\n")
+        sys.exit(0)
+    args = vars(parser.parse_args())
+    for func in COMMANDS:
+        flag = f"--{func.__name__.replace('_', '-')}"
+        if args.get(func.__name__.replace("-", "_")):
+            sig = inspect.signature(func)
+            kwargs = {}
+            for pname in sig.parameters:
+                val = args.get(pname.replace("-", "_"))
+                if val is not None:
+                    kwargs[pname] = val
+            func(**kwargs)
+            break
+    else:
+        print("Unknown command. Use `flowcept -h` to see available commands.")
+        sys.exit(1)
+if __name__ == "__main__":
+    main()
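
Review note on `main()` above: `isinstance(param.annotation, bool)` asks whether the annotation object is itself a `bool` instance. For a parameter declared as `check_safe_stops: bool`, the annotation is the class `bool`, so that branch appears unreachable and such parameters would fall through to `type=str`. The sketch below shows the difference and one possible check (`param.annotation is bool`). The `demo` command and the `build_parser` helper are hypothetical and exist only for this illustration.

```python
import argparse
import inspect
from typing import List


def demo(name: str = None, verbose: bool = False, items: List[str] = None):
    """Hypothetical command used only for this illustration."""


def build_parser(func) -> argparse.ArgumentParser:
    """Build one CLI flag per parameter, chosen from the type annotation."""
    parser = argparse.ArgumentParser(add_help=False)
    for pname, param in inspect.signature(func).parameters.items():
        flag = f"--{pname.replace('_', '-')}"
        if param.annotation is bool:  # compare against the class itself
            parser.add_argument(flag, action="store_true")
        elif param.annotation == List[str]:
            parser.add_argument(flag, type=lambda s: s.split(","))
        else:
            parser.add_argument(flag, type=str)
    return parser


annotation = inspect.signature(demo).parameters["verbose"].annotation
print(isinstance(annotation, bool))  # the check used in the diff
print(annotation is bool)            # the check used in this sketch

args = build_parser(demo).parse_args(["--verbose", "--items", "a,b"])
print(args.verbose, args.items)
```

With the identity check, `--verbose` can be passed as a bare flag instead of requiring a string value.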

{flowcept-0.8.8 → flowcept-0.8.10}/src/flowcept/configs.py RENAMED

@@ -126,11 +126,9 @@ if not LMDB_ENABLED and not MONGO_ENABLED:
 # DB Buffer Settings        #
 ##########################
 db_buffer_settings = settings["db_buffer"]
-# In seconds:
-INSERTION_BUFFER_TIME = db_buffer_settings.get("insertion_buffer_time_secs", None)
-ADAPTIVE_DB_BUFFER_SIZE = db_buffer_settings.get("adaptive_buffer_size", True)
-DB_MAX_BUFFER_SIZE = int(db_buffer_settings.get("max_buffer_size", 50))
-DB_MIN_BUFFER_SIZE = max(1, int(db_buffer_settings.get("min_buffer_size", 10)))
+INSERTION_BUFFER_TIME = db_buffer_settings.get("insertion_buffer_time_secs", None)  # In seconds:
+DB_BUFFER_SIZE = int(db_buffer_settings.get("buffer_size", 50))
 REMOVE_EMPTY_FIELDS = db_buffer_settings.get("remove_empty_fields", False)
 DB_INSERTER_MAX_TRIALS_STOP = db_buffer_settings.get("stop_max_trials", 240)
 DB_INSERTER_SLEEP_TRIALS_STOP = db_buffer_settings.get("stop_trials_sleep", 0.01)
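
Review note on the hunk above: only the `buffer_size` key is read after this change, so a settings file that still carries `max_buffer_size`, `min_buffer_size`, or `adaptive_buffer_size` would fall back to the default of 50 without any warning. A minimal sketch of that lookup follows. The function name `read_buffer_size` and both sample dictionaries are hypothetical; only the `.get("buffer_size", 50)` expression is taken from the diff.

```python
def read_buffer_size(db_buffer_settings: dict) -> int:
    """Mirror the lookup in the hunk: one key, default 50, coerced to int."""
    return int(db_buffer_settings.get("buffer_size", 50))


# A 0.8.8-style section: the old keys are simply ignored.
old_style = {"max_buffer_size": 200, "min_buffer_size": 10, "adaptive_buffer_size": True}
# A 0.8.10-style section.
new_style = {"buffer_size": 200}

print(read_buffer_size(old_style))
print(read_buffer_size(new_style))
```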

{flowcept-0.8.8 → flowcept-0.8.10}/src/flowcept/flowceptor/adapters/base_interceptor.py RENAMED

@@ -9,7 +9,6 @@ from flowcept.commons.flowcept_dataclasses.workflow_object import (
 )
 from flowcept.configs import (
     ENRICH_MESSAGES,
-    INSTRUMENTATION,
 )
 from flowcept.commons.flowcept_logger import FlowceptLogger
 from flowcept.commons.daos.mq_dao.mq_dao_base import MQDao
@@ -50,23 +49,15 @@ class BaseInterceptor(object):
         elif kind in "dask":
             # This is dask's client interceptor. We essentially use it to store the dask workflow.
             # That's why we don't need another special interceptor and we can reuse the instrumentation one.
-            return BaseInterceptor._build_instrumentation_interceptor()
-        elif kind == "instrumentation":
-            return BaseInterceptor._build_instrumentation_interceptor()
-        else:
-            raise NotImplementedError
+            from flowcept.flowceptor.adapters.instrumentation_interceptor import InstrumentationInterceptor
-    @staticmethod
-    def _build_instrumentation_interceptor():
-        # By using singleton, we lose the thread safety for the Interceptor, particularly, its MQ buffer.
-        # Since some use cases need threads, this allows disabling the singleton for more thread safety.
-        is_singleton = INSTRUMENTATION.get("singleton", True)
-        if is_singleton:
+            return InstrumentationInterceptor.get_instance()
+        elif kind == "instrumentation":
             from flowcept.flowceptor.adapters.instrumentation_interceptor import InstrumentationInterceptor
             return InstrumentationInterceptor.get_instance()
         else:
-            return BaseInterceptor(kind="instrumentation")
+            raise NotImplementedError
     def __init__(self, plugin_key=None, kind=None):
         self.logger = FlowceptLogger()
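
Review note on the unchanged context line `elif kind in "dask":` in the hunk above: with a string on the right-hand side, `in` is a substring test, so values other than `"dask"` can also match. This may be intentional, but if an exact match is meant, `==` is the usual spelling. The sketch below contrasts the two; the helper names and sample values are hypothetical.

```python
def matches_substring(kind: str) -> bool:
    """Substring test, as written in the context line."""
    return kind in "dask"


def matches_exact(kind: str) -> bool:
    """Exact comparison, shown as an alternative."""
    return kind == "dask"


for kind in ["dask", "da", "", "instrumentation"]:
    print(repr(kind), matches_substring(kind), matches_exact(kind))
```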

{flowcept-0.8.8 → flowcept-0.8.10}/src/flowcept/flowceptor/consumers/document_inserter.py RENAMED

@@ -16,11 +16,9 @@ from flowcept.commons.utils import GenericJSONDecoder
 from flowcept.commons.vocabulary import Status
 from flowcept.configs import (
     INSERTION_BUFFER_TIME,
-    DB_MAX_BUFFER_SIZE,
-    DB_MIN_BUFFER_SIZE,
+    DB_BUFFER_SIZE,
     DB_INSERTER_MAX_TRIALS_STOP,
     DB_INSERTER_SLEEP_TRIALS_STOP,
-    ADAPTIVE_DB_BUFFER_SIZE,
     REMOVE_EMPTY_FIELDS,
     JSON_SERIALIZER,
     ENRICH_MESSAGES,
@@ -67,28 +65,16 @@ class DocumentInserter:
         self._previous_time = time()
         self.logger = FlowceptLogger()
         self._main_thread: Thread = None
-        self._curr_max_buffer_size = DB_MAX_BUFFER_SIZE
+        self._curr_db_buffer_size = DB_BUFFER_SIZE
         self._bundle_exec_id = bundle_exec_id
         self.check_safe_stops = check_safe_stops
         self.buffer: AutoflushBuffer = AutoflushBuffer(
             flush_function=DocumentInserter.flush_function,
             flush_function_kwargs={"logger": self.logger, "doc_daos": self._doc_daos},
-            max_size=self._curr_max_buffer_size,
+            max_size=self._curr_db_buffer_size,
             flush_interval=INSERTION_BUFFER_TIME,
         )
-    def _set_buffer_size(self):
-        if not ADAPTIVE_DB_BUFFER_SIZE:
-            return
-        else:
-            self._curr_max_buffer_size = max(
-                DB_MIN_BUFFER_SIZE,
-                min(
-                    DB_MAX_BUFFER_SIZE,
-                    int(self._curr_max_buffer_size * 1.1),
-                ),
-            )
     @staticmethod
     def flush_function(buffer, doc_daos, logger):
         """Flush it."""

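With adaptive sizing removed, the inserter in the hunk above passes a fixed `max_size` and a `flush_interval` to `AutoflushBuffer`. The general idea of a size-triggered flush can be sketched as follows. `SimpleBuffer` is a hypothetical stand-in written for this note, not Flowcept's `AutoflushBuffer`, and the time-based flush is left out to keep the example short.

```python
from typing import Callable, List


class SimpleBuffer:
    """Hypothetical size-triggered buffer; not Flowcept's AutoflushBuffer."""

    def __init__(self, flush_function: Callable[[List[dict]], None], max_size: int = 50):
        self.flush_function = flush_function
        self.max_size = max_size
        self._items: List[dict] = []

    def append(self, item: dict) -> None:
        """Store one record and flush once max_size records are held."""
        self._items.append(item)
        if len(self._items) >= self.max_size:
            self.flush()

    def flush(self) -> None:
        """Hand the current batch to flush_function and start a new one."""
        if self._items:
            batch, self._items = self._items, []
            self.flush_function(batch)


flushed = []
buf = SimpleBuffer(flush_function=flushed.append, max_size=3)
for i in range(7):
    buf.append({"task_id": i})
buf.flush()  # final flush for the leftover record
print([len(batch) for batch in flushed])
```

A fixed size keeps batch behaviour predictable; the trade-off is that the batch size no longer adapts to load.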
{flowcept-0.8.8 → flowcept-0.8.10}/src/flowcept/instrumentation/task_capture.py RENAMED

@@ -1,5 +1,8 @@
 from time import time
 from typing import Dict
+import os
+import threading
+import random
 from flowcept.commons.flowcept_dataclasses.task_object import (
     TaskObject,
@@ -57,21 +60,16 @@ class FlowceptTask(object):
         activity_id: str = None,
         used: Dict = None,
         custom_metadata: Dict = None,
-        flowcept: "Flowcept" = None,
     ):
         if not INSTRUMENTATION_ENABLED:
             self._ended = True
             return
-        if flowcept is not None and flowcept._interceptor_instances[0].kind == "instrumentation":
-            self._interceptor = flowcept._interceptor_instances[0]
-        else:
-            self._interceptor = InstrumentationInterceptor.get_instance()
         self._task = TaskObject()
+        self._interceptor = InstrumentationInterceptor.get_instance()
         self._task.telemetry_at_start = self._interceptor.telemetry_capture.capture()
         self._task.activity_id = activity_id
         self._task.started_at = time()
-        self._task.task_id = task_id or str(self._task.started_at)
+        self._task.task_id = task_id or self._gen_task_id()
         self._task.workflow_id = workflow_id or Flowcept.current_workflow_id
         self._task.campaign_id = campaign_id or Flowcept.campaign_id
         self._task.used = used
@@ -85,6 +83,12 @@ class FlowceptTask(object):
         if not self._ended:
             self.end()
+    def _gen_task_id(self):
+        pid = os.getpid()
+        tid = threading.get_ident()
+        rand = random.getrandbits(32)
+        return f"{self._task.started_at}_{pid}_{tid}_{rand}"
     def end(
         self,
         generated: Dict = None,
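
The new `_gen_task_id` in the hunk above replaces the bare start timestamp with a composite of timestamp, process ID, thread ID, and 32 random bits, which makes collisions much less likely when several tasks start at nearly the same time. A standalone sketch of the same scheme follows. The free function `gen_task_id` is a hypothetical rewrite of the method for illustration, taking the timestamp as an argument instead of reading `self._task.started_at`.

```python
import os
import random
import threading
from time import time


def gen_task_id(started_at: float) -> str:
    """Combine timestamp, process ID, thread ID, and 32 random bits."""
    pid = os.getpid()
    tid = threading.get_ident()
    rand = random.getrandbits(32)
    return f"{started_at}_{pid}_{tid}_{rand}"


# Two IDs built from the same timestamp in the same thread differ only in
# the random suffix, so uniqueness here is probabilistic, not guaranteed.
started_at = time()
print(gen_task_id(started_at))
print(gen_task_id(started_at))
```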

{flowcept-0.8.8 → flowcept-0.8.10}/src/flowcept/version.py RENAMED

@@ -4,4 +4,4 @@
 # The expected format is: <Major>.<Minor>.<Patch>
 # This file is supposed to be automatically modified by the CI Bot.
 # See .github/workflows/version_bumper.py
-__version__ = "0.8.8"
+__version__ = "0.8.10"

flowcept-0.8.8/docs/index.rst DELETED

@@ -1,15 +0,0 @@
-Flowcept
-========
-Flowcept is a runtime data integration system that captures and queries workflow provenance with minimal or no code changes. It unifies data across diverse workflows and tools, enabling integrated analysis and insights, especially in federated environments. Designed for scenarios involving critical data from multiple workflows, Flowcept seamlessly integrates data at runtime, providing a unified view for end-to-end monitoring and analysis, and enhanced support for Machine Learning (ML) workflows.
-.. toctree::
-   :maxdepth: 2
-   :caption: Contents:
-   getstarted
-   schemas
-   contributing
-   api-reference