avtomatika 1.0b5.tar.gz → 1.0b6.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (75)
  1. {avtomatika-1.0b5/src/avtomatika.egg-info → avtomatika-1.0b6}/PKG-INFO +35 -2
  2. {avtomatika-1.0b5 → avtomatika-1.0b6}/README.md +34 -1
  3. {avtomatika-1.0b5 → avtomatika-1.0b6}/pyproject.toml +1 -1
  4. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/config.py +4 -0
  5. avtomatika-1.0b6/src/avtomatika/constants.py +30 -0
  6. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/engine.py +100 -25
  7. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/history/postgres.py +56 -13
  8. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/history/sqlite.py +54 -34
  9. avtomatika-1.0b6/src/avtomatika/logging_config.py +92 -0
  10. avtomatika-1.0b6/src/avtomatika/scheduler.py +119 -0
  11. avtomatika-1.0b6/src/avtomatika/scheduler_config_loader.py +41 -0
  12. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/security.py +3 -5
  13. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/storage/base.py +17 -3
  14. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/storage/memory.py +41 -4
  15. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/storage/redis.py +17 -0
  16. {avtomatika-1.0b5 → avtomatika-1.0b6/src/avtomatika.egg-info}/PKG-INFO +35 -2
  17. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika.egg-info/SOURCES.txt +4 -0
  18. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_engine.py +22 -16
  19. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_logging_config.py +16 -4
  20. avtomatika-1.0b6/tests/test_scheduler.py +200 -0
  21. avtomatika-1.0b5/src/avtomatika/logging_config.py +0 -41
  22. {avtomatika-1.0b5 → avtomatika-1.0b6}/LICENSE +0 -0
  23. {avtomatika-1.0b5 → avtomatika-1.0b6}/setup.cfg +0 -0
  24. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/__init__.py +0 -0
  25. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/api.html +0 -0
  26. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/blueprint.py +0 -0
  27. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/client_config_loader.py +0 -0
  28. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/compression.py +0 -0
  29. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/context.py +0 -0
  30. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/data_types.py +0 -0
  31. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/datastore.py +0 -0
  32. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/dispatcher.py +0 -0
  33. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/executor.py +0 -0
  34. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/health_checker.py +0 -0
  35. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/history/base.py +0 -0
  36. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/history/noop.py +0 -0
  37. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/metrics.py +0 -0
  38. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/py.typed +0 -0
  39. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/quota.py +0 -0
  40. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/ratelimit.py +0 -0
  41. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/reputation.py +0 -0
  42. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/storage/__init__.py +0 -0
  43. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/telemetry.py +0 -0
  44. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/watcher.py +0 -0
  45. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/worker_config_loader.py +0 -0
  46. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika/ws_manager.py +0 -0
  47. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika.egg-info/dependency_links.txt +0 -0
  48. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika.egg-info/requires.txt +0 -0
  49. {avtomatika-1.0b5 → avtomatika-1.0b6}/src/avtomatika.egg-info/top_level.txt +0 -0
  50. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_blueprint_conditions.py +0 -0
  51. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_blueprints.py +0 -0
  52. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_client_config_loader.py +0 -0
  53. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_compression.py +0 -0
  54. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_config_validation.py +0 -0
  55. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_context.py +0 -0
  56. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_dispatcher.py +0 -0
  57. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_dispatcher_extended.py +0 -0
  58. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_error_handling.py +0 -0
  59. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_executor.py +0 -0
  60. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_health_checker.py +0 -0
  61. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_history.py +0 -0
  62. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_integration.py +0 -0
  63. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_memory_locking.py +0 -0
  64. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_memory_storage.py +0 -0
  65. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_metrics.py +0 -0
  66. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_noop_history.py +0 -0
  67. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_postgres_history.py +0 -0
  68. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_ratelimit.py +0 -0
  69. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_redis_locking.py +0 -0
  70. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_redis_storage.py +0 -0
  71. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_reputation.py +0 -0
  72. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_telemetry.py +0 -0
  73. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_watcher.py +0 -0
  74. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_worker_config_loader.py +0 -0
  75. {avtomatika-1.0b5 → avtomatika-1.0b6}/tests/test_ws_manager.py +0 -0

PKG-INFO
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: avtomatika
- Version: 1.0b5
+ Version: 1.0b6
  Summary: A state-machine based orchestrator for long-running AI and other jobs.
  Project-URL: Homepage, https://github.com/avtomatika-ai/avtomatika
  Project-URL: Bug Tracker, https://github.com/avtomatika-ai/avtomatika/issues
@@ -60,6 +60,7 @@ This document serves as a comprehensive guide for developers looking to build pi
  - [Delegating Tasks to Workers (dispatch_task)](#delegating-tasks-to-workers-dispatch_task)
  - [Parallel Execution and Aggregation (Fan-out/Fan-in)](#parallel-execution-and-aggregation-fan-outfan-in)
  - [Dependency Injection (DataStore)](#dependency-injection-datastore)
+ - [Native Scheduler](#native-scheduler)
  - [Production Configuration](#production-configuration)
  - [Fault Tolerance](#fault-tolerance)
  - [Storage Backend](#storage-backend)
@@ -74,7 +75,17 @@ The project is based on a simple yet powerful architectural pattern that separat

  * **Orchestrator (OrchestratorEngine)** — The Director. It manages the entire process from start to finish, tracks state, handles errors, and decides what should happen next. It does not perform business tasks itself.
  * **Blueprints (Blueprint)** — The Script. Each blueprint is a detailed plan (a state machine) for a specific business process. It describes the steps (states) and the rules for transitioning between them.
- * **Workers (Worker)** — The Team of Specialists. These are independent, specialized executors. Each worker knows how to perform a specific set of tasks (e.g., "process video," "send email") and reports back to the Orchestrator.## Installation
+ * **Workers (Worker)** — The Team of Specialists. These are independent, specialized executors. Each worker knows how to perform a specific set of tasks (e.g., "process video," "send email") and reports back to the Orchestrator.
+
+ ## Ecosystem
+
+ Avtomatika is part of a larger ecosystem:
+
+ * **[Avtomatika Worker SDK](https://github.com/avtomatika-ai/avtomatika-worker)**: The official Python SDK for building workers that connect to this engine.
+ * **[RCA Protocol](https://github.com/avtomatika-ai/rca)**: The architectural specification and manifesto behind the system.
+ * **[Full Example](https://github.com/avtomatika-ai/avtomatika-full-example)**: A complete reference project demonstrating the engine and workers in action.
+
+ ## Installation

  * **Install the core engine only:**
  ```bash
@@ -328,6 +339,22 @@ async def cache_handler(data_stores):
  user_data = await data_stores.cache.get("user:123")
  print(f"User from cache: {user_data}")
  ```
+
+ ### 5. Native Scheduler
+
+ Avtomatika includes a built-in distributed scheduler. It allows you to trigger blueprints periodically (interval, daily, weekly, monthly) without external tools like cron.
+
+ * **Configuration:** Defined in `schedules.toml`.
+ * **Timezone Aware:** Supports global timezone configuration (e.g., `TZ="Europe/Moscow"`).
+ * **Distributed Locking:** Safe to run with multiple orchestrator instances; jobs are guaranteed to run only once per interval using distributed locks (Redis/Memory).
+
+ ```toml
+ # schedules.toml example
+ [nightly_backup]
+ blueprint = "backup_flow"
+ daily_at = "02:00"
+ ```
+
  ## Production Configuration

  The orchestrator's behavior can be configured through environment variables. Additionally, any configuration parameter loaded from environment variables can be programmatically overridden in your application code after the `Config` object has been initialized. This provides flexibility for different deployment and testing scenarios.
@@ -349,6 +376,12 @@ To manage access and worker settings securely, Avtomatika uses TOML configuratio
  [gpu-worker-01]
  token = "worker-secret-456"
  ```
+ - **`schedules.toml`**: Defines periodic tasks (CRON-like) for the native scheduler.
+ ```toml
+ [nightly_backup]
+ blueprint = "backup_flow"
+ daily_at = "02:00"
+ ```

  For detailed specifications and examples, please refer to the [**Configuration Guide**](docs/configuration.md).

README.md
@@ -14,6 +14,7 @@ This document serves as a comprehensive guide for developers looking to build pi
  - [Delegating Tasks to Workers (dispatch_task)](#delegating-tasks-to-workers-dispatch_task)
  - [Parallel Execution and Aggregation (Fan-out/Fan-in)](#parallel-execution-and-aggregation-fan-outfan-in)
  - [Dependency Injection (DataStore)](#dependency-injection-datastore)
+ - [Native Scheduler](#native-scheduler)
  - [Production Configuration](#production-configuration)
  - [Fault Tolerance](#fault-tolerance)
  - [Storage Backend](#storage-backend)
@@ -28,7 +29,17 @@ The project is based on a simple yet powerful architectural pattern that separat

  * **Orchestrator (OrchestratorEngine)** — The Director. It manages the entire process from start to finish, tracks state, handles errors, and decides what should happen next. It does not perform business tasks itself.
  * **Blueprints (Blueprint)** — The Script. Each blueprint is a detailed plan (a state machine) for a specific business process. It describes the steps (states) and the rules for transitioning between them.
- * **Workers (Worker)** — The Team of Specialists. These are independent, specialized executors. Each worker knows how to perform a specific set of tasks (e.g., "process video," "send email") and reports back to the Orchestrator.## Installation
+ * **Workers (Worker)** — The Team of Specialists. These are independent, specialized executors. Each worker knows how to perform a specific set of tasks (e.g., "process video," "send email") and reports back to the Orchestrator.
+
+ ## Ecosystem
+
+ Avtomatika is part of a larger ecosystem:
+
+ * **[Avtomatika Worker SDK](https://github.com/avtomatika-ai/avtomatika-worker)**: The official Python SDK for building workers that connect to this engine.
+ * **[RCA Protocol](https://github.com/avtomatika-ai/rca)**: The architectural specification and manifesto behind the system.
+ * **[Full Example](https://github.com/avtomatika-ai/avtomatika-full-example)**: A complete reference project demonstrating the engine and workers in action.
+
+ ## Installation

  * **Install the core engine only:**
  ```bash
@@ -282,6 +293,22 @@ async def cache_handler(data_stores):
  user_data = await data_stores.cache.get("user:123")
  print(f"User from cache: {user_data}")
  ```
+
+ ### 5. Native Scheduler
+
+ Avtomatika includes a built-in distributed scheduler. It allows you to trigger blueprints periodically (interval, daily, weekly, monthly) without external tools like cron.
+
+ * **Configuration:** Defined in `schedules.toml`.
+ * **Timezone Aware:** Supports global timezone configuration (e.g., `TZ="Europe/Moscow"`).
+ * **Distributed Locking:** Safe to run with multiple orchestrator instances; jobs are guaranteed to run only once per interval using distributed locks (Redis/Memory).
+
+ ```toml
+ # schedules.toml example
+ [nightly_backup]
+ blueprint = "backup_flow"
+ daily_at = "02:00"
+ ```
+
  ## Production Configuration

  The orchestrator's behavior can be configured through environment variables. Additionally, any configuration parameter loaded from environment variables can be programmatically overridden in your application code after the `Config` object has been initialized. This provides flexibility for different deployment and testing scenarios.
@@ -303,6 +330,12 @@ To manage access and worker settings securely, Avtomatika uses TOML configuratio
  [gpu-worker-01]
  token = "worker-secret-456"
  ```
+ - **`schedules.toml`**: Defines periodic tasks (CRON-like) for the native scheduler.
+ ```toml
+ [nightly_backup]
+ blueprint = "backup_flow"
+ daily_at = "02:00"
+ ```

  For detailed specifications and examples, please refer to the [**Configuration Guide**](docs/configuration.md).

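The Native Scheduler documented above is configured through `schedules.toml`; the loader module (`scheduler_config_loader.py`, added in this release) is not shown in this diff view. As a rough sketch only, a file of the shape shown above can be read with the standard library on Python 3.11+ (which this package already requires):

```python
# Illustrative sketch, not code from the package: reading a schedules.toml of the
# shape shown in the README diff above. The real parsing lives in
# scheduler_config_loader.py, which is not included in this diff view.
import tomllib

with open("schedules.toml", "rb") as f:
    schedules = tomllib.load(f)

for name, entry in schedules.items():
    # e.g. "nightly_backup" -> blueprint "backup_flow", daily_at "02:00"
    print(name, entry["blueprint"], entry.get("daily_at"))
```
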
pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

  [project]
  name = "avtomatika"
- version = "1.0b5"
+ version = "1.0b6"
  description = "A state-machine based orchestrator for long-running AI and other jobs."
  readme = "README.md"
  requires-python = ">=3.11"

src/avtomatika/config.py
@@ -62,3 +62,7 @@ class Config:
  # External config files
  self.WORKERS_CONFIG_PATH: str = getenv("WORKERS_CONFIG_PATH", "")
  self.CLIENTS_CONFIG_PATH: str = getenv("CLIENTS_CONFIG_PATH", "")
+ self.SCHEDULES_CONFIG_PATH: str = getenv("SCHEDULES_CONFIG_PATH", "")
+
+ # Timezone settings
+ self.TZ: str = getenv("TZ", "UTC")
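
Both new settings are read from ordinary environment variables, so they can be set before the `Config` object is constructed. A hedged sketch (the schedules path below is a placeholder, not something shipped with the package):

```python
# Sketch: the new SCHEDULES_CONFIG_PATH and TZ settings are plain environment
# variables read by Config; the path below is a placeholder.
import os

os.environ["SCHEDULES_CONFIG_PATH"] = "/etc/avtomatika/schedules.toml"
os.environ["TZ"] = "Europe/Moscow"

from avtomatika.config import Config

config = Config()
print(config.SCHEDULES_CONFIG_PATH, config.TZ)
```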

src/avtomatika/constants.py (new file)
@@ -0,0 +1,30 @@
+ """
+ Centralized constants for the Avtomatika protocol.
+ Use these constants instead of hardcoded strings to ensure consistency.
+ """
+
+ # --- Auth Headers ---
+ AUTH_HEADER_CLIENT = "X-Avtomatika-Token"
+ AUTH_HEADER_WORKER = "X-Worker-Token"
+
+ # --- Error Codes ---
+ # Error codes returned by workers in the result payload
+ ERROR_CODE_TRANSIENT = "TRANSIENT_ERROR"
+ ERROR_CODE_PERMANENT = "PERMANENT_ERROR"
+ ERROR_CODE_INVALID_INPUT = "INVALID_INPUT_ERROR"
+
+ # --- Task Statuses ---
+ # Standard statuses for task results
+ TASK_STATUS_SUCCESS = "success"
+ TASK_STATUS_FAILURE = "failure"
+ TASK_STATUS_CANCELLED = "cancelled"
+
+ # --- Job Statuses ---
+ JOB_STATUS_PENDING = "pending"
+ JOB_STATUS_WAITING_FOR_WORKER = "waiting_for_worker"
+ JOB_STATUS_RUNNING = "running"
+ JOB_STATUS_FAILED = "failed"
+ JOB_STATUS_QUARANTINED = "quarantined"
+ JOB_STATUS_CANCELLED = "cancelled"
+ JOB_STATUS_WAITING_FOR_HUMAN = "waiting_for_human"
+ JOB_STATUS_WAITING_FOR_PARALLEL = "waiting_for_parallel_tasks"
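
These constants mirror the string literals the engine previously hardcoded (see the `engine.py` changes below). A sketch of a worker result payload expressed with them; the IDs and message are illustrative, and the field layout follows what the engine reads from the result payload:

```python
# Illustrative sketch of a worker result payload built from the new constants.
# Field names follow what the engine reads: result.status, result.error.code,
# result.error.message; the IDs below are made up.
from avtomatika.constants import ERROR_CODE_TRANSIENT, TASK_STATUS_FAILURE

result_payload = {
    "job_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
    "task_id": "encode-video-1",
    "worker_id": "gpu-worker-01",
    "result": {
        "status": TASK_STATUS_FAILURE,
        "error": {"code": ERROR_CODE_TRANSIENT, "message": "upstream service timed out"},
    },
}
```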

src/avtomatika/engine.py
@@ -14,6 +14,22 @@ from .blueprint import StateMachineBlueprint
  from .client_config_loader import load_client_configs_to_redis
  from .compression import compression_middleware
  from .config import Config
+ from .constants import (
+ ERROR_CODE_INVALID_INPUT,
+ ERROR_CODE_PERMANENT,
+ ERROR_CODE_TRANSIENT,
+ JOB_STATUS_CANCELLED,
+ JOB_STATUS_FAILED,
+ JOB_STATUS_PENDING,
+ JOB_STATUS_QUARANTINED,
+ JOB_STATUS_RUNNING,
+ JOB_STATUS_WAITING_FOR_HUMAN,
+ JOB_STATUS_WAITING_FOR_PARALLEL,
+ JOB_STATUS_WAITING_FOR_WORKER,
+ TASK_STATUS_CANCELLED,
+ TASK_STATUS_FAILURE,
+ TASK_STATUS_SUCCESS,
+ )
  from .dispatcher import Dispatcher
  from .executor import JobExecutor
  from .health_checker import HealthChecker
@@ -23,6 +39,7 @@ from .logging_config import setup_logging
  from .quota import quota_middleware_factory
  from .ratelimit import rate_limit_middleware_factory
  from .reputation import ReputationCalculator
+ from .scheduler import Scheduler
  from .security import client_auth_middleware_factory, worker_auth_middleware_factory
  from .storage.base import StorageBackend
  from .telemetry import setup_telemetry
@@ -38,10 +55,13 @@ EXECUTOR_KEY = AppKey("executor", JobExecutor)
  WATCHER_KEY = AppKey("watcher", Watcher)
  REPUTATION_CALCULATOR_KEY = AppKey("reputation_calculator", ReputationCalculator)
  HEALTH_CHECKER_KEY = AppKey("health_checker", HealthChecker)
+ SCHEDULER_KEY = AppKey("scheduler", Scheduler)
+
  EXECUTOR_TASK_KEY = AppKey("executor_task", Task)
  WATCHER_TASK_KEY = AppKey("watcher_task", Task)
  REPUTATION_CALCULATOR_TASK_KEY = AppKey("reputation_calculator_task", Task)
  HEALTH_CHECKER_TASK_KEY = AppKey("health_checker_task", Task)
+ SCHEDULER_TASK_KEY = AppKey("scheduler_task", Task)

  metrics.init_metrics()

@@ -66,7 +86,7 @@ async def metrics_handler(_request: web.Request) -> web.Response:

  class OrchestratorEngine:
  def __init__(self, storage: StorageBackend, config: Config):
- setup_logging(config.LOG_LEVEL, config.LOG_FORMAT)
+ setup_logging(config.LOG_LEVEL, config.LOG_FORMAT, config.TZ)
  setup_telemetry()
  self.storage = storage
  self.config = config
@@ -115,7 +135,7 @@ class OrchestratorEngine:
  storage_class = module.SQLiteHistoryStorage
  parsed_uri = urlparse(uri)
  db_path = parsed_uri.path
- storage_args = [db_path]
+ storage_args = [db_path, self.config.TZ]
  except ImportError as e:
  logger.error(f"Could not import SQLiteHistoryStorage, perhaps aiosqlite is not installed? Error: {e}")
  self.history_storage = NoOpHistoryStorage()
@@ -125,7 +145,7 @@ class OrchestratorEngine:
  try:
  module = import_module(".history.postgres", package="avtomatika")
  storage_class = module.PostgresHistoryStorage
- storage_args = [uri]
+ storage_args = [uri, self.config.TZ]
  except ImportError as e:
  logger.error(f"Could not import PostgresHistoryStorage, perhaps asyncpg is not installed? Error: {e}")
  self.history_storage = NoOpHistoryStorage()
@@ -199,11 +219,13 @@ class OrchestratorEngine:
  app[WATCHER_KEY] = Watcher(self)
  app[REPUTATION_CALCULATOR_KEY] = ReputationCalculator(self)
  app[HEALTH_CHECKER_KEY] = HealthChecker(self)
+ app[SCHEDULER_KEY] = Scheduler(self)

  app[EXECUTOR_TASK_KEY] = create_task(app[EXECUTOR_KEY].run())
  app[WATCHER_TASK_KEY] = create_task(app[WATCHER_KEY].run())
  app[REPUTATION_CALCULATOR_TASK_KEY] = create_task(app[REPUTATION_CALCULATOR_KEY].run())
  app[HEALTH_CHECKER_TASK_KEY] = create_task(app[HEALTH_CHECKER_KEY].run())
+ app[SCHEDULER_TASK_KEY] = create_task(app[SCHEDULER_KEY].run())

  async def on_shutdown(self, app: web.Application):
  logger.info("Shutdown sequence started.")
@@ -211,6 +233,7 @@ class OrchestratorEngine:
  app[WATCHER_KEY].stop()
  app[REPUTATION_CALCULATOR_KEY].stop()
  app[HEALTH_CHECKER_KEY].stop()
+ app[SCHEDULER_KEY].stop()
  logger.info("Background task running flags set to False.")

  if hasattr(self.history_storage, "close"):
@@ -226,6 +249,8 @@ class OrchestratorEngine:
  app[WATCHER_TASK_KEY].cancel()
  app[REPUTATION_CALCULATOR_TASK_KEY].cancel()
  app[EXECUTOR_TASK_KEY].cancel()
+ # Scheduler task manages its own loop cancellation in stop(), but just in case:
+ app[SCHEDULER_TASK_KEY].cancel()
  logger.info("Background tasks cancelled.")

  logger.info("Gathering background tasks with a 10s timeout...")
@@ -236,6 +261,7 @@ class OrchestratorEngine:
  app[WATCHER_TASK_KEY],
  app[REPUTATION_CALCULATOR_TASK_KEY],
  app[EXECUTOR_TASK_KEY],
+ app[SCHEDULER_TASK_KEY],
  return_exceptions=True,
  ),
  timeout=10.0,
@@ -249,6 +275,55 @@ class OrchestratorEngine:
  logger.info("HTTP session closed.")
  logger.info("Shutdown sequence finished.")

+ async def create_background_job(
+ self,
+ blueprint_name: str,
+ initial_data: dict[str, Any],
+ source: str = "internal",
+ ) -> str:
+ """Creates a job directly, bypassing the HTTP API layer.
+ Useful for internal schedulers and triggers.
+ """
+ blueprint = self.blueprints.get(blueprint_name)
+ if not blueprint:
+ raise ValueError(f"Blueprint '{blueprint_name}' not found.")
+
+ job_id = str(uuid4())
+ # Use a special internal client config
+ client_config = {
+ "token": "internal-scheduler",
+ "plan": "system",
+ "params": {"source": source},
+ }
+
+ job_state = {
+ "id": job_id,
+ "blueprint_name": blueprint.name,
+ "current_state": blueprint.start_state,
+ "initial_data": initial_data,
+ "state_history": {},
+ "status": JOB_STATUS_PENDING,
+ "tracing_context": {},
+ "client_config": client_config,
+ }
+ await self.storage.save_job_state(job_id, job_state)
+ await self.storage.enqueue_job(job_id)
+ metrics.jobs_total.inc({metrics.LABEL_BLUEPRINT: blueprint.name})
+
+ # Log the creation in history as well (so we can track scheduled jobs)
+ await self.history_storage.log_job_event(
+ {
+ "job_id": job_id,
+ "state": "pending",
+ "event_type": "job_created",
+ "context_snapshot": job_state,
+ "metadata": {"source": source, "scheduled": True},
+ }
+ )
+
+ logger.info(f"Created background job {job_id} for blueprint '{blueprint_name}' (source: {source})")
+ return job_id
+
  def _create_job_handler(self, blueprint: StateMachineBlueprint) -> Callable:
  async def handler(request: web.Request) -> web.Response:
  try:
@@ -266,7 +341,7 @@ class OrchestratorEngine:
  "current_state": blueprint.start_state,
  "initial_data": initial_data,
  "state_history": {},
- "status": "pending",
+ "status": JOB_STATUS_PENDING,
  "tracing_context": carrier,
  "client_config": client_config,
  }
@@ -295,7 +370,7 @@ class OrchestratorEngine:
  if not job_state:
  return json_response({"error": "Job not found"}, status=404)

- if job_state.get("status") != "waiting_for_worker":
+ if job_state.get("status") != JOB_STATUS_WAITING_FOR_WORKER:
  return json_response(
  {"error": "Job is not in a state that can be cancelled (must be waiting for a worker)."},
  status=409,
@@ -388,7 +463,7 @@ class OrchestratorEngine:
  job_id = data.get("job_id")
  task_id = data.get("task_id")
  result = data.get("result", {})
- result_status = result.get("status", "success")
+ result_status = result.get("status", TASK_STATUS_SUCCESS)
  error_message = result.get("error")
  payload_worker_id = data.get("worker_id")
  except Exception:
@@ -417,14 +492,14 @@ class OrchestratorEngine:
  return json_response({"error": "Job not found"}, status=404)

  # Handle parallel task completion
- if job_state.get("status") == "waiting_for_parallel_tasks":
+ if job_state.get("status") == JOB_STATUS_WAITING_FOR_PARALLEL:
  await self.storage.remove_job_from_watch(f"{job_id}:{task_id}")
  job_state.setdefault("aggregation_results", {})[task_id] = result
  job_state.setdefault("active_branches", []).remove(task_id)

  if not job_state["active_branches"]:
  logger.info(f"All parallel branches for job {job_id} have completed.")
- job_state["status"] = "running"
+ job_state["status"] = JOB_STATUS_RUNNING
  job_state["current_state"] = job_state["aggregation_target"]
  await self.storage.save_job_state(job_id, job_state)
  await self.storage.enqueue_job(job_id)
@@ -458,13 +533,13 @@ class OrchestratorEngine:

  job_state["tracing_context"] = {str(k): v for k, v in request.headers.items()}

- if result_status == "failure":
+ if result_status == TASK_STATUS_FAILURE:
  error_details = result.get("error", {})
- error_type = "TRANSIENT_ERROR"
+ error_type = ERROR_CODE_TRANSIENT
  error_message = "No error details provided."

  if isinstance(error_details, dict):
- error_type = error_details.get("code", "TRANSIENT_ERROR")
+ error_type = error_details.get("code", ERROR_CODE_TRANSIENT)
  error_message = error_details.get("message", "No error message provided.")
  elif isinstance(error_details, str):
  # Fallback for old format where `error` was just a string
@@ -472,13 +547,13 @@ class OrchestratorEngine:

  logging.warning(f"Task {task_id} for job {job_id} failed with error type '{error_type}'.")

- if error_type == "PERMANENT_ERROR":
- job_state["status"] = "quarantined"
+ if error_type == ERROR_CODE_PERMANENT:
+ job_state["status"] = JOB_STATUS_QUARANTINED
  job_state["error_message"] = f"Task failed with permanent error: {error_message}"
  await self.storage.save_job_state(job_id, job_state)
  await self.storage.quarantine_job(job_id)
- elif error_type == "INVALID_INPUT_ERROR":
- job_state["status"] = "failed"
+ elif error_type == ERROR_CODE_INVALID_INPUT:
+ job_state["status"] = JOB_STATUS_FAILED
  job_state["error_message"] = f"Task failed due to invalid input: {error_message}"
  await self.storage.save_job_state(job_id, job_state)
  else: # TRANSIENT_ERROR or any other/unspecified error
@@ -486,15 +561,15 @@ class OrchestratorEngine:

  return json_response({"status": "result_accepted_failure"}, status=200)

- if result_status == "cancelled":
+ if result_status == TASK_STATUS_CANCELLED:
  logging.info(f"Task {task_id} for job {job_id} was cancelled by worker.")
- job_state["status"] = "cancelled"
+ job_state["status"] = JOB_STATUS_CANCELLED
  await self.storage.save_job_state(job_id, job_state)
  # Optionally, trigger a specific 'cancelled' transition if defined in the blueprint
  transitions = job_state.get("current_task_transitions", {})
  if next_state := transitions.get("cancelled"):
  job_state["current_state"] = next_state
- job_state["status"] = "running" # It's running the cancellation handler now
+ job_state["status"] = JOB_STATUS_RUNNING # It's running the cancellation handler now
  await self.storage.save_job_state(job_id, job_state)
  await self.storage.enqueue_job(job_id)
  return json_response({"status": "result_accepted_cancelled"}, status=200)
@@ -510,12 +585,12 @@ class OrchestratorEngine:
  job_state["state_history"].update(worker_data)

  job_state["current_state"] = next_state
- job_state["status"] = "running"
+ job_state["status"] = JOB_STATUS_RUNNING
  await self.storage.save_job_state(job_id, job_state)
  await self.storage.enqueue_job(job_id)
  else:
  logging.error(f"Job {job_id} failed. Worker returned unhandled status '{result_status}'.")
- job_state["status"] = "failed"
+ job_state["status"] = JOB_STATUS_FAILED
  job_state["error_message"] = f"Worker returned unhandled status: {result_status}"
  await self.storage.save_job_state(job_id, job_state)

@@ -535,7 +610,7 @@ class OrchestratorEngine:
  task_info = job_state.get("current_task_info")
  if not task_info:
  logging.error(f"Cannot retry job {job_id}: missing 'current_task_info' in job state.")
- job_state["status"] = "failed"
+ job_state["status"] = JOB_STATUS_FAILED
  job_state["error_message"] = "Cannot retry: original task info not found."
  await self.storage.save_job_state(job_id, job_state)
  return
@@ -544,7 +619,7 @@ class OrchestratorEngine:
  timeout_seconds = task_info.get("timeout_seconds", self.config.WORKER_TIMEOUT_SECONDS)
  timeout_at = now + timeout_seconds

- job_state["status"] = "waiting_for_worker"
+ job_state["status"] = JOB_STATUS_WAITING_FOR_WORKER
  job_state["task_dispatched_at"] = now
  await self.storage.save_job_state(job_id, job_state)
  await self.storage.add_job_to_watch(job_id, timeout_at)
@@ -552,7 +627,7 @@ class OrchestratorEngine:
  await self.dispatcher.dispatch(job_state, task_info)
  else:
  logging.critical(f"Job {job_id} has failed {max_retries + 1} times. Moving to quarantine.")
- job_state["status"] = "quarantined"
+ job_state["status"] = JOB_STATUS_QUARANTINED
  job_state["error_message"] = f"Task failed after {max_retries + 1} attempts: {error_message}"
  await self.storage.save_job_state(job_id, job_state)
  await self.storage.quarantine_job(job_id)
@@ -571,14 +646,14 @@ class OrchestratorEngine:
  job_state = await self.storage.get_job_state(job_id)
  if not job_state:
  return json_response({"error": "Job not found"}, status=404)
- if job_state.get("status") not in ["waiting_for_worker", "waiting_for_human"]:
+ if job_state.get("status") not in [JOB_STATUS_WAITING_FOR_WORKER, JOB_STATUS_WAITING_FOR_HUMAN]:
  return json_response({"error": "Job is not in a state that can be approved"}, status=409)
  transitions = job_state.get("current_task_transitions", {})
  next_state = transitions.get(decision)
  if not next_state:
  return json_response({"error": f"Invalid decision '{decision}' for this job"}, status=400)
  job_state["current_state"] = next_state
- job_state["status"] = "running"
+ job_state["status"] = JOB_STATUS_RUNNING
  await self.storage.save_job_state(job_id, job_state)
  await self.storage.enqueue_job(job_id)
  return json_response({"status": "approval_received", "job_id": job_id})
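
The new `create_background_job` coroutine is what the scheduler (or any other internal trigger) can call to start a job without going through the HTTP API. A minimal sketch based only on the signature shown above; `engine` stands for a configured `OrchestratorEngine`, and the blueprint name and source label are illustrative:

```python
# Sketch: starting a job directly via the new create_background_job API.
# A ValueError is raised if the requested blueprint is not registered.
async def trigger_nightly_backup(engine) -> str:
    job_id = await engine.create_background_job(
        blueprint_name="backup_flow",
        initial_data={"reason": "nightly schedule"},
        source="scheduler:nightly_backup",
    )
    return job_id
```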

src/avtomatika/history/postgres.py
@@ -1,9 +1,13 @@
  from abc import ABC
+ from contextlib import suppress
+ from datetime import datetime
  from logging import getLogger
  from typing import Any
  from uuid import uuid4
+ from zoneinfo import ZoneInfo

- from asyncpg import Pool, PostgresError, create_pool # type: ignore[import-untyped]
+ from asyncpg import Connection, Pool, PostgresError, create_pool # type: ignore[import-untyped]
+ from orjson import dumps, loads

  from .base import HistoryStorageBase

@@ -41,14 +45,24 @@ CREATE_JOB_ID_INDEX_PG = "CREATE INDEX IF NOT EXISTS idx_job_id ON job_history(j
  class PostgresHistoryStorage(HistoryStorageBase, ABC):
  """Implementation of the history store based on asyncpg for PostgreSQL."""

- def __init__(self, dsn: str):
+ def __init__(self, dsn: str, tz_name: str = "UTC"):
  self._dsn = dsn
  self._pool: Pool | None = None
+ self.tz_name = tz_name
+ self.tz = ZoneInfo(tz_name)
+
+ async def _setup_connection(self, conn: Connection):
+ """Configures the connection session with the correct timezone."""
+ try:
+ await conn.execute(f"SET TIME ZONE '{self.tz_name}'")
+ except PostgresError as e:
+ logger.error(f"Failed to set timezone '{self.tz_name}' for PG connection: {e}")

  async def initialize(self):
  """Initializes the connection pool to PostgreSQL and creates tables."""
  try:
- self._pool = await create_pool(dsn=self._dsn)
+ # We use init parameter to configure each new connection in the pool
+ self._pool = await create_pool(dsn=self._dsn, init=self._setup_connection)
  if not self._pool:
  raise RuntimeError("Failed to create a connection pool.")

@@ -56,7 +70,7 @@ class PostgresHistoryStorage(HistoryStorageBase, ABC):
  await conn.execute(CREATE_JOB_HISTORY_TABLE_PG)
  await conn.execute(CREATE_WORKER_HISTORY_TABLE_PG)
  await conn.execute(CREATE_JOB_ID_INDEX_PG)
- logger.info("PostgreSQL history storage initialized.")
+ logger.info(f"PostgreSQL history storage initialized (TZ={self.tz_name}).")
  except (PostgresError, OSError) as e:
  logger.error(f"Failed to initialize PostgreSQL history storage: {e}")
  raise
@@ -74,14 +88,20 @@ class PostgresHistoryStorage(HistoryStorageBase, ABC):

  query = """
  INSERT INTO job_history (
- event_id, job_id, state, event_type, duration_ms,
+ event_id, job_id, timestamp, state, event_type, duration_ms,
  previous_state, next_state, worker_id, attempt_number,
  context_snapshot
- ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10)
+ ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)
  """
+ now = datetime.now(self.tz)
+
+ context_snapshot = event_data.get("context_snapshot")
+ context_snapshot_json = dumps(context_snapshot).decode("utf-8") if context_snapshot else None
+
  params = (
  uuid4(),
  event_data.get("job_id"),
+ now,
  event_data.get("state"),
  event_data.get("event_type"),
  event_data.get("duration_ms"),
@@ -89,7 +109,7 @@ class PostgresHistoryStorage(HistoryStorageBase, ABC):
  event_data.get("next_state"),
  event_data.get("worker_id"),
  event_data.get("attempt_number"),
- event_data.get("context_snapshot"),
+ context_snapshot_json,
  )
  try:
  async with self._pool.acquire() as conn:
@@ -104,14 +124,20 @@ class PostgresHistoryStorage(HistoryStorageBase, ABC):

  query = """
  INSERT INTO worker_history (
- event_id, worker_id, event_type, worker_info_snapshot
- ) VALUES ($1, $2, $3, $4)
+ event_id, worker_id, timestamp, event_type, worker_info_snapshot
+ ) VALUES ($1, $2, $3, $4, $5)
  """
+ now = datetime.now(self.tz)
+
+ worker_info = event_data.get("worker_info_snapshot")
+ worker_info_json = dumps(worker_info).decode("utf-8") if worker_info else None
+
  params = (
  uuid4(),
  event_data.get("worker_id"),
+ now,
  event_data.get("event_type"),
- event_data.get("worker_info_snapshot"),
+ worker_info_json,
  )
  try:
  async with self._pool.acquire() as conn:
@@ -119,6 +145,23 @@ class PostgresHistoryStorage(HistoryStorageBase, ABC):
  except PostgresError as e:
  logger.error(f"Failed to log worker event to PostgreSQL: {e}")

+ def _format_row(self, row: dict[str, Any]) -> dict[str, Any]:
+ """Helper to format a row from DB: convert timestamp to local TZ and decode JSON."""
+ item = dict(row)
+
+ if isinstance(item.get("context_snapshot"), str):
+ with suppress(Exception):
+ item["context_snapshot"] = loads(item["context_snapshot"])
+
+ if isinstance(item.get("worker_info_snapshot"), str):
+ with suppress(Exception):
+ item["worker_info_snapshot"] = loads(item["worker_info_snapshot"])
+
+ if "timestamp" in item and isinstance(item["timestamp"], datetime):
+ item["timestamp"] = item["timestamp"].astimezone(self.tz)
+
+ return item
+
  async def get_job_history(self, job_id: str) -> list[dict[str, Any]]:
  """Gets the full history for the specified job from PostgreSQL."""
  if not self._pool:
@@ -128,7 +171,7 @@ class PostgresHistoryStorage(HistoryStorageBase, ABC):
  try:
  async with self._pool.acquire() as conn:
  rows = await conn.fetch(query, job_id)
- return [dict(row) for row in rows]
+ return [self._format_row(row) for row in rows]
  except PostgresError as e:
  logger.error(
  f"Failed to get job history for job_id {job_id} from PostgreSQL: {e}",
@@ -154,7 +197,7 @@ class PostgresHistoryStorage(HistoryStorageBase, ABC):
  try:
  async with self._pool.acquire() as conn:
  rows = await conn.fetch(query, limit, offset)
- return [dict(row) for row in rows]
+ return [self._format_row(row) for row in rows]
  except PostgresError as e:
  logger.error(f"Failed to get jobs list from PostgreSQL: {e}")
  return []
@@ -206,7 +249,7 @@ class PostgresHistoryStorage(HistoryStorageBase, ABC):
  try:
  async with self._pool.acquire() as conn:
  rows = await conn.fetch(query, worker_id, since_days)
- return [dict(row) for row in rows]
+ return [self._format_row(row) for row in rows]
  except PostgresError as e:
  logger.error(f"Failed to get worker history for worker_id {worker_id} from PostgreSQL: {e}")
  return []
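
For context on the pattern used above: asyncpg's `create_pool(init=...)` runs the given coroutine once for every new connection the pool opens, so a session-level `SET TIME ZONE` applies pool-wide. A standalone sketch with placeholder DSN and timezone:

```python
# Minimal sketch of the asyncpg init-callback pattern used above; DSN and
# timezone are placeholders.
import asyncio

import asyncpg


async def set_session_timezone(conn: asyncpg.Connection) -> None:
    await conn.execute("SET TIME ZONE 'Europe/Moscow'")


async def main() -> None:
    pool = await asyncpg.create_pool(
        dsn="postgresql://user:pass@localhost/avtomatika",
        init=set_session_timezone,
    )
    async with pool.acquire() as conn:
        print(await conn.fetchval("SHOW TIME ZONE"))
    await pool.close()


asyncio.run(main())
```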