avtomatika 1.0b9.tar.gz → 1.0b11.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (95)
  1. {avtomatika-1.0b9/src/avtomatika.egg-info → avtomatika-1.0b11}/PKG-INFO +10 -5
  2. {avtomatika-1.0b9 → avtomatika-1.0b11}/README.md +8 -3
  3. {avtomatika-1.0b9 → avtomatika-1.0b11}/pyproject.toml +2 -2
  4. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/api/handlers.py +2 -2
  5. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/api.html +1 -1
  6. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/blueprint.py +11 -3
  7. avtomatika-1.0b11/src/avtomatika/constants.py +80 -0
  8. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/dispatcher.py +3 -0
  9. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/engine.py +13 -4
  10. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/executor.py +38 -18
  11. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/logging_config.py +16 -7
  12. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/s3.py +2 -3
  13. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/scheduler_config_loader.py +5 -2
  14. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/services/worker_service.py +26 -22
  15. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/storage/base.py +14 -0
  16. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/storage/memory.py +14 -3
  17. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/storage/redis.py +25 -12
  18. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/telemetry.py +8 -7
  19. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/utils/webhook_sender.py +3 -3
  20. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/ws_manager.py +13 -5
  21. {avtomatika-1.0b9 → avtomatika-1.0b11/src/avtomatika.egg-info}/PKG-INFO +10 -5
  22. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika.egg-info/requires.txt +1 -1
  23. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_error_handling.py +6 -15
  24. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_executor.py +4 -1
  25. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_integration.py +10 -8
  26. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_ws_manager.py +19 -5
  27. avtomatika-1.0b9/src/avtomatika/constants.py +0 -6
  28. {avtomatika-1.0b9 → avtomatika-1.0b11}/LICENSE +0 -0
  29. {avtomatika-1.0b9 → avtomatika-1.0b11}/setup.cfg +0 -0
  30. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/__init__.py +0 -0
  31. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/api/routes.py +0 -0
  32. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/app_keys.py +0 -0
  33. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/client_config_loader.py +0 -0
  34. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/compression.py +0 -0
  35. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/config.py +0 -0
  36. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/context.py +0 -0
  37. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/data_types.py +0 -0
  38. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/datastore.py +0 -0
  39. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/health_checker.py +0 -0
  40. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/history/base.py +0 -0
  41. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/history/noop.py +0 -0
  42. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/history/postgres.py +0 -0
  43. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/history/sqlite.py +0 -0
  44. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/metrics.py +0 -0
  45. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/py.typed +0 -0
  46. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/quota.py +0 -0
  47. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/ratelimit.py +0 -0
  48. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/reputation.py +0 -0
  49. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/scheduler.py +0 -0
  50. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/security.py +0 -0
  51. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/services/__init__.py +0 -0
  52. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/storage/__init__.py +0 -0
  53. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/utils/__init__.py +0 -0
  54. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/watcher.py +0 -0
  55. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika/worker_config_loader.py +0 -0
  56. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika.egg-info/SOURCES.txt +0 -0
  57. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika.egg-info/dependency_links.txt +0 -0
  58. {avtomatika-1.0b9 → avtomatika-1.0b11}/src/avtomatika.egg-info/top_level.txt +0 -0
  59. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_blueprint_conditions.py +0 -0
  60. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_blueprint_integrity.py +0 -0
  61. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_blueprints.py +0 -0
  62. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_client_config_loader.py +0 -0
  63. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_compression.py +0 -0
  64. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_config_validation.py +0 -0
  65. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_context.py +0 -0
  66. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_dispatcher.py +0 -0
  67. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_dispatcher_extended.py +0 -0
  68. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_engine.py +0 -0
  69. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_handlers.py +0 -0
  70. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_handlers_sts.py +0 -0
  71. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_health_checker.py +0 -0
  72. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_history.py +0 -0
  73. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_horizontal_scaling.py +0 -0
  74. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_logging_config.py +0 -0
  75. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_memory_locking.py +0 -0
  76. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_memory_storage.py +0 -0
  77. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_metrics.py +0 -0
  78. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_mtls.py +0 -0
  79. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_noop_history.py +0 -0
  80. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_optimization.py +0 -0
  81. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_postgres_history.py +0 -0
  82. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_ratelimit.py +0 -0
  83. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_redis_locking.py +0 -0
  84. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_redis_storage.py +0 -0
  85. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_reputation.py +0 -0
  86. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_rxon_handler.py +0 -0
  87. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_s3.py +0 -0
  88. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_s3_metadata.py +0 -0
  89. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_scheduler.py +0 -0
  90. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_sts.py +0 -0
  91. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_telemetry.py +0 -0
  92. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_validation_integration.py +0 -0
  93. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_watcher.py +0 -0
  94. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_webhook_sender.py +0 -0
  95. {avtomatika-1.0b9 → avtomatika-1.0b11}/tests/test_worker_config_loader.py +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: avtomatika
-Version: 1.0b9
+Version: 1.0b11
 Summary: A state-machine based orchestrator for long-running AI and other jobs.
 Author-email: Dmitrii Gagarin <madgagarin@gmail.com>
 Project-URL: Homepage, https://github.com/avtomatika-ai/avtomatika
@@ -15,7 +15,7 @@ Classifier: Typing :: Typed
 Requires-Python: >=3.11
 Description-Content-Type: text/markdown
 License-File: LICENSE
-Requires-Dist: rxon
+Requires-Dist: rxon==1.0b2
 Requires-Dist: aiohttp~=3.12
 Requires-Dist: python-json-logger~=4.0
 Requires-Dist: graphviz~=0.21
@@ -58,7 +58,6 @@ Dynamic: license-file
 
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/release/python-3110/)
-[![Tests](https://github.com/avtomatika-ai/avtomatika/actions/workflows/ci.yml/badge.svg)](https://github.com/avtomatika-ai/avtomatika/actions/workflows/ci.yml)
 [![Code Style: Ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)
 
 Avtomatika is a powerful, state-driven engine for managing complex asynchronous workflows in Python. It provides a robust framework for building scalable and resilient applications by separating process logic from execution logic.
@@ -494,10 +493,16 @@ For detailed specifications and examples, please refer to the [**Configuration G
 
 The orchestrator has built-in mechanisms for handling failures based on the `error.code` field in a worker's response.
 
-* **TRANSIENT_ERROR**: A temporary error (e.g., network failure, rate limit). The orchestrator will automatically retry the task several times.
-* **PERMANENT_ERROR**: A permanent error (e.g., a corrupted file). The task will be immediately sent to quarantine for manual investigation.
+* **TRANSIENT_ERROR**: A temporary error (e.g., network failure). The orchestrator will automatically retry the task several times.
+* **RESOURCE_EXHAUSTED_ERROR / TIMEOUT_ERROR / INTERNAL_ERROR**: Treated as transient errors and retried.
+* **PERMANENT_ERROR**: A permanent error. The task will be immediately sent to quarantine.
+* **SECURITY_ERROR / DEPENDENCY_ERROR**: Treated as permanent errors (e.g., security violation or missing model). Immediate quarantine.
 * **INVALID_INPUT_ERROR**: An error in the input data. The entire pipeline (Job) will be immediately moved to the failed state.
 
+### Progress Tracking
+
+Workers can report real-time execution progress (0-100%) and status messages. This information is automatically persisted by the Orchestrator and exposed via the Job Status API (`GET /api/v1/jobs/{job_id}`).
+
 ### Concurrency & Performance
 
 To prevent system overload during high traffic, the Orchestrator implements a backpressure mechanism for its internal job processing logic.
@@ -2,7 +2,6 @@
 
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/release/python-3110/)
-[![Tests](https://github.com/avtomatika-ai/avtomatika/actions/workflows/ci.yml/badge.svg)](https://github.com/avtomatika-ai/avtomatika/actions/workflows/ci.yml)
 [![Code Style: Ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)
 
 Avtomatika is a powerful, state-driven engine for managing complex asynchronous workflows in Python. It provides a robust framework for building scalable and resilient applications by separating process logic from execution logic.
@@ -438,10 +437,16 @@ For detailed specifications and examples, please refer to the [**Configuration G
 
 The orchestrator has built-in mechanisms for handling failures based on the `error.code` field in a worker's response.
 
-* **TRANSIENT_ERROR**: A temporary error (e.g., network failure, rate limit). The orchestrator will automatically retry the task several times.
-* **PERMANENT_ERROR**: A permanent error (e.g., a corrupted file). The task will be immediately sent to quarantine for manual investigation.
+* **TRANSIENT_ERROR**: A temporary error (e.g., network failure). The orchestrator will automatically retry the task several times.
+* **RESOURCE_EXHAUSTED_ERROR / TIMEOUT_ERROR / INTERNAL_ERROR**: Treated as transient errors and retried.
+* **PERMANENT_ERROR**: A permanent error. The task will be immediately sent to quarantine.
+* **SECURITY_ERROR / DEPENDENCY_ERROR**: Treated as permanent errors (e.g., security violation or missing model). Immediate quarantine.
 * **INVALID_INPUT_ERROR**: An error in the input data. The entire pipeline (Job) will be immediately moved to the failed state.
 
+### Progress Tracking
+
+Workers can report real-time execution progress (0-100%) and status messages. This information is automatically persisted by the Orchestrator and exposed via the Job Status API (`GET /api/v1/jobs/{job_id}`).
+
 ### Concurrency & Performance
 
 To prevent system overload during high traffic, the Orchestrator implements a backpressure mechanism for its internal job processing logic.
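As a concrete illustration of the error taxonomy documented above, the sketch below shows what a failing worker result might look like. Only the `error.code` field is documented in this README; the surrounding envelope (`status`, `error.message`) and the literal status string are assumptions for illustration.

    # Hypothetical worker result illustrating the error taxonomy above.
    # Only `error.code` is documented; the rest of the envelope is assumed.
    failure_result = {
        "job_id": "…",   # elided
        "task_id": "…",  # elided
        "status": "failure",  # assumed value of TASK_STATUS_FAILURE
        "error": {
            "code": "DEPENDENCY_ERROR",  # classified as permanent: immediate quarantine
            "message": "required model weights not found on worker",  # assumed field
        },
    }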
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "avtomatika"
-version = "1.0b9"
+version = "1.0b11"
 description = "A state-machine based orchestrator for long-running AI and other jobs."
 readme = "README.md"
 requires-python = ">=3.11"
@@ -21,7 +21,7 @@ classifiers = [
     "Typing :: Typed",
 ]
 dependencies = [
-    "rxon",
+    "rxon==1.0b2",
     "aiohttp~=3.12",
     "python-json-logger~=4.0",
     "graphviz~=0.21",
@@ -25,11 +25,11 @@ from ..worker_config_loader import load_worker_configs_to_redis
 logger = getLogger(__name__)
 
 
-def json_dumps(obj) -> str:
+def json_dumps(obj: Any) -> str:
     return dumps(obj).decode("utf-8")
 
 
-def json_response(data, **kwargs) -> web.Response:
+def json_response(data: Any, **kwargs: Any) -> web.Response:
     return web.json_response(data, dumps=json_dumps, **kwargs)
 
 
@@ -211,7 +211,7 @@
         ],
         request: { body: null },
         responses: [
-            { code: '200 OK', description: 'Successful response.', body: { "id": "...", "status": "..." } }
+            { code: '200 OK', description: 'Successful response.', body: { "id": "...", "status": "running", "progress": 0.75, "progress_message": "Processing..." } }
         ]
     },
     {
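Since the job status payload now carries `progress` and `progress_message`, a client can poll the Job Status API directly. A minimal polling sketch, assuming a local orchestrator at `http://localhost:8080`, omitting authentication headers, and guessing the terminal status strings:

    # Minimal client-side polling sketch for GET /api/v1/jobs/{job_id}.
    # Base URL, omitted auth, and terminal status strings are assumptions.
    import asyncio
    from aiohttp import ClientSession

    async def poll_job(job_id: str) -> dict:
        async with ClientSession("http://localhost:8080") as session:
            while True:
                async with session.get(f"/api/v1/jobs/{job_id}") as resp:
                    job = await resp.json()
                print(job["status"], job.get("progress"), job.get("progress_message"))
                if job["status"] in ("finished", "failed", "quarantined", "cancelled"):
                    return job
                await asyncio.sleep(2.0)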
@@ -62,7 +62,8 @@ class ConditionalHandler:
         try:
             context_area = getattr(context, self.condition.area)
             actual_value = context_area[self.condition.field]
-            return self.condition.op(actual_value, self.condition.value)
+            result = self.condition.op(actual_value, self.condition.value)
+            return bool(result)
         except (AttributeError, KeyError):
             return False
 
@@ -130,7 +131,14 @@ class StateMachineBlueprint:
         self.name = name
         self.api_endpoint = api_endpoint
         self.api_version = api_version
-        self.data_stores: dict[str, AsyncDictStore] = data_stores if data_stores is not None else {}
+        self.data_stores: dict[str, AsyncDictStore] = {}
+        if data_stores:
+            for ds_name, ds_data in data_stores.items():
+                if isinstance(ds_data, AsyncDictStore):
+                    self.data_stores[ds_name] = ds_data
+                else:
+                    self.data_stores[ds_name] = AsyncDictStore(ds_data)
+
         self.handlers: dict[str, Callable] = {}
         self.aggregator_handlers: dict[str, Callable] = {}
         self.conditional_handlers: list[ConditionalHandler] = []
@@ -279,7 +287,7 @@ class StateMachineBlueprint:
             f"No suitable handler found for state '{state}' in blueprint '{self.name}' for the given context.",
         )
 
-    def render_graph(self, output_filename: str | None = None, output_format: str = "png"):
+    def render_graph(self, output_filename: str | None = None, output_format: str = "png") -> str | None:
         from graphviz import Digraph  # type: ignore[import]
 
         dot = Digraph(comment=f"State Machine for {self.name}")
@@ -0,0 +1,80 @@
+"""
+Centralized constants for the Avtomatika protocol.
+(Legacy wrapper, pointing to rxon.constants)
+"""
+
+from rxon.constants import (
+    AUTH_HEADER_CLIENT,
+    AUTH_HEADER_WORKER,
+    COMMAND_CANCEL_TASK,
+    ENDPOINT_TASK_NEXT,
+    ENDPOINT_TASK_RESULT,
+    ENDPOINT_WORKER_HEARTBEAT,
+    ENDPOINT_WORKER_REGISTER,
+    ERROR_CODE_DEPENDENCY,
+    ERROR_CODE_INTEGRITY_MISMATCH,
+    ERROR_CODE_INTERNAL,
+    ERROR_CODE_INVALID_INPUT,
+    ERROR_CODE_PERMANENT,
+    ERROR_CODE_RESOURCE_EXHAUSTED,
+    ERROR_CODE_SECURITY,
+    ERROR_CODE_TIMEOUT,
+    ERROR_CODE_TRANSIENT,
+    JOB_STATUS_CANCELLED,
+    JOB_STATUS_ERROR,
+    JOB_STATUS_FAILED,
+    JOB_STATUS_FINISHED,
+    JOB_STATUS_PENDING,
+    JOB_STATUS_QUARANTINED,
+    JOB_STATUS_RUNNING,
+    JOB_STATUS_WAITING_FOR_HUMAN,
+    JOB_STATUS_WAITING_FOR_PARALLEL,
+    JOB_STATUS_WAITING_FOR_WORKER,
+    MSG_TYPE_PROGRESS,
+    PROTOCOL_VERSION,
+    PROTOCOL_VERSION_HEADER,
+    STS_TOKEN_ENDPOINT,
+    TASK_STATUS_CANCELLED,
+    TASK_STATUS_FAILURE,
+    TASK_STATUS_SUCCESS,
+    WORKER_API_PREFIX,
+    WS_ENDPOINT,
+)
+
+__all__ = [
+    "AUTH_HEADER_CLIENT",
+    "AUTH_HEADER_WORKER",
+    "COMMAND_CANCEL_TASK",
+    "ENDPOINT_TASK_NEXT",
+    "ENDPOINT_TASK_RESULT",
+    "ENDPOINT_WORKER_HEARTBEAT",
+    "ENDPOINT_WORKER_REGISTER",
+    "ERROR_CODE_DEPENDENCY",
+    "ERROR_CODE_INTEGRITY_MISMATCH",
+    "ERROR_CODE_INTERNAL",
+    "ERROR_CODE_INVALID_INPUT",
+    "ERROR_CODE_PERMANENT",
+    "ERROR_CODE_RESOURCE_EXHAUSTED",
+    "ERROR_CODE_SECURITY",
+    "ERROR_CODE_TIMEOUT",
+    "ERROR_CODE_TRANSIENT",
+    "JOB_STATUS_CANCELLED",
+    "JOB_STATUS_ERROR",
+    "JOB_STATUS_FAILED",
+    "JOB_STATUS_FINISHED",
+    "JOB_STATUS_PENDING",
+    "JOB_STATUS_QUARANTINED",
+    "JOB_STATUS_RUNNING",
+    "JOB_STATUS_WAITING_FOR_HUMAN",
+    "JOB_STATUS_WAITING_FOR_PARALLEL",
+    "JOB_STATUS_WAITING_FOR_WORKER",
+    "MSG_TYPE_PROGRESS",
+    "PROTOCOL_VERSION",
+    "PROTOCOL_VERSION_HEADER",
+    "STS_TOKEN_ENDPOINT",
+    "TASK_STATUS_CANCELLED",
+    "TASK_STATUS_FAILURE",
+    "TASK_STATUS_SUCCESS",
+    "WORKER_API_PREFIX",
+    "WS_ENDPOINT",
+]
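Because every name is re-exported, both import paths resolve to the same objects, so downstream code written against `avtomatika.constants` keeps working unchanged after the move to `rxon.constants`:

    # Both import paths resolve to the same re-exported object.
    from avtomatika.constants import ERROR_CODE_TRANSIENT as legacy
    from rxon.constants import ERROR_CODE_TRANSIENT as canonical

    assert legacy is canonical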
@@ -184,6 +184,9 @@ class Dispatcher:
             selected_worker = self._select_default(capable_workers, task_type)
 
         worker_id = selected_worker.get("worker_id")
+        if not worker_id:
+            raise RuntimeError(f"Selected worker for task '{task_type}' has no worker_id")
+
         logger.info(
             f"Dispatching task '{task_type}' to worker {worker_id} (strategy: {dispatch_strategy})",
         )
@@ -1,7 +1,7 @@
 from asyncio import TimeoutError as AsyncTimeoutError
 from asyncio import create_task, gather, get_running_loop, wait_for
 from logging import getLogger
-from typing import Any
+from typing import Any, Optional
 from uuid import uuid4
 
 from aiohttp import ClientSession, web
@@ -58,7 +58,7 @@ def json_dumps(obj: Any) -> str:
     return dumps(obj).decode("utf-8")
 
 
-def json_response(data, **kwargs: Any) -> web.Response:
+def json_response(data: Any, **kwargs: Any) -> web.Response:
     return web.json_response(data, dumps=json_dumps, **kwargs)
 
 
@@ -70,11 +70,15 @@ class OrchestratorEngine:
         self.config = config
         self.blueprints: dict[str, StateMachineBlueprint] = {}
         self.history_storage: HistoryStorageBase = NoOpHistoryStorage()
-        self.ws_manager = WebSocketManager()
+        self.ws_manager = WebSocketManager(self.storage)
         self.app = web.Application(middlewares=[compression_middleware])
         self.app[ENGINE_KEY] = self
-        self.worker_service = None
+        self.worker_service: Optional[WorkerService] = None
         self._setup_done = False
+        self.webhook_sender: WebhookSender
+        self.dispatcher: Dispatcher
+        self.runner: web.AppRunner
+        self.site: web.TCPSite
 
         from rxon import HttpListener
 
@@ -176,6 +180,9 @@
         except ValueError as e:
             raise web.HTTPBadRequest(text=str(e)) from e
 
+        if self.worker_service is None:
+            raise web.HTTPInternalServerError(text="WorkerService is not initialized.")
+
         if message_type == "register":
             return await self.worker_service.register_worker(payload)
 
@@ -352,6 +359,7 @@
         initial_data: dict[str, Any],
         source: str = "internal",
         tracing_context: dict[str, str] | None = None,
+        data_metadata: dict[str, Any] | None = None,
     ) -> str:
         """Creates a job directly, bypassing the HTTP API layer.
         Useful for internal schedulers and triggers.
@@ -377,6 +385,7 @@
             "status": JOB_STATUS_PENDING,
             "tracing_context": tracing_context or {},
             "client_config": client_config,
+            "data_metadata": data_metadata or {},
         }
         await self.storage.save_job_state(job_id, job_state)
         await self.storage.enqueue_job(job_id)
@@ -238,6 +238,9 @@ class JobExecutor:
                     action_factory.sub_blueprint_to_run,
                     duration_ms,
                 )
+            elif job_state["current_state"] in blueprint.end_states:
+                status = JOB_STATUS_FINISHED if job_state["current_state"] == "finished" else JOB_STATUS_FAILED
+                await self._handle_terminal_reached(job_state, status, duration_ms)
 
         except Exception as e:
             # This catches errors within the handler's execution.
@@ -248,6 +251,40 @@ class JobExecutor:
         if message_id in self._processing_messages:
             self._processing_messages.remove(message_id)
 
+    async def _handle_terminal_reached(
+        self,
+        job_state: dict[str, Any],
+        status: str,
+        duration_ms: int,
+    ) -> None:
+        job_id = job_state["id"]
+        current_state = job_state["current_state"]
+        logger.info(f"Job {job_id} reached terminal state '{current_state}' with status '{status}'")
+
+        await self.history_storage.log_job_event(
+            {
+                "job_id": job_id,
+                "state": current_state,
+                "event_type": "job_completed",
+                "duration_ms": duration_ms,
+                "context_snapshot": job_state,
+            },
+        )
+
+        job_state["status"] = status
+        await self.storage.save_job_state(job_id, job_state)
+
+        # Clean up S3 files if service is available
+        s3_service = self.engine.app.get(S3_SERVICE_KEY)
+        if s3_service:
+            task_files = s3_service.get_task_files(job_id)
+            if task_files:
+                create_task(task_files.cleanup())
+
+        await self._check_and_resume_parent(job_state)
+        event_type = "job_finished" if status == JOB_STATUS_FINISHED else "job_failed"
+        await self.engine.send_job_webhook(job_state, event_type)
+
     async def _handle_transition(
         self,
@@ -270,28 +307,11 @@ class JobExecutor:
             },
         )
 
-        # When transitioning to a new state, reset the retry counter.
         job_state["retry_count"] = 0
         job_state["current_state"] = next_state
         job_state["status"] = JOB_STATUS_RUNNING
         await self.storage.save_job_state(job_id, job_state)
-
-        if next_state not in TERMINAL_STATES:
-            await self.storage.enqueue_job(job_id)
-        else:
-            logger.info(f"Job {job_id} reached terminal state {next_state}")
-
-            # Clean up S3 files if service is available
-            s3_service = self.engine.app.get(S3_SERVICE_KEY)
-            if s3_service:
-                task_files = s3_service.get_task_files(job_id)
-                if task_files:
-                    # Run cleanup in background to not block response
-                    create_task(task_files.cleanup())
-
-            await self._check_and_resume_parent(job_state)
-            event_type = "job_finished" if next_state == JOB_STATUS_FINISHED else "job_failed"
-            await self.engine.send_job_webhook(job_state, event_type)
+        await self.storage.enqueue_job(job_id)
 
     async def _handle_dispatch(
         self,
@@ -1,6 +1,7 @@
 from datetime import datetime
 from logging import DEBUG, Formatter, StreamHandler, getLogger
 from sys import stdout
+from typing import Any, Literal, Optional
 from zoneinfo import ZoneInfo
 
 from pythonjsonlogger import json
@@ -9,14 +10,22 @@ from pythonjsonlogger import json
 class TimezoneFormatter(Formatter):
     """Formatter that respects a custom timezone."""
 
-    def __init__(self, fmt=None, datefmt=None, style="%", validate=True, *, tz_name="UTC"):
+    def __init__(
+        self,
+        fmt: Optional[str] = None,
+        datefmt: Optional[str] = None,
+        style: Literal["%", "{", "$"] = "%",
+        validate: bool = True,
+        *,
+        tz_name: str = "UTC",
+    ) -> None:
         super().__init__(fmt, datefmt, style, validate)
         self.tz = ZoneInfo(tz_name)
 
-    def converter(self, timestamp):
+    def converter(self, timestamp: float) -> datetime:  # type: ignore[override]
         return datetime.fromtimestamp(timestamp, self.tz)
 
-    def formatTime(self, record, datefmt=None):
+    def formatTime(self, record: Any, datefmt: Optional[str] = None) -> str:
         dt = self.converter(record.created)
         if datefmt:
             s = dt.strftime(datefmt)
@@ -28,14 +37,14 @@ class TimezoneFormatter(Formatter):
         return s
 
 
-class TimezoneJsonFormatter(json.JsonFormatter):
+class TimezoneJsonFormatter(json.JsonFormatter):  # type: ignore[name-defined]
     """JSON Formatter that respects a custom timezone."""
 
-    def __init__(self, *args, tz_name="UTC", **kwargs):
+    def __init__(self, *args: Any, tz_name: str = "UTC", **kwargs: Any) -> None:
         super().__init__(*args, **kwargs)
         self.tz = ZoneInfo(tz_name)
 
-    def formatTime(self, record, datefmt=None):
+    def formatTime(self, record: Any, datefmt: Optional[str] = None) -> str:
         # Override formatTime to use timezone-aware datetime
         dt = datetime.fromtimestamp(record.created, self.tz)
         if datefmt:
@@ -44,7 +53,7 @@ class TimezoneJsonFormatter(json.JsonFormatter):
         return dt.isoformat()
 
 
-def setup_logging(log_level: str = "INFO", log_format: str = "json", tz_name: str = "UTC"):
+def setup_logging(log_level: str = "INFO", log_format: str = "json", tz_name: str = "UTC") -> None:
     """Configures structured logging for the entire application."""
     logger = getLogger("avtomatika")
     logger.setLevel(log_level)
@@ -335,12 +335,11 @@ class S3Service:
         try:
             self._store = S3Store(
                 bucket=self.config.S3_DEFAULT_BUCKET,
-                access_key_id=self.config.S3_ACCESS_KEY,
-                secret_access_key=self.config.S3_SECRET_KEY,
+                aws_access_key_id=self.config.S3_ACCESS_KEY,
+                aws_secret_access_key=self.config.S3_SECRET_KEY,
                 region=self.config.S3_REGION,
                 endpoint=self.config.S3_ENDPOINT_URL,
                 allow_http="http://" in self.config.S3_ENDPOINT_URL,
-                force_path_style=True,
             )
             self._semaphore = Semaphore(self.config.S3_MAX_CONCURRENCY)
             logger.info(
@@ -22,14 +22,17 @@ def load_schedules_from_file(file_path: str) -> list[ScheduledJobConfig]:
 
     schedules = []
     for name, config in data.items():
-        # Skip sections that might be metadata (though TOML structure usually implies all top-level keys are jobs)
         if not isinstance(config, dict):
            continue
 
+        blueprint = config.get("blueprint")
+        if not isinstance(blueprint, str):
+            raise ValueError(f"Schedule '{name}' is missing a 'blueprint' name.")
+
         schedules.append(
             ScheduledJobConfig(
                 name=name,
-                blueprint=config.get("blueprint"),
+                blueprint=blueprint,
                 input_data=config.get("input_data", {}),
                 interval_seconds=config.get("interval_seconds"),
                 daily_at=config.get("daily_at"),
@@ -10,9 +10,11 @@ from rxon.validators import validate_identifier
 from ..app_keys import S3_SERVICE_KEY
 from ..config import Config
 from ..constants import (
+    ERROR_CODE_DEPENDENCY,
     ERROR_CODE_INTEGRITY_MISMATCH,
     ERROR_CODE_INVALID_INPUT,
     ERROR_CODE_PERMANENT,
+    ERROR_CODE_SECURITY,
     ERROR_CODE_TRANSIENT,
     JOB_STATUS_CANCELLED,
     JOB_STATUS_FAILED,
@@ -102,7 +104,6 @@ class WorkerService:
 
         job_id = result_payload.get("job_id")
         task_id = result_payload.get("task_id")
-        result_data = result_payload.get("result", {})
 
         if not job_id or not task_id:
             raise ValueError("job_id and task_id are required")
@@ -111,25 +112,33 @@ class WorkerService:
         if not job_state:
             raise LookupError("Job not found")
 
+        result_status = result_payload.get("status", TASK_STATUS_SUCCESS)
+        worker_data_content = result_payload.get("data")
+
         if job_state.get("status") == JOB_STATUS_WAITING_FOR_PARALLEL:
             await self.storage.remove_job_from_watch(f"{job_id}:{task_id}")
-            job_state.setdefault("aggregation_results", {})[task_id] = result_data
 
-            branches = job_state.setdefault("active_branches", [])
-            if task_id in branches:
-                branches.remove(task_id)
+            def _update_parallel_results(state: dict[str, Any]) -> dict[str, Any]:
+                state.setdefault("aggregation_results", {})[task_id] = result_payload
+                branches = state.setdefault("active_branches", [])
+                if task_id in branches:
+                    branches.remove(task_id)
+
+                if not branches:
+                    state["status"] = JOB_STATUS_RUNNING
+                    state["current_state"] = state["aggregation_target"]
+                return state
 
-            if not branches:
+            updated_job_state = await self.storage.update_job_state_atomic(job_id, _update_parallel_results)
+
+            if not updated_job_state.get("active_branches"):
                 logger.info(f"All parallel branches for job {job_id} have completed.")
-                job_state["status"] = JOB_STATUS_RUNNING
-                job_state["current_state"] = job_state["aggregation_target"]
-                await self.storage.save_job_state(job_id, job_state)
                 await self.storage.enqueue_job(job_id)
             else:
+                remaining = len(updated_job_state["active_branches"])
                 logger.info(
-                    f"Branch {task_id} for job {job_id} completed. Waiting for {len(branches)} more.",
+                    f"Branch {task_id} for job {job_id} completed. Waiting for {remaining} more.",
                 )
-                await self.storage.save_job_state(job_id, job_state)
 
             return "parallel_branch_result_accepted"
 
@@ -146,14 +155,12 @@ class WorkerService:
                 "event_type": "task_finished",
                 "duration_ms": duration_ms,
                 "worker_id": authenticated_worker_id,
-                "context_snapshot": {**job_state, "result": result_data},
+                "context_snapshot": {**job_state, "result": result_payload},
             },
         )
 
-        result_status = result_data.get("status", TASK_STATUS_SUCCESS)  # Default to success? Constant?
-
         if result_status == TASK_STATUS_FAILURE:
-            return await self._handle_task_failure(job_state, task_id, result_data)
+            return await self._handle_task_failure(job_state, task_id, result_payload)
 
         if result_status == TASK_STATUS_CANCELLED:
             logger.info(f"Task {task_id} for job {job_id} was cancelled by worker.")
@@ -169,13 +176,11 @@ class WorkerService:
             return "result_accepted_cancelled"
 
         transitions = job_state.get("current_task_transitions", {})
-        result_status = result_data.get("status", TASK_STATUS_SUCCESS)
         next_state = transitions.get(result_status)
 
         if next_state:
             logger.info(f"Job {job_id} transitioning based on worker status '{result_status}' to state '{next_state}'")
 
-            worker_data_content = result_data.get("data")
             if worker_data_content and isinstance(worker_data_content, dict):
                 if "state_history" not in job_state:
                     job_state["state_history"] = {}
@@ -200,8 +205,8 @@ class WorkerService:
             await self.storage.save_job_state(job_id, job_state)
             return "result_accepted_failure"
 
-    async def _handle_task_failure(self, job_state: dict, task_id: str, result_data: dict) -> str:
-        error_details = result_data.get("error", {})
+    async def _handle_task_failure(self, job_state: dict, task_id: str, result_payload: dict) -> str:
+        error_details = result_payload.get("error", {})
         error_type = ERROR_CODE_TRANSIENT
         error_message = "No error details provided."
 
@@ -214,9 +219,9 @@ class WorkerService:
         job_id = job_state["id"]
         logger.warning(f"Task {task_id} for job {job_id} failed with error type '{error_type}'.")
 
-        if error_type == ERROR_CODE_PERMANENT:
+        if error_type in (ERROR_CODE_PERMANENT, ERROR_CODE_SECURITY, ERROR_CODE_DEPENDENCY):
             job_state["status"] = JOB_STATUS_QUARANTINED
-            job_state["error_message"] = f"Task failed with permanent error: {error_message}"
+            job_state["error_message"] = f"Task failed with permanent error ({error_type}): {error_message}"
             await self.storage.save_job_state(job_id, job_state)
             await self.storage.quarantine_job(job_id)
         elif error_type == ERROR_CODE_INVALID_INPUT:
@@ -230,7 +235,6 @@ class WorkerService:
             logger.critical(f"Data integrity mismatch detected for job {job_id}: {error_message}")
         else:
             await self.engine.handle_task_failure(job_state, task_id, error_message)
-
         return "result_accepted_failure"
 
     async def issue_access_token(self, worker_id: str) -> TokenResponse:
@@ -90,6 +90,20 @@ class StorageBackend(ABC):
         """
         raise NotImplementedError
 
+    @abstractmethod
+    async def update_job_state_atomic(
+        self,
+        job_id: str,
+        update_callback: Any,
+    ) -> dict[str, Any]:
+        """Atomically update the state of a job using a callback function.
+
+        :param job_id: Unique identifier for the job.
+        :param update_callback: A callable that takes the current state and returns the updated state.
+        :return: The updated full state of the job.
+        """
+        raise NotImplementedError
+
     @abstractmethod
     async def register_worker(
         self,
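The callback contract mirrors the `_update_parallel_results` usage in `WorkerService` above: the read-modify-write happens inside the backend, so concurrent branch completions cannot overwrite each other. A minimal usage sketch (the storage wiring and `task_id` are placeholders):

    # Usage sketch: the callback receives the current job state and returns
    # the updated state; the backend applies it atomically.
    from typing import Any

    async def mark_branch_done(storage: Any, job_id: str, task_id: str) -> dict:
        def _update(state: dict) -> dict:
            branches = state.setdefault("active_branches", [])
            if task_id in branches:
                branches.remove(task_id)
            return state

        return await storage.update_job_state_atomic(job_id, _update)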
@@ -12,12 +12,12 @@ class MemoryStorage(StorageBackend):
     Not persistent.
     """
 
-    def __init__(self):
+    def __init__(self) -> None:
         self._jobs: dict[str, dict[str, Any]] = {}
         self._workers: dict[str, dict[str, Any]] = {}
         self._worker_ttls: dict[str, float] = {}
-        self._worker_task_queues: dict[str, PriorityQueue] = {}
-        self._job_queue = Queue()
+        self._worker_task_queues: dict[str, PriorityQueue[Any]] = {}
+        self._job_queue: Queue[str] = Queue()
         self._quarantine_queue: list[str] = []
         self._watched_jobs: dict[str, float] = {}
         self._client_configs: dict[str, dict[str, Any]] = {}
@@ -62,6 +62,17 @@ class MemoryStorage(StorageBackend):
         self._jobs[job_id].update(update_data)
         return self._jobs[job_id]
 
+    async def update_job_state_atomic(
+        self,
+        job_id: str,
+        update_callback: Any,
+    ) -> dict[str, Any]:
+        async with self._lock:
+            current_state = self._jobs.get(job_id, {})
+            updated_state = update_callback(current_state)
+            self._jobs[job_id] = updated_state
+            return updated_state
+
     async def register_worker(
         self,
         worker_id: str,
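The in-memory backend above simply serializes updates behind an asyncio lock. The `redis.py` changes (+25 -12) are not shown in this diff, so the following is only a sketch of how a Redis backend might satisfy the same contract with optimistic locking; the key layout, JSON serialization, and WATCH/MULTI retry loop are assumptions, not the shipped implementation.

    # Hypothetical Redis implementation of update_job_state_atomic using
    # optimistic locking (WATCH/MULTI + retry). Key name and serialization
    # are assumptions; the actual avtomatika redis.py is not shown above.
    import json
    from typing import Any, Callable

    from redis.asyncio import Redis
    from redis.exceptions import WatchError

    async def update_job_state_atomic(
        r: Redis, job_id: str, update_callback: Callable[[dict], dict]
    ) -> dict[str, Any]:
        key = f"job_state:{job_id}"  # hypothetical key layout
        async with r.pipeline(transaction=True) as pipe:
            while True:
                try:
                    await pipe.watch(key)  # abort the MULTI if the key changes
                    raw = await pipe.get(key)
                    state = json.loads(raw) if raw else {}
                    updated = update_callback(state)
                    pipe.multi()
                    pipe.set(key, json.dumps(updated))
                    await pipe.execute()  # raises WatchError on conflict
                    return updated
                except WatchError:
                    continue  # a concurrent writer won the race; retry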