haystack-ml-stack 0.1.0__tar.gz
- haystack_ml_stack-0.1.0/PKG-INFO +96 -0
- haystack_ml_stack-0.1.0/README.md +81 -0
- haystack_ml_stack-0.1.0/pyproject.toml +21 -0
- haystack_ml_stack-0.1.0/setup.cfg +4 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack/__init__.py +4 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack/app.py +158 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack/cache.py +19 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack/dynamo.py +103 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack/model_store.py +36 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack/settings.py +22 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack.egg-info/PKG-INFO +96 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack.egg-info/SOURCES.txt +13 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack.egg-info/dependency_links.txt +1 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack.egg-info/requires.txt +6 -0
- haystack_ml_stack-0.1.0/src/haystack_ml_stack.egg-info/top_level.txt +1 -0
@@ -0,0 +1,96 @@
+Metadata-Version: 2.4
+Name: haystack-ml-stack
+Version: 0.1.0
+Summary: Functions related to Haystack ML
+Author-email: Oscar Vega <oscar@haystack.tv>
+License: MIT
+Requires-Python: >=3.11
+Description-Content-Type: text/markdown
+Requires-Dist: pydantic==2.5.0
+Requires-Dist: cachetools==5.5.2
+Requires-Dist: cloudpickle==2.2.1
+Requires-Dist: aioboto3==12.0.0
+Requires-Dist: fastapi==0.104.1
+Requires-Dist: pydantic-settings==2.2
+
+# Haystack ML Stack
+
+Currently this project contains a FastAPI-based service designed for low-latency scoring of streams data coming from http requests
+
+## 🚀 Features
+
+* **FastAPI Service:** Lightweight and fast web service for ML inference.
+* **Asynchronous I/O:** Utilizes `aiobotocore` for non-blocking S3 and DynamoDB operations.
+* **Model Loading:** Downloads and loads the ML model (using `cloudpickle`) from a configurable S3 path on startup.
+* **Feature Caching:** Implements a thread-safe Time-To-Live (TTL) / Least-Recently-Used (LRU) cache (`cachetools.TLRUCache`) for DynamoDB features, reducing latency and database load.
+* **DynamoDB Integration:** Fetches stream-specific features from DynamoDB to enrich the data before scoring.
+* **Health Check:** Provides a `/health` endpoint to monitor service status and model loading.
+
+## 📦 Installation
+
+This project requires Python 3.11 or later.
+
+1. **Install package:**
+The dependencies associated are listed in `pyproject.toml`.
+
+```bash
+pip install haystack-ml-stack
+```
+
+## ⚙️ Configuration
+
+The service is configured using environment variables, managed by `pydantic-settings`. You can use a `.env` file for local development.
+
+| Variable Name | Alias | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `S3_MODEL_PATH` | `S3_MODEL_PATH` | `None` | **Required.** The `s3://bucket/key` URL for the cloudpickled ML model file. |
+| `FEATURES_TABLE`| `FEATURES_TABLE`| `"features"` | Name of the DynamoDB table storing stream features. |
+| `LOGS_FRACTION` | `LOGS_FRACTION` | `0.01` | Fraction of requests to log detailed stream data for sampling/debugging (0.0 to 1.0). |
+| `CACHE_MAXSIZE` | *(none)* | `50000` | Maximum size of the in-memory feature cache. |
+
+**Example env vars**
+
+```env
+S3_MODEL_PATH="s3://my-ml-models/stream-scorer/latest.pkl"
+FEATURES_TABLE="features"
+LOGS_FRACTION=0.05
+```
+
+## 🌐 Endpoints
+| Method | Path | Description |
+| :--- | :--- | :--- |
+| **GET** | `/` | Root endpoint, returns a simple running message. |
+| **GET** | `/health` | Checks if the service is running and if the ML model has been loaded. |
+| **POST** | `/score` | **Main scoring endpoint.** Accepts stream data and returns model predictions. |
+
+## 💻 Technical Details
+
+### Model Structure
+The ML model file downloaded from S3 is expected to be a cloudpickle-serialized Python dictionary with the following structure:
+
+``` python
+
+model = {
+    "preprocess": <function>, # Function to transform request data into model input.
+    "predict": <function>, # Function to perform the actual model inference.
+    "params": <dict/any>, # Optional parameters passed to preprocess/predict.
+    "stream_features": <list[str]>, # Optional list of feature names to fetch from DynamoDB.
+}
+```
+
+### Feature Caching (cache.py)
+The `ThreadSafeTLRUCache` ensures that feature lookups and updates are thread-safe.
+The `_ttu` (time-to-use) policy allows features to specify their own TTL via a `cache_ttl_in_seconds` key in the stored value.
+
+### DynamoDB Feature Fetching (dynamo.py)
+The set_stream_features function handles:
+
+- Checking the in-memory cache for required `stream_features`.
+
+- Batch-fetching any missing features from DynamoDB.
+
+- Parsing the low-level DynamoDB items into Python types.
+
+- Populating the cache with the fetched data, respecting the feature's TTL.
+
+- Injecting the fetched feature values back into the streams list in the request payload.
@@ -0,0 +1,81 @@
+# Haystack ML Stack
+
+Currently this project contains a FastAPI-based service designed for low-latency scoring of streams data coming from http requests
+
+## 🚀 Features
+
+* **FastAPI Service:** Lightweight and fast web service for ML inference.
+* **Asynchronous I/O:** Utilizes `aiobotocore` for non-blocking S3 and DynamoDB operations.
+* **Model Loading:** Downloads and loads the ML model (using `cloudpickle`) from a configurable S3 path on startup.
+* **Feature Caching:** Implements a thread-safe Time-To-Live (TTL) / Least-Recently-Used (LRU) cache (`cachetools.TLRUCache`) for DynamoDB features, reducing latency and database load.
+* **DynamoDB Integration:** Fetches stream-specific features from DynamoDB to enrich the data before scoring.
+* **Health Check:** Provides a `/health` endpoint to monitor service status and model loading.
+
+## 📦 Installation
+
+This project requires Python 3.11 or later.
+
+1. **Install package:**
+The dependencies associated are listed in `pyproject.toml`.
+
+```bash
+pip install haystack-ml-stack
+```
+
+## ⚙️ Configuration
+
+The service is configured using environment variables, managed by `pydantic-settings`. You can use a `.env` file for local development.
+
+| Variable Name | Alias | Default | Description |
+| :--- | :--- | :--- | :--- |
+| `S3_MODEL_PATH` | `S3_MODEL_PATH` | `None` | **Required.** The `s3://bucket/key` URL for the cloudpickled ML model file. |
+| `FEATURES_TABLE`| `FEATURES_TABLE`| `"features"` | Name of the DynamoDB table storing stream features. |
+| `LOGS_FRACTION` | `LOGS_FRACTION` | `0.01` | Fraction of requests to log detailed stream data for sampling/debugging (0.0 to 1.0). |
+| `CACHE_MAXSIZE` | *(none)* | `50000` | Maximum size of the in-memory feature cache. |
+
+**Example env vars**
+
+```env
+S3_MODEL_PATH="s3://my-ml-models/stream-scorer/latest.pkl"
+FEATURES_TABLE="features"
+LOGS_FRACTION=0.05
+```
+
+## 🌐 Endpoints
+| Method | Path | Description |
+| :--- | :--- | :--- |
+| **GET** | `/` | Root endpoint, returns a simple running message. |
+| **GET** | `/health` | Checks if the service is running and if the ML model has been loaded. |
+| **POST** | `/score` | **Main scoring endpoint.** Accepts stream data and returns model predictions. |
+
+## 💻 Technical Details
+
+### Model Structure
+The ML model file downloaded from S3 is expected to be a cloudpickle-serialized Python dictionary with the following structure:
+
+``` python
+
+model = {
+    "preprocess": <function>, # Function to transform request data into model input.
+    "predict": <function>, # Function to perform the actual model inference.
+    "params": <dict/any>, # Optional parameters passed to preprocess/predict.
+    "stream_features": <list[str]>, # Optional list of feature names to fetch from DynamoDB.
+}
+```
+
+### Feature Caching (cache.py)
+The `ThreadSafeTLRUCache` ensures that feature lookups and updates are thread-safe.
+The `_ttu` (time-to-use) policy allows features to specify their own TTL via a `cache_ttl_in_seconds` key in the stored value.
+
+### DynamoDB Feature Fetching (dynamo.py)
+The set_stream_features function handles:
+
+- Checking the in-memory cache for required `stream_features`.
+
+- Batch-fetching any missing features from DynamoDB.
+
+- Parsing the low-level DynamoDB items into Python types.
+
+- Populating the cache with the fetched data, respecting the feature's TTL.
+
+- Injecting the fetched feature values back into the streams list in the request payload.
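The "Model Structure" section of the README can be made concrete with a small, hypothetical model dictionary. This sketch assumes the call convention that appears in the `app.py` hunk of this diff: `preprocess(user, streams, playlist, params)` followed by `predict(model_input, params)`. The feature name `click_rate`, the `weight` parameter, and the scoring rule are invented for illustration and are not part of the package.

```python
from typing import Any, Dict, List


def preprocess(
    user: Dict[str, Any],
    streams: List[Dict[str, Any]],
    playlist: Dict[str, Any],
    params: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Build one row per stream (hypothetical logic); missing features become 0.0."""
    return [
        {"streamUrl": s["streamUrl"], "click_rate": s.get("click_rate") or 0.0}
        for s in streams
    ]


def predict(model_input: List[Dict[str, Any]], params: Dict[str, Any]) -> Dict[str, float]:
    """Score each row with a toy linear rule (hypothetical logic)."""
    weight = params["weight"]
    return {row["streamUrl"]: weight * row["click_rate"] for row in model_input}


# Same four keys as the README's model structure.
model = {
    "preprocess": preprocess,
    "predict": predict,
    "params": {"weight": 2.0},
    "stream_features": ["click_rate"],
}

user = {"userid": "u1"}
streams = [{"streamUrl": "a", "click_rate": 0.25}, {"streamUrl": "b"}]

# The service calls the two functions in this order, passing `params` to both.
model_input = model["preprocess"](user, streams, {}, model.get("params"))
scores = model["predict"](model_input, model.get("params"))
print(scores)  # {'a': 0.5, 'b': 0.0}
```

A dictionary like this would presumably be serialized with `cloudpickle.dump` and uploaded to the location named by `S3_MODEL_PATH`; that upload step is not shown in the diff.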
@@ -0,0 +1,21 @@
+# pyproject.toml
+[build-system]
+requires = ["setuptools>=69", "wheel", "build"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "haystack-ml-stack"
+version = "0.1.0"
+description = "Functions related to Haystack ML"
+readme = "README.md"
+authors = [{ name = "Oscar Vega", email = "oscar@haystack.tv" }]
+requires-python = ">=3.11"
+dependencies = [
+    "pydantic==2.5.0",
+    "cachetools==5.5.2",
+    "cloudpickle==2.2.1",
+    "aioboto3==12.0.0",
+    "fastapi==0.104.1",
+    "pydantic-settings==2.2"
+]
+license = { text = "MIT" }
@@ -0,0 +1,158 @@
+import logging
+import os
+import random
+import sys
+from http import HTTPStatus
+from typing import Any, Dict, List, Optional
+
+import aiobotocore.session
+from fastapi import FastAPI, HTTPException, Request, Response
+from fastapi.encoders import jsonable_encoder
+
+from .cache import make_features_cache
+from .dynamo import set_stream_features
+from .model_store import download_and_load_model
+from .settings import Settings
+
+logging.basicConfig(
+    level=logging.INFO,
+    format="[%(levelname)s] [%(process)d] %(name)s : %(message)s",
+    handlers=[logging.StreamHandler(sys.stdout)],
+    force=True,
+)
+
+logger = logging.getLogger(__name__)
+
+
+def create_app(
+    settings: Optional[Settings] = None,
+    *,
+    preloaded_model: Optional[Dict[str, Any]] = None,
+) -> FastAPI:
+    """
+    Build a FastAPI app with injectable settings and model.
+    If `preloaded_model` is None, the app will load from S3 on startup.
+    """
+    cfg = settings or Settings()
+
+    app = FastAPI(
+        title="ML Stream Scorer",
+        description="Scores video streams using a pre-trained ML model and DynamoDB features.",
+        version="1.0.0",
+    )
+
+    # Mutable state: cache + model
+    features_cache = make_features_cache(cfg.cache_maxsize)
+    state: Dict[str, Any] = {
+        "model": preloaded_model,
+        "session": aiobotocore.session.get_session(),
+        "model_name": (
+            os.path.basename(cfg.s3_model_path) if cfg.s3_model_path else None
+        ),
+    }
+
+    @app.on_event("startup")
+    async def _startup() -> None:
+        if state["model"] is not None:
+            logger.info("Using preloaded model.")
+            return
+
+        if not cfg.s3_model_path:
+            logger.critical("S3_MODEL_PATH not set; service will be unhealthy.")
+            return
+
+        try:
+            state["model"] = await download_and_load_model(
+                cfg.s3_model_path, aio_session=state["session"]
+            )
+            state["stream_features"] = state["model"].get("stream_features", [])
+            logger.info("Model loaded on startup.")
+        except Exception as e:
+            logger.critical("Failed to load model: %s", e)
+
+    @app.get("/health", status_code=HTTPStatus.OK)
+    async def health():
+        model_ok = state["model"] is not None
+        if not model_ok:
+            raise HTTPException(
+                status_code=HTTPStatus.SERVICE_UNAVAILABLE,
+                detail="ML Model not loaded",
+            )
+        return {
+            "status": "ok",
+            "model_loaded": True,
+            "cache_size": len(features_cache),
+            "model_name": state.get("model_name"),
+            "stream_features": state.get("stream_features", []),
+        }
+
+    @app.post("/score", status_code=HTTPStatus.OK)
+    async def score_stream(request: Request, response: Response):
+        if state["model"] is None:
+            raise HTTPException(
+                status_code=HTTPStatus.SERVICE_UNAVAILABLE,
+                detail="ML Model not loaded",
+            )
+
+        try:
+            data = await request.json()
+        except Exception:
+            raise HTTPException(
+                status_code=HTTPStatus.BAD_REQUEST, detail="Invalid JSON payload"
+            )
+
+        user = data.get("user", {})
+        streams: List[Dict[str, Any]] = data.get("streams", [])
+        playlist = data.get("playlist", {})
+
+        if not streams:
+            logger.warning("No streams provided for user %s", user.get("userid", ""))
+            return {}
+
+        # Feature fetch (optional based on model)
+        model = state["model"]
+        stream_features = model.get("stream_features", []) or []
+        if stream_features:
+            logger.info("Fetching stream features for user %s", user.get("userid", ""))
+            await set_stream_features(
+                aio_session=state["session"],
+                streams=streams,
+                stream_features=stream_features,
+                features_cache=features_cache,
+                features_table=cfg.features_table,
+                stream_pk_prefix=cfg.stream_pk_prefix,
+                cache_sep=cfg.cache_separator,
+            )
+
+        # Sampling logs
+        if random.random() < cfg.logs_fraction:
+            logger.info("User %s streams: %s", user.get("userid", ""), streams)
+
+        # Synchronous model execution (user code)
+        try:
+            model_input = model["preprocess"](
+                user, streams, playlist, model.get("params")
+            )
+            model_output = model["predict"](model_input, model.get("params"))
+        except Exception as e:
+            logger.error("Model prediction failed: %s", e)
+            raise HTTPException(
+                status_code=HTTPStatus.INTERNAL_SERVER_ERROR,
+                detail="Model prediction failed",
+            )
+
+        if model_output:
+            return jsonable_encoder(model_output)
+
+        raise HTTPException(
+            status_code=HTTPStatus.NOT_FOUND, detail="No model output generated"
+        )
+
+    @app.get("/", status_code=HTTPStatus.OK)
+    async def root():
+        return {
+            "message": "ML Scoring Service is running.",
+            "model_name": state.get("model_name"),
+        }
+
+    return app
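The branching of the `/score` handler in the `app.py` hunk can be condensed into a pure function, which makes the possible outcomes easier to see. This is only a sketch: `score_payload` and `toy_model` are hypothetical names, the DynamoDB feature fetch and log sampling are left out, and the function returns `(status, body)` tuples instead of raising `HTTPException`.

```python
from http import HTTPStatus
from typing import Any, Dict, Optional, Tuple


def score_payload(
    model: Optional[Dict[str, Any]], data: Dict[str, Any]
) -> Tuple[HTTPStatus, Any]:
    """Mirror the /score branches without FastAPI or DynamoDB."""
    if model is None:
        return HTTPStatus.SERVICE_UNAVAILABLE, "ML Model not loaded"

    user = data.get("user", {})
    streams = data.get("streams", [])
    playlist = data.get("playlist", {})
    if not streams:
        # The handler answers 200 with an empty body when no streams are sent.
        return HTTPStatus.OK, {}

    try:
        model_input = model["preprocess"](user, streams, playlist, model.get("params"))
        output = model["predict"](model_input, model.get("params"))
    except Exception:
        return HTTPStatus.INTERNAL_SERVER_ERROR, "Model prediction failed"

    if output:
        return HTTPStatus.OK, output
    # Any falsy output (empty dict, empty list, None) ends up here.
    return HTTPStatus.NOT_FOUND, "No model output generated"


# Hypothetical model that scores every stream as 1.0.
toy_model = {
    "preprocess": lambda user, streams, playlist, params: streams,
    "predict": lambda rows, params: {r["streamUrl"]: 1.0 for r in rows},
}
payload = {"user": {"userid": "u1"}, "streams": [{"streamUrl": "a"}]}

print(score_payload(toy_model, payload))
print(score_payload(toy_model, {"user": {"userid": "u1"}}))
print(score_payload(None, payload))
```

One consequence visible in this form is that a model returning an empty result is reported as `404 Not Found` rather than as an empty success.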
@@ -0,0 +1,19 @@
+from typing import Any
+
+from cachetools import TLRUCache
+
+
+def _ttu(_, value: Any, now: float) -> float:
+    """Time-To-Use policy: allow per-item TTL via 'cache_ttl_in_seconds' or fallback."""
+    ONE_YEAR = 365 * 24 * 60 * 60
+    try:
+        ttl = int(value.get("cache_ttl_in_seconds", -1))
+        if ttl > 0:
+            return now + ttl
+    except Exception:
+        pass
+    return now + ONE_YEAR
+
+
+def make_features_cache(maxsize: int) -> TLRUCache:
+    return TLRUCache(maxsize=maxsize, ttu=_ttu)
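A worked example may help show how the time-to-use rule in the `cache.py` hunk resolves expiry times. The function below restates the same rule under the name `ttu` so that it runs without `cachetools`; the sample values and the `now` timestamp are made up for illustration.

```python
from typing import Any

ONE_YEAR = 365 * 24 * 60 * 60  # fallback lifetime in seconds, as in the hunk above


def ttu(_key: Any, value: Any, now: float) -> float:
    """Return the absolute expiry time for a cache entry (same rule as `_ttu`)."""
    try:
        ttl = int(value.get("cache_ttl_in_seconds", -1))
        if ttl > 0:
            return now + ttl
    except Exception:
        # Values without .get(), or with a non-numeric TTL, use the fallback.
        pass
    return now + ONE_YEAR


now = 1_000.0
print(ttu("k", {"value": 0.3, "cache_ttl_in_seconds": 60}, now))  # 1060.0
print(ttu("k", {"value": 0.3, "cache_ttl_in_seconds": -1}, now))  # 31537000.0
print(ttu("k", {"value": 0.3}, now))                              # 31537000.0
print(ttu("k", "not-a-dict", now))                                # 31537000.0
```

Note that the README describes a `ThreadSafeTLRUCache`, while `make_features_cache` in this hunk returns a plain `cachetools.TLRUCache` and no lock appears in the diff. The `cachetools` documentation states that its cache classes are not thread-safe on their own.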
@@ -0,0 +1,103 @@
+from typing import Any, Dict, List
+import logging
+
+import aiobotocore.session
+
+logger = logging.getLogger(__name__)
+
+
+async def async_batch_get(
+    dynamo_client, table_name: str, keys: List[Dict[str, Any]]
+) -> List[Dict[str, Any]]:
+    """Asynchronous batch_get_item with unprocessed keys handling."""
+    all_items: List[Dict[str, Any]] = []
+    to_fetch = {table_name: {"Keys": keys}}
+
+    while to_fetch:
+        resp = await dynamo_client.batch_get_item(RequestItems=to_fetch)
+        all_items.extend(resp["Responses"].get(table_name, []))
+        unprocessed = resp.get("UnprocessedKeys", {})
+        to_fetch = unprocessed if unprocessed.get(table_name) else {}
+
+    return all_items
+
+
+def parse_dynamo_item(item: Dict[str, Any]) -> Dict[str, Any]:
+    """Parse a DynamoDB attribute map (low-level) to Python types."""
+    out: Dict[str, Any] = {}
+    for k, v in item.items():
+        if "N" in v:
+            out[k] = float(v["N"])
+        elif "S" in v:
+            out[k] = v["S"]
+        elif "SS" in v:
+            out[k] = v["SS"]
+        elif "NS" in v:
+            out[k] = [float(n) for n in v["NS"]]
+        elif "BOOL" in v:
+            out[k] = v["BOOL"]
+        elif "NULL" in v:
+            out[k] = None
+        elif "L" in v:
+            out[k] = [parse_dynamo_item({"value": i})["value"] for i in v["L"]]
+        elif "M" in v:
+            out[k] = parse_dynamo_item(v["M"])
+    return out
+
+
+async def set_stream_features(
+    *,
+    streams: List[Dict[str, Any]],
+    stream_features: List[str],
+    features_cache,
+    features_table: str,
+    stream_pk_prefix: str,
+    cache_sep: str,
+    aio_session: aiobotocore.session.Session | None = None,
+) -> None:
+    """Fetch missing features for streams from DynamoDB and fill them into streams."""
+    if not streams or not stream_features:
+        return
+
+    cache_miss: Dict[str, Dict[str, Any]] = {}
+    for f in stream_features:
+        for s in streams:
+            key = f"{s['streamUrl']}{cache_sep}{f}"
+            cached = features_cache.get(key)
+            if cached is not None:
+                s[f] = cached["value"]
+            else:
+                cache_miss[key] = s
+
+    if not cache_miss:
+        return
+
+    logger.info("Cache miss for %d items", len(cache_miss))
+
+    # Prepare keys
+    keys = []
+    for k in cache_miss.keys():
+        stream_url, sk = k.split(cache_sep, 1)
+        pk = f"{stream_pk_prefix}{stream_url}"
+        keys.append({"pk": {"S": pk}, "sk": {"S": sk}})
+
+    session = aio_session or aiobotocore.session.get_session()
+    async with session.create_client("dynamodb") as dynamodb:
+        try:
+            items = await async_batch_get(dynamodb, features_table, keys)
+        except Exception as e:
+            logger.error("DynamoDB batch_get failed: %s", e)
+            return
+
+    for item in items:
+        stream_url = item["pk"]["S"].removeprefix(stream_pk_prefix)
+        feature_name = item["sk"]["S"]
+        cache_key = f"{stream_url}{cache_sep}{feature_name}"
+        parsed = parse_dynamo_item(item)
+
+        features_cache[cache_key] = {
+            "value": parsed.get("value"),
+            "cache_ttl_in_seconds": int(parsed.get("cache_ttl_in_seconds", -1)),
+        }
+        if cache_key in cache_miss:
+            cache_miss[cache_key][feature_name] = parsed.get("value")
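The key handling in the `dynamo.py` hunk can be illustrated in isolation. The sketch below rebuilds the cache-key and DynamoDB-key derivation using the defaults from `settings.py` (`STREAM#` and `--`), and adds a hypothetical `chunked` helper that is not in the package. The helper reflects the DynamoDB `BatchGetItem` limit of 100 keys per request, which the code in this diff does not appear to enforce.

```python
from typing import Any, Dict, Iterator, List

STREAM_PK_PREFIX = "STREAM#"  # default `stream_pk_prefix` in settings.py
CACHE_SEP = "--"              # default `cache_separator` in settings.py
MAX_BATCH_KEYS = 100          # BatchGetItem accepts at most 100 keys per call


def cache_key(stream_url: str, feature: str) -> str:
    """Compose the in-memory cache key the same way set_stream_features does."""
    return f"{stream_url}{CACHE_SEP}{feature}"


def dynamo_key(key: str) -> Dict[str, Dict[str, str]]:
    """Turn a cache key back into a low-level DynamoDB pk/sk key."""
    stream_url, sk = key.split(CACHE_SEP, 1)
    return {"pk": {"S": f"{STREAM_PK_PREFIX}{stream_url}"}, "sk": {"S": sk}}


def chunked(keys: List[Any], size: int = MAX_BATCH_KEYS) -> Iterator[List[Any]]:
    """Yield successive slices of at most `size` keys (hypothetical helper)."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


key = cache_key("abc123", "click_rate")
print(key)              # abc123--click_rate
print(dynamo_key(key))  # {'pk': {'S': 'STREAM#abc123'}, 'sk': {'S': 'click_rate'}}

# A stream URL that itself contains the separator is split at the wrong place.
print(dynamo_key(cache_key("a--b", "f")))  # {'pk': {'S': 'STREAM#a'}, 'sk': {'S': 'b--f'}}

many_keys = [dynamo_key(cache_key(f"s{i}", "click_rate")) for i in range(250)]
print([len(batch) for batch in chunked(many_keys)])  # [100, 100, 50]
```

Under these assumptions, each batch could be passed to `async_batch_get` in turn and the results concatenated.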
@@ -0,0 +1,36 @@
+import logging
+import os
+from typing import Any, Dict
+
+import aiobotocore.session
+import cloudpickle
+
+logger = logging.getLogger(__name__)
+
+
+async def download_and_load_model(
+    s3_url: str, aio_session: aiobotocore.session.Session | None = None
+) -> Dict[str, Any]:
+    """
+    Downloads cloudpickled model dict from S3 and loads it.
+    Expected keys: 'preprocess', 'predict', 'params', optional 'stream_features'.
+    """
+    if not s3_url or not s3_url.startswith("s3://"):
+        raise ValueError("S3_MODEL_PATH must be a valid s3:// URL")
+
+    bucket, key = s3_url.replace("s3://", "").split("/", 1)
+    pid = os.getpid()
+    local_path = f"/tmp/model_{pid}.pkl"
+
+    session = aio_session or aiobotocore.session.get_session()
+    async with session.create_client("s3") as s3:
+        logger.info("Downloading model from %s...", s3_url)
+        resp = await s3.get_object(Bucket=bucket, Key=key)
+        data = await resp["Body"].read()
+        with open(local_path, "wb") as f:
+            f.write(data)
+        logger.info("Model downloaded to %s", local_path)
+
+    with open(local_path, "rb") as f:
+        model: Dict[str, Any] = cloudpickle.load(f)
+    return model
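The bucket/key split in the `model_store.py` hunk relies on `str.replace` and `str.split`, so a URL such as `s3://bucket` with no key fails at tuple unpacking with a generic message. As an alternative sketch (not the package's code), the hypothetical helper `split_s3_url` below uses `urllib.parse.urlparse` from the standard library and reports malformed input explicitly.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_url(s3_url: str) -> Tuple[str, str]:
    """Split an s3://bucket/key URL into (bucket, key), rejecting malformed input."""
    parsed = urlparse(s3_url)
    bucket = parsed.netloc
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not bucket or not key:
        raise ValueError(f"Expected s3://bucket/key, got {s3_url!r}")
    return bucket, key


print(split_s3_url("s3://my-ml-models/stream-scorer/latest.pkl"))
# ('my-ml-models', 'stream-scorer/latest.pkl')

try:
    split_s3_url("s3://bucket-only")
except ValueError as exc:
    print(exc)  # Expected s3://bucket/key, got 's3://bucket-only'
```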
@@ -0,0 +1,22 @@
+from pydantic_settings import BaseSettings
+from pydantic import Field
+
+class Settings(BaseSettings):
+    # Logging
+    logs_fraction: float = Field(0.01, alias="LOGS_FRACTION")
+
+    # Model (S3)
+    s3_model_path: str | None = Field(default=None, alias="S3_MODEL_PATH")
+
+    # DynamoDB
+    features_table: str = Field("features", alias="FEATURES_TABLE")
+    stream_pk_prefix: str = "STREAM#"
+
+    # Cache
+    cache_maxsize: int = 50_000
+    cache_separator: str = "--"
+
+    class Config:
+        env_file = ".env"
+        env_file_encoding = "utf-8"
+        extra = "ignore"
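To show how the environment variables from the README's configuration table map onto typed values with defaults, here is a standard-library stand-in. It is not `pydantic-settings` and does not read `.env` files; `EnvSettings` and `load_settings` are hypothetical names, and only the four variables listed in the README table are covered.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EnvSettings:
    """Typed view of the documented variables, with the defaults from settings.py."""
    logs_fraction: float = 0.01
    s3_model_path: Optional[str] = None
    features_table: str = "features"
    cache_maxsize: int = 50_000


def load_settings(env: Mapping[str, str] = os.environ) -> EnvSettings:
    """Read the documented variables from `env`, converting strings to numbers."""
    return EnvSettings(
        logs_fraction=float(env.get("LOGS_FRACTION", 0.01)),
        s3_model_path=env.get("S3_MODEL_PATH"),
        features_table=env.get("FEATURES_TABLE", "features"),
        cache_maxsize=int(env.get("CACHE_MAXSIZE", 50_000)),
    )


# Values from the README's "Example env vars" block.
example_env = {
    "S3_MODEL_PATH": "s3://my-ml-models/stream-scorer/latest.pkl",
    "FEATURES_TABLE": "features",
    "LOGS_FRACTION": "0.05",
}
settings = load_settings(example_env)
print(settings.logs_fraction)  # 0.05
print(settings.cache_maxsize)  # 50000
```

With an empty mapping, `s3_model_path` stays `None`, which corresponds to the case where the service logs "S3_MODEL_PATH not set" at startup and reports itself unhealthy.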
@@ -0,0 +1,96 @@
(96 added lines, identical to the PKG-INFO hunk at the top of this diff.)
@@ -0,0 +1,13 @@
+README.md
+pyproject.toml
+src/haystack_ml_stack/__init__.py
+src/haystack_ml_stack/app.py
+src/haystack_ml_stack/cache.py
+src/haystack_ml_stack/dynamo.py
+src/haystack_ml_stack/model_store.py
+src/haystack_ml_stack/settings.py
+src/haystack_ml_stack.egg-info/PKG-INFO
+src/haystack_ml_stack.egg-info/SOURCES.txt
+src/haystack_ml_stack.egg-info/dependency_links.txt
+src/haystack_ml_stack.egg-info/requires.txt
+src/haystack_ml_stack.egg-info/top_level.txt
@@ -0,0 +1 @@
+
@@ -0,0 +1 @@
+haystack_ml_stack