beaver-db 0.15.0.tar.gz → 0.16.0.tar.gz


Potentially problematic release.



--- a/PKG-INFO
+++ b/PKG-INFO
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: beaver-db
- Version: 0.15.0
+ Version: 0.16.0
  Summary: Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications.
  Requires-Python: >=3.13
  Description-Content-Type: text/markdown
@@ -19,25 +19,26 @@ A fast, single-file, multi-modal database for Python, built with the standard `s

  `beaver` is built with a minimalistic philosophy for small, local use cases where a full-blown database server would be overkill.

- - **Minimalistic**: Uses only Python's standard libraries (`sqlite3`) and `numpy`/`faiss-cpu`.
- - **Schemaless**: Flexible data storage without rigid schemas across all modalities.
- - **Synchronous, Multi-Process, and Thread-Safe**: Designed for simplicity and safety in multi-threaded and multi-process environments.
- - **Built for Local Applications**: Perfect for local AI tools, RAG prototypes, chatbots, and desktop utilities that need persistent, structured data without network overhead.
- - **Fast by Default**: It's built on SQLite, which is famously fast and reliable for local applications. Vector search is accelerated with a high-performance, persistent `faiss` index.
- - **Standard Relational Interface**: While `beaver` provides high-level features, you can always use the same SQLite file for normal relational tasks with standard SQL.
+ - **Minimalistic**: Uses only Python's standard libraries (`sqlite3`) and `numpy`/`faiss-cpu`.
+ - **Schemaless**: Flexible data storage without rigid schemas across all modalities.
+ - **Synchronous, Multi-Process, and Thread-Safe**: Designed for simplicity and safety in multi-threaded and multi-process environments.
+ - **Built for Local Applications**: Perfect for local AI tools, RAG prototypes, chatbots, and desktop utilities that need persistent, structured data without network overhead.
+ - **Fast by Default**: It's built on SQLite, which is famously fast and reliable for local applications. Vector search is accelerated with a high-performance, persistent `faiss` index.
+ - **Standard Relational Interface**: While `beaver` provides high-level features, you can always use the same SQLite file for normal relational tasks with standard SQL.

  ## Core Features

- - **Sync/Async High-Efficiency Pub/Sub**: A powerful, thread and process-safe publish-subscribe system for real-time messaging with a fan-out architecture. Sync by default, but with an `as_async` wrapper for async applications.
- - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
- - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
- - **Persistent Priority Queue**: A high-performance, persistent priority queue perfect for task orchestration across multiple processes. Also with optional async support.
- - **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
- - **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
- - **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
- - **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
- - **Single-File & Portable**: All data is stored in a single SQLite file, making it incredibly easy to move, back up, or embed in your application.
- - **Optional Type-Safety:** Although the database is schemaless, you can use a minimalistic typing system for automatic serialization and deserialization that is Pydantic-compatible out of the box.
+ - **Sync/Async High-Efficiency Pub/Sub**: A powerful, thread and process-safe publish-subscribe system for real-time messaging with a fan-out architecture. Sync by default, but with an `as_async` wrapper for async applications.
+ - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
+ - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
+ - **Persistent Priority Queue**: A high-performance, persistent priority queue perfect for task orchestration across multiple processes. Also with optional async support.
+ - **Time-Indexed Log for Monitoring**: A specialized data structure for structured, time-series logs. Query historical data by time range or create a live, aggregated view of the most recent events for real-time dashboards.
+ - **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
+ - **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
+ - **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
+ - **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
+ - **Single-File & Portable**: All data is stored in a single SQLite file, making it incredibly easy to move, back up, or embed in your application.
+ - **Optional Type-Safety:** Although the database is schemaless, you can use a minimalistic typing system for automatic serialization and deserialization that is Pydantic-compatible out of the box.

  ## How Beaver is Implemented

@@ -220,6 +221,29 @@ attachments.put(
  avatar = attachments.get("user_123_avatar.png")
  ```

+ ### 8. Real-time Application Monitoring
+
+ Use the **time-indexed log** to monitor your application's health in real-time. The `live()` method provides a continuously updating, aggregated view of your log data, perfect for building simple dashboards directly in your terminal.
+
+ ```python
+ from datetime import timedelta
+ import statistics
+
+ logs = db.log("system_metrics")
+
+ def summarize(window):
+     values = [log.get("value", 0) for log in window]
+     return {"mean": statistics.mean(values), "count": len(values)}
+
+ live_summary = logs.live(
+     window_duration=timedelta(seconds=10),
+     sampling_period=timedelta(seconds=1),
+     aggregator=summarize
+ )
+
+ for summary in live_summary:
+     print(f"Live Stats (10s window): Count={summary['count']}, Mean={summary['mean']:.2f}")
+ ```

  ## Type-Safe Data Models

--- a/README.md
+++ b/README.md
@@ -8,25 +8,26 @@ A fast, single-file, multi-modal database for Python, built with the standard `s

  `beaver` is built with a minimalistic philosophy for small, local use cases where a full-blown database server would be overkill.

- - **Minimalistic**: Uses only Python's standard libraries (`sqlite3`) and `numpy`/`faiss-cpu`.
- - **Schemaless**: Flexible data storage without rigid schemas across all modalities.
- - **Synchronous, Multi-Process, and Thread-Safe**: Designed for simplicity and safety in multi-threaded and multi-process environments.
- - **Built for Local Applications**: Perfect for local AI tools, RAG prototypes, chatbots, and desktop utilities that need persistent, structured data without network overhead.
- - **Fast by Default**: It's built on SQLite, which is famously fast and reliable for local applications. Vector search is accelerated with a high-performance, persistent `faiss` index.
- - **Standard Relational Interface**: While `beaver` provides high-level features, you can always use the same SQLite file for normal relational tasks with standard SQL.
+ - **Minimalistic**: Uses only Python's standard libraries (`sqlite3`) and `numpy`/`faiss-cpu`.
+ - **Schemaless**: Flexible data storage without rigid schemas across all modalities.
+ - **Synchronous, Multi-Process, and Thread-Safe**: Designed for simplicity and safety in multi-threaded and multi-process environments.
+ - **Built for Local Applications**: Perfect for local AI tools, RAG prototypes, chatbots, and desktop utilities that need persistent, structured data without network overhead.
+ - **Fast by Default**: It's built on SQLite, which is famously fast and reliable for local applications. Vector search is accelerated with a high-performance, persistent `faiss` index.
+ - **Standard Relational Interface**: While `beaver` provides high-level features, you can always use the same SQLite file for normal relational tasks with standard SQL.

  ## Core Features

- - **Sync/Async High-Efficiency Pub/Sub**: A powerful, thread and process-safe publish-subscribe system for real-time messaging with a fan-out architecture. Sync by default, but with an `as_async` wrapper for async applications.
- - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
- - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
- - **Persistent Priority Queue**: A high-performance, persistent priority queue perfect for task orchestration across multiple processes. Also with optional async support.
- - **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
- - **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
- - **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
- - **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
- - **Single-File & Portable**: All data is stored in a single SQLite file, making it incredibly easy to move, back up, or embed in your application.
- - **Optional Type-Safety:** Although the database is schemaless, you can use a minimalistic typing system for automatic serialization and deserialization that is Pydantic-compatible out of the box.
+ - **Sync/Async High-Efficiency Pub/Sub**: A powerful, thread and process-safe publish-subscribe system for real-time messaging with a fan-out architecture. Sync by default, but with an `as_async` wrapper for async applications.
+ - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
+ - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
+ - **Persistent Priority Queue**: A high-performance, persistent priority queue perfect for task orchestration across multiple processes. Also with optional async support.
+ - **Time-Indexed Log for Monitoring**: A specialized data structure for structured, time-series logs. Query historical data by time range or create a live, aggregated view of the most recent events for real-time dashboards.
+ - **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
+ - **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
+ - **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
+ - **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
+ - **Single-File & Portable**: All data is stored in a single SQLite file, making it incredibly easy to move, back up, or embed in your application.
+ - **Optional Type-Safety:** Although the database is schemaless, you can use a minimalistic typing system for automatic serialization and deserialization that is Pydantic-compatible out of the box.

  ## How Beaver is Implemented

@@ -209,6 +210,29 @@ attachments.put(
  avatar = attachments.get("user_123_avatar.png")
  ```

+ ### 8. Real-time Application Monitoring
+
+ Use the **time-indexed log** to monitor your application's health in real-time. The `live()` method provides a continuously updating, aggregated view of your log data, perfect for building simple dashboards directly in your terminal.
+
+ ```python
+ from datetime import timedelta
+ import statistics
+
+ logs = db.log("system_metrics")
+
+ def summarize(window):
+     values = [log.get("value", 0) for log in window]
+     return {"mean": statistics.mean(values), "count": len(values)}
+
+ live_summary = logs.live(
+     window_duration=timedelta(seconds=10),
+     sampling_period=timedelta(seconds=1),
+     aggregator=summarize
+ )
+
+ for summary in live_summary:
+     print(f"Live Stats (10s window): Count={summary['count']}, Mean={summary['mean']:.2f}")
+ ```

  ## Type-Safe Data Models

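A note on the README example above: `statistics.mean` raises `StatisticsError` on an empty sequence, and the live window can legitimately be empty (for instance, right after startup before any entries fall inside it); the `LiveIterator` implementation added in this release forwards aggregator exceptions to the consuming loop. Also, the `live()` method defined in `beaver/logs.py` below names its parameters `window` and `period`, not the `window_duration` and `sampling_period` keywords the example uses. A defensive variant written against that signature — a sketch, not part of the released package:

```python
from datetime import timedelta
import statistics

logs = db.log("system_metrics")  # assumes `db` is an open BeaverDB instance

def summarize(window):
    # Guard against an empty window: statistics.mean([]) raises StatisticsError.
    values = [entry.get("value", 0) for entry in window]
    mean = statistics.mean(values) if values else 0.0
    return {"mean": mean, "count": len(values)}

live_summary = logs.live(
    window=timedelta(seconds=10),  # length of the sliding window
    period=timedelta(seconds=1),   # how often a new aggregate is yielded
    aggregator=summarize,
)

try:
    for summary in live_summary:
        print(f"count={summary['count']} mean={summary['mean']:.2f}")
finally:
    live_summary.close()  # the iterator is infinite; stop the poller explicitly
```

Calling `close()` matters here: the iterator never exhausts on its own, so the background polling thread only stops when told to.
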
--- a/beaver/core.py
+++ b/beaver/core.py
@@ -8,6 +8,7 @@ from .channels import ChannelManager
  from .collections import CollectionManager, Document
  from .dicts import DictManager
  from .lists import ListManager
+ from .logs import LogManager
  from .queues import QueueManager


@@ -51,11 +52,31 @@ class BeaverDB:
          self._create_edges_table()
          self._create_fts_table()
          self._create_list_table()
+         self._create_logs_table()
          self._create_priority_queue_table()
          self._create_pubsub_table()
          self._create_trigrams_table()
          self._create_versions_table()

+     def _create_logs_table(self):
+         """Creates the table for time-indexed logs."""
+         self._conn.execute(
+             """
+             CREATE TABLE IF NOT EXISTS beaver_logs (
+                 log_name TEXT NOT NULL,
+                 timestamp REAL NOT NULL,
+                 data TEXT NOT NULL,
+                 PRIMARY KEY (log_name, timestamp)
+             )
+             """
+         )
+         self._conn.execute(
+             """
+             CREATE INDEX IF NOT EXISTS idx_logs_timestamp
+             ON beaver_logs (log_name, timestamp)
+             """
+         )
+
      def _create_blobs_table(self):
          """Creates the table for storing named blobs."""
          self._conn.execute(
@@ -343,3 +364,16 @@ class BeaverDB:
              raise TypeError("Blob store name must be a non-empty string.")

          return BlobManager(name, self._conn, model)
+
+     def log[T](self, name: str, model: type[T] | None = None) -> LogManager[T]:
+         """
+         Returns a wrapper for interacting with a named, time-indexed log.
+         If model is defined, it should be a type used for automatic (de)serialization.
+         """
+         if not isinstance(name, str) or not name:
+             raise TypeError("Log name must be a non-empty string.")
+
+         if model and not isinstance(model, JsonSerializable):
+             raise TypeError("The model parameter must be a JsonSerializable class.")
+
+         return LogManager(name, self._conn, self._db_path, model)
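
Two details of the schema and factory above are easy to miss: timestamps are stored as REAL Unix-epoch values, and `(log_name, timestamp)` is the primary key, so two writes to the same log with an identical timestamp should fail with `sqlite3.IntegrityError`. A minimal write-then-query sketch using the new `log()` factory and the `LogManager` methods defined in `beaver/logs.py` below — the `from beaver import BeaverDB` import path and the `BeaverDB("app.db")` constructor are assumed from the package's usual usage, not shown in this diff:

```python
from datetime import datetime, timedelta, timezone

from beaver import BeaverDB  # assumed import path; not shown in this diff

db = BeaverDB("app.db")  # assumed constructor: path to the SQLite file
metrics = db.log("system_metrics")

# Timestamp defaults to datetime.now(timezone.utc) and is stored as a REAL epoch.
metrics.log({"value": 42.0, "unit": "ms"})

# Inclusive range query (SQL BETWEEN); entries come back oldest-first.
now = datetime.now(timezone.utc)
recent = metrics.range(start=now - timedelta(minutes=5), end=now)
print(f"{len(recent)} entries in the last 5 minutes")
```
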
--- /dev/null
+++ b/beaver/logs.py
@@ -0,0 +1,278 @@
+ import asyncio
+ import collections
+ import json
+ import sqlite3
+ import threading
+ import time
+ from datetime import datetime, timedelta, timezone
+ from queue import Empty, Queue
+ from typing import Any, AsyncIterator, Callable, Iterator, Type, TypeVar
+
+ from .types import JsonSerializable
+
+
+ # A special message object used to signal the iterator to gracefully shut down.
+ _SHUTDOWN_SENTINEL = object()
+
+
+ class LiveIterator[T,R]:
+     """
+     A thread-safe, blocking iterator that yields aggregated results from a
+     rolling window of log data.
+     """
+
+     def __init__(
+         self,
+         db_path: str,
+         log_name: str,
+         window: timedelta,
+         period: timedelta,
+         aggregator: Callable[[list[T]], R],
+         deserializer: Callable[[str], T],
+     ):
+         self._db_path = db_path
+         self._log_name = log_name
+         self._window_duration_seconds = window.total_seconds()
+         self._sampling_period_seconds = period.total_seconds()
+         self._aggregator = aggregator
+         self._deserializer = deserializer
+         self._queue: Queue = Queue()
+         self._stop_event = threading.Event()
+         self._thread: threading.Thread | None = None
+
+     def _polling_loop(self):
+         """The main loop for the background thread that queries and aggregates data."""
+         # Each thread needs its own database connection.
+         thread_conn = sqlite3.connect(self._db_path, check_same_thread=False)
+         thread_conn.row_factory = sqlite3.Row
+
+         window_deque: collections.deque[tuple[float, T]] = collections.deque()
+         last_seen_timestamp = 0.0
+
+         # --- Initial window population ---
+         now = datetime.now(timezone.utc).timestamp()
+         start_time = now - self._window_duration_seconds
+         cursor = thread_conn.cursor()
+         cursor.execute(
+             "SELECT timestamp, data FROM beaver_logs WHERE log_name = ? AND timestamp >= ? ORDER BY timestamp ASC",
+             (self._log_name, start_time),
+         )
+         for row in cursor:
+             ts, data_str = row
+             window_deque.append((ts, self._deserializer(data_str)))
+             last_seen_timestamp = max(last_seen_timestamp, ts)
+
+         # Yield the first result
+         try:
+             initial_result = self._aggregator([item[1] for item in window_deque])
+             self._queue.put(initial_result)
+         except Exception as e:
+             # Propagate aggregator errors to the main thread
+             self._queue.put(e)
+
+         # --- Continuous polling loop ---
+         while not self._stop_event.is_set():
+             time.sleep(self._sampling_period_seconds)
+
+             # Fetch only new data since the last check
+             cursor.execute(
+                 "SELECT timestamp, data FROM beaver_logs WHERE log_name = ? AND timestamp > ? ORDER BY timestamp ASC",
+                 (self._log_name, last_seen_timestamp),
+             )
+             for row in cursor:
+                 ts, data_str = row
+                 window_deque.append((ts, self._deserializer(data_str)))
+                 last_seen_timestamp = max(last_seen_timestamp, ts)
+
+             # Evict old data from the left of the deque
+             now = datetime.now(timezone.utc).timestamp()
+             eviction_time = now - self._window_duration_seconds
+             while window_deque and window_deque[0][0] < eviction_time:
+                 window_deque.popleft()
+
+             # Run aggregator and yield the new result
+             try:
+                 new_result = self._aggregator([item[1] for item in window_deque])
+                 self._queue.put(new_result)
+             except Exception as e:
+                 self._queue.put(e)
+
+         thread_conn.close()
+
+     def __iter__(self) -> "LiveIterator[T,R]":
+         self._thread = threading.Thread(target=self._polling_loop, daemon=True)
+         self._thread.start()
+         return self
+
+     def __next__(self) -> R:
+         result = self._queue.get()
+         if result is _SHUTDOWN_SENTINEL:
+             raise StopIteration
+         if isinstance(result, Exception):
+             # If the background thread put an exception in the queue, re-raise it
+             raise result
+         return result
+
+     def close(self):
+         """Stops the background polling thread."""
+         self._stop_event.set()
+         self._queue.put(_SHUTDOWN_SENTINEL)
+         if self._thread:
+             self._thread.join()
+
+
+ class AsyncLiveIterator[T,R]:
+     """An async wrapper for the LiveIterator."""
+
+     def __init__(self, sync_iterator: LiveIterator[T,R]):
+         self._sync_iterator = sync_iterator
+
+     async def __anext__(self) -> R:
+         try:
+             return await asyncio.to_thread(self._sync_iterator.__next__)
+         except StopIteration:
+             raise StopAsyncIteration
+
+     def __aiter__(self) -> "AsyncLiveIterator[T,R]":
+         # The synchronous iterator's __iter__ method starts the thread.
+         # This is non-blocking, so it's safe to call directly.
+         self._sync_iterator.__iter__()
+         return self
+
+     def close(self):
+         self._sync_iterator.close()
+
+
+ class AsyncLogManager[T]:
+     """An async-compatible wrapper for the LogManager."""
+
+     def __init__(self, sync_manager: "LogManager[T]"):
+         self._sync_manager = sync_manager
+
+     async def log(self, data: T, timestamp: datetime | None = None) -> None:
+         """Asynchronously adds a new entry to the log."""
+         await asyncio.to_thread(self._sync_manager.log, data, timestamp)
+
+     async def range(self, start: datetime, end: datetime) -> list[T]:
+         """Asynchronously retrieves all log entries within a specific time window."""
+         return await asyncio.to_thread(self._sync_manager.range, start, end)
+
+     def live[R](
+         self,
+         window: timedelta,
+         period: timedelta,
+         aggregator: Callable[[list[T]], R],
+     ) -> AsyncIterator[R]:
+         """Returns an async, infinite iterator for real-time log analysis."""
+         sync_iterator = self._sync_manager.live(
+             window, period, aggregator
+         )
+         return AsyncLiveIterator(sync_iterator)
+
+
+ class LogManager[T]:
+     """
+     A wrapper for interacting with a named, time-indexed log, providing
+     type-safe and async-compatible methods.
+     """
+
+     def __init__(
+         self,
+         name: str,
+         conn: sqlite3.Connection,
+         db_path: str,
+         model: Type[T] | None = None,
+     ):
+         self._name = name
+         self._conn = conn
+         self._db_path = db_path
+         self._model = model
+
+     def _serialize(self, value: T) -> str:
+         """Serializes the given value to a JSON string."""
+         if isinstance(value, JsonSerializable):
+             return value.model_dump_json()
+
+         return json.dumps(value)
+
+     def _deserialize(self, value: str) -> T:
+         """Deserializes a JSON string into the specified model or a generic object."""
+         if self._model:
+             return self._model.model_validate_json(value)
+
+         return json.loads(value)
+
+     def log(self, data: T, timestamp: datetime | None = None) -> None:
+         """
+         Adds a new entry to the log.
+
+         Args:
+             data: The JSON-serializable data to store. If a model is used, this
+                 should be an instance of that model.
+             timestamp: A timezone-naive datetime object. If not provided,
+                 `datetime.now()` is used.
+         """
+         ts = timestamp or datetime.now(timezone.utc)
+         ts_float = ts.timestamp()
+
+         with self._conn:
+             self._conn.execute(
+                 "INSERT INTO beaver_logs (log_name, timestamp, data) VALUES (?, ?, ?)",
+                 (self._name, ts_float, self._serialize(data)),
+             )
+
+     def range(self, start: datetime, end: datetime) -> list[T]:
+         """
+         Retrieves all log entries within a specific time window.
+
+         Args:
+             start: The start of the time range (inclusive).
+             end: The end of the time range (inclusive).
+
+         Returns:
+             A list of log entries, deserialized into the specified model if provided.
+         """
+         start_ts = start.timestamp()
+         end_ts = end.timestamp()
+
+         cursor = self._conn.cursor()
+         cursor.execute(
+             "SELECT data FROM beaver_logs WHERE log_name = ? AND timestamp BETWEEN ? AND ? ORDER BY timestamp ASC",
+             (self._name, start_ts, end_ts),
+         )
+         return [self._deserialize(row["data"]) for row in cursor.fetchall()]
+
+     def live[R](
+         self,
+         window: timedelta,
+         period: timedelta,
+         aggregator: Callable[[list[T]], R],
+     ) -> Iterator[R]:
+         """
+         Returns a blocking, infinite iterator for real-time log analysis.
+
+         This maintains a sliding window of log entries and yields the result
+         of an aggregator function at specified intervals.
+
+         Args:
+             window: The duration of the sliding window (e.g., `timedelta(minutes=5)`).
+             period: The interval at which to update and yield a new result
+                 (e.g., `timedelta(seconds=10)`).
+             aggregator: A function that takes a list of log entries (the window) and
+                 returns a single, aggregated result.
+
+         Returns:
+             An iterator that yields the results of the aggregator.
+         """
+         return LiveIterator(
+             db_path=self._db_path,
+             log_name=self._name,
+             window=window,
+             period=period,
+             aggregator=aggregator,
+             deserializer=self._deserialize,
+         )
+
+     def as_async(self) -> AsyncLogManager[T]:
+         """Returns an async-compatible version of the log manager."""
+         return AsyncLogManager(self)
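
Because every blocking call in `AsyncLogManager` and `AsyncLiveIterator` is pushed onto a worker thread via `asyncio.to_thread`, the wrapper can be driven from an event loop without stalling it. A short sketch of the async path, assuming the same `db` handle as in the previous example:

```python
import asyncio
from datetime import timedelta

async def monitor(db):
    # as_async() wraps the sync manager; no separate setup is needed.
    alogs = db.log("system_metrics").as_async()

    await alogs.log({"value": 1.0})  # executed in a worker thread

    live = alogs.live(
        window=timedelta(seconds=30),
        period=timedelta(seconds=5),
        aggregator=len,  # count the entries currently in the window
    )
    try:
        async for count in live:
            print(f"{count} entries in the last 30s")
    finally:
        live.close()  # stop the background polling thread

# asyncio.run(monitor(db))  # `db` as in the previous sketch
```

Note that `close()` is synchronous and, as in the sync version, unblocks a pending `__next__` by enqueuing the shutdown sentinel.
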
--- a/beaver_db.egg-info/SOURCES.txt
+++ b/beaver_db.egg-info/SOURCES.txt
@@ -8,6 +8,7 @@ beaver/collections.py
  beaver/core.py
  beaver/dicts.py
  beaver/lists.py
+ beaver/logs.py
  beaver/queues.py
  beaver/types.py
  beaver/vectors.py

--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
  [project]
  name = "beaver-db"
- version = "0.15.0"
+ version = "0.16.0"
  description = "Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications."
  readme = "README.md"
  requires-python = ">=3.13"