beaver-db 0.14.0__tar.gz → 0.16.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {beaver_db-0.14.0/beaver_db.egg-info → beaver_db-0.16.0}/PKG-INFO +59 -32
- {beaver_db-0.14.0 → beaver_db-0.16.0}/README.md +58 -31
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver/core.py +34 -0
- beaver_db-0.16.0/beaver/logs.py +278 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver/queues.py +78 -15
- {beaver_db-0.14.0 → beaver_db-0.16.0/beaver_db.egg-info}/PKG-INFO +59 -32
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver_db.egg-info/SOURCES.txt +1 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/pyproject.toml +6 -1
- {beaver_db-0.14.0 → beaver_db-0.16.0}/LICENSE +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver/__init__.py +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver/blobs.py +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver/channels.py +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver/collections.py +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver/dicts.py +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver/lists.py +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver/types.py +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver/vectors.py +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver_db.egg-info/dependency_links.txt +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver_db.egg-info/requires.txt +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/beaver_db.egg-info/top_level.txt +0 -0
- {beaver_db-0.14.0 → beaver_db-0.16.0}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: beaver-db
-Version: 0.14.0
+Version: 0.16.0
 Summary: Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications.
 Requires-Python: >=3.13
 Description-Content-Type: text/markdown
@@ -19,24 +19,26 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 
 `beaver` is built with a minimalistic philosophy for small, local use cases where a full-blown database server would be overkill.
 
-
-
-
-
-
-
+- **Minimalistic**: Uses only Python's standard libraries (`sqlite3`) and `numpy`/`faiss-cpu`.
+- **Schemaless**: Flexible data storage without rigid schemas across all modalities.
+- **Synchronous, Multi-Process, and Thread-Safe**: Designed for simplicity and safety in multi-threaded and multi-process environments.
+- **Built for Local Applications**: Perfect for local AI tools, RAG prototypes, chatbots, and desktop utilities that need persistent, structured data without network overhead.
+- **Fast by Default**: It's built on SQLite, which is famously fast and reliable for local applications. Vector search is accelerated with a high-performance, persistent `faiss` index.
+- **Standard Relational Interface**: While `beaver` provides high-level features, you can always use the same SQLite file for normal relational tasks with standard SQL.
 
 ## Core Features
 
-
-
-
-
-
-
-
-
-
+- **Sync/Async High-Efficiency Pub/Sub**: A powerful, thread- and process-safe publish-subscribe system for real-time messaging with a fan-out architecture. Sync by default, but with an `as_async` wrapper for async applications.
+- **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces, with optional TTL for cache implementations.
+- **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
+- **Persistent Priority Queue**: A high-performance, persistent priority queue perfect for task orchestration across multiple processes. Also with optional async support.
+- **Time-Indexed Log for Monitoring**: A specialized data structure for structured, time-series logs. Query historical data by time range or create a live, aggregated view of the most recent events for real-time dashboards.
+- **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
+- **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
+- **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
+- **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
+- **Single-File & Portable**: All data is stored in a single SQLite file, making it incredibly easy to move, back up, or embed in your application.
+- **Optional Type-Safety:** Although the database is schemaless, you can use a minimalistic typing system for automatic serialization and deserialization that is Pydantic-compatible out of the box.
 
 ## How Beaver is Implemented
 
@@ -219,6 +221,29 @@ attachments.put(
 avatar = attachments.get("user_123_avatar.png")
 ```
 
+### 8. Real-time Application Monitoring
+
+Use the **time-indexed log** to monitor your application's health in real-time. The `live()` method provides a continuously updating, aggregated view of your log data, perfect for building simple dashboards directly in your terminal.
+
+```python
+from datetime import timedelta
+import statistics
+
+logs = db.log("system_metrics")
+
+def summarize(window):
+    values = [log.get("value", 0) for log in window]
+    return {"mean": statistics.mean(values) if values else 0, "count": len(values)}
+
+live_summary = logs.live(
+    window=timedelta(seconds=10),
+    period=timedelta(seconds=1),
+    aggregator=summarize,
+)
+
+for summary in live_summary:
+    print(f"Live Stats (10s window): Count={summary['count']}, Mean={summary['mean']:.2f}")
+```
 
 ## Type-Safe Data Models
 
@@ -260,21 +285,24 @@ Basically everywhere you can store or get some object in BeaverDB, you can use a
 
 For more in-depth examples, check out the scripts in the `examples/` directory:
 
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`examples/
-- [`
-- [`
-- [`
+- [`async_pubsub.py`](examples/async_pubsub.py): A demonstration of the asynchronous wrapper for the publish/subscribe system.
+- [`blobs.py`](examples/blobs.py): Demonstrates how to store and retrieve binary data in the database.
+- [`cache.py`](examples/cache.py): A practical example of using a dictionary with TTL as a cache for API calls.
+- [`fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
+- [`fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
+- [`general_test.py`](examples/general_test.py): A general-purpose test that runs all operations randomly, which allows testing long-running processes and synchronicity issues.
+- [`graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
+- [`kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
+- [`list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
+- [`pqueue.py`](examples/pqueue.py): A practical example of using the persistent priority queue for task management.
+- [`producer_consumer.py`](examples/producer_consumer.py): A demonstration of the distributed task queue system in a multi-process environment.
+- [`publisher.py`](examples/publisher.py) and [`subscriber.py`](examples/subscriber.py): A pair of examples demonstrating inter-process message passing with the publish/subscribe system.
+- [`pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
+- [`rerank.py`](examples/rerank.py): Shows how to combine results from vector and text search for more refined results.
+- [`stress_vectors.py`](examples/stress_vectors.py): A stress test for the vector search functionality.
+- [`textual_chat.py`](examples/textual_chat.py): A chat application built with `textual` and `beaver` to illustrate the use of several primitives (lists, dicts, and channels) at the same time.
+- [`type_hints.py`](examples/type_hints.py): Shows how to use type hints with `beaver` to get better IDE support and type safety.
+- [`vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
 
 ## Roadmap
 
@@ -283,7 +311,6 @@ For more in-depth examples, check out the scripts in the `examples/` directory:
 These are some of the features and improvements planned for future releases:
 
 - **Async API**: Extend the async support with on-demand wrappers for all features besides channels.
-- **Type Hints**: Extend type hints for channels and documents.
 
 Check out the [roadmap](roadmap.md) for a detailed list of upcoming features and design ideas.
 
@@ -8,24 +8,26 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 
 `beaver` is built with a minimalistic philosophy for small, local use cases where a full-blown database server would be overkill.
 
-
-
-
-
-
-
+- **Minimalistic**: Uses only Python's standard libraries (`sqlite3`) and `numpy`/`faiss-cpu`.
+- **Schemaless**: Flexible data storage without rigid schemas across all modalities.
+- **Synchronous, Multi-Process, and Thread-Safe**: Designed for simplicity and safety in multi-threaded and multi-process environments.
+- **Built for Local Applications**: Perfect for local AI tools, RAG prototypes, chatbots, and desktop utilities that need persistent, structured data without network overhead.
+- **Fast by Default**: It's built on SQLite, which is famously fast and reliable for local applications. Vector search is accelerated with a high-performance, persistent `faiss` index.
+- **Standard Relational Interface**: While `beaver` provides high-level features, you can always use the same SQLite file for normal relational tasks with standard SQL.
 
 ## Core Features
 
-
-
-
-
-
-
-
-
-
+- **Sync/Async High-Efficiency Pub/Sub**: A powerful, thread- and process-safe publish-subscribe system for real-time messaging with a fan-out architecture. Sync by default, but with an `as_async` wrapper for async applications.
+- **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces, with optional TTL for cache implementations.
+- **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
+- **Persistent Priority Queue**: A high-performance, persistent priority queue perfect for task orchestration across multiple processes. Also with optional async support.
+- **Time-Indexed Log for Monitoring**: A specialized data structure for structured, time-series logs. Query historical data by time range or create a live, aggregated view of the most recent events for real-time dashboards.
+- **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
+- **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
+- **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
+- **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
+- **Single-File & Portable**: All data is stored in a single SQLite file, making it incredibly easy to move, back up, or embed in your application.
+- **Optional Type-Safety:** Although the database is schemaless, you can use a minimalistic typing system for automatic serialization and deserialization that is Pydantic-compatible out of the box.
 
 ## How Beaver is Implemented
 
@@ -208,6 +210,29 @@ attachments.put(
 avatar = attachments.get("user_123_avatar.png")
 ```
 
+### 8. Real-time Application Monitoring
+
+Use the **time-indexed log** to monitor your application's health in real-time. The `live()` method provides a continuously updating, aggregated view of your log data, perfect for building simple dashboards directly in your terminal.
+
+```python
+from datetime import timedelta
+import statistics
+
+logs = db.log("system_metrics")
+
+def summarize(window):
+    values = [log.get("value", 0) for log in window]
+    return {"mean": statistics.mean(values) if values else 0, "count": len(values)}
+
+live_summary = logs.live(
+    window=timedelta(seconds=10),
+    period=timedelta(seconds=1),
+    aggregator=summarize,
+)
+
+for summary in live_summary:
+    print(f"Live Stats (10s window): Count={summary['count']}, Mean={summary['mean']:.2f}")
+```
 
 ## Type-Safe Data Models
 
@@ -249,21 +274,24 @@ Basically everywhere you can store or get some object in BeaverDB, you can use a
 
 For more in-depth examples, check out the scripts in the `examples/` directory:
 
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`examples/
-- [`
-- [`
-- [`
+- [`async_pubsub.py`](examples/async_pubsub.py): A demonstration of the asynchronous wrapper for the publish/subscribe system.
+- [`blobs.py`](examples/blobs.py): Demonstrates how to store and retrieve binary data in the database.
+- [`cache.py`](examples/cache.py): A practical example of using a dictionary with TTL as a cache for API calls.
+- [`fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
+- [`fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
+- [`general_test.py`](examples/general_test.py): A general-purpose test that runs all operations randomly, which allows testing long-running processes and synchronicity issues.
+- [`graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
+- [`kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
+- [`list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
+- [`pqueue.py`](examples/pqueue.py): A practical example of using the persistent priority queue for task management.
+- [`producer_consumer.py`](examples/producer_consumer.py): A demonstration of the distributed task queue system in a multi-process environment.
+- [`publisher.py`](examples/publisher.py) and [`subscriber.py`](examples/subscriber.py): A pair of examples demonstrating inter-process message passing with the publish/subscribe system.
+- [`pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
+- [`rerank.py`](examples/rerank.py): Shows how to combine results from vector and text search for more refined results.
+- [`stress_vectors.py`](examples/stress_vectors.py): A stress test for the vector search functionality.
+- [`textual_chat.py`](examples/textual_chat.py): A chat application built with `textual` and `beaver` to illustrate the use of several primitives (lists, dicts, and channels) at the same time.
+- [`type_hints.py`](examples/type_hints.py): Shows how to use type hints with `beaver` to get better IDE support and type safety.
+- [`vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
 
 ## Roadmap
 
@@ -272,7 +300,6 @@ For more in-depth examples, check out the scripts in the `examples/` directory:
 These are some of the features and improvements planned for future releases:
 
 - **Async API**: Extend the async support with on-demand wrappers for all features besides channels.
-- **Type Hints**: Extend type hints for channels and documents.
 
 Check out the [roadmap](roadmap.md) for a detailed list of upcoming features and design ideas.
 
@@ -8,6 +8,7 @@ from .channels import ChannelManager
 from .collections import CollectionManager, Document
 from .dicts import DictManager
 from .lists import ListManager
+from .logs import LogManager
 from .queues import QueueManager
 
 
@@ -51,11 +52,31 @@ class BeaverDB:
         self._create_edges_table()
         self._create_fts_table()
         self._create_list_table()
+        self._create_logs_table()
         self._create_priority_queue_table()
         self._create_pubsub_table()
         self._create_trigrams_table()
         self._create_versions_table()
 
+    def _create_logs_table(self):
+        """Creates the table for time-indexed logs."""
+        self._conn.execute(
+            """
+            CREATE TABLE IF NOT EXISTS beaver_logs (
+                log_name TEXT NOT NULL,
+                timestamp REAL NOT NULL,
+                data TEXT NOT NULL,
+                PRIMARY KEY (log_name, timestamp)
+            )
+            """
+        )
+        self._conn.execute(
+            """
+            CREATE INDEX IF NOT EXISTS idx_logs_timestamp
+            ON beaver_logs (log_name, timestamp)
+            """
+        )
+
     def _create_blobs_table(self):
         """Creates the table for storing named blobs."""
         self._conn.execute(
@@ -343,3 +364,16 @@ class BeaverDB:
             raise TypeError("Blob store name must be a non-empty string.")
 
         return BlobManager(name, self._conn, model)
+
+    def log[T](self, name: str, model: type[T] | None = None) -> LogManager[T]:
+        """
+        Returns a wrapper for interacting with a named, time-indexed log.
+        If model is defined, it should be a type used for automatic (de)serialization.
+        """
+        if not isinstance(name, str) or not name:
+            raise TypeError("Log name must be a non-empty string.")
+
+        if model and not isinstance(model, JsonSerializable):
+            raise TypeError("The model parameter must be a JsonSerializable class.")
+
+        return LogManager(name, self._conn, self._db_path, model)
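The `beaver_logs` table above is plain SQLite, so its behavior is easy to verify standalone. The following sketch (in-memory database, hypothetical `system_metrics` data) reuses the `CREATE TABLE` statement from `_create_logs_table` and the `INSERT`/`BETWEEN` queries that `LogManager.log` and `LogManager.range` issue:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema from _create_logs_table: one row per log entry, keyed by (log_name, timestamp).
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS beaver_logs (
        log_name TEXT NOT NULL,
        timestamp REAL NOT NULL,
        data TEXT NOT NULL,
        PRIMARY KEY (log_name, timestamp)
    )
    """
)

# Insert three entries the same way LogManager.log does (JSON-serialized payloads).
for ts, value in [(1.0, 10), (2.0, 20), (3.0, 30)]:
    conn.execute(
        "INSERT INTO beaver_logs (log_name, timestamp, data) VALUES (?, ?, ?)",
        ("system_metrics", ts, json.dumps({"value": value})),
    )

# Range query over [1.5, 3.0], mirroring LogManager.range (inclusive on both ends).
rows = conn.execute(
    "SELECT data FROM beaver_logs WHERE log_name = ? AND timestamp BETWEEN ? AND ? ORDER BY timestamp ASC",
    ("system_metrics", 1.5, 3.0),
).fetchall()
entries = [json.loads(r[0]) for r in rows]
print(entries)  # → [{'value': 20}, {'value': 30}]
```

Note that `BETWEEN` is inclusive, which matches the docstring of `range` ("inclusive" on both `start` and `end`).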
@@ -0,0 +1,278 @@
|
|
|
1
|
+
import asyncio
|
|
2
|
+
import collections
|
|
3
|
+
import json
|
|
4
|
+
import sqlite3
|
|
5
|
+
import threading
|
|
6
|
+
import time
|
|
7
|
+
from datetime import datetime, timedelta, timezone
|
|
8
|
+
from queue import Empty, Queue
|
|
9
|
+
from typing import Any, AsyncIterator, Callable, Iterator, Type, TypeVar
|
|
10
|
+
|
|
11
|
+
from .types import JsonSerializable
|
|
12
|
+
|
|
13
|
+
|
|
14
|
+
# A special message object used to signal the iterator to gracefully shut down.
|
|
15
|
+
_SHUTDOWN_SENTINEL = object()
|
|
16
|
+
|
|
17
|
+
|
|
18
|
+
class LiveIterator[T,R]:
|
|
19
|
+
"""
|
|
20
|
+
A thread-safe, blocking iterator that yields aggregated results from a
|
|
21
|
+
rolling window of log data.
|
|
22
|
+
"""
|
|
23
|
+
|
|
24
|
+
def __init__(
|
|
25
|
+
self,
|
|
26
|
+
db_path: str,
|
|
27
|
+
log_name: str,
|
|
28
|
+
window: timedelta,
|
|
29
|
+
period: timedelta,
|
|
30
|
+
aggregator: Callable[[list[T]], R],
|
|
31
|
+
deserializer: Callable[[str], T],
|
|
32
|
+
):
|
|
33
|
+
self._db_path = db_path
|
|
34
|
+
self._log_name = log_name
|
|
35
|
+
self._window_duration_seconds = window.total_seconds()
|
|
36
|
+
self._sampling_period_seconds = period.total_seconds()
|
|
37
|
+
self._aggregator = aggregator
|
|
38
|
+
self._deserializer = deserializer
|
|
39
|
+
self._queue: Queue = Queue()
|
|
40
|
+
self._stop_event = threading.Event()
|
|
41
|
+
self._thread: threading.Thread | None = None
|
|
42
|
+
|
|
43
|
+
def _polling_loop(self):
|
|
44
|
+
"""The main loop for the background thread that queries and aggregates data."""
|
|
45
|
+
# Each thread needs its own database connection.
|
|
46
|
+
thread_conn = sqlite3.connect(self._db_path, check_same_thread=False)
|
|
47
|
+
thread_conn.row_factory = sqlite3.Row
|
|
48
|
+
|
|
49
|
+
window_deque: collections.deque[tuple[float, T]] = collections.deque()
|
|
50
|
+
last_seen_timestamp = 0.0
|
|
51
|
+
|
|
52
|
+
# --- Initial window population ---
|
|
53
|
+
now = datetime.now(timezone.utc).timestamp()
|
|
54
|
+
start_time = now - self._window_duration_seconds
|
|
55
|
+
cursor = thread_conn.cursor()
|
|
56
|
+
cursor.execute(
|
|
57
|
+
"SELECT timestamp, data FROM beaver_logs WHERE log_name = ? AND timestamp >= ? ORDER BY timestamp ASC",
|
|
58
|
+
(self._log_name, start_time),
|
|
59
|
+
)
|
|
60
|
+
for row in cursor:
|
|
61
|
+
ts, data_str = row
|
|
62
|
+
window_deque.append((ts, self._deserializer(data_str)))
|
|
63
|
+
last_seen_timestamp = max(last_seen_timestamp, ts)
|
|
64
|
+
|
|
65
|
+
# Yield the first result
|
|
66
|
+
try:
|
|
67
|
+
initial_result = self._aggregator([item[1] for item in window_deque])
|
|
68
|
+
self._queue.put(initial_result)
|
|
69
|
+
except Exception as e:
|
|
70
|
+
# Propagate aggregator errors to the main thread
|
|
71
|
+
self._queue.put(e)
|
|
72
|
+
|
|
73
|
+
# --- Continuous polling loop ---
|
|
74
|
+
while not self._stop_event.is_set():
|
|
75
|
+
time.sleep(self._sampling_period_seconds)
|
|
76
|
+
|
|
77
|
+
# Fetch only new data since the last check
|
|
78
|
+
cursor.execute(
|
|
79
|
+
"SELECT timestamp, data FROM beaver_logs WHERE log_name = ? AND timestamp > ? ORDER BY timestamp ASC",
|
|
80
|
+
(self._log_name, last_seen_timestamp),
|
|
81
|
+
)
|
|
82
|
+
for row in cursor:
|
|
83
|
+
ts, data_str = row
|
|
84
|
+
window_deque.append((ts, self._deserializer(data_str)))
|
|
85
|
+
last_seen_timestamp = max(last_seen_timestamp, ts)
|
|
86
|
+
|
|
87
|
+
# Evict old data from the left of the deque
|
|
88
|
+
now = datetime.now(timezone.utc).timestamp()
|
|
89
|
+
eviction_time = now - self._window_duration_seconds
|
|
90
|
+
while window_deque and window_deque[0][0] < eviction_time:
|
|
91
|
+
window_deque.popleft()
|
|
92
|
+
|
|
93
|
+
# Run aggregator and yield the new result
|
|
94
|
+
try:
|
|
95
|
+
new_result = self._aggregator([item[1] for item in window_deque])
|
|
96
|
+
self._queue.put(new_result)
|
|
97
|
+
except Exception as e:
|
|
98
|
+
self._queue.put(e)
|
|
99
|
+
|
|
100
|
+
thread_conn.close()
|
|
101
|
+
|
|
102
|
+
def __iter__(self) -> "LiveIterator[T,R]":
|
|
103
|
+
self._thread = threading.Thread(target=self._polling_loop, daemon=True)
|
|
104
|
+
self._thread.start()
|
|
105
|
+
return self
|
|
106
|
+
|
|
107
|
+
def __next__(self) -> R:
|
|
108
|
+
result = self._queue.get()
|
|
109
|
+
if result is _SHUTDOWN_SENTINEL:
|
|
110
|
+
raise StopIteration
|
|
111
|
+
if isinstance(result, Exception):
|
|
112
|
+
# If the background thread put an exception in the queue, re-raise it
|
|
113
|
+
raise result
|
|
114
|
+
return result
|
|
115
|
+
|
|
116
|
+
def close(self):
|
|
117
|
+
"""Stops the background polling thread."""
|
|
118
|
+
self._stop_event.set()
|
|
119
|
+
self._queue.put(_SHUTDOWN_SENTINEL)
|
|
120
|
+
if self._thread:
|
|
121
|
+
self._thread.join()
|
|
122
|
+
|
|
123
|
+
|
|
124
|
+
class AsyncLiveIterator[T,R]:
|
|
125
|
+
"""An async wrapper for the LiveIterator."""
|
|
126
|
+
|
|
127
|
+
def __init__(self, sync_iterator: LiveIterator[T,R]):
|
|
128
|
+
self._sync_iterator = sync_iterator
|
|
129
|
+
|
|
130
|
+
async def __anext__(self) -> R:
|
|
131
|
+
try:
|
|
132
|
+
return await asyncio.to_thread(self._sync_iterator.__next__)
|
|
133
|
+
except StopIteration:
|
|
134
|
+
raise StopAsyncIteration
|
|
135
|
+
|
|
136
|
+
def __aiter__(self) -> "AsyncLiveIterator[T,R]":
|
|
137
|
+
# The synchronous iterator's __iter__ method starts the thread.
|
|
138
|
+
# This is non-blocking, so it's safe to call directly.
|
|
139
|
+
self._sync_iterator.__iter__()
|
|
140
|
+
return self
|
|
141
|
+
|
|
142
|
+
def close(self):
|
|
143
|
+
self._sync_iterator.close()
|
|
144
|
+
|
|
145
|
+
|
|
146
|
+
class AsyncLogManager[T]:
|
|
147
|
+
"""An async-compatible wrapper for the LogManager."""
|
|
148
|
+
|
|
149
|
+
def __init__(self, sync_manager: "LogManager[T]"):
|
|
150
|
+
self._sync_manager = sync_manager
|
|
151
|
+
|
|
152
|
+
async def log(self, data: T, timestamp: datetime | None = None) -> None:
|
|
153
|
+
"""Asynchronously adds a new entry to the log."""
|
|
154
|
+
await asyncio.to_thread(self._sync_manager.log, data, timestamp)
|
|
155
|
+
|
|
156
|
+
async def range(self, start: datetime, end: datetime) -> list[T]:
|
|
157
|
+
"""Asynchronously retrieves all log entries within a specific time window."""
|
|
158
|
+
return await asyncio.to_thread(self._sync_manager.range, start, end)
|
|
159
|
+
|
|
160
|
+
def live[R](
|
|
161
|
+
self,
|
|
162
|
+
window: timedelta,
|
|
163
|
+
period: timedelta,
|
|
164
|
+
aggregator: Callable[[list[T]], R],
|
|
165
|
+
) -> AsyncIterator[R]:
|
|
166
|
+
"""Returns an async, infinite iterator for real-time log analysis."""
|
|
167
|
+
sync_iterator = self._sync_manager.live(
|
|
168
|
+
window, period, aggregator
|
|
169
|
+
)
|
|
170
|
+
return AsyncLiveIterator(sync_iterator)
|
|
171
|
+
|
|
172
|
+
|
|
173
|
+
class LogManager[T]:
|
|
174
|
+
"""
|
|
175
|
+
A wrapper for interacting with a named, time-indexed log, providing
|
|
176
|
+
type-safe and async-compatible methods.
|
|
177
|
+
"""
|
|
178
|
+
|
|
179
|
+
def __init__(
|
|
180
|
+
self,
|
|
181
|
+
name: str,
|
|
182
|
+
conn: sqlite3.Connection,
|
|
183
|
+
db_path: str,
|
|
184
|
+
model: Type[T] | None = None,
|
|
185
|
+
):
|
|
186
|
+
self._name = name
|
|
187
|
+
self._conn = conn
|
|
188
|
+
self._db_path = db_path
|
|
189
|
+
self._model = model
|
|
190
|
+
|
|
191
|
+
def _serialize(self, value: T) -> str:
|
|
192
|
+
"""Serializes the given value to a JSON string."""
|
|
193
|
+
if isinstance(value, JsonSerializable):
|
|
194
|
+
return value.model_dump_json()
|
|
195
|
+
|
|
196
|
+
return json.dumps(value)
|
|
197
|
+
|
|
198
|
+
def _deserialize(self, value: str) -> T:
|
|
199
|
+
"""Deserializes a JSON string into the specified model or a generic object."""
|
|
200
|
+
if self._model:
|
|
201
|
+
return self._model.model_validate_json(value)
|
|
202
|
+
|
|
203
|
+
return json.loads(value)
|
|
204
|
+
|
|
205
|
+
def log(self, data: T, timestamp: datetime | None = None) -> None:
|
|
206
|
+
"""
|
|
207
|
+
Adds a new entry to the log.
|
|
208
|
+
|
|
209
|
+
Args:
|
|
210
|
+
data: The JSON-serializable data to store. If a model is used, this
|
|
211
|
+
should be an instance of that model.
|
|
212
|
+
timestamp: A timezone-naive datetime object. If not provided,
|
|
213
|
+
`datetime.now()` is used.
|
|
214
|
+
"""
|
|
215
|
+
ts = timestamp or datetime.now(timezone.utc)
|
|
216
|
+
ts_float = ts.timestamp()
|
|
217
|
+
|
|
218
|
+
with self._conn:
|
|
219
|
+
self._conn.execute(
|
|
220
|
+
"INSERT INTO beaver_logs (log_name, timestamp, data) VALUES (?, ?, ?)",
|
|
221
|
+
(self._name, ts_float, self._serialize(data)),
|
|
222
|
+
)
|
|
223
|
+
|
|
224
|
+
+    def range(self, start: datetime, end: datetime) -> list[T]:
+        """
+        Retrieves all log entries within a specific time window.
+
+        Args:
+            start: The start of the time range (inclusive).
+            end: The end of the time range (inclusive).
+
+        Returns:
+            A list of log entries, deserialized into the specified model if provided.
+        """
+        start_ts = start.timestamp()
+        end_ts = end.timestamp()
+
+        cursor = self._conn.cursor()
+        cursor.execute(
+            "SELECT data FROM beaver_logs WHERE log_name = ? AND timestamp BETWEEN ? AND ? ORDER BY timestamp ASC",
+            (self._name, start_ts, end_ts),
+        )
+        return [self._deserialize(row["data"]) for row in cursor.fetchall()]
+
+    def live[R](
+        self,
+        window: timedelta,
+        period: timedelta,
+        aggregator: Callable[[list[T]], R],
+    ) -> Iterator[R]:
+        """
+        Returns a blocking, infinite iterator for real-time log analysis.
+
+        This maintains a sliding window of log entries and yields the result
+        of an aggregator function at specified intervals.
+
+        Args:
+            window: The duration of the sliding window (e.g., `timedelta(minutes=5)`).
+            period: The interval at which to update and yield a new result
+                (e.g., `timedelta(seconds=10)`).
+            aggregator: A function that takes a list of log entries (the window) and
+                returns a single, aggregated result.
+
+        Returns:
+            An iterator that yields the results of the aggregator.
+        """
+        return LiveIterator(
+            db_path=self._db_path,
+            log_name=self._name,
+            window=window,
+            period=period,
+            aggregator=aggregator,
+            deserializer=self._deserialize,
+        )
+
+    def as_async(self) -> AsyncLogManager[T]:
+        """Returns an async-compatible version of the log manager."""
+        return AsyncLogManager(self)
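The `range()` method added above is a plain SQLite query underneath. As a standalone sketch (the table is reduced to just the columns the query touches; the real `beaver_logs` schema may have more columns and indexes), the same inclusive `BETWEEN` window can be reproduced with only the standard library:

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Throwaway in-memory table shaped like the columns range() touches.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE beaver_logs (log_name TEXT, timestamp REAL, data TEXT)")

now = datetime.now(timezone.utc)
for minutes_ago, payload in [(10, "old"), (2, "recent"), (1, "newest")]:
    ts = (now - timedelta(minutes=minutes_ago)).timestamp()
    conn.execute("INSERT INTO beaver_logs VALUES (?, ?, ?)", ("metrics", ts, payload))

# Same inclusive BETWEEN query as in the diff, ordered oldest-first.
start_ts = (now - timedelta(minutes=5)).timestamp()
end_ts = now.timestamp()
rows = conn.execute(
    "SELECT data FROM beaver_logs "
    "WHERE log_name = ? AND timestamp BETWEEN ? AND ? ORDER BY timestamp ASC",
    ("metrics", start_ts, end_ts),
).fetchall()
recent_entries = [row["data"] for row in rows]
print(recent_entries)  # only the entries from the last five minutes
```

Because both bounds are inclusive and the result is ordered by timestamp, the ten-minute-old entry is excluded while the two newer ones come back oldest-first.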
@@ -1,3 +1,4 @@
+import asyncio
 import json
 import sqlite3
 import time
@@ -14,8 +15,34 @@ class QueueItem[T](NamedTuple):
     data: T
 
 
+class AsyncQueueManager[T]:
+    """An async wrapper for the producer-consumer priority queue."""
+
+    def __init__(self, queue: "QueueManager[T]"):
+        self._queue = queue
+
+    async def put(self, data: T, priority: float):
+        """Asynchronously adds an item to the queue with a specific priority."""
+        await asyncio.to_thread(self._queue.put, data, priority)
+
+    @overload
+    async def get(self, block: Literal[True] = True, timeout: float | None = None) -> QueueItem[T]: ...
+    @overload
+    async def get(self, block: Literal[False]) -> QueueItem[T]: ...
+
+    async def get(self, block: bool = True, timeout: float | None = None) -> QueueItem[T]:
+        """
+        Asynchronously and atomically retrieves the highest-priority item.
+        This method will run the synchronous blocking logic in a separate thread.
+        """
+        return await asyncio.to_thread(self._queue.get, block=block, timeout=timeout)
+
+
 class QueueManager[T]:
-    """
+    """
+    A wrapper providing a Pythonic interface to a persistent, multi-process
+    producer-consumer priority queue.
+    """
 
     def __init__(self, name: str, conn: sqlite3.Connection, model: Type[T] | None = None):
         self._name = name
@@ -50,19 +77,13 @@ class QueueManager[T]:
             (self._name, priority, time.time(), self._serialize(data)),
         )
 
-
-    def get(self, safe:Literal[True]) -> QueueItem[T] | None: ...
-    @overload
-    def get(self) -> QueueItem[T]: ...
-
-    def get(self, safe:bool=False) -> QueueItem[T] | None:
+    def _get_item_atomically(self) -> QueueItem[T] | None:
         """
-
-
+        Performs a single, atomic attempt to retrieve and remove the
+        highest-priority item from the queue. Returns None if the queue is empty.
         """
         with self._conn:
             cursor = self._conn.cursor()
-            # The compound index on (queue_name, priority, timestamp) makes this query efficient.
             cursor.execute(
                 """
                 SELECT rowid, priority, timestamp, data
@@ -76,19 +97,61 @@ class QueueManager[T]:
             result = cursor.fetchone()
 
             if result is None:
-
-                return None
-            else:
-                raise IndexError("No item available.")
+                return None
 
             rowid, priority, timestamp, data = result
-            # Delete the retrieved item to ensure it's processed only once.
             cursor.execute("DELETE FROM beaver_priority_queues WHERE rowid = ?", (rowid,))
 
             return QueueItem(
                 priority=priority, timestamp=timestamp, data=self._deserialize(data)
             )
 
+    @overload
+    def get(self, block: Literal[True] = True, timeout: float | None = None) -> QueueItem[T]: ...
+    @overload
+    def get(self, block: Literal[False]) -> QueueItem[T]: ...
+
+    def get(self, block: bool = True, timeout: float | None = None) -> QueueItem[T]:
+        """
+        Atomically retrieves and removes the highest-priority item from the queue.
+
+        This method is designed for producer-consumer patterns and can block
+        until an item becomes available.
+
+        Args:
+            block: If True (default), the method will wait until an item is available.
+            timeout: If `block` is True, this specifies the maximum number of seconds
+                to wait. If the timeout is reached, `TimeoutError` is raised.
+
+        Returns:
+            A `QueueItem` containing the retrieved data.
+
+        Raises:
+            IndexError: If `block` is False and the queue is empty.
+            TimeoutError: If `block` is True and the timeout expires.
+        """
+        if not block:
+            item = self._get_item_atomically()
+            if item is None:
+                raise IndexError("get from an empty queue.")
+            return item
+
+        start_time = time.time()
+        while True:
+            item = self._get_item_atomically()
+            if item is not None:
+                return item
+
+            if timeout is not None and (time.time() - start_time) > timeout:
+                raise TimeoutError("Timeout expired while waiting for an item.")
+
+            # Sleep for a short interval to avoid busy-waiting and consuming CPU.
+            time.sleep(0.1)
+
+    def as_async(self) -> "AsyncQueueManager[T]":
+        """Returns an async version of the queue manager."""
+        return AsyncQueueManager(self)
+
     def __len__(self) -> int:
         """Returns the current number of items in the queue."""
         cursor = self._conn.cursor()
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: beaver-db
-Version: 0.14.0
+Version: 0.16.0
 Summary: Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications.
 Requires-Python: >=3.13
 Description-Content-Type: text/markdown
@@ -19,24 +19,26 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 
 `beaver` is built with a minimalistic philosophy for small, local use cases where a full-blown database server would be overkill.
 
-
-
-
-
-
-
+- **Minimalistic**: Uses only Python's standard libraries (`sqlite3`) and `numpy`/`faiss-cpu`.
+- **Schemaless**: Flexible data storage without rigid schemas across all modalities.
+- **Synchronous, Multi-Process, and Thread-Safe**: Designed for simplicity and safety in multi-threaded and multi-process environments.
+- **Built for Local Applications**: Perfect for local AI tools, RAG prototypes, chatbots, and desktop utilities that need persistent, structured data without network overhead.
+- **Fast by Default**: It's built on SQLite, which is famously fast and reliable for local applications. Vector search is accelerated with a high-performance, persistent `faiss` index.
+- **Standard Relational Interface**: While `beaver` provides high-level features, you can always use the same SQLite file for normal relational tasks with standard SQL.
 
 ## Core Features
 
-
-
-
-
-
-
-
-
-
+- **Sync/Async High-Efficiency Pub/Sub**: A powerful, thread and process-safe publish-subscribe system for real-time messaging with a fan-out architecture. Sync by default, but with an `as_async` wrapper for async applications.
+- **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
+- **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
+- **Persistent Priority Queue**: A high-performance, persistent priority queue perfect for task orchestration across multiple processes. Also with optional async support.
+- **Time-Indexed Log for Monitoring**: A specialized data structure for structured, time-series logs. Query historical data by time range or create a live, aggregated view of the most recent events for real-time dashboards.
+- **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
+- **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
+- **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
+- **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
+- **Single-File & Portable**: All data is stored in a single SQLite file, making it incredibly easy to move, back up, or embed in your application.
+- **Optional Type-Safety:** Although the database is schemaless, you can use a minimalistic typing system for automatic serialization and deserialization that is Pydantic-compatible out of the box.
 
 ## How Beaver is Implemented
 
@@ -219,6 +221,29 @@ attachments.put(
 avatar = attachments.get("user_123_avatar.png")
 ```
 
+### 8. Real-time Application Monitoring
+
+Use the **time-indexed log** to monitor your application's health in real-time. The `live()` method provides a continuously updating, aggregated view of your log data, perfect for building simple dashboards directly in your terminal.
+
+```python
+from datetime import timedelta
+import statistics
+
+logs = db.log("system_metrics")
+
+def summarize(window):
+    values = [log.get("value", 0) for log in window]
+    return {"mean": statistics.mean(values), "count": len(values)}
+
+live_summary = logs.live(
+    window=timedelta(seconds=10),
+    period=timedelta(seconds=1),
+    aggregator=summarize
+)
+
+for summary in live_summary:
+    print(f"Live Stats (10s window): Count={summary['count']}, Mean={summary['mean']:.2f}")
+```
 
 ## Type-Safe Data Models
 
@@ -260,21 +285,24 @@ Basically everywhere you can store or get some object in BeaverDB, you can use a
 
 For more in-depth examples, check out the scripts in the `examples/` directory:
 
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`
-- [`examples/
-- [`
-- [`
-- [`
+- [`async_pubsub.py`](examples/async_pubsub.py): A demonstration of the asynchronous wrapper for the publish/subscribe system.
+- [`blobs.py`](examples/blobs.py): Demonstrates how to store and retrieve binary data in the database.
+- [`cache.py`](examples/cache.py): A practical example of using a dictionary with TTL as a cache for API calls.
+- [`fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
+- [`fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
+- [`general_test.py`](examples/general_test.py): A general-purpose test that runs all operations randomly, which allows testing long-running processes and synchronicity issues.
+- [`graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
+- [`kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
+- [`list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
+- [`pqueue.py`](examples/pqueue.py): A practical example of using the persistent priority queue for task management.
+- [`producer_consumer.py`](examples/producer_consumer.py): A demonstration of the distributed task queue system in a multi-process environment.
+- [`publisher.py`](examples/publisher.py) and [`subscriber.py`](examples/subscriber.py): A pair of examples demonstrating inter-process message passing with the publish/subscribe system.
+- [`pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
+- [`rerank.py`](examples/rerank.py): Shows how to combine results from vector and text search for more refined results.
+- [`stress_vectors.py`](examples/stress_vectors.py): A stress test for the vector search functionality.
+- [`textual_chat.py`](examples/textual_chat.py): A chat application built with `textual` and `beaver` to illustrate the use of several primitives (lists, dicts, and channels) at the same time.
+- [`type_hints.py`](examples/type_hints.py): Shows how to use type hints with `beaver` to get better IDE support and type safety.
+- [`vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
 
 ## Roadmap
 
@@ -283,7 +311,6 @@ For more in-depth examples, check out the scripts in the `examples/` directory:
 These are some of the features and improvements planned for future releases:
 
 - **Async API**: Extend the async support with on-demand wrappers for all features besides channels.
-- **Type Hints**: Extend type hints for channels and documents.
 
 Check out the [roadmap](roadmap.md) for a detailed list of upcoming features and design ideas.
 
@@ -1,6 +1,6 @@
 [project]
 name = "beaver-db"
-version = "0.14.0"
+version = "0.16.0"
 description = "Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications."
 readme = "README.md"
 requires-python = ">=3.13"
@@ -11,3 +11,8 @@ dependencies = [
 
 [tool.hatch.build.targets.wheel]
 packages = ["beaver"]
+
+[dependency-groups]
+dev = [
+    "textual>=6.1.0",
+]
File without changes