beaver-db 0.11.0__tar.gz → 0.12.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
This version of beaver-db might be problematic.
- {beaver_db-0.11.0 → beaver_db-0.12.0}/PKG-INFO +35 -11
- {beaver_db-0.11.0 → beaver_db-0.12.0}/README.md +35 -11
- beaver_db-0.12.0/beaver/blobs.py +107 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver/core.py +35 -11
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver/dicts.py +18 -3
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver_db.egg-info/PKG-INFO +35 -11
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver_db.egg-info/SOURCES.txt +1 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/pyproject.toml +1 -1
- {beaver_db-0.11.0 → beaver_db-0.12.0}/LICENSE +0 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver/__init__.py +0 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver/channels.py +0 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver/collections.py +0 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver/lists.py +0 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver/queues.py +0 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver/vectors.py +0 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver_db.egg-info/dependency_links.txt +0 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver_db.egg-info/requires.txt +0 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/beaver_db.egg-info/top_level.txt +0 -0
- {beaver_db-0.11.0 → beaver_db-0.12.0}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: beaver-db
-Version: 0.
+Version: 0.12.0
 Summary: Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications.
 Requires-Python: >=3.13
 Description-Content-Type: text/markdown
@@ -34,6 +34,7 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
 - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
 - **Persistent Priority Queue**: A high-performance, persistent queue that always returns the item with the highest priority, perfect for task management.
+- **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
 - **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
 - **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
 - **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
@@ -199,30 +200,53 @@ with db.channel("system_events").subscribe() as listener:
 # >> Event received: {'event': 'user_login', 'user_id': 'alice'}
 ```
 
+### 7. Storing User-Uploaded Content
+
+Use the simple blob store to save files like user avatars, attachments, or generated reports directly in the database. This keeps all your data in one portable file.
+
+```python
+attachments = db.blobs("user_uploads")
+
+# Store a user's avatar
+with open("avatar.png", "rb") as f:
+    avatar_bytes = f.read()
+
+attachments.put(
+    key="user_123_avatar.png",
+    data=avatar_bytes,
+    metadata={"mimetype": "image/png"}
+)
+
+# Retrieve it later
+avatar = attachments.get("user_123_avatar.png")
+```
+
 ## More Examples
 
 For more in-depth examples, check out the scripts in the `examples/` directory:
 
-- [`examples/kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
-- [`examples/list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
-- [`examples/queue.py`](examples/queue.py): A practical example of using the persistent priority queue for task management.
-- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
-- [`examples/fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
-- [`examples/graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
-- [`examples/pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
 - [`examples/async_pubsub.py`](examples/async_pubsub.py): A demonstration of the asynchronous wrapper for the publish/subscribe system.
-- [`examples/
+- [`examples/blobs.py`](examples/blobs.py): Demonstrates how to store and retrieve binary data in the database.
 - [`examples/cache.py`](examples/cache.py): A practical example of using a dictionary with TTL as a cache for API calls.
-- [`examples/
+- [`examples/fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
 - [`examples/fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
-- [`examples/stress_vectors.py](examples/stress_vectors.py): A stress test for the vector search functionality.
 - [`examples/general_test.py`](examples/general_test.py): A general-purpose test to run all operations randomly which allows testing long-running processes and synchronicity issues.
+- [`examples/graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
+- [`examples/kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
+- [`examples/list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
+- [`examples/publisher.py`](examples/publisher.py) and [`examples/subscriber.py`](examples/subscriber.py): A pair of examples demonstrating inter-process message passing with the publish/subscribe system.
+- [`examples/pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
+- [`examples/queue.py`](examples/queue.py): A practical example of using the persistent priority queue for task management.
+- [`examples/rerank.py`](examples/rerank.py): Shows how to combine results from vector and text search for more refined results.
+- [`examples/stress_vectors.py`](examples/stress_vectors.py): A stress test for the vector search functionality.
+- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
 
 ## Roadmap
 
 These are some of the features and improvements planned for future releases:
 
 - **Full Async API**: Comprehensive async support with on-demand wrappers for all collections.
+- **Type Hints**: Pydantic-based type hints for lists, dicts, and queues.
 
 Check out the [roadmap](roadmap.md) for a detailed list of upcoming features and design ideas.
 
@@ -23,6 +23,7 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
 - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
 - **Persistent Priority Queue**: A high-performance, persistent queue that always returns the item with the highest priority, perfect for task management.
+- **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
 - **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
 - **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
 - **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
@@ -188,33 +189,56 @@ with db.channel("system_events").subscribe() as listener:
 # >> Event received: {'event': 'user_login', 'user_id': 'alice'}
 ```
 
+### 7. Storing User-Uploaded Content
+
+Use the simple blob store to save files like user avatars, attachments, or generated reports directly in the database. This keeps all your data in one portable file.
+
+```python
+attachments = db.blobs("user_uploads")
+
+# Store a user's avatar
+with open("avatar.png", "rb") as f:
+    avatar_bytes = f.read()
+
+attachments.put(
+    key="user_123_avatar.png",
+    data=avatar_bytes,
+    metadata={"mimetype": "image/png"}
+)
+
+# Retrieve it later
+avatar = attachments.get("user_123_avatar.png")
+```
+
 ## More Examples
 
 For more in-depth examples, check out the scripts in the `examples/` directory:
 
-- [`examples/kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
-- [`examples/list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
-- [`examples/queue.py`](examples/queue.py): A practical example of using the persistent priority queue for task management.
-- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
-- [`examples/fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
-- [`examples/graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
-- [`examples/pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
 - [`examples/async_pubsub.py`](examples/async_pubsub.py): A demonstration of the asynchronous wrapper for the publish/subscribe system.
-- [`examples/
+- [`examples/blobs.py`](examples/blobs.py): Demonstrates how to store and retrieve binary data in the database.
 - [`examples/cache.py`](examples/cache.py): A practical example of using a dictionary with TTL as a cache for API calls.
-- [`examples/
+- [`examples/fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
 - [`examples/fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
-- [`examples/stress_vectors.py](examples/stress_vectors.py): A stress test for the vector search functionality.
 - [`examples/general_test.py`](examples/general_test.py): A general-purpose test to run all operations randomly which allows testing long-running processes and synchronicity issues.
+- [`examples/graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
+- [`examples/kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
+- [`examples/list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
+- [`examples/publisher.py`](examples/publisher.py) and [`examples/subscriber.py`](examples/subscriber.py): A pair of examples demonstrating inter-process message passing with the publish/subscribe system.
+- [`examples/pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
+- [`examples/queue.py`](examples/queue.py): A practical example of using the persistent priority queue for task management.
+- [`examples/rerank.py`](examples/rerank.py): Shows how to combine results from vector and text search for more refined results.
+- [`examples/stress_vectors.py`](examples/stress_vectors.py): A stress test for the vector search functionality.
+- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
 
 ## Roadmap
 
 These are some of the features and improvements planned for future releases:
 
 - **Full Async API**: Comprehensive async support with on-demand wrappers for all collections.
+- **Type Hints**: Pydantic-based type hints for lists, dicts, and queues.
 
 Check out the [roadmap](roadmap.md) for a detailed list of upcoming features and design ideas.
 
 ## License
 
-This project is licensed under the MIT License.
+This project is licensed under the MIT License.
@@ -0,0 +1,107 @@
+import json
+import sqlite3
+from typing import Any, Dict, Iterator, NamedTuple, Optional
+
+
+class Blob(NamedTuple):
+    """A data class representing a single blob retrieved from the store."""
+
+    key: str
+    data: bytes
+    metadata: Dict[str, Any]
+
+
+class BlobManager:
+    """A wrapper providing a Pythonic interface to a blob store in the database."""
+
+    def __init__(self, name: str, conn: sqlite3.Connection):
+        self._name = name
+        self._conn = conn
+
+    def put(self, key: str, data: bytes, metadata: Optional[Dict[str, Any]] = None):
+        """
+        Stores or replaces a blob in the store.
+
+        Args:
+            key: The unique string identifier for the blob.
+            data: The binary data to store.
+            metadata: Optional JSON-serializable dictionary for metadata.
+        """
+        if not isinstance(data, bytes):
+            raise TypeError("Blob data must be of type bytes.")
+
+        metadata_json = json.dumps(metadata) if metadata else None
+
+        with self._conn:
+            self._conn.execute(
+                "INSERT OR REPLACE INTO beaver_blobs (store_name, key, data, metadata) VALUES (?, ?, ?, ?)",
+                (self._name, key, data, metadata_json),
+            )
+
+    def get(self, key: str) -> Optional[Blob]:
+        """
+        Retrieves a blob from the store.
+
+        Args:
+            key: The unique string identifier for the blob.
+
+        Returns:
+            A Blob object containing the data and metadata, or None if the key is not found.
+        """
+        cursor = self._conn.cursor()
+        cursor.execute(
+            "SELECT data, metadata FROM beaver_blobs WHERE store_name = ? AND key = ?",
+            (self._name, key),
+        )
+        result = cursor.fetchone()
+        cursor.close()
+
+        if result is None:
+            return None
+
+        data, metadata_json = result
+        metadata = json.loads(metadata_json) if metadata_json else {}
+
+        return Blob(key=key, data=data, metadata=metadata)
+
+    def delete(self, key: str):
+        """
+        Deletes a blob from the store.
+
+        Raises:
+            KeyError: If the key does not exist in the store.
+        """
+        with self._conn:
+            cursor = self._conn.cursor()
+            cursor.execute(
+                "DELETE FROM beaver_blobs WHERE store_name = ? AND key = ?",
+                (self._name, key),
+            )
+            if cursor.rowcount == 0:
+                raise KeyError(f"Key '{key}' not found in blob store '{self._name}'")
+
+    def __contains__(self, key: str) -> bool:
+        """
+        Checks if a key exists in the blob store (e.g., `key in blobs`).
+        """
+        cursor = self._conn.cursor()
+        cursor.execute(
+            "SELECT 1 FROM beaver_blobs WHERE store_name = ? AND key = ? LIMIT 1",
+            (self._name, key),
+        )
+        result = cursor.fetchone()
+        cursor.close()
+        return result is not None
+
+    def __iter__(self) -> Iterator[str]:
+        """Returns an iterator over the keys in the blob store."""
+        cursor = self._conn.cursor()
+        cursor.execute(
+            "SELECT key FROM beaver_blobs WHERE store_name = ?", (self._name,)
+        )
+        for row in cursor:
+            yield row["key"]
+        cursor.close()
+
+    def __repr__(self) -> str:
+        return f"BlobManager(name='{self._name}')"
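A note on the iterator in the file above: `__iter__` reads `row["key"]`, which works only when the connection returns rows that support lookup by column name, for example with `row_factory = sqlite3.Row`. Whether `BeaverDB` configures this is not visible in this diff, so treat it as an assumption. The sketch below uses only the standard library and a throwaway in-memory table with the same schema; `make_conn` and `list_keys` are hypothetical helpers, not part of beaver-db.

```python
import sqlite3

# Same schema as the beaver_blobs table created in core.py.
SCHEMA = """
CREATE TABLE IF NOT EXISTS beaver_blobs (
    store_name TEXT NOT NULL,
    key TEXT NOT NULL,
    data BLOB NOT NULL,
    metadata TEXT,
    PRIMARY KEY (store_name, key)
)
"""


def make_conn(use_row_factory: bool) -> sqlite3.Connection:
    """Hypothetical helper: in-memory database holding one sample blob."""
    conn = sqlite3.connect(":memory:")
    if use_row_factory:
        # Rows become sqlite3.Row objects, which allow row["column"] access.
        conn.row_factory = sqlite3.Row
    conn.execute(SCHEMA)
    with conn:
        conn.execute(
            "INSERT INTO beaver_blobs (store_name, key, data) VALUES (?, ?, ?)",
            ("user_uploads", "a.png", b"\x89PNG"),
        )
    return conn


def list_keys(conn: sqlite3.Connection, store_name: str) -> list:
    """Hypothetical helper using the same query and row access as __iter__."""
    cursor = conn.execute(
        "SELECT key FROM beaver_blobs WHERE store_name = ?", (store_name,)
    )
    try:
        return [row["key"] for row in cursor]
    finally:
        cursor.close()


# With sqlite3.Row, name-based access works.
keys_with_row = list_keys(make_conn(True), "user_uploads")

# With the default tuple rows, the same access raises TypeError.
try:
    list_keys(make_conn(False), "user_uploads")
    plain_tuple_error = None
except TypeError as exc:
    plain_tuple_error = exc
```

The `try`/`finally` in `list_keys` is a choice made for the sketch: it closes the cursor even if iteration stops early, which a trailing `cursor.close()` after a generator's loop does not guarantee.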
@@ -1,10 +1,12 @@
 import sqlite3
 import threading
 
-
-from .
+
+from .blobs import BlobManager
 from .channels import ChannelManager
 from .collections import CollectionManager
+from .dicts import DictManager
+from .lists import ListManager
 from .queues import QueueManager
 
 
@@ -38,19 +40,34 @@ class BeaverDB:
     def _create_all_tables(self):
         """Initializes all required tables in the database file."""
         with self._conn:
-            self.
-            self.
+            self._create_ann_deletions_log_table()
+            self._create_ann_id_mapping_table()
+            self._create_ann_indexes_table()
+            self._create_ann_pending_log_table()
+            self._create_blobs_table()
             self._create_collections_table()
+            self._create_dict_table()
+            self._create_edges_table()
             self._create_fts_table()
+            self._create_list_table()
+            self._create_priority_queue_table()
+            self._create_pubsub_table()
             self._create_trigrams_table()
-            self._create_edges_table()
             self._create_versions_table()
-
-
-
-
-
-
+
+    def _create_blobs_table(self):
+        """Creates the table for storing named blobs."""
+        self._conn.execute(
+            """
+            CREATE TABLE IF NOT EXISTS beaver_blobs (
+                store_name TEXT NOT NULL,
+                key TEXT NOT NULL,
+                data BLOB NOT NULL,
+                metadata TEXT,
+                PRIMARY KEY (store_name, key)
+            )
+            """
+        )
 
     def _create_ann_indexes_table(self):
         """Creates the table to store the serialized base ANN index."""
@@ -299,3 +316,10 @@ class BeaverDB:
         if name not in self._channels:
             self._channels[name] = ChannelManager(name, self._conn, self._db_path)
         return self._channels[name]
+
+    def blobs(self, name: str) -> BlobManager:
+        """Returns a wrapper object for interacting with a named blob store."""
+        if not isinstance(name, str) or not name:
+            raise TypeError("Blob store name must be a non-empty string.")
+
+        return BlobManager(name, self._conn)
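The `beaver_blobs` table uses a composite primary key `(store_name, key)`, and `put` writes with `INSERT OR REPLACE`. A minimal sketch of what that combination implies, using only the standard library: the same key can exist in two stores, and a second `put` to the same store replaces the whole row, including its metadata. `put_blob` and `get_blob` are hypothetical stand-ins that reuse the SQL from this diff; they are not the package's API.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS beaver_blobs (
        store_name TEXT NOT NULL,
        key TEXT NOT NULL,
        data BLOB NOT NULL,
        metadata TEXT,
        PRIMARY KEY (store_name, key)
    )
    """
)


def put_blob(store, key, data, metadata=None):
    """Hypothetical stand-in for BlobManager.put, same SQL as the diff."""
    # As in the diff, an empty or missing metadata dict is stored as NULL.
    metadata_json = json.dumps(metadata) if metadata else None
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO beaver_blobs (store_name, key, data, metadata) "
            "VALUES (?, ?, ?, ?)",
            (store, key, data, metadata_json),
        )


def get_blob(store, key):
    """Hypothetical stand-in for BlobManager.get; returns (data, metadata) or None."""
    row = conn.execute(
        "SELECT data, metadata FROM beaver_blobs WHERE store_name = ? AND key = ?",
        (store, key),
    ).fetchone()
    if row is None:
        return None
    data, metadata_json = row
    return data, (json.loads(metadata_json) if metadata_json else {})


put_blob("user_uploads", "avatar.png", b"v1", {"mimetype": "image/png"})
put_blob("reports", "avatar.png", b"other")    # same key, different store
put_blob("user_uploads", "avatar.png", b"v2")  # replaces the first row entirely

row_count = conn.execute("SELECT COUNT(*) FROM beaver_blobs").fetchone()[0]
uploads_entry = get_blob("user_uploads", "avatar.png")
reports_entry = get_blob("reports", "avatar.png")
missing_entry = get_blob("user_uploads", "nope.png")
```

One consequence worth keeping in mind: because the replacement is row-level, callers who want to keep existing metadata when updating the bytes would need to pass it again on each `put`.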
@@ -1,8 +1,9 @@
 import json
 import sqlite3
-import time
+import time # Add this import
 from typing import Any, Iterator, Tuple
 
+
 class DictManager:
     """A wrapper providing a Pythonic interface to a dictionary in the database."""
 
@@ -52,14 +53,28 @@ class DictManager:
 
         if expires_at is not None and time.time() > expires_at:
             # Expired: delete the key and raise KeyError
-            cursor.execute(
+            cursor.execute(
+                "DELETE FROM beaver_dicts WHERE dict_name = ? AND key = ?",
+                (self._name, key),
+            )
             self._conn.commit()
             cursor.close()
-            raise KeyError(
+            raise KeyError(
+                f"Key '{key}' not found in dictionary '{self._name}' (expired)"
+            )
 
         cursor.close()
         return json.loads(value)
 
+    def pop(self, key: str, default: Any = None):
+        """Deletes an item if it exists and returns its value."""
+        try:
+            value = self[key]
+            del self[key]
+            return value
+        except KeyError:
+            return default
+
     def __delitem__(self, key: str):
         """Deletes a key-value pair (e.g., `del my_dict[key]`)."""
         with self._conn:
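The new `DictManager.pop` differs from the built-in `dict.pop` in one visible way: with no explicit default it returns `None` for a missing (or expired) key instead of raising `KeyError`. The sketch below illustrates this with `TinyDict`, a hypothetical in-memory stand-in whose `pop` body is copied from the diff; everything else about the class is an assumption made so the example runs without a database.

```python
from typing import Any


class TinyDict:
    """Hypothetical in-memory stand-in; only pop() mirrors the diff."""

    def __init__(self):
        self._items = {}

    def __getitem__(self, key: str):
        return self._items[key]  # raises KeyError when missing

    def __setitem__(self, key: str, value: Any):
        self._items[key] = value

    def __delitem__(self, key: str):
        del self._items[key]

    def pop(self, key: str, default: Any = None):
        """Deletes an item if it exists and returns its value."""
        try:
            value = self[key]
            del self[key]
            return value
        except KeyError:
            return default


cache = TinyDict()
cache["token"] = "abc"

first = cache.pop("token")           # key exists: value returned, key removed
second = cache.pop("token")          # key gone: default (None) returned
fallback = cache.pop("token", "n/a") # explicit default returned

# The built-in dict raises when no default is given.
try:
    {}.pop("token")
    builtin_raises = False
except KeyError:
    builtin_raises = True
```

Note that `pop` as written in the diff reads and then deletes in two separate steps, so callers sharing a database across threads or processes may want to treat it as non-atomic.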
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: beaver-db
-Version: 0.
+Version: 0.12.0
 Summary: Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications.
 Requires-Python: >=3.13
 Description-Content-Type: text/markdown
@@ -34,6 +34,7 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
 - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
 - **Persistent Priority Queue**: A high-performance, persistent queue that always returns the item with the highest priority, perfect for task management.
+- **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
 - **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
 - **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
 - **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
@@ -199,30 +200,53 @@ with db.channel("system_events").subscribe() as listener:
 # >> Event received: {'event': 'user_login', 'user_id': 'alice'}
 ```
 
+### 7. Storing User-Uploaded Content
+
+Use the simple blob store to save files like user avatars, attachments, or generated reports directly in the database. This keeps all your data in one portable file.
+
+```python
+attachments = db.blobs("user_uploads")
+
+# Store a user's avatar
+with open("avatar.png", "rb") as f:
+    avatar_bytes = f.read()
+
+attachments.put(
+    key="user_123_avatar.png",
+    data=avatar_bytes,
+    metadata={"mimetype": "image/png"}
+)
+
+# Retrieve it later
+avatar = attachments.get("user_123_avatar.png")
+```
+
 ## More Examples
 
 For more in-depth examples, check out the scripts in the `examples/` directory:
 
-- [`examples/kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
-- [`examples/list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
-- [`examples/queue.py`](examples/queue.py): A practical example of using the persistent priority queue for task management.
-- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
-- [`examples/fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
-- [`examples/graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
-- [`examples/pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
 - [`examples/async_pubsub.py`](examples/async_pubsub.py): A demonstration of the asynchronous wrapper for the publish/subscribe system.
-- [`examples/
+- [`examples/blobs.py`](examples/blobs.py): Demonstrates how to store and retrieve binary data in the database.
 - [`examples/cache.py`](examples/cache.py): A practical example of using a dictionary with TTL as a cache for API calls.
-- [`examples/
+- [`examples/fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
 - [`examples/fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
-- [`examples/stress_vectors.py](examples/stress_vectors.py): A stress test for the vector search functionality.
 - [`examples/general_test.py`](examples/general_test.py): A general-purpose test to run all operations randomly which allows testing long-running processes and synchronicity issues.
+- [`examples/graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
+- [`examples/kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
+- [`examples/list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
+- [`examples/publisher.py`](examples/publisher.py) and [`examples/subscriber.py`](examples/subscriber.py): A pair of examples demonstrating inter-process message passing with the publish/subscribe system.
+- [`examples/pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
+- [`examples/queue.py`](examples/queue.py): A practical example of using the persistent priority queue for task management.
+- [`examples/rerank.py`](examples/rerank.py): Shows how to combine results from vector and text search for more refined results.
+- [`examples/stress_vectors.py`](examples/stress_vectors.py): A stress test for the vector search functionality.
+- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
 
 ## Roadmap
 
 These are some of the features and improvements planned for future releases:
 
 - **Full Async API**: Comprehensive async support with on-demand wrappers for all collections.
+- **Type Hints**: Pydantic-based type hints for lists, dicts, and queues.
 
 Check out the [roadmap](roadmap.md) for a detailed list of upcoming features and design ideas.
 
File without changes (the 11 files listed above with +0 -0)