beaver-db 0.11.1__tar.gz → 0.12.2__tar.gz
- {beaver_db-0.11.1 → beaver_db-0.12.2}/PKG-INFO +34 -12
- {beaver_db-0.11.1 → beaver_db-0.12.2}/README.md +33 -11
- beaver_db-0.12.2/beaver/blobs.py +107 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver/collections.py +11 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver/core.py +35 -11
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver_db.egg-info/PKG-INFO +34 -12
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver_db.egg-info/SOURCES.txt +1 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/pyproject.toml +1 -1
- {beaver_db-0.11.1 → beaver_db-0.12.2}/LICENSE +0 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver/__init__.py +0 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver/channels.py +0 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver/dicts.py +0 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver/lists.py +0 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver/queues.py +0 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver/vectors.py +0 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver_db.egg-info/dependency_links.txt +0 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver_db.egg-info/requires.txt +0 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/beaver_db.egg-info/top_level.txt +0 -0
- {beaver_db-0.11.1 → beaver_db-0.12.2}/setup.cfg +0 -0
|
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: beaver-db
-Version: 0.
+Version: 0.12.2
 Summary: Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications.
 Requires-Python: >=3.13
 Description-Content-Type: text/markdown
@@ -9,8 +9,6 @@ Requires-Dist: faiss-cpu>=1.12.0
 Requires-Dist: numpy>=2.3.3
 Dynamic: license-file

-Of course, here is a rewritten README to explain the vector store uses a high performance FAISS-based implementation with in-memory and persistent indices, with an added small section on how is this implemented to explain the basic ideas behind the implementation of beaver.
-
 # beaver 🦫

 A fast, single-file, multi-modal database for Python, built with the standard `sqlite3` library.
@@ -34,6 +32,7 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
 - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
 - **Persistent Priority Queue**: A high-performance, persistent queue that always returns the item with the highest priority, perfect for task management.
+- **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
 - **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
 - **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
 - **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
@@ -199,30 +198,53 @@ with db.channel("system_events").subscribe() as listener:
 # >> Event received: {'event': 'user_login', 'user_id': 'alice'}
 ```

+### 7. Storing User-Uploaded Content
+
+Use the simple blob store to save files like user avatars, attachments, or generated reports directly in the database. This keeps all your data in one portable file.
+
+```python
+attachments = db.blobs("user_uploads")
+
+# Store a user's avatar
+with open("avatar.png", "rb") as f:
+    avatar_bytes = f.read()
+
+attachments.put(
+    key="user_123_avatar.png",
+    data=avatar_bytes,
+    metadata={"mimetype": "image/png"}
+)
+
+# Retrieve it later
+avatar = attachments.get("user_123_avatar.png")
+```
+
 ## More Examples

 For more in-depth examples, check out the scripts in the `examples/` directory:

-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
+- [`examples/async_pubsub.py`](examples/async_pubsub.py): A demonstration of the asynchronous wrapper for the publish/subscribe system.
+- [`examples/blobs.py`](examples/blobs.py): Demonstrates how to store and retrieve binary data in the database.
+- [`examples/cache.py`](examples/cache.py): A practical example of using a dictionary with TTL as a cache for API calls.
 - [`examples/fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
+- [`examples/fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
+- [`examples/general_test.py`](examples/general_test.py): A general-purpose test to run all operations randomly which allows testing long-running processes and synchronicity issues.
 - [`examples/graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
-- [`examples/
-- [`examples/
+- [`examples/kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
+- [`examples/list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
 - [`examples/publisher.py`](examples/publisher.py) and [`examples/subscriber.py`](examples/subscriber.py): A pair of examples demonstrating inter-process message passing with the publish/subscribe system.
-- [`examples/
+- [`examples/pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
+- [`examples/queue.py`](examples/queue.py): A practical example of using the persistent priority queue for task management.
 - [`examples/rerank.py`](examples/rerank.py): Shows how to combine results from vector and text search for more refined results.
-- [`examples/fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
 - [`examples/stress_vectors.py`](examples/stress_vectors.py): A stress test for the vector search functionality.
-- [`examples/
+- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.

 ## Roadmap

 These are some of the features and improvements planned for future releases:

 - **Full Async API**: Comprehensive async support with on-demand wrappers for all collections.
+- **Type Hints**: Pydantic-based type hints for lists, dicts, and queues.

 Check out the [roadmap](roadmap.md) for a detailed list of upcoming features and design ideas.

@@ -1,5 +1,3 @@
-Of course, here is a rewritten README to explain the vector store uses a high performance FAISS-based implementation with in-memory and persistent indices, with an added small section on how is this implemented to explain the basic ideas behind the implementation of beaver.
-
 # beaver 🦫

 A fast, single-file, multi-modal database for Python, built with the standard `sqlite3` library.
@@ -23,6 +21,7 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
 - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
 - **Persistent Priority Queue**: A high-performance, persistent queue that always returns the item with the highest priority, perfect for task management.
+- **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
 - **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
 - **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
 - **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
@@ -188,30 +187,53 @@ with db.channel("system_events").subscribe() as listener:
 # >> Event received: {'event': 'user_login', 'user_id': 'alice'}
 ```

+### 7. Storing User-Uploaded Content
+
+Use the simple blob store to save files like user avatars, attachments, or generated reports directly in the database. This keeps all your data in one portable file.
+
+```python
+attachments = db.blobs("user_uploads")
+
+# Store a user's avatar
+with open("avatar.png", "rb") as f:
+    avatar_bytes = f.read()
+
+attachments.put(
+    key="user_123_avatar.png",
+    data=avatar_bytes,
+    metadata={"mimetype": "image/png"}
+)
+
+# Retrieve it later
+avatar = attachments.get("user_123_avatar.png")
+```
+
 ## More Examples

 For more in-depth examples, check out the scripts in the `examples/` directory:

-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
+- [`examples/async_pubsub.py`](examples/async_pubsub.py): A demonstration of the asynchronous wrapper for the publish/subscribe system.
+- [`examples/blobs.py`](examples/blobs.py): Demonstrates how to store and retrieve binary data in the database.
+- [`examples/cache.py`](examples/cache.py): A practical example of using a dictionary with TTL as a cache for API calls.
 - [`examples/fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
+- [`examples/fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
+- [`examples/general_test.py`](examples/general_test.py): A general-purpose test to run all operations randomly which allows testing long-running processes and synchronicity issues.
 - [`examples/graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
-- [`examples/
-- [`examples/
+- [`examples/kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
+- [`examples/list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
 - [`examples/publisher.py`](examples/publisher.py) and [`examples/subscriber.py`](examples/subscriber.py): A pair of examples demonstrating inter-process message passing with the publish/subscribe system.
-- [`examples/
+- [`examples/pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
+- [`examples/queue.py`](examples/queue.py): A practical example of using the persistent priority queue for task management.
 - [`examples/rerank.py`](examples/rerank.py): Shows how to combine results from vector and text search for more refined results.
-- [`examples/fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
 - [`examples/stress_vectors.py`](examples/stress_vectors.py): A stress test for the vector search functionality.
-- [`examples/
+- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.

 ## Roadmap

 These are some of the features and improvements planned for future releases:

 - **Full Async API**: Comprehensive async support with on-demand wrappers for all collections.
+- **Type Hints**: Pydantic-based type hints for lists, dicts, and queues.

 Check out the [roadmap](roadmap.md) for a detailed list of upcoming features and design ideas.

@@ -0,0 +1,107 @@
+import json
+import sqlite3
+from typing import Any, Dict, Iterator, NamedTuple, Optional
+
+
+class Blob(NamedTuple):
+    """A data class representing a single blob retrieved from the store."""
+
+    key: str
+    data: bytes
+    metadata: Dict[str, Any]
+
+
+class BlobManager:
+    """A wrapper providing a Pythonic interface to a blob store in the database."""
+
+    def __init__(self, name: str, conn: sqlite3.Connection):
+        self._name = name
+        self._conn = conn
+
+    def put(self, key: str, data: bytes, metadata: Optional[Dict[str, Any]] = None):
+        """
+        Stores or replaces a blob in the store.
+
+        Args:
+            key: The unique string identifier for the blob.
+            data: The binary data to store.
+            metadata: Optional JSON-serializable dictionary for metadata.
+        """
+        if not isinstance(data, bytes):
+            raise TypeError("Blob data must be of type bytes.")
+
+        metadata_json = json.dumps(metadata) if metadata else None
+
+        with self._conn:
+            self._conn.execute(
+                "INSERT OR REPLACE INTO beaver_blobs (store_name, key, data, metadata) VALUES (?, ?, ?, ?)",
+                (self._name, key, data, metadata_json),
+            )
+
+    def get(self, key: str) -> Optional[Blob]:
+        """
+        Retrieves a blob from the store.
+
+        Args:
+            key: The unique string identifier for the blob.
+
+        Returns:
+            A Blob object containing the data and metadata, or None if the key is not found.
+        """
+        cursor = self._conn.cursor()
+        cursor.execute(
+            "SELECT data, metadata FROM beaver_blobs WHERE store_name = ? AND key = ?",
+            (self._name, key),
+        )
+        result = cursor.fetchone()
+        cursor.close()
+
+        if result is None:
+            return None
+
+        data, metadata_json = result
+        metadata = json.loads(metadata_json) if metadata_json else {}
+
+        return Blob(key=key, data=data, metadata=metadata)
+
+    def delete(self, key: str):
+        """
+        Deletes a blob from the store.
+
+        Raises:
+            KeyError: If the key does not exist in the store.
+        """
+        with self._conn:
+            cursor = self._conn.cursor()
+            cursor.execute(
+                "DELETE FROM beaver_blobs WHERE store_name = ? AND key = ?",
+                (self._name, key),
+            )
+            if cursor.rowcount == 0:
+                raise KeyError(f"Key '{key}' not found in blob store '{self._name}'")
+
+    def __contains__(self, key: str) -> bool:
+        """
+        Checks if a key exists in the blob store (e.g., `key in blobs`).
+        """
+        cursor = self._conn.cursor()
+        cursor.execute(
+            "SELECT 1 FROM beaver_blobs WHERE store_name = ? AND key = ? LIMIT 1",
+            (self._name, key),
+        )
+        result = cursor.fetchone()
+        cursor.close()
+        return result is not None
+
+    def __iter__(self) -> Iterator[str]:
+        """Returns an iterator over the keys in the blob store."""
+        cursor = self._conn.cursor()
+        cursor.execute(
+            "SELECT key FROM beaver_blobs WHERE store_name = ?", (self._name,)
+        )
+        for row in cursor:
+            yield row["key"]
+        cursor.close()
+
+    def __repr__(self) -> str:
+        return f"BlobManager(name='{self._name}')"
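A note on `__iter__` in the new `blobs.py`: it reads `row["key"]`, which only works when the connection hands back rows that support access by column name. This diff does not show where the connection is configured, so the sketch below assumes (it is an assumption, not something confirmed here) that `sqlite3.Row` is set as the `row_factory`. The in-memory database and the sample row are illustrative only.

```python
import sqlite3

# Illustrative in-memory database using the beaver_blobs schema from this diff.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE beaver_blobs (
        store_name TEXT NOT NULL,
        key TEXT NOT NULL,
        data BLOB NOT NULL,
        metadata TEXT,
        PRIMARY KEY (store_name, key)
    )
    """
)
conn.execute(
    "INSERT INTO beaver_blobs (store_name, key, data, metadata) VALUES (?, ?, ?, ?)",
    ("user_uploads", "a.png", b"\x89PNG", None),
)

# With the default row factory, rows are plain tuples, so name-based
# indexing raises TypeError.
row = conn.execute("SELECT key FROM beaver_blobs").fetchone()
try:
    row["key"]
    name_access_on_tuple = True
except TypeError:
    name_access_on_tuple = False

# Assumption: the real connection sets sqlite3.Row, which allows row["key"].
conn.row_factory = sqlite3.Row
keys = [
    r["key"]
    for r in conn.execute(
        "SELECT key FROM beaver_blobs WHERE store_name = ?", ("user_uploads",)
    )
]
```

If the connection were left on the default row factory, iterating a blob store as written would fail on the first row, so this setting is worth checking in `core.py`.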
@@ -560,6 +560,17 @@ class CollectionManager:
             for row in rows
         ]

+    def __len__(self) -> int:
+        """Returns the number of documents in the collection."""
+        cursor = self._conn.cursor()
+        cursor.execute(
+            "SELECT COUNT(*) FROM beaver_collections WHERE collection = ?",
+            (self._name,),
+        )
+        count = cursor.fetchone()[0]
+        cursor.close()
+        return count
+

 def rerank(
     *results: list[Document],
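The `__len__` addition can be tried in isolation. The following is a minimal sketch, not the package's own class: `CountedCollection` is a hypothetical stand-in, and the `item_id` column is invented for illustration, since this diff only shows that `beaver_collections` has a `collection` column.

```python
import sqlite3


class CountedCollection:
    """Hypothetical stand-in mirroring the __len__ query added in this diff."""

    def __init__(self, name: str, conn: sqlite3.Connection):
        self._name = name
        self._conn = conn

    def __len__(self) -> int:
        cursor = self._conn.cursor()
        cursor.execute(
            "SELECT COUNT(*) FROM beaver_collections WHERE collection = ?",
            (self._name,),
        )
        count = cursor.fetchone()[0]
        cursor.close()
        return count


conn = sqlite3.connect(":memory:")
# Assumed minimal table; the real schema is not part of this diff.
conn.execute("CREATE TABLE beaver_collections (collection TEXT, item_id TEXT)")
conn.executemany(
    "INSERT INTO beaver_collections VALUES (?, ?)",
    [("articles", "a1"), ("articles", "a2"), ("notes", "n1")],
)

articles = CountedCollection("articles", conn)
missing = CountedCollection("missing", conn)
```

One side effect worth knowing: once a class defines `__len__` (and no `__bool__`), Python treats an instance with length zero as falsy, so a check like `if collection:` becomes `False` for an empty collection.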
@@ -1,10 +1,12 @@
 import sqlite3
 import threading

-
-from .
+
+from .blobs import BlobManager
 from .channels import ChannelManager
 from .collections import CollectionManager
+from .dicts import DictManager
+from .lists import ListManager
 from .queues import QueueManager


@@ -38,19 +40,34 @@ class BeaverDB:
     def _create_all_tables(self):
         """Initializes all required tables in the database file."""
         with self._conn:
-            self.
-            self.
+            self._create_ann_deletions_log_table()
+            self._create_ann_id_mapping_table()
+            self._create_ann_indexes_table()
+            self._create_ann_pending_log_table()
+            self._create_blobs_table()
             self._create_collections_table()
+            self._create_dict_table()
+            self._create_edges_table()
             self._create_fts_table()
+            self._create_list_table()
+            self._create_priority_queue_table()
+            self._create_pubsub_table()
             self._create_trigrams_table()
-            self._create_edges_table()
             self._create_versions_table()
-
-
-
-
-
-
+
+    def _create_blobs_table(self):
+        """Creates the table for storing named blobs."""
+        self._conn.execute(
+            """
+            CREATE TABLE IF NOT EXISTS beaver_blobs (
+                store_name TEXT NOT NULL,
+                key TEXT NOT NULL,
+                data BLOB NOT NULL,
+                metadata TEXT,
+                PRIMARY KEY (store_name, key)
+            )
+            """
+        )

     def _create_ann_indexes_table(self):
         """Creates the table to store the serialized base ANN index."""
@@ -299,3 +316,10 @@ class BeaverDB:
         if name not in self._channels:
             self._channels[name] = ChannelManager(name, self._conn, self._db_path)
         return self._channels[name]
+
+    def blobs(self, name: str) -> BlobManager:
+        """Returns a wrapper object for interacting with a named blob store."""
+        if not isinstance(name, str) or not name:
+            raise TypeError("Blob store name must be a non-empty string.")
+
+        return BlobManager(name, self._conn)
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: beaver-db
-Version: 0.
+Version: 0.12.2
 Summary: Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications.
 Requires-Python: >=3.13
 Description-Content-Type: text/markdown
@@ -9,8 +9,6 @@ Requires-Dist: faiss-cpu>=1.12.0
 Requires-Dist: numpy>=2.3.3
 Dynamic: license-file

-Of course, here is a rewritten README to explain the vector store uses a high performance FAISS-based implementation with in-memory and persistent indices, with an added small section on how is this implemented to explain the basic ideas behind the implementation of beaver.
-
 # beaver 🦫

 A fast, single-file, multi-modal database for Python, built with the standard `sqlite3` library.
@@ -34,6 +32,7 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
 - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
 - **Persistent Priority Queue**: A high-performance, persistent queue that always returns the item with the highest priority, perfect for task management.
+- **Simple Blob Storage**: A dictionary-like interface for storing medium-sized binary files (like PDFs or images) directly in the database, ensuring transactional integrity with your other data.
 - **High-Performance Vector Storage & Search**: Store vector embeddings and perform fast, crash-safe approximate nearest neighbor searches using a `faiss`-based hybrid index.
 - **Full-Text and Fuzzy Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine, enhanced with optional fuzzy search for typo-tolerant matching.
 - **Knowledge Graph**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
@@ -199,30 +198,53 @@ with db.channel("system_events").subscribe() as listener:
 # >> Event received: {'event': 'user_login', 'user_id': 'alice'}
 ```

+### 7. Storing User-Uploaded Content
+
+Use the simple blob store to save files like user avatars, attachments, or generated reports directly in the database. This keeps all your data in one portable file.
+
+```python
+attachments = db.blobs("user_uploads")
+
+# Store a user's avatar
+with open("avatar.png", "rb") as f:
+    avatar_bytes = f.read()
+
+attachments.put(
+    key="user_123_avatar.png",
+    data=avatar_bytes,
+    metadata={"mimetype": "image/png"}
+)
+
+# Retrieve it later
+avatar = attachments.get("user_123_avatar.png")
+```
+
 ## More Examples

 For more in-depth examples, check out the scripts in the `examples/` directory:

-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.
+- [`examples/async_pubsub.py`](examples/async_pubsub.py): A demonstration of the asynchronous wrapper for the publish/subscribe system.
+- [`examples/blobs.py`](examples/blobs.py): Demonstrates how to store and retrieve binary data in the database.
+- [`examples/cache.py`](examples/cache.py): A practical example of using a dictionary with TTL as a cache for API calls.
 - [`examples/fts.py`](examples/fts.py): A detailed look at full-text search, including targeted searches on specific metadata fields.
+- [`examples/fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
+- [`examples/general_test.py`](examples/general_test.py): A general-purpose test to run all operations randomly which allows testing long-running processes and synchronicity issues.
 - [`examples/graph.py`](examples/graph.py): Shows how to create relationships between documents and perform multi-hop graph traversals.
-- [`examples/
-- [`examples/
+- [`examples/kvstore.py`](examples/kvstore.py): A comprehensive demo of the namespaced dictionary feature.
+- [`examples/list.py`](examples/list.py): Shows the full capabilities of the persistent list, including slicing and in-place updates.
 - [`examples/publisher.py`](examples/publisher.py) and [`examples/subscriber.py`](examples/subscriber.py): A pair of examples demonstrating inter-process message passing with the publish/subscribe system.
-- [`examples/
+- [`examples/pubsub.py`](examples/pubsub.py): A demonstration of the synchronous, thread-safe publish/subscribe system in a single process.
+- [`examples/queue.py`](examples/queue.py): A practical example of using the persistent priority queue for task management.
 - [`examples/rerank.py`](examples/rerank.py): Shows how to combine results from vector and text search for more refined results.
-- [`examples/fuzzy.py`](examples/fuzzy.py): Demonstrates fuzzy search capabilities for text search.
 - [`examples/stress_vectors.py`](examples/stress_vectors.py): A stress test for the vector search functionality.
-- [`examples/
+- [`examples/vector.py`](examples/vector.py): Demonstrates how to index and search vector embeddings, including upserts.

 ## Roadmap

 These are some of the features and improvements planned for future releases:

 - **Full Async API**: Comprehensive async support with on-demand wrappers for all collections.
+- **Type Hints**: Pydantic-based type hints for lists, dicts, and queues.

 Check out the [roadmap](roadmap.md) for a detailed list of upcoming features and design ideas.

File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|