beaver-db 0.7.1__tar.gz → 0.8.0__tar.gz
Potentially problematic release.
- {beaver_db-0.7.1 → beaver_db-0.8.0}/PKG-INFO +32 -18
- {beaver_db-0.7.1 → beaver_db-0.8.0}/README.md +31 -17
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver/core.py +30 -0
- beaver_db-0.8.0/beaver/queues.py +87 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver_db.egg-info/PKG-INFO +32 -18
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver_db.egg-info/SOURCES.txt +1 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/pyproject.toml +1 -1
- {beaver_db-0.7.1 → beaver_db-0.8.0}/LICENSE +0 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver/__init__.py +0 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver/channels.py +0 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver/collections.py +0 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver/dicts.py +0 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver/lists.py +0 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver_db.egg-info/dependency_links.txt +0 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver_db.egg-info/requires.txt +0 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/beaver_db.egg-info/top_level.txt +0 -0
- {beaver_db-0.7.1 → beaver_db-0.8.0}/setup.cfg +0 -0
--- beaver_db-0.7.1/PKG-INFO
+++ beaver_db-0.8.0/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: beaver-db
-Version: 0.
+Version: 0.8.0
 Summary: Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications.
 Requires-Python: >=3.13
 Description-Content-Type: text/markdown
@@ -11,10 +11,6 @@ Dynamic: license-file
 
 # beaver 🦫
 
-
-
-
-
 A fast, single-file, multi-modal database for Python, built with the standard `sqlite3` library.
 
 `beaver` is the **B**ackend for **E**mbedded, **A**ll-in-one **V**ector, **E**ntity, and **R**elationship storage. It's a simple, local, and embedded database designed to manage complex, modern data types without requiring a database server, built on top of SQLite.
@@ -34,6 +30,7 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 - **Synchronous Pub/Sub**: A simple, thread-safe, Redis-like publish-subscribe system for real-time messaging.
 - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
 - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
+- **Persistent Priority Queue**: A high-performance, persistent queue that always returns the item with the highest priority, perfect for task management.
 - **Efficient Vector Storage & Search**: Store vector embeddings and perform fast approximate nearest neighbor searches using an in-memory k-d tree.
 - **Full-Text Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine.
 - **Graph Traversal**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
@@ -86,7 +83,24 @@ db.close()
 
 Here are a few ideas to inspire your next project, showcasing how to combine Beaver's features to build powerful local applications.
 
-### 1.
+### 1. AI Agent Task Management
+
+Use a **persistent priority queue** to manage tasks for an AI agent. This ensures the agent always works on the most important task first, even if the application restarts.
+
+```python
+tasks = db.queue("agent_tasks")
+
+# Tasks are added with a priority (lower is higher)
+tasks.put({"action": "summarize_news"}, priority=10)
+tasks.put({"action": "respond_to_user"}, priority=1)
+tasks.put({"action": "run_backup"}, priority=20)
+
+# The agent retrieves the highest-priority task
+next_task = tasks.get() # -> Returns the "respond_to_user" task
+print(f"Agent's next task: {next_task.data['action']}")
+```
+
+### 2. User Authentication and Profile Store
 
 Use a **namespaced dictionary** to create a simple and secure user store. The key can be the username, and the value can be a dictionary containing the hashed password and other profile information.
 
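The ordering promised by the README example above can be checked without installing the package. The following is a minimal sketch, not the library's implementation: it assumes the `beaver_priority_queues` schema and the `ORDER BY priority ASC, timestamp ASC` query that appear in this release's `queues.py` diff. The helper names `put` and `get`, and the explicit `timestamp` argument (the released code calls `time.time()`), are hypothetical and exist only so that ties are deterministic here.

```python
import json
import sqlite3

# In-memory stand-in for the table this release creates (schema copied from the diff).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE beaver_priority_queues (
        queue_name TEXT NOT NULL,
        priority REAL NOT NULL,
        timestamp REAL NOT NULL,
        data TEXT NOT NULL
    )
    """
)


def put(queue_name, data, priority, timestamp):
    """Hypothetical helper: store one JSON-serializable item."""
    with conn:
        conn.execute(
            "INSERT INTO beaver_priority_queues (queue_name, priority, timestamp, data) "
            "VALUES (?, ?, ?, ?)",
            (queue_name, priority, timestamp, json.dumps(data)),
        )


def get(queue_name):
    """Hypothetical helper: remove and return the item with the lowest priority number."""
    with conn:
        row = conn.execute(
            """
            SELECT rowid, data
            FROM beaver_priority_queues
            WHERE queue_name = ?
            ORDER BY priority ASC, timestamp ASC
            LIMIT 1
            """,
            (queue_name,),
        ).fetchone()
        if row is None:
            raise IndexError("Queue is empty")
        conn.execute("DELETE FROM beaver_priority_queues WHERE rowid = ?", (row[0],))
    return json.loads(row[1])


put("agent_tasks", {"action": "summarize_news"}, priority=10, timestamp=1.0)
put("agent_tasks", {"action": "respond_to_user"}, priority=1, timestamp=2.0)
put("agent_tasks", {"action": "run_backup"}, priority=20, timestamp=3.0)
put("agent_tasks", {"action": "respond_again"}, priority=1, timestamp=4.0)

# Lower numbers come out first; equal priorities come out oldest first.
order = [get("agent_tasks")["action"] for _ in range(4)]
print(order)
```

The two items with priority 1 come out before the others, oldest first, which is the "lower is higher" rule from the README comment combined with the timestamp tie-break.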
@@ -104,7 +118,7 @@ users["alice"] = {
 alice_profile = users.get("alice")
 ```
 
-###
+### 3. Chatbot Conversation History
 
 A **persistent list** is perfect for storing the history of a conversation. Each time the user or the bot sends a message, just `push` it to the list. This maintains a chronological record of the entire dialogue.
 
@@ -119,7 +133,7 @@ for message in chat_history:
     print(f"{message['role']}: {message['content']}")
 ```
 
-###
+### 4. Build a RAG (Retrieval-Augmented Generation) System
 
 Combine **vector search** and **full-text search** to build a powerful RAG pipeline for your local documents.
 
@@ -133,7 +147,7 @@ from beaver.collections import rerank
 best_context = rerank(vector_results, text_results, weights=[0.6, 0.4])
 ```
 
-###
+### 5. Caching for Expensive API Calls
 
 Leverage a **dictionary with a TTL (Time-To-Live)** to cache the results of slow network requests. This can dramatically speed up your application and reduce your reliance on external services.
 
@@ -153,14 +167,15 @@ if response is None:
 
 For more in-depth examples, check out the scripts in the `examples/` directory:
 
-- [`examples/kvstore.py`](https://www.google.com/search?q
-- [`examples/list.py`](https://www.google.com/search?q
-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/
+- [`examples/kvstore.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/kvstore.py%5D\(https://www.google.com/search%3Fq%3Dexamples/kvstore.py\)): A comprehensive demo of the namespaced dictionary feature.
+- [`examples/list.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/list.py%5D\(https://www.google.com/search%3Fq%3Dexamples/list.py\)): Shows the full capabilities of the persistent list, including slicing and in-place updates.
+- [`examples/queue.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/queue.py%5D\(https://www.google.com/search%3Fq%3Dexamples/queue.py\)): A practical example of using the persistent priority queue for task management.
+- [`examples/vector.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/vector.py%5D\(https://www.google.com/search%3Fq%3Dexamples/vector.py\)): Demonstrates how to index and search vector embeddings, including upserts.
+- [`examples/fts.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/fts.py%5D\(https://www.google.com/search%3Fq%3Dexamples/fts.py\)): A detailed look at full-text search, including targeted searches on specific metadata fields.
+- [`examples/graph.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/graph.py%5D\(https://www.google.com/search%3Fq%3Dexamples/graph.py\)): Shows how to create relationships between documents and perform multi-hop graph traversals.
+- [`examples/pubsub.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/pubsub.py%5D\(https://www.google.com/search%3Fq%3Dexamples/pubsub.py\)): A demonstration of the synchronous, thread-safe publish/subscribe system.
+- [`examples/cache.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/cache.py%5D\(https://www.google.com/search%3Fq%3Dexamples/cache.py\)): A practical example of using a dictionary with TTL as a cache for API calls.
+- [`examples/rerank.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/rerank.py%5D\(https://www.google.com/search%3Fq%3Dexamples/rerank.py\)): Shows how to combine results from vector and text search for more refined results.
 
 ## Roadmap
 
@@ -168,7 +183,6 @@ These are some of the features and improvements planned for future releases:
 
 - **Fuzzy search**: Implement fuzzy matching capabilities for text search.
 - **Faster ANN**: Explore integrating more advanced ANN libraries like `faiss` for improved vector search performance.
-- **Priority Queues**: Introduce a priority queue data structure for task management.
 - **Improved Pub/Sub**: Fan-out implementation with a more Pythonic API.
 - **Async API**: Comprehensive async support with on-demand wrappers for all collections.
 
--- beaver_db-0.7.1/README.md
+++ beaver_db-0.8.0/README.md
@@ -1,9 +1,5 @@
 # beaver 🦫
 
-
-
-
-
 A fast, single-file, multi-modal database for Python, built with the standard `sqlite3` library.
 
 `beaver` is the **B**ackend for **E**mbedded, **A**ll-in-one **V**ector, **E**ntity, and **R**elationship storage. It's a simple, local, and embedded database designed to manage complex, modern data types without requiring a database server, built on top of SQLite.
@@ -23,6 +19,7 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 - **Synchronous Pub/Sub**: A simple, thread-safe, Redis-like publish-subscribe system for real-time messaging.
 - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
 - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
+- **Persistent Priority Queue**: A high-performance, persistent queue that always returns the item with the highest priority, perfect for task management.
 - **Efficient Vector Storage & Search**: Store vector embeddings and perform fast approximate nearest neighbor searches using an in-memory k-d tree.
 - **Full-Text Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine.
 - **Graph Traversal**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
@@ -75,7 +72,24 @@ db.close()
 
 Here are a few ideas to inspire your next project, showcasing how to combine Beaver's features to build powerful local applications.
 
-### 1.
+### 1. AI Agent Task Management
+
+Use a **persistent priority queue** to manage tasks for an AI agent. This ensures the agent always works on the most important task first, even if the application restarts.
+
+```python
+tasks = db.queue("agent_tasks")
+
+# Tasks are added with a priority (lower is higher)
+tasks.put({"action": "summarize_news"}, priority=10)
+tasks.put({"action": "respond_to_user"}, priority=1)
+tasks.put({"action": "run_backup"}, priority=20)
+
+# The agent retrieves the highest-priority task
+next_task = tasks.get() # -> Returns the "respond_to_user" task
+print(f"Agent's next task: {next_task.data['action']}")
+```
+
+### 2. User Authentication and Profile Store
 
 Use a **namespaced dictionary** to create a simple and secure user store. The key can be the username, and the value can be a dictionary containing the hashed password and other profile information.
 
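A caller may want to test whether a queue holds anything before calling `get()`, which raises `IndexError` on an empty queue. The `queues.py` file in this release defines `__len__` and `__nonzero__`. In Python 3 the truthiness hook is `__bool__`, and when it is absent `bool()` falls back to `__len__`. The sketch below uses a hypothetical `FakeQueue` class (not part of beaver-db) to illustrate that fallback, which suggests `if tasks:` would still behave as intended even though `__nonzero__` itself is never consulted.

```python
class FakeQueue:
    """Hypothetical stand-in that mirrors only the dunder methods of interest."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __nonzero__(self):
        # Python 2 truthiness hook; Python 3 never calls it.
        raise AssertionError("__nonzero__ is not consulted on Python 3")


# bool() finds no __bool__, so it uses __len__ instead.
empty_is_truthy = bool(FakeQueue([]))
full_is_truthy = bool(FakeQueue(["task"]))
print(empty_is_truthy, full_is_truthy)
```

Renaming the method to `__bool__` would make the intent explicit, but because of the `__len__` fallback the observable behaviour should be the same.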
@@ -93,7 +107,7 @@ users["alice"] = {
 alice_profile = users.get("alice")
 ```
 
-###
+### 3. Chatbot Conversation History
 
 A **persistent list** is perfect for storing the history of a conversation. Each time the user or the bot sends a message, just `push` it to the list. This maintains a chronological record of the entire dialogue.
 
@@ -108,7 +122,7 @@ for message in chat_history:
     print(f"{message['role']}: {message['content']}")
 ```
 
-###
+### 4. Build a RAG (Retrieval-Augmented Generation) System
 
 Combine **vector search** and **full-text search** to build a powerful RAG pipeline for your local documents.
 
@@ -122,7 +136,7 @@ from beaver.collections import rerank
 best_context = rerank(vector_results, text_results, weights=[0.6, 0.4])
 ```
 
-###
+### 5. Caching for Expensive API Calls
 
 Leverage a **dictionary with a TTL (Time-To-Live)** to cache the results of slow network requests. This can dramatically speed up your application and reduce your reliance on external services.
 
@@ -142,14 +156,15 @@ if response is None:
 
 For more in-depth examples, check out the scripts in the `examples/` directory:
 
-- [`examples/kvstore.py`](https://www.google.com/search?q
-- [`examples/list.py`](https://www.google.com/search?q
-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/
+- [`examples/kvstore.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/kvstore.py%5D\(https://www.google.com/search%3Fq%3Dexamples/kvstore.py\)): A comprehensive demo of the namespaced dictionary feature.
+- [`examples/list.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/list.py%5D\(https://www.google.com/search%3Fq%3Dexamples/list.py\)): Shows the full capabilities of the persistent list, including slicing and in-place updates.
+- [`examples/queue.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/queue.py%5D\(https://www.google.com/search%3Fq%3Dexamples/queue.py\)): A practical example of using the persistent priority queue for task management.
+- [`examples/vector.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/vector.py%5D\(https://www.google.com/search%3Fq%3Dexamples/vector.py\)): Demonstrates how to index and search vector embeddings, including upserts.
+- [`examples/fts.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/fts.py%5D\(https://www.google.com/search%3Fq%3Dexamples/fts.py\)): A detailed look at full-text search, including targeted searches on specific metadata fields.
+- [`examples/graph.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/graph.py%5D\(https://www.google.com/search%3Fq%3Dexamples/graph.py\)): Shows how to create relationships between documents and perform multi-hop graph traversals.
+- [`examples/pubsub.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/pubsub.py%5D\(https://www.google.com/search%3Fq%3Dexamples/pubsub.py\)): A demonstration of the synchronous, thread-safe publish/subscribe system.
+- [`examples/cache.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/cache.py%5D\(https://www.google.com/search%3Fq%3Dexamples/cache.py\)): A practical example of using a dictionary with TTL as a cache for API calls.
+- [`examples/rerank.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/rerank.py%5D\(https://www.google.com/search%3Fq%3Dexamples/rerank.py\)): Shows how to combine results from vector and text search for more refined results.
 
 ## Roadmap
 
@@ -157,7 +172,6 @@ These are some of the features and improvements planned for future releases:
 
 - **Fuzzy search**: Implement fuzzy matching capabilities for text search.
 - **Faster ANN**: Explore integrating more advanced ANN libraries like `faiss` for improved vector search performance.
-- **Priority Queues**: Introduce a priority queue data structure for task management.
 - **Improved Pub/Sub**: Fan-out implementation with a more Pythonic API.
 - **Async API**: Comprehensive async support with on-demand wrappers for all collections.
 
--- beaver_db-0.7.1/beaver/core.py
+++ beaver_db-0.8.0/beaver/core.py
@@ -7,6 +7,7 @@ from .dicts import DictManager
 from .lists import ListManager
 from .channels import ChannelManager
 from .collections import CollectionManager
+from .queues import QueueManager
 
 
 class BeaverDB:
@@ -38,6 +39,27 @@ class BeaverDB:
         self._create_edges_table()
         self._create_versions_table()
         self._create_dict_table()
+        self._create_priority_queue_table()
+
+    def _create_priority_queue_table(self):
+        """Creates the priority queue table and its performance index."""
+        with self._conn:
+            self._conn.execute(
+                """
+                CREATE TABLE IF NOT EXISTS beaver_priority_queues (
+                    queue_name TEXT NOT NULL,
+                    priority REAL NOT NULL,
+                    timestamp REAL NOT NULL,
+                    data TEXT NOT NULL
+                )
+                """
+            )
+            self._conn.execute(
+                """
+                CREATE INDEX IF NOT EXISTS idx_priority_queue_order
+                ON beaver_priority_queues (queue_name, priority ASC, timestamp ASC)
+                """
+            )
 
     def _create_dict_table(self):
         """Creates the namespaced dictionary table."""
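The compound index on `(queue_name, priority, timestamp)` is described as the performance index for the queue. One way to check that claim, sketched here against an in-memory database rather than a real beaver file, is to ask SQLite for its query plan for the `SELECT` that `queues.py` issues. The schema, index name, and query are copied from this diff. The variable names are illustrative, and the exact wording of plan rows varies between SQLite versions, so the sketch only looks for substrings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS beaver_priority_queues (
        queue_name TEXT NOT NULL,
        priority REAL NOT NULL,
        timestamp REAL NOT NULL,
        data TEXT NOT NULL
    )
    """
)
conn.execute(
    """
    CREATE INDEX IF NOT EXISTS idx_priority_queue_order
    ON beaver_priority_queues (queue_name, priority ASC, timestamp ASC)
    """
)

plan = conn.execute(
    """
    EXPLAIN QUERY PLAN
    SELECT rowid, priority, timestamp, data
    FROM beaver_priority_queues
    WHERE queue_name = ?
    ORDER BY priority ASC, timestamp ASC
    LIMIT 1
    """,
    ("agent_tasks",),
).fetchall()

# The human-readable description is the last column of each plan row.
details = [row[-1] for row in plan]
uses_index = any("idx_priority_queue_order" in d for d in details)
needs_sort = any("TEMP B-TREE" in d for d in details)
print(details, uses_index, needs_sort)
```

If the index is picked up and no temporary B-tree is needed for the `ORDER BY`, fetching the head of a queue should not require scanning or sorting the whole table.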
@@ -164,8 +186,16 @@ class BeaverDB:
             raise TypeError("List name must be a non-empty string.")
         return ListManager(name, self._conn)
 
+    def queue(self, name: str) -> QueueManager:
+        """Returns a wrapper object for interacting with a persistent priority queue."""
+        if not isinstance(name, str) or not name:
+            raise TypeError("Queue name must be a non-empty string.")
+        return QueueManager(name, self._conn)
+
     def collection(self, name: str) -> CollectionManager:
         """Returns a wrapper for interacting with a document collection."""
+        if not isinstance(name, str) or not name:
+            raise TypeError("Collection name must be a non-empty string.")
         return CollectionManager(name, self._conn)
 
     def publish(self, channel_name: str, payload: Any):
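The guard added to `queue()` and `collection()` rejects both non-string and empty names with a `TypeError`. A small standalone sketch of the same check follows; the helpers `check_name` and `is_rejected` are hypothetical and do not exist in beaver-db.

```python
def check_name(name, kind="Queue"):
    """Hypothetical helper reproducing the condition used in the diff."""
    if not isinstance(name, str) or not name:
        raise TypeError(f"{kind} name must be a non-empty string.")
    return name


def is_rejected(name):
    """Return True when check_name refuses the given value."""
    try:
        check_name(name)
    except TypeError:
        return True
    return False


results = {repr(n): is_rejected(n) for n in ["agent_tasks", "", None, 42]}
print(results)
```

Note that an empty string is rejected with `TypeError` as well, so callers that only catch `ValueError` for bad names would not intercept it.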
--- /dev/null
+++ beaver_db-0.8.0/beaver/queues.py
@@ -0,0 +1,87 @@
+import json
+import sqlite3
+import time
+from typing import Any, NamedTuple
+
+
+class QueueItem(NamedTuple):
+    """A data class representing a single item retrieved from the queue."""
+
+    priority: float
+    timestamp: float
+    data: Any
+
+
+class QueueManager:
+    """A wrapper providing a Pythonic interface to a persistent priority queue."""
+
+    def __init__(self, name: str, conn: sqlite3.Connection):
+        self._name = name
+        self._conn = conn
+
+    def put(self, data: Any, priority: float):
+        """
+        Adds an item to the queue with a specific priority.
+
+        Args:
+            data: The JSON-serializable data to store.
+            priority: The priority of the item (lower numbers are higher priority).
+        """
+        with self._conn:
+            self._conn.execute(
+                "INSERT INTO beaver_priority_queues (queue_name, priority, timestamp, data) VALUES (?, ?, ?, ?)",
+                (self._name, priority, time.time(), json.dumps(data)),
+            )
+
+    def get(self) -> QueueItem:
+        """
+        Atomically retrieves and removes the highest-priority item from the queue.
+
+        Returns:
+            A QueueItem containing the data and its metadata.
+
+        Raises IndexError if queue is empty.
+        """
+        with self._conn:
+            cursor = self._conn.cursor()
+            # The compound index on (queue_name, priority, timestamp) makes this query efficient.
+            cursor.execute(
+                """
+                SELECT rowid, priority, timestamp, data
+                FROM beaver_priority_queues
+                WHERE queue_name = ?
+                ORDER BY priority ASC, timestamp ASC
+                LIMIT 1
+                """,
+                (self._name,),
+            )
+            result = cursor.fetchone()
+
+            if result is None:
+                raise IndexError("Queue is empty")
+
+            rowid, priority, timestamp, data = result
+            # Delete the retrieved item to ensure it's processed only once.
+            cursor.execute("DELETE FROM beaver_priority_queues WHERE rowid = ?", (rowid,))
+
+            return QueueItem(
+                priority=priority, timestamp=timestamp, data=json.loads(data)
+            )
+
+    def __len__(self) -> int:
+        """Returns the current number of items in the queue."""
+        cursor = self._conn.cursor()
+        cursor.execute(
+            "SELECT COUNT(*) FROM beaver_priority_queues WHERE queue_name = ?",
+            (self._name,),
+        )
+        count = cursor.fetchone()[0]
+        cursor.close()
+        return count
+
+    def __nonzero__(self) -> bool:
+        """Returns True if the queue is not empty."""
+        return len(self) > 0
+
+    def __repr__(self) -> str:
+        return f"QueueManager(name='{self._name}')"
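The `get()` docstring describes the retrieve-and-remove step as atomic. With Python's `sqlite3` default (legacy) transaction handling, a transaction is opened implicitly before `INSERT`, `UPDATE`, `DELETE` and `REPLACE` statements but not before `SELECT`. The `SELECT` in the diff above may therefore run outside the transaction that covers the `DELETE`, and two connections to the same file could in principle read the same row. One possible way to close that window is to take the write lock first with `BEGIN IMMEDIATE`. The sketch below assumes the caller controls how the connection is opened; `make_conn` and `pop_highest` are hypothetical names and not part of beaver-db.

```python
import json
import sqlite3


def make_conn():
    """Open a connection in autocommit mode so transactions are managed by hand."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS beaver_priority_queues (
            queue_name TEXT NOT NULL,
            priority REAL NOT NULL,
            timestamp REAL NOT NULL,
            data TEXT NOT NULL
        )
        """
    )
    return conn


def pop_highest(conn, queue_name):
    """Select and delete the head of the queue inside one explicit transaction."""
    conn.execute("BEGIN IMMEDIATE")  # acquire the write lock before reading
    try:
        row = conn.execute(
            """
            SELECT rowid, priority, timestamp, data
            FROM beaver_priority_queues
            WHERE queue_name = ?
            ORDER BY priority ASC, timestamp ASC
            LIMIT 1
            """,
            (queue_name,),
        ).fetchone()
        if row is None:
            raise IndexError("Queue is empty")
        conn.execute("DELETE FROM beaver_priority_queues WHERE rowid = ?", (row[0],))
    except BaseException:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")
    return row[1], row[2], json.loads(row[3])


conn = make_conn()
insert_sql = (
    "INSERT INTO beaver_priority_queues (queue_name, priority, timestamp, data) "
    "VALUES (?, ?, ?, ?)"
)
conn.execute(insert_sql, ("q", 5.0, 1.0, json.dumps({"action": "a"})))
conn.execute(insert_sql, ("q", 1.0, 2.0, json.dumps({"action": "b"})))

first = pop_highest(conn, "q")
second = pop_highest(conn, "q")
try:
    pop_highest(conn, "q")
    emptied = False
except IndexError:
    emptied = True
print(first, second, emptied)
```

On SQLite builds that support it, a single `DELETE ... RETURNING` statement would be another option. Either way this is a suggestion about the pattern, not a description of how the released code behaves under contention.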
--- beaver_db-0.7.1/beaver_db.egg-info/PKG-INFO
+++ beaver_db-0.8.0/beaver_db.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: beaver-db
-Version: 0.
+Version: 0.8.0
 Summary: Fast, embedded, and multi-modal DB based on SQLite for AI-powered applications.
 Requires-Python: >=3.13
 Description-Content-Type: text/markdown
@@ -11,10 +11,6 @@ Dynamic: license-file
 
 # beaver 🦫
 
-
-
-
-
 A fast, single-file, multi-modal database for Python, built with the standard `sqlite3` library.
 
 `beaver` is the **B**ackend for **E**mbedded, **A**ll-in-one **V**ector, **E**ntity, and **R**elationship storage. It's a simple, local, and embedded database designed to manage complex, modern data types without requiring a database server, built on top of SQLite.
@@ -34,6 +30,7 @@ A fast, single-file, multi-modal database for Python, built with the standard `s
 - **Synchronous Pub/Sub**: A simple, thread-safe, Redis-like publish-subscribe system for real-time messaging.
 - **Namespaced Key-Value Dictionaries**: A Pythonic, dictionary-like interface for storing any JSON-serializable object within separate namespaces with optional TTL for cache implementations.
 - **Pythonic List Management**: A fluent, Redis-like interface for managing persistent, ordered lists.
+- **Persistent Priority Queue**: A high-performance, persistent queue that always returns the item with the highest priority, perfect for task management.
 - **Efficient Vector Storage & Search**: Store vector embeddings and perform fast approximate nearest neighbor searches using an in-memory k-d tree.
 - **Full-Text Search**: Automatically index and search through document metadata using SQLite's powerful FTS5 engine.
 - **Graph Traversal**: Create relationships between documents and traverse the graph to find neighbors or perform multi-hop walks.
@@ -86,7 +83,24 @@ db.close()
 
 Here are a few ideas to inspire your next project, showcasing how to combine Beaver's features to build powerful local applications.
 
-### 1.
+### 1. AI Agent Task Management
+
+Use a **persistent priority queue** to manage tasks for an AI agent. This ensures the agent always works on the most important task first, even if the application restarts.
+
+```python
+tasks = db.queue("agent_tasks")
+
+# Tasks are added with a priority (lower is higher)
+tasks.put({"action": "summarize_news"}, priority=10)
+tasks.put({"action": "respond_to_user"}, priority=1)
+tasks.put({"action": "run_backup"}, priority=20)
+
+# The agent retrieves the highest-priority task
+next_task = tasks.get() # -> Returns the "respond_to_user" task
+print(f"Agent's next task: {next_task.data['action']}")
+```
+
+### 2. User Authentication and Profile Store
 
 Use a **namespaced dictionary** to create a simple and secure user store. The key can be the username, and the value can be a dictionary containing the hashed password and other profile information.
 
@@ -104,7 +118,7 @@ users["alice"] = {
 alice_profile = users.get("alice")
 ```
 
-###
+### 3. Chatbot Conversation History
 
 A **persistent list** is perfect for storing the history of a conversation. Each time the user or the bot sends a message, just `push` it to the list. This maintains a chronological record of the entire dialogue.
 
@@ -119,7 +133,7 @@ for message in chat_history:
     print(f"{message['role']}: {message['content']}")
 ```
 
-###
+### 4. Build a RAG (Retrieval-Augmented Generation) System
 
 Combine **vector search** and **full-text search** to build a powerful RAG pipeline for your local documents.
 
@@ -133,7 +147,7 @@ from beaver.collections import rerank
 best_context = rerank(vector_results, text_results, weights=[0.6, 0.4])
 ```
 
-###
+### 5. Caching for Expensive API Calls
 
 Leverage a **dictionary with a TTL (Time-To-Live)** to cache the results of slow network requests. This can dramatically speed up your application and reduce your reliance on external services.
 
@@ -153,14 +167,15 @@ if response is None:
 
 For more in-depth examples, check out the scripts in the `examples/` directory:
 
-- [`examples/kvstore.py`](https://www.google.com/search?q
-- [`examples/list.py`](https://www.google.com/search?q
-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/
-- [`examples/
+- [`examples/kvstore.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/kvstore.py%5D\(https://www.google.com/search%3Fq%3Dexamples/kvstore.py\)): A comprehensive demo of the namespaced dictionary feature.
+- [`examples/list.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/list.py%5D\(https://www.google.com/search%3Fq%3Dexamples/list.py\)): Shows the full capabilities of the persistent list, including slicing and in-place updates.
+- [`examples/queue.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/queue.py%5D\(https://www.google.com/search%3Fq%3Dexamples/queue.py\)): A practical example of using the persistent priority queue for task management.
+- [`examples/vector.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/vector.py%5D\(https://www.google.com/search%3Fq%3Dexamples/vector.py\)): Demonstrates how to index and search vector embeddings, including upserts.
+- [`examples/fts.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/fts.py%5D\(https://www.google.com/search%3Fq%3Dexamples/fts.py\)): A detailed look at full-text search, including targeted searches on specific metadata fields.
+- [`examples/graph.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/graph.py%5D\(https://www.google.com/search%3Fq%3Dexamples/graph.py\)): Shows how to create relationships between documents and perform multi-hop graph traversals.
+- [`examples/pubsub.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/pubsub.py%5D\(https://www.google.com/search%3Fq%3Dexamples/pubsub.py\)): A demonstration of the synchronous, thread-safe publish/subscribe system.
+- [`examples/cache.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/cache.py%5D\(https://www.google.com/search%3Fq%3Dexamples/cache.py\)): A practical example of using a dictionary with TTL as a cache for API calls.
+- [`examples/rerank.py`](https://www.google.com/search?q=%5Bhttps://www.google.com/search%3Fq%3Dexamples/rerank.py%5D\(https://www.google.com/search%3Fq%3Dexamples/rerank.py\)): Shows how to combine results from vector and text search for more refined results.
 
 ## Roadmap
 
@@ -168,7 +183,6 @@ These are some of the features and improvements planned for future releases:
 
 - **Fuzzy search**: Implement fuzzy matching capabilities for text search.
 - **Faster ANN**: Explore integrating more advanced ANN libraries like `faiss` for improved vector search performance.
-- **Priority Queues**: Introduce a priority queue data structure for task management.
 - **Improved Pub/Sub**: Fan-out implementation with a more Pythonic API.
 - **Async API**: Comprehensive async support with on-demand wrappers for all collections.
 
File without changes (the ten files listed above with +0 -0)