kiarina-lib-redisearch 1.0.0__py3-none-any.whl
- kiarina/lib/redisearch/__init__.py +35 -0
- kiarina/lib/redisearch/_async/__init__.py +0 -0
- kiarina/lib/redisearch/_async/client.py +181 -0
- kiarina/lib/redisearch/_async/registry.py +16 -0
- kiarina/lib/redisearch/_core/__init__.py +0 -0
- kiarina/lib/redisearch/_core/context.py +69 -0
- kiarina/lib/redisearch/_core/operations/__init__.py +0 -0
- kiarina/lib/redisearch/_core/operations/count.py +55 -0
- kiarina/lib/redisearch/_core/operations/create_index.py +52 -0
- kiarina/lib/redisearch/_core/operations/delete.py +43 -0
- kiarina/lib/redisearch/_core/operations/drop_index.py +59 -0
- kiarina/lib/redisearch/_core/operations/exists_index.py +56 -0
- kiarina/lib/redisearch/_core/operations/find.py +105 -0
- kiarina/lib/redisearch/_core/operations/get.py +61 -0
- kiarina/lib/redisearch/_core/operations/get_info.py +155 -0
- kiarina/lib/redisearch/_core/operations/get_key.py +8 -0
- kiarina/lib/redisearch/_core/operations/migrate_index.py +160 -0
- kiarina/lib/redisearch/_core/operations/reset_index.py +60 -0
- kiarina/lib/redisearch/_core/operations/search.py +111 -0
- kiarina/lib/redisearch/_core/operations/set.py +65 -0
- kiarina/lib/redisearch/_core/utils/__init__.py +0 -0
- kiarina/lib/redisearch/_core/utils/calc_score.py +35 -0
- kiarina/lib/redisearch/_core/utils/marshal_mappings.py +57 -0
- kiarina/lib/redisearch/_core/utils/parse_search_result.py +57 -0
- kiarina/lib/redisearch/_core/utils/unmarshal_mappings.py +57 -0
- kiarina/lib/redisearch/_core/views/__init__.py +0 -0
- kiarina/lib/redisearch/_core/views/document.py +25 -0
- kiarina/lib/redisearch/_core/views/info_result.py +24 -0
- kiarina/lib/redisearch/_core/views/search_result.py +31 -0
- kiarina/lib/redisearch/_sync/__init__.py +0 -0
- kiarina/lib/redisearch/_sync/client.py +179 -0
- kiarina/lib/redisearch/_sync/registry.py +16 -0
- kiarina/lib/redisearch/asyncio.py +33 -0
- kiarina/lib/redisearch/filter/__init__.py +61 -0
- kiarina/lib/redisearch/filter/_decorators.py +28 -0
- kiarina/lib/redisearch/filter/_enums.py +28 -0
- kiarina/lib/redisearch/filter/_field/__init__.py +5 -0
- kiarina/lib/redisearch/filter/_field/base.py +67 -0
- kiarina/lib/redisearch/filter/_field/numeric.py +178 -0
- kiarina/lib/redisearch/filter/_field/tag.py +142 -0
- kiarina/lib/redisearch/filter/_field/text.py +111 -0
- kiarina/lib/redisearch/filter/_model.py +93 -0
- kiarina/lib/redisearch/filter/_registry.py +153 -0
- kiarina/lib/redisearch/filter/_types.py +32 -0
- kiarina/lib/redisearch/filter/_utils.py +18 -0
- kiarina/lib/redisearch/py.typed +0 -0
- kiarina/lib/redisearch/schema/__init__.py +25 -0
- kiarina/lib/redisearch/schema/_field/__init__.py +0 -0
- kiarina/lib/redisearch/schema/_field/base.py +20 -0
- kiarina/lib/redisearch/schema/_field/numeric.py +33 -0
- kiarina/lib/redisearch/schema/_field/tag.py +46 -0
- kiarina/lib/redisearch/schema/_field/text.py +44 -0
- kiarina/lib/redisearch/schema/_field/vector/__init__.py +0 -0
- kiarina/lib/redisearch/schema/_field/vector/base.py +61 -0
- kiarina/lib/redisearch/schema/_field/vector/flat.py +40 -0
- kiarina/lib/redisearch/schema/_field/vector/hnsw.py +53 -0
- kiarina/lib/redisearch/schema/_model.py +98 -0
- kiarina/lib/redisearch/schema/_types.py +16 -0
- kiarina/lib/redisearch/settings.py +47 -0
- kiarina_lib_redisearch-1.0.0.dist-info/METADATA +886 -0
- kiarina_lib_redisearch-1.0.0.dist-info/RECORD +62 -0
- kiarina_lib_redisearch-1.0.0.dist-info/WHEEL +4 -0
@@ -0,0 +1,886 @@
Metadata-Version: 2.4
Name: kiarina-lib-redisearch
Version: 1.0.0
Summary: RediSearch client library for kiarina namespace
Project-URL: Homepage, https://github.com/kiarina/kiarina-python
Project-URL: Repository, https://github.com/kiarina/kiarina-python
Project-URL: Issues, https://github.com/kiarina/kiarina-python/issues
Project-URL: Changelog, https://github.com/kiarina/kiarina-python/blob/main/packages/kiarina-lib-redisearch/CHANGELOG.md
Project-URL: Documentation, https://github.com/kiarina/kiarina-python/tree/main/packages/kiarina-lib-redisearch#readme
Author-email: kiarina <kiarinadawa@gmail.com>
Maintainer-email: kiarina <kiarinadawa@gmail.com>
License: MIT
Keywords: database,fulltext,pydantic,redis,redisearch,search,settings,vector
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: numpy>=2.3.2
Requires-Dist: pydantic-settings-manager>=2.1.0
Requires-Dist: pydantic-settings>=2.10.1
Requires-Dist: pydantic>=2.11.7
Requires-Dist: redis>=6.4.0
Description-Content-Type: text/markdown

# kiarina-lib-redisearch

A comprehensive Python client library for [RediSearch](https://redis.io/docs/interact/search-and-query/) with advanced configuration management, schema definition, and both full-text and vector search capabilities.

## Features

- **Full-Text Search**: Advanced text search with stemming, phonetic matching, and fuzzy search
- **Vector Search**: Similarity search using FLAT and HNSW algorithms with multiple distance metrics
- **Schema Management**: Type-safe schema definition with automatic migration support
- **Configuration Management**: Flexible configuration using `pydantic-settings-manager`
- **Sync & Async**: Support for both synchronous and asynchronous operations
- **Advanced Filtering**: Intuitive query builder with type-safe filter expressions
- **Index Management**: Complete index lifecycle management (create, migrate, reset, drop)
- **Type Safety**: Full type hints and Pydantic validation throughout

## Installation

```bash
pip install kiarina-lib-redisearch
```

## Quick Start

### Basic Usage (Sync)

```python
import redis
from kiarina.lib.redisearch import create_redisearch_client, settings_manager

# Configure your schema
schema = [
    {"type": "tag", "name": "category"},
    {"type": "text", "name": "title"},
    {"type": "numeric", "name": "price", "sortable": True},
    {"type": "vector", "name": "embedding", "algorithm": "FLAT", "dims": 1536}
]

# Configure settings before creating the client
settings_manager.user_config = {
    "default": {
        "key_prefix": "products:",
        "index_name": "products_index",
        "index_schema": schema
    }
}

# Create Redis connection (decode_responses=False is required)
redis_client = redis.Redis(host="localhost", port=6379, decode_responses=False)

# Create RediSearch client
client = create_redisearch_client(
    redis=redis_client,
    config_key="default"  # Optional: use specific configuration
)

# Create index
client.create_index()

# Add documents
client.set({
    "category": "electronics",
    "title": "Wireless Headphones",
    "price": 99.99,
    "embedding": [0.1, 0.2, 0.3, ...]  # 1536-dimensional vector
}, id="product_1")

# Full-text search
results = client.find(
    filter=[["category", "==", "electronics"]],
    return_fields=["title", "price"]
)

# Vector similarity search
results = client.search(
    vector=[0.1, 0.2, 0.3, ...],  # Query vector
    limit=10
)
```

### Async Usage

```python
import asyncio

import redis.asyncio
from kiarina.lib.redisearch.asyncio import create_redisearch_client

async def main():
    # Create async Redis connection
    redis_client = redis.asyncio.Redis(host="localhost", port=6379, decode_responses=False)

    # Create async RediSearch client
    client = create_redisearch_client(redis=redis_client)

    # All operations are awaitable
    await client.create_index()
    await client.set({"title": "Example"}, id="doc_1")
    results = await client.find()

asyncio.run(main())
```

## Schema Definition

Define your search schema with type-safe field definitions:

### Field Types

#### Tag Fields
```python
{
    "type": "tag",
    "name": "category",
    "separator": ",",          # Default: ","
    "case_sensitive": False,   # Default: False
    "sortable": True,          # Default: False
    "multiple": True           # Allow multiple tags (library-specific)
}
```

#### Text Fields
```python
{
    "type": "text",
    "name": "description",
    "weight": 2.0,                # Default: 1.0
    "no_stem": False,             # Default: False
    "sortable": True,             # Default: False
    "phonetic_matcher": "dm:en"   # Optional phonetic matching
}
```

#### Numeric Fields
```python
{
    "type": "numeric",
    "name": "price",
    "sortable": True,    # Default: False
    "no_index": False    # Default: False
}
```

#### Vector Fields

**FLAT Algorithm (Exact Search)**
```python
{
    "type": "vector",
    "name": "embedding",
    "algorithm": "FLAT",
    "dims": 1536,
    "datatype": "FLOAT32",         # FLOAT32 or FLOAT64
    "distance_metric": "COSINE",   # L2, COSINE, or IP
    "initial_cap": 1000            # Optional initial capacity
}
```

**HNSW Algorithm (Approximate Search)**
```python
{
    "type": "vector",
    "name": "embedding",
    "algorithm": "HNSW",
    "dims": 1536,
    "datatype": "FLOAT32",
    "distance_metric": "COSINE",
    "m": 16,                   # Default: 16
    "ef_construction": 200,    # Default: 200
    "ef_runtime": 10,          # Default: 10
    "epsilon": 0.01            # Default: 0.01
}
```
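
To make the "exact search" label on FLAT concrete, here is a minimal pure-Python sketch of a brute-force nearest-neighbour scan using the L2 metric. The function name `flat_knn` and the toy two-dimensional vectors are illustrative assumptions; this is not how the library or RediSearch is implemented, only the idea that an exact search compares the query against every stored vector.

```python
import math


def flat_knn(query, vectors, k):
    """Exact k-nearest neighbours: measure every stored vector, keep the k closest."""
    # math.dist computes the Euclidean (L2) distance between two points.
    scored = sorted((math.dist(query, vec), doc_id) for doc_id, vec in vectors.items())
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy data; real embeddings would have e.g. 1536 dimensions.
vectors = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.9, 0.1],
}
nearest = flat_knn([1.0, 0.0], vectors, k=2)
print(nearest)
```

The cost of such a scan grows linearly with the number of vectors, which is why an approximate index such as HNSW is usually preferred for large datasets.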

## Configuration

This library uses [pydantic-settings-manager](https://github.com/kiarina/pydantic-settings-manager) for flexible configuration management.

### Environment Variables

```bash
# Basic settings
export KIARINA_LIB_REDISEARCH_KEY_PREFIX="myapp:"
export KIARINA_LIB_REDISEARCH_INDEX_NAME="main_index"
export KIARINA_LIB_REDISEARCH_PROTECT_INDEX_DELETION="true"
```
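
The variable names above follow a simple pattern: the `KIARINA_LIB_REDISEARCH_` prefix plus the upper-cased setting name. The sketch below illustrates that naming convention with the standard library only; `read_prefixed_env` is a hypothetical helper, and the library itself presumably relies on `pydantic-settings` for the real parsing and type conversion.

```python
ENV_PREFIX = "KIARINA_LIB_REDISEARCH_"


def read_prefixed_env(environ):
    """Collect variables carrying the documented prefix, keyed by lower-cased setting name."""
    return {
        name[len(ENV_PREFIX):].lower(): value
        for name, value in environ.items()
        if name.startswith(ENV_PREFIX)
    }


# A stand-in for os.environ so the example has no side effects.
example_env = {
    "KIARINA_LIB_REDISEARCH_KEY_PREFIX": "myapp:",
    "KIARINA_LIB_REDISEARCH_INDEX_NAME": "main_index",
    "PATH": "/usr/bin",
}
settings = read_prefixed_env(example_env)
print(settings)
```

Passing `os.environ` instead of `example_env` would inspect the real process environment. Values stay strings here; converting `"true"` to a boolean is left to the settings layer.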

### Programmatic Configuration

```python
from kiarina.lib.redisearch import settings_manager

# Configure multiple environments
settings_manager.user_config = {
    "development": {
        "key_prefix": "dev:",
        "index_name": "dev_index",
        "index_schema": dev_schema,
        "protect_index_deletion": False
    },
    "production": {
        "key_prefix": "prod:",
        "index_name": "prod_index",
        "index_schema": prod_schema,
        "protect_index_deletion": True
    }
}

# Switch configurations
settings_manager.active_key = "production"
```

## Advanced Filtering

Use the intuitive filter API to build complex queries:

### Filter API

```python
import kiarina.lib.redisearch.filter as rf

# Tag filters
filter1 = rf.Tag("category") == "electronics"
filter2 = rf.Tag("tags") == ["new", "featured"]  # Multiple tags
filter3 = rf.Tag("brand") != "apple"

# Numeric filters
filter4 = rf.Numeric("price") > 100
filter5 = rf.Numeric("rating") >= 4.5
filter6 = rf.Numeric("stock") <= 10

# Text filters
filter7 = rf.Text("title") == "exact match"
filter8 = rf.Text("description") % "*wireless*"  # Wildcard search
filter9 = rf.Text("content") % "%%fuzzy%%"       # Fuzzy search

# Combine filters
complex_filter = (
    (rf.Tag("category") == "electronics") &
    (rf.Numeric("price") < 500) &
    (rf.Text("title") % "*headphone*")
)

# Use in searches
results = client.find(filter=complex_filter)
```
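
The `&` and `|` operators used above rely on Python operator overloading. The following sketch shows one way such a builder could render RediSearch query syntax, where space-separated clauses intersect and `|` forms a union. The `Expr` class and the `tag_eq` and `num_lt` helpers are hypothetical and are not the library's `rf` classes.

```python
class Expr:
    """A tiny query expression that renders to RediSearch query syntax."""

    def __init__(self, query):
        self.query = query

    def __and__(self, other):
        # Space-separated clauses are an intersection (AND) in RediSearch.
        return Expr(f"({self.query} {other.query})")

    def __or__(self, other):
        # A pipe between clauses is a union (OR).
        return Expr(f"({self.query} | {other.query})")

    def __str__(self):
        return self.query


def tag_eq(field, value):
    """Tag match: @field:{value}."""
    return Expr(f"@{field}:{{{value}}}")


def num_lt(field, value):
    """Numeric range with an exclusive upper bound, marked by "("."""
    return Expr(f"@{field}:[-inf ({value}]")


expr = (tag_eq("category", "electronics") | tag_eq("category", "computers")) & num_lt("price", 1000)
query = str(expr)
print(query)
```

Because `==` binds more loosely than `&` and `|` in Python, comparison-based builders like the one in this README need the parentheses around each comparison.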

### Condition Lists

Alternatively, use simple condition lists:

```python
# Equivalent to the complex filter above
conditions = [
    ["category", "==", "electronics"],
    ["price", "<", 500],
    ["title", "like", "*headphone*"]
]

results = client.find(filter=conditions)
```
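
A condition list carries only field names, so turning it into a query requires knowing each field's type from the schema. The sketch below shows how such a translation might look for a handful of operators. The functions and the `field_types` mapping are assumptions made for illustration, and the operator coverage is deliberately partial.

```python
def condition_to_clause(field, op, value, field_type):
    """Render one [field, operator, value] condition as a RediSearch clause."""
    if field_type == "tag" and op == "==":
        return f"@{field}:{{{value}}}"
    if field_type == "numeric":
        # "(" marks an exclusive bound; -inf/+inf leave a side open.
        ranges = {
            "==": f"[{value} {value}]",
            "<": f"[-inf ({value}]",
            "<=": f"[-inf {value}]",
            ">": f"[({value} +inf]",
            ">=": f"[{value} +inf]",
        }
        if op in ranges:
            return f"@{field}:{ranges[op]}"
    if field_type == "text" and op == "like":
        return f"@{field}:({value})"
    raise ValueError(f"unsupported condition: {field} {op} {value!r}")


def conditions_to_query(conditions, field_types):
    """Join clauses with spaces, which RediSearch reads as AND."""
    return " ".join(
        condition_to_clause(field, op, value, field_types[field])
        for field, op, value in conditions
    )


# Hypothetical mapping derived from a schema like the one in Quick Start.
field_types = {"category": "tag", "price": "numeric", "title": "text"}
query = conditions_to_query(
    [["category", "==", "electronics"], ["price", "<", 500], ["title", "like", "*headphone*"]],
    field_types,
)
print(query)
```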

## Search Operations

The library provides three main search operations: `count`, `find`, and `search`. These are the core functions for querying your indexed data.

### 1. Count Documents (`count`)

Count the number of documents matching specific criteria without retrieving the actual documents. This is efficient for getting result counts.

```python
# Count all documents
total = client.count()
print(f"Total documents: {total.total}")

# Count with filters
electronics_count = client.count(
    filter=[["category", "==", "electronics"]]
)
print(f"Electronics products: {electronics_count.total}")

# Complex filter counting
expensive_electronics = client.count(
    filter=[
        ["category", "==", "electronics"],
        ["price", ">", 500]
    ]
)
print(f"Expensive electronics: {expensive_electronics.total}")

# Using filter API
import kiarina.lib.redisearch.filter as rf
filter_expr = (rf.Tag("category") == "electronics") & (rf.Numeric("price") > 500)
count_result = client.count(filter=filter_expr)
```

**Count Result Structure:**
```python
class SearchResult:
    total: int        # Number of matching documents
    duration: float   # Query execution time in milliseconds
    documents: list   # Empty for count operations
```

### 2. Full-Text Search (`find`)

Search and retrieve documents based on filters, with support for sorting, pagination, and field selection.

#### Basic Find Operations

```python
# Find all documents
results = client.find()
print(f"Found {results.total} documents")

# Find with specific fields returned
results = client.find(
    return_fields=["title", "price", "category"]
)
for doc in results.documents:
    print(f"ID: {doc.id}")
    print(f"Title: {doc.mapping['title']}")
    print(f"Price: {doc.mapping['price']}")
```

#### Filtering

```python
# Single filter condition
results = client.find(
    filter=[["category", "==", "electronics"]]
)

# Multiple filter conditions (AND logic)
results = client.find(
    filter=[
        ["category", "==", "electronics"],
        ["price", ">=", 100],
        ["price", "<=", 500]
    ]
)

# Using filter expressions for complex logic
import kiarina.lib.redisearch.filter as rf
complex_filter = (
    (rf.Tag("category") == "electronics") |
    (rf.Tag("category") == "computers")
) & (rf.Numeric("price") < 1000)

results = client.find(filter=complex_filter)
```

#### Sorting

```python
# Sort by price (ascending)
results = client.find(
    sort_by="price",
    sort_desc=False
)

# Sort by rating (descending)
results = client.find(
    filter=[["category", "==", "electronics"]],
    sort_by="rating",
    sort_desc=True
)

# Note: Only sortable fields can be used for sorting
# Define sortable fields in your schema:
# {"type": "numeric", "name": "price", "sortable": True}
```

#### Pagination

```python
# Get first 10 results
results = client.find(limit=10)

# Get next 10 results (pagination)
results = client.find(offset=10, limit=10)

# Get results 21-30
results = client.find(offset=20, limit=10)

# Combine with filtering and sorting
results = client.find(
    filter=[["category", "==", "electronics"]],
    sort_by="price",
    sort_desc=True,
    offset=0,
    limit=20
)
```
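
The `offset` and `limit` arithmetic in the pagination examples can be wrapped in a small helper. `page_window` below is a hypothetical convenience function, not part of the library. It converts a 1-based page number into the keyword arguments shown above.

```python
def page_window(page, page_size):
    """Translate a 1-based page number into offset/limit keyword arguments."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"offset": (page - 1) * page_size, "limit": page_size}


# Page 3 with 10 results per page covers results 21-30.
window = page_window(page=3, page_size=10)
print(window)
```

With a client in hand, the result could be unpacked directly, for example `client.find(**page_window(3, 10))`.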

#### Field Selection

```python
# Return only specific fields (more efficient)
results = client.find(
    return_fields=["title", "price"]
)

# Return no content, only document IDs (most efficient for counting)
results = client.find(
    return_fields=[]  # or omit return_fields parameter
)

# Include computed fields
results = client.find(
    return_fields=["title", "price", "id"]  # id is automatically computed
)
```

#### Complete Find Example

```python
# Comprehensive search with all options
results = client.find(
    filter=[
        ["category", "in", ["electronics", "computers"]],
        ["price", ">=", 50],
        ["rating", ">=", 4.0]
    ],
    sort_by="price",
    sort_desc=False,
    offset=0,
    limit=25,
    return_fields=["title", "price", "rating", "category"]
)

print(f"Found {results.total} products (showing {len(results.documents)})")
print(f"Query took {results.duration}ms")

for doc in results.documents:
    print(f"- {doc.mapping['title']}: ${doc.mapping['price']} ({doc.mapping['rating']}⭐)")
```

### 3. Vector Similarity Search (`search`)

Perform semantic similarity search using vector embeddings. This is ideal for AI-powered search, recommendation systems, and semantic matching.

#### Basic Vector Search

```python
# Simple vector search
query_vector = [0.1, 0.2, 0.3, ...]  # Your query embedding (must match schema dims)
results = client.search(vector=query_vector)

print(f"Found {results.total} similar documents")
for doc in results.documents:
    print(f"Document: {doc.id}, Similarity Score: {doc.score:.4f}")
```

#### Vector Search with Filtering

```python
# Pre-filter documents before vector search (more efficient)
results = client.search(
    vector=query_vector,
    filter=[["category", "==", "electronics"]],  # Only search within electronics
    limit=10
)

# Complex pre-filtering
results = client.search(
    vector=query_vector,
    filter=[
        ["category", "in", ["electronics", "computers"]],
        ["price", "<=", 1000],
        ["in_stock", "==", "true"]
    ],
    limit=20
)
```

#### Pagination and Field Selection

```python
# Paginated vector search
results = client.search(
    vector=query_vector,
    offset=10,
    limit=10,
    return_fields=["title", "description", "price", "distance"]
)

# Get similarity scores and distances
for doc in results.documents:
    distance = doc.mapping.get('distance', 0)
    score = doc.score  # Normalized similarity score (0-1)
    print(f"{doc.mapping['title']}: score={score:.4f}, distance={distance:.4f}")
```

#### Understanding Vector Search Results

```python
results = client.search(
    vector=query_vector,
    limit=5,
    return_fields=["title", "distance"]
)

for i, doc in enumerate(results.documents, 1):
    print(f"{i}. {doc.mapping['title']}")
    print(f"   Similarity Score: {doc.score:.4f}")  # Higher = more similar
    print(f"   Distance: {doc.mapping['distance']:.4f}")  # Lower = more similar
    print(f"   Document ID: {doc.id}")
    print()
```
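
The inverse relationship between distance and score can be illustrated for the cosine metric. The sketch below assumes the common convention `score = 1 - distance`; the library's own `calc_score` may use a different formula or normalization, so treat the numbers as an illustration of the direction of the relationship only.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def score_from_distance(distance):
    """Assumed convention (not confirmed by the README): similarity = 1 - distance."""
    return 1.0 - distance


query = [1.0, 0.0]
identical = score_from_distance(cosine_distance(query, [1.0, 0.0]))
orthogonal = score_from_distance(cosine_distance(query, [0.0, 1.0]))
opposite = score_from_distance(cosine_distance(query, [-1.0, 0.0]))
print(identical, orthogonal, opposite)
```

Under this convention a smaller distance always yields a larger score, which matches the "higher = more similar" and "lower = more similar" comments above.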

#### Vector Search Best Practices

```python
# 1. Use appropriate vector dimensions
schema = [{
    "type": "vector",
    "name": "embedding",
    "algorithm": "HNSW",  # or "FLAT"
    "dims": 1536,  # Must match your embedding model
    "distance_metric": "COSINE"  # COSINE, L2, or IP
}]

# 2. Pre-filter for better performance
results = client.search(
    vector=query_vector,
    filter=[["category", "==", "target_category"]],  # Reduce search space
    limit=50  # Don't retrieve more than needed
)

# 3. Use HNSW for large datasets
hnsw_schema = {
    "type": "vector",
    "name": "embedding",
    "algorithm": "HNSW",
    "dims": 1536,
    "m": 16,  # Connections per node
    "ef_construction": 200,  # Build-time accuracy
    "ef_runtime": 100  # Search-time accuracy
}

# 4. Use FLAT for smaller datasets or exact search
flat_schema = {
    "type": "vector",
    "name": "embedding",
    "algorithm": "FLAT",
    "dims": 1536
}
```

### Search Result Structure

All search operations return a `SearchResult` object:

```python
class SearchResult:
    total: int                  # Total matching documents
    duration: float             # Query execution time (ms)
    documents: list[Document]   # Retrieved documents

class Document:
    key: str                    # Redis key
    id: str                     # Document ID
    score: float                # Relevance/similarity score (-1.0 to 1.0)*
    mapping: dict[str, Any]     # Document fields
```

### Performance Comparison

| Operation | Use Case | Performance | Returns |
|-----------|----------|-------------|---------|
| `count()` | Get result counts | Fastest | Count only |
| `find()` | Full-text search, filtering | Fast | Full documents |
| `search()` | Semantic similarity | Moderate* | Ranked by similarity |

*Vector search performance depends on algorithm (FLAT vs HNSW) and dataset size.

### Combining Operations

```python
# 1. First, check how many results we'll get
count_result = client.count(
    filter=[["category", "==", "electronics"]]
)
print(f"Will search through {count_result.total} electronics")

# 2. If reasonable number, do full-text search
if count_result.total < 10000:
    text_results = client.find(
        filter=[["category", "==", "electronics"]],
        sort_by="rating",
        sort_desc=True,
        limit=100
    )

# 3. For semantic search within results
if query_vector:
    semantic_results = client.search(
        vector=query_vector,
        filter=[["category", "==", "electronics"]],
        limit=20
    )
```

## Index Management

### Index Lifecycle

```python
# Check if index exists
if not client.exists_index():
    client.create_index()

# Get index information
info = client.get_info()
print(f"Index: {info.index_name}, Documents: {info.num_docs}")

# Reset index (delete all documents, recreate index)
client.reset_index()

# Drop index (optionally delete documents)
client.drop_index(delete_documents=True)
```

### Schema Migration

Automatically migrate your index when the schema changes:

```python
# Update your schema
new_schema = [
    {"type": "tag", "name": "category"},
    {"type": "text", "name": "title"},
    {"type": "numeric", "name": "price", "sortable": True},
    {"type": "numeric", "name": "rating", "sortable": True},  # New field
    {"type": "vector", "name": "embedding", "algorithm": "HNSW", "dims": 1536}  # Changed algorithm
]

# Update configuration
settings_manager.user_config["production"]["index_schema"] = new_schema

# Migrate (automatically detects changes and recreates index)
client.migrate_index()
```
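
Change detection of the kind `migrate_index()` performs can be pictured as a comparison of field definitions. The sketch below is a hypothetical illustration that diffs two schema lists by field name. How the library actually detects changes (for example by inspecting the live index) is not stated in this README.

```python
def schema_changes(old_schema, new_schema):
    """Compare two schemas (lists of field dicts) by field name."""
    old = {field["name"]: field for field in old_schema}
    new = {field["name"]: field for field in new_schema}
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(name for name in old.keys() & new.keys() if old[name] != new[name]),
    }


old_schema = [
    {"type": "numeric", "name": "price", "sortable": True},
    {"type": "vector", "name": "embedding", "algorithm": "FLAT", "dims": 1536},
]
new_schema = [
    {"type": "numeric", "name": "price", "sortable": True},
    {"type": "numeric", "name": "rating", "sortable": True},
    {"type": "vector", "name": "embedding", "algorithm": "HNSW", "dims": 1536},
]
changes = schema_changes(old_schema, new_schema)
# Any non-empty list means the index definition no longer matches.
needs_migration = any(changes.values())
print(changes, needs_migration)
```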

## Document Operations

### Adding Documents

```python
# Add single document
client.set({
    "category": "electronics",
    "title": "Wireless Mouse",
    "price": 29.99,
    "rating": 4.5,
    "embedding": [0.1, 0.2, ...]
}, id="mouse_001")

# Add document with ID in mapping
client.set({
    "id": "keyboard_001",
    "category": "electronics",
    "title": "Mechanical Keyboard",
    "price": 129.99,
    "embedding": [0.2, 0.3, ...]
})
```

### Retrieving Documents

```python
# Get single document
doc = client.get("mouse_001")
if doc:
    print(f"Title: {doc.mapping['title']}")
    print(f"Price: {doc.mapping['price']}")

# Get Redis key for document
key = client.get_key("mouse_001")  # Returns "products:mouse_001"
```
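
The `get_key` comment above shows the key as the configured `key_prefix` followed by the document ID. A standalone sketch of that mapping and its inverse follows. `make_key` and `id_from_key` are hypothetical names, and the plain concatenation is inferred from the `"products:mouse_001"` example rather than from the library source.

```python
def make_key(key_prefix, doc_id):
    """Build a Redis key by prepending the configured prefix to the document ID."""
    return f"{key_prefix}{doc_id}"


def id_from_key(key_prefix, key):
    """Recover the document ID from a Redis key, rejecting foreign keys."""
    if not key.startswith(key_prefix):
        raise ValueError(f"{key!r} does not start with prefix {key_prefix!r}")
    return key[len(key_prefix):]


key = make_key("products:", "mouse_001")
doc_id = id_from_key("products:", key)
print(key, doc_id)
```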

### Deleting Documents

```python
# Delete single document
client.delete("mouse_001")
```

## Integration with Other Libraries

### Using with kiarina-lib-redis

```python
from kiarina.lib.redis import get_redis
from kiarina.lib.redisearch import create_redisearch_client

# Get Redis client from kiarina-lib-redis
redis_client = get_redis(decode_responses=False)

# Create RediSearch client
search_client = create_redisearch_client(redis=redis_client)
```

### Custom Redis Configuration

```python
import redis
from kiarina.lib.redisearch import create_redisearch_client

# Custom Redis client with connection pooling
redis_client = redis.Redis(
    host="localhost",
    port=6379,
    db=0,
    decode_responses=False,  # Required!
    max_connections=20,
    socket_timeout=30,
    socket_connect_timeout=10
)

search_client = create_redisearch_client(redis=redis_client)
```
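
The README does not say why `decode_responses=False` is required. One plausible reason, stated here as an assumption, is that vector fields are stored as raw binary blobs, which a decoding client would try to interpret as text. The sketch below packs a FLOAT32 vector with the standard library to show what such a blob looks like. The helper names are hypothetical, and the library itself may use `numpy` for this step.

```python
import struct


def pack_float32(vector):
    """Pack floats as little-endian 32-bit values (4 bytes per dimension)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_float32(blob):
    """Inverse of pack_float32: read 4-byte floats back into a list."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


blob = pack_float32([0.5, -1.0, 2.0])
restored = unpack_float32(blob)

# The packed bytes are binary data, not text: decoding them as UTF-8 fails here.
try:
    blob.decode("utf-8")
    decodes_as_text = True
except UnicodeDecodeError:
    decodes_as_text = False

print(len(blob), restored, decodes_as_text)
```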

## Error Handling

```python
try:
    client.create_index()
except Exception as e:
    if "Index already exists" in str(e):
        print("Index already exists, continuing...")
    else:
        raise

# Protect against accidental index deletion
settings_manager.user_config["production"]["protect_index_deletion"] = True

# This will return False instead of deleting
success = client.drop_index()
if not success:
    print("Index deletion is protected")
```

## Performance Considerations

### Vector Search Optimization

```python
# Use HNSW for large datasets (faster but approximate)
hnsw_schema = {
    "type": "vector",
    "name": "embedding",
    "algorithm": "HNSW",
    "dims": 1536,
    "m": 32,                   # Higher M = better recall, more memory
    "ef_construction": 400,    # Higher = better index quality, slower indexing
    "ef_runtime": 100          # Higher = better recall, slower search
}

# Use FLAT for smaller datasets or exact search
flat_schema = {
    "type": "vector",
    "name": "embedding",
    "algorithm": "FLAT",
    "dims": 1536,
    "initial_cap": 10000       # Pre-allocate capacity
}
```

### Indexing Best Practices

```python
# Use appropriate field options
schema = [
    {
        "type": "tag",
        "name": "category",
        "sortable": True,      # Only if you need sorting
        "no_index": False      # Set True for storage-only fields
    },
    {
        "type": "text",
        "name": "description",
        "weight": 1.0,         # Adjust relevance weight
        "no_stem": False       # Enable stemming for better search
    }
]
```

## Development

### Prerequisites

- Python 3.12+
- Redis with RediSearch module
- Docker (for running Redis in tests)

### Setup

```bash
# Clone the repository
git clone https://github.com/kiarina/kiarina-python.git
cd kiarina-python

# Setup development environment
mise run setup

# Start Redis with RediSearch for testing
docker compose up -d redis
```

### Running Tests

```bash
# Run all tests for this package
mise run package kiarina-lib-redisearch

# Run specific test categories
uv run --group test pytest packages/kiarina-lib-redisearch/tests/sync/
uv run --group test pytest packages/kiarina-lib-redisearch/tests/async/

# Run with coverage
mise run package:test kiarina-lib-redisearch --coverage
```

## Configuration Reference

| Setting | Environment Variable | Default | Description |
|---------|---------------------|---------|-------------|
| `key_prefix` | `KIARINA_LIB_REDISEARCH_KEY_PREFIX` | `""` | Redis key prefix for documents |
| `index_name` | `KIARINA_LIB_REDISEARCH_INDEX_NAME` | `"default"` | RediSearch index name |
| `index_schema` | - | `None` | Index schema definition (list of field dicts) |
| `protect_index_deletion` | `KIARINA_LIB_REDISEARCH_PROTECT_INDEX_DELETION` | `false` | Prevent accidental index deletion |

## Dependencies

- [redis](https://github.com/redis/redis-py) - Redis client for Python
- [numpy](https://numpy.org/) - Numerical computing (for vector operations)
- [pydantic](https://docs.pydantic.dev/) - Data validation and settings management
- [pydantic-settings](https://docs.pydantic.dev/latest/concepts/pydantic_settings/) - Settings management
- [pydantic-settings-manager](https://github.com/kiarina/pydantic-settings-manager) - Advanced settings management

## License

This project is licensed under the MIT License - see the [LICENSE](../../LICENSE) file for details.

## Contributing

This is a personal project, but contributions are welcome! Please feel free to submit issues or pull requests.

## Related Projects

- [kiarina-python](https://github.com/kiarina/kiarina-python) - The main monorepo containing this package
- [RediSearch](https://redis.io/docs/interact/search-and-query/) - The search and query engine this library connects to
- [kiarina-lib-redis](../kiarina-lib-redis/) - Redis client library for basic Redis operations
- [pydantic-settings-manager](https://github.com/kiarina/pydantic-settings-manager) - Configuration management library used by this package