simplevecdb 2.6.0__tar.gz → 2.6.1__tar.gz
- simplevecdb-2.6.1/.github/FUNDING.yml +5 -0
- simplevecdb-2.6.1/.gitignore +70 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/CHANGELOG.md +103 -0
- simplevecdb-2.6.1/PKG-INFO +377 -0
- simplevecdb-2.6.1/README.md +330 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/CHANGELOG.md +103 -0
- simplevecdb-2.6.1/docs/Features.md +206 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/api/core.md +0 -1
- simplevecdb-2.6.1/docs/examples.md +286 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/examples/rag/langchain_rag.ipynb +10 -7
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/examples/rag/llama_rag.ipynb +10 -4
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/examples/rag/ollama_rag.ipynb +21 -16
- simplevecdb-2.6.1/lefthook.yml +64 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/mkdocs.yml +1 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/pyproject.toml +1 -2
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/scripts/check_version_sync.py +2 -6
- simplevecdb-2.6.1/scripts/exercise_async_collection.py +358 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/scripts/track_metrics.py +25 -16
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/__init__.py +1 -2
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/async_core.py +325 -198
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/constants.py +28 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/core.py +946 -236
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/embeddings/models.py +1 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/embeddings/server.py +3 -1
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/encryption.py +9 -6
- simplevecdb-2.6.1/src/simplevecdb/engine/catalog.py +2267 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/engine/search.py +74 -13
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/engine/usearch_index.py +6 -11
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/integrations/llamaindex.py +1 -3
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/logging.py +0 -3
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/types.py +60 -35
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/utils.py +205 -31
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/integration/test_rag.py +11 -9
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/integration/test_server.py +3 -1
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_core_additional_coverage.py +6 -3
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_filters.py +5 -4
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_missing_coverage.py +4 -184
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_v25_correctness.py +3 -1
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_v25_features.py +9 -3
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_v25_robustness.py +28 -14
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/embeddings/test_repo_id_validation.py +6 -4
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/embeddings/test_server.py +9 -3
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/embeddings/test_v25_enhancements.py +4 -4
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/engine/test_v26_quantization_clustering.py +7 -1
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/integrations/test_langchain_coverage.py +3 -1
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/integrations/test_llamaindex_v26.py +1 -3
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_async_coverage.py +3 -1
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_catalog_coverage.py +2 -62
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_core.py +2 -39
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_encryption_coverage.py +23 -7
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_encryption_v1_format.py +1 -3
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_error_handling.py +23 -58
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_multi_collection.py +14 -11
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_search.py +27 -18
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_search_missing_coverage.py +4 -4
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_usearch_index_missing_coverage.py +11 -5
- simplevecdb-2.6.1/tests/unit/test_v26_1_features.py +655 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_v26_encryption_review_pass_3.py +6 -14
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_v26_misc.py +3 -4
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_v26_review_pass_3.py +3 -9
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_v26_review_pass_4.py +3 -9
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/uv.lock +1 -15
- simplevecdb-2.6.0/.github/FUNDING.yml +0 -6
- simplevecdb-2.6.0/.gitignore +0 -38
- simplevecdb-2.6.0/PKG-INFO +0 -546
- simplevecdb-2.6.0/README.md +0 -498
- simplevecdb-2.6.0/docs/LICENSE +0 -0
- simplevecdb-2.6.0/docs/examples.md +0 -356
- simplevecdb-2.6.0/lefthook.yml +0 -39
- simplevecdb-2.6.0/src/simplevecdb/engine/catalog.py +0 -1123
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.bandit +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.env.example +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.github/ISSUE_TEMPLATE/bug_report.yml +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.github/ISSUE_TEMPLATE/config.yml +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.github/ISSUE_TEMPLATE/feature_request.yml +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.github/dependabot.yml +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.github/workflows/ci.yml +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.github/workflows/publish.yml +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.github/workflows/security.yml +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.github/workflows/update-sponsors.yml +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/.python-version +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/CODE_OF_CONDUCT.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/CONTRIBUTING.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/LICENSE +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/SECURITY.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/CONTRIBUTING.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/ENV_SETUP.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/api/async.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/api/config.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/api/embeddings.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/api/encryption.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/api/engine/catalog.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/api/engine/quantization.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/api/engine/search.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/api/integrations.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/api/types.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/benchmarks.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/guides/clustering.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/docs/index.md +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/examples/auto_embed.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/examples/backend_benchmark.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/examples/embeddings/perf_benchmark.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/examples/quant_benchmark.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/examples/smoke_test.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/scripts/bump_version.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/config.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/embeddings/__init__.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/engine/__init__.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/engine/clustering.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/engine/quantization.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/integrations/__init__.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/src/simplevecdb/integrations/langchain.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/conftest.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/integration/test_langchain.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/integration/test_llamaindex.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/integration/test_v21_features.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/perf/test_batch_detection.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/perf/test_performance.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/__init__.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_batch_detection.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_factory_methods.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_initialization.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_quantization.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_similarity_search.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/core/test_v26_safety.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/embeddings/__init__.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/embeddings/test_models.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/embeddings/test_server_coverage.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/integrations/__init__.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/integrations/test_llamaindex_coverage.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/integrations/test_llamaindex_review_pass_3.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_async.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_async_v26.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_clustering.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_config.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_cross_collection_search.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_encryption.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_encryption_salt.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_hierarchy.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_search_coverage.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_streaming.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_types.py +0 -0
- {simplevecdb-2.6.0 → simplevecdb-2.6.1}/tests/unit/test_utils.py +0 -0
@@ -0,0 +1,70 @@
+# Python / uv
+.venv/
+__pycache__/
+*.py[cod]
+*.egg-info/
+.eggs/
+build/
+dist/
+.tox/
+.nox/
+
+# Tooling caches
+.mypy_cache/
+.pytest_cache/
+.ruff_cache/
+.cache/
+.hypothesis/
+cython_debug/
+
+# Coverage
+.coverage
+.coverage.*
+coverage.xml
+*.cover
+htmlcov/
+
+# Jupyter
+.ipynb_checkpoints/
+
+# Environment / secrets
+.env
+.envrc
+.direnv/
+
+# Databases (SimpleVecDB writes WAL/SHM sidecars)
+*.db
+*.db-journal
+*.db-shm
+*.db-wal
+*.sqlite
+*.sqlite3
+
+# Editor / IDE
+.vscode/
+.idea/
+.history/
+*.iml
+*.swp
+*.swo
+*~
+
+# OS
+.DS_Store
+Thumbs.db
+desktop.ini
+
+# Docs build
+site/
+
+# Agentic CLI tools (per-developer state)
+.opencode/
+opencode.json
+.claude/
+.codex
+
+# Project-specific scratch
+simplevecdb_plan.md
+AGENTS.md
+NEXT_UPDATES.md
+pro_pack/
@@ -5,6 +5,109 @@ All notable changes to SimpleVecDB will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [2.6.1] - 2026-05-10
+
+### Storage, mutation, and eventing improvements
+
+This release closes ten long-standing gaps in the catalog layer with a coherent
+set of additive primitives. No public API breaks; existing 2.6.0 databases
+upgrade transparently (the new tables are created on first open).
+
+#### New features
+
+- **Native vector update via pending buffer** — `collection.update_embedding(id, vector)`
+  writes a row to a per-collection `_pending_vectors` overlay inside one SQL
+  transaction; the new vector becomes visible to reads immediately and is
+  promoted to the HNSW index on `collection.pending.flush()`. Removes the
+  HNSW remove+re-add churn previously required for in-place updates.
+- **Bulk vector math** — `collection.pending.update_many([(id, vec), …])` and
+  `collection.pending.blend_toward(ids, centroid, alpha)` for batched edits.
+- **Atomic transaction boundary** — `with db.transaction() as tx: …` and
+  `with collection.tx(): …` wrap a SAVEPOINT around catalog writes
+  (metadata, counters, edges, events, TTL, and `update_embedding`'s
+  pending overlay) so a raised exception rolls all SQL writes back.
+  Coarse vector mutations (`add_texts`, `delete`) are NOT rolled back —
+  use `update_embedding` + `pending.flush()` for vector changes that
+  must be commit-gated. Nested contexts share a single savepoint stack
+  via the new `_TxState` helper.
+- **Weighted directed edges** — new `collection.edges` namespace with
+  `add_edge / get_edges / update_edge / delete_edge / prune` over a
+  per-collection `_edges` table. Numeric columns (`weight`, `bonus`, `hits`,
+  `last_touch`) are addressable by the new range-filter grammar; deltas
+  (`dweight=+0.02, dhits=+1`) compile to a single atomic SQL UPDATE.
+- **Atomic counter increments** — `collection.increment_metadata(id, {"hits": 1, "drift": 0.02})`
+  applies a dict of numeric deltas to JSON metadata in one statement using
+  chained `json_set(... json_extract + ?)` calls. WAL-atomic; safe under
+  concurrent writers.
+- **Mongo-style range filters** — `filter={"score": {"$gt": 0.5, "$lte": 0.9}}`
+  on `similarity_search`, `keyword_search`, `hybrid_search`, `edges.get_edges`,
+  and `events.read`. Supported operators: `$eq $ne $gt $gte $lt $lte $in $nin
+  $exists $between`. Tuple shorthand (`("range", lo, hi)`, `(">", x)`) is
+  normalised into the operator-dict form.
+- **Append-only change feed** — every mutating method now appends one row to
+  a per-collection `_events` table (kind, doc_id, payload, monotonic seq).
+  `collection.events.read(since=, kind=, limit=)`,
+  `collection.events.subscribe(since=, poll_interval=)`, and
+  `collection.events.prune(before_seq=)` expose the feed; cross-process
+  visibility comes from the existing WAL mode.
+- **TTL / expiry hooks** — `collection.ttl.set(id, seconds=…, on_expire="delete"|"callback")`,
+  `collection.ttl.clear(id)`, and `collection.ttl.sweep()` over a
+  `_ttl` table; `start_background(interval=…)` runs the sweep in a daemon
+  thread (off by default).
+- **Incremental rebuild scheduler** — `collection.maintenance.rebuild_if_needed(max_pending=, max_deleted=)`
+  triggers a full `rebuild_index()` only when the configured pending /
+  tombstone / wall-time thresholds are crossed.
+- **Multi-process write safety** — added `PRAGMA busy_timeout=5000` and
+  `PRAGMA foreign_keys=ON` at every connection-open site (encrypted and
+  unencrypted). The native 5 s wait window reduces `DatabaseLockedError`
+  pressure under contention; foreign keys cascade-delete pending /
+  edges / TTL rows when a doc is deleted. The events table is
+  intentionally FK-less so the audit trail survives deletions.
+- **Async wrappers** — `AsyncVectorCollection` gains async equivalents of the
+  new methods (`update_embedding`, `flush_pending`, `increment_metadata`,
+  `add_edge`, `update_edge`, `delete_edge`, `get_edges`, `set_ttl`,
+  `clear_ttl`, `sweep_ttl`, `read_events`, `last_event_seq`,
+  `rebuild_if_needed`).
+
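The SAVEPOINT behaviour described under "Atomic transaction boundary" can be illustrated with the standard-library `sqlite3` module. This is a minimal sketch of the general technique and not SimpleVecDB's implementation: the `savepoint` helper, the `meta` table, and the savepoint names are assumptions made for the example.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def savepoint(conn: sqlite3.Connection, name: str):
    """Run the block inside a named SAVEPOINT and undo it if the block raises."""
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield conn
    except BaseException:
        # Undo every write made since the savepoint, then pop it off the stack.
        conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
        conn.execute(f"RELEASE SAVEPOINT {name}")
        raise
    else:
        # Releasing the outermost savepoint commits the transaction.
        conn.execute(f"RELEASE SAVEPOINT {name}")


# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# SAVEPOINT statements above are the only thing controlling the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE meta (id INTEGER PRIMARY KEY, hits INTEGER NOT NULL)")
conn.execute("INSERT INTO meta (id, hits) VALUES (1, 0)")

try:
    with savepoint(conn, "outer_tx"):
        conn.execute("UPDATE meta SET hits = hits + 1 WHERE id = 1")
        with savepoint(conn, "inner_tx"):  # nested context, same stack
            conn.execute("UPDATE meta SET hits = hits + 10 WHERE id = 1")
        raise RuntimeError("simulated failure after both writes")
except RuntimeError:
    pass

hits_after_rollback = conn.execute("SELECT hits FROM meta WHERE id = 1").fetchone()[0]

with savepoint(conn, "outer_tx"):
    conn.execute("UPDATE meta SET hits = hits + 1 WHERE id = 1")

hits_after_commit = conn.execute("SELECT hits FROM meta WHERE id = 1").fetchone()[0]
```

The failed outer block discards the nested write as well, because releasing the inner savepoint only merges it into its parent. Nothing is durable until the outermost savepoint is released.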
+#### New types & constants
+
+- `simplevecdb.types`: `Edge`, `Event`, `TTLEntry` frozen dataclasses.
+- `simplevecdb.constants`: `PENDING_FLUSH_DEFAULT_BATCH=1000`,
+  `EVENTS_POLL_INTERVAL_S=0.1`, `EVENTS_RETENTION_LIMIT=100_000`,
+  `TTL_SWEEP_DEFAULT_INTERVAL_S=60.0`, `REBUILD_PENDING_THRESHOLD=5_000`,
+  `REBUILD_TOMBSTONE_THRESHOLD=5_000`, `REBUILD_MIN_INTERVAL_S=3600.0`,
+  `SQLITE_BUSY_TIMEOUT_MS=5000`.
+
+#### Test coverage
+
+- `tests/unit/test_v26_1_features.py` — 25 tests covering the five must-have
+  primitives end-to-end: `update_embedding` + pending buffer + flush; edges
+  CRUD with atomic deltas, range filtering, and prune; `increment_metadata`
+  under 800-thread contention (exact total preserved); transaction rollback
+  and commit semantics; Mongo-style and tuple-shorthand range filters in
+  `similarity_search`; events append on every mutation; TTL sweep with
+  `delete` and `callback` paths; threshold-driven rebuild scheduler.
+
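The "chained `json_set(... json_extract + ?)`" approach named in the changelog can be sketched with plain `sqlite3`. The function below is a hypothetical stand-in, not the library's `increment_metadata`: the `docs` table, its `metadata` column, and the restriction to identifier-like keys are assumptions. It also assumes the SQLite build includes the JSON functions, which are built in by default since SQLite 3.38.

```python
import json
import sqlite3


def increment_metadata(conn: sqlite3.Connection, doc_id: int, deltas: dict) -> None:
    """Apply numeric deltas to a JSON metadata column in a single UPDATE."""
    expr = "metadata"
    params: list = []
    for key, delta in deltas.items():
        if not key.isidentifier():
            raise ValueError(f"unsupported metadata key: {key!r}")
        path = f"$.{key}"
        # Wrap the expression built so far; a missing key counts as 0.
        expr = f"json_set({expr}, ?, COALESCE(json_extract(metadata, ?), 0) + ?)"
        params += [path, path, delta]
    # One statement: the read and the write of each counter cannot interleave
    # with another writer's update of the same row.
    conn.execute(f"UPDATE docs SET metadata = {expr} WHERE id = ?", [*params, doc_id])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, metadata TEXT NOT NULL)")
conn.execute("INSERT INTO docs (id, metadata) VALUES (1, ?)", (json.dumps({"hits": 0}),))

increment_metadata(conn, 1, {"hits": 1, "drift": 0.02})
increment_metadata(conn, 1, {"hits": 1, "drift": 0.02})

meta = json.loads(conn.execute("SELECT metadata FROM docs WHERE id = 1").fetchone()[0])
```

Because the arithmetic happens inside SQLite rather than in a Python read-modify-write cycle, concurrent writers cannot lose each other's increments, which is the property the 800-thread test in the changelog checks.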
+#### Removed
+
+- **`sqlite-vec` dependency** dropped from `pyproject.toml`. The package was
+  never imported and the v1.x → v2.0 auto-migration code path could not have
+  worked without explicitly loading the extension at connection time.
+- **`MigrationRequiredError`**, **`VectorDB.check_migration()`**, and the
+  **`auto_migrate=`** constructor flag have been removed. Databases written
+  by `simplevecdb < 2.0.0` (sqlite-vec backend) are no longer auto-migrated
+  on open. To upgrade a v1.x database, dump the rows with a v1.x install and
+  re-ingest them through the v2 API; or stay on the last release that shipped
+  the migration path (anything ≤ v2.6.1's predecessor).
+- The catalog helpers `check_legacy_sqlite_vec`, `get_legacy_vectors`, and
+  `drop_legacy_vec_table` are gone alongside the migration entry point.
+
+#### Out of scope
+
+- No external pub/sub for events — polling only.
+- No multi-master writer support; single-writer + many readers remains the
+  recommended topology.
+
 ## [2.6.0] - 2026-05-06
 
 ### Review pass 3 — final correctness/security pass before tag
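The Mongo-style range filters listed in the 2.6.1 changelog entry map naturally onto a parameterised SQL `WHERE` clause. The sketch below shows one way such a compiler could look. It is illustrative only: `normalise`, `compile_filter`, the `metadata` column name, and the treatment of JSON `null` as "missing" are assumptions, and the library's actual grammar may differ.

```python
from typing import Any

_COMPARISONS = {"$eq": "=", "$ne": "!=", "$gt": ">", "$gte": ">=", "$lt": "<", "$lte": "<="}
_SYMBOL_TO_OP = {symbol: op for op, symbol in _COMPARISONS.items()}


def normalise(cond: Any) -> dict:
    """Turn tuple shorthand or a bare value into operator-dict form."""
    if isinstance(cond, tuple):
        if cond[0] == "range":  # ("range", lo, hi)
            return {"$between": [cond[1], cond[2]]}
        return {_SYMBOL_TO_OP[cond[0]]: cond[1]}  # (">", x)
    if isinstance(cond, dict):
        return cond
    return {"$eq": cond}  # {"tag": "work"} means equality


def compile_filter(flt: dict) -> tuple[str, list]:
    """Compile a filter dict into a WHERE fragment plus bound parameters."""
    clauses: list[str] = []
    params: list = []
    for field, cond in flt.items():
        if not field.isidentifier():  # field names are interpolated, so vet them
            raise ValueError(f"unsupported field name: {field!r}")
        col = f"json_extract(metadata, '$.{field}')"
        for op, value in normalise(cond).items():
            if op in _COMPARISONS:
                clauses.append(f"{col} {_COMPARISONS[op]} ?")
                params.append(value)
            elif op == "$between":
                clauses.append(f"{col} BETWEEN ? AND ?")
                params.extend(value)
            elif op in ("$in", "$nin"):
                marks = ", ".join("?" for _ in value)
                negate = "NOT " if op == "$nin" else ""
                clauses.append(f"{col} {negate}IN ({marks})")
                params.extend(value)
            elif op == "$exists":
                # A JSON null is treated like a missing key in this sketch.
                clauses.append(f"{col} IS {'NOT ' if value else ''}NULL")
            else:
                raise ValueError(f"unknown operator: {op}")
    return " AND ".join(clauses), params


sql, params = compile_filter({"score": {"$gt": 0.5, "$lte": 0.9}, "tag": "work"})
range_sql, range_params = compile_filter({"score": ("range", 0.1, 0.4)})
```

Values always travel as bound parameters. Only the vetted field name is interpolated into the SQL text, which keeps the fragment safe to append to a larger query.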
@@ -0,0 +1,377 @@
+Metadata-Version: 2.4
+Name: simplevecdb
+Version: 2.6.1
+Summary: Dead-simple local vector database powered by usearch HNSW.
+Project-URL: Homepage, https://github.com/CoderDayton/simplevecdb
+Project-URL: Repository, https://github.com/CoderDayton/simplevecdb
+Project-URL: Issues, https://github.com/CoderDayton/simplevecdb/issues
+Project-URL: Changelog, https://github.com/CoderDayton/simplevecdb/blob/main/CHANGELOG.md
+Author-email: Dayton Dunbar <coderdayton14@gmail.com>
+License: MIT
+License-File: LICENSE
+Keywords: embeddings,hnsw,langchain,llamaindex,rag,similarity-search,sqlite,usearch,vector-database,vectordb
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: Database
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Classifier: Topic :: Scientific/Engineering :: Information Analysis
+Classifier: Typing :: Typed
+Requires-Python: >=3.10
+Requires-Dist: cryptography>=41.0
+Requires-Dist: hdbscan>=0.8.33
+Requires-Dist: numpy>=1.24
+Requires-Dist: python-dotenv>=1.0
+Requires-Dist: scikit-learn>=1.3.0
+Requires-Dist: sqlcipher3-binary>=0.5.0
+Requires-Dist: usearch>=2.16.3
+Provides-Extra: examples
+Requires-Dist: ollama; extra == 'examples'
+Provides-Extra: integrations
+Requires-Dist: langchain-core>=1.0.7; extra == 'integrations'
+Requires-Dist: langchain-openai>=1.0.3; extra == 'integrations'
+Requires-Dist: llama-index-llms-ollama>=0.9.0; extra == 'integrations'
+Requires-Dist: llama-index-llms-openai-like>=0.5.3; extra == 'integrations'
+Requires-Dist: llama-index>=0.14.8; extra == 'integrations'
+Provides-Extra: server
+Requires-Dist: fastapi>=0.115; extra == 'server'
+Requires-Dist: sentence-transformers>=5.0; extra == 'server'
+Requires-Dist: uvicorn[standard]>=0.30; extra == 'server'
+Description-Content-Type: text/markdown
+
+# SimpleVecDB
+
+[](https://github.com/coderdayton/simplevecdb/actions)
+[](https://pypi.org/project/simplevecdb/)
+[](LICENSE)
+[](https://github.com/coderdayton/simplevecdb)
+
+<a href='https://ko-fi.com/U7U01WTJF9' target='_blank'><img height='36' style='border:0px;height:36px;' src='https://storage.ko-fi.com/cdn/kofi3.png?v=6' border='0' alt='Buy Me a Coffee at ko-fi.com' /></a>
+
+**The dead-simple, local-first vector database.**
+
+SimpleVecDB brings **Chroma-like simplicity** to a single **SQLite file**. Built on `usearch` HNSW indexing, it offers high-performance vector search, quantization, and zero infrastructure headaches. Perfect for local RAG, offline agents, and indie hackers who need production-grade vector search without the operational overhead.
+
+## Why SimpleVecDB?
+
+- **Zero Infrastructure** — Just a `.db` file. No Docker, no Redis, no cloud bills.
+- **Blazing Fast** — 10-100x faster search via usearch HNSW. Adaptive: brute-force for <10k vectors (perfect recall), HNSW for larger collections.
+- **Truly Portable** — Runs anywhere SQLite runs: Linux, macOS, Windows, even WASM.
+- **Async Ready** — Full async/await support with optional executor injection for thread-safe ONNX/usearch sharing.
+- **Batteries Included** — Optional FastAPI embeddings server + LangChain/LlamaIndex integrations via `[integrations]` extra.
+- **Production Ready** — Hybrid search (BM25 + vector), metadata filtering, multi-collection support, and automatic hardware acceleration.
+
+### When to Choose SimpleVecDB
+
+| Use Case | SimpleVecDB | Cloud Vector DB |
+| :----------------------------- | :-------------------- | :----------------------- |
+| **Local RAG applications** | ✅ Perfect fit | ❌ Overkill + latency |
+| **Offline-first agents** | ✅ No internet needed | ❌ Requires connectivity |
+| **Prototyping & MVPs** | ✅ Zero config | ⚠️ Setup overhead |
+| **Multi-tenant SaaS at scale** | ⚠️ Consider sharding | ✅ Built for this |
+| **Budget-conscious projects** | ✅ $0/month | ❌ $50-500+/month |
+
+## Prerequisites
+
+**System Requirements:**
+
+- Python 3.10+
+- SQLite 3.35+ with FTS5 support (included in Python 3.8+ standard library)
+- 50MB+ disk space for core library, 500MB+ with `[server]` extras
+
+**Optional for GPU Acceleration:**
+
+- CUDA 11.8+ for NVIDIA GPUs
+- Metal Performance Shaders (MPS) for Apple Silicon
+
+> **Note:** If using custom-compiled SQLite, ensure `-DSQLITE_ENABLE_FTS5` is enabled for full-text search support.
+
+## Installation
+
+```bash
+# Standard installation (includes clustering, encryption)
+pip install simplevecdb
+
+# With LangChain & LlamaIndex integrations
+pip install "simplevecdb[integrations]"
+
+# With local embeddings server (adds 500MB+ models)
+pip install "simplevecdb[server]"
+```
+
+**What's included by default:**
+- Vector search with HNSW indexing
+- Clustering (K-means, MiniBatch K-means, HDBSCAN)
+- Encryption (SQLCipher AES-256)
+- Async support
+
+**Verify Installation:**
+
+```bash
+python -c "from simplevecdb import VectorDB; print('SimpleVecDB installed successfully!')"
+```
+
+## Quickstart
+
+SimpleVecDB is just a storage and search layer — it doesn't ship an LLM
+and won't generate embeddings for you. Bring whichever embedding source
+you already use; three common ones below.
+
+### Option 1: OpenAI embeddings
+
+```python
+from simplevecdb import VectorDB
+from openai import OpenAI
+
+client = OpenAI()
+db = VectorDB("notes.db")
+notes = db.collection("personal")
+
+def embed(text: str) -> list[float]:
+    return (
+        client.embeddings
+        .create(model="text-embedding-3-small", input=text)
+        .data[0].embedding
+    )
+
+entries = [
+    ("Cherry MX silent reds bottom out around 45g — quieter than browns", "keyboards"),
+    ("Sourdough hydration sweet spot is ~75% with this flour", "baking"),
+    ("EXPLAIN ANALYZE showed seq scan; ANALYZE on the table fixed it", "work"),
+    ("Passport renewal took 3 weeks, not the advertised 6–8", "admin"),
+]
+
+notes.add_texts(
+    texts=[t for t, _ in entries],
+    embeddings=[embed(t) for t, _ in entries],
+    metadatas=[{"tag": tag} for _, tag in entries],
+)
+
+hits = notes.similarity_search(embed("how loud are silent reds"), k=2)
+for doc, score in hits:
+    print(f"{score:.3f} {doc.page_content}")
+
+work = notes.similarity_search(
+    embed("query plan slow"),
+    k=5,
+    filter={"tag": "work"},
+)
+```
+
+### Option 2: Fully local (no network, no API key)
+
+```bash
+pip install "simplevecdb[server]"
+```
+
+```python
+from simplevecdb import VectorDB
+from simplevecdb.embeddings.models import embed_texts
+
+db = VectorDB("notes.db")
+notes = db.collection("personal")
+
+texts = [
+    "Cherry MX silent reds bottom out around 45g",
+    "Sourdough hydration sweet spot is ~75% with this flour",
+    "EXPLAIN ANALYZE showed seq scan; ANALYZE on the table fixed it",
+]
+notes.add_texts(texts=texts, embeddings=embed_texts(texts))
+
+vec = notes.similarity_search(embed_texts(["quieter switches"])[0], k=2)
+mixed = notes.hybrid_search("postgres slow query", k=3)
+```
+
+If you'd rather hit an HTTP endpoint than import the embedding models
+directly, the bundled server speaks the same shape as OpenAI's
+embeddings API:
+
+```bash
+simplevecdb-server --port 8000 # default model, auto warm-up
+simplevecdb-server --host 0.0.0.0 --port 9000
+simplevecdb-server --no-warmup # skip the model preload
+simplevecdb-server --help
+```
+
+Server tuning (model registry, rate limits, API keys, CORS, CUDA) lives
+in the [Setup Guide](ENV_SETUP.md).
+
+### Option 3: LangChain or LlamaIndex
+
+Already wired into one of the big RAG frameworks? Drop SimpleVecDB in
+as the vector store:
+
+```bash
+pip install "simplevecdb[integrations]"
+```
+
+```python
+from simplevecdb.integrations.langchain import SimpleVecDBVectorStore
+from langchain_openai import OpenAIEmbeddings
+
+store = SimpleVecDBVectorStore(
+    db_path="notes.db",
+    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
+)
+
+store.add_texts([
+    "Cherry MX silent reds bottom out around 45g",
+    "EXPLAIN ANALYZE showed seq scan; ANALYZE on the table fixed it",
+])
+store.similarity_search("quieter switches", k=1)
+store.hybrid_search("postgres performance", k=3)
+```
+
+LlamaIndex is the same shape:
+
+```python
+from simplevecdb.integrations.llamaindex import SimpleVecDBLlamaStore
+from llama_index.embeddings.openai import OpenAIEmbedding
+
+store = SimpleVecDBLlamaStore(
+    db_path="notes.db",
+    embedding=OpenAIEmbedding(model="text-embedding-3-small"),
+)
+```
+
+End-to-end notebooks (including a fully local Ollama RAG) live in the
+[examples gallery](https://coderdayton.github.io/SimpleVecDB/examples/).
+
|
|
246
|
+
## Feature Highlights
|
|
247
|
+
|
|
248
|
+
A few of the things SimpleVecDB does well — see
|
|
249
|
+
[`docs/Features.md`](docs/Features.md) for the comprehensive list.
|
|
250
|
+
|
|
251
|
+
- **Vector + keyword + hybrid search** — cosine / L2 similarity, BM25
|
|
252
|
+
via SQLite FTS5, and Reciprocal Rank Fusion in one collection.
|
|
253
|
+
- **Adaptive HNSW** — brute-force for <10k vectors (perfect recall),
|
|
254
|
+
`usearch` HNSW above that. Override per query with `exact=True/False`.
|
|
255
|
+
- **Quantization** — `FLOAT32`, `FLOAT16`, `INT8`, `BIT` for 1×–32×
|
|
256
|
+
compression.
|
|
257
|
+
- **Multi-collection + cross-collection search** — isolated namespaces in
|
|
258
|
+
one `.db` file, with merged ranked search across them.
|
|
259
|
+
- **Mongo-style filters** — `$eq $ne $gt $gte $lt $lte $in $nin $exists
|
|
260
|
+
$between` on metadata, edges, and events.
|
|
261
|
+
- **Memory primitives (v2.6.1)** — pending-vector buffer with atomic
|
|
262
|
+
flush, weighted directed edges, append-only event feed, TTL with
|
|
263
|
+
delete/callback sweep, and a threshold-driven rebuild scheduler.
|
|
264
|
+
- **Atomic counters & transactions (v2.6.1)** — `increment_metadata` for
|
|
265
|
+
JSON deltas in one statement; SAVEPOINT-backed `db.transaction()` /
|
|
266
|
+
`collection.tx()` rolling all catalog writes back on error.
|
|
267
|
+
- **Async, encryption, clustering, hierarchies** — full async surface
|
|
268
|
+
(with executor injection), SQLCipher AES-256, K-means / MiniBatch
|
|
269
|
+
K-means / HDBSCAN, parent/child relationships.
|
|
270
|
+
- **Framework integrations** — drop-in `LangChain` and `LlamaIndex`
|
|
271
|
+
adapters via the `[integrations]` extra; optional FastAPI embeddings
|
|
272
|
+
server via `[server]`.
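
The hybrid search bullet above mentions Reciprocal Rank Fusion (RRF). As a rough illustration of the general technique, and not SimpleVecDB's own implementation, the sketch below fuses two ranked ID lists in plain Python. The function name `rrf_fuse`, the sample document IDs, and the constant `k=60` are assumptions made for this example; 60 is a commonly used RRF default, and the library may use a different value.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Hypothetical helper for illustration; not part of the SimpleVecDB API.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Each list contributes 1 / (k + rank) for every ID it contains
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["doc_a", "doc_b", "doc_c"]   # e.g. from similarity search
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # e.g. from BM25 keyword search
fused = rrf_fuse([vector_hits, keyword_hits])
print(fused)
```

Documents that appear in both lists accumulate two contributions, so they rise above documents that only one retriever found. RRF uses ranks rather than raw scores, which is why cosine similarities and BM25 scores can be combined without normalizing them first.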

For full method-level coverage, see [the Features doc](docs/Features.md)
or the [API reference](https://coderdayton.github.io/SimpleVecDB/api/core).
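
To make the Mongo-style filter operators from the feature list more concrete, here is a small pure-Python matcher. It is a sketch of how such operators are commonly interpreted, not the library's code: the `matches` function is hypothetical, `$between` is treated as an inclusive `[low, high]` range, and a missing field fails every operator except `{"$exists": False}`. All three choices are assumptions; check the API reference for the actual semantics.

```python
import operator

_MISSING = object()  # sentinel so a stored None differs from an absent field

COMPARISONS = {
    "$eq": operator.eq, "$ne": operator.ne,
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
}


def matches(metadata, filters):
    """Return True if a metadata dict satisfies every filter condition."""
    for field, condition in filters.items():
        value = metadata.get(field, _MISSING)
        # A bare value is treated as shorthand for {"$eq": value}
        if not isinstance(condition, dict):
            condition = {"$eq": condition}
        for op, operand in condition.items():
            if op == "$exists":
                ok = (value is not _MISSING) == bool(operand)
            elif value is _MISSING:
                ok = False  # assumption: absent fields never match
            elif op == "$in":
                ok = value in operand
            elif op == "$nin":
                ok = value not in operand
            elif op == "$between":
                low, high = operand
                ok = low <= value <= high  # assumption: inclusive bounds
            elif op in COMPARISONS:
                ok = COMPARISONS[op](value, operand)
            else:
                raise ValueError(f"unsupported operator: {op}")
            if not ok:
                return False
    return True


meta = {"year": 2024, "tag": "db"}
recent_db = matches(meta, {"year": {"$gte": 2020, "$lt": 2025},
                           "tag": {"$in": ["db", "ml"]}})
old_only = matches(meta, {"year": {"$between": [2000, 2010]}})
no_author = matches(meta, {"author": {"$exists": False}})
print(recent_db, old_only, no_author)
```

Conditions on different fields, and multiple operators on one field, are combined with logical AND in this sketch.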


## Performance Benchmarks

**10,000 vectors, 384 dimensions, k=10 search** — [Full benchmarks →](https://coderdayton.github.io/SimpleVecDB/benchmarks)

| Quantization | Storage  | Query Time | Compression |
| :----------- | :------- | :--------- | :---------- |
| FLOAT32      | 36.0 MB  | 0.20 ms    | 1x          |
| FLOAT16      | 28.7 MB  | 0.20 ms    | 2x          |
| INT8         | 25.0 MB  | 0.16 ms    | 4x          |
| BIT          | 21.8 MB  | 0.08 ms    | 32x         |

**Key highlights:**
- **3-34x faster** than brute-force for collections >10k vectors
- **Adaptive search**: perfect recall for small collections, HNSW for large
- **FLOAT16 recommended**: best balance of speed, memory, and precision
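
The compression ratios in the table follow from the number of bits stored per dimension. As a back-of-the-envelope check, the sketch below computes the raw vector payload for the benchmark configuration (10,000 vectors, 384 dimensions). The bit widths are the standard sizes of each numeric type, and the helper name `raw_payload_bytes` is made up for this example. The storage figures in the table are larger than these payloads, presumably because the database file holds more than the raw vectors; that reading is an assumption, not something the benchmark states.

```python
# Bits stored per dimension for each quantization level
BITS_PER_DIM = {"FLOAT32": 32, "FLOAT16": 16, "INT8": 8, "BIT": 1}


def raw_payload_bytes(n_vectors, dim, bits_per_dim):
    """Bytes needed for the vector values alone (hypothetical helper)."""
    # Round each vector up to a whole number of bytes
    bytes_per_vector = (dim * bits_per_dim + 7) // 8
    return n_vectors * bytes_per_vector


payloads = {
    name: raw_payload_bytes(10_000, 384, bits)
    for name, bits in BITS_PER_DIM.items()
}

for name, size in payloads.items():
    ratio = payloads["FLOAT32"] / size
    print(f"{name:8s} {size / 1e6:6.2f} MB  {ratio:.0f}x")
```

The ratios come out as 1x, 2x, 4x, and 32x, matching the Compression column. They describe the vector payload only, not the total file size.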

## Documentation

- **[Features](docs/Features.md)** — Comprehensive list of every capability, grouped by area
- **[Setup Guide](https://coderdayton.github.io/SimpleVecDB/ENV_SETUP)** — Environment variables, server configuration, authentication
- **[API Reference](https://coderdayton.github.io/SimpleVecDB/api/core)** — Complete class/method documentation with type signatures
- **[Benchmarks](https://coderdayton.github.io/SimpleVecDB/benchmarks)** — Quantization strategies, batch sizes, hardware optimization
- **[Integration Examples](https://coderdayton.github.io/SimpleVecDB/examples)** — RAG notebooks, Ollama workflows, production patterns
- **[Contributing Guide](CONTRIBUTING.md)** — Development setup, testing, PR guidelines

## Troubleshooting

**Missing FTS5 Module: `sqlite3.OperationalError: no such module: fts5`**

```bash
# Your Python's SQLite was compiled without FTS5
# Solution: Install Python from python.org (includes FTS5) or compile SQLite with:
# -DSQLITE_ENABLE_FTS5
```
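
To confirm whether the SQLite library bundled with a given Python build supports FTS5, one option is to try creating an FTS5 virtual table in an in-memory database. The snippet below is a minimal sketch that uses only the standard `sqlite3` module; the function name `has_fts5` and the table name `fts5_probe` are arbitrary choices for this example.

```python
import sqlite3


def has_fts5():
    """Return True if this Python's SQLite can create an FTS5 virtual table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE fts5_probe USING fts5(content)")
        return True
    except sqlite3.OperationalError:
        # Raised with "no such module: fts5" when FTS5 is not compiled in
        return False
    finally:
        conn.close()


fts5_available = has_fts5()
print("SQLite version:", sqlite3.sqlite_version)
print("FTS5 available:", fts5_available)
```

If this prints `False`, the error comes from the interpreter's SQLite build rather than from SimpleVecDB, so switching to a Python build whose SQLite includes FTS5 is the fix.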

**Dimension Mismatch Error**

```python
# Ensure all vectors in a collection have identical dimensions
collection = db.collection("docs", dim=384)  # Explicit dimension
```
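
When the mismatch comes from mixing embedding models, it can help to validate a batch before inserting it. The helper below is a plain-Python sketch; `check_dimensions` is a hypothetical function, not part of SimpleVecDB, and it assumes vectors are passed as sequences of floats.

```python
def check_dimensions(vectors, dim):
    """Raise ValueError for the first vector whose length differs from dim.

    Returns the number of vectors checked when all of them match.
    """
    for i, vec in enumerate(vectors):
        if len(vec) != dim:
            raise ValueError(
                f"vector {i} has {len(vec)} dimensions, expected {dim}"
            )
    return len(vectors)


good = [[0.0] * 384, [1.0] * 384]
bad = [[0.0] * 384, [1.0] * 768]  # second vector came from a different model

checked = check_dimensions(good, 384)

try:
    check_dimensions(bad, 384)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(checked, error_message)
```

Running a check like this before the insert call reports which vector is wrong, which is usually quicker to debug than an error raised inside the database layer.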

**CUDA Not Detected (GPU Available)**

```bash
# Verify CUDA installation
python -c "import torch; print(torch.cuda.is_available())"

# Reinstall PyTorch with CUDA support
pip install torch --index-url https://download.pytorch.org/whl/cu118
```

**Slow Queries on Large Datasets**

- Enable quantization: `collection = db.collection("docs", quantization=Quantization.INT8)`
- For >10k vectors, HNSW is automatic; tune with `rebuild_index(connectivity=32)`
- Use `exact=False` to force HNSW even on smaller collections
- Use metadata filtering to reduce search space

## Roadmap

What's on the near-term radar:

- [ ] Incremental clustering (online learning)
- [ ] Cluster visualization exports

For shipped capabilities, see [`docs/Features.md`](docs/Features.md) and the
release-by-release [Changelog](CHANGELOG.md). Vote on these or propose new
ideas in [GitHub Discussions](https://github.com/coderdayton/simplevecdb/discussions).

## Contributing

Contributions are welcome! Whether you're fixing bugs, improving documentation, or proposing new features:

1. Read [CONTRIBUTING.md](CONTRIBUTING.md) for development setup
2. Check existing [Issues](https://github.com/coderdayton/simplevecdb/issues) and [Discussions](https://github.com/coderdayton/simplevecdb/discussions)
3. Open a PR with clear description and tests

## Community & Support

**Get Help:**

- [GitHub Discussions](https://github.com/coderdayton/simplevecdb/discussions) — Q&A and feature requests
- [GitHub Issues](https://github.com/coderdayton/simplevecdb/issues) — Bug reports

**Stay Updated:**

- [GitHub Releases](https://github.com/coderdayton/simplevecdb/releases) — Changelog and updates
- [Examples Gallery](https://coderdayton.github.io/SimpleVecDB/examples/) — Community-contributed notebooks

## Other Ways to Support

- ☕ **[Buy me a coffee](https://ko-fi.com/xbbvii)** - One-time donation
- ⭐ **Star the repo** - Helps with visibility
- 🐛 **Report bugs** - Improve the project for everyone
- 📝 **Contribute** - See [CONTRIBUTING.md](CONTRIBUTING.md)

## License

[MIT License](LICENSE) — Free for personal and commercial use.