PyPI - endee-llamaindex - Versions diffs - 0.1.3__py3-none-any.whl → 0.1.5__py3-none-any.whl - Mend

endee-llamaindex 0.1.3__py3-none-any.whl → 0.1.5__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

endee_llamaindex/base.py +613 -658
endee_llamaindex/constants.py +70 -0
endee_llamaindex/utils.py +7 -587
{endee_llamaindex-0.1.3.dist-info → endee_llamaindex-0.1.5.dist-info}/METADATA +173 -50
endee_llamaindex-0.1.5.dist-info/RECORD +8 -0
{endee_llamaindex-0.1.3.dist-info → endee_llamaindex-0.1.5.dist-info}/WHEEL +1 -1
endee_llamaindex-0.1.3.dist-info/RECORD +0 -7
{endee_llamaindex-0.1.3.dist-info → endee_llamaindex-0.1.5.dist-info}/top_level.txt +0 -0

{endee_llamaindex-0.1.3.dist-info → endee_llamaindex-0.1.5.dist-info}/METADATA RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: endee-llamaindex
-Version: 0.1.3
+Version: 0.1.5
 Summary: Vector Database for Fast ANN Searches
 Home-page: https://endee.io
 Author: Endee Labs
@@ -11,13 +11,17 @@ Classifier: Operating System :: OS Independent
 Requires-Python: >=3.6
 Description-Content-Type: text/markdown
 Requires-Dist: llama-index>=0.12.34
-Requires-Dist: endee>=0.1.4
+Requires-Dist: endee==0.1.9
+Requires-Dist: fastembed>=0.3.0
+Provides-Extra: gpu
+Requires-Dist: fastembed-gpu>=0.3.0; extra == "gpu"
 Dynamic: author
 Dynamic: author-email
 Dynamic: classifier
 Dynamic: description
 Dynamic: description-content-type
 Dynamic: home-page
+Dynamic: provides-extra
 Dynamic: requires-dist
 Dynamic: requires-python
 Dynamic: summary
@@ -31,18 +35,19 @@ Build powerful RAG applications with Endee vector database and LlamaIndex.
 ## Table of Contents
 1. [Installation](#1-installation)
-2. [Setting up Credentials](#2-setting-up-endee-and-openai-credentials)
-3. [Creating Sample Documents](#3-creating-sample-documents)
-4. [Setting up Endee with LlamaIndex](#4-setting-up-endee-with-llamaindex)
-5. [Creating a Vector Index](#5-creating-a-vector-index-from-documents)
-6. [Basic Retrieval](#6-basic-retrieval-with-query-engine)
-7. [Using Metadata Filters](#7-using-metadata-filters)
-8. [Advanced Filtering](#8-advanced-filtering-with-multiple-conditions)
-9. [Custom Retriever Setup](#9-custom-retriever-setup)
-10. [Custom Retriever with Query Engine](#10-using-a-custom-retriever-with-a-query-engine)
-11. [Direct VectorStore Querying](#11-direct-vectorstore-querying)
-12. [Saving and Loading Indexes](#12-saving-and-loading-indexes)
-13. [Cleanup](#13-cleanup)
+2. [Testing locally](#testing-locally)
+3. [Setting up Credentials](#2-setting-up-endee-and-openai-credentials)
+4. [Creating Sample Documents](#3-creating-sample-documents)
+5. [Setting up Endee with LlamaIndex](#4-setting-up-endee-with-llamaindex)
+6. [Creating a Vector Index](#5-creating-a-vector-index-from-documents)
+7. [Basic Retrieval](#6-basic-retrieval-with-query-engine)
+8. [Using Metadata Filters](#7-using-metadata-filters)
+9. [Advanced Filtering](#8-advanced-filtering-with-multiple-conditions)
+10. [Custom Retriever Setup](#9-custom-retriever-setup)
+11. [Custom Retriever with Query Engine](#10-using-a-custom-retriever-with-a-query-engine)
+12. [Direct VectorStore Querying](#11-direct-vectorstore-querying)
+13. [Saving and Loading Indexes](#12-saving-and-loading-indexes)
+14. [Cleanup](#13-cleanup)
 ---
@@ -50,12 +55,66 @@ Build powerful RAG applications with Endee vector database and LlamaIndex.
 Get started by installing the required package.
+### Basic Installation (Dense-only search)
 ```bash
 pip install endee-llamaindex
 ```
 > **Note:** This will automatically install `endee` and `llama-index` as dependencies.
+### Full Installation (with Hybrid Search support)
+For hybrid search capabilities (dense + sparse vectors), install with the `hybrid` extra:
+```bash
+pip install endee-llamaindex[hybrid]
+```
+This includes FastEmbed for sparse vector encoding (SPLADE, BM25, etc.).
+### GPU-Accelerated Hybrid Search
+For GPU-accelerated sparse encoding:
+```bash
+pip install endee-llamaindex[hybrid-gpu]
+```
+### All Features
+To install all optional dependencies:
+```bash
+pip install endee-llamaindex[all]
+```
+### Installation Options Summary
+| Installation | Use Case | Includes |
+|--------------|----------|----------|
+| `pip install endee-llamaindex` | Dense vector search only | Core dependencies |
+| `pip install endee-llamaindex[hybrid]` | Dense + sparse hybrid search | + FastEmbed (CPU) |
+| `pip install endee-llamaindex[hybrid-gpu]` | GPU-accelerated hybrid search | + FastEmbed (GPU) |
+| `pip install endee-llamaindex[all]` | All features | All optional deps |
+---
+## Testing locally
+From the project root:
+```bash
+python -m venv env && source env/bin/activate   # optional
+pip install -e .
+pip install pytest sentence-transformers huggingface-hub
+export ENDEE_API_TOKEN="your-endee-api-token"   # or set in endee_llamaindex/test_cases/setup_class.py
+cd endee_llamaindex/test_cases && PYTHONPATH=.. python -m pytest . -v
+```
+See [TESTING.md](TESTING.md) for more options and single-test runs.
 ---
 ## 2. Setting up Endee and OpenAI credentials
@@ -145,7 +204,7 @@ vector_store = EndeeVectorStore.from_params(
     index_name=index_name,
     dimension=dimension,
     space_type="cosine",  # Can be "cosine", "l2", or "ip"
-    precision="medium"  # Index precision: "low", "medium", "high", or None
+    precision="float16"  # Options: "binary", "float16", "float32", "int16d", "int8d" (default: "float16")
 )
 # Create storage context with our vector store
@@ -160,9 +219,37 @@ print(f"Initialized Endee vector store with index: {index_name}")
 |-----------|-------------|---------|
 | `space_type` | Distance metric for similarity | `cosine`, `l2`, `ip` |
 | `dimension` | Vector dimension | Must match embedding model |
-| `precision` | Index precision setting | `"low"`, `"medium"` (default), `"high"`, or `None` |
-| `key` | Encryption key for metadata | 256-bit hex key (64 hex characters) |
+| `precision` | Index precision setting | `"binary"`, `"float16"` (default), `"float32"`, `"int16d"`, `"int8d"` |
 | `batch_size` | Vectors per API call | Default: `100` |
+| `hybrid` | Enable hybrid search (dense + sparse) | Default: `False` |
+| `M` | Optional HNSW M parameter (bi-directional links) | Optional (backend default if not specified) |
+| `ef_con` | Optional HNSW ef_construction parameter | Optional (backend default if not specified) |
+### Hybrid Search and Sparse Models
+When you enable hybrid search by providing a positive `sparse_dim` and a `model_name`, the vector store automatically computes sparse (bag-of-words‑style) vectors in addition to dense vectors.
+- **Sparse dimension (`sparse_dim`)**:
+  - For the built-in SPLADE models, the recommended `sparse_dim` is **30522** (matching the model vocabulary size).
+  - For dense‑only search, omit `sparse_dim` (or set it to `0`).
+- **Supported sparse models (`model_name`)**:
+  - `"splade_pp"` → `prithivida/Splade_PP_en_v1` (SPLADE++)
+  - `"splade_cocondenser"` → `naver/splade-cocondenser-ensembledistil`
+Example hybrid configuration:
+```python
+vector_store = EndeeVectorStore.from_params(
+    api_token=endee_api_token,
+    index_name=index_name,
+    dimension=dimension,        # dense dimension (e.g., 1536 for OpenAI)
+    space_type="cosine",
+    precision="float16",
+    hybrid=True,
+    sparse_dim=30522,           # sparse dimension for SPLADE models
+    model_name="splade_pp",     # or "splade_cocondenser"
+)
+```
 ---
@@ -239,16 +326,47 @@ print(response)
 ### Available Filter Operators
-| Operator | Description |
-|----------|-------------|
-| `FilterOperator.EQ` | Equal to |
-| `FilterOperator.NE` | Not equal to |
-| `FilterOperator.GT` | Greater than |
-| `FilterOperator.GTE` | Greater than or equal |
-| `FilterOperator.LT` | Less than |
-| `FilterOperator.LTE` | Less than or equal |
-| `FilterOperator.IN` | In list |
-| `FilterOperator.NIN` | Not in list |
+| Operator | Description | Backend Symbol | Example |
+|----------|-------------|----------------|---------|
+| `FilterOperator.EQ` | Equal to | `$eq` | `rating == 5` |
+| `FilterOperator.IN` | In list | `$in` | `category in ["ai", "ml"]` |
+> **Important Notes:**
+> - Currently, the Endee LlamaIndex integration only supports **EQ** and **IN** metadata filters.
+> - Range-style operators (LT, LTE, GT, GTE) are **not** supported in this adapter.
+### Filter Examples
+Here are practical examples showing how to use the supported filter operators:
+```python
+from llama_index.core.vector_stores.types import MetadataFilters, MetadataFilter, FilterOperator
+# Example 1: Equal to (EQ)
+# Find documents with rating equal to 5
+rating_filter = MetadataFilter(key="rating", value=5, operator=FilterOperator.EQ)
+filters = MetadataFilters(filters=[rating_filter])
+# Backend: {"rating": {"$eq": 5}}
+# Example 2: In list (IN)
+# Find documents in AI or ML categories
+category_filter = MetadataFilter(key="category", value=["ai", "ml"], operator=FilterOperator.IN)
+filters = MetadataFilters(filters=[category_filter])
+# Backend: {"category": {"$in": ["ai", "ml"]}}
+# Example 3: Combined filters (AND logic)
+# Find AI documents with rating equal to 5
+filters = MetadataFilters(filters=[
+    MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ),
+    MetadataFilter(key="rating", value=5, operator=FilterOperator.EQ)
+])
+# Backend: [{"category": {"$eq": "ai"}}, {"rating": {"$eq": 5}}]
+# Create a query engine with filters
+filtered_engine = index.as_query_engine(filters=filters)
+response = filtered_engine.query("What is machine learning?")
+```
 ---
@@ -439,10 +557,14 @@ Delete the index when you're done to free up resources.
 | `api_token` | `str` | Your Endee API token | Required |
 | `index_name` | `str` | Name of the index | Required |
 | `dimension` | `int` | Vector dimension | Required |
-| `space_type` | `str` | Distance metric | `"cosine"` |
-| `precision` | `str` | Index precision setting | `"medium"` |
-| `key` | `str` | Encryption key for metadata (256-bit hex) | `None` |
+| `space_type` | `str` | Distance metric (`"cosine"`, `"l2"`, `"ip"`) | `"cosine"` |
+| `precision` | `str` | Index precision (`"binary"`, `"float16"`, `"float32"`, `"int16d"`, `"int8d"`) | `"float16"` |
 | `batch_size` | `int` | Vectors per API call | `100` |
+| `hybrid` | `bool` | Enable hybrid search (dense + sparse vectors) | `False` |
+| `sparse_dim` | `int` | Sparse dimension for hybrid index | `None` |
+| `model_name` | `str` | Model name for sparse embeddings (e.g., `'splade_pp'`, `'bert_base'`) | `None` |
+| `M` | `int` | Optional HNSW M parameter (bi-directional links per node) | `None` (backend default) |
+| `ef_con` | `int` | Optional HNSW ef_construction parameter | `None` (backend default) |
 ### Distance Metrics
@@ -454,38 +576,39 @@ Delete the index when you're done to free up resources.
 ### Precision Settings
-The `precision` parameter controls the trade-off between search accuracy and performance:
+The `precision` parameter controls the vector storage format and affects memory usage and search performance:
 | Precision | Description | Use Case |
 |-----------|-------------|----------|
-| `"low"` | Faster searches, lower accuracy | Large-scale applications where speed is critical |
-| `"medium"` | Balanced performance and accuracy | General purpose applications (default) |
-| `"high"` | Slower searches, higher accuracy | Applications requiring maximum precision |
-| `None` | Default system precision | Use system defaults |
+| `"float32"` | Full precision floating point | Maximum accuracy, higher memory usage |
+| `"float16"` | Half precision floating point | Balanced accuracy and memory (default) |
+| `"binary"` | Binary vectors | Extremely compact, best for binary embeddings |
+| `"int8d"` | 8-bit integer quantization | High compression, good accuracy |
+| `"int16d"` | 16-bit integer quantization | Better accuracy than int8d, moderate compression |
-### Encryption Support
+### HNSW Parameters (Optional)
-You can encrypt metadata stored in Endee by providing a 256-bit encryption key (64 hex characters). This ensures sensitive information is encrypted at rest.
+HNSW (Hierarchical Navigable Small World) parameters control index construction and search quality. These are **optional** - if not provided, the Endee backend uses optimized defaults.
-```python
-# Generate a 256-bit key (example - use a secure method in production)
-import secrets
-encryption_key = secrets.token_hex(32)  # 32 bytes = 64 hex characters
+| Parameter | Description | Impact |
+|-----------|-------------|--------|
+| `M` | Number of bi-directional links per node | Higher M = better recall, more memory |
+| `ef_con` | Size of dynamic candidate list during construction | Higher ef_con = better quality, slower indexing |
+**Example with custom HNSW parameters:**
-# Create vector store with encryption
+```python
 vector_store = EndeeVectorStore.from_params(
-    api_token=endee_api_token,
-    index_name=index_name,
-    dimension=dimension,
+    api_token="your-token",
+    index_name="custom_index",
+    dimension=384,
     space_type="cosine",
-    precision="medium",
-    key=encryption_key  # Metadata will be encrypted
+    M=32,           # Optional: custom M value
+    ef_con=256      # Optional: custom ef_construction
 )
-# Important: Store this key securely! You'll need it to access the index later.
 ```
-> **Warning:** If you lose the encryption key, you will not be able to decrypt your metadata. Store it securely (e.g., in a secrets manager).
+**Note:** Only specify M and ef_con if you need to fine-tune performance. The backend defaults work well for most use cases.
 ---
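
The `# Backend:` comments in the README filter examples above suggest that each supported filter becomes a single-key mapping using `$eq` or `$in`, and that several filters are sent together as a list with AND semantics. A minimal sketch of that translation follows. `SimpleFilter` and `to_backend_filter` are hypothetical stand-ins written for illustration and are not taken from `endee_llamaindex/base.py`; the operator strings `"=="` and `"in"` are assumed to correspond to `FilterOperator.EQ` and `FilterOperator.IN`.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Assumed mapping from operator strings to the backend symbols shown in the README table.
SUPPORTED_OPERATORS = {"==": "$eq", "in": "$in"}


@dataclass
class SimpleFilter:
    """Hypothetical stand-in for a metadata filter (key, value, operator)."""
    key: str
    value: Any
    operator: str = "=="


def to_backend_filter(filters: List[SimpleFilter]) -> List[Dict[str, Dict[str, Any]]]:
    """Translate filters into a list of {key: {symbol: value}} mappings.

    This sketch always returns a list, even for a single filter.
    Unsupported operators (for example range operators) raise ValueError.
    """
    translated = []
    for f in filters:
        symbol = SUPPORTED_OPERATORS.get(f.operator)
        if symbol is None:
            raise ValueError(f"Unsupported filter operator: {f.operator!r}")
        translated.append({f.key: {symbol: f.value}})
    return translated


if __name__ == "__main__":
    print(to_backend_filter([
        SimpleFilter("category", "ai"),
        SimpleFilter("rating", 5),
    ]))
```

Rejecting unsupported operators up front, rather than silently dropping them, keeps a query from returning unfiltered results when a range operator is passed by mistake.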

endee_llamaindex-0.1.5.dist-info/RECORD ADDED

@@ -0,0 +1,8 @@
+endee_llamaindex/__init__.py,sha256=ctCcicNLMO3LpXPGLwvQifvQLX7TEd8CYgFO6Nd9afc,83
+endee_llamaindex/base.py,sha256=JJ6KOSv32utKh93DDLTJFu8_wd-WUWi42hWVgIeGIZc,31533
+endee_llamaindex/constants.py,sha256=-RMx-48CsOklYnarwae5d-BrixCWQfzPawWB-ZgH6gA,2128
+endee_llamaindex/utils.py,sha256=EIdDGZ8clesbiCJSgowonVBtGrimEwa-YV2qj05GMcE,5263
+endee_llamaindex-0.1.5.dist-info/METADATA,sha256=346YtslXIy6EK2V2Iy0OxZhQlRXxjOI_lTNulUPQI4U,19879
+endee_llamaindex-0.1.5.dist-info/WHEEL,sha256=wUyA8OaulRlbfwMtmQsvNngGrxQHAvkKcvRmdizlJi0,92
+endee_llamaindex-0.1.5.dist-info/top_level.txt,sha256=AReiKL0lBXSdKPsQlDusPIH_qbS_txOSUctuCR0rRNQ,17
+endee_llamaindex-0.1.5.dist-info/RECORD,,
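
Each RECORD line above has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the file's SHA-256 hash with trailing `=` padding removed, and the RECORD file lists itself with empty hash and size fields. The sketch below shows one way to parse such a line and check it against file contents; the helper names are illustrative, and it assumes paths contain no quoted commas.

```python
import base64
import hashlib
from typing import Optional, Tuple


def record_digest(data: bytes) -> str:
    """Return the RECORD-style hash field for the given file contents."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def parse_record_line(line: str) -> Tuple[str, str, Optional[int]]:
    """Split a RECORD line into (path, hash field, size or None)."""
    path, hash_field, size = line.strip().rsplit(",", 2)
    return path, hash_field, int(size) if size else None


def matches_record(line: str, data: bytes) -> bool:
    """Check file contents against one RECORD line.

    Entries with an empty hash field (the RECORD file itself) are accepted.
    """
    _, hash_field, size = parse_record_line(line)
    if not hash_field:
        return True
    return size == len(data) and hash_field == record_digest(data)


if __name__ == "__main__":
    line = "endee_llamaindex/__init__.py,sha256=ctCcicNLMO3LpXPGLwvQifvQLX7TEd8CYgFO6Nd9afc,83"
    print(parse_record_line(line))
```

To verify an unpacked wheel, read each listed file in binary mode and pass its bytes to `matches_record` together with the corresponding RECORD line.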

{endee_llamaindex-0.1.3.dist-info → endee_llamaindex-0.1.5.dist-info}/WHEEL RENAMED

@@ -1,5 +1,5 @@
 Wheel-Version: 1.0
-Generator: setuptools (80.9.0)
+Generator: setuptools (80.10.2)
 Root-Is-Purelib: true
 Tag: py3-none-any

endee_llamaindex-0.1.3.dist-info/RECORD DELETED

@@ -1,7 +0,0 @@
-endee_llamaindex/__init__.py,sha256=ctCcicNLMO3LpXPGLwvQifvQLX7TEd8CYgFO6Nd9afc,83
-endee_llamaindex/base.py,sha256=I_i2cvGpran4EG0Eu2Wpr5dic-818VsJ_ZYaFSzj0D8,29032
-endee_llamaindex/utils.py,sha256=psGw_VkJlirKiFpk233E8l2xVfPf3gcq1C0SxMQxUsA,25468
-endee_llamaindex-0.1.3.dist-info/METADATA,sha256=iNmNAsblquqz6e8TEmKReytv1WJ0cPNuRPYMTpDfYmI,15026
-endee_llamaindex-0.1.3.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
-endee_llamaindex-0.1.3.dist-info/top_level.txt,sha256=AReiKL0lBXSdKPsQlDusPIH_qbS_txOSUctuCR0rRNQ,17
-endee_llamaindex-0.1.3.dist-info/RECORD,,

{endee_llamaindex-0.1.3.dist-info → endee_llamaindex-0.1.5.dist-info}/top_level.txt RENAMED

File without changes
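
The 0.1.5 METADATA header fields in this diff declare a single extra, `gpu`, and list `fastembed>=0.3.0` as an unconditional requirement, while the README text added in the same file refers to `hybrid`, `hybrid-gpu`, and `all` extras. The sketch below shows one way to check which extras a METADATA file actually declares, using only the standard library. The embedded text is an excerpt of the fields shown in the diff, and the helper names are illustrative.

```python
from email.parser import Parser
from typing import List

# Excerpt of the 0.1.5 METADATA header fields as they appear in the diff.
METADATA_EXCERPT = """\
Metadata-Version: 2.4
Name: endee-llamaindex
Version: 0.1.5
Requires-Dist: llama-index>=0.12.34
Requires-Dist: endee==0.1.9
Requires-Dist: fastembed>=0.3.0
Provides-Extra: gpu
Requires-Dist: fastembed-gpu>=0.3.0; extra == "gpu"
"""


def declared_extras(metadata_text: str) -> List[str]:
    """Return the sorted extras named by Provides-Extra fields."""
    message = Parser().parsestr(metadata_text)
    return sorted(message.get_all("Provides-Extra") or [])


def unconditional_requirements(metadata_text: str) -> List[str]:
    """Return Requires-Dist entries that are not gated on an extra."""
    message = Parser().parsestr(metadata_text)
    requirements = message.get_all("Requires-Dist") or []
    return [r for r in requirements if "extra ==" not in r]


if __name__ == "__main__":
    extras = declared_extras(METADATA_EXCERPT)
    readme_extras = ["hybrid", "hybrid-gpu", "all"]  # names used in the README text
    print("declared:", extras)
    print("named in README but not declared:", [e for e in readme_extras if e not in extras])
    print("always installed:", unconditional_requirements(METADATA_EXCERPT))
```

For an installed distribution, `importlib.metadata.metadata("endee-llamaindex")` returns the same fields as a message object, so the `get_all` calls apply unchanged.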
