PyPI - schema-search - Versions diffs - 0.1.5__tar.gz → 0.1.6__tar.gz - Mend

schema-search 0.1.5 (tar.gz) → 0.1.6 (tar.gz)

Potentially problematic release.

This version of schema-search might be problematic.

Files changed (45)

{schema_search-0.1.5/schema_search.egg-info → schema_search-0.1.6}/PKG-INFO RENAMED

@@ -1,9 +1,9 @@
 Metadata-Version: 2.4
 Name: schema-search
-Version: 0.1.5
-Summary: Natural language search for database schemas with graph-aware semantic retrieval
-Home-page: https://github.com/neehan/schema-search
-Author:
+Version: 0.1.6
+Summary: Natural language database schema search with graph-aware semantic retrieval
+Home-page: https://adibhasan.com/blog/schema-search/
+Author: Adib Hasan
 Classifier: Development Status :: 3 - Alpha
 Classifier: Intended Audience :: Developers
 Classifier: License :: OSI Approved :: MIT License
@@ -38,6 +38,7 @@ Requires-Dist: snowflake-sqlalchemy>=1.4.0; extra == "snowflake"
 Requires-Dist: snowflake-connector-python>=3.0.0; extra == "snowflake"
 Provides-Extra: bigquery
 Requires-Dist: sqlalchemy-bigquery>=1.6.0; extra == "bigquery"
+Dynamic: author
 Dynamic: classifier
 Dynamic: description
 Dynamic: description-content-type
@@ -209,14 +210,14 @@ We [benchmarked](/tests/test_spider_eval.py) on the Spider dataset (1,234 train
 **Memory:** The embedding model requires ~90 MB and the optional reranker adds ~155 MB. Actual process memory depends on your Python runtime.
 ### Without Reranker (`reranker.model: null`)
-![Without Reranker](img/spider_benchmark_without_reranker.png)
+![Without Reranker](https://raw.githubusercontent.com/Neehan/schema-search/refs/heads/main/img/spider_benchmark_without_reranker.png)
 - **Indexing:** 0.22s ± 0.08s per database (18 total).
 - **Accuracy:** Hybrid leads with Recall@1 62% / MRR 0.93; Semantic follows at Recall@1 58% / MRR 0.89.
 - **Latency:** BM25 and Fuzzy return in ~5ms; Semantic spends ~15ms; Hybrid (semantic + fuzzy) averages 52ms.
 - **Fuzzy baseline:** Recall@1 22%, highlighting the need for semantic signals on natural-language queries.
 ### With Reranker (`Alibaba-NLP/gte-reranker-modernbert-base`)
-![With Reranker](img/spider_benchmark_with_reranker.png)
+![With Reranker](https://raw.githubusercontent.com/Neehan/schema-search/refs/heads/main/img/spider_benchmark_with_reranker.png)
 - **Indexing:** 0.25s ± 0.05s per database (same 18 DBs).
 - **Accuracy:** All strategies converge around Recall@1 62% and MRR ≈ 0.92; Fuzzy jumps from 51% → 92% MRR.
 - **Latency trade-off:** Extra CrossEncoder pass lifts per-query latency to ~0.18–0.29s depending on strategy.
@@ -266,7 +267,7 @@ search = SchemaSearch(
 5. **Optional reranking** with CrossEncoder to refine results
 6. Return top tables with full schema and relationships
-Cache stored in `.schema_search_cache/` (configurable in `config.yml`)
+Cache stored in `/tmp/.schema_search_cache/` (configurable in `config.yml`)
 ## License

{schema_search-0.1.5 → schema_search-0.1.6}/README.md RENAMED

@@ -159,14 +159,14 @@ We [benchmarked](/tests/test_spider_eval.py) on the Spider dataset (1,234 train
 **Memory:** The embedding model requires ~90 MB and the optional reranker adds ~155 MB. Actual process memory depends on your Python runtime.
 ### Without Reranker (`reranker.model: null`)
-![Without Reranker](img/spider_benchmark_without_reranker.png)
+![Without Reranker](https://raw.githubusercontent.com/Neehan/schema-search/refs/heads/main/img/spider_benchmark_without_reranker.png)
 - **Indexing:** 0.22s ± 0.08s per database (18 total).
 - **Accuracy:** Hybrid leads with Recall@1 62% / MRR 0.93; Semantic follows at Recall@1 58% / MRR 0.89.
 - **Latency:** BM25 and Fuzzy return in ~5ms; Semantic spends ~15ms; Hybrid (semantic + fuzzy) averages 52ms.
 - **Fuzzy baseline:** Recall@1 22%, highlighting the need for semantic signals on natural-language queries.
 ### With Reranker (`Alibaba-NLP/gte-reranker-modernbert-base`)
-![With Reranker](img/spider_benchmark_with_reranker.png)
+![With Reranker](https://raw.githubusercontent.com/Neehan/schema-search/refs/heads/main/img/spider_benchmark_with_reranker.png)
 - **Indexing:** 0.25s ± 0.05s per database (same 18 DBs).
 - **Accuracy:** All strategies converge around Recall@1 62% and MRR ≈ 0.92; Fuzzy jumps from 51% → 92% MRR.
 - **Latency trade-off:** Extra CrossEncoder pass lifts per-query latency to ~0.18–0.29s depending on strategy.
@@ -216,7 +216,7 @@ search = SchemaSearch(
 5. **Optional reranking** with CrossEncoder to refine results
 6. Return top tables with full schema and relationships
-Cache stored in `.schema_search_cache/` (configurable in `config.yml`)
+Cache stored in `/tmp/.schema_search_cache/` (configurable in `config.yml`)
 ## License
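
The README hunks above quote Recall@1 and MRR figures without defining them. As a rough illustration only, the sketch below computes both metrics using their standard definitions: Recall@k as the fraction of a query's relevant tables that appear in the top k results, and MRR as the mean of 1/rank of the first relevant table. The function names and sample data are hypothetical, and the package's own evaluation script (`tests/test_spider_eval.py`) may define the metrics differently.

```python
from typing import List, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 1) -> float:
    """Fraction of the relevant tables that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for table in ranked[:k] if table in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / (1-based position of the first relevant table), or 0.0 if none is found."""
    for position, table in enumerate(ranked, start=1):
        if table in relevant:
            return 1.0 / position
    return 0.0


def evaluate(
    queries: List[Tuple[Sequence[str], Set[str]]], k: int = 1
) -> Tuple[float, float]:
    """Return (mean Recall@k, MRR) over (ranked results, relevant tables) pairs."""
    n = len(queries)
    mean_recall = sum(recall_at_k(r, rel, k) for r, rel in queries) / n
    mrr = sum(reciprocal_rank(r, rel) for r, rel in queries) / n
    return mean_recall, mrr


# Hypothetical search results: (ranked table names, set of relevant tables).
sample_queries = [
    (["orders", "users", "items"], {"orders", "items"}),
    (["a", "b", "c"], {"c"}),
]
mean_recall_at_1, mrr = evaluate(sample_queries, k=1)
print(f"Recall@1={mean_recall_at_1:.2f}  MRR={mrr:.2f}")
```

Under these definitions a query with several relevant tables can never reach full Recall@1, even when its first result is relevant. That is one plausible reading of why the quoted Recall@1 (62%) sits well below the quoted MRR (0.93).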

{schema_search-0.1.5 → schema_search-0.1.6/schema_search.egg-info}/PKG-INFO RENAMED

@@ -1,9 +1,9 @@
 Metadata-Version: 2.4
 Name: schema-search
-Version: 0.1.5
-Summary: Natural language search for database schemas with graph-aware semantic retrieval
-Home-page: https://github.com/neehan/schema-search
-Author:
+Version: 0.1.6
+Summary: Natural language database schema search with graph-aware semantic retrieval
+Home-page: https://adibhasan.com/blog/schema-search/
+Author: Adib Hasan
 Classifier: Development Status :: 3 - Alpha
 Classifier: Intended Audience :: Developers
 Classifier: License :: OSI Approved :: MIT License
@@ -38,6 +38,7 @@ Requires-Dist: snowflake-sqlalchemy>=1.4.0; extra == "snowflake"
 Requires-Dist: snowflake-connector-python>=3.0.0; extra == "snowflake"
 Provides-Extra: bigquery
 Requires-Dist: sqlalchemy-bigquery>=1.6.0; extra == "bigquery"
+Dynamic: author
 Dynamic: classifier
 Dynamic: description
 Dynamic: description-content-type
@@ -209,14 +210,14 @@ We [benchmarked](/tests/test_spider_eval.py) on the Spider dataset (1,234 train
 **Memory:** The embedding model requires ~90 MB and the optional reranker adds ~155 MB. Actual process memory depends on your Python runtime.
 ### Without Reranker (`reranker.model: null`)
-![Without Reranker](img/spider_benchmark_without_reranker.png)
+![Without Reranker](https://raw.githubusercontent.com/Neehan/schema-search/refs/heads/main/img/spider_benchmark_without_reranker.png)
 - **Indexing:** 0.22s ± 0.08s per database (18 total).
 - **Accuracy:** Hybrid leads with Recall@1 62% / MRR 0.93; Semantic follows at Recall@1 58% / MRR 0.89.
 - **Latency:** BM25 and Fuzzy return in ~5ms; Semantic spends ~15ms; Hybrid (semantic + fuzzy) averages 52ms.
 - **Fuzzy baseline:** Recall@1 22%, highlighting the need for semantic signals on natural-language queries.
 ### With Reranker (`Alibaba-NLP/gte-reranker-modernbert-base`)
-![With Reranker](img/spider_benchmark_with_reranker.png)
+![With Reranker](https://raw.githubusercontent.com/Neehan/schema-search/refs/heads/main/img/spider_benchmark_with_reranker.png)
 - **Indexing:** 0.25s ± 0.05s per database (same 18 DBs).
 - **Accuracy:** All strategies converge around Recall@1 62% and MRR ≈ 0.92; Fuzzy jumps from 51% → 92% MRR.
 - **Latency trade-off:** Extra CrossEncoder pass lifts per-query latency to ~0.18–0.29s depending on strategy.
@@ -266,7 +267,7 @@ search = SchemaSearch(
 5. **Optional reranking** with CrossEncoder to refine results
 6. Return top tables with full schema and relationships
-Cache stored in `.schema_search_cache/` (configurable in `config.yml`)
+Cache stored in `/tmp/.schema_search_cache/` (configurable in `config.yml`)
 ## License

{schema_search-0.1.5 → schema_search-0.1.6}/setup.py RENAMED

@@ -2,12 +2,12 @@ from setuptools import setup, find_packages
 setup(
     name="schema-search",
-    version="0.1.5",
-    description="Natural language search for database schemas with graph-aware semantic retrieval",
-    author="",
+    version="0.1.6",
+    description="Natural language database schema search with graph-aware semantic retrieval",
+    author="Adib Hasan",
     long_description=open("README.md").read(),
     long_description_content_type="text/markdown",
-    url="https://github.com/neehan/schema-search",
+    url="https://adibhasan.com/blog/schema-search/",
     packages=find_packages(),
     install_requires=[
         "sqlalchemy>=1.4.0",