PyPI - python-nlql - Versions diffs - 0.2.0__tar.gz → 0.3.0__tar.gz - Mend

python-nlql 0.2.0__tar.gz → 0.3.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (221)

{python_nlql-0.2.0 → python_nlql-0.3.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: python-nlql
-Version: 0.2.0
+Version: 0.3.0
 Summary: SQL-style semantic query language and retrieval middleware for Agents & RAG
 Project-URL: Repository, https://github.com/natural-language-query-language/python-nlql
 Author: Okysu
@@ -57,16 +57,18 @@ Description-Content-Type: text/markdown
 [![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
 [![Documentation](https://img.shields.io/badge/docs-online-blue.svg)](https://natural-language-query-language.github.io/python-nlql/)
-NLQL 让你用类似 SQL 的语句做语义检索。把"从文本里找相关内容"这件事，变得像查数据库一样直接——相关度计算、过滤、排序写在一条查询里，不再需要拼凑 embedding 调用和后处理代码。
+**English** · [简体中文](README.zh-CN.md) · [在线文档](https://natural-language-query-language.github.io/python-nlql/)
-适合 Agent 与 RAG 应用：查询本身就是结构化数据，可以直接作为大模型的工具调用载体。
+NLQL lets you do semantic search with SQL-style statements. Relevance scoring, filtering, and sorting live in one query — no more scattered embedding calls and post-processing code.
-## 它长什么样
+Built for Agent and RAG applications: the query itself is structured data, usable directly as an LLM tool-call payload.
+## What it looks like
 ```python
 import nlql
-engine = nlql.Engine(nlql.FakeEmbedder())  # 或 OpenAIEmbedder，以及任意 Embedder 实现
+engine = nlql.Engine(nlql.embed.FakeEmbedder())  # or OpenAIEmbedder, or any Embedder
 engine.add_text("AI agents plan tasks and call tools.", metadata={"status": "published"})
 engine.add_text("Banana bread needs flour and sugar.", metadata={"status": "draft"})
@@ -80,54 +82,54 @@ for unit in engine.search('''
     print(f"{unit.scores['rel']:+.3f}  {unit.content}")
 ```
-语句和 SQL 几乎一样：`SELECT` 指定返回粒度，`LET` 算相关度，`WHERE` 过滤，`ORDER BY` / `LIMIT` 排序限量。
+The statement reads almost like SQL: `SELECT` sets the return granularity, `LET` computes relevance, `WHERE` filters, `ORDER BY` / `LIMIT` sort and cap.
-## 特性
+## Features
-- **一条语句表达完整意图** —— 相关度、过滤、排序集中在一处，不再散在业务代码里
-- **三种写法，结果一致** —— SQL 语句、Python 链式构造、JSON IR，都编译到同一份内部表示
-- **后端可插拔** —— 内置存储开箱即用；切换 Qdrant / Faiss / Chroma / HnswLib / pgvector 只需改一行
-- **召回 + 重排两段式** —— 向量召回后挂重排器，提升结果准确性
-- **多模态** —— 文本与图像在同一向量空间，用文字检索图像
-- **可解释** —— `engine.explain()` 输出查询的执行计划
+- **One statement, full intent** — relevance, filtering, and sorting in one place, not scattered across business code
+- **Three ways to write, identical results** — SQL statement, Python chained builder, or JSON IR; all compile to the same internal representation
+- **Pluggable backends** — built-in store works out of the box; switch to Qdrant / Faiss / Chroma / HnswLib / pgvector with one line
+- **Two-stage retrieval** — attach a reranker after recall for higher accuracy
+- **Multimodal** — text and images share one vector space; retrieve images with text
+- **Explainable** — `engine.explain()` prints the query plan
-## 安装
+## Installation
 ```bash
 pip install python-nlql
 ```
-可选依赖：
+Optional extras:
-| 命令 | 用途 |
+| Command | Purpose |
 |---|---|
-| `pip install "python-nlql[faiss]"` | Faiss 后端 |
-| `pip install "python-nlql[hnsw]"` | HnswLib 后端（适合大数据量） |
-| `pip install "python-nlql[qdrant]"` | Qdrant 后端 |
-| `pip install "python-nlql[chroma]"` | Chroma 后端 |
-| `pip install "python-nlql[pgvector]"` | Postgres + pgvector 后端 |
-| `pip install "python-nlql[local]"` | 本地 sentence-transformers / CLIP / cross-encoder |
-| `pip install "python-nlql[loaders]"` | 加载 DOCX / PDF 文件 |
+| `pip install "python-nlql[faiss]"` | Faiss backend |
+| `pip install "python-nlql[hnsw]"` | HnswLib backend (for large-scale data) |
+| `pip install "python-nlql[qdrant]"` | Qdrant backend |
+| `pip install "python-nlql[chroma]"` | Chroma backend |
+| `pip install "python-nlql[pgvector]"` | Postgres + pgvector backend |
+| `pip install "python-nlql[local]"` | local sentence-transformers / CLIP / cross-encoder |
+| `pip install "python-nlql[loaders]"` | DOCX / PDF file loaders |
-## 切换后端
+## Switching backends
-切换存储后端只需一行，写入与查询代码完全不变：
+One line; ingestion and query code stay the same:
 ```python
 from nlql.store.qdrant_store import QdrantStore
 engine = nlql.Engine(embedder, store=QdrantStore(location=":memory:"))
 ```
-## 文档
+## Documentation
-完整文档、教程与 API 参考：**https://natural-language-query-language.github.io/python-nlql/**
+Full docs, tutorials, and API reference: **https://natural-language-query-language.github.io/python-nlql/en/**
-- [快速开始](https://natural-language-query-language.github.io/python-nlql/content/tutorials/quickstart/)
-- [设计思路](https://natural-language-query-language.github.io/python-nlql/content/concepts/overview/)
-- [API 参考](https://natural-language-query-language.github.io/python-nlql/reference/sdk/)
-- [English docs](https://natural-language-query-language.github.io/python-nlql/en/)
+- [Quick start](https://natural-language-query-language.github.io/python-nlql/en/content/tutorials/quickstart/)
+- [Design](https://natural-language-query-language.github.io/python-nlql/en/content/concepts/overview/)
+- [API reference](https://natural-language-query-language.github.io/python-nlql/en/reference/sdk/)
+- [中文文档](https://natural-language-query-language.github.io/python-nlql/)
-更多示例见 [`examples/`](examples/) 目录。
+More examples in the [`examples/`](examples/) directory.
 ## License

python_nlql-0.3.0/README.md ADDED

@@ -0,0 +1,84 @@
+# NLQL
+[![PyPI version](https://img.shields.io/pypi/v/python-nlql.svg?label=pypi)](https://pypi.org/project/python-nlql/)
+[![Python](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
+[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
+[![Documentation](https://img.shields.io/badge/docs-online-blue.svg)](https://natural-language-query-language.github.io/python-nlql/)
+**English** · [简体中文](README.zh-CN.md) · [在线文档](https://natural-language-query-language.github.io/python-nlql/)
+NLQL lets you do semantic search with SQL-style statements. Relevance scoring, filtering, and sorting live in one query — no more scattered embedding calls and post-processing code.
+Built for Agent and RAG applications: the query itself is structured data, usable directly as an LLM tool-call payload.
+## What it looks like
+```python
+import nlql
+engine = nlql.Engine(nlql.embed.FakeEmbedder())  # or OpenAIEmbedder, or any Embedder
+engine.add_text("AI agents plan tasks and call tools.", metadata={"status": "published"})
+engine.add_text("Banana bread needs flour and sugar.", metadata={"status": "draft"})
+for unit in engine.search('''
+    SELECT SENTENCE
+    LET rel = SIMILARITY(content, "autonomous agents")
+    WHERE rel >= 0.2 AND meta.status == "published"
+    ORDER BY rel DESC
+    LIMIT 5
+'''):
+    print(f"{unit.scores['rel']:+.3f}  {unit.content}")
+```
+The statement reads almost like SQL: `SELECT` sets the return granularity, `LET` computes relevance, `WHERE` filters, `ORDER BY` / `LIMIT` sort and cap.
+## Features
+- **One statement, full intent** — relevance, filtering, and sorting in one place, not scattered across business code
+- **Three ways to write, identical results** — SQL statement, Python chained builder, or JSON IR; all compile to the same internal representation
+- **Pluggable backends** — built-in store works out of the box; switch to Qdrant / Faiss / Chroma / HnswLib / pgvector with one line
+- **Two-stage retrieval** — attach a reranker after recall for higher accuracy
+- **Multimodal** — text and images share one vector space; retrieve images with text
+- **Explainable** — `engine.explain()` prints the query plan
+## Installation
+```bash
+pip install python-nlql
+```
+Optional extras:
+| Command | Purpose |
+|---|---|
+| `pip install "python-nlql[faiss]"` | Faiss backend |
+| `pip install "python-nlql[hnsw]"` | HnswLib backend (for large-scale data) |
+| `pip install "python-nlql[qdrant]"` | Qdrant backend |
+| `pip install "python-nlql[chroma]"` | Chroma backend |
+| `pip install "python-nlql[pgvector]"` | Postgres + pgvector backend |
+| `pip install "python-nlql[local]"` | local sentence-transformers / CLIP / cross-encoder |
+| `pip install "python-nlql[loaders]"` | DOCX / PDF file loaders |
+## Switching backends
+One line; ingestion and query code stay the same:
+```python
+from nlql.store.qdrant_store import QdrantStore
+engine = nlql.Engine(embedder, store=QdrantStore(location=":memory:"))
+```
+## Documentation
+Full docs, tutorials, and API reference: **https://natural-language-query-language.github.io/python-nlql/en/**
+- [Quick start](https://natural-language-query-language.github.io/python-nlql/en/content/tutorials/quickstart/)
+- [Design](https://natural-language-query-language.github.io/python-nlql/en/content/concepts/overview/)
+- [API reference](https://natural-language-query-language.github.io/python-nlql/en/reference/sdk/)
+- [中文文档](https://natural-language-query-language.github.io/python-nlql/)
+More examples in the [`examples/`](examples/) directory.
+## License
+[MIT](LICENSE)

python_nlql-0.2.0/README.md → python_nlql-0.3.0/README.zh-CN.md RENAMED

@@ -5,6 +5,8 @@
 [![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
 [![Documentation](https://img.shields.io/badge/docs-online-blue.svg)](https://natural-language-query-language.github.io/python-nlql/)
+[English](README.md) · **简体中文** · [在线文档](https://natural-language-query-language.github.io/python-nlql/)
 NLQL 让你用类似 SQL 的语句做语义检索。把"从文本里找相关内容"这件事，变得像查数据库一样直接——相关度计算、过滤、排序写在一条查询里，不再需要拼凑 embedding 调用和后处理代码。
 适合 Agent 与 RAG 应用：查询本身就是结构化数据，可以直接作为大模型的工具调用载体。
@@ -14,7 +16,7 @@ NLQL 让你用类似 SQL 的语句做语义检索。把"从文本里找相关内
 ```python
 import nlql
-engine = nlql.Engine(nlql.FakeEmbedder())  # 或 OpenAIEmbedder，以及任意 Embedder 实现
+engine = nlql.Engine(nlql.embed.FakeEmbedder())  # 或 OpenAIEmbedder，以及任意 Embedder 实现
 engine.add_text("AI agents plan tasks and call tools.", metadata={"status": "published"})
 engine.add_text("Banana bread needs flour and sugar.", metadata={"status": "draft"})

{python_nlql-0.2.0 → python_nlql-0.3.0}/benchmarks/bench.py RENAMED

@@ -17,7 +17,8 @@ import random
 import sys
 import time
-from nlql import Document, Engine, FakeEmbedder
+from nlql import Document, Engine
+from nlql.embed import FakeEmbedder
 from nlql.lang import parse
 from nlql.store import LocalStore
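
The hunk above moves the `FakeEmbedder` import from the top-level `nlql` package to `nlql.embed`. This diff does not show whether the old top-level name still resolves in 0.3.0, so code that has to run against both versions may want to try the new location first and fall back to the old one. A minimal sketch of that pattern follows; `resolve_first` is a hypothetical helper written for this note, not part of python-nlql.

```python
import importlib


def resolve_first(*candidates):
    """Return the first attribute that resolves among (module_name, attr) pairs.

    Hypothetical helper for illustration; it is not provided by python-nlql.
    """
    errors = []
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{module_name}.{attr}: {exc}")
    raise ImportError("no candidate resolved: " + "; ".join(errors))


# Intended use with python-nlql (assumes the package is installed):
#   FakeEmbedder = resolve_first(
#       ("nlql.embed", "FakeEmbedder"),  # location used in the 0.3.0 docs
#       ("nlql", "FakeEmbedder"),        # location used in the 0.2.0 docs
#   )

# Standard-library demonstration: the first module does not exist,
# so the helper falls through to the second candidate.
loads = resolve_first(("module_that_does_not_exist", "loads"), ("json", "loads"))
print(loads('{"ok": true}'))
```

Listing the newer path first means the fallback is only exercised on the older release.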

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/en/content/concepts/index-cache.md RENAMED

@@ -5,7 +5,7 @@ Vectors are computed at ingestion time and stored in the index; they are not rec
 ```python
 import nlql
-engine = nlql.Engine(nlql.OpenAIEmbedder(base_url="...", api_key="..."))
+engine = nlql.Engine(nlql.embed.OpenAIEmbedder(base_url="...", api_key="..."))
 engine.add_text("AI agents plan tasks and call external tools.")
 engine.add_text("Banana bread is a quick loaf made with ripe bananas.")

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/en/content/concepts/ingestion.md RENAMED

@@ -5,7 +5,7 @@ When a document enters NLQL it passes through four stages in order: normalize, s
 ```python
 import nlql
-engine = nlql.Engine(nlql.OpenAIEmbedder(base_url="...", api_key="..."))
+engine = nlql.Engine(nlql.embed.OpenAIEmbedder(base_url="...", api_key="..."))
 doc_id = engine.add_text(
     "AI agents plan tasks. They keep memory and call external tools.",
     metadata={"status": "published", "year": 2026},
@@ -26,8 +26,8 @@ Before splitting, text is normalized: whitespace and line breaks are unified, an
 The normalized text is sliced into units by the splitter for the active granularity. The default is by sentence (`SENTENCE`), and the built-in splitter covers Chinese, English, Japanese, and CJK punctuation.
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder(), granularity="sentence")  # default
-engine = nlql.Engine(nlql.FakeEmbedder(), granularity="chunk")     # use the chunk splitter instead
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), granularity="sentence")  # default
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), granularity="chunk")     # use the chunk splitter instead
 ```
 Splitting happens at ingest and is reused at query time — the boundaries returned by `SELECT SENTENCE` and `SELECT SPAN(SENTENCE, window => n)` both come from this stage; there is no on-the-fly re-splitting at query time.
@@ -64,7 +64,7 @@ engine.add_documents([
 ```python
 import nlql
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 ids = engine.add_files(["agents.txt", "rag.md"])
 print(f"loaded into {len(engine)} units: {ids}")
 ```
@@ -78,7 +78,7 @@ print(f"loaded into {len(engine)} units: {ids}")
 - **Custom granularity** — register your own splitter (see [Registry and Extension](./registry.md)), for example by paragraph or by chapter.
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder(), granularity="chunk")
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), granularity="chunk")
 engine.add_file("long_document.md")
 # each chunk is one retrieval unit
 ```

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/en/content/concepts/registry.md RENAMED

@@ -50,7 +50,7 @@ def my_fn(text: str) -> float: ...
 **Instance-level registration** — applies only to the current engine and does not leak to other instances:
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 @engine.register_function("TEMP_SCORE")
 def temp_score(text: str) -> float: ...
@@ -71,7 +71,7 @@ def pysbd_sentences(text: str) -> list[str]:
     seg = pysbd.Segmenter(language="en", clean=False)
     return seg.segment(text)
-engine = nlql.Engine(nlql.FakeEmbedder())  # the splitter above is used automatically at ingest
+engine = nlql.Engine(nlql.embed.FakeEmbedder())  # the splitter above is used automatically at ingest
 ```
 The same mechanism uses the splitter at both ingest and query time, so the boundaries returned by `SELECT SENTENCE` / `SELECT SPAN(SENTENCE, window => n)` match those from ingestion — there is no mismatch from re-splitting on the fly at query time.
@@ -83,7 +83,7 @@ You can also register a new granularity name (such as `"paragraph"`) and specify
 def split_paragraphs(text: str) -> list[str]:
     return [p for p in text.split("\n\n") if p.strip()]
-engine = nlql.Engine(nlql.FakeEmbedder(), granularity="paragraph")
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), granularity="paragraph")
 ```
 ## Custom Embedders

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/en/content/tutorials/document-loading.md RENAMED

@@ -25,7 +25,7 @@ files = [tmp / "agents.txt", tmp / "rag.md"]
 `add_files` takes a list of paths and dispatches a loader per file based on its extension. `.txt` and `.md` go through the plain-text loader, which works out of the box.
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 ids = engine.add_files([str(f) for f in files])
 print(f"loaded {len(ids)} files -> {len(engine)} sentence units: {ids}")
 ```
@@ -53,7 +53,7 @@ try:
 except ImportError:
     print("(python-docx not installed — skipping the .docx file)")
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 ids = engine.add_files([str(f) for f in files])
 ```

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/en/content/tutorials/hybrid-stores.md RENAMED

@@ -9,7 +9,8 @@ All backends conform to the same `Store` interface. The engine preferentially us
 ## Example
 ```python
-from nlql import Document, Engine, FakeEmbedder
+from nlql import Document, Engine
+from nlql.embed import FakeEmbedder
 from nlql.store import LocalStore
 CORPUS = [

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/en/content/tutorials/llm-function-calling.md RENAMED

@@ -8,7 +8,7 @@ The following example uses `FakeEmbedder`, which requires no network access or m
 import json
 import nlql
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 engine.add_text("AI agents use planning, memory, and tool use.",
                 metadata={"status": "published"})
 engine.add_text("Vector databases store embeddings for similarity search.",

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/en/content/tutorials/query-builder.md RENAMED

@@ -8,7 +8,7 @@ The following example uses `FakeEmbedder`, which requires no network access or m
 import nlql
 from nlql.sdk.builder import select, similarity, Meta, F
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 engine.add_text("AI agents plan tasks, keep memory, and call external tools.",
                 id="doc-0", metadata={"status": "published", "topic": "agents"})
 engine.add_text("Retrieval-augmented generation grounds LLM answers in your documents.",

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/en/content/tutorials/quickstart.md RENAMED

@@ -10,7 +10,7 @@ The following example uses `FakeEmbedder`, which requires no network access or m
 ```python
 import nlql
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 engine.add_text("AI agents plan tasks, keep memory, and call external tools.",
                 id="doc-0", metadata={"status": "published", "topic": "agents"})

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/en/content/tutorials/reranking.md RENAMED

@@ -6,7 +6,7 @@ This example uses `FakeEmbedder` and `FakeReranker` for an offline demonstration
 ```python
 import nlql
-from nlql import FakeReranker
+from nlql.rerank import FakeReranker
 DOCS = [
     # Contains all query terms but is very long; dual-encoder similarity gets diluted
@@ -20,7 +20,7 @@ QUERY = 'SELECT SENTENCE LET rel = SIMILARITY(content, "agent memory planning to
 ## Without a reranker
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder(), reranker=None, rerank_factor=10)
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), reranker=None, rerank_factor=10)
 for text, doc_id in DOCS:
     engine.add_text(text, id=doc_id)
@@ -34,7 +34,7 @@ The `full` document covers every query term, but because the sentence is long, i
 ## With a reranker
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder(), reranker=FakeReranker(), rerank_factor=10)
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), reranker=FakeReranker(), rerank_factor=10)
 for text, doc_id in DOCS:
     engine.add_text(text, id=doc_id)
@@ -52,7 +52,7 @@ The `Reranker` protocol requires `rerank(query, units) -> units`: it takes the q
 `rerank_factor` controls the over-fetch multiple: the final `limit` multiplied by this factor gives the recall count. A larger factor yields more complete recall and leans more on the reranker for precision, but is also slower. Common values range from 5 to 20.
 ```python
-nlql.Engine(nlql.OpenAIEmbedder(), reranker=FakeReranker(), rerank_factor=5)
+nlql.Engine(nlql.embed.OpenAIEmbedder(), reranker=FakeReranker(), rerank_factor=5)
 ```
 ## CrossEncoder for production
@@ -60,10 +60,10 @@ nlql.Engine(nlql.OpenAIEmbedder(), reranker=FakeReranker(), rerank_factor=5)
 `FakeReranker` is for demonstration only. In production, replace it with a real reranker:
 ```python
-from nlql import CrossEncoderReranker
+from nlql.rerank import CrossEncoderReranker
 engine = nlql.Engine(
-    nlql.OpenAIEmbedder(),
+    nlql.embed.OpenAIEmbedder(),
     reranker=CrossEncoderReranker(model="cross-encoder/ms-marco-MiniLM-L-6-v2"),
     rerank_factor=5,
 )
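
The reranking tutorial in this diff says that `rerank_factor` is an over-fetch multiple: the final `limit` times the factor gives the recall count. The arithmetic and the two-stage shape can be sketched as below. The function names and the toy scoring are assumptions made for illustration; they are not nlql internals, and the real engine may handle edge cases differently.

```python
from typing import Callable, Sequence


def recall_count(limit: int, rerank_factor: int) -> int:
    """Number of candidates to recall before reranking (limit * factor)."""
    if limit < 0 or rerank_factor < 1:
        raise ValueError("limit must be >= 0 and rerank_factor must be >= 1")
    return limit * rerank_factor


def two_stage(
    ranked_candidates: Sequence[str],
    limit: int,
    rerank_factor: int,
    rerank: Callable[[list[str]], list[str]],
) -> list[str]:
    """Over-fetch from the first-stage ranking, rerank, then cut to `limit`."""
    recalled = list(ranked_candidates[: recall_count(limit, rerank_factor)])
    return rerank(recalled)[:limit]


# Toy data: first-stage order is d0..d19; the stand-in reranker prefers
# ids with the largest numeric suffix.
candidates = [f"d{i}" for i in range(20)]


def by_suffix(units: list[str]) -> list[str]:
    return sorted(units, key=lambda u: int(u[1:]), reverse=True)


print(recall_count(5, 10))                      # 50
print(two_stage(candidates, 2, 5, by_suffix))   # ['d9', 'd8']
```

With `limit=2` and `rerank_factor=5`, only the first ten candidates reach the reranker, which is why `d9` and `d8` win rather than `d19` and `d18`: a larger factor widens what the reranker can see, at the cost of more work.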

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/zh/content/concepts/index-cache.md RENAMED

@@ -5,7 +5,7 @@
 ```python
 import nlql
-engine = nlql.Engine(nlql.OpenAIEmbedder(base_url="...", api_key="..."))
+engine = nlql.Engine(nlql.embed.OpenAIEmbedder(base_url="...", api_key="..."))
 engine.add_text("AI agents plan tasks and call external tools.")
 engine.add_text("Banana bread is a quick loaf made with ripe bananas.")

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/zh/content/concepts/ingestion.md RENAMED

@@ -5,7 +5,7 @@
 ```python
 import nlql
-engine = nlql.Engine(nlql.OpenAIEmbedder(base_url="...", api_key="..."))
+engine = nlql.Engine(nlql.embed.OpenAIEmbedder(base_url="...", api_key="..."))
 doc_id = engine.add_text(
     "AI agents plan tasks. They keep memory and call external tools.",
     metadata={"status": "published", "year": 2026},
@@ -26,8 +26,8 @@ results = engine.search(
 规整后的文本按当前粒度对应的分词器切成单元。默认按句切（`SENTENCE`），内置分词器覆盖中、英、日及 CJK 标点。
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder(), granularity="sentence")  # 默认
-engine = nlql.Engine(nlql.FakeEmbedder(), granularity="chunk")     # 改用 chunk 分词器
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), granularity="sentence")  # 默认
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), granularity="chunk")     # 改用 chunk 分词器
 ```
 切分在写入时完成、查询时复用——`SELECT SENTENCE` 与 `SELECT SPAN(SENTENCE, window => n)` 返回的边界都来自这一步，不会出现查询时临时重切。
@@ -64,7 +64,7 @@ engine.add_documents([
 ```python
 import nlql
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 ids = engine.add_files(["agents.txt", "rag.md"])
 print(f"loaded into {len(engine)} units: {ids}")
 ```
@@ -78,7 +78,7 @@ print(f"loaded into {len(engine)} units: {ids}")
 - **自定义粒度** —— 注册自己的分词器即可（见 [注册与扩展](./registry.md)），比如按段落、按章节。
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder(), granularity="chunk")
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), granularity="chunk")
 engine.add_file("long_document.md")
 # 每个 chunk 是一个检索单元
 ```

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/zh/content/concepts/registry.md RENAMED

@@ -50,7 +50,7 @@ def my_fn(text: str) -> float: ...
 **实例级注册**——只对当前引擎生效，不泄漏到其它实例：
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 @engine.register_function("TEMP_SCORE")
 def temp_score(text: str) -> float: ...
@@ -71,7 +71,7 @@ def pysbd_sentences(text: str) -> list[str]:
     seg = pysbd.Segmenter(language="en", clean=False)
     return seg.segment(text)
-engine = nlql.Engine(nlql.FakeEmbedder())  # 写入时自动用上面的分词器
+engine = nlql.Engine(nlql.embed.FakeEmbedder())  # 写入时自动用上面的分词器
 ```
 分词器在写入和查询时被同一套机制使用，因此 `SELECT SENTENCE` / `SELECT SPAN(SENTENCE, window => n)` 返回的边界与写入时一致，不会出现查询时临时重切导致的不匹配。
@@ -83,7 +83,7 @@ engine = nlql.Engine(nlql.FakeEmbedder())  # 写入时自动用上面的分词
 def split_paragraphs(text: str) -> list[str]:
     return [p for p in text.split("\n\n") if p.strip()]
-engine = nlql.Engine(nlql.FakeEmbedder(), granularity="paragraph")
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), granularity="paragraph")
 ```
 ## 自定义 embedder

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/zh/content/tutorials/document-loading.md RENAMED

@@ -25,7 +25,7 @@ files = [tmp / "agents.txt", tmp / "rag.md"]
 `add_files` 接收路径列表，内部对每个文件按扩展名分派加载器。`.txt` 与 `.md` 走纯文本加载器，开箱即用。
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 ids = engine.add_files([str(f) for f in files])
 print(f"loaded {len(ids)} files -> {len(engine)} sentence units: {ids}")
 ```
@@ -53,7 +53,7 @@ try:
 except ImportError:
     print("(python-docx not installed — skipping the .docx file)")
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 ids = engine.add_files([str(f) for f in files])
 ```

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/zh/content/tutorials/hybrid-stores.md RENAMED

@@ -9,7 +9,8 @@ NLQL 的查询与后端解耦：内置存储开箱即用，也可接入 Qdrant
 ## 示例
 ```python
-from nlql import Document, Engine, FakeEmbedder
+from nlql import Document, Engine
+from nlql.embed import FakeEmbedder
 from nlql.store import LocalStore
 CORPUS = [

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/zh/content/tutorials/llm-function-calling.md RENAMED

@@ -8,7 +8,7 @@
 import json
 import nlql
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 engine.add_text("AI agents use planning, memory, and tool use.",
                 metadata={"status": "published"})
 engine.add_text("Vector databases store embeddings for similarity search.",

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/zh/content/tutorials/query-builder.md RENAMED

@@ -8,7 +8,7 @@ Query Builder 用 Python 链式调用构造查询，与 NLQL 字符串编译到
 import nlql
 from nlql.sdk.builder import select, similarity, Meta, F
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 engine.add_text("AI agents plan tasks, keep memory, and call external tools.",
                 id="doc-0", metadata={"status": "published", "topic": "agents"})
 engine.add_text("Retrieval-augmented generation grounds LLM answers in your documents.",

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/zh/content/tutorials/quickstart.md RENAMED

@@ -10,7 +10,7 @@
 ```python
 import nlql
-engine = nlql.Engine(nlql.FakeEmbedder())
+engine = nlql.Engine(nlql.embed.FakeEmbedder())
 engine.add_text("AI agents plan tasks, keep memory, and call external tools.",
                 id="doc-0", metadata={"status": "published", "topic": "agents"})

{python_nlql-0.2.0 → python_nlql-0.3.0}/docs/zh/content/tutorials/reranking.md RENAMED

@@ -6,7 +6,7 @@
 ```python
 import nlql
-from nlql import FakeReranker
+from nlql.rerank import FakeReranker
 DOCS = [
     # 包含全部查询词但很长，双塔相似度被稀释
@@ -20,7 +20,7 @@ QUERY = 'SELECT SENTENCE LET rel = SIMILARITY(content, "agent memory planning to
 ## 不加重排器
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder(), reranker=None, rerank_factor=10)
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), reranker=None, rerank_factor=10)
 for text, doc_id in DOCS:
     engine.add_text(text, id=doc_id)
@@ -34,7 +34,7 @@ for unit in engine.search(QUERY):
 ## 加重排器
 ```python
-engine = nlql.Engine(nlql.FakeEmbedder(), reranker=FakeReranker(), rerank_factor=10)
+engine = nlql.Engine(nlql.embed.FakeEmbedder(), reranker=FakeReranker(), rerank_factor=10)
 for text, doc_id in DOCS:
     engine.add_text(text, id=doc_id)
@@ -52,7 +52,7 @@ for unit in engine.search(QUERY):
 `rerank_factor` 控制过取倍数：最终需要的 `limit` 乘以这个倍数得到召回数量。倍数越大召回越全、精度越依赖重排器，但也越慢。常用值在 5 到 20 之间。
 ```python
-nlql.Engine(nlql.OpenAIEmbedder(), reranker=FakeReranker(), rerank_factor=5)
+nlql.Engine(nlql.embed.OpenAIEmbedder(), reranker=FakeReranker(), rerank_factor=5)
 ```
 ## 生产用 CrossEncoder
@@ -60,10 +60,10 @@ nlql.Engine(nlql.OpenAIEmbedder(), reranker=FakeReranker(), rerank_factor=5)
 `FakeReranker` 仅用于演示。生产中替换为真实重排器：
 ```python
-from nlql import CrossEncoderReranker
+from nlql.rerank import CrossEncoderReranker
 engine = nlql.Engine(
-    nlql.OpenAIEmbedder(),
+    nlql.embed.OpenAIEmbedder(),
     reranker=CrossEncoderReranker(model="cross-encoder/ms-marco-MiniLM-L-6-v2"),
     rerank_factor=5,
 )
