haiku.rag 0.19.4__tar.gz → 0.19.6__tar.gz

@@ -1,6 +1,27 @@
  # Changelog
  ## [Unreleased]

+ ## [0.19.6] - 2025-12-03
+
+ ## [0.19.6] - 2025-12-03
+
+ ### Changed
+
+ - **BREAKING: Explicit Database Creation**: Databases must now be explicitly created before use
+   - New `haiku-rag init` command creates a new empty database
+   - Python API: `HaikuRAG(path, create=True)` to create database programmatically
+   - Operations on non-existent databases raise `FileNotFoundError`
+ - **BREAKING: Embeddings Configuration**: Restructured to nested `EmbeddingModelConfig`
+   - Config path changed from `embeddings.{provider, model, vector_dim}` to `embeddings.model.{provider, name, vector_dim}`
+   - Automatic migration upgrades existing databases to new format
+ - **Database Migrations**: Always run when opening an existing database
+
+ ## [0.19.5] - 2025-12-01
+
+ ### Changed
+
+ - **Rebuild Performance**: Optimized `rebuild --embed-only` to use batch updates via LanceDB's `merge_insert` instead of individual chunk updates, and skip chunks with unchanged embeddings
+
  ## [0.19.4] - 2025-11-28

  ### Added
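
The two breaking changes in 0.19.6 can be illustrated with a minimal sketch. This is not haiku.rag's own code: `open_database` and `migrate_embeddings_config` are hypothetical helpers written for this note. The first mimics the "create explicitly or get `FileNotFoundError`" rule using a plain directory, and the second reshapes a plain dict from the flat `embeddings.{provider, model, vector_dim}` layout into the nested `embeddings.model.{provider, name, vector_dim}` layout. The library's automatic migration may work differently.

```python
from pathlib import Path
from typing import Any


def open_database(path: Path, create: bool = False) -> Path:
    """Hypothetical stand-in for the explicit-creation rule (not the real HaikuRAG)."""
    if path.exists():
        return path
    if not create:
        # Mirrors the changelog: a missing database is an error, not an implicit create.
        raise FileNotFoundError(f"Database does not exist: {path}")
    path.mkdir(parents=True)
    return path


def migrate_embeddings_config(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `config` with flat embeddings keys nested under `model`."""
    if "embeddings" not in config:
        return dict(config)
    embeddings = dict(config["embeddings"])
    if isinstance(embeddings.get("model"), dict):
        # Already in the nested layout; nothing to move.
        return {**config, "embeddings": embeddings}
    nested: dict[str, Any] = {}
    if "provider" in embeddings:
        nested["provider"] = embeddings.pop("provider")
    if "model" in embeddings:
        # The old `model` string becomes `name` inside the nested block.
        nested["name"] = embeddings.pop("model")
    if "vector_dim" in embeddings:
        nested["vector_dim"] = embeddings.pop("vector_dim")
    if nested:
        embeddings["model"] = nested
    return {**config, "embeddings": embeddings}
```

The migration helper returns a new dict and leaves its input untouched, so running it twice is harmless: an already-nested config passes through unchanged.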
@@ -1,11 +1,11 @@
  Metadata-Version: 2.4
  Name: haiku.rag
- Version: 0.19.4
- Summary: Agentic Retrieval Augmented Generation (RAG) with LanceDB
+ Version: 0.19.6
+ Summary: Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling
  Author-email: Yiorgis Gozadinos <ggozadinos@gmail.com>
  License: MIT
  License-File: LICENSE
- Keywords: RAG,lancedb,mcp,ml,vector-database
+ Keywords: RAG,docling,lancedb,mcp,ml,pydantic-ai,vector-database
  Classifier: Development Status :: 4 - Beta
  Classifier: Environment :: Console
  Classifier: Intended Audience :: Developers
@@ -17,16 +17,16 @@ Classifier: Programming Language :: Python :: 3.12
  Classifier: Programming Language :: Python :: 3.13
  Classifier: Typing :: Typed
  Requires-Python: >=3.12
- Requires-Dist: haiku-rag-slim[cohere,docling,inspector,mxbai,voyageai,zeroentropy]==0.19.4
+ Requires-Dist: haiku-rag-slim[cohere,docling,inspector,mxbai,voyageai,zeroentropy]==0.19.6
  Provides-Extra: inspector
  Requires-Dist: textual>=1.0.0; extra == 'inspector'
  Description-Content-Type: text/markdown

  # Haiku RAG

- Retrieval-Augmented Generation (RAG) library built on LanceDB.
+ Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling.

- `haiku.rag` is a Retrieval-Augmented Generation (RAG) library built to work with LanceDB as a local vector database. It uses LanceDB for storing embeddings and performs semantic (vector) search as well as full-text search combined through native hybrid search with Reciprocal Rank Fusion. Both open-source (Ollama) as well as commercial (OpenAI, VoyageAI) embedding providers are supported.
+ `haiku.rag` is an opinionated agentic RAG system that uses LanceDB for vector storage, Pydantic AI for multi-agent workflows, and Docling for document processing. It supports hybrid search (vector + full-text) with Reciprocal Rank Fusion, multiple embedding providers (Ollama, LM Studio, vLLM, OpenAI, VoyageAI), and includes research agents that plan, search, evaluate, and synthesize answers.

  ## Features
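
Both versions of the README description mention hybrid search combined with Reciprocal Rank Fusion (RRF). As a rough illustration of the general technique, here is a minimal sketch that fuses two ranked lists of chunk IDs. The function name, the sample IDs, and the constant `k = 60` (a commonly used default) are assumptions made for this note; this is not LanceDB's or haiku.rag's implementation, which may weight or normalize scores differently.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked ID lists: each list adds 1 / (k + rank) to an ID's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a deterministic order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from a vector search and a full-text search.
vector_hits = ["chunk-a", "chunk-b", "chunk-c"]
fulltext_hits = ["chunk-b", "chunk-d", "chunk-a"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because RRF uses only rank positions, it needs no calibration between vector distances and full-text relevance scores; a chunk that appears near the top of both lists outranks one that tops only a single list.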
@@ -168,10 +168,23 @@ async with HaikuRAG("database.lancedb") as client:
  Use with AI assistants like Claude Desktop:

  ```bash
- haiku-rag serve --stdio
+ haiku-rag serve --mcp --stdio
  ```

- Provides tools for document management and search directly in your AI assistant.
+ Add to your Claude Desktop configuration:
+
+ ```json
+ {
+   "mcpServers": {
+     "haiku-rag": {
+       "command": "haiku-rag",
+       "args": ["serve", "--mcp", "--stdio"]
+     }
+   }
+ }
+ ```
+
+ Provides tools for document management, search, QA, and research directly in your AI assistant.

  ## Examples
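
If the Claude Desktop configuration already lists other MCP servers, the new entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that on an in-memory dict; `add_haiku_rag_server` and the `other-server` entry are hypothetical names used only for illustration, and reading or writing the actual configuration file is left out because its location depends on the platform.

```python
import json
from typing import Any


def add_haiku_rag_server(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `config` with the haiku-rag MCP server entry added."""
    servers = dict(config.get("mcpServers", {}))
    # Command and arguments are the ones shown in the README's JSON snippet.
    servers["haiku-rag"] = {
        "command": "haiku-rag",
        "args": ["serve", "--mcp", "--stdio"],
    }
    return {**config, "mcpServers": servers}


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
merged = add_haiku_rag_server(existing)
print(json.dumps(merged, indent=2))
```

The helper copies the `mcpServers` mapping before modifying it, so the original configuration object is left unchanged and existing server entries are preserved.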
@@ -190,7 +203,10 @@ Full documentation at: https://ggozad.github.io/haiku.rag/
  - [CLI](https://ggozad.github.io/haiku.rag/cli/) - Command reference
  - [Python API](https://ggozad.github.io/haiku.rag/python/) - Complete API docs
  - [Agents](https://ggozad.github.io/haiku.rag/agents/) - QA agent and multi-agent research
- - [MCP Server](https://ggozad.github.io/haiku.rag/mcp/) - Model Context Protocol integration
- - [Benchmarks](https://ggozad.github.io/haiku.rag/benchmarks/) - Performance Benchmarks
+ - [Server](https://ggozad.github.io/haiku.rag/server/) - File monitoring, MCP, and AG-UI
+ - [MCP](https://ggozad.github.io/haiku.rag/mcp/) - Model Context Protocol integration
+ - [Inspector](https://ggozad.github.io/haiku.rag/inspector/) - Database browser TUI
+ - [Benchmarks](https://ggozad.github.io/haiku.rag/benchmarks/) - Performance benchmarks
+ - [Changelog](https://ggozad.github.io/haiku.rag/changelog/) - Version history

  mcp-name: io.github.ggozad/haiku-rag
@@ -1,8 +1,8 @@
  # Haiku RAG

- Retrieval-Augmented Generation (RAG) library built on LanceDB.
+ Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling.

- `haiku.rag` is a Retrieval-Augmented Generation (RAG) library built to work with LanceDB as a local vector database. It uses LanceDB for storing embeddings and performs semantic (vector) search as well as full-text search combined through native hybrid search with Reciprocal Rank Fusion. Both open-source (Ollama) as well as commercial (OpenAI, VoyageAI) embedding providers are supported.
+ `haiku.rag` is an opinionated agentic RAG system that uses LanceDB for vector storage, Pydantic AI for multi-agent workflows, and Docling for document processing. It supports hybrid search (vector + full-text) with Reciprocal Rank Fusion, multiple embedding providers (Ollama, LM Studio, vLLM, OpenAI, VoyageAI), and includes research agents that plan, search, evaluate, and synthesize answers.

  ## Features

@@ -144,10 +144,23 @@ async with HaikuRAG("database.lancedb") as client:
  Use with AI assistants like Claude Desktop:

  ```bash
- haiku-rag serve --stdio
+ haiku-rag serve --mcp --stdio
  ```

- Provides tools for document management and search directly in your AI assistant.
+ Add to your Claude Desktop configuration:
+
+ ```json
+ {
+   "mcpServers": {
+     "haiku-rag": {
+       "command": "haiku-rag",
+       "args": ["serve", "--mcp", "--stdio"]
+     }
+   }
+ }
+ ```
+
+ Provides tools for document management, search, QA, and research directly in your AI assistant.

  ## Examples

@@ -166,7 +179,10 @@ Full documentation at: https://ggozad.github.io/haiku.rag/
  - [CLI](https://ggozad.github.io/haiku.rag/cli/) - Command reference
  - [Python API](https://ggozad.github.io/haiku.rag/python/) - Complete API docs
  - [Agents](https://ggozad.github.io/haiku.rag/agents/) - QA agent and multi-agent research
- - [MCP Server](https://ggozad.github.io/haiku.rag/mcp/) - Model Context Protocol integration
- - [Benchmarks](https://ggozad.github.io/haiku.rag/benchmarks/) - Performance Benchmarks
+ - [Server](https://ggozad.github.io/haiku.rag/server/) - File monitoring, MCP, and AG-UI
+ - [MCP](https://ggozad.github.io/haiku.rag/mcp/) - Model Context Protocol integration
+ - [Inspector](https://ggozad.github.io/haiku.rag/inspector/) - Database browser TUI
+ - [Benchmarks](https://ggozad.github.io/haiku.rag/benchmarks/) - Performance benchmarks
+ - [Changelog](https://ggozad.github.io/haiku.rag/changelog/) - Version history

  mcp-name: io.github.ggozad/haiku-rag
@@ -1,5 +1,5 @@
  site_name: haiku.rag
- site_description: Retrieval-Augmented Generation (RAG) library on LanceDB.
+ site_description: Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling.
  site_url: https://ggozad.github.io/haiku.rag/
  theme:
    name: material
@@ -73,6 +73,7 @@ nav:
    - MCP: mcp.md
    - Inspector: inspector.md
    - Benchmarks: benchmarks.md
+   - Changelog: changelog.md
  markdown_extensions:
    - admonition
    - attr_list
@@ -83,7 +84,8 @@ markdown_extensions:
        pygments_lang_class: true
        use_pygments: true
    - pymdownx.inlinehilite
-   - pymdownx.snippets
+   - pymdownx.snippets:
+       base_path: ['.']
    - pymdownx.superfences:
        custom_fences:
          - name: mermaid
@@ -1,13 +1,21 @@
  [project]

  name = "haiku.rag"
- description = "Agentic Retrieval Augmented Generation (RAG) with LanceDB"
- version = "0.19.4"
+ description = "Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling"
+ version = "0.19.6"
  authors = [{ name = "Yiorgis Gozadinos", email = "ggozadinos@gmail.com" }]
  license = { text = "MIT" }
  readme = { file = "README.md", content-type = "text/markdown" }
  requires-python = ">=3.12"
- keywords = ["RAG", "lancedb", "vector-database", "ml", "mcp"]
+ keywords = [
+     "RAG",
+     "lancedb",
+     "vector-database",
+     "ml",
+     "mcp",
+     "pydantic-ai",
+     "docling",
+ ]
  classifiers = [
      "Development Status :: 4 - Beta",
      "Environment :: Console",
@@ -22,16 +30,14 @@ classifiers = [
  ]

  dependencies = [
-     "haiku.rag-slim[docling,voyageai,mxbai,cohere,zeroentropy,inspector]==0.19.4",
+     "haiku.rag-slim[docling,voyageai,mxbai,cohere,zeroentropy,inspector]==0.19.6",
  ]

  [project.scripts]
  haiku-rag = "haiku.rag.cli:cli"

  [project.optional-dependencies]
- inspector = [
-     "textual>=1.0.0",
- ]
+ inspector = ["textual>=1.0.0"]

  [build-system]
  requires = ["hatchling"]
@@ -2,7 +2,7 @@
    "$schema": "https://static.modelcontextprotocol.io/schemas/2025-10-17/server.schema.json",
    "name": "io.github.ggozad/haiku-rag",
    "version": "{{VERSION}}",
-   "description": "Agentic Retrieval Augmented Generation (RAG) with LanceDB",
+   "description": "Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling",
    "repository": {
      "url": "https://github.com/ggozad/haiku.rag",
      "source": "github"
@@ -1264,7 +1264,7 @@ wheels = [

  [[package]]
  name = "haiku-rag"
- version = "0.19.4"
+ version = "0.19.6"
  source = { editable = "." }
  dependencies = [
      { name = "haiku-rag-slim", extra = ["cohere", "docling", "inspector", "mxbai", "voyageai", "zeroentropy"] },
@@ -1312,7 +1312,7 @@ dev = [

  [[package]]
  name = "haiku-rag-evals"
- version = "0.19.4"
+ version = "0.19.6"
  source = { editable = "evaluations" }
  dependencies = [
      { name = "datasets" },
@@ -1333,7 +1333,7 @@ requires-dist = [

  [[package]]
  name = "haiku-rag-slim"
- version = "0.19.4"
+ version = "0.19.6"
  source = { editable = "haiku_rag_slim" }
  dependencies = [
      { name = "docling-core" },
File without changes
File without changes
File without changes
File without changes