ragmint 0.2.3__py3-none-any.whl → 0.4.6__py3-none-any.whl

Files changed (37)
  1. ragmint/app.py +512 -0
  2. ragmint/autotuner.py +201 -17
  3. ragmint/core/chunking.py +68 -4
  4. ragmint/core/embeddings.py +46 -10
  5. ragmint/core/evaluation.py +33 -14
  6. ragmint/core/pipeline.py +34 -10
  7. ragmint/core/retriever.py +152 -20
  8. ragmint/experiments/validation_qa.json +1 -14
  9. ragmint/explainer.py +47 -20
  10. ragmint/integrations/__init__.py +0 -0
  11. ragmint/integrations/config_adapter.py +96 -0
  12. ragmint/integrations/langchain_prebuilder.py +99 -0
  13. ragmint/leaderboard.py +41 -35
  14. ragmint/qa_generator.py +190 -0
  15. ragmint/tests/test_autotuner.py +52 -30
  16. ragmint/tests/test_config_adapter.py +39 -0
  17. ragmint/tests/test_embeddings.py +46 -0
  18. ragmint/tests/test_explainer.py +28 -12
  19. ragmint/tests/test_integration_autotuner_ragmint.py +39 -52
  20. ragmint/tests/test_langchain_prebuilder.py +82 -0
  21. ragmint/tests/test_leaderboard.py +78 -25
  22. ragmint/tests/test_pipeline.py +3 -2
  23. ragmint/tests/test_qa_generator.py +66 -0
  24. ragmint/tests/test_retriever.py +3 -2
  25. ragmint/tests/test_tuner.py +1 -1
  26. ragmint/tuner.py +109 -22
  27. ragmint-0.4.6.data/data/README.md +485 -0
  28. ragmint-0.4.6.dist-info/METADATA +530 -0
  29. ragmint-0.4.6.dist-info/RECORD +48 -0
  30. ragmint/tests/test_explainer_integration.py +0 -18
  31. ragmint-0.2.3.data/data/README.md +0 -284
  32. ragmint-0.2.3.dist-info/METADATA +0 -312
  33. ragmint-0.2.3.dist-info/RECORD +0 -40
  34. {ragmint-0.2.3.data → ragmint-0.4.6.data}/data/LICENSE +0 -0
  35. {ragmint-0.2.3.dist-info → ragmint-0.4.6.dist-info}/WHEEL +0 -0
  36. {ragmint-0.2.3.dist-info → ragmint-0.4.6.dist-info}/licenses/LICENSE +0 -0
  37. {ragmint-0.2.3.dist-info → ragmint-0.4.6.dist-info}/top_level.txt +0 -0
@@ -0,0 +1,485 @@
1
+ # Ragmint
2
+
3
+ <p align="center">
4
+ <img src="src/ragmint/assets/img/ragmint-banner.png" width="auto" height="70px" alt="Ragmint Banner">
5
+ </p>
6
+
7
+ ![Python](https://img.shields.io/badge/python-3.9%2B-blue)
8
+ ![License](https://img.shields.io/badge/license-Apache%202.0-green)
9
+ ![Tests](https://github.com/andyolivers/ragmint/actions/workflows/tests.yml/badge.svg)
10
+ ![Optuna](https://img.shields.io/badge/Optuna-Integrated-orange)
11
+ ![Status](https://img.shields.io/badge/Status-Active-success)
12
+ ![PyPI](https://img.shields.io/pypi/v/ragmint?color=blue)
13
+ ![Docs](https://img.shields.io/badge/docs-latest-blueviolet)
14
+
15
+
16
+ **Ragmint** (Retrieval-Augmented Generation Model Inspection & Tuning) is a modular, developer-friendly Python library for **evaluating, optimizing, and tuning RAG (Retrieval-Augmented Generation) pipelines**.
17
+
18
+ It provides a complete toolkit for **retriever selection**, **embedding model tuning**, **automated RAG evaluation**, and **config-driven prebuilding** of pipelines with support for **Optuna-based Bayesian optimization**, **Auto-RAG tuning**, **chunking**, and **explainability** through Gemini or Claude.
19
+
20
+ ---
21
+
22
+ ## ✨ Features
23
+
24
+ - ✅ **Automated hyperparameter optimization** (Grid, Random, Bayesian via Optuna).
25
+ - 🤖 **Auto-RAG Tuner** — dynamically recommends retriever–embedding pairs based on corpus size and document statistics, **suggests multiple chunk sizes with overlaps**, and can **test configurations to identify the best-performing RAG setup**.
26
+ - 🧮 **Validation QA Generator** — automatically creates QA datasets from a corpus for evaluating and tuning RAG pipelines when no labeled data is available.
27
+ - 🧠 **Explainability Layer** — interprets RAG performance via Gemini or Claude APIs.
28
+ - 🏆 **Leaderboard Tracking** — stores and ranks experiment runs via JSON or external DB.
29
+ - 🔍 **Built-in RAG evaluation metrics** — faithfulness, recall, BLEU, ROUGE, latency.
30
+ - 📦 **Chunking system** — automatic or configurable `chunk_size` and `overlap` for documents with multiple suggested pairs.
31
+ - ⚙️ **Retrievers** — FAISS, Chroma, scikit-learn.
32
+ - 🧩 **Embeddings** — Hugging Face.
33
+ - 💾 **Caching, experiment tracking, and reproducibility** out of the box.
34
+ - 🧰 **Clean modular structure** for easy integration in research and production setups.
35
+ - 🏗️ **Langchain Prebuilder** — prepares pipelines, applies chunking, embeddings, and vector store creation automatically.
36
+ - ⚙️ **Config Adapter (LangchainConfigAdapter)** — normalizes configuration, fills defaults, validates retrievers.
37
+
38
+ ---
39
+
40
+ ## 🚀 Quick Start
41
+
42
+ ### 1️⃣ Installation
43
+
44
+ ```bash
45
+ git clone https://github.com/andyolivers/ragmint.git
46
+ cd ragmint
47
+ pip install -e .
48
+ python -m ragmint.app
49
+ ```
50
+
51
+ > The `-e` flag installs Ragmint in editable (development) mode.
52
+ > Requires **Python ≥ 3.9**.
53
+
54
+ ### Installation via PyPI
55
+
56
+ ```bash
57
+ pip install ragmint
58
+ ```
59
+
60
+ ---
61
+
62
+ ### 2️⃣ Run a RAG Optimization Experiment
63
+
64
+ ```bash
65
+ python ragmint/main.py --config configs/default.yaml --search bayesian
66
+ ```
67
+
68
+ Example `configs/default.yaml`:
69
+ ```yaml
70
+ retriever: faiss
71
+ embedding_model: text-embedding-3-small
72
+ chunk_size: 500
73
+ overlap: 100
74
+ reranker:
75
+   mode: mmr
76
+   lambda_param: 0.5
77
+ optimization:
78
+   search_method: bayesian
79
+   n_trials: 20
80
+ ```
81
+
82
+ ---
83
+
84
+ ### 3️⃣ Manual Pipeline Usage
85
+
86
+ ```python
87
+ from ragmint.prebuilder import PreBuilder
88
+ from ragmint.tuner import RAGMint
89
+
90
+ # Prebuild pipeline (chunking, embeddings, vector store)
91
+ prebuilder = PreBuilder(
92
+ docs_path="data/docs/",
93
+ config_path="configs/default.yaml"
94
+ )
95
+ pipeline = prebuilder.build_pipeline()
96
+
97
+ # Initialize RAGMint with prebuilt components
98
+ rag = RAGMint(pipeline=pipeline)
99
+
100
+ # Run optimization
101
+ best, results = rag.optimize(validation_set=None, metric="faithfulness", trials=3)
102
+ print("Best configuration:", best)
103
+
104
+ ```
105
+ ---
106
+ ## 🧩 Embeddings and Retrievers
107
+
108
+ **Ragmint** supports a flexible set of embeddings and retrievers, allowing you to adapt easily to various **RAG architectures**.
109
+
110
+ ---
111
+ ## 🧩 Chunking System
112
+
113
+ * **Automatically splits documents** into chunks with `chunk_size` and `overlap` parameters.
114
+ * **Falls back to default values** when `chunk_size` or `overlap` is not provided in the configuration.
115
+ * **Optimized** for downstream **retrieval and embeddings**.
116
+ * **Lays the groundwork for adaptive chunking strategies** in future releases.
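
The `chunk_size` and `overlap` parameters can be pictured as a sliding character window. The following is a minimal sketch, not Ragmint's actual implementation: the function name `chunk_text` is hypothetical, and the default values are simply borrowed from the example config shown earlier.

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into character windows that share `overlap` characters.

    Hypothetical helper for illustration; not part of the Ragmint API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap  # how far the window advances each time
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
print(chunks)
```

Each chunk starts `chunk_size - overlap` characters after the previous one, so neighbouring chunks repeat the last `overlap` characters of their predecessor.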
117
+ ---
118
+ ## 🧮 Validation QA Generator
119
+
120
+ The **QA Generator** module automatically creates **question–answer (QA) validation datasets** from any corpus of `.txt` documents.
121
+ This dataset can be used to **evaluate and tune RAG pipelines** inside Ragmint when no labeled data is available.
122
+
123
+ ### ✨ Key Capabilities
124
+
125
+ - 🔁 **Batch processing** — splits large corpora into batches to prevent token overflows and API timeouts.
126
+
127
+ - 🧠 **Topic-aware question estimation** — dynamically determines how many questions to generate per document based on:
128
+ - Document length (logarithmic scaling)
129
+ - Topic diversity (via `SentenceTransformer` + `KMeans` clustering)
130
+
131
+ - 🤖 **LLM-powered QA synthesis** — generates factual QA pairs using **Gemini** or **Claude** models.
132
+
133
+ - 💾 **Automatic JSON export** — saves the generated dataset to `experiments/validation_qa.json` (configurable).
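
To make the idea of topic-aware question estimation more concrete, here is a rough sketch of how a per-document question count could combine logarithmic length scaling with a topic count and then be clamped to `min_q` and `max_q`. The formula and the name `estimate_num_questions` are assumptions for illustration only; the source does not spell out the exact rule, and the topic count is taken as a plain input rather than computed with `SentenceTransformer` and `KMeans`.

```python
import math


def estimate_num_questions(doc_length, n_topics, min_q=3, max_q=25):
    """Hypothetical estimate: log-scaled length times topic count, clamped."""
    # Logarithmic scaling keeps very long documents from dominating.
    length_factor = math.log10(max(doc_length, 10))
    raw = round(length_factor * max(n_topics, 1))
    # Clamp to the configured minimum and maximum.
    return max(min_q, min(max_q, raw))


for length, topics in [(50, 1), (10_000, 2), (1_000_000, 10)]:
    print(length, topics, estimate_num_questions(length, topics))
```

Under these assumptions a short single-topic document falls back to `min_q`, while a long, diverse one is capped at `max_q`.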
134
+
135
+ ### ⚙️ Usage
136
+
137
+ You can run the generator directly from the command line:
138
+
139
+ ```bash
140
+ python -m ragmint.qa_generator --density 0.005
141
+ ```
142
+
143
+ ### 💡 Example: Using in Python
144
+
145
+ ```python
146
+ from ragmint.qa_generator import generate_validation_qa
147
+
148
+ generate_validation_qa(
149
+ docs_path="data/docs", # Folder with .txt documents
150
+ output_path="experiments/validation_qa.json", # Output JSON file
151
+ llm_model="gemini-2.5-flash-lite", # or "claude-3-opus-20240229"
152
+ batch_size=5, # Number of docs per LLM call
153
+ sleep_between_batches=2, # Wait time between calls (seconds)
154
+ min_q=3, # Minimum questions per doc
155
+ max_q=25 # Maximum questions per doc
156
+ )
157
+ ```
158
+ ✅ The generator supports both Gemini and Claude models.
159
+ Set your API key in a `.env` file or via environment variables:
160
+ ```bash
161
+ export GOOGLE_API_KEY="your_gemini_key"
162
+ export ANTHROPIC_API_KEY="your_claude_key"
163
+ ```
164
+
165
+ ---
166
+ ## 🧩 Langchain Config Adapter
167
+
168
+ * **Ensures consistent configuration** across pipeline components.
169
+ * **Normalizes retriever and embedding names** (e.g., `faiss`, `sentence-transformers/...`).
170
+ * **Adds default chunk parameters** when missing.
171
+ * **Validates retriever backends** and **raises clear errors** for unsupported options.
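
The behaviour listed above (normalizing names, filling defaults, validating the retriever) can be sketched roughly as follows. This is an illustrative stand-in, not the `LangchainConfigAdapter` code: the function `normalize_config`, the backend keys in `SUPPORTED_RETRIEVERS`, and the default values are all assumptions.

```python
# Assumed backend keys and defaults; the real adapter may use different ones.
SUPPORTED_RETRIEVERS = {"faiss", "chroma", "sklearn"}
DEFAULTS = {"retriever": "faiss", "chunk_size": 500, "overlap": 100}


def normalize_config(config):
    """Return a copy of `config` with defaults filled in and names normalized."""
    merged = {**DEFAULTS, **config}
    retriever = str(merged["retriever"]).strip().lower()
    if retriever not in SUPPORTED_RETRIEVERS:
        supported = ", ".join(sorted(SUPPORTED_RETRIEVERS))
        raise ValueError(f"Unsupported retriever '{retriever}'. Supported: {supported}")
    merged["retriever"] = retriever
    return merged


print(normalize_config({"retriever": " FAISS ", "chunk_size": 300}))
```

Returning a new dictionary instead of mutating the input keeps the caller's original configuration available for logging or comparison.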
172
+
173
+ ---
174
+ ## 🧩 Langchain Prebuilder
175
+
176
+ **Automates pipeline preparation:**
177
+ 1. Reads documents
178
+ 2. Applies chunking
179
+ 3. Creates embeddings
180
+ 4. Initializes retriever / vector store
181
+ 5. Returns a **ready-to-use pipeline** for RAGMint or custom usage.
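
The five steps above can be sketched end to end with a toy, dependency-free pipeline. Everything here is a stand-in chosen for illustration: the bag-of-words `embed` function replaces a real embedding model, the in-memory list replaces a vector store such as FAISS or Chroma, and none of the names come from the Ragmint or LangChain APIs.

```python
import math
from collections import Counter


def chunk(text, chunk_size=200, overlap=20):
    """Step 2: split a document into overlapping character windows."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text):
    """Step 3: toy bag-of-words vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_pipeline(docs, chunk_size=200, overlap=20):
    """Steps 1-5: take documents, chunk, embed, index, return a retriever."""
    chunks = [c for doc in docs for c in chunk(doc, chunk_size, overlap)]
    index = [(c, embed(c)) for c in chunks]  # Step 4: in-memory "vector store"

    def retrieve(query, top_k=1):
        q = embed(query)
        ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]

    return retrieve


docs = [
    "FAISS is a library for vector similarity search.",
    "Chroma is a persistent vector database.",
]
retrieve = build_pipeline(docs)
print(retrieve("persistent database"))
```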
182
+
183
+ ---
184
+
185
+ ## 🔤 Available Embeddings (Hugging Face)
186
+
187
+ You can select from the following models:
188
+
189
+ * `sentence-transformers/all-MiniLM-L6-v2` — **lightweight**, general-purpose
190
+ * `sentence-transformers/all-mpnet-base-v2` — **higher accuracy**, slower
191
+ * `BAAI/bge-base-en-v1.5` — **English**, dense embeddings
192
+ * `intfloat/multilingual-e5-base` — ideal for **multilingual corpora**
193
+
194
+
195
+
196
+ ### Configuration Example
197
+
198
+ Use the following format in your config file to specify the embedding model:
199
+
200
+ ```yaml
201
+ embedding_model: sentence-transformers/all-MiniLM-L6-v2
202
+ ```
203
+ ---
204
+
205
+ ## 🔍 Available Retrievers
206
+
207
+ **Ragmint** integrates multiple **retrieval backends** to suit different needs:
208
+
209
+ | Retriever | Description |
210
+ | :--- | :--- |
211
+ | **FAISS** | Fast vector similarity search; efficient for dense embeddings |
212
+ | **Chroma** | Persistent vector DB; works well for incremental updates |
213
+ | **scikit-learn (NearestNeighbors)** | Lightweight local retriever; no separate vector database required |
214
+
215
+
216
+ ### Configuration Example
217
+
218
+ To specify the retriever in your configuration file, use the following format:
219
+
220
+ ```yaml
221
+ retriever: faiss
222
+ ```
223
+
224
+ ---
225
+
226
+ ## 🧪 Dataset Options
227
+
228
+ Ragmint can automatically load evaluation datasets for your RAG pipeline:
229
+
230
+ | Mode | Example | Description |
231
+ |------|----------|-------------|
232
+ | 🧱 **Default** | `validation_set=None` | Uses built-in `experiments/validation_qa.json` |
233
+ | 📁 **Custom File** | `validation_set="data/my_eval.json"` | Load your own QA dataset (JSON or CSV) |
234
+ | 🌐 **Hugging Face Dataset** | `validation_set="squad"` | Automatically downloads benchmark datasets (requires `pip install datasets`) |
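
One way to read the table is as a simple dispatch on the `validation_set` argument. The sketch below only classifies the argument and does not load any data; the function `resolve_validation_source` and the rule of treating `.json` and `.csv` suffixes as local files are assumptions, not the library's documented logic.

```python
from pathlib import Path

DEFAULT_QA_PATH = "experiments/validation_qa.json"


def resolve_validation_source(validation_set=None):
    """Classify `validation_set` as default, local file, or Hugging Face name."""
    if validation_set is None:
        return ("default", DEFAULT_QA_PATH)
    if Path(validation_set).suffix.lower() in {".json", ".csv"}:
        return ("file", validation_set)
    # Anything else is assumed to be a Hugging Face dataset identifier.
    return ("huggingface", validation_set)


for value in (None, "data/my_eval.json", "squad"):
    print(value, "->", resolve_validation_source(value))
```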
235
+
236
+ ### Example
237
+
238
+ ```python
239
+ from ragmint.tuner import RAGMint
240
+
241
+ ragmint = RAGMint(
242
+ docs_path="data/docs/",
243
+ retrievers=["faiss", "chroma"],
244
+ embeddings=["text-embedding-3-small"],
245
+ rerankers=["mmr"],
246
+ )
247
+
248
+ # Use built-in default
249
+ ragmint.optimize(validation_set=None)
250
+
251
+ # Use Hugging Face benchmark
252
+ ragmint.optimize(validation_set="squad")
253
+
254
+ # Use your own dataset
255
+ ragmint.optimize(validation_set="data/custom_qa.json")
256
+ ```
257
+
258
+ ---
259
+
260
+ ## 🧠 Auto-RAG Tuner
261
+
262
+ The **AutoRAGTuner** automatically analyzes your corpus and recommends retriever–embedding combinations based on corpus statistics (size and average document length). It also **suggests multiple chunk sizes with overlaps** to improve retrieval performance.
263
+
264
+ Beyond recommendations, it can **run full end-to-end testing** of the suggested configurations and **identify the best-performing RAG setup** for your dataset.
265
+
266
+
267
+ ```python
268
+ from ragmint.autotuner import AutoRAGTuner
269
+
270
+ # Initialize with your documents
271
+ tuner = AutoRAGTuner(docs_path="data/docs/")
272
+
273
+ # Recommend configurations and suggest chunk sizes
274
+ recommendation = tuner.recommend(num_chunk_pairs=5)
275
+ print("Initial recommendation:", recommendation)
276
+
277
+ # Run full auto-tuning on validation set
278
+ best_config, results = tuner.auto_tune(validation_set="data/validation.json", trials=5)
279
+ print("Best configuration after testing:", best_config)
280
+ print("All trial results:", results)
281
+ ```
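
For intuition about what a recommendation step might compute, here is a small sketch that derives corpus statistics and proposes several `(chunk_size, overlap)` pairs. The heuristics (a quarter of the average document length, 50-character increments, 20% overlap) and the function names are invented for illustration and are not taken from `AutoRAGTuner`.

```python
def corpus_stats(docs):
    """Return the number of documents and their average length in characters."""
    lengths = [len(doc) for doc in docs]
    avg_len = sum(lengths) / len(lengths) if lengths else 0.0
    return {"num_docs": len(docs), "avg_len": avg_len}


def suggest_chunk_pairs(avg_len, num_pairs=5, overlap_percent=20):
    """Hypothetical heuristic: grow chunk sizes from a base tied to avg_len."""
    base = int(min(max(avg_len / 4, 200), 1000))  # keep the base within [200, 1000]
    pairs = []
    for k in range(num_pairs):
        size = base + 50 * k
        pairs.append((size, size * overlap_percent // 100))
    return pairs


stats = corpus_stats(["short document", "a somewhat longer document " * 100])
print(stats)
print(suggest_chunk_pairs(stats["avg_len"], num_pairs=3))
```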
282
+ ---
283
+ ## 🧠 Live Dashboard (Gradio)
284
+ Ragmint includes a visual dashboard for auto-tuning and analyzing RAG pipelines.
285
+
286
+ <p align="center">
287
+ <img src="/assets/images/dashboard-preview.png" width="80%" alt="Ragmint Gradio App Preview">
288
+ </p>
289
+
290
+ ---
291
+
292
+ ## 🏆 Leaderboard Tracking
293
+
294
+ Track and visualize your best experiments across runs.
295
+
296
+ ```python
297
+ from ragmint.leaderboard import Leaderboard
298
+
299
+ # Initialize local leaderboard
300
+ leaderboard = Leaderboard(storage_path="leaderboard.jsonl")
301
+
302
+ # Retrieve top 5 runs
303
+ print("\n🏅 Top 5 Experiments:")
304
+ for result in leaderboard.top_results(limit=5):
305
+     print(f"{result['run_id']} | Score: {result['best_score']:.2f} | Model: {result['model']}")
306
+ ```
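
A JSON Lines leaderboard of this kind can be approximated in a few lines of standard-library Python. This sketch assumes one JSON object per line with the `run_id` and `best_score` fields used in the example above; the helper names `append_run` and `top_results` are hypothetical and do not describe the `Leaderboard` class internals.

```python
import json
from pathlib import Path


def append_run(path, run):
    """Append one experiment record as a single JSON line."""
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(run) + "\n")


def top_results(path, limit=5):
    """Return the `limit` highest-scoring runs, best first."""
    file = Path(path)
    if not file.exists():
        return []
    with file.open(encoding="utf-8") as fh:
        runs = [json.loads(line) for line in fh if line.strip()]
    return sorted(runs, key=lambda run: run["best_score"], reverse=True)[:limit]
```

Appending one line per run means earlier results are never rewritten, which suits an experiment log that only grows.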
307
+
308
+ ---
309
+
310
+ ## 🧠 Explainability with Gemini / Claude
311
+
312
+ Compare RAG configurations and receive **natural language insights** on why one performs better.
313
+
314
+ ```python
315
+ from ragmint.autotuner import AutoRAGTuner
316
+ from ragmint.explainer import explain_results
317
+
318
+ tuner = AutoRAGTuner(docs_path="data/docs/")
319
+ best, results = tuner.auto_tune(
320
+ validation_set='data/docs/validation_qa.json',
321
+ metric="faithfulness",
322
+ trials=5,
323
+ search_type='bayesian'
324
+ )
325
+
326
+ analysis = explain_results(best, results, corpus_stats=tuner.corpus_stats)
327
+ print(analysis)
328
+ ```
329
+
330
+ > Set your API keys in a `.env` file or via environment variables:
331
+ > ```bash
332
+ > export GEMINI_API_KEY="your_gemini_key"
333
+ > export ANTHROPIC_API_KEY="your_claude_key"
334
+ > ```
335
+
336
+ ---
337
+
338
+ ## 🧩 Folder Structure
339
+
340
+ ```
341
+ ragmint/
342
+ ├── core/
343
+ │ ├── pipeline.py
344
+ │ ├── retriever.py
345
+ │ ├── reranker.py
346
+ │ ├── embeddings.py
347
+ │ ├── chunking.py
348
+ │ └── evaluation.py
349
+ ├── integrations/
350
+ │ ├── config_adapter.py
351
+ │ └── langchain_prebuilder.py
352
+ ├── autotuner.py
353
+ ├── explainer.py
354
+ ├── leaderboard.py
355
+ ├── tuner.py
356
+ ├── utils/
357
+ ├── configs/
358
+ ├── experiments/
359
+ ├── tests/
360
+ └── main.py
361
+ ```
362
+
363
+ ---
364
+
365
+ ## 🧪 Running Tests
366
+
367
+ ```bash
368
+ pytest -v
369
+ ```
370
+
371
+ To include integration tests with Gemini or Claude APIs:
372
+ ```bash
373
+ pytest -m integration
374
+ ```
375
+
376
+ ---
377
+
378
+ ## ⚙️ Configuration via `pyproject.toml`
379
+
380
+ Your `pyproject.toml` includes all required dependencies:
381
+
382
+ ```toml
383
+ [project]
384
+ name = "ragmint"
385
+ version = "0.1.0"
386
+ dependencies = [
387
+ # Core ML + Embeddings
388
+ "numpy<2.0.0",
389
+ "pandas>=2.0",
390
+ "scikit-learn>=1.3",
391
+ "sentence-transformers>=2.2.2",
392
+
393
+ # Retrieval backends
394
+ "chromadb>=0.4",
395
+ "faiss-cpu; sys_platform != 'darwin'", # For Linux/Windows
396
+ "faiss-cpu==1.7.4; sys_platform == 'darwin'", # Optional fix for macOS MPS
397
+ "rank-bm25>=0.2.2", # For BM25 retriever
398
+
399
+ # Optimization & evaluation
400
+ "optuna>=3.0",
401
+ "tqdm",
402
+ "colorama",
403
+
404
+ # RAG evaluation and data utils
405
+ "pyyaml",
406
+ "python-dotenv",
407
+
408
+ # Explainability and LLM APIs
409
+ "openai>=1.0.0",
410
+ "google-generativeai>=0.8.0",
411
+ "anthropic>=0.25.0",
412
+
413
+ # Integration / storage
414
+ "supabase>=2.4.0",
415
+
416
+ # Testing
417
+ "pytest",
418
+
419
+ # LangChain integration layer
420
+ "langchain>=0.2.5",
421
+ "langchain-community>=0.2.5",
422
+ "langchain-text-splitters>=0.2.1"
423
+ ]
424
+ ```
425
+
426
+ ---
427
+
428
+ ## 📊 Example Experiment Workflow
429
+
430
+ 1. Define your retriever, embedding, and reranker setup
431
+ 2. Launch optimization (Grid, Random, Bayesian) or AutoTune
432
+ 3. Compare performance with explainability
433
+ 4. Persist results to leaderboard for later inspection
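
As a rough picture of the optimization step in this workflow, the sketch below runs a plain random search over a small configuration space. It is a generic illustration rather than Ragmint's tuner: `random_search`, the search space, and the stand-in scoring function are all assumptions, and a real run would score each configuration by evaluating the pipeline on a validation set.

```python
import random


def random_search(space, score_fn, trials=10, seed=0):
    """Sample `trials` configurations at random and keep the best-scoring one."""
    rng = random.Random(seed)  # seeded for reproducible sampling
    results = []
    for _ in range(trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        results.append((score_fn(config), config))
    best_score, best_config = max(results, key=lambda item: item[0])
    return best_config, results


space = {
    "retriever": ["faiss", "chroma"],
    "chunk_size": [200, 400, 800],
    "overlap": [50, 100],
}


def toy_score(config):
    # Stand-in metric that prefers chunk sizes near 400.
    return -abs(config["chunk_size"] - 400)


best, results = random_search(space, toy_score, trials=20)
print("Best configuration:", best)
```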
434
+
435
+ ---
436
+
437
+ ## 🧬 Architecture Overview
438
+
439
+ ```mermaid
440
+ flowchart TD
441
+ A[Query] --> B[Chunking / Preprocessing]
442
+ B --> C[Embedder]
443
+ C --> D[Retriever]
444
+ D --> E[Reranker]
445
+ E --> F[Generator]
446
+ F --> G[Evaluation]
447
+ G --> H[AutoRAGTuner / Optuna]
448
+ H --> I[Suggested Configs & Chunk Sizes]
449
+ I --> J[Best Configuration]
450
+ J -->|Update Params| C
451
+
452
+ ```
453
+
454
+ ---
455
+
456
+ ## 📘 Example Output
457
+
458
+ ```
459
+ [INFO] Starting Auto-RAG Tuning
460
+ [INFO] Suggested retriever=Chroma, embedding_model=sentence-transformers/all-MiniLM-L6-v2
461
+ [INFO] Suggested chunk-size candidates: [(380, 80), (420, 100), (350, 70), (400, 90), (360, 75)]
462
+ [INFO] Running full evaluation on validation set with 5 trials
463
+ [INFO] Trial 1 finished: faithfulness=0.82, latency=0.40s
464
+ [INFO] Trial 2 finished: faithfulness=0.85, latency=0.44s
465
+ ...
466
+ [INFO] Best configuration after testing: {'retriever': 'Chroma', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'chunk_size': 400, 'overlap': 90, 'strategy': 'sentence'}
467
+ ```
468
+ ---
469
+ ## 🧾 Citation
470
+ If you use **Ragmint** in your research, please cite:
471
+ ```bibtex
472
+ @software{oliveira2025ragmint,
473
+ author = {André Oliveira},
474
+ title = {Ragmint: Retrieval-Augmented Generation Model Inspection & Tuning},
475
+ year = {2025},
476
+ url = {https://github.com/andyolivers/ragmint},
477
+ license = {Apache-2.0}
478
+ }
479
+ ```
480
+
481
+ ---
482
+
483
+ <p align="center">
484
+ <sub>Built with ❤️ by <a href="https://andyolivers.com">André Oliveira</a> | Apache 2.0 License</sub>
485
+ </p>