mirage_benchmark-1.0.4-py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

This version of mirage-benchmark might be problematic.

@@ -0,0 +1,490 @@
+ Metadata-Version: 2.4
+ Name: mirage-benchmark
+ Version: 1.0.4
+ Summary: A Multiagent Framework for Generating Multimodal Multihop QA Datasets for RAG Evaluation
+ Home-page: https://github.com/ChandanKSahu/MiRAGE
+ Author: MiRAGE Authors
+ Author-email: MiRAGE Authors <contact@example.com>
+ Maintainer-email: MiRAGE Authors <contact@example.com>
+ License: Apache-2.0
+ Project-URL: Homepage, https://github.com/ChandanKSahu/MiRAGE
+ Project-URL: Documentation, https://github.com/ChandanKSahu/MiRAGE#readme
+ Project-URL: Repository, https://github.com/ChandanKSahu/MiRAGE.git
+ Project-URL: Issues, https://github.com/ChandanKSahu/MiRAGE/issues
+ Keywords: rag,multimodal,qa,dataset,generation,llm,vlm,evaluation,benchmark
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Intended Audience :: Developers
+ Classifier: Intended Audience :: Science/Research
+ Classifier: License :: OSI Approved :: Apache Software License
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
+ Requires-Python: >=3.9
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: torch>=2.0.0
+ Requires-Dist: faiss-cpu>=1.7.0
+ Requires-Dist: numpy>=1.21.0
+ Requires-Dist: Pillow>=9.0.0
+ Requires-Dist: transformers>=4.44.0
+ Requires-Dist: huggingface_hub>=0.16.0
+ Requires-Dist: tqdm>=4.65.0
+ Requires-Dist: pyyaml>=6.0
+ Requires-Dist: requests>=2.28.0
+ Requires-Dist: aiohttp>=3.8.0
+ Requires-Dist: sentence-transformers>=2.2.0
+ Requires-Dist: bertopic>=0.16.0
+ Requires-Dist: umap-learn>=0.5.0
+ Requires-Dist: pandas>=1.5.0
+ Requires-Dist: scikit-learn>=1.0.0
+ Provides-Extra: gpu
+ Requires-Dist: faiss-gpu>=1.7.0; extra == "gpu"
+ Requires-Dist: bitsandbytes>=0.43.0; extra == "gpu"
+ Requires-Dist: accelerate>=0.20.0; extra == "gpu"
+ Provides-Extra: pdf
+ Requires-Dist: docling>=0.1.0; extra == "pdf"
+ Requires-Dist: pypdfium2>=4.0.0; extra == "pdf"
+ Provides-Extra: eval
+ Requires-Dist: ragas>=0.1.0; extra == "eval"
+ Requires-Dist: datasets>=2.0.0; extra == "eval"
+ Requires-Dist: langchain-google-genai>=1.0.0; extra == "eval"
+ Requires-Dist: langchain-openai>=0.1.0; extra == "eval"
+ Provides-Extra: dev
+ Requires-Dist: pytest>=7.0.0; extra == "dev"
+ Requires-Dist: flake8>=5.0.0; extra == "dev"
+ Requires-Dist: black>=22.0.0; extra == "dev"
+ Requires-Dist: twine>=4.0.0; extra == "dev"
+ Requires-Dist: build>=0.10.0; extra == "dev"
+ Provides-Extra: all
+ Requires-Dist: faiss-gpu>=1.7.0; extra == "all"
+ Requires-Dist: bitsandbytes>=0.43.0; extra == "all"
+ Requires-Dist: accelerate>=0.20.0; extra == "all"
+ Requires-Dist: docling>=0.1.0; extra == "all"
+ Requires-Dist: pypdfium2>=4.0.0; extra == "all"
+ Requires-Dist: ragas>=0.1.0; extra == "all"
+ Requires-Dist: datasets>=2.0.0; extra == "all"
+ Requires-Dist: langchain-google-genai>=1.0.0; extra == "all"
+ Requires-Dist: langchain-openai>=0.1.0; extra == "all"
+ Requires-Dist: pytest>=7.0.0; extra == "all"
+ Requires-Dist: flake8>=5.0.0; extra == "all"
+ Requires-Dist: black>=22.0.0; extra == "all"
+ Dynamic: author
+ Dynamic: home-page
+ Dynamic: license-file
+ Dynamic: requires-python
+
+ # MiRAGE: A Multiagent Framework for Generating Multimodal Multihop Question-Answer Dataset for RAG Evaluation
+
+ <p align="center">
+ <img src="https://img.shields.io/badge/python-3.9+-blue.svg" alt="Python 3.9+">
+ <img src="https://img.shields.io/badge/license-Apache%202.0-green.svg" alt="License">
+ <img src="https://img.shields.io/pypi/v/mirage-benchmark.svg" alt="PyPI">
+ </p>
+
+ **MiRAGE** is a multi-agent framework for generating high-quality, multimodal, multihop question-answer datasets for evaluating Retrieval-Augmented Generation (RAG) systems.
+
+ <p align="center">
+ <img src="assets/mirage_framework.png" alt="MiRAGE Framework Architecture" width="100%">
+ </p>
+
+ ## Key Features
+
+ - **Multi-hop Context Completion**: Iteratively expands incomplete chunks with relevant context
+ - **Domain and Expert Role Detection**: Automatic domain identification using BERTopic + LLM
+ - **Multi-stage QA Pipeline**: Generate, Select, Verify, and Correct stages for quality assurance
+ - **Multimodal Support**: Handles text, tables, figures, and images
+ - **Multiple Backend Support**: Gemini, OpenAI, and local Ollama models
+ - **Fully Parallelized**: Thread and process pools for maximum throughput
+
+ ## Table of Contents
+
+ - [Installation](#installation)
+ - [Quick Start](#quick-start)
+ - [Usage](#usage)
+ - [API Keys Setup](#api-keys-setup)
+ - [Configuration](#configuration)
+ - [Command Line Options](#command-line-options)
+ - [Output Format](#output-format)
+ - [Project Structure](#project-structure)
+ - [Contributing](#contributing)
+ - [License](#license)
+
+ ## Installation
+
+ ### From PyPI
+
+ ```bash
+ pip install mirage-benchmark
+ ```
+
+ ### From Source
+
+ ```bash
+ git clone https://github.com/ChandanKSahu/MiRAGE.git
+ cd MiRAGE
+ pip install -e .
+ ```
+
+ ### With Optional Dependencies
+
+ ```bash
+ pip install mirage-benchmark[gpu]   # GPU support
+ pip install mirage-benchmark[pdf]   # PDF processing
+ pip install mirage-benchmark[all]   # All dependencies
+ ```
+
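+ To see which optional extras are usable in your environment, you can probe for a representative module of each one. This is an illustrative sketch; the mapping of extras to import names (`faiss`, `docling`, `ragas`) is an assumption based on the dependency lists above:
+
+ ```python
+ import importlib.util
+
+ # One representative import name per optional extra (assumed mapping).
+ extras = {"gpu": "faiss", "pdf": "docling", "eval": "ragas"}
+
+ for extra, module in extras.items():
+     available = importlib.util.find_spec(module) is not None
+     print(f"[{extra}] {module}: {'available' if available else 'not installed'}")
+ ```
+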
+ ## Quick Start
+
+ ### Step 1: Set Up API Key
+
+ Choose one of the following backends:
+
+ **Option A: Google Gemini (Recommended)**
+ ```bash
+ export GEMINI_API_KEY="your-gemini-api-key"
+ ```
+
+ **Option B: OpenAI**
+ ```bash
+ export OPENAI_API_KEY="your-openai-api-key"
+ ```
+
+ **Option C: Local Ollama (No API key needed)**
+ ```bash
+ # Start the Ollama server and pull a model (installation is covered below)
+ ollama serve
+ ollama pull llama3
+ ```
+
+ ### Step 2: Prepare Your Data
+
+ Place your documents in a folder:
+ ```bash
+ mkdir -p data/my_documents
+ cp /path/to/your/*.pdf data/my_documents/
+ ```
+
+ ### Step 3: Run MiRAGE
+
+ ```bash
+ # Basic usage
+ python run_mirage.py --input data/my_documents --output output/my_dataset
+
+ # With API key as argument
+ python run_mirage.py -i data/my_documents -o output/my_dataset --api-key YOUR_API_KEY
+
+ # Using OpenAI
+ python run_mirage.py -i data/my_documents -o output/my_dataset --backend openai
+
+ # Using local Ollama
+ python run_mirage.py -i data/my_documents -o output/my_dataset --backend ollama
+ ```
+
+ ### Step 4: Check Results
+
+ ```bash
+ ls output/my_dataset/
+ # qa_deduplicated.json   - Final QA dataset
+ # chunks.json            - Semantic chunks
+ # evaluation_report.json - Quality metrics
+ ```
+
+ ## Usage
+
+ ### Basic Usage
+
+ ```bash
+ python run_mirage.py --input <INPUT_DIR> --output <OUTPUT_DIR>
+ ```
+
+ ### With All Options
+
+ ```bash
+ python run_mirage.py \
+     --input data/documents \
+     --output output/results \
+     --backend gemini \
+     --api-key YOUR_API_KEY \
+     --num-qa-pairs 100 \
+     --max-workers 4 \
+     --verbose
+ ```
+
+ ### Run Preflight Checks
+
+ Before running the full pipeline, verify your setup:
+
+ ```bash
+ python run_mirage.py --preflight
+ ```
+
+ ### Using Sample Dataset
+
+ A sample dataset is included for testing:
+
+ ```bash
+ # Unzip sample data
+ unzip data/FinanceAnnualReports.zip -d data/sample/
+
+ # Run on sample
+ python run_mirage.py -i data/sample -o output/sample_results
+ ```
+
+ ## API Keys Setup
+
+ ### Google Gemini
+
+ 1. Get API key from: https://makersuite.google.com/app/apikey
+ 2. Set environment variable:
+ ```bash
+ export GEMINI_API_KEY="your-key-here"
+ ```
+
+ Or create a file:
+ ```bash
+ mkdir -p ~/.config/gemini
+ echo "your-key-here" > ~/.config/gemini/api_key.txt
+ ```
+
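+ For reference, here is a small illustrative sketch of how a key can be resolved from either the `GEMINI_API_KEY` environment variable or the `api_key_path` file used in `config.yaml`. This is not the package's own lookup logic, which may differ:
+
+ ```python
+ import os
+ from pathlib import Path
+ from typing import Optional
+
+ def resolve_gemini_key(path: str = "~/.config/gemini/api_key.txt") -> Optional[str]:
+     """Return a Gemini API key from the environment or a key file, if either exists."""
+     key = os.environ.get("GEMINI_API_KEY")
+     if key:
+         return key.strip()
+     key_file = Path(path).expanduser()
+     if key_file.is_file():
+         return key_file.read_text().strip()
+     return None  # nothing configured
+
+ print("Gemini key configured:", resolve_gemini_key() is not None)
+ ```
+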
+ ### OpenAI
+
+ 1. Get API key from: https://platform.openai.com/api-keys
+ 2. Set environment variable:
+ ```bash
+ export OPENAI_API_KEY="your-key-here"
+ ```
+
+ ### Ollama (Local - Free)
+
+ No API key needed! Just install Ollama:
+
+ ```bash
+ # Install
+ curl -fsSL https://ollama.com/install.sh | sh
+
+ # Start server
+ ollama serve
+
+ # Pull models
+ ollama pull llama3   # For text
+ ollama pull llava    # For vision
+ ```
+
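+ Before pointing MiRAGE at a local backend, you can confirm the Ollama server is reachable and the models are pulled. A minimal check using the `requests` library (already a declared dependency) against the default `base_url` from `config.yaml`; `/api/tags` is Ollama's endpoint for listing locally pulled models:
+
+ ```python
+ import requests
+
+ BASE_URL = "http://localhost:11434"  # default Ollama base_url used in config.yaml
+
+ # List the models the local Ollama server currently has pulled.
+ resp = requests.get(f"{BASE_URL}/api/tags", timeout=5)
+ resp.raise_for_status()
+ models = [m["name"] for m in resp.json().get("models", [])]
+ print("Ollama is up; local models:", models)  # expect llama3 and llava if pulled above
+ ```
+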
+ ## Configuration
+
+ ### Using config.yaml
+
+ Copy the example config and customize:
+
+ ```bash
+ cp config.yaml.example config.yaml
+ ```
+
+ Edit `config.yaml`:
+
+ ```yaml
+ backend:
+   active: GEMINI  # GEMINI, OPENAI, or OLLAMA
+
+ gemini:
+   api_key_path: ~/.config/gemini/api_key.txt
+   llm_model: gemini-2.0-flash
+   vlm_model: gemini-2.0-flash
+
+ openai:
+   api_key_path: ~/.config/openai/api_key.txt
+   llm_model: gpt-4o
+   vlm_model: gpt-4o
+
+ ollama:
+   base_url: http://localhost:11434
+   llm_model: llama3
+   vlm_model: llava
+
+ paths:
+   input_pdf_dir: data/documents
+   output_dir: output/results
+
+ qa_generation:
+   target_qa_pairs: 100
+   max_workers: 4
+ ```
+
+ Then run:
+ ```bash
+ python run_mirage.py --config config.yaml
+ ```
+
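+ If you want to inspect or adjust the configuration programmatically, it can be loaded with PyYAML, which is already a declared dependency. A minimal sketch, assuming the layout shown above (keys beyond those in the example are not guaranteed):
+
+ ```python
+ import yaml
+
+ with open("config.yaml") as f:
+     cfg = yaml.safe_load(f)
+
+ active = cfg["backend"]["active"]   # e.g. "GEMINI"
+ backend_cfg = cfg[active.lower()]   # the matching backend section (gemini/openai/ollama)
+ print("Backend:", active)
+ print("LLM model:", backend_cfg["llm_model"], "| VLM model:", backend_cfg["vlm_model"])
+ print("Input dir:", cfg["paths"]["input_pdf_dir"])
+ print("Target QA pairs:", cfg["qa_generation"]["target_qa_pairs"])
+ ```
+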
+ ## Command Line Options
+
+ | Option | Short | Description | Default |
+ |--------|-------|-------------|---------|
+ | `--input` | `-i` | Input directory with documents | Required |
+ | `--output` | `-o` | Output directory for results | Required |
+ | `--api-key` | `-k` | API key for LLM backend | From env |
+ | `--backend` | `-b` | Backend: gemini, openai, ollama | gemini |
+ | `--model` | | Model name | Auto |
+ | `--config` | `-c` | Config file path | config.yaml |
+ | `--num-qa-pairs` | | Target QA pairs to generate | 100 |
+ | `--max-workers` | | Parallel workers | 4 |
+ | `--preflight` | | Run preflight checks only | - |
+ | `--skip-preflight` | | Skip preflight checks | - |
+ | `--skip-pdf-processing` | | Skip PDF conversion | - |
+ | `--skip-chunking` | | Skip chunking step | - |
+ | `--verbose` | `-v` | Verbose output | - |
+ | `--version` | | Show version | - |
+ | `--help` | `-h` | Show help | - |
+
+ ## Output Format
+
+ ### Generated Files
+
+ ```
+ output/my_dataset/
+ ├── markdown/                # Converted markdown files
+ ├── chunks.json              # Semantic chunks
+ ├── qa_dataset.json          # Raw QA pairs
+ ├── qa_deduplicated.json     # Final deduplicated QA pairs
+ ├── evaluation_report.json   # Quality metrics
+ └── run_config.json          # Run configuration
+ ```
+
+ ### QA Dataset Structure
+
+ ```json
+ {
+   "chunk_id": 1,
+   "question": "What is the company's revenue growth?",
+   "answer": "The company achieved 15% revenue growth...",
+   "context_chunks": [...],
+   "hop_count": 2,
+   "relevance_score": "9",
+   "difficulty_score": "7",
+   "expert_persona": "Financial Analyst",
+   "domain": "Finance"
+ }
+ ```
+
+ <p align="center">
+ <img src="assets/ample question-answer pair generated.png" alt="Sample QA Pair" width="100%">
+ </p>
+
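+ A quick way to sanity-check the generated dataset is to load it and summarize a few fields. The following is an illustrative sketch (not part of the package), assuming `qa_deduplicated.json` is a JSON array of records shaped like the example above:
+
+ ```python
+ import json
+ from collections import Counter
+
+ # Output path from the Quick Start example; adjust to your --output directory.
+ with open("output/my_dataset/qa_deduplicated.json") as f:
+     qa_pairs = json.load(f)
+
+ print("Total QA pairs:", len(qa_pairs))
+ print("Hop counts:", Counter(item["hop_count"] for item in qa_pairs))
+ print("Domains:", Counter(item.get("domain", "unknown") for item in qa_pairs))
+
+ sample = qa_pairs[0]
+ print("Example question:", sample["question"])
+ print("Expert persona:", sample["expert_persona"])
+ ```
+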
+ ## Project Structure
+
+ ```
+ MiRAGE/
+ ├── src/mirage/              # Main package
+ │   ├── core/                # LLM interfaces, prompts, config
+ │   ├── embeddings/          # Embedding models, rerankers
+ │   ├── pipeline/            # PDF processing, QA generation
+ │   ├── evaluation/          # Metrics
+ │   └── utils/               # Utilities
+ ├── data/                    # Your documents
+ │   └── documents/           # Input folder
+ ├── output/                  # Generated results
+ ├── config.yaml.example      # Example configuration
+ ├── run_mirage.py            # Main entry point
+ └── README.md
+ ```
+
+ ## Examples
+
+ ### Generate QA from PDFs
+
+ ```bash
+ # Using Gemini
+ export GEMINI_API_KEY="your-key"
+ python run_mirage.py -i data/pdfs -o output/qa_dataset
+
+ # Using OpenAI
+ export OPENAI_API_KEY="your-key"
+ python run_mirage.py -i data/pdfs -o output/qa_dataset --backend openai
+
+ # Using Ollama (local, free)
+ python run_mirage.py -i data/pdfs -o output/qa_dataset --backend ollama
+ ```
+
+ ### Generate More QA Pairs
+
+ ```bash
+ python run_mirage.py -i data/documents -o output/large_dataset --num-qa-pairs 500
+ ```
+
+ ### Use More Workers
+
+ ```bash
+ python run_mirage.py -i data/documents -o output/fast_run --max-workers 8
+ ```
+
+ ### Skip Already Processed Steps
+
+ ```bash
+ # If you already have markdown files
+ python run_mirage.py -i data/documents -o output/results --skip-pdf-processing
+
+ # If you already have chunks
+ python run_mirage.py -i data/documents -o output/results --skip-chunking
+ ```
+
+ ## Troubleshooting
+
+ ### API Key Issues
+
+ ```bash
+ # Check if API key is set
+ echo $GEMINI_API_KEY
+
+ # Set it if missing
+ export GEMINI_API_KEY="your-key"
+ ```
+
+ ### Import Errors
+
+ ```bash
+ # Reinstall package
+ pip install -e .
+ ```
+
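+ To confirm the package is importable after reinstalling, a quick check from Python (the `mirage` top-level package name comes from the wheel's `top_level.txt`):
+
+ ```python
+ # Verify the installed package resolves to the environment you expect.
+ import mirage
+ print("mirage imported from:", mirage.__file__)
+ ```
+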
+ ### Preflight Check Failures
+
+ ```bash
+ # Run verbose preflight
+ python run_mirage.py --preflight --verbose
+ ```
+
+ ## Contributing
+
+ 1. Fork the repository
+ 2. Create a feature branch
+ 3. Make your changes
+ 4. Submit a pull request
+
+ See [CONTRIBUTING.md](CONTRIBUTING.md) for details.
+
+ ## Citation
+
+ ```bibtex
+ @software{mirage2024,
+   title = {MiRAGE: A Multiagent Framework for Generating Multimodal Multihop Question-Answer Dataset for RAG Evaluation},
+   author = {MiRAGE Authors},
+   year = {2026},
+   url = {https://github.com/ChandanKSahu/MiRAGE}
+ }
+ ```
+
+ ## License
+
+ Apache License 2.0 - see [LICENSE](LICENSE)
+
+ ## Acknowledgments
+
+ - [RAGAS](https://github.com/explodinggradients/ragas) for evaluation metrics
+ - [BERTopic](https://github.com/MaartenGr/BERTopic) for topic modeling
+ - [FAISS](https://github.com/facebookresearch/faiss) for similarity search
+ - [Docling](https://github.com/DS4SD/docling) for PDF processing
+
+
@@ -0,0 +1,30 @@
+ mirage/__init__.py,sha256=wjg2h7W2McIwR2m3mcpfqZ1VfPlLXSJXGQ6YnnjRF9w,2588
+ mirage/cli.py,sha256=MGzZ9rHCa4q5TesEsgMbigt49wC5qGRMMP7iMlzKi58,4824
+ mirage/core/__init__.py,sha256=YQTAWi5AfsHWnKuI_W2_vQ2tzt7ooevkHEqruayAqo8,1484
+ mirage/core/config.py,sha256=hQjdYt-pohPmYPBFidnKUg7p-1y84ljx29fbNY0f-6Y,7235
+ mirage/core/llm.py,sha256=qpPcrM6WWFNMF6Y0WAuPvGlMbd6k7ILRCj1oa3Pm4zY,77587
+ mirage/core/prompts.py,sha256=J4SJ-sjDjYRDfIr9aJSvuz07ecy_Jt6Amza0SsqpZ7g,22953
+ mirage/embeddings/__init__.py,sha256=RlXUthPGdNNXU-09Ms0Wrm7AQifLUP107_6OHz0AXR8,1291
+ mirage/embeddings/models.py,sha256=0x0q2tXfulXqA_lHwYDvhxfQ3LWWLRcv7khb_H2FiJM,21519
+ mirage/embeddings/rerankers_multimodal.py,sha256=Lf-doykXHmMA-Rzem6-tS5DwbukNy-bTx-5WmNLCJ0E,32780
+ mirage/embeddings/rerankers_text.py,sha256=VviQ7TiGYl8xIAMbZeh5olSAeBLjg-cDcIwd-0oWCAc,6406
+ mirage/evaluation/__init__.py,sha256=yL29oL1HLZuXJgrgCHdfXZpvTF-nV5uQN-LUUjPwp8c,1033
+ mirage/evaluation/metrics.py,sha256=odWhLx0pU-vTniy6Yf3khwgmiR8R_phrdhencgy7GHs,98042
+ mirage/evaluation/metrics_optimized.py,sha256=KP2YP8Y5kqRTAa7g7InAjWmm2h-iipirE3bWeWYUavY,88048
+ mirage/pipeline/__init__.py,sha256=StBcwv_doTd639v7wTg4ZCeV-aNv8QvwuFNLDzW4nrU,1984
+ mirage/pipeline/chunker.py,sha256=0eiqqn4x5nxkj8ovFXMFM99AoF5CE6KeVtfJvNyDh3M,21612
+ mirage/pipeline/context.py,sha256=_sNTeNo2Wsq72mg0uHD52dBHfVNCbZFyIjbDfS5ozfE,42772
+ mirage/pipeline/deduplication.py,sha256=5fko2OkfEBQx5lKHYNOrqJP4roETMnBTK7IOEbQ9Prw,19074
+ mirage/pipeline/domain.py,sha256=AsZDnBfOWdmTcelb6j-SfUlrNoQYDAiycqb4l2AE2bw,20817
+ mirage/pipeline/pdf_processor.py,sha256=6gcxPrJl9gujuMOugUMG8KAoSvxBfhBXXmaC0ixAXMw,25358
+ mirage/pipeline/qa_generator.py,sha256=ZKC2LJiZfiNqs6o4IQQkF5-A6S9aKFlbTayKc1amNDI,34490
+ mirage/utils/__init__.py,sha256=h3UTJyyEZWIectZbklgOaA0s73xSIe8fJKwgcDpiodk,1285
+ mirage/utils/ablation.py,sha256=vfUNAfXHhF4_S2Sxpu4kZ9akWcQzWqPahg7FMzZsJm8,12272
+ mirage/utils/preflight.py,sha256=QRmzL-YJzAy0jp5wdYg7zmoFRJ-uZq1PFsDdIUcc-aI,23998
+ mirage/utils/stats.py,sha256=kqrP2BuLy6eZnZRQZqE7julNrrsKEznOGFvFH03kpjw,23115
+ mirage_benchmark-1.0.4.dist-info/licenses/LICENSE,sha256=S5DUh9Vf2wAP1uRJuvpRPF3fCZJbq_O-mbx-R9gOBIY,10763
+ mirage_benchmark-1.0.4.dist-info/METADATA,sha256=94EtjpY6QD2pWU0JFMIwx4ZLJH43yVLhQ4kzr8BonSc,12926
+ mirage_benchmark-1.0.4.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ mirage_benchmark-1.0.4.dist-info/entry_points.txt,sha256=OvyHIX_38WFVMsNJMqkpdcQXY9-kQgR2173ZRKeNlcM,90
+ mirage_benchmark-1.0.4.dist-info/top_level.txt,sha256=x6Yl54RzCGuLqqWDF4zZ-tEaP2pUKzUqb57hVrbakVI,7
+ mirage_benchmark-1.0.4.dist-info/RECORD,,
@@ -0,0 +1,5 @@
+ Wheel-Version: 1.0
+ Generator: setuptools (80.9.0)
+ Root-Is-Purelib: true
+ Tag: py3-none-any
+
@@ -0,0 +1,3 @@
+ [console_scripts]
+ mirage = mirage.cli:main
+ mirage-preflight = mirage.utils.preflight:main