PyPI - quiz-gen - Versions diffs - 0.1.5__py3-none-any.whl - Mend

quiz-gen 0.1.5__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (37)

quiz_gen/__init__.py +23 -0
quiz_gen/__version__.py +13 -0
quiz_gen/agents/__init__.py +0 -0
quiz_gen/agents/answer_generator.py +0 -0
quiz_gen/agents/base_agent.py +0 -0
quiz_gen/agents/orchestrator.py +0 -0
quiz_gen/agents/question_generator.py +0 -0
quiz_gen/agents/reviewer.py +0 -0
quiz_gen/agents/validator.py +0 -0
quiz_gen/cli.py +209 -0
quiz_gen/config.py +0 -0
quiz_gen/models/__init__.py +0 -0
quiz_gen/models/chunk.py +0 -0
quiz_gen/models/document.py +0 -0
quiz_gen/models/question.py +0 -0
quiz_gen/models/quiz.py +0 -0
quiz_gen/parsers/__init__.py +13 -0
quiz_gen/parsers/base.py +0 -0
quiz_gen/parsers/html/eu_lex_parser.py +805 -0
quiz_gen/parsers/pdf_parser.py +0 -0
quiz_gen/parsers/utils.py +0 -0
quiz_gen/storage/__init__.py +0 -0
quiz_gen/storage/base.py +0 -0
quiz_gen/storage/database.py +0 -0
quiz_gen/storage/json_storage.py +0 -0
quiz_gen/utils/__init__.py +0 -0
quiz_gen/utils/helpers.py +0 -0
quiz_gen/utils/logging.py +0 -0
quiz_gen/validation/__init__.py +0 -0
quiz_gen/validation/human_feedback.py +0 -0
quiz_gen/validation/quality_checker.py +0 -0
quiz_gen-0.1.5.dist-info/METADATA +395 -0
quiz_gen-0.1.5.dist-info/RECORD +37 -0
quiz_gen-0.1.5.dist-info/WHEEL +5 -0
quiz_gen-0.1.5.dist-info/entry_points.txt +2 -0
quiz_gen-0.1.5.dist-info/licenses/LICENSE +21 -0
quiz_gen-0.1.5.dist-info/top_level.txt +1 -0

quiz_gen/parsers/pdf_parser.py ADDED

File without changes

quiz_gen/parsers/utils.py ADDED

File without changes

quiz_gen/storage/__init__.py ADDED

File without changes

quiz_gen/storage/base.py ADDED

File without changes

quiz_gen/storage/database.py ADDED

File without changes

quiz_gen/storage/json_storage.py ADDED

File without changes

quiz_gen/utils/__init__.py ADDED

File without changes

quiz_gen/utils/helpers.py ADDED

File without changes

quiz_gen/utils/logging.py ADDED

File without changes

quiz_gen/validation/__init__.py ADDED

File without changes

quiz_gen/validation/human_feedback.py ADDED

File without changes

quiz_gen/validation/quality_checker.py ADDED

File without changes

quiz_gen-0.1.5.dist-info/METADATA ADDED

@@ -0,0 +1,395 @@
+Metadata-Version: 2.4
+Name: quiz-gen
+Version: 0.1.5
+Summary: AI-powered quiz generator for regulatory, certification, and educational documentation
+Author-email: Yauheniya Varabyova <yauheniya.ai@gmail.com>
+License-Expression: MIT
+Project-URL: Documentation, https://quiz-gen.readthedocs.io
+Project-URL: Repository, https://github.com/yauheniya-ai/quiz-gen
+Project-URL: Bug Tracker, https://github.com/yauheniya-ai/quiz-gen/issues
+Keywords: quiz,regulation,certification,education,eur-lex,cfr
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Education
+Classifier: Intended Audience :: Developers
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: beautifulsoup4>=4.12.0
+Requires-Dist: lxml>=5.0.0
+Requires-Dist: requests>=2.31.0
+Provides-Extra: dev
+Requires-Dist: pytest>=7.4.0; extra == "dev"
+Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
+Requires-Dist: black>=23.0.0; extra == "dev"
+Requires-Dist: ruff>=0.1.0; extra == "dev"
+Requires-Dist: mypy>=1.5.0; extra == "dev"
+Requires-Dist: pre-commit>=3.4.0; extra == "dev"
+Requires-Dist: twine>=4.0.0; extra == "dev"
+Requires-Dist: mkdocs; extra == "dev"
+Requires-Dist: mkdocs-material; extra == "dev"
+Dynamic: license-file
+# quiz-gen
+[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![PyPI version](https://img.shields.io/pypi/v/quiz-gen?color=blue&label=PyPI)](https://pypi.org/project/quiz-gen/)
+[![GitHub last commit](https://img.shields.io/github/last-commit/yauheniya-ai/quiz-gen)](https://github.com/yauheniya-ai/quiz-gen/commits/main)
+[![Downloads](https://pepy.tech/badge/quiz-gen)](https://pepy.tech/project/quiz-gen)
+AI-powered quiz generator for regulatory, certification, and educational documentation. Extract structured content from complex legal and technical documents to create comprehensive learning materials.
+## Features
+- **EUR-Lex Document Parser**: Parse and structure European Union legal documents with full table of contents extraction
+- **Hierarchical Document Analysis**: Automatically identify document structure including chapters, sections, articles, and recitals
+- **Intelligent Chunking**: Extract meaningful content chunks at appropriate granularity levels (articles and recitals)
+- **Table of Contents Generation**: Build complete document navigation structure with 3-level hierarchy
+- **Regulatory Document Support**: Specialized parsing for aviation regulations, directives, and other technical documentation
+## Installation
+```bash
+pip install quiz-gen
+```
+## Quick Start
+### Parsing EUR-Lex Documents
+```python
+from quiz_gen import EURLexParser
+# Parse a regulation document
+url = "https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=OJ:L_202401689"
+parser = EURLexParser(url=url)
+chunks, toc = parser.parse()
+# Access structured content
+print(f"Extracted {len(chunks)} content chunks")
+print(f"Document has {len(toc['sections'])} major sections")
+# Save results
+parser.save_chunks('output_chunks.json')
+parser.save_toc('output_toc.json')
+```
+### Document Structure
+The parser extracts documents into a multi-level hierarchy:
+**Level 1**: Major Sections
+- Preamble
+- Enacting Terms
+**Level 2/3**: Structural Divisions
+- Chapters
+- Sections
+**Level 1/2/3/4**: Content Elements
+- Title
+- Citation
+- Recitals
+- Articles
+- Concluding formulas
+- Annex
+- Appendix
+### Working with Chunks
+```python
+# Iterate through extracted chunks
+for chunk in chunks:
+    print(f"{chunk.title}")
+    print(f"Type: {chunk.section_type.value}")
+    print(f"Number: {chunk.number}")
+    print(f"Content: {chunk.content[:200]}...")
+    print(f"Hierarchy: {' > '.join(chunk.hierarchy_path)}")
+    print()
+```
+### Displaying Table of Contents
+```python
+# Print formatted TOC
+parser.print_toc()
+# Output:
+# PREAMBLE
+#   Citation
+#   Recital 1
+#   Recital 2
+#   ...
+#
+# ENACTING TERMS
+#   CHAPTER I - PRINCIPLES
+#     Article 1 - Subject matter and objectives
+#     Article 2 - Scope
+```
+## Use Cases
+### Compliance and Legal
+- Analyze regulatory requirements systematically
+- Track changes across document versions
+- Build searchable knowledge bases from legal texts
+### Documentation Processing
+- Convert unstructured documents into structured data
+- Build citation networks and cross-references
+- Support automated document analysis workflows
+### Education and Training
+- Generate study materials from regulatory documents
+- Create structured learning paths for certification programs
+- Extract key concepts for examination preparation
+## Supported Document Types
+Currently supports:
+- **EUR-Lex HTML Documents**: European Union regulations, directives, decisions
+- **Legislative Acts**: Structured legal documents with formal hierarchies
+### Document Format Requirements
+- Documents must use EUR-Lex HTML format
+- Must contain `eli-subdivision` elements for proper structure identification
+- Supports multi-level hierarchies with chapters, sections, and articles
+## Advanced Usage
+### Custom Parsing Workflows
+```python
+from quiz_gen import EURLexParser
+parser = EURLexParser(url=document_url)
+# Parse specific sections
+parser._parse_preamble()  # Extract citations and recitals
+parser._parse_enacting_terms()  # Extract chapters and articles
+parser._parse_annexes()  # Extract annexes
+# Access intermediate results
+toc = parser.toc  # Full table of contents
+chunks = parser.chunks  # Content chunks only
+```
+### Filtering Chunks by Type
+```python
+from quiz_gen import SectionType
+# Get only recitals
+recitals = [c for c in chunks if c.section_type == SectionType.RECITAL]
+# Get only articles
+articles = [c for c in chunks if c.section_type == SectionType.ARTICLE]
+# Filter by chapter
+chapter_1_articles = [
+    c for c in articles
+    if 'CHAPTER I' in ' > '.join(c.hierarchy_path)
+]
+```
+### Accessing Metadata
+```python
+for chunk in chunks:
+    # Access structured metadata
+    print(chunk.metadata)  # {'id': 'art_1', 'subtitle': '...'}
+    # Navigate hierarchy
+    print(chunk.hierarchy_path)  # ['CHAPTER I - PRINCIPLES', 'Article 1']
+    # Identify parent sections
+    print(chunk.parent_section)
+```
+## Project Structure
+```
+quiz-gen/
+├── src/
+│   └── quiz_gen/
+│       ├── parsers/
+│       │   └── html/
+│       │       └── eu_lex_parser.py
+│       ├── models/
+│       │   ├── chunk.py
+│       │   ├── document.py
+│       │   └── quiz.py
+│       └── utils/
+├── examples/
+│   └── eu_lex_toc_chunks.py
+├── tests/
+├── data/
+│   ├── processed/
+│   └── raw/
+└── docs/
+```
+## Development
+### Setting up Development Environment
+```bash
+# Clone the repository
+git clone https://github.com/yauheniya-ai/quiz-gen.git
+cd quiz-gen
+# Install with development dependencies
+pip install -e ".[dev]"
+# Run tests
+pytest
+# Run linting
+ruff check .
+black .
+```
+### Contributing
+Contributions are welcome! Please ensure:
+1. Code follows PEP 8 style guidelines
+2. All tests pass
+3. New features include appropriate tests
+4. Documentation is updated
+## API Reference
+### EURLexParser
+Main parser class for EUR-Lex documents.
+**Methods**:
+- `parse()` -> `tuple[List[RegulationChunk], Dict]`: Parse document and return chunks and TOC
+- `fetch()` -> `str`: Fetch HTML content from URL
+- `save_chunks(filepath: str)`: Save chunks to JSON file
+- `save_toc(filepath: str)`: Save table of contents to JSON file
+- `print_toc()`: Display formatted table of contents
+### RegulationChunk
+Represents a parsed content chunk (article or recital).
+**Attributes**:
+- `section_type`: Type of section (ARTICLE, RECITAL, etc.)
+- `number`: Section number (e.g., "1", "42")
+- `title`: Full title including subtitle
+- `content`: Text content
+- `hierarchy_path`: List of parent sections
+- `metadata`: Additional structured data
+### SectionType
+Enumeration of document section types.
+**Values**:
+- `PREAMBLE`: Preamble section
+- `ENACTING_TERMS`: Main regulatory content
+- `CITATION`: Citation in preamble
+- `RECITAL`: Recital in preamble
+- `CHAPTER`: Chapter division
+- `SECTION`: Section within chapter
+- `ARTICLE`: Article (main content unit)
+- `ANNEX`: Annex section
+## Roadmap
+Future enhancements planned:
+- AI-powered quiz generation from extracted content
+- Support for additional document formats (PDF, DOCX, PPTX)
+- Multi-language support
+- Question validation and quality metrics
+- Integration with learning management systems
+- Version comparison and diff analysis
+## License
+This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
+## Citation
+If you use this software in academic work, please cite:
+```
+Varabyova, Y. (2026). Quiz Gen AI: AI-powered quiz generator for regulatory documentation.
+GitHub repository: https://github.com/yauheniya-ai/quiz-gen
+```
+## Support
+- Documentation: https://quiz-gen.readthedocs.io
+- Issue Tracker: https://github.com/yauheniya-ai/quiz-gen/issues
+## Acknowledgments
+Built with:
+- BeautifulSoup4 for HTML parsing
+- lxml for XML processing
+- EUR-Lex for providing structured legal documents
+## Changelog
+### Version 0.1.0 (2026-01-17)
+Initial release:
+- EUR-Lex document parser
+- Hierarchical document structure extraction
+- Table of contents generation
+- JSON export for chunks and TOC
+### Version 0.1.1 (2026-01-18)
+Parser enhancements:
+- Added regulation title extraction and chunking
+- Support for flexible 3-4 level hierarchy with sections within chapters
+- Complete annexes extraction including table-based content
+- Combined citations into a single chunk matching the EUR-Lex structure
+- Added concluding formulas parsing
+### Version 0.1.2 (2026-01-18)
+Text formatting and tooling:
+- Implemented smart text cleaning for proper list formatting (removes extra newlines after list markers)
+- Fixed numbered paragraph spacing
+- Added professional command-line interface (CLI)
+- Created comprehensive documentation with MkDocs and Material theme
+### Version 0.1.3 (2026-01-19)
+Parser robustness improvements:
+- Fixed parsing of articles directly under enacting terms (without chapter hierarchy)
+- Enhanced article content extraction to handle table-based list items (e.g., (a), (b), (c) in table cells)
+- Added proper appendix detection and parsing (distinguishes appendices from annexes)
+- Improved title extraction for multi-paragraph appendix titles
+### Version 0.1.4 (2026-01-19)
+Annex parsing improvements:
+- Added intelligent detection and parsing of parts within annexes (PART 1, PART 2, etc.)
+- Improved part titles to include annex identifier (e.g., "ANNEX 1 - PART 1" instead of "ANNEX - PART 1")
+- Removed arbitrary content truncation in annexes and appendices - all content now preserved in full
+- Enhanced content collection for parts with proper boundary detection between sections
+### Version 0.1.5 (2026-01-19)
+Bug fixes:
+- Fixed annex TOC title to display with identifier (e.g., "ANNEX 1" instead of "ANNEX")
+- Fixed empty content in annex parts by switching from sibling navigation to descendants iteration
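
Note on reading the METADATA hunk above: the header block (everything before the `# quiz-gen` README heading) uses RFC 822-style `Key: value` fields, so the standard library's email parser can read it. The following is a minimal sketch, assuming a shortened sample hand-copied from this diff with the leading `+` removed; the helper name `parse_metadata` is hypothetical and not part of quiz-gen, and the runtime/extra split is a simplification that only looks at the `extra ==` marker.

```python
from email.parser import Parser

# Shortened sample copied from the METADATA headers in this diff
# (diff "+" prefixes removed, blank line restored before the description).
SAMPLE = """\
Metadata-Version: 2.4
Name: quiz-gen
Version: 0.1.5
Requires-Python: >=3.10
Requires-Dist: beautifulsoup4>=4.12.0
Requires-Dist: lxml>=5.0.0
Requires-Dist: requests>=2.31.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"

# quiz-gen
"""


def parse_metadata(text: str) -> dict:
    """Hypothetical helper: split core metadata into a few useful fields."""
    msg = Parser().parsestr(text)
    requires = msg.get_all("Requires-Dist") or []
    # Simplification: a requirement belongs to an extra if its marker says so.
    runtime = [r for r in requires if "extra ==" not in r]
    extra = [r.split(";", 1)[0].strip() for r in requires if "extra ==" in r]
    return {
        "name": msg["Name"],
        "version": msg["Version"],
        "runtime": runtime,
        "dev": extra,
    }


if __name__ == "__main__":
    print(parse_metadata(SAMPLE))
```

Read this way, the package declares three runtime dependencies (`beautifulsoup4`, `lxml`, `requests`); everything else in the `Requires-Dist` list is gated behind the `dev` extra.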

quiz_gen-0.1.5.dist-info/RECORD ADDED

@@ -0,0 +1,37 @@
+quiz_gen/__init__.py,sha256=BLiiFuMIAzlX_G0mTpoGpx1Y6V19kurOM7dhqZopAJg,540
+quiz_gen/__version__.py,sha256=03oJbrV_EJ7RkHGfnaj4OK9XO1l5fKnBwqKcHiKtgKk,410
+quiz_gen/cli.py,sha256=rpqIDQMqnqppeOQkDNtXq63egC-eUgIEsdcGDpWyz1E,6264
+quiz_gen/config.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/agents/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/agents/answer_generator.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/agents/base_agent.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/agents/orchestrator.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/agents/question_generator.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/agents/reviewer.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/agents/validator.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/models/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/models/chunk.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/models/document.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/models/question.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/models/quiz.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/parsers/__init__.py,sha256=g2KpJf5yen7mZJy-GWQXybGK2k7mZuYEpyfqKg1rZEw,230
+quiz_gen/parsers/base.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/parsers/pdf_parser.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/parsers/utils.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/parsers/html/eu_lex_parser.py,sha256=-3bJ8RVjqeD9qKysMKG6ICPcLYI81_TtYuWdkn-Yobo,36109
+quiz_gen/storage/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/storage/base.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/storage/database.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/storage/json_storage.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/utils/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/utils/helpers.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/utils/logging.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/validation/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/validation/human_feedback.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen/validation/quality_checker.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+quiz_gen-0.1.5.dist-info/licenses/LICENSE,sha256=bJXbgXAmDWf3_2rn3BVh6-4wZtB5xbycNfqTHyGu_tE,1076
+quiz_gen-0.1.5.dist-info/METADATA,sha256=NSWmeH_HCGkIA6gqqevP7BCObHA8jxprD0ZK2Y7saoA,11475
+quiz_gen-0.1.5.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+quiz_gen-0.1.5.dist-info/entry_points.txt,sha256=uRVta04GJsOC3b3AZUONSq924-Vqwp2UjsEGYsJKQe4,47
+quiz_gen-0.1.5.dist-info/top_level.txt,sha256=Nrt267uGX_L3FsFX2NSW0Uh9XQ9LthH7Mss9n9VrL2g,9
+quiz_gen-0.1.5.dist-info/RECORD,,
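
Note on the RECORD hunk above: each line is `path,hash,size`, where the hash is the file's SHA-256 digest in URL-safe base64 with the `=` padding removed. The value `47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU`, shared by every entry of size 0, is the digest of empty input, which matches the `+0 -0` files in the list at the top. A minimal sketch of both the encoding and the line format; the function names `record_hash` and `parse_record_line` are hypothetical, and input lines are assumed to have the diff's leading `+` already stripped.

```python
import base64
import hashlib
from typing import Optional, Tuple


def record_hash(data: bytes) -> str:
    """Encode a SHA-256 digest the way wheel RECORD files do."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def parse_record_line(line: str) -> Tuple[str, str, Optional[int]]:
    """Split one RECORD line into (path, hash field, size or None)."""
    # Split from the right so a comma inside the path would not break parsing.
    path, hash_field, size = line.rsplit(",", 2)
    return path, hash_field, int(size) if size else None


if __name__ == "__main__":
    # An empty file reproduces the hash seen on every zero-size entry.
    print(record_hash(b""))
    print(parse_record_line("quiz_gen-0.1.5.dist-info/RECORD,,"))
```

The RECORD file lists itself with empty hash and size fields, which is why its own line ends in `,,`.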

quiz_gen-0.1.5.dist-info/WHEEL ADDED

@@ -0,0 +1,5 @@
+Wheel-Version: 1.0
+Generator: setuptools (80.9.0)
+Root-Is-Purelib: true
+Tag: py3-none-any
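
Note on the WHEEL hunk above: the `Tag` field is a compatibility tag made of three dash-separated parts (Python tag, ABI tag, platform tag). `py3-none-any` together with `Root-Is-Purelib: true` describes a pure-Python wheel with no ABI or platform restriction. A minimal sketch of reading these fields, assuming the four lines shown in this diff with the `+` prefix removed; the helper names `parse_wheel_fields` and `split_tag` are hypothetical.

```python
from typing import Dict, List, Tuple

# The WHEEL lines shown in this diff, without the "+" prefix.
WHEEL_TEXT = """\
Wheel-Version: 1.0
Generator: setuptools (80.9.0)
Root-Is-Purelib: true
Tag: py3-none-any
"""


def parse_wheel_fields(text: str) -> Dict[str, List[str]]:
    """Collect 'Key: value' lines; a key such as Tag may appear more than once."""
    fields: Dict[str, List[str]] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields.setdefault(key.strip(), []).append(value.strip())
    return fields


def split_tag(tag: str) -> Tuple[str, str, str]:
    """Split a simple compatibility tag into (python, abi, platform)."""
    python_tag, abi_tag, platform_tag = tag.split("-")
    return python_tag, abi_tag, platform_tag


if __name__ == "__main__":
    fields = parse_wheel_fields(WHEEL_TEXT)
    print(fields["Generator"], split_tag(fields["Tag"][0]))
```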

quiz_gen-0.1.5.dist-info/entry_points.txt ADDED

@@ -0,0 +1,2 @@
+[console_scripts]
+quiz-gen = quiz_gen.cli:main
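
Note on the entry_points.txt hunk above: the file is INI-formatted, and each `console_scripts` entry maps a command name to a `module:attribute` target. Here, installing the wheel creates a `quiz-gen` command that calls `main` in `quiz_gen/cli.py`. A minimal sketch of reading that mapping with the standard library; the helper name `console_scripts` is hypothetical, and the sample text is the two lines from this diff without the `+` prefix.

```python
import configparser
from typing import Dict, Tuple

# The entry_points.txt lines shown in this diff, without the "+" prefix.
ENTRY_POINTS = """\
[console_scripts]
quiz-gen = quiz_gen.cli:main
"""


def console_scripts(text: str) -> Dict[str, Tuple[str, str]]:
    """Map each console script name to its (module, attribute) target."""
    # Use only "=" as delimiter so the ":" in "module:attr" stays in the value.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep script names case-sensitive
    parser.read_string(text)
    scripts: Dict[str, Tuple[str, str]] = {}
    if parser.has_section("console_scripts"):
        for name, target in parser.items("console_scripts"):
            module, _, attr = target.partition(":")
            scripts[name] = (module.strip(), attr.strip())
    return scripts


if __name__ == "__main__":
    print(console_scripts(ENTRY_POINTS))
```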

quiz_gen-0.1.5.dist-info/licenses/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Yauheniya Varabyova
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

quiz_gen-0.1.5.dist-info/top_level.txt ADDED

@@ -0,0 +1 @@
+quiz_gen