lopace 0.1.3__tar.gz → 0.1.5__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (64)
  1. lopace-0.1.5/.dockerignore +57 -0
  2. lopace-0.1.5/.github/ISSUE_TEMPLATE/bug_report.md +59 -0
  3. lopace-0.1.5/.github/ISSUE_TEMPLATE/feature_request.md +44 -0
  4. lopace-0.1.5/.github/dependabot.yml +34 -0
  5. lopace-0.1.5/.github/pull_request_template.md +70 -0
  6. lopace-0.1.5/.github/workflows/ci-cd.yml +142 -0
  7. lopace-0.1.5/.github/workflows/test-only.yml +45 -0
  8. lopace-0.1.5/.gitignore +79 -0
  9. lopace-0.1.5/CHANGELOG.md +87 -0
  10. lopace-0.1.5/CODE_OF_CONDUCT.md +119 -0
  11. lopace-0.1.5/CONTRIBUTING.md +282 -0
  12. lopace-0.1.5/Dockerfile +38 -0
  13. {lopace-0.1.3/lopace.egg-info → lopace-0.1.5}/PKG-INFO +37 -12
  14. {lopace-0.1.3 → lopace-0.1.5}/README.md +36 -11
  15. lopace-0.1.5/SECURITY.md +106 -0
  16. {lopace-0.1.3 → lopace-0.1.5}/lopace/__init__.py +5 -1
  17. lopace-0.1.5/lopace/_version.py +34 -0
  18. {lopace-0.1.3 → lopace-0.1.5/lopace.egg-info}/PKG-INFO +37 -12
  19. lopace-0.1.5/lopace.egg-info/SOURCES.txt +61 -0
  20. lopace-0.1.5/notebooks/LoPace_Complete_Guide.ipynb +798 -0
  21. lopace-0.1.5/notebooks/README.md +64 -0
  22. lopace-0.1.5/paper/lopace-preprint-arxiv-01.png +0 -0
  23. lopace-0.1.5/paper/lopace-preprint-arxiv.pdf +0 -0
  24. {lopace-0.1.3 → lopace-0.1.5}/pyproject.toml +9 -3
  25. lopace-0.1.5/requirements-dev.txt +14 -0
  26. lopace-0.1.5/screenshots/.gitkeep +2 -0
  27. lopace-0.1.5/screenshots/benchmark_data.csv +31 -0
  28. lopace-0.1.5/screenshots/comprehensive_comparison.png +0 -0
  29. lopace-0.1.5/screenshots/comprehensive_comparison.svg +961 -0
  30. lopace-0.1.5/screenshots/compression-pipeline.png +0 -0
  31. lopace-0.1.5/screenshots/compression_ratio.png +0 -0
  32. lopace-0.1.5/screenshots/compression_ratio.svg +546 -0
  33. lopace-0.1.5/screenshots/disk_size_comparison.png +0 -0
  34. lopace-0.1.5/screenshots/disk_size_comparison.svg +576 -0
  35. lopace-0.1.5/screenshots/logo-text.png +0 -0
  36. lopace-0.1.5/screenshots/logo.png +0 -0
  37. lopace-0.1.5/screenshots/lopace-compression-technique.png +0 -0
  38. lopace-0.1.5/screenshots/memory_usage.png +0 -0
  39. lopace-0.1.5/screenshots/memory_usage.svg +721 -0
  40. lopace-0.1.5/screenshots/original_vs_decompressed.png +0 -0
  41. lopace-0.1.5/screenshots/original_vs_decompressed.svg +2804 -0
  42. lopace-0.1.5/screenshots/scalability_analysis.png +0 -0
  43. lopace-0.1.5/screenshots/scalability_analysis.svg +1267 -0
  44. lopace-0.1.5/screenshots/space_savings.png +0 -0
  45. lopace-0.1.5/screenshots/space_savings.svg +397 -0
  46. lopace-0.1.5/screenshots/speed_metrics.png +0 -0
  47. lopace-0.1.5/screenshots/speed_metrics.svg +1044 -0
  48. lopace-0.1.5/scripts/README.md +81 -0
  49. lopace-0.1.5/scripts/requirements.txt +5 -0
  50. {lopace-0.1.3 → lopace-0.1.5}/setup.py +1 -1
  51. lopace-0.1.5/streamlit_app.py +538 -0
  52. lopace-0.1.3/lopace.egg-info/SOURCES.txt +0 -17
  53. {lopace-0.1.3 → lopace-0.1.5}/LICENSE +0 -0
  54. {lopace-0.1.3 → lopace-0.1.5}/MANIFEST.in +0 -0
  55. {lopace-0.1.3 → lopace-0.1.5}/lopace/compressor.py +0 -0
  56. {lopace-0.1.3 → lopace-0.1.5}/lopace.egg-info/dependency_links.txt +0 -0
  57. {lopace-0.1.3 → lopace-0.1.5}/lopace.egg-info/requires.txt +0 -0
  58. {lopace-0.1.3 → lopace-0.1.5}/lopace.egg-info/top_level.txt +0 -0
  59. {lopace-0.1.3 → lopace-0.1.5}/requirements.txt +0 -0
  60. {lopace-0.1.3 → lopace-0.1.5}/scripts/__init__.py +0 -0
  61. {lopace-0.1.3 → lopace-0.1.5}/scripts/generate_visualizations.py +0 -0
  62. {lopace-0.1.3 → lopace-0.1.5}/setup.cfg +0 -0
  63. {lopace-0.1.3 → lopace-0.1.5}/tests/__init__.py +0 -0
  64. {lopace-0.1.3 → lopace-0.1.5}/tests/test_compressor.py +0 -0
@@ -0,0 +1,57 @@
+ # Git
+ .git
+ .gitignore
+ .github
+
+ # Python
+ __pycache__
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ *.egg-info
+ dist
+ build
+ .eggs
+
+ # Virtual environments
+ venv/
+ ENV/
+ env/
+ .venv
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Testing
+ .pytest_cache/
+ .coverage
+ htmlcov/
+ .tox/
+
+ # Documentation
+ *.md
+ !README.md
+
+ # CI/CD
+ .github/
+
+ # Development files
+ tests/
+ scripts/
+ example.py
+
+ # Logs
+ *.log
+
+ # Cache
+ .mypy_cache/
+ .dmypy.json
+ dmypy.json
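The `*.md` / `!README.md` pair in the ignore file above relies on exception patterns: a later line starting with `!` re-includes something an earlier pattern excluded. The following is a minimal sketch of that last-match-wins rule for top-level entries only. It uses Python's `fnmatch` as a stand-in for Docker's own matcher, so it is an approximation rather than Docker's implementation, and `is_ignored` and `PATTERNS` are hypothetical names chosen for illustration.

```python
from fnmatch import fnmatchcase

# A few of the patterns from the ignore file above, in file order.
PATTERNS = ["*.md", "!README.md", "tests/", "*.log"]


def is_ignored(name, patterns=PATTERNS):
    """Return True if a top-level entry would be left out of the build context.

    Simplified model: the last pattern that matches decides the outcome,
    and a leading "!" turns the pattern into an exception.
    """
    ignored = False
    for pattern in patterns:
        negate = pattern.startswith("!")
        glob = pattern.lstrip("!").rstrip("/")
        if fnmatchcase(name, glob):
            ignored = not negate
    return ignored


for entry in ["README.md", "CHANGELOG.md", "tests", "setup.py"]:
    print(entry, "ignored" if is_ignored(entry) else "kept")
```

Under this model `README.md` stays in the build context while every other Markdown file at the top level is dropped.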
@@ -0,0 +1,59 @@
+ ---
+ name: Bug Report
+ about: Create a report to help us improve
+ title: '[BUG] '
+ labels: bug
+ assignees: ''
+ ---
+
+ ## Bug Description
+
+ A clear and concise description of what the bug is.
+
+ ## Steps to Reproduce
+
+ 1. Import the library: `...`
+ 2. Call the function: `...`
+ 3. See error: `...`
+
+ ## Expected Behavior
+
+ A clear and concise description of what you expected to happen.
+
+ ## Actual Behavior
+
+ A clear and concise description of what actually happened.
+
+ ## Code Example
+
+ ```python
+ from lopace import PromptCompressor, CompressionMethod
+
+ compressor = PromptCompressor()
+ # Your code here
+ ```
+
+ ## Error Message
+
+ ```
+ Paste the full error traceback here
+ ```
+
+ ## Environment
+
+ - **OS**: [e.g., Windows 10, macOS 13, Ubuntu 22.04]
+ - **Python Version**: [e.g., 3.11.0]
+ - **LoPace Version**: [e.g., 0.1.0]
+ - **Installed Dependencies**:
+   ```
+   zstandard==...
+   tiktoken==...
+   ```
+
+ ## Additional Context
+
+ Add any other context about the problem here.
+
+ ## Possible Solution
+
+ If you have suggestions on how to fix this, please describe them here.
@@ -0,0 +1,44 @@
+ ---
+ name: Feature Request
+ about: Suggest an idea for this project
+ title: '[FEATURE] '
+ labels: enhancement
+ assignees: ''
+ ---
+
+ ## Feature Description
+
+ A clear and concise description of the feature you'd like to see.
+
+ ## Motivation
+
+ Why is this feature needed? What problem does it solve?
+
+ ## Proposed Solution
+
+ A clear and concise description of what you want to happen.
+
+ ## Example Usage
+
+ ```python
+ from lopace import PromptCompressor
+
+ compressor = PromptCompressor()
+ # Show how you would use the new feature
+ ```
+
+ ## Alternatives Considered
+
+ A clear and concise description of any alternative solutions or features you've considered.
+
+ ## Additional Context
+
+ Add any other context, mockups, or examples about the feature request here.
+
+ ## Implementation Ideas
+
+ If you have ideas on how to implement this feature, please share them here.
+
+ ## Related Issues
+
+ Link to any related issues or discussions.
@@ -0,0 +1,34 @@
+ version: 2
+ updates:
+   # Enable version updates for pip
+   - package-ecosystem: "pip"
+     directory: "/"
+     schedule:
+       interval: "weekly"
+       day: "monday"
+       time: "09:00"
+     open-pull-requests-limit: 5
+     labels:
+       - "dependencies"
+       - "python"
+     commit-message:
+       prefix: "chore"
+       include: "scope"
+     ignore:
+       # Ignore major version updates for critical dependencies
+       - dependency-name: "zstandard"
+         update-types: ["version-update:semver-major"]
+       - dependency-name: "tiktoken"
+         update-types: ["version-update:semver-major"]
+
+   # Enable version updates for GitHub Actions
+   - package-ecosystem: "github-actions"
+     directory: "/"
+     schedule:
+       interval: "monthly"
+     labels:
+       - "dependencies"
+       - "github-actions"
+     commit-message:
+       prefix: "chore"
+       include: "scope"
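The `ignore` entries in the Dependabot configuration above suppress only major-version bumps for `zstandard` and `tiktoken`; minor and patch updates for those packages should still be proposed. A rough sketch of that rule follows. It is a toy model, not Dependabot's code: `is_major_update`, `would_open_pr`, and `IGNORE_MAJOR` are hypothetical names, and versions are assumed to be plain dotted numbers.

```python
# Packages whose major-version updates the configuration above ignores.
IGNORE_MAJOR = {"zstandard", "tiktoken"}


def is_major_update(current, candidate):
    """True when the leading (major) version component increases."""
    return int(candidate.split(".")[0]) > int(current.split(".")[0])


def would_open_pr(name, current, candidate):
    """Toy model of the ignore rule: skip major bumps for listed packages."""
    if name in IGNORE_MAJOR and is_major_update(current, candidate):
        return False
    return True


print(would_open_pr("zstandard", "0.22.0", "0.23.0"))  # minor bump
print(would_open_pr("zstandard", "0.22.0", "1.0.0"))   # major bump
```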
@@ -0,0 +1,70 @@
+ ## Description
+
+ <!-- Provide a brief description of your changes -->
+
+ ## Type of Change
+
+ <!-- Mark the relevant option with an 'x' -->
+
+ - [ ] Bug fix (non-breaking change which fixes an issue)
+ - [ ] New feature (non-breaking change which adds functionality)
+ - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
+ - [ ] Documentation update
+ - [ ] Performance improvement
+ - [ ] Code refactoring
+ - [ ] Test addition/update
+ - [ ] Other (please describe):
+
+ ## Related Issues
+
+ <!-- Link to related issues using #issue_number -->
+
+ Fixes #
+ Related to #
+
+ ## Changes Made
+
+ <!-- Describe the changes you made in detail -->
+
+ ## Testing
+
+ <!-- Describe the tests you ran and how to verify your changes -->
+
+ - [ ] All existing tests pass
+ - [ ] New tests added for new functionality
+ - [ ] Manual testing completed
+
+ ### Test Commands
+
+ ```bash
+ # Add any test commands you ran
+ pytest tests/ -v
+ ```
+
+ ## Screenshots/Demo
+
+ <!-- If applicable, add screenshots or a demo -->
+
+ ## Checklist
+
+ <!-- Mark completed items with an 'x' -->
+
+ - [ ] My code follows the style guidelines of this project
+ - [ ] I have performed a self-review of my own code
+ - [ ] I have commented my code, particularly in hard-to-understand areas
+ - [ ] I have made corresponding changes to the documentation
+ - [ ] My changes generate no new warnings
+ - [ ] I have added tests that prove my fix is effective or that my feature works
+ - [ ] New and existing unit tests pass locally with my changes
+ - [ ] Any dependent changes have been merged and published
+ - [ ] I have updated the CHANGELOG.md if needed
+ - [ ] I have read and followed the [Contributing Guidelines](CONTRIBUTING.md)
+ - [ ] I have read and agree to the [Code of Conduct](CODE_OF_CONDUCT.md)
+
+ ## Additional Notes
+
+ <!-- Add any additional notes, context, or information for reviewers -->
+
+ ## Review Request
+
+ <!-- Tag specific reviewers or leave blank for general review -->
@@ -0,0 +1,142 @@
+ name: CI/CD - Test and Publish to PyPI
+
+ on:
+   push:
+     branches:
+       - main
+       - master
+     tags:
+       - 'v*' # Trigger on version tags like v0.1.0
+   pull_request:
+     branches:
+       - main
+       - master
+
+ jobs:
+   test:
+     name: Run Tests
+     runs-on: ${{ matrix.os }}
+     strategy:
+       matrix:
+         os: [ubuntu-latest, windows-latest, macos-latest]
+         python-version: ['3.8', '3.9', '3.10', '3.11', '3.12']
+
+     steps:
+       - name: Checkout code
+         uses: actions/checkout@v4
+
+       - name: Set up Python ${{ matrix.python-version }}
+         uses: actions/setup-python@v6
+         with:
+           python-version: ${{ matrix.python-version }}
+           cache: 'pip'
+
+       - name: Install dependencies
+         run: |
+           python -m pip install --upgrade pip
+           pip install -r requirements-dev.txt
+           pip install -e .
+
+       - name: Run linting
+         run: |
+           flake8 lopace tests --count --select=E9,F63,F7,F82 --show-source --statistics
+           flake8 lopace tests --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
+         continue-on-error: true
+
+       - name: Run tests with pytest
+         run: |
+           pytest tests/ -v --cov=lopace --cov-report=xml --cov-report=term
+
+       - name: Upload coverage reports
+         uses: codecov/codecov-action@v5
+         with:
+           file: ./coverage.xml
+           flags: unittests
+           name: codecov-umbrella
+           fail_ci_if_error: false
+         continue-on-error: true
+
+   build-and-publish:
+     name: Build and Publish to PyPI
+     needs: test
+     runs-on: ubuntu-latest
+     if: github.event_name == 'push' && (github.ref == 'refs/heads/main' || github.ref == 'refs/heads/master' || startsWith(github.ref, 'refs/tags/v'))
+
+     steps:
+       - name: Checkout code
+         uses: actions/checkout@v4
+         with:
+           fetch-depth: 0
+
+       - name: Set up Python
+         uses: actions/setup-python@v6
+         with:
+           python-version: '3.11'
+           cache: 'pip'
+
+       - name: Install build dependencies
+         run: |
+           python -m pip install --upgrade pip
+           pip install build twine
+
+       - name: Install package dependencies
+         run: |
+           pip install -r requirements-dev.txt
+
+       - name: Run tests before build
+         run: |
+           pytest tests/ -v
+
+       - name: Build package
+         run: |
+           python -m build
+
+       - name: Check package
+         run: |
+           twine check dist/*
+
+       - name: Publish to PyPI
+         env:
+           TWINE_USERNAME: __token__
+           TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
+         run: |
+           twine upload dist/*
+         if: success()
+
+   publish-testpypi:
+     name: Publish to TestPyPI
+     needs: test
+     runs-on: ubuntu-latest
+     if: github.event_name == 'pull_request'
+
+     steps:
+       - name: Checkout code
+         uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v6
+         with:
+           python-version: '3.11'
+           cache: 'pip'
+
+       - name: Install build dependencies
+         run: |
+           python -m pip install --upgrade pip
+           pip install build twine
+
+       - name: Run tests before build
+         run: |
+           pip install -r requirements-dev.txt
+           pytest tests/ -v
+
+       - name: Build package
+         run: |
+           python -m build
+
+       - name: Publish to TestPyPI
+         env:
+           TWINE_USERNAME: __token__
+           TWINE_PASSWORD: ${{ secrets.TEST_PYPI_API_TOKEN }}
+         run: |
+           twine upload --repository testpypi dist/*
+         continue-on-error: true
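The workflow above combines a test matrix with two conditionally gated publish jobs. As a rough illustration, the sketch below expands the matrix and restates the two `if:` expressions as Python predicates. The function names are hypothetical, and the lowercasing reflects the assumption that GitHub Actions compares strings without regard to case.

```python
from itertools import product

# Matrix axes copied from the `test` job above.
OS_LIST = ["ubuntu-latest", "windows-latest", "macos-latest"]
PYTHON_VERSIONS = ["3.8", "3.9", "3.10", "3.11", "3.12"]


def matrix_jobs():
    """Every (os, python-version) combination the test job would run."""
    return list(product(OS_LIST, PYTHON_VERSIONS))


def should_publish_to_pypi(event_name, ref):
    """Mirror of the `build-and-publish` condition."""
    event_name, ref = event_name.lower(), ref.lower()
    on_release_ref = (
        ref in ("refs/heads/main", "refs/heads/master")
        or ref.startswith("refs/tags/v")
    )
    return event_name == "push" and on_release_ref


def should_publish_to_testpypi(event_name):
    """Mirror of the `publish-testpypi` condition."""
    return event_name.lower() == "pull_request"


print(len(matrix_jobs()), "test jobs per run")
print(should_publish_to_pypi("push", "refs/tags/v0.1.5"))
print(should_publish_to_pypi("push", "refs/heads/feature-x"))
```

Read this way, the PyPI upload is attempted on any push to `main` or `master` as well as on `v*` tags, while the TestPyPI upload is attempted only for pull requests.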
@@ -0,0 +1,45 @@
+ name: Test Only
+
+ on:
+   push:
+     branches-ignore:
+       - main
+       - master
+   pull_request:
+     branches:
+       - main
+       - master
+
+ jobs:
+   test:
+     name: Run Tests
+     runs-on: ${{ matrix.os }}
+     strategy:
+       matrix:
+         os: [ubuntu-latest, windows-latest]
+         python-version: ['3.9', '3.11', '3.12']
+
+     steps:
+       - name: Checkout code
+         uses: actions/checkout@v4
+
+       - name: Set up Python ${{ matrix.python-version }}
+         uses: actions/setup-python@v6
+         with:
+           python-version: ${{ matrix.python-version }}
+           cache: 'pip'
+
+       - name: Install dependencies
+         run: |
+           python -m pip install --upgrade pip
+           pip install -r requirements-dev.txt
+           pip install -e .
+
+       - name: Run tests with pytest
+         run: |
+           pytest tests/ -v --cov=lopace --cov-report=term
+
+       - name: Run linting
+         run: |
+           flake8 lopace tests --count --select=E9,F63,F7,F82 --show-source --statistics
+         continue-on-error: true
@@ -0,0 +1,79 @@
+ # Byte-compiled / optimized / DLL files
+ __pycache__/
+ *.py[cod]
+ *$py.class
+
+ # C extensions
+ *.so
+
+ # Distribution / packaging
+ .Python
+ build/
+ scripts/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ pip-wheel-metadata/
+ share/python-wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ MANIFEST
+
+ # PyInstaller
+ *.manifest
+ *.spec
+
+ # Unit test / coverage reports
+ htmlcov/
+ .tox/
+ .nox/
+ .coverage
+ .coverage.*
+ .cache
+ nosetests.xml
+ coverage.xml
+ *.cover
+ *.py,cover
+ .hypothesis/
+ .pytest_cache/
+
+ # Virtual environments
+ venv/
+ ENV/
+ env/
+ .venv
+
+ # IDEs
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Project specific
+ *.log
+ .pytest_cache/
+ .mypy_cache/
+ .dmypy.json
+ dmypy.json
+
+ # setuptools-scm generated (version from git tags)
+ lopace/_version.py
+
+ # Generated visualizations (keep directory structure but not generated files)
+ # screenshots/*.svg
+ # screenshots/*.csv
+ !screenshots/.gitkeep
@@ -0,0 +1,87 @@
+ # Changelog
+
+ All notable changes to this project will be documented in this file.
+
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+ ## [Unreleased]
+
+ ### Added
+ - Initial project structure
+ - GitHub Actions CI/CD workflow
+ - Contributing guidelines
+ - Code of Conduct
+
+ ## [0.1.0] - 2024-12-XX
+
+ ### Added
+ - **Core Features**:
+   - Zstd compression method
+   - Token-based (BPE) compression method
+   - Hybrid compression method (Token + Zstd)
+   - Lossless compression/decompression
+
+ - **Compression Methods**:
+   - `compress_zstd()` / `decompress_zstd()` - Zstandard compression
+   - `compress_token()` / `decompress_token()` - BPE tokenization with binary packing
+   - `compress_hybrid()` / `decompress_hybrid()` - Combined tokenization and Zstd
+   - Generic `compress()` / `decompress()` methods
+
+ - **Evaluation Metrics**:
+   - Compression Ratio (CR)
+   - Space Savings (SS)
+   - Bits Per Character (BPC)
+   - Throughput (MB/s)
+   - SHA-256 hash verification
+   - Exact match verification
+   - Reconstruction error calculation
+   - Shannon Entropy calculation
+   - Theoretical compression limits
+
+ - **API**:
+   - `PromptCompressor` class with configurable tokenizer and Zstd level
+   - `CompressionMethod` enum for method selection
+   - `compress_and_return_both()` method
+   - `get_compression_stats()` method
+   - `calculate_shannon_entropy()` method
+   - `get_theoretical_compression_limit()` method
+
+ - **Streamlit Web App**:
+   - Interactive compression interface
+   - Real-time metrics calculation
+   - Side-by-side method comparison
+   - Comprehensive evaluation dashboard
+
+ - **Documentation**:
+   - Comprehensive README with usage examples
+   - API reference documentation
+   - Mathematical background explanation
+   - Installation instructions
+
+ - **Testing**:
+   - Complete test suite with pytest
+   - Test coverage for all compression methods
+   - Edge case testing
+   - Lossless verification tests
+
+ - **DevOps**:
+   - GitHub Actions CI/CD pipeline
+   - Automated testing on multiple Python versions
+   - Automated PyPI publishing
+
+ ### Technical Details
+
+ - **Supported Tokenizers**: cl100k_base, p50k_base, r50k_base, gpt2
+ - **Zstd Levels**: 1-22 (default: 15)
+ - **Python Versions**: 3.8, 3.9, 3.10, 3.11, 3.12
+ - **Dependencies**: zstandard>=0.22.0, tiktoken>=0.5.0
+ - **Smart Format Detection**: Automatically uses uint16 or uint32 based on token ID ranges
+
+ ### Fixed
+ - Token ID overflow handling for tokenizers with vocab > 65535
+ - Format byte handling for backward compatibility
+ - Error handling for edge cases
+
+ [Unreleased]: https://github.com/amanulla/lopace/compare/v0.1.0...HEAD
+ [0.1.0]: https://github.com/amanulla/lopace/releases/tag/v0.1.0
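The changelog above names several evaluation metrics and a "smart format detection" step that picks uint16 or uint32 for packed token IDs. The sketch below shows one conventional way to compute those metrics and to pack IDs behind a width-marking byte, using only the standard library. It does not reproduce LoPace's code: every function name, the metric definitions, and the format-byte values (0 for uint16, 1 for uint32) are assumptions made for illustration, and the package may define them differently.

```python
import math
import struct
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def compression_stats(original, compressed, n_chars):
    """Textbook definitions of ratio, savings, and bits per character."""
    return {
        "compression_ratio": len(original) / len(compressed),
        "space_savings": 1 - len(compressed) / len(original),
        "bits_per_character": 8 * len(compressed) / n_chars,
    }


def pack_token_ids(token_ids):
    """Pack IDs as uint16 when all fit, else uint32.

    The leading byte records the width (assumed values: 0 = uint16, 1 = uint32).
    """
    wide = any(t > 0xFFFF for t in token_ids)
    code = "I" if wide else "H"
    header = bytes([1 if wide else 0])
    return header + struct.pack(f"<{len(token_ids)}{code}", *token_ids)


def unpack_token_ids(blob):
    """Inverse of pack_token_ids: read the width byte, then the IDs."""
    code, width = ("I", 4) if blob[0] == 1 else ("H", 2)
    count = (len(blob) - 1) // width
    return list(struct.unpack(f"<{count}{code}", blob[1:]))


ids = [15496, 995, 100256]  # the last ID does not fit in 16 bits
packed = pack_token_ids(ids)
print(len(packed), unpack_token_ids(packed) == ids)
```

Choosing the narrower width whenever every ID fits halves the packed size for small-vocabulary tokenizers, and the marker byte is what lets the unpacker tell the two layouts apart.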