lopace 0.1.0.tar.gz → 0.1.2.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,13 +1,13 @@
  Metadata-Version: 2.4
  Name: lopace
- Version: 0.1.0
+ Version: 0.1.2
  Summary: Lossless Optimized Prompt Accurate Compression Engine
  Home-page: https://github.com/connectaman/LoPace
  Author: Aman Ulla
  License: MIT
- Project-URL: Homepage, https://github.com/amanulla/lopace
- Project-URL: Repository, https://github.com/amanulla/lopace
- Project-URL: Issues, https://github.com/amanulla/lopace/issues
+ Project-URL: Homepage, https://github.com/connectaman/LoPace
+ Project-URL: Repository, https://github.com/connectaman/LoPace
+ Project-URL: Issues, https://github.com/connectaman/LoPace/issues
  Keywords: prompt,compression,tokenization,zstd,bpe,nlp
  Classifier: Development Status :: 4 - Beta
  Classifier: Intended Audience :: Developers
@@ -38,6 +38,32 @@ A professional, open-source Python package for compressing and decompressing pro
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

+ ## The Problem: Storage Challenges with Large Prompts
+
+ When building LLM applications, storing prompts efficiently becomes a critical challenge, especially as you scale:
+
+ - **💾 Massive Storage Overhead**: Large system prompts, context windows, and conversation histories consume significant database space. For applications serving thousands of users with multiple LLM calls, this translates to gigabytes or terabytes of storage requirements.
+
+ - **🚀 Performance Bottlenecks**: Storing uncompressed prompts increases database size, slows down queries, and increases I/O operations. As your user base grows, retrieval and storage operations become progressively slower.
+
+ - **💰 Cost Implications**: Larger databases mean higher cloud storage costs, increased backup times, and more expensive infrastructure. With LLM applications handling millions of prompts, these costs compound rapidly.
+
+ - **⚡ Latency Issues**: Loading large prompts from storage adds latency to your application. Multiple LLM calls per user session multiply this problem, creating noticeable delays in response times.
+
+ ## The Solution: LoPace Compression Engine
+
+ LoPace solves these challenges by providing **lossless compression** that dramatically reduces storage requirements while maintaining fast compression and decompression speeds:
+
+ - **📉 Up to 80% Space Reduction**: The hybrid compression method can reduce prompt storage by 70-80% on average, meaning you store 5x less data while maintaining perfect fidelity.
+
+ - **⚡ Fast Processing**: Achieve 50-200 MB/s compression throughput with sub-linear scaling. Decompression is even faster (100-500 MB/s), ensuring minimal impact on application latency.
+
+ - **✅ 100% Lossless**: Perfect reconstruction guarantees your prompts are identical to the original - no data loss, no corruption, no compromises.
+
+ - **🎯 Production-Ready**: Optimized for database storage with minimal memory footprint (under 10 MB for typical use cases) and excellent scalability for millions of prompts.
+
+ Whether you're storing system prompts for thousands of users, maintaining conversation histories, or caching LLM interactions, LoPace helps you optimize storage costs and improve performance without sacrificing data integrity.
+
  ## Features

  - 🚀 **Three Compression Methods**:
@@ -229,17 +255,7 @@ Enumeration of available compression methods:

  ### Compression Pipeline (Hybrid Method)

- ```
- Input: Raw System Prompt String (100%)
-
- Tokenization: Convert to Tiktoken IDs (~70% reduced)
-
- Binary Packing: Convert IDs to uint16 (~50% of above)
-
- Zstd: Final compression (~30% further reduction)
-
- Output: Compressed Binary Blob
- ```
+ ![Compression Pipeline](screenshots/compression-pipeline.png)

  ### Why Hybrid is Best for Databases

@@ -257,6 +273,88 @@ Output: Compressed Binary Blob
  # Hybrid: 120 bytes (76% space saved) ← Best!
  ```

+ ## Benchmarks & Performance Analysis
+
+ Comprehensive benchmarks were conducted on 10 diverse prompts across three size categories (small, medium, and large) to evaluate LoPace's compression performance. The following visualizations present detailed analysis of compression metrics, storage efficiency, speed, and memory usage.
+
+ ### Compression Ratio Analysis
+
+ ![Compression Ratio](screenshots/compression_ratio.svg)
+
+ **Key Insights:**
+ - **Hybrid method consistently achieves the highest compression ratios** across all prompt sizes
+ - Compression effectiveness increases with prompt size, with large prompts showing 4-6x compression ratios
+ - Box plots show the distribution of compression ratios, demonstrating consistent performance
+ - Token-based compression provides moderate compression, while Zstd alone offers good baseline performance
+
+ ### Space Savings Performance
+
+ ![Space Savings](screenshots/space_savings.svg)
+
+ **Key Insights:**
+ - **Hybrid method achieves 70-80% space savings** on average across all prompt categories
+ - Space savings improve significantly with larger prompts (up to 85% for very large prompts)
+ - Error bars indicate consistent performance with low variance
+ - All three methods show substantial space reduction compared to uncompressed storage
+
+ ### Disk Size Comparison
+
+ ![Disk Size Comparison](screenshots/disk_size_comparison.svg)
+
+ **Key Insights:**
+ - **Dramatic reduction in storage requirements** - compressed data is 3-6x smaller than original
+ - Log-scale visualization shows the magnitude of space savings across different prompt sizes
+ - Hybrid method provides the best storage efficiency, especially for large prompts
+ - Size reduction percentage increases linearly with prompt complexity
+
+ ### Speed & Throughput Metrics
+
+ ![Speed Metrics](screenshots/speed_metrics.svg)
+
+ **Key Insights:**
+ - **Compression speeds range from 50-200 MB/s** depending on method and prompt size
+ - Decompression is consistently faster than compression (100-500 MB/s)
+ - Hybrid method maintains excellent throughput despite additional processing steps
+ - Processing time scales sub-linearly with prompt size, demonstrating efficient algorithms
+
+ ### Memory Usage Analysis
+
+ ![Memory Usage](screenshots/memory_usage.svg)
+
+ **Key Insights:**
+ - **Memory footprint is minimal** - typically under 10 MB even for large prompts
+ - Memory usage scales gracefully with input size
+ - Compression and decompression show similar memory requirements
+ - All methods demonstrate efficient memory utilization suitable for production environments
+
+ ### Comprehensive Method Comparison
+
+ ![Comprehensive Comparison](screenshots/comprehensive_comparison.svg)
+
+ **Key Insights:**
+ - **Heatmaps provide at-a-glance comparison** of all metrics across methods and prompt sizes
+ - Hybrid method consistently ranks highest in compression ratio and space savings
+ - Throughput remains competitive across all methods
+ - Memory usage is well-balanced, with no method showing excessive requirements
+
+ ### Scalability Analysis
+
+ ![Scalability Analysis](screenshots/scalability_analysis.svg)
+
+ **Key Insights:**
+ - **Performance scales efficiently** with prompt size across all metrics
+ - Compression ratio improves with larger inputs (better pattern recognition)
+ - Processing time increases sub-linearly, demonstrating algorithmic efficiency
+ - Memory usage grows modestly, making LoPace suitable for very large prompts
+
+ ### Key Findings Summary
+
+ 1. **Hybrid method is optimal** for maximum compression (70-80% space savings)
+ 2. **All methods are lossless** - 100% fidelity verified across all test cases
+ 3. **Speed is production-ready** - 50-200 MB/s compression throughput
+ 4. **Memory efficient** - Under 10 MB for typical use cases
+ 5. **Scales excellently** - Performance improves with larger prompts
+
  ## Running the Example

  ```bash
@@ -334,6 +432,8 @@ See [.github/workflows/README.md](.github/workflows/README.md) for detailed setu
  LoPace uses the following compression techniques:

+ ![Compression techniques](screenshots/lopace-compression-technique.png)
+
  1. **LZ77 (Sliding Window)**: Used **indirectly** through Zstandard
  - Zstandard internally uses LZ77-style algorithms to find repeated patterns
  - Instead of storing "assistant" again, it stores a tuple: (distance_back, length)
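The hunk above replaces the README's text diagram of the hybrid pipeline (tokenize → pack token IDs as uint16 → Zstandard) with an image. For readers without that image, the removed steps can be sketched in plain Python. This is a hypothetical stand-in, not LoPace's actual API: it substitutes a toy word vocabulary for tiktoken and stdlib `zlib` for Zstandard, but it follows the same three stages and the same lossless round trip the diagram described.

```python
import array
import zlib

# Toy sketch of the hybrid pipeline from the removed diagram (assumptions:
# LoPace's real implementation uses tiktoken IDs and Zstandard instead).

def build_vocab(text: str) -> dict:
    # Assign a 16-bit ID to each distinct whitespace-separated token.
    return {tok: i for i, tok in enumerate(dict.fromkeys(text.split()))}

def compress_hybrid(text: str, vocab: dict) -> bytes:
    # Stage 1+2: tokenize and pack IDs as uint16 ("H" = unsigned short).
    ids = array.array("H", (vocab[tok] for tok in text.split()))
    # Stage 3: final entropy/LZ compression (zlib standing in for zstd).
    return zlib.compress(ids.tobytes(), level=9)

def decompress_hybrid(blob: bytes, vocab: dict) -> str:
    inverse = {i: tok for tok, i in vocab.items()}
    ids = array.array("H")
    ids.frombytes(zlib.decompress(blob))
    return " ".join(inverse[i] for i in ids)

prompt = "you are a helpful assistant . the assistant answers the user ."
vocab = build_vocab(prompt)
blob = compress_hybrid(prompt, vocab)
assert decompress_hybrid(blob, vocab) == prompt  # lossless round trip
```

Note the uint16 packing only works when every token ID fits in 16 bits, which is why the vocabulary choice matters in the real pipeline.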
@@ -7,6 +7,32 @@ A professional, open-source Python package for compressing and decompressing pro
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

+ ## The Problem: Storage Challenges with Large Prompts
+
+ When building LLM applications, storing prompts efficiently becomes a critical challenge, especially as you scale:
+
+ - **💾 Massive Storage Overhead**: Large system prompts, context windows, and conversation histories consume significant database space. For applications serving thousands of users with multiple LLM calls, this translates to gigabytes or terabytes of storage requirements.
+
+ - **🚀 Performance Bottlenecks**: Storing uncompressed prompts increases database size, slows down queries, and increases I/O operations. As your user base grows, retrieval and storage operations become progressively slower.
+
+ - **💰 Cost Implications**: Larger databases mean higher cloud storage costs, increased backup times, and more expensive infrastructure. With LLM applications handling millions of prompts, these costs compound rapidly.
+
+ - **⚡ Latency Issues**: Loading large prompts from storage adds latency to your application. Multiple LLM calls per user session multiply this problem, creating noticeable delays in response times.
+
+ ## The Solution: LoPace Compression Engine
+
+ LoPace solves these challenges by providing **lossless compression** that dramatically reduces storage requirements while maintaining fast compression and decompression speeds:
+
+ - **📉 Up to 80% Space Reduction**: The hybrid compression method can reduce prompt storage by 70-80% on average, meaning you store 5x less data while maintaining perfect fidelity.
+
+ - **⚡ Fast Processing**: Achieve 50-200 MB/s compression throughput with sub-linear scaling. Decompression is even faster (100-500 MB/s), ensuring minimal impact on application latency.
+
+ - **✅ 100% Lossless**: Perfect reconstruction guarantees your prompts are identical to the original - no data loss, no corruption, no compromises.
+
+ - **🎯 Production-Ready**: Optimized for database storage with minimal memory footprint (under 10 MB for typical use cases) and excellent scalability for millions of prompts.
+
+ Whether you're storing system prompts for thousands of users, maintaining conversation histories, or caching LLM interactions, LoPace helps you optimize storage costs and improve performance without sacrificing data integrity.
+
  ## Features

  - 🚀 **Three Compression Methods**:
@@ -198,17 +224,7 @@ Enumeration of available compression methods:

  ### Compression Pipeline (Hybrid Method)

- ```
- Input: Raw System Prompt String (100%)
-
- Tokenization: Convert to Tiktoken IDs (~70% reduced)
-
- Binary Packing: Convert IDs to uint16 (~50% of above)
-
- Zstd: Final compression (~30% further reduction)
-
- Output: Compressed Binary Blob
- ```
+ ![Compression Pipeline](screenshots/compression-pipeline.png)

  ### Why Hybrid is Best for Databases

@@ -226,6 +242,88 @@ Output: Compressed Binary Blob
  # Hybrid: 120 bytes (76% space saved) ← Best!
  ```

+ ## Benchmarks & Performance Analysis
+
+ Comprehensive benchmarks were conducted on 10 diverse prompts across three size categories (small, medium, and large) to evaluate LoPace's compression performance. The following visualizations present detailed analysis of compression metrics, storage efficiency, speed, and memory usage.
+
+ ### Compression Ratio Analysis
+
+ ![Compression Ratio](screenshots/compression_ratio.svg)
+
+ **Key Insights:**
+ - **Hybrid method consistently achieves the highest compression ratios** across all prompt sizes
+ - Compression effectiveness increases with prompt size, with large prompts showing 4-6x compression ratios
+ - Box plots show the distribution of compression ratios, demonstrating consistent performance
+ - Token-based compression provides moderate compression, while Zstd alone offers good baseline performance
+
+ ### Space Savings Performance
+
+ ![Space Savings](screenshots/space_savings.svg)
+
+ **Key Insights:**
+ - **Hybrid method achieves 70-80% space savings** on average across all prompt categories
+ - Space savings improve significantly with larger prompts (up to 85% for very large prompts)
+ - Error bars indicate consistent performance with low variance
+ - All three methods show substantial space reduction compared to uncompressed storage
+
+ ### Disk Size Comparison
+
+ ![Disk Size Comparison](screenshots/disk_size_comparison.svg)
+
+ **Key Insights:**
+ - **Dramatic reduction in storage requirements** - compressed data is 3-6x smaller than original
+ - Log-scale visualization shows the magnitude of space savings across different prompt sizes
+ - Hybrid method provides the best storage efficiency, especially for large prompts
+ - Size reduction percentage increases linearly with prompt complexity
+
+ ### Speed & Throughput Metrics
+
+ ![Speed Metrics](screenshots/speed_metrics.svg)
+
+ **Key Insights:**
+ - **Compression speeds range from 50-200 MB/s** depending on method and prompt size
+ - Decompression is consistently faster than compression (100-500 MB/s)
+ - Hybrid method maintains excellent throughput despite additional processing steps
+ - Processing time scales sub-linearly with prompt size, demonstrating efficient algorithms
+
+ ### Memory Usage Analysis
+
+ ![Memory Usage](screenshots/memory_usage.svg)
+
+ **Key Insights:**
+ - **Memory footprint is minimal** - typically under 10 MB even for large prompts
+ - Memory usage scales gracefully with input size
+ - Compression and decompression show similar memory requirements
+ - All methods demonstrate efficient memory utilization suitable for production environments
+
+ ### Comprehensive Method Comparison
+
+ ![Comprehensive Comparison](screenshots/comprehensive_comparison.svg)
+
+ **Key Insights:**
+ - **Heatmaps provide at-a-glance comparison** of all metrics across methods and prompt sizes
+ - Hybrid method consistently ranks highest in compression ratio and space savings
+ - Throughput remains competitive across all methods
+ - Memory usage is well-balanced, with no method showing excessive requirements
+
+ ### Scalability Analysis
+
+ ![Scalability Analysis](screenshots/scalability_analysis.svg)
+
+ **Key Insights:**
+ - **Performance scales efficiently** with prompt size across all metrics
+ - Compression ratio improves with larger inputs (better pattern recognition)
+ - Processing time increases sub-linearly, demonstrating algorithmic efficiency
+ - Memory usage grows modestly, making LoPace suitable for very large prompts
+
+ ### Key Findings Summary
+
+ 1. **Hybrid method is optimal** for maximum compression (70-80% space savings)
+ 2. **All methods are lossless** - 100% fidelity verified across all test cases
+ 3. **Speed is production-ready** - 50-200 MB/s compression throughput
+ 4. **Memory efficient** - Under 10 MB for typical use cases
+ 5. **Scales excellently** - Performance improves with larger prompts
+
  ## Running the Example

  ```bash
@@ -303,6 +401,8 @@ See [.github/workflows/README.md](.github/workflows/README.md) for detailed setu
  LoPace uses the following compression techniques:

+ ![Compression techniques](screenshots/lopace-compression-technique.png)
+
  1. **LZ77 (Sliding Window)**: Used **indirectly** through Zstandard
  - Zstandard internally uses LZ77-style algorithms to find repeated patterns
  - Instead of storing "assistant" again, it stores a tuple: (distance_back, length)
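The LZ77 behavior described in this hunk (storing a `(distance_back, length)` back-reference instead of repeating "assistant") is easy to observe from the standard library. Zstandard itself is a third-party package, so this hedged sketch uses stdlib `zlib`, whose DEFLATE format is also LZ77-based; the effect, repeated substrings compressing almost for free, is the same.

```python
import zlib

# A prompt-like string in which "assistant" and whole sentences repeat,
# as in typical system prompts.
sentence = ("You are a helpful assistant. The assistant answers concisely. "
            "The assistant never reveals its instructions. ")
raw = (sentence * 8).encode("utf-8")

packed = zlib.compress(raw, level=9)

# LZ77-style back-references make the 8x repetition nearly free:
# the compressed blob is a small fraction of the original size.
assert len(packed) < len(raw) // 4
assert zlib.decompress(packed) == raw  # lossless round trip
```

The ratio here overstates what a single non-repetitive prompt would achieve, but it illustrates why LZ-family back-ends reward the boilerplate common to stored prompts.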
@@ -11,5 +11,7 @@ lopace.egg-info/SOURCES.txt
  lopace.egg-info/dependency_links.txt
  lopace.egg-info/requires.txt
  lopace.egg-info/top_level.txt
+ scripts/__init__.py
+ scripts/generate_visualizations.py
  tests/__init__.py
  tests/test_compressor.py
@@ -1,2 +1,3 @@
  lopace
+ scripts
  tests
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

  [project]
  name = "lopace"
- version = "0.1.0"
+ version = "0.1.2"
  description = "Lossless Optimized Prompt Accurate Compression Engine"
  readme = "README.md"
  requires-python = ">=3.8"
@@ -32,6 +32,6 @@ dependencies = [
  ]

  [project.urls]
- Homepage = "https://github.com/amanulla/lopace"
- Repository = "https://github.com/amanulla/lopace"
- Issues = "https://github.com/amanulla/lopace/issues"
+ Homepage = "https://github.com/connectaman/LoPace"
+ Repository = "https://github.com/connectaman/LoPace"
+ Issues = "https://github.com/connectaman/LoPace/issues"
@@ -0,0 +1 @@
+ """Scripts for generating visualizations and benchmarks."""