lopace 0.1.1__tar.gz → 0.1.2__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: lopace
- Version: 0.1.1
+ Version: 0.1.2
  Summary: Lossless Optimized Prompt Accurate Compression Engine
  Home-page: https://github.com/connectaman/LoPace
  Author: Aman Ulla
@@ -38,6 +38,32 @@ A professional, open-source Python package for compressing and decompressing pro
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

+ ## The Problem: Storage Challenges with Large Prompts
+
+ When building LLM applications, storing prompts efficiently becomes a critical challenge, especially as you scale:
+
+ - **💾 Massive Storage Overhead**: Large system prompts, context windows, and conversation histories consume significant database space. For applications serving thousands of users with multiple LLM calls, this translates to gigabytes or terabytes of storage requirements.
+
+ - **🚀 Performance Bottlenecks**: Storing uncompressed prompts increases database size, slows down queries, and increases I/O operations. As your user base grows, retrieval and storage operations become progressively slower.
+
+ - **💰 Cost Implications**: Larger databases mean higher cloud storage costs, increased backup times, and more expensive infrastructure. With LLM applications handling millions of prompts, these costs compound rapidly.
+
+ - **⚡ Latency Issues**: Loading large prompts from storage adds latency to your application. Multiple LLM calls per user session multiply this problem, creating noticeable delays in response times.
+
+ ## The Solution: LoPace Compression Engine
+
+ LoPace solves these challenges by providing **lossless compression** that dramatically reduces storage requirements while maintaining fast compression and decompression speeds:
+
+ - **📉 Up to 80% Space Reduction**: The hybrid compression method can reduce prompt storage by 70-80% on average, meaning you store 5x less data while maintaining perfect fidelity.
+
+ - **⚡ Fast Processing**: Achieve 50-200 MB/s compression throughput with sub-linear scaling. Decompression is even faster (100-500 MB/s), ensuring minimal impact on application latency.
+
+ - **✅ 100% Lossless**: Perfect reconstruction guarantees your prompts are identical to the original - no data loss, no corruption, no compromises.
+
+ - **🎯 Production-Ready**: Optimized for database storage with minimal memory footprint (under 10 MB for typical use cases) and excellent scalability for millions of prompts.
+
+ Whether you're storing system prompts for thousands of users, maintaining conversation histories, or caching LLM interactions, LoPace helps you optimize storage costs and improve performance without sacrificing data integrity.
+
  ## Features

  - 🚀 **Three Compression Methods**:
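The added README text equates a 70-80% space reduction with storing "5x less data". As a quick check of that arithmetic (a sketch only, not part of the package; `reduction_factor` is a hypothetical name), a fractional saving s corresponds to a size factor of 1 / (1 - s), so 80% saved is a 5x reduction while 70% saved is closer to 3.3x:

```python
def reduction_factor(space_saved_pct: float) -> float:
    """Convert a percentage of space saved into an 'N times smaller' factor."""
    if not 0 <= space_saved_pct < 100:
        raise ValueError("space_saved_pct must be in [0, 100)")
    # Remaining fraction is (1 - s); the factor is its reciprocal.
    return 1.0 / (1.0 - space_saved_pct / 100.0)


for pct in (70, 80):
    print(f"{pct}% saved -> {reduction_factor(pct):.2f}x smaller")
```

The "5x" figure therefore holds at the upper end of the stated range rather than across it.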
@@ -7,6 +7,32 @@ A professional, open-source Python package for compressing and decompressing pro
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

+ ## The Problem: Storage Challenges with Large Prompts
+
+ When building LLM applications, storing prompts efficiently becomes a critical challenge, especially as you scale:
+
+ - **💾 Massive Storage Overhead**: Large system prompts, context windows, and conversation histories consume significant database space. For applications serving thousands of users with multiple LLM calls, this translates to gigabytes or terabytes of storage requirements.
+
+ - **🚀 Performance Bottlenecks**: Storing uncompressed prompts increases database size, slows down queries, and increases I/O operations. As your user base grows, retrieval and storage operations become progressively slower.
+
+ - **💰 Cost Implications**: Larger databases mean higher cloud storage costs, increased backup times, and more expensive infrastructure. With LLM applications handling millions of prompts, these costs compound rapidly.
+
+ - **⚡ Latency Issues**: Loading large prompts from storage adds latency to your application. Multiple LLM calls per user session multiply this problem, creating noticeable delays in response times.
+
+ ## The Solution: LoPace Compression Engine
+
+ LoPace solves these challenges by providing **lossless compression** that dramatically reduces storage requirements while maintaining fast compression and decompression speeds:
+
+ - **📉 Up to 80% Space Reduction**: The hybrid compression method can reduce prompt storage by 70-80% on average, meaning you store 5x less data while maintaining perfect fidelity.
+
+ - **⚡ Fast Processing**: Achieve 50-200 MB/s compression throughput with sub-linear scaling. Decompression is even faster (100-500 MB/s), ensuring minimal impact on application latency.
+
+ - **✅ 100% Lossless**: Perfect reconstruction guarantees your prompts are identical to the original - no data loss, no corruption, no compromises.
+
+ - **🎯 Production-Ready**: Optimized for database storage with minimal memory footprint (under 10 MB for typical use cases) and excellent scalability for millions of prompts.
+
+ Whether you're storing system prompts for thousands of users, maintaining conversation histories, or caching LLM interactions, LoPace helps you optimize storage costs and improve performance without sacrificing data integrity.
+
  ## Features

  - 🚀 **Three Compression Methods**:
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: lopace
- Version: 0.1.1
+ Version: 0.1.2
  Summary: Lossless Optimized Prompt Accurate Compression Engine
  Home-page: https://github.com/connectaman/LoPace
  Author: Aman Ulla
@@ -38,6 +38,32 @@ A professional, open-source Python package for compressing and decompressing pro
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

+ ## The Problem: Storage Challenges with Large Prompts
+
+ When building LLM applications, storing prompts efficiently becomes a critical challenge, especially as you scale:
+
+ - **💾 Massive Storage Overhead**: Large system prompts, context windows, and conversation histories consume significant database space. For applications serving thousands of users with multiple LLM calls, this translates to gigabytes or terabytes of storage requirements.
+
+ - **🚀 Performance Bottlenecks**: Storing uncompressed prompts increases database size, slows down queries, and increases I/O operations. As your user base grows, retrieval and storage operations become progressively slower.
+
+ - **💰 Cost Implications**: Larger databases mean higher cloud storage costs, increased backup times, and more expensive infrastructure. With LLM applications handling millions of prompts, these costs compound rapidly.
+
+ - **⚡ Latency Issues**: Loading large prompts from storage adds latency to your application. Multiple LLM calls per user session multiply this problem, creating noticeable delays in response times.
+
+ ## The Solution: LoPace Compression Engine
+
+ LoPace solves these challenges by providing **lossless compression** that dramatically reduces storage requirements while maintaining fast compression and decompression speeds:
+
+ - **📉 Up to 80% Space Reduction**: The hybrid compression method can reduce prompt storage by 70-80% on average, meaning you store 5x less data while maintaining perfect fidelity.
+
+ - **⚡ Fast Processing**: Achieve 50-200 MB/s compression throughput with sub-linear scaling. Decompression is even faster (100-500 MB/s), ensuring minimal impact on application latency.
+
+ - **✅ 100% Lossless**: Perfect reconstruction guarantees your prompts are identical to the original - no data loss, no corruption, no compromises.
+
+ - **🎯 Production-Ready**: Optimized for database storage with minimal memory footprint (under 10 MB for typical use cases) and excellent scalability for millions of prompts.
+
+ Whether you're storing system prompts for thousands of users, maintaining conversation histories, or caching LLM interactions, LoPace helps you optimize storage costs and improve performance without sacrificing data integrity.
+
  ## Features

  - 🚀 **Three Compression Methods**:
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

  [project]
  name = "lopace"
- version = "0.1.1"
+ version = "0.1.2"
  description = "Lossless Optimized Prompt Accurate Compression Engine"
  readme = "README.md"
  requires-python = ">=3.8"
@@ -796,6 +796,104 @@ def plot_scalability(df: pd.DataFrame, output_dir: Path):
      print(f" Saved: scalability_analysis.svg")


+ def plot_original_vs_decompressed(output_dir: Path):
+     """Plot original vs decompressed data comparison across multiple prompts."""
+     compressor = PromptCompressor(model="cl100k_base", zstd_level=15)
+     prompts = generate_test_prompts()
+
+     # Select a few diverse prompts for visualization
+     selected_prompts = [
+         ("Small Prompt 1", prompts[0][1]),
+         ("Medium Prompt 1", prompts[4][1]),
+         ("Large Prompt 1", prompts[7][1]),
+         ("Medium Prompt 2", prompts[5][1]),
+         ("Small Prompt 2", prompts[1][1]),
+     ]
+
+     # Use Hybrid method (best compression)
+     method = CompressionMethod.HYBRID
+
+     fig, axes = plt.subplots(len(selected_prompts), 1, figsize=(16, 14))
+     if len(selected_prompts) == 1:
+         axes = [axes]
+
+     fig.suptitle('Original vs Decompressed: Lossless Compression Verification',
+                  fontsize=18, fontweight='bold', y=0.995)
+
+     for idx, (title, prompt) in enumerate(selected_prompts):
+         ax = axes[idx]
+
+         # Compress and decompress
+         compressed = compressor.compress(prompt, method)
+         decompressed = compressor.decompress(compressed, method)
+
+         # Verify losslessness
+         is_lossless = prompt == decompressed
+
+         # Create representation: show byte-by-byte or character-by-character
+         original_bytes = prompt.encode('utf-8')
+         decompressed_bytes = decompressed.encode('utf-8')
+
+         # Sample points for visualization (every Nth byte/char for performance)
+         sample_rate = max(1, len(original_bytes) // 200) # ~200 points max
+         sample_indices = np.arange(0, len(original_bytes), sample_rate)
+
+         # Get byte values (0-255) for visualization
+         original_byte_values = np.array([original_bytes[i] for i in sample_indices])
+         decompressed_byte_values = np.array([decompressed_bytes[i] for i in sample_indices])
+
+         # Normalize to 0-100 range for better visualization
+         original_normalized = (original_byte_values / 255.0) * 100
+         decompressed_normalized = (decompressed_byte_values / 255.0) * 100
+
+         # Plot original (blue line)
+         ax.plot(sample_indices, original_normalized, 'b-', linewidth=2.0,
+                 label='Original', alpha=0.7)
+
+         # Plot decompressed (red line) - should overlap perfectly for lossless
+         ax.plot(sample_indices, decompressed_normalized, 'r-', linewidth=2.0,
+                 label='Decompressed', alpha=0.7, linestyle='--')
+
+         # Mark key compression points (sample every Nth point)
+         step = max(1, len(sample_indices) // 20)
+         key_indices = sample_indices[::step]
+         key_original = original_normalized[::step]
+         ax.scatter(key_indices, key_original,
+                    color='red', s=40, alpha=0.8, zorder=5,
+                    label='Sample Points', marker='o', edgecolors='darkred', linewidths=1)
+
+         # Add text info
+         original_size = len(original_bytes)
+         compressed_size = len(compressed)
+         compression_ratio = original_size / compressed_size if compressed_size > 0 else 0
+         space_saved = (1 - compressed_size / original_size) * 100 if original_size > 0 else 0
+
+         info_text = (f"Size: {original_size} → {compressed_size} bytes "
+                      f"({space_saved:.1f}% saved, {compression_ratio:.2f}x) | "
+                      f"Lossless: {'✓' if is_lossless else '✗'}")
+
+         ax.text(0.02, 0.95, info_text, transform=ax.transAxes,
+                 fontsize=10, verticalalignment='top',
+                 bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5),
+                 fontweight='bold')
+
+         ax.set_ylabel(f'{title}\n(Normalized Byte Values)', fontweight='bold')
+         ax.set_xlabel('Byte Position' if idx == len(selected_prompts) - 1 else '', fontweight='bold')
+         ax.set_title(f'{title} - {len(original_bytes)} bytes', fontweight='bold', pad=10)
+         ax.grid(True, alpha=0.3, linestyle='--')
+         ax.legend(loc='upper right', framealpha=0.9, fontsize=9)
+         ax.set_ylim(-5, 105)
+
+         # Highlight that they overlap perfectly (lossless)
+         if is_lossless:
+             ax.axhspan(-5, 105, alpha=0.05, color='green', zorder=0)
+
+     plt.tight_layout(rect=[0, 0, 1, 0.99])
+     plt.savefig(output_dir / 'original_vs_decompressed.svg', format='svg', bbox_inches='tight')
+     plt.close()
+     print(f" Saved: original_vs_decompressed.svg")
+
+
  def main():
      """Main function to generate all visualizations."""
      # Create output directory
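The added function plots every `sample_rate`-th byte so that each subplot draws a bounded number of points. A minimal sketch of that stride logic follows, using plain `range` in place of `np.arange` so it runs without NumPy; `sample_positions` and `target_points` are hypothetical names for illustration. Because the stride comes from floor division, the point count stays below 400 rather than exactly at 200, so the "~200 points max" comment in the diff is approximate.

```python
def sample_positions(n_bytes: int, target_points: int = 200) -> list:
    """Return the byte offsets that would be plotted for an n_bytes input."""
    # Same stride rule as the diff: at least 1, otherwise n_bytes // 200.
    sample_rate = max(1, n_bytes // target_points)
    return list(range(0, n_bytes, sample_rate))


for n in (50, 399, 400, 10_000):
    print(n, len(sample_positions(n)))
```

Inputs shorter than the target are plotted in full, and inputs just under twice the target are also plotted in full because the stride is still 1.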
@@ -827,6 +925,7 @@ def main():
      plot_memory_usage(df, output_dir)
      plot_comprehensive_comparison(df, output_dir)
      plot_scalability(df, output_dir)
+     plot_original_vs_decompressed(output_dir)

      print("\n" + "=" * 70)
      print("Visualization generation complete!")
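The new `plot_original_vs_decompressed` function checks losslessness by comparing each prompt with its round-tripped copy and reports the size metrics shown in each subplot. A plot-free sketch of the same check is given here. It uses the standard library's `zlib` purely as a stand-in compressor (an assumption, so the example runs without LoPace installed), and `roundtrip_report` is a hypothetical helper, not part of the package.

```python
import zlib


def roundtrip_report(prompt: str) -> dict:
    """Compress, decompress, and summarize sizes for one prompt.

    zlib is a stand-in here (assumption); the diff itself calls
    PromptCompressor.compress/decompress with CompressionMethod.HYBRID.
    """
    original = prompt.encode("utf-8")
    compressed = zlib.compress(original, 9)
    restored = zlib.decompress(compressed).decode("utf-8")

    original_size = len(original)
    compressed_size = len(compressed)
    return {
        "lossless": restored == prompt,
        "original_size": original_size,
        "compressed_size": compressed_size,
        # Same formulas and zero guards as in the diff.
        "ratio": original_size / compressed_size if compressed_size > 0 else 0,
        "space_saved_pct": (1 - compressed_size / original_size) * 100
        if original_size > 0 else 0,
    }


report = roundtrip_report("You are a helpful assistant. " * 50)
print(report)
```

Comparing the full strings, as the diff does with `prompt == decompressed`, is what actually establishes losslessness; the overlaid line plot only shows a sample of byte positions.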
@@ -10,7 +10,7 @@ with open("requirements.txt", "r", encoding="utf-8") as fh:

  setup(
      name="lopace",
-     version="0.1.1",
+     version="0.1.2",
      author="Aman Ulla",
      description="Lossless Optimized Prompt Accurate Compression Engine",
      long_description=long_description,
File without changes