pardox 0.1.0__tar.gz

@@ -0,0 +1,2 @@
1
+ include README.md
2
+ recursive-include pardox/libs *.dll *.so *.dylib
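The `recursive-include` rule above ships native libraries with three extensions, one per operating system. As a minimal sketch of how a loader could pick the right one, assuming the standard values returned by `platform.system()` (the mapping and function name here are hypothetical and not taken from the package):

```python
import platform

# Hypothetical mapping from platform.system() values to the library
# extensions named in the MANIFEST.in rule (*.dll, *.so, *.dylib).
LIB_SUFFIX = {"Windows": ".dll", "Linux": ".so", "Darwin": ".dylib"}


def native_lib_suffix(system=None):
    """Return the shared-library extension for the given (or current) OS."""
    system = system or platform.system()
    try:
        return LIB_SUFFIX[system]
    except KeyError:
        # Fail loudly instead of guessing an extension on unknown platforms.
        raise OSError(f"unsupported platform: {system}") from None


if __name__ == "__main__":
    print(native_lib_suffix())
```

Raising `OSError` for an unknown platform keeps a missing binary from surfacing later as an obscure load failure.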
pardox-0.1.0/PKG-INFO ADDED
@@ -0,0 +1,142 @@
1
+ Metadata-Version: 2.4
2
+ Name: pardox
3
+ Version: 0.1.0
4
+ Summary: High-Performance DataFrame Engine powered by Rust (The PardoX Project)
5
+ Author-email: Alberto Cardenas <iam@albertocardenas.com>
6
+ License: MIT
7
+ Project-URL: Homepage, https://www.albertocardenas.com
8
+ Project-URL: Source, https://github.com/betoalien/pardox
9
+ Keywords: dataframe,rust,etl,big-data,simd
10
+ Classifier: Programming Language :: Python :: 3
11
+ Classifier: Programming Language :: Rust
12
+ Classifier: Operating System :: OS Independent
13
+ Classifier: Topic :: Scientific/Engineering :: Information Analysis
14
+ Requires-Python: >=3.8
15
+ Description-Content-Type: text/markdown
16
+
17
+ # PardoX: The Hyper-Fast Data Engine 🚀
18
+
19
+ [![PyPI version](https://badge.fury.io/py/pardox.svg)](https://badge.fury.io/py/pardox)
20
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
21
+ [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
22
+ [![Powered By Rust](https://img.shields.io/badge/powered%20by-Rust-orange.svg)](https://www.rust-lang.org/)
23
+
24
+ **The Speed of Rust. The Simplicity of Python.**
25
+
26
+ PardoX is a next-generation DataFrame engine designed for high-performance ETL and data analysis. It bridges the gap between low-level memory efficiency and high-level developer productivity by running a **Rust Core** wrapped in a lightweight **Python SDK**.
27
+
28
+ > **v0.1 Beta is now available!** Supports Windows, Linux, and MacOS (Intel & Apple Silicon).
29
+
30
+ ---
31
+
32
+ ## ⚡ Why PardoX?
33
+
34
+ Traditional DataFrames (like Pandas) often struggle with memory overhead and single-threaded execution. PardoX introduces a **Hybrid Architecture**:
35
+
36
+ * **Core:** Written in **Rust** for memory safety, multithreading, and SIMD (AVX2) optimizations.
37
+ * **Interface:** Native **Python** bindings that feel familiar but run at compiled speeds.
38
+ * **Memory:** Uses **HyperBlock Architecture** to manage data in contiguous chunks, minimizing fragmentation and maximizing CPU cache hits.
39
+
40
+ ---
41
+
42
+ ## 🔥 Key Features (v0.1)
43
+
44
+ ### 1. Zero-Copy Ingestion
45
+ Load massive datasets in seconds. PardoX supports multithreaded CSV parsing and direct SQL ingestion without the overhead of Python objects.
46
+
47
+ ### 2. Native Binary Format (`.prdx`)
48
+ Save and load your data instantly using the `.prdx` format.
49
+ * **Speed:** Up to **4.6 GB/s** read throughput.
50
+ * **Tech:** Custom binary layout optimized for SSDs and OS page caching.
51
+
52
+ ### 3. High-Performance Mutation
53
+ Transform your data in-place without memory duplication.
54
+ * **Arithmetic:** Vectorized addition, subtraction, multiplication, and division.
55
+ * **Hygiene:** Instant `fillna()` and `round()` operations across millions of rows.
56
+ * **Feature Engineering:** Create new columns on the fly: `df['total'] = df['qty'] * df['price']`.
57
+
58
+ ### 4. Cross-Platform & Universal
59
+ Run your code anywhere. PardoX automatically detects your OS and CPU architecture to load the optimized binary kernel.
60
+ * ✅ **Windows (x64)**
61
+ * ✅ **Linux (x64)**
62
+ * ✅ **MacOS (Intel & Apple Silicon M1/M2/M3)**
63
+
64
+ ---
65
+
66
+ ## 📦 Installation
67
+
68
+ PardoX is available on PyPI. The package includes pre-compiled binaries for all supported platforms.
69
+
70
+ ```bash
71
+ pip install pardox
72
+ ```
73
+ ## 🚀 Quick Start
74
+
75
+ Here is a complete ETL pipeline example: Load, Clean, Transform, and Analyze.
76
+
77
+ ```python
78
+ import pardox as px
79
+
80
+ # 1. Ingest Data (Auto-detected Schema)
81
+ # Uses multi-threaded Rust reader
82
+ df = px.read_csv("sales_data.csv")
83
+
84
+ print(f"Loaded {df.shape[0]} rows.")
85
+
86
+ # 2. Data Hygiene
87
+ # Fill nulls in numeric columns instantly
88
+ df.fillna(0.0)
89
+
90
+ # 3. Feature Engineering (Vectorized)
91
+ # Calculate total amount (Price * Quantity)
92
+ # This executes in Rust using SIMD instructions
93
+ df['total_amount'] = df['price'] * df['quantity']
94
+
95
+ # 4. Aggregations & Analysis
96
+ revenue = df['total_amount'].sum()
97
+ avg_ticket = df['total_amount'].mean()
98
+
99
+ print(f"Total Revenue: ${revenue:,.2f}")
100
+ print(f"Avg Ticket: ${avg_ticket:,.2f}")
101
+
102
+ # 5. Persist to Disk
103
+ # Save as PRDX for ultra-fast loading later
104
+ df.to_prdx("sales_data_processed.prdx")
105
+ ```
106
+
107
+ ## 📊 Benchmarks
108
+
109
+ Hardware: MacBook Pro M2, 16GB RAM.
110
+
111
+ | Operation | Pandas (v2.x) | PardoX (v0.1) | Speedup |
112
+ |-----------|---------------|---------------|----------|
113
+ | Read CSV (1GB) | 4.2s | 0.8s | 5.2x |
114
+ | Column Math | 0.15s | 0.02s | 7.5x |
115
+ | Fill NA | 0.30s | 0.04s | 7.5x |
116
+ | Read Binary | 0.9s (Parquet) | 0.2s (.prdx) | 4.5x |
117
+
118
+ ## 🗺️ Roadmap
119
+
120
+ We are building the universal data engine. Here is what's coming next:
121
+
122
+ * **v0.1 (Current):** Python Core, Arithmetic, I/O, Basic Aggregations.
123
+
124
+ * **v0.2 (Planned):**
125
+
126
+     * **Universal SDKs:** Bindings for Node.js, Go, and PHP.
127
+
128
+     * **Advanced Types:** String manipulation kernels (Regex, Splitting).
129
+
130
+     * **ML Bridge:** Zero-Copy export to NumPy and Arrow.
131
+
132
+ ## 🤝 Contributing
133
+
134
+ We welcome contributions! Please see our Contributing Guide for details on how to set up the Rust environment and build the project locally.
135
+
136
+ ## 📄 License
137
+
138
+ This project is licensed under the MIT License.
139
+
140
+ <p align="center"> Built with ❤️ by Alberto Cardenas
141
+
142
+ <a href="https://www.albertocardenas.com">www.albertocardenas.com</a> </p>
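The header block at the top of PKG-INFO uses RFC 822-style `Field: value` lines, so Python's standard `email.parser` module can read it. The following is a small sketch that parses a shortened copy of the fields shown above; the variable names are illustrative only.

```python
from email.parser import HeaderParser

# A shortened copy of the PKG-INFO header fields from this release.
PKG_INFO_HEAD = """\
Metadata-Version: 2.4
Name: pardox
Version: 0.1.0
License: MIT
Keywords: dataframe,rust,etl,big-data,simd
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Rust
Requires-Python: >=3.8
"""

meta = HeaderParser().parsestr(PKG_INFO_HEAD)

name = meta["Name"]                          # single-valued field
keywords = meta["Keywords"].split(",")       # comma-separated in this file
classifiers = meta.get_all("Classifier")     # repeated field, returned as a list

print(name, meta["Version"], keywords, classifiers)
```

`get_all` matters for repeated fields such as `Classifier`, because plain indexing returns only one of the values.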
pardox-0.1.0/README.md ADDED
@@ -0,0 +1,126 @@
1
+ # PardoX: The Hyper-Fast Data Engine 🚀
2
+
3
+ [![PyPI version](https://badge.fury.io/py/pardox.svg)](https://badge.fury.io/py/pardox)
4
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
5
+ [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
6
+ [![Powered By Rust](https://img.shields.io/badge/powered%20by-Rust-orange.svg)](https://www.rust-lang.org/)
7
+
8
+ **The Speed of Rust. The Simplicity of Python.**
9
+
10
+ PardoX is a next-generation DataFrame engine designed for high-performance ETL and data analysis. It bridges the gap between low-level memory efficiency and high-level developer productivity by running a **Rust Core** wrapped in a lightweight **Python SDK**.
11
+
12
+ > **v0.1 Beta is now available!** Supports Windows, Linux, and MacOS (Intel & Apple Silicon).
13
+
14
+ ---
15
+
16
+ ## ⚡ Why PardoX?
17
+
18
+ Traditional DataFrames (like Pandas) often struggle with memory overhead and single-threaded execution. PardoX introduces a **Hybrid Architecture**:
19
+
20
+ * **Core:** Written in **Rust** for memory safety, multithreading, and SIMD (AVX2) optimizations.
21
+ * **Interface:** Native **Python** bindings that feel familiar but run at compiled speeds.
22
+ * **Memory:** Uses **HyperBlock Architecture** to manage data in contiguous chunks, minimizing fragmentation and maximizing CPU cache hits.
23
+
24
+ ---
25
+
26
+ ## 🔥 Key Features (v0.1)
27
+
28
+ ### 1. Zero-Copy Ingestion
29
+ Load massive datasets in seconds. PardoX supports multithreaded CSV parsing and direct SQL ingestion without the overhead of Python objects.
30
+
31
+ ### 2. Native Binary Format (`.prdx`)
32
+ Save and load your data instantly using the `.prdx` format.
33
+ * **Speed:** Up to **4.6 GB/s** read throughput.
34
+ * **Tech:** Custom binary layout optimized for SSDs and OS page caching.
35
+
36
+ ### 3. High-Performance Mutation
37
+ Transform your data in-place without memory duplication.
38
+ * **Arithmetic:** Vectorized addition, subtraction, multiplication, and division.
39
+ * **Hygiene:** Instant `fillna()` and `round()` operations across millions of rows.
40
+ * **Feature Engineering:** Create new columns on the fly: `df['total'] = df['qty'] * df['price']`.
41
+
42
+ ### 4. Cross-Platform & Universal
43
+ Run your code anywhere. PardoX automatically detects your OS and CPU architecture to load the optimized binary kernel.
44
+ * ✅ **Windows (x64)**
45
+ * ✅ **Linux (x64)**
46
+ * ✅ **MacOS (Intel & Apple Silicon M1/M2/M3)**
47
+
48
+ ---
49
+
50
+ ## 📦 Installation
51
+
52
+ PardoX is available on PyPI. The package includes pre-compiled binaries for all supported platforms.
53
+
54
+ ```bash
55
+ pip install pardox
56
+ ```
57
+ ## 🚀 Quick Start
58
+
59
+ Here is a complete ETL pipeline example: Load, Clean, Transform, and Analyze.
60
+
61
+ ```python
62
+ import pardox as px
63
+
64
+ # 1. Ingest Data (Auto-detected Schema)
65
+ # Uses multi-threaded Rust reader
66
+ df = px.read_csv("sales_data.csv")
67
+
68
+ print(f"Loaded {df.shape[0]} rows.")
69
+
70
+ # 2. Data Hygiene
71
+ # Fill nulls in numeric columns instantly
72
+ df.fillna(0.0)
73
+
74
+ # 3. Feature Engineering (Vectorized)
75
+ # Calculate total amount (Price * Quantity)
76
+ # This executes in Rust using SIMD instructions
77
+ df['total_amount'] = df['price'] * df['quantity']
78
+
79
+ # 4. Aggregations & Analysis
80
+ revenue = df['total_amount'].sum()
81
+ avg_ticket = df['total_amount'].mean()
82
+
83
+ print(f"Total Revenue: ${revenue:,.2f}")
84
+ print(f"Avg Ticket: ${avg_ticket:,.2f}")
85
+
86
+ # 5. Persist to Disk
87
+ # Save as PRDX for ultra-fast loading later
88
+ df.to_prdx("sales_data_processed.prdx")
89
+ ```
90
+
91
+ ## 📊 Benchmarks
92
+
93
+ Hardware: MacBook Pro M2, 16GB RAM.
94
+
95
+ | Operation | Pandas (v2.x) | PardoX (v0.1) | Speedup |
96
+ |-----------|---------------|---------------|----------|
97
+ | Read CSV (1GB) | 4.2s | 0.8s | 5.2x |
98
+ | Column Math | 0.15s | 0.02s | 7.5x |
99
+ | Fill NA | 0.30s | 0.04s | 7.5x |
100
+ | Read Binary | 0.9s (Parquet) | 0.2s (.prdx) | 4.5x |
101
+
102
+ ## 🗺️ Roadmap
103
+
104
+ We are building the universal data engine. Here is what's coming next:
105
+
106
+ * **v0.1 (Current):** Python Core, Arithmetic, I/O, Basic Aggregations.
107
+
108
+ * **v0.2 (Planned):**
109
+
110
+     * **Universal SDKs:** Bindings for Node.js, Go, and PHP.
111
+
112
+     * **Advanced Types:** String manipulation kernels (Regex, Splitting).
113
+
114
+     * **ML Bridge:** Zero-Copy export to NumPy and Arrow.
115
+
116
+ ## 🤝 Contributing
117
+
118
+ We welcome contributions! Please see our Contributing Guide for details on how to set up the Rust environment and build the project locally.
119
+
120
+ ## 📄 License
121
+
122
+ This project is licensed under the MIT License.
123
+
124
+ <p align="center"> Built with ❤️ by Alberto Cardenas
125
+
126
+ <a href="https://www.albertocardenas.com">www.albertocardenas.com</a> </p>
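The Speedup column in the README's benchmark table is the Pandas time divided by the PardoX time. A quick sketch that recomputes it from the reported timings follows; the timings are copied from the table, while the helper name is hypothetical.

```python
# (operation, pandas_seconds, pardox_seconds) as reported in the README table.
TIMINGS = [
    ("Read CSV (1GB)", 4.2, 0.8),
    ("Column Math", 0.15, 0.02),
    ("Fill NA", 0.30, 0.04),
    ("Read Binary", 0.9, 0.2),
]


def speedup(baseline_s, candidate_s):
    """Ratio of baseline time to candidate time, rounded to two decimals."""
    return round(baseline_s / candidate_s, 2)


speedups = {op: speedup(pandas_s, pardox_s) for op, pandas_s, pardox_s in TIMINGS}

for op, ratio in speedups.items():
    print(f"{op}: {ratio}x")
```

The recomputed CSV ratio is 4.2 / 0.8 = 5.25, which the table lists as 5.2x; the other three rows match the table exactly.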
@@ -0,0 +1,7 @@
1
+ # pardox/__init__.py
2
+
3
+ from .frame import DataFrame
4
+ from .io import read_csv, read_sql, from_arrow, read_prdx
5
+
6
+ # And we expose it publicly here
7
+ __all__ = ["DataFrame", "read_csv", "read_sql", "from_arrow", "read_prdx"]
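Every name listed in a module's `__all__` has to exist in that module, otherwise `from package import *` raises `AttributeError`. The check below is a minimal sketch of how to catch such a mismatch; it runs against a stand-in module built with `types.ModuleType`, not against the real pardox package, and the helper name is hypothetical.

```python
import types


def missing_exports(module):
    """Return names listed in module.__all__ that the module does not define."""
    exported = getattr(module, "__all__", [])
    return [name for name in exported if not hasattr(module, name)]


# Stand-in module for illustration only; it is not the real pardox package.
demo = types.ModuleType("demo_pkg")
demo.DataFrame = type("DataFrame", (), {})
demo.read_csv = lambda path: None
demo.__all__ = ["DataFrame", "read_csv", "not_defined"]

print(missing_exports(demo))  # → ['not_defined']
```

A check like this fits naturally in a unit test, so a stale entry in `__all__` fails the build before release.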