dataprof 0.4.70__cp310-cp310-macosx_11_0_arm64.whl → 0.4.77__cp310-cp310-macosx_11_0_arm64.whl

This diff shows the contents of publicly released package versions as they appear in their public registries. It is provided for informational purposes only.

Potentially problematic release.


This version of dataprof might be problematic.

Binary file
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: dataprof
- Version: 0.4.70
+ Version: 0.4.77
  Classifier: Development Status :: 4 - Beta
  Classifier: Intended Audience :: Developers
  Classifier: Intended Audience :: Science/Research
@@ -44,7 +44,6 @@ Project-URL: Issues, https://github.com/AndreaBozzo/dataprof/issues
  [![License](https://img.shields.io/github/license/AndreaBozzo/dataprof)](LICENSE)
  [![Rust](https://img.shields.io/badge/rust-1.70%2B-orange.svg)](https://www.rust-lang.org)
  [![Crates.io](https://img.shields.io/crates/v/dataprof.svg)](https://crates.io/crates/dataprof)
- [![Try Online](https://img.shields.io/badge/Try%20Online-CSV%20Online%20Check-blue?style=flat&logo=vercel)](https://csv-mlready-api.vercel.app)
 
  **DISCLAIMER FOR HUMAN READERS**
 
@@ -71,18 +70,6 @@ DataProf processes **all data locally** on your machine. Zero telemetry, zero ex
 
  **Complete transparency:** Every metric, calculation, and data point is documented with source code references for independent verification.
 
- ## Try Online
-
- **No installation required!** Test dataprof instantly with our web interface:
-
- **[CSV Quality API →](https://csv-mlready-api.vercel.app)**
-
- - Drag & drop your CSV (up to 50MB)
- - Get comprehensive quality score in ~10 seconds
- - ISO 8000/25012 compliant metrics
- - Powered by dataprof v0.4.61 core engine
- - Embeddable badges for your README
-
  ## CI/CD Integration
 
  Automate data quality checks in your workflows with our GitHub Action:
@@ -122,6 +109,9 @@ Perfect for ensuring data quality in pipelines, validating data integrity, or ge
  # Comprehensive quality analysis
  dataprof analyze data.csv --detailed
 
+ # Analyze Parquet files (requires --features parquet)
+ dataprof analyze data.parquet --detailed
+
  # Windows example (from project root after cargo build --release)
  target\release\dataprof-cli.exe analyze data.csv --detailed
  ```
@@ -143,6 +133,12 @@ dataprof batch /data/folder --recursive --parallel
  # Generate HTML batch dashboard
  dataprof batch /data/folder --recursive --html batch_report.html
 
+ # JSON export for CI/CD automation
+ dataprof batch /data/folder --json batch_results.json --recursive
+
+ # JSON output to stdout
+ dataprof batch /data/folder --format json --recursive
+
  # With custom filter and progress
  dataprof batch /data/folder --filter "*.csv" --parallel --progress
  ```
@@ -257,10 +253,13 @@ cargo build --release
  # With Apache Arrow (columnar processing, ~90s compile)
  cargo build --release --features arrow
 
+ # With Parquet support (requires arrow, ~95s compile)
+ cargo build --release --features parquet
+
  # With database connectors
  cargo build --release --features postgres,mysql,sqlite
 
- # All features (full functionality, ~120s compile)
+ # All features (full functionality, ~130s compile)
  cargo build --release --all-features
  ```
 
@@ -271,6 +270,13 @@ cargo build --release --all-features
  - ❌ Small files (<10MB) - standard engine is faster
  - ❌ Mixed/messy data - streaming engine handles better
 
+ **When to use Parquet?**
+ - ✅ Analytics workloads with columnar data
+ - ✅ Data lake architectures
+ - ✅ Integration with Spark, Pandas, PyArrow
+ - ✅ Efficient storage and compression
+ - ✅ Type-safe schema preservation
+
  ### Common Development Tasks
  ```bash
  cargo test # Run all tests
@@ -0,0 +1,6 @@
+ dataprof-0.4.77.dist-info/METADATA,sha256=4yqb2tpw6cnWeSnXVtXCWPn63JCzcx2kZvhUr9JQBI4,10879
+ dataprof-0.4.77.dist-info/WHEEL,sha256=PmVieto1wuHPE0V9Yj-HDpDcXvcgT7RQ6xfDnzOpcS8,104
+ dataprof-0.4.77.dist-info/licenses/LICENSE,sha256=CImaqYZiNIl11Vmb14wZW9Jzj33IxXlZnXWegpfXJF0,1069
+ dataprof/__init__.py,sha256=84U5MpyP59z3koB4vbdsJg1XQSKYeTS1SC7b3VqwjfU,115
+ dataprof/dataprof.cpython-310-darwin.so,sha256=NCHkPaPIQnE_TewYXnp1GRwAwPn8TJbiWHVHYuw_kpM,2306368
+ dataprof-0.4.77.dist-info/RECORD,,
@@ -1,4 +1,4 @@
  Wheel-Version: 1.0
- Generator: maturin (1.9.4)
+ Generator: maturin (1.9.6)
  Root-Is-Purelib: false
  Tag: cp310-cp310-macosx_11_0_arm64
@@ -1,6 +0,0 @@
- dataprof-0.4.70.dist-info/METADATA,sha256=Io9XDIig6d7QwloFrAMowDTEdoufGBvTPyuU9FW7c_I,10783
- dataprof-0.4.70.dist-info/WHEEL,sha256=aXz49xVjjC2bkgTnE4xcanfAmG9wdfNG_Q2OldK7oKM,104
- dataprof-0.4.70.dist-info/licenses/LICENSE,sha256=CImaqYZiNIl11Vmb14wZW9Jzj33IxXlZnXWegpfXJF0,1069
- dataprof/__init__.py,sha256=84U5MpyP59z3koB4vbdsJg1XQSKYeTS1SC7b3VqwjfU,115
- dataprof/dataprof.cpython-310-darwin.so,sha256=zll6vW3brbe9Iuh4IXxKdbBsgN2ZCk2ck82RYa_Wst8,2269200
- dataprof-0.4.70.dist-info/RECORD,,
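
Note on the RECORD hunks: each row follows the wheel RECORD layout `path,sha256=<digest>,size`, where the digest is the URL-safe base64 encoding of the SHA-256 hash with trailing `=` padding removed, and the row for RECORD itself leaves both fields empty. A minimal sketch of how a reader might check an extracted file against its row is shown here; the helper names `record_hash`, `parse_record_line`, and `entry_matches` are illustrative assumptions and are not part of dataprof or maturin.

```python
import base64
import csv
import hashlib
from typing import Optional, Tuple


def record_hash(data: bytes) -> str:
    """Return a RECORD-style hash field: sha256=<urlsafe base64, no '=' padding>."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def parse_record_line(line: str) -> Tuple[str, str, Optional[int]]:
    """Split one RECORD row into (path, hash_field, size).

    RECORD is a CSV file, so the csv module handles quoted paths.
    The row for RECORD itself has empty hash and size fields.
    """
    path, hash_field, size = next(csv.reader([line]))
    return path, hash_field, int(size) if size else None


def entry_matches(line: str, data: bytes) -> bool:
    """Check that data has the hash and byte length recorded in the row."""
    _path, hash_field, size = parse_record_line(line)
    return hash_field == record_hash(data) and size == len(data)
```

Read against the two RECORD hunks, `dataprof/__init__.py` and `licenses/LICENSE` carry the same hash and size in both versions, so under this format their contents are unchanged; the compiled `.so`, METADATA, and WHEEL entries differ.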