pointblank 0.8.5__py3-none-any.whl → 0.8.6__py3-none-any.whl

@@ -0,0 +1,312 @@
1
+ Metadata-Version: 2.4
2
+ Name: pointblank
3
+ Version: 0.8.6
4
+ Summary: Find out if your data is what you think it is.
5
+ Author-email: Richard Iannone <riannone@me.com>
6
+ License: MIT License
7
+
8
+ Copyright (c) 2024-2025 Posit Software, PBC
9
+
10
+ Permission is hereby granted, free of charge, to any person obtaining a copy
11
+ of this software and associated documentation files (the "Software"), to deal
12
+ in the Software without restriction, including without limitation the rights
13
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
14
+ copies of the Software, and to permit persons to whom the Software is
15
+ furnished to do so, subject to the following conditions:
16
+
17
+ The above copyright notice and this permission notice shall be included in all
18
+ copies or substantial portions of the Software.
19
+
20
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
21
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
22
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
23
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
24
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
25
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
26
+ SOFTWARE.
27
+
28
+ Project-URL: homepage, https://github.com/posit-dev/pointblank
29
+ Keywords: data,quality,validation,testing,data science,data engineering
30
+ Classifier: Development Status :: 5 - Production/Stable
31
+ Classifier: Intended Audience :: Science/Research
32
+ Classifier: License :: OSI Approved :: MIT License
33
+ Classifier: Programming Language :: Python
34
+ Classifier: Programming Language :: Python :: 3
35
+ Classifier: Programming Language :: Python :: 3.10
36
+ Classifier: Programming Language :: Python :: 3.11
37
+ Classifier: Programming Language :: Python :: 3.12
38
+ Classifier: Operating System :: OS Independent
39
+ Classifier: Topic :: Scientific/Engineering
40
+ Requires-Python: >=3.10
41
+ Description-Content-Type: text/markdown
42
+ License-File: LICENSE
43
+ Requires-Dist: commonmark>=0.9.1
44
+ Requires-Dist: importlib-metadata
45
+ Requires-Dist: great_tables>=0.17.0
46
+ Requires-Dist: narwhals>=1.24.1
47
+ Requires-Dist: typing_extensions>=3.10.0.0
48
+ Requires-Dist: requests>=2.31.0
49
+ Provides-Extra: pd
50
+ Requires-Dist: pandas>=2.2.3; extra == "pd"
51
+ Provides-Extra: pl
52
+ Requires-Dist: polars>=1.24.0; extra == "pl"
53
+ Provides-Extra: generate
54
+ Requires-Dist: chatlas>=0.3.0; extra == "generate"
55
+ Requires-Dist: anthropic[bedrock]>=0.45.2; extra == "generate"
56
+ Requires-Dist: openai>=1.63.0; extra == "generate"
57
+ Requires-Dist: shiny>=1.3.0; extra == "generate"
58
+ Provides-Extra: databricks
59
+ Requires-Dist: ibis-framework[databricks]>=9.5.0; extra == "databricks"
60
+ Provides-Extra: duckdb
61
+ Requires-Dist: ibis-framework[duckdb]>=9.5.0; extra == "duckdb"
62
+ Provides-Extra: mysql
63
+ Requires-Dist: ibis-framework[mysql]>=9.5.0; extra == "mysql"
64
+ Provides-Extra: mssql
65
+ Requires-Dist: ibis-framework[mssql]>=9.5.0; extra == "mssql"
66
+ Provides-Extra: postgres
67
+ Requires-Dist: ibis-framework[postgres]>=9.5.0; extra == "postgres"
68
+ Provides-Extra: pyspark
69
+ Requires-Dist: ibis-framework[pyspark]>=9.5.0; extra == "pyspark"
70
+ Provides-Extra: snowflake
71
+ Requires-Dist: ibis-framework[snowflake]>=9.5.0; extra == "snowflake"
72
+ Provides-Extra: sqlite
73
+ Requires-Dist: ibis-framework[sqlite]>=9.5.0; extra == "sqlite"
74
+ Provides-Extra: docs
75
+ Requires-Dist: jupyter; extra == "docs"
76
+ Requires-Dist: nbclient>=0.10.0; extra == "docs"
77
+ Requires-Dist: nbformat>=5.10.4; extra == "docs"
78
+ Requires-Dist: quartodoc>=0.8.1; python_version >= "3.9" and extra == "docs"
79
+ Requires-Dist: pandas>=2.2.3; extra == "docs"
80
+ Requires-Dist: polars>=1.17.1; extra == "docs"
81
+ Dynamic: license-file
82
+
83
+ <div align="center">
84
+
85
+ <a href="https://posit-dev.github.io/pointblank/"><img src="https://posit-dev.github.io/pointblank/assets/pointblank_logo.svg" width="75%"/></a>
86
+
87
+ _Data validation made beautiful and powerful_
88
+
89
+ [![Python Versions](https://img.shields.io/pypi/pyversions/pointblank.svg)](https://pypi.python.org/pypi/pointblank)
90
+ [![PyPI](https://img.shields.io/pypi/v/pointblank)](https://pypi.org/project/pointblank/#history)
91
+ [![PyPI Downloads](https://img.shields.io/pypi/dm/pointblank)](https://pypistats.org/packages/pointblank)
92
+ [![License](https://img.shields.io/github/license/posit-dev/pointblank)](https://img.shields.io/github/license/posit-dev/pointblank)
93
+
94
+ [![CI Build](https://github.com/posit-dev/pointblank/actions/workflows/ci-tests.yaml/badge.svg)](https://github.com/posit-dev/pointblank/actions/workflows/ci-tests.yaml)
95
+ [![Codecov branch](https://img.shields.io/codecov/c/github/posit-dev/pointblank/main.svg)](https://codecov.io/gh/posit-dev/pointblank)
96
+ [![Repo Status](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
97
+ [![Documentation](https://img.shields.io/badge/docs-project_website-blue.svg)](https://posit-dev.github.io/pointblank/)
98
+
99
+ [![Contributors](https://img.shields.io/github/contributors/posit-dev/pointblank)](https://github.com/posit-dev/pointblank/graphs/contributors)
100
+ [![Discord](https://img.shields.io/discord/1345877328982446110?color=%237289da&label=Discord)](https://discord.com/invite/YH7CybCNCQ)
101
+ [![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg)](https://www.contributor-covenant.org/version/2/1/code_of_conduct.html)
102
+
103
+ </div>
104
+
105
+ ## What is Pointblank?
106
+
107
+ Pointblank is a modern data validation framework for Python that helps you trust your data with confidence. It provides a fluent, expressive API to validate your data against a wide range of constraints and presents results in beautiful, interactive reports.
108
+
109
+ Whether you're a data scientist, data engineer, or analyst, Pointblank helps you catch data quality issues before they impact your analyses or downstream systems.
110
+
111
+ ## Getting Started in 30 Seconds
112
+
113
+ ```python
114
+ import pointblank as pb
115
+
116
+ validation = (
117
+     pb.Validate(data=pb.load_dataset(dataset="small_table"))
118
+     .col_vals_gt(columns="d", value=100)        # Validate values > 100
119
+     .col_vals_le(columns="c", value=5)          # Validate values <= 5
120
+     .col_exists(columns=["date", "date_time"])  # Check columns exist
121
+     .interrogate()                              # Execute and collect results
122
+ )
123
+
124
+ # Get the validation report from the REPL with:
125
+ validation.get_tabular_report().show()
126
+
127
+ # From a notebook simply use:
128
+ validation
129
+ ```
130
+
131
+ <div align="center">
132
+ <img src="https://posit-dev.github.io/pointblank/assets/pointblank-tabular-report.png" width="800px">
133
+ </div>
134
+
135
+ <br>
136
+
137
+ Why Choose Pointblank?
138
+
139
+ - **Works with your existing stack** - Seamlessly integrates with Polars, Pandas, DuckDB, MySQL, PostgreSQL, SQLite, Parquet, and more!
140
+ - **Beautiful, interactive reports** - Crystal-clear validation results that highlight issues and help communicate data quality
141
+ - **Composable validation pipeline** - Chain validation steps into a complete data quality workflow
142
+ - **Threshold-based alerts** - Set 'warning', 'error', and 'critical' thresholds with custom actions
143
+ - **Practical outputs** - Use validation results to filter tables, extract problematic data, or trigger downstream processes
144
+
145
+ ## Real-World Example
146
+
147
+ ```python
148
+ import pointblank as pb
149
+ import polars as pl
150
+
151
+ # Load your data
152
+ sales_data = pl.read_csv("sales_data.csv")
153
+
154
+ # Create a comprehensive validation
155
+ validation = (
156
+     pb.Validate(
157
+         data=sales_data,
158
+         tbl_name="sales_data",           # Name of the table for reporting
159
+         label="Real-world example.",     # Label for the validation, appears in reports
160
+         thresholds=(0.01, 0.02, 0.05),   # Set thresholds for warnings, errors, and critical issues
161
+         actions=pb.Actions(              # Define actions for any threshold exceedance
162
+             critical="Major data quality issue found in step {step} ({time})."
163
+         ),
164
+         final_actions=pb.FinalActions(   # Define final actions for the entire validation
165
+             pb.send_slack_notification(
166
+                 webhook_url="https://hooks.slack.com/services/your/webhook/url"
167
+             )
168
+         ),
169
+         brief=True,                      # Add automatically-generated briefs for each step
170
+     )
171
+     .col_vals_between(            # Check numeric ranges with precision
172
+         columns=["price", "quantity"],
173
+         left=0, right=1000
174
+     )
175
+     .col_vals_not_null(           # Ensure that columns ending with '_id' don't have null values
176
+         columns=pb.ends_with("_id")
177
+     )
178
+     .col_vals_regex(              # Validate patterns with regex
179
+         columns="email",
180
+         pattern="^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"
181
+     )
182
+     .col_vals_in_set(             # Check categorical values
183
+         columns="status",
184
+         set=["pending", "shipped", "delivered", "returned"]
185
+     )
186
+     .conjointly(                  # Combine multiple conditions
187
+         lambda df: pb.expr_col("revenue") == pb.expr_col("price") * pb.expr_col("quantity"),
188
+         lambda df: pb.expr_col("tax") >= pb.expr_col("revenue") * 0.05
189
+     )
190
+     .interrogate()
191
+ )
192
+ ```
193
+
194
+ ```
195
+ Major data quality issue found in step 7 (2025-04-16 15:03:04.685612+00:00).
196
+ ```
197
+
198
+ ```python
199
+ # Get an HTML report you can share with your team
200
+ validation.get_tabular_report().show("browser")
201
+ ```
202
+
203
+ <div align="center">
204
+ <img src="https://posit-dev.github.io/pointblank/assets/pointblank-sales-data.png" width="800px">
205
+ </div>
206
+
207
+ ```python
208
+ # Get a report of failing records from a specific step
209
+ validation.get_step_report(i=3).show("browser") # Get failing records from step 3
210
+ ```
211
+
212
+ <div align="center">
213
+ <img src="https://posit-dev.github.io/pointblank/assets/pointblank-step-report.png" width="800px">
214
+ </div>
215
+
216
+ ## Features That Set Pointblank Apart
217
+
218
+ - **Complete validation workflow** - From data access to validation to reporting in a single pipeline
219
+ - **Built for collaboration** - Share results with colleagues through beautiful interactive reports
220
+ - **Practical outputs** - Get exactly what you need: counts, extracts, summaries, or full reports
221
+ - **Flexible deployment** - Use in notebooks, scripts, or data pipelines
222
+ - **Customizable** - Tailor validation steps and reporting to your specific needs
223
+ - **Internationalization** - Reports can be generated in over 20 languages, including English, Spanish, French, and German
224
+
225
+ ## Documentation and Examples
226
+
227
+ Visit our [documentation site](https://posit-dev.github.io/pointblank) for:
228
+
229
+ - [The User Guide](https://posit-dev.github.io/pointblank/user-guide/)
230
+ - [API reference](https://posit-dev.github.io/pointblank/reference/)
231
+ - [Example gallery](https://posit-dev.github.io/pointblank/demos/)
232
+ - [The Pointblog](https://posit-dev.github.io/pointblank/blog/)
233
+
234
+ ## Join the Community
235
+
236
+ We'd love to hear from you! Connect with us:
237
+
238
+ - [GitHub Issues](https://github.com/posit-dev/pointblank/issues) for bug reports and feature requests
239
+ - [_Discord server_](https://discord.com/invite/YH7CybCNCQ) for discussions and help
240
+ - [Contributing guidelines](https://github.com/posit-dev/pointblank/blob/main/CONTRIBUTING.md) if you'd like to help improve Pointblank
241
+
242
+ ## Installation
243
+
244
+ You can install Pointblank using pip:
245
+
246
+ ```bash
247
+ pip install pointblank
248
+ ```
249
+
250
+ You can also install Pointblank from Conda-Forge by using:
251
+
252
+ ```bash
253
+ conda install conda-forge::pointblank
254
+ ```
255
+
256
+ If you don't have Polars or Pandas installed, you'll need to install one of them to use Pointblank.
257
+
258
+ ```bash
259
+ pip install "pointblank[pl]" # Install Pointblank with Polars
260
+ pip install "pointblank[pd]" # Install Pointblank with Pandas
261
+ ```
262
+
263
+ To use Pointblank with DuckDB, MySQL, PostgreSQL, or SQLite, install Ibis with the appropriate backend:
264
+
265
+ ```bash
266
+ pip install "pointblank[duckdb]" # Install Pointblank with Ibis + DuckDB
267
+ pip install "pointblank[mysql]" # Install Pointblank with Ibis + MySQL
268
+ pip install "pointblank[postgres]" # Install Pointblank with Ibis + PostgreSQL
269
+ pip install "pointblank[sqlite]" # Install Pointblank with Ibis + SQLite
270
+ ```
271
+
272
+ ## Technical Details
273
+
274
+ Pointblank uses [Narwhals](https://github.com/narwhals-dev/narwhals) to work with Polars and Pandas DataFrames, and integrates with [Ibis](https://github.com/ibis-project/ibis) for database and file format support. This architecture provides a consistent API for validating tabular data from various sources.
275
+
276
+ ## Contributing to Pointblank
277
+
278
+ There are many ways to contribute to the ongoing development of Pointblank. Some contributions can be simple (like fixing typos, improving documentation, filing issues for feature requests or problems, etc.) and others might take more time and care (like answering questions and submitting PRs with code changes). Just know that anything you can do to help would be very much appreciated!
279
+
280
+ Please read over the [contributing guidelines](https://github.com/posit-dev/pointblank/blob/main/CONTRIBUTING.md) for
281
+ information on how to get started.
282
+
283
+ ## Roadmap
284
+
285
+ We're actively working on enhancing Pointblank with:
286
+
287
+ 1. Additional validation methods for comprehensive data quality checks
288
+ 2. Advanced logging capabilities
289
+ 3. Messaging actions (Slack, email) for threshold exceedances
290
+ 4. LLM-powered validation suggestions and data dictionary generation
291
+ 5. JSON/YAML configuration for pipeline portability
292
+ 6. CLI utility for validation from the command line
293
+ 7. Expanded backend support and certification
294
+ 8. High-quality documentation and examples
295
+
296
+ If you have any ideas for features or improvements, don't hesitate to share them with us! We are always looking for ways to make Pointblank better.
297
+
298
+ ## Code of Conduct
299
+
300
+ Please note that the Pointblank project is released with a [contributor code of conduct](https://www.contributor-covenant.org/version/2/1/code_of_conduct/). <br>By participating in this project you agree to abide by its terms.
301
+
302
+ ## 📄 License
303
+
304
+ Pointblank is licensed under the MIT license.
305
+
306
+ © Posit Software, PBC.
307
+
308
+ ## 🏛️ Governance
309
+
310
+ This project is primarily maintained by
311
+ [Rich Iannone](https://bsky.app/profile/richmeister.bsky.social). Other authors may occasionally
312
+ assist with some of these duties.
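The `Requires-Dist` and `Provides-Extra` headers in the METADATA hunk above follow the email-style header format used by Python core metadata, so they can be read with the standard library. The sketch below groups requirements by extra. It is only an illustration: `SAMPLE` is a short excerpt copied from the headers above, `requirements_by_extra` is a hypothetical helper name, and the regex-based marker handling is a simplification (a full marker parser such as the `packaging` library would be more robust).

```python
import re
from collections import defaultdict
from email.parser import Parser

# Excerpt of the METADATA headers shown in the diff (not the full file).
SAMPLE = """\
Metadata-Version: 2.4
Name: pointblank
Version: 0.8.6
Requires-Dist: narwhals>=1.24.1
Provides-Extra: pl
Requires-Dist: polars>=1.24.0; extra == "pl"
Provides-Extra: duckdb
Requires-Dist: ibis-framework[duckdb]>=9.5.0; extra == "duckdb"
Requires-Dist: quartodoc>=0.8.1; python_version >= "3.9" and extra == "docs"
"""

# Simplified: finds `extra == "name"` anywhere in the environment marker.
EXTRA_RE = re.compile(r"""extra\s*==\s*["']([^"']+)["']""")


def requirements_by_extra(metadata_text):
    """Map each extra name ('' for unconditional) to its requirement strings."""
    message = Parser().parsestr(metadata_text)
    grouped = defaultdict(list)
    for value in message.get_all("Requires-Dist", []):
        # Everything after the first ';' is the environment marker.
        requirement, _, marker = value.partition(";")
        match = EXTRA_RE.search(marker)
        grouped[match.group(1) if match else ""].append(requirement.strip())
    return dict(grouped)


if __name__ == "__main__":
    for extra, requirements in requirements_by_extra(SAMPLE).items():
        print(extra or "(core)", requirements)
```

Requirements without an `extra` marker land under the empty key, which corresponds to what a plain `pip install pointblank` pulls in.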
@@ -1,22 +1,22 @@
1
- pointblank/__init__.py,sha256=c1lZsS_xsMq3OfkCuYQPxDByK_IRLGTYtd5n6uIveks,1555
2
- pointblank/_constants.py,sha256=xbvHGDi5mt85FBnznXupwE79KttHFbORLVSQVXBKdXE,72533
1
+ pointblank/__init__.py,sha256=uHrX-ARZOhvWogXXqKV65RO2DXdYLZNCD1oNcm8hE6o,1585
2
+ pointblank/_constants.py,sha256=1CkIbDutX3oSo2iVjyGthN6GipE0NIxB6dDlLs71PWo,75793
3
3
  pointblank/_constants_docs.py,sha256=JBmtt16zTYQ-zaM4ElLExtKs-dKlnN553Ys2ML1Y1C8,2099
4
- pointblank/_constants_translations.py,sha256=5I-QNY6b3wTIvDS1PzMG-uP2OkCB6c86NP2hr-RHji4,161031
5
- pointblank/_interrogation.py,sha256=AtygXSb5iaqUcobnfVF3HjO9mjrtPWkLJ8No9XFSvR8,73186
4
+ pointblank/_constants_translations.py,sha256=QfOmVESwWFokWXpgLkEFHGik8o1EUBhIXYtaEqtGGNg,166575
5
+ pointblank/_interrogation.py,sha256=H9gSmtV7QiHMOyHNMbS2MvgG5YX4kexaZ7Mwcb2P9tE,80799
6
6
  pointblank/_typing.py,sha256=YQ6Bt-j-W6Cg91qXHHDzBM-ptc-IEvhMg6T5ugWnGwM,306
7
- pointblank/_utils.py,sha256=Loyu9qo_QR3lgtsWYmFsxfVQCxdU_GWOAk9LqrQq0Wo,24630
7
+ pointblank/_utils.py,sha256=0V-LxUjSjGfcZV2_IH-5KPikYiVWdt4QSMQDioyZoZc,24681
8
8
  pointblank/_utils_check_args.py,sha256=rFEc1nbCN8ftsQQWVjCNWmQ2QmUDxkfgmoJclrZeTLs,5489
9
9
  pointblank/_utils_html.py,sha256=sTcmnBljkPjRZF1hbpoHl4HmnXOazsA91gC9iWVIrRk,2848
10
10
  pointblank/actions.py,sha256=oazJk4pe3lIA14hjyCDtPOr4r_sp4vGGo2eyU_LX5_0,18268
11
11
  pointblank/assistant.py,sha256=ZIQJKTy9rDwq_Wmr1FMp0J7Q3ekxSgF3_tK0p4PTEUM,14850
12
- pointblank/column.py,sha256=0DcfGqQrrAqLl6TkSvzKBWZLF_G-NP4C26CqvuLQCb4,63328
12
+ pointblank/column.py,sha256=LumGbnterw5VM7-2-7Za3jdlug1VVS9a3TOH0Y1E5eg,76548
13
13
  pointblank/datascan.py,sha256=p0b7j4sxbJxNqIvYqq5r-9-8f-i9niswK19PrmWOfFE,47727
14
14
  pointblank/draft.py,sha256=lIbSlY9Avi1GbRvJhqR-69sGWCfD11im3Go20XsX8L0,15783
15
15
  pointblank/schema.py,sha256=gzUCmtccO2v15MH2bo9uHUYjkKEEne1okQucxcH39pc,44291
16
16
  pointblank/tf.py,sha256=8o_8m4i01teulEe3-YYMotSNf3tImjBMInsvdjSAO5Q,8844
17
17
  pointblank/thresholds.py,sha256=C8_Rn2z3MVFu4UH5eaGRd7DkW3slgkWB3Hhim2h5CfU,25340
18
- pointblank/validate.py,sha256=fnm3xy85AcMPA7v2n9s2NsbCWHjG7A8hcwo7a7lm2N8,500689
19
- pointblank/data/api-docs.txt,sha256=u9Q0eWlTLW396YSp2lY15bh_omw01XnGPF_jirODLCQ,397547
18
+ pointblank/validate.py,sha256=y6fEe8f9idql9lGQrC8AhIJyUzF3_CBsExEpyjjMi7E,512522
19
+ pointblank/data/api-docs.txt,sha256=3HZprV731qL-jNJ4rnntq_Nymc819ZSLM4fGpsjNfLc,408766
20
20
  pointblank/data/game_revenue-duckdb.zip,sha256=tKIVx48OGLYGsQPS3h5AjA2Nyq_rfEpLCjBiFUWhagU,35880
21
21
  pointblank/data/game_revenue.zip,sha256=7c9EvHLyi93CHUd4p3dM4CZ-GucFCtXKSPxgLojL32U,33749
22
22
  pointblank/data/nycflights-duckdb.zip,sha256=GQrHO9tp7d9cNGFNSbA9EKF19MLf6t2wZE0U9-hIKow,5293077
@@ -24,8 +24,8 @@ pointblank/data/nycflights.zip,sha256=yVjbUaKUz2LydSdF9cABuir0VReHBBgV7shiNWSd0m
24
24
  pointblank/data/polars-api-docs.txt,sha256=KGcS-BOtUs9zgpkWfXD-GFdFh4O_zjdkpX7msHjztLg,198045
25
25
  pointblank/data/small_table-duckdb.zip,sha256=BhTaZ2CRS4-9Z1uVhOU6HggvW3XCar7etMznfENIcOc,2028
26
26
  pointblank/data/small_table.zip,sha256=lmFb90Nb-v5X559Ikjg31YLAXuRyMkD9yLRElkXPMzQ,472
27
- pointblank-0.8.5.dist-info/licenses/LICENSE,sha256=apLF-HWPNU7pT5bmf5KmZpD5Cklpy2u-BN_0xBoRMLY,1081
28
- pointblank-0.8.5.dist-info/METADATA,sha256=BPfM_mGzEYNoSFSys35V2vBCGhR1hywdgyoRxGTmpEo,12839
29
- pointblank-0.8.5.dist-info/WHEEL,sha256=CmyFI0kx5cdEMTLiONQRbGQwjIoR1aIYB7eCAQ4KPJ0,91
30
- pointblank-0.8.5.dist-info/top_level.txt,sha256=-wHrS1SvV8-nhvc3w-PPYs1C1WtEc1pK-eGjubbCCKc,11
31
- pointblank-0.8.5.dist-info/RECORD,,
27
+ pointblank-0.8.6.dist-info/licenses/LICENSE,sha256=apLF-HWPNU7pT5bmf5KmZpD5Cklpy2u-BN_0xBoRMLY,1081
28
+ pointblank-0.8.6.dist-info/METADATA,sha256=aDXpDvJMC8tWYgFNxmcQogOkcRWBFN21c14adAVCs78,13977
29
+ pointblank-0.8.6.dist-info/WHEEL,sha256=CmyFI0kx5cdEMTLiONQRbGQwjIoR1aIYB7eCAQ4KPJ0,91
30
+ pointblank-0.8.6.dist-info/top_level.txt,sha256=-wHrS1SvV8-nhvc3w-PPYs1C1WtEc1pK-eGjubbCCKc,11
31
+ pointblank-0.8.6.dist-info/RECORD,,
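Each RECORD line above has the form `path,sha256=<digest>,<size>`, where the digest is the file's SHA-256 hash in URL-safe base64 with the trailing `=` padding removed. The RECORD file's own entry leaves the hash and size empty. A minimal sketch of how such an entry could be parsed and recomputed is shown below. The helper names `record_hash` and `parse_record_line` are hypothetical and not part of any packaging tool.

```python
import base64
import csv
import hashlib
from typing import Optional, Tuple


def record_hash(data: bytes) -> str:
    """Return the SHA-256 digest of `data` in the encoding used by wheel RECORD files."""
    digest = hashlib.sha256(data).digest()
    # URL-safe base64, with the '=' padding stripped.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def parse_record_line(line: str) -> Tuple[str, str, Optional[int]]:
    """Split one RECORD line into (path, hash, size); size is None when empty."""
    path, hash_field, size_field = next(csv.reader([line]))
    return path, hash_field, int(size_field) if size_field else None


if __name__ == "__main__":
    entry = "pointblank/_typing.py,sha256=YQ6Bt-j-W6Cg91qXHHDzBM-ptc-IEvhMg6T5ugWnGwM,306"
    print(parse_record_line(entry))
    # To verify an unpacked wheel, compare record_hash(file_bytes) and
    # len(file_bytes) against the parsed hash and size for each path.
```

Files whose hash and size are identical on both sides of the diff (for example `pointblank/_typing.py`) are byte-for-byte unchanged between 0.8.5 and 0.8.6.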
@@ -1,269 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: pointblank
3
- Version: 0.8.5
4
- Summary: Find out if your data is what you think it is.
5
- Author-email: Richard Iannone <riannone@me.com>
6
- License: MIT License
7
-
8
- Copyright (c) 2024-2025 Posit Software, PBC
9
-
10
- Permission is hereby granted, free of charge, to any person obtaining a copy
11
- of this software and associated documentation files (the "Software"), to deal
12
- in the Software without restriction, including without limitation the rights
13
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
14
- copies of the Software, and to permit persons to whom the Software is
15
- furnished to do so, subject to the following conditions:
16
-
17
- The above copyright notice and this permission notice shall be included in all
18
- copies or substantial portions of the Software.
19
-
20
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
21
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
22
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
23
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
24
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
25
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
26
- SOFTWARE.
27
-
28
- Project-URL: homepage, https://github.com/posit-dev/pointblank
29
- Keywords: data,quality,validation,testing,data science,data engineering
30
- Classifier: Development Status :: 5 - Production/Stable
31
- Classifier: Intended Audience :: Science/Research
32
- Classifier: License :: OSI Approved :: MIT License
33
- Classifier: Programming Language :: Python
34
- Classifier: Programming Language :: Python :: 3
35
- Classifier: Programming Language :: Python :: 3.10
36
- Classifier: Programming Language :: Python :: 3.11
37
- Classifier: Programming Language :: Python :: 3.12
38
- Classifier: Operating System :: OS Independent
39
- Classifier: Topic :: Scientific/Engineering
40
- Requires-Python: >=3.10
41
- Description-Content-Type: text/markdown
42
- License-File: LICENSE
43
- Requires-Dist: commonmark>=0.9.1
44
- Requires-Dist: importlib-metadata
45
- Requires-Dist: great_tables>=0.17.0
46
- Requires-Dist: narwhals>=1.24.1
47
- Requires-Dist: typing_extensions>=3.10.0.0
48
- Requires-Dist: requests>=2.31.0
49
- Provides-Extra: pd
50
- Requires-Dist: pandas>=2.2.3; extra == "pd"
51
- Provides-Extra: pl
52
- Requires-Dist: polars>=1.24.0; extra == "pl"
53
- Provides-Extra: generate
54
- Requires-Dist: chatlas>=0.3.0; extra == "generate"
55
- Requires-Dist: anthropic[bedrock]>=0.45.2; extra == "generate"
56
- Requires-Dist: openai>=1.63.0; extra == "generate"
57
- Requires-Dist: shiny>=1.3.0; extra == "generate"
58
- Provides-Extra: databricks
59
- Requires-Dist: ibis-framework[databricks]>=9.5.0; extra == "databricks"
60
- Provides-Extra: duckdb
61
- Requires-Dist: ibis-framework[duckdb]>=9.5.0; extra == "duckdb"
62
- Provides-Extra: mysql
63
- Requires-Dist: ibis-framework[mysql]>=9.5.0; extra == "mysql"
64
- Provides-Extra: mssql
65
- Requires-Dist: ibis-framework[mssql]>=9.5.0; extra == "mssql"
66
- Provides-Extra: postgres
67
- Requires-Dist: ibis-framework[postgres]>=9.5.0; extra == "postgres"
68
- Provides-Extra: pyspark
69
- Requires-Dist: ibis-framework[pyspark]>=9.5.0; extra == "pyspark"
70
- Provides-Extra: snowflake
71
- Requires-Dist: ibis-framework[snowflake]>=9.5.0; extra == "snowflake"
72
- Provides-Extra: sqlite
73
- Requires-Dist: ibis-framework[sqlite]>=9.5.0; extra == "sqlite"
74
- Provides-Extra: docs
75
- Requires-Dist: jupyter; extra == "docs"
76
- Requires-Dist: nbclient>=0.10.0; extra == "docs"
77
- Requires-Dist: nbformat>=5.10.4; extra == "docs"
78
- Requires-Dist: quartodoc>=0.8.1; python_version >= "3.9" and extra == "docs"
79
- Requires-Dist: pandas>=2.2.3; extra == "docs"
80
- Requires-Dist: polars>=1.17.1; extra == "docs"
81
- Dynamic: license-file
82
-
83
- <div align="center">
84
-
85
- <a href="https://posit-dev.github.io/pointblank/"><img src="https://posit-dev.github.io/pointblank/assets/pointblank_logo.svg" width="75%"/></a>
86
-
87
- _Find out if your data is what you think it is._
88
-
89
- [![Python Versions](https://img.shields.io/pypi/pyversions/pointblank.svg)](https://pypi.python.org/pypi/pointblank)
90
- [![PyPI](https://img.shields.io/pypi/v/pointblank)](https://pypi.org/project/pointblank/#history)
91
- [![PyPI Downloads](https://img.shields.io/pypi/dm/pointblank)](https://pypistats.org/packages/pointblank)
92
- [![License](https://img.shields.io/github/license/posit-dev/pointblank)](https://img.shields.io/github/license/posit-dev/pointblank)
93
-
94
- [![CI Build](https://github.com/posit-dev/pointblank/actions/workflows/ci-tests.yaml/badge.svg)](https://github.com/posit-dev/pointblank/actions/workflows/ci-tests.yaml)
95
- [![Codecov branch](https://img.shields.io/codecov/c/github/posit-dev/pointblank/main.svg)](https://codecov.io/gh/posit-dev/pointblank)
96
- [![Repo Status](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
97
- [![Documentation](https://img.shields.io/badge/docs-project_website-blue.svg)](https://posit-dev.github.io/pointblank/)
98
-
99
- [![Contributors](https://img.shields.io/github/contributors/posit-dev/pointblank)](https://github.com/posit-dev/pointblank/graphs/contributors)
100
- [![Discord](https://img.shields.io/discord/1345877328982446110?color=%237289da&label=Discord)](https://discord.com/invite/YH7CybCNCQ)
101
- [![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg)](https://www.contributor-covenant.org/version/2/1/code_of_conduct.html)
102
-
103
- </div>
104
-
105
- Pointblank is a table validation and testing library for Python. It helps you ensure that your
106
- tabular data meets certain expectations and constraints and it presents the results in a beautiful
107
- validation report table.
108
-
109
- ## Getting Started
110
-
111
- Let's take a Polars DataFrame and validate it against a set of constraints. We do that by using the
112
- `Validate` class along with adding validation steps:
113
-
114
- ```python
115
- import pointblank as pb
116
-
117
- validation = (
118
-     pb.Validate(data=pb.load_dataset(dataset="small_table")) # Use Validate() to start
119
-     .col_vals_gt(columns="d", value=100) # STEP 1 |
120
-     .col_vals_le(columns="c", value=5) # STEP 2 | <-- Build up a validation plan
121
-     .col_exists(columns=["date", "date_time"]) # STEPS 3 & 4 |
122
-     .interrogate() # This will execute all validation steps and collect intel
123
- )
124
-
125
- validation
126
- ```
127
-
128
- <div align="center">
129
- <img src="https://posit-dev.github.io/pointblank/assets/pointblank-tabular-report.png" width="800px">
130
- </div>
131
-
132
- The rows in the validation report table correspond to each of the validation steps. One of the key
133
- concepts is that validation steps can be broken down into atomic test cases (test units), where each
134
- of these test units is given either of pass/fail status based on the validation constraints. You'll
135
- see these tallied up in the reporting table (in the `UNITS`, `PASS`, and `FAIL` columns).
136
-
137
- The tabular reporting view is just one way to see the results. You can also obtain fine-grained
138
- results of the interrogation as individual step reports or via methods that provide key metrics.
139
- It's also possible to use the validation results for downstream processing, such as filtering the
140
- input table based on the pass/fail status of the rows.
141
-
142
- On the input side, we can use the following types of tables:
143
-
144
- - Polars DataFrame
145
- - Pandas DataFrame
146
- - DuckDB table
147
- - MySQL table
148
- - PostgreSQL table
149
- - SQLite table
150
- - Parquet
151
-
152
- To make this all work seamlessly, we use [Narwhals](https://github.com/narwhals-dev/narwhals) to
153
- work with Polars and Pandas DataFrames. We also integrate with
154
- [Ibis](https://github.com/ibis-project/ibis) to enable the use of DuckDB, MySQL, PostgreSQL, SQLite,
155
- Parquet, and more! In doing all of this, we can provide an ergonomic and consistent API for
156
- validating tabular data from various sources.
157
-
158
- Note: if you want the validation report from the REPL, you have to run `validation.get_tabular_report().show()`.
159
-
160
- ## Features
161
-
162
- Here's a short list of what we think makes Pointblank a great tool for data validation:
163
-
164
- - **Flexible**: We support tables from Polars, Pandas, DuckDB, MySQL, PostgreSQL, SQLite, and Parquet
165
- - **Beautiful Reports**: Generate beautiful HTML table reports of your data validation results
166
- - **Functional Output**: Easily pull the specific data validation outputs you need for further processing
167
- - **Easy to Use**: Get started quickly with a straightforward API and clear documentation examples
168
- - **Powerful**: You can make complex data validation rules with flexible options for composition
169
-
170
- There's a lot of [interesting examples](https://posit-dev.github.io/pointblank/demos/) you can
171
- check out in the documentation website.
172
-
173
- ## Installation
174
-
175
- You can install Pointblank using pip:
176
-
177
- ```bash
178
- pip install pointblank
179
- ```
180
-
181
- You can also install [Pointblank from Conda-Forge](https://anaconda.org/conda-forge/pointblank) by
182
- using:
183
-
184
- ```bash
185
- conda install conda-forge::pointblank
186
- ```
187
-
188
- If you don't have Polars or Pandas installed, you'll need to install one of them to use Pointblank.
189
-
190
- ```bash
191
- pip install "pointblank[pl]" # Install Pointblank with Polars
192
- pip install "pointblank[pd]" # Install Pointblank with Pandas
193
- ```
194
-
195
- To use Pointblank with DuckDB, MySQL, PostgreSQL, or SQLite, install Ibis with the appropriate
196
- backend:
197
-
198
- ```bash
199
- pip install "pointblank[duckdb]" # Install Pointblank with Ibis + DuckDB
200
- pip install "pointblank[mysql]" # Install Pointblank with Ibis + MySQL
201
- pip install "pointblank[postgres]" # Install Pointblank with Ibis + PostgreSQL
202
- pip install "pointblank[sqlite]" # Install Pointblank with Ibis + SQLite
203
- ```
204
-
205
- ## Getting in Touch
206
-
207
- If you encounter a bug, have usage questions, or want to share ideas to make this package better,
208
- please feel free to file an [issue](https://github.com/posit-dev/pointblank/issues).
209
-
210
- Wanna talk about data validation in a more relaxed setting? Join our
211
- [_Discord server_](https://discord.com/invite/YH7CybCNCQ)! This is a great option for asking about
212
- the development of Pointblank, pitching ideas that may become features, and just sharing your ideas!
213
-
214
- [![Discord Server](https://img.shields.io/badge/Discord-Chat%20with%20us-blue?style=social&logo=discord&logoColor=purple)](https://discord.com/invite/YH7CybCNCQ)
215
-
216
- ## Contributing to Pointblank
217
-
218
- There are many ways to contribute to the ongoing development of Pointblank. Some contributions can
219
- be simple (like fixing typos, improving documentation, filing issues for feature requests or
220
- problems, etc.) and others might take more time and care (like answering questions and submitting
221
- PRs with code changes). Just know that anything you can do to help would be very much appreciated!
222
-
223
- Please read over the
224
- [contributing guidelines](https://github.com/posit-dev/pointblank/blob/main/CONTRIBUTING.md) for
225
- information on how to get started.
226
-
227
- ## Roadmap
228
-
229
- There is much to do to make Pointblank a dependable and useful tool for data validation. To that
230
- end, we have a roadmap that will serve as a guide for the development of the library. Here are some
231
- of the things we are working on or plan to work on in the near future:
232
-
233
- 1. more validation methods to cover a wider range of data validation needs
234
- 2. easy-to-use but powerful logging functionality
235
- 3. messaging actions (e.g., Slack, emailing, etc.) to better react to threshold exceedances
236
- 4. additional functionality for building more complex validations via LLMs (extension of ideas from
237
- the current `DraftValidation` class)
238
- 5. a feature for quickly obtaining summary information on any dataset (tying together existing and
239
- future dataset summary-generation pieces)
240
- 6. ensuring there are text/dict/JSON/HTML versions of all reports
241
- 7. supporting the writing and reading of YAML validation config files
242
- 8. a cli utility for Pointblank that can be used to run validations from the command line
243
- 9. complete testing of validations across all compatible backends (for certification of those
244
- backends as fully supported)
245
- 10. completion of the **User Guide** in the project website
246
- 11. functionality for creating and publishing data dictionaries, which could: (a) use LLMs to more
247
- quickly draft column-level descriptions, and (b) incorporate templating features to make it
248
- easier to keep descriptions consistent and up to date
249
-
250
- If you have any ideas for features or improvements, don't hesitate to share them with us! We are
251
- always looking for ways to make Pointblank better.
252
-
253
- ## Code of Conduct
254
-
255
- Please note that the Pointblank project is released with a
256
- [contributor code of conduct](https://www.contributor-covenant.org/version/2/1/code_of_conduct/).
257
- <br>By participating in this project you agree to abide by its terms.
258
-
259
- ## 📄 License
260
-
261
- Pointblank is licensed under the MIT license.
262
-
263
- © Posit Software, PBC.
264
-
265
- ## 🏛️ Governance
266
-
267
- This project is primarily maintained by
268
- [Rich Iannone](https://bsky.app/profile/richmeister.bsky.social). Other authors may occasionally
269
- assist with some of these duties.