data-contract-validator 1.0.0__tar.gz → 1.0.1__tar.gz

Files changed (29)
  1. data_contract_validator-1.0.1/PKG-INFO +512 -0
  2. data_contract_validator-1.0.1/README.md +463 -0
  3. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/cli.py +67 -0
  4. data_contract_validator-1.0.1/data_contract_validator.egg-info/PKG-INFO +512 -0
  5. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/pyproject.toml +6 -6
  6. data_contract_validator-1.0.0/PKG-INFO +0 -344
  7. data_contract_validator-1.0.0/README.md +0 -295
  8. data_contract_validator-1.0.0/data_contract_validator.egg-info/PKG-INFO +0 -344
  9. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/CHANGELOG.md +0 -0
  10. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/LICENSE +0 -0
  11. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/MANIFEST.in +0 -0
  12. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/__init__.py +0 -0
  13. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/core/__init__.py +0 -0
  14. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/core/models.py +0 -0
  15. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/core/validator.py +0 -0
  16. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/extractors/__init__.py +0 -0
  17. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/extractors/base.py +0 -0
  18. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/extractors/dbt.py +0 -0
  19. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/extractors/fastapi.py +0 -0
  20. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/integrations/__init__.py +0 -0
  21. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/py.typed +0 -0
  22. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator/templates/github-actions-template.yml +0 -0
  23. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator.egg-info/SOURCES.txt +0 -0
  24. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator.egg-info/dependency_links.txt +0 -0
  25. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator.egg-info/entry_points.txt +0 -0
  26. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator.egg-info/requires.txt +0 -0
  27. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/data_contract_validator.egg-info/top_level.txt +0 -0
  28. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/requirements.txt +0 -0
  29. {data_contract_validator-1.0.0 → data_contract_validator-1.0.1}/setup.cfg +0 -0
@@ -0,0 +1,512 @@
1
+ Metadata-Version: 2.4
2
+ Name: data-contract-validator
3
+ Version: 1.0.1
4
+ Summary: Prevent production API breaks by validating data contracts between DBT models and API frameworks
5
+ Author-email: Ogunniran Siji <ogunniransiji@gmail.com>
6
+ Maintainer-email: Ogunniran Siji <ogunniransiji@gmail.com>
7
+ License: MIT
8
+ Project-URL: Homepage, https://github.com/OGsiji/data-contract-validator
9
+ Project-URL: Documentation, https://github.com/OGsiji/data-contract-validator/blob/main/README.md
10
+ Project-URL: Repository, https://github.com/OGsiji/data-contract-validator
11
+ Project-URL: Bug Reports, https://github.com/OGsiji/data-contract-validator/issues
12
+ Project-URL: Changelog, https://github.com/OGsiji/data-contract-validator/blob/main/CHANGELOG.md
13
+ Keywords: dbt,fastapi,contract-testing,api-validation,data-engineering,schema-validation,ci-cd,devops
14
+ Classifier: Development Status :: 4 - Beta
15
+ Classifier: Intended Audience :: Developers
16
+ Classifier: License :: OSI Approved :: MIT License
17
+ Classifier: Operating System :: OS Independent
18
+ Classifier: Programming Language :: Python :: 3
19
+ Classifier: Programming Language :: Python :: 3.8
20
+ Classifier: Programming Language :: Python :: 3.9
21
+ Classifier: Programming Language :: Python :: 3.10
22
+ Classifier: Programming Language :: Python :: 3.11
23
+ Classifier: Programming Language :: Python :: 3.12
24
+ Classifier: Topic :: Software Development :: Quality Assurance
25
+ Classifier: Topic :: Software Development :: Testing
26
+ Classifier: Topic :: Database
27
+ Classifier: Topic :: Internet :: WWW/HTTP :: HTTP Servers
28
+ Requires-Python: >=3.8
29
+ Description-Content-Type: text/markdown
30
+ License-File: LICENSE
31
+ Requires-Dist: pydantic>=2.0.0
32
+ Requires-Dist: PyYAML>=6.0
33
+ Requires-Dist: requests>=2.25.0
34
+ Requires-Dist: click>=8.0.0
35
+ Provides-Extra: dev
36
+ Requires-Dist: pytest>=7.0.0; extra == "dev"
37
+ Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
38
+ Requires-Dist: black>=22.0.0; extra == "dev"
39
+ Requires-Dist: flake8>=4.0.0; extra == "dev"
40
+ Requires-Dist: mypy>=0.991; extra == "dev"
41
+ Requires-Dist: pre-commit>=2.20.0; extra == "dev"
42
+ Requires-Dist: build>=0.8.0; extra == "dev"
43
+ Requires-Dist: twine>=4.0.0; extra == "dev"
44
+ Provides-Extra: test
45
+ Requires-Dist: pytest>=7.0.0; extra == "test"
46
+ Requires-Dist: pytest-cov>=4.0.0; extra == "test"
47
+ Requires-Dist: pytest-mock>=3.8.0; extra == "test"
48
+ Dynamic: license-file
49
+
50
+ # 🛡️ Data Contract Validator
51
+
52
+ > **Prevent production API breaks by validating data contracts between your data pipelines and API frameworks**
53
+
54
+ [![PyPI version](https://badge.fury.io/py/data-contract-validator.svg)](https://badge.fury.io/py/data-contract-validator)
55
+ [![Tests](https://github.com/OGsiji/data-contract-validator/workflows/Tests/badge.svg)](https://github.com/OGsiji/data-contract-validator/actions)
56
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
57
+
58
+ ## 🎯 **What This Solves**
59
+
60
+ Ever deployed a DBT model change only to break your FastAPI in production? This tool prevents that by validating data contracts between your data pipelines and APIs **before** deployment.
61
+
62
+ ```
63
+ DBT Models          Contract            FastAPI Models
64
+ (What data          Validator           (What APIs
65
+  produces)       ↕️ VALIDATES ↕️         expect)
66
+      ↓                  ↓                    ↓
67
+   Schema              Finds               Schema
68
+   Extraction          Mismatches          Extraction
69
+ ```
70
+
71
+ ## ⚡ **Quick Start**
72
+
73
+ ### **Installation**
74
+ ```bash
75
+ pip install data-contract-validator
76
+ ```
77
+
78
+ ### **30-Second Setup**
79
+ ```bash
80
+ # 1. Initialize in your project
81
+ contract-validator init --interactive
82
+
83
+ # 2. Test setup
84
+ contract-validator test
85
+
86
+ # 3. Validate contracts
87
+ contract-validator validate
88
+
89
+ # 4. Commit and push - you're protected! 🛡️
90
+ ```
91
+
92
+ ### **Basic Usage**
93
+ ```bash
94
+ # Validate local DBT project against FastAPI models
95
+ contract-validator validate \
96
+ --dbt-project ./my-dbt-project \
97
+ --fastapi-local ./my-api/models.py
98
+
99
+ # Validate across repositories (microservices)
100
+ contract-validator validate \
101
+ --dbt-project . \
102
+ --fastapi-repo "my-org/my-api-repo" \
103
+ --fastapi-path "app/models.py"
104
+ ```
105
+
106
+ ## 🔍 **Real Example: Production Validation**
107
+
108
+ **Actual output from a production analytics project:**
109
+
110
+ ```bash
111
+ $ contract-validator validate
112
+
113
+ 🔍 Starting contract validation...
114
+ 📊 Extracting source schemas...
115
+ ✅ Found 14 DBT models (user_analytics_summary: 54 columns)
116
+ 🎯 Extracting target schemas...
117
+ ✅ Found 3 FastAPI models
118
+ 🔍 Validating schema compatibility...
119
+
120
+ 🛡️ Results:
121
+ ✅ PASSED - 0 critical issues (no production breaks!)
122
+ ⚠️ 42 warnings (type mismatches to review)
123
+
124
+ Issues caught:
125
+ ⚠️ user_analytics_summary.age_years: source 'varchar' vs target 'integer'
126
+ ⚠️ user_analytics_summary.is_verified: source 'varchar' vs target 'boolean'
127
+ ⚠️ user_analytics_summary.user_created_at: source 'varchar' vs target 'timestamp'
128
+
129
+ 🎉 Your API contracts are protected!
130
+ ```
131
+
132
+ ## 🚨 **What It Prevents**
133
+
134
+ ### **Before Data Contract Validation:**
135
+ ```sql
136
+ -- Analytics team changes DBT model
137
+ select
138
+ user_id,
139
+ email,
140
+ -- total_orders, ❌ REMOVED this column
141
+ revenue
142
+ from users
143
+ ```
144
+
145
+ ```python
146
+ # API team's FastAPI model (unchanged)
147
+ class UserAnalytics(BaseModel):
148
+     user_id: str
149
+     email: str
150
+     total_orders: int  # ❌ Still expects this!
151
+     revenue: float
152
+ ```
153
+
154
+ **Result:** 💥 **Production API breaks**, angry customers, 2AM debugging
155
+
156
+ ### **After Data Contract Validation:**
157
+ ```bash
158
+ $ git push
159
+
160
+ ❌ VALIDATION FAILED
161
+ 💥 user_analytics.total_orders: FastAPI REQUIRES column but DBT removed it
162
+ 🔧 Fix: Add 'total_orders' back to DBT model or update FastAPI model
163
+
164
+ # Push blocked until fixed ✋
165
+ ```
166
+
167
+ **Result:** 🛡️ **Production protected**, issues caught in CI/CD
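
The before/after scenario above boils down to comparing the columns a DBT model produces with the fields an API model requires. The following is a minimal sketch of that idea, not the package's actual implementation: the `find_issues` function, the dictionary layouts, and the severity labels are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layouts for this example only:
# source: table -> {column: type}
# target: table -> {column: (type, required)}
SourceSchema = Dict[str, Dict[str, str]]
TargetSchema = Dict[str, Dict[str, Tuple[str, bool]]]


def find_issues(source: SourceSchema, target: TargetSchema) -> List[Tuple[str, str]]:
    """Return (severity, message) pairs for every mismatch found."""
    issues: List[Tuple[str, str]] = []
    for table, columns in target.items():
        if table not in source:
            issues.append(("critical", f"{table}: table missing from source"))
            continue
        for column, (expected_type, required) in columns.items():
            actual_type = source[table].get(column)
            if actual_type is None:
                # A missing required column breaks the API; optional ones only warn.
                severity = "critical" if required else "warning"
                issues.append((severity, f"{table}.{column}: missing from source"))
            elif actual_type != expected_type:
                issues.append(
                    ("warning",
                     f"{table}.{column}: source '{actual_type}' vs target '{expected_type}'")
                )
    return issues


# The scenario above: the DBT model dropped total_orders, the API still requires it.
dbt_schema = {"user_analytics": {"user_id": "varchar", "email": "varchar", "revenue": "float"}}
api_schema = {
    "user_analytics": {
        "user_id": ("varchar", True),
        "email": ("varchar", True),
        "total_orders": ("integer", True),
        "revenue": ("float", True),
    }
}

issues = find_issues(dbt_schema, api_schema)
for severity, message in issues:
    print(severity, message)
```

In this sketch a missing required column is the only condition treated as critical, which mirrors the default `fail_on` / `warn_on` split shown in the configuration section.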
168
+
169
+ ## 🛠️ **Pre-commit Integration**
170
+
171
+ ### **Automatic Setup (Recommended)**
172
+ ```bash
173
+ # Initialize with pre-commit support
174
+ contract-validator init --interactive
175
+ contract-validator setup-precommit --install-hooks
176
+
177
+ # Now every commit validates contracts automatically! 🛡️
178
+ ```
179
+
180
+ ### **Manual Setup**
181
+ If you prefer manual setup:
182
+
183
+ 1. **Install pre-commit:**
184
+ ```bash
185
+ pip install pre-commit
186
+ ```
187
+
188
+ 2. **Add to `.pre-commit-config.yaml`:**
189
+ ```yaml
190
+ repos:
191
+   - repo: https://github.com/OGsiji/data-contract-validator
192
+     rev: v1.0.0
193
+     hooks:
194
+       - id: contract-validation
195
+         name: Validate Data Contracts
196
+         files: '^(.*models.*\.(sql|py)|\.retl-validator\.yml|dbt_project\.yml)$'
197
+ ```
198
+
199
+ 3. **Install hooks:**
200
+ ```bash
201
+ pre-commit install
202
+ ```
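
To see which staged files would trigger the hook, the `files` pattern from step 2 can be tried out directly. pre-commit treats `files` as a Python regular expression applied to each path relative to the repository root, so a short script is enough. The helper name `triggers_hook` and the sample paths are made up for this sketch.

```python
import re

# Pattern copied verbatim from the `files` key in the manual setup above.
FILES_PATTERN = re.compile(
    r"^(.*models.*\.(sql|py)|\.retl-validator\.yml|dbt_project\.yml)$"
)


def triggers_hook(path: str) -> bool:
    """Return True if a repo-relative path matches the hook's `files` filter."""
    return FILES_PATTERN.search(path) is not None


# Sample paths (hypothetical) to show what the filter does and does not catch.
for path in [
    "models/user_analytics.sql",
    "app/models.py",
    ".retl-validator.yml",
    "dbt_project.yml",
    "models/schema.yml",
    "config/dbt_project.yml",
    "README.md",
]:
    print(f"{path}: {triggers_hook(path)}")
```

Note that, as written, the pattern only matches `dbt_project.yml` and `.retl-validator.yml` at the repository root, and it ignores YAML files under `models/`. Teams with a nested DBT project may want to loosen it.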
203
+
204
+ ### **How It Works**
205
+ ```bash
206
+ $ git add models/user_analytics.sql
207
+ $ git commit -m "update user analytics model"
208
+
209
+ # Pre-commit automatically runs:
210
+ 🔍 Validating Data Contracts...
211
+ ✅ Contract validation passed
212
+ [main abc1234] update user analytics model
213
+ ```
214
+
215
+ ### **On Validation Failure**
216
+ ```bash
217
+ $ git commit -m "remove important column"
218
+
219
+ 🔍 Validating Data Contracts...
220
+ ❌ CRITICAL: user_analytics.total_revenue missing
221
+ 💡 Fix the issue before committing
222
+
223
+ # Commit blocked until fixed! 🛡️
224
+ ```
225
+
226
+ ### **Skip Validation (Emergency Only)**
227
+ ```bash
228
+ # Only for emergencies!
229
+ git commit -m "emergency fix" --no-verify
230
+ ```
231
+
232
+ ### **Benefits of Pre-commit Integration**
233
+ - ✅ **Catches issues before they reach CI/CD**
234
+ - ✅ **Faster feedback loop** (seconds, not minutes)
235
+ - ✅ **No broken commits** in your git history
236
+ - ✅ **Team protection** - everyone gets validation
237
+ - ✅ **Zero configuration** after setup
238
+
239
+ ## 📦 **GitHub Actions Integration**
240
+
241
+ Add this to `.github/workflows/validate-contracts.yml`:
242
+
243
+ ```yaml
244
+ name: 🛡️ Data Contract Validation
245
+
246
+ on:
247
+   pull_request:
248
+     paths:
249
+       - 'models/**/*.sql'
250
+       - 'dbt_project.yml'
251
+       - '**/*models*.py'
252
+
253
+ jobs:
254
+   validate-contracts:
255
+     runs-on: ubuntu-latest
256
+     steps:
257
+       - uses: actions/checkout@v4
258
+       - uses: actions/setup-python@v4
259
+         with:
260
+           python-version: '3.9'
261
+
262
+       - name: Validate contracts
263
+         env:
264
+           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
265
+         run: |
266
+           pip install data-contract-validator
267
+           contract-validator validate
268
+ ```
269
+
270
+ **This workflow file is generated automatically when you run `contract-validator init`.**
271
+
272
+ ## 🔧 **Configuration**
273
+
274
+ ### **Auto-Generated Config (`.retl-validator.yml`)**
275
+ ```yaml
276
+ version: '1.0'
277
+ name: 'my-project-contracts'
278
+
279
+ source:
280
+   dbt:
281
+     project_path: '.'
282
+     auto_compile: true
283
+
284
+ target:
285
+   fastapi:
286
+     # For GitHub repos
287
+     type: "github"
288
+     repo: "my-org/my-api"
289
+     path: "app/models.py"
290
+
291
+     # For local files
292
+     # type: "local"
293
+     # path: "../my-api/models.py"
294
+
295
+ validation:
296
+   fail_on: ['missing_tables', 'missing_required_columns']
297
+   warn_on: ['type_mismatches', 'missing_optional_columns']
298
+ ```
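
One way to read the `validation` block above is as a lookup from issue kind to severity. The sketch below illustrates that reading. It is an assumption about the semantics, not the tool's documented behaviour: the `severity_for` and `should_fail` helpers are hypothetical, and treating unlisted kinds as `ignored` is a choice made only for this example.

```python
from typing import Dict, List

# Values mirror the `validation` block of the config shown above.
VALIDATION_CONFIG: Dict[str, List[str]] = {
    "fail_on": ["missing_tables", "missing_required_columns"],
    "warn_on": ["type_mismatches", "missing_optional_columns"],
}


def severity_for(issue_kind: str, config: Dict[str, List[str]]) -> str:
    """Map an issue kind to 'critical', 'warning', or 'ignored'."""
    if issue_kind in config.get("fail_on", []):
        return "critical"
    if issue_kind in config.get("warn_on", []):
        return "warning"
    return "ignored"  # assumption: kinds not listed anywhere are not reported


def should_fail(issue_kinds: List[str], config: Dict[str, List[str]]) -> bool:
    """A run fails as soon as any detected issue kind is listed under fail_on."""
    return any(severity_for(kind, config) == "critical" for kind in issue_kinds)


print(severity_for("type_mismatches", VALIDATION_CONFIG))
print(should_fail(["type_mismatches", "missing_required_columns"], VALIDATION_CONFIG))
```

Under this reading, moving `type_mismatches` from `warn_on` to `fail_on` would turn the 42 warnings in the production example into a failing run.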
299
+
300
+ ### **Command Line Options**
301
+ ```bash
302
+ contract-validator validate \
303
+ --dbt-project ./dbt-project \ # DBT project path
304
+ --fastapi-repo "org/repo" \ # GitHub repo
305
+ --fastapi-path "app/models.py" \ # Path to models
306
+ --github-token "$GITHUB_TOKEN" \ # For private repos
307
+ --output json # json, terminal, github
308
+ ```
309
+
310
+ ## 🚀 **Supported Frameworks**
311
+
312
+ ### **Data Sources ✅**
313
+ - **DBT** (all adapters: Snowflake, BigQuery, Redshift, etc.)
314
+
315
+ ### **API Frameworks ✅**
316
+ - **FastAPI** (Pydantic + SQLModel)
317
+
318
+ ### **Coming Soon 🔄**
319
+ - Django, Flask-SQLAlchemy
320
+ - Databricks, Airflow
321
+ - [Request other frameworks](https://github.com/OGsiji/data-contract-validator/issues)
322
+
323
+ ## 🎯 **Output Formats**
324
+
325
+ ### **Terminal (Default)**
326
+ ```bash
327
+ 🛡️ Data Contract Validation Results:
328
+ Status: ✅ PASSED
329
+ Critical: 0 | Warnings: 5
330
+
331
+ ⚠️ Warnings:
332
+ user_analytics.age: Type mismatch (varchar vs integer)
333
+ user_analytics.country: Type mismatch (integer vs varchar)
334
+
335
+ 🎉 Your API contracts are protected!
336
+ ```
337
+
338
+ ### **JSON (for CI/CD)**
339
+ ```json
340
+ {
341
+ "success": true,
342
+ "critical_issues": 0,
343
+ "warnings": 5,
344
+ "issues": [
345
+ {
346
+ "severity": "warning",
347
+ "table": "user_analytics",
348
+ "column": "age",
349
+ "message": "Type mismatch: source 'varchar' vs target 'integer'",
350
+ "suggested_fix": "Update target to expect 'varchar' or fix source type"
351
+ }
352
+ ]
353
+ }
354
+ ```
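
A CI step could consume this JSON report with nothing but the standard library. The sketch below assumes the report has exactly the fields shown in the sample above. The `exit_code_for` helper and the decision to fail on any critical issue are illustrative choices, not part of the tool.

```python
import json

# Sample report copied from the JSON example above.
REPORT_TEXT = """
{
  "success": true,
  "critical_issues": 0,
  "warnings": 5,
  "issues": [
    {
      "severity": "warning",
      "table": "user_analytics",
      "column": "age",
      "message": "Type mismatch: source 'varchar' vs target 'integer'",
      "suggested_fix": "Update target to expect 'varchar' or fix source type"
    }
  ]
}
"""


def exit_code_for(report: dict) -> int:
    """Return 1 if the report signals failure or any critical issue, else 0."""
    if not report["success"] or report["critical_issues"] > 0:
        return 1
    return 0


report = json.loads(REPORT_TEXT)
for issue in report["issues"]:
    print(f'{issue["severity"]}: {issue["table"]}.{issue["column"]}: {issue["message"]}')

code = exit_code_for(report)
print("exit code:", code)
```

In a real pipeline the text would come from the command's output rather than a string literal, and the script would end with `sys.exit(code)`.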
355
+
356
+ ### **GitHub Actions**
357
+ ```bash
358
+ ::warning::user_analytics.age: Type mismatch detected
359
+ ✅ Contract validation passed - no critical issues
360
+ ```
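
The `::warning::` line above is a GitHub Actions workflow command: any process that prints it to standard output creates an annotation on the run. A minimal sketch of producing such lines from validation issues follows. The function names and the mapping of `critical` to `::error::` are assumptions for this example, while the percent-escaping of `%`, carriage returns, and newlines follows the rules GitHub applies to workflow command data.

```python
def escape_data(text: str) -> str:
    """Escape characters that would otherwise break a workflow command."""
    return text.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def to_workflow_command(severity: str, table: str, column: str, message: str) -> str:
    """Format one issue as a GitHub Actions annotation line."""
    # Assumption: critical issues become errors, everything else a warning.
    level = "error" if severity == "critical" else "warning"
    return f"::{level}::{escape_data(f'{table}.{column}: {message}')}"


print(to_workflow_command("warning", "user_analytics", "age", "Type mismatch detected"))
print(to_workflow_command("critical", "user_analytics", "total_orders", "Missing column"))
```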
361
+
362
+ ## 🏗️ **Architecture**
363
+
364
+ ### **Simple Python API**
365
+ ```python
366
+ from data_contract_validator import ContractValidator, DBTExtractor, FastAPIExtractor
367
+
368
+ # Initialize extractors
369
+ dbt = DBTExtractor(project_path='./dbt-project')
370
+ fastapi = FastAPIExtractor.from_github_repo('my-org/my-api', 'app/models.py')
371
+
372
+ # Run validation
373
+ validator = ContractValidator(source=dbt, target=fastapi)
374
+ result = validator.validate()
375
+
376
+ if not result.success:
377
+     print(f"❌ {len(result.critical_issues)} critical issues found")
378
+     for issue in result.critical_issues:
379
+         print(f"💥 {issue.table}.{issue.column}: {issue.message}")
380
+ ```
381
+
382
+ ### **CLI Interface**
383
+ ```bash
384
+ # Interactive setup
385
+ contract-validator init --interactive
386
+
387
+ # Test configuration
388
+ contract-validator test
389
+
390
+ # Run validation
391
+ contract-validator validate
392
+
393
+ # Setup pre-commit hooks
394
+ contract-validator setup-precommit --install-hooks
395
+
396
+ # Multiple output formats
397
+ contract-validator validate --output json
398
+ ```
399
+
400
+ ## 🔄 **Development Workflow**
401
+
402
+ ### **With Pre-commit (Recommended)**
403
+ ```bash
404
+ # Team workflow with automated validation
405
+ git clone your-dbt-project
406
+ cd your-dbt-project
407
+
408
+ # One-time setup for new team members
409
+ contract-validator init --interactive
410
+ contract-validator setup-precommit --install-hooks
411
+
412
+ # Protected development workflow:
413
+ # 1. Make changes to DBT models
414
+ # 2. git add models/my_model.sql
415
+ # 3. git commit -m "update model" # ← Validation runs here automatically
416
+ # 4. If validation passes → commit succeeds
417
+ # 5. If validation fails → fix issues first
418
+ # 6. git push # ← CI/CD validation as backup
419
+ ```
420
+
421
+ ### **Manual Workflow**
422
+ ```bash
423
+ # Traditional workflow
424
+ # 1. Make changes
425
+ # 2. contract-validator validate # Manual validation
426
+ # 3. git commit
427
+ # 4. git push
428
+ ```
429
+
430
+ ## 🤝 **Contributing**
431
+
432
+ We welcome contributions! This tool is actively used in production.
433
+
434
+ ### **Development Setup**
435
+ ```bash
436
+ git clone https://github.com/OGsiji/data-contract-validator
437
+ cd data-contract-validator
438
+ pip install -e ".[dev]"
439
+ pytest
440
+ ```
441
+
442
+ ### **Adding New Extractors**
443
+ ```python
444
+ from data_contract_validator.extractors import BaseExtractor
445
+
446
+ class MyFrameworkExtractor(BaseExtractor):
447
+     def extract_schemas(self) -> Dict[str, Schema]:
448
+         # Your implementation
449
+         return schemas
450
+ ```
451
+
452
+ ### **Reporting Issues**
453
+ - 🐛 **Bugs**: [GitHub Issues](https://github.com/OGsiji/data-contract-validator/issues)
454
+ - 💡 **Features**: [GitHub Discussions](https://github.com/OGsiji/data-contract-validator/discussions)
455
+
456
+ ## 📚 **Documentation**
457
+
458
+ - **[Quick Start Guide](https://github.com/OGsiji/data-contract-validator#quick-start)** - Get running in 2 minutes
459
+ - **[Configuration Reference](https://github.com/OGsiji/data-contract-validator/blob/main/examples)** - All config options
460
+ - **[GitHub Actions Setup](https://github.com/OGsiji/data-contract-validator/blob/main/examples/.github_actions)** - CI/CD integration
461
+ - **[Examples](https://github.com/OGsiji/data-contract-validator/tree/main/examples)** - Real-world usage
462
+ - **[Pre-commit Integration](https://github.com/OGsiji/data-contract-validator#pre-commit-integration)** - Automated validation
463
+
464
+ ## 🎉 **Real-World Usage**
465
+
466
+ This tool is actively preventing production incidents in:
467
+ - **Analytics pipelines** with 50+ DBT models
468
+ - **Microservices architectures** with multiple APIs
469
+ - **Data engineering teams** using Snowflake, BigQuery, Redshift
470
+ - **Cross-repository validation** in large organizations
471
+
472
+ **Proven to catch:**
473
+ - ✅ **Type mismatches** (varchar vs integer)
474
+ - ✅ **Missing columns** (API expects columns DBT doesn't provide)
475
+ - ✅ **Schema drift** (gradual model changes)
476
+ - ✅ **Breaking changes** before they reach production
477
+
478
+ ## 🛡️ **Multiple Layers of Protection**
479
+
480
+ 1. **Pre-commit hooks**: Immediate feedback (fastest)
481
+ 2. **CI/CD validation**: Team protection (backup)
482
+ 3. **Manual validation**: Development testing
483
+ 4. **Configuration files**: Team standards
484
+
485
+ This creates a comprehensive safety net for your data contracts.
486
+
487
+ ## 📄 **License**
488
+
489
+ MIT License - see [LICENSE](https://github.com/OGsiji/data-contract-validator/blob/main/LICENSE) for details.
490
+
491
+ ## 🆘 **Support**
492
+
493
+ - 🐛 **Issues**: [GitHub Issues](https://github.com/OGsiji/data-contract-validator/issues)
494
+ - 💬 **Discussions**: [GitHub Discussions](https://github.com/OGsiji/data-contract-validator/discussions)
495
+ - 📧 **Email**: ogunniransiji@gmail.com
496
+
497
+ ## ⭐ **Star the Project**
498
+
499
+ If this tool helps you prevent production incidents, please ⭐ star the repository!
500
+
501
+ ---
502
+
503
+ **🛡️ Built by data engineers, for data engineers. Stop breaking production with data changes!**
504
+
505
+ ## 🚀 **Get Started Now**
506
+
507
+ ```bash
508
+ pip install data-contract-validator
509
+ contract-validator init --interactive
510
+ contract-validator setup-precommit --install-hooks
511
+ # 2 minutes to production protection with automated validation!
512
+ ```