data-contract-validator 1.1.0__tar.gz → 1.1.1__tar.gz

Files changed (27)
  1. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/CHANGELOG.md +12 -1
  2. {data_contract_validator-1.1.0/data_contract_validator.egg-info → data_contract_validator-1.1.1}/PKG-INFO +8 -5
  3. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/README.md +7 -4
  4. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/__init__.py +1 -1
  5. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/core/types.py +57 -1
  6. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/core/validator.py +14 -8
  7. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1/data_contract_validator.egg-info}/PKG-INFO +8 -5
  8. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/pyproject.toml +1 -1
  9. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/LICENSE +0 -0
  10. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/MANIFEST.in +0 -0
  11. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/cli.py +0 -0
  12. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/core/__init__.py +0 -0
  13. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/core/models.py +0 -0
  14. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/extractors/__init__.py +0 -0
  15. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/extractors/base.py +0 -0
  16. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/extractors/dbt.py +0 -0
  17. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/extractors/fastapi.py +0 -0
  18. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/integrations/__init__.py +0 -0
  19. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/py.typed +0 -0
  20. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/templates/github-actions-template.yml +0 -0
  21. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator.egg-info/SOURCES.txt +0 -0
  22. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator.egg-info/dependency_links.txt +0 -0
  23. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator.egg-info/entry_points.txt +0 -0
  24. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator.egg-info/requires.txt +0 -0
  25. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator.egg-info/top_level.txt +0 -0
  26. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/requirements.txt +0 -0
  27. {data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/setup.cfg +0 -0
{data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/CHANGELOG.md
@@ -7,6 +7,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  ## [Unreleased]
 
+ ## [1.1.1] - 2026-06-30
+
+ ### Added
+ - **Automatic plural/singular table & column matching.** dbt models are
+   conventionally plural (`users`) while Pydantic classes are singular
+   (`User` → `user`); these now match automatically with no `mapping` needed.
+   Candidate forms are only matched against names that actually exist on the
+   other side, so it never over-strips (`address` is never mistaken for
+   `addres`). Explicit `mapping` still takes precedence.
+
  ## [1.1.0] - 2026-06-30
 
  This release is focused on **accuracy** — making a red check always mean a real
@@ -115,7 +125,8 @@ deploy.
  - Limited type inference from SQL
  - No support for complex nested types
 
- [Unreleased]: https://github.com/OGsiji/data-contract-validator/compare/v1.1.0...HEAD
+ [Unreleased]: https://github.com/OGsiji/data-contract-validator/compare/v1.1.1...HEAD
+ [1.1.1]: https://github.com/OGsiji/data-contract-validator/releases/tag/v1.1.1
  [1.1.0]: https://github.com/OGsiji/data-contract-validator/releases/tag/v1.1.0
  [1.0.5]: https://github.com/OGsiji/data-contract-validator/releases/tag/v1.0.5
  [1.0.0]: https://github.com/OGsiji/data-contract-validator/releases/tag/v1.0.0
{data_contract_validator-1.1.0/data_contract_validator.egg-info → data_contract_validator-1.1.1}/PKG-INFO
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: data-contract-validator
- Version: 1.1.0
+ Version: 1.1.1
  Summary: Validate data contracts between dbt models and FastAPI/Pydantic APIs with accurate, low-false-positive schema checks
  Author-email: Ogunniran Siji <ogunniransiji@gmail.com>
  Maintainer-email: Ogunniran Siji <ogunniransiji@gmail.com>
@@ -201,10 +201,13 @@ validation:
 
  ### When do I need `mapping`?
 
- By default, names are matched across `snake_case` / `camelCase` / casing
- (`UserAnalytics` → `user_analytics`, `userId` → `user_id`). Reach for `mapping`
- only when a model or column is named so differently that the convention can't
- bridge it (e.g. Pydantic `user_id` dbt `customer_identifier`).
+ Most of the time you don't. Names are matched automatically across:
+ - `snake_case` / `camelCase` / casing — `UserAnalytics` → `user_analytics`, `userId` → `user_id`
+ - **plural ↔ singular**: dbt's plural `users` matches Pydantic's `User` (→ `user`)
+   with no config (and it won't over-match — `address` is never confused with `addres`).
+
+ Reach for `mapping` only when a model or column is named so differently that
+ convention can't bridge it (e.g. Pydantic `user_id` ↔ dbt `customer_identifier`).
 
  ## 🐍 Python API
 
{data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/README.md
@@ -151,10 +151,13 @@ validation:
 
  ### When do I need `mapping`?
 
- By default, names are matched across `snake_case` / `camelCase` / casing
- (`UserAnalytics` → `user_analytics`, `userId` → `user_id`). Reach for `mapping`
- only when a model or column is named so differently that the convention can't
- bridge it (e.g. Pydantic `user_id` dbt `customer_identifier`).
+ Most of the time you don't. Names are matched automatically across:
+ - `snake_case` / `camelCase` / casing — `UserAnalytics` → `user_analytics`, `userId` → `user_id`
+ - **plural ↔ singular**: dbt's plural `users` matches Pydantic's `User` (→ `user`)
+   with no config (and it won't over-match — `address` is never confused with `addres`).
+
+ Reach for `mapping` only when a model or column is named so differently that
+ convention can't bridge it (e.g. Pydantic `user_id` ↔ dbt `customer_identifier`).
 
  ## 🐍 Python API
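Reviewer note: the casing rule in the README text can be illustrated with the two substitutions that appear as context at the end of `normalize_name` in the `core/types.py` hunk. This is a minimal sketch, not the package's function: `to_snake` is a hypothetical name, and any steps that `normalize_name` performs before those two lines are not visible in this diff and are not reproduced here.

```python
import re


def to_snake(name: str) -> str:
    """Hypothetical helper: applies only the two regex steps visible in the diff."""
    text = name.strip()
    # Insert "_" before a capitalized word: "UserAnalytics" -> "User_Analytics".
    text = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", text)
    # Insert "_" between a lowercase letter or digit and a capital letter.
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", text)
    return text.lower().strip()


for original in ("UserAnalytics", "userId", "user_id"):
    print(original, "->", to_snake(original))
# UserAnalytics -> user_analytics
# userId -> user_id
# user_id -> user_id
```

Names that are already snake_case pass through unchanged, which is why a dbt column `user_id` and a Pydantic field `userId` meet at the same key.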
 
{data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/__init__.py
@@ -5,7 +5,7 @@ Prevent production API breaks by validating data contracts between
  your data pipelines and API frameworks.
  """
 
- __version__ = "1.1.0"
+ __version__ = "1.1.1"
  __author__ = "Ogunniran Siji"
  __email__ = "ogunniransiji@gmail.com"
 
{data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/core/types.py
@@ -14,7 +14,7 @@ the tool stays quiet rather than crying wolf.
 
  from enum import Enum
  import re
- from typing import Optional
+ from typing import Any, Dict, List, Optional
 
 
  class CanonicalType(Enum):
@@ -289,3 +289,59 @@ def normalize_name(name: Optional[str]) -> str:
      text = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", text)
      text = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", text)
      return text.lower().strip()
+
+
+ def name_variants(name: Optional[str]) -> List[str]:
+     """Return candidate forms of a name for plural/singular-insensitive matching.
+
+     dbt models are conventionally plural (``users``) while Pydantic classes are
+     singular (``User`` -> ``user``); this bridges that gap automatically.
+
+     The normalized form is always first (so exact matches win). The remaining
+     plural/singular candidates are deliberately over-generated -- callers should
+     only treat a candidate as a match if it equals a name that *actually exists*
+     on the other side, which makes spurious forms (e.g. ``statu`` from
+     ``status``) harmless rather than dangerous.
+     """
+     n = normalize_name(name)
+     variants: List[str] = [n] if n else []
+
+     def add(value: str) -> None:
+         if value and value not in variants:
+             variants.append(value)
+
+     if not n:
+         return variants
+
+     # Pluralize.
+     if n.endswith("y") and len(n) > 1 and n[-2] not in "aeiou":
+         add(n[:-1] + "ies")  # category -> categories
+     if n.endswith(("s", "x", "z", "ch", "sh")):
+         add(n + "es")  # address -> addresses, box -> boxes
+     add(n + "s")  # user -> users
+
+     # Singularize.
+     if n.endswith("ies") and len(n) > 4:
+         add(n[:-3] + "y")  # categories -> category
+     if n.endswith("es") and len(n) > 3:
+         add(n[:-2])  # addresses -> address, boxes -> box
+     if n.endswith("s") and not n.endswith("ss") and len(n) > 2:
+         add(n[:-1])  # users -> user (but never address -> addres)
+
+     return variants
+
+
+ def find_match(name: str, index: Dict[str, Any]) -> Any:
+     """Look up ``name`` in an index keyed by normalized names.
+
+     Prefers an exact normalized match, then falls back to a plural/singular
+     variant that actually exists in the index. Returns the matched value or
+     ``None``.
+     """
+     n = normalize_name(name)
+     if n in index:
+         return index[n]
+     for variant in name_variants(name):
+         if variant in index:
+             return index[variant]
+     return None
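Reviewer note: the sketch below shows what the two added functions produce for a few inputs. The plural/singular rules are copied from the hunk above; `normalize_name` is replaced by a lowercase-only stand-in (an assumption, since the full function is not shown in this diff), and the `source_by_norm` contents are invented for illustration.

```python
from typing import Any, Dict, List, Optional


def normalize_name(name: Optional[str]) -> str:
    # Stand-in (assumption): lowercasing only. The packaged function also
    # converts camelCase to snake_case.
    return (name or "").strip().lower()


def name_variants(name: Optional[str]) -> List[str]:
    # Same pluralize/singularize rules as the added function in the hunk.
    n = normalize_name(name)
    variants: List[str] = [n] if n else []

    def add(value: str) -> None:
        if value and value not in variants:
            variants.append(value)

    if not n:
        return variants

    # Pluralize.
    if n.endswith("y") and len(n) > 1 and n[-2] not in "aeiou":
        add(n[:-1] + "ies")
    if n.endswith(("s", "x", "z", "ch", "sh")):
        add(n + "es")
    add(n + "s")

    # Singularize.
    if n.endswith("ies") and len(n) > 4:
        add(n[:-3] + "y")
    if n.endswith("es") and len(n) > 3:
        add(n[:-2])
    if n.endswith("s") and not n.endswith("ss") and len(n) > 2:
        add(n[:-1])

    return variants


def find_match(name: str, index: Dict[str, Any]) -> Any:
    # The normalized form is the first variant, so an exact match still wins.
    for variant in name_variants(name):
        if variant in index:
            return index[variant]
    return None


# Hypothetical index of dbt models keyed by normalized name.
source_by_norm = {"users": "dbt:users", "addresses": "dbt:addresses"}

print(name_variants("status"))                  # ['status', 'statuses', 'statuss', 'statu']
print(find_match("User", source_by_norm))       # dbt:users
print(find_match("Address", source_by_norm))    # dbt:addresses
print(find_match("Address", {"addres": "x"}))   # None
print(find_match("user_id", {"customer_identifier": "x"}))  # None
```

The spurious candidate `statu` only matters if a name `statu` exists on the other side, which is the safeguard the docstring describes. The last line is the case the README reserves for an explicit `mapping`.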
{data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/data_contract_validator/core/validator.py
@@ -4,7 +4,13 @@ Core validation logic for comparing schemas.
 
  from typing import Dict, List, Optional, Any
  from .models import ValidationResult, ValidationIssue, IssueSeverity, Schema
- from .types import CanonicalType, normalize_name, normalize_sql_type, types_compatible
+ from .types import (
+     CanonicalType,
+     find_match,
+     normalize_name,
+     normalize_sql_type,
+     types_compatible,
+ )
  from ..extractors.base import BaseExtractor
 
 
@@ -107,11 +113,11 @@ class ContractValidator:
          """Validate a single table."""
          print(f" 🔍 Validating table: {table_name}")
 
-         # Resolve the source table: explicit mapping first, else normalized name.
+         # Resolve the source table: explicit mapping first, then exact normalized
+         # name, then a plural/singular variant (users <-> user).
          target_norm = normalize_name(table_name)
          mapped_source = self.table_map.get(target_norm)
-         lookup_norm = normalize_name(mapped_source) if mapped_source else target_norm
-         source_schema = source_by_norm.get(lookup_norm)
+         source_schema = find_match(mapped_source or table_name, source_by_norm)
          if not source_schema:
              hint = f" (mapped to source '{mapped_source}')" if mapped_source else ""
              self.issues.append(
@@ -152,11 +158,12 @@ class ContractValidator:
          check_types = source_schema.confidence != "low"
 
          for col_norm, col_info in target_columns.items():
-             # Apply an explicit column mapping for this target column, if any.
+             # Apply an explicit column mapping for this target column, if any,
+             # then match by exact name, then by plural/singular variant.
              override = col_overrides.get(col_norm)
-             source_key = normalize_name(override) if override else col_norm
+             source_col = find_match(override or col_info["name"], source_columns)
 
-             if source_key not in source_columns:
+             if source_col is None:
                  is_required = col_info.get("required", True)
                  if is_required and source_complete:
                      severity = IssueSeverity.CRITICAL
@@ -188,7 +195,6 @@ class ContractValidator:
                      )
                  )
              elif check_types:
-                 source_col = source_columns[source_key]
                  if not self._columns_type_compatible(source_col, col_info):
                      self.issues.append(
                          ValidationIssue(
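Reviewer note: the lookup order introduced in this file (explicit mapping, then exact name, then plural/singular variant) can be sketched in isolation. Everything here is hypothetical: `resolve_source` and `simple_variants` are illustrative names, `simple_variants` is a deliberately reduced stand-in for `name_variants` (lowercase, add or drop a trailing `s`), and the dictionaries are made-up data.

```python
from typing import Any, Dict, List


def simple_variants(name: str) -> List[str]:
    # Reduced stand-in for name_variants: exact form first, then +s / -s.
    n = name.strip().lower()
    candidates = [n, n + "s"]
    if n.endswith("s") and not n.endswith("ss") and len(n) > 2:
        candidates.append(n[:-1])
    return candidates


def resolve_source(
    table_name: str,
    table_map: Dict[str, str],
    source_by_norm: Dict[str, Any],
) -> Any:
    # 1. An explicit mapping, when present, replaces the name being looked up.
    mapped = table_map.get(table_name.strip().lower())
    # 2. Exact match, then 3. plural/singular variants that exist in the index.
    for candidate in simple_variants(mapped or table_name):
        if candidate in source_by_norm:
            return source_by_norm[candidate]
    return None


sources = {"users": "dbt:users", "customers": "dbt:customers"}

print(resolve_source("User", {}, sources))                    # dbt:users
print(resolve_source("User", {"user": "customer"}, sources))  # dbt:customers
print(resolve_source("User", {"user": "accounts"}, sources))  # None
print(resolve_source("Order", {}, sources))                   # None
```

Two consequences of passing `mapped_source or table_name` into a single lookup show up in the second and third calls: a mapped name also benefits from the plural/singular fallback, and a mapping that points at a missing source does not fall back to the unmapped table name.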
{data_contract_validator-1.1.0 → data_contract_validator-1.1.1/data_contract_validator.egg-info}/PKG-INFO
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: data-contract-validator
- Version: 1.1.0
+ Version: 1.1.1
  Summary: Validate data contracts between dbt models and FastAPI/Pydantic APIs with accurate, low-false-positive schema checks
  Author-email: Ogunniran Siji <ogunniransiji@gmail.com>
  Maintainer-email: Ogunniran Siji <ogunniransiji@gmail.com>
@@ -201,10 +201,13 @@ validation:
 
  ### When do I need `mapping`?
 
- By default, names are matched across `snake_case` / `camelCase` / casing
- (`UserAnalytics` → `user_analytics`, `userId` → `user_id`). Reach for `mapping`
- only when a model or column is named so differently that the convention can't
- bridge it (e.g. Pydantic `user_id` dbt `customer_identifier`).
+ Most of the time you don't. Names are matched automatically across:
+ - `snake_case` / `camelCase` / casing — `UserAnalytics` → `user_analytics`, `userId` → `user_id`
+ - **plural ↔ singular**: dbt's plural `users` matches Pydantic's `User` (→ `user`)
+   with no config (and it won't over-match — `address` is never confused with `addres`).
+
+ Reach for `mapping` only when a model or column is named so differently that
+ convention can't bridge it (e.g. Pydantic `user_id` ↔ dbt `customer_identifier`).
 
  ## 🐍 Python API
 
{data_contract_validator-1.1.0 → data_contract_validator-1.1.1}/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
  [project]
  name = "data-contract-validator"
- version = "1.1.0"
+ version = "1.1.1"
  description = "Validate data contracts between dbt models and FastAPI/Pydantic APIs with accurate, low-false-positive schema checks"
  readme = "README.md"
  license = {text = "MIT"}