datacontract-cli 0.10.23__py3-none-any.whl → 0.10.25__py3-none-any.whl

This diff shows the changes between publicly released package versions as they appear in their respective registries. It is provided for informational purposes only.

Potentially problematic release: this version of datacontract-cli might be problematic.

Files changed (43)
  1. datacontract/__init__.py +13 -0
  2. datacontract/api.py +3 -3
  3. datacontract/catalog/catalog.py +2 -2
  4. datacontract/cli.py +1 -1
  5. datacontract/data_contract.py +5 -3
  6. datacontract/engines/data_contract_test.py +13 -4
  7. datacontract/engines/fastjsonschema/s3/s3_read_files.py +3 -2
  8. datacontract/engines/soda/check_soda_execute.py +16 -3
  9. datacontract/engines/soda/connections/duckdb_connection.py +61 -5
  10. datacontract/engines/soda/connections/kafka.py +3 -2
  11. datacontract/export/avro_converter.py +8 -1
  12. datacontract/export/bigquery_converter.py +1 -1
  13. datacontract/export/duckdb_type_converter.py +57 -0
  14. datacontract/export/great_expectations_converter.py +49 -2
  15. datacontract/export/odcs_v3_exporter.py +162 -136
  16. datacontract/export/protobuf_converter.py +163 -69
  17. datacontract/export/spark_converter.py +1 -1
  18. datacontract/imports/avro_importer.py +30 -5
  19. datacontract/imports/csv_importer.py +111 -57
  20. datacontract/imports/excel_importer.py +850 -0
  21. datacontract/imports/importer.py +5 -2
  22. datacontract/imports/importer_factory.py +10 -0
  23. datacontract/imports/odcs_v3_importer.py +226 -127
  24. datacontract/imports/protobuf_importer.py +264 -0
  25. datacontract/lint/linters/description_linter.py +1 -3
  26. datacontract/lint/linters/field_reference_linter.py +1 -2
  27. datacontract/lint/linters/notice_period_linter.py +2 -2
  28. datacontract/lint/linters/valid_constraints_linter.py +3 -3
  29. datacontract/lint/resolve.py +23 -8
  30. datacontract/model/data_contract_specification/__init__.py +1 -0
  31. datacontract/model/run.py +3 -0
  32. datacontract/output/__init__.py +0 -0
  33. datacontract/templates/datacontract.html +2 -1
  34. datacontract/templates/index.html +2 -1
  35. {datacontract_cli-0.10.23.dist-info → datacontract_cli-0.10.25.dist-info}/METADATA +305 -195
  36. {datacontract_cli-0.10.23.dist-info → datacontract_cli-0.10.25.dist-info}/RECORD +40 -38
  37. {datacontract_cli-0.10.23.dist-info → datacontract_cli-0.10.25.dist-info}/WHEEL +1 -1
  38. datacontract/export/csv_type_converter.py +0 -36
  39. datacontract/lint/linters/quality_schema_linter.py +0 -52
  40. datacontract/model/data_contract_specification.py +0 -327
  41. {datacontract_cli-0.10.23.dist-info → datacontract_cli-0.10.25.dist-info}/entry_points.txt +0 -0
  42. {datacontract_cli-0.10.23.dist-info → datacontract_cli-0.10.25.dist-info/licenses}/LICENSE +0 -0
  43. {datacontract_cli-0.10.23.dist-info → datacontract_cli-0.10.25.dist-info}/top_level.txt +0 -0
@@ -1,12 +1,12 @@
- Metadata-Version: 2.2
+ Metadata-Version: 2.4
  Name: datacontract-cli
- Version: 0.10.23
+ Version: 0.10.25
  Summary: The datacontract CLI is an open source command-line tool for working with Data Contracts. It uses data contract YAML files to lint the data contract, connect to data sources and execute schema and quality tests, detect breaking changes, and export to different formats. The tool is written in Python. It can be used as a standalone CLI tool, in a CI/CD pipeline, or directly as a Python library.
  Author-email: Jochen Christ <jochen.christ@innoq.com>, Stefan Negele <stefan.negele@innoq.com>, Simon Harrer <simon.harrer@innoq.com>
+ License-Expression: MIT
  Project-URL: Homepage, https://cli.datacontract.com
  Project-URL: Issues, https://github.com/datacontract/datacontract-cli/issues
  Classifier: Programming Language :: Python :: 3
- Classifier: License :: OSI Approved :: MIT License
  Classifier: Operating System :: OS Independent
  Requires-Python: >=3.10
  Description-Content-Type: text/markdown
@@ -18,45 +18,48 @@ Requires-Dist: requests<2.33,>=2.31
  Requires-Dist: fastjsonschema<2.22.0,>=2.19.1
  Requires-Dist: fastparquet<2025.0.0,>=2024.5.0
  Requires-Dist: numpy<2.0.0,>=1.26.4
- Requires-Dist: python-multipart==0.0.20
- Requires-Dist: rich<13.10,>=13.7
+ Requires-Dist: python-multipart<1.0.0,>=0.0.20
+ Requires-Dist: rich<14.0,>=13.7
  Requires-Dist: sqlglot<27.0.0,>=26.6.0
  Requires-Dist: duckdb<2.0.0,>=1.0.0
- Requires-Dist: soda-core-duckdb<3.5.0,>=3.3.20
+ Requires-Dist: soda-core-duckdb<3.6.0,>=3.3.20
  Requires-Dist: setuptools>=60
- Requires-Dist: python-dotenv~=1.0.0
- Requires-Dist: boto3<1.36.12,>=1.34.41
- Requires-Dist: Jinja2>=3.1.5
- Requires-Dist: jinja_partials>=0.2.1
+ Requires-Dist: python-dotenv<2.0.0,>=1.0.0
+ Requires-Dist: boto3<2.0.0,>=1.34.41
+ Requires-Dist: Jinja2<4.0.0,>=3.1.5
+ Requires-Dist: jinja_partials<1.0.0,>=0.2.1
+ Requires-Dist: datacontract-specification<2.0.0,>=1.1.1
+ Requires-Dist: open-data-contract-standard<4.0.0,>=3.0.4
  Provides-Extra: avro
  Requires-Dist: avro==1.12.0; extra == "avro"
  Provides-Extra: bigquery
- Requires-Dist: soda-core-bigquery<3.4.0,>=3.3.20; extra == "bigquery"
+ Requires-Dist: soda-core-bigquery<3.6.0,>=3.3.20; extra == "bigquery"
  Provides-Extra: csv
- Requires-Dist: clevercsv>=0.8.2; extra == "csv"
  Requires-Dist: pandas>=2.0.0; extra == "csv"
+ Provides-Extra: excel
+ Requires-Dist: openpyxl<4.0.0,>=3.1.5; extra == "excel"
  Provides-Extra: databricks
- Requires-Dist: soda-core-spark-df<3.4.0,>=3.3.20; extra == "databricks"
- Requires-Dist: soda-core-spark[databricks]<3.4.0,>=3.3.20; extra == "databricks"
- Requires-Dist: databricks-sql-connector<3.8.0,>=3.7.0; extra == "databricks"
- Requires-Dist: databricks-sdk<0.45.0; extra == "databricks"
+ Requires-Dist: soda-core-spark-df<3.6.0,>=3.3.20; extra == "databricks"
+ Requires-Dist: soda-core-spark[databricks]<3.6.0,>=3.3.20; extra == "databricks"
+ Requires-Dist: databricks-sql-connector<4.1.0,>=3.7.0; extra == "databricks"
+ Requires-Dist: databricks-sdk<0.51.0; extra == "databricks"
  Provides-Extra: iceberg
  Requires-Dist: pyiceberg==0.8.1; extra == "iceberg"
  Provides-Extra: kafka
  Requires-Dist: datacontract-cli[avro]; extra == "kafka"
- Requires-Dist: soda-core-spark-df<3.4.0,>=3.3.20; extra == "kafka"
+ Requires-Dist: soda-core-spark-df<3.6.0,>=3.3.20; extra == "kafka"
  Provides-Extra: postgres
- Requires-Dist: soda-core-postgres<3.4.0,>=3.3.20; extra == "postgres"
+ Requires-Dist: soda-core-postgres<3.6.0,>=3.3.20; extra == "postgres"
  Provides-Extra: s3
  Requires-Dist: s3fs==2025.2.0; extra == "s3"
  Requires-Dist: aiobotocore<2.20.0,>=2.17.0; extra == "s3"
  Provides-Extra: snowflake
- Requires-Dist: snowflake-connector-python[pandas]<3.14,>=3.6; extra == "snowflake"
- Requires-Dist: soda-core-snowflake<3.5.0,>=3.3.20; extra == "snowflake"
+ Requires-Dist: snowflake-connector-python[pandas]<3.15,>=3.6; extra == "snowflake"
+ Requires-Dist: soda-core-snowflake<3.6.0,>=3.3.20; extra == "snowflake"
  Provides-Extra: sqlserver
- Requires-Dist: soda-core-sqlserver<3.4.0,>=3.3.20; extra == "sqlserver"
+ Requires-Dist: soda-core-sqlserver<3.6.0,>=3.3.20; extra == "sqlserver"
  Provides-Extra: trino
- Requires-Dist: soda-core-trino<3.4.0,>=3.3.20; extra == "trino"
+ Requires-Dist: soda-core-trino<3.6.0,>=3.3.20; extra == "trino"
  Provides-Extra: dbt
  Requires-Dist: dbt-core>=1.8.0; extra == "dbt"
  Provides-Extra: dbml
@@ -66,23 +69,26 @@ Requires-Dist: pyarrow>=18.1.0; extra == "parquet"
  Provides-Extra: rdf
  Requires-Dist: rdflib==7.0.0; extra == "rdf"
  Provides-Extra: api
- Requires-Dist: fastapi==0.115.8; extra == "api"
+ Requires-Dist: fastapi==0.115.12; extra == "api"
  Requires-Dist: uvicorn==0.34.0; extra == "api"
+ Provides-Extra: protobuf
+ Requires-Dist: grpcio-tools>=1.53; extra == "protobuf"
  Provides-Extra: all
- Requires-Dist: datacontract-cli[api,bigquery,csv,databricks,dbml,dbt,iceberg,kafka,parquet,postgres,rdf,s3,snowflake,sqlserver,trino]; extra == "all"
+ Requires-Dist: datacontract-cli[api,bigquery,csv,databricks,dbml,dbt,excel,iceberg,kafka,parquet,postgres,protobuf,rdf,s3,snowflake,sqlserver,trino]; extra == "all"
  Provides-Extra: dev
  Requires-Dist: datacontract-cli[all]; extra == "dev"
  Requires-Dist: httpx==0.28.1; extra == "dev"
  Requires-Dist: kafka-python; extra == "dev"
- Requires-Dist: moto==5.0.27; extra == "dev"
+ Requires-Dist: moto==5.1.3; extra == "dev"
  Requires-Dist: pandas>=2.1.0; extra == "dev"
- Requires-Dist: pre-commit<4.1.0,>=3.7.1; extra == "dev"
+ Requires-Dist: pre-commit<4.3.0,>=3.7.1; extra == "dev"
  Requires-Dist: pytest; extra == "dev"
  Requires-Dist: pytest-xdist; extra == "dev"
  Requires-Dist: pymssql==2.3.2; extra == "dev"
  Requires-Dist: ruff; extra == "dev"
- Requires-Dist: testcontainers[kafka,minio,mssql,postgres]==4.9.0; extra == "dev"
- Requires-Dist: trino==0.332.0; extra == "dev"
+ Requires-Dist: testcontainers[kafka,minio,mssql,postgres]==4.9.2; extra == "dev"
+ Requires-Dist: trino==0.333.0; extra == "dev"
+ Dynamic: license-file

  # Data Contract CLI

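The metadata changes above add two new optional dependency groups alongside the version bumps. A quick sketch of how they are pulled in (extras names taken from the diff; the quoting is for shells that expand brackets):

```bash
# Install the extras introduced in 0.10.25
pip install 'datacontract-cli[excel]'     # Excel import, brings in openpyxl
pip install 'datacontract-cli[protobuf]'  # Protobuf support, brings in grpcio-tools

# The aggregate extra now includes both
pip install 'datacontract-cli[all]'
```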
@@ -317,7 +323,7 @@ A list of available extras:
  | Parquet | `pip install datacontract-cli[parquet]` |
  | RDF | `pip install datacontract-cli[rdf]` |
  | API (run as web server) | `pip install datacontract-cli[api]` |
-
+ | protobuf | `pip install datacontract-cli[protobuf]` |


  ## Documentation
@@ -341,17 +347,16 @@ Commands

  Usage: datacontract init [OPTIONS] [LOCATION]

- Download a datacontract.yaml template and write it to file.
+ Create an empty data contract.

  ╭─ Arguments ──────────────────────────────────────────────────────────────────╮
- │ location [LOCATION] The location (url or path) of the data contract
- yaml to create.
+ │ location [LOCATION] The location of the data contract file to
+ │ create.
  │ [default: datacontract.yaml] │
  ╰──────────────────────────────────────────────────────────────────────────────╯
  ╭─ Options ────────────────────────────────────────────────────────────────────╮
  │ --template TEXT URL of a template or data contract │
- │ [default:
- │ https://datacontract.com/datacontrac…
+ │ [default: None]
  │ --overwrite --no-overwrite Replace the existing │
  │ datacontract.yaml │
  │ [default: no-overwrite] │
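With the new `[default: None]` for `--template`, `init` no longer downloads a template by default; it creates an empty contract. A sketch of both invocations (the template path below is illustrative):

```bash
# 0.10.25 default: write an empty datacontract.yaml
datacontract init

# A template can still be passed explicitly as a URL or path
datacontract init my-datacontract.yaml --template ./my-template.yaml --overwrite
```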
@@ -373,52 +378,77 @@ Commands
  │ [default: datacontract.yaml] │
  ╰──────────────────────────────────────────────────────────────────────────────╯
  ╭─ Options ────────────────────────────────────────────────────────────────────╮
- │ --schema TEXT The location (url or path) of the Data Contract
- Specification JSON Schema
- [default:
- https://datacontract.com/datacontract.schema.json]
- --help Show this message and exit.
+ │ --schema TEXT The location (url or path) of the Data
+ Contract Specification JSON Schema
+ [default: None]
+ --output PATH Specify the file path where the test results
+ should be written to (e.g.,
+ │ './test-results/TEST-datacontract.xml'). If │
+ │ no path is provided, the output will be │
+ │ printed to stdout. │
+ │ [default: None] │
+ │ --output-format [junit] The target format for the test results. │
+ │ [default: None] │
+ │ --help Show this message and exit. │
  ╰──────────────────────────────────────────────────────────────────────────────╯

  ```

  ### test
  ```
-
- Usage: datacontract test [OPTIONS] [LOCATION]
-
- Run schema and quality tests on configured servers.
-
- ╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────╮
- │ location [LOCATION] The location (url or path) of the data contract yaml.
- [default: datacontract.yaml]
- ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
- ╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
- --schema TEXT The location (url or path) of the Data │
- Contract Specification JSON Schema
- [default: None]
- --server TEXT The server configuration to run the
- schema and quality tests. Use the key of
- the server object in the data contract
- yaml file to refer to a server, e.g.,
- `production`, or `all` for all servers
- (default).
- [default: all]
- --publish TEXT The url to publish the results after the
- test
- [default: None]
- --output PATH Specify the file path where the test
- results should be written to (e.g.,
- './test-results/TEST-datacontract.xml').
- [default: None]
- --output-format [junit] The target format for the test results.
- [default: None]
- │ --logs --no-logs Print logs [default: no-logs]
- --ssl-verification --no-ssl-verification SSL verification when publishing the
- data contract.
- [default: ssl-verification]
- │ --help Show this message and exit.
- ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
+
+ Usage: datacontract test [OPTIONS] [LOCATION]
+
+ Run schema and quality tests on configured servers.
+
+ ╭─ Arguments ──────────────────────────────────────────────────────────────────╮
+ │ location [LOCATION] The location (url or path) of the data contract
+ yaml.
+ │ [default: datacontract.yaml] │
+ ╰──────────────────────────────────────────────────────────────────────────────╯
+ ╭─ Options ────────────────────────────────────────────────────────────────────╮
+ --schema TEXT The location (url or
+ path) of the Data
+ Contract
+ Specification JSON
+ Schema
+ [default: None]
+ --server TEXT The server
+ configuration to run
+ the schema and
+ quality tests. Use
+ the key of the server
+ object in the data
+ contract yaml file to
+ refer to a server,
+ e.g., `production`,
+ or `all` for all
+ servers (default).
+ [default: all]
+ │ --publish TEXT The url to publish
+ the results after the
+ test
+ [default: None]
+ │ --output PATH Specify the file path
+ │ where the test │
+ │ results should be │
+ │ written to (e.g., │
+ │ './test-results/TEST… │
+ │ [default: None] │
+ │ --output-format [junit] The target format for │
+ │ the test results. │
+ │ [default: None] │
+ │ --logs --no-logs Print logs │
+ │ [default: no-logs] │
+ │ --ssl-verification --no-ssl-verificati… SSL verification when │
+ │ publishing the data │
+ │ contract. │
+ │ [default: │
+ │ ssl-verification] │
+ │ --help Show this message and │
+ │ exit. │
+ ╰──────────────────────────────────────────────────────────────────────────────╯
+

  ```

  Data Contract CLI connects to a data source and runs schema and quality tests to verify that the data contract is valid.
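Per the help text above, `lint` gains the same `--output`/`--output-format` options that `test` documents. A sketch of emitting JUnit XML for a CI pipeline (the result path is illustrative):

```bash
# Write lint results as JUnit XML instead of printing to stdout
datacontract lint datacontract.yaml \
  --output ./test-results/TEST-datacontract.xml \
  --output-format junit

# The same flags on test, scoped to a single configured server
datacontract test datacontract.yaml --server production \
  --output ./test-results/TEST-datacontract.xml \
  --output-format junit
```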
@@ -874,6 +904,29 @@ models:
  | `DATACONTRACT_TRINO_PASSWORD` | `mysecretpassword` | Password |


+ #### Local
+
+ Data Contract CLI can test local files in parquet, json, csv, or delta format.
+
+ ##### Example
+
+ datacontract.yaml
+ ```yaml
+ servers:
+   local:
+     type: local
+     path: ./*.parquet
+     format: parquet
+ models:
+   my_table_1: # corresponds to a table
+     type: table
+     fields:
+       my_column_1: # corresponds to a column
+         type: varchar
+       my_column_2: # corresponds to a column
+         type: string
+ ```
+
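Given the `local` server definition above, a test run against it would look like this (a sketch; assumes parquet files matching the glob sit next to the contract):

```bash
# Run schema and quality tests against the local parquet files
datacontract test datacontract.yaml --server local
```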

  ### export
  ```
@@ -889,41 +942,40 @@ models:
  │ [default: datacontract.yaml] │
  ╰──────────────────────────────────────────────────────────────────────────────╯
  ╭─ Options ────────────────────────────────────────────────────────────────────╮
- │ * --format [jsonschema|pydantic-model| The export format.
- sodacl|dbt|dbt-sources|dbt- [default: None]
- staging-sql|odcs| [required]
- rdf|avro|protobuf|gre
- at-expectations|terraform|a
- vro-idl|sql|sql-query|html|
- go|bigquery|dbml|spark|sqla
- lchemy|data-caterer|dcs|mar
- kdown|iceberg|custom]
- │ --output PATH Specify the file path where
- the exported data will be
- saved. If no path is
- provided, the output will be
- printed to stdout.
- [default: None]
- │ --server TEXT The server name to export.
- [default: None]
- │ --model TEXT Use the key of the model in
- the data contract yaml file
- to refer to a model, e.g.,
- `orders`, or `all` for all
- models (default).
- [default: all]
- │ --schema TEXT The location (url or path)
- of the Data Contract
- Specification JSON Schema
- [default:
- https://datacontract.com/da…
- --engine TEXT [engine] The engine used for
- great expection run.
- [default: None]
- --template PATH [custom] The file path of
- Jinja template.
- [default: None]
- │ --help Show this message and exit. │
+ │ * --format [jsonschema|pydantic-model The export format.
+ |sodacl|dbt|dbt-sources|db [default: None]
+ t-staging-sql|odcs|rdf|avr [required]
+ o|protobuf|great-expectati
+ ons|terraform|avro-idl|sql
+ |sql-query|html|go|bigquer
+ y|dbml|spark|sqlalchemy|da
+ ta-caterer|dcs|markdown|ic
+ eberg|custom]
+ │ --output PATH Specify the file path where
+ the exported data will be
+ saved. If no path is
+ provided, the output will
+ be printed to stdout.
+ [default: None]
+ │ --server TEXT The server name to export.
+ [default: None]
+ │ --model TEXT Use the key of the model in
+ the data contract yaml file
+ to refer to a model, e.g.,
+ `orders`, or `all` for all
+ models (default).
+ [default: all]
+ │ --schema TEXT The location (url or path)
+ of the Data Contract
+ Specification JSON Schema
+ [default: None]
+ --engine TEXT [engine] The engine used
+ for great expection run.
+ [default: None]
+ --template PATH [custom] The file path of
+ Jinja template.
+ [default: None]
+ --help Show this message and exit.
  ╰──────────────────────────────────────────────────────────────────────────────╯
  ╭─ RDF Options ────────────────────────────────────────────────────────────────╮
  │ --rdf-base TEXT [rdf] The base URI used to generate the RDF graph. │
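One export invocation as a sketch, using a format from the list above (the output file name is illustrative):

```bash
# Export the contract as ODCS v3 YAML; without --output the result goes to stdout
datacontract export datacontract.yaml --format odcs --output datacontract.odcs.yaml
```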
@@ -1232,71 +1284,104 @@ FROM

  ### import
  ```
- Usage: datacontract import [OPTIONS]
-
- Create a data contract from the given source location. Saves to file specified by `output` option if present,
- otherwise prints to stdout.
-
- ╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────╮
- * --format [sql|avro|dbt|dbml|glue|jsonschema|bi The format of the source file. │
- gquery|odcs|unity|spark|iceberg|parqu [default: None]
- et|csv] [required]
- --output PATH Specify the file path where the Data
- Contract will be saved. If no path is
- provided, the output will be printed
- to stdout.
- [default: None]
- --source TEXT The path to the file or Glue Database
- that should be imported.
- [default: None]
- --dialect TEXT The SQL dialect to use when importing
- SQL files, e.g., postgres, tsql,
- bigquery.
- [default: None]
- --glue-table TEXT List of table ids to import from the
- Glue Database (repeat for multiple
- table ids, leave empty for all tables
- in the dataset).
- [default: None]
- --bigquery-project TEXT The bigquery project id.
- [default: None]
- --bigquery-dataset TEXT The bigquery dataset id.
- [default: None]
- │ --bigquery-table TEXT List of table ids to import from the
- bigquery API (repeat for multiple
- table ids, leave empty for all tables
- in the dataset).
- [default: None]
- --unity-table-full-name TEXT Full name of a table in the unity
- catalog
- [default: None]
- │ --dbt-model TEXT List of models names to import from
- the dbt manifest file (repeat for
- multiple models names, leave empty
- for all models in the dataset).
- [default: None]
- --dbml-schema TEXT List of schema names to import from
- the DBML file (repeat for multiple
- schema names, leave empty for all
- tables in the file).
- [default: None]
- --dbml-table TEXT List of table names to import from
- the DBML file (repeat for multiple
- table names, leave empty for all
- tables in the file).
- [default: None]
- --iceberg-table TEXT Table name to assign to the model
- created from the Iceberg schema.
- [default: None]
- --template TEXT The location (url or path) of the
- Data Contract Specification Template
- [default: None]
- --schema TEXT The location (url or path) of the
- Data Contract Specification JSON
- Schema
- [default: None]
- │ --help Show this message and exit.
- ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+
+ Usage: datacontract import [OPTIONS]
+
+ Create a data contract from the given source location. Saves to file specified
+ by `output` option if present, otherwise prints to stdout.
+
+ ╭─ Options ────────────────────────────────────────────────────────────────────╮
+ * --format [sql|avro|dbt|dbml|gl The format of the
+ ue|jsonschema|bigquer source file.
+ y|odcs|unity|spark|ic [default: None]
+ eberg|parquet|csv|pro [required]
+ tobuf]
+ --output PATH Specify the file path
+ where the Data
+ Contract will be
+ saved. If no path is
+ provided, the output
+ will be printed to
+ stdout.
+ [default: None]
+ --source TEXT The path to the file
+ or Glue Database that
+ should be imported.
+ [default: None]
+ --dialect TEXT The SQL dialect to
+ use when importing
+ SQL files, e.g.,
+ postgres, tsql,
+ bigquery.
+ [default: None]
+ │ --glue-table TEXT List of table ids to │
+ import from the Glue
+ Database (repeat for
+ multiple table ids,
+ leave empty for all
+ tables in the
+ dataset).
+ [default: None]
+ │ --bigquery-project TEXT The bigquery project
+ id.
+ [default: None]
+ --bigquery-dataset TEXT The bigquery dataset
+ id.
+ [default: None]
+ --bigquery-table TEXT List of table ids to
+ import from the
+ bigquery API (repeat
+ for multiple table
+ ids, leave empty for
+ all tables in the
+ dataset).
+ [default: None]
+ --unity-table-full-n… TEXT Full name of a table
+ in the unity catalog
+ [default: None]
+ --dbt-model TEXT List of models names
+ to import from the
+ dbt manifest file
+ (repeat for multiple
+ models names, leave
+ empty for all models
+ in the dataset).
+ [default: None]
+ │ --dbml-schema TEXT List of schema names
+ │ to import from the │
+ │ DBML file (repeat for │
+ │ multiple schema │
+ │ names, leave empty │
+ │ for all tables in the │
+ │ file). │
+ │ [default: None] │
+ │ --dbml-table TEXT List of table names │
+ │ to import from the │
+ │ DBML file (repeat for │
+ │ multiple table names, │
+ │ leave empty for all │
+ │ tables in the file). │
+ │ [default: None] │
+ │ --iceberg-table TEXT Table name to assign │
+ │ to the model created │
+ │ from the Iceberg │
+ │ schema. │
+ │ [default: None] │
+ │ --template TEXT The location (url or │
+ │ path) of the Data │
+ │ Contract │
+ │ Specification │
+ │ Template │
+ │ [default: None] │
+ │ --schema TEXT The location (url or │
+ │ path) of the Data │
+ │ Contract │
+ │ Specification JSON │
+ │ Schema │
+ │ [default: None] │
+ │ --help Show this message and │
+ │ exit. │
+ ╰──────────────────────────────────────────────────────────────────────────────╯

  ```

@@ -1323,7 +1408,7 @@ Available import options:
  | `spark` | Import from Spark StructTypes | ✅ |
  | `dbml` | Import from DBML models | ✅ |
  | `csv` | Import from CSV File | ✅ |
- | `protobuf` | Import from Protobuf schemas | TBD |
+ | `protobuf` | Import from Protobuf schemas | ✅ |
  | `iceberg` | Import from an Iceberg JSON Schema Definition | partial |
  | `parquet` | Import from Parquet File Metadta | ✅ |
  | Missing something? | Please create an issue on GitHub | TBD |
@@ -1475,6 +1560,16 @@ Example:
  datacontract import --format csv --source "test.csv"
  ```

+ #### protobuf
+
+ Importing from protobuf File. Specify file in `source` parameter.
+
+ Example:
+
+ ```bash
+ datacontract import --format protobuf --source "test.proto"
+ ```
+
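The new importer composes with the `--output` option documented above; a sketch (file names illustrative):

```bash
# Import a protobuf schema and write the generated contract to a file
datacontract import --format protobuf --source "test.proto" --output datacontract.yaml
```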

  ### breaking
  ```
@@ -1550,7 +1645,7 @@ datacontract import --format csv --source "test.csv"

  Usage: datacontract catalog [OPTIONS]

- Create an html catalog of data contracts.
+ Create a html catalog of data contracts.

  ╭─ Options ────────────────────────────────────────────────────────────────────╮
  │ --files TEXT Glob pattern for the data contract files to include in │
@@ -1560,8 +1655,7 @@ datacontract import --format csv --source "test.csv"
  │ [default: catalog/] │
  │ --schema TEXT The location (url or path) of the Data Contract │
  │ Specification JSON Schema │
- │ [default:
- │ https://datacontract.com/datacontract.schema.json] │
+ │ [default: None]
  │ --help Show this message and exit. │
  ╰──────────────────────────────────────────────────────────────────────────────╯

@@ -1594,8 +1688,7 @@ datacontract catalog --files "*.odcs.yaml"
  │ path) of the Data │
  │ Contract Specification │
  │ JSON Schema │
- │ [default:
- │ https://datacontract.c… │
+ │ [default: None]
  │ --ssl-verification --no-ssl-verification SSL verification when │
  │ publishing the data │
  │ contract. │
@@ -1609,21 +1702,27 @@ datacontract catalog --files "*.odcs.yaml"

  ### api
  ```
-
- Usage: datacontract api [OPTIONS]
-
- Start the datacontract CLI as server application with REST API.
- The OpenAPI documentation as Swagger UI is available on http://localhost:4242. You can execute the commands directly from the Swagger UI.
- To protect the API, you can set the environment variable DATACONTRACT_CLI_API_KEY to a secret API key. To authenticate, requests must include the header 'x-api-key' with the
- correct API key. This is highly recommended, as data contract tests may be subject to SQL injections or leak sensitive information.
- To connect to servers (such as a Snowflake data source), set the credentials as environment variables as documented in https://cli.datacontract.com/#test
-
- ╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
- --port INTEGER Bind socket to this port. [default: 4242] │
- --host TEXT Bind socket to this host. Hint: For running in docker, set it to 0.0.0.0 [default: 127.0.0.1] │
- --help Show this message and exit.
- ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
-
+
+ Usage: datacontract api [OPTIONS]
+
+ Start the datacontract CLI as server application with REST API.
+ The OpenAPI documentation as Swagger UI is available on http://localhost:4242.
+ You can execute the commands directly from the Swagger UI.
+ To protect the API, you can set the environment variable
+ DATACONTRACT_CLI_API_KEY to a secret API key. To authenticate, requests must
+ include the header 'x-api-key' with the correct API key. This is highly
+ recommended, as data contract tests may be subject to SQL injections or leak
+ sensitive information.
+ To connect to servers (such as a Snowflake data source), set the credentials
+ as environment variables as documented in https://cli.datacontract.com/#test
+
+ ╭─ Options ────────────────────────────────────────────────────────────────────╮
+ │ --port INTEGER Bind socket to this port. [default: 4242] │
+ │ --host TEXT Bind socket to this host. Hint: For running in │
+ │ docker, set it to 0.0.0.0 │
+ │ [default: 127.0.0.1] │
+ │ --help Show this message and exit. │
+ ╰──────────────────────────────────────────────────────────────────────────────╯

  ```

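Putting the help text above together, a hardened run might look like this (a sketch; the key value is a placeholder):

```bash
# Protect the REST API with an API key, then bind to all interfaces for Docker
export DATACONTRACT_CLI_API_KEY='change-me'
datacontract api --host 0.0.0.0 --port 4242
```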
@@ -1917,6 +2016,16 @@ pre-commit run --all-files
  pytest
  ```

+ ### Use uv (recommended)
+
+ ```bash
+ # make sure uv is installed
+ uv python pin 3.11
+ uv pip install -e '.[dev]'
+ uv run ruff check
+ uv run pytest
+ ```
+

  ### Docker Build

@@ -1988,6 +2097,7 @@ We are happy to receive your contributions. Propose your change in an issue or d

  - [INNOQ](https://innoq.com)
  - [Data Catering](https://data.catering/)
+ - [Oliver Wyman](https://www.oliverwyman.com/)
  - And many more. To add your company, please create a pull request.

  ## Related Tools