datacontract-cli 0.10.26__py3-none-any.whl → 0.10.28__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (33)
  1. datacontract/catalog/catalog.py +1 -1
  2. datacontract/cli.py +20 -3
  3. datacontract/data_contract.py +125 -22
  4. datacontract/engines/data_contract_checks.py +2 -0
  5. datacontract/export/dbt_converter.py +6 -3
  6. datacontract/export/exporter.py +1 -0
  7. datacontract/export/exporter_factory.py +7 -1
  8. datacontract/export/{html_export.py → html_exporter.py} +31 -20
  9. datacontract/export/mermaid_exporter.py +97 -0
  10. datacontract/export/odcs_v3_exporter.py +8 -10
  11. datacontract/export/sodacl_converter.py +9 -1
  12. datacontract/export/sql_converter.py +2 -2
  13. datacontract/export/sql_type_converter.py +6 -2
  14. datacontract/imports/excel_importer.py +5 -2
  15. datacontract/imports/importer.py +10 -1
  16. datacontract/imports/odcs_importer.py +2 -2
  17. datacontract/imports/odcs_v3_importer.py +9 -9
  18. datacontract/imports/spark_importer.py +103 -12
  19. datacontract/imports/sql_importer.py +4 -2
  20. datacontract/imports/unity_importer.py +77 -37
  21. datacontract/integration/datamesh_manager.py +16 -2
  22. datacontract/lint/resolve.py +60 -6
  23. datacontract/templates/datacontract.html +52 -2
  24. datacontract/templates/datacontract_odcs.html +666 -0
  25. datacontract/templates/index.html +2 -0
  26. datacontract/templates/partials/server.html +2 -0
  27. datacontract/templates/style/output.css +319 -145
  28. {datacontract_cli-0.10.26.dist-info → datacontract_cli-0.10.28.dist-info}/METADATA +364 -381
  29. {datacontract_cli-0.10.26.dist-info → datacontract_cli-0.10.28.dist-info}/RECORD +33 -31
  30. {datacontract_cli-0.10.26.dist-info → datacontract_cli-0.10.28.dist-info}/WHEEL +1 -1
  31. {datacontract_cli-0.10.26.dist-info → datacontract_cli-0.10.28.dist-info}/entry_points.txt +0 -0
  32. {datacontract_cli-0.10.26.dist-info → datacontract_cli-0.10.28.dist-info}/licenses/LICENSE +0 -0
  33. {datacontract_cli-0.10.26.dist-info → datacontract_cli-0.10.28.dist-info}/top_level.txt +0 -0
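
The two largest code changes above are `datacontract/data_contract.py` and `datacontract/imports/spark_importer.py`; the Spark import examples quoted later in this diff show the corresponding `DataContract().import_from_source(...)` calls. As orientation, a minimal sketch of preparing a Spark session so that a table name passed as `source` is actually resolvable; the pyspark setup here is illustrative and not part of the package:

```python
# Illustrative sketch only: register a DataFrame as a temporary view so that the
# Spark importer (see the Spark import examples further down) can resolve "users".
from pyspark.sql import SparkSession
from datacontract.data_contract import DataContract

spark = SparkSession.builder.appName("datacontract-import").getOrCreate()
df_user = spark.createDataFrame(
    [("1", "Alice"), ("2", "Bob")],
    schema="user_id string, user_name string",
)
df_user.createOrReplaceTempView("users")  # "users" is now visible in the Spark context

# Same call pattern as the README examples quoted below; passing the DataFrame also works.
DataContract().import_from_source("spark", "users", dataframe=df_user)
```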
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: datacontract-cli
-Version: 0.10.26
+Version: 0.10.28
 Summary: The datacontract CLI is an open source command-line tool for working with Data Contracts. It uses data contract YAML files to lint the data contract, connect to data sources and execute schema and quality tests, detect breaking changes, and export to different formats. The tool is written in Python. It can be used as a standalone CLI tool, in a CI/CD pipeline, or directly as a Python library.
 Author-email: Jochen Christ <jochen.christ@innoq.com>, Stefan Negele <stefan.negele@innoq.com>, Simon Harrer <simon.harrer@innoq.com>
 License-Expression: MIT
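
Functionally, the 0.10.26 → 0.10.28 jump adds a Mermaid exporter and an Excel importer (see `mermaid_exporter.py` and the `--format` lists in the README diff below). A hedged sketch of driving an export from Python; the `export(export_format=...)` call mirrors how the README describes library usage in earlier releases and should be checked against the current API, since `data_contract.py` changed in this diff:

```python
# Hedged sketch: export a data contract from Python. The export() name and
# export_format argument are assumptions based on earlier releases of the library.
from datacontract.data_contract import DataContract

data_contract = DataContract(data_contract_file="datacontract.yaml")
diagram = data_contract.export(export_format="mermaid")  # also: "sql", "html", "odcs", ...
print(diagram)
```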
@@ -12,14 +12,14 @@ Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 License-File: LICENSE
 Requires-Dist: typer<0.16,>=0.15.1
-Requires-Dist: pydantic<2.11.0,>=2.8.2
+Requires-Dist: pydantic<2.12.0,>=2.8.2
 Requires-Dist: pyyaml~=6.0.1
 Requires-Dist: requests<2.33,>=2.31
 Requires-Dist: fastjsonschema<2.22.0,>=2.19.1
 Requires-Dist: fastparquet<2025.0.0,>=2024.5.0
 Requires-Dist: numpy<2.0.0,>=1.26.4
 Requires-Dist: python-multipart<1.0.0,>=0.0.20
-Requires-Dist: rich<14.0,>=13.7
+Requires-Dist: rich<15.0,>=13.7
 Requires-Dist: sqlglot<27.0.0,>=26.6.0
 Requires-Dist: duckdb<2.0.0,>=1.0.0
 Requires-Dist: soda-core-duckdb<3.6.0,>=3.3.20
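
The duckdb and soda-core pins above back the `test` command documented further down. Since the Summary line says the tool can also be used directly as a Python library, here is a minimal, hedged sketch of that workflow; the `DataContract` constructor arguments and `has_passed()` helper are assumed to match the library API described in the project README and may differ in detail:

```python
# Hedged sketch: run the schema and quality tests from Python instead of the CLI.
# Assumes DataContract(data_contract_file=..., server=...) and Run.has_passed()
# behave as described in the project README; verify against this release before use.
from datacontract.data_contract import DataContract

data_contract = DataContract(data_contract_file="datacontract.yaml", server="production")
run = data_contract.test()

if not run.has_passed():
    raise SystemExit("data contract tests failed")
```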
@@ -42,17 +42,19 @@ Provides-Extra: databricks
 Requires-Dist: soda-core-spark-df<3.6.0,>=3.3.20; extra == "databricks"
 Requires-Dist: soda-core-spark[databricks]<3.6.0,>=3.3.20; extra == "databricks"
 Requires-Dist: databricks-sql-connector<4.1.0,>=3.7.0; extra == "databricks"
-Requires-Dist: databricks-sdk<0.51.0; extra == "databricks"
+Requires-Dist: databricks-sdk<0.55.0; extra == "databricks"
+Requires-Dist: pyspark==3.5.5; extra == "databricks"
 Provides-Extra: iceberg
 Requires-Dist: pyiceberg==0.8.1; extra == "iceberg"
 Provides-Extra: kafka
 Requires-Dist: datacontract-cli[avro]; extra == "kafka"
 Requires-Dist: soda-core-spark-df<3.6.0,>=3.3.20; extra == "kafka"
+Requires-Dist: pyspark==3.5.5; extra == "kafka"
 Provides-Extra: postgres
 Requires-Dist: soda-core-postgres<3.6.0,>=3.3.20; extra == "postgres"
 Provides-Extra: s3
-Requires-Dist: s3fs==2025.2.0; extra == "s3"
-Requires-Dist: aiobotocore<2.20.0,>=2.17.0; extra == "s3"
+Requires-Dist: s3fs<2026.0.0,>=2025.2.0; extra == "s3"
+Requires-Dist: aiobotocore<2.23.0,>=2.17.0; extra == "s3"
 Provides-Extra: snowflake
 Requires-Dist: snowflake-connector-python[pandas]<3.15,>=3.6; extra == "snowflake"
 Requires-Dist: soda-core-snowflake<3.6.0,>=3.3.20; extra == "snowflake"
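
The `databricks`, `kafka`, and `s3` extras above changed their pins (new `pyspark==3.5.5`, relaxed `s3fs` and `aiobotocore`). A small, purely illustrative check of what is installed locally against those pins:

```python
# Illustrative only: compare locally installed optional dependencies with the pins above.
from importlib.metadata import PackageNotFoundError, version

for package in ("pyspark", "databricks-sdk", "s3fs", "aiobotocore"):
    try:
        print(f"{package}: {version(package)}")
    except PackageNotFoundError:
        print(f"{package}: not installed (optional extra)")
```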
@@ -79,14 +81,14 @@ Provides-Extra: dev
 Requires-Dist: datacontract-cli[all]; extra == "dev"
 Requires-Dist: httpx==0.28.1; extra == "dev"
 Requires-Dist: kafka-python; extra == "dev"
-Requires-Dist: moto==5.1.3; extra == "dev"
+Requires-Dist: moto==5.1.5; extra == "dev"
 Requires-Dist: pandas>=2.1.0; extra == "dev"
 Requires-Dist: pre-commit<4.3.0,>=3.7.1; extra == "dev"
 Requires-Dist: pytest; extra == "dev"
 Requires-Dist: pytest-xdist; extra == "dev"
 Requires-Dist: pymssql==2.3.4; extra == "dev"
 Requires-Dist: ruff; extra == "dev"
-Requires-Dist: testcontainers[kafka,minio,mssql,postgres]==4.9.2; extra == "dev"
+Requires-Dist: testcontainers[kafka,minio,mssql,postgres]==4.10.0; extra == "dev"
 Requires-Dist: trino==0.333.0; extra == "dev"
 Dynamic: license-file
 
@@ -344,110 +346,94 @@ Commands
 
 ### init
 ```
-
-Usage: datacontract init [OPTIONS] [LOCATION]
-
-Create an empty data contract.
-
-╭─ Arguments ───────────────────────────────────────────────────────────────╮
-│ location [LOCATION] The location of the data contract file to
-│ create.
-│ [default: datacontract.yaml]
-╰───────────────────────────────────────────────────────────────────────────╯
-╭─ Options ─────────────────────────────────────────────────────────────────╮
-│ --template TEXT URL of a template or data contract
-│ [default: None]
-│ --overwrite --no-overwrite Replace the existing
-│ datacontract.yaml
-│ [default: no-overwrite]
-│ --help Show this message and exit.
-╰───────────────────────────────────────────────────────────────────────────╯
+
+Usage: datacontract init [OPTIONS] [LOCATION]
+
+Create an empty data contract.
+
+╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────╮
+│ location [LOCATION] The location of the data contract file to create.
+│ [default: datacontract.yaml]
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
+╭─ Options ───────────────────────────────────────────────────────────────────────────────────╮
+│ --template TEXT URL of a template or data contract [default: None]
+│ --overwrite --no-overwrite Replace the existing datacontract.yaml
+│ [default: no-overwrite]
+│ --help Show this message and exit.
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
 
 ```
 
 ### lint
 ```
-
-Usage: datacontract lint [OPTIONS] [LOCATION]
-
-Validate that the datacontract.yaml is correctly formatted.
-
-╭─ Arguments ───────────────────────────────────────────────────────────────╮
-│ location [LOCATION] The location (url or path) of the data contract
-│ yaml.
-│ [default: datacontract.yaml]
-╰───────────────────────────────────────────────────────────────────────────╯
-╭─ Options ─────────────────────────────────────────────────────────────────╮
-│ --schema TEXT The location (url or path) of the Data
-│ Contract Specification JSON Schema
-│ [default: None]
-│ --output PATH Specify the file path where the test results
-│ should be written to (e.g.,
-│ './test-results/TEST-datacontract.xml'). If
-│ no path is provided, the output will be
-│ printed to stdout.
-│ [default: None]
-│ --output-format [junit] The target format for the test results.
-│ [default: None]
-│ --help Show this message and exit.
-╰───────────────────────────────────────────────────────────────────────────╯
+
+Usage: datacontract lint [OPTIONS] [LOCATION]
+
+Validate that the datacontract.yaml is correctly formatted.
+
+╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────╮
+│ location [LOCATION] The location (url or path) of the data contract yaml.
+│ [default: datacontract.yaml]
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
+╭─ Options ───────────────────────────────────────────────────────────────────────────────────╮
+│ --schema TEXT The location (url or path) of the Data Contract Specification
+│ JSON Schema
+│ [default: None]
+│ --output PATH Specify the file path where the test results should be written
+│ to (e.g., './test-results/TEST-datacontract.xml'). If no path is
+│ provided, the output will be printed to stdout.
+│ [default: None]
+│ --output-format [junit] The target format for the test results. [default: None]
+│ --help Show this message and exit.
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
 
 ```
 
 ### test
 ```
-
-Usage: datacontract test [OPTIONS] [LOCATION]
-
-Run schema and quality tests on configured servers.
-
-╭─ Arguments ───────────────────────────────────────────────────────────────╮
-│ location [LOCATION] The location (url or path) of the data contract
-│ yaml.
-│ [default: datacontract.yaml]
-╰───────────────────────────────────────────────────────────────────────────╯
-╭─ Options ─────────────────────────────────────────────────────────────────╮
-│ --schema TEXT The location (url or
-│ path) of the Data
-│ Contract
-│ Specification JSON
-│ Schema
-│ [default: None]
-│ --server TEXT The server
-│ configuration to run
-│ the schema and
-│ quality tests. Use
-│ the key of the server
-│ object in the data
-│ contract yaml file to
-│ refer to a server,
-│ e.g., `production`,
-│ or `all` for all
-│ servers (default).
-│ [default: all]
-│ --publish TEXT The url to publish
-│ the results after the
-│ test
-│ [default: None]
-│ --output PATH Specify the file path
-│ where the test
-│ results should be
-│ written to (e.g.,
-│ './test-results/TEST…
-│ [default: None]
-│ --output-format [junit] The target format for
-│ the test results.
-│ [default: None]
-│ --logs --no-logs Print logs
-│ [default: no-logs]
-│ --ssl-verification --no-ssl-verificati… SSL verification when
-│ publishing the data
-│ contract.
-│ [default:
-│ ssl-verification]
-│ --help Show this message and
-│ exit.
-╰───────────────────────────────────────────────────────────────────────────╯
+
+Usage: datacontract test [OPTIONS] [LOCATION]
+
+Run schema and quality tests on configured servers.
+
+╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────╮
+│ location [LOCATION] The location (url or path) of the data contract yaml.
+│ [default: datacontract.yaml]
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
+╭─ Options ───────────────────────────────────────────────────────────────────────────────────╮
+│ --schema TEXT The location (url or path) of
+│ the Data Contract Specification
+│ JSON Schema
+│ [default: None]
+│ --server TEXT The server configuration to run
+│ the schema and quality tests.
+│ Use the key of the server object
+│ in the data contract yaml file
+│ to refer to a server, e.g.,
+│ `production`, or `all` for all
+│ servers (default).
+│ [default: all]
+│ --publish-test-results --no-publish-test-results Publish the results after the
+│ test
+│ [default:
+│ no-publish-test-results]
+│ --publish TEXT DEPRECATED. The url to publish
+│ the results after the test.
+│ [default: None]
+│ --output PATH Specify the file path where the
+│ test results should be written
+│ to (e.g.,
+│ './test-results/TEST-datacontra…
+│ [default: None]
+│ --output-format [junit] The target format for the test
+│ results.
+│ [default: None]
+│ --logs --no-logs Print logs [default: no-logs]
+│ --ssl-verification --no-ssl-verification SSL verification when publishing
+│ the data contract.
+│ [default: ssl-verification]
+│ --help Show this message and exit.
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
 
 ```
 
@@ -930,64 +916,57 @@ models:
 
 ### export
 ```
-
-Usage: datacontract export [OPTIONS] [LOCATION]
-
-Convert data contract to a specific format. Saves to file specified by
-`output` option if present, otherwise prints to stdout.
-
-╭─ Arguments ───────────────────────────────────────────────────────────────╮
-│ location [LOCATION] The location (url or path) of the data contract
-│ yaml.
-│ [default: datacontract.yaml]
-╰───────────────────────────────────────────────────────────────────────────╯
-╭─ Options ─────────────────────────────────────────────────────────────────╮
-│ * --format [jsonschema|pydantic-model The export format.
-│ |sodacl|dbt|dbt-sources|db [default: None]
-│ t-staging-sql|odcs|rdf|avr [required]
-│ o|protobuf|great-expectati
-│ ons|terraform|avro-idl|sql
-│ |sql-query|html|go|bigquer
-│ y|dbml|spark|sqlalchemy|da
-│ ta-caterer|dcs|markdown|ic
-│ eberg|custom]
-│ --output PATH Specify the file path where
-│ the exported data will be
-│ saved. If no path is
-│ provided, the output will
-│ be printed to stdout.
-│ [default: None]
-│ --server TEXT The server name to export.
-│ [default: None]
-│ --model TEXT Use the key of the model in
-│ the data contract yaml file
-│ to refer to a model, e.g.,
-│ `orders`, or `all` for all
-│ models (default).
-│ [default: all]
-│ --schema TEXT The location (url or path)
-│ of the Data Contract
-│ Specification JSON Schema
-│ [default: None]
-│ --engine TEXT [engine] The engine used
-│ for great expection run.
-│ [default: None]
-│ --template PATH [custom] The file path of
-│ Jinja template.
-│ [default: None]
-│ --help Show this message and exit.
-╰───────────────────────────────────────────────────────────────────────────╯
-╭─ RDF Options ─────────────────────────────────────────────────────────────╮
-│ --rdf-base TEXT [rdf] The base URI used to generate the RDF graph.
-│ [default: None]
-╰───────────────────────────────────────────────────────────────────────────╯
-╭─ SQL Options ─────────────────────────────────────────────────────────────╮
-│ --sql-server-type TEXT [sql] The server type to determine the sql
-│ dialect. By default, it uses 'auto' to
-│ automatically detect the sql dialect via the
-│ specified servers in the data contract.
-│ [default: auto]
-╰───────────────────────────────────────────────────────────────────────────╯
+
+Usage: datacontract export [OPTIONS] [LOCATION]
+
+Convert data contract to a specific format. Saves to file specified by `output` option if present,
+otherwise prints to stdout.
+
+╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────╮
+│ location [LOCATION] The location (url or path) of the data contract yaml.
+│ [default: datacontract.yaml]
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
+╭─ Options ───────────────────────────────────────────────────────────────────────────────────╮
+│ * --format [jsonschema|pydantic-model|sodacl|db The export format. [default: None]
+│ t|dbt-sources|dbt-staging-sql|odcs|r [required]
+│ df|avro|protobuf|great-expectations|
+│ terraform|avro-idl|sql|sql-query|mer
+│ maid|html|go|bigquery|dbml|spark|sql
+│ alchemy|data-caterer|dcs|markdown|ic
+│ eberg|custom]
+│ --output PATH Specify the file path where the
+│ exported data will be saved. If no
+│ path is provided, the output will be
+│ printed to stdout.
+│ [default: None]
+│ --server TEXT The server name to export.
+│ [default: None]
+│ --model TEXT Use the key of the model in the data
+│ contract yaml file to refer to a
+│ model, e.g., `orders`, or `all` for
+│ all models (default).
+│ [default: all]
+│ --schema TEXT The location (url or path) of the
+│ Data Contract Specification JSON
+│ Schema
+│ [default: None]
+│ --engine TEXT [engine] The engine used for great
+│ expection run.
+│ [default: None]
+│ --template PATH [custom] The file path of Jinja
+│ template.
+│ [default: None]
+│ --help Show this message and exit.
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
+╭─ RDF Options ───────────────────────────────────────────────────────────────────────────────╮
+│ --rdf-base TEXT [rdf] The base URI used to generate the RDF graph. [default: None]
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
+╭─ SQL Options ───────────────────────────────────────────────────────────────────────────────╮
+│ --sql-server-type TEXT [sql] The server type to determine the sql dialect. By default,
+│ it uses 'auto' to automatically detect the sql dialect via the
+│ specified servers in the data contract.
+│ [default: auto]
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
 
 ```
 
@@ -1027,6 +1006,21 @@ Available export options:
 | `custom` | Export to Custom format with Jinja | ✅ |
 | Missing something? | Please create an issue on GitHub | TBD |
 
+#### SQL
+
+The `export` function converts a given data contract into SQL data definition language (DDL).
+
+```shell
+datacontract export datacontract.yaml --format sql --output output.sql
+```
+
+If you are using Databricks and an error is thrown when deploying the SQL DDLs with `variant` columns, set the following properties:
+
+```python
+spark.conf.set("spark.databricks.delta.schema.typeCheck.enabled", "false")
+from datacontract.model import data_contract_specification
+data_contract_specification.DATACONTRACT_TYPES.append("variant")
+```
 
 #### Great Expectations
 
@@ -1034,7 +1028,7 @@ The `export` function transforms a specified data contract into a comprehensive
 If the contract includes multiple models, you need to specify the names of the model you wish to export.
 
 ```shell
-datacontract export datacontract.yaml --format great-expectations --model orders
+datacontract export datacontract.yaml --format great-expectations --model orders
 ```
 
 The export creates a list of expectations by utilizing:
@@ -1059,7 +1053,7 @@ To further customize the export, the following optional arguments are available:
 
 #### RDF
 
-The export function converts a given data contract into a RDF representation. You have the option to
+The `export` function converts a given data contract into a RDF representation. You have the option to
 add a base_url which will be used as the default prefix to resolve relative IRIs inside the document.
 
 ```shell
@@ -1284,104 +1278,92 @@ FROM
 
 ### import
 ```
-
-Usage: datacontract import [OPTIONS]
-
-Create a data contract from the given source location. Saves to file specified
-by `output` option if present, otherwise prints to stdout.
-
-╭─ Options ─────────────────────────────────────────────────────────────────╮
-│ * --format [sql|avro|dbt|dbml|gl The format of the
-│ ue|jsonschema|bigquer source file.
-│ y|odcs|unity|spark|ic [default: None]
-│ eberg|parquet|csv|pro [required]
-│ tobuf]
-│ --output PATH Specify the file path
-│ where the Data
-│ Contract will be
-│ saved. If no path is
-│ provided, the output
-│ will be printed to
-│ stdout.
-│ [default: None]
-│ --source TEXT The path to the file
-│ or Glue Database that
-│ should be imported.
-│ [default: None]
-│ --dialect TEXT The SQL dialect to
-│ use when importing
-│ SQL files, e.g.,
-│ postgres, tsql,
-│ bigquery.
-│ [default: None]
-│ --glue-table TEXT List of table ids to
-│ import from the Glue
-│ Database (repeat for
-│ multiple table ids,
-│ leave empty for all
-│ tables in the
-│ dataset).
-│ [default: None]
-│ --bigquery-project TEXT The bigquery project
-│ id.
-│ [default: None]
-│ --bigquery-dataset TEXT The bigquery dataset
-│ id.
-│ [default: None]
-│ --bigquery-table TEXT List of table ids to
-│ import from the
-│ bigquery API (repeat
-│ for multiple table
-│ ids, leave empty for
-│ all tables in the
-│ dataset).
-│ [default: None]
-│ --unity-table-full-n… TEXT Full name of a table
-│ in the unity catalog
-│ [default: None]
-│ --dbt-model TEXT List of models names
-│ to import from the
-│ dbt manifest file
-│ (repeat for multiple
-│ models names, leave
-│ empty for all models
-│ in the dataset).
-│ [default: None]
-│ --dbml-schema TEXT List of schema names
-│ to import from the
-│ DBML file (repeat for
-│ multiple schema
-│ names, leave empty
-│ for all tables in the
-│ file).
-│ [default: None]
-│ --dbml-table TEXT List of table names
-│ to import from the
-│ DBML file (repeat for
-│ multiple table names,
-│ leave empty for all
-│ tables in the file).
-│ [default: None]
-│ --iceberg-table TEXT Table name to assign
-│ to the model created
-│ from the Iceberg
-│ schema.
-│ [default: None]
-│ --template TEXT The location (url or
-│ path) of the Data
-│ Contract
-│ Specification
-│ Template
-│ [default: None]
-│ --schema TEXT The location (url or
-│ path) of the Data
-│ Contract
-│ Specification JSON
-│ Schema
-│ [default: None]
-│ --help Show this message and
-│ exit.
-╰───────────────────────────────────────────────────────────────────────────╯
+
+Usage: datacontract import [OPTIONS]
+
+Create a data contract from the given source location. Saves to file specified by `output` option
+if present, otherwise prints to stdout.
+
+╭─ Options ───────────────────────────────────────────────────────────────────────────────────╮
+│ * --format [sql|avro|dbt|dbml|glue|jsonsc The format of the source file.
+│ hema|bigquery|odcs|unity|spark [default: None]
+│ |iceberg|parquet|csv|protobuf| [required]
+│ excel]
+│ --output PATH Specify the file path where
+│ the Data Contract will be
+│ saved. If no path is provided,
+│ the output will be printed to
+│ stdout.
+│ [default: None]
+│ --source TEXT The path to the file that
+│ should be imported.
+│ [default: None]
+│ --spec [datacontract_specification|od The format of the data
+│ cs] contract to import.
+│ [default:
+│ datacontract_specification]
+│ --dialect TEXT The SQL dialect to use when
+│ importing SQL files, e.g.,
+│ postgres, tsql, bigquery.
+│ [default: None]
+│ --glue-table TEXT List of table ids to import
+│ from the Glue Database (repeat
+│ for multiple table ids, leave
+│ empty for all tables in the
+│ dataset).
+│ [default: None]
+│ --bigquery-project TEXT The bigquery project id.
+│ [default: None]
+│ --bigquery-dataset TEXT The bigquery dataset id.
+│ [default: None]
+│ --bigquery-table TEXT List of table ids to import
+│ from the bigquery API (repeat
+│ for multiple table ids, leave
+│ empty for all tables in the
+│ dataset).
+│ [default: None]
+│ --unity-table-full-name TEXT Full name of a table in the
+│ unity catalog
+│ [default: None]
+│ --dbt-model TEXT List of models names to import
+│ from the dbt manifest file
+│ (repeat for multiple models
+│ names, leave empty for all
+│ models in the dataset).
+│ [default: None]
+│ --dbml-schema TEXT List of schema names to import
+│ from the DBML file (repeat for
+│ multiple schema names, leave
+│ empty for all tables in the
+│ file).
+│ [default: None]
+│ --dbml-table TEXT List of table names to import
+│ from the DBML file (repeat for
+│ multiple table names, leave
+│ empty for all tables in the
+│ file).
+│ [default: None]
+│ --iceberg-table TEXT Table name to assign to the
+│ model created from the Iceberg
+│ schema.
+│ [default: None]
+│ --template TEXT The location (url or path) of
+│ the Data Contract
+│ Specification Template
+│ [default: None]
+│ --schema TEXT The location (url or path) of
+│ the Data Contract
+│ Specification JSON Schema
+│ [default: None]
+│ --owner TEXT The owner or team responsible
+│ for managing the data
+│ contract.
+│ [default: None]
+│ --id TEXT The identifier for the the
+│ data contract.
+│ [default: None]
+│ --help Show this message and exit.
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
 
 ```
 
@@ -1408,11 +1390,11 @@ Available import options:
 | `jsonschema` | Import from JSON Schemas | ✅ |
 | `odcs` | Import from Open Data Contract Standard (ODCS) | ✅ |
 | `parquet` | Import from Parquet File Metadata | ✅ |
-| `protobuf` | Import from Protobuf schemas | ✅ |
-| `spark` | Import from Spark StructTypes | ✅ |
+| `protobuf` | Import from Protobuf schemas | ✅ |
+| `spark` | Import from Spark StructTypes, Variant | ✅ |
 | `sql` | Import from SQL DDL | ✅ |
 | `unity` | Import from Databricks Unity Catalog | partial |
-| Missing something? | Please create an issue on GitHub | TBD |
+| Missing something? | Please create an issue on GitHub | TBD |
 
 
 #### ODCS
@@ -1514,14 +1496,31 @@ datacontract import --format glue --source <database_name>
 
 #### Spark
 
-Importing from Spark table or view these must be created or accessible in the Spark context. Specify tables list in `source` parameter.
-
-Example:
+Importing from a Spark table or view requires that it is created or accessible in the Spark context. Specify the list of tables in the `source` parameter. If the `source` tables are registered as tables in Databricks and have table-level descriptions, those descriptions are also added to the Data Contract Specification.
 
 ```bash
+# Example: Import Spark table(s) from the Spark context
 datacontract import --format spark --source "users,orders"
 ```
 
+```python
+# Example: Import Spark table
+DataContract().import_from_source("spark", "users")
+DataContract().import_from_source(format="spark", source="users")
+
+# Example: Import Spark dataframe
+DataContract().import_from_source("spark", "users", dataframe=df_user)
+DataContract().import_from_source(format="spark", source="users", dataframe=df_user)
+
+# Example: Import Spark table + table description
+DataContract().import_from_source("spark", "users", description="description")
+DataContract().import_from_source(format="spark", source="users", description="description")
+
+# Example: Import Spark dataframe + table description
+DataContract().import_from_source("spark", "users", dataframe=df_user, description="description")
+DataContract().import_from_source(format="spark", source="users", dataframe=df_user, description="description")
+```
+
 
 #### DBML
 Importing from DBML Documents.
1586
1585
 
1587
1586
  ### breaking
1588
1587
  ```
1589
-
1590
- Usage: datacontract breaking [OPTIONS] LOCATION_OLD LOCATION_NEW
1591
-
1592
- Identifies breaking changes between data contracts. Prints to stdout.
1593
-
1594
- ╭─ Arguments ──────────────────────────────────────────────────────────────────╮
1595
- │ * location_old TEXT The location (url or path) of the old data
1596
- contract yaml.
1597
- │ [default: None]
1598
- [required]
1599
- * location_new TEXT The location (url or path) of the new data
1600
- contract yaml.
1601
- │ [default: None] │
1602
- │ [required] │
1603
- ╰──────────────────────────────────────────────────────────────────────────────╯
1604
- ╭─ Options ────────────────────────────────────────────────────────────────────╮
1605
- │ --help Show this message and exit. │
1606
- ╰──────────────────────────────────────────────────────────────────────────────╯
1588
+
1589
+ Usage: datacontract breaking [OPTIONS] LOCATION_OLD LOCATION_NEW
1590
+
1591
+ Identifies breaking changes between data contracts. Prints to stdout.
1592
+
1593
+ ╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────╮
1594
+ │ * location_old TEXT The location (url or path) of the old data contract yaml.
1595
+ [default: None]
1596
+ │ [required]
1597
+ * location_new TEXT The location (url or path) of the new data contract yaml.
1598
+ [default: None]
1599
+ [required]
1600
+ ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
1601
+ ╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
1602
+ │ --help Show this message and exit. │
1603
+ ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
1607
1604
 
1608
1605
  ```
1609
1606
 
1610
1607
  ### changelog
1611
1608
  ```
1612
-
1613
- Usage: datacontract changelog [OPTIONS] LOCATION_OLD LOCATION_NEW
1614
-
1615
- Generate a changelog between data contracts. Prints to stdout.
1616
-
1617
- ╭─ Arguments ──────────────────────────────────────────────────────────────────╮
1618
- │ * location_old TEXT The location (url or path) of the old data
1619
- contract yaml.
1620
- │ [default: None]
1621
- [required]
1622
- * location_new TEXT The location (url or path) of the new data
1623
- contract yaml.
1624
- │ [default: None] │
1625
- │ [required] │
1626
- ╰──────────────────────────────────────────────────────────────────────────────╯
1627
- ╭─ Options ────────────────────────────────────────────────────────────────────╮
1628
- │ --help Show this message and exit. │
1629
- ╰──────────────────────────────────────────────────────────────────────────────╯
1609
+
1610
+ Usage: datacontract changelog [OPTIONS] LOCATION_OLD LOCATION_NEW
1611
+
1612
+ Generate a changelog between data contracts. Prints to stdout.
1613
+
1614
+ ╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────╮
1615
+ │ * location_old TEXT The location (url or path) of the old data contract yaml.
1616
+ [default: None]
1617
+ │ [required]
1618
+ * location_new TEXT The location (url or path) of the new data contract yaml.
1619
+ [default: None]
1620
+ [required]
1621
+ ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
1622
+ ╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
1623
+ │ --help Show this message and exit. │
1624
+ ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
1630
1625
 
1631
1626
  ```
1632
1627
 
1633
1628
  ### diff
1634
1629
  ```
1635
-
1636
- Usage: datacontract diff [OPTIONS] LOCATION_OLD LOCATION_NEW
1637
-
1638
- PLACEHOLDER. Currently works as 'changelog' does.
1639
-
1640
- ╭─ Arguments ──────────────────────────────────────────────────────────────────╮
1641
- │ * location_old TEXT The location (url or path) of the old data
1642
- contract yaml.
1643
- │ [default: None]
1644
- [required]
1645
- * location_new TEXT The location (url or path) of the new data
1646
- contract yaml.
1647
- │ [default: None] │
1648
- │ [required] │
1649
- ╰──────────────────────────────────────────────────────────────────────────────╯
1650
- ╭─ Options ────────────────────────────────────────────────────────────────────╮
1651
- │ --help Show this message and exit. │
1652
- ╰──────────────────────────────────────────────────────────────────────────────╯
1630
+
1631
+ Usage: datacontract diff [OPTIONS] LOCATION_OLD LOCATION_NEW
1632
+
1633
+ PLACEHOLDER. Currently works as 'changelog' does.
1634
+
1635
+ ╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────╮
1636
+ │ * location_old TEXT The location (url or path) of the old data contract yaml.
1637
+ [default: None]
1638
+ │ [required]
1639
+ * location_new TEXT The location (url or path) of the new data contract yaml.
1640
+ [default: None]
1641
+ [required]
1642
+ ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
1643
+ ╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
1644
+ │ --help Show this message and exit. │
1645
+ ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
1653
1646
 
1654
1647
  ```
1655
1648
 
1656
1649
  ### catalog
1657
1650
  ```
1658
-
1659
- Usage: datacontract catalog [OPTIONS]
1660
-
1661
- Create a html catalog of data contracts.
1662
-
1663
- ╭─ Options ────────────────────────────────────────────────────────────────────╮
1664
- │ --files TEXT Glob pattern for the data contract files to include in │
1665
- the catalog. Applies recursively to any subfolders.
1666
- │ [default: *.yaml]
1667
- │ --output TEXT Output directory for the catalog html files. │
1668
- [default: catalog/]
1669
- --schema TEXT The location (url or path) of the Data Contract
1670
- Specification JSON Schema
1671
- │ [default: None] │
1672
- │ --help Show this message and exit. │
1673
- ╰──────────────────────────────────────────────────────────────────────────────╯
1651
+
1652
+ Usage: datacontract catalog [OPTIONS]
1653
+
1654
+ Create a html catalog of data contracts.
1655
+
1656
+ ╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
1657
+ │ --files TEXT Glob pattern for the data contract files to include in the catalog.
1658
+ │ Applies recursively to any subfolders.
1659
+ │ [default: *.yaml]
1660
+ │ --output TEXT Output directory for the catalog html files. [default: catalog/]
1661
+ --schema TEXT The location (url or path) of the Data Contract Specification JSON Schema
1662
+ [default: None]
1663
+ --help Show this message and exit.
1664
+ ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
1674
1665
 
1675
1666
  ```
1676
1667
 
@@ -1686,56 +1677,48 @@ datacontract catalog --files "*.odcs.yaml"
 
 ### publish
 ```
-
-Usage: datacontract publish [OPTIONS] [LOCATION]
-
-Publish the data contract to the Data Mesh Manager.
-
-╭─ Arguments ───────────────────────────────────────────────────────────────╮
-│ location [LOCATION] The location (url or path) of the data contract
-│ yaml.
-│ [default: datacontract.yaml]
-╰───────────────────────────────────────────────────────────────────────────╯
-╭─ Options ─────────────────────────────────────────────────────────────────╮
-│ --schema TEXT The location (url or
-│ path) of the Data
-│ Contract Specification
-│ JSON Schema
-│ [default: None]
-│ --ssl-verification --no-ssl-verification SSL verification when
-│ publishing the data
-│ contract.
-│ [default:
-│ ssl-verification]
-│ --help Show this message and
-│ exit.
-╰───────────────────────────────────────────────────────────────────────────╯
+
+Usage: datacontract publish [OPTIONS] [LOCATION]
+
+Publish the data contract to the Data Mesh Manager.
+
+╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────╮
+│ location [LOCATION] The location (url or path) of the data contract yaml.
+│ [default: datacontract.yaml]
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
+╭─ Options ───────────────────────────────────────────────────────────────────────────────────╮
+│ --schema TEXT The location (url or path) of the Data
+│ Contract Specification JSON Schema
+│ [default: None]
+│ --ssl-verification --no-ssl-verification SSL verification when publishing the data
+│ contract.
+│ [default: ssl-verification]
+│ --help Show this message and exit.
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
 
 ```
 
 ### api
 ```
-
-Usage: datacontract api [OPTIONS]
-
-Start the datacontract CLI as server application with REST API.
-The OpenAPI documentation as Swagger UI is available on http://localhost:4242.
-You can execute the commands directly from the Swagger UI.
-To protect the API, you can set the environment variable
-DATACONTRACT_CLI_API_KEY to a secret API key. To authenticate, requests must
-include the header 'x-api-key' with the correct API key. This is highly
-recommended, as data contract tests may be subject to SQL injections or leak
-sensitive information.
-To connect to servers (such as a Snowflake data source), set the credentials
-as environment variables as documented in https://cli.datacontract.com/#test
-
-╭─ Options ─────────────────────────────────────────────────────────────────╮
-│ --port INTEGER Bind socket to this port. [default: 4242]
-│ --host TEXT Bind socket to this host. Hint: For running in
-│ docker, set it to 0.0.0.0
-│ [default: 127.0.0.1]
-│ --help Show this message and exit.
-╰───────────────────────────────────────────────────────────────────────────╯
+
+Usage: datacontract api [OPTIONS]
+
+Start the datacontract CLI as server application with REST API.
+The OpenAPI documentation as Swagger UI is available on http://localhost:4242. You can execute the
+commands directly from the Swagger UI.
+To protect the API, you can set the environment variable DATACONTRACT_CLI_API_KEY to a secret API
+key. To authenticate, requests must include the header 'x-api-key' with the correct API key. This
+is highly recommended, as data contract tests may be subject to SQL injections or leak sensitive
+information.
+To connect to servers (such as a Snowflake data source), set the credentials as environment
+variables as documented in https://cli.datacontract.com/#test
+
+╭─ Options ───────────────────────────────────────────────────────────────────────────────────╮
+│ --port INTEGER Bind socket to this port. [default: 4242]
+│ --host TEXT Bind socket to this host. Hint: For running in docker, set it to 0.0.0.0
+│ [default: 127.0.0.1]
+│ --help Show this message and exit.
+╰─────────────────────────────────────────────────────────────────────────────────────────────╯
 
 ```
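
The `api` help above states that, once DATACONTRACT_CLI_API_KEY is set, every request must carry the key in an `x-api-key` header. A small sketch of an authenticated call with `requests` (a declared dependency); the path below targets the OpenAPI document only as a placeholder, so look up the real routes in the Swagger UI at http://localhost:4242:

```python
# Sketch: authenticated request against a running `datacontract api` server.
# The /openapi.json path is a placeholder; use the routes shown in the Swagger UI.
import os

import requests

response = requests.get(
    "http://localhost:4242/openapi.json",
    headers={"x-api-key": os.environ["DATACONTRACT_CLI_API_KEY"]},
    timeout=30,
)
response.raise_for_status()
print(response.json()["info"]["title"])
```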