datacontract-cli 0.9.8__tar.gz → 0.9.9__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of datacontract-cli might be problematic.

Files changed (128)
  1. {datacontract-cli-0.9.8/datacontract_cli.egg-info → datacontract_cli-0.9.9}/PKG-INFO +314 -105
  2. datacontract_cli-0.9.9/README.md +904 -0
  3. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/cli.py +2 -0
  4. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/data_contract.py +27 -27
  5. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/soda/check_soda_execute.py +17 -6
  6. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/soda/connections/duckdb.py +21 -4
  7. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/avro_converter.py +6 -4
  8. datacontract_cli-0.9.9/datacontract/export/csv_type_converter.py +36 -0
  9. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/great_expectations_converter.py +1 -1
  10. datacontract_cli-0.9.9/datacontract/export/html_export.py +46 -0
  11. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/pydantic_converter.py +51 -60
  12. datacontract_cli-0.9.9/datacontract/export/sodacl_converter.py +190 -0
  13. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/sql_converter.py +12 -1
  14. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/imports/avro_importer.py +37 -12
  15. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/integration/publish_datamesh_manager.py +2 -3
  16. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/resolve.py +45 -6
  17. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/model/run.py +2 -1
  18. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9/datacontract_cli.egg-info}/PKG-INFO +314 -105
  19. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract_cli.egg-info/SOURCES.txt +21 -19
  20. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract_cli.egg-info/requires.txt +3 -2
  21. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/pyproject.toml +7 -3
  22. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_breaking.py +26 -26
  23. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_changelog.py +137 -137
  24. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_example_model_linter.py +7 -7
  25. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_avro.py +3 -13
  26. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_avro_idl.py +3 -3
  27. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_dbt_models.py +2 -2
  28. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_dbt_sources.py +2 -2
  29. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_dbt_staging_sql.py +2 -2
  30. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_great_expectations.py +80 -30
  31. datacontract_cli-0.9.9/tests/test_export_html.py +25 -0
  32. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_jsonschema.py +3 -3
  33. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_odcs.py +2 -2
  34. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_protobuf.py +3 -3
  35. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_pydantic.py +66 -59
  36. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_rdf.py +5 -5
  37. datacontract_cli-0.9.9/tests/test_export_sodacl.py +86 -0
  38. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_sql.py +14 -14
  39. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_sql_query.py +3 -5
  40. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_export_terraform.py +2 -2
  41. datacontract_cli-0.9.9/tests/test_import_avro.py +157 -0
  42. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_import_sql.py +2 -2
  43. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_lint.py +8 -8
  44. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_quality_schema_linter.py +1 -1
  45. datacontract-cli-0.9.8/tests/test_examples_bigquery.py → datacontract_cli-0.9.9/tests/test_test_bigquery.py +3 -2
  46. datacontract-cli-0.9.8/tests/test_examples_databricks.py → datacontract_cli-0.9.9/tests/test_test_databricks.py +2 -2
  47. datacontract-cli-0.9.8/tests/test_examples_examples_csv.py → datacontract_cli-0.9.9/tests/test_test_examples_csv.py +3 -3
  48. datacontract_cli-0.9.9/tests/test_test_examples_formats_valid.py +17 -0
  49. datacontract-cli-0.9.8/tests/test_examples_examples_inline.py → datacontract_cli-0.9.9/tests/test_test_examples_inline.py +2 -2
  50. datacontract-cli-0.9.8/tests/test_examples_examples_json.py → datacontract_cli-0.9.9/tests/test_test_examples_json.py +2 -2
  51. datacontract-cli-0.9.8/tests/test_examples_examples_missing.py → datacontract_cli-0.9.9/tests/test_test_examples_missing.py +2 -2
  52. datacontract-cli-0.9.8/tests/test_examples_kafka.py → datacontract_cli-0.9.9/tests/test_test_kafka.py +7 -3
  53. datacontract-cli-0.9.8/tests/test_examples_kafka_remote.py → datacontract_cli-0.9.9/tests/test_test_kafka_remote.py +4 -4
  54. datacontract-cli-0.9.8/tests/test_examples_local_json.py → datacontract_cli-0.9.9/tests/test_test_local_json.py +2 -2
  55. datacontract-cli-0.9.8/tests/test_examples_parquet.py → datacontract_cli-0.9.9/tests/test_test_parquet.py +4 -4
  56. datacontract-cli-0.9.8/tests/test_examples_postgres.py → datacontract_cli-0.9.9/tests/test_test_postgres.py +3 -3
  57. datacontract-cli-0.9.8/tests/test_examples_s3_csv.py → datacontract_cli-0.9.9/tests/test_test_s3_csv.py +3 -3
  58. datacontract-cli-0.9.8/tests/test_examples_s3_json.py → datacontract_cli-0.9.9/tests/test_test_s3_json.py +3 -3
  59. datacontract-cli-0.9.8/tests/test_examples_s3_json_complex.py → datacontract_cli-0.9.9/tests/test_test_s3_json_complex.py +3 -3
  60. datacontract-cli-0.9.8/tests/test_examples_s3_json_multiple_models.py → datacontract_cli-0.9.9/tests/test_test_s3_json_multiple_models.py +3 -3
  61. datacontract-cli-0.9.8/tests/test_examples_s3_json_remote.py → datacontract_cli-0.9.9/tests/test_test_s3_json_remote.py +2 -2
  62. datacontract-cli-0.9.8/tests/test_examples_snowflake.py → datacontract_cli-0.9.9/tests/test_test_snowflake.py +2 -2
  63. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_web.py +2 -2
  64. datacontract-cli-0.9.8/README.md +0 -696
  65. datacontract-cli-0.9.8/datacontract/export/sodacl_converter.py +0 -93
  66. datacontract-cli-0.9.8/datacontract/lint/linters/primary_field_linter.py +0 -28
  67. datacontract-cli-0.9.8/tests/test_export_sodacl.py +0 -69
  68. datacontract-cli-0.9.8/tests/test_import_avro.py +0 -77
  69. datacontract-cli-0.9.8/tests/test_primary_field_linter.py +0 -38
  70. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/LICENSE +0 -0
  71. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/__init__.py +0 -0
  72. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/breaking/breaking.py +0 -0
  73. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/breaking/breaking_rules.py +0 -0
  74. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/__init__.py +0 -0
  75. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/datacontract/check_that_datacontract_contains_valid_servers_configuration.py +0 -0
  76. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/datacontract/check_that_datacontract_file_exists.py +0 -0
  77. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/datacontract/check_that_datacontract_str_is_valid.py +0 -0
  78. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/fastjsonschema/check_jsonschema.py +0 -0
  79. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/fastjsonschema/s3/s3_read_files.py +0 -0
  80. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/soda/__init__.py +0 -0
  81. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/soda/connections/bigquery.py +0 -0
  82. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/soda/connections/dask.py +0 -0
  83. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/soda/connections/databricks.py +0 -0
  84. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/soda/connections/kafka.py +0 -0
  85. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/soda/connections/postgres.py +0 -0
  86. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/engines/soda/connections/snowflake.py +0 -0
  87. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/avro_idl_converter.py +0 -0
  88. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/dbt_converter.py +0 -0
  89. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/jsonschema_converter.py +0 -0
  90. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/odcs_converter.py +0 -0
  91. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/protobuf_converter.py +0 -0
  92. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/rdf_converter.py +0 -0
  93. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/sql_type_converter.py +0 -0
  94. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/export/terraform_converter.py +0 -0
  95. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/imports/sql_importer.py +0 -0
  96. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/init/download_datacontract_file.py +0 -0
  97. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/integration/publish_opentelemetry.py +0 -0
  98. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/files.py +0 -0
  99. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/lint.py +0 -0
  100. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/linters/__init__.py +0 -0
  101. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/linters/description_linter.py +0 -0
  102. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/linters/example_model_linter.py +0 -0
  103. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/linters/field_pattern_linter.py +0 -0
  104. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/linters/field_reference_linter.py +0 -0
  105. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/linters/notice_period_linter.py +0 -0
  106. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/linters/quality_schema_linter.py +0 -0
  107. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/linters/valid_constraints_linter.py +0 -0
  108. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/schema.py +0 -0
  109. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/lint/urls.py +0 -0
  110. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/model/breaking_change.py +0 -0
  111. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/model/data_contract_specification.py +0 -0
  112. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/model/exceptions.py +0 -0
  113. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/web.py +0 -0
  114. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract_cli.egg-info/dependency_links.txt +0 -0
  115. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract_cli.egg-info/entry_points.txt +0 -0
  116. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract_cli.egg-info/top_level.txt +0 -0
  117. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/setup.cfg +0 -0
  118. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_cli.py +0 -0
  119. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_description_linter.py +0 -0
  120. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_documentation_linter.py +0 -0
  121. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_download_datacontract_file.py +0 -0
  122. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_field_constraint_linter.py +0 -0
  123. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_field_pattern_linter.py +0 -0
  124. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_field_reference_linter.py +0 -0
  125. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_integration_datameshmanager.py +0 -0
  126. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_integration_opentelemetry.py +0 -0
  127. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_notice_period_linter.py +0 -0
  128. {datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/tests/test_schema.py +0 -0
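The listing above uses a compact rename notation: `{old → new}/rest` means the bracketed part of the path changed, a bare `old → new` means the whole path changed, and the trailing `+A -B` gives added and removed line counts. The following is a minimal sketch of how such entries could be expanded into full old and new paths. The names `FileChange` and `parse_entry` are hypothetical helpers introduced for illustration, and the grammar is inferred only from the entries shown here, not from any documented format.

```python
import re
from typing import NamedTuple


class FileChange(NamedTuple):
    """One expanded listing entry (hypothetical helper type)."""
    old_path: str
    new_path: str
    added: int
    removed: int


# Trailing "+<added> -<removed>" counters at the end of an entry.
_COUNTS = re.compile(r"^(?P<path>.*\S)\s+\+(?P<added>\d+)\s+-(?P<removed>\d+)$")
# "{old → new}" shorthand, with optional text before and after the braces.
_BRACES = re.compile(r"^(?P<pre>.*?)\{(?P<old>.*?) → (?P<new>.*?)\}(?P<post>.*)$")


def parse_entry(entry: str) -> FileChange:
    """Expand one file-listing entry into old path, new path, and line counts."""
    # Drop surrounding whitespace and an optional "12. " list index.
    text = re.sub(r"^\d+\.\s+", "", entry.strip())
    m = _COUNTS.match(text)
    if m is None:
        raise ValueError(f"unrecognised entry: {entry!r}")
    path = m["path"]
    b = _BRACES.match(path)
    if b:
        # Only the bracketed segment differs between the two versions.
        old = b["pre"] + b["old"] + b["post"]
        new = b["pre"] + b["new"] + b["post"]
    elif " → " in path:
        # The whole path was renamed.
        old, new = path.split(" → ", 1)
    else:
        # A file that exists in only one version; the listing gives one path.
        old = new = path
    return FileChange(old, new, int(m["added"]), int(m["removed"]))
```

For example, `parse_entry("{datacontract-cli-0.9.8 → datacontract_cli-0.9.9}/datacontract/cli.py +2 -0")` yields the two full paths under each version directory together with the counts 2 and 0.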
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: datacontract-cli
- Version: 0.9.8
+ Version: 0.9.9
  Summary: Test data contracts
  Author-email: Jochen Christ <jochen.christ@innoq.com>, Stefan Negele <stefan.negele@innoq.com>
  Project-URL: Homepage, https://cli.datacontract.com
@@ -12,10 +12,10 @@ Requires-Python: >=3.10
  Description-Content-Type: text/markdown
  License-File: LICENSE
  Requires-Dist: typer[all]<0.13,>=0.9
- Requires-Dist: pydantic<2.7.0,>=2.5.3
+ Requires-Dist: pydantic<2.8.0,>=2.5.3
  Requires-Dist: pyyaml~=6.0.1
  Requires-Dist: requests~=2.31.0
- Requires-Dist: fastapi==0.110.0
+ Requires-Dist: fastapi==0.110.1
  Requires-Dist: fastparquet==2024.2.0
  Requires-Dist: python-multipart==0.0.9
  Requires-Dist: rich~=13.7.0
@@ -39,6 +39,7 @@ Provides-Extra: dev
  Requires-Dist: httpx==0.27.0; extra == "dev"
  Requires-Dist: ruff; extra == "dev"
  Requires-Dist: pytest; extra == "dev"
+ Requires-Dist: pytest-xdist; extra == "dev"
  Requires-Dist: testcontainers<4.0; extra == "dev"
  Requires-Dist: testcontainers-minio; extra == "dev"
  Requires-Dist: testcontainers-postgres; extra == "dev"
@@ -59,21 +60,13 @@ It uses data contract YAML files to lint the data contract, connect to data sour
 
  ![Main features of the Data Contract CLI](datacontractcli.png)
 
- <div align="center">
- <a href="https://www.youtube.com/watch?v=B1dixhgO2vQ">
- <img
- src="https://img.youtube.com/vi/B1dixhgO2vQ/0.jpg"
- alt="Demo of Data Contract CLI"
- style="width:100%;">
- </a>
- </div>
 
  ## Getting started
 
- Let's look at this data contract:
+ Let's look at this data contract:
  [https://datacontract.com/examples/orders-latest/datacontract.yaml](https://datacontract.com/examples/orders-latest/datacontract.yaml)
 
- We have a _servers_ section with endpoint details to the S3 bucket, _models_ for the structure of the data, and _quality_ attributes that describe the expected freshness and number of rows.
+ We have a _servers_ section with endpoint details to the S3 bucket, _models_ for the structure of the data, _servicelevels_ and _quality_ attributes that describe the expected freshness and number of rows.
 
  This data contract contains all information to connect to S3 and check that the actual data meets the defined schema and quality requirements. We can use this information to test if the actual data set in S3 is compliant to the data contract.
 
@@ -120,6 +113,31 @@ Testing https://datacontract.com/examples/orders-latest/datacontract.yaml
 
  Voilà, the CLI tested that the _datacontract.yaml_ itself is valid, all records comply with the schema, and all quality attributes are met.
 
+ We can also use the datacontract.yaml to export in many [formats](#format), e.g., to SQL:
+
+ ```bash
+ $ datacontract export --format sql https://datacontract.com/examples/orders-latest/datacontract.yaml
+
+ # returns:
+ -- Data Contract: urn:datacontract:checkout:orders-latest
+ -- SQL Dialect: snowflake
+ CREATE TABLE orders (
+ order_id TEXT not null primary key,
+ order_timestamp TIMESTAMP_TZ not null,
+ order_total NUMBER not null,
+ customer_id TEXT,
+ customer_email_address TEXT not null,
+ processed_timestamp TIMESTAMP_TZ not null
+ );
+ CREATE TABLE line_items (
+ lines_item_id TEXT not null primary key,
+ order_id TEXT,
+ sku TEXT
+ );
+ ```
+
+ Or generate this [HTML page](https://datacontract.com/examples/orders-latest/datacontract.html).
+
  ## Usage
 
  ```bash
@@ -135,7 +153,13 @@ $ datacontract test datacontract.yaml
  # execute schema and quality checks on the examples within the contract
  $ datacontract test --examples datacontract.yaml
 
- # find differences between to data contracts (Coming Soon)
+ # export data contract as html (other formats: avro, dbt, dbt-sources, dbt-staging-sql, jsonschema, odcs, rdf, sql, sodacl, terraform, ...)
+ $ datacontract export --format html datacontract.yaml > datacontract.html
+
+ # import avro (other formats: sql, ...)
+ $ datacontract import --format avro --source avro_schema.avsc
+
+ # find differences between to data contracts
  $ datacontract diff datacontract-v1.yaml datacontract-v2.yaml
 
  # find differences between to data contracts categorized into error, warning, and info.
@@ -143,15 +167,6 @@ $ datacontract changelog datacontract-v1.yaml datacontract-v2.yaml
 
  # fail pipeline on breaking changes. Uses changelog internally and showing only error and warning.
  $ datacontract breaking datacontract-v1.yaml datacontract-v2.yaml
-
- # export model as jsonschema (other formats: avro, dbt, dbt-sources, dbt-staging-sql, jsonschema, odcs, rdf, sql (coming soon), sodacl, terraform)
- $ datacontract export --format jsonschema datacontract.yaml
-
- # import sql
- $ datacontract import --format sql --source my_ddl.sql
-
- # import avro
- $ datacontract import --format avro --source avro_schema.avsc
  ```
 
  ## Programmatic (Python)
@@ -165,52 +180,6 @@ if not run.has_passed():
  # Abort pipeline, alert, or take corrective actions...
  ```
 
- ## Integrations
-
-
- | Integration | Option | Description |
- |-------------------|------------------------------|-------------------------------------------------------------------------------------------------------|
- | Data Mesh Manager | `--publish` | Push full results to the [Data Mesh Manager API](https://api.datamesh-manager.com/swagger/index.html) |
- | OpenTelemetry | `--publish-to-opentelemetry` | Push result as gauge metrics (logs are planned) |
-
- ### Integration with Data Mesh Manager
-
- If you use [Data Mesh Manager](https://datamesh-manager.com/), you can use the data contract URL and append the `--publish` option to send and display the test results. Set an environment variable for your API key.
-
- ```bash
- # Fetch current data contract, execute tests on production, and publish result to data mesh manager
- $ EXPORT DATAMESH_MANAGER_API_KEY=xxx
- $ datacontract test https://demo.datamesh-manager.com/demo279750347121/datacontracts/4df9d6ee-e55d-4088-9598-b635b2fdcbbc/datacontract.yaml --server production --publish
- ```
-
- ### Integration with OpenTelemetry
-
- If you use OpenTelemetry, you can use the data contract URL and append the `--publish-to-opentelemetry` option to send the test results to your OLTP-compatible instance, e.g., Prometheus.
-
- The metric name is "datacontract.cli.test.result" and it uses the following encoding for the result:
-
- | datacontract.cli.test.result | Description |
- |-------|---------------------------------------|
- | 0 | test run passed, no warnings |
- | 1 | test run has warnings |
- | 2 | test run failed |
- | 3 | test run not possible due to an error |
- | 4 | test status unknown |
-
-
- ```bash
- # Fetch current data contract, execute tests on production, and publish result to open telemetry
- $ EXPORT OTEL_SERVICE_NAME=datacontract-cli
- $ EXPORT OTEL_EXPORTER_OTLP_ENDPOINT=https://YOUR_ID.apm.westeurope.azure.elastic-cloud.com:443
- $ EXPORT OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer%20secret # Optional, when using SaaS Products
- $ EXPORT OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf # Optional, default is http/protobuf - use value grpc to use the gRPC protocol instead
- # Send to OpenTelemetry
- $ datacontract test https://demo.datamesh-manager.com/demo279750347121/datacontracts/4df9d6ee-e55d-4088-9598-b635b2fdcbbc/datacontract.yaml --server production --publish-to-opentelemetry
- ```
-
- Current limitations:
- - currently, only ConsoleExporter and OTLP Exporter
- - Metrics only, no logs yet (but loosely planned)
 
  ## Installation
 
@@ -245,7 +214,87 @@ alias datacontract='docker run --rm -v "${PWD}:/home/datacontract" datacontract/
 
  ## Documentation
 
- ### Tests
+ Commands
+
+ - [init](#init)
+ - [lint](#lint)
+ - [test](#test)
+ - [export](#export)
+ - [import](#import)
+ - [breaking](#breaking)
+ - [changelog](#changelog)
+ - [diff](#diff)
+
+ ### init
+
+ ```
+ Usage: datacontract init [OPTIONS] [LOCATION]
+
+ Download a datacontract.yaml template and write it to file.
+
+ ╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────╮
+ │ location [LOCATION] The location (url or path) of the data contract yaml to create. │
+ │ [default: datacontract.yaml] │
+ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯
+ ╭─ Options ────────────────────────────────────────────────────────────────────────────────────╮
+ │ --template TEXT URL of a template or data contract │
+ │ [default: │
+ │ https://datacontract.com/datacontract.init.yaml] │
+ │ --overwrite --no-overwrite Replace the existing datacontract.yaml │
+ │ [default: no-overwrite] │
+ │ --help Show this message and exit. │
+ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯
+ ```
+
+ ### lint
+
+ ```
+ Usage: datacontract lint [OPTIONS] [LOCATION]
+
+ Validate that the datacontract.yaml is correctly formatted.
+
+ ╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
+ │ location [LOCATION] The location (url or path) of the data contract yaml. [default: datacontract.yaml] │
+ ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+ ╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
+ │ --schema TEXT The location (url or path) of the Data Contract Specification JSON Schema │
+ │ [default: https://datacontract.com/datacontract.schema.json] │
+ │ --help Show this message and exit. │
+ ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+ ```
+
+ ### test
+
+ ```
+ Usage: datacontract test [OPTIONS] [LOCATION]
+
+ Run schema and quality tests on configured servers.
+
+ ╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
+ │ location [LOCATION] The location (url or path) of the data contract yaml. [default: datacontract.yaml] │
+ ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+ ╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
+ │ --schema TEXT The location (url or path) of the Data Contract │
+ │ Specification JSON Schema │
+ │ [default: │
+ │ https://datacontract.com/datacontract.schema.json] │
+ │ --server TEXT The server configuration to run the schema and quality │
+ │ tests. Use the key of the server object in the data │
+ │ contract yaml file to refer to a server, e.g., │
+ │ `production`, or `all` for all servers (default). │
+ │ [default: all] │
+ │ --examples --no-examples Run the schema and quality tests on the example data │
+ │ within the data contract. │
+ │ [default: no-examples] │
+ │ --publish TEXT The url to publish the results after the test │
+ │ [default: None] │
+ │ --publish-to-opentelemetry --no-publish-to-opentelemetry Publish the results to opentelemetry. Use environment │
+ │ variables to configure the OTLP endpoint, headers, etc. │
+ │ [default: no-publish-to-opentelemetry] │
+ │ --logs --no-logs Print logs [default: no-logs] │
+ │ --help Show this message and exit. │
+ ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+ ```
 
  Data Contract CLI can connect to data sources and run schema and quality tests to verify that the data contract is valid.
 
@@ -256,25 +305,29 @@ $ datacontract test --server production datacontract.yaml
  To connect to the databases the `server` block in the datacontract.yaml is used to set up the connection. In addition, credentials, such as username and passwords, may be defined with environment variables.
 
  The application uses different engines, based on the server `type`.
-
- | Type | Format | Description | Status | Engines |
- |--------------|------------|---------------------------------------------------------------------------|-------------|-------------------------------------|
- | `s3` | `parquet` | Works for any S3-compliant endpoint., e.g., AWS S3, GCS, MinIO, Ceph, ... | ✅ | soda-core-duckdb |
- | `s3` | `json` | Support for `new_line` delimited JSON files and one JSON record per file. | ✅ | fastjsonschema<br> soda-core-duckdb |
- | `s3` | `csv` | | | soda-core-duckdb |
- | `s3` | `delta` | | Coming soon | TBD |
- | `postgres` | n/a | | | soda-core-postgres |
- | `snowflake` | n/a | | | soda-core-snowflake |
- | `bigquery` | n/a | | | soda-core-bigquery |
- | `redshift` | n/a | | Coming soon | TBD |
- | `databricks` | n/a | Support for Databricks SQL with Unity catalog and Hive metastore. | ✅ | soda-core-spark |
- | `databricks` | n/a | Support for Spark for programmatic use in Notebooks. | | soda-core-spark-df |
- | `kafka` | `json` | Experimental. | | pyspark<br>soda-core-spark-df |
- | `kafka` | `avro` | | Coming soon | TBD |
- | `kafka` | `protobuf` | | Coming soon | TBD |
- | `local` | `parquet` | | | soda-core-duckdb |
- | `local` | `json` | Support for `new_line` delimited JSON files and one JSON record per file. | | fastjsonschema<br> soda-core-duckdb |
- | `local` | `csv` | | ✅ | soda-core-duckdb |
+ Internally, it connects with DuckDB, Spark, or a native connection and executes the most tests with soda-core and fastjsonschema.
+ Credentials are read from the environment variables.
+
+ Supported server types:
+
+ | Type | Format | Status |
+ |--------------|------------|--------------------------------------------------------------------|
+ | `s3` | `parquet` | ✅ |
+ | `s3` | `json` | ✅ |
+ | `s3` | `csv` | ✅ |
+ | `s3` | `delta` | Coming soon ([#24](https://github.com/datacontract/cli/issues/24)) |
+ | `s3` | `iceberg` | Coming soon |
+ | `postgres` | n/a | ✅ |
+ | `snowflake` | n/a | ✅ |
+ | `bigquery` | n/a | |
+ | `redshift` | n/a | Coming soon |
+ | `databricks` | n/a | ✅ |
+ | `kafka` | `json` | ✅ |
+ | `kafka` | `avro` | Coming soon |
+ | `kafka` | `protobuf` | Coming soon |
+ | `local` | `parquet` | ✅ |
+ | `local` | `json` | ✅ |
+ | `local` | `csv` | ✅ |
 
  Feel free to create an issue, if you need support for an additional type.
 
@@ -490,17 +543,46 @@ servers:
 
 
 
- ### Exports
+ ### export
+
+ ```
+ Usage: datacontract export [OPTIONS] [LOCATION]
+
+ Convert data contract to a specific format. console.prints to stdout.
+
+ ╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
+ │ location [LOCATION] The location (url or path) of the data contract yaml. [default: datacontract.yaml] │
+ ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+ ╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
+ │ * --format [html|jsonschema|pydantic-model|sodacl|dbt|dbt-sources|dbt-staging-sql|odcs|rd The export format. [default: None] [required] │
+ │ f|avro|protobuf|great-expectations|terraform|avro-idl|sql|sql-query] │
+ │ --server TEXT The server name to export. [default: None] │
+ │ --model TEXT Use the key of the model in the data contract yaml file to refer to a │
+ │ model, e.g., `orders`, or `all` for all models (default). │
+ │ [default: all] │
+ │ --help Show this message and exit. │
+ ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+ ╭─ RDF Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
+ │ --rdf-base TEXT [rdf] The base URI used to generate the RDF graph. [default: None] │
+ ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+ ╭─ SQL Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
+ │ --sql-server-type TEXT [sql] The server type to determine the sql dialect. By default, it uses 'auto' to automatically detect the sql dialect via the specified │
+ │ servers in the data contract. │
+ │ [default: auto] │
+ ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+
+ ```
 
  ```bash
- # Example export to dbt model
- datacontract export --format dbt
+ # Example export data contract as HTML
+ datacontract export --format html > datacontract.html
  ```
 
  Available export options:
 
  | Type | Description | Status |
  |----------------------|---------------------------------------------------------|--------|
+ | `html` | Export to HTML | ✅ |
  | `jsonschema` | Export to JSON Schema | ✅ |
  | `odcs` | Export to Open Data Contract Standard (ODCS) | ✅ |
  | `sodacl` | Export to SodaCL quality checks in YAML format | ✅ |
@@ -516,15 +598,17 @@ Available export options:
516
598
  | `great-expectations` | Export to Great Expectations Suites in JSON Format | ✅ |
517
599
  | `bigquery` | Export to BigQuery Schemas | TBD |
518
600
  | `pydantic` | Export to pydantic models | TBD |
519
- | `html` | Export to HTML page | TBD |
520
601
  | Missing something? | Please create an issue on GitHub | TBD |
521
602
 
522
603
  #### Great Expectations
604
+
523
605
  The export function transforms a specified data contract into a comprehensive Great Expectations JSON suite.
524
606
  If the contract includes multiple models, you need to specify the name of the model you wish to export.
607
+
525
608
  ```shell
526
609
  datacontract export datacontract.yaml --format great-expectations --model orders
527
610
  ```
611
+
528
612
  The export creates a list of expectations by utilizing:
529
613
 
530
614
  - The data from the Model definition with a fixed mapping
@@ -554,8 +638,21 @@ Having the data contract inside an RDF Graph gives us access the following use c
554
638
  - Apply graph algorithms on multiple data contracts (Find similar data contracts, find "gatekeeper"
555
639
  data products, find the true domain owner of a field attribute)
556
640
 
557
- ### Imports
641
+ ### import
642
+
643
+ ```
644
+ Usage: datacontract import [OPTIONS]
645
+
646
+ Create a data contract from the given source file. Prints to stdout.
647
+
648
+ ╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────╮
649
+ │ * --format [sql|avro] The format of the source file. [default: None] [required] │
650
+ │ * --source TEXT The path to the file that should be imported. [default: None] [required] │
651
+ │ --help Show this message and exit. │
652
+ ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
653
+ ```
558
654
 
655
+ Example:
559
656
  ```bash
560
657
  # Example import from SQL DDL
561
658
  datacontract import --format sql --source my_ddl.sql
@@ -574,6 +671,103 @@ Available import options:
574
671
  | `odcs` | Import from Open Data Contract Standard (ODCS) | TBD |
575
672
  | Missing something? | Please create an issue on GitHub | TBD |
576
673
 
674
+
675
+ ### breaking
676
+
677
+ ```
678
+ Usage: datacontract breaking [OPTIONS] LOCATION_OLD LOCATION_NEW
679
+
680
+ Identifies breaking changes between data contracts. Prints to stdout.
681
+
682
+ ╭─ Arguments ───────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
683
+ │ * location_old TEXT The location (url or path) of the old data contract yaml. [default: None] [required] │
684
+ │ * location_new TEXT The location (url or path) of the new data contract yaml. [default: None] [required] │
685
+ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
686
+ ╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
687
+ │ --help Show this message and exit. │
688
+ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
689
+ ```
690
+
691
+ ### changelog
692
+
693
+ ```
694
+ Usage: datacontract changelog [OPTIONS] LOCATION_OLD LOCATION_NEW
695
+
696
+ Generate a changelog between data contracts. Prints to stdout.
697
+
698
+ ╭─ Arguments ───────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
699
+ │ * location_old TEXT The location (url or path) of the old data contract yaml. [default: None] [required] │
700
+ │ * location_new TEXT The location (url or path) of the new data contract yaml. [default: None] [required] │
701
+ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
702
+ ╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
703
+ │ --help Show this message and exit. │
704
+ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
705
+ ```
706
+
707
+ ### diff
708
+
709
+ ```
710
+ Usage: datacontract diff [OPTIONS] LOCATION_OLD LOCATION_NEW
711
+
712
+ PLACEHOLDER. Currently works as 'changelog' does.
713
+
714
+ ╭─ Arguments ───────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
715
+ │ * location_old TEXT The location (url or path) of the old data contract yaml. [default: None] [required] │
716
+ │ * location_new TEXT The location (url or path) of the new data contract yaml. [default: None] [required] │
717
+ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
718
+ ╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
719
+ │ --help Show this message and exit. │
720
+ ╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
721
+ ```
722
+
723
+
724
+ ## Integrations
725
+
726
+ | Integration | Option | Description |
727
+ |-------------------|------------------------------|-------------------------------------------------------------------------------------------------------|
728
+ | Data Mesh Manager | `--publish` | Push full results to the [Data Mesh Manager API](https://api.datamesh-manager.com/swagger/index.html) |
729
+ | OpenTelemetry | `--publish-to-opentelemetry` | Push result as gauge metrics |
730
+
731
+ ### Integration with Data Mesh Manager
732
+
733
+ If you use [Data Mesh Manager](https://datamesh-manager.com/), you can pass the data contract URL and append the `--publish` option to send the test results to Data Mesh Manager, where they are displayed. Set the `DATAMESH_MANAGER_API_KEY` environment variable to your API key.
734
+
735
+ ```bash
736
+ # Fetch current data contract, execute tests on production, and publish result to Data Mesh Manager
737
+ $ export DATAMESH_MANAGER_API_KEY=xxx
738
+ $ datacontract test https://demo.datamesh-manager.com/demo279750347121/datacontracts/4df9d6ee-e55d-4088-9598-b635b2fdcbbc/datacontract.yaml --server production --publish
739
+ ```
740
+
741
+ ### Integration with OpenTelemetry
742
+
743
+ If you use OpenTelemetry, you can pass the data contract URL and append the `--publish-to-opentelemetry` option to send the test results to your OTLP-compatible instance, e.g., Prometheus.
744
+
745
+ The metric name is "datacontract.cli.test.result" and it uses the following encoding for the result:
746
+
747
+ | datacontract.cli.test.result | Description |
748
+ |-------|---------------------------------------|
749
+ | 0 | test run passed, no warnings |
750
+ | 1 | test run has warnings |
751
+ | 2 | test run failed |
752
+ | 3 | test run not possible due to an error |
753
+ | 4 | test status unknown |
754
+
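The encoding in the table above can be sketched in Python as follows. This is only an illustration of the numeric mapping: the `ResultGauge` enum, the `result_to_gauge` helper, and the status strings are hypothetical names chosen for this example, not identifiers taken from the CLI.

```python
from enum import IntEnum


class ResultGauge(IntEnum):
    """Gauge values for datacontract.cli.test.result, as listed in the table."""

    PASSED = 0   # test run passed, no warnings
    WARNING = 1  # test run has warnings
    FAILED = 2   # test run failed
    ERROR = 3    # test run not possible due to an error
    UNKNOWN = 4  # test status unknown


def result_to_gauge(status: str) -> int:
    """Map a status label to its gauge value.

    The labels ("passed", "warning", ...) are assumptions made for this
    sketch; anything unrecognised falls back to UNKNOWN (4).
    """
    mapping = {
        "passed": ResultGauge.PASSED,
        "warning": ResultGauge.WARNING,
        "failed": ResultGauge.FAILED,
        "error": ResultGauge.ERROR,
    }
    return int(mapping.get(status.lower(), ResultGauge.UNKNOWN))
```

A dashboard or alert rule could then treat any value of 2 or higher as a run that needs attention.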
755
+
756
+ ```bash
757
+ # Fetch current data contract, execute tests on production, and publish result to OpenTelemetry
758
+ $ export OTEL_SERVICE_NAME=datacontract-cli
759
+ $ export OTEL_EXPORTER_OTLP_ENDPOINT=https://YOUR_ID.apm.westeurope.azure.elastic-cloud.com:443
760
+ $ export OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer%20secret # Optional, when using SaaS Products
761
+ $ export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf # Optional, default is http/protobuf - use value grpc to use the gRPC protocol instead
762
+ # Send to OpenTelemetry
763
+ $ datacontract test https://demo.datamesh-manager.com/demo279750347121/datacontracts/4df9d6ee-e55d-4088-9598-b635b2fdcbbc/datacontract.yaml --server production --publish-to-opentelemetry
764
+ ```
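The `OTEL_EXPORTER_OTLP_HEADERS` value above uses `%20` instead of a space. A minimal sketch of how such a value decodes, assuming the usual OpenTelemetry convention of comma-separated `key=value` pairs with percent-encoded values; `parse_otlp_headers` is a hypothetical helper written for this example and is not part of the CLI or the OpenTelemetry SDK, which may normalise keys differently:

```python
from urllib.parse import unquote


def parse_otlp_headers(raw: str) -> dict[str, str]:
    """Decode a comma-separated list of key=value pairs.

    Values are percent-decoded, so "Bearer%20secret" becomes "Bearer secret".
    Pairs without "=" are skipped.
    """
    headers: dict[str, str] = {}
    for pair in raw.split(","):
        if "=" not in pair:
            continue
        # Split on the first "=" only, so values may contain "=" themselves
        key, value = pair.split("=", 1)
        headers[key.strip()] = unquote(value.strip())
    return headers


print(parse_otlp_headers("Authorization=Bearer%20secret"))
# {'Authorization': 'Bearer secret'}
```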
765
+
766
+ Current limitations:
767
+ - Only the ConsoleExporter and the OTLP exporter are supported
768
+ - Metrics only; logs are not supported yet (but loosely planned)
769
+
770
+
577
771
  ## Best Practices
578
772
 
579
773
  We share best practices in using the Data Contract CLI.
@@ -697,25 +891,40 @@ ruff format --check
697
891
  pytest
698
892
  ```
699
893
 
700
- Release
894
+
895
+ ### Docker Build
701
896
 
702
897
  ```bash
703
- git tag v0.9.0
704
- git push origin v0.9.0
705
- python3 -m pip install --upgrade build twine
706
- rm -r dist/
707
- python3 -m build
708
- # for now only test.pypi.org
709
- python3 -m twine upload --repository testpypi dist/*
898
+ docker build -t datacontract/cli .
899
+ docker run --rm -v ${PWD}:/home/datacontract datacontract/cli
710
900
  ```
711
901
 
712
- Docker Build
902
+ #### Docker Compose integration
903
+
904
+ We've included a [docker-compose.yml](./docker-compose.yml) configuration to simplify the build, test, and deployment of the image.
905
+
906
+ ##### Building the Image with Docker Compose
907
+
908
+ To build the Docker image using Docker Compose, run the following command:
713
909
 
714
910
  ```bash
715
- docker build -t datacontract/cli .
716
- docker run --rm -v ${PWD}:/home/datacontract datacontract/cli
911
+ docker compose build
717
912
  ```
718
913
 
914
+ This command builds the image from the settings in `docker-compose.yml`, such as the build context and Dockerfile location, so you do not need to pass them manually each time.
915
+
916
+ ##### Testing the Image
917
+
918
+ After building the image, you can test it directly with Docker Compose:
919
+
920
+ ```bash
921
+ docker compose run --rm datacontract --version
922
+ ```
923
+
924
+ This command runs the container momentarily to check the version of the `datacontract` CLI. The `--rm` flag ensures that the container is automatically removed after the command executes, keeping your environment clean.
925
+
926
+
927
+
719
928
  ## Release Steps
720
929
 
721
930
  1. Update the version in `pyproject.toml`