datacontract-cli 0.10.8__tar.gz → 0.10.10__tar.gz
- {datacontract_cli-0.10.8/datacontract_cli.egg-info → datacontract_cli-0.10.10}/PKG-INFO +310 -122
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/README.md +292 -108
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/catalog/catalog.py +4 -2
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/cli.py +36 -18
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/data_contract.py +13 -53
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/soda/check_soda_execute.py +10 -2
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/soda/connections/duckdb.py +32 -12
- datacontract_cli-0.10.10/datacontract/engines/soda/connections/trino.py +26 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/avro_converter.py +1 -1
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/exporter.py +3 -2
- datacontract_cli-0.10.10/datacontract/export/exporter_factory.py +145 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/jsonschema_converter.py +7 -7
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/sodacl_converter.py +17 -12
- datacontract_cli-0.10.10/datacontract/export/spark_converter.py +211 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/sql_type_converter.py +28 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/imports/avro_importer.py +149 -7
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/imports/bigquery_importer.py +17 -0
- datacontract_cli-0.10.10/datacontract/imports/dbt_importer.py +117 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/imports/glue_importer.py +116 -33
- datacontract_cli-0.10.10/datacontract/imports/importer.py +34 -0
- datacontract_cli-0.10.10/datacontract/imports/importer_factory.py +90 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/imports/jsonschema_importer.py +14 -3
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/imports/odcs_importer.py +8 -0
- datacontract_cli-0.10.10/datacontract/imports/spark_importer.py +134 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/imports/sql_importer.py +8 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/imports/unity_importer.py +23 -9
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/integration/publish_datamesh_manager.py +10 -5
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/resolve.py +87 -21
- datacontract_cli-0.10.10/datacontract/lint/schema.py +47 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/model/data_contract_specification.py +37 -4
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/templates/datacontract.html +18 -3
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/templates/index.html +1 -1
- datacontract_cli-0.10.10/datacontract/templates/partials/datacontract_information.html +86 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/templates/partials/datacontract_terms.html +7 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/templates/partials/definition.html +9 -1
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/templates/partials/model_field.html +23 -6
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/templates/partials/server.html +49 -16
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/templates/style/output.css +42 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10/datacontract_cli.egg-info}/PKG-INFO +310 -122
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract_cli.egg-info/SOURCES.txt +12 -1
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract_cli.egg-info/requires.txt +15 -10
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/pyproject.toml +23 -16
- datacontract_cli-0.10.8/tests/test_schema.py → datacontract_cli-0.10.10/tests/test_export_complex_data_contract.py +41 -3
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_custom_exporter.py +2 -1
- datacontract_cli-0.10.10/tests/test_export_spark.py +142 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_import_avro.py +41 -4
- datacontract_cli-0.10.10/tests/test_import_dbt.py +255 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_import_glue.py +10 -1
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_import_jsonschema.py +1 -1
- datacontract_cli-0.10.10/tests/test_import_spark.py +162 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_import_unity_file.py +1 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_integration_datameshmanager.py +1 -1
- datacontract_cli-0.10.10/tests/test_resolve.py +114 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_dataframe.py +30 -21
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_postgres.py +23 -10
- datacontract_cli-0.10.10/tests/test_test_trino.py +99 -0
- datacontract_cli-0.10.8/datacontract/export/exporter_factory.py +0 -52
- datacontract_cli-0.10.8/datacontract/lint/schema.py +0 -27
- datacontract_cli-0.10.8/datacontract/templates/partials/datacontract_information.html +0 -66
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/LICENSE +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/MANIFEST.in +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/__init__.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/breaking/breaking.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/breaking/breaking_rules.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/__init__.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/datacontract/check_that_datacontract_contains_valid_servers_configuration.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/datacontract/check_that_datacontract_file_exists.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/datacontract/check_that_datacontract_str_is_valid.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/fastjsonschema/check_jsonschema.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/fastjsonschema/s3/s3_read_files.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/soda/__init__.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/soda/connections/bigquery.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/soda/connections/dask.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/soda/connections/databricks.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/soda/connections/kafka.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/soda/connections/postgres.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/soda/connections/snowflake.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/engines/soda/connections/sqlserver.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/__init__.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/avro_idl_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/bigquery_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/csv_type_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/dbml_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/dbt_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/go_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/great_expectations_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/html_export.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/odcs_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/protobuf_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/pydantic_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/rdf_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/sql_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/export/terraform_converter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/init/download_datacontract_file.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/integration/publish_opentelemetry.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/files.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/lint.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/linters/__init__.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/linters/description_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/linters/example_model_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/linters/field_pattern_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/linters/field_reference_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/linters/notice_period_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/linters/quality_schema_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/linters/valid_constraints_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/lint/urls.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/model/breaking_change.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/model/exceptions.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/model/run.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/publish/publish.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/py.typed +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/templates/partials/datacontract_servicelevels.html +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/templates/partials/example.html +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract/web.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract_cli.egg-info/dependency_links.txt +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract_cli.egg-info/entry_points.txt +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/datacontract_cli.egg-info/top_level.txt +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/setup.cfg +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_breaking.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_catalog.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_changelog.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_cli.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_description_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_documentation_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_download_datacontract_file.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_example_model_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_avro.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_avro_idl.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_bigquery.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_dbml.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_dbt_models.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_dbt_sources.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_dbt_staging_sql.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_go.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_great_expectations.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_html.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_jsonschema.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_odcs.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_protobuf.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_pydantic.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_rdf.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_sodacl.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_sql.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_sql_query.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_export_terraform.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_field_constraint_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_field_pattern_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_field_reference_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_import_bigquery.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_import_odcs.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_import_sql.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_integration_opentelemetry.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_lint.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_notice_period_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_quality_schema_linter.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_roundtrip_jsonschema.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_azure_parquet_remote.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_bigquery.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_databricks.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_examples_csv.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_examples_formats_valid.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_examples_inline.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_examples_json.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_examples_missing.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_kafka.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_kafka_remote.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_local_json.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_parquet.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_s3_csv.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_s3_delta.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_s3_json.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_s3_json_complex.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_s3_json_multiple_models.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_s3_json_remote.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_snowflake.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_test_sqlserver.py +0 -0
- {datacontract_cli-0.10.8 → datacontract_cli-0.10.10}/tests/test_web.py +0 -0
@@ -1,10 +1,10 @@
 Metadata-Version: 2.1
 Name: datacontract-cli
-Version: 0.10.
-Summary:
-Author-email: Jochen Christ <jochen.christ@innoq.com>, Stefan Negele <stefan.negele@innoq.com>
+Version: 0.10.10
+Summary: The datacontract CLI is an open source command-line tool for working with Data Contracts. It uses data contract YAML files to lint the data contract, connect to data sources and execute schema and quality tests, detect breaking changes, and export to different formats. The tool is written in Python. It can be used as a standalone CLI tool, in a CI/CD pipeline, or directly as a Python library.
+Author-email: Jochen Christ <jochen.christ@innoq.com>, Stefan Negele <stefan.negele@innoq.com>, Simon Harrer <simon.harrer@innoq.com>
 Project-URL: Homepage, https://cli.datacontract.com
-Project-URL: Issues, https://github.com/datacontract/cli/issues
+Project-URL: Issues, https://github.com/datacontract/datacontract-cli/issues
 Classifier: Programming Language :: Python :: 3
 Classifier: License :: OSI Approved :: MIT License
 Classifier: Operating System :: OS Independent
@@ -12,10 +12,10 @@ Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 License-File: LICENSE
 Requires-Dist: typer[all]<0.13,>=0.9
-Requires-Dist: pydantic<2.
+Requires-Dist: pydantic<2.9.0,>=2.8.2
 Requires-Dist: pyyaml~=6.0.1
 Requires-Dist: requests<2.33,>=2.31
-Requires-Dist: fastapi==0.111.
+Requires-Dist: fastapi==0.111.1
 Requires-Dist: fastparquet==2024.5.0
 Requires-Dist: python-multipart==0.0.9
 Requires-Dist: rich~=13.7.0
@@ -28,8 +28,8 @@ Requires-Dist: python-dotenv~=1.0.0
 Requires-Dist: rdflib==7.0.0
 Requires-Dist: opentelemetry-exporter-otlp-proto-grpc~=1.16
 Requires-Dist: opentelemetry-exporter-otlp-proto-http~=1.16
-Requires-Dist: boto3<1.34.
-Requires-Dist: botocore<1.34.
+Requires-Dist: boto3<1.34.137,>=1.34.41
+Requires-Dist: botocore<1.34.137,>=1.34.41
 Requires-Dist: jinja_partials>=0.2.1
 Provides-Extra: avro
 Requires-Dist: avro==1.11.3; extra == "avro"
@@ -37,7 +37,7 @@ Provides-Extra: bigquery
 Requires-Dist: soda-core-bigquery<3.4.0,>=3.3.1; extra == "bigquery"
 Provides-Extra: databricks
 Requires-Dist: soda-core-spark-df<3.4.0,>=3.3.1; extra == "databricks"
-Requires-Dist: databricks-sql-connector<3.
+Requires-Dist: databricks-sql-connector<3.3.0,>=3.1.2; extra == "databricks"
 Requires-Dist: soda-core-spark[databricks]<3.4.0,>=3.3.1; extra == "databricks"
 Provides-Extra: deltalake
 Requires-Dist: deltalake<0.19,>=0.17; extra == "deltalake"
@@ -47,14 +47,16 @@ Requires-Dist: soda-core-spark-df<3.4.0,>=3.3.1; extra == "kafka"
 Provides-Extra: postgres
 Requires-Dist: soda-core-postgres<3.4.0,>=3.3.1; extra == "postgres"
 Provides-Extra: s3
-Requires-Dist: s3fs==2024.6.
+Requires-Dist: s3fs==2024.6.1; extra == "s3"
 Provides-Extra: snowflake
-Requires-Dist: snowflake-connector-python[pandas]<3.
+Requires-Dist: snowflake-connector-python[pandas]<3.12,>=3.6; extra == "snowflake"
 Requires-Dist: soda-core-snowflake<3.4.0,>=3.3.1; extra == "snowflake"
 Provides-Extra: sqlserver
 Requires-Dist: soda-core-sqlserver<3.4.0,>=3.3.1; extra == "sqlserver"
+Provides-Extra: trino
+Requires-Dist: soda-core-trino<3.4.0,>=3.3.1; extra == "trino"
 Provides-Extra: all
-Requires-Dist: datacontract-cli[bigquery,databricks,deltalake,kafka,postgres,s3,snowflake,sqlserver]; extra == "all"
+Requires-Dist: datacontract-cli[bigquery,databricks,deltalake,kafka,postgres,s3,snowflake,sqlserver,trino]; extra == "all"
 Provides-Extra: dev
 Requires-Dist: datacontract-cli[all]; extra == "dev"
 Requires-Dist: httpx==0.27.0; extra == "dev"
@@ -62,10 +64,12 @@ Requires-Dist: ruff; extra == "dev"
 Requires-Dist: pre-commit~=3.7.1; extra == "dev"
 Requires-Dist: pytest; extra == "dev"
 Requires-Dist: pytest-xdist; extra == "dev"
-Requires-Dist: moto; extra == "dev"
+Requires-Dist: moto==5.0.11; extra == "dev"
 Requires-Dist: pymssql==2.3.0; extra == "dev"
 Requires-Dist: kafka-python; extra == "dev"
-Requires-Dist:
+Requires-Dist: trino==0.329.0; extra == "dev"
+Requires-Dist: testcontainers<4.8,>=4.5; extra == "dev"
+Requires-Dist: testcontainers[core]; extra == "dev"
 Requires-Dist: testcontainers[minio]; extra == "dev"
 Requires-Dist: testcontainers[postgres]; extra == "dev"
 Requires-Dist: testcontainers[kafka]; extra == "dev"
@@ -258,17 +262,18 @@ pip install datacontract-cli[all]
 
 A list of available extras:
 
-| Dependency
-
-| Avro Support
-| Google BigQuery
-| Databricks Integration
-| Deltalake Integration
-| Kafka Integration
-| PostgreSQL Integration
-| S3 Integration
-| Snowflake Integration
-| Microsoft SQL Server
+| Dependency | Installation Command |
+|------------------------|--------------------------------------------|
+| Avro Support | `pip install datacontract-cli[avro]` |
+| Google BigQuery | `pip install datacontract-cli[bigquery]` |
+| Databricks Integration | `pip install datacontract-cli[databricks]` |
+| Deltalake Integration | `pip install datacontract-cli[deltalake]` |
+| Kafka Integration | `pip install datacontract-cli[kafka]` |
+| PostgreSQL Integration | `pip install datacontract-cli[postgres]` |
+| S3 Integration | `pip install datacontract-cli[s3]` |
+| Snowflake Integration | `pip install datacontract-cli[snowflake]` |
+| Microsoft SQL Server | `pip install datacontract-cli[sqlserver]` |
+| Trino | `pip install datacontract-cli[trino]` |
 
 
 
@@ -295,16 +300,16 @@ Commands
 Download a datacontract.yaml template and write it to file.
 
 ╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────╮
-│ location [LOCATION] The location (url or path) of the data contract yaml to create.
-│ [default: datacontract.yaml]
+│ location [LOCATION] The location (url or path) of the data contract yaml to create. │
+│ [default: datacontract.yaml] │
 ╰──────────────────────────────────────────────────────────────────────────────────────────────╯
 ╭─ Options ────────────────────────────────────────────────────────────────────────────────────╮
-│ --template TEXT URL of a template or data contract
-│ [default:
-│ https://datacontract.com/datacontract.init.yaml]
-│ --overwrite --no-overwrite Replace the existing datacontract.yaml
-│ [default: no-overwrite]
-│ --help Show this message and exit.
+│ --template TEXT URL of a template or data contract │
+│ [default: │
+│ https://datacontract.com/datacontract.init.yaml] │
+│ --overwrite --no-overwrite Replace the existing datacontract.yaml │
+│ [default: no-overwrite] │
+│ --help Show this message and exit. │
 ╰──────────────────────────────────────────────────────────────────────────────────────────────╯
 ```
 
@@ -316,12 +321,12 @@ Commands
 Validate that the datacontract.yaml is correctly formatted.
 
 ╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
-│ location [LOCATION] The location (url or path) of the data contract yaml. [default: datacontract.yaml]
+│ location [LOCATION] The location (url or path) of the data contract yaml. [default: datacontract.yaml] │
 ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
 ╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
-│ --schema TEXT The location (url or path) of the Data Contract Specification JSON Schema
-│ [default: https://datacontract.com/datacontract.schema.json]
-│ --help Show this message and exit.
+│ --schema TEXT The location (url or path) of the Data Contract Specification JSON Schema │
+│ [default: https://datacontract.com/datacontract.schema.json] │
+│ --help Show this message and exit. │
 ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
 ```
 
@@ -333,28 +338,28 @@ Commands
 Run schema and quality tests on configured servers.
 
 ╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
-│ location [LOCATION] The location (url or path) of the data contract yaml. [default: datacontract.yaml]
+│ location [LOCATION] The location (url or path) of the data contract yaml. [default: datacontract.yaml] │
 ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
 ╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
-│ --schema TEXT The location (url or path) of the Data Contract
-│ Specification JSON Schema
-│ [default:
-│ https://datacontract.com/datacontract.schema.json]
-│ --server TEXT The server configuration to run the schema and quality
-│ tests. Use the key of the server object in the data
-│ contract yaml file to refer to a server, e.g.,
-│ `production`, or `all` for all servers (default).
-│ [default: all]
-│ --examples --no-examples Run the schema and quality tests on the example data
-│ within the data contract.
-│ [default: no-examples]
-│ --publish TEXT The url to publish the results after the test
-│ [default: None]
-│ --publish-to-opentelemetry --no-publish-to-opentelemetry Publish the results to opentelemetry. Use environment
-│ variables to configure the OTLP endpoint, headers, etc.
-│ [default: no-publish-to-opentelemetry]
-│ --logs --no-logs Print logs [default: no-logs]
-│ --help Show this message and exit.
+│ --schema TEXT The location (url or path) of the Data Contract │
+│ Specification JSON Schema │
+│ [default: │
+│ https://datacontract.com/datacontract.schema.json] │
+│ --server TEXT The server configuration to run the schema and quality │
+│ tests. Use the key of the server object in the data │
+│ contract yaml file to refer to a server, e.g., │
+│ `production`, or `all` for all servers (default). │
+│ [default: all] │
+│ --examples --no-examples Run the schema and quality tests on the example data │
+│ within the data contract. │
+│ [default: no-examples] │
+│ --publish TEXT The url to publish the results after the test │
+│ [default: None] │
+│ --publish-to-opentelemetry --no-publish-to-opentelemetry Publish the results to opentelemetry. Use environment │
+│ variables to configure the OTLP endpoint, headers, etc. │
+│ [default: no-publish-to-opentelemetry] │
+│ --logs --no-logs Print logs [default: no-logs] │
+│ --help Show this message and exit. │
 ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
 ```
 
@@ -384,6 +389,7 @@ Supported server types:
 - [snowflake](#snowflake)
 - [kafka](#kafka)
 - [postgres](#postgres)
+- [trino](#trino)
 - [local](#local)
 
 Supported formats:
@@ -429,11 +435,12 @@ servers:
 
 #### Environment Variables
 
-| Environment Variable
-
-| `DATACONTRACT_S3_REGION` | `eu-central-1` | Region of S3 bucket
-| `DATACONTRACT_S3_ACCESS_KEY_ID` | `AKIAXV5Q5QABCDEFGH` | AWS Access Key ID
-| `DATACONTRACT_S3_SECRET_ACCESS_KEY` | `93S7LRrJcqLaaaa/XXXXXXXXXXXXX` | AWS Secret Access Key
+| Environment Variable | Example | Description |
+|-------------------------------------|---------------------------------|----------------------------------------|
+| `DATACONTRACT_S3_REGION` | `eu-central-1` | Region of S3 bucket |
+| `DATACONTRACT_S3_ACCESS_KEY_ID` | `AKIAXV5Q5QABCDEFGH` | AWS Access Key ID |
+| `DATACONTRACT_S3_SECRET_ACCESS_KEY` | `93S7LRrJcqLaaaa/XXXXXXXXXXXXX` | AWS Secret Access Key |
+| `DATACONTRACT_S3_SESSION_TOKEN` | `AQoDYXdzEJr...` | AWS temporary session token (optional) |
 
 
 
@@ -466,7 +473,6 @@ models:
 | `DATACONTRACT_BIGQUERY_ACCOUNT_INFO_JSON_PATH` | `~/service-access-key.json` | Service Access key as saved on key creation by BigQuery. If this environment variable isn't set, the cli tries to use `GOOGLE_APPLICATION_CREDENTIALS` as a fallback, so if you have that set for using their Python library anyway, it should work seamlessly. |
 
 
-
 ### Azure
 
 Data Contract CLI can test data that is stored in Azure Blob storage or Azure Data Lake Storage (Gen2) (ADLS) in various formats.
@@ -486,11 +492,11 @@ servers:
|
|
|
486
492
|
|
|
487
493
|
Authentication works with an Azure Service Principal (SPN) aka App Registration with a secret.
|
|
488
494
|
|
|
489
|
-
| Environment Variable
|
|
490
|
-
|
|
491
|
-
| `DATACONTRACT_AZURE_TENANT_ID`
|
|
492
|
-
| `DATACONTRACT_AZURE_CLIENT_ID` | `3cf7ce49-e2e9-4cbc-a922-4328d4a58622`
|
|
493
|
-
| `DATACONTRACT_AZURE_CLIENT_SECRET` | `yZK8Q~GWO1MMXXXXXXXXXXXXX`
|
|
495
|
+
| Environment Variable | Example | Description |
|
|
496
|
+
|------------------------------------|----------------------------------------|------------------------------------------------------|
|
|
497
|
+
| `DATACONTRACT_AZURE_TENANT_ID` | `79f5b80f-10ff-40b9-9d1f-774b42d605fc` | The Azure Tenant ID |
|
|
498
|
+
| `DATACONTRACT_AZURE_CLIENT_ID` | `3cf7ce49-e2e9-4cbc-a922-4328d4a58622` | The ApplicationID / ClientID of the app registration |
|
|
499
|
+
| `DATACONTRACT_AZURE_CLIENT_SECRET` | `yZK8Q~GWO1MMXXXXXXXXXXXXX` | The Client Secret value |
|
|
494
500
|
|
|
495
501
|
|
|
496
502
|
|
|
@@ -520,13 +526,13 @@ models:
|
|
|
520
526
|
|
|
521
527
|
#### Environment Variables
|
|
522
528
|
|
|
523
|
-
| Environment Variable
|
|
524
|
-
|
|
525
|
-
| `DATACONTRACT_SQLSERVER_USERNAME`
|
|
526
|
-
| `DATACONTRACT_SQLSERVER_PASSWORD`
|
|
527
|
-
| `DATACONTRACT_SQLSERVER_TRUSTED_CONNECTION`
|
|
528
|
-
| `DATACONTRACT_SQLSERVER_TRUST_SERVER_CERTIFICATE` | `True` | Trust self-signed certificate
|
|
529
|
-
| `DATACONTRACT_SQLSERVER_ENCRYPTED_CONNECTION`
|
|
529
|
+
| Environment Variable | Example| Description |
|
|
530
|
+
|---------------------------------------------------|--------|----------------------------------------------|
|
|
531
|
+
| `DATACONTRACT_SQLSERVER_USERNAME` | `root` | Username |
|
|
532
|
+
| `DATACONTRACT_SQLSERVER_PASSWORD` | `toor` | Password |
|
|
533
|
+
| `DATACONTRACT_SQLSERVER_TRUSTED_CONNECTION` | `True` | Use Windows authentication instead of login |
|
|
534
|
+
| `DATACONTRACT_SQLSERVER_TRUST_SERVER_CERTIFICATE` | `True` | Trust self-signed certificate |
|
|
535
|
+
| `DATACONTRACT_SQLSERVER_ENCRYPTED_CONNECTION` | `True` | Use SSL |
|
|
530
536
|
|
|
531
537
|
|
|
532
538
|
|
|
@@ -557,8 +563,8 @@ models:
|
|
|
557
563
|
|
|
558
564
|
| Environment Variable | Example | Description |
|
|
559
565
|
|----------------------------------------------|--------------------------------------|-------------------------------------------------------|
|
|
560
|
-
| `DATACONTRACT_DATABRICKS_TOKEN`
|
|
561
|
-
| `DATACONTRACT_DATABRICKS_HTTP_PATH`
|
|
566
|
+
| `DATACONTRACT_DATABRICKS_TOKEN` | `dapia00000000000000000000000000000` | The personal access token to authenticate |
|
|
567
|
+
| `DATACONTRACT_DATABRICKS_HTTP_PATH` | `/sql/1.0/warehouses/b053a3ffffffff` | The HTTP path to the SQL warehouse or compute cluster |
|
|
562
568
|
|
|
563
569
|
|
|
564
570
|
### Databricks (programmatic)
|
|
@@ -603,7 +609,7 @@ run.result
|
|
|
603
609
|
|
|
604
610
|
Works with Spark DataFrames.
|
|
605
611
|
DataFrames need to be created as named temporary views.
|
|
606
|
-
Multiple temporary views are
|
|
612
|
+
Multiple temporary views are supported if your data contract contains multiple models.
|
|
607
613
|
|
|
608
614
|
Testing DataFrames is useful to test your datasets in a pipeline before writing them to a data source.
|
|
609
615
|
|
|
@@ -724,6 +730,35 @@ models:
|
|
|
724
730
|
| `DATACONTRACT_POSTGRES_PASSWORD` | `mysecretpassword` | Password |
|
|
725
731
|
|
|
726
732
|
|
|
733
|
+
### Trino
|
|
734
|
+
|
|
735
|
+
Data Contract CLI can test data in Trino.
|
|
736
|
+
|
|
737
|
+
#### Example
|
|
738
|
+
|
|
739
|
+
datacontract.yaml
|
|
740
|
+
```yaml
|
|
741
|
+
servers:
|
|
742
|
+
trino:
|
|
743
|
+
type: trino
|
|
744
|
+
host: localhost
|
|
745
|
+
port: 8080
|
|
746
|
+
catalog: my_catalog
|
|
747
|
+
schema: my_schema
|
|
748
|
+
models:
|
|
749
|
+
my_table_1: # corresponds to a table
|
|
750
|
+
type: table
|
|
751
|
+
fields:
|
|
752
|
+
my_column_1: # corresponds to a column
|
|
753
|
+
type: varchar
|
|
754
|
+
```
|
|
755
|
+
|
|
756
|
+
#### Environment Variables
|
|
757
|
+
|
|
758
|
+
| Environment Variable | Example | Description |
|
|
759
|
+
|-------------------------------|--------------------|-------------|
|
|
760
|
+
| `DATACONTRACT_TRINO_USERNAME` | `trino` | Username |
|
|
761
|
+
| `DATACONTRACT_TRINO_PASSWORD` | `mysecretpassword` | Password |
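To illustrate how the `servers` block and the environment variables fit together, here is a small sketch that merges both into one settings dictionary. The function `trino_connection_settings` and the shape of its return value are hypothetical and do not reflect the CLI's internal code. Only the field names and variable names are taken from the example above.

```python
import os
from typing import Mapping, Optional


def trino_connection_settings(server: dict, env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: combine the YAML server block with credentials from the environment."""
    env = os.environ if env is None else env
    return {
        "host": server["host"],
        "port": int(server["port"]),
        "catalog": server["catalog"],
        "schema": server["schema"],
        # Credentials are deliberately kept out of datacontract.yaml.
        "username": env.get("DATACONTRACT_TRINO_USERNAME"),
        "password": env.get("DATACONTRACT_TRINO_PASSWORD"),
    }


# The server block from the example, written as a Python dict.
server = {"type": "trino", "host": "localhost", "port": 8080,
          "catalog": "my_catalog", "schema": "my_schema"}
settings = trino_connection_settings(
    server,
    {"DATACONTRACT_TRINO_USERNAME": "trino", "DATACONTRACT_TRINO_PASSWORD": "mysecretpassword"},
)
```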
|
|
727
762
|
|
|
728
763
|
|
|
729
764
|
|
|
@@ -742,7 +777,7 @@ models:
|
|
|
742
777
|
│ * --format [jsonschema|pydantic-model|sodacl|dbt|dbt-sources|db The export format. [default: None] [required] │
|
|
743
778
|
│ t-staging-sql|odcs|rdf|avro|protobuf|great-expectati │
|
|
744
779
|
│ ons|terraform|avro-idl|sql|sql-query|html|go|bigquer │
|
|
745
|
-
│ y|dbml]
|
|
780
|
+
│ y|dbml|spark] │
|
|
746
781
|
│ --output PATH Specify the file path where the exported data will be │
|
|
747
782
|
│ saved. If no path is provided, the output will be │
|
|
748
783
|
│ printed to stdout. │
|
|
@@ -792,6 +827,7 @@ Available export options:
|
|
|
792
827
|
| `go` | Export to Go types | ✅ |
|
|
793
828
|
| `pydantic-model` | Export to pydantic models | ✅ |
|
|
794
829
|
| `DBML` | Export to a DBML Diagram description | ✅ |
|
|
830
|
+
| `spark` | Export to a Spark StructType | ✅ |
|
|
795
831
|
| Missing something? | Please create an issue on GitHub | TBD |
|
|
796
832
|
|
|
797
833
|
#### Great Expectations
|
|
@@ -838,6 +874,10 @@ The export function converts the logical data types of the datacontract into the
|
|
|
838
874
|
if a server is selected via the `--server` option (based on the `type` of that server). If no server is selected, the
|
|
839
875
|
logical data types are exported.
|
|
840
876
|
|
|
877
|
+
#### Spark
|
|
878
|
+
|
|
879
|
+
The export function converts the data contract specification into a StructType Spark schema. The returned value is Python code that represents the model schemas.
|
|
880
|
+
A Spark DataFrame schema is defined as a StructType. For more details about Spark data types, see [the Spark documentation](https://spark.apache.org/docs/latest/sql-ref-datatypes.html).
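To give a rough idea of what a StructType rendered as Python code can look like, the sketch below turns a dictionary of fields into such a code string using only the standard library. It is illustrative only: `to_struct_type_code`, the `SPARK_TYPES` mapping, and the fallback to `StringType()` are assumptions, and the actual exporter output may be formatted differently.

```python
# Hypothetical, partial mapping from contract field types to Spark type constructors.
SPARK_TYPES = {
    "varchar": "StringType()",
    "string": "StringType()",
    "integer": "IntegerType()",
    "long": "LongType()",
    "boolean": "BooleanType()",
    "timestamp": "TimestampType()",
}


def to_struct_type_code(fields: dict) -> str:
    """Render a StructType expression as a Python code string (illustrative sketch)."""
    lines = ["StructType(["]
    for name, spec in fields.items():
        # Assumption: unknown types fall back to StringType().
        spark_type = SPARK_TYPES.get(spec.get("type"), "StringType()")
        # Assumption: a field is nullable unless marked as required.
        nullable = not spec.get("required", False)
        lines.append(f'    StructField("{name}", {spark_type}, {nullable}),')
    lines.append("])")
    return "\n".join(lines)


code = to_struct_type_code({"my_column_1": {"type": "varchar"}})
```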
|
|
841
881
|
|
|
842
882
|
#### Avro
|
|
843
883
|
|
|
@@ -888,18 +928,19 @@ models:
|
|
|
888
928
|
Create a data contract from the given source location. Prints to stdout.
|
|
889
929
|
|
|
890
930
|
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
|
|
891
|
-
│ * --format [sql|avro|glue|bigquery|jsonschema|
|
|
892
|
-
│
|
|
893
|
-
│
|
|
894
|
-
│
|
|
895
|
-
│
|
|
896
|
-
│
|
|
897
|
-
│
|
|
898
|
-
│ --bigquery-
|
|
899
|
-
│ --bigquery-
|
|
900
|
-
│
|
|
901
|
-
│
|
|
902
|
-
│
|
|
931
|
+
│ * --format [sql|avro|glue|bigquery|jsonschema| The format of the source file. [default: None] [required] │
|
|
932
|
+
│ unity|spark] │
|
|
933
|
+
│ --source TEXT The path to the file or Glue Database that should be imported. │
|
|
934
|
+
│ [default: None] │
|
|
935
|
+
│ --glue-table TEXT List of table ids to import from the Glue Database (repeat for │
|
|
936
|
+
│ multiple table ids, leave empty for all tables in the dataset). │
|
|
937
|
+
│ [default: None] │
|
|
938
|
+
│ --bigquery-project TEXT The bigquery project id. [default: None] │
|
|
939
|
+
│ --bigquery-dataset TEXT The bigquery dataset id. [default: None] │
|
|
940
|
+
│ --bigquery-table TEXT List of table ids to import from the bigquery API (repeat for │
|
|
941
|
+
│ multiple table ids, leave empty for all tables in the dataset). │
|
|
942
|
+
│ [default: None] │
|
|
943
|
+
│ --help Show this message and exit. │
|
|
903
944
|
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
|
|
904
945
|
```
|
|
905
946
|
|
|
@@ -917,11 +958,11 @@ Available import options:
|
|
|
917
958
|
| `avro` | Import from AVRO schemas | ✅ |
|
|
918
959
|
| `glue` | Import from AWS Glue DataCatalog | ✅ |
|
|
919
960
|
| `protobuf` | Import from Protobuf schemas | TBD |
|
|
920
|
-
| `jsonschema` | Import from JSON Schemas | ✅
|
|
921
|
-
| `bigquery` | Import from BigQuery Schemas | ✅
|
|
961
|
+
| `jsonschema` | Import from JSON Schemas | ✅ |
|
|
962
|
+
| `bigquery` | Import from BigQuery Schemas | ✅ |
|
|
922
963
|
| `unity` | Import from Databricks Unity Catalog | partial |
|
|
923
964
|
| `dbt` | Import from dbt models | TBD |
|
|
924
|
-
| `odcs` | Import from Open Data Contract Standard (ODCS) | ✅
|
|
965
|
+
| `odcs` | Import from Open Data Contract Standard (ODCS) | ✅ |
|
|
925
966
|
| Missing something? | Please create an issue on GitHub | TBD |
|
|
926
967
|
|
|
927
968
|
|
|
@@ -964,7 +1005,7 @@ export DATABRICKS_IMPORT_ACCESS_TOKEN=<token>
|
|
|
964
1005
|
datacontract import --format unity --unity-table-full-name <table_full_name>
|
|
965
1006
|
```
|
|
966
1007
|
|
|
967
|
-
|
|
1008
|
+
#### Glue
|
|
968
1009
|
|
|
969
1010
|
Importing from Glue reads the necessary data directly from the AWS API.
|
|
970
1011
|
You may give the `glue-table` parameter to enumerate the tables that should be imported. If no tables are given, _all_ available tables of the database will be imported.
|
|
@@ -981,6 +1022,15 @@ datacontract import --format glue --source <database_name> --glue-table <table_n
|
|
|
981
1022
|
datacontract import --format glue --source <database_name>
|
|
982
1023
|
```
|
|
983
1024
|
|
|
1025
|
+
#### Spark
|
|
1026
|
+
|
|
1027
|
+
When importing from Spark tables or views, these must already be created or accessible in the Spark context. Specify the list of tables in the `source` parameter.
|
|
1028
|
+
|
|
1029
|
+
Example:
|
|
1030
|
+
|
|
1031
|
+
```bash
|
|
1032
|
+
datacontract import --format spark --source "users,orders"
|
|
1033
|
+
```
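The `source` value in this example is a comma-separated list of table names. As a sketch of how such a value could be split into individual names, assuming simple comma separation with optional surrounding whitespace (the function `parse_table_list` is hypothetical and not taken from the CLI):

```python
def parse_table_list(source: str) -> list[str]:
    """Split a comma-separated source string into table names, dropping empty entries."""
    return [name.strip() for name in source.split(",") if name.strip()]


tables = parse_table_list("users,orders")
```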
|
|
984
1034
|
|
|
985
1035
|
### breaking
|
|
986
1036
|
|
|
@@ -990,11 +1040,11 @@ datacontract import --format glue --source <database_name>
|
|
|
990
1040
|
Identifies breaking changes between data contracts. Prints to stdout.
|
|
991
1041
|
|
|
992
1042
|
╭─ Arguments ───────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
|
|
993
|
-
│ * location_old TEXT The location (url or path) of the old data contract yaml. [default: None] [required]
|
|
994
|
-
│ * location_new TEXT The location (url or path) of the new data contract yaml. [default: None] [required]
|
|
1043
|
+
│ * location_old TEXT The location (url or path) of the old data contract yaml. [default: None] [required] │
|
|
1044
|
+
│ * location_new TEXT The location (url or path) of the new data contract yaml. [default: None] [required] │
|
|
995
1045
|
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
|
|
996
1046
|
╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
|
|
997
|
-
│ --help Show this message and exit.
|
|
1047
|
+
│ --help Show this message and exit. │
|
|
998
1048
|
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
|
|
999
1049
|
```
|
|
1000
1050
|
|
|
@@ -1006,11 +1056,11 @@ datacontract import --format glue --source <database_name>
|
|
|
1006
1056
|
Generate a changelog between data contracts. Prints to stdout.
|
|
1007
1057
|
|
|
1008
1058
|
╭─ Arguments ───────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
|
|
1009
|
-
│ * location_old TEXT The location (url or path) of the old data contract yaml. [default: None] [required]
|
|
1010
|
-
│ * location_new TEXT The location (url or path) of the new data contract yaml. [default: None] [required]
|
|
1059
|
+
│ * location_old TEXT The location (url or path) of the old data contract yaml. [default: None] [required] │
|
|
1060
|
+
│ * location_new TEXT The location (url or path) of the new data contract yaml. [default: None] [required] │
|
|
1011
1061
|
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1012
1062
|
╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
|
|
1013
|
-
│ --help Show this message and exit.
|
|
1063
|
+
│ --help Show this message and exit. │
|
|
1014
1064
|
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1015
1065
|
```
|
|
1016
1066
|
|
|
@@ -1022,8 +1072,8 @@ datacontract import --format glue --source <database_name>
|
|
|
1022
1072
|
PLACEHOLDER. Currently works as 'changelog' does.
|
|
1023
1073
|
|
|
1024
1074
|
╭─ Arguments ───────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
|
|
1025
|
-
│ * location_old TEXT The location (url or path) of the old data contract yaml. [default: None] [required]
|
|
1026
|
-
│ * location_new TEXT The location (url or path) of the new data contract yaml. [default: None] [required]
|
|
1075
|
+
│ * location_old TEXT The location (url or path) of the old data contract yaml. [default: None] [required] │
|
|
1076
|
+
│ * location_new TEXT The location (url or path) of the new data contract yaml. [default: None] [required] │
|
|
1027
1077
|
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1028
1078
|
╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
|
|
1029
1079
|
│ --help Show this message and exit. │
|
|
@@ -1039,9 +1089,9 @@ datacontract import --format glue --source <database_name>
|
|
|
1039
1089
|
Create an html catalog of data contracts.
|
|
1040
1090
|
|
|
1041
1091
|
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
|
|
1042
|
-
│ --files TEXT Glob pattern for the data contract files to include in the catalog. [default: *.yaml]
|
|
1043
|
-
│ --output TEXT Output directory for the catalog html files. [default: catalog/]
|
|
1044
|
-
│ --help Show this message and exit.
|
|
1092
|
+
│ --files TEXT Glob pattern for the data contract files to include in the catalog. [default: *.yaml] │
|
|
1093
|
+
│ --output TEXT Output directory for the catalog html files. [default: catalog/] │
|
|
1094
|
+
│ --help Show this message and exit. │
|
|
1045
1095
|
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
|
|
1046
1096
|
```
|
|
1047
1097
|
|
|
@@ -1063,19 +1113,22 @@ datacontract import --format glue --source <database_name>
|
|
|
1063
1113
|
|
|
1064
1114
|
## Integrations
|
|
1065
1115
|
|
|
1066
|
-
| Integration
|
|
1067
|
-
|
|
1068
|
-
| Data Mesh Manager
|
|
1069
|
-
|
|
|
1116
|
+
| Integration | Option | Description |
|
|
1117
|
+
|-----------------------|------------------------------|---------------------------------------------------------------------------------------------------------------|
|
|
1118
|
+
| Data Mesh Manager | `--publish` | Push full results to the [Data Mesh Manager API](https://api.datamesh-manager.com/swagger/index.html) |
|
|
1119
|
+
| Data Contract Manager | `--publish` | Push full results to the [Data Contract Manager API](https://api.datacontract-manager.com/swagger/index.html) |
|
|
1120
|
+
| OpenTelemetry | `--publish-to-opentelemetry` | Push result as gauge metrics |
|
|
1070
1121
|
|
|
1071
1122
|
### Integration with Data Mesh Manager
|
|
1072
1123
|
|
|
1073
|
-
If you use [Data Mesh Manager](https://datamesh-manager.com/), you can use the data contract URL and append the `--publish` option to send and display the test results. Set an environment variable for your API key.
|
|
1124
|
+
If you use [Data Mesh Manager](https://datamesh-manager.com/) or [Data Contract Manager](https://datacontract-manager.com/), you can use the data contract URL and append the `--publish` option to send and display the test results. Set an environment variable for your API key.
|
|
1074
1125
|
|
|
1075
1126
|
```bash
|
|
1076
1127
|
# Fetch current data contract, execute tests on production, and publish result to data mesh manager
|
|
1077
1128
|
$ export DATAMESH_MANAGER_API_KEY=xxx
|
|
1078
|
-
$ datacontract test https://demo.datamesh-manager.com/demo279750347121/datacontracts/4df9d6ee-e55d-4088-9598-b635b2fdcbbc/datacontract.yaml
|
|
1129
|
+
$ datacontract test https://demo.datamesh-manager.com/demo279750347121/datacontracts/4df9d6ee-e55d-4088-9598-b635b2fdcbbc/datacontract.yaml \
|
|
1130
|
+
--server production \
|
|
1131
|
+
--publish https://api.datamesh-manager.com/api/test-results
|
|
1079
1132
|
```
|
|
1080
1133
|
|
|
1081
1134
|
### Integration with OpenTelemetry
|
|
@@ -1085,12 +1138,12 @@ If you use OpenTelemetry, you can use the data contract URL and append the `--pu
|
|
|
1085
1138
|
The metric name is "datacontract.cli.test.result" and it uses the following encoding for the result:
|
|
1086
1139
|
|
|
1087
1140
|
| datacontract.cli.test.result | Description |
|
|
1088
|
-
|
|
1089
|
-
| 0
|
|
1090
|
-
| 1
|
|
1091
|
-
| 2
|
|
1092
|
-
| 3
|
|
1093
|
-
| 4
|
|
1141
|
+
|------------------------------|---------------------------------------|
|
|
1142
|
+
| 0 | test run passed, no warnings |
|
|
1143
|
+
| 1 | test run has warnings |
|
|
1144
|
+
| 2 | test run failed |
|
|
1145
|
+
| 3 | test run not possible due to an error |
|
|
1146
|
+
| 4 | test status unknown |
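A consumer of this metric may want to translate the numeric gauge value back into a readable status. The following sketch encodes the table above as an enum. The names `ResultGauge` and `describe` are hypothetical and not part of datacontract-cli.

```python
from enum import IntEnum


class ResultGauge(IntEnum):
    """Values of the datacontract.cli.test.result gauge, as listed in the table."""
    PASSED = 0
    WARNING = 1
    FAILED = 2
    ERROR = 3
    UNKNOWN = 4


DESCRIPTIONS = {
    ResultGauge.PASSED: "test run passed, no warnings",
    ResultGauge.WARNING: "test run has warnings",
    ResultGauge.FAILED: "test run failed",
    ResultGauge.ERROR: "test run not possible due to an error",
    ResultGauge.UNKNOWN: "test status unknown",
}


def describe(value: int) -> str:
    """Return the description for a gauge value; raises ValueError for values outside 0..4."""
    return DESCRIPTIONS[ResultGauge(value)]
```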
|
|
1094
1147
|
|
|
1095
1148
|
|
|
1096
1149
|
```bash
|
|
@@ -1118,7 +1171,7 @@ Create a data contract based on the actual data. This is the fastest way to get
|
|
|
1118
1171
|
|
|
1119
1172
|
1. Use an existing physical schema (e.g., SQL DDL) as a starting point to define your logical data model in the contract. Double check right after the import whether the actual data meets the imported logical data model. Just to be sure.
|
|
1120
1173
|
```bash
|
|
1121
|
-
$ datacontract import --format sql ddl.sql
|
|
1174
|
+
$ datacontract import --format sql --source ddl.sql
|
|
1122
1175
|
$ datacontract test
|
|
1123
1176
|
```
|
|
1124
1177
|
|
|
@@ -1141,7 +1194,7 @@ Create a data contract based on the actual data. This is the fastest way to get
|
|
|
1141
1194
|
|
|
1142
1195
|
5. Set up a CI pipeline that executes daily and reports the results to the [Data Mesh Manager](https://datamesh-manager.com) or somewhere else. You can even publish to any OpenTelemetry-compatible system.
|
|
1143
1196
|
```bash
|
|
1144
|
-
$ datacontract test --publish https://api.datamesh-manager.com/api/
|
|
1197
|
+
$ datacontract test --publish https://api.datamesh-manager.com/api/test-results
|
|
1145
1198
|
```
|
|
1146
1199
|
|
|
1147
1200
|
### Contract-First
|
|
@@ -1193,7 +1246,7 @@ Examples: adding models or fields
|
|
|
1193
1246
|
- Add the models or fields in the datacontract.yaml
|
|
1194
1247
|
- Increment the minor version of the datacontract.yaml on any change. Simply edit the datacontract.yaml for this.
|
|
1195
1248
|
- You need a policy that these changes are non-breaking. That means that one cannot use the star expression in SQL to query a table under contract. Make the consequences known.
|
|
1196
|
-
- Fail the build in the Pull Request if a datacontract.yaml
|
|
1249
|
+
- Fail the build in the Pull Request if a datacontract.yaml accidentally introduces a breaking change despite only a minor version change
|
|
1197
1250
|
```bash
|
|
1198
1251
|
$ datacontract breaking datacontract-from-pr.yaml datacontract-from-main.yaml
|
|
1199
1252
|
```
|
|
@@ -1214,6 +1267,121 @@ Examples: Removing or renaming models and fields.
|
|
|
1214
1267
|
$ datacontract changelog datacontract-from-pr.yaml datacontract-from-main.yaml
|
|
1215
1268
|
```
|
|
1216
1269
|
|
|
1270
|
+
## Customizing Exporters and Importers
|
|
1271
|
+
|
|
1272
|
+
### Custom Exporter
|
|
1273
|
+
Use the exporter factory to register a new custom exporter:
|
|
1274
|
+
```python
|
|
1275
|
+
|
|
1276
|
+
from datacontract.data_contract import DataContract
|
|
1277
|
+
from datacontract.export.exporter import Exporter
|
|
1278
|
+
from datacontract.export.exporter_factory import exporter_factory
|
|
1279
|
+
|
|
1280
|
+
|
|
1281
|
+
# Create a custom class that implements the export method
|
|
1282
|
+
class CustomExporter(Exporter):
|
|
1283
|
+
def export(self, data_contract, model, server, sql_server_type, export_args) -> dict:
|
|
1284
|
+
result = {
|
|
1285
|
+
"title": data_contract.info.title,
|
|
1286
|
+
"version": data_contract.info.version,
|
|
1287
|
+
"description": data_contract.info.description,
|
|
1288
|
+
"email": data_contract.info.contact.email,
|
|
1289
|
+
"url": data_contract.info.contact.url,
|
|
1290
|
+
"model": model,
|
|
1291
|
+
"model_columns": ", ".join(list(data_contract.models.get(model).fields.keys())),
|
|
1292
|
+
"export_args": export_args,
|
|
1293
|
+
"custom_args": export_args.get("custom_arg", ""),
|
|
1294
|
+
}
|
|
1295
|
+
return result
|
|
1296
|
+
|
|
1297
|
+
|
|
1298
|
+
# Register the new custom class with the factory
|
|
1299
|
+
exporter_factory.register_exporter("custom", CustomExporter)
|
|
1300
|
+
|
|
1301
|
+
|
|
1302
|
+
if __name__ == "__main__":
|
|
1303
|
+
# Create a DataContract instance
|
|
1304
|
+
data_contract = DataContract(
|
|
1305
|
+
data_contract_file="/path/datacontract.yaml"
|
|
1306
|
+
)
|
|
1307
|
+
# call export
|
|
1308
|
+
result = data_contract.export(
|
|
1309
|
+
export_format="custom", model="orders", server="production", custom_arg="my_custom_arg"
|
|
1310
|
+
)
|
|
1311
|
+
print(result)
|
|
1312
|
+
|
|
1313
|
+
```
|
|
1314
|
+
Output
|
|
1315
|
+
```python
|
|
1316
|
+
{
|
|
1317
|
+
'title': 'Orders Unit Test',
|
|
1318
|
+
'version': '1.0.0',
|
|
1319
|
+
'description': 'The orders data contract',
|
|
1320
|
+
'email': 'team-orders@example.com',
|
|
1321
|
+
'url': 'https://wiki.example.com/teams/checkout',
|
|
1322
|
+
'model': 'orders',
|
|
1323
|
+
'model_columns': 'order_id, order_total, order_status',
|
|
1324
|
+
'export_args': {'server': 'production', 'custom_arg': 'my_custom_arg'},
|
|
1325
|
+
'custom_args': 'my_custom_arg'
|
|
1326
|
+
}
|
|
1327
|
+
```
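The registration step follows a common registry pattern: a name is mapped to a class, and the class is looked up and instantiated when that format is requested. The sketch below shows the general idea with the standard library only. `ExporterRegistry` and `EchoExporter` are hypothetical names, and the real `exporter_factory` in datacontract-cli may be implemented differently.

```python
class ExporterRegistry:
    """Illustrative name-to-class registry; not the library's implementation."""

    def __init__(self):
        self._exporters = {}

    def register(self, name, exporter_class):
        # A later registration under the same name replaces the earlier one.
        self._exporters[name] = exporter_class

    def create(self, name):
        if name not in self._exporters:
            raise ValueError(f"unknown export format: {name}")
        return self._exporters[name]()


class EchoExporter:
    """Hypothetical exporter that returns its input unchanged."""

    def export(self, data):
        return data


registry = ExporterRegistry()
registry.register("echo", EchoExporter)
result = registry.create("echo").export({"model": "orders"})
```

Keeping the lookup in one place means new formats can be added without changing the code that calls `export`.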
|
|
1328
|
+
|
|
1329
|
+
### Custom Importer
|
|
1330
|
+
Use the importer factory to register a new custom importer:
|
|
1331
|
+
```python
|
|
1332
|
+
|
|
1333
|
+
from datacontract.model.data_contract_specification import DataContractSpecification
|
|
1334
|
+
from datacontract.data_contract import DataContract
|
|
1335
|
+
from datacontract.imports.importer import Importer
|
|
1336
|
+
from datacontract.imports.importer_factory import importer_factory
|
|
1337
|
+
import json
|
|
1338
|
+
|
|
1339
|
+
# Create a custom class that implements the import_source method
|
|
1340
|
+
class CustomImporter(Importer):
|
|
1341
|
+
def import_source(
|
|
1342
|
+
self, data_contract_specification: DataContractSpecification, source: str, import_args: dict
|
|
1343
|
+
) -> dict:
|
|
1344
|
+
source_dict = json.loads(source)
|
|
1345
|
+
data_contract_specification.id = source_dict.get("id_custom")
|
|
1346
|
+
data_contract_specification.info.title = source_dict.get("title")
|
|
1347
|
+
data_contract_specification.info.description = source_dict.get("description_from_app")
|
|
1348
|
+
|
|
1349
|
+
return data_contract_specification
|
|
1350
|
+
|
|
1351
|
+
|
|
1352
|
+
# Register the new custom class with the factory
|
|
1353
|
+
importer_factory.register_importer("custom_company_importer", CustomImporter)
|
|
1354
|
+
|
|
1355
|
+
|
|
1356
|
+
if __name__ == "__main__":
|
|
1357
|
+
# get a custom da
|
|
1358
|
+
json_from_custom_app = '{"id_custom":"uuid-custom","version":"0.0.2", "title":"my_custom_imported_data", "description_from_app": "Custom contract description"}'
|
|
1359
|
+
# Create a DataContract instance
|
|
1360
|
+
data_contract = DataContract()
|
|
1361
|
+
|
|
1362
|
+
# call import_from
|
|
1363
|
+
result = data_contract.import_from_source(
|
|
1364
|
+
format="custom_company_importer", data_contract_specification=DataContract.init(), source=json_from_custom_app
|
|
1365
|
+
)
|
|
1366
|
+
print(dict(result))
|
|
1367
|
+
|
|
1368
|
+
```
|
|
1369
|
+
Output
|
|
1370
|
+
|
|
1371
|
+
```python
|
|
1372
|
+
{
|
|
1373
|
+
'dataContractSpecification': '0.9.3',
|
|
1374
|
+
'id': 'uuid-custom',
|
|
1375
|
+
'info': Info(title='my_custom_imported_data', version='0.0.1', status=None, description='Custom contract description', owner=None, contact=None),
|
|
1376
|
+
'servers': {},
|
|
1377
|
+
'terms': None,
|
|
1378
|
+
'models': {},
|
|
1379
|
+
'definitions': {},
|
|
1380
|
+
'examples': [],
|
|
1381
|
+
'quality': None,
|
|
1382
|
+
'servicelevels': None
|
|
1383
|
+
}
|
|
1384
|
+
```
|
|
1217
1385
|
## Development Setup
|
|
1218
1386
|
|
|
1219
1387
|
Python base interpreter should be 3.11.x (unless working on 3.12 release candidate).
|
|
@@ -1263,7 +1431,27 @@ docker compose run --rm datacontract --version
|
|
|
1263
1431
|
|
|
1264
1432
|
This command runs the container momentarily to check the version of the `datacontract` CLI. The `--rm` flag ensures that the container is automatically removed after the command executes, keeping your environment clean.
|
|
1265
1433
|
|
|
1434
|
+
## Use with pre-commit
|
|
1435
|
+
|
|
1436
|
+
To run `datacontract-cli` as part of a [pre-commit](https://pre-commit.com/) workflow, add something like the following to the `repos` list in the project's `.pre-commit-config.yaml`:
|
|
1437
|
+
|
|
1438
|
+
```yaml
|
|
1439
|
+
repos:
|
|
1440
|
+
- repo: https://github.com/datacontract/datacontract-cli
|
|
1441
|
+
rev: "v0.10.9"
|
|
1442
|
+
hooks:
|
|
1443
|
+
- id: datacontract-lint
|
|
1444
|
+
- id: datacontract-test
|
|
1445
|
+
args: ["--server", "production"]
|
|
1446
|
+
```
|
|
1447
|
+
|
|
1448
|
+
### Available Hook IDs
|
|
1266
1449
|
|
|
1450
|
+
| Hook ID | Description | Dependency |
|
|
1451
|
+
| ----------------- | -------------------------------------------------- | ---------- |
|
|
1452
|
+
| datacontract-lint | Runs the lint subcommand. | Python3 |
|
|
1453
|
+
| datacontract-test | Runs the test subcommand. Please look at | Python3 |
|
|
1454
|
+
| | [test](#test) section for all available arguments. | |
|
|
1267
1455
|
|
|
1268
1456
|
## Release Steps
|
|
1269
1457
|
|
|
@@ -1285,7 +1473,7 @@ We are happy to receive your contributions. Propose your change in an issue or d
|
|
|
1285
1473
|
|
|
1286
1474
|
## Related Tools
|
|
1287
1475
|
|
|
1288
|
-
- [Data
|
|
1476
|
+
- [Data Contract Manager](https://www.datacontract-manager.com/) is a commercial tool to manage data contracts. It contains a web UI, access management, and data governance for a full enterprise data marketplace.
|
|
1289
1477
|
- [Data Contract GPT](https://gpt.datacontract.com) is a custom GPT that can help you write data contracts.
|
|
1290
1478
|
- [Data Contract Editor](https://editor.datacontract.com) is an editor for Data Contracts, including a live html preview.
|
|
1291
1479
|
|