fastflowtransform-0.6.2.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- fastflowtransform-0.6.2/.gitignore +52 -0
- fastflowtransform-0.6.2/PKG-INFO +122 -0
- fastflowtransform-0.6.2/README.md +48 -0
- fastflowtransform-0.6.2/examples/basic_demo/README.md +8 -0
- fastflowtransform-0.6.2/examples/basic_demo/models/README.md +8 -0
- fastflowtransform-0.6.2/examples/basic_demo/seeds/README.md +3 -0
- fastflowtransform-0.6.2/examples/basic_demo/tests/unit/README.md +4 -0
- fastflowtransform-0.6.2/examples/cache_demo/README.md +56 -0
- fastflowtransform-0.6.2/examples/cache_demo/models/README.md +4 -0
- fastflowtransform-0.6.2/examples/cache_demo/seeds/README.md +4 -0
- fastflowtransform-0.6.2/examples/cache_demo/tests/unit/README.md +4 -0
- fastflowtransform-0.6.2/examples/dq_demo/README.md +27 -0
- fastflowtransform-0.6.2/examples/dq_demo/models/README.md +4 -0
- fastflowtransform-0.6.2/examples/dq_demo/seeds/README.md +4 -0
- fastflowtransform-0.6.2/examples/dq_demo/tests/unit/README.md +4 -0
- fastflowtransform-0.6.2/examples/incremental_demo/README.md +10 -0
- fastflowtransform-0.6.2/examples/incremental_demo/models/README.md +4 -0
- fastflowtransform-0.6.2/examples/incremental_demo/seeds/README.md +4 -0
- fastflowtransform-0.6.2/examples/incremental_demo/tests/unit/README.md +4 -0
- fastflowtransform-0.6.2/examples/macros_demo/README.md +12 -0
- fastflowtransform-0.6.2/examples/macros_demo/models/README.md +4 -0
- fastflowtransform-0.6.2/examples/macros_demo/seeds/README.md +4 -0
- fastflowtransform-0.6.2/examples/macros_demo/tests/unit/README.md +4 -0
- fastflowtransform-0.6.2/examples/materializations_demo/README.md +12 -0
- fastflowtransform-0.6.2/examples/materializations_demo/models/README.md +4 -0
- fastflowtransform-0.6.2/examples/materializations_demo/seeds/README.md +4 -0
- fastflowtransform-0.6.2/examples/materializations_demo/tests/unit/README.md +4 -0
- fastflowtransform-0.6.2/pyproject.toml +215 -0
- fastflowtransform-0.6.2/src/fastflowtransform/.env +13 -0
- fastflowtransform-0.6.2/src/fastflowtransform/__init__.py +49 -0
- fastflowtransform-0.6.2/src/fastflowtransform/_version.py +54 -0
- fastflowtransform-0.6.2/src/fastflowtransform/api/__init__.py +0 -0
- fastflowtransform-0.6.2/src/fastflowtransform/api/context.py +158 -0
- fastflowtransform-0.6.2/src/fastflowtransform/api/http.py +413 -0
- fastflowtransform-0.6.2/src/fastflowtransform/api/rate_limit.py +112 -0
- fastflowtransform-0.6.2/src/fastflowtransform/artifacts.py +304 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cache.py +220 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/__init__.py +198 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/__main__.py +3 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/bootstrap.py +384 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/dag_cmd.py +70 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/docgen_cmd.py +84 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/docs_utils.py +149 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/init_cmd.py +287 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/logging_utils.py +36 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/options.py +221 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/run.py +564 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/seed_cmd.py +44 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/selectors.py +259 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/sync_db_comments_cmd.py +196 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/test_cmd.py +428 -0
- fastflowtransform-0.6.2/src/fastflowtransform/cli/utest_cmd.py +56 -0
- fastflowtransform-0.6.2/src/fastflowtransform/config/__init__.py +0 -0
- fastflowtransform-0.6.2/src/fastflowtransform/config/models.py +406 -0
- fastflowtransform-0.6.2/src/fastflowtransform/config/project.py +562 -0
- fastflowtransform-0.6.2/src/fastflowtransform/config/seeds.py +128 -0
- fastflowtransform-0.6.2/src/fastflowtransform/config/sources.py +379 -0
- fastflowtransform-0.6.2/src/fastflowtransform/core.py +816 -0
- fastflowtransform-0.6.2/src/fastflowtransform/dag.py +127 -0
- fastflowtransform-0.6.2/src/fastflowtransform/decorators.py +246 -0
- fastflowtransform-0.6.2/src/fastflowtransform/docs.py +554 -0
- fastflowtransform-0.6.2/src/fastflowtransform/errors.py +151 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/__init__.py +55 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/_shims.py +142 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/base.py +823 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/bigquery/__init__.py +48 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/bigquery/_bigquery_mixin.py +40 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/bigquery/base.py +277 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/bigquery/bigframes.py +158 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/bigquery/pandas.py +79 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/databricks_spark.py +753 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/duckdb.py +286 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/postgres.py +259 -0
- fastflowtransform-0.6.2/src/fastflowtransform/executors/snowflake_snowpark.py +325 -0
- fastflowtransform-0.6.2/src/fastflowtransform/fingerprint.py +262 -0
- fastflowtransform-0.6.2/src/fastflowtransform/incremental.py +311 -0
- fastflowtransform-0.6.2/src/fastflowtransform/lineage.py +271 -0
- fastflowtransform-0.6.2/src/fastflowtransform/log_queue.py +46 -0
- fastflowtransform-0.6.2/src/fastflowtransform/logging.py +351 -0
- fastflowtransform-0.6.2/src/fastflowtransform/meta.py +508 -0
- fastflowtransform-0.6.2/src/fastflowtransform/run_executor.py +193 -0
- fastflowtransform-0.6.2/src/fastflowtransform/schema_loader.py +133 -0
- fastflowtransform-0.6.2/src/fastflowtransform/seeding.py +929 -0
- fastflowtransform-0.6.2/src/fastflowtransform/settings.py +351 -0
- fastflowtransform-0.6.2/src/fastflowtransform/storage.py +163 -0
- fastflowtransform-0.6.2/src/fastflowtransform/streaming.py +106 -0
- fastflowtransform-0.6.2/src/fastflowtransform/table_formats/__init__.py +59 -0
- fastflowtransform-0.6.2/src/fastflowtransform/table_formats/base.py +115 -0
- fastflowtransform-0.6.2/src/fastflowtransform/table_formats/spark_default.py +47 -0
- fastflowtransform-0.6.2/src/fastflowtransform/table_formats/spark_delta.py +124 -0
- fastflowtransform-0.6.2/src/fastflowtransform/table_formats/spark_hudi.py +138 -0
- fastflowtransform-0.6.2/src/fastflowtransform/table_formats/spark_iceberg.py +126 -0
- fastflowtransform-0.6.2/src/fastflowtransform/templates/index.html.j2 +220 -0
- fastflowtransform-0.6.2/src/fastflowtransform/templates/model.html.j2 +222 -0
- fastflowtransform-0.6.2/src/fastflowtransform/testing/__init__.py +5 -0
- fastflowtransform-0.6.2/src/fastflowtransform/testing/base.py +475 -0
- fastflowtransform-0.6.2/src/fastflowtransform/testing/discovery.py +234 -0
- fastflowtransform-0.6.2/src/fastflowtransform/testing/registry.py +552 -0
- fastflowtransform-0.6.2/src/fastflowtransform/typing.py +168 -0
- fastflowtransform-0.6.2/src/fastflowtransform/utest.py +738 -0
- fastflowtransform-0.6.2/src/fastflowtransform/validation.py +38 -0
- fastflowtransform-0.6.2/tests/README.md +32 -0
|
@@ -0,0 +1,52 @@
|
|
|
1
|
+
# Envs & Secrets
|
|
2
|
+
.env.local
|
|
3
|
+
.env.*.local
|
|
4
|
+
secrets/
|
|
5
|
+
|
|
6
|
+
# Local DBs / Artifacts
|
|
7
|
+
*.duckdb
|
|
8
|
+
/local/**
|
|
9
|
+
.local/
|
|
10
|
+
|
|
11
|
+
# Virtual Environments
|
|
12
|
+
.venv/
|
|
13
|
+
venv/
|
|
14
|
+
|
|
15
|
+
# Python
|
|
16
|
+
__pycache__/
|
|
17
|
+
*.pyc
|
|
18
|
+
*.pyo
|
|
19
|
+
*.pyd
|
|
20
|
+
*.egg-info/
|
|
21
|
+
.build/
|
|
22
|
+
.eggs/
|
|
23
|
+
.uv-cache/
|
|
24
|
+
.DS_Store
|
|
25
|
+
|
|
26
|
+
# Tooling Caches
|
|
27
|
+
.pytest_cache/
|
|
28
|
+
.mypy_cache/
|
|
29
|
+
.ruff_cache/
|
|
30
|
+
.coverage*
|
|
31
|
+
htmlcov/
|
|
32
|
+
|
|
33
|
+
# Build Artifacts
|
|
34
|
+
build/
|
|
35
|
+
dist/
|
|
36
|
+
**/site/dag/
|
|
37
|
+
**/site/dag/**
|
|
38
|
+
spark-warehouse
|
|
39
|
+
metastore_db
|
|
40
|
+
derby.log
|
|
41
|
+
.fastflowtransform
|
|
42
|
+
_exports/**
|
|
43
|
+
|
|
44
|
+
# Editors / IDEs
|
|
45
|
+
.vscode/
|
|
46
|
+
.idea/
|
|
47
|
+
|
|
48
|
+
# Docs Output
|
|
49
|
+
examples/**/docs/
|
|
50
|
+
tickets/**
|
|
51
|
+
site/dag/**
|
|
52
|
+
cache/**
|
|
@@ -0,0 +1,122 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: fastflowtransform
|
|
3
|
+
Version: 0.6.2
|
|
4
|
+
Summary: Python framework for SQL & Python data transformation, ETL pipelines, and dbt-style data modeling
|
|
5
|
+
Project-URL: Homepage, https://github.com/MirrorsAndMisdirections/FastFlowTransform
|
|
6
|
+
Project-URL: Documentation, https://fastflowtransform.com
|
|
7
|
+
Project-URL: Repository, https://github.com/MirrorsAndMisdirections/FastFlowTransform.git
|
|
8
|
+
Project-URL: Source, https://github.com/MirrorsAndMisdirections/fastflowtransform
|
|
9
|
+
Project-URL: Issues, https://github.com/MirrorsAndMisdirections/fastflowtransform/issues
|
|
10
|
+
Author: Marko Lekic
|
|
11
|
+
License: Apache-2.0
|
|
12
|
+
Keywords: bigquery,data modeling,data transformation,dbt alternative,duckdb,etl,postgres,snowflake,spark,sql
|
|
13
|
+
Classifier: Intended Audience :: Developers
|
|
14
|
+
Classifier: License :: OSI Approved :: Apache Software License
|
|
15
|
+
Classifier: Operating System :: OS Independent
|
|
16
|
+
Classifier: Programming Language :: Python :: 3
|
|
17
|
+
Classifier: Topic :: Database
|
|
18
|
+
Classifier: Topic :: Scientific/Engineering :: Information Analysis
|
|
19
|
+
Classifier: Topic :: Software Development :: Build Tools
|
|
20
|
+
Classifier: Topic :: Software Development :: Libraries
|
|
21
|
+
Requires-Python: >=3.12
|
|
22
|
+
Requires-Dist: duckdb>=1.0
|
|
23
|
+
Requires-Dist: httpx>=0.28.1
|
|
24
|
+
Requires-Dist: jinja2>=3.1
|
|
25
|
+
Requires-Dist: pandas>=2.0
|
|
26
|
+
Requires-Dist: pydantic-settings>=2.4
|
|
27
|
+
Requires-Dist: pydantic>=2.8
|
|
28
|
+
Requires-Dist: python-dotenv>=1.0
|
|
29
|
+
Requires-Dist: pyyaml>=6.0
|
|
30
|
+
Requires-Dist: sqlalchemy>=2.0
|
|
31
|
+
Requires-Dist: typer>=0.12
|
|
32
|
+
Provides-Extra: bigquery
|
|
33
|
+
Requires-Dist: google-cloud-bigquery>=3.25; extra == 'bigquery'
|
|
34
|
+
Provides-Extra: bigquery-bf
|
|
35
|
+
Requires-Dist: bigframes>=2.24.0; extra == 'bigquery-bf'
|
|
36
|
+
Requires-Dist: google-cloud-bigquery>=3.25; extra == 'bigquery-bf'
|
|
37
|
+
Provides-Extra: dev
|
|
38
|
+
Requires-Dist: mypy==1.18.*; extra == 'dev'
|
|
39
|
+
Requires-Dist: pandas-stubs>=2.1; extra == 'dev'
|
|
40
|
+
Requires-Dist: pre-commit==3.*; extra == 'dev'
|
|
41
|
+
Requires-Dist: pytest-cov==7.0.*; extra == 'dev'
|
|
42
|
+
Requires-Dist: pytest-rerunfailures==14.0.*; extra == 'dev'
|
|
43
|
+
Requires-Dist: pytest==8.4.*; extra == 'dev'
|
|
44
|
+
Requires-Dist: ruff==0.14.*; extra == 'dev'
|
|
45
|
+
Requires-Dist: types-pyyaml>=6.0.12; extra == 'dev'
|
|
46
|
+
Provides-Extra: docs
|
|
47
|
+
Requires-Dist: mkdocs-autorefs>=1.0; extra == 'docs'
|
|
48
|
+
Requires-Dist: mkdocs-gen-files>=0.5; extra == 'docs'
|
|
49
|
+
Requires-Dist: mkdocs-literate-nav>=0.6; extra == 'docs'
|
|
50
|
+
Requires-Dist: mkdocs-material>=9.5; extra == 'docs'
|
|
51
|
+
Requires-Dist: mkdocs-section-index>=0.3; extra == 'docs'
|
|
52
|
+
Requires-Dist: mkdocs>=1.6; extra == 'docs'
|
|
53
|
+
Requires-Dist: mkdocstrings[python]>=0.25; extra == 'docs'
|
|
54
|
+
Requires-Dist: pymdown-extensions>=10.0; extra == 'docs'
|
|
55
|
+
Provides-Extra: full
|
|
56
|
+
Requires-Dist: bigframes>=2.24.0; extra == 'full'
|
|
57
|
+
Requires-Dist: delta-spark>=4.0.0; extra == 'full'
|
|
58
|
+
Requires-Dist: google-cloud-bigquery>=3.25; extra == 'full'
|
|
59
|
+
Requires-Dist: psycopg2-binary>=2.9; extra == 'full'
|
|
60
|
+
Requires-Dist: psycopg[binary]>=3.1; extra == 'full'
|
|
61
|
+
Requires-Dist: pyspark>=4.0.1; extra == 'full'
|
|
62
|
+
Requires-Dist: snowflake-connector-python>=4.0.0; extra == 'full'
|
|
63
|
+
Requires-Dist: snowflake-snowpark-python>=1.40.0; extra == 'full'
|
|
64
|
+
Provides-Extra: postgres
|
|
65
|
+
Requires-Dist: psycopg2-binary>=2.9; extra == 'postgres'
|
|
66
|
+
Requires-Dist: psycopg[binary]>=3.1; extra == 'postgres'
|
|
67
|
+
Provides-Extra: snowflake
|
|
68
|
+
Requires-Dist: snowflake-connector-python>=4.0.0; extra == 'snowflake'
|
|
69
|
+
Requires-Dist: snowflake-snowpark-python>=1.40.0; extra == 'snowflake'
|
|
70
|
+
Provides-Extra: spark
|
|
71
|
+
Requires-Dist: delta-spark>=4.0.0; extra == 'spark'
|
|
72
|
+
Requires-Dist: pyspark>=4.0.1; extra == 'spark'
|
|
73
|
+
Description-Content-Type: text/markdown
|
|
74
|
+
|
|
75
|
+
# FastFlowTransform
|
|
76
|
+
|
|
77
|
+
[CI](https://github.com/MirrorsAndMisdirections/FastFlowTransform/actions/workflows/ci.yml)
|
|
78
|
+
[PyPI](https://pypi.org/project/fastflowtransform/)
|
|
79
|
+
|
|
80
|
+
FastFlowTransform (FFT) is a SQL + Python data modeling engine with a deterministic DAG, level-wise parallelism, optional caching, incremental builds, auto-docs, and built-in data-quality tests. Projects are plain directories containing models, seeds, and YAML config; the `fft` CLI handles compilation, execution, docs, and validation across multiple execution engines.
|
|
81
|
+
|
|
82
|
+
## Highlights
|
|
83
|
+
- SQL or Python models (`*.ff.sql` / `*.ff.py`) wired with `ref()` / `source()` / `deps=[...]`.
|
|
84
|
+
- Executors for DuckDB, Postgres, BigQuery (pandas + BigFrames), Databricks/Spark, and Snowflake Snowpark.
|
|
85
|
+
- Level-wise parallel scheduler with cache fingerprints, rebuild flags, and state/result selectors.
|
|
86
|
+
- Incremental and materialized models with engine-specific merge/append hooks.
|
|
87
|
+
- Tests everywhere: schema/YAML checks, reconciliation rules, and fast model unit tests (`fft utest`).
|
|
88
|
+
- Docs on demand: `fft dag --html` and `fft docgen` generate a browsable site plus JSON artifacts; optional `sync-db-comments` to push descriptions to Postgres/Snowflake.
|
|
89
|
+
- HTTP helpers for Python models (`fastflowtransform.api.http`) and Jinja macros/config for templating.
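
The level-wise scheduling named in the highlights can be pictured roughly as below. This is a minimal sketch, not FFT's actual scheduler: `build_levels` and the model names are hypothetical, and it assumes only that each model lists the models it depends on. Models within one level have no edges between them, so they could run in parallel.

```python
from collections import defaultdict


def build_levels(deps: dict[str, list[str]]) -> list[list[str]]:
    """Group DAG nodes into levels; every dependency must itself be a key of deps."""
    indegree = {node: len(parents) for node, parents in deps.items()}
    children = defaultdict(list)
    for node, parents in deps.items():
        for parent in parents:
            children[parent].append(node)

    levels = []
    ready = sorted(n for n, d in indegree.items() if d == 0)
    while ready:
        levels.append(ready)
        next_ready = []
        for node in ready:
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_ready.append(child)
        ready = sorted(next_ready)  # sorting keeps the order deterministic

    if sum(len(level) for level in levels) != len(deps):
        raise ValueError("dependency cycle or unknown dependency detected")
    return levels


# Hypothetical model names, for illustration only.
example = {
    "seed_users": [],
    "users_clean": ["seed_users"],
    "mart_users_by_domain": ["users_clean"],
    "mart_latest_signup": ["users_clean"],
}
levels = build_levels(example)
```

Here the two marts land in the same level because both depend only on `users_clean`.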
|
|
90
|
+
|
|
91
|
+
## Requirements
|
|
92
|
+
- Python 3.12+
|
|
93
|
+
- Engine extras installed only as needed (e.g. BigQuery, Snowflake, Spark/Delta, Postgres drivers). The core DuckDB path works out of the box.
|
|
94
|
+
|
|
95
|
+
## Install & Quickstart
|
|
96
|
+
- Pick the engine extras you need (combine as `pkg[a,b]`):
|
|
97
|
+
- DuckDB/core: `pip install fastflowtransform`
|
|
98
|
+
- Postgres: `pip install fastflowtransform[postgres]`
|
|
99
|
+
- BigQuery (pandas): `pip install fastflowtransform[bigquery]`
|
|
100
|
+
- BigQuery (BigFrames): `pip install fastflowtransform[bigquery_bf]`
|
|
101
|
+
- Databricks/Spark + Delta: `pip install fastflowtransform[spark]`
|
|
102
|
+
- Snowflake Snowpark: `pip install fastflowtransform[snowflake]`
|
|
103
|
+
- Everything: `pip install fastflowtransform[full]`
|
|
104
|
+
- Installation and first run: see `docs/Quickstart.md` (venv + editable install, DuckDB demo, and init walkthrough).
|
|
105
|
+
- CLI usage and flags: see `docs/CLI_Guide.md`.
|
|
106
|
+
- Makefile shortcut: `make demo` runs the simple DuckDB example end-to-end and opens the DAG (`examples/simple_duckdb`).
|
|
107
|
+
|
|
108
|
+
## Docs & examples
|
|
109
|
+
- Docs hub: `docs/index.md` or https://fastflowtransform.com.
|
|
110
|
+
- Operational guide & architecture: `docs/Technical_Overview.md`.
|
|
111
|
+
- Modeling reference & macros: `docs/Config_and_Macros.md`.
|
|
112
|
+
- Parallelism, cache, and state selection: `docs/Cache_and_Parallelism.md`, `docs/State_Selection.md`.
|
|
113
|
+
- Incremental models: `docs/Incremental.md`.
|
|
114
|
+
- Data-quality + YAML tests: `docs/Data_Quality_Tests.md`, `docs/YAML_Tests.md`, `docs/Unit_Tests.md`.
|
|
115
|
+
- CLI details and troubleshooting: `docs/CLI_Guide.md`, `docs/Troubleshooting.md`.
|
|
116
|
+
- Runnable demos live under `examples/` (basic, materializations, incremental, DQ, macros, cache, env matrix, API, events).
|
|
117
|
+
|
|
118
|
+
## Contributing
|
|
119
|
+
Issues and PRs are welcome. See `Contributing.md` for development setup, testing (`make demo`, `uv run pytest`, `fft utest`), and code-style guidelines.
|
|
120
|
+
|
|
121
|
+
## License
|
|
122
|
+
Apache 2.0 — see `License.md`.
|
|
@@ -0,0 +1,48 @@
|
|
|
1
|
+
# FastFlowTransform
|
|
2
|
+
|
|
3
|
+
[CI](https://github.com/MirrorsAndMisdirections/FastFlowTransform/actions/workflows/ci.yml)
|
|
4
|
+
[PyPI](https://pypi.org/project/fastflowtransform/)
|
|
5
|
+
|
|
6
|
+
FastFlowTransform (FFT) is a SQL + Python data modeling engine with a deterministic DAG, level-wise parallelism, optional caching, incremental builds, auto-docs, and built-in data-quality tests. Projects are plain directories containing models, seeds, and YAML config; the `fft` CLI handles compilation, execution, docs, and validation across multiple execution engines.
|
|
7
|
+
|
|
8
|
+
## Highlights
|
|
9
|
+
- SQL or Python models (`*.ff.sql` / `*.ff.py`) wired with `ref()` / `source()` / `deps=[...]`.
|
|
10
|
+
- Executors for DuckDB, Postgres, BigQuery (pandas + BigFrames), Databricks/Spark, and Snowflake Snowpark.
|
|
11
|
+
- Level-wise parallel scheduler with cache fingerprints, rebuild flags, and state/result selectors.
|
|
12
|
+
- Incremental and materialized models with engine-specific merge/append hooks.
|
|
13
|
+
- Tests everywhere: schema/YAML checks, reconciliation rules, and fast model unit tests (`fft utest`).
|
|
14
|
+
- Docs on demand: `fft dag --html` and `fft docgen` generate a browsable site plus JSON artifacts; optional `sync-db-comments` to push descriptions to Postgres/Snowflake.
|
|
15
|
+
- HTTP helpers for Python models (`fastflowtransform.api.http`) and Jinja macros/config for templating.
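
To illustrate how `ref()` calls can wire models into a graph, here is a hedged sketch that pulls referenced model names out of a SQL template with a regular expression. It is a simplification for illustration, not how FFT resolves references; `find_refs`, the `{{ ref('...') }}` template syntax, and the SQL text are assumptions.

```python
import re

# Matches ref('name') or ref("name"); a deliberately simple pattern.
REF_PATTERN = re.compile(r"""\bref\(\s*['"]([^'"]+)['"]\s*\)""")


def find_refs(sql: str) -> list[str]:
    """Return referenced model names in first-seen order, without duplicates."""
    seen = []
    for name in REF_PATTERN.findall(sql):
        if name not in seen:
            seen.append(name)
    return seen


# Hypothetical model body, for illustration only.
sql = """
select u.email_domain, count(*) as signups
from {{ ref('users_clean') }} as u
left join {{ ref("orders") }} as o on o.user_id = u.id
group by 1
"""
refs = find_refs(sql)
```

The resulting list is the kind of per-model dependency information a DAG builder would need.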
|
|
16
|
+
|
|
17
|
+
## Requirements
|
|
18
|
+
- Python 3.12+
|
|
19
|
+
- Engine extras installed only as needed (e.g. BigQuery, Snowflake, Spark/Delta, Postgres drivers). The core DuckDB path works out of the box.
|
|
20
|
+
|
|
21
|
+
## Install & Quickstart
|
|
22
|
+
- Pick the engine extras you need (combine as `pkg[a,b]`):
|
|
23
|
+
- DuckDB/core: `pip install fastflowtransform`
|
|
24
|
+
- Postgres: `pip install fastflowtransform[postgres]`
|
|
25
|
+
- BigQuery (pandas): `pip install fastflowtransform[bigquery]`
|
|
26
|
+
- BigQuery (BigFrames): `pip install fastflowtransform[bigquery_bf]`
|
|
27
|
+
- Databricks/Spark + Delta: `pip install fastflowtransform[spark]`
|
|
28
|
+
- Snowflake Snowpark: `pip install fastflowtransform[snowflake]`
|
|
29
|
+
- Everything: `pip install fastflowtransform[full]`
|
|
30
|
+
- Installation and first run: see `docs/Quickstart.md` (venv + editable install, DuckDB demo, and init walkthrough).
|
|
31
|
+
- CLI usage and flags: see `docs/CLI_Guide.md`.
|
|
32
|
+
- Makefile shortcut: `make demo` runs the simple DuckDB example end-to-end and opens the DAG (`examples/simple_duckdb`).
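
The `pkg[a,b]` form can also be assembled programmatically. The sketch below is an illustration with hypothetical helper names; the only assumption beyond the text is the extra-name normalization rule from PEP 685, under which `bigquery_bf` and `bigquery-bf` compare as the same extra.

```python
import re


def normalize_extra(name: str) -> str:
    """Normalize an extra name as described in PEP 685."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement(package: str, extras: list[str]) -> str:
    """Build a requirement string such as 'pkg[a,b]'."""
    unique = sorted({normalize_extra(e) for e in extras})
    return f"{package}[{','.join(unique)}]" if unique else package


spec = requirement("fastflowtransform", ["postgres", "bigquery_bf"])
```

When passing such a string to `pip install` in a shell that treats square brackets as glob characters (zsh, for example), it generally needs to be quoted.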
|
|
33
|
+
|
|
34
|
+
## Docs & examples
|
|
35
|
+
- Docs hub: `docs/index.md` or https://fastflowtransform.com.
|
|
36
|
+
- Operational guide & architecture: `docs/Technical_Overview.md`.
|
|
37
|
+
- Modeling reference & macros: `docs/Config_and_Macros.md`.
|
|
38
|
+
- Parallelism, cache, and state selection: `docs/Cache_and_Parallelism.md`, `docs/State_Selection.md`.
|
|
39
|
+
- Incremental models: `docs/Incremental.md`.
|
|
40
|
+
- Data-quality + YAML tests: `docs/Data_Quality_Tests.md`, `docs/YAML_Tests.md`, `docs/Unit_Tests.md`.
|
|
41
|
+
- CLI details and troubleshooting: `docs/CLI_Guide.md`, `docs/Troubleshooting.md`.
|
|
42
|
+
- Runnable demos live under `examples/` (basic, materializations, incremental, DQ, macros, cache, env matrix, API, events).
|
|
43
|
+
|
|
44
|
+
## Contributing
|
|
45
|
+
Issues and PRs are welcome. See `Contributing.md` for development setup, testing (`make demo`, `uv run pytest`, `fft utest`), and code-style guidelines.
|
|
46
|
+
|
|
47
|
+
## License
|
|
48
|
+
Apache 2.0 — see `License.md`.
|
|
@@ -0,0 +1,8 @@
|
|
|
1
|
+
# Basic demo
|
|
2
|
+
|
|
3
|
+
Minimal FFT pipeline that runs unchanged on DuckDB, Postgres, Databricks Spark, BigQuery, and Snowflake (Snowpark).
|
|
4
|
+
|
|
5
|
+
## How to use
|
|
6
|
+
- See the full walkthrough (env setup, Makefile targets, engine notes, DQ tests) in `docs/examples/Basic_Demo.md`.
|
|
7
|
+
- From this directory: set the desired `.env.dev_*` (for BigQuery choose `.env.dev_bigquery_pandas` or `.env.dev_bigquery_bigframes`), then run `make demo ENGINE=<duckdb|postgres|databricks_spark|bigquery|snowflake_snowpark>` (set `BQ_FRAME` to switch BigQuery client) to seed → run → dag → test.
|
|
8
|
+
- To inspect results, open `site/dag/index.html` after a run or query the mart tables via your engine client.
|
|
@@ -0,0 +1,8 @@
|
|
|
1
|
+
# Models directory
|
|
2
|
+
|
|
3
|
+
This demo ships with:
|
|
4
|
+
- `staging/users_clean.ff.sql` – normalizes the seeded users table.
|
|
5
|
+
- `marts/mart_users_by_domain.ff.sql` – aggregates signups per email domain.
|
|
6
|
+
- `engines/*/mart_latest_signup.ff.py` – engine-scoped Python models (pandas for DuckDB/Postgres, PySpark for Databricks) that select the most recent signup per domain using the staging view as input.
|
|
7
|
+
|
|
8
|
+
Add further SQL (`*.ff.sql`) or Python (`*.ff.py`) models alongside them to grow the pipeline.
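
As a rough picture of what the two marts compute, the following standard-library sketch counts signups per email domain and picks the most recent signup per domain. It is not the demo's code: the column names (`email`, `signup_date`) and the sample rows are assumptions.

```python
from collections import Counter
from datetime import date

# Hypothetical rows standing in for the staging output.
users = [
    {"email": "a@example.com", "signup_date": date(2024, 1, 5)},
    {"email": "b@example.com", "signup_date": date(2024, 3, 1)},
    {"email": "c@test.org", "signup_date": date(2024, 2, 10)},
]


def domain_of(email: str) -> str:
    """Return the lower-cased part after the last '@'."""
    return email.rsplit("@", 1)[-1].lower()


def signups_by_domain(rows):
    """Count rows per email domain."""
    return dict(Counter(domain_of(r["email"]) for r in rows))


def latest_signup_by_domain(rows):
    """Keep, for each domain, the row with the greatest signup_date."""
    latest = {}
    for row in rows:
        domain = domain_of(row["email"])
        if domain not in latest or row["signup_date"] > latest[domain]["signup_date"]:
            latest[domain] = row
    return latest


counts = signups_by_domain(users)
latest = latest_signup_by_domain(users)
```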
|
|
@@ -0,0 +1,56 @@
|
|
|
1
|
+
# Cache Demo
|
|
2
|
+
|
|
3
|
+
This demo shows:
|
|
4
|
+
- Build cache skip/hit via fingerprints
|
|
5
|
+
- Downstream invalidation (seed → staging → mart)
|
|
6
|
+
- Environment-driven invalidation (only `FF_*`)
|
|
7
|
+
- Parallelism within levels (`--jobs`)
|
|
8
|
+
- HTTP response cache + offline mode
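
The idea behind an HTTP response cache with an offline mode can be sketched as follows. This is an illustration only and does not use or mirror `fastflowtransform.api.http`; `CachedFetcher`, `fake_fetch`, and the URL are hypothetical, and the network call is replaced by a stand-in so the example runs without network access.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


class CachedFetcher:
    """Cache JSON responses on disk, keyed by a hash of the URL."""

    def __init__(self, cache_dir: Path, fetch: Callable[[str], dict], offline: bool = False):
        self.cache_dir = cache_dir
        self.fetch = fetch
        self.offline = offline
        self.hits = 0
        self.misses = 0
        cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, url: str) -> Path:
        return self.cache_dir / (hashlib.sha256(url.encode()).hexdigest() + ".json")

    def get(self, url: str) -> dict:
        path = self._path(url)
        if path.exists():
            self.hits += 1
            return json.loads(path.read_text())
        if self.offline:
            raise RuntimeError(f"offline mode and no cached response for {url}")
        self.misses += 1
        payload = self.fetch(url)
        path.write_text(json.dumps(payload))
        return payload


def fake_fetch(url: str) -> dict:
    # Stand-in for a real HTTP call.
    return {"url": url, "users": 3}


cache_dir = Path(tempfile.mkdtemp())
online = CachedFetcher(cache_dir, fake_fetch)
first = online.get("https://example.invalid/users")  # miss: fetched and stored

offline = CachedFetcher(cache_dir, fake_fetch, offline=True)
second = offline.get("https://example.invalid/users")  # hit: served from disk
```

In offline mode a request for a URL that was never cached fails instead of reaching the network.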
|
|
9
|
+
|
|
10
|
+
## Quickstart
|
|
11
|
+
|
|
12
|
+
```bash
|
|
13
|
+
# pick your engine (duckdb, postgres, databricks_spark, bigquery, or snowflake_snowpark); defaults to duckdb
|
|
14
|
+
cp .env.dev_duckdb .env
|
|
15
|
+
# or: cp .env.dev_postgres .env (then edit DSN/schema)
|
|
16
|
+
# or: cp .env.dev_databricks .env
|
|
17
|
+
# or: cp .env.dev_bigquery_pandas .env # or .env.dev_bigquery_bigframes
|
|
18
|
+
# or: cp .env.dev_snowflake .env
|
|
19
|
+
|
|
20
|
+
cd examples/cache_demo
|
|
21
|
+
make cache_first ENGINE=duckdb # builds and writes cache
|
|
22
|
+
make cache_second ENGINE=duckdb # should SKIP everything
|
|
23
|
+
make change_sql ENGINE=duckdb # touch SQL → mart rebuilds
|
|
24
|
+
make change_seed ENGINE=duckdb # seed with base + patches/seed_users_patch.csv (no tracked edits)
|
|
25
|
+
make change_env ENGINE=duckdb # FF_* env change → full rebuild
|
|
26
|
+
make change_py ENGINE=duckdb # edit constant in py_constants.ff.py → it rebuilds
|
|
27
|
+
|
|
28
|
+
make http_first ENGINE=duckdb # warms HTTP cache
|
|
29
|
+
make http_offline ENGINE=duckdb # reuses HTTP cache without network
|
|
30
|
+
make http_cache_clear # clears HTTP response cache
|
|
31
|
+
# Seeds stay immutable: change_seed builds a temporary combined copy in .local/seeds using
|
|
32
|
+
# patches/seed_users_patch.csv so the repo doesn’t become dirty.
|
|
33
|
+
```
|
|
34
|
+
|
|
35
|
+
Inspect:
|
|
36
|
+
|
|
37
|
+
- `site/dag/index.html`
|
|
38
|
+
- `.fastflowtransform/target/run_results.json` (HTTP stats, results)
|
|
42
|
+
|
|
43
|
+
---
|
|
44
|
+
|
|
45
|
+
To run everything on Postgres, set `ENGINE=postgres` and copy/edit `.env.dev_postgres`, e.g. `make demo ENGINE=postgres`.
|
|
46
|
+
To run on Databricks/Spark locally, set `ENGINE=databricks_spark` and copy/edit `.env.dev_databricks`, e.g. `make demo ENGINE=databricks_spark`.
|
|
47
|
+
To run on BigQuery, set `ENGINE=bigquery` and copy/edit `.env.dev_bigquery_pandas` (or `.env.dev_bigquery_bigframes`), e.g. `make demo ENGINE=bigquery BQ_FRAME=bigframes` (default) or `BQ_FRAME=pandas`.
|
|
48
|
+
To run on Snowflake Snowpark, install `fastflowtransform[snowflake]`, set `ENGINE=snowflake_snowpark`, copy/edit `.env.dev_snowflake`, and run e.g. `make demo ENGINE=snowflake_snowpark`.
|
|
49
|
+
|
|
50
|
+
## What this demo proves (in a minute)
|
|
51
|
+
|
|
52
|
+
- **Cache hit/skip:** `make cache_second` should skip everything (if nothing changed).
|
|
53
|
+
- **Upstream invalidation:** `make change_seed` rebuilds staging **and** the mart.
|
|
54
|
+
- **Env invalidation:** `make change_env` triggers a full rebuild (because `FF_*` is part of the fingerprint).
|
|
55
|
+
- **Python source sensitivity:** `py_constants` rebuilds only when its code changes.
|
|
56
|
+
- **HTTP cache:** `http_first` fetches; `http_offline` runs fully offline using cached responses.
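
The invalidation behaviour listed above follows from what goes into a fingerprint. The sketch below is a simplified illustration, not the contents of `fingerprint.py`: the `fingerprint` function, its inputs, and the sample values are assumptions. It hashes a model's source, the `FF_*` environment variables, and the fingerprints of upstream nodes.

```python
import hashlib


def fingerprint(source: str, env: dict[str, str], upstream: list[str]) -> str:
    """Hash the model source, FF_* variables, and upstream fingerprints."""
    digest = hashlib.sha256()
    digest.update(source.encode())
    # Only FF_* variables participate; sorting keeps the hash order-independent.
    for key in sorted(k for k in env if k.startswith("FF_")):
        digest.update(f"\0{key}={env[key]}".encode())
    for parent in sorted(upstream):
        digest.update(f"\0{parent}".encode())
    return digest.hexdigest()


env = {"FF_ENGINE": "duckdb", "HOME": "/tmp"}
seed_fp = fingerprint("id,email\n1,a@example.com\n", env, [])
staging_fp = fingerprint("select * from seed_users", env, [seed_fp])

# A non-FF_* change leaves the fingerprint alone ...
same = fingerprint("select * from seed_users", {**env, "HOME": "/other"}, [seed_fp])
# ... while a changed seed propagates to downstream models,
new_seed_fp = fingerprint("id,email\n1,b@example.com\n", env, [])
changed = fingerprint("select * from seed_users", env, [new_seed_fp])
# and so does a changed FF_* variable.
env_changed = fingerprint("select * from seed_users", {**env, "FF_ENGINE": "postgres"}, [seed_fp])
```

A node whose fingerprint matches the stored one can be skipped; any difference forces a rebuild of that node and, through the upstream term, of everything below it.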
|
|
@@ -0,0 +1,27 @@
|
|
|
1
|
+
# Data Quality Demo
|
|
2
|
+
|
|
3
|
+
Run the complete DQ demo (seeds → models → DAG → tests) on DuckDB, Postgres, Databricks Spark, BigQuery (pandas or BigFrames), or Snowflake Snowpark.
|
|
4
|
+
|
|
5
|
+
## Quickstart
|
|
6
|
+
From this directory:
|
|
7
|
+
|
|
8
|
+
1) Pick an engine and copy the matching `.env.dev_*` to `.env` (edit project/dataset if needed):
|
|
9
|
+
- DuckDB: `.env.dev_duckdb`
|
|
10
|
+
- Postgres: `.env.dev_postgres`
|
|
11
|
+
- Databricks Spark: `.env.dev_databricks`
|
|
12
|
+
- BigQuery (pandas): `.env.dev_bigquery_pandas`
|
|
13
|
+
- BigQuery (BigFrames): `.env.dev_bigquery_bigframes`
|
|
14
|
+
- Snowflake Snowpark: `.env.dev_snowflake`
|
|
15
|
+
|
|
16
|
+
2) Run the demo (set `BQ_FRAME` when using BigQuery):
|
|
17
|
+
```sh
|
|
18
|
+
make demo ENGINE=duckdb
|
|
19
|
+
make demo ENGINE=postgres
|
|
20
|
+
make demo ENGINE=databricks_spark
|
|
21
|
+
make demo ENGINE=bigquery BQ_FRAME=pandas # or bigframes
|
|
22
|
+
make demo ENGINE=snowflake_snowpark # install fastflowtransform[snowflake]
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
Artifacts:
|
|
26
|
+
- Target metadata: `.fastflowtransform/target/{manifest.json,run_results.json,catalog.json}`
|
|
27
|
+
- DAG HTML: `site/dag/index.html`
|
|
@@ -0,0 +1,10 @@
|
|
|
1
|
+
# Incremental demo
|
|
2
|
+
|
|
3
|
+
Small FFT example that showcases incremental models and Delta/Iceberg-style merges
|
|
4
|
+
across DuckDB, Postgres, Databricks Spark, BigQuery (pandas or BigFrames), and Snowflake Snowpark.
|
|
5
|
+
|
|
6
|
+
## How to use
|
|
7
|
+
- Fill in an `.env.dev_*` file for your engine (DuckDB/Postgres/Databricks/BigQuery/Snowflake). For BigQuery use `.env.dev_bigquery_pandas` or `.env.dev_bigquery_bigframes`; for Snowflake use `.env.dev_snowflake`.
|
|
8
|
+
- From this directory run `make demo ENGINE=<duckdb|postgres|databricks_spark|bigquery|snowflake_snowpark>` (set `BQ_FRAME` for BigQuery, `DBR_TABLE_FORMAT` for Spark).
|
|
9
|
+
- Artifacts: DAG HTML in `site/dag/index.html`, FFT metadata in `.fastflowtransform/target/`.
|
|
10
|
+
- See `docs/examples/Incremental_Demo.md` for a full walkthrough of the models and incremental configs.
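
The difference between merge-style and append-style incremental loads can be sketched in plain Python. This illustrates the semantics only and is not FFT's incremental implementation; the function names, the `id` key, and the sample rows are hypothetical.

```python
def merge_incremental(existing, new_rows, unique_key):
    """Upsert new_rows into existing, matching rows on unique_key."""
    merged = {row[unique_key]: row for row in existing}
    for row in new_rows:
        merged[row[unique_key]] = row  # update on match, insert otherwise
    return list(merged.values())


def append_incremental(existing, new_rows):
    """Append new_rows without looking for matches."""
    return existing + new_rows


target = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
delta = [{"id": 2, "status": "paid"}, {"id": 3, "status": "new"}]

merged = merge_incremental(target, delta, "id")
appended = append_incremental(target, delta)
```

With a merge, the row with `id` 2 is updated in place and the result has three rows; with an append, both versions of that row are kept and the result has four.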
|
|
@@ -0,0 +1,12 @@
|
|
|
1
|
+
# Macros demo
|
|
2
|
+
|
|
3
|
+
FastFlowTransform example that highlights SQL & Python macros. See `docs/examples/Macros_Demo.md`
|
|
4
|
+
for a full walkthrough.
|
|
5
|
+
|
|
6
|
+
## Engines
|
|
7
|
+
|
|
8
|
+
- DuckDB/Postgres/Databricks Spark are pre-wired. Run `make demo ENGINE=duckdb|postgres|databricks_spark`.
|
|
9
|
+
- BigQuery (pandas or BigFrames) mirrors the basic demo setup. Set `ENGINE=bigquery` and optionally
|
|
10
|
+
`BQ_FRAME=pandas|bigframes` (default `bigframes`), then run `make demo ENGINE=bigquery BQ_FRAME=<frame>`.
|
|
11
|
+
- Snowflake Snowpark is available via `ENGINE=snowflake_snowpark`. Copy `.env.dev_snowflake` and install
|
|
12
|
+
`fastflowtransform[snowflake]` before running `make demo ENGINE=snowflake_snowpark`.
|
|
@@ -0,0 +1,12 @@
|
|
|
1
|
+
# Materializations demo
|
|
2
|
+
|
|
3
|
+
FastFlowTransform example highlighting materialized views/tables, incremental models, and Python emitters.
|
|
4
|
+
See `docs/examples/Materializations_Demo.md` for a full walkthrough.
|
|
5
|
+
|
|
6
|
+
## Engines
|
|
7
|
+
|
|
8
|
+
- DuckDB/Postgres/Databricks Spark are wired via the Makefile: `make demo ENGINE=duckdb|postgres|databricks_spark`.
|
|
9
|
+
- BigQuery supports both pandas and BigFrames clients. Copy `.env.dev_bigquery_pandas` (or `_bigframes`),
|
|
10
|
+
set `GOOGLE_APPLICATION_CREDENTIALS`, and run `make demo ENGINE=bigquery BQ_FRAME=pandas|bigframes`.
|
|
11
|
+
- Snowflake Snowpark mirrors the basic demo setup. Copy `.env.dev_snowflake`, install `fastflowtransform[snowflake]`,
|
|
12
|
+
then run `make demo ENGINE=snowflake_snowpark`.
|