ddeutil-workflow 0.0.8__py3-none-any.whl → 0.0.9__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,266 +0,0 @@
- Metadata-Version: 2.1
- Name: ddeutil-workflow
- Version: 0.0.8
- Summary: Data Developer & Engineer Workflow Utility Objects
- Author-email: ddeutils <korawich.anu@gmail.com>
- License: MIT
- Project-URL: Homepage, https://github.com/ddeutils/ddeutil-workflow/
- Project-URL: Source Code, https://github.com/ddeutils/ddeutil-workflow/
- Keywords: data,workflow,utility,pipeline
- Classifier: Topic :: Utilities
- Classifier: Natural Language :: English
- Classifier: Development Status :: 4 - Beta
- Classifier: Intended Audience :: Developers
- Classifier: Operating System :: OS Independent
- Classifier: Programming Language :: Python
- Classifier: Programming Language :: Python :: 3 :: Only
- Classifier: Programming Language :: Python :: 3.9
- Classifier: Programming Language :: Python :: 3.10
- Classifier: Programming Language :: Python :: 3.11
- Classifier: Programming Language :: Python :: 3.12
- Requires-Python: >=3.9.13
- Description-Content-Type: text/markdown
- License-File: LICENSE
- Requires-Dist: fmtutil
- Requires-Dist: ddeutil-io
- Requires-Dist: python-dotenv ==1.0.1
- Provides-Extra: api
- Requires-Dist: fastapi[standard] ==0.112.0 ; extra == 'api'
- Requires-Dist: apscheduler[sqlalchemy] <4.0.0,==3.10.4 ; extra == 'api'
- Requires-Dist: croniter ==3.0.3 ; extra == 'api'
- Provides-Extra: app
- Requires-Dist: schedule <2.0.0,==1.2.2 ; extra == 'app'
-
- # Workflow
-
- [![test](https://github.com/ddeutils/ddeutil-workflow/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/ddeutils/ddeutil-workflow/actions/workflows/tests.yml)
- [![python support version](https://img.shields.io/pypi/pyversions/ddeutil-workflow)](https://pypi.org/project/ddeutil-workflow/)
- [![size](https://img.shields.io/github/languages/code-size/ddeutils/ddeutil-workflow)](https://github.com/ddeutils/ddeutil-workflow)
- [![gh license](https://img.shields.io/github/license/ddeutils/ddeutil-workflow)](https://github.com/ddeutils/ddeutil-workflow/blob/main/LICENSE)
-
- **Table of Contents**:
-
- - [Installation](#installation)
- - [Getting Started](#getting-started)
-   - [On](#on)
-   - [Pipeline](#pipeline)
- - [Usage](#usage)
-   - [Python & Bash](#python--bash)
-   - [Hook (EL)](#hook-extract--load)
-   - [Hook (T)](#hook-transform)
- - [Configuration](#configuration)
- - [Deployment](#deployment)
-
- This **Workflow** package was created to make simple, metadata-driven data
- pipeline orchestration that can be used for **ETL, T, EL, or ELT** via a
- `.yaml` template file.
-
- In my opinion, there is no need to duplicate pipeline code when one template
- pipeline can take dynamic input parameters and only the parameters change per
- use-case. This way, many logical pipelines across an organization can be
- handled with metadata configuration alone. This approach is called a
- **Metadata Driven Data Pipeline**.
-
- Next, we should add monitoring tools to manage the logging returned from
- pipeline runs, because the logs alone do not show which use-case a running
- data pipeline belongs to.
-
- > [!NOTE]
- > _Disclaimer_: The dynamic statement syntax is inspired by GitHub Actions
- > `.yml` files, and the config file layout by several data orchestration
- > frameworks I have used in my experience as a Data Engineer.
-
- ## Installation
-
- ```shell
- pip install ddeutil-workflow
- ```
-
- This project needs the `ddeutil-io` extension namespace package. If you want
- to install this package with application add-ons, add the matching extra to
- the installation:
-
- | Usecase           | Install Optional        |
- |-------------------|-------------------------|
- | Scheduler Service | `ddeutil-workflow[app]` |
- | FastAPI Server    | `ddeutil-workflow[api]` |
-
- ## Getting Started
-
- As a first step, you should create the connections and datasets for the
- inputs and outputs of the data that you want to use in a workflow pipeline.
- Some of these components are similar to **Airflow** components, because I
- like its orchestration concepts.
-
- The main feature of this project is the `Pipeline` object, which can call any
- registered function. The pipeline can handle everything you want to do; it
- passes parameters along and catches the output for re-use in the next step.
-
- > [!IMPORTANT]
- > In the future, I will move connections and datasets out of the main
- > features and into dynamic registries, because they carry a lot of vendor
- > code and dependencies to maintain. (I do not have time to handle these
- > features.)
-
- ### On
-
- The **On** object is the schedule component.
-
- ```yaml
- on_every_5_min:
-   type: on.On
-   cron: "*/5 * * * *"
- ```
-
- ```python
- from ddeutil.workflow.on import On
-
- schedule = On.from_loader(name='on_every_5_min', externals={})
- assert '*/5 * * * *' == str(schedule.cronjob)
-
- cron_iter = schedule.generate('2022-01-01 00:00:00')
- assert '2022-01-01 00:05:00' == f"{cron_iter.next:%Y-%m-%d %H:%M:%S}"
- assert '2022-01-01 00:10:00' == f"{cron_iter.next:%Y-%m-%d %H:%M:%S}"
- assert '2022-01-01 00:15:00' == f"{cron_iter.next:%Y-%m-%d %H:%M:%S}"
- assert '2022-01-01 00:20:00' == f"{cron_iter.next:%Y-%m-%d %H:%M:%S}"
- ```
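
The schedule above can be approximated in plain Python. This is a minimal sketch of what a `*/5 * * * *` cron iterator yields, not the package's actual implementation: it rounds a timestamp up to the next 5-minute boundary.

```python
from datetime import datetime, timedelta

def next_every_5_min(ts: datetime) -> datetime:
    """Return the next 5-minute boundary strictly after ``ts``.

    Approximates one step of a '*/5 * * * *' cron iterator.
    """
    # Drop seconds/microseconds, then step forward to the next
    # minute that is a multiple of 5.
    base = ts.replace(second=0, microsecond=0)
    step = 5 - (base.minute % 5)
    return base + timedelta(minutes=step)

start = datetime(2022, 1, 1, 0, 0, 0)
first = next_every_5_min(start)   # 2022-01-01 00:05:00
second = next_every_5_min(first)  # 2022-01-01 00:10:00
```

Repeated calls walk the same sequence the asserts above check: 00:05, 00:10, 00:15, and so on.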
-
- ### Pipeline
-
- The **Pipeline** object is the core feature of this project.
-
- ```yaml
- run_py_local:
-   type: ddeutil.workflow.pipeline.Pipeline
-   on: 'on_every_5_min'
-   params:
-     author-run:
-       type: str
-     run-date:
-       type: datetime
- ```
-
- ```python
- from ddeutil.workflow.pipeline import Pipeline
-
- pipe = Pipeline.from_loader(name='run_py_local', externals={})
- pipe.execute(params={'author-run': 'Local Workflow', 'run-date': '2024-01-01'})
- ```
-
- > [!NOTE]
- > The parameters above use the full declarative statement. You can also use
- > the short form, passing the parameter type directly as the value of the
- > parameter name:
- > ```yaml
- > params:
- >   author-run: str
- >   run-date: datetime
- > ```
- >
- > For the `type` field, you can omit the `ddeutil.workflow` prefix because it
- > is found by a looping search through the `WORKFLOW_CORE_REGISTRY` values.
-
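
The short declarative form can be resolved with a small type map. This is a hypothetical sketch of the idea only, not the package's actual loader code; the `TYPE_MAP` name and the casting rules are assumptions for illustration.

```python
from datetime import datetime

# Hypothetical mapping from declarative type names in the YAML
# params block to Python casting callables.
TYPE_MAP = {
    "str": str,
    "int": int,
    "datetime": datetime.fromisoformat,
}

def cast_params(spec: dict, values: dict) -> dict:
    """Cast raw parameter values according to a short declarative spec."""
    return {
        name: TYPE_MAP[type_name](values[name])
        for name, type_name in spec.items()
    }

params = cast_params(
    {"author-run": "str", "run-date": "datetime"},
    {"author-run": "Local Workflow", "run-date": "2024-01-01"},
)
# params["run-date"] is now a datetime object, not a string.
```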
- ## Usage
-
- These examples use a workflow file to run a common Data Engineering
- use-case.
-
- > [!IMPORTANT]
- > I recommend using the `task` stage for all actions that you want to perform
- > with the pipeline object.
-
- ```yaml
- run_py_local:
-   type: pipeline.Pipeline
-   on:
-   - cronjob: '* * * * *'
-     timezone: "Asia/Bangkok"
-   params:
-     author-run: str
-     run-date: datetime
-   jobs:
-     first-job:
-       stages:
-         - name: "Printing Information"
-           id: define-func
-           run: |
-             x = '${{ params.run-date | fmt("%Y%m%d") }}'
-             print(f'Hello at {x}')
-
-             def echo(name: str):
-                 print(f'Hello {name}')
-         - name: "Run Sequence and use var from Above"
-           vars:
-             x: ${{ params.author-run }}
-           run: |
-             print(f'Receive x from above with {x}')
-             # Change x value
-             x: int = 1
-         - name: "Call Function"
-           vars:
-             echo: ${{ stages.define-func.outputs.echo }}
-           run: |
-             echo('Caller')
-     second-job:
-       stages:
-         - name: "Echo Bash Script"
-           id: shell-echo
-           bash: |
-             echo "Hello World from Shell"
- ```
-
- ```python
- from datetime import datetime
- from ddeutil.workflow.pipeline import Pipeline
-
- pipe: Pipeline = Pipeline.from_loader(name='run_py_local', externals={})
- pipe.execute(params={
-     'author-run': 'Local Workflow',
-     'run-date': datetime(2024, 1, 1),
- })
- ```
-
- ```shell
- > Hello at 20240101
- > Receive x from above with Local Workflow
- > Hello Caller
- > Hello World from Shell
- ```
-
- ## Configuration
-
- ```bash
- export WORKFLOW_ROOT_PATH=.
- export WORKFLOW_CORE_REGISTRY=ddeutil.workflow,tests.utils
- export WORKFLOW_CORE_REGISTRY_FILTER=ddeutil.workflow.utils
- export WORKFLOW_CORE_PATH_CONF=conf
- export WORKFLOW_CORE_TIMEZONE=Asia/Bangkok
- export WORKFLOW_CORE_DEFAULT_STAGE_ID=true
- export WORKFLOW_CORE_MAX_PIPELINE_POKING=4
- export WORKFLOW_CORE_MAX_JOB_PARALLEL=2
- ```
-
- Application config:
-
- ```bash
- export WORKFLOW_APP_DB_URL=postgresql+asyncpg://user:pass@localhost:5432/schedule
- export WORKFLOW_APP_INTERVAL=10
- ```
-
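
Settings like these are typically read from the environment at startup. The following is an illustrative sketch of how such configuration can be loaded with `os.getenv`; the fallback defaults shown here are assumptions, not the package's documented defaults.

```python
import os

def load_core_config() -> dict:
    """Read WORKFLOW_CORE_* environment variables with fallbacks.

    The fallback values below are illustrative assumptions, not the
    package's actual defaults.
    """
    return {
        "root_path": os.getenv("WORKFLOW_ROOT_PATH", "."),
        # Registry is a comma-separated list of module paths.
        "registry": os.getenv("WORKFLOW_CORE_REGISTRY", "ddeutil.workflow").split(","),
        "timezone": os.getenv("WORKFLOW_CORE_TIMEZONE", "UTC"),
        "default_stage_id": os.getenv("WORKFLOW_CORE_DEFAULT_STAGE_ID", "true").lower() == "true",
        "max_pipeline_poking": int(os.getenv("WORKFLOW_CORE_MAX_PIPELINE_POKING", "4")),
        "max_job_parallel": int(os.getenv("WORKFLOW_CORE_MAX_JOB_PARALLEL", "2")),
    }
```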
- ## Deployment
-
- This package is able to run as an application service that receives manual
- triggers from the master node via REST API, or as a scheduler background
- service like a crontab job but driven via a Python API.
-
- ### Schedule Service
-
- ```shell
- (venv) $ python src.ddeutil.workflow.app
- ```
-
- ### API Server
-
- ```shell
- (venv) $ uvicorn src.ddeutil.workflow.api:app --host 0.0.0.0 --port 80 --reload
- ```
-
- > [!NOTE]
- > If this package is already deployed, you can use
- > `uvicorn ddeutil.workflow.api:app --host 0.0.0.0 --port 80`
@@ -1,20 +0,0 @@
- ddeutil/workflow/__about__.py,sha256=FA15NQYpQvn7SrHupxQQQ9Ad5ZzEXOvwDS5UyB1h1bo,27
- ddeutil/workflow/__init__.py,sha256=4PEL3RdHmUowK0Dz-tK7fO0wvFX4u9CLd0Up7b3lrAQ,760
- ddeutil/workflow/__types.py,sha256=SYMoxbENQX8uPsiCZkjtpHAqqHOh8rUrarAFicAJd0E,1773
- ddeutil/workflow/api.py,sha256=d2Mmv9jTtN3FITIy-2mivyAKdBOGZxtkNWRMPbCLlFI,3341
- ddeutil/workflow/app.py,sha256=BuYhOoSJCHiSoj3xb2I5QoxaHrD3bKdmoJua3bKBetc,1165
- ddeutil/workflow/exceptions.py,sha256=zuCcsfJ1hFivubXz6lXCpGYXk07d_PkRaUD5ew3_LC0,632
- ddeutil/workflow/loader.py,sha256=_ZD-XP5P7VbUeqItrUVPaKIZu6dMUZ2aywbCbReW1hQ,2778
- ddeutil/workflow/log.py,sha256=N2TyjcuAoH0YTvzJCHTO037IHgVkLA986Xhtz1LSgE4,1742
- ddeutil/workflow/on.py,sha256=YoEqDbzJUwqOA3JRltbvlYr0rNTtxdmb7cWMxl8U19k,6717
- ddeutil/workflow/pipeline.py,sha256=VC6VDxycUdGKn13V42RZxAlCFySYb2HIZGq_ku5Kp5k,30844
- ddeutil/workflow/repeat.py,sha256=sNoRfbOR4cYm_edrSvlVy9N8Dk_osLIq9FC5GMZz32M,4621
- ddeutil/workflow/route.py,sha256=Ck_O1xJwI-vKkMJr37El0-1PGKlwKF8__DDNWVQrf0A,2079
- ddeutil/workflow/scheduler.py,sha256=FqmkvWCqwJ4eRf8aDn5Ce4FcNWqmcvu2aTTfL34lfgs,22184
- ddeutil/workflow/stage.py,sha256=tbxENx_-2BQ6peXKM_s6RQ1oGzTlXcZ4yDpP1Hufkdk,18095
- ddeutil/workflow/utils.py,sha256=seyU81JXfb2zz6QbJvVEb2Wn4qt8f-FBA6QFC97xY5k,21240
- ddeutil_workflow-0.0.8.dist-info/LICENSE,sha256=nGFZ1QEhhhWeMHf9n99_fdt4vQaXS29xWKxt-OcLywk,1085
- ddeutil_workflow-0.0.8.dist-info/METADATA,sha256=9i7Jk3CZlBpNkmFFjD247opgYA6Mc8AT6CtZjcvamYI,8314
- ddeutil_workflow-0.0.8.dist-info/WHEEL,sha256=HiCZjzuy6Dw0hdX5R3LCFPDmFS4BWl8H-8W39XfmgX4,91
- ddeutil_workflow-0.0.8.dist-info/top_level.txt,sha256=m9M6XeSWDwt_yMsmH6gcOjHZVK5O0-vgtNBuncHjzW4,8
- ddeutil_workflow-0.0.8.dist-info/RECORD,,