ddeutil-workflow 0.0.9__py3-none-any.whl → 0.0.11__py3-none-any.whl

@@ -1,273 +0,0 @@
1
- Metadata-Version: 2.1
2
- Name: ddeutil-workflow
3
- Version: 0.0.9
4
- Summary: Lightweight workflow orchestration with less dependencies
5
- Author-email: ddeutils <korawich.anu@gmail.com>
6
- License: MIT
7
- Project-URL: Homepage, https://github.com/ddeutils/ddeutil-workflow/
8
- Project-URL: Source Code, https://github.com/ddeutils/ddeutil-workflow/
9
- Keywords: orchestration,workflow
10
- Classifier: Topic :: Utilities
11
- Classifier: Natural Language :: English
12
- Classifier: Development Status :: 4 - Beta
13
- Classifier: Intended Audience :: Developers
14
- Classifier: Operating System :: OS Independent
15
- Classifier: Programming Language :: Python
16
- Classifier: Programming Language :: Python :: 3 :: Only
17
- Classifier: Programming Language :: Python :: 3.9
18
- Classifier: Programming Language :: Python :: 3.10
19
- Classifier: Programming Language :: Python :: 3.11
20
- Classifier: Programming Language :: Python :: 3.12
21
- Requires-Python: >=3.9.13
22
- Description-Content-Type: text/markdown
23
- License-File: LICENSE
24
- Requires-Dist: fmtutil
25
- Requires-Dist: ddeutil-io
26
- Requires-Dist: python-dotenv ==1.0.1
27
- Requires-Dist: typer ==0.12.4
28
- Provides-Extra: api
29
- Requires-Dist: fastapi[standard] ==0.112.1 ; extra == 'api'
30
- Requires-Dist: croniter ==3.0.3 ; extra == 'api'
31
- Provides-Extra: schedule
32
- Requires-Dist: schedule <2.0.0,==1.2.2 ; extra == 'schedule'
33
-
34
- # Workflow
35
-
36
- [![test](https://github.com/ddeutils/ddeutil-workflow/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/ddeutils/ddeutil-workflow/actions/workflows/tests.yml)
37
- [![python support version](https://img.shields.io/pypi/pyversions/ddeutil-workflow)](https://pypi.org/project/ddeutil-workflow/)
38
- [![size](https://img.shields.io/github/languages/code-size/ddeutils/ddeutil-workflow)](https://github.com/ddeutils/ddeutil-workflow)
39
- [![gh license](https://img.shields.io/github/license/ddeutils/ddeutil-workflow)](https://github.com/ddeutils/ddeutil-workflow/blob/main/LICENSE)
40
- [![code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
41
-
42
- **Table of Contents**:
43
-
44
- - [Installation](#installation)
45
- - [Getting Started](#getting-started)
46
- - [On](#on)
47
- - [Pipeline](#pipeline)
48
- - [Usage](#usage)
49
- - [Configuration](#configuration)
50
- - [Future](#future)
51
- - [Deployment](#deployment)
52
-
53
- A **lightweight workflow orchestration** package with few dependencies, created
54
- to make simple metadata-driven data pipeline orchestration easy.
55
- It can be used as a data operator through a `.yaml` template.
56
-
57
- > [!WARNING]
58
- > This package provides only orchestration workloads. That means you should not
59
- > use a workflow stage to process large data that needs a lot of compute.
60
-
61
- In my opinion, I should not have to write duplicate pipeline code if I can
62
- write one template pipeline with dynamic input parameters and just change
63
- the input parameters per use-case instead.
64
- This way I can handle a lot of logical pipelines in our orgs with only metadata
65
- configuration. This is called a **Metadata Driven Data Pipeline**.
66
-
67
- Next, we should get some monitoring tools to manage the logs returned from
68
- pipeline runs, because the logs alone do not show which use-case a data pipeline
69
- run belongs to.
70
-
71
- > [!NOTE]
72
- > _Disclaimer_: The dynamic statements are inspired by GitHub Actions `.yml` files
73
- > and by the config files of several data orchestration frameworks from my
74
- > experience as a Data Engineer.
75
-
76
- **Rules of This Workflow engine**:
77
-
78
- 1. The minimum unit of scheduling is 1 minute
79
- 2. It cannot re-run only a failed stage and its pending downstream stages
80
- 3. All parallel tasks inside the workflow engine use threading
81
- (because Python 3.13 offers an experimental free-threaded build without the GIL)
82
-
83
- ## Installation
84
-
85
- This project needs the `ddeutil-io` extension namespace package. If you want to install
86
- this package with application add-ons, add the matching extra (`schedule` or `api`) at installation:
87
-
88
- | Usecase | Install Optional | Support |
89
- |-------------------|------------------------------------------|--------------------|
90
- | Python & CLI | `pip install ddeutil-workflow` | :heavy_check_mark: |
91
- | Scheduler Service | `pip install ddeutil-workflow[schedule]` | :x: |
92
- | FastAPI Server | `pip install ddeutil-workflow[api]` | :x: |
93
-
94
-
95
- > I added this feature to the main milestone.
96
- >
97
- > **Docker Images** supported:
98
- >
99
- > | Docker Image | Python Version | Support |
100
- > |-----------------------------|----------------|---------|
101
- > | ddeutil-workflow:latest | `3.9` | :x: |
102
- > | ddeutil-workflow:python3.10 | `3.10` | :x: |
103
- > | ddeutil-workflow:python3.11 | `3.11` | :x: |
104
- > | ddeutil-workflow:python3.12 | `3.12` | :x: |
105
-
106
- ## Getting Started
107
-
108
- The main feature of this project is the `Pipeline` object, which can call any
109
- function from its registries. The pipeline can handle everything that you want to do: it
110
- passes parameters and catches the output so the next step can re-use it.
111
-
112
- ### On
113
-
114
- **On** is a schedule object that receives a crontab value and can generate the
115
- next or previous datetime values from any starting point given as an input datetime.
116
-
117
- ```yaml
118
- # This file should be kept under this path: `./root-path/conf-path/*`
119
- on_every_5_min:
120
-   type: on.On
121
-   cron: "*/5 * * * *"
122
- ```
123
-
124
- ```python
125
- from ddeutil.workflow.on import On
126
-
127
- # NOTE: Load the `On` data from the `.yaml` template file with this key.
128
- schedule = On.from_loader(name='on_every_5_min', externals={})
129
-
130
- assert '*/5 * * * *' == str(schedule.cronjob)
131
-
132
- cron_iter = schedule.generate('2022-01-01 00:00:00')
133
-
134
- assert "2022-01-01 00:05:00" == f"{cron_iter.next:%Y-%m-%d %H:%M:%S}"
135
- assert "2022-01-01 00:10:00" == f"{cron_iter.next:%Y-%m-%d %H:%M:%S}"
136
- assert "2022-01-01 00:15:00" == f"{cron_iter.next:%Y-%m-%d %H:%M:%S}"
137
- ```
138
-
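The next-run behaviour that the `*/5 * * * *` example relies on can be sketched without the package. This is a minimal illustration using only the standard library, not the package's own cron implementation; the helper name `next_every_n_minutes` is hypothetical and it only covers the "every n minutes" pattern where `n` divides 60.

```python
from datetime import datetime, timedelta


def next_every_n_minutes(start: datetime, n: int = 5) -> datetime:
    """Return the first minute after `start` whose minute is a multiple of `n`.

    Hypothetical helper: it only models the `*/n * * * *` pattern for an
    `n` that divides 60, which is enough for the example above.
    """
    # Drop seconds and microseconds, then step to the next multiple of n.
    base = start.replace(second=0, microsecond=0)
    step = n - (base.minute % n)
    return base + timedelta(minutes=step)


# Walk forward three times from the same start point used in the README.
current = datetime(2022, 1, 1, 0, 0, 0)
runs = []
for _ in range(3):
    current = next_every_n_minutes(current)
    runs.append(f"{current:%Y-%m-%d %H:%M:%S}")

assert runs == [
    "2022-01-01 00:05:00",
    "2022-01-01 00:10:00",
    "2022-01-01 00:15:00",
]
```

Each call starts from the previous result, which mirrors how repeated access to the iterator's next value advances the schedule in the README example.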
139
- ### Pipeline
140
-
141
- The **Pipeline** object is the core feature of this project.
142
-
143
- ```yaml
144
- # This file should be kept under this path: `./root-path/conf-path/*`
144
- pipeline-name:
145
-   type: ddeutil.workflow.pipeline.Pipeline
146
-   on: 'on_every_5_min'
147
-   params:
148
-     author-run:
149
-       type: str
150
-     run-date:
151
-       type: datetime
152
-   jobs:
153
-     first-job:
154
-       stages:
155
-         - name: "Empty stage do logging to console only!!"
157
- ```
158
-
159
- ```python
160
- from ddeutil.workflow.pipeline import Pipeline
161
-
162
- pipe = Pipeline.from_loader(name='pipeline-name', externals={})
163
- pipe.execute(params={'author-run': 'Local Workflow', 'run-date': '2024-01-01'})
164
- ```
165
-
166
- > [!NOTE]
167
- > The parameters above can use a short declarative statement. You can pass a parameter
168
- > type directly as the value of a parameter name, but it does not handle a default value if you
169
- > run this pipeline workflow with a schedule.
170
- >
171
- > ```yaml
172
- > ...
173
- > params:
174
- >   author-run: str
175
- >   run-date: datetime
176
- > ...
177
- > ```
178
- >
179
- > For the type, you can remove the `ddeutil.workflow` prefix because it is found
180
- > by searching through each entry of the `WORKFLOW_CORE_REGISTRY` value.
181
-
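The relationship between the short and long parameter forms can be sketched as a small normalization step. This is an illustration under stated assumptions, not the package's own parsing code: the function name `normalize_params` is hypothetical, and it assumes a short-form entry is a plain string naming the type.

```python
from typing import Any


def normalize_params(params: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Expand short-form entries (`name: type`) into long form (`name: {type: ...}`).

    Hypothetical helper: long-form entries are copied unchanged.
    """
    normalized: dict[str, dict[str, Any]] = {}
    for name, spec in params.items():
        if isinstance(spec, str):
            # Short form: the value is just the type name.
            normalized[name] = {"type": spec}
        else:
            # Long form: keep the mapping as it is.
            normalized[name] = dict(spec)
    return normalized


short_form = {"author-run": "str", "run-date": "datetime"}
long_form = {"author-run": {"type": "str"}, "run-date": {"type": "datetime"}}

assert normalize_params(short_form) == long_form
assert normalize_params(long_form) == long_form
```

Because the short form carries only a type name, it has no place to declare a default value, which matches the limitation described in the note.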
182
- ## Usage
183
-
184
- These examples use a workflow file to run common Data Engineering
185
- use-cases.
186
-
187
- > [!IMPORTANT]
188
- > I recommend using the `hook` stage for all actions that you want to do
189
- > in the pipeline activity that you orchestrate, because it can pass dynamic
190
- > input arguments to the same hook function, so you spend less time
191
- > maintaining your data pipelines.
192
-
193
- ```yaml
194
- run_py_local:
194
-   type: pipeline.Pipeline
195
-   on:
196
-     - cronjob: '*/5 * * * *'
197
-       timezone: "Asia/Bangkok"
198
-   params:
199
-     author-run: str
200
-     run-date: datetime
201
-   jobs:
202
-     getting-api-data:
203
-       stages:
204
-         - name: "Retrieve API Data"
205
-           id: retrieve-api
206
-           uses: tasks/get-api-with-oauth-to-s3@requests
207
-           with:
208
-             url: https://open-data/
209
-             auth: ${API_ACCESS_REFRESH_TOKEN}
210
-             aws_s3_path: my-data/open-data/
211
-             # This authentication code should be implemented with your custom hook function
212
-             aws_access_client_id: ${AWS_ACCESS_CLIENT_ID}
213
-             aws_access_client_secret: ${AWS_ACCESS_CLIENT_SECRET}
215
- ```
216
-
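The `${...}` placeholders in the `with:` block look like environment-style variables. The README does not say how the package resolves them, so the following is only a sketch of one possible approach using `string.Template` from the standard library; the function name `render_args` and the sample token value are hypothetical.

```python
from collections.abc import Mapping
from string import Template


def render_args(args: Mapping[str, str], env: Mapping[str, str]) -> dict[str, str]:
    """Replace `${NAME}` placeholders in each argument value with values from `env`.

    Hypothetical helper: placeholders without a matching key are left
    untouched instead of raising, so missing secrets stay visible.
    """
    return {key: Template(value).safe_substitute(env) for key, value in args.items()}


stage_args = {
    "url": "https://open-data/",
    "auth": "${API_ACCESS_REFRESH_TOKEN}",
    "aws_access_client_id": "${AWS_ACCESS_CLIENT_ID}",
}
# In a real run this mapping would typically be `os.environ`.
rendered = render_args(stage_args, {"API_ACCESS_REFRESH_TOKEN": "token-123"})

assert rendered["url"] == "https://open-data/"
assert rendered["auth"] == "token-123"
assert rendered["aws_access_client_id"] == "${AWS_ACCESS_CLIENT_ID}"
```

Keeping secrets out of the template and resolving them at run time is what lets one workflow file be shared across environments.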
217
- ## Configuration
218
-
219
- | Environment | Component | Default | Description |
220
- |-------------------------------------|-----------|------------------------------|----------------------------------------------------------------------------|
221
- | `WORKFLOW_ROOT_PATH` | Core | . | The root path of the workflow application |
222
- | `WORKFLOW_CORE_REGISTRY` | Core | ddeutil.workflow,tests.utils | List of importable strings for the hook stage |
223
- | `WORKFLOW_CORE_REGISTRY_FILTER` | Core | ddeutil.workflow.utils | List of importable strings for the filter template |
224
- | `WORKFLOW_CORE_PATH_CONF` | Core | conf | The config path that keeps all template `.yaml` files |
225
- | `WORKFLOW_CORE_TIMEZONE` | Core | Asia/Bangkok | A timezone string value that is passed to the `ZoneInfo` object |
226
- | `WORKFLOW_CORE_STAGE_DEFAULT_ID` | Core | true | A flag that enables the default stage ID used to catch an execution output |
227
- | `WORKFLOW_CORE_STAGE_RAISE_ERROR` | Core | true | A flag that makes all stages raise `StageException` from stage execution |
228
- | `WORKFLOW_CORE_MAX_PIPELINE_POKING` | Core | 4 | |
229
- | `WORKFLOW_CORE_MAX_JOB_PARALLEL` | Core | 2 | The maximum number of jobs that can run in parallel in the pipeline executor |
230
- | `WORKFLOW_LOG_ENABLE_WRITE` | Log | true | A flag that enables the logging object to save logs to its destination |
231
-
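Reading a few of the core settings above with their documented defaults can be sketched as follows. This is an illustration only: `CoreConfig`, `as_flag`, and `load_core_config` are hypothetical names, not part of the package, and the accepted spellings for a true flag are an assumption.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Optional


def as_flag(value: str) -> bool:
    """Interpret common truthy spellings; the accepted set is an assumption."""
    return value.strip().lower() in {"1", "true", "yes"}


@dataclass(frozen=True)
class CoreConfig:
    root_path: str
    conf_path: str
    timezone: str
    stage_default_id: bool
    max_job_parallel: int


def load_core_config(env: Optional[Mapping[str, str]] = None) -> CoreConfig:
    """Build a config from environment variables, using the table's defaults."""
    env = os.environ if env is None else env
    return CoreConfig(
        root_path=env.get("WORKFLOW_ROOT_PATH", "."),
        conf_path=env.get("WORKFLOW_CORE_PATH_CONF", "conf"),
        timezone=env.get("WORKFLOW_CORE_TIMEZONE", "Asia/Bangkok"),
        stage_default_id=as_flag(env.get("WORKFLOW_CORE_STAGE_DEFAULT_ID", "true")),
        max_job_parallel=int(env.get("WORKFLOW_CORE_MAX_JOB_PARALLEL", "2")),
    )


# Override one value and fall back to the documented defaults for the rest.
config = load_core_config({"WORKFLOW_CORE_MAX_JOB_PARALLEL": "4"})

assert config.max_job_parallel == 4
assert config.conf_path == "conf"
assert config.stage_default_id is True
```

Passing a plain mapping instead of `os.environ` keeps the sketch easy to test without touching the real process environment.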
232
-
233
- **Application**:
234
-
235
- | Environment | Default | Description |
236
- |-------------------------------------|---------|-------------|
237
- | `WORKFLOW_APP_PROCESS_WORKER` | 2 | |
238
- | `WORKFLOW_APP_PIPELINE_PER_PROCESS` | 100 | |
239
-
240
- **API server**:
241
-
242
- | Environment | Default | Description |
243
- |-----------------------|--------------------------------------------------------|--------------------------------------------------------------------|
244
- | `WORKFLOW_API_DB_URL` | postgresql+asyncpg://user:pass@localhost:5432/schedule | A database URL that is passed to the SQLAlchemy `create_engine` function |
245
-
246
- ## Future
247
-
248
- The current milestone to develop and the necessary features that should be
249
- implemented in this project:
250
-
251
- - ...
252
-
253
- ## Deployment
254
-
255
- This package can run as an application service that receives manual triggers
256
- from the master node via REST API, or as a scheduler background service
257
- like a crontab job but via the Python API.
258
-
259
- ### Schedule Service
260
-
261
- ```shell
262
- (venv) $ python src.ddeutil.workflow.app
263
- ```
264
-
265
- ### API Server
266
-
267
- ```shell
268
- (venv) $ uvicorn src.ddeutil.workflow.api:app --host 0.0.0.0 --port 80 --reload
269
- ```
270
-
271
- > [!NOTE]
272
- > If this package is already deployed, you can use
273
- > `uvicorn ddeutil.workflow.api:app --host 0.0.0.0 --port 80`
@@ -1,22 +0,0 @@
1
- ddeutil/workflow/__about__.py,sha256=gh9CIut-EzZx1bHdgqILjssQNzzmuo1z_7iXAotDuKk,27
2
- ddeutil/workflow/__init__.py,sha256=oGvg_BpKKb_FG76DlMvXTKD7BsYhqF9wB1r4x5Q_lQI,647
3
- ddeutil/workflow/__types.py,sha256=SYMoxbENQX8uPsiCZkjtpHAqqHOh8rUrarAFicAJd0E,1773
4
- ddeutil/workflow/api.py,sha256=GxjGTLnohbsLsQbcJ0CL00d2LHpuw6J7PN6NqJ3oyRw,2502
5
- ddeutil/workflow/cli.py,sha256=RsP7evb3HCkzzO89ODjX6VEemQsSv9I-XOdWUJsiLfg,1180
6
- ddeutil/workflow/cron.py,sha256=FqmkvWCqwJ4eRf8aDn5Ce4FcNWqmcvu2aTTfL34lfgs,22184
7
- ddeutil/workflow/exceptions.py,sha256=zuCcsfJ1hFivubXz6lXCpGYXk07d_PkRaUD5ew3_LC0,632
8
- ddeutil/workflow/loader.py,sha256=uMMDc7hzPHqcmIoX2tF91KF1R9AerSC-TScrWmKLlNU,4490
9
- ddeutil/workflow/log.py,sha256=MxRZMnpq_p0khgZQXffJ7mlGPeVPeY6ABYBBauxUapc,5192
10
- ddeutil/workflow/on.py,sha256=6E8P4Cbc5y-nywF7xk0KDCJFEG8GhUVGnbjAQnQN2Dg,6892
11
- ddeutil/workflow/pipeline.py,sha256=uSX5qtDvBXjTDZheQPPafb704R9C0upFPCNIDnoIFOE,39219
12
- ddeutil/workflow/repeat.py,sha256=e127Z-Fl5Ft2CZSQwLOhInU21IBio0XAyk00B2TLQmU,4730
13
- ddeutil/workflow/route.py,sha256=w095eB4zMQsqszVgll-M15ky1mxmLKCbwfcTXc9xOPE,1933
14
- ddeutil/workflow/scheduler.py,sha256=06p0BAHehdP-23rUfrswZi1mF7Kgolqf4OLMtFVsVX4,14875
15
- ddeutil/workflow/stage.py,sha256=4Xtjl0GQUceqe8VGV8DsqmvfuX6lq8C0ne-Ls9qtLMs,20589
16
- ddeutil/workflow/utils.py,sha256=HY3tEARQHJrm4WTQX9jmeHUBwQaFmJFIrtzttYvaCRA,23963
17
- ddeutil_workflow-0.0.9.dist-info/LICENSE,sha256=nGFZ1QEhhhWeMHf9n99_fdt4vQaXS29xWKxt-OcLywk,1085
18
- ddeutil_workflow-0.0.9.dist-info/METADATA,sha256=VSDq5YFeEJ6Ni0e-I32B4M9Anwh18Pd7q2CCp_igTMY,11148
19
- ddeutil_workflow-0.0.9.dist-info/WHEEL,sha256=Mdi9PDNwEZptOjTlUcAth7XJDFtKrHYaQMPulZeBCiQ,91
20
- ddeutil_workflow-0.0.9.dist-info/entry_points.txt,sha256=gLS1mgLig424zJql6CYYz4TxjKzoOwsS_Ez_NkEw0DA,54
21
- ddeutil_workflow-0.0.9.dist-info/top_level.txt,sha256=m9M6XeSWDwt_yMsmH6gcOjHZVK5O0-vgtNBuncHjzW4,8
22
- ddeutil_workflow-0.0.9.dist-info/RECORD,,
@@ -1,2 +0,0 @@
1
- [console_scripts]
2
- workflow = ddeutil.workflow.cli:app