highway-dsl 0.0.2__py3-none-any.whl → 1.2.0__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release: this version of highway-dsl might be problematic.
- highway_dsl/__init__.py +15 -6
- highway_dsl/workflow_dsl.py +498 -72
- highway_dsl-1.2.0.dist-info/METADATA +481 -0
- highway_dsl-1.2.0.dist-info/RECORD +7 -0
- highway_dsl-0.0.2.dist-info/METADATA +0 -227
- highway_dsl-0.0.2.dist-info/RECORD +0 -7
- {highway_dsl-0.0.2.dist-info → highway_dsl-1.2.0.dist-info}/WHEEL +0 -0
- {highway_dsl-0.0.2.dist-info → highway_dsl-1.2.0.dist-info}/licenses/LICENSE +0 -0
- {highway_dsl-0.0.2.dist-info → highway_dsl-1.2.0.dist-info}/top_level.txt +0 -0
@@ -0,0 +1,481 @@
Metadata-Version: 2.4
Name: highway_dsl
Version: 1.2.0
Summary: A stable domain specific language (DSL) for defining and managing data processing pipelines and workflow engines.
Author-email: Farseed Ashouri <farseed.ashouri@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/rodmena-limited/highway_dsl
Project-URL: Issues, https://github.com/rodmena-limited/highway_dsl/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.12.3
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: types-PyYAML>=6.0.0; extra == "dev"
Requires-Dist: pytest-cov>=2.12.1; extra == "dev"
Dynamic: license-file

# Highway DSL

[](https://badge.fury.io/py/highway-dsl)
[](https://opensource.org/licenses/MIT)
[](https://pypi.org/project/highway-dsl/)
[](https://github.com/rodmena-limited/highway_dsl/actions/workflows/publish.yml)

**Highway DSL** is a Python-based domain-specific language for defining complex workflows in a clear, concise, and fluent manner. It is part of the larger **Highway** project, an advanced workflow engine capable of running complex DAG-based workflows.

## Version 1.1.0 - Feature Release

This major feature release adds **Airflow-parity** features to enable production-grade workflows:

### New Features

#### 1. **Scheduling Metadata** (Airflow Parity)
Define cron-based schedules directly in your workflow:
```python
builder = (
    WorkflowBuilder("daily_pipeline")
    .set_schedule("0 2 * * *")  # Run daily at 2 AM
    .set_start_date(datetime(2025, 1, 1))
    .set_catchup(False)
    .add_tags("production", "daily")
    .set_max_active_runs(1)
)
```

#### 2. **Event-Based Operators** (Absurd Integration)
First-class support for event-driven workflows:
```python
# Emit an event that other workflows can wait for
builder.emit_event(
    "notify_completion",
    event_name="pipeline_done",
    payload={"status": "success"}
)

# Wait for an external event
builder.wait_for_event(
    "wait_upstream",
    event_name="data_ready",
    timeout_seconds=3600
)
```
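
The wait-with-timeout semantics can be sketched independently of highway_dsl: a task blocks until the event arrives or the timeout elapses, and each outcome routes differently. The `threading.Event` stand-in and the `wait_for_event` helper below are illustrative assumptions, not the engine's actual implementation.

```python
import threading

def wait_for_event(event: threading.Event, timeout_seconds: float) -> str:
    """Block until the event fires, or give up once the timeout elapses."""
    arrived = event.wait(timeout=timeout_seconds)
    return "succeeded" if arrived else "timed_out"

# Simulate an upstream workflow emitting "data_ready" shortly after we start waiting.
data_ready = threading.Event()
threading.Timer(0.05, data_ready.set).start()

print(wait_for_event(data_ready, timeout_seconds=1.0))          # succeeded
print(wait_for_event(threading.Event(), timeout_seconds=0.01))  # timed_out
```

In the DSL, the timed-out branch would typically surface as a task failure that retry policies or failure callbacks can then handle.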

#### 3. **Callback Hooks** (Production Workflows)
Durable success/failure handlers as first-class workflow nodes:
```python
builder.task("risky_operation", "process.data")

builder.task("send_alert", "alerts.notify")
builder.on_failure("send_alert")  # Runs if risky_operation fails

builder.task("cleanup", "cleanup.resources")
builder.on_success("cleanup")  # Runs if risky_operation succeeds
```
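
The dispatch rule behind these hooks can be sketched in plain Python: run the task, then route to the registered failure handler on an exception, or the success handler otherwise. `run_with_callbacks` is a hypothetical illustration using plain callables, whereas in the DSL the handlers are workflow nodes.

```python
def run_with_callbacks(task, on_success=None, on_failure=None):
    """Run a task, then route to the registered success or failure handler."""
    try:
        result = task()
    except Exception as exc:
        if on_failure is not None:
            return on_failure(exc)
        raise
    if on_success is not None:
        on_success(result)
    return result

log = []
run_with_callbacks(lambda: log.append("processed") or "ok",
                   on_success=lambda r: log.append(f"cleanup after {r}"))
run_with_callbacks(lambda: 1 / 0,
                   on_failure=lambda exc: log.append("send_alert"))
print(log)  # ['processed', 'cleanup after ok', 'send_alert']
```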

#### 4. **Switch/Case Operator**
Multi-branch routing with cleaner syntax than nested conditions:
```python
builder.switch(
    "route_by_status",
    switch_on="{{data.status}}",
    cases={
        "approved": "approve_task",
        "rejected": "reject_task",
        "pending": "review_task"
    },
    default="unknown_handler"
)
```
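
The routing rule itself is a simple dictionary lookup with a fallback. A minimal sketch of the semantics, with `route` as an illustrative helper (not part of the highway_dsl API):

```python
def route(switch_on: str, cases: dict[str, str], default: str) -> str:
    """Pick the target task for a switch value, falling back to the default."""
    return cases.get(switch_on, default)

cases = {"approved": "approve_task", "rejected": "reject_task", "pending": "review_task"}
print(route("rejected", cases, "unknown_handler"))  # reject_task
print(route("on_hold", cases, "unknown_handler"))   # unknown_handler
```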

#### 5. **Task Descriptions**
Document your workflow inline:
```python
builder.task(
    "process",
    "data.transform",
    description="Transform raw data into analytics format"
)
```

#### 6. **Workflow-Level Default Retry Policy**
Set a default retry policy for all tasks:
```python
builder.set_default_retry_policy(
    RetryPolicy(max_retries=3, delay=timedelta(seconds=60))
)
```

See `examples/scheduled_event_workflow.py` for a comprehensive example using all new features.

### RFC-Style Specification

For implementers and advanced users, Highway DSL v1.1.0 includes a comprehensive **3,215-line RFC-style specification** (`spec.txt`) modeled after IETF RFCs (RFC 2119, RFC 8259). This authoritative document provides:

- Complete operator specifications with execution semantics
- Integration guidance for Absurd and other runtimes
- Security considerations and best practices
- Comprehensive examples for all features
- Formal data model definitions

Access the specification at `/dsl/spec.txt` in the repository.

## Architecture Diagram

```mermaid
graph TB
    subgraph "Highway DSL v1.1.0 Features"
        A[WorkflowBuilder<br/>Fluent API] --> B[Core Operators]
        A --> C[Scheduling]
        A --> D[Events]
        A --> E[Error Handling]

        B --> B1[Task]
        B --> B2[Condition]
        B --> B3[Parallel]
        B --> B4[ForEach]
        B --> B5[While]
        B --> B6[Wait]
        B --> B7[Switch]

        C --> C1[Cron Schedules]
        C --> C2[Start Date]
        C --> C3[Catchup]
        C --> C4[Tags]

        D --> D1[EmitEvent]
        D --> D2[WaitForEvent]

        E --> E1[RetryPolicy]
        E --> E2[TimeoutPolicy]
        E --> E3[Callbacks]
    end

    subgraph "Output Formats"
        F[YAML]
        G[JSON]
    end

    subgraph "Runtime Integration"
        H[Absurd Runtime]
        I[Airflow]
        J[Temporal]
        K[Custom Engines]
    end

    A --> F
    A --> G
    F --> H
    F --> I
    F --> J
    F --> K
    G --> H
    G --> I
    G --> J
    G --> K

    style A fill:#2563eb,stroke:#1e40af,color:#fff
    style B fill:#8b5cf6,stroke:#7c3aed,color:#fff
    style C fill:#10b981,stroke:#059669,color:#fff
    style D fill:#f59e0b,stroke:#d97706,color:#fff
    style E fill:#ef4444,stroke:#dc2626,color:#fff
```

## Features

* **Fluent API:** A powerful and intuitive `WorkflowBuilder` for defining workflows programmatically.
* **Pydantic-based:** All models are built on Pydantic, providing robust data validation, serialization, and documentation.
* **Rich Operators:** A comprehensive set of operators for handling various workflow scenarios:
    * `Task` - Basic workflow steps
    * `Condition` (if/else) - Conditional branching
    * `Parallel` - Execute multiple branches simultaneously
    * `ForEach` - Iterate over collections with proper dependency management
    * `Wait` - Pause execution for scheduled tasks
    * `While` - Execute loops based on conditions
    * **NEW in v1.1:** `EmitEvent` - Emit events for cross-workflow coordination
    * **NEW in v1.1:** `WaitForEvent` - Wait for external events with timeout
    * **NEW in v1.1:** `Switch` - Multi-branch routing (switch/case)
* **Scheduling:** Built-in support for cron-based scheduling, start dates, and catchup configuration
* **Event-Driven:** First-class support for event emission and waiting (Absurd integration)
* **Callback Hooks:** Durable success/failure handlers as workflow nodes
* **YAML/JSON Interoperability:** Workflows can be defined in Python and exported to YAML or JSON, and vice versa.
* **Retry and Timeout Policies:** Built-in error handling and execution time management.
* **Extensible:** The DSL is designed to be extensible with custom operators and policies.
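
The YAML/JSON interoperability above amounts to a lossless round trip between a workflow definition and its serialized form. A stdlib-only sketch of that idea, where the document shape loosely follows the YAML examples in this README and is illustrative rather than the DSL's guaranteed schema:

```python
import json

# Hypothetical serialized workflow; field names mirror this README's YAML examples.
workflow_doc = {
    "name": "simple_etl",
    "start_task": "extract",
    "tasks": {
        "extract": {"operator_type": "task", "function": "etl.extract_data",
                    "result_key": "raw_data", "dependencies": []},
        "load": {"operator_type": "task", "function": "etl.load_data",
                 "dependencies": ["extract"]},
    },
}

text = json.dumps(workflow_doc, indent=2)  # export, as workflow.to_json() would
restored = json.loads(text)                # import the definition back
assert restored == workflow_doc            # nothing is lost in the round trip
```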

## Installation

```bash
pip install highway-dsl
```

## Quick Start

Here's a simple example of how to define a workflow using the `WorkflowBuilder`:

```python
from datetime import timedelta
from highway_dsl import WorkflowBuilder

workflow = (
    WorkflowBuilder("simple_etl")
    .task("extract", "etl.extract_data", result_key="raw_data")
    .task(
        "transform",
        "etl.transform_data",
        args=["{{raw_data}}"],
        result_key="transformed_data",
    )
    .retry(max_retries=3, delay=timedelta(seconds=10))
    .task("load", "etl.load_data", args=["{{transformed_data}}"])
    .timeout(timeout=timedelta(minutes=30))
    .wait("wait_next", timedelta(hours=24))
    .task("cleanup", "etl.cleanup")
    .build()
)

print(workflow.to_yaml())
```

## Real-World Example: E-Commerce Order Processing

```python
from highway_dsl import WorkflowBuilder, RetryPolicy
from datetime import datetime, timedelta

# Production-ready e-commerce order workflow
workflow = (
    WorkflowBuilder("order_processing")
    .set_schedule("*/5 * * * *")  # Run every 5 minutes
    .set_start_date(datetime(2025, 1, 1))
    .add_tags("production", "orders", "critical")
    .set_default_retry_policy(RetryPolicy(max_retries=3, delay=timedelta(seconds=30)))

    # Fetch pending orders
    .task("fetch_orders", "orders.get_pending", result_key="orders")

    # Process each order
    .foreach(
        "process_each_order",
        items="{{orders}}",
        loop_body=lambda b: (
            b.task("validate", "orders.validate", args=["{{item}}"])
            .task("charge_payment", "payments.charge", args=["{{item}}"],
                  result_key="payment_result")
            .task("send_failure_email", "email.send_failure",
                  args=["{{item.customer_email}}"])
            .on_failure("send_failure_email")  # Alert on payment failure
            .switch(
                "route_by_amount",
                switch_on="{{item.total}}",
                cases={
                    "high": "priority_shipping",    # > $500
                    "medium": "standard_shipping",  # $100-500
                    "low": "economy_shipping"       # < $100
                },
                default="standard_shipping"
            )
        )
    )

    # Emit completion event for analytics workflow
    .emit_event(
        "notify_analytics",
        event_name="orders_processed_{{ds}}",
        payload={"count": "{{orders.length}}", "timestamp": "{{run.started_at}}"}
    )

    .build()
)
```

This workflow demonstrates:
- Scheduled execution every 5 minutes
- Default retry policy for all tasks
- ForEach loop processing multiple orders
- Payment failure callbacks
- Switch/case routing based on order amount
- Event emission for cross-workflow coordination

## Mermaid Diagram Generation

You can generate a Mermaid state diagram of your workflow using the `to_mermaid` method:

```python
print(workflow.to_mermaid())
```

This will output a Mermaid diagram in the `stateDiagram-v2` format, which can be used with a variety of tools to visualize your workflow.

## Bank ETL Example

A more complex example of a bank's end-of-day ETL process can be found in `examples/bank_end_of_the_day_etl_workflow.py`.

A mermaid diagram of this workflow can be found [here](docs/bank_etl.mermaid).

## Advanced Usage

### Conditional Logic

```python
from highway_dsl import WorkflowBuilder, RetryPolicy
from datetime import timedelta

builder = WorkflowBuilder("data_processing_pipeline")

builder.task("start", "workflows.tasks.initialize", result_key="init_data")
builder.task(
    "validate",
    "workflows.tasks.validate_data",
    args=["{{init_data}}"],
    result_key="validated_data",
)

builder.condition(
    "check_quality",
    condition="{{validated_data.quality_score}} > 0.8",
    if_true=lambda b: b.task(
        "high_quality_processing",
        "workflows.tasks.advanced_processing",
        args=["{{validated_data}}"],
        retry_policy=RetryPolicy(max_retries=5, delay=timedelta(seconds=10), backoff_factor=2.0),
    ),
    if_false=lambda b: b.task(
        "standard_processing",
        "workflows.tasks.basic_processing",
        args=["{{validated_data}}"],
    ),
)

workflow = builder.build()
```

### While Loops

```python
from highway_dsl import WorkflowBuilder

builder = WorkflowBuilder("qa_rework_workflow")

builder.task("start_qa", "workflows.tasks.start_qa", result_key="qa_results")

builder.while_loop(
    "qa_rework_loop",
    condition="{{qa_results.status}} == 'failed'",
    loop_body=lambda b: b.task("perform_rework", "workflows.tasks.perform_rework").task(
        "re_run_qa", "workflows.tasks.run_qa", result_key="qa_results"
    ),
)

builder.task("finalize_product", "workflows.tasks.finalize_product", dependencies=["qa_rework_loop"])

workflow = builder.build()
```

### For-Each Loops with Proper Dependency Management

Fixed a bug where foreach loops incorrectly inherited dependencies from containing parallel operators:

```python
# This loop now properly encapsulates its internal tasks
builder.foreach(
    "process_items",
    items="{{data.items}}",
    loop_body=lambda fb: fb.task("process_item", "processor.handle_item", args=["{{item.id}}"])
    # Loop body tasks only have proper dependencies, not unwanted "grandparent" dependencies
)
```

### Retry Policies

```python
from highway_dsl import RetryPolicy
from datetime import timedelta

builder.task(
    "reliable_task",
    "service.operation",
    retry_policy=RetryPolicy(
        max_retries=5,
        delay=timedelta(seconds=10),
        backoff_factor=2.0
    )
)
```
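
The arithmetic behind `backoff_factor` is worth making explicit: each retry's delay is the base delay multiplied by the factor raised to the attempt number. The `retry_delays` helper below is an illustrative sketch of that schedule, not the engine's actual scheduler:

```python
from datetime import timedelta

def retry_delays(max_retries: int, delay: timedelta, backoff_factor: float) -> list[timedelta]:
    """Delays an exponential-backoff policy would be expected to produce."""
    return [delay * (backoff_factor ** attempt) for attempt in range(max_retries)]

print([str(d) for d in retry_delays(5, timedelta(seconds=10), 2.0)])
# ['0:00:10', '0:00:20', '0:00:40', '0:01:20', '0:02:40']
```

So the policy above would wait 10 s before the first retry and roughly two and a half minutes before the fifth.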

### Timeout Policies

```python
from highway_dsl import TimeoutPolicy
from datetime import timedelta

builder.task(
    "timed_task",
    "service.operation",
    timeout_policy=TimeoutPolicy(
        timeout=timedelta(hours=1),
        kill_on_timeout=True
    )
)
```
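
A hedged sketch of what a timeout policy amounts to, using only the standard library: wait for a task's result and give up once the deadline passes. `run_with_timeout` is an assumption for illustration; how the engine actually enforces `kill_on_timeout` is up to the runtime.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def run_with_timeout(fn, timeout_seconds: float) -> str:
    """Wait for a task's result, giving up once the timeout elapses."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn)
        try:
            return future.result(timeout=timeout_seconds)
        except FutureTimeout:
            future.cancel()  # best effort; a thread already running cannot be killed
            return "timed_out"

print(run_with_timeout(lambda: "done", timeout_seconds=1.0))            # done
print(run_with_timeout(lambda: time.sleep(0.3) or "done", 0.05))        # timed_out
```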

## Version History

### Version 1.1.0 - Feature Release (Current)

**Airflow-Parity Features:**
- Scheduling metadata (cron, start_date, catchup, tags, max_active_runs)
- Workflow-level default retry policy

**Event-Driven Features:**
- EmitEventOperator for cross-workflow coordination
- WaitForEventOperator with timeout support

**Production Features:**
- Durable callback hooks (on_success, on_failure)
- SwitchOperator for multi-branch routing
- Task descriptions for documentation
- RFC-style specification document (3,215 lines)

### Version 1.0.3 - Stable Release

This is a stable release with important bug fixes and enhancements, including a critical fix for the ForEach operator dependency management issue.

## Development

To set up the development environment:

```bash
git clone https://github.com/your-username/highway.git
cd highway/dsl
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
```

### Running Tests

```bash
pytest
```

### Type Checking

```bash
mypy .
```

## Documentation

- **README.md** (this file) - Getting started and examples
- **spec.txt** - RFC-style formal specification (3,215 lines)
- **examples/** - Comprehensive workflow examples

## License

MIT License
@@ -0,0 +1,7 @@
highway_dsl/__init__.py,sha256=D3gFgy3Wq5-NqfmnGcN1RDw8JxGsJBGVKwVsUFDKn4A,651
highway_dsl/workflow_dsl.py,sha256=NGXTddjeVz530wwKwLvvx0bjMDDmk9WNOxcPqXxNZzs,26742
highway_dsl-1.2.0.dist-info/licenses/LICENSE,sha256=qdFq1H66BvKg67mf4-WGpFwtG2u_dNknxuJDQ1_ubaY,1072
highway_dsl-1.2.0.dist-info/METADATA,sha256=4fokdO8BWKOF3v8lnM1cl77BJ4YK_1XhSwSxXks3hbE,14242
highway_dsl-1.2.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
highway_dsl-1.2.0.dist-info/top_level.txt,sha256=_5uX-bbBsQ2rsi1XMr7WRyKbr6ack5GqVBcy-QjF1C8,12
highway_dsl-1.2.0.dist-info/RECORD,,
@@ -1,227 +0,0 @@
Metadata-Version: 2.4
Name: highway_dsl
Version: 0.0.2
Summary: A domain specific language (DSL) for defining and managing data processing pipelines.
Author-email: Farseed Ashouri <farseed.ashouri@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/rodmena-limited/highway_dsl
Project-URL: Issues, https://github.com/rodmena-limited/highway_dsl/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.12.3
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: types-PyYAML>=6.0.0; extra == "dev"
Requires-Dist: pytest-cov>=2.12.1; extra == "dev"
Dynamic: license-file

# Highway DSL

Highway DSL is a Python-based Domain Specific Language (DSL) for defining and managing complex workflows. It allows users to declaratively specify tasks, dependencies, and execution parameters, supporting various control flow mechanisms like conditions, parallel execution, and retries.

## Features

* **Declarative Workflow Definition:** Define workflows using a clear and concise Python API or through YAML/JSON configurations.
* **Pydantic Models:** Leverages Pydantic for robust data validation and serialization/deserialization of workflow definitions.
* **Rich Task Types:** Supports various operators including:
    * `TaskOperator`: Executes a Python function.
    * `ConditionOperator`: Enables conditional branching based on expressions.
    * `WaitOperator`: Pauses workflow execution for a specified duration or until a specific datetime.
    * `ParallelOperator`: Executes multiple branches of tasks concurrently.
    * `ForEachOperator`: Iterates over a collection, executing a chain of tasks for each item.
* **Retry and Timeout Policies:** Define retry strategies and timeout limits for individual tasks.
* **Serialization/Deserialization:** Seamless conversion of workflow definitions between Python objects, YAML, and JSON formats.
* **Workflow Builder:** A fluent API for constructing workflows programmatically.

### Feature Overview

```mermaid
graph TD
    A[Workflow] --> B{TaskOperator};
    A --> C{ConditionOperator};
    A --> D{WaitOperator};
    A --> E{ParallelOperator};
    A --> F{ForEachOperator};

    B --> G[Executes Python Function];
    C --> H{If/Else Branching};
    D --> I[Pauses Execution];
    E --> J[Concurrent Branches];
    F --> K[Iterates Over Items];

    subgraph Policies
        B --> L[RetryPolicy];
        B --> M[TimeoutPolicy];
    end
```

## Installation

To install Highway DSL, you can use pip:

```bash
pip install highway-dsl
```

If you want to install it for development, including testing dependencies:

```bash
pip install "highway-dsl[dev]"
```

## Usage

### Defining a Simple Workflow

```python
from datetime import timedelta
from highway_dsl import WorkflowBuilder

def demonstrate_basic_workflow():
    """Show a simple complete workflow using just the builder"""

    workflow = (
        WorkflowBuilder("simple_etl")
        .task("extract", "etl.extract_data", result_key="raw_data")
        .task(
            "transform",
            "etl.transform_data",
            args=["{{raw_data}}"],
            result_key="transformed_data",
        )
        .retry(max_retries=3, delay=timedelta(seconds=10))
        .task("load", "etl.load_data", args=["{{transformed_data}}"])
        .timeout(timeout=timedelta(minutes=30))
        .wait("wait_next", timedelta(hours=24))
        .task("cleanup", "etl.cleanup")
        .build()
    )

    workflow.set_variables(
        {"database_url": "postgresql://localhost/mydb", "chunk_size": 1000}
    )

    return workflow

if __name__ == "__main__":
    basic_workflow = demonstrate_basic_workflow()
    print(basic_workflow.to_yaml())
```

### Defining a Complex Workflow

Refer to `example_usage.py` for a more complex example demonstrating conditional logic, parallel execution, and iteration.

### YAML Configuration

You can also define workflows directly in YAML:

```yaml
name: simple_etl
version: 1.0.0
description: Simple ETL workflow with retry and timeout
variables:
  database_url: postgresql://localhost/mydb
  chunk_size: 1000
start_task: extract
tasks:
  extract:
    task_id: extract
    operator_type: task
    function: etl.extract_data
    result_key: raw_data
    dependencies: []
    metadata: {}

  transform:
    task_id: transform
    operator_type: task
    function: etl.transform_data
    args: ["{{raw_data}}"]
    result_key: transformed_data
    dependencies: ["extract"]
    retry_policy:
      max_retries: 3
      delay: PT10S
      backoff_factor: 2.0
    metadata: {}

  load:
    task_id: load
    operator_type: task
    function: etl.load_data
    args: ["{{transformed_data}}"]
    dependencies: ["transform"]
    timeout_policy:
      timeout: PT30M
      kill_on_timeout: true
    metadata: {}

  wait_next:
    task_id: wait_next
    operator_type: wait
    wait_for: "P1D"
    dependencies: ["load"]
    metadata: {}

  cleanup:
    task_id: cleanup
    operator_type: task
    function: etl.cleanup
    dependencies: ["wait_next"]
    metadata: {}
```

To load this YAML:

```python
from highway_dsl import Workflow

yaml_content = """
# ... (yaml content from above)
"""

workflow = Workflow.from_yaml(yaml_content)
print(workflow.name)
```

## Development

### Running Tests

To run the unit tests, navigate to the project root and execute:

```bash
pytest
```

### Type Checking

To perform static type checking with MyPy:

```bash
mypy .
```

## Project Structure

```
.highway/
├── highway_dsl/
│   ├── __init__.py          # Exposes the public API
│   └── workflow_dsl.py      # Core DSL definitions (Pydantic models)
├── example_usage.py         # Examples of how to use the DSL
├── tests/
│   ├── __init__.py
│   ├── conftest.py          # Pytest configuration
│   └── test_workflow_dsl.py # Unit and integration tests
├── pyproject.toml           # Project metadata and dependencies
├── README.md                # This file
└── SUMMARY.md               # Summary of changes and future instructions
```