FlowerPower 0.11.6.19__py3-none-any.whl → 0.20.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (80)
  1. flowerpower/cfg/__init__.py +3 -3
  2. flowerpower/cfg/pipeline/__init__.py +5 -3
  3. flowerpower/cfg/project/__init__.py +3 -3
  4. flowerpower/cfg/project/job_queue.py +1 -128
  5. flowerpower/cli/__init__.py +5 -5
  6. flowerpower/cli/cfg.py +0 -3
  7. flowerpower/cli/job_queue.py +401 -133
  8. flowerpower/cli/pipeline.py +14 -413
  9. flowerpower/cli/utils.py +0 -1
  10. flowerpower/flowerpower.py +537 -28
  11. flowerpower/job_queue/__init__.py +5 -94
  12. flowerpower/job_queue/base.py +201 -3
  13. flowerpower/job_queue/rq/concurrent_workers/thread_worker.py +0 -3
  14. flowerpower/job_queue/rq/manager.py +388 -77
  15. flowerpower/pipeline/__init__.py +2 -0
  16. flowerpower/pipeline/base.py +2 -2
  17. flowerpower/pipeline/io.py +14 -16
  18. flowerpower/pipeline/manager.py +21 -642
  19. flowerpower/pipeline/pipeline.py +571 -0
  20. flowerpower/pipeline/registry.py +242 -10
  21. flowerpower/pipeline/visualizer.py +1 -2
  22. flowerpower/plugins/_io/__init__.py +8 -0
  23. flowerpower/plugins/mqtt/manager.py +6 -6
  24. flowerpower/settings/backend.py +0 -2
  25. flowerpower/settings/job_queue.py +1 -57
  26. flowerpower/utils/misc.py +0 -256
  27. flowerpower/utils/monkey.py +1 -83
  28. {flowerpower-0.11.6.19.dist-info → flowerpower-0.20.0.dist-info}/METADATA +308 -152
  29. flowerpower-0.20.0.dist-info/RECORD +58 -0
  30. flowerpower/fs/__init__.py +0 -29
  31. flowerpower/fs/base.py +0 -662
  32. flowerpower/fs/ext.py +0 -2143
  33. flowerpower/fs/storage_options.py +0 -1420
  34. flowerpower/job_queue/apscheduler/__init__.py +0 -11
  35. flowerpower/job_queue/apscheduler/_setup/datastore.py +0 -110
  36. flowerpower/job_queue/apscheduler/_setup/eventbroker.py +0 -93
  37. flowerpower/job_queue/apscheduler/manager.py +0 -1051
  38. flowerpower/job_queue/apscheduler/setup.py +0 -554
  39. flowerpower/job_queue/apscheduler/trigger.py +0 -169
  40. flowerpower/job_queue/apscheduler/utils.py +0 -311
  41. flowerpower/pipeline/job_queue.py +0 -583
  42. flowerpower/pipeline/runner.py +0 -603
  43. flowerpower/plugins/io/base.py +0 -2520
  44. flowerpower/plugins/io/helpers/datetime.py +0 -298
  45. flowerpower/plugins/io/helpers/polars.py +0 -875
  46. flowerpower/plugins/io/helpers/pyarrow.py +0 -570
  47. flowerpower/plugins/io/helpers/sql.py +0 -202
  48. flowerpower/plugins/io/loader/__init__.py +0 -28
  49. flowerpower/plugins/io/loader/csv.py +0 -37
  50. flowerpower/plugins/io/loader/deltatable.py +0 -190
  51. flowerpower/plugins/io/loader/duckdb.py +0 -19
  52. flowerpower/plugins/io/loader/json.py +0 -37
  53. flowerpower/plugins/io/loader/mqtt.py +0 -159
  54. flowerpower/plugins/io/loader/mssql.py +0 -26
  55. flowerpower/plugins/io/loader/mysql.py +0 -26
  56. flowerpower/plugins/io/loader/oracle.py +0 -26
  57. flowerpower/plugins/io/loader/parquet.py +0 -35
  58. flowerpower/plugins/io/loader/postgres.py +0 -26
  59. flowerpower/plugins/io/loader/pydala.py +0 -19
  60. flowerpower/plugins/io/loader/sqlite.py +0 -23
  61. flowerpower/plugins/io/metadata.py +0 -244
  62. flowerpower/plugins/io/saver/__init__.py +0 -28
  63. flowerpower/plugins/io/saver/csv.py +0 -36
  64. flowerpower/plugins/io/saver/deltatable.py +0 -186
  65. flowerpower/plugins/io/saver/duckdb.py +0 -19
  66. flowerpower/plugins/io/saver/json.py +0 -36
  67. flowerpower/plugins/io/saver/mqtt.py +0 -28
  68. flowerpower/plugins/io/saver/mssql.py +0 -26
  69. flowerpower/plugins/io/saver/mysql.py +0 -26
  70. flowerpower/plugins/io/saver/oracle.py +0 -26
  71. flowerpower/plugins/io/saver/parquet.py +0 -36
  72. flowerpower/plugins/io/saver/postgres.py +0 -26
  73. flowerpower/plugins/io/saver/pydala.py +0 -20
  74. flowerpower/plugins/io/saver/sqlite.py +0 -24
  75. flowerpower/utils/scheduler.py +0 -311
  76. flowerpower-0.11.6.19.dist-info/RECORD +0 -102
  77. {flowerpower-0.11.6.19.dist-info → flowerpower-0.20.0.dist-info}/WHEEL +0 -0
  78. {flowerpower-0.11.6.19.dist-info → flowerpower-0.20.0.dist-info}/entry_points.txt +0 -0
  79. {flowerpower-0.11.6.19.dist-info → flowerpower-0.20.0.dist-info}/licenses/LICENSE +0 -0
  80. {flowerpower-0.11.6.19.dist-info → flowerpower-0.20.0.dist-info}/top_level.txt +0 -0
@@ -1,63 +1,30 @@
  Metadata-Version: 2.4
  Name: FlowerPower
- Version: 0.11.6.19
- Summary: A simple workflow framework. Hamilton + APScheduler = FlowerPower
+ Version: 0.20.0
+ Summary: A simple workflow framework. Hamilton + RQ = FlowerPower
  Author-email: "Volker L." <ligno.blades@gmail.com>
  Project-URL: Homepage, https://github.com/legout/flowerpower
  Project-URL: Bug Tracker, https://github.com/legout/flowerpower/issues
- Keywords: hamilton,workflow,pipeline,scheduler,apscheduler,dask,ray
+ Keywords: hamilton,workflow,pipeline,scheduler,rq,dask,ray
  Requires-Python: >=3.11
  Description-Content-Type: text/markdown
  License-File: LICENSE
- Requires-Dist: dill>=0.3.8
  Requires-Dist: duration-parser>=1.0.1
  Requires-Dist: fsspec>=2024.10.0
+ Requires-Dist: fsspec-utils>=0.1.0
  Requires-Dist: humanize>=4.12.2
  Requires-Dist: msgspec>=0.19.0
  Requires-Dist: munch>=4.0.0
- Requires-Dist: orjson>=3.10.15
- Requires-Dist: python-dotenv>=1.0.1
  Requires-Dist: pyyaml>=6.0.1
  Requires-Dist: rich>=13.9.3
  Requires-Dist: s3fs>=2024.10.0
  Requires-Dist: sf-hamilton-sdk>=0.5.2
  Requires-Dist: sf-hamilton[rich,tqdm,visualization]>=1.69.0
  Requires-Dist: typer>=0.12.3
- Provides-Extra: apscheduler
- Requires-Dist: aiosqlite>=0.21.0; extra == "apscheduler"
- Requires-Dist: apscheduler==4.0.0a5; extra == "apscheduler"
- Requires-Dist: asyncpg>=0.29.0; extra == "apscheduler"
- Requires-Dist: greenlet>=3.0.3; extra == "apscheduler"
- Requires-Dist: sqlalchemy>=2.0.30; extra == "apscheduler"
- Requires-Dist: cron-descriptor>=1.4.5; extra == "apscheduler"
  Provides-Extra: io
- Requires-Dist: adbc-driver-manager>=1.4.0; extra == "io"
- Requires-Dist: aiosqlite>=0.21.0; extra == "io"
- Requires-Dist: datafusion>=43.1.0; extra == "io"
- Requires-Dist: deltalake>=0.24.0; extra == "io"
- Requires-Dist: duckdb>=1.1.3; extra == "io"
- Requires-Dist: orjson>=3.10.12; extra == "io"
- Requires-Dist: pandas>=2.2.3; extra == "io"
- Requires-Dist: polars>=1.15.0; extra == "io"
- Requires-Dist: pyarrow>=18.1.0; extra == "io"
- Requires-Dist: pydala2>=0.9.4.5; extra == "io"
- Requires-Dist: redis>=5.2.1; extra == "io"
- Requires-Dist: sherlock>=0.4.1; extra == "io"
- Requires-Dist: sqlalchemy>=2.0.30; extra == "io"
+ Requires-Dist: flowerpower-io>=0.1.1; extra == "io"
  Provides-Extra: io-legacy
- Requires-Dist: adbc-driver-manager>=1.4.0; extra == "io-legacy"
- Requires-Dist: aiosqlite>=0.21.0; extra == "io-legacy"
- Requires-Dist: datafusion>=43.1.0; extra == "io-legacy"
- Requires-Dist: deltalake>=0.24.0; extra == "io-legacy"
- Requires-Dist: duckdb>=1.1.3; extra == "io-legacy"
- Requires-Dist: orjson>=3.10.12; extra == "io-legacy"
- Requires-Dist: pandas>=2.2.3; extra == "io-legacy"
- Requires-Dist: polars-lts-cpu>=1.15.0; extra == "io-legacy"
- Requires-Dist: pyarrow>=18.1.0; extra == "io-legacy"
- Requires-Dist: pydala2>=0.9.4.5; extra == "io-legacy"
- Requires-Dist: redis>=5.2.1; extra == "io-legacy"
- Requires-Dist: sherlock>=0.4.1; extra == "io-legacy"
- Requires-Dist: sqlalchemy>=2.0.30; extra == "io-legacy"
+ Requires-Dist: flowerpower-io[legacy]>=0.1.1; extra == "io-legacy"
  Provides-Extra: mongodb
  Requires-Dist: pymongo>=4.7.2; extra == "mongodb"
  Provides-Extra: mqtt
@@ -90,7 +57,7 @@ Dynamic: license-file

  <div align="center">
  <h1>FlowerPower 🌸 - Build & Orchestrate Data Pipelines</h1>
- <h3>Simple Workflow Framework - Hamilton + APScheduler or RQ = FlowerPower</h3>
+ <h3>Simple Workflow Framework - Hamilton + RQ = FlowerPower</h3>
  <img src="./image.png" alt="FlowerPower Logo" width="400" height="300">
  </div>

@@ -103,20 +70,19 @@ Dynamic: license-file

  **FlowerPower** is a Python framework designed for building, configuring, scheduling, and executing data processing pipelines with ease and flexibility. It promotes a modular, configuration-driven approach, allowing you to focus on your pipeline logic while FlowerPower handles the orchestration.

- It is leveraging the [Hamilton](https://github.com/DAGWorks-Inc/hamilton) library for defining dataflows in a clean, functional way within your Python pipeline scripts. Pipelines are defined in Python modules and configured using YAML files, making it easy to manage and understand your data workflows.
- FlowerPower integrates with job queue systems like [APScheduler](https://github.com/scheduler/apscheduler) and [RQ](https://github.com/rq/rq), enabling you to schedule and manage your pipeline runs efficiently. It also provides a web UI (Hamilton UI) for monitoring and managing your pipelines.
- FlowerPower is designed to be extensible, allowing you to easily swap components like job queue backends or add custom I/O plugins. This flexibility makes it suitable for a wide range of data processing tasks, from simple ETL jobs to complex data workflows.
+ It leverages the [Hamilton](https://github.com/apache/hamilton) library for defining dataflows in a clean, functional way within your Python pipeline scripts. Pipelines are defined in Python modules and configured using YAML files, making it easy to manage and understand your data workflows.
+ FlowerPower integrates with [RQ (Redis Queue)](https://github.com/rq/rq) for job queue management, enabling you to schedule and manage your pipeline runs efficiently. The framework features a clean separation between pipeline execution and job queue management, with a unified project interface that makes it easy to work with both synchronous and asynchronous execution modes. It also provides a web UI (Hamilton UI) for monitoring and managing your pipelines.
+ FlowerPower is designed to be extensible, allowing you to easily add custom I/O plugins and adapt to different deployment scenarios. This flexibility makes it suitable for a wide range of data processing tasks, from simple ETL jobs to complex data workflows.


  ## ✨ Key Features

- * **Modular Pipeline Design:** Thanks to [Hamilton](https://github.com/DAGWorks-Inc/hamilton), you can define your data processing logic in Python modules, using functions as nodes in a directed acyclic graph (DAG).
+ * **Modular Pipeline Design:** Thanks to [Hamilton](https://github.com/apache/hamilton), you can define your data processing logic in Python modules, using functions as nodes in a directed acyclic graph (DAG).
  * **Configuration-Driven:** Define pipeline parameters, execution logic, and scheduling declaratively using simple YAML files.
- * **Job Queue Integration:** Built-in support for different asynchronous execution models:
- * **APScheduler:** For time-based scheduling (cron, interval, date).
- * **RQ (Redis Queue):** For distributed task queues.
+ * **Job Queue Integration:** Built-in support for asynchronous execution with **RQ (Redis Queue)** for distributed task queues, background processing, and time-based scheduling.
  * **Extensible I/O Plugins:** Connect to various data sources and destinations (CSV, JSON, Parquet, DeltaTable, DuckDB, PostgreSQL, MySQL, MSSQL, Oracle, MQTT, SQLite, and more).
- * **Multiple Interfaces:** Interact with your pipelines via:
+ * **Unified Project Interface:** Interact with your pipelines via:
+ * **FlowerPowerProject API:** A unified interface for both synchronous and asynchronous pipeline execution, job queue management, and worker control.
  * **Command Line Interface (CLI):** For running, managing, and inspecting pipelines.
  * **Web UI:** A graphical interface for monitoring and managing pipelines and schedules. ([Hamilton UI](https://hamilton.dagworks.io/en/latest/hamilton-ui/ui/))
  * **Filesystem Abstraction:** Simplified file handling with support for local and remote filesystems (e.g., S3, GCS).
@@ -134,9 +100,9 @@ source .venv/bin/activate # Or .\.venv\Scripts\activate on Windows
  uv pip install flowerpower

  # Optional: Install additional dependencies for specific features
- uv pip install flowerpower[apscheduler,rq] # Example for APScheduler and RQ
- uv pip install flowerpower[io] # Example for I/O plugins (CSV, JSON, Parquet, DeltaTable, DuckDB, PostgreSQL, MySQL, MSSQL, Oracle, SQLite)
- uv pip install flowerpower[ui] # Example for Hamilton UI
+ uv pip install flowerpower[rq] # For RQ job queue support
+ uv pip install flowerpower[io] # For I/O plugins (CSV, JSON, Parquet, DeltaTable, DuckDB, PostgreSQL, MySQL, MSSQL, Oracle, SQLite)
+ uv pip install flowerpower[ui] # For Hamilton UI
  uv pip install flowerpower[all] # Install all optional dependencies
  ```

@@ -162,10 +128,13 @@ flowerpower init --name hello-flowerpower-project

  Alternatively, you can initialize programmatically:
  ```python
- from flowerpower import init_project
+ from flowerpower import FlowerPowerProject

- # Creates the structure in the current directory
- init_project(name='hello-flowerpower-project', job_queue_type='rq') # Or 'apscheduler'
+ # Initialize a new project
+ project = FlowerPowerProject.init(
+ name='hello-flowerpower-project',
+ job_queue_type='rq'
+ )
  ```

  This will create a `hello-flowerpower-project` directory with the necessary `conf/` and `pipelines/` subdirectories and default configuration files.
@@ -186,7 +155,7 @@ cd hello-flowerpower-project

  **Configure Project (`conf/project.yml`):**

- Open `conf/project.yml` and define your project name and choose your job queue backend. Here's an example using RQ:
+ Open `conf/project.yml` and define your project name and job queue backend. FlowerPower now uses RQ (Redis Queue) as its job queue system:

  ```yaml
  name: hello-flowerpower
@@ -216,12 +185,16 @@ flowerpower pipeline new hello_world

  **Using Python:**

- There is a `PipelineManager` class to manage pipelines programmatically:
+ You can create pipelines programmatically using the FlowerPowerProject interface:

  ```python
- from flowerpower.pipeline import PipelineManager
- pm = PipelineManager(base_dir='.')
- pm.new(name='hello_world') # Creates a new pipeline
+ from flowerpower import FlowerPowerProject
+
+ # Load the project
+ project = FlowerPowerProject.load('.')
+
+ # Create a new pipeline
+ project.pipeline_manager.new(name='hello_world')
  ```

  This will create a new file `hello_world.py` in the `pipelines/` directory and a corresponding configuration file `hello_world.yml` in `conf/pipelines/`.
@@ -288,7 +261,7 @@ Open `conf/pipelines/hello_world.yml` and specify parameters, run configurations
  params: # Parameters accessible in your Python code
  greeting_message:
  message: "Hello"
- target:
+ target_name:
  name: "World"

  run: # How to execute the pipeline
@@ -321,29 +294,34 @@ For quick testing or local runs, you can execute your pipeline synchronously. Th
  ```
  * **Via Python:**
  ```python
- from flowerpower.pipeline import PipelineManager
- pm = PipelineManager(base_dir='.')
- pm.run('hello_world') # Execute the pipeline named 'hello_world'
+ from flowerpower import FlowerPowerProject
+
+ # Load the project
+ project = FlowerPowerProject.load('.')
+
+ # Execute the pipeline synchronously
+ result = project.run('hello_world')
+ ```

  #### 2. Asynchronous Execution (Job Queues):

- For scheduling, background execution, or distributed processing, leverage FlowerPower's job queue integration. Ideal for distributed task queues where workers can pick up jobs.
+ For scheduling, background execution, or distributed processing, leverage FlowerPower's job queue integration with RQ (Redis Queue). This is ideal for distributed task queues where workers can pick up jobs.

- You have to install the job queue backend you want to use. FlowerPower supports two job queue backends: RQ (Redis Queue) and APScheduler.
+ First, install the RQ dependencies:
  ```bash
- # Install RQ (Redis Queue) or APScheduler
- uv pip install flowerpower[rq] # For RQ (Redis Queue)
- uv pip install flowerpower[apscheduler] # For APScheduler
+ # Install RQ (Redis Queue) support
+ uv pip install flowerpower[rq]
  ```
- * **Note:** Ensure you have the required dependencies installed for your chosen job queue backend. For RQ, you need Redis running. For APScheduler, you need a data store (PostgreSQL, MySQL, SQLite, MongoDB) and an event broker (Redis, MQTT, PostgreSQL).

+ * **Note:** Ensure you have Redis running for RQ job queue functionality.
+
- **a) Configuring Job Queue Backends:**
+ **a) Configuring the RQ Job Queue Backend:**

- Configuration of the job queue backend is done in your `conf/project.yml`. Currently, FlowerPower supports two job queue backends:
+ Configuration of the job queue backend is done in your `conf/project.yml`. FlowerPower uses RQ (Redis Queue) as its job queue backend:

- * **RQ (Redis Queue):**
- * **Requires:** Access to a running Redis server.
- * Configure in `conf/project.yml`:
+ * **RQ (Redis Queue) Requirements:**
+ * A **Redis server** running for job queuing and task coordination.
+ * Configure in `conf/project.yml`:
  ```yaml
  job_queue:
  type: rq
@@ -351,77 +329,49 @@ Configuration of the job queue backend is done in your `conf/project.yml`. Curre
  type: redis
  host: localhost
  port: 6379
- ... # other redis options
-
- * **APScheduler:**
- * **Requires:**
- * A **Data Store:** To persist job information (Options: PostgreSQL, MySQL, SQLite, MongoDB).
- * An **Event Broker:** To notify workers of scheduled jobs (Options: Redis, MQTT, PostgreSQL).
- * Configure in `cong/project.yml`:
- ```yaml
- job_queue:
- type: apscheduler
- backend:
- type: postgresql # or mysql, sqlite, mongodb
- host: localhost
- port: 5432
- user: your_user
- password: your_password
- database: your_database
- ... # other database options
- event_broker:
- type: redis # or mqtt, postgresql
- host: localhost
- port: 6379
- ... # other redis options
+ database: 0
+ # Optional: username, password for Redis auth
+ username: your_username # if needed
+ password: your_password # if needed
+ queues:
+ - default
+ - high
+ - low
  ```

- It is possible to override the job queue backend configuration using environment variables, the `settings` module or by monkey patching the backend configuration of the `PipelineManager` or `JobQueueManager` classes. This might be useful for testing or when you want to avoid hardcoding values in your configuration files.
+ You can override the job queue backend configuration using environment variables, the `settings` module, or by modifying the configuration programmatically. This is useful for testing or when you want to avoid hardcoding values in your configuration files.
+
  * **Using the `settings` module:**
- e.g to override the RQ backend username and password:
+ Override RQ backend configuration:
  ```python
  from flowerpower import settings

- # Override some configuration values. e.g. when using rq
+ # Override RQ backend configuration
  settings.RQ_BACKEND_USERNAME = 'your_username'
  settings.RQ_BACKEND_PASSWORD = 'your_password'
  ```
  See the `flowerpower/settings/job_queue.py` file for all available settings.

- * **Monkey Patching:**
- e.g to override the APScheduler data store username and password:
+ * **Programmatic Configuration:**
+ Modify configuration via the FlowerPowerProject:
  ```python
- from flowerpower.pipeline import PipelineManager
+ from flowerpower import FlowerPowerProject

- pm = PipelineManager(base_dir='.')
- pm.project_cfg.job_queue.backend.username = 'your_username'
- pm.project_cfg.job_queue.backend.password = 'your_password'
+ project = FlowerPowerProject.load('.')
+ project.job_queue_manager.cfg.backend.username = 'your_username'
+ project.job_queue_manager.cfg.backend.password = 'your_password'
  ```
+
  * **Using Environment Variables:**
- e.g. use a `.env` file or set them in your environment. Here is a list of the available environment variables for the job queue backend configuration:
+ Use a `.env` file or set them in your environment:
  ```
- FP_JOB_QUEUE_TYPE
+ FP_JOB_QUEUE_TYPE=rq

  # RQ (Redis Queue) backend
- FP_RQ_BACKEND
- FP_RQ_BACKEND_USERNAME
- FP_RQ_BACKEND_PASSWORD
- FP_RQ_BACKEND_HOST
- FP_RQ_BACKEND_PORT
-
- # APScheduler data store
- FP_APS_BACKEND_DS
- FP_APS_BACKEND_DS_USERNAME
- FP_APS_BACKEND_DS_PASSWORD
- FP_APS_BACKEND_DS_HOST
- FP_APS_BACKEND_DS_PORT
-
- # APScheduler event broker
- FP_APS_BACKEND_EB
- FP_APS_BACKEND_EB_USERNAME
- FP_APS_BACKEND_EB_PASSWORD
- FP_APS_BACKEND_EB_HOST
- FP_APS_BACKEND_EB_PORT
+ FP_RQ_BACKEND_USERNAME=your_username
+ FP_RQ_BACKEND_PASSWORD=your_password
+ FP_RQ_BACKEND_HOST=localhost
+ FP_RQ_BACKEND_PORT=6379
  ```


@@ -430,23 +380,25 @@ Run your pipeline using the job queue system. This allows you to schedule jobs,

  * **Via CLI:**
  ```bash
- # This will run the pipeline immediately and return the job result (blocking, until the job is done)
- flowerpower pipeline run-job hello_world --base_dir .
-
  # Submit the pipeline to the job queue and return the job ID (non-blocking)
  flowerpower pipeline add-job hello_world --base_dir .
+
+ # Run the pipeline via job queue and wait for result (blocking)
+ flowerpower pipeline run-job hello_world --base_dir .
  ```
  * **Via Python:**

  ```python
- from flowerpower.pipeline import PipelineManager
- pm = PipelineManager(base_dir='.')
-
- # submit the pipeline to the job queue and return the job ID (non-blocking)
- job_id = pm.add_job('hello_world')
+ from flowerpower import FlowerPowerProject
+
+ # Load the project
+ project = FlowerPowerProject.load('.')

- # submit the pipeline to the job queue, runs it immediately and returns the job ID (non-blocking)
- result = pm.run_job('hello_world')
+ # Enqueue the pipeline for execution (non-blocking)
+ job_id = project.enqueue('hello_world')
+
+ # Schedule the pipeline for future/recurring execution
+ schedule_id = project.schedule('hello_world', cron="0 9 * * *") # Daily at 9 AM
  ```

  These commands will add the pipeline to the job queue, allowing it to be executed in the background or at scheduled intervals. The jobs will be processed by one or more workers, depending on your job queue configuration. You have to start the job queue workers separately.
@@ -462,10 +414,16 @@ To process jobs in the queue, you need to start one or more workers.

  * **Via Python:**
  ```python
- from flowerpower.job_queue import JobQueueManager
- with JobQueueManager(base_dir='.'):
- # Start the job queue worker
- jqm.start_worker()
+ from flowerpower import FlowerPowerProject
+
+ # Load the project
+ project = FlowerPowerProject.load('.')
+
+ # Start a single worker (blocking)
+ project.start_worker()
+
+ # Start a worker pool (multiple workers)
+ project.start_worker_pool(num_workers=4, background=True)
  ```


@@ -486,7 +444,7 @@ docker-compose up -d redis postgres # Example: Start Redis and PostgreSQL

  FlowerPower uses a layered configuration system:

- * **`conf/project.yml`:** Defines global settings for your project, primarily the `job_queue` backend (RQ or APScheduler) and configurations for integrated `adapter`s (like Hamilton Tracker, MLflow, etc.).
+ * **`conf/project.yml`:** Defines global settings for your project, including the RQ job queue backend configuration and integrated `adapter`s (like Hamilton Tracker, MLflow, etc.).
  * **`conf/pipelines/*.yml`:** Each file defines a specific pipeline. It contains:
  * `params`: Input parameters for your Hamilton functions.
  * `run`: Execution details like target outputs (`final_vars`), Hamilton runtime `config`, and `executor` settings.
@@ -495,8 +453,29 @@ FlowerPower uses a layered configuration system:

  ## 🛠️ Basic Usage

- The primary way to interact with pipelines is often through the CLI:
+ You can interact with FlowerPower pipelines through multiple interfaces:
+
+ **Python API (Recommended):**
+ ```python
+ from flowerpower import FlowerPowerProject
+
+ # Load the project
+ project = FlowerPowerProject.load('.')
+
+ # Run a pipeline synchronously
+ result = project.run('hello_world')

+ # Enqueue a pipeline for background execution
+ job_id = project.enqueue('hello_world')
+
+ # Schedule a pipeline
+ schedule_id = project.schedule('hello_world', cron="0 9 * * *")
+
+ # Start workers
+ project.start_worker_pool(num_workers=4, background=True)
+ ```
+
+ **CLI:**
  ```bash
  # Run a pipeline manually
  flowerpower pipeline run hello_world --base_dir .
@@ -505,13 +484,196 @@ flowerpower pipeline run hello_world --base_dir .
  flowerpower pipeline add-job hello_world --base_dir .

  # Schedule a pipeline
- flowerpower pipeline schedule hello_world --base_dir . # Schedules like cron, interval, or date are configured in the pipeline config
+ flowerpower pipeline schedule hello_world --base_dir .
+
+ # Start job queue worker
+ flowerpower job-queue start-worker --base_dir .

- # And many more commands...
- flowerpower --help # List all available commands
+ # List all available commands
+ flowerpower --help
+ ```
+
+ ## 🔧 Direct Module Usage
+
+ While the unified `FlowerPowerProject` interface is recommended for most use cases, you can also use the pipeline and job queue modules directly for more granular control or when you only need specific functionality.
+
+ ### Pipeline-Only Usage
+
+ If you only need pipeline execution without job queue functionality, you can use the `PipelineManager` directly:
+
+ ```python
+ from flowerpower.pipeline import PipelineManager
+
+ # Initialize pipeline manager
+ pm = PipelineManager(base_dir='.')
+
+ # Create a new pipeline
+ pm.new(name='my_pipeline')
+
+ # Run a pipeline synchronously
+ result = pm.run(
+ name='my_pipeline',
+ inputs={'param': 'value'},
+ final_vars=['output_var']
+ )
+
+ # List available pipelines
+ pipelines = pm.list()
+ print(f"Available pipelines: {pipelines}")
+
+ # Get pipeline information
+ info = pm.get('my_pipeline')
+ print(f"Pipeline config: {info}")
+
+ # Delete a pipeline
+ pm.delete('old_pipeline')
+ ```
+
+ **When to use Pipeline-only approach:**
+ - Simple synchronous workflows
+ - Testing and development
+ - When you don't need background processing or scheduling
+ - Lightweight applications with minimal dependencies
+
+ ### Job Queue-Only Usage

+ If you need job queue functionality for general task processing (not necessarily pipelines), you can use the job queue managers directly:
+
+ ```python
+ import datetime as dt
+ from flowerpower.job_queue import JobQueueManager
+
+ # Initialize job queue manager with RQ backend
+ jqm = JobQueueManager(
+ type='rq',
+ name='my_worker',
+ base_dir='.'
+ )
+
+ # Define a simple task function
+ def add_numbers(x: int, y: int) -> int:
+ """Simple task that adds two numbers."""
+ return x + y
+
+ def process_data(data: dict) -> dict:
+ """More complex task that processes data."""
+ result = {
+ 'processed': True,
+ 'count': len(data.get('items', [])),
+ 'timestamp': str(dt.datetime.now())
+ }
+ return result
+
+ # Enqueue jobs for immediate execution
+ job1 = jqm.enqueue(add_numbers, 5, 10)
+ job2 = jqm.enqueue(process_data, {'items': [1, 2, 3, 4, 5]})
+
+ # Enqueue jobs with delays
+ job3 = jqm.enqueue_in(300, add_numbers, 20, 30) # Run in 5 minutes
+ job4 = jqm.enqueue_at(dt.datetime(2025, 1, 1, 9, 0), process_data, {'items': []})
+
+ # Schedule recurring jobs
+ schedule_id = jqm.add_schedule(
+ func=process_data,
+ func_kwargs={'data': {'items': []}},
+ cron="0 */6 * * *", # Every 6 hours
+ schedule_id="data_processing_job"
+ )
+
+ # Start a worker to process jobs (blocking)
+ jqm.start_worker()
+
+ # Or start multiple workers in background
+ jqm.start_worker_pool(num_workers=4, background=True)
+
+ # Get job results
+ result1 = jqm.get_job_result(job1)
+ print(f"Addition result: {result1}")
+
+ # Clean up
+ jqm.stop_worker_pool()
+ ```
+
+ **Alternatively, use RQManager directly for more RQ-specific features:**
+
+ ```python
+ from flowerpower.job_queue.rq import RQManager
+
+ # Initialize RQ manager with custom configuration
+ rq_manager = RQManager(
+ name='specialized_worker',
+ base_dir='.',
+ log_level='DEBUG'
+ )
+
+ # Use RQ-specific features
+ job = rq_manager.add_job(
+ func=add_numbers,
+ func_args=(100, 200),
+ queue_name='high_priority',
+ timeout=300,
+ retry=3,
+ result_ttl=3600
+ )
+
+ # Start worker for specific queues
+ rq_manager.start_worker(
+ queue_names=['high_priority', 'default'],
+ background=True
+ )
+
+ # Monitor jobs and queues
+ jobs = rq_manager.get_jobs()
+ schedules = rq_manager.get_schedules()
+
+ print(f"Active jobs: {len(jobs)}")
+ print(f"Active schedules: {len(schedules)}")
+ ```
+
+ **When to use Job Queue-only approach:**
+ - General task processing and background jobs
+ - When you need fine-grained control over job queue behavior
+ - Microservices that only handle specific job types
+ - Integration with existing RQ-based systems
+ - When you don't need Hamilton-based pipeline functionality
+
+ ### Combining Both Approaches
+
+ You can also combine both managers for custom workflows:
+
+ ```python
+ from flowerpower.pipeline import PipelineManager
+ from flowerpower.job_queue import JobQueueManager
+
+ # Initialize both managers
+ pm = PipelineManager(base_dir='.')
+ jqm = JobQueueManager(type='rq', name='combined_worker', base_dir='.')
+
+ # Create a custom function that runs a pipeline
+ def run_pipeline_task(pipeline_name: str, inputs: dict = None):
+ """Custom task that executes a pipeline."""
+ result = pm.run(pipeline_name, inputs=inputs)
+ return result
+
+ # Enqueue pipeline execution as a job
+ job_id = jqm.enqueue(
+ run_pipeline_task,
+ 'my_pipeline',
+ {'param': 'value'}
+ )
+
+ # Start worker to process the pipeline jobs
+ jqm.start_worker()
  ```

+ **Benefits of FlowerPowerProject vs Direct Usage:**
+
+ | Approach | Benefits | Use Cases |
+ |----------|----------|-----------|
+ | **FlowerPowerProject** | - Unified interface<br>- Automatic dependency injection<br>- Simplified configuration<br>- Best practices built-in | - Most applications<br>- Rapid development<br>- Full feature integration |
+ | **Pipeline-only** | - Lightweight<br>- No Redis dependency<br>- Simple synchronous execution | - Testing<br>- Simple workflows<br>- No background processing needed |
+ | **Job Queue-only** | - Fine-grained control<br>- Custom job types<br>- Existing RQ integration | - Microservices<br>- Custom task processing<br>- Non-pipeline jobs |
+
  ## 🖥️ UI

  The FlowerPower web UI (Hamilton UI) provides a graphical interface for monitoring and managing your pipelines. It allows you to visualize pipeline runs, schedules, and potentially manage configurations.
@@ -523,13 +685,7 @@ flowerpower ui

  ## 📖 Documentation

- There is not much documentation yet, but you can find some examples in the `examples/` directory. The examples cover various use cases, including:
- * Basic pipeline creation and execution.
- * Using different job queue backends (RQ and APScheduler).
- * Configuring and scheduling pipelines.
-

- There is a first version of documentation in `docs/`. This documentation is generated using [Pocket Flow Tutorial Project](https://github.com/The-Pocket/PocketFlow-Tutorial-Codebase-Knowledge). Although it is not complete and might be wrong in some parts, it can be a good starting point for understanding how to use FlowerPower.


  ## 📜 License