good_pipeline 0.1.0

Files changed (117)
  1. checksums.yaml +7 -0
  2. data/CHANGELOG.md +16 -0
  3. data/CODE_OF_CONDUCT.md +132 -0
  4. data/LICENSE.txt +21 -0
  5. data/README.md +217 -0
  6. data/Rakefile +20 -0
  7. data/app/controllers/good_pipeline/application_controller.rb +9 -0
  8. data/app/controllers/good_pipeline/frontends_controller.rb +31 -0
  9. data/app/controllers/good_pipeline/pipelines_controller.rb +57 -0
  10. data/app/frontend/good_pipeline/style.css +518 -0
  11. data/app/helpers/good_pipeline/pipelines_helper.rb +119 -0
  12. data/app/jobs/good_pipeline/pipeline_callback_job.rb +52 -0
  13. data/app/jobs/good_pipeline/pipeline_reconciliation_job.rb +10 -0
  14. data/app/jobs/good_pipeline/step_finished_job.rb +10 -0
  15. data/app/models/good_pipeline/chain_record.rb +18 -0
  16. data/app/models/good_pipeline/dependency_record.rb +23 -0
  17. data/app/models/good_pipeline/pipeline_record.rb +73 -0
  18. data/app/models/good_pipeline/step_record.rb +74 -0
  19. data/app/views/good_pipeline/pipelines/_chain_links.html.erb +30 -0
  20. data/app/views/good_pipeline/pipelines/_pagination.html.erb +24 -0
  21. data/app/views/good_pipeline/pipelines/_pipeline_row.html.erb +7 -0
  22. data/app/views/good_pipeline/pipelines/_steps_table.html.erb +33 -0
  23. data/app/views/good_pipeline/pipelines/definitions.html.erb +49 -0
  24. data/app/views/good_pipeline/pipelines/index.html.erb +43 -0
  25. data/app/views/good_pipeline/pipelines/show.html.erb +40 -0
  26. data/app/views/layouts/good_pipeline/application.html.erb +40 -0
  27. data/config/routes.rb +13 -0
  28. data/demo/Rakefile +5 -0
  29. data/demo/app/jobs/always_failing_job.rb +12 -0
  30. data/demo/app/jobs/application_job.rb +4 -0
  31. data/demo/app/jobs/cleanup_job.rb +5 -0
  32. data/demo/app/jobs/download_job.rb +5 -0
  33. data/demo/app/jobs/failing_job.rb +12 -0
  34. data/demo/app/jobs/publish_job.rb +5 -0
  35. data/demo/app/jobs/retryable_job.rb +19 -0
  36. data/demo/app/jobs/thumbnail_job.rb +5 -0
  37. data/demo/app/jobs/transcode_job.rb +5 -0
  38. data/demo/app/pipelines/analytics_pipeline.rb +7 -0
  39. data/demo/app/pipelines/archive_pipeline.rb +7 -0
  40. data/demo/app/pipelines/continue_test_pipeline.rb +11 -0
  41. data/demo/app/pipelines/halt_test_pipeline.rb +10 -0
  42. data/demo/app/pipelines/notification_pipeline.rb +7 -0
  43. data/demo/app/pipelines/test_pipeline.rb +5 -0
  44. data/demo/app/pipelines/video_processing_pipeline.rb +14 -0
  45. data/demo/bin/rails +6 -0
  46. data/demo/config/application.rb +18 -0
  47. data/demo/config/boot.rb +5 -0
  48. data/demo/config/database.yml +15 -0
  49. data/demo/config/environment.rb +5 -0
  50. data/demo/config/environments/development.rb +9 -0
  51. data/demo/config/environments/test.rb +10 -0
  52. data/demo/config/routes.rb +6 -0
  53. data/demo/config.ru +5 -0
  54. data/demo/db/migrate/20260319205325_create_good_jobs.rb +112 -0
  55. data/demo/db/migrate/20260319205326_create_good_pipeline_tables.rb +53 -0
  56. data/demo/db/seeds.rb +153 -0
  57. data/demo/test/good_pipeline/test_chain_record.rb +29 -0
  58. data/demo/test/good_pipeline/test_cleanup.rb +93 -0
  59. data/demo/test/good_pipeline/test_coordinator.rb +286 -0
  60. data/demo/test/good_pipeline/test_dependency_record.rb +46 -0
  61. data/demo/test/good_pipeline/test_failure_metadata.rb +77 -0
  62. data/demo/test/good_pipeline/test_introspection.rb +46 -0
  63. data/demo/test/good_pipeline/test_pipeline_callback_job.rb +132 -0
  64. data/demo/test/good_pipeline/test_pipeline_reconciliation_job.rb +33 -0
  65. data/demo/test/good_pipeline/test_pipeline_record.rb +183 -0
  66. data/demo/test/good_pipeline/test_runner.rb +86 -0
  67. data/demo/test/good_pipeline/test_step_finished_job.rb +37 -0
  68. data/demo/test/good_pipeline/test_step_record.rb +208 -0
  69. data/demo/test/integration/test_concurrent_fan_in.rb +109 -0
  70. data/demo/test/integration/test_end_to_end.rb +89 -0
  71. data/demo/test/integration/test_enqueue_atomicity.rb +59 -0
  72. data/demo/test/integration/test_pipeline_chaining.rb +183 -0
  73. data/demo/test/integration/test_retry_scenarios.rb +90 -0
  74. data/demo/test/integration/test_step_finished_idempotency.rb +38 -0
  75. data/demo/test/test_helper.rb +71 -0
  76. data/dev-docker-compose.yml +16 -0
  77. data/docs/.vitepress/config.mts +66 -0
  78. data/docs/.vitepress/theme/custom.css +21 -0
  79. data/docs/.vitepress/theme/index.ts +4 -0
  80. data/docs/architecture.md +184 -0
  81. data/docs/callbacks.md +66 -0
  82. data/docs/cleanup.md +45 -0
  83. data/docs/dag-validation.md +88 -0
  84. data/docs/dashboard.md +66 -0
  85. data/docs/defining-pipelines.md +167 -0
  86. data/docs/failure-strategies.md +138 -0
  87. data/docs/getting-started.md +77 -0
  88. data/docs/index.md +23 -0
  89. data/docs/introduction.md +42 -0
  90. data/docs/monitoring.md +103 -0
  91. data/docs/package-lock.json +2510 -0
  92. data/docs/package.json +11 -0
  93. data/docs/pipeline-chaining.md +104 -0
  94. data/docs/public/screenshots/definitions.png +0 -0
  95. data/docs/public/screenshots/index.png +0 -0
  96. data/docs/public/screenshots/show.png +0 -0
  97. data/docs/screenshots/definitions.png +0 -0
  98. data/docs/screenshots/index.png +0 -0
  99. data/docs/screenshots/show.png +0 -0
  100. data/lib/generators/good_pipeline/install/install_generator.rb +20 -0
  101. data/lib/generators/good_pipeline/install/templates/create_good_pipeline_tables.rb.erb +51 -0
  102. data/lib/good_pipeline/chain.rb +54 -0
  103. data/lib/good_pipeline/chain_coordinator.rb +53 -0
  104. data/lib/good_pipeline/coordinator.rb +176 -0
  105. data/lib/good_pipeline/cycle_detector.rb +36 -0
  106. data/lib/good_pipeline/engine.rb +23 -0
  107. data/lib/good_pipeline/errors.rb +11 -0
  108. data/lib/good_pipeline/failure_metadata.rb +29 -0
  109. data/lib/good_pipeline/graph_validator.rb +71 -0
  110. data/lib/good_pipeline/pipeline.rb +122 -0
  111. data/lib/good_pipeline/runner.rb +63 -0
  112. data/lib/good_pipeline/step_definition.rb +18 -0
  113. data/lib/good_pipeline/version.rb +5 -0
  114. data/lib/good_pipeline.rb +45 -0
  115. data/mise.toml +10 -0
  116. data/sig/good_pipeline.rbs +4 -0
  117. metadata +204 -0
data/docs/failure-strategies.md ADDED
@@ -0,0 +1,138 @@
1
+ # Failure Strategies
2
+
3
+ GoodPipeline provides three failure strategies that control what happens when a step fails. Strategies can be set at the pipeline level and overridden per step.
4
+
5
+ ## Pipeline-level strategy
6
+
7
+ Set with `failure_strategy` in the pipeline class body:
8
+
9
+ ```ruby
10
+ class MyPipeline < GoodPipeline::Pipeline
11
+   failure_strategy :continue # :halt (default), :continue, or :ignore
12
+ end
13
+ ```
14
+
15
+ ### `:halt` (default)
16
+
17
+ When any step fails, the coordinator sets `halt_triggered = true` and marks all remaining `pending` steps as `skipped`. The pipeline derives to `halted`.
18
+
19
+ ```ruby
20
+ class HaltPipeline < GoodPipeline::Pipeline
21
+   failure_strategy :halt
22
+
23
+   def configure(id:)
24
+     run :a, JobA, with: { id: id }
25
+     run :b, JobB, with: { id: id } # independent of :a
26
+     run :c, JobC, after: :a
27
+   end
28
+ end
29
+ ```
30
+
31
+ If `:a` fails: `:b` is skipped (even though it's independent), `:c` is skipped, pipeline status is `halted`.
32
+
33
+ ### `:continue`
34
+
35
+ The coordinator applies skip propagation only to permanently unsatisfied descendants. Independent branches continue executing. The pipeline derives to `failed`.
36
+
37
+ ```ruby
38
+ class ContinuePipeline < GoodPipeline::Pipeline
39
+   failure_strategy :continue
40
+
41
+   def configure(id:)
42
+     run :a, JobA, with: { id: id }
43
+     run :b, JobB, with: { id: id } # independent of :a
44
+     run :c, JobC, after: :a
45
+   end
46
+ end
47
+ ```
48
+
49
+ If `:a` fails: `:c` is skipped (depends on `:a`), `:b` still runs, pipeline status is `failed`.
50
+
51
+ ### `:ignore`
52
+
53
+ Treats all failed steps as satisfied for dependency resolution. Nothing is skipped. The pipeline derives to `failed` if any step actually failed.
54
+
55
+ ```ruby
56
+ class IgnorePipeline < GoodPipeline::Pipeline
57
+   failure_strategy :ignore
58
+
59
+   def configure(id:)
60
+     run :a, JobA, with: { id: id }
61
+     run :b, JobB, after: :a
62
+   end
63
+ end
64
+ ```
65
+
66
+ If `:a` fails: `:b` is still enqueued (failure treated as success for dependencies), pipeline status is `failed`.
67
+
68
+ ## Step-level override
69
+
70
+ Override the failure strategy for a specific step's **outgoing edges** using `on_failure:` in the `run` call:
71
+
72
+ ```ruby
73
+ run :thumbnail, ThumbnailJob,
74
+   after: :download,
75
+   on_failure: :ignore # thumbnail failure never blocks downstream steps
76
+ ```
77
+
78
+ Step-level `on_failure` takes precedence over the pipeline-level strategy for that step's outgoing edges only.
79
+
80
+ ## Effective strategy resolution
81
+
82
+ The coordinator resolves the effective strategy for each step's outgoing edges:
83
+
84
+ 1. If the step has a step-level `on_failure:` override, use that
85
+ 2. Otherwise, fall back to the pipeline-level `failure_strategy`
86
+
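The two resolution rules above can be sketched in a few lines of plain Ruby. This is an illustration only: `effective_strategy` and `VALID_STRATEGIES` are hypothetical names and are not taken from the gem's source.

```ruby
# Hypothetical helper that mirrors the two resolution rules; not GoodPipeline's API.
VALID_STRATEGIES = %i[halt continue ignore].freeze

# step_override is the step's on_failure: value, or nil when none was given.
def effective_strategy(step_override, pipeline_strategy)
  strategy = step_override || pipeline_strategy
  unless VALID_STRATEGIES.include?(strategy)
    raise ArgumentError, "unknown strategy: #{strategy.inspect}"
  end

  strategy
end

effective_strategy(:ignore, :halt) # step-level override wins
effective_strategy(nil, :halt)     # falls back to the pipeline-level strategy
```

The result applies only to the failed step's outgoing edges; the sketch does not model the pipeline-wide halt behavior described in the next section.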
87
+ ## The `:halt` + step `:ignore` interaction
88
+
89
+ ::: warning Important
90
+ When a step fails with step-level `on_failure: :ignore` under a pipeline-level `:halt` strategy, the behavior may be surprising:
91
+
92
+ - That step's **outgoing edges** are treated as non-blocking — its dependents remain eligible
93
+ - The pipeline `:halt` policy **still fires** for all other unrelated pending steps
94
+ - `halt_triggered` is still set to `true`
95
+ - The pipeline still derives to `halted`
96
+ :::
97
+
98
+ Step-level `:ignore` scopes only to that step's outgoing edges, not to the global halt behavior of the pipeline.
99
+
100
+ ```ruby
101
+ class MixedPipeline < GoodPipeline::Pipeline
102
+   failure_strategy :halt
103
+
104
+   def configure(id:)
105
+     run :optional, OptionalJob, with: { id: id }, on_failure: :ignore
106
+     run :required, RequiredJob, with: { id: id }
107
+     run :after_optional, AfterOptionalJob, after: :optional
108
+   end
109
+ end
110
+ ```
111
+
112
+ If `:optional` fails: `:after_optional` remains eligible (ignore override), `:required` is skipped (halt policy), pipeline is `halted`.
113
+
114
+ ## Dependency satisfaction rules
115
+
116
+ A dependency edge (upstream → downstream) is **satisfied** when:
117
+
118
+ | Upstream status | Upstream strategy | Edge satisfied? |
119
+ |---|---|---|
120
+ | `succeeded` | any | Yes |
121
+ | `failed` | `:ignore` | Yes — treated as non-blocking |
122
+ | `failed` | `:continue` or `:halt` | No |
123
+ | `skipped` | any | No |
124
+ | `pending` or `enqueued` | any | No — not yet terminal |
125
+
126
+ A downstream step is eligible for enqueue when **all** of its incoming edges are satisfied.
127
+
128
+ A downstream step is marked `skipped` when it's still `pending` and at least one incoming edge is **permanently unsatisfied** — the upstream is terminal, cannot satisfy the edge, and no future event can change that.
129
+
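As a rough model of the satisfaction table and the two rules above, the following plain-Ruby sketch decides what should happen to a step that is still `pending`. The `Upstream` struct and the method names are assumptions made for illustration; the gem's coordinator may be organized quite differently.

```ruby
# Hypothetical model of the edge rules; names are illustrative, not the gem's.
# status is the upstream step's coordination status, strategy its effective strategy.
Upstream = Struct.new(:status, :strategy)

TERMINAL = %w[succeeded failed skipped].freeze

# An edge is satisfied by success, or by a failure whose strategy is :ignore.
def edge_satisfied?(upstream)
  return true if upstream.status == "succeeded"

  upstream.status == "failed" && upstream.strategy == :ignore
end

# Terminal upstream that cannot satisfy the edge: nothing can change it later.
def edge_permanently_unsatisfied?(upstream)
  TERMINAL.include?(upstream.status) && !edge_satisfied?(upstream)
end

# Returns :skip, :enqueue, or :wait for a downstream step that is still pending.
def next_action(upstreams)
  return :skip if upstreams.any? { |u| edge_permanently_unsatisfied?(u) }
  return :enqueue if upstreams.all? { |u| edge_satisfied?(u) }

  :wait
end

next_action([Upstream.new("succeeded", :halt), Upstream.new("failed", :ignore)])
next_action([Upstream.new("failed", :continue)])
```

A step with no upstream edges falls through to `:enqueue`, which matches root steps being eligible immediately. The pipeline-wide `:halt` sweep is a separate rule and is not modeled here.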
130
+ ## Failure resolution table
131
+
132
+ | Pipeline strategy | Step override | Effect when step fails |
133
+ |---|---|---|
134
+ | `:halt` | none | `halt_triggered = true`; all pending steps skipped; pipeline → `halted` |
135
+ | `:halt` | `:ignore` on failed step | That step's dependents still eligible; all other pending steps still skipped; pipeline → `halted` |
136
+ | `:continue` | none | Permanently unsatisfied descendants skipped; pipeline → `failed` |
137
+ | `:continue` | `:ignore` on failed step | That step's dependents still eligible |
138
+ | `:ignore` | none | Nothing skipped; pipeline → `failed` if any step failed |
data/docs/getting-started.md ADDED
@@ -0,0 +1,77 @@
1
+ # Installation & Setup
2
+
3
+ ## Install the gem
4
+
5
+ Add GoodPipeline to your Gemfile:
6
+
7
+ ```ruby
8
+ gem "good_pipeline"
9
+ ```
10
+
11
+ Then install:
12
+
13
+ ```bash
14
+ bundle install
15
+ ```
16
+
17
+ ## Run the install generator
18
+
19
+ GoodPipeline provides a generator that creates the necessary database migration:
20
+
21
+ ```bash
22
+ bin/rails generate good_pipeline:install
23
+ bin/rails db:migrate
24
+ ```
25
+
26
+ This creates four tables: `good_pipeline_pipelines`, `good_pipeline_steps`, `good_pipeline_dependencies`, and `good_pipeline_chains`.
27
+
28
+ ## Configure GoodJob
29
+
30
+ GoodPipeline requires GoodJob to preserve job records so it can read terminal failure metadata:
31
+
32
+ ```ruby
33
+ # config/initializers/good_job.rb
34
+ GoodJob.preserve_job_records = true
35
+ ```
36
+
37
+ GoodPipeline will raise `GoodPipeline::ConfigurationError` at boot if this is not set.
38
+
39
+ ## Mount the dashboard (optional)
40
+
41
+ ```ruby
42
+ # config/routes.rb
43
+ mount GoodPipeline::Engine => "/good_pipeline"
44
+ ```
45
+
46
+ See the [Web Dashboard](/dashboard) page for details.
47
+
48
+ ## Your first pipeline
49
+
50
+ Define a pipeline by subclassing `GoodPipeline::Pipeline` and implementing `configure`:
51
+
52
+ ```ruby
53
+ class DataIngestionPipeline < GoodPipeline::Pipeline
54
+   description "Fetches, transforms, and loads data"
55
+
56
+   def configure(source_id:)
57
+     run :fetch, FetchJob, with: { source_id: source_id }
58
+     run :transform, TransformJob, with: { source_id: source_id }, after: :fetch
59
+     run :load, LoadJob, with: { source_id: source_id }, after: :transform
60
+   end
61
+ end
62
+ ```
63
+
64
+ Run it:
65
+
66
+ ```ruby
67
+ DataIngestionPipeline.run(source_id: 42)
68
+ ```
69
+
70
+ This enqueues `:fetch` immediately. When it succeeds, `:transform` is enqueued. When that succeeds, `:load` is enqueued. If any step fails, the pipeline halts by default.
71
+
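The enqueue order described above follows from the `after:` edges. A small sketch, independent of the gem, shows how the set of runnable steps can be computed from a dependency map; `DEPENDENCIES` and `ready_steps` are hypothetical names used only for this illustration.

```ruby
# Hypothetical dependency map for DataIngestionPipeline: step => upstream steps.
DEPENDENCIES = {
  fetch: [],
  transform: [:fetch],
  load: [:transform]
}.freeze

# Steps that have not succeeded yet and whose upstream steps have all succeeded.
def ready_steps(dependencies, succeeded)
  dependencies
    .select { |step, upstreams| !succeeded.include?(step) && upstreams.all? { |u| succeeded.include?(u) } }
    .keys
end

ready_steps(DEPENDENCIES, [])                  # => [:fetch]
ready_steps(DEPENDENCIES, [:fetch])            # => [:transform]
ready_steps(DEPENDENCIES, [:fetch, :transform]) # => [:load]
```

The sketch ignores failures and steps that are already enqueued, so it only illustrates the happy path of this example.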
72
+ ## Next steps
73
+
74
+ - [Defining Pipelines](/defining-pipelines) — full DSL reference and DAG patterns
75
+ - [Failure Strategies](/failure-strategies) — control what happens when steps fail
76
+ - [Pipeline Chaining](/pipeline-chaining) — wire pipelines together
77
+ - [Monitoring](/monitoring) — inspect pipeline and step state
data/docs/index.md ADDED
@@ -0,0 +1,23 @@
1
+ ---
2
+ layout: home
3
+
4
+ hero:
5
+   name: GoodPipeline
6
+   text: DAG-based job pipelines for Rails
7
+   tagline: Postgres-only workflow orchestration built on GoodJob. Define multi-step workflows as directed acyclic graphs with dependency resolution, parallel execution, failure strategies, and a built-in dashboard.
8
+   actions:
9
+     - theme: brand
10
+       text: Get Started
11
+       link: /getting-started
12
+     - theme: alt
13
+       text: View on GitHub
14
+       link: https://github.com/milkstrawai/good_pipeline
15
+
16
+ features:
17
+   - title: Postgres Only
18
+     details: All state lives in Postgres, with no Redis and no external dependencies. Step transitions and job enqueues are atomically coupled in a single database transaction.
19
+   - title: DAG Orchestration
20
+     details: Define pipelines as directed acyclic graphs with the run DSL. Steps run in parallel when possible and wait for dependencies automatically. Fan-out, fan-in, and chaining are all built in.
21
+   - title: Built-in Dashboard
22
+     details: A mountable Rails engine with pipeline executions, step details with DAG visualization, and a pipeline definitions catalog. No build step; it uses CDN assets.
23
+ ---
data/docs/introduction.md ADDED
@@ -0,0 +1,42 @@
1
+ # Introduction
2
+
3
+ GoodPipeline is a Ruby gem that brings DAG-based (Directed Acyclic Graph) workflow orchestration to Rails applications using [GoodJob](https://github.com/bensheldon/good_job) as the job backend. It allows you to define pipelines of jobs that run in parallel or with explicit dependencies, chain multiple pipelines together, and monitor execution — all without any infrastructure beyond Postgres.
4
+
5
+ ## Why GoodPipeline?
6
+
7
+ ### The gap in the ecosystem
8
+
9
+ The two prominent DAG workflow gems in Ruby are:
10
+
11
+ - **[Gush](https://github.com/chaps-io/gush)** — graph-based with a clean DSL, but requires **Sidekiq + Redis**
12
+ - **[Jongleur](https://gitlab.com/RedFred7/Jongleur)** — DAG-based, but runs jobs as **OS processes**, not ActiveJob workers
13
+
14
+ Neither integrates with GoodJob. Teams that have chosen GoodJob for its Postgres-only simplicity have no DAG workflow option that stays within that constraint.
15
+
16
+ ### Why GoodJob::Batch isn't enough
17
+
18
+ GoodJob's Batch feature fires a single `on_finish` callback when all jobs in a batch complete. This is powerful for fan-out/fan-in patterns but insufficient for DAGs because:
19
+
20
+ - There is no per-job completion hook
21
+ - There is no concept of edges (dependencies) between individual jobs
22
+ - There is no way to express "enqueue Job C only after Job A and Job B both succeed"
23
+
24
+ GoodPipeline solves this by building a formal coordination state machine, DAG validation, and an atomic coordination layer on top of Batch.
25
+
26
+ ## Key features
27
+
28
+ - **DAG topology via `run` DSL** — define steps and their dependencies with a single verb
29
+ - **Parallel execution** — steps without dependencies run concurrently
30
+ - **Three failure strategies** — `:halt`, `:continue`, and `:ignore` at pipeline and step level
31
+ - **Pipeline chaining** — serial chains, fan-out, fan-in, and parallel start
32
+ - **Lifecycle callbacks** — `on_complete`, `on_success`, `on_failure` with exactly-once dispatch
33
+ - **Built-in dashboard** — mountable Rails engine with execution list, DAG visualization, and definitions catalog
34
+ - **Automatic cleanup** — piggybacks on GoodJob's cleanup cycle
35
+ - **Postgres-only** — all state in Postgres, no Redis, atomic enqueue transactions
36
+
37
+ ## Requirements
38
+
39
+ - Ruby >= 3.2
40
+ - Rails >= 7.1
41
+ - PostgreSQL
42
+ - GoodJob >= 3.10 with `preserve_job_records = true`
data/docs/monitoring.md ADDED
@@ -0,0 +1,103 @@
1
+ # Monitoring & Introspection
2
+
3
+ GoodPipeline records are standard ActiveRecord models. You can query and inspect them using familiar Rails patterns.
4
+
5
+ ## Pipeline instance methods
6
+
7
+ ```ruby
8
+ pipeline = VideoProcessingPipeline.run(video_id: 123)
9
+
10
+ pipeline.id # => "uuid-string"
11
+ pipeline.status # => "running"
12
+ pipeline.type # => "VideoProcessingPipeline"
13
+ pipeline.params # => { "video_id" => 123 }
14
+ pipeline.halt_triggered? # => false
15
+ pipeline.terminal? # => false
16
+ pipeline.on_failure_strategy # => "halt"
17
+ pipeline.created_at
18
+ pipeline.updated_at
19
+ ```
20
+
21
+ ## Step instance methods
22
+
23
+ ```ruby
24
+ step = pipeline.steps.find_by(key: "transcode")
25
+
26
+ step.key # => "transcode"
27
+ step.job_class # => "TranscodeJob"
28
+ step.coordination_status # => "succeeded"
29
+ step.params # => { "video_id" => 123 }
30
+ step.queue # => nil (or custom queue name)
31
+ step.priority # => nil (or custom priority)
32
+ step.good_job_id # => "uuid" of the GoodJob record
33
+ step.attempts # => 3
34
+ step.error_class # => "TransientError" (on failure)
35
+ step.error_message # => "Connection timed out" (on failure)
36
+ step.duration # => 12.34 (Float seconds, from GoodJob record)
37
+ ```
38
+
39
+ ## Pipeline statuses
40
+
41
+ | Status | Meaning |
42
+ |---|---|
43
+ | `pending` | Created but root steps not yet enqueued — waiting in a chain |
44
+ | `running` | At least one step is enqueued or executing |
45
+ | `succeeded` | All steps terminal, none failed |
46
+ | `failed` | One or more steps failed; `:continue` or `:ignore` strategy was used |
47
+ | `halted` | `:halt` strategy was applied — `halt_triggered` is `true` |
48
+ | `skipped` | Skipped because an upstream pipeline in a chain failed |
49
+
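One way to read the status table is as a function of the step statuses plus the `halt_triggered` flag. The sketch below is inferred from the table only: `derive_pipeline_status` is a hypothetical name, the ordering of the checks is an assumption, and the chain-level `skipped` status is not modeled.

```ruby
# Hypothetical derivation inferred from the status table; not the gem's code.
TERMINAL_STEP_STATUSES = %w[succeeded failed skipped].freeze

def derive_pipeline_status(step_statuses, halt_triggered: false)
  # Nothing enqueued yet, e.g. a pipeline waiting in a chain.
  return "pending" if step_statuses.all? { |status| status == "pending" }
  # Assumption: the final status is only derived once every step is terminal.
  return "running" unless step_statuses.all? { |status| TERMINAL_STEP_STATUSES.include?(status) }
  return "halted" if halt_triggered
  return "failed" if step_statuses.include?("failed")

  "succeeded"
end

derive_pipeline_status(%w[succeeded enqueued])
derive_pipeline_status(%w[failed skipped skipped], halt_triggered: true)
```

For monitoring code, reading `pipeline.status` directly is the reliable choice; the sketch only helps explain how the values in the table relate to each other.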
50
+ ## Step statuses
51
+
52
+ The `coordination_status` column is the authoritative step state:
53
+
54
+ | Status | Meaning |
55
+ |---|---|
56
+ | `pending` | Waiting for upstream dependencies to be satisfied |
57
+ | `enqueued` | Dependencies satisfied; job enqueued |
58
+ | `succeeded` | Job completed successfully — terminal |
59
+ | `failed` | Job exhausted retries or was discarded — terminal |
60
+ | `skipped` | Skipped due to upstream failure propagation — terminal |
61
+
62
+ ## Querying with ActiveRecord
63
+
64
+ ```ruby
65
+ # Find all failed pipelines in the last 24 hours
66
+ GoodPipeline::PipelineRecord.where(status: "failed")
67
+   .where("created_at > ?", 24.hours.ago)
68
+
69
+ # Find all pipelines of a specific type
70
+ GoodPipeline::PipelineRecord.where(type: "VideoProcessingPipeline")
71
+
72
+ # Find pipelines where a specific job class failed
73
+ GoodPipeline::PipelineRecord
74
+   .joins(:steps)
75
+   .where(good_pipeline_steps: {
76
+     job_class: "TranscodeJob",
77
+     coordination_status: "failed"
78
+   })
79
+
80
+ # Running pipelines
81
+ GoodPipeline::PipelineRecord.where(status: "running")
82
+ ```
83
+
84
+ ## Step associations
85
+
86
+ Steps expose their dependency graph through associations:
87
+
88
+ ```ruby
89
+ step = pipeline.steps.find_by(key: "publish")
90
+
91
+ step.upstream_steps # => steps that must complete before this one
92
+ step.downstream_steps # => steps waiting on this one
93
+ ```
94
+
95
+ ## Step duration
96
+
97
+ The `duration` method calculates how long a step took to execute by reading timing data from the associated GoodJob record:
98
+
99
+ ```ruby
100
+ step.duration # => 12.34 (seconds as Float), or nil if not available
101
+ ```
102
+
103
+ Duration is `nil` if the step hasn't run yet or if the GoodJob record is unavailable.
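
A minimal sketch of how such a duration could be computed, assuming the job record exposes start and finish timestamps. The `JobTiming` struct and the `performed_at` / `finished_at` field names are assumptions for illustration; the source does not say which columns `duration` actually reads.

```ruby
# Hypothetical stand-in for the timing fields of a job record.
JobTiming = Struct.new(:performed_at, :finished_at)

# Returns elapsed seconds as a Float, or nil when timing data is missing.
def step_duration(job)
  return nil if job.nil? || job.performed_at.nil? || job.finished_at.nil?

  (job.finished_at - job.performed_at).to_f
end

started = Time.at(1_700_000_000)
step_duration(JobTiming.new(started, started + 12.5)) # => 12.5
step_duration(JobTiming.new(started, nil))            # => nil
step_duration(nil)                                    # => nil
```

The `nil` branches correspond to the two cases above: a step that has not run yet and a job record that is unavailable.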