good_pipeline 0.2.2 → 0.3.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: e52af813bd86f0de3d905a3f591209e0a155e0b5c28e3a76f963f1c2cdbd8232
4
- data.tar.gz: 3a48213db593949e7f9afff0f341cf29dfdb90d4972b20b4c5e48091cb86bab1
3
+ metadata.gz: e9df8b5fbd57895f53adf1d3c5804ca9bd64ca792e880c2d4941957e4b8ca368
4
+ data.tar.gz: c4e2a7c4edbe27a0e40e7ff62a061f5ade53c44a74c0b824ce21bdf019cb8781
5
5
  SHA512:
6
- metadata.gz: 7bc3d7e1181b852bbc795f1e65ef9b2764645ffc13d7d860f444074158c741202ddae188253c50cb12a44da9958fe6403b2435fef6ec71d9172a35b46816da55
7
- data.tar.gz: 22b59ba7dba841a6d55559c27a7427032bad94ed1fd3e336f2d53997ee4a24bf2f8c193d77e743095e4c5c73b948975b0f85228d63839eba367af1dc441f8456
6
+ metadata.gz: e3c6e6940034efbc8679ede33b352022deb64a36b7a16a9dd3f02fd6dc32a837caf545407541bd0ca629f690d38cfa60342e6aa756e53ea321fb754eebbd772c
7
+ data.tar.gz: 878566f6d9ddc7b9ce6b4201f5bd241b21f2758f5475e3fa60e6c3fd629651bff06ff973af9f3f29ac4624bee58e78dc0f9628e8a1cdb503836e923a09e3944e
data/CHANGELOG.md CHANGED
@@ -1,5 +1,43 @@
1
1
  ## [Unreleased]
2
2
 
3
+ ## [0.3.1] - 2026-03-26
4
+
5
+ ### Added
6
+
7
+ - **`halt_pipeline!`** — call from any job to stop the pipeline early with a `succeeded` status. The halting step is marked `halted`, remaining pending steps are `skipped`, and the `on_success` callback fires. The GoodJob record completes as succeeded (no error, no discard). Available in all jobs via `GoodPipeline::Haltable`, included automatically by the Engine.
8
+ - **`halted` coordination status** — new terminal step status for steps that called `halt_pipeline!`. Treated as satisfied for downstream dependency resolution.
9
+ - **`halt_requested` column** — boolean column on steps table, set by `halt_pipeline!` and checked by the coordinator on step completion.
10
+ - **`good_job_id` index** — partial unique index on `good_job_id` for fast step lookup from within jobs.
11
+
12
+ ## [0.3.0] - 2026-03-25
13
+
14
+ ### Performance
15
+
16
+ - **Bulk insert steps and dependencies** — `Runner` uses `insert_all!` with `RETURNING` for steps and dependencies instead of individual `create!` calls, reducing pipeline creation from N+M queries to 2.
17
+ - **Pre-generated pipeline UUID** — `Runner` generates the pipeline UUID upfront, folding batch ID and initial status into a single INSERT instead of separate UPDATEs.
18
+ - **Atomic upstream counter** — new `pending_upstream_count` column on steps tracks how many upstreams remain. `unblock_downstream_steps` atomically decrements via `UPDATE ... RETURNING` and only calls `try_enqueue_step` when the count reaches zero, eliminating O(N) wasted lock acquisitions for fan-in and diamond topologies.
19
+ - **Merged UPDATE round-trips** — `enqueue_user_job` folds status transition, batch ID, and job ID into one `update_columns`. `record_step_failure` merges status and error metadata into one `update_columns`.
20
+ - **Removed redundant transaction** — `record_step_outcome` no longer wraps a single `update_columns` in an explicit transaction.
21
+ - **`update_columns` in transition methods** — `transition_coordination_status_to!` and `transition_to!` use `update_columns` instead of `update!`, skipping AR dirty tracking overhead.
22
+ - **SQL EXISTS for status checks** — `recompute_pipeline_status` and `derive_terminal_status` use `EXISTS` queries instead of loading all step records.
23
+ - **Pipeline load with EXISTS** — `load_pipeline_with_active_check` combines pipeline load with active-step and downstream-chain EXISTS checks in a single query.
24
+ - **Conditional callback dispatch** — `dispatch_callbacks_once` uses `UPDATE WHERE callbacks_dispatched_at IS NULL` instead of `SELECT FOR UPDATE` + `UPDATE`.
25
+ - **Early return on active pipeline** — `complete_step` skips pipeline status recomputation when `unblock_downstream_steps` enqueued any downstream step.
26
+ - **Bulk skip on halt** — `skip_all_pending_steps` uses `update_all` instead of iterating with individual updates.
27
+ - **Single-pass graph validation** — `GraphValidator` merges duplicate-key check, self-dependency check, steps-by-key index, and forward-edges construction into one O(n) pass and returns `steps_by_key` for reuse by `Pipeline`.
28
+ - **Frozen constant defaults** — `EMPTY_HASH` and `EMPTY_ARRAY` shared constants avoid allocating fresh empty containers on every `StepDefinition` and `Pipeline#run` call.
29
+ - **Fast-path shortcuts** — `validate_enqueue_options!` returns immediately for empty options. `expand_branch_aliases` skips `flat_map` when no branches are defined.
30
+
31
+ ### Added
32
+
33
+ - **Benchmarking scripts** — `bench/memory_bench.rb` (in-memory, no DB) and `bench/database_bench.rb` (PostgreSQL) with `--json` flag for structured output. Covers pipeline construction, graph validation, cycle detection, step enqueue, step completion, status recomputation, halt propagation, and full pipeline run across linear, fan-out, fan-in, and diamond topologies.
34
+ - **`pending_upstream_count` column** — integer column on steps table, set by `Runner` at creation time, decremented atomically by `Coordinator` on step completion.
35
+
36
+ ### Changed
37
+
38
+ - **`Runner` refactored** — `call` method extracted into `create_pipeline_batch`, `create_pipeline_record`, `insert_steps`, `insert_dependencies`, and `enqueue_root_steps` for readability. Pipeline record is a local variable passed to methods instead of an instance variable.
39
+ - **`Coordinator` method reordering** — private methods grouped by concern (outcome recording, downstream unblocking, step resolution, pipeline status) rather than call order.
40
+
3
41
  ## [0.2.2] - 2026-03-24
4
42
 
5
43
  ### Fixed
@@ -15,6 +15,7 @@ module GoodPipeline
15
15
  " classDef failed fill:#f44336,color:#fff",
16
16
  " classDef skipped fill:#bdbdbd,color:#333",
17
17
  " classDef skipped_by_branch fill:#bdbdbd,color:#333",
18
+ " classDef halted fill:#8bc34a,color:#fff",
18
19
  " classDef branch fill:#ff9800,color:#fff,stroke:#f57c00",
19
20
  " classDef terminal fill:#1a1a2e,color:#fff,stroke:#1a1a2e"
20
21
  ].freeze
@@ -1,6 +1,9 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module GoodPipeline
4
+ # This model intentionally has no AR callbacks or validations. Status transitions
5
+ # use update_columns throughout the coordinator layer. If you need lifecycle hooks,
6
+ # ensure all update_columns call sites are updated accordingly.
4
7
  class PipelineRecord < ActiveRecord::Base
5
8
  self.table_name = "good_pipeline_pipelines"
6
9
  self.inheritance_column = nil
@@ -67,7 +70,7 @@ module GoodPipeline
67
70
  raise InvalidTransition, "cannot transition pipeline from '#{status}' to '#{new_status}'"
68
71
  end
69
72
 
70
- update!(status: new_status)
73
+ update_columns(status: new_status, updated_at: Time.current)
71
74
  end
72
75
  end
73
76
  end
@@ -1,14 +1,17 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module GoodPipeline
4
+ # This model intentionally has no AR callbacks or validations. Status transitions
5
+ # use update_columns throughout the coordinator layer. If you need lifecycle hooks,
6
+ # ensure all update_columns call sites are updated accordingly.
4
7
  class StepRecord < ActiveRecord::Base
5
8
  self.table_name = "good_pipeline_steps"
6
9
 
7
- TERMINAL_COORDINATION_STATUSES = %w[succeeded failed skipped skipped_by_branch].freeze
10
+ TERMINAL_COORDINATION_STATUSES = %w[succeeded failed skipped skipped_by_branch halted].freeze
8
11
 
9
12
  VALID_COORDINATION_TRANSITIONS = {
10
- "pending" => %w[enqueued skipped skipped_by_branch succeeded failed],
11
- "enqueued" => %w[succeeded failed]
13
+ "pending" => %w[enqueued skipped skipped_by_branch succeeded failed halted],
14
+ "enqueued" => %w[succeeded failed halted]
12
15
  }.freeze
13
16
 
14
17
  enum :coordination_status, {
@@ -17,7 +20,8 @@ module GoodPipeline
17
20
  succeeded: "succeeded",
18
21
  failed: "failed",
19
22
  skipped: "skipped",
20
- skipped_by_branch: "skipped_by_branch"
23
+ skipped_by_branch: "skipped_by_branch",
24
+ halted: "halted"
21
25
  }
22
26
 
23
27
  enum :on_failure_strategy, { halt: "halt", continue: "continue", ignore: "ignore" }
@@ -74,7 +78,7 @@ module GoodPipeline
74
78
  "cannot transition step '#{key}' coordination_status from '#{coordination_status}' to '#{new_status}'"
75
79
  end
76
80
 
77
- update!(coordination_status: new_status)
81
+ update_columns(coordination_status: new_status, updated_at: Time.current)
78
82
  end
79
83
  end
80
84
  end
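Reviewer note: the hunk above adds `halted` as both a terminal status and a legal target from `pending` and `enqueued`. The following is a minimal plain-Ruby sketch of how such a transition table can be checked. The constant values are copied from the diff, but `transition_allowed?` and the module name are hypothetical helpers for illustration, not the gem's API.

```ruby
# Illustration only: mirrors the constants in the hunk above.
module TransitionSketch
  TERMINAL = %w[succeeded failed skipped skipped_by_branch halted].freeze

  VALID = {
    "pending" => %w[enqueued skipped skipped_by_branch succeeded failed halted],
    "enqueued" => %w[succeeded failed halted]
  }.freeze

  # Hypothetical helper: terminal statuses have no entry in VALID,
  # so every transition out of them is rejected.
  def self.transition_allowed?(from, to)
    VALID.fetch(from.to_s, []).include?(to.to_s)
  end
end

p TransitionSketch.transition_allowed?(:enqueued, :halted)
p TransitionSketch.transition_allowed?(:halted, :succeeded)
```

Because `halted` appears only as a target and never as a key, a halted step cannot move to any other status in this model.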
@@ -0,0 +1,7 @@
1
+ # frozen_string_literal: true
2
+
3
+ class HaltExecutionJob < ApplicationJob
4
+ def perform(**)
5
+ halt_pipeline!
6
+ end
7
+ end
@@ -30,6 +30,8 @@ class CreateGoodPipelineTables < ActiveRecord::Migration[8.1]
30
30
  t.jsonb :branch, null: false, default: {}
31
31
  t.uuid :good_job_batch_id
32
32
  t.uuid :good_job_id
33
+ t.integer :pending_upstream_count, null: false, default: 0
34
+ t.boolean :halt_requested, null: false, default: false
33
35
  t.integer :attempts
34
36
  t.string :error_class
35
37
  t.text :error_message
@@ -39,6 +41,7 @@ class CreateGoodPipelineTables < ActiveRecord::Migration[8.1]
39
41
 
40
42
  add_index :good_pipeline_steps, %i[pipeline_id key], unique: true
41
43
  add_index :good_pipeline_steps, :coordination_status
44
+ add_index :good_pipeline_steps, :good_job_id, unique: true, where: "good_job_id IS NOT NULL"
42
45
 
43
46
  create_table :good_pipeline_dependencies do |t|
44
47
  t.references :pipeline, null: false, foreign_key: { to_table: :good_pipeline_pipelines }, type: :uuid
@@ -0,0 +1,113 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "test_helper"
4
+
5
+ class TestHaltExecution < ActiveSupport::TestCase
6
+ def run_pipeline_to_completion(pipeline_record, timeout: 15)
7
+ deadline = Time.current + timeout
8
+ loop do
9
+ perform_enqueued_jobs_inline
10
+ pipeline_record.reload
11
+ return pipeline_record if pipeline_record.terminal?
12
+
13
+ raise "Pipeline did not reach terminal state within #{timeout}s (status: #{pipeline_record.status})" if Time.current > deadline
14
+
15
+ sleep 0.05
16
+ end
17
+ end
18
+
19
+ def test_halt_pipeline_marks_step_halted
20
+ pipeline_class = Class.new(GoodPipeline::Pipeline) do
21
+ failure_strategy :halt
22
+ define_method(:configure) do |**|
23
+ run :halt_step, HaltExecutionJob
24
+ run :after_step, DownloadJob, after: :halt_step
25
+ end
26
+ end
27
+ Object.const_set(:HaltSucceededPipeline, pipeline_class) unless defined?(::HaltSucceededPipeline)
28
+
29
+ chain = HaltSucceededPipeline.run
30
+ result = run_pipeline_to_completion(chain)
31
+
32
+ halt_step = result.steps.find_by(key: "halt_step")
33
+ assert_equal "halted", halt_step.coordination_status
34
+ assert halt_step.halt_requested?, "halt_requested should be true"
35
+ end
36
+
37
+ def test_halt_pipeline_skips_remaining_steps
38
+ pipeline_class = Class.new(GoodPipeline::Pipeline) do
39
+ failure_strategy :halt
40
+ define_method(:configure) do |**|
41
+ run :halt_step, HaltExecutionJob
42
+ run :after_step, DownloadJob, after: :halt_step
43
+ end
44
+ end
45
+ Object.const_set(:HaltSkipsPipeline, pipeline_class) unless defined?(::HaltSkipsPipeline)
46
+
47
+ chain = HaltSkipsPipeline.run
48
+ result = run_pipeline_to_completion(chain)
49
+
50
+ after_step = result.steps.find_by(key: "after_step")
51
+ assert_equal "skipped", after_step.coordination_status
52
+ end
53
+
54
+ def test_halt_pipeline_pipeline_succeeds
55
+ pipeline_class = Class.new(GoodPipeline::Pipeline) do
56
+ failure_strategy :halt
57
+ define_method(:configure) do |**|
58
+ run :halt_step, HaltExecutionJob
59
+ run :after_step, DownloadJob, after: :halt_step
60
+ end
61
+ end
62
+ Object.const_set(:HaltSucceedsPipeline, pipeline_class) unless defined?(::HaltSucceedsPipeline)
63
+
64
+ chain = HaltSucceedsPipeline.run
65
+ result = run_pipeline_to_completion(chain)
66
+
67
+ assert_equal "succeeded", result.status
68
+ end
69
+
70
+ def test_halt_pipeline_job_succeeds_in_good_job
71
+ pipeline_class = Class.new(GoodPipeline::Pipeline) do
72
+ failure_strategy :halt
73
+ define_method(:configure) do |**|
74
+ run :halt_step, HaltExecutionJob
75
+ end
76
+ end
77
+ Object.const_set(:HaltJobSucceedsPipeline, pipeline_class) unless defined?(::HaltJobSucceedsPipeline)
78
+
79
+ chain = HaltJobSucceedsPipeline.run
80
+ run_pipeline_to_completion(chain)
81
+
82
+ halt_step = chain.steps.find_by(key: "halt_step")
83
+ good_job = GoodJob::Job.find(halt_step.good_job_id)
84
+
85
+ assert_equal 1, good_job.executions_count
86
+ assert_nil good_job.error, "GoodJob record should have no error"
87
+ assert_not_nil good_job.finished_at
88
+ end
89
+
90
+ def test_halt_pipeline_with_parallel_steps
91
+ pipeline_class = Class.new(GoodPipeline::Pipeline) do
92
+ failure_strategy :continue
93
+ define_method(:configure) do |**|
94
+ run :halt_step, HaltExecutionJob
95
+ run :normal_step, DownloadJob
96
+ run :after_both, CleanupJob, after: %i[halt_step normal_step]
97
+ end
98
+ end
99
+ Object.const_set(:HaltParallelPipeline, pipeline_class) unless defined?(::HaltParallelPipeline)
100
+
101
+ chain = HaltParallelPipeline.run
102
+ result = run_pipeline_to_completion(chain)
103
+
104
+ halt_step = result.steps.find_by(key: "halt_step")
105
+ normal_step = result.steps.find_by(key: "normal_step")
106
+ after_both = result.steps.find_by(key: "after_both")
107
+
108
+ assert_equal "halted", halt_step.coordination_status
109
+ assert_equal "succeeded", normal_step.coordination_status
110
+ assert_equal "skipped", after_both.coordination_status
111
+ assert_equal "succeeded", result.status
112
+ end
113
+ end
@@ -65,6 +65,7 @@ module ActiveSupport
65
65
  dependencies.each do |dependency_step|
66
66
  GoodPipeline::DependencyRecord.create!(pipeline: pipeline, step: step, depends_on_step: dependency_step)
67
67
  end
68
+ step.update_column(:pending_upstream_count, dependencies.size)
68
69
  step
69
70
  end
70
71
  end
@@ -127,6 +127,41 @@ A downstream step is eligible for enqueue when **all** of its incoming edges are
127
127
 
128
128
  A downstream step is marked `skipped` when it's still `pending` and at least one incoming edge is **permanently unsatisfied** — the upstream is terminal, cannot satisfy the edge, and no future event can change that.
129
129
 
130
+ ## Early termination with success
131
+
132
+ Sometimes a job determines there is nothing to do — the account is deactivated, the resource was already processed, etc. Call `halt_pipeline!` to stop the pipeline early and mark it as `succeeded`:
133
+
134
+ ```ruby
135
+ class FetchDataJob < ApplicationJob
136
+ def perform(account_id:)
137
+ account = Account.find(account_id)
138
+ return halt_pipeline! if account.deactivated?
139
+
140
+ # ... normal work
141
+ end
142
+ end
143
+ ```
144
+
145
+ The behavior:
146
+
147
+ | Aspect | Value |
148
+ |---|---|
149
+ | Halting step status | `halted` |
150
+ | Remaining pending steps | `skipped` |
151
+ | Pipeline status | `succeeded` |
152
+ | Callback triggered | `on_success` |
153
+ | GoodJob record | Succeeded (no error, no discard) |
154
+
155
+ No configuration or module includes are required. The Engine includes `GoodPipeline::Haltable` into `ActiveJob::Base` at boot, so `halt_pipeline!` is available in any job. For non-pipeline jobs, it's a no-op.
156
+
157
+ ::: tip Return early
158
+ Remember to use `return halt_pipeline!` — without `return`, the job continues executing after the call.
159
+ :::
160
+
161
+ ::: warning Parallel steps
162
+ If another step is already running when `halt_pipeline!` is called, that step continues to completion. Only `pending` steps are skipped. If the running step then fails, the pipeline's final status resolves to `failed`, not `succeeded`.
163
+ :::
164
+
130
165
  ## Failure resolution table
131
166
 
132
167
  | Pipeline strategy | Step override | Effect when step fails |
@@ -28,6 +28,8 @@ class CreateGoodPipelineTables < ActiveRecord::Migration[<%= ActiveRecord::Migra
28
28
  t.jsonb :branch, null: false, default: {}
29
29
  t.uuid :good_job_batch_id
30
30
  t.uuid :good_job_id
31
+ t.integer :pending_upstream_count, null: false, default: 0
32
+ t.boolean :halt_requested, null: false, default: false
31
33
  t.integer :attempts
32
34
  t.string :error_class
33
35
  t.text :error_message
@@ -37,6 +39,7 @@ class CreateGoodPipelineTables < ActiveRecord::Migration[<%= ActiveRecord::Migra
37
39
 
38
40
  add_index :good_pipeline_steps, %i[pipeline_id key], unique: true
39
41
  add_index :good_pipeline_steps, :coordination_status
42
+ add_index :good_pipeline_steps, :good_job_id, unique: true, where: "good_job_id IS NOT NULL"
40
43
 
41
44
  create_table :good_pipeline_dependencies do |t|
42
45
  t.references :pipeline, null: false, foreign_key: { to_table: :good_pipeline_pipelines }, type: :uuid
@@ -0,0 +1,6 @@
1
+ # frozen_string_literal: true
2
+
3
+ module GoodPipeline
4
+ EMPTY_HASH = {}.freeze
5
+ EMPTY_ARRAY = [].freeze
6
+ end
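Reviewer note: the new file above defines shared frozen containers that are later used as keyword defaults. The following is a small sketch of why that avoids allocations, assuming nothing beyond plain Ruby. `FrozenDefaultsSketch` and `StepSketch` are hypothetical stand-ins, not the gem's `StepDefinition`.

```ruby
# Illustration only: hypothetical stand-ins for the gem's classes.
module FrozenDefaultsSketch
  EMPTY_HASH = {}.freeze
  EMPTY_ARRAY = [].freeze

  class StepSketch
    attr_reader :params, :dependencies

    # Default expressions are evaluated on every call, so `with: {}` would
    # allocate a new Hash each time. A frozen constant is reused instead.
    def initialize(with: EMPTY_HASH, after: EMPTY_ARRAY)
      @params = with
      @dependencies = after
    end
  end
end

first = FrozenDefaultsSketch::StepSketch.new
second = FrozenDefaultsSketch::StepSketch.new

puts first.params.equal?(second.params) # both point at the shared constant

begin
  first.params[:key] = "value"
rescue FrozenError => error
  puts "mutation rejected: #{error.class}"
end
```

The trade-off is that code receiving a default can no longer mutate it in place. Any caller that needs to modify the container has to `dup` it first.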
@@ -3,25 +3,39 @@
3
3
  module GoodPipeline
4
4
  class Coordinator # rubocop:disable Metrics/ClassLength
5
5
  class << self
6
- def complete_step(step, succeeded:)
6
+ def complete_step(step, succeeded:) # rubocop:disable Metrics/MethodLength
7
7
  return if step.terminal_coordination_status?
8
8
 
9
+ if succeeded && step.halt_requested?
10
+ handle_halt_execution(step)
11
+ return
12
+ end
13
+
9
14
  record_step_outcome(step, succeeded)
10
15
  propagate_halt(step) if !succeeded && step.pipeline.halt?
11
- unblock_downstream_steps(step)
12
- recompute_pipeline_status(step.pipeline.reload)
16
+ return if unblock_downstream_steps(step)
17
+
18
+ pipeline = load_pipeline_with_active_check(step.pipeline_id)
19
+
20
+ recompute_pipeline_status(
21
+ pipeline,
22
+ has_active_steps: pipeline["has_active_steps"],
23
+ has_downstream_chains: pipeline["has_downstream_chains"]
24
+ )
13
25
  end
14
26
 
15
27
  def try_enqueue_step(step_id) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/MethodLength, Metrics/PerceivedComplexity
28
+ step_was_enqueued = false
16
29
  skipped_downstream_ids = nil
17
30
  recompute_pipeline = nil
18
31
 
19
32
  StepRecord.transaction do
20
33
  locked_step = StepRecord.lock("FOR UPDATE").find_by(id: step_id)
21
- return unless locked_step&.pending?
22
- return if locked_step.good_job_id.present?
34
+ return false unless locked_step&.pending?
35
+ return false if locked_step.good_job_id.present?
23
36
 
24
37
  skipped_downstream_ids = resolve_step(locked_step)
38
+ step_was_enqueued = skipped_downstream_ids.nil?
25
39
  rescue ConfigurationError => error
26
40
  fail_step_with_error(locked_step, error)
27
41
  propagate_halt(locked_step) if locked_step.pipeline.halt?
@@ -29,48 +43,68 @@ module GoodPipeline
29
43
  recompute_pipeline = locked_step.pipeline
30
44
  end
31
45
 
32
- skipped_downstream_ids&.each { |downstream_step_id| try_enqueue_step(downstream_step_id) }
46
+ downstream_enqueued = false
47
+ skipped_downstream_ids&.each { |id| downstream_enqueued = true if try_enqueue_step(id) }
33
48
  recompute_pipeline_status(recompute_pipeline.reload) if recompute_pipeline
49
+ step_was_enqueued || downstream_enqueued
34
50
  end
35
51
 
36
- def recompute_pipeline_status(pipeline)
37
- steps = pipeline.steps.reload
38
-
39
- return if steps.any? { |step| step.pending? || step.enqueued? }
52
+ def recompute_pipeline_status(pipeline, has_active_steps: nil, has_downstream_chains: nil) # rubocop:disable Metrics/MethodLength
40
53
  return if pipeline.terminal?
41
54
 
42
- new_status = derive_terminal_status(steps, pipeline)
55
+ active = if has_active_steps.nil?
56
+ pipeline.steps.where(coordination_status: %w[pending enqueued]).exists?
57
+ else
58
+ has_active_steps
59
+ end
60
+
61
+ return if active
62
+
63
+ new_status = derive_terminal_status(pipeline)
43
64
  pipeline.transition_to!(new_status)
44
65
  dispatch_callbacks_once(pipeline, new_status)
45
- ChainCoordinator.propagate_terminal_state(pipeline)
66
+ ChainCoordinator.propagate_terminal_state(pipeline) unless has_downstream_chains == false
46
67
  end
47
68
 
48
69
  def dispatch_callbacks_once(pipeline, new_status)
49
70
  PipelineRecord.transaction do
50
- locked = PipelineRecord.lock("FOR UPDATE").find(pipeline.id)
51
- return if locked.callbacks_dispatched_at.present?
71
+ rows_updated = PipelineRecord.where(id: pipeline.id, callbacks_dispatched_at: nil)
72
+ .update_all(callbacks_dispatched_at: Time.current)
73
+
74
+ return if rows_updated.zero?
52
75
 
53
- locked.update!(callbacks_dispatched_at: Time.current)
54
- PipelineCallbackJob.perform_later(locked.id, new_status.to_s)
76
+ PipelineCallbackJob.perform_later(pipeline.id, new_status.to_s)
55
77
  end
56
78
  end
57
79
 
58
80
  private
59
81
 
82
+ def handle_halt_execution(step)
83
+ step.transition_coordination_status_to!(:halted)
84
+ step.pipeline.steps.pending.update_all(coordination_status: "skipped")
85
+
86
+ pipeline = load_pipeline_with_active_check(step.pipeline_id)
87
+
88
+ recompute_pipeline_status(
89
+ pipeline,
90
+ has_active_steps: pipeline["has_active_steps"],
91
+ has_downstream_chains: pipeline["has_downstream_chains"]
92
+ )
93
+ end
94
+
60
95
  def record_step_outcome(step, succeeded)
61
- StepRecord.transaction do
62
- if succeeded
63
- step.transition_coordination_status_to!(:succeeded)
64
- else
65
- record_step_failure(step)
66
- end
96
+ if succeeded
97
+ step.transition_coordination_status_to!(:succeeded)
98
+ else
99
+ record_step_failure(step)
67
100
  end
68
101
  end
69
102
 
70
103
  def record_step_failure(step)
71
104
  metadata = FailureMetadata.extract(step)
72
- step.transition_coordination_status_to!(:failed)
73
105
  step.update_columns(
106
+ coordination_status: "failed",
107
+ updated_at: Time.current,
74
108
  error_class: metadata.error_class,
75
109
  error_message: metadata.error_message,
76
110
  attempts: metadata.attempts
@@ -78,28 +112,71 @@ module GoodPipeline
78
112
  end
79
113
 
80
114
  def propagate_halt(step)
81
- pipeline = step.pipeline
82
115
  StepRecord.transaction do
83
- pipeline.update_column(:halt_triggered, true)
84
- skip_all_pending_steps(pipeline, except_dependents_of: step)
116
+ step.pipeline.update_column(:halt_triggered, true)
117
+ skip_all_pending_steps(step.pipeline, except_dependents_of: step)
118
+ end
119
+ end
120
+
121
+ def skip_all_pending_steps(pipeline, except_dependents_of:)
122
+ scope = pipeline.steps.pending
123
+
124
+ if effective_failure_strategy(except_dependents_of) == :ignore
125
+ exempt_ids = transitive_downstream_ids(except_dependents_of)
126
+ scope = scope.where.not(id: exempt_ids.to_a) if exempt_ids.any?
85
127
  end
128
+
129
+ scope.update_all(coordination_status: "skipped")
86
130
  end
87
131
 
88
132
  def unblock_downstream_steps(step)
89
- step.downstream_steps.each do |downstream_step|
90
- try_enqueue_step(downstream_step.id)
133
+ sql = <<~SQL
134
+ UPDATE good_pipeline_steps
135
+ SET pending_upstream_count = pending_upstream_count - 1
136
+ WHERE id IN (
137
+ SELECT step_id FROM good_pipeline_dependencies
138
+ WHERE depends_on_step_id = $1
139
+ )
140
+ AND coordination_status = 'pending'
141
+ RETURNING id, pending_upstream_count
142
+ SQL
143
+
144
+ any_enqueued = false
145
+ StepRecord.connection.exec_query(sql, "SQL", [step.id]).each do |row|
146
+ any_enqueued = true if row["pending_upstream_count"].zero? && try_enqueue_step(row["id"])
91
147
  end
148
+ any_enqueued
92
149
  end
93
150
 
94
- def resolve_step(locked_step) # rubocop:disable Metrics/MethodLength
151
+ def load_pipeline_with_active_check(pipeline_id)
152
+ sql = <<~SQL.squish
153
+ good_pipeline_pipelines.*,
154
+ EXISTS(
155
+ SELECT 1 FROM good_pipeline_steps
156
+ WHERE pipeline_id = good_pipeline_pipelines.id
157
+ AND coordination_status IN ('pending', 'enqueued')
158
+ ) AS has_active_steps,
159
+ EXISTS(
160
+ SELECT 1 FROM good_pipeline_chains
161
+ WHERE upstream_pipeline_id = good_pipeline_pipelines.id
162
+ ) AS has_downstream_chains
163
+ SQL
164
+
165
+ PipelineRecord.select(sql).where(id: pipeline_id).first!
166
+ end
167
+
168
+ def resolve_step(locked_step) # rubocop:disable Metrics/MethodLength,Metrics/AbcSize
95
169
  if should_skip?(locked_step)
96
170
  locked_step.transition_coordination_status_to!(:skipped)
171
+ decrement_upstream_counts_for_terminal_step(locked_step.id)
97
172
  locked_step.downstream_steps.pluck(:id)
98
173
  elsif locked_step.branch_step? && all_upstreams_satisfied?(locked_step)
99
174
  BranchResolver.resolve(locked_step)
175
+ decrement_upstream_counts_for_terminal_step(locked_step.id)
100
176
  locked_step.downstream_steps.pluck(:id)
101
177
  elsif BranchResolver.skipped_by_branch?(locked_step)
102
178
  locked_step.transition_coordination_status_to!(:skipped_by_branch)
179
+ decrement_upstream_counts_for_terminal_step(locked_step.id)
103
180
  locked_step.downstream_steps.pluck(:id)
104
181
  else
105
182
  enqueue_user_job(locked_step) if all_upstreams_satisfied?(locked_step)
@@ -107,12 +184,43 @@ module GoodPipeline
107
184
  end
108
185
  end
109
186
 
110
- def enqueue_user_job(step)
111
- step.transition_coordination_status_to!(:enqueued)
187
+ def should_skip?(step)
188
+ step.pending? && step.upstream_steps.any? { |upstream| permanently_unsatisfied?(upstream) }
189
+ end
190
+
191
+ def permanently_unsatisfied?(upstream)
192
+ upstream.terminal_coordination_status? &&
193
+ !upstream.succeeded? &&
194
+ !upstream.halted? &&
195
+ !upstream.skipped_by_branch? &&
196
+ effective_failure_strategy(upstream) != :ignore
197
+ end
198
+
199
+ def decrement_upstream_counts_for_terminal_step(step_id)
200
+ downstream_ids = DependencyRecord.where(depends_on_step_id: step_id).select(:step_id)
201
+ StepRecord.where(id: downstream_ids, coordination_status: "pending")
202
+ .update_all("pending_upstream_count = pending_upstream_count - 1")
203
+ end
204
+
205
+ def all_upstreams_satisfied?(step)
206
+ step.upstream_steps.all? do |upstream|
207
+ upstream.succeeded? ||
208
+ upstream.halted? ||
209
+ upstream.skipped_by_branch? ||
210
+ (upstream.failed? && effective_failure_strategy(upstream) == :ignore)
211
+ end
212
+ end
112
213
 
214
+ def enqueue_user_job(step)
113
215
  batch = build_step_batch(step)
114
- batch.enqueue { enqueue_step_job(step) }
115
- step.update_column(:good_job_batch_id, batch.id)
216
+ good_job_id = nil
217
+ batch.enqueue { good_job_id = enqueue_step_job(step) }
218
+ step.update_columns(
219
+ coordination_status: "enqueued",
220
+ good_job_batch_id: batch.id,
221
+ good_job_id: good_job_id,
222
+ updated_at: Time.current
223
+ )
116
224
  end
117
225
 
118
226
  def build_step_batch(step)
@@ -125,11 +233,11 @@ module GoodPipeline
125
233
  def enqueue_step_job(step)
126
234
  job = step.job_class.constantize.new(**step.params.symbolize_keys)
127
235
  enqueued_job = job.enqueue(**step.enqueue_options.symbolize_keys)
128
- step.update_column(:good_job_id, enqueued_job.provider_job_id || enqueued_job.job_id)
236
+ enqueued_job.provider_job_id || enqueued_job.job_id
129
237
  end
130
238
 
131
- def derive_terminal_status(steps, pipeline)
132
- has_failures = steps.any?(&:failed?)
239
+ def derive_terminal_status(pipeline)
240
+ has_failures = pipeline.steps.where(coordination_status: "failed").exists?
133
241
 
134
242
  return :succeeded unless has_failures
135
243
  return :halted if pipeline.halt_triggered?
@@ -137,40 +245,6 @@ module GoodPipeline
137
245
  :failed
138
246
  end
139
247
 
140
- def all_upstreams_satisfied?(step)
141
- step.upstream_steps.all? do |upstream|
142
- upstream.succeeded? ||
143
- upstream.skipped_by_branch? ||
144
- (upstream.failed? && effective_failure_strategy(upstream) == :ignore)
145
- end
146
- end
147
-
148
- def should_skip?(step)
149
- step.pending? &&
150
- step.upstream_steps.any? { |upstream| permanently_unsatisfied?(upstream) }
151
- end
152
-
153
- def permanently_unsatisfied?(upstream)
154
- upstream.terminal_coordination_status? &&
155
- !upstream.succeeded? &&
156
- !upstream.skipped_by_branch? &&
157
- effective_failure_strategy(upstream) != :ignore
158
- end
159
-
160
- def skip_all_pending_steps(pipeline, except_dependents_of:)
161
- exempt_step_ids = if effective_failure_strategy(except_dependents_of) == :ignore
162
- transitive_downstream_ids(except_dependents_of)
163
- else
164
- Set.new
165
- end
166
-
167
- pipeline.steps.pending.find_each do |pending_step|
168
- next if exempt_step_ids.include?(pending_step.id)
169
-
170
- pending_step.transition_coordination_status_to!(:skipped)
171
- end
172
- end
173
-
174
248
  def transitive_downstream_ids(step)
175
249
  visited = Set.new
176
250
  queue = step.downstream_steps.pluck(:id)
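Reviewer note: `unblock_downstream_steps` now decrements `pending_upstream_count` and only attempts an enqueue when a counter reaches zero. The following is an in-memory model of that countdown, under the assumption that a `Mutex` can stand in for the atomicity the single `UPDATE ... RETURNING` statement provides in the database. `UpstreamCounterSketch` is a hypothetical name, not part of the gem.

```ruby
# In-memory model only: the Mutex plays the role of the database's
# atomic UPDATE ... RETURNING statement.
class UpstreamCounterSketch
  # dependencies: { step_key => [upstream_key, ...] }
  def initialize(dependencies)
    @pending = dependencies.transform_values(&:size)
    @downstream = Hash.new { |hash, key| hash[key] = [] }
    dependencies.each do |step, upstreams|
      upstreams.each { |upstream| @downstream[upstream] << step }
    end
    @lock = Mutex.new
  end

  # Decrements every dependent of `finished` and returns only the steps
  # whose counter reached zero, i.e. the ones that are ready to enqueue.
  def complete(finished)
    @lock.synchronize do
      @downstream.fetch(finished, []).select do |step|
        @pending[step] -= 1
        @pending[step].zero?
      end
    end
  end
end

# Diamond topology: a -> (b, c) -> d
counter = UpstreamCounterSketch.new({ a: [], b: [:a], c: [:a], d: %i[b c] })

p counter.complete(:a) # b and c become ready
p counter.complete(:b) # d still waits on c
p counter.complete(:c) # d becomes ready
```

In the diamond case, the fan-in step `d` is touched twice but reported as ready only once, which is the behavior the changelog credits with removing the wasted lock acquisitions.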
@@ -13,6 +13,12 @@ module GoodPipeline
13
13
  end
14
14
  end
15
15
 
16
+ initializer "good_pipeline.haltable" do
17
+ ActiveSupport.on_load(:active_job) do
18
+ include GoodPipeline::Haltable
19
+ end
20
+ end
21
+
16
22
  initializer "good_pipeline.cleanup_hook" do
17
23
  ActiveSupport::Notifications.subscribe("cleanup_preserved_jobs.good_job") do |event|
18
24
  timestamp = event.payload[:timestamp]
@@ -12,11 +12,10 @@ module GoodPipeline
12
12
 
13
13
  def validate!
14
14
  check_empty_pipeline!
15
- check_duplicate_keys!
16
- build_steps_by_key!
17
- check_self_dependencies!
15
+ build_index!
18
16
  check_unknown_references!
19
17
  check_cycles!
18
+ @steps_by_key
20
19
  end
21
20
 
22
21
  private
@@ -25,22 +24,20 @@ module GoodPipeline
25
24
  raise InvalidPipelineError, "pipeline has no steps" if @step_definitions.empty?
26
25
  end
27
26
 
28
- def check_duplicate_keys!
29
- seen = {}
27
+ def build_index! # rubocop:disable Metrics/AbcSize
28
+ @steps_by_key = {}
29
+ @forward_edges = Hash.new { |h, k| h[k] = [] }
30
+
30
31
  @step_definitions.each do |step|
31
- raise InvalidPipelineError, "duplicate step key :#{step.key}" if seen.key?(step.key)
32
+ raise InvalidPipelineError, "duplicate step key :#{step.key}" if @steps_by_key.key?(step.key)
32
33
 
33
- seen[step.key] = true
34
- end
35
- end
34
+ step.dependencies.each do |dependency_key|
35
+ raise InvalidPipelineError, "step :#{step.key} depends on itself" if dependency_key == step.key
36
36
 
37
- def build_steps_by_key!
38
- @steps_by_key = @step_definitions.to_h { |step| [step.key, step] }
39
- end
37
+ @forward_edges[dependency_key] << step.key
38
+ end
40
39
 
41
- def check_self_dependencies!
42
- @steps_by_key.each_value do |step|
43
- raise InvalidPipelineError, "step :#{step.key} depends on itself" if step.dependencies.include?(step.key)
40
+ @steps_by_key[step.key] = step
44
41
  end
45
42
  end
46
43
 
@@ -55,17 +52,7 @@ module GoodPipeline
55
52
  end
56
53
 
57
54
  def check_cycles!
58
- CycleDetector.check!(@steps_by_key, build_forward_edges)
59
- end
60
-
61
- def build_forward_edges
62
- edges = Hash.new { |h, k| h[k] = [] }
63
- @steps_by_key.each_value do |step|
64
- step.dependencies.each do |dependency_key|
65
- edges[dependency_key] << step.key
66
- end
67
- end
68
- edges
55
+ CycleDetector.check!(@steps_by_key, @forward_edges)
69
56
  end
70
57
  end
71
58
  end
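Reviewer note: `build_index!` folds the duplicate-key check, the self-dependency check, the key index, and the forward-edge map into one loop. The following standalone sketch shows the same shape, with hypothetical names (`GraphStep`, `InvalidGraphSketch`, `build_graph_index`) in place of the gem's classes. The work is proportional to the number of steps plus the number of dependency edges.

```ruby
# Illustration only: hypothetical names, not the gem's classes.
class InvalidGraphSketch < StandardError; end

GraphStep = Struct.new(:key, :dependencies)

# One pass builds both the key index and the forward edges while
# checking for duplicate keys and self-dependencies.
def build_graph_index(steps)
  steps_by_key = {}
  forward_edges = Hash.new { |hash, key| hash[key] = [] }

  steps.each do |step|
    raise InvalidGraphSketch, "duplicate step key :#{step.key}" if steps_by_key.key?(step.key)

    step.dependencies.each do |dependency_key|
      raise InvalidGraphSketch, "step :#{step.key} depends on itself" if dependency_key == step.key

      forward_edges[dependency_key] << step.key
    end

    steps_by_key[step.key] = step
  end

  [steps_by_key, forward_edges]
end

steps = [
  GraphStep.new(:fetch, []),
  GraphStep.new(:parse, [:fetch]),
  GraphStep.new(:store, [:parse])
]
index, edges = build_graph_index(steps)
p index.keys
p edges
```

Unknown references cannot be rejected inside this loop, because a dependency may name a step that is defined later. That is consistent with the diff keeping `check_unknown_references!` as a separate step after `build_index!`.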
@@ -0,0 +1,10 @@
1
+ # frozen_string_literal: true
2
+
3
+ module GoodPipeline
4
+ module Haltable
5
+ def halt_pipeline!
6
+ step = GoodPipeline::StepRecord.find_by(good_job_id: provider_job_id)
7
+ step&.update_columns(halt_requested: true, updated_at: Time.current)
8
+ end
9
+ end
10
+ end
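Reviewer note: `halt_pipeline!` looks the step up by `provider_job_id`. If that value can ever be `nil` (for example when a job runs via `perform_now`; this is an assumption about the runtime, not something the diff states), a `find_by(good_job_id: nil)` lookup would match a row whose `good_job_id` is still NULL, such as a pending step. The following plain-Ruby sketch shows a guarded lookup. `StepRow`, `StepStore`, and `request_halt` are hypothetical in-memory stand-ins, not the gem's code.

```ruby
# Illustration only: an in-memory stand-in for the steps table.
StepRow = Struct.new(:key, :good_job_id, :halt_requested)

class StepStore
  def initialize(rows)
    @rows = rows
  end

  # Mirrors a lookup where a nil argument matches rows with no job ID.
  def find_by_job_id(job_id)
    @rows.find { |row| row.good_job_id == job_id }
  end
end

def request_halt(store, provider_job_id)
  # Guard: without a job ID there is no step to look up.
  return false if provider_job_id.nil?

  row = store.find_by_job_id(provider_job_id)
  return false unless row

  row.halt_requested = true
  true
end

store = StepStore.new([
  StepRow.new("halt_step", "job-1", false),
  StepRow.new("after_step", nil, false) # pending, not enqueued yet
])

p request_halt(store, "job-1")   # flags halt_step
p request_halt(store, nil)       # no-op, pending row stays untouched
p request_halt(store, "unknown") # no-op, no matching step
```

Whether the released code needs such a guard depends on how and where jobs are executed. The sketch only illustrates the failure mode and one way to rule it out.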
@@ -103,11 +103,10 @@ module GoodPipeline
       @branch_context_stack = []
       @building = true
       configure(**kwargs)
-      GraphValidator.validate!(@step_definitions)
+      @steps_by_key = GraphValidator.validate!(@step_definitions).freeze
       @step_definitions.freeze
       @branch_aliases.freeze
       @building = false
-      @steps_by_key = @step_definitions.to_h { |step| [step.key, step] }.freeze
       @root_steps = @step_definitions.select { |step| step.dependencies.empty? }.freeze
       freeze
     end
@@ -118,7 +117,7 @@ module GoodPipeline
       raise NotImplementedError, "#{self.class} must implement #configure"
     end

-    def run(key, job_class, with: {}, after: [], on_failure: nil, enqueue: {}) # rubocop:disable Metrics/MethodLength
+    def run(key, job_class, with: EMPTY_HASH, after: EMPTY_ARRAY, on_failure: nil, enqueue: EMPTY_HASH) # rubocop:disable Metrics/MethodLength
       raise ConfigurationError, "run can only be called inside configure" unless @building

       expanded_after = expand_branch_aliases(after)
@@ -186,6 +185,8 @@ module GoodPipeline
     # NOTE: Single-level expansion only. If nested branches are added in the future,
     # this must become recursive to expand inner branch aliases.
     def expand_branch_aliases(dependencies)
+      return Array(dependencies) if @branch_aliases.empty?
+
       Array(dependencies).flat_map { |dependency| @branch_aliases.fetch(dependency, [dependency]) }
     end
   end
@@ -10,61 +10,81 @@ module GoodPipeline
       @pipeline = pipeline_instance
     end

-    def call(start: true) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/MethodLength
+    def call(start: true) # rubocop:disable Metrics/MethodLength
+      pipeline_id = SecureRandom.uuid
       pipeline_record = nil
-      step_records = {}
-
-      PipelineRecord.transaction do # rubocop:disable Metrics/BlockLength
-        pipeline_record = PipelineRecord.create!(
-          type: @pipeline.class.name,
-          params: @pipeline.params,
-          status: :pending,
-          on_failure_strategy: @pipeline.failure_strategy.to_s
-        )
-
-        # Two passes: create all step records first, then dependencies.
-        # Branch steps may appear after their dependents in step_definitions.
-        @pipeline.step_definitions.each do |step_definition|
-          step_records[step_definition.key] = StepRecord.create!(
-            pipeline: pipeline_record,
-            key: step_definition.key.to_s,
-            job_class: resolve_job_class(step_definition),
-            params: step_definition.params,
-            on_failure_strategy: step_definition.failure_strategy&.to_s,
-            enqueue_options: step_definition.enqueue_options,
-            branch: build_branch_hash(step_definition)
-          )
-        end
+      step_id_by_key = {}

-        @pipeline.step_definitions.each do |step_definition| # rubocop:disable Style/CombinableLoops
-          step_definition.dependencies.each do |dependency_key|
-            DependencyRecord.create!(
-              pipeline: pipeline_record,
-              step: step_records[step_definition.key],
-              depends_on_step: step_records[dependency_key]
-            )
-          end
-        end
+      PipelineRecord.transaction do
+        batch = create_pipeline_batch(pipeline_id)
+        pipeline_record = create_pipeline_record(pipeline_id, batch.id, start: start)
+        step_id_by_key = insert_steps(pipeline_record)
+        insert_dependencies(pipeline_record, step_id_by_key)
+      end
+
+      enqueue_root_steps(step_id_by_key) if start
+
+      pipeline_record
+    end
+
+    private

-        pipeline_batch = GoodJob::Batch.new
-        pipeline_batch.on_finish = "GoodPipeline::PipelineReconciliationJob"
-        pipeline_batch.properties = { pipeline_id: pipeline_record.id }
-        pipeline_batch.save
-        pipeline_record.update_column(:good_job_batch_id, pipeline_batch.id)
+    def create_pipeline_batch(pipeline_id)
+      batch = GoodJob::Batch.new
+      batch.on_finish = "GoodPipeline::PipelineReconciliationJob"
+      batch.properties = { pipeline_id: pipeline_id }
+      batch.save
+      batch
+    end
+
+    def create_pipeline_record(pipeline_id, batch_id, start:)
+      PipelineRecord.create!(
+        id: pipeline_id,
+        type: @pipeline.class.name,
+        params: @pipeline.params,
+        status: start ? :running : :pending,
+        on_failure_strategy: @pipeline.failure_strategy.to_s,
+        good_job_batch_id: batch_id
+      )
+    end

-        pipeline_record.transition_to!(:running) if start
+    def insert_steps(pipeline_record) # rubocop:disable Metrics/AbcSize,Metrics/MethodLength
+      step_rows = @pipeline.step_definitions.map do |step_definition|
+        {
+          pipeline_id: pipeline_record.id,
+          key: step_definition.key.to_s,
+          job_class: resolve_job_class(step_definition),
+          params: step_definition.params,
+          on_failure_strategy: step_definition.failure_strategy&.to_s,
+          enqueue_options: step_definition.enqueue_options,
+          branch: build_branch_hash(step_definition),
+          pending_upstream_count: step_definition.dependencies.size
+        }
       end

-      if start
-        @pipeline.root_steps.each do |step_definition|
-          Coordinator.try_enqueue_step(step_records[step_definition.key].id)
+      result = StepRecord.insert_all!(step_rows, returning: %w[id key])
+      result.rows.each_with_object({}) { |(id, key), hash| hash[key.to_sym] = id }
+    end
+
+    def insert_dependencies(pipeline_record, step_id_by_key)
+      dependency_rows = @pipeline.step_definitions.flat_map do |step_definition|
+        step_definition.dependencies.map do |dependency_key|
+          {
+            pipeline_id: pipeline_record.id,
+            step_id: step_id_by_key[step_definition.key],
+            depends_on_step_id: step_id_by_key[dependency_key]
+          }
         end
       end

-      pipeline_record
+      DependencyRecord.insert_all!(dependency_rows) if dependency_rows.any?
     end

-    private
+    def enqueue_root_steps(step_id_by_key)
+      @pipeline.root_steps.each do |step_definition|
+        Coordinator.try_enqueue_step(step_id_by_key[step_definition.key])
+      end
+    end

     def resolve_job_class(step_definition)
       step_definition.job_class.is_a?(String) ? step_definition.job_class : step_definition.job_class.name
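Note on the runner hunk above: per-row `create!` calls are replaced by two bulk inserts, with the rows returned by the first insert turned into a key-to-id map that the second insert uses to build dependency rows. The sketch below walks through that mapping on plain arrays and hashes so it runs without a database. The ids, keys, and the `dependencies` hash are made-up sample data, and `rows` merely imitates the `[id, key]` shape requested via `returning: %w[id key]`.

```ruby
# Hypothetical rows in [id, key] order, as the bulk step insert would return them.
rows = [[101, "download"], [102, "transcode"], [103, "publish"]]

# Same mapping as in the diff: string keys from the database become symbols.
step_id_by_key = rows.each_with_object({}) { |(id, key), hash| hash[key.to_sym] = id }

# Made-up dependency graph: step key => keys it depends on.
dependencies = { download: [], transcode: [:download], publish: [:transcode] }

# One row per edge, resolved to ids, ready for a single bulk insert.
dependency_rows = dependencies.flat_map do |step_key, dependency_keys|
  dependency_keys.map do |dependency_key|
    { step_id: step_id_by_key[step_key], depends_on_step_id: step_id_by_key[dependency_key] }
  end
end
```

Since every step id is known before any dependency row is built, a step may reference a dependency that appears later in the definition list, which covers the ordering concern that the removed "Two passes" comment described.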
@@ -4,11 +4,29 @@ module GoodPipeline
   class StepDefinition
     SUPPORTED_ENQUEUE_OPTIONS = %i[queue priority wait good_job_labels good_job_notify].freeze

-    attr_reader :key, :job_class, :params, :dependencies, :failure_strategy, :enqueue_options,
-                :branch_key, :branch_arm, :decides, :empty_arms
+    attr_reader :key,
+                :job_class,
+                :params,
+                :dependencies,
+                :failure_strategy,
+                :enqueue_options,
+                :branch_key,
+                :branch_arm,
+                :decides,
+                :empty_arms

-    def initialize(key:, job_class:, params: {}, dependencies: [], failure_strategy: nil, enqueue_options: {}, # rubocop:disable Metrics/MethodLength
-                   branch_key: nil, branch_arm: nil, decides: nil, empty_arms: [])
+    def initialize( # rubocop:disable Metrics/MethodLength
+      key:,
+      job_class:,
+      params: EMPTY_HASH,
+      dependencies: EMPTY_ARRAY,
+      failure_strategy: nil,
+      enqueue_options: EMPTY_HASH,
+      branch_key: nil,
+      branch_arm: nil,
+      decides: nil,
+      empty_arms: EMPTY_ARRAY
+    )
       @key = key
       @job_class = job_class
       @params = params.freeze
@@ -37,6 +55,8 @@ module GoodPipeline
     end

     def validate_enqueue_options!(options)
+      return if options.empty?
+
       unsupported = options.keys.map(&:to_sym) - SUPPORTED_ENQUEUE_OPTIONS
       return if unsupported.empty?

@@ -1,5 +1,5 @@
 # frozen_string_literal: true

 module GoodPipeline
-  VERSION = "0.2.2"
+  VERSION = "0.3.1"
 end
data/lib/good_pipeline.rb CHANGED
@@ -1,6 +1,7 @@
 # frozen_string_literal: true

 require_relative "good_pipeline/version"
+require_relative "good_pipeline/constants"
 require_relative "good_pipeline/errors"
 require_relative "good_pipeline/step_definition"
 require_relative "good_pipeline/branch_builder"
@@ -12,6 +13,7 @@ require_relative "good_pipeline/branch_resolver"
 require_relative "good_pipeline/coordinator"
 require_relative "good_pipeline/chain_coordinator"
 require_relative "good_pipeline/runner"
+require_relative "good_pipeline/haltable"
 require_relative "good_pipeline/chain"
 require_relative "good_pipeline/engine" if defined?(Rails::Engine)

metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: good_pipeline
 version: !ruby/object:Gem::Version
-  version: 0.2.2
+  version: 0.3.1
 platform: ruby
 authors:
 - Ali Hamdi Ali Fadel
@@ -93,6 +93,7 @@ files:
 - demo/app/jobs/cleanup_job.rb
 - demo/app/jobs/download_job.rb
 - demo/app/jobs/failing_job.rb
+- demo/app/jobs/halt_execution_job.rb
 - demo/app/jobs/publish_job.rb
 - demo/app/jobs/retryable_job.rb
 - demo/app/jobs/thumbnail_job.rb
@@ -135,6 +136,7 @@ files:
 - demo/test/integration/test_concurrent_fan_in.rb
 - demo/test/integration/test_end_to_end.rb
 - demo/test/integration/test_enqueue_atomicity.rb
+- demo/test/integration/test_halt_execution.rb
 - demo/test/integration/test_halt_ignore_chain.rb
 - demo/test/integration/test_ignore_transitive_exemption.rb
 - demo/test/integration/test_late_chain_registration.rb
@@ -177,12 +179,14 @@ files:
 - lib/good_pipeline/branch_resolver.rb
 - lib/good_pipeline/chain.rb
 - lib/good_pipeline/chain_coordinator.rb
+- lib/good_pipeline/constants.rb
 - lib/good_pipeline/coordinator.rb
 - lib/good_pipeline/cycle_detector.rb
 - lib/good_pipeline/engine.rb
 - lib/good_pipeline/errors.rb
 - lib/good_pipeline/failure_metadata.rb
 - lib/good_pipeline/graph_validator.rb
+- lib/good_pipeline/haltable.rb
 - lib/good_pipeline/pipeline.rb
 - lib/good_pipeline/runner.rb
 - lib/good_pipeline/step_definition.rb