chrono_forge 0.8.0 → 0.9.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 518b82abfe5b0061b385b4293ae955eb6a188c1e3bf76f4763e5f03e13359771
4
- data.tar.gz: ca59809649371db8eaa1e6eb8579f7bc34d5471f19f065e908b9585b9fb0e2fb
3
+ metadata.gz: 1d7b16cc9e00eb7cc95b21a13331c9bc2dffdfcbd4f4e060c03e876f261fa73c
4
+ data.tar.gz: 4f9b0b28f7f69f518898ffc1e591087096ac71d84547d4c185c07cc6f0a11643
5
5
  SHA512:
6
- metadata.gz: ae7de522ddbaa15625fa38109a63e28167b1831e635e78d95f2f9c0e9d6ff8117418a972dfbeda1555f2059c4d33b5faa286bbd1a07132c6a157c71f3194839d
7
- data.tar.gz: ee9a38551628d68e0efb02ad0b4d6f42d97fec92d776e8a394a64c5b05efcac9608cd9f0bb6897b43c8195b31a17e1cb9fb475e44b67a69cda94e898fc59f7f7
6
+ metadata.gz: 208180ae5d6fbe4b3ad30b05f1a60eb231f9bd3135304211ff7861f50e546eb32815ab5e8307e9c910cd1c5a7bdf951777635304ed7c629a7a62f8740df21c85
7
+ data.tar.gz: e3fac31dc46d1b126e8de982b02fe70d37ac7ffd6233ec643f20f93296237029640e67bb004cfaf7e71d5f60429cbc4702d4efe40951c48915e3bc08e561ec51
data/CHANGELOG.md CHANGED
@@ -1,5 +1,44 @@
1
1
  ## [Unreleased]
2
2
 
3
+ ## [0.9.0] - 2026-06-03
4
+
5
+ ### Added
6
+
7
+ - `ChronoForge::Cleanup` and `ChronoForge::CleanupJob` — a schedulable, batched cleanup that deletes old terminal (completed/failed) workflows and their logs, and (opt-in) prunes the unbounded repetition logs that long-lived `durably_repeat` tasks accumulate. Repetition pruning is frontier-safe: only terminal repetitions scheduled strictly before the coordination log's `last_execution_at` are removed, so catch-up is never disrupted. Retention is configurable per terminal state (`completed_older_than` / `failed_older_than`).
8
+ - `chrono_forge:upgrade` generator that installs additive migrations existing apps are missing (idempotent — re-running either generator skips migrations that already exist).
9
+ - Composite `[state, completed_at]` index on `chrono_forge_workflows` (separate, strong_migrations-safe migration: built `CONCURRENTLY` on PostgreSQL, `if_not_exists`) to keep monitoring and cleanup scans efficient.
10
+ - Validation of user-supplied step names: a name/method/condition containing the reserved `$` separator now raises `ChronoForge::Executor::InvalidStepName`.
11
+ - `step_name` and `attempt` columns on `chrono_forge_error_logs` (additive migration), populated by error tracking so each error is attributable to the step and attempt it came from and can be ordered/correlated when tailing a workflow.
12
+ - Record-level re-execution: `ChronoForge::Workflow#retry_now` / `#retry_later` (plus `#retryable?`), so a failed/stalled workflow can be re-run straight from its record (e.g. `ChronoForge::Workflow.failed.find_each(&:retry_later)`) without constantizing the job class or re-passing the key. `retry_later` validates retryability up front and raises `WorkflowNotRetryableError` immediately instead of enqueuing a job that would fail in the worker.
13
+
14
+ ### Changed
15
+
16
+ - **Performance:** execution-log and workflow lookups are now SELECT-first instead of INSERT-first, eliminating an `INSERT`-that-fails-on-the-unique-index (plus a burned sequence value) for every already-completed step on every replay.
17
+ - **Performance:** `LockStrategy.release_lock` reads only the lock owner column instead of reloading the full workflow row (which dragged the large JSON `context`/`kwargs`/`options` into memory on every resume).
18
+ - **Performance:** workflow completion/failure persist their execution log in a single `UPDATE` instead of two.
19
+ - **Performance:** `Context` deep-copies values via `as_json` instead of a `JSON.parse(JSON.generate(...))` round-trip.
20
+ - Error-log context snapshots are now bounded: all keys are kept, but any value that no longer fits in the 64 KB total budget is replaced with an `<<omitted>>` marker, so repeated error logging no longer duplicates large context blobs.
21
+ - Workflow retention is measured from when a workflow became terminal (`completed_at` for completed, `updated_at` for failed), not from `created_at` — long-running workflows that only just finished are retained for the full window.
22
+
23
+ ### Breaking
24
+
25
+ - The per-value `Context` size limit is reduced from 64 KB to **16 KB** and is now measured in **bytes** (previously characters, and `String`-only). `Hash` and `Array` values are now size-validated too. Context is intended for small working state; store large payloads elsewhere and keep a reference. Existing workflows that *write* values larger than 16 KB will raise `ChronoForge::Executor::Context::ValidationError`; already-stored values are unaffected when read.
26
+
27
+ ### Fixed
28
+
29
+ - A failed step no longer logs its terminal failure twice. Previously the step logged the underlying error and `perform` re-logged the `ExecutionFailedError` control-flow wrapper, producing a duplicate row. The wrapper is no longer logged; `wait_until` timeouts (which had no step-level log) are now logged at the step instead.
30
+
31
+ ### Removed
32
+
33
+ - Dead `serialize :metadata` declaration on `ChronoForge::Workflow` (the table has no `metadata` column).
34
+
35
+ ### Upgrading
36
+
37
+ ```bash
38
+ rails generate chrono_forge:upgrade
39
+ rails db:migrate
40
+ ```
41
+
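The `$` validation listed under Added can be pictured with a minimal sketch. The diff only shows calls to `validate_step_name_segment!`, not its body, so the error class, constant, and method bodies below are hypothetical stand-ins. The idea is that step names are built by joining segments with `$`, so a user-supplied segment containing `$` would make the name ambiguous.

```ruby
# Hypothetical stand-in for ChronoForge::Executor::InvalidStepName.
class InvalidStepName < ArgumentError; end

# Reserved separator: step names look like "durably_repeat$<name>$<timestamp>".
SEPARATOR = "$"

# Assumed behaviour: reject any user-supplied segment containing the separator.
def validate_step_name_segment!(segment)
  return segment unless segment.to_s.include?(SEPARATOR)

  raise InvalidStepName, "step name segment #{segment.inspect} must not contain #{SEPARATOR.inspect}"
end

# Builds a step name the way the executor methods in this diff do,
# validating the user-supplied part first.
def step_name_for(kind, segment)
  validate_step_name_segment!(segment)
  "#{kind}#{SEPARATOR}#{segment}"
end

step_name_for("wait", :cooldown) # a valid name; "a$b" would raise InvalidStepName
```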
3
42
  ## [0.1.0] - 2024-12-21
4
43
 
5
44
  - Initial release
data/README.md CHANGED
@@ -19,6 +19,7 @@ ChronoForge provides a powerful solution for handling long-running processes, ma
19
19
  - **Wait States**: Support for time-based waits and condition-based waiting
20
20
  - **Database-Backed**: All workflow state is persisted to ensure durability
21
21
  - **ActiveJob Integration**: Compatible with all ActiveJob backends, though database-backed processors (like Solid Queue) provide the most reliable experience for long-running workflows
22
+ - **Retention & Cleanup**: A schedulable job to prune finished workflows and the unbounded logs that periodic tasks accumulate (see [Cleanup & Retention](#-cleanup--retention))
22
23
 
23
24
  ## 📦 Installation
24
25
 
@@ -47,6 +48,21 @@ $ rails generate chrono_forge:install
47
48
  $ rails db:migrate
48
49
  ```
49
50
 
51
+ ### Upgrading
52
+
53
+ When upgrading ChronoForge in an application that was installed with an earlier
54
+ version, run the upgrade generator to pick up any additive schema changes, then
55
+ migrate:
56
+
57
+ ```bash
58
+ $ rails generate chrono_forge:upgrade
59
+ $ rails db:migrate
60
+ ```
61
+
62
+ The upgrade migration is idempotent (`if_not_exists`), so it is safe to run even
63
+ if your schema already has the index. Fresh installs get the index from the
64
+ install migration and do **not** need to run the upgrade.
65
+
50
66
  ## 📋 Usage
51
67
 
52
68
  ### Creating and Executing Workflows
@@ -431,6 +447,18 @@ end
431
447
 
432
448
  The context supports serializable Ruby objects (Hash, Array, String, Integer, Float, Boolean, and nil) and validates types automatically.
433
449
 
450
+ Hash and Array values are stored as JSON, which has no symbols — so **symbol keys inside a stored hash come back as strings**:
451
+
452
+ ```ruby
453
+ context[:totals] = { paid: 5, pending: 2 }
454
+ context[:totals] # => { "paid" => 5, "pending" => 2 }
455
+ context[:totals]["paid"] # => 5 (not context[:totals][:paid])
456
+ ```
457
+
458
+ (The top-level context key itself is interchangeable — `context[:totals]` and `context["totals"]` refer to the same entry.)
459
+
460
+ Context is meant for **small working state** — ids, flags, timestamps, and small structures used to coordinate steps. Each value is capped at **16 KB** (a `ChronoForge::Executor::Context::ValidationError` is raised above that). Store large payloads (documents, uploads, API responses) in their own storage and keep just a reference (an id or key) in the context.
461
+
434
462
  ### 🛡️ Error Handling
435
463
 
436
464
  ChronoForge automatically tracks errors and provides configurable retry capabilities:
@@ -581,20 +609,32 @@ stateDiagram-v2
581
609
 
582
610
  #### Recovering Stalled/Failed Workflows
583
611
 
612
+ Re-execute a failed or stalled workflow directly from its record — no need to
613
+ constantize the job class or re-pass the key. Execution resumes via replay, so
614
+ completed steps are skipped and it picks up at the step that failed:
615
+
584
616
  ```ruby
585
617
  workflow = ChronoForge::Workflow.find_by(key: "order-123")
586
618
 
587
- if workflow.stalled? || workflow.failed?
588
- job_class = workflow.job_class.constantize
589
-
590
- # Retry immediately
591
- job_class.retry_now(workflow.key)
592
-
593
- # Or retry asynchronously
594
- job_class.retry_later(workflow.key)
595
- end
619
+ workflow.retry_later # re-run asynchronously (the common case)
620
+ workflow.retry_now # re-run inline (console/debugging)
596
621
  ```
597
622
 
623
+ Only `stalled` or `failed` workflows are retryable. `retryable?` lets you check
624
+ first, and both methods **validate up front** — calling `retry_later`
625
+ on a non-retryable workflow raises `ChronoForge::Executor::WorkflowNotRetryableError`
626
+ immediately rather than enqueuing a job that would fail in the worker:
627
+
628
+ ```ruby
629
+ workflow.retryable? # => true/false
630
+
631
+ # Bulk re-run everything that failed:
632
+ ChronoForge::Workflow.failed.find_each(&:retry_later)
633
+ ```
634
+
635
+ The class-level form (`MyWorkflow.retry_now(key)` / `retry_later(key)`) still
636
+ works if you have the class and key rather than the record.
637
+
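The "validate up front" behaviour described above can be sketched without Rails. The class, error, and queue below are hypothetical stand-ins rather than the gem's implementation. They only illustrate checking retryability before anything is enqueued.

```ruby
# Hypothetical stand-in for ChronoForge::Executor::WorkflowNotRetryableError.
class WorkflowNotRetryableError < StandardError; end

RETRYABLE_STATES = %w[stalled failed].freeze

class WorkflowRecord
  attr_reader :key, :state

  def initialize(key:, state:, queue:)
    @key = key
    @state = state
    @queue = queue # any object responding to <<, standing in for the job backend
  end

  def retryable?
    RETRYABLE_STATES.include?(state)
  end

  # Check retryability before enqueuing, so the caller sees the error
  # immediately instead of a job failing later in a worker.
  def retry_later
    raise WorkflowNotRetryableError, "#{key} is #{state}" unless retryable?

    @queue << key
    self
  end
end

queue = []
WorkflowRecord.new(key: "order-123", state: "failed", queue: queue).retry_later
```

In this sketch a `completed` record raises before the queue is touched, which is the property the README text describes for `retry_later`.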
598
638
  #### Monitoring Running Workflows
599
639
 
600
640
  Long-running workflows might indicate issues:
@@ -614,6 +654,78 @@ long_running.each do |workflow|
614
654
  end
615
655
  ```
616
656
 
657
+ ## 🧹 Cleanup & Retention
658
+
659
+ ChronoForge keeps every workflow and execution-log row indefinitely so that
660
+ replays remain idempotent. Over time two things grow without bound:
661
+
662
+ 1. **Terminal workflows** (`completed` / `failed`) that are no longer needed.
663
+ 2. **`durably_repeat` repetition logs** — one row per scheduled execution. A
664
+ long-lived periodic workflow never reaches a terminal state, so its
665
+ repetition logs accumulate indefinitely. Past repetitions (those behind the
666
+ task's current frontier) are never read again, since each resume recomputes
667
+ the next execution from the coordination log — so they are safe to prune (see
668
+ the safety note below).
669
+
670
+ `ChronoForge::Cleanup` reclaims both. It is **not** run automatically — schedule
671
+ it from your own scheduler so you stay in control of retention:
672
+
673
+ ```ruby
674
+ ChronoForge::Cleanup.run(
675
+ older_than: 90.days, # default retention for terminal workflows (+ cascades their logs)
676
+ completed_older_than: 30.days, # optional: retention for completed workflows (defaults to older_than)
677
+ failed_older_than: 180.days, # optional: keep failures longer for debugging (defaults to older_than)
678
+ prune_repetition_logs_older_than: 30.days, # opt-in: prune old durably_repeat logs from still-active workflows
679
+ batch_size: 1_000 # rows deleted per batch
680
+ )
681
+ # => { workflows: 12, execution_logs: 84, error_logs: 3, repetition_logs: 240 }
682
+ ```
683
+
684
+ Notes:
685
+
686
+ - `running`, `idle`, and `stalled` workflows are **never** deleted.
687
+ - `completed_older_than` / `failed_older_than` let you keep failed workflows
688
+ around longer than completed ones; both default to `older_than`.
689
+ - `prune_repetition_logs_older_than` is opt-in (defaults to `nil`); when unset,
690
+ repetition logs are only removed as part of deleting their parent workflow.
691
+ Pruning is deliberately conservative: it only removes terminal repetition logs
692
+ that are both older than the window **and** scheduled strictly before the
693
+ periodic task's current frontier (the coordination log's `last_execution_at`).
694
+ Anything at or after the frontier is kept so `durably_repeat`'s catch-up
695
+ mechanism is never disrupted — so the window is purely a retention preference
696
+ and is safe even for yearly schedules.
697
+ - Workflow retention is measured from when a workflow became terminal, not when
698
+ it was created — a long-running workflow that only just finished is kept for
699
+ the full window. Completed workflows use `completed_at` (immutable); failed
700
+ workflows use `updated_at` (they have no `completed_at`).
701
+ - The composite `[state, completed_at]` index added in this version keeps these
702
+ scans efficient — run `chrono_forge:upgrade` if you installed an earlier
703
+ version.
704
+
705
+ A ready-made job is bundled so you can schedule it with any recurring-job
706
+ mechanism (Solid Queue recurring tasks, sidekiq-cron, GoodJob cron, the
707
+ `whenever` gem, ...):
708
+
709
+ ```ruby
710
+ ChronoForge::CleanupJob.perform_later(
711
+ older_than_days: 90,
712
+ failed_older_than_days: 180,
713
+ prune_repetition_logs_older_than_days: 30
714
+ )
715
+ ```
716
+
717
+ The job takes plain day counts (not `Duration` objects) so it can be driven from
718
+ a config file. For example, with Solid Queue's recurring tasks
719
+ (`config/recurring.yml`):
720
+
721
+ ```yaml
722
+ production:
723
+ chrono_forge_cleanup:
724
+ class: ChronoForge::CleanupJob
725
+ args: { older_than_days: 90, prune_repetition_logs_older_than_days: 30 }
726
+ schedule: every day at 3am
727
+ ```
728
+
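The retention rule in the notes above (measured from the terminal transition, never from `created_at`) can be sketched in plain Ruby. `Row`, `TERMINAL_TIMESTAMP`, and `expired?` are illustrative names, not part of the gem, and durations are plain seconds instead of `ActiveSupport::Duration`.

```ruby
Row = Struct.new(:state, :created_at, :completed_at, :updated_at)

# Which timestamp marks the terminal transition for each terminal state.
TERMINAL_TIMESTAMP = {"completed" => :completed_at, "failed" => :updated_at}.freeze

def expired?(row, retention_seconds, now)
  column = TERMINAL_TIMESTAMP[row.state]
  return false unless column # running/idle/stalled rows are never eligible

  terminal_at = row.public_send(column)
  !terminal_at.nil? && terminal_at <= now - retention_seconds
end

day = 86_400
now = Time.at(1_000_000_000)

# Created 400 days ago but completed yesterday: kept for the full 90-day window.
just_finished = Row.new("completed", now - 400 * day, now - day, now - day)
expired?(just_finished, 90 * day, now) # => false
```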
617
729
  ## 🚀 Development
618
730
 
619
731
  After checking out the repo, run:
@@ -0,0 +1,154 @@
1
+ module ChronoForge
2
+ # Reclaims storage from finished workflows and the unbounded execution-log
3
+ # rows that periodic tasks (durably_repeat) accumulate.
4
+ #
5
+ # ChronoForge keeps every workflow and execution-log row indefinitely so that
6
+ # replays stay idempotent. Two things grow without bound over time:
7
+ #
8
+ # 1. Terminal workflows (completed/failed) that are no longer needed.
9
+ # 2. durably_repeat repetition logs — one row per scheduled execution. A
10
+ # long-lived periodic workflow never reaches a terminal state, so its
11
+ # repetition logs accumulate forever.
12
+ #
13
+ # This is not run automatically — schedule it from your own scheduler (cron,
14
+ # Solid Queue recurring tasks, sidekiq-cron, GoodJob cron, the `whenever`
15
+ # gem, ...). See ChronoForge::CleanupJob for a ready-made job, e.g.:
16
+ #
17
+ # ChronoForge::Cleanup.run(
18
+ # older_than: 90.days, # default retention for terminal workflows
19
+ # failed_older_than: 180.days, # keep failures longer for debugging
20
+ # prune_repetition_logs_older_than: 30.days # opt in to periodic-log pruning
21
+ # )
22
+ #
23
+ # == Workflow retention is measured from when a workflow became terminal
24
+ #
25
+ # Retention is measured from the terminal transition, not created_at: a
26
+ # long-running workflow may have been created long ago but only just finished.
27
+ # Completed workflows use the immutable completed_at; failed workflows have no
28
+ # completed_at, so they use updated_at (the failed! transition, after which
29
+ # nothing touches the row — release_lock/context use update_columns/
30
+ # update_column, which do not bump it).
31
+ #
32
+ # == Repetition-log pruning safety
33
+ #
34
+ # Pruning periodic logs is opt-in and deliberately conservative. A repetition
35
+ # log is removed only when its scheduled time is BOTH older than the retention
36
+ # window AND strictly before the periodic task's current frontier (the
37
+ # coordination log's last_execution_at). Everything at or after the frontier is
38
+ # kept, because durably_repeat's catch-up mechanism may still need it: the next
39
+ # execution is computed as last_execution_at + every, so anything at/after the
40
+ # frontier can still be revisited, while anything strictly before it never is.
41
+ # Both checks use the scheduled time embedded in the step name rather than
42
+ # created_at, which is misleading for catch-up rows created long after the
43
+ # occurrence they represent. A task that has not executed yet (no frontier) is
44
+ # never pruned.
45
+ class Cleanup
46
+ DEFAULT_RETENTION = 90.days
47
+ DEFAULT_BATCH_SIZE = 1_000
48
+ TERMINAL_LOG_STATES = %i[completed failed].freeze
49
+
50
+ # @param older_than [ActiveSupport::Duration] default retention for terminal
51
+ # workflows; used for any state without a specific override.
52
+ # @param completed_older_than [ActiveSupport::Duration, nil] retention for
53
+ # completed workflows. Defaults to older_than.
54
+ # @param failed_older_than [ActiveSupport::Duration, nil] retention for
55
+ # failed workflows. Defaults to older_than.
56
+ # @param prune_repetition_logs_older_than [ActiveSupport::Duration, nil]
57
+ # when set, also prune old terminal durably_repeat repetition logs from
58
+ # still-active workflows (see safety notes above). nil disables it.
59
+ # @param batch_size [Integer] rows per delete batch.
60
+ # @return [Hash] counts of deleted rows by category.
61
+ def self.run(**)
62
+ new(**).run
63
+ end
64
+
65
+ def initialize(older_than: DEFAULT_RETENTION, completed_older_than: nil, failed_older_than: nil,
66
+ prune_repetition_logs_older_than: nil, batch_size: DEFAULT_BATCH_SIZE)
67
+ @completed_older_than = completed_older_than || older_than
68
+ @failed_older_than = failed_older_than || older_than
69
+ @prune_repetition_logs_older_than = prune_repetition_logs_older_than
70
+ @batch_size = batch_size
71
+ end
72
+
73
+ def run
74
+ result = {workflows: 0, execution_logs: 0, error_logs: 0, repetition_logs: 0}
75
+
76
+ # Completed workflows use the immutable completed_at; failed workflows
77
+ # have no completed_at, so they fall back to updated_at.
78
+ delete_terminal_workflows(:completed, :completed_at, @completed_older_than, result)
79
+ delete_terminal_workflows(:failed, :updated_at, @failed_older_than, result)
80
+ prune_repetition_logs(result) if @prune_repetition_logs_older_than
81
+
82
+ result
83
+ end
84
+
85
+ private
86
+
87
+ def delete_terminal_workflows(state, timestamp_column, older_than, result)
88
+ cutoff = older_than.ago
89
+
90
+ Workflow.where(state: state)
91
+ .where(timestamp_column => ..cutoff)
92
+ .in_batches(of: @batch_size) do |batch|
93
+ ids = batch.ids
94
+ next if ids.empty?
95
+
96
+ # Delete dependent rows in bulk rather than relying on row-by-row
97
+ # dependent: :destroy callbacks.
98
+ result[:execution_logs] += ExecutionLog.where(workflow_id: ids).delete_all
99
+ result[:error_logs] += ErrorLog.where(workflow_id: ids).delete_all
100
+ result[:workflows] += Workflow.where(id: ids).delete_all
101
+ end
102
+ end
103
+
104
+ def prune_repetition_logs(result)
105
+ cutoff = @prune_repetition_logs_older_than.ago.to_i
106
+
107
+ coordination_logs.find_each do |coordination_log|
108
+ frontier = coordination_frontier(coordination_log)
109
+ next unless frontier
110
+
111
+ # Repetition logs are "<coordination step_name>$<scheduled_at_unix>".
112
+ # Match the prefix exactly in Ruby rather than via SQL LIKE: the step
113
+ # name contains "_", a LIKE wildcard, so a LIKE pattern would need
114
+ # escaping that is not portable across adapters.
115
+ prefix = "#{coordination_log.step_name}$"
116
+
117
+ # Scan in batches so a periodic workflow with a large backlog of
118
+ # repetition logs (exactly the case cleanup exists to fix) never loads
119
+ # them all into memory at once. Batching by primary key and only
120
+ # deleting rows within the current batch keeps the cursor valid.
121
+ ExecutionLog
122
+ .where(workflow_id: coordination_log.workflow_id, state: TERMINAL_LOG_STATES)
123
+ .in_batches(of: @batch_size) do |batch|
124
+ prunable_ids = batch.pluck(:id, :step_name).filter_map do |id, step_name|
125
+ next unless step_name.start_with?(prefix)
126
+
127
+ scheduled_at = step_name.delete_prefix(prefix).to_i
128
+ id if scheduled_at < frontier && scheduled_at < cutoff
129
+ end
130
+
131
+ result[:repetition_logs] += ExecutionLog.where(id: prunable_ids).delete_all if prunable_ids.any?
132
+ end
133
+ end
134
+ end
135
+
136
+ # Coordination logs are "durably_repeat$<name>" — exactly one "$" segment
137
+ # after the prefix. Repetition logs add a second "$<timestamp>" segment.
138
+ def coordination_logs
139
+ ExecutionLog
140
+ .where("step_name LIKE ?", "durably_repeat$%")
141
+ .where.not("step_name LIKE ?", "durably_repeat$%$%")
142
+ .order(:id)
143
+ end
144
+
145
+ def coordination_frontier(coordination_log)
146
+ last_execution_at = coordination_log.metadata && coordination_log.metadata["last_execution_at"]
147
+ return unless last_execution_at
148
+
149
+ Time.parse(last_execution_at).to_i
150
+ rescue ArgumentError, TypeError
151
+ nil
152
+ end
153
+ end
154
+ end
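The frontier check in `prune_repetition_logs` can be exercised in isolation. The sketch below lifts the same prefix match and double comparison into a standalone method (`prunable_ids` is a name introduced here for illustration) and runs it over in-memory `[id, step_name]` pairs instead of database rows.

```ruby
# Returns the ids of repetition logs that are safe to prune.
# rows: array of [id, step_name]; frontier and cutoff are Unix timestamps.
def prunable_ids(rows, coordination_step_name, frontier:, cutoff:)
  prefix = "#{coordination_step_name}$"

  rows.filter_map do |id, step_name|
    next unless step_name.start_with?(prefix)

    scheduled_at = step_name.delete_prefix(prefix).to_i
    id if scheduled_at < frontier && scheduled_at < cutoff
  end
end

rows = [
  [1, "durably_repeat$sync"],        # coordination log itself: no "$<timestamp>" suffix
  [2, "durably_repeat$sync$100"],    # before both frontier and cutoff
  [3, "durably_repeat$sync$200"],    # at the frontier: kept
  [4, "durably_repeat$sync_all$50"], # different task whose name shares a prefix
  [5, "durably_execute$charge"]      # unrelated step
]

prunable_ids(rows, "durably_repeat$sync", frontier: 200, cutoff: 150) # => [2]
```

Matching on the full `"<step_name>$"` prefix is what keeps row 4 out: a task named `sync_all` does not start with `sync$`, so its logs are only considered when its own coordination log is processed.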
@@ -0,0 +1,30 @@
1
+ module ChronoForge
2
+ # ActiveJob wrapper around {Cleanup} so the cleanup can be enqueued and
3
+ # scheduled with any recurring-job mechanism (Solid Queue recurring tasks,
4
+ # sidekiq-cron, GoodJob cron, ...).
5
+ #
6
+ # Arguments are plain scalars (day counts) rather than ActiveSupport::Duration
7
+ # objects so the job can be configured from YAML/cron config files, which can
8
+ # only carry primitive values:
9
+ #
10
+ # ChronoForge::CleanupJob.perform_later(
11
+ # older_than_days: 90,
12
+ # failed_older_than_days: 180,
13
+ # prune_repetition_logs_older_than_days: 30
14
+ # )
15
+ class CleanupJob < ActiveJob::Base
16
+ def perform(older_than_days: nil, completed_older_than_days: nil, failed_older_than_days: nil,
17
+ prune_repetition_logs_older_than_days: nil, batch_size: nil)
18
+ options = {}
19
+ options[:older_than] = older_than_days.to_i.days if older_than_days
20
+ options[:completed_older_than] = completed_older_than_days.to_i.days if completed_older_than_days
21
+ options[:failed_older_than] = failed_older_than_days.to_i.days if failed_older_than_days
22
+ if prune_repetition_logs_older_than_days
23
+ options[:prune_repetition_logs_older_than] = prune_repetition_logs_older_than_days.to_i.days
24
+ end
25
+ options[:batch_size] = batch_size.to_i if batch_size
26
+
27
+ Cleanup.run(**options)
28
+ end
29
+ end
30
+ end
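The argument mapping in `perform` can be sketched without ActiveJob or ActiveSupport. This is an illustrative restatement, not the gem's code: `cleanup_options` is a hypothetical helper, and durations are expressed in seconds where the job uses `Integer#days`.

```ruby
SECONDS_PER_DAY = 86_400

# Maps "<option>_days" scalars to "<option>" durations (here: seconds),
# leaving out anything not supplied so the callee's defaults apply.
def cleanup_options(older_than_days: nil, completed_older_than_days: nil, failed_older_than_days: nil,
                    prune_repetition_logs_older_than_days: nil, batch_size: nil)
  day_counts = {
    older_than: older_than_days,
    completed_older_than: completed_older_than_days,
    failed_older_than: failed_older_than_days,
    prune_repetition_logs_older_than: prune_repetition_logs_older_than_days
  }

  options = day_counts.compact.transform_values { |days| days.to_i * SECONDS_PER_DAY }
  options[:batch_size] = batch_size.to_i if batch_size
  options
end

# Values read from YAML may arrive as strings; to_i normalizes them.
cleanup_options(older_than_days: "90", prune_repetition_logs_older_than_days: 30)
```

Omitted keys never reach the callee, so in the real job `Cleanup`'s own defaults (90 days, batches of 1,000) still apply.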
@@ -5,10 +5,12 @@
5
5
  # Table name: chrono_forge_error_logs
6
6
  #
7
7
  # id :integer not null, primary key
8
+ # attempt :integer
8
9
  # backtrace :text
9
10
  # context :json
10
11
  # error_class :string
11
12
  # error_message :text
13
+ # step_name :string
12
14
  # created_at :datetime not null
13
15
  # updated_at :datetime not null
14
16
  # workflow_id :integer not null
@@ -14,6 +14,17 @@ module ChronoForge
14
14
  Array
15
15
  ]
16
16
 
17
+ # Maximum serialized byte size of a single context value. Applies to the
18
+ # variable-length types (String, Hash, Array); scalars are not size-checked
19
+ # because they stay small in practice. Measured in bytes (not characters),
20
+ # since that is what is actually stored and what drives write/storage cost.
21
+ #
22
+ # Context is meant to hold small working state (ids, flags, timestamps,
23
+ # small structures) — not documents or payloads, which belong in their own
24
+ # storage and can be referenced from context by id. 16 KB per value is
25
+ # already generous for that (hundreds of ids / dozens of records).
26
+ MAX_VALUE_BYTESIZE = 16.kilobytes
27
+
17
28
  def initialize(workflow)
18
29
  @workflow = workflow
19
30
  @context = workflow.context || {}
@@ -67,7 +78,11 @@ module ChronoForge
67
78
 
68
79
  @context[key.to_s] =
69
80
  if value.is_a?(Hash) || value.is_a?(Array)
70
- deep_dup(value)
81
+ # as_json returns a fresh JSON-compatible structure with string keys
82
+ # — the same normalization the JSON column would apply on save and a
83
+ # deep copy that protects the store from later mutation of the
84
+ # source — without the cost of serializing to a string and reparsing.
85
+ value.as_json
71
86
  else
72
87
  value
73
88
  end
@@ -84,16 +99,19 @@ module ChronoForge
84
99
  raise ValidationError, "Unsupported context value type: #{value.inspect}"
85
100
  end
86
101
 
87
- # Optional: Add size constraints
88
- if value.is_a?(String) && value.size > 64.kilobytes
89
- raise ValidationError, "Context value too large"
102
+ byte_size = value_byte_size(value)
103
+ if byte_size && byte_size > MAX_VALUE_BYTESIZE
104
+ raise ValidationError, "Context value too large (#{byte_size} bytes, max #{MAX_VALUE_BYTESIZE})"
90
105
  end
91
106
  end
92
107
 
93
- def deep_dup(obj)
94
- JSON.parse(JSON.generate(obj))
95
- rescue
96
- obj.dup
108
+ # Serialized byte size for the variable-length types; nil for scalars,
109
+ # which need no size constraint.
110
+ def value_byte_size(value)
111
+ case value
112
+ when String then value.bytesize
113
+ when Hash, Array then value.to_json.bytesize
114
+ end
97
115
  end
98
116
  end
99
117
  end
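The switch from characters to bytes matters for multibyte text. The sketch below re-creates the size check with the standard library only (`16 * 1024` stands in for `16.kilobytes`, `JSON.generate` for `to_json`, and `too_large?` is a helper introduced here). It shows a string that would pass a character-based check but fails a byte-based one.

```ruby
require "json"

MAX_VALUE_BYTESIZE = 16 * 1024 # 16 KB, mirroring 16.kilobytes without ActiveSupport

# Serialized byte size for variable-length values; nil for scalars.
def value_byte_size(value)
  case value
  when String then value.bytesize
  when Hash, Array then JSON.generate(value).bytesize
  end
end

def too_large?(value)
  size = value_byte_size(value)
  !size.nil? && size > MAX_VALUE_BYTESIZE
end

ascii = "a" * 10_000         # 10,000 characters, 10,000 bytes
accented = "\u00E9" * 10_000 # 10,000 characters, 20,000 bytes in UTF-8

too_large?(ascii)    # => false
too_large?(accented) # => true
too_large?(42)       # => false (scalars are not size-checked)
```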
@@ -1,16 +1,55 @@
1
1
  module ChronoForge
2
2
  module Executor
3
3
  class ExecutionTracker
4
- def self.track_error(workflow, error)
5
- # Create a detailed error log
4
+ # Total budget for the context snapshot copied into each error log.
5
+ # Transient errors can be logged repeatedly (one row per retry), and the
6
+ # full context always remains on the workflow itself, so the error copy
7
+ # only needs to be a bounded diagnostic breadcrumb. Keys are preserved;
8
+ # each value is kept only if it still fits in the remaining budget and is
9
+ # otherwise replaced by OMITTED_VALUE (later, smaller values may still fit).
10
+ # Per-value size is already bounded by Context validation, so no per-value
11
+ # truncation is needed here; a value that does not fit is simply replaced.
12
+ MAX_CONTEXT_BYTESIZE = 64.kilobytes
13
+
14
+ # Placeholder stored in place of a value that didn't fit the budget.
15
+ OMITTED_VALUE = "<<omitted>>"
16
+
17
+ # @param execution_log [ExecutionLog, nil] the step the error occurred in,
18
+ # if any. Its step_name and attempt count are recorded on the error log
19
+ # so errors can be attributed to a step and ordered within the workflow.
20
+ # @param attempt [Integer, nil] explicit attempt number for errors not tied
21
+ # to a step (e.g. a workflow-level failure). Falls back to the execution
22
+ # log's attempt count.
23
+ def self.track_error(workflow, error, execution_log: nil, attempt: nil)
6
24
  ErrorLog.create!(
7
25
  workflow: workflow,
26
+ step_name: execution_log&.step_name,
27
+ attempt: attempt || execution_log&.attempts,
8
28
  error_class: error.class.name,
9
29
  error_message: error.message,
10
- backtrace: error.backtrace.join("\n"),
11
- context: workflow.context
30
+ backtrace: error.backtrace&.join("\n"),
31
+ context: error_context(workflow.context)
12
32
  )
13
33
  end
34
+
35
+ def self.error_context(context)
36
+ remaining = MAX_CONTEXT_BYTESIZE
37
+
38
+ context.each_with_object({}) do |(key, value), kept|
39
+ size = value.to_json.bytesize
40
+ if size <= remaining
41
+ kept[key] = value
42
+ remaining -= size
43
+ else
44
+ kept[key] = OMITTED_VALUE
45
+ end
46
+ end
47
+ rescue
48
+ # If the context cannot be traversed/serialized, fail safe to a marker
49
+ # rather than risk persisting something unbounded or unserializable.
50
+ {"_truncated" => true}
51
+ end
52
+ private_class_method :error_context
14
53
  end
15
54
  end
16
55
  end
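The budget logic in `error_context` is easy to try outside Rails. The sketch below keeps the same loop but takes the budget as a parameter and uses `JSON.generate` in place of `to_json`. `bounded_snapshot` is a name introduced here, and the tiny 10-byte budget is only there to make the effect visible.

```ruby
require "json"

OMITTED_VALUE = "<<omitted>>"

# Keeps every key; a value is copied only if it still fits the remaining budget.
def bounded_snapshot(context, budget)
  remaining = budget

  context.each_with_object({}) do |(key, value), kept|
    size = JSON.generate(value).bytesize
    if size <= remaining
      kept[key] = value
      remaining -= size
    else
      kept[key] = OMITTED_VALUE
    end
  end
end

context = {"order_id" => 42, "notes" => "x" * 100, "flag" => true}
bounded_snapshot(context, 10)
# => {"order_id" => 42, "notes" => "<<omitted>>", "flag" => true}
```

An oversized value does not end the copy: `"flag"` comes after the omitted `"notes"` and is still kept because it fits in what remains of the budget.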
@@ -34,11 +34,17 @@ module ChronoForge
34
34
  end
35
35
 
36
36
  def release_lock(job_id, workflow, force: false)
37
- workflow = workflow.reload
38
- if !force && workflow.locked_by != job_id
37
+ # Read only the lock owner from the DB rather than reloading the whole
38
+ # row (which would drag the heavy context/kwargs/options JSON into memory
39
+ # on every resume) just to verify ownership. The in-memory state is
40
+ # already accurate here: acquire_lock set it to :running, and a
41
+ # completed/failed workflow had its state updated on this same instance.
42
+ current_locked_by = workflow.class.where(id: workflow.id).pick(:locked_by)
43
+
44
+ if !force && current_locked_by != job_id
39
45
  raise LongRunningConcurrentExecutionError,
40
46
  "ChronoForge:#{self.class}(#{workflow.key}) job(#{job_id}) executed longer than specified max_duration, " \
41
- "allowed job(#{workflow.locked_by}) to acquire the lock."
47
+ "allowed job(#{current_locked_by}) to acquire the lock."
42
48
  end
43
49
 
44
50
  columns = {locked_at: nil, locked_by: nil}
@@ -96,13 +96,11 @@ module ChronoForge
96
96
  # - User actions or form submissions
97
97
  #
98
98
  def continue_if(condition, name: nil)
99
+ validate_step_name_segment!(name || condition)
99
100
  step_name = "continue_if$#{name || condition}"
100
101
 
101
102
  # Find or create execution log
102
- execution_log = ExecutionLog.create_or_find_by!(
103
- workflow: @workflow,
104
- step_name: step_name
105
- ) do |log|
103
+ execution_log = find_or_create_execution_log!(step_name) do |log|
106
104
  log.started_at = Time.current
107
105
  log.metadata = {
108
106
  condition: condition.to_s,
@@ -128,7 +126,7 @@ module ChronoForge
128
126
  rescue => e
129
127
  # Log the error and fail the execution
130
128
  Rails.logger.error { "Error evaluating condition #{condition}: #{e.message}" }
131
- self.class::ExecutionTracker.track_error(workflow, e)
129
+ self.class::ExecutionTracker.track_error(workflow, e, execution_log: execution_log)
132
130
 
133
131
  execution_log.update!(
134
132
  state: :failed,
@@ -60,12 +60,10 @@ module ChronoForge
60
60
  # - Enables monitoring and debugging of execution history
61
61
  #
62
62
  def durably_execute(method, max_attempts: 3, name: nil)
63
+ validate_step_name_segment!(name || method)
63
64
  step_name = "durably_execute$#{name || method}"
64
65
  # Find or create execution log
65
- execution_log = ExecutionLog.create_or_find_by!(
66
- workflow: @workflow,
67
- step_name: step_name
68
- ) do |log|
66
+ execution_log = find_or_create_execution_log!(step_name) do |log|
69
67
  log.started_at = Time.current
70
68
  end
71
69
 
@@ -96,7 +94,7 @@ module ChronoForge
96
94
  rescue => e
97
95
  # Log the error
98
96
  Rails.logger.error { "Error while durably executing #{method}: #{e.message}" }
99
- self.class::ExecutionTracker.track_error(workflow, e)
97
+ self.class::ExecutionTracker.track_error(workflow, e, execution_log: execution_log)
100
98
 
101
99
  # Optional retry logic
102
100
  if execution_log.attempts < max_attempts
@@ -101,13 +101,11 @@ module ChronoForge
101
101
  # - Repetition logs: `durably_repeat$#{name}$#{timestamp}` - tracks individual executions
102
102
  #
103
103
  def durably_repeat(method, every:, till:, start_at: nil, max_attempts: 3, timeout: 1.hour, on_error: :continue, name: nil)
104
+ validate_step_name_segment!(name || method)
104
105
  step_name = "durably_repeat$#{name || method}"
105
106
 
106
107
  # Get or create the main coordination log for this periodic task
107
- coordination_log = ExecutionLog.create_or_find_by!(
108
- workflow: @workflow,
109
- step_name: step_name
110
- ) do |log|
108
+ coordination_log = find_or_create_execution_log!(step_name) do |log|
111
109
  log.started_at = Time.current
112
110
  log.metadata = {last_execution_at: nil}
113
111
  end
@@ -157,10 +155,7 @@ module ChronoForge
157
155
  step_name = "#{coordination_log.step_name}$#{next_execution_at.to_i}"
158
156
 
159
157
  # Create execution log for this specific repetition
160
- repetition_log = ExecutionLog.create_or_find_by!(
161
- workflow: @workflow,
162
- step_name: step_name
163
- ) do |log|
158
+ repetition_log = find_or_create_execution_log!(step_name) do |log|
164
159
  log.started_at = Time.current
165
160
  log.metadata = {
166
161
  scheduled_for: next_execution_at,
@@ -225,7 +220,7 @@ module ChronoForge
225
220
  rescue => e
226
221
  # Log the error
227
222
  Rails.logger.error { "Error in periodic task #{method}: #{e.message}" }
228
- self.class::ExecutionTracker.track_error(@workflow, e)
223
+ self.class::ExecutionTracker.track_error(@workflow, e, execution_log: repetition_log)
229
224
 
230
225
  # Handle retry logic for this specific repetition
231
226
  if repetition_log.attempts < max_attempts
@@ -73,12 +73,10 @@ module ChronoForge
73
73
  # - Marks as completed when wait period has elapsed
74
74
  #
75
75
  def wait(duration, name)
76
+ validate_step_name_segment!(name)
76
77
  step_name = "wait$#{name}"
77
78
  # Find or create execution log
78
- execution_log = ExecutionLog.create_or_find_by!(
79
- workflow: @workflow,
80
- step_name: step_name
81
- ) do |log|
79
+ execution_log = find_or_create_execution_log!(step_name) do |log|
82
80
  log.started_at = Time.current
83
81
  log.metadata = {
84
82
  wait_until: duration.from_now
@@ -86,12 +86,10 @@ module ChronoForge
86
86
  # - Records final result (true for success, :timed_out for timeout)
87
87
  #
88
88
  def wait_until(condition, timeout: 1.hour, check_interval: 15.minutes, retry_on: [])
89
+ validate_step_name_segment!(condition)
89
90
  step_name = "wait_until$#{condition}"
90
91
  # Find or create execution log
91
- execution_log = ExecutionLog.create_or_find_by!(
92
- workflow: @workflow,
93
- step_name: step_name
94
- ) do |log|
92
+ execution_log = find_or_create_execution_log!(step_name) do |log|
95
93
  log.started_at = Time.current
96
94
  log.metadata = {
97
95
  timeout_at: timeout.from_now,
@@ -117,7 +115,7 @@ module ChronoForge
117
115
  rescue => e
118
116
  # Log the error
119
117
  Rails.logger.error { "Error evaluating condition #{condition}: #{e.message}" }
120
- self.class::ExecutionTracker.track_error(workflow, e)
118
+ self.class::ExecutionTracker.track_error(workflow, e, execution_log: execution_log)
121
119
 
122
120
  # Optional retry logic
123
121
  if retry_on.include?(e.class)
@@ -162,7 +160,11 @@ module ChronoForge
162
160
  metadata: metadata.merge("result" => :timed_out)
163
161
  )
164
162
  Rails.logger.warn { "Timeout reached for condition '#{condition}'." }
165
- raise WaitConditionNotMet, "Condition '#{condition}' not met within timeout period"
163
+ # Log here (with step context) rather than relying on the workflow-level
164
+ # rescue, which no longer logs ExecutionFailedError.
165
+ error = WaitConditionNotMet.new("Condition '#{condition}' not met within timeout period")
166
+ self.class::ExecutionTracker.track_error(workflow, error, execution_log: execution_log)
167
+ raise error
166
168
  end
167
169
 
168
170
  # Reschedule with delay
@@ -49,24 +49,20 @@ module ChronoForge
49
49
  #
50
50
  def complete_workflow!
51
51
  # Create an execution log for workflow completion
52
- execution_log = ExecutionLog.create_or_find_by!(
53
- workflow: workflow,
54
- step_name: "$workflow_completion$"
55
- ) do |log|
52
+ execution_log = find_or_create_execution_log!("$workflow_completion$") do |log|
56
53
  log.started_at = Time.current
57
54
  end
58
55
 
59
56
  begin
60
- execution_log.update!(
61
- attempts: execution_log.attempts + 1,
62
- last_executed_at: Time.current
63
- )
64
-
65
57
  workflow.completed_at = Time.current
66
58
  workflow.completed!
67
59
 
68
- # Mark execution log as completed
60
+ # Mark execution log as completed. Attempt tracking and the terminal
61
+ # state are written together: completion is not retried on an
62
+ # attempt-count basis, so there is no need for a separate pre-write.
69
63
  execution_log.update!(
64
+ attempts: execution_log.attempts + 1,
65
+ last_executed_at: Time.current,
70
66
  state: :completed,
71
67
  completed_at: Time.current
72
68
  )
@@ -142,10 +138,7 @@ module ChronoForge
142
138
  #
143
139
  def fail_workflow!(error_log)
144
140
  # Create an execution log for workflow failure
145
- execution_log = ExecutionLog.create_or_find_by!(
146
- workflow: workflow,
147
- step_name: "$workflow_failure$#{error_log.id}"
148
- ) do |log|
141
+ execution_log = find_or_create_execution_log!("$workflow_failure$#{error_log.id}") do |log|
149
142
  log.started_at = Time.current
150
143
  log.metadata = {
151
144
  error_log_id: error_log.id
@@ -153,15 +146,14 @@ module ChronoForge
153
146
  end
154
147
 
155
148
  begin
156
- execution_log.update!(
157
- attempts: execution_log.attempts + 1,
158
- last_executed_at: Time.current
159
- )
160
-
161
149
  workflow.failed!
162
150
 
163
- # Mark execution log as completed
151
+ # Mark execution log as completed. Attempt tracking and the terminal
152
+ # state are written together (failure handling is not retried on an
153
+ # attempt-count basis).
164
154
  execution_log.update!(
155
+ attempts: execution_log.attempts + 1,
156
+ last_executed_at: Time.current,
165
157
  state: :completed,
166
158
  completed_at: Time.current
167
159
  )
@@ -242,10 +234,9 @@ module ChronoForge
242
234
  # - Original exception is re-raised after logging
243
235
  #
244
236
  def retry_workflow!
245
- # Check if the workflow is stalled or failed
246
- unless workflow.stalled? || workflow.failed?
247
- raise WorkflowNotRetryableError, "Cannot retry workflow(#{workflow.key}) in #{workflow.state} state. Only stalled or failed workflows can be retried."
248
- end
237
+ # Authoritative check at execution time (the record-level retry methods
238
+ # also check up front, but state may have changed since enqueue).
239
+ workflow.ensure_retryable!
249
240
 
250
241
  # Create an execution log for workflow retry
251
242
  execution_log = ExecutionLog.create!(
@@ -12,6 +12,12 @@ module ChronoForge
12
12
 
13
13
  class WorkflowNotRetryableError < NotExecutableError; end
14
14
 
15
+ class InvalidStepName < NotExecutableError; end
16
+
17
+ # "$" separates the segments of a step name (e.g. "durably_repeat$name$ts").
18
+ # User-supplied names/methods must not contain it.
19
+ STEP_NAME_DELIMITER = "$"
20
+
15
21
  include Methods
16
22
 
17
23
  # Add class methods
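Reviewer note: the hunk above reserves "$" as the step-name separator and adds `InvalidStepName`. The following is a minimal standalone sketch of why that matters, not the gem's code. `validate_segment!`, `repetition_step_name`, and `parse_step_name` are hypothetical helper names, and the error class here inherits from `StandardError` only to keep the example self-contained.

```ruby
# Illustrative stand-ins; not the gem's classes.
class InvalidStepName < StandardError; end

STEP_NAME_DELIMITER = "$"

# Reject any user-supplied segment that contains the reserved separator.
def validate_segment!(segment)
  return unless segment.to_s.include?(STEP_NAME_DELIMITER)

  raise InvalidStepName,
    "step name may not contain '#{STEP_NAME_DELIMITER}': #{segment.inspect}"
end

# Build a repetition-style step name from validated parts.
def repetition_step_name(name, timestamp)
  validate_segment!(name)
  ["durably_repeat", name, timestamp.to_i].join(STEP_NAME_DELIMITER)
end

# Because no segment can contain "$", splitting recovers the parts unambiguously.
def parse_step_name(step_name)
  step_name.split(STEP_NAME_DELIMITER)
end
```

With validation in place, a name such as "a$b" is rejected up front instead of producing a step name whose segments cannot be told apart later.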
@@ -34,13 +40,13 @@ module ChronoForge
34
40
  end
35
41
 
36
42
  # Add retry_now class method that calls perform_now with retry_workflow: true
37
- def retry_now(key, **kwargs)
38
- perform_now(key, retry_workflow: true, **kwargs)
43
+ def retry_now(key, **)
44
+ perform_now(key, retry_workflow: true, **)
39
45
  end
40
46
 
41
47
  # Add retry_later class method that calls perform_later with retry_workflow: true
42
- def retry_later(key, **kwargs)
43
- perform_later(key, retry_workflow: true, **kwargs)
48
+ def retry_later(key, **)
49
+ perform_later(key, retry_workflow: true, **)
44
50
  end
45
51
  end
46
52
  end
@@ -74,9 +80,11 @@ module ChronoForge
74
80
 
75
81
  # Mark as complete
76
82
  complete_workflow!
77
- rescue ExecutionFailedError => e
83
+ rescue ExecutionFailedError
84
+ # The step that raised this already logged the underlying cause (with its
85
+ # step/attempt context); ExecutionFailedError is control flow, not a new
86
+ # error, so re-logging it here would just duplicate the row.
78
87
  Rails.logger.error { "ChronoForge:#{self.class}(#{key}) step execution failed" }
79
- self.class::ExecutionTracker.track_error(workflow, e)
80
88
  workflow.stalled!
81
89
  nil
82
90
  rescue HaltExecutionFlow
@@ -91,7 +99,7 @@ module ChronoForge
91
99
  raise
92
100
  rescue => e
93
101
  Rails.logger.error { "ChronoForge:#{self.class}(#{key}) workflow execution failed" }
94
- error_log = self.class::ExecutionTracker.track_error(workflow, e)
102
+ error_log = self.class::ExecutionTracker.track_error(workflow, e, attempt: attempt)
95
103
 
96
104
  # Retry if applicable
97
105
  if should_retry?(e, attempt)
@@ -110,17 +118,52 @@ module ChronoForge
110
118
  private
111
119
 
112
120
  def setup_workflow!(key, options, kwargs)
113
- @workflow = Workflow.create_or_find_by!(job_class: self.class.to_s, key: key) do |workflow|
114
- workflow.options = options
115
- workflow.kwargs = kwargs
116
- workflow.started_at = Time.current
117
- end
121
+ # SELECT-first: on every resume (the common case) the workflow already
122
+ # exists, so a plain lookup avoids an INSERT that would fail on the unique
123
+ # [job_class, key] index. create_or_find_by! is only reached on first-ever
124
+ # creation, where it also handles a concurrent insert race safely.
125
+ @workflow = Workflow.find_by(job_class: self.class.to_s, key: key) ||
126
+ Workflow.create_or_find_by!(job_class: self.class.to_s, key: key) do |workflow|
127
+ workflow.options = options
128
+ workflow.kwargs = kwargs
129
+ workflow.started_at = Time.current
130
+ end
118
131
  end
119
132
 
120
133
  def setup_context!
121
134
  @context = Context.new(workflow)
122
135
  end
123
136
 
137
+ # Idempotent, SELECT-first execution-log lookup.
138
+ #
139
+ # The engine replays the whole workflow body on every resume, so each durable
140
+ # step is looked up again every pass. A plain create_or_find_by! would INSERT
141
+ # first and fail on the unique index for the (overwhelmingly common) case
142
+ # where the step already exists — turning every replayed step into a wasted
143
+ # INSERT plus a burned sequence value. Looking up first means replays cost a
144
+ # single indexed SELECT.
145
+ #
146
+ # All lookups are by exact step_name (no method ever scans a workflow's logs),
147
+ # so a per-step lookup is also the right shape for durably_repeat workflows,
148
+ # which accumulate unbounded repetition logs: we touch only the rows we need,
149
+ # never the whole set. create_or_find_by! is used only on a miss, keeping
150
+ # creation safe if a lock takeover ever lets two executors race.
151
+ def find_or_create_execution_log!(step_name, &)
152
+ ExecutionLog.find_by(workflow: @workflow, step_name: step_name) ||
153
+ ExecutionLog.create_or_find_by!(workflow: @workflow, step_name: step_name, &)
154
+ end
155
+
156
+ # Guards the user-supplied portion of a step name (a custom name, method, or
157
+ # condition). The "$" separator is reserved for the framework's own segment
158
+ # structure, so a user value containing it would make step names ambiguous
159
+ # and corrupt the cleanup logic that parses them.
160
+ def validate_step_name_segment!(segment)
161
+ return unless segment.to_s.include?(STEP_NAME_DELIMITER)
162
+
163
+ raise InvalidStepName,
164
+ "ChronoForge step name may not contain '#{STEP_NAME_DELIMITER}' (reserved separator): #{segment.inspect}"
165
+ end
166
+
124
167
  def should_retry?(error, attempt_count)
125
168
  attempt_count < 3
126
169
  end
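Reviewer note: the comments on `find_or_create_execution_log!` argue that looking up before inserting saves a failed INSERT on every replayed step. The sketch below models that claim with an in-memory store. `FakeLogStore` and its methods are hypothetical stand-ins for a table with a unique index, not ActiveRecord, and the insert counter exists only to make the difference observable.

```ruby
# Hypothetical in-memory stand-in for a table with a unique index on step_name.
class FakeLogStore
  class DuplicateKey < StandardError; end

  attr_reader :insert_attempts

  def initialize
    @rows = {}
    @insert_attempts = 0
  end

  def find_by(step_name)
    @rows[step_name]
  end

  # Mimics an INSERT against a unique index: every call counts as an attempt,
  # and a duplicate raises instead of overwriting.
  def insert!(step_name)
    @insert_attempts += 1
    raise DuplicateKey, step_name if @rows.key?(step_name)

    @rows[step_name] = {step_name: step_name}
  end

  # INSERT-first, falling back to a lookup on conflict.
  def create_or_find_by!(step_name)
    row = insert!(step_name)
    yield row if block_given?
    row
  rescue DuplicateKey
    find_by(step_name)
  end

  # SELECT-first: only a miss reaches the INSERT path.
  def find_or_create!(step_name, &block)
    find_by(step_name) || create_or_find_by!(step_name, &block)
  end
end
```

In this model, calling `find_or_create!` repeatedly for the same step attempts one insert in total, while calling `create_or_find_by!` directly attempts one per call. The insert-first fallback is kept on the miss path so that two racing creators still converge on a single row.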
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module ChronoForge
4
- VERSION = "0.8.0"
4
+ VERSION = "0.9.0"
5
5
  end
@@ -36,13 +36,36 @@ module ChronoForge
36
36
  stalled
37
37
  ]
38
38
 
39
- # Serialization for metadata
40
- serialize :metadata, coder: JSON
41
-
42
39
  def executable?
43
40
  idle? || running?
44
41
  end
45
42
 
43
+ # Only stalled or failed workflows can be re-executed.
44
+ def retryable?
45
+ stalled? || failed?
46
+ end
47
+
48
+ def ensure_retryable!
49
+ return if retryable?
50
+
51
+ raise Executor::WorkflowNotRetryableError,
52
+ "Cannot retry workflow(#{key}) in #{state} state. Only stalled or failed workflows can be retried."
53
+ end
54
+
55
+ # Re-execute this workflow from its record, without constantizing the job
56
+ # class or re-passing the key. Retryability is validated up front so a
57
+ # non-retryable workflow raises immediately rather than enqueuing a job that
58
+ # would fail in the worker.
59
+ def retry_now(**)
60
+ ensure_retryable!
61
+ job_klass.retry_now(key, **)
62
+ end
63
+
64
+ def retry_later(**)
65
+ ensure_retryable!
66
+ job_klass.retry_later(key, **)
67
+ end
68
+
46
69
  def job_klass
47
70
  job_class.constantize
48
71
  end
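Reviewer note: the record-level `retry_now` / `retry_later` methods above validate retryability before handing off to the job class. A minimal sketch of that guard-then-enqueue shape follows. `FakeWorkflow` and the `enqueuer` callable are hypothetical and stand in for the ActiveRecord model and the job class respectively.

```ruby
class WorkflowNotRetryableError < StandardError; end

# Hypothetical stand-in for the workflow record: just a key and a state.
FakeWorkflow = Struct.new(:key, :state) do
  def retryable?
    %i[stalled failed].include?(state)
  end

  def ensure_retryable!
    return if retryable?

    raise WorkflowNotRetryableError,
      "Cannot retry workflow(#{key}) in #{state} state. " \
      "Only stalled or failed workflows can be retried."
  end

  # Validate first, then hand off to the enqueuer (any callable here).
  def retry_later(enqueuer)
    ensure_retryable!
    enqueuer.call(key)
  end
end
```

Raising before the hand-off means a caller sees the failure immediately rather than discovering it in a worker. The executor-side `ensure_retryable!` call in `retry_workflow!` still matters because state can change between enqueue and execution.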
@@ -1,8 +1,10 @@
1
1
  Description:
2
- Installs ChronoForge
2
+ Installs ChronoForge by copying all of its migrations into your app.
3
+ Idempotent: migrations that already exist are skipped.
3
4
 
4
5
  Example:
5
- bin/rails g chrono_form:install
6
+ bin/rails g chrono_forge:install
6
7
 
7
- This will create a new migration e.g:
8
+ This will create migrations e.g:
8
9
  20241221181505_install_chrono_forge.rb
10
+ 20241221181506_add_chrono_forge_workflow_state_index.rb
@@ -1,24 +1,22 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  require "rails/generators/active_record/migration"
4
+ require_relative "../migration_actions"
4
5
 
5
6
  module ChronoForge
7
+ # Installs all ChronoForge migrations into a new application. Idempotent:
8
+ # migrations that already exist are skipped, so re-running is safe.
6
9
  class InstallGenerator < Rails::Generators::Base
7
10
  include ::ActiveRecord::Generators::Migration
11
+ include ChronoForge::Generators::MigrationActions
8
12
 
9
- source_root File.expand_path("templates", __dir__)
13
+ source_root File.expand_path("../templates", __dir__)
10
14
 
11
15
  def start
12
- install_migrations
16
+ copy_chrono_forge_migrations
13
17
  rescue => err
14
18
  say "#{err.class}: #{err}\n#{err.backtrace.join("\n")}", :red
15
19
  exit 1
16
20
  end
17
-
18
- private
19
-
20
- def install_migrations
21
- migration_template "install_chrono_forge.rb", "db/migrate/install_chrono_forge.rb"
22
- end
23
21
  end
24
22
  end
@@ -0,0 +1,39 @@
1
+ # frozen_string_literal: true
2
+
3
+ module ChronoForge
4
+ module Generators
5
+ # Shared migration-copy logic for the install and upgrade generators.
6
+ #
7
+ # Copying is idempotent: a migration whose name already exists in the host
8
+ # application's db/migrate is skipped, so it is safe to re-run either
9
+ # generator. `install` copies the full set (a fresh app has none yet);
10
+ # `upgrade` copies only the migrations a previously-installed app is missing.
11
+ # Both share this method — the difference is purely which migrations already
12
+ # exist in the target app.
13
+ #
14
+ # MIGRATIONS is listed in application order; copying preserves that order
15
+ # because each migration_template assigns the next sequential version number.
16
+ module MigrationActions
17
+ MIGRATIONS = %w[
18
+ install_chrono_forge
19
+ add_chrono_forge_workflow_state_index
20
+ add_chrono_forge_error_log_step_context
21
+ ].freeze
22
+
23
+ def copy_chrono_forge_migrations
24
+ MIGRATIONS.each do |name|
25
+ if chrono_forge_migration_exists?(name)
26
+ say_status :skip, "#{name} (migration already exists)", :yellow
27
+ else
28
+ migration_template "#{name}.rb", "db/migrate/#{name}.rb"
29
+ end
30
+ end
31
+ end
32
+
33
+ def chrono_forge_migration_exists?(name)
34
+ migrate_dir = File.join(destination_root, "db", "migrate")
35
+ Dir.glob(File.join(migrate_dir, "*_#{name}.rb")).any?
36
+ end
37
+ end
38
+ end
39
+ end
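Reviewer note: the idempotency of both generators rests on the filename glob in `chrono_forge_migration_exists?`. The sketch below exercises the same glob outside Rails, against a temporary directory. `migration_exists?` and `missing_migrations` are hypothetical helper names, and the list of names is copied from `MIGRATIONS` above.

```ruby
require "tmpdir"

# Names copied from the MIGRATIONS list above; helper names are hypothetical.
MIGRATION_NAMES = %w[
  install_chrono_forge
  add_chrono_forge_workflow_state_index
  add_chrono_forge_error_log_step_context
].freeze

# A migration counts as present if any timestamped file ends with "_<name>.rb".
def migration_exists?(migrate_dir, name)
  Dir.glob(File.join(migrate_dir, "*_#{name}.rb")).any?
end

# Returns the names that would still be copied, in list order.
def missing_migrations(migrate_dir)
  MIGRATION_NAMES.reject { |name| migration_exists?(migrate_dir, name) }
end

# Simulate an app that installed an earlier version: only the first
# migration is present, so the two later ones are reported as missing.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "20241221181505_install_chrono_forge.rb"), "")
  puts missing_migrations(dir).inspect
end
```

Since the match is on the name suffix and ignores the timestamp prefix, a host app that renamed a copied migration would look as if it were missing it. That is an observation about this sketch's glob, which mirrors the one in the diff.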
@@ -0,0 +1,13 @@
1
+ # frozen_string_literal: true
2
+
3
+ # Adds step context to error logs so each error can be attributed to the step
4
+ # and attempt it came from — making error logs orderable and correlatable when
5
+ # tailing a workflow, instead of an undifferentiated stream. Both columns are
6
+ # nullable (a workflow-level error has no step), so this is a safe additive
7
+ # change with no table rewrite.
8
+ class AddChronoForgeErrorLogStepContext < ActiveRecord::Migration[7.1]
9
+ def change
10
+ add_column :chrono_forge_error_logs, :step_name, :string, if_not_exists: true
11
+ add_column :chrono_forge_error_logs, :attempt, :integer, if_not_exists: true
12
+ end
13
+ end
@@ -0,0 +1,33 @@
1
+ # frozen_string_literal: true
2
+
3
+ # Adds a composite [state, completed_at] index to chrono_forge_workflows. This
4
+ # supports state-based monitoring (stalled/failed dashboards) and the retention
5
+ # scan ChronoForge::Cleanup runs over completed workflows (the high-volume
6
+ # terminal state), which filters by completed_at. The state prefix also serves
7
+ # the smaller failed-workflow scan.
8
+ #
9
+ # Shipped as a standalone migration (rather than folded into the install
10
+ # migration) so applications created with an earlier version of ChronoForge can
11
+ # pick it up via `rails generate chrono_forge:upgrade`.
12
+ class AddChronoForgeWorkflowStateIndex < ActiveRecord::Migration[7.1]
13
+ # On PostgreSQL the index is built CONCURRENTLY so it does not lock the table
14
+ # against writes, which also keeps strong_migrations satisfied. Concurrent
15
+ # index builds cannot run inside a transaction.
16
+ disable_ddl_transaction!
17
+
18
+ def change
19
+ add_index :chrono_forge_workflows, %i[state completed_at],
20
+ if_not_exists: true,
21
+ **chrono_forge_index_algorithm
22
+ end
23
+
24
+ private
25
+
26
+ def chrono_forge_index_algorithm
27
+ if connection.adapter_name.to_s.downcase.include?("postgresql")
28
+ {algorithm: :concurrently}
29
+ else
30
+ {}
31
+ end
32
+ end
33
+ end
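Reviewer note: the migration above picks `algorithm: :concurrently` only on PostgreSQL and splats the result into the `add_index` options. A small sketch of that selection as a pure function, so it can be checked without a database connection. `index_algorithm_options` and `index_options` are hypothetical names, and the adapter name is passed in as a plain string here.

```ruby
# Hypothetical pure-function version of the adapter check in the migration above.
def index_algorithm_options(adapter_name)
  if adapter_name.to_s.downcase.include?("postgresql")
    {algorithm: :concurrently}
  else
    {}
  end
end

# The options are splatted into the keyword list, so an empty hash adds nothing.
def index_options(adapter_name)
  {if_not_exists: true, **index_algorithm_options(adapter_name)}
end
```

Returning an empty hash for other adapters keeps the `add_index` call identical across databases, with the extra keyword simply absent where it does not apply.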
@@ -0,0 +1,13 @@
1
+ Description:
2
+ Upgrades an existing ChronoForge installation to the current schema.
3
+
4
+ New applications use `chrono_forge:install`. Applications that installed an
5
+ earlier version run this to add additive schema changes (currently the
6
+ chrono_forge_workflows [state, completed_at] index). The generated migration
7
+ is idempotent and safe to run even if the index already exists.
8
+
9
+ Example:
10
+ bin/rails g chrono_forge:upgrade
11
+
12
+ This will create a new migration e.g:
13
+ 20250603181505_add_chrono_forge_workflow_state_index.rb
@@ -0,0 +1,28 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "rails/generators/active_record/migration"
4
+ require_relative "../migration_actions"
5
+
6
+ module ChronoForge
7
+ # Brings an existing ChronoForge installation up to the current schema by
8
+ # copying any migrations the application does not already have. Applications
9
+ # created with `chrono_forge:install` on the current version already have
10
+ # everything; older installs pick up the additive migrations (currently the
11
+ # chrono_forge_workflows [state, completed_at] index).
12
+ #
13
+ # rails generate chrono_forge:upgrade
14
+ # rails db:migrate
15
+ class UpgradeGenerator < Rails::Generators::Base
16
+ include ::ActiveRecord::Generators::Migration
17
+ include ChronoForge::Generators::MigrationActions
18
+
19
+ source_root File.expand_path("../templates", __dir__)
20
+
21
+ def start
22
+ copy_chrono_forge_migrations
23
+ rescue => err
24
+ say "#{err.class}: #{err}\n#{err.backtrace.join("\n")}", :red
25
+ exit 1
26
+ end
27
+ end
28
+ end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: chrono_forge
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.8.0
4
+ version: 0.9.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Stefan Froelich
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2025-07-16 00:00:00.000000000 Z
11
+ date: 2026-06-03 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: activerecord
@@ -185,6 +185,8 @@ files:
185
185
  - gemfiles/rails_7.1.gemfile
186
186
  - gemfiles/rails_7.1.gemfile.lock
187
187
  - lib/chrono_forge.rb
188
+ - lib/chrono_forge/cleanup.rb
189
+ - lib/chrono_forge/cleanup_job.rb
188
190
  - lib/chrono_forge/error_log.rb
189
191
  - lib/chrono_forge/execution_log.rb
190
192
  - lib/chrono_forge/executor.rb
@@ -203,7 +205,12 @@ files:
203
205
  - lib/chrono_forge/workflow.rb
204
206
  - lib/generators/chrono_forge/install/USAGE
205
207
  - lib/generators/chrono_forge/install/install_generator.rb
206
- - lib/generators/chrono_forge/install/templates/install_chrono_forge.rb
208
+ - lib/generators/chrono_forge/migration_actions.rb
209
+ - lib/generators/chrono_forge/templates/add_chrono_forge_error_log_step_context.rb
210
+ - lib/generators/chrono_forge/templates/add_chrono_forge_workflow_state_index.rb
211
+ - lib/generators/chrono_forge/templates/install_chrono_forge.rb
212
+ - lib/generators/chrono_forge/upgrade/USAGE
213
+ - lib/generators/chrono_forge/upgrade/upgrade_generator.rb
207
214
  - sig/chrono_forge.rbs
208
215
  homepage: https://github.com/radioactive-labs/chrono_forge
209
216
  licenses: