chrono_forge 0.9.1 → 0.10.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +22 -0
- data/README.md +305 -44
- data/docs/superpowers/plans/2026-06-25-chrono_forge-dashboard.md +1748 -0
- data/docs/superpowers/plans/2026-06-25-chrono_forge-dashboard.md.tasks.json +17 -0
- data/docs/superpowers/plans/2026-06-25-composite-retry-policies.md +930 -0
- data/docs/superpowers/plans/2026-06-25-composite-retry-policies.md.tasks.json +54 -0
- data/docs/superpowers/plans/2026-06-25-reserved-kwarg-guard.md +241 -0
- data/docs/superpowers/plans/2026-06-25-reserved-kwarg-guard.md.tasks.json +12 -0
- data/docs/superpowers/plans/2026-06-26-branches-spawn-merge.md +1378 -0
- data/docs/superpowers/plans/2026-06-26-branches-spawn-merge.md.tasks.json +67 -0
- data/docs/superpowers/plans/2026-06-26-deferral-continuation-race-and-catchup.md +709 -0
- data/docs/superpowers/plans/2026-06-26-deferral-continuation-race-and-catchup.md.tasks.json +19 -0
- data/docs/superpowers/specs/2026-06-03-unified-retry-policy-design.md +226 -0
- data/docs/superpowers/specs/2026-06-25-chrono_forge-dashboard-design.md +190 -0
- data/docs/superpowers/specs/2026-06-25-composite-retry-policies-design.md +228 -0
- data/docs/superpowers/specs/2026-06-25-reserved-kwarg-guard-design.md +169 -0
- data/docs/superpowers/specs/2026-06-25-spawn-merge-branches-design.md +468 -0
- data/docs/superpowers/specs/2026-06-26-dashboard-branch-view-design.md +142 -0
- data/docs/superpowers/specs/2026-06-26-deferral-continuation-race-and-catchup-design.md +265 -0
- data/lib/chrono_forge/branch_merge_job.rb +138 -0
- data/lib/chrono_forge/branch_probe.rb +26 -0
- data/lib/chrono_forge/cleanup.rb +6 -0
- data/lib/chrono_forge/execution_log.rb +6 -0
- data/lib/chrono_forge/executor/composite_retry_policy.rb +47 -0
- data/lib/chrono_forge/executor/methods/branch.rb +185 -0
- data/lib/chrono_forge/executor/methods/durably_execute.rb +21 -19
- data/lib/chrono_forge/executor/methods/durably_repeat.rb +118 -25
- data/lib/chrono_forge/executor/methods/merge_branches.rb +83 -0
- data/lib/chrono_forge/executor/methods/wait.rb +2 -4
- data/lib/chrono_forge/executor/methods/wait_until.rb +25 -25
- data/lib/chrono_forge/executor/methods/workflow_states.rb +16 -0
- data/lib/chrono_forge/executor/methods.rb +2 -0
- data/lib/chrono_forge/executor/retry_policy.rb +111 -0
- data/lib/chrono_forge/executor.rb +216 -28
- data/lib/chrono_forge/version.rb +1 -1
- data/lib/chrono_forge/workflow.rb +10 -1
- data/lib/generators/chrono_forge/migration_actions.rb +1 -0
- data/lib/generators/chrono_forge/templates/add_chrono_forge_parent_execution_log.rb +38 -0
- metadata +42 -5
- data/lib/chrono_forge/executor/retry_strategy.rb +0 -29
@@ -0,0 +1,1378 @@
# Branches (`branch` / `spawn` / `spawn_each` / `merge_branches`) Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:subagent-driven-development (recommended) or superpowers-extended-cc:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add a durable, large-scale fan-out/fan-in primitive to ChronoForge: `branch` blocks that `spawn`/`spawn_each` child sub-workflows and are joined by `merge_branches` (or `automerge`), built to dispatch hundreds of thousands of children per branch.

**Architecture:** A `branch :name do … end` block is a durable coordination step (`branch$<name>` execution log). Inside it, `spawn`/`spawn_each` eagerly create and bulk-enqueue child workflows (one `chrono_forge_workflows` row each, linked by a new generic `parent_execution_log_id` FK to the branch log). The block seals when it closes; `spawn_each` streams its source with a per-spawn cursor so dispatch resumes after a crash without re-streaming. Joining is poll-based via a lightweight `BranchMergeJob` (no parent replay per poll); branch/merge state is tracked in an in-memory registry (`@open_branches`) rebuilt each replay pass, so the completion gate can raise on a forgotten join.
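
The registry and completion gate described in the Architecture paragraph can be pictured with a small plain-Ruby model. This is only a sketch under stated assumptions: `BranchRegistrySketch`, `open_branch`, `mark_merged`, `unjoined`, and `assert_all_joined!` are hypothetical names invented for illustration, and the error class is a local stand-in. Only the `automerge` flag, the `@open_branches` hash, and the `UnmergedBranchError` name come from the plan.

```ruby
# Local stand-in for the error class the plan adds to the executor.
class UnmergedBranchError < StandardError; end

# Hypothetical in-memory registry: rebuilt from scratch on every replay pass.
class BranchRegistrySketch
  def initialize
    @open_branches = {}
  end

  # Record a branch the moment its block is reached.
  def open_branch(name, automerge: false)
    @open_branches[name.to_s] = {automerge: automerge, merged: false}
  end

  # Record an explicit join for the named branches.
  def mark_merged(*names)
    names.each { |n| @open_branches.fetch(n.to_s)[:merged] = true }
  end

  # Names of branches that were neither merged nor declared automerge.
  def unjoined
    @open_branches.reject { |_, b| b[:merged] || b[:automerge] }.keys
  end

  # Completion gate: refuse to complete while a join was forgotten.
  def assert_all_joined!
    missing = unjoined
    return if missing.empty?

    raise UnmergedBranchError, "unmerged branches: #{missing.join(", ")}"
  end
end

registry = BranchRegistrySketch.new
registry.open_branch(:emails)
registry.open_branch(:reports, automerge: true)
registry.mark_merged(:emails)
registry.assert_all_joined! # passes: one branch merged, one automerge
```

Because the registry lives only in memory, nothing needs to be persisted for the gate itself; each replay pass repopulates it as the workflow body re-reaches its `branch` and merge calls.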

**Tech Stack:** Ruby, ActiveJob (>= 7.1, for `perform_all_later`), ActiveRecord, Zeitwerk, Minitest + Combustion + ChaoticJob.

**User Verification:** NO. No user verification required (library feature; verified by the test suite).

**Reference spec:** `docs/superpowers/specs/2026-06-25-spawn-merge-branches-design.md`

---

## File Structure

**New library files**
- `lib/chrono_forge/executor/methods/branch.rb`: `branch`, `spawn`, `spawn_each`, and shared dispatch/cursor/registry helpers.
- `lib/chrono_forge/executor/methods/merge_branches.rb`: `merge_branches`/`merge_branch`, plus `branches_done?` / `enqueue_branch_merge_job` / `open_branch!` (used by the completion gate too).
- `lib/chrono_forge/branch_merge_job.rb`: `ChronoForge::BranchMergeJob`, the lightweight poller.
- `lib/generators/chrono_forge/templates/add_chrono_forge_parent_execution_log.rb`: additive migration.
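
To make the role of the poller and of `branches_done?` in the list above more concrete, here is a rough plain-Ruby model of one poll. Everything in it is an assumption made for illustration: the `Child` struct, the `poll_once` helper, the argument lists, and especially the set of terminal state names are hypothetical and may not match the real implementation. The only grounded idea is the one from the plan: a poll checks child states for a branch log and does not replay the parent.

```ruby
# Assumed terminal states for this sketch; the real state names may differ.
TERMINAL_STATES = %w[completed failed].freeze

# Hypothetical stand-in for a child workflow row.
Child = Struct.new(:parent_execution_log_id, :state)

# True when no child of the given branch logs is still in flight.
def branches_done?(children, branch_log_ids)
  children.none? do |child|
    branch_log_ids.include?(child.parent_execution_log_id) &&
      !TERMINAL_STATES.include?(child.state)
  end
end

# One poll: either wake the parent or schedule another poll.
def poll_once(children, branch_log_ids)
  branches_done?(children, branch_log_ids) ? :resume_parent : :poll_again
end

children = [Child.new(1, "running"), Child.new(1, "completed")]
poll_once(children, [1]) # one child still running, so poll again
```

In a database-backed version the same question would be a count of non-terminal rows filtered by `parent_execution_log_id` and `state`, which is the lookup the composite index in Task 1 is meant to serve.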

**Modified library files**
- `lib/chrono_forge/executor.rb`: new error classes; `include Methods::Branch` / `Methods::MergeBranches` (via methods.rb); poll-cadence constants.
- `lib/chrono_forge/executor/methods.rb`: include the two new modules.
- `lib/chrono_forge/executor/methods/workflow_states.rb`: completion gate in `complete_workflow!`.
- `lib/chrono_forge/workflow.rb`: `belongs_to :parent_execution_log`.
- `lib/chrono_forge/execution_log.rb`: `has_many :spawned_workflows`.
- `lib/generators/chrono_forge/migration_actions.rb`: add migration to `MIGRATIONS`.
- `chrono_forge.gemspec`: `activejob >= 7.1` floor.
- `README.md`: branch/merge section + caveats.

**New/modified test files**
- `test/internal/db/migrate/20260626000001_add_chrono_forge_parent_execution_log.rb`: apply the column to the test DB.
- `test/internal/app/jobs/`: branch test workflow jobs + a trivial child workflow.
- `test/branch_test.rb`, `test/spawn_each_test.rb`, `test/branch_merge_job_test.rb`, `test/merge_branches_test.rb`, `test/automerge_test.rb`, `test/branch_recovery_test.rb`, `test/branch_scale_test.rb`.
- `test/schema_test.rb`, `test/generators_test.rb`, `test/upgrade_migration_test.rb`: extend for the new column/index.

---

### Task 1: Schema (`parent_execution_log_id` column + `(parent_execution_log_id, state)` index)

**Goal:** Add a generic `parent_execution_log_id` FK column to `chrono_forge_workflows` with a composite index on `(parent_execution_log_id, state)`, shipped as an additive migration wired into the install/upgrade generators and applied to the test DB.

**Files:**
- Create: `lib/generators/chrono_forge/templates/add_chrono_forge_parent_execution_log.rb`
- Modify: `lib/generators/chrono_forge/migration_actions.rb`
- Create: `test/internal/db/migrate/20260626000001_add_chrono_forge_parent_execution_log.rb`
- Modify: `test/schema_test.rb`, `test/generators_test.rb`

**Acceptance Criteria:**
- [ ] `chrono_forge_workflows` has a nullable `parent_execution_log_id` column whose type matches the table's primary-key type (bigint or uuid).
- [ ] A composite index `(parent_execution_log_id, state)` exists.
- [ ] The migration is idempotent (`if_not_exists`) and listed in `MigrationActions::MIGRATIONS`.
- [ ] `generators_test` expects the new migration in the copied set.

**Verify:** `cd .worktrees/branches && bundle exec ruby -I test test/schema_test.rb` → all pass

**Steps:**

- [ ] **Step 1: Write the failing schema test**

Add to `test/schema_test.rb` (inside the existing `SchemaTest`):

```ruby
def test_workflows_have_parent_execution_log_id_column
  assert connection.column_exists?(:chrono_forge_workflows, :parent_execution_log_id),
    "expected chrono_forge_workflows.parent_execution_log_id for branch children"
end

def test_workflows_have_parent_execution_log_state_index
  assert connection.index_exists?(:chrono_forge_workflows, %i[parent_execution_log_id state]),
    "expected composite index on [parent_execution_log_id, state] for the merge probe"
end
```

- [ ] **Step 2: Run the test to verify it fails**

Run: `bundle exec ruby -I test test/schema_test.rb -n test_workflows_have_parent_execution_log_id_column`
Expected: FAIL (column does not exist).

- [ ] **Step 3: Write the migration template**

Create `lib/generators/chrono_forge/templates/add_chrono_forge_parent_execution_log.rb`:

```ruby
# frozen_string_literal: true

# Adds chrono_forge_workflows.parent_execution_log_id: the execution log that
# spawned a workflow (for branches, the branch$<name> log). Deliberately generic
# so any future step that spawns sub-workflows can reuse it. The composite
# [parent_execution_log_id, state] index makes the merge completion probe and the
# dropped-job re-kick index-only at hundreds of thousands of children.
#
# Shipped standalone (matching add_chrono_forge_workflow_state_index) so existing
# installs pick it up via `rails generate chrono_forge:upgrade`.
class AddChronoForgeParentExecutionLog < ActiveRecord::Migration[7.1]
  disable_ddl_transaction!

  def change
    add_column :chrono_forge_workflows, :parent_execution_log_id, parent_log_fk_type,
      null: true, if_not_exists: true

    add_index :chrono_forge_workflows, %i[parent_execution_log_id state],
      if_not_exists: true, **chrono_forge_index_algorithm
  end

  private

  # Match the type of chrono_forge_workflows.id so the FK lines up on both bigint
  # and uuid installs.
  def parent_log_fk_type
    id_col = connection.columns(:chrono_forge_workflows).find { |c| c.name == "id" }
    id_col && id_col.sql_type.to_s.downcase.include?("uuid") ? :uuid : :bigint
  end

  def chrono_forge_index_algorithm
    if connection.adapter_name.to_s.downcase.include?("postgresql")
      {algorithm: :concurrently}
    else
      {}
    end
  end
end
```

- [ ] **Step 4: Wire it into the generators**

In `lib/generators/chrono_forge/migration_actions.rb`, append to `MIGRATIONS`:

```ruby
MIGRATIONS = %w[
  install_chrono_forge
  add_chrono_forge_workflow_state_index
  add_chrono_forge_error_log_step_context
  add_chrono_forge_parent_execution_log
].freeze
```

In `test/generators_test.rb`, update the expected list in `test_install_copies_all_migrations` to include `"add_chrono_forge_parent_execution_log.rb"` (keep it alphabetically sorted, since the test sorts the list), and bump the idempotence count in `test_install_is_idempotent` from `3` to `4`.

- [ ] **Step 5: Apply to the test DB**

Create `test/internal/db/migrate/20260626000001_add_chrono_forge_parent_execution_log.rb` so the test DB gets the **same class body** as the template (Combustion runs these migrations to build the test schema). The shortcut is to load the template file directly:

```ruby
# frozen_string_literal: true

require_relative "../../../../lib/generators/chrono_forge/templates/add_chrono_forge_parent_execution_log"
```

(If the `require_relative` shortcut causes Combustion load-order issues, instead paste the full class body from Step 3 into this file verbatim.)

- [ ] **Step 6: Run tests to verify they pass**

Run: `bundle exec ruby -I test test/schema_test.rb && bundle exec ruby -I test test/generators_test.rb`
Expected: PASS.

- [ ] **Step 7: Commit**

```bash
git add lib/generators test/internal/db/migrate test/schema_test.rb test/generators_test.rb
git commit -m "feat(branches): add parent_execution_log_id column + index"
```

```json:metadata
{"files": ["lib/generators/chrono_forge/templates/add_chrono_forge_parent_execution_log.rb", "lib/generators/chrono_forge/migration_actions.rb", "test/internal/db/migrate/20260626000001_add_chrono_forge_parent_execution_log.rb", "test/schema_test.rb", "test/generators_test.rb"], "verifyCommand": "bundle exec ruby -I test test/schema_test.rb", "acceptanceCriteria": ["parent_execution_log_id column exists", "composite (parent_execution_log_id, state) index exists", "migration listed in MIGRATIONS and generators_test"], "requiresUserVerification": false}
```

---

### Task 2: Model associations

**Goal:** Link children to their spawning branch log via ActiveRecord associations.

**Files:**
- Modify: `lib/chrono_forge/workflow.rb`
- Modify: `lib/chrono_forge/execution_log.rb`
- Test: `test/branch_associations_test.rb`

**Acceptance Criteria:**
- [ ] `Workflow#parent_execution_log` returns the spawning `ExecutionLog` (optional).
- [ ] `ExecutionLog#spawned_workflows` returns the workflows it spawned.

**Verify:** `bundle exec ruby -I test test/branch_associations_test.rb` → PASS

**Steps:**

- [ ] **Step 1: Write the failing test**

Create `test/branch_associations_test.rb`:

```ruby
require "test_helper"

class BranchAssociationsTest < ActiveJob::TestCase
  def test_parent_execution_log_and_spawned_workflows_round_trip
    parent = ChronoForge::Workflow.create!(key: "p-#{SecureRandom.hex}", job_class: "X")
    log = parent.execution_logs.create!(step_name: "branch$grp")
    child = ChronoForge::Workflow.create!(
      key: "c-#{SecureRandom.hex}", job_class: "Y", parent_execution_log_id: log.id
    )

    assert_equal log, child.parent_execution_log
    assert_includes log.spawned_workflows, child
  end
end
```

- [ ] **Step 2: Run to verify it fails**

Run: `bundle exec ruby -I test test/branch_associations_test.rb`
Expected: FAIL with `NoMethodError: undefined method 'parent_execution_log'`.

- [ ] **Step 3: Add the associations**

In `lib/chrono_forge/workflow.rb`, after `has_many :error_logs, dependent: :destroy`:

```ruby
belongs_to :parent_execution_log,
  class_name: "ChronoForge::ExecutionLog", optional: true
```

In `lib/chrono_forge/execution_log.rb`, after `belongs_to :workflow`:

```ruby
has_many :spawned_workflows,
  class_name: "ChronoForge::Workflow",
  foreign_key: :parent_execution_log_id,
  inverse_of: :parent_execution_log,
  dependent: :nullify
```

- [ ] **Step 4: Run to verify it passes**

Run: `bundle exec ruby -I test test/branch_associations_test.rb`
Expected: PASS.

- [ ] **Step 5: Commit**

```bash
git add lib/chrono_forge/workflow.rb lib/chrono_forge/execution_log.rb test/branch_associations_test.rb
git commit -m "feat(branches): parent_execution_log / spawned_workflows associations"
```

```json:metadata
{"files": ["lib/chrono_forge/workflow.rb", "lib/chrono_forge/execution_log.rb", "test/branch_associations_test.rb"], "verifyCommand": "bundle exec ruby -I test test/branch_associations_test.rb", "acceptanceCriteria": ["parent_execution_log association", "spawned_workflows association"], "requiresUserVerification": false}
```

---

### Task 3: `branch` + `spawn` (block, registry, eager single dispatch, seal, skip-on-replay)

**Goal:** Implement the `branch` block (durable step, in-memory registry, seal-on-close, **skip-the-block-when-sealed**) and `spawn` (single eager child dispatch). `spawn` outside a branch raises.

**Files:**
- Create: `lib/chrono_forge/executor/methods/branch.rb`
- Modify: `lib/chrono_forge/executor.rb` (error classes)
- Modify: `lib/chrono_forge/executor/methods.rb` (include)
- Create: `test/internal/app/jobs/single_spawn_workflow.rb`, `test/internal/app/jobs/noop_child.rb`
- Create: `test/branch_test.rb`

**Acceptance Criteria:**
- [ ] `branch :g do spawn :c, NoopChild end` creates a child with key `"<parent.key>$g$c"`, `job_class: "NoopChild"`, `parent_execution_log_id` = the `branch$g` log id, `state: idle`.
- [ ] The `branch$g` log is `completed` (sealed) after the block closes.
- [ ] On replay (sealed), the block body is **not** re-executed (no duplicate child rows, no re-dispatch).
- [ ] `spawn` called outside a `branch` block raises `NotInBranchError`.
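
The criteria above (deterministic child key, seal on close, skip on replay) can be modeled without a database. The sketch below is a hash-backed toy, not the implementation planned in Step 4: `ReplaySketch`, its `body_runs` counter, and the in-memory `@logs`/`@children` hashes are hypothetical stand-ins for the execution-log and workflow tables, and it raises a plain `RuntimeError` where the plan uses `NotInBranchError`.

```ruby
# Hypothetical, hash-backed model of branch sealing and replay.
class ReplaySketch
  attr_reader :children, :body_runs

  def initialize(parent_key)
    @parent_key = parent_key
    @logs = {}      # step name => :pending or :completed
    @children = {}  # child key => job class name
    @body_runs = 0  # how many times a branch body actually ran
    @current_branch = nil
  end

  def branch(name)
    step = "branch$#{name}"
    @logs[step] ||= :pending
    return name if @logs[step] == :completed # sealed: skip the block entirely

    @current_branch = name.to_s
    begin
      @body_runs += 1
      yield
    ensure
      @current_branch = nil
    end
    @logs[step] = :completed # seal once the block has closed without raising
    name
  end

  def spawn(name, job_class)
    raise "spawn may only be called inside a branch block" unless @current_branch

    key = "#{@parent_key}$#{@current_branch}$#{name}"
    @children[key] ||= job_class # insert-if-absent keeps re-dispatch idempotent
    name
  end
end

wf = ReplaySketch.new("ss-1")
2.times { wf.branch(:grp) { wf.spawn(:child, "NoopChild") } } # run, then replay
wf.children.keys # => ["ss-1$grp$child"]
wf.body_runs     # => 1
```

The second pass returns before yielding, which is the property the third criterion asks the test suite to prove: a sealed branch performs no inserts and no enqueues.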

**Verify:** `bundle exec ruby -I test test/branch_test.rb` → PASS

**Steps:**

- [ ] **Step 1: Write failing tests + fixtures**

Create `test/internal/app/jobs/noop_child.rb`:

```ruby
class NoopChild < WorkflowJob
  prepend ChronoForge::Executor

  def perform(**)
    durably_execute :noop
  end

  private

  def noop = nil
end
```

Create `test/internal/app/jobs/single_spawn_workflow.rb`:

```ruby
class SingleSpawnWorkflow < WorkflowJob
  prepend ChronoForge::Executor

  def perform
    branch :grp, automerge: true do
      spawn :child, NoopChild, foo: "bar"
    end
  end
end
```

Create `test/branch_test.rb`:

```ruby
require "test_helper"

class BranchTest < ActiveJob::TestCase
  def test_spawn_creates_linked_child_and_seals_branch
    SingleSpawnWorkflow.perform_later("ss-1")
    perform_all_jobs

    parent = ChronoForge::Workflow.find_by(key: "ss-1")
    branch_log = parent.execution_logs.find_by(step_name: "branch$grp")
    assert branch_log.completed?, "branch should seal when the block closes"

    child = ChronoForge::Workflow.find_by(key: "ss-1$grp$child")
    assert child, "child should be created with deterministic key"
    assert_equal "NoopChild", child.job_class
    assert_equal branch_log.id, child.parent_execution_log_id
    assert_equal({"foo" => "bar"}, child.kwargs)
  end

  def test_spawn_outside_branch_raises
    workflow = Class.new(WorkflowJob) do
      prepend ChronoForge::Executor
      def perform = spawn(:x, NoopChild)
    end
    Object.const_set(:BareSpawnWorkflow, workflow)
    BareSpawnWorkflow.perform_later("bare-1")
    assert_raises(ChronoForge::Executor::NotInBranchError) { perform_all_jobs }
  ensure
    Object.send(:remove_const, :BareSpawnWorkflow) if defined?(BareSpawnWorkflow)
  end

  def test_sealed_branch_block_is_not_re_executed_on_replay
    # First run dispatches + seals.
    SingleSpawnWorkflow.perform_later("ss-2")
    perform_all_jobs
    branch_log = ChronoForge::Workflow.find_by(key: "ss-2").execution_logs.find_by(step_name: "branch$grp")

    # Re-run the same workflow key: the sealed branch must skip its block.
    inserts = 0
    sub = ActiveSupport::Notifications.subscribe("sql.active_record") do |*a|
      inserts += 1 if /INSERT INTO ["`]?chrono_forge_workflows/i.match?(a.last[:sql].to_s)
    end
    SingleSpawnWorkflow.perform_later("ss-2")
    perform_all_jobs
    ActiveSupport::Notifications.unsubscribe(sub)

    assert_equal 0, inserts, "sealed branch must not re-dispatch children on replay"
    assert_equal 1, ChronoForge::Workflow.where(parent_execution_log_id: branch_log.id).count
  end
end
```

- [ ] **Step 2: Run to verify they fail**

Run: `bundle exec ruby -I test test/branch_test.rb`
Expected: FAIL with `NameError: uninitialized constant ... NotInBranchError` / `NoMethodError: branch`.

- [ ] **Step 3: Add the error classes**

In `lib/chrono_forge/executor.rb`, after `class InvalidStepName < NotExecutableError; end`:

```ruby
# spawn/spawn_each called outside a branch block. NotExecutableError so it
# propagates (fail-fast on a programming error) rather than being retried.
class NotInBranchError < NotExecutableError; end

# A branch was opened but neither merged via merge_branches nor declared
# automerge: true. Raised at the completion gate. Fail-fast (not retried).
class UnmergedBranchError < NotExecutableError; end
```

- [ ] **Step 4: Implement `branch` + `spawn` + shared helpers**

Create `lib/chrono_forge/executor/methods/branch.rb`:

```ruby
module ChronoForge
  module Executor
    module Methods
      module Branch
        # Opens a named branch: a durable fan-out step. Spawns inside the block
        # eagerly create + enqueue child workflows; the branch SEALS when the
        # block closes. Returns without waiting (branches are concurrent; the
        # join is a separate merge_branches / automerge).
        def branch(name, automerge: false)
          raise ArgumentError, "branch requires a block" unless block_given?
          raise ArgumentError, "branch blocks cannot be nested" if @current_branch
          validate_step_name_segment!(name)

          step_name = "branch$#{name}"
          log = find_or_create_execution_log!(step_name) { |l| l.started_at = Time.current }

          # The sealed branch log may be a readonly, id-less cache stand-in; fetch
          # the real id so the registry/merge can scope children to it.
          log_id = log.id || ExecutionLog.where(workflow: @workflow, step_name: step_name).pick(:id)
          (@open_branches ||= {})[name.to_s] = {automerge: automerge, log_id: log_id}

          # ---- THE single most important correctness/performance property ----
          # A SEALED branch skips its block ENTIRELY. The expensive source
          # enumeration in spawn_each never re-runs after sealing. Do not move
          # dispatch out from behind this guard.
          unless log.completed?
            @current_branch = {name: name.to_s, log: log, seq: 0}
            begin
              yield
            ensure
              @current_branch = nil
            end
            log.update!(state: :completed, completed_at: Time.current)
          end

          name
        end

        # Dispatch a single child into the current branch.
        def spawn(name, workflow_class, **kwargs)
          cb = current_branch!
          validate_step_name_segment!(name)
          child_key = "#{@workflow.key}$#{cb[:name]}$#{name}"
          dispatch_children(cb, [[child_key, workflow_class, kwargs]])
          name
        end

        private

        def current_branch!
          @current_branch || raise(NotInBranchError, "spawn/spawn_each may only be called inside a branch block")
        end

        # Bulk-create child workflow rows then bulk-enqueue their jobs.
        # perform_all_later bypasses the class-level perform_later guard, so we
        # validate the args ourselves before enqueuing.
        def dispatch_children(cb, entries)
          return if entries.empty?
          now = Time.current
          rows = entries.map do |child_key, klass, kwargs|
            validate_child_enqueue!(child_key, kwargs)
            {
              key: child_key, job_class: klass.to_s,
              kwargs: kwargs, options: {}, context: {},
              state: Workflow.states[:idle],
              parent_execution_log_id: cb[:log].id,
              created_at: now, updated_at: now
            }
          end
          # On-conflict-ignore makes re-dispatch (crash recovery) idempotent.
          Workflow.insert_all(rows, unique_by: :key)
          jobs = entries.map { |child_key, klass, kwargs| klass.new(child_key, **kwargs) }
          ActiveJob.perform_all_later(jobs)
        end

        def validate_child_enqueue!(child_key, kwargs)
          unless child_key.is_a?(String)
            raise ArgumentError, "child key must be a String (got #{child_key.inspect})"
          end
          reserved = kwargs.keys.map(&:to_sym) & RESERVED_KWARGS
          if reserved.any?
            raise ArgumentError, "#{reserved.join(", ")} are reserved ChronoForge keywords"
          end
        end

        # Advance (and persist) a spawn_each cursor on the branch log.
        # `n` is the running item index; `pk` is the AR keyset position (nil for
        # plain enumerables).
        def advance_cursor!(cb, spawn_name, n:, pk: nil)
          meta = cb[:log].metadata || {}
          cursors = meta["cursors"] || {}
          entry = cursors[spawn_name.to_s] || {}
          entry["n"] = n
          entry["pk"] = pk unless pk.nil?
          cursors[spawn_name.to_s] = entry
          meta["cursors"] = cursors
          cb[:log].update!(metadata: meta)
        end
      end
    end
  end
end
```

In `lib/chrono_forge/executor/methods.rb`, add the include (place `Branch` before `WorkflowStates` so its private helpers are available to the completion gate):

```ruby
module ChronoForge
  module Executor
    module Methods
      include Methods::Wait
      include Methods::WaitUntil
      include Methods::ContinueIf
      include Methods::DurablyExecute
      include Methods::DurablyRepeat
      include Methods::Branch
      include Methods::MergeBranches
      include Methods::WorkflowStates
    end
  end
end
```

> Note: `Methods::MergeBranches` is referenced here but created in Task 6. Until then, add a temporary empty module to keep the suite loading, OR implement Task 6 immediately after this task. The subagent executing this plan should create `merge_branches.rb` with at least `module ChronoForge; module Executor; module Methods; module MergeBranches; end; end; end; end` now and flesh it out in Task 6.

- [ ] **Step 5: Run tests to verify they pass**

Run: `bundle exec ruby -I test test/branch_test.rb`
Expected: PASS (3 tests).

- [ ] **Step 6: Commit**

```bash
git add lib/chrono_forge test/internal/app/jobs/noop_child.rb test/internal/app/jobs/single_spawn_workflow.rb test/branch_test.rb
git commit -m "feat(branches): branch block + spawn single child"
```

```json:metadata
{"files": ["lib/chrono_forge/executor/methods/branch.rb", "lib/chrono_forge/executor.rb", "lib/chrono_forge/executor/methods.rb", "test/branch_test.rb", "test/internal/app/jobs/noop_child.rb", "test/internal/app/jobs/single_spawn_workflow.rb"], "verifyCommand": "bundle exec ruby -I test test/branch_test.rb", "acceptanceCriteria": ["spawn creates linked child + seals branch", "spawn outside branch raises NotInBranchError", "sealed branch skips block on replay"], "requiresUserVerification": false}
```

---

### Task 4: `spawn_each` (streaming bulk dispatch with cursor)

**Goal:** Implement `spawn_each(name, source, of:)`: stream an AR relation (keyset) or any enumerable, dispatching one child per item keyed `name_{index}`, with the class returned from the block and a resumable per-spawn cursor. Raise on a conflicting AR `.order`.

**Files:**
- Modify: `lib/chrono_forge/executor/methods/branch.rb`
- Create: `test/internal/app/jobs/spawn_each_workflow.rb`
- Create: `test/spawn_each_test.rb`

**Acceptance Criteria:**
- [ ] `spawn_each :items, User.all` over N users creates N children keyed `<parent.key>$grp$items_0 … items_{N-1}`, each with `parent_execution_log_id` = the branch log id.
- [ ] The block's returned class is honored per item (mixed classes supported).
- [ ] An AR relation with an explicit conflicting `.order(...)` raises (via `error_on_ignore: true`).
- [ ] A plain enumerable source works (offset cursor).
- [ ] A cursor hash with `"pk"` and `"n"` keys is persisted under `metadata.cursors[name]`.
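
For the plain-enumerable case, the resumable offset cursor in these criteria might look roughly like the sketch below. It is a toy under explicit assumptions: `spawn_each_sketch`, the `store` hash (standing in for the branch log's `metadata`), the `dispatched` hash (standing in for child rows), and the `crash_after` switch are all hypothetical. It only illustrates the idea from the plan that the running index `n` is persisted per batch of `of` items so a restart skips what was already dispatched.

```ruby
# Hypothetical model of spawn_each over a plain enumerable.
# store:       stand-in for the branch log metadata (holds the cursor)
# dispatched:  stand-in for child workflow rows, keyed by child key
# crash_after: simulate a crash after that many persisted batches
def spawn_each_sketch(parent_key, branch, name, source, store:, dispatched:, of: 2, crash_after: nil)
  cursors = (store["cursors"] ||= {})
  cursor = (cursors[name.to_s] ||= {"n" => 0})
  flushes = 0

  flush = lambda do |batch|
    batch.each do |index, klass, kwargs|
      # insert-if-absent, so replaying a half-finished batch is harmless
      dispatched["#{parent_key}$#{branch}$#{name}_#{index}"] ||= [klass, kwargs]
    end
    cursor["n"] = batch.last[0] + 1 # persist the offset after each batch
    flushes += 1
    raise "simulated crash" if crash_after == flushes
  end

  batch = []
  source.each_with_index do |item, index|
    next if index < cursor["n"] # already dispatched before the restart

    klass, kwargs = yield(item)
    batch << [index, klass, kwargs]
    if batch.size == of
      flush.call(batch)
      batch = []
    end
  end
  flush.call(batch) unless batch.empty?
  cursor["n"]
end

store = {}
dispatched = {}
items = %w[a b c d e]
begin
  spawn_each_sketch("se-1", "grp", :items, items, store: store, dispatched: dispatched, crash_after: 1) do |item|
    ["NoopChild", {item: item}]
  end
rescue RuntimeError
  # crashed after the first batch; the cursor was already saved
end
spawn_each_sketch("se-1", "grp", :items, items, store: store, dispatched: dispatched) do |item|
  ["NoopChild", {item: item}]
end
```

Note that this toy still walks the skipped prefix of the enumerable on resume; the keyset (`"pk"`) cursor for AR relations exists precisely so a database-backed source does not have to.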

**Verify:** `bundle exec ruby -I test test/spawn_each_test.rb` → PASS

**Steps:**

- [ ] **Step 1: Write failing tests + fixtures**

Create `test/internal/app/jobs/spawn_each_workflow.rb`:

```ruby
class SpawnEachWorkflow < WorkflowJob
  prepend ChronoForge::Executor

  def perform(of: 1000)
    branch :grp, automerge: true do
      spawn_each :items, User.order(:id), of: of do |user|
        [NoopChild, {user_id: user.id}]
      end
    end
  end
end
```
|
|
565
|
+
|
|
566
|
+
Create `test/spawn_each_test.rb`:
|
|
567
|
+
|
|
568
|
+
```ruby
|
|
569
|
+
require "test_helper"
|
|
570
|
+
|
|
571
|
+
class SpawnEachTest < ActiveJob::TestCase
|
|
572
|
+
def setup
|
|
573
|
+
User.delete_all
|
|
574
|
+
@users = 5.times.map { |i| User.create!(name: "u#{i}", email: "u#{i}@e.com") }
|
|
575
|
+
end
|
|
576
|
+
|
|
577
|
+
def test_spawn_each_creates_one_indexed_child_per_item
|
|
578
|
+
SpawnEachWorkflow.perform_later("se-1")
|
|
579
|
+
perform_all_jobs
|
|
580
|
+
|
|
581
|
+
parent = ChronoForge::Workflow.find_by(key: "se-1")
|
|
582
|
+
branch_log = parent.execution_logs.find_by(step_name: "branch$grp")
|
|
583
|
+
children = ChronoForge::Workflow.where(parent_execution_log_id: branch_log.id).order(:key)
|
|
584
|
+
|
|
585
|
+
assert_equal 5, children.count
|
|
586
|
+
assert_equal (0..4).map { |i| "se-1$grp$items_#{i}" }, children.pluck(:key)
|
|
587
|
+
assert_equal @users.first.id, children.first.kwargs["user_id"]
|
|
588
|
+
cursor = branch_log.reload.metadata["cursors"]["items"]
|
|
589
|
+
assert_equal 5, cursor["n"]
|
|
590
|
+
end
|
|
591
|
+
|
|
592
|
+
def test_spawn_each_honors_class_from_block
|
|
593
|
+
klass = Class.new(WorkflowJob) { prepend ChronoForge::Executor; def perform(**) = nil }
|
|
594
|
+
Object.const_set(:AltChild, klass)
|
|
595
|
+
job = Class.new(WorkflowJob) do
|
|
596
|
+
prepend ChronoForge::Executor
|
|
597
|
+
def perform
|
|
598
|
+
branch(:g, automerge: true) do
|
|
599
|
+
spawn_each(:i, User.all) { |u| u.id.even? ? [AltChild, {id: u.id}] : [NoopChild, {id: u.id}] }
|
|
600
|
+
end
|
|
601
|
+
end
|
|
602
|
+
end
|
|
603
|
+
Object.const_set(:MixedClassWorkflow, job)
|
|
604
|
+
|
|
605
|
+
MixedClassWorkflow.perform_later("mc-1")
|
|
606
|
+
perform_all_jobs
|
|
607
|
+
|
|
608
|
+
classes = ChronoForge::Workflow.where("key LIKE ?", "mc-1$g$i_%").pluck(:job_class).uniq.sort
|
|
609
|
+
assert_equal %w[AltChild NoopChild], classes
|
|
610
|
+
ensure
|
|
611
|
+
Object.send(:remove_const, :AltChild) if defined?(AltChild)
|
|
612
|
+
Object.send(:remove_const, :MixedClassWorkflow) if defined?(MixedClassWorkflow)
|
|
613
|
+
end
|
|
614
|
+
|
|
615
|
+
def test_spawn_each_raises_on_conflicting_order
|
|
616
|
+
job = Class.new(WorkflowJob) do
|
|
617
|
+
prepend ChronoForge::Executor
|
|
618
|
+
def perform
|
|
619
|
+
branch(:g, automerge: true) do
|
|
620
|
+
spawn_each(:i, User.order(:email)) { |u| [NoopChild, {id: u.id}] }
|
|
621
|
+
end
|
|
622
|
+
end
|
|
623
|
+
end
|
|
624
|
+
Object.const_set(:BadOrderWorkflow, job)
|
|
625
|
+
BadOrderWorkflow.perform_later("bo-1")
|
|
626
|
+
assert_raises(ArgumentError) { perform_all_jobs } # error_on_ignore: true raises ArgumentError for a scoped order
|
|
627
|
+
ensure
|
|
628
|
+
Object.send(:remove_const, :BadOrderWorkflow) if defined?(BadOrderWorkflow)
|
|
629
|
+
end
|
|
630
|
+
end
|
|
631
|
+
```
|
|
632
|
+
|
|
633
|
+
- [ ] **Step 2: Run to verify they fail**
|
|
634
|
+
|
|
635
|
+
Run: `bundle exec ruby -I test test/spawn_each_test.rb`
|
|
636
|
+
Expected: FAIL — `NoMethodError: spawn_each`.
|
|
637
|
+
|
|
638
|
+
- [ ] **Step 3: Implement `spawn_each`**
|
|
639
|
+
|
|
640
|
+
Add to `lib/chrono_forge/executor/methods/branch.rb` (in module `Branch`, public, next to `spawn`):
|
|
641
|
+
|
|
642
|
+
```ruby
|
|
643
|
+
# Dispatch one child per item of `source`, streamed. AR relations use
|
|
644
|
+
# keyset iteration (find_in_batches start:) for constant memory; any other
|
|
645
|
+
# enumerable uses an offset cursor. Items are keyed `name_{index}` by
|
|
646
|
+
# their sequential position, so the source must re-enumerate identically
|
|
647
|
+
# across replays. The block returns [WorkflowClass, kwargs] (or a class).
|
|
648
|
+
def spawn_each(name, source, of: 1000)
|
|
649
|
+
cb = current_branch!
|
|
650
|
+
validate_step_name_segment!(name)
|
|
651
|
+
cursor = (cb[:log].metadata&.dig("cursors", name.to_s)) || {}
|
|
652
|
+
n = (cursor["n"] || 0)
|
|
653
|
+
|
|
654
|
+
if source.is_a?(ActiveRecord::Relation)
|
|
655
|
+
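# find_in_batches treats start: as inclusive, so on resume exclude the row the cursor already covers.
resumable = cursor["pk"] ? source.where.not(id: cursor["pk"]) : source
resumable.find_in_batches(batch_size: of, start: cursor["pk"], error_on_ignore: true) do |records|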
|
|
656
|
+
entries = records.map do |record|
|
|
657
|
+
klass, kw = normalize_spawn(yield(record))
|
|
658
|
+
ck = "#{@workflow.key}$#{cb[:name]}$#{name}_#{n}"
|
|
659
|
+
n += 1
|
|
660
|
+
[ck, klass, kw]
|
|
661
|
+
end
|
|
662
|
+
dispatch_children(cb, entries)
|
|
663
|
+
advance_cursor!(cb, name, pk: records.last.id, n: n)
|
|
664
|
+
end
|
|
665
|
+
else
|
|
666
|
+
source.drop(n).each_slice(of) do |slice|
|
|
667
|
+
entries = slice.map do |item|
|
|
668
|
+
klass, kw = normalize_spawn(yield(item))
|
|
669
|
+
ck = "#{@workflow.key}$#{cb[:name]}$#{name}_#{n}"
|
|
670
|
+
n += 1
|
|
671
|
+
[ck, klass, kw]
|
|
672
|
+
end
|
|
673
|
+
dispatch_children(cb, entries)
|
|
674
|
+
advance_cursor!(cb, name, n: n)
|
|
675
|
+
end
|
|
676
|
+
end
|
|
677
|
+
name
|
|
678
|
+
end
|
|
679
|
+
```
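The offset-cursor branch above can be exercised without a database. The following is a minimal sketch, not the library's code: `cursors` stands in for `metadata["cursors"]`, `dispatched` stands in for the children table, and `dispatch_in_slices` and `fail_after:` are hypothetical names used only to simulate a crash at a slice boundary.

```ruby
# In-memory sketch of offset-cursor resume for a plain enumerable.
# `cursors` and `dispatched` are hypothetical stand-ins for persisted state.
def dispatch_in_slices(parent_key, branch, name, source, cursors, dispatched, of:, fail_after: nil)
  n = cursors.dig(name.to_s, "n") || 0
  slices_done = 0

  source.drop(n).each_slice(of) do |slice|
    # Simulate the job dying before the next slice is processed.
    raise "simulated crash" if fail_after && slices_done >= fail_after

    slice.each do |_item|
      dispatched << "#{parent_key}$#{branch}$#{name}_#{n}"
      n += 1
    end

    # The cursor advances only after the whole slice has been dispatched.
    cursors[name.to_s] = {"n" => n}
    slices_done += 1
  end
end

cursors = {}
dispatched = []
items = ("a".."g").to_a # 7 items

begin
  dispatch_in_slices("wf-1", :grp, :items, items, cursors, dispatched, of: 3, fail_after: 1)
rescue RuntimeError
  # The first slice landed and the cursor now records 3 dispatched items.
end

# The second run skips the 3 items already covered by the cursor.
dispatch_in_slices("wf-1", :grp, :items, items, cursors, dispatched, of: 3)
```

The sketch only crashes between slices. A crash between dispatching a slice and writing its cursor would replay that slice with the same keys, so the real dispatch would need to be idempotent on the child key for the "no duplicate children" guarantee to hold.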
|
|
680
|
+
|
|
681
|
+
And the private helper (add near the other privates in `Branch`):
|
|
682
|
+
|
|
683
|
+
```ruby
|
|
684
|
+
# Normalize the block return: [Klass, kwargs] or a bare Klass.
|
|
685
|
+
def normalize_spawn(result)
|
|
686
|
+
klass, kwargs = Array(result)
|
|
687
|
+
[klass, kwargs || {}]
|
|
688
|
+
end
|
|
689
|
+
```
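As a quick illustration of the two accepted block return shapes, here is the same helper logic run standalone. `DemoChild` is a hypothetical placeholder class, not part of the plan.

```ruby
# Standalone copy of the helper's logic, for illustration only.
def normalize_spawn(result)
  klass, kwargs = Array(result)
  [klass, kwargs || {}]
end

# Hypothetical stand-in for a workflow class.
class DemoChild; end

pair = normalize_spawn([DemoChild, {user_id: 1}]) # class plus kwargs
bare = normalize_spawn(DemoChild)                 # bare class, kwargs default to {}
```

Note that `Array()` turns a bare Hash into an array of key/value pairs, so a block should return either a class or a `[class, kwargs]` pair, never the kwargs alone.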
|
|
690
|
+
|
|
691
|
+
- [ ] **Step 4: Run to verify they pass**
|
|
692
|
+
|
|
693
|
+
Run: `bundle exec ruby -I test test/spawn_each_test.rb`
|
|
694
|
+
Expected: PASS.
|
|
695
|
+
|
|
696
|
+
- [ ] **Step 5: Commit**
|
|
697
|
+
|
|
698
|
+
```bash
|
|
699
|
+
git add lib/chrono_forge/executor/methods/branch.rb test/spawn_each_test.rb test/internal/app/jobs/spawn_each_workflow.rb
|
|
700
|
+
git commit -m "feat(branches): spawn_each streaming bulk dispatch with cursor"
|
|
701
|
+
```
|
|
702
|
+
|
|
703
|
+
```json:metadata
|
|
704
|
+
{"files": ["lib/chrono_forge/executor/methods/branch.rb", "test/spawn_each_test.rb", "test/internal/app/jobs/spawn_each_workflow.rb"], "verifyCommand": "bundle exec ruby -I test test/spawn_each_test.rb", "acceptanceCriteria": ["one indexed child per item", "class from block honored (mixed)", "raises on conflicting AR order", "cursor persisted"], "requiresUserVerification": false}
|
|
705
|
+
```
|
|
706
|
+
|
|
707
|
+
---
|
|
708
|
+
|
|
709
|
+
### Task 5: `BranchMergeJob` — the lightweight poller
|
|
710
|
+
|
|
711
|
+
**Goal:** Implement the dedicated poller: capped-count probe per branch, wake the parent when all branches are sealed + drained, otherwise re-kick dropped jobs and reschedule with an adaptive (capped-count) interval.
|
|
712
|
+
|
|
713
|
+
**Files:**
|
|
714
|
+
- Create: `lib/chrono_forge/branch_merge_job.rb`
|
|
715
|
+
- Modify: `lib/chrono_forge/executor.rb` (poll-cadence constants — optional, can live on the job)
|
|
716
|
+
- Create: `test/branch_merge_job_test.rb`
|
|
717
|
+
|
|
718
|
+
**Acceptance Criteria:**
|
|
719
|
+
- [ ] When every branch log is `completed` (sealed) and has zero incomplete children, the job enqueues the parent workflow (`parent_job_class.perform_later(parent_key)`) and does not reschedule.
|
|
720
|
+
- [ ] Otherwise it reschedules itself with delay `clamp(pending * FACTOR, min, max)` and does not wake the parent.
|
|
721
|
+
- [ ] The pending count is capped at `CAP` (never counts beyond it).
|
|
722
|
+
- [ ] A never-started child (`started_at` nil) older than the re-kick threshold is re-enqueued.
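
The reschedule delay in the criteria above is a clamp of `pending * FACTOR` between the two intervals. A minimal sketch, assuming the `CAP` and `FACTOR` values used later in this task; `poll_delay` is a hypothetical helper name, and it caps the total count where the job itself caps each branch's count with `limit(CAP)`.

```ruby
# Sketch of the adaptive poll delay. Constant values are taken from the
# BranchMergeJob listing in this task; the method name is hypothetical.
CAP = 5_000   # never count more pending children than this
FACTOR = 0.06 # seconds of delay per pending child

def poll_delay(pending, min_interval, max_interval)
  capped = [pending, CAP].min
  [[capped * FACTOR, min_interval].max, max_interval].min
end

idle_delay = poll_delay(0, 5, 300)         # nothing pending: floor at min_interval
mid_delay  = poll_delay(1_000, 5, 300)     # scales with the pending count
max_delay  = poll_delay(1_000_000, 5, 300) # capped count: ceiling at max_interval
```

With the default intervals, `CAP * FACTOR` works out to the 5-minute maximum, which is why counting past the cap would not change the delay.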
|
|
723
|
+
|
|
724
|
+
**Verify:** `bundle exec ruby -I test test/branch_merge_job_test.rb` → PASS
|
|
725
|
+
|
|
726
|
+
**Steps:**
|
|
727
|
+
|
|
728
|
+
- [ ] **Step 1: Write failing tests**
|
|
729
|
+
|
|
730
|
+
Create `test/branch_merge_job_test.rb`:
|
|
731
|
+
|
|
732
|
+
```ruby
|
|
733
|
+
require "test_helper"
|
|
734
|
+
|
|
735
|
+
class BranchMergeJobTest < ActiveJob::TestCase
|
|
736
|
+
def setup
|
|
737
|
+
@parent = ChronoForge::Workflow.create!(key: "bmj-parent", job_class: "SingleSpawnWorkflow")
|
|
738
|
+
@log = @parent.execution_logs.create!(step_name: "branch$g", state: :completed)
|
|
739
|
+
end
|
|
740
|
+
|
|
741
|
+
def child!(state:, started_at: Time.current)
|
|
742
|
+
ChronoForge::Workflow.create!(
|
|
743
|
+
key: "c-#{SecureRandom.hex}", job_class: "NoopChild",
|
|
744
|
+
parent_execution_log_id: @log.id, state: state, started_at: started_at
|
|
745
|
+
)
|
|
746
|
+
end
|
|
747
|
+
|
|
748
|
+
def test_wakes_parent_when_all_complete
|
|
749
|
+
child!(state: :completed)
|
|
750
|
+
assert_enqueued_with(job: SingleSpawnWorkflow, args: ["bmj-parent"]) do
|
|
751
|
+
ChronoForge::BranchMergeJob.perform_now("bmj-parent", "SingleSpawnWorkflow", [@log.id], 5, 300)
|
|
752
|
+
end
|
|
753
|
+
end
|
|
754
|
+
|
|
755
|
+
def test_reschedules_when_incomplete
|
|
756
|
+
child!(state: :running)
|
|
757
|
+
assert_enqueued_with(job: ChronoForge::BranchMergeJob) do
|
|
758
|
+
ChronoForge::BranchMergeJob.perform_now("bmj-parent", "SingleSpawnWorkflow", [@log.id], 5, 300)
|
|
759
|
+
end
|
|
760
|
+
refute_enqueued_with(job: SingleSpawnWorkflow)
|
|
761
|
+
end
|
|
762
|
+
|
|
763
|
+
def test_rekicks_never_started_child
|
|
764
|
+
stuck = child!(state: :idle, started_at: nil)
|
|
765
|
+
stuck.update_column(:updated_at, 10.minutes.ago)
|
|
766
|
+
assert_enqueued_with(job: NoopChild, args: [stuck.key]) do
|
|
767
|
+
ChronoForge::BranchMergeJob.perform_now("bmj-parent", "SingleSpawnWorkflow", [@log.id], 5, 300)
|
|
768
|
+
end
|
|
769
|
+
end
|
|
770
|
+
end
|
|
771
|
+
```
|
|
772
|
+
|
|
773
|
+
- [ ] **Step 2: Run to verify they fail**
|
|
774
|
+
|
|
775
|
+
Run: `bundle exec ruby -I test test/branch_merge_job_test.rb`
|
|
776
|
+
Expected: FAIL — `NameError: uninitialized constant ChronoForge::BranchMergeJob`.
|
|
777
|
+
|
|
778
|
+
- [ ] **Step 3: Implement the poller**
|
|
779
|
+
|
|
780
|
+
Create `lib/chrono_forge/branch_merge_job.rb`:
|
|
781
|
+
|
|
782
|
+
```ruby
|
|
783
|
+
module ChronoForge
|
|
784
|
+
# Lightweight poller that joins one or more branches. NOT a workflow — it holds
|
|
785
|
+
# no lock, does no replay, and carries no context. It exists so the heavy parent
|
|
786
|
+
# workflow is replayed only twice per merge (kick off + completion wake).
|
|
787
|
+
class BranchMergeJob < ActiveJob::Base
|
|
788
|
+
CAP = 5_000 # cap the pending count; beyond it we just pick max_interval
|
|
789
|
+
FACTOR = 0.06 # seconds of delay per pending child
|
|
790
|
+
REKICK_AFTER = 5.minutes
|
|
791
|
+
|
|
792
|
+
def perform(parent_key, parent_job_class, branch_log_ids, min_interval, max_interval)
|
|
793
|
+
pending = branch_log_ids.sum { |id| incomplete_scope(id).limit(CAP).count }
|
|
794
|
+
sealed = branch_log_ids.all? { |id| branch_sealed?(id) }
|
|
795
|
+
|
|
796
|
+
if sealed && pending.zero?
|
|
797
|
+
parent_job_class.constantize.perform_later(parent_key)
|
|
798
|
+
return
|
|
799
|
+
end
|
|
800
|
+
|
|
801
|
+
rekick_dropped_jobs(branch_log_ids)
|
|
802
|
+
|
|
803
|
+
delay = [[pending * FACTOR, min_interval].max, max_interval].min
|
|
804
|
+
self.class.set(wait: delay.seconds)
|
|
805
|
+
.perform_later(parent_key, parent_job_class, branch_log_ids, min_interval, max_interval)
|
|
806
|
+
end
|
|
807
|
+
|
|
808
|
+
private
|
|
809
|
+
|
|
810
|
+
def incomplete_scope(branch_log_id)
|
|
811
|
+
Workflow.where(parent_execution_log_id: branch_log_id)
|
|
812
|
+
.where.not(state: Workflow.states[:completed])
|
|
813
|
+
end
|
|
814
|
+
|
|
815
|
+
def branch_sealed?(branch_log_id)
|
|
816
|
+
ExecutionLog.where(id: branch_log_id, state: ExecutionLog.states[:completed]).exists?
|
|
817
|
+
end
|
|
818
|
+
|
|
819
|
+
# A child dispatched but never run (its job was dropped by the backend) is
|
|
820
|
+
# re-enqueued. started_at IS NULL can't distinguish "never enqueued" from
|
|
821
|
+
# "queued but not yet picked up", so we only re-kick children that have been
|
|
822
|
+
# idle past REKICK_AFTER. Re-enqueue is idempotent: a completed/running child
|
|
823
|
+
# no-ops via the executable?/lock guard.
|
|
824
|
+
def rekick_dropped_jobs(branch_log_ids)
|
|
825
|
+
branch_log_ids.each do |id|
|
|
826
|
+
Workflow.where(parent_execution_log_id: id, started_at: nil)
|
|
827
|
+
.where("updated_at < ?", REKICK_AFTER.ago)
|
|
828
|
+
.find_each do |child|
|
|
829
|
+
child.job_klass.perform_later(child.key, **child.kwargs.symbolize_keys)
|
|
830
|
+
end
|
|
831
|
+
end
|
|
832
|
+
end
|
|
833
|
+
end
|
|
834
|
+
end
|
|
835
|
+
```
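The re-kick rule in `rekick_dropped_jobs` can be stated as a predicate over a single child row. This is a sketch under assumptions: `ChildRow`, `rekick?`, and `REKICK_AFTER_SECONDS` are hypothetical names, and plain `Time` arithmetic replaces `5.minutes.ago` so the example runs without ActiveSupport.

```ruby
# Predicate form of the re-kick filter: never started AND untouched for
# longer than the threshold. All names here are illustrative stand-ins.
REKICK_AFTER_SECONDS = 5 * 60

ChildRow = Struct.new(:started_at, :updated_at, keyword_init: true)

def rekick?(child, now: Time.now)
  child.started_at.nil? && child.updated_at < now - REKICK_AFTER_SECONDS
end

now = Time.now
fresh   = ChildRow.new(started_at: nil, updated_at: now - 30)        # probably still queued
stuck   = ChildRow.new(started_at: nil, updated_at: now - 600)       # likely dropped by the backend
running = ChildRow.new(started_at: now - 900, updated_at: now - 900) # has started, leave it alone
```

Only `stuck` qualifies. Waiting out the threshold is what separates a dropped job from one that is merely waiting in the queue, at the cost of a delayed recovery.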
|
|
836
|
+
|
|
837
|
+
- [ ] **Step 4: Run to verify they pass**
|
|
838
|
+
|
|
839
|
+
Run: `bundle exec ruby -I test test/branch_merge_job_test.rb`
|
|
840
|
+
Expected: PASS.
|
|
841
|
+
|
|
842
|
+
- [ ] **Step 5: Commit**
|
|
843
|
+
|
|
844
|
+
```bash
|
|
845
|
+
git add lib/chrono_forge/branch_merge_job.rb test/branch_merge_job_test.rb
|
|
846
|
+
git commit -m "feat(branches): BranchMergeJob lightweight poller"
|
|
847
|
+
```
|
|
848
|
+
|
|
849
|
+
```json:metadata
|
|
850
|
+
{"files": ["lib/chrono_forge/branch_merge_job.rb", "test/branch_merge_job_test.rb"], "verifyCommand": "bundle exec ruby -I test test/branch_merge_job_test.rb", "acceptanceCriteria": ["wakes parent when all complete", "reschedules when incomplete", "capped count", "re-kicks never-started child"], "requiresUserVerification": false}
|
|
851
|
+
```
|
|
852
|
+
|
|
853
|
+
---
|
|
854
|
+
|
|
855
|
+
### Task 6: `merge_branches` / `merge_branch` — the join
|
|
856
|
+
|
|
857
|
+
**Goal:** Implement `merge_branches(*names)` (alias `merge_branch`): immediate done-check, else enqueue `BranchMergeJob` and halt; remove joined branches from `@open_branches` on completion; raise on an unopened name. Provide the shared helpers (`branches_done?`, `enqueue_branch_merge_job`, `open_branch!`) used by the completion gate in Task 7.
|
|
858
|
+
|
|
859
|
+
**Files:**
|
|
860
|
+
- Modify: `lib/chrono_forge/executor/methods/merge_branches.rb`
|
|
861
|
+
- Create: `test/merge_branches_test.rb`
|
|
862
|
+
- Create: `test/internal/app/jobs/two_branch_workflow.rb`
|
|
863
|
+
|
|
864
|
+
**Acceptance Criteria:**
|
|
865
|
+
- [ ] After all children of the named branches complete, the parent resumes and the `merge$<names>` log is `completed`.
|
|
866
|
+
- [ ] While children are incomplete, the parent halts (idle) and a `BranchMergeJob` is enqueued.
|
|
867
|
+
- [ ] A failed/stalled child keeps the parent parked (Option A); recovering it lets the merge resolve.
|
|
868
|
+
- [ ] `merge_branches :never_opened` raises `ArgumentError`.
|
|
869
|
+
|
|
870
|
+
**Verify:** `bundle exec ruby -I test test/merge_branches_test.rb` → PASS
|
|
871
|
+
|
|
872
|
+
**Steps:**
|
|
873
|
+
|
|
874
|
+
- [ ] **Step 1: Write failing tests + fixture**
|
|
875
|
+
|
|
876
|
+
Create `test/internal/app/jobs/two_branch_workflow.rb`:
|
|
877
|
+
|
|
878
|
+
```ruby
|
|
879
|
+
class TwoBranchWorkflow < WorkflowJob
|
|
880
|
+
prepend ChronoForge::Executor
|
|
881
|
+
|
|
882
|
+
def perform
|
|
883
|
+
branch :a do
|
|
884
|
+
spawn :one, NoopChild
|
|
885
|
+
end
|
|
886
|
+
branch :b do
|
|
887
|
+
spawn :two, NoopChild
|
|
888
|
+
end
|
|
889
|
+
merge_branches :a, :b
|
|
890
|
+
durably_execute :finalize
|
|
891
|
+
end
|
|
892
|
+
|
|
893
|
+
private
|
|
894
|
+
|
|
895
|
+
def finalize
|
|
896
|
+
context["finalized"] = true
|
|
897
|
+
end
|
|
898
|
+
end
|
|
899
|
+
```
|
|
900
|
+
|
|
901
|
+
Create `test/merge_branches_test.rb`:
|
|
902
|
+
|
|
903
|
+
```ruby
|
|
904
|
+
require "test_helper"
|
|
905
|
+
|
|
906
|
+
class MergeBranchesTest < ActiveJob::TestCase
|
|
907
|
+
def test_parent_resumes_after_branches_complete
|
|
908
|
+
TwoBranchWorkflow.perform_later("mb-1")
|
|
909
|
+
perform_all_jobs
|
|
910
|
+
|
|
911
|
+
parent = ChronoForge::Workflow.find_by(key: "mb-1")
|
|
912
|
+
assert parent.completed?, "parent should complete once both branches merge"
|
|
913
|
+
assert_equal true, parent.context["finalized"]
|
|
914
|
+
merge_log = parent.execution_logs.find { |l| l.step_name.start_with?("merge$") }
|
|
915
|
+
assert merge_log.completed?
|
|
916
|
+
end
|
|
917
|
+
|
|
918
|
+
def test_unopened_branch_name_raises
|
|
919
|
+
job = Class.new(WorkflowJob) do
|
|
920
|
+
prepend ChronoForge::Executor
|
|
921
|
+
def perform = merge_branches(:nope)
|
|
922
|
+
end
|
|
923
|
+
Object.const_set(:NoBranchMergeWorkflow, job)
|
|
924
|
+
NoBranchMergeWorkflow.perform_later("nb-1")
|
|
925
|
+
assert_raises(ArgumentError) { perform_all_jobs }
|
|
926
|
+
ensure
|
|
927
|
+
Object.send(:remove_const, :NoBranchMergeWorkflow) if defined?(NoBranchMergeWorkflow)
|
|
928
|
+
end
|
|
929
|
+
|
|
930
|
+
# Option A: a non-completed (stalled) child keeps the parent parked; recovering
|
|
931
|
+
# the child lets the merge resolve.
|
|
932
|
+
def test_failed_child_parks_parent_until_recovered
|
|
933
|
+
StalledChildBranchWorkflow.perform_later("oa-1")
|
|
934
|
+
perform_all_jobs
|
|
935
|
+
|
|
936
|
+
parent = ChronoForge::Workflow.find_by(key: "oa-1")
|
|
937
|
+
child = ChronoForge::Workflow.find_by(key: "oa-1$grp$c")
|
|
938
|
+
refute parent.completed?, "parent must stay parked while child is not completed"
|
|
939
|
+
assert child.stalled?, "child should be stalled (permanent failure)"
|
|
940
|
+
|
|
941
|
+
# Recover the child; drive jobs again — the merge poll should now resolve.
|
|
942
|
+
child.context # no-op touch
|
|
943
|
+
child.update!(state: :idle) # simulate fix + allow re-run
|
|
944
|
+
StalledChildBranchWorkflow::ALLOW_COMPLETE[:ok] = true
|
|
945
|
+
child.retry_later rescue child.job_klass.perform_later(child.key)
|
|
946
|
+
perform_all_jobs
|
|
947
|
+
|
|
948
|
+
assert ChronoForge::Workflow.find_by(key: "oa-1").completed?,
|
|
949
|
+
"parent should complete once the recovered child completes"
|
|
950
|
+
end
|
|
951
|
+
end
|
|
952
|
+
```
|
|
953
|
+
|
|
954
|
+
And add the stalling fixture `test/internal/app/jobs/stalled_child_branch_workflow.rb`:
|
|
955
|
+
|
|
956
|
+
```ruby
|
|
957
|
+
class StalledChildBranchWorkflow < WorkflowJob
|
|
958
|
+
prepend ChronoForge::Executor
|
|
959
|
+
|
|
960
|
+
# Toggled by the test to let the child succeed on recovery.
|
|
961
|
+
ALLOW_COMPLETE = {ok: false}
|
|
962
|
+
|
|
963
|
+
def perform
|
|
964
|
+
branch :grp do
|
|
965
|
+
spawn :c, StalledChild
|
|
966
|
+
end
|
|
967
|
+
merge_branches :grp
|
|
968
|
+
end
|
|
969
|
+
end
|
|
970
|
+
|
|
971
|
+
class StalledChild < WorkflowJob
|
|
972
|
+
prepend ChronoForge::Executor
|
|
973
|
+
|
|
974
|
+
def perform(**)
|
|
975
|
+
durably_execute :maybe_fail, retry_policy: ChronoForge::Executor::RetryPolicy.new(retry_on: [])
|
|
976
|
+
end
|
|
977
|
+
|
|
978
|
+
private
|
|
979
|
+
|
|
980
|
+
def maybe_fail
|
|
981
|
+
raise "not yet" unless StalledChildBranchWorkflow::ALLOW_COMPLETE[:ok]
|
|
982
|
+
end
|
|
983
|
+
end
|
|
984
|
+
```
|
|
985
|
+
|
|
986
|
+
> The exact recovery mechanics (`retry_later` vs re-enqueue, the `ALLOW_COMPLETE` toggle) may need adjusting against the real stall/retry behaviour observed in `test/chrono_forge_test.rb`'s permanent-failure tests — the assertion that matters is **parent parked while child not completed, parent completes after child completes**. Mirror the permanent-failure pattern already used in `chrono_forge_test.rb` for the stall setup.
|
|
987
|
+
|
|
988
|
+
- [ ] **Step 2: Run to verify they fail**
|
|
989
|
+
|
|
990
|
+
Run: `bundle exec ruby -I test test/merge_branches_test.rb`
|
|
991
|
+
Expected: FAIL — `NoMethodError: merge_branches`.
|
|
992
|
+
|
|
993
|
+
- [ ] **Step 3: Implement `merge_branches` + helpers**
|
|
994
|
+
|
|
995
|
+
Replace the placeholder `lib/chrono_forge/executor/methods/merge_branches.rb` with:
|
|
996
|
+
|
|
997
|
+
```ruby
|
|
998
|
+
module ChronoForge
|
|
999
|
+
module Executor
|
|
1000
|
+
module Methods
|
|
1001
|
+
module MergeBranches
|
|
1002
|
+
# Join one or more named branches. Separate from dispatch so branches run
|
|
1003
|
+
# concurrently. Does one immediate check; if not done, hands off to the
|
|
1004
|
+
# lightweight BranchMergeJob and halts (the heavy parent is not replayed
|
|
1005
|
+
# per poll). Default cadence clamps between min/max, scaled by pending.
|
|
1006
|
+
def merge_branches(*names, min_interval: 5.seconds, max_interval: 5.minutes)
|
|
1007
|
+
step_name = "merge$#{names.map(&:to_s).sort.join(",")}"
|
|
1008
|
+
log = find_or_create_execution_log!(step_name) { |l| l.started_at = Time.current }
|
|
1009
|
+
return if log.completed?
|
|
1010
|
+
|
|
1011
|
+
branch_log_ids = names.map { |nm| open_branch!(nm)[:log_id] }
|
|
1012
|
+
|
|
1013
|
+
if branches_done?(branch_log_ids)
|
|
1014
|
+
names.each { |nm| @open_branches.delete(nm.to_s) }
|
|
1015
|
+
log.update!(state: :completed, completed_at: Time.current)
|
|
1016
|
+
return
|
|
1017
|
+
end
|
|
1018
|
+
|
|
1019
|
+
enqueue_branch_merge_job(branch_log_ids, min_interval, max_interval)
|
|
1020
|
+
halt_execution!
|
|
1021
|
+
end
|
|
1022
|
+
alias_method :merge_branch, :merge_branches
|
|
1023
|
+
|
|
1024
|
+
private
|
|
1025
|
+
|
|
1026
|
+
def open_branch!(name)
|
|
1027
|
+
(@open_branches || {}).fetch(name.to_s) do
|
|
1028
|
+
raise ArgumentError, "no open branch named #{name.inspect} (open it with `branch #{name.inspect} do … end` first)"
|
|
1029
|
+
end
|
|
1030
|
+
end
|
|
1031
|
+
|
|
1032
|
+
# A branch is done when its log is sealed (completed) and it has no
|
|
1033
|
+
# incomplete children. exists? short-circuits at the first incomplete row.
|
|
1034
|
+
def branches_done?(branch_log_ids)
|
|
1035
|
+
branch_log_ids.all? do |id|
|
|
1036
|
+
next false unless ExecutionLog.where(id: id, state: ExecutionLog.states[:completed]).exists?
|
|
1037
|
+
!Workflow.where(parent_execution_log_id: id)
|
|
1038
|
+
.where.not(state: Workflow.states[:completed]).exists?
|
|
1039
|
+
end
|
|
1040
|
+
end
|
|
1041
|
+
|
|
1042
|
+
def enqueue_branch_merge_job(branch_log_ids, min_interval, max_interval)
|
|
1043
|
+
BranchMergeJob.perform_later(
|
|
1044
|
+
@workflow.key, self.class.to_s, branch_log_ids,
|
|
1045
|
+
min_interval.to_i, max_interval.to_i
|
|
1046
|
+
)
|
|
1047
|
+
end
|
|
1048
|
+
end
|
|
1049
|
+
end
|
|
1050
|
+
end
|
|
1051
|
+
end
|
|
1052
|
+
```
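Because the step name sorts the branch names, the merge log is found again on replay regardless of the order in which the names were passed. A small standalone sketch of that derivation; `merge_step_name` is a hypothetical helper that mirrors the first line of `merge_branches`.

```ruby
# Mirrors the step-name derivation in merge_branches (helper name is hypothetical).
def merge_step_name(*names)
  "merge$#{names.map(&:to_s).sort.join(",")}"
end

forward  = merge_step_name(:a, :b)
reversed = merge_step_name(:b, :a)
mixed    = merge_step_name("b", :a) # strings and symbols normalize to the same name
```

One consequence worth keeping in mind: joining the same set of branches twice in one workflow maps to a single log, so the second call returns early once the first has completed.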
|
|
1053
|
+
|
|
1054
|
+
- [ ] **Step 4: Run to verify they pass**
|
|
1055
|
+
|
|
1056
|
+
Run: `bundle exec ruby -I test test/merge_branches_test.rb`
|
|
1057
|
+
Expected: PASS.
|
|
1058
|
+
|
|
1059
|
+
- [ ] **Step 5: Run the full suite to catch regressions**
|
|
1060
|
+
|
|
1061
|
+
Run: `bundle exec rake test`
|
|
1062
|
+
Expected: all green.
|
|
1063
|
+
|
|
1064
|
+
- [ ] **Step 6: Commit**
|
|
1065
|
+
|
|
1066
|
+
```bash
|
|
1067
|
+
git add lib/chrono_forge/executor/methods/merge_branches.rb test/merge_branches_test.rb test/internal/app/jobs/two_branch_workflow.rb test/internal/app/jobs/stalled_child_branch_workflow.rb
|
|
1068
|
+
git commit -m "feat(branches): merge_branches poll-join"
|
|
1069
|
+
```
|
|
1070
|
+
|
|
1071
|
+
```json:metadata
|
|
1072
|
+
{"files": ["lib/chrono_forge/executor/methods/merge_branches.rb", "test/merge_branches_test.rb", "test/internal/app/jobs/two_branch_workflow.rb"], "verifyCommand": "bundle exec ruby -I test test/merge_branches_test.rb", "acceptanceCriteria": ["parent resumes after branches complete", "halts + enqueues poller while incomplete", "unopened name raises"], "requiresUserVerification": false}
|
|
1073
|
+
```
|
|
1074
|
+
|
|
1075
|
+
---
|
|
1076
|
+
|
|
1077
|
+
### Task 7: Completion gate — automerge + raise on unmerged
|
|
1078
|
+
|
|
1079
|
+
**Goal:** In `complete_workflow!`, before sealing, inspect `@open_branches`: raise `UnmergedBranchError` for any leftover non-automerge branch; for leftover automerge branches, join them (poll/halt) before completing.
|
|
1080
|
+
|
|
1081
|
+
**Files:**
|
|
1082
|
+
- Modify: `lib/chrono_forge/executor/methods/workflow_states.rb`
|
|
1083
|
+
- Create: `test/automerge_test.rb`
|
|
1084
|
+
|
|
1085
|
+
**Acceptance Criteria:**
|
|
1086
|
+
- [ ] An `automerge: true` branch with no `merge_branches` blocks workflow completion until its children finish, then completes.
|
|
1087
|
+
- [ ] A branch opened with neither `merge_branches` nor `automerge: true` raises `UnmergedBranchError` at completion (even if its children already finished).
|
|
1088
|
+
- [ ] A branch already joined via `merge_branches` does not re-trigger at the gate.
|
|
1089
|
+
|
|
1090
|
+
**Verify:** `bundle exec ruby -I test test/automerge_test.rb` → PASS
|
|
1091
|
+
|
|
1092
|
+
**Steps:**
|
|
1093
|
+
|
|
1094
|
+
- [ ] **Step 1: Write failing tests + fixture**
|
|
1095
|
+
|
|
1096
|
+
Create `test/internal/app/jobs/unmerged_branch_workflow.rb`:
|
|
1097
|
+
|
|
1098
|
+
```ruby
|
|
1099
|
+
class UnmergedBranchWorkflow < WorkflowJob
|
|
1100
|
+
prepend ChronoForge::Executor
|
|
1101
|
+
|
|
1102
|
+
def perform
|
|
1103
|
+
branch :forgotten do # no automerge, never merged
|
|
1104
|
+
spawn :c, NoopChild
|
|
1105
|
+
end
|
|
1106
|
+
end
|
|
1107
|
+
end
|
|
1108
|
+
```
|
|
1109
|
+
|
|
1110
|
+
Create `test/automerge_test.rb`:
|
|
1111
|
+
|
|
1112
|
+
```ruby
|
|
1113
|
+
require "test_helper"
|
|
1114
|
+
|
|
1115
|
+
class AutomergeTest < ActiveJob::TestCase
|
|
1116
|
+
# SingleSpawnWorkflow opens branch :grp with automerge: true and no merge.
|
|
1117
|
+
def test_automerge_blocks_completion_until_children_done
|
|
1118
|
+
SingleSpawnWorkflow.perform_later("am-1")
|
|
1119
|
+
perform_all_jobs
|
|
1120
|
+
|
|
1121
|
+
parent = ChronoForge::Workflow.find_by(key: "am-1")
|
|
1122
|
+
assert parent.completed?, "automerge branch should be joined before completion"
|
|
1123
|
+
child = ChronoForge::Workflow.find_by(key: "am-1$grp$child")
|
|
1124
|
+
assert child.completed?
|
|
1125
|
+
end
|
|
1126
|
+
|
|
1127
|
+
def test_unmerged_branch_raises
|
|
1128
|
+
UnmergedBranchWorkflow.perform_later("um-1")
|
|
1129
|
+
error = assert_raises(ChronoForge::Executor::UnmergedBranchError) { perform_all_jobs }
|
|
1130
|
+
assert_match(/forgotten/, error.message)
|
|
1131
|
+
end
|
|
1132
|
+
end
|
|
1133
|
+
```
|
|
1134
|
+
|
|
1135
|
+
- [ ] **Step 2: Run to verify they fail**
|
|
1136
|
+
|
|
1137
|
+
Run: `bundle exec ruby -I test test/automerge_test.rb`
|
|
1138
|
+
Expected: `test_unmerged_branch_raises` FAILS (no error raised — branch silently detached).
|
|
1139
|
+
|
|
1140
|
+
- [ ] **Step 3: Add the completion gate**
|
|
1141
|
+
|
|
1142
|
+
In `lib/chrono_forge/executor/methods/workflow_states.rb`, change the start of `complete_workflow!` to call the gate first:
|
|
1143
|
+
|
|
1144
|
+
```ruby
|
|
1145
|
+
def complete_workflow!
|
|
1146
|
+
enforce_branch_joins!
|
|
1147
|
+
|
|
1148
|
+
# Create an execution log for workflow completion
|
|
1149
|
+
execution_log = find_or_create_execution_log!("$workflow_completion$") do |log|
|
|
1150
|
+
log.started_at = Time.current
|
|
1151
|
+
end
|
|
1152
|
+
# ... unchanged body ...
|
|
1153
|
+
```
|
|
1154
|
+
|
|
1155
|
+
Add the private gate method to the `WorkflowStates` module (it uses `branches_done?` / `enqueue_branch_merge_job` from `MergeBranches`, available on the same instance):
|
|
1156
|
+
|
|
1157
|
+
```ruby
|
|
1158
|
+
# Every branch must be joined — explicitly (merge_branches) or implicitly
|
|
1159
|
+
# (automerge: true). @open_branches is the in-memory registry rebuilt each
|
|
1160
|
+
# replay pass: branch adds, merge_branches removes on completion. Anything
|
|
1161
|
+
# left here is either an automerge branch to join, or a forgotten join.
|
|
1162
|
+
def enforce_branch_joins!
|
|
1163
|
+
open = @open_branches || {}
|
|
1164
|
+
return if open.empty?
|
|
1165
|
+
|
|
1166
|
+
unmerged = open.reject { |_, b| b[:automerge] }
|
|
1167
|
+
if unmerged.any?
|
|
1168
|
+
names = unmerged.keys
|
|
1169
|
+
raise UnmergedBranchError,
|
|
1170
|
+
"branch(es) #{names.join(", ")} were opened but never merged. " \
|
|
1171
|
+
"Add `merge_branches #{names.map { |n| ":#{n}" }.join(", ")}` " \
|
|
1172
|
+
"or open with `branch(..., automerge: true)`."
|
|
1173
|
+
end
|
|
1174
|
+
|
|
1175
|
+
auto_ids = open.values.map { |b| b[:log_id] }
|
|
1176
|
+
unless branches_done?(auto_ids)
|
|
1177
|
+
enqueue_branch_merge_job(auto_ids, 5.seconds, 5.minutes)
|
|
1178
|
+
halt_execution! # poller wakes the parent; the gate re-runs on replay
|
|
1179
|
+
end
|
|
1180
|
+
end
|
|
1181
|
+
```
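The gate's decision can be tried in isolation against a plain Hash shaped like `@open_branches`. This is a sketch under assumptions: `DemoUnmergedBranchError` and `pending_join_ids` are hypothetical names, and the polling and halting steps are left out.

```ruby
# Decision logic of the completion gate over an in-memory registry.
# The error class and method name are local stand-ins for illustration.
class DemoUnmergedBranchError < StandardError; end

# Returns the branch log ids that still need an implicit (automerge) join,
# or raises when any non-automerge branch was left open.
def pending_join_ids(open_branches)
  return [] if open_branches.empty?

  unmerged = open_branches.reject { |_, b| b[:automerge] }
  if unmerged.any?
    raise DemoUnmergedBranchError,
      "branch(es) #{unmerged.keys.join(", ")} were opened but never merged"
  end

  open_branches.values.map { |b| b[:log_id] }
end

none_open = pending_join_ids({})
auto_only = pending_join_ids({"grp" => {automerge: true, log_id: 7}})

forgotten_message =
  begin
    pending_join_ids({"forgotten" => {automerge: false, log_id: 9}})
    nil
  rescue DemoUnmergedBranchError => e
    e.message
  end
```

Raising before any join is attempted means a forgotten branch fails the workflow even when its children have already finished, which matches the second acceptance criterion of this task.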
|
|
1182
|
+
|
|
1183
|
+
- [ ] **Step 4: Run to verify they pass**
|
|
1184
|
+
|
|
1185
|
+
Run: `bundle exec ruby -I test test/automerge_test.rb`
|
|
1186
|
+
Expected: PASS.
|
|
1187
|
+
|
|
1188
|
+
- [ ] **Step 5: Run the full suite**
|
|
1189
|
+
|
|
1190
|
+
Run: `bundle exec rake test`
|
|
1191
|
+
Expected: all green (confirm KitchenSink etc. — which open no branches — are unaffected; `enforce_branch_joins!` returns early when `@open_branches` is empty).
|
|
1192
|
+
|
|
1193
|
+
- [ ] **Step 6: Commit**
|
|
1194
|
+
|
|
1195
|
+
```bash
|
|
1196
|
+
git add lib/chrono_forge/executor/methods/workflow_states.rb test/automerge_test.rb test/internal/app/jobs/unmerged_branch_workflow.rb
|
|
1197
|
+
git commit -m "feat(branches): completion gate — automerge + raise on unmerged"
|
|
1198
|
+
```
|
|
1199
|
+
|
|
1200
|
+
```json:metadata
|
|
1201
|
+
{"files": ["lib/chrono_forge/executor/methods/workflow_states.rb", "test/automerge_test.rb", "test/internal/app/jobs/unmerged_branch_workflow.rb"], "verifyCommand": "bundle exec ruby -I test test/automerge_test.rb", "acceptanceCriteria": ["automerge blocks completion until children done", "unmerged branch raises", "already-merged branch is a no-op at gate"], "requiresUserVerification": false}
|
|
1202
|
+
```
|
|
1203
|
+
|
|
1204
|
+
---
|
|
1205
|
+
|
|
1206
|
+
### Task 8: Crash-recovery (cursor resume) + scale/perf regression tests
|
|
1207
|
+
|
|
1208
|
+
**Goal:** Prove dispatch resumes from the cursor after a mid-dispatch crash (no duplicate children, bounded rework), and that dispatch is `⌈N/of⌉` inserts (not N) with constant per-item work.
|
|
1209
|
+
|
|
1210
|
+
**Files:**
|
|
1211
|
+
- Create: `test/branch_recovery_test.rb`
|
|
1212
|
+
- Create: `test/branch_scale_test.rb`
|
|
1213
|
+
|
|
1214
|
+
**Acceptance Criteria:**
|
|
1215
|
+
- [ ] A crash after chunk *k* leaves `metadata.cursors[name]` at that point; the resumed run continues from it, ends with exactly N children, no duplicate keys.
|
|
1216
|
+
- [ ] Dispatching N children with batch size `of` issues `⌈N/of⌉` `INSERT INTO chrono_forge_workflows` statements (not N).
|
|
1217
|
+
|
|
1218
|
+
**Verify:** `bundle exec ruby -I test test/branch_recovery_test.rb test/branch_scale_test.rb` → PASS
|
|
1219
|
+
|
|
1220
|
+
**Steps:**
|
|
1221
|
+
|
|
1222
|
+
- [ ] **Step 1: Write the scale test**
|
|
1223
|
+
|
|
1224
|
+
Create `test/branch_scale_test.rb`:
|
|
1225
|
+
|
|
1226
|
+
```ruby
|
|
1227
|
+
require "test_helper"
|
|
1228
|
+
|
|
1229
|
+
class BranchScaleTest < ActiveJob::TestCase
|
|
1230
|
+
def setup
|
|
1231
|
+
User.delete_all
|
|
1232
|
+
25.times { |i| User.create!(name: "u#{i}", email: "u#{i}@e.com") }
|
|
1233
|
+
end
|
|
1234
|
+
|
|
1235
|
+
def test_dispatch_uses_bulk_inserts_not_one_per_child
|
|
1236
|
+
inserts = 0
|
|
1237
|
+
pattern = /INSERT INTO ["`]?chrono_forge_workflows/i
|
|
1238
|
+
sub = ActiveSupport::Notifications.subscribe("sql.active_record") do |*a|
|
|
1239
|
+
inserts += 1 if pattern.match?(a.last[:sql].to_s)
|
|
1240
|
+
end
|
|
1241
|
+
# of: 10 over 25 users => ceil(25/10) = 3 insert_all statements.
|
|
1242
|
+
SpawnEachWorkflow.perform_later("scale-1", of: 10)
|
|
1243
|
+
perform_all_jobs_before(1.second) # dispatch happens on the first pass
|
|
1244
|
+
ActiveSupport::Notifications.unsubscribe(sub)
|
|
1245
|
+
|
|
1246
|
+
branch_log = ChronoForge::Workflow.find_by(key: "scale-1").execution_logs.find_by(step_name: "branch$grp")
|
|
1247
|
+
assert_equal 25, ChronoForge::Workflow.where(parent_execution_log_id: branch_log.id).count
|
|
1248
|
+
assert_operator inserts, :<=, 3, "expected <= ceil(25/10) bulk inserts, got #{inserts}"
|
|
1249
|
+
end
|
|
1250
|
+
end
|
|
1251
|
+
```
|
|
1252
|
+
|
|
1253
|
+
> Note: this asserts on **`insert_all`** (DB rows), which is always bulk. Do NOT assert bulk job *enqueue* — under the test adapter `perform_all_later` falls back to per-job enqueue.
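
The expected insert count in the test above is plain ceiling division over the batch size. A short sketch of the arithmetic; `expected_bulk_inserts` is a hypothetical helper name.

```ruby
# Number of bulk inserts needed to dispatch n children in batches of `of`.
def expected_bulk_inserts(n, of)
  (n + of - 1) / of # integer ceiling division
end

statements  = expected_bulk_inserts(25, 10)        # => 3
batch_sizes = (1..25).each_slice(10).map(&:size)   # => [10, 10, 5]
```

The last batch is partial, which is why the test asserts an upper bound of 3 rather than dividing evenly.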
|
|
1254
|
+
|
|
1255
|
+
- [ ] **Step 2: Write the recovery test**
|
|
1256
|
+
|
|
1257
|
+
Create `test/branch_recovery_test.rb`:
|
|
1258
|
+
|
|
1259
|
+
```ruby
require "test_helper"

class BranchRecoveryTest < ActiveJob::TestCase
  include ChaoticJob::Helpers

  def setup
    User.delete_all
    25.times { |i| User.create!(name: "u#{i}", email: "u#{i}@e.com") }
  end

  def test_resumes_dispatch_from_cursor_after_glitch
    # Glitch once during dispatch; ChaoticJob re-runs the workflow, which must
    # resume spawn_each from the persisted cursor rather than restarting at 0.
    workflow = SpawnEachWorkflow.new("rec-1", of: 10)
    run_scenario(workflow, glitch: ["before", "#{ChronoForge::Executor::Methods::Branch.instance_method(:dispatch_children).source_location[0]}:#{dispatch_children_line}"])

    branch_log = ChronoForge::Workflow.find_by(key: "rec-1").execution_logs.find_by(step_name: "branch$grp")
    children = ChronoForge::Workflow.where(parent_execution_log_id: branch_log.id)
    assert_equal 25, children.count, "exactly N children, no duplicates after resume"
    assert_equal 25, children.distinct.count(:key)
  end

  private

  # Resolve the line of the perform_all_later call inside dispatch_children so the
  # glitch lands mid-dispatch. Adjust if the method changes.
  def dispatch_children_line
    src = File.read(ChronoForge::Executor::Methods::Branch.instance_method(:dispatch_children).source_location[0])
    index = src.lines.index { |l| l.include?("perform_all_later") }
    # Fail loudly instead of silently glitching on line 1 when the call is not found.
    raise "perform_all_later not found; update dispatch_children_line" if index.nil?

    index + 1
  end
end
```
|
|
1292
|
+
|
|
1293
|
+
> If targeting an exact glitch line proves brittle, an acceptable alternative is to call `SpawnEachWorkflow.perform_later` twice in a row with the same key (simulating a re-run) after manually truncating `metadata.cursors` to a mid-point, and assert the final child set is exactly N with no duplicates. Either approach proves cursor-resume idempotency.
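
The property both approaches are meant to prove can be illustrated without the gem. The following is a simulation sketch only: `FakeDispatcher`, its `rows` hash (standing in for the workflows table with a unique key), and its `cursors` hash (standing in for `metadata.cursors`) are hypothetical, and the real dispatch logic may differ.

```ruby
# In-memory stand-ins (assumptions): `rows` plays the workflows table with a
# unique key, `cursors` plays metadata.cursors.
class FakeDispatcher
  attr_reader :rows, :cursors

  def initialize
    @rows = {}    # key => item; a Hash enforces key uniqueness
    @cursors = {} # branch name => index of the next item to dispatch
  end

  # Dispatch `items` in chunks of `of`. `crash_after_chunk` simulates a glitch
  # that fires after that chunk's rows are written but before its cursor is saved.
  def dispatch(name, items, of:, crash_after_chunk: nil)
    start = cursors.fetch(name, 0)
    items.drop(start).each_slice(of).with_index do |chunk, chunk_index|
      chunk.each_with_index do |item, offset|
        key = "#{name}_#{start + chunk_index * of + offset}"
        rows[key] ||= item # an existing key is left alone, like an ignored duplicate
      end
      raise "glitch" if crash_after_chunk == chunk_index

      cursors[name] = start + chunk_index * of + chunk.size
    end
  end
end

items = (0...25).map { |i| "u#{i}" }
dispatcher = FakeDispatcher.new

begin
  dispatcher.dispatch("grp", items, of: 10, crash_after_chunk: 1)
rescue RuntimeError
  # First run died mid-dispatch: 20 rows written, cursor still at 10.
end

dispatcher.dispatch("grp", items, of: 10) # resumed run

puts dispatcher.rows.size      # 25
puts dispatcher.cursors["grp"] # 25
```

In this sketch the resumed run rewrites the chunk whose cursor was never saved, and the unique key absorbs the overlap, so the final set still has exactly N entries.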
|
|
1294
|
+
|
|
1295
|
+
- [ ] **Step 3: Run both and confirm they pass**
|
|
1296
|
+
|
|
1297
|
+
Run: `bundle exec ruby -I test test/branch_scale_test.rb test/branch_recovery_test.rb`
|
|
1298
|
+
Expected: PASS (these exercise Task 3/4 code; if they fail, fix the dispatch/cursor logic, not the tests).
|
|
1299
|
+
|
|
1300
|
+
- [ ] **Step 4: Commit**
|
|
1301
|
+
|
|
1302
|
+
```bash
git add test/branch_recovery_test.rb test/branch_scale_test.rb
git commit -m "test(branches): cursor-resume recovery + bulk-dispatch scale guards"
```
|
|
1306
|
+
|
|
1307
|
+
```json:metadata
{"files": ["test/branch_recovery_test.rb", "test/branch_scale_test.rb"], "verifyCommand": "bundle exec ruby -I test test/branch_scale_test.rb test/branch_recovery_test.rb", "acceptanceCriteria": ["cursor resume: exactly N children no dupes", "dispatch uses ceil(N/of) bulk inserts"], "requiresUserVerification": false}
```
|
|
1310
|
+
|
|
1311
|
+
---
|
|
1312
|
+
|
|
1313
|
+
### Task 9: Dependency floor + README
|
|
1314
|
+
|
|
1315
|
+
**Goal:** Pin `activejob >= 7.1` (required for `perform_all_later`) and document the feature with the load-bearing caveats.
|
|
1316
|
+
|
|
1317
|
+
**Files:**
|
|
1318
|
+
- Modify: `chrono_forge.gemspec`
|
|
1319
|
+
- Modify: `README.md`
|
|
1320
|
+
|
|
1321
|
+
**Acceptance Criteria:**
|
|
1322
|
+
- [ ] `chrono_forge.gemspec` requires `activejob >= 7.1`.
|
|
1323
|
+
- [ ] README has a "Branches" section documenting `branch`/`spawn`/`spawn_each`/`merge_branches`/`automerge` and the three caveats (every branch must be joined; parent not replayed per poll; source must be stable during dispatch).
|
|
1324
|
+
|
|
1325
|
+
**Verify:** `bundle exec ruby -e "require 'rubygems'; spec = Gem::Specification.load('chrono_forge.gemspec'); dep = spec.dependencies.find { |d| d.name == 'activejob' }; abort('no floor') unless dep && dep.requirement.satisfied_by?(Gem::Version.new('7.1')) && !dep.requirement.satisfied_by?(Gem::Version.new('7.0')); puts 'ok'"` → `ok`
|
|
1326
|
+
|
|
1327
|
+
**Steps:**
|
|
1328
|
+
|
|
1329
|
+
- [ ] **Step 1: Pin the dependency**
|
|
1330
|
+
|
|
1331
|
+
In `chrono_forge.gemspec`, change:
|
|
1332
|
+
|
|
1333
|
+
```ruby
spec.add_dependency "activejob"
```
|
|
1336
|
+
|
|
1337
|
+
to:
|
|
1338
|
+
|
|
1339
|
+
```ruby
spec.add_dependency "activejob", ">= 7.1"
```
|
|
1342
|
+
|
|
1343
|
+
- [ ] **Step 2: Document in README**
|
|
1344
|
+
|
|
1345
|
+
Add a "Branches: parallel sub-workflows" section to `README.md` with a worked example (the `branch :fulfillment, automerge: true do … end` + `merge_branches` example from the spec's Goal section) and a "Caveats" callout covering the following points (the wording may vary, but the substance must be preserved):
|
|
1346
|
+
- Every branch must be merged or `automerge: true`, else `UnmergedBranchError`.
|
|
1347
|
+
- The heavy parent is not replayed per poll — a lightweight `BranchMergeJob` does the waiting.
|
|
1348
|
+
- The source must be stable during a branch's dispatch window (items keyed `name_{index}` by position; `error_on_ignore: true` catches ordering, not insertion).
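
The third caveat is the least obvious, so here is a toy illustration of why positional keys need a stable source. It assumes only what the caveat states (children keyed `name_{index}` by position, dispatch resuming from a saved index); the `dispatch` helper and the sample data are hypothetical.

```ruby
# Assumption: children are keyed "#{name}_#{index}" by position in the source,
# and a resumed dispatch continues from a saved cursor.
def dispatch(name, source, from:, upto:)
  (from...upto).map { |i| ["#{name}_#{i}", source[i]] }
end

source = %w[a b c d]
first_pass = dispatch("grp", source, from: 0, upto: 2) # grp_0 => a, grp_1 => b

source.unshift("x") # the source changes while dispatch is still in flight

second_pass = dispatch("grp", source, from: 2, upto: source.size)
dispatched = (first_pass + second_pass).map(&:last)

p dispatched          # ["a", "b", "b", "c", "d"]
p source - dispatched # ["x"]
```

Every key is still unique (`grp_0` through `grp_4`), so a uniqueness check alone would not flag this: `"b"` is dispatched twice and `"x"` is never dispatched.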
|
|
1349
|
+
|
|
1350
|
+
- [ ] **Step 3: Verify the gemspec floor**
|
|
1351
|
+
|
|
1352
|
+
Run the Verify command above. Expected: `ok`.
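
For readability, the check that one-liner performs can be written out as below. This is a sketch using a stand-in `Gem::Dependency` rather than the real gemspec, and `floor_ok?` is a hypothetical helper name.

```ruby
require "rubygems"

# A floor of 7.1 means: 7.1 satisfies the requirement and 7.0 does not.
def floor_ok?(dep)
  dep.requirement.satisfied_by?(Gem::Version.new("7.1")) &&
    !dep.requirement.satisfied_by?(Gem::Version.new("7.0"))
end

# Stand-ins for what the gemspec declares after and before Step 1.
pinned = Gem::Dependency.new("activejob", ">= 7.1")
unpinned = Gem::Dependency.new("activejob")

puts floor_ok?(pinned)   # true
puts floor_ok?(unpinned) # false: with no requirement, 7.0 is still allowed
```

Testing both sides of the boundary is what distinguishes a real floor from an unconstrained dependency, which would also accept 7.1.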
|
|
1353
|
+
|
|
1354
|
+
- [ ] **Step 4: Run the full suite one last time**
|
|
1355
|
+
|
|
1356
|
+
Run: `bundle exec rake test`
|
|
1357
|
+
Expected: all green.
|
|
1358
|
+
|
|
1359
|
+
- [ ] **Step 5: Commit**
|
|
1360
|
+
|
|
1361
|
+
```bash
git add chrono_forge.gemspec README.md
git commit -m "feat(branches): require activejob >= 7.1; document branches"
```
|
|
1365
|
+
|
|
1366
|
+
```json:metadata
{"files": ["chrono_forge.gemspec", "README.md"], "verifyCommand": "bundle exec rake test", "acceptanceCriteria": ["activejob >= 7.1 floor", "README branches section + caveats"], "requiresUserVerification": false}
```
|
|
1369
|
+
|
|
1370
|
+
---
|
|
1371
|
+
|
|
1372
|
+
## Notes for the implementer
|
|
1373
|
+
|
|
1374
|
+
- **Zeitwerk loading:** new files under `lib/chrono_forge/` autoload by namespace — `branch_merge_job.rb` → `ChronoForge::BranchMergeJob`; `executor/methods/branch.rb` → `ChronoForge::Executor::Methods::Branch`. No manual `require`. The only wiring is the `include`s in `executor/methods.rb`.
|
|
1375
|
+
- **Helper visibility:** `branch.rb`, `merge_branches.rb`, and `workflow_states.rb` are all mixed into the same `Executor` instance, so their private helpers (`dispatch_children`, `branches_done?`, `enqueue_branch_merge_job`, `current_branch!`) call each other freely.
|
|
1376
|
+
- **`insert_all` + JSON:** pass `kwargs`/`options`/`context` as Ruby hashes; Rails casts them to the json/jsonb columns. `insert_all` does not set timestamps — `created_at`/`updated_at` are set explicitly in `dispatch_children`.
|
|
1377
|
+
- **Child execution on first run:** `dispatch_children` pre-inserts the child row (with `kwargs`), then enqueues `klass.new(child_key, **kwargs)`. When the child job runs, the executor's `setup_workflow!` finds the pre-inserted row and uses its stored `kwargs` — the job-arg kwargs are redundant but harmless.
|
|
1378
|
+
- **Run order:** Tasks 1→2→3→4 are strictly sequential; 5 depends on 2; 6 depends on 5 and 3; 7 depends on 6 and 3/4; 8 depends on 7; 9 is independent. The `MergeBranches` module must exist (even empty) once Task 3 wires the include — create the file in Task 3, flesh it out in Task 6.
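
The run-order note can be checked mechanically. The sketch below transcribes the stated dependencies into a hash (reading "3/4" as "3 and 4", which is an interpretation) and derives a valid order with a small depth-first topological sort; `DEPENDS_ON` and `run_order` are illustrative names only.

```ruby
# Task => tasks it depends on, transcribed from the run-order note.
DEPENDS_ON = {
  1 => [], 2 => [1], 3 => [2], 4 => [3],
  5 => [2], 6 => [5, 3], 7 => [6, 3, 4], 8 => [7], 9 => []
}.freeze

# Depth-first topological sort: a task is emitted only after its dependencies.
def run_order(graph)
  order = []
  visiting = {}
  visit = lambda do |task|
    next if order.include?(task)
    raise "dependency cycle at task #{task}" if visiting[task]

    visiting[task] = true
    graph.fetch(task).each { |dep| visit.call(dep) }
    order << task
  end
  graph.each_key { |task| visit.call(task) }
  order
end

p run_order(DEPENDS_ON) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Numeric order happens to be a valid schedule here; Task 9 has no dependencies, so it could equally be done first.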
|