cdc-sidekiq 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: a814eddfc8362aa6e2ad54cf09a8034398327d23626aca65887a5fcee4d8b8cf
4
+ data.tar.gz: 175152c36354bf40d5831e0812c8d59eae68e5865c265e8c18e0a07ceb14d42e
5
+ SHA512:
6
+ metadata.gz: ee9285ba79a7c605dd9d43bc251e3352bdb811ec919df7377b1379660dc638b8f35519a11581830ce1a295fe15147fe17ad2f3bae9ce1d6221d134bc0fc93088
7
+ data.tar.gz: b3739c02f483d35c4c447417743855ef462d5c07c2c0254e9aec50bb88846257ffa58c5ec3b46be50c24c866d02bfb6c0350d92029e2eaeee07455b627bf0b05
data/CHANGELOG.md ADDED
@@ -0,0 +1,11 @@
1
+ ## [Unreleased]
2
+
3
+ - Improved unit test coverage for runtime selection, configuration copying, class-level job declarations, failure policies, and batch payload behavior.
4
+ - Expanded benchmark documentation with the 500,000-item Ruby 4.0.5 snapshot, interpretation, and runtime tuning guidance.
5
+ - Documented runtime-selection guidance and the shared-state correctness boundary between cdc-sidekiq and consumer processors/sinks.
6
+ - Updated gem metadata documentation URI and description.
7
+
8
+
9
+ ## [0.1.0] - 2026-06-08
10
+
11
+ - Initial release
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2026 Kenneth C. Demanawa
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,321 @@
1
+ # cdc-sidekiq
2
+
3
+ `cdc-sidekiq` integrates Sidekiq with CDC execution primitives.
4
+
5
+ Sidekiq remains the durable job system. It owns scheduling, retries, queues, Redis persistence, and operational behavior.
6
+
7
+ `cdc-sidekiq` owns what happens inside selected jobs: processor contracts, runtime selection, fan-out/fan-in execution, and normalized processor results.
8
+
9
+ ```text
10
+ Sidekiq Job
11
+ |
12
+ v
13
+ CDC::Sidekiq::ProcessorJob
14
+ |
15
+ +--> cdc-parallel
16
+ | Ractor fan-out / fan-in
17
+ |
18
+ +--> cdc-concurrent
19
+ Async task fan-out / fan-in
20
+ ```
21
+
22
+ ## Why cdc-sidekiq Exists
23
+
24
+ Sidekiq is excellent at durable background job execution.
25
+
26
+ It provides:
27
+
28
+ - scheduling
29
+ - retries
30
+ - queues
31
+ - Redis-backed persistence
32
+ - operational familiarity
33
+
34
+ However, Sidekiq intentionally leaves the internal execution strategy of a job to the application.
35
+
36
+ `cdc-sidekiq` extends Sidekiq with CDC execution primitives:
37
+
38
+ - processor contracts
39
+ - parallel execution with `cdc-parallel`
40
+ - concurrent execution with `cdc-concurrent`
41
+ - fan-out / fan-in processing
42
+ - ordered result collection
43
+ - normalized processor results
44
+
45
+ The goal is not to replace Sidekiq.
46
+
47
+ The goal is to make execution topology an explicit choice inside Sidekiq jobs.
48
+
49
+ ```text
50
+ Sidekiq schedules the work.
51
+ CDC primitives execute the work.
52
+ ```
53
+
54
+ ---
55
+
56
+ ## Roadmap
57
+
58
+ ### Community Edition
59
+
60
+ The open-source edition focuses on execution primitives.
61
+
62
+ Current and planned OSS capabilities:
63
+
64
+ - `:direct` runtime
65
+ - `:concurrent` runtime
66
+ - `:parallel` runtime
67
+ - ProcessorJob abstraction
68
+ - `process`
69
+ - `process_many`
70
+ - ordered result collection
71
+ - normalized ProcessorResult handling
72
+
73
+ ### Commercial Orchestration Boundary
74
+
75
+ The open-source edition focuses on Sidekiq integration and explicit execution primitives.
76
+
77
+ Advanced orchestration belongs above this gem: hybrid runtimes, nested worker topologies, worker-local resource pools, adaptive sizing, capacity guards, telemetry, and advanced failure policies live in the commercial orchestrator layer.
78
+
79
+ ```text
80
+ cdc-sidekiq
81
+ Sidekiq integration
82
+ ProcessorJob abstraction
83
+ :direct / :concurrent / :parallel runtime selection
84
+
85
+ cdc-orchestrator-pro
86
+ Hybrid runtime
87
+ Nested runtime / worker-local resource pools
88
+ orchestration, backpressure, telemetry, tuning
89
+ ```
90
+
91
+ This keeps the OSS gem small and useful while leaving operational orchestration to the commercial layer.
92
+
93
+ ## Runtime Selection Guide
94
+
95
+ Choose the smallest runtime that matches the work.
96
+
97
+ | Runtime | Best fit | Avoid when |
98
+ | --- | --- | --- |
99
+ | `:direct` | Tiny, cheap, already-batched work | You need internal fan-out |
100
+ | `:parallel` | CPU-heavy, Ractor-shareable processing | Payloads/processors are not shareable or work is too tiny |
101
+ | `:concurrent` | I/O-heavy processing with scheduler-friendly waits | Work is CPU-bound or effectively no-op |
102
+
103
+ A useful rule of thumb:
104
+
105
+ ```text
106
+ Sidekiq controls how many jobs run.
107
+ cdc-sidekiq controls how one selected job executes its internal payload.
108
+ ```
109
+
110
+ Do not increase both Sidekiq concurrency and CDC runtime concurrency blindly. That can multiply downstream pressure on Redis, PostgreSQL, HTTP APIs, and other shared resources.
111
+
112
+ ## Correctness Boundary
113
+
114
+ `cdc-sidekiq` controls execution topology. It does not make shared downstream state automatically safe.
115
+
116
+ When multiple jobs or multiple internal work items update the same database rows, files, search documents, Redis keys, or external API resources, the processor or sink must provide the correctness policy. Common strategies include:
117
+
118
+ - database transactions;
119
+ - row-level locks such as `SELECT ... FOR UPDATE`;
120
+ - optimistic locking;
121
+ - idempotency keys;
122
+ - conflict-safe writes;
123
+ - single-writer sink patterns.
124
+
125
+ The runtime can provide controlled execution. Application-specific correctness remains the responsibility of the processor and sink implementation.
126
+
127
+ ## Benchmarks
128
+
129
+ See [benchmark/README.md](./benchmark/README.md) for the `bin/cdc-sidekiq-load` benchmark, recent 500,000-item snapshots, interpretation, and runtime tuning guidance.
130
+
131
+ ## Requirements
132
+
133
+ Ruby 3.4 or newer.
134
+
135
+ Runtime support depends on the selected CDC execution substrate:
136
+
137
+ | Runtime | Ruby | Required gems |
138
+ | --- | --- | --- |
139
+ | `:direct` | 3.4+ | `cdc-core` |
140
+ | `:concurrent` | 3.4+ | `cdc-core`, `cdc-concurrent` |
141
+ | `:parallel` | 4.0+ | `cdc-core`, `cdc-parallel` |
142
+
143
+ `cdc-parallel` remains optional because it requires Ruby 4+. Ruby 3.4 users can still use `:direct` and `:concurrent`.
144
+
145
+ ## Installation
146
+
147
+ ```ruby
148
+ gem "cdc-sidekiq"
149
+ ```
150
+
151
+ Runtime gems are installed by the application according to the execution model it uses:
152
+
153
+ ```ruby
154
+ gem "cdc-parallel" # for Ractor-backed execution
155
+ gem "cdc-concurrent" # for Async-backed execution
156
+ ```
157
+
158
+ ## Configuration
159
+
160
+ ```ruby
161
+ Sidekiq.configure_server do |_config|
162
+ CDC::Sidekiq.configure do |cdc|
163
+ cdc.default_runtime = :concurrent
164
+ cdc.parallel_size = Etc.nprocessors - 1
165
+ cdc.concurrency = 100
166
+ cdc.timeout = nil
167
+ cdc.preserve_order = true
168
+ end
169
+ end
170
+ ```
171
+
172
+ Sidekiq concurrency and CDC runtime concurrency are intentionally separate.
173
+
174
+ ```text
175
+ Sidekiq concurrency
176
+ = how many Sidekiq jobs run at once
177
+
178
+ CDC parallel size
179
+ = how many Ractors one CDC-enabled job may use internally
180
+
181
+ CDC concurrency
182
+ = how many Async tasks one CDC-enabled job may use internally
183
+ ```
184
+
185
+ ## Usage
186
+
187
+ ### Parallel processor job
188
+
189
+ ```ruby
190
+ class UserIndexer < CDC::Core::Processor
191
+ ractor_safe!
192
+
193
+ def process(user_id)
194
+ # CPU-heavy or shareable work
195
+ CDC::Core::ProcessorResult.success(user_id)
196
+ end
197
+ end
198
+
199
+ class ReindexUsersJob
200
+ include Sidekiq::Job
201
+ include CDC::Sidekiq::ProcessorJob
202
+
203
+ cdc_processor UserIndexer
204
+ cdc_runtime :parallel
205
+ cdc_parallel_size Etc.nprocessors - 1
206
+ end
207
+
208
+ ReindexUsersJob.perform_async([1, 2, 3, 4])
209
+ ```
210
+
211
+ Array payloads are processed with `process_many` by default.
212
+
213
+ ```text
214
+ Sidekiq job payload
215
+ |
216
+ v
217
+ process_many
218
+ |
219
+ v
220
+ cdc-parallel ProcessorPool
221
+ |
222
+ v
223
+ ordered ProcessorResult array
224
+ ```
225
+
226
+ ### Concurrent processor job
227
+
228
+ ```ruby
229
+ class WebhookDeliverer < CDC::Core::Processor
230
+ def concurrent_safe? = true
231
+
232
+ def process(webhook_payload)
233
+ # I/O-heavy scheduler-friendly work
234
+ CDC::Core::ProcessorResult.success(webhook_payload)
235
+ end
236
+ end
237
+
238
+ class DeliverWebhooksJob
239
+ include Sidekiq::Job
240
+ include CDC::Sidekiq::ProcessorJob
241
+
242
+ cdc_processor WebhookDeliverer
243
+ cdc_runtime :concurrent
244
+ cdc_concurrency 250
245
+ cdc_timeout 5.0
246
+ end
247
+ ```
248
+
249
+ ### Per-job runtime override
250
+
251
+ Global configuration provides defaults. Each job can override the runtime.
252
+
253
+ ```ruby
254
+ class ProjectionJob
255
+ include Sidekiq::Job
256
+ include CDC::Sidekiq::ProcessorJob
257
+
258
+ cdc_processor ProjectionProcessor
259
+ cdc_runtime :parallel
260
+ end
261
+ ```
262
+
263
+ ## Failure behavior
264
+
265
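+ By default, failed `ProcessorResult` objects raise `CDC::Sidekiq::ProcessorFailureError` so Sidekiq can apply its normal retry behavior. Declare `cdc_raise_on_failure false` to return failed results without raising, as in this best-effort job: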
266
+
267
+ ```ruby
268
+ class BestEffortJob
269
+ include Sidekiq::Job
270
+ include CDC::Sidekiq::ProcessorJob
271
+
272
+ cdc_processor BestEffortProcessor
273
+ cdc_runtime :concurrent
274
+ cdc_raise_on_failure false
275
+ end
276
+ ```
277
+
278
+
279
+ ## Benchmarking
280
+
281
+ `cdc-sidekiq` includes `bin/cdc-sidekiq-load`, a benchmark aligned with Sidekiq's `bin/sidekiq-load` style.
282
+
283
+ Sidekiq's benchmark creates many no-op jobs and drains them as fast as possible. `cdc-sidekiq-load` keeps the no-op workload shape but measures the downstream CDC runtime path inside one CDC-aware Sidekiq job.
284
+
285
+ ```bash
286
+ COUNT=500000 RUNTIME=concurrent CDC_CONCURRENCY=100 \
287
+ bundle exec bin/cdc-sidekiq-load
288
+ ```
289
+
290
+ ```bash
291
+ COUNT=500000 RUNTIME=parallel CDC_PARALLEL_SIZE=7 \
292
+ bundle exec bin/cdc-sidekiq-load
293
+ ```
294
+
295
+ See [`benchmark/README.md`](benchmark/README.md) for interpretation notes and all benchmark knobs.
296
+
297
+ ## Current scope
298
+
299
+ This gem currently implements the downstream/runtime integration only:
300
+
301
+ ```text
302
+ Sidekiq Job
303
+
304
+ CDC execution primitives
305
+
306
+ Future:
307
+
308
+ PostgreSQL WAL
309
+
310
+ pgoutput*
311
+
312
+ Sidekiq Job
313
+
314
+ CDC execution primitives
315
+ ```
316
+
317
+ A future upstream/source integration may map PostgreSQL logical replication events into Sidekiq work through the `pgoutput*` family and `pgoutput-source-adapter`, but that is intentionally out of scope for the initial release.
318
+
319
+ ## License
320
+
321
+ [MIT](./LICENSE.txt).
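Note on the README's "Correctness Boundary" section above: it lists idempotency keys as one strategy the processor or sink must supply. The following is only a minimal sketch of that idea, assuming a process-local, in-memory sink. `IdempotentSink` and the key format are hypothetical names and are not part of cdc-sidekiq; a real sink would more likely rely on a unique database index or an atomic Redis write.

```ruby
# Hypothetical in-memory sink, for illustration only (not part of the gem).
class IdempotentSink
  attr_reader :writes

  def initialize
    @seen = {}          # idempotency keys that were already applied
    @writes = []        # values that were actually written
    @mutex = Mutex.new  # guards both structures across threads
  end

  # Apply the write once per idempotency key; repeated deliveries are no-ops.
  # Returns true when the write was applied, false when it was skipped.
  def write(key, value)
    @mutex.synchronize do
      return false if @seen.key?(key)

      @seen[key] = true
      @writes << value
      true
    end
  end
end

sink = IdempotentSink.new
sink.write("user-1:v3", { id: 1 })
sink.write("user-1:v3", { id: 1 }) # retried delivery, skipped
```

With a guard like this, a job that Sidekiq retries after a partial failure can replay its whole batch without applying the same write twice.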
data/benchmark/README.md ADDED
@@ -0,0 +1,143 @@
1
+ # Benchmarks
2
+
3
+ ## `bin/cdc-sidekiq-load`
4
+
5
+ `bin/cdc-sidekiq-load` is intentionally aligned with Sidekiq's own `bin/sidekiq-load` benchmark style.
6
+
7
+ Sidekiq's load benchmark creates a large number of no-op jobs and drains them as fast as possible. `cdc-sidekiq-load` keeps the no-op workload shape but measures the downstream `cdc-sidekiq` execution model:
8
+
9
+ ```text
10
+ Sidekiq-style job payload
11
+ |
12
+ v
13
+ CDC::Sidekiq::Runtime
14
+ |
15
+ +--> :direct
16
+ +--> :concurrent
17
+ +--> :parallel
18
+ |
19
+ v
20
+ process_many(items)
21
+ ```
22
+
23
+ This benchmark does **not** replace Sidekiq's Redis-backed load benchmark. It measures the inner execution primitive that a CDC-aware Sidekiq job can use after Sidekiq has already started the job.
24
+
25
+ ## Examples
26
+
27
+ ```bash
28
+ COUNT=500000 RUNTIME=direct \
29
+ bundle exec bin/cdc-sidekiq-load
30
+ ```
31
+
32
+ ```bash
33
+ COUNT=500000 RUNTIME=concurrent CDC_CONCURRENCY=100 \
34
+ bundle exec bin/cdc-sidekiq-load
35
+ ```
36
+
37
+ ```bash
38
+ COUNT=500000 RUNTIME=parallel CDC_PARALLEL_SIZE=7 \
39
+ bundle exec bin/cdc-sidekiq-load
40
+ ```
41
+
42
+ ## Knobs
43
+
44
+ | Environment variable | Purpose | Default |
45
+ | --- | --- | --- |
46
+ | `COUNT` | Total number of no-op work items | `500000` |
47
+ | `BATCH_SIZE` | Number of items per `process_many` call | `COUNT` |
48
+ | `RUNTIME` | `direct`, `concurrent`, or `parallel` | `concurrent` |
49
+ | `CDC_CONCURRENCY` | Async task limit for `cdc-concurrent` | `100` |
50
+ | `CDC_PARALLEL_SIZE` | Ractor worker count for `cdc-parallel` | `Etc.nprocessors - 1` |
51
+ | `CDC_TIMEOUT` | Per-item timeout in seconds | unset |
52
+ | `PRESERVE_ORDER` | Preserve result order for concurrent runtime | `true` |
53
+ | `WARMUP` | Warmup items before timing | `min(COUNT / 50, 10000)` |
54
+ | `JSON` | Print machine-readable JSON when set to `1` | unset |
55
+
56
+ ## Snapshot: 500,000 No-op Items
57
+
58
+ Environment:
59
+
60
+ ```text
61
+ ruby=ruby 4.0.5 (2026-05-20 revision 64336ffd0e) +PRISM [x86_64-linux]
62
+ count=500,000
63
+ batch_size=500,000
64
+ preserve_order=true
65
+ warmup=10,000
66
+ ```
67
+
68
+ Results:
69
+
70
+ | Runtime | Knobs | Elapsed | Throughput | GC count |
71
+ | --- | --- | ---: | ---: | ---: |
72
+ | `direct` | default direct execution | `0.085821 sec` | `5,826,083 items/sec` | `0` |
73
+ | `parallel` | `CDC_PARALLEL_SIZE=7` | `6.613177 sec` | `75,607 items/sec` | `58` |
74
+ | `parallel` | `CDC_PARALLEL_SIZE=7` | `5.830767 sec` | `85,752 items/sec` | `44` |
75
+ | `concurrent` | `CDC_CONCURRENCY=100` | `12.667181 sec` | `39,472 items/sec` | `45` |
76
+
77
+ ## Interpretation
78
+
79
+ This snapshot is intentionally a no-op workload. It is useful for measuring runtime overhead, not real downstream work.
80
+
81
+ The `:direct` runtime wins by a huge margin because it performs no fan-out, no Ractor messaging, no Async task scheduling, and no pool coordination. For tiny no-op processors, `:direct` should be expected to dominate.
82
+
83
+ The `:parallel` runtime is slower than `:direct` for this workload because every item pays Ractor dispatch and result-collection cost. It is still faster than `:concurrent` in this snapshot, which suggests the Async task orchestration overhead is not worthwhile for a tiny CPU-free processor.
84
+
85
+ The `:concurrent` runtime is intended for I/O-heavy processors. A no-op benchmark is a poor workload for proving its value because there is no socket wait, remote API latency, database latency, or scheduler-friendly blocking work to hide.
86
+
87
+ ## Tuning Recommendations
88
+
89
+ Use `:direct` when:
90
+
91
+ - each item is very cheap;
92
+ - the processor does little or no I/O;
93
+ - the payload is already batched efficiently;
94
+ - predictable low overhead is more important than fan-out.
95
+
96
+ Use `:parallel` when:
97
+
98
+ - the processor is CPU-heavy;
99
+ - the processor and payloads are Ractor-shareable;
100
+ - batches are large enough to amortize Ractor dispatch overhead;
101
+ - the machine has spare CPU cores.
102
+
103
+ Start with:
104
+
105
+ ```bash
106
+ CDC_PARALLEL_SIZE=$(( $(nproc) - 1 ))
107
+ ```
108
+
109
+ then test lower values. More Ractors are not automatically better. Watch throughput, GC count, memory use, and downstream resource pressure.
110
+
111
+ Use `:concurrent` when:
112
+
113
+ - the processor is I/O-heavy;
114
+ - work spends meaningful time waiting on HTTP, Redis, PostgreSQL, MySQL, object storage, or other external systems;
115
+ - downstream services can tolerate the requested concurrency;
116
+ - the result-ordering policy (`PRESERVE_ORDER`) has been chosen deliberately, whether that means keeping input order or disabling it.
117
+
118
+ Start with:
119
+
120
+ ```bash
121
+ CDC_CONCURRENCY=25
122
+ ```
123
+
124
+ then increase gradually. A concurrency value of `100` can be reasonable for I/O-bound workloads, but it is pure overhead for no-op work.
125
+
126
+ ## Benchmark Rule of Thumb
127
+
128
+ ```text
129
+ Tiny/no-op work -> :direct
130
+ CPU-heavy work -> :parallel
131
+ I/O-heavy work -> :concurrent
132
+ Mixed topology -> commercial orchestrator layer
133
+ ```
134
+
135
+ The benchmark is useful for comparing:
136
+
137
+ ```text
138
+ one Sidekiq job with many internal work items
139
+ vs.
140
+ many Sidekiq jobs with one work item each
141
+ ```
142
+
143
+ That distinction is the core `cdc-sidekiq` value proposition.
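Note on the benchmark snapshot above: the throughput column is the item count divided by elapsed seconds. The sketch below recomputes it for three rows and adds an average per-item cost, which can make the overhead easier to compare. The helper names and the row labels are assumptions made for this illustration; only the elapsed times and the 500,000 count come from the table.

```ruby
# Items per second for one run; elapsed_seconds is wall-clock time.
def throughput(count, elapsed_seconds)
  count / elapsed_seconds.to_f
end

# Average cost per item, in microseconds.
def per_item_microseconds(count, elapsed_seconds)
  (elapsed_seconds.to_f / count) * 1_000_000
end

COUNT = 500_000

# Elapsed times copied from the snapshot table; labels are illustrative.
snapshot = {
  "parallel (first row)" => 6.613177,
  "parallel (second row)" => 5.830767,
  "concurrent" => 12.667181
}

snapshot.each do |label, elapsed|
  puts format("%-22s %10.0f items/sec  %6.2f us/item",
              label, throughput(COUNT, elapsed), per_item_microseconds(COUNT, elapsed))
end
```

Read this way, the fan-out runtimes pay on the order of tens of microseconds per item in this snapshot, which is the overhead real per-item work has to outweigh.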
data/lib/cdc/sidekiq/configuration.rb ADDED
@@ -0,0 +1,62 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "etc"
4
+
5
+ module CDC
6
+ module Sidekiq
7
+ # Runtime defaults shared by CDC-aware Sidekiq jobs.
8
+ #
9
+ # The configuration intentionally describes only the CDC execution layer.
10
+ # Sidekiq still owns queue selection, scheduling, retries, durability, and
11
+ # job concurrency. cdc-sidekiq owns runtime selection for work performed
12
+ # inside a Sidekiq job.
13
+ class Configuration
14
+ # @return [Symbol] default runtime used when a job does not declare one.
15
+ attr_accessor :default_runtime
16
+
17
+ # @return [Integer] default number of Ractor workers for cdc-parallel jobs.
18
+ attr_accessor :parallel_size
19
+
20
+ # @return [Integer] default number of Async tasks for cdc-concurrent jobs.
21
+ attr_accessor :concurrency
22
+
23
+ # @return [Float, nil] default per-item timeout passed to CDC processor pools.
24
+ attr_accessor :timeout
25
+
26
+ # @return [Boolean] default result-ordering policy for cdc-concurrent jobs.
27
+ attr_accessor :preserve_order
28
+
29
+ # @return [Boolean] default failure policy for processor jobs.
30
+ attr_accessor :raise_on_failure
31
+
32
+ # @return [Boolean] whether array payloads should be processed with #process_many by default.
33
+ attr_accessor :batch_payloads
34
+
35
+ # @return [void]
36
+ def initialize
37
+ @default_runtime = :concurrent
38
+ @parallel_size = [Etc.nprocessors - 1, 1].max
39
+ @concurrency = 100
40
+ @timeout = nil
41
+ @preserve_order = true
42
+ @raise_on_failure = true
43
+ @batch_payloads = true
44
+ end
45
+
46
+ # Build an independent copy so job-level overrides cannot mutate globals.
47
+ #
48
+ # @return [Configuration] independent copy of this configuration.
49
+ def dup
50
+ copy = self.class.new
51
+ copy.default_runtime = default_runtime
52
+ copy.parallel_size = parallel_size
53
+ copy.concurrency = concurrency
54
+ copy.timeout = timeout
55
+ copy.preserve_order = preserve_order
56
+ copy.raise_on_failure = raise_on_failure
57
+ copy.batch_payloads = batch_payloads
58
+ copy
59
+ end
60
+ end
61
+ end
62
+ end
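Note on the `parallel_size` default in `Configuration#initialize` above: it leaves one core free but clamps to at least one worker, so a single-core host does not end up with zero Ractors. A small sketch of that clamp follows; `default_parallel_size` is a hypothetical helper name used only here.

```ruby
require "etc"

# Mirrors the default above: one core stays free, but the worker count
# never drops below 1. The cores argument exists so the clamp is easy to test.
def default_parallel_size(cores = Etc.nprocessors)
  [cores - 1, 1].max
end

puts default_parallel_size    # depends on the host
puts default_parallel_size(1) # single-core host still gets one worker
```

The same clamp is worth applying to hand-written values such as `Etc.nprocessors - 1` when they may run on small containers.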
data/lib/cdc/sidekiq/errors.rb ADDED
@@ -0,0 +1,27 @@
1
+ # frozen_string_literal: true
2
+
3
+ module CDC
4
+ module Sidekiq
5
+ # Base error for all cdc-sidekiq failures.
6
+ class Error < StandardError; end
7
+
8
+ # Raised when a job declares an unsupported CDC runtime.
9
+ class UnsupportedRuntimeError < Error; end
10
+
11
+ # Raised when a Sidekiq processor job does not declare a processor.
12
+ class MissingProcessorError < Error; end
13
+
14
+ # Raised when processor execution returns one or more failed results.
15
+ class ProcessorFailureError < Error
16
+ # @return [Array<Object>] failed processor results that triggered the error.
17
+ attr_reader :failures
18
+
19
+ # @param failures [Array<Object>] failed processor results that should be exposed to Sidekiq retry handling.
20
+ # @return [void]
21
+ def initialize(failures)
22
+ @failures = failures.freeze
23
+ super("CDC processor failed for #{failures.length} item(s)")
24
+ end
25
+ end
26
+ end
27
+ end
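Note on `ProcessorFailureError` above: the error carries the failed results, so a caller such as a test or an error reporter can inspect them. The sketch below uses a stand-in class with the same shape as the one in the diff so that it runs without the gem installed. `FailedResult` is a hypothetical value type; the real result objects come from `cdc-core` and may expose different fields.

```ruby
# Stand-in with the same shape as the class in the diff above.
class ProcessorFailureError < StandardError
  attr_reader :failures

  def initialize(failures)
    @failures = failures.freeze
    super("CDC processor failed for #{failures.length} item(s)")
  end
end

# Hypothetical failed-result value, for illustration only.
FailedResult = Struct.new(:item, :reason)

begin
  raise ProcessorFailureError, [FailedResult.new(2, "timeout"), FailedResult.new(5, "bad input")]
rescue ProcessorFailureError => e
  warn e.message
  e.failures.each { |failure| warn "  item=#{failure.item} reason=#{failure.reason}" }
end
```

Note that the constructor freezes the array it is given, so the caller's own array becomes read-only as well.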
data/lib/cdc/sidekiq/processor_job.rb ADDED
@@ -0,0 +1,213 @@
1
+ # frozen_string_literal: true
2
+
3
+ module CDC
4
+ module Sidekiq
5
+ # Sidekiq job mixin that executes work through CDC runtime primitives.
6
+ #
7
+ # The job remains a normal Sidekiq job. Sidekiq still owns scheduling,
8
+ # retries, queues, and persistence. cdc-sidekiq only changes how the job
9
+ # executes its payload once Sidekiq has started the job.
10
+ #
11
+ # @example Process many items through cdc-parallel
12
+ # class ReindexUsersJob
13
+ # include Sidekiq::Job
14
+ # include CDC::Sidekiq::ProcessorJob
15
+ #
16
+ # cdc_processor UserIndexer
17
+ # cdc_runtime :parallel
18
+ # end
19
+ #
20
+ # @example Process I/O-heavy items through cdc-concurrent
21
+ # class DeliverWebhooksJob
22
+ # include Sidekiq::Job
23
+ # include CDC::Sidekiq::ProcessorJob
24
+ #
25
+ # cdc_processor WebhookDeliverer
26
+ # cdc_runtime :concurrent
27
+ # cdc_concurrency 250
28
+ # end
29
+ module ProcessorJob
30
+ # Add CDC processor-job class methods to the including job class.
31
+ #
32
+ # @param base [Class] Sidekiq job class including this module.
33
+ # @return [void]
34
+ def self.included(base)
35
+ base.extend(ClassMethods)
36
+ end
37
+
38
+ # Execute a Sidekiq payload through the configured CDC runtime.
39
+ #
40
+ # Array payloads are processed with #process_many when cdc_batch_payloads
41
+ # is enabled. Other payloads are processed with #process.
42
+ #
43
+ # @param payload [Object, Array<Object>] Sidekiq job payload or batch payload.
44
+ # @return [Object, Array<Object>] CDC processor result or frozen result array.
45
+ # @raise [MissingProcessorError] when the job does not declare a CDC processor.
46
+ # @raise [ProcessorFailureError] when raise-on-failure is enabled and one or more results failed.
47
+ def perform(payload)
48
+ results = process_payload(payload)
49
+ handle_processor_failures(results)
50
+ results
51
+ end
52
+
53
+ private
54
+
55
+ def process_payload(payload)
56
+ runtime = self.class.__cdc_sidekiq_runtime
57
+ if payload.is_a?(Array) && self.class.__cdc_sidekiq_batch_payloads
58
+ runtime.process_many(payload)
59
+ else
60
+ runtime.process(payload)
61
+ end
62
+ end
63
+
64
+ def handle_processor_failures(results)
65
+ return unless self.class.__cdc_sidekiq_raise_on_failure
66
+
67
+ failures = Array(results).select { |result| result.respond_to?(:failure?) && result.failure? }
68
+ raise ProcessorFailureError, failures unless failures.empty?
69
+ end
70
+
71
+ # Class-level declaration helpers for CDC-aware Sidekiq jobs.
72
+ module ClassMethods
73
+ # Declare or read the processor used by this job.
74
+ #
75
+ # @param value [Class, Object, nil] processor class or processor instance.
76
+ # @return [Class, Object, nil] configured processor when called without an argument.
77
+ def cdc_processor(value = nil)
78
+ return @cdc_processor if value.nil?
79
+
80
+ @cdc_processor = value
81
+ end
82
+
83
+ # Declare or read the CDC runtime used by this job.
84
+ #
85
+ # @param value [Symbol, String, nil] runtime name, such as :parallel, :concurrent, or :direct.
86
+ # @return [Symbol, nil] configured runtime when called without an argument.
87
+ def cdc_runtime(value = nil)
88
+ return @cdc_runtime if value.nil?
89
+
90
+ @cdc_runtime = value.to_sym
91
+ end
92
+
93
+ # Declare or read the cdc-parallel worker count for this job.
94
+ #
95
+ # @param value [Integer, nil] number of Ractor workers for this job.
96
+ # @return [Integer, nil] configured worker count when called without an argument.
97
+ def cdc_parallel_size(value = nil)
98
+ return @cdc_parallel_size if value.nil?
99
+
100
+ @cdc_parallel_size = Integer(value)
101
+ end
102
+
103
+ # Declare or read the cdc-concurrent task concurrency for this job.
104
+ #
105
+ # @param value [Integer, nil] maximum Async task count for this job.
106
+ # @return [Integer, nil] configured concurrency when called without an argument.
107
+ def cdc_concurrency(value = nil)
108
+ return @cdc_concurrency if value.nil?
109
+
110
+ @cdc_concurrency = Integer(value)
111
+ end
112
+
113
+ # Declare or read the runtime timeout for this job.
114
+ #
115
+ # @param value [Float, Integer, nil] timeout in seconds, or nil for no timeout.
116
+ # @return [Float, nil] configured timeout when called without an argument.
117
+ def cdc_timeout(value = :__cdc_sidekiq_read__)
118
+ return @cdc_timeout if value == :__cdc_sidekiq_read__
119
+
120
+ @cdc_timeout = value.nil? ? nil : Float(value)
121
+ end
122
+
123
+ # Declare or read result ordering for cdc-concurrent.
124
+ #
125
+ # @param value [Boolean, nil] true to preserve input order, false to keep completion order.
126
+ # @return [Boolean, nil] configured ordering policy when called without an argument.
127
+ def cdc_preserve_order(value = nil)
128
+ return @cdc_preserve_order if value.nil?
129
+
130
+ @cdc_preserve_order = value == true
131
+ end
132
+
133
+ # Declare or read whether array payloads use #process_many.
134
+ #
135
+ # @param value [Boolean, nil] true to batch array payloads, false to process the array as one item.
136
+ # @return [Boolean, nil] configured batching policy when called without an argument.
137
+ def cdc_batch_payloads(value = nil)
138
+ return @cdc_batch_payloads if value.nil?
139
+
140
+ @cdc_batch_payloads = value == true
141
+ end
142
+
143
+ # Declare or read whether failed ProcessorResult objects raise.
144
+ #
145
+ # @param value [Boolean, nil] true to raise on failed results so Sidekiq retries the job.
146
+ # @return [Boolean, nil] configured failure policy when called without an argument.
147
+ def cdc_raise_on_failure(value = nil)
148
+ return @cdc_raise_on_failure if value.nil?
149
+
150
+ @cdc_raise_on_failure = value == true
151
+ end
152
+
153
+ # Build the runtime used by one Sidekiq job invocation.
154
+ #
155
+ # @return [Runtime] runtime configured for this job class.
156
+ # @raise [MissingProcessorError] when no processor has been declared.
157
+ def __cdc_sidekiq_runtime
158
+ Runtime.new(
159
+ processor: __cdc_sidekiq_processor,
160
+ runtime: __cdc_sidekiq_runtime_name,
161
+ parallel_size: __cdc_sidekiq_parallel_size,
162
+ concurrency: __cdc_sidekiq_concurrency,
163
+ timeout: __cdc_sidekiq_timeout,
164
+ preserve_order: __cdc_sidekiq_preserve_order
165
+ )
166
+ end
167
+
168
+ # @return [Boolean] true when array payloads should use #process_many.
169
+ def __cdc_sidekiq_batch_payloads
170
+ configured_boolean(@cdc_batch_payloads, CDC::Sidekiq.configuration.batch_payloads)
171
+ end
172
+
173
+ # @return [Boolean] true when failed results should raise.
174
+ def __cdc_sidekiq_raise_on_failure
175
+ configured_boolean(@cdc_raise_on_failure, CDC::Sidekiq.configuration.raise_on_failure)
176
+ end
177
+
178
+ private
179
+
180
+ def __cdc_sidekiq_processor
181
+ processor = @cdc_processor
182
+ raise MissingProcessorError, "#{name} must declare cdc_processor" unless processor
183
+
184
+ processor.is_a?(Class) ? processor.new : processor
185
+ end
186
+
187
+ def __cdc_sidekiq_runtime_name
188
+ @cdc_runtime || CDC::Sidekiq.configuration.default_runtime
189
+ end
190
+
191
+ def __cdc_sidekiq_parallel_size
192
+ @cdc_parallel_size || CDC::Sidekiq.configuration.parallel_size
193
+ end
194
+
195
+ def __cdc_sidekiq_concurrency
196
+ @cdc_concurrency || CDC::Sidekiq.configuration.concurrency
197
+ end
198
+
199
+ def __cdc_sidekiq_timeout
200
+ defined?(@cdc_timeout) ? @cdc_timeout : CDC::Sidekiq.configuration.timeout
201
+ end
202
+
203
+ def __cdc_sidekiq_preserve_order
204
+ configured_boolean(@cdc_preserve_order, CDC::Sidekiq.configuration.preserve_order)
205
+ end
206
+
207
+ def configured_boolean(value, fallback)
208
+ value.nil? ? fallback : value
209
+ end
210
+ end
211
+ end
212
+ end
213
+ end
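Note on `configured_boolean` above: boolean settings use an explicit `nil` check rather than `||`, because a job-level `false` must win over a global `true`. The sketch below contrasts the two. `naive_boolean` is a hypothetical variant shown only for comparison and does not appear in the gem.

```ruby
# Explicit nil check, as in configured_boolean above.
def configured_boolean(value, fallback)
  value.nil? ? fallback : value
end

# Naive variant for contrast only: `||` treats false like "unset".
def naive_boolean(value, fallback)
  value || fallback
end

global_default = true
job_override = false # e.g. a job that declares cdc_raise_on_failure false

p configured_boolean(job_override, global_default) # keeps the explicit false
p naive_boolean(job_override, global_default)      # falls through to the global value
p configured_boolean(nil, global_default)          # unset, so the global applies
```

`cdc_timeout` addresses the same problem from the other side: `nil` is a meaningful value there (no timeout), so the reader uses a sentinel default and `defined?` to tell "unset" apart from "explicitly nil".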
data/lib/cdc/sidekiq/runtime.rb ADDED
@@ -0,0 +1,91 @@
1
+ # frozen_string_literal: true
2
+
3
+ module CDC
4
+ module Sidekiq
5
+ # Executes a CDC processor through one of the CDC runtime primitives.
6
+ #
7
+ # Runtime is intentionally selected outside Sidekiq's own concurrency
8
+ # setting. Sidekiq concurrency controls how many jobs run at once. This
9
+ # object controls how one CDC-aware job fans work out internally.
10
+ class Runtime
11
+ # @param processor [Object] CDC processor object that responds to #process.
12
+ # @param runtime [Symbol] execution runtime, currently :parallel, :concurrent, or :direct.
13
+ # @param parallel_size [Integer] number of Ractors used by cdc-parallel.
14
+ # @param concurrency [Integer] number of Async tasks used by cdc-concurrent.
15
+ # @param timeout [Float, nil] optional timeout passed to the selected runtime.
16
+ # @param preserve_order [Boolean] whether cdc-concurrent should preserve input order.
17
+ # @return [void]
18
+ def initialize(processor:, runtime:, parallel_size:, concurrency:, timeout:, preserve_order:)
19
+ @processor = processor
20
+ @runtime = runtime.to_sym
21
+ @parallel_size = parallel_size
22
+ @concurrency = concurrency
23
+ @timeout = timeout
24
+ @preserve_order = preserve_order
25
+ end
26
+
27
+ # Process one work item through the selected runtime.
28
+ #
29
+ # @param item [Object] work item passed to the processor.
30
+ # @return [Object] processor result returned by the selected runtime.
31
+ def process(item)
32
+ with_pool { |pool| pool.process(item) }
33
+ end
34
+
35
+ # Process many work items through the selected runtime.
36
+ #
37
+ # @param items [Array<Object>] work items passed to the processor.
38
+ # @return [Array<Object>] processor results returned by the selected runtime.
39
+ def process_many(items)
40
+ with_pool { |pool| pool.process_many(items) }
41
+ end
42
+
43
+ private
44
+
45
+ attr_reader :processor, :runtime, :parallel_size, :concurrency, :timeout, :preserve_order
46
+
47
+ def with_pool
48
+ pool = build_pool
49
+ yield pool
50
+ ensure
51
+ pool.shutdown if pool.respond_to?(:shutdown)
52
+ end
53
+
54
+ def build_pool
55
+ case runtime
56
+ when :parallel
57
+ require "cdc/parallel"
58
+ CDC::Parallel::ProcessorPool.new(processor:, size: parallel_size, timeout:)
59
+ when :concurrent
60
+ require "cdc/concurrent"
61
+ CDC::Concurrent::ProcessorPool.new(processor:, concurrency:, timeout:, preserve_order:)
62
+ when :direct
63
+ DirectPool.new(processor)
64
+ else
65
+ raise UnsupportedRuntimeError, "unsupported CDC Sidekiq runtime: #{runtime.inspect}"
66
+ end
67
+ end
68
+
69
+ # Minimal runtime used for tests and simple sequential execution.
70
+ class DirectPool
71
+ # @param processor [Object] CDC processor object that responds to #process.
72
+ # @return [void]
73
+ def initialize(processor)
74
+ @processor = processor
75
+ end
76
+
77
+ # @param item [Object] work item passed to the processor.
78
+ # @return [Object] processor result returned by the processor.
79
+ def process(item)
80
+ @processor.process(item)
81
+ end
82
+
83
+ # @param items [Array<Object>] work items passed to the processor.
84
+ # @return [Array<Object>] processor results returned by the processor.
85
+ def process_many(items)
86
+ items.map { |item| process(item) }.freeze
87
+ end
88
+ end
89
+ end
90
+ end
91
+ end
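Note on `Runtime#with_pool` above: each `process` or `process_many` call builds a pool, yields it, and shuts it down in an `ensure` clause, so cleanup runs even when the processor raises. The sketch below shows the same shape with a stand-in pool. `RecordingPool` is hypothetical, and the free-standing `with_pool` takes the pool as an argument only so the example runs without the gem.

```ruby
# Hypothetical pool that records whether shutdown ran.
class RecordingPool
  attr_reader :shut_down

  def initialize
    @shut_down = false
  end

  def process(item)
    raise ArgumentError, "cannot process nil" if item.nil?

    item * 2
  end

  def shutdown
    @shut_down = true
  end
end

# Same shape as Runtime#with_pool: yield the pool, always clean up.
def with_pool(pool)
  yield pool
ensure
  pool.shutdown if pool.respond_to?(:shutdown)
end

pool = RecordingPool.new
begin
  with_pool(pool) { |active| active.process(nil) }
rescue ArgumentError => e
  warn "processor raised: #{e.message}" # the error still reaches the caller
end
puts pool.shut_down
```

Because the error propagates after cleanup, Sidekiq's retry handling still sees the failure while Ractors or Async tasks held by the pool are released.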
data/lib/cdc/sidekiq/version.rb ADDED
@@ -0,0 +1,8 @@
1
+ # frozen_string_literal: true
2
+
3
+ module CDC
4
+ module Sidekiq
5
+ # Current cdc-sidekiq gem version.
6
+ VERSION = "0.1.0"
7
+ end
8
+ end
data/lib/cdc/sidekiq.rb ADDED
@@ -0,0 +1,37 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "sidekiq/version"
4
+ require_relative "sidekiq/errors"
5
+ require_relative "sidekiq/configuration"
6
+ require_relative "sidekiq/runtime"
7
+ require_relative "sidekiq/processor_job"
8
+
9
+ module CDC
10
+ # Integration layer between Sidekiq and CDC execution primitives.
11
+ module Sidekiq
12
+ class << self
13
+ # Read the process-wide cdc-sidekiq configuration.
14
+ #
15
+ # @return [Configuration] mutable global configuration object.
16
+ def configuration
17
+ @configuration ||= Configuration.new
18
+ end
19
+
20
+ # Configure process-wide defaults for CDC-aware Sidekiq jobs.
21
+ #
22
+ # @yieldparam configuration [Configuration] mutable configuration object.
23
+ # @return [Configuration] configured global configuration object.
24
+ def configure
25
+ yield configuration if block_given?
26
+ configuration
27
+ end
28
+
29
+ # Reset process-wide configuration to defaults.
30
+ #
31
+ # @return [Configuration] new default configuration object.
32
+ def reset_configuration!
33
+ @configuration = Configuration.new
34
+ end
35
+ end
36
+ end
37
+ end
data/lib/cdc_sidekiq.rb ADDED
@@ -0,0 +1,3 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "cdc/sidekiq"
metadata ADDED
@@ -0,0 +1,86 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: cdc-sidekiq
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Ken C. Demanawa
8
+ bindir: bin
9
+ cert_chain: []
10
+ date: 1980-01-02 00:00:00.000000000 Z
11
+ dependencies:
12
+ - !ruby/object:Gem::Dependency
13
+ name: cdc-core
14
+ requirement: !ruby/object:Gem::Requirement
15
+ requirements:
16
+ - - ">="
17
+ - !ruby/object:Gem::Version
18
+ version: '0.1'
19
+ type: :runtime
20
+ prerelease: false
21
+ version_requirements: !ruby/object:Gem::Requirement
22
+ requirements:
23
+ - - ">="
24
+ - !ruby/object:Gem::Version
25
+ version: '0.1'
26
+ - !ruby/object:Gem::Dependency
27
+ name: sidekiq
28
+ requirement: !ruby/object:Gem::Requirement
29
+ requirements:
30
+ - - ">="
31
+ - !ruby/object:Gem::Version
32
+ version: '7.0'
33
+ type: :runtime
34
+ prerelease: false
35
+ version_requirements: !ruby/object:Gem::Requirement
36
+ requirements:
37
+ - - ">="
38
+ - !ruby/object:Gem::Version
39
+ version: '7.0'
40
+ description: |
41
+ Adds CDC-aware processor jobs and runtime selection to Sidekiq,
42
+ allowing selected jobs to execute payloads directly, through cdc-parallel, or through cdc-concurrent.
43
+ email:
44
+ - kenneth.c.demanawa@gmail.com
45
+ executables: []
46
+ extensions: []
47
+ extra_rdoc_files: []
48
+ files:
49
+ - CHANGELOG.md
50
+ - LICENSE.txt
51
+ - README.md
52
+ - benchmark/README.md
53
+ - lib/cdc/sidekiq.rb
54
+ - lib/cdc/sidekiq/configuration.rb
55
+ - lib/cdc/sidekiq/errors.rb
56
+ - lib/cdc/sidekiq/processor_job.rb
57
+ - lib/cdc/sidekiq/runtime.rb
58
+ - lib/cdc/sidekiq/version.rb
59
+ - lib/cdc_sidekiq.rb
60
+ homepage: https://github.com/kanutocd/cdc-sidekiq
61
+ licenses:
62
+ - MIT
63
+ metadata:
64
+ homepage_uri: https://github.com/kanutocd/cdc-sidekiq
65
+ source_code_uri: https://github.com/kanutocd/cdc-sidekiq
66
+ changelog_uri: https://github.com/kanutocd/cdc-sidekiq/blob/main/CHANGELOG.md
67
+ documentation_uri: https://kanutocd.github.io/cdc-sidekiq/
68
+ rubygems_mfa_required: 'true'
69
+ rdoc_options: []
70
+ require_paths:
71
+ - lib
72
+ required_ruby_version: !ruby/object:Gem::Requirement
73
+ requirements:
74
+ - - ">="
75
+ - !ruby/object:Gem::Version
76
+ version: 3.4.0
77
+ required_rubygems_version: !ruby/object:Gem::Requirement
78
+ requirements:
79
+ - - ">="
80
+ - !ruby/object:Gem::Version
81
+ version: '0'
82
+ requirements: []
83
+ rubygems_version: 4.0.10
84
+ specification_version: 4
85
+ summary: Sidekiq integration for CDC execution runtimes.
86
+ test_files: []