cdc-parallel 0.2.2 → 0.2.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: a51d304d55079509a1c2056573a9f4764924573b9a13eb1dbeda89ac94d57836
-   data.tar.gz: 4e154536aaca3d801d4affe02e424f332c6a1a37b738ee70cc8fea5f335645fd
+   metadata.gz: 727373877c3c90e65e65ec4bbb5f5312b7e0d15d879af54402fa6752ac730aa0
+   data.tar.gz: cd2701bf5e93f5ef825f4b2e95a015dfbbb214bb14cb3f095a65d499aeaf9f28
  SHA512:
-   metadata.gz: c3002bfcd3914510fc39e7621fba99a7868303d24f8c8e5b849d40c653f73355b9926636ed6bb31754f2f028d65af325207eb89466923bc413eb8e44538730e7
-   data.tar.gz: 686ef1d6c0759d8ebedbd18e7109a4ac22008e8a0501faab31db8440bef3cf1127f79b076ce46fd3d8eec46f6e524e0ac6a75b1d3e0cefbc2f5ab151e99110b8
+   metadata.gz: 33e6152dfaf7fdeda19853229519f2154b6ec39cb87b3c074dea0658e94eb65156d11d8e7ce11d0c3560dae678fe196dc1cc99a4f6fb4c6d712f8b84be7ad274
+   data.tar.gz: 9999fdf8a4e05694f506b1a4f6e6db9679bc0da09d47b8c67513959d8467ead0fce0b26245b158c3101c8467110ed29af2aefeb4bc13c27d95ac76e38920e449
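A reader who wants to confirm digests like these locally could recompute them from the unpacked gem. The sketch below is only an illustration and is not part of the gem: `verify_checksums` is a hypothetical helper, and it assumes `checksums.yaml` has the two-level layout shown in the hunk (algorithm name, then file name).

```ruby
require "digest"
require "yaml"

# Hypothetical helper: compare recorded digests against recomputed ones.
# `checksums_yaml` is the text of checksums.yaml; `blobs` maps a file name
# (for example "metadata.gz") to that file's raw bytes.
def verify_checksums(checksums_yaml, blobs)
  algorithms = { "SHA256" => Digest::SHA256, "SHA512" => Digest::SHA512 }
  recorded = YAML.safe_load(checksums_yaml)

  # One [algorithm, file name, matched?] triple per recorded digest.
  recorded.flat_map do |algorithm, entries|
    digest = algorithms.fetch(algorithm)
    entries.map do |name, expected|
      [algorithm, name, digest.hexdigest(blobs.fetch(name)) == expected]
    end
  end
end
```

Any triple whose last element is `false` points at a file whose bytes do not match the recorded digest.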
data/CHANGELOG.md CHANGED
@@ -4,30 +4,59 @@ All notable changes to this project will be documented in this file.

  The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

- ## [0.2.2] - 2026-06-03
+ ## Unreleased

- ### Changed
+ No unreleased changes.

- - Improved processor pool shutdown so workers are signaled and confirmed stopped where practical.
- - Updated transaction processing so partial event failures fail the transaction result while preserving per-event results.
- - Added CI validation for RBS signatures.
+ ## [0.2.3] - 2026-06-03

- ### Added
+ ### Added
+
+ - Added Port-native `Ractor::Port` worker inbox dispatch for the pre-warmed processor pool.
+ - Added concurrent threaded caller regression coverage for `ProcessorPool#process_many`.
+ - Added worker inbox boot verification coverage.
+ - Added multi-trial processor-pool benchmark reporting with min, median, max, and p95 distributions.
+ - Added minimum measurement duration support for benchmark trials.
+ - Added worker-count sweep support through `BENCHMARK_WORKER_COUNTS`.
+ - Added benchmark comparison across serial execution, repeated `ProcessorPool#process`, and batched `ProcessorPool#process_many`.
+ - Added benchmark environment metadata for Ruby, platform, host, CPU count, and uname details.
+ - Added detailed benchmark methodology and report documentation under `benchmark/README.md`.
+
+ ### Changed
+
+ - Updated worker dispatch to send work through worker-owned inbox ports instead of direct worker messages.
+ - Synchronized dispatch and shutdown with a mutex so multiple Ruby threads can submit work safely.
+ - Updated processor pool RBS signatures for worker inboxes and Port-native dispatch helpers.
+ - Expanded README documentation for the worker dispatch model.
+ - Updated README benchmark guidance to point to the detailed benchmark report documentation.
+ - Updated benchmark ratio reporting to compare median throughput against serial execution.
+
+ ## [0.2.2] - 2026-06-03

- - Added regression coverage for shutdown after processed and pending work.
- - Added regression coverage for timeout-bounded shutdown behavior.
- - Added regression coverage for `process_many([])` returning a clean empty result.
- - Added transaction pool coverage for successful and partially failed transactions.
+ ### Changed
+
+ - Improved processor pool shutdown so workers are signaled and confirmed stopped where practical.
+ - Updated transaction processing so partial event failures fail the transaction result while preserving per-event results.
+ - Added CI validation for RBS signatures.
+
+ ### Added
+
+ - Added regression coverage for shutdown after processed and pending work.
+ - Added regression coverage for timeout-bounded shutdown behavior.
+ - Added regression coverage for `process_many([])` returning a clean empty result.
+ - Added transaction pool coverage for successful and partially failed transactions.

  ## [0.2.1] - 2026-06-03

  ### Added

- v0.2.1 - Correctness and reliability patch
+ - Enforced processor timeout handling.
+ - Fixed transaction partial-failure behavior.
+ - Added regression coverage for hung processors and transaction failure cases.

- - Enforced processor timeout handling.
- - Fixed transaction partial-failure behavior.
- - Added regression coverage for hung processors and transaction failure cases.
+ ### Changed
+
+ - Released a correctness and reliability patch.

  ## [0.2.0] - 2026-06-03

@@ -57,7 +86,12 @@ Local benchmark results on Ruby 4.0.5 (4 workers) demonstrated measurable throug

  Benchmark results vary by hardware, operating system, Ruby version, and workload characteristics. Users are encouraged to reproduce results on their own systems using the included benchmark suite.

+ ## [0.1.1] - 2026-06-03
+
+ No code changes.

+ Improves RubyGems metadata and documentation wording to
+ explicitly identify CDC as Change Data Capture.

  ## [0.1.0] - 2026-05-31

@@ -77,10 +111,3 @@ Benchmark results vary by hardware, operating system, Ruby version, and workload
  - Added Minitest suite.
  - Added README and example.
  - Added CI and release workflows.
-
- ## [0.1.1] - 2026-06-03
-
- No code changes.
-
- Improves RubyGems metadata and documentation wording to
- explicitly identify CDC as Change Data Capture.
data/README.md CHANGED
@@ -2,7 +2,6 @@

  [![Gem Version](https://badge.fury.io/rb/cdc-parallel.svg)](https://badge.fury.io/rb/cdc-parallel)
  [![CI](https://github.com/kanutocd/cdc-parallel/workflows/CI/badge.svg)](https://github.com/kanutocd/cdc-parallel/actions)
- [![Coverage Status](https://codecov.io/gh/kanutocd/cdc-parallel/branch/main/graph/badge.svg)](https://codecov.io/gh/kanutocd/cdc-parallel)
  [![Ruby Version](https://img.shields.io/badge/ruby-%3E%3D%204.0-ruby.svg)](https://www.ruby-lang.org/en/)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

@@ -82,6 +81,39 @@ Unsafe processors raise:
  CDC::Parallel::UnsafeProcessorError
  ```

+ ## Concurrency Contract
+
+ `CDC::Parallel::ProcessorPool` accepts submissions from multiple Ruby threads.
+ Dispatch state is synchronized inside the pool, while processor execution occurs
+ inside isolated Ruby 4 Ractors.
+
+ Workers own their `Ractor::Port` inboxes. The pool sends work to those inboxes,
+ and workers send results back to a caller-owned reply port.
+
+ ```text
+ Caller Thread A ─┐
+ Caller Thread B ─┼─> ProcessorPool
+ Caller Thread C ─┘ │
+ │ synchronized dispatch
+
+ +-------------------+
+ | worker selection |
+ +-------------------+
+ │ │ │
+ ▼ ▼ ▼
+ inbox port inbox port inbox port
+ │ │ │
+ ▼ ▼ ▼
+ Ractor 1 Ractor 2 Ractor 3
+ │ │ │
+ └───┬───┴───┬───┘
+ ▼ ▼
+ caller-owned reply port
+
+
+ ordered ProcessorResult[]
+ ```
+
  ## What Belongs Here

  - Ractor processor execution
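The dispatch shape described in the Concurrency Contract can be modeled without Ractors. The sketch below is an analogy only, not the gem's implementation: it swaps `Ractor::Port` for `Queue` and Ractors for threads so it runs on Ruby versions that lack `Ractor::Port`. `SketchPool`, its round-robin worker selection, and the index-tagged messages are all assumptions made for illustration. A processor that raises is not handled here.

```ruby
# Thread-based model of the dispatch shape: worker-owned inboxes, a
# mutex-guarded dispatch step, and a caller-owned reply queue.
# Queues and threads stand in for Ractor::Port and Ractors.
class SketchPool
  def initialize(size:, &processor)
    @mutex = Mutex.new
    @next_worker = 0
    @closed = false
    @inboxes = Array.new(size) { Queue.new }
    @workers = @inboxes.map do |inbox|
      Thread.new do
        # Each worker drains only its own inbox; pop returns nil once the
        # inbox is closed and empty, which ends the loop.
        while (message = inbox.pop)
          index, item, reply = message
          reply << [index, processor.call(item)]
        end
      end
    end
  end

  # Safe to call from several threads: only the dispatch step is serialized.
  def process_many(items)
    reply = Queue.new # caller-owned, one per call
    @mutex.synchronize do
      raise "pool is shut down" if @closed

      items.each_with_index do |item, index|
        @inboxes[@next_worker] << [index, item, reply]
        @next_worker = (@next_worker + 1) % @inboxes.size
      end
    end
    # Results arrive in completion order; sort back into submission order.
    Array.new(items.size) { reply.pop }.sort_by(&:first).map(&:last)
  end

  def shutdown
    @mutex.synchronize do
      @closed = true
      @inboxes.each(&:close)
    end
    @workers.each(&:join)
  end
end
```

Because each call owns its reply queue, concurrent callers never see each other's results, which is why only dispatch and shutdown need the mutex in this model.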
@@ -175,7 +207,10 @@ The benchmark focuses on three workload categories:
  | cpu | Measure CPU-bound processing throughput |
  | batch | Measure batched CDC event processing throughput |

- ### Running Benchmarks
+ See [benchmark/README.md](benchmark/README.md) for the full benchmark methodology,
+ configuration reference, report schema, and interpretation guidance.
+
+ ### Quick Start

  Tiny workload:

@@ -200,6 +235,23 @@ BENCHMARK_BATCH_SIZE=10000 \
  bundle exec rake benchmark:processor_pool
  ```

+ Worker-count sweep:
+
+ ```bash
+ BENCHMARK_WORKLOAD=cpu \
+ BENCHMARK_WORKER_COUNTS=1,2,4 \
+ bundle exec rake benchmark:processor_pool
+ ```
+
+ Credibility controls:
+
+ ```bash
+ BENCHMARK_TRIALS=7 \
+ BENCHMARK_MIN_DURATION=0.25 \
+ BENCHMARK_ITERATIONS=1000 \
+ bundle exec rake benchmark:processor_pool
+ ```
+
  ### Benchmark Docker Image

  Build and run the reusable Docker image:
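The sweep and credibility-control variables in this hunk suggest a multi-trial measurement loop with summary statistics. The sketch below shows one plausible shape for that, under stated assumptions: `summarize`, `measure`, and `worker_counts` are hypothetical helpers, the nearest-rank percentile is one of several common definitions, and how the gem's benchmark actually reads these variables is not shown in the diff.

```ruby
# Summarize per-trial throughput samples (events per second).
# p95 uses the nearest-rank method; the gem's report may compute it differently.
def summarize(samples)
  sorted = samples.sort
  count = sorted.size
  middle = count / 2
  median = count.odd? ? sorted[middle] : (sorted[middle - 1] + sorted[middle]) / 2.0
  p95 = sorted[[(0.95 * count).ceil - 1, 0].max]
  { min: sorted.first, median: median, max: sorted.last, p95: p95 }
end

# Run `trials` measurements. Each trial repeats the workload in chunks of
# `iterations` until at least `min_duration` seconds have elapsed, then
# records the observed events per second.
def measure(trials:, min_duration:, iterations:)
  Array.new(trials) do
    events = 0
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    loop do
      iterations.times { yield }
      events += iterations
      elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      break events / elapsed if elapsed >= min_duration
    end
  end
end

# Assumed parsing of the sweep variable, e.g. "1,2,4" => [1, 2, 4].
def worker_counts(env = ENV)
  env.fetch("BENCHMARK_WORKER_COUNTS", "1").split(",").map { |count| Integer(count.strip) }
end
```

With these helpers, a median-based ratio in the spirit of the changelog entry would be `summarize(parallel_samples)[:median] / summarize(serial_samples)[:median]`, where a value above `1.0` means the parallel run was faster.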
@@ -218,49 +270,3 @@ docker run --rm ghcr.io/kanutocd/cdc-parallel-benchmark:main
  The benchmark image is intended to become the shared performance validation
  pattern across CDC Ecosystem gems, enabling reproducible benchmark execution
  locally, in CI, and across different development environments.
-
- ### Example Result
-
- Environment:
-
- * Ruby 4.0.5
- * x86_64 Linux
- * 4 workers
-
- CPU workload (`BENCHMARK_CPU_ROUNDS=5000`):
-
- ```json
- {
-   "serial": {
-     "events_per_second": 120.26
-   },
-   "parallel": {
-     "events_per_second": 250.15
-   },
-   "ratio": {
-     "parallel_to_serial": 2.08
-   }
- }
- ```
-
- ### Interpretation
-
- A ratio greater than `1.0` indicates that the pre-warmed Ractor worker pool outperformed serial execution.
-
- ```text
- ratio > 1.0 => parallel faster
- ratio = 1.0 => equivalent
- ratio < 1.0 => serial faster
- ```
-
- ### Reproducibility
-
- Benchmark results vary depending on:
-
- * CPU model
- * Core count
- * Operating system
- * Ruby version
- * Background system activity
-
- The benchmark suite is provided so that users can reproduce and validate results on their own hardware.
@@ -2,13 +2,36 @@

  module CDC
    module Parallel
-     # Immutable configuration for Ractor runtimes.
+     # Immutable configuration shared by cdc-parallel runtime objects.
      #
-     # @!attribute size
-     #   @return [Integer] worker count.
-     # @!attribute timeout
-     #   @return [Float, nil] optional wait timeout in seconds.
+     # `Configuration` validates worker sizing and timeout values at construction
+     # time, freezes the resulting data object through `Data.define`, and makes
+     # the instance shareable so it is safe to retain around Ractor-oriented
+     # runtime objects.
+     #
+     # @example Default configuration
+     #   config = CDC::Parallel::Configuration.new
+     #   config.size    #=> Etc.nprocessors
+     #   config.timeout #=> nil
+     #
+     # @example Explicit worker count and timeout
+     #   config = CDC::Parallel::Configuration.new(size: 4, timeout: 5)
+     #
+     # @!attribute [r] size
+     #   @return [Integer] Number of worker Ractors to boot.
+     # @!attribute [r] timeout
+     #   @return [Numeric, nil] Optional wait timeout in seconds.
+     # @api public
      class Configuration < Data.define(:size, :timeout)
+       # Create a validated runtime configuration.
+       #
+       # @param size [Integer]
+       #   Worker count. Must be greater than zero.
+       # @param timeout [Numeric, nil]
+       #   Optional timeout in seconds. Must be greater than zero when provided.
+       # @raise [ArgumentError]
+       #   Raised when `size` or `timeout` is invalid.
+       # @return [void]
        def initialize(size: Etc.nprocessors, timeout: nil)
          raise ArgumentError, "size must be an Integer" unless size.is_a?(Integer)
          raise ArgumentError, "size must be greater than zero" unless size.positive?
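The hunk cuts off inside `initialize`, so the rest of the validation is not visible. As a rough illustration of the documented behaviour, the sketch below applies the same rules to a stand-alone `Data.define` class. `SketchConfiguration` is a hypothetical name, the timeout error message is assumed wording, and `Data` requires Ruby 3.2 or newer.

```ruby
require "etc"

# Sketch only: mirrors the documented validation rules with Data.define.
# The two size messages come from the diff; the timeout message is assumed.
SketchConfiguration = Data.define(:size, :timeout) do
  def initialize(size: Etc.nprocessors, timeout: nil)
    raise ArgumentError, "size must be an Integer" unless size.is_a?(Integer)
    raise ArgumentError, "size must be greater than zero" unless size.positive?
    unless timeout.nil? || (timeout.is_a?(Numeric) && timeout.positive?)
      raise ArgumentError, "timeout must be a positive Numeric or nil"
    end

    # Data#initialize assigns the members and freezes the instance.
    super(size: size, timeout: timeout)
  end
end
```

In this sketch the instance is already shareable once `super` returns, because a frozen `Data` object whose members are an Integer and a Numeric or `nil` is deeply immutable; `Ractor.shareable?(SketchConfiguration.new(size: 4))` can be used to check that.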
@@ -2,22 +2,69 @@

  module CDC
    module Parallel
-     # Base cdc-parallel error.
+     # Base error for all cdc-parallel-specific failures.
+     #
+     # Rescue this class when callers want to handle any failure raised directly
+     # by the parallel runtime layer.
+     #
+     # @api public
      class Error < StandardError; end

      # Raised when a processor has not declared itself Ractor-safe.
+     #
+     # Processors must opt in with `ractor_safe!` before they can be used by
+     # {ProcessorPool}, {TransactionPool}, or {Runtime}. This prevents accidental
+     # movement of mutable or otherwise unsafe processor objects across Ractor
+     # boundaries.
+     #
+     # @api public
      class UnsafeProcessorError < Error; end

-     # Raised when work is submitted after shutdown.
+     # Raised when work is submitted after a pool or runtime has been shut down.
+     #
+     # @api public
      class ShutdownError < Error; end

-     # Raised when the runtime receives an unsupported work item.
+     # Raised when the runtime receives an unsupported work item shape.
+     #
+     # `cdc-parallel` accepts normalized `CDC::Core::ChangeEvent` and
+     # `CDC::Core::TransactionEnvelope` objects. Source-specific payloads must be
+     # normalized by a source adapter before they reach this runtime layer.
+     #
+     # @api public
      class UnsupportedWorkItemError < Error; end

-     # Raised when processor execution fails inside a worker Ractor.
+     # Represents an exception raised inside a worker Ractor.
+     #
+     # Worker exceptions are serialized before they cross the Ractor boundary and
+     # reconstructed as `ProcessorExecutionError` instances by
+     # {ResultCollector.normalize}. The original exception class name, message,
+     # and backtrace are exposed for diagnostics.
+     #
+     # @example Inspecting the original worker exception
+     #   result = runtime.process(event)
+     #   if result.failure?
+     #     error = result.error
+     #     error.original_class
+     #     error.original_message
+     #   end
+     #
+     # @attr_reader original_class [String] original exception class name.
+     # @attr_reader original_message [String] original exception message.
+     # @attr_reader original_backtrace [Array<String>] original exception backtrace.
+     # @api public
      class ProcessorExecutionError < Error
        attr_reader :original_class, :original_message, :original_backtrace

+       # Create a reconstructed worker exception.
+       #
+       # @param original_class [String]
+       #   Class name of the exception raised inside the worker.
+       # @param original_message [String]
+       #   Message from the exception raised inside the worker.
+       # @param original_backtrace [Array<String>]
+       #   Backtrace captured inside the worker.
+       # @return [void]
        def initialize(original_class:, original_message:, original_backtrace: [])
          @original_class = original_class
          @original_message = original_message
@@ -28,7 +75,14 @@ module CDC
        end
      end

-     # Raised when a worker does not return a result before timeout.
+     # Raised when a pool does not receive worker results before the configured
+     # timeout.
+     #
+     # Timeout failures are normally returned inside `CDC::Core::ProcessorResult`
+     # failure objects rather than raised directly to the caller during result
+     # collection.
+     #
+     # @api public
      class TimeoutError < Error; end
    end
  end
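The `ProcessorExecutionError` comments describe a serialize-then-reconstruct pattern for exceptions that cross a Ractor boundary. A minimal sketch of that pattern follows, with several assumptions: `SketchExecutionError`, `serialize_error`, and `rebuild_error` are hypothetical stand-ins, the payload is a plain Hash, and the composed error message is a guess because the diff truncates the real `initialize` body. The keyword signature is the one shown in the hunk.

```ruby
# Sketch only: class and helper names here are hypothetical stand-ins.
class SketchExecutionError < StandardError
  attr_reader :original_class, :original_message, :original_backtrace

  def initialize(original_class:, original_message:, original_backtrace: [])
    @original_class = original_class
    @original_message = original_message
    @original_backtrace = original_backtrace
    # Assumed message format; the real one is not visible in the diff.
    super("#{original_class}: #{original_message}")
  end
end

# Worker side: reduce the exception to plain strings before sending it.
def serialize_error(error)
  {
    original_class: error.class.name,
    original_message: error.message,
    original_backtrace: Array(error.backtrace)
  }
end

# Caller side: rebuild a single, predictable error type from that payload.
def rebuild_error(payload)
  SketchExecutionError.new(**payload)
end
```

Reducing the exception to strings avoids sending an arbitrary exception object, which may reference unshareable state, across the boundary, while the caller still gets the class name, message, and backtrace for diagnostics.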