ruby_reactor 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (75)
  1. checksums.yaml +7 -0
  2. data/.rspec +3 -0
  3. data/.rubocop.yml +98 -0
  4. data/CODE_OF_CONDUCT.md +84 -0
  5. data/README.md +570 -0
  6. data/Rakefile +12 -0
  7. data/documentation/DAG.md +457 -0
  8. data/documentation/README.md +123 -0
  9. data/documentation/async_reactors.md +369 -0
  10. data/documentation/composition.md +199 -0
  11. data/documentation/core_concepts.md +662 -0
  12. data/documentation/data_pipelines.md +224 -0
  13. data/documentation/examples/inventory_management.md +749 -0
  14. data/documentation/examples/order_processing.md +365 -0
  15. data/documentation/examples/payment_processing.md +654 -0
  16. data/documentation/getting_started.md +224 -0
  17. data/documentation/retry_configuration.md +357 -0
  18. data/lib/ruby_reactor/async_router.rb +91 -0
  19. data/lib/ruby_reactor/configuration.rb +41 -0
  20. data/lib/ruby_reactor/context.rb +169 -0
  21. data/lib/ruby_reactor/context_serializer.rb +164 -0
  22. data/lib/ruby_reactor/dependency_graph.rb +126 -0
  23. data/lib/ruby_reactor/dsl/compose_builder.rb +86 -0
  24. data/lib/ruby_reactor/dsl/map_builder.rb +112 -0
  25. data/lib/ruby_reactor/dsl/reactor.rb +151 -0
  26. data/lib/ruby_reactor/dsl/step_builder.rb +177 -0
  27. data/lib/ruby_reactor/dsl/template_helpers.rb +36 -0
  28. data/lib/ruby_reactor/dsl/validation_helpers.rb +35 -0
  29. data/lib/ruby_reactor/error/base.rb +16 -0
  30. data/lib/ruby_reactor/error/compensation_error.rb +8 -0
  31. data/lib/ruby_reactor/error/context_too_large_error.rb +11 -0
  32. data/lib/ruby_reactor/error/dependency_error.rb +8 -0
  33. data/lib/ruby_reactor/error/deserialization_error.rb +11 -0
  34. data/lib/ruby_reactor/error/input_validation_error.rb +29 -0
  35. data/lib/ruby_reactor/error/schema_version_error.rb +11 -0
  36. data/lib/ruby_reactor/error/step_failure_error.rb +18 -0
  37. data/lib/ruby_reactor/error/undo_error.rb +8 -0
  38. data/lib/ruby_reactor/error/validation_error.rb +8 -0
  39. data/lib/ruby_reactor/executor/compensation_manager.rb +79 -0
  40. data/lib/ruby_reactor/executor/graph_manager.rb +41 -0
  41. data/lib/ruby_reactor/executor/input_validator.rb +39 -0
  42. data/lib/ruby_reactor/executor/result_handler.rb +103 -0
  43. data/lib/ruby_reactor/executor/retry_manager.rb +156 -0
  44. data/lib/ruby_reactor/executor/step_executor.rb +319 -0
  45. data/lib/ruby_reactor/executor.rb +123 -0
  46. data/lib/ruby_reactor/map/collector.rb +65 -0
  47. data/lib/ruby_reactor/map/element_executor.rb +154 -0
  48. data/lib/ruby_reactor/map/execution.rb +60 -0
  49. data/lib/ruby_reactor/map/helpers.rb +67 -0
  50. data/lib/ruby_reactor/max_retries_exhausted_failure.rb +19 -0
  51. data/lib/ruby_reactor/reactor.rb +75 -0
  52. data/lib/ruby_reactor/retry_context.rb +92 -0
  53. data/lib/ruby_reactor/retry_queued_result.rb +26 -0
  54. data/lib/ruby_reactor/sidekiq_workers/map_collector_worker.rb +13 -0
  55. data/lib/ruby_reactor/sidekiq_workers/map_element_worker.rb +13 -0
  56. data/lib/ruby_reactor/sidekiq_workers/map_execution_worker.rb +15 -0
  57. data/lib/ruby_reactor/sidekiq_workers/worker.rb +55 -0
  58. data/lib/ruby_reactor/step/compose_step.rb +107 -0
  59. data/lib/ruby_reactor/step/map_step.rb +234 -0
  60. data/lib/ruby_reactor/step.rb +33 -0
  61. data/lib/ruby_reactor/storage/adapter.rb +51 -0
  62. data/lib/ruby_reactor/storage/configuration.rb +15 -0
  63. data/lib/ruby_reactor/storage/redis_adapter.rb +140 -0
  64. data/lib/ruby_reactor/template/base.rb +15 -0
  65. data/lib/ruby_reactor/template/element.rb +25 -0
  66. data/lib/ruby_reactor/template/input.rb +48 -0
  67. data/lib/ruby_reactor/template/result.rb +48 -0
  68. data/lib/ruby_reactor/template/value.rb +22 -0
  69. data/lib/ruby_reactor/validation/base.rb +26 -0
  70. data/lib/ruby_reactor/validation/input_validator.rb +62 -0
  71. data/lib/ruby_reactor/validation/schema_builder.rb +17 -0
  72. data/lib/ruby_reactor/version.rb +5 -0
  73. data/lib/ruby_reactor.rb +159 -0
  74. data/sig/ruby_reactor.rbs +4 -0
  75. metadata +178 -0
data/documentation/data_pipelines.md
@@ -0,0 +1,224 @@
+ # Data Pipelines
+
+ RubyReactor provides powerful data pipeline capabilities through the `map` feature, allowing you to process collections of data efficiently. This system supports both synchronous and asynchronous execution, batch processing, and robust error handling.
+
+ ## Overview
+
+ The data pipeline system is built around the `map` step, which iterates over an input collection and processes each element through a defined sub-reactor or inline steps.
+
+ Key features:
+ - **Parallel Processing**: Execute steps asynchronously via Sidekiq
+ - **Batch Control**: Manage system load with configurable batch sizes
+ - **Error Handling**: Choose between failing fast or collecting partial results
+ - **Retries**: Configure granular retry policies for individual steps
+ - **Aggregation**: Collect and transform results after processing
+ ## Basic Usage
+
+ The simplest form of a data pipeline is an inline `map` step that processes elements synchronously.
+
+ ```ruby
+ class UserTransformationReactor < RubyReactor::Reactor
+   input :users
+
+   map :transformed_users do
+     source input(:users)
+     argument :user, element(:transformed_users)
+
+     # Define steps to run for each element
+     step :normalize do
+       argument :user, input(:user)
+       run do |args, _|
+         user = args[:user]
+         Success({
+           name: user[:name].strip,
+           email: user[:email].downcase
+         })
+       end
+     end
+
+     # The result of this step becomes the result for the element
+     returns :normalize
+   end
+ end
+ ```
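+
+ As a usage sketch, invoking the reactor might look like the following. The class-level `run` entry point and the exact shape of the returned result are assumptions for illustration; they are not defined in this section.
+
+ ```ruby
+ # Hypothetical invocation (API assumed, not shown in this document):
+ # the declared :users input is passed in, and each element flows through
+ # the :normalize step defined above.
+ result = UserTransformationReactor.run(
+   users: [
+     { name: "  Ada Lovelace  ", email: "ADA@EXAMPLE.COM" },
+     { name: " Alan Turing ",    email: "Alan@Example.com" }
+   ]
+ )
+
+ # Expected element results after normalization:
+ # [{ name: "Ada Lovelace", email: "ada@example.com" },
+ #  { name: "Alan Turing",  email: "alan@example.com" }]
+ ```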
+
+ ## Async Execution
+
+ For long-running or resource-intensive tasks, you can offload processing to background jobs using Sidekiq.
+
+ To enable async execution, add the `async true` directive to your map definition.
+
+ ```ruby
+ map :process_orders do
+   source input(:orders)
+   argument :order, element(:process_orders)
+
+   # Enable async execution via Sidekiq
+   async true
+
+   step :charge_card do
+     argument :order, input(:order)
+     run { |args, _| PaymentService.charge(args[:order]) }
+   end
+
+   returns :charge_card
+ end
+ ```
+
+ ### Execution Flow
+
+ ```mermaid
+ sequenceDiagram
+     participant Reactor
+     participant Redis
+     participant Sidekiq
+     participant Worker
+
+     Reactor->>Redis: Store Context
+     Reactor->>Sidekiq: Enqueue MapElementWorkers
+     Note over Reactor: Returns AsyncResult immediately
+
+     loop For each element
+         Sidekiq->>Worker: Process Element
+         Worker->>Redis: Update Element Result
+     end
+
+     Worker->>Sidekiq: Enqueue MapCollectorWorker (when done)
+     Sidekiq->>Worker: Run Collector
+     Worker->>Redis: Store Final Result
+ ```
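+
+ The same flow can be approximated in plain Ruby. This is an in-memory sketch only: a Hash stands in for Redis, direct method calls stand in for the Sidekiq enqueues, and the element step is a stand-in; the worker names (`MapElementWorker`, `MapCollectorWorker`) come from the file listing above.
+
+ ```ruby
+ require "json"
+
+ # In-memory stand-in for the Redis-backed context store.
+ STORE = { "results" => {}, "pending" => 0 }
+
+ def process_element(index, element)
+   # Real flow: Sidekiq runs a MapElementWorker per element and writes
+   # the element result back to Redis.
+   STORE["results"][index] = { name: element[:name].strip }
+   STORE["pending"] -= 1
+   run_collector if STORE["pending"].zero? # last element triggers the collector
+ end
+
+ def run_collector
+   # Real flow: a MapCollectorWorker assembles the ordered element results
+   # and stores the final result.
+   STORE["final_result"] = STORE["results"].sort.map { |_index, result| result }
+ end
+
+ elements = [{ name: " a " }, { name: " b " }]
+ STORE["pending"] = elements.size
+ elements.each_with_index { |element, i| process_element(i, element) }
+
+ puts STORE["final_result"].to_json # => [{"name":"a"},{"name":"b"}]
+ ```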
+
+ ## Batch Processing
+
+ When processing large datasets asynchronously, you can control the parallelism using `batch_size`. This limits how many Sidekiq jobs are enqueued simultaneously, preventing system overload.
+
+ ```ruby
+ map :bulk_import do
+   source input(:records)
+   argument :record, element(:bulk_import)
+
+   # Process only 50 records at a time
+   async true, batch_size: 50
+
+   step :import_record do
+     # ...
+   end
+ end
+ ```
+
+ **How it works:**
+ 1. The system initially enqueues `batch_size` jobs.
+ 2. As each job completes, it triggers the next job in the queue.
+ 3. This maintains a steady stream of processing without flooding the queue (see the sketch below).
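+
+ The rolling window can be pictured with a toy queue model. This is an illustration only, not the gem's scheduler; in the real implementation the enqueuing is done by the Sidekiq workers.
+
+ ```ruby
+ queue      = (1..10).to_a             # items waiting to be processed
+ batch_size = 3
+ in_flight  = queue.shift(batch_size)  # 1. enqueue the initial batch
+
+ until in_flight.empty?
+   done = in_flight.shift              # 2. a job completes...
+   puts "processed item #{done}"
+   nxt = queue.shift                   # ...and triggers the next job, if any
+   in_flight << nxt if nxt
+ end
+ # 3. at most `batch_size` items are ever in flight at once
+ ```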
+
+ ## Error Handling
+
+ You can control how the pipeline reacts to failures using the `fail_fast` option.
+
+ ### Fail Fast (Default)
+
+ By default (`fail_fast true`), the entire map operation fails immediately if any single element fails.
+
+ ```ruby
+ map :strict_processing do
+   source input(:items)
+   # ...
+   fail_fast true # Default
+ end
+ ```
+
+ ### Collecting Partial Results
+
+ If you want to process all elements regardless of failures, set `fail_fast false`. You can then use a `collect` block to handle successes and failures separately.
+
+ ```ruby
+ map :resilient_processing do
+   source input(:items)
+   argument :item, element(:resilient_processing)
+
+   # Continue processing even if some items fail
+   fail_fast false
+
+   step :risky_operation do
+     # ...
+   end
+
+   returns :risky_operation
+
+   # Aggregate results
+   collect do |results|
+     # results is an array of Result objects (Success or Failure)
+     successful = results.select(&:success?).map(&:value)
+     failed = results.select(&:failure?).map(&:error)
+
+     {
+       processed: successful,
+       errors: failed,
+       success_rate: successful.length.to_f / results.length
+     }
+   end
+ end
+ ```
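+
+ The aggregation itself is plain Ruby and can be exercised in isolation. The Struct below is a stand-in for the gem's Result objects, assumed to respond to `success?`, `failure?`, `value`, and `error` as used in the `collect` block above.
+
+ ```ruby
+ # Stand-in Result type (assumption; the gem's own Result class is not shown here).
+ StubResult = Struct.new(:value, :error) do
+   def success?
+     error.nil?
+   end
+
+   def failure?
+     !success?
+   end
+ end
+
+ results = [StubResult.new("ok-1"), StubResult.new(nil, "boom"), StubResult.new("ok-2")]
+
+ successful = results.select(&:success?).map(&:value)
+ failed     = results.select(&:failure?).map(&:error)
+
+ summary = {
+   processed: successful,                                 # ["ok-1", "ok-2"]
+   errors: failed,                                        # ["boom"]
+   success_rate: successful.length.to_f / results.length  # 0.666...
+ }
+ puts summary.inspect
+ ```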
+
+ ## Retry Configuration
+
+ You can configure retries for individual steps within a map. This is particularly useful for transient failures (e.g., network timeouts) in async pipelines.
+
+ ```ruby
+ map :reliable_processing do
+   source input(:urls)
+   argument :url, element(:reliable_processing)
+   async true
+
+   step :fetch_data do
+     argument :url, input(:url)
+
+     # Retry up to 3 times with exponential backoff
+     retries max_attempts: 3, backoff: :exponential, base_delay: 1.second
+
+     run do |args, _|
+       # If this raises or returns Failure, it will be retried
+       HttpClient.get(args[:url])
+     end
+   end
+
+   returns :fetch_data
+ end
+ ```
+
+ ### Retry Behavior
+
+ - **Async Mode**: Retries are handled by requeuing the Sidekiq job with a delay (the resulting backoff schedule is sketched below). This is non-blocking and efficient.
+ - **Sync Mode**: Retries happen immediately within the execution thread (blocking).
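+
+ As an illustration of how an exponential schedule with `base_delay: 1.second` typically grows (the gem's exact formula, any jitter, and any delay cap are not specified here):
+
+ ```ruby
+ # Assumed textbook formula: delay = base_delay * 2**(attempt - 1).
+ base_delay = 1 # seconds
+
+ (1..3).each do |attempt|
+   delay = base_delay * 2**(attempt - 1)
+   puts "attempt #{attempt}: retry in #{delay}s"
+ end
+ # attempt 1: retry in 1s
+ # attempt 2: retry in 2s
+ # attempt 3: retry in 4s
+ ```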
+
+ ## Visualization
+
+ ### Async Batch Execution
+
+ ```mermaid
+ graph TD
+     Start[Start Map] --> Init[Initialize Batch]
+     Init --> Q1["Queue Initial Batch<br/>(Size N)"]
+
+     subgraph Workers
+         W1[Worker 1]
+         W2[Worker 2]
+         W3[Worker ...]
+     end
+
+     Q1 --> W1
+     Q1 --> W2
+
+     W1 -->|Complete| Next1{More Items?}
+     W2 -->|Complete| Next2{More Items?}
+
+     Next1 -->|Yes| Q2[Queue Next Item]
+     Next2 -->|Yes| Q2
+
+     Q2 --> W3
+
+     Next1 -->|No| Check{All Done?}
+     Next2 -->|No| Check
+     Check -->|Yes| Collect[Run Collector]
+     Collect --> Finish[Final Result]
+ ```