ruby_reactor 0.3.2 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (58)
  1. checksums.yaml +4 -4
  2. data/.release-please-config.json +15 -0
  3. data/.release-please-manifest.json +3 -0
  4. data/.tool-versions +1 -0
  5. data/CHANGELOG.md +13 -0
  6. data/README.md +80 -4
  7. data/lib/ruby_reactor/context_serializer.rb +10 -1
  8. data/lib/ruby_reactor/map/result_enumerator.rb +4 -3
  9. data/lib/ruby_reactor/rate_limit.rb +2 -2
  10. data/lib/ruby_reactor/sidekiq_workers/worker.rb +58 -1
  11. data/lib/ruby_reactor/version.rb +1 -1
  12. metadata +7 -52
  13. data/documentation/DAG.md +0 -457
  14. data/documentation/README.md +0 -135
  15. data/documentation/async_reactors.md +0 -381
  16. data/documentation/composition.md +0 -199
  17. data/documentation/core_concepts.md +0 -676
  18. data/documentation/data_pipelines.md +0 -230
  19. data/documentation/examples/inventory_management.md +0 -748
  20. data/documentation/examples/order_processing.md +0 -380
  21. data/documentation/examples/payment_processing.md +0 -565
  22. data/documentation/getting_started.md +0 -242
  23. data/documentation/images/failed_order_processing.png +0 -0
  24. data/documentation/images/payment_workflow.png +0 -0
  25. data/documentation/interrupts.md +0 -163
  26. data/documentation/locks_and_semaphores.md +0 -459
  27. data/documentation/retry_configuration.md +0 -362
  28. data/documentation/testing.md +0 -994
  29. data/gui/.gitignore +0 -24
  30. data/gui/README.md +0 -73
  31. data/gui/eslint.config.js +0 -23
  32. data/gui/index.html +0 -13
  33. data/gui/package-lock.json +0 -5925
  34. data/gui/package.json +0 -46
  35. data/gui/postcss.config.js +0 -6
  36. data/gui/public/vite.svg +0 -1
  37. data/gui/src/App.css +0 -42
  38. data/gui/src/App.tsx +0 -51
  39. data/gui/src/assets/react.svg +0 -1
  40. data/gui/src/components/DagVisualizer.tsx +0 -424
  41. data/gui/src/components/Dashboard.tsx +0 -163
  42. data/gui/src/components/ErrorBoundary.tsx +0 -47
  43. data/gui/src/components/ReactorDetail.tsx +0 -135
  44. data/gui/src/components/StepInspector.tsx +0 -492
  45. data/gui/src/components/__tests__/DagVisualizer.test.tsx +0 -140
  46. data/gui/src/components/__tests__/ReactorDetail.test.tsx +0 -111
  47. data/gui/src/components/__tests__/StepInspector.test.tsx +0 -408
  48. data/gui/src/globals.d.ts +0 -7
  49. data/gui/src/index.css +0 -14
  50. data/gui/src/lib/utils.ts +0 -13
  51. data/gui/src/main.tsx +0 -14
  52. data/gui/src/test/setup.ts +0 -11
  53. data/gui/tailwind.config.js +0 -11
  54. data/gui/tsconfig.app.json +0 -28
  55. data/gui/tsconfig.json +0 -7
  56. data/gui/tsconfig.node.json +0 -26
  57. data/gui/vite.config.ts +0 -8
  58. data/gui/vitest.config.ts +0 -13
@@ -1,230 +0,0 @@
1
- # Data Pipelines
2
-
3
- RubyReactor provides powerful data pipeline capabilities through the `map` feature, allowing you to process collections of data efficiently. This system supports both synchronous and asynchronous execution, batch processing, and robust error handling.
4
-
5
- ## Overview
6
-
7
- The data pipeline system is built around the `map` step, which iterates over an input collection and processes each element through a defined sub-reactor or inline steps.
8
-
9
- Key features:
10
- - **Parallel Processing**: Execute steps asynchronously via Sidekiq
11
- - **Batch Control**: Manage system load with configurable batch sizes
12
- - **Error Handling**: Choose between failing fast or collecting partial results
13
- - **Retries**: Configure granular retry policies for individual steps
14
- - **Aggregation**: Collect and transform results after processing
15
-
16
- ## Basic Usage
17
-
18
- The simplest form of a data pipeline is an inline `map` step that processes elements synchronously.
19
-
20
- ```ruby
21
- class UserTransformationReactor < RubyReactor::Reactor
22
- input :users
23
-
24
- map :transformed_users do
25
- source input(:users)
26
- argument :user, element(:transformed_users)
27
-
28
- # Define steps to run for each element
29
- step :normalize do
30
- argument :user, input(:user)
31
- run do |args, _|
32
- user = args[:user]
33
- Success({
34
- name: user[:name].strip,
35
- email: user[:email].downcase
36
- })
37
- end
38
- end
39
-
40
- # The result of this step becomes the result for the element
41
- returns :normalize
42
- end
43
- end
44
- ```
45
-
46
- ## Dynamic Sources & ActiveRecord
47
-
48
- The `map` step supports a dynamic `source` block, which is particularly useful when working with ActiveRecord or when the collection depends on input arguments. Instead of passing a static collection, you can define a block that returns an Enumerable or an `ActiveRecord::Relation`.
49
-
50
- ```ruby
51
- map :process_products do
52
- argument :filter, input(:filter)
53
-
54
- # Dynamic source block
55
- source do |args|
56
- # This block executes at runtime
57
- threshold = args[:filter][:stock]
58
- Product.where("stock >= ?", threshold)
59
- end
60
-
61
- argument :product, element(:process_products)
62
- async true, batch_size: 100
63
-
64
- step :process do
65
- # ...
66
- end
67
-
68
- returns :process
69
- end
70
- ```
71
-
72
- When an `ActiveRecord::Relation` is returned, RubyReactor efficiently batches the query using database-level `OFFSET` and `LIMIT` based on the configured `batch_size`, preventing memory bloat by not loading all records at once.
73
-
74
- ## Batch Processing Mechanism
75
-
76
- When processing large datasets asynchronously, you can control the parallelism using `batch_size`. This limits how many Sidekiq jobs are enqueued simultaneously, preventing system overload.
77
-
78
- ```ruby
79
- map :bulk_import do
80
- source input(:records)
81
- argument :record, element(:bulk_import)
82
-
83
- # Process only 50 records at a time
84
- async true, batch_size: 50
85
-
86
- step :import_record do
87
- # ...
88
- end
89
- end
90
- ```
91
-
92
- ### Back Pressure & Resource Management
93
-
94
- When `async true` is used with a `batch_size`, RubyReactor implements an intelligent **back pressure** mechanism. Instead of flooding Redis and Sidekiq with millions of jobs immediately (which is the standard behavior for many background job systems), the system processes data in controlled chunks.
95
-
96
- This approach provides critical benefits for stability and scalability:
97
-
98
- 1. **Memory Efficiency**: By using `ActiveRecord` batching (`LIMIT` / `OFFSET`), only the current batch of records is loaded into memory. This allows processing datasets larger than available RAM.
99
- 2. **Redis Protection**: Prevents "Queue Explosion". Only a small number of job arguments are stored in Redis at any time, preventing OOM errors in your Redis instance.
100
- 3. **Database Stability**: Database load is distributed over time rather than spiking all at once.
101
-
102
- **Visualizing the Flow:**
103
-
104
- ```mermaid
105
- graph TD
106
- Start[Start Map] -->|Batch Size: N| BatchManager
107
-
108
- subgraph "Back Pressure Loop"
109
- BatchManager[Batch Manager] -->|Fetch N Items| DB[(Database)]
110
- DB --> Records
111
- Records -->|Enqueue N Jobs| Sidekiq
112
-
113
- Sidekiq --> W1[Worker 1]
114
- Sidekiq --> W2[Worker 2]
115
-
116
- W1 -.->|Complete| Check{Batch Done?}
117
- W2 -.->|Complete| Check
118
-
119
- Check -->|No| Wait[Wait]
120
- Check -->|Yes| Next[Trigger Next Batch]
121
- Next --> BatchManager
122
- end
123
-
124
- BatchManager -->|No More Items| Finish[Aggregator]
125
- ```
126
-
127
- This ensures that the system works at the speed of your workers, not the speed of the enqueueing process, maintaining a constant and manageable resource footprint regardless of dataset size.
128
-
129
- ## Error Handling
130
-
131
- You can control how the pipeline reacts to failures using the `fail_fast` option.
132
-
133
- ### Fail Fast (Default)
134
-
135
- By default (`fail_fast true`), the entire map operation fails immediately if any single element fails.
136
-
137
- ```ruby
138
- map :strict_processing do
139
- source input(:items)
140
- # ...
141
- fail_fast true # Default
142
- end
143
- ```
144
-
145
- ### Collecting Results (Successes & Failures)
146
-
147
- If you want to process all elements regardless of failures, set `fail_fast false`. The map step returns a `ResultEnumerator` that allows you to easily separate successful executions from failures.
148
-
149
- ```ruby
150
- map :resilient_processing do
151
- source input(:items)
152
- argument :item, element(:resilient_processing)
153
-
154
- # Continue processing even if some items fail
155
- fail_fast false
156
-
157
- step :risky_operation do
158
- # ...
159
- end
160
-
161
- returns :risky_operation
162
- end
163
-
164
- step :analyze_results do
165
- argument :results, result(:resilient_processing)
166
-
167
- run do |args|
168
- col = args[:results]
169
-
170
- # Iterate over successful results
171
- col.successes.each do |value|
172
- # 'value' is the direct return value of the map element
173
- puts "Success: #{value}"
174
- end
175
-
176
- # Iterate over failures
177
- col.failures.each do |error|
178
- # 'error' is the failure object/message itself
179
- puts "Error: #{error}"
180
- end
181
-
182
- # Note: Iterating the collection directly yields wrapped objects
183
- col.each do |result|
184
- if result.is_a?(RubyReactor::Success)
185
- puts "Wrapped Value: #{result.value}"
186
- else
187
- puts "Wrapped Error: #{result.error}"
188
- end
189
- end
190
-
191
- Success({
192
- success_count: col.successes.count,
193
- failure_count: col.failures.count
194
- })
195
- end
196
- end
197
- ```
198
-
199
- ## Retry Configuration
200
-
201
- You can configure retries for individual steps within a map. This is particularly useful for transient failures (e.g., network timeouts) in async pipelines.
202
-
203
- ```ruby
204
- map :reliable_processing do
205
- source input(:urls)
206
- argument :url, element(:reliable_processing)
207
- async true
208
-
209
- step :fetch_data do
210
- argument :url, input(:url)
211
-
212
- # Retry up to 3 times with exponential backoff
213
- retries max_attempts: 3, backoff: :exponential, base_delay: 1.second
214
-
215
- run do |args, _|
216
- # If this raises or returns Failure, it will be retried
217
- HttpClient.get(args[:url])
218
- end
219
- end
220
-
221
- returns :fetch_data
222
- end
223
- ```
224
-
225
- ### Retry Behavior
226
-
227
- - **Async Mode**: Retries are handled by requeuing the Sidekiq job with a delay. This is non-blocking and efficient.
228
- - **Sync Mode**: Retries happen immediately within the execution thread (blocking).
229
-
230
-