simple_flow 0.1.0

Files changed (80)
  1. checksums.yaml +7 -0
  2. data/.envrc +1 -0
  3. data/.github/workflows/deploy-github-pages.yml +52 -0
  4. data/.rubocop.yml +57 -0
  5. data/CHANGELOG.md +4 -0
  6. data/COMMITS.md +196 -0
  7. data/LICENSE +21 -0
  8. data/README.md +481 -0
  9. data/Rakefile +15 -0
  10. data/benchmarks/parallel_vs_sequential.rb +98 -0
  11. data/benchmarks/pipeline_overhead.rb +130 -0
  12. data/docs/api/middleware.md +468 -0
  13. data/docs/api/parallel-step.md +363 -0
  14. data/docs/api/pipeline.md +382 -0
  15. data/docs/api/result.md +375 -0
  16. data/docs/concurrent/best-practices.md +687 -0
  17. data/docs/concurrent/introduction.md +246 -0
  18. data/docs/concurrent/parallel-steps.md +418 -0
  19. data/docs/concurrent/performance.md +481 -0
  20. data/docs/core-concepts/flow-control.md +452 -0
  21. data/docs/core-concepts/middleware.md +389 -0
  22. data/docs/core-concepts/overview.md +219 -0
  23. data/docs/core-concepts/pipeline.md +315 -0
  24. data/docs/core-concepts/result.md +168 -0
  25. data/docs/core-concepts/steps.md +391 -0
  26. data/docs/development/benchmarking.md +443 -0
  27. data/docs/development/contributing.md +380 -0
  28. data/docs/development/dagwood-concepts.md +435 -0
  29. data/docs/development/testing.md +514 -0
  30. data/docs/getting-started/examples.md +197 -0
  31. data/docs/getting-started/installation.md +62 -0
  32. data/docs/getting-started/quick-start.md +218 -0
  33. data/docs/guides/choosing-concurrency-model.md +441 -0
  34. data/docs/guides/complex-workflows.md +440 -0
  35. data/docs/guides/data-fetching.md +478 -0
  36. data/docs/guides/error-handling.md +635 -0
  37. data/docs/guides/file-processing.md +505 -0
  38. data/docs/guides/validation-patterns.md +496 -0
  39. data/docs/index.md +169 -0
  40. data/examples/.gitignore +3 -0
  41. data/examples/01_basic_pipeline.rb +112 -0
  42. data/examples/02_error_handling.rb +178 -0
  43. data/examples/03_middleware.rb +186 -0
  44. data/examples/04_parallel_automatic.rb +221 -0
  45. data/examples/05_parallel_explicit.rb +279 -0
  46. data/examples/06_real_world_ecommerce.rb +288 -0
  47. data/examples/07_real_world_etl.rb +277 -0
  48. data/examples/08_graph_visualization.rb +246 -0
  49. data/examples/09_pipeline_visualization.rb +266 -0
  50. data/examples/10_concurrency_control.rb +235 -0
  51. data/examples/11_sequential_dependencies.rb +243 -0
  52. data/examples/12_none_constant.rb +161 -0
  53. data/examples/README.md +374 -0
  54. data/examples/regression_test/01_basic_pipeline.txt +38 -0
  55. data/examples/regression_test/02_error_handling.txt +92 -0
  56. data/examples/regression_test/03_middleware.txt +61 -0
  57. data/examples/regression_test/04_parallel_automatic.txt +86 -0
  58. data/examples/regression_test/05_parallel_explicit.txt +80 -0
  59. data/examples/regression_test/06_real_world_ecommerce.txt +53 -0
  60. data/examples/regression_test/07_real_world_etl.txt +58 -0
  61. data/examples/regression_test/08_graph_visualization.txt +429 -0
  62. data/examples/regression_test/09_pipeline_visualization.txt +305 -0
  63. data/examples/regression_test/10_concurrency_control.txt +96 -0
  64. data/examples/regression_test/11_sequential_dependencies.txt +86 -0
  65. data/examples/regression_test/12_none_constant.txt +64 -0
  66. data/examples/regression_test.rb +105 -0
  67. data/lib/simple_flow/dependency_graph.rb +120 -0
  68. data/lib/simple_flow/dependency_graph_visualizer.rb +326 -0
  69. data/lib/simple_flow/middleware.rb +36 -0
  70. data/lib/simple_flow/parallel_executor.rb +80 -0
  71. data/lib/simple_flow/pipeline.rb +405 -0
  72. data/lib/simple_flow/result.rb +88 -0
  73. data/lib/simple_flow/step_tracker.rb +58 -0
  74. data/lib/simple_flow/version.rb +5 -0
  75. data/lib/simple_flow.rb +41 -0
  76. data/mkdocs.yml +146 -0
  77. data/pipeline_graph.dot +51 -0
  78. data/pipeline_graph.html +60 -0
  79. data/pipeline_graph.mmd +19 -0
  80. metadata +127 -0
@@ -0,0 +1,391 @@
+ # Steps
+
+ Steps are the individual operations that make up your pipeline. Each step receives a Result and returns a Result.
+
+ ## Step Types
+
+ SimpleFlow supports any callable object as a step:
+
+ ### 1. Lambda/Proc
+
+ ```ruby
+ pipeline = SimpleFlow::Pipeline.new do
+   step ->(result) do
+     new_value = result.value.upcase
+     result.continue(new_value)
+   end
+ end
+ ```
+
+ ### 2. Method References
+
+ ```ruby
+ def validate_email(result)
+   if result.value[:email] =~ /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i
+     result.continue(result.value)
+   else
+     result.with_error(:validation, 'Invalid email').halt
+   end
+ end
+
+ pipeline = SimpleFlow::Pipeline.new do
+   step method(:validate_email)
+ end
+ ```
+
+ ### 3. Callable Objects
+
+ ```ruby
+ class UserValidator
+   def call(result)
+     user = result.value
+
+     # Plain-Ruby presence checks (`blank?` is only available with ActiveSupport)
+     errors = []
+     errors << 'Name required' if user[:name].to_s.strip.empty?
+     errors << 'Email required' if user[:email].to_s.strip.empty?
+
+     if errors.any?
+       errors.each { |error| result = result.with_error(:validation, error) }
+       return result.halt
+     end
+
+     result.continue(user)
+   end
+ end
+
+ pipeline = SimpleFlow::Pipeline.new do
+   step UserValidator.new
+ end
+ ```
+
+ ### 4. Class Methods
+
+ ```ruby
+ class DataTransformer
+   def self.call(result)
+     transformed = transform_data(result.value)
+     result.continue(transformed)
+   end
+
+   def self.transform_data(data)
+     # Transformation logic
+     data.transform_values(&:to_s)
+   end
+ end
+
+ pipeline = SimpleFlow::Pipeline.new do
+   step DataTransformer
+ end
+ ```
+
+ ## Anonymous vs Named Steps
+
+ ### Anonymous Steps (Sequential Execution)
+
+ **Anonymous steps execute sequentially with automatic dependencies on the previous step's success.**
+
+ Each step implicitly depends on the previous step completing successfully (not halting). If any step halts, subsequent steps are skipped.
+
+ ```ruby
+ pipeline = SimpleFlow::Pipeline.new do
+   step ->(result) {
+     puts "Step 1"
+     result.continue(result.value * 2)
+   }
+
+   step ->(result) {
+     puts "Step 2"
+     result.continue(result.value + 10)
+   }
+
+   step ->(result) {
+     puts "Step 3"
+     result.continue(result.value.to_s)
+   }
+ end
+
+ result = pipeline.call(SimpleFlow::Result.new(5))
+ # Output:
+ # Step 1
+ # Step 2
+ # Step 3
+ # result.value => "20"
+ ```
+
+ **Key characteristics:**
+ - Execute in the order they were defined
+ - Each step receives the result from the previous step
+ - Pipeline short-circuits if any step halts (returns `result.halt`)
+ - No need to specify dependencies explicitly
+ - Use `pipeline.call(result)` to execute
+
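The short-circuit behaviour listed above can be pictured as a fold over the steps. The following is only a rough sketch, not SimpleFlow's implementation: `MiniResult` and `run_sequential` are hypothetical stand-ins invented for this illustration, and the real `Result` carries more state (errors, context).

```ruby
# Hypothetical stand-in for SimpleFlow::Result, just enough for this sketch.
MiniResult = Struct.new(:value, :continuing) do
  def continue(new_value)
    MiniResult.new(new_value, true)
  end

  def halt(new_value = value)
    MiniResult.new(new_value, false)
  end

  def continue?
    continuing
  end
end

# Hypothetical runner: call steps in order; once a step halts,
# pass the halted result through and skip every remaining step.
def run_sequential(steps, initial)
  steps.reduce(initial) do |result, step|
    result.continue? ? step.call(result) : result
  end
end

executed = []
steps = [
  ->(r) { executed << :double; r.continue(r.value * 2) },
  ->(r) { executed << :check;  r.value > 5 ? r.halt : r.continue(r.value) },
  ->(r) { executed << :format; r.continue(r.value.to_s) }
]

final = run_sequential(steps, MiniResult.new(5, true))
# executed => [:double, :check]  (the :format step never ran)
# final.value => 10
```

Because each step returns a fresh result rather than mutating the previous one, the runner only needs to inspect `continue?` to decide whether to proceed.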
+ **Example with halting:**
+
+ ```ruby
+ pipeline = SimpleFlow::Pipeline.new do
+   step ->(result) { puts "Step 1"; result.continue(1) }
+   step ->(result) { puts "Step 2"; result.halt(2) }      # Halts here
+   step ->(result) { puts "Step 3"; result.continue(3) }  # Never executes
+ end
+
+ result = pipeline.call(SimpleFlow::Result.new(0))
+ # Output:
+ # Step 1
+ # Step 2
+ # (Step 3 is skipped)
+ ```
+
+ ### Named Steps (Parallel Execution)
+
+ **Named steps with explicit dependencies enable parallel execution based on a dependency graph.**
+
+ Steps whose dependencies have all completed run concurrently. There is no implicit ordering: you must declare every dependency explicitly.
+
+ ```ruby
+ pipeline = SimpleFlow::Pipeline.new do
+   step :fetch_user, ->(result) { fetch_user(result) }, depends_on: []
+
+   # These two run in parallel (both depend only on :fetch_user)
+   step :fetch_orders, ->(result) { fetch_orders(result) }, depends_on: [:fetch_user]
+   step :fetch_products, ->(result) { fetch_products(result) }, depends_on: [:fetch_user]
+
+   # Waits for both parallel steps
+   step :merge, ->(result) { merge_data(result) }, depends_on: [:fetch_orders, :fetch_products]
+ end
+
+ result = pipeline.call_parallel(SimpleFlow::Result.new(user_id))
+ ```
+
+ **Key characteristics:**
+ - Execute based on dependency graph, not definition order
+ - Steps with satisfied dependencies run in parallel
+ - Must explicitly specify all dependencies with `depends_on:`
+ - Use `pipeline.call_parallel(result)` to execute
+ - Optimal for I/O-bound operations (API calls, database queries)
+
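One way to see how a dependency graph turns into parallel work is to group steps into "waves", where every step in a wave depends only on steps from earlier waves. The sketch below assumes a plain hash of step name to dependency list; `parallel_waves` is a hypothetical helper written for this illustration and is not claimed to match the gem's `DependencyGraph` class.

```ruby
# Group named steps into waves: every step in a wave has all of its
# dependencies satisfied by earlier waves, so a wave could run in parallel.
def parallel_waves(dependencies)
  remaining = dependencies.dup
  done = []
  waves = []

  until remaining.empty?
    # A step is ready once none of its dependencies are still outstanding.
    ready = remaining.select { |_name, deps| (deps - done).empty? }.keys
    raise ArgumentError, 'circular or missing dependency' if ready.empty?

    waves << ready
    done.concat(ready)
    ready.each { |name| remaining.delete(name) }
  end

  waves
end

graph = {
  fetch_user:     [],
  fetch_orders:   [:fetch_user],
  fetch_products: [:fetch_user],
  merge:          [:fetch_orders, :fetch_products]
}

waves = parallel_waves(graph)
# waves => [[:fetch_user], [:fetch_orders, :fetch_products], [:merge]]
```

The middle wave is where the parallelism comes from: both fetches wait only on `:fetch_user`, so neither has to wait for the other.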
+ ## Step Contract
+
+ Every step must:
+
+ 1. Accept a `Result` object as input
+ 2. Return a `Result` object as output
+ 3. Use `.continue(value)` to proceed
+ 4. Use `.halt(value)` to stop the pipeline
+
+ ```ruby
+ # ✅ Good - follows contract
+ def my_step(result)
+   processed = process(result.value)
+   result.continue(processed)
+ end
+
+ # ❌ Bad - returns wrong type
+ def bad_step(result)
+   result.value * 2 # Returns a number, not a Result
+ end
+
+ # ❌ Bad - doesn't accept Result
+ def bad_step(value)
+   value * 2
+ end
+ ```
+
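A contract violation like the "wrong type" case above can be hard to trace, because the failure tends to surface in a later step. As a hedged sketch, a step could be wrapped so that it fails at the point of the violation. `enforce_contract` and `StubResult` are hypothetical names for this illustration; the documentation shown here does not say whether SimpleFlow performs such a check itself.

```ruby
# Hypothetical stand-in for SimpleFlow::Result, only for this sketch.
StubResult = Struct.new(:value) do
  def continue(new_value)
    StubResult.new(new_value)
  end
end

# Wrap a step so that a non-Result return value fails loudly at the step
# that produced it instead of surfacing later in the pipeline.
def enforce_contract(step, result_class)
  lambda do |result|
    output = step.call(result)
    unless output.is_a?(result_class)
      raise TypeError, "step returned #{output.class}, expected #{result_class}"
    end

    output
  end
end

good = enforce_contract(->(r) { r.continue(r.value * 2) }, StubResult)
bad  = enforce_contract(->(r) { r.value * 2 }, StubResult)

good.call(StubResult.new(21)).value # => 42
# bad.call(StubResult.new(21)) raises TypeError
```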
+ ## Working with Values
+
+ ### Transforming Values
+
+ ```ruby
+ step ->(result) do
+   # Get current value
+   data = result.value
+
+   # Transform it
+   transformed = data.map { |item| item.upcase }
+
+   # Continue with new value
+   result.continue(transformed)
+ end
+ ```
+
+ ### Modifying Nested Data
+
+ ```ruby
+ step ->(result) do
+   # Build a new hash rather than mutating the input value in place
+   user = result.value.merge(processed_at: Time.now)
+   result.continue(user)
+ end
+ ```
+
+ ## Adding Context
+
+ Context persists across steps without modifying the value:
+
+ ```ruby
+ pipeline = SimpleFlow::Pipeline.new do
+   step ->(result) {
+     result
+       .continue(result.value)
+       .with_context(:started_at, Time.now)
+   }
+
+   step ->(result) {
+     result
+       .continue(process(result.value))
+       .with_context(:processed_at, Time.now)
+   }
+
+   step ->(result) {
+     duration = result.context[:processed_at] - result.context[:started_at]
+     result
+       .continue(result.value)
+       .with_context(:duration, duration)
+   }
+ end
+ ```
+
+ ## Error Handling in Steps
+
+ ### Collecting Errors
+
+ ```ruby
+ step ->(result) do
+   user = result.value
+   result_with_errors = result
+
+   if user[:email].nil?
+     result_with_errors = result_with_errors.with_error(:validation, 'Email required')
+   end
+
+   if user[:age] && user[:age] < 18
+     result_with_errors = result_with_errors.with_error(:validation, 'Must be 18+')
+   end
+
+   # Continue even with errors (they're tracked)
+   result_with_errors.continue(user)
+ end
+ ```
+
+ ### Halting on Errors
+
+ ```ruby
+ step ->(result) do
+   if critical_error?(result.value)
+     return result
+       .with_error(:critical, 'Cannot proceed')
+       .halt
+   end
+
+   result.continue(result.value)
+ end
+ ```
+
+ ## Conditional Logic
+
+ ### Early Return
+
+ ```ruby
+ step ->(result) do
+   return result.halt if should_skip?(result.value)
+
+   result.continue(process(result.value))
+ end
+ ```
+
+ ### Branching
+
+ ```ruby
+ step ->(result) do
+   if result.value[:type] == 'premium'
+     result.continue(process_premium(result.value))
+   else
+     result.continue(process_standard(result.value))
+   end
+ end
+ ```
+
+ ## Async/External Operations
+
+ Steps can perform I/O operations:
+
+ ```ruby
+ # Requires the httparty gem: require 'httparty'
+ step ->(result) do
+   # API call
+   response = HTTParty.get("https://api.example.com/users/#{result.value[:id]}")
+
+   result
+     .continue(response.parsed_response)
+     .with_context(:api_response_time, response.headers['x-response-time'])
+ end
+ ```
+
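I/O can fail, so an I/O step will usually want to turn an exception into a halted result with an error attached. The sketch below is one possible shape, under stated assumptions: `IoResult` is a hypothetical stand-in for `SimpleFlow::Result`, `fetch_step` is an illustrative helper, and the fetcher is injected as a callable so the example runs without a network.

```ruby
require 'timeout'

# Hypothetical stand-in for SimpleFlow::Result, only for this sketch.
IoResult = Struct.new(:value, :errors, :continuing) do
  def continue(new_value)
    IoResult.new(new_value, errors, true)
  end

  def halt(new_value = value)
    IoResult.new(new_value, errors, false)
  end

  def with_error(key, message)
    IoResult.new(value, errors.merge(key => errors.fetch(key, []) + [message]), continuing)
  end

  def continue?
    continuing
  end
end

# Build a step around any callable that performs the I/O, so tests can
# pass a fake instead of hitting the network.
def fetch_step(fetcher)
  lambda do |result|
    result.continue(fetcher.call(result.value[:id]))
  rescue IOError, SystemCallError, Timeout::Error => e
    # Record the failure and stop the pipeline instead of raising.
    result.with_error(:io, e.message).halt
  end
end

ok_step   = fetch_step(->(id) { { id: id, name: 'Ada' } })
fail_step = fetch_step(->(_id) { raise Timeout::Error, 'upstream timed out' })

start  = IoResult.new({ id: 7 }, {}, true)
ok     = ok_step.call(start)
failed = fail_step.call(start)
# ok.value      => { id: 7, name: "Ada" }
# failed.errors => { io: ["upstream timed out"] }
```

Injecting the fetcher also keeps the step easy to test in isolation, since no HTTP stubbing library is needed.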
+ ## Testing Steps
+
+ Steps are easy to test in isolation:
+
+ ```ruby
+ require 'minitest/autorun'
+
+ class StepTest < Minitest::Test
+   def test_validation_step
+     result = SimpleFlow::Result.new({ email: 'test@example.com' })
+     output = validate_email(result)
+
+     assert output.continue?
+     assert_empty output.errors
+   end
+
+   def test_validation_step_with_invalid_email
+     result = SimpleFlow::Result.new({ email: 'invalid' })
+     output = validate_email(result)
+
+     refute output.continue?
+     assert_includes output.errors[:validation], 'Invalid email'
+   end
+ end
+ ```
+
+ ## Best Practices
+
+ 1. **Single Responsibility**: Each step should do one thing
+ 2. **Pure Functions**: Avoid side effects when possible
+ 3. **Explicit Dependencies**: Use named steps with `depends_on` for clarity
+ 4. **Error Context**: Include helpful error messages with context
+ 5. **Testability**: Design steps to be easily testable in isolation
+ 6. **Immutability**: Never modify the input result - always return a new one
+ 7. **Meaningful Names**: For named steps, use descriptive names
+
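The immutability practice above applies to the value a result carries as well as to the result itself. A small plain-Ruby illustration, using a frozen hash to make accidental mutation visible (the `user` data here is made up for the example):

```ruby
user = { name: 'Ada' }.freeze

# Hash#merge returns a new hash, so the frozen input is left as it was.
stamped = user.merge(processed_at: Time.now)

user.key?(:processed_at)    # => false
stamped.key?(:processed_at) # => true

# In-place assignment on the frozen input raises instead of silently
# changing data that other steps may still be reading.
mutation_error =
  begin
    user[:processed_at] = Time.now
    nil
  rescue FrozenError => e
    e
  end
```

Note that `freeze` is shallow: objects nested inside the hash stay mutable unless they are frozen too.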
+ ## Performance Considerations
+
+ ### I/O-Bound Steps
+
+ Use parallel execution for independent I/O operations:
+
+ ```ruby
+ pipeline = SimpleFlow::Pipeline.new do
+   step :validate, validator, depends_on: []
+
+   # These run in parallel
+   step :fetch_user_data, fetch_user, depends_on: [:validate]
+   step :fetch_order_data, fetch_orders, depends_on: [:validate]
+   step :fetch_product_data, fetch_products, depends_on: [:validate]
+ end
+ ```
+
+ ### CPU-Bound Steps
+
+ Keep CPU-intensive operations sequential. On CRuby, the Global VM Lock (GVL, often called the GIL) lets only one thread execute Ruby code at a time, so threads do not speed up CPU-bound work:
+
+ ```ruby
+ pipeline = SimpleFlow::Pipeline.new do
+   step ->(result) { heavy_computation_1(result) }
+   step ->(result) { heavy_computation_2(result) }
+ end
+ ```
+
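The reason I/O-bound steps benefit from threads while CPU-bound steps do not can be observed with plain Ruby threads. This is a minimal sketch that uses `sleep` as a stand-in for a blocking network or database call; it does not use SimpleFlow, and the timings will vary by machine.

```ruby
require 'benchmark'

# Simulated I/O: a sleeping thread releases the interpreter lock,
# much like a thread blocked on a socket or database call does.
io_task = -> { sleep 0.2 }

sequential = Benchmark.realtime { 3.times { io_task.call } }

threaded = Benchmark.realtime do
  Array.new(3) { Thread.new { io_task.call } }.each(&:join)
end

puts format('sequential: %.2fs, threaded: %.2fs', sequential, threaded)
```

The three waits overlap in the threaded run, so it should finish in roughly the time of one task. Swapping `sleep` for a pure-Ruby computation would generally remove that gain on CRuby, because only one thread can run Ruby code at a time.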
+ ## Next Steps
+
+ - [Pipeline](pipeline.md) - Learn how steps are orchestrated
+ - [Flow Control](flow-control.md) - Advanced flow control patterns
+ - [Parallel Execution](../concurrent/parallel-steps.md) - Concurrent step execution
+ - [Error Handling Guide](../guides/error-handling.md) - Comprehensive error handling