simple_flow 0.1.0

Files changed (80)
  1. checksums.yaml +7 -0
  2. data/.envrc +1 -0
  3. data/.github/workflows/deploy-github-pages.yml +52 -0
  4. data/.rubocop.yml +57 -0
  5. data/CHANGELOG.md +4 -0
  6. data/COMMITS.md +196 -0
  7. data/LICENSE +21 -0
  8. data/README.md +481 -0
  9. data/Rakefile +15 -0
  10. data/benchmarks/parallel_vs_sequential.rb +98 -0
  11. data/benchmarks/pipeline_overhead.rb +130 -0
  12. data/docs/api/middleware.md +468 -0
  13. data/docs/api/parallel-step.md +363 -0
  14. data/docs/api/pipeline.md +382 -0
  15. data/docs/api/result.md +375 -0
  16. data/docs/concurrent/best-practices.md +687 -0
  17. data/docs/concurrent/introduction.md +246 -0
  18. data/docs/concurrent/parallel-steps.md +418 -0
  19. data/docs/concurrent/performance.md +481 -0
  20. data/docs/core-concepts/flow-control.md +452 -0
  21. data/docs/core-concepts/middleware.md +389 -0
  22. data/docs/core-concepts/overview.md +219 -0
  23. data/docs/core-concepts/pipeline.md +315 -0
  24. data/docs/core-concepts/result.md +168 -0
  25. data/docs/core-concepts/steps.md +391 -0
  26. data/docs/development/benchmarking.md +443 -0
  27. data/docs/development/contributing.md +380 -0
  28. data/docs/development/dagwood-concepts.md +435 -0
  29. data/docs/development/testing.md +514 -0
  30. data/docs/getting-started/examples.md +197 -0
  31. data/docs/getting-started/installation.md +62 -0
  32. data/docs/getting-started/quick-start.md +218 -0
  33. data/docs/guides/choosing-concurrency-model.md +441 -0
  34. data/docs/guides/complex-workflows.md +440 -0
  35. data/docs/guides/data-fetching.md +478 -0
  36. data/docs/guides/error-handling.md +635 -0
  37. data/docs/guides/file-processing.md +505 -0
  38. data/docs/guides/validation-patterns.md +496 -0
  39. data/docs/index.md +169 -0
  40. data/examples/.gitignore +3 -0
  41. data/examples/01_basic_pipeline.rb +112 -0
  42. data/examples/02_error_handling.rb +178 -0
  43. data/examples/03_middleware.rb +186 -0
  44. data/examples/04_parallel_automatic.rb +221 -0
  45. data/examples/05_parallel_explicit.rb +279 -0
  46. data/examples/06_real_world_ecommerce.rb +288 -0
  47. data/examples/07_real_world_etl.rb +277 -0
  48. data/examples/08_graph_visualization.rb +246 -0
  49. data/examples/09_pipeline_visualization.rb +266 -0
  50. data/examples/10_concurrency_control.rb +235 -0
  51. data/examples/11_sequential_dependencies.rb +243 -0
  52. data/examples/12_none_constant.rb +161 -0
  53. data/examples/README.md +374 -0
  54. data/examples/regression_test/01_basic_pipeline.txt +38 -0
  55. data/examples/regression_test/02_error_handling.txt +92 -0
  56. data/examples/regression_test/03_middleware.txt +61 -0
  57. data/examples/regression_test/04_parallel_automatic.txt +86 -0
  58. data/examples/regression_test/05_parallel_explicit.txt +80 -0
  59. data/examples/regression_test/06_real_world_ecommerce.txt +53 -0
  60. data/examples/regression_test/07_real_world_etl.txt +58 -0
  61. data/examples/regression_test/08_graph_visualization.txt +429 -0
  62. data/examples/regression_test/09_pipeline_visualization.txt +305 -0
  63. data/examples/regression_test/10_concurrency_control.txt +96 -0
  64. data/examples/regression_test/11_sequential_dependencies.txt +86 -0
  65. data/examples/regression_test/12_none_constant.txt +64 -0
  66. data/examples/regression_test.rb +105 -0
  67. data/lib/simple_flow/dependency_graph.rb +120 -0
  68. data/lib/simple_flow/dependency_graph_visualizer.rb +326 -0
  69. data/lib/simple_flow/middleware.rb +36 -0
  70. data/lib/simple_flow/parallel_executor.rb +80 -0
  71. data/lib/simple_flow/pipeline.rb +405 -0
  72. data/lib/simple_flow/result.rb +88 -0
  73. data/lib/simple_flow/step_tracker.rb +58 -0
  74. data/lib/simple_flow/version.rb +5 -0
  75. data/lib/simple_flow.rb +41 -0
  76. data/mkdocs.yml +146 -0
  77. data/pipeline_graph.dot +51 -0
  78. data/pipeline_graph.html +60 -0
  79. data/pipeline_graph.mmd +19 -0
  80. metadata +127 -0
@@ -0,0 +1,246 @@
1
+ # Concurrent Execution
2
+
3
+ One of SimpleFlow's most powerful features is the ability to execute independent steps **concurrently** using fiber-based concurrency.
4
+
5
+ ## Why Concurrent Execution?
6
+
7
+ Many workflows have steps that don't depend on each other and can run at the same time:
8
+
9
+ - Fetching data from multiple APIs
10
+ - Running independent validation checks
11
+ - Processing multiple files
12
+ - Enriching data from various sources
13
+
14
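+ Running these steps concurrently can **reduce total run time** from the sum of the step durations to roughly the duration of the slowest step, provided the steps spend most of their time waiting on I/O.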
15
+
16
+ ## Performance Benefits
17
+
18
+ Consider fetching data from 4 APIs:
19
+
20
+ **Sequential Execution: ~0.4s**
21
+ ```ruby
22
+ pipeline = SimpleFlow::Pipeline.new do
23
+ step ->(result) { fetch_api_1(result) } # 0.1s
24
+ step ->(result) { fetch_api_2(result) } # 0.1s
25
+ step ->(result) { fetch_api_3(result) } # 0.1s
26
+ step ->(result) { fetch_api_4(result) } # 0.1s
27
+ end
28
+ # Total: 0.4s
29
+ ```
30
+
31
+ **Parallel Execution: ~0.1s**
32
+ ```ruby
33
+ pipeline = SimpleFlow::Pipeline.new do
34
+ parallel do
35
+ step ->(result) { fetch_api_1(result) } # ┐
36
+ step ->(result) { fetch_api_2(result) } # ├─ All run
37
+ step ->(result) { fetch_api_3(result) } # ├─ concurrently
38
+ step ->(result) { fetch_api_4(result) } # ┘
39
+ end
40
+ end
41
+ # Total: ~0.1s (4x speedup!)
42
+ ```
43
+
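The timing difference above can be reproduced without SimpleFlow. The following is a minimal sketch under stated assumptions: it uses plain threads and `sleep` as a stand-in for fiber-scheduled I/O (it does not use the Async gem), and `fake_fetch`, `timed`, `run_sequential`, and `run_concurrent` are hypothetical helpers, not part of the library.

```ruby
# Hypothetical stand-in for an I/O-bound call such as fetch_api_1.
def fake_fetch(name, delay)
  sleep(delay) # simulates waiting on the network
  name
end

# Runs the block and returns [block_value, elapsed_seconds].
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  value = yield
  [value, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# One call after another: elapsed time is roughly the sum of the delays.
def run_sequential(apis, delay)
  timed { apis.map { |api| fake_fetch(api, delay) } }
end

# All calls in flight at once: elapsed time is roughly the longest delay.
def run_concurrent(apis, delay)
  timed { apis.map { |api| Thread.new { fake_fetch(api, delay) } }.map(&:value) }
end

apis = %i[api_1 api_2 api_3 api_4]
_, sequential_time = run_sequential(apis, 0.05)
_, concurrent_time = run_concurrent(apis, 0.05)
puts format("sequential: %.2fs, concurrent: %.2fs", sequential_time, concurrent_time)
```

The speedup only appears because each call spends its time waiting; replacing `sleep` with a CPU-bound loop would not show the same gain on standard Ruby.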
44
+ ## Basic Usage
45
+
46
+ Use the `parallel` block in your pipeline:
47
+
48
+ ```ruby
49
+ pipeline = SimpleFlow::Pipeline.new do
50
+ # This runs first (sequential)
51
+ step ->(result) { initialize_data(result) }
52
+
53
+ # These run concurrently
54
+ parallel do
55
+ step ->(result) { fetch_orders(result) }
56
+ step ->(result) { fetch_preferences(result) }
57
+ step ->(result) { fetch_analytics(result) }
58
+ end
59
+
60
+ # This waits for all parallel steps to complete
61
+ step ->(result) { aggregate_results(result) }
62
+ end
63
+ ```
64
+
65
+ ## How It Works
66
+
67
+ ### Fiber-Based Concurrency
68
+
69
+ SimpleFlow uses the **Async gem** which provides fiber-based concurrency:
70
+
71
+ - **No threading overhead**: Fibers are lightweight
72
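+ - **Cooperative scheduling**: Fibers yield while waiting on I/O, so blocked steps overlap; Ruby code itself still runs under the Global VM Lock (GVL), so CPU-bound work does not get faster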
73
+ - **Perfect for I/O**: Ideal for network requests, file operations, etc.
74
+
75
+ ### Result Merging
76
+
77
+ When parallel steps complete, their results are automatically merged:
78
+
79
+ ```ruby
80
+ parallel do
81
+ step ->(result) { result.with_context(:a, 1).continue(result.value) }
82
+ step ->(result) { result.with_context(:b, 2).continue(result.value) }
83
+ step ->(result) { result.with_context(:c, 3).continue(result.value) }
84
+ end
85
+
86
+ # Merged result has all contexts: {:a=>1, :b=>2, :c=>3}
87
+ ```
88
+
89
+ **Merging Rules:**
90
+ - **Values**: Uses the last non-halted result's value
91
+ - **Contexts**: Merges all contexts together
92
+ - **Errors**: Merges all errors together
93
+ - **Continue**: If any step halts, the merged result is halted
94
+
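The four merging rules can be sketched in plain Ruby. This is an illustration only, not the gem's implementation: `SketchResult` and `merge_results` are hypothetical names, and the fallback used when every step halts is an assumption of this sketch.

```ruby
# Hypothetical stand-in for a pipeline result; not SimpleFlow::Result itself.
SketchResult = Struct.new(:value, :context, :errors, :halted, keyword_init: true) do
  def continue?
    !halted
  end
end

# Combines the results of parallel steps following the four rules above.
def merge_results(results)
  running = results.select(&:continue?)
  SketchResult.new(
    # Values: the last non-halted result wins (assumption: fall back to the
    # last result when every step halted).
    value: (running.last || results.last).value,
    # Contexts: all keys are combined; later steps win on a key conflict.
    context: results.map(&:context).reduce({}, :merge),
    # Errors: messages stored under the same key are concatenated.
    errors: results.map(&:errors).reduce({}) do |all, errs|
      all.merge(errs) { |_key, earlier, later| earlier + later }
    end,
    # Continue: a single halted step halts the merged result.
    halted: results.any?(&:halted)
  )
end

merged = merge_results([
  SketchResult.new(value: 1, context: { a: 1 }, errors: {}, halted: false),
  SketchResult.new(value: 2, context: { b: 2 }, errors: { service: ["Failed"] }, halted: true),
  SketchResult.new(value: 3, context: { c: 3 }, errors: {}, halted: false)
])
puts "value=#{merged.value} keys=#{merged.context.keys.join(',')} continue=#{merged.continue?}"
```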
95
+ ## Real-World Example
96
+
97
+ ### User Data Aggregation
98
+
99
+ ```ruby
100
+ require 'simple_flow'
101
+ require 'net/http'
102
+ require 'json'
103
+
104
+ pipeline = SimpleFlow::Pipeline.new do
105
+ # Validate user ID
106
+ step ->(result) {
107
+ user_id = result.value
108
+ user_id > 0 ?
109
+ result.continue(user_id) :
110
+ result.halt.with_error(:validation, "Invalid user ID")
111
+ }
112
+
113
+ # Fetch data from multiple services concurrently
114
+ parallel do
115
+ step ->(result) {
116
+ user_id = result.value
117
+ profile = fetch_user_profile(user_id)
118
+ result.with_context(:profile, profile).continue(user_id)
119
+ }
120
+
121
+ step ->(result) {
122
+ user_id = result.value
123
+ orders = fetch_user_orders(user_id)
124
+ result.with_context(:orders, orders).continue(user_id)
125
+ }
126
+
127
+ step ->(result) {
128
+ user_id = result.value
129
+ preferences = fetch_user_preferences(user_id)
130
+ result.with_context(:preferences, preferences).continue(user_id)
131
+ }
132
+
133
+ step ->(result) {
134
+ user_id = result.value
135
+ analytics = fetch_user_analytics(user_id)
136
+ result.with_context(:analytics, analytics).continue(user_id)
137
+ }
138
+ end
139
+
140
+ # Aggregate all fetched data
141
+ step ->(result) {
142
+ aggregated = {
143
+ user_id: result.value,
144
+ profile: result.context[:profile],
145
+ orders: result.context[:orders],
146
+ preferences: result.context[:preferences],
147
+ analytics: result.context[:analytics]
148
+ }
149
+ result.continue(aggregated)
150
+ }
151
+ end
152
+
153
+ # Execute
154
+ result = pipeline.call(SimpleFlow::Result.new(123))
155
+ puts result.value[:profile]
156
+ # => {...}
157
+ ```
158
+
159
+ ## Multiple Parallel Blocks
160
+
161
+ You can have multiple parallel blocks in a pipeline:
162
+
163
+ ```ruby
164
+ pipeline = SimpleFlow::Pipeline.new do
165
+ step ->(result) { initialize_data(result) }
166
+
167
+ # First parallel block
168
+ parallel do
169
+ step ->(result) { fetch_data_a(result) }
170
+ step ->(result) { fetch_data_b(result) }
171
+ end
172
+
173
+ step ->(result) { process_first_batch(result) }
174
+
175
+ # Second parallel block
176
+ parallel do
177
+ step ->(result) { enrich_data_a(result) }
178
+ step ->(result) { enrich_data_b(result) }
179
+ step ->(result) { enrich_data_c(result) }
180
+ end
181
+
182
+ step ->(result) { finalize(result) }
183
+ end
184
+ ```
185
+
186
+ ## Error Handling
187
+
188
+ If any parallel step halts, the entire parallel block halts:
189
+
190
+ ```ruby
191
+ parallel do
192
+ step ->(result) { result.continue("success") }
193
+ step ->(result) { result.halt.with_error(:service, "Failed") }
194
+ step ->(result) { result.continue("success") }
195
+ end
196
+ # Result is halted with error: {:service=>["Failed"]}
197
+ ```
198
+
199
+ All errors are accumulated:
200
+
201
+ ```ruby
202
+ parallel do
203
+ step ->(result) { result.with_error(:a, "Error A").continue(result.value) }
204
+ step ->(result) { result.with_error(:b, "Error B").continue(result.value) }
205
+ end
206
+ # Result has errors: {:a=>["Error A"], :b=>["Error B"]}
207
+ ```
208
+
209
+ ## Best Practices
210
+
211
+ ### ✅ Good Use Cases
212
+
213
+ - **Independent I/O operations**: API calls, database queries
214
+ - **Independent validations**: Multiple validation checks
215
+ - **Data enrichment**: Fetching supplementary data
216
+ - **File processing**: Processing multiple files
217
+
218
+ ### ❌ Poor Use Cases
219
+
220
+ - **Dependent operations**: When step B needs step A's result
221
+ - **CPU-intensive work**: Fibers do not run Ruby code in parallel; use separate processes (or threads on a runtime without a global lock)
222
+ - **Shared mutable state**: Could cause race conditions
223
+ - **Very quick operations**: Overhead might outweigh benefits
224
+
225
+ ## When to Use Parallel Execution
226
+
227
+ Use the `parallel` block when:
228
+
229
+ 1. ✅ Steps are **independent** (don't depend on each other's results)
230
+ 2. ✅ Steps are **I/O-bound** (network, file, database)
231
+ 3. ✅ Total execution time of steps > ~50ms
232
+ 4. ✅ Steps can safely run concurrently
233
+
234
+ Don't use `parallel` when:
235
+
236
+ 1. ❌ Steps depend on previous results
237
+ 2. ❌ Steps are very fast (<10ms each)
238
+ 3. ❌ Steps modify shared state
239
+ 4. ❌ Steps are CPU-intensive
240
+
241
+ ## Next Steps
242
+
243
+ - [Parallel Steps Guide](parallel-steps.md) - Deep dive into ParallelStep
244
+ - [Performance Tips](performance.md) - Optimize concurrent execution
245
+ - [Best Practices](best-practices.md) - Patterns and anti-patterns
246
+ - [Examples](../getting-started/examples.md) - See it in action
@@ -0,0 +1,418 @@
1
+ # Parallel Execution with Named Steps
2
+
3
+ SimpleFlow provides powerful parallel execution capabilities through two approaches: automatic parallel discovery using dependency graphs and explicit parallel blocks. This guide focuses on using named steps with dependencies for automatic parallelization.
4
+
5
+ ## Overview
6
+
7
+ When you define steps with names and dependencies, SimpleFlow automatically analyzes the dependency graph and executes independent steps concurrently. This provides optimal performance without requiring you to explicitly manage parallelism.
8
+
9
+ ## Basic Concepts
10
+
11
+ ### Named Steps
12
+
13
+ A named step is defined with three components:
14
+
15
+ 1. **Name** (Symbol) - Unique identifier for the step
16
+ 2. **Callable** (Proc/Lambda) - The code to execute
17
+ 3. **Dependencies** (Array of Symbols) - Steps that must complete first
18
+
19
+ ```ruby
20
+ pipeline = SimpleFlow::Pipeline.new do
21
+ step :step_name, ->(result) {
22
+ # Your code here
23
+ result.continue(new_value)
24
+ }, depends_on: [:prerequisite_step]
25
+ end
26
+ ```
27
+
28
+ ### Dependency Declaration
29
+
30
+ Dependencies are declared using the `depends_on:` parameter:
31
+
32
+ ```ruby
33
+ # No dependencies - can run immediately
34
+ step :initial_step, ->(result) { ... }, depends_on: []
35
+
36
+ # Depends on one step
37
+ step :second_step, ->(result) { ... }, depends_on: [:initial_step]
38
+
39
+ # Depends on multiple steps
40
+ step :final_step, ->(result) { ... }, depends_on: [:second_step, :third_step]
41
+ ```
42
+
43
+ ## Automatic Parallelization
44
+
45
+ ### How It Works
46
+
47
+ 1. **Graph Analysis**: SimpleFlow builds a dependency graph from your step declarations
48
+ 2. **Topological Sort**: Steps are organized into execution groups using Ruby's TSort module
49
+ 3. **Parallel Execution**: Steps with all dependencies satisfied run concurrently
50
+ 4. **Result Merging**: Contexts and errors from parallel steps are automatically merged
51
+
52
+ ### Simple Example
53
+
54
+ ```ruby
55
+ pipeline = SimpleFlow::Pipeline.new do
56
+ # Step 1: Runs first (no dependencies)
57
+ step :fetch_user, ->(result) {
58
+ user = UserService.find(result.value)
59
+ result.with_context(:user, user).continue(result.value)
60
+ }, depends_on: []
61
+
62
+ # Steps 2 & 3: Run in parallel (both depend only on step 1)
63
+ step :fetch_orders, ->(result) {
64
+ orders = OrderService.for_user(result.context[:user])
65
+ result.with_context(:orders, orders).continue(result.value)
66
+ }, depends_on: [:fetch_user]
67
+
68
+ step :fetch_preferences, ->(result) {
69
+ prefs = PreferenceService.for_user(result.context[:user])
70
+ result.with_context(:preferences, prefs).continue(result.value)
71
+ }, depends_on: [:fetch_user]
72
+
73
+ # Step 4: Runs after both parallel steps complete
74
+ step :build_profile, ->(result) {
75
+ profile = {
76
+ user: result.context[:user],
77
+ orders: result.context[:orders],
78
+ preferences: result.context[:preferences]
79
+ }
80
+ result.continue(profile)
81
+ }, depends_on: [:fetch_orders, :fetch_preferences]
82
+ end
83
+
84
+ # Execute with automatic parallelism
85
+ result = pipeline.call_parallel(SimpleFlow::Result.new(user_id))
86
+ ```
87
+
88
+ **Execution Flow:**
89
+ 1. `fetch_user` runs first
90
+ 2. `fetch_orders` and `fetch_preferences` run in parallel
91
+ 3. `build_profile` runs after both parallel steps complete
92
+
93
+ ## Complex Dependency Graphs
94
+
95
+ ### Multi-Level Parallelism
96
+
97
+ ```ruby
98
+ pipeline = SimpleFlow::Pipeline.new do
99
+ # Level 1: Validation (sequential)
100
+ step :validate_input, ->(result) {
101
+ # Validate request
102
+ result.with_context(:validated, true).continue(result.value)
103
+ }, depends_on: []
104
+
105
+ # Level 2: Three independent checks (parallel)
106
+ step :check_inventory, ->(result) {
107
+ inventory = InventoryService.check(result.value)
108
+ result.with_context(:inventory, inventory).continue(result.value)
109
+ }, depends_on: [:validate_input]
110
+
111
+ step :check_pricing, ->(result) {
112
+ price = PricingService.calculate(result.value)
113
+ result.with_context(:price, price).continue(result.value)
114
+ }, depends_on: [:validate_input]
115
+
116
+ step :check_shipping, ->(result) {
117
+ shipping = ShippingService.calculate(result.value)
118
+ result.with_context(:shipping, shipping).continue(result.value)
119
+ }, depends_on: [:validate_input]
120
+
121
+ # Level 3: Calculate discount (depends on inventory and pricing)
122
+ step :calculate_discount, ->(result) {
123
+ discount = DiscountService.calculate(
124
+ result.context[:inventory],
125
+ result.context[:price]
126
+ )
127
+ result.with_context(:discount, discount).continue(result.value)
128
+ }, depends_on: [:check_inventory, :check_pricing]
129
+
130
+ # Level 4: Finalize (depends on discount and shipping)
131
+ step :finalize_order, ->(result) {
132
+ total = result.context[:price] +
133
+ result.context[:shipping] -
134
+ result.context[:discount]
135
+ result.continue(total)
136
+ }, depends_on: [:calculate_discount, :check_shipping]
137
+ end
138
+ ```
139
+
140
+ **Execution Groups:**
141
+ - Group 1: `validate_input` (sequential)
142
+ - Group 2: `check_inventory`, `check_pricing`, `check_shipping` (parallel)
143
+ - Group 3: `calculate_discount` (sequential, waits for inventory and pricing)
144
+ - Group 4: `finalize_order` (sequential, waits for discount and shipping)
145
+
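The grouping above can be derived mechanically from the `depends_on:` declarations. The guide says SimpleFlow uses Ruby's TSort module for this; the sketch below swaps in a plain level-by-level loop to show the idea, and `execution_groups` and `ORDER_DEPENDENCIES` are hypothetical names rather than library API.

```ruby
# Groups steps into levels: every step in a level has all of its
# dependencies satisfied by earlier levels, so a level can run concurrently.
def execution_groups(dependencies)
  remaining = dependencies.dup
  done = []
  groups = []
  until remaining.empty?
    # A step is ready once none of its dependencies are still outstanding.
    ready = remaining.select { |_step, deps| (deps - done).empty? }.keys
    raise ArgumentError, "cyclic or unknown dependency" if ready.empty?

    groups << ready
    done.concat(ready)
    ready.each { |step| remaining.delete(step) }
  end
  groups
end

# Mirrors the depends_on: declarations of the order pipeline.
ORDER_DEPENDENCIES = {
  validate_input: [],
  check_inventory: [:validate_input],
  check_pricing: [:validate_input],
  check_shipping: [:validate_input],
  calculate_discount: %i[check_inventory check_pricing],
  finalize_order: %i[calculate_discount check_shipping]
}.freeze

execution_groups(ORDER_DEPENDENCIES).each_with_index do |group, index|
  puts "Group #{index + 1}: #{group.join(', ')}"
end
```

A dependency cycle, or a dependency on a step that was never declared, leaves no step ready to run, so the sketch raises instead of looping forever.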
146
+ ## Context Merging
147
+
148
+ When parallel steps complete, SimpleFlow automatically merges their contexts and errors:
149
+
150
+ ```ruby
151
+ pipeline = SimpleFlow::Pipeline.new do
152
+ step :task_a, ->(result) {
153
+ result.with_context(:data_a, "from A").continue(result.value)
154
+ }, depends_on: []
155
+
156
+ step :task_b, ->(result) {
157
+ result.with_context(:data_b, "from B").continue(result.value)
158
+ }, depends_on: []
159
+
160
+ step :combine, ->(result) {
161
+ # Both contexts are available
162
+ combined = {
163
+ a: result.context[:data_a], # "from A"
164
+ b: result.context[:data_b] # "from B"
165
+ }
166
+ result.continue(combined)
167
+ }, depends_on: [:task_a, :task_b]
168
+ end
169
+ ```
170
+
171
+ ### Error Accumulation
172
+
173
+ Errors from parallel steps are also merged:
174
+
175
+ ```ruby
176
+ pipeline = SimpleFlow::Pipeline.new do
177
+ step :validate_email, ->(result) {
178
+ if invalid_email?(result.value[:email])
179
+ result = result.with_error(:email, "Invalid format") # keep the result returned by with_error
180
+ end
181
+ result.continue(result.value)
182
+ }, depends_on: []
183
+
184
+ step :validate_phone, ->(result) {
185
+ if invalid_phone?(result.value[:phone])
186
+ result = result.with_error(:phone, "Invalid format") # keep the result returned by with_error
187
+ end
188
+ result.continue(result.value)
189
+ }, depends_on: []
190
+
191
+ step :check_errors, ->(result) {
192
+ # Errors from both parallel validations are available
193
+ if result.errors.any?
194
+ result.halt(result.value) # Stop if any validation failed
195
+ else
196
+ result.continue(result.value)
197
+ end
198
+ }, depends_on: [:validate_email, :validate_phone]
199
+ end
200
+ ```
201
+
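The chained style used throughout this guide (`result.with_error(...).continue(...)`) suggests that each call returns a new result instead of mutating the receiver. Assuming that holds, a validation step has to return or reassign the value that `with_error` hands back, otherwise the error is lost. The sketch below illustrates this with a hypothetical `ImmutableResult` struct; it is not the gem's `Result` class.

```ruby
# Hypothetical immutable result; an assumption about the style, not the gem's class.
ImmutableResult = Struct.new(:value, :errors) do
  # Returns a new frozen result with the message appended under key.
  # The receiver is left unchanged.
  def with_error(key, message)
    self.class.new(value, errors.merge(key => [*errors[key], message])).freeze
  end
end

# Returns the input untouched when the email looks valid, otherwise the
# new result produced by with_error.
def validate_email(result)
  return result if result.value[:email].to_s.include?("@")

  result.with_error(:email, "Invalid format")
end

original = ImmutableResult.new({ email: "not-an-email" }, {}).freeze
checked = validate_email(original)
puts "original errors: #{original.errors.size}, checked errors: #{checked.errors.size}"
```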
202
+ ## Halting Execution
203
+
204
+ If any parallel step calls `halt`, the merged result is halted and steps that depend on that group do not run:
205
+
206
+ ```ruby
207
+ pipeline = SimpleFlow::Pipeline.new do
208
+ step :task_a, ->(result) {
209
+ result.with_context(:success_a, true).continue(result.value)
210
+ }, depends_on: []
211
+
212
+ step :task_b, ->(result) {
213
+ # This step fails
214
+ result.halt.with_error(:failure, "Task B failed")
215
+ }, depends_on: []
216
+
217
+ step :task_c, ->(result) {
218
+ result.with_context(:success_c, true).continue(result.value)
219
+ }, depends_on: []
220
+
221
+ step :final_step, ->(result) {
222
+ # This will NOT execute because task_b halted
223
+ result.continue("Completed")
224
+ }, depends_on: [:task_a, :task_b, :task_c]
225
+ end
226
+
227
+ result = pipeline.call_parallel(initial_data)
228
+ # result.continue? => false
229
+ # result.errors => {:failure => ["Task B failed"]}
230
+ ```
231
+
232
+ ## Execution Methods
233
+
234
+ ### `call_parallel(result, strategy: :auto)`
235
+
236
+ Executes the pipeline with parallel support:
237
+
238
+ ```ruby
239
+ # Automatic strategy (default) - uses dependency graph if named steps exist
240
+ result = pipeline.call_parallel(initial_result)
241
+
242
+ # Automatic strategy (explicit)
243
+ result = pipeline.call_parallel(initial_result, strategy: :auto)
244
+
245
+ # Explicit strategy - only uses explicit parallel blocks
246
+ result = pipeline.call_parallel(initial_result, strategy: :explicit)
247
+ ```
248
+
249
+ ### `call(result)`
250
+
251
+ Executes sequentially (ignores parallelism):
252
+
253
+ ```ruby
254
+ # Sequential execution - useful for debugging
255
+ result = pipeline.call(initial_result)
256
+ ```
257
+
258
+ ## Visualizing Dependencies
259
+
260
+ ### ASCII Visualization
261
+
262
+ ```ruby
263
+ # Print dependency graph to console
264
+ puts pipeline.visualize_ascii
265
+
266
+ # Hide parallel groups
267
+ puts pipeline.visualize_ascii(show_groups: false)
268
+ ```
269
+
270
+ ### Graphviz DOT Format
271
+
272
+ ```ruby
273
+ # Generate DOT file for visualization
274
+ dot_content = pipeline.visualize_dot
275
+ File.write('pipeline.dot', dot_content)
276
+
277
+ # Generate image: dot -Tpng pipeline.dot -o pipeline.png
278
+
279
+ # Left-to-right orientation
280
+ dot_content = pipeline.visualize_dot(orientation: 'LR')
281
+ ```
282
+
283
+ ### Mermaid Diagrams
284
+
285
+ ```ruby
286
+ # Generate Mermaid diagram
287
+ mermaid = pipeline.visualize_mermaid
288
+ File.write('pipeline.mmd', mermaid)
289
+
290
+ # View at https://mermaid.live/
291
+ ```
292
+
293
+ ### Execution Plan
294
+
295
+ ```ruby
296
+ # Get detailed execution analysis
297
+ puts pipeline.execution_plan
298
+ ```
299
+
300
+ Output includes:
301
+ - Total steps and execution phases
302
+ - Which steps run in parallel
303
+ - Potential speedup vs sequential execution
304
+ - Step-by-step execution order
305
+
306
+ ## Best Practices
307
+
308
+ ### 1. Design Independent Steps
309
+
310
+ Ensure parallel steps are truly independent:
311
+
312
+ ```ruby
313
+ # GOOD: Independent operations
314
+ step :fetch_user_data, ->(result) { ... }, depends_on: []
315
+ step :fetch_product_data, ->(result) { ... }, depends_on: []
316
+
317
+ # BAD: Steps that modify shared state
318
+ step :increment_counter, ->(result) { @counter += 1; ... }, depends_on: []
319
+ step :read_counter, ->(result) { puts @counter; ... }, depends_on: []
320
+ ```
321
+
322
+ ### 2. Use Context for Data Sharing
323
+
324
+ Pass data between steps using context, not instance variables:
325
+
326
+ ```ruby
327
+ # GOOD: Using context
328
+ step :fetch_data, ->(result) {
329
+ data = API.fetch(result.value)
330
+ result.with_context(:api_data, data).continue(result.value)
331
+ }, depends_on: []
332
+
333
+ step :process_data, ->(result) {
334
+ processed = transform(result.context[:api_data])
335
+ result.continue(processed)
336
+ }, depends_on: [:fetch_data]
337
+
338
+ # BAD: Using instance variables
339
+ @shared_data = nil
340
+ step :fetch_data, ->(result) {
341
+ @shared_data = API.fetch(result.value) # Race condition!
342
+ result.continue(result.value)
343
+ }, depends_on: []
344
+ ```
345
+
346
+ ### 3. Declare All Dependencies
347
+
348
+ Be explicit about dependencies to ensure correct execution order:
349
+
350
+ ```ruby
351
+ # GOOD: Clear dependencies
352
+ step :load_config, ->(result) { ... }, depends_on: []
353
+ step :validate_config, ->(result) { ... }, depends_on: [:load_config]
354
+ step :apply_config, ->(result) { ... }, depends_on: [:validate_config]
355
+
356
+ # BAD: Missing dependencies
357
+ step :load_config, ->(result) { ... }, depends_on: []
358
+ step :apply_config, ->(result) { ... }, depends_on: [] # Should depend on load_config!
359
+ ```
360
+
361
+ ### 4. Keep Steps Focused
362
+
363
+ Each step should have a single responsibility:
364
+
365
+ ```ruby
366
+ # GOOD: Focused steps
367
+ step :fetch_user, ->(result) { ... }, depends_on: []
368
+ step :fetch_orders, ->(result) { ... }, depends_on: [:fetch_user]
369
+ step :calculate_total, ->(result) { ... }, depends_on: [:fetch_orders]
370
+
371
+ # BAD: Monolithic step
372
+ step :do_everything, ->(result) {
373
+ user = fetch_user
374
+ orders = fetch_orders(user)
375
+ total = calculate_total(orders)
376
+ # Too much in one step!
377
+ }, depends_on: []
378
+ ```
379
+
380
+ ### 5. Handle Errors Gracefully
381
+
382
+ Add error handling at appropriate points:
383
+
384
+ ```ruby
385
+ pipeline = SimpleFlow::Pipeline.new do
386
+ # Parallel data fetching
387
+ step :fetch_a, ->(result) { ... }, depends_on: []
388
+ step :fetch_b, ->(result) { ... }, depends_on: []
389
+
390
+ # Check for errors before proceeding
391
+ step :validate_fetch, ->(result) {
392
+ if result.errors.any?
393
+ result.halt.with_error(:fetch, "Failed to fetch required data")
394
+ else
395
+ result.continue(result.value)
396
+ end
397
+ }, depends_on: [:fetch_a, :fetch_b]
398
+
399
+ # Only runs if validation passes
400
+ step :process, ->(result) { ... }, depends_on: [:validate_fetch]
401
+ end
402
+ ```
403
+
404
+ ## Real-World Example
405
+
406
+ See `examples/06_real_world_ecommerce.rb` for a complete e-commerce order processing pipeline that demonstrates:
407
+
408
+ - Multi-level parallel execution
409
+ - Context merging
410
+ - Error handling
411
+ - Complex dependency relationships
412
+
413
+ ## Related Documentation
414
+
415
+ - [Performance Characteristics](performance.md) - Understanding parallel execution performance
416
+ - [Best Practices](best-practices.md) - Comprehensive best practices for concurrent execution
417
+ - [Pipeline API](../api/pipeline.md) - Complete Pipeline API reference
418
+ - [Parallel Executor API](../api/parallel-step.md) - Low-level parallel execution details