simple_flow 0.1.0

Files changed (80)
  1. checksums.yaml +7 -0
  2. data/.envrc +1 -0
  3. data/.github/workflows/deploy-github-pages.yml +52 -0
  4. data/.rubocop.yml +57 -0
  5. data/CHANGELOG.md +4 -0
  6. data/COMMITS.md +196 -0
  7. data/LICENSE +21 -0
  8. data/README.md +481 -0
  9. data/Rakefile +15 -0
  10. data/benchmarks/parallel_vs_sequential.rb +98 -0
  11. data/benchmarks/pipeline_overhead.rb +130 -0
  12. data/docs/api/middleware.md +468 -0
  13. data/docs/api/parallel-step.md +363 -0
  14. data/docs/api/pipeline.md +382 -0
  15. data/docs/api/result.md +375 -0
  16. data/docs/concurrent/best-practices.md +687 -0
  17. data/docs/concurrent/introduction.md +246 -0
  18. data/docs/concurrent/parallel-steps.md +418 -0
  19. data/docs/concurrent/performance.md +481 -0
  20. data/docs/core-concepts/flow-control.md +452 -0
  21. data/docs/core-concepts/middleware.md +389 -0
  22. data/docs/core-concepts/overview.md +219 -0
  23. data/docs/core-concepts/pipeline.md +315 -0
  24. data/docs/core-concepts/result.md +168 -0
  25. data/docs/core-concepts/steps.md +391 -0
  26. data/docs/development/benchmarking.md +443 -0
  27. data/docs/development/contributing.md +380 -0
  28. data/docs/development/dagwood-concepts.md +435 -0
  29. data/docs/development/testing.md +514 -0
  30. data/docs/getting-started/examples.md +197 -0
  31. data/docs/getting-started/installation.md +62 -0
  32. data/docs/getting-started/quick-start.md +218 -0
  33. data/docs/guides/choosing-concurrency-model.md +441 -0
  34. data/docs/guides/complex-workflows.md +440 -0
  35. data/docs/guides/data-fetching.md +478 -0
  36. data/docs/guides/error-handling.md +635 -0
  37. data/docs/guides/file-processing.md +505 -0
  38. data/docs/guides/validation-patterns.md +496 -0
  39. data/docs/index.md +169 -0
  40. data/examples/.gitignore +3 -0
  41. data/examples/01_basic_pipeline.rb +112 -0
  42. data/examples/02_error_handling.rb +178 -0
  43. data/examples/03_middleware.rb +186 -0
  44. data/examples/04_parallel_automatic.rb +221 -0
  45. data/examples/05_parallel_explicit.rb +279 -0
  46. data/examples/06_real_world_ecommerce.rb +288 -0
  47. data/examples/07_real_world_etl.rb +277 -0
  48. data/examples/08_graph_visualization.rb +246 -0
  49. data/examples/09_pipeline_visualization.rb +266 -0
  50. data/examples/10_concurrency_control.rb +235 -0
  51. data/examples/11_sequential_dependencies.rb +243 -0
  52. data/examples/12_none_constant.rb +161 -0
  53. data/examples/README.md +374 -0
  54. data/examples/regression_test/01_basic_pipeline.txt +38 -0
  55. data/examples/regression_test/02_error_handling.txt +92 -0
  56. data/examples/regression_test/03_middleware.txt +61 -0
  57. data/examples/regression_test/04_parallel_automatic.txt +86 -0
  58. data/examples/regression_test/05_parallel_explicit.txt +80 -0
  59. data/examples/regression_test/06_real_world_ecommerce.txt +53 -0
  60. data/examples/regression_test/07_real_world_etl.txt +58 -0
  61. data/examples/regression_test/08_graph_visualization.txt +429 -0
  62. data/examples/regression_test/09_pipeline_visualization.txt +305 -0
  63. data/examples/regression_test/10_concurrency_control.txt +96 -0
  64. data/examples/regression_test/11_sequential_dependencies.txt +86 -0
  65. data/examples/regression_test/12_none_constant.txt +64 -0
  66. data/examples/regression_test.rb +105 -0
  67. data/lib/simple_flow/dependency_graph.rb +120 -0
  68. data/lib/simple_flow/dependency_graph_visualizer.rb +326 -0
  69. data/lib/simple_flow/middleware.rb +36 -0
  70. data/lib/simple_flow/parallel_executor.rb +80 -0
  71. data/lib/simple_flow/pipeline.rb +405 -0
  72. data/lib/simple_flow/result.rb +88 -0
  73. data/lib/simple_flow/step_tracker.rb +58 -0
  74. data/lib/simple_flow/version.rb +5 -0
  75. data/lib/simple_flow.rb +41 -0
  76. data/mkdocs.yml +146 -0
  77. data/pipeline_graph.dot +51 -0
  78. data/pipeline_graph.html +60 -0
  79. data/pipeline_graph.mmd +19 -0
  80. metadata +127 -0
data/docs/getting-started/installation.md
@@ -0,0 +1,62 @@
1
+ # Installation
2
+
3
+ ## Requirements
4
+
5
+ - Ruby >= 2.7.0
6
+ - Bundler (recommended)
7
+
8
+ ## Installation Methods
9
+
10
+ ### Using Bundler (Recommended)
11
+
12
+ Add SimpleFlow to your `Gemfile`:
13
+
14
+ ```ruby
15
+ gem 'simple_flow'
16
+ ```
17
+
18
+ Then install:
19
+
20
+ ```bash
21
+ bundle install
22
+ ```
23
+
24
+ ### Using RubyGems
25
+
26
+ Install directly with gem:
27
+
28
+ ```bash
29
+ gem install simple_flow
30
+ ```
31
+
32
+ ## Dependencies
33
+
34
+ SimpleFlow has minimal dependencies:
35
+
36
+ - **async** (~> 2.0) - For concurrent execution support
37
+
38
+ All dependencies are automatically installed.
39
+
40
+ ## Verifying Installation
41
+
42
+ After installation, verify SimpleFlow is working:
43
+
44
+ ```ruby
45
+ require 'simple_flow'
46
+
47
+ pipeline = SimpleFlow::Pipeline.new do
48
+ step ->(result) { result.continue("Hello, SimpleFlow!") }
49
+ end
50
+
51
+ result = pipeline.call(SimpleFlow::Result.new(nil))
52
+ puts result.value
53
+ # => "Hello, SimpleFlow!"
54
+ ```
55
+
56
+ If this runs without errors, you're ready to go!
57
+
58
+ ## Next Steps
59
+
60
+ - [Quick Start Guide](quick-start.md) - Build your first pipeline
61
+ - [Examples](examples.md) - See SimpleFlow in action
62
+ - [Core Concepts](../core-concepts/overview.md) - Understand the fundamentals
data/docs/getting-started/quick-start.md
@@ -0,0 +1,218 @@
1
+ # Quick Start
2
+
3
+ Get up and running with SimpleFlow in 5 minutes!
4
+
5
+ ## Your First Pipeline
6
+
7
+ ```ruby
8
+ require 'simple_flow'
9
+
10
+ # Create a simple text processing pipeline
11
+ pipeline = SimpleFlow::Pipeline.new do
12
+ step ->(result) { result.continue(result.value.strip) }
13
+ step ->(result) { result.continue(result.value.downcase) }
14
+ step ->(result) { result.continue("Hello, #{result.value}!") }
15
+ end
16
+
17
+ # Execute the pipeline
18
+ result = pipeline.call(SimpleFlow::Result.new(" WORLD "))
19
+ puts result.value
20
+ # => "Hello, world!"
21
+ ```
22
+
23
+ ## Understanding the Basics
24
+
25
+ ### Sequential Execution
26
+
27
+ **Steps execute in order, with each step automatically depending on the previous step's success.**
28
+
29
+ ```ruby
30
+ pipeline = SimpleFlow::Pipeline.new do
31
+ step ->(result) { puts "Step 1"; result.continue(result.value) }
32
+ step ->(result) { puts "Step 2"; result.halt("error") } # Stops here
33
+ step ->(result) { puts "Step 3"; result.continue(result.value) } # Never runs
34
+ end
35
+
36
+ result = pipeline.call(SimpleFlow::Result.new(nil))
37
+ # Output: Step 1
38
+ # Step 2
39
+ # (Step 3 is skipped because Step 2 halted)
40
+ ```
41
+
42
+ When any step halts (returns `result.halt`), the pipeline stops immediately and subsequent steps are not executed.
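The short-circuit behavior can be illustrated without the gem. The sketch below is a stand-in, not SimpleFlow's implementation: `MiniResult` and `run_steps` are hypothetical names introduced only for this illustration.

```ruby
# Hypothetical stand-in for a result object; this is NOT SimpleFlow::Result.
MiniResult = Struct.new(:value, :halted) do
  def continue(new_value)
    MiniResult.new(new_value, false)
  end

  def halt(new_value = value)
    MiniResult.new(new_value, true)
  end
end

# Run steps in order, stopping as soon as one returns a halted result.
def run_steps(steps, initial)
  steps.reduce(initial) do |result, step|
    break result if result.halted

    step.call(result)
  end
end

log = []
steps = [
  ->(r) { log << "Step 1"; r.continue(r.value) },
  ->(r) { log << "Step 2"; r.halt("error") },
  ->(r) { log << "Step 3"; r.continue(r.value) } # never runs
]

final = run_steps(steps, MiniResult.new(nil, false))
puts log.inspect   # "Step 3" is absent
puts final.halted
```

Each step returns a new result rather than mutating the old one, so the loop only needs to inspect the most recent result to decide whether to keep going.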
43
+
44
+ ### 1. Create a Result
45
+
46
+ A `Result` wraps your data:
47
+
48
+ ```ruby
49
+ result = SimpleFlow::Result.new(42)
50
+ ```
51
+
52
+ ### 2. Define Steps
53
+
54
+ Steps are callable objects (usually lambdas) that transform results:
55
+
56
+ ```ruby
57
+ step ->(result) {
58
+ new_value = result.value * 2
59
+ result.continue(new_value)
60
+ }
61
+ ```
62
+
63
+ ### 3. Build a Pipeline
64
+
65
+ Combine steps into a pipeline:
66
+
67
+ ```ruby
68
+ pipeline = SimpleFlow::Pipeline.new do
69
+ step ->(result) { result.continue(result.value + 10) }
70
+ step ->(result) { result.continue(result.value * 2) }
71
+ end
72
+ ```
73
+
74
+ ### 4. Execute
75
+
76
+ Call the pipeline with an initial result:
77
+
78
+ ```ruby
79
+ final = pipeline.call(SimpleFlow::Result.new(5))
80
+ puts final.value # => 30 ((5 + 10) * 2)
81
+ ```
82
+
83
+ ## Adding Context
84
+
85
+ Track metadata throughout your pipeline:
86
+
87
+ ```ruby
88
+ pipeline = SimpleFlow::Pipeline.new do
89
+ step ->(result) {
90
+ result
91
+ .with_context(:started_at, Time.now)
92
+ .continue(result.value)
93
+ }
94
+
95
+ step ->(result) {
96
+ result
97
+ .with_context(:user, "Alice")
98
+ .continue(result.value.upcase)
99
+ }
100
+ end
101
+
102
+ result = pipeline.call(SimpleFlow::Result.new("hello"))
103
+ puts result.value # => "HELLO"
104
+ puts result.context # => {:started_at=>..., :user=>"Alice"}
105
+ ```
106
+
107
+ ## Error Handling
108
+
109
+ Accumulate errors and halt execution:
110
+
111
+ ```ruby
112
+ pipeline = SimpleFlow::Pipeline.new do
113
+ step ->(result) {
114
+ age = result.value
115
+ if age < 18
116
+ result.halt.with_error(:age, "Must be 18 or older")
117
+ else
118
+ result.continue(age)
119
+ end
120
+ }
121
+
122
+ step ->(result) {
123
+ # This won't execute if age < 18
124
+ result.continue("Approved for age #{result.value}")
125
+ }
126
+ end
127
+
128
+ result = pipeline.call(SimpleFlow::Result.new(16))
129
+ puts result.continue? # => false
130
+ puts result.errors # => {:age=>["Must be 18 or older"]}
131
+ ```
132
+
133
+ ## Concurrent Execution
134
+
135
+ Run independent steps in parallel:
136
+
137
+ ```ruby
138
+ pipeline = SimpleFlow::Pipeline.new do
139
+ parallel do
140
+ step ->(result) { result.with_context(:a, fetch_data_a).continue(result.value) }
141
+ step ->(result) { result.with_context(:b, fetch_data_b).continue(result.value) }
142
+ step ->(result) { result.with_context(:c, fetch_data_c).continue(result.value) }
143
+ end
144
+
145
+ step ->(result) {
146
+ # All three fetches completed concurrently
147
+ result.continue("Aggregated data")
148
+ }
149
+ end
150
+ ```
151
+
152
+ ## Middleware
153
+
154
+ Add cross-cutting concerns:
155
+
156
+ ```ruby
157
+ pipeline = SimpleFlow::Pipeline.new do
158
+ use_middleware SimpleFlow::MiddleWare::Logging
159
+
160
+ step ->(result) { result.continue(result.value + 1) }
161
+ step ->(result) { result.continue(result.value * 2) }
162
+ end
163
+
164
+ # Logs before and after each step
165
+ ```
166
+
167
+ ## Real-World Example
168
+
169
+ Here's a more complete example:
170
+
171
+ ```ruby
172
+ require 'simple_flow'
173
+
174
+ # Define validation steps
175
+ validate_email = ->(result) {
176
+ email = result.value[:email]
177
+ if email && email.match?(/\A[\w+\-.]+@[a-z\d\-]+(\.[a-z]+)+\z/i)
178
+ result.continue(result.value)
179
+ else
180
+ result.halt(result.value).with_error(:email, "Invalid email format")
181
+ end
182
+ }
183
+
184
+ validate_age = ->(result) {
185
+ age = result.value[:age]
186
+ if age && age >= 18
187
+ result.continue(result.value)
188
+ else
189
+ result.halt(result.value).with_error(:age, "Must be 18 or older")
190
+ end
191
+ }
192
+
193
+ # Build validation pipeline
194
+ validation_pipeline = SimpleFlow::Pipeline.new do
195
+ step validate_email
196
+ step validate_age
197
+ end
198
+
199
+ # Test with valid data
200
+ valid_data = { email: "alice@example.com", age: 25 }
201
+ result = validation_pipeline.call(SimpleFlow::Result.new(valid_data))
202
+ puts result.continue? # => true
203
+
204
+ # Test with invalid data
205
+ invalid_data = { email: "invalid", age: 16 }
206
+ result = validation_pipeline.call(SimpleFlow::Result.new(invalid_data))
207
+ puts result.continue? # => false
208
+ puts result.errors # => {:email=>["Invalid email format"]}
209
+ ```
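To see what the email pattern accepts, it can be exercised on its own. This is a small sketch using only the regular expression from the example above; `EMAIL_FORMAT` and the sample strings are names and inputs chosen for illustration.

```ruby
# The same pattern used by validate_email above.
EMAIL_FORMAT = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z]+)+\z/i

samples = [
  "alice@example.com",      # plain address
  "alice@mail.example.com", # subdomain
  "invalid",                # no @ sign
  "alice@example",          # no top-level domain
  "alice @example.com"      # whitespace in the local part
]

samples.each do |email|
  puts "#{email.inspect}: #{email.match?(EMAIL_FORMAT)}"
end
```

Note that in the example above the invalid data also has an age of 16, yet only the email error is reported: the pipeline halts at the first failing step, so `validate_age` never runs.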
210
+
211
+ ## Next Steps
212
+
213
+ Now that you've got the basics, explore:
214
+
215
+ - [Examples](examples.md) - Real-world use cases
216
+ - [Core Concepts](../core-concepts/overview.md) - Deep dive into architecture
217
+ - [Concurrent Execution](../concurrent/introduction.md) - Maximize performance
218
+ - [Error Handling Guide](../guides/error-handling.md) - Advanced error patterns
data/docs/guides/choosing-concurrency-model.md
@@ -0,0 +1,441 @@
1
+ # Choosing a Concurrency Model
2
+
3
+ SimpleFlow supports two different approaches for parallel execution: Ruby threads and the async gem (fiber-based). This guide helps you choose the right one for your use case.
4
+
5
+ ## Overview
6
+
7
+ You can control which concurrency model a pipeline uses in two ways:
8
+
9
+ ### 1. Automatic Detection (Default)
10
+
11
+ When you create a pipeline without specifying concurrency:
12
+
13
+ ```ruby
14
+ pipeline = SimpleFlow::Pipeline.new do
15
+ # steps...
16
+ end
17
+ ```
18
+
19
+ SimpleFlow automatically uses the best available model:
20
+ - **Without async gem**: Uses Ruby's built-in threads
21
+ - **With async gem**: Uses fiber-based concurrency
22
+
23
+ ### 2. Explicit Concurrency Selection
24
+
25
+ You can explicitly choose the concurrency model per pipeline:
26
+
27
+ ```ruby
28
+ # Force threads (even if async gem is available)
29
+ pipeline = SimpleFlow::Pipeline.new(concurrency: :threads) do
30
+ # steps...
31
+ end
32
+
33
+ # Force async (raises error if async gem not available)
34
+ pipeline = SimpleFlow::Pipeline.new(concurrency: :async) do
35
+ # steps...
36
+ end
37
+
38
+ # Auto-detect (default behavior)
39
+ pipeline = SimpleFlow::Pipeline.new(concurrency: :auto) do
40
+ # steps...
41
+ end
42
+ ```
43
+
44
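+ Both provide **concurrent execution**: steps overlap while they wait on I/O, although on CRuby only one thread or fiber runs Ruby code at any instant. The difference is in how they achieve it and in their resource characteristics.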
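The selection rules described above can be sketched as a small decision function. This is an illustration only: `async_gem_available?` and `resolve_concurrency` are hypothetical helpers, and the use of `ArgumentError` is an assumption; the gem's actual detection logic and error class may differ.

```ruby
# Hypothetical helper: report whether the async gem can be loaded.
def async_gem_available?
  require 'async'
  true
rescue LoadError
  false
end

# Map the requested option to the model that would be used.
def resolve_concurrency(requested, async_available)
  case requested
  when :threads
    :threads
  when :async
    raise ArgumentError, "concurrency: :async requires the async gem" unless async_available

    :async
  when :auto
    async_available ? :async : :threads
  else
    raise ArgumentError, "unknown concurrency option: #{requested.inspect}"
  end
end

puts resolve_concurrency(:auto, async_gem_available?)
puts resolve_concurrency(:threads, true) # threads, even when async is installed
```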
45
+
46
+ ## Ruby Threads (Without async gem)
47
+
48
+ ### How It Works
49
+
50
+ - Creates actual OS threads (like having multiple workers)
51
+ - Each thread runs independently
52
+ - Ruby's GIL (Global Interpreter Lock; CRuby calls it the GVL, Global VM Lock) means only one thread runs Ruby code at a time
53
+ - **BUT**: When a thread waits for I/O (network, disk, database), other threads can run
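The effect of overlapping waits can be seen with plain Ruby threads. In this minimal sketch, `fake_io` is a hypothetical stand-in that uses `sleep` in place of a real blocking call; no SimpleFlow code is involved.

```ruby
require 'benchmark'

# sleep stands in for a blocking I/O call (HTTP request, query, file read).
def fake_io(id)
  sleep 0.2
  "response #{id}"
end

sequential_time = Benchmark.realtime do
  3.times.map { |i| fake_io(i) }
end

threaded_results = nil
threaded_time = Benchmark.realtime do
  threads = 3.times.map { |i| Thread.new { fake_io(i) } }
  # Thread#value joins the thread and returns the block's result.
  threaded_results = threads.map(&:value)
end

puts format("sequential: %.2fs", sequential_time)
puts format("threaded:   %.2fs", threaded_time)
```

The threaded run finishes in roughly the time of a single wait because a sleeping thread releases the lock and lets the others proceed.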
54
+
55
+ ### Best For
56
+
57
+ - **Simple use cases**: You just want things to run in parallel
58
+ - **Blocking I/O operations**:
59
+ - Making HTTP requests to APIs
60
+ - Reading/writing files
61
+ - Database queries
62
+ - Any "waiting" operations
63
+ - **Mixed libraries**: Works with any Ruby gem (doesn't need async support)
64
+ - **Small-to-medium concurrency**: 10-100 parallel operations
65
+
66
+ ### Resource Usage
67
+
68
+ - Each thread uses ~1-2 MB of memory
69
+ - OS manages thread scheduling
70
+ - Limited by system resources (roughly 100-1,000 threads in practice)
71
+
72
+ ### Example Scenario
73
+
74
+ ```ruby
75
+ # Fetching data from 10 different APIs in parallel
76
+ pipeline = SimpleFlow::Pipeline.new do
77
+ step :validate, validator, depends_on: []
78
+
79
+ # These 10 API calls run in parallel with threads
80
+ step :api_1, ->(r) { r.with_context(:api_1, fetch_api_1) }, depends_on: [:validate]
81
+ step :api_2, ->(r) { r.with_context(:api_2, fetch_api_2) }, depends_on: [:validate]
82
+ # ... 8 more API calls
83
+
84
+ step :merge, merger, depends_on: [:api_1, :api_2, ...]
85
+ end
86
+
87
+ # Each API call takes 500ms, threads let them all wait simultaneously
88
+ # Total time: ~500ms instead of 5 seconds
89
+ result = pipeline.call_parallel(initial_data)
90
+ ```
91
+
92
+ ---
93
+
94
+ ## Async Gem (Fiber-based)
95
+
96
+ ### How It Works
97
+
98
+ - Uses Ruby "fibers" (lightweight green threads)
99
+ - Cooperative scheduling (fibers yield control when waiting)
100
+ - Event loop manages thousands of concurrent operations
101
+ - Requires async-aware libraries (async-http, async-postgres, etc.)
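Cooperative scheduling can be demonstrated with Ruby's core `Fiber` class alone. The sketch below is a toy loop, not the async gem's reactor; `make_worker` and the log messages are hypothetical names for illustration.

```ruby
require 'fiber' # needed for Fiber#alive? on older Rubies; harmless on newer ones

log = []

# Each fiber does a little work, then yields where real code would wait on I/O.
make_worker = lambda do |name|
  Fiber.new do
    log << "#{name}: start request"
    Fiber.yield # hand control back instead of blocking
    log << "#{name}: handle response"
  end
end

workers = [make_worker.call("a"), make_worker.call("b")]

# A toy "event loop": keep resuming fibers until all have finished.
until workers.empty?
  workers.each(&:resume)
  workers.select!(&:alive?)
end

puts log
```

Both workers start before either finishes, all on a single thread. A real event loop resumes a fiber when its I/O is ready rather than in round-robin order.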
102
+
103
+ ### Best For
104
+
105
+ - **High concurrency**: Thousands of simultaneous operations
106
+ - **I/O-heavy applications**: Web scrapers, API gateways, chat servers
107
+ - **Long-running services**: Background workers processing many jobs
108
+ - **Async-compatible stack**: When using async-aware gems
109
+
110
+ ### Resource Usage
111
+
112
+ - Each fiber uses ~4-8 KB of memory (roughly 250x lighter than a thread)
113
+ - Can handle 10,000+ concurrent operations
114
+ - More efficient CPU and memory usage
115
+
116
+ ### Example Scenario
117
+
118
+ ```ruby
119
+ # Web scraper fetching 10,000 product pages
120
+ require 'async'
121
+ require 'async/http/internet'
122
+
123
+ pipeline = SimpleFlow::Pipeline.new do
124
+ step :load_urls, url_loader, depends_on: []
125
+
126
+ # With async gem, can handle thousands of concurrent requests
127
+ step :fetch_pages, ->(result) {
128
+ urls = result.value[:urls]
129
+ internet = Async::HTTP::Internet.new
+ pages = Sync { urls.map { |url| Async { internet.get(url).read } }.map(&:wait) }
130
+ result.with_context(:pages, pages).continue(result.value)
131
+ }, depends_on: [:load_urls]
132
+
133
+ step :parse_data, parser, depends_on: [:fetch_pages]
134
+ end
135
+
136
+ # With threads: Would crash or be very slow (10,000 threads = 10+ GB RAM)
137
+ # With async: Handles it smoothly (10,000 fibers = ~80 MB RAM)
138
+ result = pipeline.call_parallel(initial_data)
139
+ ```
140
+
141
+ ---
142
+
143
+ ## Decision Guide
144
+
145
+ ### Use Threads (no async gem) when:
146
+
147
+ ✅ You have **10-100 parallel operations**
148
+ ✅ Using **standard Ruby gems** (not async-compatible)
149
+ ✅ Making **database queries** or **HTTP requests** with traditional libraries
150
+ ✅ You want **simple, straightforward code**
151
+ ✅ Building **internal tools** or **scripts**
152
+
153
+ **Example:**
154
+ ```ruby
155
+ # E-commerce checkout: Check inventory, calculate shipping, process payment
156
+ # 3-5 parallel operations, standard libraries
157
+
158
+ # Option 1: Auto-detect (uses threads since no async gem needed)
159
+ pipeline = SimpleFlow::Pipeline.new do
160
+ step :validate_order, validator, depends_on: []
161
+ step :check_inventory, inventory_checker, depends_on: [:validate_order]
162
+ step :calculate_shipping, shipping_calculator, depends_on: [:validate_order]
163
+ step :process_payment, payment_processor, depends_on: [:check_inventory, :calculate_shipping]
164
+ end
165
+
166
+ # Option 2: Explicitly use threads (works even if async gem is installed)
167
+ pipeline = SimpleFlow::Pipeline.new(concurrency: :threads) do
168
+ step :validate_order, validator, depends_on: []
169
+ step :check_inventory, inventory_checker, depends_on: [:validate_order]
170
+ step :calculate_shipping, shipping_calculator, depends_on: [:validate_order]
171
+ step :process_payment, payment_processor, depends_on: [:check_inventory, :calculate_shipping]
172
+ end
173
+
174
+ result = pipeline.call_parallel(order) # ✅ Threads work great
175
+ ```
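The `depends_on` declarations in the checkout example imply which steps can run at the same time. One way to picture this is to group steps into levels, as in the sketch below. `parallel_levels` is a hypothetical helper written for this illustration, not the gem's dependency graph implementation.

```ruby
# Group steps into levels: every step in a level has all of its
# dependencies satisfied by earlier levels, so a level can run concurrently.
def parallel_levels(dependencies)
  remaining = dependencies.dup
  done = []
  levels = []

  until remaining.empty?
    ready = remaining.select { |_step, deps| (deps - done).empty? }.keys
    raise ArgumentError, "circular or missing dependency" if ready.empty?

    levels << ready
    done.concat(ready)
    ready.each { |step| remaining.delete(step) }
  end

  levels
end

checkout = {
  validate_order: [],
  check_inventory: [:validate_order],
  calculate_shipping: [:validate_order],
  process_payment: [:check_inventory, :calculate_shipping]
}

p parallel_levels(checkout)
# => [[:validate_order], [:check_inventory, :calculate_shipping], [:process_payment]]
```

Here `check_inventory` and `calculate_shipping` land in the same level, which is why they are candidates for concurrent execution.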
176
+
177
+ ### Use Async (add async gem) when:
178
+
179
+ ✅ You need **1,000+ concurrent operations**
180
+ ✅ Building **high-performance web services**
181
+ ✅ Processing **large-scale I/O operations** (web scraping, bulk APIs)
182
+ ✅ Using **async-compatible libraries** (async-http, async-postgres)
183
+ ✅ Optimizing **resource usage** (hosting costs, memory limits)
184
+
185
+ **Example:**
186
+ ```ruby
187
+ # Monitoring service checking 5,000 endpoints every minute
188
+ # Need low memory footprint and high concurrency
189
+
190
+ # Gemfile:
191
+ gem 'async', '~> 2.0'
192
+ gem 'async-http', '~> 0.60'
193
+
194
+ # Explicitly require async concurrency for this high-volume pipeline
195
+ pipeline = SimpleFlow::Pipeline.new(concurrency: :async) do
196
+ step :load_endpoints, endpoint_loader, depends_on: []
197
+
198
+ # Async gem allows 5,000 concurrent health checks efficiently
199
+ step :check_all, health_checker, depends_on: [:load_endpoints]
200
+
201
+ step :aggregate_results, aggregator, depends_on: [:check_all]
202
+ end
203
+
204
+ result = pipeline.call_parallel(config) # ✅ Async is essential
205
+ # Raises error if async gem not installed
206
+ ```
207
+
208
+ ---
209
+
210
+ ## Quick Comparison Table
211
+
212
+ | Factor | Ruby Threads | Async Gem |
213
+ |--------|-------------|-----------|
214
+ | **Setup** | None (built-in) | `gem 'async'` |
215
+ | **Concurrency Limit** | ~100-1,000 | ~10,000+ |
216
+ | **Memory per operation** | 1-2 MB | 4-8 KB |
217
+ | **Library compatibility** | Any Ruby gem | Needs async-aware gems |
218
+ | **Learning curve** | Simple | Moderate |
219
+ | **Speed (I/O)** | Fast | Faster |
220
+ | **Speed (CPU)** | GIL-limited | GIL-limited (same) |
221
+ | **Best use case** | Standard apps | High-concurrency services |
222
+
223
+ ---
224
+
225
+ ## Real-World Analogy
226
+
227
+ **Threads** = Hiring separate workers
228
+ - Each worker has their own desk, phone, computer (more resources)
229
+ - Can have 50-100 workers before office gets crowded
230
+ - Workers use regular tools everyone knows
231
+ - Easy to manage
232
+
233
+ **Async** = One worker with a really efficient task list
234
+ - Worker rapidly switches between tasks when waiting
235
+ - Can juggle 10,000 tasks because they're mostly waiting anyway
236
+ - Needs special tools designed for rapid task-switching
237
+ - More efficient but requires planning
238
+
239
+ ---
240
+
241
+ ## Switching Between Models
242
+
243
+ The beauty of SimpleFlow is that you can switch between concurrency models without changing your pipeline code:
244
+
245
+ ### Starting with Threads
246
+
247
+ ```ruby
248
+ # Gemfile - no async gem
249
+ gem 'simple_flow'
250
+
251
+ # Your pipeline code
252
+ pipeline = SimpleFlow::Pipeline.new do
253
+ step :fetch_user, user_fetcher, depends_on: []
254
+ step :fetch_orders, order_fetcher, depends_on: [:fetch_user]
255
+ step :fetch_products, product_fetcher, depends_on: [:fetch_user]
256
+ end
257
+
258
+ result = pipeline.call_parallel(data) # Uses threads
259
+ ```
260
+
261
+ ### Upgrading to Async
262
+
263
+ ```ruby
264
+ # Gemfile - add async gem
265
+ gem 'simple_flow'
266
+ gem 'async', '~> 2.0'
267
+
268
+ # Same pipeline code - no changes needed!
269
+ pipeline = SimpleFlow::Pipeline.new do
270
+ step :fetch_user, user_fetcher, depends_on: []
271
+ step :fetch_orders, order_fetcher, depends_on: [:fetch_user]
272
+ step :fetch_products, product_fetcher, depends_on: [:fetch_user]
273
+ end
274
+
275
+ result = pipeline.call_parallel(data) # Now uses async automatically
276
+ ```
277
+
278
+ ### Mixing Concurrency Models in One Application
279
+
280
+ You can use different concurrency models for different pipelines in the same application:
281
+
282
+ ```ruby
283
+ # Gemfile - include async for high-volume pipelines
284
+ gem 'simple_flow'
285
+ gem 'async', '~> 2.0'
286
+
287
+ # Low-volume pipeline: Use threads for simplicity
288
+ user_pipeline = SimpleFlow::Pipeline.new(concurrency: :threads) do
289
+ step :validate, validator, depends_on: []
290
+ step :fetch_profile, profile_fetcher, depends_on: [:validate]
291
+ step :fetch_preferences, prefs_fetcher, depends_on: [:validate]
292
+ end
293
+
294
+ # High-volume pipeline: Use async for efficiency
295
+ monitoring_pipeline = SimpleFlow::Pipeline.new(concurrency: :async) do
296
+ step :load_endpoints, endpoint_loader, depends_on: []
297
+ step :check_all, health_checker, depends_on: [:load_endpoints]
298
+ step :alert, alerter, depends_on: [:check_all]
299
+ end
300
+
301
+ # Each pipeline uses its configured concurrency model
302
+ user_result = user_pipeline.call_parallel(user_data) # Uses threads
303
+ monitoring_result = monitoring_pipeline.call_parallel(config) # Uses async
304
+ ```
305
+
306
+ This allows you to optimize each pipeline based on its specific requirements!
307
+
308
+ ---
309
+
310
+ ## Performance Characteristics
311
+
312
+ ### I/O-Bound Operations
313
+
314
+ Both threads and async excel at I/O-bound operations (network, disk, database):
315
+
316
+ ```ruby
317
+ # API calls, database queries, file operations
318
+ # Both models provide significant speedup over sequential execution
319
+
320
+ # Sequential: 10 API calls × 200ms = 2000ms
321
+ # Threads: 10 API calls in parallel = ~200ms
322
+ # Async: 10 API calls in parallel = ~200ms
323
+
324
+ # Winner: Tie (both are fast for moderate I/O)
325
+ ```
326
+
327
+ ### High Concurrency (1000+ operations)
328
+
329
+ Async shines when dealing with thousands of concurrent operations:
330
+
331
+ ```ruby
332
+ # 5,000 concurrent HTTP requests
333
+
334
+ # Threads: 5,000 threads × 1.5 MB = 7.5 GB RAM ❌
335
+ # Async: 5,000 fibers × 6 KB = 30 MB RAM ✅
336
+
337
+ # Winner: Async (dramatically lower resource usage)
338
+ ```
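The memory figures above follow from simple multiplication, worked through here. The per-thread and per-fiber sizes are the rough estimates quoted in this guide, not measurements, and `estimated_mb` is a hypothetical helper.

```ruby
# Rough per-unit estimates from this guide (decimal units: 1 MB = 1,000 KB).
THREAD_KB = 1_500 # ~1.5 MB per thread
FIBER_KB  = 6     # ~6 KB per fiber

def estimated_mb(count, kb_each)
  count * kb_each / 1000.0
end

operations = 5_000
puts "threads: #{estimated_mb(operations, THREAD_KB)} MB"
puts "fibers:  #{estimated_mb(operations, FIBER_KB)} MB"
puts "ratio:   #{THREAD_KB / FIBER_KB}x"
```

Actual usage depends on the Ruby version, the platform, and what each step allocates, so measure under realistic load before relying on these numbers.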
339
+
340
+ ### CPU-Bound Operations
341
+
342
+ Neither model helps with pure CPU work due to Ruby's GIL:
343
+
344
+ ```ruby
345
+ # Heavy computation (image processing, data crunching)
346
+ # GIL ensures only one thread/fiber does CPU work at a time
347
+
348
+ # Sequential: 1000ms
349
+ # Threads: 1000ms (GIL limitation)
350
+ # Async: 1000ms (GIL limitation)
351
+
352
+ # Winner: None (use process-based parallelism for CPU work)
353
+ ```
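Process-based parallelism sidesteps the lock because each process has its own interpreter. The sketch below uses `fork` and pipes from Ruby's core library. `crunch` and `crunch_in_processes` are hypothetical names, the workload is a placeholder, and `fork` is not available on Windows or JRuby.

```ruby
# CPU-heavy stand-in: sum of squares over a range.
def crunch(range)
  range.sum { |n| n * n }
end

# Run one child process per chunk and collect each result through a pipe.
def crunch_in_processes(chunks)
  children = chunks.map do |chunk|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.write(Marshal.dump(crunch(chunk)))
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    writer.close # the parent only reads
    [pid, reader]
  end

  children.sum do |pid, reader|
    result = Marshal.load(reader.read)
    reader.close
    Process.wait(pid)
    result
  end
end

chunks = [(1..50_000), (50_001..100_000)]
total = crunch_in_processes(chunks)
puts total == crunch(1..100_000)
```

Forking has its own costs (process startup, copying results between processes), so it pays off only when each chunk does substantial work.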
354
+
355
+ ---
356
+
357
+ ## Common Questions
358
+
359
+ ### Q: Can I use both in the same application?
360
+
361
+ **A:** Yes! By default SimpleFlow detects whether async is available and uses it. To use different models for different pipelines in the same app, pass `concurrency: :threads` or `concurrency: :async` when creating each pipeline.
362
+
363
+ ### Q: Do I need to change my code to switch models?
364
+
365
+ **A:** No! Just add or remove the `async` gem from your Gemfile. Your pipeline code stays the same.
366
+
367
+ ### Q: What if I'm not sure which to use?
368
+
369
+ **A:** Start without async (use threads). It's simpler and works great for most use cases. Add async later if you need it.
370
+
371
+ ### Q: Can I check which model is being used?
372
+
373
+ **A:** Yes! Use the `async_available?` method to see whether async is available, and therefore which model automatic detection selects:
374
+
375
+ ```ruby
376
+ pipeline = SimpleFlow::Pipeline.new
377
+ puts "Using async: #{pipeline.async_available?}"
378
+ ```
379
+
380
+ ### Q: Are there any compatibility issues with async?
381
+
382
+ **A:** Async requires async-aware libraries for best results:
383
+ - Use `async-http` instead of `net/http` or `httparty`
384
+ - Use `async-postgres` instead of `pg`
385
+ - Check if your favorite gems have async versions
386
+
387
+ With threads, any Ruby gem works out of the box.
388
+
389
+ ---
390
+
391
+ ## Recommendations
392
+
393
+ ### For Most Users
394
+
395
+ **Start with threads (no async gem):**
396
+ - Simpler setup
397
+ - Works with any library
398
+ - Sufficient for most applications
399
+ - Easy to understand and debug
400
+
401
+ ### Upgrade to Async When
402
+
403
+ You experience any of these:
404
+ - ⚠️ High memory usage from threads
405
+ - ⚠️ Need more than 100 concurrent operations
406
+ - ⚠️ Building high-throughput services
407
+ - ⚠️ Already using async-compatible libraries
408
+ - ⚠️ Hosting costs driven by memory usage
409
+
410
+ ### Migration Path
411
+
412
+ 1. **Start**: Build with threads (no dependencies)
413
+ 2. **Measure**: Profile your application under realistic load
414
+ 3. **Decide**: If you hit thread limits, add async gem
415
+ 4. **Switch**: Add the gem to your Gemfile; no code changes are needed
416
+ 5. **Optimize**: Gradually adopt async-aware libraries for better performance
417
+
418
+ ---
419
+
420
+ ## Next Steps
421
+
422
+ - [Parallel Execution](../concurrent/parallel-steps.md) - Deep dive into parallel execution patterns
423
+ - [Performance](../concurrent/performance.md) - Benchmarking and optimization tips
424
+ - [Best Practices](../concurrent/best-practices.md) - Concurrent programming patterns
425
+ - [Error Handling](error-handling.md) - Handling errors in parallel pipelines
426
+
427
+ ---
428
+
429
+ ## Summary
430
+
431
+ | Your Scenario | Recommendation |
432
+ |--------------|----------------|
433
+ | Building internal tools, scripts | ✅ **Threads** (no async) |
434
+ | Standard web app with DB queries | ✅ **Threads** (no async) |
435
+ | Processing 10-100 parallel tasks | ✅ **Threads** (no async) |
436
+ | High-volume API gateway | ✅ **Async** (add gem) |
437
+ | Web scraper (1000+ requests) | ✅ **Async** (add gem) |
438
+ | Real-time chat/notifications | ✅ **Async** (add gem) |
439
+ | Background job processor | ✅ **Async** (add gem) |
440
+
441
+ **Remember:** You can always start simple (threads) and upgrade to async later without changing your pipeline code!