simple_flow 0.1.0

Files changed (80)
  1. checksums.yaml +7 -0
  2. data/.envrc +1 -0
  3. data/.github/workflows/deploy-github-pages.yml +52 -0
  4. data/.rubocop.yml +57 -0
  5. data/CHANGELOG.md +4 -0
  6. data/COMMITS.md +196 -0
  7. data/LICENSE +21 -0
  8. data/README.md +481 -0
  9. data/Rakefile +15 -0
  10. data/benchmarks/parallel_vs_sequential.rb +98 -0
  11. data/benchmarks/pipeline_overhead.rb +130 -0
  12. data/docs/api/middleware.md +468 -0
  13. data/docs/api/parallel-step.md +363 -0
  14. data/docs/api/pipeline.md +382 -0
  15. data/docs/api/result.md +375 -0
  16. data/docs/concurrent/best-practices.md +687 -0
  17. data/docs/concurrent/introduction.md +246 -0
  18. data/docs/concurrent/parallel-steps.md +418 -0
  19. data/docs/concurrent/performance.md +481 -0
  20. data/docs/core-concepts/flow-control.md +452 -0
  21. data/docs/core-concepts/middleware.md +389 -0
  22. data/docs/core-concepts/overview.md +219 -0
  23. data/docs/core-concepts/pipeline.md +315 -0
  24. data/docs/core-concepts/result.md +168 -0
  25. data/docs/core-concepts/steps.md +391 -0
  26. data/docs/development/benchmarking.md +443 -0
  27. data/docs/development/contributing.md +380 -0
  28. data/docs/development/dagwood-concepts.md +435 -0
  29. data/docs/development/testing.md +514 -0
  30. data/docs/getting-started/examples.md +197 -0
  31. data/docs/getting-started/installation.md +62 -0
  32. data/docs/getting-started/quick-start.md +218 -0
  33. data/docs/guides/choosing-concurrency-model.md +441 -0
  34. data/docs/guides/complex-workflows.md +440 -0
  35. data/docs/guides/data-fetching.md +478 -0
  36. data/docs/guides/error-handling.md +635 -0
  37. data/docs/guides/file-processing.md +505 -0
  38. data/docs/guides/validation-patterns.md +496 -0
  39. data/docs/index.md +169 -0
  40. data/examples/.gitignore +3 -0
  41. data/examples/01_basic_pipeline.rb +112 -0
  42. data/examples/02_error_handling.rb +178 -0
  43. data/examples/03_middleware.rb +186 -0
  44. data/examples/04_parallel_automatic.rb +221 -0
  45. data/examples/05_parallel_explicit.rb +279 -0
  46. data/examples/06_real_world_ecommerce.rb +288 -0
  47. data/examples/07_real_world_etl.rb +277 -0
  48. data/examples/08_graph_visualization.rb +246 -0
  49. data/examples/09_pipeline_visualization.rb +266 -0
  50. data/examples/10_concurrency_control.rb +235 -0
  51. data/examples/11_sequential_dependencies.rb +243 -0
  52. data/examples/12_none_constant.rb +161 -0
  53. data/examples/README.md +374 -0
  54. data/examples/regression_test/01_basic_pipeline.txt +38 -0
  55. data/examples/regression_test/02_error_handling.txt +92 -0
  56. data/examples/regression_test/03_middleware.txt +61 -0
  57. data/examples/regression_test/04_parallel_automatic.txt +86 -0
  58. data/examples/regression_test/05_parallel_explicit.txt +80 -0
  59. data/examples/regression_test/06_real_world_ecommerce.txt +53 -0
  60. data/examples/regression_test/07_real_world_etl.txt +58 -0
  61. data/examples/regression_test/08_graph_visualization.txt +429 -0
  62. data/examples/regression_test/09_pipeline_visualization.txt +305 -0
  63. data/examples/regression_test/10_concurrency_control.txt +96 -0
  64. data/examples/regression_test/11_sequential_dependencies.txt +86 -0
  65. data/examples/regression_test/12_none_constant.txt +64 -0
  66. data/examples/regression_test.rb +105 -0
  67. data/lib/simple_flow/dependency_graph.rb +120 -0
  68. data/lib/simple_flow/dependency_graph_visualizer.rb +326 -0
  69. data/lib/simple_flow/middleware.rb +36 -0
  70. data/lib/simple_flow/parallel_executor.rb +80 -0
  71. data/lib/simple_flow/pipeline.rb +405 -0
  72. data/lib/simple_flow/result.rb +88 -0
  73. data/lib/simple_flow/step_tracker.rb +58 -0
  74. data/lib/simple_flow/version.rb +5 -0
  75. data/lib/simple_flow.rb +41 -0
  76. data/mkdocs.yml +146 -0
  77. data/pipeline_graph.dot +51 -0
  78. data/pipeline_graph.html +60 -0
  79. data/pipeline_graph.mmd +19 -0
  80. metadata +127 -0
@@ -0,0 +1,435 @@
1
+ # Dagwood Concepts Analysis
2
+
3
+ ## Overview
4
+
5
+ [Dagwood](https://github.com/MadBomber/dagwood) is a Ruby gem for dependency graph analysis and resolution ordering using topologically sorted directed acyclic graphs (DAGs).
6
+
7
+ ## Key Dagwood Concepts
8
+
9
+ ### 1. Dependency Declaration
10
+
11
+ Dagwood explicitly declares dependencies between tasks:
12
+
13
+ ```ruby
14
+ graph = Dagwood::DependencyGraph.new(
15
+ add_mustard: [:slice_bread],
16
+ add_smoked_meat: [:slice_bread],
17
+ close_sandwich: [:add_mustard, :add_smoked_meat]
18
+ )
19
+ ```
20
+
21
+ ### 2. Automatic Parallel Detection
22
+
23
+ The `parallel_order` method **automatically groups** tasks that can run concurrently:
24
+
25
+ ```ruby
26
+ graph.parallel_order
27
+ # => [[:slice_bread], [:add_mustard, :add_smoked_meat], [:close_sandwich]]
28
+ ```
29
+
30
+ Tasks in the same nested array can run in parallel because none of them depends on another and all of their dependencies appear in earlier groups.
31
+
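The grouping rule can be sketched in plain Ruby without the gem. This is a minimal illustration of level-by-level grouping, not Dagwood's actual implementation; `parallel_levels` is a hypothetical helper name, and the input is assumed to be a hash mapping each task to the tasks it depends on.

```ruby
# Hypothetical helper, not part of Dagwood or SimpleFlow.
# deps maps each task to the list of tasks it depends on.
def parallel_levels(deps)
  # Collect every task, including ones that appear only as dependencies.
  remaining = (deps.keys + deps.values.flatten).uniq
  done = []
  levels = []

  until remaining.empty?
    # A task is ready once all of its dependencies have completed.
    ready = remaining.select { |task| (deps.fetch(task, []) - done).empty? }
    raise ArgumentError, "dependency cycle detected" if ready.empty?

    levels << ready
    done.concat(ready)
    remaining -= ready
  end

  levels
end

sandwich = {
  add_mustard: [:slice_bread],
  add_smoked_meat: [:slice_bread],
  close_sandwich: [:add_mustard, :add_smoked_meat]
}

p parallel_levels(sandwich)
# => [[:slice_bread], [:add_mustard, :add_smoked_meat], [:close_sandwich]]
```

Tasks in one level do not have to share dependencies: in this sketch `parallel_levels({ b: [:a], d: [:c] })` places `:b` and `:d` in the same level even though they depend on different tasks.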
32
+ ### 3. Serial Ordering
33
+
34
+ The `order` method provides topologically sorted execution order:
35
+
36
+ ```ruby
37
+ graph.order
38
+ # => [:slice_bread, :add_mustard, :add_smoked_meat, :close_sandwich]
39
+ ```
40
+
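For comparison, Ruby's standard library can produce a serial order for the same dependency hash. The sketch below uses `TSort.tsort` from the `tsort` library; `serial_order` and `DEPS` are hypothetical names chosen for illustration, and nothing here is claimed to match how Dagwood computes its order internally.

```ruby
require 'tsort'

# Same shape as the Dagwood example: task => tasks it depends on.
DEPS = {
  add_mustard: [:slice_bread],
  add_smoked_meat: [:slice_bread],
  close_sandwich: [:add_mustard, :add_smoked_meat]
}.freeze

# Hypothetical helper: returns tasks with every dependency listed before its dependents.
def serial_order(deps)
  each_node  = ->(&block) { deps.each_key(&block) }
  # Tasks that appear only as dependencies have no entry, so default to [].
  each_child = ->(task, &block) { deps.fetch(task, []).each(&block) }
  TSort.tsort(each_node, each_child)
end

p serial_order(DEPS)
# => [:slice_bread, :add_mustard, :add_smoked_meat, :close_sandwich]
```

`TSort.tsort` raises `TSort::Cyclic` when the graph contains a cycle, and calling `.reverse` on the result gives a teardown order.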
41
+ ### 4. Reverse Ordering
42
+
43
+ The `reverse_order` method enables teardown/cleanup operations:
44
+
45
+ ```ruby
46
+ graph.reverse_order
47
+ # => [:close_sandwich, :add_smoked_meat, :add_mustard, :slice_bread]
48
+ ```
49
+
50
+ ### 5. Subgraphs
51
+
52
+ Extract dependency chains for specific nodes:
53
+
54
+ ```ruby
55
+ subgraph = graph.subgraph(:add_mustard)
56
+ subgraph.order
57
+ # => [:slice_bread, :add_mustard]
58
+ ```
59
+
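Conceptually, a subgraph is the target node plus everything it transitively depends on. A minimal sketch of that closure follows, assuming the same task-to-dependencies hash as above; `dependency_closure` is a hypothetical helper, not a Dagwood method.

```ruby
# Hypothetical helper: restricts a dependency hash to `target`
# and every task it transitively depends on.
def dependency_closure(deps, target)
  needed = []
  stack = [target]

  until stack.empty?
    task = stack.pop
    next if needed.include?(task) # already visited

    needed << task
    stack.concat(deps.fetch(task, []))
  end

  # Rebuild a hash containing only the tasks that were reached.
  needed.to_h { |task| [task, deps.fetch(task, [])] }
end

sandwich = {
  add_mustard: [:slice_bread],
  add_smoked_meat: [:slice_bread],
  close_sandwich: [:add_mustard, :add_smoked_meat]
}

p dependency_closure(sandwich, :add_mustard).keys
# => [:add_mustard, :slice_bread]
```

The returned hash has the same shape as the input, so it could be fed back into an ordering routine to get the partial execution order.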
60
+ ### 6. Graph Merging
61
+
62
+ Combine multiple dependency graphs:
63
+
64
+ ```ruby
65
+ ultimate_recipe = recipe1.merge(recipe2)
66
+ ```
67
+
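One plausible way to think about merging is a union of the two dependency hashes, where a task declared in both graphs keeps the dependencies from each. The sketch below shows that idea only; `merge_dependencies` and the `:open_jar` task are hypothetical, and how Dagwood itself resolves duplicate declarations is not stated here.

```ruby
# Hypothetical helper: unions two task => dependencies hashes.
# When both hashes declare the same task, its dependency lists are combined.
def merge_dependencies(first, second)
  first.merge(second) { |_task, a, b| a | b }
end

recipe1 = { add_mustard: [:slice_bread] }
recipe2 = { add_mustard: [:open_jar], close_sandwich: [:add_mustard] }

combined = merge_dependencies(recipe1, recipe2)
p combined[:add_mustard]    # => [:slice_bread, :open_jar]
p combined[:close_sandwich] # => [:add_mustard]
```

Because `Hash#merge` returns a new hash, neither input graph is modified, which keeps the original recipes reusable.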
68
+ ## Potential Improvements for SimpleFlow
69
+
70
+ ### 1. ⭐ Automatic Parallel Detection (High Value)
71
+
72
+ **Current SimpleFlow (Manual):**
73
+ ```ruby
74
+ pipeline = SimpleFlow::Pipeline.new do
75
+ step ->(result) { fetch_user(result) }
76
+
77
+ # User must manually identify parallel steps
78
+ parallel do
79
+ step ->(result) { fetch_orders(result) }
80
+ step ->(result) { fetch_preferences(result) }
81
+ end
82
+ end
83
+ ```
84
+
85
+ **With Dagwood Concepts (Automatic):**
86
+ ```ruby
87
+ pipeline = SimpleFlow::Pipeline.new do
88
+ step :fetch_user
89
+ step :fetch_orders, depends_on: [:fetch_user]
90
+ step :fetch_preferences, depends_on: [:fetch_user]
91
+ step :fetch_analytics # No dependencies, so it runs in level 0 alongside :fetch_user
92
+ step :aggregate, depends_on: [:fetch_orders, :fetch_preferences]
93
+ end
94
+
95
+ # Pipeline automatically determines:
96
+ # Level 0: [:fetch_analytics, :fetch_user] (parallel)
97
+ # Level 1: [:fetch_orders, :fetch_preferences] (parallel)
98
+ # Level 2: [:aggregate]
99
+ ```
100
+
101
+ **Benefits:**
102
+ - No manual `parallel` blocks needed
103
+ - Automatic optimization
104
+ - Clearer dependency relationships
105
+ - Easier to maintain
106
+
107
+ ### 2. ⭐ Pipeline Composition (High Value)
108
+
109
+ **Merge Multiple Pipelines:**
110
+ ```ruby
111
+ user_flow = SimpleFlow::Pipeline.new do
112
+ step :fetch_user
113
+ step :validate_user, depends_on: [:fetch_user]
114
+ end
115
+
116
+ order_flow = SimpleFlow::Pipeline.new do
117
+ step :fetch_orders
118
+ step :calculate_total, depends_on: [:fetch_orders]
119
+ end
120
+
121
+ # Merge pipelines
122
+ combined = user_flow.merge(order_flow)
123
+ combined.parallel_order
124
+ # Automatically detects:
125
+ # Level 0: [:fetch_user, :fetch_orders] (parallel)
126
+ # Level 1: [:validate_user, :calculate_total] (parallel)
127
+ ```
128
+
129
+ **Benefits:**
130
+ - Reusable pipeline components
131
+ - Compose complex workflows from simple ones
132
+ - Better modularity
133
+
134
+ ### 3. Reverse/Cleanup Pipelines (Medium Value)
135
+
136
+ **Automatic Teardown:**
137
+ ```ruby
138
+ setup_pipeline = SimpleFlow::Pipeline.new do
139
+ step :create_temp_files
140
+ step :connect_database, depends_on: [:create_temp_files]
141
+ step :load_data, depends_on: [:connect_database]
142
+ end
143
+
144
+ # Automatically generate cleanup
145
+ cleanup_pipeline = setup_pipeline.reverse
146
+ # Executes in reverse dependency order: [:load_data, :connect_database, :create_temp_files]
147
+ ```
148
+
149
+ **Benefits:**
150
+ - Transaction rollback
151
+ - Resource cleanup
152
+ - Error recovery
153
+
154
+ ### 4. Subgraph Extraction (Medium Value)
155
+
156
+ **Partial Pipeline Execution:**
157
+ ```ruby
158
+ full_pipeline = SimpleFlow::Pipeline.new do
159
+ step :fetch_user
160
+ step :fetch_orders, depends_on: [:fetch_user]
161
+ step :fetch_preferences, depends_on: [:fetch_user]
162
+ step :calculate_total, depends_on: [:fetch_orders]
163
+ step :apply_discount, depends_on: [:calculate_total, :fetch_preferences]
164
+ end
165
+
166
+ # Extract only what's needed for calculate_total
167
+ partial = full_pipeline.subgraph(:calculate_total)
168
+ # Includes: [:fetch_user, :fetch_orders, :calculate_total]
169
+ # Excludes: [:fetch_preferences, :apply_discount]
170
+ ```
171
+
172
+ **Benefits:**
173
+ - Run only necessary steps
174
+ - Better performance
175
+ - Easier testing
176
+
177
+ ### 5. Named Steps with Dependency DSL (High Value)
178
+
179
+ **Better than Anonymous Lambdas:**
180
+ ```ruby
181
+ class UserPipeline < SimpleFlow::Pipeline
182
+ define do
183
+ step :validate_input
184
+
185
+ step :fetch_user, depends_on: [:validate_input]
186
+ step :fetch_orders, depends_on: [:fetch_user]
187
+ step :fetch_preferences, depends_on: [:fetch_user]
188
+
189
+ step :enrich_user_data, depends_on: [
190
+ :fetch_user,
191
+ :fetch_orders,
192
+ :fetch_preferences
193
+ ]
194
+ end
195
+
196
+ def validate_input(result)
197
+ # Implementation
198
+ result.continue(result.value)
199
+ end
200
+
201
+ def fetch_user(result)
202
+ # Implementation
203
+ end
204
+
205
+ # ... other step methods
206
+ end
207
+ ```
208
+
209
+ **Benefits:**
210
+ - Better debugging (named methods vs lambdas)
211
+ - Easier testing (test individual methods)
212
+ - Clear dependency visualization
213
+ - Self-documenting code
214
+
215
+ ## Implementation Proposal
216
+
217
+ ### Phase 1: Add Dependency Tracking
218
+
219
+ ```ruby
220
+ # lib/simple_flow/dependency_graph.rb
221
+ require 'dagwood'
222
+
223
+ module SimpleFlow
224
+ class DependencyGraph
225
+ def initialize
226
+ @steps = {}
227
+ @dependencies = {}
228
+ end
229
+
230
+ def add_step(name, callable, depends_on: [])
231
+ @steps[name] = callable
232
+ @dependencies[name] = depends_on
233
+ end
234
+
235
+ def parallel_order
236
+ graph = Dagwood::DependencyGraph.new(@dependencies)
237
+ graph.parallel_order
238
+ end
239
+
240
+ def execute(initial_result)
241
+ parallel_order.each do |level|
242
+ if level.length == 1
243
+ # Sequential step
244
+ step_name = level.first
245
+ initial_result = @steps[step_name].call(initial_result)
246
+ else
247
+ # Parallel steps
248
+ parallel_step = ParallelStep.new(level.map { |name| @steps[name] })
249
+ initial_result = parallel_step.call(initial_result)
250
+ end
251
+
252
+ break unless initial_result.continue?
253
+ end
254
+
255
+ initial_result
256
+ end
257
+ end
258
+ end
259
+ ```
260
+
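The level-by-level execution loop above can be tried in isolation. The following is a minimal, self-contained sketch under stated assumptions: `SketchResult` is a stand-in struct and not `SimpleFlow::Result`, `run_levels` is a hypothetical helper, each step is assumed to return a result whose context is merged into the running context, and plain threads stand in for whatever `ParallelStep` uses.

```ruby
# Stand-in for a pipeline result; NOT SimpleFlow::Result.
SketchResult = Struct.new(:context, :halted) do
  def continue?
    !halted
  end
end

# levels: nested array such as [[:a], [:b, :c]].
# steps:  name => callable taking and returning a SketchResult.
def run_levels(levels, steps, result)
  levels.each do |level|
    # Run each step of the level on its own thread against the same input.
    threads = level.map { |name| Thread.new { steps.fetch(name).call(result) } }
    outputs = threads.map(&:value) # waits for every thread; re-raises step errors

    # Merge every step's context into the running context.
    merged = outputs.reduce(result.context) { |ctx, out| ctx.merge(out.context) }
    result = SketchResult.new(merged, outputs.any?(&:halted))

    break unless result.continue?
  end

  result
end

steps = {
  fetch_user: ->(_r) { SketchResult.new({ user: "ada" }, false) },
  fetch_orders: ->(_r) { SketchResult.new({ orders: [1, 2] }, false) },
  fetch_preferences: ->(_r) { SketchResult.new({ theme: "dark" }, false) },
  aggregate: lambda do |r|
    summary = "#{r.context[:user]}: #{r.context[:orders].size} orders"
    SketchResult.new({ summary: summary }, false)
  end
}

levels = [[:fetch_user], [:fetch_orders, :fetch_preferences], [:aggregate]]
final = run_levels(levels, steps, SketchResult.new({}, false))
p final.context[:summary]
# => "ada: 2 orders"
```

Unlike the Phase 1 proposal, this sketch does not special-case single-step levels; a level of one simply runs on one thread, which keeps the loop uniform at the cost of a small thread overhead.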
261
+ ### Phase 2: Enhanced Pipeline DSL
262
+
263
+ ```ruby
264
+ # lib/simple_flow/pipeline.rb (enhanced)
265
+ module SimpleFlow
266
+ class Pipeline
267
+ def initialize(&config)
268
+ @dependency_graph = DependencyGraph.new
269
+ @steps = []
270
+ @middlewares = []
271
+ instance_eval(&config) if block_given?
272
+ end
273
+
274
+ # New: Named step with dependencies
275
+ def step(name = nil, depends_on: [], &block)
276
+ if name.is_a?(Symbol)
277
+ # Named step with dependencies
278
+ callable = block || method(name)
279
+ @dependency_graph.add_step(name, callable, depends_on: depends_on)
280
+ else
281
+ # Original anonymous step behavior (backward compatible)
282
+ callable = name || block
283
+ @steps << apply_middleware(callable)
284
+ end
285
+ self
286
+ end
287
+
288
+ def call(result)
289
+ if @dependency_graph.has_steps?
290
+ # Use automatic parallel detection (requires a DependencyGraph#has_steps? method, which Phase 1 does not define)
291
+ @dependency_graph.execute(result)
292
+ else
293
+ # Use original sequential/manual parallel execution
294
+ @steps.reduce(result) do |res, step|
295
+ res.respond_to?(:continue?) && !res.continue? ? res : step.call(res)
296
+ end
297
+ end
298
+ end
299
+ end
300
+ end
301
+ ```
302
+
303
+ ### Phase 3: Pipeline Composition
304
+
305
+ ```ruby
306
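+ # lib/simple_flow/pipeline.rb (enhanced); assumes `attr_accessor :dependency_graph` and that DependencyGraph gains merge, reverse, and subgraph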
307
+ class Pipeline
308
+ def merge(other_pipeline)
309
+ merged = Pipeline.new
310
+ merged.dependency_graph = @dependency_graph.merge(other_pipeline.dependency_graph)
311
+ merged
312
+ end
313
+
314
+ def reverse
315
+ reversed = Pipeline.new
316
+ reversed.dependency_graph = @dependency_graph.reverse
317
+ reversed
318
+ end
319
+
320
+ def subgraph(step_name)
321
+ partial = Pipeline.new
322
+ partial.dependency_graph = @dependency_graph.subgraph(step_name)
323
+ partial
324
+ end
325
+ end
326
+ ```
327
+
328
+ ## Usage Examples
329
+
330
+ ### Example 1: Automatic Parallelization
331
+
332
+ ```ruby
333
+ pipeline = SimpleFlow::Pipeline.new do
334
+ step :fetch_user
335
+ step :fetch_orders, depends_on: [:fetch_user]
336
+ step :fetch_preferences, depends_on: [:fetch_user]
337
+ step :fetch_analytics, depends_on: [:fetch_user]
338
+ step :aggregate, depends_on: [:fetch_orders, :fetch_preferences, :fetch_analytics]
339
+ end
340
+
341
+ # Automatically executes as:
342
+ # Level 0: fetch_user
343
+ # Level 1: fetch_orders, fetch_preferences, fetch_analytics (parallel)
344
+ # Level 2: aggregate
345
+ ```
346
+
347
+ ### Example 2: Pipeline Composition
348
+
349
+ ```ruby
350
+ base_validation = SimpleFlow::Pipeline.new do
351
+ step :validate_email
352
+ step :validate_password
353
+ end
354
+
355
+ user_creation = SimpleFlow::Pipeline.new do
356
+ step :create_user, depends_on: [:validate_email, :validate_password]
357
+ step :send_welcome_email, depends_on: [:create_user]
358
+ end
359
+
360
+ full_flow = base_validation.merge(user_creation)
361
+ ```
362
+
363
+ ### Example 3: Cleanup Pipeline
364
+
365
+ ```ruby
366
+ setup = SimpleFlow::Pipeline.new do
367
+ step :allocate_resources
368
+ step :create_connection, depends_on: [:allocate_resources]
369
+ step :initialize_state, depends_on: [:create_connection]
370
+ end
371
+
372
+ # Automatic cleanup in reverse order
373
+ cleanup = setup.reverse
374
+ ```
375
+
376
+ ## Backward Compatibility
377
+
378
+ All enhancements maintain backward compatibility:
379
+
380
+ ```ruby
381
+ # Old style still works
382
+ pipeline = SimpleFlow::Pipeline.new do
383
+ step ->(result) { ... }
384
+ parallel do
385
+ step ->(result) { ... }
386
+ step ->(result) { ... }
387
+ end
388
+ end
389
+
390
+ # New style with dependencies
391
+ pipeline = SimpleFlow::Pipeline.new do
392
+ step :task1
393
+ step :task2, depends_on: [:task1]
394
+ end
395
+
396
+ # Mixed (intended behavior; the Phase 2 `call` sketch runs only the named steps once any are defined)
397
+ pipeline = SimpleFlow::Pipeline.new do
398
+ step :named_task
399
+ step ->(result) { ... } # Anonymous still works
400
+ end
401
+ ```
402
+
403
+ ## Recommendations
404
+
405
+ ### High Priority
406
+ 1. **Dependency tracking and automatic parallelization** - Biggest value add
407
+ 2. **Named steps DSL** - Better debugging and testing
408
+ 3. **Pipeline composition** - Better code reuse
409
+
410
+ ### Medium Priority
411
+ 4. **Reverse pipelines** - Useful for cleanup
412
+ 5. **Subgraph extraction** - Useful for testing
413
+
414
+ ### Low Priority
415
+ 6. **Complex dependency visualization** - Nice to have
416
+
417
+ ## Next Steps
418
+
419
+ 1. Add `dagwood` as a dependency
420
+ 2. Implement `SimpleFlow::DependencyGraph` wrapper
421
+ 3. Enhance `Pipeline` DSL with named steps and `depends_on`
422
+ 4. Add tests for new functionality
423
+ 5. Update documentation with examples
424
+ 6. Maintain 100% backward compatibility
425
+
426
+ ## Conclusion
427
+
428
+ Integrating Dagwood concepts would make SimpleFlow:
429
+ - **Smarter** - Automatic parallel detection
430
+ - **Cleaner** - Declarative dependencies vs manual parallel blocks
431
+ - **More Powerful** - Pipeline composition and merging
432
+ - **Easier to Debug** - Named steps instead of anonymous lambdas
433
+ - **More Testable** - Test individual steps by name
434
+
435
+ The most valuable improvement would be **automatic parallel detection** through dependency declaration, eliminating the need for manual `parallel` blocks while making dependencies explicit and clear.