RubyGems - simple_flow - Versions diffs - 0.1.0 - Mend

simple_flow 0.1.0


Files changed (80)

checksums.yaml +7 -0
data/.envrc +1 -0
data/.github/workflows/deploy-github-pages.yml +52 -0
data/.rubocop.yml +57 -0
data/CHANGELOG.md +4 -0
data/COMMITS.md +196 -0
data/LICENSE +21 -0
data/README.md +481 -0
data/Rakefile +15 -0
data/benchmarks/parallel_vs_sequential.rb +98 -0
data/benchmarks/pipeline_overhead.rb +130 -0
data/docs/api/middleware.md +468 -0
data/docs/api/parallel-step.md +363 -0
data/docs/api/pipeline.md +382 -0
data/docs/api/result.md +375 -0
data/docs/concurrent/best-practices.md +687 -0
data/docs/concurrent/introduction.md +246 -0
data/docs/concurrent/parallel-steps.md +418 -0
data/docs/concurrent/performance.md +481 -0
data/docs/core-concepts/flow-control.md +452 -0
data/docs/core-concepts/middleware.md +389 -0
data/docs/core-concepts/overview.md +219 -0
data/docs/core-concepts/pipeline.md +315 -0
data/docs/core-concepts/result.md +168 -0
data/docs/core-concepts/steps.md +391 -0
data/docs/development/benchmarking.md +443 -0
data/docs/development/contributing.md +380 -0
data/docs/development/dagwood-concepts.md +435 -0
data/docs/development/testing.md +514 -0
data/docs/getting-started/examples.md +197 -0
data/docs/getting-started/installation.md +62 -0
data/docs/getting-started/quick-start.md +218 -0
data/docs/guides/choosing-concurrency-model.md +441 -0
data/docs/guides/complex-workflows.md +440 -0
data/docs/guides/data-fetching.md +478 -0
data/docs/guides/error-handling.md +635 -0
data/docs/guides/file-processing.md +505 -0
data/docs/guides/validation-patterns.md +496 -0
data/docs/index.md +169 -0
data/examples/.gitignore +3 -0
data/examples/01_basic_pipeline.rb +112 -0
data/examples/02_error_handling.rb +178 -0
data/examples/03_middleware.rb +186 -0
data/examples/04_parallel_automatic.rb +221 -0
data/examples/05_parallel_explicit.rb +279 -0
data/examples/06_real_world_ecommerce.rb +288 -0
data/examples/07_real_world_etl.rb +277 -0
data/examples/08_graph_visualization.rb +246 -0
data/examples/09_pipeline_visualization.rb +266 -0
data/examples/10_concurrency_control.rb +235 -0
data/examples/11_sequential_dependencies.rb +243 -0
data/examples/12_none_constant.rb +161 -0
data/examples/README.md +374 -0
data/examples/regression_test/01_basic_pipeline.txt +38 -0
data/examples/regression_test/02_error_handling.txt +92 -0
data/examples/regression_test/03_middleware.txt +61 -0
data/examples/regression_test/04_parallel_automatic.txt +86 -0
data/examples/regression_test/05_parallel_explicit.txt +80 -0
data/examples/regression_test/06_real_world_ecommerce.txt +53 -0
data/examples/regression_test/07_real_world_etl.txt +58 -0
data/examples/regression_test/08_graph_visualization.txt +429 -0
data/examples/regression_test/09_pipeline_visualization.txt +305 -0
data/examples/regression_test/10_concurrency_control.txt +96 -0
data/examples/regression_test/11_sequential_dependencies.txt +86 -0
data/examples/regression_test/12_none_constant.txt +64 -0
data/examples/regression_test.rb +105 -0
data/lib/simple_flow/dependency_graph.rb +120 -0
data/lib/simple_flow/dependency_graph_visualizer.rb +326 -0
data/lib/simple_flow/middleware.rb +36 -0
data/lib/simple_flow/parallel_executor.rb +80 -0
data/lib/simple_flow/pipeline.rb +405 -0
data/lib/simple_flow/result.rb +88 -0
data/lib/simple_flow/step_tracker.rb +58 -0
data/lib/simple_flow/version.rb +5 -0
data/lib/simple_flow.rb +41 -0
data/mkdocs.yml +146 -0
data/pipeline_graph.dot +51 -0
data/pipeline_graph.html +60 -0
data/pipeline_graph.mmd +19 -0
metadata +127 -0

data/docs/core-concepts/pipeline.md ADDED

@@ -0,0 +1,315 @@
+# Pipeline
+The `Pipeline` class is the orchestrator that manages the execution of steps in your data processing workflow.
+## Overview
+A Pipeline defines a sequence of operations (steps) that transform data, with support for:
+- Sequential execution with automatic dependencies
+- Parallel execution (automatic and explicit)
+- Middleware integration
+- Short-circuit evaluation
+- Explicit dependency management
+## Execution Modes
+SimpleFlow pipelines support two distinct execution modes:
+### Sequential Execution (Default)
+**Unnamed steps execute in order, with each step automatically depending on the previous step's success.**
+When a step halts (returns `result.halt`), the pipeline immediately stops and subsequent steps are not executed.
+```ruby
+pipeline = SimpleFlow::Pipeline.new do
+  step ->(result) { puts "Step 1"; result.continue(result.value) }
+  step ->(result) { puts "Step 2"; result.halt("stopped") }
+  step ->(result) { puts "Step 3"; result.continue(result.value) }  # NEVER EXECUTES
+end
+result = pipeline.call(SimpleFlow::Result.new(nil))
+# Output:
+# Step 1
+# Step 2
+# (Step 3 is skipped because Step 2 halted)
+```
+This automatic dependency chain means:
+- Steps execute one at a time in the order they were defined
+- Each step receives the result from the previous step
+- If any step halts, the entire pipeline stops immediately
+- No need to specify dependencies for sequential workflows
+### Parallel Execution
+**Named steps with explicit dependencies can run concurrently using `call_parallel`.**
+```ruby
+pipeline = SimpleFlow::Pipeline.new do
+  step :validate, validator, depends_on: []
+  step :fetch_a, fetcher_a, depends_on: [:validate]  # Runs in parallel with fetch_b
+  step :fetch_b, fetcher_b, depends_on: [:validate]  # Runs in parallel with fetch_a
+  step :merge, merger, depends_on: [:fetch_a, :fetch_b]
+end
+result = pipeline.call_parallel(initial_data)
+```
+See [Parallel Execution](#parallel-execution) below for details.
+## Basic Usage
+```ruby
+require 'simple_flow'
+pipeline = SimpleFlow::Pipeline.new do
+  step ->(result) { result.continue(result.value * 2) }
+  step ->(result) { result.continue(result.value + 10) }
+  step ->(result) { result.continue(result.value.to_s) }
+end
+result = pipeline.call(SimpleFlow::Result.new(5))
+result.value # => "20"
+```
+## Defining Steps
+### Lambda Steps
+```ruby
+pipeline = SimpleFlow::Pipeline.new do
+  step ->(result) do
+    # Process the result
+    new_value = transform(result.value)
+    result.continue(new_value)
+  end
+end
+```
+### Method Steps
+```ruby
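+def validate_user(result)
+  # `present?` is provided by ActiveSupport; outside Rails, require
+  # 'active_support/core_ext/object/blank' or check `.to_s.empty?` instead
+  if result.value[:email].present?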
+    result.continue(result.value)
+  else
+    result.with_error(:validation, 'Email required').halt
+  end
+end
+pipeline = SimpleFlow::Pipeline.new do
+  step method(:validate_user)
+end
+```
+### Callable Objects
+```ruby
+class EmailValidator
+  def call(result)
+    # Validation logic
+    result.continue(result.value)
+  end
+end
+pipeline = SimpleFlow::Pipeline.new do
+  step EmailValidator.new
+end
+```
+## Named Steps with Dependencies
+For parallel execution, you can define named steps with explicit dependencies:
+```ruby
+pipeline = SimpleFlow::Pipeline.new do
+  step :validate, ->(result) { validate(result) }, depends_on: []
+  step :fetch_user, ->(result) { fetch_user(result) }, depends_on: [:validate]
+  step :fetch_orders, ->(result) { fetch_orders(result) }, depends_on: [:validate]
+  step :calculate, ->(result) { calculate(result) }, depends_on: [:fetch_user, :fetch_orders]
+end
+```
+Steps whose dependencies have all been satisfied run in parallel automatically when the pipeline is executed with `call_parallel`.
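One way to picture which steps can run together is to group step names into levels, where each level holds the steps whose dependencies are all satisfied by earlier levels. The sketch below is a standalone illustration in plain Ruby. `parallel_levels` is a hypothetical helper, not part of SimpleFlow, and the gem's actual scheduling may differ.

```ruby
# Hypothetical helper (not SimpleFlow API): groups step names into levels.
# Every step in a level has all of its dependencies in earlier levels.
def parallel_levels(dependencies)
  remaining = dependencies.transform_values(&:dup)
  done = []
  levels = []

  until remaining.empty?
    # A step is ready once none of its dependencies are still outstanding.
    ready = remaining.select { |_name, deps| (deps - done).empty? }.keys
    raise ArgumentError, "cycle or unknown dependency among #{remaining.keys}" if ready.empty?

    levels << ready
    done.concat(ready)
    ready.each { |name| remaining.delete(name) }
  end

  levels
end

graph = {
  validate: [],
  fetch_user: [:validate],
  fetch_orders: [:validate],
  calculate: [:fetch_user, :fetch_orders]
}

p parallel_levels(graph)
# => [[:validate], [:fetch_user, :fetch_orders], [:calculate]]
```

Steps that share a level, such as `:fetch_user` and `:fetch_orders` here, are the candidates for concurrent execution.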
+## Parallel Execution
+### Automatic Parallelization
+```ruby
+# These will run in parallel (both depend only on :validate)
+pipeline = SimpleFlow::Pipeline.new do
+  step :validate, validator, depends_on: []
+  step :fetch_orders, fetch_orders_callable, depends_on: [:validate]
+  step :fetch_products, fetch_products_callable, depends_on: [:validate]
+end
+result = pipeline.call_parallel(initial_result)
+```
+### Explicit Parallel Blocks
+```ruby
+pipeline = SimpleFlow::Pipeline.new do
+  # Sequential step
+  step ->(result) { validate(result) }
+  # These run in parallel
+  parallel do
+    step ->(result) { fetch_from_api(result) }
+    step ->(result) { fetch_from_cache(result) }
+    step ->(result) { fetch_from_database(result) }
+  end
+  # Sequential step
+  step ->(result) { merge_results(result) }
+end
+```
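To illustrate the general idea behind a parallel block, the following plain-Ruby sketch runs several callables in threads against the same input and collects their return values. `run_in_parallel` and the three fetcher lambdas are hypothetical names. How SimpleFlow itself merges the results of parallel steps is not described in this section, so the merge shown here is only an assumption.

```ruby
# Hypothetical stand-in for a parallel block (not SimpleFlow API):
# each callable runs in its own thread and receives the same input.
def run_in_parallel(callables, input)
  threads = callables.map { |callable| Thread.new { callable.call(input) } }
  threads.map(&:value) # Thread#value waits for the thread and returns its result
end

# Simulated I/O-bound fetchers; sleep stands in for network or disk latency.
fetchers = [
  ->(id) { sleep 0.05; { api: "user #{id}" } },
  ->(id) { sleep 0.05; { cache: "prefs #{id}" } },
  ->(id) { sleep 0.05; { database: "orders #{id}" } }
]

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
merged = run_in_parallel(fetchers, 42).reduce({}, :merge)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

p merged.keys # => [:api, :cache, :database]
puts format("elapsed: %.2fs", elapsed)
```

Because the simulated work is waiting rather than computing, the elapsed time should be close to one sleep interval instead of three.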
+## Short-Circuit Evaluation
+**Pipelines automatically stop executing when a step halts.** This is a core feature of sequential execution - each unnamed step implicitly depends on the previous step's success.
+```ruby
+pipeline = SimpleFlow::Pipeline.new do
+  step ->(result) { result.continue("step 1") }
+  step ->(result) { result.halt("stopped") }        # Execution stops here
+  step ->(result) { result.continue("step 3") }     # Never executed
+end
+result = pipeline.call(SimpleFlow::Result.new(nil))
+result.value      # => "stopped"
+result.continue?  # => false
+```
+**Implementation detail:** The `call` method checks `continue?` on the current result before invoking each step. As soon as it returns `false`, the pipeline returns that result without executing the remaining steps:
+```ruby
+# Simplified view of Pipeline#call
+def call(result)
+  steps.reduce(result) do |res, step|
+    return res unless res.continue?  # Short-circuit on halt
+    step.call(res)
+  end
+end
+```
+This behavior ensures:
+- **Fail-fast**: Errors stop processing immediately
+- **Resource efficiency**: No wasted computation on already-failed results
+- **Predictable flow**: Clear execution path based on step outcomes
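The short-circuit behavior can be tried without installing the gem. The sketch below uses `MiniResult`, a hypothetical stand-in that mimics only the `continue`, `halt`, and `continue?` methods described in these docs, and `run_steps`, a hypothetical runner built on `reduce`. Neither is the gem's actual source.

```ruby
# Hypothetical minimal stand-in for SimpleFlow::Result (illustration only).
MiniResult = Struct.new(:value, :continuing) do
  def continue(new_value)
    self.class.new(new_value, true)
  end

  def halt(new_value = nil)
    self.class.new(new_value.nil? ? value : new_value, false)
  end

  def continue?
    continuing
  end
end

# Runs steps in order and stops at the first halted result.
def run_steps(steps, result)
  steps.reduce(result) do |res, step|
    break res unless res.continue?

    step.call(res)
  end
end

log = []
steps = [
  ->(r) { log << "step 1"; r.continue("one") },
  ->(r) { log << "step 2"; r.halt("stopped") },
  ->(r) { log << "step 3"; r.continue("three") }
]

final = run_steps(steps, MiniResult.new(nil, true))
p final.value      # => "stopped"
p final.continue?  # => false
p log              # => ["step 1", "step 2"]
```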
+## Middleware
+Apply cross-cutting concerns using middleware:
+```ruby
+pipeline = SimpleFlow::Pipeline.new do
+  use_middleware SimpleFlow::MiddleWare::Logging
+  use_middleware SimpleFlow::MiddleWare::Instrumentation, api_key: 'my-key'
+  step ->(result) { process(result) }
+end
+```
+[Learn more about Middleware](middleware.md)
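As a rough mental model, middleware can be thought of as an object that wraps a callable and is itself callable. The sketch below is an assumption about that general shape, not SimpleFlow's implementation. `TimingMiddleware` and its `label:` and `log:` options are hypothetical, and this section does not show how the gem instantiates middleware or passes options to it.

```ruby
# Hypothetical middleware shape (not SimpleFlow API): wraps any callable
# and records how long each invocation takes.
class TimingMiddleware
  def initialize(callable, label: "step", log: [])
    @callable = callable
    @label = label
    @log = log
  end

  def call(input)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    output = @callable.call(input)
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @log << format("%s finished in %.4fs", @label, elapsed)
    output # the wrapped callable's return value passes through unchanged
  end
end

log = []
double = ->(n) { n * 2 }
wrapped = TimingMiddleware.new(double, label: "double", log: log)

p wrapped.call(21) # => 42
puts log
```

The wrapper exposes the same `call` interface as the step it wraps, so the two are interchangeable from the caller's point of view.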
+## Visualization
+Pipelines with named steps can be visualized:
+```ruby
+# Generate ASCII visualization
+puts pipeline.visualize_ascii
+# Export to Graphviz DOT format
+File.write('pipeline.dot', pipeline.visualize_dot)
+# Export to Mermaid diagram
+File.write('pipeline.mmd', pipeline.visualize_mermaid)
+# Get execution plan analysis
+puts pipeline.execution_plan
+```
+## API Reference
+### Class Methods
+| Method | Description |
+|--------|-------------|
+| `new(&block)` | Create a new pipeline with DSL block |
+### Instance Methods
+| Method | Description |
+|--------|-------------|
+| `call(result)` | Execute pipeline sequentially |
+| `call_parallel(result, strategy: :auto)` | Execute with parallelization |
+| `dependency_graph` | Get underlying dependency graph |
+| `visualize` | Get visualizer instance |
+| `visualize_ascii(show_groups: true)` | ASCII visualization |
+| `visualize_dot(include_groups: true, orientation: 'TB')` | Graphviz DOT export |
+| `visualize_mermaid` | Mermaid diagram export |
+| `execution_plan` | Performance analysis |
+### DSL Methods (in Pipeline.new block)
+| Method | Description |
+|--------|-------------|
+| `step(callable)` | Add anonymous step |
+| `step(name, callable, depends_on: [])` | Add named step with dependencies |
+| `parallel(&block)` | Define explicit parallel block |
+| `use_middleware(middleware, **options)` | Add middleware |
+## Best Practices
+1. **Keep steps focused**: Each step should do one thing well
+2. **Use meaningful names**: Named steps improve visualization and debugging
+3. **Handle errors gracefully**: Use `.halt` to stop processing on errors
+4. **Leverage context**: Pass metadata between steps with `with_context` and read it back via `result.context`
+5. **Consider parallelization**: Use named steps with dependencies for I/O-bound operations
+6. **Apply middleware judiciously**: Add logging/instrumentation for observability
+## Example: E-Commerce Order Processing
+```ruby
+pipeline = SimpleFlow::Pipeline.new do
+  use_middleware SimpleFlow::MiddleWare::Logging
+  use_middleware SimpleFlow::MiddleWare::Instrumentation
+  step :validate, ->(result) {
+    # Validate order
+    result.continue(result.value)
+  }, depends_on: :none
+  step :check_inventory, ->(result) {
+    # Check stock
+    result.continue(result.value)
+  }, depends_on: [:validate]
+  step :calculate_shipping, ->(result) {
+    # Calculate shipping cost
+    result.continue(result.value)
+  }, depends_on: [:validate]
+  step :process_payment, ->(result) {
+    # Process payment
+    result.continue(result.value)
+  }, depends_on: [:check_inventory, :calculate_shipping]
+  step :send_confirmation, ->(result) {
+    # Send email
+    result.continue(result.value)
+  }, depends_on: [:process_payment]
+end
+```
+## Next Steps
+- [Steps](steps.md) - Deep dive into step implementations
+- [Middleware](middleware.md) - Adding cross-cutting concerns
+- [Parallel Execution](../concurrent/parallel-steps.md) - Concurrent processing patterns
+- [Complex Workflows Guide](../guides/complex-workflows.md) - Real-world examples

data/docs/core-concepts/result.md ADDED

@@ -0,0 +1,168 @@
+# Result
+The `Result` class is the fundamental value object in SimpleFlow that encapsulates the outcome of each operation in your pipeline.
+## Overview
+A `Result` object contains three main components:
+- **Value**: The actual data being processed
+- **Context**: A hash of metadata and contextual information
+- **Errors**: Categorized error messages accumulated during processing
+## Immutability
+Results are immutable - every operation returns a new `Result` instance rather than modifying the existing one. This design promotes safer concurrent operations and functional programming patterns.
+```ruby
+original = SimpleFlow::Result.new("data")
+updated = original.with_context(:user_id, 123)
+original.context  # => {}
+updated.context   # => { user_id: 123 }
+```
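The copy-on-write pattern behind this behavior can be sketched in a few lines of plain Ruby. `SketchResult` is a hypothetical class written for illustration and is not the gem's source. It covers only `value`, `context`, and `with_context`.

```ruby
# Hypothetical stand-in (not the gem's source) showing the immutable,
# copy-on-write style: every "change" builds a new frozen object.
class SketchResult
  attr_reader :value, :context

  def initialize(value, context: {})
    @value = value
    @context = context.dup.freeze
    freeze
  end

  def with_context(key, val)
    # Hash#merge returns a new hash, so the original context is untouched.
    self.class.new(value, context: context.merge(key => val))
  end
end

original = SketchResult.new("data")
updated = original.with_context(:user_id, 123)

p original.context           # => {}
p updated.context[:user_id]  # => 123
p original.frozen?           # => true
```

Because neither object can be modified after construction, both can be shared between threads without locking.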
+## Creating Results
+### Basic Initialization
+```ruby
+# Simple result with just a value
+result = SimpleFlow::Result.new(10)
+# Result with initial context and errors
+result = SimpleFlow::Result.new(
+  { count: 5 },
+  context: { user_id: 123 },
+  errors: { validation: ['Required field missing'] }
+)
+```
+## Working with Context
+Context allows you to pass metadata through your pipeline without modifying the primary value.
+```ruby
+result = SimpleFlow::Result.new(data)
+  .with_context(:user_id, 123)
+  .with_context(:timestamp, Time.now.to_i)
+  .with_context(:source, 'api')
+result.context
+# => { user_id: 123, timestamp: 1234567890, source: 'api' }
+```
+### Common Context Use Cases
+- User authentication details
+- Request timestamps
+- Transaction IDs
+- Debug information
+- Performance metrics
+## Error Handling
+Errors are organized by category, allowing multiple errors per category:
+```ruby
+result = SimpleFlow::Result.new(data)
+  .with_error(:validation, 'Email is required')
+  .with_error(:validation, 'Password too short')
+  .with_error(:authentication, 'Invalid token')
+result.errors
+# => {
+#   validation: ['Email is required', 'Password too short'],
+#   authentication: ['Invalid token']
+# }
+```
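Accumulating errors by category without mutating the previous hash can be done with `Hash#merge`. The sketch below is one possible approach in plain Ruby. `add_error` is a hypothetical helper, and the gem's internal implementation of `with_error` may differ.

```ruby
# Hypothetical helper (not SimpleFlow API): returns a new errors hash with
# the message appended to its category, leaving the input hash unchanged.
def add_error(errors, category, message)
  # [*nil, message] yields [message], so new categories need no special case.
  errors.merge(category => [*errors[category], message])
end

errors = {}
errors = add_error(errors, :validation, "Email is required")
errors = add_error(errors, :validation, "Password too short")
errors = add_error(errors, :authentication, "Invalid token")

p errors[:validation]      # => ["Email is required", "Password too short"]
p errors[:authentication]  # => ["Invalid token"]
```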
+## Flow Control
+Results include a continue flag that controls pipeline execution.
+### Continue
+Move to the next step with a new value:
+```ruby
+result = result.continue(new_value)
+# continue? => true
+```
+### Halt
+Stop pipeline execution:
+```ruby
+# Halt without changing value
+result = result.halt
+# continue? => false, value unchanged
+# Halt with a new value
+result = result.halt(error_response)
+# continue? => false, value changed
+```
+### Checking Status
+```ruby
+if result.continue?
+  # Pipeline will proceed
+else
+  # Pipeline has been halted
+end
+```
+## Example: Multi-Step Processing
+```ruby
+def process_user_registration(params)
+  # Assumes a controller context (e.g. Rails) where `request` is available
+  result = SimpleFlow::Result.new(params)
+    .with_context(:ip_address, request.ip)
+    .with_context(:timestamp, Time.now)
+  # Validation
+  if params[:email].nil?
+    return result
+      .with_error(:validation, 'Email required')
+      .halt
+  end
+  # Process
+  user = create_user(params)
+  result
+    .continue(user)
+    .with_context(:user_id, user.id)
+end
+```
+## API Reference
+### Instance Methods
+| Method | Description | Returns |
+|--------|-------------|---------|
+| `value` | Get the current value | Object |
+| `context` | Get the context hash | Hash |
+| `errors` | Get the errors hash | Hash |
+| `continue?` | Check if pipeline should continue | Boolean |
+| `with_context(key, value)` | Add context | New Result |
+| `with_error(key, message)` | Add error | New Result |
+| `continue(new_value)` | Proceed with new value | New Result |
+| `halt(new_value = nil)` | Stop execution, optionally replacing the value | New Result |
+## Best Practices
+1. **Use context for metadata**: Keep the value focused on the data being processed
+2. **Categorize errors**: Use meaningful error keys like `:validation`, `:authentication`, `:database`
+3. **Halt early**: Stop processing as soon as you know the operation cannot succeed
+4. **Chain operations**: Take advantage of immutability to build readable operation chains
+5. **Preserve information**: When halting, preserve context and errors for debugging
+## Next Steps
+- [Pipeline](pipeline.md) - Learn how Results flow through pipelines
+- [Flow Control](flow-control.md) - Advanced flow control patterns
+- [Error Handling Guide](../guides/error-handling.md) - Comprehensive error handling strategies