job-workflow 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (132)
  1. checksums.yaml +7 -0
  2. data/.rspec +3 -0
  3. data/.rubocop.yml +91 -0
  4. data/CHANGELOG.md +23 -0
  5. data/LICENSE.txt +21 -0
  6. data/README.md +47 -0
  7. data/Rakefile +55 -0
  8. data/Steepfile +10 -0
  9. data/guides/API_REFERENCE.md +112 -0
  10. data/guides/BEST_PRACTICES.md +113 -0
  11. data/guides/CACHE_STORE_INTEGRATION.md +145 -0
  12. data/guides/CONDITIONAL_EXECUTION.md +66 -0
  13. data/guides/DEPENDENCY_WAIT.md +386 -0
  14. data/guides/DRY_RUN.md +390 -0
  15. data/guides/DSL_BASICS.md +216 -0
  16. data/guides/ERROR_HANDLING.md +187 -0
  17. data/guides/GETTING_STARTED.md +524 -0
  18. data/guides/INSTRUMENTATION.md +131 -0
  19. data/guides/LIFECYCLE_HOOKS.md +415 -0
  20. data/guides/NAMESPACES.md +75 -0
  21. data/guides/OPENTELEMETRY_INTEGRATION.md +86 -0
  22. data/guides/PARALLEL_PROCESSING.md +302 -0
  23. data/guides/PRODUCTION_DEPLOYMENT.md +110 -0
  24. data/guides/QUEUE_MANAGEMENT.md +141 -0
  25. data/guides/README.md +174 -0
  26. data/guides/SCHEDULED_JOBS.md +165 -0
  27. data/guides/STRUCTURED_LOGGING.md +268 -0
  28. data/guides/TASK_OUTPUTS.md +240 -0
  29. data/guides/TESTING_STRATEGY.md +56 -0
  30. data/guides/THROTTLING.md +198 -0
  31. data/guides/TROUBLESHOOTING.md +53 -0
  32. data/guides/WORKFLOW_COMPOSITION.md +675 -0
  33. data/guides/WORKFLOW_STATUS_QUERY.md +288 -0
  34. data/lib/job-workflow.rb +3 -0
  35. data/lib/job_workflow/argument_def.rb +16 -0
  36. data/lib/job_workflow/arguments.rb +40 -0
  37. data/lib/job_workflow/auto_scaling/adapter/aws_adapter.rb +66 -0
  38. data/lib/job_workflow/auto_scaling/adapter.rb +31 -0
  39. data/lib/job_workflow/auto_scaling/configuration.rb +85 -0
  40. data/lib/job_workflow/auto_scaling/executor.rb +43 -0
  41. data/lib/job_workflow/auto_scaling.rb +69 -0
  42. data/lib/job_workflow/cache_store_adapters.rb +46 -0
  43. data/lib/job_workflow/context.rb +352 -0
  44. data/lib/job_workflow/dry_run_config.rb +31 -0
  45. data/lib/job_workflow/dsl.rb +236 -0
  46. data/lib/job_workflow/error_hook.rb +24 -0
  47. data/lib/job_workflow/hook.rb +24 -0
  48. data/lib/job_workflow/hook_registry.rb +66 -0
  49. data/lib/job_workflow/instrumentation/log_subscriber.rb +194 -0
  50. data/lib/job_workflow/instrumentation/opentelemetry_subscriber.rb +221 -0
  51. data/lib/job_workflow/instrumentation.rb +257 -0
  52. data/lib/job_workflow/job_status.rb +92 -0
  53. data/lib/job_workflow/logger.rb +86 -0
  54. data/lib/job_workflow/namespace.rb +36 -0
  55. data/lib/job_workflow/output.rb +81 -0
  56. data/lib/job_workflow/output_def.rb +14 -0
  57. data/lib/job_workflow/queue.rb +74 -0
  58. data/lib/job_workflow/queue_adapter.rb +38 -0
  59. data/lib/job_workflow/queue_adapters/abstract.rb +87 -0
  60. data/lib/job_workflow/queue_adapters/null_adapter.rb +127 -0
  61. data/lib/job_workflow/queue_adapters/solid_queue_adapter.rb +224 -0
  62. data/lib/job_workflow/runner.rb +173 -0
  63. data/lib/job_workflow/schedule.rb +46 -0
  64. data/lib/job_workflow/semaphore.rb +71 -0
  65. data/lib/job_workflow/task.rb +83 -0
  66. data/lib/job_workflow/task_callable.rb +43 -0
  67. data/lib/job_workflow/task_context.rb +70 -0
  68. data/lib/job_workflow/task_dependency_wait.rb +66 -0
  69. data/lib/job_workflow/task_enqueue.rb +50 -0
  70. data/lib/job_workflow/task_graph.rb +43 -0
  71. data/lib/job_workflow/task_job_status.rb +70 -0
  72. data/lib/job_workflow/task_output.rb +51 -0
  73. data/lib/job_workflow/task_retry.rb +64 -0
  74. data/lib/job_workflow/task_throttle.rb +46 -0
  75. data/lib/job_workflow/version.rb +5 -0
  76. data/lib/job_workflow/workflow.rb +87 -0
  77. data/lib/job_workflow/workflow_status.rb +112 -0
  78. data/lib/job_workflow.rb +59 -0
  79. data/rbs_collection.lock.yaml +172 -0
  80. data/rbs_collection.yaml +14 -0
  81. data/sig/generated/job-workflow.rbs +2 -0
  82. data/sig/generated/job_workflow/argument_def.rbs +14 -0
  83. data/sig/generated/job_workflow/arguments.rbs +26 -0
  84. data/sig/generated/job_workflow/auto_scaling/adapter/aws_adapter.rbs +32 -0
  85. data/sig/generated/job_workflow/auto_scaling/adapter.rbs +22 -0
  86. data/sig/generated/job_workflow/auto_scaling/configuration.rbs +50 -0
  87. data/sig/generated/job_workflow/auto_scaling/executor.rbs +29 -0
  88. data/sig/generated/job_workflow/auto_scaling.rbs +47 -0
  89. data/sig/generated/job_workflow/cache_store_adapters.rbs +28 -0
  90. data/sig/generated/job_workflow/context.rbs +155 -0
  91. data/sig/generated/job_workflow/dry_run_config.rbs +16 -0
  92. data/sig/generated/job_workflow/dsl.rbs +117 -0
  93. data/sig/generated/job_workflow/error_hook.rbs +18 -0
  94. data/sig/generated/job_workflow/hook.rbs +18 -0
  95. data/sig/generated/job_workflow/hook_registry.rbs +47 -0
  96. data/sig/generated/job_workflow/instrumentation/log_subscriber.rbs +102 -0
  97. data/sig/generated/job_workflow/instrumentation/opentelemetry_subscriber.rbs +113 -0
  98. data/sig/generated/job_workflow/instrumentation.rbs +138 -0
  99. data/sig/generated/job_workflow/job_status.rbs +46 -0
  100. data/sig/generated/job_workflow/logger.rbs +56 -0
  101. data/sig/generated/job_workflow/namespace.rbs +24 -0
  102. data/sig/generated/job_workflow/output.rbs +39 -0
  103. data/sig/generated/job_workflow/output_def.rbs +12 -0
  104. data/sig/generated/job_workflow/queue.rbs +49 -0
  105. data/sig/generated/job_workflow/queue_adapter.rbs +18 -0
  106. data/sig/generated/job_workflow/queue_adapters/abstract.rbs +56 -0
  107. data/sig/generated/job_workflow/queue_adapters/null_adapter.rbs +73 -0
  108. data/sig/generated/job_workflow/queue_adapters/solid_queue_adapter.rbs +111 -0
  109. data/sig/generated/job_workflow/runner.rbs +66 -0
  110. data/sig/generated/job_workflow/schedule.rbs +34 -0
  111. data/sig/generated/job_workflow/semaphore.rbs +37 -0
  112. data/sig/generated/job_workflow/task.rbs +60 -0
  113. data/sig/generated/job_workflow/task_callable.rbs +30 -0
  114. data/sig/generated/job_workflow/task_context.rbs +52 -0
  115. data/sig/generated/job_workflow/task_dependency_wait.rbs +42 -0
  116. data/sig/generated/job_workflow/task_enqueue.rbs +27 -0
  117. data/sig/generated/job_workflow/task_graph.rbs +27 -0
  118. data/sig/generated/job_workflow/task_job_status.rbs +42 -0
  119. data/sig/generated/job_workflow/task_output.rbs +29 -0
  120. data/sig/generated/job_workflow/task_retry.rbs +30 -0
  121. data/sig/generated/job_workflow/task_throttle.rbs +20 -0
  122. data/sig/generated/job_workflow/version.rbs +5 -0
  123. data/sig/generated/job_workflow/workflow.rbs +48 -0
  124. data/sig/generated/job_workflow/workflow_status.rbs +55 -0
  125. data/sig/generated/job_workflow.rbs +8 -0
  126. data/sig-private/activejob.rbs +35 -0
  127. data/sig-private/activesupport.rbs +23 -0
  128. data/sig-private/aws.rbs +32 -0
  129. data/sig-private/opentelemetry.rbs +40 -0
  130. data/sig-private/solid_queue.rbs +108 -0
  131. data/tmp/.keep +0 -0
  132. metadata +190 -0
data/guides/README.md ADDED
@@ -0,0 +1,174 @@
+ # JobWorkflow Guides
+
+ > ⚠️ **Early Stage (v0.1.3):** JobWorkflow is in active development. APIs and features may change. The following guides provide patterns and examples for building workflows, but be aware that implementations may need adjustment as the library evolves.
+
+ Welcome to the JobWorkflow documentation! This directory contains comprehensive guides to help you build robust workflows with JobWorkflow.
+
+ ## 📚 Documentation Structure
+
+ ### 🚀 Getting Started
+
+ Start here if you're new to JobWorkflow:
+
+ - **[GETTING_STARTED.md](GETTING_STARTED.md)** - Quick 5-minute introduction and detailed getting started guide
+   - What is JobWorkflow and why use it
+   - Installation and setup
+   - Your first workflow
+   - Core concepts (Workflow, Task, Arguments, Context, Outputs)
+
+ ### 📖 Fundamentals
+
+ Core concepts and features you'll use in every workflow:
+
+ - **[DSL_BASICS.md](DSL_BASICS.md)** - Mastering the JobWorkflow DSL
+   - Defining tasks
+   - Working with arguments
+   - Task dependencies
+   - Task options (retry, condition, throttle, timeout)
+
+ - **[TASK_OUTPUTS.md](TASK_OUTPUTS.md)** - Understanding task outputs
+   - Defining and accessing outputs
+   - Using outputs with map tasks
+   - Output persistence and design patterns
+
+ - **[PARALLEL_PROCESSING.md](PARALLEL_PROCESSING.md)** - Efficient parallel execution
+   - Collection task basics
+   - Fork-Join pattern
+   - Controlling concurrency
+   - Context isolation
+
+ ### 🔧 Intermediate
+
+ Advanced workflow patterns and features:
+
+ - **[ERROR_HANDLING.md](ERROR_HANDLING.md)** - Robust error handling
+   - Retry configuration (simple and advanced)
+   - Retry strategies (linear, exponential, jitter)
+   - Task-level and workflow-level retry settings
+   - Combining multiple retry layers
+
+ - **[CONDITIONAL_EXECUTION.md](CONDITIONAL_EXECUTION.md)** - Dynamic workflow control
+   - Basic conditional execution
+   - Complex conditions
+   - Best practices
+
+ - **[LIFECYCLE_HOOKS.md](LIFECYCLE_HOOKS.md)** - Extending task behavior
+   - Hook types (before, after, around, on_error)
+   - Hook scope (global vs task-specific)
+   - Execution order and error handling
+
+ ### 🎓 Advanced
+
+ Power features for complex workflows:
+
+ - **[DEPENDENCY_WAIT.md](DEPENDENCY_WAIT.md)** - Efficient dependency waiting
+   - The thread occupation problem
+   - Automatic job rescheduling
+   - Configuration options (poll_timeout, poll_interval, reschedule_delay)
+   - SolidQueue integration
+
+ - **[NAMESPACES.md](NAMESPACES.md)** - Organizing large workflows
+   - Basic namespace usage
+   - Nested namespaces
+   - Cross-namespace dependencies
+
+ - **[THROTTLING.md](THROTTLING.md)** - Rate limiting and resource control
+   - Task-level throttling
+   - Runtime throttling
+   - Sharing throttle keys across jobs
+
+ - **[WORKFLOW_COMPOSITION.md](WORKFLOW_COMPOSITION.md)** - Composing and reusing workflows
+   - Invoking child workflows (sync/async)
+   - Accessing child workflow outputs
+   - Map tasks with child workflows
+   - Best practices and limitations
+
+ - **[SCHEDULED_JOBS.md](SCHEDULED_JOBS.md)** - Cron-like job scheduling
+   - Schedule DSL basics
+   - Schedule expressions (cron and natural language)
+   - Multiple schedules per job
+   - SolidQueue integration
+
+ ### 📊 Observability
+
+ Monitoring and debugging your workflows:
+
+ - **[STRUCTURED_LOGGING.md](STRUCTURED_LOGGING.md)** - JSON-based logging
+   - Log event types
+   - Customizing the logger
+   - Querying and analyzing logs
+
+ - **[INSTRUMENTATION.md](INSTRUMENTATION.md)** - Event-driven observability
+   - Architecture and event types
+   - Custom instrumentation
+   - Building custom subscribers
+
+ - **[OPENTELEMETRY_INTEGRATION.md](OPENTELEMETRY_INTEGRATION.md)** - Distributed tracing
+   - Configuration and setup
+   - Span attributes and naming
+   - Viewing traces in your backend
+
+ ### 🏭 Practical
+
+ Production deployment and operations:
+
+ - **[PRODUCTION_DEPLOYMENT.md](PRODUCTION_DEPLOYMENT.md)** - Running JobWorkflow in production
+   - SolidQueue configuration
+   - Worker optimization
+   - Auto-scaling (AWS ECS)
+   - SolidCache configuration
+
+ - **[QUEUE_MANAGEMENT.md](QUEUE_MANAGEMENT.md)** - Managing job queues
+   - Queue operations (status, pause, resume, clear)
+   - Finding workflows by queue
+   - Production best practices
+
+ - **[CACHE_STORE_INTEGRATION.md](CACHE_STORE_INTEGRATION.md)** - Using cache store backends
+   - Automatic cache detection (SolidCache, MemoryStore)
+   - Cache operations and integration
+
+ - **[WORKFLOW_STATUS_QUERY.md](WORKFLOW_STATUS_QUERY.md)** - Monitoring workflow execution
+   - Finding and inspecting workflows
+   - Accessing arguments, outputs, and job status
+   - Building dashboards and APIs
+
+ - **[TESTING_STRATEGY.md](TESTING_STRATEGY.md)** - Testing your workflows
+   - Unit testing individual tasks
+   - Integration testing workflows
+   - Test best practices
+
+ - **[DRY_RUN.md](DRY_RUN.md)** - Dry-run mode for safe testing
+   - Workflow-level and task-level dry-run
+   - Dynamic dry-run with Proc
+   - skip_in_dry_run for conditional execution
+   - Instrumentation and logging
+
+ - **[TROUBLESHOOTING.md](TROUBLESHOOTING.md)** - Common issues and solutions
+   - CircularDependencyError
+   - UnknownTaskError
+   - Debugging workflows
+
+ ### 📘 Reference
+
+ Complete API documentation and best practices:
+
+ - **[API_REFERENCE.md](API_REFERENCE.md)** - Detailed API documentation
+   - DSL method reference
+   - Class documentation
+   - Method signatures
+
+ - **[BEST_PRACTICES.md](BEST_PRACTICES.md)** - Design patterns and recommendations
+   - Workflow design principles
+   - Task granularity
+   - Dependency management
+   - Testing strategies
+
+ ---
+
+ ## 🤝 Contributing
+
+ Found an issue or have a suggestion? Please open an issue on the [GitHub repository](https://github.com/shoma07/job-workflow).
+
+ ## 📄 License
+
+ JobWorkflow is released under the MIT License. See LICENSE file for details.
data/guides/SCHEDULED_JOBS.md ADDED
@@ -0,0 +1,165 @@
+ # Scheduled Jobs
+
+ JobWorkflow integrates with SolidQueue's recurring tasks feature to enable scheduled job execution. You can define schedules directly in your job class using the DSL, and JobWorkflow automatically registers them with SolidQueue.
+
+ ## Overview
+
+ The `schedule` DSL method allows you to define cron-like schedules for your jobs. Multiple schedules can be defined for a single job, and all SolidQueue recurring task options are supported.
+
+ ### Key Features
+
+ - **DSL-based configuration**: Define schedules inline with your job class
+ - **SolidQueue integration**: Automatic registration with SolidQueue's recurring tasks
+ - **Multiple schedules**: Support for multiple schedules per job
+ - **All SolidQueue options**: key, args, queue, priority, description
+
+ ## Basic Usage
+
+ ```ruby
+ class DailyReportJob < ApplicationJob
+   include JobWorkflow::DSL
+
+   # Run daily at 9:00 AM
+   schedule "0 9 * * *"
+
+   task :generate do |ctx|
+     ReportGenerator.generate_daily_report
+   end
+ end
+ ```
+
+ ## Schedule Expression Formats
+
+ JobWorkflow supports both cron expressions and natural language via the Fugit gem:
+
+ ```ruby
+ # Cron expression
+ schedule "0 9 * * *" # Every day at 9:00 AM
+ schedule "*/15 * * * *" # Every 15 minutes
+ schedule "0 0 1 * *" # First day of every month at midnight
+
+ # Natural language (Fugit)
+ schedule "every hour"
+ schedule "every 5 minutes"
+ schedule "every day at 9am"
+ ```
+
+ ## Schedule Options
+
+ The `schedule` method accepts several options:
+
+ | Option | Type | Default | Description |
+ |--------|------|---------|-------------|
+ | `key` | String/Symbol | Job class name | Unique identifier for the schedule |
+ | `args` | Hash | `{}` | Arguments to pass to the job (as keyword arguments) |
+ | `queue` | String/Symbol | nil | Queue name for the job |
+ | `priority` | Integer | nil | Job priority |
+ | `description` | String | nil | Human-readable description |
+
+ ### Using Options
+
+ ```ruby
+ class DataSyncJob < ApplicationJob
+   include JobWorkflow::DSL
+
+   schedule "0 */4 * * *",
+     key: "data_sync_every_4_hours",
+     queue: "high_priority",
+     priority: 10,
+     args: { source: "primary" },
+     description: "Sync data from primary source every 4 hours"
+
+   argument :source, "String", default: "default"
+
+   task :sync do |ctx|
+     source = ctx.arguments.source
+     DataSynchronizer.sync(source)
+   end
+ end
+ ```
+
+ ## Multiple Schedules
+
+ You can define multiple schedules for the same job. When using multiple schedules, each must have a unique `key`:
+
+ ```ruby
+ class ReportJob < ApplicationJob
+   include JobWorkflow::DSL
+
+   # Morning report
+   schedule "0 9 * * *", key: "morning_report"
+
+   # Evening report with different args
+   schedule "0 18 * * *",
+     key: "evening_report",
+     args: { time_of_day: "evening" }
+
+   argument :time_of_day, "String", default: "morning"
+
+   task :generate do |ctx|
+     time = ctx.arguments.time_of_day
+     ReportGenerator.generate(time)
+   end
+ end
+ ```
+
+ ## How It Works
+
+ JobWorkflow's schedule integration works through SolidQueue's configuration system:
+
+ 1. **Registration**: When a job class is loaded, schedules are stored in the `Workflow#schedules` hash
+ 2. **Tracking**: JobWorkflow tracks all loaded job classes via `JobWorkflow::DSL._included_classes`
+ 3. **Integration**: JobWorkflow patches `SolidQueue::Configuration#recurring_tasks_config` to merge registered schedules
+ 4. **Execution**: SolidQueue's scheduler picks up the schedules and enqueues jobs at the specified times
+
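+ The integration point in step 3 can be pictured as a small prepend around SolidQueue's configuration. The sketch below is illustrative only (it is not the gem's actual patch, and the module name is made up); it assumes `recurring_tasks_config` returns a Hash keyed by schedule key:
+
+ ```ruby
+ # Conceptual sketch, not JobWorkflow's real implementation.
+ # Assumes SolidQueue::Configuration#recurring_tasks_config returns a Hash.
+ module RecurringTasksWithDslSchedules
+   def recurring_tasks_config
+     dsl_schedules = JobWorkflow::DSL._included_classes
+       .map { |job_class| job_class._workflow.build_schedules_hash }
+       .reduce({}, :merge)
+
+     # DSL-defined schedules win on key conflicts (see the next section)
+     super.merge(dsl_schedules)
+   end
+ end
+
+ SolidQueue::Configuration.prepend(RecurringTasksWithDslSchedules)
+ ```
+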
+ ### Configuration File Compatibility
+
+ JobWorkflow schedules are merged with any existing SolidQueue YAML configuration:
+
+ ```yaml
+ # config/recurring.yml (SolidQueue's native config)
+ legacy_cleanup:
+   class: LegacyCleanupJob
+   schedule: "0 0 * * 0"
+ ```
+
+ Both the YAML-defined schedules and JobWorkflow DSL-defined schedules will be active. If there's a key conflict, the JobWorkflow schedule takes precedence.
+
+ ## Requirements
+
+ - SolidQueue must be configured as your ActiveJob backend
+ - The job class must be loaded before SolidQueue's recurring task supervisor starts
+ - Rails eager loading should be enabled in production (default behavior)
+
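+ The last two requirements usually hold in a stock Rails app; a minimal check of the production config (standard Rails settings, nothing JobWorkflow-specific) looks like this:
+
+ ```ruby
+ # config/environments/production.rb
+ Rails.application.configure do
+   # Eager loading is the Rails production default; it ensures job classes
+   # (and the schedules they declare) are loaded before the scheduler starts.
+   config.eager_load = true
+ end
+ ```
+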
+ ## Checking Scheduled Jobs
+
+ You can inspect registered schedules programmatically:
+
+ ```ruby
+ # Get schedules from a specific job class
+ DailyReportJob._workflow.build_schedules_hash
+ # => {
+ #   DailyReportJob: { class: "DailyReportJob", schedule: "0 9 * * *" }
+ # }
+
+ # For jobs with multiple schedules
+ ReportJob._workflow.build_schedules_hash
+ # => {
+ #   morning_report: { class: "ReportJob", schedule: "0 9 * * *" },
+ #   evening_report: { class: "ReportJob", schedule: "0 18 * * *", args: [{ time_of_day: "evening" }] }
+ # }
+
+ # Check if a workflow has schedules
+ DailyReportJob._workflow.schedules.any? # => true
+ ```
+
+ ## Best Practices
+
+ 1. **Use descriptive keys**: When defining multiple schedules, use meaningful keys that describe the schedule's purpose
+ 2. **Document schedules**: Use the `description` option to explain what each schedule does
+ 3. **Consider time zones**: Cron expressions use the server's time zone; consider using natural language for clarity
+ 4. **Test schedules**: Verify schedule expressions using Fugit before deployment:
+    ```ruby
+    require 'fugit'
+    Fugit.parse("0 9 * * *").next_time # => next occurrence
+    ```
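+    A zone-qualified cron string can be checked the same way (Fugit accepts a trailing time zone name), though you should verify end-to-end that your SolidQueue setup honours the zone:
+    ```ruby
+    # Fugit-level check only; zone handling by the scheduler itself is an assumption to verify
+    Fugit.parse("0 9 * * * America/New_York").next_time
+    ```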
data/guides/STRUCTURED_LOGGING.md ADDED
@@ -0,0 +1,268 @@
+ # Structured Logging
+
+ JobWorkflow provides structured JSON logging for comprehensive workflow observability. All workflow and task lifecycle events are automatically logged with detailed context information.
+
+ ## Overview
+
+ JobWorkflow's logging system uses a JSON formatter that outputs structured logs with timestamps, log levels, and event-specific fields. This makes it easy to search, filter, and analyze workflow execution in production environments.
+
+ ### Key Features
+
+ - **JSON Format**: All logs are output in JSON format for easy parsing and analysis
+ - **Automatic Logging**: Workflow and task lifecycle events are automatically logged
+ - **Contextual Information**: Logs include job ID, task name, retry count, and other relevant metadata
+ - **Customizable**: Logger instance and formatter can be customized
+ - **Log Levels**: INFO for lifecycle events, WARN for retries, DEBUG for throttling
+
+ ## Log Event Types
+
+ JobWorkflow automatically logs the following events:
+
+ | Event | Description | Log Level | Fields |
+ |-------|-------------|-----------|--------|
+ | `workflow.start` | Workflow execution started | INFO | `job_name`, `job_id` |
+ | `workflow.complete` | Workflow execution completed | INFO | `job_name`, `job_id` |
+ | `task.start` | Task execution started | INFO | `job_name`, `job_id`, `task_name`, `each_index`, `retry_count` |
+ | `task.complete` | Task execution completed | INFO | `job_name`, `job_id`, `task_name`, `each_index` |
+ | `task.skip` | Task skipped (condition not met) | INFO | `job_name`, `job_id`, `task_name`, `reason` |
+ | `task.enqueue` | Sub-jobs enqueued for map task | INFO | `job_name`, `job_id`, `task_name`, `sub_job_count` |
+ | `task.retry` | Task retry after failure | WARN | `job_name`, `job_id`, `task_name`, `each_index`, `attempt`, `max_attempts`, `delay_seconds`, `error_class`, `error_message` |
+ | `throttle.acquire.start` | Semaphore acquisition started | DEBUG | `concurrency_key`, `concurrency_limit` |
+ | `throttle.acquire.complete` | Semaphore acquisition completed | DEBUG | `concurrency_key`, `concurrency_limit` |
+ | `throttle.release` | Semaphore released | DEBUG | `concurrency_key` |
+ | `dependent.wait.start` | Waiting for dependent task started | DEBUG | `job_name`, `job_id`, `dependent_task_name` |
+ | `dependent.wait.complete` | Dependent task completed | DEBUG | `job_name`, `job_id`, `dependent_task_name` |
+
+ ## Default Configuration
+
+ JobWorkflow automatically configures a logger with JSON output:
+
+ ```ruby
+ # Default logger (outputs to STDOUT)
+ JobWorkflow.logger
+ # => #<ActiveSupport::Logger:...>
+
+ JobWorkflow.logger.formatter
+ # => #<JobWorkflow::Logger::JsonFormatter:...>
+ ```
+
+ ## Log Output Examples
+
+ ### Workflow Lifecycle
+
+ ```json
+ {"time":"2026-01-02T10:00:00.123456+09:00","level":"INFO","progname":"ruby","event":"workflow.start","job_name":"DataProcessingJob","job_id":"abc123"}
+ {"time":"2026-01-02T10:05:30.654321+09:00","level":"INFO","progname":"ruby","event":"workflow.complete","job_name":"DataProcessingJob","job_id":"abc123"}
+ ```
+
+ ### Task Execution
+
+ ```json
+ {"time":"2026-01-02T10:00:01.234567+09:00","level":"INFO","progname":"ruby","event":"task.start","job_name":"DataProcessingJob","job_id":"abc123","task_name":"fetch_data","each_index":0,"retry_count":0}
+ {"time":"2026-01-02T10:00:05.345678+09:00","level":"INFO","progname":"ruby","event":"task.complete","job_name":"DataProcessingJob","job_id":"abc123","task_name":"fetch_data","each_index":0}
+ ```
+
+ ### Task Retry
+
+ ```json
+ {"time":"2026-01-02T10:00:10.456789+09:00","level":"WARN","progname":"ruby","event":"task.retry","job_name":"DataProcessingJob","job_id":"abc123","task_name":"process_item","each_index":5,"attempt":2,"max_attempts":3,"delay_seconds":4.0,"error_class":"Timeout::Error","error_message":"execution expired"}
+ ```
+
+ ### Task Skip (Conditional Execution)
+
+ ```json
+ {"time":"2026-01-02T10:00:15.567890+09:00","level":"INFO","progname":"ruby","event":"task.skip","job_name":"DataProcessingJob","job_id":"abc123","task_name":"send_notification","reason":"condition_not_met"}
+ ```
+
+ ### Throttling Events
+
+ ```json
+ {"time":"2026-01-02T10:00:20.678901+09:00","level":"DEBUG","progname":"ruby","event":"throttle.acquire.start","concurrency_key":"api_rate_limit","concurrency_limit":10}
+ {"time":"2026-01-02T10:00:23.789012+09:00","level":"DEBUG","progname":"ruby","event":"throttle.acquire.complete","concurrency_key":"api_rate_limit","concurrency_limit":10}
+ {"time":"2026-01-02T10:00:28.890123+09:00","level":"DEBUG","progname":"ruby","event":"throttle.release","concurrency_key":"api_rate_limit"}
+ ```
+
+ ## Customizing the Logger
+
+ ### Using a Custom Logger Instance
+
+ You can replace the default logger with your own:
+
+ ```ruby
+ # config/initializers/job_workflow.rb
+ JobWorkflow.logger = ActiveSupport::Logger.new(Rails.root.join('log', 'job_workflow.log'))
+ JobWorkflow.logger.formatter = JobWorkflow::Logger::JsonFormatter.new
+ JobWorkflow.logger.level = :info
+ ```
+
+ ### Custom Log Tags
+
+ Add custom tags to include in every log entry:
+
+ ```ruby
+ # config/initializers/job_workflow.rb
+ JobWorkflow.logger.formatter = JobWorkflow::Logger::JsonFormatter.new(
+   log_tags: [:request_id, :user_id]
+ )
+
+ # In your application code, set tags using ActiveSupport::TaggedLogging
+ JobWorkflow.logger.tagged(request_id: request.request_id, user_id: current_user.id) do
+   MyWorkflowJob.perform_later
+ end
+ ```
+
+ Log output will include the tags:
+
+ ```json
+ {"time":"2026-01-02T10:00:00.123456+09:00","level":"INFO","progname":"ruby","request_id":"req_xyz789","user_id":"user_123","event":"workflow.start","job_name":"MyWorkflowJob","job_id":"abc123"}
+ ```
+
+ ### Changing Log Level
+
+ Control which logs are output by setting the log level:
+
+ ```ruby
+ # config/environments/production.rb
+ JobWorkflow.logger.level = :info # INFO, WARN, ERROR only (no DEBUG)
+
+ # config/environments/development.rb
+ JobWorkflow.logger.level = :debug # All logs including throttling details
+ ```
+
+ ## Querying and Analyzing Logs
+
+ ### Finding Failed Tasks
+
+ ```bash
+ # Using jq
+ cat log/production.log | jq 'select(.event == "task.retry")'
+
+ # Using grep
+ grep '"event":"task.retry"' log/production.log | jq .
+ ```
+
+ ### Tracking Workflow Execution
+
+ ```bash
+ # Find all events for a specific job_id
+ cat log/production.log | jq 'select(.job_id == "abc123")'
+
+ # Calculate workflow duration
+ START=$(cat log/production.log | jq -r 'select(.event == "workflow.start" and .job_id == "abc123") | .time' | head -1)
+ END=$(cat log/production.log | jq -r 'select(.event == "workflow.complete" and .job_id == "abc123") | .time' | head -1)
+ echo "Start: $START, End: $END"
+ ```
+
+ ### Analyzing Throttling Behavior
+
+ ```bash
+ # Count throttle acquire events by concurrency_key
+ cat log/production.log | jq -r 'select(.event == "throttle.acquire.start") | .concurrency_key' | sort | uniq -c
+
+ # Calculate semaphore wait duration (requires timestamps)
+ cat log/production.log | jq 'select(.event == "throttle.acquire.start" or .event == "throttle.acquire.complete")' | jq -s 'group_by(.concurrency_key) | map({key: .[0].concurrency_key, count: length})'
+ ```
+
+ ## Best Practices
+
+ ### 1. Use Appropriate Log Levels
+
+ - **Production**: Set to `:info` to avoid verbose DEBUG logs
+ - **Development**: Set to `:debug` to see all throttling and dependency events
+ - **Staging**: Set to `:info` or `:debug` based on debugging needs
+
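+ If you prefer a single initializer over per-environment files, one possible arrangement (using only the `JobWorkflow.logger.level` setting shown above) is:
+
+ ```ruby
+ # config/initializers/job_workflow.rb
+ # Verbose throttling/dependency logs in development, lifecycle-only elsewhere.
+ JobWorkflow.logger.level = Rails.env.development? ? :debug : :info
+ ```
+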
+ ### 2. Add Custom Tags for Context
+
+ Use tagged logging to add request-specific context:
+
+ ```ruby
+ class ApplicationController < ActionController::Base
+   around_action :tag_job_workflow_logs
+
+   private
+
+   def tag_job_workflow_logs
+     JobWorkflow.logger.tagged(
+       request_id: request.request_id,
+       user_id: current_user&.id,
+       tenant_id: current_tenant&.id
+     ) do
+       yield
+     end
+   end
+ end
+ ```
+
+ ### 3. Monitor Key Metrics
+
+ Set up alerts for:
+
+ - High retry rates: `event == "task.retry"`
+ - Long workflow durations: time between `workflow.start` and `workflow.complete`
+ - Long throttle wait times: duration between `throttle.acquire.start` and `throttle.acquire.complete`
+ - Skipped tasks: unexpected `task.skip` events
+
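+ For the duration-style metrics above, the one-JSON-object-per-line format makes ad-hoc computation easy. A minimal Ruby sketch (assuming logs go to a dedicated `log/job_workflow.log` file, as in the custom logger example earlier, and that the hypothetical job ID `abc123` is the one you care about):
+
+ ```ruby
+ require "json"
+ require "time"
+
+ # Compute one workflow's wall-clock duration from its start/complete events.
+ events    = File.readlines("log/job_workflow.log").map { |line| JSON.parse(line) }
+ started   = events.find { |e| e["event"] == "workflow.start" && e["job_id"] == "abc123" }
+ completed = events.find { |e| e["event"] == "workflow.complete" && e["job_id"] == "abc123" }
+
+ puts Time.parse(completed["time"]) - Time.parse(started["time"]) if started && completed
+ ```
+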
+ ### 4. Structured Log Queries
+
+ Design your monitoring queries around the JSON structure. Use `jq` for command-line analysis:
+
+ ```bash
+ # Find all retry events for a specific job
+ cat log/production.log | jq 'select(.event == "task.retry" and .job_name == "DataProcessingJob")'
+
+ # Filter retries with 2 or more attempts
+ cat log/production.log | jq 'select(.event == "task.retry" and .attempt >= 2)'
+
+ # Extract specific fields
+ cat log/production.log | jq 'select(.event == "task.retry") | {job_name, task_name, attempt, error_class}'
+ ```
+
+ Most log aggregation services support JSON-based querying. Consult your logging platform's documentation for specific query syntax.
+
+ ### 5. Log Retention
+
+ Configure appropriate retention policies based on your compliance and operational needs:
+
+ - **High-volume production**: 7-30 days retention
+ - **Critical workflows**: 90+ days retention
+ - **Archive**: Store historical logs for compliance if required
+
+ ## Troubleshooting Logging Issues
+
+ ### Logs Not Appearing
+
+ ```ruby
+ # Check logger configuration
+ JobWorkflow.logger.level # Should be :debug or :info
+ JobWorkflow.logger.formatter.class # Should be JobWorkflow::Logger::JsonFormatter
+
+ # Verify logger is writing
+ JobWorkflow.logger.info({ event: "test", message: "test message" })
+ ```
+
+ ### Malformed JSON
+
+ If you see non-JSON log lines mixed with JSON:
+
+ ```ruby
+ # Ensure all loggers use JsonFormatter
+ Rails.logger.formatter = JobWorkflow::Logger::JsonFormatter.new # If needed
+
+ # Or separate JobWorkflow logs to a dedicated file
+ JobWorkflow.logger = ActiveSupport::Logger.new('log/job_workflow.log')
+ JobWorkflow.logger.formatter = JobWorkflow::Logger::JsonFormatter.new
+ ```
+
+ ### Missing Context Fields
+
+ If expected fields are missing from logs:
+
+ ```ruby
+ # Verify the logger has access to job context
+ # The logger automatically includes job_name, job_id, task_name, etc.
+ # Custom fields require tagged logging:
+
+ JobWorkflow.logger.tagged(custom_field: "value") do
+   MyWorkflowJob.perform_later
+ end
+ ```