sidekiq-hierarchy 0.1.1
- checksums.yaml +7 -0
- data/.gitignore +10 -0
- data/.rspec +2 -0
- data/.ruby-gemset +1 -0
- data/.ruby-version +1 -0
- data/.travis.yml +6 -0
- data/CHANGELOG.md +14 -0
- data/CONTRIBUTING.md +57 -0
- data/Gemfile +7 -0
- data/LICENSE.txt +21 -0
- data/README.md +396 -0
- data/Rakefile +6 -0
- data/bin/console +14 -0
- data/bin/setup +7 -0
- data/img/dashboard.png +0 -0
- data/img/failed_workflow.png +0 -0
- data/img/in_progress_workflow.png +0 -0
- data/img/job.png +0 -0
- data/img/workflow_set.png +0 -0
- data/lib/sidekiq-hierarchy.rb +1 -0
- data/lib/sidekiq/hierarchy.rb +105 -0
- data/lib/sidekiq/hierarchy/callback_registry.rb +33 -0
- data/lib/sidekiq/hierarchy/client/middleware.rb +23 -0
- data/lib/sidekiq/hierarchy/faraday/middleware.rb +16 -0
- data/lib/sidekiq/hierarchy/http.rb +8 -0
- data/lib/sidekiq/hierarchy/job.rb +290 -0
- data/lib/sidekiq/hierarchy/notifications.rb +8 -0
- data/lib/sidekiq/hierarchy/observers.rb +9 -0
- data/lib/sidekiq/hierarchy/observers/job_update.rb +15 -0
- data/lib/sidekiq/hierarchy/observers/workflow_update.rb +18 -0
- data/lib/sidekiq/hierarchy/rack/middleware.rb +27 -0
- data/lib/sidekiq/hierarchy/server/middleware.rb +62 -0
- data/lib/sidekiq/hierarchy/version.rb +5 -0
- data/lib/sidekiq/hierarchy/web.rb +149 -0
- data/lib/sidekiq/hierarchy/workflow.rb +130 -0
- data/lib/sidekiq/hierarchy/workflow_set.rb +134 -0
- data/sidekiq-hierarchy.gemspec +33 -0
- data/web/views/_job_progress_bar.erb +28 -0
- data/web/views/_job_table.erb +37 -0
- data/web/views/_job_timings.erb +10 -0
- data/web/views/_progress_bar.erb +8 -0
- data/web/views/_search_bar.erb +17 -0
- data/web/views/_summary_bar.erb +30 -0
- data/web/views/_workflow_progress_bar.erb +24 -0
- data/web/views/_workflow_set_clear.erb +7 -0
- data/web/views/_workflow_table.erb +33 -0
- data/web/views/_workflow_timings.erb +14 -0
- data/web/views/_workflow_tree.erb +82 -0
- data/web/views/_workflow_tree_node.erb +18 -0
- data/web/views/job.erb +12 -0
- data/web/views/not_found.erb +1 -0
- data/web/views/status.erb +120 -0
- data/web/views/workflow.erb +45 -0
- data/web/views/workflow_set.erb +3 -0
- metadata +225 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
|
|
1
|
+
---
|
2
|
+
SHA1:
|
3
|
+
metadata.gz: 33347354bed6c46a62bc8d37ef059999ef0646b1
|
4
|
+
data.tar.gz: f05c181f018f4998493fd855c097add3809f897d
|
5
|
+
SHA512:
|
6
|
+
metadata.gz: 960ea4782b087f62c9cdce3b143d43eb6eacbe9bb35ce500444937aced01eb57788435349d4d771a9534617a67bc2c53957b97e66711d7e5c9987341b6de0932
|
7
|
+
data.tar.gz: 072d1ef82b78c226563b0d99e5238b642e8e57091f515c8b6115874f2e326d914ab2d0b0db1548a02f96bd92d5b42c9423cb1407d8d992efd32835c831778e6b
|
data/.gitignore
ADDED
data/.rspec
ADDED
data/.ruby-gemset
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
sidekiq-hierarchy
|
data/.ruby-version
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
jruby-1.7
|
data/.travis.yml
ADDED
data/CHANGELOG.md
ADDED
@@ -0,0 +1,14 @@
|
|
1
|
+
# Change Log
|
2
|
+
All notable changes to this project will be documented in this file.
|
3
|
+
This project adheres to [Semantic Versioning](http://semver.org/).
|
4
|
+
|
5
|
+
## [0.1.1] - 2015-11-16
|
6
|
+
### Added
|
7
|
+
- This changelog
|
8
|
+
|
9
|
+
### Changed
|
10
|
+
- Allowed pushing to rubygems.org
|
11
|
+
|
12
|
+
## [0.1.0] - 2015-11-16
|
13
|
+
### Initial Release
|
14
|
+
- Shipping all launch features: tree tracking, network bridging, etc.
|
data/CONTRIBUTING.md
ADDED
@@ -0,0 +1,57 @@
|
|
1
|
+
How to Contribute to Sidekiq-Hierarchy
|
2
|
+
======================================
|
3
|
+
|
4
|
+
Thanks for your interest in the Sidekiq-hierarchy project. When you make a
|
5
|
+
contribution to the project (e.g. any modifications, additions to existing
|
6
|
+
work, pull requests or any other work intentionally submitted by you for
|
7
|
+
inclusion in the project) (collectively, a "Contribution"), Lookout wants to be
|
8
|
+
able to use your Contribution to improve this project and other Lookout
|
9
|
+
products.
|
10
|
+
|
11
|
+
As a condition of providing a Contribution, you agree to the following terms
|
12
|
+
and conditions ("Terms"):
|
13
|
+
|
14
|
+
1. Copyright License: Subject to these Terms, you grant Lookout and to
|
15
|
+
recipients of software distributed by Lookout a perpetual, worldwide,
|
16
|
+
non-exclusive, no-charge, royalty-free, irrevocable license to make, use, sell,
|
17
|
+
reproduce, modify, distribute (directly and indirectly), and publicly display
|
18
|
+
and perform the Contribution, and any derivative works that Lookout may make
|
19
|
+
from the Contribution.
|
20
|
+
|
21
|
+
2. Patent License: Subject to these Terms, you grant Lookout and to
|
22
|
+
recipients of software distributed by Lookout a perpetual, worldwide,
|
23
|
+
non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this
|
24
|
+
section) patent license to make, have made, use, offer to sell, sell, import,
|
25
|
+
or otherwise transfer the Contribution, where such license applies only to
|
26
|
+
those patent claims licensable by you that are necessarily infringed by your
|
27
|
+
Contribution(s) alone or by combination of your Contribution(s) with any works
|
28
|
+
or projects to which such Contribution(s) was submitted. If any entity
|
29
|
+
institutes patent litigation against you or any other entity (including a
|
30
|
+
cross-claim or counterclaim in a lawsuit) alleging that your Contribution(s) or
|
31
|
+
the projects to which you have contributed, constitutes direct or contributory
|
32
|
+
patent infringement, then any patent licenses granted to that entity under this
|
33
|
+
agreement for that Contribution, work or other project shall terminate as of
|
34
|
+
the date such litigation is filed.
|
35
|
+
|
36
|
+
3. You warrant and represent that the Contribution(s) is your original
|
37
|
+
creation, that you have the authority and are legally entitled to grant these
|
38
|
+
licenses to Lookout, and that these licenses do not require the permission of
|
39
|
+
any third party.
|
40
|
+
|
41
|
+
4. Except for the warranties in Section 3, you provide any Contribution(s) on
|
42
|
+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
|
43
|
+
or implied, including, without limitation, any warranties or conditions of
|
44
|
+
TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE.
|
45
|
+
|
46
|
+
5. You agree to notify Lookout of any facts or circumstances of which you
|
47
|
+
become aware of that would make any representations made by you inaccurate in
|
48
|
+
any respect.
|
49
|
+
|
50
|
+
|
51
|
+
Should you wish to submit a suggestion or work that is not your original
|
52
|
+
creation, you may submit it to Lookout separate from any Contribution,
|
53
|
+
explicitly identifying it as sourced from a third party, stating the complete
|
54
|
+
details of its source, and informing Lookout of any license or other
|
55
|
+
restriction (including but not limited to related patents, trademarks, and
|
56
|
+
license agreement) of which you are personally aware, and conspicuously marking
|
57
|
+
the work as "Submitted on behalf of a third party: [named here]."
|
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
|
|
1
|
+
The MIT License (MIT)
|
2
|
+
|
3
|
+
Copyright (c) 2015 Lookout, Inc.
|
4
|
+
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
7
|
+
in the Software without restriction, including without limitation the rights
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
10
|
+
furnished to do so, subject to the following conditions:
|
11
|
+
|
12
|
+
The above copyright notice and this permission notice shall be included in
|
13
|
+
all copies or substantial portions of the Software.
|
14
|
+
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
21
|
+
THE SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,396 @@
|
|
1
|
+
# Sidekiq::Hierarchy
|
2
|
+
|
3
|
+
|
4
|
+
[![Build Status](https://travis-ci.org/anujdas/sidekiq-hierarchy.png?branch=master)](https://travis-ci.org/anujdas/sidekiq-hierarchy)
|
5
|
+
|
6
|
+
[![Gem Version](https://badge.fury.io/rb/sidekiq-hierarchy.png)](http://badge.fury.io/rb/sidekiq-hierarchy)
|
7
|
+
|
8
|
+
Sidekiq-hierarchy is a gem that implements parent-child hierarchies between sidekiq jobs. Via several middlewares, it allows tracking complete workflows of multiple levels of sidekiq jobs, even across network calls, so long as a shared redis host is available.
|
9
|
+
|
10
|
+
You may want to use sidekiq-hierarchy if you:
|
11
|
+
|
12
|
+
- have complex (or simple) hierarchies of jobs triggering other jobs
|
13
|
+
- want to understand timing breakdowns (enqueued, run, and completed times) per job and per workflow
|
14
|
+
- are investigating how job requeues and retries impact your runtimes, e.g., to maintain SLAs
|
15
|
+
- would like to perform actions on job and workflow status changes via callbacks, for instance providing progress feedback or statistical trend data
|
16
|
+
- need to pass arbitrary data between parent and child jobs, in order to implement, e.g., prioritized workflows, or fail-fast workflows
|
17
|
+
- trigger jobs via network calls and want insight into the call graphs
|
18
|
+
|
19
|
+
![Web UI](img/in_progress_workflow.png?raw=true)
|
20
|
+
|
21
|
+
Disclaimer: Sidekiq-hierarchy supports Sidekiq 3.x, and thus MRI 2.0+ and JRuby; it may work on MRI 1.9, but this configuration is untested as Sidekiq's unit testing support does not extend to it.
|
22
|
+
|
23
|
+
## Table of Contents
|
24
|
+
|
25
|
+
- [Sidekiq::Hierarchy](#sidekiqhierarchy)
|
26
|
+
- [Table of Contents](#table-of-contents)
|
27
|
+
- [Quickstart](#quickstart)
|
28
|
+
- [Web Interface](#web-interface)
|
29
|
+
- [Architecture and API](#architecture-and-api)
|
30
|
+
- [Callbacks](#callbacks)
|
31
|
+
- [Network integration](#network-integration)
|
32
|
+
- [Advanced Options](#advanced-options)
|
33
|
+
- [Additional Job Info](#additional-job-info)
|
34
|
+
- [CompleteSet and FailedSet](#completeset-and-failedset)
|
35
|
+
- [More Examples](#more-examples)
|
36
|
+
- [Fail-fast workflow cancellation](#fail-fast-workflow-cancellation)
|
37
|
+
- [Workflow Metrics Dashboard](#workflow-metrics-dashboard)
|
38
|
+
- [Installation](#installation)
|
39
|
+
- [Development](#development)
|
40
|
+
- [License](#license)
|
41
|
+
|
42
|
+
## Quickstart
|
43
|
+
|
44
|
+
Sidekiq-hierarchy is designed to be as unobtrusive as possible. The simplest possible use case, in which jobs trigger other jobs directly (via `#perform_async`), can be realized via a few lines of code. First, set up Sidekiq and make sure that the gem is installed (see Installation, below). Then:
|
45
|
+
|
46
|
+
- Add the Sidekiq middlewares to your global Sidekiq configuration, usually in an initializer (e.g., `config/initializers/sidekiq.rb`):
|
47
|
+
|
48
|
+
```ruby
|
49
|
+
Sidekiq.configure_client do |config|
|
50
|
+
config.client_middleware do |chain|
|
51
|
+
chain.add Sidekiq::Hierarchy::Client::Middleware
|
52
|
+
end
|
53
|
+
end
|
54
|
+
|
55
|
+
Sidekiq.configure_server do |config|
|
56
|
+
config.client_middleware do |chain|
|
57
|
+
chain.add Sidekiq::Hierarchy::Client::Middleware
|
58
|
+
end
|
59
|
+
config.server_middleware do |chain|
|
60
|
+
chain.add Sidekiq::Hierarchy::Server::Middleware
|
61
|
+
end
|
62
|
+
end
|
63
|
+
```
|
64
|
+
|
65
|
+
Note that the Client middleware must be added to both the server and client configs.
|
66
|
+
|
67
|
+
_Since instrumentation occurs in these middlewares, other middlewares you write that make use of Sidekiq-hierarchy's capabilities must be nested appropriately (inside or outside, depending on whether they make use of workflow data-passing or they modify queuing behaviour)._
|
68
|
+
|
69
|
+
- Mark your workflow entry points, the jobs that are the root nodes of your work trees. Only the roots need to be modified: any children (or children of children, etc.) will automatically inherit the setting (though it won't hurt if you add it to them too). Simply append to the `sidekiq_options` in the worker class:
|
70
|
+
|
71
|
+
```ruby
|
72
|
+
class RootWorker
|
73
|
+
include Sidekiq::Worker
|
74
|
+
sidekiq_options workflow: true
|
75
|
+
|
76
|
+
# def perform(*args)
|
77
|
+
# 5.times do |n|
|
78
|
+
# ChildWorker.perform_async(n, *args)
|
79
|
+
# end
|
80
|
+
# end
end
|
81
|
+
```
|
82
|
+
|
83
|
+
Tracking is opt-in mainly to conserve Redis storage space: if you are fine instrumenting all jobs (because your Redis instance is huge, or your job throughput is not very high, or you're debugging), you can set this in your global options:
|
84
|
+
|
85
|
+
```ruby
|
86
|
+
Sidekiq.default_worker_options = { 'workflow' => true }
|
87
|
+
```
|
88
|
+
|
89
|
+
You're done! Any new instances of your root worker will now record their child hierarchies.
|
90
|
+
|
91
|
+
---
|
92
|
+
|
93
|
+
Some examples to try, given a root `JID`:
|
94
|
+
|
95
|
+
```ruby
|
96
|
+
# > root_jid = RootWorker.perform_async
|
97
|
+
# => "11c3ec3df251ebb646f910d7"
|
98
|
+
|
99
|
+
> workflow = Sidekiq::Hierarchy::Workflow.find_by_jid(root_jid)
|
100
|
+
=> <Sidekiq::Hierarchy::Workflow...>
|
101
|
+
|
102
|
+
> workflow.status # [:running, :complete, :failed]
|
103
|
+
=> :complete
|
104
|
+
|
105
|
+
> [workflow.enqueued_at, workflow.run_at, workflow.complete_at]
|
106
|
+
=> [2015-11-11 15:00:42 -0800, 2015-11-11 15:00:42 -0800, 2015-11-11 15:01:32 -0800]
|
107
|
+
```
|
108
|
+
```ruby
|
109
|
+
> workflow.jobs.count # lazily eval'd
|
110
|
+
=> 33
|
111
|
+
|
112
|
+
> workflow.jobs.map(&:jid)
|
113
|
+
=> ["11c3ec3df251ebb646f910d7", "f003db430a0eae99d72f1b7a", "bc2cf8f3de3b87f9a4c3c10e", ...]
|
114
|
+
```
|
115
|
+
```ruby
|
116
|
+
> root_job = workflow.root
|
117
|
+
=> <Sidekiq::Hierarchy::Job...>
|
118
|
+
|
119
|
+
> root_job.info # configurable hash
|
120
|
+
=> {"class"=>"WebWorker", "queue"=>"default"}
|
121
|
+
|
122
|
+
> [root_job.enqueued_at, root_job.run_at, root_job.complete_at]
|
123
|
+
=> [2015-11-11 15:00:42 -0800, 2015-11-11 15:00:42 -0800, 2015-11-11 15:00:42 -0800]
|
124
|
+
```
|
125
|
+
```ruby
|
126
|
+
> root_job.leaf? # tree traversal helpers
|
127
|
+
=> false
|
128
|
+
|
129
|
+
> root_job.children
|
130
|
+
=> [<Sidekiq::Hierarchy::Job...>, <Sidekiq::Hierarchy::Job...>, ...]
|
131
|
+
|
132
|
+
> root_job.leaves.count
|
133
|
+
=> 19
|
134
|
+
|
135
|
+
> root_job.leaves.last.root == root_job
|
136
|
+
=> true
|
137
|
+
```
|
138
|
+
|
139
|
+
## Web Interface
|
140
|
+
|
141
|
+
Sidekiq-hierarchy comes with a full-featured web UI that integrates into the standard sidekiq-web interface. Use it to investigate your workflows without dropping to the console. Keep in mind, displaying workflows is expensive (Redis-command-count-wise), so it may not be the best idea to leave this live-polling a very large workflow on production over the weekend...
|
142
|
+
|
143
|
+
If you've already got sidekiq-web running, just
|
144
|
+
|
145
|
+
```ruby
|
146
|
+
require 'sidekiq/hierarchy/web'
|
147
|
+
```
|
148
|
+
|
149
|
+
and you're done; click the "Hierarchy" tab on the web UI and dig in. If you haven't, follow the steps at https://github.com/mperham/sidekiq/wiki/Monitoring#web-ui first, then add the `require`. Among the things you can do:
|
150
|
+
|
151
|
+
- See overall metrics and search for jobs/workflows:
|
152
|
+
![Dashboard](img/dashboard.png?raw=true)
|
153
|
+
|
154
|
+
- Summarize running, complete, and failed workflows:
|
155
|
+
![Workflow set](img/workflow_set.png?raw=true)
|
156
|
+
|
157
|
+
- Introspect jobs and workflows
|
158
|
+
![Workflow](img/failed_workflow.png?raw=true)
|
159
|
+
![Job](img/job.png?raw=true)
|
160
|
+
|
161
|
+
And more! Try out live polling for even more fun.
|
162
|
+
|
163
|
+
## Architecture and API
|
164
|
+
|
165
|
+
Most of the API is contained in the `Sidekiq::Hierarchy::Job`, `Sidekiq::Hierarchy::Workflow`, and `Sidekiq::Hierarchy::WorkflowSet` classes. At a high level,
|
166
|
+
|
167
|
+
- Information is stored as a number of `Job`s that are identified by their JID (job id, randomly generated by Sidekiq).
|
168
|
+
- Each `Job` can have one (optional) parent `Job` and any number of children `Job`s.
|
169
|
+
- Together, one job tree constitutes a `Workflow`; workflow data is actually stored on the root `Job` node in Redis, but the workflow class provides a handy abstraction.
|
170
|
+
- Workflows are organized by status into the three `WorkflowSet`s: the obviously-named `RunningSet`, `CompleteSet`, and `FailedSet`.
|
171
|
+
|
172
|
+
Explore the classes to learn what else you can access, including:
|
173
|
+
|
174
|
+
- current `#status` (`:enqueued`, `:running`, `:complete`, `:requeued`, `:failed`)
|
175
|
+
- timestamps for all status changes (`#enqueued_at`, `#run_at`, etc.)
|
176
|
+
- tree exploration (`#root`, `#parent`, `#children`, `#leaf?`, etc.)
|
177
|
+
- lazy enumerators over jobs and workflows (`Workflow#jobs`, `WorkflowSet#each`)
|
178
|
+
- current workflow and job id context (`Sidekiq::Hierarchy.current_workflow`, `.current_jid`)
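The tree-exploration helpers listed above can be pictured with a small in-memory stand-in. This is only a sketch: `SketchJob` and its `#subtree` method are hypothetical, the real `Job` is Redis-backed, and only the names `#root`, `#parent`, `#children`, `#leaf?`, and `#leaves` mirror the methods this README mentions.

```ruby
# Hypothetical in-memory stand-in for a job node; the real Job reads from Redis.
class SketchJob
  attr_reader :jid, :parent, :children

  def initialize(jid, parent = nil)
    @jid = jid
    @parent = parent
    @children = []
    parent.children << self if parent
  end

  # A leaf is a job that has triggered no other jobs.
  def leaf?
    children.empty?
  end

  # Walk upward until a node with no parent is reached.
  def root
    parent ? parent.root : self
  end

  # Lazily enumerate this node and all of its descendants, depth-first.
  def subtree
    Enumerator.new do |yielder|
      yielder << self
      children.each { |child| child.subtree.each { |job| yielder << job } }
    end.lazy
  end

  # Lazily select the descendants (or self) that have no children.
  def leaves
    subtree.select(&:leaf?)
  end
end

root_job   = SketchJob.new('root')
child_a    = SketchJob.new('a', root_job)
child_b    = SketchJob.new('b', root_job)
grandchild = SketchJob.new('a1', child_a)

leaf_jids = root_job.leaves.map(&:jid).to_a
```

Because the enumerators are lazy, nothing is walked until a caller forces the result (here with `#to_a`), which matters when each step costs a Redis round trip.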
|
179
|
+
|
180
|
+
Each `Workflow` can be treated as a Redis-backed hash (all values will be coerced to strings). Combined with the fact that the current workflow context can always be accessed via `Sidekiq::Hierarchy.current_workflow` (nil if not in a workflow), you can pass arbitrary information through a work tree.
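The string coercion has a practical consequence that a short sketch makes visible. `SketchWorkflow` below is a hypothetical stand-in, not the gem's class; it assumes that keys and values are both stored as strings and that an unset key reads back as `nil`, as a Redis hash lookup would behave.

```ruby
# Hypothetical stand-in for a workflow's hash interface; values are coerced
# to strings on write, as a Redis hash would store them.
class SketchWorkflow
  def initialize
    @store = {}
  end

  def []=(key, value)
    @store[key.to_s] = value.to_s
  end

  def [](key)
    @store[key.to_s]
  end
end

workflow = SketchWorkflow.new
workflow[:important] = 1
workflow[:cancelled] = false

important = workflow[:important] # the string "1", not the integer 1
cancelled = workflow[:cancelled] # the string "false", which is truthy in Ruby
missing   = workflow[:unset]     # nil, the only falsy result here
```

Under these assumptions, a flag is safest expressed by presence or absence of a key rather than by storing `false`.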
|
181
|
+
|
182
|
+
---
|
183
|
+
|
184
|
+
As a quick example:
|
185
|
+
|
186
|
+
Say you wanted to push child jobs to a higher-priority queue if the root job was triggered by an admin user. We can implement this trivially using the Sidekiq-hierarchy infrastructure:
|
187
|
+
|
188
|
+
- When the root job is triggered, let's store the "high-priority" flag on the workflow.
|
189
|
+
```ruby
|
190
|
+
class RootWorker
|
191
|
+
include Sidekiq::Worker
|
192
|
+
sidekiq_options workflow: true
|
193
|
+
def perform(user_id)
|
194
|
+
if User.find(user_id).admin?
|
195
|
+
# value will be turned into a string anyway
|
196
|
+
Sidekiq::Hierarchy.current_workflow[:important] = '1'
|
197
|
+
end
|
198
|
+
5.times { ChildWorker.perform_async }
|
199
|
+
end
|
200
|
+
end
|
201
|
+
```
|
202
|
+
|
203
|
+
- Now let's write a simple client middleware to read the flag and act accordingly:
|
204
|
+
```ruby
|
205
|
+
class PriorityMiddleware
|
206
|
+
def call(worker_class, msg, queue, redis_pool)
|
207
|
+
workflow = Sidekiq::Hierarchy.current_workflow # nil when not in a workflow
if workflow && workflow[:important]
|
208
|
+
queue = :ultrahigh # override worker's preset queue
|
209
|
+
end
|
210
|
+
yield worker_class, msg, queue, redis_pool
|
211
|
+
end
|
212
|
+
end
|
213
|
+
```
|
214
|
+
|
215
|
+
- Make sure the middleware is nested **inside** the Sidekiq-hierarchy client middleware in the Sidekiq config.
|
216
|
+
|
217
|
+
That's all it takes!
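What "nested inside" means can be seen in a small simulation. The chain below is a hypothetical miniature, not Sidekiq's own class, and it takes middleware instances where Sidekiq's `chain.add` takes classes. It assumes only that entries run in the order they were added, so an entry added later is wrapped by the ones added before it.

```ruby
# Hypothetical miniature of a middleware chain: each entry wraps the ones
# added after it, so later additions run "inside" earlier ones.
class SketchChain
  def initialize
    @entries = []
  end

  def add(middleware)
    @entries << middleware
  end

  def invoke(*args, &final)
    remaining = @entries.dup
    step = lambda do
      if remaining.empty?
        final.call
      else
        remaining.shift.call(*args, &step)
      end
    end
    step.call
  end
end

# Records when it is entered and left, standing in for a real middleware.
class TracingMiddleware
  def initialize(name, trace)
    @name = name
    @trace = trace
  end

  def call(*_args)
    @trace << "enter #{@name}"
    yield
  ensure
    @trace << "leave #{@name}"
  end
end

trace = []
chain = SketchChain.new
chain.add TracingMiddleware.new('hierarchy', trace) # added first: outermost
chain.add TracingMiddleware.new('priority', trace)  # added second: nested inside
chain.invoke('ChildWorker', {}, 'default') { trace << 'push job' }
```

The trace shows `priority` entering after and leaving before `hierarchy`, which is why a middleware that reads workflow context needs to be added after the hierarchy middleware.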
|
218
|
+
|
219
|
+
## Callbacks
|
220
|
+
|
221
|
+
Sidekiq-hierarchy implements a simple pub/sub events system that currently publishes on two topics: `Sidekiq::Hierarchy::Notifications::JOB_UPDATE` and `Sidekiq::Hierarchy::Notifications::WORKFLOW_UPDATE`. These topics see messages whenever a status change occurs for any job or workflow, respectively.
|
222
|
+
|
223
|
+
Observers on `:job_update` are called with `(job, status, old_status)`, while `:workflow_update` observers receive `(workflow, status, old_status)`. An observer can be anything that supports a `#call` method with the necessary signature: an instance of a class that defines `#call` will suffice, as will a simple `Proc`.
|
224
|
+
|
225
|
+
To register an observer, add it to the global callback registry at any point (initialization usually makes the most sense). For example, to subscribe to the `:job_update` event, you could do:
|
226
|
+
|
227
|
+
```ruby
|
228
|
+
class JobPrinter
|
229
|
+
def call(job, status, old_status)
|
230
|
+
Rails.logger.info "#{job.jid} switched from #{old_status} to #{status}"
|
231
|
+
end
|
232
|
+
end
|
233
|
+
|
234
|
+
|
235
|
+
Sidekiq::Hierarchy.callback_registry
|
236
|
+
.subscribe(Sidekiq::Hierarchy::Notifications::JOB_UPDATE, JobPrinter.new)
|
237
|
+
```
|
238
|
+
|
239
|
+
or
|
240
|
+
|
241
|
+
```ruby
|
242
|
+
job_printer = Proc.new do |job, status, old_status|
|
243
|
+
Rails.logger.info "#{job.jid} switched from #{old_status} to #{status}"
|
244
|
+
end
|
245
|
+
Sidekiq::Hierarchy.callback_registry
|
246
|
+
.subscribe(Sidekiq::Hierarchy::Notifications::JOB_UPDATE, job_printer)
|
247
|
+
|
248
|
+
```
|
249
|
+
|
250
|
+
Callbacks are triggered sequentially and synchronously, so if you are doing anything slow (e.g., a network call), you might consider moving it to an async task.
|
251
|
+
|
252
|
+
Note: Sidekiq-hierarchy makes use of callbacks internally to drive some of its own logic as well. Each subscriber is wrapped in an exception handler to ensure that all subscribers will run at each event publication, even if one or more raise errors.
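The delivery behaviour described here (sequential, synchronous, with each subscriber isolated from the others' errors) can be sketched in a few lines. `SketchRegistry` is a hypothetical miniature written for illustration; it is not the gem's `CallbackRegistry`, and collecting errors in an array is an assumption made so the effect is observable.

```ruby
# Hypothetical miniature of a pub/sub registry; not the gem's CallbackRegistry.
class SketchRegistry
  attr_reader :errors

  def initialize
    @subscribers = Hash.new { |hash, topic| hash[topic] = [] }
    @errors = []
  end

  def subscribe(topic, callable)
    @subscribers[topic] << callable
  end

  # Call each subscriber in registration order, in the caller's thread.
  # A failing subscriber is recorded and skipped so the rest still run.
  def publish(topic, *args)
    @subscribers[topic].each do |callable|
      begin
        callable.call(*args)
      rescue StandardError => e
        @errors << e
      end
    end
  end
end

seen = []
registry = SketchRegistry.new
registry.subscribe(:job_update, proc { |_job, _status, _old| raise 'boom' })
registry.subscribe(:job_update, proc { |job, status, old| seen << [job, status, old] })
registry.publish(:job_update, 'jid-1', :complete, :running)
```

The second subscriber still receives the event even though the first one raised. Since `publish` does not return until every subscriber has finished, a slow subscriber delays whatever code triggered the status change.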
|
253
|
+
|
254
|
+
## Network integration
|
255
|
+
|
256
|
+
A somewhat common pattern with Sidekiq is moving network calls to async jobs, preventing the network's synchronous nature from holding up workers. However, if the network endpoint triggers additional jobs, those children will no longer be linked to their parent, as the worker context is lost. Sidekiq-hierarchy solves this with a set of two optional middlewares: one for Rack (deciphering context from inbound requests) and one instrumenting Faraday (passing context in HTTP headers). Together, they transparently bridge the network gap, ensuring that jobs triggering other jobs over a network hop are recorded correctly.
|
257
|
+
|
258
|
+
The network integration is not loaded by default. To use it, require `sidekiq/hierarchy/rack/middleware` and `sidekiq/hierarchy/faraday/middleware` (making sure `Rack` and `Faraday` are loaded), then insert them in the appropriate places. For Rails, the Rack middleware will usually go in `config/application.rb`:
|
259
|
+
```ruby
|
260
|
+
class Application < Rails::Application
|
261
|
+
# ...
|
262
|
+
config.middleware.use Sidekiq::Hierarchy::Rack::Middleware
|
263
|
+
# ...
|
264
|
+
end
|
265
|
+
```
|
266
|
+
|
267
|
+
For Faraday, the connection object should be modified before use:
|
268
|
+
```ruby
|
269
|
+
Faraday.new do |f|
|
270
|
+
# ...
|
271
|
+
f.use Sidekiq::Hierarchy::Faraday::Middleware
|
272
|
+
# ...
|
273
|
+
end
|
274
|
+
```
|
275
|
+
|
276
|
+
In the background, Sidekiq-hierarchy inserts and decodes two headers:
|
277
|
+
|
278
|
+
- Sidekiq-Jid: the job id of the parent worker, if any
|
279
|
+
- Sidekiq-Workflow: the workflow JID, if tracking is enabled (`workflow: true` in sidekiq_options)
|
280
|
+
|
281
|
+
Even if you are not using Faraday, adding these headers should be easy with your network library of choice.
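As one possible illustration, here is how the two headers might be attached with Ruby's standard `Net::HTTP`. The helper name `request_with_hierarchy_headers`, the URL, and the idea of passing the ids in as plain strings are assumptions for this sketch; in a worker, the values would come from the current job and workflow context.

```ruby
require 'net/http'
require 'uri'

# Build a GET request carrying the two context headers. The jid and
# workflow_jid arguments are assumed to come from the current worker context;
# either may be nil, in which case its header is left off.
def request_with_hierarchy_headers(url, jid, workflow_jid)
  request = Net::HTTP::Get.new(URI(url))
  request['Sidekiq-Jid'] = jid if jid
  request['Sidekiq-Workflow'] = workflow_jid if workflow_jid
  request
end

request = request_with_hierarchy_headers(
  'http://example.com/trigger', # placeholder URL
  '11c3ec3df251ebb646f910d7',   # parent job id
  '11c3ec3df251ebb646f910d7'    # workflow (root) jid
)
```

The request is only built here, not sent; it can be passed to `Net::HTTP#request` on an open connection as usual.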
|
282
|
+
|
283
|
+
## Advanced Options
|
284
|
+
|
285
|
+
There are a couple of additional configuration options you may want to use, depending on your needs:
|
286
|
+
|
287
|
+
### Additional Job Info
|
288
|
+
|
289
|
+
By default, Sidekiq-hierarchy only retains two pieces of information from each job, namely the class and queue. A full job hash in Sidekiq is much richer, but storing the full thing will take significantly more space (especially if you enable backtrace recording in the worker options). If there are additional pieces you need (for instance, the argument list could be quite useful), you can specify these per job:
|
290
|
+
|
291
|
+
```ruby
|
292
|
+
sidekiq_options workflow_keys: ['args']
|
293
|
+
```
|
294
|
+
|
295
|
+
The list of keys must be an array of strings, which will be merged with `['class', 'queue']` (the default).
|
296
|
+
|
297
|
+
### CompleteSet and FailedSet
|
298
|
+
|
299
|
+
While the `RunningSet` is never pruned, so that in-progress workflows will never lose information, completed and failed workflows must be pruned to prevent running out of space in Redis (though note, all keys used expire in one month, so don't expect data to stick around past that time regardless!). Sidekiq itself does not have this issue, since jobs are thrown away after completion, but this is obviously impossible for Sidekiq-hierarchy (else workflows would lose jobs as they completed).
|
300
|
+
|
301
|
+
Two pruning strategies are employed, running on every workflow insertion: one which trims workflows older than a certain time, one which trims workflows past a certain count. These limits can be accessed as `CompleteSet.timeout` and `CompleteSet.max_workflows` (likewise for `FailedSet`, which shares the limits). These are set from global Sidekiq settings as follows:
|
302
|
+
|
303
|
+
- `timeout`: `:dead_timeout_in_seconds` setting, also used by Sidekiq to prune dead jobs (default 6 months)
|
304
|
+
- `max_workflows`: the first of `:dead_max_workflows` and `:dead_max_jobs`, whichever is set; the latter is used internally by Sidekiq to prune dead jobs (`:dead_max_jobs` default 10,000)
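The two pruning strategies can be modelled on a plain array. `SketchWorkflowSet` is a hypothetical in-memory model, not the gem's Redis-backed set; it assumes entries are kept oldest-first and that the count limit evicts the oldest entries.

```ruby
# Hypothetical in-memory model of a completed-workflow set; each entry is a
# [workflow_jid, completed_at] pair kept in insertion (oldest-first) order.
class SketchWorkflowSet
  attr_reader :entries

  def initialize(timeout, max_workflows)
    @timeout = timeout             # seconds a workflow may remain in the set
    @max_workflows = max_workflows # upper bound on the set's size
    @entries = []
  end

  # Insert a workflow, then apply both pruning strategies.
  def add(jid, completed_at, now = completed_at)
    @entries << [jid, completed_at]
    prune_by_age(now)
    prune_by_count
  end

  private

  # Drop every entry that completed more than @timeout seconds before now.
  def prune_by_age(now)
    @entries.reject! { |_jid, completed_at| completed_at < now - @timeout }
  end

  # Drop the oldest entries until the set fits within @max_workflows.
  def prune_by_count
    excess = @entries.size - @max_workflows
    @entries.shift(excess) if excess > 0
  end
end

set = SketchWorkflowSet.new(60, 2)
base = Time.at(0)
set.add('a', base)
set.add('b', base + 30)
set.add('c', base + 50)  # count limit drops 'a'
set.add('d', base + 100) # age limit drops 'b' (70s old); 'c' (50s old) stays
remaining = set.entries.map(&:first)
```

Running both checks on every insertion keeps the set bounded without a separate cleanup job, at the cost of a little extra work per completed workflow.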
|
305
|
+
|
306
|
+
## More Examples
|
307
|
+
|
308
|
+
These are just a few ways in which Sidekiq-hierarchy could help you:
|
309
|
+
|
310
|
+
### Fail-fast workflow cancellation
|
311
|
+
|
312
|
+
Let's say you want to enable workflow cancellation: if one job in a workflow fails, you can safely avoid running any of the others. Assuming Sidekiq-hierarchy is installed and running, we can do this with two middlewares.
|
313
|
+
|
314
|
+
On the server side, _inside_ the hierarchy middleware to ensure variables are set:
|
315
|
+
```ruby
|
316
|
+
class FailFast::ServerMiddleware
|
317
|
+
def call(worker, job, queue)
|
318
|
+
current_jid = Sidekiq::Hierarchy.current_jid
|
319
|
+
workflow = Sidekiq::Hierarchy.current_workflow
|
320
|
+
return if workflow && workflow[:fail_fast]
|
321
|
+
|
322
|
+
yield
|
323
|
+
|
324
|
+
rescue => e
|
325
|
+
if workflow && Sidekiq::Hierarchy::Job.find(current_jid).failed?
|
326
|
+
workflow[:fail_fast] = '1'
|
327
|
+
end
|
328
|
+
raise # make sure to propagate exception up
|
329
|
+
end
|
330
|
+
end
|
331
|
+
```
|
332
|
+
|
333
|
+
On the client side, _inside_ the hierarchy middleware (remember to install client middleware on both the server and client):
|
334
|
+
```ruby
|
335
|
+
class FailFast::ClientMiddleware
|
336
|
+
def call(worker_class, msg, queue, redis_pool)
|
337
|
+
workflow = Sidekiq::Hierarchy.current_workflow
|
338
|
+
return false if workflow && workflow[:fail_fast] # don't bother queueing
|
339
|
+
yield
|
340
|
+
end
|
341
|
+
end
|
342
|
+
```
|
343
|
+
|
344
|
+
The server middleware will flag the workflow on any non-retriable failure. Meanwhile, the client middleware pre-emptively cancels queuing any job according to the flag, and the server middleware refuses to execute jobs on cancelled workflows.
|
345
|
+
|
346
|
+
### Workflow Metrics Dashboard
|
347
|
+
|
348
|
+
Every workflow has a canonical representation given by `#as_json`/`#to_s` (depending on desired format), which will be the same for a given tree of jobs regardless of their actual queuing and execution order. This representation disambiguates by job class and child set. For example, a `ParentWorker` that kicked off two `ChildWorker`s would have the representation
|
349
|
+
|
350
|
+
"{\"k\":\"ParentWorker\",\"c\":[{\"k\":\"ChildWorker\",\"c\":[]},{\"k\":\"ChildWorker\",\"c\":[]}]}"
|
351
|
+
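One way such an order-independent representation could be produced is to serialize children first and sort them by their serialized form. The sketch below illustrates that idea only; `SketchNode` and `canonical_hash` are hypothetical names, and the gem's actual ordering rule may differ.

```ruby
require 'json'

# Hypothetical node: a job class name plus child nodes.
SketchNode = Struct.new(:klass, :children)

# Build an order-independent representation by converting children first
# and sorting them by their serialized form.
def canonical_hash(node)
  children = node.children.map { |child| canonical_hash(child) }
  { 'k' => node.klass, 'c' => children.sort_by { |child| JSON.generate(child) } }
end

tree_one = SketchNode.new('ParentWorker', [
  SketchNode.new('ChildWorker', []),
  SketchNode.new('OtherWorker', [])
])
tree_two = SketchNode.new('ParentWorker', [
  SketchNode.new('OtherWorker', []),
  SketchNode.new('ChildWorker', [])
])

# Same jobs, different queuing order: both serialize to the same string.
repr_one = JSON.generate(canonical_hash(tree_one))
repr_two = JSON.generate(canonical_hash(tree_two))
```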
|
352
|
+
Let's put workflow metrics in [StatsD](https://github.com/etsy/statsd), an easy-to-use metrics collector. Assuming we've already set up our statsd client as `$statsd`, we can push the timing info collected by Sidekiq-hierarchy with a few lines of code in an initializer (plugging into the pub/sub system):
|
353
|
+
|
354
|
+
```ruby
|
355
|
+
require 'zlib'
|
356
|
+
|
357
|
+
metrics_pusher = Proc.new do |workflow, status, old_status|
|
358
|
+
if status == :complete
|
359
|
+
uniq_repr = Zlib.crc32(workflow.to_s)
|
360
|
+
time_in_ms = (workflow.complete_at - workflow.run_at) * 1000
|
361
|
+
$statsd.timing("workflows:#{uniq_repr}", time_in_ms)
|
362
|
+
end
|
363
|
+
end
|
364
|
+
|
365
|
+
Sidekiq::Hierarchy.callback_registry.subscribe(Sidekiq::Hierarchy::Notifications::WORKFLOW_UPDATE, metrics_pusher)
|
366
|
+
```
|
367
|
+
|
368
|
+
Using something like [Graphite](http://graphite.wikidot.com/), we can then analyze the results in realtime, accessing stats like minimum, mean, maximum, and 95th percentile runtime. You'll probably want to keep a CRC32 -> workflow mapping handy; a simple hashmap (or Redis hash, hint hint) will suffice nicely.
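A minimal sketch of that mapping, using an in-memory `Hash` as a stand-in for the Redis hash: the names `crc_to_workflow` and `remember` are hypothetical, and the representation string is the `ParentWorker` example from above.

```ruby
require 'zlib'

# In-memory stand-in for the CRC32 -> representation lookup; a Redis hash
# could hold the same pairs.
crc_to_workflow = {}

# Record a representation under its checksum and return the checksum.
remember = lambda do |representation|
  key = Zlib.crc32(representation)
  crc_to_workflow[key] ||= representation
  key
end

repr = '{"k":"ParentWorker","c":[{"k":"ChildWorker","c":[]},{"k":"ChildWorker","c":[]}]}'
metric_key = remember.call(repr)
```

CRC32 is a 32-bit checksum, so two different workflow shapes can in principle collide; with `||=` the first representation seen for a key is the one kept.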
|
369
|
+
|
370
|
+
## Installation
|
371
|
+
|
372
|
+
Add this line to your application's Gemfile:
|
373
|
+
|
374
|
+
```ruby
|
375
|
+
gem 'sidekiq-hierarchy'
|
376
|
+
```
|
377
|
+
|
378
|
+
And then execute:
|
379
|
+
|
380
|
+
$ bundle
|
381
|
+
|
382
|
+
Or install it yourself as:
|
383
|
+
|
384
|
+
$ gem install sidekiq-hierarchy
|
385
|
+
|
386
|
+
If you want to use the network bridge, you'll need `faraday` as well; if you're using the web UI, make sure `sinatra` is installed.
|
387
|
+
|
388
|
+
## Development
|
389
|
+
|
390
|
+
After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
|
391
|
+
|
392
|
+
To install this gem onto your local machine, run `bundle exec rake install`.
|
393
|
+
|
394
|
+
## License
|
395
|
+
|
396
|
+
The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
|