lowkiq 1.0.0 → 1.0.1
- checksums.yaml +4 -4
- data/Gemfile.lock +16 -16
- data/LICENSE.md +3 -3
- data/README.md +366 -316
- data/README.ru.md +645 -0
- data/docker-compose.yml +1 -1
- data/lib/lowkiq/extend_tracker.rb +1 -1
- data/lib/lowkiq/version.rb +1 -1
- data/lowkiq.gemspec +2 -2
- metadata +8 -8
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 12bf22fdbf98496119d373faa88915d5c6f7e25e9522ecaafb8f865c8a00b840
+  data.tar.gz: 95e857e0a27990987aeb9f8b17442888c5b39cfc346ce6f3f8bf0911758aaa21
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8c02521986c2ba633eead7c71811ca1f04377b1acb9080d86677a4f831621c789becbd7f5ed2d7255324f238187b4dab245d2bf1c22880e4f99d89675c5f2dc3
+  data.tar.gz: 6446af89cb40417500f250d81bae859e412b7a55eb5f1e4db7c43abf6815737ea433f94922c115eb5d6933f770e75b4c48c3a6e8974477dd84d2ec19d99eae06
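To check a downloaded copy of the gem against the checksums above, the `.gem` file can be unpacked (it is a plain tar archive, e.g. `tar -xf lowkiq-1.0.1.gem`) and the digests recomputed. A minimal verification sketch, not part of the gem itself, assuming `metadata.gz` and `data.tar.gz` sit in the current directory:

```ruby
require 'digest'

# new SHA256 values recorded in checksums.yaml above
expected = {
  'metadata.gz' => '12bf22fdbf98496119d373faa88915d5c6f7e25e9522ecaafb8f865c8a00b840',
  'data.tar.gz' => '95e857e0a27990987aeb9f8b17442888c5b39cfc346ce6f3f8bf0911758aaa21',
}

expected.each do |file, sha256|
  actual = Digest::SHA256.file(file).hexdigest
  puts "#{file}: #{actual == sha256 ? 'OK' : 'MISMATCH'}"
end
```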
data/Gemfile.lock
CHANGED
@@ -11,35 +11,35 @@ GEM
   specs:
     connection_pool (2.2.2)
     diff-lcs (1.3)
-    rack (2.
+    rack (2.2.2)
     rack-test (1.1.0)
       rack (>= 1.0, < 3)
-    rake (
+    rake (12.3.3)
     redis (4.1.3)
-    rspec (3.
-      rspec-core (~> 3.
-      rspec-expectations (~> 3.
-      rspec-mocks (~> 3.
-    rspec-core (3.
-      rspec-support (~> 3.
-    rspec-expectations (3.
+    rspec (3.9.0)
+      rspec-core (~> 3.9.0)
+      rspec-expectations (~> 3.9.0)
+      rspec-mocks (~> 3.9.0)
+    rspec-core (3.9.1)
+      rspec-support (~> 3.9.1)
+    rspec-expectations (3.9.0)
       diff-lcs (>= 1.2.0, < 2.0)
-      rspec-support (~> 3.
-    rspec-mocks (3.
+      rspec-support (~> 3.9.0)
+    rspec-mocks (3.9.1)
       diff-lcs (>= 1.2.0, < 2.0)
-      rspec-support (~> 3.
-    rspec-support (3.
+      rspec-support (~> 3.9.0)
+    rspec-support (3.9.2)
 
 PLATFORMS
   ruby
 
 DEPENDENCIES
-  bundler (~> 1.
+  bundler (~> 2.1.0)
   lowkiq!
   rack-test (~> 1.1)
-  rake (~>
+  rake (~> 12.3.0)
   rspec (~> 3.0)
   rspec-mocks (~> 3.8)
 
 BUNDLED WITH
-   1.
+   2.1.2
data/LICENSE.md
CHANGED
@@ -7,7 +7,7 @@ On granting a non-exclusive right to use open source software
 
 1.1. The Licensor provides the Licensee, in the manner and on the terms set forth in this Agreement, the right to use (license) **the Lowkiq open source software** (hereinafter - the "Software").
 
-1.2. The source code for the software is available on the website located in the Internet telecommunication network "Internet" at the address: https://github.com/bia-
+1.2. The source code for the software is available on the website located in the Internet telecommunication network "Internet" at the address: https://github.com/bia-technologies/lowkiq.
 
 1.3. Software characteristics, that individualize it as a unique result of intellectual activity:
 
@@ -129,5 +129,5 @@ TIN/ 7810385714
 RRC/ 781001001
 
 Name and email address of the representative:<br>
-
-
+Mikhail Kuzmin<br>
+Mihail.Kuzmin@bia-tech.ru<br>
data/README.md
CHANGED
@@ -1,163 +1,194 @@
+[![Gem Version](https://badge.fury.io/rb/lowkiq.svg)](https://badge.fury.io/rb/lowkiq)
+
 # Lowkiq
 
-
+Ordered background job processing
 
 ![dashboard](doc/dashboard.png)
 
+* [Rationale](#rationale)
+* [Description](#description)
+* [Sidekiq comparison](#sidekiq-comparison)
+* [Queue](#queue)
+  + [Calculation algorithm for `retry_count` and `perform_in`](#calculation-algorithm-for-retry_count-and-perform_in)
+  + [Job merging rules](#job-merging-rules)
+* [Install](#install)
+* [Api](#api)
+* [Ring app](#ring-app)
+* [Configuration](#configuration)
+* [Execution](#execution)
+* [Shutdown](#shutdown)
+* [Debug](#debug)
+* [Development](#development)
+* [Exceptions](#exceptions)
+* [Rails integration](#rails-integration)
+* [Splitter](#splitter)
+* [Scheduler](#scheduler)
+* [Recommendations on configuration](#recommendations-on-configuration)
+  + [`SomeWorker.shards_count`](#someworkershards_count)
+  + [`SomeWorker.max_retry_count`](#someworkermax_retry_count)
+
 ## Rationale
 
-
+We faced some problems using Sidekiq to process messages from an external system.
+For instance, a message is the data of an order at a particular point in time,
+and the external system sends new data for the order on every change.
+Orders are updated frequently, so the queue contains several closely spaced messages for the same order.
 
-Sidekiq
-
-Sidekiq
-
+Sidekiq doesn't guarantee a strict message order, because the queue is processed by multiple threads.
+For example, if we receive two messages, M1 and M2,
+Sidekiq handlers start processing them in parallel,
+so M2 can be processed before M1.
 
-
-Параллельная обработка таких сообщений приводит к:
+Parallel processing of such messages can result in:
 
 + dead locks
-+
++ overwriting new data with old data
 
-Lowkiq
+Lowkiq was created to eliminate these problems by avoiding parallel processing of jobs that belong to the same entity.
 
 ## Description
 
-
-
+Lowkiq's queues are reliable:
+Lowkiq stores information about the job currently being processed
+and returns incomplete jobs back to the queue on startup.
+
+Jobs in a queue are ordered by their assigned execution time, so they are not FIFO queues.
 
-
-когда несколько потоков обрабатывают задачи с одинаковыми идентификаторами.
+Every job has its own identifier. Lowkiq guarantees that jobs with the same id are processed by the same thread.
 
-
-
-
-
-
+Every queue is divided into a fixed set of shards.
+A job is placed into a particular shard based on its id,
+so jobs with the same id always end up in the same shard.
+All jobs of a shard are always processed by the same thread.
+This guarantees sequential processing of jobs with the same id and rules out locking.
 
-
-
-
+Besides the id, every job has a payload.
+Payloads are accumulated for jobs with the same id,
+so all accumulated payloads are processed together.
+This is useful when you only need to process the last message and drop all previous ones.
 
-
-Если задачи содержат снимки (версии) сущности, то обработчик может использовать только последнюю версию.
+A worker corresponds to a queue and contains the job processing logic.
 
-
-
-таким образом, добавление или удаление очереди/воркера не приводит к изменению числа тредов.
-Нет смысла задавать кол-во шардов одного воркера больше, чем общее кол-во тредов.
+A fixed number of threads is used to process all jobs of all queues.
+Adding or removing queues or their shards doesn't affect the number of threads.
 
-##
+## Sidekiq comparison
 
-
+If Sidekiq works for your tasks, you should use it.
+But if you use plugins like
+[sidekiq-grouping](https://github.com/gzigzigzeo/sidekiq-grouping),
+[sidekiq-unique-jobs](https://github.com/mhenrixon/sidekiq-unique-jobs),
+[sidekiq-merger](https://github.com/dtaniwaki/sidekiq-merger)
+or implement your own locking, you should take a look at Lowkiq.
 
-
-
-
+For example, sidekiq-grouping accumulates a batch of jobs, enqueues it, and starts accumulating the next batch.
+With this approach a queue can contain two batches with data for the same order.
+These batches are processed in parallel by different threads, so we are back to the initial problem.
 
-
+Lowkiq was designed to avoid any kind of locking.
 
-
+Furthermore, Lowkiq's queues are reliable. Only Sidekiq Pro or plugins add such functionality.
 
-
-
+This [benchmark](examples/benchmark) shows the overhead of Redis usage.
+These are the results for 5 threads and 100,000 blank jobs:
 
-
-
++ lowkiq: 214 sec or 2.14 ms per job
++ sidekiq: 29 sec or 0.29 ms per job
 
-
+The difference comes from the queue structure.
+Sidekiq uses one list for all workers and fetches a job entirely in O(1).
+Lowkiq uses several data structures, including sorted sets for storing job ids,
+so fetching just the id of a job takes O(log(N)).
 
-
+## Queue
 
-
-+ `payloads` - сортированное множество payload'ов (объекты) по их score (вещественное число)
-+ `perform_in` - запланированное время начала иполнения задачи (unix timestamp, вещественное число)
-+ `retry_count` - количество совершённых повторов задачи (вещественное число)
+Please have a look at [the presentation](https://docs.google.com/presentation/d/e/2PACX-1vRdwA2Ck22r26KV1DbY__XcYpj2FdlnR-2G05w1YULErnJLB_JL1itYbBC6_JbLSPOHwJ0nwvnIHH2A/pub?start=false&loop=false&delayms=3000).
 
-
-`payloads` - множество,
-получаемое в результате группировки полезной нагрузки задачи по `id` и отсортированное по ее `score`.
-`payload` может быть объектом, т.к. сериализуется с помощью `Marshal.dump`.
-`score` может быть датой (unix timestamp) создания `payload`
-или ее монотонно увеличивающимся номером версии.
-По умолчанию - текущий unix timestamp.
-По умолчанию `perform_in` - текущий unix timestamp.
-`retry_count` для новой необработанной задачи равен `-1`, для упавшей один раз - `0`,
-т.е. считаются не совершённые, а запланированные повторы.
+Every job has the following attributes:
 
-
++ `id` - the job identifier (a string).
++ `payloads` - a sorted set of payloads ordered by score. A payload is an object, a score is a real number.
++ `perform_in` - the planned execution time (a unix timestamp, real number).
++ `retry_count` - the number of retries (a real number).
 
-
+For example, `id` can be the identifier of a replicated entity.
+`payloads` is a set built by grouping job payloads by `id` and sorting them by `score`.
+A `payload` can be a ruby object, because it is serialized with `Marshal.dump`.
+A `score` can be the `payload`'s creation date (a unix timestamp) or its monotonically increasing version number.
+By default `score` and `perform_in` are the current unix timestamp.
+`retry_count` for a new unprocessed job equals `-1`; after the first failure it is `0`,
+i.e. planned retries are counted, not performed ones.
 
-
-В этом случае ее `retry_count` инкрементируется и по заданной формуле вычисляется новый `perform_at`,
-и она ставится обратно в очередь.
+A job execution can fail. In this case its `retry_count` is incremented, a new `perform_in` is calculated with the formula below, and the job is put back into the queue.
 
-
-
-а оставшиеся элементы помещаются обратно в очередь, при этом
-`retry_count` и `perform_at` сбрасываются в `-1` и `now()` соответственно.
+When `retry_count` reaches `max_retry_count`, the element of `payloads` with the lowest (oldest) score is moved to the morgue;
+the remaining elements are moved back to the queue, and `retry_count` and `perform_in` are reset to `-1` and `now()` respectively.
 
-###
+### Calculation algorithm for `retry_count` and `perform_in`
 
-0.
+0. a job has been executed and failed
 1. `retry_count++`
-2. `perform_in = now + retry_in(try_count)`
-3. `if retry_count >= max_retry_count`
+2. `perform_in = now + retry_in(retry_count)`
+3. `if retry_count >= max_retry_count` the job is moved to the morgue
 
-|
-| ---
-|
-|
-|
+| type | `retry_count` | `perform_in` |
+| --- | --- | --- |
+| new, never executed | -1 | set or `now()` |
+| failed for the first time | 0 | `now() + retry_in(0)` |
+| failed retry | 1 | `now() + retry_in(1)` |
 
-
+If `max_retry_count = 1`, retries stop.
 
-###
+### Job merging rules
 
-
+They are applied when:
 
-+
-+
-+
++ a job is already in the queue and a new one with the same id is added
++ a job has failed while a new one with the same id has been added
++ a job from the morgue is moved back to the queue, and the queue already has a job with the same id
 
-
+Algorithm:
 
-+ payloads
-
-+
-
-+ если объединяется упавшая задача и задача из очереди,
-то `perform_at` и `retry_count` берутся из упавшей
-+ если объединяется задача из морга и задача из очереди,
-то `perform_at = now()`, `retry_count = -1`
++ payloads are merged; for equal payloads the minimal score is chosen
++ if a new job is merged with a queued job, `perform_in` and `retry_count` are taken from the queued job
++ if a failed job is merged with a queued job, `perform_in` and `retry_count` are taken from the failed one
++ if a morgue job is merged with a queued job, `perform_in = now()` and `retry_count = -1`
 
-
+Example:
 
 ```
-# v1
-# #{"v1": 1}
+# v1 is the first version and v2 is the second
+# #{"v1": 1} is a sorted set with a single element; the payload is "v1", the score is 1
 
-#
-{ id: "1", payloads: #{"v1": 1, "v2": 2}, retry_count: 0,
-#
-{ id: "1", payloads: #{"v2": 3, "v3": 4}, retry_count: -1,
+# a job in the queue
+{ id: "1", payloads: #{"v1": 1, "v2": 2}, retry_count: 0, perform_in: 1536323288 }
+# a job being added
+{ id: "1", payloads: #{"v2": 3, "v3": 4}, retry_count: -1, perform_in: 1536323290 }
 
-#
-{ id: "1", payloads: #{"v1": 1, "v2": 3, "v3": 4}, retry_count: 0,
+# the resulting job in the queue
+{ id: "1", payloads: #{"v1": 1, "v2": 3, "v3": 4}, retry_count: 0, perform_in: 1536323288 }
 ```
 
-
-
+The morgue is part of the queue. Jobs in the morgue are not processed.
+A job in the morgue has the following attributes:
 
-+ id
-+ payloads
++ id - the job identifier
++ payloads
 
-
+A job from the morgue can be moved back to the queue; `retry_count = 0` and `perform_in = now()` will be set.
 
-
+## Install
 
-
+```
+# Gemfile
+
+gem 'lowkiq'
+```
+
+Redis >= 3.2
+
+## Api
 
 ```ruby
 module ATestWorker
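The retry algorithm described in the README text above can be traced with a small standalone sketch. The `retry_in` formula below is the default one quoted later in this README; the loop and the printed values are illustrative only and not part of Lowkiq's API:

```ruby
# Default retry_in formula quoted in the README (seconds until the next attempt).
def retry_in(retry_count)
  (retry_count ** 4) + 15 + (rand(30) * (retry_count + 1))
end

retry_count = -1                # a new, never executed job
perform_in  = Time.now.to_i     # executed as soon as possible

3.times do
  # the job ran and failed: retry_count++ and a new perform_in is calculated
  retry_count += 1
  perform_in = Time.now.to_i + retry_in(retry_count)
  puts "retry_count=#{retry_count} perform_in=#{perform_in}"
end
# once retry_count >= max_retry_count, the oldest payload goes to the morgue
```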
@@ -171,11 +202,10 @@ module ATestWorker
     10 * (count + 1) # (i.e. 10, 20, 30, 40, 50)
   end
 
-  def self.perform(
-    # payloads_by_id
+  def self.perform(payloads_by_id)
+    # payloads_by_id is a hash map
     payloads_by_id.each do |id, payloads|
-      #
-      # payloads отсортированы по score, от старых к новым (от минимальных к максимальным)
+      # payloads are sorted by score, from old to new (min to max)
       payloads.each do |payload|
         do_some_work(id, payload)
       end
@@ -184,7 +214,7 @@ module ATestWorker
   end
 end
 ```
 
-
+Default values:
 
 ```ruby
 self.shards_count = 5
@@ -204,11 +234,11 @@ ATestWorker.perform_async [
   { id: 1, payload: { attr: 'v1' } },
   { id: 2, payload: { attr: 'v1' }, score: Time.now.to_i, perform_in: Time.now.to_i },
 ]
-# payload
-# score
+# payload defaults to ""
+# score and perform_in default to Time.now.to_i
 ```
 
-
+It is possible to redefine `perform_async` and calculate `id`, `score` and `perform_in` in the worker code:
 
 ```ruby
 module ATestWorker
@@ -229,56 +259,28 @@ end
 ATestWorker.perform_async 1000.times.map { |id| { payload: {id: id} } }
 ```
 
-### Max retry count
-
-Исходя из `retry_in` и `max_retry_count`,
-можно вычислить примерное время, которая задача проведет в очереди.
-
-Для `retry_in`, заданного по умолчанию получается следующая таблица:
-
-```ruby
-def retry_in(retry_count)
-  (retry_count ** 4) + 15 + (rand(30) * (retry_count + 1))
-end
-```
-
-| `max_retry_count` | кол-во дней жизни задачи |
-| --- | --- |
-| 14 | 1 |
-| 16 | 2 |
-| 18 | 3 |
-| 19 | 5 |
-| 20 | 6 |
-| 21 | 8 |
-| 22 | 10 |
-| 23 | 13 |
-| 24 | 16 |
-| 25 | 20 |
-
-`(0...25).map{ |c| retry_in c }.sum / 60 / 60 / 24`
-
 ## Ring app
 
-`Lowkiq::Web` - ring app.
+`Lowkiq::Web` is a ring app.
 
-+ `/` - dashboard
-+ `/api/v1/stats` -
++ `/` - the dashboard
++ `/api/v1/stats` - queue length, morgue length and lag for each worker, plus the totals
 
-##
+## Configuration
 
-
+Default options and values:
 
-+ `Lowkiq.poll_interval = 1` -
-
-+ `Lowkiq.threads_per_node = 5` -
-+ `Lowkiq.redis = ->() { Redis.new url: ENV.fetch('REDIS_URL') }` -
-+ `Lowkiq.client_pool_size = 5` -
-+ `Lowkiq.pool_timeout = 5` -
-+ `Lowkiq.server_middlewares = []` -
-+ `Lowkiq.on_server_init = ->() {}` -
-+ `Lowkiq.build_scheduler = ->() { Lowkiq.build_lag_scheduler }`
-+ `Lowkiq.build_splitter = ->() { Lowkiq.build_default_splitter }`
-+ `Lowkiq.last_words = ->(ex) {}`
++ `Lowkiq.poll_interval = 1` - delay in seconds between queue polls for new jobs.
+  Used only if the queue was empty on the previous cycle or an error occurred.
++ `Lowkiq.threads_per_node = 5` - threads per node.
++ `Lowkiq.redis = ->() { Redis.new url: ENV.fetch('REDIS_URL') }` - Redis connection options.
++ `Lowkiq.client_pool_size = 5` - Redis pool size for enqueueing jobs.
++ `Lowkiq.pool_timeout = 5` - client and server Redis pool timeout.
++ `Lowkiq.server_middlewares = []` - a list of middleware wrapping each worker.
++ `Lowkiq.on_server_init = ->() {}` - a lambda executed when the server initializes.
++ `Lowkiq.build_scheduler = ->() { Lowkiq.build_lag_scheduler }` - the scheduler.
++ `Lowkiq.build_splitter = ->() { Lowkiq.build_default_splitter }` - the splitter.
++ `Lowkiq.last_words = ->(ex) {}` - a handler for `StandardError` descendants that caused the process to stop.
 
 ```ruby
 $logger = Logger.new(STDOUT)
@@ -299,184 +301,53 @@ Lowkiq.server_middlewares << -> (worker, batch, &block) do
 end
 ```
 
-##
-
-У каждого воркера есть несколько шардов:
-
-```
-# worker: shard ids
-worker A: 0, 1, 2
-worker B: 0, 1, 2, 3
-worker C: 0
-worker D: 0, 1
-```
-
-Lowkiq использует фиксированное кол-во тредов для обработки задач, следовательно нужно распределить шарды
-между тредами. Этим занимается Splitter.
-
-Чтобы определить набор шардов, которые будет обрабатывать тред, поместим их в один список:
-
-```
-A0, A1, A2, B0, B1, B2, B3, C0, D0, D1
-```
-
-Рассмотрим Default splitter, который равномерно распределяет шарды по тредам единственной ноды.
-
-Если `threads_per_node` установлено в 3, то распределение будет таким:
-
-```
-# thread id: shards
-t0: A0, B0, B3, D1
-t1: A1, B1, C0
-t2: A2, B2, D0
-```
-
-Помимо Default есть ByNode splitter. Он позволяет распределить нагрузку по нескольким процессам (нодам).
-
-
-```
-Lowkiq.build_splitter = -> () do
-  Lowkiq.build_by_node_splitter(
-    ENV.fetch('LOWKIQ_NUMBER_OF_NODES').to_i,
-    ENV.fetch('LOWKIQ_NODE_NUMBER').to_i
-  )
-end
-```
-
-Таким образом, вместо одного процесса нужно запустить несколько и указать переменные окружения:
-
-```
-# process 0
-LOWKIQ_NUMBER_OF_NODES=2 LOWKIQ_NODE_NUMBER=0 bundle exec lowkiq -r ./lib/app.rb
-
-# process 1
-LOWKIQ_NUMBER_OF_NODES=2 LOWKIQ_NODE_NUMBER=1 bundle exec lowkiq -r ./lib/app.rb
-```
-
-Отмечу, что общее количество тредов будет равно произведению `ENV.fetch('LOWKIQ_NUMBER_OF_NODES')` и `Lowkiq.threads_per_node`.
+## Execution
 
-
-
-## Scheduler
-
-Каждый тред обрабатывает набор шардов. За выбор шарда для обработки отвечает планировщик.
-Каждый поток имеет свой собственный экземпляр планировщика.
-
-Lowkiq имеет 2 планировщика на выбор.
-Первый, `Seq` - последовательно перебирает шарды.
-Второй, `Lag` - выбирает шард с самой старой задачей, т.е. стремится минимизировать лаг.
-Используется по умолчанию.
-
-Планировщик задается через настройки:
-
-```
-Lowkiq.build_scheduler = ->() { Lowkiq.build_seq_scheduler }
-# или
-Lowkiq.build_scheduler = ->() { Lowkiq.build_lag_scheduler }
-```
-
-## Исключения
-
-`StandardError` выброшенные воркером обрабатываются с помощью middleware.
-Такие исключения не приводят к остановке процесса.
-
-Все прочие исключения приводят к остановке процесса.
-При этом Lowkiq дожидается выполнения задач другими тредами.
-
-`StandardError` выброшенные вне воркера передаются в `Lowkiq.last_words`.
-Например это происходит при потере соединения к Redis или при ошибке в коде Lowkiq.
+`lowkiq -r ./path_to_app`
 
-
+`path_to_app.rb` must load the app. [Example](examples/dummy/lib/app.rb).
 
-
+Lazy loading of worker modules is not allowed.
+To load modules up front, use
+`require`
+or [`require_dependency`](https://api.rubyonrails.org/classes/ActiveSupport/Dependencies/Loadable.html#method-i-require_dependency)
+for Ruby on Rails.
 
-
-то дождитесь опустошения очередей и выкатите новую версию кода с измененным кол-вом шардов.
+## Shutdown
 
-
+Send a TERM or INT signal to the process (Ctrl-C).
+The process will wait for running jobs to finish.
 
-
+Note that if a queue is empty, the process sleeps for `poll_interval` seconds,
+so it will not stop until the `poll_interval` seconds have passed.
 
-
-module ATestWorker
-  extend Lowkiq::Worker
+## Debug
 
-
+To get a trace of all threads of the app:
 
-  def self.perform(payloads_by_id)
-    some_code
-  end
-end
 ```
-
-
-
-```ruby
-module ATestWorker
-  extend Lowkiq::Worker
-
-  self.shards_count = 10
-  self.queue_name = "#{self.name}_V2"
-
-  def self.perform(payloads_by_id)
-    some_code
-  end
-end
-```
-
-И добавить воркер, перекладывающий задачи из старой очереди в новую:
-
-```ruby
-module ATestMigrationWorker
-  extend Lowkiq::Worker
-
-  self.shards_count = 5
-  self.queue_name = "ATestWorker"
-
-  def self.perform(payloads_by_id)
-    jobs = payloads_by_id.each_with_object([]) do |(id, payloads), acc|
-      payloads.each do |payload|
-        acc << { id: id, payload: payload }
-      end
-    end
-
-    ATestWorker.perform_async jobs
-  end
-end
+kill -TTIN <pid>
+cat /tmp/lowkiq_ttin.txt
 ```
 
-## Запуск
-
-`lowkiq -r ./path_to_app`
-
-`path_to_app.rb` должен загрузить приложение.
-Ленивая загрузка модулей воркеров недопустима.
-
-Redis версии >= 3.2.
-
-## Остановка
-
-Послать процессу TERM или INT(Ctrl-C).
-Процесс будет ждать завершения всех задач.
-Обратите внимание, если очередь пуста, то на время завершения влияет величина `poll_interval`.
-
 ## Development
 
 ```
 docker-compose run --rm --service-port app bash
-
+bundle
 rspec
 cd examples/dummy ; bundle exec ../../exe/lowkiq -r ./lib/app.rb
 ```
 
-##
+## Exceptions
 
-
+`StandardError` raised by a worker is handled by the middleware. Such exceptions don't stop the process.
 
-
-
-
-
+All other exceptions cause the process to stop.
+Lowkiq will wait while the other threads finish their jobs.
+
+`StandardError` raised outside of a worker is passed to `Lowkiq.last_words`.
+For example, this happens when the Redis connection is lost or when Lowkiq's code has a bug.
 
 ## Rails integration
 
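The Exceptions section above relies on `Lowkiq.server_middlewares` for reporting worker errors. A minimal sketch of such a middleware, following the `(worker, batch, &block)` signature used by the README's own middleware examples; the logging and the re-raise are assumptions for illustration, not Lowkiq API:

```ruby
require 'logger'

$logger = Logger.new(STDOUT)

Lowkiq.server_middlewares << -> (worker, batch, &block) do
  begin
    block.call
  rescue StandardError => e
    # report the error; re-raise on the assumption that Lowkiq's own retry
    # bookkeeping (retry_count, perform_in) happens outside the middleware chain
    $logger.error "#{worker} failed: #{e.class}: #{e.message}"
    raise
  end
end
```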
@@ -493,10 +364,10 @@ end
 ```ruby
 # config/initializers/lowkiq.rb
 
-#
+# loading all lowkiq workers
 Dir["#{Rails.root}/app/lowkiq_workers/**/*.rb"].each { |file| require_dependency file }
 
-#
+# configuration:
 # Lowkiq.redis = -> { Redis.new url: ENV.fetch('LOWKIQ_REDIS_URL') }
 # Lowkiq.threads_per_node = ENV.fetch('LOWKIQ_THREADS_PER_NODE').to_i
 # Lowkiq.client_pool_size = ENV.fetch('LOWKIQ_CLIENT_POOL_SIZE').to_i
@@ -558,7 +429,7 @@ if defined? NewRelic
   Lowkiq.server_middlewares << NewRelicLowkiqMiddleware.new
 end
 
-# Rails reloader,
+# Rails reloader, responsible for cleaning up ActiveRecord connections
 Lowkiq.server_middlewares << -> (worker, batch, &block) do
   Rails.application.reloader.wrap do
     block.call
@@ -574,4 +445,183 @@ Lowkiq.on_server_init = ->() do
 end
 ```
 
-
+Execution: `bundle exec lowkiq -r ./config/environment.rb`
+
+
+## Splitter
+
+Each worker has several shards:
+
+```
+# worker: shard ids
+worker A: 0, 1, 2
+worker B: 0, 1, 2, 3
+worker C: 0
+worker D: 0, 1
+```
+
+Lowkiq uses a fixed number of threads for job processing, so it has to distribute shards between threads.
+The Splitter does that.
+
+To define the set of shards processed by each thread, put all shards into one list:
+
+```
+A0, A1, A2, B0, B1, B2, B3, C0, D0, D1
+```
+
+The Default splitter evenly distributes shards across the threads of a single node.
+
+If `threads_per_node` is set to 3, the distribution will be:
+
+```
+# thread id: shards
+t0: A0, B0, B3, D1
+t1: A1, B1, C0
+t2: A2, B2, D0
+```
+
+Besides Default, Lowkiq has the ByNode splitter. It divides the load between several processes (nodes).
+
+```
+Lowkiq.build_splitter = -> () do
+  Lowkiq.build_by_node_splitter(
+    ENV.fetch('LOWKIQ_NUMBER_OF_NODES').to_i,
+    ENV.fetch('LOWKIQ_NODE_NUMBER').to_i
+  )
+end
+```
+
+So instead of a single process you start several and set the environment variables:
+
+```
+# process 0
+LOWKIQ_NUMBER_OF_NODES=2 LOWKIQ_NODE_NUMBER=0 bundle exec lowkiq -r ./lib/app.rb
+
+# process 1
+LOWKIQ_NUMBER_OF_NODES=2 LOWKIQ_NODE_NUMBER=1 bundle exec lowkiq -r ./lib/app.rb
+```
+
+The total number of threads equals the product of `ENV.fetch('LOWKIQ_NUMBER_OF_NODES')` and `Lowkiq.threads_per_node`.
+
+You can also write your own splitter if your app needs a different distribution of shards between threads or nodes.
+
+## Scheduler
+
+Every thread processes a set of shards. The scheduler selects the shard to process next.
+Every thread has its own instance of the scheduler.
+
+Lowkiq ships with two schedulers.
+`Seq` iterates over the shards sequentially.
+`Lag` chooses the shard with the oldest job, minimizing the lag. It is used by default.
+
+The scheduler is set through the settings:
+
+```
+Lowkiq.build_scheduler = ->() { Lowkiq.build_seq_scheduler }
+# or
+Lowkiq.build_scheduler = ->() { Lowkiq.build_lag_scheduler }
+```
+
+## Recommendations on configuration
+
+### `SomeWorker.shards_count`
+
+The sum of `shards_count` over all workers shouldn't be less than `Lowkiq.threads_per_node`,
+otherwise threads will stay idle.
+
+The sum of `shards_count` over all workers can be equal to `Lowkiq.threads_per_node`.
+In this case each thread processes a single shard. This makes sense only with a uniform queue load.
+
+The sum of `shards_count` over all workers can be greater than `Lowkiq.threads_per_node`.
+In this case `shards_count` acts as a priority:
+the larger it is, the more often the jobs of that queue are processed.
+
+There is no reason to set the `shards_count` of one worker higher than `Lowkiq.threads_per_node`,
+because every thread would handle more than one shard of that queue, which only increases overhead.
+
+### `SomeWorker.max_retry_count`
+
+From `retry_in` and `max_retry_count` you can estimate how long a job's payload stays in the queue.
+Once `max_retry_count` is reached, the payload with the minimal score is moved to the morgue.
+
+For the default `retry_in` we get the following table.
+
+```ruby
+def retry_in(retry_count)
+  (retry_count ** 4) + 15 + (rand(30) * (retry_count + 1))
+end
+```
+
+| `max_retry_count` | job lifetime, days |
+| --- | --- |
+| 14 | 1 |
+| 16 | 2 |
+| 18 | 3 |
+| 19 | 5 |
+| 20 | 6 |
+| 21 | 8 |
+| 22 | 10 |
+| 23 | 13 |
+| 24 | 16 |
+| 25 | 20 |
+
+`(0...25).map{ |c| retry_in c }.sum / 60 / 60 / 24`
+
+
+## Changing the number of a worker's shards
+
+Try to estimate the number of shards up front and avoid changing it later.
+
+If you can disable adding new jobs, wait for the queues to become empty and deploy the new version of the code with the changed number of shards.
+
+If you can't, follow these steps:
+
+A worker example:
+
+```ruby
+module ATestWorker
+  extend Lowkiq::Worker
+
+  self.shards_count = 5
+
+  def self.perform(payloads_by_id)
+    some_code
+  end
+end
+```
+
+Set the new number of shards and a new queue name:
+
+```ruby
+module ATestWorker
+  extend Lowkiq::Worker
+
+  self.shards_count = 10
+  self.queue_name = "#{self.name}_V2"
+
+  def self.perform(payloads_by_id)
+    some_code
+  end
+end
+```
+
+Add a worker that moves jobs from the old queue to the new one:
+
+```ruby
+module ATestMigrationWorker
+  extend Lowkiq::Worker
+
+  self.shards_count = 5
+  self.queue_name = "ATestWorker"
+
+  def self.perform(payloads_by_id)
+    jobs = payloads_by_id.each_with_object([]) do |(id, payloads), acc|
+      payloads.each do |payload|
+        acc << { id: id, payload: payload }
+      end
+    end
+
+    ATestWorker.perform_async jobs
+  end
+end
+```
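The README describes `Lowkiq::Web` as a ring (Rack) app with a dashboard at `/` and stats at `/api/v1/stats`. A hypothetical `config.ru` sketch, assuming it can be mounted like any other Rack application and that `./lib/app.rb` is whatever file loads your workers and Lowkiq configuration:

```ruby
# config.ru - illustrative only; path and mounting style are assumptions
require_relative 'lib/app'

run Lowkiq::Web
```

With a sketch like this, `rackup config.ru` would serve the dashboard at `/` and the per-worker queue length, morgue length and lag at `/api/v1/stats`.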