rspecq 0.0.1.pre2 → 0.1.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +18 -0
- data/README.md +165 -63
- data/Rakefile +9 -0
- data/bin/rspecq +79 -28
- data/lib/rspecq.rb +5 -7
- data/lib/rspecq/formatters/failure_recorder.rb +3 -2
- data/lib/rspecq/queue.rb +21 -5
- data/lib/rspecq/reporter.rb +3 -4
- data/lib/rspecq/version.rb +1 -1
- data/lib/rspecq/worker.rb +112 -63
- metadata +55 -11
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: d6b4c91525a2fb29e2198f290877ffe5ef1e753dafe0f9babd75a581e25d7af8
|
4
|
+
data.tar.gz: 27d3705ee014a5dc77514238b36386eaf11d1bb76601ab25c463796184ea5795
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 21803fa664abe45f173f7121dc948ebc8dbc0df41046f8e6269e8ec3751b647701b07a8b6a872aad0da22f438c043ebe6bf0fc69bc9c9c8b327e3b157dccab04
|
7
|
+
data.tar.gz: 4d683884610e2e28ca5ce5cf891e37135886ea4c6ec905f26607c1f0fdd62c1952e9c49983bb0191eb5d42433a800bf88f8227354a213b50645742f21f95af16
|
data/CHANGELOG.md
CHANGED
@@ -1,4 +1,22 @@
|
|
1
1
|
# Changelog
|
2
2
|
|
3
|
+
Breaking changes are prefixed with a "[BREAKING]" label.
|
4
|
+
|
3
5
|
## master (unreleased)
|
4
6
|
|
7
|
+
## 0.1.0 (2020-08-27)
|
8
|
+
|
9
|
+
### Added
|
10
|
+
|
11
|
+
- Sentry integration for various RSpecQ-level events [[#16](https://github.com/skroutz/rspecq/pull/16)]
|
12
|
+
- CLI: Flags can now also be set via environment variables [[c519230](https://github.com/skroutz/rspecq/commit/c5192303e229f361e8ac86ae449b4ea84d42e022)]
|
13
|
+
- CLI: Added shorthand versions of some flags [[df9faa8](https://github.com/skroutz/rspecq/commit/df9faa8ec6721af8357cfee4de6a2fe7b32070fc)]
|
14
|
+
- CLI: Added `--help` and `--version` flags [[df9faa8](https://github.com/skroutz/rspecq/commit/df9faa8ec6721af8357cfee4de6a2fe7b32070fc)]
|
15
|
+
- CLI: Max number of retries for failed examples is now configurable via the `--max-requeues` option [[#14](https://github.com/skroutz/rspecq/pull/14)]
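To illustrate the `--max-requeues` entry, here is a minimal sketch of the retry bookkeeping using a plain in-memory class. `RequeueTracker`, `@requeues` and `pending` are hypothetical names used only for illustration; the real queue is Redis-backed and may track requeues differently. Only the `requeue_job(example_id, max_requeues)` call shape is taken from the source.

```ruby
# Hypothetical in-memory stand-in for the requeue bookkeeping.
# Assumption: the real RSpecQ queue keeps this state in Redis.
class RequeueTracker
  attr_reader :pending

  def initialize
    @requeues = Hash.new(0) # example id => times requeued so far
    @pending = []           # jobs waiting to be picked up again
  end

  # Returns true if the example was put back on the queue, false if it has
  # used up its retries and should be reported as a legit failure.
  def requeue_job(example_id, max_requeues)
    return false if @requeues[example_id] >= max_requeues

    @requeues[example_id] += 1
    @pending.push(example_id)
    true
  end
end

tracker = RequeueTracker.new
example = "./spec/models/foo_spec.rb[1:2:1]"
p 4.times.map { tracker.requeue_job(example, 3) }
# => [true, true, true, false]
```

With the default of 3, an example is retried three times and the fourth failure is the one that ends up in the final report.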
|
16
|
+
|
17
|
+
### Changed
|
18
|
+
|
19
|
+
- [BREAKING] CLI: Renamed `--timings` to `--update-timings` [[c519230](https://github.com/skroutz/rspecq/commit/c5192303e229f361e8ac86ae449b4ea84d42e022)]
|
20
|
+
- [BREAKING] CLI: Renamed `--build-id` to `--build` and `--worker-id` to `--worker` [[df9faa8](https://github.com/skroutz/rspecq/commit/df9faa8ec6721af8357cfee4de6a2fe7b32070fc)]
|
21
|
+
- CLI: `--worker` is not required when `--reporter` is used [[4323a75](https://github.com/skroutz/rspecq/commit/4323a75ca357274069d02ba9fb51cdebb04e0be4)]
|
22
|
+
- CLI: Improved help output [[df9faa8](https://github.com/skroutz/rspecq/commit/df9faa8ec6721af8357cfee4de6a2fe7b32070fc)]
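The CLI changes listed here (renamed flags, environment variable fallbacks, `--worker` being optional in reporter mode) can be sketched as a single resolution function. This is a simplified illustration, not the actual `bin/rspecq` script: `resolve_options` is a hypothetical helper, and `argv`/`env` are passed in as arguments so the logic can be exercised without touching the real `ARGV` or `ENV`. The flag names, the `RSPECQ_*` variable names and the default of 3 are the ones that appear in the 0.1.0 source.

```ruby
require "optparse"

DEFAULT_MAX_REQUEUES = 3

# Resolve options from CLI flags first, then environment variables, then
# built-in defaults. `resolve_options` is a hypothetical helper.
def resolve_options(argv, env)
  opts = {}

  OptionParser.new do |o|
    o.on("-b", "--build ID") { |v| opts[:build] = v }
    o.on("-w", "--worker ID") { |v| opts[:worker] = v }
    o.on("--report") { |v| opts[:report] = v }
    o.on("--max-requeues N", Integer) { |v| opts[:max_requeues] = v }
  end.parse!(argv.dup)

  # Anything not given as a flag falls back to the environment.
  opts[:build] ||= env["RSPECQ_BUILD"]
  opts[:worker] ||= env["RSPECQ_WORKER"]
  opts[:report] ||= ["1", "true"].include?(env["RSPECQ_REPORT"])
  opts[:max_requeues] ||= Integer(env["RSPECQ_MAX_REQUEUES"] || DEFAULT_MAX_REQUEUES)

  # A build is always required; a worker ID only outside reporter mode.
  raise OptionParser::MissingArgument.new("build") if opts[:build].nil?
  raise OptionParser::MissingArgument.new("worker") if !opts[:report] && opts[:worker].nil?

  opts
end

p resolve_options(["--build=123", "-w", "foo1"], { "RSPECQ_MAX_REQUEUES" => "5" })
```

Because every fallback uses `||=`, an explicit flag always wins over the corresponding environment variable.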
|
data/README.md
CHANGED
@@ -1,102 +1,204 @@
|
|
1
|
-
|
1
|
+
RSpec Queue
|
2
|
+
=========================================================================
|
3
|
+
[](https://travis-ci.com/github/skroutz/rspecq)
|
4
|
+
[](https://badge.fury.io/rb/rspecq)
|
2
5
|
|
3
|
-
|
4
|
-
|
6
|
+
RSpec Queue (RSpecQ) distributes and executes RSpec suites among parallel
|
7
|
+
workers. It uses a centralized queue that workers connect to and pop off
|
8
|
+
tests from. It ensures optimal scheduling of tests based on their run time,
|
9
|
+
facilitating faster CI builds.
|
5
10
|
|
6
|
-
RSpecQ is
|
11
|
+
RSpecQ is inspired by [test-queue](https://github.com/tmm1/test-queue)
|
7
12
|
and [ci-queue](https://github.com/Shopify/ci-queue).
|
8
13
|
|
9
|
-
##
|
14
|
+
## Features
|
15
|
+
|
16
|
+
- Run an RSpec suite among many workers
|
17
|
+
(potentially located in different hosts) in a distributed fashion,
|
18
|
+
facilitating faster CI builds.
|
19
|
+
- Consolidated, real-time reporting of a build's progress.
|
20
|
+
- Optimal scheduling of test execution by using timings statistics from previous runs and
|
21
|
+
automatically scheduling slow spec files as individual examples. See
|
22
|
+
[*Spec file splitting*](#spec-file-splitting).
|
23
|
+
- Automatic retry of test failures before being considered legit, in order to
|
24
|
+
rule out flakiness. See [*Requeues*](#requeues).
|
25
|
+
- Handles intermittent worker failures (e.g. network hiccups, faulty hardware etc.)
|
26
|
+
by detecting non-responsive workers and requeuing their jobs. See [*Worker failures*](#worker-failures).
|
27
|
+
- [Sentry](https://sentry.io) integration for monitoring important
|
28
|
+
RSpecQ-level events.
|
29
|
+
- [PLANNED] StatsD integration for various build-level metrics and insights.
|
30
|
+
See [#2](https://github.com/skroutz/rspecq/issues/2).
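The timing-based scheduling and file splitting mentioned in the feature list can be sketched roughly as follows. This is a simplified illustration under a few assumptions: `schedule` and the `split` callable are hypothetical names, `timings` is assumed to be a Hash of job to seconds ordered slowest first, and the real worker obtains example IDs by running RSpec in dry-run mode rather than through a lambda.

```ruby
# files:   spec files of the current build
# timings: job => seconds from previous runs, slowest first (assumption)
# split:   hypothetical callable that expands a file into example ids
def schedule(files, timings, file_split_threshold, split)
  # Without previous timings there is nothing to sort by.
  return files.shuffle if timings.empty?

  # Files at or above the threshold are scheduled as individual examples.
  slow_files = timings.take_while { |_job, t| t >= file_split_threshold }
                      .map(&:first) & files

  jobs = (files - slow_files) + slow_files.flat_map { |f| split.call(f) }

  # Jobs without a previous timing (e.g. a newly added spec file) get the
  # median timing, which places them around the middle of the queue.
  default_timing = timings.values[timings.size / 2]

  # Slowest jobs first, so long-running work starts as early as possible.
  jobs.sort_by { |j| -(timings[j] || default_timing) }
end

timings = {
  "a_spec.rb" => 30.0, "a_spec.rb[1:1]" => 20.0, "a_spec.rb[1:2]" => 10.0,
  "old_spec.rb" => 8.0, "b_spec.rb" => 5.0, "c_spec.rb" => 1.0
}
split = ->(f) { ["#{f}[1:1]", "#{f}[1:2]"] }
p schedule(%w[a_spec.rb b_spec.rb c_spec.rb new_spec.rb], timings, 25, split)
```

Putting the slowest jobs at the head of the queue reduces the chance that one long job is picked up last and leaves the other workers idle.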
|
10
31
|
|
11
|
-
|
12
|
-
in the workers (up to 3 minutes), increased memory consumption and too much
|
13
|
-
disk I/O on boot. This is due to the fact that a worker in ci-queue has to
|
14
|
-
load every spec file on boot. This can be problematic for applications with
|
15
|
-
a large number of spec files.
|
16
|
-
|
17
|
-
RSpecQ works with spec files as its unit of work (as opposed to ci-queue which
|
18
|
-
works with individual examples). This means that an RSpecQ worker does not
|
19
|
-
have to load all spec files at once and so it doesn't have the aforementioned
|
20
|
-
problems. It also allows suites to keep using `before(:all)` hooks
|
21
|
-
(which ci-queue explicitly rejects). (Note: RSpecQ also schedules individual
|
22
|
-
examples, but only when this is deemed necessary, see section
|
23
|
-
"Spec file splitting").
|
32
|
+
## Usage
|
24
33
|
|
25
|
-
|
26
|
-
|
34
|
+
A worker needs to be given a name and the build it will participate in.
|
35
|
+
Assuming there's a Redis instance listening at `localhost`, starting a worker
|
36
|
+
is as simple as:
|
27
37
|
|
28
|
-
|
29
|
-
|
30
|
-
|
31
|
-
on every run (if the `--timings` option was used). Also, RSpecQ has a "slow
|
32
|
-
file threshold" which, currently has to be set manually (but this can be
|
33
|
-
improved).
|
38
|
+
```shell
|
39
|
+
$ rspecq --build=123 --worker=foo1 spec/
|
40
|
+
```
|
34
41
|
|
35
|
-
|
42
|
+
To start more workers for the same build, use distinct worker IDs but the same
|
43
|
+
build ID:
|
36
44
|
|
37
|
-
|
45
|
+
```shell
|
46
|
+
$ rspecq --build=123 --worker=foo2
|
47
|
+
```
|
38
48
|
|
39
|
-
|
40
|
-
Redis is located. To start a worker:
|
49
|
+
To view the progress of the build use `--report`:
|
41
50
|
|
42
51
|
```shell
|
43
|
-
$ rspecq --build
|
52
|
+
$ rspecq --build=123 --report
|
44
53
|
```
|
45
54
|
|
46
|
-
|
55
|
+
For detailed info use `--help`:
|
47
56
|
|
48
|
-
```shell
|
49
|
-
$ rspecq --build-id=foo --worker-id=reporter --redis=redis://localhost --report
|
50
57
|
```
|
58
|
+
NAME:
|
59
|
+
rspecq - Optimally distribute and run RSpec suites among parallel workers
|
60
|
+
|
61
|
+
USAGE:
|
62
|
+
rspecq [<options>] [spec files or directories]
|
63
|
+
|
64
|
+
OPTIONS:
|
65
|
+
-b, --build ID A unique identifier for the build. Should be common among workers participating in the same build.
|
66
|
+
-w, --worker ID An identifier for the worker. Workers participating in the same build should have distinct IDs.
|
67
|
+
-r, --redis HOST Redis host to connect to (default: 127.0.0.1).
|
68
|
+
--update-timings Update the global job timings key with the timings of this build. Note: This key is used as the basis for job scheduling.
|
69
|
+
--file-split-threshold N Split spec files slower than N seconds and schedule them as individual examples.
|
70
|
+
--report Enable reporter mode: do not pull tests off the queue; instead print build progress and exit when it's finished.
|
71
|
+
Exits with a non-zero status code if there were any failures.
|
72
|
+
--report-timeout N Fail if build is not finished after N seconds. Only applicable if --report is enabled (default: 3600).
|
73
|
+
--max-requeues N Retry failed examples up to N times before considering them legit failures (default: 3).
|
74
|
+
-h, --help Show this message.
|
75
|
+
-v, --version Print the version and exit.
|
76
|
+
```
|
77
|
+
|
78
|
+
### Sentry integration
|
51
79
|
|
52
|
-
|
80
|
+
RSpecQ can optionally emit build events to a
|
81
|
+
[Sentry](https://sentry.io) project by setting the
|
82
|
+
[`SENTRY_DSN`](https://github.com/getsentry/raven-ruby#raven-only-runs-when-sentry_dsn-is-set)
|
83
|
+
environment variable.
|
84
|
+
|
85
|
+
This is convenient for monitoring important warnings/errors that may impact
|
86
|
+
build times, such as the fact that no previous timings were found and
|
87
|
+
therefore job scheduling was effectively random for a particular build.
|
53
88
|
|
54
89
|
|
55
90
|
## How it works
|
56
91
|
|
57
|
-
The
|
92
|
+
The core design is almost identical to ci-queue so please refer to its
|
93
|
+
[README](https://github.com/Shopify/ci-queue/blob/master/README.md) instead.
|
58
94
|
|
59
95
|
### Terminology
|
60
96
|
|
61
|
-
- Job
|
97
|
+
- **Job**: the smallest unit of work, which is usually a spec file
|
62
98
|
(e.g. `./spec/models/foo_spec.rb`) but can also be an individual example
|
63
|
-
(e.g. `./spec/models/foo_spec.rb[1:2:1]`) if the file is too slow
|
64
|
-
- Queue
|
65
|
-
information for RSpecQ to
|
66
|
-
be executed, the failure reports
|
67
|
-
-
|
68
|
-
|
69
|
-
|
70
|
-
|
99
|
+
(e.g. `./spec/models/foo_spec.rb[1:2:1]`) if the file is too slow.
|
100
|
+
- **Queue**: a collection of Redis-backed structures that hold all the necessary
|
101
|
+
information for an RSpecQ build to run. This includes timing statistics,
|
102
|
+
jobs to be executed, the failure reports and more.
|
103
|
+
- **Build**: a particular test suite run. Each build has its own **Queue**.
|
104
|
+
- **Worker**: an `rspecq` process that, given a build id, consumes jobs off the
|
105
|
+
build's queue and executes them using RSpec
|
106
|
+
- **Reporter**: an `rspecq` process that, given a build id, waits for the build's
|
107
|
+
queue to be drained and prints the build summary report
|
71
108
|
|
72
109
|
### Spec file splitting
|
73
110
|
|
74
|
-
|
75
|
-
a
|
76
|
-
|
77
|
-
|
78
|
-
|
79
|
-
|
111
|
+
Particularly slow spec files may set a limit to how fast a build can be.
|
112
|
+
For example, a single file may need 10 minutes to run while all other
|
113
|
+
files finish after 8 minutes. This would cause all but one worker to be
|
114
|
+
sitting idle for 2 minutes.
|
115
|
+
|
116
|
+
To overcome this issue, RSpecQ can split files whose execution time is
|
117
|
+
above a certain threshold (set with the `--file-split-threshold` option)
|
118
|
+
and instead schedule them as individual examples.
|
80
119
|
|
81
|
-
In the future, we'd like for the slow threshold to be calculated and set
|
82
|
-
dynamically.
|
120
|
+
Note: In the future, we'd like for the slow threshold to be calculated and set
|
121
|
+
dynamically (see #3).
|
83
122
|
|
84
123
|
### Requeues
|
85
124
|
|
86
|
-
As a mitigation
|
87
|
-
back to the queue to be picked up by
|
88
|
-
|
89
|
-
|
90
|
-
|
125
|
+
As a mitigation technique against flaky tests, if an example fails it will be
|
126
|
+
put back to the queue to be picked up by another worker. This will be repeated
|
127
|
+
up to a certain number of times (set with the `--max-requeues` option), after
|
128
|
+
which the example will be considered a legit failure and printed as such in the
|
129
|
+
final report.
|
91
130
|
|
92
131
|
### Worker failures
|
93
132
|
|
94
|
-
|
95
|
-
|
96
|
-
|
97
|
-
|
133
|
+
It's not uncommon for CI processes to encounter unrecoverable failures for
|
134
|
+
various reasons: faulty hardware, network hiccups, segmentation faults in
|
135
|
+
MRI etc.
|
136
|
+
|
137
|
+
For resiliency against such issues, workers emit a heartbeat after each
|
138
|
+
example they execute, to signal
|
139
|
+
that they're healthy and performing jobs as expected. If a worker hasn't
|
140
|
+
emitted a heartbeat for a given amount of time (set by `WORKER_LIVENESS_SEC`)
|
141
|
+
it is considered dead and its reserved job will be put back to the queue, to
|
142
|
+
be picked up by another healthy worker.
|
143
|
+
|
144
|
+
|
145
|
+
## Rationale
|
146
|
+
|
147
|
+
### Why didn't you use ci-queue?
|
148
|
+
|
149
|
+
**Update**: ci-queue [deprecated support for RSpec](https://github.com/Shopify/ci-queue/pull/149).
|
150
|
+
|
151
|
+
While evaluating ci-queue we experienced slow worker boot
|
152
|
+
times (up to 3 minutes in some cases) combined with disk IO saturation and
|
153
|
+
increased memory consumption. This is due to the fact that a worker in
|
154
|
+
ci-queue has to load every spec file on boot. In applications with a large
|
155
|
+
number of spec files this may result in a significant performance hit and
|
156
|
+
in case of cloud environments, increased costs.
|
157
|
+
|
158
|
+
We also observed slower build times compared to our previous solution which
|
159
|
+
scheduled whole spec files (as opposed to individual examples), due to
|
160
|
+
big differences in runtimes of individual examples, something common in big
|
161
|
+
RSpec suites.
|
162
|
+
|
163
|
+
We decided for RSpecQ to use whole spec files as its main unit of work (as
|
164
|
+
opposed to ci-queue which uses individual examples). This means that an RSpecQ
|
165
|
+
worker only loads the files needed and ends up with a subset of all the suite's
|
166
|
+
files. (Note: RSpecQ also schedules individual examples, but only when this is
|
167
|
+
deemed necessary, see [Spec file splitting](#spec-file-splitting)).
|
168
|
+
|
169
|
+
This kept boot and test run times considerably fast. As a side benefit, this
|
170
|
+
allows suites to keep using `before(:all)` hooks (which ci-queue explicitly
|
171
|
+
rejects).
|
172
|
+
|
173
|
+
The downside of this design is that it's more complicated, since the scheduling
|
174
|
+
of spec files happens based on timings calculated from previous runs. This
|
175
|
+
means that RSpecQ maintains a key with the timing of each job and updates it
|
176
|
+
on every run (if the `--update-timings` option was used). Also, RSpecQ has a "slow
|
177
|
+
file threshold" which currently has to be set manually (but this can be
|
178
|
+
improved in the future).
|
179
|
+
|
180
|
+
|
181
|
+
## Development
|
182
|
+
|
183
|
+
Install the required dependencies:
|
184
|
+
|
185
|
+
```
|
186
|
+
$ bundle install
|
187
|
+
```
|
188
|
+
|
189
|
+
Then you can execute the tests after spinning up a Redis instance at
|
190
|
+
`127.0.0.1:6379`:
|
191
|
+
|
192
|
+
```
|
193
|
+
$ bundle exec rake
|
194
|
+
```
|
195
|
+
|
196
|
+
To enable verbose output in the tests:
|
197
|
+
|
198
|
+
```
|
199
|
+
$ RSPECQ_DEBUG=1 bundle exec rake
|
200
|
+
```
|
98
201
|
|
99
|
-
This protects us against unrecoverable worker failures (e.g. segfault).
|
100
202
|
|
101
203
|
## License
|
102
204
|
|
data/Rakefile
ADDED
data/bin/rspecq
CHANGED
@@ -1,67 +1,118 @@
|
|
1
1
|
#!/usr/bin/env ruby
|
2
|
-
require "
|
2
|
+
require "optparse"
|
3
3
|
require "rspecq"
|
4
4
|
|
5
|
+
DEFAULT_REDIS_HOST = "127.0.0.1"
|
6
|
+
DEFAULT_REPORT_TIMEOUT = 3600 # 1 hour
|
7
|
+
DEFAULT_MAX_REQUEUES = 3
|
8
|
+
|
9
|
+
def env_set?(var)
|
10
|
+
["1", "true"].include?(ENV[var])
|
11
|
+
end
|
12
|
+
|
5
13
|
opts = {}
|
14
|
+
|
6
15
|
OptionParser.new do |o|
|
7
|
-
|
16
|
+
name = File.basename($PROGRAM_NAME)
|
8
17
|
|
9
|
-
o.
|
10
|
-
|
18
|
+
o.banner = <<~BANNER
|
19
|
+
NAME:
|
20
|
+
#{name} - Optimally distribute and run RSpec suites among parallel workers
|
21
|
+
|
22
|
+
USAGE:
|
23
|
+
#{name} [<options>] [spec files or directories]
|
24
|
+
BANNER
|
25
|
+
|
26
|
+
o.separator ""
|
27
|
+
o.separator "OPTIONS:"
|
28
|
+
|
29
|
+
o.on("-b", "--build ID", "A unique identifier for the build. Should be " \
|
30
|
+
"common among workers participating in the same build.") do |v|
|
31
|
+
opts[:build] = v
|
11
32
|
end
|
12
33
|
|
13
|
-
o.on("--worker
|
14
|
-
|
34
|
+
o.on("-w", "--worker ID", "An identifier for the worker. Workers " \
|
35
|
+
"participating in the same build should have distinct IDs.") do |v|
|
36
|
+
opts[:worker] = v
|
15
37
|
end
|
16
38
|
|
17
|
-
o.on("--redis HOST", "Redis
|
18
|
-
|
39
|
+
o.on("-r", "--redis HOST", "Redis host to connect to " \
|
40
|
+
"(default: #{DEFAULT_REDIS_HOST}).") do |v|
|
41
|
+
opts[:redis_host] = v
|
19
42
|
end
|
20
43
|
|
21
|
-
o.on("--timings", "
|
44
|
+
o.on("--update-timings", "Update the global job timings key with the " \
|
45
|
+
"timings of this build. Note: This key is used as the basis for job " \
|
46
|
+
"scheduling.") do |v|
|
22
47
|
opts[:timings] = v
|
23
48
|
end
|
24
49
|
|
25
|
-
o.on("--file-split-threshold N", "Split spec files slower than N
|
26
|
-
"schedule them
|
27
|
-
opts[:file_split_threshold] =
|
50
|
+
o.on("--file-split-threshold N", Integer, "Split spec files slower than N " \
|
51
|
+
"seconds and schedule them as individual examples.") do |v|
|
52
|
+
opts[:file_split_threshold] = v
|
28
53
|
end
|
29
54
|
|
30
|
-
o.on("--report", "
|
31
|
-
|
55
|
+
o.on("--report", "Enable reporter mode: do not pull tests off the queue; " \
|
56
|
+
"instead print build progress and exit when it's " \
|
57
|
+
"finished.\n#{o.summary_indent*9} " \
|
58
|
+
"Exits with a non-zero status code if there were any " \
|
59
|
+
"failures.") do |v|
|
32
60
|
opts[:report] = v
|
33
61
|
end
|
34
62
|
|
35
|
-
o.on("--report-timeout N", Integer, "Fail if
|
36
|
-
"N seconds. Only applicable if --report is enabled "
|
37
|
-
"(default:
|
63
|
+
o.on("--report-timeout N", Integer, "Fail if build is not finished after " \
|
64
|
+
"N seconds. Only applicable if --report is enabled " \
|
65
|
+
"(default: #{DEFAULT_REPORT_TIMEOUT}).") do |v|
|
38
66
|
opts[:report_timeout] = v
|
39
67
|
end
|
40
68
|
|
69
|
+
o.on("--max-requeues N", Integer, "Retry failed examples up to N times " \
|
70
|
+
"before considering them legit failures " \
|
71
|
+
"(default: #{DEFAULT_MAX_REQUEUES}).") do |v|
|
72
|
+
opts[:max_requeues] = v
|
73
|
+
end
|
74
|
+
|
75
|
+
o.on_tail("-h", "--help", "Show this message.") do
|
76
|
+
puts o
|
77
|
+
exit
|
78
|
+
end
|
79
|
+
|
80
|
+
o.on_tail("-v", "--version", "Print the version and exit.") do
|
81
|
+
puts "#{name} #{RSpecQ::VERSION}"
|
82
|
+
exit
|
83
|
+
end
|
41
84
|
end.parse!
|
42
85
|
|
43
|
-
[:
|
44
|
-
|
45
|
-
|
86
|
+
opts[:build] ||= ENV["RSPECQ_BUILD"]
|
87
|
+
opts[:worker] ||= ENV["RSPECQ_WORKER"]
|
88
|
+
opts[:redis_host] ||= ENV["RSPECQ_REDIS"] || DEFAULT_REDIS_HOST
|
89
|
+
opts[:timings] ||= env_set?("RSPECQ_UPDATE_TIMINGS")
|
90
|
+
opts[:file_split_threshold] ||= Integer(ENV["RSPECQ_FILE_SPLIT_THRESHOLD"] || 9999999)
|
91
|
+
opts[:report] ||= env_set?("RSPECQ_REPORT")
|
92
|
+
opts[:report_timeout] ||= Integer(ENV["RSPECQ_REPORT_TIMEOUT"] || DEFAULT_REPORT_TIMEOUT)
|
93
|
+
opts[:max_requeues] ||= Integer(ENV["RSPECQ_MAX_REQUEUES"] || DEFAULT_MAX_REQUEUES)
|
94
|
+
|
95
|
+
raise OptionParser::MissingArgument.new(:build) if opts[:build].nil?
|
96
|
+
raise OptionParser::MissingArgument.new(:worker) if !opts[:report] && opts[:worker].nil?
|
46
97
|
|
47
98
|
if opts[:report]
|
48
99
|
reporter = RSpecQ::Reporter.new(
|
49
|
-
build_id: opts[:
|
50
|
-
|
51
|
-
timeout: opts[:report_timeout] || 3600,
|
100
|
+
build_id: opts[:build],
|
101
|
+
timeout: opts[:report_timeout],
|
52
102
|
redis_host: opts[:redis_host],
|
53
103
|
)
|
54
104
|
|
55
105
|
reporter.report
|
56
106
|
else
|
57
107
|
worker = RSpecQ::Worker.new(
|
58
|
-
build_id: opts[:
|
59
|
-
worker_id: opts[:
|
60
|
-
redis_host: opts[:redis_host]
|
61
|
-
files_or_dirs_to_run: ARGV[0] || "spec",
|
108
|
+
build_id: opts[:build],
|
109
|
+
worker_id: opts[:worker],
|
110
|
+
redis_host: opts[:redis_host]
|
62
111
|
)
|
63
112
|
|
113
|
+
worker.files_or_dirs_to_run = ARGV[0] if ARGV[0]
|
64
114
|
worker.populate_timings = opts[:timings]
|
65
|
-
worker.file_split_threshold = opts[:file_split_threshold]
|
115
|
+
worker.file_split_threshold = opts[:file_split_threshold]
|
116
|
+
worker.max_requeues = opts[:max_requeues]
|
66
117
|
worker.work
|
67
118
|
end
|
data/lib/rspecq.rb
CHANGED
@@ -1,11 +1,10 @@
|
|
1
1
|
require "rspec/core"
|
2
|
+
require "sentry-raven"
|
2
3
|
|
3
4
|
module RSpecQ
|
4
|
-
|
5
|
-
|
6
|
-
#
|
7
|
-
# (in seconds), it is considered dead and its reserved work will be put back
|
8
|
-
# to the queue, to be picked up by another worker.
|
5
|
+
# If a worker hasn't executed an example for more than WORKER_LIVENESS_SEC
|
6
|
+
# seconds, it is considered dead and its reserved work will be put back
|
7
|
+
# to the queue to be picked up by another worker.
|
9
8
|
WORKER_LIVENESS_SEC = 60.0
|
10
9
|
end
|
11
10
|
|
@@ -16,6 +15,5 @@ require_relative "rspecq/formatters/worker_heartbeat_recorder"
|
|
16
15
|
|
17
16
|
require_relative "rspecq/queue"
|
18
17
|
require_relative "rspecq/reporter"
|
19
|
-
require_relative "rspecq/worker"
|
20
|
-
|
21
18
|
require_relative "rspecq/version"
|
19
|
+
require_relative "rspecq/worker"
|
data/lib/rspecq/formatters/failure_recorder.rb
CHANGED
@@ -1,11 +1,12 @@
|
|
1
1
|
module RSpecQ
|
2
2
|
module Formatters
|
3
3
|
class FailureRecorder
|
4
|
-
def initialize(queue, job)
|
4
|
+
def initialize(queue, job, max_requeues)
|
5
5
|
@queue = queue
|
6
6
|
@job = job
|
7
7
|
@colorizer = RSpec::Core::Formatters::ConsoleCodes
|
8
8
|
@non_example_error_recorded = false
|
9
|
+
@max_requeues = max_requeues
|
9
10
|
end
|
10
11
|
|
11
12
|
# Here we're notified about errors occurring outside of examples.
|
@@ -24,7 +25,7 @@ module RSpecQ
|
|
24
25
|
def example_failed(notification)
|
25
26
|
example = notification.example
|
26
27
|
|
27
|
-
if @queue.requeue_job(example.id,
|
28
|
+
if @queue.requeue_job(example.id, @max_requeues)
|
28
29
|
# HACK: try to avoid picking the job we just requeued; we want it
|
29
30
|
# to be picked up by a different worker
|
30
31
|
sleep 0.5
|
data/lib/rspecq/queue.rb
CHANGED
@@ -57,6 +57,8 @@ module RSpecQ
|
|
57
57
|
STATUS_INITIALIZING = "initializing".freeze
|
58
58
|
STATUS_READY = "ready".freeze
|
59
59
|
|
60
|
+
attr_reader :redis
|
61
|
+
|
60
62
|
def initialize(build_id, worker_id, redis_host)
|
61
63
|
@build_id = build_id
|
62
64
|
@worker_id = worker_id
|
@@ -150,13 +152,21 @@ module RSpecQ
|
|
150
152
|
end
|
151
153
|
|
152
154
|
def example_count
|
153
|
-
@redis.get(key_example_count)
|
155
|
+
@redis.get(key_example_count).to_i
|
154
156
|
end
|
155
157
|
|
156
158
|
def processed_jobs_count
|
157
159
|
@redis.scard(key_queue_processed)
|
158
160
|
end
|
159
161
|
|
162
|
+
def processed_jobs
|
163
|
+
@redis.smembers(key_queue_processed)
|
164
|
+
end
|
165
|
+
|
166
|
+
def requeued_jobs
|
167
|
+
@redis.hgetall(key_requeues)
|
168
|
+
end
|
169
|
+
|
160
170
|
def become_master
|
161
171
|
@redis.setnx(key_queue_status, STATUS_INITIALIZING)
|
162
172
|
end
|
@@ -200,10 +210,10 @@ module RSpecQ
|
|
200
210
|
exhausted? && example_failures.empty? && non_example_errors.empty?
|
201
211
|
end
|
202
212
|
|
203
|
-
|
204
|
-
|
205
|
-
def
|
206
|
-
|
213
|
+
# The remaining jobs to be processed. Jobs at the head of the list will
|
214
|
+
# be processed first.
|
215
|
+
def unprocessed_jobs
|
216
|
+
@redis.lrange(key_queue_unprocessed, 0, -1)
|
207
217
|
end
|
208
218
|
|
209
219
|
# redis: STRING [STATUS_INITIALIZING, STATUS_READY]
|
@@ -279,6 +289,12 @@ module RSpecQ
|
|
279
289
|
"build_times"
|
280
290
|
end
|
281
291
|
|
292
|
+
private
|
293
|
+
|
294
|
+
def key(*keys)
|
295
|
+
[@build_id, keys].join(":")
|
296
|
+
end
|
297
|
+
|
282
298
|
# We don't use any Ruby `Time` methods because specs that use timecop in
|
283
299
|
# before(:all) hooks will mess up our times.
|
284
300
|
def current_time
|
data/lib/rspecq/reporter.rb
CHANGED
@@ -1,10 +1,9 @@
|
|
1
1
|
module RSpecQ
|
2
2
|
class Reporter
|
3
|
-
def initialize(build_id:,
|
3
|
+
def initialize(build_id:, timeout:, redis_host:)
|
4
4
|
@build_id = build_id
|
5
|
-
@worker_id = worker_id
|
6
5
|
@timeout = timeout
|
7
|
-
@queue = Queue.new(build_id,
|
6
|
+
@queue = Queue.new(build_id, "reporter", redis_host)
|
8
7
|
|
9
8
|
# We want feedback to be immediately printed to CI users, so
|
10
9
|
# we disable buffering.
|
@@ -12,7 +11,7 @@ module RSpecQ
|
|
12
11
|
end
|
13
12
|
|
14
13
|
def report
|
15
|
-
|
14
|
+
@queue.wait_until_published
|
16
15
|
|
17
16
|
finished = false
|
18
17
|
|
data/lib/rspecq/version.rb
CHANGED
data/lib/rspecq/worker.rb
CHANGED
@@ -1,10 +1,16 @@
|
|
1
1
|
require "json"
|
2
|
+
require "pathname"
|
2
3
|
require "pp"
|
3
4
|
|
4
5
|
module RSpecQ
|
5
6
|
class Worker
|
6
7
|
HEARTBEAT_FREQUENCY = WORKER_LIVENESS_SEC / 6
|
7
8
|
|
9
|
+
# The root path or individual spec files to execute.
|
10
|
+
#
|
11
|
+
# Defaults to "spec" (just like in RSpec)
|
12
|
+
attr_accessor :files_or_dirs_to_run
|
13
|
+
|
8
14
|
# If true, job timings will be populated in the global Redis timings key
|
9
15
|
#
|
10
16
|
# Defaults to false
|
@@ -12,15 +18,27 @@ module RSpecQ
|
|
12
18
|
|
13
19
|
# If set, spec files that are known to take more than this value to finish,
|
14
20
|
# will be split and scheduled on a per-example basis.
|
21
|
+
#
|
22
|
+
# Defaults to 999999
|
15
23
|
attr_accessor :file_split_threshold
|
16
24
|
|
17
|
-
|
25
|
+
# Retry failed examples up to N times (with N being the supplied value)
|
26
|
+
# before considering them legit failures
|
27
|
+
#
|
28
|
+
# Defaults to 3
|
29
|
+
attr_accessor :max_requeues
|
30
|
+
|
31
|
+
attr_reader :queue
|
32
|
+
|
33
|
+
def initialize(build_id:, worker_id:, redis_host:)
|
18
34
|
@build_id = build_id
|
19
35
|
@worker_id = worker_id
|
20
36
|
@queue = Queue.new(build_id, worker_id, redis_host)
|
21
|
-
@files_or_dirs_to_run =
|
37
|
+
@files_or_dirs_to_run = "spec"
|
22
38
|
@populate_timings = false
|
23
39
|
@file_split_threshold = 999999
|
40
|
+
@heartbeat_updated_at = nil
|
41
|
+
@max_requeues = 3
|
24
42
|
|
25
43
|
RSpec::Core::Formatters.register(Formatters::JobTimingRecorder, :dump_summary)
|
26
44
|
RSpec::Core::Formatters.register(Formatters::ExampleCountRecorder, :dump_summary)
|
@@ -31,23 +49,23 @@ module RSpecQ
|
|
31
49
|
def work
|
32
50
|
puts "Working for build #{@build_id} (worker=#{@worker_id})"
|
33
51
|
|
34
|
-
try_publish_queue!(
|
35
|
-
|
52
|
+
try_publish_queue!(queue)
|
53
|
+
queue.wait_until_published
|
36
54
|
|
37
55
|
loop do
|
38
56
|
# we have to bootstrap this so that it can be used in the first call
|
39
57
|
# to `requeue_lost_job` inside the work loop
|
40
58
|
update_heartbeat
|
41
59
|
|
42
|
-
lost =
|
60
|
+
lost = queue.requeue_lost_job
|
43
61
|
puts "Requeued lost job: #{lost}" if lost
|
44
62
|
|
45
63
|
# TODO: can we make `reserve_job` also act like exhausted? and get
|
46
64
|
# rid of `exhausted?` (i.e. return false if no jobs remain)
|
47
|
-
job =
|
65
|
+
job = queue.reserve_job
|
48
66
|
|
49
67
|
# build is finished
|
50
|
-
return if job.nil? &&
|
68
|
+
return if job.nil? && queue.exhausted?
|
51
69
|
|
52
70
|
next if job.nil?
|
53
71
|
|
@@ -60,112 +78,125 @@ module RSpecQ
|
|
60
78
|
RSpec.configuration.detail_color = :magenta
|
61
79
|
RSpec.configuration.seed = srand && srand % 0xFFFF
|
62
80
|
RSpec.configuration.backtrace_formatter.filter_gem('rspecq')
|
63
|
-
RSpec.configuration.add_formatter(Formatters::FailureRecorder.new(
|
64
|
-
RSpec.configuration.add_formatter(Formatters::ExampleCountRecorder.new(
|
81
|
+
RSpec.configuration.add_formatter(Formatters::FailureRecorder.new(queue, job, max_requeues))
|
82
|
+
RSpec.configuration.add_formatter(Formatters::ExampleCountRecorder.new(queue))
|
65
83
|
RSpec.configuration.add_formatter(Formatters::WorkerHeartbeatRecorder.new(self))
|
66
84
|
|
67
85
|
if populate_timings
|
68
|
-
RSpec.configuration.add_formatter(Formatters::JobTimingRecorder.new(
|
86
|
+
RSpec.configuration.add_formatter(Formatters::JobTimingRecorder.new(queue, job))
|
69
87
|
end
|
70
88
|
|
71
89
|
opts = RSpec::Core::ConfigurationOptions.new(["--format", "progress", job])
|
72
90
|
_result = RSpec::Core::Runner.new(opts).run($stderr, $stdout)
|
73
91
|
|
74
|
-
|
92
|
+
queue.acknowledge_job(job)
|
75
93
|
end
|
76
94
|
end
|
77
95
|
|
78
96
|
# Update the worker heartbeat if necessary
|
79
97
|
def update_heartbeat
|
80
98
|
if @heartbeat_updated_at.nil? || elapsed(@heartbeat_updated_at) >= HEARTBEAT_FREQUENCY
|
81
|
-
|
99
|
+
queue.record_worker_heartbeat
|
82
100
|
@heartbeat_updated_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
|
83
101
|
end
|
84
102
|
end
|
85
103
|
|
86
|
-
private
|
87
|
-
|
88
|
-
def reset_rspec_state!
|
89
|
-
RSpec.clear_examples
|
90
|
-
|
91
|
-
# TODO: remove after https://github.com/rspec/rspec-core/pull/2723
|
92
|
-
RSpec.world.instance_variable_set(:@example_group_counts_by_spec_file, Hash.new(0))
|
93
|
-
|
94
|
-
# RSpec.clear_examples does not reset those, which causes issues when
|
95
|
-
# a non-example error occurs (subsequent jobs are not executed)
|
96
|
-
# TODO: upstream
|
97
|
-
RSpec.world.non_example_failure = false
|
98
|
-
|
99
|
-
# we don't want an error that occured outside of the examples (which
|
100
|
-
# would set this to `true`) to stop the worker
|
101
|
-
RSpec.world.wants_to_quit = false
|
102
|
-
end
|
103
|
-
|
104
104
|
def try_publish_queue!(queue)
|
105
105
|
return if !queue.become_master
|
106
106
|
|
107
|
-
RSpec.configuration.files_or_directories_to_run =
|
107
|
+
RSpec.configuration.files_or_directories_to_run = files_or_dirs_to_run
|
108
108
|
files_to_run = RSpec.configuration.files_to_run.map { |j| relative_path(j) }
|
109
109
|
|
110
110
|
timings = queue.timings
|
111
111
|
if timings.empty?
|
112
|
-
# TODO: should be a warning reported somewhere (Sentry?)
|
113
112
|
q_size = queue.publish(files_to_run.shuffle)
|
114
|
-
|
115
|
-
|
113
|
+
log_event(
|
114
|
+
"No timings found! Published queue in random order (size=#{q_size})",
|
115
|
+
"warning"
|
116
|
+
)
|
116
117
|
return
|
117
118
|
end
|
118
119
|
|
119
|
-
|
120
|
-
|
121
|
-
|
120
|
+
# prepare jobs to run
|
121
|
+
jobs = []
|
122
|
+
slow_files = []
|
122
123
|
|
123
|
-
if
|
124
|
-
|
124
|
+
if file_split_threshold
|
125
|
+
slow_files = timings.take_while do |_job, duration|
|
126
|
+
duration >= file_split_threshold
|
127
|
+
end.map(&:first) & files_to_run
|
125
128
|
end
|
126
129
|
|
127
|
-
|
128
|
-
|
129
|
-
|
130
|
-
|
130
|
+
if slow_files.any?
|
131
|
+
jobs.concat(files_to_run - slow_files)
|
132
|
+
jobs.concat(files_to_example_ids(slow_files))
|
133
|
+
else
|
134
|
+
jobs.concat(files_to_run)
|
135
|
+
end
|
131
136
|
|
132
|
-
# assign timings to all of them
|
133
137
|
default_timing = timings.values[timings.values.size/2]
|
134
138
|
|
139
|
+
# assign timings (based on previous runs) to all jobs
|
135
140
|
jobs = jobs.each_with_object({}) do |j, h|
|
136
|
-
|
137
|
-
|
141
|
+
puts "Untimed job: #{j}" if timings[j].nil?
|
142
|
+
|
143
|
+
# HEURISTIC: put jobs without previous timings (e.g. a newly added
|
144
|
+
# spec file) in the middle of the queue
|
138
145
|
h[j] = timings[j] || default_timing
|
139
146
|
end
|
140
147
|
|
141
|
-
#
|
148
|
+
# sort jobs based on their timings (slowest to be processed first)
|
142
149
|
jobs = jobs.sort_by { |_j, t| -t }.map(&:first)
|
143
150
|
|
144
151
|
puts "Published queue (size=#{queue.publish(jobs)})"
|
145
152
|
end
|
146
153
|
|
154
|
+
private
|
155
|
+
|
156
|
+
def reset_rspec_state!
|
157
|
+
RSpec.clear_examples
|
158
|
+
|
159
|
+
# see https://github.com/rspec/rspec-core/pull/2723
|
160
|
+
if Gem::Version.new(RSpec::Core::Version::STRING) <= Gem::Version.new("3.9.1")
|
161
|
+
RSpec.world.instance_variable_set(
|
162
|
+
:@example_group_counts_by_spec_file, Hash.new(0))
|
163
|
+
end
|
164
|
+
|
165
|
+
# RSpec.clear_examples does not reset those, which causes issues when
|
166
|
+
# a non-example error occurs (subsequent jobs are not executed)
|
167
|
+
# TODO: upstream
|
168
|
+
RSpec.world.non_example_failure = false
|
169
|
+
|
170
|
+
# we don't want an error that occurred outside of the examples (which
|
171
|
+
# would set this to `true`) to stop the worker
|
172
|
+
RSpec.world.wants_to_quit = false
|
173
|
+
end
|
174
|
+
|
147
175
|
# NOTE: RSpec has to load the files before we can split them as individual
|
148
176
|
# examples. In case a file to be split fails to be loaded
|
149
|
-
# (e.g. contains a syntax error), we return the
|
150
|
-
#
|
151
|
-
#
|
152
|
-
# Their errors will be reported in the normal flow, when they're picked up
|
153
|
-
# as jobs by a worker.
|
177
|
+
# (e.g. contains a syntax error), we return the files unchanged, thereby
|
178
|
+
# falling back to scheduling them as whole files. Their errors will be
|
179
|
+
# reported in the normal flow when they're eventually picked up by a worker.
|
154
180
|
def files_to_example_ids(files)
|
155
|
-
|
156
|
-
cmd = "DISABLE_SPRING=1 bin/rspec --dry-run --format json #{files.join(' ')}"
|
181
|
+
cmd = "DISABLE_SPRING=1 bundle exec rspec --dry-run --format json #{files.join(' ')} 2>&1"
|
157
182
|
out = `#{cmd}`
|
183
|
+
cmd_result = $?
|
158
184
|
|
159
|
-
if
|
160
|
-
|
161
|
-
|
185
|
+
if !cmd_result.success?
|
186
|
+
rspec_output = begin
|
187
|
+
JSON.parse(out)
|
188
|
+
rescue JSON::ParserError
|
189
|
+
out
|
190
|
+
end
|
162
191
|
|
163
|
-
|
164
|
-
|
165
|
-
|
166
|
-
|
167
|
-
|
168
|
-
|
192
|
+
log_event(
|
193
|
+
"Failed to split slow files, falling back to regular scheduling",
|
194
|
+
"error",
|
195
|
+
rspec_output: rspec_output,
|
196
|
+
cmd_result: cmd_result.inspect,
|
197
|
+
)
|
198
|
+
|
199
|
+
pp rspec_output
|
169
200
|
|
170
201
|
return files
|
171
202
|
end
|
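The fallback in `files_to_example_ids` shown in the hunk above can be sketched in isolation. This is a minimal illustration, not the gem's code: `example_ids_or_files` is a hypothetical helper that receives the captured output and exit status as arguments instead of shelling out, and the success branch assumes the RSpec JSON formatter output carries an `examples` array whose entries have an `id` key (the success path is not visible in this diff).

```ruby
require "json"

# Hypothetical helper (not part of RSpecQ): mirrors the failure handling in the
# hunk above, but takes the dry-run output and success flag as parameters.
def example_ids_or_files(files, out, success)
  unless success
    # Keep structured output when it parses, otherwise keep the raw text.
    details = begin
      JSON.parse(out)
    rescue JSON::ParserError
      out
    end
    warn "Failed to split slow files, falling back to regular scheduling: #{details.inspect}"

    # Fall back to scheduling the files as whole-file jobs.
    return files
  end

  # ASSUMPTION: the JSON formatter lists examples under "examples", each with an "id".
  JSON.parse(out)["examples"].map { |example| example["id"] }
end

ok_output = JSON.generate(
  "examples" => [
    { "id" => "./spec/a_spec.rb[1:1]" },
    { "id" => "./spec/a_spec.rb[1:2]" }
  ]
)

split    = example_ids_or_files(["spec/a_spec.rb"], ok_output, true)
fallback = example_ids_or_files(["spec/a_spec.rb"], "SyntaxError: unexpected end", false)
```

Passing the output and status in makes the fallback testable without running `rspec --dry-run`. The worker itself reads both from the backtick call and `$?`.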
@@ -181,5 +212,23 @@ module RSpecQ
|
|
181
212
|
def elapsed(since)
|
182
213
|
Process.clock_gettime(Process::CLOCK_MONOTONIC) - since
|
183
214
|
end
|
215
|
+
|
216
|
+
# Prints msg to standard output and emits an event to Sentry, if the
|
217
|
+
# SENTRY_DSN environment variable is set.
|
218
|
+
def log_event(msg, level, additional={})
|
219
|
+
puts msg
|
220
|
+
|
221
|
+
Raven.capture_message(msg, level: level, extra: {
|
222
|
+
build: @build_id,
|
223
|
+
worker: @worker_id,
|
224
|
+
queue: queue.inspect,
|
225
|
+
files_or_dirs_to_run: files_or_dirs_to_run,
|
226
|
+
populate_timings: populate_timings,
|
227
|
+
file_split_threshold: file_split_threshold,
|
228
|
+
heartbeat_updated_at: @heartbeat_updated_at,
|
229
|
+
object: self.inspect,
|
230
|
+
pid: Process.pid,
|
231
|
+
}.merge(additional))
|
232
|
+
end
|
184
233
|
end
|
185
234
|
end
|
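The queue-building changes to lib/rspecq/worker.rb above (split slow files, give untimed jobs the middle timing, publish slowest first) can be sketched as a stand-alone function. This is a minimal sketch under stated assumptions: `build_queue` and its `example_ids` lookup are hypothetical stand-ins (the latter for `files_to_example_ids`), and `timings` is assumed to be a Hash ordered slowest first, which is what the `take_while` in the diff relies on.

```ruby
# Hypothetical stand-alone version of the scheduling steps in the diff above.
# timings:     Hash of job => seconds, ASSUMED to be ordered slowest first
# example_ids: Hash of file => example IDs, a stand-in for files_to_example_ids
def build_queue(files_to_run, timings, file_split_threshold: nil, example_ids: {})
  slow_files = []

  if file_split_threshold
    # Only files at the head of the (slowest-first) timings can be slow.
    slow_files = timings.take_while do |_job, duration|
      duration >= file_split_threshold
    end.map(&:first) & files_to_run
  end

  # Fast files stay whole; slow files are replaced by their individual examples.
  jobs = files_to_run - slow_files
  jobs += slow_files.flat_map { |file| example_ids.fetch(file, [file]) }

  # Jobs without a previous timing get the middle element of the timings.
  default_timing = timings.values[timings.values.size / 2]
  timed = jobs.each_with_object({}) { |job, h| h[job] = timings[job] || default_timing }

  # Slowest jobs are processed first.
  timed.sort_by { |_job, t| -t }.map(&:first)
end

timings = {
  "spec/a_spec.rb"      => 30.0,
  "spec/a_spec.rb[1:1]" => 20.0,
  "spec/a_spec.rb[1:2]" => 8.0,
  "spec/old_spec.rb"    => 6.0,
  "spec/b_spec.rb"      => 5.0,
  "spec/c_spec.rb"      => 1.0
}

queue = build_queue(
  ["spec/a_spec.rb", "spec/b_spec.rb", "spec/c_spec.rb", "spec/new_spec.rb"],
  timings,
  file_split_threshold: 10,
  example_ids: { "spec/a_spec.rb" => ["spec/a_spec.rb[1:1]", "spec/a_spec.rb[1:2]"] }
)
```

Here `spec/new_spec.rb` has no recorded timing, so it receives the middle value (6.0) and lands between the split examples of `spec/a_spec.rb` and the faster whole-file jobs.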
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: rspecq
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.1.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Agis Anastasopoulos
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2020-
|
11
|
+
date: 2020-08-27 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: rspec-core
|
@@ -38,22 +38,64 @@ dependencies:
|
|
38
38
|
- - ">="
|
39
39
|
- !ruby/object:Gem::Version
|
40
40
|
version: '0'
|
41
|
+
- !ruby/object:Gem::Dependency
|
42
|
+
name: sentry-raven
|
43
|
+
requirement: !ruby/object:Gem::Requirement
|
44
|
+
requirements:
|
45
|
+
- - ">="
|
46
|
+
- !ruby/object:Gem::Version
|
47
|
+
version: '0'
|
48
|
+
type: :runtime
|
49
|
+
prerelease: false
|
50
|
+
version_requirements: !ruby/object:Gem::Requirement
|
51
|
+
requirements:
|
52
|
+
- - ">="
|
53
|
+
- !ruby/object:Gem::Version
|
54
|
+
version: '0'
|
55
|
+
- !ruby/object:Gem::Dependency
|
56
|
+
name: rake
|
57
|
+
requirement: !ruby/object:Gem::Requirement
|
58
|
+
requirements:
|
59
|
+
- - ">="
|
60
|
+
- !ruby/object:Gem::Version
|
61
|
+
version: '0'
|
62
|
+
type: :development
|
63
|
+
prerelease: false
|
64
|
+
version_requirements: !ruby/object:Gem::Requirement
|
65
|
+
requirements:
|
66
|
+
- - ">="
|
67
|
+
- !ruby/object:Gem::Version
|
68
|
+
version: '0'
|
69
|
+
- !ruby/object:Gem::Dependency
|
70
|
+
name: pry-byebug
|
71
|
+
requirement: !ruby/object:Gem::Requirement
|
72
|
+
requirements:
|
73
|
+
- - ">="
|
74
|
+
- !ruby/object:Gem::Version
|
75
|
+
version: '0'
|
76
|
+
type: :development
|
77
|
+
prerelease: false
|
78
|
+
version_requirements: !ruby/object:Gem::Requirement
|
79
|
+
requirements:
|
80
|
+
- - ">="
|
81
|
+
- !ruby/object:Gem::Version
|
82
|
+
version: '0'
|
41
83
|
- !ruby/object:Gem::Dependency
|
42
84
|
name: minitest
|
43
85
|
requirement: !ruby/object:Gem::Requirement
|
44
86
|
requirements:
|
45
|
-
- - "
|
87
|
+
- - ">="
|
46
88
|
- !ruby/object:Gem::Version
|
47
|
-
version: '
|
89
|
+
version: '0'
|
48
90
|
type: :development
|
49
91
|
prerelease: false
|
50
92
|
version_requirements: !ruby/object:Gem::Requirement
|
51
93
|
requirements:
|
52
|
-
- - "
|
94
|
+
- - ">="
|
53
95
|
- !ruby/object:Gem::Version
|
54
|
-
version: '
|
96
|
+
version: '0'
|
55
97
|
- !ruby/object:Gem::Dependency
|
56
|
-
name:
|
98
|
+
name: rspec
|
57
99
|
requirement: !ruby/object:Gem::Requirement
|
58
100
|
requirements:
|
59
101
|
- - ">="
|
@@ -76,6 +118,7 @@ files:
|
|
76
118
|
- CHANGELOG.md
|
77
119
|
- LICENSE
|
78
120
|
- README.md
|
121
|
+
- Rakefile
|
79
122
|
- bin/rspecq
|
80
123
|
- lib/rspecq.rb
|
81
124
|
- lib/rspecq/formatters/example_count_recorder.rb
|
@@ -101,12 +144,13 @@ required_ruby_version: !ruby/object:Gem::Requirement
|
|
101
144
|
version: '0'
|
102
145
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
103
146
|
requirements:
|
104
|
-
- - "
|
147
|
+
- - ">="
|
105
148
|
- !ruby/object:Gem::Version
|
106
|
-
version:
|
149
|
+
version: '0'
|
107
150
|
requirements: []
|
108
|
-
rubygems_version: 3.1.
|
151
|
+
rubygems_version: 3.1.4
|
109
152
|
signing_key:
|
110
153
|
specification_version: 4
|
111
|
-
summary:
|
154
|
+
summary: Optimally distribute and run RSpec suites among parallel workers; for faster
|
155
|
+
CI builds
|
112
156
|
test_files: []
|
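The dependency entries added in the metadata above are what RubyGems serializes from gemspec declarations. As a rough illustration only, declarations along the following lines would produce the `sentry-raven`, `rake`, `pry-byebug` and `minitest` entries with a `>= 0` requirement. The spec name, author and summary here are hypothetical placeholders, and the gem's actual gemspec is not part of this diff.

```ruby
require "rubygems"

# Illustrative gemspec; name, authors and summary are placeholders,
# not the values used by the real package.
spec = Gem::Specification.new do |s|
  s.name    = "example-gem"
  s.version = "0.1.0"
  s.summary = "Example"
  s.authors = ["Example Author"]

  # type: :runtime in the metadata; no version constraint means ">= 0".
  s.add_dependency "sentry-raven"

  # type: :development in the metadata.
  s.add_development_dependency "rake"
  s.add_development_dependency "pry-byebug"
  s.add_development_dependency "minitest"
end

runtime_names     = spec.runtime_dependencies.map(&:name)
development_names = spec.development_dependencies.map(&:name)
requirements      = spec.dependencies.map { |dep| dep.requirement.to_s }.uniq
```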