rspecq 0.0.1.pre2 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 4fcc5311329946efb2a7801087f4cdb5f0be8becc01bd1bb8f367b6d130a02ea
- data.tar.gz: d6f64c6d0c1dae8a53af8bf2d7724e2fb988b03fa795c59d0f5ecd18a92b072a
+ metadata.gz: d6b4c91525a2fb29e2198f290877ffe5ef1e753dafe0f9babd75a581e25d7af8
+ data.tar.gz: 27d3705ee014a5dc77514238b36386eaf11d1bb76601ab25c463796184ea5795
  SHA512:
- metadata.gz: 7611cf0944ea7751eaf93a7aae5686f6c03563b01e71978d9e1d61f30f14f89b12de0ce9ac590f9351eff22a4d4811f9e2f6c241232754ede3162142225f2c27
- data.tar.gz: bdbd6da559607026b8e6fead442d4b15b1bb73e957a63b022c4459a83aba0c2a8e297204d9c29ae3bbc700d39ea2434c7b99e55dfb3752a56af5376d0511fea0
+ metadata.gz: 21803fa664abe45f173f7121dc948ebc8dbc0df41046f8e6269e8ec3751b647701b07a8b6a872aad0da22f438c043ebe6bf0fc69bc9c9c8b327e3b157dccab04
+ data.tar.gz: 4d683884610e2e28ca5ce5cf891e37135886ea4c6ec905f26607c1f0fdd62c1952e9c49983bb0191eb5d42433a800bf88f8227354a213b50645742f21f95af16
@@ -1,4 +1,22 @@
1
1
  # Changelog
2
2
 
3
+ Breaking changes are prefixed with a "[BREAKING]" label.
4
+
3
5
  ## master (unreleased)
4
6
 
7
+ ## 0.1.0 (2020-08-27)
8
+
9
+ ### Added
10
+
11
+ - Sentry integration for various RSpecQ-level events [[#16](https://github.com/skroutz/rspecq/pull/16)]
12
+ - CLI: Flags can now also be set via environment variables [[c519230](https://github.com/skroutz/rspecq/commit/c5192303e229f361e8ac86ae449b4ea84d42e022)]
13
+ - CLI: Added shorthand versions of some flags [[df9faa8](https://github.com/skroutz/rspecq/commit/df9faa8ec6721af8357cfee4de6a2fe7b32070fc)]
14
+ - CLI: Added `--help` and `--version` flags [[df9faa8](https://github.com/skroutz/rspecq/commit/df9faa8ec6721af8357cfee4de6a2fe7b32070fc)]
15
+ - CLI: Max number of retries for failed examples is now configurable via the `--max-requeues` option [[#14](https://github.com/skroutz/rspecq/pull/14)]
16
+
17
+ ### Changed
18
+
19
+ - [BREAKING] CLI: Renamed `--timings` to `--update-timings` [[c519230](https://github.com/skroutz/rspecq/commit/c5192303e229f361e8ac86ae449b4ea84d42e022)]
20
+ - [BREAKING] CLI: Renamed `--build-id` to `--build` and `--worker-id` to `--worker` [[df9faa8](https://github.com/skroutz/rspecq/commit/df9faa8ec6721af8357cfee4de6a2fe7b32070fc)]
21
+ - CLI: `--worker` is not required when `--reporter` is used [[4323a75](https://github.com/skroutz/rspecq/commit/4323a75ca357274069d02ba9fb51cdebb04e0be4)]
22
+ - CLI: Improved help output [[df9faa8](https://github.com/skroutz/rspecq/commit/df9faa8ec6721af8357cfee4de6a2fe7b32070fc)]
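The environment-variable fallback listed above can be sketched for a single option as follows. This is a minimal illustration, not the gem's code: `resolve_max_requeues` is a hypothetical helper, while the `RSPECQ_MAX_REQUEUES` variable and the default of 3 are taken from `bin/rspecq` in this diff. A value given on the command line takes precedence over the environment.

```ruby
DEFAULT_MAX_REQUEUES = 3

# Hypothetical helper: prefer the CLI value, then the environment
# variable, then the built-in default.
def resolve_max_requeues(cli_value, env = ENV)
  cli_value || Integer(env["RSPECQ_MAX_REQUEUES"] || DEFAULT_MAX_REQUEUES)
end

puts resolve_max_requeues(nil, {})                             # falls back to the default
puts resolve_max_requeues(nil, "RSPECQ_MAX_REQUEUES" => "5")   # taken from the environment
puts resolve_max_requeues(7, "RSPECQ_MAX_REQUEUES" => "5")     # CLI value wins
```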
data/README.md CHANGED
@@ -1,102 +1,204 @@
1
- # RSpecQ
1
+ RSpec Queue
2
+ =========================================================================
3
+ [![Build Status](https://travis-ci.com/skroutz/rspecq.svg?branch=master)](https://travis-ci.com/github/skroutz/rspecq)
4
+ [![Gem Version](https://badge.fury.io/rb/rspecq.svg)](https://badge.fury.io/rb/rspecq)
2
5
 
3
- RSpecQ (`rspecq`) distributes and executes an RSpec suite over many workers,
4
- using a centralized queue backed by Redis.
6
+ RSpec Queue (RSpecQ) distributes and executes RSpec suites among parallel
7
+ workers. It uses a centralized queue that workers connect to and pop off
8
+ tests from. It ensures optimal scheduling of tests based on their run time,
9
+ facilitating faster CI builds.
5
10
 
6
- RSpecQ is heavily inspired by [test-queue](https://github.com/tmm1/test-queue)
11
+ RSpecQ is inspired by [test-queue](https://github.com/tmm1/test-queue)
7
12
  and [ci-queue](https://github.com/Shopify/ci-queue).
8
13
 
9
- ## Why don't you just use ci-queue?
14
+ ## Features
15
+
16
+ - Run an RSpec suite among many workers
17
+ (potentially located in different hosts) in a distributed fashion,
18
+ facilitating faster CI builds.
19
+ - Consolidated, real-time reporting of a build's progress.
20
+ - Optimal scheduling of test execution by using timings statistics from previous runs and
21
+ automatically scheduling slow spec files as individual examples. See
22
+ [*Spec file splitting*](#spec-file-splitting).
23
+ - Automatic retry of test failures before being considered legit, in order to
24
+ rule out flakiness. See [*Requeues*](#requeues).
25
+ - Handles intermittent worker failures (e.g. network hiccups, faulty hardware etc.)
26
+ by detecting non-responsive workers and requeueing their jobs. See [*Worker failures*](#worker-failures).
27
+ - [Sentry](https://sentry.io) integration for monitoring important
28
+ RSpecQ-level events.
29
+ - [PLANNED] StatsD integration for various build-level metrics and insights.
30
+ See [#2](https://github.com/skroutz/rspecq/issues/2).
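As a rough illustration of the timing-based scheduling described in the list above, the sketch below orders jobs slowest-first and gives jobs without a previous timing the median timing, mirroring the heuristic visible in the worker code of this diff. The method name `order_jobs` and the sample data are hypothetical. It assumes `timings` is non-empty and already sorted slowest first.

```ruby
# Orders jobs so that the slowest ones are processed first.
# `timings` maps job => duration in seconds, sorted slowest first.
# Jobs without a known timing get the median timing, which places
# them roughly in the middle of the queue.
def order_jobs(jobs, timings)
  default_timing = timings.values[timings.values.size / 2]
  jobs.sort_by { |job| -(timings[job] || default_timing) }
end

# Hypothetical timings from a previous build.
previous = {
  "./spec/a_spec.rb" => 30.0,
  "./spec/b_spec.rb" => 10.0,
  "./spec/c_spec.rb" => 2.0
}

p order_jobs(["./spec/c_spec.rb", "./spec/new_spec.rb", "./spec/a_spec.rb"], previous)
```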
10
31
 
11
- While evaluating ci-queue for our RSpec suite, we observed slow boot times
12
- in the workers (up to 3 minutes), increased memory consumption and too much
13
- disk I/O on boot. This is due to the fact that a worker in ci-queue has to
14
- load every spec file on boot. This can be problematic for applications with
15
- a large number of spec files.
16
-
17
- RSpecQ works with spec files as its unit of work (as opposed to ci-queue which
18
- works with individual examples). This means that an RSpecQ worker does not
19
- have to load all spec files at once and so it doesn't have the aforementioned
20
- problems. It also allows suites to keep using `before(:all)` hooks
21
- (which ci-queue explicitly rejects). (Note: RSpecQ also schedules individual
22
- examples, but only when this is deemed necessary, see section
23
- "Spec file splitting").
32
+ ## Usage
24
33
 
25
- We also observed faster build times by scheduling spec files instead of
26
- individual examples, due to way less Redis operations.
34
+ A worker needs to be given a name and the build it will participate in.
35
+ Assuming there's a Redis instance listening at `localhost`, starting a worker
36
+ is as simple as:
27
37
 
28
- The downside of this design is that it's more complicated, since the scheduling
29
- of spec files happens based on timings calculated from previous runs. This
30
- means that RSpecQ maintains a key with the timing of each job and updates it
31
- on every run (if the `--timings` option was used). Also, RSpecQ has a "slow
32
- file threshold" which, currently has to be set manually (but this can be
33
- improved).
38
+ ```shell
39
+ $ rspecq --build=123 --worker=foo1 spec/
40
+ ```
34
41
 
35
- *Update*: ci-queue deprecated support for RSpec, so there's that.
42
+ To start more workers for the same build, use distinct worker IDs but the same
43
+ build ID:
36
44
 
37
- ## Usage
45
+ ```shell
46
+ $ rspecq --build=123 --worker=foo2
47
+ ```
38
48
 
39
- Each worker needs to know the build it will participate in, its name and where
40
- Redis is located. To start a worker:
49
+ To view the progress of the build use `--report`:
41
50
 
42
51
  ```shell
43
- $ rspecq --build-id=foo --worker-id=worker1 --redis=redis://localhost
52
+ $ rspecq --build=123 --report
44
53
  ```
45
54
 
46
- To view the progress of the build print use `--report`:
55
+ For detailed info use `--help`:
47
56
 
48
- ```shell
49
- $ rspecq --build-id=foo --worker-id=reporter --redis=redis://localhost --report
50
57
  ```
58
+ NAME:
59
+ rspecq - Optimally distribute and run RSpec suites among parallel workers
60
+
61
+ USAGE:
62
+ rspecq [<options>] [spec files or directories]
63
+
64
+ OPTIONS:
65
+ -b, --build ID A unique identifier for the build. Should be common among workers participating in the same build.
66
+ -w, --worker ID An identifier for the worker. Workers participating in the same build should have distinct IDs.
67
+ -r, --redis HOST Redis host to connect to (default: 127.0.0.1).
68
+ --update-timings Update the global job timings key with the timings of this build. Note: This key is used as the basis for job scheduling.
69
+ --file-split-threshold N Split spec files slower than N seconds and schedule them as individual examples.
70
+ --report Enable reporter mode: do not pull tests off the queue; instead print build progress and exit when it's finished.
71
+ Exits with a non-zero status code if there were any failures.
72
+ --report-timeout N Fail if build is not finished after N seconds. Only applicable if --report is enabled (default: 3600).
73
+ --max-requeues N Retry failed examples up to N times before considering them legit failures (default: 3).
74
+ -h, --help Show this message.
75
+ -v, --version Print the version and exit.
76
+ ```
77
+
78
+ ### Sentry integration
51
79
 
52
- For detailed info use `--help`.
80
+ RSpecQ can optionally emit build events to a
81
+ [Sentry](https://sentry.io) project by setting the
82
+ [`SENTRY_DSN`](https://github.com/getsentry/raven-ruby#raven-only-runs-when-sentry_dsn-is-set)
83
+ environment variable.
84
+
85
+ This is convenient for monitoring important warnings/errors that may impact
86
+ build times, such as the fact that no previous timings were found and
87
+ therefore job scheduling was effectively random for a particular build.
53
88
 
54
89
 
55
90
  ## How it works
56
91
 
57
- The basic idea is identical to ci-queue so please refer to its README
92
+ The core design is almost identical to ci-queue so please refer to its
93
+ [README](https://github.com/Shopify/ci-queue/blob/master/README.md) instead.
58
94
 
59
95
  ### Terminology
60
96
 
61
- - Job: the smallest unit of work, which is usually a spec file
97
+ - **Job**: the smallest unit of work, which is usually a spec file
62
98
  (e.g. `./spec/models/foo_spec.rb`) but can also be an individual example
63
- (e.g. `./spec/models/foo_spec.rb[1:2:1]`) if the file is too slow
64
- - Queue: a collection of Redis-backed structures that hold all the necessary
65
- information for RSpecQ to function. This includes timing statistics, jobs to
66
- be executed, the failure reports, requeueing statistics and more.
67
- - Worker: a process that, given a build id, pops up jobs of that build and
68
- executes them using RSpec
69
- - Reporter: a process that, given a build id, waits for the build to finish
70
- and prints the summary report (examples executed, build result, failures etc.)
99
+ (e.g. `./spec/models/foo_spec.rb[1:2:1]`) if the file is too slow.
100
+ - **Queue**: a collection of Redis-backed structures that hold all the necessary
101
+ information for an RSpecQ build to run. This includes timing statistics,
102
+ jobs to be executed, the failure reports and more.
103
+ - **Build**: a particular test suite run. Each build has its own **Queue**.
104
+ - **Worker**: an `rspecq` process that, given a build id, consumes jobs off the
105
+ build's queue and executes them using RSpec
106
+ - **Reporter**: an `rspecq` process that, given a build id, waits for the build's
107
+ queue to be drained and prints the build summary report
71
108
 
72
109
  ### Spec file splitting
73
110
 
74
- Very slow files may put a limit to how fast the suite can execute. For example,
75
- a worker may spend 10 minutes running a single slow file, while all the other
76
- workers finish after 8 minutes. To overcome this issue, rspecq splits
77
- files that their execution time is above a certain threshold
78
- (set with the `--file-split-threshold` option) and will instead schedule them as
79
- individual examples.
111
+ Particularly slow spec files may set a limit to how fast a build can be.
112
+ For example, a single file may need 10 minutes to run while all other
113
+ files finish after 8 minutes. This would leave all but one worker
114
+ sitting idle for 2 minutes.
115
+
116
+ To overcome this issue, RSpecQ can split files whose execution time is
117
+ above a certain threshold (set with the `--file-split-threshold` option)
118
+ and instead schedule them as individual examples.
80
119
 
81
- In the future, we'd like for the slow threshold to be calculated and set
82
- dynamically.
120
+ Note: In the future, we'd like for the slow threshold to be calculated and set
121
+ dynamically (see #3).
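The selection of files to split can be sketched as below, following the `take_while` approach that appears in the worker code of this diff. The helper name `slow_files` and the sample timings are assumptions made for illustration. The timings hash is assumed to be sorted slowest first, which is what lets `take_while` stop at the first file under the threshold.

```ruby
# Returns the files in `files_to_run` whose previously recorded run time
# is at or above `threshold` seconds. `timings` maps job => duration and
# must be sorted slowest first.
def slow_files(timings, files_to_run, threshold)
  timings.take_while { |_job, duration| duration >= threshold }
         .map(&:first) & files_to_run
end

# Hypothetical timings from a previous build.
sample_timings = {
  "./spec/models/order_spec.rb" => 600.0,
  "./spec/models/user_spec.rb"  => 45.0,
  "./spec/lib/util_spec.rb"     => 3.5
}

p slow_files(sample_timings, sample_timings.keys, 300)
```

Only the files returned here would be scheduled as individual examples. All other files stay whole-file jobs.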
83
122
 
84
123
  ### Requeues
85
124
 
86
- As a mitigation measure for flaky tests, if an example fails it will be put
87
- back to the queue to be picked up by
88
- another worker. This will be repeated up to a certain number of times before,
89
- after which the example will be considered a legit failure and will be printed
90
- in the final report (`--report`).
125
+ As a mitigation technique against flaky tests, if an example fails it will be
126
+ put back to the queue to be picked up by another worker. This will be repeated
127
+ up to a certain number of times (set with the `--max-requeues` option), after
128
+ which the example will be considered a legit failure and printed as such in the
129
+ final report.
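The requeue decision described above can be sketched with an in-memory counter. This is only an illustration under stated assumptions: `RequeueTracker` is a hypothetical class standing in for the Redis-backed bookkeeping, and the real queue logic is not shown in this diff.

```ruby
# Hypothetical in-memory stand-in for the per-example requeue counters.
class RequeueTracker
  def initialize(max_requeues)
    @max_requeues = max_requeues
    @requeues = Hash.new(0)
  end

  # Returns true if the failed example is put back for another attempt,
  # false once it has used up its retries and counts as a legit failure.
  def requeue(example_id)
    return false if @requeues[example_id] >= @max_requeues

    @requeues[example_id] += 1
    true
  end
end

tracker = RequeueTracker.new(3)
4.times { puts tracker.requeue("./spec/models/foo_spec.rb[1:2:1]") }
```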
91
130
 
92
131
  ### Worker failures
93
132
 
94
- Workers emit a timestamp after each example, as a heartbeat, to denote
95
- that they're fine and performing jobs. If a worker hasn't reported for
96
- a given amount of time (see `WORKER_LIVENESS_SEC`) it is considered dead
97
- and the job it reserved will be requeued, so that it is picked up by another worker.
133
+ It's not uncommon for CI processes to encounter unrecoverable failures for
134
+ various reasons: faulty hardware, network hiccups, segmentation faults in
135
+ MRI etc.
136
+
137
+ For resiliency against such issues, workers emit a heartbeat after each
138
+ example they execute, to signal
139
+ that they're healthy and performing jobs as expected. If a worker hasn't
140
+ emitted a heartbeat for a given amount of time (set by `WORKER_LIVENESS_SEC`)
141
+ it is considered dead and its reserved job will be put back to the queue, to
142
+ be picked up by another healthy worker.
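The liveness check can be sketched as a comparison between each worker's last heartbeat and the current time. The helper `dead_workers` and the sample data are hypothetical. The 60-second default mirrors the `WORKER_LIVENESS_SEC` constant shown later in this diff.

```ruby
# Returns the IDs of workers whose last heartbeat is older than
# `liveness_sec` seconds. `heartbeats` maps worker ID => timestamp
# (in seconds) of the last heartbeat.
def dead_workers(heartbeats, now, liveness_sec = 60.0)
  heartbeats.select { |_worker, last_seen| now - last_seen > liveness_sec }.keys
end

# Hypothetical heartbeat timestamps.
beats = { "foo1" => 100.0, "foo2" => 155.0 }
p dead_workers(beats, 170.0)
```

Jobs reserved by the workers returned here would be the ones put back on the queue.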
143
+
144
+
145
+ ## Rationale
146
+
147
+ ### Why didn't you use ci-queue?
148
+
149
+ **Update**: ci-queue [deprecated support for RSpec](https://github.com/Shopify/ci-queue/pull/149).
150
+
151
+ While evaluating ci-queue we experienced slow worker boot
152
+ times (up to 3 minutes in some cases) combined with disk IO saturation and
153
+ increased memory consumption. This is due to the fact that a worker in
154
+ ci-queue has to load every spec file on boot. In applications with a large
155
+ number of spec files this may result in a significant performance hit and
156
+ in case of cloud environments, increased costs.
157
+
158
+ We also observed slower build times compared to our previous solution which
159
+ scheduled whole spec files (as opposed to individual examples), due to
160
+ big differences in runtimes of individual examples, something common in big
161
+ RSpec suites.
162
+
163
+ We decided for RSpecQ to use whole spec files as its main unit of work (as
164
+ opposed to ci-queue which uses individual examples). This means that an RSpecQ
165
+ worker only loads the files needed and ends up with a subset of all the suite's
166
+ files. (Note: RSpecQ also schedules individual examples, but only when this is
167
+ deemed necessary, see [Spec file splitting](#spec-file-splitting)).
168
+
169
+ This kept boot and test run times considerably fast. As a side benefit, this
170
+ allows suites to keep using `before(:all)` hooks (which ci-queue explicitly
171
+ rejects).
172
+
173
+ The downside of this design is that it's more complicated, since the scheduling
174
+ of spec files happens based on timings calculated from previous runs. This
175
+ means that RSpecQ maintains a key with the timing of each job and updates it
176
+ on every run (if the `--update-timings` option was used). Also, RSpecQ has a "slow
177
+ file threshold" which currently has to be set manually (but this can be
178
+ improved in the future).
179
+
180
+
181
+ ## Development
182
+
183
+ Install the required dependencies:
184
+
185
+ ```
186
+ $ bundle install
187
+ ```
188
+
189
+ Then you can execute the tests after spinning up a Redis instance at
190
+ `127.0.0.1:6379`:
191
+
192
+ ```
193
+ $ bundle exec rake
194
+ ```
195
+
196
+ To enable verbose output in the tests:
197
+
198
+ ```
199
+ $ RSPECQ_DEBUG=1 bundle exec rake
200
+ ```
98
201
 
99
- This protects us against unrecoverable worker failures (e.g. segfault).
100
202
 
101
203
  ## License
102
204
 
@@ -0,0 +1,9 @@
+ require "rake/testtask"
+
+ Rake::TestTask.new do |t|
+   t.libs << "test"
+   t.test_files = FileList['test/test*.rb']
+   t.verbose = true
+ end
+
+ task default: :test
data/bin/rspecq CHANGED
@@ -1,67 +1,118 @@
1
1
  #!/usr/bin/env ruby
2
- require "optionparser"
2
+ require "optparse"
3
3
  require "rspecq"
4
4
 
5
+ DEFAULT_REDIS_HOST = "127.0.0.1"
6
+ DEFAULT_REPORT_TIMEOUT = 3600 # 1 hour
7
+ DEFAULT_MAX_REQUEUES = 3
8
+
9
+ def env_set?(var)
10
+ ["1", "true"].include?(ENV[var])
11
+ end
12
+
5
13
  opts = {}
14
+
6
15
  OptionParser.new do |o|
7
- o.banner = "Usage: #{$PROGRAM_NAME} [opts] [files_or_directories_to_run]"
16
+ name = File.basename($PROGRAM_NAME)
8
17
 
9
- o.on("--build-id ID", "A unique identifier denoting the build") do |v|
10
- opts[:build_id] = v
18
+ o.banner = <<~BANNER
19
+ NAME:
20
+ #{name} - Optimally distribute and run RSpec suites among parallel workers
21
+
22
+ USAGE:
23
+ #{name} [<options>] [spec files or directories]
24
+ BANNER
25
+
26
+ o.separator ""
27
+ o.separator "OPTIONS:"
28
+
29
+ o.on("-b", "--build ID", "A unique identifier for the build. Should be " \
30
+ "common among workers participating in the same build.") do |v|
31
+ opts[:build] = v
11
32
  end
12
33
 
13
- o.on("--worker-id ID", "A unique identifier denoting the worker") do |v|
14
- opts[:worker_id] = v
34
+ o.on("-w", "--worker ID", "An identifier for the worker. Workers " \
35
+ "participating in the same build should have distinct IDs.") do |v|
36
+ opts[:worker] = v
15
37
  end
16
38
 
17
- o.on("--redis HOST", "Redis HOST to connect to (default: 127.0.0.1)") do |v|
18
- opts[:redis_host] = v || "127.0.0.1"
39
+ o.on("-r", "--redis HOST", "Redis host to connect to " \
40
+ "(default: #{DEFAULT_REDIS_HOST}).") do |v|
41
+ opts[:redis_host] = v
19
42
  end
20
43
 
21
- o.on("--timings", "Populate global job timings in Redis") do |v|
44
+ o.on("--update-timings", "Update the global job timings key with the " \
45
+ "timings of this build. Note: This key is used as the basis for job " \
46
+ "scheduling.") do |v|
22
47
  opts[:timings] = v
23
48
  end
24
49
 
25
- o.on("--file-split-threshold N", "Split spec files slower than N sec. and " \
26
- "schedule them by example (default: 999999)") do |v|
27
- opts[:file_split_threshold] = Float(v)
50
+ o.on("--file-split-threshold N", Integer, "Split spec files slower than N " \
51
+ "seconds and schedule them as individual examples.") do |v|
52
+ opts[:file_split_threshold] = v
28
53
  end
29
54
 
30
- o.on("--report", "Do not execute tests but wait until queue is empty and " \
31
- "print a report") do |v|
55
+ o.on("--report", "Enable reporter mode: do not pull tests off the queue; " \
56
+ "instead print build progress and exit when it's " \
57
+ "finished.\n#{o.summary_indent*9} " \
58
+ "Exits with a non-zero status code if there were any " \
59
+ "failures.") do |v|
32
60
  opts[:report] = v
33
61
  end
34
62
 
35
- o.on("--report-timeout N", Integer, "Fail if queue is not empty after " \
36
- "N seconds. Only applicable if --report is enabled " \
37
- "(default: 3600)") do |v|
63
+ o.on("--report-timeout N", Integer, "Fail if build is not finished after " \
64
+ "N seconds. Only applicable if --report is enabled " \
65
+ "(default: #{DEFAULT_REPORT_TIMEOUT}).") do |v|
38
66
  opts[:report_timeout] = v
39
67
  end
40
68
 
69
+ o.on("--max-requeues N", Integer, "Retry failed examples up to N times " \
70
+ "before considering them legit failures " \
71
+ "(default: #{DEFAULT_MAX_REQUEUES}).") do |v|
72
+ opts[:max_requeues] = v
73
+ end
74
+
75
+ o.on_tail("-h", "--help", "Show this message.") do
76
+ puts o
77
+ exit
78
+ end
79
+
80
+ o.on_tail("-v", "--version", "Print the version and exit.") do
81
+ puts "#{name} #{RSpecQ::VERSION}"
82
+ exit
83
+ end
41
84
  end.parse!
42
85
 
43
- [:build_id, :worker_id].each do |o|
44
- raise OptionParser::MissingArgument.new(o) if opts[o].nil?
45
- end
86
+ opts[:build] ||= ENV["RSPECQ_BUILD"]
87
+ opts[:worker] ||= ENV["RSPECQ_WORKER"]
88
+ opts[:redis_host] ||= ENV["RSPECQ_REDIS"] || DEFAULT_REDIS_HOST
89
+ opts[:timings] ||= env_set?("RSPECQ_UPDATE_TIMINGS")
90
+ opts[:file_split_threshold] ||= Integer(ENV["RSPECQ_FILE_SPLIT_THRESHOLD"] || 9999999)
91
+ opts[:report] ||= env_set?("RSPECQ_REPORT")
92
+ opts[:report_timeout] ||= Integer(ENV["RSPECQ_REPORT_TIMEOUT"] || DEFAULT_REPORT_TIMEOUT)
93
+ opts[:max_requeues] ||= Integer(ENV["RSPECQ_MAX_REQUEUES"] || DEFAULT_MAX_REQUEUES)
94
+
95
+ raise OptionParser::MissingArgument.new(:build) if opts[:build].nil?
96
+ raise OptionParser::MissingArgument.new(:worker) if !opts[:report] && opts[:worker].nil?
46
97
 
47
98
  if opts[:report]
48
99
  reporter = RSpecQ::Reporter.new(
49
- build_id: opts[:build_id],
50
- worker_id: opts[:worker_id],
51
- timeout: opts[:report_timeout] || 3600,
100
+ build_id: opts[:build],
101
+ timeout: opts[:report_timeout],
52
102
  redis_host: opts[:redis_host],
53
103
  )
54
104
 
55
105
  reporter.report
56
106
  else
57
107
  worker = RSpecQ::Worker.new(
58
- build_id: opts[:build_id],
59
- worker_id: opts[:worker_id],
60
- redis_host: opts[:redis_host],
61
- files_or_dirs_to_run: ARGV[0] || "spec",
108
+ build_id: opts[:build],
109
+ worker_id: opts[:worker],
110
+ redis_host: opts[:redis_host]
62
111
  )
63
112
 
113
+ worker.files_or_dirs_to_run = ARGV[0] if ARGV[0]
64
114
  worker.populate_timings = opts[:timings]
65
- worker.file_split_threshold = opts[:file_split_threshold] || 999999
115
+ worker.file_split_threshold = opts[:file_split_threshold]
116
+ worker.max_requeues = opts[:max_requeues]
66
117
  worker.work
67
118
  end
@@ -1,11 +1,10 @@
1
1
  require "rspec/core"
2
+ require "sentry-raven"
2
3
 
3
4
  module RSpecQ
4
- MAX_REQUEUES = 3
5
-
6
- # If a worker haven't executed an RSpec example for more than this time
7
- # (in seconds), it is considered dead and its reserved work will be put back
8
- # to the queue, to be picked up by another worker.
5
+ # If a worker hasn't executed an example for more than WORKER_LIVENESS_SEC
6
+ # seconds, it is considered dead and its reserved work will be put back
7
+ # to the queue to be picked up by another worker.
9
8
  WORKER_LIVENESS_SEC = 60.0
10
9
  end
11
10
 
@@ -16,6 +15,5 @@ require_relative "rspecq/formatters/worker_heartbeat_recorder"
16
15
 
17
16
  require_relative "rspecq/queue"
18
17
  require_relative "rspecq/reporter"
19
- require_relative "rspecq/worker"
20
-
21
18
  require_relative "rspecq/version"
19
+ require_relative "rspecq/worker"
@@ -1,11 +1,12 @@
1
1
  module RSpecQ
2
2
  module Formatters
3
3
  class FailureRecorder
4
- def initialize(queue, job)
4
+ def initialize(queue, job, max_requeues)
5
5
  @queue = queue
6
6
  @job = job
7
7
  @colorizer = RSpec::Core::Formatters::ConsoleCodes
8
8
  @non_example_error_recorded = false
9
+ @max_requeues = max_requeues
9
10
  end
10
11
 
11
12
  # Here we're notified about errors occurring outside of examples.
@@ -24,7 +25,7 @@ module RSpecQ
24
25
  def example_failed(notification)
25
26
  example = notification.example
26
27
 
27
- if @queue.requeue_job(example.id, MAX_REQUEUES)
28
+ if @queue.requeue_job(example.id, @max_requeues)
28
29
  # HACK: try to avoid picking the job we just requeued; we want it
29
30
  # to be picked up by a different worker
30
31
  sleep 0.5
@@ -57,6 +57,8 @@ module RSpecQ
57
57
  STATUS_INITIALIZING = "initializing".freeze
58
58
  STATUS_READY = "ready".freeze
59
59
 
60
+ attr_reader :redis
61
+
60
62
  def initialize(build_id, worker_id, redis_host)
61
63
  @build_id = build_id
62
64
  @worker_id = worker_id
@@ -150,13 +152,21 @@ module RSpecQ
150
152
  end
151
153
 
152
154
  def example_count
153
- @redis.get(key_example_count) || 0
155
+ @redis.get(key_example_count).to_i
154
156
  end
155
157
 
156
158
  def processed_jobs_count
157
159
  @redis.scard(key_queue_processed)
158
160
  end
159
161
 
162
+ def processed_jobs
163
+ @redis.smembers(key_queue_processed)
164
+ end
165
+
166
+ def requeued_jobs
167
+ @redis.hgetall(key_requeues)
168
+ end
169
+
160
170
  def become_master
161
171
  @redis.setnx(key_queue_status, STATUS_INITIALIZING)
162
172
  end
@@ -200,10 +210,10 @@ module RSpecQ
200
210
  exhausted? && example_failures.empty? && non_example_errors.empty?
201
211
  end
202
212
 
203
- private
204
-
205
- def key(*keys)
206
- [@build_id, keys].join(":")
213
+ # The remaining jobs to be processed. Jobs at the head of the list will
214
+ # be procesed first.
215
+ def unprocessed_jobs
216
+ @redis.lrange(key_queue_unprocessed, 0, -1)
207
217
  end
208
218
 
209
219
  # redis: STRING [STATUS_INITIALIZING, STATUS_READY]
@@ -279,6 +289,12 @@ module RSpecQ
279
289
  "build_times"
280
290
  end
281
291
 
292
+ private
293
+
294
+ def key(*keys)
295
+ [@build_id, keys].join(":")
296
+ end
297
+
282
298
  # We don't use any Ruby `Time` methods because specs that use timecop in
283
299
  # before(:all) hooks will mess up our times.
284
300
  def current_time
@@ -1,10 +1,9 @@
1
1
  module RSpecQ
2
2
  class Reporter
3
- def initialize(build_id:, worker_id:, timeout:, redis_host:)
3
+ def initialize(build_id:, timeout:, redis_host:)
4
4
  @build_id = build_id
5
- @worker_id = worker_id
6
5
  @timeout = timeout
7
- @queue = Queue.new(build_id, worker_id, redis_host)
6
+ @queue = Queue.new(build_id, "reporter", redis_host)
8
7
 
9
8
  # We want feedback to be immediately printed to CI users, so
10
9
  # we disable buffering.
@@ -12,7 +11,7 @@ module RSpecQ
12
11
  end
13
12
 
14
13
  def report
15
- t = measure_duration { @queue.wait_until_published }
14
+ @queue.wait_until_published
16
15
 
17
16
  finished = false
18
17
 
@@ -1,3 +1,3 @@
1
1
  module RSpecQ
2
- VERSION = "0.0.1.pre2".freeze
2
+ VERSION = "0.1.0".freeze
3
3
  end
@@ -1,10 +1,16 @@
1
1
  require "json"
2
+ require "pathname"
2
3
  require "pp"
3
4
 
4
5
  module RSpecQ
5
6
  class Worker
6
7
  HEARTBEAT_FREQUENCY = WORKER_LIVENESS_SEC / 6
7
8
 
9
+ # The root path or individual spec files to execute.
10
+ #
11
+ # Defaults to "spec" (just like in RSpec)
12
+ attr_accessor :files_or_dirs_to_run
13
+
8
14
  # If true, job timings will be populated in the global Redis timings key
9
15
  #
10
16
  # Defaults to false
@@ -12,15 +18,27 @@ module RSpecQ
12
18
 
13
19
  # If set, spec files that are known to take more than this value to finish,
14
20
  # will be split and scheduled on a per-example basis.
21
+ #
22
+ # Defaults to 999999
15
23
  attr_accessor :file_split_threshold
16
24
 
17
- def initialize(build_id:, worker_id:, redis_host:, files_or_dirs_to_run:)
25
+ # Retry failed examples up to N times (with N being the supplied value)
26
+ # before considering them legit failures
27
+ #
28
+ # Defaults to 3
29
+ attr_accessor :max_requeues
30
+
31
+ attr_reader :queue
32
+
33
+ def initialize(build_id:, worker_id:, redis_host:)
18
34
  @build_id = build_id
19
35
  @worker_id = worker_id
20
36
  @queue = Queue.new(build_id, worker_id, redis_host)
21
- @files_or_dirs_to_run = files_or_dirs_to_run
37
+ @files_or_dirs_to_run = "spec"
22
38
  @populate_timings = false
23
39
  @file_split_threshold = 999999
40
+ @heartbeat_updated_at = nil
41
+ @max_requeues = 3
24
42
 
25
43
  RSpec::Core::Formatters.register(Formatters::JobTimingRecorder, :dump_summary)
26
44
  RSpec::Core::Formatters.register(Formatters::ExampleCountRecorder, :dump_summary)
@@ -31,23 +49,23 @@ module RSpecQ
31
49
  def work
32
50
  puts "Working for build #{@build_id} (worker=#{@worker_id})"
33
51
 
34
- try_publish_queue!(@queue)
35
- @queue.wait_until_published
52
+ try_publish_queue!(queue)
53
+ queue.wait_until_published
36
54
 
37
55
  loop do
38
56
  # we have to bootstrap this so that it can be used in the first call
39
57
  # to `requeue_lost_job` inside the work loop
40
58
  update_heartbeat
41
59
 
42
- lost = @queue.requeue_lost_job
60
+ lost = queue.requeue_lost_job
43
61
  puts "Requeued lost job: #{lost}" if lost
44
62
 
45
63
  # TODO: can we make `reserve_job` also act like exhausted? and get
46
64
  # rid of `exhausted?` (i.e. return false if no jobs remain)
47
- job = @queue.reserve_job
65
+ job = queue.reserve_job
48
66
 
49
67
  # build is finished
50
- return if job.nil? && @queue.exhausted?
68
+ return if job.nil? && queue.exhausted?
51
69
 
52
70
  next if job.nil?
53
71
 
@@ -60,112 +78,125 @@ module RSpecQ
60
78
  RSpec.configuration.detail_color = :magenta
61
79
  RSpec.configuration.seed = srand && srand % 0xFFFF
62
80
  RSpec.configuration.backtrace_formatter.filter_gem('rspecq')
63
- RSpec.configuration.add_formatter(Formatters::FailureRecorder.new(@queue, job))
64
- RSpec.configuration.add_formatter(Formatters::ExampleCountRecorder.new(@queue))
81
+ RSpec.configuration.add_formatter(Formatters::FailureRecorder.new(queue, job, max_requeues))
82
+ RSpec.configuration.add_formatter(Formatters::ExampleCountRecorder.new(queue))
65
83
  RSpec.configuration.add_formatter(Formatters::WorkerHeartbeatRecorder.new(self))
66
84
 
67
85
  if populate_timings
68
- RSpec.configuration.add_formatter(Formatters::JobTimingRecorder.new(@queue, job))
86
+ RSpec.configuration.add_formatter(Formatters::JobTimingRecorder.new(queue, job))
69
87
  end
70
88
 
71
89
  opts = RSpec::Core::ConfigurationOptions.new(["--format", "progress", job])
72
90
  _result = RSpec::Core::Runner.new(opts).run($stderr, $stdout)
73
91
 
74
- @queue.acknowledge_job(job)
92
+ queue.acknowledge_job(job)
75
93
  end
76
94
  end
77
95
 
78
96
  # Update the worker heartbeat if necessary
79
97
  def update_heartbeat
80
98
  if @heartbeat_updated_at.nil? || elapsed(@heartbeat_updated_at) >= HEARTBEAT_FREQUENCY
81
- @queue.record_worker_heartbeat
99
+ queue.record_worker_heartbeat
82
100
  @heartbeat_updated_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
83
101
  end
84
102
  end
85
103
 
86
- private
87
-
88
- def reset_rspec_state!
89
- RSpec.clear_examples
90
-
91
- # TODO: remove after https://github.com/rspec/rspec-core/pull/2723
92
- RSpec.world.instance_variable_set(:@example_group_counts_by_spec_file, Hash.new(0))
93
-
94
- # RSpec.clear_examples does not reset those, which causes issues when
95
- # a non-example error occurs (subsequent jobs are not executed)
96
- # TODO: upstream
97
- RSpec.world.non_example_failure = false
98
-
99
- # we don't want an error that occured outside of the examples (which
100
- # would set this to `true`) to stop the worker
101
- RSpec.world.wants_to_quit = false
102
- end
103
-
104
104
  def try_publish_queue!(queue)
105
105
  return if !queue.become_master
106
106
 
107
- RSpec.configuration.files_or_directories_to_run = @files_or_dirs_to_run
107
+ RSpec.configuration.files_or_directories_to_run = files_or_dirs_to_run
108
108
  files_to_run = RSpec.configuration.files_to_run.map { |j| relative_path(j) }
109
109
 
110
110
  timings = queue.timings
111
111
  if timings.empty?
112
- # TODO: should be a warning reported somewhere (Sentry?)
113
112
  q_size = queue.publish(files_to_run.shuffle)
114
- puts "WARNING: No timings found! Published queue in " \
115
- "random order (size=#{q_size})"
113
+ log_event(
114
+ "No timings found! Published queue in random order (size=#{q_size})",
115
+ "warning"
116
+ )
116
117
  return
117
118
  end
118
119
 
119
- slow_files = timings.take_while do |_job, duration|
120
- duration >= file_split_threshold
121
- end.map(&:first) & files_to_run
120
+ # prepare jobs to run
121
+ jobs = []
122
+ slow_files = []
122
123
 
123
- if slow_files.any?
124
- puts "Slow files (threshold=#{file_split_threshold}): #{slow_files}"
124
+ if file_split_threshold
125
+ slow_files = timings.take_while do |_job, duration|
126
+ duration >= file_split_threshold
127
+ end.map(&:first) & files_to_run
125
128
  end
126
129
 
127
- # prepare jobs to run
128
- jobs = []
129
- jobs.concat(files_to_run - slow_files)
130
- jobs.concat(files_to_example_ids(slow_files)) if slow_files.any?
130
+ if slow_files.any?
131
+ jobs.concat(files_to_run - slow_files)
132
+ jobs.concat(files_to_example_ids(slow_files))
133
+ else
134
+ jobs.concat(files_to_run)
135
+ end
131
136
 
132
- # assign timings to all of them
133
137
  default_timing = timings.values[timings.values.size/2]
134
138
 
139
+ # assign timings (based on previous runs) to all jobs
135
140
  jobs = jobs.each_with_object({}) do |j, h|
136
- # heuristic: put untimed jobs in the middle of the queue
137
- puts "New/untimed job: #{j}" if timings[j].nil?
141
+ puts "Untimed job: #{j}" if timings[j].nil?
142
+
143
+ # HEURISTIC: put jobs without previous timings (e.g. a newly added
144
+ # spec file) in the middle of the queue
138
145
  h[j] = timings[j] || default_timing
139
146
  end
140
147
 
141
- # finally, sort them based on their timing (slowest first)
148
+ # sort jobs based on their timings (slowest to be processed first)
142
149
  jobs = jobs.sort_by { |_j, t| -t }.map(&:first)
143
150
 
144
151
  puts "Published queue (size=#{queue.publish(jobs)})"
145
152
  end
146
153
 
154
+ private
155
+
156
+ def reset_rspec_state!
157
+ RSpec.clear_examples
158
+
159
+ # see https://github.com/rspec/rspec-core/pull/2723
160
+ if Gem::Version.new(RSpec::Core::Version::STRING) <= Gem::Version.new("3.9.1")
161
+ RSpec.world.instance_variable_set(
162
+ :@example_group_counts_by_spec_file, Hash.new(0))
163
+ end
164
+
165
+ # RSpec.clear_examples does not reset those, which causes issues when
166
+ # a non-example error occurs (subsequent jobs are not executed)
167
+ # TODO: upstream
168
+ RSpec.world.non_example_failure = false
169
+
170
+ # we don't want an error that occurred outside of the examples (which
171
+ # would set this to `true`) to stop the worker
172
+ RSpec.world.wants_to_quit = false
173
+ end
174
+
147
175
  # NOTE: RSpec has to load the files before we can split them as individual
148
176
  # examples. In case a file to be split fails to be loaded
149
- # (e.g. contains a syntax error), we return the slow files unchanged,
150
- # thereby falling back to scheduling them normally.
151
- #
152
- # Their errors will be reported in the normal flow, when they're picked up
153
- # as jobs by a worker.
177
+ # (e.g. contains a syntax error), we return the files unchanged, thereby
178
+ # falling back to scheduling them as whole files. Their errors will be
179
+ # reported in the normal flow when they're eventually picked up by a worker.
154
180
  def files_to_example_ids(files)
155
- # TODO: do this programatically
156
- cmd = "DISABLE_SPRING=1 bin/rspec --dry-run --format json #{files.join(' ')}"
181
+ cmd = "DISABLE_SPRING=1 bundle exec rspec --dry-run --format json #{files.join(' ')} 2>&1"
157
182
  out = `#{cmd}`
183
+ cmd_result = $?
158
184
 
159
- if !$?.success?
160
- # TODO: emit warning to Sentry
161
- puts "WARNING: Error splitting slow files; falling back to regular scheduling:"
185
+ if !cmd_result.success?
186
+ rspec_output = begin
187
+ JSON.parse(out)
188
+ rescue JSON::ParserError
189
+ out
190
+ end
162
191
 
163
- begin
164
- pp JSON.parse(out)
165
- rescue JSON::ParserError
166
- puts out
167
- end
168
- puts
192
+ log_event(
193
+ "Failed to split slow files, falling back to regular scheduling",
194
+ "error",
195
+ rspec_output: rspec_output,
196
+ cmd_result: cmd_result.inspect,
197
+ )
198
+
199
+ pp rspec_output
169
200
 
170
201
  return files
171
202
  end
@@ -181,5 +212,23 @@ module RSpecQ
181
212
  def elapsed(since)
182
213
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - since
183
214
  end
215
+
216
+ # Prints msg to standard output and emits an event to Sentry, if the
217
+ # SENTRY_DSN environment variable is set.
218
+ def log_event(msg, level, additional={})
219
+ puts msg
220
+
221
+ Raven.capture_message(msg, level: level, extra: {
222
+ build: @build_id,
223
+ worker: @worker_id,
224
+ queue: queue.inspect,
225
+ files_or_dirs_to_run: files_or_dirs_to_run,
226
+ populate_timings: populate_timings,
227
+ file_split_threshold: file_split_threshold,
228
+ heartbeat_updated_at: @heartbeat_updated_at,
229
+ object: self.inspect,
230
+ pid: Process.pid,
231
+ }.merge(additional))
232
+ end
184
233
  end
185
234
  end
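The queue-ordering logic in `try_publish_queue!` above can be read in isolation as a small, pure function. The following is a minimal sketch, not the gem's actual code: `order_jobs` is a hypothetical helper name, and it assumes (as the diff suggests) that `timings` is a Hash of job to duration, already sorted slowest-first. Jobs with no recorded timing get the middle timing, so they land around the middle of the queue.

```ruby
# Hypothetical helper, not part of RSpecQ: orders jobs slowest-first.
# Assumes `timings` maps job => duration and is sorted slowest-first.
def order_jobs(jobs, timings)
  values = timings.values

  # Middle element of the (sorted) timings, used for untimed jobs
  default_timing = values[values.size / 2]

  jobs
    .map { |job| [job, timings[job] || default_timing] }
    .sort_by { |_job, duration| -duration } # slowest first
    .map(&:first)
end

timings = {
  "spec/slow_spec.rb" => 30.0,
  "spec/mid_spec.rb"  => 10.0,
  "spec/fast_spec.rb" => 1.0,
}

# "spec/new_spec.rb" has no previous timing, so it is scheduled with the
# middle timing (10.0) and ends up between the slowest and fastest jobs.
jobs = ["spec/fast_spec.rb", "spec/new_spec.rb",
        "spec/slow_spec.rb", "spec/mid_spec.rb"]

puts order_jobs(jobs, timings)
```

Note that `sort_by` is not guaranteed to be stable, so the relative order of jobs with equal timings (here the untimed file and `spec/mid_spec.rb`) is unspecified.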
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: rspecq
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1.pre2
4
+ version: 0.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Agis Anastasopoulos
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2020-06-26 00:00:00.000000000 Z
11
+ date: 2020-08-27 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: rspec-core
@@ -38,22 +38,64 @@ dependencies:
38
38
  - - ">="
39
39
  - !ruby/object:Gem::Version
40
40
  version: '0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: sentry-raven
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: '0'
48
+ type: :runtime
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: rake
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ">="
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ name: pry-byebug
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - ">="
74
+ - !ruby/object:Gem::Version
75
+ version: '0'
76
+ type: :development
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - ">="
81
+ - !ruby/object:Gem::Version
82
+ version: '0'
41
83
  - !ruby/object:Gem::Dependency
42
84
  name: minitest
43
85
  requirement: !ruby/object:Gem::Requirement
44
86
  requirements:
45
- - - "~>"
87
+ - - ">="
46
88
  - !ruby/object:Gem::Version
47
- version: '5.14'
89
+ version: '0'
48
90
  type: :development
49
91
  prerelease: false
50
92
  version_requirements: !ruby/object:Gem::Requirement
51
93
  requirements:
52
- - - "~>"
94
+ - - ">="
53
95
  - !ruby/object:Gem::Version
54
- version: '5.14'
96
+ version: '0'
55
97
  - !ruby/object:Gem::Dependency
56
- name: rake
98
+ name: rspec
57
99
  requirement: !ruby/object:Gem::Requirement
58
100
  requirements:
59
101
  - - ">="
@@ -76,6 +118,7 @@ files:
76
118
  - CHANGELOG.md
77
119
  - LICENSE
78
120
  - README.md
121
+ - Rakefile
79
122
  - bin/rspecq
80
123
  - lib/rspecq.rb
81
124
  - lib/rspecq/formatters/example_count_recorder.rb
@@ -101,12 +144,13 @@ required_ruby_version: !ruby/object:Gem::Requirement
101
144
  version: '0'
102
145
  required_rubygems_version: !ruby/object:Gem::Requirement
103
146
  requirements:
104
- - - ">"
147
+ - - ">="
105
148
  - !ruby/object:Gem::Version
106
- version: 1.3.1
149
+ version: '0'
107
150
  requirements: []
108
- rubygems_version: 3.1.2
151
+ rubygems_version: 3.1.4
109
152
  signing_key:
110
153
  specification_version: 4
111
- summary: Distribute an RSpec suite among many workers
154
+ summary: Optimally distribute and run RSpec suites among parallel workers; for faster
155
+ CI builds
112
156
  test_files: []
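The metadata diff above relaxes the minitest constraint from `~>` `5.14` to `>=` `0`, and the worker diff guards an RSpec workaround with a `Gem::Version` comparison. As a rough illustration of what those operators mean, the sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The version numbers passed in are made-up examples, not versions the gem is known to be tested against.

```ruby
require "rubygems"

# "~> 5.14" is the pessimistic operator: >= 5.14 and < 6.0
pessimistic = Gem::Requirement.new("~> 5.14")
# ">= 0" accepts any release version
any_version = Gem::Requirement.new(">= 0")

puts pessimistic.satisfied_by?(Gem::Version.new("5.14.1")) # true
puts pessimistic.satisfied_by?(Gem::Version.new("6.0.0"))  # false
puts any_version.satisfied_by?(Gem::Version.new("6.0.0"))  # true

# Gem::Version compares segment by segment, unlike plain strings,
# which is why the worker wraps both sides in Gem::Version.new.
puts Gem::Version.new("3.10.0") <= Gem::Version.new("3.9.1") # false
puts "3.10.0" <= "3.9.1"                                     # true (lexical)
```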