rspecq 0.0.1.pre1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: a1c9e27a7a39ff772ee8f4303d26af799f4c7f20232cdc8729f3fa1ddcc4c144
4
+ data.tar.gz: 2dc1200b575b95b10f2dca4e4a1f3ba90e1577a6af4fd177691aece592249ed6
5
+ SHA512:
6
+ metadata.gz: c7654d037340e28e5ed31dfbed7826e30b84a2e092f930df13e76d92d0513c6ef2d25727c18bb7b84ad89675835e28e848c85f5b010ab13384674e2c2763f06f
7
+ data.tar.gz: 2b7421273d4b38848e8110526fb29740f67f9a620d2f205a612887503ce3b04463fafb9eff0dd9845c1eb56c4810fca97eed37761a6a23fa9fc39ab962d5373b
data/CHANGELOG.md ADDED
@@ -0,0 +1,4 @@
1
+ # Changelog
2
+
3
+ ## master (unreleased)
4
+
data/LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ The MIT License
2
+
3
+ Copyright (c) 2020 Skroutz S.A.
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy of
6
+ this software and associated documentation files (the "Software"), to deal in
7
+ the Software without restriction, including without limitation the rights to
8
+ use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
9
+ the Software, and to permit persons to whom the Software is furnished to do so,
10
+ subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
17
+ FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
18
+ COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
19
+ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
20
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,103 @@
1
+ # RSpecQ
2
+
3
+ RSpecQ (`rspecq`) distributes and executes an RSpec suite over many workers,
4
+ using a centralized queue backed by Redis.
5
+
6
+ RSpecQ is heavily inspired by [test-queue](https://github.com/tmm1/test-queue)
7
+ and [ci-queue](https://github.com/Shopify/ci-queue).
8
+
9
+ ## Why don't you just use ci-queue?
10
+
11
+ While evaluating ci-queue for our RSpec suite, we observed slow boot times
12
+ in the workers (up to 3 minutes), increased memory consumption, and excessive
13
+ disk I/O on boot. This is because a worker in ci-queue has to
14
+ load every spec file on boot. This can be problematic for applications with
15
+ a large number of spec files.
16
+
17
+ RSpecQ works with spec files as its unit of work (as opposed to ci-queue which
18
+ works with individual examples). This means that an RSpecQ worker does not
19
+ have to load all spec files at once and so it doesn't have the aforementioned
20
+ problems. It also allows suites to keep using `before(:all)` hooks
21
+ (which ci-queue explicitly rejects). (Note: RSpecQ also schedules individual
22
+ examples, but only when this is deemed necessary; see the
23
+ "Spec file splitting").
24
+
25
+ We also observed faster build times by scheduling spec files instead of
26
+ individual examples, due to way less Redis operations.
27
+
28
+ The downside of this design is that it's more complicated, since the scheduling
29
+ of spec files happens based on timings calculated from previous runs. This
30
+ means that RSpecQ maintains a key with the timing of each job and updates it
31
+ on every run (if the `--timings` option was used). Also, RSpecQ has a "slow
32
+ file threshold" which, currently has to be set manually (but this can be
33
+ improved).
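
As a minimal sketch of how timings get recorded: workers are started with the `--timings` flag, and since the timings key is shared across builds (see the note in `lib/rspecq/queue.rb`), it typically makes sense to record timings from a single branch only (e.g. master). Build id, worker id and the Redis location below are placeholders:

```shell
$ rspecq --build-id=build-123 --worker-id=worker1 --redis=redis://localhost --timings
```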
34
+
35
+ *Update*: ci-queue deprecated support for RSpec, so there's that.
36
+
37
+ ## Usage
38
+
39
+ Each worker needs to know the build it will participate in, its name and where
40
+ Redis is located. To start a worker:
41
+
42
+ ```shell
43
+ $ rspecq --build-id=foo --worker-id=worker1 --redis=redis://localhost
44
+ ```
45
+
46
+ To view the progress of the build, use `--report`:
47
+
48
+ ```shell
49
+ $ rspecq --build-id=foo --worker-id=reporter --redis=redis://localhost --report
50
+ ```
51
+
52
+ For detailed info use `--help`.
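
For illustration only, a CI script might start several workers against the same build and then run a reporter step; the parallelism, `$BUILD_ID`, worker names and Redis host below are placeholders and not part of the gem:

```shell
# start three workers that consume the same build's queue (placeholder ids/host)
for i in 1 2 3; do
  rspecq --build-id="$BUILD_ID" --worker-id="worker$i" --redis=redis://localhost &
done
wait

# separately, wait for the queue to drain and print the summary report
rspecq --build-id="$BUILD_ID" --worker-id=reporter --redis=redis://localhost --report
```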
53
+
54
+
55
+ ## How it works
56
+
57
+ The basic idea is identical to ci-queue, so please refer to its README.
58
+
59
+ ### Terminology
60
+
61
+ - Job: the smallest unit of work, which is usually a spec file
62
+ (e.g. `./spec/models/foo_spec.rb`) but can also be an individual example
63
+ (e.g. `./spec/models/foo_spec.rb[1:2:1]`) if the file is too slow
64
+ - Queue: a collection of Redis-backed structures that hold all the necessary
65
+ information for RSpecQ to function. This includes timing statistics, jobs to
66
+ be executed, the failure reports, requeueing statistics and more.
67
+ - Worker: a process that, given a build id, pops jobs of that build and
68
+ executes them using RSpec
69
+ - Reporter: a process that, given a build id, waits for the build to finish
70
+ and prints the summary report (examples executed, build result, failures etc.)
71
+
72
+ ### Spec file splitting
73
+
74
+ Very slow files may limit how fast the suite can execute. For example,
75
+ a worker may spend 10 minutes running a single slow file, while all the other
76
+ workers finish after 8 minutes. To overcome this issue, RSpecQ splits
77
+ files whose execution time is above a certain threshold
78
+ (set with the `--file-split-threshold` option) and schedules them as
79
+ individual examples instead.
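
For example, a worker could be told to split any file whose recorded timing exceeds 60 seconds; the threshold value here is purely illustrative:

```shell
$ rspecq --build-id=foo --worker-id=worker1 --redis=redis://localhost --file-split-threshold=60
```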
80
+
81
+ In the future, we'd like the slow threshold to be calculated and set
82
+ dynamically.
83
+
84
+ ### Requeues
85
+
86
+ As a mitigation measure for flaky tests, if an example fails it will be put
87
+ back to the queue to be picked up by
88
+ another worker. This will be repeated up to a certain number of times,
89
+ after which the example will be considered a legitimate failure and will be printed
90
+ in the final report (`--report`).
91
+
92
+ ### Worker failures
93
+
94
+ Workers emit a timestamp after each example, as a heartbeat, to denote
95
+ that they're fine and performing jobs. If a worker hasn't reported for
96
+ a given amount of time (see `WORKER_LIVENESS_SEC`), it is considered dead
97
+ and the job it reserved will be requeued, so that it is picked up by another worker.
98
+
99
+ This protects us against unrecoverable worker failures (e.g. segfault).
100
+
101
+ ## License
102
+
103
+ RSpecQ is licensed under MIT. See [LICENSE](LICENSE).
data/bin/rspecq ADDED
@@ -0,0 +1,67 @@
1
+ #!/usr/bin/env ruby
2
+ require "optionparser"
3
+ require "rspecq"
4
+
5
+ opts = {}
6
+ OptionParser.new do |o|
7
+ o.banner = "Usage: #{$PROGRAM_NAME} [opts] [files_or_directories_to_run]"
8
+
9
+ o.on("--build-id ID", "A unique identifier denoting the build") do |v|
10
+ opts[:build_id] = v
11
+ end
12
+
13
+ o.on("--worker-id ID", "A unique identifier denoting the worker") do |v|
14
+ opts[:worker_id] = v
15
+ end
16
+
17
+ o.on("--redis HOST", "Redis HOST to connect to (default: 127.0.0.1)") do |v|
18
+ opts[:redis_host] = v || "127.0.0.1"
19
+ end
20
+
21
+ o.on("--timings", "Populate global job timings in Redis") do |v|
22
+ opts[:timings] = v
23
+ end
24
+
25
+ o.on("--file-split-threshold N", "Split spec files slower than N sec. and " \
26
+ "schedule them by example (default: 999999)") do |v|
27
+ opts[:file_split_threshold] = Float(v)
28
+ end
29
+
30
+ o.on("--report", "Do not execute tests but wait until queue is empty and " \
31
+ "print a report") do |v|
32
+ opts[:report] = v
33
+ end
34
+
35
+ o.on("--report-timeout N", Integer, "Fail if queue is not empty after " \
36
+ "N seconds. Only applicable if --report is enabled " \
37
+ "(default: 3600)") do |v|
38
+ opts[:report_timeout] = v
39
+ end
40
+
41
+ end.parse!
42
+
43
+ [:build_id, :worker_id].each do |o|
44
+ raise OptionParser::MissingArgument.new(o) if opts[o].nil?
45
+ end
46
+
47
+ if opts[:report]
48
+ reporter = RSpecQ::Reporter.new(
49
+ build_id: opts[:build_id],
50
+ worker_id: opts[:worker_id],
51
+ timeout: opts[:report_timeout] || 3600,
52
+ redis_host: opts[:redis_host],
53
+ )
54
+
55
+ reporter.report
56
+ else
57
+ worker = RSpecQ::Worker.new(
58
+ build_id: opts[:build_id],
59
+ worker_id: opts[:worker_id],
60
+ redis_host: opts[:redis_host],
61
+ files_or_dirs_to_run: ARGV[0] || "spec",
62
+ )
63
+
64
+ worker.populate_timings = opts[:timings]
65
+ worker.file_split_threshold = opts[:file_split_threshold] || 999999
66
+ worker.work
67
+ end
data/lib/rspecq.rb ADDED
@@ -0,0 +1,21 @@
1
+ require "rspec/core"
2
+
3
+ module RSpecQ
4
+ MAX_REQUEUES = 3
5
+
6
+ # If a worker hasn't executed an RSpec example for more than this time
7
+ # (in seconds), it is considered dead and its reserved work will be put back
8
+ # to the queue, to be picked up by another worker.
9
+ WORKER_LIVENESS_SEC = 60.0
10
+ end
11
+
12
+ require_relative "rspecq/formatters/example_count_recorder"
13
+ require_relative "rspecq/formatters/failure_recorder"
14
+ require_relative "rspecq/formatters/job_timing_recorder"
15
+ require_relative "rspecq/formatters/worker_heartbeat_recorder"
16
+
17
+ require_relative "rspecq/queue"
18
+ require_relative "rspecq/reporter"
19
+ require_relative "rspecq/worker"
20
+
21
+ require_relative "rspecq/version"
data/lib/rspecq/formatters/example_count_recorder.rb ADDED
@@ -0,0 +1,15 @@
1
+ module RSpecQ
2
+ module Formatters
3
+ # Increments the example counter after each job.
4
+ class ExampleCountRecorder
5
+ def initialize(queue)
6
+ @queue = queue
7
+ end
8
+
9
+ def dump_summary(summary)
10
+ n = summary.examples.count
11
+ @queue.increment_example_count(n) if n > 0
12
+ end
13
+ end
14
+ end
15
+ end
data/lib/rspecq/formatters/failure_recorder.rb ADDED
@@ -0,0 +1,50 @@
1
+ module RSpecQ
2
+ module Formatters
3
+ class FailureRecorder
4
+ def initialize(queue, job)
5
+ @queue = queue
6
+ @job = job
7
+ @colorizer = RSpec::Core::Formatters::ConsoleCodes
8
+ @non_example_error_recorded = false
9
+ end
10
+
11
+ # Here we're notified about errors occurring outside of examples.
12
+ #
13
+ # NOTE: Upon such an error, RSpec emits multiple notifications but we only
14
+ # want the _first_, which is the one that contains the error backtrace.
15
+ # That's why we have to keep track of whether we've already received the
16
+ # needed notification and act accordingly.
17
+ def message(n)
18
+ if RSpec.world.non_example_failure && !@non_example_error_recorded
19
+ @queue.record_non_example_error(@job, n.message)
20
+ @non_example_error_recorded = true
21
+ end
22
+ end
23
+
24
+ def example_failed(notification)
25
+ example = notification.example
26
+
27
+ if @queue.requeue_job(example.id, MAX_REQUEUES)
28
+ # HACK: try to avoid picking the job we just requeued; we want it
29
+ # to be picked up by a different worker
30
+ sleep 0.5
31
+ return
32
+ end
33
+
34
+ presenter = RSpec::Core::Formatters::ExceptionPresenter.new(
35
+ example.exception, example)
36
+
37
+ msg = presenter.fully_formatted(nil, @colorizer)
38
+ msg << "\n"
39
+ msg << @colorizer.wrap(
40
+ "bin/rspec #{example.location_rerun_argument}",
41
+ RSpec.configuration.failure_color)
42
+
43
+ msg << @colorizer.wrap(
44
+ " # #{example.full_description}", RSpec.configuration.detail_color)
45
+
46
+ @queue.record_example_failure(notification.example.id, msg)
47
+ end
48
+ end
49
+ end
50
+ end
data/lib/rspecq/formatters/job_timing_recorder.rb ADDED
@@ -0,0 +1,14 @@
1
+ module RSpecQ
2
+ module Formatters
3
+ class JobTimingRecorder
4
+ def initialize(queue, job)
5
+ @queue = queue
6
+ @job = job
7
+ end
8
+
9
+ def dump_summary(summary)
10
+ @queue.record_timing(@job, Float(summary.duration))
11
+ end
12
+ end
13
+ end
14
+ end
data/lib/rspecq/formatters/worker_heartbeat_recorder.rb ADDED
@@ -0,0 +1,17 @@
1
+ module RSpecQ
2
+ module Formatters
3
+ # Updates the respective heartbeat key of the worker after each example.
4
+ #
5
+ # Refer to the documentation of WORKER_LIVENESS_SEC for more info.
6
+ class WorkerHeartbeatRecorder
7
+ def initialize(worker)
8
+ @worker = worker
9
+ end
10
+
11
+ def example_finished(*)
12
+ @worker.update_heartbeat
13
+ end
14
+ end
15
+ end
16
+ end
17
+
data/lib/rspecq/queue.rb ADDED
@@ -0,0 +1,288 @@
1
+ require "redis"
2
+
3
+ module RSpecQ
4
+ class Queue
5
+ RESERVE_JOB = <<~LUA.freeze
6
+ local queue = KEYS[1]
7
+ local queue_running = KEYS[2]
8
+ local worker_id = ARGV[1]
9
+
10
+ local job = redis.call('lpop', queue)
11
+ if job then
12
+ redis.call('hset', queue_running, worker_id, job)
13
+ return job
14
+ else
15
+ return nil
16
+ end
17
+ LUA
18
+
19
+ # Scans for dead workers and puts their reserved jobs back to the queue.
20
+ REQUEUE_LOST_JOB = <<~LUA.freeze
21
+ local worker_heartbeats = KEYS[1]
22
+ local queue_running = KEYS[2]
23
+ local queue_unprocessed = KEYS[3]
24
+ local time_now = ARGV[1]
25
+ local timeout = ARGV[2]
26
+
27
+ local dead_workers = redis.call('zrangebyscore', worker_heartbeats, 0, time_now - timeout)
28
+ for _, worker in ipairs(dead_workers) do
29
+ local job = redis.call('hget', queue_running, worker)
30
+ if job then
31
+ redis.call('lpush', queue_unprocessed, job)
32
+ redis.call('hdel', queue_running, worker)
33
+ return job
34
+ end
35
+ end
36
+
37
+ return nil
38
+ LUA
39
+
40
+ REQUEUE_JOB = <<~LUA.freeze
41
+ local key_queue_unprocessed = KEYS[1]
42
+ local key_requeues = KEYS[2]
43
+ local job = ARGV[1]
44
+ local max_requeues = ARGV[2]
45
+
46
+ local requeued_times = redis.call('hget', key_requeues, job)
47
+ if requeued_times and requeued_times >= max_requeues then
48
+ return nil
49
+ end
50
+
51
+ redis.call('lpush', key_queue_unprocessed, job)
52
+ redis.call('hincrby', key_requeues, job, 1)
53
+
54
+ return true
55
+ LUA
56
+
57
+ STATUS_INITIALIZING = "initializing".freeze
58
+ STATUS_READY = "ready".freeze
59
+
60
+ def initialize(build_id, worker_id, redis_host)
61
+ @build_id = build_id
62
+ @worker_id = worker_id
63
+ @redis = Redis.new(host: redis_host, id: worker_id)
64
+ end
65
+
66
+ # NOTE: jobs will be processed from head to tail (lpop)
67
+ def publish(jobs)
68
+ @redis.multi do
69
+ @redis.rpush(key_queue_unprocessed, jobs)
70
+ @redis.set(key_queue_status, STATUS_READY)
71
+ end.first
72
+ end
73
+
74
+ def reserve_job
75
+ @redis.eval(
76
+ RESERVE_JOB,
77
+ keys: [
78
+ key_queue_unprocessed,
79
+ key_queue_running,
80
+ ],
81
+ argv: [@worker_id]
82
+ )
83
+ end
84
+
85
+ def requeue_lost_job
86
+ @redis.eval(
87
+ REQUEUE_LOST_JOB,
88
+ keys: [
89
+ key_worker_heartbeats,
90
+ key_queue_running,
91
+ key_queue_unprocessed
92
+ ],
93
+ argv: [
94
+ current_time,
95
+ WORKER_LIVENESS_SEC
96
+ ]
97
+ )
98
+ end
99
+
100
+ # NOTE: The same job might happen to be acknowledged more than once, in
101
+ # the case of requeues.
102
+ def acknowledge_job(job)
103
+ @redis.multi do
104
+ @redis.hdel(key_queue_running, @worker_id)
105
+ @redis.sadd(key_queue_processed, job)
106
+ end
107
+ end
108
+
109
+ # Put job at the head of the queue to be re-processed right after, by
110
+ # another worker. This is a mitigation measure against flaky tests.
111
+ #
112
+ # Returns nil if the job hit the requeue limit and therefore was not
113
+ # requeued and should be considered a failure.
114
+ def requeue_job(job, max_requeues)
115
+ return false if max_requeues.zero?
116
+
117
+ @redis.eval(
118
+ REQUEUE_JOB,
119
+ keys: [key_queue_unprocessed, key_requeues],
120
+ argv: [job, max_requeues],
121
+ )
122
+ end
123
+
124
+ def record_example_failure(example_id, message)
125
+ @redis.hset(key_failures, example_id, message)
126
+ end
127
+
128
+ # For errors occured outside of examples (e.g. while loading a spec file)
129
+ def record_non_example_error(job, message)
130
+ @redis.hset(key_errors, job, message)
131
+ end
132
+
133
+ def record_timing(job, duration)
134
+ @redis.zadd(key_timings, duration, job)
135
+ end
136
+
137
+ def record_build_time(duration)
138
+ @redis.multi do
139
+ @redis.lpush(key_build_times, Float(duration))
140
+ @redis.ltrim(key_build_times, 0, 99)
141
+ end
142
+ end
143
+
144
+ def record_worker_heartbeat
145
+ @redis.zadd(key_worker_heartbeats, current_time, @worker_id)
146
+ end
147
+
148
+ def increment_example_count(n)
149
+ @redis.incrby(key_example_count, n)
150
+ end
151
+
152
+ def example_count
153
+ @redis.get(key_example_count) || 0
154
+ end
155
+
156
+ def processed_jobs_count
157
+ @redis.scard(key_queue_processed)
158
+ end
159
+
160
+ def become_master
161
+ @redis.setnx(key_queue_status, STATUS_INITIALIZING)
162
+ end
163
+
164
+ # ordered by execution time desc (slowest are in the head)
165
+ def timings
166
+ Hash[@redis.zrevrange(key_timings, 0, -1, withscores: true)]
167
+ end
168
+
169
+ def example_failures
170
+ @redis.hgetall(key_failures)
171
+ end
172
+
173
+ def non_example_errors
174
+ @redis.hgetall(key_errors)
175
+ end
176
+
177
+ def exhausted?
178
+ return false if !published?
179
+
180
+ @redis.multi do
181
+ @redis.llen(key_queue_unprocessed)
182
+ @redis.hlen(key_queue_running)
183
+ end.inject(:+).zero?
184
+ end
185
+
186
+ def published?
187
+ @redis.get(key_queue_status) == STATUS_READY
188
+ end
189
+
190
+ def wait_until_published(timeout=30)
191
+ (timeout * 10).times do
192
+ return if published?
193
+ sleep 0.1
194
+ end
195
+
196
+ raise "Queue not yet published after #{timeout} seconds"
197
+ end
198
+
199
+ def build_successful?
200
+ exhausted? && example_failures.empty? && non_example_errors.empty?
201
+ end
202
+
203
+ private
204
+
205
+ def key(*keys)
206
+ [@build_id, keys].join(":")
207
+ end
208
+
209
+ # redis: STRING [STATUS_INITIALIZING, STATUS_READY]
210
+ def key_queue_status
211
+ key("queue", "status")
212
+ end
213
+
214
+ # redis: LIST<job>
215
+ def key_queue_unprocessed
216
+ key("queue", "unprocessed")
217
+ end
218
+
219
+ # redis: HASH<worker_id => job>
220
+ def key_queue_running
221
+ key("queue", "running")
222
+ end
223
+
224
+ # redis: SET<job>
225
+ def key_queue_processed
226
+ key("queue", "processed")
227
+ end
228
+
229
+ # Contains regular RSpec example failures.
230
+ #
231
+ # redis: HASH<example_id => error message>
232
+ def key_failures
233
+ key("example_failures")
234
+ end
235
+
236
+ # Contains errors raised outside of RSpec examples
237
+ # (e.g. a syntax error in spec_helper.rb).
238
+ #
239
+ # redis: HASH<job => error message>
240
+ def key_errors
241
+ key("errors")
242
+ end
243
+
244
+ # As a mitigation mechanism for flaky tests, we requeue example failures
245
+ # to be retried by another worker, up to a certain number of times.
246
+ #
247
+ # redis: HASH<job => times_retried>
248
+ def key_requeues
249
+ key("requeues")
250
+ end
251
+
252
+ # The total number of examples, those that were requeued.
253
+ #
254
+ # redis: STRING<integer>
255
+ def key_example_count
256
+ key("example_count")
257
+ end
258
+
259
+ # redis: ZSET<worker_id => timestamp>
260
+ #
261
+ # Timestamp of the last example processed by each worker.
262
+ def key_worker_heartbeats
263
+ key("worker_heartbeats")
264
+ end
265
+
266
+ # redis: ZSET<job => duration>
267
+ #
268
+ # NOTE: This key is not scoped to a build (i.e. shared among all builds),
269
+ # so be careful to only publish timings from a single branch (e.g. master).
270
+ # Otherwise, timings won't be accurate.
271
+ def key_timings
272
+ "timings"
273
+ end
274
+
275
+ # redis: LIST<duration>
276
+ #
277
+ # Last build is at the head of the list.
278
+ def key_build_times
279
+ "build_times"
280
+ end
281
+
282
+ # We don't use any Ruby `Time` methods because specs that use timecop in
283
+ # before(:all) hooks will mess up our times.
284
+ def current_time
285
+ @redis.time[0]
286
+ end
287
+ end
288
+ end
data/lib/rspecq/reporter.rb ADDED
@@ -0,0 +1,95 @@
1
+ module RSpecQ
2
+ class Reporter
3
+ def initialize(build_id:, worker_id:, timeout:, redis_host:)
4
+ @build_id = build_id
5
+ @worker_id = worker_id
6
+ @timeout = timeout
7
+ @queue = Queue.new(build_id, worker_id, redis_host)
8
+
9
+ # We want feedback to be immediately printed to CI users, so
10
+ # we disable buffering.
11
+ STDOUT.sync = true
12
+ end
13
+
14
+ def report
15
+ t = measure_duration { @queue.wait_until_published }
16
+
17
+ finished = false
18
+
19
+ reported_failures = {}
20
+ failure_heading_printed = false
21
+
22
+ tests_duration = measure_duration do
23
+ @timeout.times do |i|
24
+ @queue.example_failures.each do |job, rspec_output|
25
+ next if reported_failures[job]
26
+
27
+ if !failure_heading_printed
28
+ puts "\nFailures:\n"
29
+ failure_heading_printed = true
30
+ end
31
+
32
+ reported_failures[job] = true
33
+ puts failure_formatted(rspec_output)
34
+ end
35
+
36
+ if !@queue.exhausted?
37
+ sleep 1
38
+ next
39
+ end
40
+
41
+ finished = true
42
+ break
43
+ end
44
+ end
45
+
46
+ raise "Build not finished after #{@timeout} seconds" if !finished
47
+
48
+ @queue.record_build_time(tests_duration)
49
+ puts summary(@queue.example_failures, @queue.non_example_errors,
50
+ humanize_duration(tests_duration))
51
+
52
+ exit 1 if !@queue.build_successful?
53
+ end
54
+
55
+ private
56
+
57
+ def measure_duration
58
+ start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
59
+ yield
60
+ (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start).round(2)
61
+ end
62
+
63
+ # We try to keep this output consistent with RSpec's original output
64
+ def summary(failures, errors, duration)
65
+ failed_examples_section = "\nFailed examples:\n\n"
66
+
67
+ failures.each do |_job, msg|
68
+ parts = msg.split("\n")
69
+ failed_examples_section << " #{parts[-1]}\n"
70
+ end
71
+
72
+ summary = ""
73
+ summary << failed_examples_section if !failures.empty?
74
+
75
+ errors.each { |_job, msg| summary << msg }
76
+
77
+ summary << "\n"
78
+ summary << "Total results:\n"
79
+ summary << " #{@queue.example_count} examples " \
80
+ "(#{@queue.processed_jobs_count} jobs), " \
81
+ "#{failures.count} failures, " \
82
+ "#{errors.count} errors"
83
+ summary << "\n\n"
84
+ summary << "Spec execution time: #{duration}"
85
+ end
86
+
87
+ def failure_formatted(rspec_output)
88
+ rspec_output.split("\n")[0..-2].join("\n")
89
+ end
90
+
91
+ def humanize_duration(seconds)
92
+ Time.at(seconds).utc.strftime("%H:%M:%S")
93
+ end
94
+ end
95
+ end
data/lib/rspecq/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module RSpecQ
2
+ VERSION = "0.0.1.pre1".freeze
3
+ end
data/lib/rspecq/worker.rb ADDED
@@ -0,0 +1,185 @@
1
+ require "json"
2
+ require "pp"
3
+
4
+ module RSpecQ
5
+ class Worker
6
+ HEARTBEAT_FREQUENCY = WORKER_LIVENESS_SEC / 6
7
+
8
+ # If true, job timings will be populated in the global Redis timings key
9
+ #
10
+ # Defaults to false
11
+ attr_accessor :populate_timings
12
+
13
+ # If set, spec files that are known to take more than this value to finish
14
+ # will be split and scheduled on a per-example basis.
15
+ attr_accessor :file_split_threshold
16
+
17
+ def initialize(build_id:, worker_id:, redis_host:, files_or_dirs_to_run:)
18
+ @build_id = build_id
19
+ @worker_id = worker_id
20
+ @queue = Queue.new(build_id, worker_id, redis_host)
21
+ @files_or_dirs_to_run = files_or_dirs_to_run
22
+ @populate_timings = false
23
+ @file_split_threshold = 999999
24
+
25
+ RSpec::Core::Formatters.register(Formatters::JobTimingRecorder, :dump_summary)
26
+ RSpec::Core::Formatters.register(Formatters::ExampleCountRecorder, :dump_summary)
27
+ RSpec::Core::Formatters.register(Formatters::FailureRecorder, :example_failed, :message)
28
+ RSpec::Core::Formatters.register(Formatters::WorkerHeartbeatRecorder, :example_finished)
29
+ end
30
+
31
+ def work
32
+ puts "Working for build #{@build_id} (worker=#{@worker_id})"
33
+
34
+ try_publish_queue!(@queue)
35
+ @queue.wait_until_published
36
+
37
+ loop do
38
+ # we have to bootstrap this so that it can be used in the first call
39
+ # to `requeue_lost_job` inside the work loop
40
+ update_heartbeat
41
+
42
+ lost = @queue.requeue_lost_job
43
+ puts "Requeued lost job: #{lost}" if lost
44
+
45
+ # TODO: can we make `reserve_job` also act like exhausted? and get
46
+ # rid of `exhausted?` (i.e. return false if no jobs remain)
47
+ job = @queue.reserve_job
48
+
49
+ # build is finished
50
+ return if job.nil? && @queue.exhausted?
51
+
52
+ next if job.nil?
53
+
54
+ puts
55
+ puts "Executing #{job}"
56
+
57
+ reset_rspec_state!
58
+
59
+ # reconfigure rspec
60
+ RSpec.configuration.detail_color = :magenta
61
+ RSpec.configuration.seed = srand && srand % 0xFFFF
62
+ RSpec.configuration.backtrace_formatter.filter_gem('rspecq')
63
+ RSpec.configuration.add_formatter(Formatters::FailureRecorder.new(@queue, job))
64
+ RSpec.configuration.add_formatter(Formatters::ExampleCountRecorder.new(@queue))
65
+ RSpec.configuration.add_formatter(Formatters::WorkerHeartbeatRecorder.new(self))
66
+
67
+ if populate_timings
68
+ RSpec.configuration.add_formatter(Formatters::JobTimingRecorder.new(@queue, job))
69
+ end
70
+
71
+ opts = RSpec::Core::ConfigurationOptions.new(["--format", "progress", job])
72
+ _result = RSpec::Core::Runner.new(opts).run($stderr, $stdout)
73
+
74
+ @queue.acknowledge_job(job)
75
+ end
76
+ end
77
+
78
+ # Update the worker heartbeat if necessary
79
+ def update_heartbeat
80
+ if @heartbeat_updated_at.nil? || elapsed(@heartbeat_updated_at) >= HEARTBEAT_FREQUENCY
81
+ @queue.record_worker_heartbeat
82
+ @heartbeat_updated_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
83
+ end
84
+ end
85
+
86
+ private
87
+
88
+ def reset_rspec_state!
89
+ RSpec.clear_examples
90
+
91
+ # TODO: remove after https://github.com/rspec/rspec-core/pull/2723
92
+ RSpec.world.instance_variable_set(:@example_group_counts_by_spec_file, Hash.new(0))
93
+
94
+ # RSpec.clear_examples does not reset those, which causes issues when
95
+ # a non-example error occurs (subsequent jobs are not executed)
96
+ # TODO: upstream
97
+ RSpec.world.non_example_failure = false
98
+
99
+ # we don't want an error that occurred outside of the examples (which
100
+ # would set this to `true`) to stop the worker
101
+ RSpec.world.wants_to_quit = false
102
+ end
103
+
104
+ def try_publish_queue!(queue)
105
+ return if !queue.become_master
106
+
107
+ RSpec.configuration.files_or_directories_to_run = @files_or_dirs_to_run
108
+ files_to_run = RSpec.configuration.files_to_run.map { |j| relative_path(j) }
109
+
110
+ timings = queue.timings
111
+ if timings.empty?
112
+ # TODO: should be a warning reported somewhere (Sentry?)
113
+ q_size = queue.publish(files_to_run.shuffle)
114
+ puts "WARNING: No timings found! Published queue in " \
115
+ "random order (size=#{q_size})"
116
+ return
117
+ end
118
+
119
+ slow_files = timings.take_while do |_job, duration|
120
+ duration >= file_split_threshold
121
+ end.map(&:first) & files_to_run
122
+
123
+ if slow_files.any?
124
+ puts "Slow files (threshold=#{file_split_threshold}): #{slow_files}"
125
+ end
126
+
127
+ # prepare jobs to run
128
+ jobs = []
129
+ jobs.concat(files_to_run - slow_files)
130
+ jobs.concat(files_to_example_ids(slow_files)) if slow_files.any?
131
+
132
+ # assign timings to all of them
133
+ default_timing = timings.values[timings.values.size/2]
134
+
135
+ jobs = jobs.each_with_object({}) do |j, h|
136
+ # heuristic: put untimed jobs in the middle of the queue
137
+ puts "New/untimed job: #{j}" if timings[j].nil?
138
+ h[j] = timings[j] || default_timing
139
+ end
140
+
141
+ # finally, sort them based on their timing (slowest first)
142
+ jobs = jobs.sort_by { |_j, t| -t }.map(&:first)
143
+
144
+ puts "Published queue (size=#{queue.publish(jobs)})"
145
+ end
146
+
147
+ # NOTE: RSpec has to load the files before we can split them as individual
148
+ # examples. In case a file to be splitted fails to be loaded
149
+ # (e.g. contains a syntax error), we return the slow files unchanged,
150
+ # thereby falling back to scheduling them normally.
151
+ #
152
+ # Their errors will be reported in the normal flow, when they're picked up
153
+ # as jobs by a worker.
154
+ def files_to_example_ids(files)
155
+ # TODO: do this programmatically
156
+ cmd = "DISABLE_SPRING=1 bin/rspec --dry-run --format json #{files.join(' ')}"
157
+ out = `#{cmd}`
158
+
159
+ if !$?.success?
160
+ # TODO: emit warning to Sentry
161
+ puts "WARNING: Error splitting slow files; falling back to regular scheduling:"
162
+
163
+ begin
164
+ pp JSON.parse(out)
165
+ rescue JSON::ParserError
166
+ puts out
167
+ end
168
+ puts
169
+
170
+ return files
171
+ end
172
+
173
+ JSON.parse(out)["examples"].map { |e| e["id"] }
174
+ end
175
+
176
+ def relative_path(job)
177
+ @cwd ||= Pathname.new(Dir.pwd)
178
+ "./#{Pathname.new(job).relative_path_from(@cwd)}"
179
+ end
180
+
181
+ def elapsed(since)
182
+ Process.clock_gettime(Process::CLOCK_MONOTONIC) - since
183
+ end
184
+ end
185
+ end
metadata ADDED
@@ -0,0 +1,98 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: rspecq
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.0.1.pre1
5
+ platform: ruby
6
+ authors:
7
+ - Agis Anastasopoulos
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2020-06-26 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: rspec-core
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: minitest
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '5.14'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '5.14'
41
+ - !ruby/object:Gem::Dependency
42
+ name: rake
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: '0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ description:
56
+ email: agis.anast@gmail.com
57
+ executables:
58
+ - rspecq
59
+ extensions: []
60
+ extra_rdoc_files: []
61
+ files:
62
+ - CHANGELOG.md
63
+ - LICENSE
64
+ - README.md
65
+ - bin/rspecq
66
+ - lib/rspecq.rb
67
+ - lib/rspecq/formatters/example_count_recorder.rb
68
+ - lib/rspecq/formatters/failure_recorder.rb
69
+ - lib/rspecq/formatters/job_timing_recorder.rb
70
+ - lib/rspecq/formatters/worker_heartbeat_recorder.rb
71
+ - lib/rspecq/queue.rb
72
+ - lib/rspecq/reporter.rb
73
+ - lib/rspecq/version.rb
74
+ - lib/rspecq/worker.rb
75
+ homepage: https://github.com/skroutz/rspecq
76
+ licenses:
77
+ - MIT
78
+ metadata: {}
79
+ post_install_message:
80
+ rdoc_options: []
81
+ require_paths:
82
+ - lib
83
+ required_ruby_version: !ruby/object:Gem::Requirement
84
+ requirements:
85
+ - - ">="
86
+ - !ruby/object:Gem::Version
87
+ version: '0'
88
+ required_rubygems_version: !ruby/object:Gem::Requirement
89
+ requirements:
90
+ - - ">"
91
+ - !ruby/object:Gem::Version
92
+ version: 1.3.1
93
+ requirements: []
94
+ rubygems_version: 3.1.2
95
+ signing_key:
96
+ specification_version: 4
97
+ summary: Distribute an RSpec suite among many workers
98
+ test_files: []