ultravisor 0.0.0.3.g8cf10dc

@@ -0,0 +1,451 @@
1
+ > # WARNING WARNING WARNING
2
+ >
3
+ > This README is, at least in part, speculative fiction. I practice
4
+ > README-driven development, and as such, not everything described in here
5
+ > actually exists yet, and what does exist may not work right.
6
+
7
+ Ultravisor is like a supervisor, but... *ULTRA*. The idea is that you specify
8
+ objects to instantiate and run in threads, and then the Ultravisor makes that
9
+ happen behind the scenes, including logging failures, restarting if necessary,
10
+ and so on. If you're familiar with Erlang supervision trees, then Ultravisor
11
+ will feel familiar to you, because I stole pretty much every good idea that
12
+ is in Ultravisor from Erlang. You will get a lot of very excellent insight
13
+ from reading [the Erlang/OTP Supervision Principles](http://erlang.org/doc/design_principles/sup_princ.html).
14
+
15
+
16
+ # Installation
17
+
18
+ It's a gem:
19
+
20
+ gem install ultravisor
21
+
22
+ There's also the wonders of [the Gemfile](http://bundler.io):
23
+
24
+ gem 'ultravisor'
25
+
26
+ If you're the sturdy type that likes to run from git:
27
+
28
+ rake install
29
+
30
+ Or, if you've eschewed the convenience of Rubygems entirely, then you
31
+ presumably know what to do already.
32
+
33
+
34
+ # Usage
35
+
36
+ This section gives you a basic overview of the high points of how Ultravisor
37
+ can be used. It is not intended to be an exhaustive reference of all possible
38
+ options; the {Ultravisor} class API documentation provides every possible option
39
+ and its meaning.
40
+
41
+
42
+ ## The Basics
43
+
44
+ Start by loading the code:
45
+
46
+ require "ultravisor"
47
+
48
+ Creating a new Ultravisor is a matter of instantiating a new object:
49
+
50
+ u = Ultravisor.new
51
+
52
+ In order for it to be useful, though, you'll need to add one or more children
53
+ to the Ultravisor instance, which can either be done as part of the call to
54
+ `.new`, or afterwards, as you see fit:
55
+
56
+ # Defining a child in the constructor
57
+ u = Ultravisor.new(children: [{id: :child, klass: Child, method: :run}])
58
+
59
+ # OR define it afterwards
60
+ u = Ultravisor.new
61
+ u.add_child(id: :my_child, klass: Child, method: :run)
62
+
63
+ Once you have an Ultravisor with children configured, you can set it running:
64
+
65
+ u.run
66
+
67
+ This will block until the Ultravisor terminates, one way or another.
68
+
69
+ We'll learn about other available initialization arguments, and all the other
70
+ features of Ultravisor, in the following sections.
71
+
72
+
73
+ ## Defining Children
74
+
75
+ As children are the primary reason Ultravisor exists, it is worth getting a handle
76
+ on them first.
77
+
78
+ Defining children, as we saw in the introduction, can be done by calling
79
+ {Ultravisor#add_child} for each child you want to add, or else you can provide
80
+ a list of children to start as part of the {Ultravisor.new} call, using the
81
+ `children` named argument. You can also combine the two approaches, if some
82
+ children are defined statically, while others only get added conditionally.
83
+
84
+ Let's take another look at that {Ultravisor#add_child} method from earlier:
85
+
86
+ u.add_child(id: :my_child, klass: Child, method: :run)
87
+
88
+ First up, every child has an ID. This is fairly straightforward -- it's a
89
+ unique ID (within a given Ultravisor) that refers to the child. Attempting to
90
+ add two children with the same ID will raise an exception.
91
+
92
+ The `klass` and `method` arguments require a little more explanation. One
93
+ of the foundational principles of "fail fast" is "clean restart" -- that is, if you
94
+ do need to restart something, it's important to start with as clean a state as possible.
95
+ Thus, if a child needs to be restarted, we don't want to reuse an existing object, which
96
+ may be in a messy and unusable state. Instead, we want a clean, fresh object to work on.
97
+ That's why you specify a `klass` when you define a child -- it is a new instance of that
98
+ class that will be used every time the child is started (or restarted).
99
+
100
+ The `method` argument might now be obvious. Once the new instance of the
101
+ specified `klass` exists, the Ultravisor will call the specified `method` to start
102
+ work happening. It is expected that this method will ***not return***, in most cases.
103
+ So you probably want some sort of infinite loop.
104
+
105
+ You might think that this is extremely inflexible, only being able to specify a class
106
+ and a method to call. What if you want to pass in some parameters? Don't worry, we've
107
+ got you covered:
108
+
109
+ u.add_child(
110
+ id: :my_child,
111
+ klass: Child,
112
+ args: ['foo', 42, x: 1, y: 2],
113
+ method: :run,
114
+ )
115
+
116
+ The call to `Child.new` can take arbitrary arguments, just by defining an array
117
+ for the `args` named parameter. Did you know you can define a hash inside an
118
+ array like `['foo', 'bar', x: 1, y: 2] => ['foo', 'bar', {:x => 1, :y => 2}]`?
119
+ I didn't, either, until I started working on Ultravisor, but you can, and it
120
+ works *exactly* like named parameters in method calls.
121
+
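As a minimal sketch of how an `args` array like this could be turned into a constructor call: the trailing key/value pairs collapse into a single Hash element, which can then be split off and passed as keywords. `DemoChild`, `split_args` and `build_instance` are hypothetical names used for illustration only, and this is not necessarily how Ultravisor itself does it. The explicit split is an assumption made so that the sketch behaves the same on Ruby 3, where a trailing Hash is no longer converted to keywords implicitly.

```ruby
# Hypothetical worker class, used only for illustration.
class DemoChild
  attr_reader :name, :count, :opts

  def initialize(name, count, x: 0, y: 0)
    @name = name
    @count = count
    @opts = { x: x, y: y }
  end
end

# Split an args array into [positional, keywords].
# A trailing Hash element is treated as the keyword arguments.
def split_args(args)
  return [args, {}] unless args.last.is_a?(Hash)

  [args[0...-1], args.last]
end

# Create a fresh instance of klass from an args array.
def build_instance(klass, args)
  positional, keywords = split_args(args)
  klass.new(*positional, **keywords)
end
```

With these definitions, `build_instance(DemoChild, ['foo', 42, x: 1, y: 2])` is equivalent to `DemoChild.new('foo', 42, x: 1, y: 2)`.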
122
+ You can also add children after the Ultravisor has been set running:
123
+
124
+ u = Ultravisor.new
125
+
126
+ u.add_child(id: :c1, klass: SomeWorker, method: :run)
127
+
128
+ u.run # => starts running an instance of SomeWorker, doesn't return
129
+
130
+ # In another thread...
131
+ u.add_child(id: :c2, klass: OtherWorker, method: :go!)
132
+
133
+ # An instance of OtherWorker will be created and set running
134
+
135
+ If you add a child to an already-running Ultravisor, that child will immediately be
136
+ started running, almost like magic.
137
+
138
+
139
+ ### Ordering of Children
140
+
141
+ The order in which children are defined is important. When children are (re)started,
142
+ they are always started in the order they were defined. When children are stopped,
143
+ either because the Ultravisor is shutting down, or because of a [supervision
144
+ strategy](#supervision-strategies), they are always stopped in the *reverse* order
145
+ of their definition.
146
+
147
+ All child specifications passed to {Ultravisor.new} always come first, in the
148
+ order they were in the array. Any children defined via calls to
149
+ {Ultravisor#add_child} will go next, in the order the `add_child` calls were
150
+ made.
151
+
152
+
153
+ ## Restarting Children
154
+
155
+ One of the fundamental purposes of a supervisor like Ultravisor is that it restarts
156
+ children if they crash, on the principle of "fail fast". There's no point failing fast
157
+ if things don't get automatically fixed. This is the default behaviour of all
158
+ Ultravisor children.
159
+
160
+ Controlling how children are restarted is the purpose of the "restart policy",
161
+ which is controlled by the `restart` and `restart_policy` named arguments in
162
+ the child specification. For example, if you want to create a child that will
163
+ only ever be run once, regardless of what happens to it, then use `restart:
164
+ :never`:
165
+
166
+ u.add_child(
167
+ id: :my_one_shot_child,
168
+ klass: Child,
169
+ method: :run_maybe,
170
+ restart: :never
171
+ )
172
+
173
+ If you want a child which gets restarted if its `method` raises an exception,
174
+ but *not* if it runs to completion without error, then use `restart: :on_failure`:
175
+
176
+ u.add_child(
177
+ id: :my_run_once_child,
178
+ klass: Child,
179
+ method: :run_once,
180
+ restart: :on_failure
181
+ )
182
+
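The decision described above can be sketched as a small helper. This is an illustration only: `restart_child?` is a hypothetical name, and `:always` is an assumed name for the default "restart no matter what" behaviour, which this README does not spell out.

```ruby
# Decide whether a child that has just exited should be started again.
#
# restart   - the child's `restart` setting (:always is an assumed default name)
# exception - the exception that terminated the child, or nil on a clean exit
def restart_child?(restart, exception)
  case restart
  when :never
    false                 # one-shot child: never run it again
  when :on_failure
    !exception.nil?       # only restart if the method raised
  when :always
    true                  # restart regardless of how it exited
  else
    raise ArgumentError, "Unknown restart setting #{restart.inspect}"
  end
end
```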
183
+ ### The Limits of Failure
184
+
185
+ While restarting is great in general, you don't particularly want to fill your
186
+ logs with an endlessly restarting child -- say, because it doesn't have
187
+ permission to access a database. To solve that problem, an Ultravisor will
188
+ only attempt to restart a child a certain number of times before giving up and
189
+ exiting itself. The parameters of how this works are controlled by the
190
+ `restart_policy`, which is itself a hash:
191
+
192
+ u.add_child(
193
+ id: :my_restartable_child,
194
+ klass: Child,
195
+ method: :run,
196
+ restart_policy: {
197
+ period: 5,
198
+ retries: 2,
199
+ delay: 1,
200
+ }
201
+ )
202
+
203
+ The meaning of each of the `restart_policy` keys is best explained as part
204
+ of how Ultravisor restarts children.
205
+
206
+ When a child needs to be restarted, Ultravisor first waits a little while
207
+ before attempting the restart. The amount of time to wait is specified
208
+ by the `delay` value in the `restart_policy`. Then a new instance of the
209
+ `klass` is instantiated, and the `method` is called on that instance.
210
+
211
+ The `period` and `retries` values of the `restart_policy` come into play
212
+ when the child exits repeatedly. If a single child needs to be restarted
213
+ more than `retries` times in `period` seconds, then instead of trying to
214
+ restart again, Ultravisor gives up. It doesn't try to start the child
215
+ again, it terminates all the *other* children of the Ultravisor, and
216
+ then it exits. Note that the `delay` between restarts is *not* part
217
+ of the `period`; only time spent actually running the child is
218
+ accounted for.
219
+
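One way to picture the `period` / `retries` accounting is as a sliding window over the child's accumulated *running* time, so that the `delay` never counts. The class below is a sketch under that assumption; `RestartBudget` and `restart_allowed?` are hypothetical names, not part of Ultravisor.

```ruby
# Tracks restarts against a {period:, retries:} policy, measuring only
# the time the child actually spent running.
class RestartBudget
  def initialize(period:, retries:)
    @period = period
    @retries = retries
    @runtime = 0.0    # total seconds the child has spent running
    @restarts = []    # running-time readings at which a restart was needed
  end

  # Record that the child exited after running for `ran_for` seconds.
  # Returns true if another restart is allowed, false if the policy is blown.
  def restart_allowed?(ran_for)
    @runtime += ran_for
    @restarts << @runtime

    # Forget restarts that fell out of the window of the last `period` seconds.
    @restarts.reject! { |t| t <= @runtime - @period }

    @restarts.size <= @retries
  end
end
```

With `period: 5, retries: 2`, a child that dies after one second of running, three times in a row, is allowed two restarts and then refused the third. A child that runs for ten seconds between crashes never exhausts its budget.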
220
+
221
+ ## Managed Child Termination
222
+
223
+ If children need to be terminated, by default, child threads are simply
224
+ forcibly terminated by calling {Thread#kill} on them. However, for workers
225
+ which hold resources, this can cause problems.
226
+
227
+ Thus, it is possible to control both how a child is terminated, and how long
228
+ to wait for that termination to occur, by using the `shutdown` named argument
229
+ when you add a child (either via {Ultravisor#add_child}, or as part of the
230
+ `children` named argument to {Ultravisor.new}), like this:
231
+
232
+ u.add_child(
233
+ id: :fancy_worker,
234
+ shutdown: {
235
+ method: :gentle_landing,
236
+ timeout: 30
237
+ }
238
+ )
239
+
240
+ When a child with a custom shutdown policy needs to be terminated, the
241
+ method named in the `method` key is called on the instance of `klass` that
242
+ represents that child. Once the shutdown has been signalled to the
243
+ worker, up to `timeout` seconds is allowed to elapse. If the child thread has
244
+ not terminated by this time, the thread is forcibly terminated by calling
245
+ {Thread#kill}. This timeout prevents shutdown or group restart from hanging
246
+ indefinitely.
247
+
248
+ Note that the `method` specified in the `shutdown` specification should
249
+ signal the worker to terminate, and then return immediately. It should
250
+ *not* wait for termination itself.
251
+
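The signal, wait, then kill sequence can be sketched with nothing more than `Thread#join` with a time limit and `Thread#kill`. `FancyWorker` and `stop_child` are hypothetical names for illustration, and the return values are an assumption made so the outcome is visible; this is not Ultravisor's own implementation.

```ruby
# Hypothetical worker that cooperates with a graceful shutdown request.
class FancyWorker
  def initialize
    @stop = false
  end

  # Entry method: loops until asked to stop.
  def run
    sleep 0.01 until @stop
  end

  # Shutdown method: signals the run loop and returns immediately.
  def gentle_landing
    @stop = true
  end
end

# Ask `instance` to stop by calling `signal` on it, then give `thread`
# up to `timeout` seconds to finish before killing it.
# Returns :graceful or :killed.
def stop_child(instance, thread, signal:, timeout:)
  instance.public_send(signal)

  if thread.join(timeout)   # nil means the time limit expired
    :graceful
  else
    thread.kill
    thread.join
    :killed
  end
end
```

Note that `gentle_landing` only sets a flag; all of the waiting is done by the caller, which is why the shutdown method must not block.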
252
+
253
+ ## Supervision Strategies
254
+
255
+ When a child needs to be restarted, by default only the child that exited
256
+ will be restarted. However, it is possible to cause other
257
+ children to be restarted as well, if that is necessary. To do that, you
258
+ use the `strategy` named parameter when creating the Ultravisor:
259
+
260
+ u = Ultravisor.new(strategy: :all_for_one)
261
+
262
+ The possible values for the strategy are:
263
+
264
+ * `:one_for_one` -- the default restart strategy, this simply causes the
265
+ child which exited to be started again, in line with its restart policy.
266
+
267
+ * `:all_for_one` -- if any child needs to be restarted, all children of the
268
+ Ultravisor get terminated in reverse of their start order, and then all
269
+ children are started again, except those which are `restart: :never`, or
270
+ `restart: :on_failure` which had already exited without error.
271
+
272
+ * `:rest_for_one` -- if any child needs to be restarted, all children of
273
+ the Ultravisor which are *after* the restarted child get terminated
274
+ in reverse of their start order, and then the exited child and those later children are started again,
275
+ except those which are `restart: :never`, or `restart: :on_failure` which
276
+ had already exited without error.
277
+
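To make the three strategies concrete, here is a sketch that works out, for a given list of child IDs in definition order, which siblings get stopped and which children get started when one child exits. `restart_plan` is a hypothetical helper, and for brevity it ignores the `restart: :never` and `restart: :on_failure` exceptions described above.

```ruby
# ids    - child IDs, in the order the children were defined
# failed - the ID of the child that exited and needs restarting
#
# Returns [to_stop, to_start]: the siblings to terminate (in stop order),
# and the children to start (in start order).
def restart_plan(strategy, ids, failed)
  idx = ids.index(failed) or raise ArgumentError, "Unknown child #{failed.inspect}"

  case strategy
  when :one_for_one
    [[], [failed]]
  when :all_for_one
    # The failed child has already exited, so only its siblings are stopped.
    [(ids - [failed]).reverse, ids]
  when :rest_for_one
    # Only children defined after the failed one are affected.
    [ids[(idx + 1)..-1].reverse, ids[idx..-1]]
  else
    raise ArgumentError, "Invalid strategy #{strategy.inspect}"
  end
end
```

For children `[:a, :b, :c, :d]` where `:b` exits, `:rest_for_one` stops `:d` then `:c`, and then starts `:b`, `:c` and `:d`.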
278
+
279
+ ## Interacting With Child Objects
280
+
281
+ Since the Ultravisor is creating the object instances that run in the worker
282
+ threads, you don't automatically have access to the object instance itself.
283
+ This is somewhat by design -- concurrency bugs are hell. However, there *are*
284
+ ways around this, if you need to.
285
+
286
+
287
+ ### The power of cast / call
288
+
289
+ A common approach for interacting with an object in an otherwise concurrent
290
+ environment is the `cast` / `call` pattern. From the outside, the interface
291
+ is quite straightforward:
292
+
293
+ ```
294
+ u = Ultravisor.new(children: [
295
+ { id: :castcall, klass: CastCall, method: :run, enable_castcall: true }
296
+ ])
297
+
298
+ # This will return `nil` immediately
299
+ u[:castcall].cast.some_method
300
+
301
+ # This will block until the child has processed the call, then return whatever `CastCall#some_method` returns
302
+ u[:castcall].call.some_method
303
+ ```
304
+
305
+ To enable `cast` / `call` support for a child, you must set the `enable_castcall`
306
+ keyword argument on the child. This is because failing to process `cast`s and
307
+ `call`s can cause all sorts of unpleasant backlogs, so children who intend to
308
+ receive (and process) `cast`s and `call`s must explicitly opt-in.
309
+
310
+ The interface to the object from outside is straightforward. You get a
311
+ reference to the instance of {Ultravisor::Child} for the child you want to talk
312
+ to (which is returned by {Ultravisor#add_child}, or {Ultravisor#[]}), and then
313
+ call `child.cast.<method>` or `child.call.<method>`, passing in arguments as
314
+ per normal. Any public method can be the target of the `cast` or `call`, and you
315
+ can pass in any arguments you like, *including blocks* (although bear in mind that
316
+ any blocks passed will be run in the child instance's thread, and many
317
+ concurrency dragons await the unwary).
318
+
319
+ The difference between the `cast` and `call` methods is in whether or not a
320
+ return value is expected, and hence when the method call chained through
321
+ `cast` or `call` returns.
322
+
323
+ When you call `cast`, the real method call gets queued for later execution,
324
+ and since no return value is expected, the `child.cast.<method>` returns
325
+ `nil` immediately and your code gets on with its day. This is useful
326
+ when you want to tell the worker something, or instruct it to do something,
327
+ but there's no value coming back.
328
+
329
+ In comparison, when you call `call`, the real method call still gets queued,
330
+ but the calling code blocks, waiting for the return value from the queued
331
+ method call. This may seem pointless -- why have concurrency that blocks? --
332
+ but the value comes from the synchronisation. The method call only happens
333
+ when the worker loop calls `process_castcall`, which it can do at a time that
334
+ suits it, and when it knows that nothing else is going on that could cause
335
+ problems.
336
+
337
+ One thing to be aware of when interacting with a worker instance is that it may
338
+ crash, and be restarted by the Ultravisor, before it gets around to processing
339
+ a queued message. If you used `child.cast`, then the method call is just...
340
+ lost, forever. On the other hand, if you used `child.call`, then an
341
+ {Ultravisor::ChildRestartedError} exception will be raised, which you can deal
342
+ with as you see fit.
343
+
344
+ The really interesting part is what happens *inside* the child instance. The
345
+ actual execution of code in response to the method calls passed through `cast`
346
+ and `call` will only happen when the running instance of the child's class
347
+ calls `process_castcall`. When that happens, all pending casts and calls will
348
+ be executed. Since this happens within the same thread as the rest of the
349
+ child instance's code, it's a lot safer than trying to synchronise everything
350
+ with locks.
351
+
352
+ You can, of course, just call `process_castcall` repeatedly, however that's a
353
+ somewhat herp-a-derp way of doing it. The `castcall_fd` method in the running
354
+ instance will return an IO object which will become readable whenever there is
355
+ a pending `cast` or `call` to process. Thus, if you're using `IO.select` or
356
+ similar to wait for work to do, you can add `castcall_fd` to the readable set
357
+ and only call `process_castcall` when the relevant IO object comes back. Don't
358
+ actually try *reading* from it yourself; `process_castcall` takes care of all that.
359
+
360
+ If you happen to have a child class whose *only* purpose is to process `cast`s
361
+ and `call`s, you should configure the Ultravisor to use `process_castcall_loop`
362
+ as its entry method. This is a wrapper method which blocks on `castcall_fd`
363
+ becoming readable, and loops infinitely.
364
+
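If it helps to see the moving parts, the whole pattern can be sketched in plain Ruby with a `Queue` for the pending messages and an `IO.pipe` for the wake-up signal. Everything here is an assumption for illustration: `Mailbox`, `run_worker`, and the explicit `cast(name, *args)` / `call(name, *args)` signatures are hypothetical stand-ins for the chained `child.cast.<method>` interface, and this is not Ultravisor's implementation.

```ruby
# A minimal cast/call mailbox wrapped around an arbitrary target object.
class Mailbox
  def initialize(target)
    @target = target
    @queue = Queue.new
    @reader, @writer = IO.pipe
  end

  # An IO object that becomes readable whenever a message is pending.
  def castcall_fd
    @reader
  end

  # Queue a method call and return nil straight away.
  def cast(name, *args)
    enqueue(name, args, nil)
    nil
  end

  # Queue a method call, then block until the worker has run it.
  def call(name, *args)
    reply = Queue.new
    enqueue(name, args, reply)
    status, value = reply.pop
    raise value if status == :error

    value
  end

  # Run every pending message in the calling (worker) thread.
  def process_castcall
    until @queue.empty?
      name, args, reply = @queue.pop
      @reader.read(1) # consume the wake-up byte for this message
      begin
        result = @target.public_send(name, *args)
        reply&.push([:ok, result])
      rescue StandardError => e
        # A failed cast has nobody to report to, so its error is dropped.
        reply&.push([:error, e])
      end
    end
  end

  private

  def enqueue(name, args, reply)
    @queue << [name, args, reply]
    @writer.write(".")
  end
end

# A worker loop that sleeps until there is something to process.
def run_worker(mailbox)
  Thread.new do
    loop do
      IO.select([mailbox.castcall_fd])
      mailbox.process_castcall
    end
  end
end
```

Because the target's methods only ever run inside `process_castcall`, they all execute on the worker thread, one at a time, in the order they were queued.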
365
+ It is important to remember that not all concurrency bugs can be prevented by
366
+ using `cast` / `call`. For example, read-modify-write operations will still
367
+ cause all the same problems they always do, so if you find yourself calling
368
+ `child.call`, modifying the value returned, and then calling `child.cast`
369
+ with that modified value, you're in for a bad time.
370
+
371
+
372
+ ### Direct (Unsafe) Instance Access
373
+
374
+ If you have a worker class which you're *really* sure is safe against concurrent
375
+ access, you can eschew the convenience and safety of `cast` / `call`, and instead
376
+ allow direct access to the worker instance object.
377
+
378
+ To do this, specify `access: :unsafe` in the child specification, and then
379
+ call `child.unsafe_instance` to get the instance object currently in play.
380
+
381
+ Yes, the multiple mentions of `unsafe` are there deliberately, and no, I won't
382
+ be removing them. They're there to remind you, always, that what you're doing
383
+ is unsafe.
384
+
385
+ If the child is restarting at the time `child.unsafe_instance` is called,
386
+ the call will block until the child worker is started again, after which
387
+ you'll get the newly created worker instance object. The worker could crash
388
+ again at any time, of course, leaving you with a now out-of-date object
389
+ that is no longer being actively run. It's up to you to figure out how to
390
+ deal with that. If the Ultravisor associated with the child
391
+ has terminated, your call to `child.unsafe_instance` will raise an
392
+ {Ultravisor::ChildRestartedError}.
393
+
394
+ Why yes, Gracie, there *are* a lot of things that can go wrong when using
395
+ direct instance object access. Still wondering why those `unsafe`s are in
396
+ the name?
397
+
398
+
399
+ ## Supervision Trees
400
+
401
+ Whilst a collection of workers is a neat thing to have, more powerful systems
402
+ can be constructed if supervisors can, themselves, be supervised. Primarily
403
+ this is useful when recovering from persistent errors, because you can use
404
+ a higher-level supervisor to restart an entire tree of workers when one of
405
+ them is having problems.
406
+
407
+ Creating a supervision tree is straightforward. Because Ultravisor works by
408
+ instantiating plain old ruby objects, and Ultravisor is, itself, a plain old
409
+ ruby class, you use it more-or-less like you would any other object:
410
+
411
+ u = Ultravisor.new
412
+ u.add_child(id: :sub_sup, klass: Ultravisor, method: :run, args: [children: [...]])
413
+
414
+ That's all there is to it. Whenever the parent Ultravisor wants to work on the
415
+ child Ultravisor, it treats it like any other child, asking it to terminate,
416
+ start, etc, and the child Ultravisor's work consists of terminating, starting,
417
+ etc all of its children.
418
+
419
+ The only difference in default behaviour between a regular worker child and an
420
+ Ultravisor child is that an Ultravisor's `shutdown` policy is automatically set
421
+ to `method: :stop!, timeout: :infinity`. This is because it is *very* bad news
422
+ to forcibly terminate an Ultravisor before its children have stopped -- all
423
+ those children just get cast into the VM, never to be heard from again.
424
+
425
+
426
+ # Contributing
427
+
428
+ Bug reports should be sent to the [Github issue
429
+ tracker](https://github.com/mpalmer/ultravisor/issues), or
430
+ [e-mailed](mailto:theshed+ultravisor@hezmatt.org). Patches can be sent as a
431
+ Github pull request, or [e-mailed](mailto:theshed+ultravisor@hezmatt.org).
432
+
433
+
434
+ # Licence
435
+
436
+ Unless otherwise stated, everything in this repo is covered by the following
437
+ copyright notice:
438
+
439
+ Copyright (C) 2019 Matt Palmer <matt@hezmatt.org>
440
+
441
+ This program is free software: you can redistribute it and/or modify it
442
+ under the terms of the GNU General Public License version 3, as
443
+ published by the Free Software Foundation.
444
+
445
+ This program is distributed in the hope that it will be useful,
446
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
447
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
448
+ GNU General Public License for more details.
449
+
450
+ You should have received a copy of the GNU General Public License
451
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
@@ -0,0 +1,216 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "logger"
4
+
5
+ require_relative "./ultravisor/child"
6
+ require_relative "./ultravisor/error"
7
+ require_relative "./ultravisor/logging_helpers"
8
+
9
+ # A super-dooOOOoooper supervisor.
10
+ #
11
+ class Ultravisor
12
+ include LoggingHelpers
13
+
14
+ def initialize(children: [], strategy: :one_for_one, logger: Logger.new(File::NULL))
15
+ @queue, @logger = Queue.new, logger
16
+
17
+ @strategy = strategy
18
+ validate_strategy
19
+
20
+ @op_m, @op_cv = Mutex.new, ConditionVariable.new
21
+ @running_thread = nil
22
+
23
+ initialize_children(children)
24
+ end
25
+
26
+ def run
27
+ logger.debug(logloc) { "called" }
28
+
29
+ @op_m.synchronize do
30
+ if @running_thread
31
+ raise AlreadyRunningError,
32
+ "This ultravisor is already running"
33
+ end
34
+
35
+ @queue.clear
36
+ @running_thread = Thread.current
37
+ Thread.current.name = "Ultravisor"
38
+ end
39
+
40
+ logger.debug(logloc) { "Going to start children #{@children.map(&:first).inspect}" }
41
+ @children.each { |c| c.last.spawn(@queue) }
42
+
43
+ process_events
44
+
45
+ @op_m.synchronize do
46
+ logger.debug(logloc) { "Shutdown time for #{@children.reverse.map(&:first).inspect}" }
47
+ @children.reverse.each { |c| c.last.shutdown }
48
+
49
+ @running_thread = nil
50
+ @op_cv.broadcast
51
+ end
52
+
53
+ self
54
+ end
55
+
56
+ def shutdown(wait: true, force: false)
57
+ @op_m.synchronize do
58
+ return self unless @running_thread
59
+ if force
60
+ @children.reverse.each { |c| c.last.shutdown(force: true) }
61
+ @running_thread.kill
62
+ @running_thread = nil
63
+ @op_cv.broadcast
64
+ else
65
+ @queue << :shutdown
66
+ if wait
67
+ @op_cv.wait(@op_m) while @running_thread
68
+ end
69
+ end
70
+ end
71
+ self
72
+ end
73
+
74
+ def [](id)
75
+ @children.assoc(id)&.last
76
+ end
77
+
78
+ def add_child(**args)
79
+ logger.debug(logloc) { "Adding child #{args[:id].inspect}" }
80
+ args[:logger] ||= logger
81
+
82
+ @op_m.synchronize do
83
+ c = Ultravisor::Child.new(**args)
84
+
85
+ if @children.assoc(c.id)
86
+ raise DuplicateChildError,
87
+ "Child with ID #{c.id.inspect} already exists"
88
+ end
89
+
90
+ @children << [c.id, c]
91
+
92
+ if @running_thread
93
+ logger.debug(logloc) { "Auto-starting new child #{args[:id].inspect}" }
94
+ c.spawn(@queue)
95
+ end
96
+ end
97
+ end
98
+
99
+ def remove_child(id)
100
+ logger.debug(logloc) { "Removing child #{id.inspect}" }
101
+
102
+ @op_m.synchronize do
103
+ c = @children.assoc(id)
104
+
105
+ return nil if c.nil?
106
+
107
+ @children.delete(c)
108
+ if @running_thread
109
+ logger.debug(logloc) { "Shutting down removed child #{id.inspect}" }
110
+ c.last.shutdown
111
+ end
112
+ end
113
+ end
114
+
115
+ private
116
+
117
+ def validate_strategy
118
+ unless %i{one_for_one all_for_one rest_for_one}.include?(@strategy)
119
+ raise ArgumentError,
120
+ "Invalid strategy #{@strategy.inspect}"
121
+ end
122
+ end
123
+
124
+ def initialize_children(children)
125
+ unless children.is_a?(Array)
126
+ raise ArgumentError,
127
+ "children must be an Array"
128
+ end
129
+
130
+ @children = []
131
+
132
+ children.each do |cfg|
133
+ cfg[:logger] ||= logger
134
+ c = Ultravisor::Child.new(**cfg)
135
+ if @children.assoc(c.id)
136
+ raise DuplicateChildError,
137
+ "Duplicate child ID: #{c.id.inspect}"
138
+ end
139
+
140
+ @children << [c.id, c]
141
+ end
142
+ end
143
+
144
+ def process_events
145
+ loop do
146
+ qe = @queue.pop
147
+
148
+ case qe
149
+ when Ultravisor::Child
150
+ logger.debug(logloc) { "Received Ultravisor::Child queue entry for #{qe.id}" }
151
+ @op_m.synchronize { child_exited(qe) }
152
+ when :shutdown
153
+ logger.debug(logloc) { "Received :shutdown queue entry" }
154
+ break
155
+ else
156
+ logger.error(logloc) { "Unknown queue entry: #{qe.inspect}" }
157
+ end
158
+ end
159
+ end
160
+
161
+ def child_exited(child)
162
+ if child.termination_exception
163
+ log_exception(child.termination_exception, "Ultravisor::Child(#{child.id.inspect})") { "Thread terminated by unhandled exception" }
164
+ end
165
+
166
+ if @running_thread.nil?
167
+ logger.debug(logloc) { "Child termination after shutdown" }
168
+ # Child termination processed after we've shut down... nope
169
+ return
170
+ end
171
+
172
+ begin
173
+ return unless child.restart?
174
+ rescue Ultravisor::BlownRestartPolicyError
175
+ # Uh oh...
176
+ logger.error(logloc) { "Child #{child.id} has exceeded its restart policy. Shutting down the Ultravisor." }
177
+ @queue << :shutdown
178
+ return
179
+ end
180
+
181
+ case @strategy
182
+ when :all_for_one
183
+ @children.reverse.each do |id, c|
184
+ # Don't need to shut down the child that has caused all this mess
185
+ next if child.id == id
186
+
187
+ c.shutdown
188
+ end
189
+ when :rest_for_one
190
+ @children.reverse.each do |id, c|
191
+ # Don't go past the child that caused the problems
192
+ break if child.id == id
193
+
194
+ c.shutdown
195
+ end
196
+ end
197
+
198
+ sleep child.restart_delay
199
+
200
+ case @strategy
201
+ when :all_for_one
202
+ @children.each do |_, c|
203
+ c.spawn(@queue)
204
+ end
205
+ when :rest_for_one
206
+ s = false
207
+ @children.each do |id, c|
208
+ s = true if child.id == id
209
+
210
+ c.spawn(@queue) if s
211
+ end
212
+ when :one_for_one
213
+ child.spawn(@queue)
214
+ end
215
+ end
216
+ end